Modelling of capital asset pricing by considering the lagged effects
NASA Astrophysics Data System (ADS)
Sukono; Hidayat, Y.; Bon, A. Talib bin; Supian, S.
2017-01-01
In this paper the problem of modelling the Capital Asset Pricing Model (CAPM) with the effect of the lagged is discussed. It is assumed that asset returns are analysed influenced by the market return and the return of risk-free assets. To analyse the relationship between asset returns, the market return, and the return of risk-free assets, it is conducted by using a regression equation of CAPM, and regression equation of lagged distributed CAPM. Associated with the regression equation lagged CAPM distributed, this paper also developed a regression equation of Koyck transformation CAPM. Results of development show that the regression equation of Koyck transformation CAPM has advantages, namely simple as it only requires three parameters, compared with regression equation of lagged distributed CAPM.
Comparative evaluation of urban storm water quality models
NASA Astrophysics Data System (ADS)
Vaze, J.; Chiew, Francis H. S.
2003-10-01
The estimation of urban storm water pollutant loads is required for the development of mitigation and management strategies to minimize impacts to receiving environments. Event pollutant loads are typically estimated using either regression equations or "process-based" water quality models. The relative merit of using regression models compared to process-based models is not clear. A modeling study is carried out here to evaluate the comparative ability of the regression equations and process-based water quality models to estimate event diffuse pollutant loads from impervious surfaces. The results indicate that, once calibrated, both the regression equations and the process-based model can estimate event pollutant loads satisfactorily. In fact, the loads estimated using the regression equation as a function of rainfall intensity and runoff rate are better than the loads estimated using the process-based model. Therefore, if only estimates of event loads are required, regression models should be used because they are simpler and require less data compared to process-based models.
Smith, S. Jerrod; Lewis, Jason M.; Graves, Grant M.
2015-09-28
Generalized-least-squares multiple-linear regression analysis was used to formulate regression relations between peak-streamflow frequency statistics and basin characteristics. Contributing drainage area was the only basin characteristic determined to be statistically significant for all percentage of annual exceedance probabilities and was the only basin characteristic used in regional regression equations for estimating peak-streamflow frequency statistics on unregulated streams in and near the Oklahoma Panhandle. The regression model pseudo-coefficient of determination, converted to percent, for the Oklahoma Panhandle regional regression equations ranged from about 38 to 63 percent. The standard errors of prediction and the standard model errors for the Oklahoma Panhandle regional regression equations ranged from about 84 to 148 percent and from about 76 to 138 percent, respectively. These errors were comparable to those reported for regional peak-streamflow frequency regression equations for the High Plains areas of Texas and Colorado. The root mean square errors for the Oklahoma Panhandle regional regression equations (ranging from 3,170 to 92,000 cubic feet per second) were less than the root mean square errors for the Oklahoma statewide regression equations (ranging from 18,900 to 412,000 cubic feet per second); therefore, the Oklahoma Panhandle regional regression equations produce more accurate peak-streamflow statistic estimates for the irrigated period of record in the Oklahoma Panhandle than do the Oklahoma statewide regression equations. The regression equations developed in this report are applicable to streams that are not substantially affected by regulation, impoundment, or surface-water withdrawals. These regression equations are intended for use for stream sites with contributing drainage areas less than or equal to about 2,060 square miles, the maximum value for the independent variable used in the regression analysis.
Regression Simulation Model. Appendix X. Users Manual,
1981-03-01
change as the prediction equations become refined. Whereas no notice will be provided when the changes are made, the programs will be modified such that...NATIONAL BUREAU Of STANDARDS 1963 A ___,_ __ _ __ _ . APPENDIX X ( R4/ EGRESSION IMULATION ’jDEL. Ape’A ’) 7 USERS MANUA submitted to The Great River...regression analysis and to establish a prediction equation (model). The prediction equation contains the partial regression coefficients (B-weights) which
Ding, A Adam; Wu, Hulin
2014-10-01
We propose a new method to use a constrained local polynomial regression to estimate the unknown parameters in ordinary differential equation models with a goal of improving the smoothing-based two-stage pseudo-least squares estimate. The equation constraints are derived from the differential equation model and are incorporated into the local polynomial regression in order to estimate the unknown parameters in the differential equation model. We also derive the asymptotic bias and variance of the proposed estimator. Our simulation studies show that our new estimator is clearly better than the pseudo-least squares estimator in estimation accuracy with a small price of computational cost. An application example on immune cell kinetics and trafficking for influenza infection further illustrates the benefits of the proposed new method.
Ding, A. Adam; Wu, Hulin
2015-01-01
We propose a new method to use a constrained local polynomial regression to estimate the unknown parameters in ordinary differential equation models with a goal of improving the smoothing-based two-stage pseudo-least squares estimate. The equation constraints are derived from the differential equation model and are incorporated into the local polynomial regression in order to estimate the unknown parameters in the differential equation model. We also derive the asymptotic bias and variance of the proposed estimator. Our simulation studies show that our new estimator is clearly better than the pseudo-least squares estimator in estimation accuracy with a small price of computational cost. An application example on immune cell kinetics and trafficking for influenza infection further illustrates the benefits of the proposed new method. PMID:26401093
Techniques for estimating flood-peak discharges of rural, unregulated streams in Ohio
Koltun, G.F.
2003-01-01
Regional equations for estimating 2-, 5-, 10-, 25-, 50-, 100-, and 500-year flood-peak discharges at ungaged sites on rural, unregulated streams in Ohio were developed by means of ordinary and generalized least-squares (GLS) regression techniques. One-variable, simple equations and three-variable, full-model equations were developed on the basis of selected basin characteristics and flood-frequency estimates determined for 305 streamflow-gaging stations in Ohio and adjacent states. The average standard errors of prediction ranged from about 39 to 49 percent for the simple equations, and from about 34 to 41 percent for the full-model equations. Flood-frequency estimates determined by means of log-Pearson Type III analyses are reported along with weighted flood-frequency estimates, computed as a function of the log-Pearson Type III estimates and the regression estimates. Values of explanatory variables used in the regression models were determined from digital spatial data sets by means of a geographic information system (GIS), with the exception of drainage area, which was determined by digitizing the area within basin boundaries manually delineated on topographic maps. Use of GIS-based explanatory variables represents a major departure in methodology from that described in previous reports on estimating flood-frequency characteristics of Ohio streams. Examples are presented illustrating application of the regression equations to ungaged sites on ungaged and gaged streams. A method is provided to adjust regression estimates for ungaged sites by use of weighted and regression estimates for a gaged site on the same stream. A region-of-influence method, which employs a computer program to estimate flood-frequency characteristics for ungaged sites based on data from gaged sites with similar characteristics, was also tested and compared to the GLS full-model equations. For all recurrence intervals, the GLS full-model equations had superior prediction accuracy relative to the simple equations and therefore are recommended for use.
Simple linear and multivariate regression models.
Rodríguez del Águila, M M; Benítez-Parejo, N
2011-01-01
In biomedical research it is common to find problems in which we wish to relate a response variable to one or more variables capable of describing the behaviour of the former variable by means of mathematical models. Regression techniques are used to this effect, in which an equation is determined relating the two variables. While such equations can have different forms, linear equations are the most widely used form and are easy to interpret. The present article describes simple and multiple linear regression models, how they are calculated, and how their applicability assumptions are checked. Illustrative examples are provided, based on the use of the freely accessible R program. Copyright © 2011 SEICAP. Published by Elsevier Espana. All rights reserved.
ERIC Educational Resources Information Center
Li, Spencer D.
2011-01-01
Mediation analysis in child and adolescent development research is possible using large secondary data sets. This article provides an overview of two statistical methods commonly used to test mediated effects in secondary analysis: multiple regression and structural equation modeling (SEM). Two empirical studies are presented to illustrate the…
Data-driven discovery of partial differential equations.
Rudy, Samuel H; Brunton, Steven L; Proctor, Joshua L; Kutz, J Nathan
2017-04-01
We propose a sparse regression method capable of discovering the governing partial differential equation(s) of a given system by time series measurements in the spatial domain. The regression framework relies on sparsity-promoting techniques to select the nonlinear and partial derivative terms of the governing equations that most accurately represent the data, bypassing a combinatorially large search through all possible candidate models. The method balances model complexity and regression accuracy by selecting a parsimonious model via Pareto analysis. Time series measurements can be made in an Eulerian framework, where the sensors are fixed spatially, or in a Lagrangian framework, where the sensors move with the dynamics. The method is computationally efficient, robust, and demonstrated to work on a variety of canonical problems spanning a number of scientific domains including Navier-Stokes, the quantum harmonic oscillator, and the diffusion equation. Moreover, the method is capable of disambiguating between potentially nonunique dynamical terms by using multiple time series taken with different initial data. Thus, for a traveling wave, the method can distinguish between a linear wave equation and the Korteweg-de Vries equation, for instance. The method provides a promising new technique for discovering governing equations and physical laws in parameterized spatiotemporal systems, where first-principles derivations are intractable.
Peak-flow characteristics of Virginia streams
Austin, Samuel H.; Krstolic, Jennifer L.; Wiegand, Ute
2011-01-01
Peak-flow annual exceedance probabilities, also called probability-percent chance flow estimates, and regional regression equations are provided describing the peak-flow characteristics of Virginia streams. Statistical methods are used to evaluate peak-flow data. Analysis of Virginia peak-flow data collected from 1895 through 2007 is summarized. Methods are provided for estimating unregulated peak flow of gaged and ungaged streams. Station peak-flow characteristics identified by fitting the logarithms of annual peak flows to a Log Pearson Type III frequency distribution yield annual exceedance probabilities of 0.5, 0.4292, 0.2, 0.1, 0.04, 0.02, 0.01, 0.005, and 0.002 for 476 streamgaging stations. Stream basin characteristics computed using spatial data and a geographic information system are used as explanatory variables in regional regression model equations for six physiographic regions to estimate regional annual exceedance probabilities at gaged and ungaged sites. Weighted peak-flow values that combine annual exceedance probabilities computed from gaging station data and from regional regression equations provide improved peak-flow estimates. Text, figures, and lists are provided summarizing selected peak-flow sites, delineated physiographic regions, peak-flow estimates, basin characteristics, regional regression model equations, error estimates, definitions, data sources, and candidate regression model equations. This study supersedes previous studies of peak flows in Virginia.
Linear models for calculating digestibile energy for sheep diets.
Fonnesbeck, P V; Christiansen, M L; Harris, L E
1981-05-01
Equations for estimating the digestible energy (DE) content of sheep diets were generated from the chemical contents and a factorial description of diets fed to lambs in digestion trials. The diet factors were two forages (alfalfa and grass hay), harvested at three stages of maturity (late vegetative, early bloom and full bloom), fed in two ingredient combinations (all hay or a 50:50 hay and corn grain mixture) and prepared by two forage texture processes (coarsely chopped or finely chopped and pelleted). The 2 x 3 x 2 x 2 factorial arrangement produced 24 diet treatments. These were replicated twice, for a total of 48 lamb digestion trials. In model 1 regression equations, DE was calculated directly from chemical composition of the diet. In model 2, regression equations predicted the percentage of digested nutrient from the chemical contents of the diet and then DE of the diet was calculated as the sum of the gross energy of the digested organic components. Expanded forms of model 1 and model 2 were also developed that included diet factors as qualitative indicator variables to adjust the regression constant and regression coefficients for the diet description. The expanded forms of the equations accounted for significantly more variation in DE than did the simple models and more accurately estimated DE of the diet. Information provided by the diet description proved as useful as chemical analyses for the prediction of digestibility of nutrients. The statistics indicate that, with model 1, neutral detergent fiber and plant cell wall analyses provided as much information for the estimation of DE as did model 2 with the combined information from crude protein, available carbohydrate, total lipid, cellulose and hemicellulose. Regression equations are presented for estimating DE with the most currently analyzed organic components, including linear and curvilinear variables and diet factors that significantly reduce the standard error of the estimate. To estimate De of a diet, the user utilizes the equation that uses the chemical analysis information and diet description most effectively.
Temperature-viscosity models reassessed.
Peleg, Micha
2017-05-04
The temperature effect on viscosity of liquid and semi-liquid foods has been traditionally described by the Arrhenius equation, a few other mathematical models, and more recently by the WLF and VTF (or VFT) equations. The essence of the Arrhenius equation is that the viscosity is proportional to the absolute temperature's reciprocal and governed by a single parameter, namely, the energy of activation. However, if the absolute temperature in K in the Arrhenius equation is replaced by T + b where both T and the adjustable b are in °C, the result is a two-parameter model, which has superior fit to experimental viscosity-temperature data. This modified version of the Arrhenius equation is also mathematically equal to the WLF and VTF equations, which are known to be equal to each other. Thus, despite their dissimilar appearances all three equations are essentially the same model, and when used to fit experimental temperature-viscosity data render exactly the same very high regression coefficient. It is shown that three new hybrid two-parameter mathematical models, whose formulation bears little resemblance to any of the conventional models, can also have excellent fit with r 2 ∼ 1. This is demonstrated by comparing the various models' regression coefficients to published viscosity-temperature relationships of 40% sucrose solution, soybean oil, and 70°Bx pear juice concentrate at different temperature ranges. Also compared are reconstructed temperature-viscosity curves using parameters calculated directly from 2 or 3 data points and fitted curves obtained by nonlinear regression using a larger number of experimental viscosity measurements.
Data-driven discovery of partial differential equations
Rudy, Samuel H.; Brunton, Steven L.; Proctor, Joshua L.; Kutz, J. Nathan
2017-01-01
We propose a sparse regression method capable of discovering the governing partial differential equation(s) of a given system by time series measurements in the spatial domain. The regression framework relies on sparsity-promoting techniques to select the nonlinear and partial derivative terms of the governing equations that most accurately represent the data, bypassing a combinatorially large search through all possible candidate models. The method balances model complexity and regression accuracy by selecting a parsimonious model via Pareto analysis. Time series measurements can be made in an Eulerian framework, where the sensors are fixed spatially, or in a Lagrangian framework, where the sensors move with the dynamics. The method is computationally efficient, robust, and demonstrated to work on a variety of canonical problems spanning a number of scientific domains including Navier-Stokes, the quantum harmonic oscillator, and the diffusion equation. Moreover, the method is capable of disambiguating between potentially nonunique dynamical terms by using multiple time series taken with different initial data. Thus, for a traveling wave, the method can distinguish between a linear wave equation and the Korteweg–de Vries equation, for instance. The method provides a promising new technique for discovering governing equations and physical laws in parameterized spatiotemporal systems, where first-principles derivations are intractable. PMID:28508044
Crawford, John R; Garthwaite, Paul H; Denham, Annie K; Chelune, Gordon J
2012-12-01
Regression equations have many useful roles in psychological assessment. Moreover, there is a large reservoir of published data that could be used to build regression equations; these equations could then be employed to test a wide variety of hypotheses concerning the functioning of individual cases. This resource is currently underused because (a) not all psychologists are aware that regression equations can be built not only from raw data but also using only basic summary data for a sample, and (b) the computations involved are tedious and prone to error. In an attempt to overcome these barriers, Crawford and Garthwaite (2007) provided methods to build and apply simple linear regression models using summary statistics as data. In the present study, we extend this work to set out the steps required to build multiple regression models from sample summary statistics and the further steps required to compute the associated statistics for drawing inferences concerning an individual case. We also develop, describe, and make available a computer program that implements these methods. Although there are caveats associated with the use of the methods, these need to be balanced against pragmatic considerations and against the alternative of either entirely ignoring a pertinent data set or using it informally to provide a clinical "guesstimate." Upgraded versions of earlier programs for regression in the single case are also provided; these add the point and interval estimates of effect size developed in the present article.
Asquith, William H.; Thompson, David B.
2008-01-01
The U.S. Geological Survey, in cooperation with the Texas Department of Transportation and in partnership with Texas Tech University, investigated a refinement of the regional regression method and developed alternative equations for estimation of peak-streamflow frequency for undeveloped watersheds in Texas. A common model for estimation of peak-streamflow frequency is based on the regional regression method. The current (2008) regional regression equations for 11 regions of Texas are based on log10 transformations of all regression variables (drainage area, main-channel slope, and watershed shape). Exclusive use of log10-transformation does not fully linearize the relations between the variables. As a result, some systematic bias remains in the current equations. The bias results in overestimation of peak streamflow for both the smallest and largest watersheds. The bias increases with increasing recurrence interval. The primary source of the bias is the discernible curvilinear relation in log10 space between peak streamflow and drainage area. Bias is demonstrated by selected residual plots with superimposed LOWESS trend lines. To address the bias, a statistical framework based on minimization of the PRESS statistic through power transformation of drainage area is described and implemented, and the resulting regression equations are reported. Compared to log10-exclusive equations, the equations derived from PRESS minimization have PRESS statistics and residual standard errors less than the log10 exclusive equations. Selected residual plots for the PRESS-minimized equations are presented to demonstrate that systematic bias in regional regression equations for peak-streamflow frequency estimation in Texas can be reduced. Because the overall error is similar to the error associated with previous equations and because the bias is reduced, the PRESS-minimized equations reported here provide alternative equations for peak-streamflow frequency estimation.
Exact Analysis of Squared Cross-Validity Coefficient in Predictive Regression Models
ERIC Educational Resources Information Center
Shieh, Gwowen
2009-01-01
In regression analysis, the notion of population validity is of theoretical interest for describing the usefulness of the underlying regression model, whereas the presumably more important concept of population cross-validity represents the predictive effectiveness for the regression equation in future research. It appears that the inference…
Charles E. Rose; Thomas B. Lynch
2001-01-01
A method was developed for estimating parameters in an individual tree basal area growth model using a system of equations based on dbh rank classes. The estimation method developed is a compromise between an individual tree and a stand level basal area growth model that accounts for the correlation between trees within a plot by using seemingly unrelated regression (...
ERIC Educational Resources Information Center
Akilli, Mustafa
2015-01-01
The aim of this study is to demonstrate the science success regression levels of chosen emotional features of 8th grade students using Structural Equation Model. The study was conducted by the analysis of students' questionnaires and science success in TIMSS 2011 data using SEM. Initially, the factors that are thought to have an effect on science…
Wagner, Daniel M.; Krieger, Joshua D.; Veilleux, Andrea G.
2016-08-04
In 2013, the U.S. Geological Survey initiated a study to update regional skew, annual exceedance probability discharges, and regional regression equations used to estimate annual exceedance probability discharges for ungaged locations on streams in the study area with the use of recent geospatial data, new analytical methods, and available annual peak-discharge data through the 2013 water year. An analysis of regional skew using Bayesian weighted least-squares/Bayesian generalized-least squares regression was performed for Arkansas, Louisiana, and parts of Missouri and Oklahoma. The newly developed constant regional skew of -0.17 was used in the computation of annual exceedance probability discharges for 281 streamgages used in the regional regression analysis. Based on analysis of covariance, four flood regions were identified for use in the generation of regional regression models. Thirty-nine basin characteristics were considered as potential explanatory variables, and ordinary least-squares regression techniques were used to determine the optimum combinations of basin characteristics for each of the four regions. Basin characteristics in candidate models were evaluated based on multicollinearity with other basin characteristics (variance inflation factor < 2.5) and statistical significance at the 95-percent confidence level (p ≤ 0.05). Generalized least-squares regression was used to develop the final regression models for each flood region. Average standard errors of prediction of the generalized least-squares models ranged from 32.76 to 59.53 percent, with the largest range in flood region D. Pseudo coefficients of determination of the generalized least-squares models ranged from 90.29 to 97.28 percent, with the largest range also in flood region D. The regional regression equations apply only to locations on streams in Arkansas where annual peak discharges are not substantially affected by regulation, diversion, channelization, backwater, or urbanization. The applicability and accuracy of the regional regression equations depend on the basin characteristics measured for an ungaged location on a stream being within range of those used to develop the equations.
A Comparison between Multiple Regression Models and CUN-BAE Equation to Predict Body Fat in Adults
Fuster-Parra, Pilar; Bennasar-Veny, Miquel; Tauler, Pedro; Yañez, Aina; López-González, Angel A.; Aguiló, Antoni
2015-01-01
Background Because the accurate measure of body fat (BF) is difficult, several prediction equations have been proposed. The aim of this study was to compare different multiple regression models to predict BF, including the recently reported CUN-BAE equation. Methods Multi regression models using body mass index (BMI) and body adiposity index (BAI) as predictors of BF will be compared. These models will be also compared with the CUN-BAE equation. For all the analysis a sample including all the participants and another one including only the overweight and obese subjects will be considered. The BF reference measure was made using Bioelectrical Impedance Analysis. Results The simplest models including only BMI or BAI as independent variables showed that BAI is a better predictor of BF. However, adding the variable sex to both models made BMI a better predictor than the BAI. For both the whole group of participants and the group of overweight and obese participants, using simple models (BMI, age and sex as variables) allowed obtaining similar correlations with BF as when the more complex CUN-BAE was used (ρ = 0:87 vs. ρ = 0:86 for the whole sample and ρ = 0:88 vs. ρ = 0:89 for overweight and obese subjects, being the second value the one for CUN-BAE). Conclusions There are simpler models than CUN-BAE equation that fits BF as well as CUN-BAE does. Therefore, it could be considered that CUN-BAE overfits. Using a simple linear regression model, the BAI, as the only variable, predicts BF better than BMI. However, when the sex variable is introduced, BMI becomes the indicator of choice to predict BF. PMID:25821960
A comparison between multiple regression models and CUN-BAE equation to predict body fat in adults.
Fuster-Parra, Pilar; Bennasar-Veny, Miquel; Tauler, Pedro; Yañez, Aina; López-González, Angel A; Aguiló, Antoni
2015-01-01
Because the accurate measure of body fat (BF) is difficult, several prediction equations have been proposed. The aim of this study was to compare different multiple regression models to predict BF, including the recently reported CUN-BAE equation. Multi regression models using body mass index (BMI) and body adiposity index (BAI) as predictors of BF will be compared. These models will be also compared with the CUN-BAE equation. For all the analysis a sample including all the participants and another one including only the overweight and obese subjects will be considered. The BF reference measure was made using Bioelectrical Impedance Analysis. The simplest models including only BMI or BAI as independent variables showed that BAI is a better predictor of BF. However, adding the variable sex to both models made BMI a better predictor than the BAI. For both the whole group of participants and the group of overweight and obese participants, using simple models (BMI, age and sex as variables) allowed obtaining similar correlations with BF as when the more complex CUN-BAE was used (ρ = 0:87 vs. ρ = 0:86 for the whole sample and ρ = 0:88 vs. ρ = 0:89 for overweight and obese subjects, being the second value the one for CUN-BAE). There are simpler models than CUN-BAE equation that fits BF as well as CUN-BAE does. Therefore, it could be considered that CUN-BAE overfits. Using a simple linear regression model, the BAI, as the only variable, predicts BF better than BMI. However, when the sex variable is introduced, BMI becomes the indicator of choice to predict BF.
The Bland-Altman Method Should Not Be Used in Regression Cross-Validation Studies
ERIC Educational Resources Information Center
O'Connor, Daniel P.; Mahar, Matthew T.; Laughlin, Mitzi S.; Jackson, Andrew S.
2011-01-01
The purpose of this study was to demonstrate the bias in the Bland-Altman (BA) limits of agreement method when it is used to validate regression models. Data from 1,158 men were used to develop three regression equations to estimate maximum oxygen uptake (R[superscript 2] = 0.40, 0.61, and 0.82, respectively). The equations were evaluated in a…
Modeling animal movements using stochastic differential equations
Haiganoush K. Preisler; Alan A. Ager; Bruce K. Johnson; John G. Kie
2004-01-01
We describe the use of bivariate stochastic differential equations (SDE) for modeling movements of 216 radiocollared female Rocky Mountain elk at the Starkey Experimental Forest and Range in northeastern Oregon. Spatially and temporally explicit vector fields were estimated using approximating difference equations and nonparametric regression techniques. Estimated...
NASA Astrophysics Data System (ADS)
Cai, Jun; Wang, Kuaishe; Shi, Jiamin; Wang, Wen; Liu, Yingying
2018-01-01
Constitutive analysis for hot working of BFe10-1-2 alloy was carried out by using experimental stress-strain data from isothermal hot compression tests, in a wide range of temperature of 1,023 1,273 K, and strain rate range of 0.001 10 s-1. A constitutive equation based on modified double multiple nonlinear regression was proposed considering the independent effects of strain, strain rate, temperature and their interrelation. The predicted flow stress data calculated from the developed equation was compared with the experimental data. Correlation coefficient (R), average absolute relative error (AARE) and relative errors were introduced to verify the validity of the developed constitutive equation. Subsequently, a comparative study was made on the capability of strain-compensated Arrhenius-type constitutive model. The results showed that the developed constitutive equation based on modified double multiple nonlinear regression could predict flow stress of BFe10-1-2 alloy with good correlation and generalization.
Rank-preserving regression: a more robust rank regression model against outliers.
Chen, Tian; Kowalski, Jeanne; Chen, Rui; Wu, Pan; Zhang, Hui; Feng, Changyong; Tu, Xin M
2016-08-30
Mean-based semi-parametric regression models such as the popular generalized estimating equations are widely used to improve robustness of inference over parametric models. Unfortunately, such models are quite sensitive to outlying observations. The Wilcoxon-score-based rank regression (RR) provides more robust estimates over generalized estimating equations against outliers. However, the RR and its extensions do not sufficiently address missing data arising in longitudinal studies. In this paper, we propose a new approach to address outliers under a different framework based on the functional response models. This functional-response-model-based alternative not only addresses limitations of the RR and its extensions for longitudinal data, but, with its rank-preserving property, even provides more robust estimates than these alternatives. The proposed approach is illustrated with both real and simulated data. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Estimation of peak-discharge frequency of urban streams in Jefferson County, Kentucky
Martin, Gary R.; Ruhl, Kevin J.; Moore, Brian L.; Rose, Martin F.
1997-01-01
An investigation of flood-hydrograph characteristics for streams in urban Jefferson County, Kentucky, was made to obtain hydrologic information needed for waterresources management. Equations for estimating peak-discharge frequencies for ungaged streams in the county were developed by combining (1) long-term annual peakdischarge data and rainfall-runoff data collected from 1991 to 1995 in 13 urban basins and (2) long-term annual peak-discharge data in four rural basins located in hydrologically similar areas of neighboring counties. The basins ranged in size from 1.36 to 64.0 square miles. The U.S. Geological Survey Rainfall- Runoff Model (RRM) was calibrated for each of the urban basins. The calibrated models were used with long-term, historical rainfall and pan-evaporation data to simulate 79 years of annual peak-discharge data. Peak-discharge frequencies were estimated by fitting the logarithms of the annual peak discharges to a Pearson-Type III frequency distribution. The simulated peak-discharge frequencies were adjusted for improved reliability by application of bias-correction factors derived from peakdischarge frequencies based on local, observed annual peak discharges. The three-parameter and the preferred seven-parameter nationwide urban-peak-discharge regression equations previously developed by USGS investigators provided biased (high) estimates for the urban basins studied. Generalized-least-square regression procedures were used to relate peakdischarge frequency to selected basin characteristics. Regression equations were developed to estimate peak-discharge frequency by adjusting peak-dischargefrequency estimates made by use of the threeparameter nationwide urban regression equations. The regression equations are presented in equivalent forms as functions of contributing drainage area, main-channel slope, and basin development factor, which is an index for measuring the efficiency of the basin drainage system. Estimates of peak discharges for streams in the county can be made for the 2-, 5-, 10-, 25-, 50-, and 100-year recurrence intervals by use of the regression equations. The average standard errors of prediction of the regression equations ranges from ? 34 to ? 45 percent. The regression equations are applicable to ungaged streams in the county having a specific range of basin characteristics.
Lewis, Jason M.
2010-01-01
Peak-streamflow regression equations were determined for estimating flows with exceedance probabilities from 50 to 0.2 percent for the state of Oklahoma. These regression equations incorporate basin characteristics to estimate peak-streamflow magnitude and frequency throughout the state by use of a generalized least squares regression analysis. The most statistically significant independent variables required to estimate peak-streamflow magnitude and frequency for unregulated streams in Oklahoma are contributing drainage area, mean-annual precipitation, and main-channel slope. The regression equations are applicable for watershed basins with drainage areas less than 2,510 square miles that are not affected by regulation. The resulting regression equations had a standard model error ranging from 31 to 46 percent. Annual-maximum peak flows observed at 231 streamflow-gaging stations through water year 2008 were used for the regression analysis. Gage peak-streamflow estimates were used from previous work unless 2008 gaging-station data were available, in which new peak-streamflow estimates were calculated. The U.S. Geological Survey StreamStats web application was used to obtain the independent variables required for the peak-streamflow regression equations. Limitations on the use of the regression equations and the reliability of regression estimates for natural unregulated streams are described. Log-Pearson Type III analysis information, basin and climate characteristics, and the peak-streamflow frequency estimates for the 231 gaging stations in and near Oklahoma are listed. Methodologies are presented to estimate peak streamflows at ungaged sites by using estimates from gaging stations on unregulated streams. For ungaged sites on urban streams and streams regulated by small floodwater retarding structures, an adjustment of the statewide regression equations for natural unregulated streams can be used to estimate peak-streamflow magnitude and frequency.
Improving precision of glomerular filtration rate estimating model by ensemble learning.
Liu, Xun; Li, Ningshan; Lv, Linsheng; Fu, Yongmei; Cheng, Cailian; Wang, Caixia; Ye, Yuqiu; Li, Shaomin; Lou, Tanqi
2017-11-09
Accurate assessment of kidney function is clinically important, but estimates of glomerular filtration rate (GFR) by regression are imprecise. We hypothesized that ensemble learning could improve precision. A total of 1419 participants were enrolled, with 1002 in the development dataset and 417 in the external validation dataset. GFR was independently estimated from age, sex and serum creatinine using an artificial neural network (ANN), support vector machine (SVM), regression, and ensemble learning. GFR was measured by 99mTc-DTPA renal dynamic imaging calibrated with dual plasma sample 99mTc-DTPA GFR. Mean measured GFRs were 70.0 ml/min/1.73 m 2 in the developmental and 53.4 ml/min/1.73 m 2 in the external validation cohorts. In the external validation cohort, precision was better in the ensemble model of the ANN, SVM and regression equation (IQR = 13.5 ml/min/1.73 m 2 ) than in the new regression model (IQR = 14.0 ml/min/1.73 m 2 , P < 0.001). The precision of ensemble learning was the best of the three models, but the models had similar bias and accuracy. The median difference ranged from 2.3 to 3.7 ml/min/1.73 m 2 , 30% accuracy ranged from 73.1 to 76.0%, and P was > 0.05 for all comparisons of the new regression equation and the other new models. An ensemble learning model including three variables, the average ANN, SVM, and regression equation values, was more precise than the new regression model. A more complex ensemble learning strategy may further improve GFR estimates.
Deletion Diagnostics for Alternating Logistic Regressions
Preisser, John S.; By, Kunthel; Perin, Jamie; Qaqish, Bahjat F.
2013-01-01
Deletion diagnostics are introduced for the regression analysis of clustered binary outcomes estimated with alternating logistic regressions, an implementation of generalized estimating equations (GEE) that estimates regression coefficients in a marginal mean model and in a model for the intracluster association given by the log odds ratio. The diagnostics are developed within an estimating equations framework that recasts the estimating functions for association parameters based upon conditional residuals into equivalent functions based upon marginal residuals. Extensions of earlier work on GEE diagnostics follow directly, including computational formulae for one-step deletion diagnostics that measure the influence of a cluster of observations on the estimated regression parameters and on the overall marginal mean or association model fit. The diagnostic formulae are evaluated with simulations studies and with an application concerning an assessment of factors associated with health maintenance visits in primary care medical practices. The application and the simulations demonstrate that the proposed cluster-deletion diagnostics for alternating logistic regressions are good approximations of their exact fully iterated counterparts. PMID:22777960
Predictive equations for the estimation of body size in seals and sea lions (Carnivora: Pinnipedia)
Churchill, Morgan; Clementz, Mark T; Kohno, Naoki
2014-01-01
Body size plays an important role in pinniped ecology and life history. However, body size data is often absent for historical, archaeological, and fossil specimens. To estimate the body size of pinnipeds (seals, sea lions, and walruses) for today and the past, we used 14 commonly preserved cranial measurements to develop sets of single variable and multivariate predictive equations for pinniped body mass and total length. Principal components analysis (PCA) was used to test whether separate family specific regressions were more appropriate than single predictive equations for Pinnipedia. The influence of phylogeny was tested with phylogenetic independent contrasts (PIC). The accuracy of these regressions was then assessed using a combination of coefficient of determination, percent prediction error, and standard error of estimation. Three different methods of multivariate analysis were examined: bidirectional stepwise model selection using Akaike information criteria; all-subsets model selection using Bayesian information criteria (BIC); and partial least squares regression. The PCA showed clear discrimination between Otariidae (fur seals and sea lions) and Phocidae (earless seals) for the 14 measurements, indicating the need for family-specific regression equations. The PIC analysis found that phylogeny had a minor influence on relationship between morphological variables and body size. The regressions for total length were more accurate than those for body mass, and equations specific to Otariidae were more accurate than those for Phocidae. Of the three multivariate methods, the all-subsets approach required the fewest number of variables to estimate body size accurately. We then used the single variable predictive equations and the all-subsets approach to estimate the body size of two recently extinct pinniped taxa, the Caribbean monk seal (Monachus tropicalis) and the Japanese sea lion (Zalophus japonicus). Body size estimates using single variable regressions generally under or over-estimated body size; however, the all-subset regression produced body size estimates that were close to historically recorded body length for these two species. This indicates that the all-subset regression equations developed in this study can estimate body size accurately. PMID:24916814
Numerical simulations for tumor and cellular immune system interactions in lung cancer treatment
NASA Astrophysics Data System (ADS)
Kolev, M.; Nawrocki, S.; Zubik-Kowal, B.
2013-06-01
We investigate a new mathematical model that describes lung cancer regression in patients treated by chemotherapy and radiotherapy. The model is composed of nonlinear integro-differential equations derived from the so-called kinetic theory for active particles and a new sink function is investigated according to clinical data from carcinoma planoepitheliale. The model equations are solved numerically and the data are utilized in order to find their unknown parameters. The results of the numerical experiments show a good correlation between the predicted and clinical data and illustrate that the mathematical model has potential to describe lung cancer regression.
Wind tunnel test of Teledyne Geotech model 1564B cup anemometer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Parker, M.J.; Addis, R.P.
1991-04-04
The Department of Energy (DOE) Environment, Safety and Health Compliance Assessment (Tiger Team) of the Savannah River Site (SRS) questioned the method by which wind speed sensors (cup anemometers) are calibrated by the Environmental Technology Section (ETS). The Tiger Team member was concerned that calibration data was generated by running the wind tunnel to only 26 miles per hour (mph) when speeds exceeding 50 mph are readily obtainable. A wind tunnel experiment was conducted and confirmed the validity of the practice. Wind speeds common to SRS (6 mph) were predicted more accurately by 0--25 mph regression equations than 0--50 mphmore » regression equations. Higher wind speeds were slightly overpredicted by the 0--25 mph regression equations when compared to 0--50 mph regression equations. However, the greater benefit of more accurate lower wind speed predictions accuracy outweight the benefit of slightly better high (extreme) wind speed predictions. Therefore, it is concluded that 0--25 mph regression equations should continue to be utilized by ETS at SRS. During the Department of Energy Tiger Team audit, concerns were raised about the calibration of SRS cup anemometers. Wind speed is measured by ETS with Teledyne Geotech model 1564B cup anemometers, which are calibrated in the ETS wind tunnel. Linear regression lines are fitted to data points of tunnel speed versus anemometer output voltages up to 25 mph. The regression coefficients are then implemented into the data acquisition computer software when an instrument is installed in the field. The concern raised was that since the wind tunnel at SRS is able to generate a maximum wind speed higher than 25 mph, errors may be introduced in not using the full range of the wind tunnel.« less
Wind tunnel test of Teledyne Geotech model 1564B cup anemometer
NASA Astrophysics Data System (ADS)
Parker, M. J.; Addis, R. P.
1991-04-01
The Department of Energy (DOE) Environment, Safety, and Health Compliance Assessment (Tiger Team) of the Savannah River Site (SRS) questioned the method by which wind speed sensors (cup anemometers) are calibrated by the Environmental Technology Section (ETS). The Tiger Team member was concerned that calibration data was generated by running the wind tunnel to only 26 miles per hour (mph) when speeds exceeding 50 mph are readily obtainable. A wind tunnel experiment was conducted and confirmed the validity of the practice. Wind speeds common to SRS (6 mph) were predicted more accurately by 0-25 mph regression equations than 0-50 mph regression equations. Higher wind speeds were slightly overpredicted by the 0-25 mph regression equations when compared to 0-50 mph regression equations. However, the greater benefit of more accurate lower wind speed predictions accuracy outweigh the benefit of slightly better high (extreme) wind speed predictions. Therefore, it is concluded that 0-25 mph regression equations should continue to be utilized by ETS at SRS. During the Department of Energy Tiger Team audit, concerns were raised about the calibration of SRS cup anemometers. Wind speed is measured by ETS with Teledyne Geotech model 1564B cup anemometers, which are calibrated in the ETS wind tunnel. Linear regression lines are fitted to data points of tunnel speed versus anemometer output voltages up to 25 mph. The regression coefficients are then implemented into the data acquisition computer software when an instrument is installed in the field. The concern raised was that since the wind tunnel at SRS is able to generate a maximum wind speed higher than 25 mph, errors may be introduced in not using the full range of the wind tunnel.
Model parameter uncertainty analysis for an annual field-scale P loss model
NASA Astrophysics Data System (ADS)
Bolster, Carl H.; Vadas, Peter A.; Boykin, Debbie
2016-08-01
Phosphorous (P) fate and transport models are important tools for developing and evaluating conservation practices aimed at reducing P losses from agricultural fields. Because all models are simplifications of complex systems, there will exist an inherent amount of uncertainty associated with their predictions. It is therefore important that efforts be directed at identifying, quantifying, and communicating the different sources of model uncertainties. In this study, we conducted an uncertainty analysis with the Annual P Loss Estimator (APLE) model. Our analysis included calculating parameter uncertainties and confidence and prediction intervals for five internal regression equations in APLE. We also estimated uncertainties of the model input variables based on values reported in the literature. We then predicted P loss for a suite of fields under different management and climatic conditions while accounting for uncertainties in the model parameters and inputs and compared the relative contributions of these two sources of uncertainty to the overall uncertainty associated with predictions of P loss. Both the overall magnitude of the prediction uncertainties and the relative contributions of the two sources of uncertainty varied depending on management practices and field characteristics. This was due to differences in the number of model input variables and the uncertainties in the regression equations associated with each P loss pathway. Inspection of the uncertainties in the five regression equations brought attention to a previously unrecognized limitation with the equation used to partition surface-applied fertilizer P between leaching and runoff losses. As a result, an alternate equation was identified that provided similar predictions with much less uncertainty. Our results demonstrate how a thorough uncertainty and model residual analysis can be used to identify limitations with a model. Such insight can then be used to guide future data collection and model development and evaluation efforts.
Yamagata, Tetsuo; Zanelli, Ugo; Gallemann, Dieter; Perrin, Dominique; Dolgos, Hugues; Petersson, Carl
2017-09-01
1. We compared direct scaling, regression model equation and the so-called "Poulin et al." methods to scale clearance (CL) from in vitro intrinsic clearance (CL int ) measured in human hepatocytes using two sets of compounds. One reference set comprised of 20 compounds with known elimination pathways and one external evaluation set based on 17 compounds development in Merck (MS). 2. A 90% prospective confidence interval was calculated using the reference set. This interval was found relevant for the regression equation method. The three outliers identified were justified on the basis of their elimination mechanism. 3. The direct scaling method showed a systematic underestimation of clearance in both the reference and evaluation sets. The "Poulin et al." and the regression equation methods showed no obvious bias in either the reference or evaluation sets. 4. The regression model equation was slightly superior to the "Poulin et al." method in the reference set and showed a better absolute average fold error (AAFE) of value 1.3 compared to 1.6. A larger difference was observed in the evaluation set were the regression method and "Poulin et al." resulted in an AAFE of 1.7 and 2.6, respectively (removing the three compounds with known issues mentioned above). A similar pattern was observed for the correlation coefficient. Based on these data we suggest the regression equation method combined with a prospective confidence interval as the first choice for the extrapolation of human in vivo hepatic metabolic clearance from in vitro systems.
Empirical Likelihood in Nonignorable Covariate-Missing Data Problems.
Xie, Yanmei; Zhang, Biao
2017-04-20
Missing covariate data occurs often in regression analysis, which frequently arises in the health and social sciences as well as in survey sampling. We study methods for the analysis of a nonignorable covariate-missing data problem in an assumed conditional mean function when some covariates are completely observed but other covariates are missing for some subjects. We adopt the semiparametric perspective of Bartlett et al. (Improving upon the efficiency of complete case analysis when covariates are MNAR. Biostatistics 2014;15:719-30) on regression analyses with nonignorable missing covariates, in which they have introduced the use of two working models, the working probability model of missingness and the working conditional score model. In this paper, we study an empirical likelihood approach to nonignorable covariate-missing data problems with the objective of effectively utilizing the two working models in the analysis of covariate-missing data. We propose a unified approach to constructing a system of unbiased estimating equations, where there are more equations than unknown parameters of interest. One useful feature of these unbiased estimating equations is that they naturally incorporate the incomplete data into the data analysis, making it possible to seek efficient estimation of the parameter of interest even when the working regression function is not specified to be the optimal regression function. We apply the general methodology of empirical likelihood to optimally combine these unbiased estimating equations. We propose three maximum empirical likelihood estimators of the underlying regression parameters and compare their efficiencies with other existing competitors. We present a simulation study to compare the finite-sample performance of various methods with respect to bias, efficiency, and robustness to model misspecification. The proposed empirical likelihood method is also illustrated by an analysis of a data set from the US National Health and Nutrition Examination Survey (NHANES).
Bayesian Analysis of Nonlinear Structural Equation Models with Nonignorable Missing Data
ERIC Educational Resources Information Center
Lee, Sik-Yum
2006-01-01
A Bayesian approach is developed for analyzing nonlinear structural equation models with nonignorable missing data. The nonignorable missingness mechanism is specified by a logistic regression model. A hybrid algorithm that combines the Gibbs sampler and the Metropolis-Hastings algorithm is used to produce the joint Bayesian estimates of…
Effects of Employing Ridge Regression in Structural Equation Models.
ERIC Educational Resources Information Center
McQuitty, Shaun
1997-01-01
LISREL 8 invokes a ridge option when maximum likelihood or generalized least squares are used to estimate a structural equation model with a nonpositive definite covariance or correlation matrix. Implications of the ridge option for model fit, parameter estimates, and standard errors are explored through two examples. (SLD)
A Predictive Model for Microbial Counts on Beaches where Intertidal Sand is the Primary Source
Feng, Zhixuan; Reniers, Ad; Haus, Brian K.; Solo-Gabriele, Helena M.; Wang, John D.; Fleming, Lora E.
2015-01-01
Human health protection at recreational beaches requires accurate and timely information on microbiological conditions to issue advisories. The objective of this study was to develop a new numerical mass balance model for enterococci levels on nonpoint source beaches. The significant advantage of this model is its easy implementation, and it provides a detailed description of the cross-shore distribution of enterococci that is useful for beach management purposes. The performance of the balance model was evaluated by comparing predicted exceedances of a beach advisory threshold value to field data, and to a traditional regression model. Both the balance model and regression equation predicted approximately 70% the advisories correctly at the knee depth and over 90% at the waist depth. The balance model has the advantage over the regression equation in its ability to simulate spatiotemporal variations of microbial levels, and it is recommended for making more informed management decisions. PMID:25840869
Khan, I.; Hawlader, Sophie Mohammad Delwer Hossain; Arifeen, Shams El; Moore, Sophie; Hills, Andrew P.; Wells, Jonathan C.; Persson, Lars-Åke; Kabir, Iqbal
2012-01-01
The aim of this study was to investigate the validity of the Tanita TBF 300A leg-to-leg bioimpedance analyzer for estimating fat-free mass (FFM) in Bangladeshi children aged 4-10 years and to develop novel prediction equations for use in this population, using deuterium dilution as the reference method. Two hundred Bangladeshi children were enrolled. The isotope dilution technique with deuterium oxide was used for estimation of total body water (TBW). FFM estimated by Tanita was compared with results of deuterium oxide dilution technique. Novel prediction equations were created for estimating FFM, using linear regression models, fitting child's height and impedance as predictors. There was a significant difference in FFM and percentage of body fat (BF%) between methods (p<0.01), Tanita underestimating TBW in boys (p=0.001) and underestimating BF% in girls (p<0.001). A basic linear regression model with height and impedance explained 83% of the variance in FFM estimated by deuterium oxide dilution technique. The best-fit equation to predict FFM from linear regression modelling was achieved by adding weight, sex, and age to the basic model, bringing the adjusted R2 to 89% (standard error=0.90, p<0.001). These data suggest Tanita analyzer may be a valid field-assessment technique in Bangladeshi children when using population-specific prediction equations, such as the ones developed here. PMID:23082630
Lawrence, Stephen J.
2012-01-01
Regression analyses show that E. coli density in samples was strongly related to turbidity, streamflow characteristics, and season at both sites. The regression equation chosen for the Norcross data showed that 78 percent of the variability in E. coli density (in log base 10 units) was explained by the variability in turbidity values (in log base 10 units), streamflow event (dry-weather flow or stormflow), season (cool or warm), and an interaction term that is the cross product of streamflow event and turbidity. The regression equation chosen for the Atlanta data showed that 76 percent of the variability in E. coli density (in log base 10 units) was explained by the variability in turbidity values (in log base 10 units), water temperature, streamflow event, and an interaction term that is the cross product of streamflow event and turbidity. Residual analysis and model confirmation using new data indicated the regression equations selected at both sites predicted E. coli density within the 90 percent prediction intervals of the equations and could be used to predict E. coli density in real time at both sites.
ERIC Educational Resources Information Center
Bulcock, J. W.; And Others
Multicollinearity refers to the presence of highly intercorrelated independent variables in structural equation models, that is, models estimated by using techniques such as least squares regression and maximum likelihood. There is a problem of multicollinearity in both the natural and social sciences where theory formulation and estimation is in…
USING LINEAR AND POLYNOMIAL MODELS TO EXAMINE THE ENVIRONMENTAL STABILITY OF VIRUSES
The article presents the development of model equations for describing the fate of viral infectivity in environmental samples. Most of the models were based upon the use of a two-step linear regression approach. The first step employs regression of log base 10 transformed viral t...
Kennedy, Jeffrey R.; Paretti, Nicholas V.; Veilleux, Andrea G.
2014-01-01
Regression equations, which allow predictions of n-day flood-duration flows for selected annual exceedance probabilities at ungaged sites, were developed using generalized least-squares regression and flood-duration flow frequency estimates at 56 streamgaging stations within a single, relatively uniform physiographic region in the central part of Arizona, between the Colorado Plateau and Basin and Range Province, called the Transition Zone. Drainage area explained most of the variation in the n-day flood-duration annual exceedance probabilities, but mean annual precipitation and mean elevation were also significant variables in the regression models. Standard error of prediction for the regression equations varies from 28 to 53 percent and generally decreases with increasing n-day duration. Outside the Transition Zone there are insufficient streamgaging stations to develop regression equations, but flood-duration flow frequency estimates are presented at select streamgaging stations.
Smadi, Hanan; Sargeant, Jan M; Shannon, Harry S; Raina, Parminder
2012-12-01
Growth and inactivation regression equations were developed to describe the effects of temperature on Salmonella concentration on chicken meat for refrigerated temperatures (⩽10°C) and for thermal treatment temperatures (55-70°C). The main objectives were: (i) to compare Salmonella growth/inactivation in chicken meat versus laboratory media; (ii) to create regression equations to estimate Salmonella growth in chicken meat that can be used in quantitative risk assessment (QRA) modeling; and (iii) to create regression equations to estimate D-values needed to inactivate Salmonella in chicken meat. A systematic approach was used to identify the articles, critically appraise them, and pool outcomes across studies. Growth represented in density (Log10CFU/g) and D-values (min) as a function of temperature were modeled using hierarchical mixed effects regression models. The current meta-analysis analysis found a significant difference (P⩽0.05) between the two matrices - chicken meat and laboratory media - for both growth at refrigerated temperatures and inactivation by thermal treatment. Growth and inactivation were significantly influenced by temperature after controlling for other variables; however, no consistent pattern in growth was found. Validation of growth and inactivation equations against data not used in their development is needed. Copyright © 2012 Ministry of Health, Saudi Arabia. Published by Elsevier Ltd. All rights reserved.
ERIC Educational Resources Information Center
Mooijaart, Ab; Satorra, Albert
2009-01-01
In this paper, we show that for some structural equation models (SEM), the classical chi-square goodness-of-fit test is unable to detect the presence of nonlinear terms in the model. As an example, we consider a regression model with latent variables and interactions terms. Not only the model test has zero power against that type of…
NASA Technical Reports Server (NTRS)
DeLoach, Richard
2012-01-01
This paper reviews the derivation of an equation for scaling response surface modeling experiments. The equation represents the smallest number of data points required to fit a linear regression polynomial so as to achieve certain specified model adequacy criteria. Specific criteria are proposed which simplify an otherwise rather complex equation, generating a practical rule of thumb for the minimum volume of data required to adequately fit a polynomial with a specified number of terms in the model. This equation and the simplified rule of thumb it produces can be applied to minimize the cost of wind tunnel testing.
ERIC Educational Resources Information Center
Yan, Jun; Aseltine, Robert H., Jr.; Harel, Ofer
2013-01-01
Comparing regression coefficients between models when one model is nested within another is of great practical interest when two explanations of a given phenomenon are specified as linear models. The statistical problem is whether the coefficients associated with a given set of covariates change significantly when other covariates are added into…
Decreasing Multicollinearity: A Method for Models with Multiplicative Functions.
ERIC Educational Resources Information Center
Smith, Kent W.; Sasaki, M. S.
1979-01-01
A method is proposed for overcoming the problem of multicollinearity in multiple regression equations where multiplicative independent terms are entered. The method is not a ridge regression solution. (JKS)
Hybrid Rocket Performance Prediction with Coupling Method of CFD and Thermal Conduction Calculation
NASA Astrophysics Data System (ADS)
Funami, Yuki; Shimada, Toru
The final purpose of this study is to develop a design tool for hybrid rocket engines. This tool is a computer code which will be used in order to investigate rocket performance characteristics and unsteady phenomena lasting through the burning time, such as fuel regression or combustion oscillation. When phenomena inside a combustion chamber, namely boundary layer combustion, are described, it is difficult to use rigorous models for this target. It is because calculation cost may be too expensive. Therefore simple models are required for this calculation. In this study, quasi-one-dimensional compressible Euler equations for flowfields inside a chamber and the equation for thermal conduction inside a solid fuel are numerically solved. The energy balance equation at the solid fuel surface is solved to estimate fuel regression rate. Heat feedback model is Karabeyoglu's model dependent on total mass flux. Combustion model is global single step reaction model for 4 chemical species or chemical equilibrium model for 9 chemical species. As a first step, steady-state solutions are reported.
Evaluation and application of regional turbidity-sediment regression models in Virginia
Hyer, Kenneth; Jastram, John D.; Moyer, Douglas; Webber, James S.; Chanat, Jeffrey G.
2015-01-01
Conventional thinking has long held that turbidity-sediment surrogate-regression equations are site specific and that regression equations developed at a single monitoring station should not be applied to another station; however, few studies have evaluated this issue in a rigorous manner. If robust regional turbidity-sediment models can be developed successfully, their applications could greatly expand the usage of these methods. Suspended sediment load estimation could occur as soon as flow and turbidity monitoring commence at a site, suspended sediment sampling frequencies for various projects potentially could be reduced, and special-project applications (sediment monitoring following dam removal, for example) could be significantly enhanced. The objective of this effort was to investigate the turbidity-suspended sediment concentration (SSC) relations at all available USGS monitoring sites within Virginia to determine whether meaningful turbidity-sediment regression models can be developed by combining the data from multiple monitoring stations into a single model, known as a “regional” model. Following the development of the regional model, additional objectives included a comparison of predicted SSCs between the regional model and commonly used site-specific models, as well as an evaluation of why specific monitoring stations did not fit the regional model.
Kupek, Emil
2006-03-15
Structural equation modelling (SEM) has been increasingly used in medical statistics for solving a system of related regression equations. However, a great obstacle for its wider use has been its difficulty in handling categorical variables within the framework of generalised linear models. A large data set with a known structure among two related outcomes and three independent variables was generated to investigate the use of Yule's transformation of odds ratio (OR) into Q-metric by (OR-1)/(OR+1) to approximate Pearson's correlation coefficients between binary variables whose covariance structure can be further analysed by SEM. Percent of correctly classified events and non-events was compared with the classification obtained by logistic regression. The performance of SEM based on Q-metric was also checked on a small (N = 100) random sample of the data generated and on a real data set. SEM successfully recovered the generated model structure. SEM of real data suggested a significant influence of a latent confounding variable which would have not been detectable by standard logistic regression. SEM classification performance was broadly similar to that of the logistic regression. The analysis of binary data can be greatly enhanced by Yule's transformation of odds ratios into estimated correlation matrix that can be further analysed by SEM. The interpretation of results is aided by expressing them as odds ratios which are the most frequently used measure of effect in medical statistics.
Methods for estimating drought streamflow probabilities for Virginia streams
Austin, Samuel H.
2014-01-01
Maximum likelihood logistic regression model equations used to estimate drought flow probabilities for Virginia streams are presented for 259 hydrologic basins in Virginia. Winter streamflows were used to estimate the likelihood of streamflows during the subsequent drought-prone summer months. The maximum likelihood logistic regression models identify probable streamflows from 5 to 8 months in advance. More than 5 million streamflow daily values collected over the period of record (January 1, 1900 through May 16, 2012) were compiled and analyzed over a minimum 10-year (maximum 112-year) period of record. The analysis yielded the 46,704 equations with statistically significant fit statistics and parameter ranges published in two tables in this report. These model equations produce summer month (July, August, and September) drought flow threshold probabilities as a function of streamflows during the previous winter months (November, December, January, and February). Example calculations are provided, demonstrating how to use the equations to estimate probable streamflows as much as 8 months in advance.
Sherwood, J.M.
1986-01-01
Methods are presented for estimating peak discharges, flood volumes and hydrograph shapes of small (less than 5 sq mi) urban streams in Ohio. Examples of how to use the various regression equations and estimating techniques also are presented. Multiple-regression equations were developed for estimating peak discharges having recurrence intervals of 2, 5, 10, 25, 50, and 100 years. The significant independent variables affecting peak discharge are drainage area, main-channel slope, average basin-elevation index, and basin-development factor. Standard errors of regression and prediction for the peak discharge equations range from +/-37% to +/-41%. An equation also was developed to estimate the flood volume of a given peak discharge. Peak discharge, drainage area, main-channel slope, and basin-development factor were found to be the significant independent variables affecting flood volumes for given peak discharges. The standard error of regression for the volume equation is +/-52%. A technique is described for estimating the shape of a runoff hydrograph by applying a specific peak discharge and the estimated lagtime to a dimensionless hydrograph. An equation for estimating the lagtime of a basin was developed. Two variables--main-channel length divided by the square root of the main-channel slope and basin-development factor--have a significant effect on basin lagtime. The standard error of regression for the lagtime equation is +/-48%. The data base for the study was established by collecting rainfall-runoff data at 30 basins distributed throughout several metropolitan areas of Ohio. Five to eight years of data were collected at a 5-min record interval. The USGS rainfall-runoff model A634 was calibrated for each site. The calibrated models were used in conjunction with long-term rainfall records to generate a long-term streamflow record for each site. Each annual peak-discharge record was fitted to a Log-Pearson Type III frequency curve. Multiple-regression techniques were then used to analyze the peak discharge data as a function of the basin characteristics of the 30 sites. (Author 's abstract)
Regression models to predict hip joint centers in pathological hip population.
Mantovani, Giulia; Ng, K C Geoffrey; Lamontagne, Mario
2016-02-01
The purpose was to investigate the validity of Harrington's and Davis's hip joint center (HJC) regression equations on a population affected by a hip deformity, (i.e., femoroacetabular impingement). Sixty-seven participants (21 healthy controls, 46 with a cam-type deformity) underwent pelvic CT imaging. Relevant bony landmarks and geometric HJCs were digitized from the images, and skin thickness was measured for the anterior and posterior superior iliac spines. Non-parametric statistical and Bland-Altman tests analyzed differences between the predicted HJC (from regression equations) and the actual HJC (from CT images). The error from Davis's model (25.0 ± 6.7 mm) was larger than Harrington's (12.3 ± 5.9 mm, p<0.001). There were no differences between groups, thus, studies on femoroacetabular impingement can implement conventional regression models. Measured skin thickness was 9.7 ± 7.0mm and 19.6 ± 10.9 mm for the anterior and posterior bony landmarks, respectively, and correlated with body mass index. Skin thickness estimates can be considered to reduce the systematic error introduced by surface markers. New adult-specific regression equations were developed from the CT dataset, with the hypothesis that they could provide better estimates when tuned to a larger adult-specific dataset. The linear models were validated on external datasets and using leave-one-out cross-validation techniques; Prediction errors were comparable to those of Harrington's model, despite the adult-specific population and the larger sample size, thus, prediction accuracy obtained from these parameters could not be improved. Copyright © 2015 Elsevier B.V. All rights reserved.
Prediction equations of forced oscillation technique: the insidious role of collinearity.
Narchi, Hassib; AlBlooshi, Afaf
2018-03-27
Many studies have reported reference data for forced oscillation technique (FOT) in healthy children. The prediction equation of FOT parameters were derived from a multivariable regression model examining the effect of age, gender, weight and height on each parameter. As many of these variables are likely to be correlated, collinearity might have affected the accuracy of the model, potentially resulting in misleading, erroneous or difficult to interpret conclusions.The aim of this work was: To review all FOT publications in children since 2005 to analyze whether collinearity was considered in the construction of the published prediction equations. Then to compare these prediction equations with our own study. And to analyse, in our study, how collinearity between the explanatory variables might affect the predicted equations if it was not considered in the model. The results showed that none of the ten reviewed studies had stated whether collinearity was checked for. Half of the reports had also included in their equations variables which are physiologically correlated, such as age, weight and height. The predicted resistance varied by up to 28% amongst these studies. And in our study, multicollinearity was identified between the explanatory variables initially considered for the regression model (age, weight and height). Ignoring it would have resulted in inaccuracies in the coefficients of the equation, their signs (positive or negative), their 95% confidence intervals, their significance level and the model goodness of fit. In Conclusion with inaccurately constructed and improperly reported models, understanding the results and reproducing the models for future research might be compromised.
A general equation to obtain multiple cut-off scores on a test from multinomial logistic regression.
Bersabé, Rosa; Rivas, Teresa
2010-05-01
The authors derive a general equation to compute multiple cut-offs on a total test score in order to classify individuals into more than two ordinal categories. The equation is derived from the multinomial logistic regression (MLR) model, which is an extension of the binary logistic regression (BLR) model to accommodate polytomous outcome variables. From this analytical procedure, cut-off scores are established at the test score (the predictor variable) at which an individual is as likely to be in category j as in category j+1 of an ordinal outcome variable. The application of the complete procedure is illustrated by an example with data from an actual study on eating disorders. In this example, two cut-off scores on the Eating Attitudes Test (EAT-26) scores are obtained in order to classify individuals into three ordinal categories: asymptomatic, symptomatic and eating disorder. Diagnoses were made from the responses to a self-report (Q-EDD) that operationalises DSM-IV criteria for eating disorders. Alternatives to the MLR model to set multiple cut-off scores are discussed.
2013-01-01
Background This study aims to improve accuracy of Bioelectrical Impedance Analysis (BIA) prediction equations for estimating fat free mass (FFM) of the elderly by using non-linear Back Propagation Artificial Neural Network (BP-ANN) model and to compare the predictive accuracy with the linear regression model by using energy dual X-ray absorptiometry (DXA) as reference method. Methods A total of 88 Taiwanese elderly adults were recruited in this study as subjects. Linear regression equations and BP-ANN prediction equation were developed using impedances and other anthropometrics for predicting the reference FFM measured by DXA (FFMDXA) in 36 male and 26 female Taiwanese elderly adults. The FFM estimated by BIA prediction equations using traditional linear regression model (FFMLR) and BP-ANN model (FFMANN) were compared to the FFMDXA. The measuring results of an additional 26 elderly adults were used to validate than accuracy of the predictive models. Results The results showed the significant predictors were impedance, gender, age, height and weight in developed FFMLR linear model (LR) for predicting FFM (coefficient of determination, r2 = 0.940; standard error of estimate (SEE) = 2.729 kg; root mean square error (RMSE) = 2.571kg, P < 0.001). The above predictors were set as the variables of the input layer by using five neurons in the BP-ANN model (r2 = 0.987 with a SD = 1.192 kg and relatively lower RMSE = 1.183 kg), which had greater (improved) accuracy for estimating FFM when compared with linear model. The results showed a better agreement existed between FFMANN and FFMDXA than that between FFMLR and FFMDXA. Conclusion When compared the performance of developed prediction equations for estimating reference FFMDXA, the linear model has lower r2 with a larger SD in predictive results than that of BP-ANN model, which indicated ANN model is more suitable for estimating FFM. PMID:23388042
The Variance Normalization Method of Ridge Regression Analysis.
ERIC Educational Resources Information Center
Bulcock, J. W.; And Others
The testing of contemporary sociological theory often calls for the application of structural-equation models to data which are inherently collinear. It is shown that simple ridge regression, which is commonly used for controlling the instability of ordinary least squares regression estimates in ill-conditioned data sets, is not a legitimate…
Estimating Flow-Duration and Low-Flow Frequency Statistics for Unregulated Streams in Oregon
Risley, John; Stonewall, Adam J.; Haluska, Tana
2008-01-01
Flow statistical datasets, basin-characteristic datasets, and regression equations were developed to provide decision makers with surface-water information needed for activities such as water-quality regulation, water-rights adjudication, biological habitat assessment, infrastructure design, and water-supply planning and management. The flow statistics, which included annual and monthly period of record flow durations (5th, 10th, 25th, 50th, and 95th percent exceedances) and annual and monthly 7-day, 10-year (7Q10) and 7-day, 2-year (7Q2) low flows, were computed at 466 streamflow-gaging stations at sites with unregulated flow conditions throughout Oregon and adjacent areas of neighboring States. Regression equations, created from the flow statistics and basin characteristics of the stations, can be used to estimate flow statistics at ungaged stream sites in Oregon. The study area was divided into 10 regression modeling regions based on ecological, topographic, geologic, hydrologic, and climatic criteria. In total, 910 annual and monthly regression equations were created to predict the 7 flow statistics in the 10 regions. Equations to predict the five flow-duration exceedance percentages and the two low-flow frequency statistics were created with Ordinary Least Squares and Generalized Least Squares regression, respectively. The standard errors of estimate of the equations created to predict the 5th and 95th percent exceedances had medians of 42.4 and 64.4 percent, respectively. The standard errors of prediction of the equations created to predict the 7Q2 and 7Q10 low-flow statistics had medians of 51.7 and 61.2 percent, respectively. Standard errors for regression equations for sites in western Oregon were smaller than those in eastern Oregon partly because of a greater density of available streamflow-gaging stations in western Oregon than eastern Oregon. High-flow regression equations (such as the 5th and 10th percent exceedances) also generally were more accurate than the low-flow regression equations (such as the 95th percent exceedance and 7Q10 low-flow statistic). The regression equations predict unregulated flow conditions in Oregon. Flow estimates need to be adjusted if they are used at ungaged sites that are regulated by reservoirs or affected by water-supply and agricultural withdrawals if actual flow conditions are of interest. The regression equations are installed in the USGS StreamStats Web-based tool (http://water.usgs.gov/osw/streamstats/index.html, accessed July 16, 2008). StreamStats provides users with a set of annual and monthly flow-duration and low-flow frequency estimates for ungaged sites in Oregon in addition to the basin characteristics for the sites. Prediction intervals at the 90-percent confidence level also are automatically computed.
Minimizing bias in biomass allometry: Model selection and log transformation of data
Joseph Mascaro; undefined undefined; Flint Hughes; Amanda Uowolo; Stefan A. Schnitzer
2011-01-01
Nonlinear regression is increasingly used to develop allometric equations for forest biomass estimation (i.e., as opposed to the raditional approach of log-transformation followed by linear regression). Most statistical software packages, however, assume additive errors by default, violating a key assumption of allometric theory and possibly producing spurious models....
A Comparison of Methods for Estimating Quadratic Effects in Nonlinear Structural Equation Models
ERIC Educational Resources Information Center
Harring, Jeffrey R.; Weiss, Brandi A.; Hsu, Jui-Chen
2012-01-01
Two Monte Carlo simulations were performed to compare methods for estimating and testing hypotheses of quadratic effects in latent variable regression models. The methods considered in the current study were (a) a 2-stage moderated regression approach using latent variable scores, (b) an unconstrained product indicator approach, (c) a latent…
Nitrate removal in stream ecosystems measured by 15N addition experiments: Total uptake
Hall, R.O.; Tank, J.L.; Sobota, D.J.; Mulholland, P.J.; O'Brien, J. M.; Dodds, W.K.; Webster, J.R.; Valett, H.M.; Poole, G.C.; Peterson, B.J.; Meyer, J.L.; McDowell, W.H.; Johnson, S.L.; Hamilton, S.K.; Grimm, N. B.; Gregory, S.V.; Dahm, Clifford N.; Cooper, L.W.; Ashkenas, L.R.; Thomas, S.M.; Sheibley, R.W.; Potter, J.D.; Niederlehner, B.R.; Johnson, L.T.; Helton, A.M.; Crenshaw, C.M.; Burgin, A.J.; Bernot, M.J.; Beaulieu, J.J.; Arangob, C.P.
2009-01-01
We measured uptake length of 15NO-3 in 72 streams in eight regions across the United States and Puerto Rico to develop quantitative predictive models on controls of NO-3 uptake length. As part of the Lotic Intersite Nitrogen eXperiment II project, we chose nine streams in each region corresponding to natural (reference), suburban-urban, and agricultural land uses. Study streams spanned a range of human land use to maximize variation in NO-3 concentration, geomorphology, and metabolism. We tested a causal model predicting controls on NO-3 uptake length using structural equation modeling. The model included concomitant measurements of ecosystem metabolism, hydraulic parameters, and nitrogen concentration. We compared this structural equation model to multiple regression models which included additional biotic, catchment, and riparian variables. The structural equation model explained 79% of the variation in log uptake length (S Wtot). Uptake length increased with specific discharge (Q/w) and increasing NO-3 concentrations, showing a loss in removal efficiency in streams with high NO-3 concentration. Uptake lengths shortened with increasing gross primary production, suggesting autotrophic assimilation dominated NO-3 removal. The fraction of catchment area as agriculture and suburban-urban land use weakly predicted NO-3 uptake in bivariate regression, and did improve prediction in a set of multiple regression models. Adding land use to the structural equation model showed that land use indirectly affected NO-3 uptake lengths via directly increasing both gross primary production and NO-3 concentration. Gross primary production shortened SWtot, while increasing NO-3 lengthened SWtot resulting in no net effect of land use on NO- 3 removal. ?? 2009.
Techniques for estimating magnitude and frequency of peak flows for Pennsylvania streams
Stuckey, Marla H.; Reed, Lloyd A.
2000-01-01
Regression equations for estimating the magnitude and frequency of floods on ungaged streams in Pennsylvania with drainage areas less that 2,000 square miles were developed on the basis of peak-flow data collected at 313 streamflow-gaging stations. All streamflow-gaging stations used in the development of the equations had 10 or more years of record and include active and discontinued continuous-record and crest-stage partial-record streamflow-gaging stations. Regional regression equations were developed for flood flows expected every 10, 25, 50, 100, and 500 years by the use of a weighted multiple linear regression model.The State was divided into two regions. The largest region, Region A, encompasses about 78 percent of Pennsylvania. The smaller region, Region B, includes only the northwestern part of the State. Basin characteristics used in the regression equations for Region A are drainage area, percentage of forest cover, percentage of urban development, percentage of basin underlain by carbonate bedrock, and percentage of basin controlled by lakes, swamps, and reservoirs. Basin characteristics used in the regression equations for Region B are drainage area and percentage of basin controlled by lakes, swamps, and reservoirs. The coefficient of determination (R2) values for the five flood-frequency equations for Region A range from 0.93 to 0.82, and for Region B, the range is from 0.96 to 0.89.While the regression equations can be used to predict the magnitude and frequency of peak flows for most streams in the State, they should not be used for streams with drainage areas greater than 2,000 square miles or less than 1.5 square miles, for streams that drain extensively mined areas, or for stream reaches immediately below flood-control reservoirs. In addition, the equations presented for Region B should not be used if the stream drains a basin with more than 5 percent urban development.
Estimated Perennial Streams of Idaho and Related Geospatial Datasets
Rea, Alan; Skinner, Kenneth D.
2009-01-01
The perennial or intermittent status of a stream has bearing on many regulatory requirements. Because of changing technologies over time, cartographic representation of perennial/intermittent status of streams on U.S. Geological Survey (USGS) topographic maps is not always accurate and (or) consistent from one map sheet to another. Idaho Administrative Code defines an intermittent stream as one having a 7-day, 2-year low flow (7Q2) less than 0.1 cubic feet per second. To establish consistency with the Idaho Administrative Code, the USGS developed regional regression equations for Idaho streams for several low-flow statistics, including 7Q2. Using these regression equations, the 7Q2 streamflow may be estimated for naturally flowing streams anywhere in Idaho to help determine perennial/intermittent status of streams. Using these equations in conjunction with a Geographic Information System (GIS) technique known as weighted flow accumulation allows for an automated and continuous estimation of 7Q2 streamflow at all points along a stream, which in turn can be used to determine if a stream is intermittent or perennial according to the Idaho Administrative Code operational definition. The selected regression equations were applied to create continuous grids of 7Q2 estimates for the eight low-flow regression regions of Idaho. By applying the 0.1 ft3/s criterion, the perennial streams have been estimated in each low-flow region. Uncertainty in the estimates is shown by identifying a 'transitional' zone, corresponding to flow estimates of 0.1 ft3/s plus and minus one standard error. Considerable additional uncertainty exists in the model of perennial streams presented in this report. The regression models provide overall estimates based on general trends within each regression region. These models do not include local factors such as a large spring or a losing reach that may greatly affect flows at any given point. Site-specific flow data, assuming a sufficient period of record, generally would be considered to represent flow conditions better at a given site than flow estimates based on regionalized regression models. The geospatial datasets of modeled perennial streams are considered a first-cut estimate, and should not be construed to override site-specific flow data.
A Simultaneous Equation Demand Model for Block Rates
NASA Astrophysics Data System (ADS)
Agthe, Donald E.; Billings, R. Bruce; Dobra, John L.; Raffiee, Kambiz
1986-01-01
This paper examines the problem of simultaneous-equations bias in estimation of the water demand function under an increasing block rate structure. The Hausman specification test is used to detect the presence of simultaneous-equations bias arising from correlation of the price measures with the regression error term in the results of a previously published study of water demand in Tucson, Arizona. An alternative simultaneous equation model is proposed for estimating the elasticity of demand in the presence of block rate pricing structures and availability of service charges. This model is used to reestimate the price and rate premium elasticities of demand in Tucson, Arizona for both the usual long-run static model and for a simple short-run demand model. The results from these simultaneous equation models are consistent with a priori expectations and are unbiased.
Indriect Measurement Of Nitrogen In A Mult-Component Natural Gas By Heating The Gas
Morrow, Thomas B.; Behring, II, Kendricks A.
2004-06-22
Methods of indirectly measuring the nitrogen concentration in a natural gas by heating the gas. In two embodiments, the heating energy is correlated to the speed of sound in the gas, the diluent concentrations in the gas, and constant values, resulting in a model equation. Regression analysis is used to calculate the constant values, which can then be substituted into the model equation. If the diluent concentrations other than nitrogen (typically carbon dioxide) are known, the model equation can be solved for the nitrogen concentration.
Stamey, Timothy C.
1998-01-01
Simple and reliable methods for estimating hourly streamflow are needed for the calibration and verification of a Chattahoochee River basin model between Buford Dam and Franklin, Ga. The river basin model is being developed by Georgia Department of Natural Resources, Environmental Protection Division, as part of their Chattahoochee River Modeling Project. Concurrent streamflow data collected at 19 continuous-record, and 31 partial-record streamflow stations, were used in ordinary least-squares linear regression analyses to define estimating equations, and in verifying drainage-area prorations. The resulting regression or drainage-area ratio estimating equations were used to compute hourly streamflow at the partial-record stations. The coefficients of determination (r-squared values) for the regression estimating equations ranged from 0.90 to 0.99. Observed and estimated hourly and daily streamflow data were computed for May 1, 1995, through October 31, 1995. Comparisons of observed and estimated daily streamflow data for 12 continuous-record tributary stations, that had available streamflow data for all or part of the period from May 1, 1995, to October 31, 1995, indicate that the mean error of estimate for the daily streamflow was about 25 percent.
ERIC Educational Resources Information Center
Ellison, William D.; Levy, Kenneth N.
2012-01-01
Using exploratory structural equation modeling and multiple regression, we examined the factor structure and criterion relations of the primary scales of the Inventory of Personality Organization (IPO; Kernberg & Clarkin, 1995) in a nonclinical sample. Participants (N = 1,260) completed the IPO and measures of self-concept clarity, defenses,…
Applicability of linear regression equation for prediction of chlorophyll content in rice leaves
NASA Astrophysics Data System (ADS)
Li, Yunmei
2005-09-01
A modeling approach is used to assess the applicability of the derived equations which are capable to predict chlorophyll content of rice leaves at a given view direction. Two radiative transfer models, including PROSPECT model operated at leaf level and FCR model operated at canopy level, are used in the study. The study is consisted of three steps: (1) Simulation of bidirectional reflectance from canopy with different leaf chlorophyll contents, leaf-area-index (LAI) and under storey configurations; (2) Establishment of prediction relations of chlorophyll content by stepwise regression; and (3) Assessment of the applicability of these relations. The result shows that the accuracy of prediction is affected by different under storey configurations and, however, the accuracy tends to be greatly improved with increase of LAI.
Fraysse, François; Thewlis, Dominic
2014-11-07
Numerous methods exist to estimate the pose of the axes of rotation of the forearm. These include anatomical definitions, such as the conventions proposed by the ISB, and functional methods based on instantaneous helical axes, which are commonly accepted as the modelling gold standard for non-invasive, in-vivo studies. We investigated the validity of a third method, based on regression equations, to estimate the rotation axes of the forearm. We also assessed the accuracy of both ISB methods. Axes obtained from a functional method were considered as the reference. Results indicate a large inter-subject variability in the axes positions, in accordance with previous studies. Both ISB methods gave the same level of accuracy in axes position estimations. Regression equations seem to improve estimation of the flexion-extension axis but not the pronation-supination axis. Overall, given the large inter-subject variability, the use of regression equations cannot be recommended. Copyright © 2014 Elsevier Ltd. All rights reserved.
Staley, Dennis M.; Negri, Jacquelyn A.; Kean, Jason W.; Laber, Jayme L.; Tillery, Anne C.; Youberg, Ann M.
2016-06-30
Wildfire can significantly alter the hydrologic response of a watershed to the extent that even modest rainstorms can generate dangerous flash floods and debris flows. To reduce public exposure to hazard, the U.S. Geological Survey produces post-fire debris-flow hazard assessments for select fires in the western United States. We use publicly available geospatial data describing basin morphology, burn severity, soil properties, and rainfall characteristics to estimate the statistical likelihood that debris flows will occur in response to a storm of a given rainfall intensity. Using an empirical database and refined geospatial analysis methods, we defined new equations for the prediction of debris-flow likelihood using logistic regression methods. We showed that the new logistic regression model outperformed previous models used to predict debris-flow likelihood.
NASA Astrophysics Data System (ADS)
Fomina, E. V.; Kozhukhova, N. I.; Sverguzova, S. V.; Fomin, A. E.
2018-05-01
In this paper, the regression equations method for design of construction material was studied. Regression and polynomial equations representing the correlation between the studied parameters were proposed. The logic design and software interface of the regression equations method focused on parameter optimization to provide the energy saving effect at the stage of autoclave aerated concrete design considering the replacement of traditionally used quartz sand by coal mining by-product such as argillite. The mathematical model represented by a quadric polynomial for the design of experiment was obtained using calculated and experimental data. This allowed the estimation of relationship between the composition and final properties of the aerated concrete. The surface response graphically presented in a nomogram allowed the estimation of concrete properties in response to variation of composition within the x-space. The optimal range of argillite content was obtained leading to a reduction of raw materials demand, development of target plastic strength of aerated concrete as well as a reduction of curing time before autoclave treatment. Generally, this method allows the design of autoclave aerated concrete with required performance without additional resource and time costs.
David R. Weise; Eunmo Koo; Xiangyang Zhou; Shankar Mahalingam; Frédéric Morandini; Jacques-Henri Balbi
2016-01-01
Fire behaviour data from 240 laboratory fires in high-density live chaparral fuel beds were compared with model predictions. Logistic regression was used to develop a model to predict fire spread success in the fuel beds and linear regression was used to predict rate of spread. Predictions from the Rothermel equation and three proposed changes as well as two physically...
Morrow, Thomas B.; Behring, II, Kendricks A.
2004-10-12
A methods of indirectly measuring the nitrogen concentration in a gas mixture. The molecular weight of the gas is modeled as a function of the speed of sound in the gas, the diluent concentrations in the gas, and constant values, resulting in a model equation. Regression analysis is used to calculate the constant values, which can then be substituted into the model equation. If the speed of sound in the gas is measured at two states and diluent concentrations other than nitrogen (typically carbon dioxide) are known, two equations for molecular weight can be equated and solved for the nitrogen concentration in the gas mixture.
DOT National Transportation Integrated Search
2016-09-01
We consider the problem of solving mixed random linear equations with k components. This is the noiseless setting of mixed linear regression. The goal is to estimate multiple linear models from mixed samples in the case where the labels (which sample...
Accounting for measurement error in log regression models with applications to accelerated testing.
Richardson, Robert; Tolley, H Dennis; Evenson, William E; Lunt, Barry M
2018-01-01
In regression settings, parameter estimates will be biased when the explanatory variables are measured with error. This bias can significantly affect modeling goals. In particular, accelerated lifetime testing involves an extrapolation of the fitted model, and a small amount of bias in parameter estimates may result in a significant increase in the bias of the extrapolated predictions. Additionally, bias may arise when the stochastic component of a log regression model is assumed to be multiplicative when the actual underlying stochastic component is additive. To account for these possible sources of bias, a log regression model with measurement error and additive error is approximated by a weighted regression model which can be estimated using Iteratively Re-weighted Least Squares. Using the reduced Eyring equation in an accelerated testing setting, the model is compared to previously accepted approaches to modeling accelerated testing data with both simulations and real data.
Demidenko, Eugene
2017-09-01
The exact density distribution of the nonlinear least squares estimator in the one-parameter regression model is derived in closed form and expressed through the cumulative distribution function of the standard normal variable. Several proposals to generalize this result are discussed. The exact density is extended to the estimating equation (EE) approach and the nonlinear regression with an arbitrary number of linear parameters and one intrinsically nonlinear parameter. For a very special nonlinear regression model, the derived density coincides with the distribution of the ratio of two normally distributed random variables previously obtained by Fieller (1932), unlike other approximations previously suggested by other authors. Approximations to the density of the EE estimators are discussed in the multivariate case. Numerical complications associated with the nonlinear least squares are illustrated, such as nonexistence and/or multiple solutions, as major factors contributing to poor density approximation. The nonlinear Markov-Gauss theorem is formulated based on the near exact EE density approximation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rupšys, P.
A system of stochastic differential equations (SDE) with mixed-effects parameters and multivariate normal copula density function were used to develop tree height model for Scots pine trees in Lithuania. A two-step maximum likelihood parameter estimation method is used and computational guidelines are given. After fitting the conditional probability density functions to outside bark diameter at breast height, and total tree height, a bivariate normal copula distribution model was constructed. Predictions from the mixed-effects parameters SDE tree height model calculated during this research were compared to the regression tree height equations. The results are implemented in the symbolic computational language MAPLE.
Equations relating compacted and uncompacted live crown ratio for common tree species in the South
KaDonna C. Randolph
2010-01-01
Species-specific equations to predict uncompacted crown ratio (UNCR) from compacted live crown ratio (CCR), tree length, and stem diameter were developed for 24 species and 12 genera in the southern United States. Using data from the US Forest Service Forest Inventory and Analysis program, nonlinear regression was used to model UNCR with a logistic function. Model...
Stature estimation equations for South Asian skeletons based on DXA scans of contemporary adults.
Pomeroy, Emma; Mushrif-Tripathy, Veena; Wells, Jonathan C K; Kulkarni, Bharati; Kinra, Sanjay; Stock, Jay T
2018-05-03
Stature estimation from the skeleton is a classic anthropological problem, and recent years have seen the proliferation of population-specific regression equations. Many rely on the anatomical reconstruction of stature from archaeological skeletons to derive regression equations based on long bone lengths, but this requires a collection with very good preservation. In some regions, for example, South Asia, typical environmental conditions preclude the sufficient preservation of skeletal remains. Large-scale epidemiological studies that include medical imaging of the skeleton by techniques such as dual-energy X-ray absorptiometry (DXA) offer new potential datasets for developing such equations. We derived estimation equations based on known height and bone lengths measured from DXA scans from the Andhra Pradesh Children and Parents Study (Hyderabad, India). Given debates on the most appropriate regression model to use, multiple methods were compared, and the performance of the equations was tested on a published skeletal dataset of individuals with known stature. The equations have standard errors of estimates and prediction errors similar to those derived using anatomical reconstruction or from cadaveric datasets. As measured by the number of significant differences between true and estimated stature, and the prediction errors, the new equations perform as well as, and generally better than, published equations commonly used on South Asian skeletons or based on Indian cadaveric datasets. This study demonstrates the utility of DXA scans as a data source for developing stature estimation equations and offer a new set of equations for use with South Asian datasets. © 2018 Wiley Periodicals, Inc.
Singer, Donald A.; Kouda, Ryoichi
2011-01-01
Empirical evidence indicates that processes affecting number and quantity of resources in geologic settings are very general across deposit types. Sizes of permissive tracts that geologically could contain the deposits are excellent predictors of numbers of deposits. In addition, total ore tonnage of mineral deposits of a particular type in a tract is proportional to the type’s median tonnage in a tract. Regressions using size of permissive tracts and median tonnage allow estimation of number of deposits and of total tonnage of mineralization. These powerful estimators, based on 10 different deposit types from 109 permissive worldwide control tracts, generalize across deposit types. Estimates of number of deposits and of total tonnage of mineral deposits are made by regressing permissive area, and mean (in logs) tons in deposits of the type, against number of deposits and total tonnage of deposits in the tract for the 50th percentile estimates. The regression equations (R2 = 0.91 and 0.95) can be used for all deposit types just by inserting logarithmic values of permissive area in square kilometers, and mean tons in deposits in millions of metric tons. The regression equations provide estimates at the 50th percentile, and other equations are provided for 90% confidence limits for lower estimates and 10% confidence limits for upper estimates of number of deposits and total tonnage. Equations for these percentile estimates along with expected value estimates are presented here along with comparisons with independent expert estimates. Also provided are the equations for correcting for the known well-explored deposits in a tract. These deposit-density models require internally consistent grade and tonnage models and delineations for arriving at unbiased estimates.
ERIC Educational Resources Information Center
Crawford, John R.; Garthwaite, Paul H.; Denham, Annie K.; Chelune, Gordon J.
2012-01-01
Regression equations have many useful roles in psychological assessment. Moreover, there is a large reservoir of published data that could be used to build regression equations; these equations could then be employed to test a wide variety of hypotheses concerning the functioning of individual cases. This resource is currently underused because…
Scarneciu, Camelia C; Sangeorzan, Livia; Rus, Horatiu; Scarneciu, Vlad D; Varciu, Mihai S; Andreescu, Oana; Scarneciu, Ioan
2017-01-01
This study aimed at assessing the incidence of pulmonary hypertension (PH) at newly diagnosed hyperthyroid patients and at finding a simple model showing the complex functional relation between pulmonary hypertension in hyperthyroidism and the factors causing it. The 53 hyperthyroid patients (H-group) were evaluated mainly by using an echocardiographical method and compared with 35 euthyroid (E-group) and 25 healthy people (C-group). In order to identify the factors causing pulmonary hypertension the statistical method of comparing the values of arithmetical means is used. The functional relation between the two random variables (PAPs and each of the factors determining it within our research study) can be expressed by linear or non-linear function. By applying the linear regression method described by a first-degree equation the line of regression (linear model) has been determined; by applying the non-linear regression method described by a second degree equation, a parabola-type curve of regression (non-linear or polynomial model) has been determined. We made the comparison and the validation of these two models by calculating the determination coefficient (criterion 1), the comparison of residuals (criterion 2), application of AIC criterion (criterion 3) and use of F-test (criterion 4). From the H-group, 47% have pulmonary hypertension completely reversible when obtaining euthyroidism. The factors causing pulmonary hypertension were identified: previously known- level of free thyroxin, pulmonary vascular resistance, cardiac output; new factors identified in this study- pretreatment period, age, systolic blood pressure. According to the four criteria and to the clinical judgment, we consider that the polynomial model (graphically parabola- type) is better than the linear one. The better model showing the functional relation between the pulmonary hypertension in hyperthyroidism and the factors identified in this study is given by a polynomial equation of second degree where the parabola is its graphical representation.
Wu, Alan H B; Wang, Ping; Smith, Andrew; Haller, Christine; Drake, Katherine; Linder, Mark; Valdes, Roland
2008-02-01
Polymorphism in the genes for cytochrome (CYP)2C9 and the vitamin K epoxide reductase complex subunit 1 (VKORC1) affect the pharmacokinetics and pharmacodynamics of warfarin. We developed and validated a warfarin-dosing algorithm for a multi-ethnic population that predicts the best dose for stable anticoagulation, and compared its performance against other regression equations. We determined the allele and haplotype frequencies of genes for CYP2C9 and VKORC1 on 167 Caucasian, African-American, Asian and Hispanic patients on warfarin. On a subset where complete data were available (n=92), we developed a dosing equation that predicts the actual dose needed to maintain target anticoagulation using demographic variables and genotypes. This regression was validated against an independent group of subjects. We also applied our data to five other published warfarin-dosing equations. The allele frequency for CYP2C9*2 and *3 and the A allele for VKORC1 3673 was similar to previously published reports. For Caucasians and Asians, VKORC1 SNPs were in Hardy-Weinberg linkage equilibrium. Some VKORC1 SNPs among the African-American population and one SNP among Hispanics were not in equilibrium. The linear regression of predicted versus actual warfarin dose produced r-values of 0.71 for the training set and 0.67 for the validation set. The regression coefficient improved (to r=0.78 and 0.75, respectively) when rare genotypes were eliminated or when the 7566 VKORC1 genotype was added to the model. All of the regression models tested produced a similar degree of correlation. The exclusion of rare genotypes that are more associated with certain ethnicities improved the model. Minor improvements in algorithms can be observed with the inclusion of ethnicity and more CYP2C9 and VKORC1 SNPs as variables. Major improvements will likely require the identification of new gene associations with warfarin dosing.
System identification principles in studies of forest dynamics.
Rolfe A. Leary
1970-01-01
Shows how it is possible to obtain governing equation parameter estimates on the basis of observed system states. The approach used represents a constructive alternative to regression techniques for models expressed as differential equations. This approach allows scientists to more completely quantify knowledge of forest development processes, to express theories in...
Invariants for the generalized Lotka-Volterra equations
NASA Astrophysics Data System (ADS)
Cairó, Laurent; Feix, Marc R.; Goedert, Joao
A generalisation of Lotka-Volterra System is given when self limiting terms are introduced in the model. We use a modification of the Carleman embedding method to find invariants for this system of equations. The position and stability of the equilibrium point and the regression of system under invariant conditions are studied.
Merchantable sawlog and bole-length equations for the Northeastern United States
Daniel A. Yaussy; Martin E. Dale; Martin E. Dale
1991-01-01
A modified Richards growth model is used to develop species-specific coefficients for equations estimating the merchantable sawlog and bole lengths of trees from 25 species groups common to the Northeastern United States. These regression coefficients have been incorporated into the growth-and-yield simulation software, NE-TWIGS.
Wood, Molly S.; Fosness, Ryan L.; Skinner, Kenneth D.; Veilleux, Andrea G.
2016-06-27
The U.S. Geological Survey, in cooperation with the Idaho Transportation Department, updated regional regression equations to estimate peak-flow statistics at ungaged sites on Idaho streams using recent streamflow (flow) data and new statistical techniques. Peak-flow statistics with 80-, 67-, 50-, 43-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities (1.25-, 1.50-, 2.00-, 2.33-, 5.00-, 10.0-, 25.0-, 50.0-, 100-, 200-, and 500-year recurrence intervals, respectively) were estimated for 192 streamgages in Idaho and bordering States with at least 10 years of annual peak-flow record through water year 2013. The streamgages were selected from drainage basins with little or no flow diversion or regulation. The peak-flow statistics were estimated by fitting a log-Pearson type III distribution to records of annual peak flows and applying two additional statistical methods: (1) the Expected Moments Algorithm to help describe uncertainty in annual peak flows and to better represent missing and historical record; and (2) the generalized Multiple Grubbs Beck Test to screen out potentially influential low outliers and to better fit the upper end of the peak-flow distribution. Additionally, a new regional skew was estimated for the Pacific Northwest and used to weight at-station skew at most streamgages. The streamgages were grouped into six regions (numbered 1_2, 3, 4, 5, 6_8, and 7, to maintain consistency in region numbering with a previous study), and the estimated peak-flow statistics were related to basin and climatic characteristics to develop regional regression equations using a generalized least squares procedure. Four out of 24 evaluated basin and climatic characteristics were selected for use in the final regional peak-flow regression equations.Overall, the standard error of prediction for the regional peak-flow regression equations ranged from 22 to 132 percent. Among all regions, regression model fit was best for region 4 in west-central Idaho (average standard error of prediction=46.4 percent; pseudo-R2>92 percent) and region 5 in central Idaho (average standard error of prediction=30.3 percent; pseudo-R2>95 percent). Regression model fit was poor for region 7 in southern Idaho (average standard error of prediction=103 percent; pseudo-R2<78 percent) compared to other regions because few streamgages in region 7 met the criteria for inclusion in the study, and the region’s semi-arid climate and associated variability in precipitation patterns causes substantial variability in peak flows.A drainage area ratio-adjustment method, using ratio exponents estimated using generalized least-squares regression, was presented as an alternative to the regional regression equations if peak-flow estimates are desired at an ungaged site that is close to a streamgage selected for inclusion in this study. The alternative drainage area ratio-adjustment method is appropriate for use when the drainage area ratio between the ungaged and gaged sites is between 0.5 and 1.5.The updated regional peak-flow regression equations had lower total error (standard error of prediction) than all regression equations presented in a 1982 study and in four of six regions presented in 2002 and 2003 studies in Idaho. A more extensive streamgage screening process used in the current study resulted in fewer streamgages used in the current study than in the 1982, 2002, and 2003 studies. Fewer streamgages used and the selection of different explanatory variables were likely causes of increased error in some regions compared to previous studies, but overall, regional peak‑flow regression model fit was generally improved for Idaho. The revised statistical procedures and increased streamgage screening applied in the current study most likely resulted in a more accurate representation of natural peak-flow conditions.The updated, regional peak-flow regression equations will be integrated in the U.S. Geological Survey StreamStats program to allow users to estimate basin and climatic characteristics and peak-flow statistics at ungaged locations of interest. StreamStats estimates peak-flow statistics with quantifiable certainty only when used at sites with basin and climatic characteristics within the range of input variables used to develop the regional regression equations. Both the regional regression equations and StreamStats should be used to estimate peak-flow statistics only in naturally flowing, relatively unregulated streams without substantial local influences to flow, such as large seeps, springs, or other groundwater-surface water interactions that are not widespread or characteristic of the respective region.
Solving large test-day models by iteration on data and preconditioned conjugate gradient.
Lidauer, M; Strandén, I; Mäntysaari, E A; Pösö, J; Kettunen, A
1999-12-01
A preconditioned conjugate gradient method was implemented into an iteration on a program for data estimation of breeding values, and its convergence characteristics were studied. An algorithm was used as a reference in which one fixed effect was solved by Gauss-Seidel method, and other effects were solved by a second-order Jacobi method. Implementation of the preconditioned conjugate gradient required storing four vectors (size equal to number of unknowns in the mixed model equations) in random access memory and reading the data at each round of iteration. The preconditioner comprised diagonal blocks of the coefficient matrix. Comparison of algorithms was based on solutions of mixed model equations obtained by a single-trait animal model and a single-trait, random regression test-day model. Data sets for both models used milk yield records of primiparous Finnish dairy cows. Animal model data comprised 665,629 lactation milk yields and random regression test-day model data of 6,732,765 test-day milk yields. Both models included pedigree information of 1,099,622 animals. The animal model ¿random regression test-day model¿ required 122 ¿305¿ rounds of iteration to converge with the reference algorithm, but only 88 ¿149¿ were required with the preconditioned conjugate gradient. To solve the random regression test-day model with the preconditioned conjugate gradient required 237 megabytes of random access memory and took 14% of the computation time needed by the reference algorithm.
Ground effects in FAA's Integrated Noise Model
DOT National Transportation Integrated Search
2000-01-01
The lateral attenuation algorithm in the Federal Aviation Administration's (FAA) Integrated Noise Model (INM) has historically been based on the two regression equations described in the Society of Automotive Engineers' (SAE) Aerospace Information Re...
2017-01-01
The U.S. Energy Information Administration's Short-Term Energy Outlook (STEO) produces monthly projections of energy supply, demand, trade, and prices over a 13-24 month period. Every January, the forecast horizon is extended through December of the following year. The STEO model is an integrated system of econometric regression equations and identities that link data on the various components of the U.S. energy industry together in order to develop consistent forecasts. The regression equations are estimated and the STEO model is solved using the EViews 9.5 econometric software package from IHS Global Inc. The model consists of various modules specific to each energy resource. All modules provide projections for the United States, and some modules provide more detailed forecasts for different regions of the country.
QSAR modeling of flotation collectors using principal components extracted from topological indices.
Natarajan, R; Nirdosh, Inderjit; Basak, Subhash C; Mills, Denise R
2002-01-01
Several topological indices were calculated for substituted-cupferrons that were tested as collectors for the froth flotation of uranium. The principal component analysis (PCA) was used for data reduction. Seven principal components (PC) were found to account for 98.6% of the variance among the computed indices. The principal components thus extracted were used in stepwise regression analyses to construct regression models for the prediction of separation efficiencies (Es) of the collectors. A two-parameter model with a correlation coefficient of 0.889 and a three-parameter model with a correlation coefficient of 0.913 were formed. PCs were found to be better than partition coefficient to form regression equations, and inclusion of an electronic parameter such as Hammett sigma or quantum mechanically derived electronic charges on the chelating atoms did not improve the correlation coefficient significantly. The method was extended to model the separation efficiencies of mercaptobenzothiazoles (MBT) and aminothiophenols (ATP) used in the flotation of lead and zinc ores, respectively. Five principal components were found to explain 99% of the data variability in each series. A three-parameter equation with correlation coefficient of 0.985 and a two-parameter equation with correlation coefficient of 0.926 were obtained for MBT and ATP, respectively. The amenability of separation efficiencies of chelating collectors to QSAR modeling using PCs based on topological indices might lead to the selection of collectors for synthesis and testing from a virtual database.
Yelland, Lisa N; Salter, Amy B; Ryan, Philip
2011-10-15
Modified Poisson regression, which combines a log Poisson regression model with robust variance estimation, is a useful alternative to log binomial regression for estimating relative risks. Previous studies have shown both analytically and by simulation that modified Poisson regression is appropriate for independent prospective data. This method is often applied to clustered prospective data, despite a lack of evidence to support its use in this setting. The purpose of this article is to evaluate the performance of the modified Poisson regression approach for estimating relative risks from clustered prospective data, by using generalized estimating equations to account for clustering. A simulation study is conducted to compare log binomial regression and modified Poisson regression for analyzing clustered data from intervention and observational studies. Both methods generally perform well in terms of bias, type I error, and coverage. Unlike log binomial regression, modified Poisson regression is not prone to convergence problems. The methods are contrasted by using example data sets from 2 large studies. The results presented in this article support the use of modified Poisson regression as an alternative to log binomial regression for analyzing clustered prospective data when clustering is taken into account by using generalized estimating equations.
Optimizing separate phase light hydrocarbon recovery from contaminated unconfined aquifers
NASA Astrophysics Data System (ADS)
Cooper, Grant S.; Peralta, Richard C.; Kaluarachchi, Jagath J.
A modeling approach is presented that optimizes separate phase recovery of light non-aqueous phase liquids (LNAPL) for a single dual-extraction well in a homogeneous, isotropic unconfined aquifer. A simulation/regression/optimization (S/R/O) model is developed to predict, analyze, and optimize the oil recovery process. The approach combines detailed simulation, nonlinear regression, and optimization. The S/R/O model utilizes nonlinear regression equations describing system response to time-varying water pumping and oil skimming. Regression equations are developed for residual oil volume and free oil volume. The S/R/O model determines optimized time-varying (stepwise) pumping rates which minimize residual oil volume and maximize free oil recovery while causing free oil volume to decrease a specified amount. This S/R/O modeling approach implicitly immobilizes the free product plume by reversing the water table gradient while achieving containment. Application to a simple representative problem illustrates the S/R/O model utility for problem analysis and remediation design. When compared with the best steady pumping strategies, the optimal stepwise pumping strategy improves free oil recovery by 11.5% and reduces the amount of residual oil left in the system due to pumping by 15%. The S/R/O model approach offers promise for enhancing the design of free phase LNAPL recovery systems and to help in making cost-effective operation and management decisions for hydrogeologists, engineers, and regulators.
Many-level multilevel structural equation modeling: An efficient evaluation strategy.
Pritikin, Joshua N; Hunter, Michael D; von Oertzen, Timo; Brick, Timothy R; Boker, Steven M
2017-01-01
Structural equation models are increasingly used for clustered or multilevel data in cases where mixed regression is too inflexible. However, when there are many levels of nesting, these models can become difficult to estimate. We introduce a novel evaluation strategy, Rampart, that applies an orthogonal rotation to the parts of a model that conform to commonly met requirements. This rotation dramatically simplifies fit evaluation in a way that becomes more potent as the size of the data set increases. We validate and evaluate the implementation using a 3-level latent regression simulation study. Then we analyze data from a state-wide child behavioral health measure administered by the Oklahoma Department of Human Services. We demonstrate the efficiency of Rampart compared to other similar software using a latent factor model with a 5-level decomposition of latent variance. Rampart is implemented in OpenMx, a free and open source software.
Adjustment of regional regression equations for urban storm-runoff quality using at-site data
Barks, C.S.
1996-01-01
Regional regression equations have been developed to estimate urban storm-runoff loads and mean concentrations using a national data base. Four statistical methods using at-site data to adjust the regional equation predictions were developed to provide better local estimates. The four adjustment procedures are a single-factor adjustment, a regression of the observed data against the predicted values, a regression of the observed values against the predicted values and additional local independent variables, and a weighted combination of a local regression with the regional prediction. Data collected at five representative storm-runoff sites during 22 storms in Little Rock, Arkansas, were used to verify, and, when appropriate, adjust the regional regression equation predictions. Comparison of observed values of stormrunoff loads and mean concentrations to the predicted values from the regional regression equations for nine constituents (chemical oxygen demand, suspended solids, total nitrogen as N, total ammonia plus organic nitrogen as N, total phosphorus as P, dissolved phosphorus as P, total recoverable copper, total recoverable lead, and total recoverable zinc) showed large prediction errors ranging from 63 percent to more than several thousand percent. Prediction errors for 6 of the 18 regional regression equations were less than 100 percent and could be considered reasonable for water-quality prediction equations. The regression adjustment procedure was used to adjust five of the regional equation predictions to improve the predictive accuracy. For seven of the regional equations the observed and the predicted values are not significantly correlated. Thus neither the unadjusted regional equations nor any of the adjustments were appropriate. The mean of the observed values was used as a simple estimator when the regional equation predictions and adjusted predictions were not appropriate.
Development of a Standalone Thermal Wellbore Simulator
NASA Astrophysics Data System (ADS)
Xiong, Wanqiang
With continuous developments of various different sophisticated wells in the petroleum industry, wellbore modeling and simulation have increasingly received more attention. Especially in unconventional oil and gas recovery processes, there is a growing demand for more accurate wellbore modeling. Despite notable advancements made in wellbore modeling, none of the existing wellbore simulators has been as successful as reservoir simulators such as Eclipse and CMG's and further research works on handling issues such as accurate heat loss modeling and multi-tubing wellbore modeling are really necessary. A series of mathematical equations including main governing equations, auxiliary equations, PVT equations, thermodynamic equations, drift-flux model equations, and wellbore heat loss calculation equations are collected and screened from publications. Based on these modeling equations, workflows for wellbore simulation and software development are proposed. Research works are conducted in key steps for developing a wellbore simulator: discretization, a grid system, a solution method, a linear equation solver, and computer language. A standalone thermal wellbore simulator is developed by using standard C++ language. This wellbore simulator can simulate single-phase injection and production, two-phase steam injection and two-phase oil and water production. By implementing a multi-part scheme which divides a wellbore with sophisticated configuration into several relative simple simulation running units, this simulator can handle different complex wellbores: wellbore with multistage casings, horizontal wells, multilateral wells and double tubing. In pursuance of improved accuracy of heat loss calculations to surrounding formations, a semi-numerical method is proposed and a series of FLUENT simulations have been conducted in this study. This semi-numerical method involves extending the 2D formation heat transfer simulation to include a casing wall and cement and adopting new correlations regressed by this study. Meanwhile, a correlation for handling heat transfer in double-tubing annulus is regressed. This work initiates the research on heat transfer in a double-tubing wellbore system. A series of validation and test works are performed in hot water injection, steam injection, real filed data, a horizontal well, a double-tubing well and comparison with the Ramey method. The program in this study also performs well in matching with real measured field data, simulation in horizontal wells and double-tubing wells.
A new model for estimating total body water from bioelectrical resistance
NASA Technical Reports Server (NTRS)
Siconolfi, S. F.; Kear, K. T.
1992-01-01
Estimation of total body water (T) from bioelectrical resistance (R) is commonly done by stepwise regression models with height squared over R, H(exp 2)/R, age, sex, and weight (W). Polynomials of H(exp 2)/R have not been included in these models. We examined the validity of a model with third order polynomials and W. Methods: T was measured with oxygen-18 labled water in 27 subjects. R at 50 kHz was obtained from electrodes placed on the hand and foot while subjects were in the supine position. A stepwise regression equation was developed with 13 subjects (age 31.5 plus or minus 6.2 years, T 38.2 plus or minus 6.6 L, W 65.2 plus or minus 12.0 kg). Correlations, standard error of estimates and mean differences were computed between T and estimated T's from the new (N) model and other models. Evaluations were completed with the remaining 14 subjects (age 32.4 plus or minus 6.3 years, T 40.3 plus or minus 8 L, W 70.2 plus or minus 12.3 kg) and two of its subgroups (high and low) Results: A regression equation was developed from the model. The only significant mean difference was between T and one of the earlier models. Conclusion: Third order polynomials in regression models may increase the accuracy of estimating total body water. Evaluating the model with a larger population is needed.
Development of LACIE CCEA-1 weather/wheat yield models. [regression analysis
NASA Technical Reports Server (NTRS)
Strommen, N. D.; Sakamoto, C. M.; Leduc, S. K.; Umberger, D. E. (Principal Investigator)
1979-01-01
The advantages and disadvantages of the casual (phenological, dynamic, physiological), statistical regression, and analog approaches to modeling for grain yield are examined. Given LACIE's primary goal of estimating wheat production for the large areas of eight major wheat-growing regions, the statistical regression approach of correlating historical yield and climate data offered the Center for Climatic and Environmental Assessment the greatest potential return within the constraints of time and data sources. The basic equation for the first generation wheat-yield model is given. Topics discussed include truncation, trend variable, selection of weather variables, episodic events, strata selection, operational data flow, weighting, and model results.
Updated lateral attenuation in FAA's Integrated Noise Model
DOT National Transportation Integrated Search
2000-08-27
The lateral attenuation algorithm in the Federal Aviation Administration's (FAA) Integrated Noise Model (INM) has historically been based on the two regression equations described in the Society of Automotive Engineers' (SAE) Aerospace Information Re...
NASA Astrophysics Data System (ADS)
Wibowo, Wahyu; Wene, Chatrien; Budiantara, I. Nyoman; Permatasari, Erma Oktania
2017-03-01
Multiresponse semiparametric regression is simultaneous equation regression model and fusion of parametric and nonparametric model. The regression model comprise several models and each model has two components, parametric and nonparametric. The used model has linear function as parametric and polynomial truncated spline as nonparametric component. The model can handle both linearity and nonlinearity relationship between response and the sets of predictor variables. The aim of this paper is to demonstrate the application of the regression model for modeling of effect of regional socio-economic on use of information technology. More specific, the response variables are percentage of households has access to internet and percentage of households has personal computer. Then, predictor variables are percentage of literacy people, percentage of electrification and percentage of economic growth. Based on identification of the relationship between response and predictor variable, economic growth is treated as nonparametric predictor and the others are parametric predictors. The result shows that the multiresponse semiparametric regression can be applied well as indicate by the high coefficient determination, 90 percent.
NASA Astrophysics Data System (ADS)
Duman, T. Y.; Can, T.; Gokceoglu, C.; Nefeslioglu, H. A.; Sonmez, H.
2006-11-01
As a result of industrialization, throughout the world, cities have been growing rapidly for the last century. One typical example of these growing cities is Istanbul, the population of which is over 10 million. Due to rapid urbanization, new areas suitable for settlement and engineering structures are necessary. The Cekmece area located west of the Istanbul metropolitan area is studied, because the landslide activity is extensive in this area. The purpose of this study is to develop a model that can be used to characterize landslide susceptibility in map form using logistic regression analysis of an extensive landslide database. A database of landslide activity was constructed using both aerial-photography and field studies. About 19.2% of the selected study area is covered by deep-seated landslides. The landslides that occur in the area are primarily located in sandstones with interbedded permeable and impermeable layers such as claystone, siltstone and mudstone. About 31.95% of the total landslide area is located at this unit. To apply logistic regression analyses, a data matrix including 37 variables was constructed. The variables used in the forwards stepwise analyses are different measures of slope, aspect, elevation, stream power index (SPI), plan curvature, profile curvature, geology, geomorphology and relative permeability of lithological units. A total of 25 variables were identified as exerting strong influence on landslide occurrence, and included by the logistic regression equation. Wald statistics values indicate that lithology, SPI and slope are more important than the other parameters in the equation. Beta coefficients of the 25 variables included the logistic regression equation provide a model for landslide susceptibility in the Cekmece area. This model is used to generate a landslide susceptibility map that correctly classified 83.8% of the landslide-prone areas.
Fach, S; Sitzenfrei, R; Rauch, W
2009-01-01
It is state of the art to evaluate and optimise sewer systems with urban drainage models. Since spill flow data is essential in the calibration process of conceptual models it is important to enhance the quality of such data. A wide spread approach is to calculate the spill flow volume by using standard weir equations together with measured water levels. However, these equations are only applicable to combined sewer overflow (CSO) structures, whose weir constructions correspond with the standard weir layout. The objective of this work is to outline an alternative approach to obtain spill flow discharge data based on measurements with a sonic depth finder. The idea is to determine the relation between water level and rate of spill flow by running a detailed 3D computational fluid dynamics (CFD) model. Two real world CSO structures have been chosen due to their complex structure, especially with respect to the weir construction. In a first step the simulation results were analysed to identify flow conditions for discrete steady states. It will be shown that the flow conditions in the CSO structure change after the spill flow pipe acts as a controlled outflow and therefore the spill flow discharge cannot be described with a standard weir equation. In a second step the CFD results will be used to derive rating curves which can be easily applied in everyday practice. Therefore the rating curves are developed on basis of the standard weir equation and the equation for orifice-type outlets. Because the intersection of both equations is not known, the coefficients of discharge are regressed from CFD simulation results. Furthermore, the regression of the CFD simulation results are compared with the one of the standard weir equation by using historic water levels and hydrographs generated with a hydrodynamic model. The uncertainties resulting of the wide spread use of the standard weir equation are demonstrated.
Two Enhancements of the Logarithmic Least-Squares Method for Analyzing Subjective Comparisons
1989-03-25
error term. 1 For this model, the total sum of squares ( SSTO ), defined as n 2 SSTO = E (yi y) i=1 can be partitioned into error and regression sums...of the regression line around the mean value. Mathematically, for the model given by equation A.4, SSTO = SSE + SSR (A.6) A-4 where SSTO is the total...sum of squares (i.e., the variance of the yi’s), SSE is error sum of squares, and SSR is the regression sum of squares. SSTO , SSE, and SSR are given
Collaborative Investigations of Shallow Water Optics Problems
2004-12-01
Appendix E. Reprint of Radiative transfer equation inversion: Theory and shape factor models for retrieval of oceanic inherent optical properties, by F ...4829-4834. 5 Hoge, F . E., P. E. Lyon, C. D. Mobley, and L. K. Sundman, 2003. Radiative transfer equation inversion: Theory and shape factor models for...multilinear regression algorithms for the inversion of synthetic ocean colour spectra,, Int. J. Remote Sensing, 25(21), 4829-4834. Hoge, F . E., P. E. Lyon
ERIC Educational Resources Information Center
DeVany, Arthur S.; And Others
This research was designed to develop and test a model of the Air Force manpower market. The study indicates that previous manpower supply studies failed to account for simultaneous determination of enlistments and retentions and misinterpreted regressions as supply equations. They are, instead, reduced form equations resulting from joint…
An overview of longitudinal data analysis methods for neurological research.
Locascio, Joseph J; Atri, Alireza
2011-01-01
The purpose of this article is to provide a concise, broad and readily accessible overview of longitudinal data analysis methods, aimed to be a practical guide for clinical investigators in neurology. In general, we advise that older, traditional methods, including (1) simple regression of the dependent variable on a time measure, (2) analyzing a single summary subject level number that indexes changes for each subject and (3) a general linear model approach with a fixed-subject effect, should be reserved for quick, simple or preliminary analyses. We advocate the general use of mixed-random and fixed-effect regression models for analyses of most longitudinal clinical studies. Under restrictive situations or to provide validation, we recommend: (1) repeated-measure analysis of covariance (ANCOVA), (2) ANCOVA for two time points, (3) generalized estimating equations and (4) latent growth curve/structural equation models.
ERIC Educational Resources Information Center
Campbell, S. Duke; Greenberg, Barry
The development of a predictive equation capable of explaining a significant percentage of enrollment variability at Florida International University is described. A model utilizing trend analysis and a multiple regression approach to enrollment forecasting was adapted to investigate enrollment dynamics at the university. Four independent…
Kokkinos, Peter; Kaminsky, Leonard A; Arena, Ross; Zhang, Jiajia; Myers, Jonathan
2017-08-15
Impaired cardiorespiratory fitness (CRF) is closely linked to chronic illness and associated with adverse events. The American College of Sports Medicine (ACSM) regression equations (ACSM equations) developed to estimate oxygen uptake have known limitations leading to well-documented overestimation of CRF, especially at higher work rates. Thus, there is a need to explore alternative equations to more accurately predict CRF. We assessed maximal oxygen uptake (VO 2 max) obtained directly by open-circuit spirometry in 7,983 apparently healthy subjects who participated in the Fitness Registry and the Importance of Exercise National Database (FRIEND). We randomly sampled 70% of the participants from each of the following age categories: <40, 40 to 50, 50 to 70, and ≥70 and used the remaining 30% for validation. Multivariable linear regression analysis was applied to identify the most relevant variables and construct the best prediction model for VO 2 max. Treadmill speed and treadmill speed × grade were considered in the final model as predictors of measured VO 2 max and the following equation was generated: VO 2 max in ml O 2 /kg/min = speed (m/min) × (0.17 + fractional grade × 0.79) + 3.5. The FRIEND equation predicted VO 2 max with an overall error >4 times lower than the error associated with the traditional ACSM equations (5.1 ± 18.3% vs 21.4 ± 24.9%, respectively). Overestimation associated with the ACSM equation was accentuated when different protocols were considered separately. In conclusion, The FRIEND equation predicts VO 2 max more precisely than the traditional ACSM equations with an overall error >4 times lower than that associated with the ACSM equations. Published by Elsevier Inc.
Breaker, Brian K.
2015-01-01
Equations for two regions were found to be statistically significant for developing regression equations for estimating harmonic mean flows at ungaged basins; thus, equations are applicable only to streams in those respective regions in Arkansas. Regression equations for dry season mean monthly flows are applicable only to streams located throughout Arkansas. All regression equations are applicable only to unaltered streams where flows were not significantly affected by regulation, diversion, or urbanization. The median number of years used for dry season mean monthly flow calculation was 43, and the median number of years used for harmonic mean flow calculations was 34 for region 1 and 43 for region 2.
Likhvantseva, V G; Sokolov, V A; Levanova, O N; Kovelenova, I V
2018-01-01
Prediction of the clinical course of primary open-angle glaucoma (POAG) is one of the main directions in solving the problem of vision loss prevention and stabilization of the pathological process. Simple statistical methods of correlation analysis show the extent of each risk factor's impact, but do not indicate the total impact of these factors in personalized combinations. The relationships between the risk factors is subject to correlation and regression analysis. The regression equation represents the dependence of the mathematical expectation of the resulting sign on the combination of factor signs. To develop a technique for predicting the probability of development and progression of primary open-angle glaucoma based on a personalized combination of risk factors by linear multivariate regression analysis. The study included 66 patients (23 female and 43 male; 132 eyes) with newly diagnosed primary open-angle glaucoma. The control group consisted of 14 patients (8 male and 6 female). Standard ophthalmic examination was supplemented with biochemical study of lacrimal fluid. Concentration of matrix metalloproteinase MMP-2 and MMP-9 in tear fluid in both eyes was determined using 'sandwich' enzyme-linked immunosorbent assay (ELISA) method. The study resulted in the development of regression equations and step-by-step multivariate logistic models that can help calculate the risk of development and progression of POAG. Those models are based on expert evaluation of clinical and instrumental indicators of hydrodynamic disturbances (coefficient of outflow ease - C, volume of intraocular fluid secretion - F, fluctuation of intraocular pressure), as well as personalized morphometric parameters of the retina (central retinal thickness in the macular area) and concentration of MMP-2 and MMP-9 in the tear film. The newly developed regression equations are highly informative and can be a reliable tool for studying of the influence vector and assessment of pathogenic potential of the independent risk factors in specific personalized combinations.
Karanjekar, Richa V; Bhatt, Arpita; Altouqui, Said; Jangikhatoonabad, Neda; Durai, Vennila; Sattler, Melanie L; Hossain, M D Sahadat; Chen, Victoria
2015-12-01
Accurately estimating landfill methane emissions is important for quantifying a landfill's greenhouse gas emissions and power generation potential. Current models, including LandGEM and IPCC, often greatly simplify treatment of factors like rainfall and ambient temperature, which can substantially impact gas production. The newly developed Capturing Landfill Emissions for Energy Needs (CLEEN) model aims to improve landfill methane generation estimates, but still require inputs that are fairly easy to obtain: waste composition, annual rainfall, and ambient temperature. To develop the model, methane generation was measured from 27 laboratory scale landfill reactors, with varying waste compositions (ranging from 0% to 100%); average rainfall rates of 2, 6, and 12 mm/day; and temperatures of 20, 30, and 37°C, according to a statistical experimental design. Refuse components considered were the major biodegradable wastes, food, paper, yard/wood, and textile, as well as inert inorganic waste. Based on the data collected, a multiple linear regression equation (R(2)=0.75) was developed to predict first-order methane generation rate constant values k as functions of waste composition, annual rainfall, and temperature. Because, laboratory methane generation rates exceed field rates, a second scale-up regression equation for k was developed using actual gas-recovery data from 11 landfills in high-income countries with conventional operation. The Capturing Landfill Emissions for Energy Needs (CLEEN) model was developed by incorporating both regression equations into the first-order decay based model for estimating methane generation rates from landfills. CLEEN model values were compared to actual field data from 6 US landfills, and to estimates from LandGEM and IPCC. For 4 of the 6 cases, CLEEN model estimates were the closest to actual. Copyright © 2015 Elsevier Ltd. All rights reserved.
Low-flow, base-flow, and mean-flow regression equations for Pennsylvania streams
Stuckey, Marla H.
2006-01-01
Low-flow, base-flow, and mean-flow characteristics are an important part of assessing water resources in a watershed. These streamflow characteristics can be used by watershed planners and regulators to determine water availability, water-use allocations, assimilative capacities of streams, and aquatic-habitat needs. Streamflow characteristics are commonly predicted by use of regression equations when a nearby streamflow-gaging station is not available. Regression equations for predicting low-flow, base-flow, and mean-flow characteristics for Pennsylvania streams were developed from data collected at 293 continuous- and partial-record streamflow-gaging stations with flow unaffected by upstream regulation, diversion, or mining. Continuous-record stations used in the regression analysis had 9 years or more of data, and partial-record stations used had seven or more measurements collected during base-flow conditions. The state was divided into five low-flow regions and regional regression equations were developed for the 7-day, 10-year; 7-day, 2-year; 30-day, 10-year; 30-day, 2-year; and 90-day, 10-year low flows using generalized least-squares regression. Statewide regression equations were developed for the 10-year, 25-year, and 50-year base flows using generalized least-squares regression. Statewide regression equations were developed for harmonic mean and mean annual flow using weighted least-squares regression. Basin characteristics found to be significant explanatory variables at the 95-percent confidence level for one or more regression equations were drainage area, basin slope, thickness of soil, stream density, mean annual precipitation, mean elevation, and the percentage of glaciation, carbonate bedrock, forested area, and urban area within a basin. Standard errors of prediction ranged from 33 to 66 percent for the n-day, T-year low flows; 21 to 23 percent for the base flows; and 12 to 38 percent for the mean annual flow and harmonic mean, respectively. The regression equations are not valid in watersheds with upstream regulation, diversions, or mining activities. Watersheds with karst features need close examination as to the applicability of the regression-equation results.
Wiley, J.B.; Atkins, John T.; Tasker, Gary D.
2000-01-01
Multiple and simple least-squares regression models for the log10-transformed 100-year discharge with independent variables describing the basin characteristics (log10-transformed and untransformed) for 267 streamflow-gaging stations were evaluated, and the regression residuals were plotted as areal distributions that defined three regions of the State, designated East, North, and South. Exploratory data analysis procedures identified 31 gaging stations at which discharges are different than would be expected for West Virginia. Regional equations for the 2-, 5-, 10-, 25-, 50-, 100-, 200-, and 500-year peak discharges were determined by generalized least-squares regression using data from 236 gaging stations. Log10-transformed drainage area was the most significant independent variable for all regions.Equations developed in this study are applicable only to rural, unregulated, streams within the boundaries of West Virginia. The accuracy of estimating equations is quantified by measuring the average prediction error (from 27.7 to 44.7 percent) and equivalent years of record (from 1.6 to 20.0 years).
Jiang, Wei; Xu, Chao-Zhen; Jiang, Si-Zhi; Zhang, Tang-Duo; Wang, Shi-Zhen; Fang, Bai-Shan
2017-04-01
L-tert-Leucine (L-Tle) and its derivatives are extensively used as crucial building blocks for chiral auxiliaries, pharmaceutically active ingredients, and ligands. Combining with formate dehydrogenase (FDH) for regenerating the expensive coenzyme NADH, leucine dehydrogenase (LeuDH) is continually used for synthesizing L-Tle from α-keto acid. A multilevel factorial experimental design was executed for research of this system. In this work, an efficient optimization method for improving the productivity of L-Tle was developed. And the mathematical model between different fermentation conditions and L-Tle yield was also determined in the form of the equation by using uniform design and regression analysis. The multivariate regression equation was conveniently implemented in water, with a space time yield of 505.9 g L -1 day -1 and an enantiomeric excess value of >99 %. These results demonstrated that this method might become an ideal protocol for industrial production of chiral compounds and unnatural amino acids such as chiral drug intermediates.
Kohn, Michael S.; Stevens, Michael R.; Harden, Tessa M.; Godaire, Jeanne E.; Klinger, Ralph E.; Mommandi, Amanullah
2016-09-09
The U.S. Geological Survey (USGS), in cooperation with the Colorado Department of Transportation, developed regional-regression equations for estimating the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, 0.2-percent annual exceedance-probability discharge (AEPD) for natural streamflow in eastern Colorado. A total of 188 streamgages, consisting of 6,536 years of record and a mean of approximately 35 years of record per streamgage, were used to develop the peak-streamflow regional-regression equations. The estimated AEPDs for each streamgage were computed using the USGS software program PeakFQ. The AEPDs were determined using systematic data through water year 2013. Based on previous studies conducted in Colorado and neighboring States and on the availability of data, 72 characteristics (57 basin and 15 climatic characteristics) were evaluated as candidate explanatory variables in the regression analysis. Paleoflood and non-exceedance bound ages were established based on reconnaissance-level methods. Multiple lines of evidence were used at each streamgage to arrive at a conclusion (age estimate) to add a higher degree of certainty to reconnaissance-level estimates. Paleoflood or nonexceedance bound evidence was documented at 41 streamgages, and 3 streamgages had previously collected paleoflood data.To determine the peak discharge of a paleoflood or non-exceedanc bound, two different hydraulic models were used.The mean standard error of prediction (SEP) for all 8 AEPDs was reduced approximately 25 percent compared to the previous flood-frequency study. For paleoflood data to be effective in reducing the SEP in eastern Colorado, a larger ratio than 44 of 188 (23 percent) streamgages would need paleoflood data and that paleoflood data would need to increase the record length by more than 25 years for the 1-percent AEPD. The greatest reduction in SEP for the peak-streamflow regional-regression equations was observed when additional new basin characteristics were included in the peak-streamflow regional-regression equations and when eastern Colorado was divided into two separate hydrologic regions. To make further reductions in the uncertainties of the peak-streamflow regional-regression equations in the Foothills and Plains hydrologic regions, additional streamgages or crest-stage gages are needed to collect peak-streamflow data on natural streams in eastern Colorado.Generalized-Least Squares regression was used to compute the final peak-streamflow regional-regression equations for peak-streamflow. Dividing eastern Colorado into two new individual regions at –104° longitude resulted in peak-streamflow regional-regression equations with the smallest SEP. The new hydrologic region located between –104° longitude and the Kansas-Nebraska State line will be designated the Plains hydrologic region and the hydrologic region comprising the rest of eastern Colorado located west of the –104° longitude and east of the Rocky Mountains and below 7,500 feet in the South Platte River Basin and below 9,000 feet in the Arkansas River Basin will be designated the Foothills hydrologic region.
New methodology for modeling annual-aircraft emissions at airports
DOE Office of Scientific and Technical Information (OSTI.GOV)
Woodmansey, B.G.; Patterson, J.G.
An as-accurate-as-possible estimation of total-aircraft emissions are an essential component of any environmental-impact assessment done for proposed expansions at major airports. To determine the amount of emissions generated by aircraft using present models it is necessary to know the emission characteristics of all engines that are on all planes using the airport. However, the published data base does not cover all engine types and, therefore, a new methodology is needed to assist in estimating annual emissions from aircraft at airports. Linear regression equations relating quantity of emissions to aircraft weight using a known-fleet mix are developed in this paper. Total-annualmore » emissions for CO, NO[sub x], NMHC, SO[sub x], CO[sub 2], and N[sub 2]O are tabulated for Toronto's international airport for 1990. The regression equations are statistically significant for all emissions except for NMHC from large jets and NO[sub x] and NMHC for piston-engine aircraft. This regression model is a relatively simple, fast, and inexpensive method of obtaining an annual-emission inventory for an airport.« less
Dudley, Robert W.; Hodgkins, Glenn A.; Dickinson, Jesse
2017-01-01
We present a logistic regression approach for forecasting the probability of future groundwater levels declining or maintaining below specific groundwater-level thresholds. We tested our approach on 102 groundwater wells in different climatic regions and aquifers of the United States that are part of the U.S. Geological Survey Groundwater Climate Response Network. We evaluated the importance of current groundwater levels, precipitation, streamflow, seasonal variability, Palmer Drought Severity Index, and atmosphere/ocean indices for developing the logistic regression equations. Several diagnostics of model fit were used to evaluate the regression equations, including testing of autocorrelation of residuals, goodness-of-fit metrics, and bootstrap validation testing. The probabilistic predictions were most successful at wells with high persistence (low month-to-month variability) in their groundwater records and at wells where the groundwater level remained below the defined low threshold for sustained periods (generally three months or longer). The model fit was weakest at wells with strong seasonal variability in levels and with shorter duration low-threshold events. We identified challenges in deriving probabilistic-forecasting models and possible approaches for addressing those challenges.
Validation of a heteroscedastic hazards regression model.
Wu, Hong-Dar Isaac; Hsieh, Fushing; Chen, Chen-Hsin
2002-03-01
A Cox-type regression model accommodating heteroscedasticity, with a power factor of the baseline cumulative hazard, is investigated for analyzing data with crossing hazards behavior. Since the approach of partial likelihood cannot eliminate the baseline hazard, an overidentified estimating equation (OEE) approach is introduced in the estimation procedure. It by-product, a model checking statistic, is presented to test for the overall adequacy of the heteroscedastic model. Further, under the heteroscedastic model setting, we propose two statistics to test the proportional hazards assumption. Implementation of this model is illustrated in a data analysis of a cancer clinical trial.
Robust mislabel logistic regression without modeling mislabel probabilities.
Hung, Hung; Jou, Zhi-Yu; Huang, Su-Yun
2018-03-01
Logistic regression is among the most widely used statistical methods for linear discriminant analysis. In many applications, we only observe possibly mislabeled responses. Fitting a conventional logistic regression can then lead to biased estimation. One common resolution is to fit a mislabel logistic regression model, which takes into consideration of mislabeled responses. Another common method is to adopt a robust M-estimation by down-weighting suspected instances. In this work, we propose a new robust mislabel logistic regression based on γ-divergence. Our proposal possesses two advantageous features: (1) It does not need to model the mislabel probabilities. (2) The minimum γ-divergence estimation leads to a weighted estimating equation without the need to include any bias correction term, that is, it is automatically bias-corrected. These features make the proposed γ-logistic regression more robust in model fitting and more intuitive for model interpretation through a simple weighting scheme. Our method is also easy to implement, and two types of algorithms are included. Simulation studies and the Pima data application are presented to demonstrate the performance of γ-logistic regression. © 2017, The International Biometric Society.
Ahearn, Elizabeth A.
2010-01-01
Multiple linear regression equations for determining flow-duration statistics were developed to estimate select flow exceedances ranging from 25- to 99-percent for six 'bioperiods'-Salmonid Spawning (November), Overwinter (December-February), Habitat Forming (March-April), Clupeid Spawning (May), Resident Spawning (June), and Rearing and Growth (July-October)-in Connecticut. Regression equations also were developed to estimate the 25- and 99-percent flow exceedances without reference to a bioperiod. In total, 32 equations were developed. The predictive equations were based on regression analyses relating flow statistics from streamgages to GIS-determined basin and climatic characteristics for the drainage areas of those streamgages. Thirty-nine streamgages (and an additional 6 short-term streamgages and 28 partial-record sites for the non-bioperiod 99-percent exceedance) in Connecticut and adjacent areas of neighboring States were used in the regression analysis. Weighted least squares regression analysis was used to determine the predictive equations; weights were assigned based on record length. The basin characteristics-drainage area, percentage of area with coarse-grained stratified deposits, percentage of area with wetlands, mean monthly precipitation (November), mean seasonal precipitation (December, January, and February), and mean basin elevation-are used as explanatory variables in the equations. Standard errors of estimate of the 32 equations ranged from 10.7 to 156 percent with medians of 19.2 and 55.4 percent to predict the 25- and 99-percent exceedances, respectively. Regression equations to estimate high and median flows (25- to 75-percent exceedances) are better predictors (smaller variability of the residual values around the regression line) than the equations to estimate low flows (less than 75-percent exceedance). The Habitat Forming (March-April) bioperiod had the smallest standard errors of estimate, ranging from 10.7 to 20.9 percent. In contrast, the Rearing and Growth (July-October) bioperiod had the largest standard errors, ranging from 30.9 to 156 percent. The adjusted coefficient of determination of the equations ranged from 77.5 to 99.4 percent with medians of 98.5 and 90.6 percent to predict the 25- and 99-percent exceedances, respectively. Descriptive information on the streamgages used in the regression, measured basin and climatic characteristics, and estimated flow-duration statistics are provided in this report. Flow-duration statistics and the 32 regression equations for estimating flow-duration statistics in Connecticut are stored on the U.S. Geological Survey World Wide Web application ?StreamStats? (http://water.usgs.gov/osw/streamstats/index.html). The regression equations developed in this report can be used to produce unbiased estimates of select flow exceedances statewide.
Jennings, M.E.; Thomas, W.O.; Riggs, H.C.
1994-01-01
For many years, the U.S. Geological Survey (USGS) has been involved in the development of regional regression equations for estimating flood magnitude and frequency at ungaged sites. These regression equations are used to transfer flood characteristics from gaged to ungaged sites through the use of watershed and climatic characteristics as explanatory or predictor variables. Generally these equations have been developed on a statewide or metropolitan area basis as part of cooperative study programs with specific State Departments of Transportation or specific cities. The USGS, in cooperation with the Federal Highway Administration and the Federal Emergency Management Agency, has compiled all the current (as of September 1993) statewide and metropolitan area regression equations into a micro-computer program titled the National Flood Frequency Program.This program includes regression equations for estimating flood-peak discharges and techniques for estimating a typical flood hydrograph for a given recurrence interval peak discharge for unregulated rural and urban watersheds. These techniques should be useful to engineers and hydrologists for planning and design applications. This report summarizes the statewide regression equations for rural watersheds in each State, summarizes the applicable metropolitan area or statewide regression equations for urban watersheds, describes the National Flood Frequency Program for making these computations, and provides much of the reference information on the extrapolation variables needed to run the program.
Ejlerskov, Katrine T.; Jensen, Signe M.; Christensen, Line B.; Ritz, Christian; Michaelsen, Kim F.; Mølgaard, Christian
2014-01-01
For 3-year-old children suitable methods to estimate body composition are sparse. We aimed to develop predictive equations for estimating fat-free mass (FFM) from bioelectrical impedance (BIA) and anthropometry using dual-energy X-ray absorptiometry (DXA) as reference method using data from 99 healthy 3-year-old Danish children. Predictive equations were derived from two multiple linear regression models, a comprehensive model (height2/resistance (RI), six anthropometric measurements) and a simple model (RI, height, weight). Their uncertainty was quantified by means of 10-fold cross-validation approach. Prediction error of FFM was 3.0% for both equations (root mean square error: 360 and 356 g, respectively). The derived equations produced BIA-based prediction of FFM and FM near DXA scan results. We suggest that the predictive equations can be applied in similar population samples aged 2–4 years. The derived equations may prove useful for studies linking body composition to early risk factors and early onset of obesity. PMID:24463487
Ejlerskov, Katrine T; Jensen, Signe M; Christensen, Line B; Ritz, Christian; Michaelsen, Kim F; Mølgaard, Christian
2014-01-27
For 3-year-old children suitable methods to estimate body composition are sparse. We aimed to develop predictive equations for estimating fat-free mass (FFM) from bioelectrical impedance (BIA) and anthropometry using dual-energy X-ray absorptiometry (DXA) as reference method using data from 99 healthy 3-year-old Danish children. Predictive equations were derived from two multiple linear regression models, a comprehensive model (height(2)/resistance (RI), six anthropometric measurements) and a simple model (RI, height, weight). Their uncertainty was quantified by means of 10-fold cross-validation approach. Prediction error of FFM was 3.0% for both equations (root mean square error: 360 and 356 g, respectively). The derived equations produced BIA-based prediction of FFM and FM near DXA scan results. We suggest that the predictive equations can be applied in similar population samples aged 2-4 years. The derived equations may prove useful for studies linking body composition to early risk factors and early onset of obesity.
Weighted linear regression using D2H and D2 as the independent variables
Hans T. Schreuder; Michael S. Williams
1998-01-01
Several error structures for weighted regression equations used for predicting volume were examined for 2 large data sets of felled and standing loblolly pine trees (Pinus taeda L.). The generally accepted model with variance of error proportional to the value of the covariate squared ( D2H = diameter squared times height or D...
F.F. Wangaard; George E. Woodson
1972-01-01
Based on a model developed for hardwood fiber strength-pulp property relationships, multiple-regression equations involving fiber strength, fiber length, and sheet density were determined to predict the properties of kraft pulps of slash pine (Pinus elliottii). Regressions for breaking length and burst factor accounted for 88 and 90 percent,...
Fiber length strength interrelationship for slash pine and its effect on pulp-sheet properties
F. G. Wangaard; G. E. Woodson
1973-01-01
Based on a model developed for hardwood fiber strength-pulp property relationships, multiple-regression equations involving fiber strength, fiber length, and sheet density were determined to predict the properties of kraft pulps of slash pine (Pinus elliottii). Regressions for breaking length and burst factor accounted for 88 and 90 percent,...
NASA Astrophysics Data System (ADS)
Kiss, I.; Cioată, V. G.; Alexa, V.; Raţiu, S. A.
2017-05-01
The braking system is one of the most important and complex subsystems of railway vehicles, especially when it comes for safety. Therefore, installing efficient safe brakes on the modern railway vehicles is essential. Nowadays is devoted attention to solving problems connected with using high performance brake materials and its impact on thermal and mechanical loading of railway wheels. The main factor that influences the selection of a friction material for railway applications is the performance criterion, due to the interaction between the brake block and the wheel produce complex thermos-mechanical phenomena. In this work, the investigated subjects are the cast-iron brake shoes, which are still widely used on freight wagons. Therefore, the cast-iron brake shoes - with lamellar graphite and with a high content of phosphorus (0.8-1.1%) - need a special investigation. In order to establish the optimal condition for the cast-iron brake shoes we proposed a mathematical modelling study by using the statistical analysis and multiple regression equations. Multivariate research is important in areas of cast-iron brake shoes manufacturing, because many variables interact with each other simultaneously. Multivariate visualization comes to the fore when researchers have difficulties in comprehending many dimensions at one time. Technological data (hardness and chemical composition) obtained from cast-iron brake shoes were used for this purpose. In order to settle the multiple correlation between the hardness of the cast-iron brake shoes, and the chemical compositions elements several model of regression equation types has been proposed. Because a three-dimensional surface with variables on three axes is a common way to illustrate multivariate data, in which the maximum and minimum values are easily highlighted, we plotted graphical representation of the regression equations in order to explain interaction of the variables and locate the optimal level of each variable for maximal response. For the calculation of the regression coefficients, dispersion and correlation coefficients, the software Matlab was used.
Eash, David A.; Barnes, Kimberlee K.; O'Shea, Padraic S.
2016-09-19
A statewide study was led to develop regression equations for estimating three selected spring and three selected fall low-flow frequency statistics for ungaged stream sites in Iowa. The estimation equations developed for the six low-flow frequency statistics include spring (April through June) 1-, 7-, and 30-day mean low flows for a recurrence interval of 10 years and fall (October through December) 1-, 7-, and 30-day mean low flows for a recurrence interval of 10 years. Estimates of the three selected spring statistics are provided for 241 U.S. Geological Survey continuous-record streamgages, and estimates of the three selected fall statistics are provided for 238 of these streamgages, using data through June 2014. Because only 9 years of fall streamflow record were available, three streamgages included in the development of the spring regression equations were not included in the development of the fall regression equations. Because of regulation, diversion, or urbanization, 30 of the 241 streamgages were not included in the development of the regression equations. The study area includes Iowa and adjacent areas within 50 miles of the Iowa border. Because trend analyses indicated statistically significant positive trends when considering the period of record for most of the streamgages, the longest, most recent period of record without a significant trend was determined for each streamgage for use in the study. Geographic information system software was used to measure 63 selected basin characteristics for each of the 211streamgages used to develop the regional regression equations. The study area was divided into three low-flow regions that were defined in a previous study for the development of regional regression equations.Because several streamgages included in the development of regional regression equations have estimates of zero flow calculated from observed streamflow for selected spring and fall low-flow frequency statistics, the final equations for the three low-flow regions were developed using two types of regression analyses—left-censored and generalized-least-squares regression analyses. A total of 211 streamgages were included in the development of nine spring regression equations—three equations for each of the three low-flow regions. A total of 208 streamgages were included in the development of nine fall regression equations—three equations for each of the three low-flow regions. A censoring threshold was used to develop 15 left-censored regression equations to estimate the three fall low-flow frequency statistics for each of the three low-flow regions and to estimate the three spring low-flow frequency statistics for the southern and northwest regions. For the northeast region, generalized-least-squares regression was used to develop three equations to estimate the three spring low-flow frequency statistics. For the northeast region, average standard errors of prediction range from 32.4 to 48.4 percent for the spring equations and average standard errors of estimate range from 56.4 to 73.8 percent for the fall equations. For the northwest region, average standard errors of estimate range from 58.9 to 62.1 percent for the spring equations and from 83.2 to 109.4 percent for the fall equations. For the southern region, average standard errors of estimate range from 43.2 to 64.0 percent for the spring equations and from 78.1 to 78.7 percent for the fall equations.The regression equations are applicable only to stream sites in Iowa with low flows not substantially affected by regulation, diversion, or urbanization and with basin characteristics within the range of those used to develop the equations. The regression equations will be implemented within the U.S. Geological Survey StreamStats Web-based geographic information system application. StreamStats allows users to click on any ungaged stream site and compute estimates of the six selected spring and fall low-flow statistics; in addition, 90-percent prediction intervals and the measured basin characteristics for the ungaged site are provided. StreamStats also allows users to click on any Iowa streamgage to obtain computed estimates for the six selected spring and fall low-flow statistics.
An Overview of Longitudinal Data Analysis Methods for Neurological Research
Locascio, Joseph J.; Atri, Alireza
2011-01-01
The purpose of this article is to provide a concise, broad and readily accessible overview of longitudinal data analysis methods, aimed to be a practical guide for clinical investigators in neurology. In general, we advise that older, traditional methods, including (1) simple regression of the dependent variable on a time measure, (2) analyzing a single summary subject level number that indexes changes for each subject and (3) a general linear model approach with a fixed-subject effect, should be reserved for quick, simple or preliminary analyses. We advocate the general use of mixed-random and fixed-effect regression models for analyses of most longitudinal clinical studies. Under restrictive situations or to provide validation, we recommend: (1) repeated-measure analysis of covariance (ANCOVA), (2) ANCOVA for two time points, (3) generalized estimating equations and (4) latent growth curve/structural equation models. PMID:22203825
Regression analysis for solving diagnosis problem of children's health
NASA Astrophysics Data System (ADS)
Cherkashina, Yu A.; Gerget, O. M.
2016-04-01
The paper includes results of scientific researches. These researches are devoted to the application of statistical techniques, namely, regression analysis, to assess the health status of children in the neonatal period based on medical data (hemostatic parameters, parameters of blood tests, the gestational age, vascular-endothelial growth factor) measured at 3-5 days of children's life. In this paper a detailed description of the studied medical data is given. A binary logistic regression procedure is discussed in the paper. Basic results of the research are presented. A classification table of predicted values and factual observed values is shown, the overall percentage of correct recognition is determined. Regression equation coefficients are calculated, the general regression equation is written based on them. Based on the results of logistic regression, ROC analysis was performed, sensitivity and specificity of the model are calculated and ROC curves are constructed. These mathematical techniques allow carrying out diagnostics of health of children providing a high quality of recognition. The results make a significant contribution to the development of evidence-based medicine and have a high practical importance in the professional activity of the author.
Combustion performance and scale effect from N2O/HTPB hybrid rocket motor simulations
NASA Astrophysics Data System (ADS)
Shan, Fanli; Hou, Lingyun; Piao, Ying
2013-04-01
HRM code for the simulation of N2O/HTPB hybrid rocket motor operation and scale effect analysis has been developed. This code can be used to calculate motor thrust and distributions of physical properties inside the combustion chamber and nozzle during the operational phase by solving the unsteady Navier-Stokes equations using a corrected compressible difference scheme and a two-step, five species combustion model. A dynamic fuel surface regression technique and a two-step calculation method together with the gas-solid coupling are applied in the calculation of fuel regression and the determination of combustion chamber wall profile as fuel regresses. Both the calculated motor thrust from start-up to shut-down mode and the combustion chamber wall profile after motor operation are in good agreements with experimental data. The fuel regression rate equation and the relation between fuel regression rate and axial distance have been derived. Analysis of results suggests improvements in combustion performance to the current hybrid rocket motor design and explains scale effects in the variation of fuel regression rate with combustion chamber diameter.
OPC modeling by genetic algorithm
NASA Astrophysics Data System (ADS)
Huang, W. C.; Lai, C. M.; Luo, B.; Tsai, C. K.; Tsay, C. S.; Lai, C. W.; Kuo, C. C.; Liu, R. G.; Lin, H. T.; Lin, B. J.
2005-05-01
Optical proximity correction (OPC) is usually used to pre-distort mask layouts to make the printed patterns as close to the desired shapes as possible. For model-based OPC, a lithographic model to predict critical dimensions after lithographic processing is needed. The model is usually obtained via a regression of parameters based on experimental data containing optical proximity effects. When the parameters involve a mix of the continuous (optical and resist models) and the discrete (kernel numbers) sets, the traditional numerical optimization method may have difficulty handling model fitting. In this study, an artificial-intelligent optimization method was used to regress the parameters of the lithographic models for OPC. The implemented phenomenological models were constant-threshold models that combine diffused aerial image models with loading effects. Optical kernels decomposed from Hopkin"s equation were used to calculate aerial images on the wafer. Similarly, the numbers of optical kernels were treated as regression parameters. This way, good regression results were obtained with different sets of optical proximity effect data.
Katherine J. Elliott; Barton D. Clinton
1993-01-01
Allometric equations were developed to predict aboveground dry weight of herbaceous and woody species on prescribe-burned sites in the Southern Appalachians. Best-fit least-square regression models were developed using diamet,er, height, or both, as the independent variables and dry weight as the dependent variable. Coefficients of determination for the selected total...
The Protective Role of Supportive Friends against Bullying Perpetration and Victimization
ERIC Educational Resources Information Center
Kendrick, Kristin; Jutengren, Goran; Stattin, Hakan
2012-01-01
A crossed-lagged regression model was tested to investigate relationships between friendship support, bullying involvement, and its consequences during adolescence. Students, 12-16 years (N = 880), were administered questionnaires twice, one year apart. Using structural equation modeling, a model was specified and higher levels of support from…
Bjerklie, David M.; Dingman, S. Lawrence; Bolster, Carl H.
2005-01-01
A set of conceptually derived in‐bank river discharge–estimating equations (models), based on the Manning and Chezy equations, are calibrated and validated using a database of 1037 discharge measurements in 103 rivers in the United States and New Zealand. The models are compared to a multiple regression model derived from the same data. The comparison demonstrates that in natural rivers, using an exponent on the slope variable of 0.33 rather than the traditional value of 0.5 reduces the variance associated with estimating flow resistance. Mean model uncertainty, assuming a constant value for the conductance coefficient, is less than 5% for a large number of estimates, and 67% of the estimates would be accurate within 50%. The models have potential application where site‐specific flow resistance information is not available and can be the basis for (1) a general approach to estimating discharge from remotely sensed hydraulic data, (2) comparison to slope‐area discharge estimates, and (3) large‐scale river modeling.
Regression model estimation of early season crop proportions: North Dakota, some preliminary results
NASA Technical Reports Server (NTRS)
Lin, K. K. (Principal Investigator)
1982-01-01
To estimate crop proportions early in the season, an approach is proposed based on: use of a regression-based prediction equation to obtain an a priori estimate for specific major crop groups; modification of this estimate using current-year LANDSAT and weather data; and a breakdown of the major crop groups into specific crops by regression models. Results from the development and evaluation of appropriate regression models for the first portion of the proposed approach are presented. The results show that the model predicts 1980 crop proportions very well at both county and crop reporting district levels. In terms of planted acreage, the model underpredicted 9.1 percent of the 1980 published data on planted acreage at the county level. It predicted almost exactly the 1980 published data on planted acreage at the crop reporting district level and overpredicted the planted acreage by just 0.92 percent.
NASA Astrophysics Data System (ADS)
POP, A. B.; ȚÎȚU, M. A.
2016-11-01
In the metal cutting process, surface quality is intrinsically related to the cutting parameters and to the cutting tool geometry. At the same time, metal cutting processes are closely related to the machining costs. The purpose of this paper is to reduce manufacturing costs and processing time. A study was made, based on the mathematical modelling of the average of the absolute value deviation (Ra) resulting from the end milling process on 7136 aluminium alloy, depending on cutting process parameters. The novel element brought by this paper is the 7136 aluminium alloy type, chosen to conduct the experiments, which is a material developed and patented by Universal Alloy Corporation. This aluminium alloy is used in the aircraft industry to make parts from extruded profiles, and it has not been studied for the proposed research direction. Based on this research, a mathematical model of surface roughness Ra was established according to the cutting parameters studied in a set experimental field. A regression analysis was performed, which identified the quantitative relationships between cutting parameters and the surface roughness. Using the variance analysis ANOVA, the degree of confidence for the achieved results by the regression equation was determined, and the suitability of this equation at every point of the experimental field.
Mathur, Praveen; Sharma, Sarita; Soni, Bhupendra
2010-01-01
In the present work, an attempt is made to formulate multiple regression equations using all possible regressions method for groundwater quality assessment of Ajmer-Pushkar railway line region in pre- and post-monsoon seasons. Correlation studies revealed the existence of linear relationships (r 0.7) for electrical conductivity (EC), total hardness (TH) and total dissolved solids (TDS) with other water quality parameters. The highest correlation was found between EC and TDS (r = 0.973). EC showed highly significant positive correlation with Na, K, Cl, TDS and total solids (TS). TH showed highest correlation with Ca and Mg. TDS showed significant correlation with Na, K, SO4, PO4 and Cl. The study indicated that most of the contamination present was water soluble or ionic in nature. Mg was present as MgCl2; K mainly as KCl and K2SO4, and Na was present as the salts of Cl, SO4 and PO4. On the other hand, F and NO3 showed no significant correlations. The r2 values and F values (at 95% confidence limit, alpha = 0.05) for the modelled equations indicated high degree of linearity among independent and dependent variables. Also the error % between calculated and experimental values was contained within +/- 15% limit.
Nestler, Steffen
2014-05-01
Parameters in structural equation models are typically estimated using the maximum likelihood (ML) approach. Bollen (1996) proposed an alternative non-iterative, equation-by-equation estimator that uses instrumental variables. Although this two-stage least squares/instrumental variables (2SLS/IV) estimator has good statistical properties, one problem with its application is that parameter equality constraints cannot be imposed. This paper presents a mathematical solution to this problem that is based on an extension of the 2SLS/IV approach to a system of equations. We present an example in which our approach was used to examine strong longitudinal measurement invariance. We also investigated the new approach in a simulation study that compared it with ML in the examination of the equality of two latent regression coefficients and strong measurement invariance. Overall, the results show that the suggested approach is a useful extension of the original 2SLS/IV estimator and allows for the effective handling of equality constraints in structural equation models. © 2013 The British Psychological Society.
Lombard, Pamela J.; Hodgkins, Glenn A.
2015-01-01
Regression equations to estimate peak streamflows with 1- to 500-year recurrence intervals (annual exceedance probabilities from 99 to 0.2 percent, respectively) were developed for small, ungaged streams in Maine. Equations presented here are the best available equations for estimating peak flows at ungaged basins in Maine with drainage areas from 0.3 to 12 square miles (mi2). Previously developed equations continue to be the best available equations for estimating peak flows for basin areas greater than 12 mi2. New equations presented here are based on streamflow records at 40 U.S. Geological Survey streamgages with a minimum of 10 years of recorded peak flows between 1963 and 2012. Ordinary least-squares regression techniques were used to determine the best explanatory variables for the regression equations. Traditional map-based explanatory variables were compared to variables requiring field measurements. Two field-based variables—culvert rust lines and bankfull channel widths—either were not commonly found or did not explain enough of the variability in the peak flows to warrant inclusion in the equations. The best explanatory variables were drainage area and percent basin wetlands; values for these variables were determined with a geographic information system. Generalized least-squares regression was used with these two variables to determine the equation coefficients and estimates of accuracy for the final equations.
NASA Astrophysics Data System (ADS)
Shi, Jinfei; Zhu, Songqing; Chen, Ruwen
2017-12-01
An order selection method based on multiple stepwise regressions is proposed for General Expression of Nonlinear Autoregressive model which converts the model order problem into the variable selection of multiple linear regression equation. The partial autocorrelation function is adopted to define the linear term in GNAR model. The result is set as the initial model, and then the nonlinear terms are introduced gradually. Statistics are chosen to study the improvements of both the new introduced and originally existed variables for the model characteristics, which are adopted to determine the model variables to retain or eliminate. So the optimal model is obtained through data fitting effect measurement or significance test. The simulation and classic time-series data experiment results show that the method proposed is simple, reliable and can be applied to practical engineering.
Qing, Si-han; Chang, Yun-feng; Dong, Xiao-ai; Li, Yuan; Chen, Xiao-gang; Shu, Yong-kang; Deng, Zhen-hua
2013-10-01
To establish the mathematical models of stature estimation for Sichuan Han female with measurement of lumbar vertebrae by X-ray to provide essential data for forensic anthropology research. The samples, 206 Sichuan Han females, were divided into three groups including group A, B and C according to the ages. Group A (206 samples) consisted of all ages, group B (116 samples) were 20-45 years old and 90 samples over 45 years old were group C. All the samples were examined lumbar vertebrae through CR technology, including the parameters of five centrums (L1-L5) as anterior border, posterior border and central heights (x1-x15), total central height of lumbar spine (x16), and the real height of every sample. The linear regression analysis was produced using the parameters to establish the mathematical models of stature estimation. Sixty-two trained subjects were tested to verify the accuracy of the mathematical models. The established mathematical models by hypothesis test of linear regression equation model were statistically significant (P<0.05). The standard errors of the equation were 2.982-5.004 cm, while correlation coefficients were 0.370-0.779 and multiple correlation coefficients were 0.533-0.834. The return tests of the highest correlation coefficient and multiple correlation coefficient of each group showed that the highest accuracy of the multiple regression equation, y = 100.33 + 1.489 x3 - 0.548 x6 + 0.772 x9 + 0.058 x12 + 0.645 x15, in group A were 80.6% (+/- lSE) and 100% (+/- 2SE). The established mathematical models in this study could be applied for the stature estimation for Sichuan Han females.
Avoiding and Correcting Bias in Score-Based Latent Variable Regression with Discrete Manifest Items
ERIC Educational Resources Information Center
Lu, Irene R. R.; Thomas, D. Roland
2008-01-01
This article considers models involving a single structural equation with latent explanatory and/or latent dependent variables where discrete items are used to measure the latent variables. Our primary focus is the use of scores as proxies for the latent variables and carrying out ordinary least squares (OLS) regression on such scores to estimate…
ERIC Educational Resources Information Center
Monahan, Patrick O.; McHorney, Colleen A.; Stump, Timothy E.; Perkins, Anthony J.
2007-01-01
Previous methodological and applied studies that used binary logistic regression (LR) for detection of differential item functioning (DIF) in dichotomously scored items either did not report an effect size or did not employ several useful measures of DIF magnitude derived from the LR model. Equations are provided for these effect size indices.…
Working covariance model selection for generalized estimating equations.
Carey, Vincent J; Wang, You-Gan
2011-11-20
We investigate methods for data-based selection of working covariance models in the analysis of correlated data with generalized estimating equations. We study two selection criteria: Gaussian pseudolikelihood and a geodesic distance based on discrepancy between model-sensitive and model-robust regression parameter covariance estimators. The Gaussian pseudolikelihood is found in simulation to be reasonably sensitive for several response distributions and noncanonical mean-variance relations for longitudinal data. Application is also made to a clinical dataset. Assessment of adequacy of both correlation and variance models for longitudinal data should be routine in applications, and we describe open-source software supporting this practice. Copyright © 2011 John Wiley & Sons, Ltd.
Multiple regression technique for Pth degree polynominals with and without linear cross products
NASA Technical Reports Server (NTRS)
Davis, J. W.
1973-01-01
A multiple regression technique was developed by which the nonlinear behavior of specified independent variables can be related to a given dependent variable. The polynomial expression can be of Pth degree and can incorporate N independent variables. Two cases are treated such that mathematical models can be studied both with and without linear cross products. The resulting surface fits can be used to summarize trends for a given phenomenon and provide a mathematical relationship for subsequent analysis. To implement this technique, separate computer programs were developed for the case without linear cross products and for the case incorporating such cross products which evaluate the various constants in the model regression equation. In addition, the significance of the estimated regression equation is considered and the standard deviation, the F statistic, the maximum absolute percent error, and the average of the absolute values of the percent of error evaluated. The computer programs and their manner of utilization are described. Sample problems are included to illustrate the use and capability of the technique which show the output formats and typical plots comparing computer results to each set of input data.
Estimation of Flood-Frequency Discharges for Rural, Unregulated Streams in West Virginia
Wiley, Jeffrey B.; Atkins, John T.
2010-01-01
Flood-frequency discharges were determined for 290 streamgage stations having a minimum of 9 years of record in West Virginia and surrounding states through the 2006 or 2007 water year. No trend was determined in the annual peaks used to calculate the flood-frequency discharges. Multiple and simple least-squares regression equations for the 100-year (1-percent annual-occurrence probability) flood discharge with independent variables that describe the basin characteristics were developed for 290 streamgage stations in West Virginia and adjacent states. The regression residuals for the models were evaluated and used to define three regions of the State, designated as Eastern Panhandle, Central Mountains, and Western Plateaus. Exploratory data analysis procedures identified 44 streamgage stations that were excluded from the development of regression equations representative of rural, unregulated streams in West Virginia. Regional equations for the 1.1-, 1.5-, 2-, 5-, 10-, 25-, 50-, 100-, 200-, and 500-year flood discharges were determined by generalized least-squares regression using data from the remaining 246 streamgage stations. Drainage area was the only significant independent variable determined for all equations in all regions. Procedures developed to estimate flood-frequency discharges on ungaged streams were based on (1) regional equations and (2) drainage-area ratios between gaged and ungaged locations on the same stream. The procedures are applicable only to rural, unregulated streams within the boundaries of West Virginia that have drainage areas within the limits of the stations used to develop the regional equations (from 0.21 to 1,461 square miles in the Eastern Panhandle, from 0.10 to 1,619 square miles in the Central Mountains, and from 0.13 to 1,516 square miles in the Western Plateaus). The accuracy of the equations is quantified by measuring the average prediction error (from 21.7 to 56.3 percent) and equivalent years of record (from 2.0 to 70.9 years).
NASA Astrophysics Data System (ADS)
Haddad, Khaled; Rahman, Ataur; A Zaman, Mohammad; Shrestha, Surendra
2013-03-01
SummaryIn regional hydrologic regression analysis, model selection and validation are regarded as important steps. Here, the model selection is usually based on some measurements of goodness-of-fit between the model prediction and observed data. In Regional Flood Frequency Analysis (RFFA), leave-one-out (LOO) validation or a fixed percentage leave out validation (e.g., 10%) is commonly adopted to assess the predictive ability of regression-based prediction equations. This paper develops a Monte Carlo Cross Validation (MCCV) technique (which has widely been adopted in Chemometrics and Econometrics) in RFFA using Generalised Least Squares Regression (GLSR) and compares it with the most commonly adopted LOO validation approach. The study uses simulated and regional flood data from the state of New South Wales in Australia. It is found that when developing hydrologic regression models, application of the MCCV is likely to result in a more parsimonious model than the LOO. It has also been found that the MCCV can provide a more realistic estimate of a model's predictive ability when compared with the LOO.
NASA Astrophysics Data System (ADS)
Keat, Sim Chong; Chun, Beh Boon; San, Lim Hwee; Jafri, Mohd Zubir Mat
2015-04-01
Climate change due to carbon dioxide (CO2) emissions is one of the most complex challenges threatening our planet. This issue considered as a great and international concern that primary attributed from different fossil fuels. In this paper, regression model is used for analyzing the causal relationship among CO2 emissions based on the energy consumption in Malaysia using time series data for the period of 1980-2010. The equations were developed using regression model based on the eight major sources that contribute to the CO2 emissions such as non energy, Liquefied Petroleum Gas (LPG), diesel, kerosene, refinery gas, Aviation Turbine Fuel (ATF) and Aviation Gasoline (AV Gas), fuel oil and motor petrol. The related data partly used for predict the regression model (1980-2000) and partly used for validate the regression model (2001-2010). The results of the prediction model with the measured data showed a high correlation coefficient (R2=0.9544), indicating the model's accuracy and efficiency. These results are accurate and can be used in early warning of the population to comply with air quality standards.
Roland, Mark A.; Stuckey, Marla H.
2008-01-01
Regression equations were developed for estimating flood flows at selected recurrence intervals for ungaged streams in Pennsylvania with drainage areas less than 2,000 square miles. These equations were developed utilizing peak-flow data from 322 streamflow-gaging stations within Pennsylvania and surrounding states. All stations used in the development of the equations had 10 or more years of record and included active and discontinued continuous-record as well as crest-stage partial-record stations. The state was divided into four regions, and regional regression equations were developed to estimate the 2-, 5-, 10-, 50-, 100-, and 500-year recurrence-interval flood flows. The equations were developed by means of a regression analysis that utilized basin characteristics and flow data associated with the stations. Significant explanatory variables at the 95-percent confidence level for one or more regression equations included the following basin characteristics: drainage area; mean basin elevation; and the percentages of carbonate bedrock, urban area, and storage within a basin. The regression equations can be used to predict the magnitude of flood flows for specified recurrence intervals for most streams in the state; however, they are not valid for streams with drainage areas generally greater than 2,000 square miles or with substantial regulation, diversion, or mining activity within the basin. Estimates of flood-flow magnitude and frequency for streamflow-gaging stations substantially affected by upstream regulation are also presented.
Hanley, James A
2008-01-01
Most survival analysis textbooks explain how the hazard ratio parameters in Cox's life table regression model are estimated. Fewer explain how the components of the nonparametric baseline survivor function are derived. Those that do often relegate the explanation to an "advanced" section and merely present the components as algebraic or iterative solutions to estimating equations. None comment on the structure of these estimators. This note brings out a heuristic representation that may help to de-mystify the structure.
NASA Astrophysics Data System (ADS)
Bhojawala, V. M.; Vakharia, D. P.
2017-12-01
This investigation provides an accurate prediction of static pull-in voltage for clamped-clamped micro/nano beams based on distributed model. The Euler-Bernoulli beam theory is used adapting geometric non-linearity of beam, internal (residual) stress, van der Waals force, distributed electrostatic force and fringing field effects for deriving governing differential equation. The Galerkin discretisation method is used to make reduced-order model of the governing differential equation. A regime plot is presented in the current work for determining the number of modes required in reduced-order model to obtain completely converged pull-in voltage for micro/nano beams. A closed-form relation is developed based on the relationship obtained from curve fitting of pull-in instability plots and subsequent non-linear regression for the proposed relation. The output of regression analysis provides Chi-square (χ 2) tolerance value equals to 1 × 10-9, adjusted R-square value equals to 0.999 29 and P-value equals to zero, these statistical parameters indicate the convergence of non-linear fit, accuracy of fitted data and significance of the proposed model respectively. The closed-form equation is validated using available data of experimental and numerical results. The relative maximum error of 4.08% in comparison to several available experimental and numerical data proves the reliability of the proposed closed-form equation.
Interpreting experimental data on egg production--applications of dynamic differential equations.
France, J; Lopez, S; Kebreab, E; Dijkstra, J
2013-09-01
This contribution focuses on applying mathematical models based on systems of ordinary first-order differential equations to synthesize and interpret data from egg production experiments. Models based on linear systems of differential equations are contrasted with those based on nonlinear systems. Regression equations arising from analytical solutions to linear compartmental schemes are considered as candidate functions for describing egg production curves, together with aspects of parameter estimation. Extant candidate functions are reviewed, a role for growth functions such as the Gompertz equation suggested, and a function based on a simple new model outlined. Structurally, the new model comprises a single pool with an inflow and an outflow. Compartmental simulation models based on nonlinear systems of differential equations, and thus requiring numerical solution, are next discussed, and aspects of parameter estimation considered. This type of model is illustrated in relation to development and evaluation of a dynamic model of calcium and phosphorus flows in layers. The model consists of 8 state variables representing calcium and phosphorus pools in the crop, stomachs, plasma, and bone. The flow equations are described by Michaelis-Menten or mass action forms. Experiments that measure Ca and P uptake in layers fed different calcium concentrations during shell-forming days are used to evaluate the model. In addition to providing a useful management tool, such a simulation model also provides a means to evaluate feeding strategies aimed at reducing excretion of potential pollutants in poultry manure to the environment.
Normal Theory Two-Stage ML Estimator When Data Are Missing at the Item Level
ERIC Educational Resources Information Center
Savalei, Victoria; Rhemtulla, Mijke
2017-01-01
In many modeling contexts, the variables in the model are linear composites of the raw items measured for each participant; for instance, regression and path analysis models rely on scale scores, and structural equation models often use parcels as indicators of latent constructs. Currently, no analytic estimation method exists to appropriately…
A mass transfer model of ethanol emission from thin layers of corn silage
USDA-ARS?s Scientific Manuscript database
A mass transfer model of ethanol emission from thin layers of corn silage was developed and validated. The model was developed based on data from wind tunnel experiments conducted at different temperatures and air velocities. Multiple regression analysis was used to derive an equation that related t...
Modeling critical habitat for Flammulated Owls (Otus flammeolus)
David A. Christie; Astrid M. van Woudenberg
1997-01-01
Multiple logistic regression analysis was used to produce a prediction model for Flammulated Owl (Otus flammeolus) breeding habitat within the Kamloops Forest Region in south-central British Columbia. Using the model equation, a pilot habitat prediction map was created within a Geographic Information System (GIS) environment that had a 75.7 percent...
Perry, Charles A.; Wolock, David M.; Artman, Joshua C.
2004-01-01
Streamflow statistics of flow duration and peak-discharge frequency were estimated for 4,771 individual locations on streams listed on the 1999 Kansas Surface Water Register. These statistics included the flow-duration values of 90, 75, 50, 25, and 10 percent, as well as the mean flow value. Peak-discharge frequency values were estimated for the 2-, 5-, 10-, 25-, 50-, and 100-year floods. Least-squares multiple regression techniques were used, along with Tobit analyses, to develop equations for estimating flow-duration values of 90, 75, 50, 25, and 10 percent and the mean flow for uncontrolled flow stream locations. The contributing-drainage areas of 149 U.S. Geological Survey streamflow-gaging stations in Kansas and parts of surrounding States that had flow uncontrolled by Federal reservoirs and used in the regression analyses ranged from 2.06 to 12,004 square miles. Logarithmic transformations of climatic and basin data were performed to yield the best linear relation for developing equations to compute flow durations and mean flow. In the regression analyses, the significant climatic and basin characteristics, in order of importance, were contributing-drainage area, mean annual precipitation, mean basin permeability, and mean basin slope. The analyses yielded a model standard error of prediction range of 0.43 logarithmic units for the 90-percent duration analysis to 0.15 logarithmic units for the 10-percent duration analysis. The model standard error of prediction was 0.14 logarithmic units for the mean flow. Regression equations used to estimate peak-discharge frequency values were obtained from a previous report, and estimates for the 2-, 5-, 10-, 25-, 50-, and 100-year floods were determined for this report. The regression equations and an interpolation procedure were used to compute flow durations, mean flow, and estimates of peak-discharge frequency for locations along uncontrolled flow streams on the 1999 Kansas Surface Water Register. Flow durations, mean flow, and peak-discharge frequency values determined at available gaging stations were used to interpolate the regression-estimated flows for the stream locations where available. Streamflow statistics for locations that had uncontrolled flow were interpolated using data from gaging stations weighted according to the drainage area and the bias between the regression-estimated and gaged flow information. On controlled reaches of Kansas streams, the streamflow statistics were interpolated between gaging stations using only gaged data weighted by drainage area.
Wiley, Jeffrey B.; Atkins, John T.; Newell, Dawn A.
2002-01-01
Multiple and simple least-squares regression models for the log10-transformed 1.5- and 2-year recurrence intervals of peak discharges with independent variables describing the basin characteristics (log10-transformed and untransformed) for 236 streamflow-gaging stations were evaluated, and the regression residuals were plotted as areal distributions that defined three regions in West Virginia designated as East, North, and South. Regional equations for the 1.1-, 1.2-, 1.3-, 1.4-, 1.5-, 1.6-, 1.7-, 1.8-, 1.9-, 2.0-, 2.5-, and 3-year recurrence intervals of peak discharges were determined by generalized least-squares regression. Log10-transformed drainage area was the most significant independent variable for all regions. Equations developed in this study are applicable only to rural, unregulated streams within the boundaries of West Virginia. The accuracies of estimating equations are quantified by measuring the average prediction error (from 27.4 to 52.4 percent) and equivalent years of record (from 1.1 to 3.4 years).
1990-09-01
without the help from the DSXR staff. William Lyons, Charles Ramsey , and Martin Meeks went above and beyond to help complete this research. Special...develop a valid forecasting model that is significantly more accurate than the one presently used by DSXR and suggested the development and testing of a...method, Strom tested DSXR’s iterative linear regression forecasting technique by examining P1 in the simple regression equation to determine whether
NASA Astrophysics Data System (ADS)
Lu, Lin; Chang, Yunlong; Li, Yingmin; He, Youyou
2013-05-01
A transverse magnetic field was introduced to the arc plasma in the process of welding stainless steel tubes by high-speed Tungsten Inert Gas Arc Welding (TIG for short) without filler wire. The influence of external magnetic field on welding quality was investigated. 9 sets of parameters were designed by the means of orthogonal experiment. The welding joint tensile strength and form factor of weld were regarded as the main standards of welding quality. A binary quadratic nonlinear regression equation was established with the conditions of magnetic induction and flow rate of Ar gas. The residual standard deviation was calculated to adjust the accuracy of regression model. The results showed that, the regression model was correct and effective in calculating the tensile strength and aspect ratio of weld. Two 3D regression models were designed respectively, and then the impact law of magnetic induction on welding quality was researched.
Nonlinear GARCH model and 1 / f noise
NASA Astrophysics Data System (ADS)
Kononovicius, A.; Ruseckas, J.
2015-06-01
Auto-regressive conditionally heteroskedastic (ARCH) family models are still used, by practitioners in business and economic policy making, as a conditional volatility forecasting models. Furthermore ARCH models still are attracting an interest of the researchers. In this contribution we consider the well known GARCH(1,1) process and its nonlinear modifications, reminiscent of NGARCH model. We investigate the possibility to reproduce power law statistics, probability density function and power spectral density, using ARCH family models. For this purpose we derive stochastic differential equations from the GARCH processes in consideration. We find the obtained equations to be similar to a general class of stochastic differential equations known to reproduce power law statistics. We show that linear GARCH(1,1) process has power law distribution, but its power spectral density is Brownian noise-like. However, the nonlinear modifications exhibit both power law distribution and power spectral density of the 1 /fβ form, including 1 / f noise.
Aspects of porosity prediction using multivariate linear regression
DOE Office of Scientific and Technical Information (OSTI.GOV)
Byrnes, A.P.; Wilson, M.D.
1991-03-01
Highly accurate multiple linear regression models have been developed for sandstones of diverse compositions. Porosity reduction or enhancement processes are controlled by the fundamental variables, Pressure (P), Temperature (T), Time (t), and Composition (X), where composition includes mineralogy, size, sorting, fluid composition, etc. The multiple linear regression equation, of which all linear porosity prediction models are subsets, takes the generalized form: Porosity = C{sub 0} + C{sub 1}(P) + C{sub 2}(T) + C{sub 3}(X) + C{sub 4}(t) + C{sub 5}(PT) + C{sub 6}(PX) + C{sub 7}(Pt) + C{sub 8}(TX) + C{sub 9}(Tt) + C{sub 10}(Xt) + C{sub 11}(PTX) + C{submore » 12}(PXt) + C{sub 13}(PTt) + C{sub 14}(TXt) + C{sub 15}(PTXt). The first four primary variables are often interactive, thus requiring terms involving two or more primary variables (the form shown implies interaction and not necessarily multiplication). The final terms used may also involve simple mathematic transforms such as log X, e{sup T}, X{sup 2}, or more complex transformations such as the Time-Temperature Index (TTI). The X term in the equation above represents a suite of compositional variable and, therefore, a fully expanded equation may include a series of terms incorporating these variables. Numerous published bivariate porosity prediction models involving P (or depth) or Tt (TTI) are effective to a degree, largely because of the high degree of colinearity between p and TTI. However, all such bivariate models ignore the unique contributions of P and Tt, as well as various X terms. These simpler models become poor predictors in regions where colinear relations change, were important variables have been ignored, or where the database does not include a sufficient range or weight distribution for the critical variables.« less
Ham, Joo-ho; Park, Hun-Young; Kim, Youn-ho; Bae, Sang-kon; Ko, Byung-hoon
2017-01-01
[Purpose] The purpose of this study was to develop a regression model to estimate the heart rate at the lactate threshold (HRLT) and the heart rate at the ventilatory threshold (HRVT) using the heart rate threshold (HRT), and to test the validity of the regression model. [Methods] We performed a graded exercise test with a treadmill in 220 normal individuals (men: 112, women: 108) aged 20–59 years. HRT, HRLT, and HRVT were measured in all subjects. A regression model was developed to estimate HRLT and HRVT using HRT with 70% of the data (men: 79, women: 76) through randomization (7:3), with the Bernoulli trial. The validity of the regression model developed with the remaining 30% of the data (men: 33, women: 32) was also examined. [Results] Based on the regression coefficient, we found that the independent variable HRT was a significant variable in all regression models. The adjusted R2 of the developed regression models averaged about 70%, and the standard error of estimation of the validity test results was 11 bpm, which is similar to that of the developed model. [Conclusion] These results suggest that HRT is a useful parameter for predicting HRLT and HRVT. PMID:29036765
Ham, Joo-Ho; Park, Hun-Young; Kim, Youn-Ho; Bae, Sang-Kon; Ko, Byung-Hoon; Nam, Sang-Seok
2017-09-30
The purpose of this study was to develop a regression model to estimate the heart rate at the lactate threshold (HRLT) and the heart rate at the ventilatory threshold (HRVT) using the heart rate threshold (HRT), and to test the validity of the regression model. We performed a graded exercise test with a treadmill in 220 normal individuals (men: 112, women: 108) aged 20-59 years. HRT, HRLT, and HRVT were measured in all subjects. A regression model was developed to estimate HRLT and HRVT using HRT with 70% of the data (men: 79, women: 76) through randomization (7:3), with the Bernoulli trial. The validity of the regression model developed with the remaining 30% of the data (men: 33, women: 32) was also examined. Based on the regression coefficient, we found that the independent variable HRT was a significant variable in all regression models. The adjusted R2 of the developed regression models averaged about 70%, and the standard error of estimation of the validity test results was 11 bpm, which is similar to that of the developed model. These results suggest that HRT is a useful parameter for predicting HRLT and HRVT. ©2017 The Korean Society for Exercise Nutrition
Williams-Sether, Tara; Gross, Tara A.
2016-02-09
Seasonal mean daily flow data from 119 U.S. Geological Survey streamflow-gaging stations in North Dakota; the surrounding states of Montana, Minnesota, and South Dakota; and the Canadian provinces of Manitoba and Saskatchewan with 10 or more years of unregulated flow record were used to develop regression equations for flow duration, n-day high flow and n-day low flow using ordinary least-squares and Tobit regression techniques. Regression equations were developed for seasonal flow durations at the 10th, 25th, 50th, 75th, and 90th percent exceedances; the 1-, 7-, and 30-day seasonal mean high flows for the 10-, 25-, and 50-year recurrence intervals; and the 1-, 7-, and 30-day seasonal mean low flows for the 2-, 5-, and 10-year recurrence intervals. Basin and climatic characteristics determined to be significant explanatory variables in one or more regression equations included drainage area, percentage of basin drainage area that drains to isolated lakes and ponds, ruggedness number, stream length, basin compactness ratio, minimum basin elevation, precipitation, slope ratio, stream slope, and soil permeability. The adjusted coefficient of determination for the n-day high-flow regression equations ranged from 55.87 to 94.53 percent. The Chi2 values for the duration regression equations ranged from 13.49 to 117.94, whereas the Chi2 values for the n-day low-flow regression equations ranged from 4.20 to 49.68.
NASA Astrophysics Data System (ADS)
Kiss, I.; Cioată, V. G.; Ratiu, S. A.; Rackov, M.; Penčić, M.
2018-01-01
Multivariate research is important in areas of cast-iron brake shoes manufacturing, because many variables interact with each other simultaneously. This article focuses on expressing the multiple linear regression model related to the hardness assurance by the chemical composition of the phosphorous cast irons destined to the brake shoes, having in view that the regression coefficients will illustrate the unrelated contributions of each independent variable towards predicting the dependent variable. In order to settle the multiple correlations between the hardness of the cast-iron brake shoes, and their chemical compositions several regression equations has been proposed. Is searched a mathematical solution which can determine the optimum chemical composition for the hardness desirable values. Starting from the above-mentioned affirmations two new statistical experiments are effectuated related to the values of Phosphorus [P], Manganese [Mn] and Silicon [Si]. Therefore, the regression equations, which describe the mathematical dependency between the above-mentioned elements and the hardness, are determined. As result, several correlation charts will be revealed.
ERIC Educational Resources Information Center
Molenaar, Ivo W.
The technical problems involved in obtaining Bayesian model estimates for the regression parameters in m similar groups are studied. The available computer programs, BPREP (BASIC), and BAYREG, both written in FORTRAN, require an amount of computer processing that does not encourage regular use. These programs are analyzed so that the performance…
An Accurate VO[subscript 2]max Nonexercise Regression Model for 18-65-Year-Old Adults
ERIC Educational Resources Information Center
Bradshaw, Danielle I.; George, James D.; Hyde, Annette; LaMonte, Michael J.; Vehrs, Pat R.; Hager, Ronald L.; Yanowitz, Frank G.
2005-01-01
The purpose of this study was to develop a regression equation to predict maximal oxygen uptake (VO[subscript 2]max) based on nonexercise (N-EX) data. All participants (N = 100), ages 18-65 years, successfully completed a maximal graded exercise test (GXT) to assess VO[subscript 2]max (M = 39.96 mL[middle dot]kg[superscript -1][middle…
A New Global Regression Analysis Method for the Prediction of Wind Tunnel Model Weight Corrections
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred; Bridge, Thomas M.; Amaya, Max A.
2014-01-01
A new global regression analysis method is discussed that predicts wind tunnel model weight corrections for strain-gage balance loads during a wind tunnel test. The method determines corrections by combining "wind-on" model attitude measurements with least squares estimates of the model weight and center of gravity coordinates that are obtained from "wind-off" data points. The method treats the least squares fit of the model weight separate from the fit of the center of gravity coordinates. Therefore, it performs two fits of "wind- off" data points and uses the least squares estimator of the model weight as an input for the fit of the center of gravity coordinates. Explicit equations for the least squares estimators of the weight and center of gravity coordinates are derived that simplify the implementation of the method in the data system software of a wind tunnel. In addition, recommendations for sets of "wind-off" data points are made that take typical model support system constraints into account. Explicit equations of the confidence intervals on the model weight and center of gravity coordinates and two different error analyses of the model weight prediction are also discussed in the appendices of the paper.
A quantitative model for designing keyboard layout.
Shieh, K K; Lin, C C
1999-02-01
This study analyzed the quantitative relationship between keytapping times and ergonomic principles in typewriting skills. Keytapping times and key-operating characteristics of a female subject typing on the Qwerty and Dvorak keyboards for six weeks each were collected and analyzed. The results showed that characteristics of the typed material and the movements of hands and fingers were significantly related to keytapping times. The most significant factors affecting keytapping times were association frequency between letters, consecutive use of the same hand or finger, and the finger used. A regression equation for relating keytapping times to ergonomic principles was fitted to the data. Finally, a protocol for design of computerized keyboard layout based on the regression equation was proposed.
Acidity in DMSO from the embedded cluster integral equation quantum solvation model.
Heil, Jochen; Tomazic, Daniel; Egbers, Simon; Kast, Stefan M
2014-04-01
The embedded cluster reference interaction site model (EC-RISM) is applied to the prediction of acidity constants of organic molecules in dimethyl sulfoxide (DMSO) solution. EC-RISM is based on a self-consistent treatment of the solute's electronic structure and the solvent's structure by coupling quantum-chemical calculations with three-dimensional (3D) RISM integral equation theory. We compare available DMSO force fields with reference calculations obtained using the polarizable continuum model (PCM). The results are evaluated statistically using two different approaches to eliminating the proton contribution: a linear regression model and an analysis of pK(a) shifts for compound pairs. Suitable levels of theory for the integral equation methodology are benchmarked. The results are further analyzed and illustrated by visualizing solvent site distribution functions and comparing them with an aqueous environment.
Equilibrium, kinetics and process design of acid yellow 132 adsorption onto red pine sawdust.
Can, Mustafa
2015-01-01
Linear and non-linear regression procedures have been applied to the Langmuir, Freundlich, Tempkin, Dubinin-Radushkevich, and Redlich-Peterson isotherms for adsorption of acid yellow 132 (AY132) dye onto red pine (Pinus resinosa) sawdust. The effects of parameters such as particle size, stirring rate, contact time, dye concentration, adsorption dose, pH, and temperature were investigated, and interaction was characterized by Fourier transform infrared spectroscopy and field emission scanning electron microscope. The non-linear method of the Langmuir isotherm equation was found to be the best fitting model to the equilibrium data. The maximum monolayer adsorption capacity was found as 79.5 mg/g. The calculated thermodynamic results suggested that AY132 adsorption onto red pine sawdust was an exothermic, physisorption, and spontaneous process. Kinetics was analyzed by four different kinetic equations using non-linear regression analysis. The pseudo-second-order equation provides the best fit with experimental data.
ERT to aid in WSN based early warning system for landslides
NASA Astrophysics Data System (ADS)
T, H.
2017-12-01
Amrita University's landslide monitoring and early warning system using Wireless Sensor Networks (WSN) consists of heterogeneous sensors like rain gauge, moisture sensor, piezometer, geophone, inclinometer, tilt meter etc. The information from the sensors are accurate and limited to that point. In order to monitor a large area, ERT can be used in conjunction with WSN technology. To accomplish the feasibility of ERT in landslide early warning along with WSN technology, we have conducted experiments in Amrita's landslide laboratory setup. The experiment was aimed to simulate landslide, and monitor the changes happening in the soil using moisture sensor and ERT. Simulating moisture values from resistivity measurements to a greater accuracy can help in landslide monitoring for large areas. For accomplishing the same we have adapted two mathematical approaches, 1) Regression analysis between resistivity measurements and actual moisture values from moisture sensor, and 2) Using Waxman Smith model to simulate moisture values from resistivity measurements. The simulated moisture values from Waxman Smith model is compared with the actual moisture values and the Mean Square Error (MSE) is found to be 46.33. Regression curve is drawn for the resistivity vs simulated moisture values from Waxman model, and it is compared with the regression curve of actual model, which is shown in figure-1. From figure-1, it is clear that there the regression curve from actual moisture values and the regression curve from simulated moisture values, follow the similar pattern and there is a small difference between them. Moisture values can be simulated to a greater accuracy using actual regression equation, but the limitation is that, regression curves will differ for different sites and different soils. Regression equation from actual moisture values can be used, if we have conducted experiment in the laboratory for a particular soil sample, otherwise with the knowledge of soil properties, Waxman model can be used to simulate moisture values. The promising results assure that, ERT measurements when used in conjunction with WSN technique, vital paramters triggering landslides like moisture can be simulated for a large area, which will help in providing early warning for large areas.
Kim-Godwin, Yeoun Soo; Maume, Michael O; Fox, Jane A
2014-12-01
The purpose of the study is to identify the predictors of depression and intimate partner violence (IPV) among Latinos in rural Southeastern North Carolina. A sample of 291 migrant and seasonal farmworkers was interviewed to complete the demographic questionnaire, HITS (intimate violence tendency), Migrant Farmworker Stress Inventory, Center for Epidemiologic Studies Depression Scale (depression), and CAGE/4M (alcohol abuse). OLS regression and structural equation modeling were used to test the hypothesized relations between predictors of IPV and depression. The findings indicated that respondents reporting higher levels of stress also reported higher levels of IPV and depression. The goodness-of-fit statistics for the overall model again indicated a moderate fit of the model to the data (χ2 = 5,612, p < .001; root mean square error for approximation = 0.09; adjusted goodness-of-fit index = 0.44; comparative fit index = 0.52). Although the findings were not robust to estimation in the structural equation models, the OLS regression models indicated direct associations between IPV and depression.
The validation of a human force model to predict dynamic forces resulting from multi-joint motions
NASA Technical Reports Server (NTRS)
Pandya, Abhilash K.; Maida, James C.; Aldridge, Ann M.; Hasson, Scott M.; Woolford, Barbara J.
1992-01-01
The development and validation is examined of a dynamic strength model for humans. This model is based on empirical data. The shoulder, elbow, and wrist joints were characterized in terms of maximum isolated torque, or position and velocity, in all rotational planes. This data was reduced by a least squares regression technique into a table of single variable second degree polynomial equations determining torque as a function of position and velocity. The isolated joint torque equations were then used to compute forces resulting from a composite motion, in this case, a ratchet wrench push and pull operation. A comparison of the predicted results of the model with the actual measured values for the composite motion indicates that forces derived from a composite motion of joints (ratcheting) can be predicted from isolated joint measures. Calculated T values comparing model versus measured values for 14 subjects were well within the statistically acceptable limits and regression analysis revealed coefficient of variation between actual and measured to be within 0.72 and 0.80.
Wang, Shan-Shan; Hong, Wen-Jing; Zhang, Yu-Qi; Chen, Shu-Bao; Huang, Guo-Ying; Zhang, Hong-Yan; Chen, Li-Jun; Wu, Lan-Ping; Shen, Rong; Liu, Yi-Qing; Zhu, Jun-Xue
2018-06-01
Clinical decision making in children with heart disease relies on detailed measurements of cardiac structures using two-dimensional and M-mode echocardiography. However, no echocardiographic reference values are available for the Chinese children. We aimed to establish z-score regression equations for left heart structures in a population-based cohort of healthy Chinese Han children. Echocardiography was performed in 545 children with a normal heart. The dimensions of the aortic valve annulus (AVA), aortic sinuses of Valsalva (ASV), sinotubular junction (STJ), ascending aorta (AAO), left atrium (LA), mitral valve annulus (MVA), interventricular septal end-diastolic thickness (IVSd), interventricular septal end-systolic thickness (IVSs), left ventricular end-diastolic diameter (LVIDd), left ventricular end-systolic diameter (LVIDs), left ventricular posterior wall end-diastolic thickness (LVPWd), left ventricular posterior wall end-systolic thickness (LVPWs) were measured. Regression analyses were conducted to relate the measurements of left heart structures to body surface area (BSA). Left ventricular ejection fraction (LVEF) and left ventricular fractional shortening (LVFS) were calculated. Several models were used, and the adjusted R2 values were compared for each model. AVA, ASV, STJ, AAO, LA, MVA, IVSd, IVSs, LVIDd, LVIDs, LVPWd, and LVPWs had a cubic relationship with BSA. LVEF and LVFS fell within a narrow range. Our results provide reference values for z scores and regression equations for left heart structures in Han Chinese children. These data may help make a quick and accurate judgment of the routine clinical measurement of left heart structures in children with heart disease. © 2018 Wiley Periodicals, Inc.
The study of correlation among different scattering parameters in an aggregate dust model
NASA Astrophysics Data System (ADS)
Mazarbhuiya, A. M.; Das, H. S.
2017-09-01
We study the light scattering properties of aggregate particles in a wide range of complex refractive indices (m = n + i k, where 1.4 ≤ n ≤ 2.0, 0.001 ≤ k ≤1.0) and wavelengths (0.45 ≤ λ≤1.25 μ m) to investigate the correlation among different parameters e.g., the positive polarization maximum (P_{max}), the amplitude of the negative polarization (P_{min}), geometric albedo (A), (n,k) and λ. Numerical computations are performed by the Superposition T-matrix code with Ballistic Cluster-Cluster Aggregate (BCCA) particles of 128 monomers and Ballistic Aggregates (BA) particles of 512 monomers, where monomer's radius of aggregates is considered to be 0.1 μm. At a fixed value of k, P_{max} and n are correlated via a quadratic regression equation and this nature is observed at all wavelengths. Further, P_{max} and k are found to be related via a polynomial regression equation when n is taken to be fixed. The degree of the equation depends on the wavelength, higher the wavelength lower is the degree. We find that A and P_{max} are correlated via a cubic regression at λ= 0.45 μ m whereas this correlation is quadratic at higher wavelengths. We notice that |P_{min}| increases with the decrease of P_{max} and a strong linear correlation between them is observed when n is fixed at some value and k is changed from higher to lower value. Further, at a fix value of k, P_{min} and P_{max} can be fitted well via a quartic regression equation when n is changed from higher to lower value. We also find that P_{max} increases with λ and they are correlated via a quartic regression.
Yang, Ruiqi; Wang, Fei; Zhang, Jialing; Zhu, Chonglei; Fan, Limei
2015-05-19
To establish the reference values of thalamus, caudate nucleus and lenticular nucleus diameters through fetal thalamic transverse section. A total of 265 fetuses at our hospital were randomly selected from November 2012 to August 2014. And the transverse and length diameters of thalamus, caudate nucleus and lenticular nucleus were measured. SPSS 19.0 statistical software was used to calculate the regression curve of fetal diameter changes and gestational weeks of pregnancy. P < 0.05 was considered as having statistical significance. The linear regression equation of fetal thalamic length diameter and gestational week was: Y = 0.051X+0.201, R = 0.876, linear regression equation of thalamic transverse diameter and fetal gestational week was: Y = 0.031X+0.229, R = 0.817, linear regression equation of fetal head of caudate nucleus length diameter and gestational age was: Y = 0.033X+0.101, R = 0.722, linear regression equation of fetal head of caudate nucleus transverse diameter and gestational week was: R = 0.025 - 0.046, R = 0.711, linear regression equation of fetal lentiform nucleus length diameter and gestational week was: Y = 0.046+0.229, R = 0.765, linear regression equation of fetal lentiform nucleus diameter and gestational week was: Y = 0.025 - 0.05, R = 0.772. Ultrasonic measurement of diameter of fetal thalamus caudate nucleus, and lenticular nucleus through thalamic transverse section is simple and convenient. And measurements increase with fetal gestational weeks and there is linear regression relationship between them.
Flood quantile estimation at ungauged sites by Bayesian networks
NASA Astrophysics Data System (ADS)
Mediero, L.; Santillán, D.; Garrote, L.
2012-04-01
Estimating flood quantiles at a site for which no observed measurements are available is essential for water resources planning and management. Ungauged sites have no observations about the magnitude of floods, but some site and basin characteristics are known. The most common technique used is the multiple regression analysis, which relates physical and climatic basin characteristic to flood quantiles. Regression equations are fitted from flood frequency data and basin characteristics at gauged sites. Regression equations are a rigid technique that assumes linear relationships between variables and cannot take the measurement errors into account. In addition, the prediction intervals are estimated in a very simplistic way from the variance of the residuals in the estimated model. Bayesian networks are a probabilistic computational structure taken from the field of Artificial Intelligence, which have been widely and successfully applied to many scientific fields like medicine and informatics, but application to the field of hydrology is recent. Bayesian networks infer the joint probability distribution of several related variables from observations through nodes, which represent random variables, and links, which represent causal dependencies between them. A Bayesian network is more flexible than regression equations, as they capture non-linear relationships between variables. In addition, the probabilistic nature of Bayesian networks allows taking the different sources of estimation uncertainty into account, as they give a probability distribution as result. A homogeneous region in the Tagus Basin was selected as case study. A regression equation was fitted taking the basin area, the annual maximum 24-hour rainfall for a given recurrence interval and the mean height as explanatory variables. Flood quantiles at ungauged sites were estimated by Bayesian networks. Bayesian networks need to be learnt from a huge enough data set. As observational data are reduced, a stochastic generator of synthetic data was developed. Synthetic basin characteristics were randomised, keeping the statistical properties of observed physical and climatic variables in the homogeneous region. The synthetic flood quantiles were stochastically generated taking the regression equation as basis. The learnt Bayesian network was validated by the reliability diagram, the Brier Score and the ROC diagram, which are common measures used in the validation of probabilistic forecasts. Summarising, the flood quantile estimations through Bayesian networks supply information about the prediction uncertainty as a probability distribution function of discharges is given as result. Therefore, the Bayesian network model has application as a decision support for water resources and planning management.
Tong, Xuming; Chen, Jinghang; Miao, Hongyu; Li, Tingting; Zhang, Le
2015-01-01
Agent-based models (ABM) and differential equations (DE) are two commonly used methods for immune system simulation. However, it is difficult for ABM to estimate key parameters of the model by incorporating experimental data, whereas the differential equation model is incapable of describing the complicated immune system in detail. To overcome these problems, we developed an integrated ABM regression model (IABMR). It can combine the advantages of ABM and DE by employing ABM to mimic the multi-scale immune system with various phenotypes and types of cells as well as using the input and output of ABM to build up the Loess regression for key parameter estimation. Next, we employed the greedy algorithm to estimate the key parameters of the ABM with respect to the same experimental data set and used ABM to describe a 3D immune system similar to previous studies that employed the DE model. These results indicate that IABMR not only has the potential to simulate the immune system at various scales, phenotypes and cell types, but can also accurately infer the key parameters like DE model. Therefore, this study innovatively developed a complex system development mechanism that could simulate the complicated immune system in detail like ABM and validate the reliability and efficiency of model like DE by fitting the experimental data. PMID:26535589
[Comparison of three stand-level biomass estimation methods].
Dong, Li Hu; Li, Feng Ri
2016-12-01
At present, the forest biomass methods of regional scale attract most of attention of the researchers, and developing the stand-level biomass model is popular. Based on the forestry inventory data of larch plantation (Larix olgensis) in Jilin Province, we used non-linear seemly unrelated regression (NSUR) to estimate the parameters in two additive system of stand-level biomass equations, i.e., stand-level biomass equations including the stand variables and stand biomass equations including the biomass expansion factor (i.e., Model system 1 and Model system 2), listed the constant biomass expansion factor for larch plantation and compared the prediction accuracy of three stand-level biomass estimation methods. The results indicated that for two additive system of biomass equations, the adjusted coefficient of determination (R a 2 ) of the total and stem equations was more than 0.95, the root mean squared error (RMSE), the mean prediction error (MPE) and the mean absolute error (MAE) were smaller. The branch and foliage biomass equations were worse than total and stem biomass equations, and the adjusted coefficient of determination (R a 2 ) was less than 0.95. The prediction accuracy of a constant biomass expansion factor was relatively lower than the prediction accuracy of Model system 1 and Model system 2. Overall, although stand-level biomass equation including the biomass expansion factor belonged to the volume-derived biomass estimation method, and was different from the stand biomass equations including stand variables in essence, but the obtained prediction accuracy of the two methods was similar. The constant biomass expansion factor had the lower prediction accuracy, and was inappropriate. In addition, in order to make the model parameter estimation more effective, the established stand-level biomass equations should consider the additivity in a system of all tree component biomass and total biomass equations.
Chai Rui; Li Si-Man; Xu Li-Sheng; Yao Yang; Hao Li-Ling
2017-07-01
This study mainly analyzed the parameters such as ascending branch slope (A_slope), dicrotic notch height (Hn), diastolic area (Ad) and systolic area (As) diastolic blood pressure (DBP), systolic blood pressure (SBP), pulse pressure (PP), subendocardial viability ratio (SEVR), waveform parameter (k), stroke volume (SV), cardiac output (CO) and peripheral resistance (RS) of central pulse wave invasively and non-invasively measured. These parameters extracted from the central pulse wave invasively measured were compared with the parameters measured from the brachial pulse waves by a regression model and a transfer function model. The accuracy of the parameters which were estimated by the regression model and the transfer function model was compared too. Our findings showed that in addition to the k value, the above parameters of the central pulse wave and the brachial pulse wave invasively measured had positive correlation. Both the regression model parameters including A_slope, DBP, SEVR and the transfer function model parameters had good consistency with the parameters invasively measured, and they had the same effect of consistency. The regression equations of the three parameters were expressed by Y'=a+bx. The SBP, PP, SV, CO of central pulse wave could be calculated through the regression model, but their accuracies were worse than that of transfer function model.
Eash, D.A.
1993-01-01
Procedures provided for applying the drainage-basin and channel-geometry regression equations depend on whether the design-flood discharge estimate is for a site on an ungaged stream, an ungaged site on a gaged stream, or a gaged site. When both a drainage-basin and a channel-geometry regression-equation estimate are available for a stream site, a procedure is presented for determining a weighted average of the two flood estimates. The drainage-basin regression equations are applicable to unregulated rural drainage areas less than 1,060 square miles, and the channel-geometry regression equations are applicable to unregulated rural streams in Iowa with stabilized channels.
Ito, Yukiko; Hattori, Reiko; Mase, Hiroki; Watanabe, Masako; Shiotani, Itaru
2008-12-01
Pollen information is indispensable for allergic individuals and clinicians. This study aimed to develop forecasting models for the total annual count of airborne pollen grains based on data monitored over the last 20 years at the Mie Chuo Medical Center, Tsu, Mie, Japan. Airborne pollen grains were collected using a Durham sampler. Total annual pollen count and pollen count from October to December (OD pollen count) of the previous year were transformed to logarithms. Regression analysis of the total pollen count was performed using variables such as the OD pollen count and the maximum temperature for mid-July of the previous year. Time series analysis revealed an alternate rhythm of the series of total pollen count. The alternate rhythm consisted of a cyclic alternation of an "on" year (high pollen count) and an "off" year (low pollen count). This rhythm was used as a dummy variable in regression equations. Of the three models involving the OD pollen count, a multiple regression equation that included the alternate rhythm variable and the interaction of this rhythm with OD pollen count showed a high coefficient of determination (0.844). Of the three models involving the maximum temperature for mid-July, those including the alternate rhythm variable and the interaction of this rhythm with maximum temperature had the highest coefficient of determination (0.925). An alternate pollen dispersal rhythm represented by a dummy variable in the multiple regression analysis plays a key role in improving forecasting models for the total annual sugi pollen count.
Burns, Douglas A.; Smith, Martyn J.; Freehafer, Douglas A.
2015-12-31
The application uses predictions of future annual precipitation from five climate models and two future greenhouse gas emissions scenarios and provides results that are averaged over three future periods—2025 to 2049, 2050 to 2074, and 2075 to 2099. Results are presented in ensemble form as the mean, median, maximum, and minimum values among the five climate models for each greenhouse gas emissions scenario and period. These predictions of future annual precipitation are substituted into either the precipitation variable or a water balance equation for runoff to calculate potential future peak flows. This application is intended to be used only as an exploratory tool because (1) the regression equations on which the application is based have not been adequately tested outside the range of the current climate and (2) forecasting future precipitation with climate models and downscaling these results to a fine spatial resolution have a high degree of uncertainty. This report includes a discussion of the assumptions, uncertainties, and appropriate use of this exploratory application.
Multi-fidelity Gaussian process regression for prediction of random fields
DOE Office of Scientific and Technical Information (OSTI.GOV)
Parussini, L.; Venturi, D., E-mail: venturi@ucsc.edu; Perdikaris, P.
We propose a new multi-fidelity Gaussian process regression (GPR) approach for prediction of random fields based on observations of surrogate models or hierarchies of surrogate models. Our method builds upon recent work on recursive Bayesian techniques, in particular recursive co-kriging, and extends it to vector-valued fields and various types of covariances, including separable and non-separable ones. The framework we propose is general and can be used to perform uncertainty propagation and quantification in model-based simulations, multi-fidelity data fusion, and surrogate-based optimization. We demonstrate the effectiveness of the proposed recursive GPR techniques through various examples. Specifically, we study the stochastic Burgersmore » equation and the stochastic Oberbeck–Boussinesq equations describing natural convection within a square enclosure. In both cases we find that the standard deviation of the Gaussian predictors as well as the absolute errors relative to benchmark stochastic solutions are very small, suggesting that the proposed multi-fidelity GPR approaches can yield highly accurate results.« less
Chen, Ling; Feng, Yanqin; Sun, Jianguo
2017-10-01
This paper discusses regression analysis of clustered failure time data, which occur when the failure times of interest are collected from clusters. In particular, we consider the situation where the correlated failure times of interest may be related to cluster sizes. For inference, we present two estimation procedures, the weighted estimating equation-based method and the within-cluster resampling-based method, when the correlated failure times of interest arise from a class of additive transformation models. The former makes use of the inverse of cluster sizes as weights in the estimating equations, while the latter can be easily implemented by using the existing software packages for right-censored failure time data. An extensive simulation study is conducted and indicates that the proposed approaches work well in both the situations with and without informative cluster size. They are applied to a dental study that motivated this study.
NASA Astrophysics Data System (ADS)
Mansouri, Edris; Feizi, Faranak; Jafari Rad, Alireza; Arian, Mehran
2018-03-01
This paper uses multivariate regression to create a mathematical model for iron skarn exploration in the Sarvian area, central Iran, using multivariate regression for mineral prospectivity mapping (MPM). The main target of this paper is to apply multivariate regression analysis (as an MPM method) to map iron outcrops in the northeastern part of the study area in order to discover new iron deposits in other parts of the study area. Two types of multivariate regression models using two linear equations were employed to discover new mineral deposits. This method is one of the reliable methods for processing satellite images. ASTER satellite images (14 bands) were used as unique independent variables (UIVs), and iron outcrops were mapped as dependent variables for MPM. According to the results of the probability value (p value), coefficient of determination value (R2) and adjusted determination coefficient (Radj2), the second regression model (which consistent of multiple UIVs) fitted better than other models. The accuracy of the model was confirmed by iron outcrops map and geological observation. Based on field observation, iron mineralization occurs at the contact of limestone and intrusive rocks (skarn type).
Guenole, Nigel; Brown, Anna
2014-01-01
We report a Monte Carlo study examining the effects of two strategies for handling measurement non-invariance – modeling and ignoring non-invariant items – on structural regression coefficients between latent variables measured with item response theory models for categorical indicators. These strategies were examined across four levels and three types of non-invariance – non-invariant loadings, non-invariant thresholds, and combined non-invariance on loadings and thresholds – in simple, partial, mediated and moderated regression models where the non-invariant latent variable occupied predictor, mediator, and criterion positions in the structural regression models. When non-invariance is ignored in the latent predictor, the focal group regression parameters are biased in the opposite direction to the difference in loadings and thresholds relative to the referent group (i.e., lower loadings and thresholds for the focal group lead to overestimated regression parameters). With criterion non-invariance, the focal group regression parameters are biased in the same direction as the difference in loadings and thresholds relative to the referent group. While unacceptable levels of parameter bias were confined to the focal group, bias occurred at considerably lower levels of ignored non-invariance than was previously recognized in referent and focal groups. PMID:25278911
Craven, Stephen; Shirsat, Nishikant; Whelan, Jessica; Glennon, Brian
2013-01-01
A Monod kinetic model, logistic equation model, and statistical regression model were developed for a Chinese hamster ovary cell bioprocess operated under three different modes of operation (batch, bolus fed-batch, and continuous fed-batch) and grown on two different bioreactor scales (3 L bench-top and 15 L pilot-scale). The Monod kinetic model was developed for all modes of operation under study and predicted cell density, glucose glutamine, lactate, and ammonia concentrations well for the bioprocess. However, it was computationally demanding due to the large number of parameters necessary to produce a good model fit. The transferability of the Monod kinetic model structure and parameter set across bioreactor scales and modes of operation was investigated and a parameter sensitivity analysis performed. The experimentally determined parameters had the greatest influence on model performance. They changed with scale and mode of operation, but were easily calculated. The remaining parameters, which were fitted using a differential evolutionary algorithm, were not as crucial. Logistic equation and statistical regression models were investigated as alternatives to the Monod kinetic model. They were less computationally intensive to develop due to the absence of a large parameter set. However, modeling of the nutrient and metabolite concentrations proved to be troublesome due to the logistic equation model structure and the inability of both models to incorporate a feed. The complexity, computational load, and effort required for model development has to be balanced with the necessary level of model sophistication when choosing which model type to develop for a particular application. Copyright © 2012 American Institute of Chemical Engineers (AIChE).
Influence of Primary Gage Sensitivities on the Convergence of Balance Load Iterations
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred
2012-01-01
The connection between the convergence of wind tunnel balance load iterations and the existence of the primary gage sensitivities of a balance is discussed. First, basic elements of two load iteration equations that the iterative method uses in combination with results of a calibration data analysis for the prediction of balance loads are reviewed. Then, the connection between the primary gage sensitivities, the load format, the gage output format, and the convergence characteristics of the load iteration equation choices is investigated. A new criterion is also introduced that may be used to objectively determine if the primary gage sensitivity of a balance gage exists. Then, it is shown that both load iteration equations will converge as long as a suitable regression model is used for the analysis of the balance calibration data, the combined influence of non linear terms of the regression model is very small, and the primary gage sensitivities of all balance gages exist. The last requirement is fulfilled, e.g., if force balance calibration data is analyzed in force balance format. Finally, it is demonstrated that only one of the two load iteration equation choices, i.e., the iteration equation used by the primary load iteration method, converges if one or more primary gage sensitivities are missing. This situation may occur, e.g., if force balance calibration data is analyzed in direct read format using the original gage outputs. Data from the calibration of a six component force balance is used to illustrate the connection between the convergence of the load iteration equation choices and the existence of the primary gage sensitivities.
Improved Bond Equations for Fiber-Reinforced Polymer Bars in Concrete.
Pour, Sadaf Moallemi; Alam, M Shahria; Milani, Abbas S
2016-08-30
This paper explores a set of new equations to predict the bond strength between fiber reinforced polymer (FRP) rebar and concrete. The proposed equations are based on a comprehensive statistical analysis and existing experimental results in the literature. Namely, the most effective parameters on bond behavior of FRP concrete were first identified by applying a factorial analysis on a part of the available database. Then the database that contains 250 pullout tests were divided into four groups based on the concrete compressive strength and the rebar surface. Afterward, nonlinear regression analysis was performed for each study group in order to determine the bond equations. The results show that the proposed equations can predict bond strengths more accurately compared to the other previously reported models.
Eash, David A.; Barnes, Kimberlee K.
2017-01-01
A statewide study was conducted to develop regression equations for estimating six selected low-flow frequency statistics and harmonic mean flows for ungaged stream sites in Iowa. The estimation equations developed for the six low-flow frequency statistics include: the annual 1-, 7-, and 30-day mean low flows for a recurrence interval of 10 years, the annual 30-day mean low flow for a recurrence interval of 5 years, and the seasonal (October 1 through December 31) 1- and 7-day mean low flows for a recurrence interval of 10 years. Estimation equations also were developed for the harmonic-mean-flow statistic. Estimates of these seven selected statistics are provided for 208 U.S. Geological Survey continuous-record streamgages using data through September 30, 2006. The study area comprises streamgages located within Iowa and 50 miles beyond the State's borders. Because trend analyses indicated statistically significant positive trends when considering the entire period of record for the majority of the streamgages, the longest, most recent period of record without a significant trend was determined for each streamgage for use in the study. The median number of years of record used to compute each of these seven selected statistics was 35. Geographic information system software was used to measure 54 selected basin characteristics for each streamgage. Following the removal of two streamgages from the initial data set, data collected for 206 streamgages were compiled to investigate three approaches for regionalization of the seven selected statistics. Regionalization, a process using statistical regression analysis, provides a relation for efficiently transferring information from a group of streamgages in a region to ungaged sites in the region. The three regionalization approaches tested included statewide, regional, and region-of-influence regressions. For the regional regression, the study area was divided into three low-flow regions on the basis of hydrologic characteristics, landform regions, and soil regions. A comparison of root mean square errors and average standard errors of prediction for the statewide, regional, and region-of-influence regressions determined that the regional regression provided the best estimates of the seven selected statistics at ungaged sites in Iowa. Because a significant number of streams in Iowa reach zero flow as their minimum flow during low-flow years, four different types of regression analyses were used: left-censored, logistic, generalized-least-squares, and weighted-least-squares regression. A total of 192 streamgages were included in the development of 27 regression equations for the three low-flow regions. For the northeast and northwest regions, a censoring threshold was used to develop 12 left-censored regression equations to estimate the 6 low-flow frequency statistics for each region. For the southern region a total of 12 regression equations were developed; 6 logistic regression equations were developed to estimate the probability of zero flow for the 6 low-flow frequency statistics and 6 generalized least-squares regression equations were developed to estimate the 6 low-flow frequency statistics, if nonzero flow is estimated first by use of the logistic equations. A weighted-least-squares regression equation was developed for each region to estimate the harmonic-mean-flow statistic. Average standard errors of estimate for the left-censored equations for the northeast region range from 64.7 to 88.1 percent and for the northwest region range from 85.8 to 111.8 percent. Misclassification percentages for the logistic equations for the southern region range from 5.6 to 14.0 percent. Average standard errors of prediction for generalized least-squares equations for the southern region range from 71.7 to 98.9 percent and pseudo coefficients of determination for the generalized-least-squares equations range from 87.7 to 91.8 percent. Average standard errors of prediction for weighted-least-squares equations developed for estimating the harmonic-mean-flow statistic for each of the three regions range from 66.4 to 80.4 percent. The regression equations are applicable only to stream sites in Iowa with low flows not significantly affected by regulation, diversion, or urbanization and with basin characteristics within the range of those used to develop the equations. If the equations are used at ungaged sites on regulated streams, or on streams affected by water-supply and agricultural withdrawals, then the estimates will need to be adjusted by the amount of regulation or withdrawal to estimate the actual flow conditions if that is of interest. Caution is advised when applying the equations for basins with characteristics near the applicable limits of the equations and for basins located in karst topography. A test of two drainage-area ratio methods using 31 pairs of streamgages, for the annual 7-day mean low-flow statistic for a recurrence interval of 10 years, indicates a weighted drainage-area ratio method provides better estimates than regional regression equations for an ungaged site on a gaged stream in Iowa when the drainage-area ratio is between 0.5 and 1.4. These regression equations will be implemented within the U.S. Geological Survey StreamStats web-based geographic-information-system tool. StreamStats allows users to click on any ungaged site on a river and compute estimates of the seven selected statistics; in addition, 90-percent prediction intervals and the measured basin characteristics for the ungaged sites also are provided. StreamStats also allows users to click on any streamgage in Iowa and estimates computed for these seven selected statistics are provided for the streamgage.
The use of generalized estimating equations in the analysis of motor vehicle crash data.
Hutchings, Caroline B; Knight, Stacey; Reading, James C
2003-01-01
The purpose of this study was to determine if it is necessary to use generalized estimating equations (GEEs) in the analysis of seat belt effectiveness in preventing injuries in motor vehicle crashes. The 1992 Utah crash dataset was used, excluding crash participants where seat belt use was not appropriate (n=93,633). The model used in the 1996 Report to Congress [Report to congress on benefits of safety belts and motorcycle helmets, based on data from the Crash Outcome Data Evaluation System (CODES). National Center for Statistics and Analysis, NHTSA, Washington, DC, February 1996] was analyzed for all occupants with logistic regression, one level of nesting (occupants within crashes), and two levels of nesting (occupants within vehicles within crashes) to compare the use of GEEs with logistic regression. When using one level of nesting compared to logistic regression, 13 of 16 variance estimates changed more than 10%, and eight of 16 parameter estimates changed more than 10%. In addition, three of the independent variables changed from significant to insignificant (alpha=0.05). With the use of two levels of nesting, two of 16 variance estimates and three of 16 parameter estimates changed more than 10% from the variance and parameter estimates in one level of nesting. One of the independent variables changed from insignificant to significant (alpha=0.05) in the two levels of nesting model; therefore, only two of the independent variables changed from significant to insignificant when the logistic regression model was compared to the two levels of nesting model. The odds ratio of seat belt effectiveness in preventing injuries was 12% lower when a one-level nested model was used. Based on these results, we stress the need to use a nested model and GEEs when analyzing motor vehicle crash data.
Chronology of DIC technique based on the fundamental mathematical modeling and dehydration impact.
Alias, Norma; Saipol, Hafizah Farhah Saipan; Ghani, Asnida Che Abd
2014-12-01
A chronology of mathematical models for heat and mass transfer equation is proposed for the prediction of moisture and temperature behavior during drying using DIC (Détente Instantanée Contrôlée) or instant controlled pressure drop technique. DIC technique has the potential as most commonly used dehydration method for high impact food value including the nutrition maintenance and the best possible quality for food storage. The model is governed by the regression model, followed by 2D Fick's and Fourier's parabolic equation and 2D elliptic-parabolic equation in a rectangular slice. The models neglect the effect of shrinkage and radiation effects. The simulations of heat and mass transfer equations with parabolic and elliptic-parabolic types through some numerical methods based on finite difference method (FDM) have been illustrated. Intel®Core™2Duo processors with Linux operating system and C programming language have been considered as a computational platform for the simulation. Qualitative and quantitative differences between DIC technique and the conventional drying methods have been shown as a comparative.
2013-01-01
application of the Hammett equation with the constants rph in the chemistry of organophosphorus compounds, Russ. Chem. Rev. 38 (1969) 795–811. [13...of oximes and OP compounds and the ability of oximes to reactivate OP- inhibited AChE. Multiple linear regression equations were analyzed using...phosphonate pairs, 21 oxime/ phosphoramidate pairs and 12 oxime/phosphate pairs. The best linear regression equation resulting from multiple regression anal
A regularization corrected score method for nonlinear regression models with covariate error.
Zucker, David M; Gorfine, Malka; Li, Yi; Tadesse, Mahlet G; Spiegelman, Donna
2013-03-01
Many regression analyses involve explanatory variables that are measured with error, and failing to account for this error is well known to lead to biased point and interval estimates of the regression coefficients. We present here a new general method for adjusting for covariate error. Our method consists of an approximate version of the Stefanski-Nakamura corrected score approach, using the method of regularization to obtain an approximate solution of the relevant integral equation. We develop the theory in the setting of classical likelihood models; this setting covers, for example, linear regression, nonlinear regression, logistic regression, and Poisson regression. The method is extremely general in terms of the types of measurement error models covered, and is a functional method in the sense of not involving assumptions on the distribution of the true covariate. We discuss the theoretical properties of the method and present simulation results in the logistic regression setting (univariate and multivariate). For illustration, we apply the method to data from the Harvard Nurses' Health Study concerning the relationship between physical activity and breast cancer mortality in the period following a diagnosis of breast cancer. Copyright © 2013, The International Biometric Society.
Methods for estimating selected low-flow frequency statistics for unregulated streams in Kentucky
Martin, Gary R.; Arihood, Leslie D.
2010-01-01
This report provides estimates of, and presents methods for estimating, selected low-flow frequency statistics for unregulated streams in Kentucky including the 30-day mean low flows for recurrence intervals of 2 and 5 years (30Q2 and 30Q5) and the 7-day mean low flows for recurrence intervals of 5, 10, and 20 years (7Q2, 7Q10, and 7Q20). Estimates of these statistics are provided for 121 U.S. Geological Survey streamflow-gaging stations with data through the 2006 climate year, which is the 12-month period ending March 31 of each year. Data were screened to identify the periods of homogeneous, unregulated flows for use in the analyses. Logistic-regression equations are presented for estimating the annual probability of the selected low-flow frequency statistics being equal to zero. Weighted-least-squares regression equations were developed for estimating the magnitude of the nonzero 30Q2, 30Q5, 7Q2, 7Q10, and 7Q20 low flows. Three low-flow regions were defined for estimating the 7-day low-flow frequency statistics. The explicit explanatory variables in the regression equations include total drainage area and the mapped streamflow-variability index measured from a revised statewide coverage of this characteristic. The percentage of the station low-flow statistics correctly classified as zero or nonzero by use of the logistic-regression equations ranged from 87.5 to 93.8 percent. The average standard errors of prediction of the weighted-least-squares regression equations ranged from 108 to 226 percent. The 30Q2 regression equations have the smallest standard errors of prediction, and the 7Q20 regression equations have the largest standard errors of prediction. The regression equations are applicable only to stream sites with low flows unaffected by regulation from reservoirs and local diversions of flow and to drainage basins in specified ranges of basin characteristics. Caution is advised when applying the equations for basins with characteristics near the applicable limits and for basins with karst drainage features.
Toledo-Martín, Eva María; Font, Rafael; Obregón-Cano, Sara; De Haro-Bailón, Antonio; Villatoro-Pulido, Myriam; Del Río-Celestino, Mercedes
2017-05-20
The potential of visible-near infrared spectroscopy to predict glucosinolates and total phenolic content in rocket ( Eruca vesicaria ) leaves has been evaluated. Accessions of the E. vesicaria species were scanned by NIRS as ground leaf, and their reference values regressed against different spectral transformations by modified partial least squares (MPLS) regression. The coefficients of determination in the external validation (R²VAL) for the different quality components analyzed in rocket ranged from 0.59 to 0.84, which characterize those equations as having from good to excellent quantitative information. These results show that the total glucosinolates, glucosativin and glucoerucin equations obtained, can be used to identify those samples with low and high contents. The glucoraphanin equation obtained can be used for rough predictions of samples and in case of total phenolic content, the equation showed good correlation. The standard deviation (SD) to standard error of prediction ratio (RPD) and SD to range (RER) were variable for the different quality compounds and showed values that were characteristic of equations suitable for screening purposes or to perform accurate analyses. From the study of the MPLS loadings of the first three terms of the different equations, it can be concluded that some major cell components such as protein and cellulose, highly participated in modelling the equations for glucosinolates.
Fungible weights in logistic regression.
Jones, Jeff A; Waller, Niels G
2016-06-01
In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Effects of temperature on embryonic development of lake herring (Coregonus artedii)
Colby, Peter J.; Brooke, L.T.
1973-01-01
Embryonic development of lake herring (Coregonus artedii) was observed in the laboratory at 13 constant temperatures from 0.0 to 12.1 C and in Pickerel Lake (Washtenaw County, Michigan) at natural temperature regimes. Rate of development during incubation was based on progression of the embryos through 20 identifiable stages. An equation was derived to predict development stage at constant temperatures, on the general assumption that development stage (DS) is a function of time (days, D) and temperature (T). The equation should also be useful in interpreting estimates from future regressions that include other environmental variables that affect egg development. A second regression model, derived primarily for fluctuating temperatures, related development rate for stage j (DRj), expressed as the reciprocal of time, to temperature (x). The generalized equation for a development stage is: DRj = abx cx2 dx3. In general, time required for embryos to reach each stage of development in Pickerel Lake agreed closely with the time predicted from this equation, derived from our laboratory observations. Hatching time was predicted within 1 day in 1969 and within 2 days in 1970. We used the equations derived with the second model to predict the effect of the superimposition of temperature increases of 1 and 2 C on the measured temperatures in Pickerel Lake. Conceivably, hatching dates could be affected sufficiently to jeopardize the first feeding of lake herring through loss of harmony between hatching date and seasonal food availability.
ERIC Educational Resources Information Center
Pek, Jolynn; Chalmers, R. Philip; Kok, Bethany E.; Losardo, Diane
2015-01-01
Structural equation mixture models (SEMMs), when applied as a semiparametric model (SPM), can adequately recover potentially nonlinear latent relationships without their specification. This SPM is useful for exploratory analysis when the form of the latent regression is unknown. The purpose of this article is to help users familiar with structural…
NASA Astrophysics Data System (ADS)
See, J. J.; Jamaian, S. S.; Salleh, R. M.; Nor, M. E.; Aman, F.
2018-04-01
This research aims to estimate the parameters of Monod model of microalgae Botryococcus Braunii sp growth by the Least-Squares method. Monod equation is a non-linear equation which can be transformed into a linear equation form and it is solved by implementing the Least-Squares linear regression method. Meanwhile, Gauss-Newton method is an alternative method to solve the non-linear Least-Squares problem with the aim to obtain the parameters value of Monod model by minimizing the sum of square error ( SSE). As the result, the parameters of the Monod model for microalgae Botryococcus Braunii sp can be estimated by the Least-Squares method. However, the estimated parameters value obtained by the non-linear Least-Squares method are more accurate compared to the linear Least-Squares method since the SSE of the non-linear Least-Squares method is less than the linear Least-Squares method.
Dudley, Robert W.
2015-12-03
The largest average errors of prediction are associated with regression equations for the lowest streamflows derived for months during which the lowest streamflows of the year occur (such as the 5 and 1 monthly percentiles for August and September). The regression equations have been derived on the basis of streamflow and basin characteristics data for unregulated, rural drainage basins without substantial streamflow or drainage modifications (for example, diversions and (or) regulation by dams or reservoirs, tile drainage, irrigation, channelization, and impervious paved surfaces), therefore using the equations for regulated or urbanized basins with substantial streamflow or drainage modifications will yield results of unknown error. Input basin characteristics derived using techniques or datasets other than those documented in this report or using values outside the ranges used to develop these regression equations also will yield results of unknown error.
Major controlling factors and prediction models for arsenic uptake from soil to wheat plants.
Dai, Yunchao; Lv, Jialong; Liu, Ke; Zhao, Xiaoyan; Cao, Yingfei
2016-08-01
The application of current Chinese agriculture soil quality standards fails to evaluate the land utilization functions appropriately due to the diversity of soil properties and plant species. Therefore, the standards should be amended. A greenhouse experiment was conducted to investigate arsenic (As) enrichment in various soils from 18 Chinese provinces in parallel with As transfer to 8 wheat varieties. The goal of the study was to build and calibrate soil-wheat threshold models to forecast the As threshold of wheat soils. In Shaanxi soils, Wanmai and Jimai were the most sensitive and insensitive wheat varieties, respectively; and in Jiangxi soils, Zhengmai and Xumai were the most sensitive and insensitive wheat varieties, respectively. Relationships between soil properties and the bioconcentration factor (BCF) were built based on stepwise multiple linear regressions. Soil pH was the best predictor of BCF, and after normalizing the regression equation (Log BCF=0.2054 pH- 3.2055, R(2)=0.8474, n=14, p<0.001), we obtained a calibrated model. Using the calibrated model, a continuous soil-wheat threshold equation (HC5=10((-0.2054 pH+2.9935))+9.2) was obtained for the species-sensitive distribution curve, which was built on Chinese food safety standards. The threshold equation is a helpful tool that can be applied to estimate As uptake from soil to wheat. Copyright © 2016 Elsevier Inc. All rights reserved.
Testing a Theoretical Model of the Stress Process in Alzheimer's Caregivers with Race as a Moderator
ERIC Educational Resources Information Center
Hilgeman, Michelle M.; Durkin, Daniel W.; Sun, Fei; DeCoster, Jamie; Allen, Rebecca S.; Gallagher-Thompson, Dolores; Burgio, Louis D.
2009-01-01
Purpose: The primary aim of this study was to test the stress process model (SPM; Pearlin, Mullan, Semple, & Skaff, 1990) in a racially diverse sample of Alzheimer's caregivers (CGs) using structural equation modeling (SEM) and regression techniques. A secondary aim was to examine race or ethnicity as a moderator of the relation between latent…
Regression Is a Univariate General Linear Model Subsuming Other Parametric Methods as Special Cases.
ERIC Educational Resources Information Center
Vidal, Sherry
Although the concept of the general linear model (GLM) has existed since the 1960s, other univariate analyses such as the t-test and the analysis of variance models have remained popular. The GLM produces an equation that minimizes the mean differences of independent variables as they are related to a dependent variable. From a computer printout…
Mastin, Mark C.; Konrad, Christopher P.; Veilleux, Andrea G.; Tecca, Alison E.
2016-09-20
An investigation into the magnitude and frequency of floods in Washington State computed the annual exceedance probability (AEP) statistics for 648 U.S. Geological Survey unregulated streamgages in and near the borders of Washington using the recorded annual peak flows through water year 2014. This is an updated report from a previous report published in 1998 that used annual peak flows through the water year 1996. New in this report, a regional skew coefficient was developed for the Pacific Northwest region that includes areas in Oregon, Washington, Idaho and western Montana within the Columbia River drainage basin south of the United States-Canada border, the coastal areas of Oregon and western Washington, and watersheds draining into Puget Sound, Washington. The skew coefficient is an important term in the Log Pearson Type III equation used to define the distribution of the log-transformed annual peaks. The Expected Moments Algorithm was used to fit historical and censored peak-flow data to the log Pearson Type III distribution. A Multiple Grubb-Beck test was employed to censor low outliers of annual peak flows to improve on the frequency distribution. This investigation also includes a section on observed trends in annual peak flows that showed significant trends (p-value < 0.05) in 21 of 83 long-term sites, but with small magnitude Kendall tau values suggesting a limited monotonic trend in the time series of annual peaks. Most of the sites with a significant trend in western Washington were positive and all the sites with significant trends (three sites) in eastern Washington were negative.Multivariate regression analysis with measured basin characteristics and the AEP statistics at long-term, unregulated, and un-urbanized (defined as drainage basins with less than 5 percent impervious land cover for this investigation) streamgages within Washington and some in Idaho and Oregon that are near the Washington border was used to develop equations to estimate AEP statistics at ungaged basins. Washington was divided into four regions to improve the accuracy of the regression equations; a set of equations for eight selected AEPs and for each region were constructed. Selected AEP statistics included the annual peak flows that equaled or exceeded 50, 20, 10, 4, 2, 1, 0.5 and 0.2 percent of the time equivalent to peak flows for peaks with a 2-, 5-, 10-, 25-, 50-, 100-, 200-, and 500-year recurrence intervals, respectively. Annual precipitation and drainage area were the significant basin characteristics in the regression equations for all four regression regions in Washington and forest cover was significant for the two regression regions in eastern Washington. Average standard error of prediction for the regional regression equations ranged from 70.19 to 125.72 percent for Regression Regions 1 and 2 on the eastern side of the Cascade Mountains and from 43.22 to 58.04 percent for Regression Regions 3 and 4 on the western side of the Cascade Mountains. The pseudo coefficient of determination (where a value of 100 signifies a perfect regression model) ranged from 68.39 to 90.68 for Regression Regions 1 and 2, and 92.35 to 95.44 for Regions 3 and 4.The calculated AEP statistics for the streamgages and the regional regression equations are expected to be incorporated into StreamStats after the publication of this report. StreamStats is the interactive Web-based map tool created by the U.S. Geological Survey to allow the user to choose a streamgage and obtain published statistics or choose ungaged locations where the program automatically applies the regional regression equations and computes the estimates of the AEP statistics.
Fossum, Kenneth D.; O'Day, Christie M.; Wilson, Barbara J.; Monical, Jim E.
2001-01-01
Stormwater and streamflow in Maricopa County were monitored to (1) describe the physical, chemical, and toxicity characteristics of stormwater from areas having different land uses, (2) describe the physical, chemical, and toxicity characteristics of streamflow from areas that receive urban stormwater, and (3) estimate constituent loads in stormwater. Urban stormwater and streamflow had similar ranges in most constituent concentrations. The mean concentration of dissolved solids in urban stormwater was lower than in streamflow from the Salt River and Indian Bend Wash. Urban stormwater, however, had a greater chemical oxygen demand and higher concentrations of most nutrients. Mean seasonal loads and mean annual loads of 11 constituents and volumes of runoff were estimated for municipalities in the metropolitan Phoenix area, Arizona, by adjusting regional regression equations of loads. This adjustment procedure uses the original regional regression equation and additional explanatory variables that were not included in the original equation. The adjusted equations had standard errors that ranged from 161 to 196 percent. The large standard errors of the prediction result from the large variability of the constituent concentration data used in the regression analysis. Adjustment procedures produced unsatisfactory results for nine of the regressions?suspended solids, dissolved solids, total phosphorus, dissolved phosphorus, total recoverable cadmium, total recoverable copper, total recoverable lead, total recoverable zinc, and storm runoff. These equations had no consistent direction of bias and no other additional explanatory variables correlated with the observed loads. A stepwise-multiple regression or a three-variable regression (total storm rainfall, drainage area, and impervious area) and local data were used to develop local regression equations for these nine constituents. These equations had standard errors from 15 to 183 percent.
Modeling The Skeleton Weight of an Adult Caucasian Man.
Avtandilashvili, Maia; Tolmachev, Sergei Y
2018-05-17
The reference value for the skeleton weight of an adult male (10.5 kg) recommended by the International Commission on Radiological Protection in Publication 70 is based on weights of dissected skeletons from 44 individuals, including two U.S. Transuranium and Uranium Registries whole-body donors. The International Commission on Radiological Protection analysis of anatomical data from 31 individuals with known values of body height demonstrated significant correlation between skeleton weight and body height. The corresponding regression equation, Wskel (kg) = -10.7 + 0.119 × H (cm), published in International Commission on Radiological Protection Publication 70 is typically used to estimate the skeleton weight from body height. Currently, the U.S. Transuranium and Uranium Registries holds data on individual bone weights from a total of 40 male whole-body donors, which has provided a unique opportunity to update the International Commission on Radiological Protection skeleton weight vs. body height equation. The original International Commission on Radiological Protection Publication 70 and the new U.S. Transuranium and Uranium Registries data were combined in a set of 69 data points representing a group of 33- to 95-y-old individuals with body heights and skeleton weights ranging from 155 to 188 cm and 6.5 to 13.4 kg, respectively. Data were fitted with a linear least-squares regression. A significant correlation between the two parameters was observed (r = 0.28), and an updated skeleton weight vs. body height equation was derived: Wskel (kg) = -6.5 + 0.093 × H (cm). In addition, a correlation of skeleton weight with multiple variables including body height, body weight, and age was evaluated using multiple regression analysis, and a corresponding fit equation was derived: Wskel (kg) = -0.25 + 0.046 × H (cm) + 0.036 × Wbody (kg) - 0.012 × A (y). These equations will be used to estimate skeleton weights and, ultimately, total skeletal actinide activities for biokinetic modeling of U.S. Transuranium and Uranium Registries partial-body donation cases.
[Aboveground biomass of three conifers in Qianyanzhou plantation].
Li, Xuanran; Liu, Qijing; Chen, Yongrui; Hu, Lile; Yang, Fengting
2006-08-01
In this paper, the regressive models of the aboveground biomass of Pinus elliottii, P. massoniana and Cunninghamia lanceolata in Qianyanzhou of subtropical China were established, and the regression analysis on the dry weight of leaf biomass and total biomass against branch diameter (d), branch length (L), d3 and d2L was conducted with linear, power and exponent functions. Power equation with single parameter (d) was proved to be better than the rests for P. massoniana and C. lanceolata, and linear equation with parameter (d3) was better for P. elliottii. The canopy biomass was derived by the regression equations for all branches. These equations were also used to fit the relationships of total tree biomass, branch biomass and foliage biomass with tree diameter at breast height (D), tree height (H), D3 and D2H, respectively. D2H was found to be the best parameter for estimating total biomass. For foliage-and branch biomass, both parameters and equation forms showed some differences among species. Correlations were highly significant (P <0.001) for foliage-, branch-and total biomass, with the highest for total biomass. By these equations, the aboveground biomass and its allocation were estimated, with the aboveground biomass of P. massoniana, P. elliottii, and C. lanceolata forests being 83.6, 72. 1 and 59 t x hm(-2), respectively, and more stem biomass than foliage-and branch biomass. According to the previous studies, the underground biomass of these three forests was estimated to be 10.44, 9.42 and 11.48 t x hm(-2), and the amount of fixed carbon was 47.94, 45.14 and 37.52 t x hm(-2), respectively.
Prediction equation for estimating total daily energy requirements of special operations personnel.
Barringer, N D; Pasiakos, S M; McClung, H L; Crombie, A P; Margolis, L M
2018-01-01
Special Operations Forces (SOF) engage in a variety of military tasks with many producing high energy expenditures, leading to undesired energy deficits and loss of body mass. Therefore, the ability to accurately estimate daily energy requirements would be useful for accurate logistical planning. Generate a predictive equation estimating energy requirements of SOF. Retrospective analysis of data collected from SOF personnel engaged in 12 different SOF training scenarios. Energy expenditure and total body water were determined using the doubly-labeled water technique. Physical activity level was determined as daily energy expenditure divided by resting metabolic rate. Physical activity level was broken into quartiles (0 = mission prep, 1 = common warrior tasks, 2 = battle drills, 3 = specialized intense activity) to generate a physical activity factor (PAF). Regression analysis was used to construct two predictive equations (Model A; body mass and PAF, Model B; fat-free mass and PAF) estimating daily energy expenditures. Average measured energy expenditure during SOF training was 4468 (range: 3700 to 6300) Kcal·d- 1 . Regression analysis revealed that physical activity level ( r = 0.91; P < 0.05) and body mass ( r = 0.28; P < 0.05; Model A), or fat-free mass (FFM; r = 0.32; P < 0.05; Model B) were the factors that most highly predicted energy expenditures. Predictive equations coupling PAF with body mass (Model A) and FFM (Model B), were correlated ( r = 0.74 and r = 0.76, respectively) and did not differ [mean ± SEM: Model A; 4463 ± 65 Kcal·d - 1 , Model B; 4462 ± 61 Kcal·d - 1 ] from DLW measured energy expenditures. By quantifying and grouping SOF training exercises into activity factors, SOF energy requirements can be predicted with reasonable accuracy and these equations used by dietetic/logistical personnel to plan appropriate feeding regimens to meet SOF nutritional requirements across their mission profile.
Choice of mathematical models for technological process of glass rod drawing
NASA Astrophysics Data System (ADS)
Alekseeva, L. B.
2017-10-01
The technological process of drawing glass rods (light guides) is considered. Automated control of the drawing process is reduced to the process of making decisions to ensure a given quality. The drawing process is considered as a control object, including the drawing device (control device) and the optical fiber forming zone (control object). To study the processes occurring in the formation zone, mathematical models are proposed, based on the continuum mechanics basics. To assess the influence of disturbances, a transfer function is obtained from the basis of the wave equation. Obtaining the regression equation also adequately describes the drawing process.
Nohara, Ryuki; Endo, Yui; Murai, Akihiko; Takemura, Hiroshi; Kouchi, Makiko; Tada, Mitsunori
2016-08-01
Individual human models are usually created by direct 3D scanning or deforming a template model according to the measured dimensions. In this paper, we propose a method to estimate all the necessary dimensions (full set) for the human model individualization from a small number of measured dimensions (subset) and human dimension database. For this purpose, we solved multiple regression equation from the dimension database given full set dimensions as the objective variable and subset dimensions as the explanatory variables. Thus, the full set dimensions are obtained by simply multiplying the subset dimensions to the coefficient matrix of the regression equation. We verified the accuracy of our method by imputing hand, foot, and whole body dimensions from their dimension database. The leave-one-out cross validation is employed in this evaluation. The mean absolute errors (MAE) between the measured and the estimated dimensions computed from 4 dimensions (hand length, breadth, middle finger breadth at proximal, and middle finger depth at proximal) in the hand, 2 dimensions (foot length, breadth, and lateral malleolus height) in the foot, and 1 dimension (height) and weight in the whole body are computed. The average MAE of non-measured dimensions were 4.58% in the hand, 4.42% in the foot, and 3.54% in the whole body, while that of measured dimensions were 0.00%.
Determination of suitable drying curve model for bread moisture loss during baking
NASA Astrophysics Data System (ADS)
Soleimani Pour-Damanab, A. R.; Jafary, A.; Rafiee, S.
2013-03-01
This study presents mathematical modelling of bread moisture loss or drying during baking in a conventional bread baking process. In order to estimate and select the appropriate moisture loss curve equation, 11 different models, semi-theoretical and empirical, were applied to the experimental data and compared according to their correlation coefficients, chi-squared test and root mean square error which were predicted by nonlinear regression analysis. Consequently, of all the drying models, a Page model was selected as the best one, according to the correlation coefficients, chi-squared test, and root mean square error values and its simplicity. Mean absolute estimation error of the proposed model by linear regression analysis for natural and forced convection modes was 2.43, 4.74%, respectively.
Tosteson, Tor D.; Morden, Nancy E.; Stukel, Therese A.; O'Malley, A. James
2014-01-01
The estimation of treatment effects is one of the primary goals of statistics in medicine. Estimation based on observational studies is subject to confounding. Statistical methods for controlling bias due to confounding include regression adjustment, propensity scores and inverse probability weighted estimators. These methods require that all confounders are recorded in the data. The method of instrumental variables (IVs) can eliminate bias in observational studies even in the absence of information on confounders. We propose a method for integrating IVs within the framework of Cox's proportional hazards model and demonstrate the conditions under which it recovers the causal effect of treatment. The methodology is based on the approximate orthogonality of an instrument with unobserved confounders among those at risk. We derive an estimator as the solution to an estimating equation that resembles the score equation of the partial likelihood in much the same way as the traditional IV estimator resembles the normal equations. To justify this IV estimator for a Cox model we perform simulations to evaluate its operating characteristics. Finally, we apply the estimator to an observational study of the effect of coronary catheterization on survival. PMID:25506259
MacKenzie, Todd A; Tosteson, Tor D; Morden, Nancy E; Stukel, Therese A; O'Malley, A James
2014-06-01
The estimation of treatment effects is one of the primary goals of statistics in medicine. Estimation based on observational studies is subject to confounding. Statistical methods for controlling bias due to confounding include regression adjustment, propensity scores and inverse probability weighted estimators. These methods require that all confounders are recorded in the data. The method of instrumental variables (IVs) can eliminate bias in observational studies even in the absence of information on confounders. We propose a method for integrating IVs within the framework of Cox's proportional hazards model and demonstrate the conditions under which it recovers the causal effect of treatment. The methodology is based on the approximate orthogonality of an instrument with unobserved confounders among those at risk. We derive an estimator as the solution to an estimating equation that resembles the score equation of the partial likelihood in much the same way as the traditional IV estimator resembles the normal equations. To justify this IV estimator for a Cox model we perform simulations to evaluate its operating characteristics. Finally, we apply the estimator to an observational study of the effect of coronary catheterization on survival.
Winters, Eric R; Petosa, Rick L; Charlton, Thomas E
2003-06-01
To examine whether knowledge of high school students' actions of self-regulation, and perceptions of self-efficacy to overcome exercise barriers, social situation, and outcome expectation will predict non-school related moderate and vigorous physical exercise. High school students enrolled in introductory Physical Education courses completed questionnaires that targeted selected Social Cognitive Theory variables. They also self-reported their typical "leisure-time" exercise participation using a standardized questionnaire. Bivariate correlation statistic and hierarchical regression were conducted on reports of moderate and vigorous exercise frequency. Each predictor variable was significantly associated with measures of moderate and vigorous exercise frequency. All predictor variables were significant in the final regression model used to explain vigorous exercise. After controlling for the effects of gender, the psychosocial variables explained 29% of variance in vigorous exercise frequency. Three of four predictor variables were significant in the final regression equation used to explain moderate exercise. The final regression equation accounted for 11% of variance in moderate exercise frequency. Professionals who attempt to increase the prevalence of physical exercise through educational methods should focus on the psychosocial variables utilized in this study.
Hwang, Bosun; Han, Jonghee; Choi, Jong Min; Park, Kwang Suk
2008-11-01
The purpose of this study was to develop an unobtrusive energy expenditure (EE) measurement system using an infrared (IR) sensor-based activity monitoring system to measure indoor activities and to estimate individual quantitative EE. IR-sensor activation counts were measured with a Bluetooth-based monitoring system and the standard EE was calculated using an established regression equation. Ten male subjects participated in the experiment and three different EE measurement systems (gas analyzer, accelerometer, IR sensor) were used simultaneously in order to determine the regression equation and evaluate the performance. As a standard measurement, oxygen consumption was simultaneously measured by a portable metabolic system (Metamax 3X, Cortex, Germany). A single room experiment was performed to develop a regression model of the standard EE measurement from the proposed IR sensor-based measurement system. In addition, correlation and regression analyses were done to compare the performance of the IR system with that of the Actigraph system. We determined that our proposed IR-based EE measurement system shows a similar correlation to the Actigraph system with the standard measurement system.
Who Will Win?: Predicting the Presidential Election Using Linear Regression
ERIC Educational Resources Information Center
Lamb, John H.
2007-01-01
This article outlines a linear regression activity that engages learners, uses technology, and fosters cooperation. Students generated least-squares linear regression equations using TI-83 Plus[TM] graphing calculators, Microsoft[C] Excel, and paper-and-pencil calculations using derived normal equations to predict the 2004 presidential election.…
Analysis of Sting Balance Calibration Data Using Optimized Regression Models
NASA Technical Reports Server (NTRS)
Ulbrich, N.; Bader, Jon B.
2010-01-01
Calibration data of a wind tunnel sting balance was processed using a candidate math model search algorithm that recommends an optimized regression model for the data analysis. During the calibration the normal force and the moment at the balance moment center were selected as independent calibration variables. The sting balance itself had two moment gages. Therefore, after analyzing the connection between calibration loads and gage outputs, it was decided to choose the difference and the sum of the gage outputs as the two responses that best describe the behavior of the balance. The math model search algorithm was applied to these two responses. An optimized regression model was obtained for each response. Classical strain gage balance load transformations and the equations of the deflection of a cantilever beam under load are used to show that the search algorithm s two optimized regression models are supported by a theoretical analysis of the relationship between the applied calibration loads and the measured gage outputs. The analysis of the sting balance calibration data set is a rare example of a situation when terms of a regression model of a balance can directly be derived from first principles of physics. In addition, it is interesting to note that the search algorithm recommended the correct regression model term combinations using only a set of statistical quality metrics that were applied to the experimental data during the algorithm s term selection process.
A data-driven approach for modeling post-fire debris-flow volumes and their uncertainty
Friedel, Michael J.
2011-01-01
This study demonstrates the novel application of genetic programming to evolve nonlinear post-fire debris-flow volume equations from variables associated with a data-driven conceptual model of the western United States. The search space is constrained using a multi-component objective function that simultaneously minimizes root-mean squared and unit errors for the evolution of fittest equations. An optimization technique is then used to estimate the limits of nonlinear prediction uncertainty associated with the debris-flow equations. In contrast to a published multiple linear regression three-variable equation, linking basin area with slopes greater or equal to 30 percent, burn severity characterized as area burned moderate plus high, and total storm rainfall, the data-driven approach discovers many nonlinear and several dimensionally consistent equations that are unbiased and have less prediction uncertainty. Of the nonlinear equations, the best performance (lowest prediction uncertainty) is achieved when using three variables: average basin slope, total burned area, and total storm rainfall. Further reduction in uncertainty is possible for the nonlinear equations when dimensional consistency is not a priority and by subsequently applying a gradient solver to the fittest solutions. The data-driven modeling approach can be applied to nonlinear multivariate problems in all fields of study.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd
Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variables is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significant test ofmore » the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratio. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles and on the diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also, the interaction between breakfast intake in a week with sleep duration, and the interaction between gender and protein intake.« less
NASA Astrophysics Data System (ADS)
Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd; Baharum, Adam
2015-10-01
Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variables is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significant test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratio. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles and on the diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also, the interaction between breakfast intake in a week with sleep duration, and the interaction between gender and protein intake.
Martin, Gary R.; Fowler, Kathleen K.; Arihood, Leslie D.
2016-09-06
Information on low-flow characteristics of streams is essential for the management of water resources. This report provides equations for estimating the 1-, 7-, and 30-day mean low flows for a recurrence interval of 10 years and the harmonic-mean flow at ungaged, unregulated stream sites in Indiana. These equations were developed using the low-flow statistics and basin characteristics for 108 continuous-record streamgages in Indiana with at least 10 years of daily mean streamflow data through the 2011 climate year (April 1 through March 31). The equations were developed in cooperation with the Indiana Department of Environmental Management.Regression techniques were used to develop the equations for estimating low-flow frequency statistics and the harmonic-mean flows on the basis of drainage-basin characteristics. A geographic information system was used to measure basin characteristics for selected streamgages. A final set of 25 basin characteristics measured at all the streamgages were evaluated to choose the best predictors of the low-flow statistics.Logistic-regression equations applicable statewide are presented for estimating the probability that selected low-flow frequency statistics equal zero. These equations use the explanatory variables total drainage area, average transmissivity of the full thickness of the unconsolidated deposits within 1,000 feet of the stream network, and latitude of the basin outlet. The percentage of the streamgage low-flow statistics correctly classified as zero or nonzero using the logistic-regression equations ranged from 86.1 to 88.9 percent.Generalized-least-squares regression equations applicable statewide for estimating nonzero low-flow frequency statistics use total drainage area, the average hydraulic conductivity of the top 70 feet of unconsolidated deposits, the slope of the basin, and the index of permeability and thickness of the Quaternary surficial sediments as explanatory variables. The average standard error of prediction of these regression equations ranges from 55.7 to 61.5 percent.Regional weighted-least-squares regression equations were developed for estimating the harmonic-mean flows by dividing the State into three low-flow regions. The Northern region uses total drainage area and the average transmissivity of the entire thickness of unconsolidated deposits as explanatory variables. The Central region uses total drainage area, the average hydraulic conductivity of the entire thickness of unconsolidated deposits, and the index of permeability and thickness of the Quaternary surficial sediments. The Southern region uses total drainage area and the percent of the basin covered by forest. The average standard error of prediction for these equations ranges from 39.3 to 66.7 percent.The regional regression equations are applicable only to stream sites with low flows unaffected by regulation and to stream sites with drainage basin characteristic values within specified limits. Caution is advised when applying the equations for basins with characteristics near the applicable limits and for basins with karst drainage features and for urbanized basins. Extrapolations near and beyond the applicable basin characteristic limits will have unknown errors that may be large. Equations are presented for use in estimating the 90-percent prediction interval of the low-flow statistics estimated by use of the regression equations at a given stream site.The regression equations are to be incorporated into the U.S. Geological Survey StreamStats Web-based application for Indiana. StreamStats allows users to select a stream site on a map and automatically measure the needed basin characteristics and compute the estimated low-flow statistics and associated prediction intervals.
Alexander, Terry W.; Wilson, Gary L.
1995-01-01
A generalized least-squares regression technique was used to relate the 2- to 500-year flood discharges from 278 selected streamflow-gaging stations to statistically significant basin characteristics. The regression relations (estimating equations) were defined for three hydrologic regions (I, II, and III) in rural Missouri. Ordinary least-squares regression analyses indicate that drainage area (Regions I, II, and III) and main-channel slope (Regions I and II) are the only basin characteristics needed for computing the 2- to 500-year design-flood discharges at gaged or ungaged stream locations. The resulting generalized least-squares regression equations provide a technique for estimating the 2-, 5-, 10-, 25-, 50-, 100-, and 500-year flood discharges on unregulated streams in rural Missouri. The regression equations for Regions I and II were developed from stream-flow-gaging stations with drainage areas ranging from 0.13 to 11,500 square miles and 0.13 to 14,000 square miles, and main-channel slopes ranging from 1.35 to 150 feet per mile and 1.20 to 279 feet per mile. The regression equations for Region III were developed from streamflow-gaging stations with drainage areas ranging from 0.48 to 1,040 square miles. Standard errors of estimate for the generalized least-squares regression equations in Regions I, II, and m ranged from 30 to 49 percent.
Improved Bond Equations for Fiber-Reinforced Polymer Bars in Concrete
Pour, Sadaf Moallemi; Alam, M. Shahria; Milani, Abbas S.
2016-01-01
This paper explores a set of new equations to predict the bond strength between fiber reinforced polymer (FRP) rebar and concrete. The proposed equations are based on a comprehensive statistical analysis and existing experimental results in the literature. Namely, the most effective parameters on bond behavior of FRP concrete were first identified by applying a factorial analysis on a part of the available database. Then the database that contains 250 pullout tests were divided into four groups based on the concrete compressive strength and the rebar surface. Afterward, nonlinear regression analysis was performed for each study group in order to determine the bond equations. The results show that the proposed equations can predict bond strengths more accurately compared to the other previously reported models. PMID:28773859
DOT National Transportation Integrated Search
1999-11-01
Using a fairly large cross-section/time-series data base, covering all provinces of Norway and all months between January 1973 and December 1994, we estimate non-linear (Box-Cox) regression equations explaining aggregate car ownership, road use, seat...
Multivariate regression model for partitioning tree volume of white oak into round-product classes
Daniel A. Yaussy; David L. Sonderman
1984-01-01
Describes the development of multivariate equations that predict the expected cubic volume of four round-product classes from independent variables composed of individual tree-quality characteristics. Although the model has limited application at this time, it does demonstrate the feasibility of partitioning total tree cubic volume into round-product classes based on...
THOMAS J. BRANDEIS; MARIA DEL ROCIO SUAREZ ROZO
2005-01-01
Total aboveground live tree biomass in Puerto Rican lower montane wet, subtropical wet, subtropical moist and subtropical dry forests was estimated using data from two forest inventories and published regression equations. Multiple potentially-applicable published biomass models existed for some forested life zones, and their estimates tended to diverge with increasing...
Thomas J. Brandeis; Maria Del Rocio; Suarez Rozo
2005-01-01
Total aboveground live tree biomass in Puerto Rican lower montane wet, subtropical wet, subtropical moist and subtropical dry forests was estimated using data from two forest inventories and published regression equations. Multiple potentially-applicable published biomass models existed for some forested life zones, and their estimates tended to diverge with increasing...
ERIC Educational Resources Information Center
Owen, Steven V.; Feldhusen, John F.
This study compares the effectiveness of three models of multivariate prediction for academic success in identifying the criterion variance of achievement in nursing education. The first model involves the use of an optimum set of predictors and one equation derived from a regression analysis on first semester grade average in predicting the…
Solid precipitation measurement intercomparison in Bismarck, North Dakota, from 1988 through 1997
Ryberg, Karen R.; Emerson, Douglas G.; Macek-Rowland, Kathleen M.
2009-01-01
A solid precipitation measurement intercomparison was recommended by the World Meteorological Organization (WMO) and was initiated after approval by the ninth session of the Commission for Instruments and Methods of Observation. The goal of the intercomparison was to assess national methods of measuring solid precipitation against methods whose accuracy and reliability were known. A field study was started in Bismarck, N. Dak., during the 1988-89 winter as part of the intercomparison. The last official field season of the WMO intercomparison was 1992-93; however, the Bismarck site continued to operate through the winter of 1996-97. Precipitation events at Bismarck were categorized as snow, mixed, or rain on the basis of descriptive notes recorded as part of the solid precipitation intercomparison. The rain events were not further analyzed in this study. Catch ratios (CRs) - the ratio of the precipitation catch at each gage to the true precipitation measurement (the corrected double fence intercomparison reference) - were calculated. Then, regression analysis was used to develop equations that model the snow and mixed precipitation CRs at each gage as functions of wind speed and temperature. Wind speed at the gages, functions of temperature, and upper air conditions (wind speed and air temperature at 700 millibars pressure) were used as possible explanatory variables in the multiple regression analysis done for this study. The CRs were modeled by using multiple regression analysis for the Tretyakov gage, national shielded gage, national unshielded gage, AeroChem gage, national gage with double fence, and national gage with Wyoming windshield. As in earlier studies by the WMO, wind speed and air temperature were found to influence the CR of the Tretyakov gage. However, in this study, the temperature variable represented the average upper air temperature over the duration of the event. The WMO did not use upper air conditions in its analysis. The national shielded and unshielded gages where found to be influenced by functions of wind speed only, as in other studies, but the upper air wind speed was used as an explanatory variable in this study. The AeroChem gage was not used in the WMO intercomparison study for 1987-93. The AeroChem gage had a highly varied CR at Bismarck, and a number of variables related to wind speed and temperature were used in the model for the CR. Despite extensive efforts to find a model for the national gage with double fence, no statistically significant regression model was found at the 0.05 level of statistical significance. The national gage with Wyoming windshield had a CR modeled by temperature and wind speed variables, and the regression relation had the highest coefficient of determination (R2 = 0.572) and adjusted coefficient of multiple determination (R2a = 0.476) of all of the models identified for any gage. Three of the gage CRs evaluated could be compared with those in the WMO intercomparison study for 1987-93. The WMO intercomparison had the advantage of a much larger dataset than this study. However, the data in this study represented a longer time period. Snow precipitation catch is highly varied depending on the equipment used and the weather conditions. Much of the variation is not accounted for in the WMO equations or in the equations developed in this study, particularly for unshielded gages. Extensive attempts at regression analysis were made with the mixed precipitation data, but it was concluded that the sample sizes were not large enough to model the CRs. However, the data could be used to test the WMO intercomparison equations. The mixed precipitation equations for the Tretyakov and national shielded gages are similar to those for snow in that they are more likely to underestimate precipitation when observed amounts were small and overestimate precipitation when observed amounts were relatively large. Mixed precipitation is underestimated by the WMO adjustment and t
Crop status evaluations and yield predictions
NASA Technical Reports Server (NTRS)
Haun, J. R.
1975-01-01
A model was developed for predicting the day 50 percent of the wheat crop is planted in North Dakota. This model incorporates location as an independent variable. The Julian date when 50 percent of the crop was planted for the nine divisions of North Dakota for seven years was regressed on the 49 variables through the step-down multiple regression procedure. This procedure begins with all of the independent variables and sequentially removes variables that are below a predetermined level of significance after each step. The prediction equation was tested on daily data. The accuracy of the model is considered satisfactory for finding the historic dates on which to initiate yield prediction model. Growth prediction models were also developed for spring wheat.
Shah, A A; Xing, W W; Triantafyllidis, V
2017-04-01
In this paper, we develop reduced-order models for dynamic, parameter-dependent, linear and nonlinear partial differential equations using proper orthogonal decomposition (POD). The main challenges are to accurately and efficiently approximate the POD bases for new parameter values and, in the case of nonlinear problems, to efficiently handle the nonlinear terms. We use a Bayesian nonlinear regression approach to learn the snapshots of the solutions and the nonlinearities for new parameter values. Computational efficiency is ensured by using manifold learning to perform the emulation in a low-dimensional space. The accuracy of the method is demonstrated on a linear and a nonlinear example, with comparisons with a global basis approach.
Xing, W. W.; Triantafyllidis, V.
2017-01-01
In this paper, we develop reduced-order models for dynamic, parameter-dependent, linear and nonlinear partial differential equations using proper orthogonal decomposition (POD). The main challenges are to accurately and efficiently approximate the POD bases for new parameter values and, in the case of nonlinear problems, to efficiently handle the nonlinear terms. We use a Bayesian nonlinear regression approach to learn the snapshots of the solutions and the nonlinearities for new parameter values. Computational efficiency is ensured by using manifold learning to perform the emulation in a low-dimensional space. The accuracy of the method is demonstrated on a linear and a nonlinear example, with comparisons with a global basis approach. PMID:28484327
Oki, Delwyn S.; Rosa, Sarah N.; Yeung, Chiu W.
2010-01-01
This study provides an updated analysis of the magnitude and frequency of peak stream discharges in Hawai`i. Annual peak-discharge data collected by the U.S. Geological Survey during and before water year 2008 (ending September 30, 2008) at stream-gaging stations were analyzed. The existing generalized-skew value for the State of Hawai`i was retained, although three methods were used to evaluate whether an update was needed. Regional regression equations were developed for peak discharges with 2-, 5-, 10-, 25-, 50-, 100-, and 500-year recurrence intervals for unregulated streams (those for which peak discharges are not affected to a large extent by upstream reservoirs, dams, diversions, or other structures) in areas with less than 20 percent combined medium- and high-intensity development on Kaua`i, O`ahu, Moloka`i, Maui, and Hawai`i. The generalized-least-squares (GLS) regression equations relate peak stream discharge to quantified basin characteristics (for example, drainage-basin area and mean annual rainfall) that were determined using geographic information system (GIS) methods. Each of the islands of Kaua`i,O`ahu, Moloka`i, Maui, and Hawai`i was divided into two regions, generally corresponding to a wet region and a dry region. Unique peak-discharge regression equations were developed for each region. The regression equations developed for this study have standard errors of prediction ranging from 16 to 620 percent. Standard errors of prediction are greatest for regression equations developed for leeward Moloka`i and southern Hawai`i. In general, estimated 100-year peak discharges from this study are lower than those from previous studies, which may reflect the longer periods of record used in this study. Each regression equation is valid within the range of values of the explanatory variables used to develop the equation. The regression equations were developed using peak-discharge data from streams that are mainly unregulated, and they should not be used to estimate peak discharges in regulated streams. Use of a regression equation beyond its limits will produce peak-discharge estimates with unknown error and should therefore be avoided. Improved estimates of the magnitude and frequency of peak discharges in Hawai`i will require continued operation of existing stream-gaging stations and operation of additional gaging stations for areas such as Moloka`i and Hawai`i, where limited stream-gaging data are available.
Development of an empirically based dynamic biomechanical strength model
NASA Technical Reports Server (NTRS)
Pandya, A.; Maida, J.; Aldridge, A.; Hasson, S.; Woolford, B.
1992-01-01
The focus here is on the development of a dynamic strength model for humans. Our model is based on empirical data. The shoulder, elbow, and wrist joints are characterized in terms of maximum isolated torque, position, and velocity in all rotational planes. This information is reduced by a least squares regression technique into a table of single variable second degree polynomial equations determining the torque as a function of position and velocity. The isolated joint torque equations are then used to compute forces resulting from a composite motion, which in this case is a ratchet wrench push and pull operation. What is presented here is a comparison of the computed or predicted results of the model with the actual measured values for the composite motion.
Regression analysis on the variation in efficiency frontiers for prevention stage of HIV/AIDS.
Kamae, Maki S; Kamae, Isao; Cohen, Joshua T; Neumann, Peter J
2011-01-01
To investigate how the cost effectiveness of preventing HIV/AIDS varies across possible efficiency frontiers (EFs) by taking into account potentially relevant external factors, such as prevention stage, and how the EFs can be characterized using regression analysis given uncertainty of the QALY-cost estimates. We reviewed cost-effectiveness estimates for the prevention and treatment of HIV/AIDS published from 2002-2007 and catalogued in the Tufts Medical Center Cost-Effectiveness Analysis (CEA) Registry. We constructed efficiency frontier (EF) curves by plotting QALYs against costs, using methods used by the Institute for Quality and Efficiency in Health Care (IQWiG) in Germany. We stratified the QALY-cost ratios by prevention stage, country of study, and payer perspective, and estimated EF equations using log and square-root models. A total of 53 QALY-cost ratios were identified for HIV/AIDS in the Tufts CEA Registry. Plotted ratios stratified by prevention stage were visually grouped into a cluster consisting of primary/secondary prevention measures and a cluster consisting of tertiary measures. Correlation coefficients for each cluster were statistically significant. For each cluster, we derived two EF equations - one based on the log model, and one based on the square-root model. Our findings indicate that stratification of HIV/AIDS interventions by prevention stage can yield distinct EFs, and that the correlation and regression analyses are useful for parametrically characterizing EF equations. Our study has certain limitations, such as the small number of included articles and the potential for study populations to be non-representative of countries of interest. Nonetheless, our approach could help develop a deeper appreciation of cost effectiveness beyond the deterministic approach developed by IQWiG.
Lorenzo-Seva, Urbano; Ferrando, Pere J
2011-03-01
We provide an SPSS program that implements currently recommended techniques and recent developments for selecting variables in multiple linear regression analysis via the relative importance of predictors. The approach consists of: (1) optimally splitting the data for cross-validation, (2) selecting the final set of predictors to be retained in the equation regression, and (3) assessing the behavior of the chosen model using standard indices and procedures. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from brm.psychonomic-journals.org/content/supplemental.
The use of cognitive ability measures as explanatory variables in regression analysis.
Junker, Brian; Schofield, Lynne Steuerle; Taylor, Lowell J
2012-12-01
Cognitive ability measures are often taken as explanatory variables in regression analysis, e.g., as a factor affecting a market outcome such as an individual's wage, or a decision such as an individual's education acquisition. Cognitive ability is a latent construct; its true value is unobserved. Nonetheless, researchers often assume that a test score , constructed via standard psychometric practice from individuals' responses to test items, can be safely used in regression analysis. We examine problems that can arise, and suggest that an alternative approach, a "mixed effects structural equations" (MESE) model, may be more appropriate in many circumstances.
NASA Technical Reports Server (NTRS)
Jacobsen, R. T.; Stewart, R. B.; Crain, R. W., Jr.; Rose, G. L.; Myers, A. F.
1976-01-01
A method was developed for establishing a rational choice of the terms to be included in an equation of state with a large number of adjustable coefficients. The methods presented were developed for use in the determination of an equation of state for oxygen and nitrogen. However, a general application of the methods is possible in studies involving the determination of an optimum polynomial equation for fitting a large number of data points. The data considered in the least squares problem are experimental thermodynamic pressure-density-temperature data. Attention is given to a description of stepwise multiple regression and the use of stepwise regression in the determination of an equation of state for oxygen and nitrogen.
Burns, A.W.
1988-01-01
This report describes an interactive-accounting model used to simulate streamflow, chemical-constituent concentrations and loads, and water-supply operations in a river basin. The model uses regression equations to compute flow from incremental (internode) drainage areas. Conservative chemical constituents (typically dissolved solids) also are computed from regression equations. Both flow and water quality loads are accumulated downstream. Optionally, the model simulates the water use and the simplified groundwater systems of a basin. Water users include agricultural, municipal, industrial, and in-stream users , and reservoir operators. Water users list their potential water sources, including direct diversions, groundwater pumpage, interbasin imports, or reservoir releases, in the order in which they will be used. Direct diversions conform to basinwide water law priorities. The model is interactive, and although the input data exist in files, the user can modify them interactively. A major feature of the model is its color-graphic-output options. This report includes a description of the model, organizational charts of subroutines, and examples of the graphics. Detailed format instructions for the input data, example files of input data, definitions of program variables, and listing of the FORTRAN source code are Attachments to the report. (USGS)
Gotvald, Anthony J.
2017-01-13
The U.S. Geological Survey, in cooperation with the Georgia Department of Natural Resources, Environmental Protection Division, developed regional regression equations for estimating selected low-flow frequency and mean annual flow statistics for ungaged streams in north Georgia that are not substantially affected by regulation, diversions, or urbanization. Selected low-flow frequency statistics and basin characteristics for 56 streamgage locations within north Georgia and 75 miles beyond the State’s borders in Alabama, Tennessee, North Carolina, and South Carolina were combined to form the final dataset used in the regional regression analysis. Because some of the streamgages in the study recorded zero flow, the final regression equations were developed using weighted left-censored regression analysis to analyze the flow data in an unbiased manner, with weights based on the number of years of record. The set of equations includes the annual minimum 1- and 7-day average streamflow with the 10-year recurrence interval (referred to as 1Q10 and 7Q10), monthly 7Q10, and mean annual flow. The final regional regression equations are functions of drainage area, mean annual precipitation, and relief ratio for the selected low-flow frequency statistics and drainage area and mean annual precipitation for mean annual flow. The average standard error of estimate was 13.7 percent for the mean annual flow regression equation and ranged from 26.1 to 91.6 percent for the selected low-flow frequency equations.The equations, which are based on data from streams with little to no flow alterations, can be used to provide estimates of the natural flows for selected ungaged stream locations in the area of Georgia north of the Fall Line. The regression equations are not to be used to estimate flows for streams that have been altered by the effects of major dams, surface-water withdrawals, groundwater withdrawals (pumping wells), diversions, or wastewater discharges. The regression equations should be used only for ungaged sites with drainage areas between 1.67 and 576 square miles, mean annual precipitation between 47.6 and 81.6 inches, and relief ratios between 0.146 and 0.607; these are the ranges of the explanatory variables used to develop the equations. An attempt was made to develop regional regression equations for the area of Georgia south of the Fall Line by using the same approach used during this study for north Georgia; however, the equations resulted with high average standard errors of estimates and poorly predicted flows below 0.5 cubic foot per second, which may be attributed to the karst topography common in that area.The final regression equations developed from this study are planned to be incorporated into the U.S. Geological Survey StreamStats program. StreamStats is a Web-based geographic information system that provides users with access to an assortment of analytical tools useful for water-resources planning and management, and for engineering design applications, such as the design of bridges. The StreamStats program provides streamflow statistics and basin characteristics for U.S. Geological Survey streamgage locations and ungaged sites of interest. StreamStats also can compute basin characteristics and provide estimates of streamflow statistics for ungaged sites when users select the location of a site along any stream in Georgia.
Assessing Spurious Interaction Effects in Structural Equation Modeling
ERIC Educational Resources Information Center
Harring, Jeffrey R.; Weiss, Brandi A.; Li, Ming
2015-01-01
Several studies have stressed the importance of simultaneously estimating interaction and quadratic effects in multiple regression analyses, even if theory only suggests an interaction effect should be present. Specifically, past studies suggested that failing to simultaneously include quadratic effects when testing for interaction effects could…
Simple models for estimating local removals of timber in the northeast
David N. Larsen; David A. Gansner
1975-01-01
Provides a practical method of estimating subregional removals of timber and demonstrates its application to a typical problem. Stepwise multiple regression analysis is used to develop equations for estimating removals of softwood, hardwood, and all timber from selected characteristics of socioeconomic structure.
MEDISE: A macroeconomic model for energy planning in Costa Rica
DOE Office of Scientific and Technical Information (OSTI.GOV)
Booth, S.R.; Leiva, C.L.
This report describes the development and results of MEDISE, an econometric macroeconomic model for energy planning in Costa Rica. The model is a simultaneous system of 19 equations that was constructed using ENERPLAN, an energy planning tool developed by the United Nations for use in developing countries. The equations were estimated using regression analysis on a data time series of 1966 to 1984. ENERPLAN's model solution package was used to obtain forecasts of 19 economic variables from 1985 to 2005. the modeling effort was conducted jointly by Los Alamos Central American Energy and Resources Project (CAP) personnel and the Energymore » Sector Directorate of Costa Rica during 1986. The CAP was funded by the US Agency for International Development. 6 refs., 3 figs., 11 tabs.« less
NASA Astrophysics Data System (ADS)
Bilonick, Richard A.; Connell, Daniel P.; Talbott, Evelyn O.; Rager, Judith R.; Xue, Tao
2015-02-01
The objective of this study was to remove systematic bias among fine particulate matter (PM2.5) mass concentration measurements made by different types of samplers used in the Pittsburgh Aerosol Research and Inhalation Epidemiology Study (PARIES). PARIES is a retrospective epidemiology study that aims to provide a comprehensive analysis of the associations between air quality and human health effects in the Pittsburgh, Pennsylvania, region from 1999 to 2008. Calibration was needed in order to minimize the amount of systematic error in PM2.5 exposure estimation as a result of including data from 97 different PM2.5 samplers at 47 monitoring sites. Ordinary regression often has been used for calibrating air quality measurements from pairs of measurement devices; however, this is only appropriate when one of the two devices (the "independent" variable) is free from random error, which is rarely the case. A group of methods known as "errors-in-variables" (e.g., Deming regression, reduced major axis regression) has been developed to handle calibration between two devices when both are subject to random error, but these methods require information on the relative sizes of the random errors for each device, which typically cannot be obtained from the observed data. When data from more than two devices (or repeats of the same device) are available, the additional information is not used to inform the calibration. A more general approach that often has been overlooked is the use of a measurement error structural equation model (SEM) that allows the simultaneous comparison of three or more devices (or repeats). The theoretical underpinnings of all of these approaches to calibration are described, and the pros and cons of each are discussed. In particular, it is shown that both ordinary regression (when used for calibration) and Deming regression are particular examples of SEMs but with substantial deficiencies. To illustrate the use of SEMs, the 7865 daily average PM2.5 mass concentration measurements made by seven collocated samplers at an urban monitoring site in Pittsburgh, Pennsylvania, were used. These samplers, which included three federal reference method (FRM) samplers, three speciation samplers, and a tapered element oscillating microbalance (TEOM), operated at various times during the 10-year PARIES study period. Because TEOM measurements are known to depend on temperature, the constructed SEM provided calibration equations relating the TEOM to the FRM and speciation samplers as a function of ambient temperature. It was shown that TEOM imprecision and TEOM bias (relative to the FRM) both decreased as temperature increased. It also was shown that the temperature dependency for bias was non-linear and followed a sigmoidal (logistic) pattern. The speciation samplers exhibited only small bias relative to the FRM samplers, although the FRM samplers were shown to be substantially more precise than both the TEOM and the speciation samplers. Comparison of the SEM results to pairwise simple linear regression results showed that the regression results can differ substantially from the correctly-derived calibration equations, especially if the less-precise device is used as the independent variable in the regression.
Koseki, Shige; Nonaka, Junko
2012-09-01
The objective of this study was to develop a probabilistic model to predict the end of lag time (λ) during the growth of Bacillus cereus vegetative cells as a function of temperature, pH, and salt concentration using logistic regression. The developed λ model was subsequently combined with a logistic differential equation to simulate bacterial numbers over time. To develop a novel model for λ, we determined whether bacterial growth had begun, i.e., whether λ had ended, at each time point during the growth kinetics. The growth of B. cereus was evaluated by optical density (OD) measurements in culture media for various pHs (5.5 ∼ 7.0) and salt concentrations (0.5 ∼ 2.0%) at static temperatures (10 ∼ 20°C). The probability of the end of λ was modeled using dichotomous judgments obtained at each OD measurement point concerning whether a significant increase had been observed. The probability of the end of λ was described as a function of time, temperature, pH, and salt concentration and showed a high goodness of fit. The λ model was validated with independent data sets of B. cereus growth in culture media and foods, indicating acceptable performance. Furthermore, the λ model, in combination with a logistic differential equation, enabled a simulation of the population of B. cereus in various foods over time at static and/or fluctuating temperatures with high accuracy. Thus, this newly developed modeling procedure enables the description of λ using observable environmental parameters without any conceptual assumptions and the simulation of bacterial numbers over time with the use of a logistic differential equation.
Galindo-Romero, Marta; Lippert, Tristan; Gavrilov, Alexander
2015-12-01
This paper presents an empirical linear equation to predict peak pressure level of anthropogenic impulsive signals based on its correlation with the sound exposure level. The regression coefficients are shown to be weakly dependent on the environmental characteristics but governed by the source type and parameters. The equation can be applied to values of the sound exposure level predicted with a numerical model, which provides a significant improvement in the prediction of the peak pressure level. Part I presents the analysis for airgun arrays signals, and Part II considers the application of the empirical equation to offshore impact piling noise.
Kaitaniemi, Pekka
2008-04-09
Allometric equations are widely used in many branches of biological science. The potential information content of the normalization constant b in allometric equations of the form Y = bX(a) has, however, remained largely neglected. To demonstrate the potential for utilizing this information, I generated a large number of artificial datasets that resembled those that are frequently encountered in biological studies, i.e., relatively small samples including measurement error or uncontrolled variation. The value of X was allowed to vary randomly within the limits describing different data ranges, and a was set to a fixed theoretical value. The constant b was set to a range of values describing the effect of a continuous environmental variable. In addition, a normally distributed random error was added to the values of both X and Y. Two different approaches were then used to model the data. The traditional approach estimated both a and b using a regression model, whereas an alternative approach set the exponent a at its theoretical value and only estimated the value of b. Both approaches produced virtually the same model fit with less than 0.3% difference in the coefficient of determination. Only the alternative approach was able to precisely reproduce the effect of the environmental variable, which was largely lost among noise variation when using the traditional approach. The results show how the value of b can be used as a source of valuable biological information if an appropriate regression model is selected.
Lima, Luiz Rodrigo Augustemak de; Martins, Priscila Custódio; Junior, Carlos Alencar Souza Alves; Castro, João Antônio Chula de; Silva, Diego Augusto Santos; Petroski, Edio Luiz
The aim of this study was to assess the validity of traditional anthropometric equations and to develop predictive equations of total body and trunk fat for children and adolescents living with HIV based on anthropometric measurements. Forty-eight children and adolescents of both sexes (24 boys) aged 7-17 years, living in Santa Catarina, Brazil, participated in the study. Dual-energy X-ray absorptiometry was used as the reference method to evaluate total body and trunk fat. Height, body weight, circumferences and triceps, subscapular, abdominal and calf skinfolds were measured. The traditional equations of Lohman and Slaughter were used to estimate body fat. Multiple regression models were fitted to predict total body fat (Model 1) and trunk fat (Model 2) using a backward selection procedure. Model 1 had an R 2 =0.85 and a standard error of the estimate of 1.43. Model 2 had an R 2 =0.80 and standard error of the estimate=0.49. The traditional equations of Lohman and Slaughter showed poor performance in estimating body fat in children and adolescents living with HIV. The prediction models using anthropometry provided reliable estimates and can be used by clinicians and healthcare professionals to monitor total body and trunk fat in children and adolescents living with HIV. Copyright © 2017 Sociedade Brasileira de Infectologia. Published by Elsevier Editora Ltda. All rights reserved.
Minute ventilation of cyclists, car and bus passengers: an experimental study.
Zuurbier, Moniek; Hoek, Gerard; van den Hazel, Peter; Brunekreef, Bert
2009-10-27
Differences in minute ventilation between cyclists, pedestrians and other commuters influence inhaled doses of air pollution. This study estimates minute ventilation of cyclists, car and bus passengers, as part of a study on health effects of commuters' exposure to air pollutants. Thirty-four participants performed a submaximal test on a bicycle ergometer, during which heart rate and minute ventilation were measured simultaneously at increasing cycling intensity. Individual regression equations were calculated between heart rate and the natural log of minute ventilation. Heart rates were recorded during 280 two hour trips by bicycle, bus and car and were calculated into minute ventilation levels using the individual regression coefficients. Minute ventilation during bicycle rides were on average 2.1 times higher than in the car (individual range from 1.3 to 5.3) and 2.0 times higher than in the bus (individual range from 1.3 to 5.1). The ratio of minute ventilation of cycling compared to travelling by bus or car was higher in women than in men. Substantial differences in regression equations were found between individuals. The use of individual regression equations instead of average regression equations resulted in substantially better predictions of individual minute ventilations. The comparability of the gender-specific overall regression equations linking heart rate and minute ventilation with one previous American study, supports that for studies on the group level overall equations can be used. For estimating individual doses, the use of individual regression coefficients provides more precise data. Minute ventilation levels of cyclists are on average two times higher than of bus and car passengers, consistent with the ratio found in one small previous study of young adults. The study illustrates the importance of inclusion of minute ventilation data in comparing air pollution doses between different modes of transport.
NASA Astrophysics Data System (ADS)
Wang, Ji-Peng; François, Bertrand; Lambert, Pierre
2017-09-01
Estimating hydraulic conductivity from particle size distribution (PSD) is an important issue for various engineering problems. Classical models such as Hazen model, Beyer model, and Kozeny-Carman model usually regard the grain diameter at 10% passing (d10) as an effective grain size and the effects of particle size uniformity (in Beyer model) or porosity (in Kozeny-Carman model) are sometimes embedded. This technical note applies the dimensional analysis (Buckingham's ∏ theorem) to analyze the relationship between hydraulic conductivity and particle size distribution (PSD). The porosity is regarded as a dependent variable on the grain size distribution in unconsolidated conditions. It indicates that the coefficient of grain size uniformity and a dimensionless group representing the gravity effect, which is proportional to the mean grain volume, are the main two determinative parameters for estimating hydraulic conductivity. Regression analysis is then carried out on a database comprising 431 samples collected from different depositional environments and new equations are developed for hydraulic conductivity estimation. The new equation, validated in specimens beyond the database, shows an improved prediction comparing to using the classic models.
Estimation of Magnitude and Frequency of Floods for Streams on the Island of Oahu, Hawaii
Wong, Michael F.
1994-01-01
This report describes techniques for estimating the magnitude and frequency of floods for the island of Oahu. The log-Pearson Type III distribution and methodology recommended by the Interagency Committee on Water Data was used to determine the magnitude and frequency of floods at 79 gaging stations that had 11 to 72 years of record. Multiple regression analysis was used to construct regression equations to transfer the magnitude and frequency information from gaged sites to ungaged sites. Oahu was divided into three hydrologic regions to define relations between peak discharge and drainage-basin and climatic characteristics. Regression equations are provided to estimate the 2-, 5-, 10-, 25-, 50-, and 100-year peak discharges at ungaged sites. Significant basin and climatic characteristics included in the regression equations are drainage area, median annual rainfall, and the 2-year, 24-hour rainfall intensity. Drainage areas for sites used in this study ranged from 0.03 to 45.7 square miles. Standard error of prediction for the regression equations ranged from 34 to 62 percent. Peak-discharge data collected through water year 1988, geographic information system (GIS) technology, and generalized least-squares regression were used in the analyses. The use of GIS seems to be a more flexible and consistent means of defining and calculating basin and climatic characteristics than using manual methods. Standard errors of estimate for the regression equations in this report are an average of 8 percent less than those published in previous studies.
Hegab, Moustafa; Midan, Mahmoud Farouk; Taha, Tamer; Bibars, Mamdouh; Wakeel, Khaled Helmi El; Amer, Hesham; Azmy, Osama
2018-05-20
To construct new fetal biometric charts and equations for some fetal biometric parameters for women between 12 th and 41 st weeks living in Ismailia and Port Said Governorates in Egypt. This cross-sectional study was carried out on 656 Egyptian women (from Ismailia and Port Said governorates) with an uncomplicated pregnancy, and all were sure of their dates. The selected group was between the 12 th and 41 st weeks of gestation, recruited from the district general hospital in Ismailia and Port Said to measure ultrasonographically biparietal diameter (BPD), head circumference (HC), abdominal circumference (AC) and femur length (FL), then for each measurement separate regression models were fitted to estimate both the mean and the Standard deviation at each gestational age. New Egyptian charts were reported for BPD, HC, AC, and FL. Reference equations for the dating of pregnancy were presented. The mean of the previous measurements at 12 th and 41 st weeks were as follows: (23.37, 98.72), (83.05, 336.12), (67.85, 332.57) and (12.50, 74.92) respectively. New fetal biometric charts and regression equations for pregnant women living in Port Said & Ismailia governorates in Egypt.
ERIC Educational Resources Information Center
Matsanka, Christopher
2017-01-01
The purpose of this non-experimental quantitative study was to investigate the relationship between Pennsylvania's Classroom Diagnostic Tools (CDT) interim assessments and the state-mandated Pennsylvania System of School Assessment (PSSA) and to create linear regression equations that could be used as models to predict student performance on the…
Hart, Carl R; Reznicek, Nathan J; Wilson, D Keith; Pettit, Chris L; Nykaza, Edward T
2016-05-01
Many outdoor sound propagation models exist, ranging from highly complex physics-based simulations to simplified engineering calculations, and more recently, highly flexible statistical learning methods. Several engineering and statistical learning models are evaluated by using a particular physics-based model, namely, a Crank-Nicholson parabolic equation (CNPE), as a benchmark. Narrowband transmission loss values predicted with the CNPE, based upon a simulated data set of meteorological, boundary, and source conditions, act as simulated observations. In the simulated data set sound propagation conditions span from downward refracting to upward refracting, for acoustically hard and soft boundaries, and low frequencies. Engineering models used in the comparisons include the ISO 9613-2 method, Harmonoise, and Nord2000 propagation models. Statistical learning methods used in the comparisons include bagged decision tree regression, random forest regression, boosting regression, and artificial neural network models. Computed skill scores are relative to sound propagation in a homogeneous atmosphere over a rigid ground. Overall skill scores for the engineering noise models are 0.6%, -7.1%, and 83.8% for the ISO 9613-2, Harmonoise, and Nord2000 models, respectively. Overall skill scores for the statistical learning models are 99.5%, 99.5%, 99.6%, and 99.6% for bagged decision tree, random forest, boosting, and artificial neural network regression models, respectively.
Graphical Tools for Linear Structural Equation Modeling
2014-06-01
others. 4Kenny and Milan (2011) write, “Identification is perhaps the most difficult concept for SEM researchers to understand. We have seen SEM...model to using typical SEM software to determine model identifia- bility. Kenny and Milan (2011) list the following drawbacks: (i) If poor starting...the well known recursive and null rules (Bollen, 1989) and the regression rule (Kenny and Milan , 2011). A Simple Criterion for Identifying Individual
NASA Technical Reports Server (NTRS)
Deadmore, D. L.
1984-01-01
The effects of Cr, Al, Ti, Mo, Ta, Nb, and W content on the hot corrosion of nickel base alloys were investigated. The alloys were tested in a Mach 0.3 flame with 0.5 ppmw sodium at a temperature of 900 C. One nondestructive and three destructive tests were conducted. The best corrosion resistance was achieved when the Cr content was 12 wt %. However, some lower-Cr-content alloys ( 10 wt%) exhibited reasonable resistance provided that the Al content alloys ( 10 wt %) exhibited reasonable resistance provided that the Al content was 2.5 wt % and the Ti content was Aa wt %. The effect of W, Ta, Mo, and Nb contents on the hot-corrosion resistance varied depending on the Al and Ti contents. Several commercial alloy compositions were also tested and the corrosion attack was measured. Predicted attack was calculated for these alloys from derived regression equations and was in reasonable agreement with that experimentally measured. The regression equations were derived from measurements made on alloys in a one-quarter replicate of a 2(7) statistical design alloy composition experiment. These regression equations represent a simple linear model and are only a very preliminary analysis of the data needed to provide insights into the experimental method.
Validation of Core Temperature Estimation Algorithm
2016-01-29
plot of observed versus estimated core temperature with the line of identity (dashed) and the least squares regression line (solid) and line equation...estimated PSI with the line of identity (dashed) and the least squares regression line (solid) and line equation in the top left corner. (b) Bland...for comparison. The root mean squared error (RMSE) was also computed, as given by Equation 2.
Runkel, Robert L.
1998-01-01
OTIS is a mathematical simulation model used to characterize the fate and transport of water-borne solutes in streams and rivers. The governing equation underlying the model is the advection-dispersion equation with additional terms to account for transient storage, lateral inflow, first-order decay, and sorption. This equation and the associated equations describing transient storage and sorption are solved using a Crank-Nicolson finite-difference solution. OTIS may be used in conjunction with data from field-scale tracer experiments to quantify the hydrologic parameters affecting solute transport. This application typically involves a trial-and-error approach wherein parameter estimates are adjusted to obtain an acceptable match between simulated and observed tracer concentrations. Additional applications include analyses of nonconservative solutes that are subject to sorption processes or first-order decay. OTIS-P, a modified version of OTIS, couples the solution of the governing equation with a nonlinear regression package. OTIS-P determines an optimal set of parameter estimates that minimize the squared differences between the simulated and observed concentrations, thereby automating the parameter estimation process. This report details the development and application of OTIS and OTIS-P. Sections of the report describe model theory, input/output specifications, sample applications, and installation instructions.
Sargolzaie, Narjes; Miri-Moghaddam, Ebrahim
2014-01-01
The most common differential diagnosis of β-thalassemia (β-thal) trait is iron deficiency anemia. Several red blood cell equations were introduced during different studies for differential diagnosis between β-thal trait and iron deficiency anemia. Due to genetic variations in different regions, these equations cannot be useful in all population. The aim of this study was to determine a native equation with high accuracy for differential diagnosis of β-thal trait and iron deficiency anemia for the Sistan and Baluchestan population by logistic regression analysis. We selected 77 iron deficiency anemia and 100 β-thal trait cases. We used binary logistic regression analysis and determined best equations for probability prediction of β-thal trait against iron deficiency anemia in our population. We compared diagnostic values and receiver operative characteristic (ROC) curve related to this equation and another 10 published equations in discriminating β-thal trait and iron deficiency anemia. The binary logistic regression analysis determined the best equation for best probability prediction of β-thal trait against iron deficiency anemia with area under curve (AUC) 0.998. Based on ROC curves and AUC, Green & King, England & Frazer, and then Sirdah indices, respectively, had the most accuracy after our equation. We suggest that to get the best equation and cut-off in each region, one needs to evaluate specific information of each region, specifically in areas where populations are homogeneous, to provide a specific formula for differentiating between β-thal trait and iron deficiency anemia.
Age estimation using pulp/tooth area ratio in maxillary canines-A digital image analysis.
Juneja, Manjushree; Devi, Yashoda B K; Rakesh, N; Juneja, Saurabh
2014-09-01
Determination of age of a subject is one of the most important aspects of medico-legal cases and anthropological research. Radiographs can be used to indirectly measure the rate of secondary dentine deposition which is depicted by reduction in the pulp area. In this study, 200 patients of Karnataka aged between 18-72 years were selected for the study. Panoramic radiographs were made and indirectly digitized. Radiographic images of maxillary canines (RIC) were processed using a computer-aided drafting program (ImageJ). The variables pulp/root length (p), pulp/tooth length (r), pulp/root width at enamel-cementum junction (ECJ) level (a), pulp/root width at mid-root level (c), pulp/root width at midpoint level between ECJ level and mid-root level (b) and pulp/tooth area ratio (AR) were recorded. All the morphological variables including gender were statistically analyzed to derive regression equation for estimation of age. It was observed that 2 variables 'AR' and 'b' contributed significantly to the fit and were included in the regression model, yielding the formula: Age = 87.305-480.455(AR)+48.108(b). Statistical analysis indicated that the regression equation with selected variables explained 96% of total variance with the median of the residuals of 0.1614 years and standard error of estimate of 3.0186 years. There is significant correlation between age and morphological variables 'AR' and 'b' and the derived population specific regression equation can be potentially used for estimation of chronological age of individuals of Karnataka origin.
Esserman, Denise A.; Moore, Charity G.; Roth, Mary T.
2009-01-01
Older community dwelling adults often take multiple medications for numerous chronic diseases. Non-adherence to these medications can have a large public health impact. Therefore, the measurement and modeling of medication adherence in the setting of polypharmacy is an important area of research. We apply a variety of different modeling techniques (standard linear regression; weighted linear regression; adjusted linear regression; naïve logistic regression; beta-binomial (BB) regression; generalized estimating equations (GEE)) to binary medication adherence data from a study in a North Carolina based population of older adults, where each medication an individual was taking was classified as adherent or non-adherent. In addition, through simulation we compare these different methods based on Type I error rates, bias, power, empirical 95% coverage, and goodness of fit. We find that estimation and inference using GEE is robust to a wide variety of scenarios and we recommend using this in the setting of polypharmacy when adherence is dichotomously measured for multiple medications per person. PMID:20414358
A Solution to Separation and Multicollinearity in Multiple Logistic Regression
Shen, Jianzhao; Gao, Sujuan
2010-01-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27–38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. The ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither solves the problems for each other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth’s penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using a current screening data from a community-based dementia study. PMID:20376286
A Solution to Separation and Multicollinearity in Multiple Logistic Regression.
Shen, Jianzhao; Gao, Sujuan
2008-10-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27-38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. The ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither solves the problems for each other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth's penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using a current screening data from a community-based dementia study.
Methods for estimating flood frequency in Montana based on data through water year 1998
Parrett, Charles; Johnson, Dave R.
2004-01-01
Annual peak discharges having recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years (T-year floods) were determined for 660 gaged sites in Montana and in adjacent areas of Idaho, Wyoming, and Canada, based on data through water year 1998. The updated flood-frequency information was subsequently used in regression analyses, either ordinary or generalized least squares, to develop equations relating T-year floods to various basin and climatic characteristics, equations relating T-year floods to active-channel width, and equations relating T-year floods to bankfull width. The equations can be used to estimate flood frequency at ungaged sites. Montana was divided into eight regions, within which flood characteristics were considered to be reasonably homogeneous, and the three sets of regression equations were developed for each region. A measure of the overall reliability of the regression equations is the average standard error of prediction. The average standard errors of prediction for the equations based on basin and climatic characteristics ranged from 37.4 percent to 134.1 percent. Average standard errors of prediction for the equations based on active-channel width ranged from 57.2 percent to 141.3 percent. Average standard errors of prediction for the equations based on bankfull width ranged from 63.1 percent to 155.5 percent. In most regions, the equations based on basin and climatic characteristics generally had smaller average standard errors of prediction than equations based on active-channel or bankfull width. An exception was the Southeast Plains Region, where all equations based on active-channel width had smaller average standard errors of prediction than equations based on basin and climatic characteristics or bankfull width. Methods for weighting estimates derived from the basin- and climatic-characteristic equations and the channel-width equations also were developed. The weights were based on the cross correlation of residuals from the different methods and the average standard errors of prediction. When all three methods were combined, the average standard errors of prediction ranged from 37.4 percent to 120.2 percent. Weighting of estimates reduced the standard errors of prediction for all T-year flood estimates in four regions, reduced the standard errors of prediction for some T-year flood estimates in two regions, and provided no reduction in average standard error of prediction in two regions. A computer program for solving the regression equations, weighting estimates, and determining reliability of individual estimates was developed and placed on the USGS Montana District World Wide Web page. A new regression method, termed Region of Influence regression, also was tested. Test results indicated that the Region of Influence method was not as reliable as the regional equations based on generalized least squares regression. Two additional methods for estimating flood frequency at ungaged sites located on the same streams as gaged sites also are described. The first method, based on a drainage-area-ratio adjustment, is intended for use on streams where the ungaged site of interest is located near a gaged site. The second method, based on interpolation between gaged sites, is intended for use on streams that have two or more streamflow-gaging stations.
Flood-frequency prediction methods for unregulated streams of Tennessee, 2000
Law, George S.; Tasker, Gary D.
2003-01-01
Up-to-date flood-frequency prediction methods for unregulated, ungaged rivers and streams of Tennessee have been developed. Prediction methods include the regional-regression method and the newer region-of-influence method. The prediction methods were developed using stream-gage records from unregulated streams draining basins having from 1 percent to about 30 percent total impervious area. These methods, however, should not be used in heavily developed or storm-sewered basins with impervious areas greater than 10 percent. The methods can be used to estimate 2-, 5-, 10-, 25-, 50-, 100-, and 500-year recurrence-interval floods of most unregulated rural streams in Tennessee. A computer application was developed that automates the calculation of flood frequency for unregulated, ungaged rivers and streams of Tennessee. Regional-regression equations were derived by using both single-variable and multivariable regional-regression analysis. Contributing drainage area is the explanatory variable used in the single-variable equations. Contributing drainage area, main-channel slope, and a climate factor are the explanatory variables used in the multivariable equations. Deleted-residual standard error for the single-variable equations ranged from 32 to 65 percent. Deleted-residual standard error for the multivariable equations ranged from 31 to 63 percent. These equations are included in the computer application to allow easy comparison of results produced by the different methods. The region-of-influence method calculates multivariable regression equations for each ungaged site and recurrence interval using basin characteristics from 60 similar sites selected from the study area. Explanatory variables that may be used in regression equations computed by the region-of-influence method include contributing drainage area, main-channel slope, a climate factor, and a physiographic-region factor. Deleted-residual standard error for the region-of-influence method tended to be only slightly smaller than those for the regional-regression method and ranged from 27 to 62 percent.
Olson, Scott A.; with a section by Veilleux, Andrea G.
2014-01-01
This report provides estimates of flood discharges at selected annual exceedance probabilities (AEPs) for streamgages in and adjacent to Vermont and equations for estimating flood discharges at AEPs of 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent (recurrence intervals of 2-, 5-, 10-, 25-, 50-, 100-, 200-, and 500-years, respectively) for ungaged, unregulated, rural streams in Vermont. The equations were developed using generalized least-squares regression. Flood-frequency and drainage-basin characteristics from 145 streamgages were used in developing the equations. The drainage-basin characteristics used as explanatory variables in the regression equations include drainage area, percentage of wetland area, and the basin-wide mean of the average annual precipitation. The average standard errors of prediction for estimating the flood discharges at the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent AEP with these equations are 34.9, 36.0, 38.7, 42.4, 44.9, 47.3, 50.7, and 55.1 percent, respectively. Flood discharges at selected AEPs for streamgages were computed by using the Expected Moments Algorithm. To improve estimates of the flood discharges for given exceedance probabilities at streamgages in Vermont, a new generalized skew coefficient was developed. The new generalized skew for the region is a constant, 0.44. The mean square error of the generalized skew coefficient is 0.078. This report describes a technique for using results from the regression equations to adjust an AEP discharge computed from a streamgage record. This report also describes a technique for using a drainage-area adjustment to estimate flood discharge at a selected AEP for an ungaged site upstream or downstream from a streamgage. The final regression equations and the flood-discharge frequency data used in this study will be available in StreamStats. StreamStats is a World Wide Web application providing automated regression-equation solutions for user-selected sites on streams.
Managing Salary Equity. AIR Forum 1981 Paper.
ERIC Educational Resources Information Center
Prather, James E.; Posey, Ellen I.
Technical considerations in the development of a salary equity model based upon regression analysis are reviewed, and a simplified salary prediction equation is examined. Application and communication of the results of the analysis within the existing operational context of a postsecondary institution are also addressed. The literature is…
Teams as a Learning Forum for Accounting Professionals.
ERIC Educational Resources Information Center
Kleinman, Gary; Siegel, Philip; Eckstein, Claire
2002-01-01
Hierarchical regression and structural equation modeling of data from 440 accountants found that social interaction in work teams fosters individual, organizational, and team learning. Organizational and personal learning mediated the relationship between team social interaction processes and the attitudinal outcomes, but team learning did not.…
Analysis of Sting Balance Calibration Data Using Optimized Regression Models
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert; Bader, Jon B.
2009-01-01
Calibration data of a wind tunnel sting balance was processed using a search algorithm that identifies an optimized regression model for the data analysis. The selected sting balance had two moment gages that were mounted forward and aft of the balance moment center. The difference and the sum of the two gage outputs were fitted in the least squares sense using the normal force and the pitching moment at the balance moment center as independent variables. The regression model search algorithm predicted that the difference of the gage outputs should be modeled using the intercept and the normal force. The sum of the two gage outputs, on the other hand, should be modeled using the intercept, the pitching moment, and the square of the pitching moment. Equations of the deflection of a cantilever beam are used to show that the search algorithm s two recommended math models can also be obtained after performing a rigorous theoretical analysis of the deflection of the sting balance under load. The analysis of the sting balance calibration data set is a rare example of a situation when regression models of balance calibration data can directly be derived from first principles of physics and engineering. In addition, it is interesting to see that the search algorithm recommended the same regression models for the data analysis using only a set of statistical quality metrics.
Ries, Kernell G.; Crouse, Michele Y.
2002-01-01
For many years, the U.S. Geological Survey (USGS) has been developing regional regression equations for estimating flood magnitude and frequency at ungaged sites. These regression equations are used to transfer flood characteristics from gaged to ungaged sites through the use of watershed and climatic characteristics as explanatory or predictor variables. Generally, these equations have been developed on a Statewide or metropolitan-area basis as part of cooperative study programs with specific State Departments of Transportation. In 1994, the USGS released a computer program titled the National Flood Frequency Program (NFF), which compiled all the USGS available regression equations for estimating the magnitude and frequency of floods in the United States and Puerto Rico. NFF was developed in cooperation with the Federal Highway Administration and the Federal Emergency Management Agency. Since the initial release of NFF, the USGS has produced new equations for many areas of the Nation. A new version of NFF has been developed that incorporates these new equations and provides additional functionality and ease of use. NFF version 3 provides regression-equation estimates of flood-peak discharges for unregulated rural and urban watersheds, flood-frequency plots, and plots of typical flood hydrographs for selected recurrence intervals. The Program also provides weighting techniques to improve estimates of flood-peak discharges for gaging stations and ungaged sites. The information provided by NFF should be useful to engineers and hydrologists for planning and design applications. This report describes the flood-regionalization techniques used in NFF and provides guidance on the applicability and limitations of the techniques. The NFF software and the documentation for the regression equations included in NFF are available at http://water.usgs.gov/software/nff.html.
Estimation of Missing Water-Level Data for the Everglades Depth Estimation Network (EDEN)
Conrads, Paul; Petkewich, Matthew D.
2009-01-01
The Everglades Depth Estimation Network (EDEN) is an integrated network of real-time water-level gaging stations, ground-elevation models, and water-surface elevation models designed to provide scientists, engineers, and water-resource managers with current (2000-2009) water-depth information for the entire freshwater portion of the greater Everglades. The U.S. Geological Survey Greater Everglades Priority Ecosystems Science provides support for EDEN and their goal of providing quality-assured monitoring data for the U.S. Army Corps of Engineers Comprehensive Everglades Restoration Plan. To increase the accuracy of the daily water-surface elevation model, water-level estimation equations were developed to fill missing data. To minimize the occurrences of no estimation of data due to missing data for an input station, a minimum of three linear regression equations were developed for each station using different input stations. Of the 726 water-level estimation equations developed to fill missing data at 239 stations, more than 60 percent of the equations have coefficients of determination greater than 0.90, and 92 percent have an coefficient of determination greater than 0.70.
Forecasting municipal solid waste generation using prognostic tools and regression analysis.
Ghinea, Cristina; Drăgoi, Elena Niculina; Comăniţă, Elena-Diana; Gavrilescu, Marius; Câmpean, Teofil; Curteanu, Silvia; Gavrilescu, Maria
2016-11-01
For an adequate planning of waste management systems the accurate forecast of waste generation is an essential step, since various factors can affect waste trends. The application of predictive and prognosis models are useful tools, as reliable support for decision making processes. In this paper some indicators such as: number of residents, population age, urban life expectancy, total municipal solid waste were used as input variables in prognostic models in order to predict the amount of solid waste fractions. We applied Waste Prognostic Tool, regression analysis and time series analysis to forecast municipal solid waste generation and composition by considering the Iasi Romania case study. Regression equations were determined for six solid waste fractions (paper, plastic, metal, glass, biodegradable and other waste). Accuracy Measures were calculated and the results showed that S-curve trend model is the most suitable for municipal solid waste (MSW) prediction. Copyright © 2016 Elsevier Ltd. All rights reserved.
Regression rate behaviors of HTPB-based propellant combinations for hybrid rocket motor
NASA Astrophysics Data System (ADS)
Sun, Xingliang; Tian, Hui; Li, Yuelong; Yu, Nanjia; Cai, Guobiao
2016-02-01
The purpose of this paper is to characterize the regression rate behavior of hybrid rocket motor propellant combinations, using hydrogen peroxide (HP), gaseous oxygen (GOX), nitrous oxide (N2O) as the oxidizer and hydroxyl-terminated poly-butadiene (HTPB) as the based fuel. In order to complete this research by experiment and simulation, a hybrid rocket motor test system and a numerical simulation model are established. Series of hybrid rocket motor firing tests are conducted burning different propellant combinations, and several of those are used as references for numerical simulations. The numerical simulation model is developed by combining the Navies-Stokes equations with the turbulence model, one-step global reaction model, and solid-gas coupling model. The distribution of regression rate along the axis is determined by applying simulation mode to predict the combustion process and heat transfer inside the hybrid rocket motor. The time-space averaged regression rate has a good agreement between the numerical value and experimental data. The results indicate that the N2O/HTPB and GOX/HTPB propellant combinations have a higher regression rate, since the enhancement effect of latter is significant due to its higher flame temperature. Furthermore, the containing of aluminum (Al) and/or ammonium perchlorate(AP) in the grain does enhance the regression rate, mainly due to the more energy released inside the chamber and heat feedback to the grain surface by the aluminum combustion.
Lee, Yeonok; Wu, Hulin
2012-01-01
Differential equation models are widely used for the study of natural phenomena in many fields. The study usually involves unknown factors such as initial conditions and/or parameters. It is important to investigate the impact of unknown factors (parameters and initial conditions) on model outputs in order to better understand the system the model represents. Apportioning the uncertainty (variation) of output variables of a model according to the input factors is referred to as sensitivity analysis. In this paper, we focus on the global sensitivity analysis of ordinary differential equation (ODE) models over a time period using the multivariate adaptive regression spline (MARS) as a meta model based on the concept of the variance of conditional expectation (VCE). We suggest to evaluate the VCE analytically using the MARS model structure of univariate tensor-product functions which is more computationally efficient. Our simulation studies show that the MARS model approach performs very well and helps to significantly reduce the computational cost. We present an application example of sensitivity analysis of ODE models for influenza infection to further illustrate the usefulness of the proposed method.
Ahearn, Elizabeth A.
2004-01-01
Multiple linear-regression equations were developed to estimate the magnitudes of floods in Connecticut for recurrence intervals ranging from 2 to 500 years. The equations can be used for nonurban, unregulated stream sites in Connecticut with drainage areas ranging from about 2 to 715 square miles. Flood-frequency data and hydrologic characteristics from 70 streamflow-gaging stations and the upstream drainage basins were used to develop the equations. The hydrologic characteristics?drainage area, mean basin elevation, and 24-hour rainfall?are used in the equations to estimate the magnitude of floods. Average standard errors of prediction for the equations are 31.8, 32.7, 34.4, 35.9, 37.6 and 45.0 percent for the 2-, 10-, 25-, 50-, 100-, and 500-year recurrence intervals, respectively. Simplified equations using only one hydrologic characteristic?drainage area?also were developed. The regression analysis is based on generalized least-squares regression techniques. Observed flows (log-Pearson Type III analysis of the annual maximum flows) from five streamflow-gaging stations in urban basins in Connecticut were compared to flows estimated from national three-parameter and seven-parameter urban regression equations. The comparison shows that the three- and seven- parameter equations used in conjunction with the new statewide equations generally provide reasonable estimates of flood flows for urban sites in Connecticut, although a national urban flood-frequency study indicated that the three-parameter equations significantly underestimated flood flows in many regions of the country. Verification of the accuracy of the three-parameter or seven-parameter national regression equations using new data from Connecticut stations was beyond the scope of this study. A technique for calculating flood flows at streamflow-gaging stations using a weighted average also is described. Two estimates of flood flows?one estimate based on the log-Pearson Type III analyses of the annual maximum flows at the gaging station, and the other estimate from the regression equation?are weighted together based on the years of record at the gaging station and the equivalent years of record value determined from the regression. Weighted averages of flood flows for the 2-, 10-, 25-, 50-, 100-, and 500-year recurrence intervals are tabulated for the 70 streamflow-gaging stations used in the regression analysis. Generally, weighted averages give the most accurate estimate of flood flows at gaging stations. An evaluation of the Connecticut's streamflow-gaging network was performed to determine whether the spatial coverage and range of geographic and hydrologic conditions are adequately represented for transferring flood characteristics from gaged to ungaged sites. Fifty-one of 54 stations in the current (2004) network support one or more flood needs of federal, state, and local agencies. Twenty-five of 54 stations in the current network are considered high-priority stations by the U.S. Geological Survey because of their contribution to the longterm understanding of floods, and their application for regionalflood analysis. Enhancements to the network to improve overall effectiveness for regionalization can be made by increasing the spatial coverage of gaging stations, establishing stations in regions of the state that are not well-represented, and adding stations in basins with drainage area sizes not represented. Additionally, the usefulness of the network for characterizing floods can be maintained and improved by continuing operation at the current stations because flood flows can be more accurately estimated at stations with continuous, long-term record.
Age Estimation of Infants Through Metric Analysis of Developing Anterior Deciduous Teeth.
Viciano, Joan; De Luca, Stefano; Irurita, Javier; Alemán, Inmaculada
2018-01-01
This study provides regression equations for estimation of age of infants from the dimensions of their developing deciduous teeth. The sample comprises 97 individuals of known sex and age (62 boys, 35 girls), aged between 2 days and 1,081 days. The age-estimation equations were obtained for the sexes combined, as well as for each sex separately, thus including "sex" as an independent variable. The values of the correlations and determination coefficients obtained for each regression equation indicate good fits for most of the equations obtained. The "sex" factor was statistically significant when included as an independent variable in seven of the regression equations. However, the "sex" factor provided an advantage for age estimation in only three of the equations, compared to those that did not include "sex" as a factor. These data suggest that the ages of infants can be accurately estimated from measurements of their developing deciduous teeth. © 2017 American Academy of Forensic Sciences.
NASA Technical Reports Server (NTRS)
Rango, A.; Salomonson, V. V.; Foster, J. L.
1975-01-01
Low resolution meteorological satellite and high resolution earth resources satellite data were used to map snowcovered area over the upper Indus River and the Wind River Mountains of Wyoming, respectively. For the Indus River, early Spring snowcovered area was extracted and related to April through June streamflow from 1967-1971 using a regression equation. Composited results from two years of data over seven Wind River Mountain watersheds indicated that LANDSAT-1 snowcover observations, separated on the basis of watershed elevation, could also be related to runoff in significant regression equations. It appears that earth resources satellite data will be useful in assisting in the prediction of seasonal streamflow for various water resources applications, nonhazardous collection of snow data from restricted-access areas, and in hydrologic modeling of snowmelt runoff.
Comparison of methods for the analysis of relatively simple mediation models.
Rijnhart, Judith J M; Twisk, Jos W R; Chinapaw, Mai J M; de Boer, Michiel R; Heymans, Martijn W
2017-09-01
Statistical mediation analysis is an often used method in trials, to unravel the pathways underlying the effect of an intervention on a particular outcome variable. Throughout the years, several methods have been proposed, such as ordinary least square (OLS) regression, structural equation modeling (SEM), and the potential outcomes framework. Most applied researchers do not know that these methods are mathematically equivalent when applied to mediation models with a continuous mediator and outcome variable. Therefore, the aim of this paper was to demonstrate the similarities between OLS regression, SEM, and the potential outcomes framework in three mediation models: 1) a crude model, 2) a confounder-adjusted model, and 3) a model with an interaction term for exposure-mediator interaction. Secondary data analysis of a randomized controlled trial that included 546 schoolchildren. In our data example, the mediator and outcome variable were both continuous. We compared the estimates of the total, direct and indirect effects, proportion mediated, and 95% confidence intervals (CIs) for the indirect effect across OLS regression, SEM, and the potential outcomes framework. OLS regression, SEM, and the potential outcomes framework yielded the same effect estimates in the crude mediation model, the confounder-adjusted mediation model, and the mediation model with an interaction term for exposure-mediator interaction. Since OLS regression, SEM, and the potential outcomes framework yield the same results in three mediation models with a continuous mediator and outcome variable, researchers can continue using the method that is most convenient to them.
Simulating sunflower canopy temperatures to infer root-zone soil water potential
NASA Technical Reports Server (NTRS)
Choudhury, B. J.; Idso, S. B.
1983-01-01
A soil-plant-atmosphere model for sunflower (Helianthus annuus L.), together with clear sky weather data for several days, is used to study the relationship between canopy temperature and root-zone soil water potential. Considering the empirical dependence of stomatal resistance on insolation, air temperature and leaf water potential, a continuity equation for water flux in the soil-plant-atmosphere system is solved for the leaf water potential. The transpirational flux is calculated using Monteith's combination equation, while the canopy temperature is calculated from the energy balance equation. The simulation shows that, at high soil water potentials, canopy temperature is determined primarily by air and dew point temperatures. These results agree with an empirically derived linear regression equation relating canopy-air temperature differential to air vapor pressure deficit. The model predictions of leaf water potential are also in agreement with observations, indicating that measurements of canopy temperature together with a knowledge of air and dew point temperatures can provide a reliable estimate of the root-zone soil water potential.
ERIC Educational Resources Information Center
Kapes, Jerome T.; And Others
Three models of multiple regression analysis (MRA): single equation, commonality analysis, and path analysis, were applied to longitudinal data from the Pennsylvania Vocational Development Study. Variables influencing weekly income of vocational education students one year after high school graduation were examined: grade point averages (grades…
Estimating Causal Effects in Mediation Analysis Using Propensity Scores
ERIC Educational Resources Information Center
Coffman, Donna L.
2011-01-01
Mediation is usually assessed by a regression-based or structural equation modeling (SEM) approach that we refer to as the classical approach. This approach relies on the assumption that there are no confounders that influence both the mediator, "M", and the outcome, "Y". This assumption holds if individuals are randomly…
ERIC Educational Resources Information Center
Stapleton, Laura M.
2008-01-01
This article discusses replication sampling variance estimation techniques that are often applied in analyses using data from complex sampling designs: jackknife repeated replication, balanced repeated replication, and bootstrapping. These techniques are used with traditional analyses such as regression, but are currently not used with structural…
Straub, David E.; Ebner, Andrew D.
2011-01-01
The USGS, in cooperation with the Chippewa Subdistrict of the Muskingum Watershed Conservancy District, performed hydrologic and hydraulic analyses for selected reaches of three streams in Medina, Wayne, Stark, and Summit Counties in northeast Ohio: Chippewa Creek, Little Chippewa Creek, and River Styx. This study was done to facilitate assessment of various alternatives for mitigating flood hazards in the Chippewa Creek basin. StreamStats regional regression equations were used to estimate instantaneous peak discharges approximately corresponding to bankfull flows. Explanatory variables used in the regression equations were drainage area, main-channel slope, and storage area. Hydraulic models were developed to determine water-surface profiles along the three stream reaches studied for the bankfull discharges established in the hydrologic analyses. The HEC-RAS step-backwater hydraulic analysis model was used to determine water-surface profiles for the three streams. Starting water-surface elevations for all streams were established using normal depth computations in the HEC-RAS models. Cross-sectional elevation data, hydraulic-structure geometries, and roughness coefficients were collected in the field and (along with peak-discharge estimates) used as input for the models. Reach-averaged reductions in water-surface elevations ranged from 0.11 to 1.29 feet over the four roughness coefficient reduction scenarios.
Tortorelli, Robert L.
1997-01-01
Statewide regression equations for Oklahoma were determined for estimating peak discharge and flood frequency for selected recurrence intervals from 2 to 500 years for ungaged sites on natural unregulated streams. The most significant independent variables required to estimate peak-streamflow frequency for natural unregulated streams in Oklahoma are contributing drainage area, main-channel slope, and mean-annual precipitation. The regression equations are applicable for watersheds with drainage areas less than 2,510 square miles that are not affected by regulation from manmade works. Limitations on the use of the regression relations and the reliability of regression estimates for natural unregulated streams are discussed. Log-Pearson Type III analysis information, basin and climatic characteristics, and the peak-stream-flow frequency estimates for 251 gaging stations in Oklahoma and adjacent states are listed. Techniques are presented to make a peak-streamflow frequency estimate for gaged sites on natural unregulated streams and to use this result to estimate a nearby ungaged site on the same stream. For ungaged sites on urban streams, an adjustment of the statewide regression equations for natural unregulated streams can be used to estimate peak-streamflow frequency. For ungaged sites on streams regulated by small floodwater retarding structures, an adjustment of the statewide regression equations for natural unregulated streams can be used to estimate peak-streamflow frequency. The statewide regression equations are adjusted by substituting the drainage area below the floodwater retarding structures, or drainage area that represents the percentage of the unregulated basin, in the contributing drainage area parameter to obtain peak-streamflow frequency estimates.
NASA Astrophysics Data System (ADS)
Bo, Z.; Chen, J. H.
2010-02-01
The dimensional analysis technique is used to formulate a correlation between ozone generation rate and various parameters that are important in the design and operation of positive wire-to-plate corona discharges in indoor air. The dimensionless relation is determined by linear regression analysis based on the results from 36 laboratory-scale experiments. The derived equation is validated by experimental data and a numerical model published in the literature. Applications of such derived equation are illustrated through an example selection of the appropriate set of operating conditions in the design/operation of a photocopier to follow the federal regulations of ozone emission. Finally, a new current-voltage characteristic equation is proposed for positive wire-to-plate corona discharges based on the derived dimensionless equation.
Further analytical study of hybrid rocket combustion
NASA Technical Reports Server (NTRS)
Hung, W. S. Y.; Chen, C. S.; Haviland, J. K.
1972-01-01
Analytical studies of the transient and steady-state combustion processes in a hybrid rocket system are discussed. The particular system chosen consists of a gaseous oxidizer flowing within a tube of solid fuel, resulting in a heterogeneous combustion. Finite rate chemical kinetics with appropriate reaction mechanisms were incorporated in the model. A temperature dependent Arrhenius type fuel surface regression rate equation was chosen for the current study. The governing mathematical equations employed for the reacting gas phase and for the solid phase are the general, two-dimensional, time-dependent conservation equations in a cylindrical coordinate system. Keeping the simplifying assumptions to a minimum, these basic equations were programmed for numerical computation, using two implicit finite-difference schemes, the Lax-Wendroff scheme for the gas phase, and, the Crank-Nicolson scheme for the solid phase.
NASA Astrophysics Data System (ADS)
Zuhdi, Shaifudin; Saputro, Dewi Retno Sari
2017-03-01
GWOLR model used for represent relationship between dependent variable has categories and scale of category is ordinal with independent variable influenced the geographical location of the observation site. Parameters estimation of GWOLR model use maximum likelihood provide system of nonlinear equations and hard to be found the result in analytic resolution. By finishing it, it means determine the maximum completion, this thing associated with optimizing problem. The completion nonlinear system of equations optimize use numerical approximation, which one is Newton Raphson method. The purpose of this research is to make iteration algorithm Newton Raphson and program using R software to estimate GWOLR model. Based on the research obtained that program in R can be used to estimate the parameters of GWOLR model by forming a syntax program with command "while".
Score Estimating Equations from Embedded Likelihood Functions under Accelerated Failure Time Model
NING, JING; QIN, JING; SHEN, YU
2014-01-01
SUMMARY The semiparametric accelerated failure time (AFT) model is one of the most popular models for analyzing time-to-event outcomes. One appealing feature of the AFT model is that the observed failure time data can be transformed to identically independent distributed random variables without covariate effects. We describe a class of estimating equations based on the score functions for the transformed data, which are derived from the full likelihood function under commonly used semiparametric models such as the proportional hazards or proportional odds model. The methods of estimating regression parameters under the AFT model can be applied to traditional right-censored survival data as well as more complex time-to-event data subject to length-biased sampling. We establish the asymptotic properties and evaluate the small sample performance of the proposed estimators. We illustrate the proposed methods through applications in two examples. PMID:25663727
Comparative evaluation of adsorption kinetics of diclofenac and isoproturon by activated carbon.
Torrellas, Silvia A; Rodriguez, Araceli R; Escudero, Gabriel O; Martín, José María G; Rodriguez, Juan G
2015-01-01
Adsorption mechanism of diclofenac and isoproturon onto activated carbon has been proposed using Langmuir and Freundlich isotherms. Adsorption capacity and optimum adsorption isotherms were predicted by nonlinear regression method. Different kinetic equations, pseudo-first-order, pseudo-second-order, intraparticle diffusion model and Bangham kinetic model, were applied to study the adsorption kinetics of emerging contaminants on activated carbon in two aqueous matrices.
Wang, Wei; Young, Bessie A.; Fülöp, Tibor; de Boer, Ian H.; Boulware, L. Ebony; Katz, Ronit; Correa, Adolfo; Griswold, Michael E.
2015-01-01
Background The calibration to Isotope Dilution Mass Spectroscopy (IDMS) traceable creatinine is essential for valid use of the new Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation to estimate the glomerular filtration rate (GFR). Methods For 5,210 participants in the Jackson Heart Study (JHS), serum creatinine was measured with a multipoint enzymatic spectrophotometric assay at the baseline visit (2000–2004) and re-measured using the Roche enzymatic method, traceable to IDMS in a subset of 206 subjects. The 200 eligible samples (6 were excluded, 1 for failure of the re-measurement and 5 for outliers) were divided into three disjoint sets - training, validation, and test - to select a calibration model, estimate true errors, and assess performance of the final calibration equation. The calibration equation was applied to serum creatinine measurements of 5,210 participants to estimate GFR and the prevalence of CKD. Results The selected Deming regression model provided a slope of 0.968 (95% Confidence Interval (CI), 0.904 to 1.053) and intercept of −0.0248 (95% CI, −0.0862 to 0.0366) with R squared 0.9527. Calibrated serum creatinine showed high agreement with actual measurements when applying to the unused test set (concordance correlation coefficient 0.934, 95% CI, 0.894 to 0.960). The baseline prevalence of CKD in the JHS (2000–2004) was 6.30% using calibrated values, compared with 8.29% using non-calibrated serum creatinine with the CKD-EPI equation (P < 0.001). Conclusions A Deming regression model was chosen to optimally calibrate baseline serum creatinine measurements in the JHS and the calibrated values provide a lower CKD prevalence estimate. PMID:25806862
Wang, Wei; Young, Bessie A; Fülöp, Tibor; de Boer, Ian H; Boulware, L Ebony; Katz, Ronit; Correa, Adolfo; Griswold, Michael E
2015-05-01
The calibration to isotope dilution mass spectrometry-traceable creatinine is essential for valid use of the new Chronic Kidney Disease Epidemiology Collaboration equation to estimate the glomerular filtration rate. For 5,210 participants in the Jackson Heart Study (JHS), serum creatinine was measured with a multipoint enzymatic spectrophotometric assay at the baseline visit (2000-2004) and remeasured using the Roche enzymatic method, traceable to isotope dilution mass spectrometry in a subset of 206 subjects. The 200 eligible samples (6 were excluded, 1 for failure of the remeasurement and 5 for outliers) were divided into 3 disjoint sets-training, validation and test-to select a calibration model, estimate true errors and assess performance of the final calibration equation. The calibration equation was applied to serum creatinine measurements of 5,210 participants to estimate glomerular filtration rate and the prevalence of chronic kidney disease (CKD). The selected Deming regression model provided a slope of 0.968 (95% confidence interval [CI], 0.904-1.053) and intercept of -0.0248 (95% CI, -0.0862 to 0.0366) with R value of 0.9527. Calibrated serum creatinine showed high agreement with actual measurements when applying to the unused test set (concordance correlation coefficient 0.934, 95% CI, 0.894-0.960). The baseline prevalence of CKD in the JHS (2000-2004) was 6.30% using calibrated values compared with 8.29% using noncalibrated serum creatinine with the Chronic Kidney Disease Epidemiology Collaboration equation (P < 0.001). A Deming regression model was chosen to optimally calibrate baseline serum creatinine measurements in the JHS, and the calibrated values provide a lower CKD prevalence estimate.
Systolic time interval v heart rate regression equations using atropine: reproducibility studies.
Kelman, A W; Sumner, D J; Whiting, B
1981-01-01
1. Systolic time intervals (STI) were recorded in six normal male subjects over a period of 3 weeks. On one day per week, each subject received incremental doses of atropine intravenously to increase heart rate, allowing the determination of individual STI v HR regression equations. On the other days STI were recorded with the subjects resting, in the supine position. 2. There were highly significant regression relationships between heart rate and both LVET and QS2, but not between heart rate and PEP. 3. The regression relationships showed little intra-subject variability, but a large degree of inter-subject variability: they proved adequate to correct the STI for the daily fluctuations in heart rate. 4. Administration of small doses of atropine intravenously provides a satisfactory and convenient method of deriving individual STI v HR regression equations which can be applied over a period of weeks. PMID:7248136
Systolic time interval v heart rate regression equations using atropine: reproducibility studies.
Kelman, A W; Sumner, D J; Whiting, B
1981-07-01
1. Systolic time intervals (STI) were recorded in six normal male subjects over a period of 3 weeks. On one day per week, each subject received incremental doses of atropine intravenously to increase heart rate, allowing the determination of individual STI v HR regression equations. On the other days STI were recorded with the subjects resting, in the supine position. 2. There were highly significant regression relationships between heart rate and both LVET and QS2, but not between heart rate and PEP. 3. The regression relationships showed little intra-subject variability, but a large degree of inter-subject variability: they proved adequate to correct the STI for the daily fluctuations in heart rate. 4. Administration of small doses of atropine intravenously provides a satisfactory and convenient method of deriving individual STI v HR regression equations which can be applied over a period of weeks.
Sauser Zachrison, Kori; Iwashyna, Theodore J; Gebremariam, Achamyeleh; Hutchins, Meghan; Lee, Joyce M
2016-12-28
Connected individuals (or nodes) in a network are more likely to be similar than two randomly selected nodes due to homophily and/or network influence. Distinguishing between these two influences is an important goal in network analysis, and generalized estimating equation (GEE) analyses of longitudinal dyadic network data are an attractive approach. It is not known to what extent such regressions can accurately extract underlying data generating processes. Therefore our primary objective is to determine to what extent, and under what conditions, does the GEE-approach recreate the actual dynamics in an agent-based model. We generated simulated cohorts with pre-specified network characteristics and attachments in both static and dynamic networks, and we varied the presence of homophily and network influence. We then used statistical regression and examined the GEE model performance in each cohort to determine whether the model was able to detect the presence of homophily and network influence. In cohorts with both static and dynamic networks, we find that the GEE models have excellent sensitivity and reasonable specificity for determining the presence or absence of network influence, but little ability to distinguish whether or not homophily is present. The GEE models are a valuable tool to examine for the presence of network influence in longitudinal data, but are quite limited with respect to homophily.
Estimating Causal Effects with Ancestral Graph Markov Models
Malinsky, Daniel; Spirtes, Peter
2017-01-01
We present an algorithm for estimating bounds on causal effects from observational data which combines graphical model search with simple linear regression. We assume that the underlying system can be represented by a linear structural equation model with no feedback, and we allow for the possibility of latent variables. Under assumptions standard in the causal search literature, we use conditional independence constraints to search for an equivalence class of ancestral graphs. Then, for each model in the equivalence class, we perform the appropriate regression (using causal structure information to determine which covariates to include in the regression) to estimate a set of possible causal effects. Our approach is based on the “IDA” procedure of Maathuis et al. (2009), which assumes that all relevant variables have been measured (i.e., no unmeasured confounders). We generalize their work by relaxing this assumption, which is often violated in applied contexts. We validate the performance of our algorithm on simulated data and demonstrate improved precision over IDA when latent variables are present. PMID:28217244
NASA Technical Reports Server (NTRS)
Boyce, Lola; Bast, Callie C.
1992-01-01
The research included ongoing development of methodology that provides probabilistic lifetime strength of aerospace materials via computational simulation. A probabilistic material strength degradation model, in the form of a randomized multifactor interaction equation, is postulated for strength degradation of structural components of aerospace propulsion systems subjected to a number of effects or primative variables. These primative variable may include high temperature, fatigue or creep. In most cases, strength is reduced as a result of the action of a variable. This multifactor interaction strength degradation equation has been randomized and is included in the computer program, PROMISS. Also included in the research is the development of methodology to calibrate the above described constitutive equation using actual experimental materials data together with linear regression of that data, thereby predicting values for the empirical material constraints for each effect or primative variable. This regression methodology is included in the computer program, PROMISC. Actual experimental materials data were obtained from the open literature for materials typically of interest to those studying aerospace propulsion system components. Material data for Inconel 718 was analyzed using the developed methodology.
Wang, Shan-Shan; Zhang, Yu-Qi; Chen, Shu-Bao; Huang, Guo-Ying; Zhang, Hong-Yan; Zhang, Zhi-Fang; Wu, Lan-Ping; Hong, Wen-Jing; Shen, Rong; Liu, Yi-Qing; Zhu, Jun-Xue
2017-06-01
Clinical decision making in children with congenital and acquired heart disease relies on measurements of cardiac structures using two-dimensional echocardiography. We aimed to establish z-score regression equations for right heart structures in healthy Chinese Han children. Two-dimensional and M-mode echocardiography was performed in 515 patients. We measured the dimensions of the pulmonary valve annulus (PVA), main pulmonary artery (MPA), left pulmonary artery (LPA), right pulmonary artery (RPA), right ventricular outflow tract at end-diastole (RVOTd) and at end-systole (RVOTs), tricuspid valve annulus (TVA), right ventricular inflow tract at end-diastole (RVIDd) and at end-systole (RVIDs), and right atrium (RA). Regression analyses were conducted to relate the measurements of right heart structures to 4body surface area (BSA). Right ventricular outflow-tract fractional shortening (RVOTFS) was also calculated. Several models were used, and the best model was chosen to establish a z-score calculator. PVA, MPA, LPA, RPA, RVOTd, RVOTs, TVA, RVIDd, RVIDs, and RA (R 2 = 0.786, 0.705, 0.728, 0.701, 0.706, 0.824, 0.804, 0.663, 0.626, and 0.793, respectively) had a cubic polynomial relationship with BSA; specifically, measurement (M) = β0 + β1 × BSA + β2 × BSA 2 + β3 × BSA. 3 RVOTFS (0.28 ± 0.02) fell within a narrow range (0.12-0.51). Our results provide reference values for z scores and regression equations for right heart structures in Han Chinese children. These data may help interpreting the routine clinical measurement of right heart structures in children with congenital or acquired heart disease. © 2016 Wiley Periodicals, Inc. J Clin Ultrasound 45:293-303, 2017. © 2017 Wiley Periodicals, Inc.
Weather adjustment using seemingly unrelated regression
DOE Office of Scientific and Technical Information (OSTI.GOV)
Noll, T.A.
1995-05-01
Seemingly unrelated regression (SUR) is a system estimation technique that accounts for time-contemporaneous correlation between individual equations within a system of equations. SUR is suited to weather adjustment estimations when the estimation is: (1) composed of a system of equations and (2) the system of equations represents either different weather stations, different sales sectors or a combination of different weather stations and different sales sectors. SUR utilizes the cross-equation error values to develop more accurate estimates of the system coefficients than are obtained using ordinary least-squares (OLS) estimation. SUR estimates can be generated using a variety of statistical software packagesmore » including MicroTSP and SAS.« less
Saucedo-Reyes, Daniela; Carrillo-Salazar, José A; Román-Padilla, Lizbeth; Saucedo-Veloz, Crescenciano; Reyes-Santamaría, María I; Ramírez-Gilly, Mariana; Tecante, Alberto
2018-03-01
High hydrostatic pressure inactivation kinetics of Escherichia coli ATCC 25922 and Salmonella enterica subsp. enterica serovar Typhimurium ATCC 14028 ( S. typhimurium) in a low acid mamey pulp at four pressure levels (300, 350, 400, and 450 MPa), different exposure times (0-8 min), and temperature of 25 ± 2℃ were obtained. Survival curves showed deviations from linearity in the form of a tail (upward concavity). The primary models tested were the Weibull model, the modified Gompertz equation, and the biphasic model. The Weibull model gave the best goodness of fit ( R 2 adj > 0.956, root mean square error < 0.290) in the modeling and the lowest Akaike information criterion value. Exponential-logistic and exponential decay models, and Bigelow-type and an empirical models for b'( P) and n( P) parameters, respectively, were tested as alternative secondary models. The process validation considered the two- and one-step nonlinear regressions for making predictions of the survival fraction; both regression types provided an adequate goodness of fit and the one-step nonlinear regression clearly reduced fitting errors. The best candidate model according to the Akaike theory information, with better accuracy and more reliable predictions was the Weibull model integrated by the exponential-logistic and exponential decay secondary models as a function of time and pressure (two-step procedure) or incorporated as one equation (one-step procedure). Both mathematical expressions were used to determine the t d parameter, where the desired reductions ( 5D) (considering d = 5 ( t 5 ) as the criterion of 5 Log 10 reduction (5 D)) in both microorganisms are attainable at 400 MPa for 5.487 ± 0.488 or 5.950 ± 0.329 min, respectively, for the one- or two-step nonlinear procedure.
Wei, Chang-Na; Zhou, Qing-He; Wang, Li-Zhong
2017-01-01
Abstract Currently, there is no consensus on how to determine the optimal dose of intrathecal bupivacaine for an individual undergoing an elective cesarean section. In this study, we developed a regression equation between intrathecal 0.5% hyperbaric bupivacaine volume and abdominal girth and vertebral column length, to determine a suitable block level (T5) for elective cesarean section patients. In phase I, we analyzed 374 parturients undergoing an elective cesarean section that received a suitable dose of intrathecal 0.5% hyperbaric bupivacaine after a combined spinal-epidural (CSE) was performed at the L3/4 interspace. Parturients with T5 blockade to pinprick were selected for establishing the regression equation between 0.5% hyperbaric bupivacaine volume and vertebral column length and abdominal girth. Six parturient and neonatal variables, intrathecal 0.5% hyperbaric bupivacaine volume, and spinal anesthesia spread were recorded. Bivariate line correlation analyses, multiple line regression analyses, and 2-tailed t tests or chi-square test were performed, as appropriate. In phase II, another 200 parturients with CSE for elective cesarean section were enrolled to verify the accuracy of the regression equation. In phase I, a total of 143 parturients were selected to establish the following regression equation: YT5 = 0.074X1 − 0.022X2 − 0.017 (YT5 = 0.5% hyperbaric bupivacaine volume for T5 block level; X1 = vertebral column length; and X2 = abdominal girth). In phase II, a total of 189 participants were enrolled in the study to verify the accuracy of the regression equation, and 155 parturients with T5 blockade were deemed eligible, which accounted for 82.01% of all participants. This study evaluated parturients with T5 blockade to pinprick after a CSE for elective cesarean section to establish a regression equation between parturient vertebral column length and abdominal girth and 0.5% hyperbaric intrathecal bupivacaine volume. This equation can accurately predict the suitable intrathecal hyperbaric bupivacaine dose for elective cesarean section. PMID:28834913
Wei, Chang-Na; Zhou, Qing-He; Wang, Li-Zhong
2017-08-01
Currently, there is no consensus on how to determine the optimal dose of intrathecal bupivacaine for an individual undergoing an elective cesarean section. In this study, we developed a regression equation between intrathecal 0.5% hyperbaric bupivacaine volume and abdominal girth and vertebral column length, to determine a suitable block level (T5) for elective cesarean section patients.In phase I, we analyzed 374 parturients undergoing an elective cesarean section that received a suitable dose of intrathecal 0.5% hyperbaric bupivacaine after a combined spinal-epidural (CSE) was performed at the L3/4 interspace. Parturients with T5 blockade to pinprick were selected for establishing the regression equation between 0.5% hyperbaric bupivacaine volume and vertebral column length and abdominal girth. Six parturient and neonatal variables, intrathecal 0.5% hyperbaric bupivacaine volume, and spinal anesthesia spread were recorded. Bivariate line correlation analyses, multiple line regression analyses, and 2-tailed t tests or chi-square test were performed, as appropriate. In phase II, another 200 parturients with CSE for elective cesarean section were enrolled to verify the accuracy of the regression equation.In phase I, a total of 143 parturients were selected to establish the following regression equation: YT5 = 0.074X1 - 0.022X2 - 0.017 (YT5 = 0.5% hyperbaric bupivacaine volume for T5 block level; X1 = vertebral column length; and X2 = abdominal girth). In phase II, a total of 189 participants were enrolled in the study to verify the accuracy of the regression equation, and 155 parturients with T5 blockade were deemed eligible, which accounted for 82.01% of all participants.This study evaluated parturients with T5 blockade to pinprick after a CSE for elective cesarean section to establish a regression equation between parturient vertebral column length and abdominal girth and 0.5% hyperbaric intrathecal bupivacaine volume. This equation can accurately predict the suitable intrathecal hyperbaric bupivacaine dose for elective cesarean section.
O'Neill, William; Penn, Richard; Werner, Michael; Thomas, Justin
2015-06-01
Estimation of stochastic process models from data is a common application of time series analysis methods. Such system identification processes are often cast as hypothesis testing exercises whose intent is to estimate model parameters and test them for statistical significance. Ordinary least squares (OLS) regression and the Levenberg-Marquardt algorithm (LMA) have proven invaluable computational tools for models being described by non-homogeneous, linear, stationary, ordinary differential equations. In this paper we extend stochastic model identification to linear, stationary, partial differential equations in two independent variables (2D) and show that OLS and LMA apply equally well to these systems. The method employs an original nonparametric statistic as a test for the significance of estimated parameters. We show gray scale and color images are special cases of 2D systems satisfying a particular autoregressive partial difference equation which estimates an analogous partial differential equation. Several applications to medical image modeling and classification illustrate the method by correctly classifying demented and normal OLS models of axial magnetic resonance brain scans according to subject Mini Mental State Exam (MMSE) scores. Comparison with 13 image classifiers from the literature indicates our classifier is at least 14 times faster than any of them and has a classification accuracy better than all but one. Our modeling method applies to any linear, stationary, partial differential equation and the method is readily extended to 3D whole-organ systems. Further, in addition to being a robust image classifier, estimated image models offer insights into which parameters carry the most diagnostic image information and thereby suggest finer divisions could be made within a class. Image models can be estimated in milliseconds which translate to whole-organ models in seconds; such runtimes could make real-time medicine and surgery modeling possible.
Arsenyev, P A; Trezvov, V V; Saratovskaya, N V
1997-01-01
This work represents a method, which allows to determine phase composition of calcium hydroxylapatite basing on its infrared spectrum. The method uses factor analysis of the spectral data of calibration set of samples to determine minimal number of factors required to reproduce the spectra within experimental error. Multiple linear regression is applied to establish correlation between factor scores of calibration standards and their properties. The regression equations can be used to predict the property value of unknown sample. The regression model was built for determination of beta-tricalcium phosphate content in hydroxylapatite. Statistical estimation of quality of the model was carried out. Application of the factor analysis on spectral data allows to increase accuracy of beta-tricalcium phosphate determination and expand the range of determination towards its less concentration. Reproducibility of results is retained.
NASA Technical Reports Server (NTRS)
Holms, A. G.
1974-01-01
Monte Carlo studies using population models intended to represent response surface applications are reported. Simulated experiments were generated by adding pseudo random normally distributed errors to population values to generate observations. Model equations were fitted to the observations and the decision procedure was used to delete terms. Comparison of values predicted by the reduced models with the true population values enabled the identification of deletion strategies that are approximately optimal for minimizing prediction errors.
Curran, Janet H.; Barth, Nancy A.; Veilleux, Andrea G.; Ourso, Robert T.
2016-03-16
Estimates of the magnitude and frequency of floods are needed across Alaska for engineering design of transportation and water-conveyance structures, flood-insurance studies, flood-plain management, and other water-resource purposes. This report updates methods for estimating flood magnitude and frequency in Alaska and conterminous basins in Canada. Annual peak-flow data through water year 2012 were compiled from 387 streamgages on unregulated streams with at least 10 years of record. Flood-frequency estimates were computed for each streamgage using the Expected Moments Algorithm to fit a Pearson Type III distribution to the logarithms of annual peak flows. A multiple Grubbs-Beck test was used to identify potentially influential low floods in the time series of peak flows for censoring in the flood frequency analysis.For two new regional skew areas, flood-frequency estimates using station skew were computed for stations with at least 25 years of record for use in a Bayesian least-squares regression analysis to determine a regional skew value. The consideration of basin characteristics as explanatory variables for regional skew resulted in improvements in precision too small to warrant the additional model complexity, and a constant model was adopted. Regional Skew Area 1 in eastern-central Alaska had a regional skew of 0.54 and an average variance of prediction of 0.45, corresponding to an effective record length of 22 years. Regional Skew Area 2, encompassing coastal areas bordering the Gulf of Alaska, had a regional skew of 0.18 and an average variance of prediction of 0.12, corresponding to an effective record length of 59 years. Station flood-frequency estimates for study sites in regional skew areas were then recomputed using a weighted skew incorporating the station skew and regional skew. In a new regional skew exclusion area outside the regional skew areas, the density of long-record streamgages was too sparse for regional analysis and station skew was used for all estimates. Final station flood frequency estimates for all study streamgages are presented for the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities.Regional multiple-regression analysis was used to produce equations for estimating flood frequency statistics from explanatory basin characteristics. Basin characteristics, including physical and climatic variables, were updated for all study streamgages using a geographical information system and geospatial source data. Screening for similar-sized nested basins eliminated hydrologically redundant sites, and screening for eligibility for analysis of explanatory variables eliminated regulated peaks, outburst peaks, and sites with indeterminate basin characteristics. An ordinary least‑squares regression used flood-frequency statistics and basin characteristics for 341 streamgages (284 in Alaska and 57 in Canada) to determine the most suitable combination of basin characteristics for a flood-frequency regression model and to explore regional grouping of streamgages for explaining variability in flood-frequency statistics across the study area. The most suitable model for explaining flood frequency used drainage area and mean annual precipitation as explanatory variables for the entire study area as a region. Final regression equations for estimating the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probability discharge in Alaska and conterminous basins in Canada were developed using a generalized least-squares regression. The average standard error of prediction for the regression equations for the various annual exceedance probabilities ranged from 69 to 82 percent, and the pseudo-coefficient of determination (pseudo-R2) ranged from 85 to 91 percent.The regional regression equations from this study were incorporated into the U.S. Geological Survey StreamStats program for a limited area of the State—the Cook Inlet Basin. StreamStats is a national web-based geographic information system application that facilitates retrieval of streamflow statistics and associated information. StreamStats retrieves published data for gaged sites and, for user-selected ungaged sites, delineates drainage areas from topographic and hydrographic data, computes basin characteristics, and computes flood frequency estimates using the regional regression equations.
2015-01-01
The ecological significance of fish and squid of the mesopelagic zone (200 m–1000 m) is evident by their pervasiveness in the diets of a broad spectrum of upper pelagic predators including other fishes and squids, seabirds and marine mammals. As diel vertical migrators, mesopelagic micronekton are recognized as an important trophic link between the deep scattering layer and upper surface waters, yet fundamental aspects of the life history and energetic contribution to the food web for most are undescribed. Here, we present newly derived regression equations for 32 species of mesopelagic fish and squid based on the relationship between body size and the size of hard parts typically used to identify prey species in predator diet studies. We describe the proximate composition and energy density of 31 species collected in the eastern Bering Sea during May 1999 and 2000. Energy values are categorized by body size as a proxy for relative age and can be cross-referenced with the derived regression equations. Data are tabularized to facilitate direct application to predator diet studies and food web models. PMID:26287534
Sinclair, Elizabeth H; Walker, William A; Thomason, James R
2015-01-01
The ecological significance of fish and squid of the mesopelagic zone (200 m-1000 m) is evident by their pervasiveness in the diets of a broad spectrum of upper pelagic predators including other fishes and squids, seabirds and marine mammals. As diel vertical migrators, mesopelagic micronekton are recognized as an important trophic link between the deep scattering layer and upper surface waters, yet fundamental aspects of the life history and energetic contribution to the food web for most are undescribed. Here, we present newly derived regression equations for 32 species of mesopelagic fish and squid based on the relationship between body size and the size of hard parts typically used to identify prey species in predator diet studies. We describe the proximate composition and energy density of 31 species collected in the eastern Bering Sea during May 1999 and 2000. Energy values are categorized by body size as a proxy for relative age and can be cross-referenced with the derived regression equations. Data are tabularized to facilitate direct application to predator diet studies and food web models.
Development of surrogate models for the prediction of the flow around an aircraft propeller
NASA Astrophysics Data System (ADS)
Salpigidou, Christina; Misirlis, Dimitris; Vlahostergios, Zinon; Yakinthos, Kyros
2018-05-01
In the present work, the derivation of two surrogate models (SMs) for modelling the flow around a propeller for small aircrafts is presented. Both methodologies use derived functions based on computations with the detailed propeller geometry. The computations were performed using k-ω shear stress transport for modelling turbulence. In the SMs, the modelling of the propeller was performed in a computational domain of disk-like geometry, where source terms were introduced in the momentum equations. In the first SM, the source terms were polynomial functions of swirl and thrust, mainly related to the propeller radius. In the second SM, regression analysis was used to correlate the source terms with the velocity distribution through the propeller. The proposed SMs achieved faster convergence, in relation to the detail model, by providing also results closer to the available operational data. The regression-based model was the most accurate and required less computational time for convergence.
Nunes, Matheus Henrique
2016-01-01
Tree stem form in native tropical forests is very irregular, posing a challenge to establishing taper equations that can accurately predict the diameter at any height along the stem and subsequently merchantable volume. Artificial intelligence approaches can be useful techniques in minimizing estimation errors within complex variations of vegetation. We evaluated the performance of Random Forest® regression tree and Artificial Neural Network procedures in modelling stem taper. Diameters and volume outside bark were compared to a traditional taper-based equation across a tropical Brazilian savanna, a seasonal semi-deciduous forest and a rainforest. Neural network models were found to be more accurate than the traditional taper equation. Random forest showed trends in the residuals from the diameter prediction and provided the least precise and accurate estimations for all forest types. This study provides insights into the superiority of a neural network, which provided advantages regarding the handling of local effects. PMID:27187074
Nunes, Matheus Henrique; Görgens, Eric Bastos
2016-01-01
Tree stem form in native tropical forests is very irregular, posing a challenge to establishing taper equations that can accurately predict the diameter at any height along the stem and subsequently merchantable volume. Artificial intelligence approaches can be useful techniques in minimizing estimation errors within complex variations of vegetation. We evaluated the performance of Random Forest® regression tree and Artificial Neural Network procedures in modelling stem taper. Diameters and volume outside bark were compared to a traditional taper-based equation across a tropical Brazilian savanna, a seasonal semi-deciduous forest and a rainforest. Neural network models were found to be more accurate than the traditional taper equation. Random forest showed trends in the residuals from the diameter prediction and provided the least precise and accurate estimations for all forest types. This study provides insights into the superiority of a neural network, which provided advantages regarding the handling of local effects.
Tøndel, Kristin; Indahl, Ulf G; Gjuvsland, Arne B; Vik, Jon Olav; Hunter, Peter; Omholt, Stig W; Martens, Harald
2011-06-01
Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. HC-PLSR is a promising approach for metamodelling in systems biology, especially for highly nonlinear or non-monotone parameter to phenotype maps. The algorithm can be flexibly adjusted to suit the complexity of the dynamic model behaviour, inviting automation in the metamodelling of complex systems.
2011-01-01
Background Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Results Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. Conclusions HC-PLSR is a promising approach for metamodelling in systems biology, especially for highly nonlinear or non-monotone parameter to phenotype maps. The algorithm can be flexibly adjusted to suit the complexity of the dynamic model behaviour, inviting automation in the metamodelling of complex systems. PMID:21627852
Over, Thomas M.; Saito, Riki J.; Veilleux, Andrea G.; Sharpe, Jennifer B.; Soong, David T.; Ishii, Audrey L.
2016-06-28
This report provides two sets of equations for estimating peak discharge quantiles at annual exceedance probabilities (AEPs) of 0.50, 0.20, 0.10, 0.04, 0.02, 0.01, 0.005, and 0.002 (recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years, respectively) for watersheds in Illinois based on annual maximum peak discharge data from 117 watersheds in and near northeastern Illinois. One set of equations was developed through a temporal analysis with a two-step least squares-quantile regression technique that measures the average effect of changes in the urbanization of the watersheds used in the study. The resulting equations can be used to adjust rural peak discharge quantiles for the effect of urbanization, and in this study the equations also were used to adjust the annual maximum peak discharges from the study watersheds to 2010 urbanization conditions.The other set of equations was developed by a spatial analysis. This analysis used generalized least-squares regression to fit the peak discharge quantiles computed from the urbanization-adjusted annual maximum peak discharges from the study watersheds to drainage-basin characteristics. The peak discharge quantiles were computed by using the Expected Moments Algorithm following the removal of potentially influential low floods defined by a multiple Grubbs-Beck test. To improve the quantile estimates, regional skew coefficients were obtained from a newly developed regional skew model in which the skew increases with the urbanized land use fraction. The drainage-basin characteristics used as explanatory variables in the spatial analysis include drainage area, the fraction of developed land, the fraction of land with poorly drained soils or likely water, and the basin slope estimated as the ratio of the basin relief to basin perimeter.This report also provides the following: (1) examples to illustrate the use of the spatial and urbanization-adjustment equations for estimating peak discharge quantiles at ungaged sites and to improve flood-quantile estimates at and near a gaged site; (2) the urbanization-adjusted annual maximum peak discharges and peak discharge quantile estimates at streamgages from 181 watersheds including the 117 study watersheds and 64 additional watersheds in the study region that were originally considered for use in the study but later deemed to be redundant.The urbanization-adjustment equations, spatial regression equations, and peak discharge quantile estimates developed in this study will be made available in the web application StreamStats, which provides automated regression-equation solutions for user-selected stream locations. Figures and tables comparing the observed and urbanization-adjusted annual maximum peak discharge records by streamgage are provided at https://doi.org/10.3133/sir20165050 for download.
NASA Astrophysics Data System (ADS)
Wang, Jie; Chen, Li; Yu, Zhongbo
2018-02-01
Rainfall infiltration on hillslopes is an important issue in hydrology, which is related to many environmental problems, such as flood, soil erosion, and nutrient and contaminant transport. This study aimed to improve the quantification of infiltration on hillslopes under both steady and unsteady rainfalls. Starting from Darcy's law, an analytical integral infiltrability equation was derived for hillslope infiltration by use of the flux-concentration relation. Based on this equation, a simple scaling relation linking the infiltration times on hillslopes and horizontal planes was obtained which is applicable for both small and large times and can be used to simplify the solution procedure of hillslope infiltration. The infiltrability equation also improved the estimation of ponding time for infiltration under rainfall conditions. For infiltration after ponding, the time compression approximation (TCA) was applied together with the infiltrability equation. To improve the computational efficiency, the analytical integral infiltrability equation was approximated with a two-term power-like function by nonlinear regression. Procedures of applying this approach to both steady and unsteady rainfall conditions were proposed. To evaluate the performance of the new approach, it was compared with the Green-Ampt model for sloping surfaces by Chen and Young (2006) and Richards' equation. The proposed model outperformed the sloping Green-Ampt, and both ponding time and infiltration predictions agreed well with the solutions of Richards' equation for various soil textures, slope angles, initial water contents, and rainfall intensities for both steady and unsteady rainfalls.
Modelling ecological flow regime: an example from the Tennessee and Cumberland River basins
Knight, Rodney R.; Gain, W. Scott; Wolfe, William J.
2012-01-01
Predictive equations were developed for 19 ecologically relevant streamflow characteristics within five major groups of flow variables (magnitude, ratio, frequency, variability, and date) for use in the Tennessee and Cumberland River basins using stepbackward regression. Basin characteristics explain 50% or more of the variation for 12 of the 19 equations. Independent variables identified through stepbackward regression were statistically significant in 78 of 304 cases (α > 0.0001) and represent four major groups: climate, physical landscape features, regional indicators, and land use. Of these groups, the regional and climate variables were the most influential for determining hydrologic response. Daily temperature range, geologic factor, and rock depth were major factors explaining the variability in 17, 15, and 13 equations, respectively. The equations and independent datasets were used to explore the broad relation between basin properties and streamflow and the implication of streamflow to the study of ecological flow requirements. Key results include a high degree of hydrologic variability among least disturbed Blue Ridge streams, similar hydrologic behaviour for watersheds with widely varying degrees of forest cover, and distinct hydrologic profiles for streams in different geographic regions. Published in 2011. This article is a US Government work and is in the public domain in the USA.
Painter, Colin C.; Heimann, David C.; Lanning-Rush, Jennifer L.
2017-08-14
A study was done by the U.S. Geological Survey in cooperation with the Kansas Department of Transportation and the Federal Emergency Management Agency to develop regression models to estimate peak streamflows of annual exceedance probabilities of 50, 20, 10, 4, 2, 1, 0.5, and 0.2 percent at ungaged locations in Kansas. Peak streamflow frequency statistics from selected streamgages were related to contributing drainage area and average precipitation using generalized least-squares regression analysis. The peak streamflow statistics were derived from 151 streamgages with at least 25 years of streamflow data through 2015. The developed equations can be used to predict peak streamflow magnitude and frequency within two hydrologic regions that were defined based on the effects of irrigation. The equations developed in this report are applicable to streams in Kansas that are not substantially affected by regulation, surface-water diversions, or urbanization. The equations are intended for use for streams with contributing drainage areas ranging from 0.17 to 14,901 square miles in the nonirrigation effects region and, 1.02 to 3,555 square miles in the irrigation-affected region, corresponding to the range of drainage areas of the streamgages used in the development of the regional equations.
Discovering governing equations from data by sparse identification of nonlinear dynamical systems
Brunton, Steven L.; Proctor, Joshua L.; Kutz, J. Nathan
2016-01-01
Extracting governing equations from data is a central challenge in many diverse areas of science and engineering. Data are abundant whereas models often remain elusive, as in climate science, neuroscience, ecology, finance, and epidemiology, to name only a few examples. In this work, we combine sparsity-promoting techniques and machine learning with nonlinear dynamical systems to discover governing equations from noisy measurement data. The only assumption about the structure of the model is that there are only a few important terms that govern the dynamics, so that the equations are sparse in the space of possible functions; this assumption holds for many physical systems in an appropriate basis. In particular, we use sparse regression to determine the fewest terms in the dynamic governing equations required to accurately represent the data. This results in parsimonious models that balance accuracy with model complexity to avoid overfitting. We demonstrate the algorithm on a wide range of problems, from simple canonical systems, including linear and nonlinear oscillators and the chaotic Lorenz system, to the fluid vortex shedding behind an obstacle. The fluid example illustrates the ability of this method to discover the underlying dynamics of a system that took experts in the community nearly 30 years to resolve. We also show that this method generalizes to parameterized systems and systems that are time-varying or have external forcing. PMID:27035946
Discovering governing equations from data by sparse identification of nonlinear dynamical systems.
Brunton, Steven L; Proctor, Joshua L; Kutz, J Nathan
2016-04-12
Extracting governing equations from data is a central challenge in many diverse areas of science and engineering. Data are abundant whereas models often remain elusive, as in climate science, neuroscience, ecology, finance, and epidemiology, to name only a few examples. In this work, we combine sparsity-promoting techniques and machine learning with nonlinear dynamical systems to discover governing equations from noisy measurement data. The only assumption about the structure of the model is that there are only a few important terms that govern the dynamics, so that the equations are sparse in the space of possible functions; this assumption holds for many physical systems in an appropriate basis. In particular, we use sparse regression to determine the fewest terms in the dynamic governing equations required to accurately represent the data. This results in parsimonious models that balance accuracy with model complexity to avoid overfitting. We demonstrate the algorithm on a wide range of problems, from simple canonical systems, including linear and nonlinear oscillators and the chaotic Lorenz system, to the fluid vortex shedding behind an obstacle. The fluid example illustrates the ability of this method to discover the underlying dynamics of a system that took experts in the community nearly 30 years to resolve. We also show that this method generalizes to parameterized systems and systems that are time-varying or have external forcing.
Kennedy, Jeffrey R.; Paretti, Nicholas V.
2014-01-01
Flooding in urban areas routinely causes severe damage to property and often results in loss of life. To investigate the effect of urbanization on the magnitude and frequency of flood peaks, a flood frequency analysis was carried out using data from urbanized streamgaging stations in Phoenix and Tucson, Arizona. Flood peaks at each station were predicted using the log-Pearson Type III distribution, fitted using the expected moments algorithm and the multiple Grubbs-Beck low outlier test. The station estimates were then compared to flood peaks estimated by rural-regression equations for Arizona, and to flood peaks adjusted for urbanization using a previously developed procedure for adjusting U.S. Geological Survey rural regression peak discharges in an urban setting. Only smaller, more common flood peaks at the 50-, 20-, 10-, and 4-percent annual exceedance probabilities (AEPs) demonstrate any increase in magnitude as a result of urbanization; the 1-, 0.5-, and 0.2-percent AEP flood estimates are predicted without bias by the rural-regression equations. Percent imperviousness was determined not to account for the difference in estimated flood peaks between stations, either when adjusting the rural-regression equations or when deriving urban-regression equations to predict flood peaks directly from basin characteristics. Comparison with urban adjustment equations indicates that flood peaks are systematically overestimated if the rural-regression-estimated flood peaks are adjusted upward to account for urbanization. At nearly every streamgaging station in the analysis, adjusted rural-regression estimates were greater than the estimates derived using station data. One likely reason for the lack of increase in flood peaks with urbanization is the presence of significant stormwater retention and detention structures within the watershed used in the study.
NASA Technical Reports Server (NTRS)
Lambert, Winifred; Wheeler, Mark
2005-01-01
Five logistic regression equations were created that predict the probability of cloud-to-ground lightning occurrence for the day in the KSC/CCAFS area for each month in the warm season. These equations integrated the results from several studies over recent years to improve thunderstorm forecasting at KSC/CCAFS. All of the equations outperform persistence, which is known to outperform NPTI, the current objective tool used in 45 WS lightning forecasting operations. The equations also performed well in other tests. As a result, the new equations will be added to the current set of tools used by the 45 WS to determine the probability of lightning for their daily planning forecast. The results from these equations are meant to be used as first-guess guidance when developing the lightning probability forecast for the day. They provide an objective base from which forecasters can use other observations, model data, consultation with other forecasters, and their own experience to create the final lightning probability for the 1100 UTC briefing.
Individualized prediction of lung-function decline in chronic obstructive pulmonary disease
Zafari, Zafar; Sin, Don D.; Postma, Dirkje S.; Löfdahl, Claes-Göran; Vonk, Judith; Bryan, Stirling; Lam, Stephen; Tammemagi, C. Martin; Khakban, Rahman; Man, S.F. Paul; Tashkin, Donald; Wise, Robert A.; Connett, John E.; McManus, Bruce; Ng, Raymond; Hollander, Zsuszanna; Sadatsafavi, Mohsen
2016-01-01
Background: The rate of lung-function decline in chronic obstructive pulmonary disease (COPD) varies substantially among individuals. We sought to develop and validate an individualized prediction model for forced expiratory volume at 1 second (FEV1) in current smokers with mild-to-moderate COPD. Methods: Using data from a large long-term clinical trial (the Lung Health Study), we derived mixed-effects regression models to predict future FEV1 values over 11 years according to clinical traits. We modelled heterogeneity by allowing regression coefficients to vary across individuals. Two independent cohorts with COPD were used for validating the equations. Results: We used data from 5594 patients (mean age 48.4 yr, 63% men, mean baseline FEV1 2.75 L) to create the individualized prediction equations. There was significant between-individual variability in the rate of FEV1 decline, with the interval for the annual rate of decline that contained 95% of individuals being −124 to −15 mL/yr for smokers and −83 to 15 mL/yr for sustained quitters. Clinical variables in the final model explained 88% of variation around follow-up FEV1. The C statistic for predicting severity grades was 0.90. Prediction equations performed robustly in the 2 external data sets. Interpretation: A substantial part of individual variation in FEV1 decline can be explained by easily measured clinical variables. The model developed in this work can be used for prediction of future lung health in patients with mild-to-moderate COPD. Trial registration: Lung Health Study — ClinicalTrials.gov, no. NCT00000568; Pan-Canadian Early Detection of Lung Cancer Study — ClinicalTrials.gov, no. NCT00751660 PMID:27486205
Mortality rates in OECD countries converged during the period 1990-2010.
Bremberg, Sven G
2017-06-01
Since the scientific revolution of the 18th century, human health has gradually improved, but there is no unifying theory that explains this improvement in health. Studies of macrodeterminants have produced conflicting results. Most studies have analysed health at a given point in time as the outcome; however, the rate of improvement in health might be a more appropriate outcome. Twenty-eight OECD member countries were selected for analysis in the period 1990-2010. The main outcomes studied, in six age groups, were the national rates of decrease in mortality in the period 1990-2010. The effects of seven potential determinants on the rates of decrease in mortality were analysed in linear multiple regression models using least squares, controlling for country-specific history constants, which represent the mortality rate in 1990. The multiple regression analyses started with models that only included mortality rates in 1990 as determinants. These models explained 87% of the intercountry variation in the children aged 1-4 years and 51% in adults aged 55-74 years. When added to the regression equations, the seven determinants did not seem to significantly increase the explanatory power of the equations. The analyses indicated a decrease in mortality in all nations and in all age groups. The development of mortality rates in the different nations demonstrated significant catch-up effects. Therefore an important objective of the national public health sector seems to be to reduce the delay between international research findings and the universal implementation of relevant innovations.
Anderson, S.C.; Kupfer, J.A.; Wilson, R.R.; Cooper, R.J.
2000-01-01
The purpose of this research was to develop a model that could be used to provide a spatial representation of uneven-aged silvicultural treatments on forest crown area. We began by developing species-specific linear regression equations relating tree DBH to crown area for eight bottomland tree species at White River National Wildlife Refuge, Arkansas, USA. The relationships were highly significant for all species, with coefficients of determination (r(2)) ranging from 0.37 for Ulmus crassifolia to nearly 0.80 for Quercus nuttalliii and Taxodium distichum. We next located and measured the diameters of more than 4000 stumps from a single tree-group selection timber harvest. Stump locations were recorded with respect to an established gl id point system and entered into a Geographic Information System (ARC/INFO). The area occupied by the crown of each logged individual was then estimated by using the stump dimensions (adjusted to DBHs) and the regression equations relating tree DBH to crown area. Our model projected that the selection cuts removed roughly 300 m(2) of basal area from the logged sites resulting in the loss of approximate to 55 000 m(2) of crown area. The model developed in this research represents a tool that can be used in conjunction with remote sensing applications to assist in forest inventory and management, as well as to estimate the impacts of selective timber harvest on wildlife.
Hatanaka, N; Yamamoto, Y; Ichihara, K; Mastuo, S; Nakamura, Y; Watanabe, M; Iwatani, Y
2008-04-01
Various scales have been devised to predict development of pressure ulcers on the basis of clinical and laboratory data, such as the Braden Scale (Braden score), which is used to monitor activity and skin conditions of bedridden patients. However, none of these scales facilitates clinically reliable prediction. To develop a clinical laboratory data-based predictive equation for the development of pressure ulcers. Subjects were 149 hospitalised patients with respiratory disorders who were monitored for the development of pressure ulcers over a 3-month period. The proportional hazards model (Cox regression) was used to analyse the results of 12 basic laboratory tests on the day of hospitalisation in comparison with Braden score. Pressure ulcers developed in 38 patients within the study period. A Cox regression model consisting solely of Braden scale items showed that none of these items contributed to significantly predicting pressure ulcers. Rather, a combination of haemoglobin (Hb), C-reactive protein (CRP), albumin (Alb), age, and gender produced the best model for prediction. Using the set of explanatory variables, we created a new indicator based on a multiple logistic regression equation. The new indicator showed high sensitivity (0.73) and specificity (0.70), and its diagnostic power was higher than that of Alb, Hb, CRP, or the Braden score alone. The new indicator may become a more useful clinical tool for predicting presser ulcers than Braden score. The new indicator warrants verification studies to facilitate its clinical implementation in the future.
ERIC Educational Resources Information Center
Inbar-Furst, Hagit; Gumpel, Thomas P.
2015-01-01
Questionnaires were given to 392 elementary school teachers to examine help-seeking or help-avoidance in dealing with classroom behavioral problems. Scale validity was examined through a series of exploratory and confirmatory factor analyses. Using a series of multivariate regression analyses and structural equation modeling, we identified…
ERIC Educational Resources Information Center
McCarthy, Christopher J.; Fouladi, Rachel T.; Juncker, Brian D.; Matheny, Kenneth B.
2006-01-01
The association of protective resources, personality variables, life events, and gender with anxiety and depression was examined with university students. Building on regression analyses, a structural equation model was generated with good fit, indicating that with respect to both anxiety and depression, negative life events and coping resources…
Family Income and Parenting: The Role of Parental Depression and Social Support
ERIC Educational Resources Information Center
Lee, Chih-Yuan S.; Anderson, Jared R.; Horowitz, Jason L.; August, Gerald J.
2009-01-01
This study examined the relations among family income, social support, parental depression, and parenting among 290 predominantly rural families with children at risk for disruptive or socially withdrawn behaviors. Structural equation modeling and multiple regression were used, and the results showed that low family income was related to high…
ERIC Educational Resources Information Center
Pallone, Nathaniel J.; Hennessy, James J.; Voelbel, Gerald T.
1998-01-01
A scientifically sound methodology for identifying offenders about whose presence the community should be notified is demonstrated. A stepwise multiple regression was calculated among incarcerated pedophiles (N=52) including both psychological and legal data; a precision-weighted equation produced 90.4% "true positives." This methodology can be…
ERIC Educational Resources Information Center
Eren, Altay
2012-01-01
This study aimed to examine the mediating role of prospective teachers' academic optimism in the relationship between their future time perspective and professional plans about teaching. A total of 396 prospective teachers voluntarily participated in the study. Correlation, regression, and structural equation modeling analyses were conducted in…
ERIC Educational Resources Information Center
Li, Wenjing; Denson, Linley A.; Dorstyn, Diana S.
2017-01-01
This study investigated help-seeking intentions and use of mental health services within a sample of 1128 Mainland Chinese college students (630 males and 498 females; mean age = 20.01 years, SD = 1.48). Results of structural equation modeling and logistic regression analysis suggested that social-cognitive variables had significant effects both…
Monitoring heavy metal Cr in soil based on hyperspectral data using regression analysis
NASA Astrophysics Data System (ADS)
Zhang, Ningyu; Xu, Fuyun; Zhuang, Shidong; He, Changwei
2016-10-01
Heavy metal pollution in soils is one of the most critical problems in the global ecology and environment safety nowadays. Hyperspectral remote sensing and its application is capable of high speed, low cost, less risk and less damage, and provides a good method for detecting heavy metals in soil. This paper proposed a new idea of applying regression analysis of stepwise multiple regression between the spectral data and monitoring the amount of heavy metal Cr by sample points in soil for environmental protection. In the measurement, a FieldSpec HandHeld spectroradiometer is used to collect reflectance spectra of sample points over the wavelength range of 325-1075 nm. Then the spectral data measured by the spectroradiometer is preprocessed to reduced the influence of the external factors, and the preprocessed methods include first-order differential equation, second-order differential equation and continuum removal method. The algorithms of stepwise multiple regression are established accordingly, and the accuracy of each equation is tested. The results showed that the accuracy of first-order differential equation works best, which makes it feasible to predict the content of heavy metal Cr by using stepwise multiple regression.
Parrett, Charles; Omang, R.J.; Hull, J.A.
1983-01-01
Equations for estimating mean annual runoff and peak discharge from measurements of channel geometry were developed for western and northeastern Montana. The study area was divided into two regions for the mean annual runoff analysis, and separate multiple-regression equations were developed for each region. The active-channel width was determined to be the most important independent variable in each region. The standard error of estimate for the estimating equation using active-channel width was 61 percent in the Northeast Region and 38 percent in the West region. The study area was divided into six regions for the peak discharge analysis, and multiple regression equations relating channel geometry and basin characteristics to peak discharges having recurrence intervals of 2, 5, 10, 25, 50 and 100 years were developed for each region. The standard errors of estimate for the regression equations using only channel width as an independent variable ranged from 35 to 105 percent. The standard errors improved in four regions as basin characteristics were added to the estimating equations. (USGS)
Chen, Ying-Jen; Ho, Meng-Yang; Chen, Kwan-Ju; Hsu, Chia-Fen; Ryu, Shan-Jin
2009-08-01
The aims of the present study were to (i) investigate if traditional Chinese word reading ability can be used for estimating premorbid general intelligence; and (ii) to provide multiple regression equations for estimating premorbid performance on Raven's Standard Progressive Matrices (RSPM), using age, years of education and Chinese Graded Word Reading Test (CGWRT) scores as predictor variables. Four hundred and twenty-six healthy volunteers (201 male, 225 female), aged 16-93 years (mean +/- SD, 41.92 +/- 18.19 years) undertook the tests individually under supervised conditions. Seventy percent of subjects were randomly allocated to the derivation group (n = 296), and the rest to the validation group (n = 130). RSPM score was positively correlated with CGWRT score and years of education. RSPM and CGWRT scores and years of education were also inversely correlated with age, but the declining trend for RSPM performance against age was steeper than that for CGWRT performance. Separate multiple regression equations were derived for estimating RSPM scores using different combinations of age, years of education, and CGWRT score for both groups. The multiple regression coefficient of each equation ranged from 0.71 to 0.80 with the standard error of estimate between 7 and 8 RSPM points. When fitting the data of one group to the equations derived from its counterpart group, the cross-validation multiple regression coefficients ranged from 0.71 to 0.79. There were no significant differences in the 'predicted-obtained' RSPM discrepancies between any equations. The regression equations derived in the present study may provide a basis for estimating premorbid RSPM performance.
Estimating air drying times of lumber with multiple regression
William T. Simpson
2004-01-01
In this study, the applicability of a multiple regression equation for estimating air drying times of red oak, sugar maple, and ponderosa pine lumber was evaluated. The equation allows prediction of estimated air drying times from historic weather records of temperature and relative humidity at any desired location.
National scale biomass estimators for United States tree species
Jennifer C. Jenkins; David C. Chojnacky; Linda S. Heath; Richard A. Birdsey
2003-01-01
Estimates of national-scale forest carbon (C) stocks and fluxes are typically based on allometric regression equations developed using dimensional analysis techniques. However, the literature is inconsistent and incomplete with respect to large-scale forest C estimation. We compiled all available diameter-based allometric regression equations for estimating total...
On the use of regression analysis for the estimation of human biological age.
Krøll, J; Saxtrup, O
2000-01-01
The present investigation compares three linear regression procedures for the definition of human biological age (bioage). As a model system for bioage definition is used the variations with age of blood hemoglobin (B-hemoglobin) in males in the age range 50-95 years. The bioage measures compared are: 1: P-bioage; defined from regression of chronological age on B-hemoglobin results. 2: AC-bioage; obtained by indirect regression, using in reverse the equation describing the regression of B-hemoglobin on age in a reference population. 3: BC-bioage; defined by orthogonal regression on the reference regression line of B-hemoglobin on age. It is demonstrated that the P-bioage measure gives an overestimation of the bioage in the younger and an underestimation in the older individuals. This 'regression to the mean' is avoided using the indirect regression procedures. Here the relatively low SD of the BC-bioage measure results from the inclusion of individual chronological age in the orthogonal regression procedure. Observations on male blood donors illustrates the variation of the AC- and BC-bioage measures in the individual.
Height and Weight Estimation From Anthropometric Measurements Using Machine Learning Regressions
Fernandes, Bruno J. T.; Roque, Alexandre
2018-01-01
Height and weight are measurements explored to tracking nutritional diseases, energy expenditure, clinical conditions, drug dosages, and infusion rates. Many patients are not ambulant or may be unable to communicate, and a sequence of these factors may not allow accurate estimation or measurements; in those cases, it can be estimated approximately by anthropometric means. Different groups have proposed different linear or non-linear equations which coefficients are obtained by using single or multiple linear regressions. In this paper, we present a complete study of the application of different learning models to estimate height and weight from anthropometric measurements: support vector regression, Gaussian process, and artificial neural networks. The predicted values are significantly more accurate than that obtained with conventional linear regressions. In all the cases, the predictions are non-sensitive to ethnicity, and to gender, if more than two anthropometric parameters are analyzed. The learning model analysis creates new opportunities for anthropometric applications in industry, textile technology, security, and health care. PMID:29651366
Saunders, Christina T; Blume, Jeffrey D
2017-10-26
Mediation analysis explores the degree to which an exposure's effect on an outcome is diverted through a mediating variable. We describe a classical regression framework for conducting mediation analyses in which estimates of causal mediation effects and their variance are obtained from the fit of a single regression model. The vector of changes in exposure pathway coefficients, which we named the essential mediation components (EMCs), is used to estimate standard causal mediation effects. Because these effects are often simple functions of the EMCs, an analytical expression for their model-based variance follows directly. Given this formula, it is instructive to revisit the performance of routinely used variance approximations (e.g., delta method and resampling methods). Requiring the fit of only one model reduces the computation time required for complex mediation analyses and permits the use of a rich suite of regression tools that are not easily implemented on a system of three equations, as would be required in the Baron-Kenny framework. Using data from the BRAIN-ICU study, we provide examples to illustrate the advantages of this framework and compare it with the existing approaches. © The Author 2017. Published by Oxford University Press.
Adding In-Plane Flexibility to the Equations of Motion of a Single Rotor Helicopter
NASA Technical Reports Server (NTRS)
Curtiss, H. C., Jr.
2000-01-01
This report describes a way to add the effects of main rotor blade flexibility in the in- plane or lead-lag direction to a large set of non-linear equations of motion for a single rotor helicopter with rigid blades(l). Differences between the frequency of the regressing lag mode predicted by the equations of (1) and that measured in flight (2) for a UH-60 helicopter indicate that some element is missing from the analytical model of (1) which assumes rigid blades. A previous study (3) noted a similar discrepancy for the CH-53 helicopter. Using a relatively simple analytical model in (3), compared to (1), it was shown that a mechanical lag damper increases significantly the coupling between the rigid lag mode and the first flexible mode. This increased coupling due to a powerful lag damper produces an increase in the lowest lag frequency when viewed in a frame rotating with the blade. Flight test measurements normally indicate the frequency of this mode in a non-rotating or fixed frame. This report presents the additions necessary to the full equations of motion, to include main rotor blade lag flexibility. Since these additions are made to a very complex nonlinear dynamic model, in order to provide physical insight, a discussion of the results obtained from a simplified set of equations of motion is included. The reduced model illustrates the physics involved in the coupling and should indicate trends in the full model.
Klaeboe, Ronny
2005-09-01
When Gardermoen replaced Fornebu as the main airport for Oslo, aircraft noise levels increased in recreational areas near Gardermoen and decreased in areas near Fornebu. Krog and Engdahl [J. Acoust. Soc. Am. 116, 323-333 (2004)] estimate that recreationists' annoyance from aircraft noise in these areas changed more than would be anticipated from the actual noise changes. However, the sizes of their estimated "situation" effects are not credible. One possible reason for the anomalous results is that standard regression assumptions become violated when motivational factors are inserted into the regression model. Standardized regression coefficients (beta values) should also not be utilized for comparisons across equations.
Menz, Hylton B; Lord, Stephen R; Fitzpatrick, Richard C
2007-02-01
Many falls in older people occur while walking, however the mechanisms responsible for gait instability are poorly understood. Therefore, the aim of this study was to develop a plausible model describing the relationships between impaired sensorimotor function, fear of falling and gait patterns in older people. Temporo-spatial gait parameters and acceleration patterns of the head and pelvis were obtained from 100 community-dwelling older people aged between 75 and 93 years while walking on an irregular walkway. A theoretical model was developed to explain the relationships between these variables, assuming that head stability is a primary output of the postural control system when walking. This model was then tested using structural equation modeling, a statistical technique which enables the testing of a set of regression equations simultaneously. The structural equation model indicated that: (i) reduced step length has a significant direct and indirect association with reduced head stability; (ii) impaired sensorimotor function is significantly associated with reduced head stability, but this effect is largely indirect, mediated by reduced step length, and; (iii) fear of falling is significantly associated with reduced step length, but has little direct influence on head stability. These findings provide useful insights into the possible mechanisms underlying gait characteristics and risk of falling in older people. Particularly important is the indication that fear-related step length shortening may be maladaptive.
Estimation of Compaction Parameters Based on Soil Classification
NASA Astrophysics Data System (ADS)
Lubis, A. S.; Muis, Z. A.; Hastuty, I. P.; Siregar, I. M.
2018-02-01
Factors that must be considered in compaction of the soil works were the type of soil material, field control, maintenance and availability of funds. Those problems then raised the idea of how to estimate the density of the soil with a proper implementation system, fast, and economical. This study aims to estimate the compaction parameter i.e. the maximum dry unit weight (γ dmax) and optimum water content (Wopt) based on soil classification. Each of 30 samples were being tested for its properties index and compaction test. All of the data’s from the laboratory test results, were used to estimate the compaction parameter values by using linear regression and Goswami Model. From the research result, the soil types were A4, A-6, and A-7 according to AASHTO and SC, SC-SM, and CL based on USCS. By linear regression, the equation for estimation of the maximum dry unit weight (γdmax *)=1,862-0,005*FINES- 0,003*LL and estimation of the optimum water content (wopt *)=- 0,607+0,362*FINES+0,161*LL. By Goswami Model (with equation Y=mLogG+k), for estimation of the maximum dry unit weight (γdmax *) with m=-0,376 and k=2,482, for estimation of the optimum water content (wopt *) with m=21,265 and k=-32,421. For both of these equations a 95% confidence interval was obtained.
Sando, Roy; Sando, Steven K.; McCarthy, Peter M.; Dutton, DeAnn M.
2016-04-05
The U.S. Geological Survey (USGS), in cooperation with the Montana Department of Natural Resources and Conservation, completed a study to update methods for estimating peak-flow frequencies at ungaged sites in Montana based on peak-flow data at streamflow-gaging stations through water year 2011. The methods allow estimation of peak-flow frequencies (that is, peak-flow magnitudes, in cubic feet per second, associated with annual exceedance probabilities of 66.7, 50, 42.9, 20, 10, 4, 2, 1, 0.5, and 0.2 percent) at ungaged sites. The annual exceedance probabilities correspond to 1.5-, 2-, 2.33-, 5-, 10-, 25-, 50-, 100-, 200-, and 500-year recurrence intervals, respectively.Regional regression analysis is a primary focus of Chapter F of this Scientific Investigations Report, and regression equations for estimating peak-flow frequencies at ungaged sites in eight hydrologic regions in Montana are presented. The regression equations are based on analysis of peak-flow frequencies and basin characteristics at 537 streamflow-gaging stations in or near Montana and were developed using generalized least squares regression or weighted least squares regression.All of the data used in calculating basin characteristics that were included as explanatory variables in the regression equations were developed for and are available through the USGS StreamStats application (http://water.usgs.gov/osw/streamstats/) for Montana. StreamStats is a Web-based geographic information system application that was created by the USGS to provide users with access to an assortment of analytical tools that are useful for water-resource planning and management. The primary purpose of the Montana StreamStats application is to provide estimates of basin characteristics and streamflow characteristics for user-selected ungaged sites on Montana streams. The regional regression equations presented in this report chapter can be conveniently solved using the Montana StreamStats application.Selected results from this study were compared with results of previous studies. For most hydrologic regions, the regression equations reported for this study had lower mean standard errors of prediction (in percent) than the previously reported regression equations for Montana. The equations presented for this study are considered to be an improvement on the previously reported equations primarily because this study (1) included 13 more years of peak-flow data; (2) included 35 more streamflow-gaging stations than previous studies; (3) used a detailed geographic information system (GIS)-based definition of the regulation status of streamflow-gaging stations, which allowed better determination of the unregulated peak-flow records that are appropriate for use in the regional regression analysis; (4) included advancements in GIS and remote-sensing technologies, which allowed more convenient calculation of basin characteristics and investigation of many more candidate basin characteristics; and (5) included advancements in computational and analytical methods, which allowed more thorough and consistent data analysis.This report chapter also presents other methods for estimating peak-flow frequencies at ungaged sites. Two methods for estimating peak-flow frequencies at ungaged sites located on the same streams as streamflow-gaging stations are described. Additionally, envelope curves relating maximum recorded annual peak flows to contributing drainage area for each of the eight hydrologic regions in Montana are presented and compared to a national envelope curve. In addition to providing general information on characteristics of large peak flows, the regional envelope curves can be used to assess the reasonableness of peak-flow frequency estimates determined using the regression equations.
Approximate median regression for complex survey data with skewed response.
Fraser, Raphael André; Lipsitz, Stuart R; Sinha, Debajyoti; Fitzmaurice, Garrett M; Pan, Yi
2016-12-01
The ready availability of public-use data from various large national complex surveys has immense potential for the assessment of population characteristics using regression models. Complex surveys can be used to identify risk factors for important diseases such as cancer. Existing statistical methods based on estimating equations and/or utilizing resampling methods are often not valid with survey data due to complex survey design features. That is, stratification, multistage sampling, and weighting. In this article, we accommodate these design features in the analysis of highly skewed response variables arising from large complex surveys. Specifically, we propose a double-transform-both-sides (DTBS)'based estimating equations approach to estimate the median regression parameters of the highly skewed response; the DTBS approach applies the same Box-Cox type transformation twice to both the outcome and regression function. The usual sandwich variance estimate can be used in our approach, whereas a resampling approach would be needed for a pseudo-likelihood based on minimizing absolute deviations (MAD). Furthermore, the approach is relatively robust to the true underlying distribution, and has much smaller mean square error than a MAD approach. The method is motivated by an analysis of laboratory data on urinary iodine (UI) concentration from the National Health and Nutrition Examination Survey. © 2016, The International Biometric Society.
Approximate Median Regression for Complex Survey Data with Skewed Response
Fraser, Raphael André; Lipsitz, Stuart R.; Sinha, Debajyoti; Fitzmaurice, Garrett M.; Pan, Yi
2016-01-01
Summary The ready availability of public-use data from various large national complex surveys has immense potential for the assessment of population characteristics using regression models. Complex surveys can be used to identify risk factors for important diseases such as cancer. Existing statistical methods based on estimating equations and/or utilizing resampling methods are often not valid with survey data due to complex survey design features. That is, stratification, multistage sampling and weighting. In this paper, we accommodate these design features in the analysis of highly skewed response variables arising from large complex surveys. Specifically, we propose a double-transform-both-sides (DTBS) based estimating equations approach to estimate the median regression parameters of the highly skewed response; the DTBS approach applies the same Box-Cox type transformation twice to both the outcome and regression function. The usual sandwich variance estimate can be used in our approach, whereas a resampling approach would be needed for a pseudo-likelihood based on minimizing absolute deviations (MAD). Furthermore, the approach is relatively robust to the true underlying distribution, and has much smaller mean square error than a MAD approach. The method is motivated by an analysis of laboratory data on urinary iodine (UI) concentration from the National Health and Nutrition Examination Survey. PMID:27062562
Methods for estimating flow-duration and annual mean-flow statistics for ungaged streams in Oklahoma
Esralew, Rachel A.; Smith, S. Jerrod
2010-01-01
Flow statistics can be used to provide decision makers with surface-water information needed for activities such as water-supply permitting, flow regulation, and other water rights issues. Flow statistics could be needed at any location along a stream. Most often, streamflow statistics are needed at ungaged sites, where no flow data are available to compute the statistics. Methods are presented in this report for estimating flow-duration and annual mean-flow statistics for ungaged streams in Oklahoma. Flow statistics included the (1) annual (period of record), (2) seasonal (summer-autumn and winter-spring), and (3) 12 monthly duration statistics, including the 20th, 50th, 80th, 90th, and 95th percentile flow exceedances, and the annual mean-flow (mean of daily flows for the period of record). Flow statistics were calculated from daily streamflow information collected from 235 streamflow-gaging stations throughout Oklahoma and areas in adjacent states. A drainage-area ratio method is the preferred method for estimating flow statistics at an ungaged location that is on a stream near a gage. The method generally is reliable only if the drainage-area ratio of the two sites is between 0.5 and 1.5. Regression equations that relate flow statistics to drainage-basin characteristics were developed for the purpose of estimating selected flow-duration and annual mean-flow statistics for ungaged streams that are not near gaging stations on the same stream. Regression equations were developed from flow statistics and drainage-basin characteristics for 113 unregulated gaging stations. Separate regression equations were developed by using U.S. Geological Survey streamflow-gaging stations in regions with similar drainage-basin characteristics. These equations can increase the accuracy of regression equations used for estimating flow-duration and annual mean-flow statistics at ungaged stream locations in Oklahoma. Streamflow-gaging stations were grouped by selected drainage-basin characteristics by using a k-means cluster analysis. Three regions were identified for Oklahoma on the basis of the clustering of gaging stations and a manual delineation of distinguishable hydrologic and geologic boundaries: Region 1 (western Oklahoma excluding the Oklahoma and Texas Panhandles), Region 2 (north- and south-central Oklahoma), and Region 3 (eastern and central Oklahoma). A total of 228 regression equations (225 flow-duration regressions and three annual mean-flow regressions) were developed using ordinary least-squares and left-censored (Tobit) multiple-regression techniques. These equations can be used to estimate 75 flow-duration statistics and annual mean-flow for ungaged streams in the three regions. Drainage-basin characteristics that were statistically significant independent variables in the regression analyses were (1) contributing drainage area; (2) station elevation; (3) mean drainage-basin elevation; (4) channel slope; (5) percentage of forested canopy; (6) mean drainage-basin hillslope; (7) soil permeability; and (8) mean annual, seasonal, and monthly precipitation. The accuracy of flow-duration regression equations generally decreased from high-flow exceedance (low-exceedance probability) to low-flow exceedance (high-exceedance probability) . This decrease may have happened because a greater uncertainty exists for low-flow estimates and low-flow is largely affected by localized geology that was not quantified by the drainage-basin characteristics selected. The standard errors of estimate of regression equations for Region 1 (western Oklahoma) were substantially larger than those standard errors for other regions, especially for low-flow exceedances. These errors may be a result of greater variability in low flow because of increased irrigation activities in this region. Regression equations may not be reliable for sites where the drainage-basin characteristics are outside the range of values of independent vari
Generalized Onsager's reciprocal relations for the master and Fokker-Planck equations
NASA Astrophysics Data System (ADS)
Peng, Liangrong; Zhu, Yi; Hong, Liu
2018-06-01
The Onsager's reciprocal relation plays a fundamental role in the nonequilibrium thermodynamics. However, unfortunately, its classical version is valid only within a narrow region near equilibrium due to the linear regression hypothesis, which largely restricts its usage. In this paper, based on the conservation-dissipation formalism, a generalized version of Onsager's relations for the master equations and Fokker-Planck equations was derived. Nonlinear constitutive relations with nonsymmetric and positively stable operators, which become symmetric under the detailed balance condition, constitute key features of this new generalization. Similar conclusions also hold for many other classical models in physics and chemistry, which in turn make the current study as a benchmark for the application of generalized Onsager's relations in nonequilibrium thermodynamics.
NASA Technical Reports Server (NTRS)
Whitlock, C. H.; Kuo, C. Y.
1979-01-01
The objective of this paper is to define optical physics and/or environmental conditions under which the linear multiple-regression should be applicable. An investigation of the signal-response equations is conducted and the concept is tested by application to actual remote sensing data from a laboratory experiment performed under controlled conditions. Investigation of the signal-response equations shows that the exact solution for a number of optical physics conditions is of the same form as a linearized multiple-regression equation, even if nonlinear contributions from surface reflections, atmospheric constituents, or other water pollutants are included. Limitations on achieving this type of solution are defined.
Estimating irrigation water use in the humid eastern United States
Levin, Sara B.; Zarriello, Phillip J.
2013-01-01
Accurate accounting of irrigation water use is an important part of the U.S. Geological Survey National Water-Use Information Program and the WaterSMART initiative to help maintain sustainable water resources in the Nation. Irrigation water use in the humid eastern United States is not well characterized because of inadequate reporting and wide variability associated with climate, soils, crops, and farming practices. To better understand irrigation water use in the eastern United States, two types of predictive models were developed and compared by using metered irrigation water-use data for corn, cotton, peanut, and soybean crops in Georgia and turf farms in Rhode Island. Reliable metered irrigation data were limited to these areas. The first predictive model that was developed uses logistic regression to predict the occurrence of irrigation on the basis of antecedent climate conditions. Logistic regression equations were developed for corn, cotton, peanut, and soybean crops by using weekly irrigation water-use data from 36 metered sites in Georgia in 2009 and 2010 and turf farms in Rhode Island from 2000 to 2004. For the weeks when irrigation was predicted to take place, the irrigation water-use volume was estimated by multiplying the average metered irrigation application rate by the irrigated acreage for a given crop. The second predictive model that was developed is a crop-water-demand model that uses a daily soil water balance to estimate the water needs of a crop on a given day based on climate, soil, and plant properties. Crop-water-demand models were developed independently of reported irrigation water-use practices and relied on knowledge of plant properties that are available in the literature. Both modeling approaches require accurate accounting of irrigated area and crop type to estimate total irrigation water use. Water-use estimates from both modeling methods were compared to the metered irrigation data from Rhode Island and Georgia that were used to develop the models as well as two independent validation datasets from Georgia and Virginia that were not used in model development. Irrigation water-use estimates from the logistic regression method more closely matched mean reported irrigation rates than estimates from the crop-water-demand model when compared to the irrigation data used to develop the equations. The root mean squared errors (RMSEs) for the logistic regression estimates of mean annual irrigation ranged from 0.3 to 2.0 inches (in.) for the five crop types; RMSEs for the crop-water-demand models ranged from 1.4 to 3.9 in. However, when the models were applied and compared to the independent validation datasets from southwest Georgia from 2010, and from Virginia from 1999 to 2007, the crop-water-demand model estimates were as good as or better at predicting the mean irrigation volume than the logistic regression models for most crop types. RMSEs for logistic regression estimates of mean annual irrigation ranged from 1.0 to 7.0 in. for validation data from Georgia and from 1.8 to 4.9 in. for validation data from Virginia; RMSEs for crop-water-demand model estimates ranged from 2.1 to 5.8 in. for Georgia data and from 2.0 to 3.9 in. for Virginia data. In general, regression-based models performed better in areas that had quality daily or weekly irrigation data from which the regression equations were developed; however, the regression models were less reliable than the crop-water-demand models when applied outside the area for which they were developed. In most eastern coastal states that do not have quality irrigation data, the crop-water-demand model can be used more reliably. The development of predictive models of irrigation water use in this study was hindered by a lack of quality irrigation data. Many mid-Atlantic and New England states do not require irrigation water use to be reported. A survey of irrigation data from 14 eastern coastal states from Maine to Georgia indicated that, with the exception of the data in Georgia, irrigation data in the states that do require reporting commonly did not contain requisite ancillary information such as irrigated area or crop type, lacked precision, or were at an aggregated temporal scale making them unsuitable for use in the development of predictive models. Confidence in the reliability of either modeling method is affected by uncertainty in the reported data from which the models were developed or validated. Only through additional collection of quality data and further study can the accuracy and uncertainty of irrigation water-use estimates be improved in the humid eastern United States.
Analysis and improvement measures of flight delay in China
NASA Astrophysics Data System (ADS)
Zang, Yuhang
2017-03-01
Firstly, this paper establishes the principal component regression model to analyze the data quantitatively, based on principal component analysis to get the three principal component factors of flight delays. Then the least square method is used to analyze the factors and obtained the regression equation expression by substitution, and then found that the main reason for flight delays is airlines, followed by weather and traffic. Aiming at the above problems, this paper improves the controllable aspects of traffic flow control. For reasons of traffic flow control, an adaptive genetic queuing model is established for the runway terminal area. This paper, establish optimization method that fifteen planes landed simultaneously on the three runway based on Beijing capital international airport, comparing the results with the existing FCFS algorithm, the superiority of the model is proved.
Chung, Moo K.; Qiu, Anqi; Seo, Seongho; Vorperian, Houri K.
2014-01-01
We present a novel kernel regression framework for smoothing scalar surface data using the Laplace-Beltrami eigenfunctions. Starting with the heat kernel constructed from the eigenfunctions, we formulate a new bivariate kernel regression framework as a weighted eigenfunction expansion with the heat kernel as the weights. The new kernel regression is mathematically equivalent to isotropic heat diffusion, kernel smoothing and recently popular diffusion wavelets. Unlike many previous partial differential equation based approaches involving diffusion, our approach represents the solution of diffusion analytically, reducing numerical inaccuracy and slow convergence. The numerical implementation is validated on a unit sphere using spherical harmonics. As an illustration, we have applied the method in characterizing the localized growth pattern of mandible surfaces obtained in CT images from subjects between ages 0 and 20 years by regressing the length of displacement vectors with respect to the template surface. PMID:25791435
Assimilation of GOES-Derived Cloud Fields Into MM5
NASA Astrophysics Data System (ADS)
Biazar, A. P.; Doty, K. G.; McNider, R.
2007-12-01
This approach for the assimilation of GOES-derived cloud data into an atmospheric model (the Fifth-Generation Pennsylvania State University-National Center for Atmospheric Research Mesoscale Model, or MM5) was performed in two steps. In the first step, multiple linear regression equations were developed using a control MM5 simulation to develop relationships for several dependent variables in model columns that had one or more layers of clouds. In the second step, the regression equations were applied during an MM5 simulation with assimilation in which the hourly GOES satellite data were used to determine the cloud locations and some of the cloud properties, but with all the other variables being determined by the model data. The satellite-derived fields used were shortwave cloud albedo and cloud top pressure. Ten multiple linear regression equations were developed for the following dependent variables: total cloud depth, number of cloud layers, depth of the layer that contains the maximum vertical velocity, the maximum vertical velocity, the height of the maximum vertical velocity, the estimated 1-h stable (i.e., grid scale) precipitation rate, the estimated 1-h convective precipitation rate, the height of the level with the maximum positive diabatic heating, the magnitude of the maximum positive diabatic heating, and the largest continuous layer of upward motion. The horizontal components of the divergent wind were adjusted to be consistent with the regression estimate of the maximum vertical velocity. The new total horizontal wind field with these new divergent components was then used to nudge an ongoing MM5 model simulation towards the target vertical velocity. Other adjustments included diabatic heating and moistening at specified levels. Where the model simulation had clouds when the satellite data indicated clear conditions, procedures were taken to remove or diminish the errant clouds. The results for the period of 0000 UTC 28 June - 0000 UTC 16 July 1999 for both a continental 32-km grid and an 8-km grid over the Southeastern United States indicate a significant improvement in the cloud bias statistics. The main improvement was the reduction of high bias values that indicated times and locations in the control run when there were model clouds but when the satellite indicated clear conditions. The importance of this technique is that it has been able to assimilate the observed clouds in the model in a dynamically sustainable manner. Acknowledgments. This work was partially funded by the following grants: a GEWEX grant from NASA , the Cooperative Agreement between the University of Alabama in Huntsville and the Minerals Management Service on Gulf of Mexico Issues, a NASA applications grant, and a NSF grant.
Prediction equations for maximal respiratory pressures of Brazilian adolescents.
Mendes, Raquel E F; Campos, Tania F; Macêdo, Thalita M F; Borja, Raíssa O; Parreira, Verônica F; Mendonça, Karla M P P
2013-01-01
The literature emphasizes the need for studies to provide reference values and equations able to predict respiratory muscle strength of Brazilian subjects at different ages and from different regions of Brazil. To develop prediction equations for maximal respiratory pressures (MRP) of Brazilian adolescents. In total, 182 healthy adolescents (98 boys and 84 girls) aged between 12 and 18 years, enrolled in public and private schools in the city of Natal-RN, were evaluated using an MVD300 digital manometer (Globalmed®) according to a standardized protocol. Statistical analysis was performed using SPSS Statistics 17.0 software, with a significance level of 5%. Data normality was verified using the Kolmogorov-Smirnov test, and descriptive analysis results were expressed as the mean and standard deviation. To verify the correlation between the MRP and the independent variables (age, weight, height and sex), the Pearson correlation test was used. To obtain the prediction equations, stepwise multiple linear regression was used. The variables height, weight and sex were correlated to MRP. However, weight and sex explained part of the variability of MRP, and the regression analysis in this study indicated that these variables contributed significantly in predicting maximal inspiratory pressure, and only sex contributed significantly to maximal expiratory pressure. This study provides reference values and two models of prediction equations for maximal inspiratory and expiratory pressures and sets the necessary normal lower limits for the assessment of the respiratory muscle strength of Brazilian adolescents.
A Comparison of Regional and SiteSpecific Volume Estimation Equations
Joe P. McClure; Jana Anderson; Hans T. Schreuder
1987-01-01
Regression equations for volume by region and site class were examined for lobiolly pine. The regressions for the Coastal Plain and Piedmont regions had significantly different slopes. The results shared important practical differences in percentage of confidence intervals containing the true total volume and in percentage of estimates within a specific proportion of...
NASA Technical Reports Server (NTRS)
Ulbrich, N.; Volden, T.
2018-01-01
Analysis and use of temperature-dependent wind tunnel strain-gage balance calibration data are discussed in the paper. First, three different methods are presented and compared that may be used to process temperature-dependent strain-gage balance data. The first method uses an extended set of independent variables in order to process the data and predict balance loads. The second method applies an extended load iteration equation during the analysis of balance calibration data. The third method uses temperature-dependent sensitivities for the data analysis. Physical interpretations of the most important temperature-dependent regression model terms are provided that relate temperature compensation imperfections and the temperature-dependent nature of the gage factor to sets of regression model terms. Finally, balance calibration recommendations are listed so that temperature-dependent calibration data can be obtained and successfully processed using the reviewed analysis methods.
NASA Astrophysics Data System (ADS)
Nelson, Kirk E.; Ginn, Timothy R.
2011-05-01
A new equation for the collector efficiency (η) of the colloid filtration theory (CFT) is developed via nonlinear regression on the numerical data generated by a large number of Lagrangian simulations conducted in Happel's sphere-in-cell porous media model over a wide range of environmentally relevant conditions. The new equation expands the range of CFT's applicability in the natural subsurface primarily by accommodating departures from power law dependence of η on the Peclet and gravity numbers, a necessary but as of yet unavailable feature for applying CFT to large-scale field transport (e.g., of nanoparticles, radionuclides, or genetically modified organisms) under low groundwater velocity conditions. The new equation also departs from prior equations for colloids in the nanoparticle size range at all fluid velocities. These departures are particularly relevant to subsurface colloid and colloid-facilitated transport where low permeabilities and/or hydraulic gradients lead to low groundwater velocities and/or to nanoparticle fate and transport in porous media in general. We also note the importance of consistency in the conceptualization of particle flux through the single collector model on which most η equations are based for the purpose of attaining a mechanistic understanding of the transport and attachment steps of deposition. A lack of sufficient data for small particles and low velocities warrants further experiments to draw more definitive and comprehensive conclusions regarding the most significant discrepancies between the available equations.
Howley, Donna; Howley, Peter; Oxenham, Marc F
2018-06-01
Stature and a further 8 anthropometric dimensions were recorded from the arms and hands of a sample of 96 staff and students from the Australian National University and The University of Newcastle, Australia. These dimensions were used to create simple and multiple logistic regression models for sex estimation and simple and multiple linear regression equations for stature estimation of a contemporary Australian population. Overall sex classification accuracies using the models created were comparable to similar studies. The stature estimation models achieved standard errors of estimates (SEE) which were comparable to and in many cases lower than those achieved in similar research. Generic, non sex-specific models achieved similar SEEs and R 2 values to the sex-specific models indicating stature may be accurately estimated when sex is unknown. Copyright © 2018 Elsevier B.V. All rights reserved.
Thermal requirements of Dermanyssus gallinae (De Geer, 1778) (Acari: Dermanyssidae).
Tucci, Edna Clara; do Prado, Angelo P; de Araújo, Raquel Pires
2008-01-01
The thermal requirements for development of Dermanyssus gallinae were studied under laboratory conditions at 15, 20, 25, 30 and 35 degrees C, a 12h photoperiod and 60-85% RH. The thermal requirements for D. gallinae were as follows. Preoviposition: base temperature 3.4 degrees C, thermal constant (k) 562.85 degree-hours, determination coefficient (R(2)) 0.59, regression equation: Y= -0.006035 + 0.001777x. Egg: base temperature 10.60 degrees C, thermal constant (k) 689.65 degree-hours, determination coefficient (R(2)) 0.94, regression equation: Y= -0.015367 + 0.001450x. Larva: base temperature 9.82 degrees C, thermal constant (k) 464.91 degree-hours, determination coefficient (R(2)) 0.87, regression equation: Y= -0.021123 + 0.002151x. Protonymph: base temperature 10.17 degrees C, thermal constant (k) 504.49 degree-hours, determination coefficient (R(2)) 0.90, regression equation: Y= -0.020152 + 0.001982x. Deutonymph: base temperature 11.80 degrees C, thermal constant (k) 501.11 degree-hours, determination coefficient (R(2)) 0.99, regression equation: Y= -0.023555 + 0.001996x. The results obtained showed that 15 to 42 generations of Dermanyssus gallinae may occur during the year in the State of São Paulo, as estimated based on isotherm charts. Dermanyssus gallinae may develop continually in the State of São Paulo, with a population decrease in the winter. There were differences between the developmental stages of D. gallinae in relation to thermal requirements.
Techniques for estimating flood-peak discharges from urban basins in Missouri
Becker, L.D.
1986-01-01
Techniques are defined for estimating the magnitude and frequency of future flood peak discharges of rainfall-induced runoff from small urban basins in Missouri. These techniques were developed from an initial analysis of flood records of 96 gaged sites in Missouri and adjacent states. Final regression equations are based on a balanced, representative sampling of 37 gaged sites in Missouri. This sample included 9 statewide urban study sites, 18 urban sites in St. Louis County, and 10 predominantly rural sites statewide. Short-term records were extended on the basis of long-term climatic records and use of a rainfall-runoff model. Linear least-squares regression analyses were used with log-transformed variables to relate flood magnitudes of selected recurrence intervals (dependent variables) to selected drainage basin indexes (independent variables). For gaged urban study sites within the State, the flood peak estimates are from the frequency curves defined from the synthesized long-term discharge records. Flood frequency estimates are made for ungaged sites by using regression equations that require determination of the drainage basin size and either the percentage of impervious area or a basin development factor. Alternative sets of equations are given for the 2-, 5-, 10-, 25-, 50-, and 100-yr recurrence interval floods. The average standard errors of estimate range from about 33% for the 2-yr flood to 26% for the 100-yr flood. The techniques for estimation are applicable to flood flows that are not significantly affected by storage caused by manmade activities. Flood peak discharge estimating equations are considered applicable for sites on basins draining approximately 0.25 to 40 sq mi. (Author 's abstract)
Using twig diameters to estimate browse utilization on three shrub species in southeastern Montana
Mark A. Rumble
1987-01-01
Browse utilization estimates based on twig length and twig weight were compared for skunkbush sumac, wax currant, and chokecherry. Linear regression analysis was valid for twig length data; twig weight equations are nonlinear. Estimates of twig weight are more accurate. Problems encountered during development of a utilization model are discussed.
USDA-ARS?s Scientific Manuscript database
Prediction equations of energy expenditure (EE) using accelerometers and miniaturized heart rate (HR) monitors have been developed in older children and adults but not in preschool-aged children. Because the relationships between accelerometer counts (ACs), HR, and EE are confounded by growth and ma...
Application of stepwise multiple regression techniques to inversion of Nimbus 'IRIS' observations.
NASA Technical Reports Server (NTRS)
Ohring, G.
1972-01-01
Exploratory studies with Nimbus-3 infrared interferometer-spectrometer (IRIS) data indicate that, in addition to temperature, such meteorological parameters as geopotential heights of pressure surfaces, tropopause pressure, and tropopause temperature can be inferred from the observed spectra with the use of simple regression equations. The technique of screening the IRIS spectral data by means of stepwise regression to obtain the best radiation predictors of meteorological parameters is validated. The simplicity of application of the technique and the simplicity of the derived linear regression equations - which contain only a few terms - suggest usefulness for this approach. Based upon the results obtained, suggestions are made for further development and exploitation of the stepwise regression analysis technique.
Thompson, Ronald E.; Hoffman, Scott A.
2006-01-01
A suite of 28 streamflow statistics, ranging from extreme low to high flows, was computed for 17 continuous-record streamflow-gaging stations and predicted for 20 partial-record stations in Monroe County and contiguous counties in north-eastern Pennsylvania. The predicted statistics for the partial-record stations were based on regression analyses relating inter-mittent flow measurements made at the partial-record stations indexed to concurrent daily mean flows at continuous-record stations during base-flow conditions. The same statistics also were predicted for 134 ungaged stream locations in Monroe County on the basis of regression analyses relating the statistics to GIS-determined basin characteristics for the continuous-record station drainage areas. The prediction methodology for developing the regression equations used to estimate statistics was developed for estimating low-flow frequencies. This study and a companion study found that the methodology also has application potential for predicting intermediate- and high-flow statistics. The statistics included mean monthly flows, mean annual flow, 7-day low flows for three recurrence intervals, nine flow durations, mean annual base flow, and annual mean base flows for two recurrence intervals. Low standard errors of prediction and high coefficients of determination (R2) indicated good results in using the regression equations to predict the statistics. Regression equations for the larger flow statistics tended to have lower standard errors of prediction and higher coefficients of determination (R2) than equations for the smaller flow statistics. The report discusses the methodologies used in determining the statistics and the limitations of the statistics and the equations used to predict the statistics. Caution is indicated in using the predicted statistics for small drainage area situations. Study results constitute input needed by water-resource managers in Monroe County for planning purposes and evaluation of water-resources availability.
Modeling individualized coefficient alpha to measure quality of test score data.
Liu, Molei; Hu, Ming; Zhou, Xiao-Hua
2018-05-23
Individualized coefficient alpha is defined. It is item and subject specific and is used to measure the quality of test score data with heterogenicity among the subjects and items. A regression model is developed based on 3 sets of generalized estimating equations. The first set of generalized estimating equation models the expectation of the responses, the second set models the response's variance, and the third set is proposed to estimate the individualized coefficient alpha, defined and used to measure individualized internal consistency of the responses. We also use different techniques to extend our method to handle missing data. Asymptotic property of the estimators is discussed, based on which inference on the coefficient alpha is derived. Performance of our method is evaluated through simulation study and real data analysis. The real data application is from a health literacy study in Hunan province of China. Copyright © 2018 John Wiley & Sons, Ltd.
Liu, Yi; Luo, Bi-Ru
2016-11-20
To analyze the factors affecting maternal physical activities at different stages among pregnant women. Self-designed questionnaires were used to investigate the physical activities of women in different stages, including 650 in the first, 650 in the second, and 750 in the third trimester of pregnancy. The factors affecting maternal physical activities were analyzed using the structural equation model that comprised 4 latent variables (attitude, norm, behavioral attention and behavior) with observed variables that matched the latent variables. The participants ranged from 18 to 35 years of age. The women and their husbands, but not their mothers or mothers-in-law, were all well educated. The caregiver during pregnancy was mostly the mother followed by the husband. For traveling, the women in the first, second and third trimesters preferred walking, bus, and personal escort, respectively; the main physical activity was walking in all trimesters, and the women in different trimester were mostly sedentary, a greater intensity of exercise was associated with less exercise time. Structural equation modeling (SEM) analysis showed that the physical activities of pregnant women was affected by behavioral intention (with standardized regression coefficient of 0.372); attitude and subjective norms affected physical activity by indirectly influencing the behavior intention (standardized regression coefficients of 0.140 and 0.669). The pregnant women in different stages have inappropriate physical activities with insufficient exercise time and intensity. The subjective norms affects the physical activities of the pregnant women by influencing their attitudes and behavior intention indirectly, suggesting the need of health education of the caregivers during pregnancy.
Sparling, D.W.; Barzen, J.A.; Lovvorn, J.R.; Serie, J.R.
1992-01-01
Regression equations that use mensural data to estimate body condition have been developed for several water birds. These equations often have been based on data that represent different sexes, age classes, or seasons, without being adequately tested for intergroup differences. We used proximate carcass analysis of 538 adult and juvenile canvasbacks (Aythya valisineria ) collected during fall migration, winter, and spring migrations in 1975-76 and 1982-85 to test regression methods for estimating body condition.
Rapcsak, Steven Z; Henry, Maya L; Teague, Sommer L; Carnahan, Susan D; Beeson, Pélagie M
2007-06-18
Coltheart and co-workers [Castles, A., Bates, T. C., & Coltheart, M. (2006). John Marshall and the developmental dyslexias. Aphasiology, 20, 871-892; Coltheart, M., Rastle, K., Perry, C., Langdon, R., & Ziegler, J. (2001). DRC: A dual route cascaded model of visual word recognition and reading aloud. Psychological Review, 108, 204-256] have demonstrated that an equation derived from dual-route theory accurately predicts reading performance in young normal readers and in children with reading impairment due to developmental dyslexia or stroke. In this paper, we present evidence that the dual-route equation and a related multiple regression model also accurately predict both reading and spelling performance in adult neurological patients with acquired alexia and agraphia. These findings provide empirical support for dual-route theories of written language processing.
Galloway, Joel M.
2014-01-01
The Red River of the North (hereafter referred to as “Red River”) Basin is an important hydrologic region where water is a valuable resource for the region’s economy. Continuous water-quality monitors have been operated by the U.S. Geological Survey, in cooperation with the North Dakota Department of Health, Minnesota Pollution Control Agency, City of Fargo, City of Moorhead, City of Grand Forks, and City of East Grand Forks at the Red River at Fargo, North Dakota, from 2003 through 2012 and at Grand Forks, N.Dak., from 2007 through 2012. The purpose of the monitoring was to provide a better understanding of the water-quality dynamics of the Red River and provide a way to track changes in water quality. Regression equations were developed that can be used to estimate concentrations and loads for dissolved solids, sulfate, chloride, nitrate plus nitrite, total phosphorus, and suspended sediment using explanatory variables such as streamflow, specific conductance, and turbidity. Specific conductance was determined to be a significant explanatory variable for estimating dissolved solids concentrations at the Red River at Fargo and Grand Forks. The regression equations provided good relations between dissolved solid concentrations and specific conductance for the Red River at Fargo and at Grand Forks, with adjusted coefficients of determination of 0.99 and 0.98, respectively. Specific conductance, log-transformed streamflow, and a seasonal component were statistically significant explanatory variables for estimating sulfate in the Red River at Fargo and Grand Forks. Regression equations provided good relations between sulfate concentrations and the explanatory variables, with adjusted coefficients of determination of 0.94 and 0.89, respectively. For the Red River at Fargo and Grand Forks, specific conductance, streamflow, and a seasonal component were statistically significant explanatory variables for estimating chloride. For the Red River at Grand Forks, a time component also was a statistically significant explanatory variable for estimating chloride. The regression equations for chloride at the Red River at Fargo provided a fair relation between chloride concentrations and the explanatory variables, with an adjusted coefficient of determination of 0.66 and the equation for the Red River at Grand Forks provided a relatively good relation between chloride concentrations and the explanatory variables, with an adjusted coefficient of determination of 0.77. Turbidity and streamflow were statistically significant explanatory variables for estimating nitrate plus nitrite concentrations at the Red River at Fargo and turbidity was the only statistically significant explanatory variable for estimating nitrate plus nitrite concentrations at Grand Forks. The regression equation for the Red River at Fargo provided a relatively poor relation between nitrate plus nitrite concentrations, turbidity, and streamflow, with an adjusted coefficient of determination of 0.46. The regression equation for the Red River at Grand Forks provided a fair relation between nitrate plus nitrite concentrations and turbidity, with an adjusted coefficient of determination of 0.73. Some of the variability that was not explained by the equations might be attributed to different sources contributing nitrates to the stream at different times. Turbidity, streamflow, and a seasonal component were statistically significant explanatory variables for estimating total phosphorus at the Red River at Fargo and Grand Forks. The regression equation for the Red River at Fargo provided a relatively fair relation between total phosphorus concentrations, turbidity, streamflow, and season, with an adjusted coefficient of determination of 0.74. The regression equation for the Red River at Grand Forks provided a good relation between total phosphorus concentrations, turbidity, streamflow, and season, with an adjusted coefficient of determination of 0.87. For the Red River at Fargo, turbidity and streamflow were statistically significant explanatory variables for estimating suspended-sediment concentrations. For the Red River at Grand Forks, turbidity was the only statistically significant explanatory variable for estimating suspended-sediment concentration. The regression equation at the Red River at Fargo provided a good relation between suspended-sediment concentration, turbidity, and streamflow, with an adjusted coefficient of determination of 0.95. The regression equation for the Red River at Grand Forks provided a good relation between suspended-sediment concentration and turbidity, with an adjusted coefficient of determination of 0.96.
A stream-gaging network analysis for the 7-day, 10-year annual low flow in New Hampshire streams
Flynn, Robert H.
2003-01-01
The 7-day, 10-year (7Q10) low-flow-frequency statistic is a widely used measure of surface-water availability in New Hampshire. Regression equations and basin-characteristic digital data sets were developed to help water-resource managers determine surface-water resources during periods of low flow in New Hampshire streams. These regression equations and data sets were developed to estimate streamflow statistics for the annual and seasonal low-flow-frequency, and period-of-record and seasonal period-of-record flow durations. generalized-least-squares (GLS) regression methods were used to develop the annual 7Q10 low-flow-frequency regression equation from 60 continuous-record stream-gaging stations in New Hampshire and in neighboring States. In the regression equation, the dependent variables were the annual 7Q10 flows at the 60 stream-gaging stations. The independent (or predictor) variables were objectively selected characteristics of the drainage basins that contribute flow to those stations. In contrast to ordinary-least-squares (OLS) regression analysis, GLS-developed estimating equations account for differences in length of record and spatial correlations among the flow-frequency statistics at the various stations.A total of 93 measurable drainage-basin characteristics were candidate independent variables. On the basis of several statistical parameters that were used to evaluate which combination of basin characteristics contribute the most to the predictive power of the equations, three drainage-basin characteristics were determined to be statistically significant predictors of the annual 7Q10: (1) total drainage area, (2) mean summer stream-gaging station precipitation from 1961 to 90, and (3) average mean annual basinwide temperature from 1961 to 1990.To evaluate the effectiveness of the stream-gaging network in providing regional streamflow data for the annual 7Q10, the computer program GLSNET (generalized-least-squares NETwork) was used to analyze the network by application of GLS regression between streamflow and the climatic and basin characteristics of the drainage basin upstream from each stream-gaging station. Improvement to the predictive ability of the regression equations developed for the network analyses is measured by the reduction in the average sampling-error variance, and can be achieved by collecting additional streamflow data at existing stations. The predictive ability of the regression equations is enhanced even further with the addition of new stations to the network. Continued data collection at unregulated stream-gaging stations with less than 14 years of record resulted in the greatest cost-weighted reduction to the average sampling-error variance of the annual 7Q10 regional regression equation. The addition of new stations in basins with underrepresented values for the independent variables of the total drainage area, average mean annual basinwide temperature, or mean summer stream-gaging station precipitation in the annual 7Q10 regression equation yielded a much greater cost-weighted reduction to the average sampling-error variance than when more data were collected at existing unregulated stations. To maximize the regional information obtained from the stream-gaging network for the annual 7Q10, ranking of the streamflow data can be used to determine whether an active station should be continued or if a new or discontinued station should be activated for streamflow data collection. Thus, this network analysis can help determine the costs and benefits of continuing the operation of a particular station or activating a new station at another location to predict the 7Q10 at ungaged stream reaches. The decision to discontinue an existing station or activate a new station, however, must also consider its contribution to other water-resource analyses such as flood management, water quality, or trends in land use or climatic change.
Reddy, M Srinivasa; Basha, Shaik; Joshi, H V; Sravan Kumar, V G; Jha, B; Ghosh, P K
2005-01-01
Alang-Sosiya is the largest ship-scrapping yard in the world, established in 1982. Every year an average of 171 ships having a mean weight of 2.10 x 10(6)(+/-7.82 x 10(5)) of light dead weight tonnage (LDT) being scrapped. Apart from scrapped metals, this yard generates a massive amount of combustible solid waste in the form of waste wood, plastic, insulation material, paper, glass wool, thermocol pieces (polyurethane foam material), sponge, oiled rope, cotton waste, rubber, etc. In this study multiple regression analysis was used to develop predictive models for energy content of combustible ship-scrapping solid wastes. The scope of work comprised qualitative and quantitative estimation of solid waste samples and performing a sequential selection procedure for isolating variables. Three regression models were developed to correlate the energy content (net calorific values (LHV)) with variables derived from material composition, proximate and ultimate analyses. The performance of these models for this particular waste complies well with the equations developed by other researchers (Dulong, Steuer, Scheurer-Kestner and Bento's) for estimating energy content of municipal solid waste.
Uhrich, Mark A.; Kolasinac, Jasna; Booth, Pamela L.; Fountain, Robert L.; Spicer, Kurt R.; Mosbrucker, Adam R.
2014-01-01
Researchers at the U.S. Geological Survey, Cascades Volcano Observatory, investigated alternative methods for the traditional sample-based sediment record procedure in determining suspended-sediment concentration (SSC) and discharge. One such sediment-surrogate technique was developed using turbidity and discharge to estimate SSC for two gaging stations in the Toutle River Basin near Mount St. Helens, Washington. To provide context for the study, methods for collecting sediment data and monitoring turbidity are discussed. Statistical methods used include the development of ordinary least squares regression models for each gaging station. Issues of time-related autocorrelation also are evaluated. Addition of lagged explanatory variables was used to account for autocorrelation in the turbidity, discharge, and SSC data. Final regression model equations and plots are presented for the two gaging stations. The regression models support near-real-time estimates of SSC and improved suspended-sediment discharge records by incorporating continuous instream turbidity. Future use of such models may potentially lower the costs of sediment monitoring by reducing time it takes to collect and process samples and to derive a sediment-discharge record.
The integrated Michaelis-Menten rate equation: déjà vu or vu jàdé?
Goličnik, Marko
2013-08-01
A recent article of Johnson and Goody (Biochemistry, 2011;50:8264-8269) described the almost-100-years-old paper of Michaelis and Menten. Johnson and Goody translated this classic article and presented the historical perspective to one of incipient enzyme-reaction data analysis, including a pioneering global fit of the integrated rate equation in its implicit form to the experimental time-course data. They reanalyzed these data, although only numerical techniques were used to solve the model equations. However, there is also the still little known algebraic rate-integration equation in a closed form that enables direct fitting of the data. Therefore, in this commentary, I briefly present the integral solution of the Michaelis-Menten rate equation, which has been largely overlooked for three decades. This solution is expressed in terms of the Lambert W function, and I demonstrate here its use for global nonlinear regression curve fitting, as carried out with the original time-course dataset of Michaelis and Menten.
Eastwood, Sophie V; Tillin, Therese; Wright, Andrew; Heasman, John; Willis, Joseph; Godsland, Ian F; Forouhi, Nita; Whincup, Peter; Hughes, Alun D; Chaturvedi, Nishi
2013-01-01
South Asians and African Caribbeans experience more cardiometabolic disease than Europeans. Risk factors include visceral (VAT) and subcutaneous abdominal (SAT) adipose tissue, which vary with ethnicity and are difficult to quantify using anthropometry. We developed and cross-validated ethnicity and gender-specific equations using anthropometrics to predict VAT and SAT. 669 Europeans, 514 South Asians and 227 African Caribbeans (70 ± 7 years) underwent anthropometric measurement and abdominal CT scanning. South Asian and African Caribbean participants were first-generation migrants living in London. Prediction equations were derived for CT-measured VAT and SAT using stepwise regression, then cross-validated by comparing actual and predicted means. South Asians had more and African Caribbeans less VAT than Europeans. For basic VAT prediction equations (age and waist circumference), model fit was better in men (R(2) range 0.59-0.71) than women (range 0.35-0.59). Expanded equations (+ weight, height, hip and thigh circumference) improved fit for South Asian and African Caribbean women (R(2) 0.35 to 0.55, and 0.43 to 0.56 respectively). For basic SAT equations, R(2) was 0.69-0.77, and for expanded equations it was 0.72-0.86. Cross-validation showed differences between actual and estimated VAT of <7%, and SAT of <8% in all groups, apart from VAT in South Asian women which disagreed by 16%. We provide ethnicity- and gender-specific VAT and SAT prediction equations, derived from a large tri-ethnic sample. Model fit was reasonable for SAT and VAT in men, while basic VAT models should be used cautiously in South Asian and African Caribbean women. These equations will aid studies of mechanisms of cardiometabolic disease in later life, where imaging data are not available.
Use of Thematic Mapper for water quality assessment
NASA Technical Reports Server (NTRS)
Horn, E. M.; Morrissey, L. A.
1984-01-01
The evaluation of simulated TM data obtained on an ER-2 aircraft at twenty-five predesignated sample sites for mapping water quality factors such as conductivity, pH, suspended solids, turbidity, temperature, and depth, is discussed. Using a multiple regression for the seven TM bands, an equation is developed for the suspended solids. TM bands 1, 2, 3, 4, and 6 are used with logarithm conductivity in a multiple regression. The assessment of regression equations for a high coefficient of determination (R-squared) and statistical significance is considered. Confidence intervals about the mean regression point are calculated in order to assess the robustness of the regressions used for mapping conductivity, turbidity, and suspended solids, and by regressing random subsamples of sites and comparing the resultant range of R-squared, cross validation is conducted.
ERIC Educational Resources Information Center
Hafner, Lawrence E.
A study developed a multiple regression prediction equation for each of six selected achievement variables in a popular standardized test of achievement. Subjects, 42 fourth-grade pupils randomly selected across several classes in a large elementary school in a north Florida city, were administered several standardized tests to determine predictor…
Flood characteristics of urban watersheds in the United States
Sauer, Vernon B.; Thomas, W.O.; Stricker, V.A.; Wilson, K.V.
1983-01-01
A nationwide study of flood magnitude and frequency in urban areas was made for the purpose of reviewing available literature, compiling an urban flood data base, and developing methods of estimating urban floodflow characteristics in ungaged areas. The literature review contains synopses of 128 recent publications related to urban floodflow. A data base of 269 gaged basins in 56 cities and 31 States, including Hawaii, contains a wide variety of topographic and climatic characteristics, land-use variables, indices of urbanization, and flood-frequency estimates. Three sets of regression equations were developed to estimate flood discharges for ungaged sites for recurrence intervals of 2, 5, 10, 25, 50, 100, and 500 years. Two sets of regression equations are based on seven independent parameters and the third is based on three independent parameters. The only difference in the two sets of seven-parameter equations is the use of basin lag time in one and lake and reservoir storage in the other. Of primary importance in these equations is an independent estimate of the equivalent rural discharge for the ungaged basin. The equations adjust the equivalent rural discharge to an urban condition. The primary adjustment factor, or index of urbanization, is the basin development factor, a measure of the extent of development of the drainage system in the basin. This measure includes evaluations of storm drains (sewers), channel improvements, and curb-and-gutter streets. The basin development factor is statistically very significant and offers a simple and effective way of accounting for drainage development and runoff response in urban areas. Percentage of impervious area is also included in the seven-parameter equations as an additional measure of urbanization and apparently accounts for increased runoff volumes. This factor is not highly significant for large floods, which supports the generally held concept that imperviousness is not a dominant factor when soils become more saturated during large storms. Other parameters in the seven-parameter equations include drainage area size, channel slope, rainfall intensity, lake and reservoir storage, and basin lag time. These factors are all statistically significant and provide logical indices of basin conditions. The three-parameter equations include only the three most significant parameters: rural discharge, basin-development factor, and drainage area size. All three sets of regression equations provide unbiased estimates of urban flood frequency. The seven-parameter regression equations without basin lag time have average standard errors of regression varying from ? 37 percent for the 5-year flood to ? 44 percent for the 100-year flood and ? 49 percent for the 500-year flood. The other two sets of regression equations have similar accuracy. Several tests for bias, sensitivity, and hydrologic consistency are included which support the conclusion that the equations are useful throughout the United States. All estimating equations were developed from data collected on drainage basins where temporary in-channel storage, due to highway embankments, was not significant. Consequently, estimates made with these equations do not account for the reducing effect of this temporary detention storage.
Asquith, William H.; Roussel, Meghan C.
2009-01-01
Annual peak-streamflow frequency estimates are needed for flood-plain management; for objective assessment of flood risk; for cost-effective design of dams, levees, and other flood-control structures; and for design of roads, bridges, and culverts. Annual peak-streamflow frequency represents the peak streamflow for nine recurrence intervals of 2, 5, 10, 25, 50, 100, 200, 250, and 500 years. Common methods for estimation of peak-streamflow frequency for ungaged or unmonitored watersheds are regression equations for each recurrence interval developed for one or more regions; such regional equations are the subject of this report. The method is based on analysis of annual peak-streamflow data from U.S. Geological Survey streamflow-gaging stations (stations). Beginning in 2007, the U.S. Geological Survey, in cooperation with the Texas Department of Transportation and in partnership with Texas Tech University, began a 3-year investigation concerning the development of regional equations to estimate annual peak-streamflow frequency for undeveloped watersheds in Texas. The investigation focuses primarily on 638 stations with 8 or more years of data from undeveloped watersheds and other criteria. The general approach is explicitly limited to the use of L-moment statistics, which are used in conjunction with a technique of multi-linear regression referred to as PRESS minimization. The approach used to develop the regional equations, which was refined during the investigation, is referred to as the 'L-moment-based, PRESS-minimized, residual-adjusted approach'. For the approach, seven unique distributions are fit to the sample L-moments of the data for each of 638 stations and trimmed means of the seven results of the distributions for each recurrence interval are used to define the station specific, peak-streamflow frequency. As a first iteration of regression, nine weighted-least-squares, PRESS-minimized, multi-linear regression equations are computed using the watershed characteristics of drainage area, dimensionless main-channel slope, and mean annual precipitation. The residuals of the nine equations are spatially mapped, and residuals for the 10-year recurrence interval are selected for generalization to 1-degree latitude and longitude quadrangles. The generalized residual is referred to as the OmegaEM parameter and represents a generalized terrain and climate index that expresses peak-streamflow potential not otherwise represented in the three watershed characteristics. The OmegaEM parameter was assigned to each station, and using OmegaEM, nine additional regression equations are computed. Because of favorable diagnostics, the OmegaEM equations are expected to be generally reliable estimators of peak-streamflow frequency for undeveloped and ungaged stream locations in Texas. The mean residual standard error, adjusted R-squared, and percentage reduction of PRESS by use of OmegaEM are 0.30log10, 0.86, and -21 percent, respectively. Inclusion of the OmegaEM parameter provides a substantial reduction in the PRESS statistic of the regression equations and removes considerable spatial dependency in regression residuals. Although the OmegaEM parameter requires interpretation on the part of analysts and the potential exists that different analysts could estimate different values for a given watershed, the authors suggest that typical uncertainty in the OmegaEM estimate might be about +or-0.1010. Finally, given the two ensembles of equations reported herein and those in previous reports, hydrologic design engineers and other analysts have several different methods, which represent different analytical tracks, to make comparisons of peak-streamflow frequency estimates for ungaged watersheds in the study area.
Effect of rice husk ash and fly ash on the compressive strength of high performance concrete
NASA Astrophysics Data System (ADS)
Van Lam, Tang; Bulgakov, Boris; Aleksandrova, Olga; Larsen, Oksana; Anh, Pham Ngoc
2018-03-01
The usage of industrial and agricultural wastes for building materials production plays an important role to improve the environment and economy by preserving nature materials and land resources, reducing land, water and air pollution as well as organizing and storing waste costs. This study mainly focuses on mathematical modeling dependence of the compressive strength of high performance concrete (HPC) at the ages of 3, 7 and 28 days on the amount of rice husk ash (RHA) and fly ash (FA), which are added to the concrete mixtures by using the Central composite rotatable design. The result of this study provides the second-order regression equation of objective function, the images of the surface expression and the corresponding contours of the objective function of the regression equation, as the optimal points of HPC compressive strength. These objective functions, which are the compressive strength values of HPC at the ages of 3, 7 and 28 days, depend on two input variables as: x1 (amount of RHA) and x2 (amount of FA). The Maple 13 program, solving the second-order regression equation, determines the optimum composition of the concrete mixture for obtaining high performance concrete and calculates the maximum value of the HPC compressive strength at the ages of 28 days. The results containMaxR28HPC = 76.716 MPa when RHA = 0.1251 and FA = 0.3119 by mass of Portland cement.
Lorenz, David L.; Sanocki, Chris A.; Kocian, Matthew J.
2010-01-01
Knowledge of the peak flow of floods of a given recurrence interval is essential for regulation and planning of water resources and for design of bridges, culverts, and dams along Minnesota's rivers and streams. Statistical techniques are needed to estimate peak flow at ungaged sites because long-term streamflow records are available at relatively few places. Because of the need to have up-to-date peak-flow frequency information in order to estimate peak flows at ungaged sites, the U.S. Geological Survey (USGS) conducted a peak-flow frequency study in cooperation with the Minnesota Department of Transportation and the Minnesota Pollution Control Agency. Estimates of peak-flow magnitudes for 1.5-, 2-, 5-, 10-, 25-, 50-, 100-, and 500-year recurrence intervals are presented for 330 streamflow-gaging stations in Minnesota and adjacent areas in Iowa and South Dakota based on data through water year 2005. The peak-flow frequency information was subsequently used in regression analyses to develop equations relating peak flows for selected recurrence intervals to various basin and climatic characteristics. Two statistically derived techniques-regional regression equation and region of influence regression-can be used to estimate peak flow on ungaged streams smaller than 3,000 square miles in Minnesota. Regional regression equations were developed for selected recurrence intervals in each of six regions in Minnesota: A (northwestern), B (north central and east central), C (northeastern), D (west central and south central), E (southwestern), and F (southeastern). The regression equations can be used to estimate peak flows at ungaged sites. The region of influence regression technique dynamically selects streamflow-gaging stations with characteristics similar to a site of interest. Thus, the region of influence regression technique allows use of a potentially unique set of gaging stations for estimating peak flow at each site of interest. Two methods of selecting streamflow-gaging stations, similarity and proximity, can be used for the region of influence regression technique. The regional regression equation technique is the preferred technique as an estimate of peak flow in all six regions for ungaged sites. The region of influence regression technique is not appropriate for regions C, E, and F because the interrelations of some characteristics of those regions do not agree with the interrelations throughout the rest of the State. Both the similarity and proximity methods for the region of influence technique can be used in the other regions (A, B, and D) to provide additional estimates of peak flow. The peak-flow-frequency estimates and basin characteristics for selected streamflow-gaging stations and regional peak-flow regression equations are included in this report.
1990-05-01
0.759 0.744 0.768 0.753 106 (THUMBBR) THUMB BREADTH -0.652 -0.673 -0.539 -0.663 217 (LIPLGTHH) LIP LENGTH HEADBOARD 0.017 0.019 0.020 51 (FTBRHOR) FOOT...DEPENDENT VARIABLE: (106) THUMB BREADTH (THUBBR) MODEL INDEPENDENT VARIABLE 1 2 3 4 5 INTERCEPT 6.621 5.016 6.267 5.697 4.528 59 (HANDCIRC) HAND...95 (SLLSPEL) SLEEVE LENGTH: SPINE-ELBOW -0.020 -0.019 -C.018 9 (BLFTCIRC) BALL OF FOOT CIRCUMFERENCE -0.032 -0.039 106 (THUMBBR) THUMB BREADTH 0.228
NASA Astrophysics Data System (ADS)
Wang, Xuntao; Feng, Jianhu; Wang, Hu; Hong, Shidi; Zheng, Supei
2018-03-01
A three-dimensional finite element box girder bridge and its asphalt concrete deck pavement were established by ANSYS software, and the interlayer bonding condition of asphalt concrete deck pavement was assumed to be contact bonding condition. Orthogonal experimental design is used to arrange the testing plans of material parameters, and an evaluation of the effect of different material parameters in the mechanical response of asphalt concrete surface layer was conducted by multiple linear regression model and using the results from the finite element analysis. Results indicated that stress regression equations can well predict the stress of the asphalt concrete surface layer, and elastic modulus of waterproof layer has a significant influence on stress values of asphalt concrete surface layer.
Barth, Nancy A.; Veilleux, Andrea G.
2012-01-01
The U.S. Geological Survey (USGS) is currently updating at-site flood frequency estimates for USGS streamflow-gaging stations in the desert region of California. The at-site flood-frequency analysis is complicated by short record lengths (less than 20 years is common) and numerous zero flows/low outliers at many sites. Estimates of the three parameters (mean, standard deviation, and skew) required for fitting the log Pearson Type 3 (LP3) distribution are likely to be highly unreliable based on the limited and heavily censored at-site data. In a generalization of the recommendations in Bulletin 17B, a regional analysis was used to develop regional estimates of all three parameters (mean, standard deviation, and skew) of the LP3 distribution. A regional skew value of zero from a previously published report was used with a new estimated mean squared error (MSE) of 0.20. A weighted least squares (WLS) regression method was used to develop both a regional standard deviation and a mean model based on annual peak-discharge data for 33 USGS stations throughout California’s desert region. At-site standard deviation and mean values were determined by using an expected moments algorithm (EMA) method for fitting the LP3 distribution to the logarithms of annual peak-discharge data. Additionally, a multiple Grubbs-Beck (MGB) test, a generalization of the test recommended in Bulletin 17B, was used for detecting multiple potentially influential low outliers in a flood series. The WLS regression found that no basin characteristics could explain the variability of standard deviation. Consequently, a constant regional standard deviation model was selected, resulting in a log-space value of 0.91 with a MSE of 0.03 log units. Yet drainage area was found to be statistically significant at explaining the site-to-site variability in mean. The linear WLS regional mean model based on drainage area had a Pseudo- 2 R of 51 percent and a MSE of 0.32 log units. The regional parameter estimates were then used to develop a set of equations for estimating flows with 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities for ungaged basins. The final equations are functions of drainage area.Average standard errors of prediction for these regression equations range from 214.2 to 856.2 percent.
Wu, Hulin; Xue, Hongqi; Kumar, Arun
2012-06-01
Differential equations are extensively used for modeling dynamics of physical processes in many scientific fields such as engineering, physics, and biomedical sciences. Parameter estimation of differential equation models is a challenging problem because of high computational cost and high-dimensional parameter space. In this article, we propose a novel class of methods for estimating parameters in ordinary differential equation (ODE) models, which is motivated by HIV dynamics modeling. The new methods exploit the form of numerical discretization algorithms for an ODE solver to formulate estimating equations. First, a penalized-spline approach is employed to estimate the state variables and the estimated state variables are then plugged in a discretization formula of an ODE solver to obtain the ODE parameter estimates via a regression approach. We consider three different order of discretization methods, Euler's method, trapezoidal rule, and Runge-Kutta method. A higher-order numerical algorithm reduces numerical error in the approximation of the derivative, which produces a more accurate estimate, but its computational cost is higher. To balance the computational cost and estimation accuracy, we demonstrate, via simulation studies, that the trapezoidal discretization-based estimate is the best and is recommended for practical use. The asymptotic properties for the proposed numerical discretization-based estimators are established. Comparisons between the proposed methods and existing methods show a clear benefit of the proposed methods in regards to the trade-off between computational cost and estimation accuracy. We apply the proposed methods t an HIV study to further illustrate the usefulness of the proposed approaches. © 2012, The International Biometric Society.
NASA Technical Reports Server (NTRS)
Caldwell, E. C.; Cowley, M. S.; Scott-Pandorf, M. M.
2010-01-01
Develop a model that simulates a human running in 0 G using the European Space Agency s (ESA) Subject Loading System (SLS). The model provides ground reaction forces (GRF) based on speed and pull-down forces (PDF). DESIGN The theoretical basis for the Running Model was based on a simple spring-mass model. The dynamic properties of the spring-mass model express theoretical vertical GRF (GRFv) and shear GRF in the posterior-anterior direction (GRFsh) during running gait. ADAMs VIEW software was used to build the model, which has a pelvis, thigh segment, shank segment, and a spring foot (see Figure 1).the model s movement simulates the joint kinematics of a human running at Earth gravity with the aim of generating GRF data. DEVELOPMENT & VERIFICATION ESA provided parabolic flight data of subjects running while using the SLS, for further characterization of the model s GRF. Peak GRF data were fit to a linear regression line dependent on PDF and speed. Interpolation and extrapolation of the regression equation provided a theoretical data matrix, which is used to drive the model s motion equations. Verification of the model was conducted by running the model at 4 different speeds, with each speed accounting for 3 different PDF. The model s GRF data fell within a 1-standard-deviation boundary derived from the empirical ESA data. CONCLUSION The Running Model aids in conducting various simulations (potential scenarios include a fatigued runner or a powerful runner generating high loads at a fast cadence) to determine limitations for the T2 vibration isolation system (VIS) aboard the International Space Station. This model can predict how running with the ESA SLS affects the T2 VIS and may be used for other exercise analyses in the future.
Mathematical and computational model for the analysis of micro hybrid rocket motor
NASA Astrophysics Data System (ADS)
Stoia-Djeska, Marius; Mingireanu, Florin
2012-11-01
The hybrid rockets use a two-phase propellant system. In the present work we first develop a simplified model of the coupling of the hybrid combustion process with the complete unsteady flow, starting from the combustion port and ending with the nozzle. The physical and mathematical model are adapted to the simulations of micro hybrid rocket motors. The flow model is based on the one-dimensional Euler equations with source terms. The flow equations and the fuel regression rate law are solved in a coupled manner. The platform of the numerical simulations is an implicit fourth-order Runge-Kutta second order cell-centred finite volume method. The numerical results obtained with this model show a good agreement with published experimental and numerical results. The computational model developed in this work is simple, computationally efficient and offers the advantage of taking into account a large number of functional and constructive parameters that are used by the engineers.
NASA Technical Reports Server (NTRS)
Maughan, P. M. (Principal Investigator)
1973-01-01
The author has identified the following significant results. Linear regression of secchi disc visibility against number of sets yielded significant results in a number of instances. The variability seen in the slope of the regression lines is due to the nonuniformity of sample size. The longer the period sampled, the larger the total number of attempts. Further, there is no reason to expect either the influence of transparency or of other variables to remain constant throughout the season. However, the fact that the data for the entire season, variable as it is, was significant at the 5% level, suggests its potential utility for predictive modeling. Thus, this regression equation will be considered representative and will be utilized for the first numerical model. Secchi disc visibility was also regressed against number of sets for the three day period September 27-September 29, 1972 to determine if surface truth data supported the intense relationship between ERTS-1 identified turbidity and fishing effort previously discussed. A very negative correlation was found. These relationship lend additional credence to the hypothesis that ERTS imagery, when utilized as a source of visibility (turbidity) data, may be useful as a predictive tool.
Estimation of Flood Discharges at Selected Recurrence Intervals for Streams in New Hampshire
Olson, Scott A.
2009-01-01
This report provides estimates of flood discharges at selected recurrence intervals for streamgages in and adjacent to New Hampshire and equations for estimating flood discharges at recurrence intervals of 2-, 5-, 10-, 25-, 50-, 100-, and 500-years for ungaged, unregulated, rural streams in New Hampshire. The equations were developed using generalized least-squares regression. Flood-frequency and drainage-basin characteristics from 117 streamgages were used in developing the equations. The drainage-basin characteristics used as explanatory variables in the regression equations include drainage area, mean April precipitation, percentage of wetland area, and main channel slope. The average standard error of prediction for estimating the 2-, 5-, 10-, 25-, 50-, 100-, and 500-year recurrence interval flood discharges with these equations are 30.0, 30.8, 32.0, 34.2, 36.0, 38.1, and 43.4 percent, respectively. Flood discharges at selected recurrence intervals for selected streamgages were computed following the guidelines in Bulletin 17B of the U.S. Interagency Advisory Committee on Water Data. To determine the flood-discharge exceedence probabilities at streamgages in New Hampshire, a new generalized skew coefficient map covering the State was developed. The standard error of the data on new map is 0.298. To improve estimates of flood discharges at selected recurrence intervals for 20 streamgages with short-term records (10 to 15 years), record extension using the two-station comparison technique was applied. The two-station comparison method uses data from a streamgage with long-term record to adjust the frequency characteristics at a streamgage with a short-term record. A technique for adjusting a flood-discharge frequency curve computed from a streamgage record with results from the regression equations is described in this report. Also, a technique is described for estimating flood discharge at a selected recurrence interval for an ungaged site upstream or downstream from a streamgage using a drainage-area adjustment. The final regression equations and the flood-discharge frequency data used in this study will be available in StreamStats. StreamStats is a World Wide Web application providing automated regression-equation solutions for user-selected sites on streams.
Asquith, William H.
2014-01-01
A database containing more than 16,300 discharge values and ancillary hydraulic attributes was assembled from summaries of discharge measurement records for 391 USGS streamflow-gauging stations (streamgauges) in Texas. Each discharge is between the 40th- and 60th-percentile daily mean streamflow as determined by period-of-record, streamgauge-specific, flow-duration curves. Each discharge therefore is assumed to represent a discharge measurement made for near-median streamflow conditions, and such conditions are conceptualized as representative of midrange to baseflow conditions in much of the state. The hydraulic attributes of each discharge measurement included concomitant cross-section flow area, water-surface top width, and reported mean velocity. Two regression equations are presented: (1) an expression for discharge and (2) an expression for mean velocity, both as functions of selected hydraulic attributes and watershed characteristics. Specifically, the discharge equation uses cross-sectional area, water-surface top width, contributing drainage area of the watershed, and mean annual precipitation of the location; the equation has an adjusted R-squared of approximately 0.95 and residual standard error of approximately 0.23 base-10 logarithm (cubic meters per second). The mean velocity equation uses discharge, water-surface top width, contributing drainage area, and mean annual precipitation; the equation has an adjusted R-squared of approximately 0.50 and residual standard error of approximately 0.087 third root (meters per second). Residual plots from both equations indicate that reliable estimates of discharge and mean velocity at ungauged stream sites are possible. Further, the relation between contributing drainage area and main-channel slope (a measure of whole-watershed slope) is depicted to aid analyst judgment of equation applicability for ungauged sites. Example applications and computations are provided and discussed within a real-world, discharge-measurement scenario, and an illustration of the development of a preliminary stage-discharge relation using the discharge equation is given.
Villa, Chiara; Brůžek, Jaroslav
2017-01-01
Background Estimating volumes and masses of total body components is important for the study and treatment monitoring of nutrition and nutrition-related disorders, cancer, joint replacement, energy-expenditure and exercise physiology. While several equations have been offered for estimating total body components from MRI slices, no reliable and tested method exists for CT scans. For the first time, body composition data was derived from 41 high-resolution whole-body CT scans. From these data, we defined equations for estimating volumes and masses of total body AT and LT from corresponding tissue areas measured in selected CT scan slices. Methods We present a new semi-automatic approach to defining the density cutoff between adipose tissue (AT) and lean tissue (LT) in such material. An intra-class correlation coefficient (ICC) was used to validate the method. The equations for estimating the whole-body composition volume and mass from areas measured in selected slices were modeled with ordinary least squares (OLS) linear regressions and support vector machine regression (SVMR). Results and Discussion The best predictive equation for total body AT volume was based on the AT area of a single slice located between the 4th and 5th lumbar vertebrae (L4-L5) and produced lower prediction errors (|PE| = 1.86 liters, %PE = 8.77) than previous equations also based on CT scans. The LT area of the mid-thigh provided the lowest prediction errors (|PE| = 2.52 liters, %PE = 7.08) for estimating whole-body LT volume. We also present equations to predict total body AT and LT masses from a slice located at L4-L5 that resulted in reduced error compared with the previously published equations based on CT scans. The multislice SVMR predictor gave the theoretical upper limit for prediction precision of volumes and cross-validated the results. PMID:28533960
Lacoste Jeanson, Alizé; Dupej, Ján; Villa, Chiara; Brůžek, Jaroslav
2017-01-01
Estimating volumes and masses of total body components is important for the study and treatment monitoring of nutrition and nutrition-related disorders, cancer, joint replacement, energy-expenditure and exercise physiology. While several equations have been offered for estimating total body components from MRI slices, no reliable and tested method exists for CT scans. For the first time, body composition data was derived from 41 high-resolution whole-body CT scans. From these data, we defined equations for estimating volumes and masses of total body AT and LT from corresponding tissue areas measured in selected CT scan slices. We present a new semi-automatic approach to defining the density cutoff between adipose tissue (AT) and lean tissue (LT) in such material. An intra-class correlation coefficient (ICC) was used to validate the method. The equations for estimating the whole-body composition volume and mass from areas measured in selected slices were modeled with ordinary least squares (OLS) linear regressions and support vector machine regression (SVMR). The best predictive equation for total body AT volume was based on the AT area of a single slice located between the 4th and 5th lumbar vertebrae (L4-L5) and produced lower prediction errors (|PE| = 1.86 liters, %PE = 8.77) than previous equations also based on CT scans. The LT area of the mid-thigh provided the lowest prediction errors (|PE| = 2.52 liters, %PE = 7.08) for estimating whole-body LT volume. We also present equations to predict total body AT and LT masses from a slice located at L4-L5 that resulted in reduced error compared with the previously published equations based on CT scans. The multislice SVMR predictor gave the theoretical upper limit for prediction precision of volumes and cross-validated the results.
Schell, Greggory J; Lavieri, Mariel S; Stein, Joshua D; Musch, David C
2013-12-21
Open-angle glaucoma (OAG) is a prevalent, degenerate ocular disease which can lead to blindness without proper clinical management. The tests used to assess disease progression are susceptible to process and measurement noise. The aim of this study was to develop a methodology which accounts for the inherent noise in the data and improve significant disease progression identification. Longitudinal observations from the Collaborative Initial Glaucoma Treatment Study (CIGTS) were used to parameterize and validate a Kalman filter model and logistic regression function. The Kalman filter estimates the true value of biomarkers associated with OAG and forecasts future values of these variables. We develop two logistic regression models via generalized estimating equations (GEE) for calculating the probability of experiencing significant OAG progression: one model based on the raw measurements from CIGTS and another model based on the Kalman filter estimates of the CIGTS data. Receiver operating characteristic (ROC) curves and associated area under the ROC curve (AUC) estimates are calculated using cross-fold validation. The logistic regression model developed using Kalman filter estimates as data input achieves higher sensitivity and specificity than the model developed using raw measurements. The mean AUC for the Kalman filter-based model is 0.961 while the mean AUC for the raw measurements model is 0.889. Hence, using the probability function generated via Kalman filter estimates and GEE for logistic regression, we are able to more accurately classify patients and instances as experiencing significant OAG progression. A Kalman filter approach for estimating the true value of OAG biomarkers resulted in data input which improved the accuracy of a logistic regression classification model compared to a model using raw measurements as input. This methodology accounts for process and measurement noise to enable improved discrimination between progression and nonprogression in chronic diseases.
New approach to probability estimate of femoral neck fracture by fall (Slovak regression model).
Wendlova, J
2009-01-01
3,216 Slovak women with primary or secondary osteoporosis or osteopenia, aged 20-89 years, were examined with the bone densitometer DXA (dual energy X-ray absorptiometry, GE, Prodigy - Primo), x = 58.9, 95% C.I. (58.42; 59.38). The values of the following variables for each patient were measured: FSI (femur strength index), T-score total hip left, alpha angle - left, theta angle - left, HAL (hip axis length) left, BMI (body mass index) was calculated from the height and weight of the patients. Regression model determined the following order of independent variables according to the intensity of their influence upon the occurrence of values of dependent FSI variable: 1. BMI, 2. theta angle, 3. T-score total hip, 4. alpha angle, 5. HAL. The regression model equation, calculated from the variables monitored in the study, enables a doctor in praxis to determine the probability magnitude (absolute risk) for the occurrence of pathological value of FSI (FSI < 1) in the femoral neck area, i. e., allows for probability estimate of a femoral neck fracture by fall for Slovak women. 1. The Slovak regression model differs from regression models, published until now, in chosen independent variables and a dependent variable, belonging to biomechanical variables, characterising the bone quality. 2. The Slovak regression model excludes the inaccuracies of other models, which are not able to define precisely the current and past clinical condition of tested patients (e.g., to define the length and dose of exposure to risk factors). 3. The Slovak regression model opens the way to a new method of estimating the probability (absolute risk) or the odds for a femoral neck fracture by fall, based upon the bone quality determination. 4. It is assumed that the development will proceed by improving the methods enabling to measure the bone quality, determining the probability of fracture by fall (Tab. 6, Fig. 3, Ref. 22). Full Text (Free, PDF) www.bmj.sk.
Reforming the Military Health Care System
1988-01-01
Population Model and its Application ," International Journal of Health Services, vol. 10, no. 4 (1980). 7. "Understanding Variations in the Use of... Financial Management (November 1986), pp. 26- 34. 21. Based on the following multiple regression equation: OP/NOR= 0.51 + 0.35x(POP/NOR)-6.84x(CIV/NORxPOP) (t...Military Beneficiary Health Care Survey 95 B Actual and Expected Admission Rates 99 C The Statistical Model of Family Use 103 D The Capitation Budgeting
Meteorological adjustment of yearly mean values for air pollutant concentration comparison
NASA Technical Reports Server (NTRS)
Sidik, S. M.; Neustadter, H. E.
1976-01-01
Using multiple linear regression analysis, models which estimate mean concentrations of Total Suspended Particulate (TSP), sulfur dioxide, and nitrogen dioxide as a function of several meteorologic variables, two rough economic indicators, and a simple trend in time are studied. Meteorologic data were obtained and do not include inversion heights. The goodness of fit of the estimated models is partially reflected by the squared coefficient of multiple correlation which indicates that, at the various sampling stations, the models accounted for about 23 to 47 percent of the total variance of the observed TSP concentrations. If the resulting model equations are used in place of simple overall means of the observed concentrations, there is about a 20 percent improvement in either: (1) predicting mean concentrations for specified meteorological conditions; or (2) adjusting successive yearly averages to allow for comparisons devoid of meteorological effects. An application to source identification is presented using regression coefficients of wind velocity predictor variables.
ELASTIC NET FOR COX'S PROPORTIONAL HAZARDS MODEL WITH A SOLUTION PATH ALGORITHM.
Wu, Yichao
2012-01-01
For least squares regression, Efron et al. (2004) proposed an efficient solution path algorithm, the least angle regression (LAR). They showed that a slight modification of the LAR leads to the whole LASSO solution path. Both the LAR and LASSO solution paths are piecewise linear. Recently Wu (2011) extended the LAR to generalized linear models and the quasi-likelihood method. In this work we extend the LAR further to handle Cox's proportional hazards model. The goal is to develop a solution path algorithm for the elastic net penalty (Zou and Hastie (2005)) in Cox's proportional hazards model. This goal is achieved in two steps. First we extend the LAR to optimizing the log partial likelihood plus a fixed small ridge term. Then we define a path modification, which leads to the solution path of the elastic net regularized log partial likelihood. Our solution path is exact and piecewise determined by ordinary differential equation systems.
Analytical and regression models of glass rod drawing process
NASA Astrophysics Data System (ADS)
Alekseeva, L. B.
2018-03-01
The process of drawing glass rods (light guides) is being studied. The parameters of the process affecting the quality of the light guide have been determined. To solve the problem, mathematical models based on general equations of continuum mechanics are used. The conditions for the stable flow of the drawing process have been found, which are determined by the stability of the motion of the glass mass in the formation zone to small uncontrolled perturbations. The sensitivity of the formation zone to perturbations of the drawing speed and viscosity is estimated. Experimental models of the drawing process, based on the regression analysis methods, have been obtained. These models make it possible to customize a specific production process to obtain light guides of the required quality. They allow one to find the optimum combination of process parameters in the chosen area and to determine the required accuracy of maintaining them at a specified level.
NASA Astrophysics Data System (ADS)
Hartanto, R.; Jantra, M. A. C.; Santosa, S. A. B.; Purnomoadi, A.
2018-01-01
The purpose of this research was to find an appropriate relationship model between the feed energy and protein ratio with the amount of production and quality of milk proteins. This research was conducted at Getasan Sub-district, Semarang Regency, Central Java Province, Indonesia using 40 samples (Holstein Friesian cattle, lactation period II-III and lactation month 3-4). Data were analyzed using linear and quadratic regressions, to predict the production and quality of milk protein from feed energy and protein ratio that describe the diet. The significance of model was tested using analysis of variance. Coefficient of determination (R2), residual variance (RV) and root mean square prediction error (RMSPE) were reported for the developed equations as an indicator of the goodness of model fit. The results showed no relationship in milk protein (kg), milk casein (%), milk casein (kg) and milk urea N (mg/dl) as function of CP/TDN. The significant relationship was observed in milk production (L or kg) and milk protein (%) as function of CP/TDN, both in linear and quadratic models. In addition, a quadratic change in milk production (L) (P = 0.003), milk production (kg) (P = 0.003) and milk protein concentration (%) (P = 0.026) were observed with increase of CP/TDN. It can be concluded that quadratic equation was the good fitting model for this research, because quadratic equation has larger R2, smaller RV and smaller RMSPE than those of linear equation.
Using High Resolution Model Data to Improve Lightning Forecasts across Southern California
NASA Astrophysics Data System (ADS)
Capps, S. B.; Rolinski, T.
2014-12-01
Dry lightning often results in a significant amount of fire starts in areas where the vegetation is dry and continuous. Meteorologists from the USDA Forest Service Predictive Services' program in Riverside, California are tasked to provide southern and central California's fire agencies with fire potential outlooks. Logistic regression equations were developed by these meteorologists several years ago, which forecast probabilities of lightning as well as lightning amounts, out to seven days across southern California. These regression equations were developed using ten years of historical gridded data from the Global Forecast System (GFS) model on a coarse scale (0.5 degree resolution), correlated with historical lightning strike data. These equations do a reasonably good job of capturing a lightning episode (3-5 consecutive days or greater of lightning), but perform poorly regarding more detailed information such as exact location and amounts. It is postulated that the inadequacies in resolving the finer details of episodic lightning events is due to the coarse resolution of the GFS data, along with limited predictors. Stability parameters, such as the Lifted Index (LI), the Total Totals index (TT), Convective Available Potential Energy (CAPE), along with Precipitable Water (PW) are the only parameters being considered as predictors. It is hypothesized that the statistical forecasts will benefit from higher resolution data both in training and implementing the statistical model. We have dynamically downscaled NCEP FNL (Final) reanalysis data using the Weather Research and Forecasting model (WRF) to 3km spatial and hourly temporal resolution across a decade. This dataset will be used to evaluate the contribution to the success of the statistical model of additional predictors in higher vertical, spatial and temporal resolution. If successful, we will implement an operational dynamically downscaled GFS forecast product to generate predictors for the resulting statistical lightning model. This data will help fire agencies be better prepared to pre-deploy resources in advance of these events. Specific information regarding duration, amount, and location will be especially valuable.
NASA Astrophysics Data System (ADS)
Anak Gisen, Jacqueline Isabella; Nijzink, Remko C.; Savenije, Hubert H. G.
2014-05-01
Dispersion mathematical representation of tidal mixing between sea water and fresh water in The definition of dispersion somehow remains unclear as it is not directly measurable. The role of dispersion is only meaningful if it is related to the appropriate temporal and spatial scale of mixing, which are identified as the tidal period, tidal excursion (longitudinal), width of estuary (lateral) and mixing depth (vertical). Moreover, the mixing pattern determines the salt intrusion length in an estuary. If a physically based description of the dispersion is defined, this would allow the analytical solution of the salt intrusion problem. The objective of this study is to develop a predictive equation for estimating the dispersion coefficient at tidal average (TA) condition, which can be applied in the salt intrusion model to predict the salinity profile for any estuary during different events. Utilizing available data of 72 measurements in 27 estuaries (including 6 recently studied estuaries in Malaysia), regressions analysis has been performed with various combinations of dimensionless parameters . The predictive dispersion equations have been developed for two different locations, at the mouth D0TA and at the inflection point D1TA (where the convergence length changes). Regressions have been carried out with two separated datasets: 1) more reliable data for calibration; and 2) less reliable data for validation. The combination of dimensionless ratios that give the best performance is selected as the final outcome which indicates that the dispersion coefficient is depending on the tidal excursion, tidal range, tidal velocity amplitude, friction and the Richardson Number. A limitation of the newly developed equation is that the friction is generally unknown. In order to compensate this problem, further analysis has been performed adopting the hydraulic model of Cai et. al. (2012) to estimate the friction and depth. Keywords: dispersion, alluvial estuaries, mixing, salt intrusion, predictive equation
Zlotnik, Alexander; Gallardo-Antolín, Ascensión; Cuchí Alfaro, Miguel; Pérez Pérez, María Carmen; Montero Martínez, Juan Manuel
2015-08-01
Although emergency department visit forecasting can be of use for nurse staff planning, previous research has focused on models that lacked sufficient resolution and realistic error metrics for these predictions to be applied in practice. Using data from a 1100-bed specialized care hospital with 553,000 patients assigned to its healthcare area, forecasts with different prediction horizons, from 2 to 24 weeks ahead, with an 8-hour granularity, using support vector regression, M5P, and stratified average time-series models were generated with an open-source software package. As overstaffing and understaffing errors have different implications, error metrics and potential personnel monetary savings were calculated with a custom validation scheme, which simulated subsequent generation of predictions during a 4-year period. Results were then compared with a generalized estimating equation regression. Support vector regression and M5P models were found to be superior to the stratified average model with a 95% confidence interval. Our findings suggest that medium and severe understaffing situations could be reduced in more than an order of magnitude and average yearly savings of up to €683,500 could be achieved if dynamic nursing staff allocation was performed with support vector regression instead of the static staffing levels currently in use.
NASA Astrophysics Data System (ADS)
Leroux, Romain; Chatellier, Ludovic; David, Laurent
2018-01-01
This article is devoted to the estimation of time-resolved particle image velocimetry (TR-PIV) flow fields using a time-resolved point measurements of a voltage signal obtained by hot-film anemometry. A multiple linear regression model is first defined to map the TR-PIV flow fields onto the voltage signal. Due to the high temporal resolution of the signal acquired by the hot-film sensor, the estimates of the TR-PIV flow fields are obtained with a multiple linear regression method called orthonormalized partial least squares regression (OPLSR). Subsequently, this model is incorporated as the observation equation in an ensemble Kalman filter (EnKF) applied on a proper orthogonal decomposition reduced-order model to stabilize it while reducing the effects of the hot-film sensor noise. This method is assessed for the reconstruction of the flow around a NACA0012 airfoil at a Reynolds number of 1000 and an angle of attack of {20}°. Comparisons with multi-time delay-modified linear stochastic estimation show that both the OPLSR and EnKF combined with OPLSR are more accurate as they produce a much lower relative estimation error, and provide a faithful reconstruction of the time evolution of the velocity flow fields.
NASA Astrophysics Data System (ADS)
Zafarani, H.; Luzi, Lucia; Lanzano, Giovanni; Soghrat, M. R.
2018-01-01
A recently compiled, comprehensive, and good-quality strong-motion database of the Iranian earthquakes has been used to develop local empirical equations for the prediction of peak ground acceleration (PGA) and 5%-damped pseudo-spectral accelerations (PSA) up to 4.0 s. The equations account for style of faulting and four site classes and use the horizontal distance from the surface projection of the rupture plane as a distance measure. The model predicts the geometric mean of horizontal components and the vertical-to-horizontal ratio. A total of 1551 free-field acceleration time histories recorded at distances of up to 200 km from 200 shallow earthquakes (depth < 30 km) with moment magnitudes ranging from Mw 4.0 to 7.3 are used to perform regression analysis using the random effects algorithm of Abrahamson and Youngs (Bull Seism Soc Am 82:505-510, 1992), which considers between-events as well as within-events errors. Due to the limited data used in the development of previous Iranian ground motion prediction equations (GMPEs) and strong trade-offs between different terms of GMPEs, it is likely that the previously determined models might have less precision on their coefficients in comparison to the current study. The richer database of the current study allows improving on prior works by considering additional variables that could not previously be adequately constrained. Here, a functional form used by Boore and Atkinson (Earthquake Spect 24:99-138, 2008) and Bindi et al. (Bull Seism Soc Am 9:1899-1920, 2011) has been adopted that allows accounting for the saturation of ground motions at close distances. A regression has been also performed for the V/H in order to retrieve vertical components by scaling horizontal spectra. In order to take into account epistemic uncertainty, the new model can be used along with other appropriate GMPEs through a logic tree framework for seismic hazard assessment in Iran and Middle East region.
Penn, Richard; Werner, Michael; Thomas, Justin
2015-01-01
Background Estimation of stochastic process models from data is a common application of time series analysis methods. Such system identification processes are often cast as hypothesis testing exercises whose intent is to estimate model parameters and test them for statistical significance. Ordinary least squares (OLS) regression and the Levenberg-Marquardt algorithm (LMA) have proven invaluable computational tools for models being described by non-homogeneous, linear, stationary, ordinary differential equations. Methods In this paper we extend stochastic model identification to linear, stationary, partial differential equations in two independent variables (2D) and show that OLS and LMA apply equally well to these systems. The method employs an original nonparametric statistic as a test for the significance of estimated parameters. Results We show gray scale and color images are special cases of 2D systems satisfying a particular autoregressive partial difference equation which estimates an analogous partial differential equation. Several applications to medical image modeling and classification illustrate the method by correctly classifying demented and normal OLS models of axial magnetic resonance brain scans according to subject Mini Mental State Exam (MMSE) scores. Comparison with 13 image classifiers from the literature indicates our classifier is at least 14 times faster than any of them and has a classification accuracy better than all but one. Conclusions Our modeling method applies to any linear, stationary, partial differential equation and the method is readily extended to 3D whole-organ systems. Further, in addition to being a robust image classifier, estimated image models offer insights into which parameters carry the most diagnostic image information and thereby suggest finer divisions could be made within a class. Image models can be estimated in milliseconds which translate to whole-organ models in seconds; such runtimes could make real-time medicine and surgery modeling possible. PMID:26029638
Analysis of oscillatory motion of a light airplane at high values of lift coefficient
NASA Technical Reports Server (NTRS)
Batterson, J. G.
1983-01-01
A modified stepwise regression is applied to flight data from a light research air-plane operating at high angles at attack. The well-known phenomenon referred to as buckling or porpoising is analyzed and modeled using both power series and spline expansions of the aerodynamic force and moment coefficients associated with the longitudinal equations of motion.
1982-05-13
Size Of The Software. A favourite measure for software system size is linos of operational code, or deliverable code (operational code plus...regression models, these conversions are either derived from productivity measures using the "cost per instruction" type of equation or they are...appropriate to different development organisattons, differert project types, different sets of units for measuring e and s, and different items
Mental Models of Software Forecasting
NASA Technical Reports Server (NTRS)
Hihn, J.; Griesel, A.; Bruno, K.; Fouser, T.; Tausworthe, R.
1993-01-01
The majority of software engineers resist the use of the currently available cost models. One problem is that the mathematical and statistical models that are currently available do not correspond with the mental models of the software engineers. In an earlier JPL funded study (Hihn and Habib-agahi, 1991) it was found that software engineers prefer to use analogical or analogy-like techniques to derive size and cost estimates, whereas curren CER's hide any analogy in the regression equations. In addition, the currently available models depend upon information which is not available during early planning when the most important forecasts must be made.
Kim, Minji; Kim, Won-Baek; Koo, Kyoung Yoon; Kim, Bo Ram; Kim, Doohyun; Lee, Seoyoun; Son, Hong Joo; Hwang, Dae Youn; Kim, Dong Seob; Lee, Chung Yeoul; Lee, Heeseob
2017-04-28
This study was conducted to evaluate the hyaluronidase (HAase) inhibition activity of Asparagus cochinchinesis (AC) extracts following fermentation by Weissella cibaria through response surface methodology. To optimize the HAase inhibition activity, a central composite design was introduced based on four variables: the concentration of AC extract ( X 1 : 1-5%), amount of starter culture ( X 2 : 1-5%), pH ( X 3 : 4-8), and fermentation time ( X 4 : 0-10 days). The experimental data were fitted to quadratic regression equations, the accuracy of the equations was analyzed by ANOVA, and the regression coefficients for the surface quadratic model of HAase inhibition activity in the fermented AC extract were estimated by the F test and the corresponding p values. The HAase inhibition activity indicated that fermentation time was most significant among the parameters within the conditions tested. To validate the model, two different conditions among those generated by the Design Expert program were selected. Under both conditions, predicted and experimental data agreed well. Moreover, the content of protodioscin (a well-known compound related to anti-inflammation activity) was elevated after fermentation of the AC extract at the optimized fermentation condition.
Predicting Diameter at Breast Height from Stump Diameters for Northeastern Tree Species
Eric H. Wharton; Eric H. Wharton
1984-01-01
Presents equations to predict diameter at breast height from stump diameter measurements for 17 northeastern tree species. Simple linear regression was used to develop the equations. Application of the equations is discussed.
Predicting in ungauged basins using a parsimonious rainfall-runoff model
NASA Astrophysics Data System (ADS)
Skaugen, Thomas; Olav Peerebom, Ivar; Nilsson, Anna
2015-04-01
Prediction in ungauged basins is a demanding, but necessary test for hydrological model structures. Ideally, the relationship between model parameters and catchment characteristics (CC) should be hydrologically justifiable. Many studies, however, report on failure to obtain significant correlations between model parameters and CCs. Under the hypothesis that the lack of correlations stems from non-identifiability of model parameters caused by overparameterization, the relatively new parameter parsimonious DDD (Distance Distribution Dynamics) model was tested for predictions in ungauged basins in Norway. In DDD, the capacity of the subsurface water reservoir M is the only parameter to be calibrated whereas the runoff dynamics is completely parameterised from observed characteristics derived from GIS and runoff recession analysis. Water is conveyed through the soils to the river network by waves with celerities determined by the level of saturation in the catchment. The distributions of distances between points in the catchment to the nearest river reach and of the river network give, together with the celerities, distributions of travel times, and, consequently unit hydrographs. DDD has 6 parameters less to calibrate in the runoff module than, for example, the well-known Swedish HBV model. In this study, multiple regression equations relating CCs and model parameters were trained from 84 calibrated catchments located all over Norway and all model parameters showed significant correlations with catchment characteristics. The significant correlation coefficients (with p- value < 0.05) ranged from 0.22-0.55. The suitability of DDD for predictions in ungauged basins was tested for 17 catchments not used to estimate the multiple regression equations. For 10 of the 17 catchments, deviations in Nash-Suthcliffe Efficiency (NSE) criteria between the calibrated and regionalised model were less than 0.1. The median NSE for the regionalised DDD for the 17 catchments, for two different time series was 0.66 and 0.72. Deviations in NSE between calibrated and regionalised models are well explained by the deviations between calibrated and regressed parameters describing spatial snow distribution and snowmelt, respectively. This latter result indicates the topic for further improvements in the model structure of DDD.
Guo, Zhi-Jun; Lin, Qiang; Liu, Hai-Tao; Lu, Jun-Ying; Zeng, Yan-Hong; Meng, Fan-Jie; Cao, Bin; Zi, Xue-Rong; Han, Shu-Ming; Zhang, Yu-Huan
2013-09-01
Using computed tomography (CT) to rapidly and accurately quantify pleural effusion volume benefits medical and scientific research. However, the precise volume of pleural effusions still involves many challenges and currently does not have a recognized accurate measuring. To explore the feasibility of using 64-slice CT volume-rendering technology to accurately measure pleural fluid volume and to then analyze the correlation between the volume of the free pleural effusion and the different diameters of the pleural effusion. The 64-slice CT volume-rendering technique was used to measure and analyze three parts. First, the fluid volume of a self-made thoracic model was measured and compared with the actual injected volume. Second, the pleural effusion volume was measured before and after pleural fluid drainage in 25 patients, and the volume reduction was compared with the actual volume of the liquid extract. Finally, the free pleural effusion volume was measured in 26 patients to analyze the correlation between it and the diameter of the effusion, which was then used to calculate the regression equation. After using the 64-slice CT volume-rendering technique to measure the fluid volume of the self-made thoracic model, the results were compared with the actual injection volume. No significant differences were found, P = 0.836. For the 25 patients with drained pleural effusions, the comparison of the reduction volume with the actual volume of the liquid extract revealed no significant differences, P = 0.989. The following linear regression equation was used to compare the pleural effusion volume (V) (measured by the CT volume-rendering technique) with the pleural effusion greatest depth (d): V = 158.16 × d - 116.01 (r = 0.91, P = 0.000). The following linear regression was used to compare the volume with the product of the pleural effusion diameters (l × h × d): V = 0.56 × (l × h × d) + 39.44 (r = 0.92, P = 0.000). The 64-slice CT volume-rendering technique can accurately measure the volume in pleural effusion patients, and a linear regression equation can be used to estimate the volume of the free pleural effusion.
Bayesian Factor Analysis as a Variable Selection Problem: Alternative Priors and Consequences
Lu, Zhao-Hua; Chow, Sy-Miin; Loken, Eric
2016-01-01
Factor analysis is a popular statistical technique for multivariate data analysis. Developments in the structural equation modeling framework have enabled the use of hybrid confirmatory/exploratory approaches in which factor loading structures can be explored relatively flexibly within a confirmatory factor analysis (CFA) framework. Recently, a Bayesian structural equation modeling (BSEM) approach (Muthén & Asparouhov, 2012) has been proposed as a way to explore the presence of cross-loadings in CFA models. We show that the issue of determining factor loading patterns may be formulated as a Bayesian variable selection problem in which Muthén and Asparouhov’s approach can be regarded as a BSEM approach with ridge regression prior (BSEM-RP). We propose another Bayesian approach, denoted herein as the Bayesian structural equation modeling with spike and slab prior (BSEM-SSP), which serves as a one-stage alternative to the BSEM-RP. We review the theoretical advantages and disadvantages of both approaches and compare their empirical performance relative to two modification indices-based approaches and exploratory factor analysis with target rotation. A teacher stress scale data set (Byrne, 2012; Pettegrew & Wolf, 1982) is used to demonstrate our approach. PMID:27314566
Von Guerard, Paul; Weiss, W.B.
1995-01-01
The U.S. Environmental Protection Agency requires that municipalities that have a population of 100,000 or greater obtain National Pollutant Discharge Elimination System permits to characterize the quality of their storm runoff. In 1992, the U.S. Geological Survey, in cooperation with the Colorado Springs City Engineering Division, began a study to characterize the water quality of storm runoff and to evaluate procedures for the estimation of storm-runoff loads, volume and event-mean concentrations for selected properties and constituents. Precipitation, streamflow, and water-quality data were collected during 1992 at five sites in Colorado Springs. Thirty-five samples were collected, seven at each of the five sites. At each site, three samples were collected for permitting purposes; two of the samples were collected during rainfall runoff, and one sample was collected during snowmelt runoff. Four additional samples were collected at each site to obtain a large enough sample size to estimate storm-runoff loads, volume, and event-mean concentrations for selected properties and constituents using linear-regression procedures developed using data from the Nationwide Urban Runoff Program (NURP). Storm-water samples were analyzed for as many as 186 properties and constituents. The constituents measured include total-recoverable metals, vola-tile-organic compounds, acid-base/neutral organic compounds, and pesticides. Storm runoff sampled had large concentrations of chemical oxygen demand and 5-day biochemical oxygen demand. Chemical oxygen demand ranged from 100 to 830 milligrams per liter, and 5.-day biochemical oxygen demand ranged from 14 to 260 milligrams per liter. Total-organic carbon concentrations ranged from 18 to 240 milligrams per liter. The total-recoverable metals lead and zinc had the largest concentrations of the total-recoverable metals analyzed. Concentrations of lead ranged from 23 to 350 micrograms per liter, and concentrations of zinc ranged from 110 to 1,400 micrograms per liter. The data for 30 storms representing rainfall runoff from 5 drainage basins were used to develop single-storm local-regression models. The response variables, storm-runoff loads, volume, and event-mean concentrations were modeled using explanatory variables for climatic, physical, and land-use characteristics. The r2 for models that use ordinary least-squares regression ranged from 0.57 to 0.86 for storm-runoff loads and volume and from 0.25 to 0.63 for storm-runoff event-mean concentrations. Except for cadmium, standard errors of estimate ranged from 43 to 115 percent for storm- runoff loads and volume and from 35 to 66 percent for storm-runoff event-mean concentrations. Eleven of the 30 concentrations collected during rainfall runoff for total-recoverable cadmium were censored (less than) concentrations. Ordinary least-squares regression should not be used with censored data; however, censored data can be included with uncensored data using tobit regression. Standard errors of estimate for storm-runoff load and event-mean concentration for total-recoverable cadmium, computed using tobit regression, are 247 and 171 percent. Estimates from single-storm regional-regression models, developed from the Nationwide Urban Runoff Program data base, were compared with observed storm-runoff loads, volume, and event-mean concentrations determined from samples collected in the study area. Single-storm regional-regression models tended to overestimate storm-runoff loads, volume, and event-mean con-centrations. Therefore, single-storm local- and regional-regression models were combined using model-adjustment procedures to take advantage of the strengths of both models while minimizing the deficiencies of each model. Procedures were used to develop single-stormregression equations that were adjusted using local data and estimates from single-storm regional-regression equations. Single-storm regression models developed using model- adjustment proce
Rane, Smita; Prabhakar, Bala
2013-07-01
The aim of this study was to investigate the combined influence of 3 independent variables in the preparation of paclitaxel containing pH-sensitive liposomes. A 3 factor, 3 levels Box-Behnken design was used to derive a second order polynomial equation and construct contour plots to predict responses. The independent variables selected were molar ratio phosphatidylcholine:diolylphosphatidylethanolamine (X1), molar concentration of cholesterylhemisuccinate (X2), and amount of drug (X3). Fifteen batches were prepared by thin film hydration method and evaluated for percent drug entrapment, vesicle size, and pH sensitivity. The transformed values of the independent variables and the percent drug entrapment were subjected to multiple regression to establish full model second order polynomial equation. F was calculated to confirm the omission of insignificant terms from the full model equation to derive a reduced model polynomial equation to predict the dependent variables. Contour plots were constructed to show the effects of X1, X2, and X3 on the percent drug entrapment. A model was validated for accurate prediction of the percent drug entrapment by performing checkpoint analysis. The computer optimization process and contour plots predicted the levels of independent variables X1, X2, and X3 (0.99, -0.06, 0, respectively), for maximized response of percent drug entrapment with constraints on vesicle size and pH sensitivity.
Estimates of Median Flows for Streams on the 1999 Kansas Surface Water Register
Perry, Charles A.; Wolock, David M.; Artman, Joshua C.
2004-01-01
The Kansas State Legislature, by enacting Kansas Statute KSA 82a?2001 et. seq., mandated the criteria for determining which Kansas stream segments would be subject to classification by the State. One criterion for the selection as a classified stream segment is based on the statistic of median flow being equal to or greater than 1 cubic foot per second. As specified by KSA 82a?2001 et. seq., median flows were determined from U.S. Geological Survey streamflow-gaging-station data by using the most-recent 10 years of gaged data (KSA) for each streamflow-gaging station. Median flows also were determined by using gaged data from the entire period of record (all-available hydrology, AAH). Least-squares multiple regression techniques were used, along with Tobit analyses, to develop equations for estimating median flows for uncontrolled stream segments. The drainage area of the gaging stations on uncontrolled stream segments used in the regression analyses ranged from 2.06 to 12,004 square miles. A logarithmic transformation of the data was needed to develop the best linear relation for computing median flows. In the regression analyses, the significant climatic and basin characteristics, in order of importance, were drainage area, mean annual precipitation, mean basin permeability, and mean basin slope. Tobit analyses of KSA data yielded a model standard error of prediction of 0.285 logarithmic units, and the best equations using Tobit analyses of AAH data had a model standard error of prediction of 0.250 logarithmic units. These regression equations and an interpolation procedure were used to compute median flows for the uncontrolled stream segments on the 1999 Kansas Surface Water Register. Measured median flows from gaging stations were incorporated into the regression-estimated median flows along the stream segments where available. The segments that were uncontrolled were interpolated using gaged data weighted according to the drainage area and the bias between the regression-estimated and gaged flow information. On controlled segments of Kansas streams, the median flow information was interpolated between gaging stations using only gaged data weighted by drainage area. Of the 2,232 total stream segments on the Kansas Surface Water Register, 34.5 percent of the segments had an estimated median streamflow of less than 1 cubic foot per second when the KSA analysis was used. When the AAH analysis was used, 36.2 percent of the segments had an estimated median streamflow of less than 1 cubic foot per second. This report supercedes U.S. Geological Survey Water-Resources Investigations Report 02?4292.
Sun, Yanqing; Sun, Liuquan; Zhou, Jie
2013-07-01
This paper studies the generalized semiparametric regression model for longitudinal data where the covariate effects are constant for some and time-varying for others. Different link functions can be used to allow more flexible modelling of longitudinal data. The nonparametric components of the model are estimated using a local linear estimating equation and the parametric components are estimated through a profile estimating function. The method automatically adjusts for heterogeneity of sampling times, allowing the sampling strategy to depend on the past sampling history as well as possibly time-dependent covariates without specifically model such dependence. A [Formula: see text]-fold cross-validation bandwidth selection is proposed as a working tool for locating an appropriate bandwidth. A criteria for selecting the link function is proposed to provide better fit of the data. Large sample properties of the proposed estimators are investigated. Large sample pointwise and simultaneous confidence intervals for the regression coefficients are constructed. Formal hypothesis testing procedures are proposed to check for the covariate effects and whether the effects are time-varying. A simulation study is conducted to examine the finite sample performances of the proposed estimation and hypothesis testing procedures. The methods are illustrated with a data example.
Ebtehaj, Isa; Bonakdari, Hossein
2016-01-01
Sediment transport without deposition is an essential consideration in the optimum design of sewer pipes. In this study, a novel method based on a combination of support vector regression (SVR) and the firefly algorithm (FFA) is proposed to predict the minimum velocity required to avoid sediment settling in pipe channels, which is expressed as the densimetric Froude number (Fr). The efficiency of support vector machine (SVM) models depends on the suitable selection of SVM parameters. In this particular study, FFA is used by determining these SVM parameters. The actual effective parameters on Fr calculation are generally identified by employing dimensional analysis. The different dimensionless variables along with the models are introduced. The best performance is attributed to the model that employs the sediment volumetric concentration (C(V)), ratio of relative median diameter of particles to hydraulic radius (d/R), dimensionless particle number (D(gr)) and overall sediment friction factor (λ(s)) parameters to estimate Fr. The performance of the SVR-FFA model is compared with genetic programming, artificial neural network and existing regression-based equations. The results indicate the superior performance of SVR-FFA (mean absolute percentage error = 2.123%; root mean square error =0.116) compared with other methods.
Methods for estimating low-flow statistics for Massachusetts streams
Ries, Kernell G.; Friesz, Paul J.
2000-01-01
Methods and computer software are described in this report for determining flow duration, low-flow frequency statistics, and August median flows. These low-flow statistics can be estimated for unregulated streams in Massachusetts using different methods depending on whether the location of interest is at a streamgaging station, a low-flow partial-record station, or an ungaged site where no data are available. Low-flow statistics for streamgaging stations can be estimated using standard U.S. Geological Survey methods described in the report. The MOVE.1 mathematical method and a graphical correlation method can be used to estimate low-flow statistics for low-flow partial-record stations. The MOVE.1 method is recommended when the relation between measured flows at a partial-record station and daily mean flows at a nearby, hydrologically similar streamgaging station is linear, and the graphical method is recommended when the relation is curved. Equations are presented for computing the variance and equivalent years of record for estimates of low-flow statistics for low-flow partial-record stations when either a single or multiple index stations are used to determine the estimates. The drainage-area ratio method or regression equations can be used to estimate low-flow statistics for ungaged sites where no data are available. The drainage-area ratio method is generally as accurate as or more accurate than regression estimates when the drainage-area ratio for an ungaged site is between 0.3 and 1.5 times the drainage area of the index data-collection site. Regression equations were developed to estimate the natural, long-term 99-, 98-, 95-, 90-, 85-, 80-, 75-, 70-, 60-, and 50-percent duration flows; the 7-day, 2-year and the 7-day, 10-year low flows; and the August median flow for ungaged sites in Massachusetts. Streamflow statistics and basin characteristics for 87 to 133 streamgaging stations and low-flow partial-record stations were used to develop the equations. The streamgaging stations had from 2 to 81 years of record, with a mean record length of 37 years. The low-flow partial-record stations had from 8 to 36 streamflow measurements, with a median of 14 measurements. All basin characteristics were determined from digital map data. The basin characteristics that were statistically significant in most of the final regression equations were drainage area, the area of stratified-drift deposits per unit of stream length plus 0.1, mean basin slope, and an indicator variable that was 0 in the eastern region and 1 in the western region of Massachusetts. The equations were developed by use of weighted-least-squares regression analyses, with weights assigned proportional to the years of record and inversely proportional to the variances of the streamflow statistics for the stations. Standard errors of prediction ranged from 70.7 to 17.5 percent for the equations to predict the 7-day, 10-year low flow and 50-percent duration flow, respectively. The equations are not applicable for use in the Southeast Coastal region of the State, or where basin characteristics for the selected ungaged site are outside the ranges of those for the stations used in the regression analyses. A World Wide Web application was developed that provides streamflow statistics for data collection stations from a data base and for ungaged sites by measuring the necessary basin characteristics for the site and solving the regression equations. Output provided by the Web application for ungaged sites includes a map of the drainage-basin boundary determined for the site, the measured basin characteristics, the estimated streamflow statistics, and 90-percent prediction intervals for the estimates. An equation is provided for combining regression and correlation estimates to obtain improved estimates of the streamflow statistics for low-flow partial-record stations. An equation is also provided for combining regression and drainage-area ratio estimates to obtain improved e
Gotvald, Anthony J.; Barth, Nancy A.; Veilleux, Andrea G.; Parrett, Charles
2012-01-01
Methods for estimating the magnitude and frequency of floods in California that are not substantially affected by regulation or diversions have been updated. Annual peak-flow data through water year 2006 were analyzed for 771 streamflow-gaging stations (streamgages) in California having 10 or more years of data. Flood-frequency estimates were computed for the streamgages by using the expected moments algorithm to fit a Pearson Type III distribution to logarithms of annual peak flows for each streamgage. Low-outlier and historic information were incorporated into the flood-frequency analysis, and a generalized Grubbs-Beck test was used to detect multiple potentially influential low outliers. Special methods for fitting the distribution were developed for streamgages in the desert region in southeastern California. Additionally, basin characteristics for the streamgages were computed by using a geographical information system. Regional regression analysis, using generalized least squares regression, was used to develop a set of equations for estimating flows with 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities for ungaged basins in California that are outside of the southeastern desert region. Flood-frequency estimates and basin characteristics for 630 streamgages were combined to form the final database used in the regional regression analysis. Five hydrologic regions were developed for the area of California outside of the desert region. The final regional regression equations are functions of drainage area and mean annual precipitation for four of the five regions. In one region, the Sierra Nevada region, the final equations are functions of drainage area, mean basin elevation, and mean annual precipitation. Average standard errors of prediction for the regression equations in all five regions range from 42.7 to 161.9 percent. For the desert region of California, an analysis of 33 streamgages was used to develop regional estimates of all three parameters (mean, standard deviation, and skew) of the log-Pearson Type III distribution. The regional estimates were then used to develop a set of equations for estimating flows with 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities for ungaged basins. The final regional regression equations are functions of drainage area. Average standard errors of prediction for these regression equations range from 214.2 to 856.2 percent. Annual peak-flow data through water year 2006 were analyzed for eight streamgages in California having 10 or more years of data considered to be affected by urbanization. Flood-frequency estimates were computed for the urban streamgages by fitting a Pearson Type III distribution to logarithms of annual peak flows for each streamgage. Regression analysis could not be used to develop flood-frequency estimation equations for urban streams because of the limited number of sites. Flood-frequency estimates for the eight urban sites were graphically compared to flood-frequency estimates for 630 non-urban sites. The regression equations developed from this study will be incorporated into the U.S. Geological Survey (USGS) StreamStats program. The StreamStats program is a Web-based application that provides streamflow statistics and basin characteristics for USGS streamgages and ungaged sites of interest. StreamStats can also compute basin characteristics and provide estimates of streamflow statistics for ungaged sites when users select the location of a site along any stream in California.
Estimates of streamflow characteristics for selected small streams, Baker River basin, Washington
Williams, John R.
1987-01-01
Regression equations were used to estimate streamflow characteristics at eight ungaged sites on small streams in the Baker River basin in the North Cascade Mountains, Washington, that could be suitable for run-of-the-river hydropower development. The regression equations were obtained by relating known streamflow characteristics at 25 gaging stations in nearby basins to several physical and climatic variables that could be easily measured in gaged or ungaged basins. The known streamflow characteristics were mean annual flows, 1-, 3-, and 7-day low flows and high flows, mean monthly flows, and flow duration. Drainage area and mean annual precipitation were not the most significant variables in all the regression equations. Variance in the low flows and the summer mean monthly flows was reduced by including an index of glacierized area within the basin as a third variable. Standard errors of estimate of the regression equations ranged from 25 to 88%, and the largest errors were associated with the low flow characteristics. Discharge measurements made at the eight sites near midmonth each month during 1981 were used to estimate monthly mean flows at the sites for that period. These measurements also were correlated with concurrent daily mean flows from eight operating gaging stations. The correlations provided estimates of mean monthly flows that compared reasonably well with those estimated by the regression analyses. (Author 's abstract)
Yao, Hong; Zhuang, Wei; Qian, Yu; Xia, Bisheng; Yang, Yang; Qian, Xin
2016-01-01
Turbidity (T) has been widely used to detect the occurrence of pollutants in surface water. Using data collected from January 2013 to June 2014 at eleven sites along two rivers feeding the Taihu Basin, China, the relationship between the concentration of five metals (aluminum (Al), titanium (Ti), nickel (Ni), vanadium (V), lead (Pb)) and turbidity was investigated. Metal concentration was determined using inductively coupled plasma mass spectrometry (ICP-MS). The linear regression of metal concentration and turbidity provided a good fit, with R2 = 0.86–0.93 for 72 data sets collected in the industrial river and R2 = 0.60–0.85 for 60 data sets collected in the cleaner river. All the regression presented good linear relationship, leading to the conclusion that the occurrence of the five metals are directly related to suspended solids, and these metal concentration could be approximated using these regression equations. Thus, the linear regression equations were applied to estimate the metal concentration using online turbidity data from January 1 to June 30 in 2014. In the prediction, the WASP 7.5.2 (Water Quality Analysis Simulation Program) model was introduced to interpret the transport and fates of total suspended solids; in addition, metal concentration downstream of the two rivers was predicted. All the relative errors between the estimated and measured metal concentration were within 30%, and those between the predicted and measured values were within 40%. The estimation and prediction process of metals’ concentration indicated that exploring the relationship between metals and turbidity values might be one effective technique for efficient estimation and prediction of metal concentration to facilitate better long-term monitoring with high temporal and spatial density. PMID:27028017
Correlation among extinction efficiency and other parameters in an aggregate dust model
NASA Astrophysics Data System (ADS)
Dhar, Tanuj Kumar; Sekhar Das, Himadri
2017-10-01
We study the extinction properties of highly porous Ballistic Cluster-Cluster Aggregate dust aggregates in a wide range of complex refractive indices (1.4≤ n≤ 2.0, 0.001≤ k≤ 1.0) and wavelengths (0.11 {{μ }}{{m}}≤ {{λ }}≤ 3.4 {{μ }} m). An attempt has been made for the first time to investigate the correlation among extinction efficiency ({Q}{ext}), composition of dust aggregates (n,k), wavelength of radiation (λ) and size parameter of the monomers (x). If k is fixed at any value between 0.001 and 1.0, {Q}{ext} increases with increase of n from 1.4 to 2.0. {Q}{ext} and n are correlated via linear regression when the cluster size is small, whereas the correlation is quadratic at moderate and higher sizes of the cluster. This feature is observed at all wavelengths (ultraviolet to optical to infrared). We also find that the variation of {Q}{ext} with n is very small when λ is high. When n is fixed at any value between 1.4 and 2.0, it is observed that {Q}{ext} and k are correlated via a polynomial regression equation (of degree 1, 2, 3 or 4), where the degree of the equation depends on the cluster size, n and λ. The correlation is linear for small size and quadratic/cubic/quartic for moderate and higher sizes. We have also found that {Q}{ext} and x are correlated via a polynomial regression (of degree 3, 4 or 5) for all values of n. The degree of regression is found to be n and k-dependent. The set of relations obtained from our work can be used to model interstellar extinction for dust aggregates in a wide range of wavelengths and complex refractive indices.
Yao, Hong; Zhuang, Wei; Qian, Yu; Xia, Bisheng; Yang, Yang; Qian, Xin
2016-01-01
Turbidity (T) has been widely used to detect the occurrence of pollutants in surface water. Using data collected from January 2013 to June 2014 at eleven sites along two rivers feeding the Taihu Basin, China, the relationship between the concentration of five metals (aluminum (Al), titanium (Ti), nickel (Ni), vanadium (V), lead (Pb)) and turbidity was investigated. Metal concentration was determined using inductively coupled plasma mass spectrometry (ICP-MS). The linear regression of metal concentration and turbidity provided a good fit, with R(2) = 0.86-0.93 for 72 data sets collected in the industrial river and R(2) = 0.60-0.85 for 60 data sets collected in the cleaner river. All the regression presented good linear relationship, leading to the conclusion that the occurrence of the five metals are directly related to suspended solids, and these metal concentration could be approximated using these regression equations. Thus, the linear regression equations were applied to estimate the metal concentration using online turbidity data from January 1 to June 30 in 2014. In the prediction, the WASP 7.5.2 (Water Quality Analysis Simulation Program) model was introduced to interpret the transport and fates of total suspended solids; in addition, metal concentration downstream of the two rivers was predicted. All the relative errors between the estimated and measured metal concentration were within 30%, and those between the predicted and measured values were within 40%. The estimation and prediction process of metals' concentration indicated that exploring the relationship between metals and turbidity values might be one effective technique for efficient estimation and prediction of metal concentration to facilitate better long-term monitoring with high temporal and spatial density.
Inami, Satoshi; Moridaira, Hiroshi; Takeuchi, Daisaku; Shiba, Yo; Nohara, Yutaka; Taneichi, Hiroshi
2016-11-01
Adult spinal deformity (ASD) classification showing that ideal pelvic incidence minus lumbar lordosis (PI-LL) value is within 10° has been received widely. But no study has focused on the optimum level of PI-LL value that reflects wide variety in PI among patients. This study was conducted to determine the optimum PI-LL value specific to an individual's PI in postoperative ASD patients. 48 postoperative ASD patients were recruited. Spino-pelvic parameters and Oswestry Disability Index (ODI) were measured at the final follow-up. Factors associated with good clinical results were determined by stepwise multiple regression model using the ODI. The patients with ODI under the 75th percentile cutoff were designated into the "good" health related quality of life (HRQOL) group. In this group, the relationship between the PI-LL and PI was assessed by regression analysis. Multiple regression analysis revealed PI-LL as significant parameters associated with ODI. Thirty-six patients with an ODI <22 points (75th percentile cutoff) were categorized into a good HRQOL group, and linear regression models demonstrated the following equation: PI-LL = 0.41PI-11.12 (r = 0.45, P = 0.0059). On the basis of this equation, in the patients with a PI = 50°, the PI-LL is 9°. Whereas in those with a PI = 30°, the optimum PI-LL is calculated to be as low as 1°. In those with a PI = 80°, PI-LL is estimated at 22°. Consequently, an optimum PI-LL is inconsistent in that it depends on the individual PI.
Bankfull characteristics of Ohio streams and their relation to peak streamflows
Sherwood, James M.; Huitger, Carrie A.
2005-01-01
Regional curves, simple-regression equations, and multiple-regression equations were developed to estimate bankfull width, bankfull mean depth, bankfull cross-sectional area, and bankfull discharge of rural, unregulated streams in Ohio. The methods are based on geomorphic, basin, and flood-frequency data collected at 50 study sites on unregulated natural alluvial streams in Ohio, of which 40 sites are near streamflow-gaging stations. The regional curves and simple-regression equations relate the bankfull characteristics to drainage area. The multiple-regression equations relate the bankfull characteristics to drainage area, main-channel slope, main-channel elevation index, median bed-material particle size, bankfull cross-sectional area, and local-channel slope. Average standard errors of prediction for bankfull width equations range from 20.6 to 24.8 percent; for bankfull mean depth, 18.8 to 20.6 percent; for bankfull cross-sectional area, 25.4 to 30.6 percent; and for bankfull discharge, 27.0 to 78.7 percent. The simple-regression (drainage-area only) equations have the highest average standard errors of prediction. The multiple-regression equations in which the explanatory variables included drainage area, main-channel slope, main-channel elevation index, median bed-material particle size, bankfull cross-sectional area, and local-channel slope have the lowest average standard errors of prediction. Field surveys were done at each of the 50 study sites to collect the geomorphic data. Bankfull indicators were identified and evaluated, cross-section and longitudinal profiles were surveyed, and bed- and bank-material were sampled. Field data were analyzed to determine various geomorphic characteristics such as bankfull width, bankfull mean depth, bankfull cross-sectional area, bankfull discharge, streambed slope, and bed- and bank-material particle-size distribution. The various geomorphic characteristics were analyzed by means of a combination of graphical and statistical techniques. The logarithms of the annual peak discharges for the 40 gaged study sites were fit by a Pearson Type III frequency distribution to develop flood-peak discharges associated with recurrence intervals of 2, 5, 10, 25, 50, and 100 years. The peak-frequency data were related to geomorphic, basin, and climatic variables by multiple-regression analysis. Simple-regression equations were developed to estimate 2-, 5-, 10-, 25-, 50-, and 100-year flood-peak discharges of rural, unregulated streams in Ohio from bankfull channel cross-sectional area. The average standard errors of prediction are 31.6, 32.6, 35.9, 41.5, 46.2, and 51.2 percent, respectively. The study and methods developed are intended to improve understanding of the relations between geomorphic, basin, and flood characteristics of streams in Ohio and to aid in the design of hydraulic structures, such as culverts and bridges, where stability of the stream and structure is an important element of the design criteria. The study was done in cooperation with the Ohio Department of Transportation and the U.S. Department of Transportation, Federal Highway Administration.
Pearl, D L; Louie, M; Chui, L; Doré, K; Grimsrud, K M; Martin, S W; Michel, P; Svenson, L W; McEwen, S A
2008-04-01
Using multivariable models, we compared whether there were significant differences between reported outbreak and sporadic cases in terms of their sex, age, and mode and site of disease transmission. We also determined the potential role of administrative, temporal, and spatial factors within these models. We compared a variety of approaches to account for clustering of cases in outbreaks including weighted logistic regression, random effects models, general estimating equations, robust variance estimates, and the random selection of one case from each outbreak. Age and mode of transmission were the only epidemiologically and statistically significant covariates in our final models using the above approaches. Weighing observations in a logistic regression model by the inverse of their outbreak size appeared to be a relatively robust and valid means for modelling these data. Some analytical techniques, designed to account for clustering, had difficulty converging or producing realistic measures of association.
Garnier, Alain; Gaillet, Bruno
2015-12-01
Not so many fermentation mathematical models allow analytical solutions of batch process dynamics. The most widely used is the combination of the logistic microbial growth kinetics with Luedeking-Piret bioproduct synthesis relation. However, the logistic equation is principally based on formalistic similarities and only fits a limited range of fermentation types. In this article, we have developed an analytical solution for the combination of Monod growth kinetics with Luedeking-Piret relation, which can be identified by linear regression and used to simulate batch fermentation evolution. Two classical examples are used to show the quality of fit and the simplicity of the method proposed. A solution for the combination of Haldane substrate-limited growth model combined with Luedeking-Piret relation is also provided. These models could prove useful for the analysis of fermentation data in industry as well as academia. © 2015 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Phillion, A. B.; Cockcroft, S. L.; Lee, P. D.
2009-07-01
The methodology of direct finite element (FE) simulation was used to predict the semi-solid constitutive behavior of an industrially important aluminum-magnesium alloy, AA5182. Model microstructures were generated that detail key features of the as-cast semi-solid: equiaxed-globular grains of random size and shape, interconnected liquid films, and pores at the triple-junctions. Based on the results of over fifty different simulations, a model-based constitutive relationship which includes the effects of the key microstructure features—fraction solid, grain size and fraction porosity—was derived using regression analysis. This novel constitutive equation was then validated via comparison with both the FE simulations and experimental stress/strain data. Such an equation can now be used to incorporate the effects of microstructure on the bulk semi-solid flow stress within a macro- scale process model.
Regional interpretation of water-quality monitoring data
Smith, Richard A.; Schwarz, Gregory E.; Alexander, Richard B.
1997-01-01
We describe a method for using spatially referenced regressions of contaminant transport on watershed attributes (SPARROW) in regional water-quality assessment. The method is designed to reduce the problems of data interpretation caused by sparse sampling, network bias, and basin heterogeneity. The regression equation relates measured transport rates in streams to spatially referenced descriptors of pollution sources and land-surface and stream-channel characteristics. Regression models of total phosphorus (TP) and total nitrogen (TN) transport are constructed for a region defined as the nontidal conterminous United States. Observed TN and TP transport rates are derived from water-quality records for 414 stations in the National Stream Quality Accounting Network. Nutrient sources identified in the equations include point sources, applied fertilizer, livestock waste, nonagricultural land, and atmospheric deposition (TN only). Surface characteristics found to be significant predictors of land-water delivery include soil permeability, stream density, and temperature (TN only). Estimated instream decay coefficients for the two contaminants decrease monotonically with increasing stream size. TP transport is found to be significantly reduced by reservoir retention. Spatial referencing of basin attributes in relation to the stream channel network greatly increases their statistical significance and model accuracy. The method is used to estimate the proportion of watersheds in the conterminous United States (i.e., hydrologic cataloging units) with outflow TP concentrations less than the criterion of 0.1 mg/L, and to classify cataloging units according to local TN yield (kg/km2/yr).
Boomer, Kathleen B; Weller, Donald E; Jordan, Thomas E
2008-01-01
The Universal Soil Loss Equation (USLE) and its derivatives are widely used for identifying watersheds with a high potential for degrading stream water quality. We compared sediment yields estimated from regional application of the USLE, the automated revised RUSLE2, and five sediment delivery ratio algorithms to measured annual average sediment delivery in 78 catchments of the Chesapeake Bay watershed. We did the same comparisons for another 23 catchments monitored by the USGS. Predictions exceeded observed sediment yields by more than 100% and were highly correlated with USLE erosion predictions (Pearson r range, 0.73-0.92; p < 0.001). RUSLE2-erosion estimates were highly correlated with USLE estimates (r = 0.87; p < 001), so the method of implementing the USLE model did not change the results. In ranked comparisons between observed and predicted sediment yields, the models failed to identify catchments with higher yields (r range, -0.28-0.00; p > 0.14). In a multiple regression analysis, soil erodibility, log (stream flow), basin shape (topographic relief ratio), the square-root transformed proportion of forest, and occurrence in the Appalachian Plateau province explained 55% of the observed variance in measured suspended sediment loads, but the model performed poorly (r(2) = 0.06) at predicting loads in the 23 USGS watersheds not used in fitting the model. The use of USLE or multiple regression models to predict sediment yields is not advisable despite their present widespread application. Integrated watershed models based on the USLE may also be unsuitable for making management decisions.
Saygın, Selen Deviren; Basaran, Mustafa; Ozcan, Ali Ugur; Dolarslan, Melda; Timur, Ozgur Burhan; Yilman, F Ebru; Erpul, Gunay
2011-09-01
Land degradation by soil erosion is one of the most serious problems and environmental issues in many ecosystems of arid and semi-arid regions. Especially, the disturbed areas have greater soil detachability and transportability capacity. Evaluation of land degradation in terms of soil erodibility, by using geostatistical modeling, is vital to protect and reclaim susceptible areas. Soil erodibility, described as the ability of soils to resist erosion, can be measured either directly under natural or simulated rainfall conditions, or indirectly estimated by empirical regression models. This study compares three empirical equations used to determine the soil erodibility factor of revised universal soil loss equation prediction technology based on their geospatial performances in the semi-arid catchment of the Saraykoy II Irrigation Dam located in Cankiri, Turkey. A total of 311 geo-referenced soil samples were collected with irregular intervals from the top soil layer (0-10 cm). Geostatistical analysis was performed with the point values of each equation to determine its spatial pattern. Results showed that equations that used soil organic matter in combination with the soil particle size better agreed with the variations in land use and topography of the catchment than the one using only the particle size distribution. It is recommended that the equations which dynamically integrate soil intrinsic properties with land use, topography, and its influences on the local microclimates, could be successfully used to geospatially determine sites highly susceptible to water erosion, and therefore, to select the agricultural and bio-engineering control measures needed.
Suzuki, Hideaki; Tabata, Takahisa; Koizumi, Hiroki; Hohchi, Nobusuke; Takeuchi, Shoko; Kitamura, Takuro; Fujino, Yoshihisa; Ohbuchi, Toyoaki
2014-12-01
This study aimed to create a multiple regression model for predicting hearing outcomes of idiopathic sudden sensorineural hearing loss (ISSNHL). The participants were 205 consecutive patients (205 ears) with ISSNHL (hearing level ≥ 40 dB, interval between onset and treatment ≤ 30 days). They received systemic steroid administration combined with intratympanic steroid injection. Data were examined by simple and multiple regression analyses. Three hearing indices (percentage hearing improvement, hearing gain, and posttreatment hearing level [HLpost]) and 7 prognostic factors (age, days from onset to treatment, initial hearing level, initial hearing level at low frequencies, initial hearing level at high frequencies, presence of vertigo, and contralateral hearing level) were included in the multiple regression analysis as dependent and explanatory variables, respectively. In the simple regression analysis, the percentage hearing improvement, hearing gain, and HLpost showed significant correlation with 2, 5, and 6 of the 7 prognostic factors, respectively. The multiple correlation coefficients were 0.396, 0.503, and 0.714 for the percentage hearing improvement, hearing gain, and HLpost, respectively. Predicted values of HLpost calculated by the multiple regression equation were reliable with 70% probability with a 40-dB-width prediction interval. Prediction of HLpost by the multiple regression model may be useful to estimate the hearing prognosis of ISSNHL. © The Author(s) 2014.
Mathematical Modelling of Optimization of Structures of Monolithic Coverings Based on Liquid Rubbers
NASA Astrophysics Data System (ADS)
Turgumbayeva, R. Kh; Abdikarimov, M. N.; Mussabekov, R.; Sartayev, D. T.
2018-05-01
The paper considers optimization of monolithic coatings compositions using a computer and MPE methods. The goal of the paper was to construct a mathematical model of the complete factorial experiment taking into account its plan and conditions. Several regression equations were received. Dependence between content components and parameters of rubber, as well as the quantity of a rubber crumb, was considered. An optimal composition for manufacturing the material of monolithic coatings compositions was recommended based on experimental data.
Equations for predicting biomass in 2- to 6-year-old Eucalyptus saligna in Hawaii
Craig D. Whitesell; Susan C. Miyasaka; Robert F. Strand; Thomas H. Schubert; Katharine E. McDuffie
1988-01-01
Eucalyptus saligna trees grown in short-rotation plantations on the island of Hawaii were measured, harvested, and weighed to provide data for developing regression equations using non-destructive stand measurements. Regression analysis of the data from 190 trees in the 2.0- to 3.5-year range and 96 trees in the 4- to 6-year range related stem-only...
Low-flow characteristics of Virginia streams
Austin, Samuel H.; Krstolic, Jennifer L.; Wiegand, Ute
2011-01-01
Low-flow annual non-exceedance probabilities (ANEP), called probability-percent chance (P-percent chance) flow estimates, regional regression equations, and transfer methods are provided describing the low-flow characteristics of Virginia streams. Statistical methods are used to evaluate streamflow data. Analysis of Virginia streamflow data collected from 1895 through 2007 is summarized. Methods are provided for estimating low-flow characteristics of gaged and ungaged streams. The 1-, 4-, 7-, and 30-day average streamgaging station low-flow characteristics for 290 long-term, continuous-record, streamgaging stations are determined, adjusted for instances of zero flow using a conditional probability adjustment method, and presented for non-exceedance probabilities of 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01, and 0.005. Stream basin characteristics computed using spatial data and a geographic information system are used as explanatory variables in regional regression equations to estimate annual non-exceedance probabilities at gaged and ungaged sites and are summarized for 290 long-term, continuous-record streamgaging stations, 136 short-term, continuous-record streamgaging stations, and 613 partial-record streamgaging stations. Regional regression equations for six physiographic regions use basin characteristics to estimate 1-, 4-, 7-, and 30-day average low-flow annual non-exceedance probabilities at gaged and ungaged sites. Weighted low-flow values that combine computed streamgaging station low-flow characteristics and annual non-exceedance probabilities from regional regression equations provide improved low-flow estimates. Regression equations developed using the Maintenance of Variance with Extension (MOVE.1) method describe the line of organic correlation (LOC) with an appropriate index site for low-flow characteristics at 136 short-term, continuous-record streamgaging stations and 613 partial-record streamgaging stations. Monthly streamflow statistics computed on the individual daily mean streamflows of selected continuous-record streamgaging stations and curves describing flow-duration are presented. Text, figures, and lists are provided summarizing low-flow estimates, selected low-flow sites, delineated physiographic regions, basin characteristics, regression equations, error estimates, definitions, and data sources. This study supersedes previous studies of low flows in Virginia.
Automation of workplace lifting hazard assessment for musculoskeletal injury prevention.
Spector, June T; Lieblich, Max; Bao, Stephen; McQuade, Kevin; Hughes, Margaret
2014-01-01
Existing methods for practically evaluating musculoskeletal exposures such as posture and repetition in workplace settings have limitations. We aimed to automate the estimation of parameters in the revised United States National Institute for Occupational Safety and Health (NIOSH) lifting equation, a standard manual observational tool used to evaluate back injury risk related to lifting in workplace settings, using depth camera (Microsoft Kinect) and skeleton algorithm technology. A large dataset (approximately 22,000 frames, derived from six subjects) of simultaneous lifting and other motions recorded in a laboratory setting using the Kinect (Microsoft Corporation, Redmond, Washington, United States) and a standard optical motion capture system (Qualysis, Qualysis Motion Capture Systems, Qualysis AB, Sweden) was assembled. Error-correction regression models were developed to improve the accuracy of NIOSH lifting equation parameters estimated from the Kinect skeleton. Kinect-Qualysis errors were modelled using gradient boosted regression trees with a Huber loss function. Models were trained on data from all but one subject and tested on the excluded subject. Finally, models were tested on three lifting trials performed by subjects not involved in the generation of the model-building dataset. Error-correction appears to produce estimates for NIOSH lifting equation parameters that are more accurate than those derived from the Microsoft Kinect algorithm alone. Our error-correction models substantially decreased the variance of parameter errors. In general, the Kinect underestimated parameters, and modelling reduced this bias, particularly for more biased estimates. Use of the raw Kinect skeleton model tended to result in falsely high safe recommended weight limits of loads, whereas error-corrected models gave more conservative, protective estimates. Our results suggest that it may be possible to produce reasonable estimates of posture and temporal elements of tasks such as task frequency in an automated fashion, although these findings should be confirmed in a larger study. Further work is needed to incorporate force assessments and address workplace feasibility challenges. We anticipate that this approach could ultimately be used to perform large-scale musculoskeletal exposure assessment not only for research but also to provide real-time feedback to workers and employers during work method improvement activities and employee training.
Trommer, J.T.; Loper, J.E.; Hammett, K.M.
1996-01-01
Several traditional techniques have been used for estimating stormwater runoff from ungaged watersheds. Applying these techniques to water- sheds in west-central Florida requires that some of the empirical relationships be extrapolated beyond tested ranges. As a result, there is uncertainty as to the accuracy of these estimates. Sixty-six storms occurring in 15 west-central Florida watersheds were initially modeled using the Rational Method, the U.S. Geological Survey Regional Regression Equations, the Natural Resources Conservation Service TR-20 model, the U.S. Army Corps of Engineers Hydrologic Engineering Center-1 model, and the Environmental Protection Agency Storm Water Management Model. The techniques were applied according to the guidelines specified in the user manuals or standard engineering textbooks as though no field data were available and the selection of input parameters was not influenced by observed data. Computed estimates were compared with observed runoff to evaluate the accuracy of the techniques. One watershed was eliminated from further evaluation when it was determined that the area contributing runoff to the stream varies with the amount and intensity of rainfall. Therefore, further evaluation and modification of the input parameters were made for only 62 storms in 14 watersheds. Runoff ranged from 1.4 to 99.3 percent percent of rainfall. The average runoff for all watersheds included in this study was about 36 percent of rainfall. The average runoff for the urban, natural, and mixed land-use watersheds was about 41, 27, and 29 percent, respectively. Initial estimates of peak discharge using the rational method produced average watershed errors that ranged from an underestimation of 50.4 percent to an overestimation of 767 percent. The coefficient of runoff ranged from 0.20 to 0.60. Calibration of the technique produced average errors that ranged from an underestimation of 3.3 percent to an overestimation of 1.5 percent. The average calibrated coefficient of runoff for each watershed ranged from 0.02 to 0.72. The average values of the coefficient of runoff necessary to calibrate the urban, natural, and mixed land-use watersheds were 0.39, 0.16, and 0.08, respectively. The U.S. Geological Survey regional regression equations for determining peak discharge produced errors that ranged from an underestimation of 87.3 percent to an over- estimation of 1,140 percent. The regression equations for determining runoff volume produced errors that ranged from an underestimation of 95.6 percent to an overestimation of 324 percent. Regression equations developed from data used for this study produced errors that ranged between an underestimation of 82.8 percent and an over- estimation of 328 percent for peak discharge, and from an underestimation of 71.2 percent to an overestimation of 241 percent for runoff volume. Use of the equations developed for west-central Florida streams produced average errors for each type of watershed that were lower than errors associated with use of the U.S. Geological Survey equations. Initial estimates of peak discharges and runoff volumes using the Natural Resources Conservation Service TR-20 model, produced average errors of 44.6 and 42.7 percent respectively, for all the watersheds. Curve numbers and times of concentration were adjusted to match estimated and observed peak discharges and runoff volumes. The average change in the curve number for all the watersheds was a decrease of 2.8 percent. The average change in the time of concentration was an increase of 59.2 percent. The shape of the input dimensionless unit hydrograph also had to be adjusted to match the shape and peak time of the estimated and observed flood hydrographs. Peak rate factors for the modified input dimensionless unit hydrographs ranged from 162 to 454. The mean errors for peak discharges and runoff volumes were reduced to 18.9 and 19.5 percent, respectively, using the average calibrated input parameters for ea
Kosegarten, Carlos E; Ramírez-Corona, Nelly; Mani-López, Emma; Palou, Enrique; López-Malo, Aurelio
2017-01-02
A Box-Behnken design was used to determine the effect of protein concentration (0, 5, or 10g of casein/100g), fat (0, 3, or 6g of corn oil/100g), a w (0.900, 0.945, or 0.990), pH (3.5, 5.0, or 6.5), concentration of cinnamon essential oil (CEO, 0, 200, or 400μL/kg) and incubation temperature (15, 25, or 35°C) on the growth of Aspergillus flavus during 50days of incubation. Mold response under the evaluated conditions was modeled by the modified Gompertz equation, logistic regression, and time-to-detection model. The obtained polynomial regression models allow the significant coefficients (p<0.05) for linear, quadratic and interaction effects for the Gompertz equation's parameters to be identified, which adequately described (R 2 >0.967) the studied mold responses. After 50days of incubation, every tested model system was classified according to the observed response as 1 (growth) or 0 (no growth), then a binary logistic regression was utilized to model A. flavus growth interface, allowing to predict the probability of mold growth under selected combinations of tested factors. The time-to-detection model was utilized to estimate the time at which A. flavus visible growth begins. Water activity, temperature, and CEO concentration were the most important factors affecting fungal growth. It was observed that there is a range of possible combinations that may induce growth, such that incubation conditions and the amount of essential oil necessary for fungal growth inhibition strongly depend on protein and fat concentrations as well as on the pH of studied model systems. The probabilistic model and the time-to-detection models constitute another option to determine appropriate storage/processing conditions and accurately predict the probability and/or the time at which A. flavus growth occurs. Copyright © 2016 Elsevier B.V. All rights reserved.
Mendez, Bomar Rojas
2017-01-01
Background Improving access to delivery services does not guarantee access to quality obstetric care and better survival, and therefore, concerns for quality of maternal and newborn care in low- and middle-income countries have been raised. Our study explored characteristics associated with the quality of initial assessment, intrapartum, and immediate postpartum and newborn care, and further assessed the relationships along the continuum of care. Methods The 2010 Service Provision Assessment data of Kenya for 627 routine deliveries of women aged 15–49 were used. Quality of care measures were assessed using recently validated quality of care measures during initial assessment, intrapartum, and postpartum periods. Data were analyzed with negative binomial regression and structural equation modeling technique. Results The negative binomial regression results identified a number of determinants of quality, such as the level of health facilities, managing authority, presence of delivery fee, central electricity supply and clinical guideline for maternal and neonatal care. Our structural equation modeling (SEM) further demonstrated that facility characteristics were important determinants of quality for initial assessment and postpartum care, while characteristics at the provider level became more important in shaping the quality of intrapartum care. Furthermore we also noted that quality of initial assessment had a positive association with quality of intrapartum care (β = 0.71, p < 0.001), which in turn was positively associated with the quality of newborn and immediate postpartum care (β = 1.29, p = 0.004). Conclusions A continued focus on quality of care along the continuum of maternity care is important not only to mothers but also their newborns. Policymakers should therefore ensure that required resources, as well as adequate supervision and emphasis on the quality of obstetric care, are available. PMID:28520771
Williams-Sether, Tara
2004-01-01
The Dakota Water Resources Act, passed by the U.S. Congress on December 15, 2000, authorized the Secretary of the Interior to conduct a comprehensive study of future water-quantity and quality needs of the Red River of the North Basin in North Dakota and possible options to meet those water needs. Previous Red River of the North Basin studies conducted by the Bureau of Reclamation used streamflow and water-quality data bases developed by the U.S. Geological Survey that included data for 1931-84. As a result of the recent congressional authorization and results of previous studies by the Bureau of Reclamation, redevelopment of the streamflow and water-quality data bases with current data through 1999 are needed in order to evaluate and predict the water-quantity and quality effects within the Red River of the North Basin. This report provides updated statistical summaries of selected water-quality constituents and streamflow and the regression relations between them. Available data for 1931-99 were used to develop regression equations between 5 selected water-quality constituents and streamflow for 38 gaging stations in the Red River of the North Basin. The water-quality constituents that were regressed against streamflow were hardness (as CaCO3), sodium, chloride, sulfate, and dissolved solids. Statistical summaries of the selected water-quality constituents and streamflow for the gaging stations used in the regression equations development and the applications and limitations of the regression equations are presented in this report.
Regression estimators for generic health-related quality of life and quality-adjusted life years.
Basu, Anirban; Manca, Andrea
2012-01-01
To develop regression models for outcomes with truncated supports, such as health-related quality of life (HRQoL) data, and account for features typical of such data such as a skewed distribution, spikes at 1 or 0, and heteroskedasticity. Regression estimators based on features of the Beta distribution. First, both a single equation and a 2-part model are presented, along with estimation algorithms based on maximum-likelihood, quasi-likelihood, and Bayesian Markov-chain Monte Carlo methods. A novel Bayesian quasi-likelihood estimator is proposed. Second, a simulation exercise is presented to assess the performance of the proposed estimators against ordinary least squares (OLS) regression for a variety of HRQoL distributions that are encountered in practice. Finally, the performance of the proposed estimators is assessed by using them to quantify the treatment effect on QALYs in the EVALUATE hysterectomy trial. Overall model fit is studied using several goodness-of-fit tests such as Pearson's correlation test, link and reset tests, and a modified Hosmer-Lemeshow test. The simulation results indicate that the proposed methods are more robust in estimating covariate effects than OLS, especially when the effects are large or the HRQoL distribution has a large spike at 1. Quasi-likelihood techniques are more robust than maximum likelihood estimators. When applied to the EVALUATE trial, all but the maximum likelihood estimators produce unbiased estimates of the treatment effect. One and 2-part Beta regression models provide flexible approaches to regress the outcomes with truncated supports, such as HRQoL, on covariates, after accounting for many idiosyncratic features of the outcomes distribution. This work will provide applied researchers with a practical set of tools to model outcomes in cost-effectiveness analysis.
August Median Streamflow on Ungaged Streams in Eastern Aroostook County, Maine
Lombard, Pamela J.; Tasker, Gary D.; Nielsen, Martha G.
2003-01-01
Methods for estimating August median streamflow were developed for ungaged, unregulated streams in the eastern part of Aroostook County, Maine, with drainage areas from 0.38 to 43 square miles and mean basin elevations from 437 to 1,024 feet. Few long-term, continuous-record streamflow-gaging stations with small drainage areas were available from which to develop the equations; therefore, 24 partial-record gaging stations were established in this investigation. A mathematical technique for estimating a standard low-flow statistic, August median streamflow, at partial-record stations was applied by relating base-flow measurements at these stations to concurrent daily flows at nearby long-term, continuous-record streamflow- gaging stations (index stations). Generalized least-squares regression analysis (GLS) was used to relate estimates of August median streamflow at gaging stations to basin characteristics at these same stations to develop equations that can be applied to estimate August median streamflow on ungaged streams. GLS accounts for varying periods of record at the gaging stations and the cross correlation of concurrent streamflows among gaging stations. Twenty-three partial-record stations and one continuous-record station were used for the final regression equations. The basin characteristics of drainage area and mean basin elevation are used in the calculated regression equation for ungaged streams to estimate August median flow. The equation has an average standard error of prediction from -38 to 62 percent. A one-variable equation uses only drainage area to estimate August median streamflow when less accuracy is acceptable. This equation has an average standard error of prediction from -40 to 67 percent. Model error is larger than sampling error for both equations, indicating that additional basin characteristics could be important to improved estimates of low-flow statistics. Weighted estimates of August median streamflow, which can be used when making estimates at partial-record or continuous-record gaging stations, range from 0.03 to 11.7 cubic feet per second or from 0.1 to 0.4 cubic feet per second per square mile. Estimates of August median streamflow on ungaged streams in the eastern part of Aroostook County, within the range of acceptable explanatory variables, range from 0.03 to 30 cubic feet per second or 0.1 to 0.7 cubic feet per second per square mile. Estimates of August median streamflow per square mile of drainage area generally increase as mean elevation and drainage area increase.
Petkewich, Matthew D.; Conrads, Paul
2013-01-01
The Everglades Depth Estimation Network is an integrated network of real-time water-level gaging stations, a ground-elevation model, and a water-surface elevation model designed to provide scientists, engineers, and water-resource managers with water-level and water-depth information (1991-2013) for the entire freshwater portion of the Greater Everglades. The U.S. Geological Survey Greater Everglades Priority Ecosystems Science provides support for the Everglades Depth Estimation Network in order for the Network to provide quality-assured monitoring data for the U.S. Army Corps of Engineers Comprehensive Everglades Restoration Plan. In a previous study, water-level estimation equations were developed to fill in missing data to increase the accuracy of the daily water-surface elevation model. During this study, those equations were updated because of the addition and removal of water-level gaging stations, the consistent use of water-level data relative to the North American Vertical Datum of 1988, and availability of recent data (March 1, 2006, to September 30, 2011). Up to three linear regression equations were developed for each station by using three different input stations to minimize the occurrences of missing data for an input station. Of the 667 water-level estimation equations developed to fill missing data at 223 stations, more than 72 percent of the equations have coefficients of determination greater than 0.90, and 97 percent have coefficients of determination greater than 0.70.
Lera, Lydia; Albala, Cecilia; Ángel, Bárbara; Sánchez, Hugo; Picrin, Yaisy; Hormazabal, María José; Quiero, Andrea
2014-03-01
To develop a predictive model of appendicular skeletal muscle mass (ASM) based on anthropometric measurements in elderly from Santiago, Chile. 616 community dwelling, non-disabled subjects ≥ 60 years (mean 69.9 ± 5.2 years) living in Santiago, 64.6% female, participating in ALEXANDROS study. Anthropometric measurements, handgrip strength, mobility tests and DEXA were performed. Step by step linear regression models were used to associate ASM from DEXA with anthropometric variables, age and sex. The sample was divided at random into two to obtain prediction equations for both subsamples, which were mutually validated by double cross-validation. The high correlation between the values of observed and predicted MMAE in both sub-samples and the low degree of shrinkage allowed developing the final prediction equation with the total sample. The cross-validity coefficient between prediction models from the subsamples (0.941 and 0.9409) and the shrinkage (0.004 and 0.006) were similar in both equations. The final prediction model obtained from the total sample was: ASM (kg) = 0.107(weight in kg) + 0.251( knee height in cm) + 0.197 (Calf Circumference in cm) +0.047 (dynamometry in kg) - 0.034 (Hip Circumference in cm) + 3.417 (Man) - 0.020 (age years) - 7.646 (R2 = 0.89). The mean ASM obtained by the prediction equation and the DEXA measurement were similar (16.8 ± 4.0 vs 16.9 ± 3.7) and highly concordant according Bland and Altman (95% CI: -2.6 -2.7) and Lin (concordance correlation coefficient = 0.94) methods. We obtained a low cost anthropometric equation to determine the appendicular skeletal muscle mass useful for the screening of sarcopenia in older adults. Copyright AULA MEDICA EDICIONES 2014. Published by AULA MEDICA. All rights reserved.
Regional regression equations for estimation of natural streamflow statistics in Colorado
Capesius, Joseph P.; Stephens, Verlin C.
2009-01-01
The U.S. Geological Survey (USGS), in cooperation with the Colorado Water Conservation Board and the Colorado Department of Transportation, developed regional regression equations for estimation of various streamflow statistics that are representative of natural streamflow conditions at ungaged sites in Colorado. The equations define the statistical relations between streamflow statistics (response variables) and basin and climatic characteristics (predictor variables). The equations were developed using generalized least-squares and weighted least-squares multilinear regression reliant on logarithmic variable transformation. Streamflow statistics were derived from at least 10 years of streamflow data through about 2007 from selected USGS streamflow-gaging stations in the study area that are representative of natural-flow conditions. Basin and climatic characteristics used for equation development are drainage area, mean watershed elevation, mean watershed slope, percentage of drainage area above 7,500 feet of elevation, mean annual precipitation, and 6-hour, 100-year precipitation. For each of five hydrologic regions in Colorado, peak-streamflow equations that are based on peak-streamflow data from selected stations are presented for the 2-, 5-, 10-, 25-, 50-, 100-, 200-, and 500-year instantaneous-peak streamflows. For four of the five hydrologic regions, equations based on daily-mean streamflow data from selected stations are presented for 7-day minimum 2-, 10-, and 50-year streamflows and for 7-day maximum 2-, 10-, and 50-year streamflows. Other equations presented for the same four hydrologic regions include those for estimation of annual- and monthly-mean streamflow and streamflow-duration statistics for exceedances of 10, 25, 50, 75, and 90 percent. All equations are reported along with salient diagnostic statistics, ranges of basin and climatic characteristics on which each equation is based, and commentary of potential bias, which is not otherwise removed by log-transformation of the variables of the equations from interpretation of residual plots. The predictor-variable ranges can be used to assess equation applicability for ungaged sites in Colorado.
1998-01-01
Changes in domestic refining operations are identified and related to the summer Reid vapor pressure (RVP) restrictions and oxygenate blending requirements. This analysis uses published Energy Information Administration survey data and linear regression equations from the Short-Term Integrated Forecasting System (STIFS). The STIFS model is used for producing forecasts appearing in the Short-Term Energy Outlook.
Solvency supervision based on a total balance sheet approach
NASA Astrophysics Data System (ADS)
Pitselis, Georgios
2009-11-01
In this paper we investigate the adequacy of the own funds a company requires in order to remain healthy and avoid insolvency. Two methods are applied here; the quantile regression method and the method of mixed effects models. Quantile regression is capable of providing a more complete statistical analysis of the stochastic relationship among random variables than least squares estimation. The estimated mixed effects line can be considered as an internal industry equation (norm), which explains a systematic relation between a dependent variable (such as own funds) with independent variables (e.g. financial characteristics, such as assets, provisions, etc.). The above two methods are implemented with two data sets.
Wood, Molly S.; Rea, Alan; Skinner, Kenneth D.; Hortness, Jon E.
2009-01-01
Many State and Federal agencies use information regarding the locations of streams having intermittent or perennial flow when making management and regulatory decisions. For example, the application of some Idaho water quality standards depends on whether streams are intermittent. Idaho Administrative Code defines an intermittent stream as one having a 7-day, 2-year low flow (7Q2) less than 0.1 ft3/s. However, there is a general recognition that the cartographic representation of perennial/intermittent status of streams on U.S. Geological Survey (USGS) topographic maps is not as accurate or consistent as desirable from one map to another, which makes broad management and regulatory assessments difficult and inconsistent. To help resolve this problem, the USGS has developed a methodology for predicting the locations of perennial streams based on regional generalized least-squares (GLS) regression equations for Idaho streams for the 7Q2 low-flow statistic. Using these regression equations, the 7Q2 streamflow may be estimated for naturally flowing streams in most areas in Idaho. The use of these equations in conjunction with a geographic information system (GIS) technique known as weighted flow accumulation allows for an automated and continuous estimation of 7Q2 streamflow at all points along stream reaches. The USGS has developed a GIS-based map of the locations of streams in Idaho with perennial flow based on a 7Q2 of 0.1 ft3/s and a transition zone of plus or minus 1 standard error. Idaho State cooperators plan to use this information to make regulatory and water-quality management decisions. Originally, 7Q2 equations were developed for eight regions of similar hydrologic characteristics in the study area, using long-term data from 234 streamflow-gaging stations. Equations in five of the regions were revised based on spatial patterns observed in the initial perennial streams map and unrealistic behavior of the equations in extrapolation. The standard errors of prediction for the final equations ranged from a minimum of +75.0 to -42.9 percent in the central part of the study area to a maximum of +277 to -73.5 percent in the southern part of the study area. The equations are applicable only to unregulated, naturally-flowing streams and may produce unreliable results outside the range of explanatory variables used for equation development. Extrapolation outside the range of available data was necessary, however, to predict perennial flow initiation points and transition zones along stream reaches. The map of perennial streams was evaluated by comparing predicted stream classifications with four independent datasets, including field observations by other government agencies. Overall, 81 percent of the comparison data points agreed with the USGS perennial streams model. Regions with the highest number of disagreements had a high percentage of mountainous and forested area with potential mountain front recharge zones, and regions with the highest agreements had a high percentage of low gradient, low elevation area. As a whole, the USGS model predicted a higher number of perennial streams than predictions made with the independent datasets. Some disagreements were due to poor site location coordinates, timing of the comparison site visits during unusually wet or dry years, discrepancies in classification criteria, and variable ground water contributions to flow in some areas. The Idaho Department of Environmental Quality Beneficial Use Reconnaissance Program (BURP) dataset is considered the most representative dataset for comparison because it covered a range of climate conditions and the number of sites visited were consistent from year to year during the study period. Eighty-five percent of BURP comparison data points agreed with the USGS perennial streams model. Although site-specific flow data may be needed to correctly classify streams in some areas, this information rarely is available and is not always practical to o
Ziegeweid, Jeffrey R.; Magdalene, Suzanne
2015-01-01
The new regression equations were used to calculate revised estimates of historical streamflows for Stillwater and Prescott starting in 1910 and ending when index-velocity streamgages were installed. Monthly, annual, 30-year, and period of record statistics were examined between previous and revised estimates of historical streamflows. The abilities of the new regression equations to estimate historical streamflows were evaluated by using percent differences to compare new estimates of historical daily streamflows to discrete streamflow measurements made at Stillwater and Prescott before the installation of index-velocity streamgages. Although less variability was observed between estimated and measured streamflows at Stillwater compared to Prescott, the percent difference data indicated that the new estimates closely approximated measured streamflows at both locations.
Hays, Ron D; Revicki, Dennis A; Feeny, David; Fayers, Peter; Spritzer, Karen L; Cella, David
2016-10-01
Preference-based health-related quality of life (HR-QOL) scores are useful as outcome measures in clinical studies, for monitoring the health of populations, and for estimating quality-adjusted life-years. This was a secondary analysis of data collected in an internet survey as part of the Patient-Reported Outcomes Measurement Information System (PROMIS(®)) project. To estimate Health Utilities Index Mark 3 (HUI-3) preference scores, we used the ten PROMIS(®) global health items, the PROMIS-29 V2.0 single pain intensity item and seven multi-item scales (physical functioning, fatigue, pain interference, depressive symptoms, anxiety, ability to participate in social roles and activities, sleep disturbance), and the PROMIS-29 V2.0 items. Linear regression analyses were used to identify significant predictors, followed by simple linear equating to avoid regression to the mean. The regression models explained 48 % (global health items), 61 % (PROMIS-29 V2.0 scales), and 64 % (PROMIS-29 V2.0 items) of the variance in the HUI-3 preference score. Linear equated scores were similar to observed scores, although differences tended to be larger for older study participants. HUI-3 preference scores can be estimated from the PROMIS(®) global health items or PROMIS-29 V2.0. The estimated HUI-3 scores from the PROMIS(®) health measures can be used for economic applications and as a measure of overall HR-QOL in research.
A Model for Oil-Gas Pipelines Cost Prediction Based on a Data Mining Process
NASA Astrophysics Data System (ADS)
Batzias, Fragiskos A.; Spanidis, Phillip-Mark P.
2009-08-01
This paper addresses the problems associated with the cost estimation of oil/gas pipelines during the elaboration of feasibility assessments. Techno-economic parameters, i.e., cost, length and diameter, are critical for such studies at the preliminary design stage. A methodology for the development of a cost prediction model based on Data Mining (DM) process is proposed. The design and implementation of a Knowledge Base (KB), maintaining data collected from various disciplines of the pipeline industry, are presented. The formulation of a cost prediction equation is demonstrated by applying multiple regression analysis using data sets extracted from the KB. Following the methodology proposed, a learning context is inductively developed as background pipeline data are acquired, grouped and stored in the KB, and through a linear regression model provide statistically substantial results, useful for project managers or decision makers.
Regression analysis of sparse asynchronous longitudinal data.
Cao, Hongyuan; Zeng, Donglin; Fine, Jason P
2015-09-01
We consider estimation of regression models for sparse asynchronous longitudinal observations, where time-dependent responses and covariates are observed intermittently within subjects. Unlike with synchronous data, where the response and covariates are observed at the same time point, with asynchronous data, the observation times are mismatched. Simple kernel-weighted estimating equations are proposed for generalized linear models with either time invariant or time-dependent coefficients under smoothness assumptions for the covariate processes which are similar to those for synchronous data. For models with either time invariant or time-dependent coefficients, the estimators are consistent and asymptotically normal but converge at slower rates than those achieved with synchronous data. Simulation studies evidence that the methods perform well with realistic sample sizes and may be superior to a naive application of methods for synchronous data based on an ad hoc last value carried forward approach. The practical utility of the methods is illustrated on data from a study on human immunodeficiency virus.
The influence of changes in land use and landscape patterns on soil erosion in a watershed.
Zhang, Shanghong; Fan, Weiwei; Li, Yueqiang; Yi, Yujun
2017-01-01
It is very important to have a good understanding of the relation between soil erosion and landscape patterns so that soil and water conservation in river basins can be optimized. In this study, this relationship was explored, using the Liusha River Watershed, China, as a case study. A distributed water and sediment model based on the Soil and Water Assessment Tool (SWAT) was developed to simulate soil erosion from different land use types in each sub-basin of the Liusha River Watershed. Observed runoff and sediment data from 1985 to 2005 and land use maps from 1986, 1995, and 2000 were used to calibrate and validate the model. The erosion modulus for each sub-basin was calculated from SWAT model results using the different land use maps and 12 landscape indices were chosen and calculated to describe the land use in each sub-basin for the different years. The variations in instead of the absolute amounts of the erosion modulus and the landscape indices for each sub-basin were used as the dependent and independent variables, respectively, for the regression equations derived from multiple linear regression. The results indicated that the variations in the erosion modulus were closely related to changes in the large patch index, patch cohesion index, modified Simpson's evenness index, and the aggregation index. From the regression equation and the corresponding landscape indices, it was found that watershed erosion can be reduced by decreasing the physical connectivity between patches, improving the evenness of the landscape patch types, enriching landscape types, and enhancing the degree of aggregation between the landscape patches. These findings will be useful for water and soil conservation and for optimizing the management of watershed landscapes. Copyright © 2016 Elsevier B.V. All rights reserved.
Walton, David M; Lefebvre, Andy; Reynolds, Darcy
2015-06-01
Illness representations pertain to the ways in which an individual constructs and understands the experience of a health condition. The Brief Illness Perceptions Questionnaire (BIPQ) comprises 9 items intended to capture the key components of the Illness Representations Model. The purpose of this paper was to explore the utility of the BIPQ for evaluating and classifying uncomplicated mechanical neck pain in the rehabilitation setting. A convenience sample of 198 subjects presenting to physiotherapy for neck pain problems were used in this study. In the first step, 183 subjects completed the BIPQ and a series of related cognitive measures. Latent class analysis (LCA) was used to explore the number of identifiable classes amongst the sample based on BIPQ response patterns. A regression equation was created to facilitate classification. In the second step, an independent sample of 15 subjects were classified using the equation established in step 1, and they were followed over a 3 month period. The LCA revealed 3 classes of subjects with optimal fit statistics: mildly affected, moderately affected, and severely affected. Inter-group comparisons of the secondary cognitive measures supported these labels. Classification accuracy of a regression equation was high (94.5%). Applying the equation to the independent longitudinal sample revealed that it functioned equally well and that the classes may have prognostic value. The BIPQ may be a useful clinical tool for classification of neck pain. Copyright © 2014 Elsevier Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Boyce, Lola; Bast, Callie C.; Trimble, Greg A.
1992-01-01
This report presents the results of a fourth year effort of a research program, conducted for NASA-LeRC by the University of Texas at San Antonio (UTSA). The research included on-going development of methodology that provides probabilistic lifetime strength of aerospace materials via computational simulation. A probabilistic material strength degradation model, in the form of a randomized multifactor interaction equation, is postulated for strength degradation of structural components of aerospace propulsion systems subject to a number of effects or primitive variables. These primitive variables may include high temperature, fatigue or creep. In most cases, strength is reduced as a result of the action of a variable. This multifactor interaction strength degradation equation has been randomized and is included in the computer program, PROMISS. Also included in the research is the development of methodology to calibrate the above-described constitutive equation using actual experimental materials data together with regression analysis of that data, thereby predicting values for the empirical material constants for each effect or primitive variable. This regression methodology is included in the computer program, PROMISC. Actual experimental materials data were obtained from industry and the open literature for materials typically for applications in aerospace propulsion system components. Material data for Inconel 718 has been analyzed using the developed methodology.
NASA Technical Reports Server (NTRS)
Boyce, Lola; Bast, Callie C.; Trimble, Greg A.
1992-01-01
The results of a fourth year effort of a research program conducted for NASA-LeRC by The University of Texas at San Antonio (UTSA) are presented. The research included on-going development of methodology that provides probabilistic lifetime strength of aerospace materials via computational simulation. A probabilistic material strength degradation model, in the form of a randomized multifactor interaction equation, is postulated for strength degradation of structural components of aerospace propulsion systems subjected to a number of effects or primitive variables. These primitive variables may include high temperature, fatigue, or creep. In most cases, strength is reduced as a result of the action of a variable. This multifactor interaction strength degradation equation was randomized and is included in the computer program, PROMISC. Also included in the research is the development of methodology to calibrate the above-described constitutive equation using actual experimental materials data together with regression analysis of that data, thereby predicting values for the empirical material constants for each effect or primitive variable. This regression methodology is included in the computer program, PROMISC. Actual experimental materials data were obtained from industry and the open literature for materials typically for applications in aerospace propulsion system components. Material data for Inconel 718 was analyzed using the developed methodology.
Evaluation of a Pitot type spirometer in helium/oxygen mixtures.
Søndergaard, S; Kárason, S; Lundin, S; Stenqvist, O
1998-08-01
Mixtures of helium and oxygen are regaining a place in the treatment of obstruction of the upper and lower respiratory tract. The parenchymal changes during the course of IRDS or ARDS may also benefit from the reintroduction of helium/oxygen. In order to monitor and document the effect of low-density gas mixtures, we evaluated the Datex AS/3 Side Stream Spirometry module with D-lite (Datex-Engstrom Instrumentarium Corporation, Finland) against two golden standards. Under conditions simulating controlled and spontaneous ventilation with gas mixtures of He (approx. 80, 50, and 20%)/O2 or N2(approx. 21 and 79%)/02, simultaneous measurements using Biotek Ventilator Tester (Bio-Tek Instr., Vermont, USA) or body plethysmograph (SensorMedics System, Anaheim, USA) were correlated with data from the spirometry module. Data were analyzed according to a statistical regression model resulting in a best-fit equation based on density, voltage, and volume measurements. As expected, the D-lite (a modified Pitot tube) showed density-dependent behaviour. Regression equations and percentage deviation of estimated versus measured values were calculated. Measurements with the D-lite using low-density gases are satisfactorily contained in best-fit equations with a standard deviation of less than 5% during all ventilatory modes and mixtures.
Prediction of Maximal Aerobic Capacity in Severely Burned Children
Porro, Laura; Rivero, Haidy G.; Gonzalez, Dante; Tan, Alai; Herndon, David N.; Suman, Oscar E.
2011-01-01
Introduction Maximal oxygen uptake (VO2 peak) is an indicator of cardiorespiratory fitness, but requires expensive equipment and a relatively high technical skill level. Purpose The aim of this study is to provide a formula for estimating VO2 peak in burned children, using information obtained without expensive equipment. Methods Children, with ≥40% total surface area burned (TBSA), underwent a modified Bruce treadmill test to asses VO2 peak at 6 months after injury. We recorded gender, age, %TBSA, %3rd degree burn, height, weight, treadmill time, maximal speed, maximal grade, and peak heart rate, and applied McHenry’s select algorithm to extract important independent variables and Robust multiple regression to establish prediction equations. Results 42 children; 7 to 17 years old were tested. Robust multiple regression model provided the equation: VO2=10.33 – 0.62 *Age (years) + 1.88 * Treadmill Time (min) + 2.3 (gender; Females = 0, Males = 1). The correlation between measured and estimated VO2 peak was R=0.80. We then validated the equation with a group of 33 burned children, which yielded a correlation between measured and estimated VO2 peak of R=0.79. Conclusions Using only a treadmill and easily gathered information, VO2 peak can be estimated in children with burns. PMID:21316155
NASA Technical Reports Server (NTRS)
Gentry, R. C.; Rodgers, E.; Steranka, J.; Shenk, W. E.
1978-01-01
A regression technique was developed to forecast 24 hour changes of the maximum winds for weak (maximum winds less than or equal to 65 Kt) and strong (maximum winds greater than 65 Kt) tropical cyclones by utilizing satellite measured equivalent blackbody temperatures around the storm alone and together with the changes in maximum winds during the preceding 24 hours and the current maximum winds. Independent testing of these regression equations shows that the mean errors made by the equations are lower than the errors in forecasts made by the peristence techniques.
Feaster, Toby D.; Gotvald, Anthony J.; Weaver, J. Curtis
2014-01-01
Reliable estimates of the magnitude and frequency of floods are essential for the design of transportation and water-conveyance structures, flood-insurance studies, and flood-plain management. Such estimates are particularly important in densely populated urban areas. In order to increase the number of streamflow-gaging stations (streamgages) available for analysis, expand the geographical coverage that would allow for application of regional regression equations across State boundaries, and build on a previous flood-frequency investigation of rural U.S Geological Survey streamgages in the Southeast United States, a multistate approach was used to update methods for determining the magnitude and frequency of floods in urban and small, rural streams that are not substantially affected by regulation or tidal fluctuations in Georgia, South Carolina, and North Carolina. The at-site flood-frequency analysis of annual peak-flow data for urban and small, rural streams (through September 30, 2011) included 116 urban streamgages and 32 small, rural streamgages, defined in this report as basins draining less than 1 square mile. The regional regression analysis included annual peak-flow data from an additional 338 rural streamgages previously included in U.S. Geological Survey flood-frequency reports and 2 additional rural streamgages in North Carolina that were not included in the previous Southeast rural flood-frequency investigation for a total of 488 streamgages included in the urban and small, rural regression analysis. The at-site flood-frequency analyses for the urban and small, rural streamgages included the expected moments algorithm, which is a modification of the Bulletin 17B log-Pearson type III method for fitting the statistical distribution to the logarithms of the annual peak flows. Where applicable, the flood-frequency analysis also included low-outlier and historic information. Additionally, the application of a generalized Grubbs-Becks test allowed for the detection of multiple potentially influential low outliers. Streamgage basin characteristics were determined using geographical information system techniques. Initial ordinary least squares regression simulations reduced the number of basin characteristics on the basis of such factors as statistical significance, coefficient of determination, Mallow’s Cp statistic, and ease of measurement of the explanatory variable. Application of generalized least squares regression techniques produced final predictive (regression) equations for estimating the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probability flows for urban and small, rural ungaged basins for three hydrologic regions (HR1, Piedmont–Ridge and Valley; HR3, Sand Hills; and HR4, Coastal Plain), which previously had been defined from exploratory regression analysis in the Southeast rural flood-frequency investigation. Because of the limited availability of urban streamgages in the Coastal Plain of Georgia, South Carolina, and North Carolina, additional urban streamgages in Florida and New Jersey were used in the regression analysis for this region. Including the urban streamgages in New Jersey allowed for the expansion of the applicability of the predictive equations in the Coastal Plain from 3.5 to 53.5 square miles. Average standard error of prediction for the predictive equations, which is a measure of the average accuracy of the regression equations when predicting flood estimates for ungaged sites, range from 25.0 percent for the 10-percent annual exceedance probability regression equation for the Piedmont–Ridge and Valley region to 73.3 percent for the 0.2-percent annual exceedance probability regression equation for the Sand Hills region.
Estimating the magnitude of peak flows at selected recurrence intervals for streams in Idaho
Berenbrock, Charles
2002-01-01
The region-of-influence method is not recommended for use in determining flood-frequency estimates for ungaged sites in Idaho because the results, overall, are less accurate and the calculations are more complex than those of regional regression equations. The regional regression equations were considered to be the primary method of estimating the magnitude and frequency of peak flows for ungaged sites in Idaho.
Gómez Campos, Rossana; Pacheco Carrillo, Jaime; Almonacid Fierro, Alejandro; Urra Albornoz, Camilo; Cossío-Bolaños, Marco
2018-03-01
(i) To propose regression equations based on anthropometric measures to estimate fat mass (FM) using dual energy X-ray absorptiometry (DXA) as reference method, and (ii)to establish population reference standards for equation-derived FM. A cross-sectional study on 6,713 university students (3,354 males and 3,359 females) from Chile aged 17.0 to 27.0years. Anthropometric measures (weight, height, waist circumference) were taken in all participants. Whole body DXA was performed in 683 subjects. A total of 478 subjects were selected to develop regression equations, and 205 for their cross-validation. Data from 6,030 participants were used to develop reference standards for FM. Equations were generated using stepwise multiple regression analysis. Percentiles were developed using the LMS method. Equations for men were: (i) FM=-35,997.486 +232.285 *Weight +432.216 *CC (R 2 =0.73, SEE=4.1); (ii)FM=-37,671.303 +309.539 *Weight +66,028.109 *ICE (R2=0.76, SEE=3.8), while equations for women were: (iii)FM=-13,216.917 +461,302 *Weight+91.898 *CC (R 2 =0.70, SEE=4.6), and (iv) FM=-14,144.220 +464.061 *Weight +16,189.297 *ICE (R 2 =0.70, SEE=4.6). Percentiles proposed included p10, p50, p85, and p95. The developed equations provide valid and accurate estimation of FM in both sexes. The values obtained using the equations may be analyzed from percentiles that allow for categorizing body fat levels by age and sex. Copyright © 2017 SEEN y SED. Publicado por Elsevier España, S.L.U. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wetter, Michael; Zuo, Wangda; Nouidui, Thierry S.
This paper describes the Buildings library, a free open-source library that is implemented in Modelica, an equation-based object-oriented modeling language. The library supports rapid prototyping, as well as design and operation of building energy and control systems. First, we describe the scope of the library, which covers HVAC systems, multi-zone heat transfer and multi-zone airflow and contaminant transport. Next, we describe differentiability requirements and address how we implemented them. We describe the class hierarchy that allows implementing component models by extending partial implementations of base models of heat and mass exchangers, and by instantiating basic models for conservation equations andmore » flow resistances. We also describe associated tools for pre- and post-processing, regression tests, co-simulation and real-time data exchange with building automation systems. Furthermore, the paper closes with an example of a chilled water plant, with and without water-side economizer, in which we analyzed the system-level efficiency for different control setpoints.« less
Wetter, Michael; Zuo, Wangda; Nouidui, Thierry S.; ...
2013-03-13
This paper describes the Buildings library, a free open-source library that is implemented in Modelica, an equation-based object-oriented modeling language. The library supports rapid prototyping, as well as design and operation of building energy and control systems. First, we describe the scope of the library, which covers HVAC systems, multi-zone heat transfer and multi-zone airflow and contaminant transport. Next, we describe differentiability requirements and address how we implemented them. We describe the class hierarchy that allows implementing component models by extending partial implementations of base models of heat and mass exchangers, and by instantiating basic models for conservation equations andmore » flow resistances. We also describe associated tools for pre- and post-processing, regression tests, co-simulation and real-time data exchange with building automation systems. Furthermore, the paper closes with an example of a chilled water plant, with and without water-side economizer, in which we analyzed the system-level efficiency for different control setpoints.« less
Using a Linear Regression Method to Detect Outliers in IRT Common Item Equating
ERIC Educational Resources Information Center
He, Yong; Cui, Zhongmin; Fang, Yu; Chen, Hanwei
2013-01-01
Common test items play an important role in equating alternate test forms under the common item nonequivalent groups design. When the item response theory (IRT) method is applied in equating, inconsistent item parameter estimates among common items can lead to large bias in equated scores. It is prudent to evaluate inconsistency in parameter…
Rumpold, Gerhard; Klingseis, Michael; Dornauer, Kurt; Kopp, Martin; Doering, Stephan; Höfer, Stefan; Mumelter, Birgit; Schüssler, Gerhard
2006-01-01
The use of psychotropic substances in adolescents represents a serious public health problem. In this study a representative sample of 485 Austrian students between 14 and 18 years of age were investigated with a semistructured interview about substance-related issues and completed the general health questionnaire. The following rates of regular psychotropic substance use were found: cigarettes 41.4%, alcohol 44.5%, cannabis 10.1%, and other illicit substances 3%. Logistic regression analyses and structural equation modeling revealed the following major risk factors for substance use: peer pressure, negative family atmosphere, school difficulties, and psychopathology. Knowledge about substance use acted as a protective factor. Prevention of adolescent substance use and misuse should aim at these different targets. Information about coping with peer pressure may be a particularly promising route of intervention.
Estimating the magnitude and frequency of floods in urban basins in Missouri
Southard, Rodney E.
2010-01-01
Streamgage flood-frequency analyses were done for 35 streamgages on urban streams in and adjacent to Missouri for estimation of the magnitude and frequency of floods in urban areas of Missouri. A log-Pearson Type-III distribution was fitted to the annual series of peak flow data retrieved from the U.S. Geological Survey National Water Information System. For this report, the flood frequency estimates are expressed in terms of percent annual exceedance probabilities of 50, 20, 10, 4, 2, 1, and 0.2. Of the 35 streamgages, 30 are located in Missouri. The remaining five non-Missouri streamgages were added to the dataset to improve the range and applicability of the regression analyses from the streamgage frequency analyses. Ordinary least-squares was used to determine the best set of independent variables for the regression equations. Basin characteristics selected for independent variables into the ordinary least-squares regression analyses were based on theoretical relation to flood flows, literature review of possible basin characteristics, and the ability to measure the basin characteristics using digital datasets and geographic information system technology. Results of the ordinary least-squares were evaluated on the basis of Mallow's Cp statistic, the adjusted coefficient of determination, and the statistical significance of the independent variables. The independent variables of drainage area and percent impervious area were determined to be statistically significant and readily determined from existing digital datasets. The drainage area variable was computed using the best elevation data available, either from a statewide 10-meter grid or high-resolution elevation data from urban areas. The impervious area variable was computed from the National Land Cover Dataset 2001 impervious area dataset. The National Land Cover Dataset 2001 impervious area data for each basin was compared to historical imagery and 7.5-minute topographic maps to verify the national dataset represented the urbanization of the basin at the time streamgage data were collected. Eight streamgages had less urbanization during the period of time streamflow data were collected than was shown on the 2001 dataset. The impervious area values for these eight urban basins were adjusted downward as much as 23 percent to account for the additional urbanization since the streamflow data were collected. Weighted least-squares regression techniques were used to determine the final regression equations for the statewide urban flood-frequency equations. Weighted least-squares techniques improve regression equations by adjusting for different and varying lengths in streamflow records. The final flood-frequency equations for the 50-, 20-, 10-, 4-, 2-, 1-, and 0.2-percent annual exceedance probability floods for Missouri provide a technique for estimating peak flows on urban streams at gaged and ungaged sites. The applicability of the equations is limited by the range in basin characteristics used to develop the regression equations. The range in drainage area is 0.28 to 189 square miles; range in impervious area is 2.3 to 46.0 percent. Seven of the 35 selected streamgages were used to compare the results of the existing rural and urban equations to the urban equations presented in this report for the 1-percent annual exceedance probability. Results of the comparison indicate that the estimated peak flows for the urban equation in this report ranged from 3 to 52 percent higher than the results from the rural equations. Comparing the estimated urban peak flows from this report to the existing urban equation developed in 1986 indicated the range was 255 percent lower to 10 percent higher. The overall comparison between the current (2010) and 1986 urban equations indicates a reduction in estimated peak flow values for the 1-percent annual exceedance probability flood.
Liu, Xun; Li, Ning-shan; Lv, Lin-sheng; Huang, Jian-hua; Tang, Hua; Chen, Jin-xia; Ma, Hui-juan; Wu, Xiao-ming; Lou, Tan-qi
2013-12-01
Accurate estimation of glomerular filtration rate (GFR) is important in clinical practice. Current models derived from regression are limited by the imprecision of GFR estimates. We hypothesized that an artificial neural network (ANN) might improve the precision of GFR estimates. A study of diagnostic test accuracy. 1,230 patients with chronic kidney disease were enrolled, including the development cohort (n=581), internal validation cohort (n=278), and external validation cohort (n=371). Estimated GFR (eGFR) using a new ANN model and a new regression model using age, sex, and standardized serum creatinine level derived in the development and internal validation cohort, and the CKD-EPI (Chronic Kidney Disease Epidemiology Collaboration) 2009 creatinine equation. Measured GFR (mGFR). GFR was measured using a diethylenetriaminepentaacetic acid renal dynamic imaging method. Serum creatinine was measured with an enzymatic method traceable to isotope-dilution mass spectrometry. In the external validation cohort, mean mGFR was 49±27 (SD) mL/min/1.73 m2 and biases (median difference between mGFR and eGFR) for the CKD-EPI, new regression, and new ANN models were 0.4, 1.5, and -0.5 mL/min/1.73 m2, respectively (P<0.001 and P=0.02 compared to CKD-EPI and P<0.001 comparing the new regression and ANN models). Precisions (IQRs for the difference) were 22.6, 14.9, and 15.6 mL/min/1.73 m2, respectively (P<0.001 for both compared to CKD-EPI and P<0.001 comparing the new ANN and new regression models). Accuracies (proportions of eGFRs not deviating >30% from mGFR) were 50.9%, 77.4%, and 78.7%, respectively (P<0.001 for both compared to CKD-EPI and P=0.5 comparing the new ANN and new regression models). Different methods for measuring GFR were a source of systematic bias in comparisons of new models to CKD-EPI, and both the derivation and validation cohorts consisted of a group of patients who were referred to the same institution. An ANN model using 3 variables did not perform better than a new regression model. Whether ANN can improve GFR estimation using more variables requires further investigation. Copyright © 2013 National Kidney Foundation, Inc. Published by Elsevier Inc. All rights reserved.
Wheat flour dough Alveograph characteristics predicted by Mixolab regression models.
Codină, Georgiana Gabriela; Mironeasa, Silvia; Mironeasa, Costel; Popa, Ciprian N; Tamba-Berehoiu, Radiana
2012-02-01
In Romania, the Alveograph is the most used device to evaluate the rheological properties of wheat flour dough, but lately the Mixolab device has begun to play an important role in the breadmaking industry. These two instruments are based on different principles but there are some correlations that can be found between the parameters determined by the Mixolab and the rheological properties of wheat dough measured with the Alveograph. Statistical analysis on 80 wheat flour samples using the backward stepwise multiple regression method showed that Mixolab values using the ‘Chopin S’ protocol (40 samples) and ‘Chopin + ’ protocol (40 samples) can be used to elaborate predictive models for estimating the value of the rheological properties of wheat dough: baking strength (W), dough tenacity (P) and extensibility (L). The correlation analysis confirmed significant findings (P < 0.05 and P < 0.01) between the parameters of wheat dough studied by the Mixolab and its rheological properties measured with the Alveograph. A number of six predictive linear equations were obtained. Linear regression models gave multiple regression coefficients with R²(adjusted) > 0.70 for P, R²(adjusted) > 0.70 for W and R²(adjusted) > 0.38 for L, at a 95% confidence interval. Copyright © 2011 Society of Chemical Industry.
Equations for estimating bankfull channel geometry and discharge for streams in Massachusetts
Bent, Gardner C.; Waite, Andrew M.
2013-01-01
Regression equations were developed for estimating bankfull geometry—width, mean depth, cross-sectional area—and discharge for streams in Massachusetts. The equations provide water-resource and conservation managers with methods for estimating bankfull characteristics at specific stream sites in Massachusetts. This information can be used for the adminstration of the Commonwealth of Massachusetts Rivers Protection Act of 1996, which establishes a protected riverfront area extending from the mean annual high-water line corresponding to the elevation of bankfull discharge along each side of a perennial stream. Additionally, information on bankfull channel geometry and discharge are important to Federal, State, and local government agencies and private organizations involved in stream assessment and restoration projects. Regression equations are based on data from stream surveys at 33 sites (32 streamgages and 1 crest-stage gage operated by the U.S. Geological Survey) in and near Massachusetts. Drainage areas of the 33 sites ranged from 0.60 to 329 square miles (mi2). At 27 of the 33 sites, field data were collected and analyses were done to determine bankfull channel geometry and discharge as part of the present study. For 6 of the 33 sites, data on bankfull channel geometry and discharge were compiled from other studies done by the U.S. Geological Survey, Natural Resources Conservation Service of the U.S. Department of Agriculture, and the Vermont Department of Environmental Conservation. Similar techniques were used for field data collection and analysis for bankfull channel geometry and discharge at all 33 sites. Recurrence intervals of the bankfull discharge, which represent the frequency with which a stream fills its channel, averaged 1.53 years (median value 1.34 years) at the 33 sites. Simple regression equations were developed for bankfull width, mean depth, cross-sectional area, and discharge using drainage area, which is the most significant explanatory variable in estimating these bankfull characteristics. The use of drainage area as an explanatory variable is also the most commonly published method for estimating these bankfull characteristics. Regional curves (graphic plots) of bankfull channel geometry and discharge by drainage area are presented. The regional curves are based on the simple regression equations and can be used to estimate bankfull characteristics from drainage area. Multiple regression analysis, which includes basin characteristics in addition to drainage area, also was used to develop equations. Variability in bankfull width, mean depth, cross-sectional area, and discharge was more fully explained by the multiple regression equations that include mean-basin slope and drainage area than was explained by equations based on drainage area alone. The Massachusetts regional curves and equations developed in this study are similar, in terms of values of slopes and intercepts, to those developed for other parts of the northeastern United States. Limitations associated with site selection and development of the equations resulted in some constraints for the application of equations and regional curves presented in this report. The curves and equations are applicable to stream sites that have (1) less than about 25 percent of their drainage basin area occupied by urban land use (commercial, industrial, transportation, and high-density residential), (2) little to no streamflow regulation, especially from flood-control structures, (3) drainage basin areas greater than 0.60 mi2 and less than 329 mi2, and (4) a mean basin slope greater than 2.2 percent and less than 23.9 percent. The equations may not be applicable where streams flow through extensive wetlands. The equations also may not apply in areas of Cape Cod and the Islands and the area of southeastern Massachusetts close to Cape Cod with extensive areas of coarse-grained glacial deposits where none of the study sites are located. Regardless of the setting, the regression equations are not intended for use as the sole method of estimating bankfull characteristics; however, they may supplement field identification of the bankfull channel when used in conjunction with field verified bankfull indicators, flood-frequency analysis, or other supporting evidence.
Yen, Muh-Yong; Lu, Yun-Ching; Huang, Pi-Hsiang; Chen, Chen-Ming; Chen, Yee-Chun; Lin, Yusen E
2010-07-01
Healthcare workers (HCWs) are at high risk of acquiring emerging infections while caring for patients, as has been shown in the recent SARS and swine flu epidemics. Using SARS as an example, we determined the effectiveness of infection control measures (ICMs) by logistic regression and structural equation modelling (SEM), a quantitative methodology that can test a hypothetical model and validates causal relationships among ICMs. Logistic regression showed that installing hand wash stations in the emergency room (p = 0.012, odds ratio = 1.07) was the only ICM significantly associated with the protection of HCWs from acquiring the SARS virus. The structural equation modelling results showed that the most important contributing factor (highest proportion of effectiveness) was installation of a fever screening station outside the emergency department (51%). Other measures included traffic control in the emergency department (19%), availability of an outbreak standard operation protocol (12%), mandatory temperature screening (9%), establishing a hand washing setup at each hospital checkpoint (3%), adding simplified isolation rooms (3%), and a standardized patient transfer protocol (3%). Installation of fever screening stations outside of the hospital and implementing traffic control in the emergency department contributed to 70% of the effectiveness in the prevention of SARS transmission. Our approach can be applied to the evaluation of control measures for other epidemic infectious diseases, including swine flu and avian flu.
Wells, Samantha; Flynn, Andrea; Tremblay, Paul F; Dumas, Tara; Miller, Peter; Graham, Kathryn
2014-05-01
This study extends previous research on masculinity and negative drinking consequences among young men by considering mediating effects of heavy episodic drinking (HED) and alcohol expectancies. We hypothesized that masculinity would have a direct relationship with negative consequences from drinking as well as indirect relationships mediated by HED and alcohol expectancies of courage, risk, and aggression. A random sample of 1,436 college and university men ages 19-25 years completed an online survey, including conformity to masculine norms, alcohol-related expectancies, HED, and negative drinking consequences. Regression analyses and structural equation modeling were used. Six of seven dimensions of masculinity and the alcohol expectancy scales were significantly associated with both HED and negative consequences. In multivariate regression models predicting HED and negative consequences, the playboy and violence dimensions of masculinity and the risk/aggression alcohol expectancy remained significant. HED and the risk-taking dimension of masculinity were also significant in the model predicting negative consequences. The structural equation model indicated that masculinity was directly associated with HED and negative consequences but also influenced negative consequences indirectly through HED and alcohol expectancies. The findings suggest that, among young adult male college and university students, masculinity is an important factor related to both HED and drinking consequences, with the latter effect partly mediated by HED and alcohol expectancies. Addressing male norms about masculinity may help to reduce HED and negative consequences from drinking.
NASA Astrophysics Data System (ADS)
Liu, Yongqiang; Mamtimin, Ali; He, Qing
2014-05-01
Because land surface emissivity (ɛ) has not been reliably measured, global climate model (GCM) land surface schemes conventionally set this parameter as simply assumption, for example, 1 as in the National Oceanic and Atmospheric Administration (NOAA) National Centers for Environmental Prediction (NCEP) model, 0.96 for soil and wetland in the Global and Regional Assimilation and Prediction System (GRAPES) Common Land Model (CoLM). This is the so-called emissivity assumption. Accurate broadband emissivity data are needed as model inputs to better simulate the land surface climate. It is demonstrated in this paper that the assumption of the emissivity induces errors in modeling the surface energy budget over Taklimakan Desert where ɛ is far smaller than original value. One feasible solution to this problem is to apply the accurate broadband emissivity into land surface models. The Moderate Resolution Imaging Spectroradiometer (MODIS) instrument has routinely measured spectral emissivities in six thermal infrared bands. The empirical regression equations have been developed in this study to convert these spectral emissivities to broadband emissivity required by land surface models. In order to calibrate the regression equations, using a portable Fourier Transform infrared (FTIR) spectrometer instrument, crossing Taklimakan Desert along with highway from north to south, to measure the accurate broadband emissivity. The observed emissivity data show broadband ɛ around 0.89-0.92. To examine the impact of improved ɛ to radiative energy redistribution, simulation studies were conducted using offline CoLM. The results illustrate that large impacts of surface ɛ occur over desert, with changes up in surface skin temperature, as well as evident changes in sensible heat fluxes. Keywords: Taklimakan Desert, surface broadband emissivity, Fourier Transform infrared spectrometer, MODIS, CoLM
Rus, David L.; Dietsch, Benjamin J.; Woodward, Brenda K.; Fry, Beth E.; Wilson, Richard C.
2007-01-01
An assessment of the 16.3-square-mile Cardwell Branch watershed characterized the hydrology, fluvial geomorphology, and stream ecology in 2003-04. The study - performed by the U.S. Geological Survey in cooperation with the City of Lincoln, Nebraska, and the Lower Platte South Natural Resources District - focused on the 7.7-square-mile drainage downstream from Yankee Hill Reservoir. Hydrologic and hydraulic models were developed using the Hydrologic Modeling System (HEC-HMS) and River Analysis System (HEC-RAS) of the U.S. Army Corps of Engineers Hydraulic Engineering Center. Estimates of streamflow and water-surface elevation were simulated for 24-hour-duration design rainstorms ranging from a 50-percent frequency to a 0.2-percent frequency. An initial HEC-HMS model was developed using the standardized parameter estimation techniques associated with the Soil Conservation Service curve number technique. An adjusted HEC-HMS model also was developed in which parameters were adjusted in order for the model output to better correspond to peak streamflows estimated from regional regression equations. Comparisons of peak streamflow from the two HEC-HMS models indicate that the initial HEC-HMS model may better agree with the regional regression equations for higher frequency storms, and the adjusted HEC-HMS model may perform more closely to regional regression equations for larger, rarer events. However, a lack of observed streamflow data, coupled with conflicting results from regional regression equations and local high-water marks, introduced considerable uncertainty into the model simulations. Using the HEC-RAS model to estimate water-surface elevations associated with the peak streamflow, the adjusted HEC-HMS model produced average increases in water-surface elevation of 0.2, 1.1, and 1.4 feet for the 50-, 1-, and 0.2-percent-frequency rainstorms, respectively, when compared to the initial HEC-HMS model. Cross-sectional surveys and field assessments conducted between November 2003 and March 2004 indicated that Cardwell Branch and its unnamed tributary appear to be undergoing incision (the process of downcutting) (with three locations showing 2 or more feet of streambed incision since 1978) that is somewhat moderated by the presence of grade controls and vegetation along the channel profile. Although streambank failures were commonly observed, 96 percent of the surveyed cross sections were classified as stable by planar and rotational failure analysis-a disconnect that may have been the result of assumed soil properties. Two process-based classification systems each indicated that the reaches within the study area were incising and widening, and the Rosgen classification system characterized the streams as either type E6 or B6c. E6 channels are hydraulically efficient with low width-depth ratios, low to moderate sinuosity, and gentle to moderately steep slopes. B6c channels typically are incised with low width-depth ratios maintained by riparian vegetation, low bedload transport, and high washload transport. No obvious nickpoints (interruption or break in slope) were observed in the thalweg profile (line of maximum streambed descent), and the most acute incision occurred immediately downstream from bridges and culverts. Nine water-quality samples were collected between August 2003 and November 2004 near the mouth of the watershed. Sediment-laden rainfall-runoff substantially affected the water quality in Cardwell Branch, leading to greater biochemical and chemical oxygen demands as well as increased concentrations of several nutrient, bacteriological, sediment, and pesticide constituents. The storage of rainfall runoff in Yankee Hill Reservoir may prolong the presence of runoff-related constituents downstream. Across the study area, there was a lack of habitat availability for aquatic biota because of low dissolved oxygen levels and low streamflows or dry channels. In August 2003, the aquatic community near the mouth of
Rogers, Paul; Stoner, Julie
2016-01-01
Regression models for correlated binary outcomes are commonly fit using a Generalized Estimating Equations (GEE) methodology. GEE uses the Liang and Zeger sandwich estimator to produce unbiased standard error estimators for regression coefficients in large sample settings even when the covariance structure is misspecified. The sandwich estimator performs optimally in balanced designs when the number of participants is large, and there are few repeated measurements. The sandwich estimator is not without drawbacks; its asymptotic properties do not hold in small sample settings. In these situations, the sandwich estimator is biased downwards, underestimating the variances. In this project, a modified form for the sandwich estimator is proposed to correct this deficiency. The performance of this new sandwich estimator is compared to the traditional Liang and Zeger estimator as well as alternative forms proposed by Morel, Pan and Mancl and DeRouen. The performance of each estimator was assessed with 95% coverage probabilities for the regression coefficient estimators using simulated data under various combinations of sample sizes and outcome prevalence values with an Independence (IND), Autoregressive (AR) and Compound Symmetry (CS) correlation structure. This research is motivated by investigations involving rare-event outcomes in aviation data. PMID:26998504
Reference value of impulse oscillometry in taiwanese preschool children.
Lai, Shen-Hao; Yao, Tsung-Chieh; Liao, Sui-Ling; Tsai, Ming-Han; Hua, Men-Chin; Yeh, Kuo-Wei; Huang, Jing-Long
2015-06-01
Impulse oscillometry is a potential technique for assessing the respiratory mechanism-which includes airway resistance and reactance during tidal breathing-in minimally cooperative young children. The reference values available in Asian preschool children are limited, especially in children of Chinese ethnicity. This study aimed to develop reference equations for lung function measurements using impulse oscillometry in Taiwanese children for future clinical application and research exploitation. Impulse oscillometry was performed in 150 healthy Taiwanese children (aged 2-6 years) to measure airway resistance and reactance at various frequencies. We used regression analysis to generate predictive equations separately by age, body height, body weight, and gender. The stepwise regression model revealed that body height was the most significant determinant of airway resistance and reactance in preschool young children. With the growth in height, a decrease in airway resistance and a paradoxical increase in reactance occurred at different frequencies. The regression curve of resistance at 5 Hz was comparable to previous reference values. This study provided reference values for several variables of the impulse oscillometry measurements in healthy Taiwanese children aged 2-6 years. With these reference data, clinical application of impulse oscillometry would be expedient in diagnosing respiratory diseases in preschool children. Copyright © 2014. Published by Elsevier B.V.
Willis, Michael; Asseburg, Christian; Nilsson, Andreas; Johnsson, Kristina; Kartman, Bernt
2017-03-01
Type 2 diabetes mellitus (T2DM) is chronic and progressive and the cost-effectiveness of new treatment interventions must be established over long time horizons. Given the limited durability of drugs, assumptions regarding downstream rescue medication can drive results. Especially for insulin, for which treatment effects and adverse events are known to depend on patient characteristics, this can be problematic for health economic evaluation involving modeling. To estimate parsimonious multivariate equations of treatment effects and hypoglycemic event risks for use in parameterizing insulin rescue therapy in model-based cost-effectiveness analysis. Clinical evidence for insulin use in T2DM was identified in PubMed and from published reviews and meta-analyses. Study and patient characteristics and treatment effects and adverse event rates were extracted and the data used to estimate parsimonious treatment effect and hypoglycemic event risk equations using multivariate regression analysis. Data from 91 studies featuring 171 usable study arms were identified, mostly for premix and basal insulin types. Multivariate prediction equations for glycated hemoglobin A 1c lowering and weight change were estimated separately for insulin-naive and insulin-experienced patients. Goodness of fit (R 2 ) for both outcomes were generally good, ranging from 0.44 to 0.84. Multivariate prediction equations for symptomatic, nocturnal, and severe hypoglycemic events were also estimated, though considerable heterogeneity in definitions limits their usefulness. Parsimonious and robust multivariate prediction equations were estimated for glycated hemoglobin A 1c and weight change, separately for insulin-naive and insulin-experienced patients. Using these in economic simulation modeling in T2DM can improve realism and flexibility in modeling insulin rescue medication. Copyright © 2017 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
2012-01-01
Background Route environments may influence people's active commuting positively and thereby contribute to public health. Assessments of route environments are, however, needed in order to better understand the possible relationship between active commuting and the route environment. The aim of this study was, therefore, to assess the potential associations between perceptions of whether the route environment on the whole hinders or stimulates bicycle commuting and perceptions of environmental factors. Methods The Active Commuting Route Environment Scale (ACRES) was used for the assessment of bicycle commuters' perceptions of their route environments in the inner urban parts of Greater Stockholm, Sweden. Bicycle commuters (n = 827) were recruited by advertisements in newspapers. Simultaneous multiple regression analyses were used to assess the relation between predictor variables (such as levels of exhaust fumes, noise, traffic speed, traffic congestion and greenery) and the outcome variable (hindering - stimulating route environments). Two models were run, (Model 1) without and (Model 2) with the item traffic: unsafe or safe included as a predictor. Results Overall, about 40% of the variance of hindering - stimulating route environments was explained by the environmental predictors in our models (Model 1, R2 = 0.415, and Model 2, R 2= 0.435). The regression equation for Model 1 was: y = 8.53 + 0.33 ugly or beautiful + 0.14 greenery + (-0.14) course of the route + (-0.13) exhaust fumes + (-0.09) congestion: all types of vehicles (p ≤ 0.019). The regression equation for Model 2 was y = 6.55 + 0.31 ugly or beautiful + 0.16 traffic: unsafe or safe + (-0.13) exhaust fumes + 0.12 greenery + (-0.12) course of the route (p ≤ 0.001). Conclusions The main results indicate that beautiful, green and safe route environments seem to be, independently of each other, stimulating factors for bicycle commuting in inner urban areas. On the other hand, exhaust fumes, traffic congestion and low 'directness' of the route seem to be hindering factors. Furthermore, the overall results illustrate the complexity of a research area at the beginning of exploration. PMID:22401492
Grajeda, Laura M; Ivanescu, Andrada; Saito, Mayuko; Crainiceanu, Ciprian; Jaganath, Devan; Gilman, Robert H; Crabtree, Jean E; Kelleher, Dermott; Cabrera, Lilia; Cama, Vitaliano; Checkley, William
2016-01-01
Childhood growth is a cornerstone of pediatric research. Statistical models need to consider individual trajectories to adequately describe growth outcomes. Specifically, well-defined longitudinal models are essential to characterize both population and subject-specific growth. Linear mixed-effect models with cubic regression splines can account for the nonlinearity of growth curves and provide reasonable estimators of population and subject-specific growth, velocity and acceleration. We provide a stepwise approach that builds from simple to complex models, and account for the intrinsic complexity of the data. We start with standard cubic splines regression models and build up to a model that includes subject-specific random intercepts and slopes and residual autocorrelation. We then compared cubic regression splines vis-à-vis linear piecewise splines, and with varying number of knots and positions. Statistical code is provided to ensure reproducibility and improve dissemination of methods. Models are applied to longitudinal height measurements in a cohort of 215 Peruvian children followed from birth until their fourth year of life. Unexplained variability, as measured by the variance of the regression model, was reduced from 7.34 when using ordinary least squares to 0.81 (p < 0.001) when using a linear mixed-effect models with random slopes and a first order continuous autoregressive error term. There was substantial heterogeneity in both the intercept (p < 0.001) and slopes (p < 0.001) of the individual growth trajectories. We also identified important serial correlation within the structure of the data (ρ = 0.66; 95 % CI 0.64 to 0.68; p < 0.001), which we modeled with a first order continuous autoregressive error term as evidenced by the variogram of the residuals and by a lack of association among residuals. The final model provides a parametric linear regression equation for both estimation and prediction of population- and individual-level growth in height. We show that cubic regression splines are superior to linear regression splines for the case of a small number of knots in both estimation and prediction with the full linear mixed effect model (AIC 19,352 vs. 19,598, respectively). While the regression parameters are more complex to interpret in the former, we argue that inference for any problem depends more on the estimated curve or differences in curves rather than the coefficients. Moreover, use of cubic regression splines provides biological meaningful growth velocity and acceleration curves despite increased complexity in coefficient interpretation. Through this stepwise approach, we provide a set of tools to model longitudinal childhood data for non-statisticians using linear mixed-effect models.
Ryberg, Karen R.
2006-01-01
This report presents the results of a study by the U.S. Geological Survey, done in cooperation with the Bureau of Reclamation, U.S. Department of the Interior, to estimate water-quality constituent concentrations in the Red River of the North at Fargo, North Dakota. Regression analysis of water-quality data collected in 2003-05 was used to estimate concentrations and loads for alkalinity, dissolved solids, sulfate, chloride, total nitrite plus nitrate, total nitrogen, total phosphorus, and suspended sediment. The explanatory variables examined for regression relation were continuously monitored physical properties of water-streamflow, specific conductance, pH, water temperature, turbidity, and dissolved oxygen. For the conditions observed in 2003-05, streamflow was a significant explanatory variable for all estimated constituents except dissolved solids. pH, water temperature, and dissolved oxygen were not statistically significant explanatory variables for any of the constituents in this study. Specific conductance was a significant explanatory variable for alkalinity, dissolved solids, sulfate, and chloride. Turbidity was a significant explanatory variable for total phosphorus and suspended sediment. For the nutrients, total nitrite plus nitrate, total nitrogen, and total phosphorus, cosine and sine functions of time also were used to explain the seasonality in constituent concentrations. The regression equations were evaluated using common measures of variability, including R2, or the proportion of variability in the estimated constituent explained by the regression equation. R2 values ranged from 0.703 for total nitrogen concentration to 0.990 for dissolved-solids concentration. The regression equations also were evaluated by calculating the median relative percentage difference (RPD) between measured constituent concentration and the constituent concentration estimated by the regression equations. Median RPDs ranged from 1.1 for dissolved solids to 35.2 for total nitrite plus nitrate. Regression equations also were used to estimate daily constituent loads. Load estimates can be used by water-quality managers for comparison of current water-quality conditions to water-quality standards expressed as total maximum daily loads (TMDLs). TMDLs are a measure of the maximum amount of chemical constituents that a water body can receive and still meet established water-quality standards. The peak loads generally occurred in June and July when streamflow also peaked.
Prediction of Maximal Oxygen Uptake by Six-Minute Walk Test and Body Mass Index in Healthy Boys.
Jalili, Majid; Nazem, Farzad; Sazvar, Akbar; Ranjbar, Kamal
2018-05-14
To develop an equation to predict maximal oxygen uptake (VO2max) based on the 6-minute walk test (6MWT) and body composition in healthy boys. Direct VO2max, 6-minute walk distance, and anthropometric characteristics were measured in 349 healthy boys (12.49 ± 2.72 years). Multiple regression analysis was used to generate VO2max prediction equations. Cross-validation of the VO2max prediction equations was assessed with predicted residual sum of squares statistics. Pearson correlation was used to assess the correlation between measured and predicted VO2max. Objectively measured VO2max had a significant correlation with demographic and 6MWT characteristics (R = 0.11-0.723, P < .01). Multiple regression analysis revealed the following VO2max prediction equation: VO2max (mL/kg/min) = 12.701 + (0.06 × 6-minute walk distance m ) - (0.732 × body mass index kg/m2 ) (R 2 = 0.79, standard error of the estimate [SEE] = 2.91 mL/kg/min, %SEE = 6.9%). There was strong correlation between measured and predicted VO2max (r = 0.875, P < .001). Cross-validation revealed minimal shrinkage (R 2 p = 0.78 and predicted residual sum of squares SEE = 2.99 mL/kg/min). This study provides a relatively accurate and convenient VO2max prediction equation based on the 6MWT and body mass index in healthy boys. This model can be used for evaluation of cardiorespiratory fitness of boys in different settings. Copyright © 2018 Elsevier Inc. All rights reserved.
Determination of streamflow of the Arkansas River near Bentley in south-central Kansas
Perry, Charles A.
2012-01-01
The Kansas Department of Agriculture, Division of Water Resources, requires that the streamflow of the Arkansas River just upstream from Bentley in south-central Kansas be measured or calculated before groundwater can be pumped from the well field. When the daily streamflow of the Arkansas River near Bentley is less than 165 cubic feet per second (ft3/s), pumping must be curtailed. Daily streamflow near Bentley was calculated by determining the relations between streamflow data from two reference streamgages with a concurrent record of 24 years, one located 17.2 miles (mi) upstream and one located 10.9 mi downstream, and streamflow at a temporary gage located just upstream from Bentley (Arkansas River near Bentley, Kansas). Flow-duration curves for the two reference streamgages indicate that during 1988?2011, the mean daily streamflow was less than 165 ft3/s 30 to 35 percent of the time. During extreme low-flow (drought) conditions, the reach of the Arkansas River between Hutchinson and Maize can lose flow to the adjacent alluvial aquifer, with streamflow losses as much as 1.6 cubic feet per second per mile. Three models were developed to calculate the streamflow of the Arkansas River near Bentley, Kansas. The model chosen depends on the data available and on whether the reach of the Arkansas River between Hutchinson and Maize is gaining or losing groundwater from or to the adjacent alluvial aquifer. The first model was a pair of equations developed from linear regressions of the relation between daily streamflow data from the Bentley streamgage and daily streamflow data from either the Arkansas River near Hutchinson, Kansas, station (station number 07143330) or the Arkansas River near Maize, Kansas, station (station number 07143375). The standard error of the Hutchinson-only equation was 22.8 ft3/s, and the standard error of the Maize-only equation was 22.3 ft3/s. The single-station model would be used if only one streamgage was available. In the second model, the flow gradient between the streamflow near Hutchinson and the streamflow near Maize was used to calculate the streamflow at the Bentley streamgage. This equation resulted in a standard error of 26.7 ft3/s. In the third model, a multiple regression analysis between both the daily streamflow of the Arkansas River near Hutchinson, Kansas, and the daily streamflow of the Arkansas River near Maize, Kansas, was used to calculate the streamflow at the Bentley streamgage. The multiple regression equation had a standard error of 21.2 ft3/s, which was the smallest of the standard errors for all the models. An analysis of the number of low-flow days and the number of days when the reach between Hutchinson and Maize loses flow to the adjacent alluvial aquifer indicates that the long-term trend is toward fewer days of losing conditions. This trend may indicate a long-term increase in water levels in the alluvial aquifer, which could be caused by one or more of several conditions, including an increase in rainfall, a decrease in pumping, a decrease in temperature, and an increase in streamflow upstream from the Hutchinson-to-Maize reach of the Arkansas River.
Testing students' e-learning via Facebook through Bayesian structural equation modeling.
Salarzadeh Jenatabadi, Hashem; Moghavvemi, Sedigheh; Wan Mohamed Radzi, Che Wan Jasimah Bt; Babashamsi, Parastoo; Arashi, Mohammad
2017-01-01
Learning is an intentional activity, with several factors affecting students' intention to use new learning technology. Researchers have investigated technology acceptance in different contexts by developing various theories/models and testing them by a number of means. Although most theories/models developed have been examined through regression or structural equation modeling, Bayesian analysis offers more accurate data analysis results. To address this gap, the unified theory of acceptance and technology use in the context of e-learning via Facebook are re-examined in this study using Bayesian analysis. The data (S1 Data) were collected from 170 students enrolled in a business statistics course at University of Malaya, Malaysia, and tested with the maximum likelihood and Bayesian approaches. The difference between the two methods' results indicates that performance expectancy and hedonic motivation are the strongest factors influencing the intention to use e-learning via Facebook. The Bayesian estimation model exhibited better data fit than the maximum likelihood estimator model. The results of the Bayesian and maximum likelihood estimator approaches are compared and the reasons for the result discrepancy are deliberated.
Testing students’ e-learning via Facebook through Bayesian structural equation modeling
Moghavvemi, Sedigheh; Wan Mohamed Radzi, Che Wan Jasimah Bt; Babashamsi, Parastoo; Arashi, Mohammad
2017-01-01
Learning is an intentional activity, with several factors affecting students’ intention to use new learning technology. Researchers have investigated technology acceptance in different contexts by developing various theories/models and testing them by a number of means. Although most theories/models developed have been examined through regression or structural equation modeling, Bayesian analysis offers more accurate data analysis results. To address this gap, the unified theory of acceptance and technology use in the context of e-learning via Facebook are re-examined in this study using Bayesian analysis. The data (S1 Data) were collected from 170 students enrolled in a business statistics course at University of Malaya, Malaysia, and tested with the maximum likelihood and Bayesian approaches. The difference between the two methods’ results indicates that performance expectancy and hedonic motivation are the strongest factors influencing the intention to use e-learning via Facebook. The Bayesian estimation model exhibited better data fit than the maximum likelihood estimator model. The results of the Bayesian and maximum likelihood estimator approaches are compared and the reasons for the result discrepancy are deliberated. PMID:28886019
Zhou, Jinzhe; Zhou, Yanbing; Cao, Shougen; Li, Shikuan; Wang, Hao; Niu, Zhaojian; Chen, Dong; Wang, Dongsheng; Lv, Liang; Zhang, Jian; Li, Yu; Jiao, Xuelong; Tan, Xiaojie; Zhang, Jianli; Wang, Haibo; Zhang, Bingyuan; Lu, Yun; Sun, Zhenqing
2016-01-01
Reporting of surgical complications is common, but few provide information about the severity and estimate risk factors of complications. If have, but lack of specificity. We retrospectively analyzed data on 2795 gastric cancer patients underwent surgical procedure at the Affiliated Hospital of Qingdao University between June 2007 and June 2012, established multivariate logistic regression model to predictive risk factors related to the postoperative complications according to the Clavien-Dindo classification system. Twenty-four out of 86 variables were identified statistically significant in univariate logistic regression analysis, 11 significant variables entered multivariate analysis were employed to produce the risk model. Liver cirrhosis, diabetes mellitus, Child classification, invasion of neighboring organs, combined resection, introperative transfusion, Billroth II anastomosis of reconstruction, malnutrition, surgical volume of surgeons, operating time and age were independent risk factors for postoperative complications after gastrectomy. Based on logistic regression equation, p=Exp∑BiXi / (1+Exp∑BiXi), multivariate logistic regression predictive model that calculated the risk of postoperative morbidity was developed, p = 1/(1 + e((4.810-1.287X1-0.504X2-0.500X3-0.474X4-0.405X5-0.318X6-0.316X7-0.305X8-0.278X9-0.255X10-0.138X11))). The accuracy, sensitivity and specificity of the model to predict the postoperative complications were 86.7%, 76.2% and 88.6%, respectively. This risk model based on Clavien-Dindo grading severity of complications system and logistic regression analysis can predict severe morbidity specific to an individual patient's risk factors, estimate patients' risks and benefits of gastric surgery as an accurate decision-making tool and may serve as a template for the development of risk models for other surgical groups.
Flood characteristics of Alaskan streams
Lamke, R.D.
1979-01-01
Peak discharge data for Alaskan streams are summarized and analyzed. Multiple-regression equations relating peak discharge magnitude and frequency to climatic and physical characteristics of 260 gaged basins were determined in order to estimate average recurrence interval of floods at ungaged sites. These equations are for 1.25-, 2-, 5-, 10-, 25-, and 50-year average recurrence intervals. In this report, Alaska was divided into two regions, one having a maritime climate with fall and winter rains and floods, the other having spring and summer floods of a variety or combinations of causes. Average standard errors of the six multiple-regression equations for these two regions were 48 and 74 percent, respectively. Maximum recorded floods at more than 400 sites throughout Alaska are tabulated. Maps showing lines of equal intensity of the principal climatic variables found to be significant (mean annual precipitation and mean minimum January temperature), and location of the 260 sites used in the multiple-regression analyses are included. Little flood data have been collected in western and arctic Alaska, and the predictive equations are therefore less reliable for those areas. (Woodard-USGS)
NASA Astrophysics Data System (ADS)
Panagos, Panos; Ballabio, Cristiano; Borrelli, Pasquale; Meusburger, Katrin; Alewell, Christine
2015-04-01
Rainfall erosivity (R-factor) is among the 6 input factors in estimating soil erosion risk by using the empirical Revised Universal Soil Loss Equation (RUSLE). R-factor is a driving force for soil erosion modelling and potentially can be used in flood risk assessments, landslides susceptibility, post-fire damage assessment, application of agricultural management practices and climate change modelling. The rainfall erosivity is extremely difficult to model at large scale (national, European) due to lack of high temporal resolution precipitation data which cover long-time series. In most cases, R-factor is estimated based on empirical equations which take into account precipitation volume. The Rainfall Erosivity Database on the European Scale (REDES) is the output of an extensive data collection of high resolution precipitation data in the 28 Member States of the European Union plus Switzerland taking place during 2013-2014 in collaboration with national meteorological/environmental services. Due to different temporal resolutions of the data (5, 10, 15, 30, 60 minutes), conversion equations have been applied in order to homogenise the database at 30-minutes interval. The 1,541 stations included in REDES have been interpolated using the Gaussian Process Regression (GPR) model using as covariates the climatic data (monthly precipitation, monthly temperature, wettest/driest month) from WorldClim Database, Digital Elevation Model and latitude/longitude. GPR has been selected among other candidate models (GAM, Regression Kriging) due the best performance both in cross validation (R2=0.63) and in fitting dataset (R2=0.72). The highest uncertainty has been noticed in North-western Scotland, North Sweden and Finland due to limited number of stations in REDES. Also, in highlands such as Alpine arch and Pyrenees the diversity of environmental features forced relatively high uncertainty. The rainfall erosivity map of Europe available at 500m resolution plus the standard error and the erosivity density (Rainfall erosivity per mm of precipitation) are available in the European Soil Data Centre (ESDAC). The highest erosivity has been found in the mediterrean countries (Italy, Western Greece, Spain, Northern Portugal), South Austria, Slovenia, Croatia and Western United Kingdom.
Retro-regression--another important multivariate regression improvement.
Randić, M
2001-01-01
We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.
Estimation of health effects of prenatal methylmercury exposure using structural equation models.
Budtz-Jørgensen, Esben; Keiding, Niels; Grandjean, Philippe; Weihe, Pal
2002-10-14
Observational studies in epidemiology always involve concerns regarding validity, especially measurement error, confounding, missing data, and other problems that may affect the study outcomes. Widely used standard statistical techniques, such as multiple regression analysis, may to some extent adjust for these shortcomings. However, structural equations may incorporate most of these considerations, thereby providing overall adjusted estimations of associations. This approach was used in a large epidemiological data set from a prospective study of developmental methyl-mercury toxicity. Structural equation models were developed for assessment of the association between biomarkers of prenatal mercury exposure and neuropsychological test scores in 7 year old children. Eleven neurobehavioral outcomes were grouped into motor function and verbally mediated function. Adjustment for local dependence and item bias was necessary for a satisfactory fit of the model, but had little impact on the estimated mercury effects. The mercury effect on the two latent neurobehavioral functions was similar to the strongest effects seen for individual test scores of motor function and verbal skills. Adjustment for contaminant exposure to poly chlorinated biphenyls (PCBs) changed the estimates only marginally, but the mercury effect could be reduced to non-significance by assuming a large measurement error for the PCB biomarker. The structural equation analysis allows correction for measurement error in exposure variables, incorporation of multiple outcomes and incomplete cases. This approach therefore deserves to be applied more frequently in the analysis of complex epidemiological data sets.
Numerical investigation on the regression rate of hybrid rocket motor with star swirl fuel grain
NASA Astrophysics Data System (ADS)
Zhang, Shuai; Hu, Fan; Zhang, Weihua
2016-10-01
Although hybrid rocket motor is prospected to have distinct advantages over liquid and solid rocket motor, low regression rate and insufficient efficiency are two major disadvantages which have prevented it from being commercially viable. In recent years, complex fuel grain configurations are attractive in overcoming the disadvantages with the help of Rapid Prototyping technology. In this work, an attempt has been made to numerically investigate the flow field characteristics and local regression rate distribution inside the hybrid rocket motor with complex star swirl grain. A propellant combination with GOX and HTPB has been chosen. The numerical model is established based on the three dimensional Navier-Stokes equations with turbulence, combustion, and coupled gas/solid phase formulations. The calculated fuel regression rate is compared with the experimental data to validate the accuracy of numerical model. The results indicate that, comparing the star swirl grain with the tube grain under the conditions of the same port area and the same grain length, the burning surface area rises about 200%, the spatially averaged regression rate rises as high as about 60%, and the oxidizer can combust sufficiently due to the big vortex around the axis in the aft-mixing chamber. The combustion efficiency of star swirl grain is better and more stable than that of tube grain.
Eastwood, Sophie V.; Tillin, Therese; Wright, Andrew; Heasman, John; Willis, Joseph; Godsland, Ian F.; Forouhi, Nita; Whincup, Peter; Hughes, Alun D.; Chaturvedi, Nishi
2013-01-01
Background South Asians and African Caribbeans experience more cardiometabolic disease than Europeans. Risk factors include visceral (VAT) and subcutaneous abdominal (SAT) adipose tissue, which vary with ethnicity and are difficult to quantify using anthropometry. Objective We developed and cross-validated ethnicity and gender-specific equations using anthropometrics to predict VAT and SAT. Design 669 Europeans, 514 South Asians and 227 African Caribbeans (70±7 years) underwent anthropometric measurement and abdominal CT scanning. South Asian and African Caribbean participants were first-generation migrants living in London. Prediction equations were derived for CT-measured VAT and SAT using stepwise regression, then cross-validated by comparing actual and predicted means. Results South Asians had more and African Caribbeans less VAT than Europeans. For basic VAT prediction equations (age and waist circumference), model fit was better in men (R2 range 0.59-0.71) than women (range 0.35-0.59). Expanded equations (+ weight, height, hip and thigh circumference) improved fit for South Asian and African Caribbean women (R2 0.35 to 0.55, and 0.43 to 0.56 respectively). For basic SAT equations, R2 was 0.69-0.77, and for expanded equations it was 0.72-0.86. Cross-validation showed differences between actual and estimated VAT of <7%, and SAT of <8% in all groups, apart from VAT in South Asian women which disagreed by 16%. Conclusion We provide ethnicity- and gender-specific VAT and SAT prediction equations, derived from a large tri-ethnic sample. Model fit was reasonable for SAT and VAT in men, while basic VAT models should be used cautiously in South Asian and African Caribbean women. These equations will aid studies of mechanisms of cardiometabolic disease in later life, where imaging data are not available. PMID:24069381
Vegetation Monitoring with Gaussian Processes and Latent Force Models
NASA Astrophysics Data System (ADS)
Camps-Valls, Gustau; Svendsen, Daniel; Martino, Luca; Campos, Manuel; Luengo, David
2017-04-01
Monitoring vegetation by biophysical parameter retrieval from Earth observation data is a challenging problem, where machine learning is currently a key player. Neural networks, kernel methods, and Gaussian Process (GP) regression have excelled in parameter retrieval tasks at both local and global scales. GP regression is based on solid Bayesian statistics, yield efficient and accurate parameter estimates, and provides interesting advantages over competing machine learning approaches such as confidence intervals. However, GP models are hampered by lack of interpretability, that prevented the widespread adoption by a larger community. In this presentation we will summarize some of our latest developments to address this issue. We will review the main characteristics of GPs and their advantages in vegetation monitoring standard applications. Then, three advanced GP models will be introduced. First, we will derive sensitivity maps for the GP predictive function that allows us to obtain feature ranking from the model and to assess the influence of examples in the solution. Second, we will introduce a Joint GP (JGP) model that combines in situ measurements and simulated radiative transfer data in a single GP model. The JGP regression provides more sensible confidence intervals for the predictions, respects the physics of the underlying processes, and allows for transferability across time and space. Finally, a latent force model (LFM) for GP modeling that encodes ordinary differential equations to blend data-driven modeling and physical models of the system is presented. The LFM performs multi-output regression, adapts to the signal characteristics, is able to cope with missing data in the time series, and provides explicit latent functions that allow system analysis and evaluation. Empirical evidence of the performance of these models will be presented through illustrative examples.
ZHANG, BO; WANG, DAN; GUO, YUNBAO; YU, JINLU
2015-01-01
The aim of the present study was to identify the major factors correlated with early postoperative seizures in elderly patients who had undergone a meningioma resection, and subsequently, to develop a logistic regression equation for assessing the seizures risk. Fourteen factors possibly correlated with early postoperative seizures in a cohort of 209 elderly patients who had undergone meningioma resection, as analyzed by multifactorial stepwise logistic regression. Phenobarbital sodium (0.1 g, intramuscularly) was administered to all 209 patients 30 min prior to undergoing surgery. All the patients had no previous history of seizures. The correlation of the 14 clinical factors (gender, tumor site, dyskinesia, peritumoral brain edema (PTBE), tumor diameter, pre- and postoperative prophylaxes, surgery time, tumor adhesion, circumscription, blood supply, intraoperative transfusion, original site of the tumor and dysphasia) was assessed in association with the risk for post-operative seizures. Tumor diameter, postoperative prophylactic antiepileptic drug (PPAD) administration, PTBE and tumor site were entered as risk factors into a mathematical regression model. The odds ratio (OR) of the tumor diameter was >1, and PPAD administration showed an OR >1, relative to a non-prophylactic group. A logistic regression equation was obtained and the sensitivity, specificity and misdiagnosis rates were 91.4, 74.3 and 25.7%, respectively. Tumor diameter, PPAD administration, PTBE and tumor site were closely correlated with early postoperative seizures; PTBE and PPAD administration were risk and protective factors, respectively. PMID:26137257
NASA Astrophysics Data System (ADS)
Pirozzi, K. L.; Long, C. J.; McAleer, C. W.; Smith, A. S. T.; Hickman, J. J.
2013-08-01
Rigorous analysis of muscle function in in vitro systems is needed for both acute and chronic biomedical applications. Forces generated by skeletal myotubes on bio-microelectromechanical cantilevers were calculated using a modified version of Stoney's thin-film equation and finite element analysis (FEA), then analyzed for regression to physical parameters. The Stoney's equation results closely matched the more intensive FEA and the force correlated to cross-sectional area (CSA). Normalizing force to measured CSA significantly improved the statistical sensitivity and now allows for close comparison of in vitro data to in vivo measurements for applications in exercise physiology, robotics, and modeling neuromuscular diseases.