Lewis, Jason M.
2010-01-01
Peak-streamflow regression equations were determined for estimating flows with exceedance probabilities from 50 to 0.2 percent for the state of Oklahoma. These regression equations incorporate basin characteristics to estimate peak-streamflow magnitude and frequency throughout the state by use of a generalized least squares regression analysis. The most statistically significant independent variables required to estimate peak-streamflow magnitude and frequency for unregulated streams in Oklahoma are contributing drainage area, mean-annual precipitation, and main-channel slope. The regression equations are applicable for watershed basins with drainage areas less than 2,510 square miles that are not affected by regulation. The resulting regression equations had a standard model error ranging from 31 to 46 percent. Annual-maximum peak flows observed at 231 streamflow-gaging stations through water year 2008 were used for the regression analysis. Gage peak-streamflow estimates were used from previous work unless 2008 gaging-station data were available, in which new peak-streamflow estimates were calculated. The U.S. Geological Survey StreamStats web application was used to obtain the independent variables required for the peak-streamflow regression equations. Limitations on the use of the regression equations and the reliability of regression estimates for natural unregulated streams are described. Log-Pearson Type III analysis information, basin and climate characteristics, and the peak-streamflow frequency estimates for the 231 gaging stations in and near Oklahoma are listed. Methodologies are presented to estimate peak streamflows at ungaged sites by using estimates from gaging stations on unregulated streams. For ungaged sites on urban streams and streams regulated by small floodwater retarding structures, an adjustment of the statewide regression equations for natural unregulated streams can be used to estimate peak-streamflow magnitude and frequency.
Ahearn, Elizabeth A.
2010-01-01
Multiple linear regression equations for determining flow-duration statistics were developed to estimate select flow exceedances ranging from 25- to 99-percent for six 'bioperiods'-Salmonid Spawning (November), Overwinter (December-February), Habitat Forming (March-April), Clupeid Spawning (May), Resident Spawning (June), and Rearing and Growth (July-October)-in Connecticut. Regression equations also were developed to estimate the 25- and 99-percent flow exceedances without reference to a bioperiod. In total, 32 equations were developed. The predictive equations were based on regression analyses relating flow statistics from streamgages to GIS-determined basin and climatic characteristics for the drainage areas of those streamgages. Thirty-nine streamgages (and an additional 6 short-term streamgages and 28 partial-record sites for the non-bioperiod 99-percent exceedance) in Connecticut and adjacent areas of neighboring States were used in the regression analysis. Weighted least squares regression analysis was used to determine the predictive equations; weights were assigned based on record length. The basin characteristics-drainage area, percentage of area with coarse-grained stratified deposits, percentage of area with wetlands, mean monthly precipitation (November), mean seasonal precipitation (December, January, and February), and mean basin elevation-are used as explanatory variables in the equations. Standard errors of estimate of the 32 equations ranged from 10.7 to 156 percent with medians of 19.2 and 55.4 percent to predict the 25- and 99-percent exceedances, respectively. Regression equations to estimate high and median flows (25- to 75-percent exceedances) are better predictors (smaller variability of the residual values around the regression line) than the equations to estimate low flows (less than 75-percent exceedance). The Habitat Forming (March-April) bioperiod had the smallest standard errors of estimate, ranging from 10.7 to 20.9 percent. In contrast, the Rearing and Growth (July-October) bioperiod had the largest standard errors, ranging from 30.9 to 156 percent. The adjusted coefficient of determination of the equations ranged from 77.5 to 99.4 percent with medians of 98.5 and 90.6 percent to predict the 25- and 99-percent exceedances, respectively. Descriptive information on the streamgages used in the regression, measured basin and climatic characteristics, and estimated flow-duration statistics are provided in this report. Flow-duration statistics and the 32 regression equations for estimating flow-duration statistics in Connecticut are stored on the U.S. Geological Survey World Wide Web application ?StreamStats? (http://water.usgs.gov/osw/streamstats/index.html). The regression equations developed in this report can be used to produce unbiased estimates of select flow exceedances statewide.
Eash, David A.; Barnes, Kimberlee K.; O'Shea, Padraic S.
2016-09-19
A statewide study was led to develop regression equations for estimating three selected spring and three selected fall low-flow frequency statistics for ungaged stream sites in Iowa. The estimation equations developed for the six low-flow frequency statistics include spring (April through June) 1-, 7-, and 30-day mean low flows for a recurrence interval of 10 years and fall (October through December) 1-, 7-, and 30-day mean low flows for a recurrence interval of 10 years. Estimates of the three selected spring statistics are provided for 241 U.S. Geological Survey continuous-record streamgages, and estimates of the three selected fall statistics are provided for 238 of these streamgages, using data through June 2014. Because only 9 years of fall streamflow record were available, three streamgages included in the development of the spring regression equations were not included in the development of the fall regression equations. Because of regulation, diversion, or urbanization, 30 of the 241 streamgages were not included in the development of the regression equations. The study area includes Iowa and adjacent areas within 50 miles of the Iowa border. Because trend analyses indicated statistically significant positive trends when considering the period of record for most of the streamgages, the longest, most recent period of record without a significant trend was determined for each streamgage for use in the study. Geographic information system software was used to measure 63 selected basin characteristics for each of the 211streamgages used to develop the regional regression equations. The study area was divided into three low-flow regions that were defined in a previous study for the development of regional regression equations.Because several streamgages included in the development of regional regression equations have estimates of zero flow calculated from observed streamflow for selected spring and fall low-flow frequency statistics, the final equations for the three low-flow regions were developed using two types of regression analyses—left-censored and generalized-least-squares regression analyses. A total of 211 streamgages were included in the development of nine spring regression equations—three equations for each of the three low-flow regions. A total of 208 streamgages were included in the development of nine fall regression equations—three equations for each of the three low-flow regions. A censoring threshold was used to develop 15 left-censored regression equations to estimate the three fall low-flow frequency statistics for each of the three low-flow regions and to estimate the three spring low-flow frequency statistics for the southern and northwest regions. For the northeast region, generalized-least-squares regression was used to develop three equations to estimate the three spring low-flow frequency statistics. For the northeast region, average standard errors of prediction range from 32.4 to 48.4 percent for the spring equations and average standard errors of estimate range from 56.4 to 73.8 percent for the fall equations. For the northwest region, average standard errors of estimate range from 58.9 to 62.1 percent for the spring equations and from 83.2 to 109.4 percent for the fall equations. For the southern region, average standard errors of estimate range from 43.2 to 64.0 percent for the spring equations and from 78.1 to 78.7 percent for the fall equations.The regression equations are applicable only to stream sites in Iowa with low flows not substantially affected by regulation, diversion, or urbanization and with basin characteristics within the range of those used to develop the equations. The regression equations will be implemented within the U.S. Geological Survey StreamStats Web-based geographic information system application. StreamStats allows users to click on any ungaged stream site and compute estimates of the six selected spring and fall low-flow statistics; in addition, 90-percent prediction intervals and the measured basin characteristics for the ungaged site are provided. StreamStats also allows users to click on any Iowa streamgage to obtain computed estimates for the six selected spring and fall low-flow statistics.
Comparative evaluation of urban storm water quality models
NASA Astrophysics Data System (ADS)
Vaze, J.; Chiew, Francis H. S.
2003-10-01
The estimation of urban storm water pollutant loads is required for the development of mitigation and management strategies to minimize impacts to receiving environments. Event pollutant loads are typically estimated using either regression equations or "process-based" water quality models. The relative merit of using regression models compared to process-based models is not clear. A modeling study is carried out here to evaluate the comparative ability of the regression equations and process-based water quality models to estimate event diffuse pollutant loads from impervious surfaces. The results indicate that, once calibrated, both the regression equations and the process-based model can estimate event pollutant loads satisfactorily. In fact, the loads estimated using the regression equation as a function of rainfall intensity and runoff rate are better than the loads estimated using the process-based model. Therefore, if only estimates of event loads are required, regression models should be used because they are simpler and require less data compared to process-based models.
Ding, A Adam; Wu, Hulin
2014-10-01
We propose a new method to use a constrained local polynomial regression to estimate the unknown parameters in ordinary differential equation models with a goal of improving the smoothing-based two-stage pseudo-least squares estimate. The equation constraints are derived from the differential equation model and are incorporated into the local polynomial regression in order to estimate the unknown parameters in the differential equation model. We also derive the asymptotic bias and variance of the proposed estimator. Our simulation studies show that our new estimator is clearly better than the pseudo-least squares estimator in estimation accuracy with a small price of computational cost. An application example on immune cell kinetics and trafficking for influenza infection further illustrates the benefits of the proposed new method.
Ding, A. Adam; Wu, Hulin
2015-01-01
We propose a new method to use a constrained local polynomial regression to estimate the unknown parameters in ordinary differential equation models with a goal of improving the smoothing-based two-stage pseudo-least squares estimate. The equation constraints are derived from the differential equation model and are incorporated into the local polynomial regression in order to estimate the unknown parameters in the differential equation model. We also derive the asymptotic bias and variance of the proposed estimator. Our simulation studies show that our new estimator is clearly better than the pseudo-least squares estimator in estimation accuracy with a small price of computational cost. An application example on immune cell kinetics and trafficking for influenza infection further illustrates the benefits of the proposed new method. PMID:26401093
Asquith, William H.; Thompson, David B.
2008-01-01
The U.S. Geological Survey, in cooperation with the Texas Department of Transportation and in partnership with Texas Tech University, investigated a refinement of the regional regression method and developed alternative equations for estimation of peak-streamflow frequency for undeveloped watersheds in Texas. A common model for estimation of peak-streamflow frequency is based on the regional regression method. The current (2008) regional regression equations for 11 regions of Texas are based on log10 transformations of all regression variables (drainage area, main-channel slope, and watershed shape). Exclusive use of log10-transformation does not fully linearize the relations between the variables. As a result, some systematic bias remains in the current equations. The bias results in overestimation of peak streamflow for both the smallest and largest watersheds. The bias increases with increasing recurrence interval. The primary source of the bias is the discernible curvilinear relation in log10 space between peak streamflow and drainage area. Bias is demonstrated by selected residual plots with superimposed LOWESS trend lines. To address the bias, a statistical framework based on minimization of the PRESS statistic through power transformation of drainage area is described and implemented, and the resulting regression equations are reported. Compared to log10-exclusive equations, the equations derived from PRESS minimization have PRESS statistics and residual standard errors less than the log10 exclusive equations. Selected residual plots for the PRESS-minimized equations are presented to demonstrate that systematic bias in regional regression equations for peak-streamflow frequency estimation in Texas can be reduced. Because the overall error is similar to the error associated with previous equations and because the bias is reduced, the PRESS-minimized equations reported here provide alternative equations for peak-streamflow frequency estimation.
Eash, D.A.
1993-01-01
Procedures provided for applying the drainage-basin and channel-geometry regression equations depend on whether the design-flood discharge estimate is for a site on an ungaged stream, an ungaged site on a gaged stream, or a gaged site. When both a drainage-basin and a channel-geometry regression-equation estimate are available for a stream site, a procedure is presented for determining a weighted average of the two flood estimates. The drainage-basin regression equations are applicable to unregulated rural drainage areas less than 1,060 square miles, and the channel-geometry regression equations are applicable to unregulated rural streams in Iowa with stabilized channels.
Techniques for estimating flood-peak discharges of rural, unregulated streams in Ohio
Koltun, G.F.
2003-01-01
Regional equations for estimating 2-, 5-, 10-, 25-, 50-, 100-, and 500-year flood-peak discharges at ungaged sites on rural, unregulated streams in Ohio were developed by means of ordinary and generalized least-squares (GLS) regression techniques. One-variable, simple equations and three-variable, full-model equations were developed on the basis of selected basin characteristics and flood-frequency estimates determined for 305 streamflow-gaging stations in Ohio and adjacent states. The average standard errors of prediction ranged from about 39 to 49 percent for the simple equations, and from about 34 to 41 percent for the full-model equations. Flood-frequency estimates determined by means of log-Pearson Type III analyses are reported along with weighted flood-frequency estimates, computed as a function of the log-Pearson Type III estimates and the regression estimates. Values of explanatory variables used in the regression models were determined from digital spatial data sets by means of a geographic information system (GIS), with the exception of drainage area, which was determined by digitizing the area within basin boundaries manually delineated on topographic maps. Use of GIS-based explanatory variables represents a major departure in methodology from that described in previous reports on estimating flood-frequency characteristics of Ohio streams. Examples are presented illustrating application of the regression equations to ungaged sites on ungaged and gaged streams. A method is provided to adjust regression estimates for ungaged sites by use of weighted and regression estimates for a gaged site on the same stream. A region-of-influence method, which employs a computer program to estimate flood-frequency characteristics for ungaged sites based on data from gaged sites with similar characteristics, was also tested and compared to the GLS full-model equations. For all recurrence intervals, the GLS full-model equations had superior prediction accuracy relative to the simple equations and therefore are recommended for use.
Smith, S. Jerrod; Lewis, Jason M.; Graves, Grant M.
2015-09-28
Generalized-least-squares multiple-linear regression analysis was used to formulate regression relations between peak-streamflow frequency statistics and basin characteristics. Contributing drainage area was the only basin characteristic determined to be statistically significant for all percentage of annual exceedance probabilities and was the only basin characteristic used in regional regression equations for estimating peak-streamflow frequency statistics on unregulated streams in and near the Oklahoma Panhandle. The regression model pseudo-coefficient of determination, converted to percent, for the Oklahoma Panhandle regional regression equations ranged from about 38 to 63 percent. The standard errors of prediction and the standard model errors for the Oklahoma Panhandle regional regression equations ranged from about 84 to 148 percent and from about 76 to 138 percent, respectively. These errors were comparable to those reported for regional peak-streamflow frequency regression equations for the High Plains areas of Texas and Colorado. The root mean square errors for the Oklahoma Panhandle regional regression equations (ranging from 3,170 to 92,000 cubic feet per second) were less than the root mean square errors for the Oklahoma statewide regression equations (ranging from 18,900 to 412,000 cubic feet per second); therefore, the Oklahoma Panhandle regional regression equations produce more accurate peak-streamflow statistic estimates for the irrigated period of record in the Oklahoma Panhandle than do the Oklahoma statewide regression equations. The regression equations developed in this report are applicable to streams that are not substantially affected by regulation, impoundment, or surface-water withdrawals. These regression equations are intended for use for stream sites with contributing drainage areas less than or equal to about 2,060 square miles, the maximum value for the independent variable used in the regression analysis.
Jennings, M.E.; Thomas, W.O.; Riggs, H.C.
1994-01-01
For many years, the U.S. Geological Survey (USGS) has been involved in the development of regional regression equations for estimating flood magnitude and frequency at ungaged sites. These regression equations are used to transfer flood characteristics from gaged to ungaged sites through the use of watershed and climatic characteristics as explanatory or predictor variables. Generally these equations have been developed on a statewide or metropolitan area basis as part of cooperative study programs with specific State Departments of Transportation or specific cities. The USGS, in cooperation with the Federal Highway Administration and the Federal Emergency Management Agency, has compiled all the current (as of September 1993) statewide and metropolitan area regression equations into a micro-computer program titled the National Flood Frequency Program.This program includes regression equations for estimating flood-peak discharges and techniques for estimating a typical flood hydrograph for a given recurrence interval peak discharge for unregulated rural and urban watersheds. These techniques should be useful to engineers and hydrologists for planning and design applications. This report summarizes the statewide regression equations for rural watersheds in each State, summarizes the applicable metropolitan area or statewide regression equations for urban watersheds, describes the National Flood Frequency Program for making these computations, and provides much of the reference information on the extrapolation variables needed to run the program.
Eash, David A.; Barnes, Kimberlee K.
2017-01-01
A statewide study was conducted to develop regression equations for estimating six selected low-flow frequency statistics and harmonic mean flows for ungaged stream sites in Iowa. The estimation equations developed for the six low-flow frequency statistics include: the annual 1-, 7-, and 30-day mean low flows for a recurrence interval of 10 years, the annual 30-day mean low flow for a recurrence interval of 5 years, and the seasonal (October 1 through December 31) 1- and 7-day mean low flows for a recurrence interval of 10 years. Estimation equations also were developed for the harmonic-mean-flow statistic. Estimates of these seven selected statistics are provided for 208 U.S. Geological Survey continuous-record streamgages using data through September 30, 2006. The study area comprises streamgages located within Iowa and 50 miles beyond the State's borders. Because trend analyses indicated statistically significant positive trends when considering the entire period of record for the majority of the streamgages, the longest, most recent period of record without a significant trend was determined for each streamgage for use in the study. The median number of years of record used to compute each of these seven selected statistics was 35. Geographic information system software was used to measure 54 selected basin characteristics for each streamgage. Following the removal of two streamgages from the initial data set, data collected for 206 streamgages were compiled to investigate three approaches for regionalization of the seven selected statistics. Regionalization, a process using statistical regression analysis, provides a relation for efficiently transferring information from a group of streamgages in a region to ungaged sites in the region. The three regionalization approaches tested included statewide, regional, and region-of-influence regressions. For the regional regression, the study area was divided into three low-flow regions on the basis of hydrologic characteristics, landform regions, and soil regions. A comparison of root mean square errors and average standard errors of prediction for the statewide, regional, and region-of-influence regressions determined that the regional regression provided the best estimates of the seven selected statistics at ungaged sites in Iowa. Because a significant number of streams in Iowa reach zero flow as their minimum flow during low-flow years, four different types of regression analyses were used: left-censored, logistic, generalized-least-squares, and weighted-least-squares regression. A total of 192 streamgages were included in the development of 27 regression equations for the three low-flow regions. For the northeast and northwest regions, a censoring threshold was used to develop 12 left-censored regression equations to estimate the 6 low-flow frequency statistics for each region. For the southern region a total of 12 regression equations were developed; 6 logistic regression equations were developed to estimate the probability of zero flow for the 6 low-flow frequency statistics and 6 generalized least-squares regression equations were developed to estimate the 6 low-flow frequency statistics, if nonzero flow is estimated first by use of the logistic equations. A weighted-least-squares regression equation was developed for each region to estimate the harmonic-mean-flow statistic. Average standard errors of estimate for the left-censored equations for the northeast region range from 64.7 to 88.1 percent and for the northwest region range from 85.8 to 111.8 percent. Misclassification percentages for the logistic equations for the southern region range from 5.6 to 14.0 percent. Average standard errors of prediction for generalized least-squares equations for the southern region range from 71.7 to 98.9 percent and pseudo coefficients of determination for the generalized-least-squares equations range from 87.7 to 91.8 percent. Average standard errors of prediction for weighted-least-squares equations developed for estimating the harmonic-mean-flow statistic for each of the three regions range from 66.4 to 80.4 percent. The regression equations are applicable only to stream sites in Iowa with low flows not significantly affected by regulation, diversion, or urbanization and with basin characteristics within the range of those used to develop the equations. If the equations are used at ungaged sites on regulated streams, or on streams affected by water-supply and agricultural withdrawals, then the estimates will need to be adjusted by the amount of regulation or withdrawal to estimate the actual flow conditions if that is of interest. Caution is advised when applying the equations for basins with characteristics near the applicable limits of the equations and for basins located in karst topography. A test of two drainage-area ratio methods using 31 pairs of streamgages, for the annual 7-day mean low-flow statistic for a recurrence interval of 10 years, indicates a weighted drainage-area ratio method provides better estimates than regional regression equations for an ungaged site on a gaged stream in Iowa when the drainage-area ratio is between 0.5 and 1.4. These regression equations will be implemented within the U.S. Geological Survey StreamStats web-based geographic-information-system tool. StreamStats allows users to click on any ungaged site on a river and compute estimates of the seven selected statistics; in addition, 90-percent prediction intervals and the measured basin characteristics for the ungaged sites also are provided. StreamStats also allows users to click on any streamgage in Iowa and estimates computed for these seven selected statistics are provided for the streamgage.
Lombard, Pamela J.; Hodgkins, Glenn A.
2015-01-01
Regression equations to estimate peak streamflows with 1- to 500-year recurrence intervals (annual exceedance probabilities from 99 to 0.2 percent, respectively) were developed for small, ungaged streams in Maine. Equations presented here are the best available equations for estimating peak flows at ungaged basins in Maine with drainage areas from 0.3 to 12 square miles (mi2). Previously developed equations continue to be the best available equations for estimating peak flows for basin areas greater than 12 mi2. New equations presented here are based on streamflow records at 40 U.S. Geological Survey streamgages with a minimum of 10 years of recorded peak flows between 1963 and 2012. Ordinary least-squares regression techniques were used to determine the best explanatory variables for the regression equations. Traditional map-based explanatory variables were compared to variables requiring field measurements. Two field-based variables—culvert rust lines and bankfull channel widths—either were not commonly found or did not explain enough of the variability in the peak flows to warrant inclusion in the equations. The best explanatory variables were drainage area and percent basin wetlands; values for these variables were determined with a geographic information system. Generalized least-squares regression was used with these two variables to determine the equation coefficients and estimates of accuracy for the final equations.
Tortorelli, Robert L.
1997-01-01
Statewide regression equations for Oklahoma were determined for estimating peak discharge and flood frequency for selected recurrence intervals from 2 to 500 years for ungaged sites on natural unregulated streams. The most significant independent variables required to estimate peak-streamflow frequency for natural unregulated streams in Oklahoma are contributing drainage area, main-channel slope, and mean-annual precipitation. The regression equations are applicable for watersheds with drainage areas less than 2,510 square miles that are not affected by regulation from manmade works. Limitations on the use of the regression relations and the reliability of regression estimates for natural unregulated streams are discussed. Log-Pearson Type III analysis information, basin and climatic characteristics, and the peak-stream-flow frequency estimates for 251 gaging stations in Oklahoma and adjacent states are listed. Techniques are presented to make a peak-streamflow frequency estimate for gaged sites on natural unregulated streams and to use this result to estimate a nearby ungaged site on the same stream. For ungaged sites on urban streams, an adjustment of the statewide regression equations for natural unregulated streams can be used to estimate peak-streamflow frequency. For ungaged sites on streams regulated by small floodwater retarding structures, an adjustment of the statewide regression equations for natural unregulated streams can be used to estimate peak-streamflow frequency. The statewide regression equations are adjusted by substituting the drainage area below the floodwater retarding structures, or drainage area that represents the percentage of the unregulated basin, in the contributing drainage area parameter to obtain peak-streamflow frequency estimates.
Weather adjustment using seemingly unrelated regression
DOE Office of Scientific and Technical Information (OSTI.GOV)
Noll, T.A.
1995-05-01
Seemingly unrelated regression (SUR) is a system estimation technique that accounts for time-contemporaneous correlation between individual equations within a system of equations. SUR is suited to weather adjustment estimations when the estimation is: (1) composed of a system of equations and (2) the system of equations represents either different weather stations, different sales sectors or a combination of different weather stations and different sales sectors. SUR utilizes the cross-equation error values to develop more accurate estimates of the system coefficients than are obtained using ordinary least-squares (OLS) estimation. SUR estimates can be generated using a variety of statistical software packagesmore » including MicroTSP and SAS.« less
Roland, Mark A.; Stuckey, Marla H.
2008-01-01
Regression equations were developed for estimating flood flows at selected recurrence intervals for ungaged streams in Pennsylvania with drainage areas less than 2,000 square miles. These equations were developed utilizing peak-flow data from 322 streamflow-gaging stations within Pennsylvania and surrounding states. All stations used in the development of the equations had 10 or more years of record and included active and discontinued continuous-record as well as crest-stage partial-record stations. The state was divided into four regions, and regional regression equations were developed to estimate the 2-, 5-, 10-, 50-, 100-, and 500-year recurrence-interval flood flows. The equations were developed by means of a regression analysis that utilized basin characteristics and flow data associated with the stations. Significant explanatory variables at the 95-percent confidence level for one or more regression equations included the following basin characteristics: drainage area; mean basin elevation; and the percentages of carbonate bedrock, urban area, and storage within a basin. The regression equations can be used to predict the magnitude of flood flows for specified recurrence intervals for most streams in the state; however, they are not valid for streams with drainage areas generally greater than 2,000 square miles or with substantial regulation, diversion, or mining activity within the basin. Estimates of flood-flow magnitude and frequency for streamflow-gaging stations substantially affected by upstream regulation are also presented.
Methods for estimating selected low-flow frequency statistics for unregulated streams in Kentucky
Martin, Gary R.; Arihood, Leslie D.
2010-01-01
This report provides estimates of, and presents methods for estimating, selected low-flow frequency statistics for unregulated streams in Kentucky including the 30-day mean low flows for recurrence intervals of 2 and 5 years (30Q2 and 30Q5) and the 7-day mean low flows for recurrence intervals of 5, 10, and 20 years (7Q2, 7Q10, and 7Q20). Estimates of these statistics are provided for 121 U.S. Geological Survey streamflow-gaging stations with data through the 2006 climate year, which is the 12-month period ending March 31 of each year. Data were screened to identify the periods of homogeneous, unregulated flows for use in the analyses. Logistic-regression equations are presented for estimating the annual probability of the selected low-flow frequency statistics being equal to zero. Weighted-least-squares regression equations were developed for estimating the magnitude of the nonzero 30Q2, 30Q5, 7Q2, 7Q10, and 7Q20 low flows. Three low-flow regions were defined for estimating the 7-day low-flow frequency statistics. The explicit explanatory variables in the regression equations include total drainage area and the mapped streamflow-variability index measured from a revised statewide coverage of this characteristic. The percentage of the station low-flow statistics correctly classified as zero or nonzero by use of the logistic-regression equations ranged from 87.5 to 93.8 percent. The average standard errors of prediction of the weighted-least-squares regression equations ranged from 108 to 226 percent. The 30Q2 regression equations have the smallest standard errors of prediction, and the 7Q20 regression equations have the largest standard errors of prediction. The regression equations are applicable only to stream sites with low flows unaffected by regulation from reservoirs and local diversions of flow and to drainage basins in specified ranges of basin characteristics. Caution is advised when applying the equations for basins with characteristics near the applicable limits and for basins with karst drainage features.
National scale biomass estimators for United States tree species
Jennifer C. Jenkins; David C. Chojnacky; Linda S. Heath; Richard A. Birdsey
2003-01-01
Estimates of national-scale forest carbon (C) stocks and fluxes are typically based on allometric regression equations developed using dimensional analysis techniques. However, the literature is inconsistent and incomplete with respect to large-scale forest C estimation. We compiled all available diameter-based allometric regression equations for estimating total...
Validation of Core Temperature Estimation Algorithm
2016-01-29
plot of observed versus estimated core temperature with the line of identity (dashed) and the least squares regression line (solid) and line equation...estimated PSI with the line of identity (dashed) and the least squares regression line (solid) and line equation in the top left corner. (b) Bland...for comparison. The root mean squared error (RMSE) was also computed, as given by Equation 2.
Adjustment of regional regression equations for urban storm-runoff quality using at-site data
Barks, C.S.
1996-01-01
Regional regression equations have been developed to estimate urban storm-runoff loads and mean concentrations using a national data base. Four statistical methods using at-site data to adjust the regional equation predictions were developed to provide better local estimates. The four adjustment procedures are a single-factor adjustment, a regression of the observed data against the predicted values, a regression of the observed values against the predicted values and additional local independent variables, and a weighted combination of a local regression with the regional prediction. Data collected at five representative storm-runoff sites during 22 storms in Little Rock, Arkansas, were used to verify, and, when appropriate, adjust the regional regression equation predictions. Comparison of observed values of stormrunoff loads and mean concentrations to the predicted values from the regional regression equations for nine constituents (chemical oxygen demand, suspended solids, total nitrogen as N, total ammonia plus organic nitrogen as N, total phosphorus as P, dissolved phosphorus as P, total recoverable copper, total recoverable lead, and total recoverable zinc) showed large prediction errors ranging from 63 percent to more than several thousand percent. Prediction errors for 6 of the 18 regional regression equations were less than 100 percent and could be considered reasonable for water-quality prediction equations. The regression adjustment procedure was used to adjust five of the regional equation predictions to improve the predictive accuracy. For seven of the regional equations the observed and the predicted values are not significantly correlated. Thus neither the unadjusted regional equations nor any of the adjustments were appropriate. The mean of the observed values was used as a simple estimator when the regional equation predictions and adjusted predictions were not appropriate.
Predictive equations for the estimation of body size in seals and sea lions (Carnivora: Pinnipedia)
Churchill, Morgan; Clementz, Mark T; Kohno, Naoki
2014-01-01
Body size plays an important role in pinniped ecology and life history. However, body size data is often absent for historical, archaeological, and fossil specimens. To estimate the body size of pinnipeds (seals, sea lions, and walruses) for today and the past, we used 14 commonly preserved cranial measurements to develop sets of single variable and multivariate predictive equations for pinniped body mass and total length. Principal components analysis (PCA) was used to test whether separate family specific regressions were more appropriate than single predictive equations for Pinnipedia. The influence of phylogeny was tested with phylogenetic independent contrasts (PIC). The accuracy of these regressions was then assessed using a combination of coefficient of determination, percent prediction error, and standard error of estimation. Three different methods of multivariate analysis were examined: bidirectional stepwise model selection using Akaike information criteria; all-subsets model selection using Bayesian information criteria (BIC); and partial least squares regression. The PCA showed clear discrimination between Otariidae (fur seals and sea lions) and Phocidae (earless seals) for the 14 measurements, indicating the need for family-specific regression equations. The PIC analysis found that phylogeny had a minor influence on relationship between morphological variables and body size. The regressions for total length were more accurate than those for body mass, and equations specific to Otariidae were more accurate than those for Phocidae. Of the three multivariate methods, the all-subsets approach required the fewest number of variables to estimate body size accurately. We then used the single variable predictive equations and the all-subsets approach to estimate the body size of two recently extinct pinniped taxa, the Caribbean monk seal (Monachus tropicalis) and the Japanese sea lion (Zalophus japonicus). Body size estimates using single variable regressions generally under or over-estimated body size; however, the all-subset regression produced body size estimates that were close to historically recorded body length for these two species. This indicates that the all-subset regression equations developed in this study can estimate body size accurately. PMID:24916814
Estimation of peak-discharge frequency of urban streams in Jefferson County, Kentucky
Martin, Gary R.; Ruhl, Kevin J.; Moore, Brian L.; Rose, Martin F.
1997-01-01
An investigation of flood-hydrograph characteristics for streams in urban Jefferson County, Kentucky, was made to obtain hydrologic information needed for waterresources management. Equations for estimating peak-discharge frequencies for ungaged streams in the county were developed by combining (1) long-term annual peakdischarge data and rainfall-runoff data collected from 1991 to 1995 in 13 urban basins and (2) long-term annual peak-discharge data in four rural basins located in hydrologically similar areas of neighboring counties. The basins ranged in size from 1.36 to 64.0 square miles. The U.S. Geological Survey Rainfall- Runoff Model (RRM) was calibrated for each of the urban basins. The calibrated models were used with long-term, historical rainfall and pan-evaporation data to simulate 79 years of annual peak-discharge data. Peak-discharge frequencies were estimated by fitting the logarithms of the annual peak discharges to a Pearson-Type III frequency distribution. The simulated peak-discharge frequencies were adjusted for improved reliability by application of bias-correction factors derived from peakdischarge frequencies based on local, observed annual peak discharges. The three-parameter and the preferred seven-parameter nationwide urban-peak-discharge regression equations previously developed by USGS investigators provided biased (high) estimates for the urban basins studied. Generalized-least-square regression procedures were used to relate peakdischarge frequency to selected basin characteristics. Regression equations were developed to estimate peak-discharge frequency by adjusting peak-dischargefrequency estimates made by use of the threeparameter nationwide urban regression equations. The regression equations are presented in equivalent forms as functions of contributing drainage area, main-channel slope, and basin development factor, which is an index for measuring the efficiency of the basin drainage system. Estimates of peak discharges for streams in the county can be made for the 2-, 5-, 10-, 25-, 50-, and 100-year recurrence intervals by use of the regression equations. The average standard errors of prediction of the regression equations ranges from ? 34 to ? 45 percent. The regression equations are applicable to ungaged streams in the county having a specific range of basin characteristics.
Age Estimation of Infants Through Metric Analysis of Developing Anterior Deciduous Teeth.
Viciano, Joan; De Luca, Stefano; Irurita, Javier; Alemán, Inmaculada
2018-01-01
This study provides regression equations for estimation of age of infants from the dimensions of their developing deciduous teeth. The sample comprises 97 individuals of known sex and age (62 boys, 35 girls), aged between 2 days and 1,081 days. The age-estimation equations were obtained for the sexes combined, as well as for each sex separately, thus including "sex" as an independent variable. The values of the correlations and determination coefficients obtained for each regression equation indicate good fits for most of the equations obtained. The "sex" factor was statistically significant when included as an independent variable in seven of the regression equations. However, the "sex" factor provided an advantage for age estimation in only three of the equations, compared to those that did not include "sex" as a factor. These data suggest that the ages of infants can be accurately estimated from measurements of their developing deciduous teeth. © 2017 American Academy of Forensic Sciences.
Peak-flow characteristics of Virginia streams
Austin, Samuel H.; Krstolic, Jennifer L.; Wiegand, Ute
2011-01-01
Peak-flow annual exceedance probabilities, also called probability-percent chance flow estimates, and regional regression equations are provided describing the peak-flow characteristics of Virginia streams. Statistical methods are used to evaluate peak-flow data. Analysis of Virginia peak-flow data collected from 1895 through 2007 is summarized. Methods are provided for estimating unregulated peak flow of gaged and ungaged streams. Station peak-flow characteristics identified by fitting the logarithms of annual peak flows to a Log Pearson Type III frequency distribution yield annual exceedance probabilities of 0.5, 0.4292, 0.2, 0.1, 0.04, 0.02, 0.01, 0.005, and 0.002 for 476 streamgaging stations. Stream basin characteristics computed using spatial data and a geographic information system are used as explanatory variables in regional regression model equations for six physiographic regions to estimate regional annual exceedance probabilities at gaged and ungaged sites. Weighted peak-flow values that combine annual exceedance probabilities computed from gaging station data and from regional regression equations provide improved peak-flow estimates. Text, figures, and lists are provided summarizing selected peak-flow sites, delineated physiographic regions, peak-flow estimates, basin characteristics, regional regression model equations, error estimates, definitions, data sources, and candidate regression model equations. This study supersedes previous studies of peak flows in Virginia.
Estimating air drying times of lumber with multiple regression
William T. Simpson
2004-01-01
In this study, the applicability of a multiple regression equation for estimating air drying times of red oak, sugar maple, and ponderosa pine lumber was evaluated. The equation allows prediction of estimated air drying times from historic weather records of temperature and relative humidity at any desired location.
Parrett, Charles; Omang, R.J.; Hull, J.A.
1983-01-01
Equations for estimating mean annual runoff and peak discharge from measurements of channel geometry were developed for western and northeastern Montana. The study area was divided into two regions for the mean annual runoff analysis, and separate multiple-regression equations were developed for each region. The active-channel width was determined to be the most important independent variable in each region. The standard error of estimate for the estimating equation using active-channel width was 61 percent in the Northeast Region and 38 percent in the West region. The study area was divided into six regions for the peak discharge analysis, and multiple regression equations relating channel geometry and basin characteristics to peak discharges having recurrence intervals of 2, 5, 10, 25, 50 and 100 years were developed for each region. The standard errors of estimate for the regression equations using only channel width as an independent variable ranged from 35 to 105 percent. The standard errors improved in four regions as basin characteristics were added to the estimating equations. (USGS)
Martin, Gary R.; Fowler, Kathleen K.; Arihood, Leslie D.
2016-09-06
Information on low-flow characteristics of streams is essential for the management of water resources. This report provides equations for estimating the 1-, 7-, and 30-day mean low flows for a recurrence interval of 10 years and the harmonic-mean flow at ungaged, unregulated stream sites in Indiana. These equations were developed using the low-flow statistics and basin characteristics for 108 continuous-record streamgages in Indiana with at least 10 years of daily mean streamflow data through the 2011 climate year (April 1 through March 31). The equations were developed in cooperation with the Indiana Department of Environmental Management.Regression techniques were used to develop the equations for estimating low-flow frequency statistics and the harmonic-mean flows on the basis of drainage-basin characteristics. A geographic information system was used to measure basin characteristics for selected streamgages. A final set of 25 basin characteristics measured at all the streamgages were evaluated to choose the best predictors of the low-flow statistics.Logistic-regression equations applicable statewide are presented for estimating the probability that selected low-flow frequency statistics equal zero. These equations use the explanatory variables total drainage area, average transmissivity of the full thickness of the unconsolidated deposits within 1,000 feet of the stream network, and latitude of the basin outlet. The percentage of the streamgage low-flow statistics correctly classified as zero or nonzero using the logistic-regression equations ranged from 86.1 to 88.9 percent.Generalized-least-squares regression equations applicable statewide for estimating nonzero low-flow frequency statistics use total drainage area, the average hydraulic conductivity of the top 70 feet of unconsolidated deposits, the slope of the basin, and the index of permeability and thickness of the Quaternary surficial sediments as explanatory variables. The average standard error of prediction of these regression equations ranges from 55.7 to 61.5 percent.Regional weighted-least-squares regression equations were developed for estimating the harmonic-mean flows by dividing the State into three low-flow regions. The Northern region uses total drainage area and the average transmissivity of the entire thickness of unconsolidated deposits as explanatory variables. The Central region uses total drainage area, the average hydraulic conductivity of the entire thickness of unconsolidated deposits, and the index of permeability and thickness of the Quaternary surficial sediments. The Southern region uses total drainage area and the percent of the basin covered by forest. The average standard error of prediction for these equations ranges from 39.3 to 66.7 percent.The regional regression equations are applicable only to stream sites with low flows unaffected by regulation and to stream sites with drainage basin characteristic values within specified limits. Caution is advised when applying the equations for basins with characteristics near the applicable limits and for basins with karst drainage features and for urbanized basins. Extrapolations near and beyond the applicable basin characteristic limits will have unknown errors that may be large. Equations are presented for use in estimating the 90-percent prediction interval of the low-flow statistics estimated by use of the regression equations at a given stream site.The regression equations are to be incorporated into the U.S. Geological Survey StreamStats Web-based application for Indiana. StreamStats allows users to select a stream site on a map and automatically measure the needed basin characteristics and compute the estimated low-flow statistics and associated prediction intervals.
Kennedy, Jeffrey R.; Paretti, Nicholas V.
2014-01-01
Flooding in urban areas routinely causes severe damage to property and often results in loss of life. To investigate the effect of urbanization on the magnitude and frequency of flood peaks, a flood frequency analysis was carried out using data from urbanized streamgaging stations in Phoenix and Tucson, Arizona. Flood peaks at each station were predicted using the log-Pearson Type III distribution, fitted using the expected moments algorithm and the multiple Grubbs-Beck low outlier test. The station estimates were then compared to flood peaks estimated by rural-regression equations for Arizona, and to flood peaks adjusted for urbanization using a previously developed procedure for adjusting U.S. Geological Survey rural regression peak discharges in an urban setting. Only smaller, more common flood peaks at the 50-, 20-, 10-, and 4-percent annual exceedance probabilities (AEPs) demonstrate any increase in magnitude as a result of urbanization; the 1-, 0.5-, and 0.2-percent AEP flood estimates are predicted without bias by the rural-regression equations. Percent imperviousness was determined not to account for the difference in estimated flood peaks between stations, either when adjusting the rural-regression equations or when deriving urban-regression equations to predict flood peaks directly from basin characteristics. Comparison with urban adjustment equations indicates that flood peaks are systematically overestimated if the rural-regression-estimated flood peaks are adjusted upward to account for urbanization. At nearly every streamgaging station in the analysis, adjusted rural-regression estimates were greater than the estimates derived using station data. One likely reason for the lack of increase in flood peaks with urbanization is the presence of significant stormwater retention and detention structures within the watershed used in the study.
Olson, Scott A.; with a section by Veilleux, Andrea G.
2014-01-01
This report provides estimates of flood discharges at selected annual exceedance probabilities (AEPs) for streamgages in and adjacent to Vermont and equations for estimating flood discharges at AEPs of 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent (recurrence intervals of 2-, 5-, 10-, 25-, 50-, 100-, 200-, and 500-years, respectively) for ungaged, unregulated, rural streams in Vermont. The equations were developed using generalized least-squares regression. Flood-frequency and drainage-basin characteristics from 145 streamgages were used in developing the equations. The drainage-basin characteristics used as explanatory variables in the regression equations include drainage area, percentage of wetland area, and the basin-wide mean of the average annual precipitation. The average standard errors of prediction for estimating the flood discharges at the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent AEP with these equations are 34.9, 36.0, 38.7, 42.4, 44.9, 47.3, 50.7, and 55.1 percent, respectively. Flood discharges at selected AEPs for streamgages were computed by using the Expected Moments Algorithm. To improve estimates of the flood discharges for given exceedance probabilities at streamgages in Vermont, a new generalized skew coefficient was developed. The new generalized skew for the region is a constant, 0.44. The mean square error of the generalized skew coefficient is 0.078. This report describes a technique for using results from the regression equations to adjust an AEP discharge computed from a streamgage record. This report also describes a technique for using a drainage-area adjustment to estimate flood discharge at a selected AEP for an ungaged site upstream or downstream from a streamgage. The final regression equations and the flood-discharge frequency data used in this study will be available in StreamStats. StreamStats is a World Wide Web application providing automated regression-equation solutions for user-selected sites on streams.
Methods for estimating flood frequency in Montana based on data through water year 1998
Parrett, Charles; Johnson, Dave R.
2004-01-01
Annual peak discharges having recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years (T-year floods) were determined for 660 gaged sites in Montana and in adjacent areas of Idaho, Wyoming, and Canada, based on data through water year 1998. The updated flood-frequency information was subsequently used in regression analyses, either ordinary or generalized least squares, to develop equations relating T-year floods to various basin and climatic characteristics, equations relating T-year floods to active-channel width, and equations relating T-year floods to bankfull width. The equations can be used to estimate flood frequency at ungaged sites. Montana was divided into eight regions, within which flood characteristics were considered to be reasonably homogeneous, and the three sets of regression equations were developed for each region. A measure of the overall reliability of the regression equations is the average standard error of prediction. The average standard errors of prediction for the equations based on basin and climatic characteristics ranged from 37.4 percent to 134.1 percent. Average standard errors of prediction for the equations based on active-channel width ranged from 57.2 percent to 141.3 percent. Average standard errors of prediction for the equations based on bankfull width ranged from 63.1 percent to 155.5 percent. In most regions, the equations based on basin and climatic characteristics generally had smaller average standard errors of prediction than equations based on active-channel or bankfull width. An exception was the Southeast Plains Region, where all equations based on active-channel width had smaller average standard errors of prediction than equations based on basin and climatic characteristics or bankfull width. Methods for weighting estimates derived from the basin- and climatic-characteristic equations and the channel-width equations also were developed. The weights were based on the cross correlation of residuals from the different methods and the average standard errors of prediction. When all three methods were combined, the average standard errors of prediction ranged from 37.4 percent to 120.2 percent. Weighting of estimates reduced the standard errors of prediction for all T-year flood estimates in four regions, reduced the standard errors of prediction for some T-year flood estimates in two regions, and provided no reduction in average standard error of prediction in two regions. A computer program for solving the regression equations, weighting estimates, and determining reliability of individual estimates was developed and placed on the USGS Montana District World Wide Web page. A new regression method, termed Region of Influence regression, also was tested. Test results indicated that the Region of Influence method was not as reliable as the regional equations based on generalized least squares regression. Two additional methods for estimating flood frequency at ungaged sites located on the same streams as gaged sites also are described. The first method, based on a drainage-area-ratio adjustment, is intended for use on streams where the ungaged site of interest is located near a gaged site. The second method, based on interpolation between gaged sites, is intended for use on streams that have two or more streamflow-gaging stations.
Alexander, Terry W.; Wilson, Gary L.
1995-01-01
A generalized least-squares regression technique was used to relate the 2- to 500-year flood discharges from 278 selected streamflow-gaging stations to statistically significant basin characteristics. The regression relations (estimating equations) were defined for three hydrologic regions (I, II, and III) in rural Missouri. Ordinary least-squares regression analyses indicate that drainage area (Regions I, II, and III) and main-channel slope (Regions I and II) are the only basin characteristics needed for computing the 2- to 500-year design-flood discharges at gaged or ungaged stream locations. The resulting generalized least-squares regression equations provide a technique for estimating the 2-, 5-, 10-, 25-, 50-, 100-, and 500-year flood discharges on unregulated streams in rural Missouri. The regression equations for Regions I and II were developed from stream-flow-gaging stations with drainage areas ranging from 0.13 to 11,500 square miles and 0.13 to 14,000 square miles, and main-channel slopes ranging from 1.35 to 150 feet per mile and 1.20 to 279 feet per mile. The regression equations for Region III were developed from streamflow-gaging stations with drainage areas ranging from 0.48 to 1,040 square miles. Standard errors of estimate for the generalized least-squares regression equations in Regions I, II, and m ranged from 30 to 49 percent.
Ahearn, Elizabeth A.
2004-01-01
Multiple linear-regression equations were developed to estimate the magnitudes of floods in Connecticut for recurrence intervals ranging from 2 to 500 years. The equations can be used for nonurban, unregulated stream sites in Connecticut with drainage areas ranging from about 2 to 715 square miles. Flood-frequency data and hydrologic characteristics from 70 streamflow-gaging stations and the upstream drainage basins were used to develop the equations. The hydrologic characteristics?drainage area, mean basin elevation, and 24-hour rainfall?are used in the equations to estimate the magnitude of floods. Average standard errors of prediction for the equations are 31.8, 32.7, 34.4, 35.9, 37.6 and 45.0 percent for the 2-, 10-, 25-, 50-, 100-, and 500-year recurrence intervals, respectively. Simplified equations using only one hydrologic characteristic?drainage area?also were developed. The regression analysis is based on generalized least-squares regression techniques. Observed flows (log-Pearson Type III analysis of the annual maximum flows) from five streamflow-gaging stations in urban basins in Connecticut were compared to flows estimated from national three-parameter and seven-parameter urban regression equations. The comparison shows that the three- and seven- parameter equations used in conjunction with the new statewide equations generally provide reasonable estimates of flood flows for urban sites in Connecticut, although a national urban flood-frequency study indicated that the three-parameter equations significantly underestimated flood flows in many regions of the country. Verification of the accuracy of the three-parameter or seven-parameter national regression equations using new data from Connecticut stations was beyond the scope of this study. A technique for calculating flood flows at streamflow-gaging stations using a weighted average also is described. Two estimates of flood flows?one estimate based on the log-Pearson Type III analyses of the annual maximum flows at the gaging station, and the other estimate from the regression equation?are weighted together based on the years of record at the gaging station and the equivalent years of record value determined from the regression. Weighted averages of flood flows for the 2-, 10-, 25-, 50-, 100-, and 500-year recurrence intervals are tabulated for the 70 streamflow-gaging stations used in the regression analysis. Generally, weighted averages give the most accurate estimate of flood flows at gaging stations. An evaluation of the Connecticut's streamflow-gaging network was performed to determine whether the spatial coverage and range of geographic and hydrologic conditions are adequately represented for transferring flood characteristics from gaged to ungaged sites. Fifty-one of 54 stations in the current (2004) network support one or more flood needs of federal, state, and local agencies. Twenty-five of 54 stations in the current network are considered high-priority stations by the U.S. Geological Survey because of their contribution to the longterm understanding of floods, and their application for regionalflood analysis. Enhancements to the network to improve overall effectiveness for regionalization can be made by increasing the spatial coverage of gaging stations, establishing stations in regions of the state that are not well-represented, and adding stations in basins with drainage area sizes not represented. Additionally, the usefulness of the network for characterizing floods can be maintained and improved by continuing operation at the current stations because flood flows can be more accurately estimated at stations with continuous, long-term record.
Ziegeweid, Jeffrey R.; Magdalene, Suzanne
2015-01-01
The new regression equations were used to calculate revised estimates of historical streamflows for Stillwater and Prescott starting in 1910 and ending when index-velocity streamgages were installed. Monthly, annual, 30-year, and period of record statistics were examined between previous and revised estimates of historical streamflows. The abilities of the new regression equations to estimate historical streamflows were evaluated by using percent differences to compare new estimates of historical daily streamflows to discrete streamflow measurements made at Stillwater and Prescott before the installation of index-velocity streamgages. Although less variability was observed between estimated and measured streamflows at Stillwater compared to Prescott, the percent difference data indicated that the new estimates closely approximated measured streamflows at both locations.
Chen, Ying-Jen; Ho, Meng-Yang; Chen, Kwan-Ju; Hsu, Chia-Fen; Ryu, Shan-Jin
2009-08-01
The aims of the present study were to (i) investigate if traditional Chinese word reading ability can be used for estimating premorbid general intelligence; and (ii) to provide multiple regression equations for estimating premorbid performance on Raven's Standard Progressive Matrices (RSPM), using age, years of education and Chinese Graded Word Reading Test (CGWRT) scores as predictor variables. Four hundred and twenty-six healthy volunteers (201 male, 225 female), aged 16-93 years (mean +/- SD, 41.92 +/- 18.19 years) undertook the tests individually under supervised conditions. Seventy percent of subjects were randomly allocated to the derivation group (n = 296), and the rest to the validation group (n = 130). RSPM score was positively correlated with CGWRT score and years of education. RSPM and CGWRT scores and years of education were also inversely correlated with age, but the declining trend for RSPM performance against age was steeper than that for CGWRT performance. Separate multiple regression equations were derived for estimating RSPM scores using different combinations of age, years of education, and CGWRT score for both groups. The multiple regression coefficient of each equation ranged from 0.71 to 0.80 with the standard error of estimate between 7 and 8 RSPM points. When fitting the data of one group to the equations derived from its counterpart group, the cross-validation multiple regression coefficients ranged from 0.71 to 0.79. There were no significant differences in the 'predicted-obtained' RSPM discrepancies between any equations. The regression equations derived in the present study may provide a basis for estimating premorbid RSPM performance.
Ries, Kernell G.; Crouse, Michele Y.
2002-01-01
For many years, the U.S. Geological Survey (USGS) has been developing regional regression equations for estimating flood magnitude and frequency at ungaged sites. These regression equations are used to transfer flood characteristics from gaged to ungaged sites through the use of watershed and climatic characteristics as explanatory or predictor variables. Generally, these equations have been developed on a Statewide or metropolitan-area basis as part of cooperative study programs with specific State Departments of Transportation. In 1994, the USGS released a computer program titled the National Flood Frequency Program (NFF), which compiled all the USGS available regression equations for estimating the magnitude and frequency of floods in the United States and Puerto Rico. NFF was developed in cooperation with the Federal Highway Administration and the Federal Emergency Management Agency. Since the initial release of NFF, the USGS has produced new equations for many areas of the Nation. A new version of NFF has been developed that incorporates these new equations and provides additional functionality and ease of use. NFF version 3 provides regression-equation estimates of flood-peak discharges for unregulated rural and urban watersheds, flood-frequency plots, and plots of typical flood hydrographs for selected recurrence intervals. The Program also provides weighting techniques to improve estimates of flood-peak discharges for gaging stations and ungaged sites. The information provided by NFF should be useful to engineers and hydrologists for planning and design applications. This report describes the flood-regionalization techniques used in NFF and provides guidance on the applicability and limitations of the techniques. The NFF software and the documentation for the regression equations included in NFF are available at http://water.usgs.gov/software/nff.html.
Sherwood, J.M.
1986-01-01
Methods are presented for estimating peak discharges, flood volumes and hydrograph shapes of small (less than 5 sq mi) urban streams in Ohio. Examples of how to use the various regression equations and estimating techniques also are presented. Multiple-regression equations were developed for estimating peak discharges having recurrence intervals of 2, 5, 10, 25, 50, and 100 years. The significant independent variables affecting peak discharge are drainage area, main-channel slope, average basin-elevation index, and basin-development factor. Standard errors of regression and prediction for the peak discharge equations range from +/-37% to +/-41%. An equation also was developed to estimate the flood volume of a given peak discharge. Peak discharge, drainage area, main-channel slope, and basin-development factor were found to be the significant independent variables affecting flood volumes for given peak discharges. The standard error of regression for the volume equation is +/-52%. A technique is described for estimating the shape of a runoff hydrograph by applying a specific peak discharge and the estimated lagtime to a dimensionless hydrograph. An equation for estimating the lagtime of a basin was developed. Two variables--main-channel length divided by the square root of the main-channel slope and basin-development factor--have a significant effect on basin lagtime. The standard error of regression for the lagtime equation is +/-48%. The data base for the study was established by collecting rainfall-runoff data at 30 basins distributed throughout several metropolitan areas of Ohio. Five to eight years of data were collected at a 5-min record interval. The USGS rainfall-runoff model A634 was calibrated for each site. The calibrated models were used in conjunction with long-term rainfall records to generate a long-term streamflow record for each site. Each annual peak-discharge record was fitted to a Log-Pearson Type III frequency curve. Multiple-regression techniques were then used to analyze the peak discharge data as a function of the basin characteristics of the 30 sites. (Author 's abstract)
Sparling, D.W.; Barzen, J.A.; Lovvorn, J.R.; Serie, J.R.
1992-01-01
Regression equations that use mensural data to estimate body condition have been developed for several water birds. These equations often have been based on data that represent different sexes, age classes, or seasons, without being adequately tested for intergroup differences. We used proximate carcass analysis of 538 adult and juvenile canvasbacks (Aythya valisineria ) collected during fall migration, winter, and spring migrations in 1975-76 and 1982-85 to test regression methods for estimating body condition.
Estimation of Magnitude and Frequency of Floods for Streams on the Island of Oahu, Hawaii
Wong, Michael F.
1994-01-01
This report describes techniques for estimating the magnitude and frequency of floods for the island of Oahu. The log-Pearson Type III distribution and methodology recommended by the Interagency Committee on Water Data was used to determine the magnitude and frequency of floods at 79 gaging stations that had 11 to 72 years of record. Multiple regression analysis was used to construct regression equations to transfer the magnitude and frequency information from gaged sites to ungaged sites. Oahu was divided into three hydrologic regions to define relations between peak discharge and drainage-basin and climatic characteristics. Regression equations are provided to estimate the 2-, 5-, 10-, 25-, 50-, and 100-year peak discharges at ungaged sites. Significant basin and climatic characteristics included in the regression equations are drainage area, median annual rainfall, and the 2-year, 24-hour rainfall intensity. Drainage areas for sites used in this study ranged from 0.03 to 45.7 square miles. Standard error of prediction for the regression equations ranged from 34 to 62 percent. Peak-discharge data collected through water year 1988, geographic information system (GIS) technology, and generalized least-squares regression were used in the analyses. The use of GIS seems to be a more flexible and consistent means of defining and calculating basin and climatic characteristics than using manual methods. Standard errors of estimate for the regression equations in this report are an average of 8 percent less than those published in previous studies.
Stature estimation equations for South Asian skeletons based on DXA scans of contemporary adults.
Pomeroy, Emma; Mushrif-Tripathy, Veena; Wells, Jonathan C K; Kulkarni, Bharati; Kinra, Sanjay; Stock, Jay T
2018-05-03
Stature estimation from the skeleton is a classic anthropological problem, and recent years have seen the proliferation of population-specific regression equations. Many rely on the anatomical reconstruction of stature from archaeological skeletons to derive regression equations based on long bone lengths, but this requires a collection with very good preservation. In some regions, for example, South Asia, typical environmental conditions preclude the sufficient preservation of skeletal remains. Large-scale epidemiological studies that include medical imaging of the skeleton by techniques such as dual-energy X-ray absorptiometry (DXA) offer new potential datasets for developing such equations. We derived estimation equations based on known height and bone lengths measured from DXA scans from the Andhra Pradesh Children and Parents Study (Hyderabad, India). Given debates on the most appropriate regression model to use, multiple methods were compared, and the performance of the equations was tested on a published skeletal dataset of individuals with known stature. The equations have standard errors of estimates and prediction errors similar to those derived using anatomical reconstruction or from cadaveric datasets. As measured by the number of significant differences between true and estimated stature, and the prediction errors, the new equations perform as well as, and generally better than, published equations commonly used on South Asian skeletons or based on Indian cadaveric datasets. This study demonstrates the utility of DXA scans as a data source for developing stature estimation equations and offer a new set of equations for use with South Asian datasets. © 2018 Wiley Periodicals, Inc.
Lorenz, David L.; Sanocki, Chris A.; Kocian, Matthew J.
2010-01-01
Knowledge of the peak flow of floods of a given recurrence interval is essential for regulation and planning of water resources and for design of bridges, culverts, and dams along Minnesota's rivers and streams. Statistical techniques are needed to estimate peak flow at ungaged sites because long-term streamflow records are available at relatively few places. Because of the need to have up-to-date peak-flow frequency information in order to estimate peak flows at ungaged sites, the U.S. Geological Survey (USGS) conducted a peak-flow frequency study in cooperation with the Minnesota Department of Transportation and the Minnesota Pollution Control Agency. Estimates of peak-flow magnitudes for 1.5-, 2-, 5-, 10-, 25-, 50-, 100-, and 500-year recurrence intervals are presented for 330 streamflow-gaging stations in Minnesota and adjacent areas in Iowa and South Dakota based on data through water year 2005. The peak-flow frequency information was subsequently used in regression analyses to develop equations relating peak flows for selected recurrence intervals to various basin and climatic characteristics. Two statistically derived techniques-regional regression equation and region of influence regression-can be used to estimate peak flow on ungaged streams smaller than 3,000 square miles in Minnesota. Regional regression equations were developed for selected recurrence intervals in each of six regions in Minnesota: A (northwestern), B (north central and east central), C (northeastern), D (west central and south central), E (southwestern), and F (southeastern). The regression equations can be used to estimate peak flows at ungaged sites. The region of influence regression technique dynamically selects streamflow-gaging stations with characteristics similar to a site of interest. Thus, the region of influence regression technique allows use of a potentially unique set of gaging stations for estimating peak flow at each site of interest. Two methods of selecting streamflow-gaging stations, similarity and proximity, can be used for the region of influence regression technique. The regional regression equation technique is the preferred technique as an estimate of peak flow in all six regions for ungaged sites. The region of influence regression technique is not appropriate for regions C, E, and F because the interrelations of some characteristics of those regions do not agree with the interrelations throughout the rest of the State. Both the similarity and proximity methods for the region of influence technique can be used in the other regions (A, B, and D) to provide additional estimates of peak flow. The peak-flow-frequency estimates and basin characteristics for selected streamflow-gaging stations and regional peak-flow regression equations are included in this report.
Estimating Flow-Duration and Low-Flow Frequency Statistics for Unregulated Streams in Oregon
Risley, John; Stonewall, Adam J.; Haluska, Tana
2008-01-01
Flow statistical datasets, basin-characteristic datasets, and regression equations were developed to provide decision makers with surface-water information needed for activities such as water-quality regulation, water-rights adjudication, biological habitat assessment, infrastructure design, and water-supply planning and management. The flow statistics, which included annual and monthly period of record flow durations (5th, 10th, 25th, 50th, and 95th percent exceedances) and annual and monthly 7-day, 10-year (7Q10) and 7-day, 2-year (7Q2) low flows, were computed at 466 streamflow-gaging stations at sites with unregulated flow conditions throughout Oregon and adjacent areas of neighboring States. Regression equations, created from the flow statistics and basin characteristics of the stations, can be used to estimate flow statistics at ungaged stream sites in Oregon. The study area was divided into 10 regression modeling regions based on ecological, topographic, geologic, hydrologic, and climatic criteria. In total, 910 annual and monthly regression equations were created to predict the 7 flow statistics in the 10 regions. Equations to predict the five flow-duration exceedance percentages and the two low-flow frequency statistics were created with Ordinary Least Squares and Generalized Least Squares regression, respectively. The standard errors of estimate of the equations created to predict the 5th and 95th percent exceedances had medians of 42.4 and 64.4 percent, respectively. The standard errors of prediction of the equations created to predict the 7Q2 and 7Q10 low-flow statistics had medians of 51.7 and 61.2 percent, respectively. Standard errors for regression equations for sites in western Oregon were smaller than those in eastern Oregon partly because of a greater density of available streamflow-gaging stations in western Oregon than eastern Oregon. High-flow regression equations (such as the 5th and 10th percent exceedances) also generally were more accurate than the low-flow regression equations (such as the 95th percent exceedance and 7Q10 low-flow statistic). The regression equations predict unregulated flow conditions in Oregon. Flow estimates need to be adjusted if they are used at ungaged sites that are regulated by reservoirs or affected by water-supply and agricultural withdrawals if actual flow conditions are of interest. The regression equations are installed in the USGS StreamStats Web-based tool (http://water.usgs.gov/osw/streamstats/index.html, accessed July 16, 2008). StreamStats provides users with a set of annual and monthly flow-duration and low-flow frequency estimates for ungaged sites in Oregon in addition to the basin characteristics for the sites. Prediction intervals at the 90-percent confidence level also are automatically computed.
Oki, Delwyn S.; Rosa, Sarah N.; Yeung, Chiu W.
2010-01-01
This study provides an updated analysis of the magnitude and frequency of peak stream discharges in Hawai`i. Annual peak-discharge data collected by the U.S. Geological Survey during and before water year 2008 (ending September 30, 2008) at stream-gaging stations were analyzed. The existing generalized-skew value for the State of Hawai`i was retained, although three methods were used to evaluate whether an update was needed. Regional regression equations were developed for peak discharges with 2-, 5-, 10-, 25-, 50-, 100-, and 500-year recurrence intervals for unregulated streams (those for which peak discharges are not affected to a large extent by upstream reservoirs, dams, diversions, or other structures) in areas with less than 20 percent combined medium- and high-intensity development on Kaua`i, O`ahu, Moloka`i, Maui, and Hawai`i. The generalized-least-squares (GLS) regression equations relate peak stream discharge to quantified basin characteristics (for example, drainage-basin area and mean annual rainfall) that were determined using geographic information system (GIS) methods. Each of the islands of Kaua`i,O`ahu, Moloka`i, Maui, and Hawai`i was divided into two regions, generally corresponding to a wet region and a dry region. Unique peak-discharge regression equations were developed for each region. The regression equations developed for this study have standard errors of prediction ranging from 16 to 620 percent. Standard errors of prediction are greatest for regression equations developed for leeward Moloka`i and southern Hawai`i. In general, estimated 100-year peak discharges from this study are lower than those from previous studies, which may reflect the longer periods of record used in this study. Each regression equation is valid within the range of values of the explanatory variables used to develop the equation. The regression equations were developed using peak-discharge data from streams that are mainly unregulated, and they should not be used to estimate peak discharges in regulated streams. Use of a regression equation beyond its limits will produce peak-discharge estimates with unknown error and should therefore be avoided. Improved estimates of the magnitude and frequency of peak discharges in Hawai`i will require continued operation of existing stream-gaging stations and operation of additional gaging stations for areas such as Moloka`i and Hawai`i, where limited stream-gaging data are available.
Gotvald, Anthony J.
2017-01-13
The U.S. Geological Survey, in cooperation with the Georgia Department of Natural Resources, Environmental Protection Division, developed regional regression equations for estimating selected low-flow frequency and mean annual flow statistics for ungaged streams in north Georgia that are not substantially affected by regulation, diversions, or urbanization. Selected low-flow frequency statistics and basin characteristics for 56 streamgage locations within north Georgia and 75 miles beyond the State’s borders in Alabama, Tennessee, North Carolina, and South Carolina were combined to form the final dataset used in the regional regression analysis. Because some of the streamgages in the study recorded zero flow, the final regression equations were developed using weighted left-censored regression analysis to analyze the flow data in an unbiased manner, with weights based on the number of years of record. The set of equations includes the annual minimum 1- and 7-day average streamflow with the 10-year recurrence interval (referred to as 1Q10 and 7Q10), monthly 7Q10, and mean annual flow. The final regional regression equations are functions of drainage area, mean annual precipitation, and relief ratio for the selected low-flow frequency statistics and drainage area and mean annual precipitation for mean annual flow. The average standard error of estimate was 13.7 percent for the mean annual flow regression equation and ranged from 26.1 to 91.6 percent for the selected low-flow frequency equations.The equations, which are based on data from streams with little to no flow alterations, can be used to provide estimates of the natural flows for selected ungaged stream locations in the area of Georgia north of the Fall Line. The regression equations are not to be used to estimate flows for streams that have been altered by the effects of major dams, surface-water withdrawals, groundwater withdrawals (pumping wells), diversions, or wastewater discharges. The regression equations should be used only for ungaged sites with drainage areas between 1.67 and 576 square miles, mean annual precipitation between 47.6 and 81.6 inches, and relief ratios between 0.146 and 0.607; these are the ranges of the explanatory variables used to develop the equations. An attempt was made to develop regional regression equations for the area of Georgia south of the Fall Line by using the same approach used during this study for north Georgia; however, the equations resulted with high average standard errors of estimates and poorly predicted flows below 0.5 cubic foot per second, which may be attributed to the karst topography common in that area.The final regression equations developed from this study are planned to be incorporated into the U.S. Geological Survey StreamStats program. StreamStats is a Web-based geographic information system that provides users with access to an assortment of analytical tools useful for water-resources planning and management, and for engineering design applications, such as the design of bridges. The StreamStats program provides streamflow statistics and basin characteristics for U.S. Geological Survey streamgage locations and ungaged sites of interest. StreamStats also can compute basin characteristics and provide estimates of streamflow statistics for ungaged sites when users select the location of a site along any stream in Georgia.
Kennedy, Jeffrey R.; Paretti, Nicholas V.; Veilleux, Andrea G.
2014-01-01
Regression equations, which allow predictions of n-day flood-duration flows for selected annual exceedance probabilities at ungaged sites, were developed using generalized least-squares regression and flood-duration flow frequency estimates at 56 streamgaging stations within a single, relatively uniform physiographic region in the central part of Arizona, between the Colorado Plateau and Basin and Range Province, called the Transition Zone. Drainage area explained most of the variation in the n-day flood-duration annual exceedance probabilities, but mean annual precipitation and mean elevation were also significant variables in the regression models. Standard error of prediction for the regression equations varies from 28 to 53 percent and generally decreases with increasing n-day duration. Outside the Transition Zone there are insufficient streamgaging stations to develop regression equations, but flood-duration flow frequency estimates are presented at select streamgaging stations.
Methods for estimating low-flow statistics for Massachusetts streams
Ries, Kernell G.; Friesz, Paul J.
2000-01-01
Methods and computer software are described in this report for determining flow duration, low-flow frequency statistics, and August median flows. These low-flow statistics can be estimated for unregulated streams in Massachusetts using different methods depending on whether the location of interest is at a streamgaging station, a low-flow partial-record station, or an ungaged site where no data are available. Low-flow statistics for streamgaging stations can be estimated using standard U.S. Geological Survey methods described in the report. The MOVE.1 mathematical method and a graphical correlation method can be used to estimate low-flow statistics for low-flow partial-record stations. The MOVE.1 method is recommended when the relation between measured flows at a partial-record station and daily mean flows at a nearby, hydrologically similar streamgaging station is linear, and the graphical method is recommended when the relation is curved. Equations are presented for computing the variance and equivalent years of record for estimates of low-flow statistics for low-flow partial-record stations when either a single or multiple index stations are used to determine the estimates. The drainage-area ratio method or regression equations can be used to estimate low-flow statistics for ungaged sites where no data are available. The drainage-area ratio method is generally as accurate as or more accurate than regression estimates when the drainage-area ratio for an ungaged site is between 0.3 and 1.5 times the drainage area of the index data-collection site. Regression equations were developed to estimate the natural, long-term 99-, 98-, 95-, 90-, 85-, 80-, 75-, 70-, 60-, and 50-percent duration flows; the 7-day, 2-year and the 7-day, 10-year low flows; and the August median flow for ungaged sites in Massachusetts. Streamflow statistics and basin characteristics for 87 to 133 streamgaging stations and low-flow partial-record stations were used to develop the equations. The streamgaging stations had from 2 to 81 years of record, with a mean record length of 37 years. The low-flow partial-record stations had from 8 to 36 streamflow measurements, with a median of 14 measurements. All basin characteristics were determined from digital map data. The basin characteristics that were statistically significant in most of the final regression equations were drainage area, the area of stratified-drift deposits per unit of stream length plus 0.1, mean basin slope, and an indicator variable that was 0 in the eastern region and 1 in the western region of Massachusetts. The equations were developed by use of weighted-least-squares regression analyses, with weights assigned proportional to the years of record and inversely proportional to the variances of the streamflow statistics for the stations. Standard errors of prediction ranged from 70.7 to 17.5 percent for the equations to predict the 7-day, 10-year low flow and 50-percent duration flow, respectively. The equations are not applicable for use in the Southeast Coastal region of the State, or where basin characteristics for the selected ungaged site are outside the ranges of those for the stations used in the regression analyses. A World Wide Web application was developed that provides streamflow statistics for data collection stations from a data base and for ungaged sites by measuring the necessary basin characteristics for the site and solving the regression equations. Output provided by the Web application for ungaged sites includes a map of the drainage-basin boundary determined for the site, the measured basin characteristics, the estimated streamflow statistics, and 90-percent prediction intervals for the estimates. An equation is provided for combining regression and correlation estimates to obtain improved estimates of the streamflow statistics for low-flow partial-record stations. An equation is also provided for combining regression and drainage-area ratio estimates to obtain improved e
Charles E. Rose; Thomas B. Lynch
2001-01-01
A method was developed for estimating parameters in an individual tree basal area growth model using a system of equations based on dbh rank classes. The estimation method developed is a compromise between an individual tree and a stand level basal area growth model that accounts for the correlation between trees within a plot by using seemingly unrelated regression (...
Yelland, Lisa N; Salter, Amy B; Ryan, Philip
2011-10-15
Modified Poisson regression, which combines a log Poisson regression model with robust variance estimation, is a useful alternative to log binomial regression for estimating relative risks. Previous studies have shown both analytically and by simulation that modified Poisson regression is appropriate for independent prospective data. This method is often applied to clustered prospective data, despite a lack of evidence to support its use in this setting. The purpose of this article is to evaluate the performance of the modified Poisson regression approach for estimating relative risks from clustered prospective data, by using generalized estimating equations to account for clustering. A simulation study is conducted to compare log binomial regression and modified Poisson regression for analyzing clustered data from intervention and observational studies. Both methods generally perform well in terms of bias, type I error, and coverage. Unlike log binomial regression, modified Poisson regression is not prone to convergence problems. The methods are contrasted by using example data sets from 2 large studies. The results presented in this article support the use of modified Poisson regression as an alternative to log binomial regression for analyzing clustered prospective data when clustering is taken into account by using generalized estimating equations.
Fraysse, François; Thewlis, Dominic
2014-11-07
Numerous methods exist to estimate the pose of the axes of rotation of the forearm. These include anatomical definitions, such as the conventions proposed by the ISB, and functional methods based on instantaneous helical axes, which are commonly accepted as the modelling gold standard for non-invasive, in-vivo studies. We investigated the validity of a third method, based on regression equations, to estimate the rotation axes of the forearm. We also assessed the accuracy of both ISB methods. Axes obtained from a functional method were considered as the reference. Results indicate a large inter-subject variability in the axes positions, in accordance with previous studies. Both ISB methods gave the same level of accuracy in axes position estimations. Regression equations seem to improve estimation of the flexion-extension axis but not the pronation-supination axis. Overall, given the large inter-subject variability, the use of regression equations cannot be recommended. Copyright © 2014 Elsevier Ltd. All rights reserved.
Estimates of streamflow characteristics for selected small streams, Baker River basin, Washington
Williams, John R.
1987-01-01
Regression equations were used to estimate streamflow characteristics at eight ungaged sites on small streams in the Baker River basin in the North Cascade Mountains, Washington, that could be suitable for run-of-the-river hydropower development. The regression equations were obtained by relating known streamflow characteristics at 25 gaging stations in nearby basins to several physical and climatic variables that could be easily measured in gaged or ungaged basins. The known streamflow characteristics were mean annual flows, 1-, 3-, and 7-day low flows and high flows, mean monthly flows, and flow duration. Drainage area and mean annual precipitation were not the most significant variables in all the regression equations. Variance in the low flows and the summer mean monthly flows was reduced by including an index of glacierized area within the basin as a third variable. Standard errors of estimate of the regression equations ranged from 25 to 88%, and the largest errors were associated with the low flow characteristics. Discharge measurements made at the eight sites near midmonth each month during 1981 were used to estimate monthly mean flows at the sites for that period. These measurements also were correlated with concurrent daily mean flows from eight operating gaging stations. The correlations provided estimates of mean monthly flows that compared reasonably well with those estimated by the regression analyses. (Author 's abstract)
A Comparison of Regional and SiteSpecific Volume Estimation Equations
Joe P. McClure; Jana Anderson; Hans T. Schreuder
1987-01-01
Regression equations for volume by region and site class were examined for lobiolly pine. The regressions for the Coastal Plain and Piedmont regions had significantly different slopes. The results shared important practical differences in percentage of confidence intervals containing the true total volume and in percentage of estimates within a specific proportion of...
Gotvald, Anthony J.; Barth, Nancy A.; Veilleux, Andrea G.; Parrett, Charles
2012-01-01
Methods for estimating the magnitude and frequency of floods in California that are not substantially affected by regulation or diversions have been updated. Annual peak-flow data through water year 2006 were analyzed for 771 streamflow-gaging stations (streamgages) in California having 10 or more years of data. Flood-frequency estimates were computed for the streamgages by using the expected moments algorithm to fit a Pearson Type III distribution to logarithms of annual peak flows for each streamgage. Low-outlier and historic information were incorporated into the flood-frequency analysis, and a generalized Grubbs-Beck test was used to detect multiple potentially influential low outliers. Special methods for fitting the distribution were developed for streamgages in the desert region in southeastern California. Additionally, basin characteristics for the streamgages were computed by using a geographical information system. Regional regression analysis, using generalized least squares regression, was used to develop a set of equations for estimating flows with 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities for ungaged basins in California that are outside of the southeastern desert region. Flood-frequency estimates and basin characteristics for 630 streamgages were combined to form the final database used in the regional regression analysis. Five hydrologic regions were developed for the area of California outside of the desert region. The final regional regression equations are functions of drainage area and mean annual precipitation for four of the five regions. In one region, the Sierra Nevada region, the final equations are functions of drainage area, mean basin elevation, and mean annual precipitation. Average standard errors of prediction for the regression equations in all five regions range from 42.7 to 161.9 percent. For the desert region of California, an analysis of 33 streamgages was used to develop regional estimates of all three parameters (mean, standard deviation, and skew) of the log-Pearson Type III distribution. The regional estimates were then used to develop a set of equations for estimating flows with 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities for ungaged basins. The final regional regression equations are functions of drainage area. Average standard errors of prediction for these regression equations range from 214.2 to 856.2 percent. Annual peak-flow data through water year 2006 were analyzed for eight streamgages in California having 10 or more years of data considered to be affected by urbanization. Flood-frequency estimates were computed for the urban streamgages by fitting a Pearson Type III distribution to logarithms of annual peak flows for each streamgage. Regression analysis could not be used to develop flood-frequency estimation equations for urban streams because of the limited number of sites. Flood-frequency estimates for the eight urban sites were graphically compared to flood-frequency estimates for 630 non-urban sites. The regression equations developed from this study will be incorporated into the U.S. Geological Survey (USGS) StreamStats program. The StreamStats program is a Web-based application that provides streamflow statistics and basin characteristics for USGS streamgages and ungaged sites of interest. StreamStats can also compute basin characteristics and provide estimates of streamflow statistics for ungaged sites when users select the location of a site along any stream in California.
Singer, Donald A.; Kouda, Ryoichi
2011-01-01
Empirical evidence indicates that processes affecting number and quantity of resources in geologic settings are very general across deposit types. Sizes of permissive tracts that geologically could contain the deposits are excellent predictors of numbers of deposits. In addition, total ore tonnage of mineral deposits of a particular type in a tract is proportional to the type’s median tonnage in a tract. Regressions using size of permissive tracts and median tonnage allow estimation of number of deposits and of total tonnage of mineralization. These powerful estimators, based on 10 different deposit types from 109 permissive worldwide control tracts, generalize across deposit types. Estimates of number of deposits and of total tonnage of mineral deposits are made by regressing permissive area, and mean (in logs) tons in deposits of the type, against number of deposits and total tonnage of deposits in the tract for the 50th percentile estimates. The regression equations (R2 = 0.91 and 0.95) can be used for all deposit types just by inserting logarithmic values of permissive area in square kilometers, and mean tons in deposits in millions of metric tons. The regression equations provide estimates at the 50th percentile, and other equations are provided for 90% confidence limits for lower estimates and 10% confidence limits for upper estimates of number of deposits and total tonnage. Equations for these percentile estimates along with expected value estimates are presented here along with comparisons with independent expert estimates. Also provided are the equations for correcting for the known well-explored deposits in a tract. These deposit-density models require internally consistent grade and tonnage models and delineations for arriving at unbiased estimates.
Deletion Diagnostics for Alternating Logistic Regressions
Preisser, John S.; By, Kunthel; Perin, Jamie; Qaqish, Bahjat F.
2013-01-01
Deletion diagnostics are introduced for the regression analysis of clustered binary outcomes estimated with alternating logistic regressions, an implementation of generalized estimating equations (GEE) that estimates regression coefficients in a marginal mean model and in a model for the intracluster association given by the log odds ratio. The diagnostics are developed within an estimating equations framework that recasts the estimating functions for association parameters based upon conditional residuals into equivalent functions based upon marginal residuals. Extensions of earlier work on GEE diagnostics follow directly, including computational formulae for one-step deletion diagnostics that measure the influence of a cluster of observations on the estimated regression parameters and on the overall marginal mean or association model fit. The diagnostic formulae are evaluated with simulations studies and with an application concerning an assessment of factors associated with health maintenance visits in primary care medical practices. The application and the simulations demonstrate that the proposed cluster-deletion diagnostics for alternating logistic regressions are good approximations of their exact fully iterated counterparts. PMID:22777960
Estimating the magnitude of peak flows at selected recurrence intervals for streams in Idaho
Berenbrock, Charles
2002-01-01
The region-of-influence method is not recommended for use in determining flood-frequency estimates for ungaged sites in Idaho because the results, overall, are less accurate and the calculations are more complex than those of regional regression equations. The regional regression equations were considered to be the primary method of estimating the magnitude and frequency of peak flows for ungaged sites in Idaho.
Galloway, Joel M.
2014-01-01
The Red River of the North (hereafter referred to as “Red River”) Basin is an important hydrologic region where water is a valuable resource for the region’s economy. Continuous water-quality monitors have been operated by the U.S. Geological Survey, in cooperation with the North Dakota Department of Health, Minnesota Pollution Control Agency, City of Fargo, City of Moorhead, City of Grand Forks, and City of East Grand Forks at the Red River at Fargo, North Dakota, from 2003 through 2012 and at Grand Forks, N.Dak., from 2007 through 2012. The purpose of the monitoring was to provide a better understanding of the water-quality dynamics of the Red River and provide a way to track changes in water quality. Regression equations were developed that can be used to estimate concentrations and loads for dissolved solids, sulfate, chloride, nitrate plus nitrite, total phosphorus, and suspended sediment using explanatory variables such as streamflow, specific conductance, and turbidity. Specific conductance was determined to be a significant explanatory variable for estimating dissolved solids concentrations at the Red River at Fargo and Grand Forks. The regression equations provided good relations between dissolved solid concentrations and specific conductance for the Red River at Fargo and at Grand Forks, with adjusted coefficients of determination of 0.99 and 0.98, respectively. Specific conductance, log-transformed streamflow, and a seasonal component were statistically significant explanatory variables for estimating sulfate in the Red River at Fargo and Grand Forks. Regression equations provided good relations between sulfate concentrations and the explanatory variables, with adjusted coefficients of determination of 0.94 and 0.89, respectively. For the Red River at Fargo and Grand Forks, specific conductance, streamflow, and a seasonal component were statistically significant explanatory variables for estimating chloride. For the Red River at Grand Forks, a time component also was a statistically significant explanatory variable for estimating chloride. The regression equations for chloride at the Red River at Fargo provided a fair relation between chloride concentrations and the explanatory variables, with an adjusted coefficient of determination of 0.66 and the equation for the Red River at Grand Forks provided a relatively good relation between chloride concentrations and the explanatory variables, with an adjusted coefficient of determination of 0.77. Turbidity and streamflow were statistically significant explanatory variables for estimating nitrate plus nitrite concentrations at the Red River at Fargo and turbidity was the only statistically significant explanatory variable for estimating nitrate plus nitrite concentrations at Grand Forks. The regression equation for the Red River at Fargo provided a relatively poor relation between nitrate plus nitrite concentrations, turbidity, and streamflow, with an adjusted coefficient of determination of 0.46. The regression equation for the Red River at Grand Forks provided a fair relation between nitrate plus nitrite concentrations and turbidity, with an adjusted coefficient of determination of 0.73. Some of the variability that was not explained by the equations might be attributed to different sources contributing nitrates to the stream at different times. Turbidity, streamflow, and a seasonal component were statistically significant explanatory variables for estimating total phosphorus at the Red River at Fargo and Grand Forks. The regression equation for the Red River at Fargo provided a relatively fair relation between total phosphorus concentrations, turbidity, streamflow, and season, with an adjusted coefficient of determination of 0.74. The regression equation for the Red River at Grand Forks provided a good relation between total phosphorus concentrations, turbidity, streamflow, and season, with an adjusted coefficient of determination of 0.87. For the Red River at Fargo, turbidity and streamflow were statistically significant explanatory variables for estimating suspended-sediment concentrations. For the Red River at Grand Forks, turbidity was the only statistically significant explanatory variable for estimating suspended-sediment concentration. The regression equation at the Red River at Fargo provided a good relation between suspended-sediment concentration, turbidity, and streamflow, with an adjusted coefficient of determination of 0.95. The regression equation for the Red River at Grand Forks provided a good relation between suspended-sediment concentration and turbidity, with an adjusted coefficient of determination of 0.96.
Fossum, Kenneth D.; O'Day, Christie M.; Wilson, Barbara J.; Monical, Jim E.
2001-01-01
Stormwater and streamflow in Maricopa County were monitored to (1) describe the physical, chemical, and toxicity characteristics of stormwater from areas having different land uses, (2) describe the physical, chemical, and toxicity characteristics of streamflow from areas that receive urban stormwater, and (3) estimate constituent loads in stormwater. Urban stormwater and streamflow had similar ranges in most constituent concentrations. The mean concentration of dissolved solids in urban stormwater was lower than in streamflow from the Salt River and Indian Bend Wash. Urban stormwater, however, had a greater chemical oxygen demand and higher concentrations of most nutrients. Mean seasonal loads and mean annual loads of 11 constituents and volumes of runoff were estimated for municipalities in the metropolitan Phoenix area, Arizona, by adjusting regional regression equations of loads. This adjustment procedure uses the original regional regression equation and additional explanatory variables that were not included in the original equation. The adjusted equations had standard errors that ranged from 161 to 196 percent. The large standard errors of the prediction result from the large variability of the constituent concentration data used in the regression analysis. Adjustment procedures produced unsatisfactory results for nine of the regressions?suspended solids, dissolved solids, total phosphorus, dissolved phosphorus, total recoverable cadmium, total recoverable copper, total recoverable lead, total recoverable zinc, and storm runoff. These equations had no consistent direction of bias and no other additional explanatory variables correlated with the observed loads. A stepwise-multiple regression or a three-variable regression (total storm rainfall, drainage area, and impervious area) and local data were used to develop local regression equations for these nine constituents. These equations had standard errors from 15 to 183 percent.
Sando, Roy; Sando, Steven K.; McCarthy, Peter M.; Dutton, DeAnn M.
2016-04-05
The U.S. Geological Survey (USGS), in cooperation with the Montana Department of Natural Resources and Conservation, completed a study to update methods for estimating peak-flow frequencies at ungaged sites in Montana based on peak-flow data at streamflow-gaging stations through water year 2011. The methods allow estimation of peak-flow frequencies (that is, peak-flow magnitudes, in cubic feet per second, associated with annual exceedance probabilities of 66.7, 50, 42.9, 20, 10, 4, 2, 1, 0.5, and 0.2 percent) at ungaged sites. The annual exceedance probabilities correspond to 1.5-, 2-, 2.33-, 5-, 10-, 25-, 50-, 100-, 200-, and 500-year recurrence intervals, respectively.Regional regression analysis is a primary focus of Chapter F of this Scientific Investigations Report, and regression equations for estimating peak-flow frequencies at ungaged sites in eight hydrologic regions in Montana are presented. The regression equations are based on analysis of peak-flow frequencies and basin characteristics at 537 streamflow-gaging stations in or near Montana and were developed using generalized least squares regression or weighted least squares regression.All of the data used in calculating basin characteristics that were included as explanatory variables in the regression equations were developed for and are available through the USGS StreamStats application (http://water.usgs.gov/osw/streamstats/) for Montana. StreamStats is a Web-based geographic information system application that was created by the USGS to provide users with access to an assortment of analytical tools that are useful for water-resource planning and management. The primary purpose of the Montana StreamStats application is to provide estimates of basin characteristics and streamflow characteristics for user-selected ungaged sites on Montana streams. The regional regression equations presented in this report chapter can be conveniently solved using the Montana StreamStats application.Selected results from this study were compared with results of previous studies. For most hydrologic regions, the regression equations reported for this study had lower mean standard errors of prediction (in percent) than the previously reported regression equations for Montana. The equations presented for this study are considered to be an improvement on the previously reported equations primarily because this study (1) included 13 more years of peak-flow data; (2) included 35 more streamflow-gaging stations than previous studies; (3) used a detailed geographic information system (GIS)-based definition of the regulation status of streamflow-gaging stations, which allowed better determination of the unregulated peak-flow records that are appropriate for use in the regional regression analysis; (4) included advancements in GIS and remote-sensing technologies, which allowed more convenient calculation of basin characteristics and investigation of many more candidate basin characteristics; and (5) included advancements in computational and analytical methods, which allowed more thorough and consistent data analysis.This report chapter also presents other methods for estimating peak-flow frequencies at ungaged sites. Two methods for estimating peak-flow frequencies at ungaged sites located on the same streams as streamflow-gaging stations are described. Additionally, envelope curves relating maximum recorded annual peak flows to contributing drainage area for each of the eight hydrologic regions in Montana are presented and compared to a national envelope curve. In addition to providing general information on characteristics of large peak flows, the regional envelope curves can be used to assess the reasonableness of peak-flow frequency estimates determined using the regression equations.
Methods for estimating streamflow at mountain fronts in southern New Mexico
Waltemeyer, S.D.
1994-01-01
The infiltration of streamflow is potential recharge to alluvial-basin aquifers at or near mountain fronts in southern New Mexico. Data for 13 streamflow-gaging stations were used to determine a relation between mean annual stream- flow and basin and climatic conditions. Regression analysis was used to develop an equation that can be used to estimate mean annual streamflow on the basis of drainage areas and mean annual precipi- tation. The average standard error of estimate for this equation is 46 percent. Regression analysis also was used to develop an equation to estimate mean annual streamflow on the basis of active- channel width. Measurements of the width of active channels were determined for 6 of the 13 gaging stations. The average standard error of estimate for this relation is 29 percent. Stream- flow estimates made using a regression equation based on channel geometry are considered more reliable than estimates made from an equation based on regional relations of basin and climatic conditions. The sample size used to develop these relations was small, however, and the reported standard error of estimate may not represent that of the entire population. Active-channel-width measurements were made at 23 ungaged sites along the Rio Grande upstream from Elephant Butte Reservoir. Data for additional sites would be needed for a more comprehensive assessment of mean annual streamflow in southern New Mexico.
Low-flow characteristics of Virginia streams
Austin, Samuel H.; Krstolic, Jennifer L.; Wiegand, Ute
2011-01-01
Low-flow annual non-exceedance probabilities (ANEP), called probability-percent chance (P-percent chance) flow estimates, regional regression equations, and transfer methods are provided describing the low-flow characteristics of Virginia streams. Statistical methods are used to evaluate streamflow data. Analysis of Virginia streamflow data collected from 1895 through 2007 is summarized. Methods are provided for estimating low-flow characteristics of gaged and ungaged streams. The 1-, 4-, 7-, and 30-day average streamgaging station low-flow characteristics for 290 long-term, continuous-record, streamgaging stations are determined, adjusted for instances of zero flow using a conditional probability adjustment method, and presented for non-exceedance probabilities of 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01, and 0.005. Stream basin characteristics computed using spatial data and a geographic information system are used as explanatory variables in regional regression equations to estimate annual non-exceedance probabilities at gaged and ungaged sites and are summarized for 290 long-term, continuous-record streamgaging stations, 136 short-term, continuous-record streamgaging stations, and 613 partial-record streamgaging stations. Regional regression equations for six physiographic regions use basin characteristics to estimate 1-, 4-, 7-, and 30-day average low-flow annual non-exceedance probabilities at gaged and ungaged sites. Weighted low-flow values that combine computed streamgaging station low-flow characteristics and annual non-exceedance probabilities from regional regression equations provide improved low-flow estimates. Regression equations developed using the Maintenance of Variance with Extension (MOVE.1) method describe the line of organic correlation (LOC) with an appropriate index site for low-flow characteristics at 136 short-term, continuous-record streamgaging stations and 613 partial-record streamgaging stations. Monthly streamflow statistics computed on the individual daily mean streamflows of selected continuous-record streamgaging stations and curves describing flow-duration are presented. Text, figures, and lists are provided summarizing low-flow estimates, selected low-flow sites, delineated physiographic regions, basin characteristics, regression equations, error estimates, definitions, and data sources. This study supersedes previous studies of low flows in Virginia.
Breaker, Brian K.
2015-01-01
Equations for two regions were found to be statistically significant for developing regression equations for estimating harmonic mean flows at ungaged basins; thus, equations are applicable only to streams in those respective regions in Arkansas. Regression equations for dry season mean monthly flows are applicable only to streams located throughout Arkansas. All regression equations are applicable only to unaltered streams where flows were not significantly affected by regulation, diversion, or urbanization. The median number of years used for dry season mean monthly flow calculation was 43, and the median number of years used for harmonic mean flow calculations was 34 for region 1 and 43 for region 2.
Estimation of Flood Discharges at Selected Recurrence Intervals for Streams in New Hampshire
Olson, Scott A.
2009-01-01
This report provides estimates of flood discharges at selected recurrence intervals for streamgages in and adjacent to New Hampshire and equations for estimating flood discharges at recurrence intervals of 2-, 5-, 10-, 25-, 50-, 100-, and 500-years for ungaged, unregulated, rural streams in New Hampshire. The equations were developed using generalized least-squares regression. Flood-frequency and drainage-basin characteristics from 117 streamgages were used in developing the equations. The drainage-basin characteristics used as explanatory variables in the regression equations include drainage area, mean April precipitation, percentage of wetland area, and main channel slope. The average standard error of prediction for estimating the 2-, 5-, 10-, 25-, 50-, 100-, and 500-year recurrence interval flood discharges with these equations are 30.0, 30.8, 32.0, 34.2, 36.0, 38.1, and 43.4 percent, respectively. Flood discharges at selected recurrence intervals for selected streamgages were computed following the guidelines in Bulletin 17B of the U.S. Interagency Advisory Committee on Water Data. To determine the flood-discharge exceedence probabilities at streamgages in New Hampshire, a new generalized skew coefficient map covering the State was developed. The standard error of the data on new map is 0.298. To improve estimates of flood discharges at selected recurrence intervals for 20 streamgages with short-term records (10 to 15 years), record extension using the two-station comparison technique was applied. The two-station comparison method uses data from a streamgage with long-term record to adjust the frequency characteristics at a streamgage with a short-term record. A technique for adjusting a flood-discharge frequency curve computed from a streamgage record with results from the regression equations is described in this report. Also, a technique is described for estimating flood discharge at a selected recurrence interval for an ungaged site upstream or downstream from a streamgage using a drainage-area adjustment. The final regression equations and the flood-discharge frequency data used in this study will be available in StreamStats. StreamStats is a World Wide Web application providing automated regression-equation solutions for user-selected sites on streams.
Zemski, Adam J; Broad, Elizabeth M; Slater, Gary J
2018-01-01
Body composition in elite rugby union athletes is routinely assessed using surface anthropometry, which can be utilized to provide estimates of absolute body composition using regression equations. This study aims to assess the ability of available skinfold equations to estimate body composition in elite rugby union athletes who have unique physique traits and divergent ethnicity. The development of sport-specific and ethnicity-sensitive equations was also pursued. Forty-three male international Australian rugby union athletes of Caucasian and Polynesian descent underwent surface anthropometry and dual-energy X-ray absorptiometry (DXA) assessment. Body fat percent (BF%) was estimated using five previously developed equations and compared to DXA measures. Novel sport and ethnicity-sensitive prediction equations were developed using forward selection multiple regression analysis. Existing skinfold equations provided unsatisfactory estimates of BF% in elite rugby union athletes, with all equations demonstrating a 95% prediction interval in excess of 5%. The equations tended to underestimate BF% at low levels of adiposity, whilst overestimating BF% at higher levels of adiposity, regardless of ethnicity. The novel equations created explained a similar amount of variance to those previously developed (Caucasians 75%, Polynesians 90%). The use of skinfold equations, including the created equations, cannot be supported to estimate absolute body composition. Until a population-specific equation is established that can be validated to precisely estimate body composition, it is advocated to use a proven method, such as DXA, when absolute measures of lean and fat mass are desired, and raw anthropometry data routinely to derive an estimate of body composition change.
Techniques for estimating flood-peak discharges of rural, unregulated streams in Ohio
Koltun, G.F.; Roberts, J.W.
1990-01-01
Multiple-regression equations are presented for estimating flood-peak discharges having recurrence intervals of 2, 5, 10, 25, 50, and 100 years at ungaged sites on rural, unregulated streams in Ohio. The average standard errors of prediction for the equations range from 33.4% to 41.4%. Peak discharge estimates determined by log-Pearson Type III analysis using data collected through the 1987 water year are reported for 275 streamflow-gaging stations. Ordinary least-squares multiple-regression techniques were used to divide the State into three regions and to identify a set of basin characteristics that help explain station-to- station variation in the log-Pearson estimates. Contributing drainage area, main-channel slope, and storage area were identified as suitable explanatory variables. Generalized least-square procedures, which include historical flow data and account for differences in the variance of flows at different gaging stations, spatial correlation among gaging station records, and variable lengths of station record were used to estimate the regression parameters. Weighted peak-discharge estimates computed as a function of the log-Pearson Type III and regression estimates are reported for each station. A method is provided to adjust regression estimates for ungaged sites by use of weighted and regression estimates for a gaged site located on the same stream. Limitations and shortcomings cited in an earlier report on the magnitude and frequency of floods in Ohio are addressed in this study. Geographic bias is no longer evident for the Maumee River basin of northwestern Ohio. No bias is found to be associated with the forested-area characteristic for the range used in the regression analysis (0.0 to 99.0%), nor is this characteristic significant in explaining peak discharges. Surface-mined area likewise is not significant in explaining peak discharges, and the regression equations are not biased when applied to basins having approximately 30% or less surface-mined area. Analyses of residuals indicate that the equations tend to overestimate flood-peak discharges for basins having approximately 30% or more surface-mined area. (USGS)
Methods for estimating flow-duration and annual mean-flow statistics for ungaged streams in Oklahoma
Esralew, Rachel A.; Smith, S. Jerrod
2010-01-01
Flow statistics can be used to provide decision makers with surface-water information needed for activities such as water-supply permitting, flow regulation, and other water rights issues. Flow statistics could be needed at any location along a stream. Most often, streamflow statistics are needed at ungaged sites, where no flow data are available to compute the statistics. Methods are presented in this report for estimating flow-duration and annual mean-flow statistics for ungaged streams in Oklahoma. Flow statistics included the (1) annual (period of record), (2) seasonal (summer-autumn and winter-spring), and (3) 12 monthly duration statistics, including the 20th, 50th, 80th, 90th, and 95th percentile flow exceedances, and the annual mean-flow (mean of daily flows for the period of record). Flow statistics were calculated from daily streamflow information collected from 235 streamflow-gaging stations throughout Oklahoma and areas in adjacent states. A drainage-area ratio method is the preferred method for estimating flow statistics at an ungaged location that is on a stream near a gage. The method generally is reliable only if the drainage-area ratio of the two sites is between 0.5 and 1.5. Regression equations that relate flow statistics to drainage-basin characteristics were developed for the purpose of estimating selected flow-duration and annual mean-flow statistics for ungaged streams that are not near gaging stations on the same stream. Regression equations were developed from flow statistics and drainage-basin characteristics for 113 unregulated gaging stations. Separate regression equations were developed by using U.S. Geological Survey streamflow-gaging stations in regions with similar drainage-basin characteristics. These equations can increase the accuracy of regression equations used for estimating flow-duration and annual mean-flow statistics at ungaged stream locations in Oklahoma. Streamflow-gaging stations were grouped by selected drainage-basin characteristics by using a k-means cluster analysis. Three regions were identified for Oklahoma on the basis of the clustering of gaging stations and a manual delineation of distinguishable hydrologic and geologic boundaries: Region 1 (western Oklahoma excluding the Oklahoma and Texas Panhandles), Region 2 (north- and south-central Oklahoma), and Region 3 (eastern and central Oklahoma). A total of 228 regression equations (225 flow-duration regressions and three annual mean-flow regressions) were developed using ordinary least-squares and left-censored (Tobit) multiple-regression techniques. These equations can be used to estimate 75 flow-duration statistics and annual mean-flow for ungaged streams in the three regions. Drainage-basin characteristics that were statistically significant independent variables in the regression analyses were (1) contributing drainage area; (2) station elevation; (3) mean drainage-basin elevation; (4) channel slope; (5) percentage of forested canopy; (6) mean drainage-basin hillslope; (7) soil permeability; and (8) mean annual, seasonal, and monthly precipitation. The accuracy of flow-duration regression equations generally decreased from high-flow exceedance (low-exceedance probability) to low-flow exceedance (high-exceedance probability) . This decrease may have happened because a greater uncertainty exists for low-flow estimates and low-flow is largely affected by localized geology that was not quantified by the drainage-basin characteristics selected. The standard errors of estimate of regression equations for Region 1 (western Oklahoma) were substantially larger than those standard errors for other regions, especially for low-flow exceedances. These errors may be a result of greater variability in low flow because of increased irrigation activities in this region. Regression equations may not be reliable for sites where the drainage-basin characteristics are outside the range of values of independent vari
Ryberg, Karen R.
2006-01-01
This report presents the results of a study by the U.S. Geological Survey, done in cooperation with the Bureau of Reclamation, U.S. Department of the Interior, to estimate water-quality constituent concentrations in the Red River of the North at Fargo, North Dakota. Regression analysis of water-quality data collected in 2003-05 was used to estimate concentrations and loads for alkalinity, dissolved solids, sulfate, chloride, total nitrite plus nitrate, total nitrogen, total phosphorus, and suspended sediment. The explanatory variables examined for regression relation were continuously monitored physical properties of water-streamflow, specific conductance, pH, water temperature, turbidity, and dissolved oxygen. For the conditions observed in 2003-05, streamflow was a significant explanatory variable for all estimated constituents except dissolved solids. pH, water temperature, and dissolved oxygen were not statistically significant explanatory variables for any of the constituents in this study. Specific conductance was a significant explanatory variable for alkalinity, dissolved solids, sulfate, and chloride. Turbidity was a significant explanatory variable for total phosphorus and suspended sediment. For the nutrients, total nitrite plus nitrate, total nitrogen, and total phosphorus, cosine and sine functions of time also were used to explain the seasonality in constituent concentrations. The regression equations were evaluated using common measures of variability, including R2, or the proportion of variability in the estimated constituent explained by the regression equation. R2 values ranged from 0.703 for total nitrogen concentration to 0.990 for dissolved-solids concentration. The regression equations also were evaluated by calculating the median relative percentage difference (RPD) between measured constituent concentration and the constituent concentration estimated by the regression equations. Median RPDs ranged from 1.1 for dissolved solids to 35.2 for total nitrite plus nitrate. Regression equations also were used to estimate daily constituent loads. Load estimates can be used by water-quality managers for comparison of current water-quality conditions to water-quality standards expressed as total maximum daily loads (TMDLs). TMDLs are a measure of the maximum amount of chemical constituents that a water body can receive and still meet established water-quality standards. The peak loads generally occurred in June and July when streamflow also peaked.
Williams-Sether, Tara
2015-08-06
Annual peak-flow frequency data from 231 U.S. Geological Survey streamflow-gaging stations in North Dakota and parts of Montana, South Dakota, and Minnesota, with 10 or more years of unregulated peak-flow record, were used to develop regional regression equations for exceedance probabilities of 0.5, 0.20, 0.10, 0.04, 0.02, 0.01, and 0.002 using generalized least-squares techniques. Updated peak-flow frequency estimates for 262 streamflow-gaging stations were developed using data through 2009 and log-Pearson Type III procedures outlined by the Hydrology Subcommittee of the Interagency Advisory Committee on Water Data. An average generalized skew coefficient was determined for three hydrologic zones in North Dakota. A StreamStats web application was developed to estimate basin characteristics for the regional regression equation analysis. Methods for estimating a weighted peak-flow frequency for gaged sites and ungaged sites are presented.
Asquith, William H.; Roussel, Meghan C.
2009-01-01
Annual peak-streamflow frequency estimates are needed for flood-plain management; for objective assessment of flood risk; for cost-effective design of dams, levees, and other flood-control structures; and for design of roads, bridges, and culverts. Annual peak-streamflow frequency represents the peak streamflow for nine recurrence intervals of 2, 5, 10, 25, 50, 100, 200, 250, and 500 years. Common methods for estimation of peak-streamflow frequency for ungaged or unmonitored watersheds are regression equations for each recurrence interval developed for one or more regions; such regional equations are the subject of this report. The method is based on analysis of annual peak-streamflow data from U.S. Geological Survey streamflow-gaging stations (stations). Beginning in 2007, the U.S. Geological Survey, in cooperation with the Texas Department of Transportation and in partnership with Texas Tech University, began a 3-year investigation concerning the development of regional equations to estimate annual peak-streamflow frequency for undeveloped watersheds in Texas. The investigation focuses primarily on 638 stations with 8 or more years of data from undeveloped watersheds and other criteria. The general approach is explicitly limited to the use of L-moment statistics, which are used in conjunction with a technique of multi-linear regression referred to as PRESS minimization. The approach used to develop the regional equations, which was refined during the investigation, is referred to as the 'L-moment-based, PRESS-minimized, residual-adjusted approach'. For the approach, seven unique distributions are fit to the sample L-moments of the data for each of 638 stations and trimmed means of the seven results of the distributions for each recurrence interval are used to define the station specific, peak-streamflow frequency. As a first iteration of regression, nine weighted-least-squares, PRESS-minimized, multi-linear regression equations are computed using the watershed characteristics of drainage area, dimensionless main-channel slope, and mean annual precipitation. The residuals of the nine equations are spatially mapped, and residuals for the 10-year recurrence interval are selected for generalization to 1-degree latitude and longitude quadrangles. The generalized residual is referred to as the OmegaEM parameter and represents a generalized terrain and climate index that expresses peak-streamflow potential not otherwise represented in the three watershed characteristics. The OmegaEM parameter was assigned to each station, and using OmegaEM, nine additional regression equations are computed. Because of favorable diagnostics, the OmegaEM equations are expected to be generally reliable estimators of peak-streamflow frequency for undeveloped and ungaged stream locations in Texas. The mean residual standard error, adjusted R-squared, and percentage reduction of PRESS by use of OmegaEM are 0.30log10, 0.86, and -21 percent, respectively. Inclusion of the OmegaEM parameter provides a substantial reduction in the PRESS statistic of the regression equations and removes considerable spatial dependency in regression residuals. Although the OmegaEM parameter requires interpretation on the part of analysts and the potential exists that different analysts could estimate different values for a given watershed, the authors suggest that typical uncertainty in the OmegaEM estimate might be about +or-0.1010. Finally, given the two ensembles of equations reported herein and those in previous reports, hydrologic design engineers and other analysts have several different methods, which represent different analytical tracks, to make comparisons of peak-streamflow frequency estimates for ungaged watersheds in the study area.
Linear models for calculating digestibile energy for sheep diets.
Fonnesbeck, P V; Christiansen, M L; Harris, L E
1981-05-01
Equations for estimating the digestible energy (DE) content of sheep diets were generated from the chemical contents and a factorial description of diets fed to lambs in digestion trials. The diet factors were two forages (alfalfa and grass hay), harvested at three stages of maturity (late vegetative, early bloom and full bloom), fed in two ingredient combinations (all hay or a 50:50 hay and corn grain mixture) and prepared by two forage texture processes (coarsely chopped or finely chopped and pelleted). The 2 x 3 x 2 x 2 factorial arrangement produced 24 diet treatments. These were replicated twice, for a total of 48 lamb digestion trials. In model 1 regression equations, DE was calculated directly from chemical composition of the diet. In model 2, regression equations predicted the percentage of digested nutrient from the chemical contents of the diet and then DE of the diet was calculated as the sum of the gross energy of the digested organic components. Expanded forms of model 1 and model 2 were also developed that included diet factors as qualitative indicator variables to adjust the regression constant and regression coefficients for the diet description. The expanded forms of the equations accounted for significantly more variation in DE than did the simple models and more accurately estimated DE of the diet. Information provided by the diet description proved as useful as chemical analyses for the prediction of digestibility of nutrients. The statistics indicate that, with model 1, neutral detergent fiber and plant cell wall analyses provided as much information for the estimation of DE as did model 2 with the combined information from crude protein, available carbohydrate, total lipid, cellulose and hemicellulose. Regression equations are presented for estimating DE with the most currently analyzed organic components, including linear and curvilinear variables and diet factors that significantly reduce the standard error of the estimate. To estimate De of a diet, the user utilizes the equation that uses the chemical analysis information and diet description most effectively.
Estimation of left ventricular mass in conscious dogs
NASA Technical Reports Server (NTRS)
Coleman, Bernell; Cothran, Laval N.; Ison-Franklin, E. L.; Hawthorne, E. W.
1986-01-01
A method for the assessment of the development or the regression of left ventricular hypertrophy (LVH) in a conscious instrumented animal is described. First, the single-slice short-axis area-length method for estimating the left-ventricular mass (LVM) and volume (LVV) was validated in 24 formaldehyde-fixed canine hearts, and a regression equation was developed that could be used in the intact animal to correct the sonomicrometrically estimated LVM. The LVM-assessment method, which uses the combined techniques of echocardiography and sonomicrometry (in conjunction with the regression equation), was shown to provide reliable and reproducible day-to-day estimates of LVM and LVV, and to be sensitive enough to detect serial changes during the development of LVH.
Modeling animal movements using stochastic differential equations
Haiganoush K. Preisler; Alan A. Ager; Bruce K. Johnson; John G. Kie
2004-01-01
We describe the use of bivariate stochastic differential equations (SDE) for modeling movements of 216 radiocollared female Rocky Mountain elk at the Starkey Experimental Forest and Range in northeastern Oregon. Spatially and temporally explicit vector fields were estimated using approximating difference equations and nonparametric regression techniques. Estimated...
Flood characteristics of urban watersheds in the United States
Sauer, Vernon B.; Thomas, W.O.; Stricker, V.A.; Wilson, K.V.
1983-01-01
A nationwide study of flood magnitude and frequency in urban areas was made for the purpose of reviewing available literature, compiling an urban flood data base, and developing methods of estimating urban floodflow characteristics in ungaged areas. The literature review contains synopses of 128 recent publications related to urban floodflow. A data base of 269 gaged basins in 56 cities and 31 States, including Hawaii, contains a wide variety of topographic and climatic characteristics, land-use variables, indices of urbanization, and flood-frequency estimates. Three sets of regression equations were developed to estimate flood discharges for ungaged sites for recurrence intervals of 2, 5, 10, 25, 50, 100, and 500 years. Two sets of regression equations are based on seven independent parameters and the third is based on three independent parameters. The only difference in the two sets of seven-parameter equations is the use of basin lag time in one and lake and reservoir storage in the other. Of primary importance in these equations is an independent estimate of the equivalent rural discharge for the ungaged basin. The equations adjust the equivalent rural discharge to an urban condition. The primary adjustment factor, or index of urbanization, is the basin development factor, a measure of the extent of development of the drainage system in the basin. This measure includes evaluations of storm drains (sewers), channel improvements, and curb-and-gutter streets. The basin development factor is statistically very significant and offers a simple and effective way of accounting for drainage development and runoff response in urban areas. Percentage of impervious area is also included in the seven-parameter equations as an additional measure of urbanization and apparently accounts for increased runoff volumes. This factor is not highly significant for large floods, which supports the generally held concept that imperviousness is not a dominant factor when soils become more saturated during large storms. Other parameters in the seven-parameter equations include drainage area size, channel slope, rainfall intensity, lake and reservoir storage, and basin lag time. These factors are all statistically significant and provide logical indices of basin conditions. The three-parameter equations include only the three most significant parameters: rural discharge, basin-development factor, and drainage area size. All three sets of regression equations provide unbiased estimates of urban flood frequency. The seven-parameter regression equations without basin lag time have average standard errors of regression varying from ? 37 percent for the 5-year flood to ? 44 percent for the 100-year flood and ? 49 percent for the 500-year flood. The other two sets of regression equations have similar accuracy. Several tests for bias, sensitivity, and hydrologic consistency are included which support the conclusion that the equations are useful throughout the United States. All estimating equations were developed from data collected on drainage basins where temporary in-channel storage, due to highway embankments, was not significant. Consequently, estimates made with these equations do not account for the reducing effect of this temporary detention storage.
Wood, Molly S.; Fosness, Ryan L.; Skinner, Kenneth D.; Veilleux, Andrea G.
2016-06-27
The U.S. Geological Survey, in cooperation with the Idaho Transportation Department, updated regional regression equations to estimate peak-flow statistics at ungaged sites on Idaho streams using recent streamflow (flow) data and new statistical techniques. Peak-flow statistics with 80-, 67-, 50-, 43-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities (1.25-, 1.50-, 2.00-, 2.33-, 5.00-, 10.0-, 25.0-, 50.0-, 100-, 200-, and 500-year recurrence intervals, respectively) were estimated for 192 streamgages in Idaho and bordering States with at least 10 years of annual peak-flow record through water year 2013. The streamgages were selected from drainage basins with little or no flow diversion or regulation. The peak-flow statistics were estimated by fitting a log-Pearson type III distribution to records of annual peak flows and applying two additional statistical methods: (1) the Expected Moments Algorithm to help describe uncertainty in annual peak flows and to better represent missing and historical record; and (2) the generalized Multiple Grubbs Beck Test to screen out potentially influential low outliers and to better fit the upper end of the peak-flow distribution. Additionally, a new regional skew was estimated for the Pacific Northwest and used to weight at-station skew at most streamgages. The streamgages were grouped into six regions (numbered 1_2, 3, 4, 5, 6_8, and 7, to maintain consistency in region numbering with a previous study), and the estimated peak-flow statistics were related to basin and climatic characteristics to develop regional regression equations using a generalized least squares procedure. Four out of 24 evaluated basin and climatic characteristics were selected for use in the final regional peak-flow regression equations.Overall, the standard error of prediction for the regional peak-flow regression equations ranged from 22 to 132 percent. Among all regions, regression model fit was best for region 4 in west-central Idaho (average standard error of prediction=46.4 percent; pseudo-R2>92 percent) and region 5 in central Idaho (average standard error of prediction=30.3 percent; pseudo-R2>95 percent). Regression model fit was poor for region 7 in southern Idaho (average standard error of prediction=103 percent; pseudo-R2<78 percent) compared to other regions because few streamgages in region 7 met the criteria for inclusion in the study, and the region’s semi-arid climate and associated variability in precipitation patterns causes substantial variability in peak flows.A drainage area ratio-adjustment method, using ratio exponents estimated using generalized least-squares regression, was presented as an alternative to the regional regression equations if peak-flow estimates are desired at an ungaged site that is close to a streamgage selected for inclusion in this study. The alternative drainage area ratio-adjustment method is appropriate for use when the drainage area ratio between the ungaged and gaged sites is between 0.5 and 1.5.The updated regional peak-flow regression equations had lower total error (standard error of prediction) than all regression equations presented in a 1982 study and in four of six regions presented in 2002 and 2003 studies in Idaho. A more extensive streamgage screening process used in the current study resulted in fewer streamgages used in the current study than in the 1982, 2002, and 2003 studies. Fewer streamgages used and the selection of different explanatory variables were likely causes of increased error in some regions compared to previous studies, but overall, regional peak‑flow regression model fit was generally improved for Idaho. The revised statistical procedures and increased streamgage screening applied in the current study most likely resulted in a more accurate representation of natural peak-flow conditions.The updated, regional peak-flow regression equations will be integrated in the U.S. Geological Survey StreamStats program to allow users to estimate basin and climatic characteristics and peak-flow statistics at ungaged locations of interest. StreamStats estimates peak-flow statistics with quantifiable certainty only when used at sites with basin and climatic characteristics within the range of input variables used to develop the regional regression equations. Both the regional regression equations and StreamStats should be used to estimate peak-flow statistics only in naturally flowing, relatively unregulated streams without substantial local influences to flow, such as large seeps, springs, or other groundwater-surface water interactions that are not widespread or characteristic of the respective region.
ERIC Educational Resources Information Center
Bulcock, J. W.; And Others
Multicollinearity refers to the presence of highly intercorrelated independent variables in structural equation models, that is, models estimated by using techniques such as least squares regression and maximum likelihood. There is a problem of multicollinearity in both the natural and social sciences where theory formulation and estimation is in…
Biomass equations for major tree species of the Northeast
Louise M. Tritton; James W. Hornbeck
1982-01-01
Regression equations are used in both forestry and ecosystem studies to estimate tree biomass from field measurements of dbh (diameter at breast height) or a combination of dbh and height. Literature on biomass is reviewed, and 178 sets of publish equation for 25 species common to the Northeastern Unites States are listed. On the basis of these equations, estimates of...
Stamey, Timothy C.
1998-01-01
Simple and reliable methods for estimating hourly streamflow are needed for the calibration and verification of a Chattahoochee River basin model between Buford Dam and Franklin, Ga. The river basin model is being developed by Georgia Department of Natural Resources, Environmental Protection Division, as part of their Chattahoochee River Modeling Project. Concurrent streamflow data collected at 19 continuous-record, and 31 partial-record streamflow stations, were used in ordinary least-squares linear regression analyses to define estimating equations, and in verifying drainage-area prorations. The resulting regression or drainage-area ratio estimating equations were used to compute hourly streamflow at the partial-record stations. The coefficients of determination (r-squared values) for the regression estimating equations ranged from 0.90 to 0.99. Observed and estimated hourly and daily streamflow data were computed for May 1, 1995, through October 31, 1995. Comparisons of observed and estimated daily streamflow data for 12 continuous-record tributary stations, that had available streamflow data for all or part of the period from May 1, 1995, to October 31, 1995, indicate that the mean error of estimate for the daily streamflow was about 25 percent.
Khan, I.; Hawlader, Sophie Mohammad Delwer Hossain; Arifeen, Shams El; Moore, Sophie; Hills, Andrew P.; Wells, Jonathan C.; Persson, Lars-Åke; Kabir, Iqbal
2012-01-01
The aim of this study was to investigate the validity of the Tanita TBF 300A leg-to-leg bioimpedance analyzer for estimating fat-free mass (FFM) in Bangladeshi children aged 4-10 years and to develop novel prediction equations for use in this population, using deuterium dilution as the reference method. Two hundred Bangladeshi children were enrolled. The isotope dilution technique with deuterium oxide was used for estimation of total body water (TBW). FFM estimated by Tanita was compared with results of deuterium oxide dilution technique. Novel prediction equations were created for estimating FFM, using linear regression models, fitting child's height and impedance as predictors. There was a significant difference in FFM and percentage of body fat (BF%) between methods (p<0.01), Tanita underestimating TBW in boys (p=0.001) and underestimating BF% in girls (p<0.001). A basic linear regression model with height and impedance explained 83% of the variance in FFM estimated by deuterium oxide dilution technique. The best-fit equation to predict FFM from linear regression modelling was achieved by adding weight, sex, and age to the basic model, bringing the adjusted R2 to 89% (standard error=0.90, p<0.001). These data suggest Tanita analyzer may be a valid field-assessment technique in Bangladeshi children when using population-specific prediction equations, such as the ones developed here. PMID:23082630
Estimating total forest biomass in Maine, 1995
Eric H. Wharton; Douglas M. Griffith; Douglas M. Griffith
1998-01-01
Presents methods for synthesizing information from existing biomass literature for estimating biomass over extensive forest areas with specific applications to Maine. Tables of appropriate regression equations and the tree and shrub species to which these equations can be applied are presented as well as biomass estimates at the county and state level.
Rank-preserving regression: a more robust rank regression model against outliers.
Chen, Tian; Kowalski, Jeanne; Chen, Rui; Wu, Pan; Zhang, Hui; Feng, Changyong; Tu, Xin M
2016-08-30
Mean-based semi-parametric regression models such as the popular generalized estimating equations are widely used to improve robustness of inference over parametric models. Unfortunately, such models are quite sensitive to outlying observations. The Wilcoxon-score-based rank regression (RR) provides more robust estimates over generalized estimating equations against outliers. However, the RR and its extensions do not sufficiently address missing data arising in longitudinal studies. In this paper, we propose a new approach to address outliers under a different framework based on the functional response models. This functional-response-model-based alternative not only addresses limitations of the RR and its extensions for longitudinal data, but, with its rank-preserving property, even provides more robust estimates than these alternatives. The proposed approach is illustrated with both real and simulated data. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Estimating equations estimates of trends
Link, W.A.; Sauer, J.R.
1994-01-01
The North American Breeding Bird Survey monitors changes in bird populations through time using annual counts at fixed survey sites. The usual method of estimating trends has been to use the logarithm of the counts in a regression analysis. It is contended that this procedure is reasonably satisfactory for more abundant species, but produces biased estimates for less abundant species. An alternative estimation procedure based on estimating equations is presented.
Ries(compiler), Kernell G.; With sections by Atkins, J. B.; Hummel, P.R.; Gray, Matthew J.; Dusenbury, R.; Jennings, M.E.; Kirby, W.H.; Riggs, H.C.; Sauer, V.B.; Thomas, W.O.
2007-01-01
The National Streamflow Statistics (NSS) Program is a computer program that should be useful to engineers, hydrologists, and others for planning, management, and design applications. NSS compiles all current U.S. Geological Survey (USGS) regional regression equations for estimating streamflow statistics at ungaged sites in an easy-to-use interface that operates on computers with Microsoft Windows operating systems. NSS expands on the functionality of the USGS National Flood Frequency Program, and replaces it. The regression equations included in NSS are used to transfer streamflow statistics from gaged to ungaged sites through the use of watershed and climatic characteristics as explanatory or predictor variables. Generally, the equations were developed on a statewide or metropolitan-area basis as part of cooperative study programs. Equations are available for estimating rural and urban flood-frequency statistics, such as the 1 00-year flood, for every state, for Puerto Rico, and for the island of Tutuila, American Samoa. Equations are available for estimating other statistics, such as the mean annual flow, monthly mean flows, flow-duration percentiles, and low-flow frequencies (such as the 7-day, 0-year low flow) for less than half of the states. All equations available for estimating streamflow statistics other than flood-frequency statistics assume rural (non-regulated, non-urbanized) conditions. The NSS output provides indicators of the accuracy of the estimated streamflow statistics. The indicators may include any combination of the standard error of estimate, the standard error of prediction, the equivalent years of record, or 90 percent prediction intervals, depending on what was provided by the authors of the equations. The program includes several other features that can be used only for flood-frequency estimation. These include the ability to generate flood-frequency plots, and plots of typical flood hydrographs for selected recurrence intervals, estimates of the probable maximum flood, extrapolation of the 500-year flood when an equation for estimating it is not available, and weighting techniques to improve flood-frequency estimates for gaging stations and ungaged sites on gaged streams. This report describes the regionalization techniques used to develop the equations in NSS and provides guidance on the applicability and limitations of the techniques. The report also includes a users manual and a summary of equations available for estimating basin lagtime, which is needed by the program to generate flood hydrographs. The NSS software and accompanying database, and the documentation for the regression equations included in NSS, are available on the Web at http://water.usgs.gov/software/.
Estimated Perennial Streams of Idaho and Related Geospatial Datasets
Rea, Alan; Skinner, Kenneth D.
2009-01-01
The perennial or intermittent status of a stream has bearing on many regulatory requirements. Because of changing technologies over time, cartographic representation of perennial/intermittent status of streams on U.S. Geological Survey (USGS) topographic maps is not always accurate and (or) consistent from one map sheet to another. Idaho Administrative Code defines an intermittent stream as one having a 7-day, 2-year low flow (7Q2) less than 0.1 cubic feet per second. To establish consistency with the Idaho Administrative Code, the USGS developed regional regression equations for Idaho streams for several low-flow statistics, including 7Q2. Using these regression equations, the 7Q2 streamflow may be estimated for naturally flowing streams anywhere in Idaho to help determine perennial/intermittent status of streams. Using these equations in conjunction with a Geographic Information System (GIS) technique known as weighted flow accumulation allows for an automated and continuous estimation of 7Q2 streamflow at all points along a stream, which in turn can be used to determine if a stream is intermittent or perennial according to the Idaho Administrative Code operational definition. The selected regression equations were applied to create continuous grids of 7Q2 estimates for the eight low-flow regression regions of Idaho. By applying the 0.1 ft3/s criterion, the perennial streams have been estimated in each low-flow region. Uncertainty in the estimates is shown by identifying a 'transitional' zone, corresponding to flow estimates of 0.1 ft3/s plus and minus one standard error. Considerable additional uncertainty exists in the model of perennial streams presented in this report. The regression models provide overall estimates based on general trends within each regression region. These models do not include local factors such as a large spring or a losing reach that may greatly affect flows at any given point. Site-specific flow data, assuming a sufficient period of record, generally would be considered to represent flow conditions better at a given site than flow estimates based on regionalized regression models. The geospatial datasets of modeled perennial streams are considered a first-cut estimate, and should not be construed to override site-specific flow data.
Sando, Steven K.; Morgan, Timothy J.; Dutton, DeAnn M.; McCarthy, Peter M.
2009-01-01
Charles M. Russell National Wildlife Refuge (CMR) encompasses about 1.1 million acres (including Fort Peck Reservoir on the Missouri River) in northeastern Montana. To ensure that sufficient streamflow remains in the tributary streams to maintain the riparian corridors, the U.S. Fish and Wildlife Service is negotiating water-rights issues with the Reserved Water Rights Compact Commission of Montana. The U.S. Geological Survey, in cooperation with the U.S. Fish and Wildlife Service, conducted a study to gage, for a short period, selected streams that cross CMR, and analyze data to estimate long-term streamflow characteristics for CMR. The long-term streamflow characteristics of primary interest include the monthly and annual 90-, 80-, 50-, and 20-percent exceedance streamflows and mean streamflows (Q.90, Q.80, Q.50, Q.20, and QM, respectively), and the 1.5-, 2-, and 2.33- year peak flows (PK1.5, PK2, and PK2.33, respectively). The Regional Adjustment Relationship (RAR) was investigated for estimating the monthly and annual Q.90, Q.80, Q.50, Q.20, and QM, and the PK1.5, PK2, and PK2.33 for the short-term CMR gaging stations (hereinafter referred to as CMR stations). The RAR was determined to provide acceptable results for estimating the long-term Q.90, Q.80, Q.50, Q.20, and QM on a monthly basis for the months of March through June, and also on an annual basis. For the months of September through January, the RAR regression equations did not provide acceptable results for any long-term streamflow characteristic. For the month of February, the RAR regression equations provided acceptable results for the long-term Q.50 and QM, but poor results for the long-term Q.90, Q.80, and Q.20. For the months of July and August, the RAR provided acceptable results for the long-term Q.50, Q.20, and QM, but poor results for the long-term Q.90 and Q.80. Estimation coefficients were developed for estimating the long-term streamflow characteristics for which the RAR did not provide acceptable results. The RAR also was determined to provide acceptable results for estimating the PK1.5., PK2, and PK2.33 for the three CMR stations that lacked suitable peak-flow records. Methods for estimating streamflow characteristics at ungaged sites also were derived. Regression analyses that relate individual streamflow characteristics to various basin and climatic characteristics for gaging stations were performed to develop regression equations to estimate streamflow characteristics at ungaged sites. Final equations for the annual Q.50, Q.20, and QM are reported. Acceptable equations also were developed for estimating QM for the months of February, March, April, June, and July, and Q.50, Q.20, and QM on an annual basis. However, equations for QM for the months of February, March, April, June, and July were determined to be less consistent and reliable than the use of estimation coefficients applied to the regression equation results for the annual QM. Acceptable regression equations also were developed for the PK1.5, PK2, and PK2.33.
John Yarie; Bert R. Mead
1988-01-01
Equations are presented for estimating the twig, foliage, and combined biomass for 58 plant species in interior Alaska. The equations can be used for estimating biomass from percentage of foliar cover of 10-centimeter layers in a vertical profile from 0 to 6 meters. Few differences were found in regressions of the same species between layers except when the ratio of...
Weight estimation techniques for composite airplanes in general aviation industry
NASA Technical Reports Server (NTRS)
Paramasivam, T.; Horn, W. J.; Ritter, J.
1986-01-01
Currently available weight estimation methods for general aviation airplanes were investigated. New equations with explicit material properties were developed for the weight estimation of aircraft components such as wing, fuselage and empennage. Regression analysis was applied to the basic equations for a data base of twelve airplanes to determine the coefficients. The resulting equations can be used to predict the component weights of either metallic or composite airplanes.
Estimating leaf area and leaf biomass of open-grown deciduous urban trees
David J. Nowak
1996-01-01
Logarithmic regression equations were developed to predict leaf area and leaf biomass for open-grown deciduous urban trees based on stem diameter and crown parameters. Equations based on crown parameters produced more reliable estimates. The equations can be used to help quantify forest structure and functions, particularly in urbanizing and urban/suburban areas.
Equations for estimating selected streamflow statistics in Rhode Island
Bent, Gardner C.; Steeves, Peter A.; Waite, Andrew M.
2014-01-01
The equations, which are based on data from streams with little to no flow alterations, will provide an estimate of the natural flows for a selected site. They will not estimate flows for altered sites with dams, surface-water withdrawals, groundwater withdrawals (pumping wells), diversions, and wastewater discharges. If the equations are used to estimate streamflow statistics for altered sites, the user should adjust the flow estimates for the alterations. The regression equations should be used only for ungaged sites with drainage areas between 0.52 and 294 square miles and stream densities between 0.94 and 3.49 miles per square mile; these are the ranges of the explanatory variables in the equations.
Estimating total forest biomass in New York, 1993
Eric Wharton; Carol Alerich; David A. Drake; David A. Drake
1997-01-01
Presents methods for synthesizing information from existing biomass literature for estimating biomass over extensive forest areas with specific applications to New York. Tables of appropriate regression equations and the tree and shrub species to which these equations can be applied are presented well as biomass estimates at the county, geographic unit, and state level...
Baldys, Stanley; Raines, T.H.; Mansfield, B.L.; Sandlin, J.T.
1998-01-01
Local regression equations were developed to estimate loads produced by individual storms. Mean annual loads were estimated by applying the storm-load equations for all runoff-producing storms in an average climatic year and summing individual storm loads to determine the annual load.
The Bland-Altman Method Should Not Be Used in Regression Cross-Validation Studies
ERIC Educational Resources Information Center
O'Connor, Daniel P.; Mahar, Matthew T.; Laughlin, Mitzi S.; Jackson, Andrew S.
2011-01-01
The purpose of this study was to demonstrate the bias in the Bland-Altman (BA) limits of agreement method when it is used to validate regression models. Data from 1,158 men were used to develop three regression equations to estimate maximum oxygen uptake (R[superscript 2] = 0.40, 0.61, and 0.82, respectively). The equations were evaluated in a…
Westgate, Philip M.
2016-01-01
When generalized estimating equations (GEE) incorporate an unstructured working correlation matrix, the variances of regression parameter estimates can inflate due to the estimation of the correlation parameters. In previous work, an approximation for this inflation that results in a corrected version of the sandwich formula for the covariance matrix of regression parameter estimates was derived. Use of this correction for correlation structure selection also reduces the over-selection of the unstructured working correlation matrix. In this manuscript, we conduct a simulation study to demonstrate that an increase in variances of regression parameter estimates can occur when GEE incorporates structured working correlation matrices as well. Correspondingly, we show the ability of the corrected version of the sandwich formula to improve the validity of inference and correlation structure selection. We also study the relative influences of two popular corrections to a different source of bias in the empirical sandwich covariance estimator. PMID:27818539
Westgate, Philip M
2016-01-01
When generalized estimating equations (GEE) incorporate an unstructured working correlation matrix, the variances of regression parameter estimates can inflate due to the estimation of the correlation parameters. In previous work, an approximation for this inflation that results in a corrected version of the sandwich formula for the covariance matrix of regression parameter estimates was derived. Use of this correction for correlation structure selection also reduces the over-selection of the unstructured working correlation matrix. In this manuscript, we conduct a simulation study to demonstrate that an increase in variances of regression parameter estimates can occur when GEE incorporates structured working correlation matrices as well. Correspondingly, we show the ability of the corrected version of the sandwich formula to improve the validity of inference and correlation structure selection. We also study the relative influences of two popular corrections to a different source of bias in the empirical sandwich covariance estimator.
Perry, Charles A.; Wolock, David M.; Artman, Joshua C.
2004-01-01
Streamflow statistics of flow duration and peak-discharge frequency were estimated for 4,771 individual locations on streams listed on the 1999 Kansas Surface Water Register. These statistics included the flow-duration values of 90, 75, 50, 25, and 10 percent, as well as the mean flow value. Peak-discharge frequency values were estimated for the 2-, 5-, 10-, 25-, 50-, and 100-year floods. Least-squares multiple regression techniques were used, along with Tobit analyses, to develop equations for estimating flow-duration values of 90, 75, 50, 25, and 10 percent and the mean flow for uncontrolled flow stream locations. The contributing-drainage areas of 149 U.S. Geological Survey streamflow-gaging stations in Kansas and parts of surrounding States that had flow uncontrolled by Federal reservoirs and used in the regression analyses ranged from 2.06 to 12,004 square miles. Logarithmic transformations of climatic and basin data were performed to yield the best linear relation for developing equations to compute flow durations and mean flow. In the regression analyses, the significant climatic and basin characteristics, in order of importance, were contributing-drainage area, mean annual precipitation, mean basin permeability, and mean basin slope. The analyses yielded a model standard error of prediction range of 0.43 logarithmic units for the 90-percent duration analysis to 0.15 logarithmic units for the 10-percent duration analysis. The model standard error of prediction was 0.14 logarithmic units for the mean flow. Regression equations used to estimate peak-discharge frequency values were obtained from a previous report, and estimates for the 2-, 5-, 10-, 25-, 50-, and 100-year floods were determined for this report. The regression equations and an interpolation procedure were used to compute flow durations, mean flow, and estimates of peak-discharge frequency for locations along uncontrolled flow streams on the 1999 Kansas Surface Water Register. Flow durations, mean flow, and peak-discharge frequency values determined at available gaging stations were used to interpolate the regression-estimated flows for the stream locations where available. Streamflow statistics for locations that had uncontrolled flow were interpolated using data from gaging stations weighted according to the drainage area and the bias between the regression-estimated and gaged flow information. On controlled reaches of Kansas streams, the streamflow statistics were interpolated between gaging stations using only gaged data weighted by drainage area.
Maine StreamStats: a water-resources web application
Lombard, Pamela J.
2015-01-01
Reports referenced in this fact sheet present the regression equations used to estimate the flow statistics, describe the errors associated with the estimates, and describe the methods used to develop the equations and to measure the basin characteristics used in the equations. Limitations of the methods are also described in the reports; for example, all of the equations are appropriate only for ungaged, unregulated, rural streams in Maine.
Empirical Likelihood in Nonignorable Covariate-Missing Data Problems.
Xie, Yanmei; Zhang, Biao
2017-04-20
Missing covariate data occurs often in regression analysis, which frequently arises in the health and social sciences as well as in survey sampling. We study methods for the analysis of a nonignorable covariate-missing data problem in an assumed conditional mean function when some covariates are completely observed but other covariates are missing for some subjects. We adopt the semiparametric perspective of Bartlett et al. (Improving upon the efficiency of complete case analysis when covariates are MNAR. Biostatistics 2014;15:719-30) on regression analyses with nonignorable missing covariates, in which they have introduced the use of two working models, the working probability model of missingness and the working conditional score model. In this paper, we study an empirical likelihood approach to nonignorable covariate-missing data problems with the objective of effectively utilizing the two working models in the analysis of covariate-missing data. We propose a unified approach to constructing a system of unbiased estimating equations, where there are more equations than unknown parameters of interest. One useful feature of these unbiased estimating equations is that they naturally incorporate the incomplete data into the data analysis, making it possible to seek efficient estimation of the parameter of interest even when the working regression function is not specified to be the optimal regression function. We apply the general methodology of empirical likelihood to optimally combine these unbiased estimating equations. We propose three maximum empirical likelihood estimators of the underlying regression parameters and compare their efficiencies with other existing competitors. We present a simulation study to compare the finite-sample performance of various methods with respect to bias, efficiency, and robustness to model misspecification. The proposed empirical likelihood method is also illustrated by an analysis of a data set from the US National Health and Nutrition Examination Survey (NHANES).
Bisese, James A.
1995-01-01
Methods are presented for estimating the peak discharges of rural, unregulated streams in Virginia. A Pearson Type III distribution is fitted to the logarithms of the unregulated annual peak-discharge records from 363 stream-gaging stations in Virginia to estimate the peak discharge at these stations for recurrence intervals of 2 to 500 years. Peak-discharge characteristics for 284 unregulated stations are divided into eight regions based on physiographic province, and regressed on basin characteristics, including drainage area, main channel length, main channel slope, mean basin elevation, percentage of forest cover, mean annual precipitation, and maximum rainfall intensity. Regression equations for each region are computed by use of the generalized least-squares method, which accounts for spatial and temporal correlation between nearby gaging stations. This regression technique weights the significance of each station to the regional equation based on the length of records collected at each cation, the correlation between annual peak discharges among the stations, and the standard deviation of the annual peak discharge for each station.Drainage area proved to be the only significant explanatory variable in four regions, while other regions have as many as three significant variables. Standard errors of the regression equations range from 30 to 80 percent. Alternate equations using drainage area only are provided for the five regions with more than one significant explanatory variable.Methods and sample computations are provided to estimate peak discharges at gaged and engaged sites in Virginia for recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years, and to adjust the regression estimates for sites on gaged streams where nearby gaging-station records are available.
Methods for estimating drought streamflow probabilities for Virginia streams
Austin, Samuel H.
2014-01-01
Maximum likelihood logistic regression model equations used to estimate drought flow probabilities for Virginia streams are presented for 259 hydrologic basins in Virginia. Winter streamflows were used to estimate the likelihood of streamflows during the subsequent drought-prone summer months. The maximum likelihood logistic regression models identify probable streamflows from 5 to 8 months in advance. More than 5 million streamflow daily values collected over the period of record (January 1, 1900 through May 16, 2012) were compiled and analyzed over a minimum 10-year (maximum 112-year) period of record. The analysis yielded the 46,704 equations with statistically significant fit statistics and parameter ranges published in two tables in this report. These model equations produce summer month (July, August, and September) drought flow threshold probabilities as a function of streamflows during the previous winter months (November, December, January, and February). Example calculations are provided, demonstrating how to use the equations to estimate probable streamflows as much as 8 months in advance.
Watson, Kara M.; McHugh, Amy R.
2014-01-01
Regional regression equations were developed for estimating monthly flow-duration and monthly low-flow frequency statistics for ungaged streams in Coastal Plain and non-coastal regions of New Jersey for baseline and current land- and water-use conditions. The equations were developed to estimate 87 different streamflow statistics, which include the monthly 99-, 90-, 85-, 75-, 50-, and 25-percentile flow-durations of the minimum 1-day daily flow; the August–September 99-, 90-, and 75-percentile minimum 1-day daily flow; and the monthly 7-day, 10-year (M7D10Y) low-flow frequency. These 87 streamflow statistics were computed for 41 continuous-record streamflow-gaging stations (streamgages) with 20 or more years of record and 167 low-flow partial-record stations in New Jersey with 10 or more streamflow measurements. The regression analyses used to develop equations to estimate selected streamflow statistics were performed by testing the relation between flow-duration statistics and low-flow frequency statistics for 32 basin characteristics (physical characteristics, land use, surficial geology, and climate) at the 41 streamgages and 167 low-flow partial-record stations. The regression analyses determined drainage area, soil permeability, average April precipitation, average June precipitation, and percent storage (water bodies and wetlands) were the significant explanatory variables for estimating the selected flow-duration and low-flow frequency statistics. Streamflow estimates were computed for two land- and water-use conditions in New Jersey—land- and water-use during the baseline period of record (defined as the years a streamgage had little to no change in development and water use) and current land- and water-use conditions (1989–2008)—for each selected station using data collected through water year 2008. The baseline period of record is representative of a period when the basin was unaffected by change in development. The current period is representative of the increased development of the last 20 years (1989–2008). The two different land- and water-use conditions were used as surrogates for development to determine whether there have been changes in low-flow statistics as a result of changes in development over time. The State was divided into two low-flow regression regions, the Coastal Plain and the non-coastal region, in order to improve the accuracy of the regression equations. The left-censored parametric survival regression method was used for the analyses to account for streamgages and partial-record stations that had zero flow values for some of the statistics. The average standard error of estimate for the 348 regression equations ranged from 16 to 340 percent. These regression equations and basin characteristics are presented in the U.S. Geological Survey (USGS) StreamStats Web-based geographic information system application. This tool allows users to click on an ungaged site on a stream in New Jersey and get the estimated flow-duration and low-flow frequency statistics. Additionally, the user can click on a streamgage or partial-record station and get the “at-site” streamflow statistics. The low-flow characteristics of a stream ultimately affect the use of the stream by humans. Specific information on the low-flow characteristics of streams is essential to water managers who deal with problems related to municipal and industrial water supply, fish and wildlife conservation, and dilution of wastewater.
Methods for estimating magnitude and frequency of peak flows for natural streams in Utah
Kenney, Terry A.; Wilkowske, Chris D.; Wright, Shane J.
2007-01-01
Estimates of the magnitude and frequency of peak streamflows is critical for the safe and cost-effective design of hydraulic structures and stream crossings, and accurate delineation of flood plains. Engineers, planners, resource managers, and scientists need accurate estimates of peak-flow return frequencies for locations on streams with and without streamflow-gaging stations. The 2-, 5-, 10-, 25-, 50-, 100-, 200-, and 500-year recurrence-interval flows were estimated for 344 unregulated U.S. Geological Survey streamflow-gaging stations in Utah and nearby in bordering states. These data along with 23 basin and climatic characteristics computed for each station were used to develop regional peak-flow frequency and magnitude regression equations for 7 geohydrologic regions of Utah. These regression equations can be used to estimate the magnitude and frequency of peak flows for natural streams in Utah within the presented range of predictor variables. Uncertainty, presented as the average standard error of prediction, was computed for each developed equation. Equations developed using data from more than 35 gaging stations had standard errors of prediction that ranged from 35 to 108 percent, and errors for equations developed using data from less than 35 gaging stations ranged from 50 to 357 percent.
Eash, David A.; Barnes, Kimberlee K.; Veilleux, Andrea G.
2013-01-01
A statewide study was performed to develop regional regression equations for estimating selected annual exceedance-probability statistics for ungaged stream sites in Iowa. The study area comprises streamgages located within Iowa and 50 miles beyond the State’s borders. Annual exceedance-probability estimates were computed for 518 streamgages by using the expected moments algorithm to fit a Pearson Type III distribution to the logarithms of annual peak discharges for each streamgage using annual peak-discharge data through 2010. The estimation of the selected statistics included a Bayesian weighted least-squares/generalized least-squares regression analysis to update regional skew coefficients for the 518 streamgages. Low-outlier and historic information were incorporated into the annual exceedance-probability analyses, and a generalized Grubbs-Beck test was used to detect multiple potentially influential low flows. Also, geographic information system software was used to measure 59 selected basin characteristics for each streamgage. Regional regression analysis, using generalized least-squares regression, was used to develop a set of equations for each flood region in Iowa for estimating discharges for ungaged stream sites with 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities, which are equivalent to annual flood-frequency recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years, respectively. A total of 394 streamgages were included in the development of regional regression equations for three flood regions (regions 1, 2, and 3) that were defined for Iowa based on landform regions and soil regions. Average standard errors of prediction range from 31.8 to 45.2 percent for flood region 1, 19.4 to 46.8 percent for flood region 2, and 26.5 to 43.1 percent for flood region 3. The pseudo coefficients of determination for the generalized least-squares equations range from 90.8 to 96.2 percent for flood region 1, 91.5 to 97.9 percent for flood region 2, and 92.4 to 96.0 percent for flood region 3. The regression equations are applicable only to stream sites in Iowa with flows not significantly affected by regulation, diversion, channelization, backwater, or urbanization and with basin characteristics within the range of those used to develop the equations. These regression equations will be implemented within the U.S. Geological Survey StreamStats Web-based geographic information system tool. StreamStats allows users to click on any ungaged site on a river and compute estimates of the eight selected statistics; in addition, 90-percent prediction intervals and the measured basin characteristics for the ungaged sites also are provided by the Web-based tool. StreamStats also allows users to click on any streamgage in Iowa and estimates computed for these eight selected statistics are provided for the streamgage.
Estimating the magnitude and frequency of floods in urban basins in Missouri
Southard, Rodney E.
2010-01-01
Streamgage flood-frequency analyses were done for 35 streamgages on urban streams in and adjacent to Missouri for estimation of the magnitude and frequency of floods in urban areas of Missouri. A log-Pearson Type-III distribution was fitted to the annual series of peak flow data retrieved from the U.S. Geological Survey National Water Information System. For this report, the flood frequency estimates are expressed in terms of percent annual exceedance probabilities of 50, 20, 10, 4, 2, 1, and 0.2. Of the 35 streamgages, 30 are located in Missouri. The remaining five non-Missouri streamgages were added to the dataset to improve the range and applicability of the regression analyses from the streamgage frequency analyses. Ordinary least-squares was used to determine the best set of independent variables for the regression equations. Basin characteristics selected for independent variables into the ordinary least-squares regression analyses were based on theoretical relation to flood flows, literature review of possible basin characteristics, and the ability to measure the basin characteristics using digital datasets and geographic information system technology. Results of the ordinary least-squares were evaluated on the basis of Mallow's Cp statistic, the adjusted coefficient of determination, and the statistical significance of the independent variables. The independent variables of drainage area and percent impervious area were determined to be statistically significant and readily determined from existing digital datasets. The drainage area variable was computed using the best elevation data available, either from a statewide 10-meter grid or high-resolution elevation data from urban areas. The impervious area variable was computed from the National Land Cover Dataset 2001 impervious area dataset. The National Land Cover Dataset 2001 impervious area data for each basin was compared to historical imagery and 7.5-minute topographic maps to verify the national dataset represented the urbanization of the basin at the time streamgage data were collected. Eight streamgages had less urbanization during the period of time streamflow data were collected than was shown on the 2001 dataset. The impervious area values for these eight urban basins were adjusted downward as much as 23 percent to account for the additional urbanization since the streamflow data were collected. Weighted least-squares regression techniques were used to determine the final regression equations for the statewide urban flood-frequency equations. Weighted least-squares techniques improve regression equations by adjusting for different and varying lengths in streamflow records. The final flood-frequency equations for the 50-, 20-, 10-, 4-, 2-, 1-, and 0.2-percent annual exceedance probability floods for Missouri provide a technique for estimating peak flows on urban streams at gaged and ungaged sites. The applicability of the equations is limited by the range in basin characteristics used to develop the regression equations. The range in drainage area is 0.28 to 189 square miles; range in impervious area is 2.3 to 46.0 percent. Seven of the 35 selected streamgages were used to compare the results of the existing rural and urban equations to the urban equations presented in this report for the 1-percent annual exceedance probability. Results of the comparison indicate that the estimated peak flows for the urban equation in this report ranged from 3 to 52 percent higher than the results from the rural equations. Comparing the estimated urban peak flows from this report to the existing urban equation developed in 1986 indicated the range was 255 percent lower to 10 percent higher. The overall comparison between the current (2010) and 1986 urban equations indicates a reduction in estimated peak flow values for the 1-percent annual exceedance probability flood.
Kohn, Michael S.; Stevens, Michael R.; Harden, Tessa M.; Godaire, Jeanne E.; Klinger, Ralph E.; Mommandi, Amanullah
2016-09-09
The U.S. Geological Survey (USGS), in cooperation with the Colorado Department of Transportation, developed regional-regression equations for estimating the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, 0.2-percent annual exceedance-probability discharge (AEPD) for natural streamflow in eastern Colorado. A total of 188 streamgages, consisting of 6,536 years of record and a mean of approximately 35 years of record per streamgage, were used to develop the peak-streamflow regional-regression equations. The estimated AEPDs for each streamgage were computed using the USGS software program PeakFQ. The AEPDs were determined using systematic data through water year 2013. Based on previous studies conducted in Colorado and neighboring States and on the availability of data, 72 characteristics (57 basin and 15 climatic characteristics) were evaluated as candidate explanatory variables in the regression analysis. Paleoflood and non-exceedance bound ages were established based on reconnaissance-level methods. Multiple lines of evidence were used at each streamgage to arrive at a conclusion (age estimate) to add a higher degree of certainty to reconnaissance-level estimates. Paleoflood or nonexceedance bound evidence was documented at 41 streamgages, and 3 streamgages had previously collected paleoflood data.To determine the peak discharge of a paleoflood or non-exceedanc bound, two different hydraulic models were used.The mean standard error of prediction (SEP) for all 8 AEPDs was reduced approximately 25 percent compared to the previous flood-frequency study. For paleoflood data to be effective in reducing the SEP in eastern Colorado, a larger ratio than 44 of 188 (23 percent) streamgages would need paleoflood data and that paleoflood data would need to increase the record length by more than 25 years for the 1-percent AEPD. The greatest reduction in SEP for the peak-streamflow regional-regression equations was observed when additional new basin characteristics were included in the peak-streamflow regional-regression equations and when eastern Colorado was divided into two separate hydrologic regions. To make further reductions in the uncertainties of the peak-streamflow regional-regression equations in the Foothills and Plains hydrologic regions, additional streamgages or crest-stage gages are needed to collect peak-streamflow data on natural streams in eastern Colorado.Generalized-Least Squares regression was used to compute the final peak-streamflow regional-regression equations for peak-streamflow. Dividing eastern Colorado into two new individual regions at –104° longitude resulted in peak-streamflow regional-regression equations with the smallest SEP. The new hydrologic region located between –104° longitude and the Kansas-Nebraska State line will be designated the Plains hydrologic region and the hydrologic region comprising the rest of eastern Colorado located west of the –104° longitude and east of the Rocky Mountains and below 7,500 feet in the South Platte River Basin and below 9,000 feet in the Arkansas River Basin will be designated the Foothills hydrologic region.
Sanford, Ward E.; Nelms, David L.; Pope, Jason P.; Selnick, David L.
2012-01-01
This study by the U.S. Geological Survey, prepared in cooperation with the Virginia Department of Environmental Quality, quantifies the components of the hydrologic cycle across the Commonwealth of Virginia. Long-term, mean fluxes were calculated for precipitation, surface runoff, infiltration, total evapotranspiration (ET), riparian ET, recharge, base flow (or groundwater discharge) and net total outflow. Fluxes of these components were first estimated on a number of real-time-gaged watersheds across Virginia. Specific conductance was used to distinguish and separate surface runoff from base flow. Specific-conductance data were collected every 15 minutes at 75 real-time gages for approximately 18 months between March 2007 and August 2008. Precipitation was estimated for 1971–2000 using PRISM climate data. Precipitation and temperature from the PRISM data were used to develop a regression-based relation to estimate total ET. The proportion of watershed precipitation that becomes surface runoff was related to physiographic province and rock type in a runoff regression equation. Component flux estimates from the watersheds were transferred to flux estimates for counties and independent cities using the ET and runoff regression equations. Only 48 of the 75 watersheds yielded sufficient data, and data from these 48 were used in the final runoff regression equation. The base-flow proportion for the 48 watersheds averaged 72 percent using specific conductance, a value that was substantially higher than the 61 percent average calculated using a graphical-separation technique (the USGS program PART). Final results for the study are presented as component flux estimates for all counties and independent cities in Virginia.
Annual peak streamflow and ancillary data for small watersheds in central and western Texas
Harwell, Glenn R.; Asquith, William H.
2011-01-01
Estimates of annual peak-streamflow frequency are needed for flood-plain management, assessment of flood risk, and design of structures, such as roads, bridges, culverts, dams, and levees. Regional regression equations have been developed and are used extensively to estimate annual peak-streamflow frequency for ungaged sites in natural (unregulated and rural or nonurbanized) watersheds in Texas (Asquith and Slade, 1997; Asquith and Thompson, 2008; Asquith and Roussel, 2009). The most recent regional regression equations were developed by using data from 638 Texas streamflow-gaging stations throughout the State with eight or more years of data by using drainage area, channel slope, and mean annual precipitation as predictor variables (Asquith and Roussel, 2009). However, because of a lack of sufficient historical streamflow data from small, rural watersheds in certain parts of the State (central and western), substantial uncertainity exists when using the regional regression equations for the purpose of estimating annual peak-streamflow frequency.
Demidenko, Eugene
2017-09-01
The exact density distribution of the nonlinear least squares estimator in the one-parameter regression model is derived in closed form and expressed through the cumulative distribution function of the standard normal variable. Several proposals to generalize this result are discussed. The exact density is extended to the estimating equation (EE) approach and the nonlinear regression with an arbitrary number of linear parameters and one intrinsically nonlinear parameter. For a very special nonlinear regression model, the derived density coincides with the distribution of the ratio of two normally distributed random variables previously obtained by Fieller (1932), unlike other approximations previously suggested by other authors. Approximations to the density of the EE estimators are discussed in the multivariate case. Numerical complications associated with the nonlinear least squares are illustrated, such as nonexistence and/or multiple solutions, as major factors contributing to poor density approximation. The nonlinear Markov-Gauss theorem is formulated based on the near exact EE density approximation.
Minute ventilation of cyclists, car and bus passengers: an experimental study.
Zuurbier, Moniek; Hoek, Gerard; van den Hazel, Peter; Brunekreef, Bert
2009-10-27
Differences in minute ventilation between cyclists, pedestrians and other commuters influence inhaled doses of air pollution. This study estimates minute ventilation of cyclists, car and bus passengers, as part of a study on health effects of commuters' exposure to air pollutants. Thirty-four participants performed a submaximal test on a bicycle ergometer, during which heart rate and minute ventilation were measured simultaneously at increasing cycling intensity. Individual regression equations were calculated between heart rate and the natural log of minute ventilation. Heart rates were recorded during 280 two hour trips by bicycle, bus and car and were calculated into minute ventilation levels using the individual regression coefficients. Minute ventilation during bicycle rides were on average 2.1 times higher than in the car (individual range from 1.3 to 5.3) and 2.0 times higher than in the bus (individual range from 1.3 to 5.1). The ratio of minute ventilation of cycling compared to travelling by bus or car was higher in women than in men. Substantial differences in regression equations were found between individuals. The use of individual regression equations instead of average regression equations resulted in substantially better predictions of individual minute ventilations. The comparability of the gender-specific overall regression equations linking heart rate and minute ventilation with one previous American study, supports that for studies on the group level overall equations can be used. For estimating individual doses, the use of individual regression coefficients provides more precise data. Minute ventilation levels of cyclists are on average two times higher than of bus and car passengers, consistent with the ratio found in one small previous study of young adults. The study illustrates the importance of inclusion of minute ventilation data in comparing air pollution doses between different modes of transport.
Magnitude and Frequency of Floods for Urban and Small Rural Streams in Georgia, 2008
Gotvald, Anthony J.; Knaak, Andrew E.
2011-01-01
A study was conducted that updated methods for estimating the magnitude and frequency of floods in ungaged urban basins in Georgia that are not substantially affected by regulation or tidal fluctuations. Annual peak-flow data for urban streams from September 2008 were analyzed for 50 streamgaging stations (streamgages) in Georgia and 6 streamgages on adjacent urban streams in Florida and South Carolina having 10 or more years of data. Flood-frequency estimates were computed for the 56 urban streamgages by fitting logarithms of annual peak flows for each streamgage to a Pearson Type III distribution. Additionally, basin characteristics for the streamgages were computed by using a geographical information system and computer algorithms. Regional regression analysis, using generalized least-squares regression, was used to develop a set of equations for estimating flows with 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities for ungaged urban basins in Georgia. In addition to the 56 urban streamgages, 171 rural streamgages were included in the regression analysis to maintain continuity between flood estimates for urban and rural basins as the basin characteristics pertaining to urbanization approach zero. Because 21 of the rural streamgages have drainage areas less than 1 square mile, the set of equations developed for this study can also be used for estimating small ungaged rural streams in Georgia. Flood-frequency estimates and basin characteristics for 227 streamgages were combined to form the final database used in the regional regression analysis. Four hydrologic regions were developed for Georgia. The final equations are functions of drainage area and percentage of impervious area for three of the regions and drainage area, percentage of developed land, and mean basin slope for the fourth region. Average standard errors of prediction for these regression equations range from 20.0 to 74.5 percent.
Effects of Employing Ridge Regression in Structural Equation Models.
ERIC Educational Resources Information Center
McQuitty, Shaun
1997-01-01
LISREL 8 invokes a ridge option when maximum likelihood or generalized least squares are used to estimate a structural equation model with a nonpositive definite covariance or correlation matrix. Implications of the ridge option for model fit, parameter estimates, and standard errors are explored through two examples. (SLD)
Senior, Lisa A.
2017-09-15
Several streams used for recreational activities, such as fishing, swimming, and boating, in Chester County, Pennsylvania, are known to have periodic elevated concentrations of fecal coliform bacteria, a type of bacteria used to indicate the potential presence of fecally related pathogens that may pose health risks to humans exposed through water contact. The availability of near real-time continuous stream discharge, turbidity, and other water-quality data for some streams in the county presents an opportunity to use surrogates to estimate near real-time concentrations of fecal coliform (FC) bacteria and thus provide some information about associated potential health risks during recreational use of streams.The U.S. Geological Survey (USGS), in cooperation with the Chester County Health Department (CCHD) and the Chester County Water Resources Authority (CCWRA), has collected discrete stream samples for analysis of FC concentrations during March–October annually at or near five gaging stations where near real-time continuous data on stream discharge, turbidity, and water temperature have been collected since 2007 (or since 2012 at 2 of the 5 stations). In 2014, the USGS, in cooperation with the CCWRA and CCHD, began to develop regression equations to estimate FC concentrations using available near real-time continuous data. Regression equations included possible explanatory variables of stream discharge, turbidity, water temperature, and seasonal factors calculated using Julian Day with base-10 logarithmic (log) transformations of selected variables.The regression equations were developed using the data from 2007 to 2015 (101–106 discrete bacteria samples per site) for three gaging stations on Brandywine Creek (West Branch Brandywine Creek at Modena, East Branch Brandywine Creek below Downingtown, and Brandywine Creek at Chadds Ford) and from 2012 to 2015 (37–38 discrete bacteria samples per site) for one station each on French Creek near Phoenixville and White Clay Creek near Strickersville. Fecal coliform bacteria data collected by USGS in 2016 (about nine samples per site) were used to validate the equations. The best-fit regression equations included log turbidity and seasonality factors computed using Julian Day as explanatory variables to estimate log FC concentrations at all five stream sites. The adjusted coefficient of determination for the equations ranged from 0.61 to 0.76, with the strength of the regression equations likely affected in part by the limited amount and variability of FC bacteria data. During summer months, the estimated and measured FC concentrations commonly were greater than the Pennsylvania Department of Environmental Protection established standards of 200 and 400 colonies per 100 milliliters for water contact from May through September at the 5 stream sites, with concentrations typically higher at 2 sites (White Clay Creek and West Branch Brandywine Creek at Modena) than at the other 3 sites. The estimated concentrations of FC bacteria during the summer months commonly were higher than measured concentrations and therefore could be considered cautious estimates of potential human-health risk. Additional water-quality data are needed to maintain and (or) improve the ability of regression equations to estimate FC concentrations by use of surrogate data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Narlesky, Joshua Edward; Kelly, Elizabeth J.
2015-09-10
This report documents the new PG calibration regression equation. These calibration equations incorporate new data that have become available since revision 1 of “A Calibration to Predict the Concentrations of Impurities in Plutonium Oxide by Prompt Gamma Analysis” was issued [3] The calibration equations are based on a weighted least squares (WLS) approach for the regression. The WLS method gives each data point its proper amount of influence over the parameter estimates. This gives two big advantages, more precise parameter estimates and better and more defensible estimates of uncertainties. The WLS approach makes sense both statistically and experimentally because themore » variances increase with concentration, and there are physical reasons that the higher measurements are less reliable and should be less influential. The new magnesium calibration includes a correction for sodium and separate calibration equation for items with and without chlorine. These additional calibration equations allow for better predictions and smaller uncertainties for sodium in materials with and without chlorine. Chlorine and sodium have separate equations for RICH materials. Again, these equations give better predictions and smaller uncertainties chlorine and sodium for RICH materials.« less
Equations for estimating bankfull channel geometry and discharge for streams in Massachusetts
Bent, Gardner C.; Waite, Andrew M.
2013-01-01
Regression equations were developed for estimating bankfull geometry—width, mean depth, cross-sectional area—and discharge for streams in Massachusetts. The equations provide water-resource and conservation managers with methods for estimating bankfull characteristics at specific stream sites in Massachusetts. This information can be used for the adminstration of the Commonwealth of Massachusetts Rivers Protection Act of 1996, which establishes a protected riverfront area extending from the mean annual high-water line corresponding to the elevation of bankfull discharge along each side of a perennial stream. Additionally, information on bankfull channel geometry and discharge are important to Federal, State, and local government agencies and private organizations involved in stream assessment and restoration projects. Regression equations are based on data from stream surveys at 33 sites (32 streamgages and 1 crest-stage gage operated by the U.S. Geological Survey) in and near Massachusetts. Drainage areas of the 33 sites ranged from 0.60 to 329 square miles (mi2). At 27 of the 33 sites, field data were collected and analyses were done to determine bankfull channel geometry and discharge as part of the present study. For 6 of the 33 sites, data on bankfull channel geometry and discharge were compiled from other studies done by the U.S. Geological Survey, Natural Resources Conservation Service of the U.S. Department of Agriculture, and the Vermont Department of Environmental Conservation. Similar techniques were used for field data collection and analysis for bankfull channel geometry and discharge at all 33 sites. Recurrence intervals of the bankfull discharge, which represent the frequency with which a stream fills its channel, averaged 1.53 years (median value 1.34 years) at the 33 sites. Simple regression equations were developed for bankfull width, mean depth, cross-sectional area, and discharge using drainage area, which is the most significant explanatory variable in estimating these bankfull characteristics. The use of drainage area as an explanatory variable is also the most commonly published method for estimating these bankfull characteristics. Regional curves (graphic plots) of bankfull channel geometry and discharge by drainage area are presented. The regional curves are based on the simple regression equations and can be used to estimate bankfull characteristics from drainage area. Multiple regression analysis, which includes basin characteristics in addition to drainage area, also was used to develop equations. Variability in bankfull width, mean depth, cross-sectional area, and discharge was more fully explained by the multiple regression equations that include mean-basin slope and drainage area than was explained by equations based on drainage area alone. The Massachusetts regional curves and equations developed in this study are similar, in terms of values of slopes and intercepts, to those developed for other parts of the northeastern United States. Limitations associated with site selection and development of the equations resulted in some constraints for the application of equations and regional curves presented in this report. The curves and equations are applicable to stream sites that have (1) less than about 25 percent of their drainage basin area occupied by urban land use (commercial, industrial, transportation, and high-density residential), (2) little to no streamflow regulation, especially from flood-control structures, (3) drainage basin areas greater than 0.60 mi2 and less than 329 mi2, and (4) a mean basin slope greater than 2.2 percent and less than 23.9 percent. The equations may not be applicable where streams flow through extensive wetlands. The equations also may not apply in areas of Cape Cod and the Islands and the area of southeastern Massachusetts close to Cape Cod with extensive areas of coarse-grained glacial deposits where none of the study sites are located. Regardless of the setting, the regression equations are not intended for use as the sole method of estimating bankfull characteristics; however, they may supplement field identification of the bankfull channel when used in conjunction with field verified bankfull indicators, flood-frequency analysis, or other supporting evidence.
Over, Thomas M.; Saito, Riki J.; Veilleux, Andrea G.; Sharpe, Jennifer B.; Soong, David T.; Ishii, Audrey L.
2016-06-28
This report provides two sets of equations for estimating peak discharge quantiles at annual exceedance probabilities (AEPs) of 0.50, 0.20, 0.10, 0.04, 0.02, 0.01, 0.005, and 0.002 (recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years, respectively) for watersheds in Illinois based on annual maximum peak discharge data from 117 watersheds in and near northeastern Illinois. One set of equations was developed through a temporal analysis with a two-step least squares-quantile regression technique that measures the average effect of changes in the urbanization of the watersheds used in the study. The resulting equations can be used to adjust rural peak discharge quantiles for the effect of urbanization, and in this study the equations also were used to adjust the annual maximum peak discharges from the study watersheds to 2010 urbanization conditions.The other set of equations was developed by a spatial analysis. This analysis used generalized least-squares regression to fit the peak discharge quantiles computed from the urbanization-adjusted annual maximum peak discharges from the study watersheds to drainage-basin characteristics. The peak discharge quantiles were computed by using the Expected Moments Algorithm following the removal of potentially influential low floods defined by a multiple Grubbs-Beck test. To improve the quantile estimates, regional skew coefficients were obtained from a newly developed regional skew model in which the skew increases with the urbanized land use fraction. The drainage-basin characteristics used as explanatory variables in the spatial analysis include drainage area, the fraction of developed land, the fraction of land with poorly drained soils or likely water, and the basin slope estimated as the ratio of the basin relief to basin perimeter.This report also provides the following: (1) examples to illustrate the use of the spatial and urbanization-adjustment equations for estimating peak discharge quantiles at ungaged sites and to improve flood-quantile estimates at and near a gaged site; (2) the urbanization-adjusted annual maximum peak discharges and peak discharge quantile estimates at streamgages from 181 watersheds including the 117 study watersheds and 64 additional watersheds in the study region that were originally considered for use in the study but later deemed to be redundant.The urbanization-adjustment equations, spatial regression equations, and peak discharge quantile estimates developed in this study will be made available in the web application StreamStats, which provides automated regression-equation solutions for user-selected stream locations. Figures and tables comparing the observed and urbanization-adjusted annual maximum peak discharge records by streamgage are provided at https://doi.org/10.3133/sir20165050 for download.
Verochana, Karune; Prapayasatok, Sangsom; Mahasantipiya, Phattaranant May; Korwanich, Narumanas
2016-01-01
Purpose This study assessed the accuracy of age estimates produced by a regression equation derived from lower third molar development in a Thai population. Materials and Methods The first part of this study relied on measurements taken from panoramic radiographs of 614 Thai patients aged from 9 to 20. The stage of lower left and right third molar development was observed in each radiograph and a modified Gat score was assigned. Linear regression on this data produced the following equation: Y=9.309+1.673 mG+0.303S (Y=age; mG=modified Gat score; S=sex). In the second part of this study, the predictive accuracy of this equation was evaluated using data from a second set of panoramic radiographs (539 Thai subjects, 9 to 24 years old). Each subject's age was estimated using the above equation and compared against age calculated from a provided date of birth. Estimated and known age data were analyzed using the Pearson correlation coefficient and descriptive statistics. Results Ages estimated from lower left and lower right third molar development stage were significantly correlated with the known ages (r=0.818, 0.808, respectively, P≤0.01). 50% of age estimates in the second part of the study fell within a range of error of ±1 year, while 75% fell within a range of error of ±2 years. The study found that the equation tends to estimate age accurately when individuals are 9 to 20 years of age. Conclusion The equation can be used for age estimation for Thai populations when the individuals are 9 to 20 years of age. PMID:27051633
Verochana, Karune; Prapayasatok, Sangsom; Janhom, Apirum; Mahasantipiya, Phattaranant May; Korwanich, Narumanas
2016-03-01
This study assessed the accuracy of age estimates produced by a regression equation derived from lower third molar development in a Thai population. The first part of this study relied on measurements taken from panoramic radiographs of 614 Thai patients aged from 9 to 20. The stage of lower left and right third molar development was observed in each radiograph and a modified Gat score was assigned. Linear regression on this data produced the following equation: Y=9.309+1.673 mG+0.303S (Y=age; mG=modified Gat score; S=sex). In the second part of this study, the predictive accuracy of this equation was evaluated using data from a second set of panoramic radiographs (539 Thai subjects, 9 to 24 years old). Each subject's age was estimated using the above equation and compared against age calculated from a provided date of birth. Estimated and known age data were analyzed using the Pearson correlation coefficient and descriptive statistics. Ages estimated from lower left and lower right third molar development stage were significantly correlated with the known ages (r=0.818, 0.808, respectively, P≤0.01). 50% of age estimates in the second part of the study fell within a range of error of ±1 year, while 75% fell within a range of error of ±2 years. The study found that the equation tends to estimate age accurately when individuals are 9 to 20 years of age. The equation can be used for age estimation for Thai populations when the individuals are 9 to 20 years of age.
Regional equations for estimation of peak-streamflow frequency for natural basins in Texas
Asquith, William H.; Slade, Raymond M.
1997-01-01
Peak-streamflow frequency for 559 Texas stations with natural (unregulated and rural or nonurbanized) basins was estimated with annual peak-streamflow data through 1993. The peak-streamflow frequency and drainage-basin characteristics for the Texas stations were used to develop 16 sets of equations to estimate peak-streamflow frequency for ungaged natural stream sites in each of 11 regions in Texas. The relation between peak-streamflow frequency and contributing drainage area for 5 of the 11 regions is curvilinear, requiring that one set of equations be developed for drainage areas less than 32 square miles and another set be developed for drainage areas greater than 32 square miles. These equations, developed through multiple-regression analysis using weighted least squares, are based on the relation between peak-streamflow frequency and basin characteristics for streamflow-gaging stations. The regions represent areas with similar flood characteristics. The use and limitations of the regression equations also are discussed. Additionally, procedures are presented to compute the 50-, 67-, and 90-percent confidence limits for any estimation from the equations. Also, supplemental peak-streamflow frequency and basin characteristics for 105 selected stations bordering Texas are included in the report. This supplemental information will aid in interpretation of flood characteristics for sites near the state borders of Texas.
Techniques for estimating magnitude and frequency of peak flows for Pennsylvania streams
Stuckey, Marla H.; Reed, Lloyd A.
2000-01-01
Regression equations for estimating the magnitude and frequency of floods on ungaged streams in Pennsylvania with drainage areas less that 2,000 square miles were developed on the basis of peak-flow data collected at 313 streamflow-gaging stations. All streamflow-gaging stations used in the development of the equations had 10 or more years of record and include active and discontinued continuous-record and crest-stage partial-record streamflow-gaging stations. Regional regression equations were developed for flood flows expected every 10, 25, 50, 100, and 500 years by the use of a weighted multiple linear regression model.The State was divided into two regions. The largest region, Region A, encompasses about 78 percent of Pennsylvania. The smaller region, Region B, includes only the northwestern part of the State. Basin characteristics used in the regression equations for Region A are drainage area, percentage of forest cover, percentage of urban development, percentage of basin underlain by carbonate bedrock, and percentage of basin controlled by lakes, swamps, and reservoirs. Basin characteristics used in the regression equations for Region B are drainage area and percentage of basin controlled by lakes, swamps, and reservoirs. The coefficient of determination (R2) values for the five flood-frequency equations for Region A range from 0.93 to 0.82, and for Region B, the range is from 0.96 to 0.89.While the regression equations can be used to predict the magnitude and frequency of peak flows for most streams in the State, they should not be used for streams with drainage areas greater than 2,000 square miles or less than 1.5 square miles, for streams that drain extensively mined areas, or for stream reaches immediately below flood-control reservoirs. In addition, the equations presented for Region B should not be used if the stream drains a basin with more than 5 percent urban development.
Gómez Campos, Rossana; Pacheco Carrillo, Jaime; Almonacid Fierro, Alejandro; Urra Albornoz, Camilo; Cossío-Bolaños, Marco
2018-03-01
(i) To propose regression equations based on anthropometric measures to estimate fat mass (FM) using dual energy X-ray absorptiometry (DXA) as reference method, and (ii)to establish population reference standards for equation-derived FM. A cross-sectional study on 6,713 university students (3,354 males and 3,359 females) from Chile aged 17.0 to 27.0years. Anthropometric measures (weight, height, waist circumference) were taken in all participants. Whole body DXA was performed in 683 subjects. A total of 478 subjects were selected to develop regression equations, and 205 for their cross-validation. Data from 6,030 participants were used to develop reference standards for FM. Equations were generated using stepwise multiple regression analysis. Percentiles were developed using the LMS method. Equations for men were: (i) FM=-35,997.486 +232.285 *Weight +432.216 *CC (R 2 =0.73, SEE=4.1); (ii)FM=-37,671.303 +309.539 *Weight +66,028.109 *ICE (R2=0.76, SEE=3.8), while equations for women were: (iii)FM=-13,216.917 +461,302 *Weight+91.898 *CC (R 2 =0.70, SEE=4.6), and (iv) FM=-14,144.220 +464.061 *Weight +16,189.297 *ICE (R 2 =0.70, SEE=4.6). Percentiles proposed included p10, p50, p85, and p95. The developed equations provide valid and accurate estimation of FM in both sexes. The values obtained using the equations may be analyzed from percentiles that allow for categorizing body fat levels by age and sex. Copyright © 2017 SEEN y SED. Publicado por Elsevier España, S.L.U. All rights reserved.
Flood quantile estimation at ungauged sites by Bayesian networks
NASA Astrophysics Data System (ADS)
Mediero, L.; Santillán, D.; Garrote, L.
2012-04-01
Estimating flood quantiles at a site for which no observed measurements are available is essential for water resources planning and management. Ungauged sites have no observations about the magnitude of floods, but some site and basin characteristics are known. The most common technique used is the multiple regression analysis, which relates physical and climatic basin characteristic to flood quantiles. Regression equations are fitted from flood frequency data and basin characteristics at gauged sites. Regression equations are a rigid technique that assumes linear relationships between variables and cannot take the measurement errors into account. In addition, the prediction intervals are estimated in a very simplistic way from the variance of the residuals in the estimated model. Bayesian networks are a probabilistic computational structure taken from the field of Artificial Intelligence, which have been widely and successfully applied to many scientific fields like medicine and informatics, but application to the field of hydrology is recent. Bayesian networks infer the joint probability distribution of several related variables from observations through nodes, which represent random variables, and links, which represent causal dependencies between them. A Bayesian network is more flexible than regression equations, as they capture non-linear relationships between variables. In addition, the probabilistic nature of Bayesian networks allows taking the different sources of estimation uncertainty into account, as they give a probability distribution as result. A homogeneous region in the Tagus Basin was selected as case study. A regression equation was fitted taking the basin area, the annual maximum 24-hour rainfall for a given recurrence interval and the mean height as explanatory variables. Flood quantiles at ungauged sites were estimated by Bayesian networks. Bayesian networks need to be learnt from a huge enough data set. As observational data are reduced, a stochastic generator of synthetic data was developed. Synthetic basin characteristics were randomised, keeping the statistical properties of observed physical and climatic variables in the homogeneous region. The synthetic flood quantiles were stochastically generated taking the regression equation as basis. The learnt Bayesian network was validated by the reliability diagram, the Brier Score and the ROC diagram, which are common measures used in the validation of probabilistic forecasts. Summarising, the flood quantile estimations through Bayesian networks supply information about the prediction uncertainty as a probability distribution function of discharges is given as result. Therefore, the Bayesian network model has application as a decision support for water resources and planning management.
Improving precision of glomerular filtration rate estimating model by ensemble learning.
Liu, Xun; Li, Ningshan; Lv, Linsheng; Fu, Yongmei; Cheng, Cailian; Wang, Caixia; Ye, Yuqiu; Li, Shaomin; Lou, Tanqi
2017-11-09
Accurate assessment of kidney function is clinically important, but estimates of glomerular filtration rate (GFR) by regression are imprecise. We hypothesized that ensemble learning could improve precision. A total of 1419 participants were enrolled, with 1002 in the development dataset and 417 in the external validation dataset. GFR was independently estimated from age, sex and serum creatinine using an artificial neural network (ANN), support vector machine (SVM), regression, and ensemble learning. GFR was measured by 99mTc-DTPA renal dynamic imaging calibrated with dual plasma sample 99mTc-DTPA GFR. Mean measured GFRs were 70.0 ml/min/1.73 m 2 in the developmental and 53.4 ml/min/1.73 m 2 in the external validation cohorts. In the external validation cohort, precision was better in the ensemble model of the ANN, SVM and regression equation (IQR = 13.5 ml/min/1.73 m 2 ) than in the new regression model (IQR = 14.0 ml/min/1.73 m 2 , P < 0.001). The precision of ensemble learning was the best of the three models, but the models had similar bias and accuracy. The median difference ranged from 2.3 to 3.7 ml/min/1.73 m 2 , 30% accuracy ranged from 73.1 to 76.0%, and P was > 0.05 for all comparisons of the new regression equation and the other new models. An ensemble learning model including three variables, the average ANN, SVM, and regression equation values, was more precise than the new regression model. A more complex ensemble learning strategy may further improve GFR estimates.
Cuesta-Vargas, Antonio I; González-Sánchez, Manuel
2014-03-01
Currently, there are no studies combining electromyography (EMG) and sonography to estimate the absolute and relative strength values of erector spinae (ES) muscles in healthy individuals. The purpose of this study was to establish whether the maximum voluntary contraction (MVC) of the ES during isometric contractions could be predicted from the changes in surface EMG as well as in fiber pennation and thickness as measured by sonography. Thirty healthy adults performed 3 isometric extensions at 45° from the vertical to calculate the MVC force. Contractions at 33% and 100% of the MVC force were then used during sonographic and EMG recordings. These measurements were used to observe the architecture and function of the muscles during contraction. Statistical analysis was performed using bivariate regression and regression equations. The slope for each regression equation was statistically significant (P < .001) with R(2) values of 0.837 and 0.986 for the right and left ES, respectively. The standard error estimate between the sonographic measurements and the regression-estimated pennation angles for the right and left ES were 0.10 and 0.02, respectively. Erector spinae muscle activation can be predicted from the changes in fiber pennation during isometric contractions at 33% and 100% of the MVC force. These findings could be essential for developing a regression equation that could estimate the level of muscle activation from changes in the muscle architecture.
August Median Streamflow on Ungaged Streams in Eastern Aroostook County, Maine
Lombard, Pamela J.; Tasker, Gary D.; Nielsen, Martha G.
2003-01-01
Methods for estimating August median streamflow were developed for ungaged, unregulated streams in the eastern part of Aroostook County, Maine, with drainage areas from 0.38 to 43 square miles and mean basin elevations from 437 to 1,024 feet. Few long-term, continuous-record streamflow-gaging stations with small drainage areas were available from which to develop the equations; therefore, 24 partial-record gaging stations were established in this investigation. A mathematical technique for estimating a standard low-flow statistic, August median streamflow, at partial-record stations was applied by relating base-flow measurements at these stations to concurrent daily flows at nearby long-term, continuous-record streamflow- gaging stations (index stations). Generalized least-squares regression analysis (GLS) was used to relate estimates of August median streamflow at gaging stations to basin characteristics at these same stations to develop equations that can be applied to estimate August median streamflow on ungaged streams. GLS accounts for varying periods of record at the gaging stations and the cross correlation of concurrent streamflows among gaging stations. Twenty-three partial-record stations and one continuous-record station were used for the final regression equations. The basin characteristics of drainage area and mean basin elevation are used in the calculated regression equation for ungaged streams to estimate August median flow. The equation has an average standard error of prediction from -38 to 62 percent. A one-variable equation uses only drainage area to estimate August median streamflow when less accuracy is acceptable. This equation has an average standard error of prediction from -40 to 67 percent. Model error is larger than sampling error for both equations, indicating that additional basin characteristics could be important to improved estimates of low-flow statistics. Weighted estimates of August median streamflow, which can be used when making estimates at partial-record or continuous-record gaging stations, range from 0.03 to 11.7 cubic feet per second or from 0.1 to 0.4 cubic feet per second per square mile. Estimates of August median streamflow on ungaged streams in the eastern part of Aroostook County, within the range of acceptable explanatory variables, range from 0.03 to 30 cubic feet per second or 0.1 to 0.7 cubic feet per second per square mile. Estimates of August median streamflow per square mile of drainage area generally increase as mean elevation and drainage area increase.
Feaster, Toby D.; Gotvald, Anthony J.; Weaver, J. Curtis
2014-01-01
Reliable estimates of the magnitude and frequency of floods are essential for the design of transportation and water-conveyance structures, flood-insurance studies, and flood-plain management. Such estimates are particularly important in densely populated urban areas. In order to increase the number of streamflow-gaging stations (streamgages) available for analysis, expand the geographical coverage that would allow for application of regional regression equations across State boundaries, and build on a previous flood-frequency investigation of rural U.S Geological Survey streamgages in the Southeast United States, a multistate approach was used to update methods for determining the magnitude and frequency of floods in urban and small, rural streams that are not substantially affected by regulation or tidal fluctuations in Georgia, South Carolina, and North Carolina. The at-site flood-frequency analysis of annual peak-flow data for urban and small, rural streams (through September 30, 2011) included 116 urban streamgages and 32 small, rural streamgages, defined in this report as basins draining less than 1 square mile. The regional regression analysis included annual peak-flow data from an additional 338 rural streamgages previously included in U.S. Geological Survey flood-frequency reports and 2 additional rural streamgages in North Carolina that were not included in the previous Southeast rural flood-frequency investigation for a total of 488 streamgages included in the urban and small, rural regression analysis. The at-site flood-frequency analyses for the urban and small, rural streamgages included the expected moments algorithm, which is a modification of the Bulletin 17B log-Pearson type III method for fitting the statistical distribution to the logarithms of the annual peak flows. Where applicable, the flood-frequency analysis also included low-outlier and historic information. Additionally, the application of a generalized Grubbs-Becks test allowed for the detection of multiple potentially influential low outliers. Streamgage basin characteristics were determined using geographical information system techniques. Initial ordinary least squares regression simulations reduced the number of basin characteristics on the basis of such factors as statistical significance, coefficient of determination, Mallow’s Cp statistic, and ease of measurement of the explanatory variable. Application of generalized least squares regression techniques produced final predictive (regression) equations for estimating the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probability flows for urban and small, rural ungaged basins for three hydrologic regions (HR1, Piedmont–Ridge and Valley; HR3, Sand Hills; and HR4, Coastal Plain), which previously had been defined from exploratory regression analysis in the Southeast rural flood-frequency investigation. Because of the limited availability of urban streamgages in the Coastal Plain of Georgia, South Carolina, and North Carolina, additional urban streamgages in Florida and New Jersey were used in the regression analysis for this region. Including the urban streamgages in New Jersey allowed for the expansion of the applicability of the predictive equations in the Coastal Plain from 3.5 to 53.5 square miles. Average standard error of prediction for the predictive equations, which is a measure of the average accuracy of the regression equations when predicting flood estimates for ungaged sites, range from 25.0 percent for the 10-percent annual exceedance probability regression equation for the Piedmont–Ridge and Valley region to 73.3 percent for the 0.2-percent annual exceedance probability regression equation for the Sand Hills region.
Tosteson, Tor D.; Morden, Nancy E.; Stukel, Therese A.; O'Malley, A. James
2014-01-01
The estimation of treatment effects is one of the primary goals of statistics in medicine. Estimation based on observational studies is subject to confounding. Statistical methods for controlling bias due to confounding include regression adjustment, propensity scores and inverse probability weighted estimators. These methods require that all confounders are recorded in the data. The method of instrumental variables (IVs) can eliminate bias in observational studies even in the absence of information on confounders. We propose a method for integrating IVs within the framework of Cox's proportional hazards model and demonstrate the conditions under which it recovers the causal effect of treatment. The methodology is based on the approximate orthogonality of an instrument with unobserved confounders among those at risk. We derive an estimator as the solution to an estimating equation that resembles the score equation of the partial likelihood in much the same way as the traditional IV estimator resembles the normal equations. To justify this IV estimator for a Cox model we perform simulations to evaluate its operating characteristics. Finally, we apply the estimator to an observational study of the effect of coronary catheterization on survival. PMID:25506259
MacKenzie, Todd A; Tosteson, Tor D; Morden, Nancy E; Stukel, Therese A; O'Malley, A James
2014-06-01
The estimation of treatment effects is one of the primary goals of statistics in medicine. Estimation based on observational studies is subject to confounding. Statistical methods for controlling bias due to confounding include regression adjustment, propensity scores and inverse probability weighted estimators. These methods require that all confounders are recorded in the data. The method of instrumental variables (IVs) can eliminate bias in observational studies even in the absence of information on confounders. We propose a method for integrating IVs within the framework of Cox's proportional hazards model and demonstrate the conditions under which it recovers the causal effect of treatment. The methodology is based on the approximate orthogonality of an instrument with unobserved confounders among those at risk. We derive an estimator as the solution to an estimating equation that resembles the score equation of the partial likelihood in much the same way as the traditional IV estimator resembles the normal equations. To justify this IV estimator for a Cox model we perform simulations to evaluate its operating characteristics. Finally, we apply the estimator to an observational study of the effect of coronary catheterization on survival.
NASA Technical Reports Server (NTRS)
Barrett, C. A.
1985-01-01
Multiple linear regression analysis was used to determine an equation for estimating hot corrosion attack for a series of Ni base cast turbine alloys. The U transform (i.e., 1/sin (% A/100) to the 1/2) was shown to give the best estimate of the dependent variable, y. A complete second degree equation is described for the centered" weight chemistries for the elements Cr, Al, Ti, Mo, W, Cb, Ta, and Co. In addition linear terms for the minor elements C, B, and Zr were added for a basic 47 term equation. The best reduced equation was determined by the stepwise selection method with essentially 13 terms. The Cr term was found to be the most important accounting for 60 percent of the explained variability hot corrosion attack.
Regional regression equations for estimation of natural streamflow statistics in Colorado
Capesius, Joseph P.; Stephens, Verlin C.
2009-01-01
The U.S. Geological Survey (USGS), in cooperation with the Colorado Water Conservation Board and the Colorado Department of Transportation, developed regional regression equations for estimation of various streamflow statistics that are representative of natural streamflow conditions at ungaged sites in Colorado. The equations define the statistical relations between streamflow statistics (response variables) and basin and climatic characteristics (predictor variables). The equations were developed using generalized least-squares and weighted least-squares multilinear regression reliant on logarithmic variable transformation. Streamflow statistics were derived from at least 10 years of streamflow data through about 2007 from selected USGS streamflow-gaging stations in the study area that are representative of natural-flow conditions. Basin and climatic characteristics used for equation development are drainage area, mean watershed elevation, mean watershed slope, percentage of drainage area above 7,500 feet of elevation, mean annual precipitation, and 6-hour, 100-year precipitation. For each of five hydrologic regions in Colorado, peak-streamflow equations that are based on peak-streamflow data from selected stations are presented for the 2-, 5-, 10-, 25-, 50-, 100-, 200-, and 500-year instantaneous-peak streamflows. For four of the five hydrologic regions, equations based on daily-mean streamflow data from selected stations are presented for 7-day minimum 2-, 10-, and 50-year streamflows and for 7-day maximum 2-, 10-, and 50-year streamflows. Other equations presented for the same four hydrologic regions include those for estimation of annual- and monthly-mean streamflow and streamflow-duration statistics for exceedances of 10, 25, 50, 75, and 90 percent. All equations are reported along with salient diagnostic statistics, ranges of basin and climatic characteristics on which each equation is based, and commentary of potential bias, which is not otherwise removed by log-transformation of the variables of the equations from interpretation of residual plots. The predictor-variable ranges can be used to assess equation applicability for ungaged sites in Colorado.
Waltemeyer, Scott D.
2006-01-01
Estimates of the magnitude and frequency of peak discharges are necessary for the reliable flood-hazard mapping in the Navajo Nation in Arizona, Utah, Colorado, and New Mexico. The Bureau of Indian Affairs, U.S. Army Corps of Engineers, and Navajo Nation requested that the U.S. Geological Survey update estimates of peak discharge magnitude for gaging stations in the region and update regional equations for estimation of peak discharge and frequency at ungaged sites. Equations were developed for estimating the magnitude of peak discharges for recurrence intervals of 2, 5, 10, 25, 50, 100, and 500 years at ungaged sites using data collected through 1999 at 146 gaging stations, an additional 13 years of peak-discharge data since a 1997 investigation, which used gaging-station data through 1986. The equations for estimation of peak discharges at ungaged sites were developed for flood regions 8, 11, high elevation, and 6 and are delineated on the basis of the hydrologic codes from the 1997 investigation. Peak discharges for selected recurrence intervals were determined at gaging stations by fitting observed data to a log-Pearson Type III distribution with adjustments for a low-discharge threshold and a zero skew coefficient. A low-discharge threshold was applied to frequency analysis of 82 of the 146 gaging stations. This application provides an improved fit of the log-Pearson Type III frequency distribution. Use of the low-discharge threshold generally eliminated the peak discharge having a recurrence interval of less than 1.4 years in the probability-density function. Within each region, logarithms of the peak discharges for selected recurrence intervals were related to logarithms of basin and climatic characteristics using stepwise ordinary least-squares regression techniques for exploratory data analysis. Generalized least-squares regression techniques, an improved regression procedure that accounts for time and spatial sampling errors, then was applied to the same data used in the ordinary least-squares regression analyses. The average standard error of prediction for a peak discharge have a recurrence interval of 100-years for region 8 was 53 percent (average) for the 100-year flood. The average standard of prediction, which includes average sampling error and average standard error of regression, ranged from 45 to 83 percent for the 100-year flood. Estimated standard error of prediction for a hybrid method for region 11 was large in the 1997 investigation. No distinction of floods produced from a high-elevation region was presented in the 1997 investigation. Overall, the equations based on generalized least-squares regression techniques are considered to be more reliable than those in the 1997 report because of the increased length of record and improved GIS method. Techniques for transferring flood-frequency relations to ungaged sites on the same stream can be estimated at an ungaged site by a direct application of the regional regression equation or at an ungaged site on a stream that has a gaging station upstream or downstream by using the drainage-area ratio and the drainage-area exponent from the regional regression equation of the respective region.
A Comparison of Methods for Estimating Quadratic Effects in Nonlinear Structural Equation Models
ERIC Educational Resources Information Center
Harring, Jeffrey R.; Weiss, Brandi A.; Hsu, Jui-Chen
2012-01-01
Two Monte Carlo simulations were performed to compare methods for estimating and testing hypotheses of quadratic effects in latent variable regression models. The methods considered in the current study were (a) a 2-stage moderated regression approach using latent variable scores, (b) an unconstrained product indicator approach, (c) a latent…
Double Cross-Validation in Multiple Regression: A Method of Estimating the Stability of Results.
ERIC Educational Resources Information Center
Rowell, R. Kevin
In multiple regression analysis, where resulting predictive equation effectiveness is subject to shrinkage, it is especially important to evaluate result replicability. Double cross-validation is an empirical method by which an estimate of invariance or stability can be obtained from research data. A procedure for double cross-validation is…
Sanford, Ward E.; Selnick, David L.
2013-01-01
Evapotranspiration (ET) is an important quantity for water resource managers to know because it often represents the largest sink for precipitation (P) arriving at the land surface. In order to estimate actual ET across the conterminous United States (U.S.) in this study, a water-balance method was combined with a climate and land-cover regression equation. Precipitation and streamflow records were compiled for 838 watersheds for 1971-2000 across the U.S. to obtain long-term estimates of actual ET. A regression equation was developed that related the ratio ET/P to climate and land-cover variables within those watersheds. Precipitation and temperatures were used from the PRISM climate dataset, and land-cover data were used from the USGS National Land Cover Dataset. Results indicate that ET can be predicted relatively well at a watershed or county scale with readily available climate variables alone, and that land-cover data can also improve those predictions. Using the climate and land-cover data at an 800-m scale and then averaging to the county scale, maps were produced showing estimates of ET and ET/P for the entire conterminous U.S. Using the regression equation, such maps could also be made for more detailed state coverages, or for other areas of the world where climate and land-cover data are plentiful.
Use of streamflow data to estimate base flowground-water recharge for Wisconsin
Gebert, W.A.; Radloff, M.J.; Considine, E.J.; Kennedy, J.L.
2007-01-01
The average annual base flow/recharge was determined for streamflow-gaging stations throughout Wisconsin by base-flow separation. A map of the State was prepared that shows the average annual base flow for the period 1970-99 for watersheds at 118 gaging stations. Trend analysis was performed on 22 of the 118 streamflow-gaging stations that had long-term records, unregulated flow, and provided aerial coverage of the State. The analysis found that a statistically significant increasing trend was occurring for watersheds where the primary land use was agriculture. Most gaging stations where the land cover was forest had no significant trend. A method to estimate the average annual base flow at ungaged sites was developed by multiple-regression analysis using basin characteristics. The equation with the lowest standard error of estimate, 9.5%, has drainage area, soil infiltration and base flow factor as independent variables. To determine the average annual base flow for smaller watersheds, estimates were made at low-flow partial-record stations in 3 of the 12 major river basins in Wisconsin. Regression equations were developed for each of the three major river basins using basin characteristics. Drainage area, soil infiltration, basin storage and base-flow factor were the independent variables in the regression equations with the lowest standard error of estimate. The standard error of estimate ranged from 17% to 52% for the three river basins. ?? 2007 American Water Resources Association.
Rogers, Paul; Stoner, Julie
2016-01-01
Regression models for correlated binary outcomes are commonly fit using a Generalized Estimating Equations (GEE) methodology. GEE uses the Liang and Zeger sandwich estimator to produce unbiased standard error estimators for regression coefficients in large sample settings even when the covariance structure is misspecified. The sandwich estimator performs optimally in balanced designs when the number of participants is large, and there are few repeated measurements. The sandwich estimator is not without drawbacks; its asymptotic properties do not hold in small sample settings. In these situations, the sandwich estimator is biased downwards, underestimating the variances. In this project, a modified form for the sandwich estimator is proposed to correct this deficiency. The performance of this new sandwich estimator is compared to the traditional Liang and Zeger estimator as well as alternative forms proposed by Morel, Pan and Mancl and DeRouen. The performance of each estimator was assessed with 95% coverage probabilities for the regression coefficient estimators using simulated data under various combinations of sample sizes and outcome prevalence values with an Independence (IND), Autoregressive (AR) and Compound Symmetry (CS) correlation structure. This research is motivated by investigations involving rare-event outcomes in aviation data. PMID:26998504
Techniques for estimating flood-peak discharges from urban basins in Missouri
Becker, L.D.
1986-01-01
Techniques are defined for estimating the magnitude and frequency of future flood peak discharges of rainfall-induced runoff from small urban basins in Missouri. These techniques were developed from an initial analysis of flood records of 96 gaged sites in Missouri and adjacent states. Final regression equations are based on a balanced, representative sampling of 37 gaged sites in Missouri. This sample included 9 statewide urban study sites, 18 urban sites in St. Louis County, and 10 predominantly rural sites statewide. Short-term records were extended on the basis of long-term climatic records and use of a rainfall-runoff model. Linear least-squares regression analyses were used with log-transformed variables to relate flood magnitudes of selected recurrence intervals (dependent variables) to selected drainage basin indexes (independent variables). For gaged urban study sites within the State, the flood peak estimates are from the frequency curves defined from the synthesized long-term discharge records. Flood frequency estimates are made for ungaged sites by using regression equations that require determination of the drainage basin size and either the percentage of impervious area or a basin development factor. Alternative sets of equations are given for the 2-, 5-, 10-, 25-, 50-, and 100-yr recurrence interval floods. The average standard errors of estimate range from about 33% for the 2-yr flood to 26% for the 100-yr flood. The techniques for estimation are applicable to flood flows that are not significantly affected by storage caused by manmade activities. Flood peak discharge estimating equations are considered applicable for sites on basins draining approximately 0.25 to 40 sq mi. (Author 's abstract)
Using a Linear Regression Method to Detect Outliers in IRT Common Item Equating
ERIC Educational Resources Information Center
He, Yong; Cui, Zhongmin; Fang, Yu; Chen, Hanwei
2013-01-01
Common test items play an important role in equating alternate test forms under the common item nonequivalent groups design. When the item response theory (IRT) method is applied in equating, inconsistent item parameter estimates among common items can lead to large bias in equated scores. It is prudent to evaluate inconsistency in parameter…
Can stature be estimated from tooth crown dimensions? A study in a sample of South-East Asians.
Hossain, Mohammad Zakir; Munawar, Khalil M M; Rahim, Zubaidah H A; Bakri, Marina Mohd
2016-04-01
Stature estimation is an important step during medico-legal and forensic examination. Difficulty arises when highly decomposed and mutilated dead bodies with fragmentary remains are brought for forensic identification like in mass disaster or airplane crash. The body remains could be just a jaw with some teeth. The objective of this study was to explore if the stature of an individual can be determined from the tooth crown dimensions. A total of 201 volunteers participated in this study. The stature and clinical crown dimensions (length, mesiodistal and labiolingual diameters) of maxillary anterior teeth were measured. Correlation between crown dimensions and stature was analyzed by Pearson correlation test. Regression analysis was used to get equations for estimation of stature from crown measurements. The regression equations were applied in the same sample of volunteers that was used to obtain the equations. The reliability and accuracy of the equations were checked in another sample of volunteers. Length and mesiodistal diameter of the crown of central incisors and canines showed significant albeit low to moderate correlations (0.35-0.45) with the stature. The correlation co-efficient values were higher (as high as 0.537) when summation of the measurements was taken for analysis. The regression equations when applied to the same and a test sample of volunteers revealed that differences between actual and estimated stature can be as low as 0.01 to as much as 16.50cm. The findings suggest that although there are some degrees of positive correlations between stature and tooth crown dimensions, stature estimation from the tooth crown dimensions cannot provide the accuracy of estimation as required in forensic situations. The stature estimation accuracy using tooth crown dimensions is comparable to that of cephalo-facial dimensions but inferior to that of long bones. Copyright © 2016 Elsevier Ltd. All rights reserved.
Allometric Biomass Equations for 98 Species of Herbs, Shrubs, and Small Trees
W. Brad Smith; Gary J. Brand
1983-01-01
Biomass regression coefficients from the literature for the allometric equation form are presented for 98 species of shrubs and herbs in the northern U.S. and Canada. The equation and coeffients provide estimates of grams of biomass (oven-dry weight) for foliage, woody stem and total biomass.
Estimating Dbh from Stump Diameter for 15 Southern Species
Carl V. Bylin
1982-01-01
Regression equations for predicting dbh from tree stump diameter inside and outside bark are presented for 15 southern species. Equations were certified on idependent test subsets using the F distrubution statistic with signigicance level of .05.
Estimating annual suspended-sediment loads in the northern and central Appalachian Coal region
Koltun, G.F.
1985-01-01
Multiple-regression equations were developed for estimating the annual suspended-sediment load, for a given year, from small to medium-sized basins in the northern and central parts of the Appalachian coal region. The regression analysis was performed with data for land use, basin characteristics, streamflow, rainfall, and suspended-sediment load for 15 sites in the region. Two variables, the maximum mean-daily discharge occurring within the year and the annual peak discharge, explained much of the variation in the annual suspended-sediment load. Separate equations were developed employing each of these discharge variables. Standard errors for both equations are relatively large, which suggests that future predictions will probably have a low level of precision. This level of precision, however, may be acceptable for certain purposes. It is therefore left to the user to asses whether the level of precision provided by these equations is acceptable for the intended application.
Tortorelli, R.L.; Bergman, D.L.
1985-01-01
Statewide regression relations for Oklahoma were determined for estimating peak discharge of floods for selected recurrence intervals from 2 to 500 years. The independent variables required for estimating flood discharge for rural streams are contributing drainage area and mean annual precipitation. Main-channel slope, a variable used in previous reports, was found to contribute very little to the accuracy of the relations and was not used. The regression equations are applicable for watersheds with drainage areas less than 2,500 square miles that are not significantly affected by regulation from manmade works. These relations are presented in graphical form for easy application. Limitations on the use of the regression relations and the reliability of regression estimates for rural unregulated streams are discussed. Basin and climatic characteristics, log-Pearson Type III statistics and the flood-frequency relations for 226 gaging stations in Oklahoma and adjacent states are presented. Regression relations are investigated for estimating flood magnitude and frequency for watersheds affected by regulation from small FRS (floodwater retarding structures) built by the U.S. Soil Conservation Service in their watershed protection and flood prevention program. Gaging-station data from nine FRS regulated sites in Oklahoma and one FRS regulated site in Kansas are used. For sites regulated by FRS, an adjustment of the statewide rural regression relations can be used to estimate flood magnitude and frequency. The statewide regression equations are used by substituting the drainage area below the FRS, or drainage area that represents the percent of the basin unregulated, in the contributing drainage area parameter to obtain flood-frequency estimates. Flood-frequency curves and flow-duration curves are presented for five gaged sites to illustrate the effects of FRS regulation on peak discharge.
Determining Sample Size for Accurate Estimation of the Squared Multiple Correlation Coefficient.
ERIC Educational Resources Information Center
Algina, James; Olejnik, Stephen
2000-01-01
Discusses determining sample size for estimation of the squared multiple correlation coefficient and presents regression equations that permit determination of the sample size for estimating this parameter for up to 20 predictor variables. (SLD)
Magnitude and Frequency of Floods on Nontidal Streams in Delaware
Ries, Kernell G.; Dillow, Jonathan J.A.
2006-01-01
Reliable estimates of the magnitude and frequency of annual peak flows are required for the economical and safe design of transportation and water-conveyance structures. This report, done in cooperation with the Delaware Department of Transportation (DelDOT) and the Delaware Geological Survey (DGS), presents methods for estimating the magnitude and frequency of floods on nontidal streams in Delaware at locations where streamgaging stations monitor streamflow continuously and at ungaged sites. Methods are presented for estimating the magnitude of floods for return frequencies ranging from 2 through 500 years. These methods are applicable to watersheds exhibiting a full range of urban development conditions. The report also describes StreamStats, a web application that makes it easy to obtain flood-frequency estimates for user-selected locations on Delaware streams. Flood-frequency estimates for ungaged sites are obtained through a process known as regionalization, using statistical regression analysis, where information determined for a group of streamgaging stations within a region forms the basis for estimates for ungaged sites within the region. One hundred and sixteen streamgaging stations in and near Delaware with at least 10 years of non-regulated annual peak-flow data available were used in the regional analysis. Estimates for gaged sites are obtained by combining the station peak-flow statistics (mean, standard deviation, and skew) and peak-flow estimates with regional estimates of skew and flood-frequency magnitudes. Example flood-frequency estimate calculations using the methods presented in the report are given for: (1) ungaged sites, (2) gaged locations, (3) sites upstream or downstream from a gaged location, and (4) sites between gaged locations. Regional regression equations applicable to ungaged sites in the Piedmont and Coastal Plain Physiographic Provinces of Delaware are presented. The equations incorporate drainage area, forest cover, impervious area, basin storage, housing density, soil type A, and mean basin slope as explanatory variables, and have average standard errors of prediction ranging from 28 to 72 percent. Additional regression equations that incorporate drainage area and housing density as explanatory variables are presented for use in defining the effects of urbanization on peak-flow estimates throughout Delaware for the 2-year through 500-year recurrence intervals, along with suggestions for their appropriate use in predicting development-affected peak flows. Additional topics associated with the analyses performed during the study are also discussed, including: (1) the availability and description of more than 30 basin and climatic characteristics considered during the development of the regional regression equations; (2) the treatment of increasing trends in the annual peak-flow series identified at 18 gaged sites, with respect to their relations with maximum 24-hour precipitation and housing density, and their use in the regional analysis; (3) calculation of the 90-percent confidence interval associated with peak-flow estimates from the regional regression equations; and (4) a comparison of flood-frequency estimates at gages used in a previous study, highlighting the effects of various improved analytical techniques.
Sanford, Ward E.; Nelms, David L.; Pope, Jason P.; Selnick, David L.
2015-01-01
Mean long-term hydrologic budget components, such as recharge and base flow, are often difficult to estimate because they can vary substantially in space and time. Mean long-term fluxes were calculated in this study for precipitation, surface runoff, infiltration, total evapotranspiration (ET), riparian ET, recharge, base flow (or groundwater discharge) and net total outflow using long-term estimates of mean ET and precipitation and the assumption that the relative change in storage over that 30-year period is small compared to the total ET or precipitation. Fluxes of these components were first estimated on a number of real-time-gaged watersheds across Virginia. Specific conductance was used to distinguish and separate surface runoff from base flow. Specific-conductance (SC) data were collected every 15 minutes at 75 real-time gages for approximately 18 months between March 2007 and August 2008. Precipitation was estimated for 1971-2000 using PRISM climate data. Precipitation and temperature from the PRISM data were used to develop a regression-based relation to estimate total ET. The proportion of watershed precipitation that becomes surface runoff was related to physiographic province and rock type in a runoff regression equation. A new approach to estimate riparian ET using seasonal SC data gave results consistent with those from other methods. Component flux estimates from the watersheds were transferred to flux estimates for counties and independent cities using the ET and runoff regression equations. Only 48 of the 75 watersheds yielded sufficient data, and data from these 48 were used in the final runoff regression equation. Final results for the study are presented as component flux estimates for all counties and independent cities in Virginia. The method has the potential to be applied in many other states in the U.S. or in other regions or countries of the world where climate and stream flow data are plentiful.
Biomass relations for components of five Minnesota shrubs.
Richard R. Buech; David J. Rugg
1995-01-01
Presents equations for estimating biomass of six components on five species of shrubs common to northeastern Minnesota. Regression analysis is used to compare the performance of three estimators of biomass.
Williams-Sether, Tara; Gross, Tara A.
2016-02-09
Seasonal mean daily flow data from 119 U.S. Geological Survey streamflow-gaging stations in North Dakota; the surrounding states of Montana, Minnesota, and South Dakota; and the Canadian provinces of Manitoba and Saskatchewan with 10 or more years of unregulated flow record were used to develop regression equations for flow duration, n-day high flow and n-day low flow using ordinary least-squares and Tobit regression techniques. Regression equations were developed for seasonal flow durations at the 10th, 25th, 50th, 75th, and 90th percent exceedances; the 1-, 7-, and 30-day seasonal mean high flows for the 10-, 25-, and 50-year recurrence intervals; and the 1-, 7-, and 30-day seasonal mean low flows for the 2-, 5-, and 10-year recurrence intervals. Basin and climatic characteristics determined to be significant explanatory variables in one or more regression equations included drainage area, percentage of basin drainage area that drains to isolated lakes and ponds, ruggedness number, stream length, basin compactness ratio, minimum basin elevation, precipitation, slope ratio, stream slope, and soil permeability. The adjusted coefficient of determination for the n-day high-flow regression equations ranged from 55.87 to 94.53 percent. The Chi2 values for the duration regression equations ranged from 13.49 to 117.94, whereas the Chi2 values for the n-day low-flow regression equations ranged from 4.20 to 49.68.
Miller, M.R.; Eadie, J. McA
2006-01-01
We examined the allometric relationship between resting metabolic rate (RMR; kJ day-1) and body mass (kg) in wild waterfowl (Anatidae) by regressing RMR on body mass using species means from data obtained from published literature (18 sources, 54 measurements, 24 species; all data from captive birds). There was no significant difference among measurements from the rest (night; n = 37), active (day; n = 14), and unspecified (n = 3) phases of the daily cycle (P > 0.10), and we pooled these measurements for analysis. The resulting power function (aMassb) for all waterfowl (swans, geese, and ducks) had an exponent (b; slope of the regression) of 0.74, indistinguishable from that determined with commonly used general equations for nonpasserine birds (0.72-0.73). In contrast, the mass proportionality coefficient (b; y-intercept at mass = 1 kg) of 422 exceeded that obtained from the nonpasserine equations by 29%-37%. Analyses using independent contrasts correcting for phylogeny did not substantially alter the equation. Our results suggest the waterfowl equation provides a more appropriate estimate of RMR for bioenergetics analyses of waterfowl than do the general nonpasserine equations. When adjusted with a multiple to account for energy costs of free living, the waterfowl equation better estimates daily energy expenditure. Using this equation, we estimated that the extent of wetland habitat required to support wintering waterfowl populations could be 37%-50% higher than previously predicted using general nonpasserine equations. ?? The Cooper Ornithological Society 2006.
Wang, Lin; Hui, Stanley Sai-chuen; Wong, Stephen Heung-sang
2014-11-15
The current study aimed to examine the validity of various published bioelectrical impedance analysis (BIA) equations in estimating FFM among Chinese children and adolescents and to develop BIA equations for the estimation of fat-free mass (FFM) appropriate for Chinese children and adolescents. A total of 255 healthy Chinese children and adolescents aged 9 to 19 years old (127 males and 128 females) from Tianjin, China, participated in the BIA measurement at 50 kHz between the hand and the foot. The criterion measure of FFM was also employed using dual-energy X-ray absorptiometry (DEXA). FFM estimated from 24 published BIA equations was cross-validated against the criterion measure from DEXA. Multiple linear regression was conducted to examine alternative BIA equation for the studied population. FFM estimated from the 24 published BIA equations yielded high correlations with the directly measured FFM from DEXA. However, none of the 24 equations was statistically equivalent with the DEXA-measured FFM. Using multiple linear regression and cross-validation against DEXA measurement, an alternative prediction equation was determined as follows: FFM (kg)=1.613+0.742×height (cm)2/impedance (Ω)+0.151×body weight (kg); R2=0.95; SEE=2.45 kg; CV=6.5, 93.7% of the residuals of all the participants fell within the 95% limits of agreement. BIA was highly correlated with FFM in Chinese children and adolescents. When the new developed BIA equations are applied, BIA can provide a practical and valid measurement of body composition in Chinese children and adolescents.
Wang, Lin; Hui, Stanley Sai-chuen; Wong, Stephen Heung-sang
2014-01-01
Background The current study aimed to examine the validity of various published bioelectrical impedance analysis (BIA) equations in estimating FFM among Chinese children and adolescents and to develop BIA equations for the estimation of fat-free mass (FFM) appropriate for Chinese children and adolescents. Material/Methods A total of 255 healthy Chinese children and adolescents aged 9 to 19 years old (127 males and 128 females) from Tianjin, China, participated in the BIA measurement at 50 kHz between the hand and the foot. The criterion measure of FFM was also employed using dual-energy X-ray absorptiometry (DEXA). FFM estimated from 24 published BIA equations was cross-validated against the criterion measure from DEXA. Multiple linear regression was conducted to examine alternative BIA equation for the studied population. Results FFM estimated from the 24 published BIA equations yielded high correlations with the directly measured FFM from DEXA. However, none of the 24 equations was statistically equivalent with the DEXA-measured FFM. Using multiple linear regression and cross-validation against DEXA measurement, an alternative prediction equation was determined as follows: FFM (kg)=1.613+0.742×height (cm)2/impedance (Ω)+0.151×body weight (kg); R2=0.95; SEE=2.45kg; CV=6.5, 93.7% of the residuals of all the participants fell within the 95% limits of agreement. Conclusions BIA was highly correlated with FFM in Chinese children and adolescents. When the new developed BIA equations are applied, BIA can provide a practical and valid measurement of body composition in Chinese children and adolescents. PMID:25398209
August median streamflow on ungaged streams in Eastern Coastal Maine
Lombard, Pamela J.
2004-01-01
Methods for estimating August median streamflow were developed for ungaged, unregulated streams in eastern coastal Maine. The methods apply to streams with drainage areas ranging in size from 0.04 to 73.2 square miles and fraction of basin underlain by a sand and gravel aquifer ranging from 0 to 71 percent. The equations were developed with data from three long-term (greater than or equal to 10 years of record) continuous-record streamflow-gaging stations, 23 partial-record streamflow- gaging stations, and 5 short-term (less than 10 years of record) continuous-record streamflow-gaging stations. A mathematical technique for estimating a standard low-flow statistic, August median streamflow, at partial-record streamflow-gaging stations and short-term continuous-record streamflow-gaging stations was applied by relating base-flow measurements at these stations to concurrent daily streamflows at nearby long-term continuous-record streamflow-gaging stations (index stations). Generalized least-squares regression analysis (GLS) was used to relate estimates of August median streamflow at streamflow-gaging stations to basin characteristics at these same stations to develop equations that can be applied to estimate August median streamflow on ungaged streams. GLS accounts for different periods of record at the gaging stations and the cross correlation of concurrent streamflows among gaging stations. Thirty-one stations were used for the final regression equations. Two basin characteristics?drainage area and fraction of basin underlain by a sand and gravel aquifer?are used in the calculated regression equation to estimate August median streamflow for ungaged streams. The equation has an average standard error of prediction from -27 to 38 percent. A one-variable equation uses only drainage area to estimate August median streamflow when less accuracy is acceptable. This equation has an average standard error of prediction from -30 to 43 percent. Model error is larger than sampling error for both equations, indicating that additional or improved estimates of basin characteristics could be important to improved estimates of low-flow statistics. Weighted estimates of August median streamflow at partial- record or continuous-record gaging stations range from 0.003 to 31.0 cubic feet per second or from 0.1 to 0.6 cubic feet per second per square mile. Estimates of August median streamflow on ungaged streams in eastern coastal Maine, within the range of acceptable explanatory variables, range from 0.003 to 45 cubic feet per second or 0.1 to 0.6 cubic feet per second per square mile. Estimates of August median streamflow per square mile of drainage area generally increase as drainage area and fraction of basin underlain by a sand and gravel aquifer increase.
The Variance Normalization Method of Ridge Regression Analysis.
ERIC Educational Resources Information Center
Bulcock, J. W.; And Others
The testing of contemporary sociological theory often calls for the application of structural-equation models to data which are inherently collinear. It is shown that simple ridge regression, which is commonly used for controlling the instability of ordinary least squares regression estimates in ill-conditioned data sets, is not a legitimate…
Wiley, J.B.; Atkins, John T.; Tasker, Gary D.
2000-01-01
Multiple and simple least-squares regression models for the log10-transformed 100-year discharge with independent variables describing the basin characteristics (log10-transformed and untransformed) for 267 streamflow-gaging stations were evaluated, and the regression residuals were plotted as areal distributions that defined three regions of the State, designated East, North, and South. Exploratory data analysis procedures identified 31 gaging stations at which discharges are different than would be expected for West Virginia. Regional equations for the 2-, 5-, 10-, 25-, 50-, 100-, 200-, and 500-year peak discharges were determined by generalized least-squares regression using data from 236 gaging stations. Log10-transformed drainage area was the most significant independent variable for all regions.Equations developed in this study are applicable only to rural, unregulated, streams within the boundaries of West Virginia. The accuracy of estimating equations is quantified by measuring the average prediction error (from 27.7 to 44.7 percent) and equivalent years of record (from 1.6 to 20.0 years).
Updated generalized biomass equations for North American tree species
David C. Chojnacky; Linda S. Heath; Jennifer C. Jenkins
2014-01-01
Historically, tree biomass at large scales has been estimated by applying dimensional analysis techniques and field measurements such as diameter at breast height (dbh) in allometric regression equations. Equations often have been developed using differing methods and applied only to certain species or isolated areas. We previously had compiled and combined (in meta-...
Biomass equations for shrub species of Tamualipan thornscrub of North-Eastern Mexico
J. Navar; E. Mendez; A. Najera; J. Graciano; V. Dale; B. Parresol
2004-01-01
Nine additive allometric equations for computing above-ground, standing biomass were developed for the plant community and for each of 18 single species typical of the Tamaulipan thornscrub of north-eastern Mexico. Equations developed using additive procedures in seemingly unrelated linear regression provided statistical efficiency in total biomass estimates at the...
2013-01-01
Background This study aims to improve accuracy of Bioelectrical Impedance Analysis (BIA) prediction equations for estimating fat free mass (FFM) of the elderly by using non-linear Back Propagation Artificial Neural Network (BP-ANN) model and to compare the predictive accuracy with the linear regression model by using energy dual X-ray absorptiometry (DXA) as reference method. Methods A total of 88 Taiwanese elderly adults were recruited in this study as subjects. Linear regression equations and BP-ANN prediction equation were developed using impedances and other anthropometrics for predicting the reference FFM measured by DXA (FFMDXA) in 36 male and 26 female Taiwanese elderly adults. The FFM estimated by BIA prediction equations using traditional linear regression model (FFMLR) and BP-ANN model (FFMANN) were compared to the FFMDXA. The measuring results of an additional 26 elderly adults were used to validate than accuracy of the predictive models. Results The results showed the significant predictors were impedance, gender, age, height and weight in developed FFMLR linear model (LR) for predicting FFM (coefficient of determination, r2 = 0.940; standard error of estimate (SEE) = 2.729 kg; root mean square error (RMSE) = 2.571kg, P < 0.001). The above predictors were set as the variables of the input layer by using five neurons in the BP-ANN model (r2 = 0.987 with a SD = 1.192 kg and relatively lower RMSE = 1.183 kg), which had greater (improved) accuracy for estimating FFM when compared with linear model. The results showed a better agreement existed between FFMANN and FFMDXA than that between FFMLR and FFMDXA. Conclusion When compared the performance of developed prediction equations for estimating reference FFMDXA, the linear model has lower r2 with a larger SD in predictive results than that of BP-ANN model, which indicated ANN model is more suitable for estimating FFM. PMID:23388042
Age estimation using pulp/tooth area ratio in maxillary canines-A digital image analysis.
Juneja, Manjushree; Devi, Yashoda B K; Rakesh, N; Juneja, Saurabh
2014-09-01
Determination of age of a subject is one of the most important aspects of medico-legal cases and anthropological research. Radiographs can be used to indirectly measure the rate of secondary dentine deposition which is depicted by reduction in the pulp area. In this study, 200 patients of Karnataka aged between 18-72 years were selected for the study. Panoramic radiographs were made and indirectly digitized. Radiographic images of maxillary canines (RIC) were processed using a computer-aided drafting program (ImageJ). The variables pulp/root length (p), pulp/tooth length (r), pulp/root width at enamel-cementum junction (ECJ) level (a), pulp/root width at mid-root level (c), pulp/root width at midpoint level between ECJ level and mid-root level (b) and pulp/tooth area ratio (AR) were recorded. All the morphological variables including gender were statistically analyzed to derive regression equation for estimation of age. It was observed that 2 variables 'AR' and 'b' contributed significantly to the fit and were included in the regression model, yielding the formula: Age = 87.305-480.455(AR)+48.108(b). Statistical analysis indicated that the regression equation with selected variables explained 96% of total variance with the median of the residuals of 0.1614 years and standard error of estimate of 3.0186 years. There is significant correlation between age and morphological variables 'AR' and 'b' and the derived population specific regression equation can be potentially used for estimation of chronological age of individuals of Karnataka origin.
Hanley, James A
2008-01-01
Most survival analysis textbooks explain how the hazard ratio parameters in Cox's life table regression model are estimated. Fewer explain how the components of the nonparametric baseline survivor function are derived. Those that do often relegate the explanation to an "advanced" section and merely present the components as algebraic or iterative solutions to estimating equations. None comment on the structure of these estimators. This note brings out a heuristic representation that may help to de-mystify the structure.
Emerson, Douglas G.; Vecchia, Aldo V.; Dahl, Ann L.
2005-01-01
The drainage-area ratio method commonly is used to estimate streamflow for sites where no streamflow data were collected. To evaluate the validity of the drainage-area ratio method and to determine if an improved method could be developed to estimate streamflow, a multiple-regression technique was used to determine if drainage area, main channel slope, and precipitation were significant variables for estimating streamflow in the Red River of the North Basin. A separate regression analysis was performed for streamflow for each of three seasons-- winter, spring, and summer. Drainage area and summer precipitation were the most significant variables. However, the regression equations generally overestimated streamflows for North Dakota stations and underestimated streamflows for Minnesota stations. To correct the bias in the residuals for the two groups of stations, indicator variables were included to allow both the intercept and the coefficient for the logarithm of drainage area to depend on the group. Drainage area was the only significant variable in the revised regression equations. The exponents for the drainage-area ratio were 0.85 for the winter season, 0.91 for the spring season, and 1.02 for the summer season.
Estimation of stature using hand and foot dimensions in Slovak adults.
Uhrová, Petra; Beňuš, Radoslav; Masnicová, Soňa; Obertová, Zuzana; Kramárová, Daniela; Kyselicová, Klaudia; Dörnhöferová, Michaela; Bodoriková, Silvia; Neščáková, Eva
2015-03-01
Hand and foot dimensions used for stature estimation help to formulate a biological profile in the process of personal identification. Morphological variability of hands and feet shows the importance of generating population-specific equations to estimate stature. The stature, hand length, hand breadth, foot length and foot breadth of 250 young Slovak males and females, aged 18-24 years, were measured according to standard anthropometric procedures. The data were statistically analyzed using independent t-test for sex and bilateral differences. Pearson correlation coefficient was used for assessing relationship between stature and hand/foot parameters, and subsequently linear regression analysis was used to estimate stature. The results revealed significant sex differences in hand and foot dimensions as well as in stature (p<0.05). There was a positive and statistically significant correlation between stature and all measurements in both sexes (p<0.01). The highest correlation coefficient was found for foot length in males (r=0.71) as well as in females (r=0.63). Regression equations were computed separately for each sex. The accuracy of stature prediction ranged from ±4.6 to ±6.1cm. The results of this study indicate that hand and foot dimension can be used to estimate stature for Slovak for the purpose of forensic field. The regression equations can be of use for stature estimation particularly in cases of dismembered bodies. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Thompson, Ronald E.; Hoffman, Scott A.
2006-01-01
A suite of 28 streamflow statistics, ranging from extreme low to high flows, was computed for 17 continuous-record streamflow-gaging stations and predicted for 20 partial-record stations in Monroe County and contiguous counties in north-eastern Pennsylvania. The predicted statistics for the partial-record stations were based on regression analyses relating inter-mittent flow measurements made at the partial-record stations indexed to concurrent daily mean flows at continuous-record stations during base-flow conditions. The same statistics also were predicted for 134 ungaged stream locations in Monroe County on the basis of regression analyses relating the statistics to GIS-determined basin characteristics for the continuous-record station drainage areas. The prediction methodology for developing the regression equations used to estimate statistics was developed for estimating low-flow frequencies. This study and a companion study found that the methodology also has application potential for predicting intermediate- and high-flow statistics. The statistics included mean monthly flows, mean annual flow, 7-day low flows for three recurrence intervals, nine flow durations, mean annual base flow, and annual mean base flows for two recurrence intervals. Low standard errors of prediction and high coefficients of determination (R2) indicated good results in using the regression equations to predict the statistics. Regression equations for the larger flow statistics tended to have lower standard errors of prediction and higher coefficients of determination (R2) than equations for the smaller flow statistics. The report discusses the methodologies used in determining the statistics and the limitations of the statistics and the equations used to predict the statistics. Caution is indicated in using the predicted statistics for small drainage area situations. Study results constitute input needed by water-resource managers in Monroe County for planning purposes and evaluation of water-resources availability.
Proposed standard-weight equations for brook trout
Hyatt, M.W.; Hubert, W.A.
2001-01-01
Weight and length data were obtained for 113 populations of brook trout Salvelinus fontinalis across the species' geographic range in North America to estimate a standard-weight (Ws) equation for this species. Estimation was done by applying the regression-line-percentile technique to fish of 120-620 mm total length (TL). The proposed metric-unit (g and mm) equation is log10Ws = -5.186 + 3.103 log10TL; the English-unit (lb and in) equivalent is log10Ws = -3.483 + 3.103 log10TL. No systematic length bias was evident in the relative-weight values calculated from these equations.
Nestler, Steffen
2014-05-01
Parameters in structural equation models are typically estimated using the maximum likelihood (ML) approach. Bollen (1996) proposed an alternative non-iterative, equation-by-equation estimator that uses instrumental variables. Although this two-stage least squares/instrumental variables (2SLS/IV) estimator has good statistical properties, one problem with its application is that parameter equality constraints cannot be imposed. This paper presents a mathematical solution to this problem that is based on an extension of the 2SLS/IV approach to a system of equations. We present an example in which our approach was used to examine strong longitudinal measurement invariance. We also investigated the new approach in a simulation study that compared it with ML in the examination of the equality of two latent regression coefficients and strong measurement invariance. Overall, the results show that the suggested approach is a useful extension of the original 2SLS/IV estimator and allows for the effective handling of equality constraints in structural equation models. © 2013 The British Psychological Society.
Demura, Shinichi; Sato, Susumu; Nakada, Masakatsu; Minami, Masaki; Kitabayashi, Tamotsu
2003-07-01
This study compared the accuracy of body density (Db) estimation methods using hydrostatic weighing without complete head submersion (HW(withoutHS)) of Donnelly et al. (1988) and Donnelly and Sintek (1984) as referenced to Goldman and Buskirk's approach (1961). Donnelly et al.'s method estimates Db from a regression equation using HW(withoutHS), moreover, Donnelly and Sintek's method estimates it from HW(withoutHS) and head anthropometric variables. Fifteen Japanese males (173.8+/-4.5 cm, 63.6+/-5.4 kg, 21.2+/-2.8 years) and fifteen females (161.4+/-5.4 cm, 53.8+/-4.8 kg, 21.0+/-1.4 years) participated in this study. All the subjects were measured for head length, width and HWs under the two conditions of with and without head submersion. In order to examine the consistency of estimation values of Db, the correlation coefficients between the estimation values and the reference (Goldman and Buskirk, 1961) were calculated. The standard errors of estimation (SEE) were calculated by regression analysis using a reference value as a dependent variable and estimation values as independent variables. In addition, the systematic errors of two estimation methods were investigated by the Bland-Altman technique (Bland and Altman, 1986). In the estimation, Donnelly and Sintek's equation showed a high relationship with the reference (r=0.960, p<0.01), but had more differences from the reference compared with Donnelly et al.'s equation. Further studies are needed to develop new prediction equations for Japanese considering sex and individual differences in head anthropometry.
Granato, Gregory E.
2012-01-01
A nationwide study to better define triangular-hydrograph statistics for use with runoff-quality and flood-flow studies was done by the U.S. Geological Survey (USGS) in cooperation with the Federal Highway Administration. Although the triangular hydrograph is a simple linear approximation, the cumulative distribution of stormflow with a triangular hydrograph is a curvilinear S-curve that closely approximates the cumulative distribution of stormflows from measured data. The temporal distribution of flow within a runoff event can be estimated using the basin lagtime, (which is the time from the centroid of rainfall excess to the centroid of the corresponding runoff hydrograph) and the hydrograph recession ratio (which is the ratio of the duration of the falling limb to the rising limb of the hydrograph). This report documents results of the study, methods used to estimate the variables, and electronic files that facilitate calculation of variables. Ten viable multiple-linear regression equations were developed to estimate basin lagtimes from readily determined drainage basin properties using data published in 37 stormflow studies. Regression equations using the basin lag factor (BLF, which is a variable calculated as the main-channel length, in miles, divided by the square root of the main-channel slope in feet per mile) and two variables describing development in the drainage basin were selected as the best candidates, because each equation explains about 70 percent of the variability in the data. The variables describing development are the USGS basin development factor (BDF, which is a function of the amount of channel modifications, storm sewers, and curb-and-gutter streets in a basin) and the total impervious area variable (IMPERV) in the basin. Two datasets were used to develop regression equations. The primary dataset included data from 493 sites that have values for the BLF, BDF, and IMPERV variables. This dataset was used to develop the best-fit regression equation using the BLF and BDF variables. The secondary dataset included data from 896 sites that have values for the BLF and IMPERV variables. This dataset was used to develop the best-fit regression equation using the BLF and IMPERV variables. Analysis of hydrograph recession ratios and basin characteristics for 41 sites indicated that recession ratios are random variables. Thus, recession ratios cannot be estimated quantitatively using multiple linear regression equations developed using the data available for these sites. The minimums of recession ratios for different streamgages are well characterized by a value of one. The most probable values and maximum values of recession ratios for different streamgages are, however, more variable than the minimums. The most probable values of recession ratios for the 41 streamgages analyzed ranged from 1.0 to 3.52 and had a median of 1.85. The maximum values ranged from 2.66 to 11.3 and had a median of 4.36.
Villa, Chiara; Brůžek, Jaroslav
2017-01-01
Background Estimating volumes and masses of total body components is important for the study and treatment monitoring of nutrition and nutrition-related disorders, cancer, joint replacement, energy-expenditure and exercise physiology. While several equations have been offered for estimating total body components from MRI slices, no reliable and tested method exists for CT scans. For the first time, body composition data was derived from 41 high-resolution whole-body CT scans. From these data, we defined equations for estimating volumes and masses of total body AT and LT from corresponding tissue areas measured in selected CT scan slices. Methods We present a new semi-automatic approach to defining the density cutoff between adipose tissue (AT) and lean tissue (LT) in such material. An intra-class correlation coefficient (ICC) was used to validate the method. The equations for estimating the whole-body composition volume and mass from areas measured in selected slices were modeled with ordinary least squares (OLS) linear regressions and support vector machine regression (SVMR). Results and Discussion The best predictive equation for total body AT volume was based on the AT area of a single slice located between the 4th and 5th lumbar vertebrae (L4-L5) and produced lower prediction errors (|PE| = 1.86 liters, %PE = 8.77) than previous equations also based on CT scans. The LT area of the mid-thigh provided the lowest prediction errors (|PE| = 2.52 liters, %PE = 7.08) for estimating whole-body LT volume. We also present equations to predict total body AT and LT masses from a slice located at L4-L5 that resulted in reduced error compared with the previously published equations based on CT scans. The multislice SVMR predictor gave the theoretical upper limit for prediction precision of volumes and cross-validated the results. PMID:28533960
Lacoste Jeanson, Alizé; Dupej, Ján; Villa, Chiara; Brůžek, Jaroslav
2017-01-01
Estimating volumes and masses of total body components is important for the study and treatment monitoring of nutrition and nutrition-related disorders, cancer, joint replacement, energy-expenditure and exercise physiology. While several equations have been offered for estimating total body components from MRI slices, no reliable and tested method exists for CT scans. For the first time, body composition data was derived from 41 high-resolution whole-body CT scans. From these data, we defined equations for estimating volumes and masses of total body AT and LT from corresponding tissue areas measured in selected CT scan slices. We present a new semi-automatic approach to defining the density cutoff between adipose tissue (AT) and lean tissue (LT) in such material. An intra-class correlation coefficient (ICC) was used to validate the method. The equations for estimating the whole-body composition volume and mass from areas measured in selected slices were modeled with ordinary least squares (OLS) linear regressions and support vector machine regression (SVMR). The best predictive equation for total body AT volume was based on the AT area of a single slice located between the 4th and 5th lumbar vertebrae (L4-L5) and produced lower prediction errors (|PE| = 1.86 liters, %PE = 8.77) than previous equations also based on CT scans. The LT area of the mid-thigh provided the lowest prediction errors (|PE| = 2.52 liters, %PE = 7.08) for estimating whole-body LT volume. We also present equations to predict total body AT and LT masses from a slice located at L4-L5 that resulted in reduced error compared with the previously published equations based on CT scans. The multislice SVMR predictor gave the theoretical upper limit for prediction precision of volumes and cross-validated the results.
Estimation of potential bridge scour at bridges on state routes in South Dakota, 2003-07
Thompson, Ryan F.; Fosness, Ryan L.
2008-01-01
Flowing water can erode (scour) soils and cause structural failure of a bridge by exposing or undermining bridge foundations (abutments and piers). A rapid scour-estimation technique, known as the level-1.5 method and developed by the U.S. Geological Survey, was used to evaluate potential scour at bridges in South Dakota in a study conducted in cooperation with the South Dakota Department of Transportation. This method was used during 2003-07 to estimate scour for the 100-year and 500-year floods at 734 selected bridges managed by the South Dakota Department of Transportation on State routes in South Dakota. Scour depths and other parameters estimated from the level-1.5 analyses are presented in tabular form. Estimates of potential contraction scour at the 734 bridges ranged from 0 to 33.9 feet for the 100-year flood and from 0 to 35.8 feet for the 500-year flood. Abutment scour ranged from 0 to 36.9 feet for the 100-year flood and from 0 to 45.9 feet for the 500-year flood. Pier scour ranged from 0 to 30.8 feet for the 100-year flood and from 0 to 30.7 feet for the 500-year flood. The scour depths estimated by using the level-1.5 method can be used by the South Dakota Department of Transportation and others to identify bridges that may be susceptible to scour. Scour at 19 selected bridges also was estimated by using the level-2 method. Estimates of contraction, abutment, and pier scour calculated by using the level-1.5 and level-2 methods are presented in tabular and graphical formats. Compared to level-2 scour estimates, the level-1.5 method generally overestimated scour as designed, or in a few cases slightly underestimated scour. Results of the level-2 analyses were used to develop regression equations for change in head and average velocity through the bridge opening. These regression equations derived from South Dakota data are compared to similar regression equations derived from Montana and Colorado data. Future level-1.5 scour investigations in South Dakota may benefit from the use of these South Dakota-specific regression equations for estimating change in stream head and average velocity at the bridge.
Hollyday, E.F.; Hansen, G.R.
1983-01-01
Streamflow may be estimated with regression equations that relate streamflow characteristics to characteristics of the drainage basin. A statistical experiment was performed to compare the accuracy of equations using basin characteristics derived from maps and climatological records (control group equations) with the accuracy of equations using basin characteristics derived from Landsat data as well as maps and climatological records (experimental group equations). Results show that when the equations in both groups are arranged into six flow categories, there is no substantial difference in accuracy between control group equations and experimental group equations for this particular site where drainage area accounts for more than 90 percent of the variance in all streamflow characteristics (except low flows and most annual peak logarithms). (USGS)
June and August median streamflows estimated for ungaged streams in southern Maine
Lombard, Pamela J.
2010-01-01
Methods for estimating June and August median streamflows were developed for ungaged, unregulated streams in southern Maine. The methods apply to streams with drainage areas ranging in size from 0.4 to 74 square miles, with percentage of basin underlain by a sand and gravel aquifer ranging from 0 to 84 percent, and with distance from the centroid of the basin to a Gulf of Maine line paralleling the coast ranging from 14 to 94 miles. Equations were developed with data from 4 long-term continuous-record streamgage stations and 27 partial-record streamgage stations. Estimates of median streamflows at the continuous-record and partial-record stations are presented. A mathematical technique for estimating standard low-flow statistics, such as June and August median streamflows, at partial-record streamgage stations was applied by relating base-flow measurements at these stations to concurrent daily streamflows at nearby long-term (at least 10 years of record) continuous-record streamgage stations (index stations). Weighted least-squares regression analysis (WLS) was used to relate estimates of June and August median streamflows at streamgage stations to basin characteristics at these same stations to develop equations that can be used to estimate June and August median streamflows on ungaged streams. WLS accounts for different periods of record at the gaging stations. Three basin characteristics-drainage area, percentage of basin underlain by a sand and gravel aquifer, and distance from the centroid of the basin to a Gulf of Maine line paralleling the coast-are used in the final regression equation to estimate June and August median streamflows for ungaged streams. The three-variable equation to estimate June median streamflow has an average standard error of prediction from -35 to 54 percent. The three-variable equation to estimate August median streamflow has an average standard error of prediction from -45 to 83 percent. Simpler one-variable equations that use only drainage area to estimate June and August median streamflows were developed for use when less accuracy is acceptable. These equations have average standard errors of prediction from -46 to 87 percent and from -57 to 133 percent, respectively.
Estimation of precipitable water at different locations using surface dew-point
NASA Astrophysics Data System (ADS)
Abdel Wahab, M.; Sharif, T. A.
1995-09-01
The Reitan (1963) regression equation of the form ln w = a + bT d has been examined and tested to estimate precipitable water vapor content from the surface dew point temperature at different locations. The results of this study indicate that the slope b of the above equation has a constant value of 0.0681, while the intercept a changes rapidly with latitude. The use of the variable intercept technique can improve the estimated result by about 2%.
[Aboveground biomass of three conifers in Qianyanzhou plantation].
Li, Xuanran; Liu, Qijing; Chen, Yongrui; Hu, Lile; Yang, Fengting
2006-08-01
In this paper, the regressive models of the aboveground biomass of Pinus elliottii, P. massoniana and Cunninghamia lanceolata in Qianyanzhou of subtropical China were established, and the regression analysis on the dry weight of leaf biomass and total biomass against branch diameter (d), branch length (L), d3 and d2L was conducted with linear, power and exponent functions. Power equation with single parameter (d) was proved to be better than the rests for P. massoniana and C. lanceolata, and linear equation with parameter (d3) was better for P. elliottii. The canopy biomass was derived by the regression equations for all branches. These equations were also used to fit the relationships of total tree biomass, branch biomass and foliage biomass with tree diameter at breast height (D), tree height (H), D3 and D2H, respectively. D2H was found to be the best parameter for estimating total biomass. For foliage-and branch biomass, both parameters and equation forms showed some differences among species. Correlations were highly significant (P <0.001) for foliage-, branch-and total biomass, with the highest for total biomass. By these equations, the aboveground biomass and its allocation were estimated, with the aboveground biomass of P. massoniana, P. elliottii, and C. lanceolata forests being 83.6, 72. 1 and 59 t x hm(-2), respectively, and more stem biomass than foliage-and branch biomass. According to the previous studies, the underground biomass of these three forests was estimated to be 10.44, 9.42 and 11.48 t x hm(-2), and the amount of fixed carbon was 47.94, 45.14 and 37.52 t x hm(-2), respectively.
Gingerich, Stephen B.
2005-01-01
Flow-duration statistics under natural (undiverted) and diverted flow conditions were estimated for gaged and ungaged sites on 21 streams in northeast Maui, Hawaii. The estimates were made using the optimal combination of continuous-record gaging-station data, low-flow measurements, and values determined from regression equations developed as part of this study. Estimated 50- and 95-percent flow duration statistics for streams are presented and the analyses done to develop and evaluate the methods used in estimating the statistics are described. Estimated streamflow statistics are presented for sites where various amounts of streamflow data are available as well as for locations where no data are available. Daily mean flows were used to determine flow-duration statistics for continuous-record stream-gaging stations in the study area following U.S. Geological Survey established standard methods. Duration discharges of 50- and 95-percent were determined from total flow and base flow for each continuous-record station. The index-station method was used to adjust all of the streamflow records to a common, long-term period. The gaging station on West Wailuaiki Stream (16518000) was chosen as the index station because of its record length (1914-2003) and favorable geographic location. Adjustments based on the index-station method resulted in decreases to the 50-percent duration total flow, 50-percent duration base flow, 95-percent duration total flow, and 95-percent duration base flow computed on the basis of short-term records that averaged 7, 3, 4, and 1 percent, respectively. For the drainage basin of each continuous-record gaged site and selected ungaged sites, morphometric, geologic, soil, and rainfall characteristics were quantified using Geographic Information System techniques. Regression equations relating the non-diverted streamflow statistics to basin characteristics of the gaged basins were developed using ordinary-least-squares regression analyses. Rainfall rate, maximum basin elevation, and the elongation ratio of the basin were the basin characteristics used in the final regression equations for 50-percent duration total flow and base flow. Rainfall rate and maximum basin elevation were used in the final regression equations for the 95-percent duration total flow and base flow. The relative errors between observed and estimated flows ranged from 10 to 20 percent for the 50-percent duration total flow and base flow, and from 29 to 56 percent for the 95-percent duration total flow and base flow. The regression equations developed for this study were used to determine the 50-percent duration total flow, 50-percent duration base flow, 95-percent duration total flow, and 95-percent duration base flow at selected ungaged diverted and undiverted sites. Estimated streamflow, prediction intervals, and standard errors were determined for 48 ungaged sites in the study area and for three gaged sites west of the study area. Relative errors were determined for sites where measured values of 95-percent duration discharge of total flow were available. East of Keanae Valley, the 95-percent duration discharge equation generally underestimated flow, and within and west of Keanae Valley, the equation generally overestimated flow. Reduction in 50- and 95-percent flow-duration values in stream reaches affected by diversions throughout the study area average 58 to 60 percent.
ESTIMATE OF GLOBAL METHANE EMISSIONS FROM COAL MINES
Country-specific emissions of methane (CH4) from underground coal mines, surface coal mines, and coal crushing and transport operations are estimated for 1989. Emissions for individual countries are estimated by using two sets of regression equations (R2 values range from 0.56 to...
Simple models for estimating local removals of timber in the northeast
David N. Larsen; David A. Gansner
1975-01-01
Provides a practical method of estimating subregional removals of timber and demonstrates its application to a typical problem. Stepwise multiple regression analysis is used to develop equations for estimating removals of softwood, hardwood, and all timber from selected characteristics of socioeconomic structure.
Smadi, Hanan; Sargeant, Jan M; Shannon, Harry S; Raina, Parminder
2012-12-01
Growth and inactivation regression equations were developed to describe the effects of temperature on Salmonella concentration on chicken meat for refrigerated temperatures (⩽10°C) and for thermal treatment temperatures (55-70°C). The main objectives were: (i) to compare Salmonella growth/inactivation in chicken meat versus laboratory media; (ii) to create regression equations to estimate Salmonella growth in chicken meat that can be used in quantitative risk assessment (QRA) modeling; and (iii) to create regression equations to estimate D-values needed to inactivate Salmonella in chicken meat. A systematic approach was used to identify the articles, critically appraise them, and pool outcomes across studies. Growth represented in density (Log10CFU/g) and D-values (min) as a function of temperature were modeled using hierarchical mixed effects regression models. The current meta-analysis analysis found a significant difference (P⩽0.05) between the two matrices - chicken meat and laboratory media - for both growth at refrigerated temperatures and inactivation by thermal treatment. Growth and inactivation were significantly influenced by temperature after controlling for other variables; however, no consistent pattern in growth was found. Validation of growth and inactivation equations against data not used in their development is needed. Copyright © 2012 Ministry of Health, Saudi Arabia. Published by Elsevier Ltd. All rights reserved.
Sabounchi, Nasim S.; Rahmandad, Hazhir; Ammerman, Alice
2014-01-01
Basal Metabolic Rate (BMR) represents the largest component of total energy expenditure and is a major contributor to energy balance. Therefore, accurately estimating BMR is critical for developing rigorous obesity prevention and control strategies. Over the past several decades, numerous BMR formulas have been developed targeted to different population groups. A comprehensive literature search revealed 248 BMR estimation equations developed using diverse ranges of age, gender, race, fat free mass, fat mass, height, waist-to-hip ratio, body mass index, and weight. A subset of 47 studies included enough detail to allow for development of meta-regression equations. Utilizing these studies, meta-equations were developed targeted to twenty specific population groups. This review provides a comprehensive summary of available BMR equations and an estimate of their accuracy. An accompanying online BMR prediction tool (available at http://www.sdl.ise.vt.edu/tutorials.html) was developed to automatically estimate BMR based on the most appropriate equation after user-entry of individual age, race, gender, and weight. PMID:23318720
Validity of Bioelectrical Impedance Analysis to Estimation Fat-Free Mass in the Army Cadets.
Langer, Raquel D; Borges, Juliano H; Pascoa, Mauro A; Cirolini, Vagner X; Guerra-Júnior, Gil; Gonçalves, Ezequiel M
2016-03-11
Bioelectrical Impedance Analysis (BIA) is a fast, practical, non-invasive, and frequently used method for fat-free mass (FFM) estimation. The aims of this study were to validate predictive equations of BIA to FFM estimation in Army cadets and to develop and validate a specific BIA equation for this population. A total of 396 males, Brazilian Army cadets, aged 17-24 years were included. The study used eight published predictive BIA equations, a specific equation in FFM estimation, and dual-energy X-ray absorptiometry (DXA) as a reference method. Student's t-test (for paired sample), linear regression analysis, and Bland-Altman method were used to test the validity of the BIA equations. Predictive BIA equations showed significant differences in FFM compared to DXA (p < 0.05) and large limits of agreement by Bland-Altman. Predictive BIA equations explained 68% to 88% of FFM variance. Specific BIA equations showed no significant differences in FFM, compared to DXA values. Published BIA predictive equations showed poor accuracy in this sample. The specific BIA equations, developed in this study, demonstrated validity for this sample, although should be used with caution in samples with a large range of FFM.
Wagner, Daniel M.; Krieger, Joshua D.; Veilleux, Andrea G.
2016-08-04
In 2013, the U.S. Geological Survey initiated a study to update regional skew, annual exceedance probability discharges, and regional regression equations used to estimate annual exceedance probability discharges for ungaged locations on streams in the study area with the use of recent geospatial data, new analytical methods, and available annual peak-discharge data through the 2013 water year. An analysis of regional skew using Bayesian weighted least-squares/Bayesian generalized-least squares regression was performed for Arkansas, Louisiana, and parts of Missouri and Oklahoma. The newly developed constant regional skew of -0.17 was used in the computation of annual exceedance probability discharges for 281 streamgages used in the regional regression analysis. Based on analysis of covariance, four flood regions were identified for use in the generation of regional regression models. Thirty-nine basin characteristics were considered as potential explanatory variables, and ordinary least-squares regression techniques were used to determine the optimum combinations of basin characteristics for each of the four regions. Basin characteristics in candidate models were evaluated based on multicollinearity with other basin characteristics (variance inflation factor < 2.5) and statistical significance at the 95-percent confidence level (p ≤ 0.05). Generalized least-squares regression was used to develop the final regression models for each flood region. Average standard errors of prediction of the generalized least-squares models ranged from 32.76 to 59.53 percent, with the largest range in flood region D. Pseudo coefficients of determination of the generalized least-squares models ranged from 90.29 to 97.28 percent, with the largest range also in flood region D. The regional regression equations apply only to locations on streams in Arkansas where annual peak discharges are not substantially affected by regulation, diversion, channelization, backwater, or urbanization. The applicability and accuracy of the regional regression equations depend on the basin characteristics measured for an ungaged location on a stream being within range of those used to develop the equations.
Straub, D.E.
1998-01-01
The streamflow-gaging station network in Ohio was evaluated for its effectiveness in providing regional streamflow information. The analysis involved application of the principles of generalized least squares regression between streamflow and climatic and basin characteristics. Regression equations were developed for three flow characteristics: (1) the instantaneous peak flow with a 100-year recurrence interval (P100), (2) the mean annual flow (Qa), and (3) the 7-day, 10-year low flow (7Q10). All active and discontinued gaging stations with 5 or more years of unregulated-streamflow data with respect to each flow characteristic were used to develop the regression equations. The gaging-station network was evaluated for the current (1996) condition of the network and estimated conditions of various network strategies if an additional 5 and 20 years of streamflow data were collected. Any active or discontinued gaging station with (1) less than 5 years of unregulated-streamflow record, (2) previously defined basin and climatic characteristics, and (3) the potential for collection of more unregulated-streamflow record were included in the network strategies involving the additional 5 and 20 years of data. The network analysis involved use of the regression equations, in combination with location, period of record, and cost of operation, to determine the contribution of the data for each gaging station to regional streamflow information. The contribution of each gaging station was based on a cost-weighted reduction of the mean square error (average sampling-error variance) associated with each regional estimating equation. All gaging stations included in the network analysis were then ranked according to their contribution to the regional information for each flow characteristic. The predictive ability of the regression equations developed from the gaging station network could be improved for all three flow characteristics with the collection of additional streamflow data. The addition of new gaging stations to the network would result in an even greater improvement of the accuracy of the regional regression equations. Typically, continued data collection at stations with unregulated streamflow for all flow conditions that had less than 11 years of record with drainage areas smaller than 200 square miles contributed the largest cost-weighted reduction to the average sampling-error variance of the regional estimating equations. The results of the network analyses can be used to prioritize the continued operation of active gaging stations or the reactivation of discontinued gaging stations if the objective is to maximize the regional information content in the streamflow-gaging station network.
Kono, Kenichi; Nishida, Yusuke; Moriyama, Yoshihumi; Taoka, Masahiro; Sato, Takashi
2015-06-01
The assessment of nutritional states using fat free mass (FFM) measured with near-infrared spectroscopy (NIRS) is clinically useful. This measurement should incorporate the patient's post-dialysis weight ("dry weight"), in order to exclude the effects of any change in water mass. We therefore used NIRS to investigate the regression, independent variables, and absolute reliability of FFM in dry weight. The study included 47 outpatients from the hemodialysis unit. Body weight was measured before dialysis, and FFM was measured using NIRS before and after dialysis treatment. Multiple regression analysis was used to estimate the FFM in dry weight as the dependent variable. The measured FFM before dialysis treatment (Mw-FFM), and the difference between measured and dry weight (Mw-Dw) were independent variables. We performed Bland-Altman analysis to detect errors between the statistically estimated FFM and the measured FFM after dialysis treatment. The multiple regression equation to estimate the FFM in dry weight was: Dw-FFM = 0.038 + (0.984 × Mw-FFM) + (-0.571 × [Mw-Dw]); R(2) = 0.99). There was no systematic bias between the estimated and the measured values of FFM in dry weight. Using NIRS, FFM in dry weight can be calculated by an equation including FFM in measured weight and the difference between the measured weight and the dry weight. © 2015 The Authors. Therapeutic Apheresis and Dialysis © 2015 International Society for Apheresis.
Crawford, John R; Garthwaite, Paul H; Denham, Annie K; Chelune, Gordon J
2012-12-01
Regression equations have many useful roles in psychological assessment. Moreover, there is a large reservoir of published data that could be used to build regression equations; these equations could then be employed to test a wide variety of hypotheses concerning the functioning of individual cases. This resource is currently underused because (a) not all psychologists are aware that regression equations can be built not only from raw data but also using only basic summary data for a sample, and (b) the computations involved are tedious and prone to error. In an attempt to overcome these barriers, Crawford and Garthwaite (2007) provided methods to build and apply simple linear regression models using summary statistics as data. In the present study, we extend this work to set out the steps required to build multiple regression models from sample summary statistics and the further steps required to compute the associated statistics for drawing inferences concerning an individual case. We also develop, describe, and make available a computer program that implements these methods. Although there are caveats associated with the use of the methods, these need to be balanced against pragmatic considerations and against the alternative of either entirely ignoring a pertinent data set or using it informally to provide a clinical "guesstimate." Upgraded versions of earlier programs for regression in the single case are also provided; these add the point and interval estimates of effect size developed in the present article.
Magnitude and frequency of floods in Arkansas
Hodge, Scott A.; Tasker, Gary D.
1995-01-01
Methods are presented for estimating the magnitude and frequency of peak discharges of streams in Arkansas. Regression analyses were developed in which a stream's physical and flood characteristics were related. Four sets of regional regression equations were derived to predict peak discharges with selected recurrence intervals of 2, 5, 10, 25, 50, 100, and 500 years on streams draining less than 7,770 square kilometers. The regression analyses indicate that size of drainage area, main channel slope, mean basin elevation, and the basin shape factor were the most significant basin characteristics that affect magnitude and frequency of floods. The region of influence method is included in this report. This method is still being improved and is to be considered only as a second alternative to the standard method of producing regional regression equations. This method estimates unique regression equations for each recurrence interval for each ungaged site. The regression analyses indicate that size of drainage area, main channel slope, mean annual precipitation, mean basin elevation, and the basin shape factor were the most significant basin and climatic characteristics that affect magnitude and frequency of floods for this method. Certain recommendations on the use of this method are provided. A method is described for estimating the magnitude and frequency of peak discharges of streams for urban areas in Arkansas. The method is from a nationwide U.S. Geeological Survey flood frequency report which uses urban basin characteristics combined with rural discharges to estimate urban discharges. Annual peak discharges from 204 gaging stations, with drainage areas less than 7,770 square kilometers and at least 10 years of unregulated record, were used in the analysis. These data provide the basis for this analysis and are published in the Appendix of this report as supplemental data. Large rivers such as the Red, Arkansas, White, Black, St. Francis, Mississippi, and Ouachita Rivers have floodflow characteristics that differ from those of smaller tributary streams and were treated individually. Regional regression equations are not applicable to these large rivers. The magnitude and frequency of floods along these rivers are based on specific station data. This section is provided in the Appendix and has not been updated since the last Arkansas flood frequency report (1987b), but is included at the request of the cooperator.
Koltun, G.F.; Kula, Stephanie P.
2013-01-01
This report presents the results of a study to develop methods for estimating selected low-flow statistics and for determining annual flow-duration statistics for Ohio streams. Regression techniques were used to develop equations for estimating 10-year recurrence-interval (10-percent annual-nonexceedance probability) low-flow yields, in cubic feet per second per square mile, with averaging periods of 1, 7, 30, and 90-day(s), and for estimating the yield corresponding to the long-term 80-percent duration flow. These equations, which estimate low-flow yields as a function of a streamflow-variability index, are based on previously published low-flow statistics for 79 long-term continuous-record streamgages with at least 10 years of data collected through water year 1997. When applied to the calibration dataset, average absolute percent errors for the regression equations ranged from 15.8 to 42.0 percent. The regression results have been incorporated into the U.S. Geological Survey (USGS) StreamStats application for Ohio (http://water.usgs.gov/osw/streamstats/ohio.html) in the form of a yield grid to facilitate estimation of the corresponding streamflow statistics in cubic feet per second. Logistic-regression equations also were developed and incorporated into the USGS StreamStats application for Ohio for selected low-flow statistics to help identify occurrences of zero-valued statistics. Quantiles of daily and 7-day mean streamflows were determined for annual and annual-seasonal (September–November) periods for each complete climatic year of streamflow-gaging station record for 110 selected streamflow-gaging stations with 20 or more years of record. The quantiles determined for each climatic year were the 99-, 98-, 95-, 90-, 80-, 75-, 70-, 60-, 50-, 40-, 30-, 25-, 20-, 10-, 5-, 2-, and 1-percent exceedance streamflows. Selected exceedance percentiles of the annual-exceedance percentiles were subsequently computed and tabulated to help facilitate consideration of the annual risk of exceedance or nonexceedance of annual and annual-seasonal-period flow-duration values. The quantiles are based on streamflow data collected through climatic year 2008.
Estimation of octanol/water partition coefficients using LSER parameters
Luehrs, Dean C.; Hickey, James P.; Godbole, Kalpana A.; Rogers, Tony N.
1998-01-01
The logarithms of octanol/water partition coefficients, logKow, were regressed against the linear solvation energy relationship (LSER) parameters for a training set of 981 diverse organic chemicals. The standard deviation for logKow was 0.49. The regression equation was then used to estimate logKow for a test of 146 chemicals which included pesticides and other diverse polyfunctional compounds. Thus the octanol/water partition coefficient may be estimated by LSER parameters without elaborate software but only moderate accuracy should be expected.
Ryberg, Karen R.
2007-01-01
This report presents the results of a study by the U.S. Geological Survey, done in cooperation with the North Dakota State Water Commission, to estimate water-quality constituent concentrations at seven sites on the Sheyenne River, N. Dak. Regression analysis of water-quality data collected in 1980-2006 was used to estimate concentrations for hardness, dissolved solids, calcium, magnesium, sodium, and sulfate. The explanatory variables examined for the regression relations were continuously monitored streamflow, specific conductance, and water temperature. For the conditions observed in 1980-2006, streamflow was a significant explanatory variable for some constituents. Specific conductance was a significant explanatory variable for all of the constituents, and water temperature was not a statistically significant explanatory variable for any of the constituents in this study. The regression relations were evaluated using common measures of variability, including R2, the proportion of variability in the estimated constituent concentration explained by the explanatory variables and regression equation. R2 values ranged from 0.784 for calcium to 0.997 for dissolved solids. The regression relations also were evaluated by calculating the median relative percentage difference (RPD) between measured constituent concentration and the constituent concentration estimated by the regression equations. Median RPDs ranged from 1.7 for dissolved solids to 11.5 for sulfate. The regression relations also may be used to estimate daily constituent loads. The relations should be monitored for change over time, especially at sites 2 and 3 which have a short period of record. In addition, caution should be used when the Sheyenne River is affected by ice or when upstream sites are affected by isolated storm runoff. Almost all of the outliers and highly influential samples removed from the analysis were made during periods when the Sheyenne River might be affected by ice.
Fabian C.C. Uzoh; Martin W. Ritchie
1996-01-01
The equations presented predict crown area for 13 species of trees and shrubs which may be found growing in competition with commercial conifers during early stages of stand development. The equations express crown area as a function of basal area and height. Parameters were estimated for each species individually using weighted nonlinear least square regression.
Olson, Scott A.; Tasker, Gary D.; Johnston, Craig M.
2003-01-01
Estimates of the magnitude and frequency of streamflow are needed to safely and economically design bridges, culverts, and other structures in or near streams. These estimates also are used for managing floodplains, identifying flood-hazard areas, and establishing flood-insurance rates, but may be required at ungaged sites where no observed flood data are available for streamflow-frequency analysis. This report describes equations for estimating flow-frequency characteristics at ungaged, unregulated streams in Vermont. In the past, regression equations developed to estimate streamflow statistics required users to spend hours manually measuring basin characteristics for the stream site of interest. This report also describes the accompanying customized geographic information system (GIS) tool that automates the measurement of basin characteristics and calculation of corresponding flow statistics. The tool includes software that computes the accuracy of the results and adjustments for expected probability and for streamflow data of a nearby stream-gaging station that is either upstream or downstream and within 50 percent of the drainage area of the site where the flow-frequency characteristics are being estimated. The custom GIS can be linked to the National Flood Frequency program, adding the ability to plot peak-flow-frequency curves and synthetic hydrographs and to compute adjustments for urbanization.
NASA Astrophysics Data System (ADS)
Fomina, E. V.; Kozhukhova, N. I.; Sverguzova, S. V.; Fomin, A. E.
2018-05-01
In this paper, the regression equations method for design of construction material was studied. Regression and polynomial equations representing the correlation between the studied parameters were proposed. The logic design and software interface of the regression equations method focused on parameter optimization to provide the energy saving effect at the stage of autoclave aerated concrete design considering the replacement of traditionally used quartz sand by coal mining by-product such as argillite. The mathematical model represented by a quadric polynomial for the design of experiment was obtained using calculated and experimental data. This allowed the estimation of relationship between the composition and final properties of the aerated concrete. The surface response graphically presented in a nomogram allowed the estimation of concrete properties in response to variation of composition within the x-space. The optimal range of argillite content was obtained leading to a reduction of raw materials demand, development of target plastic strength of aerated concrete as well as a reduction of curing time before autoclave treatment. Generally, this method allows the design of autoclave aerated concrete with required performance without additional resource and time costs.
Wiley, Jeffrey B.; Atkins, John T.; Newell, Dawn A.
2002-01-01
Multiple and simple least-squares regression models for the log10-transformed 1.5- and 2-year recurrence intervals of peak discharges with independent variables describing the basin characteristics (log10-transformed and untransformed) for 236 streamflow-gaging stations were evaluated, and the regression residuals were plotted as areal distributions that defined three regions in West Virginia designated as East, North, and South. Regional equations for the 1.1-, 1.2-, 1.3-, 1.4-, 1.5-, 1.6-, 1.7-, 1.8-, 1.9-, 2.0-, 2.5-, and 3-year recurrence intervals of peak discharges were determined by generalized least-squares regression. Log10-transformed drainage area was the most significant independent variable for all regions. Equations developed in this study are applicable only to rural, unregulated streams within the boundaries of West Virginia. The accuracies of estimating equations are quantified by measuring the average prediction error (from 27.4 to 52.4 percent) and equivalent years of record (from 1.1 to 3.4 years).
DOT National Transportation Integrated Search
2016-09-01
We consider the problem of solving mixed random linear equations with k components. This is the noiseless setting of mixed linear regression. The goal is to estimate multiple linear models from mixed samples in the case where the labels (which sample...
Estimation of Compaction Parameters Based on Soil Classification
NASA Astrophysics Data System (ADS)
Lubis, A. S.; Muis, Z. A.; Hastuty, I. P.; Siregar, I. M.
2018-02-01
Factors that must be considered in compaction of the soil works were the type of soil material, field control, maintenance and availability of funds. Those problems then raised the idea of how to estimate the density of the soil with a proper implementation system, fast, and economical. This study aims to estimate the compaction parameter i.e. the maximum dry unit weight (γ dmax) and optimum water content (Wopt) based on soil classification. Each of 30 samples were being tested for its properties index and compaction test. All of the data’s from the laboratory test results, were used to estimate the compaction parameter values by using linear regression and Goswami Model. From the research result, the soil types were A4, A-6, and A-7 according to AASHTO and SC, SC-SM, and CL based on USCS. By linear regression, the equation for estimation of the maximum dry unit weight (γdmax *)=1,862-0,005*FINES- 0,003*LL and estimation of the optimum water content (wopt *)=- 0,607+0,362*FINES+0,161*LL. By Goswami Model (with equation Y=mLogG+k), for estimation of the maximum dry unit weight (γdmax *) with m=-0,376 and k=2,482, for estimation of the optimum water content (wopt *) with m=21,265 and k=-32,421. For both of these equations a 95% confidence interval was obtained.
Approximate median regression for complex survey data with skewed response.
Fraser, Raphael André; Lipsitz, Stuart R; Sinha, Debajyoti; Fitzmaurice, Garrett M; Pan, Yi
2016-12-01
The ready availability of public-use data from various large national complex surveys has immense potential for the assessment of population characteristics using regression models. Complex surveys can be used to identify risk factors for important diseases such as cancer. Existing statistical methods based on estimating equations and/or utilizing resampling methods are often not valid with survey data due to complex survey design features. That is, stratification, multistage sampling, and weighting. In this article, we accommodate these design features in the analysis of highly skewed response variables arising from large complex surveys. Specifically, we propose a double-transform-both-sides (DTBS)'based estimating equations approach to estimate the median regression parameters of the highly skewed response; the DTBS approach applies the same Box-Cox type transformation twice to both the outcome and regression function. The usual sandwich variance estimate can be used in our approach, whereas a resampling approach would be needed for a pseudo-likelihood based on minimizing absolute deviations (MAD). Furthermore, the approach is relatively robust to the true underlying distribution, and has much smaller mean square error than a MAD approach. The method is motivated by an analysis of laboratory data on urinary iodine (UI) concentration from the National Health and Nutrition Examination Survey. © 2016, The International Biometric Society.
How many stakes are required to measure the mass balance of a glacier?
Fountain, A.G.; Vecchia, A.
1999-01-01
Glacier mass balance is estimated for South Cascade Glacier and Maclure Glacier using a one-dimensional regression of mass balance with altitude as an alternative to the traditional approach of contouring mass balance values. One attractive feature of regression is that it can be applied to sparse data sets where contouring is not possible and can provide an objective error of the resulting estimate. Regression methods yielded mass balance values equivalent to contouring methods. The effect of the number of mass balance measurements on the final value for the glacier showed that sample sizes as small as five stakes provided reasonable estimates, although the error estimates were greater than for larger sample sizes. Different spatial patterns of measurement locations showed no appreciable influence on the final value as long as different surface altitudes were intermittently sampled over the altitude range of the glacier. Two different regression equations were examined, a quadratic, and a piecewise linear spline, and comparison of results showed little sensitivity to the type of equation. These results point to the dominant effect of the gradient of mass balance with altitude of alpine glaciers compared to transverse variations. The number of mass balance measurements required to determine the glacier balance appears to be scale invariant for small glaciers and five to ten stakes are sufficient.
Approximate Median Regression for Complex Survey Data with Skewed Response
Fraser, Raphael André; Lipsitz, Stuart R.; Sinha, Debajyoti; Fitzmaurice, Garrett M.; Pan, Yi
2016-01-01
Summary The ready availability of public-use data from various large national complex surveys has immense potential for the assessment of population characteristics using regression models. Complex surveys can be used to identify risk factors for important diseases such as cancer. Existing statistical methods based on estimating equations and/or utilizing resampling methods are often not valid with survey data due to complex survey design features. That is, stratification, multistage sampling and weighting. In this paper, we accommodate these design features in the analysis of highly skewed response variables arising from large complex surveys. Specifically, we propose a double-transform-both-sides (DTBS) based estimating equations approach to estimate the median regression parameters of the highly skewed response; the DTBS approach applies the same Box-Cox type transformation twice to both the outcome and regression function. The usual sandwich variance estimate can be used in our approach, whereas a resampling approach would be needed for a pseudo-likelihood based on minimizing absolute deviations (MAD). Furthermore, the approach is relatively robust to the true underlying distribution, and has much smaller mean square error than a MAD approach. The method is motivated by an analysis of laboratory data on urinary iodine (UI) concentration from the National Health and Nutrition Examination Survey. PMID:27062562
Southard, Rodney E.
2013-01-01
The weather and precipitation patterns in Missouri vary considerably from year to year. In 2008, the statewide average rainfall was 57.34 inches and in 2012, the statewide average rainfall was 30.64 inches. This variability in precipitation and resulting streamflow in Missouri underlies the necessity for water managers and users to have reliable streamflow statistics and a means to compute select statistics at ungaged locations for a better understanding of water availability. Knowledge of surface-water availability is dependent on the streamflow data that have been collected and analyzed by the U.S. Geological Survey for more than 100 years at approximately 350 streamgages throughout Missouri. The U.S. Geological Survey, in cooperation with the Missouri Department of Natural Resources, computed streamflow statistics at streamgages through the 2010 water year, defined periods of drought and defined methods to estimate streamflow statistics at ungaged locations, and developed regional regression equations to compute selected streamflow statistics at ungaged locations. Streamflow statistics and flow durations were computed for 532 streamgages in Missouri and in neighboring States of Missouri. For streamgages with more than 10 years of record, Kendall’s tau was computed to evaluate for trends in streamflow data. If trends were detected, the variable length method was used to define the period of no trend. Water years were removed from the dataset from the beginning of the record for a streamgage until no trend was detected. Low-flow frequency statistics were then computed for the entire period of record and for the period of no trend if 10 or more years of record were available for each analysis. Three methods are presented for computing selected streamflow statistics at ungaged locations. The first method uses power curve equations developed for 28 selected streams in Missouri and neighboring States that have multiple streamgages on the same streams. Statistical estimates on one of these streams can be calculated at an ungaged location that has a drainage area that is between 40 percent of the drainage area of the farthest upstream streamgage and within 150 percent of the drainage area of the farthest downstream streamgage along the stream of interest. The second method may be used on any stream with a streamgage that has operated for 10 years or longer and for which anthropogenic effects have not changed the low-flow characteristics at the ungaged location since collection of the streamflow data. A ratio of drainage area of the stream at the ungaged location to the drainage area of the stream at the streamgage was computed to estimate the statistic at the ungaged location. The range of applicability is between 40- and 150-percent of the drainage area of the streamgage, and the ungaged location must be located on the same stream as the streamgage. The third method uses regional regression equations to estimate selected low-flow frequency statistics for unregulated streams in Missouri. This report presents regression equations to estimate frequency statistics for the 10-year recurrence interval and for the N-day durations of 1, 2, 3, 7, 10, 30, and 60 days. Basin and climatic characteristics were computed using geographic information system software and digital geospatial data. A total of 35 characteristics were computed for use in preliminary statewide and regional regression analyses based on existing digital geospatial data and previous studies. Spatial analyses for geographical bias in the predictive accuracy of the regional regression equations defined three low-flow regions with the State representing the three major physiographic provinces in Missouri. Region 1 includes the Central Lowlands, Region 2 includes the Ozark Plateaus, and Region 3 includes the Mississippi Alluvial Plain. A total of 207 streamgages were used in the regression analyses for the regional equations. Of the 207 U.S. Geological Survey streamgages, 77 were located in Region 1, 120 were located in Region 2, and 10 were located in Region 3. Streamgages located outside of Missouri were selected to extend the range of data used for the independent variables in the regression analyses. Streamgages included in the regression analyses had 10 or more years of record and were considered to be affected minimally by anthropogenic activities or trends. Regional regression analyses identified three characteristics as statistically significant for the development of regional equations. For Region 1, drainage area, longest flow path, and streamflow-variability index were statistically significant. The range in the standard error of estimate for Region 1 is 79.6 to 94.2 percent. For Region 2, drainage area and streamflow variability index were statistically significant, and the range in the standard error of estimate is 48.2 to 72.1 percent. For Region 3, drainage area and streamflow-variability index also were statistically significant with a range in the standard error of estimate of 48.1 to 96.2 percent. Limitations on the use of estimating low-flow frequency statistics at ungaged locations are dependent on the method used. The first method outlined for use in Missouri, power curve equations, were developed to estimate the selected statistics for ungaged locations on 28 selected streams with multiple streamgages located on the same stream. A second method uses a drainage-area ratio to compute statistics at an ungaged location using data from a single streamgage on the same stream with 10 or more years of record. Ungaged locations on these streams may use the ratio of the drainage area at an ungaged location to the drainage area at a streamgage location to scale the selected statistic value from the streamgage location to the ungaged location. This method can be used if the drainage area of the ungaged location is within 40 to 150 percent of the streamgage drainage area. The third method is the use of the regional regression equations. The limits for the use of these equations are based on the ranges of the characteristics used as independent variables and that streams must be affected minimally by anthropogenic activities.
Revised techniques for estimating peak discharges from channel width in Montana
Parrett, Charles; Hull, J.A.; Omang, R.J.
1987-01-01
This study was conducted to develop new estimating equations based on channel width and the updated flood frequency curves of previous investigations. Simple regression equations for estimating peak discharges with recurrence intervals of 2, 5, 10 , 25, 50, and 100 years were developed for seven regions in Montana. The standard errors of estimates for the equations that use active channel width as the independent variables ranged from 30% to 87%. The standard errors of estimate for the equations that use bankfull width as the independent variable ranged from 34% to 92%. The smallest standard errors generally occurred in the prediction equations for the 2-yr flood, 5-yr flood, and 10-yr flood, and the largest standard errors occurred in the prediction equations for the 100-yr flood. The equations that use active channel width and the equations that use bankfull width were determined to be about equally reliable in five regions. In the West Region, the equations that use bankfull width were slightly more reliable than those based on active channel width, whereas in the East-Central Region the equations that use active channel width were slightly more reliable than those based on bankfull width. Compared with similar equations previously developed, the standard errors of estimate for the new equations are substantially smaller in three regions and substantially larger in two regions. Limitations on the use of the estimating equations include: (1) The equations are based on stable conditions of channel geometry and prevailing water and sediment discharge; (2) The measurement of channel width requires a site visit, preferably by a person with experience in the method, and involves appreciable measurement errors; (3) Reliability of results from the equations for channel widths beyond the range of definition is unknown. In spite of the limitations, the estimating equations derived in this study are considered to be as reliable as estimating equations based on basin and climatic variables. Because the two types of estimating equations are independent, results from each can be weighted inversely proportional to their variances, and averaged. The weighted average estimate has a variance less than either individual estimate. (Author 's abstract)
Bjerklie, David M.; Dingman, S. Lawrence; Bolster, Carl H.
2005-01-01
A set of conceptually derived in‐bank river discharge–estimating equations (models), based on the Manning and Chezy equations, are calibrated and validated using a database of 1037 discharge measurements in 103 rivers in the United States and New Zealand. The models are compared to a multiple regression model derived from the same data. The comparison demonstrates that in natural rivers, using an exponent on the slope variable of 0.33 rather than the traditional value of 0.5 reduces the variance associated with estimating flow resistance. Mean model uncertainty, assuming a constant value for the conductance coefficient, is less than 5% for a large number of estimates, and 67% of the estimates would be accurate within 50%. The models have potential application where site‐specific flow resistance information is not available and can be the basis for (1) a general approach to estimating discharge from remotely sensed hydraulic data, (2) comparison to slope‐area discharge estimates, and (3) large‐scale river modeling.
Risser, Dennis W.; Thompson, Ronald E.; Stuckey, Marla H.
2008-01-01
A method was developed for making estimates of long-term, mean annual ground-water recharge from streamflow data at 80 streamflow-gaging stations in Pennsylvania. The method relates mean annual base-flow yield derived from the streamflow data (as a proxy for recharge) to the climatic, geologic, hydrologic, and physiographic characteristics of the basins (basin characteristics) by use of a regression equation. Base-flow yield is the base flow of a stream divided by the drainage area of the basin, expressed in inches of water basinwide. Mean annual base-flow yield was computed for the period of available streamflow record at continuous streamflow-gaging stations by use of the computer program PART, which separates base flow from direct runoff on the streamflow hydrograph. Base flow provides a reasonable estimate of recharge for basins where streamflow is mostly unaffected by upstream regulation, diversion, or mining. Twenty-eight basin characteristics were included in the exploratory regression analysis as possible predictors of base-flow yield. Basin characteristics found to be statistically significant predictors of mean annual base-flow yield during 1971-2000 at the 95-percent confidence level were (1) mean annual precipitation, (2) average maximum daily temperature, (3) percentage of sand in the soil, (4) percentage of carbonate bedrock in the basin, and (5) stream channel slope. The equation for predicting recharge was developed using ordinary least-squares regression. The standard error of prediction for the equation on log-transformed data was 9.7 percent, and the coefficient of determination was 0.80. The equation can be used to predict long-term, mean annual recharge rates for ungaged basins, providing that the explanatory basin characteristics can be determined and that the underlying assumption is accepted that base-flow yield derived from PART is a reasonable estimate of ground-water recharge rates. For example, application of the equation for 370 hydrologic units in Pennsylvania predicted a range of ground-water recharge from about 6.0 to 22 inches per year. A map of the predicted recharge illustrates the general magnitude and variability of recharge throughout Pennsylvania.
Bayesian Analysis of Nonlinear Structural Equation Models with Nonignorable Missing Data
ERIC Educational Resources Information Center
Lee, Sik-Yum
2006-01-01
A Bayesian approach is developed for analyzing nonlinear structural equation models with nonignorable missing data. The nonignorable missingness mechanism is specified by a logistic regression model. A hybrid algorithm that combines the Gibbs sampler and the Metropolis-Hastings algorithm is used to produce the joint Bayesian estimates of…
System identification principles in studies of forest dynamics.
Rolfe A. Leary
1970-01-01
Shows how it is possible to obtain governing equation parameter estimates on the basis of observed system states. The approach used represents a constructive alternative to regression techniques for models expressed as differential equations. This approach allows scientists to more completely quantify knowledge of forest development processes, to express theories in...
Merchantable sawlog and bole-length equations for the Northeastern United States
Daniel A. Yaussy; Martin E. Dale; Martin E. Dale
1991-01-01
A modified Richards growth model is used to develop species-specific coefficients for equations estimating the merchantable sawlog and bole lengths of trees from 25 species groups common to the Northeastern United States. These regression coefficients have been incorporated into the growth-and-yield simulation software, NE-TWIGS.
Methods for estimating magnitude and frequency of peak flows for small watersheds in Utah.
DOT National Transportation Integrated Search
2010-06-01
Determining discharge in a stream is important to the design of culverts, bridges, and other structures pertaining to : transportation systems. Currently in Utah regression equations exist to estimate recurrence flood year discharges for : rural wate...
ERIC Educational Resources Information Center
Carter, David S.
1979-01-01
There are a variety of formulas for reducing the positive bias which occurs in estimating R squared in multiple regression or correlation equations. Five different formulas are evaluated in a Monte Carlo study, and recommendations are made. (JKS)
Estimating the magnitude of peak flows for streams in Kentucky for selected recurrence intervals
Hodgkins, Glenn A.; Martin, Gary R.
2003-01-01
This report gives estimates of, and presents techniques for estimating, the magnitude of peak flows for streams in Kentucky for recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years. A flowchart in this report guides the user to the appropriate estimates and (or) estimating techniques for a site on a specific stream. Estimates of peak flows are given for 222 U.S. Geological Survey streamflow-gaging stations in Kentucky. In the development of the peak-flow estimates at gaging stations, a new generalized skew coefficient was calculated for the State. This single statewide value of 0.011 (with a standard error of prediction of 0.520) is more appropriate for Kentucky than the national skew isoline map in Bulletin 17B of the Interagency Advisory Committee on Water Data. Regression equations are presented for estimating the peak flows on ungaged, unregulated streams in rural drainage basins. The equations were developed by use of generalized-least-squares regression procedures at 187 U.S. Geological Survey gaging stations in Kentucky and 51 stations in surrounding States. Kentucky was divided into seven flood regions. Total drainage area is used in the final regression equations as the sole explanatory variable, except in Regions 1 and 4 where main-channel slope also was used. The smallest average standard errors of prediction were in Region 3 (from -13.1 to +15.0 percent) and the largest average standard errors of prediction were in Region 5 (from -37.6 to +60.3 percent). One section of this report describes techniques for estimating peak flows for ungaged sites on gaged, unregulated streams in rural drainage basins. Another section references two previous U.S. Geological Survey reports for peak-flow estimates on ungaged, unregulated, urban streams. Estimating peak flows at ungaged sites on regulated streams is beyond the scope of this report, because peak flows on regulated streams are dependent upon variable human activities.
Techniques for estimating flood hydrographs for ungaged urban watersheds
Stricker, V.A.; Sauer, V.B.
1984-01-01
The Clark Method, modified slightly was used to develop a synthetic, dimensionless hydrograph which can be used to estimate flood hydrographs for ungaged urban watersheds. Application of the technique results in a typical (average) flood hydrograph for a given peak discharge. Input necessary to apply the technique is an estimate of basin lagtime and the recurrence interval peak discharge. Equations for this purpose were obtained from a recent nationwide study on flood frequency in urban watersheds. A regression equation was developed which relates flood volumes to drainage area size, basin lagtime, and peak discharge. This equation is useful where storage of floodwater may be a part of design of flood prevention. (USGS)
Randall, Allan D.; Freehafer, Douglas A.
2017-08-02
A variety of watershed properties available in 2015 from geographic information systems were tested in regression equations to estimate two commonly used statistical indices of the low flow of streams, namely the lowest flows averaged over 7 consecutive days that have a 1 in 10 and a 1 in 2 chance of not being exceeded in any given year (7-day, 10-year and 7-day, 2-year low flows). The equations were based on streamflow measurements in 51 watersheds in the Lower Hudson River Basin of New York during the years 1958–1978, when the number of streamflow measurement sites on unregulated streams was substantially greater than in subsequent years. These low-flow indices are chiefly a function of the area of surficial sand and gravel in the watershed; more precisely, 7-day, 10-year and 7-day, 2-year low flows both increase in proportion to the area of sand and gravel deposited by glacial meltwater, whereas 7-day, 2-year low flows also increase in proportion to the area of postglacial alluvium. Both low-flow statistics are also functions of mean annual runoff (a measure of net water input to the watershed from precipitation) and area of swamps and poorly drained soils in or adjacent to surficial sand and gravel (where groundwater recharge is unlikely and riparian water loss to evapotranspiration is substantial). Small but significant refinements in estimation accuracy resulted from the inclusion of two indices of stream geometry, channel slope and length, in the regression equations. Most of the regression analysis was undertaken with the ordinary least squares method, but four equations were replicated by using weighted least squares to provide a more realistic appraisal of the precision of low-flow estimates. The most accurate estimation equations tested in this study explain nearly 84 and 87 percent of the variation in 7-day, 10-year and 7-day, 2-year low flows, respectively, with standard errors of 0.032 and 0.050 cubic feet per second per square mile. The equations use natural values of streamflow and watershed properties; logarithmic transformations yielded less accurate equations inconsistent with some conceptualized relationships.
A Solution to Separation and Multicollinearity in Multiple Logistic Regression
Shen, Jianzhao; Gao, Sujuan
2010-01-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27–38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. The ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither solves the problems for each other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth’s penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using a current screening data from a community-based dementia study. PMID:20376286
A Solution to Separation and Multicollinearity in Multiple Logistic Regression.
Shen, Jianzhao; Gao, Sujuan
2008-10-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27-38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. The ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither solves the problems for each other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth's penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using a current screening data from a community-based dementia study.
New body fat prediction equations for severely obese patients.
Horie, Lilian Mika; Barbosa-Silva, Maria Cristina Gonzalez; Torrinhas, Raquel Susana; de Mello, Marco Túlio; Cecconello, Ivan; Waitzberg, Dan Linetzky
2008-06-01
Severe obesity imposes physical limitations to body composition assessment. Our aim was to compare body fat (BF) estimations of severely obese patients obtained by bioelectrical impedance (BIA) and air displacement plethysmography (ADP) for development of new equations for BF prediction. Severely obese subjects (83 female/36 male, mean age=41.6+/-11.6 years) had BF estimated by BIA and ADP. The agreement of the data was evaluated using Bland-Altman's graphic and concordance correlation coefficient (CCC). A multivariate regression analysis was performed to develop and validate new predictive equations. BF estimations from BIA (64.8+/-15 kg) and ADP (65.6+/-16.4 kg) did not differ (p>0.05, with good accuracy, precision, and CCC), but the Bland- Altman graphic showed a wide limit of agreement (-10.4; 8.8). The standard BIA equation overestimated BF in women (-1.3 kg) and underestimated BF in men (5.6 kg; p<0.05). Two BF new predictive equations were generated after BIA measurement, which predicted BF with higher accuracy, precision, CCC, and limits of agreement than the standard BIA equation. Standard BIA equations were inadequate for estimating BF in severely obese patients. Equations developed especially for this population provide more accurate BF assessment.
Stature in archeological samples from central Italy: methodological issues and diachronic changes.
Giannecchini, Monica; Moggi-Cecchi, Jacopo
2008-03-01
Stature reconstructions from skeletal remains are usually obtained through regression equations based on the relationship between height and limb bone length. Different equations have been employed to reconstruct stature in skeletal samples, but this is the first study to provide a systematic analysis of the reliability of the different methods for Italian historical samples. Aims of this article are: 1) to analyze the reliability of different regression methods to estimate stature for populations living in Central Italy from the Iron Age to Medieval times; 2) to search for trends in stature over this time period by applying the most reliable regression method. Long bone measurements were collected from 1,021 individuals (560 males, 461 females), from 66 archeological sites for males and 54 for females. Three time periods were identified: Iron Age, Roman period, and Medieval period. To determine the most appropriate equation to reconstruct stature the Delta parameter of Gini (Memorie di metodologia statistica. Milano: Giuffre A. 1939), in which stature estimates derived from different limb bones are compared, was employed. The equations proposed by Pearson (Philos Trans R Soc London 192 (1899) 169-244) and Trotter and Gleser for Afro-Americans (Am J Phys Anthropol 10 (1952) 463-514; Am J Phys Anthropol 47 (1977) 355-356) provided the most consistent estimates when applied to our sample. We then used the equation by Pearson for further analyses. Results indicate a reduction in stature in the transition from the Iron Age to the Roman period, and a subsequent increase in the transition from the Roman period to the Medieval period. Changes of limb lengths over time were more pronounced in the distal than in the proximal elements in both limbs. 2007 Wiley-Liss, Inc.
Estimation of Flood-Frequency Discharges for Rural, Unregulated Streams in West Virginia
Wiley, Jeffrey B.; Atkins, John T.
2010-01-01
Flood-frequency discharges were determined for 290 streamgage stations having a minimum of 9 years of record in West Virginia and surrounding states through the 2006 or 2007 water year. No trend was determined in the annual peaks used to calculate the flood-frequency discharges. Multiple and simple least-squares regression equations for the 100-year (1-percent annual-occurrence probability) flood discharge with independent variables that describe the basin characteristics were developed for 290 streamgage stations in West Virginia and adjacent states. The regression residuals for the models were evaluated and used to define three regions of the State, designated as Eastern Panhandle, Central Mountains, and Western Plateaus. Exploratory data analysis procedures identified 44 streamgage stations that were excluded from the development of regression equations representative of rural, unregulated streams in West Virginia. Regional equations for the 1.1-, 1.5-, 2-, 5-, 10-, 25-, 50-, 100-, 200-, and 500-year flood discharges were determined by generalized least-squares regression using data from the remaining 246 streamgage stations. Drainage area was the only significant independent variable determined for all equations in all regions. Procedures developed to estimate flood-frequency discharges on ungaged streams were based on (1) regional equations and (2) drainage-area ratios between gaged and ungaged locations on the same stream. The procedures are applicable only to rural, unregulated streams within the boundaries of West Virginia that have drainage areas within the limits of the stations used to develop the regional equations (from 0.21 to 1,461 square miles in the Eastern Panhandle, from 0.10 to 1,619 square miles in the Central Mountains, and from 0.13 to 1,516 square miles in the Western Plateaus). The accuracy of the equations is quantified by measuring the average prediction error (from 21.7 to 56.3 percent) and equivalent years of record (from 2.0 to 70.9 years).
Melching, Charles S.; Oberg, Kevin A.
1993-01-01
The acoustic velocity meter (AVM) on the Chicago Sanitary and Ship Canal (the Canal) at Romeoville, Ill., provides vital information for the accounting of the diversion of water from Lake Michigan. A detailed analysis of the discharge record on the Canal at Romeoville was done by the U.S. Geological Survey to establish the most accurate estimates of discharge for water years 1986-91. The analysis involved (1) checking the consistency of the discharges estimated by two different AVM's installed at Romeoville for consecutive time periods by statistical and regression analyses, (2) adjusting the discharge record to account for corrections to the width and depth of the Canal determined by field measurements, and (3) development of equations for estimating discharge on days when the AVM was inoperative using discharge estimates made by the Metropolitan Water Reclamation District of Greater Chicago at the lock, powerhouse, and controlling works at Lockport, Ill. No signi- ficant difference in the discharge estimates made by the two AVM's could be documented. The estimation equations combined regression analysis with physical principles of the outlet-works operation. The estimation equations simulated the verification period of October 1, 1991, to May 31, 1992, within 0.22, 5.15, and 0.66 percent for the mean, standard deviation, and skewness coefficient, respectively. Discharges were recalculated for the corrected width and depth, estimated for the periods of AVM inoperation, and entered into the discharge record for the station.
Peak-flow characteristics of Wyoming streams
Miller, Kirk A.
2003-01-01
Peak-flow characteristics for unregulated streams in Wyoming are described in this report. Frequency relations for annual peak flows through water year 2000 at 364 streamflow-gaging stations in and near Wyoming were evaluated and revised or updated as needed. Analyses of historical floods, temporal trends, and generalized skew were included in the evaluation. Physical and climatic basin characteristics were determined for each gaging station using a geographic information system. Gaging stations with similar peak-flow and basin characteristics were grouped into six hydrologic regions. Regional statistical relations between peak-flow and basin characteristics were explored using multiple-regression techniques. Generalized least squares regression equations for estimating magnitudes of annual peak flows with selected recurrence intervals from 1.5 to 500 years were developed for each region. Average standard errors of estimate range from 34 to 131 percent. Average standard errors of prediction range from 35 to 135 percent. Several statistics for evaluating and comparing the errors in these estimates are described. Limitations of the equations are described. Methods for applying the regional equations for various circumstances are listed and examples are given.
Kjelstrom, L.C.
1995-01-01
Many individual springs and groups of springs discharge water from volcanic rocks that form the north canyon wall of the Snake River between Milner Dam and King Hill. Previous estimates of annual mean discharge from these springs have been used to understand the hydrology of the eastern part of the Snake River Plain. Four methods that were used in previous studies or developed to estimate annual mean discharge since 1902 were (1) water-budget analysis of the Snake River; (2) correlation of water-budget estimates with discharge from 10 index springs; (3) determination of the combined discharge from individual springs or groups of springs by using annual discharge measurements of 8 springs, gaging-station records of 4 springs and 3 sites on the Malad River, and regression equations developed from 5 of the measured springs; and (4) a single regression equation that correlates gaging-station records of 2 springs with historical water-budget estimates. Comparisons made among the four methods of estimating annual mean spring discharges from 1951 to 1959 and 1963 to 1980 indicated that differences were about equivalent to a measurement error of 2 to 3 percent. The method that best demonstrates the response of annual mean spring discharge to changes in ground-water recharge and discharge is method 3, which combines the measurements and regression estimates of discharge from individual springs.
Flood-frequency characteristics of Wisconsin streams
Walker, John F.; Peppler, Marie C.; Danz, Mari E.; Hubbard, Laura E.
2017-05-22
Flood-frequency characteristics for 360 gaged sites on unregulated rural streams in Wisconsin are presented for percent annual exceedance probabilities ranging from 0.2 to 50 using a statewide skewness map developed for this report. Equations of the relations between flood-frequency and drainage-basin characteristics were developed by multiple-regression analyses. Flood-frequency characteristics for ungaged sites on unregulated, rural streams can be estimated by use of the equations presented in this report. The State was divided into eight areas of similar physiographic characteristics. The most significant basin characteristics are drainage area, soil saturated hydraulic conductivity, main-channel slope, and several land-use variables. The standard error of prediction for the equation for the 1-percent annual exceedance probability flood ranges from 56 to 70 percent for Wisconsin Streams; these values are larger than results presented in previous reports. The increase in the standard error of prediction is likely due to increased variability of the annual-peak discharges, resulting in increased variability in the magnitude of flood peaks at higher frequencies. For each of the unregulated rural streamflow-gaging stations, a weighted estimate based on the at-site log Pearson type III analysis and the multiple regression results was determined. The weighted estimate generally has a lower uncertainty than either the Log Pearson type III or multiple regression estimates. For regulated streams, a graphical method for estimating flood-frequency characteristics was developed from the relations of discharge and drainage area for selected annual exceedance probabilities. Graphs for the major regulated streams in Wisconsin are presented in the report.
Estimating Selected Streamflow Statistics Representative of 1930-2002 in West Virginia
Wiley, Jeffrey B.
2008-01-01
Regional equations and procedures were developed for estimating 1-, 3-, 7-, 14-, and 30-day 2-year; 1-, 3-, 7-, 14-, and 30-day 5-year; and 1-, 3-, 7-, 14-, and 30-day 10-year hydrologically based low-flow frequency values for unregulated streams in West Virginia. Regional equations and procedures also were developed for estimating the 1-day, 3-year and 4-day, 3-year biologically based low-flow frequency values; the U.S. Environmental Protection Agency harmonic-mean flows; and the 10-, 25-, 50-, 75-, and 90-percent flow-duration values. Regional equations were developed using ordinary least-squares regression using statistics from 117 U.S. Geological Survey continuous streamflow-gaging stations as dependent variables and basin characteristics as independent variables. Equations for three regions in West Virginia - North, South-Central, and Eastern Panhandle - were determined. Drainage area, precipitation, and longitude of the basin centroid are significant independent variables in one or more of the equations. Estimating procedures are presented for determining statistics at a gaging station, a partial-record station, and an ungaged location. Examples of some estimating procedures are presented.
Abu Bakar, S N; Aspalilah, A; AbdelNasser, I; Nurliza, A; Hairuliza, M J; Swarhib, M; Das, S; Mohd Nor, F
2017-01-01
Stature is one of the characteristics that could be used to identify human, besides age, sex and racial affiliation. This is useful when the body found is either dismembered, mutilated or even decomposed, and helps in narrowing down the missing person's identity. The main aim of the present study was to construct regression functions for stature estimation by using lower limb bones in the Malaysian population. The sample comprised 87 adult individuals (81 males, 6 females) aged between 20 to 79 years. The parameters such as thigh length, lower leg length, leg length, foot length, foot height and foot breadth were measured. They were measured by a ruler and measuring tape. Statistical analysis involved independent t-test to analyse the difference between lower limbs in male and female. The Pearson's correlation test was used to analyse correlations between lower limb parameters and stature, and the linear regressions were used to form equations. The paired t-test was used to compare between actual stature and estimated stature by using the equations formed. Using independent t-test, there was a significant difference (p< 0.05) in the measurement between males and females with regard to leg length, thigh length, lower leg length, foot length and foot breadth. The thigh length, leg length and foot length were observed to have strong correlations with stature with p= 0.75, p= 0.81 and p= 0.69, respectively. Linear regressions were formulated for stature estimation. Paired t-test showed no significant difference between actual stature and estimated stature. It is concluded that regression functions can be used to estimate stature to identify skeletal remains in the Malaysia population.
Karanjekar, Richa V; Bhatt, Arpita; Altouqui, Said; Jangikhatoonabad, Neda; Durai, Vennila; Sattler, Melanie L; Hossain, M D Sahadat; Chen, Victoria
2015-12-01
Accurately estimating landfill methane emissions is important for quantifying a landfill's greenhouse gas emissions and power generation potential. Current models, including LandGEM and IPCC, often greatly simplify treatment of factors like rainfall and ambient temperature, which can substantially impact gas production. The newly developed Capturing Landfill Emissions for Energy Needs (CLEEN) model aims to improve landfill methane generation estimates, but still require inputs that are fairly easy to obtain: waste composition, annual rainfall, and ambient temperature. To develop the model, methane generation was measured from 27 laboratory scale landfill reactors, with varying waste compositions (ranging from 0% to 100%); average rainfall rates of 2, 6, and 12 mm/day; and temperatures of 20, 30, and 37°C, according to a statistical experimental design. Refuse components considered were the major biodegradable wastes, food, paper, yard/wood, and textile, as well as inert inorganic waste. Based on the data collected, a multiple linear regression equation (R(2)=0.75) was developed to predict first-order methane generation rate constant values k as functions of waste composition, annual rainfall, and temperature. Because, laboratory methane generation rates exceed field rates, a second scale-up regression equation for k was developed using actual gas-recovery data from 11 landfills in high-income countries with conventional operation. The Capturing Landfill Emissions for Energy Needs (CLEEN) model was developed by incorporating both regression equations into the first-order decay based model for estimating methane generation rates from landfills. CLEEN model values were compared to actual field data from 6 US landfills, and to estimates from LandGEM and IPCC. For 4 of the 6 cases, CLEEN model estimates were the closest to actual. Copyright © 2015 Elsevier Ltd. All rights reserved.
Estimation of selected flow and water-quality characteristics of Alaskan streams
Parks, Bruce; Madison, R.J.
1985-01-01
Although hydrologic data are either sparse or nonexistent for large areas of Alaska, the drainage area, area of lakes, glacier and forest cover, and average precipitation in a hydrologic basin of interest can be measured or estimated from existing maps. Application of multiple linear regression techniques indicates that statistically significant correlations exist between properties of basins determined from maps and measured streamflow characteristics. This suggests that corresponding characteristics of ungaged basins can be estimated. Streamflow frequency characteristics can be estimated from regional equations developed for southeast, south-central and Yukon regions. Statewide or modified regional equations must be used, however, for the southwest, northwest, and Arctic Slope regions where there is a paucity of data. Equations developed from basin characteristics are given to estimate suspended-sediment values for glacial streams and, with less reliability, for nonglacial streams. Equations developed from available specific conductance data are given to estimate concentrations of major dissolved inorganic constituents. Suggestions are made for expanding the existing data base and thus improving the ability to estimate hydrologic characteristics for Alaskan streams. (USGS)
Chen, Ling; Feng, Yanqin; Sun, Jianguo
2017-10-01
This paper discusses regression analysis of clustered failure time data, which occur when the failure times of interest are collected from clusters. In particular, we consider the situation where the correlated failure times of interest may be related to cluster sizes. For inference, we present two estimation procedures, the weighted estimating equation-based method and the within-cluster resampling-based method, when the correlated failure times of interest arise from a class of additive transformation models. The former makes use of the inverse of cluster sizes as weights in the estimating equations, while the latter can be easily implemented by using the existing software packages for right-censored failure time data. An extensive simulation study is conducted and indicates that the proposed approaches work well in both the situations with and without informative cluster size. They are applied to a dental study that motivated this study.
Simple method for quick estimation of aquifer hydrogeological parameters
NASA Astrophysics Data System (ADS)
Ma, C.; Li, Y. Y.
2017-08-01
Development of simple and accurate methods to determine the aquifer hydrogeological parameters was of importance for groundwater resources assessment and management. Aiming at the present issue of estimating aquifer parameters based on some data of the unsteady pumping test, a fitting function of Theis well function was proposed using fitting optimization method and then a unitary linear regression equation was established. The aquifer parameters could be obtained by solving coefficients of the regression equation. The application of the proposed method was illustrated, using two published data sets. By the error statistics and analysis on the pumping drawdown, it showed that the method proposed in this paper yielded quick and accurate estimates of the aquifer parameters. The proposed method could reliably identify the aquifer parameters from long distance observed drawdowns and early drawdowns. It was hoped that the proposed method in this paper would be helpful for practicing hydrogeologists and hydrologists.
Whole stand volume tables for quaking aspen in the Rocky Mountains
Wayne D. Shepperd; H. Todd Mowrer
1984-01-01
Linear regression equations were developed to predict stand volumes for aspen given average stand basal area and average stand height. Tables constructed from these equations allow easy field estimation of gross merchantable cubic and board foot Scribner Rules per acre, and cubic meters per hectare using simple prism cruise data.
Waltemeyer, Scott D.
2008-01-01
Estimates of the magnitude and frequency of peak discharges are necessary for the reliable design of bridges, culverts, and open-channel hydraulic analysis, and for flood-hazard mapping in New Mexico and surrounding areas. The U.S. Geological Survey, in cooperation with the New Mexico Department of Transportation, updated estimates of peak-discharge magnitude for gaging stations in the region and updated regional equations for estimation of peak discharge and frequency at ungaged sites. Equations were developed for estimating the magnitude of peak discharges for recurrence intervals of 2, 5, 10, 25, 50, 100, and 500 years at ungaged sites by use of data collected through 2004 for 293 gaging stations on unregulated streams that have 10 or more years of record. Peak discharges for selected recurrence intervals were determined at gaging stations by fitting observed data to a log-Pearson Type III distribution with adjustments for a low-discharge threshold and a zero skew coefficient. A low-discharge threshold was applied to frequency analysis of 140 of the 293 gaging stations. This application provides an improved fit of the log-Pearson Type III frequency distribution. Use of the low-discharge threshold generally eliminated the peak discharge by having a recurrence interval of less than 1.4 years in the probability-density function. Within each of the nine regions, logarithms of the maximum peak discharges for selected recurrence intervals were related to logarithms of basin and climatic characteristics by using stepwise ordinary least-squares regression techniques for exploratory data analysis. Generalized least-squares regression techniques, an improved regression procedure that accounts for time and spatial sampling errors, then were applied to the same data used in the ordinary least-squares regression analyses. The average standard error of prediction, which includes average sampling error and average standard error of regression, ranged from 38 to 93 percent (mean value is 62, and median value is 59) for the 100-year flood. The 1996 investigation standard error of prediction for the flood regions ranged from 41 to 96 percent (mean value is 67, and median value is 68) for the 100-year flood that was analyzed by using generalized least-squares regression analysis. Overall, the equations based on generalized least-squares regression techniques are more reliable than those in the 1996 report because of the increased length of record and improved geographic information system (GIS) method to determine basin and climatic characteristics. Flood-frequency estimates can be made for ungaged sites upstream or downstream from gaging stations by using a method that transfers flood-frequency data at the gaging station to the ungaged site by using a drainage-area ratio adjustment equation. The peak discharge for a given recurrence interval at the gaging station, drainage-area ratio, and the drainage-area exponent from the regional regression equation of the respective region is used to transfer the peak discharge for the recurrence interval to the ungaged site. Maximum observed peak discharge as related to drainage area was determined for New Mexico. Extreme events are commonly used in the design and appraisal of bridge crossings and other structures. Bridge-scour evaluations are commonly made by using the 500-year peak discharge for these appraisals. Peak-discharge data collected at 293 gaging stations and 367 miscellaneous sites were used to develop a maximum peak-discharge relation as an alternative method of estimating peak discharge of an extreme event such as a maximum probable flood.
Reaeration equations derived from U.S. geological survey database
Melching, C.S.; Flores, H.E.
1999-01-01
Accurate estimation of the reaeration-rate coefficient (K2) is extremely important for waste-load allocation. Currently, available K2 estimation equations generally yield poor estimates when applied to stream conditions different from those for which the equations were derived because they were derived from small databases composed of potentially highly inaccurate measurements. A large data set of K2 measurements made with tracer-gas methods was compiled from U.S. Geological Survey studies. This compilation included 493 reaches on 166 streams in 23 states. Careful screening to detect and eliminate erroneous measurements reduced the date set to 371 measurements. These measurements were divided into four subgroups on the basis of flow regime (channel control or pool and riffle) and stream scale (discharge greater than or less than 0.556 m3/s). Multiple linear regression in logarithms was applied to relate K2 to 12 stream hydraulic and water-quality characteristics. The resulting best-estimation equations had the form of semiempirical equations that included the rate of energy dissipation and discharge or depth and width as variables. For equation verification, a data set of K2 measurements made with tracer-gas procedures by other agencies was compiled from the literature. This compilation included 127 reaches on at least 24 streams in at least seven states. The standard error of estimate obtained when applying the developed equations to the U.S. Geological Survey data set ranged from 44 to 61%, whereas the standard error of estimate was 78% when applied to the verification data set.Accurate estimation of the reaeration-rate coefficient (K2) is extremely important for waste-load allocation. Currently, available K2 estimation equations generally yield poor estimates when applied to stream conditions different from those for which the equations were derived because they were derived from small databases composed of potentially highly inaccurate measurements. A large data set of K2 measurements made with tracer-gas methods was compiled from U.S. Geological Survey studies. This compilation included 493 reaches on 166 streams in 23 states. Careful screening to detect and eliminate erroneous measurements reduced the data set to 371 measurements. These measurements were divided into four subgroups on the basis of flow regime (channel control or pool and riffle) and stream scale (discharge greater than or less than 0.556 m3/s). Multiple linear regression in logarithms was applied to relate K2 to 12 stream hydraulic and water-quality characteristics. The resulting best-estimation equations had the form of semiempirical equations that included the rate of energy dissipation and discharge or depth and width as variables. For equation verification, a data set of K2 measurements made with tracer-gas procedures by other agencies was compiled from the literature. This compilation included 127 reaches on at least 24 streams in at least seven states. The standard error of estimate obtained when applying the developed equations to the U.S. Geological Survey data set ranged from 44 to 61%, whereas the standard error of estimate was 78% when applied to the verification data set.
Curran, Janet H.; Barth, Nancy A.; Veilleux, Andrea G.; Ourso, Robert T.
2016-03-16
Estimates of the magnitude and frequency of floods are needed across Alaska for engineering design of transportation and water-conveyance structures, flood-insurance studies, flood-plain management, and other water-resource purposes. This report updates methods for estimating flood magnitude and frequency in Alaska and conterminous basins in Canada. Annual peak-flow data through water year 2012 were compiled from 387 streamgages on unregulated streams with at least 10 years of record. Flood-frequency estimates were computed for each streamgage using the Expected Moments Algorithm to fit a Pearson Type III distribution to the logarithms of annual peak flows. A multiple Grubbs-Beck test was used to identify potentially influential low floods in the time series of peak flows for censoring in the flood frequency analysis.For two new regional skew areas, flood-frequency estimates using station skew were computed for stations with at least 25 years of record for use in a Bayesian least-squares regression analysis to determine a regional skew value. The consideration of basin characteristics as explanatory variables for regional skew resulted in improvements in precision too small to warrant the additional model complexity, and a constant model was adopted. Regional Skew Area 1 in eastern-central Alaska had a regional skew of 0.54 and an average variance of prediction of 0.45, corresponding to an effective record length of 22 years. Regional Skew Area 2, encompassing coastal areas bordering the Gulf of Alaska, had a regional skew of 0.18 and an average variance of prediction of 0.12, corresponding to an effective record length of 59 years. Station flood-frequency estimates for study sites in regional skew areas were then recomputed using a weighted skew incorporating the station skew and regional skew. In a new regional skew exclusion area outside the regional skew areas, the density of long-record streamgages was too sparse for regional analysis and station skew was used for all estimates. Final station flood frequency estimates for all study streamgages are presented for the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities.Regional multiple-regression analysis was used to produce equations for estimating flood frequency statistics from explanatory basin characteristics. Basin characteristics, including physical and climatic variables, were updated for all study streamgages using a geographical information system and geospatial source data. Screening for similar-sized nested basins eliminated hydrologically redundant sites, and screening for eligibility for analysis of explanatory variables eliminated regulated peaks, outburst peaks, and sites with indeterminate basin characteristics. An ordinary least‑squares regression used flood-frequency statistics and basin characteristics for 341 streamgages (284 in Alaska and 57 in Canada) to determine the most suitable combination of basin characteristics for a flood-frequency regression model and to explore regional grouping of streamgages for explaining variability in flood-frequency statistics across the study area. The most suitable model for explaining flood frequency used drainage area and mean annual precipitation as explanatory variables for the entire study area as a region. Final regression equations for estimating the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probability discharge in Alaska and conterminous basins in Canada were developed using a generalized least-squares regression. The average standard error of prediction for the regression equations for the various annual exceedance probabilities ranged from 69 to 82 percent, and the pseudo-coefficient of determination (pseudo-R2) ranged from 85 to 91 percent.The regional regression equations from this study were incorporated into the U.S. Geological Survey StreamStats program for a limited area of the State—the Cook Inlet Basin. StreamStats is a national web-based geographic information system application that facilitates retrieval of streamflow statistics and associated information. StreamStats retrieves published data for gaged sites and, for user-selected ungaged sites, delineates drainage areas from topographic and hydrographic data, computes basin characteristics, and computes flood frequency estimates using the regional regression equations.
Robust mislabel logistic regression without modeling mislabel probabilities.
Hung, Hung; Jou, Zhi-Yu; Huang, Su-Yun
2018-03-01
Logistic regression is among the most widely used statistical methods for linear discriminant analysis. In many applications, we only observe possibly mislabeled responses. Fitting a conventional logistic regression can then lead to biased estimation. One common resolution is to fit a mislabel logistic regression model, which takes into consideration of mislabeled responses. Another common method is to adopt a robust M-estimation by down-weighting suspected instances. In this work, we propose a new robust mislabel logistic regression based on γ-divergence. Our proposal possesses two advantageous features: (1) It does not need to model the mislabel probabilities. (2) The minimum γ-divergence estimation leads to a weighted estimating equation without the need to include any bias correction term, that is, it is automatically bias-corrected. These features make the proposed γ-logistic regression more robust in model fitting and more intuitive for model interpretation through a simple weighting scheme. Our method is also easy to implement, and two types of algorithms are included. Simulation studies and the Pima data application are presented to demonstrate the performance of γ-logistic regression. © 2017, The International Biometric Society.
NASA Astrophysics Data System (ADS)
Borodachev, S. M.
2016-06-01
The simple derivation of recursive least squares (RLS) method equations is given as special case of Kalman filter estimation of a constant system state under changing observation conditions. A numerical example illustrates application of RLS to multicollinearity problem.
Techniques for estimating magnitude and frequency of floods in Minnesota
Guetzkow, Lowell C.
1977-01-01
Estimating relations have been developed to provide engineers and designers with improved techniques for defining flow-frequency characteristics to satisfy hydraulic planning and design requirements. The magnitude and frequency of floods up to the 100-year recurrence interval can be determined for most streams in Minnesota by methods presented. By multiple regression analysis, equations have been developed for estimating flood-frequency relations at ungaged sites on natural flow streams. Eight distinct hydrologic regions are delineated within the State with boundaries defined generally by river basin divides. Regression equations are provided for each region which relate selected frequency floods to significant basin parameters. For main-stem streams, graphs are presented showing floods for selected recurrence intervals plotted against contributing drainage area. Flow-frequency estimates for intervening sites along the Minnesota River, Mississippi River, and the Red River of the North can be derived from these graphs. Flood-frequency characteristics are tabulated for 201 paging stations having 10 or more years of record.
A new model for estimating total body water from bioelectrical resistance
NASA Technical Reports Server (NTRS)
Siconolfi, S. F.; Kear, K. T.
1992-01-01
Estimation of total body water (T) from bioelectrical resistance (R) is commonly done by stepwise regression models with height squared over R, H(exp 2)/R, age, sex, and weight (W). Polynomials of H(exp 2)/R have not been included in these models. We examined the validity of a model with third order polynomials and W. Methods: T was measured with oxygen-18 labled water in 27 subjects. R at 50 kHz was obtained from electrodes placed on the hand and foot while subjects were in the supine position. A stepwise regression equation was developed with 13 subjects (age 31.5 plus or minus 6.2 years, T 38.2 plus or minus 6.6 L, W 65.2 plus or minus 12.0 kg). Correlations, standard error of estimates and mean differences were computed between T and estimated T's from the new (N) model and other models. Evaluations were completed with the remaining 14 subjects (age 32.4 plus or minus 6.3 years, T 40.3 plus or minus 8 L, W 70.2 plus or minus 12.3 kg) and two of its subgroups (high and low) Results: A regression equation was developed from the model. The only significant mean difference was between T and one of the earlier models. Conclusion: Third order polynomials in regression models may increase the accuracy of estimating total body water. Evaluating the model with a larger population is needed.
Comparison of total body water estimates from O-18 and bioelectrical response prediction equations
NASA Technical Reports Server (NTRS)
Barrows, Linda H.; Inners, L. Daniel; Stricklin, Marcella D.; Klein, Peter D.; Wong, William W.; Siconolfi, Steven F.
1993-01-01
Identification of an indirect, rapid means to measure total body water (TBW) during space flight may aid in quantifying hydration status and assist in countermeasure development. Bioelectrical response testing and hydrostatic weighing were performed on 27 subjects who ingested O-18, a naturally occurring isotope of oxygen, to measure true TBW. TBW estimates from three bioelectrical response prediction equations and fat-free mass (FFM) were compared to TBW measured from O-18. A repeated measures MANOVA with post-hoc Dunnett's Test indicated a significant (p less than 0.05) difference between TBW estimates from two of the three bioelectrical response prediction equations and O-18. TBW estimates from FFM and the Kushner & Schoeller (1986) equation yielded results that were similar to those given by O-18. Strong correlations existed between each prediction method and O-18; however, standard errors, identified through regression analyses, were higher for the bioelectrical response prediction equations compared to those derived from FFM. These findings suggest (1) the Kushner & Schoeller (1986) equation may provide a valid measure of TBW, (2) other TBW prediction equations need to be identified that have variability similar to that of FFM, and (3) bioelectrical estimates of TBW may prove valuable in quantifying hydration status during space flight.
Regression model estimation of early season crop proportions: North Dakota, some preliminary results
NASA Technical Reports Server (NTRS)
Lin, K. K. (Principal Investigator)
1982-01-01
To estimate crop proportions early in the season, an approach is proposed based on: use of a regression-based prediction equation to obtain an a priori estimate for specific major crop groups; modification of this estimate using current-year LANDSAT and weather data; and a breakdown of the major crop groups into specific crops by regression models. Results from the development and evaluation of appropriate regression models for the first portion of the proposed approach are presented. The results show that the model predicts 1980 crop proportions very well at both county and crop reporting district levels. In terms of planted acreage, the model underpredicted 9.1 percent of the 1980 published data on planted acreage at the county level. It predicted almost exactly the 1980 published data on planted acreage at the crop reporting district level and overpredicted the planted acreage by just 0.92 percent.
Dairy manure nutrient analysis using quick tests.
Singh, A; Bicudo, J R
2005-05-01
Rapid on-farm assessment of manure nutrient content can be achieved with the use of quick tests. These tests can be used to indirectly measure the nutrient content in animal slurries immediately before manure is applied on agricultural fields. The objective of this study was to assess the reliability of hydrometers, electrical conductivity meter and pens, and Agros N meter against standard laboratory methods. Manure samples were collected from 34 dairy farms in the Mammoth Cave area in central Kentucky. Regression equations were developed for combined and individual counties located In the area (Barren, Hart and Monroe). Our results indicated that accuracy in nutrient estimation could be improved if separate linear regressions were developed for farms with similar facilities in a county. Direct hydrometer estimates of total nitrogen were among the most accurate when separate regression equations were developed for each county (R2 = 0.61, 0.93, and 0.74 for Barren, Hart and Monroe county, respectively). Reasonably accurate estimates (R2 > 0.70) were also obtained for total nitrogen and total phosphorus using hydrometers, either by relating specific gravity to nutrient content or to total solids content. Estimation of ammoniacal nitrogen with Agros N meter and electrical conductivity meter/pens correlated well with standard laboratory determinations, especially while using the individual data sets from Hart County (R2 = 0.70 to 0.87). This study indicates that the use of quick test calibration equations developed for a small area or region where farms are similar in terms of manure handling and management, housing, and feed ration are more appropriate than using "universal" equations usually developed with combined data sets. Accuracy is expected to improve if individual farms develop their own calibration curves. Nevertheless, we suggest confidence intervals always be specified for nutrients estimated through quick testing for any specific region, county, or farm.
Estimation of stature from radiologic anthropometry of the lumbar vertebral dimensions in Chinese.
Zhang, Kui; Chang, Yun-feng; Fan, Fei; Deng, Zhen-hua
2015-11-01
The recent study was to assess the relationship between the radiologic anthropometry of the lumbar vertebral dimensions and stature in Chinese and to develop regression formulae to estimate stature from these dimensions. A total of 412 normal, healthy volunteers, comprising 206 males and 206 females, were recruited. The linear regression analysis were performed to assess the correlation between the stature and lengths of various segments of the lumbar vertebral column. Among the regression equations created for single variable, the predictive value was greatest for the reconstruction of stature from the lumbar segment in both sexes and subgroup analysis. When individual vertebral body was used, the heights of posterior vertebral body of L3 gave the most accurate results for male group, the heights of central vertebral body of L1 provided the most accurate results for female group and female group with age above 45 years, the heights of central vertebral body of L3 gave the most accurate results for the groups with age from 20-45 years for both sexes and the male group with age above 45 years. The heights of anterior vertebral body of L5 gave the less accurate results except for the heights of anterior vertebral body of L4 provided the less accurate result for the male group with age above 45 years. As expected, multiple regression equations were more successful than equations derived from a single variable. The research observations suggest lumbar vertebral dimensions to be useful in stature estimation among Chinese population. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Minimizing bias in biomass allometry: Model selection and log transformation of data
Joseph Mascaro; undefined undefined; Flint Hughes; Amanda Uowolo; Stefan A. Schnitzer
2011-01-01
Nonlinear regression is increasingly used to develop allometric equations for forest biomass estimation (i.e., as opposed to the raditional approach of log-transformation followed by linear regression). Most statistical software packages, however, assume additive errors by default, violating a key assumption of allometric theory and possibly producing spurious models....
Accounting for measurement error in log regression models with applications to accelerated testing.
Richardson, Robert; Tolley, H Dennis; Evenson, William E; Lunt, Barry M
2018-01-01
In regression settings, parameter estimates will be biased when the explanatory variables are measured with error. This bias can significantly affect modeling goals. In particular, accelerated lifetime testing involves an extrapolation of the fitted model, and a small amount of bias in parameter estimates may result in a significant increase in the bias of the extrapolated predictions. Additionally, bias may arise when the stochastic component of a log regression model is assumed to be multiplicative when the actual underlying stochastic component is additive. To account for these possible sources of bias, a log regression model with measurement error and additive error is approximated by a weighted regression model which can be estimated using Iteratively Re-weighted Least Squares. Using the reduced Eyring equation in an accelerated testing setting, the model is compared to previously accepted approaches to modeling accelerated testing data with both simulations and real data.
Estimation of Missing Water-Level Data for the Everglades Depth Estimation Network (EDEN)
Conrads, Paul; Petkewich, Matthew D.
2009-01-01
The Everglades Depth Estimation Network (EDEN) is an integrated network of real-time water-level gaging stations, ground-elevation models, and water-surface elevation models designed to provide scientists, engineers, and water-resource managers with current (2000-2009) water-depth information for the entire freshwater portion of the greater Everglades. The U.S. Geological Survey Greater Everglades Priority Ecosystems Science provides support for EDEN and their goal of providing quality-assured monitoring data for the U.S. Army Corps of Engineers Comprehensive Everglades Restoration Plan. To increase the accuracy of the daily water-surface elevation model, water-level estimation equations were developed to fill missing data. To minimize the occurrences of no estimation of data due to missing data for an input station, a minimum of three linear regression equations were developed for each station using different input stations. Of the 726 water-level estimation equations developed to fill missing data at 239 stations, more than 60 percent of the equations have coefficients of determination greater than 0.90, and 92 percent have an coefficient of determination greater than 0.70.
Novel equations to estimate lean body mass in maintenance hemodialysis patients.
Noori, Nazanin; Kovesdy, Csaba P; Bross, Rachelle; Lee, Martin; Oreopoulos, Antigone; Benner, Deborah; Mehrotra, Rajnish; Kopple, Joel D; Kalantar-Zadeh, Kamyar
2011-01-01
Lean body mass (LBM) is an important nutritional measure representing muscle mass and somatic protein in hemodialysis patients, for whom we developed and tested equations to estimate LBM. A study of diagnostic test accuracy. The development cohort included 118 hemodialysis patients with LBM measured using dual-energy x-ray absorptiometry (DEXA) and near-infrared (NIR) interactance. The validation cohort included 612 additional hemodialysis patients with LBM measured using a portable NIR interactance technique during hemodialysis. 3-month averaged serum concentrations of creatinine, albumin, and prealbumin; normalized protein nitrogen appearance; midarm muscle circumference (MAMC); handgrip strength; and subjective global assessment of nutrition. LBM measured using DEXA in the development cohort and NIR interactance in validation cohorts. In the development cohort, DEXA and NIR interactance correlated strongly (r = 0.94, P < 0.001). DEXA-measured LBM correlated with serum creatinine level, MAMC, and handgrip strength, but not with other nutritional markers. Three regression equations to estimate DEXA-measured LBM were developed based on each of these 3 surrogates and sex, height, weight, and age (and urea reduction ratio for the serum creatinine regression). In the validation cohort, the validity of the equations was tested against the NIR interactance-measured LBM. The equation estimates correlated well with NIR interactance-measured LBM (R² ≥ 0.88), although in higher LBM ranges, they tended to underestimate it. Median (95% confidence interval) differences and interquartile range for differences between equation estimates and NIR interactance-measured LBM were 3.4 (-3.2 to 12.0) and 3.0 (1.1-5.1) kg for serum creatinine and 4.0 (-2.6 to 13.6) and 3.7 (1.3-6.0) kg for MAMC, respectively. DEXA measurements were obtained on a nondialysis day, whereas NIR interactance was performed during hemodialysis treatment, with the likelihood of confounding by volume status variations. Compared with reference measures of LBM, equations using serum creatinine level, MAMC, or handgrip strength and demographic variables can estimate LBM accurately in long-term hemodialysis patients. Copyright © 2010 National Kidney Foundation, Inc. Published by Elsevier Inc. All rights reserved.
Novel Equations to Estimate Lean Body Mass in Maintenance Hemodialysis Patients
Noori, Nazanin; Kovesdy, Csaba P; Bross, Rachelle; Lee, Martin; Oreopoulos, Antigone; Benner, Deborah; Mehrotra, Rajnish; Kopple, Joel D; Kalantar-Zadeh, Kamyar
2010-01-01
Background Lean body mass (LBM) is an important nutritional measure representing muscle mass and somatic protein in hemodialysis patients, in whom we developed and tested equations to estimate LBM. Study Design A study of diagnostic test accuracy. Setting and Participants The development cohort included 118 hemodialysis patients, with LBM measured using dual-energy -X-ray absorptiometry (DEXA) and near-infrared (NIR) interactance. The validation cohort included 612 additional hemodialysis patients with LBM measured using portable NIR interactance technique during hemodialysis. Index Tests 3-month averaged serum concentrations of creatinine, albumin and prealbumin, normalized protein-nitrogen-appearance, mid-arm muscle circumference (MAMC), handgrip strength, and subjective global assessment of nutrition. Reference Test LBM measured via DEXA in the development cohort and via NIR interactance in validation cohorts. Results In the development cohort, DEXA and NIR interactance were strongly correlated (r=0.94, p<0.001). DEXA-measured LBM correlated with serum creatinine, MAMC, handgrip strength but not with other nutritional markers. Three regression equations to estimate DEXA-measured LBM were developed based on each of these three surrogates and gender, height, weight, and age (and urea reduction ratio for the serum creatinine regression). In the validation cohort, the validity of the equations were tested against the NIR interactance measured LBM. The equation estimates correlated well with NIR interactance measured LBM (R221 ≥0.88), although in higher LBM ranges they tended to underestimate it. Median differences between equation estimates and NIR interactance-measured LBM were 3.4 (25th–75th percentile, −3.2 to 12.0) and 3.0 (25th–75th percentile, 1.1–5.1) kg for serum creatinine and 4.0 (25th–75th percentile, −2.6 to 13.6) and 3.7 (25th–75th percentile, 1.3–6.0) kg for MAMC. Limitations DEXA measurements were performed on a non-dialysis day whereas NIR interactance was obtained during the hemodialysis treatment, with likelihood of confounding by volume status variations. Conclusions Comparing to reference measures of LBM, equations using serum creatinine, MAMC, or handgrip strength and demographic variables can accurately estimate LBM in long-term hemodialysis patients. PMID:21184920
THOMAS J. BRANDEIS; MARIA DEL ROCIO SUAREZ ROZO
2005-01-01
Total aboveground live tree biomass in Puerto Rican lower montane wet, subtropical wet, subtropical moist and subtropical dry forests was estimated using data from two forest inventories and published regression equations. Multiple potentially-applicable published biomass models existed for some forested life zones, and their estimates tended to diverge with increasing...
Thomas J. Brandeis; Maria Del Rocio; Suarez Rozo
2005-01-01
Total aboveground live tree biomass in Puerto Rican lower montane wet, subtropical wet, subtropical moist and subtropical dry forests was estimated using data from two forest inventories and published regression equations. Multiple potentially-applicable published biomass models existed for some forested life zones, and their estimates tended to diverge with increasing...
Using twig diameters to estimate browse utilization on three shrub species in southeastern Montana
Mark A. Rumble
1987-01-01
Browse utilization estimates based on twig length and twig weight were compared for skunkbush sumac, wax currant, and chokecherry. Linear regression analysis was valid for twig length data; twig weight equations are nonlinear. Estimates of twig weight are more accurate. Problems encountered during development of a utilization model are discussed.
Methods for Adjusting U.S. Geological Survey Rural Regression Peak Discharges in an Urban Setting
Moglen, Glenn E.; Shivers, Dorianne E.
2006-01-01
A study was conducted of 78 U.S. Geological Survey gaged streams that have been subjected to varying degrees of urbanization over the last three decades. Flood-frequency analysis coupled with nonlinear regression techniques were used to generate a set of equations for converting peak discharge estimates determined from rural regression equations to a set of peak discharge estimates that represent known urbanization. Specifically, urban regression equations for the 2-, 5-, 10-, 25-, 50-, 100-, and 500-year return periods were calibrated as a function of the corresponding rural peak discharge and the percentage of impervious area in a watershed. The results of this study indicate that two sets of equations, one set based on imperviousness and one set based on population density, performed well. Both sets of equations are dependent on rural peak discharges, a measure of development (average percentage of imperviousness or average population density), and a measure of homogeneity of development within a watershed. Average imperviousness was readily determined by using geographic information system methods and commonly available land-cover data. Similarly, average population density was easily determined from census data. Thus, a key advantage to the equations developed in this study is that they do not require field measurements of watershed characteristics as did the U.S. Geological Survey urban equations developed in an earlier investigation. During this study, the U.S. Geological Survey PeakFQ program was used as an integral tool in the calibration of all equations. The scarcity of historical land-use data, however, made exclusive use of flow records necessary for the 30-year period from 1970 to 2000. Such relatively short-duration streamflow time series required a nonstandard treatment of the historical data function of the PeakFQ program in comparison to published guidelines. Thus, the approach used during this investigation does not fully comply with the guidelines set forth in U.S. Geological Survey Bulletin 17B, and modifications may be needed before it can be applied in practice.
Fifty-year flood-inundation maps for Comayagua, Hondura
Kresch, David L.; Mastin, Mark C.; Olsen, T.D.
2002-01-01
After the devastating floods caused by Hurricane Mitch in 1998, maps of the areas and depths of the 50-year-flood inundation at 15 municipalities in Honduras were prepared as a tool for agencies involved in reconstruction and planning. This report, which is one in a series of 15, presents maps of areas in the municipality of Comayagua that would be inundated by 50-year floods on Rio Humuya and Rio Majada. Geographic Information System (GIS) coverages of the flood inundation are available on a computer in the municipality of Comayagua as part of the Municipal GIS project and on the Internet at the Flood Hazard Mapping Web page (http://mitchnts1.cr.usgs.gov/projects/floodhazard.html). These coverages allow users to view the flood inundation in much more detail than is possible using the maps in this report. Water-surface elevations for 50-year-floods on Rio Humuya and Rio Majada at Comayagua were estimated using HEC-RAS, a one-dimensional, steady-flow, step-backwater computer program. The channel and floodplain cross sections used in HEC-RAS were developed from an airborne light-detection-and-ranging (LIDAR) topographic survey of the area. The 50-year-flood discharge for Rio Humuya at Comayagua, 1,400 cubic meters per second, was estimated using a regression equation that relates the 50-year-flood discharge to drainage area and mean annual precipitation. The reasonableness of the regression discharge was evaluated by comparing it with drainage-area-adjusted 50-year-flood discharges estimated for three long-term Rio Humuya stream-gaging stations. The drainage-area-adjusted 50-year-flood discharges estimated from the gage records ranged from 946 to 1,365 cubic meters per second. Because the regression equation discharge agrees closely with the high end of the range of discharges estimated from the gaging-station records, it was used for the hydraulic modeling to ensure that the resulting 50-year-flood water-surface elevations would not be underestimated. The 50-year-flood discharge for Rio Majada at Comayagua (230 cubic meters per second) was estimated using the regression equation because there are no long-term gaging-stations on this river from which to estimate the discharge.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rupšys, P.
A system of stochastic differential equations (SDE) with mixed-effects parameters and multivariate normal copula density function were used to develop tree height model for Scots pine trees in Lithuania. A two-step maximum likelihood parameter estimation method is used and computational guidelines are given. After fitting the conditional probability density functions to outside bark diameter at breast height, and total tree height, a bivariate normal copula distribution model was constructed. Predictions from the mixed-effects parameters SDE tree height model calculated during this research were compared to the regression tree height equations. The results are implemented in the symbolic computational language MAPLE.
Analysis of the low-flow characteristics of streams in Louisiana
Lee, Fred N.
1985-01-01
The U.S. Geological Survey, in cooperation with the Louisiana Department of Transportation and Development, Office of Public Works, used geologic maps, soils maps, precipitation data, and low-flow data to define four hydrographic regions in Louisiana having distinct low-flow characteristics. Equations were derived, using regression analyses, to estimate the 7Q2, 7Q10, and 7Q20 flow rates for basically unaltered stream basins smaller than 525 square miles. Independent variables in the equations include drainage area (square miles), mean annual precipitation index (inches), and main channel slope (feet per mile). Average standard errors of regression ranged from +44 to +61 percent. Graphs are given for estimating the 7Q2, 7Q10, and 7Q20 for stream basins for which the drainage area of the most downstream data-collection site is larger than 525 square miles. Detailed examples are given in this report for the use of the equations and graphs.
The Accuracy of Anthropometric Equations to Assess Body Fat in Adults with Down Syndrome
ERIC Educational Resources Information Center
Rossato, Mateus; Dellagrana, Rodolfo André; da Costa, Rafael Martins; de Souza Bezerra, Ewertton; dos Santos, João Otacílio Libardoni; Rech, Cassiano Ricardo
2018-01-01
Background: The aim of this study was to verify the accuracy of anthropometric equations to estimate the body density (BD) of adults with Down syndrome (DS), and propose new regression equations. Materials and methods: Twenty-one males (30.5 ± 9.4 years) and 17 females (27.3 ± 7.7 years) with DS participated in this study. The reference method for…
Olson, Scott A.; Brouillette, Michael C.
2006-01-01
A logistic regression equation was developed for estimating the probability of a stream flowing intermittently at unregulated, rural stream sites in Vermont. These determinations can be used for a wide variety of regulatory and planning efforts at the Federal, State, regional, county and town levels, including such applications as assessing fish and wildlife habitats, wetlands classifications, recreational opportunities, water-supply potential, waste-assimilation capacities, and sediment transport. The equation will be used to create a derived product for the Vermont Hydrography Dataset having the streamflow characteristic of 'intermittent' or 'perennial.' The Vermont Hydrography Dataset is Vermont's implementation of the National Hydrography Dataset and was created at a scale of 1:5,000 based on statewide digital orthophotos. The equation was developed by relating field-verified perennial or intermittent status of a stream site during normal summer low-streamflow conditions in the summer of 2005 to selected basin characteristics of naturally flowing streams in Vermont. The database used to develop the equation included 682 stream sites with drainage areas ranging from 0.05 to 5.0 square miles. When the 682 sites were observed, 126 were intermittent (had no flow at the time of the observation) and 556 were perennial (had flowing water at the time of the observation). The results of the logistic regression analysis indicate that the probability of a stream having intermittent flow in Vermont is a function of drainage area, elevation of the site, the ratio of basin relief to basin perimeter, and the areal percentage of well- and moderately well-drained soils in the basin. Using a probability cutpoint (a lower probability indicates the site has perennial flow and a higher probability indicates the site has intermittent flow) of 0.5, the logistic regression equation correctly predicted the perennial or intermittent status of 116 test sites 85 percent of the time.
Site conditions related to erosion on logging roads
R. M. Rice; J. D. McCashion
1985-01-01
Synopsis - Data collected from 299 road segments in northwestern California were used to develop and test a procedure for estimating and managing road-related erosion. Site conditions and the design of each segment were described by 30 variables. Equations developed using 149 of the road segments were tested on the other 150. The best multiple regression equation...
Flood-frequency prediction methods for unregulated streams of Tennessee, 2000
Law, George S.; Tasker, Gary D.
2003-01-01
Up-to-date flood-frequency prediction methods for unregulated, ungaged rivers and streams of Tennessee have been developed. Prediction methods include the regional-regression method and the newer region-of-influence method. The prediction methods were developed using stream-gage records from unregulated streams draining basins having from 1 percent to about 30 percent total impervious area. These methods, however, should not be used in heavily developed or storm-sewered basins with impervious areas greater than 10 percent. The methods can be used to estimate 2-, 5-, 10-, 25-, 50-, 100-, and 500-year recurrence-interval floods of most unregulated rural streams in Tennessee. A computer application was developed that automates the calculation of flood frequency for unregulated, ungaged rivers and streams of Tennessee. Regional-regression equations were derived by using both single-variable and multivariable regional-regression analysis. Contributing drainage area is the explanatory variable used in the single-variable equations. Contributing drainage area, main-channel slope, and a climate factor are the explanatory variables used in the multivariable equations. Deleted-residual standard error for the single-variable equations ranged from 32 to 65 percent. Deleted-residual standard error for the multivariable equations ranged from 31 to 63 percent. These equations are included in the computer application to allow easy comparison of results produced by the different methods. The region-of-influence method calculates multivariable regression equations for each ungaged site and recurrence interval using basin characteristics from 60 similar sites selected from the study area. Explanatory variables that may be used in regression equations computed by the region-of-influence method include contributing drainage area, main-channel slope, a climate factor, and a physiographic-region factor. Deleted-residual standard error for the region-of-influence method tended to be only slightly smaller than those for the regional-regression method and ranged from 27 to 62 percent.
The use of generalized estimating equations in the analysis of motor vehicle crash data.
Hutchings, Caroline B; Knight, Stacey; Reading, James C
2003-01-01
The purpose of this study was to determine if it is necessary to use generalized estimating equations (GEEs) in the analysis of seat belt effectiveness in preventing injuries in motor vehicle crashes. The 1992 Utah crash dataset was used, excluding crash participants where seat belt use was not appropriate (n=93,633). The model used in the 1996 Report to Congress [Report to congress on benefits of safety belts and motorcycle helmets, based on data from the Crash Outcome Data Evaluation System (CODES). National Center for Statistics and Analysis, NHTSA, Washington, DC, February 1996] was analyzed for all occupants with logistic regression, one level of nesting (occupants within crashes), and two levels of nesting (occupants within vehicles within crashes) to compare the use of GEEs with logistic regression. When using one level of nesting compared to logistic regression, 13 of 16 variance estimates changed more than 10%, and eight of 16 parameter estimates changed more than 10%. In addition, three of the independent variables changed from significant to insignificant (alpha=0.05). With the use of two levels of nesting, two of 16 variance estimates and three of 16 parameter estimates changed more than 10% from the variance and parameter estimates in one level of nesting. One of the independent variables changed from insignificant to significant (alpha=0.05) in the two levels of nesting model; therefore, only two of the independent variables changed from significant to insignificant when the logistic regression model was compared to the two levels of nesting model. The odds ratio of seat belt effectiveness in preventing injuries was 12% lower when a one-level nested model was used. Based on these results, we stress the need to use a nested model and GEEs when analyzing motor vehicle crash data.
NASA Astrophysics Data System (ADS)
Reitz, M. D.; Sanford, W. E.; Senay, G. B.; Cazenas, J.
2015-12-01
Evapotranspiration (ET) is a key quantity in the hydrologic cycle, accounting for ~70% of precipitation across the contiguous United States (CONUS). However, it is a challenge to estimate, due to difficulty in making direct measurements and gaps in our theoretical understanding. Here we present a new data-driven, ~1km2 resolution map of long-term average actual evapotranspiration rates across the CONUS. The new ET map is a function of the USGS Landsat-derived National Land Cover Database (NLCD), precipitation, temperature, and daily average temperature range (from the PRISM climate dataset), and is calibrated to long-term water balance data from 679 watersheds. It is unique from previously presented ET maps in that (1) it was co-developed with estimates of runoff and recharge; (2) the regression equation was chosen from among many tested, previously published and newly proposed functional forms for its optimal description of long-term water balance ET data; (3) it has values over open-water areas that are derived from separate mass-transfer and humidity equations; and (4) the data include additional precipitation representing amounts converted from 2005 USGS water-use census irrigation data. The regression equation is calibrated using data from 2000-2013, but can also be applied to individual years with their corresponding input datasets. Comparisons among this new map, the more detailed remote-sensing-based estimates of MOD16 and SSEBop, and AmeriFlux ET tower measurements shows encouraging consistency, and indicates that the empirical ET estimate approach presented here produces closer agreement with independent flux tower data for annual average actual ET than other more complex remote sensing approaches.
Prediction of Maximal Aerobic Capacity in Severely Burned Children
Porro, Laura; Rivero, Haidy G.; Gonzalez, Dante; Tan, Alai; Herndon, David N.; Suman, Oscar E.
2011-01-01
Introduction Maximal oxygen uptake (VO2 peak) is an indicator of cardiorespiratory fitness, but requires expensive equipment and a relatively high technical skill level. Purpose The aim of this study is to provide a formula for estimating VO2 peak in burned children, using information obtained without expensive equipment. Methods Children, with ≥40% total surface area burned (TBSA), underwent a modified Bruce treadmill test to asses VO2 peak at 6 months after injury. We recorded gender, age, %TBSA, %3rd degree burn, height, weight, treadmill time, maximal speed, maximal grade, and peak heart rate, and applied McHenry’s select algorithm to extract important independent variables and Robust multiple regression to establish prediction equations. Results 42 children; 7 to 17 years old were tested. Robust multiple regression model provided the equation: VO2=10.33 – 0.62 *Age (years) + 1.88 * Treadmill Time (min) + 2.3 (gender; Females = 0, Males = 1). The correlation between measured and estimated VO2 peak was R=0.80. We then validated the equation with a group of 33 burned children, which yielded a correlation between measured and estimated VO2 peak of R=0.79. Conclusions Using only a treadmill and easily gathered information, VO2 peak can be estimated in children with burns. PMID:21316155
Development of 1RM Prediction Equations for Bench Press in Moderately Trained Men.
Macht, Jordan W; Abel, Mark G; Mullineaux, David R; Yates, James W
2016-10-01
Macht, JW, Abel, MG, Mullineaux, DR, and Yates, JW. Development of 1RM prediction equations for bench press in moderately trained men. J Strength Cond Res 30(10): 2901-2906, 2016-There are a variety of established 1 repetition maximum (1RM) prediction equations, however, very few prediction equations use anthropometric characteristics exclusively or in part, to estimate 1RM strength. Therefore, the purpose of this study was to develop an original 1RM prediction equation for bench press using anthropometric and performance characteristics in moderately trained male subjects. Sixty male subjects (21.2 ± 2.4 years) completed a 1RM bench press and were randomly assigned a load to complete as many repetitions as possible. In addition, body composition, upper-body anthropometric characteristics, and handgrip strength were assessed. Regression analysis was used to develop a performance-based 1RM prediction equation: 1RM = 1.20 repetition weight + 2.19 repetitions to fatigue - 0.56 biacromial width (cm) + 9.6 (R = 0.99, standard error of estimate [SEE] = 3.5 kg). Regression analysis to develop a nonperformance-based 1RM prediction equation yielded: 1RM (kg) = 0.997 cross-sectional area (CSA) (cm) + 0.401 chest circumference (cm) - 0.385%fat - 0.185 arm length (cm) + 36.7 (R = 0.81, SEE = 13.0 kg). The performance prediction equations developed in this study had high validity coefficients, minimal mean bias, and small limits of agreement. The anthropometric equations had moderately high validity coefficient but larger limits of agreement. The practical applications of this study indicate that the inclusion of anthropometric characteristics and performance variables produce a valid prediction equation for 1RM strength. In addition, the CSA of the arm uses a simple nonperformance method of estimating the lifter's 1RM. This information may be used to predict the starting load for a lifter performing a 1RM prediction protocol or a 1RM testing protocol.
Peak-flow frequency for tributaries of the Colorado River downstream of Austin, Texas
Asquith, William H.
1998-01-01
Peak-flow frequency for 38 stations with at least 8 years of data in natural (unregulated and nonurbanized) basins was estimated on the basis of annual peak-streamflow data through water year 1995. Peak-flow frequency represents the peak discharges for recurrence intervals of 2, 5, 10, 25, 50, 100, 250, and 500 years. The peak-flow frequency and drainage basin characteristics for the stations were used to develop two sets of regression equations to estimate peak-flow frequency for tributaries of the Colorado River in the study area. One set of equations was developed for contributing drainage areas less than 32 square miles, and another set was developed for contributing drainage areas greater than 32 square miles. A procedure is presented to estimate the peak discharge at sites where both sets of equations are considered applicable. Additionally, procedures are presented to compute the 50-, 67-, and 90-percent prediction interval for any estimation from the equations.
DOT National Transportation Integrated Search
2015-01-01
Traditionally, the Iowa Department of Transportation : has used the Iowa Runoff Chart and single-variable regional-regression equations (RREs) from a U.S. Geological Survey : report (published in 1987) as the primary methods to estimate : annual exce...
DOT National Transportation Integrated Search
2014-01-01
Regression analysis techniques were used to develop a : set of equations for rural ungaged stream sites for estimating : discharges with 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent : annual exceedance probabilities, which are equivalent to : ann...
DOT National Transportation Integrated Search
2015-01-01
Traditionally, the Iowa DOT has used the Iowa Runoff Chart and single-variable regional regression equations (RREs) from a USGS report : (published in 1987) as the primary methods to estimate annual exceedance-probability discharge : (AEPD) for small...
ERIC Educational Resources Information Center
Stapleton, Laura M.
2008-01-01
This article discusses replication sampling variance estimation techniques that are often applied in analyses using data from complex sampling designs: jackknife repeated replication, balanced repeated replication, and bootstrapping. These techniques are used with traditional analyses such as regression, but are currently not used with structural…
Magnitude and frequency of floods in small drainage basins in Idaho
Thomas, C.A.; Harenberg, W.A.; Anderson, J.M.
1973-01-01
A method is presented in this report for determining magnitude and frequency of floods on streams with drainage areas between 0.5 and 200 square miles. The method relates basin characteristics, including drainage area, percentage of forest cover, percentage of water area, latitude, and longitude, with peak flow characteristics. Regression equations for each of eight regions are presented for determination of QIQ/ the peak discharge, which, on the average, will be exceeded once in 10 years. Peak flows, Q25 and Q 50 , can then be estimated from Q25/Q10 and Q-50/Q-10 ratios developed for each region. Nomographs are included which solve the equations for basins between 1 and 50 square miles. The regional regression equations were developed using multiple regression techniques. Annual peaks for 303 sites were analyzed in the study. These included all records on unregulated streams with drainage areas less than about 500 square miles with 10 years or more of record or which could readily be extended to 10 years on the basis of nearby streams. The log-Pearson Type III method as modified and a digital computer were employed to estimate magnitude and frequency of floods for each of the 303 gaged sites. A large number of physical and climatic basin characteristics were determined for each of the gaged sites. The multiple regression method was then applied to determine the equations relating the floodflows and the most significant basin characteristics. For convenience of the users, several equations were simplified and some complex characteristics were deleted at the sacrifice of some increase in the standard error. Standard errors of estimate and many other statistical data were computed in the analysis process and are available in the Boise district office files. The analysis showed that QIQ was the best defined and most practical index flood for determination of the Q25 and 0,50 flood estimates.Regression equations are not developed because of poor definition for areas which total about 20,000 square miles, most of which are in southern Idaho. These areas are described in the report to prevent use of regression equations where they do not apply. They include urbanized areas, streams affected by regulation or diversion by works of man, unforested areas, streams with gaining or losing reaches, streams draining alluvial valleys and the Snake Plain, intense thunderstorm areas, and scattered areas where records indicate recurring floods which depart from the regional equations. Maximum flows of record and basin locations are summarized in tables and maps. The analysis indicates deficiencies in data exist. To improve knowledge regarding flood characteristics in poorly defined areas, the following data-collection programs are recommended. Gages should be operated on a few selected small streams for an extended period to define floods at long recurrence intervals. Crest-stage gages should be operated in representative basins in urbanized areas, newly developed irrigated areas and grasslands, and in unforested areas. Unusual floods should continue to be measured at miscellaneous sites on regulated streams and in intense thunderstorm-prone areas. The relationship between channel geometry and floodflow characteristics should be investigated as an alternative or supplement to operation of gaging stations. Documentation of historic flood data from newspapers and other sources would improve the basic flood-data base.
Hays, Ron D; Revicki, Dennis A; Feeny, David; Fayers, Peter; Spritzer, Karen L; Cella, David
2016-10-01
Preference-based health-related quality of life (HR-QOL) scores are useful as outcome measures in clinical studies, for monitoring the health of populations, and for estimating quality-adjusted life-years. This was a secondary analysis of data collected in an internet survey as part of the Patient-Reported Outcomes Measurement Information System (PROMIS(®)) project. To estimate Health Utilities Index Mark 3 (HUI-3) preference scores, we used the ten PROMIS(®) global health items, the PROMIS-29 V2.0 single pain intensity item and seven multi-item scales (physical functioning, fatigue, pain interference, depressive symptoms, anxiety, ability to participate in social roles and activities, sleep disturbance), and the PROMIS-29 V2.0 items. Linear regression analyses were used to identify significant predictors, followed by simple linear equating to avoid regression to the mean. The regression models explained 48 % (global health items), 61 % (PROMIS-29 V2.0 scales), and 64 % (PROMIS-29 V2.0 items) of the variance in the HUI-3 preference score. Linear equated scores were similar to observed scores, although differences tended to be larger for older study participants. HUI-3 preference scores can be estimated from the PROMIS(®) global health items or PROMIS-29 V2.0. The estimated HUI-3 scores from the PROMIS(®) health measures can be used for economic applications and as a measure of overall HR-QOL in research.
A Simultaneous Equation Demand Model for Block Rates
NASA Astrophysics Data System (ADS)
Agthe, Donald E.; Billings, R. Bruce; Dobra, John L.; Raffiee, Kambiz
1986-01-01
This paper examines the problem of simultaneous-equations bias in estimation of the water demand function under an increasing block rate structure. The Hausman specification test is used to detect the presence of simultaneous-equations bias arising from correlation of the price measures with the regression error term in the results of a previously published study of water demand in Tucson, Arizona. An alternative simultaneous equation model is proposed for estimating the elasticity of demand in the presence of block rate pricing structures and availability of service charges. This model is used to reestimate the price and rate premium elasticities of demand in Tucson, Arizona for both the usual long-run static model and for a simple short-run demand model. The results from these simultaneous equation models are consistent with a priori expectations and are unbiased.
Olson, Scott A.
2003-01-01
The stream-gaging network in New Hampshire was analyzed for its effectiveness in providing regional information on peak-flood flow, mean-flow, and low-flow frequency. The data available for analysis were from stream-gaging stations in New Hampshire and selected stations in adjacent States. The principles of generalized-least-squares regression analysis were applied to develop regional regression equations that relate streamflow-frequency characteristics to watershed characteristics. Regression equations were developed for (1) the instantaneous peak flow with a 100-year recurrence interval, (2) the mean-annual flow, and (3) the 7-day, 10-year low flow. Active and discontinued stream-gaging stations with 10 or more years of flow data were used to develop the regression equations. Each stream-gaging station in the network was evaluated and ranked on the basis of how much the data from that station contributed to the cost-weighted sampling-error component of the regression equation. The potential effect of data from proposed and new stream-gaging stations on the sampling error also was evaluated. The stream-gaging network was evaluated for conditions in water year 2000 and for estimated conditions under various network strategies if an additional 5 years and 20 years of streamflow data were collected. The effectiveness of the stream-gaging network in providing regional streamflow information could be improved for all three flow characteristics with the collection of additional flow data, both temporally and spatially. With additional years of data collection, the greatest reduction in the average sampling error of the regional regression equations was found for the peak- and low-flow characteristics. In general, additional data collection at stream-gaging stations with unregulated flow, relatively short-term record (less than 20 years), and drainage areas smaller than 45 square miles contributed the largest cost-weighted reduction to the average sampling error of the regional estimating equations. The results of the network analyses can be used to prioritize the continued operation of active stations, the reactivation of discontinued stations, or the activation of new stations to maximize the regional information content provided by the stream-gaging network. Final decisions regarding altering the New Hampshire stream-gaging network would require the consideration of the many uses of the streamflow data serving local, State, and Federal interests.
ERIC Educational Resources Information Center
Yan, Jun; Aseltine, Robert H., Jr.; Harel, Ofer
2013-01-01
Comparing regression coefficients between models when one model is nested within another is of great practical interest when two explanations of a given phenomenon are specified as linear models. The statistical problem is whether the coefficients associated with a given set of covariates change significantly when other covariates are added into…
NASA Astrophysics Data System (ADS)
Ben Shabat, Yael; Shitzer, Avraham
2012-07-01
Facial heat exchange convection coefficients were estimated from experimental data in cold and windy ambient conditions applicable to wind chill calculations. Measured facial temperature datasets, that were made available to this study, originated from 3 separate studies involving 18 male and 6 female subjects. Most of these data were for a -10°C ambient environment and wind speeds in the range of 0.2 to 6 m s-1. Additional single experiments were for -5°C, 0°C and 10°C environments and wind speeds in the same range. Convection coefficients were estimated for all these conditions by means of a numerical facial heat exchange model, applying properties of biological tissues and a typical facial diameter of 0.18 m. Estimation was performed by adjusting the guessed convection coefficients in the computed facial temperatures, while comparing them to measured data, to obtain a satisfactory fit ( r 2 > 0.98, in most cases). In one of the studies, heat flux meters were additionally used. Convection coefficients derived from these meters closely approached the estimated values for only the male subjects. They differed significantly, by about 50%, when compared to the estimated female subjects' data. Regression analysis was performed for just the -10°C ambient temperature, and the range of experimental wind speeds, due to the limited availability of data for other ambient temperatures. The regressed equation was assumed in the form of the equation underlying the "new" wind chill chart. Regressed convection coefficients, which closely duplicated the measured data, were consistently higher than those calculated by this equation, except for one single case. The estimated and currently used convection coefficients are shown to diverge exponentially from each other, as wind speed increases. This finding casts considerable doubts on the validity of the convection coefficients that are used in the computation of the "new" wind chill chart and their applicability to humans in cold and windy environments.
Ben Shabat, Yael; Shitzer, Avraham
2012-07-01
Facial heat exchange convection coefficients were estimated from experimental data in cold and windy ambient conditions applicable to wind chill calculations. Measured facial temperature datasets, that were made available to this study, originated from 3 separate studies involving 18 male and 6 female subjects. Most of these data were for a -10°C ambient environment and wind speeds in the range of 0.2 to 6 m s(-1). Additional single experiments were for -5°C, 0°C and 10°C environments and wind speeds in the same range. Convection coefficients were estimated for all these conditions by means of a numerical facial heat exchange model, applying properties of biological tissues and a typical facial diameter of 0.18 m. Estimation was performed by adjusting the guessed convection coefficients in the computed facial temperatures, while comparing them to measured data, to obtain a satisfactory fit (r(2) > 0.98, in most cases). In one of the studies, heat flux meters were additionally used. Convection coefficients derived from these meters closely approached the estimated values for only the male subjects. They differed significantly, by about 50%, when compared to the estimated female subjects' data. Regression analysis was performed for just the -10°C ambient temperature, and the range of experimental wind speeds, due to the limited availability of data for other ambient temperatures. The regressed equation was assumed in the form of the equation underlying the "new" wind chill chart. Regressed convection coefficients, which closely duplicated the measured data, were consistently higher than those calculated by this equation, except for one single case. The estimated and currently used convection coefficients are shown to diverge exponentially from each other, as wind speed increases. This finding casts considerable doubts on the validity of the convection coefficients that are used in the computation of the "new" wind chill chart and their applicability to humans in cold and windy environments.
Bankfull characteristics of Ohio streams and their relation to peak streamflows
Sherwood, James M.; Huitger, Carrie A.
2005-01-01
Regional curves, simple-regression equations, and multiple-regression equations were developed to estimate bankfull width, bankfull mean depth, bankfull cross-sectional area, and bankfull discharge of rural, unregulated streams in Ohio. The methods are based on geomorphic, basin, and flood-frequency data collected at 50 study sites on unregulated natural alluvial streams in Ohio, of which 40 sites are near streamflow-gaging stations. The regional curves and simple-regression equations relate the bankfull characteristics to drainage area. The multiple-regression equations relate the bankfull characteristics to drainage area, main-channel slope, main-channel elevation index, median bed-material particle size, bankfull cross-sectional area, and local-channel slope. Average standard errors of prediction for bankfull width equations range from 20.6 to 24.8 percent; for bankfull mean depth, 18.8 to 20.6 percent; for bankfull cross-sectional area, 25.4 to 30.6 percent; and for bankfull discharge, 27.0 to 78.7 percent. The simple-regression (drainage-area only) equations have the highest average standard errors of prediction. The multiple-regression equations in which the explanatory variables included drainage area, main-channel slope, main-channel elevation index, median bed-material particle size, bankfull cross-sectional area, and local-channel slope have the lowest average standard errors of prediction. Field surveys were done at each of the 50 study sites to collect the geomorphic data. Bankfull indicators were identified and evaluated, cross-section and longitudinal profiles were surveyed, and bed- and bank-material were sampled. Field data were analyzed to determine various geomorphic characteristics such as bankfull width, bankfull mean depth, bankfull cross-sectional area, bankfull discharge, streambed slope, and bed- and bank-material particle-size distribution. The various geomorphic characteristics were analyzed by means of a combination of graphical and statistical techniques. The logarithms of the annual peak discharges for the 40 gaged study sites were fit by a Pearson Type III frequency distribution to develop flood-peak discharges associated with recurrence intervals of 2, 5, 10, 25, 50, and 100 years. The peak-frequency data were related to geomorphic, basin, and climatic variables by multiple-regression analysis. Simple-regression equations were developed to estimate 2-, 5-, 10-, 25-, 50-, and 100-year flood-peak discharges of rural, unregulated streams in Ohio from bankfull channel cross-sectional area. The average standard errors of prediction are 31.6, 32.6, 35.9, 41.5, 46.2, and 51.2 percent, respectively. The study and methods developed are intended to improve understanding of the relations between geomorphic, basin, and flood characteristics of streams in Ohio and to aid in the design of hydraulic structures, such as culverts and bridges, where stability of the stream and structure is an important element of the design criteria. The study was done in cooperation with the Ohio Department of Transportation and the U.S. Department of Transportation, Federal Highway Administration.
van Deventer, Hendrick E; George, Jaya A; Paiker, Janice E; Becker, Piet J; Katz, Ivor J
2008-07-01
The 4-variable Modification of Diet in Renal Disease (4-v MDRD) and Cockcroft-Gault (CG) equations are commonly used for estimating glomerular filtration rate (GFR); however, neither of these equations has been validated in an indigenous African population. The aim of this study was to evaluate the performance of the 4-v MDRD and CG equations for estimating GFR in black South Africans against measured GFR and to assess the appropriateness for the local population of the ethnicity factor established for African Americans in the 4-v MDRD equation. We enrolled 100 patients in the study. The plasma clearance of chromium-51-EDTA ((51)Cr-EDTA) was used to measure GFR, and serum creatinine was measured using an isotope dilution mass spectrometry (IDMS) traceable assay. We estimated GFR using both the reexpressed 4-v MDRD and CG equations and compared it to measured GFR using 4 modalities: correlation coefficient, weighted Deming regression analysis, percentage bias, and proportion of estimated GFR within 30% of measured GFR (P(30)). The Spearman correlation coefficient between measured and estimated GFR for both equations was similar (4-v MDRD R(2) = 0.80 and CG R(2) = 0.79). Using the 4-v MDRD equation with the ethnicity factor of 1.212 as established for African Americans resulted in a median positive bias of 13.1 (95% CI 5.5 to 18.3) mL/min/1.73 m(2). Without the ethnicity factor, median bias was 1.9 (95% CI -0.8 to 4.5) mL/min/1.73 m(2). The 4-v MDRD equation, without the ethnicity factor of 1.212, can be used for estimating GFR in black South Africans.
Estimating Available Fuel Weight Consumed by Prescribed Fires in the South
Walter A. Hough
1978-01-01
A method is proposed for estimating the weight of fuel burned (available fuel) by prescribed fires in southern pine stands. Weights of available fuel in litter alone and in litter plus understory materials can be estimated. Prediction equations were developed by regression analysis of data from a variety of locations and stand conditions. They are most reliable for...
Estimating tree crown widths for the primary Acadian species in Maine
Matthew B. Russell; Aaron R. Weiskittel
2012-01-01
In this analysis, data for seven conifer and eight hardwood species were gathered from across the state of Maine for estimating tree crown widths. Maximum and largest crown width equations were developed using tree diameter at breast height as the primary predicting variable. Quantile regression techniques were used to estimate the maximum crown width and a constrained...
Cost estimators for construction of forest roads in the central Appalachians
Deborah, A. Layton; Chris O. LeDoux; Curt C. Hassler; Curt C. Hassler
1992-01-01
Regression equations were developed for estimating the total cost of road construction in the central Appalachian region. Estimators include methods for predicting total costs for roads constructed using hourly rental methods and roads built on a total-job bid basis. Results show that total-job bid roads cost up to five times as much as roads built than when equipment...
Thermal requirements of Dermanyssus gallinae (De Geer, 1778) (Acari: Dermanyssidae).
Tucci, Edna Clara; do Prado, Angelo P; de Araújo, Raquel Pires
2008-01-01
The thermal requirements for development of Dermanyssus gallinae were studied under laboratory conditions at 15, 20, 25, 30 and 35 degrees C, a 12h photoperiod and 60-85% RH. The thermal requirements for D. gallinae were as follows. Preoviposition: base temperature 3.4 degrees C, thermal constant (k) 562.85 degree-hours, determination coefficient (R(2)) 0.59, regression equation: Y= -0.006035 + 0.001777x. Egg: base temperature 10.60 degrees C, thermal constant (k) 689.65 degree-hours, determination coefficient (R(2)) 0.94, regression equation: Y= -0.015367 + 0.001450x. Larva: base temperature 9.82 degrees C, thermal constant (k) 464.91 degree-hours, determination coefficient (R(2)) 0.87, regression equation: Y= -0.021123 + 0.002151x. Protonymph: base temperature 10.17 degrees C, thermal constant (k) 504.49 degree-hours, determination coefficient (R(2)) 0.90, regression equation: Y= -0.020152 + 0.001982x. Deutonymph: base temperature 11.80 degrees C, thermal constant (k) 501.11 degree-hours, determination coefficient (R(2)) 0.99, regression equation: Y= -0.023555 + 0.001996x. The results obtained showed that 15 to 42 generations of Dermanyssus gallinae may occur during the year in the State of São Paulo, as estimated based on isotherm charts. Dermanyssus gallinae may develop continually in the State of São Paulo, with a population decrease in the winter. There were differences between the developmental stages of D. gallinae in relation to thermal requirements.
2014-01-01
Background Support vector regression (SVR) and Gaussian process regression (GPR) were used for the analysis of electroanalytical experimental data to estimate diffusion coefficients. Results For simulated cyclic voltammograms based on the EC, Eqr, and EqrC mechanisms these regression algorithms in combination with nonlinear kernel/covariance functions yielded diffusion coefficients with higher accuracy as compared to the standard approach of calculating diffusion coefficients relying on the Nicholson-Shain equation. The level of accuracy achieved by SVR and GPR is virtually independent of the rate constants governing the respective reaction steps. Further, the reduction of high-dimensional voltammetric signals by manual selection of typical voltammetric peak features decreased the performance of both regression algorithms compared to a reduction by downsampling or principal component analysis. After training on simulated data sets, diffusion coefficients were estimated by the regression algorithms for experimental data comprising voltammetric signals for three organometallic complexes. Conclusions Estimated diffusion coefficients closely matched the values determined by the parameter fitting method, but reduced the required computational time considerably for one of the reaction mechanisms. The automated processing of voltammograms according to the regression algorithms yields better results than the conventional analysis of peak-related data. PMID:24987463
NASA Astrophysics Data System (ADS)
See, J. J.; Jamaian, S. S.; Salleh, R. M.; Nor, M. E.; Aman, F.
2018-04-01
This research aims to estimate the parameters of Monod model of microalgae Botryococcus Braunii sp growth by the Least-Squares method. Monod equation is a non-linear equation which can be transformed into a linear equation form and it is solved by implementing the Least-Squares linear regression method. Meanwhile, Gauss-Newton method is an alternative method to solve the non-linear Least-Squares problem with the aim to obtain the parameters value of Monod model by minimizing the sum of square error ( SSE). As the result, the parameters of the Monod model for microalgae Botryococcus Braunii sp can be estimated by the Least-Squares method. However, the estimated parameters value obtained by the non-linear Least-Squares method are more accurate compared to the linear Least-Squares method since the SSE of the non-linear Least-Squares method is less than the linear Least-Squares method.
Gómez-Valdés, Jorge A; Menéndez Garmendia, Antinea; García-Barzola, Lizbeth; Sánchez-Mejorada, Gabriela; Karam, Carlos; Baraybar, José Pablo; Klales, Alexandra
2017-03-01
The aim of this study was to test the accuracy of the Klales et al. (2012) equation for sex estimation in contemporary Mexican population. Our investigation was carried out on a sample of 203 left innominates of identified adult skeletons from the UNAM-Collection and the Santa María Xigui Cemetery, in Central Mexico. The Klales' original equation produces a sex bias in sex estimation against males (86-92% accuracy versus 100% accuracy in females). Based on these results, the Klales et al. (2012) method was recalibrated for a new cutt-of-point for sex estimation in contemporary Mexican populations. The results show cross-validated classification accuracy rates as high as 100% after recalibrating the original logistic regression equation. Recalibration improved classification accuracy and eliminated sex bias. This new formula will improve sex estimation for Mexican contemporary populations. © 2017 Wiley Periodicals, Inc.
Katherine J. Elliott; Barton D. Clinton
1993-01-01
Allometric equations were developed to predict aboveground dry weight of herbaceous and woody species on prescribe-burned sites in the Southern Appalachians. Best-fit least-square regression models were developed using diamet,er, height, or both, as the independent variables and dry weight as the dependent variable. Coefficients of determination for the selected total...
Jaman, Ajmery; Latif, Mahbub A H M; Bari, Wasimul; Wahed, Abdus S
2016-05-20
In generalized estimating equations (GEE), the correlation between the repeated observations on a subject is specified with a working correlation matrix. Correct specification of the working correlation structure ensures efficient estimators of the regression coefficients. Among the criteria used, in practice, for selecting working correlation structure, Rotnitzky-Jewell, Quasi Information Criterion (QIC) and Correlation Information Criterion (CIC) are based on the fact that if the assumed working correlation structure is correct then the model-based (naive) and the sandwich (robust) covariance estimators of the regression coefficient estimators should be close to each other. The sandwich covariance estimator, used in defining the Rotnitzky-Jewell, QIC and CIC criteria, is biased downward and has a larger variability than the corresponding model-based covariance estimator. Motivated by this fact, a new criterion is proposed in this paper based on the bias-corrected sandwich covariance estimator for selecting an appropriate working correlation structure in GEE. A comparison of the proposed and the competing criteria is shown using simulation studies with correlated binary responses. The results revealed that the proposed criterion generally performs better than the competing criteria. An example of selecting the appropriate working correlation structure has also been shown using the data from Madras Schizophrenia Study. Copyright © 2015 John Wiley & Sons, Ltd.
Working covariance model selection for generalized estimating equations.
Carey, Vincent J; Wang, You-Gan
2011-11-20
We investigate methods for data-based selection of working covariance models in the analysis of correlated data with generalized estimating equations. We study two selection criteria: Gaussian pseudolikelihood and a geodesic distance based on discrepancy between model-sensitive and model-robust regression parameter covariance estimators. The Gaussian pseudolikelihood is found in simulation to be reasonably sensitive for several response distributions and noncanonical mean-variance relations for longitudinal data. Application is also made to a clinical dataset. Assessment of adequacy of both correlation and variance models for longitudinal data should be routine in applications, and we describe open-source software supporting this practice. Copyright © 2011 John Wiley & Sons, Ltd.
Paretti, Nicholas V.; Kennedy, Jeffrey R.; Turney, Lovina A.; Veilleux, Andrea G.
2014-01-01
The regional regression equations were integrated into the U.S. Geological Survey’s StreamStats program. The StreamStats program is a national map-based web application that allows the public to easily access published flood frequency and basin characteristic statistics. The interactive web application allows a user to select a point within a watershed (gaged or ungaged) and retrieve flood-frequency estimates derived from the current regional regression equations and geographic information system data within the selected basin. StreamStats provides users with an efficient and accurate means for retrieving the most up to date flood frequency and basin characteristic data. StreamStats is intended to provide consistent statistics, minimize user error, and reduce the need for large datasets and costly geographic information system software.
Estimates of Median Flows for Streams on the 1999 Kansas Surface Water Register
Perry, Charles A.; Wolock, David M.; Artman, Joshua C.
2004-01-01
The Kansas State Legislature, by enacting Kansas Statute KSA 82a?2001 et. seq., mandated the criteria for determining which Kansas stream segments would be subject to classification by the State. One criterion for the selection as a classified stream segment is based on the statistic of median flow being equal to or greater than 1 cubic foot per second. As specified by KSA 82a?2001 et. seq., median flows were determined from U.S. Geological Survey streamflow-gaging-station data by using the most-recent 10 years of gaged data (KSA) for each streamflow-gaging station. Median flows also were determined by using gaged data from the entire period of record (all-available hydrology, AAH). Least-squares multiple regression techniques were used, along with Tobit analyses, to develop equations for estimating median flows for uncontrolled stream segments. The drainage area of the gaging stations on uncontrolled stream segments used in the regression analyses ranged from 2.06 to 12,004 square miles. A logarithmic transformation of the data was needed to develop the best linear relation for computing median flows. In the regression analyses, the significant climatic and basin characteristics, in order of importance, were drainage area, mean annual precipitation, mean basin permeability, and mean basin slope. Tobit analyses of KSA data yielded a model standard error of prediction of 0.285 logarithmic units, and the best equations using Tobit analyses of AAH data had a model standard error of prediction of 0.250 logarithmic units. These regression equations and an interpolation procedure were used to compute median flows for the uncontrolled stream segments on the 1999 Kansas Surface Water Register. Measured median flows from gaging stations were incorporated into the regression-estimated median flows along the stream segments where available. The segments that were uncontrolled were interpolated using gaged data weighted according to the drainage area and the bias between the regression-estimated and gaged flow information. On controlled segments of Kansas streams, the median flow information was interpolated between gaging stations using only gaged data weighted by drainage area. Of the 2,232 total stream segments on the Kansas Surface Water Register, 34.5 percent of the segments had an estimated median streamflow of less than 1 cubic foot per second when the KSA analysis was used. When the AAH analysis was used, 36.2 percent of the segments had an estimated median streamflow of less than 1 cubic foot per second. This report supercedes U.S. Geological Survey Water-Resources Investigations Report 02?4292.
Regression models to predict hip joint centers in pathological hip population.
Mantovani, Giulia; Ng, K C Geoffrey; Lamontagne, Mario
2016-02-01
The purpose was to investigate the validity of Harrington's and Davis's hip joint center (HJC) regression equations on a population affected by a hip deformity, (i.e., femoroacetabular impingement). Sixty-seven participants (21 healthy controls, 46 with a cam-type deformity) underwent pelvic CT imaging. Relevant bony landmarks and geometric HJCs were digitized from the images, and skin thickness was measured for the anterior and posterior superior iliac spines. Non-parametric statistical and Bland-Altman tests analyzed differences between the predicted HJC (from regression equations) and the actual HJC (from CT images). The error from Davis's model (25.0 ± 6.7 mm) was larger than Harrington's (12.3 ± 5.9 mm, p<0.001). There were no differences between groups, thus, studies on femoroacetabular impingement can implement conventional regression models. Measured skin thickness was 9.7 ± 7.0mm and 19.6 ± 10.9 mm for the anterior and posterior bony landmarks, respectively, and correlated with body mass index. Skin thickness estimates can be considered to reduce the systematic error introduced by surface markers. New adult-specific regression equations were developed from the CT dataset, with the hypothesis that they could provide better estimates when tuned to a larger adult-specific dataset. The linear models were validated on external datasets and using leave-one-out cross-validation techniques; Prediction errors were comparable to those of Harrington's model, despite the adult-specific population and the larger sample size, thus, prediction accuracy obtained from these parameters could not be improved. Copyright © 2015 Elsevier B.V. All rights reserved.
Flood characteristics of Alaskan streams
Lamke, R.D.
1979-01-01
Peak discharge data for Alaskan streams are summarized and analyzed. Multiple-regression equations relating peak discharge magnitude and frequency to climatic and physical characteristics of 260 gaged basins were determined in order to estimate average recurrence interval of floods at ungaged sites. These equations are for 1.25-, 2-, 5-, 10-, 25-, and 50-year average recurrence intervals. In this report, Alaska was divided into two regions, one having a maritime climate with fall and winter rains and floods, the other having spring and summer floods of a variety or combinations of causes. Average standard errors of the six multiple-regression equations for these two regions were 48 and 74 percent, respectively. Maximum recorded floods at more than 400 sites throughout Alaska are tabulated. Maps showing lines of equal intensity of the principal climatic variables found to be significant (mean annual precipitation and mean minimum January temperature), and location of the 260 sites used in the multiple-regression analyses are included. Little flood data have been collected in western and arctic Alaska, and the predictive equations are therefore less reliable for those areas. (Woodard-USGS)
Schooling, Literacy and Individual Earnings. International Adult Literacy Survey.
ERIC Educational Resources Information Center
Osberg, Lars
This paper uses direct measures of literacy skill levels provided by the International Adult Literacy Survey to estimate the return to literacy skills. Using a very simple human capital earnings equation and standard ordinary least squares regression, it tested estimates of the return to literacy skills for their robustness to alternative scalings…
Skull Size and Intelligence, and King Robert Bruce's IQ
ERIC Educational Resources Information Center
Deary, Ian J.; Ferguson, Karen J.; Bastin, Mark E.; Barrow, Geoffrey W. S.; Reid, Louise M.; Seckl, Jonathan R.; Wardlaw, Joanna M.; MacLullich, Alasdair M. J.
2007-01-01
An estimate of someone's IQ is a potentially informative personal datum. This study examines the association between external skull measurements and IQ scores, and uses the resulting regression equation to provide an estimate of the IQ of King Robert I of Scotland (Robert Bruce, 1274-1329). Participants were 48 relatively healthy Caucasian men…
Height and Weight Estimation From Anthropometric Measurements Using Machine Learning Regressions
Fernandes, Bruno J. T.; Roque, Alexandre
2018-01-01
Height and weight are measurements explored to tracking nutritional diseases, energy expenditure, clinical conditions, drug dosages, and infusion rates. Many patients are not ambulant or may be unable to communicate, and a sequence of these factors may not allow accurate estimation or measurements; in those cases, it can be estimated approximately by anthropometric means. Different groups have proposed different linear or non-linear equations which coefficients are obtained by using single or multiple linear regressions. In this paper, we present a complete study of the application of different learning models to estimate height and weight from anthropometric measurements: support vector regression, Gaussian process, and artificial neural networks. The predicted values are significantly more accurate than that obtained with conventional linear regressions. In all the cases, the predictions are non-sensitive to ethnicity, and to gender, if more than two anthropometric parameters are analyzed. The learning model analysis creates new opportunities for anthropometric applications in industry, textile technology, security, and health care. PMID:29651366
Prediction of elemental creep. [steady state and cyclic data from regression analysis
NASA Technical Reports Server (NTRS)
Davis, J. W.; Rummler, D. R.
1975-01-01
Cyclic and steady-state creep tests were performed to provide data which were used to develop predictive equations. These equations, describing creep as a function of stress, temperature, and time, were developed through the use of a least squares regression analyses computer program for both the steady-state and cyclic data sets. Comparison of the data from the two types of tests, revealed that there was no significant difference between the cyclic and steady-state creep strains for the L-605 sheet under the experimental conditions investigated (for the same total time at load). Attempts to develop a single linear equation describing the combined steady-state and cyclic creep data resulted in standard errors of estimates higher than obtained for the individual data sets. A proposed approach to predict elemental creep in metals uses the cyclic creep equation and a computer program which applies strain and time hardening theories of creep accumulation.
[Comparison of three stand-level biomass estimation methods].
Dong, Li Hu; Li, Feng Ri
2016-12-01
At present, the forest biomass methods of regional scale attract most of attention of the researchers, and developing the stand-level biomass model is popular. Based on the forestry inventory data of larch plantation (Larix olgensis) in Jilin Province, we used non-linear seemly unrelated regression (NSUR) to estimate the parameters in two additive system of stand-level biomass equations, i.e., stand-level biomass equations including the stand variables and stand biomass equations including the biomass expansion factor (i.e., Model system 1 and Model system 2), listed the constant biomass expansion factor for larch plantation and compared the prediction accuracy of three stand-level biomass estimation methods. The results indicated that for two additive system of biomass equations, the adjusted coefficient of determination (R a 2 ) of the total and stem equations was more than 0.95, the root mean squared error (RMSE), the mean prediction error (MPE) and the mean absolute error (MAE) were smaller. The branch and foliage biomass equations were worse than total and stem biomass equations, and the adjusted coefficient of determination (R a 2 ) was less than 0.95. The prediction accuracy of a constant biomass expansion factor was relatively lower than the prediction accuracy of Model system 1 and Model system 2. Overall, although stand-level biomass equation including the biomass expansion factor belonged to the volume-derived biomass estimation method, and was different from the stand biomass equations including stand variables in essence, but the obtained prediction accuracy of the two methods was similar. The constant biomass expansion factor had the lower prediction accuracy, and was inappropriate. In addition, in order to make the model parameter estimation more effective, the established stand-level biomass equations should consider the additivity in a system of all tree component biomass and total biomass equations.
Wang, Wei; Young, Bessie A.; Fülöp, Tibor; de Boer, Ian H.; Boulware, L. Ebony; Katz, Ronit; Correa, Adolfo; Griswold, Michael E.
2015-01-01
Background The calibration to Isotope Dilution Mass Spectroscopy (IDMS) traceable creatinine is essential for valid use of the new Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation to estimate the glomerular filtration rate (GFR). Methods For 5,210 participants in the Jackson Heart Study (JHS), serum creatinine was measured with a multipoint enzymatic spectrophotometric assay at the baseline visit (2000–2004) and re-measured using the Roche enzymatic method, traceable to IDMS in a subset of 206 subjects. The 200 eligible samples (6 were excluded, 1 for failure of the re-measurement and 5 for outliers) were divided into three disjoint sets - training, validation, and test - to select a calibration model, estimate true errors, and assess performance of the final calibration equation. The calibration equation was applied to serum creatinine measurements of 5,210 participants to estimate GFR and the prevalence of CKD. Results The selected Deming regression model provided a slope of 0.968 (95% Confidence Interval (CI), 0.904 to 1.053) and intercept of −0.0248 (95% CI, −0.0862 to 0.0366) with R squared 0.9527. Calibrated serum creatinine showed high agreement with actual measurements when applying to the unused test set (concordance correlation coefficient 0.934, 95% CI, 0.894 to 0.960). The baseline prevalence of CKD in the JHS (2000–2004) was 6.30% using calibrated values, compared with 8.29% using non-calibrated serum creatinine with the CKD-EPI equation (P < 0.001). Conclusions A Deming regression model was chosen to optimally calibrate baseline serum creatinine measurements in the JHS and the calibrated values provide a lower CKD prevalence estimate. PMID:25806862
Wang, Wei; Young, Bessie A; Fülöp, Tibor; de Boer, Ian H; Boulware, L Ebony; Katz, Ronit; Correa, Adolfo; Griswold, Michael E
2015-05-01
The calibration to isotope dilution mass spectrometry-traceable creatinine is essential for valid use of the new Chronic Kidney Disease Epidemiology Collaboration equation to estimate the glomerular filtration rate. For 5,210 participants in the Jackson Heart Study (JHS), serum creatinine was measured with a multipoint enzymatic spectrophotometric assay at the baseline visit (2000-2004) and remeasured using the Roche enzymatic method, traceable to isotope dilution mass spectrometry in a subset of 206 subjects. The 200 eligible samples (6 were excluded, 1 for failure of the remeasurement and 5 for outliers) were divided into 3 disjoint sets-training, validation and test-to select a calibration model, estimate true errors and assess performance of the final calibration equation. The calibration equation was applied to serum creatinine measurements of 5,210 participants to estimate glomerular filtration rate and the prevalence of chronic kidney disease (CKD). The selected Deming regression model provided a slope of 0.968 (95% confidence interval [CI], 0.904-1.053) and intercept of -0.0248 (95% CI, -0.0862 to 0.0366) with R value of 0.9527. Calibrated serum creatinine showed high agreement with actual measurements when applying to the unused test set (concordance correlation coefficient 0.934, 95% CI, 0.894-0.960). The baseline prevalence of CKD in the JHS (2000-2004) was 6.30% using calibrated values compared with 8.29% using noncalibrated serum creatinine with the Chronic Kidney Disease Epidemiology Collaboration equation (P < 0.001). A Deming regression model was chosen to optimally calibrate baseline serum creatinine measurements in the JHS, and the calibrated values provide a lower CKD prevalence estimate.
A stream-gaging network analysis for the 7-day, 10-year annual low flow in New Hampshire streams
Flynn, Robert H.
2003-01-01
The 7-day, 10-year (7Q10) low-flow-frequency statistic is a widely used measure of surface-water availability in New Hampshire. Regression equations and basin-characteristic digital data sets were developed to help water-resource managers determine surface-water resources during periods of low flow in New Hampshire streams. These regression equations and data sets were developed to estimate streamflow statistics for the annual and seasonal low-flow-frequency, and period-of-record and seasonal period-of-record flow durations. generalized-least-squares (GLS) regression methods were used to develop the annual 7Q10 low-flow-frequency regression equation from 60 continuous-record stream-gaging stations in New Hampshire and in neighboring States. In the regression equation, the dependent variables were the annual 7Q10 flows at the 60 stream-gaging stations. The independent (or predictor) variables were objectively selected characteristics of the drainage basins that contribute flow to those stations. In contrast to ordinary-least-squares (OLS) regression analysis, GLS-developed estimating equations account for differences in length of record and spatial correlations among the flow-frequency statistics at the various stations.A total of 93 measurable drainage-basin characteristics were candidate independent variables. On the basis of several statistical parameters that were used to evaluate which combination of basin characteristics contribute the most to the predictive power of the equations, three drainage-basin characteristics were determined to be statistically significant predictors of the annual 7Q10: (1) total drainage area, (2) mean summer stream-gaging station precipitation from 1961 to 90, and (3) average mean annual basinwide temperature from 1961 to 1990.To evaluate the effectiveness of the stream-gaging network in providing regional streamflow data for the annual 7Q10, the computer program GLSNET (generalized-least-squares NETwork) was used to analyze the network by application of GLS regression between streamflow and the climatic and basin characteristics of the drainage basin upstream from each stream-gaging station. Improvement to the predictive ability of the regression equations developed for the network analyses is measured by the reduction in the average sampling-error variance, and can be achieved by collecting additional streamflow data at existing stations. The predictive ability of the regression equations is enhanced even further with the addition of new stations to the network. Continued data collection at unregulated stream-gaging stations with less than 14 years of record resulted in the greatest cost-weighted reduction to the average sampling-error variance of the annual 7Q10 regional regression equation. The addition of new stations in basins with underrepresented values for the independent variables of the total drainage area, average mean annual basinwide temperature, or mean summer stream-gaging station precipitation in the annual 7Q10 regression equation yielded a much greater cost-weighted reduction to the average sampling-error variance than when more data were collected at existing unregulated stations. To maximize the regional information obtained from the stream-gaging network for the annual 7Q10, ranking of the streamflow data can be used to determine whether an active station should be continued or if a new or discontinued station should be activated for streamflow data collection. Thus, this network analysis can help determine the costs and benefits of continuing the operation of a particular station or activating a new station at another location to predict the 7Q10 at ungaged stream reaches. The decision to discontinue an existing station or activate a new station, however, must also consider its contribution to other water-resource analyses such as flood management, water quality, or trends in land use or climatic change.
Aisbett, B; Le Rossignol, P
2003-09-01
The VO2-power regression and estimated total energy demand for a 6-minute supra-maximal exercise test was predicted from a continuous incremental exercise test. Sub-maximal VO2-power co-ordinates were established from the last 40 seconds (s) of 150-second exercise stages. The precision of the estimated total energy demand was determined using the 95% confidence interval (95% CI) of the estimated total energy demand. The linearity of the individual VO2-power regression equations was determined using Pearson's correlation coefficient. The mean 95% CI of the estimated total energy demand was 5.9 +/- 2.5 mL O2 Eq x kg(-1) x min(-1), and the mean correlation coefficient was 0.9942 +/- 0.0042. The current study contends that the sub-maximal VO2-power co-ordinates from a continuous incremental exercise test can be used to estimate supra-maximal energy demand without compromising the precision of the accumulated oxygen deficit (AOD) method.
Wu, Hulin; Xue, Hongqi; Kumar, Arun
2012-06-01
Differential equations are extensively used for modeling dynamics of physical processes in many scientific fields such as engineering, physics, and biomedical sciences. Parameter estimation of differential equation models is a challenging problem because of high computational cost and high-dimensional parameter space. In this article, we propose a novel class of methods for estimating parameters in ordinary differential equation (ODE) models, which is motivated by HIV dynamics modeling. The new methods exploit the form of numerical discretization algorithms for an ODE solver to formulate estimating equations. First, a penalized-spline approach is employed to estimate the state variables and the estimated state variables are then plugged in a discretization formula of an ODE solver to obtain the ODE parameter estimates via a regression approach. We consider three different order of discretization methods, Euler's method, trapezoidal rule, and Runge-Kutta method. A higher-order numerical algorithm reduces numerical error in the approximation of the derivative, which produces a more accurate estimate, but its computational cost is higher. To balance the computational cost and estimation accuracy, we demonstrate, via simulation studies, that the trapezoidal discretization-based estimate is the best and is recommended for practical use. The asymptotic properties for the proposed numerical discretization-based estimators are established. Comparisons between the proposed methods and existing methods show a clear benefit of the proposed methods in regards to the trade-off between computational cost and estimation accuracy. We apply the proposed methods t an HIV study to further illustrate the usefulness of the proposed approaches. © 2012, The International Biometric Society.
NASA Astrophysics Data System (ADS)
Ibanez, C. A. G.; Carcellar, B. G., III; Paringit, E. C.; Argamosa, R. J. L.; Faelga, R. A. G.; Posilero, M. A. V.; Zaragosa, G. P.; Dimayacyac, N. A.
2016-06-01
Diameter-at-Breast-Height Estimation is a prerequisite in various allometric equations estimating important forestry indices like stem volume, basal area, biomass and carbon stock. LiDAR Technology has a means of directly obtaining different forest parameters, except DBH, from the behavior and characteristics of point cloud unique in different forest classes. Extensive tree inventory was done on a two-hectare established sample plot in Mt. Makiling, Laguna for a natural growth forest. Coordinates, height, and canopy cover were measured and types of species were identified to compare to LiDAR derivatives. Multiple linear regression was used to get LiDAR-derived DBH by integrating field-derived DBH and 27 LiDAR-derived parameters at 20m, 10m, and 5m grid resolutions. To know the best combination of parameters in DBH Estimation, all possible combinations of parameters were generated and automated using python scripts and additional regression related libraries such as Numpy, Scipy, and Scikit learn were used. The combination that yields the highest r-squared or coefficient of determination and lowest AIC (Akaike's Information Criterion) and BIC (Bayesian Information Criterion) was determined to be the best equation. The equation is at its best using 11 parameters at 10mgrid size and at of 0.604 r-squared, 154.04 AIC and 175.08 BIC. Combination of parameters may differ among forest classes for further studies. Additional statistical tests can be supplemented to help determine the correlation among parameters such as Kaiser- Meyer-Olkin (KMO) Coefficient and the Barlett's Test for Spherecity (BTS).
Qing, Si-han; Chang, Yun-feng; Dong, Xiao-ai; Li, Yuan; Chen, Xiao-gang; Shu, Yong-kang; Deng, Zhen-hua
2013-10-01
To establish the mathematical models of stature estimation for Sichuan Han female with measurement of lumbar vertebrae by X-ray to provide essential data for forensic anthropology research. The samples, 206 Sichuan Han females, were divided into three groups including group A, B and C according to the ages. Group A (206 samples) consisted of all ages, group B (116 samples) were 20-45 years old and 90 samples over 45 years old were group C. All the samples were examined lumbar vertebrae through CR technology, including the parameters of five centrums (L1-L5) as anterior border, posterior border and central heights (x1-x15), total central height of lumbar spine (x16), and the real height of every sample. The linear regression analysis was produced using the parameters to establish the mathematical models of stature estimation. Sixty-two trained subjects were tested to verify the accuracy of the mathematical models. The established mathematical models by hypothesis test of linear regression equation model were statistically significant (P<0.05). The standard errors of the equation were 2.982-5.004 cm, while correlation coefficients were 0.370-0.779 and multiple correlation coefficients were 0.533-0.834. The return tests of the highest correlation coefficient and multiple correlation coefficient of each group showed that the highest accuracy of the multiple regression equation, y = 100.33 + 1.489 x3 - 0.548 x6 + 0.772 x9 + 0.058 x12 + 0.645 x15, in group A were 80.6% (+/- lSE) and 100% (+/- 2SE). The established mathematical models in this study could be applied for the stature estimation for Sichuan Han females.
Barth, Nancy A.; Veilleux, Andrea G.
2012-01-01
The U.S. Geological Survey (USGS) is currently updating at-site flood frequency estimates for USGS streamflow-gaging stations in the desert region of California. The at-site flood-frequency analysis is complicated by short record lengths (less than 20 years is common) and numerous zero flows/low outliers at many sites. Estimates of the three parameters (mean, standard deviation, and skew) required for fitting the log Pearson Type 3 (LP3) distribution are likely to be highly unreliable based on the limited and heavily censored at-site data. In a generalization of the recommendations in Bulletin 17B, a regional analysis was used to develop regional estimates of all three parameters (mean, standard deviation, and skew) of the LP3 distribution. A regional skew value of zero from a previously published report was used with a new estimated mean squared error (MSE) of 0.20. A weighted least squares (WLS) regression method was used to develop both a regional standard deviation and a mean model based on annual peak-discharge data for 33 USGS stations throughout California’s desert region. At-site standard deviation and mean values were determined by using an expected moments algorithm (EMA) method for fitting the LP3 distribution to the logarithms of annual peak-discharge data. Additionally, a multiple Grubbs-Beck (MGB) test, a generalization of the test recommended in Bulletin 17B, was used for detecting multiple potentially influential low outliers in a flood series. The WLS regression found that no basin characteristics could explain the variability of standard deviation. Consequently, a constant regional standard deviation model was selected, resulting in a log-space value of 0.91 with a MSE of 0.03 log units. Yet drainage area was found to be statistically significant at explaining the site-to-site variability in mean. The linear WLS regional mean model based on drainage area had a Pseudo- 2 R of 51 percent and a MSE of 0.32 log units. The regional parameter estimates were then used to develop a set of equations for estimating flows with 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities for ungaged basins. The final equations are functions of drainage area.Average standard errors of prediction for these regression equations range from 214.2 to 856.2 percent.
Body mass and stature estimation based on the first metatarsal in humans.
De Groote, Isabelle; Humphrey, Louise T
2011-04-01
Archaeological assemblages often lack the complete long bones needed to estimate stature and body mass. The most accurate estimates of body mass and stature are produced using femoral head diameter and femur length. Foot bones including the first metatarsal preserve relatively well in a range of archaeological contexts. In this article we present regression equations using the first metatarsal to estimate femoral head diameter, femoral length, and body mass in a diverse human sample. The skeletal sample comprised 87 individuals (Andamanese, Australasians, Africans, Native Americans, and British). Results show that all first metatarsal measurements correlate moderately to highly (r = 0.62-0.91) with femoral head diameter and length. The proximal articular dorsoplantar diameter is the best single measurement to predict both femoral dimensions. Percent standard errors of the estimate are below 5%. Equations using two metatarsal measurements show a small increase in accuracy. Direct estimations of body mass (calculated from measured femoral head diameter using previously published equations) have an error of just over 7%. No direct stature estimation equations were derived due to the varied linear body proportions represented in the sample. The equations were tested on a sample of 35 individuals from Christ Church Spitalfields. Percentage differences in estimated and measured femoral head diameter and length were less than 1%. This study demonstrates that it is feasible to use the first metatarsal in the estimation of body mass and stature. The equations presented here are particularly useful for assemblages where the long bones are either missing or fragmented, and enable estimation of these fundamental population parameters in poorly preserved assemblages. Copyright © 2011 Wiley-Liss, Inc.
NASA Astrophysics Data System (ADS)
Yang, Que; Wang, Shanshan; Wang, Kai; Zhang, Chunyu; Zhang, Lu; Meng, Qingyu; Zhu, Qiudong
2015-08-01
For normal eyes without history of any ocular surgery, traditional equations for calculating intraocular lens (IOL) power, such as SRK-T, Holladay, Higis, SRK-II, et al., all were relativley accurate. However, for eyes underwent refractive surgeries, such as LASIK, or eyes diagnosed as keratoconus, these equations may cause significant postoperative refractive error, which may cause poor satisfaction after cataract surgery. Although some methods have been carried out to solve this problem, such as Hagis-L equation[1], or using preoperative data (data before LASIK) to estimate K value[2], no precise equations were available for these eyes. Here, we introduced a novel intraocular lens power estimation method by accurate ray tracing with optical design software ZEMAX. Instead of using traditional regression formula, we adopted the exact measured corneal elevation distribution, central corneal thickness, anterior chamber depth, axial length, and estimated effective lens plane as the input parameters. The calculation of intraocular lens power for a patient with keratoconus and another LASIK postoperative patient met very well with their visual capacity after cataract surgery.
Klaeboe, Ronny
2005-09-01
When Gardermoen replaced Fornebu as the main airport for Oslo, aircraft noise levels increased in recreational areas near Gardermoen and decreased in areas near Fornebu. Krog and Engdahl [J. Acoust. Soc. Am. 116, 323-333 (2004)] estimate that recreationists' annoyance from aircraft noise in these areas changed more than would be anticipated from the actual noise changes. However, the sizes of their estimated "situation" effects are not credible. One possible reason for the anomalous results is that standard regression assumptions become violated when motivational factors are inserted into the regression model. Standardized regression coefficients (beta values) should also not be utilized for comparisons across equations.
NASA Technical Reports Server (NTRS)
Pluhowski, E. J. (Principal Investigator)
1977-01-01
The author has identified the following significant results. Land use data derived from high altitude photography and satellite imagery were studied for 49 basins in Delaware, and eastern Maryland and Virginia. Applying multiple regression techniques to a network of gaging stations monitoring runoff from 39 of the basins, demonstrated that land use data from high altitude photography provided an effective means of significantly improving estimates of stream flow. Forty stream flow characteristic equations for incorporating remotely sensed land use information, were compared with a control set of equations using map derived land cover. Significant improvement was detected in six equations where level 1 data was added and in five equations where level 2 information was utilized. Only four equations were improved significantly using land use data derived from LANDSAT imagery. Significant losses in accuracy due to the use of remotely sensed land use information were detected only in estimates of flood peaks. Losses in accuracy for flood peaks were probably due to land cover changes associated with temporal differences among the primary land use data sources.
Model parameter uncertainty analysis for an annual field-scale P loss model
NASA Astrophysics Data System (ADS)
Bolster, Carl H.; Vadas, Peter A.; Boykin, Debbie
2016-08-01
Phosphorous (P) fate and transport models are important tools for developing and evaluating conservation practices aimed at reducing P losses from agricultural fields. Because all models are simplifications of complex systems, there will exist an inherent amount of uncertainty associated with their predictions. It is therefore important that efforts be directed at identifying, quantifying, and communicating the different sources of model uncertainties. In this study, we conducted an uncertainty analysis with the Annual P Loss Estimator (APLE) model. Our analysis included calculating parameter uncertainties and confidence and prediction intervals for five internal regression equations in APLE. We also estimated uncertainties of the model input variables based on values reported in the literature. We then predicted P loss for a suite of fields under different management and climatic conditions while accounting for uncertainties in the model parameters and inputs and compared the relative contributions of these two sources of uncertainty to the overall uncertainty associated with predictions of P loss. Both the overall magnitude of the prediction uncertainties and the relative contributions of the two sources of uncertainty varied depending on management practices and field characteristics. This was due to differences in the number of model input variables and the uncertainties in the regression equations associated with each P loss pathway. Inspection of the uncertainties in the five regression equations brought attention to a previously unrecognized limitation with the equation used to partition surface-applied fertilizer P between leaching and runoff losses. As a result, an alternate equation was identified that provided similar predictions with much less uncertainty. Our results demonstrate how a thorough uncertainty and model residual analysis can be used to identify limitations with a model. Such insight can then be used to guide future data collection and model development and evaluation efforts.
Yao, Hong; Zhuang, Wei; Qian, Yu; Xia, Bisheng; Yang, Yang; Qian, Xin
2016-01-01
Turbidity (T) has been widely used to detect the occurrence of pollutants in surface water. Using data collected from January 2013 to June 2014 at eleven sites along two rivers feeding the Taihu Basin, China, the relationship between the concentration of five metals (aluminum (Al), titanium (Ti), nickel (Ni), vanadium (V), lead (Pb)) and turbidity was investigated. Metal concentration was determined using inductively coupled plasma mass spectrometry (ICP-MS). The linear regression of metal concentration and turbidity provided a good fit, with R2 = 0.86–0.93 for 72 data sets collected in the industrial river and R2 = 0.60–0.85 for 60 data sets collected in the cleaner river. All the regression presented good linear relationship, leading to the conclusion that the occurrence of the five metals are directly related to suspended solids, and these metal concentration could be approximated using these regression equations. Thus, the linear regression equations were applied to estimate the metal concentration using online turbidity data from January 1 to June 30 in 2014. In the prediction, the WASP 7.5.2 (Water Quality Analysis Simulation Program) model was introduced to interpret the transport and fates of total suspended solids; in addition, metal concentration downstream of the two rivers was predicted. All the relative errors between the estimated and measured metal concentration were within 30%, and those between the predicted and measured values were within 40%. The estimation and prediction process of metals’ concentration indicated that exploring the relationship between metals and turbidity values might be one effective technique for efficient estimation and prediction of metal concentration to facilitate better long-term monitoring with high temporal and spatial density. PMID:27028017
Yao, Hong; Zhuang, Wei; Qian, Yu; Xia, Bisheng; Yang, Yang; Qian, Xin
2016-01-01
Turbidity (T) has been widely used to detect the occurrence of pollutants in surface water. Using data collected from January 2013 to June 2014 at eleven sites along two rivers feeding the Taihu Basin, China, the relationship between the concentration of five metals (aluminum (Al), titanium (Ti), nickel (Ni), vanadium (V), lead (Pb)) and turbidity was investigated. Metal concentration was determined using inductively coupled plasma mass spectrometry (ICP-MS). The linear regression of metal concentration and turbidity provided a good fit, with R(2) = 0.86-0.93 for 72 data sets collected in the industrial river and R(2) = 0.60-0.85 for 60 data sets collected in the cleaner river. All the regression presented good linear relationship, leading to the conclusion that the occurrence of the five metals are directly related to suspended solids, and these metal concentration could be approximated using these regression equations. Thus, the linear regression equations were applied to estimate the metal concentration using online turbidity data from January 1 to June 30 in 2014. In the prediction, the WASP 7.5.2 (Water Quality Analysis Simulation Program) model was introduced to interpret the transport and fates of total suspended solids; in addition, metal concentration downstream of the two rivers was predicted. All the relative errors between the estimated and measured metal concentration were within 30%, and those between the predicted and measured values were within 40%. The estimation and prediction process of metals' concentration indicated that exploring the relationship between metals and turbidity values might be one effective technique for efficient estimation and prediction of metal concentration to facilitate better long-term monitoring with high temporal and spatial density.
Normal Theory Two-Stage ML Estimator When Data Are Missing at the Item Level
ERIC Educational Resources Information Center
Savalei, Victoria; Rhemtulla, Mijke
2017-01-01
In many modeling contexts, the variables in the model are linear composites of the raw items measured for each participant; for instance, regression and path analysis models rely on scale scores, and structural equation models often use parcels as indicators of latent constructs. Currently, no analytic estimation method exists to appropriately…
Predicting crown weight and bole volume of five Western hardwoods.
J.A. Kendall Snell; Susan N. Little
1983-01-01
Regression equations are presented for estimating biomass of five western hardwoods: red alder (Alnus rubra Bong.), giant chinkapin (Castanopsis chrysophylla (Dougl.) A. DC.), bigleaf maple (Acer macrophyllum Pursh), Pacific madrone (Arbutus menziesii Pursh), and tanoak (...
Estimating flow-duration and low-flow frequency statistics for unregulated streams in Oregon.
DOT National Transportation Integrated Search
2008-08-01
Flow statistical datasets, basin-characteristic datasets, and regression equations were developed to provide decision makers with surface-water information needed for activities such as water-quality regulation, water-rights adjudication, biological ...
Bhatia, Triptish; Gettig, Elizabeth A; Gottesman, Irving I; Berliner, Jonathan; Mishra, N N; Nimgaonkar, Vishwajit L; Deshpande, Smita N
2016-12-01
Schizophrenia (SZ) has an estimated heritability of 64-88%, with the higher values based on twin studies. Conventionally, family history of psychosis is the best individual-level predictor of risk, but reliable risk estimates are unavailable for Indian populations. Genetic, environmental, and epigenetic factors are equally important and should be considered when predicting risk in 'at risk' individuals. To estimate risk based on an Indian schizophrenia participant's family history combined with selected demographic factors. To incorporate variables in addition to family history, and to stratify risk, we constructed a regression equation that included demographic variables in addition to family history. The equation was tested in two independent Indian samples: (i) an initial sample of SZ participants (N=128) with one sibling or offspring; (ii) a second, independent sample consisting of multiply affected families (N=138 families, with two or more sibs/offspring affected with SZ). The overall estimated risk was 4.31±0.27 (mean±standard deviation). There were 19 (14.8%) individuals in the high risk group, 75 (58.6%) in the moderate risk and 34 (26.6%) in the above average risk (in Sample A). In the validation sample, risks were distributed as: high (45%), moderate (38%) and above average (17%). Consistent risk estimates were obtained from both samples using the regression equation. Familial risk can be combined with demographic factors to estimate risk for SZ in India. If replicated, the proposed stratification of risk may be easier and more realistic for family members. Copyright © 2016. Published by Elsevier B.V.
Using exogenous variables in testing for monotonic trends in hydrologic time series
Alley, William M.
1988-01-01
One approach that has been used in performing a nonparametric test for monotonic trend in a hydrologic time series consists of a two-stage analysis. First, a regression equation is estimated for the variable being tested as a function of an exogenous variable. A nonparametric trend test such as the Kendall test is then performed on the residuals from the equation. By analogy to stagewise regression and through Monte Carlo experiments, it is demonstrated that this approach will tend to underestimate the magnitude of the trend and to result in some loss in power as a result of ignoring the interaction between the exogenous variable and time. An alternative approach, referred to as the adjusted variable Kendall test, is demonstrated to generally have increased statistical power and to provide more reliable estimates of the trend slope. In addition, the utility of including an exogenous variable in a trend test is examined under selected conditions.
Ejlerskov, Katrine T.; Jensen, Signe M.; Christensen, Line B.; Ritz, Christian; Michaelsen, Kim F.; Mølgaard, Christian
2014-01-01
For 3-year-old children suitable methods to estimate body composition are sparse. We aimed to develop predictive equations for estimating fat-free mass (FFM) from bioelectrical impedance (BIA) and anthropometry using dual-energy X-ray absorptiometry (DXA) as reference method using data from 99 healthy 3-year-old Danish children. Predictive equations were derived from two multiple linear regression models, a comprehensive model (height2/resistance (RI), six anthropometric measurements) and a simple model (RI, height, weight). Their uncertainty was quantified by means of 10-fold cross-validation approach. Prediction error of FFM was 3.0% for both equations (root mean square error: 360 and 356 g, respectively). The derived equations produced BIA-based prediction of FFM and FM near DXA scan results. We suggest that the predictive equations can be applied in similar population samples aged 2–4 years. The derived equations may prove useful for studies linking body composition to early risk factors and early onset of obesity. PMID:24463487
Ejlerskov, Katrine T; Jensen, Signe M; Christensen, Line B; Ritz, Christian; Michaelsen, Kim F; Mølgaard, Christian
2014-01-27
For 3-year-old children suitable methods to estimate body composition are sparse. We aimed to develop predictive equations for estimating fat-free mass (FFM) from bioelectrical impedance (BIA) and anthropometry using dual-energy X-ray absorptiometry (DXA) as reference method using data from 99 healthy 3-year-old Danish children. Predictive equations were derived from two multiple linear regression models, a comprehensive model (height(2)/resistance (RI), six anthropometric measurements) and a simple model (RI, height, weight). Their uncertainty was quantified by means of 10-fold cross-validation approach. Prediction error of FFM was 3.0% for both equations (root mean square error: 360 and 356 g, respectively). The derived equations produced BIA-based prediction of FFM and FM near DXA scan results. We suggest that the predictive equations can be applied in similar population samples aged 2-4 years. The derived equations may prove useful for studies linking body composition to early risk factors and early onset of obesity.
New methodology for modeling annual-aircraft emissions at airports
DOE Office of Scientific and Technical Information (OSTI.GOV)
Woodmansey, B.G.; Patterson, J.G.
An as-accurate-as-possible estimation of total-aircraft emissions are an essential component of any environmental-impact assessment done for proposed expansions at major airports. To determine the amount of emissions generated by aircraft using present models it is necessary to know the emission characteristics of all engines that are on all planes using the airport. However, the published data base does not cover all engine types and, therefore, a new methodology is needed to assist in estimating annual emissions from aircraft at airports. Linear regression equations relating quantity of emissions to aircraft weight using a known-fleet mix are developed in this paper. Total-annualmore » emissions for CO, NO[sub x], NMHC, SO[sub x], CO[sub 2], and N[sub 2]O are tabulated for Toronto's international airport for 1990. The regression equations are statistically significant for all emissions except for NMHC from large jets and NO[sub x] and NMHC for piston-engine aircraft. This regression model is a relatively simple, fast, and inexpensive method of obtaining an annual-emission inventory for an airport.« less
Ham, Joo-ho; Park, Hun-Young; Kim, Youn-ho; Bae, Sang-kon; Ko, Byung-hoon
2017-01-01
[Purpose] The purpose of this study was to develop a regression model to estimate the heart rate at the lactate threshold (HRLT) and the heart rate at the ventilatory threshold (HRVT) using the heart rate threshold (HRT), and to test the validity of the regression model. [Methods] We performed a graded exercise test with a treadmill in 220 normal individuals (men: 112, women: 108) aged 20–59 years. HRT, HRLT, and HRVT were measured in all subjects. A regression model was developed to estimate HRLT and HRVT using HRT with 70% of the data (men: 79, women: 76) through randomization (7:3), with the Bernoulli trial. The validity of the regression model developed with the remaining 30% of the data (men: 33, women: 32) was also examined. [Results] Based on the regression coefficient, we found that the independent variable HRT was a significant variable in all regression models. The adjusted R2 of the developed regression models averaged about 70%, and the standard error of estimation of the validity test results was 11 bpm, which is similar to that of the developed model. [Conclusion] These results suggest that HRT is a useful parameter for predicting HRLT and HRVT. PMID:29036765
Ham, Joo-Ho; Park, Hun-Young; Kim, Youn-Ho; Bae, Sang-Kon; Ko, Byung-Hoon; Nam, Sang-Seok
2017-09-30
The purpose of this study was to develop a regression model to estimate the heart rate at the lactate threshold (HRLT) and the heart rate at the ventilatory threshold (HRVT) using the heart rate threshold (HRT), and to test the validity of the regression model. We performed a graded exercise test with a treadmill in 220 normal individuals (men: 112, women: 108) aged 20-59 years. HRT, HRLT, and HRVT were measured in all subjects. A regression model was developed to estimate HRLT and HRVT using HRT with 70% of the data (men: 79, women: 76) through randomization (7:3), with the Bernoulli trial. The validity of the regression model developed with the remaining 30% of the data (men: 33, women: 32) was also examined. Based on the regression coefficient, we found that the independent variable HRT was a significant variable in all regression models. The adjusted R2 of the developed regression models averaged about 70%, and the standard error of estimation of the validity test results was 11 bpm, which is similar to that of the developed model. These results suggest that HRT is a useful parameter for predicting HRLT and HRVT. ©2017 The Korean Society for Exercise Nutrition
Lima, Luiz Rodrigo Augustemak de; Martins, Priscila Custódio; Junior, Carlos Alencar Souza Alves; Castro, João Antônio Chula de; Silva, Diego Augusto Santos; Petroski, Edio Luiz
The aim of this study was to assess the validity of traditional anthropometric equations and to develop predictive equations of total body and trunk fat for children and adolescents living with HIV based on anthropometric measurements. Forty-eight children and adolescents of both sexes (24 boys) aged 7-17 years, living in Santa Catarina, Brazil, participated in the study. Dual-energy X-ray absorptiometry was used as the reference method to evaluate total body and trunk fat. Height, body weight, circumferences and triceps, subscapular, abdominal and calf skinfolds were measured. The traditional equations of Lohman and Slaughter were used to estimate body fat. Multiple regression models were fitted to predict total body fat (Model 1) and trunk fat (Model 2) using a backward selection procedure. Model 1 had an R 2 =0.85 and a standard error of the estimate of 1.43. Model 2 had an R 2 =0.80 and standard error of the estimate=0.49. The traditional equations of Lohman and Slaughter showed poor performance in estimating body fat in children and adolescents living with HIV. The prediction models using anthropometry provided reliable estimates and can be used by clinicians and healthcare professionals to monitor total body and trunk fat in children and adolescents living with HIV. Copyright © 2017 Sociedade Brasileira de Infectologia. Published by Elsevier Editora Ltda. All rights reserved.
ERIC Educational Resources Information Center
Molenaar, Ivo W.
The technical problems involved in obtaining Bayesian model estimates for the regression parameters in m similar groups are studied. The available computer programs, BPREP (BASIC), and BAYREG, both written in FORTRAN, require an amount of computer processing that does not encourage regular use. These programs are analyzed so that the performance…
Holtschlag, David J.
2011-01-01
In Michigan, index flow Q50 is a streamflow characteristic defined as the minimum of median flows for July, August, and September. The state of Michigan uses index flow estimates to help regulate large (greater than 100,000 gallons per day) water withdrawals to prevent adverse effects on characteristic fish populations. At sites where long-term streamgages are located, index flows are computed directly from continuous streamflow records as GageQ50. In an earlier study, a multiple-regression equation was developed to estimate index flows IndxQ50 at ungaged sites. The index equation explains about 94 percent of the variability of index flows at 147 (index) streamgages by use of six explanatory variables describing soil type, aquifer transmissivity, land cover, and precipitation characteristics. This report extends the results of the previous study, by use of Monte Carlo simulations, to evaluate alternative flow estimators, DiscQ50, IntgQ50, SiteQ50, and AugmQ50. The Monte Carlo simulations treated each of the available index streamgages, in turn, as a miscellaneous site where streamflow conditions are described by one or more instantaneous measurements of flow. In the simulations, instantaneous flows were approximated by daily mean flows at the corresponding site. All estimators use information that can be obtained from instantaneous flow measurements and contemporaneous daily mean flow data from nearby long-term streamgages. The efficacy of these estimators was evaluated over a set of measurement intensities in which the number of simulated instantaneous flow measurements ranged from 1 to 100 at a site. The discrete measurement estimator DiscQ50 is based on a simple linear regression developed between information on daily mean flows at five or more streamgages near the miscellaneous site and their corresponding GageQ50 index flows. The regression relation then was used to compute a DiscQ50 estimate at the miscellaneous site by use of the simulated instantaneous flow measurement. This process was repeated to develop a set of DiscQ50 estimates for all simulated instantaneous measurements, a weighted DiscQ50 estimate was formed from this set. Results indicated that the expected value of this weighted estimate was more precise than the IndxQ50 estimate for all measurement intensities evaluated. The integrated index-flow estimator, IntgQ50, was formed by computing a weighted average of the index estimate IndxQ50 and the DiscQ50 estimate. Results indicated that the IntgQ50 estimator was more precise than the DiscQ50 estimator at low measurement intensities of one to two measurements. At greater measurement intensities, the precision of the IntgQ50 estimator converges to the DiscQ50 estimator. Neither the DiscQ50 nor the IntgQ50 estimators provided site-specific estimates. In particular, although expected values of DiscQ50 and IntgQ50 estimates converge with increasing measurement intensity, they do not necessarily converge to the site-specific value of Q50. The site estimator of flow, SiteQ50, was developed to facilitate this convergence at higher measurement intensities. This is accomplished by use of the median of simulated instantaneous flow values for each measurement intensity level. A weighted estimate of the median and information associated with the IntgQ50 estimate was used to form the SiteQ50 estimate. Initial simulations indicate that the SiteQ50 estimator generally has greater precision than the IntgQ50 estimator at measurement intensities greater than 3, however, additional analysis is needed to identify streamflow conditions under which instantaneous measurements will produce estimates that generally converge to the index flows. A preliminary augmented index regression equation was developed, which contains the index regression estimate and two additional variables associated with base-flow recession characteristics. When these recession variables were estimated as the medians of recession parameters compute
Estimation of height and body mass index from demi-span in elderly individuals.
Weinbrenner, Tanja; Vioque, Jesús; Barber, Xavier; Asensio, Laura
2006-01-01
Obtaining accurate height and, consequently, body mass index (BMI) measurements in elderly subjects can be difficult due to changes in posture and loss of height during ageing. Measurements of other body segments can be used as an alternative to estimate standing height, but population- and age-specific equations are necessary. Our objectives were to validate existing equations, to develop new simple equations to predict height in an elderly Spanish population and to assess the accuracy of the BMI calculated by estimated height from the new equations. We measured height and demi-span in a representative sample of 592 individuals, 271 men and 321 women, 65 years and older (mean +/- SD, 73.8 +/- 6.3 years). We suggested equations to predict height from demi-span by multiple regression analyses and performed an agreement analysis between measured and estimated indices. Height estimated from demi-span correlated significantly (p < 0.001) with measured height (men: r = 0.708, women: r = 0.625). The best prediction equations were as follows: men, height (in cm) = 77.821 + (1.132 x demi-span in cm) + (-0.215 x 5-year age category); women: height (in cm) = 88.854 + (0.899 x demi-span in cm) + (-0.692 x 5-year age category). No significant differences between the mean values of estimated and measured heights were found for men (-0.03 +/- 4.6 cm) or women (-0.02 +/- 4.1 cm). The BMI derived from measured height did not differ significantly from the BMI derived from estimated height either. Predicted height values from equations based on demi-span and age may be acceptable surrogates to derive accurate nutritional indices such as the BMI, particularly in elderly populations, where height may be difficult to measure accurately.
Feaster, Toby D.; Tasker, Gary D.
2002-01-01
Data from 167 streamflow-gaging stations in or near South Carolina with 10 or more years of record through September 30, 1999, were used to develop two methods for estimating the magnitude and frequency of floods in South Carolina for rural ungaged basins that are not significantly affected by regulation. Flood frequency estimates for 54 gaged sites in South Carolina were computed by fitting the water-year peak flows for each site to a log-Pearson Type III distribution. As part of the computation of flood-frequency estimates for gaged sites, new values for generalized skew coefficients were developed. Flood-frequency analyses also were made for gaging stations that drain basins from more than one physiographic province. The U.S. Geological Survey, in cooperation with the South Carolina Department of Transportation, updated these data from previous flood-frequency reports to aid officials who are active in floodplain management as well as those who design bridges, culverts, and levees, or other structures near streams where flooding is likely to occur. Regional regression analysis, using generalized least squares regression, was used to develop a set of predictive equations that can be used to estimate the 2-, 5-, 10-, 25-, 50-, 100-, 200-, and 500-year recurrence-interval flows for rural ungaged basins in the Blue Ridge, Piedmont, upper Coastal Plain, and lower Coastal Plain physiographic provinces of South Carolina. The predictive equations are all functions of drainage area. Average errors of prediction for these regression equations ranged from -16 to 19 percent for the 2-year recurrence-interval flow in the upper Coastal Plain to -34 to 52 percent for the 500-year recurrence interval flow in the lower Coastal Plain. A region-of-influence method also was developed that interactively estimates recurrence- interval flows for rural ungaged basins in the Blue Ridge of South Carolina. The region-of-influence method uses regression techniques to develop a unique relation between flow and basin characteristics for an individual watershed. This, then, can be used to estimate flows at ungaged sites. Because the computations required for this method are somewhat complex, a computer application was developed that performs the computations and compares the predictive errors for this method. The computer application includes the option of using the region-of-influence method, or the generalized least squares regression equations from this report to compute estimated flows and errors of prediction specific to each ungaged site. From a comparison of predictive errors using the region-of-influence method with those computed using the regional regression method, the region-of-influence method performed systematically better only in the Blue Ridge and is, therefore, not recommended for use in the other physiographic provinces. Peak-flow data for the South Carolina stations used in the regionalization study are provided in appendix A, which contains gaging station information, log-Pearson Type III statistics, information on stage-flow relations, and water-year peak stages and flows. For informational purposes, water-year peak-flow data for stations on regulated streams in South Carolina also are provided in appendix D. Other information pertaining to the regulated streams is provided in the text of the report.
Khanal, Laxman; Shah, Sandip; Koirala, Sarun
2017-03-01
Length of long bones is taken as an important contributor for estimating one of the four elements of forensic anthropology i.e., stature of the individual. Since physical characteristics of the individual differ among different groups of population, population specific studies are needed for estimating the total length of femur from its segment measurements. Since femur is not always recovered intact in forensic cases, it was the aim of this study to derive regression equations from measurements of proximal and distal fragments in Nepalese population. A cross-sectional study was done among 60 dry femora (30 from each side) without sex determination in anthropometry laboratory. Along with maximum femoral length, four proximal and four distal segmental measurements were measured following the standard method with the help of osteometric board, measuring tape and digital Vernier's caliper. Bones with gross defects were excluded from the study. Measured values were recorded separately for right and left side. Statistical Package for Social Science (SPSS version 11.5) was used for statistical analysis. The value of segmental measurements were different between right and left side but statistical difference was not significant except for depth of medial condyle (p=0.02). All the measurements were positively correlated and found to have linear relationship with the femoral length. With the help of regression equation, femoral length can be calculated from the segmental measurements; and then femoral length can be used to calculate the stature of the individual. The data collected may contribute in the analysis of forensic bone remains in study population.
Parrett, Charles; Hull, J.A.
1986-01-01
Once-monthly streamflow measurements were used to estimate selected percentile discharges on flow-duration curves of monthly mean discharge for 40 ungaged stream sites in the upper Yellowstone River basin in Montana. The estimation technique was a modification of the concurrent-discharge method previously described and used by H.C. Riggs to estimate annual mean discharge. The modified technique is based on the relationship of various mean seasonal discharges to the required discharges on the flow-duration curves. The mean seasonal discharges are estimated from the monthly streamflow measurements, and the percentile discharges are calculated from regression equations. The regression equations, developed from streamflow record at nine gaging stations, indicated a significant log-linear relationship between mean seasonal discharge and various percentile discharges. The technique was tested at two discontinued streamflow-gaging stations; the differences between estimated monthly discharges and those determined from the discharge record ranged from -31 to +27 percent at one site and from -14 to +85 percent at the other. The estimates at one site were unbiased, and the estimates at the other site were consistently larger than the recorded values. Based on the test results, the probable average error of the technique was + or - 30 percent for the 21 sites measured during the first year of the program and + or - 50 percent for the 19 sites measured during the second year. (USGS)
Estimation of basal metabolic rate in Chinese: are the current prediction equations applicable?
Camps, Stefan G; Wang, Nan Xin; Tan, Wei Shuan Kimberly; Henry, C Jeyakumar
2016-08-31
Measurement of basal metabolic rate (BMR) is suggested as a tool to estimate energy requirements. Therefore, BMR prediction equations have been developed in multiple populations because indirect calorimetry is not always feasible. However, there is a paucity of data on BMR measured in overweight and obese adults living in Asia and equations developed for this group of interest. The aim of this study was to develop a new BMR prediction equation for Chinese adults applicable for a large BMI range and compare it with commonly used prediction equations. Subjects were 121 men and 111 women (age: 21-67 years, BMI: 16-41 kg/m(2)). Height, weight, and BMR were measured. Continuous open-circuit indirect calorimetry using a ventilated hood system for 30 min was used to measure BMR. A regression equation was derived using stepwise regression and accuracy was compared to 6 existing equations (Harris-Benedict, Henry, Liu, Yang, Owen and Mifflin). Additionally, the newly derived equation was cross-validated in a separate group of 70 Chinese subjects (26 men and 44 women, age: 21-69 years, BMI: 17-39 kg/m(2)). The equation developed from our data was: BMR (kJ/d) = 52.6 x weight (kg) + 828 x gender + 1960 (women = 0, men = 1; R(2) = 0.81). The accuracy rate (within 10 % accurate) was 78 % which compared well to Owen (70 %), Henry (67 %), Mifflin (67 %), Liu (58 %), Harris-Benedict (45 %) and Yang (37 %) for the whole range of BMI. For a BMI greater than 23, the Singapore equation reached an accuracy rate of 76 %. Cross-validation proved an accuracy rate of 80 %. To date, the newly developed Singapore equation is the most accurate BMR prediction equation in Chinese and is applicable for use in a large BMI range including those overweight and obese.
Painter, Colin C.; Heimann, David C.; Lanning-Rush, Jennifer L.
2017-08-14
A study was done by the U.S. Geological Survey in cooperation with the Kansas Department of Transportation and the Federal Emergency Management Agency to develop regression models to estimate peak streamflows of annual exceedance probabilities of 50, 20, 10, 4, 2, 1, 0.5, and 0.2 percent at ungaged locations in Kansas. Peak streamflow frequency statistics from selected streamgages were related to contributing drainage area and average precipitation using generalized least-squares regression analysis. The peak streamflow statistics were derived from 151 streamgages with at least 25 years of streamflow data through 2015. The developed equations can be used to predict peak streamflow magnitude and frequency within two hydrologic regions that were defined based on the effects of irrigation. The equations developed in this report are applicable to streams in Kansas that are not substantially affected by regulation, surface-water diversions, or urbanization. The equations are intended for use for streams with contributing drainage areas ranging from 0.17 to 14,901 square miles in the nonirrigation effects region and, 1.02 to 3,555 square miles in the irrigation-affected region, corresponding to the range of drainage areas of the streamgages used in the development of the regional equations.
Improving estimates of streamflow characteristics by using Landsat-1 imagery
Hollyday, Este F.
1976-01-01
Imagery from the first Earth Resources Technology Satellite (renamed Landsat-1) was used to discriminate physical features of drainage basins in an effort to improve equations used to estimate streamflow characteristics at gaged and ungaged sites. Records of 20 gaged basins in the Delmarva Peninsula of Maryland, Delaware, and Virginia were analyzed for 40 statistical streamflow characteristics. Equations relating these characteristics to basin characteristics were obtained by a technique of multiple linear regression. A control group of equations contains basin characteristics derived from maps. An experimental group of equations contains basin characteristics derived from maps and imagery. Characteristics from imagery were forest, riparian (streambank) vegetation, water, and combined agricultural and urban land use. These basin characteristics were isolated photographically by techniques of film-density discrimination. The area of each characteristic in each basin was measured photometrically. Comparison of equations in the control group with corresponding equations in the experimental group reveals that for 12 out of 40 equations the standard error of estimate was reduced by more than 10 percent. As an example, the standard error of estimate of the equation for the 5-year recurrence-interval flood peak was reduced from 46 to 32 percent. Similarly, the standard error of the equation for the mean monthly flow for September was reduced from 32 to 24 percent, the standard error for the 7-day, 2-year recurrence low flow was reduced from 136 to 102 percent, and the standard error for the 3-day, 2-year flood volume was reduced from 30 to 12 percent. It is concluded that data from Landsat imagery can substantially improve the accuracy of estimates of some streamflow characteristics at sites in the Delmarva Peninsula.
Nunes, Matheus Henrique
2016-01-01
Tree stem form in native tropical forests is very irregular, posing a challenge to establishing taper equations that can accurately predict the diameter at any height along the stem and subsequently merchantable volume. Artificial intelligence approaches can be useful techniques in minimizing estimation errors within complex variations of vegetation. We evaluated the performance of Random Forest® regression tree and Artificial Neural Network procedures in modelling stem taper. Diameters and volume outside bark were compared to a traditional taper-based equation across a tropical Brazilian savanna, a seasonal semi-deciduous forest and a rainforest. Neural network models were found to be more accurate than the traditional taper equation. Random forest showed trends in the residuals from the diameter prediction and provided the least precise and accurate estimations for all forest types. This study provides insights into the superiority of a neural network, which provided advantages regarding the handling of local effects. PMID:27187074
NEW STUDIES OF URBAN FLOOD FREQUENCY IN THE SOUTHEASTERN UNITED STATES.
Sauer, Vernon B.
1986-01-01
Five reports dealing with flood magnitude and frequency in urban areas in the southeastern United States have been published during the past 2 years by the U. S. Geological Survey (USGS). These reports are based on data collected in Tampa and Tallahassee, Florida; Atlanta, Georgia; and several cities in Alabama and Tennessee. Each report contains regression equations useful for estimating flood peaks for selected recurrence intervals at ungauged urban sites. A nationwide study of urban flood characteristics by the USGS published in 1983 contains equations for estimating urban peak discharges for ungauged sites. At the time that the nationwide study was conducted, data from only 35 sites in the southeastern United States were available. The five new reports contain data for 88 additional sites. These new data show that the seven-parameter estimating equations developed in the nationwide study are unbiased and have prediction errors less than those described in the nationwide report.
Nunes, Matheus Henrique; Görgens, Eric Bastos
2016-01-01
Tree stem form in native tropical forests is very irregular, posing a challenge to establishing taper equations that can accurately predict the diameter at any height along the stem and subsequently merchantable volume. Artificial intelligence approaches can be useful techniques in minimizing estimation errors within complex variations of vegetation. We evaluated the performance of Random Forest® regression tree and Artificial Neural Network procedures in modelling stem taper. Diameters and volume outside bark were compared to a traditional taper-based equation across a tropical Brazilian savanna, a seasonal semi-deciduous forest and a rainforest. Neural network models were found to be more accurate than the traditional taper equation. Random forest showed trends in the residuals from the diameter prediction and provided the least precise and accurate estimations for all forest types. This study provides insights into the superiority of a neural network, which provided advantages regarding the handling of local effects.
Williams-Sether, Tara
2004-01-01
The Dakota Water Resources Act, passed by the U.S. Congress on December 15, 2000, authorized the Secretary of the Interior to conduct a comprehensive study of future water-quantity and quality needs of the Red River of the North Basin in North Dakota and possible options to meet those water needs. Previous Red River of the North Basin studies conducted by the Bureau of Reclamation used streamflow and water-quality data bases developed by the U.S. Geological Survey that included data for 1931-84. As a result of the recent congressional authorization and results of previous studies by the Bureau of Reclamation, redevelopment of the streamflow and water-quality data bases with current data through 1999 are needed in order to evaluate and predict the water-quantity and quality effects within the Red River of the North Basin. This report provides updated statistical summaries of selected water-quality constituents and streamflow and the regression relations between them. Available data for 1931-99 were used to develop regression equations between 5 selected water-quality constituents and streamflow for 38 gaging stations in the Red River of the North Basin. The water-quality constituents that were regressed against streamflow were hardness (as CaCO3), sodium, chloride, sulfate, and dissolved solids. Statistical summaries of the selected water-quality constituents and streamflow for the gaging stations used in the regression equations development and the applications and limitations of the regression equations are presented in this report.
Oden, Timothy D.; Asquith, William H.; Milburn, Matthew S.
2009-01-01
In December 2005, the U.S. Geological Survey in cooperation with the City of Houston, Texas, began collecting discrete water-quality samples for nutrients, total organic carbon, bacteria (total coliform and Escherichia coli), atrazine, and suspended sediment at two U.S. Geological Survey streamflow-gaging stations upstream from Lake Houston near Houston (08068500 Spring Creek near Spring, Texas, and 08070200 East Fork San Jacinto River near New Caney, Texas). The data from the discrete water-quality samples collected during 2005-07, in conjunction with monitored real-time data already being collected - physical properties (specific conductance, pH, water temperature, turbidity, and dissolved oxygen), streamflow, and rainfall - were used to develop regression models for predicting water-quality constituent concentrations for inflows to Lake Houston. Rainfall data were obtained from a rain gage monitored by Harris County Homeland Security and Emergency Management and colocated with the Spring Creek station. The leaps and bounds algorithm was used to find the best subsets of possible regression models (minimum residual sum of squares for a given number of variables). The potential explanatory or predictive variables included discharge (streamflow), specific conductance, pH, water temperature, turbidity, dissolved oxygen, rainfall, and time (to account for seasonal variations inherent in some water-quality data). The response variables at each site were nitrite plus nitrate nitrogen, total phosphorus, organic carbon, Escherichia coli, atrazine, and suspended sediment. The explanatory variables provide easily measured quantities as a means to estimate concentrations of the various constituents under investigation, with accompanying estimates of measurement uncertainty. Each regression equation can be used to estimate concentrations of a given constituent in real time. In conjunction with estimated concentrations, constituent loads were estimated by multiplying the estimated concentration by the corresponding streamflow and applying the appropriate conversion factor. By computing loads from estimated constituent concentrations, a continuous record of estimated loads can be available for comparison to total maximum daily loads. The regression equations presented in this report are site specific to the Spring Creek and East Fork San Jacinto River streamflow-gaging stations; however, the methods that were developed and documented could be applied to other tributaries to Lake Houston for estimating real-time water-quality data for streams entering Lake Houston.
Development and validation of a predictive equation for lean body mass in children and adolescents.
Foster, Bethany J; Platt, Robert W; Zemel, Babette S
2012-05-01
Lean body mass (LBM) is not easy to measure directly in the field or clinical setting. Equations to predict LBM from simple anthropometric measures, which account for the differing contributions of fat and lean to body weight at different ages and levels of adiposity, would be useful to both human biologists and clinicians. To develop and validate equations to predict LBM in children and adolescents across the entire range of the adiposity spectrum. Dual energy X-ray absorptiometry was used to measure LBM in 836 healthy children (437 females) and linear regression was used to develop sex-specific equations to estimate LBM from height, weight, age, body mass index (BMI) for age z-score and population ancestry. Equations were validated using bootstrapping methods and in a local independent sample of 332 children and in national data collected by NHANES. The mean difference between measured and predicted LBM was - 0.12% (95% limits of agreement - 11.3% to 8.5%) for males and - 0.14% ( - 11.9% to 10.9%) for females. Equations performed equally well across the entire adiposity spectrum, as estimated by BMI z-score. Validation indicated no over-fitting. LBM was predicted within 5% of measured LBM in the validation sample. The equations estimate LBM accurately from simple anthropometric measures.
A fully distributed implementation of mean annual streamflow regional regression equations
Verdin, K.L.; Worstell, B.
2008-01-01
Estimates of mean annual streamflow are needed for a variety of hydrologic assessments. Away from gage locations, regional regression equations that are a function of upstream area, precipitation, and temperature are commonly used. Geographic information systems technology has facilitated their use for projects, but traditional approaches using the polygon overlay operator have been too inefficient for national scale applications. As an alternative, the Elevation Derivatives for National Applications (EDNA) database was used as a framework for a fully distributed implementation of mean annual streamflow regional regression equations. The raster “flow accumulation” operator was used to efficiently achieve spatially continuous parameterization of the equations for every 30 m grid cell of the conterminous United States (U.S.). Results were confirmed by comparing with measured flows at stations of the Hydro-Climatic Data Network, and their applications value demonstrated in the development of a national geospatial hydropower assessment. Interactive tools at the EDNA website make possible the fast and efficient query of mean annual streamflow for any location in the conterminous U.S., providing a valuable complement to other national initiatives (StreamStats and the National Hydrography Dataset Plus).
Luo, Ying-zhen; Tu, Meng; Fan, Fei; Zheng, Jie-qian; Yang, Ming; Li, Tao; Zhang, Kui; Deng, Zhen-hua
2015-06-01
To establish the linear regression equation between body height and combined length of manubrium and mesostenum of sternum measured by CT volume rendering technique (CT-VRT) in southwest Han population. One hundred and sixty subjects, including 80 males and 80 females were selected from southwest Han population for routine CT-VRT (reconstruction thickness 1 mm) examination. The lengths of both manubrium and mesosternum were recorded, and the combined length of manubrium and mesosternum was equal to the algebraic sum of them. The sex-specific linear regression equations between the combined length of manubrium and mesosternum and the real body height of each subject were deduced. The sex-specific simple linear regression equations between the combined length of manubrium and mesostenum (x3) and body height (y) were established (male: y = 135.000+2.118 x3 and female: y = 120.790+2.808 x3). Both equations showed statistical significance (P < 0.05) with a 100% predictive accuracy. CT-VRT is an effective method for measurement of the index of sternum. The combined length of manubrium and mesosternum from CT-VRT can be used for body height estimation in southwest Han population.
Wetzel, Kim L.; Bettandorff, J.M.
1986-01-01
Techniques are presented for estimating various streamflow characteristics, such as peak flows, mean monthly and annual flows, flow durations, and flow volumes, at ungaged sites on unregulated streams in the Eastern Coal region. Streamflow data and basin characteristics for 629 gaging stations were used to develop multiple-linear-regression equations. Separate equations were developed for the Eastern and Interior Coal Provinces. Drainage area is an independent variable common to all equations. Other variables needed, depending on the streamflow characteristic, are mean annual precipitation, mean basin elevation, main channel length, basin storage, main channel slope, and forest cover. A ratio of the observed 50- to 90-percent flow durations was used in the development of relations to estimate low-flow frequencies in the Eastern Coal Province. Relations to estimate low flows in the Interior Coal Province are not presented because the standard errors were greater than 0.7500 log units and were considered to be of poor reliability.
Estimating body weight and body composition of chickens by using noninvasive measurements.
Latshaw, J D; Bishop, B L
2001-07-01
The major objective of this research was to develop equations to estimate BW and body composition using measurements taken with inexpensive instruments. We used five groups of chickens that were created with different genetic stocks and feeding programs. Four of the five groups were from broiler genetic stock, and one was from sex-linked heavy layers. The goal was to sample six males from each group when the group weight was 1.20, 1.75, and 2.30 kg. Each male was weighed and measured for back length, pelvis width, circumference, breast width, keel length, and abdominal skinfold thickness. A cloth tape measure, calipers, and skinfold calipers were used for measurement. Chickens were scanned for total body electrical conductivity (TOBEC) before being euthanized and frozen. Six females were selected at weights similar to those for males and were measured in the same way. Each whole chicken was ground, and a portion of ground material of each was used to measure water, fat, ash, and energy content. Multiple linear regression was used to estimate BW from body measurements. The best single measurement was pelvis width, with an R2 = 0.67. Inclusion of three body measurements in an equation resulted in R2 = 0.78 and the following equation: BW (g) = -930.0 + 68.5 (breast, cm) + 48.5 (circumference, cm) + 62.8 (pelvis, cm). The best single measurement to estimate body fat was abdominal skinfold thickness, expressed as a natural logarithm. Inclusion of weight and skinfold thickness resulted in R2 = 0.63 for body fat according to the following equation: fat (%) = 24.83 + 6.75 (skinfold, ln cm) - 3.87 (wt, kg). Inclusion of the result of TOBEC and the effect of sex improved the R2 to 0.78 for body fat. Regression analysis was used to develop additional equations, based on fat, to estimate water and energy contents of the body. The body water content (%) = 72.1 - 0.60 (body fat, %), and body energy (kcal/g) = 1.097 + 0.080 (body fat, %). The results of the present study indicated that the composition of a chicken's body could be estimated from the models that were developed.
Validation of a heteroscedastic hazards regression model.
Wu, Hong-Dar Isaac; Hsieh, Fushing; Chen, Chen-Hsin
2002-03-01
A Cox-type regression model accommodating heteroscedasticity, with a power factor of the baseline cumulative hazard, is investigated for analyzing data with crossing hazards behavior. Since the approach of partial likelihood cannot eliminate the baseline hazard, an overidentified estimating equation (OEE) approach is introduced in the estimation procedure. It by-product, a model checking statistic, is presented to test for the overall adequacy of the heteroscedastic model. Further, under the heteroscedastic model setting, we propose two statistics to test the proportional hazards assumption. Implementation of this model is illustrated in a data analysis of a cancer clinical trial.
Estimated and measured traveltime for the Crow River watershed, Minnesota
Arntson, Allan D.
2007-01-01
A time-of-travel study involving a luminescent dye was done on the Crow River in Minnesota from Rockford to the confluence with the Mississippi River at Dayton on July 11, 2006, at a streamflow of 293 cubic feet per second at Rockford. Dye was injected in the Crow River at Rockford, and traveltime and concentrations were measured at three sampling locations downstream: at the Hanover historic bridge in Hanover, at County Road 116 near St. Michael, and at County Road 12 in Dayton. The results of the measured traveltimes were compared to estimated traveltimes from a previous study of the Crow River and six other rivers in the Upper Mississippi River basin in 2003. Regression equations based on watershed characteristics of drainage area, river slope, mean-annual streamflow, and instantaneous streamflow at the time of measurement from more than 900 stream segments across the Nation were used to estimate traveltimes. Traveltimes were estimated and measured for the leading edge, peak concentration, and trailing edge of tracer-response curves. Estimated traveltimes for the leading edge, peak concentration, and trailing edge at Dayton were 25.3, 28.4, and 35.6 hours, respectively. Measured traveltimes for the leading edge, peak concentration, and trailing edge at Dayton were 33.2, 38.2, and 49.2 hours, respectively, for the 22.4-mile reach. Although traveltimes for the Crow and the Sauk Rivers were underestimated by use of the regression equations, the regression estimates were close enough to measured values to be considered satisfactory; hence, this estimating technique should be applicable in other source-water planning efforts in and near the study area.
NASA Astrophysics Data System (ADS)
Pachon, Jorge E.; Balachandran, Sivaraman; Hu, Yongtao; Weber, Rodney J.; Mulholland, James A.; Russell, Armistead G.
2010-10-01
In the Southeastern US, organic carbon (OC) comprises about 30% of the PM 2.5 mass. A large fraction of OC is estimated to be of secondary origin. Long-term estimates of SOC and uncertainties are necessary in the evaluation of air quality policy effectiveness and epidemiologic studies. Four methods to estimate secondary organic carbon (SOC) and respective uncertainties are compared utilizing PM 2.5 chemical composition and gas phase data available in Atlanta from 1999 to 2007. The elemental carbon (EC) tracer and the regression methods, which rely on the use of tracer species of primary and secondary OC formation, provided intermediate estimates of SOC as 30% of OC. The other two methods, chemical mass balance (CMB) and positive matrix factorization (PMF) solve mass balance equations to estimate primary and secondary fractions based on source profiles and statistically-derived common factors, respectively. CMB had the highest estimate of SOC (46% of OC) while PMF led to the lowest (26% of OC). The comparison of SOC uncertainties, estimated based on propagation of errors, led to the regression method having the lowest uncertainty among the four methods. We compared the estimates with the water soluble fraction of the OC, which has been suggested as a surrogate of SOC when biomass burning is negligible, and found a similar trend with SOC estimates from the regression method. The regression method also showed the strongest correlation with daily SOC estimates from CMB using molecular markers. The regression method shows advantages over the other methods in the calculation of a long-term series of SOC estimates.
Mastin, Mark C.; Konrad, Christopher P.; Veilleux, Andrea G.; Tecca, Alison E.
2016-09-20
An investigation into the magnitude and frequency of floods in Washington State computed the annual exceedance probability (AEP) statistics for 648 U.S. Geological Survey unregulated streamgages in and near the borders of Washington using the recorded annual peak flows through water year 2014. This is an updated report from a previous report published in 1998 that used annual peak flows through the water year 1996. New in this report, a regional skew coefficient was developed for the Pacific Northwest region that includes areas in Oregon, Washington, Idaho and western Montana within the Columbia River drainage basin south of the United States-Canada border, the coastal areas of Oregon and western Washington, and watersheds draining into Puget Sound, Washington. The skew coefficient is an important term in the Log Pearson Type III equation used to define the distribution of the log-transformed annual peaks. The Expected Moments Algorithm was used to fit historical and censored peak-flow data to the log Pearson Type III distribution. A Multiple Grubb-Beck test was employed to censor low outliers of annual peak flows to improve on the frequency distribution. This investigation also includes a section on observed trends in annual peak flows that showed significant trends (p-value < 0.05) in 21 of 83 long-term sites, but with small magnitude Kendall tau values suggesting a limited monotonic trend in the time series of annual peaks. Most of the sites with a significant trend in western Washington were positive and all the sites with significant trends (three sites) in eastern Washington were negative.Multivariate regression analysis with measured basin characteristics and the AEP statistics at long-term, unregulated, and un-urbanized (defined as drainage basins with less than 5 percent impervious land cover for this investigation) streamgages within Washington and some in Idaho and Oregon that are near the Washington border was used to develop equations to estimate AEP statistics at ungaged basins. Washington was divided into four regions to improve the accuracy of the regression equations; a set of equations for eight selected AEPs and for each region were constructed. Selected AEP statistics included the annual peak flows that equaled or exceeded 50, 20, 10, 4, 2, 1, 0.5 and 0.2 percent of the time equivalent to peak flows for peaks with a 2-, 5-, 10-, 25-, 50-, 100-, 200-, and 500-year recurrence intervals, respectively. Annual precipitation and drainage area were the significant basin characteristics in the regression equations for all four regression regions in Washington and forest cover was significant for the two regression regions in eastern Washington. Average standard error of prediction for the regional regression equations ranged from 70.19 to 125.72 percent for Regression Regions 1 and 2 on the eastern side of the Cascade Mountains and from 43.22 to 58.04 percent for Regression Regions 3 and 4 on the western side of the Cascade Mountains. The pseudo coefficient of determination (where a value of 100 signifies a perfect regression model) ranged from 68.39 to 90.68 for Regression Regions 1 and 2, and 92.35 to 95.44 for Regions 3 and 4.The calculated AEP statistics for the streamgages and the regional regression equations are expected to be incorporated into StreamStats after the publication of this report. StreamStats is the interactive Web-based map tool created by the U.S. Geological Survey to allow the user to choose a streamgage and obtain published statistics or choose ungaged locations where the program automatically applies the regional regression equations and computes the estimates of the AEP statistics.
Accounting for estimated IQ in neuropsychological test performance with regression-based techniques.
Testa, S Marc; Winicki, Jessica M; Pearlson, Godfrey D; Gordon, Barry; Schretlen, David J
2009-11-01
Regression-based normative techniques account for variability in test performance associated with multiple predictor variables and generate expected scores based on algebraic equations. Using this approach, we show that estimated IQ, based on oral word reading, accounts for 1-9% of the variability beyond that explained by individual differences in age, sex, race, and years of education for most cognitive measures. These results confirm that adding estimated "premorbid" IQ to demographic predictors in multiple regression models can incrementally improve the accuracy with which regression-based norms (RBNs) benchmark expected neuropsychological test performance in healthy adults. It remains to be seen whether the incremental variance in test performance explained by estimated "premorbid" IQ translates to improved diagnostic accuracy in patient samples. We describe these methods, and illustrate the step-by-step application of RBNs with two cases. We also discuss the rationale, assumptions, and caveats of this approach. More broadly, we note that adjusting test scores for age and other characteristics might actually decrease the accuracy with which test performance predicts absolute criteria, such as the ability to drive or live independently.
Staley, Dennis M.; Negri, Jacquelyn A.; Kean, Jason W.; Laber, Jayme L.; Tillery, Anne C.; Youberg, Ann M.
2016-06-30
Wildfire can significantly alter the hydrologic response of a watershed to the extent that even modest rainstorms can generate dangerous flash floods and debris flows. To reduce public exposure to hazard, the U.S. Geological Survey produces post-fire debris-flow hazard assessments for select fires in the western United States. We use publicly available geospatial data describing basin morphology, burn severity, soil properties, and rainfall characteristics to estimate the statistical likelihood that debris flows will occur in response to a storm of a given rainfall intensity. Using an empirical database and refined geospatial analysis methods, we defined new equations for the prediction of debris-flow likelihood using logistic regression methods. We showed that the new logistic regression model outperformed previous models used to predict debris-flow likelihood.
O'Connor, Jean E; Coyle, Joseph; Bogue, Conor; Spence, Liam D; Last, Jason
2014-01-01
Age estimation in living subjects is primarily achieved through assessment of a hand-wrist radiograph and comparison with a standard reference atlas. Recently, maturation of other regions of the skeleton has also been assessed in an attempt to refine the age estimates. The current study presents a method to predict bone age directly from the knee in a modern Irish sample. Ten maturity indicators (A-J) at the knee were examined from radiographs of 221 subjects (137 males; 84 females). Each indicator was assigned a maturity score. Scores for indicators A-G, H-J and A-J, respectively, were totalled to provide a cumulative maturity score for change in morphology of the epiphyses (AG), epiphyseal union (HJ) and the combination of both (AJ). Linear regression equations to predict age from the maturity scores (AG, HJ, AJ) were constructed for males and females. For males, equation-AJ demonstrated the greatest predictive capability (R(2)=0.775) while for females equation-HJ had the strongest capacity for prediction (R(2)=0.815). When equation-AJ for males and equation-HJ for females were applied to the current sample, the predicted age of 90% of subjects was within ±1.5 years of actual age for male subjects and within +2.0 to -1.9 years of actual age for female subjects. The regression formulae and associated charts represent the most contemporary method of age prediction currently available for an Irish population, and provide a further technique which can contribute to a multifactorial approach to age estimation in non-adults. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Technique for estimating depth of 100-year floods in Tennessee
Gamble, Charles R.; Lewis, James G.
1977-01-01
Preface: A method is presented for estimating the depth of the loo-year flood in four hydrologic areas in Tennessee. Depths at 151 gaging stations on streams that were not significantly affected by man made changes were related to basin characteristics by multiple regression techniques. Equations derived from the analysis can be used to estimate the depth of the loo-year flood if the size of the drainage basin is known.
Bertrand-Krajewski, J L
2004-01-01
In order to replace traditional sampling and analysis techniques, turbidimeters can be used to estimate TSS concentration in sewers, by means of sensor and site specific empirical equations established by linear regression of on-site turbidity Tvalues with TSS concentrations C measured in corresponding samples. As the ordinary least-squares method is not able to account for measurement uncertainties in both T and C variables, an appropriate regression method is used to solve this difficulty and to evaluate correctly the uncertainty in TSS concentrations estimated from measured turbidity. The regression method is described, including detailed calculations of variances and covariance in the regression parameters. An example of application is given for a calibrated turbidimeter used in a combined sewer system, with data collected during three dry weather days. In order to show how the established regression could be used, an independent 24 hours long dry weather turbidity data series recorded at 2 min time interval is used, transformed into estimated TSS concentrations, and compared to TSS concentrations measured in samples. The comparison appears as satisfactory and suggests that turbidity measurements could replace traditional samples. Further developments, including wet weather periods and other types of sensors, are suggested.
Estimation of Fat-free Mass at Discharge in Preterm Infants Fed With Optimized Feeding Regimen.
Larcade, Julie; Pradat, Pierre; Buffin, Rachel; Leick-Courtois, Charline; Jourdes, Emilie; Picaud, Jean-Charles
2017-01-01
The purpose of the present study was to validate a previously calculated equation (E1) that estimates infant fat-free mass (FFM) at discharge using data from a population of preterm infants receiving an optimized feeding regimen. Preterm infants born before 33 weeks of gestation between April 2014 and November 2015 in the tertiary care unit of Croix-Rousse Hospital in Lyon, France, were included in the study. At discharge, FFM was assessed by air displacement plethysmography (PEA POD) and was compared with FFM estimated by E1. FFM was estimated using a multiple linear regression model. Data on 155 preterm infants were collected. There was a strong correlation between the FFM estimated by E1 and FFM assessed by the PEA POD (r = 0.939). E1, however, underestimated the FFM (average difference: -197 g), and this underestimation increased as FFM increased. A new, more predictive equation is proposed (r = 0.950, average difference: -12 g). Although previous estimation methods were useful for estimating FFM at discharge, an equation adapted to present populations of preterm infants with "modern" neonatal care and nutritional practices is required for accuracy.
Predicting basal metabolic rates in Malaysian adult elite athletes.
Wong, Jyh Eiin; Poh, Bee Koon; Nik Shanita, Safii; Izham, Mohd Mohamad; Chan, Kai Quin; Tai, Meng De; Ng, Wei Wei; Ismail, Mohd Noor
2012-11-01
This study aimed to measure the basal metabolic rate (BMR) of elite athletes and develop a gender specific predictive equation to estimate their energy requirements. 92 men and 33 women (aged 18-31 years) from 15 sports, who had been training six hours daily for at least one year, were included in the study. Body composition was measured using the bioimpedance technique, and BMR by indirect calorimetry. The differences between measured and estimated BMR using various predictive equations were calculated. The novel equation derived from stepwise multiple regression was evaluated using Bland and Altman analysis. The predictive equations of Cunningham and the Food and Agriculture Organization/World Health Organization/United Nations University either over- or underestimated the measured BMR by up to ± 6%, while the equations of Ismail et al, developed from the local non-athletic population, underestimated the measured BMR by 14%. The novel predictive equation for the BMR of athletes was BMR (kcal/day) = 669 + 13 (weight in kg) + 192 (gender: 1 for men and 0 for women) (R2 0.548; standard error of estimates 163 kcal). Predicted BMRs of elite athletes by this equation were within 1.2% ± 9.5% of the measured BMR values. The novel predictive equation presented in this study can be used to calculate BMR for adult Malaysian elite athletes. Further studies may be required to validate its predictive capabilities for other sports, nationalities and age groups.
Regression estimators for generic health-related quality of life and quality-adjusted life years.
Basu, Anirban; Manca, Andrea
2012-01-01
To develop regression models for outcomes with truncated supports, such as health-related quality of life (HRQoL) data, and account for features typical of such data such as a skewed distribution, spikes at 1 or 0, and heteroskedasticity. Regression estimators based on features of the Beta distribution. First, both a single equation and a 2-part model are presented, along with estimation algorithms based on maximum-likelihood, quasi-likelihood, and Bayesian Markov-chain Monte Carlo methods. A novel Bayesian quasi-likelihood estimator is proposed. Second, a simulation exercise is presented to assess the performance of the proposed estimators against ordinary least squares (OLS) regression for a variety of HRQoL distributions that are encountered in practice. Finally, the performance of the proposed estimators is assessed by using them to quantify the treatment effect on QALYs in the EVALUATE hysterectomy trial. Overall model fit is studied using several goodness-of-fit tests such as Pearson's correlation test, link and reset tests, and a modified Hosmer-Lemeshow test. The simulation results indicate that the proposed methods are more robust in estimating covariate effects than OLS, especially when the effects are large or the HRQoL distribution has a large spike at 1. Quasi-likelihood techniques are more robust than maximum likelihood estimators. When applied to the EVALUATE trial, all but the maximum likelihood estimators produce unbiased estimates of the treatment effect. One and 2-part Beta regression models provide flexible approaches to regress the outcomes with truncated supports, such as HRQoL, on covariates, after accounting for many idiosyncratic features of the outcomes distribution. This work will provide applied researchers with a practical set of tools to model outcomes in cost-effectiveness analysis.
Techniques for estimating flood-depth frequency relations for streams in West Virginia
Wiley, J.B.
1987-01-01
Multiple regression analyses are applied to data from 119 U.S. Geological Survey streamflow stations to develop equations that estimate baseline depth (depth of 50% flow duration) and 100-yr flood depth on unregulated streams in West Virginia. Drainage basin characteristics determined from the 100-yr flood depth analysis were used to develop 2-, 10-, 25-, 50-, and 500-yr regional flood depth equations. Two regions with distinct baseline depth equations and three regions with distinct flood depth equations are delineated. Drainage area is the most significant independent variable found in the central and northern areas of the state where mean basin elevation also is significant. The equations are applicable to any unregulated site in West Virginia where values of independent variables are within the range evaluated for the region. Examples of inapplicable sites include those in reaches below dams, within and directly upstream from bridge or culvert constrictions, within encroached reaches, in karst areas, and where streams flow through lakes or swamps. (Author 's abstract)
Ramsthaler, Frank; Kettner, Mattias; Verhoff, Marcel A
2014-01-01
In forensic anthropological casework, estimating age-at-death is key to profiling unknown skeletal remains. The aim of this study was to examine the reliability of a new, simple, fast, and inexpensive digital odontological method for age-at-death estimation. The method is based on the original Lamendin method, which is a widely used technique in the repertoire of odontological aging methods in forensic anthropology. We examined 129 single root teeth employing a digital camera and imaging software for the measurement of the luminance of the teeth's translucent root zone. Variability in luminance detection was evaluated using statistical technical error of measurement analysis. The method revealed stable values largely unrelated to observer experience, whereas requisite formulas proved to be camera-specific and should therefore be generated for an individual recording setting based on samples of known chronological age. Multiple regression analysis showed a highly significant influence of the coefficients of the variables "arithmetic mean" and "standard deviation" of luminance for the regression formula. For the use of this primer multivariate equation for age-at-death estimation in casework, a standard error of the estimate of 6.51 years was calculated. Step-by-step reduction of the number of embedded variables to linear regression analysis employing the best contributor "arithmetic mean" of luminance yielded a regression equation with a standard error of 6.72 years (p < 0.001). The results of this study not only support the premise of root translucency as an age-related phenomenon, but also demonstrate that translucency reflects a number of other influencing factors in addition to age. This new digital measuring technique of the zone of dental root luminance can broaden the array of methods available for estimating chronological age, and furthermore facilitate measurement and age classification due to its low dependence on observer experience.
Chan, Dorothy F Y; Li, Albert M; Chan, Michael H M; So, Hung Kwan; Chan, Iris H S; Yin, Jane A T; Lam, Christopher W K; Fok, Tai Fai; Nelson, Edmund A S
2009-01-01
(1) To examine the validity of existing prediction equations (PREE) for estimating resting energy expenditure (REE) in obese Chinese children, (2) to correlate the measured REE (MREE) with anthropometric and biochemical parameters and (3) to derive a new PREE for local use. Cross-sectional study. 100 obese children (71 boys) were studied. All subjects underwent physical examination and anthropometric measurement. Upper and central body fat distribution was signified by centrality and conicity index respectively, and REE was measured by indirect calorimetry. Fat free mass (FFM) were measured by DEXA scan. Thirteen existing prediction equations for estimating REE were compared with MREE among these obese children. Fasting blood for glucose, lipid profile and insulin were obtained. The overall, male and female median MREEs were 7.1 mJ/d (IR 6.2-8.4), 7.3 mJ/d (IR 6.3-9.7) and 6.9 mJ/d (IR 5.6-8.1) respectively. No sex difference was noted in MREE (p=0.203). Most of the equations except Schofield equation underestimated REE of our children. By multiple linear regression, MREE was positively correlated with FFM (p<0.0001), conicity index (p<0.001) and centrality index (p=0.001). A new equation for estimating REE for local use was derived as: REE=(17.4*logFFM)+(11.4*conicity index)-(2.4*centrality index)-31.3. The mean difference of new PREE-MREE was -0.011 mJ/d (SD 1.51) with an interclass correlation coefficient of 0.91. None of the existing prediction equations were accurate in their estimation of REE, when applied to obese Chinese children. A new prediction equation has been derived for local use.
Bent, Gardner C.; Archfield, Stacey A.
2002-01-01
A logistic regression equation was developed for estimating the probability of a stream flowing perennially at a specific site in Massachusetts. The equation provides city and town conservation commissions and the Massachusetts Department of Environmental Protection with an additional method for assessing whether streams are perennial or intermittent at a specific site in Massachusetts. This information is needed to assist these environmental agencies, who administer the Commonwealth of Massachusetts Rivers Protection Act of 1996, which establishes a 200-foot-wide protected riverfront area extending along the length of each side of the stream from the mean annual high-water line along each side of perennial streams, with exceptions in some urban areas. The equation was developed by relating the verified perennial or intermittent status of a stream site to selected basin characteristics of naturally flowing streams (no regulation by dams, surface-water withdrawals, ground-water withdrawals, diversion, waste-water discharge, and so forth) in Massachusetts. Stream sites used in the analysis were identified as perennial or intermittent on the basis of review of measured streamflow at sites throughout Massachusetts and on visual observation at sites in the South Coastal Basin, southeastern Massachusetts. Measured or observed zero flow(s) during months of extended drought as defined by the 310 Code of Massachusetts Regulations (CMR) 10.58(2)(a) were not considered when designating the perennial or intermittent status of a stream site. The database used to develop the equation included a total of 305 stream sites (84 intermittent- and 89 perennial-stream sites in the State, and 50 intermittent- and 82 perennial-stream sites in the South Coastal Basin). Stream sites included in the database had drainage areas that ranged from 0.14 to 8.94 square miles in the State and from 0.02 to 7.00 square miles in the South Coastal Basin.Results of the logistic regression analysis indicate that the probability of a stream flowing perennially at a specific site in Massachusetts can be estimated as a function of (1) drainage area (cube root), (2) drainage density, (3) areal percentage of stratified-drift deposits (square root), (4) mean basin slope, and (5) location in the South Coastal Basin or the remainder of the State. Although the equation developed provides an objective means for estimating the probability of a stream flowing perennially at a specific site, the reliability of the equation is constrained by the data used to develop the equation. The equation may not be reliable for (1) drainage areas less than 0.14 square mile in the State or less than 0.02 square mile in the South Coastal Basin, (2) streams with losing reaches, or (3) streams draining the southern part of the South Coastal Basin and the eastern part of the Buzzards Bay Basin and the entire area of Cape Cod and the Islands Basins.
Petkewich, Matthew D.; Conrads, Paul
2013-01-01
The Everglades Depth Estimation Network is an integrated network of real-time water-level gaging stations, a ground-elevation model, and a water-surface elevation model designed to provide scientists, engineers, and water-resource managers with water-level and water-depth information (1991-2013) for the entire freshwater portion of the Greater Everglades. The U.S. Geological Survey Greater Everglades Priority Ecosystems Science provides support for the Everglades Depth Estimation Network in order for the Network to provide quality-assured monitoring data for the U.S. Army Corps of Engineers Comprehensive Everglades Restoration Plan. In a previous study, water-level estimation equations were developed to fill in missing data to increase the accuracy of the daily water-surface elevation model. During this study, those equations were updated because of the addition and removal of water-level gaging stations, the consistent use of water-level data relative to the North American Vertical Datum of 1988, and availability of recent data (March 1, 2006, to September 30, 2011). Up to three linear regression equations were developed for each station by using three different input stations to minimize the occurrences of missing data for an input station. Of the 667 water-level estimation equations developed to fill missing data at 223 stations, more than 72 percent of the equations have coefficients of determination greater than 0.90, and 97 percent have coefficients of determination greater than 0.70.
Trommer, J.T.; Loper, J.E.; Hammett, K.M.
1996-01-01
Several traditional techniques have been used for estimating stormwater runoff from ungaged watersheds. Applying these techniques to water- sheds in west-central Florida requires that some of the empirical relationships be extrapolated beyond tested ranges. As a result, there is uncertainty as to the accuracy of these estimates. Sixty-six storms occurring in 15 west-central Florida watersheds were initially modeled using the Rational Method, the U.S. Geological Survey Regional Regression Equations, the Natural Resources Conservation Service TR-20 model, the U.S. Army Corps of Engineers Hydrologic Engineering Center-1 model, and the Environmental Protection Agency Storm Water Management Model. The techniques were applied according to the guidelines specified in the user manuals or standard engineering textbooks as though no field data were available and the selection of input parameters was not influenced by observed data. Computed estimates were compared with observed runoff to evaluate the accuracy of the techniques. One watershed was eliminated from further evaluation when it was determined that the area contributing runoff to the stream varies with the amount and intensity of rainfall. Therefore, further evaluation and modification of the input parameters were made for only 62 storms in 14 watersheds. Runoff ranged from 1.4 to 99.3 percent percent of rainfall. The average runoff for all watersheds included in this study was about 36 percent of rainfall. The average runoff for the urban, natural, and mixed land-use watersheds was about 41, 27, and 29 percent, respectively. Initial estimates of peak discharge using the rational method produced average watershed errors that ranged from an underestimation of 50.4 percent to an overestimation of 767 percent. The coefficient of runoff ranged from 0.20 to 0.60. Calibration of the technique produced average errors that ranged from an underestimation of 3.3 percent to an overestimation of 1.5 percent. The average calibrated coefficient of runoff for each watershed ranged from 0.02 to 0.72. The average values of the coefficient of runoff necessary to calibrate the urban, natural, and mixed land-use watersheds were 0.39, 0.16, and 0.08, respectively. The U.S. Geological Survey regional regression equations for determining peak discharge produced errors that ranged from an underestimation of 87.3 percent to an over- estimation of 1,140 percent. The regression equations for determining runoff volume produced errors that ranged from an underestimation of 95.6 percent to an overestimation of 324 percent. Regression equations developed from data used for this study produced errors that ranged between an underestimation of 82.8 percent and an over- estimation of 328 percent for peak discharge, and from an underestimation of 71.2 percent to an overestimation of 241 percent for runoff volume. Use of the equations developed for west-central Florida streams produced average errors for each type of watershed that were lower than errors associated with use of the U.S. Geological Survey equations. Initial estimates of peak discharges and runoff volumes using the Natural Resources Conservation Service TR-20 model, produced average errors of 44.6 and 42.7 percent respectively, for all the watersheds. Curve numbers and times of concentration were adjusted to match estimated and observed peak discharges and runoff volumes. The average change in the curve number for all the watersheds was a decrease of 2.8 percent. The average change in the time of concentration was an increase of 59.2 percent. The shape of the input dimensionless unit hydrograph also had to be adjusted to match the shape and peak time of the estimated and observed flood hydrographs. Peak rate factors for the modified input dimensionless unit hydrographs ranged from 162 to 454. The mean errors for peak discharges and runoff volumes were reduced to 18.9 and 19.5 percent, respectively, using the average calibrated input parameters for ea
Tong, Xuming; Chen, Jinghang; Miao, Hongyu; Li, Tingting; Zhang, Le
2015-01-01
Agent-based models (ABM) and differential equations (DE) are two commonly used methods for immune system simulation. However, it is difficult for ABM to estimate key parameters of the model by incorporating experimental data, whereas the differential equation model is incapable of describing the complicated immune system in detail. To overcome these problems, we developed an integrated ABM regression model (IABMR). It can combine the advantages of ABM and DE by employing ABM to mimic the multi-scale immune system with various phenotypes and types of cells as well as using the input and output of ABM to build up the Loess regression for key parameter estimation. Next, we employed the greedy algorithm to estimate the key parameters of the ABM with respect to the same experimental data set and used ABM to describe a 3D immune system similar to previous studies that employed the DE model. These results indicate that IABMR not only has the potential to simulate the immune system at various scales, phenotypes and cell types, but can also accurately infer the key parameters like DE model. Therefore, this study innovatively developed a complex system development mechanism that could simulate the complicated immune system in detail like ABM and validate the reliability and efficiency of model like DE by fitting the experimental data. PMID:26535589
Howley, Donna; Howley, Peter; Oxenham, Marc F
2018-06-01
Stature and a further 8 anthropometric dimensions were recorded from the arms and hands of a sample of 96 staff and students from the Australian National University and The University of Newcastle, Australia. These dimensions were used to create simple and multiple logistic regression models for sex estimation and simple and multiple linear regression equations for stature estimation of a contemporary Australian population. Overall sex classification accuracies using the models created were comparable to similar studies. The stature estimation models achieved standard errors of estimates (SEE) which were comparable to and in many cases lower than those achieved in similar research. Generic, non sex-specific models achieved similar SEEs and R 2 values to the sex-specific models indicating stature may be accurately estimated when sex is unknown. Copyright © 2018 Elsevier B.V. All rights reserved.
Peak oxygen consumption measured during the stair-climbing test in lung resection candidates.
Brunelli, Alessandro; Xiumé, Francesco; Refai, Majed; Salati, Michele; Di Nunzio, Luca; Pompili, Cecilia; Sabbatini, Armando
2010-01-01
The stair-climbing test is commonly used in the preoperative evaluation of lung resection candidates, but it is difficult to standardize and provides little physiologic information on the performance. To verify the association between the altitude and the V(O2peak) measured during the stair-climbing test. 109 consecutive candidates for lung resection performed a symptom-limited stair-climbing test with direct breath-by-breath measurement of V(O2peak) by a portable gas analyzer. Stepwise logistic regression and bootstrap analyses were used to verify the association of several perioperative variables with a V(O2peak) <15 ml/kg/min. Subsequently, multiple regression analysis was also performed to develop an equation to estimate V(O2peak) from stair-climbing parameters and other patient-related variables. 56% of patients climbing <14 m had a V(O2peak) <15 ml/kg/min, whereas 98% of those climbing >22 m had a V(O2peak) >15 ml/kg/min. The altitude reached at stair-climbing test resulted in the only significant predictor of a V(O2peak) <15 ml/kg/min after logistic regression analysis. Multiple regression analysis yielded an equation to estimate V(O2peak) factoring altitude (p < 0.0001), speed of ascent (p = 0.005) and body mass index (p = 0.0008). There was an association between altitude and V(O2peak) measured during the stair-climbing test. Most of the patients climbing more than 22 m are able to generate high values of V(O2peak) and can proceed to surgery without any additional tests. All others need to be referred for a formal cardiopulmonary exercise test. In addition, we were able to generate an equation to estimate V(O2peak), which could assist in streamlining the preoperative workup and could be used across different settings to standardize this test. Copyright (c) 2010 S. Karger AG, Basel.
Puente, Celso
1976-01-01
Water-level, springflow, and streamflow data were used to develop simple and multiple linear-regression equations for use in estimating water levels in wells and the flow of three major springs in the Edwards aquifer in the eastern San Antonio area. The equations provide daily, monthly, and annual estimates that compare very favorably with observed data. Analyses of geologic and hydrologic data indicate that the water discharged by the major springs is supplied primarily by regional underflow from the west and southwest and by local recharge in the infiltration area in northern Bexar, Comal, and Hays Counties.
1990-05-01
THUMBBR THUMB BREADTH NO. VARIABLE CONSTANT REGRESS. COEF. ST.ERROR ADJUQTED (.ERR OF ESTIMATE E4 58 HANDBRTH 7.623 0.183 ( 0.006) 1.124 .319 59 HANDCIRC...945, 956, 965 TRAGION TO TOP OF HEAD (TRAGT, 255) 39,51, 718, 851, 899, 945, 956, 965 Trapezius Post 23 Trochanter 23 TROCHArNTERION HEIGHT (TROQINT...THGHCLR, 105) 33, 40, 673, M08 881,927, 955, 964 Thigh Poia 23 THUMB BREADTH (THUMBBR, 106) 34, 49, 674,88 882,94955, 964 ThumlAip 23 THUMBTIP REACH
Simulated peak inflows for glacier dammed Russell Fiord, near Yakutat, Alaska
Neal, Edward G.
2004-01-01
In June 2002, Hubbard Glacier advanced across the entrance to 35-mile-long Russell Fiord creating a glacier-dammed lake. After closure of the ice and moraine dam, runoff from mountain streams and glacial melt caused the level in ?Russell Lake? to rise until it eventually breached the dam on August 14, 2002. Daily mean inflows to the lake during the period of closure were estimated on the basis of lake stage data and the hypsometry of Russell Lake. Inflows were regressed against the daily mean streamflows of nearby Ophir Creek and Situk River to generate an equation for simulating Russell Lake inflow. The regression equation was used to produce 11 years of synthetic daily inflows to Russell Lake for the 1992-2002 water years. A flood-frequency analysis was applied to the peak daily mean inflows for these 11 years of record to generate a 100-year peak daily mean inflow of 235,000 cubic feet per second. Regional-regression equations also were applied to the Russell Lake basin, yielding a 100-year inflow of 157,000 cubic feet per second.
Methods for estimating magnitude and frequency of floods in Montana based on data through 1983
Omang, R.J.; Parrett, Charles; Hull, J.A.
1986-01-01
Equations are presented for estimating flood magnitudes for ungaged sites in Montana based on data through 1983. The State was divided into eight regions based on hydrologic conditions, and separate multiple regression equations were developed for each region. These equations relate annual flood magnitudes and frequencies to basin characteristics and are applicable only to natural flow streams. In three of the regions, equations also were developed relating flood magnitudes and frequencies to basin characteristics and channel geometry measurements. The standard errors of estimate for an exceedance probability of 1% ranged from 39% to 87%. Techniques are described for estimating annual flood magnitude and flood frequency information at ungaged sites based on data from gaged sites on the same stream. Included are curves relating flood frequency information to drainage area for eight major streams in the State. Maximum known flood magnitudes in Montana are compared with estimated 1 %-chance flood magnitudes and with maximum known floods in the United States. Values of flood magnitudes for selected exceedance probabilities and values of significant basin characteristics and channel geometry measurements for all gaging stations used in the analysis are tabulated. Included are 375 stations in Montana and 28 nearby stations in Canada and adjoining States. (Author 's abstract)
NASA Astrophysics Data System (ADS)
Rock, N. M. S.; Duffy, T. R.
REGRES allows a range of regression equations to be calculated for paired sets of data values in which both variables are subject to error (i.e. neither is the "independent" variable). Nonparametric regressions, based on medians of all possible pairwise slopes and intercepts, are treated in detail. Estimated slopes and intercepts are output, along with confidence limits, Spearman and Kendall rank correlation coefficients. Outliers can be rejected with user-determined stringency. Parametric regressions can be calculated for any value of λ (the ratio of the variances of the random errors for y and x)—including: (1) major axis ( λ = 1); (2) reduced major axis ( λ = variance of y/variance of x); (3) Y on Xλ = infinity; or (4) X on Y ( λ = 0) solutions. Pearson linear correlation coefficients also are output. REGRES provides an alternative to conventional isochron assessment techniques where bivariate normal errors cannot be assumed, or weighting methods are inappropriate.
NASA Technical Reports Server (NTRS)
Wilson, Edward (Inventor)
2006-01-01
The present invention is a method for identifying unknown parameters in a system having a set of governing equations describing its behavior that cannot be put into regression form with the unknown parameters linearly represented. In this method, the vector of unknown parameters is segmented into a plurality of groups where each individual group of unknown parameters may be isolated linearly by manipulation of said equations. Multiple concurrent and independent recursive least squares identification of each said group run, treating other unknown parameters appearing in their regression equation as if they were known perfectly, with said values provided by recursive least squares estimation from the other groups, thereby enabling the use of fast, compact, efficient linear algorithms to solve problems that would otherwise require nonlinear solution approaches. This invention is presented with application to identification of mass and thruster properties for a thruster-controlled spacecraft.
Christensen, Victoria G.; Jian, Xiaodong; Ziegler, Andrew C.
2000-01-01
Water from the Little Arkansas River is used as source water for artificial recharge to the Equus Beds aquifer, which provides water for the city of Wichita in south-central Kansas. To assess the quality of the source water, continuous in-stream water-quality monitors were installed at two U.S. Geological Survey stream-gaging stations to provide real-time measurement of specific conductance, pH, water temperature, dissolved oxygen, and turbidity in the Little Arkansas River. In addition, periodic water samples were collected manually and analyzed for selected constituents, including alkalinity, dissolved solids, total suspended solids, chloride, sulfate, atrazine, and fecal coliform bacteria. However, these periodic samples do not provide real-time data on which to base aquifer-recharge operational decisions to prevent degradation of the Equus Beds aquifer. Continuous and periodic monitoring enabled identification of seasonal trends in selected physical properties and chemical constituents and estimation of chemical mass transported in the Little Arkansas River. Identification of seasonal trends was especially important because high streamflows have a substantial effect on chemical loads and because concentration data from manually collected samples often were not available. Therefore, real-time water-quality monitoring of surrogates for the estimation of selected chemical constituents in streamflow can increase the accuracy of load and yield estimates and can decrease some manual data-collection activities. Regression equations, which were based on physical properties and analysis of water samples collected from 1995 through 1998 throughout 95 percent of the stream's flow duration, were developed to estimate alkalinity, dissolved solids, total suspended solids, chloride, sulfate, atrazine, and fecal coliform bacteria concentrations. Error was evaluated for the first year of data collection and each subsequent year, and a decrease in error was observed as the number of samples increased. Generally, 2 years of data (35 to 55 samples) collected throughout 90 to 95 percent of the stream's flow duration were sufficient to define the relation between a constituent and its surrogate(s). Relations and resulting equations were site specific. To test the regression equations developed from the first 3 years of data collection (1995-98), the equations were applied to the fourth year of data collection (1999) to calculate estimated constituent loads and the errors associated with these loads. Median relative percentage differences between measured constituent loads determined using the analysis of periodic, manual water samples and estimated constituent loads were less than 25 percent for alkalinity, dissolved solids, chloride, and sulfate. The percentage differences for total suspended solids, atrazine, and bacteria loads were more than 25 percent. Even for those constituents with large relative percentage differences between the measured and estimated loads, the estimation of constituent concentrations with regression analysis and real-time water-quality monitoring has numerous advantages over periodic manual sampling. The timely availability of bacteria and other constituent data may be important when considering recreation and the whole-body contact criteria established by the Kansas Department of Health and Environment for a specific water body. In addition, water suppliers would have timely information to use in adjusting water-treatment strategies; environmental changes could be assessed in time to prevent negative effects on fish or other aquatic life; and officials for the Equus Beds Ground-Water Recharge Demonstration project could use this information to prevent the possible degradation of the Equus Beds aquifer by choosing not to recharge when constituent concentrations in the source water are large. Constituent loads calculated from the regression equations may be useful for calculating total maximum daily loads (TMDL's), wh
Estimating Commute Distances of U.S. Army Reservists by Regional and Unit Characteristics
1990-09-01
multiple regression equation is used to estimate the parameters of the commute distance distribution as a function of reserve center and market ...used to estimate the parameters of the commute distance distribution as a function of reserve center and market characteristics. The results of the...recruiting personnel to meet unit fill rates. An important objective of the USAREC is to identify market areas that will support new reserve units [Ref. 2:p
Fuller, L.M.; Aichele, Stephen S.; Minnerick, R.J.
2004-01-01
Inland lakes are an important economic and environmental resource for Michigan. The U.S. Geological Survey and the Michigan Department of Environmental Quality have been cooperatively monitoring the quality of selected lakes in Michigan through the Lake Water Quality Assessment program. Through this program, approximately 730 of Michigan's 11,000 inland lakes will be monitored once during this 15-year study. Targeted lakes will be sampled during spring turnover and again in late summer to characterize water quality. Because more extensive and more frequent sampling is not economically feasible in the Lake Water Quality Assessment program, the U.S. Geological Survey and Michigan Department of Environmental Quality investigate the use of satellite imagery as a means of estimating water quality in unsampled lakes. Satellite imagery has been successfully used in Minnesota, Wisconsin, and elsewhere to compute the trophic state of inland lakes from predicted secchi-disk measurements. Previous attempts of this kind in Michigan resulted in a poorer fit between observed and predicted data than was found for Minnesota or Wisconsin. This study tested whether estimates could be improved by using atmospherically corrected satellite imagery, whether a more appropriate regression model could be obtained for Michigan, and whether chlorophyll a concentrations could be reliably predicted from satellite imagery in order to compute trophic state of inland lakes. Although the atmospheric-correction did not significantly improve estimates of lake-water quality, a new regression equation was identified that consistently yielded better results than an equation obtained from the literature. A stepwise regression was used to determine an equation that accurately predicts chlorophyll a concentrations in northern Lower Michigan.
Regression analysis on the variation in efficiency frontiers for prevention stage of HIV/AIDS.
Kamae, Maki S; Kamae, Isao; Cohen, Joshua T; Neumann, Peter J
2011-01-01
To investigate how the cost effectiveness of preventing HIV/AIDS varies across possible efficiency frontiers (EFs) by taking into account potentially relevant external factors, such as prevention stage, and how the EFs can be characterized using regression analysis given uncertainty of the QALY-cost estimates. We reviewed cost-effectiveness estimates for the prevention and treatment of HIV/AIDS published from 2002-2007 and catalogued in the Tufts Medical Center Cost-Effectiveness Analysis (CEA) Registry. We constructed efficiency frontier (EF) curves by plotting QALYs against costs, using methods used by the Institute for Quality and Efficiency in Health Care (IQWiG) in Germany. We stratified the QALY-cost ratios by prevention stage, country of study, and payer perspective, and estimated EF equations using log and square-root models. A total of 53 QALY-cost ratios were identified for HIV/AIDS in the Tufts CEA Registry. Plotted ratios stratified by prevention stage were visually grouped into a cluster consisting of primary/secondary prevention measures and a cluster consisting of tertiary measures. Correlation coefficients for each cluster were statistically significant. For each cluster, we derived two EF equations - one based on the log model, and one based on the square-root model. Our findings indicate that stratification of HIV/AIDS interventions by prevention stage can yield distinct EFs, and that the correlation and regression analyses are useful for parametrically characterizing EF equations. Our study has certain limitations, such as the small number of included articles and the potential for study populations to be non-representative of countries of interest. Nonetheless, our approach could help develop a deeper appreciation of cost effectiveness beyond the deterministic approach developed by IQWiG.
Jenkinson, Toni-Marie; Muncer, Steven; Wheeler, Miranda; Brechin, Don; Evans, Stephen
2018-06-01
Neuropsychological assessment requires accurate estimation of an individual's premorbid cognitive abilities. Oral word reading tests, such as the test of premorbid functioning (TOPF), and demographic variables, such as age, sex, and level of education, provide a reasonable indication of premorbid intelligence, but their ability to predict other related cognitive abilities is less well understood. This study aimed to develop regression equations, based on the TOPF and demographic variables, to predict scores on tests of verbal fluency and naming ability. A sample of 119 healthy adults provided demographic information and were tested using the TOPF, FAS, animal naming test (ANT), and graded naming test (GNT). Multiple regression analyses, using the TOPF and demographics as predictor variables, were used to estimate verbal fluency and naming ability test scores. Change scores and cases of significant impairment were calculated for two clinical samples with diagnosed neurological conditions (TBI and meningioma) using the method in Knight, McMahon, Green, and Skeaff (). Demographic variables provided a significant contribution to the prediction of all verbal fluency and naming ability test scores; however, adding TOPF score to the equation considerably improved prediction beyond that afforded by demographic variables alone. The percentage of variance accounted for by demographic variables and/or TOPF score varied from 19 per cent (FAS), 28 per cent (ANT), and 41 per cent (GNT). Change scores revealed significant differences in performance in the clinical groups, particularity the TBI group. Demographic variables, particularly education level, and scores on the TOPF should be taken into consideration when interpreting performance on tests of verbal fluency and naming ability. © 2017 The British Psychological Society.
Asquith, William H.; Slade, R.M.
1999-01-01
The U.S. Geological Survey, in cooperation with the Texas Department of Transportation, has developed a computer program to estimate peak-streamflow frequency for ungaged sites in natural basins in Texas. Peak-streamflow frequency refers to the peak streamflows for recurrence intervals of 2, 5, 10, 25, 50, and 100 years. Peak-streamflow frequency estimates are needed by planners, managers, and design engineers for flood-plain management; for objective assessment of flood risk; for cost-effective design of roads and bridges; and also for the desin of culverts, dams, levees, and other flood-control structures. The program estimates peak-streamflow frequency using a site-specific approach and a multivariate generalized least-squares linear regression. A site-specific approach differs from a traditional regional regression approach by developing unique equations to estimate peak-streamflow frequency specifically for the ungaged site. The stations included in the regression are selected using an informal cluster analysis that compares the basin characteristics of the ungaged site to the basin characteristics of all the stations in the data base. The program provides several choices for selecting the stations. Selecting the stations using cluster analysis ensures that the stations included in the regression will have the most pertinent information about flooding characteristics of the ungaged site and therefore provide the basis for potentially improved peak-streamflow frequency estimation. An evaluation of the site-specific approach in estimating peak-streamflow frequency for gaged sites indicates that the site-specific approach is at least as accurate as a traditional regional regression approach.
Johnson, Carole D.; Lane, John W.
2016-01-01
Determining sediment thickness and delineating bedrock topography are important for assessing groundwater availability and characterizing contamination sites. In recent years, the horizontal-to-vertical spectral ratio (HVSR) seismic method has emerged as a non-invasive, cost-effective approach for estimating the thickness of unconsolidated sediments above bedrock. Using a three-component seismometer, this method uses the ratio of the average horizontal- and vertical-component amplitude spectrums to produce a spectral ratio curve with a peak at the fundamental resonance frequency. The HVSR method produces clear and repeatable resonance frequency peaks when there is a sharp contrast (>2:1) in acoustic impedance at the sediment/bedrock boundary. Given the resonant frequency, sediment thickness can be determined either by (1) using an estimate of average local sediment shear-wave velocity or by (2) application of a power-law regression equation developed from resonance frequency observations at sites with a range of known depths to bedrock. Two frequently asked questions about the HVSR method are (1) how accurate are the sediment thickness estimates? and (2) how much do sediment thickness/bedrock depth estimates change when using different published regression equations? This paper compares and contrasts different approaches for generating HVSR depth estimates, through analysis of HVSR data acquired in the vicinity of Tylerville, Connecticut, USA.
McCarthy, Peter M.; Dutton, DeAnn M.; Sando, Steven K.; Sando, Roy
2016-04-05
The U.S. Geological Survey (USGS) provides streamflow characteristics and other related information needed by water-resource managers to protect people and property from floods, plan and manage water-resource activities, and protect water quality. Streamflow characteristics provided by the USGS, such as peak-flow and low-flow frequencies for streamflow-gaging stations, are frequently used by engineers, flood forecasters, land managers, biologists, and others to guide their everyday decisions. In addition to providing streamflow characteristics at streamflow-gaging stations, the USGS also develops regional regression equations and drainage area-adjustment methods for estimating streamflow characteristics at locations on ungaged streams. Regional regression equations can be complex and often require users to determine several basin characteristics, which are physical and climatic characteristics of the stream and its drainage basin. Obtaining these basin characteristics for streamflow-gaging stations and ungaged sites traditionally has been time consuming and subjective, and led to inconsistent results.StreamStats is a Web-based geographic information system application that was created by the USGS to provide users with access to an assortment of analytical tools that are useful for water-resource planning and management. StreamStats allows users to easily obtain streamflow and basin characteristics for USGS streamflow-gaging stations and user-selected locations on ungaged streams. The USGS, in cooperation with Montana Department of Transportation, Montana Department of Environmental Quality, and Montana Department of Natural Resources and Conservation, completed a study to develop a StreamStats application for Montana, compute streamflow characteristics at streamflow-gaging stations, and develop regional regression equations to estimate streamflow characteristics at ungaged sites. Chapter A of this Scientific Investigations Report describes the Montana StreamStats application and the datasets, streamflow-gaging stations, streamflow characteristics, and regression equations (as described fully in Chapters B through G of this report) that are used for development of the StreamStats application for Montana.
An Examination and Comparison of Airline and Navy Pilot Career Earnings
1986-03-01
RECEIVED ........ .............. 45 16. AIRLINE PILOT PROBATIONARY WAGES .... ........ 46 17. 1985 FAPA MAXIMUM PILOT WAGE ESTIMATES ..... 53 1 1983...tI% LIN PILOT WAGES REGRESSION EQUATIONS . 5 19. AVERAGE 1983 PILOT WAGES COMPUTED FROM REGRESSION ANALYSIS ...... ............. 56 20. FAPA MAXIMUM...Western N/A 1,200 1,500 Source: FAPA This establishes a wage "base" for pilots. In addition, a pilot who ilys more than average in one month may "bank
Techniques for estimating selected streamflow characteristics of rural unregulated streams in Ohio
Koltun, G.F.; Whitehead, Matthew T.
2002-01-01
This report provides equations for estimating mean annual streamflow, mean monthly streamflows, harmonic mean streamflow, and streamflow quartiles (the 25th-, 50th-, and 75th-percentile streamflows) as a function of selected basin characteristics for rural, unregulated streams in Ohio. The equations were developed from streamflow statistics and basin-characteristics data for as many as 219 active or discontinued streamflow-gaging stations on rural, unregulated streams in Ohio with 10 or more years of homogenous daily streamflow record. Streamflow statistics and basin-characteristics data for the 219 stations are presented in this report. Simple equations (based on drainage area only) and best-fit equations (based on drainage area and at least two other basin characteristics) were developed by means of ordinary least-squares regression techniques. Application of the best-fit equations generally involves quantification of basin characteristics that require or are facilitated by use of a geographic information system. In contrast, the simple equations can be used with information that can be obtained without use of a geographic information system; however, the simple equations have larger prediction errors than the best-fit equations and exhibit geographic biases for most streamflow statistics. The best-fit equations should be used instead of the simple equations whenever possible.
NASA Astrophysics Data System (ADS)
Zuhdi, Shaifudin; Saputro, Dewi Retno Sari
2017-03-01
GWOLR model used for represent relationship between dependent variable has categories and scale of category is ordinal with independent variable influenced the geographical location of the observation site. Parameters estimation of GWOLR model use maximum likelihood provide system of nonlinear equations and hard to be found the result in analytic resolution. By finishing it, it means determine the maximum completion, this thing associated with optimizing problem. The completion nonlinear system of equations optimize use numerical approximation, which one is Newton Raphson method. The purpose of this research is to make iteration algorithm Newton Raphson and program using R software to estimate GWOLR model. Based on the research obtained that program in R can be used to estimate the parameters of GWOLR model by forming a syntax program with command "while".
Methodology for Estimation of Flood Magnitude and Frequency for New Jersey Streams
Watson, Kara M.; Schopp, Robert D.
2009-01-01
Methodologies were developed for estimating flood magnitudes at the 2-, 5-, 10-, 25-, 50-, 100-, and 500-year recurrence intervals for unregulated or slightly regulated streams in New Jersey. Regression equations that incorporate basin characteristics were developed to estimate flood magnitude and frequency for streams throughout the State by use of a generalized least squares regression analysis. Relations between flood-frequency estimates based on streamflow-gaging-station discharge and basin characteristics were determined by multiple regression analysis, and weighted by effective years of record. The State was divided into five hydrologically similar regions to refine the regression equations. The regression analysis indicated that flood discharge, as determined by the streamflow-gaging-station annual peak flows, is related to the drainage area, main channel slope, percentage of lake and wetland areas in the basin, population density, and the flood-frequency region, at the 95-percent confidence level. The standard errors of estimate for the various recurrence-interval floods ranged from 48.1 to 62.7 percent. Annual-maximum peak flows observed at streamflow-gaging stations through water year 2007 and basin characteristics determined using geographic information system techniques for 254 streamflow-gaging stations were used for the regression analysis. Drainage areas of the streamflow-gaging stations range from 0.18 to 779 mi2. Peak-flow data and basin characteristics for 191 streamflow-gaging stations located in New Jersey were used, along with peak-flow data for stations located in adjoining States, including 25 stations in Pennsylvania, 17 stations in New York, 16 stations in Delaware, and 5 stations in Maryland. Streamflow records for selected stations outside of New Jersey were included in the present study because hydrologic, physiographic, and geologic boundaries commonly extend beyond political boundaries. The StreamStats web application was developed cooperatively by the U.S. Geological Survey and the Environmental Systems Research Institute, Inc., and was designed for national implementation. This web application has been recently implemented for use in New Jersey. This program used in conjunction with a geographic information system provides the computation of values for selected basin characteristics, estimates of flood magnitudes and frequencies, and statistics for stream locations in New Jersey chosen by the user, whether the site is gaged or ungaged.
Stevens, Lesley A; Coresh, Josef; Schmid, Christopher H; Feldman, Harold I.; Froissart, Marc; Kusek, John; Rossert, Jerome; Van Lente, Frederick; Bruce, Robert D.; Zhang, Yaping (Lucy); Greene, Tom; Levey, Andrew S
2008-01-01
Background Serum cystatin C (Scys) has been proposed as a potential replacement for serum creatinine (Scr) in glomerular filtration rate (GFR) estimation. We report development and evaluation of GFR estimating equations using Scys alone and Scys, Scr or both with demographic variables. Study Design Test of diagnostic accuracy. Setting and Participants Participants screened for three chronic kidney disease (CKD) studies in the US (n=2980) and a clinical population in Paris, France (n=438) Reference Test Measured GFR (mGFR). Index Test Estimated GFR using the four new equations based on Scys alone, Scys, Scr or both with age, sex and race. New equations were developed using regression with log GFR as the outcome in 2/3 data from US studies. Internal validation was performed in remaining 1/3 of data from US CKD studies; external validation was performed in the Paris study. Measurements GFR was measured using urinary clearance of 125I-iothalamate in the US studies and chromium-ethylenediaminetetraacetate (51Cr-EDTA) in the Paris study. Scys was measured by Dade Behring assay, standardized Scr. Results Mean mGFR, Scr and Scys were 48 (5th–95th percentile 15–95) ml/min/1.73m2 2.1 mg/dL and 1.8 mg/L respectively. For the new equations, the coefficients for age, sex and race were significant in the equation with Scys but 2 to 4 fold smaller than in the equation with Scr. Measures of performance among new equations were consistent across development, internal and external validation datasets. Percent of eGFR within 30% of mGFR for equations based on Scys alone, Scys, Scr or both with age, sex and race were 81, 83, 85, and 89%, respectively. The equation using Scys alone yields estimates with small biases in age, sex and race subgroups, which are improved in equations including these variables. Limitations Study population composed mainly of patients with CKD. Conclusions Scys alone provides GFR estimates that are nearly as accurate as Scr adjusted for age, sex and race thus providing an alternative GFR estimate that is not linked to muscle mass. An equation including Scys in combination with Scr, age, sex and race provide most accurate estimates. PMID:18295055
Assessing Spurious Interaction Effects in Structural Equation Modeling
ERIC Educational Resources Information Center
Harring, Jeffrey R.; Weiss, Brandi A.; Li, Ming
2015-01-01
Several studies have stressed the importance of simultaneously estimating interaction and quadratic effects in multiple regression analyses, even if theory only suggests an interaction effect should be present. Specifically, past studies suggested that failing to simultaneously include quadratic effects when testing for interaction effects could…
Linear regression techniques for use in the EC tracer method of secondary organic aerosol estimation
NASA Astrophysics Data System (ADS)
Saylor, Rick D.; Edgerton, Eric S.; Hartsell, Benjamin E.
A variety of linear regression techniques and simple slope estimators are evaluated for use in the elemental carbon (EC) tracer method of secondary organic carbon (OC) estimation. Linear regression techniques based on ordinary least squares are not suitable for situations where measurement uncertainties exist in both regressed variables. In the past, regression based on the method of Deming [1943. Statistical Adjustment of Data. Wiley, London] has been the preferred choice for EC tracer method parameter estimation. In agreement with Chu [2005. Stable estimate of primary OC/EC ratios in the EC tracer method. Atmospheric Environment 39, 1383-1392], we find that in the limited case where primary non-combustion OC (OC non-comb) is assumed to be zero, the ratio of averages (ROA) approach provides a stable and reliable estimate of the primary OC-EC ratio, (OC/EC) pri. In contrast with Chu [2005. Stable estimate of primary OC/EC ratios in the EC tracer method. Atmospheric Environment 39, 1383-1392], however, we find that the optimal use of Deming regression (and the more general York et al. [2004. Unified equations for the slope, intercept, and standard errors of the best straight line. American Journal of Physics 72, 367-375] regression) provides excellent results as well. For the more typical case where OC non-comb is allowed to obtain a non-zero value, we find that regression based on the method of York is the preferred choice for EC tracer method parameter estimation. In the York regression technique, detailed information on uncertainties in the measurement of OC and EC is used to improve the linear best fit to the given data. If only limited information is available on the relative uncertainties of OC and EC, then Deming regression should be used. On the other hand, use of ROA in the estimation of secondary OC, and thus the assumption of a zero OC non-comb value, generally leads to an overestimation of the contribution of secondary OC to total measured OC.
Sun, Yanqing; Sun, Liuquan; Zhou, Jie
2013-07-01
This paper studies the generalized semiparametric regression model for longitudinal data where the covariate effects are constant for some and time-varying for others. Different link functions can be used to allow more flexible modelling of longitudinal data. The nonparametric components of the model are estimated using a local linear estimating equation and the parametric components are estimated through a profile estimating function. The method automatically adjusts for heterogeneity of sampling times, allowing the sampling strategy to depend on the past sampling history as well as possibly time-dependent covariates without specifically model such dependence. A [Formula: see text]-fold cross-validation bandwidth selection is proposed as a working tool for locating an appropriate bandwidth. A criteria for selecting the link function is proposed to provide better fit of the data. Large sample properties of the proposed estimators are investigated. Large sample pointwise and simultaneous confidence intervals for the regression coefficients are constructed. Formal hypothesis testing procedures are proposed to check for the covariate effects and whether the effects are time-varying. A simulation study is conducted to examine the finite sample performances of the proposed estimation and hypothesis testing procedures. The methods are illustrated with a data example.
Solvency supervision based on a total balance sheet approach
NASA Astrophysics Data System (ADS)
Pitselis, Georgios
2009-11-01
In this paper we investigate the adequacy of the own funds a company requires in order to remain healthy and avoid insolvency. Two methods are applied here; the quantile regression method and the method of mixed effects models. Quantile regression is capable of providing a more complete statistical analysis of the stochastic relationship among random variables than least squares estimation. The estimated mixed effects line can be considered as an internal industry equation (norm), which explains a systematic relation between a dependent variable (such as own funds) with independent variables (e.g. financial characteristics, such as assets, provisions, etc.). The above two methods are implemented with two data sets.
Evaluation of AUC(0-4) predictive methods for cyclosporine in kidney transplant patients.
Aoyama, Takahiko; Matsumoto, Yoshiaki; Shimizu, Makiko; Fukuoka, Masamichi; Kimura, Toshimi; Kokubun, Hideya; Yoshida, Kazunari; Yago, Kazuo
2005-05-01
Cyclosporine (CyA) is the most commonly used immunosuppressive agent in patients who undergo kidney transplantation. Dosage adjustment of CyA is usually based on trough levels. Recently, trough levels have been replacing the area under the concentration-time curve during the first 4 h after CyA administration (AUC(0-4)). The aim of this study was to compare the predictive values obtained using three different methods of AUC(0-4) monitoring. AUC(0-4) was calculated from 0 to 4 h in early and stable renal transplant patients using the trapezoidal rule. The predicted AUC(0-4) was calculated using three different methods: the multiple regression equation reported by Uchida et al.; Bayesian estimation for modified population pharmacokinetic parameters reported by Yoshida et al.; and modified population pharmacokinetic parameters reported by Cremers et al. The predicted AUC(0-4) was assessed on the basis of predictive bias, precision, and correlation coefficient. The predicted AUC(0-4) values obtained using three methods through measurement of three blood samples showed small differences in predictive bias, precision, and correlation coefficient. In the prediction of AUC(0-4) measurement of one blood sample from stable renal transplant patients, the performance of the regression equation reported by Uchida depended on sampling time. On the other hand, the performance of Bayesian estimation with modified pharmacokinetic parameters reported by Yoshida through measurement of one blood sample, which is not dependent on sampling time, showed a small difference in the correlation coefficient. The prediction of AUC(0-4) using a regression equation required accurate sampling time. In this study, the prediction of AUC(0-4) using Bayesian estimation did not require accurate sampling time in the AUC(0-4) monitoring of CyA. Thus Bayesian estimation is assumed to be clinically useful in the dosage adjustment of CyA.
Magnitude of flood flows for selected annual exceedance probabilities for streams in Massachusetts
Zarriello, Phillip J.
2017-05-11
The U.S. Geological Survey, in cooperation with the Massachusetts Department of Transportation, determined the magnitude of flood flows at selected annual exceedance probabilities (AEPs) at streamgages in Massachusetts and from these data developed equations for estimating flood flows at ungaged locations in the State. Flood magnitudes were determined for the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent AEPs at 220 streamgages, 125 of which are in Massachusetts and 95 are in the adjacent States of Connecticut, New Hampshire, New York, Rhode Island, and Vermont. AEP flood flows were computed for streamgages using the expected moments algorithm weighted with a recently computed regional skewness coefficient for New England.Regional regression equations were developed to estimate the magnitude of floods for selected AEP flows at ungaged sites from 199 selected streamgages and for 60 potential explanatory basin characteristics. AEP flows for 21 of the 125 streamgages in Massachusetts were not used in the final regional regression analysis, primarily because of regulation or redundancy. The final regression equations used generalized least squares methods to account for streamgage record length and correlation. Drainage area, mean basin elevation, and basin storage explained 86 to 93 percent of the variance in flood magnitude from the 50- to 0.2-percent AEPs, respectively. The estimates of AEP flows at streamgages can be improved by using a weighted estimate that is based on the magnitude of the flood and associated uncertainty from the at-site analysis and the regional regression equations. Weighting procedures for estimating AEP flows at an ungaged site on a gaged stream also are provided that improve estimates of flood flows at the ungaged site when hydrologic characteristics do not abruptly change.Urbanization expressed as the percentage of imperviousness provided some explanatory power in the regional regression; however, it was not statistically significant at the 95-percent confidence level for any of the AEPs examined. The effect of urbanization on flood flows indicates a complex interaction with other basin characteristics. Another complicating factor is the assumption of stationarity, that is, the assumption that annual peak flows exhibit no significant trend over time. The results of the analysis show that stationarity does not prevail at all of the streamgages. About 27 percent of streamgages in Massachusetts and about 42 percent of streamgages in adjacent States with 20 or more years of systematic record used in the study show a significant positive trend at the 95-percent confidence level. The remaining streamgages had both positive and negative trends, but the trends were not statistically significant. Trends were shown to vary over time. In particular, during the past decade (2004–2013), peak flows were persistently above normal, which may give the impression of positive trends. Only continued monitoring will provide the information needed to determine whether recent increases in annual peak flows are a normal oscillation or a true trend.The analysis used 37 years of additional data obtained since the last comprehensive study of flood flows in Massachusetts. In addition, new methods for computing flood flows at streamgages and regionalization improved estimates of flood magnitudes at gaged and ungaged locations and better defined the uncertainty of the estimates of AEP floods.
Huang, Qi; Chen, Yunshuang; Zhang, Min; Wang, Sihe; Zhang, Weiguang; Cai, Guangyan; Chen, Xiangmei; Sun, Xuefeng
2018-04-01
We compared the performance of technetium-99m-diethylenetriaminepentaacetic acid ( 99m Tc-DTPA) renal dynamic imaging (RDI), the MDRD equation, and the CKD EPI equation to estimate glomerular filtration rate (GFR). A total of 551 subjects, including CKD patients and healthy individuals, were enrolled in this study. Dual plasma sample clearance method of 99m Tc-DTPA was used as the true value for GFR (tGFR). RDI and the MDRD and CKD EPI equations for estimating GFR were compared and evaluated. Data indicate that RDI and the MDRD equation underestimated GFR and CKD EPI overestimated GFR. RDI was associated with significantly higher bias than the MDRD and CKD EPI equations. The regression coefficient, diagnostic precision, and consistency of RDI were significantly lower than either equation. RDI and the MDRD equation underestimated GFR to a greater degree in subjects with tGFR ≥ 90 ml/min/1.73 m 2 compared with the results obtained from all subjects. In the tGFR60-89 ml/min/1.73 m 2 group, the precision of RDI was significantly lower than that of both equations. In the tGFR30-59 ml/min/1.73 m 2 group, RDI had the least bias, the most precision, and significantly higher accuracy compared with either equation. In tGFR < 30 ml/min/1.73 m 2 , the three methods had similar performance and were not significantly different. RDI significantly underestimates GFR and performs no better than MDRD and CKD EPI equations for GFR estimation; thus, it should not be recommended as a reference standard against which other GFR measurement methods are assessed. However, RDI better estimates GFR than either equation for individuals in the tGFR30-59 ml/min/1.73 m 2 group and thus may be helpful to distinguish stage 3a and 3b CKD.
Prediction of oxygen consumption in cardiac rehabilitation patients performing leg ergometry
NASA Astrophysics Data System (ADS)
Alvarez, John Gershwin
The purpose of this study was two-fold. First, to determine the validity of the ACSM leg ergometry equation in the prediction of steady-state oxygen consumption (VO2) in a heterogeneous population of cardiac patients. Second, to determine whether a more accurate prediction equation could be developed for use in the cardiac population. Thirty-one cardiac rehabilitation patients participated in the study of which 24 were men and 7 were women. Biometric variables (mean +/- sd) of the participants were as follows: age = 61.9 +/- 9.5 years; height = 172.6 +/- 1.6 cm; and body mass = 82.3 +/- 10.6 kg. Subjects exercised on a MonarchTM cycle ergometer at 0, 180, 360, 540 and 720 kgm ˙ min-1. The length of each stage was five minutes. Heart rate, ECG, and VO2 were continuously monitored. Blood pressure and heart rate were collected at the end of each stage. Steady state VO 2 was calculated for each stage using the average of the last two minutes. Correlation coefficients, standard error of estimate, coefficient of determination, total error, and mean bias were used to determine the accuracy of the ACSM equation (1995). The analysis found the ACSM equation to be a valid means of estimating VO2 in cardiac patients. Simple linear regression was used to develop a new equation. Regression analysis found workload to be a significant predictor of VO2. The following equation is the result: VO2 = (1.6 x kgm ˙ min-1) + 444 ml ˙ min-1. The r of the equation was .78 (p < .05) and the standard error of estimate was 211 ml ˙ min-1. Analysis of variance was used to determine significant differences between means for actual and predicted VO2 values for each equation. The analysis found the ACSM and new equation to significantly (p < .05) under predict VO2 during unloaded pedaling. Furthermore, the ACSM equation was found to significantly (p < .05) under predict VO 2 during the first loaded stage of exercise. When the accuracy of the ACSM and new equations were compared based on correlation coefficients, coefficients of determinations, SEEs, total error, and mean bias the new equation was found to have equal or better accuracy at all workloads. The final form of the new equation is: VO2 (ml ˙ min-1) = (kgm ˙ min-1 x 1.6 ml ˙ kgm-1) + (3.5 ml ˙ kg-1 ˙ min-1 x body mass in kg) + 156 ml ˙ min-1.
Estimating the effects of wages on obesity.
Kim, DaeHwan; Leigh, John Paul
2010-05-01
To estimate the effects of wages on obesity and body mass. Data on household heads, aged 20 to 65 years, with full-time jobs, were drawn from the Panel Study of Income Dynamics for 2003 to 2007. The Panel Study of Income Dynamics is a nationally representative sample. Instrumental variables (IV) for wages were created using knowledge of computer software and state legal minimum wages. Least squares (linear regression) with corrected standard errors were used to estimate the equations. Statistical tests revealed both instruments were strong and tests for over-identifying restrictions were favorable. Wages were found to be predictive (P < 0.05) of obesity and body mass in regressions both before and after applying IVs. Coefficient estimates suggested stronger effects in the IV models. Results are consistent with the hypothesis that low wages increase obesity prevalence and body mass.
The Cockroft and Gault formula for estimation of creatinine clearance: a friendly deconstruction.
Millar, J Alasdair
2012-02-24
To review the derivation of the Cockroft and Gault formula for estimating creatinine clearance from serum creatinine in a historical context. The derivation described by Cockroft and Gault was reviewed, and an alternative formula was sought using the data reported in the paper. Cockroft and Gault used 24 hour urine creatinine data expressed as mg/kg body weight and mathematical manipulation of a linear regression equation which introduced body weight as an independent variable into the formula. This involved a circular logic and may have been mathematically invalid. A more logical equation not containing body weight was derived from the data. The Cockcroft and Gault formula has been validated by long usage but the derivation appears logically insecure. Nevertheless, its role in estimating renal function at the bedside is established.
Esserman, Denise A.; Moore, Charity G.; Roth, Mary T.
2009-01-01
Older community dwelling adults often take multiple medications for numerous chronic diseases. Non-adherence to these medications can have a large public health impact. Therefore, the measurement and modeling of medication adherence in the setting of polypharmacy is an important area of research. We apply a variety of different modeling techniques (standard linear regression; weighted linear regression; adjusted linear regression; naïve logistic regression; beta-binomial (BB) regression; generalized estimating equations (GEE)) to binary medication adherence data from a study in a North Carolina based population of older adults, where each medication an individual was taking was classified as adherent or non-adherent. In addition, through simulation we compare these different methods based on Type I error rates, bias, power, empirical 95% coverage, and goodness of fit. We find that estimation and inference using GEE is robust to a wide variety of scenarios and we recommend using this in the setting of polypharmacy when adherence is dichotomously measured for multiple medications per person. PMID:20414358
Modeling The Skeleton Weight of an Adult Caucasian Man.
Avtandilashvili, Maia; Tolmachev, Sergei Y
2018-05-17
The reference value for the skeleton weight of an adult male (10.5 kg) recommended by the International Commission on Radiological Protection in Publication 70 is based on weights of dissected skeletons from 44 individuals, including two U.S. Transuranium and Uranium Registries whole-body donors. The International Commission on Radiological Protection analysis of anatomical data from 31 individuals with known values of body height demonstrated significant correlation between skeleton weight and body height. The corresponding regression equation, Wskel (kg) = -10.7 + 0.119 × H (cm), published in International Commission on Radiological Protection Publication 70 is typically used to estimate the skeleton weight from body height. Currently, the U.S. Transuranium and Uranium Registries holds data on individual bone weights from a total of 40 male whole-body donors, which has provided a unique opportunity to update the International Commission on Radiological Protection skeleton weight vs. body height equation. The original International Commission on Radiological Protection Publication 70 and the new U.S. Transuranium and Uranium Registries data were combined in a set of 69 data points representing a group of 33- to 95-y-old individuals with body heights and skeleton weights ranging from 155 to 188 cm and 6.5 to 13.4 kg, respectively. Data were fitted with a linear least-squares regression. A significant correlation between the two parameters was observed (r = 0.28), and an updated skeleton weight vs. body height equation was derived: Wskel (kg) = -6.5 + 0.093 × H (cm). In addition, a correlation of skeleton weight with multiple variables including body height, body weight, and age was evaluated using multiple regression analysis, and a corresponding fit equation was derived: Wskel (kg) = -0.25 + 0.046 × H (cm) + 0.036 × Wbody (kg) - 0.012 × A (y). These equations will be used to estimate skeleton weights and, ultimately, total skeletal actinide activities for biokinetic modeling of U.S. Transuranium and Uranium Registries partial-body donation cases.
NASA Astrophysics Data System (ADS)
Wang, Ji-Peng; François, Bertrand; Lambert, Pierre
2017-09-01
Estimating hydraulic conductivity from particle size distribution (PSD) is an important issue for various engineering problems. Classical models such as Hazen model, Beyer model, and Kozeny-Carman model usually regard the grain diameter at 10% passing (d10) as an effective grain size and the effects of particle size uniformity (in Beyer model) or porosity (in Kozeny-Carman model) are sometimes embedded. This technical note applies the dimensional analysis (Buckingham's ∏ theorem) to analyze the relationship between hydraulic conductivity and particle size distribution (PSD). The porosity is regarded as a dependent variable on the grain size distribution in unconsolidated conditions. It indicates that the coefficient of grain size uniformity and a dimensionless group representing the gravity effect, which is proportional to the mean grain volume, are the main two determinative parameters for estimating hydraulic conductivity. Regression analysis is then carried out on a database comprising 431 samples collected from different depositional environments and new equations are developed for hydraulic conductivity estimation. The new equation, validated in specimens beyond the database, shows an improved prediction comparing to using the classic models.
[Progress on Individual Stature Estimation in Forensic Medicine].
Wu, Rong-qi; Huang, Li-na; Chen, Xin
2015-12-01
Individual stature estimation is one of the most important contents of forensic anthropology. Currently, it has been used that the regression equations established by the data collected by direct measurement or radiological techniques in a certain group of limbs, irregular bones, and anatomic landmarks. Due to the impact of population mobility, human physical improvement, racial and geographic differences, estimation of individual stature should be a regular study. This paper reviews the different methods of stature estimation, briefly describes the advantages and disadvantages of each method, and prospects a new research direction.
Ide, Jun'ichiro; Chiwa, Masaaki; Higashi, Naoko; Maruno, Ryoko; Mori, Yasushi; Otsuki, Kyoichi
2012-08-01
This study sought to determine the lowest number of storm events required for adequate estimation of annual nutrient loads from a forested watershed using the regression equation between cumulative load (∑L) and cumulative stream discharge (∑Q). Hydrological surveys were conducted for 4 years, and stream water was sampled sequentially at 15-60-min intervals during 24 h in 20 events, as well as weekly in a small forested watershed. The bootstrap sampling technique was used to determine the regression (∑L-∑Q) equations of dissolved nitrogen (DN) and phosphorus (DP), particulate nitrogen (PN) and phosphorus (PP), dissolved inorganic nitrogen (DIN), and suspended solid (SS) for each dataset of ∑L and ∑Q. For dissolved nutrients (DN, DP, DIN), the coefficient of variance (CV) in 100 replicates of 4-year average annual load estimates was below 20% with datasets composed of five storm events. For particulate nutrients (PN, PP, SS), the CV exceeded 20%, even with datasets composed of more than ten storm events. The differences in the number of storm events required for precise load estimates between dissolved and particulate nutrients were attributed to the goodness of fit of the ∑L-∑Q equations. Bootstrap simulation based on flow-stratified sampling resulted in fewer storm events than the simulation based on random sampling and showed that only three storm events were required to give a CV below 20% for dissolved nutrients. These results indicate that a sampling design considering discharge levels reduces the frequency of laborious chemical analyses of water samples required throughout the year.
Soil translocation estimates calibrated for moldboard plow depth
USDA-ARS?s Scientific Manuscript database
Over the past century, one of the biggest culprits of tillage-induced soil erosion and translocation has been the moldboard plow. The distance soil will move by moldboard plow tillage has been shown to be correlated with slope gradient. Lindstrom et al. (1992) developed regression equations describi...
Determining the Statistical Significance of Relative Weights
ERIC Educational Resources Information Center
Tonidandel, Scott; LeBreton, James M.; Johnson, Jeff W.
2009-01-01
Relative weight analysis is a procedure for estimating the relative importance of correlated predictors in a regression equation. Because the sampling distribution of relative weights is unknown, researchers using relative weight analysis are unable to make judgments regarding the statistical significance of the relative weights. J. W. Johnson…
Use of the forced-oscillation technique to estimate spirometry values.
Yamamoto, Shoichiro; Miyoshi, Seigo; Katayama, Hitoshi; Okazaki, Mikio; Shigematsu, Hisayuki; Sano, Yoshifumi; Matsubara, Minoru; Hamaguchi, Naohiko; Okura, Takafumi; Higaki, Jitsuo
2017-01-01
Spirometry is sometimes difficult to perform in elderly patients and in those with severe respiratory distress. The forced-oscillation technique (FOT) is a simple and noninvasive method of measuring respiratory impedance. The aim of this study was to determine if FOT data reflect spirometric indices. Patients underwent both FOT and spirometry procedures prior to inclusion in development (n=1,089) and validation (n=552) studies. Multivariate linear regression analysis was performed to identify FOT parameters predictive of vital capacity (VC), forced VC (FVC), and forced expiratory volume in 1 second (FEV 1 ). A regression equation was used to calculate estimated VC, FVC, and FEV 1 . We then determined whether the estimated data reflected spirometric indices. Agreement between actual and estimated spirometry data was assessed by Bland-Altman analysis. Significant correlations were observed between actual and estimated VC, FVC, and FEV 1 values (all r >0.8 and P <0.001). These results were deemed robust by a separate validation study (all r >0.8 and P <0.001). Bias between the actual data and estimated data for VC, FVC, and FEV 1 in the development study was 0.007 L (95% limits of agreement [LOA] 0.907 and -0.893 L), -0.064 L (95% LOA 0.843 and -0.971 L), and -0.039 L (95% LOA 0.735 and -0.814 L), respectively. On the other hand, bias between the actual data and estimated data for VC, FVC, and FEV 1 in the validation study was -0.201 L (95% LOA 0.62 and -1.022 L), -0.262 L (95% LOA 0.582 and -1.106 L), and -0.174 L (95% LOA 0.576 and -0.923 L), respectively, suggesting that the estimated data in the validation study did not have high accuracy. Further studies are needed to generate more accurate regression equations for spirometric indices based on FOT measurements.
Van Houtven, George; Powers, John; Jessup, Amber; Yang, Jui-Chen
2006-08-01
Many economists argue that willingness-to-pay (WTP) measures are most appropriate for assessing the welfare effects of health changes. Nevertheless, the health evaluation literature is still dominated by studies estimating nonmonetary health status measures (HSMs), which are often used to assess changes in quality-adjusted life years (QALYs). Using meta-regression analysis, this paper combines results from both WTP and HSM studies applied to acute morbidity, and it tests whether a systematic relationship exists between HSM and WTP estimates. We analyze over 230 WTP estimates from 17 different studies and find evidence that QALY-based estimates of illness severity--as measured by the Quality of Well-Being (QWB) Scale--are significant factors in explaining variation in WTP, as are changes in the duration of illness and the average income and age of the study populations. In addition, we test and reject the assumption of a constant WTP per QALY gain. We also demonstrate how the estimated meta-regression equations can serve as benefit transfer functions for policy analysis. By specifying the change in duration and severity of the acute illness and the characteristics of the affected population, we apply the regression functions to predict average WTP per case avoided. Copyright 2006 John Wiley & Sons, Ltd.
A New Global Regression Analysis Method for the Prediction of Wind Tunnel Model Weight Corrections
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred; Bridge, Thomas M.; Amaya, Max A.
2014-01-01
A new global regression analysis method is discussed that predicts wind tunnel model weight corrections for strain-gage balance loads during a wind tunnel test. The method determines corrections by combining "wind-on" model attitude measurements with least squares estimates of the model weight and center of gravity coordinates that are obtained from "wind-off" data points. The method treats the least squares fit of the model weight separate from the fit of the center of gravity coordinates. Therefore, it performs two fits of "wind- off" data points and uses the least squares estimator of the model weight as an input for the fit of the center of gravity coordinates. Explicit equations for the least squares estimators of the weight and center of gravity coordinates are derived that simplify the implementation of the method in the data system software of a wind tunnel. In addition, recommendations for sets of "wind-off" data points are made that take typical model support system constraints into account. Explicit equations of the confidence intervals on the model weight and center of gravity coordinates and two different error analyses of the model weight prediction are also discussed in the appendices of the paper.
Estimation of magnitude and frequency of floods for streams in Puerto Rico : new empirical models
Ramos-Gines, Orlando
1999-01-01
Flood-peak discharges and frequencies are presented for 57 gaged sites in Puerto Rico for recurrence intervals ranging from 2 to 500 years. The log-Pearson Type III distribution, the methodology recommended by the United States Interagency Committee on Water Data, was used to determine the magnitude and frequency of floods at the gaged sites having 10 to 43 years of record. A technique is presented for estimating flood-peak discharges at recurrence intervals ranging from 2 to 500 years for unregulated streams in Puerto Rico with contributing drainage areas ranging from 0.83 to 208 square miles. Loglinear multiple regression analyses, using climatic and basin characteristics and peak-discharge data from the 57 gaged sites, were used to construct regression equations to transfer the magnitude and frequency information from gaged to ungaged sites. The equations have contributing drainage area, depth-to-rock, and mean annual rainfall as the basin and climatic characteristics in estimating flood peak discharges. Examples are given to show a step-by-step procedure in calculating a 100-year flood at a gaged site, an ungaged site, a site near a gaged location, and a site between two gaged sites.
Hansen, Dominique; Jacobs, Nele; Thijs, Herbert; Dendale, Paul; Claes, Neree
2016-09-01
Healthcare professionals with limited access to ergospirometry remain in need of valid and simple submaximal exercise tests to predict maximal oxygen uptake (VO2max ). Despite previous validation studies concerning fixed-rate step tests, accurate equations for the estimation of VO2max remain to be formulated from a large sample of healthy adults between age 18-75 years (n > 100). The aim of this study was to develop a valid equation to estimate VO2max from a fixed-rate step test in a larger sample of healthy adults. A maximal ergospirometry test, with assessment of cardiopulmonary parameters and VO2max , and a 5-min fixed-rate single-stage step test were executed in 112 healthy adults (age 18-75 years). During the step test and subsequent recovery, heart rate was monitored continuously. By linear regression analysis, an equation to predict VO2max from the step test was formulated. This equation was assessed for level of agreement by displaying Bland-Altman plots and calculation of intraclass correlations with measured VO2max . Validity further was assessed by employing a Jackknife procedure. The linear regression analysis generated the following equation to predict VO2max (l min(-1) ) from the step test: 0·054(BMI)+0·612(gender)+3·359(body height in m)+0·019(fitness index)-0·012(HRmax)-0·011(age)-3·475. This equation explained 78% of the variance in measured VO2max (F = 66·15, P<0·001). The level of agreement and intraclass correlation was high (ICC = 0·94, P<0·001) between measured and predicted VO2max . From this study, a valid fixed-rate single-stage step test equation has been developed to estimate VO2max in healthy adults. This tool could be employed by healthcare professionals with limited access to ergospirometry. © 2015 Scandinavian Society of Clinical Physiology and Nuclear Medicine. Published by John Wiley & Sons Ltd.
Mroz, T A
1999-10-01
This paper contains a Monte Carlo evaluation of estimators used to control for endogeneity of dummy explanatory variables in continuous outcome regression models. When the true model has bivariate normal disturbances, estimators using discrete factor approximations compare favorably to efficient estimators in terms of precision and bias; these approximation estimators dominate all the other estimators examined when the disturbances are non-normal. The experiments also indicate that one should liberally add points of support to the discrete factor distribution. The paper concludes with an application of the discrete factor approximation to the estimation of the impact of marriage on wages.
Score Estimating Equations from Embedded Likelihood Functions under Accelerated Failure Time Model
NING, JING; QIN, JING; SHEN, YU
2014-01-01
SUMMARY The semiparametric accelerated failure time (AFT) model is one of the most popular models for analyzing time-to-event outcomes. One appealing feature of the AFT model is that the observed failure time data can be transformed to identically independent distributed random variables without covariate effects. We describe a class of estimating equations based on the score functions for the transformed data, which are derived from the full likelihood function under commonly used semiparametric models such as the proportional hazards or proportional odds model. The methods of estimating regression parameters under the AFT model can be applied to traditional right-censored survival data as well as more complex time-to-event data subject to length-biased sampling. We establish the asymptotic properties and evaluate the small sample performance of the proposed estimators. We illustrate the proposed methods through applications in two examples. PMID:25663727
Modeling individualized coefficient alpha to measure quality of test score data.
Liu, Molei; Hu, Ming; Zhou, Xiao-Hua
2018-05-23
Individualized coefficient alpha is defined. It is item and subject specific and is used to measure the quality of test score data with heterogenicity among the subjects and items. A regression model is developed based on 3 sets of generalized estimating equations. The first set of generalized estimating equation models the expectation of the responses, the second set models the response's variance, and the third set is proposed to estimate the individualized coefficient alpha, defined and used to measure individualized internal consistency of the responses. We also use different techniques to extend our method to handle missing data. Asymptotic property of the estimators is discussed, based on which inference on the coefficient alpha is derived. Performance of our method is evaluated through simulation study and real data analysis. The real data application is from a health literacy study in Hunan province of China. Copyright © 2018 John Wiley & Sons, Ltd.
Westgate, Philip M
2013-07-20
Generalized estimating equations (GEEs) are routinely used for the marginal analysis of correlated data. The efficiency of GEE depends on how closely the working covariance structure resembles the true structure, and therefore accurate modeling of the working correlation of the data is important. A popular approach is the use of an unstructured working correlation matrix, as it is not as restrictive as simpler structures such as exchangeable and AR-1 and thus can theoretically improve efficiency. However, because of the potential for having to estimate a large number of correlation parameters, variances of regression parameter estimates can be larger than theoretically expected when utilizing the unstructured working correlation matrix. Therefore, standard error estimates can be negatively biased. To account for this additional finite-sample variability, we derive a bias correction that can be applied to typical estimators of the covariance matrix of parameter estimates. Via simulation and in application to a longitudinal study, we show that our proposed correction improves standard error estimation and statistical inference. Copyright © 2012 John Wiley & Sons, Ltd.
Yamaguchi, Tsuyoshi; Higashihara, Eiji; Okegawa, Takatsugu; Miyazaki, Isao; Nutahara, Kikuo
2018-05-22
The reliability of various equations for estimating the GFR in ADPKD patients and the influence of tolvaptan on the resulting estimates have not been examined when GFR is calculated on the basis of inulin clearance. We obtained baseline and on-tolvaptan measured GFRs (mGFRs), calculated on the basis of inulin clearance, in 114 ADPKD, and these mGFRs were compared with eGFRs calculated according to four basic equations: the MDRD, CKD-EPI, and JSN-CKDI equations and the Cockcroft-Gault formula, as well as the influence of tolvaptan and of inclusion of cystatin C on accuracy of the results. Accuracy of each of the seven total equations was evaluated on the basis of the percentage of eGFR values within mGFR ± 30% (P 30 ). mGFRs were distributed throughout CKD stages 1-5. Regardless of the CKD stage, P 30 s of the MDRD, CKD-EPI, and JSN-CKDI equations did not differ significantly between baseline values and on-tolvaptan values. In CKD 1-2 patients, P 30 of the CKD-EPI equation was 100.0%, whether or not the patient was on-tolvaptan. In CKD 3-5 patients, P 30 s of the MDRD, CKD-EPI, and JSN-CKDI equations were similar. For all four equations, regression coefficients and intercepts did not differ significantly between baseline and on-tolvaptan values, but accuracy of the Cockcroft-Gault formula was inferior to that of the other three equations. Incorporation of serum cystatin C reduced accuracy. The CKD-EPI equation is most reliable, regardless of the severity of CKD. Tolvaptain intake has minimal influence and cystatin C incorporation does not improve accuracy.
Alternative evaluation metrics for risk adjustment methods.
Park, Sungchul; Basu, Anirban
2018-06-01
Risk adjustment is instituted to counter risk selection by accurately equating payments with expected expenditures. Traditional risk-adjustment methods are designed to estimate accurate payments at the group level. However, this generates residual risks at the individual level, especially for high-expenditure individuals, thereby inducing health plans to avoid those with high residual risks. To identify an optimal risk-adjustment method, we perform a comprehensive comparison of prediction accuracies at the group level, at the tail distributions, and at the individual level across 19 estimators: 9 parametric regression, 7 machine learning, and 3 distributional estimators. Using the 2013-2014 MarketScan database, we find that no one estimator performs best in all prediction accuracies. Generally, machine learning and distribution-based estimators achieve higher group-level prediction accuracy than parametric regression estimators. However, parametric regression estimators show higher tail distribution prediction accuracy and individual-level prediction accuracy, especially at the tails of the distribution. This suggests that there is a trade-off in selecting an appropriate risk-adjustment method between estimating accurate payments at the group level and lower residual risks at the individual level. Our results indicate that an optimal method cannot be determined solely on the basis of statistical metrics but rather needs to account for simulating plans' risk selective behaviors. Copyright © 2018 John Wiley & Sons, Ltd.
Predicting the rate of change in timber value for forest stands infested with gypsy moth
David A. Gansner; Owen W. Herrick
1982-01-01
Presents a method for estimating the potential impact of gypsy moth attacks on forest-stand value. Robust regression analysis is used to develop an equation for predicting the rate of change in timber value from easy-to-measure key characteristics of stand condition.
Estimating Causal Effects in Mediation Analysis Using Propensity Scores
ERIC Educational Resources Information Center
Coffman, Donna L.
2011-01-01
Mediation is usually assessed by a regression-based or structural equation modeling (SEM) approach that we refer to as the classical approach. This approach relies on the assumption that there are no confounders that influence both the mediator, "M", and the outcome, "Y". This assumption holds if individuals are randomly…
Biomass estimates of eastern red cedar tree components
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schnell, R.L.
1976-02-01
Fresh and dry-weight relationships of species of the eastern red cedar (Juniperus virginiana L.) found in the Tennessee Valley are presented. Both wood and bark were analyzed. All fresh and dry weights tabulated were computed from predicting equations developed by multiple regression analysis of field data. (JGB)
Runkel, Robert L.
1998-01-01
OTIS is a mathematical simulation model used to characterize the fate and transport of water-borne solutes in streams and rivers. The governing equation underlying the model is the advection-dispersion equation with additional terms to account for transient storage, lateral inflow, first-order decay, and sorption. This equation and the associated equations describing transient storage and sorption are solved using a Crank-Nicolson finite-difference solution. OTIS may be used in conjunction with data from field-scale tracer experiments to quantify the hydrologic parameters affecting solute transport. This application typically involves a trial-and-error approach wherein parameter estimates are adjusted to obtain an acceptable match between simulated and observed tracer concentrations. Additional applications include analyses of nonconservative solutes that are subject to sorption processes or first-order decay. OTIS-P, a modified version of OTIS, couples the solution of the governing equation with a nonlinear regression package. OTIS-P determines an optimal set of parameter estimates that minimize the squared differences between the simulated and observed concentrations, thereby automating the parameter estimation process. This report details the development and application of OTIS and OTIS-P. Sections of the report describe model theory, input/output specifications, sample applications, and installation instructions.
Biomass expansion factor and root-to-shoot ratio for Pinus in Brazil.
Sanquetta, Carlos R; Corte, Ana Pd; da Silva, Fernando
2011-09-24
The Biomass Expansion Factor (BEF) and the Root-to-Shoot Ratio (R) are variables used to quantify carbon stock in forests. They are often considered as constant or species/area specific values in most studies. This study aimed at showing tree size and age dependence upon BEF and R and proposed equations to improve forest biomass and carbon stock. Data from 70 sample Pinus spp. grown in southern Brazil trees in different diameter classes and ages were used to demonstrate the correlation between BEF and R, and forest inventory data, such as DBH, tree height and age. Total dry biomass, carbon stock and CO2 equivalent were simulated using the IPCC default values of BEF and R, corresponding average calculated from data used in this study, as well as the values estimated by regression equations. The mean values of BEF and R calculated in this study were 1.47 and 0.17, respectively. The relationship between BEF and R and the tree measurement variables were inversely related with negative exponential behavior. Simulations indicated that use of fixed values of BEF and R, either IPCC default or current average data, may lead to unreliable estimates of carbon stock inventories and CDM projects. It was concluded that accounting for the variations in BEF and R and using regression equations to relate them to DBH, tree height and age, is fundamental in obtaining reliable estimates of forest tree biomass, carbon sink and CO2 equivalent.
Estimating global per-capita carbon emissions with VIIRS nighttime lights satellite data
NASA Astrophysics Data System (ADS)
Jasmin, T.; Desai, A. R.; Pierce, R. B.
2015-12-01
With the launch of the Suomi National Polar-orbiting Partnership (NPP) satellite in November 2011, we now have nighttime lights remote sensing capability vastly improved over the predecessor Defense Meteorological Satellite Program (DMSP), owing to improved spatial and radiometric resolution provided by the Visible Infrared Imaging Radiometer Suite (VIIRS) Day Night Band (DNB) along with technology improvements in data transfer, processing, and storage. This development opens doors for improving novel scientific applications utilizing remotely sensed low-level visible light, for purposes ranging from estimating population to inferring factors relating to economic development. For example, the success of future international agreements to reduce greenhouse gas emissions will be dependent on mechanisms to monitor remotely for compliance. Here, we discuss implementation and evaluation of the VRCE system (VIIRS Remote Carbon Estimates), developed at the University of Wisconsin-Madison, which provides monthly independent, unbiased estimates of per-capita carbon emissions. Cloud-free global composites of Earth nocturnal lighting are generated from VIIRS DNB at full spatial resolution (750 meter). A population equation is derived from a linear regression of DNB radiance sums at state level to U.S. Census data. CO2 emissions are derived from a linear regression of VIIRS DNB radiance sums to U.S. Department of Energy emission estimates. Regional coefficients for factors such as percentage of energy use from renewable sources are factored in, and together these equations are used to generate per-capita CO2 emission estimates at the country level.
Estimation of selected seasonal streamflow statistics representative of 1930-2002 in West Virginia
Wiley, Jeffrey B.; Atkins, John T.
2010-01-01
Regional equations and procedures were developed for estimating seasonal 1-day 10-year, 7-day 10-year, and 30-day 5-year hydrologically based low-flow frequency values for unregulated streams in West Virginia. Regional equations and procedures also were developed for estimating the seasonal U.S. Environmental Protection Agency harmonic-mean flows and the 50-percent flow-duration values. The seasons were defined as winter (January 1-March 31), spring (April 1-June 30), summer (July 1-September 30), and fall (October 1-December 31). Regional equations were developed using ordinary least squares regression using statistics from 117 U.S. Geological Survey continuous streamgage stations as dependent variables and basin characteristics as independent variables. Equations for three regions in West Virginia-North, South-Central, and Eastern Panhandle Regions-were determined. Drainage area, average annual precipitation, and longitude of the basin centroid are significant independent variables in one or more of the equations. The average standard error of estimates for the equations ranged from 12.6 to 299 percent. Procedures developed to estimate the selected seasonal streamflow statistics in this study are applicable only to rural, unregulated streams within the boundaries of West Virginia that have independent variables within the limits of the stations used to develop the regional equations: drainage area from 16.3 to 1,516 square miles in the North Region, from 2.78 to 1,619 square miles in the South-Central Region, and from 8.83 to 3,041 square miles in the Eastern Panhandle Region; average annual precipitation from 42.3 to 61.4 inches in the South-Central Region and from 39.8 to 52.9 inches in the Eastern Panhandle Region; and longitude of the basin centroid from 79.618 to 82.023 decimal degrees in the North Region. All estimates of seasonal streamflow statistics are representative of the period from the 1930 to the 2002 climatic year.
Estimating Causal Effects with Ancestral Graph Markov Models
Malinsky, Daniel; Spirtes, Peter
2017-01-01
We present an algorithm for estimating bounds on causal effects from observational data which combines graphical model search with simple linear regression. We assume that the underlying system can be represented by a linear structural equation model with no feedback, and we allow for the possibility of latent variables. Under assumptions standard in the causal search literature, we use conditional independence constraints to search for an equivalence class of ancestral graphs. Then, for each model in the equivalence class, we perform the appropriate regression (using causal structure information to determine which covariates to include in the regression) to estimate a set of possible causal effects. Our approach is based on the “IDA” procedure of Maathuis et al. (2009), which assumes that all relevant variables have been measured (i.e., no unmeasured confounders). We generalize their work by relaxing this assumption, which is often violated in applied contexts. We validate the performance of our algorithm on simulated data and demonstrate improved precision over IDA when latent variables are present. PMID:28217244
Body temperatures in dinosaurs: what can growth curves tell us?
Griebeler, Eva Maria
2013-01-01
To estimate the body temperature (BT) of seven dinosaurs Gillooly, Alleen, and Charnov (2006) used an equation that predicts BT from the body mass and maximum growth rate (MGR) with the latter preserved in ontogenetic growth trajectories (BT-equation). The results of these authors evidence inertial homeothermy in Dinosauria and suggest that, due to overheating, the maximum body size in Dinosauria was ultimately limited by BT. In this paper, I revisit this hypothesis of Gillooly, Alleen, and Charnov (2006). I first studied whether BTs derived from the BT-equation of today's crocodiles, birds and mammals are consistent with core temperatures of animals. Second, I applied the BT-equation to a larger number of dinosaurs than Gillooly, Alleen, and Charnov (2006) did. In particular, I estimated BT of Archaeopteryx (from two MGRs), ornithischians (two), theropods (three), prosauropods (three), and sauropods (nine). For extant species, the BT value estimated from the BT-equation was a poor estimate of an animal's core temperature. For birds, BT was always strongly overestimated and for crocodiles underestimated; for mammals the accuracy of BT was moderate. I argue that taxon-specific differences in the scaling of MGR (intercept and exponent of the regression line, log-log-transformed) and in the parameterization of the Arrhenius model both used in the BT-equation as well as ecological and evolutionary adaptations of species cause these inaccuracies. Irrespective of the found inaccuracy of BTs estimated from the BT-equation and contrary to the results of Gillooly, Alleen, and Charnov (2006) I found no increase in BT with increasing body mass across all dinosaurs (Sauropodomorpha, Sauropoda) studied. This observation questions that, due to overheating, the maximum size in Dinosauria was ultimately limited by BT. However, the general high inaccuracy of dinosaurian BTs derived from the BT-equation makes a reliable test of whether body size in dinosaurs was ultimately limited by overheating impossible.
Body Temperatures in Dinosaurs: What Can Growth Curves Tell Us?
Griebeler, Eva Maria
2013-01-01
To estimate the body temperature (BT) of seven dinosaurs Gillooly, Alleen, and Charnov (2006) used an equation that predicts BT from the body mass and maximum growth rate (MGR) with the latter preserved in ontogenetic growth trajectories (BT-equation). The results of these authors evidence inertial homeothermy in Dinosauria and suggest that, due to overheating, the maximum body size in Dinosauria was ultimately limited by BT. In this paper, I revisit this hypothesis of Gillooly, Alleen, and Charnov (2006). I first studied whether BTs derived from the BT-equation of today’s crocodiles, birds and mammals are consistent with core temperatures of animals. Second, I applied the BT-equation to a larger number of dinosaurs than Gillooly, Alleen, and Charnov (2006) did. In particular, I estimated BT of Archaeopteryx (from two MGRs), ornithischians (two), theropods (three), prosauropods (three), and sauropods (nine). For extant species, the BT value estimated from the BT-equation was a poor estimate of an animal’s core temperature. For birds, BT was always strongly overestimated and for crocodiles underestimated; for mammals the accuracy of BT was moderate. I argue that taxon-specific differences in the scaling of MGR (intercept and exponent of the regression line, log-log-transformed) and in the parameterization of the Arrhenius model both used in the BT-equation as well as ecological and evolutionary adaptations of species cause these inaccuracies. Irrespective of the found inaccuracy of BTs estimated from the BT-equation and contrary to the results of Gillooly, Alleen, and Charnov (2006) I found no increase in BT with increasing body mass across all dinosaurs (Sauropodomorpha, Sauropoda) studied. This observation questions that, due to overheating, the maximum size in Dinosauria was ultimately limited by BT. However, the general high inaccuracy of dinosaurian BTs derived from the BT-equation makes a reliable test of whether body size in dinosaurs was ultimately limited by overheating impossible. PMID:24204568
Hirani, Vasant; Tabassum, Faiza; Aresu, Maria; Mindell, Jennifer
2010-08-01
Various measures have been used to estimate height when assessing nutritional status. Current equations to obtain demi-span equivalent height (DEH(Bassey)) are based on a small sample from a single study. The objectives of this study were to develop more robust DEH equations from a large number of men (n = 591) and women (n = 830) aged 25-45 y from a nationally representative cross-sectional sample (Health Survey for England 2007). Sex-specific regression equations were produced from young adults' (aged 25-45 y) measured height and demi-span to estimate new DEH equations (DEH(new)). DEH in people aged >or= 65 y was calculated using DEH(new). DEH(new) estimated current height in people aged 25-45 y with a mean difference of 0.04 in men (P = 0.80) and -0.29 in women (P = 0.05). Height, demi-span, DEH(new), and DEH(Bassey) declined by age group in both sexes aged >or=65 y (P < 0.05); DEH were larger than the measured height for all age groups (mean difference between DEH(new) and current height was -2.64 in men and -3.16 in women; both P < 0.001). Comparisons of DEH estimates showed good agreement, but DEH(new) was significantly higher than DEH(Bassey) in each age and sex group in older people. The new equations that are based on a large, randomly selected, nationally representative sample of young adults are more robust for predicting current height in young adults when height measurements are unavailable and can be used in the future to predict maximal adult height more accurately in currently young adults as they age.
Hwang, Bosun; Han, Jonghee; Choi, Jong Min; Park, Kwang Suk
2008-11-01
The purpose of this study was to develop an unobtrusive energy expenditure (EE) measurement system using an infrared (IR) sensor-based activity monitoring system to measure indoor activities and to estimate individual quantitative EE. IR-sensor activation counts were measured with a Bluetooth-based monitoring system and the standard EE was calculated using an established regression equation. Ten male subjects participated in the experiment and three different EE measurement systems (gas analyzer, accelerometer, IR sensor) were used simultaneously in order to determine the regression equation and evaluate the performance. As a standard measurement, oxygen consumption was simultaneously measured by a portable metabolic system (Metamax 3X, Cortex, Germany). A single room experiment was performed to develop a regression model of the standard EE measurement from the proposed IR sensor-based measurement system. In addition, correlation and regression analyses were done to compare the performance of the IR system with that of the Actigraph system. We determined that our proposed IR-based EE measurement system shows a similar correlation to the Actigraph system with the standard measurement system.
Anderson, S.C.; Kupfer, J.A.; Wilson, R.R.; Cooper, R.J.
2000-01-01
The purpose of this research was to develop a model that could be used to provide a spatial representation of uneven-aged silvicultural treatments on forest crown area. We began by developing species-specific linear regression equations relating tree DBH to crown area for eight bottomland tree species at White River National Wildlife Refuge, Arkansas, USA. The relationships were highly significant for all species, with coefficients of determination (r(2)) ranging from 0.37 for Ulmus crassifolia to nearly 0.80 for Quercus nuttalliii and Taxodium distichum. We next located and measured the diameters of more than 4000 stumps from a single tree-group selection timber harvest. Stump locations were recorded with respect to an established gl id point system and entered into a Geographic Information System (ARC/INFO). The area occupied by the crown of each logged individual was then estimated by using the stump dimensions (adjusted to DBHs) and the regression equations relating tree DBH to crown area. Our model projected that the selection cuts removed roughly 300 m(2) of basal area from the logged sites resulting in the loss of approximate to 55 000 m(2) of crown area. The model developed in this research represents a tool that can be used in conjunction with remote sensing applications to assist in forest inventory and management, as well as to estimate the impacts of selective timber harvest on wildlife.
Estimation of stature from hand and foot dimensions in a Korean population.
Kim, Wonjoon; Kim, Yong Min; Yun, Myung Hwan
2018-04-01
The estimation of stature using foot and hand dimensions is essential in the process of personal identification. The shapes of feet and hands vary depending on races and gender, and it is of great importance to design an adequate equation in consideration of variances to estimate stature. This study is based on a total of 5,195 South Korean males and females, aged from 20 to 59 years. Body dimensions of stature, hand length, hand breadth, foot length, and foot breadth were measured according to standard anthropometric procedures. The independent t-test was performed in order to verify significant gender-induced differences and the results showed that there was significant difference between males and females for all the foot-hand dimensions (p<0.01). All dimensions showed a positive and statistically significant relation with stature in both genders (p<0.01). For both genders, the foot length showed highest correlation, whereas the hand breadth showed least correlation. The stepwise regression analysis was conducted, and the results showed that males had the highest prediction accuracy in the regression equation consisting of foot length and hand length (R 2 =0.532), whereas females had the highest accuracy in the regression model consisting of foot length and hand breadth (R 2 =0.437) The findings of this study indicated that hand and foot dimensions can be used to predict the stature of South Korean in the forensic science field. Copyright © 2018 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
Furushima, Taishi; Miyachi, Motohiko; Iemitsu, Motoyuki; Murakami, Haruka; Kawano, Hiroshi; Gando, Yuko; Kawakami, Ryoko; Sanada, Kiyoshi
2017-08-29
This study aimed to develop and cross-validate prediction equations for estimating appendicular skeletal muscle mass (ASM) and to examine the relationship between sarcopenia defined by the prediction equations and risk factors for cardiovascular diseases (CVD) or osteoporosis in Japanese men and women. Subjects were healthy men and women aged 20-90 years, who were randomly allocated to the following two groups: the development group (D group; 257 men, 913 women) and the cross-validation group (V group; 119 men, 112 women). To develop prediction equations, stepwise multiple regression analyses were performed on data obtained from the D group, using ASM measured by dual-energy X-ray absorptiometry (DXA) as a dependent variable and five easily obtainable measures (age, height, weight, waist circumference, and handgrip strength) as independent variables. When the prediction equations for ASM estimation were applied to the V group, a significant correlation was found between DXA-measured ASM and predicted ASM in both men and women (R 2 = 0.81 and R 2 = 0.72). Our prediction equations had higher R 2 values compared to previously developed equations (R 2 = 0.75-0.59 and R 2 = 0.69-0.40) in both men and women. Moreover, sarcopenia defined by predicted ASM was related to risk factors for osteoporosis and CVD, as well as sarcopenia defined by DXA-measured ASM. In this study, novel prediction equations were developed and cross-validated in Japanese men and women. Our analyses validated the clinical significance of these prediction equations and showed that previously reported equations were not applicable in a Japanese population.
Prediction equation for estimating total daily energy requirements of special operations personnel.
Barringer, N D; Pasiakos, S M; McClung, H L; Crombie, A P; Margolis, L M
2018-01-01
Special Operations Forces (SOF) engage in a variety of military tasks with many producing high energy expenditures, leading to undesired energy deficits and loss of body mass. Therefore, the ability to accurately estimate daily energy requirements would be useful for accurate logistical planning. Generate a predictive equation estimating energy requirements of SOF. Retrospective analysis of data collected from SOF personnel engaged in 12 different SOF training scenarios. Energy expenditure and total body water were determined using the doubly-labeled water technique. Physical activity level was determined as daily energy expenditure divided by resting metabolic rate. Physical activity level was broken into quartiles (0 = mission prep, 1 = common warrior tasks, 2 = battle drills, 3 = specialized intense activity) to generate a physical activity factor (PAF). Regression analysis was used to construct two predictive equations (Model A; body mass and PAF, Model B; fat-free mass and PAF) estimating daily energy expenditures. Average measured energy expenditure during SOF training was 4468 (range: 3700 to 6300) Kcal·d- 1 . Regression analysis revealed that physical activity level ( r = 0.91; P < 0.05) and body mass ( r = 0.28; P < 0.05; Model A), or fat-free mass (FFM; r = 0.32; P < 0.05; Model B) were the factors that most highly predicted energy expenditures. Predictive equations coupling PAF with body mass (Model A) and FFM (Model B), were correlated ( r = 0.74 and r = 0.76, respectively) and did not differ [mean ± SEM: Model A; 4463 ± 65 Kcal·d - 1 , Model B; 4462 ± 61 Kcal·d - 1 ] from DLW measured energy expenditures. By quantifying and grouping SOF training exercises into activity factors, SOF energy requirements can be predicted with reasonable accuracy and these equations used by dietetic/logistical personnel to plan appropriate feeding regimens to meet SOF nutritional requirements across their mission profile.
Kupek, Emil
2006-03-15
Structural equation modelling (SEM) has been increasingly used in medical statistics for solving a system of related regression equations. However, a great obstacle for its wider use has been its difficulty in handling categorical variables within the framework of generalised linear models. A large data set with a known structure among two related outcomes and three independent variables was generated to investigate the use of Yule's transformation of odds ratio (OR) into Q-metric by (OR-1)/(OR+1) to approximate Pearson's correlation coefficients between binary variables whose covariance structure can be further analysed by SEM. Percent of correctly classified events and non-events was compared with the classification obtained by logistic regression. The performance of SEM based on Q-metric was also checked on a small (N = 100) random sample of the data generated and on a real data set. SEM successfully recovered the generated model structure. SEM of real data suggested a significant influence of a latent confounding variable which would have not been detectable by standard logistic regression. SEM classification performance was broadly similar to that of the logistic regression. The analysis of binary data can be greatly enhanced by Yule's transformation of odds ratios into estimated correlation matrix that can be further analysed by SEM. The interpretation of results is aided by expressing them as odds ratios which are the most frequently used measure of effect in medical statistics.
Modelling of capital asset pricing by considering the lagged effects
NASA Astrophysics Data System (ADS)
Sukono; Hidayat, Y.; Bon, A. Talib bin; Supian, S.
2017-01-01
In this paper the problem of modelling the Capital Asset Pricing Model (CAPM) with the effect of the lagged is discussed. It is assumed that asset returns are analysed influenced by the market return and the return of risk-free assets. To analyse the relationship between asset returns, the market return, and the return of risk-free assets, it is conducted by using a regression equation of CAPM, and regression equation of lagged distributed CAPM. Associated with the regression equation lagged CAPM distributed, this paper also developed a regression equation of Koyck transformation CAPM. Results of development show that the regression equation of Koyck transformation CAPM has advantages, namely simple as it only requires three parameters, compared with regression equation of lagged distributed CAPM.
Hybrid Rocket Performance Prediction with Coupling Method of CFD and Thermal Conduction Calculation
NASA Astrophysics Data System (ADS)
Funami, Yuki; Shimada, Toru
The final purpose of this study is to develop a design tool for hybrid rocket engines. This tool is a computer code which will be used in order to investigate rocket performance characteristics and unsteady phenomena lasting through the burning time, such as fuel regression or combustion oscillation. When phenomena inside a combustion chamber, namely boundary layer combustion, are described, it is difficult to use rigorous models for this target. It is because calculation cost may be too expensive. Therefore simple models are required for this calculation. In this study, quasi-one-dimensional compressible Euler equations for flowfields inside a chamber and the equation for thermal conduction inside a solid fuel are numerically solved. The energy balance equation at the solid fuel surface is solved to estimate fuel regression rate. Heat feedback model is Karabeyoglu's model dependent on total mass flux. Combustion model is global single step reaction model for 4 chemical species or chemical equilibrium model for 9 chemical species. As a first step, steady-state solutions are reported.
An analysis of the magnitude and frequency of floods on Oahu, Hawaii
Nakahara, R.H.
1980-01-01
An analysis of available peak-flow data for the island of Oahu, Hawaii, was made by using multiple regression techniques which related flood-frequency data to basin and climatic characteristics for 74 gaging stations on Oahu. In the analysis, several different groupings of stations were investigated, including divisions by geographic location and size of drainage area. The grouping consisting of two leeward divisions and one windward division produced the best results. Drainage basins ranged in area from 0.03 to 45.7 square miles. Equations relating flood magnitudes of selected frequencies to basin characteristics were developed for the three divisions of Oahu. These equations can be used to estimate the magnitude and frequency of floods for any site, gaged or ungaged, for any desired recurrence interval from 2 to 100 years. Data on basin characteristics, flood magnitudes for various recurrence intervals from individual station-frequency curves, and computed flood magnitudes by use of the regression equation are tabulated to provide the needed data. (USGS)
A different approach to estimate nonlinear regression model using numerical methods
NASA Astrophysics Data System (ADS)
Mahaboob, B.; Venkateswarlu, B.; Mokeshrayalu, G.; Balasiddamuni, P.
2017-11-01
This research paper concerns with the computational methods namely the Gauss-Newton method, Gradient algorithm methods (Newton-Raphson method, Steepest Descent or Steepest Ascent algorithm method, the Method of Scoring, the Method of Quadratic Hill-Climbing) based on numerical analysis to estimate parameters of nonlinear regression model in a very different way. Principles of matrix calculus have been used to discuss the Gradient-Algorithm methods. Yonathan Bard [1] discussed a comparison of gradient methods for the solution of nonlinear parameter estimation problems. However this article discusses an analytical approach to the gradient algorithm methods in a different way. This paper describes a new iterative technique namely Gauss-Newton method which differs from the iterative technique proposed by Gorden K. Smyth [2]. Hans Georg Bock et.al [10] proposed numerical methods for parameter estimation in DAE’s (Differential algebraic equation). Isabel Reis Dos Santos et al [11], Introduced weighted least squares procedure for estimating the unknown parameters of a nonlinear regression metamodel. For large-scale non smooth convex minimization the Hager and Zhang (HZ) conjugate gradient Method and the modified HZ (MHZ) method were presented by Gonglin Yuan et al [12].
Southard, Rodney E.; Veilleux, Andrea G.
2014-01-01
Regression analysis techniques were used to develop a set of equations for rural ungaged stream sites for estimating discharges with 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities, which are equivalent to annual flood-frequency recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years, respectively. Basin and climatic characteristics were computed using geographic information software and digital geospatial data. A total of 35 characteristics were computed for use in preliminary statewide and regional regression analyses. Annual exceedance-probability discharge estimates were computed for 278 streamgages by using the expected moments algorithm to fit a log-Pearson Type III distribution to the logarithms of annual peak discharges for each streamgage using annual peak-discharge data from water year 1844 to 2012. Low-outlier and historic information were incorporated into the annual exceedance-probability analyses, and a generalized multiple Grubbs-Beck test was used to detect potentially influential low floods. Annual peak flows less than a minimum recordable discharge at a streamgage were incorporated into the at-site station analyses. An updated regional skew coefficient was determined for the State of Missouri using Bayesian weighted least-squares/generalized least squares regression analyses. At-site skew estimates for 108 long-term streamgages with 30 or more years of record and the 35 basin characteristics defined for this study were used to estimate the regional variability in skew. However, a constant generalized-skew value of -0.30 and a mean square error of 0.14 were determined in this study. Previous flood studies indicated that the distinct physical features of the three physiographic provinces have a pronounced effect on the magnitude of flood peaks. Trends in the magnitudes of the residuals from preliminary statewide regression analyses from previous studies confirmed that regional analyses in this study were similar and related to three primary physiographic provinces. The final regional regression analyses resulted in three sets of equations. For Regions 1 and 2, the basin characteristics of drainage area and basin shape factor were statistically significant. For Region 3, because of the small amount of data from streamgages, only drainage area was statistically significant. Average standard errors of prediction ranged from 28.7 to 38.4 percent for flood region 1, 24.1 to 43.5 percent for flood region 2, and 25.8 to 30.5 percent for region 3. The regional regression equations are only applicable to stream sites in Missouri with flows not significantly affected by regulation, channelization, backwater, diversion, or urbanization. Basins with about 5 percent or less impervious area were considered to be rural. Applicability of the equations are limited to the basin characteristic values that range from 0.11 to 8,212.38 square miles (mi2) and basin shape from 2.25 to 26.59 for Region 1, 0.17 to 4,008.92 mi2 and basin shape 2.04 to 26.89 for Region 2, and 2.12 to 2,177.58 mi2 for Region 3. Annual peak data from streamgages were used to qualitatively assess the largest floods recorded at streamgages in Missouri since the 1915 water year. Based on existing streamgage data, the 1983 flood event was the largest flood event on record since 1915. The next five largest flood events, in descending order, took place in 1993, 1973, 2008, 1994 and 1915. Since 1915, five of six of the largest floods on record occurred from 1973 to 2012.
NASA Technical Reports Server (NTRS)
Argentiero, P.; Lowrey, B.
1977-01-01
The least squares collocation algorithm for estimating gravity anomalies from geodetic data is shown to be an application of the well known regression equations which provide the mean and covariance of a random vector (gravity anomalies) given a realization of a correlated random vector (geodetic data). It is also shown that the collocation solution for gravity anomalies is equivalent to the conventional least-squares-Stokes' function solution when the conventional solution utilizes properly weighted zero a priori estimates. The mathematical and physical assumptions underlying the least squares collocation estimator are described.
Computational tools for fitting the Hill equation to dose-response curves.
Gadagkar, Sudhindra R; Call, Gerald B
2015-01-01
Many biological response curves commonly assume a sigmoidal shape that can be approximated well by means of the 4-parameter nonlinear logistic equation, also called the Hill equation. However, estimation of the Hill equation parameters requires access to commercial software or the ability to write computer code. Here we present two user-friendly and freely available computer programs to fit the Hill equation - a Solver-based Microsoft Excel template and a stand-alone GUI-based "point and click" program, called HEPB. Both computer programs use the iterative method to estimate two of the Hill equation parameters (EC50 and the Hill slope), while constraining the values of the other two parameters (the minimum and maximum asymptotes of the response variable) to fit the Hill equation to the data. In addition, HEPB draws the prediction band at a user-defined confidence level, and determines the EC50 value for each of the limits of this band to give boundary values that help objectively delineate sensitive, normal and resistant responses to the drug being tested. Both programs were tested by analyzing twelve datasets that varied widely in data values, sample size and slope, and were found to yield estimates of the Hill equation parameters that were essentially identical to those provided by commercial software such as GraphPad Prism and nls, the statistical package in the programming language R. The Excel template provides a means to estimate the parameters of the Hill equation and plot the regression line in a familiar Microsoft Office environment. HEPB, in addition to providing the above results, also computes the prediction band for the data at a user-defined level of confidence, and determines objective cut-off values to distinguish among response types (sensitive, normal and resistant). Both programs are found to yield estimated values that are essentially the same as those from standard software such as GraphPad Prism and the R-based nls. Furthermore, HEPB also has the option to simulate 500 response values based on the range of values of the dose variable in the original data and the fit of the Hill equation to that data. Copyright © 2014. Published by Elsevier Inc.
Devakumar, Delan; Grijalva-Eternod, Carlos S; Roberts, Sebastian; Chaube, Shiva Shankar; Saville, Naomi M; Manandhar, Dharma S; Costello, Anthony; Osrin, David; Wells, Jonathan C K
2015-01-01
Background. Body composition is important as a marker of both current and future health. Bioelectrical impedance (BIA) is a simple and accurate method for estimating body composition, but requires population-specific calibration equations. Objectives. (1) To generate population specific calibration equations to predict lean mass (LM) from BIA in Nepalese children aged 7-9 years. (2) To explore methodological changes that may extend the range and improve accuracy. Methods. BIA measurements were obtained from 102 Nepalese children (52 girls) using the Tanita BC-418. Isotope dilution with deuterium oxide was used to measure total body water and to estimate LM. Prediction equations for estimating LM from BIA data were developed using linear regression, and estimates were compared with those obtained from the Tanita system. We assessed the effects of flexing the arms of children to extend the range of coverage towards lower weights. We also estimated potential error if the number of children included in the study was reduced. Findings. Prediction equations were generated, incorporating height, impedance index, weight and sex as predictors (R (2) 93%). The Tanita system tended to under-estimate LM, with a mean error of 2.2%, but extending up to 25.8%. Flexing the arms to 90° increased the lower weight range, but produced a small error that was not significant when applied to children <16 kg (p 0.42). Reducing the number of children increased the error at the tails of the weight distribution. Conclusions. Population-specific isotope calibration of BIA for Nepalese children has high accuracy. Arm position is important and can be used to extend the range of low weight covered. Smaller samples reduce resource requirements, but leads to large errors at the tails of the weight distribution.
O'Neill, William; Penn, Richard; Werner, Michael; Thomas, Justin
2015-06-01
Estimation of stochastic process models from data is a common application of time series analysis methods. Such system identification processes are often cast as hypothesis testing exercises whose intent is to estimate model parameters and test them for statistical significance. Ordinary least squares (OLS) regression and the Levenberg-Marquardt algorithm (LMA) have proven invaluable computational tools for models being described by non-homogeneous, linear, stationary, ordinary differential equations. In this paper we extend stochastic model identification to linear, stationary, partial differential equations in two independent variables (2D) and show that OLS and LMA apply equally well to these systems. The method employs an original nonparametric statistic as a test for the significance of estimated parameters. We show gray scale and color images are special cases of 2D systems satisfying a particular autoregressive partial difference equation which estimates an analogous partial differential equation. Several applications to medical image modeling and classification illustrate the method by correctly classifying demented and normal OLS models of axial magnetic resonance brain scans according to subject Mini Mental State Exam (MMSE) scores. Comparison with 13 image classifiers from the literature indicates our classifier is at least 14 times faster than any of them and has a classification accuracy better than all but one. Our modeling method applies to any linear, stationary, partial differential equation and the method is readily extended to 3D whole-organ systems. Further, in addition to being a robust image classifier, estimated image models offer insights into which parameters carry the most diagnostic image information and thereby suggest finer divisions could be made within a class. Image models can be estimated in milliseconds which translate to whole-organ models in seconds; such runtimes could make real-time medicine and surgery modeling possible.
Flood estimates for ungaged streams in Glacier and Yellowstone National Parks, Montana
Omang, R.J.; Parrett, Charles; Hull, J.A.
1983-01-01
Estimates of 100-year discharges were made at 59 sites in Glacier National Park and 21 sites in Yellowstone National Park to assist the National Park Services in quantifying stream inflow and outflow in the Parks. The estimates were made using regression equations previously developed for Montana. The resulting 100-year discharges are listed in tables; the discharges ranged from 260 to 53,200 cu ft/s in Glacier National Park and from 110 to 27,900 cu ft/s in Yellowstone National Park. (USGS)
A Streamflow Statistics (StreamStats) Web Application for Ohio
Koltun, G.F.; Kula, Stephanie P.; Puskas, Barry M.
2006-01-01
A StreamStats Web application was developed for Ohio that implements equations for estimating a variety of streamflow statistics including the 2-, 5-, 10-, 25-, 50-, 100-, and 500-year peak streamflows, mean annual streamflow, mean monthly streamflows, harmonic mean streamflow, and 25th-, 50th-, and 75th-percentile streamflows. StreamStats is a Web-based geographic information system application designed to facilitate the estimation of streamflow statistics at ungaged locations on streams. StreamStats can also serve precomputed streamflow statistics determined from streamflow-gaging station data. The basic structure, use, and limitations of StreamStats are described in this report. To facilitate the level of automation required for Ohio's StreamStats application, the technique used by Koltun (2003)1 for computing main-channel slope was replaced with a new computationally robust technique. The new channel-slope characteristic, referred to as SL10-85, differed from the National Hydrography Data based channel slope values (SL) reported by Koltun (2003)1 by an average of -28.3 percent, with the median change being -13.2 percent. In spite of the differences, the two slope measures are strongly correlated. The change in channel slope values resulting from the change in computational method necessitated revision of the full-model equations for flood-peak discharges originally presented by Koltun (2003)1. Average standard errors of prediction for the revised full-model equations presented in this report increased by a small amount over those reported by Koltun (2003)1, with increases ranging from 0.7 to 0.9 percent. Mean percentage changes in the revised regression and weighted flood-frequency estimates relative to regression and weighted estimates reported by Koltun (2003)1 were small, ranging from -0.72 to -0.25 percent and -0.22 to 0.07 percent, respectively.
Using Smartphone Sensors for Improving Energy Expenditure Estimation
Zhu, Jindan; Das, Aveek K.; Zeng, Yunze; Mohapatra, Prasant; Han, Jay J.
2015-01-01
Energy expenditure (EE) estimation is an important factor in tracking personal activity and preventing chronic diseases, such as obesity and diabetes. Accurate and real-time EE estimation utilizing small wearable sensors is a difficult task, primarily because the most existing schemes work offline or use heuristics. In this paper, we focus on accurate EE estimation for tracking ambulatory activities (walking, standing, climbing upstairs, or downstairs) of a typical smartphone user. We used built-in smartphone sensors (accelerometer and barometer sensor), sampled at low frequency, to accurately estimate EE. Using a barometer sensor, in addition to an accelerometer sensor, greatly increases the accuracy of EE estimation. Using bagged regression trees, a machine learning technique, we developed a generic regression model for EE estimation that yields upto 96% correlation with actual EE. We compare our results against the state-of-the-art calorimetry equations and consumer electronics devices (Fitbit and Nike+ FuelBand). The newly developed EE estimation algorithm demonstrated superior accuracy compared with currently available methods. The results were calibrated against COSMED K4b2 calorimeter readings. PMID:27170901
Using Smartphone Sensors for Improving Energy Expenditure Estimation.
Pande, Amit; Zhu, Jindan; Das, Aveek K; Zeng, Yunze; Mohapatra, Prasant; Han, Jay J
2015-01-01
Energy expenditure (EE) estimation is an important factor in tracking personal activity and preventing chronic diseases, such as obesity and diabetes. Accurate and real-time EE estimation utilizing small wearable sensors is a difficult task, primarily because the most existing schemes work offline or use heuristics. In this paper, we focus on accurate EE estimation for tracking ambulatory activities (walking, standing, climbing upstairs, or downstairs) of a typical smartphone user. We used built-in smartphone sensors (accelerometer and barometer sensor), sampled at low frequency, to accurately estimate EE. Using a barometer sensor, in addition to an accelerometer sensor, greatly increases the accuracy of EE estimation. Using bagged regression trees, a machine learning technique, we developed a generic regression model for EE estimation that yields upto 96% correlation with actual EE. We compare our results against the state-of-the-art calorimetry equations and consumer electronics devices (Fitbit and Nike+ FuelBand). The newly developed EE estimation algorithm demonstrated superior accuracy compared with currently available methods. The results were calibrated against COSMED K4b2 calorimeter readings.
Using heart rate to predict energy expenditure in large domestic dogs.
Gerth, N; Ruoß, C; Dobenecker, B; Reese, S; Starck, J M
2016-06-01
The aim of this study was to establish heart rate as a measure of energy expenditure in large active kennel dogs (28 ± 3 kg bw). Therefore, the heart rate (HR)-oxygen consumption (V˙O2) relationship was analysed in Foxhound-Boxer-Ingelheim-Labrador cross-breds (FBI dogs) at rest and graded levels of exercise on a treadmill up to 60-65% of maximal aerobic capacity. To test for effects of training, HR and V˙O2 were measured in female dogs, before and after a training period, and after an adjacent training pause to test for reversibility of potential effects. Least squares regression was applied to describe the relationship between HR and V˙O2. The applied training had no statistically significant effect on the HR-V˙O2 regression. A general regression line from all data collected was prepared to establish a general predictive equation for energy expenditure from HR in FBI dogs. The regression equation established in this study enables fast estimation of energy requirement for running activity. The equation is valid for large dogs weighing around 30 kg that run at ground level up to 15 km/h with a heart rate maximum of 190 bpm irrespective of the training level. Journal of Animal Physiology and Animal Nutrition © 2015 Blackwell Verlag GmbH.
ERIC Educational Resources Information Center
Pek, Jolynn; Chalmers, R. Philip; Kok, Bethany E.; Losardo, Diane
2015-01-01
Structural equation mixture models (SEMMs), when applied as a semiparametric model (SPM), can adequately recover potentially nonlinear latent relationships without their specification. This SPM is useful for exploratory analysis when the form of the latent regression is unknown. The purpose of this article is to help users familiar with structural…
ERIC Educational Resources Information Center
Crawford, John R.; Garthwaite, Paul H.; Denham, Annie K.; Chelune, Gordon J.
2012-01-01
Regression equations have many useful roles in psychological assessment. Moreover, there is a large reservoir of published data that could be used to build regression equations; these equations could then be employed to test a wide variety of hypotheses concerning the functioning of individual cases. This resource is currently underused because…
NASA Technical Reports Server (NTRS)
Suit, W. T.; Cannaday, R. L.
1979-01-01
The longitudinal and lateral stability and control parameters for a high wing, general aviation, airplane are examined. Estimations using flight data obtained at various flight conditions within the normal range of the aircraft are presented. The estimations techniques, an output error technique (maximum likelihood) and an equation error technique (linear regression), are presented. The longitudinal static parameters are estimated from climbing, descending, and quasi steady state flight data. The lateral excitations involve a combination of rudder and ailerons. The sensitivity of the aircraft modes of motion to variations in the parameter estimates are discussed.
NASA Astrophysics Data System (ADS)
Leroux, Romain; Chatellier, Ludovic; David, Laurent
2018-01-01
This article is devoted to the estimation of time-resolved particle image velocimetry (TR-PIV) flow fields using a time-resolved point measurements of a voltage signal obtained by hot-film anemometry. A multiple linear regression model is first defined to map the TR-PIV flow fields onto the voltage signal. Due to the high temporal resolution of the signal acquired by the hot-film sensor, the estimates of the TR-PIV flow fields are obtained with a multiple linear regression method called orthonormalized partial least squares regression (OPLSR). Subsequently, this model is incorporated as the observation equation in an ensemble Kalman filter (EnKF) applied on a proper orthogonal decomposition reduced-order model to stabilize it while reducing the effects of the hot-film sensor noise. This method is assessed for the reconstruction of the flow around a NACA0012 airfoil at a Reynolds number of 1000 and an angle of attack of {20}°. Comparisons with multi-time delay-modified linear stochastic estimation show that both the OPLSR and EnKF combined with OPLSR are more accurate as they produce a much lower relative estimation error, and provide a faithful reconstruction of the time evolution of the velocity flow fields.
Variability of creatinine measurements in clinical laboratories: results from the CRIC study.
Joffe, Marshall; Hsu, Chi-yuan; Feldman, Harold I; Weir, Matthew; Landis, J R; Hamm, L Lee
2010-01-01
Estimating equations using serum creatinine (SCr) are often used to assess glomerular filtration rate (GFR). Such creatinine (Cr)-based formulae may produce biased estimates of GFR when using Cr measurements that have not been calibrated to reference laboratories. In this paper, we sought to examine the degree of this variation in Cr assays in several laboratories associated with academic medical centers affiliated with the Chronic Renal Insufficiency Cohort (CRIC) Study; to consider how best to correct for this variation, and to quantify the impact of such corrections on eligibility for participation in CRIC. Variability of Cr is of particular concern in the conduct of CRIC, a large multicenter study of subjects with chronic renal disease, because eligibility for the study depends on Cr-based assessment of GFR. A library of 5 large volume plasma specimens from apheresis patients was assembled, representing levels of plasma Cr from 0.8 to 2.4 mg/dl. Samples from this library were used for measurement of Cr at each of the 14 CRIC laboratories repetitively over time. We used graphical displays and linear regression methods to examine the variability in Cr, and used linear regression to develop calibration equations. We also examined the impact of the various calibration equations on the proportion of subjects screened as potential participants who were actually eligible for the study. There was substantial variability in Cr assays across laboratories and over time. We developed calibration equations for each laboratory; these equations varied substantially among laboratories and somewhat over time in some laboratories. The laboratory site contributed the most to variability (51% of the variance unexplained by the specimen) and variation with time accounted for another 15%. In some laboratories, calibration equations resulted in differences in eligibility for CRIC of as much as 20%. The substantial variability in SCr assays across laboratories necessitates calibration of SCr measures to a common standard. Failing to do so may substantially affect study eligibility and clinical interpretations when they are determined by Cr-based estimates of GFR. 2010 S. Karger AG, Basel.
Weaver, J. Curtis; Feaster, Toby D.; Gotvald, Anthony J.
2009-01-01
Reliable estimates of the magnitude and frequency of floods are required for the economical and safe design of transportation and water-conveyance structures. A multistate approach was used to update methods for estimating the magnitude and frequency of floods in rural, ungaged basins in North Carolina, South Carolina, and Georgia that are not substantially affected by regulation, tidal fluctuations, or urban development. In North Carolina, annual peak-flow data available through September 2006 were available for 584 sites; 402 of these sites had a total of 10 or more years of systematic record that is required for at-site, flood-frequency analysis. Following data reviews and the computation of 20 physical and climatic basin characteristics for each station as well as at-site flood-frequency statistics, annual peak-flow data were identified for 363 sites in North Carolina suitable for use in this analysis. Among these 363 sites, 19 sites had records that could be divided into unregulated and regulated/ channelized annual peak discharges, which means peak-flow records were identified for a total of 382 cases in North Carolina. Considering the 382 cases, at-site flood-frequency statistics are provided for 333 unregulated cases (also used for the regression database) and 49 regulated/channelized cases. The flood-frequency statistics for the 333 unregulated sites were combined with data for sites from South Carolina, Georgia, and adjacent parts of Alabama, Florida, Tennessee, and Virginia to create a database of 943 sites considered for use in the regional regression analysis. Flood-frequency statistics were computed by fitting logarithms (base 10) of the annual peak flows to a log-Pearson Type III distribution. As part of the computation process, a new generalized skew coefficient was developed by using a Bayesian generalized least-squares regression model. Exploratory regression analyses using ordinary least-squares regression completed on the initial database of 943 sites resulted in defining five hydrologic regions for North Carolina, South Carolina, and Georgia. Stations with drainage areas less than 1 square mile were removed from the database, and a procedure to examine for basin redundancy (based on drainage area and periods of record) also resulted in the removal of some stations from the regression database. Flood-frequency estimates and basin characteristics for 828 gaged stations were combined to form the final database that was used in the regional regression analysis. Regional regression analysis, using generalized least-squares regression, was used to develop a set of predictive equations that can be used for estimating the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent chance exceedance flows for rural ungaged, basins in North Carolina, South Carolina, and Georgia. The final predictive equations are all functions of drainage area and the percentage of drainage basin within each of the five hydrologic regions. Average errors of prediction for these regression equations range from 34.0 to 47.7 percent. Discharge estimates determined from the systematic records for the current study are, on average, larger in magnitude than those from a previous study for the highest percent chance exceedances (50 and 20 percent) and tend to be smaller than those from the previous study for the lower percent chance exceedances when all sites are considered as a group. For example, mean differences for sites in the Piedmont hydrologic region range from positive 0.5 percent for the 50-percent chance exceedance flow to negative 4.6 percent for the 0.2-percent chance exceedance flow when stations are grouped by hydrologic region. Similarly for the same hydrologic region, median differences range from positive 0.9 percent for the 50-percent chance exceedance flow to negative 7.1 percent for the 0.2-percent chance exceedance flow. However, mean and median percentage differences between the estimates from the previous and curre
Development of LACIE CCEA-1 weather/wheat yield models. [regression analysis
NASA Technical Reports Server (NTRS)
Strommen, N. D.; Sakamoto, C. M.; Leduc, S. K.; Umberger, D. E. (Principal Investigator)
1979-01-01
The advantages and disadvantages of the casual (phenological, dynamic, physiological), statistical regression, and analog approaches to modeling for grain yield are examined. Given LACIE's primary goal of estimating wheat production for the large areas of eight major wheat-growing regions, the statistical regression approach of correlating historical yield and climate data offered the Center for Climatic and Environmental Assessment the greatest potential return within the constraints of time and data sources. The basic equation for the first generation wheat-yield model is given. Topics discussed include truncation, trend variable, selection of weather variables, episodic events, strata selection, operational data flow, weighting, and model results.
McCarthy, Peter M.
2006-01-01
The Yellowstone River is very important in a variety of ways to the residents of southeastern Montana; however, it is especially vulnerable to spilled contaminants. In 2004, the U.S. Geological Survey, in cooperation with Montana Department of Environmental Quality, initiated a study to develop a computer program to rapidly estimate instream travel times and concentrations of a potential contaminant in the Yellowstone River using regression equations developed in 1999 by the U.S. Geological Survey. The purpose of this report is to describe these equations and their limitations, describe the development of a computer program to apply the equations to the Yellowstone River, and provide detailed instructions on how to use the program. This program is available online at [http://pubs.water.usgs.gov/sir2006-5057/includes/ytot.xls]. The regression equations provide estimates of instream travel times and concentrations in rivers where little or no contaminant-transport data are available. Equations were developed and presented for the most probable flow velocity and the maximum probable flow velocity. These velocity estimates can then be used to calculate instream travel times and concentrations of a potential contaminant. The computer program was developed so estimation equations for instream travel times and concentrations can be solved quickly for sites along the Yellowstone River between Corwin Springs and Sidney, Montana. The basic types of data needed to run the program are spill data, streamflow data, and data for locations of interest along the Yellowstone River. Data output from the program includes spill location, river mileage at specified locations, instantaneous discharge, mean-annual discharge, drainage area, and channel slope. Travel times and concentrations are provided for estimates of the most probable velocity of the peak concentration and the maximum probable velocity of the peak concentration. Verification of estimates of instream travel times and concentrations for the Yellowstone River requires information about the flow velocity throughout the 520 mi of river in the study area. Dye-tracer studies would provide the best data about flow velocities and would provide the best verification of instream travel times and concentrations estimated from this computer program; however, data from such studies does not currently (2006) exist and new studies would be expensive and time-consuming. An alternative approach used in this study for verification of instream travel times is based on the use of flood-wave velocities determined from recorded streamflow hydrographs at selected mainstem streamflow-gaging stations along the Yellowstone River. The ratios of flood-wave velocity to the most probable velocity for the base flow estimated from the computer program are within the accepted range of 2.5 to 4.0 and indicate that flow velocities estimated from the computer program are reasonable for the Yellowstone River. The ratios of flood-wave velocity to the maximum probable velocity are within a range of 1.9 to 2.8 and indicate that the maximum probable flow velocities estimated from the computer program, which corresponds to the shortest travel times and maximum probable concentrations, are conservative and reasonable for the Yellowstone River.
Schell, Greggory J; Lavieri, Mariel S; Stein, Joshua D; Musch, David C
2013-12-21
Open-angle glaucoma (OAG) is a prevalent, degenerate ocular disease which can lead to blindness without proper clinical management. The tests used to assess disease progression are susceptible to process and measurement noise. The aim of this study was to develop a methodology which accounts for the inherent noise in the data and improve significant disease progression identification. Longitudinal observations from the Collaborative Initial Glaucoma Treatment Study (CIGTS) were used to parameterize and validate a Kalman filter model and logistic regression function. The Kalman filter estimates the true value of biomarkers associated with OAG and forecasts future values of these variables. We develop two logistic regression models via generalized estimating equations (GEE) for calculating the probability of experiencing significant OAG progression: one model based on the raw measurements from CIGTS and another model based on the Kalman filter estimates of the CIGTS data. Receiver operating characteristic (ROC) curves and associated area under the ROC curve (AUC) estimates are calculated using cross-fold validation. The logistic regression model developed using Kalman filter estimates as data input achieves higher sensitivity and specificity than the model developed using raw measurements. The mean AUC for the Kalman filter-based model is 0.961 while the mean AUC for the raw measurements model is 0.889. Hence, using the probability function generated via Kalman filter estimates and GEE for logistic regression, we are able to more accurately classify patients and instances as experiencing significant OAG progression. A Kalman filter approach for estimating the true value of OAG biomarkers resulted in data input which improved the accuracy of a logistic regression classification model compared to a model using raw measurements as input. This methodology accounts for process and measurement noise to enable improved discrimination between progression and nonprogression in chronic diseases.
Omang, R.J.; Parrett, Charles; Hull, J.A.
1983-01-01
Equations using channel-geometry measurements were developed for estimating mean runoff and peak flows of ungaged streams in southeastern Montana. Two separate sets of esitmating equations were developed for determining mean annual runoff: one for perennial streams and one for ephemeral and intermittent streams. Data from 29 gaged sites on perennial streams and 21 gaged sites on ephemeral and intermittent streams were used in these analyses. Data from 78 gaged sites were used in the peak-flow analyses. Southeastern Montana was divided into three regions and separate multiple-regression equations for each region were developed that relate channel dimensions to peak discharge having recurrence intervals of 2, 5, 10, 25, 50, and 100 years. Channel-geometery relations were developed using measurements of the active-channel width and bankfull width. Active-channel width and bankfull width were the most significant channel features for estimating mean annual runoff for al types of streams. Use of this method requires that onsite measurements be made of channel width. The standard error of estimate for predicting mean annual runoff ranged from about 38 to 79 percent. The standard error of estimate relating active-channel width or bankfull width to peak flow ranged from about 37 to 115 percent. (USGS)
The evaluation of the National Long Term Care Demonstration. 2. Estimation methodology.
Brown, R S
1988-01-01
Channeling effects were estimated by comparing the post-application experience of the treatment and control groups using multiple regression. A variety of potential threats to the validity of the results, including sample composition issues, data issues, and estimation issues, were identified and assessed. Of all the potential problems examined, the only one determined to be likely to cause widespread distortion of program impact estimates was noncomparability of the baseline data. To avoid this distortion, baseline variables judged to be noncomparably measured were excluded from use as control variables in the regression equation. (Where they existed, screen counterparts to these noncomparable baseline variables were used as substitutes.) All of the other potential problems with the sample, data, or regression estimation approach were found to have little or no actual effect on impact estimates or their interpretation. Broad implementation of special procedures, therefore, was not necessary. The study did find that, because of the frequent use of proxy respondents, the estimated effects of channeling on clients' well-being actually may reflect impacts on the well-being of the informal caregiver rather than the client. This and other isolated cases in which there was some evidence of a potential problem for specific outcome variables were identified and examined in detail in technical reports dealing with those outcomes. Where appropriate, alternative estimates were presented. PMID:3130329
An Estimate of Avian Mortality at Communication Towers in the United States and Canada
Longcore, Travis; Rich, Catherine; Mineau, Pierre; MacDonald, Beau; Bert, Daniel G.; Sullivan, Lauren M.; Mutrie, Erin; Gauthreaux, Sidney A.; Avery, Michael L.; Crawford, Robert L.; Manville, Albert M.; Travis, Emilie R.; Drake, David
2012-01-01
Avian mortality at communication towers in the continental United States and Canada is an issue of pressing conservation concern. Previous estimates of this mortality have been based on limited data and have not included Canada. We compiled a database of communication towers in the continental United States and Canada and estimated avian mortality by tower with a regression relating avian mortality to tower height. This equation was derived from 38 tower studies for which mortality data were available and corrected for sampling effort, search efficiency, and scavenging where appropriate. Although most studies document mortality at guyed towers with steady-burning lights, we accounted for lower mortality at towers without guy wires or steady-burning lights by adjusting estimates based on published studies. The resulting estimate of mortality at towers is 6.8 million birds per year in the United States and Canada. Bootstrapped subsampling indicated that the regression was robust to the choice of studies included and a comparison of multiple regression models showed that incorporating sampling, scavenging, and search efficiency adjustments improved model fit. Estimating total avian mortality is only a first step in developing an assessment of the biological significance of mortality at communication towers for individual species or groups of species. Nevertheless, our estimate can be used to evaluate this source of mortality, develop subsequent per-species mortality estimates, and motivate policy action. PMID:22558082
An estimate of avian mortality at communication towers in the United States and Canada.
Longcore, Travis; Rich, Catherine; Mineau, Pierre; MacDonald, Beau; Bert, Daniel G; Sullivan, Lauren M; Mutrie, Erin; Gauthreaux, Sidney A; Avery, Michael L; Crawford, Robert L; Manville, Albert M; Travis, Emilie R; Drake, David
2012-01-01
Avian mortality at communication towers in the continental United States and Canada is an issue of pressing conservation concern. Previous estimates of this mortality have been based on limited data and have not included Canada. We compiled a database of communication towers in the continental United States and Canada and estimated avian mortality by tower with a regression relating avian mortality to tower height. This equation was derived from 38 tower studies for which mortality data were available and corrected for sampling effort, search efficiency, and scavenging where appropriate. Although most studies document mortality at guyed towers with steady-burning lights, we accounted for lower mortality at towers without guy wires or steady-burning lights by adjusting estimates based on published studies. The resulting estimate of mortality at towers is 6.8 million birds per year in the United States and Canada. Bootstrapped subsampling indicated that the regression was robust to the choice of studies included and a comparison of multiple regression models showed that incorporating sampling, scavenging, and search efficiency adjustments improved model fit. Estimating total avian mortality is only a first step in developing an assessment of the biological significance of mortality at communication towers for individual species or groups of species. Nevertheless, our estimate can be used to evaluate this source of mortality, develop subsequent per-species mortality estimates, and motivate policy action.
Estimation of health effects of prenatal methylmercury exposure using structural equation models.
Budtz-Jørgensen, Esben; Keiding, Niels; Grandjean, Philippe; Weihe, Pal
2002-10-14
Observational studies in epidemiology always involve concerns regarding validity, especially measurement error, confounding, missing data, and other problems that may affect the study outcomes. Widely used standard statistical techniques, such as multiple regression analysis, may to some extent adjust for these shortcomings. However, structural equations may incorporate most of these considerations, thereby providing overall adjusted estimations of associations. This approach was used in a large epidemiological data set from a prospective study of developmental methyl-mercury toxicity. Structural equation models were developed for assessment of the association between biomarkers of prenatal mercury exposure and neuropsychological test scores in 7 year old children. Eleven neurobehavioral outcomes were grouped into motor function and verbally mediated function. Adjustment for local dependence and item bias was necessary for a satisfactory fit of the model, but had little impact on the estimated mercury effects. The mercury effect on the two latent neurobehavioral functions was similar to the strongest effects seen for individual test scores of motor function and verbal skills. Adjustment for contaminant exposure to poly chlorinated biphenyls (PCBs) changed the estimates only marginally, but the mercury effect could be reduced to non-significance by assuming a large measurement error for the PCB biomarker. The structural equation analysis allows correction for measurement error in exposure variables, incorporation of multiple outcomes and incomplete cases. This approach therefore deserves to be applied more frequently in the analysis of complex epidemiological data sets.
Going, S.; Nichols, J.; Loftin, M.; Stewart, D.; Lohman, T.; Tuuri, G.; Ring, K.; Pickrel, J.; Blew, R.; J.Stevens
2007-01-01
Aim Equations for estimating % fat mass (%BF) and fat-free mass (FFM) from bioelectrical impedance analysis (BIA) that work in adolescent girls from different racial/ethnic backgrounds are not available. We investigated whether race/ethnicity influences estimation of body composition in adolescent girls. Principal procedures Prediction equations were developed for estimating FFM and %BF from BIA in 166 girls, 10–15 years old, consisting of 51 Black (B), 45 non-Black Hispanic (H), 55 non-Hispanic White (W) and 15 mixed (M) race/ethnicity girls, using dual energy x-ray absorptiometry (DXA) as the criterion method. Findings Black girls had similar %BF compared to other groups, yet were heavier per unit of height according to body mass index (BMI: kg·m−2) due to significantly greater FFM. BIA resistance index, age, weight and race/ethnicity were all significant predictors of FFM (R2 = 0.92, SEE = 1.81 kg). Standardized regression coefficients showed resistance index (0.63) and weight (0.34) were the most important predictors of FFM. Errors in %BF (~2%) and FFM (~1.0 kg) were greater when race/ethnicity was not included in the equation, particularly in Black girls. We conclude the BIA-composition relationship in adolescent girls is influenced by race, and consequently have developed new BIA equations for adolescent girls for predicting FFM and %BF. PMID:17848976
Saunders, Christina T; Blume, Jeffrey D
2017-10-26
Mediation analysis explores the degree to which an exposure's effect on an outcome is diverted through a mediating variable. We describe a classical regression framework for conducting mediation analyses in which estimates of causal mediation effects and their variance are obtained from the fit of a single regression model. The vector of changes in exposure pathway coefficients, which we named the essential mediation components (EMCs), is used to estimate standard causal mediation effects. Because these effects are often simple functions of the EMCs, an analytical expression for their model-based variance follows directly. Given this formula, it is instructive to revisit the performance of routinely used variance approximations (e.g., delta method and resampling methods). Requiring the fit of only one model reduces the computation time required for complex mediation analyses and permits the use of a rich suite of regression tools that are not easily implemented on a system of three equations, as would be required in the Baron-Kenny framework. Using data from the BRAIN-ICU study, we provide examples to illustrate the advantages of this framework and compare it with the existing approaches. © The Author 2017. Published by Oxford University Press.
Mohammadpour, A-H; Nazemian, F; Abtahi, B; Naghibi, M; Gholami, K; Rezaee, S; Nazari, M-R A; Rajabi, O
2008-12-01
Area under the concentration curve (AUC) of mycophenolic acid (MPA) could help to optimize therapeutic drug monitoring during the early post-renal transplant period. The aim of this study was to develop a limited sampling strategy to estimate an abbreviated MPA AUC within the first month after renal transplantation. In this study we selected 19 patients in the early posttransplant period with normal renal graft function (glomerular filtration rate > 70 mL/min). Plasma MPA concentrations were measured using reverse-phase high-performance liquid chromatography. MPA AUC(0-12h) was calculated using the linear trapezoidal rule. Multiple stepwise regression analysis was used to determine the minimal and convenient time points of MPA levels that could be used to derive model equations best fitted to MPA AUC(0-12h). The regression equation for AUC estimation that gave the best performance was AUC = 14.46 C(10) + 15.547 (r(2) = .882). The validation of the method was performed using the jackknife method. Mean prediction error of this model was not different from zero (P > .05) and had a high root mean square prediction error (8.06). In conclusion, this limited sampling strategy provided an effective approach for therapeutic drug monitoring during the early posttransplant period.
Kokkinos, Peter; Kaminsky, Leonard A; Arena, Ross; Zhang, Jiajia; Myers, Jonathan
2017-08-15
Impaired cardiorespiratory fitness (CRF) is closely linked to chronic illness and associated with adverse events. The American College of Sports Medicine (ACSM) regression equations (ACSM equations) developed to estimate oxygen uptake have known limitations leading to well-documented overestimation of CRF, especially at higher work rates. Thus, there is a need to explore alternative equations to more accurately predict CRF. We assessed maximal oxygen uptake (VO 2 max) obtained directly by open-circuit spirometry in 7,983 apparently healthy subjects who participated in the Fitness Registry and the Importance of Exercise National Database (FRIEND). We randomly sampled 70% of the participants from each of the following age categories: <40, 40 to 50, 50 to 70, and ≥70 and used the remaining 30% for validation. Multivariable linear regression analysis was applied to identify the most relevant variables and construct the best prediction model for VO 2 max. Treadmill speed and treadmill speed × grade were considered in the final model as predictors of measured VO 2 max and the following equation was generated: VO 2 max in ml O 2 /kg/min = speed (m/min) × (0.17 + fractional grade × 0.79) + 3.5. The FRIEND equation predicted VO 2 max with an overall error >4 times lower than the error associated with the traditional ACSM equations (5.1 ± 18.3% vs 21.4 ± 24.9%, respectively). Overestimation associated with the ACSM equation was accentuated when different protocols were considered separately. In conclusion, The FRIEND equation predicts VO 2 max more precisely than the traditional ACSM equations with an overall error >4 times lower than that associated with the ACSM equations. Published by Elsevier Inc.
Gazoorian, Christopher L.
2015-01-01
A graphical user interface, with an integrated spreadsheet summary report, has been developed to estimate and display the daily mean streamflows and statistics and to evaluate different water management or water withdrawal scenarios with the estimated monthly data. This package of regression equations, U.S. Geological Survey streamgage data, and spreadsheet application produces an interactive tool to estimate an unaltered daily streamflow hydrograph and streamflow statistics at ungaged sites in New York. Among other uses, the New York Streamflow Estimation Tool can assist water managers with permitting water withdrawals, implementing habitat protection, estimating contaminant loads, or determining the potential affect from chemical spills.
Analysis of cohort studies with multivariate and partially observed disease classification data.
Chatterjee, Nilanjan; Sinha, Samiran; Diver, W Ryan; Feigelson, Heather Spencer
2010-09-01
Complex diseases like cancers can often be classified into subtypes using various pathological and molecular traits of the disease. In this article, we develop methods for analysis of disease incidence in cohort studies incorporating data on multiple disease traits using a two-stage semiparametric Cox proportional hazards regression model that allows one to examine the heterogeneity in the effect of the covariates by the levels of the different disease traits. For inference in the presence of missing disease traits, we propose a generalization of an estimating equation approach for handling missing cause of failure in competing-risk data. We prove asymptotic unbiasedness of the estimating equation method under a general missing-at-random assumption and propose a novel influence-function-based sandwich variance estimator. The methods are illustrated using simulation studies and a real data application involving the Cancer Prevention Study II nutrition cohort.
Determination of suitable drying curve model for bread moisture loss during baking
NASA Astrophysics Data System (ADS)
Soleimani Pour-Damanab, A. R.; Jafary, A.; Rafiee, S.
2013-03-01
This study presents mathematical modelling of bread moisture loss or drying during baking in a conventional bread baking process. In order to estimate and select the appropriate moisture loss curve equation, 11 different models, semi-theoretical and empirical, were applied to the experimental data and compared according to their correlation coefficients, chi-squared test and root mean square error which were predicted by nonlinear regression analysis. Consequently, of all the drying models, a Page model was selected as the best one, according to the correlation coefficients, chi-squared test, and root mean square error values and its simplicity. Mean absolute estimation error of the proposed model by linear regression analysis for natural and forced convection modes was 2.43, 4.74%, respectively.
Bent, Gardner C.; Steeves, Peter A.
2006-01-01
A revised logistic regression equation and an automated procedure were developed for mapping the probability of a stream flowing perennially in Massachusetts. The equation provides city and town conservation commissions and the Massachusetts Department of Environmental Protection a method for assessing whether streams are intermittent or perennial at a specific site in Massachusetts by estimating the probability of a stream flowing perennially at that site. This information could assist the environmental agencies who administer the Commonwealth of Massachusetts Rivers Protection Act of 1996, which establishes a 200-foot-wide protected riverfront area extending from the mean annual high-water line along each side of a perennial stream, with exceptions for some urban areas. The equation was developed by relating the observed intermittent or perennial status of a stream site to selected basin characteristics of naturally flowing streams (defined as having no regulation by dams, surface-water withdrawals, ground-water withdrawals, diversion, wastewater discharge, and so forth) in Massachusetts. This revised equation differs from the equation developed in a previous U.S. Geological Survey study in that it is solely based on visual observations of the intermittent or perennial status of stream sites across Massachusetts and on the evaluation of several additional basin and land-use characteristics as potential explanatory variables in the logistic regression analysis. The revised equation estimated more accurately the intermittent or perennial status of the observed stream sites than the equation from the previous study. Stream sites used in the analysis were identified as intermittent or perennial based on visual observation during low-flow periods from late July through early September 2001. The database of intermittent and perennial streams included a total of 351 naturally flowing (no regulation) sites, of which 85 were observed to be intermittent and 266 perennial. Stream sites included in the database had drainage areas that ranged from 0.04 to 10.96 square miles. Of the 66 stream sites with drainage areas greater than 2.00 square miles, 2 sites were intermittent and 64 sites were perennial. Thus, stream sites with drainage areas greater than 2.00 square miles were assumed to flow perennially, and the database used to develop the logistic regression equation included only those stream sites with drainage areas less than 2.00 square miles. The database for the equation included 285 stream sites that had drainage areas less than 2.00 square miles, of which 83 sites were intermittent and 202 sites were perennial. Results of the logistic regression analysis indicate that the probability of a stream flowing perennially at a specific site in Massachusetts can be estimated as a function of four explanatory variables: (1) drainage area (natural logarithm), (2) areal percentage of sand and gravel deposits, (3) areal percentage of forest land, and (4) region of the state (eastern region or western region). Although the equation provides an objective means of determining the probability of a stream flowing perennially at a specific site, the reliability of the equation is constrained by the data used in its development. The equation is not recommended for (1) losing stream reaches or (2) streams whose ground-water contributing areas do not coincide with their surface-water drainage areas, such as many streams draining the Southeast Coastal Region-the southern part of the South Coastal Basin, the eastern part of the Buzzards Bay Basin, and the entire area of the Cape Cod and the Islands Basins. If the equation were used on a regulated stream site, the estimated intermittent or perennial status would reflect the natural flow conditions for that site. An automated mapping procedure was developed to determine the intermittent or perennial status of stream sites along reaches throughout a basin. The procedure delineates the drainage area boundaries, determines values for the four explanatory variables, and solves the equation for estimating the probability of a stream flowing perennially at two locations on a headwater (first-order) stream reach-one near its confluence or end point and one near its headwaters or start point. The automated procedure then determines the intermittent or perennial status of the reach on the basis of the calculated probability values and a probability cutpoint (a stream is considered to flow perennially at a cutpoint of 0.56 or greater for this study) for the two locations or continues to loop upstream or downstream between locations less than and greater than the cutpoint of 0.56 to determine the transition point from an intermittent to a perennial stream. If the first-order stream reach is determined to be intermittent, the procedure moves to the next downstream reach and repeats the same process. The automated procedure then moves to the next first-order stream and repeats the process until the entire basin is mapped. A map of the intermittent and perennial stream reaches in the Shawsheen River Basin is provided on a CD-ROM that accompanies this report. The CD-ROM also contains ArcReader 9.0, a freeware product, that allows a user to zoom in and out, set a scale, pan, turn on and off map layers (such as a USGS topographic map), and print a map of the stream site with a scale bar. Maps of the intermittent and perennial stream reaches in Massachusetts will provide city and town conservation commissions and the Massachusetts Department of Environmental Protection with an additional method for assessing the intermittent or perennial status of stream sites.
2017-01-01
The U.S. Energy Information Administration's Short-Term Energy Outlook (STEO) produces monthly projections of energy supply, demand, trade, and prices over a 13-24 month period. Every January, the forecast horizon is extended through December of the following year. The STEO model is an integrated system of econometric regression equations and identities that link data on the various components of the U.S. energy industry together in order to develop consistent forecasts. The regression equations are estimated and the STEO model is solved using the EViews 9.5 econometric software package from IHS Global Inc. The model consists of various modules specific to each energy resource. All modules provide projections for the United States, and some modules provide more detailed forecasts for different regions of the country.
An overview of longitudinal data analysis methods for neurological research.
Locascio, Joseph J; Atri, Alireza
2011-01-01
The purpose of this article is to provide a concise, broad and readily accessible overview of longitudinal data analysis methods, aimed to be a practical guide for clinical investigators in neurology. In general, we advise that older, traditional methods, including (1) simple regression of the dependent variable on a time measure, (2) analyzing a single summary subject level number that indexes changes for each subject and (3) a general linear model approach with a fixed-subject effect, should be reserved for quick, simple or preliminary analyses. We advocate the general use of mixed-random and fixed-effect regression models for analyses of most longitudinal clinical studies. Under restrictive situations or to provide validation, we recommend: (1) repeated-measure analysis of covariance (ANCOVA), (2) ANCOVA for two time points, (3) generalized estimating equations and (4) latent growth curve/structural equation models.
Energy metabolism and hematology of white-tailed deer fawns
Rawson, R.E.; DelGiudice, G.D.; Dziuk, H.E.; Mech, L.D.
1992-01-01
Resting metabolic rates, weight gains and hematologic profiles of six newborn, captive white-tailed deer (Odocoileus virginianus) fawns (four females, two males) were determined during the first 3 mo of life. Estimated mean daily weight gain of fawns was 0.2 kg. The regression equation for metabolic rate was: Metabolic rate (kcal/kg0.75/day) = 56.1 +/- 1.3 (age in days), r = 0.65, P less than 0.001). Regression equations were also used to relate age to red blood cell count (RBC), hemoglobin concentration (Hb), packed cell volume, white blood cell count, mean corpuscular volume, mean corpuscular hemoglobin concentration (MCHC), and mean corpuscular hemoglobin. The age relationships of Hb, MCHC, and smaller RBC's were indicative of an increasing and more efficient oxygen-carrying and exchange capacity to fulfill the increasing metabolic demands for oxygen associated with increasing body size.
NASA Technical Reports Server (NTRS)
Rango, A.; Salomonson, V. V.; Foster, J. L.
1975-01-01
Low resolution meteorological satellite and high resolution earth resources satellite data were used to map snowcovered area over the upper Indus River and the Wind River Mountains of Wyoming, respectively. For the Indus River, early Spring snowcovered area was extracted and related to April through June streamflow from 1967-1971 using a regression equation. Composited results from two years of data over seven Wind River Mountain watersheds indicated that LANDSAT-1 snowcover observations, separated on the basis of watershed elevation, could also be related to runoff in significant regression equations. It appears that earth resources satellite data will be useful in assisting in the prediction of seasonal streamflow for various water resources applications, nonhazardous collection of snow data from restricted-access areas, and in hydrologic modeling of snowmelt runoff.
Hegab, Moustafa; Midan, Mahmoud Farouk; Taha, Tamer; Bibars, Mamdouh; Wakeel, Khaled Helmi El; Amer, Hesham; Azmy, Osama
2018-05-20
To construct new fetal biometric charts and equations for some fetal biometric parameters for women between 12 th and 41 st weeks living in Ismailia and Port Said Governorates in Egypt. This cross-sectional study was carried out on 656 Egyptian women (from Ismailia and Port Said governorates) with an uncomplicated pregnancy, and all were sure of their dates. The selected group was between the 12 th and 41 st weeks of gestation, recruited from the district general hospital in Ismailia and Port Said to measure ultrasonographically biparietal diameter (BPD), head circumference (HC), abdominal circumference (AC) and femur length (FL), then for each measurement separate regression models were fitted to estimate both the mean and the Standard deviation at each gestational age. New Egyptian charts were reported for BPD, HC, AC, and FL. Reference equations for the dating of pregnancy were presented. The mean of the previous measurements at 12 th and 41 st weeks were as follows: (23.37, 98.72), (83.05, 336.12), (67.85, 332.57) and (12.50, 74.92) respectively. New fetal biometric charts and regression equations for pregnant women living in Port Said & Ismailia governorates in Egypt.
Whetstone, B.H.
1982-01-01
A program to collect and analyze flood data from small streams in South Carolina was conducted from 1967-75, as a cooperative research project with the South Carolina Department of Highways and Public Transportation and the Federal Highway Administration. As a result of that program, a technique is presented for estimating the magnitude and frequency of floods on small streams in South Carolina with drainage areas ranging in size from 1 to 500 square miles. Peak-discharge data from 74 stream-gaging stations (25 small streams were synthesized, whereas 49 stations had long-term records) were used in multiple regression procedures to obtain equations for estimating magnitude of floods having recurrence intervals of 10, 25, 50, and 100 years on small natural streams. The significant independent variable was drainage area. Equations were developed for the three physiographic provinces of South Carolina (Coastal Plain, Piedmont, and Blue Ridge) and can be used for estimating floods on small streams. (USGS)
Faria, Franciane Rocha; Faria, Eliane Rodrigues; Cecon, Roberta Stofeles; Barbosa Júnior, Djalma Adão; Franceschini, Sylvia do Carmo Castro; Peluzio, Maria do Carmo Gouveia; Ribeiro, Andréia Queiroz; Lira, Pedro Israel Cabral; Cecon, Paulo Roberto; Priore, Silvia Eloiza
2013-01-01
The aim of this study was to analyze body fat anthropometric equations and electrical bioimpedance analysis (BIA) in the prediction of cardiovascular risk factors in eutrophic and overweight adolescents. 210 adolescents were divided into eutrophic group (G1) and overweight group (G2). The percentage of body fat (% BF) was estimated using 10 body fat anthropometric equations and 2 BIA. We measured lipid profiles, uric acid, insulin, fasting glucose, homeostasis model assessment-insulin resistance (HOMA-IR), and blood pressure. We found that 76.7% of the adolescents exhibited inadequacy of at least one biochemical parameter or clinical cardiovascular risk. Higher values of triglycerides (TG) (P = 0.001), insulin, and HOMA-IR (P < 0.001) were observed in the G2 adolescents. In multivariate linear regression analysis, the % BF from equation (5) was associated with TG, diastolic blood pressure, and insulin in G1. Among the G2 adolescents, the % BF estimated by (5) and (9) was associated with LDL, TG, insulin, and the HOMA-IR. Body fat anthropometric equations were associated with cardiovascular risk factors and should be used to assess the nutritional status of adolescents. In this study, equation (5) was associated with a higher number of cardiovascular risk factors independent of the nutritional status of adolescents. PMID:23762051
Liu, A; Byrne, N M; Ma, G; Nasreddine, L; Trinidad, T P; Kijboonchoo, K; Ismail, M N; Kagawa, M; Poh, B K; Hills, A P
2011-12-01
To develop and cross-validate bioelectrical impedance analysis (BIA) prediction equations of total body water (TBW) and fat-free mass (FFM) for Asian pre-pubertal children from China, Lebanon, Malaysia, Philippines and Thailand. Height, weight, age, gender, resistance and reactance measured by BIA were collected from 948 Asian children (492 boys and 456 girls) aged 8-10 years from the five countries. The deuterium dilution technique was used as the criterion method for the estimation of TBW and FFM. The BIA equations were developed using stepwise multiple regression analysis and cross-validated using the Bland-Altman approach. The BIA prediction equation for the estimation of TBW was as follows: TBW=0.231 × height(2)/resistance+0.066 × height+0.188 × weight+0.128 × age+0.500 × sex-0.316 × Thais-4.574 (R (2)=88.0%, root mean square error (RMSE)=1.3 kg), and for the estimation of FFM was as follows: FFM=0.299 × height(2)/resistance+0.086 × height+0.245 × weight+0.260 × age+0.901 × sex-0.415 × ethnicity (Thai ethnicity =1, others = 0)-6.952 (R (2)=88.3%, RMSE=1.7 kg). No significant difference between measured and predicted values for the whole cross-validation sample was found. However, the prediction equation for estimation of TBW/FFM tended to overestimate TBW/FFM at lower levels whereas underestimate at higher levels of TBW/FFM. Accuracy of the general equation for TBW and FFM was also valid at each body mass index category. Ethnicity influences the relationship between BIA and body composition in Asian pre-pubertal children. The newly developed BIA prediction equations are valid for use in Asian pre-pubertal children.
Ortiz-Hernández, Luis; Vega López, A Valeria; Ramos-Ibáñez, Norma; Cázares Lara, L Joana; Medina Gómez, R Joab; Pérez-Salgado, Diana
To develop and validate equations to estimate the percentage of body fat of children and adolescents from Mexico using anthropometric measurements. A cross-sectional study was carried out with 601 children and adolescents from Mexico aged 5-19 years. The participants were randomly divided into the following two groups: the development sample (n=398) and the validation sample (n=203). The validity of previously published equations (e.g., Slaughter) was also assessed. The percentage of body fat was estimated by dual-energy X-ray absorptiometry. The anthropometric measurements included height, sitting height, weight, waist and arm circumferences, skinfolds (triceps, biceps, subscapular, supra-iliac, and calf), and elbow and bitrochanteric breadth. Linear regression models were estimated with the percentage of body fat as the dependent variable and the anthropometric measurements as the independent variables. Equations were created based on combinations of six to nine anthropometric variables and had coefficients of determination (r 2 ) equal to or higher than 92.4% for boys and 85.8% for girls. In the validation sample, the developed equations had high r 2 values (≥85.6% in boys and ≥78.1% in girls) in all age groups, low standard errors (SE≤3.05% in boys and ≤3.52% in girls), and the intercepts were not different from the origin (p>0.050). Using the previously published equations, the coefficients of determination were lower, and/or the intercepts were different from the origin. The equations developed in this study can be used to assess the percentage of body fat of Mexican schoolchildren and adolescents, as they demonstrate greater validity and lower error compared with previously published equations. Copyright © 2017 Sociedade Brasileira de Pediatria. Published by Elsevier Editora Ltda. All rights reserved.
DOT National Transportation Integrated Search
1999-11-01
Using a fairly large cross-section/time-series data base, covering all provinces of Norway and all months between January 1973 and December 1994, we estimate non-linear (Box-Cox) regression equations explaining aggregate car ownership, road use, seat...
Aaron Weiskittel; Jereme Frank; David Walker; Phil Radtke; David Macfarlane; James Westfall
2015-01-01
Prediction of forest biomass and carbon is becoming important issues in the United States. However, estimating forest biomass and carbon is difficult and relies on empirically-derived regression equations. Based on recent findings from a national gap analysis and comprehensive assessment of the USDA Forest Service Forest Inventory and Analysis (USFS-FIA) component...
Estimating Janka hardness from specific gravity for tropical and temperate species
Michael C. Wiemann; David W. Green
2007-01-01
Using mean values for basic (green) specific gravity and Janka side hardness for individual species obtained from the world literature, regression equations were developed to predict side hardness from specific gravity. Statistical and graphical methods showed that the hardnessâspecific gravity relationship is the same for tropical and temperate hardwoods, but that the...
ERIC Educational Resources Information Center
Haines, Amanda; Spruance, Lori Andersen
2018-01-01
Purpose/Objectives: The study aim was to evaluate parent support for breakfast after the bell programs (BABPs). Methods: Data were collected through an online survey from parents (n=488) of school-aged children enrolled in public schools in Utah. Data were analyzed using generalized estimating equation (GEE) regression methods. Results: Parents…
Demura, S; Sato, S; Kitabayashi, T
2006-06-01
This study examined a method of predicting body density based on hydrostatic weighing without head submersion (HWwithoutHS). Donnelly and Sintek (1984) developed a method to predict body density based on hydrostatic weight without head submersion. This method predicts the difference (D) between HWwithoutHS and hydrostatic weight with head submersion (HWwithHS) from anthropometric variables (head length and head width), and then calculates body density using D as a correction factor. We developed several prediction equations to estimate D based on head anthropometry and differences between the sexes, and compared their prediction accuracy with Donnelly and Sintek's equation. Thirty-two males and 32 females aged 17-26 years participated in the study. Multiple linear regression analysis was performed to obtain the prediction equations, and the systematic errors of their predictions were assessed by Bland-Altman plots. The best prediction equations obtained were: Males: D(g) = -164.12X1 - 125.81X2 - 111.03X3 + 100.66X4 + 6488.63, where X1 = head length (cm), X2 = head circumference (cm), X3 = head breadth (cm), X4 = head thickness (cm) (R = 0.858, R2 = 0.737, adjusted R2 = 0.687, standard error of the estimate = 224.1); Females: D(g) = -156.03X1 - 14.03X2 - 38.45X3 - 8.87X4 + 7852.45, where X1 = head circumference (cm), X2 = body mass (g), X3 = head length (cm), X4 = height (cm) (R = 0.913, R2 = 0.833, adjusted R2 = 0.808, standard error of the estimate = 137.7). The effective predictors in these prediction equations differed from those of Donnelly and Sintek's equation, and head circumference and head length were included in both equations. The prediction accuracy was improved by statistically selecting effective predictors. Since we did not assess cross-validity, the equations cannot be used to generalize to other populations, and further investigation is required.
NASA Astrophysics Data System (ADS)
Khaleghi, Mohammad Reza; Varvani, Javad
2018-02-01
Complex and variable nature of the river sediment yield caused many problems in estimating the long-term sediment yield and problems input into the reservoirs. Sediment Rating Curves (SRCs) are generally used to estimate the suspended sediment load of the rivers and drainage watersheds. Since the regression equations of the SRCs are obtained by logarithmic retransformation and have a little independent variable in this equation, they also overestimate or underestimate the true sediment load of the rivers. To evaluate the bias correction factors in Kalshor and Kashafroud watersheds, seven hydrometric stations of this region with suitable upstream watershed and spatial distribution were selected. Investigation of the accuracy index (ratio of estimated sediment yield to observed sediment yield) and the precision index of different bias correction factors of FAO, Quasi-Maximum Likelihood Estimator (QMLE), Smearing, and Minimum-Variance Unbiased Estimator (MVUE) with LSD test showed that FAO coefficient increases the estimated error in all of the stations. Application of MVUE in linear and mean load rating curves has not statistically meaningful effects. QMLE and smearing factors increased the estimated error in mean load rating curve, but that does not have any effect on linear rating curve estimation.
Semiparametric temporal process regression of survival-out-of-hospital.
Zhan, Tianyu; Schaubel, Douglas E
2018-05-23
The recurrent/terminal event data structure has undergone considerable methodological development in the last 10-15 years. An example of the data structure that has arisen with increasing frequency involves the recurrent event being hospitalization and the terminal event being death. We consider the response Survival-Out-of-Hospital, defined as a temporal process (indicator function) taking the value 1 when the subject is currently alive and not hospitalized, and 0 otherwise. Survival-Out-of-Hospital is a useful alternative strategy for the analysis of hospitalization/survival in the chronic disease setting, with the response variate representing a refinement to survival time through the incorporation of an objective quality-of-life component. The semiparametric model we consider assumes multiplicative covariate effects and leaves unspecified the baseline probability of being alive-and-out-of-hospital. Using zero-mean estimating equations, the proposed regression parameter estimator can be computed without estimating the unspecified baseline probability process, although baseline probabilities can subsequently be estimated for any time point within the support of the censoring distribution. We demonstrate that the regression parameter estimator is asymptotically normal, and that the baseline probability function estimator converges to a Gaussian process. Simulation studies are performed to show that our estimating procedures have satisfactory finite sample performances. The proposed methods are applied to the Dialysis Outcomes and Practice Patterns Study (DOPPS), an international end-stage renal disease study.
Magnitude and frequency of floods in Washington
Cummans, J.E.; Collings, Michael R.; Nasser, Edmund George
1975-01-01
Relations are provided to estimate the magnitude and frequency of floods on Washington streams. Annual-peak-flow data from stream gaging stations on unregulated streams having 1 years or more of record were used to determine a log-Pearson Type III frequency curve for each station. Flood magnitudes having recurrence intervals of 2, 5, i0, 25, 50, and 10years were then related to physical and climatic indices of the drainage basins by multiple-regression analysis using the Biomedical Computer Program BMDO2R. These regression relations are useful for estimating flood magnitudes of the specified recurrence intervals at ungaged or short-record sites. Separate sets of regression equations were defined for western and eastern parts of the State, and the State was further subdivided into 12 regions in which the annual floods exhibit similar flood characteristics. Peak flows are related most significantly in western Washington to drainage-area size and mean annual precipitation. In eastern Washington-they are related most significantly to drainage-area size, mean annual precipitation, and percentage of forest cover. Standard errors of estimate of the estimating relations range from 25 to 129 percent, and the smallest errors are generally associated with the more humid regions.
Estimating irrigation water use in the humid eastern United States
Levin, Sara B.; Zarriello, Phillip J.
2013-01-01
Accurate accounting of irrigation water use is an important part of the U.S. Geological Survey National Water-Use Information Program and the WaterSMART initiative to help maintain sustainable water resources in the Nation. Irrigation water use in the humid eastern United States is not well characterized because of inadequate reporting and wide variability associated with climate, soils, crops, and farming practices. To better understand irrigation water use in the eastern United States, two types of predictive models were developed and compared by using metered irrigation water-use data for corn, cotton, peanut, and soybean crops in Georgia and turf farms in Rhode Island. Reliable metered irrigation data were limited to these areas. The first predictive model that was developed uses logistic regression to predict the occurrence of irrigation on the basis of antecedent climate conditions. Logistic regression equations were developed for corn, cotton, peanut, and soybean crops by using weekly irrigation water-use data from 36 metered sites in Georgia in 2009 and 2010 and turf farms in Rhode Island from 2000 to 2004. For the weeks when irrigation was predicted to take place, the irrigation water-use volume was estimated by multiplying the average metered irrigation application rate by the irrigated acreage for a given crop. The second predictive model that was developed is a crop-water-demand model that uses a daily soil water balance to estimate the water needs of a crop on a given day based on climate, soil, and plant properties. Crop-water-demand models were developed independently of reported irrigation water-use practices and relied on knowledge of plant properties that are available in the literature. Both modeling approaches require accurate accounting of irrigated area and crop type to estimate total irrigation water use. Water-use estimates from both modeling methods were compared to the metered irrigation data from Rhode Island and Georgia that were used to develop the models as well as two independent validation datasets from Georgia and Virginia that were not used in model development. Irrigation water-use estimates from the logistic regression method more closely matched mean reported irrigation rates than estimates from the crop-water-demand model when compared to the irrigation data used to develop the equations. The root mean squared errors (RMSEs) for the logistic regression estimates of mean annual irrigation ranged from 0.3 to 2.0 inches (in.) for the five crop types; RMSEs for the crop-water-demand models ranged from 1.4 to 3.9 in. However, when the models were applied and compared to the independent validation datasets from southwest Georgia from 2010, and from Virginia from 1999 to 2007, the crop-water-demand model estimates were as good as or better at predicting the mean irrigation volume than the logistic regression models for most crop types. RMSEs for logistic regression estimates of mean annual irrigation ranged from 1.0 to 7.0 in. for validation data from Georgia and from 1.8 to 4.9 in. for validation data from Virginia; RMSEs for crop-water-demand model estimates ranged from 2.1 to 5.8 in. for Georgia data and from 2.0 to 3.9 in. for Virginia data. In general, regression-based models performed better in areas that had quality daily or weekly irrigation data from which the regression equations were developed; however, the regression models were less reliable than the crop-water-demand models when applied outside the area for which they were developed. In most eastern coastal states that do not have quality irrigation data, the crop-water-demand model can be used more reliably. The development of predictive models of irrigation water use in this study was hindered by a lack of quality irrigation data. Many mid-Atlantic and New England states do not require irrigation water use to be reported. A survey of irrigation data from 14 eastern coastal states from Maine to Georgia indicated that, with the exception of the data in Georgia, irrigation data in the states that do require reporting commonly did not contain requisite ancillary information such as irrigated area or crop type, lacked precision, or were at an aggregated temporal scale making them unsuitable for use in the development of predictive models. Confidence in the reliability of either modeling method is affected by uncertainty in the reported data from which the models were developed or validated. Only through additional collection of quality data and further study can the accuracy and uncertainty of irrigation water-use estimates be improved in the humid eastern United States.
Pereira, Piettra Moura Galvão; da Silva, Giselma Alcântara; Santos, Gilberto Moreira; Petroski, Edio Luiz; Geraldes, Amandio Aristides Rihan
2013-07-02
This study aimed to examine the cross validity of two anthropometric equations commonly used and propose simple anthropometric equations to estimate appendicular muscle mass (AMM) in elderly women. Among 234 physically active and functionally independent elderly women, 101 (60 to 89 years) were selected through simple drawing to compose the study sample. The paired t test and the Pearson correlation coefficient were used to perform cross-validation and concordance was verified by intraclass correction coefficient (ICC) and by the Bland and Altman technique. To propose predictive models, multiple linear regression analysis, anthropometric measures of body mass (BM), height, girth, skinfolds, body mass index (BMI) were used, and muscle perimeters were included in the analysis as independent variables. Dual-Energy X-ray Absorptiometry (AMMDXA) was used as criterion measurement. The sample power calculations were carried out by Post Hoc Compute Achieved Power. Sample power values from 0.88 to 0.91 were observed. When compared, the two equations tested differed significantly from the AMMDXA (p <0.001 and p = 0.001). Ten population / specific anthropometric equations were developed to estimate AMM, among them, three equations achieved all validation criteria used: AMM (E2) = 4.150 +0.251 [bodymass (BM)] - 0.411 [bodymass index (BMI)] + 0.011 [Right forearm perimeter (PANTd) 2]; AMM (E3) = 4.087 + 0.255 (BM) - 0.371 (BMI) + 0.011 (PANTd) 2 - 0.035 [thigh skinfold (DCCO)]; MMA (E6) = 2.855 + 0.298 (BM) + 0.019 (Age) - 0,082 [hip circumference (PQUAD)] + 0.400 (PANTd) - 0.332 (BMI). The equations estimated the criterion method (p = 0.056 p = 0.158), and explained from 0.69% to 0.74% of variations observed in AMMDXA with low standard errors of the estimate (1.36 to 1.55 kg) and high concordance (ICC between 0,90 and 0.91 and concordance limits from -2,93 to 2,33 kg). The equations tested were not valid for use in physically active and functionally independent elderly women. The simple anthropometric equations developed in this study showed good practical applicability and high validity to estimate AMM in elderly women.
2013-01-01
Objective This study aimed to examine the cross validity of two anthropometric equations commonly used and propose simple anthropometric equations to estimate appendicular muscle mass (AMM) in elderly women. Methods Among 234 physically active and functionally independent elderly women, 101 (60 to 89 years) were selected through simple drawing to compose the study sample. The paired t test and the Pearson correlation coefficient were used to perform cross-validation and concordance was verified by intraclass correction coefficient (ICC) and by the Bland and Altman technique. To propose predictive models, multiple linear regression analysis, anthropometric measures of body mass (BM), height, girth, skinfolds, body mass index (BMI) were used, and muscle perimeters were included in the analysis as independent variables. Dual-Energy X-ray Absorptiometry (AMMDXA) was used as criterion measurement. The sample power calculations were carried out by Post Hoc Compute Achieved Power. Sample power values from 0.88 to 0.91 were observed. Results When compared, the two equations tested differed significantly from the AMMDXA (p <0.001 and p = 0.001). Ten population / specific anthropometric equations were developed to estimate AMM, among them, three equations achieved all validation criteria used: AMM (E2) = 4.150 +0.251 [bodymass (BM)] - 0.411 [bodymass index (BMI)] + 0.011 [Right forearm perimeter (PANTd) 2]; AMM (E3) = 4.087 + 0.255 (BM) - 0.371 (BMI) + 0.011 (PANTd) 2 - 0.035 [thigh skinfold (DCCO)]; MMA (E6) = 2.855 + 0.298 (BM) + 0.019 (Age) - 0,082 [hip circumference (PQUAD)] + 0.400 (PANTd) - 0.332 (BMI). The equations estimated the criterion method (p = 0.056 p = 0.158), and explained from 0.69% to 0.74% of variations observed in AMMDXA with low standard errors of the estimate (1.36 to 1.55 kg) and high concordance (ICC between 0,90 and 0.91 and concordance limits from -2,93 to 2,33 kg). Conclusion The equations tested were not valid for use in physically active and functionally independent elderly women. The simple anthropometric equations developed in this study showed good practical applicability and high validity to estimate AMM in elderly women. PMID:23815948
NASA Astrophysics Data System (ADS)
Widyaningsih, Purnami; Retno Sari Saputro, Dewi; Nugrahani Putri, Aulia
2017-06-01
GWOLR model combines geographically weighted regression (GWR) and (ordinal logistic reression) OLR models. Its parameter estimation employs maximum likelihood estimation. Such parameter estimation, however, yields difficult-to-solve system of nonlinear equations, and therefore numerical approximation approach is required. The iterative approximation approach, in general, uses Newton-Raphson (NR) method. The NR method has a disadvantage—its Hessian matrix is always the second derivatives of each iteration so it does not always produce converging results. With regard to this matter, NR model is modified by substituting its Hessian matrix into Fisher information matrix, which is termed Fisher scoring (FS). The present research seeks to determine GWOLR model parameter estimation using Fisher scoring method and apply the estimation on data of the level of vulnerability to Dengue Hemorrhagic Fever (DHF) in Semarang. The research concludes that health facilities give the greatest contribution to the probability of the number of DHF sufferers in both villages. Based on the number of the sufferers, IR category of DHF in both villages can be determined.
NASA Technical Reports Server (NTRS)
Doneaud, Andre A.; Miller, James R., Jr.; Johnson, L. Ronald; Vonder Haar, Thomas H.; Laybe, Patrick
1987-01-01
The use of the area-time-integral (ATI) technique, based only on satellite data, to estimate convective rain volume over a moving target is examined. The technique is based on the correlation between the radar echo area coverage integrated over the lifetime of the storm and the radar estimated rain volume. The processing of the GOES and radar data collected in 1981 is described. The radar and satellite parameters for six convective clusters from storm events occurring on June 12 and July 2, 1981 are analyzed and compared in terms of time steps and cluster lifetimes. Rain volume is calculated by first using the regression analysis to generate the regression equation used to obtain the ATI; the ATI versus rain volume relation is then employed to compute rain volume. The data reveal that the ATI technique using satellite data is applicable to the calculation of rain volume.
Regression analysis of sparse asynchronous longitudinal data.
Cao, Hongyuan; Zeng, Donglin; Fine, Jason P
2015-09-01
We consider estimation of regression models for sparse asynchronous longitudinal observations, where time-dependent responses and covariates are observed intermittently within subjects. Unlike with synchronous data, where the response and covariates are observed at the same time point, with asynchronous data, the observation times are mismatched. Simple kernel-weighted estimating equations are proposed for generalized linear models with either time invariant or time-dependent coefficients under smoothness assumptions for the covariate processes which are similar to those for synchronous data. For models with either time invariant or time-dependent coefficients, the estimators are consistent and asymptotically normal but converge at slower rates than those achieved with synchronous data. Simulation studies evidence that the methods perform well with realistic sample sizes and may be superior to a naive application of methods for synchronous data based on an ad hoc last value carried forward approach. The practical utility of the methods is illustrated on data from a study on human immunodeficiency virus.
Nohara, Ryuki; Endo, Yui; Murai, Akihiko; Takemura, Hiroshi; Kouchi, Makiko; Tada, Mitsunori
2016-08-01
Individual human models are usually created by direct 3D scanning or deforming a template model according to the measured dimensions. In this paper, we propose a method to estimate all the necessary dimensions (full set) for the human model individualization from a small number of measured dimensions (subset) and human dimension database. For this purpose, we solved multiple regression equation from the dimension database given full set dimensions as the objective variable and subset dimensions as the explanatory variables. Thus, the full set dimensions are obtained by simply multiplying the subset dimensions to the coefficient matrix of the regression equation. We verified the accuracy of our method by imputing hand, foot, and whole body dimensions from their dimension database. The leave-one-out cross validation is employed in this evaluation. The mean absolute errors (MAE) between the measured and the estimated dimensions computed from 4 dimensions (hand length, breadth, middle finger breadth at proximal, and middle finger depth at proximal) in the hand, 2 dimensions (foot length, breadth, and lateral malleolus height) in the foot, and 1 dimension (height) and weight in the whole body are computed. The average MAE of non-measured dimensions were 4.58% in the hand, 4.42% in the foot, and 3.54% in the whole body, while that of measured dimensions were 0.00%.
Estimating bird damage from damage incidence in wine grape vineyards
DeHaven, R.W.; Hothem, R.L.
1981-01-01
Bird damage was measured during 1977 and 1978 at 32 wine grape vineyards in the San Joaquin Valley and North Coastal Region of California. Both the percentage bird loss (PBL) and the percentage of bunches damaged (BDI = bird damage incidence) were determined during 55 total-damage assessments, and the resulting data pairs were used to develop a regression of PBL on BDI. The final prediction equation was loge (PBL + 1) = 0.0385 BDI, for which the SE = 9.6297 10-4, and it accounted for 97% of the observed variation. We conclude that by using that equation, reasonably accurate predictions of PBL can be obtained from relatively quick and inexpensive estimates of BDI. Guidelines for the use of the prediction method and the accuracy of some PBL predictions are discussed.
Statistical models for estimating daily streamflow in Michigan
Holtschlag, D.J.; Salehi, Habib
1992-01-01
Statistical models for estimating daily streamflow were analyzed for 25 pairs of streamflow-gaging stations in Michigan. Stations were paired by randomly choosing a station operated in 1989 at which 10 or more years of continuous flow data had been collected and at which flow is virtually unregulated; a nearby station was chosen where flow characteristics are similar. Streamflow data from the 25 randomly selected stations were used as the response variables; streamflow data at the nearby stations were used to generate a set of explanatory variables. Ordinary-least squares regression (OLSR) equations, autoregressive integrated moving-average (ARIMA) equations, and transfer function-noise (TFN) equations were developed to estimate the log transform of flow for the 25 randomly selected stations. The precision of each type of equation was evaluated on the basis of the standard deviation of the estimation errors. OLSR equations produce one set of estimation errors; ARIMA and TFN models each produce l sets of estimation errors corresponding to the forecast lead. The lead-l forecast is the estimate of flow l days ahead of the most recent streamflow used as a response variable in the estimation. In this analysis, the standard deviation of lead l ARIMA and TFN forecast errors were generally lower than the standard deviation of OLSR errors for l < 2 days and l < 9 days, respectively. Composite estimates were computed as a weighted average of forecasts based on TFN equations and backcasts (forecasts of the reverse-ordered series) based on ARIMA equations. The standard deviation of composite errors varied throughout the length of the estimation interval and generally was at maximum near the center of the interval. For comparison with OLSR errors, the mean standard deviation of composite errors were computed for intervals of length 1 to 40 days. The mean standard deviation of length-l composite errors were generally less than the standard deviation of the OLSR errors for l < 32 days. In addition, the composite estimates ensure a gradual transition between periods of estimated and measured flows. Model performance among stations of differing model error magnitudes were compared by computing ratios of the mean standard deviation of the length l composite errors to the standard deviation of OLSR errors. The mean error ratio for the set of 25 selected stations was less than 1 for intervals l < 32 days. Considering the frequency characteristics of the length of intervals of estimated record in Michigan, the effective mean error ratio for intervals < 30 days was 0.52. Thus, for intervals of estimation of 1 month or less, the error of the composite estimate is substantially lower than error of the OLSR estimate.
Multiple regression technique for Pth degree polynominals with and without linear cross products
NASA Technical Reports Server (NTRS)
Davis, J. W.
1973-01-01
A multiple regression technique was developed by which the nonlinear behavior of specified independent variables can be related to a given dependent variable. The polynomial expression can be of Pth degree and can incorporate N independent variables. Two cases are treated such that mathematical models can be studied both with and without linear cross products. The resulting surface fits can be used to summarize trends for a given phenomenon and provide a mathematical relationship for subsequent analysis. To implement this technique, separate computer programs were developed for the case without linear cross products and for the case incorporating such cross products which evaluate the various constants in the model regression equation. In addition, the significance of the estimated regression equation is considered and the standard deviation, the F statistic, the maximum absolute percent error, and the average of the absolute values of the percent of error evaluated. The computer programs and their manner of utilization are described. Sample problems are included to illustrate the use and capability of the technique which show the output formats and typical plots comparing computer results to each set of input data.
Evaluation and application of regional turbidity-sediment regression models in Virginia
Hyer, Kenneth; Jastram, John D.; Moyer, Douglas; Webber, James S.; Chanat, Jeffrey G.
2015-01-01
Conventional thinking has long held that turbidity-sediment surrogate-regression equations are site specific and that regression equations developed at a single monitoring station should not be applied to another station; however, few studies have evaluated this issue in a rigorous manner. If robust regional turbidity-sediment models can be developed successfully, their applications could greatly expand the usage of these methods. Suspended sediment load estimation could occur as soon as flow and turbidity monitoring commence at a site, suspended sediment sampling frequencies for various projects potentially could be reduced, and special-project applications (sediment monitoring following dam removal, for example) could be significantly enhanced. The objective of this effort was to investigate the turbidity-suspended sediment concentration (SSC) relations at all available USGS monitoring sites within Virginia to determine whether meaningful turbidity-sediment regression models can be developed by combining the data from multiple monitoring stations into a single model, known as a “regional” model. Following the development of the regional model, additional objectives included a comparison of predicted SSCs between the regional model and commonly used site-specific models, as well as an evaluation of why specific monitoring stations did not fit the regional model.
Holtschlag, D.J.; Koschik, J.A.
2001-01-01
St. Clair and Detroit Rivers are connecting channels between Lake Huron and Lake Erie in the Great Lakes waterway, and form part of the boundary between the United States and Canada. St. Clair River, the upper connecting channel, drains 222,400 square miles and has an average flow of about 182,000 cubic feet per second. Water from St. Clair River combines with local inflows and discharges into Lake St. Clair before flowing into Detroit River. In some reaches of St. Clair and Detroit Rivers, islands and dikes split the flow into two to four branches. Even when the flow in a reach is known, proportions of flows within individual branches of a reach are uncertain. Simple linear regression equations, subject to a flow continuity constraint, are developed to provide estimators of these proportions and flows. The equations are based on 533 paired measurements of flow in 13 reaches forming 31 branches. The equations provide a means for computing the expected values and uncertainties of steady-state flows on the basis of flow conditions specified at the upstream boundaries of the waterway. In 7 upstream reaches, flow is considered fixed because it can be determined on the basis of flows specified at waterway boundaries and flow continuity. In these reaches, the uncertainties of flow proportions indicated by the regression equations can be used directly to determine the uncertainties of the corresponding flows. In the remaining 6 downstream reaches, flow is considered uncertain because these reaches do not receive flow from all the branches of an upstream reach, or they receive flow from some branches of more than one upstream reach. Monte Carlo simulation analysis is used to quantify this increase in uncertainty associated with the propagation of uncertainties from upstream reaches to downstream reaches. To eliminate the need for Monte Carlo simulations for routine calculations, polynomial regression equations are developed to approximate the variation in uncertainties as a function of flow at the headwaters of St. Clair River. Finally, monthly flow-duration data on the main channels of St. Clair and Detroit Rivers are used with the equations developed in this report to estimate the steady-state flow-duration characteristics of selected branches.
Characteristics and Impact of Imperviousness From a GIS-based Hydrological Perspective
NASA Astrophysics Data System (ADS)
Moglen, G. E.; Kim, S.
2005-12-01
With the concern that imperviousness can be differently quantified depending on data sources and methods, this study assessed imperviousness estimates using two different data sources: land use and land cover. Year 2000 land use developed by the Maryland Department of Planning was utilized to estimate imperviousness by assigning imperviousness coefficients to unique land use categories. These estimates were compared with imperviousness estimates based on satellite-derived land cover from the 2001 National Land Cover Dataset. Our study developed the relationships between these two estimates in the form of regression equations to convert imperviousness derived from one data source to the other. The regression equations are considered reliable, based on goodness-of-fit measures. Furthermore, this study examined how quantitatively different imperviousness estimates affect the prediction of hydrological response both in the flow regime and in the thermal regime. We assessed the relationships between indicators of hydrological response and imperviousness-descriptors. As indicators of flow variability, coefficient of variance, lag-one autocorrelation, and mean daily flow change were calculated based on measured mean daily stream flow from the water year 1997 to 2003. For thermal variability, indicators such as percent-days of surge, degree-day, and mean daily temperature difference were calculated base on measured stream temperature over several basins in Maryland. To describe imperviousness through the hydrological process, GIS-based spatially distributed hydrological models were developed based on a water-balance method and the SCS-CN method. Imperviousness estimates from land use and land cover were used as predictors in these models to examine the effect of imperviousness using different data sources on the prediction of hydrological response. Indicators of hydrological response were also regressed on aggregate imperviousness. This allowed for identifying if hydrological response is more sensitive to spatially distributed imperviousness or aggregate (lumped) imperviousness. The regressions between indicators of hydrological response and imperviousness-descriptors were evaluated by examining goodness-of-fit measures such as explained variance or relative standard error. The results show that imperviousness estimates using land use are better predictors of flow variability and thermal variability than imperviousness estimates using land cover. Also, this study reveals that flow variability is more sensitive to spatially distributed models than lumped models, while thermal variability is equally responsive to both models. The findings from this study can be further examined from a policy perspective with regard to policies that are based on a threshold concept for imperviousness impacts on the ecological and hydrological system.
A parameter estimation subroutine package
NASA Technical Reports Server (NTRS)
Bierman, G. J.; Nead, M. W.
1978-01-01
Linear least squares estimation and regression analyses continue to play a major role in orbit determination and related areas. A library of FORTRAN subroutines were developed to facilitate analyses of a variety of estimation problems. An easy to use, multi-purpose set of algorithms that are reasonably efficient and which use a minimal amount of computer storage are presented. Subroutine inputs, outputs, usage and listings are given, along with examples of how these routines can be used. The routines are compact and efficient and are far superior to the normal equation and Kalman filter data processing algorithms that are often used for least squares analyses.
Evaluation of spatial, radiometric and spectral Thematic Mapper performance for coastal studies
NASA Technical Reports Server (NTRS)
Klemas, V. (Principal Investigator)
1984-01-01
The effect different wetland plant canopies have upon observed reflectance in Thematic Mapper bands is studied. The three major vegetation canopy types (broadleaf, gramineous and leafless) produce unique spectral responses for a similar quantity of live biomass. The spectral biomass estimate of a broadleaf canopy is most similar to the harvest biomass estimate when a broadleaf canopy radiance model is used. All major wetland vegetation species can be identified through TM imagery. Simple regression models are developed equating the vegetation index and the infrared index with biomass. The spectral radiance index largely agreed with harvest biomass estimates.
Optimal Estimation of Clock Values and Trends from Finite Data
NASA Technical Reports Server (NTRS)
Greenhall, Charles
2005-01-01
We show how to solve two problems of optimal linear estimation from a finite set of phase data. Clock noise is modeled as a stochastic process with stationary dth increments. The covariance properties of such a process are contained in the generalized autocovariance function (GACV). We set up two principles for optimal estimation: with the help of the GACV, these principles lead to a set of linear equations for the regression coefficients and some auxiliary parameters. The mean square errors of the estimators are easily calculated. The method can be used to check the results of other methods and to find good suboptimal estimators based on a small subset of the available data.
NASA Technical Reports Server (NTRS)
Argentiero, P.; Lowrey, B.
1976-01-01
The least squares collocation algorithm for estimating gravity anomalies from geodetic data is shown to be an application of the well known regression equations which provide the mean and covariance of a random vector (gravity anomalies) given a realization of a correlated random vector (geodetic data). It is also shown that the collocation solution for gravity anomalies is equivalent to the conventional least-squares-Stokes' function solution when the conventional solution utilizes properly weighted zero a priori estimates. The mathematical and physical assumptions underlying the least squares collocation estimator are described, and its numerical properties are compared with the numerical properties of the conventional least squares estimator.
Kronholm, Scott C.; Capel, Paul D.; Terziotti, Silvia
2016-01-01
Accurate estimation of total nitrogen loads is essential for evaluating conditions in the aquatic environment. Extrapolation of estimates beyond measured streams will greatly expand our understanding of total nitrogen loading to streams. Recursive partitioning and random forest regression were used to assess 85 geospatial, environmental, and watershed variables across 636 small (<585 km2) watersheds to determine which variables are fundamentally important to the estimation of annual loads of total nitrogen. Initial analysis led to the splitting of watersheds into three groups based on predominant land use (agricultural, developed, and undeveloped). Nitrogen application, agricultural and developed land area, and impervious or developed land in the 100-m stream buffer were commonly extracted variables by both recursive partitioning and random forest regression. A series of multiple linear regression equations utilizing the extracted variables were created and applied to the watersheds. As few as three variables explained as much as 76 % of the variability in total nitrogen loads for watersheds with predominantly agricultural land use. Catchment-scale national maps were generated to visualize the total nitrogen loads and yields across the USA. The estimates provided by these models can inform water managers and help identify areas where more in-depth monitoring may be beneficial.
Estimating a child's age from an image using whole body proportions.
Lucas, Teghan; Henneberg, Maciej
2017-09-01
The use and distribution of child pornography is an increasing problem. Forensic anthropologists are often asked to estimate a child's age from a photograph. Previous studies have attempted to estimate the age of children from photographs using ratios of the face. Here, we propose to include body measurement ratios into age estimates. A total of 1603 boys and 1833 girls aged 5-16 years were measured over a 10-year period. They are 'Cape Coloured' children from South Africa. Their age was regressed on ratios derived from anthropometric measurements of the head as well as the body. Multiple regression equations including four ratios for each sex (head height to shoulder and hip width, knee width, leg length and trunk length) have a standard error of 1.6-1.7 years. The error is of the same order as variation of differences between biological and chronological ages of the children. Thus, the error cannot be minimised any further as it is a direct reflection of a naturally occurring phenomenon.
Regionalisation of low flow frequency curves for the Peninsular Malaysia
NASA Astrophysics Data System (ADS)
Mamun, Abdullah A.; Hashim, Alias; Daoud, Jamal I.
2010-02-01
SUMMARYRegional maps and equations for the magnitude and frequency of 1, 7 and 30-day low flows were derived and are presented in this paper. The river gauging stations of neighbouring catchments that produced similar low flow frequency curves were grouped together. As such, the Peninsular Malaysia was divided into seven low flow regions. Regional equations were developed using the multivariate regression technique. An empirical relationship was developed for mean annual minimum flow as a function of catchment area, mean annual rainfall and mean annual evaporation. The regional equations exhibited good coefficient of determination ( R2 > 0.90). Three low flow frequency curves showing the low, mean and high limits for each region were proposed based on a graphical best-fit technique. Knowing the catchment area, mean annual rainfall and evaporation in the region, design low flows of different durations can be easily estimated for the ungauged catchments. This procedure is expected to overcome the problem of data unavailability in estimating low flows in the Peninsular Malaysia.
Aerodynamic parameters of High-Angle-of attack Research Vehicle (HARV) estimated from flight data
NASA Technical Reports Server (NTRS)
Klein, Vladislav; Ratvasky, Thomas R.; Cobleigh, Brent R.
1990-01-01
Aerodynamic parameters of the High-Angle-of-Attack Research Aircraft (HARV) were estimated from flight data at different values of the angle of attack between 10 degrees and 50 degrees. The main part of the data was obtained from small amplitude longitudinal and lateral maneuvers. A small number of large amplitude maneuvers was also used in the estimation. The measured data were first checked for their compatibility. It was found that the accuracy of air data was degraded by unexplained bias errors. Then, the data were analyzed by a stepwise regression method for obtaining a structure of aerodynamic model equations and least squares parameter estimates. Because of high data collinearity in several maneuvers, some of the longitudinal and all lateral maneuvers were reanalyzed by using two biased estimation techniques, the principal components regression and mixed estimation. The estimated parameters in the form of stability and control derivatives, and aerodynamic coefficients were plotted against the angle of attack and compared with the wind tunnel measurements. The influential parameters are, in general, estimated with acceptable accuracy and most of them are in agreement with wind tunnel results. The simulated responses of the aircraft showed good prediction capabilities of the resulting model.
Xin, Hangshu; Khan, Nazir A; Falk, Kevin C; Yu, Peiqiang
2014-08-13
The objectives of this study were to quantify lipid-related inherent molecular structures using a Fourier transform infrared spectroscopy (FT-IR) technique and determine their relationship to oil content, fatty acid and glucosinolate profile, total polyphenols, and condensed tannins in seeds from newly developed yellow-seeded and brown-seeded Brassica carinata lines. Canola seeds were used as a reference. The lipid-related molecular spectral band intensities were strongly correlated to the contents of oil, fatty acids, glucosinolates, and polyphenols. The regression equations gave relatively high predictive power for the estimation of oil (R² = 0.99); all measured fatty acids (R² > 0.80), except C14:0, C20:3n-3, C22:2n-9, and C22:2n-6; 3-butenyl, 2-OH-3-butenyl, 4-OH-3-CH3-indolyl, and total glucosinolates (R² > 0.686); and total polyphenols (R² = 0.935). However, further study is required to obtain predictive equations based on large numbers of samples from diverse sources to illustrate the general applicability of these regression equations.
Al-Gindan, Yasmin Y.; Hankey, Catherine R.; Govan, Lindsay; Gallagher, Dympna; Heymsfield, Steven B.; Lean, Michael E. J.
2017-01-01
The reference organ-level body composition measurement method is MRI. Practical estimations of total adipose tissue mass (TATM), total adipose tissue fat mass (TATFM) and total body fat are valuable for epidemiology, but validated prediction equations based on MRI are not currently available. We aimed to derive and validate new anthropometric equations to estimate MRI-measured TATM/TATFM/total body fat and compare them with existing prediction equations using older methods. The derivation sample included 416 participants (222 women), aged between 18 and 88 years with BMI between 15·9 and 40·8 (kg/m2). The validation sample included 204 participants (110 women), aged between 18 and 86 years with BMI between 15·7 and 36·4 (kg/m2). Both samples included mixed ethnic/racial groups. All the participants underwent whole-body MRI to quantify TATM (dependent variable) and anthropometry (independent variables). Prediction equations developed using stepwise multiple regression were further investigated for agreement and bias before validation in separate data sets. Simplest equations with optimal R2 and Bland–Altman plots demonstrated good agreement without bias in the validation analyses: men: TATM (kg) = 0·198 weight (kg) + 0·478 waist (cm) − 0·147 height (cm) − 12·8 (validation: R2 0·79, CV = 20 %, standard error of the estimate (SEE)=3·8 kg) and women: TATM (kg)=0·789 weight (kg) + 0·0786 age (years) − 0·342 height (cm) + 24·5 (validation: R2 0·84, CV = 13 %, SEE = 3·0 kg). Published anthropometric prediction equations, based on MRI and computed tomographic scans, correlated strongly with MRI-measured TATM: (R2 0·70 – 0·82). Estimated TATFM correlated well with published prediction equations for total body fat based on underwater weighing (R2 0·70–0·80), with mean bias of 2·5–4·9 kg, correctable with log-transformation in most equations. In conclusion, new equations, using simple anthropometric measurements, estimated MRI-measured TATM with correlations and agreements suitable for use in groups and populations across a wide range of fatness. PMID:26435103
Willis, Michael; Asseburg, Christian; Nilsson, Andreas; Johnsson, Kristina; Kartman, Bernt
2017-03-01
Type 2 diabetes mellitus (T2DM) is chronic and progressive and the cost-effectiveness of new treatment interventions must be established over long time horizons. Given the limited durability of drugs, assumptions regarding downstream rescue medication can drive results. Especially for insulin, for which treatment effects and adverse events are known to depend on patient characteristics, this can be problematic for health economic evaluation involving modeling. To estimate parsimonious multivariate equations of treatment effects and hypoglycemic event risks for use in parameterizing insulin rescue therapy in model-based cost-effectiveness analysis. Clinical evidence for insulin use in T2DM was identified in PubMed and from published reviews and meta-analyses. Study and patient characteristics and treatment effects and adverse event rates were extracted and the data used to estimate parsimonious treatment effect and hypoglycemic event risk equations using multivariate regression analysis. Data from 91 studies featuring 171 usable study arms were identified, mostly for premix and basal insulin types. Multivariate prediction equations for glycated hemoglobin A 1c lowering and weight change were estimated separately for insulin-naive and insulin-experienced patients. Goodness of fit (R 2 ) for both outcomes were generally good, ranging from 0.44 to 0.84. Multivariate prediction equations for symptomatic, nocturnal, and severe hypoglycemic events were also estimated, though considerable heterogeneity in definitions limits their usefulness. Parsimonious and robust multivariate prediction equations were estimated for glycated hemoglobin A 1c and weight change, separately for insulin-naive and insulin-experienced patients. Using these in economic simulation modeling in T2DM can improve realism and flexibility in modeling insulin rescue medication. Copyright © 2017 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
Melching, C.S.; Marquardt, J.S.
1997-01-01
Design hydrographs computed from design storms, simple models of abstractions (interception, depression storage, and infiltration), and synthetic unit hydrographs provide vital information for stormwater, flood-plain, and water-resources management throughout the United States. Rainfall and runoff data for small watersheds in Lake County collected between 1990 and 1995 were studied to develop equations for estimation of synthetic unit-hydrograph parameters on the basis of watershed and storm characteristics. The synthetic unit-hydrograph parameters of interest were the time of concentration (TC) and watershed-storage coefficient (R) for the Clark unit-hydrograph method, the unit-graph lag (UL) for the Soil Conservation Service (now known as the Natural Resources Conservation Service) dimensionless unit hydrograph, and the hydrograph-time lag (TL) for the linear-reservoir method for unit-hydrograph estimation. Data from 66 storms with effective-precipitation depths greater than 0.4 inches on 9 small watersheds (areas between 0.06 and 37 square miles (mi2)) were utilized to develop the estimation equations, and data from 11 storms on 8 of these watersheds were utilized to verify (test) the estimation equations. The synthetic unit-hydrograph parameters were determined by calibration using the U.S. Army Corps of Engineers Flood Hydrograph Package HEC-1 (TC, R, and UL) or by manual analysis of the rainfall and run-off data (TL). The relation between synthetic unit-hydrograph parameters, and watershed and storm characteristics was determined by multiple linear regression of the logarithms of the parameters and characteristics. Separate sets of equations were developed with watershed area and main channel length as the starting parameters. Percentage of impervious cover, main channel slope, and depth of effective precipitation also were identified as important characteristics for estimation of synthetic unit-hydrograph parameters. The estimation equations utilizing area had multiple correlation coefficients of 0.873, 0.961, 0.968, and 0.963 for TC, R, UL, and TL, respectively, and the estimation equations utilizing main channel length had multiple correlation coefficients of 0.845, 0.957, 0.961, and 0.963 for TC, R, UL, and TL, respectively. Simulation of the measured hydrographs for the verification storms utilizing TC and R obtained from the estimation equations yielded good results without calibration. The peak discharge for 8 of the 11 storms was estimated within 25 percent and the time-to-peak discharge for 10 of the 11 storms was estimated within 20 percent. Thus, application of the estimation equations to determine synthetic unit-hydrograph parameters for design-storm simulation may result in reliable design hydrographs; as long as the physical characteristics of the watersheds under consideration are within the range of those for the watersheds in this study (area: 0.06-37 mi2, main channel length: 0.33-16.6 miles, main channel slope: 3.13-55.3 feet per mile, and percentage of impervious cover: 7.32-40.6 percent). The estimation equations are most reliable when applied to watersheds with areas less than 25 mi2.
Many-level multilevel structural equation modeling: An efficient evaluation strategy.
Pritikin, Joshua N; Hunter, Michael D; von Oertzen, Timo; Brick, Timothy R; Boker, Steven M
2017-01-01
Structural equation models are increasingly used for clustered or multilevel data in cases where mixed regression is too inflexible. However, when there are many levels of nesting, these models can become difficult to estimate. We introduce a novel evaluation strategy, Rampart, that applies an orthogonal rotation to the parts of a model that conform to commonly met requirements. This rotation dramatically simplifies fit evaluation in a way that becomes more potent as the size of the data set increases. We validate and evaluate the implementation using a 3-level latent regression simulation study. Then we analyze data from a state-wide child behavioral health measure administered by the Oklahoma Department of Human Services. We demonstrate the efficiency of Rampart compared to other similar software using a latent factor model with a 5-level decomposition of latent variance. Rampart is implemented in OpenMx, a free and open source software.
An Overview of Longitudinal Data Analysis Methods for Neurological Research
Locascio, Joseph J.; Atri, Alireza
2011-01-01
The purpose of this article is to provide a concise, broad and readily accessible overview of longitudinal data analysis methods, aimed to be a practical guide for clinical investigators in neurology. In general, we advise that older, traditional methods, including (1) simple regression of the dependent variable on a time measure, (2) analyzing a single summary subject level number that indexes changes for each subject and (3) a general linear model approach with a fixed-subject effect, should be reserved for quick, simple or preliminary analyses. We advocate the general use of mixed-random and fixed-effect regression models for analyses of most longitudinal clinical studies. Under restrictive situations or to provide validation, we recommend: (1) repeated-measure analysis of covariance (ANCOVA), (2) ANCOVA for two time points, (3) generalized estimating equations and (4) latent growth curve/structural equation models. PMID:22203825
NASA Technical Reports Server (NTRS)
Moes, Timothy R.; Smith, Mark S.; Morelli, Eugene A.
2003-01-01
Near real-time stability and control derivative extraction is required to support flight demonstration of Intelligent Flight Control System (IFCS) concepts being developed by NASA, academia, and industry. Traditionally, flight maneuvers would be designed and flown to obtain stability and control derivative estimates using a postflight analysis technique. The goal of the IFCS concept is to be able to modify the control laws in real time for an aircraft that has been damaged in flight. In some IFCS implementations, real-time parameter identification (PID) of the stability and control derivatives of the damaged aircraft is necessary for successfully reconfiguring the control system. This report investigates the usefulness of Prescribed Simultaneous Independent Surface Excitations (PreSISE) to provide data for rapidly obtaining estimates of the stability and control derivatives. Flight test data were analyzed using both equation-error and output-error PID techniques. The equation-error PID technique is known as Fourier Transform Regression (FTR) and is a frequency-domain real-time implementation. Selected results were compared with a time-domain output-error technique. The real-time equation-error technique combined with the PreSISE maneuvers provided excellent derivative estimation in the longitudinal axis. However, the PreSISE maneuvers as presently defined were not adequate for accurate estimation of the lateral-directional derivatives.
Low-flow, base-flow, and mean-flow regression equations for Pennsylvania streams
Stuckey, Marla H.
2006-01-01
Low-flow, base-flow, and mean-flow characteristics are an important part of assessing water resources in a watershed. These streamflow characteristics can be used by watershed planners and regulators to determine water availability, water-use allocations, assimilative capacities of streams, and aquatic-habitat needs. Streamflow characteristics are commonly predicted by use of regression equations when a nearby streamflow-gaging station is not available. Regression equations for predicting low-flow, base-flow, and mean-flow characteristics for Pennsylvania streams were developed from data collected at 293 continuous- and partial-record streamflow-gaging stations with flow unaffected by upstream regulation, diversion, or mining. Continuous-record stations used in the regression analysis had 9 years or more of data, and partial-record stations used had seven or more measurements collected during base-flow conditions. The state was divided into five low-flow regions and regional regression equations were developed for the 7-day, 10-year; 7-day, 2-year; 30-day, 10-year; 30-day, 2-year; and 90-day, 10-year low flows using generalized least-squares regression. Statewide regression equations were developed for the 10-year, 25-year, and 50-year base flows using generalized least-squares regression. Statewide regression equations were developed for harmonic mean and mean annual flow using weighted least-squares regression. Basin characteristics found to be significant explanatory variables at the 95-percent confidence level for one or more regression equations were drainage area, basin slope, thickness of soil, stream density, mean annual precipitation, mean elevation, and the percentage of glaciation, carbonate bedrock, forested area, and urban area within a basin. Standard errors of prediction ranged from 33 to 66 percent for the n-day, T-year low flows; 21 to 23 percent for the base flows; and 12 to 38 percent for the mean annual flow and harmonic mean, respectively. The regression equations are not valid in watersheds with upstream regulation, diversions, or mining activities. Watersheds with karst features need close examination as to the applicability of the regression-equation results.
Ishikawa, Taisuke; Fukushima, Ryuji; Suzuki, Shuji; Miyaishi, Yuka; Nishimura, Taiki; Hira, Satoshi; Hamabe, Lina; Tanaka, Ryou
2011-08-01
Non-invasive and immediate estimation of left atrial pressure (LAP) is very useful for the management of mitral regurgitation (MR), and many reports have assessed echocardiographic estimations of LAP to date. However, it has been unclear of which examination and evaluate article possess the best accuracy for the MR severity. The present research aims to establish the echocardiographic estimation equation of LAP that is well applicable for clinical MR dogs. After the chordae tendineae rupture was experimentally induced via left atriotomy in six healthy beagle dogs (three males and three females, two years old, weighing between 9.8 to 12.8 kg), a radio telemetry transmitter catheter was inserted, which allows the continuous recordings of LAP without the use of sedation. Approximately 5 weeks after the surgery, echocardiographic examination, indirect blood pressure measurement, measurement of plasma atrial natriuretic peptide (ANP) and LAP measurement by way of the radio telemetry system was performed simultaneously. Subsequently, simple linear regression equations between LAP and each variable were obtained, and the equations were evaluated whether to be applicable for clinical MR dogs. As a result, the ratio of early diastolic mitral flow to early diastolic lateral mitral annulus velocity (E/Ea) had the strongest correlation as maximum LAP=7.03*(E/Ea)-54.86 (r=0.74), and as mean LAP=4.94*(E/Ea)-40.37 (r=0.70) among the all variables. Therefore, these two equations associated with E/Ea should bring more precise and instant estimations of maximum and mean LAP in clinical MR dogs.
Wang, Jinghua; Xie, Peng; Huang, Jian-Min; Qu, Yan; Zhang, Fang; Wei, Ling-Ge; Fu, Peng; Huang, Xiao-Jie
2016-12-01
To verify whether the new Asian modified CKD-EPI equation improved the performance of original one in determining GFR in Chinese patients with CKD. A well-designed paired cohort was set up. Measured GFR (mGFR) was the result of 99m Tc-diethylene triamine pentaacetic acid ( 99m Tc-DTPA) dual plasma sample clearance method. The estimated GFR (eGFR) was the result of the CKD-EPI equation (eGFR1) and the new Asian modified CKD-EPI equation (eGFR2). The comparisons were performed to evaluate the superiority of the eGFR2 in bias, accuracy, precision, concordance correlation coefficient and the slope of regression equation and measure agreement. A total of 195 patients were enrolled and analyzed. The new Asian modified CKD-EPI equation improved the performance of the original one in bias and accuracy. However, nearly identical performance was observed in the respect of precision, concordance correlation coefficient, slope of eGFR against mGFR and 95 % limit of agreement. In the subgroup of GFR < 60 mL min -1 /1.73 m 2 , the bias of eGFR1 was less than eGFR2 but they have comparable precision and accuracy. In the subgroup of GFR > 60 mL min -1 /1.73 m 2 , eGFR2 performed better than eGFR1 in terms of bias and accuracy. The new Asian modified CKD-EPI equation can lead to more accurate GFR estimation in Chinese patients with CKD in general practice, especially in the higher GFR group.
Estimation procedures for understory biomass and fuel loads in sagebrush steppe invaded by woodlands
Alicia L. Reiner; Robin J. Tausch; Roger F. Walker
2010-01-01
Regression equations were developed to predict biomass for 9 shrubs, 9 grasses, and 10 forbs that generally dominate sagebrush ecosystems in central Nevada. Independent variables included percent cover, average height, and plant volume. We explored 2 ellipsoid volumes: one with maximum plant height and 2 crown diameters and another with live crown height and 2 crown...
Eric H. Wharton; Douglas M. Griffith
1993-01-01
Presents methods for synthesizing information from existing biomass literature when making biomass assessments over extensive geographic areas, such as for a state or region. Described are general applications to the northeastern United States, and specific applications to Ohio. Tables of appropriate regression equations and the tree and shrub species to which these...
1982-05-13
Size Of The Software. A favourite measure for software system size is linos of operational code, or deliverable code (operational code plus...regression models, these conversions are either derived from productivity measures using the "cost per instruction" type of equation or they are...appropriate to different development organisattons, differert project types, different sets of units for measuring e and s, and different items