Sample records for linear quantile estimation

  1. Simultaneous multiple non-crossing quantile regression estimation using kernel constraints

    PubMed Central

    Liu, Yufeng; Wu, Yichao

    2011-01-01

    Quantile regression (QR) is a very useful statistical tool for learning the relationship between the response variable and covariates. For many applications, one often needs to estimate multiple conditional quantile functions of the response variable given covariates. Although one can estimate multiple quantiles separately, it is of great interest to estimate them simultaneously. One advantage of simultaneous estimation is that multiple quantiles can share strength among them to gain better estimation accuracy than individually estimated quantile functions. Another important advantage of joint estimation is the feasibility of incorporating simultaneous non-crossing constraints of QR functions. In this paper, we propose a new kernel-based multiple QR estimation technique, namely simultaneous non-crossing quantile regression (SNQR). We use kernel representations for QR functions and apply constraints on the kernel coefficients to avoid crossing. Both unregularised and regularised SNQR techniques are considered. Asymptotic properties such as asymptotic normality of linear SNQR and oracle properties of the sparse linear SNQR are developed. Our numerical results demonstrate the competitive performance of our SNQR over the original individual QR estimation. PMID:22190842
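
    The non-crossing constraint above can be appreciated from the opposite direction: separately fitted quantile lines are free to cross. The sketch below is not the paper's kernel-constrained SNQR; it uses plain statsmodels fits on synthetic data, counts any crossings among separately estimated quantile lines (none may occur in a given draw), and applies the simple post-hoc fix of sorting the predicted quantiles at each point.

    ```python
    # Separate linear quantile fits, a crossing check, and the sorting
    # ("rearrangement") fix. Data and quantile levels are illustrative.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(42)
    x = rng.uniform(0, 10, 200)
    y = 1.0 + 0.5 * x + rng.normal(scale=1 + 0.3 * x)  # heteroscedastic noise
    X = sm.add_constant(x)

    taus = [0.1, 0.25, 0.5, 0.75, 0.9]
    preds = np.column_stack(
        [sm.QuantReg(y, X).fit(q=t).predict(X) for t in taus]
    )

    print("crossing violations:", (np.diff(preds, axis=1) < 0).sum())

    # Sort the predicted quantiles at each x so they are monotone in tau.
    preds_fixed = np.sort(preds, axis=1)
    print("violations after rearrangement:", (np.diff(preds_fixed, axis=1) < 0).sum())
    ```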

  2. SEMIPARAMETRIC QUANTILE REGRESSION WITH HIGH-DIMENSIONAL COVARIATES

    PubMed Central

    Zhu, Liping; Huang, Mian; Li, Runze

    2012-01-01

    This paper is concerned with quantile regression for a semiparametric regression model, in which both the conditional mean and conditional variance function of the response given the covariates admit a single-index structure. This semiparametric regression model enables us to reduce the dimension of the covariates and simultaneously retains the flexibility of nonparametric regression. Under mild conditions, we show that the simple linear quantile regression offers a consistent estimate of the index parameter vector. This is a surprising and interesting result because the single-index model is possibly misspecified under the linear quantile regression. With a root-n consistent estimate of the index vector, one may employ a local polynomial regression technique to estimate the conditional quantile function. This procedure is computationally efficient, which is very appealing in high-dimensional data analysis. We show that the resulting estimator of the quantile function performs asymptotically as efficiently as if the true value of the index vector were known. The methodologies are demonstrated through comprehensive simulation studies and an application to a real dataset. PMID:24501536

  3. Estimating effects of limiting factors with regression quantiles

    USGS Publications Warehouse

    Cade, B.S.; Terrell, J.W.; Schroeder, R.L.

    1999-01-01

    In a recent Concepts paper in Ecology, Thomson et al. emphasized that assumptions of conventional correlation and regression analyses fundamentally conflict with the ecological concept of limiting factors, and they called for new statistical procedures to address this problem. The analytical issue is that unmeasured factors may be the active limiting constraint and may induce a pattern of unequal variation in the biological response variable through an interaction with the measured factors. Consequently, changes near the maxima, rather than at the center of response distributions, are better estimates of the effects expected when the observed factor is the active limiting constraint. Regression quantiles provide estimates for linear models fit to any part of a response distribution, including near the upper bounds, and require minimal assumptions about the form of the error distribution. Regression quantiles extend the concept of one-sample quantiles to the linear model by solving an optimization problem of minimizing an asymmetric function of absolute errors. Rank-score tests for regression quantiles provide tests of hypotheses and confidence intervals for parameters in linear models with heteroscedastic errors, conditions likely to occur in models of limiting ecological relations. We used selected regression quantiles (e.g., 5th, 10th, ..., 95th) and confidence intervals to test hypotheses that parameters equal zero for estimated changes in average annual acorn biomass due to forest canopy cover of oak (Quercus spp.) and oak species diversity. Regression quantiles also were used to estimate changes in glacier lily (Erythronium grandiflorum) seedling numbers as a function of lily flower numbers, rockiness, and pocket gopher (Thomomys talpoides fossor) activity, data that motivated the query by Thomson et al. for new statistical procedures. Both example applications showed that effects of limiting factors estimated by changes in some upper regression quantile (e.g., 90-95th) were greater than if effects were estimated by changes in the means from standard linear model procedures. Estimating a range of regression quantiles (e.g., 5-95th) provides a comprehensive description of biological response patterns for exploratory and inferential analyses in observational studies of limiting factors, especially when sampling large spatial and temporal scales.
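
    The optimization problem mentioned above is concrete enough to state in a few lines: the τ-th regression quantile minimizes an asymmetrically weighted sum of absolute errors (the "check" function). The sketch below, on synthetic data, minimizes that check function directly and compares the result against statsmodels' QuantReg; agreement is only approximate because the loss is nonsmooth.

    ```python
    # Direct minimization of the check (asymmetric absolute error) function
    # versus a library quantile regression fit.
    import numpy as np
    import statsmodels.api as sm
    from scipy.optimize import minimize

    def check_loss(beta, X, y, tau):
        r = y - X @ beta
        return np.sum(np.where(r >= 0, tau * r, (tau - 1) * r))

    rng = np.random.default_rng(1)
    x = rng.uniform(size=100)
    y = 2 + 3 * x + rng.normal(scale=0.5 + x, size=100)  # unequal variances
    X = sm.add_constant(x)

    tau = 0.9
    direct = minimize(check_loss, x0=np.zeros(2), args=(X, y, tau),
                      method="Nelder-Mead")
    qr = sm.QuantReg(y, X).fit(q=tau)
    print(direct.x, qr.params)  # the two estimates should roughly agree
    ```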

  4. Quantile Regression in the Study of Developmental Sciences

    PubMed Central

    Petscher, Yaacov; Logan, Jessica A. R.

    2014-01-01

    Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of the outcome’s distribution. Using data from the High School and Beyond and U.S. Sustained Effects Study databases, quantile regression is demonstrated and contrasted with linear regression when considering models with: (a) one continuous predictor, (b) one dichotomous predictor, (c) a continuous and a dichotomous predictor, and (d) a longitudinal application. Results from each example exhibited the differential inferences which may be drawn using linear or quantile regression. PMID:24329596

  5. Quantile regression models of animal habitat relationships

    USGS Publications Warehouse

    Cade, Brian S.

    2003-01-01

    Typically, all factors that limit an organism are not measured and included in statistical models used to investigate relationships with their environment. If important unmeasured variables interact multiplicatively with the measured variables, the statistical models often will have heterogeneous response distributions with unequal variances. Quantile regression is an approach for estimating the conditional quantiles of a response variable distribution in the linear model, providing a more complete view of possible causal relationships between variables in ecological processes. Chapter 1 introduces quantile regression and discusses the ordering characteristics, interval nature, sampling variation, weighting, and interpretation of estimates for homogeneous and heterogeneous regression models. Chapter 2 evaluates performance of quantile rank-score tests used for hypothesis testing and constructing confidence intervals for linear quantile regression estimates (0 ≤ τ ≤ 1). A permutation F-test maintained better Type I errors than the Chi-square T-test for models with smaller n, greater numbers of parameters p, and more extreme quantiles τ. Both versions of the test required weighting to maintain correct Type I errors when there was heterogeneity under the alternative model. An example application related trout densities to stream channel width:depth. Chapter 3 evaluates a drop-in-dispersion, F-ratio-like permutation test for hypothesis testing and constructing confidence intervals for linear quantile regression estimates (0 ≤ τ ≤ 1). Chapter 4 simulates from a large (N = 10,000) finite population representing grid areas on a landscape to demonstrate various forms of hidden bias that might occur when the effect of a measured habitat variable on some animal is confounded with the effect of another unmeasured variable (spatially structured or not). Depending on whether interactions of the measured habitat variable and the unmeasured variable were negative (interference interactions) or positive (facilitation interactions), either upper (τ > 0.5) or lower (τ < 0.5) quantile regression parameters were less biased than mean rate parameters. Sampling (n = 20-300) simulations demonstrated that confidence intervals constructed by inverting rank-score tests provided valid coverage of these biased parameters. Quantile regression was used to estimate effects of physical habitat resources on a bivalve mussel (Macomona liliana) in a New Zealand harbor by modeling the spatial trend surface as a cubic polynomial of location coordinates.

  6. Quantile Regression in the Study of Developmental Sciences

    ERIC Educational Resources Information Center

    Petscher, Yaacov; Logan, Jessica A. R.

    2014-01-01

    Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of…

  7. Non-stationary hydrologic frequency analysis using B-spline quantile regression

    NASA Astrophysics Data System (ADS)

    Nasri, B.; Bouezmarni, T.; St-Hilaire, A.; Ouarda, T. B. M. J.

    2017-11-01

    Hydrologic frequency analysis is commonly used by engineers and hydrologists to provide basic information for the planning, design and management of hydraulic and water resources systems under the assumption of stationarity. However, with increasing evidence of climate change, the assumption of stationarity, which is a prerequisite for traditional frequency analysis, may no longer hold, and the results of conventional analyses become questionable. In this study, we consider a framework for frequency analysis of extremes based on B-spline quantile regression, which allows data to be modeled in the presence of non-stationarity and/or dependence on covariates, with linear and non-linear dependence. A Markov Chain Monte Carlo (MCMC) algorithm was used to estimate quantiles and their posterior distributions. A coefficient of determination and the Bayesian information criterion (BIC) for quantile regression are used to select the best model, i.e., for each quantile we choose the degree and number of knots of the adequate B-spline quantile regression model. The method is applied to annual maximum and minimum streamflow records in Ontario, Canada. Climate indices are considered to describe the non-stationarity in the variable of interest and to estimate the quantiles in this case. The results show large differences between the non-stationary quantiles and their stationary equivalents for annual maximum and minimum discharges with high annual non-exceedance probabilities.
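
    A minimal frequentist sketch of the model class described above: a B-spline basis in a time covariate inside a linear quantile regression. The MCMC estimation and BIC-based selection of the degree and knots are not reproduced here; the variable names, the spline settings, and the 0.95 level are illustrative only.

    ```python
    # B-spline quantile regression via the statsmodels formula API; bs() from
    # patsy builds the B-spline design matrix, df controls its flexibility.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(7)
    years = np.arange(1950, 2020)
    flow = 100 + 0.4 * (years - 1950) ** 1.2 + rng.gamma(2.0, 15.0, size=years.size)
    df = pd.DataFrame({"year": years, "flow": flow})

    model = smf.quantreg("flow ~ bs(year, df=4, degree=3)", df)
    q95 = model.fit(q=0.95)          # non-stationary 0.95 quantile of flow
    print(q95.predict(df).head())
    ```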

  8. Multiple imputation for cure rate quantile regression with censored data.

    PubMed

    Wu, Yuanshan; Yin, Guosheng

    2017-03-01

    The main challenge in the context of cure rate analysis is that one never knows whether censored subjects are cured or uncured, or whether they are susceptible or insusceptible to the event of interest. Considering the susceptible indicator as missing data, we propose a multiple imputation approach to cure rate quantile regression for censored data with a survival fraction. We develop an iterative algorithm to estimate the conditionally uncured probability for each subject. By utilizing this estimated probability and Bernoulli sample imputation, we can classify each subject as cured or uncured, and then employ the locally weighted method to estimate the quantile regression coefficients with only the uncured subjects. Repeating the imputation procedure multiple times and taking an average over the resultant estimators, we obtain consistent estimators for the quantile regression coefficients. Our approach relaxes the usual global linearity assumption, so that we can apply quantile regression to any particular quantile of interest. We establish asymptotic properties for the proposed estimators, including both consistency and asymptotic normality. We conduct simulation studies to assess the finite-sample performance of the proposed multiple imputation method and apply it to a lung cancer study as an illustration. © 2016, The International Biometric Society.

  9. Using nonlinear quantile regression to estimate the self-thinning boundary curve

    Treesearch

    Quang V. Cao; Thomas J. Dean

    2015-01-01

    The relationship between tree size (quadratic mean diameter) and tree density (number of trees per unit area) has been a topic of research and discussion for many decades. Starting with Reineke in 1933, the maximum size-density relationship, on a log-log scale, has been assumed to be linear. Several techniques, including linear quantile regression, have been employed...

  10. A gentle introduction to quantile regression for ecologists

    USGS Publications Warehouse

    Cade, B.S.; Noon, B.R.

    2003-01-01

    Quantile regression is a way to estimate the conditional quantiles of a response variable distribution in the linear model that provides a more complete view of possible causal relationships between variables in ecological processes. Typically, all the factors that affect ecological processes are not measured and included in the statistical models used to investigate relationships between variables associated with those processes. As a consequence, there may be a weak or no predictive relationship between the mean of the response variable (y) distribution and the measured predictive factors (X). Yet there may be stronger, useful predictive relationships with other parts of the response variable distribution. This primer relates quantile regression estimates to prediction intervals in parametric error distribution regression models (e.g., least squares), and discusses the ordering characteristics, interval nature, sampling variation, weighting, and interpretation of the estimates for homogeneous and heterogeneous regression models.
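
    The link the primer draws between regression quantiles and prediction intervals can be illustrated directly: lower and upper quantile fits bound a distribution-free interval for new responses, with no assumption on the error distribution. A minimal sketch on synthetic, skewed-error data:

    ```python
    # The 0.05 and 0.95 regression quantiles as a distribution-free 90% band.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    x = rng.uniform(0, 5, 300)
    y = 1 + 2 * x + rng.lognormal(sigma=0.6, size=300)  # skewed errors
    X = sm.add_constant(x)

    lo = sm.QuantReg(y, X).fit(q=0.05).predict(X)
    hi = sm.QuantReg(y, X).fit(q=0.95).predict(X)
    coverage = np.mean((y >= lo) & (y <= hi))
    print(f"empirical coverage of the 90% band: {coverage:.2f}")
    ```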

  11. Shrinkage Estimation of Varying Covariate Effects Based On Quantile Regression

    PubMed Central

    Peng, Limin; Xu, Jinfeng; Kutner, Nancy

    2013-01-01

    Varying covariate effects often manifest meaningful heterogeneity in covariate-response associations. In this paper, we adopt a quantile regression model that assumes linearity at a continuous range of quantile levels as a tool to explore such data dynamics. The consideration of potential non-constancy of covariate effects necessitates a new perspective for variable selection, which, under the assumed quantile regression model, is to retain variables that have effects on all quantiles of interest as well as those that influence only part of the quantiles considered. Current work on l1-penalized quantile regression either does not concern varying covariate effects or may not produce consistent variable selection in the presence of covariates with partial effects, a practical scenario of interest. In this work, we propose a shrinkage approach by adopting a novel uniform adaptive LASSO penalty. The new approach enjoys easy implementation without requiring smoothing. Moreover, it can consistently identify the true model (uniformly across quantiles) and achieve the oracle estimation efficiency. We further extend the proposed shrinkage method to the case where responses are subject to random right censoring. Numerical studies confirm the theoretical results and support the utility of our proposals. PMID:25332515

  12. More green space is related to less antidepressant prescription rates in the Netherlands: A Bayesian geoadditive quantile regression approach.

    PubMed

    Helbich, Marco; Klein, Nadja; Roberts, Hannah; Hagedoorn, Paulien; Groenewegen, Peter P

    2018-06-20

    Exposure to green space appears to be beneficial for self-reported mental health. In this study we used an objective health indicator, namely antidepressant prescription rates. Current studies rely exclusively upon mean regression models that assume linear associations. It is, however, plausible that the presence of green space is non-linearly related to different quantiles of antidepressant prescription rates, and these restrictions may contribute to inconsistent findings. Our aims were: a) to assess antidepressant prescription rates in relation to green space, and b) to analyze how the relationship varies non-linearly across different quantiles of antidepressant prescription rates. We used cross-sectional data for the year 2014 at the municipality level in the Netherlands. Ecological Bayesian geoadditive quantile regressions were fitted for the 15%, 50%, and 85% quantiles to estimate green space-prescription rate correlations, controlling for physical activity levels, socio-demographics, urbanicity, and other covariates. The results suggested that green space was overall inversely and non-linearly associated with antidepressant prescription rates. More importantly, the associations differed across the quantiles, although the variation was modest. Significant non-linearities were apparent: the associations were slightly positive in the lower quantile and strongly negative in the upper one. Our findings imply that an increased availability of green space within a municipality may contribute to a reduction in the number of antidepressant prescriptions dispensed. Green space is thus a central health and community asset, although a minimum level of 28% appears necessary for health gains, with the highest effectiveness occurring at a municipality surface percentage higher than 79%. This inverse dose-dependent relation has important implications for setting future community-level health and planning policies. Copyright © 2018 Elsevier Inc. All rights reserved.

  13. Linear Regression Quantile Mapping (RQM) - A new approach to bias correction with consistent quantile trends

    NASA Astrophysics Data System (ADS)

    Passow, Christian; Donner, Reik

    2017-04-01

    Quantile mapping (QM) is an established concept for correcting systematic biases in multiple quantiles of the distribution of a climatic observable. It shows remarkable results in correcting biases in historical simulations against observational data and outperforms simpler correction methods that adjust only the mean or variance. Since it has been shown that bias correction of future predictions or scenario runs with basic QM can result in misleading trends in the projection, adjusted, trend-preserving versions of QM have been introduced in the form of detrended quantile mapping (DQM) and quantile delta mapping (QDM) (Cannon, 2015, 2016). Still, all previous versions and applications of QM-based bias correction rely on the assumption of time-independent quantiles over the investigated period, which can be misleading in the context of a changing climate. Here, we propose a novel combination of linear quantile regression (QR) with the classical QM method to introduce a consistent, time-dependent and trend-preserving approach to bias correction for historical and future projections. Since QR is a regression method, it is possible to estimate quantiles at the same resolution as the given data and to include trends or other dependencies. We demonstrate the performance of the new method of linear regression quantile mapping (RQM) in correcting biases of temperature and precipitation products from historical runs (1959-2005) of the COSMO model in climate mode (CCLM) from the Euro-CORDEX ensemble, relative to gridded E-OBS data of the same spatial and temporal resolution. A thorough comparison with established bias correction methods highlights the strengths and potential weaknesses of the new RQM approach. References: A.J. Cannon, S.R. Sobie, T.Q. Murdock: Bias Correction of GCM Precipitation by Quantile Mapping - How Well Do Methods Preserve Changes in Quantiles and Extremes? Journal of Climate, 28, 6938, 2015. A.J. Cannon: Multivariate Bias Correction of Climate Model Outputs - Matching Marginal Distributions and Inter-variable Dependence Structure. Journal of Climate, 29, 7045, 2016.
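
    For orientation, the baseline the abstract builds on, plain empirical quantile mapping, fits in a few lines. This is not the proposed RQM, which adds the linear quantile regression layer to make the mapping time-dependent; the data here are synthetic.

    ```python
    # Empirical quantile mapping: map each model value through the observed
    # distribution at the matching cumulative probability; np.interp performs
    # the piecewise-linear matching between the two sets of quantiles.
    import numpy as np

    rng = np.random.default_rng(11)
    obs = rng.gamma(2.0, 10.0, size=5000)            # "observations"
    sim = rng.gamma(2.0, 10.0, size=5000) * 1.3 + 5  # biased "model" output

    probs = np.linspace(0.01, 0.99, 99)
    q_sim = np.quantile(sim, probs)
    q_obs = np.quantile(obs, probs)

    def qm_correct(x):
        # locate x among the simulated quantiles, read off the observed ones
        return np.interp(x, q_sim, q_obs)

    corrected = qm_correct(sim)
    print(np.mean(obs), np.mean(sim), np.mean(corrected))
    ```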

  14. Boosting structured additive quantile regression for longitudinal childhood obesity data.

    PubMed

    Fenske, Nora; Fahrmeir, Ludwig; Hothorn, Torsten; Rzehak, Peter; Höhle, Michael

    2013-07-25

    Childhood obesity and the investigation of its risk factors have become an important public health issue. Our work is based on and motivated by a German longitudinal study including 2,226 children with up to ten measurements of their body mass index (BMI) and risk factors from birth to the age of 10 years. We introduce boosting of structured additive quantile regression as a novel distribution-free approach for longitudinal quantile regression. The quantile-specific predictors of our model include conventional linear population effects, smooth nonlinear functional effects, varying-coefficient terms, and individual-specific effects, such as intercepts and slopes. Estimation is based on boosting, a computer-intensive inference method for highly complex models. We propose a component-wise functional gradient descent boosting algorithm that allows for penalized estimation of the large variety of different effects, in particular shrinking individual-specific effects toward zero. This concept allows us to flexibly estimate the nonlinear age curves of upper quantiles of the BMI distribution, at both the population and the individual-specific level, adjusted for further risk factors, and to detect age-varying effects of categorical risk factors. Our model approach can be regarded as the quantile regression analog of Gaussian additive mixed models (or structured additive mean regression models), and we compare both model classes with respect to our obesity data.
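
    scikit-learn has no component-wise structured additive boosting of the kind described above, but gradient tree boosting with the pinball (quantile) loss is also functional gradient descent on a quantile objective, so it can serve as a loose analog. A hedged sketch on invented age-BMI data:

    ```python
    # Gradient boosting of the 90th percentile of BMI as a function of age;
    # a loose stand-in for the paper's component-wise boosting, not the method itself.
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    rng = np.random.default_rng(5)
    age = rng.uniform(0, 10, (1000, 1))
    bmi = 15 + 0.8 * age[:, 0] + rng.normal(scale=1 + 0.2 * age[:, 0])

    gbr = GradientBoostingRegressor(loss="quantile", alpha=0.9,
                                    n_estimators=300, max_depth=2,
                                    learning_rate=0.05)
    gbr.fit(age, bmi)  # estimates the age curve of the upper BMI quantile
    print(gbr.predict([[2.0], [8.0]]))
    ```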

  15. Quantile rank maps: a new tool for understanding individual brain development.

    PubMed

    Chen, Huaihou; Kelly, Clare; Castellanos, F Xavier; He, Ye; Zuo, Xi-Nian; Reiss, Philip T

    2015-05-01

    We propose a novel method for neurodevelopmental brain mapping that displays how an individual's values for a quantity of interest compare with age-specific norms. By estimating smoothly age-varying distributions at a set of brain regions of interest, we derive age-dependent region-wise quantile ranks for a given individual, which can be presented in the form of a brain map. Such quantile rank maps could potentially be used for clinical screening. Bootstrap-based confidence intervals are proposed for the quantile rank estimates. We also propose a recalibrated Kolmogorov-Smirnov test for detecting group differences in the age-varying distribution. This test is shown to be more robust to model misspecification than a linear regression-based test. The proposed methods are applied to brain imaging data from the Nathan Kline Institute Rockland Sample and from the Autism Brain Imaging Data Exchange (ABIDE) sample. Copyright © 2015 Elsevier Inc. All rights reserved.

  16. Robust small area estimation of poverty indicators using M-quantile approach (Case study: Sub-district level in Bogor district)

    NASA Astrophysics Data System (ADS)

    Girinoto, Sadik, Kusman; Indahwati

    2017-03-01

    The National Socio-Economic Survey samples are designed to produce estimates of parameters for planned domains (provinces and districts). Estimation for unplanned domains (sub-districts and villages) is limited in its ability to yield reliable direct estimates. One possible solution to overcome this problem is to employ small area estimation techniques. The popular choice for small area estimation is based on linear mixed models. However, such models need strong distributional assumptions and do not easily allow for outlier-robust estimation. As an alternative, the M-quantile regression approach to small area estimation is based on modeling specific M-quantile coefficients of the conditional distribution of the study variable given the auxiliary covariates. It yields outlier-robust estimation through an influence function of the M-estimator type and does not need strong distributional assumptions. In this paper, the aim is to estimate poverty indicators at the sub-district level in Bogor District, West Java, using M-quantile models for small area estimation. Using data taken from the National Socioeconomic Survey and Village Potential Statistics, the results provide a detailed description of the pattern of incidence and intensity of poverty within Bogor district. We also compare the results with direct estimates. The results show that the framework may be preferable when the direct estimate indicates no incidence of poverty at all in the small area.

  17. Bayesian quantile regression-based partially linear mixed-effects joint models for longitudinal data with multiple features.

    PubMed

    Zhang, Hanze; Huang, Yangxin; Wang, Wei; Chen, Henian; Langland-Orban, Barbara

    2017-01-01

    In longitudinal AIDS studies, it is of interest to investigate the relationship between HIV viral load and CD4 cell counts, as well as the complicated time effect. Most common models used to analyze such complex longitudinal data are based on mean regression, which fails to provide efficient estimates in the presence of outliers and/or heavy tails. Quantile regression-based partially linear mixed-effects models, a special case of semiparametric models enjoying the benefits of both parametric and nonparametric models, have the flexibility to monitor the viral dynamics nonparametrically and detect the varying CD4 effects parametrically at different quantiles of viral load. Meanwhile, it is critical to consider various features of the repeated measurements, including left-censoring due to a limit of detection, covariate measurement error, and asymmetric distributions. In this research, we first establish a Bayesian joint model that accounts for all these data features simultaneously in the framework of quantile regression-based partially linear mixed-effects models. The proposed models are applied to analyze the Multicenter AIDS Cohort Study (MACS) data. Simulation studies are also conducted to assess the performance of the proposed methods under different scenarios.

  18. Matching a Distribution by Matching Quantiles Estimation

    PubMed Central

    Sgouropoulos, Nikolaos; Yao, Qiwei; Yastremiz, Claudia

    2015-01-01

    Motivated by the problem of selecting representative portfolios for backtesting counterparty credit risks, we propose a matching quantiles estimation (MQE) method for matching a target distribution by that of a linear combination of a set of random variables. An iterative procedure based on ordinary least-squares estimation (OLS) is proposed to compute MQE. MQE can be easily modified by adding a LASSO penalty term if a sparse representation is desired, or by restricting the matching within a certain range of quantiles to match a part of the target distribution. The convergence of the algorithm and the asymptotic properties of the estimation, both with and without LASSO, are established. A measure and an associated statistical test are proposed to assess the goodness-of-match. The finite sample properties are illustrated by simulation. An application in selecting a counterparty representative portfolio with a real dataset is reported. The proposed MQE also finds applications in portfolio tracking, which demonstrates the usefulness of combining MQE with LASSO. PMID:26692592
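
    A bare-bones reading of the iterative OLS procedure described above: rank the rows of X by the current linear combination, align them against the sorted target sample, re-fit OLS, and repeat until the weights stabilize. This sketch omits the LASSO variant and uses invented data; it is an illustration of the idea, not the paper's implementation.

    ```python
    # Iterative OLS for matching the quantiles (order statistics) of a target
    # sample y with those of a linear combination X @ w.
    import numpy as np

    rng = np.random.default_rng(9)
    n, p = 2000, 4
    X = rng.normal(size=(n, p))
    true_w = np.array([0.5, 0.3, 0.15, 0.05])
    y = np.sort(X @ true_w + rng.normal(scale=0.1, size=n))  # sorted target sample

    w = rng.normal(size=p)                       # arbitrary starting weights
    for _ in range(100):
        order = np.argsort(X @ w)                # rank rows by current combination
        w_new = np.linalg.lstsq(X[order], y, rcond=None)[0]  # OLS on rank-aligned data
        if np.allclose(w, w_new, atol=1e-10):
            break
        w = w_new
    print(w)   # should land near the weights that generated the target
    ```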

  19. A Study of Alternative Quantile Estimation Methods in Newsboy-Type Problems

    DTIC Science & Technology

    1980-03-01

    decision maker selects to have on hand. The newsboy cost equation may be formulated as a two-piece continuous linear function in the following manner. C(S...number of observations, some approximations may be possible. Three points which are near each other can be assumed to be linear and some estimator using...respectively. Define the value r as: r = [nq + 0.5] , (6) where [X] denotes the largest integer of X. Let us consider an estimate of X as the linear

  20. Robust neural network with applications to credit portfolio data analysis.

    PubMed

    Feng, Yijia; Li, Runze; Sudjianto, Agus; Zhang, Yiyun

    2010-01-01

    In this article, we study nonparametric conditional quantile estimation via a neural network structure. We propose an estimation method that combines quantile regression and neural networks (robust neural network, RNN). It provides good smoothing performance in the presence of outliers and can be used to construct prediction bands. A majorization-minimization (MM) algorithm is developed for optimization. A Monte Carlo simulation study is conducted to assess the performance of the RNN. Comparison with other nonparametric regression methods (e.g., local linear regression and regression splines) in a real data application demonstrates the advantage of the newly proposed procedure.
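
    The sketch below is not the authors' MM algorithm; it trains a tiny numpy multilayer perceptron by plain (sub)gradient descent on the check (pinball) loss, which is the core of any neural conditional-quantile estimator. Network size, learning rate, and data are arbitrary.

    ```python
    # One-hidden-layer MLP minimizing the check loss for the tau-th quantile.
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + rng.standard_t(df=3, size=500)  # heavy tails

    tau, H, lr = 0.5, 16, 1e-2
    W1 = rng.normal(scale=0.1, size=(3, H)); b1 = np.zeros(H)
    W2 = rng.normal(scale=0.1, size=H);      b2 = 0.0

    for _ in range(2000):
        A = np.tanh(X @ W1 + b1)                       # hidden layer
        r = y - (A @ W2 + b2)                          # residuals
        g = np.where(r > 0, -tau, 1 - tau) / len(y)    # dLoss/dprediction
        gA = np.outer(g, W2) * (1 - A ** 2)            # backprop through tanh
        W2 -= lr * (A.T @ g); b2 -= lr * g.sum()
        W1 -= lr * (X.T @ gA); b1 -= lr * gA.sum(axis=0)

    print("final check loss:", np.mean(np.where(r > 0, tau * r, (tau - 1) * r)))
    ```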

  21. Identifying Factors That Predict Promotion Time to E-4 and Re-Enlistment Eligibility for U.S. Marine Corps Field Radio Operators

    DTIC Science & Technology

    2014-12-01

    Primary Military Occupational Specialty PRO Proficiency Q-Q Quantile - Quantile RSS Residual Sum of Squares SI Shop Information T&R Training and...construct multivariate linear regression models to estimate Marines’ Computed Tier Score and time to achieve E-4 based on their individual personal...Science (GS) score, ASVAB Mathematics Knowledge (MK) score, ASVAB Paragraph Comprehension (PC) score, weight , and whether a Marine receives a weight

  22. Flood quantile estimation at ungauged sites by Bayesian networks

    NASA Astrophysics Data System (ADS)

    Mediero, L.; Santillán, D.; Garrote, L.

    2012-04-01

    Estimating flood quantiles at a site for which no observed measurements are available is essential for water resources planning and management. Ungauged sites have no observations of flood magnitudes, but some site and basin characteristics are known. The most common technique used is multiple regression analysis, which relates physical and climatic basin characteristics to flood quantiles. Regression equations are fitted from flood frequency data and basin characteristics at gauged sites. Regression equations are a rigid technique that assumes linear relationships between variables and cannot take measurement errors into account. In addition, the prediction intervals are estimated in a very simplistic way from the variance of the residuals of the estimated model. Bayesian networks are a probabilistic computational structure taken from the field of artificial intelligence that has been widely and successfully applied in many scientific fields, such as medicine and informatics, although application to the field of hydrology is recent. Bayesian networks infer the joint probability distribution of several related variables from observations through nodes, which represent random variables, and links, which represent causal dependencies between them. A Bayesian network is more flexible than regression equations, as it captures non-linear relationships between variables. In addition, the probabilistic nature of Bayesian networks allows the different sources of estimation uncertainty to be taken into account, as they give a probability distribution as the result. A homogeneous region in the Tagus Basin was selected as a case study. A regression equation was fitted taking the basin area, the annual maximum 24-hour rainfall for a given recurrence interval, and the mean height as explanatory variables. Flood quantiles at ungauged sites were estimated by Bayesian networks. Bayesian networks need to be learnt from a sufficiently large data set. As observational data were limited, a stochastic generator of synthetic data was developed. Synthetic basin characteristics were randomised, keeping the statistical properties of observed physical and climatic variables in the homogeneous region. The synthetic flood quantiles were stochastically generated taking the regression equation as a basis. The learnt Bayesian network was validated by the reliability diagram, the Brier score and the ROC diagram, which are common measures used in the validation of probabilistic forecasts. In summary, flood quantile estimation through Bayesian networks supplies information about the prediction uncertainty, as a probability distribution of discharges is given as the result. Therefore, the Bayesian network model has application as a decision support tool for water resources planning and management.

  23. Quantile Regression Models for Current Status Data

    PubMed Central

    Ou, Fang-Shu; Zeng, Donglin; Cai, Jianwen

    2016-01-01

    Current status data arise frequently in demography, epidemiology, and econometrics where the exact failure time cannot be determined but is only known to have occurred before or after a known observation time. We propose a quantile regression model to analyze current status data, because it does not require distributional assumptions and the coefficients can be interpreted as direct regression effects on the distribution of failure time in the original time scale. Our model assumes that the conditional quantile of failure time is a linear function of covariates. We assume conditional independence between the failure time and observation time. An M-estimator is developed for parameter estimation which is computed using the concave-convex procedure and its confidence intervals are constructed using a subsampling method. Asymptotic properties for the estimator are derived and proven using modern empirical process theory. The small sample performance of the proposed method is demonstrated via simulation studies. Finally, we apply the proposed method to analyze data from the Mayo Clinic Study of Aging. PMID:27994307

  24. A quantile count model of water depth constraints on Cape Sable seaside sparrows

    USGS Publications Warehouse

    Cade, B.S.; Dong, Q.

    2008-01-01

    1. A quantile regression model for counts of breeding Cape Sable seaside sparrows Ammodramus maritimus mirabilis (L.) as a function of water depth and previous year abundance was developed based on extensive surveys, 1992-2005, in the Florida Everglades. The quantile count model extends linear quantile regression methods to discrete response variables, providing a flexible alternative to discrete parametric distributional models, e.g. Poisson, negative binomial and their zero-inflated counterparts. 2. Estimates from our multiplicative model demonstrated that negative effects of increasing water depth in breeding habitat on sparrow numbers were dependent on recent occupation history. Upper 10th percentiles of counts (one to three sparrows) decreased with increasing water depth from 0 to 30 cm when sites were not occupied in previous years. However, upper 40th percentiles of counts (one to six sparrows) decreased with increasing water depth for sites occupied in previous years. 3. Greatest decreases (-50% to -83%) in upper quantiles of sparrow counts occurred as water depths increased from 0 to 15 cm when previous year counts were ≥1, but a small proportion of sites (5-10%) held at least one sparrow even as water depths increased to 20 or 30 cm. 4. A zero-inflated Poisson regression model provided estimates of conditional means that also decreased with increasing water depth, but rates of change were lower and decreased with increasing previous year counts compared to the quantile count model. Quantiles computed for the zero-inflated Poisson model enhanced interpretation of this model but had greater lack-of-fit for water depths > 0 cm and previous year counts ≥1, conditions where the negative effect of water depth was readily apparent and was fitted better by the quantile count model.
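
    A rough sketch in the spirit of the jittering idea behind quantile count models: add uniform noise to the counts to obtain a continuous variable, fit a linear quantile regression on a transformed scale, and average coefficients over many jitters. Details such as trimming and back-transformation to the count scale are omitted, the transform shown is only one plausible variant, and all data and names are invented.

    ```python
    # Jittered quantile regression for counts (illustrative variant only).
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(13)
    n = 1000
    depth = rng.uniform(0, 30, n)                # water depth (cm), invented
    counts = rng.poisson(np.exp(1.2 - 0.05 * depth))

    tau, M = 0.9, 50
    X = sm.add_constant(depth)
    betas = []
    for _ in range(M):
        z = counts + rng.uniform(size=n)         # jitter to a continuous variable
        w = np.log(np.maximum(z - tau, 1e-5))    # log-scale transform (one variant)
        betas.append(sm.QuantReg(w, X).fit(q=tau).params)
    beta = np.mean(betas, axis=0)                # average over jitters
    print(beta)   # slope: effect of depth on the upper-quantile count scale
    ```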

  25. Quantile equivalence to evaluate compliance with habitat management objectives

    USGS Publications Warehouse

    Cade, Brian S.; Johnson, Pamela R.

    2011-01-01

    Equivalence estimated with linear quantile regression was used to evaluate compliance with habitat management objectives at Arapaho National Wildlife Refuge based on monitoring data collected in upland (5,781 ha; n = 511 transects) and riparian and meadow (2,856 ha; n = 389 transects) habitats from 2005 to 2008. Quantiles were used because the management objectives specified proportions of the habitat area that needed to comply with vegetation criteria. The linear model was used to obtain estimates that were averaged across 4 y. The equivalence testing framework allowed us to interpret confidence intervals for estimated proportions with respect to intervals of vegetative criteria (equivalence regions) in either a liberal, benefit-of-doubt or conservative, fail-safe approach associated with minimizing alternative risks. Simple Boolean conditional arguments were used to combine the quantile equivalence results for individual vegetation components into a joint statement for the multivariable management objectives. For example, management objective 2A required at least 809 ha of upland habitat with a shrub composition ≥0.70 sagebrush (Artemisia spp.), 20–30% canopy cover of sagebrush ≥25 cm in height, ≥20% canopy cover of grasses, and ≥10% canopy cover of forbs on average over 4 y. Shrub composition and canopy cover of grass each were readily met on >3,000 ha under either conservative or liberal interpretations of sampling variability. However, there were only 809–1,214 ha (conservative to liberal) with ≥10% forb canopy cover and 405–1,098 ha with 20–30% canopy cover of sagebrush ≥25 cm in height. Only 91–180 ha of uplands simultaneously met criteria for all four components, primarily because canopy cover of sagebrush and forbs was inversely related when considered at the spatial scale (30 m) of a sample transect. We demonstrate how the quantile equivalence analyses also can help refine the numerical specification of habitat objectives and explore specification of spatial scales for objectives with respect to sampling scales used to evaluate those objectives.

  26. Influences of spatial and temporal variation on fish-habitat relationships defined by regression quantiles

    USGS Publications Warehouse

    Dunham, J.B.; Cade, B.S.; Terrell, J.W.

    2002-01-01

    We used regression quantiles to model potentially limiting relationships between the standing crop of cutthroat trout Oncorhynchus clarki and measures of stream channel morphology. Regression quantile models indicated that variation in fish density was inversely related to the width:depth ratio of streams but not to stream width or depth alone. The spatial and temporal stability of model predictions were examined across years and streams, respectively. Variation in fish density with width:depth ratio (10th-90th regression quantiles) modeled for streams sampled in 1993-1997 predicted the variation observed in 1998-1999, indicating similar habitat relationships across years. Both linear and nonlinear models described the limiting relationships well, the latter performing slightly better. Although estimated relationships were transferable in time, results were strongly dependent on the influence of spatial variation in fish density among streams. Density changes with width:depth ratio in a single stream were responsible for the significant (P < 0.10) negative slopes estimated for the higher quantiles (>80th). This suggests that stream-scale factors other than width:depth ratio play a more direct role in determining population density. Much of the variation in densities of cutthroat trout among streams was attributed to the occurrence of nonnative brook trout Salvelinus fontinalis (a possible competitor) or connectivity to migratory habitats. Regression quantiles can be useful for estimating the effects of limiting factors when ecological responses are highly variable, but our results indicate that spatiotemporal variability in the data should be explicitly considered. In this study, data from individual streams and stream-specific characteristics (e.g., the occurrence of nonnative species and habitat connectivity) strongly affected our interpretation of the relationship between width:depth ratio and fish density.

  27. Rank score and permutation testing alternatives for regression quantile estimates

    USGS Publications Warehouse

    Cade, B.S.; Richards, J.D.; Mielke, P.W.

    2006-01-01

    Performance of quantile rank score tests used for hypothesis testing and constructing confidence intervals for linear quantile regression estimates (0 ≤ τ ≤ 1) was evaluated by simulation for models with p = 2 and 6 predictors, moderate collinearity among predictors, homogeneous and heterogeneous errors, small to moderate samples (n = 20-300), and central to upper quantiles (0.50-0.99). Test statistics evaluated were the conventional quantile rank score T statistic, distributed as a χ2 random variable with q degrees of freedom (where q parameters are constrained by H0), and an F statistic with its sampling distribution approximated by permutation. The permutation F-test maintained better Type I errors than the T-test for homogeneous error models with smaller n and more extreme quantiles τ. An F distributional approximation of the F statistic provided some improvement in Type I errors over the T-test for models with > 2 parameters, smaller n, and more extreme quantiles, but not as much improvement as the permutation approximation. Both rank score tests required weighting to maintain correct Type I errors when heterogeneity under the alternative model increased to 5 standard deviations across the domain of X. A double permutation procedure was developed to provide valid Type I errors for the permutation F-test when null models were forced through the origin. Power was similar for conditions where both T- and F-tests maintained correct Type I errors, but the F-test provided some power at smaller n and extreme quantiles when the T-test had no power because of excessively conservative Type I errors. When the double permutation scheme was required for the permutation F-test to maintain valid Type I errors, power was less than for the T-test with decreasing sample size and increasing quantiles. Confidence intervals on parameters and tolerance intervals for future predictions were constructed based on test inversion for an example application relating trout densities to stream channel width:depth.

  28. Interquantile Shrinkage in Regression Models

    PubMed Central

    Jiang, Liewen; Wang, Huixia Judy; Bondell, Howard D.

    2012-01-01

    Conventional analysis using quantile regression typically focuses on fitting the regression model at different quantiles separately. However, in situations where the quantile coefficients share some common feature, joint modeling of multiple quantiles to accommodate the commonality often leads to more efficient estimation. One example of common features is that a predictor may have a constant effect over one region of quantile levels but varying effects in other regions. To automatically perform estimation and detection of the interquantile commonality, we develop two penalization methods. When the quantile slope coefficients indeed do not change across quantile levels, the proposed methods will shrink the slopes towards constant and thus improve the estimation efficiency. We establish the oracle properties of the two proposed penalization methods. Through numerical investigations, we demonstrate that the proposed methods lead to estimations with competitive or higher efficiency than the standard quantile regression estimation in finite samples. Supplemental materials for the article are available online. PMID:24363546

  29. Alternative configurations of Quantile Regression for estimating predictive uncertainty in water level forecasts for the Upper Severn River: a comparison

    NASA Astrophysics Data System (ADS)

    Lopez, Patricia; Verkade, Jan; Weerts, Albrecht; Solomatine, Dimitri

    2014-05-01

    Hydrological forecasting is subject to many sources of uncertainty, including those originating in initial state, boundary conditions, model structure and model parameters. Although uncertainty can be reduced, it can never be fully eliminated. Statistical post-processing techniques constitute an often used approach to estimate the hydrological predictive uncertainty, where a model of forecast error is built using a historical record of past forecasts and observations. The present study focuses on the use of the Quantile Regression (QR) technique as a hydrological post-processor. It estimates the predictive distribution of water levels using deterministic water level forecasts as predictors. This work aims to thoroughly verify uncertainty estimates using the implementation of QR that was applied in an operational setting in the UK National Flood Forecasting System, and to inter-compare forecast quality and skill in various, differing configurations of QR. These configurations are (i) 'classical' QR, (ii) QR constrained by a requirement that quantiles do not cross, (iii) QR derived on time series that have been transformed into the Normal domain (Normal Quantile Transformation - NQT), and (iv) a piecewise linear derivation of QR models. The QR configurations are applied to fourteen hydrological stations on the Upper Severn River with different catchments characteristics. Results of each QR configuration are conditionally verified for progressively higher flood levels, in terms of commonly used verification metrics and skill scores. These include Brier's probability score (BS), the continuous ranked probability score (CRPS) and corresponding skill scores as well as the Relative Operating Characteristic score (ROCS). Reliability diagrams are also presented and analysed. The results indicate that none of the four Quantile Regression configurations clearly outperforms the others.
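
    Configuration (i), 'classical' QR as a post-processor, reduces to regressing observations on the deterministic forecast at several probability levels, yielding a predictive distribution for each new forecast. A minimal sketch with synthetic forecast-observation pairs; the units and error structure are invented.

    ```python
    # Classical quantile regression post-processing of a deterministic forecast.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(21)
    forecast = rng.uniform(0.5, 4.0, 600)                         # forecast level (m)
    observed = forecast + rng.normal(scale=0.1 + 0.1 * forecast)  # error grows with level
    X = sm.add_constant(forecast)

    models = {t: sm.QuantReg(observed, X).fit(q=t)
              for t in (0.05, 0.25, 0.5, 0.75, 0.95)}

    new = sm.add_constant(np.array([1.0, 3.5]))  # two new deterministic forecasts
    for t, m in models.items():
        print(t, m.predict(new))                 # predictive quantiles for each
    ```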

  30. [Spatial heterogeneity in body condition of small yellow croaker in Yellow Sea and East China Sea based on mixed-effects model and quantile regression analysis].

    PubMed

    Liu, Zun-Lei; Yuan, Xing-Wei; Yan, Li-Ping; Yang, Lin-Lin; Cheng, Jia-Hua

    2013-09-01

    Using survey data from 2008-2010 on the body condition of small yellow croaker in the offshore waters of the southern Yellow Sea (SYS), the open waters of the northern East China Sea (NECS), and the offshore waters of the middle East China Sea (MECS), this paper analyzed the spatial heterogeneity of the body length-body mass relationships of juvenile and adult small yellow croakers using mean regression and quantile regression models. The results showed that the residual standard errors from the analysis of covariance (ANCOVA) and the linear mixed-effects model were similar, and those from the simple linear regression were the highest. For juvenile small yellow croakers, the mean body mass in SYS and NECS estimated by the mixed-effects mean regression model was higher than the overall average mass across the three regions, while the mean body mass in MECS was below the overall average. For adult small yellow croakers, the mean body mass in NECS was higher than the overall average, while the mean body mass in SYS and MECS was below the overall average. The results from quantile regression indicated substantial differences in the allometric relationships of juvenile small yellow croakers between SYS, NECS, and MECS, with the estimated mean exponent of the allometric relationship in SYS being 2.85 and the interquartile range being from 2.63 to 2.96, indicating heterogeneity of body form. The results from ANCOVA showed that the allometric body length-body mass relationships were significantly different between the 25th and 75th percentile exponent values (F=6.38, df=1737, P<0.01) and between the 25th percentile and median exponent values (F=2.35, df=1737, P=0.039). The relationship was marginally different between the median and 75th percentile exponent values (F=2.21, df=1737, P=0.051). The estimated body length-body mass exponent of adult small yellow croakers in SYS was 3.01 (10th and 95th percentiles = 2.77 and 3.1, respectively). The estimated body length-body mass relationships were significantly different between the lower and upper quantiles of the exponent (F=3.31, df=2793, P=0.01) and between the median and upper quantiles (F=3.56, df=2793, P<0.01), while no significant difference was observed between the lower and median quantiles (F=0.98, df=2793, P=0.43).

  31. Logistic quantile regression provides improved estimates for bounded avian counts: A case study of California Spotted Owl fledgling production

    USGS Publications Warehouse

    Cade, Brian S.; Noon, Barry R.; Scherer, Rick D.; Keane, John J.

    2017-01-01

    Counts of avian fledglings, nestlings, or clutch size that are bounded below by zero and above by some small integer form a discrete random variable distribution that is not approximated well by conventional parametric count distributions such as the Poisson or negative binomial. We developed a logistic quantile regression model to provide estimates of the empirical conditional distribution of a bounded discrete random variable. The logistic quantile regression model requires that counts are randomly jittered to a continuous random variable, logit transformed to bound them between specified lower and upper values, then estimated in conventional linear quantile regression, repeating the 3 steps and averaging estimates. Back-transformation to the original discrete scale relies on the fact that quantiles are equivariant to monotonic transformations. We demonstrate this statistical procedure by modeling 20 years of California Spotted Owl fledgling production (0−3 per territory) on the Lassen National Forest, California, USA, as related to climate, demographic, and landscape habitat characteristics at territories. Spotted Owl fledgling counts increased nonlinearly with decreasing precipitation in the early nesting period, in the winter prior to nesting, and in the prior growing season; with increasing minimum temperatures in the early nesting period; with adult compared to subadult parents; when there was no fledgling production in the prior year; and when percentage of the landscape surrounding nesting sites (202 ha) with trees ≥25 m height increased. Changes in production were primarily driven by changes in the proportion of territories with 2 or 3 fledglings. Average variances of the discrete cumulative distributions of the estimated fledgling counts indicated that temporal changes in climate and parent age class explained 18% of the annual variance in owl fledgling production, which was 34% of the total variance. Prior fledgling production explained as much of the variance in the fledgling counts as climate, parent age class, and landscape habitat predictors. Our logistic quantile regression model can be used for any discrete response variables with fixed upper and lower bounds.
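
    The three-step procedure described above translates almost line for line into code. In this sketch the bounds, covariate, and data are invented; predictions are averaged on the logit scale before back-transforming, which is one reasonable reading of the averaging step, and quantile equivariance under monotone transformations justifies the back-transformation.

    ```python
    # Logistic quantile regression for bounded counts: jitter, logit-transform
    # between fixed bounds, fit linear QR, repeat and average, back-transform.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(17)
    n, L, U = 800, 0.0, 4.0          # counts 0-3, so jittered values lie in (0, 4)
    precip = rng.uniform(0, 20, n)   # invented covariate
    p = 1 / (1 + np.exp(-(1.0 - 0.15 * precip)))
    counts = rng.binomial(3, p)
    X = sm.add_constant(precip)

    tau, M, preds = 0.5, 40, []
    for _ in range(M):
        z = counts + rng.uniform(size=n)        # step 1: jitter to continuous
        w = np.log((z - L) / (U - z))           # step 2: logit to bounds (L, U)
        preds.append(sm.QuantReg(w, X).fit(q=tau).predict(X))  # step 3: linear QR
    wbar = np.mean(preds, axis=0)               # average over jitters

    q_orig = (L + U * np.exp(wbar)) / (1 + np.exp(wbar))  # monotone back-transform
    print(q_orig[:5])  # median fledgling count on the original bounded scale
    ```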

  32. Linear models: permutation methods

    USGS Publications Warehouse

    Cade, B.S.; Everitt, B.S.; Howell, D.C.

    2005-01-01

    Permutation tests (see Permutation Based Inference) for the linear model have applications in behavioral studies when traditional parametric assumptions about the error term in a linear model are not tenable. Improved validity of Type I error rates can be achieved with properly constructed permutation tests. Perhaps more importantly, increased statistical power, improved robustness to effects of outliers, and detection of alternative distributional differences can be achieved by coupling permutation inference with alternative linear model estimators. For example, it is well known that estimates of the mean in a linear model are extremely sensitive to even a single outlying value of the dependent variable compared to estimates of the median [7, 19]. Traditionally, linear modeling focused on estimating changes in the center of distributions (means or medians). However, quantile regression allows distributional changes to be estimated in all or any selected part of a distribution of responses, providing a more complete statistical picture that has relevance to many biological questions [6]...

  33. Quantiles for Finite Mixtures of Normal Distributions

    ERIC Educational Resources Information Center

    Rahman, Mezbahur; Rahman, Rumanur; Pearson, Larry M.

    2006-01-01

    Quantiles for finite mixtures of normal distributions are computed. The difference between a linear combination of independent normal random variables and a linear combination of independent normal densities is emphasized. (Contains 3 tables and 1 figure.)
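
    The distinction the record emphasizes is worth making concrete: a quantile of a finite mixture of normal densities has no closed form and must be found by inverting the mixture CDF numerically, whereas a linear combination of independent normal random variables is itself normal, with a closed-form quantile. A short sketch:

    ```python
    # Mixture quantile by root-finding on the mixture CDF, contrasted with the
    # closed-form quantile of a linear combination of independent normals.
    import numpy as np
    from scipy.optimize import brentq
    from scipy.stats import norm

    weights = np.array([0.3, 0.7])
    mus, sigmas = np.array([-1.0, 2.0]), np.array([1.0, 0.5])

    def mixture_cdf(x):
        return np.sum(weights * norm.cdf(x, loc=mus, scale=sigmas))

    def mixture_quantile(p):
        lo = np.min(mus - 10 * sigmas)   # generous bracket
        hi = np.max(mus + 10 * sigmas)
        return brentq(lambda x: mixture_cdf(x) - p, lo, hi)

    print(mixture_quantile(0.5), mixture_quantile(0.95))

    # Linear combination of the *random variables*: normal, closed form.
    print(norm.ppf(0.95, loc=weights @ mus,
                   scale=np.sqrt(weights**2 @ sigmas**2)))
    ```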

  34. Spatially Modeling the Effects of Meteorological Drivers of PM2.5 in the Eastern United States via a Local Linear Penalized Quantile Regression Estimator.

    PubMed

    Russell, Brook T; Wang, Dewei; McMahan, Christopher S

    2017-08-01

    Fine particulate matter (PM2.5) poses a significant risk to human health, with long-term exposure being linked to conditions such as asthma, chronic bronchitis, lung cancer, atherosclerosis, etc. In order to improve current pollution control strategies and to better shape public policy, the development of a more comprehensive understanding of this air pollutant is necessary. To this end, this work attempts to quantify the relationship between certain meteorological drivers and the levels of PM2.5. It is expected that the set of important meteorological drivers will vary both spatially and within the conditional distribution of PM2.5 levels. To account for these characteristics, a new local linear penalized quantile regression methodology is developed. The proposed estimator uniquely selects the set of important drivers at every spatial location and for each quantile of the conditional distribution of PM2.5 levels. The performance of the proposed methodology is illustrated through simulation, and it is then used to determine the association between several meteorological drivers and PM2.5 over the Eastern United States (US). This analysis suggests that the primary drivers throughout much of the Eastern US tend to differ based on season and geographic location, with similarities existing between "typical" and "high" PM2.5 levels.

  35. Variable Selection for Nonparametric Quantile Regression via Smoothing Spline ANOVA

    PubMed Central

    Lin, Chen-Yen; Bondell, Howard; Zhang, Hao Helen; Zou, Hui

    2014-01-01

    Quantile regression provides a more thorough view of the effect of covariates on a response. Nonparametric quantile regression has become a viable alternative to avoid restrictive parametric assumptions. The problem of variable selection for quantile regression is challenging, since important variables can influence various quantiles in different ways. We tackle the problem via regularization in the context of smoothing spline ANOVA models. The proposed sparse nonparametric quantile regression (SNQR) can identify important variables and provide flexible estimates for quantiles. Our numerical study suggests the promising performance of the new procedure in variable selection and function estimation. Supplementary materials for this article are available online. PMID:24554792

  36. Efficient Regressions via Optimally Combining Quantile Information

    PubMed Central

    Zhao, Zhibiao; Xiao, Zhijie

    2014-01-01

    We develop a generally applicable framework for constructing efficient estimators of regression models via quantile regressions. The proposed method is based on optimally combining information over multiple quantiles and can be applied to a broad range of parametric and nonparametric settings. When combining information over a fixed number of quantiles, we derive an upper bound on the distance between the efficiency of the proposed estimator and the Fisher information. As the number of quantiles increases, this upper bound decreases and the asymptotic variance of the proposed estimator approaches the Cramér-Rao lower bound under appropriate conditions. In the case of non-regular statistical estimation, the proposed estimator leads to super-efficient estimation. We illustrate the proposed method for several widely used regression models. Both asymptotic theory and Monte Carlo experiments show the superior performance over existing methods. PMID:25484481

  17. Incremental Treatment Costs Attributable to Overweight and Obesity in Patients with Diabetes: Quantile Regression Approach.

    PubMed

    Lee, Seung-Mi; Choi, In-Sun; Han, Euna; Suh, David; Shin, Eun-Kyung; Je, Seyunghe; Lee, Sung Su; Suh, Dong-Churl

    2018-01-01

    This study aimed to estimate treatment costs attributable to overweight and obesity in patients with diabetes who were less than 65 years of age in the United States. This study used data from the Medical Expenditure Panel Survey from 2001 to 2013. Patients with diabetes were identified by using the International Classification of Diseases, Ninth Revision, Clinical Modification code (250), clinical classification codes (049 and 050), or self-reported physician diagnoses. Total treatment costs attributable to overweight and obesity were calculated as the differences in adjusted costs relative to individuals with diabetes and normal weight. Adjusted costs were estimated by using generalized linear models or unconditional quantile regression models. The mean annual treatment costs attributable to obesity were $1,852 higher than those attributable to normal weight, while costs attributable to overweight were $133 higher. The unconditional quantile regression results indicated that the impact of obesity on total treatment costs gradually became more significant as treatment costs approached the upper quantiles. Among patients with diabetes who were less than 65 years of age, patients with diabetes and obesity have significantly higher treatment costs than patients with diabetes and normal weight. The economic burden of diabetes to society will continue to increase unless more proactive preventive measures are taken to effectively treat patients with overweight or obesity. © 2017 The Obesity Society.
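
    Unconditional quantile regression is commonly implemented via the recentered influence function (RIF) of Firpo, Fortin and Lemieux; the hedged sketch below follows that recipe under an assumed lognormal cost model, with variable names (bmi, costs) that are illustrative rather than taken from the study.

        import numpy as np
        import statsmodels.api as sm
        from scipy.stats import gaussian_kde

        rng = np.random.default_rng(2)
        n = 1000
        bmi = rng.normal(28, 5, size=n)
        costs = np.exp(7 + 0.05 * bmi + rng.normal(0, 0.8, size=n))  # skewed costs

        tau = 0.9
        q = np.quantile(costs, tau)
        f_q = gaussian_kde(costs)(q)[0]          # density of costs at the quantile
        # RIF(y; q_tau) = q_tau + (tau - 1{y <= q_tau}) / f(q_tau)
        rif = q + (tau - (costs <= q)) / f_q

        X = sm.add_constant(bmi)
        print(sm.OLS(rif, X).fit().params)       # effect on the 90th percentile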

  18. Prenatal Lead Exposure and Fetal Growth: Smaller Infants Have Heightened Susceptibility

    PubMed Central

    Rodosthenous, Rodosthenis S.; Burris, Heather H.; Svensson, Katherine; Amarasiriwardena, Chitra J.; Cantoral, Alejandra; Schnaas, Lourdes; Mercado-García, Adriana; Coull, Brent A.; Wright, Robert O.; Téllez-Rojo, Martha M.; Baccarelli, Andrea A.

    2016-01-01

    Background As population lead levels decrease, the toxic effects of lead may be distributed to more sensitive populations, such as infants with poor fetal growth. Objectives To determine the association of prenatal lead exposure and fetal growth; and to evaluate whether infants with poor fetal growth are more susceptible to lead toxicity than those with normal fetal growth. Methods We examined the association of second trimester maternal blood lead levels (BLL) with birthweight-for-gestational age (BWGA) z-score in 944 mother-infant participants of the PROGRESS cohort. We determined the association between maternal BLL and BWGA z-score by using both linear and quantile regression. We estimated odds ratios for small-for-gestational age (SGA) infants between maternal BLL quartiles using logistic regression. Maternal age, body mass index, socioeconomic status, parity, household smoking exposure, hemoglobin levels, and infant sex were included as confounders. Results While linear regression showed a negative association between maternal BLL and BWGA z-score (β=−0.06 z-score units per log2 BLL increase; 95% CI: −0.13, 0.003; P=0.06), quantile regression revealed larger magnitudes of this association in the <30th percentiles of BWGA z-score (β range [−0.08, −0.13] z-score units per log2 BLL increase; all P values <0.05). Mothers in the highest BLL quartile had an odds ratio of 1.62 (95% CI: 0.99–2.65) for having a SGA infant compared to the lowest BLL quartile. Conclusions While both linear and quantile regression showed a negative association between prenatal lead exposure and birthweight, quantile regression revealed that smaller infants may represent a more susceptible subpopulation. PMID:27923585

  19. Prenatal lead exposure and fetal growth: Smaller infants have heightened susceptibility.

    PubMed

    Rodosthenous, Rodosthenis S; Burris, Heather H; Svensson, Katherine; Amarasiriwardena, Chitra J; Cantoral, Alejandra; Schnaas, Lourdes; Mercado-García, Adriana; Coull, Brent A; Wright, Robert O; Téllez-Rojo, Martha M; Baccarelli, Andrea A

    2017-02-01

    As population lead levels decrease, the toxic effects of lead may be distributed to more sensitive populations, such as infants with poor fetal growth. To determine the association of prenatal lead exposure and fetal growth; and to evaluate whether infants with poor fetal growth are more susceptible to lead toxicity than those with normal fetal growth. We examined the association of second trimester maternal blood lead levels (BLL) with birthweight-for-gestational age (BWGA) z-score in 944 mother-infant participants of the PROGRESS cohort. We determined the association between maternal BLL and BWGA z-score by using both linear and quantile regression. We estimated odds ratios for small-for-gestational age (SGA) infants between maternal BLL quartiles using logistic regression. Maternal age, body mass index, socioeconomic status, parity, household smoking exposure, hemoglobin levels, and infant sex were included as confounders. While linear regression showed a negative association between maternal BLL and BWGA z-score (β=-0.06 z-score units per log2 BLL increase; 95% CI: -0.13, 0.003; P=0.06), quantile regression revealed larger magnitudes of this association in the <30th percentiles of BWGA z-score (β range [-0.08, -0.13] z-score units per log2 BLL increase; all P values < 0.05). Mothers in the highest BLL quartile had an odds ratio of 1.62 (95% CI: 0.99-2.65) for having a SGA infant compared to the lowest BLL quartile. While both linear and quantile regression showed a negative association between prenatal lead exposure and birthweight, quantile regression revealed that smaller infants may represent a more susceptible subpopulation. Copyright © 2016 Elsevier Ltd. All rights reserved.

  20. Estimating normative limits of Heidelberg Retina Tomograph optic disc rim area with quantile regression.

    PubMed

    Artes, Paul H; Crabb, David P

    2010-01-01

    To investigate why the specificity of the Moorfields Regression Analysis (MRA) of the Heidelberg Retina Tomograph (HRT) varies with disc size, and to derive accurate normative limits for neuroretinal rim area to address this problem. Two datasets from healthy subjects (Manchester, UK, n = 88; Halifax, Nova Scotia, Canada, n = 75) were used to investigate the physiological relationship between the optic disc and neuroretinal rim area. Normative limits for rim area were derived by quantile regression (QR) and compared with those of the MRA (derived by linear regression). Logistic regression analyses were performed to quantify the association between disc size and positive classifications with the MRA, as well as with the QR-derived normative limits. In both datasets, the specificity of the MRA depended on optic disc size. The odds of observing a borderline or outside-normal-limits classification increased by approximately 10% for each 0.1 mm² increase in disc area (P < 0.1). The lower specificity of the MRA with large optic discs could be explained by the failure of linear regression to model the extremes of the rim area distribution (observations far from the mean). In comparison, the normative limits predicted by QR were larger for smaller discs (less specific, more sensitive), and smaller for larger discs, such that false-positive rates became independent of optic disc size. Normative limits derived by quantile regression appear to remove the size-dependence of specificity with the MRA. Because quantile regression does not rely on the restrictive assumptions of standard linear regression, it may be a more appropriate method for establishing normative limits in other clinical applications where the underlying distributions are nonnormal or have nonconstant variance.
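
    The contrast between linear-regression and quantile-regression normative limits can be sketched as follows; the simulated rim/disc data with size-dependent spread and the 2.5th-percentile cut-off are illustrative assumptions, not the MRA's actual construction.

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(3)
        disc = rng.uniform(1.2, 3.0, size=300)                    # disc area, mm^2
        rim = 0.8 + 0.45 * disc + disc * rng.normal(0, 0.1, 300)  # spread grows with size

        X = sm.add_constant(disc)
        lo_qr = sm.QuantReg(rim, X).fit(q=0.025)     # lower normative limit by QR
        print("QR 2.5th percentile line:", lo_qr.params)

        # A linear-regression limit (mean minus ~1.96 residual SD) ignores the
        # size-dependent spread, which is what degrades specificity for large discs.
        ols = sm.OLS(rim, X).fit()
        print("OLS line:", ols.params, "resid SD:", ols.resid.std())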

  1. Quality of life in breast cancer patients--a quantile regression analysis.

    PubMed

    Pourhoseingholi, Mohamad Amin; Safaee, Azadeh; Moghimi-Dehkordi, Bijan; Zeighami, Bahram; Faghihzadeh, Soghrat; Tabatabaee, Hamid Reza; Pourhoseingholi, Asma

    2008-01-01

    Quality of life studies play an important role in health care, especially in chronic diseases, in clinical judgment and in the allocation of medical resources. Statistical tools like linear regression are widely used to assess the predictors of quality of life. But when the response is not normally distributed, the results can be misleading. The aim of this study is to determine the predictors of quality of life in breast cancer patients using a quantile regression model and to compare the results to linear regression. A cross-sectional study was conducted on 119 breast cancer patients who were admitted and treated in the chemotherapy ward of Namazi hospital in Shiraz. We used the QLQ-C30 questionnaire to assess quality of life in these patients. A quantile regression was employed to assess the associated factors and the results were compared to linear regression. All analyses were carried out using SAS. The mean score for the global health status for breast cancer patients was 64.92 ± 11.42. Linear regression showed that only grade of tumor, occupational status, menopausal status, financial difficulties and dyspnea were statistically significant. In contrast to linear regression, financial difficulties were not significant in the quantile regression analysis, and dyspnea was significant only for the first quartile. Also, emotional functioning and duration of disease statistically predicted the QOL score in the third quartile. The results demonstrate that using quantile regression leads to better interpretation and richer inference about predictors of breast cancer patient quality of life.

  2. Nonparametric methods for drought severity estimation at ungauged sites

    NASA Astrophysics Data System (ADS)

    Sadri, S.; Burn, D. H.

    2012-12-01

    The objective of frequency analysis is to estimate, for extreme events such as drought severity or duration, the relationship between the event and the associated return period at a catchment. Neural networks and other artificial intelligence approaches to function estimation and regression analysis are relatively new techniques in engineering, providing an attractive alternative to traditional statistical models. There are, however, few applications of neural networks and support vector machines in the area of severity quantile estimation for drought frequency analysis. In this paper, we compare three methods for this task: multiple linear regression, radial basis function neural networks, and least squares support vector regression (LS-SVR). The area selected for this study includes 32 catchments in the Canadian Prairies. From each catchment, drought severities are extracted and fitted to a Pearson type III distribution, whose quantiles act as the observed values. For each method-duration pair, we use a jackknife algorithm to produce estimated values at each site. The results from these three approaches are compared and analyzed, and it is found that LS-SVR provides the best quantile estimates and extrapolating capacity.
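
    A hedged sketch of the evaluation design — leave-one-out (jackknife) prediction of site quantiles from catchment descriptors — is shown below, with scikit-learn's SVR used as an accessible stand-in for LS-SVR and with simulated descriptors and quantiles.

        import numpy as np
        from sklearn.linear_model import LinearRegression
        from sklearn.model_selection import LeaveOneOut
        from sklearn.svm import SVR

        rng = np.random.default_rng(15)
        n = 32                                           # catchments
        descriptors = rng.normal(size=(n, 3))            # e.g. area, slope, rainfall
        quantile_obs = 5 + descriptors @ np.array([1.0, 0.5, -0.3]) + rng.normal(0, 0.3, n)

        # Jackknife: hold out each catchment in turn and predict its quantile.
        for model in (LinearRegression(), SVR(kernel="rbf", C=10.0)):
            preds = np.array([
                model.fit(descriptors[tr], quantile_obs[tr]).predict(descriptors[te])[0]
                for tr, te in LeaveOneOut().split(descriptors)
            ])
            print(type(model).__name__, "RMSE:",
                  np.sqrt(np.mean((preds - quantile_obs) ** 2)))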

  3. GLOBALLY ADAPTIVE QUANTILE REGRESSION WITH ULTRA-HIGH DIMENSIONAL DATA

    PubMed Central

    Zheng, Qi; Peng, Limin; He, Xuming

    2015-01-01

    Quantile regression has become a valuable tool to analyze heterogeneous covariate-response associations that are often encountered in practice. The development of quantile regression methodology for high dimensional covariates primarily focuses on examination of model sparsity at a single or multiple quantile levels, which are typically prespecified ad hoc by the users. The resulting models may be sensitive to the specific choices of the quantile levels, leading to difficulties in interpretation and erosion of confidence in the results. In this article, we propose a new penalization framework for quantile regression in the high dimensional setting. We employ adaptive L1 penalties, and more importantly, propose a uniform selector of the tuning parameter for a set of quantile levels to avoid some of the potential problems with model selection at individual quantile levels. Our proposed approach achieves consistent shrinkage of regression quantile estimates across a continuous range of quantile levels, enhancing the flexibility and robustness of the existing penalized quantile regression methods. Our theoretical results include the oracle rate of uniform convergence and weak convergence of the parameter estimators. We also use numerical studies to confirm our theoretical findings and illustrate the practical utility of our proposal. PMID:26604424

  4. Quantile uncertainty and value-at-risk model risk.

    PubMed

    Alexander, Carol; Sarabia, José María

    2012-08-01

    This article develops a methodology for quantifying model risk in quantile risk estimates. The application of quantile estimates to risk assessment has become common practice in many disciplines, including hydrology, climate change, statistical process control, insurance and actuarial science, and the uncertainty surrounding these estimates has long been recognized. Our work is particularly important in finance, where quantile estimates (called Value-at-Risk) have been the cornerstone of banking risk management since the mid 1980s. A recent amendment to the Basel II Accord recommends additional market risk capital to cover all sources of "model risk" in the estimation of these quantiles. We provide a novel and elegant framework whereby quantile estimates are adjusted for model risk, relative to a benchmark which represents the state of knowledge of the authority that is responsible for model risk. A simulation experiment in which the degree of model risk is controlled illustrates how to quantify Value-at-Risk model risk and compute the required regulatory capital add-on for banks. An empirical example based on real data shows how the methodology can be put into practice, using only two time series (daily Value-at-Risk and daily profit and loss) from a large bank. We conclude with a discussion of potential applications to nonfinancial risks. © 2012 Society for Risk Analysis.
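
    As a minimal, hedged illustration of the quantile estimates at issue, the snippet below computes a historical-simulation Value-at-Risk as a rolling empirical quantile of daily P&L; the window length and tail level are assumptions, and the article's benchmark-relative model-risk adjustment is not reproduced.

        import numpy as np

        rng = np.random.default_rng(4)
        pnl = rng.standard_t(df=4, size=1500)        # heavy-tailed daily P&L

        alpha, window = 0.01, 250
        var_series = np.array([
            -np.quantile(pnl[t - window:t], alpha)   # VaR reported as a positive loss
            for t in range(window, len(pnl))
        ])
        print("mean 1% VaR over the sample:", var_series.mean())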

  5. STRONG ORACLE OPTIMALITY OF FOLDED CONCAVE PENALIZED ESTIMATION.

    PubMed

    Fan, Jianqing; Xue, Lingzhou; Zou, Hui

    2014-06-01

    Folded concave penalization methods have been shown to enjoy the strong oracle property for high-dimensional sparse estimation. However, a folded concave penalization problem usually has multiple local solutions and the oracle property is established only for one of the unknown local solutions. A challenging fundamental issue still remains that it is not clear whether the local optimum computed by a given optimization algorithm possesses those nice theoretical properties. To close this important theoretical gap in over a decade, we provide a unified theory to show explicitly how to obtain the oracle solution via the local linear approximation algorithm. For a folded concave penalized estimation problem, we show that as long as the problem is localizable and the oracle estimator is well behaved, we can obtain the oracle estimator by using the one-step local linear approximation. In addition, once the oracle estimator is obtained, the local linear approximation algorithm converges, namely it produces the same estimator in the next iteration. The general theory is demonstrated by using four classical sparse estimation problems, i.e., sparse linear regression, sparse logistic regression, sparse precision matrix estimation and sparse quantile regression.

  6. STRONG ORACLE OPTIMALITY OF FOLDED CONCAVE PENALIZED ESTIMATION

    PubMed Central

    Fan, Jianqing; Xue, Lingzhou; Zou, Hui

    2014-01-01

    Folded concave penalization methods have been shown to enjoy the strong oracle property for high-dimensional sparse estimation. However, a folded concave penalization problem usually has multiple local solutions and the oracle property is established only for one of the unknown local solutions. A challenging fundamental issue still remains that it is not clear whether the local optimum computed by a given optimization algorithm possesses those nice theoretical properties. To close this important theoretical gap in over a decade, we provide a unified theory to show explicitly how to obtain the oracle solution via the local linear approximation algorithm. For a folded concave penalized estimation problem, we show that as long as the problem is localizable and the oracle estimator is well behaved, we can obtain the oracle estimator by using the one-step local linear approximation. In addition, once the oracle estimator is obtained, the local linear approximation algorithm converges, namely it produces the same estimator in the next iteration. The general theory is demonstrated by using four classical sparse estimation problems, i.e., sparse linear regression, sparse logistic regression, sparse precision matrix estimation and sparse quantile regression. PMID:25598560
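
    A compact sketch of the one-step local linear approximation follows: the SCAD penalty is linearized at an initial lasso estimate, which turns the problem into a weighted lasso, solved here by rescaling the design columns. The weight clipping is a numerical convenience rather than part of the theory, and the tuning constants are illustrative assumptions.

        import numpy as np
        from sklearn.linear_model import Lasso

        def scad_deriv(t, lam, a=3.7):
            # SCAD derivative: lam for |t| <= lam, then a linear taper to zero.
            t = np.abs(t)
            return np.where(t <= lam, lam, np.maximum(a * lam - t, 0.0) / (a - 1.0))

        rng = np.random.default_rng(5)
        n, p = 200, 50
        X = rng.normal(size=(n, p))
        beta_true = np.zeros(p); beta_true[:3] = (3.0, -2.0, 1.5)
        y = X @ beta_true + rng.normal(size=n)

        lam = 0.2
        beta0 = Lasso(alpha=lam).fit(X, y).coef_          # initial (lasso) estimate
        w = np.maximum(scad_deriv(beta0, lam), 1e-6)      # LLA weights p'_lam(|beta0|)
        gamma = Lasso(alpha=1.0).fit(X / w, y).coef_      # weighted lasso via rescaling
        beta1 = gamma / w                                  # one-step LLA estimate
        print("nonzero coefficients:", np.flatnonzero(np.abs(beta1) > 1e-8))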

  7. Quantile regression applied to spectral distance decay

    USGS Publications Warehouse

    Rocchini, D.; Cade, B.S.

    2008-01-01

    Remotely sensed imagery has long been recognized as a powerful support for characterizing and estimating biodiversity. Spectral distance among sites has proven to be a powerful approach for detecting species composition variability. Regression analysis of species similarity versus spectral distance allows us to quantitatively estimate the amount of turnover in species composition with respect to spectral and ecological variability. In classical regression analysis, the residual sum of squares is minimized for the mean of the dependent variable distribution. However, many ecological data sets are characterized by a high number of zeroes that add noise to the regression model. Quantile regressions can be used to evaluate trend in the upper quantiles rather than a mean trend across the whole distribution of the dependent variable. In this letter, we used ordinary least squares (OLS) and quantile regressions to estimate the decay of species similarity versus spectral distance. The achieved decay rates were statistically nonzero (p < 0.01), considering both OLS and quantile regressions. Nonetheless, the OLS regression estimate of the mean decay rate was only half the decay rate indicated by the upper quantiles. Moreover, the intercept value, representing the similarity reached when the spectral distance approaches zero, was very low compared with the intercepts of the upper quantiles, which detected high species similarity when habitats are more similar. In this letter, we demonstrated the power of using quantile regressions applied to spectral distance decay to reveal species diversity patterns otherwise lost or underestimated by OLS regression. © 2008 IEEE.

  8. Spectral distance decay: Assessing species beta-diversity by quantile regression

    USGS Publications Warehouse

    Rocchinl, D.; Nagendra, H.; Ghate, R.; Cade, B.S.

    2009-01-01

    Remotely sensed data represents key information for characterizing and estimating biodiversity. Spectral distance among sites has proven to be a powerful approach for detecting species composition variability. Regression analysis of species similarity versus spectral distance may allow us to quantitatively estimate how beta-diversity in species changes with respect to spectral and ecological variability. In classical regression analysis, the residual sum of squares is minimized for the mean of the dependent variable distribution. However, many ecological datasets are characterized by a high number of zeroes that can add noise to the regression model. Quantile regression can be used to evaluate trend in the upper quantiles rather than a mean trend across the whole distribution of the dependent variable. In this paper, we used ordinary least squares (OLS) and quantile regression to estimate the decay of species similarity versus spectral distance. The achieved decay rates were statistically nonzero (p < 0.05) considering both OLS and quantile regression. Nonetheless, the OLS regression estimate of the mean decay rate was only half the decay rate indicated by the upper quantiles. Moreover, the intercept value, representing the similarity reached when spectral distance approaches zero, was very low compared with the intercepts of the upper quantiles, which detected high species similarity when habitats are more similar. In this paper we demonstrated the power of using quantile regression applied to spectral distance decay in order to reveal species diversity patterns otherwise lost or underestimated by ordinary least squares regression. © 2009 American Society for Photogrammetry and Remote Sensing.
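
    The central contrast in these two studies — a mean decay rate versus upper-quantile decay rates — can be sketched as below on simulated zero-inflated similarity data; the data-generating process is an assumption.

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(6)
        dist = rng.uniform(0, 1, size=400)                     # spectral distance
        # Similarity with many zeroes, mimicking zero-inflated ecological data.
        sim = np.clip(0.9 - 0.8 * dist - rng.exponential(0.2, 400), 0, None)

        X = sm.add_constant(dist)
        print("OLS slope:   ", sm.OLS(sim, X).fit().params[1])
        for tau in (0.75, 0.9, 0.95):
            print(f"q{tau} slope:", sm.QuantReg(sim, X).fit(q=tau).params[1])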

  9. Regularized quantile regression for SNP marker estimation of pig growth curves.

    PubMed

    Barroso, L M A; Nascimento, M; Nascimento, A C C; Silva, F F; Serão, N V L; Cruz, C D; Resende, M D V; Silva, F L; Azevedo, C F; Lopes, P S; Guimarães, S E F

    2017-01-01

    Genomic growth curves are generally defined only in terms of population mean; an alternative approach that has not yet been exploited in genomic analyses of growth curves is the Quantile Regression (QR). This methodology allows for the estimation of marker effects at different levels of the variable of interest. We aimed to propose and evaluate a regularized quantile regression for SNP marker effect estimation of pig growth curves, as well as to identify the chromosome regions of the most relevant markers and to estimate the genetic individual weight trajectory over time (genomic growth curve) under different quantiles (levels). The regularized quantile regression (RQR) enabled the discovery, at different levels of interest (quantiles), of the most relevant markers allowing for the identification of QTL regions. We found the same relevant markers simultaneously affecting different growth curve parameters (mature weight and maturity rate): two (ALGA0096701 and ALGA0029483) for RQR(0.2), one (ALGA0096701) for RQR(0.5), and one (ALGA0003761) for RQR(0.8). Three average genomic growth curves were obtained and the behavior was explained by the curve in quantile 0.2, which differed from the others. RQR allowed for the construction of genomic growth curves, which is the key to identifying and selecting the most desirable animals for breeding purposes. Furthermore, the proposed model enabled us to find, at different levels of interest (quantiles), the most relevant markers for each trait (growth curve parameter estimates) and their respective chromosomal positions (identification of new QTL regions for growth curves in pigs). These markers can be exploited under the context of marker assisted selection while aiming to change the shape of pig growth curves.

  10. Predicting birth weight with conditionally linear transformation models.

    PubMed

    Möst, Lisa; Schmid, Matthias; Faschingbauer, Florian; Hothorn, Torsten

    2016-12-01

    Low and high birth weight (BW) are important risk factors for neonatal morbidity and mortality. Gynecologists must therefore accurately predict BW before delivery. Most prediction formulas for BW are based on prenatal ultrasound measurements carried out within one week prior to birth. Although successfully used in clinical practice, these formulas focus on point predictions of BW but do not systematically quantify uncertainty of the predictions, i.e. they result in estimates of the conditional mean of BW but do not deliver prediction intervals. To overcome this problem, we introduce conditionally linear transformation models (CLTMs) to predict BW. Instead of focusing only on the conditional mean, CLTMs model the whole conditional distribution function of BW given prenatal ultrasound parameters. Consequently, the CLTM approach delivers both point predictions of BW and fetus-specific prediction intervals. Prediction intervals constitute an easy-to-interpret measure of prediction accuracy and allow identification of fetuses subject to high prediction uncertainty. Using a data set of 8712 deliveries at the Perinatal Centre at the University Clinic Erlangen (Germany), we analyzed variants of CLTMs and compared them to standard linear regression estimation techniques used in the past and to quantile regression approaches. The best-performing CLTM variant was competitive with quantile regression and linear regression approaches in terms of conditional coverage and average length of the prediction intervals. We propose that CLTMs be used because they are able to account for possible heteroscedasticity, kurtosis, and skewness of the distribution of BWs. © The Author(s) 2014.
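
    Distribution-aware prediction intervals of the kind discussed here can be sketched with two quantile fits (2.5th and 97.5th percentiles); the snippet uses gradient boosting with the pinball loss rather than the paper's CLTM machinery, and the heteroscedastic birth-weight model is an illustrative assumption.

        import numpy as np
        from sklearn.ensemble import GradientBoostingRegressor

        rng = np.random.default_rng(7)
        n = 2000
        ultrasound = rng.uniform(20, 40, size=(n, 1))          # e.g. abdominal circumference
        bw = 100 * ultrasound[:, 0] + rng.normal(0, 5 * ultrasound[:, 0])  # heteroscedastic

        lo = GradientBoostingRegressor(loss="quantile", alpha=0.025).fit(ultrasound, bw)
        hi = GradientBoostingRegressor(loss="quantile", alpha=0.975).fit(ultrasound, bw)

        # Fetus-specific intervals; empirical coverage should be near 0.95.
        covered = (bw >= lo.predict(ultrasound)) & (bw <= hi.predict(ultrasound))
        print("empirical coverage:", covered.mean())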

  11. Estimation of peak discharge quantiles for selected annual exceedance probabilities in northeastern Illinois

    USGS Publications Warehouse

    Over, Thomas M.; Saito, Riki J.; Veilleux, Andrea G.; Sharpe, Jennifer B.; Soong, David T.; Ishii, Audrey L.

    2016-06-28

    This report provides two sets of equations for estimating peak discharge quantiles at annual exceedance probabilities (AEPs) of 0.50, 0.20, 0.10, 0.04, 0.02, 0.01, 0.005, and 0.002 (recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years, respectively) for watersheds in Illinois based on annual maximum peak discharge data from 117 watersheds in and near northeastern Illinois. One set of equations was developed through a temporal analysis with a two-step least squares-quantile regression technique that measures the average effect of changes in the urbanization of the watersheds used in the study. The resulting equations can be used to adjust rural peak discharge quantiles for the effect of urbanization, and in this study the equations also were used to adjust the annual maximum peak discharges from the study watersheds to 2010 urbanization conditions. The other set of equations was developed by a spatial analysis. This analysis used generalized least-squares regression to fit the peak discharge quantiles computed from the urbanization-adjusted annual maximum peak discharges from the study watersheds to drainage-basin characteristics. The peak discharge quantiles were computed by using the Expected Moments Algorithm following the removal of potentially influential low floods defined by a multiple Grubbs-Beck test. To improve the quantile estimates, regional skew coefficients were obtained from a newly developed regional skew model in which the skew increases with the urbanized land use fraction. The drainage-basin characteristics used as explanatory variables in the spatial analysis include drainage area, the fraction of developed land, the fraction of land with poorly drained soils or likely water, and the basin slope estimated as the ratio of the basin relief to basin perimeter. This report also provides the following: (1) examples to illustrate the use of the spatial and urbanization-adjustment equations for estimating peak discharge quantiles at ungaged sites and to improve flood-quantile estimates at and near a gaged site; (2) the urbanization-adjusted annual maximum peak discharges and peak discharge quantile estimates at streamgages from 181 watersheds including the 117 study watersheds and 64 additional watersheds in the study region that were originally considered for use in the study but later deemed to be redundant. The urbanization-adjustment equations, spatial regression equations, and peak discharge quantile estimates developed in this study will be made available in the web application StreamStats, which provides automated regression-equation solutions for user-selected stream locations. Figures and tables comparing the observed and urbanization-adjusted annual maximum peak discharge records by streamgage are provided at https://doi.org/10.3133/sir20165050 for download.

  12. Understanding Child Stunting in India: A Comprehensive Analysis of Socio-Economic, Nutritional and Environmental Determinants Using Additive Quantile Regression

    PubMed Central

    Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A.

    2013-01-01

    Background Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. Objective We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Design Using cross-sectional data for children aged 0–24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. Results At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Conclusions Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role. PMID:24223839

  13. Understanding child stunting in India: a comprehensive analysis of socio-economic, nutritional and environmental determinants using additive quantile regression.

    PubMed

    Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A

    2013-01-01

    Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Using cross-sectional data for children aged 0-24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role.

  14. Birthweight Related Factors in Northwestern Iran: Using Quantile Regression Method.

    PubMed

    Fallah, Ramazan; Kazemnejad, Anoshirvan; Zayeri, Farid; Shoghli, Alireza

    2015-11-18

    Birthweight is one of the most important predictors of health status in adulthood. Achieving balanced birthweights is one of the priorities of the health system in most industrialized and developed countries. This indicator is used to assess the growth and health status of infants. The aim of this study was to assess the birthweight of neonates by using quantile regression in Zanjan province. This descriptive analytical study was carried out using pre-registered (March 2010 - March 2012) data on neonates in urban/rural health centers of Zanjan province using multiple-stage cluster sampling. Data were analyzed using multiple linear regression and the quantile regression method with SAS 9.2 statistical software. Of 8456 newborns, 4146 (49%) were female. The mean age of the mothers was 27.1±5.4 years. The mean birthweight of the neonates was 3104 ± 431 grams. Five hundred seventy-three (6.8%) of the neonates weighed less than 2500 grams. In all quantiles, gestational age of the neonates (p<0.05) and weight and educational level of the mothers (p<0.05) showed a significant linear relationship with the birthweight of the neonates. However, sex and birth rank of the neonates, mothers' age, place of residence (urban/rural) and occupation were not significant in any quantile (p>0.05). This study revealed that the results of multiple linear regression and quantile regression were not identical. We strongly recommend the use of quantile regression when the response variable is asymmetric or the data contain outliers.

  15. Birthweight Related Factors in Northwestern Iran: Using Quantile Regression Method

    PubMed Central

    Fallah, Ramazan; Kazemnejad, Anoshirvan; Zayeri, Farid; Shoghli, Alireza

    2016-01-01

    Introduction: Birthweight is one of the most important predictors of health status in adulthood. Achieving balanced birthweights is one of the priorities of the health system in most industrialized and developed countries. This indicator is used to assess the growth and health status of infants. The aim of this study was to assess the birthweight of neonates by using quantile regression in Zanjan province. Methods: This descriptive analytical study was carried out using pre-registered (March 2010 - March 2012) data on neonates in urban/rural health centers of Zanjan province using multiple-stage cluster sampling. Data were analyzed using multiple linear regression and the quantile regression method with SAS 9.2 statistical software. Results: Of 8456 newborns, 4146 (49%) were female. The mean age of the mothers was 27.1±5.4 years. The mean birthweight of the neonates was 3104 ± 431 grams. Five hundred seventy-three (6.8%) of the neonates weighed less than 2500 grams. In all quantiles, gestational age of the neonates (p<0.05) and weight and educational level of the mothers (p<0.05) showed a significant linear relationship with the birthweight of the neonates. However, sex and birth rank of the neonates, mothers' age, place of residence (urban/rural) and occupation were not significant in any quantile (p>0.05). Conclusion: This study revealed that the results of multiple linear regression and quantile regression were not identical. We strongly recommend the use of quantile regression when the response variable is asymmetric or the data contain outliers. PMID:26925889

  16. Implementation and Evaluation of the Streamflow Statistics (StreamStats) Web Application for Computing Basin Characteristics and Flood Peaks in Illinois

    USGS Publications Warehouse

    Ishii, Audrey L.; Soong, David T.; Sharpe, Jennifer B.

    2010-01-01

    Illinois StreamStats (ILSS) is a Web-based application for computing selected basin characteristics and flood-peak quantiles based on the most recently (2010) published (Soong and others, 2004) regional flood-frequency equations at any rural stream location in Illinois. Limited streamflow statistics including general statistics, flow durations, and base flows also are available for U.S. Geological Survey (USGS) streamflow-gaging stations. ILSS can be accessed on the Web at http://streamstats.usgs.gov/ by selecting the State Applications hyperlink and choosing Illinois from the pull-down menu. ILSS was implemented for Illinois by obtaining and projecting ancillary geographic information system (GIS) coverages; populating the StreamStats database with streamflow-gaging station data; hydroprocessing the 30-meter digital elevation model (DEM) for Illinois to conform to streams represented in the National Hydrographic Dataset 1:100,000 stream coverage; and customizing the Web-based Extensible Markup Language (XML) programs for computing basin characteristics for Illinois. The basin characteristics computed by ILSS then were compared to the basin characteristics used in the published study, and adjustments were applied to the XML algorithms for slope and basin length. Testing of ILSS was accomplished by comparing flood quantiles computed by ILSS at an approximately random sample of 170 streamflow-gaging stations with the published flood quantile estimates. Differences between the log-transformed flood quantiles were not statistically significant at the 95-percent confidence level for the State as a whole, nor by the regions determined by each equation, except for region 1, in the northwest corner of the State. In region 1, the average difference in flood quantile estimates ranged from 3.76 percent for the 2-year flood quantile to 4.27 percent for the 500-year flood quantile. The total number of stations in region 1 was small (21) and the mean difference is not large (less than one-tenth of the average prediction error for the regression-equation estimates). The sensitivity of the flood-quantile estimates to differences in the computed basin characteristics is determined and presented in tables. A test of usage consistency was conducted by having at least 7 new users compute flood quantile estimates at 27 locations. The average maximum deviation of the estimate from the mode value at each site was 1.31 percent after four mislocated sites were removed. A comparison of manual 100-year flood-quantile computations with ILSS at 34 sites indicated no statistically significant difference. ILSS appears to be an accurate, reliable, and effective tool for flood-quantile estimates.

  17. L-statistics for Repeated Measurements Data With Application to Trimmed Means, Quantiles and Tolerance Intervals.

    PubMed

    Assaad, Houssein I; Choudhary, Pankaj K

    2013-01-01

    The L-statistics form an important class of estimators in nonparametric statistics. Its members include trimmed means and sample quantiles and functions thereof. This article is devoted to theory and applications of L-statistics for repeated measurements data, wherein the measurements on the same subject are dependent and the measurements from different subjects are independent. This article has three main goals: (a) Show that the L-statistics are asymptotically normal for repeated measurements data. (b) Present three statistical applications of this result, namely, location estimation using trimmed means, quantile estimation and construction of tolerance intervals. (c) Obtain a Bahadur representation for sample quantiles. These results are generalizations of similar results for independently and identically distributed data. The practical usefulness of these results is illustrated by analyzing a real data set involving measurement of systolic blood pressure. The properties of the proposed point and interval estimators are examined via simulation.
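
    One practical takeaway — inference for an L-statistic under repeated measurements must respect within-subject dependence — can be sketched with a subject-level bootstrap of a 10% trimmed mean; the blood-pressure-like data are simulated, and the bootstrap is a stand-in for the paper's asymptotic theory.

        import numpy as np
        from scipy.stats import trim_mean

        rng = np.random.default_rng(8)
        n_subj, n_rep = 40, 5
        subj_mean = rng.normal(120, 10, size=n_subj)           # subject-level SBP means
        data = subj_mean[:, None] + rng.normal(0, 5, size=(n_subj, n_rep))

        # Resample whole subjects (rows), not individual measurements, so the
        # within-subject dependence is preserved in each bootstrap replicate.
        boot = np.array([
            trim_mean(data[rng.integers(0, n_subj, n_subj)].ravel(), 0.10)
            for _ in range(2000)
        ])
        print("trimmed mean:", trim_mean(data.ravel(), 0.10),
              "95% CI:", np.quantile(boot, [0.025, 0.975]))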

  18. Modeling energy expenditure in children and adolescents using quantile regression

    PubMed Central

    Yang, Yunwen; Adolph, Anne L.; Puyau, Maurice R.; Vohra, Firoz A.; Zakeri, Issa F.

    2013-01-01

    Advanced mathematical models have the potential to capture the complex metabolic and physiological processes that result in energy expenditure (EE). The study objective is to apply quantile regression (QR) to predict EE and determine quantile-dependent variation in covariate effects in nonobese and obese children. First, QR models will be developed to predict minute-by-minute awake EE at different quantile levels based on heart rate (HR) and physical activity (PA) accelerometry counts, and child characteristics of age, sex, weight, and height. Second, the QR models will be used to evaluate the covariate effects of weight, PA, and HR across the conditional EE distribution. QR and ordinary least squares (OLS) regressions are estimated in 109 children, aged 5–18 yr. QR modeling of EE outperformed OLS regression for both nonobese and obese populations. Average prediction errors for QR compared with OLS were not only smaller at the median τ = 0.5 (18.6 vs. 21.4%), but also substantially smaller at the tails of the distribution (10.2 vs. 39.2% at τ = 0.1 and 8.7 vs. 19.8% at τ = 0.9). Covariate effects of weight, PA, and HR on EE for the nonobese and obese children differed across quantiles (P < 0.05). The associations (linear and quadratic) between PA and HR with EE were stronger for the obese than nonobese population (P < 0.05). In conclusion, QR provided more accurate predictions of EE compared with conventional OLS regression, especially at the tails of the distribution, and revealed substantially different covariate effects of weight, PA, and HR on EE in nonobese and obese children. PMID:23640591

  19. Consistent model identification of varying coefficient quantile regression with BIC tuning parameter selection

    PubMed Central

    Zheng, Qi; Peng, Limin

    2016-01-01

    Quantile regression provides a flexible platform for evaluating covariate effects on different segments of the conditional distribution of response. As the effects of covariates may change with quantile level, contemporaneously examining a spectrum of quantiles is expected to have a better capacity to identify variables with either partial or full effects on the response distribution, as compared to focusing on a single quantile. Under this motivation, we study a general adaptively weighted LASSO penalization strategy in the quantile regression setting, where a continuum of quantile index is considered and coefficients are allowed to vary with quantile index. We establish the oracle properties of the resulting estimator of coefficient function. Furthermore, we formally investigate a BIC-type uniform tuning parameter selector and show that it can ensure consistent model selection. Our numerical studies confirm the theoretical findings and illustrate an application of the new variable selection procedure. PMID:28008212
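
    A hedged sketch of BIC-type tuning for L1-penalized quantile regression follows. The criterion used here, log(sum of check losses) + df·log(n)/(2n), is one commonly used variant rather than the paper's exact uniform selector, which operates over a continuum of quantile levels at once; the data model and alpha grid are assumptions.

        import numpy as np
        from sklearn.linear_model import QuantileRegressor

        def check_loss(u, tau):
            # Quantile (pinball) loss summed over residuals.
            return np.sum(u * (tau - (u < 0)))

        rng = np.random.default_rng(9)
        n, p, tau = 300, 20, 0.5
        X = rng.normal(size=(n, p))
        y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.standard_t(3, size=n)

        best_alpha, best_bic = None, np.inf
        for a in (0.01, 0.05, 0.1, 0.5):
            m = QuantileRegressor(quantile=tau, alpha=a, solver="highs").fit(X, y)
            df = np.sum(np.abs(m.coef_) > 1e-8)               # nonzero coefficients
            bic = np.log(check_loss(y - m.predict(X), tau)) + df * np.log(n) / (2 * n)
            if bic < best_bic:
                best_alpha, best_bic = a, bic
        print("BIC-selected alpha:", best_alpha)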

  20. Estimating the Extreme Behaviors of Students Performance Using Quantile Regression--Evidences from Taiwan

    ERIC Educational Resources Information Center

    Chen, Sheng-Tung; Kuo, Hsiao-I.; Chen, Chi-Chung

    2012-01-01

    The two-stage least squares approach together with quantile regression analysis is adopted here to estimate the educational production function. Such a methodology is able to capture the extreme behaviors of the two tails of students' performance and the estimation outcomes have important policy implications. Our empirical study is applied to the…

  1. A quantile regression model for failure-time data with time-dependent covariates

    PubMed Central

    Gorfine, Malka; Goldberg, Yair; Ritov, Ya’acov

    2017-01-01

    Summary Since survival data occur over time, important covariates that we wish to consider often also change over time. Such covariates are referred to as time-dependent covariates. Quantile regression offers flexible modeling of survival data by allowing the covariates to vary with quantiles. This article provides a novel quantile regression model accommodating time-dependent covariates, for analyzing survival data subject to right censoring. Our simple estimation technique assumes the existence of instrumental variables. In addition, we present a doubly-robust estimator in the sense of Robins and Rotnitzky (1992, Recovery of information and adjustment for dependent censoring using surrogate markers. In: Jewell, N. P., Dietz, K. and Farewell, V. T. (editors), AIDS Epidemiology. Boston: Birkhäuser, pp. 297–331.). The asymptotic properties of the estimators are rigorously studied. Finite-sample properties are demonstrated by a simulation study. The utility of the proposed methodology is demonstrated using the Stanford heart transplant dataset. PMID:27485534

  2. Robust and efficient estimation with weighted composite quantile regression

    NASA Astrophysics Data System (ADS)

    Jiang, Xuejun; Li, Jingzhi; Xia, Tian; Yan, Wanfeng

    2016-09-01

    In this paper we introduce a weighted composite quantile regression (CQR) estimation approach and study its application in nonlinear models such as exponential models and ARCH-type models. The weighted CQR is augmented by using a data-driven weighting scheme. With the error distribution unspecified, the proposed estimators share robustness from quantile regression and achieve nearly the same efficiency as the oracle maximum likelihood estimator (MLE) for a variety of error distributions including the normal, mixed-normal, Student's t, Cauchy distributions, etc. We also suggest an algorithm for the fast implementation of the proposed methodology. Simulations are carried out to compare the performance of different estimators, and the proposed approach is used to analyze the daily S&P 500 Composite index, which verifies the effectiveness and efficiency of our theoretical results.
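
    Composite quantile regression itself — one shared slope vector with a separate intercept per quantile level, fitted jointly — can be written as a single linear program; the sketch below uses equal weights across levels, so the paper's data-driven weighting scheme is omitted, and the data model is an assumption.

        import numpy as np
        from scipy.optimize import linprog

        rng = np.random.default_rng(10)
        n, p = 120, 1
        X = rng.normal(size=(n, p))
        y = 1.5 * X[:, 0] + rng.standard_t(3, size=n)
        taus = np.array([0.25, 0.5, 0.75])
        K = len(taus)

        # Variables z = [beta (p), intercepts (K), u (n*K), v (n*K)], u, v >= 0;
        # constraint for pair (i, k): x_i' beta + b_k + u_ik - v_ik = y_i, and the
        # objective sum_k sum_i tau_k*u_ik + (1-tau_k)*v_ik is the total check loss.
        c = np.concatenate([np.zeros(p + K), np.repeat(taus, n), np.repeat(1 - taus, n)])
        A = np.zeros((n * K, p + K + 2 * n * K))
        b_eq = np.tile(y, K)
        for k in range(K):
            rows = np.arange(k * n, (k + 1) * n)
            A[rows, :p] = X
            A[rows, p + k] = 1.0
            A[rows, p + K + rows] = 1.0           # + u_ik
            A[rows, p + K + n * K + rows] = -1.0  # - v_ik
        bounds = [(None, None)] * (p + K) + [(0, None)] * (2 * n * K)
        res = linprog(c, A_eq=A, b_eq=b_eq, bounds=bounds, method="highs")
        print("CQR slope estimate:", res.x[:p])   # shared across quantile levels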

  3. The Applicability of Confidence Intervals of Quantiles for the Generalized Logistic Distribution

    NASA Astrophysics Data System (ADS)

    Shin, H.; Heo, J.; Kim, T.; Jung, Y.

    2007-12-01

    The generalized logistic (GL) distribution has been widely used for frequency analysis. However, little work has addressed the confidence intervals that indicate the prediction accuracy of quantile estimates for the GL distribution. In this paper, the estimation of confidence intervals of quantiles for the GL distribution is presented based on the method of moments (MOM), maximum likelihood (ML), and probability weighted moments (PWM), and the asymptotic variances of each quantile estimator are derived as functions of the sample sizes, return periods, and parameters. Monte Carlo simulation experiments are also performed to verify the applicability of the derived confidence intervals of quantiles. The results show that the relative bias (RBIAS) and relative root mean square error (RRMSE) of the confidence intervals generally increase as the return period increases and decrease as the sample size increases. PWM performs better than the other methods in terms of RRMSE when the data are almost symmetric, while ML shows the smallest RBIAS and RRMSE when the data are more skewed and the sample size is moderately large. The GL model was applied to fit the distribution of annual maximum rainfall data. The results show that there are few differences in the estimated quantiles between ML and PWM, while MOM shows distinct differences.
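
    The Monte Carlo evaluation can be sketched as below, computing RBIAS and RRMSE of an ML-fitted quantile; scipy's genlogistic is used as a convenient stand-in whose parameterization differs from the hydrological GL, so the numbers are purely illustrative.

        import numpy as np
        from scipy.stats import genlogistic

        c_true, T = 1.5, 100                      # shape parameter and return period
        tau = 1 - 1 / T
        q_true = genlogistic.ppf(tau, c_true)

        rng = np.random.default_rng(11)
        estimates = []
        for _ in range(200):                      # Monte Carlo replications
            sample = genlogistic.rvs(c_true, size=50, random_state=rng)
            c_hat, loc_hat, scale_hat = genlogistic.fit(sample)   # ML fit
            estimates.append(genlogistic.ppf(tau, c_hat, loc_hat, scale_hat))

        err = (np.array(estimates) - q_true) / q_true
        print("RBIAS:", err.mean(), "RRMSE:", np.sqrt((err ** 2).mean()))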

  4. An application of quantile random forests for predictive mapping of forest attributes

    Treesearch

    E.A. Freeman; G.G. Moisen

    2015-01-01

    Increasingly, random forest models are used in predictive mapping of forest attributes. Traditional random forests output the mean prediction from the random trees. Quantile regression forests (QRF) is an extension of random forests developed by Nicolai Meinshausen that provides non-parametric estimates of the median predicted value as well as prediction quantiles. It...

  5. Comparing least-squares and quantile regression approaches to analyzing median hospital charges.

    PubMed

    Olsen, Cody S; Clark, Amy E; Thomas, Andrea M; Cook, Lawrence J

    2012-07-01

    Emergency department (ED) and hospital charges obtained from administrative data sets are useful descriptors of injury severity and the burden to EDs and the health care system. However, charges are typically positively skewed due to costly procedures, long hospital stays, and complicated or prolonged treatment for few patients. The median is not affected by extreme observations and is useful in describing and comparing distributions of hospital charges. A least-squares analysis employing a log transformation is one approach for estimating median hospital charges, corresponding confidence intervals (CIs), and differences between groups; however, this method requires certain distributional properties. An alternate method is quantile regression, which allows estimation and inference related to the median without making distributional assumptions. The objective was to compare the log-transformation least-squares method to the quantile regression approach for estimating median hospital charges, differences in median charges between groups, and associated CIs. The authors performed simulations using repeated sampling of observed statewide ED and hospital charges and charges randomly generated from a hypothetical lognormal distribution. The median and 95% CI and the multiplicative difference between the median charges of two groups were estimated using both least-squares and quantile regression methods. Performance of the two methods was evaluated. In contrast to least squares, quantile regression produced estimates that were unbiased and had smaller mean square errors in simulations of observed ED and hospital charges. Both methods performed well in simulations of hypothetical charges that met least-squares method assumptions. When the data did not follow the assumed distribution, least-squares estimates were often biased, and the associated CIs had lower than expected coverage as sample size increased. Quantile regression analyses of hospital charges provide unbiased estimates even when lognormal and equal variance assumptions are violated. These methods may be particularly useful in describing and analyzing hospital charges from administrative data sets. © 2012 by the Society for Academic Emergency Medicine.
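
    The core comparison can be sketched in a few lines: estimate a median charge by exponentiating the mean of log charges and by median (quantile) regression, under an assumed lognormal model where both approaches are valid.

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(12)
        charges = np.exp(rng.normal(8.0, 1.2, size=500))       # skewed hospital charges
        X = np.ones((500, 1))                                  # intercept-only model

        log_ols = np.exp(sm.OLS(np.log(charges), X).fit().params[0])
        med_qr = sm.QuantReg(charges, X).fit(q=0.5).params[0]
        print("true median:", np.exp(8.0), "log-OLS:", log_ols, "QR:", med_qr)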

  6. Application of empirical mode decomposition with local linear quantile regression in financial time series forecasting.

    PubMed

    Jaber, Abobaker M; Ismail, Mohd Tahir; Altaher, Alsaidi M

    2014-01-01

    This paper mainly forecasts the daily closing price of stock markets. We propose a two-stage technique that combines the empirical mode decomposition (EMD) with nonparametric methods of local linear quantile (LLQ). We use the proposed technique, EMD-LLQ, to forecast two stock index time series. Detailed experiments are implemented for the proposed method, in which EMD-LPQ, EMD, and Holt-Winter methods are compared. The proposed EMD-LPQ model is determined to be superior to the EMD and Holt-Winter methods in predicting the stock closing prices.

  7. A simulation study of nonparametric total deviation index as a measure of agreement based on quantile regression.

    PubMed

    Lin, Lawrence; Pan, Yi; Hedayat, A S; Barnhart, Huiman X; Haber, Michael

    2016-01-01

    Total deviation index (TDI) captures a prespecified quantile of the absolute deviation of paired observations from raters, observers, methods, assays, instruments, etc. We compare the performance of TDI using nonparametric quantile regression to the TDI assuming normality (Lin, 2000). This simulation study considers three distributions: normal, Poisson, and uniform at quantile levels of 0.8 and 0.9 for cases with and without contamination. Study endpoints include the bias of TDI estimates (compared with their respective theoretical values), the standard error of TDI estimates (compared with their true simulated standard errors), test size (compared with 0.05), and power. Nonparametric TDI using quantile regression, although it slightly underestimates and delivers slightly less power for data without contamination, works satisfactorily under all simulated cases even for moderate (say, ≥40) sample sizes. The performance of the TDI based on a quantile of 0.8 is in general superior to that of 0.9. The performances of nonparametric and parametric TDI methods are compared with a real data example. Nonparametric TDI can be very useful when the underlying distribution of the differences is not normal, especially when it has a heavy tail.
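
    In its simplest nonparametric form, the TDI is a sample quantile of absolute paired differences, as the short sketch below illustrates on simulated rater data.

        import numpy as np

        rng = np.random.default_rng(13)
        rater_a = rng.normal(100, 10, size=200)
        rater_b = rater_a + rng.normal(0.5, 3, size=200)       # paired measurements

        # TDI(tau) = tau-quantile of |difference|; no normality assumption needed.
        for tau in (0.8, 0.9):
            print(f"TDI({tau}) =", np.quantile(np.abs(rater_a - rater_b), tau))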

  8. Estimation of peak discharge quantiles for selected annual exceedance probabilities in Northeastern Illinois.

    DOT National Transportation Integrated Search

    2016-06-01

    This report provides two sets of equations for estimating peak discharge quantiles at annual exceedance probabilities (AEPs) of 0.50, 0.20, 0.10, : 0.04, 0.02, 0.01, 0.005, and 0.002 (recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years,...

  9. The quantile regression approach to efficiency measurement: insights from Monte Carlo simulations.

    PubMed

    Liu, Chunping; Laporte, Audrey; Ferguson, Brian S

    2008-09-01

    In the health economics literature there is an ongoing debate over approaches used to estimate the efficiency of health systems at various levels, from the level of the individual hospital - or nursing home - up to that of the health system as a whole. The two most widely used approaches to evaluating the efficiency with which various units deliver care are non-parametric data envelopment analysis (DEA) and parametric stochastic frontier analysis (SFA). Productivity researchers tend to have very strong preferences over which methodology to use for efficiency estimation. In this paper, we use Monte Carlo simulation to compare the performance of DEA and SFA in terms of their ability to accurately estimate efficiency. We also evaluate quantile regression as a potential alternative approach. A Cobb-Douglas production function, random error terms and a technical inefficiency term with different distributions are used to calculate the observed output. The results, based on these experiments, suggest that neither DEA nor SFA can be regarded as clearly dominant, and that, depending on the quantile estimated, the quantile regression approach may be a useful addition to the armamentarium of methods for estimating technical efficiency.
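
    A hedged sketch of the quantile-regression approach to efficiency follows: fit a high quantile of a log Cobb-Douglas production function and read each unit's inefficiency as its shortfall from that quantile frontier; the data-generating process and the 0.95 frontier level are illustrative assumptions.

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(16)
        n = 300
        log_inputs = rng.normal(size=(n, 2))                   # log labor, log capital
        ineff = rng.exponential(0.3, size=n)                   # one-sided inefficiency
        log_output = 1.0 + log_inputs @ np.array([0.6, 0.3]) - ineff + rng.normal(0, 0.1, n)

        X = sm.add_constant(log_inputs)
        frontier = sm.QuantReg(log_output, X).fit(q=0.95)      # near-frontier quantile
        shortfall = frontier.predict(X) - log_output           # estimated inefficiency
        print("mean estimated inefficiency:", shortfall.mean())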

  10. Confidence intervals for expected moments algorithm flood quantile estimates

    USGS Publications Warehouse

    Cohn, Timothy A.; Lane, William L.; Stedinger, Jery R.

    2001-01-01

    Historical and paleoflood information can substantially improve flood frequency estimates if appropriate statistical procedures are properly applied. However, the Federal guidelines for flood frequency analysis, set forth in Bulletin 17B, rely on an inefficient “weighting” procedure that fails to take advantage of historical and paleoflood information. This has led researchers to propose several more efficient alternatives including the Expected Moments Algorithm (EMA), which is attractive because it retains Bulletin 17B's statistical structure (method of moments with the Log Pearson Type 3 distribution) and thus can be easily integrated into flood analyses employing the rest of the Bulletin 17B approach. The practical utility of EMA, however, has been limited because no closed‐form method has been available for quantifying the uncertainty of EMA‐based flood quantile estimates. This paper addresses that concern by providing analytical expressions for the asymptotic variance of EMA flood‐quantile estimators and confidence intervals for flood quantile estimates. Monte Carlo simulations demonstrate the properties of such confidence intervals for sites where a 25‐ to 100‐year streamgage record is augmented by 50 to 150 years of historical information. The experiments show that the confidence intervals, though not exact, should be acceptable for most purposes.

  11. Quantile regression via vector generalized additive models.

    PubMed

    Yee, Thomas W

    2004-07-30

    One of the most popular methods for quantile regression is the LMS method of Cole and Green. The method naturally falls within a penalized likelihood framework, and consequently allows for considerable flexibility because all three parameters may be modelled by cubic smoothing splines. The model is also very understandable: for a given value of the covariate, the LMS method applies a Box-Cox transformation to the response in order to transform it to standard normality; to obtain the quantiles, an inverse Box-Cox transformation is applied to the quantiles of the standard normal distribution. The purposes of this article are three-fold. Firstly, LMS quantile regression is presented within the framework of the class of vector generalized additive models. This confers a number of advantages such as a unifying theory and estimation process. Secondly, a new LMS method based on the Yeo-Johnson transformation is proposed, which has the advantage that the response is not restricted to be positive. Lastly, this paper describes a software implementation of three LMS quantile regression methods in the S language. This includes the LMS-Yeo-Johnson method, which is estimated efficiently by a new numerical integration scheme. The LMS-Yeo-Johnson method is illustrated by way of a large cross-sectional data set from a New Zealand working population. Copyright 2004 John Wiley & Sons, Ltd.
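
    The LMS construction described above maps directly to quantile curves: for given L (Box-Cox power), M (median) and S (coefficient of variation), the τ-quantile is M(1 + L·S·z_τ)^(1/L), with limit M·exp(S·z_τ) as L → 0. The parameter values in the sketch below are purely illustrative.

        import numpy as np
        from scipy.stats import norm

        def lms_quantile(tau, L, M, S):
            # Inverse Box-Cox applied to the standard normal quantile z_tau.
            z = norm.ppf(tau)
            if abs(L) < 1e-8:
                return M * np.exp(S * z)          # L -> 0 (log-transform) limit
            return M * (1.0 + L * S * z) ** (1.0 / L)

        for tau in (0.05, 0.5, 0.95):
            print(tau, lms_quantile(tau, L=-0.5, M=70.0, S=0.12))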

  12. Differential effects of dietary diversity and maternal characteristics on linear growth of children aged 6-59 months in sub-Saharan Africa: a multi-country analysis.

    PubMed

    Amugsi, Dickson A; Dimbuene, Zacharie T; Kimani-Murage, Elizabeth W; Mberu, Blessing; Ezeh, Alex C

    2017-04-01

To investigate the differential effects of dietary diversity (DD) and maternal characteristics on child linear growth at different points of the conditional distribution of height-for-age Z-score (HAZ) in sub-Saharan Africa. Secondary analysis of data from nationally representative cross-sectional samples of singleton children aged 0-59 months, born to mothers aged 15-49 years, drawn from the most recent Demographic and Health Surveys of Ghana, Nigeria, Kenya, Mozambique and the Democratic Republic of Congo (DRC). The outcome variable was child HAZ, and quantile regression was used to perform the multivariate analysis. The present analysis was restricted to children aged 6-59 months (n 31 604). DD was associated positively with HAZ in the first four quantiles (5th, 10th, 25th and 50th) and the highest quantile (90th) in Nigeria. The largest effect occurred at the very bottom (5th quantile) and the very top (90th quantile) of the conditional HAZ distribution. In DRC, DD was significantly and positively associated with HAZ in the two lower quantiles (5th, 10th). The largest effects of maternal education occurred at the lower end of the conditional HAZ distribution in Ghana, Nigeria and DRC. Maternal BMI and height also had positive effects on HAZ at different points of the conditional distribution of HAZ. Our analysis shows that the association between DD and maternal factors and HAZ differs along the conditional HAZ distribution. Intervention measures need to take into account the heterogeneous effect of the determinants of child nutritional status along the different percentiles of the HAZ distribution.

  13. Comparison of different hydrological similarity measures to estimate flow quantiles

    NASA Astrophysics Data System (ADS)

    Rianna, M.; Ridolfi, E.; Napolitano, F.

    2017-07-01

This paper aims to evaluate the influence of hydrological similarity measures on the definition of homogeneous regions. To this end, several attribute sets have been analyzed in the context of the Region of Influence (ROI) procedure. Several combinations of geomorphological, climatological, and geographical characteristics are also used to cluster potentially homogeneous regions. To verify the goodness of the resulting pooled sites, homogeneity tests are carried out. Through a Monte Carlo simulation and a jack-knife procedure, flow quantiles are estimated for the regions effectively resulting as homogeneous. The analysis is performed in both the so-called gauged and ungauged scenarios to analyze the effect of hydrological similarity measures on flow quantile estimation.

  14. Multi-element stochastic spectral projection for high quantile estimation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ko, Jordan, E-mail: jordan.ko@mac.com; Garnier, Josselin

    2013-06-15

We investigate quantile estimation by a multi-element generalized Polynomial Chaos (gPC) metamodel, where the exact numerical model is approximated by complementary metamodels in overlapping domains that mimic the model’s exact response. The gPC metamodel is constructed by the non-intrusive stochastic spectral projection approach, and function evaluation on the gPC metamodel can be considered essentially free. Thus, a large number of Monte Carlo samples from the metamodel can be used to estimate the α-quantile, for moderate values of α. As the gPC metamodel is an expansion about the means of the inputs, its accuracy may worsen away from these mean values, where the extreme events may occur. By increasing the approximation accuracy of the metamodel we may eventually improve the accuracy of quantile estimation, but this is very expensive. A multi-element approach is therefore proposed, combining a global metamodel in the standard normal space with supplementary local metamodels constructed in bounded domains about the design points corresponding to the extreme events. To improve the accuracy and to minimize the sampling cost, sparse-tensor and anisotropic-tensor quadratures are tested in addition to the full-tensor Gauss quadrature in the construction of local metamodels; different bounds of the gPC expansion are also examined. The global and local metamodels are combined in the multi-element gPC (MEgPC) approach, and it is shown that MEgPC can be more accurate than Monte Carlo or importance sampling methods for high quantile estimation for input dimensions roughly below N=8, a limit that is very much case- and α-dependent.
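
    The projection step is easiest to see in one input dimension. The sketch below is a drastically simplified, single-element analogue of the scheme described above, with an invented stand-in model: the model is projected onto probabilists' Hermite polynomials by Gauss-Hermite quadrature, and a high quantile is then estimated by cheap Monte Carlo sampling of the surrogate.

      # 1-D non-intrusive spectral projection, then quantile estimation by
      # sampling the (essentially free) gPC surrogate.
      import numpy as np
      from numpy.polynomial import hermite_e as H
      from math import factorial, pi, sqrt

      model = lambda z: np.exp(0.5 * z) + 0.1 * z**3   # stand-in "exact" model
      P = 8                                            # gPC order
      nodes, weights = H.hermegauss(30)                # weight exp(-z^2/2)

      coef = np.zeros(P + 1)
      for k in range(P + 1):
          ek = np.zeros(k + 1)
          ek[k] = 1.0                                  # k-th Hermite basis
          coef[k] = np.sum(weights * model(nodes) * H.hermeval(nodes, ek)) \
                    / (sqrt(2.0 * pi) * factorial(k))  # <f, He_k> / ||He_k||^2

      z = np.random.default_rng(0).standard_normal(1_000_000)
      surrogate = H.hermeval(z, coef)                  # cheap surrogate samples
      print(np.quantile(surrogate, 0.999))             # high-quantile estimate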

  15. Incremental impact of body mass status with modifiable unhealthy lifestyle behaviors on pharmaceutical expenditure.

    PubMed

    Kim, Tae Hyun; Lee, Eui-Kyung; Han, Euna

    Overweight/obesity is a growing health risk in Korea. The impact of overweight/obesity on pharmaceutical expenditure can be larger if individuals have multiple risk factors and multiple comorbidities. The current study estimated the combined effects of overweight/obesity and other unhealthy behaviors on pharmaceutical expenditure. An instrumental variable quantile regression model was estimated using Korea Health Panel Study data. The current study extracted data from 3 waves (2009, 2010, and 2011). The final sample included 7148 person-year observations for adults aged 20 years or older. Overweight/obese individuals had higher pharmaceutical expenditure than their non-obese counterparts only at the upper quantiles of the conditional distribution of pharmaceutical expenditure (by 119% at the 90th quantile and 115% at the 95th). The current study found a stronger association at the upper quantiles among men (152%, 144%, and 150% at the 75th, 90th, and 95th quantiles, respectively) than among women (152%, 150%, and 148% at the 75th, 90th, and 95th quantiles, respectively). The association at the upper quantiles was stronger when combined with moderate to heavy drinking and no regular physical check-up, particularly among males. The current study confirms that the association of overweight/obesity with modifiable unhealthy behaviors on pharmaceutical expenditure is larger than with overweight/obesity alone. Assessing the effect of overweight/obesity with lifestyle risk factors can help target groups for public health intervention programs. Copyright © 2015 Elsevier Inc. All rights reserved.

  16. Education and inequalities in risk scores for coronary heart disease and body mass index: evidence for a population strategy.

    PubMed

    Liu, Sze Yan; Kawachi, Ichiro; Glymour, M Maria

    2012-09-01

    Concerns have been raised that education may have greater benefits for persons at high risk of coronary heart disease (CHD) than for those at low risk. We estimated the association of education (less than high school, high school, or college graduates) with 10-year CHD risk and body mass index (BMI), using linear and quantile regression models, in the following two nationally representative datasets: the 2006 wave of the Health and Retirement Survey and the 2003-2008 National Health and Nutrition Examination Survey (NHANES). Higher educational attainment was associated with lower 10-year CHD risk for all groups. However, the magnitude of this association varied considerably across quantiles for some subgroups. For example, among women in NHANES, a high school degree was associated with 4% (95% confidence interval = -9% to 1%) and 17% (-24% to -8%) lower CHD risk in the 10th and 90th percentiles, respectively. For BMI, a college degree was associated with uniform decreases across the distribution for women, but with varying increases for men. Compared with those who had not completed high school, male college graduates in the NHANES sample had a BMI that was 6% greater (2% to 11%) at the 10th percentile of the BMI distribution and 7% lower (-10% to -3%) at the 90th percentile (ie, overweight/obese). Estimates from the Health and Retirement Survey sample and the marginal quantile regression models showed similar patterns. Conventional regression methods may mask important variations in the associations between education and CHD risk.

  17. Variable screening via quantile partial correlation

    PubMed Central

    Ma, Shujie; Tsai, Chih-Ling

    2016-01-01

In quantile linear regression with ultra-high dimensional data, we propose an algorithm for screening all candidate variables and subsequently selecting relevant predictors. Specifically, we first employ quantile partial correlation for screening, and then we apply the extended Bayesian information criterion (EBIC) for best subset selection. Our proposed method can successfully select predictors when the variables are highly correlated, and it can also identify variables that make a contribution to the conditional quantiles but are marginally uncorrelated or weakly correlated with the response. Theoretical results show that the proposed algorithm can yield the sure screening set. By controlling the false selection rate, model selection consistency can be achieved theoretically. In practice, we propose using EBIC for best subset selection so that the resulting model is screening consistent. Simulation studies demonstrate that the proposed algorithm performs well, and an empirical example is presented. PMID:28943683

  18. Modeling distributional changes in winter precipitation of Canada using Bayesian spatiotemporal quantile regression subjected to different teleconnections

    NASA Astrophysics Data System (ADS)

    Tan, Xuezhi; Gan, Thian Yew; Chen, Shu; Liu, Bingjun

    2018-05-01

Climate change and large-scale climate patterns may result in changes in probability distributions of climate variables that are associated with changes in the mean and variability, and severity of extreme climate events. In this paper, we applied a flexible framework based on the Bayesian spatiotemporal quantile (BSTQR) model to identify climate changes at different quantile levels and their teleconnections to large-scale climate patterns such as the El Niño-Southern Oscillation (ENSO), Pacific Decadal Oscillation (PDO), North Atlantic Oscillation (NAO) and Pacific-North American pattern (PNA). Using the BSTQR model with time (year) as a covariate, we estimated changes in Canadian winter precipitation and their uncertainties at different quantile levels. There were some stations in eastern Canada showing distributional changes in winter precipitation such as an increase in low quantiles but a decrease in high quantiles. Because quantile functions in the BSTQR model vary with space and time and assimilate spatiotemporal precipitation data, the BSTQR model produced spatially much smoother and less uncertain quantile changes than classic regression that does not consider spatiotemporal correlations. Using the BSTQR model with five teleconnection indices (i.e., SOI, PDO, PNA, NP and NAO) as covariates, we investigated effects of large-scale climate patterns on Canadian winter precipitation at different quantile levels. Winter precipitation responses to these five teleconnections were found to differ across quantile levels. Effects of the five teleconnections on Canadian winter precipitation were stronger at low and high quantile levels than at medium ones.

  19. Estimating equivalence with quantile regression

    USGS Publications Warehouse

    Cade, B.S.

    2011-01-01

Equivalence testing and corresponding confidence interval estimates are used to provide more enlightened statistical statements about parameter estimates by relating them to intervals of effect sizes deemed to be of scientific or practical importance rather than just to an effect size of zero. Equivalence tests and confidence interval estimates are based on a null hypothesis that a parameter estimate is either outside (inequivalence hypothesis) or inside (equivalence hypothesis) an equivalence region, depending on the question of interest and assignment of risk. The former approach, often referred to as bioequivalence testing, is often used in regulatory settings because it reverses the burden of proof compared to a standard test of significance, following a precautionary principle for environmental protection. Unfortunately, many applications of equivalence testing focus on establishing average equivalence by estimating differences in means of distributions that do not have homogeneous variances. I discuss how to compare equivalence across quantiles of distributions using confidence intervals on quantile regression estimates that detect differences in heterogeneous distributions missed by focusing on means. I used one-tailed confidence intervals based on inequivalence hypotheses in a two-group treatment-control design for estimating bioequivalence of arsenic concentrations in soils at an old ammunition testing site and bioequivalence of vegetation biomass at a reclaimed mining site. Two-tailed confidence intervals based both on inequivalence and equivalence hypotheses were used to examine quantile equivalence for negligible trends over time for a continuous exponential model of amphibian abundance. © 2011 by the Ecological Society of America.
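
    The two-group comparison described above can be sketched in a few lines: fit a quantile regression of the response on a treatment indicator, then judge equivalence by whether the confidence interval for the group difference lies inside a pre-specified equivalence region. Data and the bound delta below are synthetic placeholders, not the paper's.

      # Equivalence at the 90th percentile via a quantile-regression CI.
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(1)
      group = np.repeat([0, 1], 100)                     # 0 = reference site
      y = rng.lognormal(mean=1.0 + 0.05 * group, sigma=0.5)

      X = sm.add_constant(group)
      fit = sm.QuantReg(y, X).fit(q=0.90)
      lo, hi = fit.conf_int(alpha=0.10)[1]               # CI for the difference
      delta = 1.0                                        # equivalence bound
      print("equivalent at 90th percentile:", (lo > -delta) and (hi < delta))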

  20. Association between the Infant and Child Feeding Index (ICFI) and nutritional status of 6- to 35-month-old children in rural western China.

    PubMed

    Qu, Pengfei; Mi, Baibing; Wang, Duolao; Zhang, Ruo; Yang, Jiaomei; Liu, Danmeng; Dang, Shaonong; Yan, Hong

    2017-01-01

The objective of this study was to determine the relationship between the quality of feeding practices and children's nutritional status in rural western China. A sample of 12,146 pairs of 6- to 35-month-old children and their mothers were recruited using stratified multistage cluster random sampling in rural western China. Quantile regression was used to analyze the relationship between the Infant and Child Feeding Index (ICFI) and children's nutritional status. In rural western China, 24.37% of all infants and young children suffer from malnutrition. Of this total, 19.57%, 8.74% and 4.63% of infants and children are classified as stunting, underweight and wasting, respectively. After adjusting for covariates, the quantile regression results suggested that a qualified ICFI (ICFI > 13.8) was associated with all length and HAZ quantiles (P < 0.05), with the greatest effects at the poor end of the length and HAZ distributions: the β-estimates for length ranged from 0.76 cm (95% CI: 0.53 to 0.99 cm) down to 0.34 cm (95% CI: 0.09 to 0.59 cm), and the β-estimates for HAZ from 0.17 (95% CI: 0.10 to 0.24) down to 0.11 (95% CI: 0.04 to 0.19). A qualified ICFI was also associated with most weight quantiles (P < 0.05 except the 80th and 90th quantiles) and with poor and intermediate WAZ quantiles (P < 0.05 for the 10th, 20th, 30th and 40th quantiles). Additionally, a qualified ICFI had a greater effect on poor weight and WAZ quantiles, in which the β-estimates for weight ranged from 0.20 kg (95% CI: 0.14 to 0.26 kg) down to 0.06 kg (95% CI: 0.00 to 0.12 kg) and the β-estimates for WAZ from 0.14 (95% CI: 0.08 to 0.21) down to 0.05 (95% CI: 0.01 to 0.10). Feeding practices were associated with the physical development of infants and young children, and proper feeding practices had a greater effect on poor physical development in infants and young children. For mothers in rural western China, proper guidelines and messaging on complementary feeding practices are necessary.

  1. A hierarchical Bayesian GEV model for improving local and regional flood quantile estimates

    NASA Astrophysics Data System (ADS)

    Lima, Carlos H. R.; Lall, Upmanu; Troy, Tara; Devineni, Naresh

    2016-10-01

We estimate local and regional Generalized Extreme Value (GEV) distribution parameters for flood frequency analysis in a multilevel, hierarchical Bayesian framework, to explicitly model and reduce uncertainties. As prior information for the model, we assume that the GEV location and scale parameters for each site come from independent log-normal distributions, whose mean parameter scales with the drainage area. From empirical and theoretical arguments, the shape parameter for each site is shrunk towards a common mean. Non-informative prior distributions are assumed for the hyperparameters and the MCMC method is used to sample from the joint posterior distribution. The model is tested using annual maximum series from 20 streamflow gauges located in an 83,000 km2 flood-prone basin in Southeast Brazil. The results show a significant reduction in the uncertainty of flood quantile estimates relative to the traditional GEV model, particularly for sites with shorter records. For return periods within the range of the data (around 50 years), the Bayesian credible intervals for the flood quantiles tend to be narrower than the classical confidence limits based on the delta method. As the return period increases beyond the range of the data, the confidence limits from the delta method become unreliable and the Bayesian credible intervals provide a way to estimate satisfactory confidence bands for the flood quantiles considering parameter uncertainties and regional information. In order to evaluate the applicability of the proposed hierarchical Bayesian model for regional flood frequency analysis, we estimate flood quantiles for three randomly chosen out-of-sample sites and compare with classical estimates using the index flood method. The posterior distributions of the scaling law coefficients are used to define the predictive distributions of the GEV location and scale parameters for the out-of-sample sites given only their drainage areas and the posterior distribution of the average shape parameter is taken as the regional predictive distribution for this parameter. While the index flood method does not provide a straightforward way to consider the uncertainties in the index flood and in the regional parameters, the results obtained here show that the proposed Bayesian method is able to produce adequate credible intervals for flood quantiles that are in accordance with empirical estimates.
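
    The index flood comparison mentioned at the end can be summarized by the classical two-step recipe: scale each site's annual maxima by its index flood (here the site median), pool the scaled records into a regional growth curve, and multiply a quantile of that curve by the target site's index flood. The sketch below uses synthetic Gumbel records and an invented target-site index.

      # Index flood method in miniature: regional growth curve times index.
      import numpy as np

      rng = np.random.default_rng(5)
      pool = [rng.gumbel(loc=100.0 * s, scale=30.0 * s, size=40)
              for s in (1, 2, 4)]                      # three pooled sites

      growth = np.concatenate([am / np.median(am) for am in pool])
      g100 = np.quantile(growth, 1.0 - 1.0 / 100.0)    # 100-year growth factor

      target_index = 250.0                             # index flood, target site
      print("100-year flood estimate:", target_index * g100)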

  2. Economic policy uncertainty, equity premium and dependence between their quantiles: Evidence from quantile-on-quantile approach

    NASA Astrophysics Data System (ADS)

    Raza, Syed Ali; Zaighum, Isma; Shah, Nida

    2018-02-01

This paper examines the relationship between economic policy uncertainty (EPU) and the equity premium in G7 countries, using monthly data from January 1989 to December 2015 and a novel technique, namely quantile-on-quantile (QQ) regression, proposed by Sim and Zhou (2015). Based on the QQ approach, we estimate how the quantiles of economic policy uncertainty affect the quantiles of the equity premium. This provides a comprehensive insight into the overall dependence structure between the equity premium and economic policy uncertainty, as compared to traditional techniques like OLS or quantile regression. Overall, our empirical evidence suggests the existence of a negative association between the equity premium and EPU predominantly in all G7 countries, especially in the extreme low and extreme high tails. However, differences exist among countries and across different quantiles of EPU and the equity premium within each country. This heterogeneity among countries is due to differences in dependency on economic policy, on other stock markets, and in the linkages with other countries' equity markets.
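
    A stripped-down version of the quantile-on-quantile idea (not Sim and Zhou's exact estimator) can be written as a kernel-weighted quantile regression: the tau-th quantile of the equity premium is regressed locally on EPU, with weights centred on the theta-th quantile of EPU. Everything below is synthetic and illustrative.

      # Local (theta-centred) slope of the tau-th conditional quantile.
      import numpy as np
      from scipy.optimize import minimize
      from scipy.stats import norm, rankdata

      def qq_slope(x, y, tau, theta, h=0.05):
          u = rankdata(x) / len(x)                     # empirical CDF of x
          w = norm.pdf((u - theta) / h)                # kernel weights at theta
          def loss(b):
              r = y - b[0] - b[1] * x
              return np.sum(w * r * (tau - (r < 0)))   # weighted pinball loss
          return minimize(loss, x0=[0.0, 0.0], method="Nelder-Mead").x[1]

      rng = np.random.default_rng(2)
      epu = rng.normal(size=400)                       # stand-in EPU series
      premium = -0.3 * epu + rng.normal(size=400)      # stand-in equity premium
      print(qq_slope(epu, premium, tau=0.1, theta=0.9))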

  3. Regional trends in short-duration precipitation extremes: a flexible multivariate monotone quantile regression approach

    NASA Astrophysics Data System (ADS)

    Cannon, Alex

    2017-04-01

Estimating historical trends in short-duration rainfall extremes at regional and local scales is challenging due to low signal-to-noise ratios and the limited availability of homogenized observational data. In addition to being of scientific interest, trends in rainfall extremes are of practical importance, as their presence calls into question the stationarity assumptions that underpin traditional engineering and infrastructure design practice. Even with these fundamental challenges, increasingly complex questions are being asked about time series of extremes. For instance, users may not only want to know whether or not rainfall extremes have changed over time; they may also want information on the modulation of trends by large-scale climate modes or on the nonstationarity of trends (e.g., identifying hiatus periods or periods of accelerating positive trends). Efforts have thus been devoted to the development and application of more robust and powerful statistical estimators for regional and local scale trends. While a standard nonparametric method like the regional Mann-Kendall test, which tests for the presence of monotonic trends (i.e., strictly non-decreasing or non-increasing changes), makes fewer assumptions than parametric methods and pools information from stations within a region, it is not designed to visualize detected trends, include information from covariates, or answer questions about the rate of change in trends. As a remedy, monotone quantile regression (MQR) has been developed as a nonparametric alternative that can be used to estimate a common monotonic trend in extremes at multiple stations. Quantile regression makes efficient use of data by directly estimating conditional quantiles based on information from all rainfall data in a region, i.e., without having to precompute the sample quantiles. The MQR method is also flexible and can be used to visualize and analyze the nonlinearity of the detected trend. However, it is fundamentally a univariate technique, and cannot incorporate information from additional covariates, for example ENSO state or physiographic controls on extreme rainfall within a region. Here, the univariate MQR model is extended to allow the use of multiple covariates. Multivariate monotone quantile regression (MMQR) is based on a single hidden-layer feedforward network with the quantile regression error function and partial monotonicity constraints. The MMQR model is demonstrated via Monte Carlo simulations and the estimation and visualization of regional trends in moderate rainfall extremes based on homogenized sub-daily precipitation data at stations in Canada.

  4. Using Gamma and Quantile Regressions to Explore the Association between Job Strain and Adiposity in the ELSA-Brasil Study: Does Gender Matter?

    PubMed

    Fonseca, Maria de Jesus Mendes da; Juvanhol, Leidjaira Lopes; Rotenberg, Lúcia; Nobre, Aline Araújo; Griep, Rosane Härter; Alves, Márcia Guimarães de Mello; Cardoso, Letícia de Oliveira; Giatti, Luana; Nunes, Maria Angélica; Aquino, Estela M L; Chor, Dóra

    2017-11-17

    This paper explores the association between job strain and adiposity, using two statistical analysis approaches and considering the role of gender. The research evaluated 11,960 active baseline participants (2008-2010) in the ELSA-Brasil study. Job strain was evaluated through a demand-control questionnaire, while body mass index (BMI) and waist circumference (WC) were evaluated in continuous form. The associations were estimated using gamma regression models with an identity link function. Quantile regression models were also estimated from the final set of co-variables established by gamma regression. The relationship that was found varied by analytical approach and gender. Among the women, no association was observed between job strain and adiposity in the fitted gamma models. In the quantile models, a pattern of increasing effects of high strain was observed at higher BMI and WC distribution quantiles. Among the men, high strain was associated with adiposity in the gamma regression models. However, when quantile regression was used, that association was found not to be homogeneous across outcome distributions. In addition, in the quantile models an association was observed between active jobs and BMI. Our results point to an association between job strain and adiposity, which follows a heterogeneous pattern. Modelling strategies can produce different results and should, accordingly, be used to complement one another.

  5. Peaks Over Threshold (POT): A methodology for automatic threshold estimation using goodness of fit p-value

    NASA Astrophysics Data System (ADS)

    Solari, Sebastián.; Egüen, Marta; Polo, María. José; Losada, Miguel A.

    2017-04-01

    Threshold estimation in the Peaks Over Threshold (POT) method and the impact of the estimation method on the calculation of high return period quantiles and their uncertainty (or confidence intervals) are issues that are still unresolved. In the past, methods based on goodness of fit tests and EDF-statistics have yielded satisfactory results, but their use has not yet been systematized. This paper proposes a methodology for automatic threshold estimation, based on the Anderson-Darling EDF-statistic and goodness of fit test. When combined with bootstrapping techniques, this methodology can be used to quantify both the uncertainty of threshold estimation and its impact on the uncertainty of high return period quantiles. This methodology was applied to several simulated series and to four precipitation/river flow data series. The results obtained confirmed its robustness. For the measured series, the estimated thresholds corresponded to those obtained by nonautomatic methods. Moreover, even though the uncertainty of the threshold estimation was high, this did not have a significant effect on the width of the confidence intervals of high return period quantiles.
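
    In the same spirit (though not the authors' implementation), threshold selection can be automated by scanning candidate thresholds, fitting a generalized Pareto distribution to the exceedances, and keeping the lowest threshold whose goodness-of-fit p-value clears a preset level; a Kolmogorov-Smirnov test stands in below for the Anderson-Darling test used in the paper, and the data are synthetic.

      # Pick the lowest threshold whose GPD fit passes a goodness-of-fit test.
      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(3)
      series = rng.gamma(2.0, 10.0, 5000)              # stand-in daily series

      for u in np.quantile(series, np.arange(0.80, 0.995, 0.01)):
          exc = series[series > u] - u
          c, loc, scale = stats.genpareto.fit(exc, floc=0.0)
          p = stats.kstest(exc, "genpareto", args=(c, 0.0, scale)).pvalue
          if p > 0.05:
              print(f"selected threshold {u:.1f} (p = {p:.2f})")
              break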

  6. Modeling the human development index and the percentage of poor people using quantile smoothing splines

    NASA Astrophysics Data System (ADS)

    Mulyani, Sri; Andriyana, Yudhie; Sudartianto

    2017-03-01

Mean regression is a statistical method that explains the relationship between a response variable and predictor variables through the central tendency (mean) of the response. Parameter estimation in mean regression (by Ordinary Least Squares, OLS) is problematic when applied to data that are skewed, fat-tailed, or contaminated by outliers. Hence, an alternative method such as quantile regression is needed for that kind of data. Quantile regression is robust to outliers, and it can describe the relationship between the response and the predictors not only at the central tendency of the data (the median) but also at various quantiles, giving more complete information about the relationship. In this study, quantile regression is developed with a nonparametric approach, namely smoothing splines. A nonparametric approach is useful when a model is difficult to prespecify, i.e., when the relation between the two variables follows an unknown function. We apply the proposed method to poverty data, estimating the Percentage of Poor People (response variable) from the Human Development Index (HDI, predictor variable).
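
    As a rough stand-in for the approach described above (an unpenalized B-spline basis rather than a true penalized smoothing spline), conditional quantile curves of a poverty measure against HDI can be traced by combining a spline basis with linear quantile regression; the data and variable names below are invented.

      # Quantile curves via a B-spline basis plus linear quantile regression.
      import numpy as np
      import statsmodels.api as sm
      from patsy import dmatrix

      rng = np.random.default_rng(4)
      hdi = rng.uniform(0.4, 0.8, 300)                     # predictor
      poor_pct = 60.0 - 70.0 * hdi + rng.normal(0, 4, 300) # synthetic response

      basis = dmatrix("bs(x, df=6)", {"x": hdi}, return_type="dataframe")
      for q in (0.25, 0.50, 0.75):
          fit = sm.QuantReg(poor_pct, basis).fit(q=q)
          print(q, np.round(fit.predict(basis)[:3], 1))    # fitted curve values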

  7. Logistic quantile regression provides improved estimates for bounded avian counts: a case study of California Spotted Owl fledgling production

    Treesearch

    Brian S. Cade; Barry R. Noon; Rick D. Scherer; John J. Keane

    2017-01-01

    Counts of avian fledglings, nestlings, or clutch size that are bounded below by zero and above by some small integer form a discrete random variable distribution that is not approximated well by conventional parametric count distributions such as the Poisson or negative binomial. We developed a logistic quantile regression model to provide estimates of the empirical...

  8. Bayesian estimation of extreme flood quantiles using a rainfall-runoff model and a stochastic daily rainfall generator

    NASA Astrophysics Data System (ADS)

    Costa, Veber; Fernandes, Wilson

    2017-11-01

Extreme flood estimation has been a key research topic in hydrological sciences. Reliable estimates of such events are necessary as structures for flood conveyance are continuously evolving in size and complexity and, as a result, their failure-associated hazards become more and more pronounced. Due to this fact, several estimation techniques intended to improve flood frequency analysis and reduce uncertainty in extreme quantile estimation have been addressed in the literature in recent decades. In this paper, we develop a Bayesian framework for the indirect estimation of extreme flood quantiles from rainfall-runoff models. In the proposed approach, an ensemble of long daily rainfall series is simulated with a stochastic generator, which models extreme rainfall amounts with an upper-bounded distribution function, namely, the 4-parameter lognormal model. The rationale behind the generation model is that physical limits for rainfall amounts, and consequently for floods, exist and, by imposing an appropriate upper bound for the probabilistic model, more plausible estimates can be obtained for those rainfall quantiles with very low exceedance probabilities. Daily rainfall time series are converted into streamflows by routing each realization of the synthetic ensemble through a conceptual hydrologic model, the Rio Grande rainfall-runoff model. Calibration of parameters is performed through a nonlinear regression model, by means of the specification of a statistical model for the residuals that is able to accommodate autocorrelation, heteroscedasticity and nonnormality. By combining the outlined steps in a Bayesian structure of analysis, one is able to properly summarize the resulting uncertainty and estimate more accurate credible intervals for a set of flood quantiles of interest. The method for extreme flood indirect estimation was applied to the American river catchment, at the Folsom dam, in the state of California, USA. Results show that most floods, including exceptionally large non-systematic events, were reasonably estimated with the proposed approach. In addition, by accounting for uncertainties in each modeling step, one is able to obtain a better understanding of the influential factors in large flood formation dynamics.

  9. Use of Flood Seasonality in Pooling-Group Formation and Quantile Estimation: An Application in Great Britain

    NASA Astrophysics Data System (ADS)

    Formetta, Giuseppe; Bell, Victoria; Stewart, Elizabeth

    2018-02-01

    Regional flood frequency analysis is one of the most commonly applied methods for estimating extreme flood events at ungauged sites or locations with short measurement records. It is based on: (i) the definition of a homogeneous group (pooling-group) of catchments, and on (ii) the use of the pooling-group data to estimate flood quantiles. Although many methods to define a pooling-group (pooling schemes, PS) are based on catchment physiographic similarity measures, in the last decade methods based on flood seasonality similarity have been contemplated. In this paper, two seasonality-based PS are proposed and tested both in terms of the homogeneity of the pooling-groups they generate and in terms of the accuracy in estimating extreme flood events. The method has been applied in 420 catchments in Great Britain (considered as both gauged and ungauged) and compared against the current Flood Estimation Handbook (FEH) PS. Results for gauged sites show that, compared to the current PS, the seasonality-based PS performs better both in terms of homogeneity of the pooling-group and in terms of the accuracy of flood quantile estimates. For ungauged locations, a national-scale hydrological model has been used for the first time to quantify flood seasonality. Results show that in 75% of the tested locations the seasonality-based PS provides an improvement in the accuracy of the flood quantile estimates. The remaining 25% were located in highly urbanized, groundwater-dependent catchments. The promising results support the aspiration that large-scale hydrological models complement traditional methods for estimating design floods.

  10. Quantile Regression for Recurrent Gap Time Data

    PubMed Central

    Luo, Xianghua; Huang, Chiung-Yu; Wang, Lan

    2014-01-01

    Summary Evaluating covariate effects on gap times between successive recurrent events is of interest in many medical and public health studies. While most existing methods for recurrent gap time analysis focus on modeling the hazard function of gap times, a direct interpretation of the covariate effects on the gap times is not available through these methods. In this article, we consider quantile regression that can provide direct assessment of covariate effects on the quantiles of the gap time distribution. Following the spirit of the weighted risk-set method by Luo and Huang (2011, Statistics in Medicine 30, 301–311), we extend the martingale-based estimating equation method considered by Peng and Huang (2008, Journal of the American Statistical Association 103, 637–649) for univariate survival data to analyze recurrent gap time data. The proposed estimation procedure can be easily implemented in existing software for univariate censored quantile regression. Uniform consistency and weak convergence of the proposed estimators are established. Monte Carlo studies demonstrate the effectiveness of the proposed method. An application to data from the Danish Psychiatric Central Register is presented to illustrate the methods developed in this article. PMID:23489055

  11. Automatic Feature Selection and Weighting for the Formation of Homogeneous Groups for Regional Intensity-Duration-Frequency (IDF) Curve Estimation

    NASA Astrophysics Data System (ADS)

    Yang, Z.; Burn, D. H.

    2017-12-01

Extreme rainfall events can have devastating impacts on society. To quantify the associated risk, the IDF curve has been used to provide the essential rainfall-related information for urban planning. However, the recent changes in the rainfall climatology caused by climate change and urbanization have made the estimates provided by the traditional regional IDF approach increasingly inaccurate. This inaccuracy is mainly caused by two problems: 1) The ineffective choice of similarity indicators for the formation of a homogeneous group at different regions; and 2) An inadequate number of stations in the pooling group that does not adequately reflect the optimal balance between group size and group homogeneity or achieve the lowest uncertainty in the rainfall quantile estimates. For the first issue, to consider the temporal difference among different meteorological and topographic indicators, a three-layer design is proposed based on three stages in the extreme rainfall formation: cloud formation, rainfall generation and change of rainfall intensity above the urban surface. During the process, the impacts from climate change and urbanization are considered through the inclusion of potential relevant features at each layer. Then, to consider the spatial difference of similarity indicators for homogeneous group formation at various regions, an automatic feature selection and weighting algorithm, specifically the hybrid searching algorithm of Tabu search, Lagrange Multiplier and Fuzzy C-means Clustering, is used to select the optimal combination of features for potential optimal homogeneous group formation at a specific region. For the second issue, to compare the uncertainty of rainfall quantile estimates among potential groups, a two-sample Kolmogorov-Smirnov test-based sample ranking process is used. During the process, linear programming is used to rank these groups based on the confidence intervals of the quantile estimates. The proposed methodology fills the gap of including the urbanization impacts during the pooling group formation, and challenges the traditional assumption that the same set of similarity indicators can be equally effective in generating the optimal homogeneous group for regions with different geographic and meteorological characteristics.

  12. Censored quantile regression with recursive partitioning-based weights

    PubMed Central

    Wey, Andrew; Wang, Lan; Rudser, Kyle

    2014-01-01

    Censored quantile regression provides a useful alternative to the Cox proportional hazards model for analyzing survival data. It directly models the conditional quantile of the survival time and hence is easy to interpret. Moreover, it relaxes the proportionality constraint on the hazard function associated with the popular Cox model and is natural for modeling heterogeneity of the data. Recently, Wang and Wang (2009. Locally weighted censored quantile regression. Journal of the American Statistical Association 103, 1117–1128) proposed a locally weighted censored quantile regression approach that allows for covariate-dependent censoring and is less restrictive than other censored quantile regression methods. However, their kernel smoothing-based weighting scheme requires all covariates to be continuous and encounters practical difficulty with even a moderate number of covariates. We propose a new weighting approach that uses recursive partitioning, e.g. survival trees, that offers greater flexibility in handling covariate-dependent censoring in moderately high dimensions and can incorporate both continuous and discrete covariates. We prove that this new weighting scheme leads to consistent estimation of the quantile regression coefficients and demonstrate its effectiveness via Monte Carlo simulations. We also illustrate the new method using a widely recognized data set from a clinical trial on primary biliary cirrhosis. PMID:23975800

  13. Simulation of extreme rainfall and projection of future changes using the GLIMCLIM model

    NASA Astrophysics Data System (ADS)

    Rashid, Md. Mamunur; Beecham, Simon; Chowdhury, Rezaul Kabir

    2017-10-01

In this study, the performance of the Generalized LInear Modelling of daily CLImate sequence (GLIMCLIM) statistical downscaling model was assessed for simulating extreme rainfall indices and annual maximum daily rainfall (AMDR), with daily rainfall downscaled from National Centers for Environmental Prediction (NCEP) reanalysis and Coupled Model Intercomparison Project Phase 5 (CMIP5) general circulation model (GCM) outputs (four GCMs and two scenarios); changes were then estimated for the future period 2041-2060. The model was able to reproduce the monthly variations in the extreme rainfall indices reasonably well when forced by the NCEP reanalysis datasets. Frequency Adapted Quantile Mapping (FAQM) was used to remove bias in the simulated daily rainfall when forced by CMIP5 GCMs, which reduced the discrepancy between observed and simulated extreme rainfall indices. Although the observed AMDR were within the 2.5th and 97.5th percentiles of the simulated AMDR, the model consistently under-predicted the inter-annual variability of AMDR. A non-stationary model was developed using the generalized linear model for location, shape and scale to estimate the AMDR with an annual exceedance probability of 0.01. The study shows that in general, AMDR is likely to decrease in the future. The Onkaparinga catchment will also experience drier conditions due to an increase in consecutive dry days coinciding with decreases in heavy (>long term 90th percentile) rainfall days, the empirical 90th quantile of rainfall and maximum 5-day consecutive total rainfall for the future period (2041-2060) compared to the base period (1961-2000).
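
    The bias-correction step lends itself to a compact illustration. The sketch below performs plain empirical quantile mapping (not the frequency-adapted variant used in the study): each simulated value is mapped through the modelled CDF onto the observed CDF; the two gamma samples are synthetic stand-ins for observed and modelled rainfall.

      # Empirical quantile mapping of a biased modelled series onto observations.
      import numpy as np

      rng = np.random.default_rng(6)
      obs = rng.gamma(2.0, 5.0, 3000)          # "observed" daily rainfall
      mod = rng.gamma(2.0, 7.0, 3000)          # biased "model" rainfall

      probs = np.linspace(0.01, 0.99, 99)
      mod_q = np.quantile(mod, probs)
      obs_q = np.quantile(obs, probs)

      corrected = np.interp(mod, mod_q, obs_q) # bias-corrected series
      print(np.quantile(obs, 0.9), np.quantile(corrected, 0.9))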

  14. Censored Quantile Instrumental Variable Estimates of the Price Elasticity of Expenditure on Medical Care.

    PubMed

    Kowalski, Amanda

    2016-01-02

    Efforts to control medical care costs depend critically on how individuals respond to prices. I estimate the price elasticity of expenditure on medical care using a censored quantile instrumental variable (CQIV) estimator. CQIV allows estimates to vary across the conditional expenditure distribution, relaxes traditional censored model assumptions, and addresses endogeneity with an instrumental variable. My instrumental variable strategy uses a family member's injury to induce variation in an individual's own price. Across the conditional deciles of the expenditure distribution, I find elasticities that vary from -0.76 to -1.49, which are an order of magnitude larger than previous estimates.

  15. Effects of environmental variables on invasive amphibian activity: Using model selection on quantiles for counts

    USGS Publications Warehouse

Muller, Benjamin J.; Cade, Brian S.; Schwarzkopf, Lin

    2018-01-01

    Many different factors influence animal activity. Often, the value of an environmental variable may influence significantly the upper or lower tails of the activity distribution. For describing relationships with heterogeneous boundaries, quantile regressions predict a quantile of the conditional distribution of the dependent variable. A quantile count model extends linear quantile regression methods to discrete response variables, and is useful if activity is quantified by trapping, where there may be many tied (equal) values in the activity distribution, over a small range of discrete values. Additionally, different environmental variables in combination may have synergistic or antagonistic effects on activity, so examining their effects together, in a modeling framework, is a useful approach. Thus, model selection on quantile counts can be used to determine the relative importance of different variables in determining activity, across the entire distribution of capture results. We conducted model selection on quantile count models to describe the factors affecting activity (numbers of captures) of cane toads (Rhinella marina) in response to several environmental variables (humidity, temperature, rainfall, wind speed, and moon luminosity) over eleven months of trapping. Environmental effects on activity are understudied in this pest animal. In the dry season, model selection on quantile count models suggested that rainfall positively affected activity, especially near the lower tails of the activity distribution. In the wet season, wind speed limited activity near the maximum of the distribution, while minimum activity increased with minimum temperature. This statistical methodology allowed us to explore, in depth, how environmental factors influenced activity across the entire distribution, and is applicable to any survey or trapping regime, in which environmental variables affect activity.
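
    One ingredient of quantile count models is easy to demonstrate: the jittering device of Machado and Santos Silva (2005), in which uniform noise smooths the discrete counts so that linear quantile regression applies, with the count-scale quantile recovered by a ceiling transform. The capture counts and rainfall covariate below are simulated, not the study's data.

      # Jittered quantile regression for counts, then back to the count scale.
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(7)
      rain = rng.uniform(0, 20, 500)                     # nightly rainfall (mm)
      counts = rng.poisson(np.exp(0.5 + 0.08 * rain))    # simulated captures

      tau = 0.9
      z = counts + rng.uniform(size=500)                 # jittered counts
      t = np.log(np.maximum(z - tau, 1e-5))              # log transform
      fit = sm.QuantReg(t, sm.add_constant(rain)).fit(q=tau)

      q_z = tau + np.exp(fit.params[0] + fit.params[1] * rain)
      q_y = np.ceil(q_z - 1.0)                           # count-scale quantile
      print(q_y[:5])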

  16. Height premium for job performance.

    PubMed

    Kim, Tae Hyun; Han, Euna

    2017-08-01

    This study assessed the relationship of height with wages, using the 1998 and 2012 Korean Labor and Income Panel Study data. The key independent variable was height measured in centimeters, which was included as a series of dummy indicators of height per 5cm span (<155cm, 155-160cm, 160-165cm, and ≥165cm for women; <165cm, 165-170cm, 170-175cm, 175-180cm, and ≥180cm for men). We controlled for household- and individual-level random effects. We used a random-effect quantile regression model for monthly wages to assess the heterogeneity in the height-wage relationship, across the conditional distribution of monthly wages. We found a non-linear relationship of height with monthly wages. For men, the magnitude of the height wage premium was overall larger at the upper quantile of the conditional distribution of log monthly wages than at the median to low quantile, particularly in professional and semi-professional occupations. The height-wage premium was also larger at the 90th quantile for self-employed women and salaried men. Our findings add a global dimension to the existing evidence on height-wage premium, demonstrating non-linearity in the association between height and wages and heterogeneous changes in the dispersion and direction of the association between height and wages, by wage level. Copyright © 2017 Elsevier B.V. All rights reserved.

  17. Instrumental Variable Analysis with a Nonlinear Exposure–Outcome Relationship

    PubMed Central

    Davies, Neil M.; Thompson, Simon G.

    2014-01-01

    Background: Instrumental variable methods can estimate the causal effect of an exposure on an outcome using observational data. Many instrumental variable methods assume that the exposure–outcome relation is linear, but in practice this assumption is often in doubt, or perhaps the shape of the relation is a target for investigation. We investigate this issue in the context of Mendelian randomization, the use of genetic variants as instrumental variables. Methods: Using simulations, we demonstrate the performance of a simple linear instrumental variable method when the true shape of the exposure–outcome relation is not linear. We also present a novel method for estimating the effect of the exposure on the outcome within strata of the exposure distribution. This enables the estimation of localized average causal effects within quantile groups of the exposure or as a continuous function of the exposure using a sliding window approach. Results: Our simulations suggest that linear instrumental variable estimates approximate a population-averaged causal effect. This is the average difference in the outcome if the exposure for every individual in the population is increased by a fixed amount. Estimates of localized average causal effects reveal the shape of the exposure–outcome relation for a variety of models. These methods are used to investigate the relations between body mass index and a range of cardiovascular risk factors. Conclusions: Nonlinear exposure–outcome relations should not be a barrier to instrumental variable analyses. When the exposure–outcome relation is not linear, either a population-averaged causal effect or the shape of the exposure–outcome relation can be estimated. PMID:25166881

  18. Quantile regression for the statistical analysis of immunological data with many non-detects.

    PubMed

    Eilers, Paul H C; Röder, Esther; Savelkoul, Huub F J; van Wijk, Roy Gerth

    2012-07-07

Immunological parameters are hard to measure. A well-known problem is the occurrence of values below the detection limit, the non-detects. Non-detects are a nuisance, because classical statistical analyses, like ANOVA and regression, cannot be applied. The more advanced statistical techniques currently available for the analysis of datasets with non-detects can only be used if a small percentage of the data are non-detects. Quantile regression, a generalization of percentiles to regression models, models the median or higher percentiles and tolerates very high numbers of non-detects. We present a non-technical introduction and illustrate it with an application to real data from a clinical trial. We show that by using quantile regression, groups can be compared and that meaningful linear trends can be computed, even if more than half of the data consists of non-detects. Quantile regression is a valuable addition to the statistical methods that can be used for the analysis of immunological datasets with non-detects.
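
    The robustness to non-detects has a simple mechanical explanation: a quantile regression fit depends on observations below the fitted line only through the sign of their residuals, so values under the detection limit can be replaced by any constant below it without moving the estimate, provided the fitted quantile stays above the limit. A small synthetic check:

      # Substituting different values below the LOD leaves the fit unchanged.
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(8)
      dose = np.repeat([0, 1, 2, 3], 50)                 # four treatment groups
      titre = rng.lognormal(0.5 + 0.4 * dose, 1.0)       # synthetic titres
      LOD = 2.0                                          # detection limit

      X = sm.add_constant(dose)
      for sub in (LOD / 2.0, LOD / 10.0):                # substitution values
          y = np.where(titre < LOD, sub, titre)
          print(sub, sm.QuantReg(y, X).fit(q=0.75).params[1])  # same slope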

  19. Hospital charges associated with motorcycle crash factors: a quantile regression analysis.

    PubMed

    Olsen, Cody S; Thomas, Andrea M; Cook, Lawrence J

    2014-08-01

    Previous studies of motorcycle crash (MC) related hospital charges use trauma registries and hospital records, and do not adjust for the number of motorcyclists not requiring medical attention. This may lead to conservative estimates of helmet use effectiveness. MC records were probabilistically linked with emergency department and hospital records to obtain total hospital charges. Missing data were imputed. Multivariable quantile regression estimated reductions in hospital charges associated with helmet use and other crash factors. Motorcycle helmets were associated with reduced median hospital charges of $256 (42% reduction) and reduced 98th percentile of $32,390 (33% reduction). After adjusting for other factors, helmets were associated with reductions in charges in all upper percentiles studied. Quantile regression models described homogenous and heterogeneous associations between other crash factors and charges. Quantile regression comprehensively describes associations between crash factors and hospital charges. Helmet use among motorcyclists is associated with decreased hospital charges. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.


  1. Finite-sample and asymptotic sign-based tests for parameters of non-linear quantile regression with Markov noise

    NASA Astrophysics Data System (ADS)

    Sirenko, M. A.; Tarasenko, P. F.; Pushkarev, M. I.

    2017-01-01

One of the most noticeable features of sign-based statistical procedures is the opportunity to build an exact test for simple hypotheses about the parameters of a regression model. In this article, we extend the sign-based approach to the nonlinear case with dependent noise. The examined model is a multi-quantile regression, which makes it possible to test hypotheses not only about the regression parameters, but about the noise parameters as well.
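
    The flavour of an exact sign-based test is easy to convey in a toy setting that ignores the Markov noise and multi-quantile structure of the paper: under the null value of the parameter, and errors with median zero and independent signs, the residual signs are symmetric Bernoulli draws, so a binomial test of the sign count is exact in finite samples.

      # Exact sign test of beta = beta0 in a nonlinear regression (toy version).
      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(10)
      x = rng.uniform(0, 1, 60)
      y = np.exp(1.2 * x) + rng.laplace(0, 0.3, 60)  # median-zero noise

      beta0 = 1.2                                    # hypothesized parameter
      signs = (y - np.exp(beta0 * x)) > 0
      print(stats.binomtest(int(signs.sum()), n=60, p=0.5).pvalue)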

  2. Asymptotics of nonparametric L-1 regression models with dependent data

    PubMed Central

    ZHAO, ZHIBIAO; WEI, YING; LIN, DENNIS K.J.

    2013-01-01

We investigate asymptotic properties of least-absolute-deviation or median quantile estimates of the location and scale functions in nonparametric regression models with dependent data from multiple subjects. Under a general dependence structure that allows for longitudinal data and some spatially correlated data, we establish uniform Bahadur representations for the proposed median quantile estimates. The obtained Bahadur representations provide deep insights into the asymptotic behavior of the estimates. Our main theoretical development is based on studying the modulus of continuity of the kernel-weighted empirical process through a coupling argument. Progesterone data are used for illustration. PMID:24955016

  3. High dimensional linear regression models under long memory dependence and measurement error

    NASA Astrophysics Data System (ADS)

    Kaul, Abhishek

This dissertation consists of three chapters. The first chapter introduces the models under consideration and motivates the problems of interest; a brief literature review is also provided. The second chapter investigates the properties of Lasso under long range dependent model errors. Lasso is a computationally efficient approach to model selection and estimation, and its properties are well studied when the regression errors are independent and identically distributed. We study the case where the regression errors form a long memory moving average process. We establish a finite sample oracle inequality for the Lasso solution, and then show its asymptotic sign consistency in this setup. These results are established in the high dimensional setup (p > n), where p can increase exponentially with n. Finally, we show the n^(1/2-d)-consistency of Lasso, along with the oracle property of adaptive Lasso, in the case where p is fixed; here d is the memory parameter of the stationary error sequence. The performance of Lasso in the present setup is also analysed with a simulation study. The third chapter proposes and investigates the properties of a penalized quantile based estimator for measurement error models. Standard formulations of prediction problems in high dimension regression models assume the availability of fully observed covariates and sub-Gaussian and homogeneous model errors. This makes these methods inapplicable to measurement error models, where covariates are unobservable and observations are possibly non-sub-Gaussian and heterogeneous. We propose weighted penalized corrected quantile estimators for the regression parameter vector in linear regression models with additive measurement errors, where the unobservable covariates are nonrandom. The proposed estimators forgo the need for the above-mentioned model assumptions. We study these estimators in both the fixed dimensional and high dimensional sparse setups; in the latter, the dimensionality can grow exponentially with the sample size. In the fixed dimensional setting we provide the oracle properties associated with the proposed estimators. In the high dimensional setting, we provide bounds for the statistical error associated with the estimation that hold with asymptotic probability 1, thereby providing the ℓ1-consistency of the proposed estimator. We also establish model selection consistency in terms of the correctly estimated zero components of the parameter vector. A simulation study that investigates the finite sample accuracy of the proposed estimator is also included in this chapter.
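
    A brief simulation in the spirit of the second chapter (our construction, with arbitrary settings) generates long-memory moving-average errors from truncated fractional-differencing weights and checks which coefficients Lasso recovers.

      # Lasso support recovery under long-memory moving-average errors.
      import numpy as np
      from scipy.special import gammaln
      from sklearn.linear_model import Lasso

      def long_memory_noise(n, d, m=2000, rng=None):
          """MA(m) truncation of (1-B)^(-d) noise; d is the memory parameter."""
          j = np.arange(m)
          psi = np.exp(gammaln(j + d) - gammaln(j + 1) - gammaln(d))
          xi = (rng or np.random.default_rng()).standard_normal(n + m)
          return np.convolve(xi, psi, mode="valid")[:n]

      rng = np.random.default_rng(9)
      n, p = 200, 50
      X = rng.standard_normal((n, p))
      beta = np.zeros(p)
      beta[:3] = [2.0, -1.5, 1.0]                        # sparse truth
      y = X @ beta + 0.5 * long_memory_noise(n, d=0.3, rng=rng)

      fit = Lasso(alpha=0.1).fit(X, y)
      print("recovered support:", np.flatnonzero(fit.coef_))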

  4. Statistical Methodology for the Analysis of Repeated Duration Data in Behavioral Studies.

    PubMed

    Letué, Frédérique; Martinez, Marie-José; Samson, Adeline; Vilain, Anne; Vilain, Coriandre

    2018-03-15

Repeated duration data are frequently used in behavioral studies. Classical linear or log-linear mixed models are often inadequate to analyze such data, because they usually consist of nonnegative and skew-distributed variables. Therefore, we recommend use of a statistical methodology specific to duration data. We propose a methodology based on Cox mixed models, implemented in the R language. This semiparametric model is indeed flexible enough to fit duration data. To compare log-linear and Cox mixed models in terms of goodness-of-fit on real data sets, we also provide a procedure based on simulations and quantile-quantile plots. We present two examples from a data set of speech and gesture interactions, which illustrate the limitations of linear and log-linear mixed models, as compared to Cox models. The linear models are not validated on our data, whereas Cox models are. Moreover, in the second example, the Cox model exhibits a significant effect that the linear model does not. We provide methods to select the best-fitting models for repeated duration data and to compare statistical methodologies. In this study, we show that Cox models are best suited to the analysis of our data set.

  5. No causal impact of serum vascular endothelial growth factor level on temporal changes in body mass index in Japanese male workers: a five-year longitudinal study.

    PubMed

    Imatoh, Takuya; Kamimura, Seiichiro; Miyazaki, Motonobu

    2017-03-01

It has been reported that adipocytes secrete vascular endothelial growth factor. Therefore, we conducted a 5-year longitudinal epidemiological study to further elucidate the association between vascular endothelial growth factor levels and temporal changes in body mass index. Our study subjects were Japanese male workers, who had regular health check-ups. Vascular endothelial growth factor levels were measured at baseline. To examine the association between vascular endothelial growth factor levels and overweight, we calculated the odds ratio using a multivariate logistic regression model. Moreover, linear mixed effect models were used to assess the association between vascular endothelial growth factor level and temporal changes in body mass index during the 5-year follow-up period. Vascular endothelial growth factor levels were marginally higher in subjects with a body mass index greater than 25 kg/m2 than in those with a body mass index less than 25 kg/m2 (505.4 vs. 465.5 pg/mL, P = 0.1) and were weakly correlated with leptin levels (β: 0.05, P = 0.07). In multivariate logistic regression, subjects in the highest vascular endothelial growth factor quantile had a significantly increased risk of overweight compared with those in the lowest quantile (odds ratio 1.65, 95% confidence interval: 1.10-2.50), and the P for trend was significant (P = 0.003). However, the linear mixed effect model revealed that vascular endothelial growth factor levels were not associated with changes in body mass index over the 5-year period (quantile 2, β: 0.06, P = 0.46; quantile 3, β: -0.06, P = 0.45; quantile 4, β: -0.10, P = 0.22; quantile 1 as reference). Our results suggest that high vascular endothelial growth factor levels were significantly associated with overweight in Japanese males, but that high vascular endothelial growth factor levels did not necessarily cause obesity.

  6. A Study on Regional Rainfall Frequency Analysis for Flood Simulation Scenarios

    NASA Astrophysics Data System (ADS)

    Jung, Younghun; Ahn, Hyunjun; Joo, Kyungwon; Heo, Jun-Haeng

    2014-05-01

    Recently, climate change has been observed in Korea as well as worldwide. Rainstorms have gradually increased in frequency and magnitude, and the resulting damage has grown, so managing flood control facilities is increasingly important. For managing flood control facilities in at-risk regions, data sets such as elevation, gradient, channel, land use, and soil data should be compiled. Using this information, disaster situations can be simulated to secure evacuation routes under various rainfall scenarios. The aim of this study is to investigate and determine extreme rainfall quantile estimates in Uijeongbu City using the index flood method with L-moments parameter estimation. Regional frequency analysis trades space for time by using annual maximum rainfall data from nearby or similar sites to derive estimates for any given site in a homogeneous region; regional frequency analysis based on pooled data is recommended for estimating rainfall quantiles at sites with record lengths less than 5T, where T is the return period of interest. Many variables relevant to precipitation can be used for grouping sites into regions in regional frequency analysis. For regionalization of the Han River basin, the k-means method is applied to group regions by meteorological and geomorphological variables. The results from the k-means method are compared for each region using various probability distributions, and in the final step of the regionalization analysis, a goodness-of-fit measure is used to evaluate the accuracy of the set of candidate distributions. Rainfall quantiles are then obtained by the index flood method from the most appropriate distribution and used as input data for disaster simulations under various scenarios. Keywords: Regional Frequency Analysis; Scenarios of Rainfall Quantile. Acknowledgements: This research was supported by a grant 'Establishing Active Disaster Management System of Flood Control Structures by using 3D BIM Technique' [NEMA-12-NH-57] from the Natural Hazard Mitigation Research Group, National Emergency Management Agency of Korea.
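
    The two core ingredients named above can be sketched numerically: sample L-moments computed from probability-weighted moments, and the index-flood idea of scaling a regional growth factor by the at-site mean. The data and growth factor below are invented for illustration only.

      import numpy as np

      def l_moments(x):
          """First two sample L-moments and L-CV from unbiased PWMs."""
          x = np.sort(np.asarray(x, dtype=float))
          n = x.size
          j = np.arange(1, n + 1)
          b0 = x.mean()
          b1 = np.sum((j - 1) / (n - 1) * x) / n
          l1, l2 = b0, 2.0 * b1 - b0
          return l1, l2, l2 / l1       # mean, L-scale, L-CV

      # Index-flood: site quantile = site mean (index) x regional growth factor.
      site_amax = np.array([55., 72., 64., 91., 60., 78., 85., 69.])  # mm/day
      l1, l2, lcv = l_moments(site_amax)
      regional_growth_100yr = 2.3      # hypothetical factor for T = 100 years
      print("100-year rainfall quantile estimate:", l1 * regional_growth_100yr)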

  7. Association Between Awareness of Hypertension and Health-Related Quality of Life in a Cross-Sectional Population-Based Study in Rural Area of Northwest China.

    PubMed

    Mi, Baibing; Dang, Shaonong; Li, Qiang; Zhao, Yaling; Yang, Ruihai; Wang, Duolao; Yan, Hong

    2015-07-01

    Hypertensive patients have more complex health care needs and are more likely to have poorer health-related quality of life than normotensive people. Awareness of hypertension could be related to reduced health-related quality of life. We propose the use of quantile regression to explore in more detail the relationship between awareness of hypertension and health-related quality of life. In a cross-sectional, population-based study, 2737 participants (1035 hypertensive patients and 1702 normotensive participants) completed the Short-Form Health Survey. A quantile regression model was employed to investigate the association of physical component summary scores and mental component summary scores with awareness of hypertension and to evaluate the associated factors. Patients who were aware of hypertension (N = 554) had lower scores than patients who were unaware of it (N = 481); the median (IQR) physical component summary score was 48.20 (13.88) versus 53.27 (10.79), P < 0.01, and the median mental component summary score was 50.68 (15.09) versus 51.70 (10.65), P = 0.03. After adjusting for covariates, the quantile regression results suggest that awareness of hypertension was associated with most physical component summary score quantiles (P < 0.05 except at the 10th and 20th quantiles), with β-estimates ranging from -2.14 (95% CI: -3.80 to -0.48) to -1.45 (95% CI: -2.42 to -0.47), and showed a similar significant trend at some of the poorer mental component summary score quantiles, with β-estimates ranging from -3.47 (95% CI: -6.65 to -0.39) to -2.18 (95% CI: -4.30 to -0.06). Awareness of hypertension had a greater effect on those with intermediate physical component summary status: the β-estimate was -2.04 (95% CI: -3.51 to -0.57, P < 0.05) at the 40th quantile, attenuating to -1.45 (95% CI: -2.42 to -0.47, P < 0.01) at the 90th quantile. Awareness of hypertension was negatively related to health-related quality of life in hypertensive patients in rural western China, with a greater effect on mental component summary scores at the poorer end of the distribution and on physical component summary scores in its middle range.
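
    A minimal sketch of this kind of analysis with statsmodels, on simulated data with hypothetical variable names: quantile regression of the physical component summary score on awareness and age, reporting the awareness coefficient and its confidence interval across quantiles.

      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(1)
      n = 500
      df = pd.DataFrame({"aware": rng.integers(0, 2, n),
                         "age": rng.normal(55, 10, n)})
      df["pcs"] = 52 - 2.0 * df["aware"] - 0.1 * (df["age"] - 55) \
                  + rng.normal(0, 8, n)

      model = smf.quantreg("pcs ~ aware + age", df)
      for q in (0.1, 0.25, 0.5, 0.75, 0.9):
          fit = model.fit(q=q)
          lo, hi = fit.conf_int().loc["aware"]
          print(f"q={q:.2f}  beta(aware)={fit.params['aware']:+.2f}  "
                f"95% CI=({lo:.2f}, {hi:.2f})")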

  8. Effects of export concentration on CO2 emissions in developed countries: an empirical analysis.

    PubMed

    Apergis, Nicholas; Can, Muhlis; Gozgor, Giray; Lau, Chi Keung Marco

    2018-03-08

    This paper provides evidence on the short- and long-run effects of export product concentration on the level of CO2 emissions in 19 developed (high-income) economies, spanning the period 1962-2010. To this end, the paper makes use of nonlinear panel unit root and cointegration tests with multiple endogenous structural breaks. It also considers mean group estimations, the autoregressive distributed lag model, and panel quantile regression estimations. The findings illustrate that the environmental Kuznets curve (EKC) hypothesis is valid in the panel dataset of 19 developed economies. In addition, the paper documents that a higher level of export product concentration leads to lower CO2 emissions. The results from the panel quantile regressions also indicate that the effect of export product concentration on per capita CO2 emissions is relatively high at the higher quantiles.

  9. Analysis of the labor productivity of enterprises via quantile regression

    NASA Astrophysics Data System (ADS)

    Türkan, Semra

    2017-07-01

    In this study, we analyze the factors that affect the performance of Turkey's Top 500 Industrial Enterprises using quantile regression. The labor productivity of enterprises is taken as the dependent variable, and assets as the independent variable. The distribution of labor productivity across enterprises is right-skewed. When the dependent variable's distribution is skewed, linear regression cannot capture important aspects of the relationship between the dependent variable and its predictors, because it models only the conditional mean. Hence quantile regression, which allows modeling any quantile of the dependent distribution, including the median, is useful: it examines whether the relationships between dependent and independent variables differ for low, medium, and high percentiles. The analysis shows that the effect of total assets is relatively constant over the entire distribution, except in the upper tail, where it has a moderately stronger effect.

  10. Regional estimation of extreme suspended sediment concentrations using watershed characteristics

    NASA Astrophysics Data System (ADS)

    Tramblay, Yves; Ouarda, Taha B. M. J.; St-Hilaire, André; Poulin, Jimmy

    2010-01-01

    The number of stations monitoring daily suspended sediment concentration (SSC) has been decreasing since the 1980s in North America, while suspended sediment is considered a key variable for water quality. The objective of this study is to test the feasibility of regionalising extreme SSC, i.e. estimating extreme SSC values for ungauged basins. Annual maximum SSC for 72 rivers in Canada and the USA were modelled with probability distributions in order to estimate quantiles corresponding to different return periods. Regionalisation techniques, originally developed for flood prediction in ungauged basins, were tested using the climatic, topographic, land cover and soils attributes of the watersheds. Two approaches were compared, using either physiographic characteristics or the seasonality of extreme SSC to delineate the regions. Multiple regression models to estimate SSC quantiles as a function of watershed characteristics were built in each region and compared to a global model including all sites. Regional estimates of SSC quantiles were compared with the local values. Results show that regional estimation of extreme SSC is more efficient than a global regression model including all sites. Groups/regions of stations were identified, using either the watershed characteristics or the seasonality of occurrence of extreme SSC values, providing a method to better describe extreme SSC events. The most important variables for predicting extreme SSC are the percentage of clay in the soils, precipitation intensity and forest cover.
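
    The at-site step described above (fit a distribution to annual maxima, read off quantiles for chosen return periods) can be sketched as follows; the generalized extreme-value distribution and the simulated SSC maxima are stand-ins, not the study's actual choices for every station.

      import numpy as np
      from scipy.stats import genextreme

      rng = np.random.default_rng(2)
      annual_max_ssc = genextreme.rvs(c=-0.1, loc=300, scale=80, size=40,
                                      random_state=rng)   # hypothetical record

      c, loc, scale = genextreme.fit(annual_max_ssc)
      for T in (2, 10, 50, 100):
          q = genextreme.ppf(1 - 1 / T, c, loc, scale)   # non-exceedance 1 - 1/T
          print(f"T = {T:>3} yr   SSC quantile ~ {q:.0f} mg/L")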

  11. Statistical downscaling modeling with quantile regression using lasso to estimate extreme rainfall

    NASA Astrophysics Data System (ADS)

    Santri, Dewi; Wigena, Aji Hamim; Djuraidah, Anik

    2016-02-01

    Rainfall is one of the climatic elements with high variability, and it has many negative impacts, especially extreme rainfall; methods are therefore needed to minimize the damage that may occur. So far, global circulation models (GCMs) are the best tool for forecasting global climate change, including extreme rainfall. Statistical downscaling (SD) is a technique for developing the relationship between GCM output, as global-scale independent variables, and rainfall, as a local-scale response variable. Using GCM output directly is difficult when assessed against observations, because it is high-dimensional and its variables are multicollinear. Common methods for handling this problem are principal component analysis (PCA) and partial least squares regression; a newer alternative is the lasso, which has the advantage of simultaneously controlling the variance of the fitted coefficients and performing automatic variable selection. Quantile regression can be used to detect extreme rainfall at both the dry and wet extremes. The objective of this study is to model SD using quantile regression with the lasso to predict extreme rainfall in Indramayu. The results show that extreme rainfall (extreme wet in January, February and December) in Indramayu could be predicted properly by the model at the 90th quantile.
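
    In the spirit of the study, the sketch below fits an L1-penalized (lasso-type) quantile regression with scikit-learn's QuantileRegressor; the simulated predictors stand in for high-dimensional GCM output and the skewed response for local rainfall.

      import numpy as np
      from sklearn.linear_model import QuantileRegressor

      rng = np.random.default_rng(3)
      n, p = 300, 50                    # many (collinear) GCM-style predictors
      X = rng.normal(size=(n, p))
      y = 5 + 2.0 * X[:, 0] + 1.5 * X[:, 1] + rng.gamma(2, 2, n)  # skewed rainfall

      # quantile=0.9 targets the wet extreme; alpha sets the L1 penalty strength.
      qr = QuantileRegressor(quantile=0.9, alpha=0.05, solver="highs").fit(X, y)
      print("selected predictors:", np.flatnonzero(np.abs(qr.coef_) > 1e-8))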

  12. Parameter Heterogeneity In Breast Cancer Cost Regressions – Evidence From Five European Countries

    PubMed Central

    Banks, Helen; Campbell, Harry; Douglas, Anne; Fletcher, Eilidh; McCallum, Alison; Moger, Tron Anders; Peltola, Mikko; Sveréus, Sofia; Wild, Sarah; Williams, Linda J.; Forbes, John

    2015-01-01

    We investigate parameter heterogeneity in breast cancer 1-year cumulative hospital costs across five European countries as part of the EuroHOPE project. The paper aims to explore whether conditional mean effects provide a suitable representation of the national variation in hospital costs. A cohort of patients with a primary diagnosis of invasive breast cancer (ICD-9 code 174 and ICD-10 C50 codes) is derived using routinely collected individual breast cancer data from Finland, the metropolitan area of Turin (Italy), Norway, Scotland and Sweden. Conditional mean effects are estimated by ordinary least squares for each country, and quantile regressions are used to explore heterogeneity across the conditional quantile distribution. Point estimates based on conditional mean effects provide a good approximation of treatment response for some key demographic and diagnostic variables (e.g. age and ICD-10 diagnosis) across the conditional quantile distribution. For many policy variables of interest, however, there is considerable evidence of parameter heterogeneity that is concealed if decisions are based solely on conditional mean results. The use of quantile regression methods reinforces the need to look beyond an average effect, given the greater recognition that breast cancer is a complex disease reflecting patient heterogeneity. © 2015 The Authors. Health Economics Published by John Wiley & Sons Ltd. PMID:26633866

  13. Multiple imputation of rainfall missing data in the Iberian Mediterranean context

    NASA Astrophysics Data System (ADS)

    Miró, Juan Javier; Caselles, Vicente; Estrela, María José

    2017-11-01

    Given the increasing need for complete rainfall data networks, diverse methods have been proposed in recent years for filling gaps in observed precipitation series, progressively more advanced than traditional approaches. The present study validates 10 methods (6 linear, 2 non-linear and 2 hybrid) that allow multiple imputation, i.e. filling the missing data of multiple incomplete series at the same time within a dense network of neighboring stations. These were applied to daily and monthly rainfall in two sectors of the Júcar River Basin Authority (eastern Iberian Peninsula), an area characterized by high spatial irregularity and difficulty of rainfall estimation. A classification of precipitation according to its genetic origin was applied as pre-processing, and a quantile-mapping adjustment as a post-processing technique. The results generally showed better performance for the non-linear and hybrid methods; among the non-linear approaches, non-linear PCA (NLPCA) considerably outperforms the Self-Organizing Maps (SOM) method. Among the linear methods, the Regularized Expectation Maximization method (RegEM) was the best, but far below NLPCA. Applying EOF filtering as post-processing of NLPCA (the hybrid approach) yielded the best results.
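
    A minimal empirical quantile-mapping sketch of the post-processing step mentioned above: imputed values are passed through the imputation distribution's quantiles onto the observed distribution's quantiles. The study's exact implementation may differ.

      import numpy as np

      def quantile_map(values, modeled_ref, observed_ref):
          """Map values through the modeled CDF onto observed quantiles."""
          probs = np.linspace(0.01, 0.99, 99)
          src_q = np.quantile(modeled_ref, probs)
          obs_q = np.quantile(observed_ref, probs)
          return np.interp(values, src_q, obs_q)

      rng = np.random.default_rng(4)
      observed = rng.gamma(2.0, 5.0, 1000)          # station rainfall
      imputed = rng.gamma(2.0, 4.0, 1000) + 1.0     # biased imputations
      corrected = quantile_map(imputed, imputed, observed)
      print(np.quantile(observed, 0.9), np.quantile(corrected, 0.9))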

  14. Analysis of the Influence of Quantile Regression Model on Mainland Tourists' Service Satisfaction Performance

    PubMed Central

    Wang, Wen-Cheng; Cho, Wen-Chien; Chen, Yin-Jen

    2014-01-01

    It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of much concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least-squares (mean) regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models. PMID:24574916

  15. Analysis of the influence of quantile regression model on mainland tourists' service satisfaction performance.

    PubMed

    Wang, Wen-Cheng; Cho, Wen-Chien; Chen, Yin-Jen

    2014-01-01

    It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of much concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least-squares (mean) regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models.

  16. Smooth conditional distribution function and quantiles under random censorship.

    PubMed

    Leconte, Eve; Poiraud-Casanova, Sandrine; Thomas-Agnan, Christine

    2002-09-01

    We consider a nonparametric random design regression model in which the response variable is possibly right censored. The aim of this paper is to estimate the conditional distribution function and the conditional α-quantile of the response variable. We restrict attention to the case where the response variable and the explanatory variable are unidimensional and continuous. We propose and discuss two classes of estimators that are smooth with respect to the response variable as well as to the covariate. Simulations demonstrate that the new methods have better mean squared error performance than the generalized Kaplan-Meier estimator introduced by Beran (1981) and considered in the literature by Dabrowska (1989, 1992) and Gonzalez-Manteiga and Cadarso-Suarez (1994).
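
    A toy sketch of a kernel-weighted conditional CDF and the quantile read off from it. Censoring is ignored here; the paper's estimators additionally handle right censoring in the Kaplan-Meier spirit.

      import numpy as np

      def cond_cdf(x0, y_grid, X, Y, h=0.3):
          w = np.exp(-0.5 * ((X - x0) / h) ** 2)    # Gaussian kernel weights
          w /= w.sum()
          return np.array([(w * (Y <= y)).sum() for y in y_grid])

      def cond_quantile(x0, alpha, X, Y, h=0.3):
          grid = np.sort(Y)
          F = cond_cdf(x0, grid, X, Y, h)
          idx = min(np.searchsorted(F, alpha), grid.size - 1)
          return grid[idx]

      rng = np.random.default_rng(5)
      X = rng.uniform(0, 1, 400)
      Y = 1 + 2 * X + rng.normal(0, 0.3, 400)
      print("conditional median at x=0.5 ~", cond_quantile(0.5, 0.5, X, Y))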

  17. Removing Batch Effects from Longitudinal Gene Expression - Quantile Normalization Plus ComBat as Best Approach for Microarray Transcriptome Data

    PubMed Central

    Müller, Christian; Schillert, Arne; Röthemeier, Caroline; Trégouët, David-Alexandre; Proust, Carole; Binder, Harald; Pfeiffer, Norbert; Beutel, Manfred; Lackner, Karl J.; Schnabel, Renate B.; Tiret, Laurence; Wild, Philipp S.; Blankenberg, Stefan

    2016-01-01

    Technical variation plays an important role in microarray-based gene expression studies, and batch effects explain a large proportion of this noise. It is therefore mandatory to eliminate technical variation while maintaining biological variability. Several strategies have been proposed for the removal of batch effects, although they have not been evaluated in large-scale longitudinal gene expression data. In this study, we aimed to identify a suitable method for batch effect removal in a large study of microarray-based longitudinal gene expression. Monocytic gene expression was measured in 1092 participants of the Gutenberg Health Study at baseline and at 5-year follow-up. Replicates of selected samples were measured at both time points to identify technical variability. Deming regression, Passing-Bablok regression, linear mixed models, non-linear models as well as ReplicateRUV and ComBat were applied to eliminate batch effects between replicates. In a second step, quantile normalization prior to batch effect correction was performed for each method. Technical variation between batches was evaluated by principal component analysis. Associations between body mass index and transcriptomes were calculated before and after batch removal, and the results of these association analyses were compared to evaluate the maintenance of biological variability. Quantile normalization, performed separately in each batch, combined with ComBat successfully reduced batch effects and maintained biological variability. ReplicateRUV performed perfectly in the replicate data subset of the study, but failed when applied to all samples. All other methods did not substantially reduce batch effects in the replicate data subset. Quantile normalization plus ComBat appears to be a valuable approach for batch correction in longitudinal gene expression data. PMID:27272489
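
    Plain quantile normalization, the first half of the winning combination, can be sketched in a few lines: each sample's sorted values are replaced by the across-sample mean of sorted values (the ComBat step itself is not shown).

      import numpy as np

      def quantile_normalize(mat):
          """mat: genes x samples; returns a quantile-normalized copy."""
          order = np.argsort(mat, axis=0)            # per-sample ranking
          ref = np.sort(mat, axis=0).mean(axis=1)    # reference distribution
          out = np.empty_like(mat, dtype=float)
          for j in range(mat.shape[1]):
              out[order[:, j], j] = ref
          return out

      rng = np.random.default_rng(6)
      expr = rng.lognormal(mean=2.0, sigma=1.0, size=(1000, 8))
      qn = quantile_normalize(expr)
      # After normalization every sample shares the same sorted values.
      print(np.allclose(np.sort(qn, axis=0).std(axis=1), 0))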

  18. Solvency supervision based on a total balance sheet approach

    NASA Astrophysics Data System (ADS)

    Pitselis, Georgios

    2009-11-01

    In this paper we investigate the adequacy of the own funds a company requires in order to remain healthy and avoid insolvency. Two methods are applied here: the quantile regression method and the method of mixed effects models. Quantile regression is capable of providing a more complete statistical analysis of the stochastic relationship among random variables than least squares estimation. The estimated mixed effects line can be considered an internal industry equation (norm), which describes a systematic relation between a dependent variable (such as own funds) and independent variables (e.g. financial characteristics, such as assets, provisions, etc.). The two methods are implemented on two data sets.

  19. A probability metric for identifying high-performing facilities: an application for pay-for-performance programs.

    PubMed

    Shwartz, Michael; Peköz, Erol A; Burgess, James F; Christiansen, Cindy L; Rosen, Amy K; Berlowitz, Dan

    2014-12-01

    Two approaches are commonly used for identifying high-performing facilities on a performance measure: one, that the facility is in a top quantile (eg, quintile or quartile); and two, that a confidence interval is below (or above) the average of the measure for all facilities. This type of yes/no designation often does not do well in distinguishing high-performing from average-performing facilities. We illustrate an alternative continuous-valued metric for profiling facilities, the probability that a facility is in a top quantile, and show the implications of using this metric for profiling and pay-for-performance. We created a composite measure of quality from fiscal year 2007 data based on 28 quality indicators from 112 Veterans Health Administration nursing homes. A Bayesian hierarchical multivariate normal-binomial model was used to estimate shrunken rates of the 28 quality indicators, which were combined into a composite measure using opportunity-based weights. Rates were estimated using Markov chain Monte Carlo methods as implemented in WinBUGS. The probability metric was calculated from the simulation replications. Our probability metric allowed better discrimination of high performers than the point or interval estimate of the composite score. In a pay-for-performance program, a smaller top quantile (eg, a quintile) resulted in more resources being allocated to the highest performers, whereas a larger top quantile (eg, being above the median) distinguished less among high performers and allocated more resources to average performers. The probability metric has potential but needs to be evaluated by stakeholders in different types of delivery systems.
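
    The probability metric itself is easy to compute once posterior draws of each facility's composite score are available, as this simulated sketch shows (the Bayesian model that produces the draws is not reproduced here).

      import numpy as np

      rng = np.random.default_rng(7)
      n_fac, n_draws = 112, 4000
      true_quality = rng.normal(0, 1, n_fac)
      draws = true_quality + rng.normal(0, 0.5, (n_draws, n_fac))  # posterior sims

      k = int(np.ceil(0.2 * n_fac))              # size of the top quintile
      ranks = np.argsort(-draws, axis=1)         # best facility first, per draw
      top = np.zeros((n_draws, n_fac), dtype=bool)
      np.put_along_axis(top, ranks[:, :k], True, axis=1)
      p_top = top.mean(axis=0)                   # P(facility in top quintile)
      print(p_top.round(2)[:10])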

  20. Estimating risks to aquatic life using quantile regression

    USGS Publications Warehouse

    Schmidt, Travis S.; Clements, William H.; Cade, Brian S.

    2012-01-01

    One of the primary goals of biological assessment is to assess whether contaminants or other stressors limit the ecological potential of running waters. It is important to interpret responses to contaminants relative to other environmental factors, but necessity or convenience limits quantification of all factors that influence ecological potential. In these situations, the concept of limiting factors is useful for data interpretation. We used quantile regression to measure risks to aquatic life exposed to metals by including all regression quantiles (τ = 0.05-0.95, in increments of 0.05), not just the upper limit of density (e.g., the 90th quantile). We measured population densities (individuals/0.1 m2) of 2 mayflies (Rhithrogena spp., Drunella spp.) and a caddisfly (Arctopsyche grandis), aqueous metal mixtures (Cd, Cu, Zn), and other limiting factors (basin area, site elevation, discharge, temperature) at 125 streams in Colorado. We used a model selection procedure to test which factor was most limiting to density. Arctopsyche grandis was limited by other factors, whereas metals limited most quantiles of density for the 2 mayflies. Metals reduced mayfly densities most at sites where other factors were not limiting. Where other factors were limiting, low mayfly densities were observed regardless of metal concentrations. Metals affected mayfly densities most at quantiles above the mean, not just at the upper limit of density. Risk models developed from quantile regression showed that mayfly densities observed at background metal concentrations are improbable when metal mixtures are at US Environmental Protection Agency criterion continuous concentrations. We conclude that metals limit potential density, not realized average density. The most obvious effects on mayfly populations were at upper quantiles, not at mean density. Therefore, we suggest that policy developed from mean-based measures of effects may not be as useful as policy based on the concept of limiting factors.

  1. Variability of daily UV index in Jokioinen, Finland, in 1995-2015

    NASA Astrophysics Data System (ADS)

    Heikkilä, A.; Uusitalo, K.; Kärhä, P.; Vaskuri, A.; Lakkala, K.; Koskela, T.

    2017-02-01

    The UV Index is a measure of UV radiation harmful to human skin, developed and used to promote sun awareness and protection. Monitoring programs conducted around the world have produced a number of long-term time series of UV irradiance. One of the longest time series of solar spectral UV irradiance in Europe has been obtained from the continuous measurements of the Brewer #107 spectrophotometer in Jokioinen (lat. 60°44'N, lon. 23°30'E), Finland, over the years 1995-2015. We have used descriptive statistics and estimates of cumulative distribution functions, quantiles and probability density functions in the analysis of the time series of daily UV Index maxima. Seasonal differences are found in the estimated distributions and in the trends of the estimated quantiles.

  2. Uncertainty analysis of an inflow forecasting model: extension of the UNEEC machine learning-based method

    NASA Astrophysics Data System (ADS)

    Pianosi, Francesca; Lal Shrestha, Durga; Solomatine, Dimitri

    2010-05-01

    This research presents an extension of the UNEEC (Uncertainty Estimation based on Local Errors and Clustering; Shrestha and Solomatine, 2006, 2008; Solomatine and Shrestha, 2009) method in the direction of explicit inclusion of parameter uncertainty. The UNEEC method assumes that there is an optimal model and that the residuals of the model can be used to assess the uncertainty of the model prediction; all sources of uncertainty, including input, parameter and model structure uncertainty, are assumed to be manifested in the model residuals. In this research, these assumptions are relaxed, and the UNEEC method is extended to consider parameter uncertainty as well (abbreviated UNEEC-P). In UNEEC-P, we first use Monte Carlo (MC) sampling in parameter space to generate N model realizations (each of which is a time series), estimate the prediction quantiles from the empirical distribution functions of the model residuals over all residual realizations, and only then apply the standard UNEEC method, which encapsulates the uncertainty of a hydrologic model (expressed by quantiles of the error distribution) in a machine learning model (e.g., an ANN). UNEEC-P is applied first to a linear regression model of synthetic data, and then to a real case study of forecasting inflow to Lake Lugano in northern Italy. The inflow forecasting model is a stochastic heteroscedastic model (Pianosi and Soncini-Sessa, 2009). Preliminary results show that the UNEEC-P method produces wider uncertainty bounds, which is consistent with the fact that the method also considers the parameter uncertainty of the optimal model. In the future, the UNEEC method will be further extended to consider input and structure uncertainty, which will provide more realistic estimates of model predictions.
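
    A toy version of the core UNEEC-P step as described: pool the residuals from N Monte Carlo parameter realizations and take empirical quantiles of the pooled error distribution as prediction bounds. All numbers are stand-ins.

      import numpy as np

      rng = np.random.default_rng(8)
      n_real, n_t = 200, 365
      obs = rng.gamma(2.0, 10.0, n_t)                      # observed inflow
      # Hypothetical model runs under sampled parameter sets.
      sims = obs[None, :] * rng.normal(1.0, 0.15, (n_real, n_t))
      residuals = (obs[None, :] - sims).ravel()            # all realizations
      lo, hi = np.quantile(residuals, [0.05, 0.95])
      print(f"90% residual-based uncertainty band: [{lo:.1f}, {hi:.1f}]")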

  3. Alternative Statistical Frameworks for Student Growth Percentile Estimation

    ERIC Educational Resources Information Center

    Lockwood, J. R.; Castellano, Katherine E.

    2015-01-01

    This article suggests two alternative statistical approaches for estimating student growth percentiles (SGP). The first is to estimate percentile ranks of current test scores conditional on past test scores directly, by modeling the conditional cumulative distribution functions, rather than indirectly through quantile regressions. This would…

  4. Estimating geographic variation on allometric growth and body condition of Blue Suckers with quantile regression

    USGS Publications Warehouse

    Cade, B.S.; Terrell, J.W.; Neely, B.C.

    2011-01-01

    Increasing our understanding of how environmental factors affect fish body condition and improving its utility as a metric of aquatic system health require reliable estimates of spatial variation in condition (weight at length). We used three statistical approaches that varied in how they accounted for heterogeneity in allometric growth to estimate differences in body condition of blue suckers Cycleptus elongatus across 19 large-river locations in the central USA. Quantile regression of an expanded allometric growth model provided the most comprehensive estimates, including variation in exponents within and among locations (range = 2.88–4.24). Blue suckers from more-southerly locations had the largest exponents. Mixed-effects mean regression of a similar expanded allometric growth model allowed exponents to vary among locations (range = 3.03–3.60). Mean relative weights compared across selected intervals of total length (TL = 510–594 and 594–692 mm) in a multiplicative model involved the implicit assumption that allometric exponents within and among locations were similar to the exponent (3.46) for the standard weight equation. Proportionate differences in the quantiles of weight at length for adult blue suckers (TL = 510, 594, 644, and 692 mm) compared with their average across locations ranged from 1.08 to 1.30 for southern locations (Texas, Mississippi) and from 0.84 to 1.00 for northern locations (Montana, North Dakota); proportionate differences for mean weight ranged from 1.13 to 1.17 and from 0.87 to 0.95, respectively, and those for mean relative weight ranged from 1.10 to 1.18 and from 0.86 to 0.98, respectively. Weights for fish at longer lengths varied by 600–700 g within a location and by as much as 2,000 g among southern and northern locations. Estimates for the Wabash River, Indiana (0.96–1.07 times the average; greatest increases for lower weights at shorter TLs), and for the Missouri River from Blair, Nebraska, to Sioux City, Iowa (0.90–1.00 times the average; greatest decreases for lower weights at longer TLs), were examined in detail to explain the additional information provided by quantile estimates.
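
    The expanded allometric model idea can be sketched with a log-log quantile regression, where the slope is the allometric exponent and may differ by quantile; location terms are omitted here and the data are simulated.

      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(9)
      length = rng.uniform(400, 750, 300)                  # TL in mm
      weight = 1e-6 * length ** 3.3 * np.exp(rng.normal(0, 0.15, 300))
      df = pd.DataFrame({"logW": np.log(weight), "logL": np.log(length)})

      mod = smf.quantreg("logW ~ logL", df)
      for q in (0.1, 0.5, 0.9):
          b = mod.fit(q=q).params["logL"]                  # allometric exponent
          print(f"q = {q}: exponent b ~ {b:.2f}")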

  5. Topological and canonical kriging for design flood prediction in ungauged catchments: an improvement over a traditional regional regression approach?

    USGS Publications Warehouse

    Archfield, Stacey A.; Pugliese, Alessio; Castellarin, Attilio; Skøien, Jon O.; Kiang, Julie E.

    2013-01-01

    In the United States, estimation of flood frequency quantiles at ungauged locations has been largely based on regional regression techniques that relate measurable catchment descriptors to flood quantiles. More recently, spatial interpolation techniques of point data have been shown to be effective for predicting streamflow statistics (i.e., flood flows and low-flow indices) in ungauged catchments. Literature reports successful applications of two techniques, canonical kriging, CK (or physiographical-space-based interpolation, PSBI), and topological kriging, TK (or top-kriging). CK performs the spatial interpolation of the streamflow statistic of interest in the two-dimensional space of catchment descriptors. TK predicts the streamflow statistic along river networks taking both the catchment area and nested nature of catchments into account. It is of interest to understand how these spatial interpolation methods compare with generalized least squares (GLS) regression, one of the most common approaches to estimate flood quantiles at ungauged locations. By means of a leave-one-out cross-validation procedure, the performance of CK and TK was compared to GLS regression equations developed for the prediction of 10, 50, 100 and 500 yr floods for 61 streamgauges in the southeast United States. TK substantially outperforms GLS and CK for the study area, particularly for large catchments. The performance of TK over GLS highlights an important distinction between the treatments of spatial correlation when using regression-based or spatial interpolation methods to estimate flood quantiles at ungauged locations. The analysis also shows that coupling TK with CK slightly improves the performance of TK; however, the improvement is marginal when compared to the improvement in performance over GLS.

  6. Spatio-temporal characteristics of the extreme precipitation by L-moment-based index-flood method in the Yangtze River Delta region, China

    NASA Astrophysics Data System (ADS)

    Yin, Yixing; Chen, Haishan; Xu, Chong-Yu; Xu, Wucheng; Chen, Changchun; Sun, Shanlei

    2016-05-01

    Regionalization methods, which "trade space for time" by pooling information from different locations in the frequency analysis, are efficient tools for enhancing the reliability of extreme quantile estimates. This paper aims at improving the understanding of the regional frequency of extreme precipitation by using regionalization methods, and at providing scientific background and practical assistance in formulating regional development strategies for water resources management in one of the most developed and flood-prone regions in China, the Yangtze River Delta (YRD) region. To achieve these goals, the L-moment-based index-flood (LMIF) method, one of the most popular regionalization methods, is used for the regional frequency analysis of extreme precipitation, with special attention paid to inter-site dependence and its influence on the accuracy of quantile estimates, which has not been considered in most studies using the LMIF method. Extensive data screening for stationarity, serial dependence, and inter-site dependence was carried out first. The entire YRD region was then categorized into four homogeneous regions through cluster analysis and homogeneity analysis. Based on goodness-of-fit statistics and L-moment ratio diagrams, the generalized extreme-value (GEV) and generalized normal (GNO) distributions were identified as the best-fitting distributions for most of the sub-regions, and estimated quantiles for each region were obtained. Monte Carlo simulation was used to evaluate the accuracy of the quantile estimates taking inter-site dependence into consideration. The results showed that the root-mean-square errors (RMSEs) were larger and the 90% error bounds were wider with inter-site dependence than without it, for both the regional growth curve and the quantile curve. The spatial patterns of extreme precipitation with a return period of 100 years were finally obtained, indicating two regions with the highest precipitation extremes and a large region with low precipitation extremes. However, the regions with low precipitation extremes are among the most developed and densely populated regions of the country, where floods would cause great loss of human life and property damage owing to high vulnerability. The methods and procedure demonstrated in this paper provide a useful reference for frequency analysis of precipitation extremes in large regions, and the findings will be beneficial for flood control and management in the study area.

  7. Regionalisation of a distributed method for flood quantiles estimation: Revaluation of local calibration hypothesis to enhance the spatial structure of the optimised parameter

    NASA Astrophysics Data System (ADS)

    Odry, Jean; Arnaud, Patrick

    2016-04-01

    The SHYREG method (Aubert et al., 2014) associates a stochastic rainfall generator and a rainfall-runoff model to produce rainfall and flood quantiles on a 1 km2 mesh covering the whole French territory. The rainfall generator is based on the description of rainy events by descriptive variables following probability distributions and is characterised by high stability. This stochastic generator is fully regionalised, and the rainfall-runoff transformation is calibrated with a single parameter. Thanks to the stability of the approach, calibration can be performed against only the flood quantiles associated with observed frequencies, which can be extracted from relatively short time series. The aggregation of SHYREG flood quantiles to the catchment scale is performed using an areal reduction factor technique that is uniform over the whole territory. Past studies demonstrated the accuracy of SHYREG flood quantile estimation for catchments where flow data are available (Arnaud et al., 2015). Nevertheless, the parameter of the rainfall-runoff model is calibrated independently for each target catchment. As a consequence, this parameter plays a corrective role and compensates for approximations and modelling errors, which makes it difficult to identify its proper spatial pattern. It is an inherent objective of the SHYREG approach to be completely regionalised in order to provide a complete and accurate flood quantile database throughout France. Consequently, it appears necessary to identify the model configuration in which the calibrated parameter can be regionalised with acceptable performance. The re-evaluation of some of the method's hypotheses is a necessary step before regionalisation. In particular, the inclusion or modification of the spatial variability of imposed parameters (such as production and transfer reservoir size, base flow addition and the quantile aggregation function) should lead to more realistic values of the single calibrated parameter. The objective of the work presented here is to develop a SHYREG evaluation scheme focusing on both local and regional performance. Indeed, it is necessary to maintain the accuracy of at-site flood quantile estimation while identifying a configuration that leads to a satisfactory spatial pattern of the calibrated parameter. This ability to be regionalised can be appraised by combining common regionalisation techniques with split-sample validation tests on a set of around 1,500 catchments representing the whole physiographic diversity of France. Also, the presence of many nested catchments and a size-based split-sample validation make it possible to assess the relevance of the calibrated parameter's spatial structure inside the largest catchments. The application of this multi-objective evaluation leads to the selection of a version of SHYREG more suitable for regionalisation. References: Arnaud, P., Cantet, P., Aubert, Y., 2015. Relevance of an at-site flood frequency analysis method for extreme events based on stochastic simulation of hourly rainfall. Hydrological Sciences Journal: in press. DOI:10.1080/02626667.2014.965174. Aubert, Y., Arnaud, P., Ribstein, P., Fine, J.A., 2014. The SHYREG flow method: application to 1605 basins in metropolitan France. Hydrological Sciences Journal, 59(5): 993-1005. DOI:10.1080/02626667.2014.902061.

  8. Trends of VOC exposures among a nationally representative sample: Analysis of the NHANES 1988 through 2004 data sets

    NASA Astrophysics Data System (ADS)

    Su, Feng-Chiao; Mukherjee, Bhramar; Batterman, Stuart

    2011-09-01

    Exposures to volatile organic compounds (VOCs) are ubiquitous due to emissions from personal, commercial and industrial products, but quantitative and representative information regarding long-term exposure trends is lacking. This study characterizes trends from 1988 to 2004 for the 15 VOCs measured in blood in five cohorts of the National Health and Nutrition Examination Survey (NHANES), a large and representative sample of U.S. adults. Trends were evaluated at various percentiles using linear quantile regression (QR) models, which were adjusted for solvent-related occupations and cotinine levels. Most VOCs showed decreasing trends at all quantiles, e.g., median exposures declined by 2.5 (m,p-xylene) to 6.4 (tetrachloroethene) percent per year over the 15-year period. Trends varied by VOC and quantile, and were grouped into three patterns: similar decreases at all quantiles (including benzene, toluene); most rapid decreases at upper quantiles (ethylbenzene, m,p-xylene, o-xylene, styrene, chloroform, tetrachloroethene); and fastest declines at central quantiles (1,4-dichlorobenzene). These patterns reflect changes in exposure sources, e.g., upper-percentile exposures may result mostly from occupational exposure, while lower-percentile exposures arise from general environmental sources. Both VOC emissions aggregated at the national level and VOC concentrations measured in ambient air have also declined substantially over the study period, supporting the exposure trends, although the NHANES data suggest the importance of indoor sources and personal activities for VOC exposures. While piecewise QR models suggest that exposures to several VOCs decreased little if at all during the 1990s, followed by more rapid decreases from 1999 to 2004, questions are raised concerning the reliability of VOC data in several of the NHANES cohorts and their applicability as an exposure indicator, as demonstrated by the modest correlation between VOC levels in blood and in personal air collected in the 1999/2000 cohort. Despite some limitations, the NHANES data provide a unique, long-term and direct measurement of VOC exposures and trends.

  9. Interpolating Non-Parametric Distributions of Hourly Rainfall Intensities Using Random Mixing

    NASA Astrophysics Data System (ADS)

    Mosthaf, Tobias; Bárdossy, András; Hörning, Sebastian

    2015-04-01

    The correct spatial interpolation of hourly rainfall intensity distributions is of great importance for stochastic rainfall models. Poorly interpolated distributions may lead to over- or underestimation of rainfall and consequently to erroneous estimates in downstream applications, such as hydrological or hydraulic models. By analyzing the spatial relation of empirical rainfall distribution functions, a persistent ordering of the quantile values over a wide range of non-exceedance probabilities is observed. As the ordering remains similar, the interpolation weights of quantile values for one particular non-exceedance probability can be applied to the other probabilities. This assumption enables the use of kernel-smoothed distribution functions for interpolation purposes. Comparing the ordering of hourly quantile values across gauges with the ordering of their daily quantile values for equal probabilities results in high correlations. The hourly quantile values also show high correlations with elevation. The incorporation of these two covariates into the interpolation is therefore tested. As only positive interpolation weights for the quantile values assure a monotonically increasing distribution function, the use of geostatistical methods like kriging is problematic, and employing kriging with external drift to incorporate secondary information is not applicable. Nonetheless, it would be fruitful to make use of covariates. To overcome this shortcoming, a new random mixing approach for spatial random fields is applied. Within the mixing process, hourly quantile values are treated as equality constraints, and correlations with elevation values are included as relationship constraints. To profit from the dependence on daily quantile values, distribution functions of daily gauges are used to set up lower-equal and greater-equal constraints at their locations. In this way the denser daily gauge network can be included in the interpolation of the hourly distribution functions. The applicability of this new interpolation procedure is shown for around 250 hourly rainfall gauges in the German federal state of Baden-Württemberg. The performance of the random mixing technique is compared to applicable kriging methods. Additionally, the interpolation of kernel-smoothed distribution functions is compared with the interpolation of fitted parametric distributions.

  10. Frequency analysis and its spatiotemporal characteristics of precipitation extreme events in China during 1951-2010

    NASA Astrophysics Data System (ADS)

    Shao, Yuehong; Wu, Junmei; Ye, Jinyin; Liu, Yonghe

    2015-08-01

    This study investigates the frequency and spatiotemporal characteristics of precipitation extremes based on annual maximum daily precipitation (AMP) data from 753 observation stations in China during the period 1951-2010. Several statistical methods, including L-moments, the Mann-Kendall test (MK test), Student's t test (t test) and analysis of variance (F test), are used to study different statistical properties related to the frequency and spatiotemporal characteristics of precipitation extremes. The results indicate that the AMP series of most sites have no linear trends at the 90% confidence level, but there is a distinctive decreasing trend in the Beijing-Tianjin-Tangshan region. The analysis of abrupt changes shows no significant changes at most sites, and no distinctive regional patterns among the change-point sites either. An important innovation relative to previous studies is that shifts in the mean and the variance are also examined, in order to further analyze changes in strong and weak precipitation extreme events. The shift analysis shows that more attention should be paid to drought in North China and to flood control and drought in South China, especially in regions that have no clear trend but a significant shift in the variance. More importantly, this study conducts a comprehensive analysis of a complete set of quantile estimates and their spatiotemporal characteristics in China. The spatial distribution of quantile estimates based on the AMP series demonstrates that the values gradually increase from the Northwest to the Southeast with increasing duration and return period, while the increase in the estimates is smooth in the arid and semiarid regions and rapid in the humid region. Frequency estimates for the 50-year return period agree with the maximum observations of the AMP series at most stations, which can provide a more quantitative and scientific basis for decision making.

  11. Calibration of limited-area ensemble precipitation forecasts for hydrological predictions

    NASA Astrophysics Data System (ADS)

    Diomede, Tommaso; Marsigli, Chiara; Montani, Andrea; Nerozzi, Fabrizio; Paccagnella, Tiziana

    2015-04-01

    The main objective of this study is to investigate the impact of calibration on limited-area ensemble precipitation forecasts used for driving discharge predictions up to 5 days in advance. A reforecast dataset spanning 30 years, based on the Consortium for Small Scale Modeling Limited-Area Ensemble Prediction System (COSMO-LEPS), was used for testing the calibration strategy. Three calibration techniques were applied: quantile-to-quantile mapping, linear regression, and analogs. The performance of these methodologies was evaluated in terms of statistical scores for the precipitation forecasts operationally provided by COSMO-LEPS in the years 2003-2007 over Germany, Switzerland, and the Emilia-Romagna region (northern Italy). The analog-based method seemed preferable because of its capability of correcting position errors and spread deficiencies. A suitable spatial domain for the analog search can help to handle model spatial errors as systematic errors. However, the performance of the analog-based method may degrade when only a limited training dataset is available; a sensitivity test on the length of the training dataset over which the analog search is performed has therefore been carried out. The quantile-to-quantile mapping and linear regression methods were less effective, mainly because the forecast-analysis relation was not very strong for the available training dataset. A comparison between calibration based on the deterministic reforecast and calibration based on the full operational ensemble used as the training dataset was also considered, with the aim of evaluating whether reforecasts are really worthwhile for calibration, given their considerable computational cost. The calibration process was then verified by coupling ensemble precipitation forecasts with a distributed rainfall-runoff model. This test was carried out for a medium-sized catchment in Emilia-Romagna and showed a beneficial impact of the analog-based method on the reduction of missed events in discharge predictions.

  12. Orosensory responsiveness and alcohol behaviour.

    PubMed

    Thibodeau, Margaret; Bajec, Martha; Pickering, Gary

    2017-08-01

    Consumption of alcoholic beverages is widespread through much of the world, and significantly impacts human health and well-being. We sought to determine the contribution of orosensation ('taste') to several alcohol intake measures by examining general responsiveness to taste and somatosensory stimuli in a convenience sample of 435 adults recruited from six cohorts. Each cohort was divided into quantiles based on their responsiveness to sweet, sour, bitter, salty, umami, metallic, and astringent stimuli, and the resulting quantiles pooled for analysis (Kruskal-Wallis ANOVA). Responsiveness to bitter and astringent stimuli was associated in a non-linear fashion with intake of all alcoholic beverage types, with the highest consumption observed in middle quantiles. Sourness responsiveness tended to be inversely associated with all measures of alcohol consumption. Regardless of sensation, the most responsive quantiles tended to drink less, although sweetness showed little relationship between responsiveness and intake. For wine, increased umami and metallic responsiveness tended to predict lower total consumption and frequency. A limited examination of individuals who abstain from all alcohol indicated a tendency toward higher responsiveness than alcohol consumers to sweetness, sourness, bitterness, and saltiness (biserial correlation), suggesting that broadly-tuned orosensory responsiveness may be protective against alcohol use and possibly misuse. Overall, these findings confirm the importance of orosensory responsiveness in mediating consumption of alcohol, and indicate areas for further research. Copyright © 2017. Published by Elsevier Inc.

  13. Environmental determinants of different blood lead levels in children: a quantile analysis from a nationwide survey.

    PubMed

    Etchevers, Anne; Le Tertre, Alain; Lucas, Jean-Paul; Bretin, Philippe; Oulhote, Youssef; Le Bot, Barbara; Glorennec, Philippe

    2015-01-01

    Blood lead levels (BLLs) have substantially decreased in recent decades in children in France. However, further reducing exposure is a public health goal, because there is no clear toxicological threshold. The identification of the environmental determinants of BLLs, as well as of risk factors associated with high BLLs, is important for updating prevention strategies. We aimed to estimate the contribution of environmental sources of lead to different BLLs in children in France. We enrolled 484 children aged from 6 months to 6 years in a nationwide cross-sectional survey in 2008-2009. We measured lead concentrations in blood and in environmental samples (water, soils, household settled dusts, paints, cosmetics and traditional cookware). We fitted two models: a multivariate generalized additive model on the geometric mean (GM), and a quantile regression model on the 10th, 25th, 50th, 75th and 90th quantiles of BLLs. The GM of BLLs was 13.8 μg/L (= 1.38 μg/dL) (95% confidence interval (CI): 12.7-14.9) and the 90th quantile was 25.7 μg/L (CI: 24.2-29.5). Household and common-area dust, tap water, interior paint, ceramic cookware, traditional cosmetics, playground soil and dust, and environmental tobacco smoke were associated with the GM of BLLs. Household dust and tap water made the largest contributions to both the GM and the 90th quantile of BLLs. The concentration of lead in dust was positively correlated with all quantiles of BLLs, even at low concentrations. Lead concentrations in tap water above 5 μg/L were also positively correlated with the GM and with the 75th and 90th quantiles of BLLs in children drinking tap water. Preventive actions must target household settled dust and tap water to reduce the BLLs of children in France. The use of traditional cosmetics should be avoided, whereas ceramic cookware should be limited to decorative purposes. Copyright © 2014 Elsevier Ltd. All rights reserved.

  14. Technical note: Combining quantile forecasts and predictive distributions of streamflows

    NASA Astrophysics Data System (ADS)

    Bogner, Konrad; Liechti, Katharina; Zappa, Massimiliano

    2017-11-01

    The enhanced availability of many different hydro-meteorological modelling and forecasting systems raises the issue of how to optimally combine this wealth of information. In particular, the use of deterministic and probabilistic forecasts with sometimes widely divergent predicted future streamflow values makes it even more complicated for decision makers to sift out the relevant information. In this study, multiple sources of streamflow forecast information are aggregated based on several different predictive distributions and quantile forecasts. For this combination, the Bayesian model averaging (BMA) approach, the non-homogeneous Gaussian regression (NGR), also known as the ensemble model output statistics (EMOS) technique, and a novel method called Beta-transformed linear pooling (BLP) are applied. With the help of the quantile score (QS) and the continuous ranked probability score (CRPS), the combination results for the Sihl River in Switzerland, with about 5 years of forecast data, are compared, and the differences between the raw and optimally combined forecasts are highlighted. The results demonstrate the importance of applying proper forecast combination methods for decision makers in the field of flood and water resource management.
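
    The two verification scores named above can be written down compactly: the quantile (pinball) score for a single quantile level, and a CRPS approximation obtained by averaging pinball losses over many levels (CRPS equals twice the integral of the pinball loss over tau). The data below are simulated stand-ins for streamflow.

      import numpy as np

      def pinball(y, q_pred, tau):
          """Quantile score for observations y and tau-quantile forecasts."""
          d = y - q_pred
          return np.mean(np.maximum(tau * d, (tau - 1) * d))

      def crps_from_quantiles(y, quantile_preds, taus):
          """quantile_preds: (n_taus, n_obs); CRPS ~ 2 x mean pinball loss."""
          return 2 * np.mean([pinball(y, quantile_preds[i], t)
                              for i, t in enumerate(taus)])

      rng = np.random.default_rng(10)
      y = rng.gamma(2, 50, 200)                      # observed streamflow
      taus = np.linspace(0.05, 0.95, 19)
      ens = rng.gamma(2, 50, (1000, 200))            # raw forecast ensemble
      preds = np.quantile(ens, taus, axis=0)
      print(pinball(y, preds[9], 0.5), crps_from_quantiles(y, preds, taus))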

  15. Quantification of Uncertainty in the Flood Frequency Analysis

    NASA Astrophysics Data System (ADS)

    Kasiapillai Sudalaimuthu, K.; He, J.; Swami, D.

    2017-12-01

    Flood frequency analysis (FFA) is usually carried out for the planning and design of water resources and hydraulic structures. Owing to variability in sample representation, the selection of the distribution, and the estimation of distribution parameters, the estimation of flood quantiles has always been uncertain. Hence, suitable approaches must be developed to quantify the uncertainty in the form of prediction intervals, as an alternative to the deterministic approach. The framework developed in the present study to include uncertainty in FFA uses a multi-objective optimization approach to construct prediction intervals from an ensemble of flood quantiles. Through this approach, an optimal variability of distribution parameters is identified for carrying out FFA. To demonstrate the proposed approach, annual maximum flow data from two gauge stations (the Bow River at Calgary and at Banff, Canada) are used. The major focus of the present study was to evaluate the changes in the magnitude of flood quantiles due to the recent extreme flood event that occurred in 2013. In addition, the efficacy of the proposed method was verified against standard bootstrap-based sampling approaches, and the proposed method was found to be reliable in modeling extreme floods compared with the bootstrap methods.

  16. Flood frequency analysis - the challenge of using historical data

    NASA Astrophysics Data System (ADS)

    Engeland, Kolbjorn

    2015-04-01

    Estimates of high flood quantiles are needed for many applications: dam safety assessments are based on the 1000-year flood, for example, whereas the dimensioning of important infrastructure requires estimates of the 200-year flood. Flood quantiles are estimated by fitting a parametric distribution to a dataset of high flows comprising either annual maximum values or peaks over a selected threshold. Since the record length is limited relative to the desired flood quantile, the estimated flood magnitudes involve a high degree of extrapolation. For example, the longest time series available in Norway are around 120 years, so any estimate of a 1000-year flood will require extrapolation. One solution is to extend the temporal dimension of a data series by including information about historical floods that occurred before streamflow was systematically gauged. Such information could be flood marks or written documentation about flood events. The aim of this study was to evaluate the added value of using historical flood data for at-site flood frequency estimation. The historical floods were included in two ways, assuming either (1) that the size of (all) floods above a high threshold within a time interval is known, or (2) that the number of floods above a high threshold for a time interval is known. We used a Bayesian model formulation, with MCMC used for model estimation. This estimation procedure allowed us to estimate the predictive uncertainty of flood quantiles (i.e. both sampling and parameter uncertainty are accounted for). We tested the methods using 123 years of systematic data from Bulken in western Norway. In 2014, the largest flood in the systematic record was observed. From written documentation and flood marks we had information on three severe floods in the 18th century that were likely to have exceeded the 2014 flood. We evaluated the added value in two ways. First, we used the 123-year streamflow series and investigated the effect of having several shorter series that could be supplemented with a limited number of known large flood events. Then we used the three historical floods from the 18th century combined with the whole of, and subsets of, the 123 years of systematic observations. In the latter case several challenges were identified: (i) the possibility of converting water levels to river streamflows, given man-made changes in the river profile; and (ii) the stationarity of the data might be questioned, since the three largest historical floods occurred during the "Little Ice Age", under climatic conditions different from today's.

  17. An Investigation of Factors Influencing Nurses' Clinical Decision-Making Skills.

    PubMed

    Wu, Min; Yang, Jinqiu; Liu, Lingying; Ye, Benlan

    2016-08-01

    This study aims to investigate the factors influencing nurses' clinical decision-making (CDM) skills. A cross-sectional nonexperimental research design was conducted in the medical, surgical, and emergency departments of two university hospitals between May and June 2014. We used a quantile regression method to identify the influencing factors across different quantiles of the CDM skills distribution and compared the results with the corresponding ordinary least squares (OLS) estimates. Our findings revealed that nurses scored highest on the skill of managing oneself. Educational level, experience, and total structural empowerment had significant positive impacts on nurses' CDM skills, while the nurse-patient relationship, patient care and interaction, formal empowerment, and information empowerment were negatively correlated with nurses' CDM skills. These variables explained no more than 30% of the variance in nurses' CDM skills, and mainly explained the lower quantiles of the CDM skills distribution. © The Author(s) 2016.
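
    A minimal sketch of this analysis strategy, comparing OLS coefficients with quantile-regression coefficients at several quantiles; the variable names and simulated data are hypothetical stand-ins for the survey measures.

    ```python
    # Compare OLS with quantile regression at several quantiles of a
    # simulated, heteroscedastic outcome (stand-in for CDM scores).
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    n = 300
    df = pd.DataFrame({
        "experience": rng.uniform(0, 30, n),
        "education": rng.integers(1, 5, n),
    })
    # Heteroscedastic outcome: predictor effects differ across the distribution.
    df["cdm_score"] = (100 + 1.5 * df["experience"] + 4 * df["education"]
                       + rng.normal(0, 5 + 0.8 * df["experience"], n))

    ols = smf.ols("cdm_score ~ experience + education", df).fit()
    print("OLS:", ols.params.round(2).to_dict())
    for tau in (0.1, 0.5, 0.9):
        qr = smf.quantreg("cdm_score ~ experience + education", df).fit(q=tau)
        print(f"QR tau={tau}:", qr.params.round(2).to_dict())
    ```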

  18. Heterogeneity in Smokers' Responses to Tobacco Control Policies.

    PubMed

    Nesson, Erik

    2017-02-01

    This paper uses unconditional quantile regression to estimate whether smokers' responses to tobacco control policies change across the distribution of smoking levels. I measure smoking behavior with the number of cigarettes smoked per day and also with serum cotinine levels, a continuous biomarker of nicotine exposure, using individual-level repeated cross-section data from the National Health and Nutrition Examination Surveys. I find that cigarette taxes lead to reductions in both the number of cigarettes smoked per day and in smokers' cotinine levels. These reductions are most pronounced in the middle quantiles of both distributions in terms of marginal effects, but most pronounced in the lower quantiles in terms of tax elasticities. I do not find that higher cigarette taxes lead to statistically significant changes in the amount of nicotine smokers ingest from each cigarette. Copyright © 2015 John Wiley & Sons, Ltd.
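
    Unconditional quantile regression is commonly implemented by regressing the recentered influence function (RIF) of a quantile on covariates (Firpo, Fortin and Lemieux, 2009). The sketch below shows that two-step recipe on simulated stand-in data; it is not this paper's specification.

    ```python
    # A minimal sketch of RIF-based unconditional quantile regression:
    # RIF = q + (tau - 1{y <= q}) / f(q), then OLS of the RIF on covariates.
    import numpy as np
    import statsmodels.api as sm
    from scipy.stats import gaussian_kde

    rng = np.random.default_rng(1)
    n = 2000
    tax = rng.uniform(0.5, 3.0, n)                           # hypothetical pack tax
    y = np.maximum(0, 20 - 2.5 * tax + rng.gumbel(0, 4, n))  # cigarettes per day

    tau = 0.5
    q = np.quantile(y, tau)
    f_q = gaussian_kde(y)(q)[0]              # kernel density of y at the quantile
    rif = q + (tau - (y <= q)) / f_q         # recentered influence function

    X = sm.add_constant(tax)
    uqr = sm.OLS(rif, X).fit()               # OLS of the RIF gives the UQR effect
    print(f"UQR effect of tax at tau={tau}: {uqr.params[1]:.3f}")
    ```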

  19. Simulating Quantile Models with Applications to Economics and Management

    NASA Astrophysics Data System (ADS)

    Machado, José A. F.

    2010-05-01

    The massive increase in the speed of computers over the past forty years has changed the way that social scientists, applied economists and statisticians approach their trades, and also the very nature of the problems that they can feasibly tackle. The new methods that make intensive use of computer power go by the names of "computer-intensive" or "simulation" methods. My lecture will start with a bird's-eye view of the uses of simulation in economics and statistics. I will then turn to my own research on uses of computer-intensive methods. From a methodological point of view, the question I address is how to infer marginal distributions having estimated a conditional quantile process ("Counterfactual Decomposition of Changes in Wage Distributions Using Quantile Regression," Journal of Applied Econometrics 20, 2005). Illustrations will be provided of the use of the method to perform counterfactual analysis in several different areas of knowledge.
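
    A minimal sketch of the simulation idea behind the cited counterfactual decomposition: draw random quantile levels, fit a conditional quantile regression at each level, and pair each fitted coefficient vector with a random covariate draw to simulate the implied marginal distribution. The data and covariate below are invented for illustration.

    ```python
    # Simulate a marginal distribution from an estimated conditional
    # quantile process (in the spirit of Machado & Mata, 2005).
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    n = 1000
    X = sm.add_constant(rng.normal(1.0, 0.3, n))     # e.g. a schooling index
    wage = X @ np.array([1.0, 0.8]) + rng.normal(0, 0.3, n) * X[:, 1]

    taus = rng.uniform(0.01, 0.99, 200)              # random quantile levels
    draws = []
    for tau in taus:
        beta = sm.QuantReg(wage, X).fit(q=tau).params  # conditional quantile coefficients
        xi = X[rng.integers(0, n)]                     # random covariate draw
        draws.append(xi @ beta)                        # one draw from the implied marginal

    print("simulated marginal quantiles:", np.quantile(draws, [0.1, 0.5, 0.9]).round(2))
    ```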

  20. Measuring racial/ethnic disparities across the distribution of health care expenditures.

    PubMed

    Cook, Benjamin Lê; Manning, Willard G

    2009-10-01

    To assess whether black-white and Hispanic-white disparities increase or abate in the upper quantiles of total health care expenditure, conditional on covariates. Nationally representative adult population of non-Hispanic whites, African Americans, and Hispanics from the 2001-2005 Medical Expenditure Panel Surveys. We examine unadjusted racial/ethnic differences across the distribution of expenditures. We apply quantile regression to measure disparities at the median, 75th, 90th, and 95th quantiles, testing for differences over the distribution of health care expenditures and across income and education categories. We test the sensitivity of the results to comparisons based only on health status and estimate a two-part model to ensure that results are not driven by an extremely skewed distribution of expenditures with a large zero mass. Black-white and Hispanic-white disparities diminish in the upper quantiles of expenditure, but expenditures for blacks and Hispanics remain significantly lower than for whites throughout the distribution. For most education and income categories, disparities exist at the median and decline, but remain significant even with increased education and income. Blacks and Hispanics receive significantly disparate care at high expenditure levels, suggesting prioritization of improved access to quality care among minorities with critical health issues.

  1. Log Pearson type 3 quantile estimators with regional skew information and low outlier adjustments

    USGS Publications Warehouse

    Griffis, V.W.; Stedinger, Jery R.; Cohn, T.A.

    2004-01-01

    The recently developed expected moments algorithm (EMA) [Cohn et al., 1997] does as well as maximum likelihood estimators at estimating log-Pearson type 3 (LP3) flood quantiles using systematic and historical flood information. Needed extensions include use of a regional skewness estimator and its precision to be consistent with Bulletin 17B. Another issue addressed by Bulletin 17B is the treatment of low outliers. A Monte Carlo study compares the performance of Bulletin 17B using the entire sample, with and without regional skew, against estimators that use regional skew and censor low outliers, including an extended EMA estimator, the conditional probability adjustment (CPA) from Bulletin 17B, and an estimator that uses probability plot regression (PPR) to compute substitute values for low outliers. Estimators that neglect regional skew information do much worse than estimators that use an informative regional skewness estimator. For LP3 data the low outlier rejection procedure generally results in no loss of overall accuracy, and the differences between the MSEs of the estimators that used an informative regional skew are generally modest in the skewness range of real interest. Samples contaminated to model actual flood data demonstrate that estimators which give special treatment to low outliers significantly outperform estimators that make no such adjustment.
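
    For orientation, the sketch below shows the standard LP3 quantile computation that these estimators refine: fit the moments of the log-transformed flows and apply a Pearson type 3 frequency factor. The regional-skew weighting, EMA and low-outlier treatments discussed above are deliberately omitted, and the data are synthetic.

    ```python
    # A minimal sketch of a log-Pearson type 3 quantile via the frequency
    # factor: Q_T = 10 ** (mean + K * sd) on the log10-transformed flows.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    flows = rng.lognormal(mean=7.0, sigma=0.6, size=80)  # synthetic annual peaks

    logq = np.log10(flows)
    m, s = logq.mean(), logq.std(ddof=1)
    g = stats.skew(logq, bias=False)   # station skew; B17B would weight this with regional skew

    p = 0.99                           # 100-year flood (annual non-exceedance probability)
    K = stats.pearson3.ppf(p, g)       # frequency factor: standardized Pearson 3 deviate
    print(f"100-yr flood estimate: {10 ** (m + K * s):.0f}")
    ```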

  3. Spatio-temporal analysis of the extreme precipitation by the L-moment-based index-flood method in the Yangtze River Delta region, China

    NASA Astrophysics Data System (ADS)

    Yin, Yixing; Chen, Haishan; Xu, Chongyu; Xu, Wucheng; Chen, Changchun

    2014-05-01

    The regionalization methods which 'trade space for time' by including several at-site data records in the frequency analysis are an efficient tool to improve the reliability of extreme quantile estimates. With the main aims of improving the understanding of the regional frequency of extreme precipitation and providing scientific and practical support for formulating regional water resources management strategies in the Yangtze River Delta (YRD) region, one of the most developed and flood-prone regions in China, this paper applies the L-moment-based index-flood (LMIF) method, one of the most popular regionalization methods, to the regional frequency analysis of extreme precipitation; particular attention was paid to inter-site dependence and its influence on the accuracy of quantile estimates, which has not been considered in most studies using the LMIF method. Extensive data screening for stationarity, serial dependence and inter-site dependence was carried out first. The entire YRD region was then categorized into four homogeneous regions through cluster analysis and homogeneity analysis. Based on goodness-of-fit statistics and L-moment ratio diagrams, the generalized extreme-value (GEV) and generalized normal (GNO) distributions were identified as the best-fit distributions for most of the subregions, and estimated quantiles for each region were obtained. Monte Carlo simulation was used to evaluate the accuracy of the quantile estimates taking inter-site dependence into consideration. The results showed that the root mean square errors (RMSEs) were larger and the 90% error bounds wider with inter-site dependence than without it, for both the regional growth curve and the quantile curve. The spatial patterns of extreme precipitation with a 100-year return period were obtained, indicating two regions with the highest precipitation extremes (the southeastern coastal area of Zhejiang Province and the southwestern part of Anhui Province) and a large region of low precipitation extremes covering the northern and middle parts of Zhejiang Province, Shanghai City and Jiangsu Province. However, the central areas with low precipitation extremes are the most developed and densely populated parts of the study area, so floods there will cause great loss of life and property. These findings will help policymakers and stakeholders formulate regional development strategies for water resource management against the menace of frequently emerging floods.
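
    The building blocks of the LMIF method are sample L-moments. A minimal sketch of their computation from probability-weighted moments is given below, on synthetic annual-maximum rainfall; the regional pooling, homogeneity testing and growth-curve steps are not shown.

    ```python
    # Unbiased sample L-moments via probability-weighted moments.
    import numpy as np

    def sample_lmoments(x):
        """Return l1, l2 and the L-skewness t3 of a 1-D sample."""
        x = np.sort(np.asarray(x, dtype=float))
        n = x.size
        i = np.arange(1, n + 1)
        b0 = x.mean()
        b1 = np.sum((i - 1) / (n - 1) * x) / n
        b2 = np.sum((i - 1) * (i - 2) / ((n - 1) * (n - 2)) * x) / n
        l1 = b0
        l2 = 2 * b1 - b0
        l3 = 6 * b2 - 6 * b1 + b0
        return l1, l2, l3 / l2

    rng = np.random.default_rng(4)
    am_rain = rng.gumbel(40.0, 12.0, size=60)   # synthetic annual max 24-h rainfall
    l1, l2, t3 = sample_lmoments(am_rain)
    print(f"l1={l1:.1f}  L-CV={l2 / l1:.3f}  L-skew={t3:.3f}")
    ```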

  4. Bayesian Non-Stationary Flood Frequency Estimation at Ungauged Basins Using Climate Information and a Scaling Model

    NASA Astrophysics Data System (ADS)

    Lima, C. H.; Lall, U.

    2010-12-01

    Flood frequency statistical analysis most often relies on stationarity assumptions, where distribution moments (e.g. mean, standard deviation) and associated flood quantiles do not change over time. In this sense, one expects that flood magnitudes and their frequency of occurrence will remain as observed in the historical record. However, evidence of inter-annual and decadal climate variability and anthropogenic change, as well as an apparent increase in the number and magnitude of flood events across the globe, have made the stationarity assumption questionable. Here, we show how to estimate flood quantiles (e.g. the 100-year flood) at ungauged basins without assuming stationarity. A statistical model based on the well-known flow-area scaling law is proposed to estimate flood flows at ungauged basins. The slope and intercept scaling-law coefficients are assumed to be time varying, and a hierarchical Bayesian model is used to include climate information and reduce parameter uncertainties. Cross-validated results from 34 streamflow gauges located in a nested basin in Brazil show that the proposed model is able to estimate flood quantiles at ungauged basins with remarkable skill compared with data-based estimates using the full record. The model as developed in this work is also able to simulate sequences of flood flows under global climate change, provided an appropriate climate index derived from a general circulation model is used as a predictor. The time-varying flood frequency estimates can be used for pricing insurance, in a forecast mode for flood preparations, and for the timing and location of infrastructure investments. [Figure: non-stationary 95% interval estimates for the 100-year flood compared with the 95% interval estimated from data, with the average distribution of the 100-year flood.]
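
    A minimal sketch of the flow-area scaling law at the core of this model: regress log flood quantiles on log drainage area, so that an ungauged basin's quantile follows from its area. The hierarchical Bayesian structure, time-varying coefficients and climate covariates of the full model are omitted, and the data are synthetic.

    ```python
    # Fit log Q = log c + theta * log A by least squares and predict an
    # ungauged basin's 100-year flood from its drainage area alone.
    import numpy as np

    rng = np.random.default_rng(13)
    area = rng.uniform(100, 20000, 34)                      # drainage areas (km^2)
    q100 = 2.5 * area ** 0.7 * rng.lognormal(0, 0.15, 34)   # synthetic 100-yr floods

    theta, logc = np.polyfit(np.log(area), np.log(q100), 1)  # slope, intercept

    a_ungauged = 5000.0
    print(f"theta={theta:.2f}, predicted Q100 at {a_ungauged:.0f} km^2: "
          f"{np.exp(logc) * a_ungauged ** theta:.0f}")
    ```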

  5. Student Growth Percentiles Based on MIRT: Implications of Calibrated Projection. CRESST Report 842

    ERIC Educational Resources Information Center

    Monroe, Scott; Cai, Li; Choi, Kilchan

    2014-01-01

    This research concerns a new proposal for calculating student growth percentiles (SGP, Betebenner, 2009). In Betebenner (2009), quantile regression (QR) is used to estimate the SGPs. However, measurement error in the score estimates, which always exists in practice, leads to bias in the QR-based estimates (Shang, 2012). One way to address this…

  6. Estimation of Return Values of Wave Height: Consequences of Missing Observations

    ERIC Educational Resources Information Center

    Ryden, Jesper

    2008-01-01

    Extreme-value statistics is often used to estimate so-called return values (actually related to quantiles) for environmental quantities like wind speed or wave height. A basic method for estimation is the method of block maxima which consists in partitioning observations in blocks, where maxima from each block could be considered independent.…

  7. Local Composite Quantile Regression Smoothing for Harris Recurrent Markov Processes

    PubMed Central

    Li, Degui; Li, Runze

    2016-01-01

    In this paper, we study the local polynomial composite quantile regression (CQR) smoothing method for nonlinear and nonparametric models under the Harris recurrent Markov chain framework. The local polynomial CQR method is a robust alternative to the widely used local polynomial method, and has been well studied for stationary time series. Here, we relax the stationarity restriction on the model and allow the regressors to be generated by a general Harris recurrent Markov process, which includes both the stationary (positive recurrent) and nonstationary (null recurrent) cases. Under some mild conditions, we establish the asymptotic theory for the proposed local polynomial CQR estimator of the mean regression function, and show that the convergence rate of the estimator in the nonstationary case is slower than in the stationary case. Furthermore, a weighted-type local polynomial CQR estimator is provided to improve estimation efficiency, and a data-driven bandwidth selection is introduced to choose the optimal bandwidth involved in the nonparametric estimators. Finally, we present numerical studies examining the finite-sample performance of the developed methodology and theory. PMID:27667894

  8. Growth curves of preschool children in the northeast of iran: a population based study using quantile regression approach.

    PubMed

    Payande, Abolfazl; Tabesh, Hamed; Shakeri, Mohammad Taghi; Saki, Azadeh; Safarian, Mohammad

    2013-01-14

    Growth charts are widely used to assess children's growth status and can provide a trajectory of growth during the early, important months of life. The objective of this study is to construct growth charts and normal values of weight-for-age for children aged 0 to 5 years using a powerful and broadly applicable methodology, and to compare the results with the World Health Organization (WHO) references and the semi-parametric LMS method of Cole and Green. A total of 70,737 apparently healthy boys and girls aged 0 to 5 years were recruited in July 2004, over 20 days, from those attending community clinics for routine health checks as part of a national survey. Anthropometric measurements were made by trained health staff using WHO methodology. A nonparametric quantile regression method, based on local constant kernel estimation of conditional quantile curves, was used to estimate the curves and normal values. The weight-for-age growth curves for boys and girls aged 0 to 5 years were derived from a population of children living in the northeast of Iran. The results were similar to those obtained by the semi-parametric LMS method on the same data. In all age groups from 0 to 5 years, the median weights of children living in the northeast of Iran were lower than the corresponding values in the WHO reference data. The weight curves of boys were higher than those of girls in all age groups. The differences between the growth patterns of children living in the northeast of Iran and international ones necessitate the use of local and regional growth charts, as international normal values may not properly identify the populations at risk for growth problems among Iranian children. Quantile regression (QR), a flexible method that does not require restrictive assumptions, is proposed for estimating reference curves and normal values.
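
    A minimal sketch of a local constant kernel estimator of conditional quantile curves, the method used here, on simulated weight-for-age data; the kernel, bandwidth and growth model are illustrative choices, not those of the study.

    ```python
    # Kernel-weighted conditional quantiles: at each age x0, take the
    # tau-quantile of weight under Gaussian kernel weights in age.
    import numpy as np

    def kernel_quantile(x0, x, y, tau=0.5, h=0.25):
        """Kernel-weighted tau-quantile of y at x = x0 (Gaussian kernel)."""
        w = np.exp(-0.5 * ((x - x0) / h) ** 2)
        order = np.argsort(y)
        cw = np.cumsum(w[order]) / w.sum()        # weighted empirical CDF
        return y[order][np.searchsorted(cw, tau)]

    rng = np.random.default_rng(5)
    age = rng.uniform(0, 5, 3000)                                    # years
    weight = 3.3 + 2.3 * age ** 0.75 + rng.normal(0, 0.4 + 0.2 * age, 3000)

    grid = np.linspace(0.25, 4.75, 10)
    p50 = [kernel_quantile(a, age, weight, tau=0.50) for a in grid]  # median curve
    p97 = [kernel_quantile(a, age, weight, tau=0.97) for a in grid]  # upper reference curve
    print(np.round(p50, 1))
    ```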

  11. Air pollution and daily mortality in Erfurt, east Germany, 1980-1989.

    PubMed

    Spix, C; Heinrich, J; Dockery, D; Schwartz, J; Völksch, G; Schwinkowski, K; Cöllen, C; Wichmann, H E

    1993-11-01

    In Erfurt, Germany, unfavorable geography and emissions from coal burning lead to very high ambient pollution (up to about 4000 micrograms/m3 SO2 in 1980-89). To assess possible health effects of these exposures, total daily mortality was obtained for this same period. A multivariate model was fitted, including corrections for long-term fluctuations, influenza epidemics, and meteorology, before analyzing the effect of pollution. The best fit for pollution was obtained for log (SO2 daily mean) with a lag of 2 days. Daily mortality increased by 10% for an increase in SO2 from 23 to 929 micrograms/m3 (5% quantile to 95% quantile). A harvesting effect (fewer people die on a given day if more deaths occurred in the last 15 days) may modify this by +/- 2%. The effect for particulates (SP, 1988-89 only) was stronger than the effect of SO2. Log SP (daily mean) increasing from 15 micrograms/m3 to 331 micrograms/m3 (5% quantile to 95% quantile) was associated with a 22% increase in mortality. Depending on harvesting, the observable effect may lie between 14% and 27%. There is no indication of a threshold or synergism. The effects of air pollution are smaller than the effects of influenza epidemics and are of the same size as meteorologic effects. The results for the lower end of the dose range are in agreement with linear models fitted in studies of moderate air pollution and episode studies.

  12. Groundwater depth prediction in a shallow aquifer in north China by a quantile regression model

    NASA Astrophysics Data System (ADS)

    Li, Fawen; Wei, Wan; Zhao, Yong; Qiao, Jiale

    2017-01-01

    There is a close relationship between the groundwater level in a shallow aquifer and the surface ecological environment; hence, it is important to accurately simulate and predict the groundwater level in eco-environmental construction projects. The multiple linear regression (MLR) model is one of the most widely used methods to predict groundwater level (depth); however, the values predicted by this model reflect only the mean of the observations and cannot effectively fit extreme observations (outliers). The study reported here builds a prediction model of groundwater-depth dynamics in a shallow aquifer using the quantile regression (QR) method on the basis of observed groundwater depth and related factors. The proposed approach was applied to five sites in Tianjin city, north China, and the groundwater depth was calculated at different quantiles, from which the optimal quantile was screened out according to the box-plot method and compared to the values predicted by the MLR model. The results showed that the related factors at the five sites did not follow the standard normal distribution and that there were outliers in the precipitation and last-month (initial state) groundwater-depth factors, so the basic assumptions of the MLR model could not be satisfied, thereby causing errors. These conditions had no effect on the QR model, which described the distribution of the original data more faithfully and fitted the outliers with higher precision.

  13. Trends of VOC exposures among a nationally representative sample: Analysis of the NHANES 1988 through 2004 data sets

    PubMed Central

    Su, Feng-Chiao; Mukherjee, Bhramar; Batterman, Stuart

    2015-01-01

    Exposures to volatile organic compounds (VOCs) are ubiquitous due to emissions from personal, commercial and industrial products, but quantitative and representative information regarding long-term exposure trends is lacking. This study characterizes trends from 1988 to 2004 for the 15 VOCs measured in blood in five cohorts of the National Health and Nutrition Examination Survey (NHANES), a large and representative sample of U.S. adults. Trends were evaluated at various percentiles using linear quantile regression (QR) models, which were adjusted for solvent-related occupations and cotinine levels. Most VOCs showed decreasing trends at all quantiles, e.g., median exposures declined by 2.5 (m, p-xylene) to 6.4 (tetrachloroethene) percent per year over the 15-year period. Trends varied by VOC and quantile, and were grouped into three patterns: similar decreases at all quantiles (including benzene, toluene); most rapid decreases at upper quantiles (ethylbenzene, m, p-xylene, o-xylene, styrene, chloroform, tetrachloroethene); and fastest declines at central quantiles (1,4-dichlorobenzene). These patterns reflect changes in exposure sources, e.g., upper-percentile exposures may result mostly from occupational exposure, while lower-percentile exposures arise from general environmental sources. Both VOC emissions aggregated at the national level and VOC concentrations measured in ambient air have also declined substantially over the study period and are supportive of the exposure trends, although the NHANES data suggest the importance of indoor sources and personal activities on VOC exposures. While piecewise QR models suggest that exposures to several VOCs decreased little if at all during the 1990s, followed by more rapid decreases from 1999 to 2004, questions are raised concerning the reliability of the VOC data in several of the NHANES cohorts and their applicability as an exposure indicator, as demonstrated by the modest correlation between VOC levels in blood and personal air collected in the 1999/2000 cohort. Despite some limitations, the NHANES data provide a unique, long-term and direct measurement of VOC exposures and trends. PMID:25705111

  14. Regional flow duration curves: Geostatistical techniques versus multivariate regression

    USGS Publications Warehouse

    Pugliese, Alessio; Farmer, William H.; Castellarin, Attilio; Archfield, Stacey A.; Vogel, Richard M.

    2016-01-01

    A period-of-record flow duration curve (FDC) represents the relationship between the magnitude and frequency of daily streamflows. Prediction of FDCs is of great importance for locations characterized by sparse or missing streamflow observations. We present a detailed comparison of two methods capable of predicting an FDC at ungauged basins: (1) an adaptation of the geostatistical method Top-kriging, employing a linear weighted average of dimensionless empirical FDCs standardised with a reference streamflow value; and (2) regional multiple linear regression of streamflow quantiles, perhaps the most common method for the prediction of FDCs at ungauged sites. In particular, Top-kriging relies on a metric for expressing the similarity between catchments computed as the negative deviation of the FDC from a reference streamflow value, which we term the total negative deviation (TND). Comparisons of these two methods are made in 182 largely unregulated river catchments in the southeastern U.S. using a three-fold cross-validation algorithm. Our results reveal that the two methods perform similarly across flow regimes, with average Nash-Sutcliffe efficiencies of 0.566 and 0.662 (0.883 and 0.829 on log-transformed quantiles) for the geostatistical and linear regression models, respectively. The differences in the reproduction of FDCs occurred mostly for low flows with exceedance probability (i.e. duration) above 0.98.
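
    For reference, an empirical period-of-record FDC is just the sorted daily flows plotted against plotting-position exceedance probabilities. A minimal sketch with synthetic flows (Weibull plotting positions assumed):

    ```python
    # Empirical flow duration curve: sort daily flows in descending order
    # and pair them with Weibull plotting-position exceedance probabilities.
    import numpy as np

    rng = np.random.default_rng(6)
    daily_q = rng.lognormal(2.0, 1.0, size=3650)     # ten years of synthetic daily flows

    q_sorted = np.sort(daily_q)[::-1]                # descending: largest flow first
    n = q_sorted.size
    exceed_prob = np.arange(1, n + 1) / (n + 1)      # duration (exceedance probability)

    # Streamflow quantiles at selected durations, e.g. Q05 (high flow) to Q95 (low flow):
    for d in (0.05, 0.50, 0.95):
        print(f"Q{int(d * 100):02d} = {np.interp(d, exceed_prob, q_sorted):.2f}")
    ```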

  15. Accelerating Approximate Bayesian Computation with Quantile Regression: application to cosmological redshift distributions

    NASA Astrophysics Data System (ADS)

    Kacprzak, T.; Herbel, J.; Amara, A.; Réfrégier, A.

    2018-02-01

    Approximate Bayesian Computation (ABC) is a method to obtain a posterior distribution without a likelihood function, using simulations and a set of distance metrics. For that reason, it has recently been gaining popularity as an analysis tool in cosmology and astrophysics. Its drawback, however, is a slow convergence rate. We propose a novel method, which we call qABC, to accelerate ABC with quantile regression. In this method, we create a model of the quantiles of the distance measure as a function of the input parameters. This model is trained on a small number of simulations and estimates which regions of the prior space are likely to be accepted into the posterior; other regions are then immediately rejected. This procedure is repeated as more simulations become available. We apply it to the practical problem of estimating the redshift distribution of cosmological samples, using forward modelling developed in previous work. The qABC method converges to nearly the same posterior as basic ABC but uses only 20% as many simulations, achieving a fivefold gain in execution time for our problem. For other problems the acceleration rate may vary; it depends on how close the prior is to the final posterior. We discuss possible improvements and extensions to this method.
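
    A minimal sketch of the qABC idea on a toy problem: model a low quantile of the ABC distance as a function of the parameter, then exclude parameter regions whose predicted distance quantile already exceeds the acceptance threshold. A gradient-boosted quantile regressor stands in for the paper's quantile-regression model, and the simulator, threshold and settings are invented.

    ```python
    # Quantile-regression screening for ABC: reject prior regions whose
    # predicted distance quantile exceeds the acceptance threshold.
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    rng = np.random.default_rng(8)

    def simulate_distance(theta):
        """Toy simulator: ABC distance between simulated and 'observed' data."""
        data = rng.normal(theta, 1.0, size=50)
        return abs(data.mean() - 0.7)               # observed summary assumed to be 0.7

    theta_train = rng.uniform(-3, 3, 300)           # small pilot set of simulations
    d_train = np.array([simulate_distance(t) for t in theta_train])

    qreg = GradientBoostingRegressor(loss="quantile", alpha=0.25)  # model the 25% distance quantile
    qreg.fit(theta_train[:, None], d_train)

    eps = 0.15                                      # ABC acceptance threshold
    theta_grid = np.linspace(-3, 3, 61)
    keep = qreg.predict(theta_grid[:, None]) < eps  # only these regions get further simulations
    print("retained fraction of prior:", keep.mean())
    ```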

  16. Regional maximum rainfall analysis using L-moments at the Titicaca Lake drainage, Peru

    NASA Astrophysics Data System (ADS)

    Fernández-Palomino, Carlos Antonio; Lavado-Casimiro, Waldo Sven

    2017-08-01

    The present study investigates the application of the index-flood L-moments-based regional frequency analysis procedure (RFA-LM) to the annual maximum 24-h rainfall (AM) of 33 rainfall gauge stations (RGs) to estimate rainfall quantiles in the Titicaca Lake drainage (TL), Peru. The study region was chosen because it is characterised by frequent floods that affect agricultural production and infrastructure. First, detailed quality analyses and verification of the RFA-LM assumptions were conducted, employing different tests for outlier verification, homogeneity, stationarity, and serial independence. The application of the RFA-LM procedure then allowed us to treat the TL as a single, hydrologically homogeneous region in terms of its maximum rainfall frequency, which can be modelled by a generalised normal (GNO) distribution, chosen according to the Z goodness-of-fit test, the L-moment ratio diagram, and an additional evaluation of the precision of the regional growth curve. Given the low density of RGs in the TL, it was important to produce maps of the AM design quantiles estimated with RFA-LM; the ordinary kriging (OK) interpolation technique was therefore used. These maps will be a useful tool for determining AM quantiles at any point of interest for hydrologists in the region.

  17. Estimating tree crown widths for the primary Acadian species in Maine

    Treesearch

    Matthew B. Russell; Aaron R. Weiskittel

    2012-01-01

    In this analysis, data for seven conifer and eight hardwood species were gathered from across the state of Maine for estimating tree crown widths. Maximum and largest crown width equations were developed using tree diameter at breast height as the primary predicting variable. Quantile regression techniques were used to estimate the maximum crown width and a constrained...

  18. Locally Weighted Score Estimation for Quantile Classification in Binary Regression Models

    PubMed Central

    Rice, John D.; Taylor, Jeremy M. G.

    2016-01-01

    One common use of binary response regression methods is classification based on an arbitrary probability threshold dictated by the particular application. Since this threshold is given a priori, it is sensible to incorporate it into the estimation procedure. Specifically, for the linear logistic model, we solve a set of locally weighted score equations, using a kernel-like weight function centered at the threshold. The bandwidth for the weight function is selected by cross-validation of a novel hybrid loss function that combines classification error and a continuous measure of divergence between observed and fitted values; other possible cross-validation functions based on more common binary classification metrics are also examined. This work has much in common with robust estimation, but differs from previous approaches in this area in its focus on prediction, specifically classification into high- and low-risk groups. Simulation results are given showing the reduction in error rates that can be obtained with this method when compared with maximum likelihood estimation, especially under certain forms of model misspecification. Analysis of a melanoma data set is presented to illustrate the use of the method in practice. PMID:28018492
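
    A minimal sketch of the weighting idea, assuming a Gaussian kernel and a fixed bandwidth: logistic-regression fits are iteratively reweighted so that observations whose fitted probabilities lie near the classification threshold dominate the fit. This illustrates the concept only; it is not the authors' exact score equations or cross-validated bandwidth selector.

    ```python
    # Iteratively reweighted logistic regression with a kernel weight
    # centred at the classification threshold c.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(9)
    n = 1000
    X = rng.normal(size=(n, 2))
    p_true = 1 / (1 + np.exp(-(0.5 + 1.2 * X[:, 0] - 0.8 * X[:, 1])))
    y = rng.uniform(size=n) < p_true

    c, h = 0.3, 0.2                    # classification threshold and kernel bandwidth
    model = LogisticRegression().fit(X, y)
    for _ in range(5):                 # iterate: weights depend on fitted probabilities
        p_hat = model.predict_proba(X)[:, 1]
        w = np.exp(-0.5 * ((p_hat - c) / h) ** 2)   # Gaussian kernel centred at c
        model = LogisticRegression().fit(X, y, sample_weight=w)

    print("high-risk fraction:", (model.predict_proba(X)[:, 1] > c).mean())
    ```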

  19. The 2011 heat wave in Greater Houston: Effects of land use on temperature.

    PubMed

    Zhou, Weihe; Ji, Shuang; Chen, Tsun-Hsuan; Hou, Yi; Zhang, Kai

    2014-11-01

    Effects of land use on temperatures during severe heat waves have rarely been studied. This paper examines land use-temperature associations during the 2011 heat wave in Greater Houston. We obtained high-resolution satellite-derived land use data from the US National Land Cover Database, and temperature observations at 138 weather stations from Weather Underground, Inc. (WU) during August 2011, the hottest month in Houston since 1889. Land use regression and quantile regression methods were applied to the monthly averages of daily maximum/mean/minimum temperatures and 114 land use-related predictors. Although the selected variables vary with the temperature metric, distance to the coastline consistently appears in all models. Other variables are generally related to high developed intensity, open water or wetlands. In addition, our quantile regression analysis shows that distance to the coastline and high-developed-intensity areas have larger impacts on daily average temperatures at higher quantiles, and open water area has greater impacts on daily minimum temperatures at lower quantiles. By applying both land use regression and quantile regression to a recent heat wave in one of the largest US metropolitan areas, this paper provides a new perspective on the impacts of land use on temperatures. Our models can provide estimates of heat exposure for epidemiological studies, and our findings can be combined with demographic variables, air conditioning and relevant disease information to identify 'hot spots' of population vulnerability for public health interventions to reduce heat-related health effects during heat waves. Copyright © 2014 Elsevier Inc. All rights reserved.

  20. Comparability of a short food frequency questionnaire to assess diet quality: the DISCOVER study.

    PubMed

    Dehghan, Mahshid; Ge, Yipeng; El Sheikh, Wala; Bawor, Monica; Rangarajan, Sumathy; Dennis, Brittany; Vair, Judith; Sholer, Heather; Hutchinson, Nichole; Iordan, Elizabeth; Mackie, Pam; Samaan, Zainab

    2017-09-01

    This study aims to assess the comparability of a short food frequency questionnaire (SFFQ) used in the Determinants of Suicide: Conventional and Emergent Risk Study (DISCOVER Study) with a validated comprehensive FFQ (CFFQ). A total of 127 individuals completed the SFFQ and the CFFQ. Healthy eating was measured using the Healthy Eating Score (HES). Estimated food intake and healthy eating assessed by the SFFQ were compared with the CFFQ. For most food groups and the HES, Spearman's rank correlation coefficients between the two FFQs exceeded .60; for macronutrients, the correlations exceeded .40. Cross-classification quantile analysis showed that between 46% and 81% of participants were classified into the exact same quantiles, while 10% or fewer were misclassified into opposite quantiles. The Bland-Altman plots showed an acceptable level of agreement between the two dietary measurement methods. The SFFQ can be used to rank Canadians with psychiatric disorders based on their dietary intake.
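
    A minimal sketch of the quantile cross-classification check reported here, on simulated stand-in intakes: rank participants into quartiles by each instrument, then count exact agreement and opposite-quartile misclassification.

    ```python
    # Cross-classification of two dietary instruments into quartiles.
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(10)
    cffq = rng.lognormal(5.0, 0.5, 127)              # reference intake (comprehensive FFQ)
    sffq = cffq * rng.lognormal(0.0, 0.3, 127)       # short FFQ: noisy version of the reference

    k = 4                                            # quartiles
    qa = pd.qcut(cffq, k, labels=False)
    qb = pd.qcut(sffq, k, labels=False)

    exact = (qa == qb).mean()
    opposite = ((qa == 0) & (qb == k - 1) | (qa == k - 1) & (qb == 0)).mean()
    print(f"exact agreement: {exact:.0%}, opposite quantiles: {opposite:.0%}")
    ```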

  1. Relationship between Urbanization and Cancer Incidence in Iran Using Quantile Regression.

    PubMed

    Momenyan, Somayeh; Sadeghifar, Majid; Sarvi, Fatemeh; Khodadost, Mahmoud; Mosavi-Jarrahi, Alireza; Ghaffari, Mohammad Ebrahim; Sekhavati, Eghbal

    2016-01-01

    Quantile regression is an efficient method for predicting and estimating the relationship between explanatory variables and percentile points of the response distribution, particularly for extreme percentiles of the distribution. To study the relationship between urbanization and cancer morbidity, we here applied quantile regression. This cross-sectional study was conducted for 9 cancers in 345 cities in 2007 in Iran. Data were obtained from the Ministry of Health and Medical Education, and the relationship between urbanization and cancer morbidity was investigated using quantile regression and least squares regression. Fitted models were compared using the AIC criterion. R (3.0.1) software and the Quantreg package were used for statistical analysis. With the quantile regression model, all percentiles for breast, colorectal, prostate, lung and pancreas cancers demonstrated increasing incidence rates with urbanization. The maximum increase for breast cancer was in the 90th percentile (β=0.13, p-value<0.001); for colorectal cancer, the 75th percentile (β=0.048, p-value<0.001); for prostate cancer, the 95th percentile (β=0.55, p-value<0.001); for lung cancer, the 95th percentile (β=0.52, p-value=0.006); and for pancreas cancer, the 10th percentile (β=0.011, p-value<0.001). For gastric, esophageal and skin cancers, the incidence rate decreased with increasing urbanization. The maximum decrease for gastric cancer was in the 90th percentile (β=0.003, p-value<0.001); for esophageal cancer, the 95th (β=0.04, p-value=0.4); and for skin cancer also the 95th (β=0.145, p-value=0.071). The AIC showed that for upper percentiles the fit of quantile regression was better than that of least squares regression. According to the results of this study, the significant impact of urbanization on cancer morbidity requires more effort and planning by policymakers and administrators to reduce risk factors such as pollution in urban areas and to ensure proper nutrition recommendations are made.

  2. Effect of threatening life experiences and adverse family relations in ulcerative colitis: analysis using structural equation modeling and comparison with Crohn's disease.

    PubMed

    Slonim-Nevo, Vered; Sarid, Orly; Friger, Michael; Schwartz, Doron; Sergienko, Ruslan; Pereg, Avihu; Vardi, Hillel; Singer, Terri; Chernin, Elena; Greenberg, Dan; Odes, Shmuel

    2017-05-01

    We published that threatening life experiences and adverse family relations impact Crohn's disease (CD) adversely. In this study, we examine the influence of these stressors in ulcerative colitis (UC). Patients completed demography, economic status (ES), the Patient-Simple Clinical Colitis Activity Index (P-SCCAI), the Short Inflammatory Bowel Disease Questionnaire (SIBDQ), the Short-Form Health Survey (SF-36), the Brief Symptom Inventory (BSI), the Family Assessment Device (FAD), and the List of Threatening Life Experiences (LTE). Analysis included multiple linear and quantile regressions and structural equation modeling, with comparison against CD. UC patients (N=148, age 47.55±16.04 years, 50.6% women) had scores [median (interquartile range)] as follows: P-SCCAI, 2 (0.3-4.8); FAD, 1.8 (1.3-2.2); LTE, 1.0 (0-2.0); SF-36 Physical Health, 49.4 (36.8-55.1); SF-36 Mental Health, 45 (33.6-54.5); Brief Symptom Inventory-Global Severity Index (GSI), 0.5 (0.2-1.0). SIBDQ was 49.76±14.91. There were significant positive associations for LTE and P-SCCAI (25, 50, 75% quantiles), FAD and SF-36 Mental Health, FAD and LTE with GSI (50, 75, 90% quantiles), and ES with SF-36 and SIBDQ. The negative associations were as follows: LTE with SF-36 Physical/Mental Health, SIBDQ with FAD and LTE, ES with GSI (all quantiles), and P-SCCAI (75, 90% quantiles). In the structural equation modeling analysis, LTE impacted ES negatively and ES impacted GSI negatively; LTE impacted GSI positively and GSI impacted P-SCCAI positively. In a split model, ES had a greater effect on GSI in UC than in CD, whereas other path magnitudes were similar. Threatening life experiences, adverse family relations, and poor ES make UC patients less healthy both physically and mentally. The impact of ES is worse in UC than in CD.

  3. A quantile regression approach can reveal the effect of fruit and vegetable consumption on plasma homocysteine levels.

    PubMed

    Verly, Eliseu; Steluti, Josiane; Fisberg, Regina Mara; Marchioni, Dirce Maria Lobo

    2014-01-01

    A reduction in homocysteine concentration due to the use of supplemental folic acid is well recognized, although evidence of the same effect for natural folate sources, such as fruits and vegetables (FV), is lacking. Traditional statistical approaches centred on the conditional mean cannot reveal whether such an effect differs across the distribution. As an alternative, quantile regression allows for the exploration of the effects of covariates through percentiles of the conditional distribution of the dependent variable. Our aim was to investigate how the association of FV intake with plasma total homocysteine (tHcy) differs across percentiles of the tHcy distribution using quantile regression. A cross-sectional population-based survey was conducted among 499 residents of Sao Paulo City, Brazil. The participants provided food intake data and fasting blood samples. Fruit and vegetable intake was predicted by adjusting for day-to-day variation using a proper measurement error model. We performed a quantile regression to verify the association between tHcy and the predicted FV intake. The predicted values of tHcy for each percentile model were calculated considering an increase of 200 g in the FV intake. The results showed that tHcy was inversely associated with FV intake when assessed by linear regression, whereas the association varied across the distribution under quantile regression: the relationship with FV consumption was inverse and significant for almost all percentiles of tHcy, with coefficients increasing in magnitude as the percentile of tHcy increased. A simulated increase of 200 g in FV intake could decrease tHcy levels at all percentiles, but the higher percentiles of tHcy benefited more. Using this innovative statistical approach, the study confirms that the effect of FV intake on lowering tHcy levels depends on the level of tHcy itself. From a public health point of view, encouraging people to increase FV intake would especially benefit people with high levels of tHcy.

  4. Normalization Approaches for Removing Systematic Biases Associated with Mass Spectrometry and Label-Free Proteomics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Callister, Stephen J.; Barry, Richard C.; Adkins, Joshua N.

    2006-02-01

    Central tendency, linear regression, locally weighted regression, and quantile techniques were investigated for normalization of peptide abundance measurements obtained from high-throughput liquid chromatography-Fourier transform ion cyclotron resonance mass spectrometry (LC-FTICR MS). Arbitrary abundances of peptides were obtained from three sample sets, including a standard protein sample, two Deinococcus radiodurans samples taken from different growth phases, and two mouse striatum samples from control and methamphetamine-stressed mice (strain C57BL/6). The selected normalization techniques were evaluated in both the absence and presence of biological variability by estimating extraneous variability prior to and following normalization. Prior to normalization, replicate runs from each sample set were observed to be statistically different, while following normalization replicate runs were no longer statistically different. Although all techniques reduced systematic bias, assigned ranks among the techniques revealed significant trends. For most LC-FTICR MS analyses, linear regression normalization ranked either first or second among the four techniques, suggesting that this technique was more generally suitable for reducing systematic biases.
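
    One of the compared techniques, quantile normalization, forces every replicate run to share a common abundance distribution by replacing each value with the across-run mean at its rank. A minimal sketch on synthetic peptide abundances:

    ```python
    # Quantile normalization of a peptides-by-runs abundance matrix.
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(11)
    # Rows = peptides, columns = replicate LC-MS runs with systematic offsets.
    runs = pd.DataFrame({
        "run1": rng.lognormal(10.0, 1.0, 500),
        "run2": rng.lognormal(10.4, 1.0, 500),   # biased high
        "run3": rng.lognormal(9.7, 1.0, 500),    # biased low
    })

    ranks = runs.rank(method="first").astype(int) - 1          # rank of each peptide within its run
    mean_by_rank = np.sort(runs.values, axis=0).mean(axis=1)   # reference distribution
    normalized = runs.apply(lambda col: mean_by_rank[ranks[col.name]])

    print(runs.median().round(0).to_dict(), "->", normalized.median().round(0).to_dict())
    ```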

  5. Data quantile-quantile plots: quantifying the time evolution of space climatology

    NASA Astrophysics Data System (ADS)

    Tindale, Elizabeth; Chapman, Sandra

    2017-04-01

    The solar wind is inherently variable across a wide range of spatio-temporal scales; embedded in the flow are the signatures of distinct non-linear physical processes, from evolving turbulence to the dynamical solar corona. In-situ satellite observations of solar wind magnetic field and velocity are at minute and sub-minute time resolution and now extend over several solar cycles. Each solar cycle is unique, and the space climatology challenge is to quantify how solar wind variability changes within, and across, each distinct solar cycle, and how this in turn drives space weather at Earth. We will demonstrate a novel statistical method, that of data-data quantile-quantile (DQQ) plots, which quantifies how the underlying statistical distribution of a given observable changes in time. Importantly, this method does not require any assumptions concerning the underlying functional form of the distribution and can identify multi-component behaviour that is changing in time. It can be used to determine when a sub-range of a given observable is undergoing a change in statistical distribution, or when only the moments of the distribution change while the functional form of the underlying distribution does not. The method is quite general; for this application we use data from the WIND satellite to compare the solar wind across the minima and maxima of solar cycles 23 and 24 [1], and how these changes are manifest in parameters that quantify coupling to the Earth's magnetosphere. [1] Tindale, E., and S.C. Chapman (2016), Geophys. Res. Lett., 43(11), doi: 10.1002/2016GL068920.
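
    A minimal sketch of a data-data QQ comparison: plot matching quantiles of the same observable from two epochs against one another, with no distributional assumptions. The log-normal samples below are synthetic stand-ins for solar wind observables.

    ```python
    # Data-data quantile-quantile comparison of two epochs of one observable.
    import numpy as np

    rng = np.random.default_rng(12)
    epoch_a = rng.lognormal(1.6, 0.5, 5000)     # e.g. solar-wind |B| in one cycle phase
    epoch_b = rng.lognormal(1.7, 0.6, 4000)     # same observable in another phase

    probs = np.linspace(0.01, 0.99, 99)
    qq = np.column_stack([np.quantile(epoch_a, probs), np.quantile(epoch_b, probs)])

    # Points on the 1:1 line mean identical distributions; a uniform offset or
    # slope change flags a shift in moments, while local departures flag a
    # change confined to a sub-range of the observable.
    print(qq[[0, 49, 98]].round(2))             # 1st, 50th and 99th percentile pairs
    ```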

  6. Differences in BMI z-Scores between Offspring of Smoking and Nonsmoking Mothers: A Longitudinal Study of German Children from Birth through 14 Years of Age

    PubMed Central

    Fenske, Nora; Müller, Manfred J.; Plachta-Danielzik, Sandra; Keil, Thomas; Grabenhenrich, Linus; von Kries, Rüdiger

    2014-01-01

    Background: Children of mothers who smoked during pregnancy have a lower birth weight but have a higher chance to become overweight during childhood. Objectives: We followed children longitudinally to assess the age when higher body mass index (BMI) z-scores became evident in the children of mothers who smoked during pregnancy, and to evaluate the trajectory of changes until adolescence. Methods: We pooled data from two German cohort studies that included repeated anthropometric measurements until 14 years of age and information on smoking during pregnancy and other risk factors for overweight. We used longitudinal quantile regression to estimate age- and sex-specific associations between maternal smoking and the 10th, 25th, 50th, 75th, and 90th quantiles of the BMI z-score distribution in study participants from birth through 14 years of age, adjusted for potential confounders. We used additive mixed models to estimate associations with mean BMI z-scores. Results: Mean and median (50th quantile) BMI z-scores at birth were smaller in the children of mothers who smoked during pregnancy compared with children of nonsmoking mothers, but BMI z-scores were significantly associated with maternal smoking beginning at the age of 4–5 years, and differences increased over time. For example, the difference in the median BMI z-score between the daughters of smokers versus nonsmokers was 0.12 (95% CI: 0.01, 0.21) at 5 years, and 0.30 (95% CI: 0.08, 0.39) at 14 years of age. For lower BMI z-score quantiles, the association with smoking was more pronounced in girls, whereas in boys the association was more pronounced for higher BMI z-score quantiles. Conclusions: A clear difference in BMI z-score (mean and median) between children of smoking and nonsmoking mothers emerged at 4–5 years of age. The shape and size of age-specific effect estimates for maternal smoking during pregnancy varied by age and sex across the BMI z-score distribution. Citation: Riedel C, Fenske N, Müller MJ, Plachta-Danielzik S, Keil T, Grabenhenrich L, von Kries R. 2014. Differences in BMI z-scores between offspring of smoking and nonsmoking mothers: a longitudinal study of German children from birth through 14 years of age. Environ Health Perspect 122:761–767; http://dx.doi.org/10.1289/ehp.1307139 PMID:24695368

  7. Extreme Quantile Estimation in Binary Response Models

    DTIC Science & Technology

    1990-03-01


  8. Socioeconomic and ethnic inequalities in exposure to air and noise pollution in London.

    PubMed

    Tonne, Cathryn; Milà, Carles; Fecht, Daniela; Alvarez, Mar; Gulliver, John; Smith, James; Beevers, Sean; Ross Anderson, H; Kelly, Frank

    2018-06-01

    Transport-related air and noise pollution, exposures linked to adverse health outcomes, varies within cities, potentially resulting in exposure inequalities. Relatively little is known regarding inequalities in personal exposure to air pollution or transport-related noise. Our objectives were to quantify socioeconomic and ethnic inequalities in London in 1) air pollution exposure at residence compared to personal exposure; and 2) transport-related noise at residence from different sources. We used individual-level data from the London Travel Demand Survey (n = 45,079) between 2006 and 2010. We modeled residential (CMAQ-urban) and personal (London Hybrid Exposure Model) particulate matter <2.5 μm and nitrogen dioxide (NO2), road-traffic noise at residence (TRANEX) and identified those within 50 dB noise contours of railways and Heathrow airport. We analyzed relationships between household income, area-level income deprivation and ethnicity with air and noise pollution using quantile and logistic regression. We observed inverse patterns in inequalities in air pollution when estimated at residence versus personal exposure with respect to household income (categorical, 8 groups). Compared to the lowest income group (<£10,000), the highest group (>£75,000) had lower residential NO2 (-1.3 (95% CI -2.1, -0.6) μg/m3 in the 95th exposure quantile) but higher personal NO2 exposure (1.9 (95% CI 1.6, 2.3) μg/m3 in the 95th quantile), which was driven largely by transport mode and duration. Inequalities in residential exposure to NO2 with respect to area-level deprivation were larger at lower exposure quantiles (e.g. estimate for NO2 5.1 (95% CI 4.6, 5.5) at quantile 0.15 versus 1.9 (95% CI 1.1, 2.6) at quantile 0.95), reflecting low-deprivation, high residential NO2 areas in the city centre. Air pollution exposure at residence consistently overestimated personal exposure; this overestimation varied with age, household income, and area-level income deprivation. Inequalities in road traffic noise were generally small. In logistic regression models, the odds of living within a 50 dB contour of aircraft noise were highest in individuals with the highest household income, white ethnicity, and with the lowest area-level income deprivation. Odds of living within a 50 dB contour of rail noise were 19% (95% CI 3, 37) higher for black compared to white individuals. Socioeconomic inequalities in air pollution exposure were different for modeled residential versus personal exposure, which has important implications for environmental justice and confounding in epidemiology studies. Exposure misclassification was dependent on several factors related to health, a potential source of bias in epidemiological studies. Quantile regression revealed that socioeconomic and ethnic inequalities in air pollution are often not uniform across the exposure distribution. Copyright © 2018 Elsevier Ltd. All rights reserved.

  9. Patient characteristics associated with differences in radiation exposure from pediatric abdomen-pelvis CT scans: a quantile regression analysis.

    PubMed

    Cooper, Jennifer N; Lodwick, Daniel L; Adler, Brent; Lee, Choonsik; Minneci, Peter C; Deans, Katherine J

    2017-06-01

    Computed tomography (CT) is a widely used diagnostic tool in pediatric medicine. However, due to concerns regarding radiation exposure, it is essential to identify patient characteristics associated with a higher radiation burden from CT imaging, in order to target dose-reduction efforts more effectively. Our objective was to identify the effects of various demographic and clinical patient characteristics on radiation exposure from single abdomen/pelvis CT scans in children. CT scans performed at our institution between January 2013 and August 2015 in patients under 16 years of age were processed using a software tool that estimates patient-specific organ and effective doses and merges these estimates with data from the electronic health record and billing record. Quantile regression models at the 50th, 75th, and 90th percentiles were used to estimate the effects of patients' demographic and clinical characteristics on effective dose. 2390 abdomen/pelvis CT scans (median effective dose 1.52 mSv) were included. Of all characteristics examined, only older age, female gender, higher BMI, and whether the scan was a multiphase exam or an exam that required repeating for movement were significant predictors of higher effective dose at each quantile examined (all p<0.05). The effects of obesity and multiphase or repeat scanning on effective dose were magnified in higher-dose scans. Older age, female gender, obesity, and multiphase or repeat scanning are all associated with increased effective dose from abdomen/pelvis CT. Targeted efforts to reduce dose from abdominal CT in these groups should be undertaken. Copyright © 2017 Elsevier Ltd. All rights reserved.

  10. Quantile Functions, Convergence in Quantile, and Extreme Value Distribution Theory.

    DTIC Science & Technology

    1980-11-01

    Quantile functions are advocated by Parzen (1979) as providing an approach to probability-based data analysis. Gnanadesikan, R. (1968), Probability Plotting Methods for the Analysis of Data, Biometrika, 55, 1-17.

  11. The association of fatigue, pain, depression and anxiety with work and activity impairment in immune mediated inflammatory diseases.

    PubMed

    Enns, Murray W; Bernstein, Charles N; Kroeker, Kristine; Graff, Lesley; Walker, John R; Lix, Lisa M; Hitchon, Carol A; El-Gabalawy, Renée; Fisk, John D; Marrie, Ruth Ann

    2018-01-01

    Impairment in work function is a frequent outcome in patients with chronic conditions such as immune-mediated inflammatory diseases (IMID), depression and anxiety disorders. The personal and economic costs of work impairment in these disorders are immense. Symptoms of pain, fatigue, depression and anxiety are potentially remediable forms of distress that may contribute to work impairment in chronic health conditions such as IMID. The present study evaluated the association between pain [Medical Outcomes Study Pain Effects Scale], fatigue [Daily Fatigue Impact Scale], depression and anxiety [Hospital Anxiety and Depression Scale] and work impairment [Work Productivity and Activity Impairment Scale] in four patient populations: multiple sclerosis (n = 255), inflammatory bowel disease (n = 248), rheumatoid arthritis (n = 154) and a depression and anxiety group (n = 307), using quantile regression, controlling for the effects of sociodemographic factors, physical disability, and cognitive deficits. Each of pain, depression symptoms, anxiety symptoms, and fatigue individually showed significant associations with work absenteeism, presenteeism, and general activity impairment (quantile regression standardized estimates ranging from 0.3 to 1.0). When the distress variables were entered concurrently into the regression models, fatigue was a significant predictor of work and activity impairment in all models (quantile regression standardized estimates ranging from 0.2 to 0.5). These findings have important clinical implications for understanding the determinants of work impairment and for improving work-related outcomes in chronic disease.

  12. Modeling soil organic carbon with Quantile Regression: Dissecting predictors' effects on carbon stocks

    NASA Astrophysics Data System (ADS)

    Lombardo, Luigi; Saia, Sergio; Schillaci, Calogero; Mai, P. Martin; Huser, Raphaël

    2018-05-01

    Soil Organic Carbon (SOC) estimation is crucial to manage both natural and anthropic ecosystems and has recently been put under the magnifying glass after the 2016 Paris Agreement due to its relationship with greenhouse gases. Statistical applications have dominated SOC stock mapping at the regional scale so far. However, the community has hardly ever attempted to implement Quantile Regression (QR) to spatially predict the SOC distribution. In this contribution, we test QR to estimate SOC stock (0-30 cm depth) in the agricultural areas of a highly variable semi-arid region (Sicily, Italy, around 25,000 km²) by using topographic and remotely sensed predictors. We also compare the results with those from available SOC stock measurements. The QR models produced robust performances and allowed us to recognize dominant effects among the predictors at each considered quantile. This information, currently lacking, suggests that QR can discern predictor influences on SOC stock within specific sub-domains of each predictor. In this work, the predictive map generated at the median shows lower errors than the Joint Research Centre and International Soil Reference and Information Centre benchmarks. The results suggest the use of QR as a comprehensive and effective method to map SOC using legacy data in agro-ecosystems. The R code scripted in this study for QR is included.

  13. Modeling Longitudinal Data Containing Non-Normal Within Subject Errors

    NASA Technical Reports Server (NTRS)

    Feiveson, Alan; Glenn, Nancy L.

    2013-01-01

    The mission of the National Aeronautics and Space Administration’s (NASA) human research program is to advance safe human spaceflight. This involves conducting experiments, collecting data, and analyzing data. The data are longitudinal and result from a relatively small number of subjects, typically 10-20. A longitudinal study refers to an investigation where participant outcomes and possibly treatments are collected at multiple follow-up times. Standard statistical designs such as mean regression with random effects and mixed-effects regression are inadequate for such data because the population is typically not approximately normally distributed. Hence, more advanced data analysis methods are necessary. This research focuses on four such methods for longitudinal data analysis: the linear quantile mixed models (lqmm) recently proposed by Geraci and Bottai (2013), quantile regression, multilevel mixed-effects linear regression, and robust regression. This research also provides computational algorithms for longitudinal data that scientists can directly use for human spaceflight and other longitudinal data applications, then presents statistical evidence that verifies which method is best for specific situations. This advances the study of longitudinal data in a broad range of applications including applications in the sciences, technology, engineering and mathematics fields.
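
    The lqmm method itself is an R package; as a rough Python stand-in for two of the four methods compared, the sketch below fits a multilevel mixed-effects mean regression and a median regression to the same hypothetical long-format data. Note that the plain median regression, unlike lqmm, ignores within-subject correlation.

        # Illustrative Python stand-ins (the study used lqmm in R, among
        # other methods): a mixed-effects mean regression and a median
        # regression fit to hypothetical long-format data.
        import pandas as pd
        import statsmodels.formula.api as smf

        long_df = pd.read_csv("longitudinal.csv")  # one row per subject-visit

        # multilevel mixed-effects linear regression (random intercept per subject)
        mixed = smf.mixedlm("outcome ~ time + treatment", long_df,
                            groups=long_df["subject"]).fit()

        # median regression; robust to non-normal errors but, unlike lqmm,
        # it ignores the within-subject correlation
        median = smf.quantreg("outcome ~ time + treatment", long_df).fit(q=0.5)

        print(mixed.params)
        print(median.params)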

  14. The Income-Health Relationship 'Beyond the Mean': New Evidence from Biomarkers.

    PubMed

    Carrieri, Vincenzo; Jones, Andrew M

    2017-07-01

    The relationship between income and health is one of the most explored topics in health economics but less is known about this relationship at different points of the health distribution. Analysis based solely on the mean may miss important information in other parts of the distribution. This is especially relevant when clinical concern is focused on the tail of the distribution and when evaluating the income gradient at different points of the distribution and decomposing income-related inequalities in health is of interest. We use the unconditional quantile regression approach to analyse the income gradient across the entire distribution of objectively measured blood-based biomarkers. We apply an Oaxaca-Blinder decomposition at various quantiles of the biomarker distributions to analyse gender differentials in biomarkers and to measure the contribution of income (and other covariates) to these differentials. Using data from the Health Survey for England, we find a non-linear relationship between income and health and a strong gradient with respect to income at the highest quantiles of the biomarker distributions. We find that there is heterogeneity in the association between health and income across genders, which accounts for a substantial percentage of the gender differentials in observed health. Copyright © 2016 John Wiley & Sons, Ltd.

  15. Health tourism on the rise? Evidence from the Balance of Payments Statistics.

    PubMed

    Loh, Chung-Ping A

    2014-09-01

    The study assesses the presence and magnitude of global trends in health tourism using health-related travel (HRT) spending reported in the International Monetary Fund's Balance of Payments Statistics database. Linear regression and quantile regression are applied to estimate secular trends of the import and export of HRT based on a sample of countries from 2003 to 2009. The results show that from 2003 to 2009 the import and export of health tourism rose among countries with a high volume of such activities (accounting for the upper 40% of the countries), but not among those with a low volume. The uneven growth in health tourism has generated greater contrast between countries with high and low volumes of health tourism activities. However, the growth in the total import of health tourism did not outpace the population growth, implying that in general the population's tendency to engage in health tourism remained static.

  16. A random walk rule for phase I clinical trials.

    PubMed

    Durham, S D; Flournoy, N; Rosenberger, W F

    1997-06-01

    We describe a family of random walk rules for the sequential allocation of dose levels to patients in a dose-response study, or phase I clinical trial. Patients are sequentially assigned the next higher, same, or next lower dose level according to some probability distribution, which may be determined by ethical considerations as well as the patient's response. It is shown that one can choose these probabilities in order to center dose level assignments unimodally around any target quantile of interest. Estimation of the quantile is discussed; the maximum likelihood estimator and its variance are derived under a two-parameter logistic distribution, and the maximum likelihood estimator is compared with other nonparametric estimators. Random walk rules have clear advantages: they are simple to implement, and finite and asymptotic distribution theory is completely worked out. For a specific random walk rule, we compute finite and asymptotic properties and give examples of its use in planning studies. Having the finite distribution theory available and tractable obviates the need for elaborate simulation studies to analyze the properties of the design. The small sample properties of our rule, as determined by exact theory, compare favorably to those of the continual reassessment method, determined by simulation.
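
    A simulation sketch of one commonly cited member of this family, the biased-coin design, follows: after a toxicity the dose steps down, and after a non-toxicity it steps up with probability Γ/(1-Γ), which centers assignments around the dose whose toxicity probability is the target quantile Γ ≤ 0.5. The true toxicity probabilities below are invented for illustration.

        # Simulation sketch of a biased-coin random walk rule targeting the
        # dose with toxicity probability gamma (gamma <= 0.5).  The true
        # toxicity probabilities are invented.
        import random

        def biased_coin_walk(tox_probs, gamma=0.33, n_patients=50, seed=1):
            rng = random.Random(seed)
            b = gamma / (1.0 - gamma)   # escalation probability after no toxicity
            level, path = 0, []
            for _ in range(n_patients):
                path.append(level)
                if rng.random() < tox_probs[level]:            # toxicity: step down
                    level = max(level - 1, 0)
                elif rng.random() < b:                         # else maybe step up
                    level = min(level + 1, len(tox_probs) - 1)
            return path

        print(biased_coin_walk([0.05, 0.15, 0.30, 0.50, 0.70]))
        # assignments should cluster near the dose with ~33% toxicity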

  17. Do Our Means of Inquiry Match our Intentions?

    PubMed Central

    Petscher, Yaacov

    2016-01-01

    A key stage of the scientific method is the analysis of data, yet despite the variety of methods that are available to researchers they are most frequently distilled to a model that focuses on the average relation between variables. Although research questions are frequently conceived with broad inquiry in mind, most regression methods are limited in comprehensively evaluating how observed behaviors are related to each other. Quantile regression is a largely unknown yet well-suited analytic technique that is similar to traditional regression analysis but allows for a more systematic approach to understanding complex associations among observed phenomena in the psychological sciences. Data from the National Education Longitudinal Study of 1988/2000 are used to illustrate how quantile regression overcomes the limitations of average associations in linear regression by showing that psychological well-being and sex each differentially relate to reading achievement depending on one’s level of reading achievement. PMID:27486410

  18. Post-processing ECMWF precipitation and temperature ensemble reforecasts for operational hydrologic forecasting at various spatial scales

    NASA Astrophysics Data System (ADS)

    Verkade, J. S.; Brown, J. D.; Reggiani, P.; Weerts, A. H.

    2013-09-01

    The ECMWF temperature and precipitation ensemble reforecasts are evaluated for biases in the mean, spread and forecast probabilities, and how these biases propagate to streamflow ensemble forecasts. The forcing ensembles are subsequently post-processed to reduce bias and increase skill, and to investigate whether this leads to improved streamflow ensemble forecasts. Multiple post-processing techniques are used: quantile-to-quantile transform, linear regression with an assumption of bivariate normality and logistic regression. Both the raw and post-processed ensembles are run through a hydrologic model of the river Rhine to create streamflow ensembles. The results are compared using multiple verification metrics and skill scores: relative mean error, Brier skill score and its decompositions, mean continuous ranked probability skill score and its decomposition, and the ROC score. Verification of the streamflow ensembles is performed at multiple spatial scales: relatively small headwater basins, large tributaries and the Rhine outlet at Lobith. The streamflow ensembles are verified against simulated streamflow, in order to isolate the effects of biases in the forcing ensembles and any improvements therein. The results indicate that the forcing ensembles contain significant biases, and that these cascade to the streamflow ensembles. Some of the bias in the forcing ensembles is unconditional in nature; this was resolved by a simple quantile-to-quantile transform. Improvements in conditional bias and skill of the forcing ensembles vary with forecast lead time, amount, and spatial scale, but are generally moderate. The translation to streamflow forecast skill is further muted, and several explanations are considered, including limitations in the modelling of the space-time covariability of the forcing ensembles and the presence of storages.
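
    Of the post-processing techniques listed, the quantile-to-quantile transform is the simplest to sketch: each raw forecast value is mapped through the forecast climatology's empirical CDF and then through the inverse CDF of the observed climatology, which removes unconditional bias. The sketch below uses synthetic climatologies; it illustrates the general transform, not the study's implementation.

        # Sketch of the quantile-to-quantile transform on synthetic data.
        import numpy as np

        def quantile_to_quantile(raw, fcst_clim, obs_clim):
            fcst_sorted = np.sort(fcst_clim)
            # empirical non-exceedance probability of each raw value
            probs = np.searchsorted(fcst_sorted, raw, side="right") / fcst_sorted.size
            probs = np.clip(probs, 1e-6, 1 - 1e-6)
            return np.quantile(obs_clim, probs)

        rng = np.random.default_rng(0)
        fcst_clim = rng.gamma(2.0, 3.0, 5000)   # wet-biased forecast climatology
        obs_clim = rng.gamma(2.0, 2.0, 5000)    # observed climatology
        print(quantile_to_quantile(np.array([5.0, 20.0]), fcst_clim, obs_clim))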

  19. Effect of psychosocial stressors on patients with Crohn's disease: threatening life experiences and family relations.

    PubMed

    Slonim-Nevo, Vered; Sarid, Orly; Friger, Michael; Schwartz, Doron; Chernin, Elena; Shahar, Ilana; Sergienko, Ruslan; Vardi, Hillel; Rosenthal, Alexander; Mushkalo, Alexander; Dizengof, Vitaly; Ben-Yakov, Gil; Abu-Freha, Naim; Munteanu, Daniella; Gaspar, Nava; Eidelman, Leslie; Segal, Arik; Fich, Alexander; Greenberg, Dan; Odes, Shmuel

    2016-09-01

    Threatening life experiences and adverse family relations are major psychosocial stressors affecting mental and physical health in chronic illnesses, but their influence in Crohn's disease (CD) is unclear. We assessed whether these stressors would predict the psychological and medical condition of CD patients. Consecutive adult CD patients completed a series of instruments including demography, Patient Harvey-Bradshaw Index (P-HBI), Short Inflammatory Bowel Disease Questionnaire (SIBDQ), short-form survey instrument (SF-36), brief symptom inventory (BSI), family assessment device (FAD), and list of threatening life experiences (LTE). Associations of FAD and LTE with P-HBI, SIBDQ, SF-36, and BSI were examined by multiple linear and quantile regression analyses. The cohort included 391 patients, mean age 38.38±13.95 years, 59.6% women, with intermediate economic status. The median scores were as follows: P-HBI 4 (2-8), FAD 1.67 (1.3-2.1), LTE 1 (0-3), SF-36 physical health 43.75 (33.7-51.0), SF-36 mental health 42.99 (34.1-51.9), and BSI-Global Severity Index 0.81 (0.4-1.4). The SIBDQ was 47.27±13.9. LTE was associated with increased P-HBI in all quantiles and FAD in the 50% quantile. FAD and LTE were associated with reduced SIBDQ (P<0.001). Higher LTE was associated with lower SF-36 physical and mental health (P<0.001); FAD was associated with reduced mental health (P<0.001). FAD and LTE were associated positively with GSI in all quantiles; age was associated negatively. CD patients with more threatening life experiences and adverse family relations were less healthy both physically and mentally. Physicians offering patients sociopsychological therapy should relate to threatening life experiences and family relations.

  20. Quantitative evaluation of Alzheimer's disease

    NASA Astrophysics Data System (ADS)

    Duchesne, S.; Frisoni, G. B.

    2009-02-01

    We propose a single, quantitative metric called the disease evaluation factor (DEF) and assess its efficiency at estimating disease burden in normal, control subjects (CTRL) and probable Alzheimer's disease (AD) patients. The study group consisted of 75 patients with a diagnosis of probable AD and 75 age-matched normal CTRL without neurological or neuropsychological deficit. We calculated a reference eigenspace of MRI appearance from reference data, in which our CTRL and probable AD subjects were projected. We then calculated the multi-dimensional hyperplane separating the CTRL and probable AD groups. The DEF was estimated via a multidimensional weighted distance of eigencoordinates for a given subject and the CTRL group mean, along salient principal components forming the separating hyperplane. We used quantile plots, Kolmogorov-Smirnov and χ² tests to compare the DEF values and test that their distribution was normal. We used a linear discriminant test to separate CTRL from probable AD based on the DEF factor, and reached an accuracy of 87%. A quantitative biomarker in AD would act as an important surrogate marker of disease status and progression.

  1. Fine-tuning satellite-based rainfall estimates

    NASA Astrophysics Data System (ADS)

    Harsa, Hastuadi; Buono, Agus; Hidayat, Rahmat; Achyar, Jaumil; Noviati, Sri; Kurniawan, Roni; Praja, Alfan S.

    2018-05-01

    Rainfall datasets are available from various sources, including satellite estimates and ground observation. The locations of ground observations are sparsely scattered. Therefore, the use of satellite estimates is advantageous, because satellite estimates can provide data in places where ground observations are not present. In general, however, satellite estimates contain bias, since they are products of algorithms that transform the sensors' response into rainfall values. Another cause may be the number of ground observations used by the algorithms as the reference in determining the rainfall values. This paper describes the application of a bias correction method that modifies the satellite-based dataset by adding a number of ground observation locations that had not been used before by the algorithm. The bias correction was performed with a Quantile Mapping procedure between ground observation data and satellite estimates. Since Quantile Mapping requires the mean and standard deviation of both the reference data and the data being corrected, an Inverse Distance Weighting scheme was applied beforehand to the mean and standard deviation of the observation data to provide a spatial composite of these originally scattered values. This made it possible to provide a reference data point at the same location as each satellite estimate. The results show that the new dataset represents the rainfall values recorded by the ground observations statistically better than the previous dataset.
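
    The two-step procedure described here can be sketched as follows: inverse distance weighting interpolates the gauges' mean and standard deviation to a satellite pixel, and a mean/standard-deviation quantile mapping then rescales the satellite values. All coordinates and numbers are invented, and the parametric form of the mapping is our reading of the procedure, not the authors' code.

        # Sketch: IDW interpolation of gauge mean/std to a pixel, then a
        # mean/std quantile mapping of the satellite values at that pixel.
        import numpy as np

        def idw(target, gauges, values, power=2.0):
            d = np.linalg.norm(gauges - target, axis=1)
            w = 1.0 / np.maximum(d, 1e-9) ** power
            return np.sum(w * values) / np.sum(w)

        gauges = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
        gauge_mean = np.array([8.0, 6.5, 9.2])
        gauge_std = np.array([3.0, 2.4, 3.5])
        pixel = np.array([0.4, 0.3])

        mu_obs = idw(pixel, gauges, gauge_mean)
        sd_obs = idw(pixel, gauges, gauge_std)

        sat = np.array([4.0, 12.0, 20.0])   # satellite rainfall at the pixel
        corrected = mu_obs + sd_obs * (sat - sat.mean()) / sat.std()
        print(corrected)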

  2. Bayesian quantitative precipitation forecasts in terms of quantiles

    NASA Astrophysics Data System (ADS)

    Bentzien, Sabrina; Friederichs, Petra

    2014-05-01

    Ensemble prediction systems (EPS) for numerical weather predictions on the mesoscale are particularly developed to obtain probabilistic guidance for high impact weather. An EPS not only issues a deterministic future state of the atmosphere but a sample of possible future states. Ensemble postprocessing then translates such a sample of forecasts into probabilistic measures. This study focuses on probabilistic quantitative precipitation forecasts in terms of quantiles. Quantiles are particularly suitable to describe precipitation at various locations, since no assumption is required on the distribution of precipitation. The focus is on prediction during high-impact events and is related to the Volkswagen Stiftung-funded project WEX-MOP (Mesoscale Weather Extremes - Theory, Spatial Modeling and Prediction). Quantile forecasts are derived from the raw ensemble and via quantile regression. The neighborhood method and time-lagging are effective tools to inexpensively increase the ensemble spread, which results in more reliable forecasts especially for extreme precipitation events. Since an EPS provides a large number of potentially informative predictors, a variable selection is required in order to obtain a stable statistical model. A Bayesian formulation of quantile regression allows for inference about the selection of predictive covariates by the use of appropriate prior distributions. Moreover, the implementation of an additional process layer for the regression parameters accounts for spatial variations of the parameters. Bayesian quantile regression and its spatially adaptive extension are illustrated for the German-focused mesoscale weather prediction ensemble COSMO-DE-EPS, which has run (pre)operationally since December 2010 at the German Meteorological Service (DWD). Objective out-of-sample verification uses the quantile score (QS), a weighted absolute error between quantile forecasts and observations. The QS is a proper scoring function and can be decomposed into reliability, resolution and uncertainty parts. A quantile reliability plot gives detailed insights into the predictive performance of the quantile forecasts.
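
    The quantile score mentioned above (often called the pinball loss) is compact enough to state directly: for level τ, positive and negative errors are weighted by τ and 1-τ respectively, which makes the score proper for quantile forecasts. A small illustration with invented numbers:

        # The quantile score (pinball loss) for level tau; numbers invented.
        import numpy as np

        def quantile_score(y_obs, q_fcst, tau):
            diff = np.asarray(y_obs) - np.asarray(q_fcst)
            return np.mean(np.where(diff >= 0, tau * diff, (tau - 1) * diff))

        y = np.array([0.0, 2.0, 10.0, 4.0])     # observed precipitation
        q90 = np.array([5.0, 6.0, 8.0, 7.0])    # forecasts of the 0.9 quantile
        print(quantile_score(y, q90, 0.9))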

  3. Constructing inverse probability weights for continuous exposures: a comparison of methods.

    PubMed

    Naimi, Ashley I; Moodie, Erica E M; Auger, Nathalie; Kaufman, Jay S

    2014-03-01

    Inverse probability-weighted marginal structural models with binary exposures are common in epidemiology. Constructing inverse probability weights for a continuous exposure can be complicated by the presence of outliers, and the need to identify a parametric form for the exposure and account for nonconstant exposure variance. We explored the performance of various methods to construct inverse probability weights for continuous exposures using Monte Carlo simulation. We generated two continuous exposures and binary outcomes using data sampled from a large empirical cohort. The first exposure followed a normal distribution with homoscedastic variance. The second exposure followed a contaminated Poisson distribution, with heteroscedastic variance equal to the conditional mean. We assessed six methods to construct inverse probability weights using: a normal distribution, a normal distribution with heteroscedastic variance, a truncated normal distribution with heteroscedastic variance, a gamma distribution, a t distribution (1, 3, and 5 degrees of freedom), and a quantile binning approach (based on 10, 15, and 20 exposure categories). We estimated the marginal odds ratio for a single-unit increase in each simulated exposure in a regression model weighted by the inverse probability weights constructed using each approach, and then computed the bias and mean squared error for each method. For the homoscedastic exposure, the standard normal, gamma, and quantile binning approaches performed best. For the heteroscedastic exposure, the quantile binning, gamma, and heteroscedastic normal approaches performed best. Our results suggest that the quantile binning approach is a simple and versatile way to construct inverse probability weights for continuous exposures.
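
    As a sketch of one of the six weighting approaches compared (the standard normal density), the code below builds stabilized weights with a marginal normal density in the numerator and a covariate-conditional normal density in the denominator, then fits a weighted outcome model. Column names and the outcome model are hypothetical.

        # Sketch of stabilized inverse probability weights for a continuous
        # exposure A, with confounders L1, L2 and binary outcome Y (all
        # hypothetical columns).
        import numpy as np
        import pandas as pd
        import statsmodels.api as sm
        from scipy.stats import norm

        df = pd.read_csv("cohort.csv")
        X = sm.add_constant(df[["L1", "L2"]])

        den_fit = sm.OLS(df["A"], X).fit()
        f_den = norm.pdf(df["A"], loc=den_fit.fittedvalues,
                         scale=np.sqrt(den_fit.scale))
        f_num = norm.pdf(df["A"], loc=df["A"].mean(), scale=df["A"].std())
        sw = f_num / f_den                       # stabilized weights

        # weighted outcome model for the marginal effect of A
        msm = sm.GLM(df["Y"], sm.add_constant(df[["A"]]),
                     family=sm.families.Binomial(),
                     freq_weights=sw).fit()
        print(np.exp(msm.params["A"]))           # marginal odds ratio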

  4. Methods for estimating selected low-flow statistics and development of annual flow-duration statistics for Ohio

    USGS Publications Warehouse

    Koltun, G.F.; Kula, Stephanie P.

    2013-01-01

    This report presents the results of a study to develop methods for estimating selected low-flow statistics and for determining annual flow-duration statistics for Ohio streams. Regression techniques were used to develop equations for estimating 10-year recurrence-interval (10-percent annual-nonexceedance probability) low-flow yields, in cubic feet per second per square mile, with averaging periods of 1, 7, 30, and 90 days, and for estimating the yield corresponding to the long-term 80-percent duration flow. These equations, which estimate low-flow yields as a function of a streamflow-variability index, are based on previously published low-flow statistics for 79 long-term continuous-record streamgages with at least 10 years of data collected through water year 1997. When applied to the calibration dataset, average absolute percent errors for the regression equations ranged from 15.8 to 42.0 percent. The regression results have been incorporated into the U.S. Geological Survey (USGS) StreamStats application for Ohio (http://water.usgs.gov/osw/streamstats/ohio.html) in the form of a yield grid to facilitate estimation of the corresponding streamflow statistics in cubic feet per second. Logistic-regression equations also were developed and incorporated into the USGS StreamStats application for Ohio for selected low-flow statistics to help identify occurrences of zero-valued statistics. Quantiles of daily and 7-day mean streamflows were determined for annual and annual-seasonal (September–November) periods for each complete climatic year of streamflow-gaging station record for 110 selected streamflow-gaging stations with 20 or more years of record. The quantiles determined for each climatic year were the 99-, 98-, 95-, 90-, 80-, 75-, 70-, 60-, 50-, 40-, 30-, 25-, 20-, 10-, 5-, 2-, and 1-percent exceedance streamflows. Selected exceedance percentiles of the annual-exceedance percentiles were subsequently computed and tabulated to help facilitate consideration of the annual risk of exceedance or nonexceedance of annual and annual-seasonal-period flow-duration values. The quantiles are based on streamflow data collected through climatic year 2008.
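
    The per-year flow-duration computation described above reduces to grouped quantiles; a small pandas sketch follows, with a hypothetical input file and calendar years standing in for true climatic years.

        # pandas sketch of per-year flow-duration quantiles; the input file
        # is hypothetical, and calendar years stand in for climatic years.
        import pandas as pd

        daily = pd.read_csv("daily_flows.csv", parse_dates=["date"])
        daily["year"] = daily["date"].dt.year

        exceed_pcts = [99, 98, 95, 90, 80, 75, 70, 60, 50,
                       40, 30, 25, 20, 10, 5, 2, 1]
        qs = [1 - p / 100 for p in exceed_pcts]  # p% exceedance = (100-p)th pctile

        fdc = daily.groupby("year")["flow_cfs"].quantile(qs).unstack()
        fdc.columns = [f"{round((1 - q) * 100)}% exceedance" for q in fdc.columns]
        print(fdc.head())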

  5. Statistics of concentrations due to single air pollution sources to be applied in numerical modelling of pollutant dispersion

    NASA Astrophysics Data System (ADS)

    Tumanov, Sergiu

    A test of goodness of fit based on rank statistics was applied to prove the applicability of the Eggenberger-Polya discrete probability law to hourly SO₂ concentrations measured in the vicinity of single sources. To this end, the pollutant concentration was treated as an integer-valued quantity, which is acceptable if one properly chooses the unit of measurement (in this case μg m⁻³) and if account is taken of the limited accuracy of measurements. The results of the test being satisfactory, even in the range of upper quantiles, the Eggenberger-Polya law was used in association with numerical modelling to estimate statistical parameters, e.g. quantiles and cumulative probabilities of threshold concentrations being exceeded, in the grid points of a network covering the area of interest. This requires only accurate estimates of the means and variances of the concentration series, which can readily be obtained through routine air pollution dispersion modelling.
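
    Under the common identification of the Eggenberger-Polya law with the negative binomial (Polya) distribution, the final step, recovering quantiles and exceedance probabilities from a modelled mean and variance, can be sketched with scipy; the numbers below are invented.

        # Sketch, assuming the Eggenberger-Polya law is the negative
        # binomial (Polya) distribution: recover its parameters from a
        # modelled mean and variance, then read off quantiles and
        # exceedance probabilities.
        from scipy.stats import nbinom

        mean, var = 40.0, 120.0        # from dispersion modelling, in ug/m3
        p = mean / var                 # success probability (needs var > mean)
        r = mean**2 / (var - mean)     # size parameter
        law = nbinom(r, p)

        print(law.ppf(0.98))           # 98th-percentile concentration
        print(law.sf(100))             # probability of exceeding 100 ug/m3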

  6. Composite marginal quantile regression analysis for longitudinal adolescent body mass index data.

    PubMed

    Yang, Chi-Chuan; Chen, Yi-Hau; Chang, Hsing-Yi

    2017-09-20

    Childhood and adolescent overweight or obesity, which may be quantified through the body mass index (BMI), is strongly associated with adult obesity and other health problems. Motivated by the child and adolescent behaviors in long-term evolution (CABLE) study, we are interested in individual, family, and school factors associated with marginal quantiles of longitudinal adolescent BMI values. We propose a new method for composite marginal quantile regression analysis for longitudinal outcome data, which performs marginal quantile regressions at multiple quantile levels simultaneously. The proposed method extends the quantile regression coefficient modeling method introduced by Frumento and Bottai (Biometrics 2016; 72:74-84) to longitudinal data, accounting suitably for the correlation structure in longitudinal observations. A goodness-of-fit test for the proposed modeling is also developed. Simulation results show that the proposed method can be much more efficient than the analysis without taking correlation into account and the analysis performing separate quantile regressions at different quantile levels. The application to the longitudinal adolescent BMI data from the CABLE study demonstrates the practical utility of our proposal. Copyright © 2017 John Wiley & Sons, Ltd.

  7. Serum uric acid in U.S. adolescents: distribution and relationship to demographic characteristics and cardiovascular risk factors.

    PubMed

    Shatat, Ibrahim F; Abdallah, Rany T; Sas, David J; Hailpern, Susan M

    2012-07-01

    Despite being associated with multiple disease processes and cardiovascular outcomes, uric acid (UA) reference ranges for adolescents are lacking. We sought to describe the distribution of UA and its relationship to demographic, clinical, socioeconomic, and dietary factors among U.S. adolescents. A nationally representative subsample of 1,912 adolescents aged 13-18 years in NHANES 2005-2008 representing 19,888,299 adolescents was used for this study. Percentiles of the distribution of UA were estimated using quantile regression. Linear regression models examined the association of UA and demographic, socioeconomic, and dietary factors. Mean UA level was 5.14 ± 1.45 mg/dl. Mean UA increased with increasing age and was higher in non-Hispanic white race, male sex, higher body mass index (BMI) Z-score, and with higher systolic blood pressure. In fully adjusted linear regression models, sex, age, race, and BMI were independent determinants of higher UA. This study defines serum UA reference ranges for adolescents. Also, it reveals some intriguing relationships between UA and demographic and clinical characteristics that warrant further studies to examine the pathophysiological role of UA in different disease processes.

  8. Improving medium-range ensemble streamflow forecasts through statistical post-processing

    NASA Astrophysics Data System (ADS)

    Mendoza, Pablo; Wood, Andy; Clark, Elizabeth; Nijssen, Bart; Clark, Martyn; Ramos, Maria-Helena; Nowak, Kenneth; Arnold, Jeffrey

    2017-04-01

    Probabilistic hydrologic forecasts are a powerful source of information for decision-making in water resources operations. A common approach is the hydrologic model-based generation of streamflow forecast ensembles, which can be implemented to account for different sources of uncertainties - e.g., from initial hydrologic conditions (IHCs), weather forecasts, and hydrologic model structure and parameters. In practice, hydrologic ensemble forecasts typically have biases and spread errors stemming from errors in the aforementioned elements, resulting in a degradation of probabilistic properties. In this work, we compare several statistical post-processing techniques applied to medium-range ensemble streamflow forecasts obtained with the System for Hydromet Applications, Research and Prediction (SHARP). SHARP is a fully automated prediction system for the assessment and demonstration of short-term to seasonal streamflow forecasting applications, developed by the National Center for Atmospheric Research, University of Washington, U.S. Army Corps of Engineers, and U.S. Bureau of Reclamation. The suite of post-processing techniques includes linear blending, quantile mapping, extended logistic regression, quantile regression, ensemble analogs, and the generalized linear model post-processor (GLMPP). We assess and compare these techniques using multi-year hindcasts in several river basins in the western US. This presentation discusses preliminary findings about the effectiveness of the techniques for improving probabilistic skill, reliability, discrimination, sharpness and resolution.

  9. Streamflow distribution maps for the Cannon River drainage basin, southeast Minnesota, and the St. Louis River drainage basin, northeast Minnesota

    USGS Publications Warehouse

    Smith, Erik A.; Sanocki, Chris A.; Lorenz, David L.; Jacobsen, Katrin E.

    2017-12-27

    Streamflow distribution maps for the Cannon River and St. Louis River drainage basins were developed by the U.S. Geological Survey, in cooperation with the Legislative-Citizen Commission on Minnesota Resources, to illustrate relative and cumulative streamflow distributions. The Cannon River was selected to provide baseline data to assess the effects of potential surficial sand mining, and the St. Louis River was selected to determine the effects of ongoing Mesabi Iron Range mining. Each drainage basin (Cannon, St. Louis) was subdivided into nested drainage basins: the Cannon River was subdivided into 152 nested drainage basins, and the St. Louis River was subdivided into 353 nested drainage basins. For each smaller drainage basin, the estimated volumes of groundwater discharge (as base flow) and surface runoff flowing into all surface-water features were displayed under the following conditions: (1) extreme low-flow conditions, comparable to an exceedance-probability quantile of 0.95; (2) low-flow conditions, comparable to an exceedance-probability quantile of 0.90; (3) a median condition, comparable to an exceedance-probability quantile of 0.50; and (4) a high-flow condition, comparable to an exceedance-probability quantile of 0.02. Streamflow distribution maps were developed using flow-duration curve exceedance-probability quantiles in conjunction with Soil-Water-Balance model outputs; both the flow-duration curve and Soil-Water-Balance models were built upon previously published U.S. Geological Survey reports. The selected streamflow distribution maps provide a proactive water management tool for State cooperators by illustrating flow rates during a range of hydraulic conditions. Furthermore, after the nested drainage basins are highlighted in terms of surface-water flows, the streamflows can be evaluated in the context of meeting specific ecological flows under different flow regimes and potentially assist with decisions regarding groundwater and surface-water appropriations. Presented streamflow distribution maps are foundational work intended to support the development of additional streamflow distribution maps that include statistical constraints on the selected flow conditions.

  10. A data centred method to estimate and map how the local distribution of daily precipitation is changing

    NASA Astrophysics Data System (ADS)

    Chapman, Sandra; Stainforth, David; Watkins, Nick

    2014-05-01

    Estimates of how our climate is changing are needed locally in order to inform adaptation planning decisions. This requires quantifying the geographical patterns in changes at specific quantiles in distributions of variables such as daily temperature or precipitation. Here we focus on these local changes and on a method to transform daily observations of precipitation into patterns of local climate change. We develop a method[1] for analysing local climatic timeseries to assess which quantiles of the local climatic distribution show the greatest and most robust changes, to specifically address the challenges presented by daily precipitation data. We extract from the data quantities that characterize the changes in time of the likelihood of daily precipitation above a threshold and of the relative amount of precipitation in those days. Our method is a simple mathematical deconstruction of how the difference between two observations from two different time periods can be assigned to the combination of natural statistical variability and/or the consequences of secular climate change. This deconstruction facilitates an assessment of how fast different quantiles of precipitation distributions are changing. This involves both determining which quantiles and geographical locations show the greatest change but also, those at which any change is highly uncertain. We demonstrate this approach using E-OBS gridded data[2] timeseries of local daily precipitation from specific locations across Europe over the last 60 years. We treat geographical location and precipitation as independent variables and thus obtain as outputs the pattern of change at a given threshold of precipitation and with geographical location. This is model-independent, thus providing data of direct value in model calibration and assessment. Our results show regionally consistent patterns of systematic increase in precipitation on the wettest days, and of drying across all days, which is of potential value in adaptation planning. [1] S C Chapman, D A Stainforth, N W Watkins, 2013, On Estimating Local Long Term Climate Trends, Phil. Trans. R. Soc. A, 371 20120287; D. A. Stainforth, S. C. Chapman, N. W. Watkins, 2013, Mapping climate change in European temperature distributions, Environ. Res. Lett. 8, 034031 [2] Haylock, M.R., N. Hofstra, A.M.G. Klein Tank, E.J. Klok, P.D. Jones and M. New. 2008: A European daily high-resolution gridded dataset of surface temperature and precipitation. J. Geophys. Res (Atmospheres), 113, D20119
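
    The core comparison, how a given quantile differs between two observation periods and whether that difference exceeds sampling variability, can be sketched with a simple bootstrap; this is a generic illustration on synthetic data, not the authors' published method[1].

        # Bootstrap sketch: change in selected quantiles of daily
        # precipitation between an early and a late period, with a 90%
        # interval to flag quantiles where any change is highly uncertain.
        import numpy as np

        rng = np.random.default_rng(1)
        early = rng.gamma(0.8, 6.0, 30 * 365)   # synthetic daily amounts
        late = rng.gamma(0.8, 6.6, 30 * 365)

        taus = np.array([0.5, 0.9, 0.99])
        obs_change = np.quantile(late, taus) - np.quantile(early, taus)

        boot = np.empty((1000, taus.size))
        for b in range(boot.shape[0]):
            e = rng.choice(early, early.size, replace=True)
            l = rng.choice(late, late.size, replace=True)
            boot[b] = np.quantile(l, taus) - np.quantile(e, taus)

        lo, hi = np.percentile(boot, [5, 95], axis=0)
        print(obs_change, lo, hi)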

  11. Nonparametric Fine Tuning of Mixtures: Application to Non-Life Insurance Claims Distribution Estimation

    NASA Astrophysics Data System (ADS)

    Sardet, Laure; Patilea, Valentin

    When pricing a specific insurance premium, the actuary needs to evaluate the claims cost distribution for the warranty. Traditional actuarial methods use parametric specifications to model the claims distribution, like lognormal, Weibull and Pareto laws. Mixtures of such distributions improve the flexibility of the parametric approach and seem to be quite well-adapted to capture the skewness, the long tails, as well as the unobserved heterogeneity among the claims. In this paper, instead of looking for a finely tuned mixture with many components, we choose a parsimonious mixture modeling, typically a two- or three-component mixture. Next, we use the mixture cumulative distribution function (CDF) to transform data into the unit interval, where we apply a beta-kernel smoothing procedure. A bandwidth rule adapted to our methodology is proposed. Finally, the beta-kernel density estimate is back-transformed to recover an estimate of the original claims density. The beta-kernel smoothing provides an automatic fine-tuning of the parsimonious mixture and thus avoids inference in more complex mixture models with many parameters. We investigate the empirical performance of the new method in the estimation of the quantiles with simulated nonnegative data and the quantiles of the individual claims distribution in a non-life insurance application.
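
    A compact sketch of the pipeline, assuming the parsimonious mixture (here two lognormal components with invented parameters) has already been fitted: transform the claims through the mixture CDF, smooth on the unit interval with a Chen-type beta kernel, and back-transform. The bandwidth below is arbitrary rather than chosen by the paper's rule.

        # Sketch: mixture-CDF transform, beta-kernel smoothing on [0,1],
        # then back-transformation.  Parameters and bandwidth are invented.
        import numpy as np
        from scipy.stats import beta, lognorm

        w, s1, m1, s2, m2 = 0.7, 0.6, 6.0, 1.2, 8.0  # two lognormal components

        def mix_cdf(x):
            return (w * lognorm.cdf(x, s1, scale=np.exp(m1))
                    + (1 - w) * lognorm.cdf(x, s2, scale=np.exp(m2)))

        def mix_pdf(x):
            return (w * lognorm.pdf(x, s1, scale=np.exp(m1))
                    + (1 - w) * lognorm.pdf(x, s2, scale=np.exp(m2)))

        def beta_kernel_density(u_eval, u_data, b):
            # Chen-type boundary kernel on the unit interval
            return np.array([beta.pdf(u_data, u / b + 1, (1 - u) / b + 1).mean()
                             for u in np.atleast_1d(u_eval)])

        claims = lognorm.rvs(0.9, scale=np.exp(6.5), size=2000, random_state=0)
        u = mix_cdf(claims)                        # claims mapped to [0, 1]

        x_grid = np.linspace(claims.min(), claims.max(), 200)
        f_x = beta_kernel_density(mix_cdf(x_grid), u, b=0.05) * mix_pdf(x_grid)
        print(x_grid[np.argmax(f_x)])              # mode of smoothed density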

  12. A comparison of three approaches to non-stationary flood frequency analysis

    NASA Astrophysics Data System (ADS)

    Debele, S. E.; Strupczewski, W. G.; Bogdanowicz, E.

    2017-08-01

    Non-stationary flood frequency analysis (FFA) is applied to the statistical analysis of seasonal flow maxima from Polish and Norwegian catchments. Three non-stationary estimation methods, namely maximum likelihood (ML), two-stage (WLS/TS) and GAMLSS (generalized additive model for location, scale and shape parameters), are compared in the context of capturing the effect of non-stationarity on the estimation of time-dependent moments and design quantiles. The use of a multimodel approach is recommended to reduce the errors in the magnitude of quantiles due to model misspecification. The results of calculations based on observed seasonal daily flow maxima and computer simulation experiments showed that GAMLSS gave the best results with respect to the relative bias and root mean square error in the estimates of the trend in the standard deviation and the constant shape parameter, while WLS/TS provided better accuracy in the estimates of the trend in the mean value. Of the three compared methods, WLS/TS is recommended to deal with non-stationarity in short time series. Some practical aspects of the GAMLSS package application are also presented. A detailed discussion of general issues related to consequences of climate change in FFA is presented in the second part of the article, entitled "Around and about an application of the GAMLSS package in non-stationary flood frequency analysis".

  13. Quantile-based bias correction and uncertainty quantification of extreme event attribution statements

    DOE PAGES

    Jeon, Soyoung; Paciorek, Christopher J.; Wehner, Michael F.

    2016-02-16

    Extreme event attribution characterizes how anthropogenic climate change may have influenced the probability and magnitude of selected individual extreme weather and climate events. Attribution statements often involve quantification of the fraction of attributable risk (FAR) or the risk ratio (RR) and associated confidence intervals. Many such analyses use climate model output to characterize extreme event behavior with and without anthropogenic influence. However, such climate models may have biases in their representation of extreme events. To account for discrepancies in the probabilities of extreme events between observational datasets and model datasets, we demonstrate an appropriate rescaling of the model output based on the quantiles of the datasets to estimate an adjusted risk ratio. Our methodology accounts for various components of uncertainty in estimation of the risk ratio. In particular, we present an approach to construct a one-sided confidence interval on the lower bound of the risk ratio when the estimated risk ratio is infinity. We demonstrate the methodology using the summer 2011 central US heatwave and output from the Community Earth System Model. In this example, we find that the lower bound of the risk ratio is relatively insensitive to the magnitude and probability of the actual event.

  14. A Bibliography for the ABLUE.

    DTIC Science & Technology

    1982-06-01

    scale based on two symmetric quantiles. Sankhya A 30, 335-336. [S] Gupta, S. S. and Gnanadesikan, M. (1966). Estimation of the parameters of the logistic...and Cheng (1971, 1972, 1974) Chan, Cheng, Mead and Panjer (1973) Cheng (1975) Eubank (1979, 1981a,b) Gupta and Gnanadesikan (1966) Hassanein (1969b

  15. Examining the Reliability of Student Growth Percentiles Using Multidimensional IRT

    ERIC Educational Resources Information Center

    Monroe, Scott; Cai, Li

    2015-01-01

    Student growth percentiles (SGPs, Betebenner, 2009) are used to locate a student's current score in a conditional distribution based on the student's past scores. Currently, following Betebenner (2009), quantile regression (QR) is most often used operationally to estimate the SGPs. Alternatively, multidimensional item response theory (MIRT) may…

  16. Wildfire Selectivity for Land Cover Type: Does Size Matter?

    PubMed Central

    Barros, Ana M. G.; Pereira, José M. C.

    2014-01-01

    Previous research has shown that fires burn certain land cover types disproportionally to their abundance. We used quantile regression to study land cover proneness to fire as a function of fire size, under the hypothesis that they are inversely related, for all land cover types. Using five years of fire perimeters, we estimated conditional quantile functions for lower (avoidance) and upper (preference) quantiles of fire selectivity for five land cover types - annual crops, evergreen oak woodlands, eucalypt forests, pine forests and shrublands. The slope of significant regression quantiles describes the rate of change in fire selectivity (avoidance or preference) as a function of fire size. We used Monte-Carlo methods to randomly permute fires in order to obtain a distribution of fire selectivity due to chance. This distribution was used to test the null hypotheses that 1) mean fire selectivity does not differ from that obtained by randomly relocating observed fire perimeters; 2) that land cover proneness to fire does not vary with fire size. Our results show that land cover proneness to fire is higher for shrublands and pine forests than for annual crops and evergreen oak woodlands. As fire size increases, selectivity decreases for all land cover types tested. Moreover, the rate of change in selectivity with fire size is higher for preference than for avoidance. Comparison between observed and randomized data led us to reject both null hypotheses tested (α = 0.05) and to conclude that it is very unlikely that the observed values of fire selectivity and change in selectivity with fire size are due to chance. PMID:24454747
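
    The Monte Carlo logic, comparing an observed selectivity statistic with its distribution under random relocation, can be sketched generically; the selectivity index below (log of burned versus available proportion) is a stand-in, not necessarily the authors' measure.

        # Generic Monte Carlo randomization sketch: is the observed
        # selectivity for one land cover type beyond what random fire
        # placement would produce?  All numbers are invented.
        import numpy as np

        p_landscape = 0.20                 # landscape share of shrubland
        burned = np.array([0.35, 0.28, 0.40, 0.31, 0.25])  # share inside fires
        obs = np.log(burned.mean() / p_landscape)          # observed selectivity

        rng = np.random.default_rng(0)
        n_sims = 9999
        null = np.empty(n_sims)
        for i in range(n_sims):
            # random relocation stand-in: burned shares drawn around the
            # landscape proportion
            fake = rng.binomial(100, p_landscape, size=burned.size) / 100
            null[i] = np.log(max(fake.mean(), 1e-6) / p_landscape)

        p_value = (1 + np.sum(np.abs(null) >= abs(obs))) / (n_sims + 1)
        print(p_value)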

  17. Modelling probabilities of heavy precipitation by regional approaches

    NASA Astrophysics Data System (ADS)

    Gaal, L.; Kysely, J.

    2009-09-01

    Extreme precipitation events are associated with large negative consequences for human society, mainly as they may trigger floods and landslides. The recent series of flash floods in central Europe (affecting several isolated areas) on June 24-28, 2009, the worst one over several decades in the Czech Republic as to the number of persons killed and the extent of damage to buildings and infrastructure, is an example. Estimates of growth curves and design values (corresponding e.g. to 50-yr and 100-yr return periods) of precipitation amounts, together with their uncertainty, are important in hydrological modelling and other applications. The interest in high quantiles of precipitation distributions is also related to possible climate change effects, as climate model simulations tend to project increased severity of precipitation extremes in a warmer climate. The present study compares - in terms of Monte Carlo simulation experiments - several methods for modelling probabilities of precipitation extremes that make use of ‘regional approaches’: the estimation of distributions of extremes takes into account data in a ‘region’ (‘pooling group’), in which one may assume that the distributions at individual sites are identical apart from a site-specific scaling factor (the condition is referred to as ‘regional homogeneity’). In other words, all data in a region - often weighted in some way - are taken into account when estimating the probability distribution of extremes at a given site. The advantage is that sampling variations in the estimates of model parameters and high quantiles are to a large extent reduced compared to the single-site analysis. We focus on the ‘region-of-influence’ (ROI) method which is based on the identification of unique pooling groups (forming the database for the estimation) for each site under study. The similarity of sites is evaluated in terms of a set of site attributes related to the distributions of extremes. The issue of the size of the region is linked with a built-in test on regional homogeneity of data. Once a pooling group is delineated, weights based on a dissimilarity measure are assigned to individual sites involved in a pooling group, and all (weighted) data are employed in the estimation of model parameters and high quantiles at a given location. The ROI method is compared with the Hosking-Wallis (HW) regional frequency analysis, which is based on delineating fixed regions (instead of flexible pooling groups) and assigning unit weights to all sites in a region. The comparison of the performance of the individual regional models makes use of data on annual maxima of 1-day precipitation amounts at 209 stations covering the Czech Republic, with altitudes ranging from 150 to 1490 m a.s.l. We conclude that the ROI methodology is superior to the HW analysis, particularly for very high quantiles (100-yr return values). Another advantage of the ROI approach is that subjective decisions - unavoidable when fixed regions in the HW analysis are formed - may efficiently be suppressed, and almost all settings of the ROI method may be justified by results of the simulation experiments. The differences between (any) regional method and single-site analysis are very pronounced and suggest that the at-site estimation is highly unreliable. The ROI method is then applied to estimate high quantiles of precipitation amounts at individual sites. The estimates and their uncertainty are compared with those from a single-site analysis.
We focus on the eastern part of the Czech Republic, i.e. an area with complex orography and a particularly pronounced role of Mediterranean cyclones in producing precipitation extremes. The design values are compared with precipitation amounts recorded during the recent heavy precipitation events, including the one associated with the flash flood on June 24, 2009. We also show that the ROI methodology may easily be transferred to the analysis of precipitation extremes in climate model outputs. It efficiently reduces (random) variations in the estimates of parameters of the extreme value distributions in individual gridboxes that result from large spatial variability of heavy precipitation, and represents a straightforward tool for ‘weighting’ data from neighbouring gridboxes within the estimation procedure. The study is supported by the Grant Agency of AS CR under project B300420801.

  18. Time Series Model Identification by Estimating Information, Memory, and Quantiles.

    DTIC Science & Technology

    1983-07-01

    Standards, Sect. D, 68D, 937-951. Parzen, Emanuel (1969) "Multiple time series modeling" Multivariate Analysis - II, edited by P. Krishnaiah, Academic... Krishnaiah, North Holland: Amsterdam, 283-295. Parzen, Emanuel (1979) "Forecasting and Whitening Filter Estimation" TIMS Studies in the Management...principle. Applications of Statistics, P. R. Krishnaiah, ed. North Holland: Amsterdam, 27-41. Box, G. E. P. and Jenkins, G. M. (1970) Time Series Analysis

  19. A Data Centred Method to Estimate and Map Changes in the Full Distribution of Daily Precipitation and Its Exceedances

    NASA Astrophysics Data System (ADS)

    Chapman, S. C.; Stainforth, D. A.; Watkins, N. W.

    2014-12-01

    Estimates of how our climate is changing are needed locally in order to inform adaptation planning decisions. This requires quantifying the geographical patterns in changes at specific quantiles or thresholds in distributions of variables such as daily temperature or precipitation. We develop a method[1] for analysing local climatic timeseries to assess which quantiles of the local climatic distribution show the greatest and most robust changes, to specifically address the challenges presented by 'heavy-tailed' variables such as daily precipitation. We extract from the data quantities that characterize the changes in time of the likelihood of daily precipitation above a threshold and of the relative amount of precipitation in those extreme precipitation days. Our method is a simple mathematical deconstruction of how the difference between two observations from two different time periods can be assigned to the combination of natural statistical variability and/or the consequences of secular climate change. This deconstruction facilitates an assessment of how fast different quantiles of precipitation distributions are changing. This involves both determining which quantiles and geographical locations show the greatest change but also, those at which any change is highly uncertain. We demonstrate this approach using E-OBS gridded data[2] timeseries of local daily precipitation from specific locations across Europe over the last 60 years. We treat geographical location and precipitation as independent variables and thus obtain as outputs the pattern of change at a given threshold of precipitation and with geographical location. This is model-independent, thus providing data of direct value in model calibration and assessment. Our results identify regionally consistent patterns which, depending on location, show systematic increase in precipitation on the wettest days, shifts in precipitation patterns to less moderate days and more heavy days, and drying across all days, which is of potential value in adaptation planning. [1] S C Chapman, D A Stainforth, N W Watkins, 2013 Phil. Trans. R. Soc. A, 371 20120287; D. A. Stainforth, S. C. Chapman, N. W. Watkins, 2013 Environ. Res. Lett. 8, 034031 [2] Haylock et al. 2008 J. Geophys. Res (Atmospheres), 113, D20119

  20. HDDM: Hierarchical Bayesian estimation of the Drift-Diffusion Model in Python.

    PubMed

    Wiecki, Thomas V; Sofer, Imri; Frank, Michael J

    2013-01-01

    The diffusion model is a commonly used tool to infer latent psychological processes underlying decision-making, and to link them to neural mechanisms based on response times. Although efficient open source software has been made available to quantitatively fit the model to data, current estimation methods require an abundance of response time measurements to recover meaningful parameters, and only provide point estimates of each parameter. In contrast, hierarchical Bayesian parameter estimation methods are useful for enhancing statistical power, allowing for simultaneous estimation of individual subject parameters and the group distribution that they are drawn from, while also providing measures of uncertainty in these parameters in the posterior distribution. Here, we present a novel Python-based toolbox called HDDM (hierarchical drift diffusion model), which allows fast and flexible estimation of the drift-diffusion model and the related linear ballistic accumulator model. HDDM requires fewer data per subject/condition than non-hierarchical methods, allows for full Bayesian data analysis, and can handle outliers in the data. Finally, HDDM supports the estimation of how trial-by-trial measurements (e.g., fMRI) influence decision-making parameters. This paper will first describe the theoretical background of the drift diffusion model and Bayesian inference. We then illustrate usage of the toolbox on a real-world data set from our lab. Finally, parameter recovery studies show that HDDM beats alternative fitting methods like the χ²-quantile method as well as maximum likelihood estimation. The software and documentation can be downloaded at: http://ski.clps.brown.edu/hddm_docs/
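
    Basic usage follows the quickstart pattern in the HDDM documentation; treat the snippet below as indicative rather than authoritative, and the CSV (with rt, response and subj_idx columns) as hypothetical.

        import hddm

        data = hddm.load_csv("mydata.csv")   # columns: rt, response, subj_idx
        model = hddm.HDDM(data)              # hierarchical drift-diffusion model
        model.sample(2000, burn=200)         # MCMC posterior sampling
        model.print_stats()                  # posterior means, sd, quantiles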

  1. Distributional Analysis in Educational Evaluation: A Case Study from the New York City Voucher Program

    ERIC Educational Resources Information Center

    Bitler, Marianne; Domina, Thurston; Penner, Emily; Hoynes, Hilary

    2015-01-01

    We use quantile treatment effects estimation to examine the consequences of the random-assignment New York City School Choice Scholarship Program across the distribution of student achievement. Our analyses suggest that the program had negligible and statistically insignificant effects across the skill distribution. In addition to contributing to…

  2. Public health impacts of ecosystem change in the Brazilian Amazon

    PubMed Central

    Bauch, Simone C.; Birkenbach, Anna M.; Pattanayak, Subhrendu K.; Sills, Erin O.

    2015-01-01

    The claim that nature delivers health benefits rests on a thin empirical evidence base. Even less evidence exists on how specific conservation policies affect multiple health outcomes. We address these gaps in knowledge by combining municipal-level panel data on diseases, public health services, climatic factors, demographics, conservation policies, and other drivers of land-use change in the Brazilian Amazon. To fully exploit this dataset, we estimate random-effects and quantile regression models of disease incidence. We find that malaria, acute respiratory infection (ARI), and diarrhea incidence are significantly and negatively correlated with the area under strict environmental protection. Results vary by disease for other types of protected areas (PAs), roads, and mining. The relationships between diseases and land-use change drivers also vary by quantile of the disease distribution. Conservation scenarios based on estimated regression results suggest that malaria, ARI, and diarrhea incidence would be reduced by expanding strict PAs, and malaria could be further reduced by restricting roads and mining. Although these relationships are complex, we conclude that interventions to preserve natural capital can deliver cobenefits by also increasing human (health) capital. PMID:26082548

  3. The weighted function method: A handy tool for flood frequency analysis or just a curiosity?

    NASA Astrophysics Data System (ADS)

    Bogdanowicz, Ewa; Kochanek, Krzysztof; Strupczewski, Witold G.

    2018-04-01

    The idea of the Weighted Function (WF) method for estimation of the Pearson type 3 (Pe3) distribution, introduced by Ma in 1984, has been revised and successfully applied to the shifted inverse Gaussian (IGa3) distribution. The conditions of WF applicability to a shifted distribution have also been formulated. The accuracy of WF flood quantiles for both the Pe3 and IGa3 distributions was assessed by Monte Carlo simulations under true and false distribution assumptions versus the maximum likelihood (MLM), moment (MOM) and L-moments (LMM) methods. Three datasets of annual peak flows of Polish catchments serve as case studies to compare the results of the WF, MOM, MLM and LMM performance for real flood data. For the hundred-year flood, the WF method revealed explicit superiority only over the MLM, surpassing the MOM and especially the LMM, for both true and false distributional assumptions, with respect to relative bias and relative root mean square error values. Generally, the WF method performs well for hydrological sample sizes and constitutes a good alternative for the estimation of upper flood quantiles.

  4. Modeling energy expenditure in children and adolescents using quantile regression

    USDA-ARS?s Scientific Manuscript database

    Advanced mathematical models have the potential to capture the complex metabolic and physiological processes that result in energy expenditure (EE). Study objective is to apply quantile regression (QR) to predict EE and determine quantile-dependent variation in covariate effects in nonobese and obes...

  5. HIGHLIGHTING DIFFERENCES BETWEEN CONDITIONAL AND UNCONDITIONAL QUANTILE REGRESSION APPROACHES THROUGH AN APPLICATION TO ASSESS MEDICATION ADHERENCE

    PubMed Central

    BORAH, BIJAN J.; BASU, ANIRBAN

    2014-01-01

    The quantile regression (QR) framework provides a pragmatic approach to understanding the differential impacts of covariates along the distribution of an outcome. However, the QR framework that has pervaded the applied economics literature is based on the conditional quantile regression method. It is used to assess the impact of a covariate on a quantile of the outcome conditional on specific values of other covariates. In most cases, conditional quantile regression generates results that are not generalizable or interpretable in a policy or population context. In contrast, the unconditional quantile regression method provides more interpretable results, as it marginalizes the effect over the distributions of other covariates in the model. In this paper, the differences between these two regression frameworks are highlighted, both conceptually and econometrically. Additionally, using real-world claims data from a large US health insurer, alternative QR frameworks are implemented to assess the differential impacts of covariates along the distribution of medication adherence among elderly patients with Alzheimer’s disease. PMID:23616446
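
    The unconditional approach referred to here is usually implemented by regressing the recentered influence function (RIF) of the outcome at quantile τ on covariates (the Firpo, Fortin and Lemieux estimator); a self-contained sketch on simulated data:

        # Sketch of unconditional quantile regression via the recentered
        # influence function (RIF) at tau = 0.9, on simulated data.
        import numpy as np
        import statsmodels.api as sm
        from scipy.stats import gaussian_kde

        rng = np.random.default_rng(0)
        n = 5000
        x = rng.normal(size=(n, 2))
        y = 1.0 + x @ np.array([0.5, -0.3]) + rng.normal(size=n)

        tau = 0.9
        q_tau = np.quantile(y, tau)
        f_q = gaussian_kde(y)(q_tau)[0]            # density of y at q_tau
        rif = q_tau + (tau - (y <= q_tau)) / f_q   # recentered influence function

        uqr = sm.OLS(rif, sm.add_constant(x)).fit()
        print(uqr.params)  # effects on the unconditional 0.9 quantile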

  6. The effect of smoking habit changes on body weight: Evidence from the UK.

    PubMed

    Pieroni, Luca; Salmasi, Luca

    2016-03-01

    This paper evaluates the causal relationship between smoking and body weight through two waves (2004-2006) of the British Household Panel Survey. We model the effect of changes in smoking habits, such as quitting or reducing, and account for the heterogeneous responses of individuals located at different points of the body mass distribution by quantile regression. We test our results by means of a large set of control groups and investigate their robustness by using the changes-in-changes estimator and accounting for different thresholds to define smoking reductions. Our results reveal the positive effect of quitting smoking on weight changes, which is also found to increase in the highest quantiles, whereas the decision to reduce smoking does not affect body weight. Copyright © 2015 Elsevier B.V. All rights reserved.

  7. Fitness adjusted racial disparities in central adiposity among women in the USA using quantile regression.

    PubMed

    McDonald, S; Ortaglia, A; Supino, C; Kacka, M; Clenin, M; Bottai, M

    2017-06-01

    This study comprehensively explores racial/ethnic disparities in waist circumference (WC) after adjusting for cardiorespiratory fitness (CRF), among both adult and adolescent women, across WC percentiles. Analysis was conducted using data from the 1999 to 2004 National Health and Nutrition Examination Survey. Female participants (n = 3,977) aged 12-49 years with complete data on CRF, height, weight and WC were included. Quantile regression models, stratified by age groups (12-15, 16-19 and 20-49 years), were used to assess the association between WC and race/ethnicity adjusting for CRF, height and age across WC percentiles (10th, 25th, 50th, 75th, 90th and 95th). For non-Hispanic (NH) Black women, in both the 16-19 and 20-49 years age groups, estimated WC was significantly greater than for NH White women across percentiles above the median, with estimates ranging from 5.2 to 11.5 cm. For Mexican American women, in all age groups, estimated WC tended to be significantly greater than for NH White women, particularly at the middle percentiles (50th and 75th), with point estimates ranging from 1.9 to 8.4 cm. Significant disparities in WC between NH Black and Mexican American women, as compared to NH White women, remain even after adjustment for CRF. The magnitude of the disparities associated with race/ethnicity differs across WC percentiles and age groups.

  8. Socio-demographic, clinical characteristics and utilization of mental health care services associated with SF-6D utility scores in patients with mental disorders: contributions of the quantile regression.

    PubMed

    Prigent, Amélie; Kamendje-Tchokobou, Blaise; Chevreul, Karine

    2017-11-01

    Health-related quality of life (HRQoL) is a widely used concept in the assessment of health care. Some generic HRQoL instruments, based on specific algorithms, can generate utility scores which reflect the preferences of the general population for the different health states described by the instrument. This study aimed to investigate the relationships between utility scores and potentially associated factors in patients with mental disorders followed in inpatient and/or outpatient care settings, using two statistical methods. Patients were recruited in four psychiatric sectors in France. Patient responses to the SF-36 generic HRQoL instrument were used to calculate SF-6D utility scores. The relationships between utility scores and patient socio-demographic and clinical characteristics and mental health care utilization, considered as potentially associated factors, were studied using OLS and quantile regressions. One hundred and seventy-six patients were included. Women, severely ill patients and those hospitalized full-time tended to report lower utility scores, whereas psychotic disorders (as opposed to mood disorders) and part-time care were associated with higher scores. The quantile regression highlighted that the size of the associations between the utility scores and some patient characteristics varied along the utility score distribution, and provided more accurate estimated values than OLS regression. The quantile regression may constitute a relevant complement for the analysis of factors associated with utility scores. For policy decision-making, the association of full-time hospitalization with lower utility scores, while part-time care was associated with higher scores, supports the further development of alternatives to full-time hospitalization.
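
    A minimal sketch of the OLS-versus-quantile-regression contrast described above, using statsmodels; the variable names and simulated data are illustrative stand-ins, not the study's SF-6D sample.

```python
# OLS gives one mean effect; quantile regression lets the association vary
# along the (conditional) outcome distribution. Simulated heteroscedastic
# data make the quantile-specific coefficients differ visibly.
import numpy as np
import statsmodels.api as sm
from statsmodels.regression.quantile_regression import QuantReg

rng = np.random.default_rng(1)
n = 176
hospitalized = rng.integers(0, 2, n)                 # hypothetical covariates
severity = rng.normal(size=n)
utility = (0.7 - 0.1 * hospitalized - 0.05 * severity
           + rng.normal(0, 0.1, n) * (1 + severity.clip(0)))

X = sm.add_constant(np.column_stack([hospitalized, severity]))
print("OLS :", sm.OLS(utility, X).fit().params)
for tau in (0.1, 0.5, 0.9):          # associations may vary along the distribution
    print(f"QR {tau}:", QuantReg(utility, X).fit(q=tau).params)
```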

  9. Analysis of regional natural flow for evaluation of flood risk according to RCP climate change scenarios

    NASA Astrophysics Data System (ADS)

    Lee, J. Y.; Chae, B. S.; Wi, S.; Kim, T. W.

    2017-12-01

    Various climate change scenarios project rainfall in South Korea to increase by 3-10% in the future. The increased rainfall will also have a significant effect on the future frequency of floods. This study analyzed the probability of future floods to investigate the stability of existing and newly installed hydraulic structures and the possibility of increasing flood damage in mid-sized watersheds in South Korea. To achieve this goal, we first clarified the relationship between flood quantiles acquired from flood-frequency analysis (FFA) and design rainfall-runoff analysis (DRRA) in gauged watersheds. Then, after synthetically generating regional natural flow data according to the RCP climate change scenarios, we developed mathematical formulas to estimate future flood quantiles in ungauged watersheds, based on the regression between DRRA and FFA incorporated with regional natural flows. Finally, we developed a flood risk map to investigate the change of flood risk in terms of the return period for the past, present, and future. The results identified that future flood quantiles and risks would increase under the RCP climate change scenarios. Because the regional flood risk was identified to increase in the future compared with the present status, comprehensive flood control will be needed to cope with extreme floods.

  10. Estimation of design floods in ungauged catchments using a regional index flood method. A case study of Lake Victoria Basin in Kenya

    NASA Astrophysics Data System (ADS)

    Nobert, Joel; Mugo, Margaret; Gadain, Hussein

    Reliable estimation of flood magnitudes corresponding to required return periods, vital for structural design purposes, is impacted by the lack of hydrological data in the study area of Lake Victoria Basin in Kenya. Use of regional information, derived from data at gauged sites and regionalized for use at any location within a homogeneous region, improves the reliability of design flood estimation. Therefore, the regional index flood method was applied. Based on data from 14 gauged sites, the basin was delineated into two homogeneous regions using elevation variation (90-m DEM), the spatial annual rainfall pattern and principal component analysis of seasonal rainfall patterns (from 94 rainfall stations). At-site annual maximum series were modelled using the three-parameter log-normal (LN3), log-logistic (LLG), generalized extreme value (GEV) and log-Pearson type 3 (LP3) distributions. The parameters of the distributions were estimated using the method of probability weighted moments. Goodness-of-fit tests were applied and the GEV was identified as the most appropriate model for each site. Based on the GEV model, flood quantiles were estimated and regional frequency curves were derived from the averaged at-site growth curves. Using the least squares regression method, relationships were developed between the index flood, defined as the mean annual flood (MAF), and catchment characteristics. The relationships indicated that area, mean annual rainfall and altitude were the three significant variables that most influence the index flood. Thereafter, flood magnitudes in ungauged catchments within a homogeneous region were estimated from the derived equations for the index flood and the quantiles from the regional curves. These estimates will improve flood risk estimation and support water management and engineering decisions and actions.
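
    The at-site steps (probability-weighted moments, a GEV fit, and index-flood scaling) can be sketched as follows. Hosking's approximation for the GEV shape is used and the annual maximum series is simulated, so this is an illustration of the mechanics rather than the study's code; the regional averaging of growth curves is reduced here to a single site.

```python
# GEV fitting by sample L-moments (via probability-weighted moments) and the
# index-flood scaling step: Q_T = MAF x growth factor. Illustrative data.
import numpy as np
from math import gamma, log

def sample_lmoments(x):
    x = np.sort(np.asarray(x, dtype=float))
    n, i = len(x), np.arange(1, len(x) + 1)
    b0 = x.mean()
    b1 = np.sum((i - 1) / (n - 1) * x) / n
    b2 = np.sum((i - 1) * (i - 2) / ((n - 1) * (n - 2)) * x) / n
    return b0, 2 * b1 - b0, 6 * b2 - 6 * b1 + b0     # lambda1, lambda2, lambda3

def gev_from_lmoments(l1, l2, l3):
    t3 = l3 / l2                                     # L-skewness
    c = 2.0 / (3.0 + t3) - log(2) / log(3)
    k = 7.8590 * c + 2.9554 * c ** 2                 # Hosking's approximation
    alpha = l2 * k / ((1 - 2.0 ** (-k)) * gamma(1 + k))
    xi = l1 + alpha * (gamma(1 + k) - 1) / k
    return xi, alpha, k

def gev_quantile(F, xi, alpha, k):
    return xi + alpha * (1 - (-np.log(F)) ** k) / k

rng = np.random.default_rng(7)
ams = rng.gumbel(100, 30, size=40)                   # simulated annual maxima
xi, alpha, k = gev_from_lmoments(*sample_lmoments(ams))
growth_T100 = gev_quantile(0.99, xi, alpha, k) / ams.mean()
print("100-yr flood =", ams.mean() * growth_T100)    # index flood (MAF) x growth
```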

  11. Updating estimates of low streamflow statistics to account for possible trends

    NASA Astrophysics Data System (ADS)

    Blum, A. G.; Archfield, S. A.; Hirsch, R. M.; Vogel, R. M.; Kiang, J. E.; Dudley, R. W.

    2017-12-01

    Given evidence of both increasing and decreasing trends in low flows in many streams, methods are needed to update estimators of low-flow statistics used in water resources management. One such metric is the 10-year annual low-flow statistic (7Q10), calculated as the annual minimum seven-day streamflow which is exceeded in nine out of ten years on average. Historical streamflow records may not be representative of current conditions at a site if environmental conditions are changing. We present a new approach to frequency estimation under nonstationary conditions that applies a stationary nonparametric quantile estimator to a subset of the annual minimum flow record. Monte Carlo simulation experiments were used to evaluate this approach across a range of trend and no-trend scenarios. Relative to the standard practice of using the entire available streamflow record, use of a nonparametric quantile estimator combined with selection of the most recent 30 or 50 years for 7Q10 estimation was found to improve accuracy and reduce bias. Benefits of data subset selection approaches were greater for higher magnitude trends and for annual minimum flow records with lower coefficients of variation. A nonparametric trend test approach for subset selection did not significantly improve upon always selecting the last 30 years of record. At 174 stream gages in the Chesapeake Bay region, 7Q10 estimators based on the most recent 30 years of flow record were compared to estimators based on the entire period of record. Given the availability of long records of low streamflow, using only a subset of the flow record (about 30 years) can update 7Q10 estimators to better reflect current streamflow conditions.
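
    A minimal sketch of the subset-based update: compute annual minimum 7-day flows, keep the most recent 30 years, and apply a stationary nonparametric quantile estimator. The plotting-position choice (Weibull) and the simulated record are assumptions of the example, not prescriptions from the study.

```python
# 7Q10 from the most recent 30 years of annual minimum 7-day flows, using a
# nonparametric (Weibull plotting-position) quantile estimator.
import numpy as np
import pandas as pd

def annual_min_7day(daily):                  # daily: pd.Series indexed by date
    rolled = daily.rolling(7).mean()         # 7-day moving average flow
    return rolled.groupby(daily.index.year).min().dropna()

def q7_10(annual_minima, years=30):
    x = np.sort(annual_minima.iloc[-years:].to_numpy())
    # nonparametric estimate of the 0.1 nonexceedance quantile
    return np.quantile(x, 0.1, method="weibull")

dates = pd.date_range("1950-01-01", "2020-12-31", freq="D")
flow = pd.Series(np.random.default_rng(3).gamma(2.0, 5.0, len(dates)), index=dates)
print("7Q10 (last 30 yr):", q7_10(annual_min_7day(flow)))
```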

  12. Predicting Word Reading Ability: A Quantile Regression Study

    ERIC Educational Resources Information Center

    McIlraith, Autumn L.

    2018-01-01

    Predictors of early word reading are well established. However, it is unclear if these predictors hold for readers across a range of word reading abilities. This study used quantile regression to investigate predictive relationships at different points in the distribution of word reading. Quantile regression analyses used preschool and…

  13. Improving photometric redshift estimation using GPZ: size information, post processing, and improved photometry

    NASA Astrophysics Data System (ADS)

    Gomes, Zahra; Jarvis, Matt J.; Almosallam, Ibrahim A.; Roberts, Stephen J.

    2018-03-01

    The next generation of large-scale imaging surveys (such as those conducted with the Large Synoptic Survey Telescope and Euclid) will require accurate photometric redshifts in order to optimally extract cosmological information. Gaussian Process for photometric redshift estimation (GPZ) is a promising new method that has been proven to provide efficient, accurate photometric redshift estimations with reliable variance predictions. In this paper, we investigate a number of methods for improving the photometric redshift estimations obtained using GPZ (but which are also applicable to others). We use spectroscopy from the Galaxy and Mass Assembly Data Release 2 with a limiting magnitude of r < 19.4 along with corresponding Sloan Digital Sky Survey visible (ugriz) photometry and the UKIRT Infrared Deep Sky Survey Large Area Survey near-IR (YJHK) photometry. We evaluate the effects of adding near-IR magnitudes and angular size as features for the training, validation, and testing of GPZ and find that these improve the accuracy of the results by ~15-20 per cent. In addition, we explore a post-processing method of shifting the probability distributions of the estimated redshifts based on their Quantile-Quantile plots and find that it improves the bias by ~40 per cent. Finally, we investigate the effects of using more precise photometry obtained from the Hyper Suprime-Cam Subaru Strategic Program Data Release 1 and find that it produces significant improvements in accuracy, similar to the effect of including additional features.

  14. Assessment of probabilistic areal reduction factors of precipitations for the entire French territory with gridded rainfall data.

    NASA Astrophysics Data System (ADS)

    Fouchier, Catherine; Maire, Alexis; Arnaud, Patrick; Cantet, Philippe; Odry, Jean

    2016-04-01

    The starting point of our study was the availability of maps of rainfall quantiles covering the entire French mainland territory at a spatial resolution of 1 km². These maps display the rainfall amounts estimated for different rainfall durations (from 15 minutes to 72 hours) and different return periods (from 2 years up to 1,000 years). They are provided by a regionalized stochastic hourly point rainfall generator, the SHYREG method, previously developed by Irstea (Arnaud et al., 2007; Cantet and Arnaud, 2014). Being calibrated independently on numerous rain gauge records (with an average density across the country of 1 rain gauge per 200 km²), the method suffers from a limitation common to point-process rainfall generators: it can only reproduce point rainfall patterns and has no capacity to generate rainfall fields. It therefore cannot provide areal rainfall quantiles, although the estimation of the latter is needed for the construction of design rainfall or for the diagnosis of observed events. One means of bridging this gap between our local rainfall quantiles and areal rainfall quantiles is the concept of probabilistic areal reduction factors of rainfall (ARF) as defined by Omolayo (1993). This concept enables the estimation of areal rainfall of a particular frequency within a certain amount of time from point rainfalls of the same frequency and duration. Assessing such ARF for the whole French territory is of particular interest since it should allow us to compute areal rainfall quantiles, and eventually watershed rainfall quantiles, from the already available grids of statistical point rainfall of the SHYREG method. Our purpose was then to assess these ARF using long time series of spatial rainfall data. We used two sets of rainfall fields: i) hourly rainfall fields from a 10-year reference database of Quantitative Precipitation Estimation (QPE) over France (Tabary et al., 2012); ii) daily rainfall fields resulting from a 53-year high-resolution atmospheric reanalysis over France with the SAFRAN gauge-based analysis system (Vidal et al., 2010). We then built samples of maximal rainfalls for each cell location (the "point" rainfalls) and for different areas centered on each cell location (the areal rainfalls) from these gridded data. To compute rainfall quantiles, we fitted a Gumbel law, by the L-moment method, to each of these samples. Our daily and hourly ARF then showed four main trends: i) a sensitivity to the return period, with ARF values decreasing as the return period increases; ii) a sensitivity to the rainfall duration, with ARF values decreasing as the rainfall duration decreases; iii) a sensitivity to the season, with ARF values smaller for the summer period than for the winter period; iv) a sensitivity to the geographical location, with low ARF values in the French Mediterranean area and ARF values close to 1 for the climatic zones of Northern and Western France (oceanic to semi-continental climate). The results of this data-intensive study, conducted for the first time over the whole French territory, are in agreement with studies conducted abroad (e.g., Allen and DeGaetano 2005; Overeem et al. 2010) and confirm and widen the results of previous studies carried out in France on smaller areas and with fewer rainfall durations (e.g., Ramos et al., 2006; Neppel et al., 2003).
    References: Allen R. J. and DeGaetano A. T. (2005). Areal reduction factors for two eastern United States regions with high rain-gauge density. Journal of Hydrologic Engineering 10(4): 327-335. Arnaud P., Fine J.-A. and Lavabre J. (2007). An hourly rainfall generation model applicable to all types of climate. Atmospheric Research 85(2): 230-242. Cantet P. and Arnaud P. (2014). Extreme rainfall analysis by a stochastic model: impact of the copula choice on the sub-daily rainfall generation. Stochastic Environmental Research and Risk Assessment 28(6): 1479-1492. Neppel L., Bouvier C. and Lavabre J. (2003). Areal reduction factor probabilities for rainfall in Languedoc Roussillon. IAHS-AISH Publication (278): 276-283. Omolayo A. S. (1993). On the transposition of areal reduction factors for rainfall frequency estimation. Journal of Hydrology 145(1-2): 191-205. Overeem A., Buishand T. A., Holleman I. and Uijlenhoet R. (2010). Extreme value modeling of areal rainfall from weather radar. Water Resources Research 46(9): 10 p. Ramos M.-H., Leblois E. and Creutin J.-D. (2006). From point to areal rainfall: Linking the different approaches for the frequency characterisation of rainfalls in urban areas. Water Science and Technology 54(6-7): 33-40. Tabary P., Dupuy P., L'Henaff G., Gueguen C., Moulin L., Laurantin O., Merlier C. and Soubeyroux J. M. (2012). A 10-year (1997-2006) reanalysis of Quantitative Precipitation Estimation over France: methodology and first results. IAHS-AISH Publication (351): 255-260. Vidal J.-P., Martin E., Franchistéguy L., Baillon M. and Soubeyroux J.-M. (2010). A 50-year high-resolution atmospheric reanalysis over France with the Safran system. International Journal of Climatology 30(11): 1627-1644.
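
    The ARF computation itself reduces to a ratio of same-frequency quantiles for areal versus point annual maxima, each fitted here with a Gumbel law by L-moments. The gridded fields below are simulated placeholders, so only the mechanics are illustrated, not the SHYREG setup.

```python
# Schematic ARF: Gumbel/L-moment quantile of areal (window-averaged) annual
# maxima divided by the same-frequency quantile of point annual maxima.
import numpy as np

EULER = 0.5772156649

def gumbel_lmom_quantile(ams, F):
    x = np.sort(np.asarray(ams, float))
    n, i = len(x), np.arange(1, len(x) + 1)
    l1 = x.mean()
    l2 = 2 * np.sum((i - 1) / (n - 1) * x) / n - l1   # second L-moment
    alpha = l2 / np.log(2)                             # Gumbel scale
    xi = l1 - EULER * alpha                            # Gumbel location
    return xi - alpha * np.log(-np.log(F))

rng = np.random.default_rng(5)
years, size = 53, 21                          # 53 annual fields of 21x21 cells
fields = rng.gamma(2.0, 10.0, (years, size, size))
point_ams = fields[:, size // 2, size // 2]   # "point" = central cell
areal_ams = fields.mean(axis=(1, 2))          # areal average over the window

T = 10                                        # 10-year return period
F = 1 - 1 / T
arf = gumbel_lmom_quantile(areal_ams, F) / gumbel_lmom_quantile(point_ams, F)
print(f"ARF(T={T}) = {arf:.2f}")
```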

  15. Quantile-based Bayesian maximum entropy approach for spatiotemporal modeling of ambient air quality levels.

    PubMed

    Yu, Hwa-Lung; Wang, Chih-Hsin

    2013-02-05

    Understanding the daily changes in ambient air quality concentrations is important to assessing human exposure and environmental health. However, the fine temporal scales (e.g., hourly) involved in this assessment often lead to high variability in air quality concentrations. This is because of the complex short-term physical and chemical mechanisms among the pollutants. Consequently, high heterogeneity is usually present not only in the averaged pollution levels but also in the intraday variance levels of the daily observations of ambient concentration across space and time. This characteristic decreases the estimation performance of common techniques. This study proposes a novel quantile-based Bayesian maximum entropy (QBME) method to account for the nonstationary and nonhomogeneous characteristics of ambient air pollution dynamics. The QBME method characterizes the spatiotemporal dependence among the ambient air quality levels based on their location-specific quantiles and accounts for spatiotemporal variations using a local weighted smoothing technique. The epistemic framework of the QBME method allows researchers to further consider the uncertainty of space-time observations. This study presents the spatiotemporal modeling of daily CO and PM10 concentrations across Taiwan from 1998 to 2009 using the QBME method. Results show that the QBME method can effectively improve estimation accuracy in terms of lower mean absolute errors and standard deviations over space and time, especially for pollutants with strong nonhomogeneous variances across space. In addition, the epistemic framework allows researchers to assimilate site-specific secondary information where observations are absent because of the common preferential sampling issues of environmental data. The proposed QBME method provides a practical and powerful framework for the spatiotemporal modeling of ambient pollutants.

  16. Covariate Measurement Error Correction for Student Growth Percentiles Using the SIMEX Method

    ERIC Educational Resources Information Center

    Shang, Yi; VanIwaarden, Adam; Betebenner, Damian W.

    2015-01-01

    In this study, we examined the impact of covariate measurement error (ME) on the estimation of quantile regression and student growth percentiles (SGPs), and find that SGPs tend to be overestimated among students with higher prior achievement and underestimated among those with lower prior achievement, a problem we describe as ME endogeneity in…

  17. Highlighting differences between conditional and unconditional quantile regression approaches through an application to assess medication adherence.

    PubMed

    Borah, Bijan J; Basu, Anirban

    2013-09-01

    The quantile regression (QR) framework provides a pragmatic approach in understanding the differential impacts of covariates along the distribution of an outcome. However, the QR framework that has pervaded the applied economics literature is based on the conditional quantile regression method. It is used to assess the impact of a covariate on a quantile of the outcome conditional on specific values of other covariates. In most cases, conditional quantile regression may generate results that are often not generalizable or interpretable in a policy or population context. In contrast, the unconditional quantile regression method provides more interpretable results as it marginalizes the effect over the distributions of other covariates in the model. In this paper, the differences between these two regression frameworks are highlighted, both conceptually and econometrically. Additionally, using real-world claims data from a large US health insurer, alternative QR frameworks are implemented to assess the differential impacts of covariates along the distribution of medication adherence among elderly patients with Alzheimer's disease. Copyright © 2013 John Wiley & Sons, Ltd.

  18. A Quantile Regression Approach to Understanding the Relations Between Morphological Awareness, Vocabulary, and Reading Comprehension in Adult Basic Education Students

    PubMed Central

    Tighe, Elizabeth L.; Schatschneider, Christopher

    2015-01-01

    The purpose of this study was to investigate the joint and unique contributions of morphological awareness and vocabulary knowledge at five reading comprehension levels in Adult Basic Education (ABE) students. We introduce the statistical technique of multiple quantile regression, which enabled us to assess the predictive utility of morphological awareness and vocabulary knowledge at multiple points (quantiles) along the continuous distribution of reading comprehension. To demonstrate the efficacy of our multiple quantile regression analysis, we compared and contrasted our results with a traditional multiple regression analytic approach. Our results indicated that morphological awareness and vocabulary knowledge accounted for a large portion of the variance (82-95%) in reading comprehension skills across all quantiles. Morphological awareness exhibited the greatest unique predictive ability at lower levels of reading comprehension whereas vocabulary knowledge exhibited the greatest unique predictive ability at higher levels of reading comprehension. These results indicate the utility of using multiple quantile regression to assess trajectories of component skills across multiple levels of reading comprehension. The implications of our findings for ABE programs are discussed. PMID:25351773

  19. Quantile based Tsallis entropy in residual lifetime

    NASA Astrophysics Data System (ADS)

    Khammar, A. H.; Jahanshahi, S. M. A.

    2018-02-01

    Tsallis entropy is a one-parameter (α) generalization of the Shannon entropy; unlike the Shannon entropy it is nonadditive. The Shannon entropy may be negative for some distributions, but the Tsallis entropy can always be made nonnegative by choosing an appropriate value of α. In this paper, we derive the quantile form of this nonadditive entropy function in the residual lifetime, namely the residual quantile Tsallis entropy (RQTE), and obtain bounds for it depending on the Renyi residual quantile entropy. We also obtain a relationship between the RQTE and the concept of the proportional hazards model in the quantile setup. Based on the new measure, we propose a stochastic order and aging classes, and study their properties. Finally, we prove characterization theorems for some well-known lifetime distributions. It is shown that the RQTE uniquely determines the parent distribution, unlike the residual Tsallis entropy.
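
    For concreteness, quantile forms consistent with the standard definitions are sketched below; the paper's exact notation and conventions may differ. Here f is the density, Q the quantile function, and q = Q' the quantile density, with f(Q(u)) = 1/q(u).

```latex
% Tsallis entropy of order \alpha rewritten in quantile terms via x = Q(u):
\[
  T_\alpha \;=\; \frac{1}{\alpha-1}\Bigl(1-\int_0^\infty f(x)^{\alpha}\,dx\Bigr)
  \;=\; \frac{1}{\alpha-1}\Bigl(1-\int_0^1 \bigl(q(u)\bigr)^{1-\alpha}\,du\Bigr).
\]
% Restricting to the residual life beyond the u-th quantile (survival 1-u)
% gives a residual quantile Tsallis entropy of the form
\[
  T_\alpha(u) \;=\; \frac{1}{\alpha-1}
  \Bigl(1-\frac{1}{(1-u)^{\alpha}}\int_u^1 \bigl(q(p)\bigr)^{1-\alpha}\,dp\Bigr).
\]
```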

  20. On the Mean Squared Error of Nonparametric Quantile Estimators under Random Right-Censorship.

    DTIC Science & Technology

    1986-09-01

    …in Section 3, and the result for the kernel estimator Qn is derived in Section 4. It should be mentioned that the order statistic methods used by…

  1. Bayesian uncertainty quantification in linear models for diffusion MRI.

    PubMed

    Sjölund, Jens; Eklund, Anders; Özarslan, Evren; Herberthson, Magnus; Bånkestad, Maria; Knutsson, Hans

    2018-03-29

    Diffusion MRI (dMRI) is a valuable tool in the assessment of tissue microstructure. By fitting a model to the dMRI signal it is possible to derive various quantitative features. Several of the most popular dMRI signal models are expansions in an appropriately chosen basis, where the coefficients are determined using some variation of least-squares. However, such approaches lack any notion of uncertainty, which could be valuable in e.g. group analyses. In this work, we use a probabilistic interpretation of linear least-squares methods to recast popular dMRI models as Bayesian ones. This makes it possible to quantify the uncertainty of any derived quantity. In particular, for quantities that are affine functions of the coefficients, the posterior distribution can be expressed in closed-form. We simulated measurements from single- and double-tensor models where the correct values of several quantities are known, to validate that the theoretically derived quantiles agree with those observed empirically. We included results from residual bootstrap for comparison and found good agreement. The validation employed several different models: Diffusion Tensor Imaging (DTI), Mean Apparent Propagator MRI (MAP-MRI) and Constrained Spherical Deconvolution (CSD). We also used in vivo data to visualize maps of quantitative features and corresponding uncertainties, and to show how our approach can be used in a group analysis to downweight subjects with high uncertainty. In summary, we convert successful linear models for dMRI signal estimation to probabilistic models, capable of accurate uncertainty quantification. Copyright © 2018 Elsevier Inc. All rights reserved.
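
    The closed-form machinery is standard Bayesian linear regression: with a Gaussian prior and Gaussian noise the coefficient posterior is Gaussian, so any quantity that is affine in the coefficients has analytic quantiles. The sketch below uses an illustrative design matrix and prior, not an actual dMRI basis.

```python
# Conjugate Bayesian linear model: posterior over coefficients is Gaussian,
# so quantiles of any affine quantity w^T x are available in closed form.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
n, p, sigma = 60, 4, 0.05
A = rng.normal(size=(n, p))                  # basis functions at measured points
x_true = rng.normal(size=p)
y = A @ x_true + sigma * rng.normal(size=n)

prior_cov = np.eye(p)                        # x ~ N(0, I) prior (assumed)
post_cov = np.linalg.inv(np.linalg.inv(prior_cov) + A.T @ A / sigma**2)
post_mean = post_cov @ (A.T @ y / sigma**2)

w = np.ones(p)                               # an affine-derived quantity w^T x
m, s = w @ post_mean, np.sqrt(w @ post_cov @ w)
q05, q95 = norm.ppf([0.05, 0.95], loc=m, scale=s)
print(f"posterior 5%/95% quantiles of w^T x: {q05:.3f}, {q95:.3f}")
```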

  2. The gender gap reloaded: are school characteristics linked to labor market performance?

    PubMed

    Konstantopoulos, Spyros; Constant, Amelie

    2008-06-01

    This study examines the wage gender gap of young adults in the 1970s, 1980s, and 2000 in the US. Using quantile regression we estimate the gender gap across the entire wage distribution. We also study the importance of high school characteristics in predicting future labor market performance. We conduct analyses for three major racial/ethnic groups in the US: Whites, Blacks, and Hispanics, employing data from two rich longitudinal studies: NLS and NELS. Our results indicate that while some school characteristics are positive and significant predictors of future wages for Whites, they are less so for the two minority groups. We find significant wage gender disparities favoring men across all three surveys in the 1970s, 1980s, and 2000. The wage gender gap is more pronounced in higher paid jobs (90th quantile) for all groups, indicating the presence of a persistent and alarming "glass ceiling."

  3. Influences of spatial and temporal variation on fish-habitat relationships defined by regression quantiles

    Treesearch

    Jason B. Dunham; Brian S. Cade; James W. Terrell

    2002-01-01

    We used regression quantiles to model potentially limiting relationships between the standing crop of cutthroat trout Oncorhynchus clarki and measures of stream channel morphology. Regression quantile models indicated that variation in fish density was inversely related to the width:depth ratio of streams but not to stream width or depth alone. The...

  4. Superquantile/CVaR Risk Measures: Second-Order Theory

    DTIC Science & Technology

    2015-07-31

    …second-order superquantile risk minimization as well as superquantile regression, a proposed second-order version of quantile regression. … superquantiles, because it is deeply tied to generalized regression. The joint formula (3) is central to quantile regression, a well known alternative…

  5. Non-susceptible landslide areas in Italy and in the Mediterranean region

    NASA Astrophysics Data System (ADS)

    Marchesini, I.; Ardizzone, F.; Alvioli, M.; Rossi, M.; Guzzetti, F.

    2014-04-01

    We used landslide information for 13 study areas in Italy and morphometric information obtained from the 3 arc-second SRTM DEM to determine areas where landslide susceptibility is expected to be null or negligible in Italy and in the landmasses surrounding the Mediterranean Sea. The morphometric information consisted of the local terrain slope, computed in a square 3 × 3 cell moving window, and the regional relative relief, computed in a circular 15 × 15 cell moving window. We tested three different models to determine the non-susceptible landslide areas, including a linear model (LR), a quantile linear model (QLR), and a quantile non-linear model (QNL). We tested the performance of the three models using independent landslide information represented by the Italian Landslide Inventory (Inventario Fenomeni Franosi in Italia - IFFI). Best results were obtained using the QNL model. The corresponding zonation of non-susceptible landslide areas was intersected in a GIS with geographical census data for Italy. The result allowed us to determine that 57.5% of the population of Italy (in 2001) was located in areas where landslide susceptibility is expected to be null or negligible, and that the remaining 42.5% was located in areas where some landslide susceptibility is expected. We applied the QNL model to the landmasses surrounding the Mediterranean Sea, and we tested the synoptic non-susceptibility zonation using independent landslide information for three study areas in Spain. Results proved that the QNL model was capable of determining where landslide susceptibility is expected to be negligible in the Mediterranean area. We expect our results to be applicable in similar study areas, facilitating the identification of non-susceptible and susceptible landslide areas at the synoptic scale.

  6. Non-susceptible landslide areas in Italy and in the Mediterranean region

    NASA Astrophysics Data System (ADS)

    Marchesini, I.; Ardizzone, F.; Alvioli, M.; Rossi, M.; Guzzetti, F.

    2014-08-01

    We used landslide information for 13 study areas in Italy and morphometric information obtained from the 3-arcseconds shuttle radar topography mission digital elevation model (SRTM DEM) to determine areas where landslide susceptibility is expected to be negligible in Italy and in the landmasses surrounding the Mediterranean Sea. The morphometric information consisted of the local terrain slope which was computed in a square 3 × 3-cell moving window, and in the regional relative relief computed in a circular 15 × 15-cell moving window. We tested three different models to classify the "non-susceptible" landslide areas, including a linear model (LNR), a quantile linear model (QLR), and a quantile, non-linear model (QNL). We tested the performance of the three models using independent landslide information presented by the Italian Landslide Inventory (Inventario Fenomeni Franosi in Italia - IFFI). Best results were obtained using the QNL model. The corresponding zonation of non-susceptible landslide areas was intersected in a geographic information system (GIS) with geographical census data for Italy. The result determined that 57.5% of the population of Italy (in 2001) was located in areas where landslide susceptibility is expected to be negligible. We applied the QNL model to the landmasses surrounding the Mediterranean Sea, and we tested the synoptic non-susceptibility zonation using independent landslide information for three study areas in Spain. Results showed that the QNL model was capable of determining where landslide susceptibility is expected to be negligible in the validation areas in Spain. We expect our results to be applicable in similar study areas, facilitating the identification of non-susceptible landslide areas, at the synoptic scale.

  7. Nonuniform sampling by quantiles.

    PubMed

    Craft, D Levi; Sonstrom, Reilly E; Rovnyak, Virginia G; Rovnyak, David

    2018-03-01

    A flexible strategy for choosing samples nonuniformly from a Nyquist grid using the concept of statistical quantiles is presented for broad classes of NMR experimentation. Quantile-directed scheduling is intuitive and flexible for any weighting function, promotes reproducibility and seed independence, and is generalizable to multiple dimensions. In brief, weighting functions are divided into regions of equal probability, which define the samples to be acquired. Quantile scheduling therefore achieves close adherence to a probability distribution function, thereby minimizing gaps for any given degree of subsampling of the Nyquist grid. A characteristic of quantile scheduling is that one-dimensional, weighted NUS schedules are deterministic, however higher dimensional schedules are similar within a user-specified jittering parameter. To develop unweighted sampling, we investigated the minimum jitter needed to disrupt subharmonic tracts, and show that this criterion can be met in many cases by jittering within 25-50% of the subharmonic gap. For nD-NUS, three supplemental components to choosing samples by quantiles are proposed in this work: (i) forcing the corner samples to ensure sampling to specified maximum values in indirect evolution times, (ii) providing an option to triangular backfill sampling schedules to promote dense/uniform tracts at the beginning of signal evolution periods, and (iii) providing an option to force the edges of nD-NUS schedules to be identical to the 1D quantiles. Quantile-directed scheduling meets the diverse needs of current NUS experimentation, but can also be used for future NUS implementations such as off-grid NUS and more. A computer program implementing these principles (a.k.a. QSched) in 1D- and 2D-NUS is available under the general public license. Copyright © 2018 Elsevier Inc. All rights reserved.

  8. Nonuniform sampling by quantiles

    NASA Astrophysics Data System (ADS)

    Craft, D. Levi; Sonstrom, Reilly E.; Rovnyak, Virginia G.; Rovnyak, David

    2018-03-01

    A flexible strategy for choosing samples nonuniformly from a Nyquist grid using the concept of statistical quantiles is presented for broad classes of NMR experimentation. Quantile-directed scheduling is intuitive and flexible for any weighting function, promotes reproducibility and seed independence, and is generalizable to multiple dimensions. In brief, weighting functions are divided into regions of equal probability, which define the samples to be acquired. Quantile scheduling therefore achieves close adherence to a probability distribution function, thereby minimizing gaps for any given degree of subsampling of the Nyquist grid. A characteristic of quantile scheduling is that one-dimensional, weighted NUS schedules are deterministic, however higher dimensional schedules are similar within a user-specified jittering parameter. To develop unweighted sampling, we investigated the minimum jitter needed to disrupt subharmonic tracts, and show that this criterion can be met in many cases by jittering within 25-50% of the subharmonic gap. For nD-NUS, three supplemental components to choosing samples by quantiles are proposed in this work: (i) forcing the corner samples to ensure sampling to specified maximum values in indirect evolution times, (ii) providing an option to triangular backfill sampling schedules to promote dense/uniform tracts at the beginning of signal evolution periods, and (iii) providing an option to force the edges of nD-NUS schedules to be identical to the 1D quantiles. Quantile-directed scheduling meets the diverse needs of current NUS experimentation, but can also be used for future NUS implementations such as off-grid NUS and more. A computer program implementing these principles (a.k.a. QSched) in 1D- and 2D-NUS is available under the general public license.
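
    A minimal sketch of 1D quantile-directed scheduling as described above: the weighting function is split into m regions of equal probability and one grid point is taken per region, here at the region's probability midpoint. The exponential weight and grid size are assumptions of the example; jittering and the nD extensions are omitted, and this is not the QSched program itself.

```python
# 1D quantile schedule: divide the weight function's CDF into m equal-
# probability regions and take the grid index at each region's midpoint.
import numpy as np

def quantile_schedule(weights, m):
    cdf = np.cumsum(weights) / np.sum(weights)
    targets = (np.arange(m) + 0.5) / m           # equal-probability midpoints
    idx = np.searchsorted(cdf, targets)          # map probabilities to grid
    return np.unique(idx)                        # deterministic in 1D

N, m, t2_decay = 256, 64, 80.0                   # Nyquist grid, samples, decay
grid = np.arange(N)
weights = np.exp(-grid / t2_decay)               # weight matched to the envelope
print(quantile_schedule(weights, m))
```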

  9. Magnitude of flood flows for selected annual exceedance probabilities in Rhode Island through 2010

    USGS Publications Warehouse

    Zarriello, Phillip J.; Ahearn, Elizabeth A.; Levin, Sara B.

    2012-01-01

    Heavy persistent rains from late February through March 2010 caused severe widespread flooding in Rhode Island that set or nearly set record flows and water levels at many long-term streamgages in the State. In response, the U.S. Geological Survey, in partnership with the Federal Emergency Management Agency, conducted a study to update estimates of flood magnitudes at streamgages and regional equations for estimating flood flows at ungaged locations. This report provides information needed for flood plain management, transportation infrastructure design, flood insurance studies, and other purposes that can help minimize future flood damages and risks. The magnitudes of floods were determined from the annual peak flows at 43 streamgages in Rhode Island (20 sites), Connecticut (14 sites), and Massachusetts (9 sites) using the standard Bulletin 17B log-Pearson type III method and a modification of this method called the expected moments algorithm (EMA) for 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probability (AEP) floods. Annual-peak flows were analyzed for the period of record through the 2010 water year; however, records were extended at 23 streamgages using the maintenance of variance extension (MOVE) procedure to best represent the longest period possible for determining the generalized skew and flood magnitudes. Generalized least square regression equations were developed from the flood quantiles computed at 41 streamgages (2 streamgages in Rhode Island with reported flood quantiles were not used in the regional regression because of regulation or redundancy) and their respective basin characteristics to estimate magnitude of floods at ungaged sites. Of 55 basin characteristics evaluated as potential explanatory variables, 3 were statistically significant—drainage area, stream density, and basin storage. The pseudo-coefficient of determination (pseudo-R2) indicates these three explanatory variables explain 95 to 96 percent of the variance in the flood magnitudes from 20- to 0.2-percent AEPs. Estimates of uncertainty of the at-site and regression flood magnitudes are provided and were combined with their respective estimated flood quantiles to improve estimates of flood flows at streamgages. This region has a long history of urban development, which is considered to have an important effect on flood flows. This study includes basins that have an impervious area ranging from 0.5 to 37 percent. Although imperviousness provided some explanatory power in the regression, it was not statistically significant at the 95-percent confidence level for any of the AEPs examined. Influence of urbanization on flood flows indicates a complex interaction with other characteristics that confounds a statistical explanation of its effects. Standard methods for calculating magnitude of floods for given AEP are based on the assumption of stationarity, that is, the annual peak flows exhibit no significant trend over time. A subset of 16 streamgages with 70 or more years of unregulated systematic record indicates all but 4 streamgages have a statistically significant positive trend at the 95-percent confidence level; three of these are statistically significant at about the 90-percent confidence level or above. If the trend continues linearly in time, the estimated magnitude of floods for any AEP, on average, will increase by 6, 13, and 21 percent in 10, 20, and 30 years' time, respectively. In 2010, new peaks of record were set at 18 of the 21 active streamgages in Rhode Island. 
The updated flood frequency analysis indicates the peaks at these streamgages ranged from 2- to 0.2-percent AEP. Many streamgages in the State peaked at a 0.5- and 0.2-percent AEP, except for streamgages in the Blackstone River Basin, which peaked from a 4- to 2-percent AEP.

  10. Flood Change Assessment and Attribution in Austrian alpine Basins

    NASA Astrophysics Data System (ADS)

    Claps, Pierluigi; Allamano, Paola; Como, Anastasia; Viglione, Alberto

    2016-04-01

    The present paper investigates the sensitivity of flood peaks to global warming in Austrian alpine basins. A group of 97 Austrian watersheds, with areas ranging from 14 to 6000 km2 and average elevations ranging from 1000 to 2900 m a.s.l., has been considered. Annual maximum floods are available for the basins from 1890 to 2007 with two densities of observation: in a first period, until 1950, an average of 42 contemporaneous flood peak records is available; from 1951 to 2007 the density of observation increases to an average of 85 contemporaneous peaks. This information is very important with reference to the statistical tools used for the empirical assessment of change over time, that is, linear quantile regressions. Application of this tool to the data set unveils trends in extreme events, confirmed by statistical testing, for the 0.75 and 0.95 empirical quantiles. All applications are made with specific discharge (discharge/area) values. Similarly to what was done in a previous approach, multiple quantile regressions have also been applied, confirming the presence of trends even when accounting for the possible interference of the specific discharge with morphoclimatic parameters (i.e., mean elevation and catchment area). Application of the geomorphoclimatic model by Allamano et al. (2009) allows assessing the extent to which the empirically observed increases in air temperature and annual rainfall can justify the attribution of the change detected by the empirical statistical tools. A comparison with data from Swiss alpine basins treated in a previous paper is finally undertaken.
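
    The empirical change-detection step can be sketched with statsmodels' linear quantile regression of specific discharge on time at the 0.75 and 0.95 quantiles; the series below is simulated and merely stands in for an Austrian record.

```python
# Linear quantile regression of specific discharge on time: a positive,
# significant slope at an upper quantile signals a trend in extremes.
import numpy as np
import statsmodels.api as sm
from statsmodels.regression.quantile_regression import QuantReg

rng = np.random.default_rng(4)
year = np.arange(1890, 2008)
# simulated specific discharge (m3/s/km2) with a mild upward drift
q_spec = rng.gumbel(0.5 + 0.001 * (year - 1890), 0.2)

X = sm.add_constant(year - year[0])
for tau in (0.75, 0.95):
    res = QuantReg(q_spec, X).fit(q=tau)
    print(f"tau={tau}: slope={res.params[1]:.5f} per yr, p={res.pvalues[1]:.3f}")
```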

  11. A Quantile Regression Approach to Understanding the Relations Among Morphological Awareness, Vocabulary, and Reading Comprehension in Adult Basic Education Students.

    PubMed

    Tighe, Elizabeth L; Schatschneider, Christopher

    2016-07-01

    The purpose of this study was to investigate the joint and unique contributions of morphological awareness and vocabulary knowledge at five reading comprehension levels in adult basic education (ABE) students. We introduce the statistical technique of multiple quantile regression, which enabled us to assess the predictive utility of morphological awareness and vocabulary knowledge at multiple points (quantiles) along the continuous distribution of reading comprehension. To demonstrate the efficacy of our multiple quantile regression analysis, we compared and contrasted our results with a traditional multiple regression analytic approach. Our results indicated that morphological awareness and vocabulary knowledge accounted for a large portion of the variance (82%-95%) in reading comprehension skills across all quantiles. Morphological awareness exhibited the greatest unique predictive ability at lower levels of reading comprehension whereas vocabulary knowledge exhibited the greatest unique predictive ability at higher levels of reading comprehension. These results indicate the utility of using multiple quantile regression to assess trajectories of component skills across multiple levels of reading comprehension. The implications of our findings for ABE programs are discussed. © Hammill Institute on Disabilities 2014.

  12. The Dynamics of the Evolution of the Black-White Test Score Gap

    ERIC Educational Resources Information Center

    Sohn, Kitae

    2012-01-01

    We apply a quantile version of the Oaxaca-Blinder decomposition to estimate the counterfactual distribution of the test scores of Black students. In the Early Childhood Longitudinal Study, Kindergarten Class of 1998-1999 (ECLS-K), we find that the gap initially appears only at the top of the distribution of test scores. As children age, however,…

  13. Regional L-Moment-Based Flood Frequency Analysis in the Upper Vistula River Basin, Poland

    NASA Astrophysics Data System (ADS)

    Rutkowska, A.; Żelazny, M.; Kohnová, S.; Łyp, M.; Banasik, K.

    2017-02-01

    The Upper Vistula River basin was divided into pooling groups with similar dimensionless frequency distributions of annual maximum river discharge. The cluster analysis and the Hosking and Wallis (HW) L-moment-based method were used to divide the set of 52 mid-sized catchments into disjoint clusters with similar morphometric, land use, and rainfall variables, and to test the homogeneity within clusters. Finally, three and four pooling groups were obtained alternatively. Two methods for identification of the regional distribution function were used, the HW method and the method of Kjeldsen and Prosdocimi based on a bivariate extension of the HW measure. Subsequently, the flood quantile estimates were calculated using the index flood method. The ordinary least squares (OLS) and the generalised least squares (GLS) regression techniques were used to relate the index flood to catchment characteristics. Predictive performance of the regression scheme for the southern part of the Upper Vistula River basin was improved by using GLS instead of OLS. The results of the study can be recommended for the estimation of flood quantiles at ungauged sites, in flood risk mapping applications, and in engineering hydrology to help design flood protection structures.

  14. Quantifying Short-Term Dynamics of Parkinson’s Disease Using Self-Reported Symptom Data From an Internet Social Network

    PubMed Central

    Wicks, Paul; Vaughan, Timothy; Pentland, Alex

    2013-01-01

    Background Parkinson’s disease (PD) is an incurable neurological disease with approximately 0.3% prevalence. The hallmark symptom is gradual movement deterioration. Current scientific consensus about disease progression holds that symptoms will worsen smoothly over time unless treated. Accurate information about symptom dynamics is of critical importance to patients, caregivers, and the scientific community for the design of new treatments, clinical decision making, and individual disease management. Long-term studies characterize the typical time course of the disease as an early linear progression gradually reaching a plateau in later stages. However, symptom dynamics over durations of days to weeks remains unquantified. Currently, there is a scarcity of objective clinical information about symptom dynamics at intervals shorter than 3 months stretching over several years, but Internet-based patient self-report platforms may change this. Objective To assess the clinical value of online self-reported PD symptom data recorded by users of the health-focused Internet social research platform PatientsLikeMe (PLM), in which patients quantify their symptoms on a regular basis on a subset of the Unified Parkinson’s Disease Ratings Scale (UPDRS). By analyzing this data, we aim for a scientific window on the nature of symptom dynamics for assessment intervals shorter than 3 months over durations of several years. Methods Online self-reported data was validated against the gold standard Parkinson’s Disease Data and Organizing Center (PD-DOC) database, containing clinical symptom data at intervals greater than 3 months. The data were compared visually using quantile-quantile plots, and numerically using the Kolmogorov-Smirnov test. By using a simple piecewise linear trend estimation algorithm, the PLM data was smoothed to separate random fluctuations from continuous symptom dynamics. Subtracting the trends from the original data revealed random fluctuations in symptom severity. The average magnitude of fluctuations versus time since diagnosis was modeled by using a gamma generalized linear model. Results Distributions of ages at diagnosis and UPDRS in the PLM and PD-DOC databases were broadly consistent. The PLM patients were systematically younger than the PD-DOC patients and showed increased symptom severity in the PD off state. The average fluctuation in symptoms (UPDRS Parts I and II) was 2.6 points at the time of diagnosis, rising to 5.9 points 16 years after diagnosis. This fluctuation exceeds the estimated minimal and moderate clinically important differences, respectively. Not all patients conformed to the current clinical picture of gradual, smooth changes: many patients had regimes where symptom severity varied in an unpredictable manner, or underwent large rapid changes in an otherwise more stable progression. Conclusions This information about short-term PD symptom dynamics contributes new scientific understanding about the disease progression, currently very costly to obtain without self-administered Internet-based reporting. This understanding should have implications for the optimization of clinical trials into new treatments and for the choice of treatment decision timescales. PMID:23343503

  15. Quantifying short-term dynamics of Parkinson's disease using self-reported symptom data from an Internet social network.

    PubMed

    Little, Max; Wicks, Paul; Vaughan, Timothy; Pentland, Alex

    2013-01-24

    Parkinson's disease (PD) is an incurable neurological disease with approximately 0.3% prevalence. The hallmark symptom is gradual movement deterioration. Current scientific consensus about disease progression holds that symptoms will worsen smoothly over time unless treated. Accurate information about symptom dynamics is of critical importance to patients, caregivers, and the scientific community for the design of new treatments, clinical decision making, and individual disease management. Long-term studies characterize the typical time course of the disease as an early linear progression gradually reaching a plateau in later stages. However, symptom dynamics over durations of days to weeks remains unquantified. Currently, there is a scarcity of objective clinical information about symptom dynamics at intervals shorter than 3 months stretching over several years, but Internet-based patient self-report platforms may change this. To assess the clinical value of online self-reported PD symptom data recorded by users of the health-focused Internet social research platform PatientsLikeMe (PLM), in which patients quantify their symptoms on a regular basis on a subset of the Unified Parkinson's Disease Ratings Scale (UPDRS). By analyzing this data, we aim for a scientific window on the nature of symptom dynamics for assessment intervals shorter than 3 months over durations of several years. Online self-reported data was validated against the gold standard Parkinson's Disease Data and Organizing Center (PD-DOC) database, containing clinical symptom data at intervals greater than 3 months. The data were compared visually using quantile-quantile plots, and numerically using the Kolmogorov-Smirnov test. By using a simple piecewise linear trend estimation algorithm, the PLM data was smoothed to separate random fluctuations from continuous symptom dynamics. Subtracting the trends from the original data revealed random fluctuations in symptom severity. The average magnitude of fluctuations versus time since diagnosis was modeled by using a gamma generalized linear model. Distributions of ages at diagnosis and UPDRS in the PLM and PD-DOC databases were broadly consistent. The PLM patients were systematically younger than the PD-DOC patients and showed increased symptom severity in the PD off state. The average fluctuation in symptoms (UPDRS Parts I and II) was 2.6 points at the time of diagnosis, rising to 5.9 points 16 years after diagnosis. This fluctuation exceeds the estimated minimal and moderate clinically important differences, respectively. Not all patients conformed to the current clinical picture of gradual, smooth changes: many patients had regimes where symptom severity varied in an unpredictable manner, or underwent large rapid changes in an otherwise more stable progression. This information about short-term PD symptom dynamics contributes new scientific understanding about the disease progression, currently very costly to obtain without self-administered Internet-based reporting. This understanding should have implications for the optimization of clinical trials into new treatments and for the choice of treatment decision timescales.

  16. Design Life Level: Quantifying risk in a changing climate

    NASA Astrophysics Data System (ADS)

    Rootzén, Holger; Katz, Richard W.

    2013-09-01

    In the past, the concepts of return levels and return periods have been standard and important tools for engineering design. However, these concepts are based on the assumption of a stationary climate and do not apply to a changing climate, whether local or global. In this paper, we propose a refined concept, Design Life Level, which quantifies risk in a nonstationary climate and can serve as the basis for communication. In current practice, typical hydrologic risk management focuses on a standard (e.g., in terms of a high quantile corresponding to the specified probability of failure for a single year). Nevertheless, the basic information needed for engineering design should consist of (i) the design life period (e.g., the next 50 years, say 2015-2064); and (ii) the probability (e.g., 5% chance) of a hazardous event (typically, in the form of the hydrologic variable exceeding a high level) occurring during the design life period. Capturing both of these design characteristics, the Design Life Level is defined as an upper quantile (e.g., 5%) of the distribution of the maximum value of the hydrologic variable (e.g., water level) over the design life period. We relate this concept and variants of it to existing literature and illustrate how they, and some useful complementary plots, may be computed and used. One practically important consideration concerns quantifying the statistical uncertainty in estimating a high quantile under nonstationarity.
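
    Numerically, the Design Life Level follows from the distribution of the design-life maximum, which for independent years is the product of the yearly CDFs of the hydrologic variable. The sketch below assumes a Gumbel annual-maximum distribution with a linearly drifting location; both are illustrative choices, not part of the paper's specification.

```python
# Design Life Level: the level z whose nonexceedance probability over the
# whole design life equals 1 - 0.05, under a time-varying annual-maximum law.
import numpy as np
from scipy.optimize import brentq
from scipy.stats import gumbel_r

years = np.arange(2015, 2065)                 # 50-year design life
locs = 100.0 + 0.3 * (years - 2015)           # assumed nonstationary location
scale = 20.0

def prob_no_exceed(z):
    # independence across years: product of the yearly CDFs
    return np.prod(gumbel_r.cdf(z, loc=locs, scale=scale))

dll = brentq(lambda z: prob_no_exceed(z) - 0.95, 50.0, 1000.0)
print(f"5% / 50-year Design Life Level: {dll:.1f}")
```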

  17. Superquantile/CVaR Risk Measures: Second-Order Theory

    DTIC Science & Technology

    2014-07-17

    …second-order version of quantile regression. Keywords: superquantiles, conditional value-at-risk, second-order superquantiles, mixed superquantiles… second-order superquantiles is in the domain of generalized regression. We laid out in [16] a parallel methodology to that of quantile regression…

  18. Quantile regression analyses of associated factors for body mass index in Korean adolescents.

    PubMed

    Kim, T H; Lee, E K; Han, E

    2015-05-01

    This study examined the influence of home and school environments and individual health-risk behaviours on body weight outcomes in Korean adolescents. This was a cross-sectional observational study. Quantile regression models were used to explore heterogeneity in the association of specific factors with body mass index (BMI) over the entire conditional BMI distribution. A nationally representative web-based survey of youths was used. A paternal education level of college or more was associated with lower BMI for girls, whereas a maternal education level of college or more was associated with higher BMI for boys; for both, the magnitude of the association became larger at the upper quantiles of the conditional BMI distribution. Girls with good family economic status were more likely to have higher BMIs than those with average family economic status, particularly at the upper quantiles of the conditional BMI distribution. Attending a co-ed school was associated with lower BMI for both genders, with a larger association at the upper quantiles. Substantial screen time for TV watching, video games, or internet surfing was associated with a higher BMI, with a larger association at the upper quantiles for both girls and boys. Dental prevention was negatively associated with BMI, whereas suicide consideration was positively associated with the BMIs of both genders, with a larger association at higher quantiles. These findings suggest that interventions aimed at behavioural changes and positive parental roles are needed to effectively address high adolescent BMI. Copyright © 2015 The Royal Society for Public Health. Published by Elsevier Ltd. All rights reserved.

  19. Approximate Sample Size Formulas for Testing Group Mean Differences when Variances Are Unequal in One-Way ANOVA

    ERIC Educational Resources Information Center

    Guo, Jiin-Huarng; Luh, Wei-Ming

    2008-01-01

    This study proposes an approach for determining the appropriate sample size for Welch's F test when unequal variances are expected. Given a certain maximum deviation in population means, and using quantiles of the F and t distributions, there is no need to specify a noncentrality parameter and it is easy to estimate the approximate sample size needed…
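
    The paper's quantile-based formulas are not reproduced here, but a simulation sketch for the two-group (Welch) case, with illustrative means and unequal standard deviations, conveys the idea of choosing the smallest per-group n that reaches a target power:

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(1)

      def welch_power(n, means=(0.0, 0.5), sds=(1.0, 2.0), alpha=0.05, reps=2000):
          # Estimate the power of Welch's t-test by simulation.
          hits = 0
          for _ in range(reps):
              a = rng.normal(means[0], sds[0], n)
              b = rng.normal(means[1], sds[1], n)
              if stats.ttest_ind(a, b, equal_var=False).pvalue < alpha:
                  hits += 1
          return hits / reps

      n = 10
      while welch_power(n) < 0.80:    # smallest n with roughly 80% power
          n += 5
      print("approximate per-group sample size:", n)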

  20. Quantile Treatment Effects of College Quality on Earnings: Evidence from Administrative Data in Texas. NBER Working Paper No. 18068

    ERIC Educational Resources Information Center

    Andrews, Rodney J.; Li, Jing; Lovenheim, Michael F.

    2012-01-01

    This paper uses administrative data on schooling and earnings from Texas to estimate the effect of college quality on the distribution of earnings. We proxy college quality using the college sector from which students graduate and focus on identifying how graduating from UT-Austin, Texas A&M or a community college affects the distribution of…

  1. Estimating the number of motor units using random sums with independently thinned terms.

    PubMed

    Müller, Samuel; Conforto, Adriana Bastos; Z'graggen, Werner J; Kaelin-Lang, Alain

    2006-07-01

    The problem of estimating the number of motor units N in a muscle is embedded in a general stochastic model using the notion of thinning from point process theory. In the paper a new moment-type estimator for the number of motor units in a muscle is defined, derived using random sums with independently thinned terms. Asymptotic normality of the estimator is shown, and its practical value is demonstrated with bootstrap and approximate confidence intervals for a data set from a 31-year-old healthy, right-handed female volunteer. Moreover, simulation results are presented, and Monte Carlo based quantiles, means, and variances are calculated for N ∈ {300, 600, 1000}.

  2. Managing more than the mean: Using quantile regression to identify factors related to large elk groups

    USGS Publications Warehouse

    Brennan, Angela K.; Cross, Paul C.; Creely, Scott

    2015-01-01

    Synthesis and applications. Our analysis of elk group size distributions using quantile regression suggests that private land, irrigation, open habitat, elk density and wolf abundance can affect large elk group sizes. Thus, to manage larger groups by removal or dispersal of individuals, we recommend incentivizing hunting on private land (particularly if irrigated) during the regular and late hunting seasons, promoting tolerance of wolves on private land (if elk aggregate in these areas to avoid wolves) and creating more winter range and varied habitats. Relationships to the variables of interest also differed by quantile, highlighting the importance of using quantile regression to examine response variables more completely to uncover relationships important to conservation and management.

  3. Asymmetric impact of rainfall on India's food grain production: evidence from quantile autoregressive distributed lag model

    NASA Astrophysics Data System (ADS)

    Pal, Debdatta; Mitra, Subrata Kumar

    2018-01-01

    This study used a quantile autoregressive distributed lag (QARDL) model to capture the asymmetric impact of rainfall on food production in India. It was found that the coefficient corresponding to rainfall in the QARDL increased up to the 75th quantile and started decreasing thereafter, though it remained in positive territory. Another interesting finding is that at the 90th quantile and above, the rainfall coefficients, though positive, were not statistically significant; therefore, the benefit of high rainfall for crop production is not conclusive. However, the impact of other determinants, such as fertilizer and pesticide consumption, is quite uniform over the whole range of the distribution of food grain production.
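
    A minimal sketch of tracing a coefficient across quantiles with statsmodels (synthetic data standing in for the rainfall-production series; the QARDL lag structure is omitted):

      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(2)
      n = 500
      rainfall = rng.gamma(5.0, 1.0, n)
      # Synthetic output whose rainfall effect weakens in the upper tail.
      output = 2.0 + 0.8 * rainfall + rng.normal(0, 1 + 0.3 * rainfall)

      df = pd.DataFrame({"output": output, "rainfall": rainfall})
      for q in (0.10, 0.25, 0.50, 0.75, 0.90):
          fit = smf.quantreg("output ~ rainfall", df).fit(q=q)
          print(f"q={q:.2f}  rainfall coefficient {fit.params['rainfall']:+.3f}")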

  4. An assessment of temporal effect on extreme rainfall estimates

    NASA Astrophysics Data System (ADS)

    Das, Samiran; Zhu, Dehua; Chi-Han, Cheng

    2018-06-01

    This study assesses temporal behaviour, in terms of inter-decadal variability, of extreme daily rainfall of a stated return period relevant to hydrologic risk analysis, using a novel regional parametric approach. The assessment is based on the annual maximum daily rainfall series of 180 meteorological stations in the Yangtze River Basin over a 50-year period (1961-2010). The analysis reveals that although quantile estimates tend to be higher when computed from the data of the 1990s, the effect is not strong enough to warrant excluding the data of any decade from the extreme rainfall estimation process for hydrologic risk analysis.

  5. A minimum distance estimation approach to the two-sample location-scale problem.

    PubMed

    Zhang, Zhiyi; Yu, Qiqing

    2002-09-01

    As reported by Kalbfleisch and Prentice (1980), the generalized Wilcoxon test fails to detect a difference between the lifetime distributions of male and female mice that died from thymic leukemia. This failure is a result of the test's inability to detect a distributional difference when a location shift and a scale change exist simultaneously. In this article, we propose an estimator based on the minimization of an average distance between two independent quantile processes under a location-scale model. Large-sample inference on the proposed estimator, with possible right-censorship, is discussed. The mouse leukemia data are used as an example for illustration purposes.
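
    A sketch of the core idea, minimizing the average squared distance between two empirical quantile functions under a location-scale model (uncensored case only, illustrative data):

      import numpy as np
      from scipy.optimize import minimize

      rng = np.random.default_rng(3)
      x = rng.normal(0.0, 1.0, 200)        # sample 1
      y = rng.normal(1.0, 2.0, 250)        # sample 2: shifted and rescaled

      u = np.linspace(0.05, 0.95, 50)      # probability grid, tails trimmed
      qx = np.quantile(x, u)
      qy = np.quantile(y, u)

      def distance(theta):
          mu, sigma = theta
          # Location-scale model: Q_y(u) = mu + sigma * Q_x(u).
          return np.mean((qy - mu - sigma * qx) ** 2)

      res = minimize(distance, x0=[0.0, 1.0])
      print("estimated location shift and scale change:", res.x)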

  6. Detecting Long-term Trend of Water Quality Indices of Dong-gang River, Taiwan Using Quantile Regression

    NASA Astrophysics Data System (ADS)

    Yang, D.; Shiau, J.

    2013-12-01

    Surface water quality is an essential issue for water supply and for sustaining the healthy ecosystems of rivers. However, river water quality is easily influenced by anthropogenic activities such as urban development and wastewater disposal. Long-term monitoring of water quality can assess whether river water quality is deteriorating. Taiwan is a densely populated area and depends heavily on surface water for domestic, industrial, and agricultural uses. The Dong-gang River is one of the major water resources in southern Taiwan for agricultural requirements. Water-quality data from four monitoring stations on the Dong-gang River for the period 2000-2012 are selected for trend analysis. The parameters used to characterize river water quality include biochemical oxygen demand (BOD), dissolved oxygen (DO), suspended solids (SS), and ammonia nitrogen (NH3-N). These four water-quality parameters are integrated into the river pollution index (RPI) to indicate the pollution level of rivers. Although the widely used non-parametric Mann-Kendall test and linear regression are computationally efficient for identifying trends in water-quality indices, such approaches are sensitive to outliers and estimate only the conditional mean. Quantile regression, capable of identifying changes over time in any percentile value, is employed in this study to detect long-term trends in water-quality indices for the Dong-gang River. The monthly records of the four stations from 2000 to 2012 show that at the Long-dong bridge station, ammonia nitrogen and BOD5 trend downward while DO and SS trend upward, and the RPI trends downward; the Chau-Jhou station shows similar trends. At the upstream Sing-She station, where livestock raising is common, ammonia nitrogen trends upward, BOD5 shows no significant change, DO and SS trend upward, and the RPI shows a slight downward trend. Sewer construction in the Dong-gang River Basin is progressing slowly; to reduce pollution in this river, efforts should be made toward regulatory reform of livestock waste control and acceleration of sewer construction. Keywords: quantile regression analysis, BOD5, RPI

  7. Early origins of inflammation: An examination of prenatal and childhood social adversity in a prospective cohort study.

    PubMed

    Slopen, Natalie; Loucks, Eric B; Appleton, Allison A; Kawachi, Ichiro; Kubzansky, Laura D; Non, Amy L; Buka, Stephen; Gilman, Stephen E

    2015-01-01

    Children exposed to social adversity carry a greater risk of poor physical and mental health into adulthood. This increased risk is thought to be due, in part, to inflammatory processes associated with early adversity that contribute to the etiology of many adult illnesses. The current study asks whether aspects of the prenatal social environment are associated with levels of inflammation in adulthood, and whether prenatal and childhood adversity both contribute to adult inflammation. We examined associations of prenatal and childhood adversity assessed through direct interviews of participants in the Collaborative Perinatal Project between 1959 and 1974 with blood levels of C-reactive protein in 355 offspring interviewed in adulthood (mean age=42.2 years). Linear and quantile regression models were used to estimate the effects of prenatal adversity and childhood adversity on adult inflammation, adjusting for age, sex, race, and other potential confounders. In separate linear regression models, high levels of prenatal and childhood adversity were associated with higher CRP in adulthood. When prenatal and childhood adversity were analyzed together, our results support the presence of an effect of prenatal adversity on (log) CRP level in adulthood (β=0.73, 95% CI: 0.26, 1.20) that is independent of childhood adversity and potential confounding factors including maternal health conditions reported during pregnancy. Supplemental analyses revealed similar findings using quantile regression models and logistic regression models that used a clinically relevant CRP threshold (>3 mg/L). In a fully-adjusted model that included childhood adversity, high prenatal adversity was associated with a 3-fold elevated odds (95% CI: 1.15, 8.02) of having a CRP level in adulthood that indicates high risk of cardiovascular disease. Social adversity during the prenatal period is a risk factor for elevated inflammation in adulthood independent of adversities during childhood. This evidence is consistent with studies demonstrating that adverse exposures in the maternal environment during gestation have lasting effects on development of the immune system. If these results reflect causal associations, they suggest that interventions to improve the social and environmental conditions of pregnancy would promote health over the life course. It remains necessary to identify the mechanisms that link maternal conditions during pregnancy to the development of fetal immune and other systems involved in adaptation to environmental stressors. Copyright © 2014 Elsevier Ltd. All rights reserved.

  8. miRNA Temporal Analyzer (mirnaTA): a bioinformatics tool for identifying differentially expressed microRNAs in temporal studies using normal quantile transformation.

    PubMed

    Cer, Regina Z; Herrera-Galeano, J Enrique; Anderson, Joseph J; Bishop-Lilly, Kimberly A; Mokashi, Vishwesh P

    2014-01-01

    Understanding the biological roles of microRNAs (miRNAs) is an active area of research that has produced a surge of publications in PubMed, particularly in cancer research. Along with this increasing interest, many open-source bioinformatics tools to identify existing and/or discover novel miRNAs in next-generation sequencing (NGS) reads have become available. While miRNA identification and discovery tools have improved significantly, the development of miRNA differential expression analysis tools, especially for temporal studies, remains substantially challenging. Further, the installation of currently available software is non-trivial, and the steps of testing with example datasets, trying one's own dataset, and interpreting the results require notable expertise and time. Consequently, there is a strong need for a tool that allows scientists to normalize raw data, perform statistical analyses, and obtain intuitive results without having to invest significant effort. We have developed miRNA Temporal Analyzer (mirnaTA), a bioinformatics package to identify differentially expressed miRNAs in temporal studies. mirnaTA is written in Perl and R (Version 2.13.0 or later) and can be run across multiple platforms, such as Linux, Mac and Windows. In the current version, mirnaTA requires users to provide a simple, tab-delimited matrix file containing miRNA names and count data from a minimum of two to a maximum of 20 time points and three replicates. To recalibrate data and remove technical variability, raw data are normalized using Normal Quantile Transformation (NQT), and a linear regression model is used to locate any miRNAs that are differentially expressed in a linear pattern. Subsequently, remaining miRNAs that do not fit a linear model are further analyzed with two different non-linear methods: 1) cumulative distribution function (CDF) or 2) analysis of variance (ANOVA). After both linear and non-linear analyses are completed, statistically significant miRNAs (P < 0.05) are plotted as heat maps using hierarchical cluster analysis and Euclidean distance matrix computation. mirnaTA is an open-source bioinformatics tool to aid scientists in identifying differentially expressed miRNAs which could be further mined for biological significance. It is expected to provide researchers with a means of moving from raw data to statistical summaries in a fast and intuitive manner.
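
    mirnaTA's own implementation is not reproduced here; a generic normal quantile transformation of one count vector, using a standard plotting-position formula, might look like:

      import numpy as np
      from scipy import stats

      def normal_quantile_transform(counts):
          # Map ranks to standard-normal quantiles via (rank - 0.5) / n.
          counts = np.asarray(counts, dtype=float)
          ranks = stats.rankdata(counts)        # ties get average ranks
          return stats.norm.ppf((ranks - 0.5) / counts.size)

      raw = [5, 120, 7, 300, 48, 0, 15]         # illustrative miRNA counts
      print(np.round(normal_quantile_transform(raw), 3))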

  9. Estimating earnings losses due to mental illness: a quantile regression approach.

    PubMed

    Marcotte, Dave E; Wilcox-Gök, Virginia

    2003-09-01

    The ability of workers to remain productive and sustain earnings when afflicted with mental illness depends importantly on access to appropriate treatment and on flexibility and support from employers. In the United States there is substantial variation in access to health care and sick leave and other employment flexibilities across the earnings distribution. Consequently, a worker's ability to work and how much his/her earnings are impeded likely depend upon his/her position in the earnings distribution. Because of this, focusing on average earnings losses may provide insufficient information on the impact of mental illness in the labor market. In this paper, we examine the effects of mental illness on earnings by recognizing that effects could vary across the distribution of earnings. Using data from the National Comorbidity Survey, we employ a quantile regression estimator to identify the effects at key points in the earnings distribution. We find that earnings effects vary importantly across the distribution. While average effects are often not large, mental illness more commonly imposes earnings losses at the lower tail of the distribution, especially for women. In only one case do we find an illness to have negative effects across the distribution. Mental illness can have larger negative impacts on economic outcomes than previously estimated, even if those effects are not uniform. Consequently, researchers and policy makers alike should not be placated by findings that mean earnings effects are relatively small. Such estimates miss important features of how and where mental illness is associated with real economic losses for the ill.

  10. A data centred method to estimate and map changes in the full distribution of daily surface temperature

    NASA Astrophysics Data System (ADS)

    Chapman, Sandra; Stainforth, David; Watkins, Nicholas

    2016-04-01

    Characterizing how our climate is changing includes local information which can inform adaptation planning decisions. This requires quantifying the geographical patterns in changes at specific quantiles or thresholds in distributions of variables such as daily surface temperature. Here we focus on these local changes and on a model-independent method to transform daily observations into patterns of local climate change. Our method [1] is a simple mathematical deconstruction of how the difference between two observations from two different time periods can be assigned to the combination of natural statistical variability and/or the consequences of secular climate change. This deconstruction facilitates an assessment of how fast different quantiles of the distributions are changing. This involves determining not only which quantiles and geographical locations show the greatest change, but also those at which any change is highly uncertain. For temperature, changes in the distribution itself can yield robust results [2]. We demonstrate how the fundamental timescales of anthropogenic climate change limit the identification of societally relevant aspects of changes. We show that it is nevertheless possible to extract, solely from observations, some confident quantified assessments of change at certain thresholds and locations [3]. We demonstrate this approach using E-OBS gridded data [4] time series of local daily surface temperature from specific locations across Europe over the last 60 years. [1] Chapman, S. C., D. A. Stainforth, N. W. Watkins, On estimating long term local climate trends, Phil. Trans. Royal Soc. A, 371, 20120287 (2013) [2] Stainforth, D. A., S. C. Chapman, N. W. Watkins, Mapping climate change in European temperature distributions, ERL 8, 034031 (2013) [3] Chapman, S. C., Stainforth, D. A., Watkins, N. W., Limits to the quantification of local climate change, ERL 10, 094018 (2015) [4] Haylock M. R. et al., A European daily high-resolution gridded dataset of surface temperature and precipitation, J. Geophys. Res. (Atmospheres), 113, D20119 (2008)
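
    The heart of the approach, differencing empirical quantiles from two periods and bootstrapping to flag where change is highly uncertain, can be sketched as follows (synthetic series standing in for the E-OBS data):

      import numpy as np

      rng = np.random.default_rng(4)
      period1 = rng.normal(10.0, 5.0, 3650)    # stand-in for earlier decade
      period2 = rng.normal(11.0, 5.5, 3650)    # stand-in for later decade

      qs = np.array([0.05, 0.25, 0.50, 0.75, 0.95])
      change = np.quantile(period2, qs) - np.quantile(period1, qs)

      # Bootstrap the quantile differences.
      boot = np.empty((1000, qs.size))
      for b in range(1000):
          boot[b] = (np.quantile(rng.choice(period2, period2.size), qs)
                     - np.quantile(rng.choice(period1, period1.size), qs))
      lo, hi = np.percentile(boot, [2.5, 97.5], axis=0)
      for q, c, l, h in zip(qs, change, lo, hi):
          print(f"q={q:.2f}: change {c:+.2f} (95% CI {l:+.2f}, {h:+.2f})")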

  11. Analysis and trends of precipitation lapse rate and extreme indices over north Sikkim eastern Himalayas under CMIP5ESM-2M RCPs experiments

    NASA Astrophysics Data System (ADS)

    Singh, Vishal; Goyal, Manish Kumar

    2016-01-01

    This paper highlights the spatial and temporal variability in the precipitation lapse rate (PLR) and precipitation extreme indices (PEIs) through a mesoscale characterization of the Teesta river catchment in the north Sikkim eastern Himalayas. The PLR is an important variable for snowmelt runoff models. In a mountainous region, the PLR can vary from lower to higher elevation areas. In this study, the PLR was computed by accounting for elevation differences, which range from around 1500 m to 7000 m. Precipitation variability and extremes were analysed using multiple mathematical tools, viz. quantile regression, spatial mean, spatial standard deviation, the Mann-Kendall test and Sen's estimator. Daily precipitation, both observed gridded data for the historical period (1980-2005) and projections for the 21st century (2006-2100) simulated by the CMIP5 ESM-2M model (Coupled Model Intercomparison Project Phase 5 Earth System Model 2) under three radiative forcing scenarios (Representative Concentration Pathways, RCPs), was used for this work. The outcomes of this study suggest that the PLR varies significantly from lower to higher elevation areas. The PEI-based analysis showed that extreme high-intensity events increase significantly, especially after the 2040s. The PEI-based observations also showed that the number of wet days increases for all the RCPs. The quantile regression plots showed significant increases in the upper and lower quantiles of the various extreme indices. The Mann-Kendall test and Sen's estimator clearly indicated significant changes in the frequency and intensity of the precipitation indices across all the sub-basins and RCP scenarios on an intra-decadal time scale. RCP8.5 showed the most extreme projected outcomes.

  12. Do High Consumers of Sugar-Sweetened Beverages Respond Differently to Price Changes? A Finite Mixture IV-Tobit Approach.

    PubMed

    Etilé, Fabrice; Sharma, Anurag

    2015-09-01

    This study compares the impact of sugar-sweetened beverages (SSBs) tax between moderate and high consumers in Australia. The key methodological contribution is that price response heterogeneity is identified while controlling for censoring of consumption at zero and endogeneity of expenditure by using a finite mixture instrumental variable Tobit model. The SSB price elasticity estimates show a decreasing trend across increasing consumption quantiles, from -2.3 at the median to -0.2 at the 95th quantile. Although high consumers of SSBs have a less elastic demand for SSBs, their very high consumption levels imply that a tax would achieve higher reduction in consumption and higher health gains. Our results also suggest that an SSB tax would represent a small fiscal burden for consumers whatever their pre-policy level of consumption, and that an excise tax should be preferred to an ad valorem tax. Copyright © 2015 John Wiley & Sons, Ltd.

  13. Income elasticity of health expenditures in Iran.

    PubMed

    Zare, Hossein; Trujillo, Antonio J; Leidman, Eva; Buttorff, Christine

    2013-09-01

    Because of its policy implications, the income elasticity of health care expenditures is a subject of much debate. Governments may have an interest in subsidizing the care of those with low income. Using more than two decades of data from the Iran Household Expenditure and Income Survey, this article investigates the relationship between income and health care expenditure in urban and rural areas of Iran, a resource-rich, upper-middle-income country. We implemented spline and quantile regression techniques to obtain a more robust description of the relationship of interest. This study finds non-uniform effects of income on health expenditures. Although the results show that health care is a necessity for all income brackets, spline regression estimates indicate that the income elasticity is lowest for the poorest Iranians in urban and rural areas. This suggests that they will show little flexibility in medical expenses as income fluctuates. Further, a quantile regression model assessing the effect of income at different levels of medical expenditure suggests that households with lower medical expenses are less elastic.

  14. Intraday return inefficiency and long memory in the volatilities of forex markets and the role of trading volume

    NASA Astrophysics Data System (ADS)

    Shahzad, Syed Jawad Hussain; Hernandez, Jose Areola; Hanif, Waqas; Kayani, Ghulam Mujtaba

    2018-09-01

    We investigate the dynamics of efficiency and long memory, and the impact of trading volume on the efficiency of returns and volatilities of four major traded currencies, namely, the EUR, GBP, CHF and JPY. We do so by implementing full sample and rolling window multifractal detrended fluctuation analysis (MF-DFA) and a quantile-on-quantile (QQ) approach. This paper sheds new light by employing high frequency (5-min interval) data spanning from Jan 1, 2007 to Dec 31, 2016. Realized volatilities are estimated using Andersen et al.'s (2001) measure, while the QQ method employed is drawn from Sim and Zhou (2015). We find evidence of higher efficiency levels in the JPY and CHF currency markets. The impact of trading volume on efficiency is only significant for the JPY and CHF currencies. The GBP currency appears to be the least efficient, followed by the EUR. Implications of the results are discussed.

  15. A Study on Regional Frequency Analysis using Artificial Neural Network - the Sumjin River Basin

    NASA Astrophysics Data System (ADS)

    Jeong, C.; Ahn, J.; Ahn, H.; Heo, J. H.

    2017-12-01

    Regional frequency analysis compensates for the limited sample sizes of at-site frequency analysis by pooling information across a region. Regional rainfall quantiles depend on the identification of hydrologically homogeneous regions; hence, regional classification based on the hydrological homogeneity assumption is very important. For regional clustering of rainfall, multidimensional variables related to geographical features and meteorological characteristics are considered, such as mean annual precipitation, the number of days with precipitation in a year, and the average maximum daily precipitation in a month. The Self-Organizing Feature Map (SOM) method, an unsupervised artificial neural network algorithm, handles N-dimensional, nonlinear problems and presents results simply as a data visualization technique. In this study, for the Sumjin river basin in South Korea, cluster analysis was performed with the SOM method using high-dimensional geographical features and meteorological factors as input data. The L-moment based discordancy and heterogeneity measures were then used to evaluate the homogeneity of the resulting regions. Rainfall quantiles were estimated with the index flood method, a standard approach in regional rainfall frequency analysis. The clustering results obtained with the SOM method and the consequent variation in rainfall quantiles were analyzed. This research was supported by a grant (2017-MPSS31-001) from the Supporting Technology Development Program for Disaster Management funded by the Ministry of Public Safety and Security (MPSS) of the Korean government.
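
    A toy sketch of the final index flood step (hypothetical growth factors; the SOM clustering and L-moment screening are omitted):

      import numpy as np

      # Index flood method: at-site quantile = site index (here the mean
      # annual maximum rainfall) times a dimensionless regional growth factor.
      annual_maxima = np.array([82.0, 95.5, 110.2, 78.9, 130.4])  # mm, toy data
      site_index = annual_maxima.mean()

      # Hypothetical growth factors, as would be derived from the pooled
      # regional frequency distribution for selected return periods T.
      growth = {10: 1.45, 50: 1.92, 100: 2.15}
      for T, g in growth.items():
          print(f"T = {T:3d} yr rainfall quantile: {site_index * g:.1f} mm")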

  16. Group sequential designs for stepped-wedge cluster randomised trials

    PubMed Central

    Grayling, Michael J; Wason, James MS; Mander, Adrian P

    2017-01-01

    Background/Aims: The stepped-wedge cluster randomised trial design has received substantial attention in recent years. Although various extensions to the original design have been proposed, no guidance is available on the design of stepped-wedge cluster randomised trials with interim analyses. In an individually randomised trial setting, group sequential methods can provide notable efficiency gains and ethical benefits. We address this by discussing how established group sequential methodology can be adapted for stepped-wedge designs. Methods: Utilising the error spending approach to group sequential trial design, we detail the assumptions required for the determination of stepped-wedge cluster randomised trials with interim analyses. We consider early stopping for efficacy, futility, or efficacy and futility. We describe first how this can be done for any specified linear mixed model for data analysis. We then focus on one particular commonly utilised model and, using a recently completed stepped-wedge cluster randomised trial, compare the performance of several designs with interim analyses to the classical stepped-wedge design. Finally, the performance of a quantile substitution procedure for dealing with the case of unknown variance is explored. Results: We demonstrate that the incorporation of early stopping in stepped-wedge cluster randomised trial designs could reduce the expected sample size under the null and alternative hypotheses by up to 31% and 22%, respectively, with no cost to the trial’s type-I and type-II error rates. The use of restricted error maximum likelihood estimation was found to be more important than quantile substitution for controlling the type-I error rate. Conclusion: The addition of interim analyses into stepped-wedge cluster randomised trials could help guard against time-consuming trials conducted on poor performing treatments and also help expedite the implementation of efficacious treatments. In future, trialists should consider incorporating early stopping of some kind into stepped-wedge cluster randomised trials according to the needs of the particular trial. PMID:28653550

  17. Group sequential designs for stepped-wedge cluster randomised trials.

    PubMed

    Grayling, Michael J; Wason, James Ms; Mander, Adrian P

    2017-10-01

    The stepped-wedge cluster randomised trial design has received substantial attention in recent years. Although various extensions to the original design have been proposed, no guidance is available on the design of stepped-wedge cluster randomised trials with interim analyses. In an individually randomised trial setting, group sequential methods can provide notable efficiency gains and ethical benefits. We address this by discussing how established group sequential methodology can be adapted for stepped-wedge designs. Utilising the error spending approach to group sequential trial design, we detail the assumptions required for the determination of stepped-wedge cluster randomised trials with interim analyses. We consider early stopping for efficacy, futility, or efficacy and futility. We describe first how this can be done for any specified linear mixed model for data analysis. We then focus on one particular commonly utilised model and, using a recently completed stepped-wedge cluster randomised trial, compare the performance of several designs with interim analyses to the classical stepped-wedge design. Finally, the performance of a quantile substitution procedure for dealing with the case of unknown variance is explored. We demonstrate that the incorporation of early stopping in stepped-wedge cluster randomised trial designs could reduce the expected sample size under the null and alternative hypotheses by up to 31% and 22%, respectively, with no cost to the trial's type-I and type-II error rates. The use of restricted error maximum likelihood estimation was found to be more important than quantile substitution for controlling the type-I error rate. The addition of interim analyses into stepped-wedge cluster randomised trials could help guard against time-consuming trials conducted on poor performing treatments and also help expedite the implementation of efficacious treatments. In future, trialists should consider incorporating early stopping of some kind into stepped-wedge cluster randomised trials according to the needs of the particular trial.
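
    The quantile substitution procedure mentioned above is not spelled out in these abstracts; its basic device, swapping a normal quantile for a t quantile when the variance must be estimated, can be sketched as:

      from scipy import stats

      alpha = 0.025                        # one-sided level, single analysis
      z_crit = stats.norm.ppf(1 - alpha)   # known-variance boundary

      # Quantile substitution: use a t quantile whose degrees of freedom
      # reflect the variance estimate (df value illustrative only).
      df = 20
      t_crit = stats.t.ppf(1 - alpha, df)

      print(f"normal boundary {z_crit:.3f} -> substituted t boundary {t_crit:.3f}")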

  18. Heritability Across the Distribution: An Application of Quantile Regression

    PubMed Central

    Petrill, Stephen A.; Hart, Sara A.; Schatschneider, Christopher; Thompson, Lee A.; Deater-Deckard, Kirby; DeThorne, Laura S.; Bartlett, Christopher

    2016-01-01

    We introduce a new method for analyzing twin data called quantile regression. Through the application presented here, quantile regression is able to assess the genetic and environmental etiology of any skill or ability, at multiple points in the distribution of that skill or ability. This method is compared to the Cherny et al. (Behav Genet 22:153–162, 1992) method in an application to four different reading-related outcomes in 304 pairs of first-grade same sex twins enrolled in the Western Reserve Reading Project. Findings across the two methods were similar; both indicated some variation across the distribution of the genetic and shared environmental influences on non-word reading. However, quantile regression provides more details about the location and size of the measured effect. Applications of the technique are discussed. PMID:21877231

  19. Statistical Models and Inference Procedures for Structural and Materials Reliability

    DTIC Science & Technology

    1990-12-01

    …Some general stress-strength models were also developed and applied to the failure of systems subject to cyclic loading. Involved in the failure of… process control ideas and sequential design and analysis methods. Finally, smooth nonparametric quantile function estimators were studied. All of…

  20. Factors influencing riverine fish assemblages in Massachusetts

    USGS Publications Warehouse

    Armstrong, David S.; Richards, Todd A.; Levin, Sara B.

    2011-01-01

    The U.S. Geological Survey, in cooperation with the Massachusetts Department of Conservation and Recreation, Massachusetts Department of Environmental Protection, and the Massachusetts Department of Fish and Game, conducted an investigation of fish assemblages in small- to medium-sized Massachusetts streams. The objective of this study was to determine relations between fish-assemblage characteristics and anthropogenic factors, including impervious cover and estimated flow alteration, relative to the effects of environmental factors, including physical-basin characteristics and land use. The results of this investigation supersede those of a preliminary analysis published in 2010. Fish data were obtained for 669 fish-sampling sites from the Massachusetts Division of Fisheries and Wildlife fish-community database. A review of the literature was used to select fish metrics - species richness, abundance of individual species, and abundances of species grouped on life history traits - responsive to flow alteration. The contributing areas to the fish-sampling sites were delineated and used with a geographic information system to determine a set of environmental and anthropogenic factors that were tested for use as explanatory variables in regression models. Reported and estimated withdrawals and return flows were used together with simulated unaltered streamflows to estimate altered streamflows and indicators of flow alteration for each fish-sampling site. Altered streamflows and indicators of flow alteration were calculated on the basis of methods developed in a previous U.S. Geological Survey study in which unaltered daily streamflows were simulated for a 44-year period (water years 1961-2004), and streamflow alterations were estimated by use of water-withdrawal and wastewater-return data previously reported to the State for the 2000-04 period and estimated domestic-well withdrawals and septic-system discharges. A variable selection process, conducted using principal components analysis and Spearman rank correlation, was used to select a set of 15 non-redundant environmental and anthropogenic factors to test for use as explanatory variables in the regression analyses. Twenty-one fish species were used in a multivariate analysis of fish-assemblage patterns. Results of nonmetric multidimensional scaling and hierarchical cluster analysis were used to group fish species into fluvial and macrohabitat generalist habitat-use classes. Two analytical techniques, quantile regression and generalized linear modeling, were applied to characterize the association between fish-response variables and environmental and anthropogenic explanatory variables. Quantile regression demonstrated that as percent impervious cover and an indicator of percent alteration of August median flow from groundwater withdrawals increase, the relative abundance and species richness of fluvial fish decrease. The quantile regression plots indicate that (1) as many as seven fluvial fish species are expected in streams with little flow alteration or impervious cover, (2) no more than four fluvial fish species are expected in streams where flow alterations from groundwater withdrawals exceed 50 percent of the August median flow or the percent area of impervious cover exceeds 15 percent, and (3) few fluvial fish remain at high rates of withdrawal (approaching 100 percent) or high rates of impervious cover (between 25 and 30 percent). 
Three generalized linear models (GLMs) were developed to quantify the response of fluvial fish to multiple environmental and anthropogenic variables. All variables in the GLM equations were demonstrated to be significant (p less than 0.05, with most less than 0.01). Variables in the fluvial-fish relative-abundance model were channel slope, estimated percent alteration of August median flow from groundwater withdrawals, percent wetland in a 240-meter buffer strip, and percent impervious cover. Variables in the fluvial-fish species-richness model were drainage area, channel slope, total undammed reach length, percent wetland in a 240-meter buffer strip, and percent impervious cover. Variables in the brook trout relative-abundance model were drainage area, percent open water, and percent impervious cover. The variability explained by the GLMs, as measured by the pseudo R2, ranged from 18.2 to 34.6, and correlations between observed and predicted values ranged from 0.50 to 0.60. Results of the GLMs indicated that, keeping all other variables the same, a one-unit (1 percent) increase in the percent depletion of August median flow would result in a 0.9-percent decrease in the relative abundance (in counts per hour) of fluvial fish. The results also indicated that a unit increase in impervious cover (1 percent) resulted in a 3.7-percent decrease in the relative abundance of fluvial fish, a 5.4-percent decrease in fluvial-fish species richness, and an 8.7-percent decrease in brook trout relative abundance.
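
    The percent-change statements above are consistent with a log-link GLM, in which a one-unit covariate increase multiplies the expected response by exp(beta); treating the reported percentages as illustrative, the implied coefficients can be back-calculated:

      import math

      # For a log-link GLM, a one-unit covariate increase changes the
      # expected response by 100 * (exp(beta) - 1) percent.
      reported = {"August flow depletion (abundance)": -0.9,
                  "impervious cover (abundance)": -3.7,
                  "impervious cover (richness)": -5.4}
      for name, pct in reported.items():
          beta = math.log(1 + pct / 100)    # implied coefficient
          print(f"{name}: implied beta {beta:+.4f}")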

  1. Non-inferiority tests for anti-infective drugs using control group quantiles.

    PubMed

    Fay, Michael P; Follmann, Dean A

    2016-12-01

    In testing for non-inferiority of anti-infective drugs, the primary endpoint is often the difference in the proportion of failures between the test and control group at a landmark time. The landmark time is chosen to approximately correspond to the qth historic quantile of the control group, and the non-inferiority margin is selected to be reasonable for the target level q. For designing these studies, a troubling issue is that the landmark time must be pre-specified, but there is no guarantee that the proportion of control failures at the landmark time will be close to the target level q. If the landmark time is far from the target control quantile, then the pre-specified non-inferiority margin may no longer be reasonable. Exact variable margin tests have been developed by Röhmel and Kieser to address this problem, but these tests can have poor power if the observed control failure rate at the landmark time is far from its historic value. We develop a new variable margin non-inferiority test where we continue sampling until a pre-specified proportion of failures, q, have occurred in the control group, where q is the target quantile level. The test does not require any assumptions on the failure time distributions, and hence no knowledge of the true qth control quantile for the study is needed. Our new test is exact and has power comparable to (or greater than) its competitors when the true control quantile from the study equals (or differs moderately from) its historic value. Our nivm R package performs the test and gives confidence intervals on the difference in failure rates at the true target control quantile. The tests can be applied to time to cure or other numeric variables as well. A substantial proportion of new anti-infective drugs being developed use non-inferiority tests in their development, and typically a pre-specified landmark time and its associated difference margin are set at the design stage to match a specific target control quantile. If, through a changing standard of care or the selection of a different population, the target quantile for the control group changes from its historic value, then the appropriateness of the pre-specified margin at the landmark time may be questionable. Our proposed test avoids this problem by sampling until a pre-specified proportion of the controls have failed. © The Author(s) 2016.

  2. Extending the Distributed Lag Model framework to handle chemical mixtures.

    PubMed

    Bello, Ghalib A; Arora, Manish; Austin, Christine; Horton, Megan K; Wright, Robert O; Gennings, Chris

    2017-07-01

    Distributed Lag Models (DLMs) are used in environmental health studies to analyze the time-delayed effect of an exposure on an outcome of interest. Given the increasing need for analytical tools for evaluation of the effects of exposure to multi-pollutant mixtures, this study attempts to extend the classical DLM framework to accommodate and evaluate multiple longitudinally observed exposures. We introduce 2 techniques for quantifying the time-varying mixture effect of multiple exposures on an outcome of interest. Lagged WQS, the first technique, is based on Weighted Quantile Sum (WQS) regression, a penalized regression method that estimates mixture effects using a weighted index. We also introduce Tree-based DLMs, a nonparametric alternative for assessment of lagged mixture effects. This technique is based on the Random Forest (RF) algorithm, a nonparametric, tree-based estimation technique that has shown excellent performance in a wide variety of domains. In a simulation study, we tested the feasibility of these techniques and evaluated their performance in comparison to standard methodology. Both methods exhibited relatively robust performance, accurately capturing pre-defined non-linear functional relationships in different simulation settings. Further, we applied these techniques to data on perinatal exposure to environmental metal toxicants, with the goal of evaluating the effects of exposure on neurodevelopment. Our methods identified critical neurodevelopmental windows showing significant sensitivity to metal mixtures. Copyright © 2017 Elsevier Inc. All rights reserved.
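
    A bare-bones sketch of the WQS building block, quartile-scoring each exposure and estimating non-negative index weights that sum to one (the lagged and penalized machinery of the paper is omitted):

      import numpy as np
      from scipy.optimize import minimize

      rng = np.random.default_rng(9)
      n, p = 300, 4
      X = rng.normal(size=(n, p))                  # synthetic exposures
      # Score each exposure into quartiles 0-3.
      q = np.column_stack([np.searchsorted(np.quantile(x, [0.25, 0.5, 0.75]), x)
                           for x in X.T])
      true_w = np.array([0.6, 0.3, 0.1, 0.0])
      y = 1.0 + 0.5 * q @ true_w + rng.normal(0, 0.5, n)

      def loss(params):
          b0, b1, w = params[0], params[1], params[2:]
          return np.mean((y - b0 - b1 * (q @ w)) ** 2)

      res = minimize(loss, x0=[0.0, 0.0, *np.full(p, 1 / p)],
                     bounds=[(None, None)] * 2 + [(0, 1)] * p,
                     constraints=[{"type": "eq",
                                   "fun": lambda v: v[2:].sum() - 1}])
      print("estimated weights:", np.round(res.x[2:], 2))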

  3. The effectiveness of drinking and driving policies for different alcohol-related fatalities: a quantile regression analysis.

    PubMed

    Ying, Yung-Hsiang; Wu, Chin-Chih; Chang, Koyin

    2013-09-27

    To understand the impact of drinking and driving laws on drinking and driving fatality rates, this study explored the different effects these laws have on areas with varying severity rates for drinking and driving. Unlike previous studies, this study employed quantile regression analysis. Empirical results showed that policies based on local conditions must be used to effectively reduce drinking and driving fatality rates; that is, different measures should be adopted to target the specific conditions in various regions. For areas with low fatality rates (low quantiles), people's habits and attitudes toward alcohol should be emphasized instead of transportation safety laws because "preemptive regulations" are more effective. For areas with high fatality rates (or high quantiles), "ex-post regulations" are more effective, and impact these areas approximately 0.01% to 0.05% more than they do areas with low fatality rates.

  4. Spatial quantile regression using INLA with applications to childhood overweight in Malawi.

    PubMed

    Mtambo, Owen P L; Masangwi, Salule J; Kazembe, Lawrence N M

    2015-04-01

    Analyses of childhood overweight have mainly used mean regression. However, using quantile regression is more appropriate as it provides flexibility to analyse the determinants of overweight corresponding to quantiles of interest. The main objective of this study was to fit a Bayesian additive quantile regression model with structured spatial effects for childhood overweight in Malawi using the 2010 Malawi DHS data. Inference was fully Bayesian using R-INLA package. The significant determinants of childhood overweight ranged from socio-demographic factors such as type of residence to child and maternal factors such as child age and maternal BMI. We observed significant positive structured spatial effects on childhood overweight in some districts of Malawi. We recommended that the childhood malnutrition policy makers should consider timely interventions based on risk factors as identified in this paper including spatial targets of interventions. Copyright © 2015 Elsevier Ltd. All rights reserved.

  5. The Effectiveness of Drinking and Driving Policies for Different Alcohol-Related Fatalities: A Quantile Regression Analysis

    PubMed Central

    Ying, Yung-Hsiang; Wu, Chin-Chih; Chang, Koyin

    2013-01-01

    To understand the impact of drinking and driving laws on drinking and driving fatality rates, this study explored the different effects these laws have on areas with varying severity rates for drinking and driving. Unlike previous studies, this study employed quantile regression analysis. Empirical results showed that policies based on local conditions must be used to effectively reduce drinking and driving fatality rates; that is, different measures should be adopted to target the specific conditions in various regions. For areas with low fatality rates (low quantiles), people’s habits and attitudes toward alcohol should be emphasized instead of transportation safety laws because “preemptive regulations” are more effective. For areas with high fatality rates (or high quantiles), “ex-post regulations” are more effective, and impact these areas approximately 0.01% to 0.05% more than they do areas with low fatality rates. PMID:24084673

  6. Microarray image analysis: background estimation using quantile and morphological filters.

    PubMed

    Bengtsson, Anders; Bengtsson, Henrik

    2006-02-28

    In a microarray experiment the difference in expression between genes on the same slide is up to 10³-fold or more. At low expression, even a small error in the estimate will have great influence on the final test and reference ratios. In addition to the true spot intensity, the scanned signal consists of different kinds of noise referred to as background. In order to assess the true spot intensity, background must be subtracted. The standard approach to estimating background intensities is to assume they are equal to the intensity levels between spots. In the literature, morphological opening is suggested to be one of the best methods for estimating background this way. This paper examines fundamental properties of rank and quantile filters, which include morphological filters at the extremes, with a focus on their ability to estimate between-spot intensity levels. The bias and variance of these filter estimates are driven by the number of background pixels used and their distributions. A new rank-filter algorithm is implemented and compared to methods available in Spot by CSIRO and GenePix Pro by Axon Instruments. Spot's morphological opening has a mean bias between -47 and -248, compared to a bias between 2 and -2 for the rank filter, and the variability of the morphological opening estimate is 3 times higher than for the rank filter. The mean bias of Spot's second method, morph.close.open, is between -5 and -16, and the variability is approximately the same as for morphological opening. The variability of GenePix Pro's region-based estimate is more than ten times higher than the variability of the rank-filter estimate, and with slightly more bias. The large variability arises because the size of the background window changes with spot size. To overcome this, a non-adaptive region-based method is implemented. Its bias and variability are comparable to those of the rank filter. The performance of more advanced rank filters is equal to that of the best region-based methods. However, in order to get unbiased estimates, these filters have to be implemented with great care. The performance of morphological opening is in general poor, with a substantial spatially dependent bias.
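
    A sketch of quantile-filter background estimation on a synthetic image with scipy.ndimage (window size and percentile are illustrative choices, not those of the paper):

      import numpy as np
      from scipy import ndimage

      rng = np.random.default_rng(5)

      # Synthetic slide: smooth background plus two bright spots.
      background = 100 + 20 * rng.random((128, 128))
      image = background.copy()
      image[30:34, 30:34] += 500
      image[80:85, 90:95] += 800

      # A low-percentile (rank) filter tracks between-spot intensity
      # levels while ignoring bright spot pixels.
      bg_rank = ndimage.percentile_filter(image, percentile=10, size=15)

      # Morphological opening, the standard method examined in the paper.
      bg_open = ndimage.grey_opening(image, size=15)

      print("rank-filter bias:", np.median(bg_rank - background).round(1))
      print("opening bias:   ", np.median(bg_open - background).round(1))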

  7. Data-driven modeling of surface temperature anomaly and solar activity trends

    USGS Publications Warehouse

    Friedel, Michael J.

    2012-01-01

    A novel two-step modeling scheme is used to reconstruct and analyze surface temperature and solar activity data at global, hemispheric, and regional scales. First, the self-organizing map (SOM) technique is used to extend annual modern climate data from the century to millennial scale. The SOM component planes are used to identify and quantify strength of nonlinear relations among modern surface temperature anomalies (<150 years), tropical and extratropical teleconnections, and Palmer Drought Severity Indices (0–2000 years). Cross-validation of global sea and land surface temperature anomalies verifies that the SOM is an unbiased estimator with less uncertainty than the magnitude of anomalies. Second, the quantile modeling of SOM reconstructions reveal trends and periods in surface temperature anomaly and solar activity whose timing agrees with published studies. Temporal features in surface temperature anomalies, such as the Medieval Warm Period, Little Ice Age, and Modern Warming Period, appear at all spatial scales but whose magnitudes increase when moving from ocean to land, from global to regional scales, and from southern to northern regions. Some caveats that apply when interpreting these data are the high-frequency filtering of climate signals based on quantile model selection and increased uncertainty when paleoclimatic data are limited. Even so, all models find the rate and magnitude of Modern Warming Period anomalies to be greater than those during the Medieval Warm Period. Lastly, quantile trends among reconstructed equatorial Pacific temperature profiles support the recent assertion of two primary El Niño Southern Oscillation types. These results demonstrate the efficacy of this alternative modeling approach for reconstructing and interpreting scale-dependent climate variables.

  8. Copula-based assessment of the relationship between flood peaks and flood volumes using information on historical floods by Bayesian Monte Carlo Markov Chain simulations

    NASA Astrophysics Data System (ADS)

    Gaál, Ladislav; Szolgay, Ján; Bacigál, Tomáš; Kohnová, Silvia

    2010-05-01

    Copula-based estimation methods for hydro-climatological extremes have been gaining increasing attention from researchers and practitioners in recent years. Unlike traditional estimation methods based on bivariate cumulative distribution functions (CDFs), copulas are a relatively flexible statistical tool that allows for modelling dependencies between two or more variables, such as flood peaks and flood volumes, without making strict assumptions on the marginal distributions. The dependence structure and the reliability of the joint estimates of hydro-climatological extremes, mainly in the right tail of the joint CDF, depend not only on the particular copula adopted but also on the data available for the estimation of the marginal distributions of the individual variables. Generally, data samples for frequency modelling have limited temporal extent, which is a considerable drawback of frequency analyses in practice. Therefore, statistical methods that improve any part of the copula construction process and yield more reliable design values of hydrological variables are advisable. The scarcity of data, mostly in the extreme tail of the joint CDF, can be bypassed, e.g., by using a considerably larger amount of data simulated by rainfall-runoff analysis or by including historical information on the variables under study. The latter approach to data extension is used here to make the quantile estimates of the individual marginals of the copula more reliable. In this paper it is proposed to use historical information in the frequency analysis of the marginal distributions in the framework of Bayesian Monte Carlo Markov Chain (MCMC) simulations. Generally, a Bayesian approach allows for a straightforward combination of different sources of information on floods (e.g., flood data from systematic measurements and historical flood records) in terms of a product of the corresponding likelihood functions. On the other hand, the MCMC algorithm is a numerical approach for sampling from the likelihood distributions. Bayesian MCMC methods therefore provide an attractive way to estimate the uncertainty in parameters and quantile metrics of frequency distributions. The applicability of the method is demonstrated in a case study of the hydroelectric power station Orlík on the Vltava River. This site has a key role in the flood protection of Prague, the capital city of the Czech Republic. The record length of the available flood data is 126 years, from the period 1877-2002, and the flood event observed in 2002, which caused extensive damage and numerous casualties, is treated as a historic one. To estimate the joint probabilities of flood peaks and volumes, different copulas are fitted and their goodness of fit is evaluated by bootstrap simulations. Finally, selected quantiles of flood volumes conditioned on given flood peaks are derived and compared with those obtained by the traditional method used in the practice of water management specialists on the Vltava River.
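
    The Bayesian MCMC machinery is beyond a short sketch, but the underlying copula construction, joining assumed marginals through a dependence structure, can be illustrated with a Gaussian copula (all distributions and parameters hypothetical):

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(6)

      # Hypothetical marginals: Gumbel flood peaks, lognormal flood volumes.
      peak = stats.gumbel_r(loc=800, scale=250)     # m3/s
      volume = stats.lognorm(s=0.6, scale=40)       # 10^6 m3

      # Gaussian copula with assumed correlation rho.
      rho = 0.7
      z = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=10_000)
      u = stats.norm.cdf(z)                         # uniform margins

      peaks = peak.ppf(u[:, 0])
      volumes = volume.ppf(u[:, 1])

      # A conditional volume quantile given a large peak, as in the paper.
      big = peaks > np.quantile(peaks, 0.95)
      print("90% volume quantile given a top-5% peak:",
            round(float(np.quantile(volumes[big], 0.90)), 1))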

  9. Incense Burning during Pregnancy and Birth Weight and Head Circumference among Term Births: The Taiwan Birth Cohort Study.

    PubMed

    Chen, Le-Yu; Ho, Christine

    2016-09-01

    Incense burning for rituals or religious purposes is an important tradition in many countries. However, incense smoke contains particulate matter and gas products such as carbon monoxide, sulfur, and nitrogen dioxide, which are potentially harmful to health. We analyzed the relationship between prenatal incense burning and birth weight and head circumference at birth using the Taiwan Birth Cohort Study. We also analyzed whether the associations varied by sex and along the distribution of birth outcomes. We performed ordinary least squares (OLS) and quantile regressions analysis on a sample of 15,773 term births (> 37 gestational weeks; 8,216 boys and 7,557 girls) in Taiwan in 2005. The associations were estimated separately for boys and girls as well as for the population as a whole. We controlled extensively for factors that may be correlated with incense burning and birth weight and head circumference, such as parental religion, demographics, and health characteristics, as well as pregnancy-related variables. Findings from fully adjusted OLS regressions indicated that exposure to incense was associated with lower birth weight in boys (-18 g; 95% CI: -36, -0.94) but not girls (1 g; 95% CI: -17, 19; interaction p-value = 0.31). Associations with head circumference were negative for boys (-0.95 mm; 95% CI: -1.8, -0.16) and girls (-0.71 mm; 95% CI: -1.5, 0.11; interaction p-values = 0.73). Quantile regression results suggested that the negative associations were larger among the lower quantiles of birth outcomes. OLS regressions showed that prenatal incense burning was associated with lower birth weight for boys and smaller head circumference for boys and girls. The associations were more pronounced among the lower quantiles of birth outcomes. Further research is necessary to confirm whether incense burning has differential effects by sex. Chen LY, Ho C. 2016. Incense burning during pregnancy and birth weight and head circumference among term births: The Taiwan Birth Cohort Study. Environ Health Perspect 124:1487-1492; http://dx.doi.org/10.1289/ehp.1509922.

  10. Incense Burning during Pregnancy and Birth Weight and Head Circumference among Term Births: The Taiwan Birth Cohort Study

    PubMed Central

    Chen, Le-Yu; Ho, Christine

    2016-01-01

    Background: Incense burning for rituals or religious purposes is an important tradition in many countries. However, incense smoke contains particulate matter and gas products such as carbon monoxide, sulfur, and nitrogen dioxide, which are potentially harmful to health. Objectives: We analyzed the relationship between prenatal incense burning and birth weight and head circumference at birth using the Taiwan Birth Cohort Study. We also analyzed whether the associations varied by sex and along the distribution of birth outcomes. Methods: We performed ordinary least squares (OLS) and quantile regressions analysis on a sample of 15,773 term births (> 37 gestational weeks; 8,216 boys and 7,557 girls) in Taiwan in 2005. The associations were estimated separately for boys and girls as well as for the population as a whole. We controlled extensively for factors that may be correlated with incense burning and birth weight and head circumference, such as parental religion, demographics, and health characteristics, as well as pregnancy-related variables. Results: Findings from fully adjusted OLS regressions indicated that exposure to incense was associated with lower birth weight in boys (–18 g; 95% CI: –36, –0.94) but not girls (1 g; 95% CI: –17, 19; interaction p-value = 0.31). Associations with head circumference were negative for boys (–0.95 mm; 95% CI: –1.8, –0.16) and girls (–0.71 mm; 95% CI: –1.5, 0.11; interaction p-values = 0.73). Quantile regression results suggested that the negative associations were larger among the lower quantiles of birth outcomes. Conclusions: OLS regressions showed that prenatal incense burning was associated with lower birth weight for boys and smaller head circumference for boys and girls. The associations were more pronounced among the lower quantiles of birth outcomes. Further research is necessary to confirm whether incense burning has differential effects by sex. Citation: Chen LY, Ho C. 2016. Incense burning during pregnancy and birth weight and head circumference among term births: The Taiwan Birth Cohort Study. Environ Health Perspect 124:1487–1492; http://dx.doi.org/10.1289/ehp.1509922 PMID:26967367

  11. Modeling long-term suspended-sediment export from an undisturbed forest catchment

    NASA Astrophysics Data System (ADS)

    Zimmermann, Alexander; Francke, Till; Elsenbeer, Helmut

    2013-04-01

    Most estimates of suspended sediment yields from humid, undisturbed, and geologically stable forest environments fall within a range of 5-30 t km⁻² a⁻¹. These low natural erosion rates in small headwater catchments (≤ 1 km²) support the common impression that a well-developed forest cover prevents surface erosion. Interestingly, those estimates originate exclusively from areas with prevailing vertical hydrological flow paths. Forest environments dominated by (near-) surface flow paths (overland flow, pipe flow, and return flow) and a fast response to rainfall, however, are not an exceptional phenomenon, yet only very few sediment yields have been estimated for these areas. Not surprisingly, even fewer long-term (≥ 10 years) records exist. In this contribution we present our latest research, which aims at quantifying long-term suspended-sediment export from an undisturbed rainforest catchment prone to frequent overland flow. A key aspect of our approach is the application of machine-learning techniques (Random Forest, Quantile Regression Forest), which allows not only the handling of non-Gaussian data, non-linear relations between predictors and response, and correlations between predictors, but also the assessment of prediction uncertainty. For the current study we provided the machine-learning algorithms exclusively with information from a high-resolution rainfall time series to reconstruct discharge and suspended sediment dynamics for a 21-year period. The significance of our results is threefold. First, our estimates clearly show that forest cover does not necessarily prevent erosion if wet antecedent conditions and large rainfalls coincide. During these situations, overland flow is widespread and sediment fluxes increase in a non-linear fashion due to the mobilization of new sediment sources. Second, our estimates indicate that annual suspended-sediment yields of the undisturbed forest catchment show large fluctuations. Depending on the frequency of large events, annual suspended-sediment yield varies between 74 and 416 t km⁻² a⁻¹. Third, the estimated sediment yields exceed former benchmark values by an order of magnitude and provide evidence that the erosion footprint of undisturbed, forested catchments can be indistinguishable from that of sustainably managed, but hydrologically less responsive, areas. Because of the susceptibility to soil loss we argue that any land use should be avoided in natural erosion hotspots.
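
    The abstract does not include code, but the Quantile Regression Forest idea can be approximated in a few lines: train an ordinary random forest and read prediction quantiles off the spread of the per-tree predictions. This is a simplified stand-in for Meinshausen's algorithm (which weights training responses in the leaves), and the predictors and data below are entirely hypothetical.

      import numpy as np
      from sklearn.ensemble import RandomForestRegressor

      # Hypothetical rainfall-derived predictors (e.g. intensity, antecedent wetness).
      rng = np.random.default_rng(1)
      X = rng.uniform(0.0, 1.0, size=(500, 3))
      y = 50.0 * X[:, 0] + 20.0 * X[:, 1] ** 2 + rng.lognormal(0.0, 0.5, 500)

      forest = RandomForestRegressor(n_estimators=500, min_samples_leaf=5,
                                     random_state=1).fit(X, y)

      # Quantiles of the per-tree predictions give a rough predictive distribution,
      # i.e. a sediment-flux estimate with an uncertainty band.
      X_new = rng.uniform(0.0, 1.0, size=(5, 3))
      per_tree = np.stack([t.predict(X_new) for t in forest.estimators_])
      lo, med, hi = np.quantile(per_tree, [0.05, 0.5, 0.95], axis=0)
      print(med, lo, hi)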

  12. Spline methods for approximating quantile functions and generating random samples

    NASA Technical Reports Server (NTRS)

    Schiess, J. R.; Matthews, C. G.

    1985-01-01

    Two cubic spline formulations are presented for representing the quantile function (inverse cumulative distribution function) of a random sample of data. Both B-spline and rational spline approximations are compared with analytic representations of the quantile function. It is also shown how these representations can be used to generate random samples for use in simulation studies. Comparisons are made on samples generated from known distributions and a sample of experimental data. The spline representations are more accurate for multimodal and skewed samples and require much less time to generate samples than the analytic representation.
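
    A minimal modern analogue of this idea, assuming only an observed sample: fit a monotone spline to the empirical quantile function and generate new draws by inverse-transform sampling. The sketch uses SciPy's shape-preserving PCHIP interpolant rather than the B-spline or rational spline forms studied in the paper, and the gamma sample is a placeholder for real data.

      import numpy as np
      from scipy.interpolate import PchipInterpolator

      rng = np.random.default_rng(2)
      sample = rng.gamma(2.0, 1.5, size=2000)   # stand-in for experimental data

      # Spline approximation of the quantile (inverse CDF) function. PCHIP preserves
      # the monotonicity of the empirical quantiles, so the interpolant stays valid.
      probs = np.linspace(0.01, 0.99, 99)
      qf = PchipInterpolator(probs, np.quantile(sample, probs))

      # Inverse-transform sampling: feed uniforms through the fitted quantile function.
      u = rng.uniform(0.01, 0.99, size=10000)
      simulated = qf(u)
      print(simulated.mean(), sample.mean())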

  13. Estimating extreme losses for the Florida Public Hurricane Model—part II

    NASA Astrophysics Data System (ADS)

    Gulati, Sneh; George, Florence; Hamid, Shahid

    2018-02-01

    Rising global temperatures are leading to an increase in the number of extreme events and losses (http://www.epa.gov/climatechange/science/indicators/). Accurate estimation of these extreme losses is critical to insurance companies, which must protect themselves against such events. In a previous paper, Gulati et al. (2014) discussed probable maximum loss (PML) estimation for the Florida Public Hurricane Loss Model (FPHLM) using parametric and nonparametric methods. In this paper, we investigate the use of semi-parametric methods to do the same. Detailed analysis of the data shows that the annual losses from FPHLM do not tend to be very heavy-tailed, and therefore neither the popular Hill estimator nor the moment estimator works well. However, Pickands' estimator with a threshold around the 84th percentile provides a good fit for the extreme quantiles of the losses.
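
    For readers unfamiliar with it, Pickands' estimator needs only three order statistics. A minimal sketch, with a synthetic Pareto-type loss sample (true tail index 1/3) standing in for the FPHLM losses:

      import numpy as np

      def pickands_xi(x, k):
          # Pickands (1975) estimator of the extreme value index from the
          # order statistics X(n-k+1), X(n-2k+1) and X(n-4k+1).
          xs = np.sort(x)
          n = xs.size
          if 4 * k > n:
              raise ValueError("need 4k <= n")
          num = xs[n - k] - xs[n - 2 * k]
          den = xs[n - 2 * k] - xs[n - 4 * k]
          return np.log(num / den) / np.log(2.0)

      rng = np.random.default_rng(3)
      losses = rng.pareto(3.0, size=5000)   # hypothetical losses, true index 1/3
      print(pickands_xi(losses, k=100))     # k sets the effective threshold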

  14. Restoration of Monotonicity Respecting in Dynamic Regression

    PubMed Central

    Huang, Yijian

    2017-01-01

    Dynamic regression models, including the quantile regression model and Aalen’s additive hazards model, are widely adopted to investigate evolving covariate effects. Yet lack of monotonicity respecting with standard estimation procedures remains an outstanding issue. Advances have recently been made, but none provides a complete resolution. In this article, we propose a novel adaptive interpolation method to restore monotonicity respecting, by successively identifying and then interpolating nearest monotonicity-respecting points of an original estimator. Under mild regularity conditions, the resulting regression coefficient estimator is shown to be asymptotically equivalent to the original. Our numerical studies demonstrate that the proposed estimator is much smoother and may have better finite-sample efficiency than the original, as well as than other competing monotonicity-respecting estimators where those are available (only in special cases). Illustration with a clinical study is provided. PMID:29430068
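
    The paper's adaptive interpolation is not reproduced in the abstract, but two standard, simpler repairs convey the problem it addresses: monotone rearrangement (sorting the estimates across quantile levels) and isotonic projection. The quantile curve below is simulated purely for illustration.

      import numpy as np
      from sklearn.isotonic import IsotonicRegression

      taus = np.linspace(0.05, 0.95, 19)
      rng = np.random.default_rng(4)
      # Hypothetical estimated quantile curve: should increase in tau, but crosses.
      raw = 2.0 * taus + rng.normal(0.0, 0.15, taus.size)

      rearranged = np.sort(raw)                                   # monotone rearrangement
      projected = IsotonicRegression().fit_transform(taus, raw)   # isotonic projection
      print(np.all(np.diff(rearranged) >= 0), np.all(np.diff(projected) >= 0))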

  15. Distributional changes in rainfall and river flow in Sarawak, Malaysia

    NASA Astrophysics Data System (ADS)

    Sa'adi, Zulfaqar; Shahid, Shamsuddin; Ismail, Tarmizi; Chung, Eun-Sung; Wang, Xiao-Jun

    2017-11-01

    Climate change may not alter mean rainfall, but it can alter rainfall variability and extremes. Therefore, it is necessary to explore the possible distributional changes of rainfall characteristics over time. The objective of the present study is to assess the distributional changes in annual and northeast monsoon rainfall (November-January) and river flow in Sarawak, where small changes in rainfall or river flow variability/distribution may have severe implications for ecology and agriculture. A quantile regression-based approach was used to assess the changes of scale and location of the empirical probability density function over the period 1980-2014 at 31 observational stations. The results indicate that diverse variation patterns exist at all stations for annual rainfall, but mainly increasing quantile trends at the lower quantiles, and at the higher quantiles for the months of January and December. The significant increase in annual rainfall is found mostly in the north and central-coastal region, and in monsoon-month rainfalls in the interior and north of Sarawak. Trends in river flow data show that changes in rainfall distribution have affected the higher quantiles of river flow in monsoon months at some of the basins, implying more flooding. The study reveals that quantile trends can provide more information on rainfall change, which may be useful for climate change mitigation and adaptation planning.

  16. Serum calcium and incident diabetes: an observational study and meta-analysis.

    PubMed

    Sing, C W; Cheng, V K F; Ho, D K C; Kung, A W C; Cheung, B M Y; Wong, I C K; Tan, K C B; Salas-Salvadó, J; Becerra-Tomas, N; Cheung, C L

    2016-05-01

    The study aimed to prospectively evaluate whether serum calcium is related to diabetes incidence in Hong Kong Chinese. The results showed that serum calcium has a significant association with increased risk of diabetes. The result of the meta-analysis reinforced our findings. This study aimed to evaluate the association of serum calcium, including serum total calcium and albumin-corrected calcium, with incident diabetes in Hong Kong Chinese. We conducted a retrospective cohort study in 6096 participants aged 20 or above and free of diabetes at baseline. Serum calcium was measured at baseline. Incident diabetes was determined from several electronic databases. We also searched relevant databases for studies on serum calcium and incident diabetes and conducted a meta-analysis using fixed-effect modeling. During 59,130.9 person-years of follow-up, 631 participants developed diabetes. Serum total calcium and albumin-corrected calcium were associated with incident diabetes in the unadjusted model. After adjusting for demographic and clinical variables, the association remained significant only for serum total calcium (hazard ratio (HR), 1.32 (95% confidence interval (CI), 1.02-1.70), highest vs. lowest quartile). In a meta-analysis of four studies including the current study, both serum total calcium (pooled risk ratio (RR), 1.38 (95% CI, 1.15-1.65); I² = 5%, comparing extreme quantiles) and albumin-corrected calcium (pooled RR, 1.29 (95% CI, 1.03-1.61); I² = 0%, comparing extreme quantiles) were associated with incident diabetes. Penalized regression splines showed that the association of incident diabetes with serum total calcium and albumin-corrected calcium was non-linear and linear, respectively. Elevated serum calcium concentration is associated with incident diabetes. The mechanism underlying this association warrants further investigation.
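
    The fixed-effect pooling used in this meta-analysis is simple inverse-variance weighting on the log scale. A sketch with made-up study estimates (only the 1.32 with 95% CI 1.02-1.70 comes from the abstract; the other rows are placeholders):

      import numpy as np

      rr = np.array([1.32, 1.45, 1.28, 1.50])        # study risk/hazard ratios
      ci_hi = np.array([1.70, 1.90, 1.66, 2.10])     # their upper 95% CI bounds

      log_rr = np.log(rr)
      se = (np.log(ci_hi) - log_rr) / 1.96           # back out standard errors
      w = 1.0 / se ** 2                              # inverse-variance weights

      pooled = np.sum(w * log_rr) / np.sum(w)
      se_pooled = 1.0 / np.sqrt(np.sum(w))
      print(np.exp(pooled), np.exp(pooled + np.array([-1.96, 1.96]) * se_pooled))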

  17. Assessing the impact of local meteorological variables on surface ozone in Hong Kong during 2000-2015 using quantile and multiple linear regression models

    NASA Astrophysics Data System (ADS)

    Zhao, Wei; Fan, Shaojia; Guo, Hai; Gao, Bo; Sun, Jiaren; Chen, Laiguo

    2016-11-01

    The quantile regression (QR) method has been increasingly introduced to atmospheric environmental studies to explore the non-linear relationship between local meteorological conditions and ozone mixing ratios. In this study, we applied QR for the first time, together with multiple linear regression (MLR), to analyze the dominant meteorological parameters influencing the mean, 10th percentile, 90th percentile and 99th percentile of maximum daily 8-h average (MDA8) ozone concentrations in 2000-2015 in Hong Kong. Dominance analysis (DA) was used to assess the relative importance of meteorological variables in the regression models. Results showed that the MLR models worked better at suburban and rural sites than at urban sites, and better in winter than in summer. QR models performed better in summer for the 99th and 90th percentiles and better in autumn and winter for the 10th percentile; they also performed better in suburban and rural areas for the 10th percentile. The top three dominant variables associated with MDA8 ozone concentrations varied with season and region but were generally drawn from six meteorological parameters: boundary layer height, humidity, wind direction, surface solar radiation, total cloud cover and sea level pressure. Temperature rarely became a significant variable in any season, which could partly explain the peak of monthly average ozone concentrations in October in Hong Kong. We also found that the effect of solar radiation was enhanced during extreme ozone pollution episodes (i.e., the 99th percentile). Finally, meteorological effects on MDA8 ozone showed no significant changes before and after the 2010 Asian Games.

  18. A generalization of the power law distribution with nonlinear exponent

    NASA Astrophysics Data System (ADS)

    Prieto, Faustino; Sarabia, José María

    2017-01-01

    The power law distribution is usually used to fit data in the upper tail of the distribution. However, it is commonly not valid for modeling data over the whole range. In this paper, we present a new family of distributions, the so-called Generalized Power Law (GPL), which can be useful for modeling data over the whole range while possessing power law tails. To do that, we model the exponent of the power law using a non-linear function which depends on data and two parameters. Then, we provide some basic properties and some specific models of that new family of distributions. After that, we study a relevant model of the family, with special emphasis on the quantile and hazard functions, and the corresponding estimation and testing methods. Finally, as empirical evidence, we study how debt is distributed across municipalities in Spain. We check that the power law model is only valid in the upper tail; we show analytically and graphically the competence of the new model with municipal debt data over the whole range; and we compare the new distribution with other well-known distributions including the Lognormal, the Generalized Pareto, the Fisk, the Burr type XII and the Dagum models.

  19. Economic consequences of aviation system disruptions: A reduced-form computable general equilibrium analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Zhenhua; Rose, Adam Z.; Prager, Fynnwin

    The state-of-the-art approach to economic consequence analysis (ECA) is computable general equilibrium (CGE) modeling. However, such models contain thousands of equations and cannot readily be incorporated into computerized systems used by policy analysts to yield estimates of the economic impacts of various types of transportation system failures due to natural hazards, human-related attacks or technological accidents. This paper presents a reduced-form approach to simplify the analytical content of CGE models to make them more transparent and enhance their utilization potential. The reduced-form CGE analysis is conducted by first running simulations one hundred times, varying key parameters, such as magnitude of the initial shock, duration, location, remediation, and resilience, according to a Latin Hypercube sampling procedure. Statistical analysis is then applied to the “synthetic data” results in the form of both ordinary least squares and quantile regression. The analysis yields linear equations that are incorporated into a computerized system and utilized along with Monte Carlo simulation methods for propagating uncertainties in economic consequences. Although our demonstration and discussion focuses on aviation system disruptions caused by terrorist attacks, the approach can be applied to a broad range of threat scenarios.
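
    A compact sketch of the sampling-plus-regression pipeline described here, using SciPy's Latin Hypercube sampler and statsmodels for the reduced-form fits. The four input dimensions, their ranges, and the stand-in "CGE model" are invented placeholders for the paper's actual simulator.

      import numpy as np
      import statsmodels.api as sm
      from scipy.stats import qmc

      # Latin Hypercube design over hypothetical inputs: shock magnitude,
      # duration (days), remediation share, resilience.
      sampler = qmc.LatinHypercube(d=4, seed=0)
      X = qmc.scale(sampler.random(n=100), [0.0, 1.0, 0.0, 0.0], [1.0, 30.0, 1.0, 1.0])

      # Stand-in for a full CGE run: each parameter draw maps to an economic loss.
      rng = np.random.default_rng(0)
      loss = (5.0 * X[:, 0] + 0.2 * X[:, 1] - 3.0 * X[:, 2] - 2.0 * X[:, 3]
              + rng.normal(0.0, 0.5, 100))

      exog = sm.add_constant(X)
      print(sm.OLS(loss, exog).fit().params)            # reduced-form mean response
      print(sm.QuantReg(loss, exog).fit(q=0.9).params)  # reduced-form 90th percentile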

  20. Climate Change Impact on Variability of Rainfall Intensity in Upper Blue Nile Basin, Ethiopia

    NASA Astrophysics Data System (ADS)

    Worku, L. Y.

    2015-12-01

    Extreme rainfall events are a major problem in Ethiopia, where the resulting floods can cause significant damage to agriculture, ecology, and infrastructure, disruption of human activities, loss of property and lives, and disease outbreaks. The aim of this study was to explore the likely changes in precipitation extremes under future climate change. The study specifically focuses on understanding the future climate change impact on the variability of rainfall intensity-duration-frequency in the Upper Blue Nile basin. Precipitation data from two Global Climate Models (GCMs), HadCM3 and CGCM3, were used in the study. Rainfall frequency analysis was carried out to estimate quantiles for different return periods. The probability weighted moments (PWM) method was used for parameter estimation, and L-Moment Ratio Diagrams (LMRDs) were used to find the best parent distribution for each station. The parent distributions derived from the frequency analysis were the Generalized Logistic (GLOG), Generalized Extreme Value (GEV), and Gamma/Pearson III (P3) distributions. After quantile estimation, a simple disaggregation model was applied in order to obtain sub-daily rainfall data. Finally, IDF curves were fitted to the disaggregated rainfall; the results show that in most parts of the basin rainfall intensity is expected to increase in the future. Based on the two GCM outputs, the study indicates there will likely be an increase of precipitation extremes over the Blue Nile basin due to the changing climate. These results should be interpreted with caution, as GCM outputs for this part of the world carry large uncertainty.
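
    As a minimal stand-in for the frequency-analysis step (SciPy fits the GEV by maximum likelihood rather than the PWM/L-moment approach used in the study, and the annual-maximum series is synthetic):

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(5)
      annual_max = stats.genextreme.rvs(c=-0.1, loc=60.0, scale=15.0,
                                        size=40, random_state=rng)

      # Fit the GEV, then read off quantiles for the usual return periods:
      # the T-year event is the (1 - 1/T) quantile of the annual-maximum law.
      c, loc, scale = stats.genextreme.fit(annual_max)
      for T in (2, 10, 50, 100):
          print(T, stats.genextreme.ppf(1.0 - 1.0 / T, c, loc, scale))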

  1. Intersection of All Top Quantile

    EPA Pesticide Factsheets

    This layer combines the Top quantiles of the CES, CEVA, and EJSM layers so that viewers can see the overlap of “hot spots” for each method. This layer was created by James Sadd of Occidental College of Los Angeles

  2. Intersection of Screening Methods High Quantile

    EPA Pesticide Factsheets

    This layer combines the high quantiles of the CES, CEVA, and EJSM layers so that viewers can see the overlap of “hot spots” for each method. This layer was created by James Sadd of Occidental College of Los Angeles

  3. Ensuring the consistency of Flow Duration Curve reconstructions: the 'quantile solidarity' approach

    NASA Astrophysics Data System (ADS)

    Poncelet, Carine; Andreassian, Vazken; Oudin, Ludovic

    2015-04-01

    Flow Duration Curves (FDCs) are a hydrologic tool describing the distribution of streamflows at a catchment outlet. FDCs are commonly used for calibration of hydrological models, managing water quality and classifying catchments, among other purposes. For gauged catchments, empirical FDCs can be computed from streamflow records. For ungauged catchments, on the other hand, FDCs cannot be obtained from streamflow records and must therefore be obtained in another manner, for example through reconstructions. Regression-based reconstructions are methods in which each quantile is estimated separately from catchment attributes (climatic or physical features). The advantage of this category of methods is that it is informative about the processes and it is non-parametric. However, the large number of parameters required can cause unwanted artifacts, typically reconstructions that do not always produce increasing quantiles. In this paper we propose a new approach named Quantile Solidarity (QS), which is applied under strict proxy-basin test conditions (Klemeš, 1986) to a set of 600 French catchments. Half of the catchments are considered as gauged and used to calibrate the regression and compute its residuals. The QS approach consists of a three-step regionalization scheme, which first links quantile values to physical descriptors, then reduces the number of regression parameters, and finally exploits the spatial correlation of the residuals. The innovation is the use of parameter continuity across the quantiles to dramatically reduce the number of parameters, as sketched below. The second half of the catchments is used as an independent validation set, over which we show that the QS approach ensures strictly increasing FDC reconstructions in ungauged conditions. Reference: V. KLEMEŠ (1986) Operational testing of hydrological simulation models, Hydrological Sciences Journal, 31:1, 13-24
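
    A toy version of the parameter-continuity idea, on invented data: regress each FDC quantile on catchment descriptors, then smooth the fitted coefficients across quantile levels with a low-order polynomial, which is what collapses the parameter count and keeps neighbouring quantiles "in solidarity". None of this is the paper's exact scheme.

      import numpy as np

      rng = np.random.default_rng(6)
      n_catch, n_desc, n_q = 300, 4, 20
      D = rng.normal(size=(n_catch, n_desc))        # hypothetical catchment descriptors
      taus = np.linspace(0.05, 0.95, n_q)
      Q = (D @ np.outer(np.ones(n_desc), taus)      # synthetic FDC quantiles whose
           + rng.normal(0.0, 0.1, (n_catch, n_q)))  # true coefficients grow with tau

      X = np.hstack([np.ones((n_catch, 1)), D])
      beta = np.linalg.lstsq(X, Q, rcond=None)[0]   # one regression per quantile

      # Replace each coefficient's path across tau by a quadratic fit: far fewer
      # parameters, and smoothly varying (hence crossing-resistant) quantiles.
      beta_smooth = np.stack([np.polyval(np.polyfit(taus, b, 2), taus) for b in beta])
      print(beta.shape, beta_smooth.shape)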

  4. Early Warning Signals of Financial Crises with Multi-Scale Quantile Regressions of Log-Periodic Power Law Singularities.

    PubMed

    Zhang, Qun; Zhang, Qunzhi; Sornette, Didier

    2016-01-01

    We augment the existing literature using the Log-Periodic Power Law Singular (LPPLS) structures in the log-price dynamics to diagnose financial bubbles by providing three main innovations. First, we introduce the quantile regression to the LPPLS detection problem. This allows us to disentangle (at least partially) the genuine LPPLS signal and the a priori unknown complicated residuals. Second, we propose to combine the many quantile regressions with a multi-scale analysis, which aggregates and consolidates the obtained ensembles of scenarios. Third, we define and implement the so-called DS LPPLS Confidence™ and Trust™ indicators that enrich considerably the diagnostic of bubbles. Using a detailed study of the "S&P 500 1987" bubble and presenting analyses of 16 historical bubbles, we show that the quantile regression of LPPLS signals contributes useful early warning signals. The comparison between the constructed signals and the price development in these 16 historical bubbles demonstrates their significant predictive ability around the real critical time when the burst/rally occurs.
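
    For reference, the LPPLS specification these quantile regressions are fit to is not written out in the abstract; in the canonical form associated with Sornette and co-workers it reads

        ln E[p(t)] = A + B (t_c - t)^m + C (t_c - t)^m cos( ω ln(t_c - t) - φ ),

    where t_c is the critical time of the bubble, m the power-law exponent, ω the log-periodic angular frequency, and φ a phase. Per the abstract, the innovation is to fit this structure by quantile regression at several quantile levels, rather than by a single mean-based fit, and then aggregate across scales.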

  5. Quantile regression and clustering analysis of standardized precipitation index in the Tarim River Basin, Xinjiang, China

    NASA Astrophysics Data System (ADS)

    Yang, Peng; Xia, Jun; Zhang, Yongyong; Han, Jian; Wu, Xia

    2017-11-01

    Because drought is a very common and widespread natural disaster, it has attracted a great deal of academic interest. Based on 12-month standardized precipitation indices (SPI12) calculated from precipitation data recorded between 1960 and 2015 at 22 weather stations in the Tarim River Basin (TRB), this study aims to identify the trends of SPI and of drought duration, severity, and frequency at various quantiles, and to perform a cluster analysis of drought events in the TRB. The results indicated that (1) both precipitation and temperature at most stations in the TRB exhibited significant positive trends during 1960-2015; (2) SPIs at multiple scales changed significantly around 1986; (3) in the quantile regression analysis of temporal drought changes, positive SPI slopes indicated less severe and less frequent droughts at the lower quantiles, but clear variation was detected in drought frequency; and (4) significantly different trends were found between drought severity and drought frequency.

  6. Probabilistic forecasting for extreme NO2 pollution episodes.

    PubMed

    Aznarte, José L

    2017-10-01

    In this study, we investigate the suitability of quantile regression for predicting extreme concentrations of NO2. In contrast to the usual point forecasting, where a single value is forecast for each horizon, probabilistic forecasting through quantile regression allows for the prediction of the full probability distribution, which in turn allows building models specifically fit for the tails of this distribution. Using data from the city of Madrid, including NO2 concentrations as well as meteorological measures, we build models that predict extreme NO2 concentrations, outperforming point-forecasting alternatives, and we show that the predictions are accurate, reliable and sharp. Besides, we study the relative importance of the independent variables involved, and show how the variables that are important for the median quantile differ from those important for the upper quantiles. Furthermore, we present a method to compute the probability of exceedance of thresholds, which is a simple and comprehensible manner of presenting probabilistic forecasts that maximizes their usefulness.
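
    One simple way to turn quantile forecasts into the exceedance probabilities the paper advocates: treat the predicted quantiles as points on the forecast CDF, interpolate, and evaluate at the threshold of interest. All numbers below are hypothetical, not the paper's.

      import numpy as np

      taus = np.array([0.05, 0.25, 0.50, 0.75, 0.90, 0.95, 0.99])
      q_pred = np.array([18.0, 30.0, 42.0, 58.0, 75.0, 88.0, 110.0])  # µg/m3 forecasts

      def prob_exceed(threshold, taus, q_pred):
          # Piecewise-linear CDF through the quantile forecasts, clamped outside.
          return 1.0 - np.interp(threshold, q_pred, taus, left=0.0, right=1.0)

      print(prob_exceed(80.0, taus, q_pred))   # P(NO2 > 80 µg/m3) under the forecast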

  7. Detection of relationships between SUDOSCAN with estimated glomerular filtration rate (eGFR) in Chinese patients with type 2 diabetes.

    PubMed

    Mao, Fei; Zhu, Xiaoming; Lu, Bin; Li, Yiming

    2018-04-01

    SUDOSCAN (Impeto Medical, Paris, France) has proven to be a new and non-invasive method for detecting renal dysfunction in type 2 diabetes mellitus (T2DM) patients. In this study, we sought to compare the diabetic kidney dysfunction score (DKD-score) of SUDOSCAN with the estimated glomerular filtration rate (eGFR) using quantile regression analysis, an approach entirely different from previous studies. A total of 223 Chinese T2DM patients were enrolled in the study. SUDOSCAN, renal function tests (including blood urea nitrogen, creatinine and uric acid) and 99mTc-diethylenetriamine pentaacetic acid (99mTc-DTPA) renal dynamic imaging were performed in all T2DM patients. The DKD-score of SUDOSCAN was compared with eGFR detected by 99mTc-DTPA renal dynamic imaging through quantile regression analysis. Its validity and utility were further determined through bias and precision tests. The quantile regression analysis demonstrated that the relationship with eGFR was inverse and significant for almost all percentiles of DKD-score. The coefficients decreased as the percentile of DKD-score increased. In the validation data set, both bias and precision increased with eGFR (median difference, -21.2 ml/min/1.73 m² for all individuals vs. -4.6 ml/min/1.73 m² for eGFR between 0 and 59 ml/min/1.73 m²; interquartile range [IQR] for the difference, -25.4 ml/min/1.73 m² vs. -14.7 ml/min/1.73 m²). The eGFR category misclassification rate was 10% in the eGFR 0-59 ml/min/1.73 m² group, 57.3% in the 60-90 group, and 87.2% in the eGFR > 90 ml/min/1.73 m² group. The DKD-score of SUDOSCAN could be used to detect renal dysfunction in T2DM patients. A higher prognostic value of DKD-score was detected when the eGFR level was lower.

  8. Quantile regression in the presence of monotone missingness with sensitivity analysis

    PubMed Central

    Liu, Minzhao; Daniels, Michael J.; Perri, Michael G.

    2016-01-01

    In this paper, we develop methods for longitudinal quantile regression when there is monotone missingness. In particular, we propose pattern mixture models with a constraint that provides a straightforward interpretation of the marginal quantile regression parameters. Our approach allows sensitivity analysis which is an essential component in inference for incomplete data. To facilitate computation of the likelihood, we propose a novel way to obtain analytic forms for the required integrals. We conduct simulations to examine the robustness of our approach to modeling assumptions and compare its performance to competing approaches. The model is applied to data from a recent clinical trial on weight management. PMID:26041008

  9. Quantile Regression with Censored Data

    ERIC Educational Resources Information Center

    Lin, Guixian

    2009-01-01

    The Cox proportional hazards model and the accelerated failure time model are frequently used in survival data analysis. They are powerful, yet have limitation due to their model assumptions. Quantile regression offers a semiparametric approach to model data with possible heterogeneity. It is particularly powerful for censored responses, where the…

  10. Measuring disparities across the distribution of mental health care expenditures.

    PubMed

    Le Cook, Benjamin; Manning, Willard; Alegria, Margarita

    2013-03-01

    Previous mental health care disparities studies predominantly compare mean mental health care use across racial/ethnic groups, leaving policymakers with little information on disparities among those with a higher level of expenditures. Our objectives were to identify racial/ethnic disparities among individuals at varying quantiles of mental health care expenditures, and to assess whether disparities in the upper quantiles of expenditure differ by insurance status, income and education. Data were analyzed from a nationally representative sample of White, Black and Latino adults 18 years and older (n=83,878). Our dependent variable was total mental health care expenditure. We measured disparities in any mental health care expenditures, disparities in mental health care expenditure at the 95th, 97.5th, and 99th expenditure quantiles of the full population using quantile regression, and at the 50th, 75th, and 95th quantiles for positive users. In the full population, we tested interaction coefficients between race/ethnicity and income, insurance, and education levels to determine whether racial/ethnic disparities in the upper quantiles differed by income, insurance and education. Significant Black-White and Latino-White disparities were identified in any mental health care expenditures. In the full population, moving up the quantiles of mental health care expenditures, Black-White and Latino-White disparities were reduced but remained statistically significant. No statistically significant disparities were found in analyses of positive users only. The magnitude of Black-White disparities was smaller among those enrolled in public insurance programs compared to the privately insured and uninsured in the 97.5th and 99th quantiles. Disparities persist in the upper quantiles among those in higher income categories and after excluding psychiatric inpatient and emergency department (ED) visits. Disparities exist in any mental health care and among those who use the most mental health care resources, but much of the disparity seems to be driven by lack of access. The data do not allow us to disentangle whether disparities were related to White respondents' overuse or underuse as compared to minority groups. The cross-sectional data allow us to make only associational claims about the role of insurance, income, and education in disparities. With these limitations in mind, we identified a persistence of disparities in overall expenditures even among those in the highest income categories, after controlling for mental health status and observable sociodemographic characteristics. Interventions are needed to equalize resource allocation to racial/ethnic minority patients regardless of their income, with emphasis on outreach interventions to address the disparities in access that are responsible for the no/low expenditures even for Latinos at higher levels of illness severity. Increased policy efforts are needed to reduce the gap in health insurance for Latinos and improve outreach programs to enroll those in need into mental health care services. Future studies that conclusively disentangle overuse and appropriate use in these populations are warranted.

  11. Factors associated with the income distribution of full-time physicians: a quantile regression approach.

    PubMed

    Shih, Ya-Chen Tina; Konrad, Thomas R

    2007-10-01

    Physician income is generally high, but quite variable; hence, physicians have divergent perspectives regarding health policy initiatives and market reforms that could affect their incomes. We investigated factors underlying the distribution of income within the physician population. Full-time physicians (N=10,777) from the restricted version of the 1996-1997 Community Tracking Study Physician Survey (CTS-PS), 1996 Area Resource File, and 1996 health maintenance organization penetration data. We conducted separate analyses for primary care physicians (PCPs) and specialists. We employed least squares and quantile regression models to examine factors associated with physician incomes at the mean and at various points of the income distribution, respectively. We accounted for the complex survey design for the CTS-PS data using appropriate weighted procedures and explored endogeneity using an instrumental variables method. We detected widespread and subtle effects of many variables on physician incomes at different points (10th, 25th, 75th, and 90th percentiles) in the distribution that were undetected when employing regression estimations focusing on only the means or medians. Our findings show that the effects of managed care penetration are demonstrable at the mean of specialist incomes, but are more pronounced at higher levels. Conversely, a gender gap in earnings occurs at all levels of income of both PCPs and specialists, but is more pronounced at lower income levels. The quantile regression technique offers an analytical tool to evaluate policy effects beyond the means. A longitudinal application of this approach may enable health policy makers to identify winners and losers among segments of the physician workforce and assess how market dynamics and health policy initiatives affect the overall physician income distribution over various time intervals.

  12. Bias correction of surface downwelling longwave and shortwave radiation for the EWEMBI dataset

    NASA Astrophysics Data System (ADS)

    Lange, Stefan

    2018-05-01

    Many meteorological forcing datasets include bias-corrected surface downwelling longwave and shortwave radiation (rlds and rsds). Methods used for such bias corrections range from multi-year monthly mean value scaling to quantile mapping at the daily timescale. An additional downscaling is necessary if the data to be corrected have a higher spatial resolution than the observational data used to determine the biases. This was the case when EartH2Observe (E2OBS; Calton et al., 2016) rlds and rsds were bias-corrected using more coarsely resolved Surface Radiation Budget (SRB; Stackhouse Jr. et al., 2011) data for the production of the meteorological forcing dataset EWEMBI (Lange, 2016). This article systematically compares various parametric quantile mapping methods designed specifically for this purpose, including those used for the production of EWEMBI rlds and rsds. The methods vary in the timescale at which they operate, in their way of accounting for physical upper radiation limits, and in their approach to bridging the spatial resolution gap between E2OBS and SRB. It is shown how temporal and spatial variability deflation related to bilinear interpolation and other deterministic downscaling approaches can be overcome by downscaling the target statistics of quantile mapping from the SRB to the E2OBS grid such that the sub-SRB-grid-scale spatial variability present in the original E2OBS data is retained. Cross validations at the daily and monthly timescales reveal that it is worthwhile to take empirical estimates of physical upper limits into account when adjusting either radiation component and that, overall, bias correction at the daily timescale is more effective than bias correction at the monthly timescale if sampling errors are taken into account.
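
    The production method was parametric, but the core quantile-mapping step can be sketched empirically in a few lines, including the clamping to a physical upper limit that the article finds worthwhile. The calibration data and the 420 W m⁻² cap below are placeholders, not EWEMBI values.

      import numpy as np

      def quantile_map(model_cal, obs_cal, model_new, upper_limit=None):
          # Map each new model value through the model CDF and back through the
          # observed quantile function, both estimated on a calibration period.
          probs = np.linspace(0.01, 0.99, 99)
          q_model = np.quantile(model_cal, probs)
          q_obs = np.quantile(obs_cal, probs)
          u = np.interp(model_new, q_model, probs, left=probs[0], right=probs[-1])
          corrected = np.interp(u, probs, q_obs)
          if upper_limit is not None:
              corrected = np.clip(corrected, 0.0, upper_limit)  # physical limit
          return corrected

      rng = np.random.default_rng(7)
      obs = rng.gamma(8.0, 20.0, 3000)                 # hypothetical observed rsds
      model = obs * 1.15 + rng.normal(0.0, 5.0, 3000)  # biased model counterpart
      print(quantile_map(model, obs, model[:5], upper_limit=420.0))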

  13. Environmental influence on mussel (Mytilus edulis) growth - A quantile regression approach

    NASA Astrophysics Data System (ADS)

    Bergström, Per; Lindegarth, Mats

    2016-03-01

    The need for methods for sustainable management and use of coastal ecosystems has increased in the last century. A key aspect of obtaining ecologically and economically sustainable aquaculture in threatened coastal areas is the requirement for geographic information on growth and potential production capacity. Growth varies over time and space and depends on a complex pattern of interactions between the bivalve and a diverse range of environmental factors (e.g. temperature, salinity, food availability). Understanding these processes and modelling the environmental control of bivalve growth has been central in aquaculture. In contrast to most conventional modelling techniques, quantile regression can handle cases where not all factors are measured, makes it possible to estimate effects at different levels of the response distribution, and therefore gives a more complete picture of the relationship between environmental factors and biological response. Observation of the relationships between environmental factors and growth of the bivalve Mytilus edulis revealed relationships that varied both among levels of growth rate and within the range of environmental variables along the Swedish west coast. The strongest patterns were found for water oxygen concentration, which had a negative effect on growth across all oxygen and growth levels. However, these patterns coincided with differences in growth among periods, and very little of the remaining variability within periods could be explained, indicating that interactive processes masked the importance of the individual variables. By using quantile regression and local regression (LOESS), this study was able to provide valuable information on environmental factors influencing the growth of M. edulis and important insights for the development of ecosystem-based management tools for aquaculture activities, their use in mitigation efforts, and successful management of human use of coastal areas.

  14. Factors Associated with the Income Distribution of Full-Time Physicians: A Quantile Regression Approach

    PubMed Central

    Shih, Ya-Chen Tina; Konrad, Thomas R

    2007-01-01

    Objective Physician income is generally high, but quite variable; hence, physicians have divergent perspectives regarding health policy initiatives and market reforms that could affect their incomes. We investigated factors underlying the distribution of income within the physician population. Data Sources Full-time physicians (N=10,777) from the restricted version of the 1996–1997 Community Tracking Study Physician Survey (CTS-PS), 1996 Area Resource File, and 1996 health maintenance organization penetration data. Study Design We conducted separate analyses for primary care physicians (PCPs) and specialists. We employed least squares and quantile regression models to examine factors associated with physician incomes at the mean and at various points of the income distribution, respectively. We accounted for the complex survey design for the CTS-PS data using appropriate weighted procedures and explored endogeneity using an instrumental variables method. Principal Findings We detected widespread and subtle effects of many variables on physician incomes at different points (10th, 25th, 75th, and 90th percentiles) in the distribution that were undetected when employing regression estimations focusing on only the means or medians. Our findings show that the effects of managed care penetration are demonstrable at the mean of specialist incomes, but are more pronounced at higher levels. Conversely, a gender gap in earnings occurs at all levels of income of both PCPs and specialists, but is more pronounced at lower income levels. Conclusions The quantile regression technique offers an analytical tool to evaluate policy effects beyond the means. A longitudinal application of this approach may enable health policy makers to identify winners and losers among segments of the physician workforce and assess how market dynamics and health policy initiatives affect the overall physician income distribution over various time intervals. PMID:17850525

  15. Contrasting OLS and Quantile Regression Approaches to Student "Growth" Percentiles

    ERIC Educational Resources Information Center

    Castellano, Katherine Elizabeth; Ho, Andrew Dean

    2013-01-01

    Regression methods can locate student test scores in a conditional distribution, given past scores. This article contrasts and clarifies two approaches to describing these locations in terms of readily interpretable percentile ranks or "conditional status percentile ranks." The first is Betebenner's quantile regression approach that results in…

  16. Flood Frequency Analysis With Historical and Paleoflood Information

    NASA Astrophysics Data System (ADS)

    Stedinger, Jery R.; Cohn, Timothy A.

    1986-05-01

    An investigation is made of flood quantile estimators which can employ "historical" and paleoflood information in flood frequency analyses. Two categories of historical information are considered: "censored" data, where the magnitudes of historical flood peaks are known; and "binomial" data, where only threshold exceedance information is available. A Monte Carlo study employing the two-parameter lognormal distribution shows that maximum likelihood estimators (MLEs) can extract the equivalent of an additional 10-30 years of gage record from a 50-year period of historical observation. The MLE routines are shown to be substantially better than an adjusted-moment estimator similar to the one recommended in Bulletin 17B of the United States Water Resources Council Hydrology Committee (1982). The MLE methods performed well even when floods were drawn from other than the assumed lognormal distribution.
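
    The "binomial" case can be written down directly: the likelihood multiplies the density of the gauged record by a binomial term for e exceedances of a perception threshold in m historical years. A minimal sketch for the two-parameter lognormal, with an invented record and invented historical evidence:

      import numpy as np
      from scipy import optimize, stats

      def negloglik(params, gauged, m, e, thresh):
          mu, sigma = params
          if sigma <= 0.0:
              return np.inf
          ll = stats.norm.logpdf(np.log(gauged), mu, sigma).sum()
          p = stats.norm.sf(np.log(thresh), mu, sigma)   # annual exceedance prob.
          ll += e * np.log(p) + (m - e) * np.log1p(-p)   # binomial historical term
          return -ll

      rng = np.random.default_rng(8)
      gauged = rng.lognormal(6.0, 0.5, 50)        # hypothetical 50-year gage record
      m, e, thresh = 100, 3, float(np.exp(7.0))   # 100 historical years, 3 exceedances
      fit = optimize.minimize(negloglik, [6.0, 0.5],
                              args=(gauged, m, e, thresh), method="Nelder-Mead")
      print(fit.x)   # (mu, sigma); flood quantiles follow from the fitted lognormal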

  17. Long Term Discharge Estimation for Ogoué River Basin

    NASA Astrophysics Data System (ADS)

    Seyler, F.; Linguet, L.; Calmant, S.

    2014-12-01

    The Ogoué river basin is one of the last preserved tropical rain forest basins in the world. The river basin covers about 75% of Gabon. A wall-to-wall forest cover mapping study using Landsat images (Fichet et al., 2014) found a net forest loss of 0.38% between 1990 and 2000, and roughly the same loss rate between 2000 and 2010. However, the country recently launched an ambitious development plan, with communication infrastructure, agriculture and forestry as well as mining projects. Hydrological cycle responses to these changes may be expected, in both quantitative and qualitative aspects. Unfortunately, monitoring gauging stations stopped functioning in the 1970s, and Gabon will therefore be unable to evaluate, mitigate and adapt adequately to these environmental challenges. Historical data were recorded over 42 years at Lambaréné (from 1929 to 1974) and over 10 to 20 years at 17 other ground stations. The quantile function approach (Tourian et al., 2013) has been tested to estimate discharge from J2 and ERS/Envisat/AltiKa virtual stations. This is an opportunity to assess long-term discharge patterns in order to monitor land use change effects and possible disturbances in runoff. [Figure 1: Ogoué River basin: J2 (red) and ERS/ENVISAT/ALTIKa (purple) virtual stations.] References: Fichet, L. V., Sannier, C., Massard Makaga, E. K., Seyler, F. (2013). Assessing the accuracy of forest cover map for 1990, 2000 and 2010 at national scale in Gabon. In press, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing. Tourian, M. J., Sneeuw, N., & Bárdossy, A. (2013). A quantile function approach to discharge estimation from satellite altimetry (ENVISAT). Water Resources Research, 49(7), 4174-4186. doi:10.1002/wrcr.20348

  18. Design, innovation, and rural creative places: Are the arts the cherry on top, or the secret sauce?

    PubMed

    Wojan, Timothy R; Nichols, Bonnie

    2018-01-01

    Creative class theory explains the positive relationship between the arts and commercial innovation as the mutual attraction of artists and other creative workers by an unobserved creative milieu. This study explores alternative theories for rural settings, by analyzing establishment-level survey data combined with data on the local arts scene. The study identifies the local contextual factors associated with a strong design orientation, and estimates the impact that a strong design orientation has on the local economy. Data on innovation and design come from a nationally representative sample of establishments in tradable industries. Latent class analysis allows identifying unobserved subpopulations comprised of establishments with different design and innovation orientations. Logistic regression allows estimating the association between an establishment's design orientation and local contextual factors. A quantile instrumental variable regression allows assessing the robustness of the logistic regression results with respect to endogeneity. An estimate of design orientation at the local level derived from the survey is used to examine variation in economic performance during the period of recovery from the Great Recession (2010-2014). Three distinct innovation (substantive, nominal, and non-innovators) and design orientations (design-integrated, "design last finish," and no systematic approach to design) are identified. Innovation- and design-intensive establishments were identified in both rural and urban areas. Rural design-integrated establishments tended to locate in counties with more highly educated workforces and containing at least one performing arts organization. A quantile instrumental variable regression confirmed that the logistic regression result is robust to endogeneity concerns. Finally, rural areas characterized by design-integrated establishments experienced faster growth in wages relative to rural areas characterized by establishments using no systematic approach to design.

  19. Design, innovation, and rural creative places: Are the arts the cherry on top, or the secret sauce?

    PubMed Central

    Nichols, Bonnie

    2018-01-01

    Objective Creative class theory explains the positive relationship between the arts and commercial innovation as the mutual attraction of artists and other creative workers by an unobserved creative milieu. This study explores alternative theories for rural settings, by analyzing establishment-level survey data combined with data on the local arts scene. The study identifies the local contextual factors associated with a strong design orientation, and estimates the impact that a strong design orientation has on the local economy. Method Data on innovation and design come from a nationally representative sample of establishments in tradable industries. Latent class analysis allows identifying unobserved subpopulations comprised of establishments with different design and innovation orientations. Logistic regression allows estimating the association between an establishment’s design orientation and local contextual factors. A quantile instrumental variable regression allows assessing the robustness of the logistic regression results with respect to endogeneity. An estimate of design orientation at the local level derived from the survey is used to examine variation in economic performance during the period of recovery from the Great Recession (2010–2014). Results Three distinct innovation (substantive, nominal, and non-innovators) and design orientations (design-integrated, “design last finish,” and no systematic approach to design) are identified. Innovation- and design-intensive establishments were identified in both rural and urban areas. Rural design-integrated establishments tended to locate in counties with more highly educated workforces and containing at least one performing arts organization. A quantile instrumental variable regression confirmed that the logistic regression result is robust to endogeneity concerns. Finally, rural areas characterized by design-integrated establishments experienced faster growth in wages relative to rural areas characterized by establishments using no systematic approach to design. PMID:29489884

  20. No Evidence of Reciprocal Associations between Daily Sleep and Physical Activity.

    PubMed

    Mitchell, Jonathan A; Godbole, Suneeta; Moran, Kevin; Murray, Kate; James, Peter; Laden, Francine; Hipp, J Aaron; Kerr, Jacqueline; Glanz, Karen

    2016-10-01

    This study aimed to determine whether physical activity patterns are associated with sleep later at night and if nighttime sleep is associated with physical activity patterns the next day among adult women. Women (N = 353) living throughout the United States wore a wrist and a hip accelerometer for 7 d. Total sleep time (TST, hours per night) and sleep efficiency (SE, %) were estimated from the wrist accelerometer, and moderate to vigorous physical activity (MVPA, >1040 counts per minute, h·d⁻¹) and sedentary behavior (SB, <100 counts per minute, h·d⁻¹) were estimated from the hip accelerometer. Mixed-effects models adjusted for age, race, body mass index, education, employment, marital status, health status, and hip accelerometer wear time were used to analyze the data. Follow-up analyses using quantile regression were used to investigate associations among women with below average TST and MVPA and above average SB. The average age of our sample was 55.5 yr (SD = 10.2 yr). The majority of participants were White (79%) and married (72%), and half were employed full time (49%). The participants spent on average 8.9 and 1.1 h·d⁻¹ in SB and MVPA, respectively, and 6.8 h per night asleep. No associations were observed between MVPA and SB with nighttime TST or SE. There were no associations between nighttime TST and SE with MVPA or SB the next day. The findings were the same in the quantile regression analyses. In free-living adult women, accelerometry-estimated nighttime sleep and physical activity patterns were not associated with one another. On the basis of our observational study involving a sample of adult women, higher physical activity will not necessarily improve sleep at night on a day-to-day basis (and vice versa).

  1. Quantile regression reveals hidden bias and uncertainty in habitat models

    Treesearch

    Brian S. Cade; Barry R. Noon; Curtis H. Flather

    2005-01-01

    We simulated the effects of missing information on statistical distributions of animal response that covaried with measured predictors of habitat to evaluate the utility and performance of quantile regression for providing more useful intervals of uncertainty in habitat relationships. These procedures were evaluated for conditions in which heterogeneity and hidden bias...

  2. Goodness of Fit and Misspecification in Quantile Regressions

    ERIC Educational Resources Information Center

    Furno, Marilena

    2011-01-01

    The article considers a test of specification for quantile regressions. The test relies on the increase of the objective function and the worsening of the fit when unnecessary constraints are imposed. It compares the objective functions of restricted and unrestricted models and, in its different formulations, it verifies (a) forecast ability, (b)…

  3. Early Home Activities and Oral Language Skills in Middle Childhood: A Quantile Analysis

    ERIC Educational Resources Information Center

    Law, James; Rush, Robert; King, Tom; Westrupp, Elizabeth; Reilly, Sheena

    2018-01-01

    Oral language development is a key outcome of elementary school, and it is important to identify factors that predict it most effectively. Commonly researchers use ordinary least squares regression with conclusions restricted to average performance conditional on relevant covariates. Quantile regression offers a more sophisticated alternative.…

  4. Ordinary Least Squares and Quantile Regression: An Inquiry-Based Learning Approach to a Comparison of Regression Methods

    ERIC Educational Resources Information Center

    Helmreich, James E.; Krog, K. Peter

    2018-01-01

    We present a short, inquiry-based learning course on concepts and methods underlying ordinary least squares (OLS), least absolute deviation (LAD), and quantile regression (QR). Students investigate squared, absolute, and weighted absolute distance functions (metrics) as location measures. Using differential calculus and properties of convex…

  5. Principles of Quantile Regression and an Application

    ERIC Educational Resources Information Center

    Chen, Fang; Chalhoub-Deville, Micheline

    2014-01-01

    Newer statistical procedures are typically introduced to help address the limitations of those already in practice or to deal with emerging research needs. Quantile regression (QR) is introduced in this paper as a relatively new methodology, which is intended to overcome some of the limitations of least squares mean regression (LMR). QR is more…

  6. Early Warning Signals of Financial Crises with Multi-Scale Quantile Regressions of Log-Periodic Power Law Singularities

    PubMed Central

    Zhang, Qun; Zhang, Qunzhi; Sornette, Didier

    2016-01-01

    We augment the existing literature using the Log-Periodic Power Law Singular (LPPLS) structures in the log-price dynamics to diagnose financial bubbles by providing three main innovations. First, we introduce the quantile regression to the LPPLS detection problem. This allows us to disentangle (at least partially) the genuine LPPLS signal and the a priori unknown complicated residuals. Second, we propose to combine the many quantile regressions with a multi-scale analysis, which aggregates and consolidates the obtained ensembles of scenarios. Third, we define and implement the so-called DS LPPLS Confidence™ and Trust™ indicators that enrich considerably the diagnostic of bubbles. Using a detailed study of the “S&P 500 1987” bubble and presenting analyses of 16 historical bubbles, we show that the quantile regression of LPPLS signals contributes useful early warning signals. The comparison between the constructed signals and the price development in these 16 historical bubbles demonstrates their significant predictive ability around the real critical time when the burst/rally occurs. PMID:27806093

  7. Food away from home and body mass outcomes: taking heterogeneity into account enhances quality of results.

    PubMed

    Kim, Tae Hyun; Lee, Eui-Kyung; Han, Euna

    2014-09-01

    The aim of this study was to explore the heterogeneous association of consumption of food away from home (FAFH) with individual body mass outcomes, including body mass index and waist circumference, over the entire conditional distribution of each outcome. Information on 16,403 adults obtained from nationally representative data on nutrition and behavior in Korea was used. A quantile regression model captured the variability of the association of FAFH with body mass outcomes across the entire conditional distribution of each outcome measure. Heavy FAFH consumption was defined as obtaining ≥1400 kcal from FAFH on a single day. Heavy FAFH consumption, specifically at full-service restaurants, was significantly associated with higher body mass index (+0.46 kg/m² at the 50th quantile, 0.55 at the 75th, 0.66 at the 90th, and 0.44 at the 95th) and waist circumference (+0.96 cm at the 25th quantile, 1.06 cm at the 50th, 1.35 cm at the 75th, and 0.96 cm at the 90th quantile), with overall larger associations at higher quantiles. Findings of the study indicate that conventional regression methods may mask important heterogeneity in the association between heavy FAFH consumption and body mass outcomes. Further public health efforts are needed to improve the nutritional quality of affordable FAFH choices and nutrition education and to establish a healthy food consumption environment.

  8. Impact of body mass on job quality.

    PubMed

    Kim, Tae Hyun; Han, Euna

    2015-04-01

    The current study explores the association between body mass and job quality, a composite measurement of job characteristics, for adults. We use nationally representative data from the Korean Labor and Income Panel Study for the years 2005, 2007, and 2008, with 7282 person-year observations for men and 4611 for women. A Quality of Work Index (QWI) is calculated based on work content, job security, the possibilities for improvement, compensation, work conditions, and interpersonal relationships at work. The key independent variable is the body mass index (kg/m²) splined at 18.5, 25, and 30. For men, BMI is positively associated with the QWI only in the normal weight segment (+0.19 percentage points at the 10th, +0.28 at the 50th, +0.32 at the 75th, +0.34 at the 90th, and +0.48 at the 95th quantiles). A unit increase in the BMI for women is associated with a lower QWI at the lower quantiles in the normal weight segment (-0.28 at the 5th, -0.19 at the 10th, and -0.25 percentage points at the 25th quantiles) and at the upper quantiles in the overweight segment (-1.15 at the 90th and -1.66 percentage points at the 95th quantiles). The results imply a spill-over cost of overweight or obesity beyond its impact on health in terms of success in the labor market.

  9. Impact of climate change on Gironde Estuary

    NASA Astrophysics Data System (ADS)

    Laborie, Vanessya; Hissel, François; Sergent, Philippe

    2014-05-01

    Within the THESEUS European project, a simplified mathematical model for storm surge levels in the Bay of Biscay was calibrated on 10 events at Le Verdon using wind and pressure fields from CLM/SGA, so that the water levels at Le Verdon have the same statistical quantiles as the observed tide records for the period [1960-2000]. The analysis of future storm surge levels shows a decrease in their quantiles at Le Verdon, whereas there is an increase in the quantiles of total water levels. This increase is smaller than the sea level rise and becomes smaller still farther upstream in the estuary. A numerical model of the Gironde Estuary was then used to evaluate future water levels at 6 locations of the estuary from Le Verdon to Bordeaux and to assess the changes in the quantiles of water levels during the 21st century using ONERC's pessimistic scenario for sea level rise (60 cm). The model was fed by several data sources: wind fields at Royan and Mérignac interpolated from the grid of the European Climatologic Model CLM/SGA, a tide signal at Le Verdon, and the discharges of the Garonne (at La Réole), the Dordogne (at Pessac) and the Isle (at Libourne). A series of flood maps for different return periods between 2 and 100 years and for four time periods ([1960-1999], [2010-2039], [2040-2069] and [2070-2099]) was built for the region of Bordeaux. Quantiles of water levels in the floodplain were also calculated. The impact of climate change on the evolution of flooded areas in the Gironde Estuary and on the quantiles of water levels in the floodplain mainly depends on the sea level rise. Areas which are not currently flooded for low return periods will be inundated by 2100. The influence of river discharges and dike breaching should also be taken into account for more accurate results.

  10. Gender difference in the association between food away-from-home consumption and body weight outcomes among Chinese adults.

    PubMed

    Du, Wen-Wen; Zhang, Bing; Wang, Hui-Jun; Wang, Zhi-Hong; Su, Chang; Zhang, Ji-Guo; Zhang, Ji; Jia, Xiao-Fang; Jiang, Hong-Ru

    2016-11-01

    The present study aimed to explore the associations between food away-from-home (FAFH) consumption and body weight outcomes among Chinese adults. FAFH was defined as food prepared at restaurants, and the percentage of energy from FAFH was calculated. Measured BMI and waist circumference (WC) were used as body weight outcomes. Quantile regression models for BMI and WC were performed separately by gender. Information on demographic, socio-economic, diet and health parameters at individual, household and community levels was collected in twelve provinces of China. A cross-sectional sample of 7738 non-pregnant individuals aged 18-60 years from the China Health and Nutrition Survey 2011 was analysed. For males, quantile regression models showed that percentage of energy from FAFH was associated with an increase in BMI of 0·01, 0·01, 0·01, 0·02, 0·02 and 0·03 kg/m² at the 5th, 25th, 50th, 75th, 90th and 95th quantiles, and an increase in WC of 0·04, 0·06, 0·06, 0·04, 0·06, 0·05 and 0·07 cm at the 5th, 10th, 25th, 50th, 75th, 90th and 95th quantiles. For females, percentage of energy from FAFH was associated with a 0·01, 0·01, 0·01 and 0·02 kg/m² increase in BMI at the 10th, 25th, 90th and 95th quantiles, and with a 0·05, 0·04, 0·03 and 0·03 cm increase in WC at the 5th, 10th, 25th and 75th quantiles. Our findings suggest that FAFH consumption is relatively more important for BMI and WC among males than among females in China. Public health initiatives are needed to encourage Chinese adults to make healthy food choices when eating out.

  11. Predictors of High Profit and High Deficit Outliers under SwissDRG of a Tertiary Care Center

    PubMed Central

    Mehra, Tarun; Müller, Christian Thomas Benedikt; Volbracht, Jörk; Seifert, Burkhardt; Moos, Rudolf

    2015-01-01

    Principles Case weights of Diagnosis Related Groups (DRGs) are determined by the average cost of cases from a previous billing period. However, a significant share of cases are largely over- or underfunded. We therefore decided to analyze earning outliers of our hospital to search for predictors enabling a better grouping under SwissDRG. Methods 28,893 inpatient cases without additional private insurance discharged from our hospital in 2012 were included in our analysis. Outliers were defined by the interquartile range method. Predictors for deficit and profit outliers were determined with logistic regressions. Predictors were shortlisted with the LASSO regularized logistic regression method and compared to the results of a random forest analysis. 10 of these parameters were selected for quantile regression analysis to quantify their impact on earnings. Results Psychiatric diagnosis and admission as an emergency case were significant predictors of a higher deficit, with negative regression coefficients for all analyzed quantiles (p<0.001). Admission from an external health care provider was a significant predictor of a higher deficit in all but the 90% quantile (p<0.001 for Q10, Q20, Q50, Q80 and p = 0.0017 for Q90). Burns predicted higher earnings for cases which were favorably remunerated (p<0.001 for the 90% quantile). Osteoporosis predicted a higher deficit in the most underfunded cases, but did not predict differences in earnings for balanced or profitable cases (Q10 and Q20: p<0.001, Q50: p = 0.10, Q80: p = 0.88 and Q90: p = 0.52). ICU stay, mechanical ventilation and patient clinical complexity level (PCCL) score predicted higher losses at the 10% quantile but also higher profits at the 90% quantile (p<0.001). Conclusion We suggest considering psychiatric diagnosis, admission as an emergency case and admission from an external health care provider as DRG split criteria, as they predict large, consistent and significant losses. PMID:26517545
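
    The outlier definition used here is the standard interquartile-range rule. A minimal sketch, assuming the conventional 1.5×IQR fences (the abstract does not state the multiplier used):

        import numpy as np

        def iqr_outlier_flags(earnings, k=1.5):
            """Flag deficit and profit outliers by the interquartile-range rule."""
            q1, q3 = np.percentile(earnings, [25, 75])
            fence_lo = q1 - k * (q3 - q1)
            fence_hi = q3 + k * (q3 - q1)
            return earnings < fence_lo, earnings > fence_hi

        earnings = np.array([-9000.0, -150.0, 20.0, 80.0, 120.0, 300.0, 12000.0])
        deficit, profit = iqr_outlier_flags(earnings)
        print(deficit, profit)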

  12. Predictors of High Profit and High Deficit Outliers under SwissDRG of a Tertiary Care Center.

    PubMed

    Mehra, Tarun; Müller, Christian Thomas Benedikt; Volbracht, Jörk; Seifert, Burkhardt; Moos, Rudolf

    2015-01-01

    Case weights of Diagnosis Related Groups (DRGs) are determined by the average cost of cases from a previous billing period. However, a significant share of cases are largely over- or underfunded. We therefore decided to analyze earning outliers of our hospital to search for predictors enabling a better grouping under SwissDRG. 28,893 inpatient cases without additional private insurance discharged from our hospital in 2012 were included in our analysis. Outliers were defined by the interquartile range method. Predictors for deficit and profit outliers were determined with logistic regressions. Predictors were shortlisted with the LASSO regularized logistic regression method and compared to the results of a random forest analysis. 10 of these parameters were selected for quantile regression analysis to quantify their impact on earnings. Psychiatric diagnosis and admission as an emergency case were significant predictors of a higher deficit, with negative regression coefficients for all analyzed quantiles (p<0.001). Admission from an external health care provider was a significant predictor of a higher deficit in all but the 90% quantile (p<0.001 for Q10, Q20, Q50, Q80 and p = 0.0017 for Q90). Burns predicted higher earnings for cases which were favorably remunerated (p<0.001 for the 90% quantile). Osteoporosis predicted a higher deficit in the most underfunded cases, but did not predict differences in earnings for balanced or profitable cases (Q10 and Q20: p<0.001, Q50: p = 0.10, Q80: p = 0.88 and Q90: p = 0.52). ICU stay, mechanical ventilation and patient clinical complexity level (PCCL) score predicted higher losses at the 10% quantile but also higher profits at the 90% quantile (p<0.001). We suggest considering psychiatric diagnosis, admission as an emergency case and admission from an external health care provider as DRG split criteria, as they predict large, consistent and significant losses.

  13. Measuring Disparities across the Distribution of Mental Health Care Expenditures

    PubMed Central

    Cook, Benjamin Lê; Manning, Willard; Alegría, Margarita

    2013-01-01

    Background Previous mental health care disparities studies predominantly compare mean mental health care use across racial/ethnic groups, leaving policymakers with little information on disparities among those with a higher level of expenditures. Aims of the Study To identify racial/ethnic disparities among individuals at varying quantiles of mental health care expenditures. To assess whether disparities in the upper quantiles of expenditure differ by insurance status, income and education. Methods Data were analyzed from a nationally representative sample of white, black and Latino adults 18 years and older (n=83,878). Our dependent variable was total mental health care expenditure. We measured disparities in any mental health care expenditures, disparities in mental health care expenditure at the 95th, 97.5th, and 99th expenditure quantiles of the full population using quantile regression, and at the 50th, 75th, and 95th quantiles for positive users. In the full population, we tested interaction coefficients between race/ethnicity and income, insurance, and education levels to determine whether racial/ethnic disparities in the upper quantiles differed by income, insurance and education. Results Significant black-white and Latino-white disparities were identified in any mental health care expenditures. In the full population, moving up the quantiles of mental health care expenditures, black-white and Latino-white disparities were reduced but remained statistically significant. No statistically significant disparities were found in analyses of positive users only. The magnitude of black-white disparities was smaller among those enrolled in public insurance programs compared to the privately insured and uninsured in the 97.5th and 99th quantiles. Disparities persist in the upper quantiles among those in higher income categories and after excluding psychiatric inpatient and emergency department (ED) visits. Discussion Disparities exist in any mental health care and among those who use the most mental health care resources, but much of the disparity seems to be driven by lack of access. The data do not allow us to disentangle whether disparities were related to white respondents' overuse or underuse as compared to minority groups. The cross-sectional data allow us to make only associational claims about the role of insurance, income, and education in disparities. With these limitations in mind, we identified a persistence of disparities in overall expenditures even among those in the highest income categories, after controlling for mental health status and observable sociodemographic characteristics. Implications for Health Care Provision and Use Interventions are needed to equalize resource allocation to racial/ethnic minority patients regardless of their income, with emphasis on outreach interventions to address the disparities in access that are responsible for the no/low expenditures for even Latinos at higher levels of illness severity. Implications for Health Policies Increased policy efforts are needed to reduce the gap in health insurance for Latinos and improve outreach programs to enroll those in need into mental health care services. Implications for Further Research Future studies that conclusively disentangle overuse and appropriate use in these populations are warranted. PMID:23676411

  14. An observationally centred method to quantify local climate change as a distribution

    NASA Astrophysics Data System (ADS)

    Stainforth, David; Chapman, Sandra; Watkins, Nicholas

    2013-04-01

    For planning and adaptation, guidance on trends in local climate is needed at the specific thresholds relevant to particular impact or policy endeavours. This requires quantifying trends at specific quantiles in distributions of variables such as daily temperature or precipitation. These non-normal distributions vary both geographically and in time. The trends in the relevant quantiles may not simply follow the trend in the distribution mean. We present a method[1] for analysing local climatic timeseries data to assess which quantiles of the local climatic distribution show the greatest and most robust trends. We demonstrate this approach using E-OBS gridded data[2] timeseries of local daily temperature from specific locations across Europe over the last 60 years. Our method extracts the changing cumulative distribution function over time and uses a simple mathematical deconstruction of how the difference between two observations from two different time periods can be assigned to the combination of natural statistical variability and/or the consequences of secular climate change. This deconstruction facilitates an assessment of the sensitivity of different quantiles of the distributions to changing climate. Geographical location and temperature are treated as independent variables; we thus obtain as outputs how the trend or sensitivity varies with temperature (or occurrence likelihood), and with geographical location. These sensitivities are found to vary geographically across Europe, as one would expect given the different influences on local climate between, say, Western Scotland and central Italy. We find many regionally consistent patterns of response that are of potential value in adaptation planning. We discuss methods to quantify the robustness of these observed sensitivities and their statistical likelihood. This also quantifies the level of detail needed from climate models if they are to be used as tools to assess climate change impact. [1] S C Chapman, D A Stainforth, N W Watkins, 2013, On Estimating Local Long Term Climate Trends, Phil. Trans. R. Soc. A, in press [2] Haylock, M.R., N. Hofstra, A.M.G. Klein Tank, E.J. Klok, P.D. Jones and M. New. 2008: A European daily high-resolution gridded dataset of surface temperature and precipitation. J. Geophys. Res (Atmospheres), 113, D20119, doi:10.1029/2008JD010201
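
    The core of the method, comparing how each quantile of the local distribution shifts between two periods, can be illustrated directly from empirical quantiles. A toy sketch on synthetic data; the authors' full deconstruction also separates natural variability from secular change, which is omitted here.

        import numpy as np

        def quantile_change(temps_early, temps_late, probs=np.arange(0.05, 1.0, 0.05)):
            """Shift in each empirical quantile between two observation periods."""
            return probs, np.quantile(temps_late, probs) - np.quantile(temps_early, probs)

        rng = np.random.default_rng(0)
        early = rng.normal(10.0, 5.0, 365 * 20)          # 20 years of daily temperatures
        late = rng.normal(11.0, 5.5, 365 * 20)           # warmer and more variable
        probs, dq = quantile_change(early, late)
        print(dq[[0, 9, -1]])                            # shifts at the 5th, 50th, 95th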

  15. Obesity inequality in Malaysia: decomposing differences by gender and ethnicity using quantile regression.

    PubMed

    Dunn, Richard A; Tan, Andrew K G; Nayga, Rodolfo M

    2012-01-01

    Obesity prevalence is unequally distributed across gender and ethnic groups in Malaysia. In this paper, we examine the role of socioeconomic inequality in explaining these disparities. The body mass index (BMI) distributions of Malays and Chinese, the two largest ethnic groups in Malaysia, are estimated through the use of quantile regression. The differences in the BMI distributions are then decomposed into two parts: attributable to differences in socioeconomic endowments and attributable to differences in responses to endowments. For both males and females, the BMI distribution of Malays is shifted toward the right of the distribution of Chinese, i.e., Malays exhibit higher obesity rates. In the lower 75% of the distribution, differences in socioeconomic endowments explain none of this difference. At the 90th percentile, differences in socioeconomic endowments account for no more than 30% of the difference in BMI between ethnic groups. Our results demonstrate that the higher levels of income and education that accrue with economic development will likely not eliminate obesity inequality. This leads us to conclude that reducing obesity inequality, as well as the overall level of obesity, requires increased efforts to alter the lifestyle behaviors of Malaysians.

  16. On Quantile Regression in Reproducing Kernel Hilbert Spaces with Data Sparsity Constraint

    PubMed Central

    Zhang, Chong; Liu, Yufeng; Wu, Yichao

    2015-01-01

    For spline regressions, it is well known that the choice of knots is crucial for the performance of the estimator. As a general learning framework covering the smoothing splines, learning in a Reproducing Kernel Hilbert Space (RKHS) has a similar issue. However, the selection of training data points for kernel functions in the RKHS representation has not been carefully studied in the literature. In this paper we study quantile regression as an example of learning in an RKHS. In this case, the regular squared norm penalty does not perform training data selection. We propose a data sparsity constraint that imposes thresholding on the kernel function coefficients to achieve a sparse kernel function representation. We demonstrate that the proposed data sparsity method can achieve competitive prediction performance in certain situations, and comparable performance in other cases, relative to the traditional squared norm penalty. Therefore, the data sparsity method can serve as a competitive alternative to the squared norm penalty method. Some theoretical properties of our proposed method using the data sparsity constraint are obtained. Both simulated and real data sets are used to demonstrate the usefulness of our data sparsity constraint. PMID:27134575
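
    To make the idea concrete: quantile regression in an RKHS minimizes the pinball (check) loss over functions expanded in kernel terms, and a sparsity device on the kernel coefficients selects which training points carry the representation. The sketch below is an analogue, not the paper's method: it swaps the authors' thresholding constraint for an L1 penalty, using scikit-learn's linear QuantileRegressor with the kernel matrix as the feature map.

        import numpy as np
        from sklearn.linear_model import QuantileRegressor
        from sklearn.metrics.pairwise import rbf_kernel

        rng = np.random.default_rng(1)
        x = rng.uniform(0.0, 1.0, (200, 1))
        y = np.sin(4.0 * x[:, 0]) + rng.normal(0.0, 0.3 * (1.0 + x[:, 0]), 200)

        K = rbf_kernel(x, x, gamma=10.0)   # kernel matrix doubles as the feature map
        # The L1 penalty zeroes out many kernel coefficients, so only a sparse
        # subset of training points supports the fitted 90th-percentile curve
        model = QuantileRegressor(quantile=0.9, alpha=1e-3, solver="highs")
        model.fit(K, y)
        print(f"{np.sum(np.abs(model.coef_) > 1e-8)} of {len(x)} coefficients are nonzero")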

  17. An actuarial approach to retrofit savings in buildings

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Subbarao, Krishnappa; Etingov, Pavel V.; Reddy, T. A.

    An actuarial method has been developed for determining energy savings from retrofits using energy use data for a number of buildings. This method should be contrasted with the traditional method of using pre- and post-retrofit data on the same building. It supports the U.S. Department of Energy Building Performance Database of real building performance data and related tools that enable engineering and financial practitioners to evaluate retrofits. The actuarial approach derives, from the database, probability density functions (PDFs) for energy savings from retrofits by creating peer groups for the user's pre- and post-retrofit buildings. From the energy use distributions of the two groups, the savings PDF is derived. This provides the basis for engineering analysis as well as financial risk analysis leading to investment decisions. Several technical issues are addressed: the savings PDF is obtained from the pre- and post-PDFs through a convolution; smoothing using kernel density estimation is applied to make the PDF more realistic; the low-data-density problem can be mitigated through a neighborhood methodology; correlations between pre and post buildings are addressed to improve the savings PDF; and sample size effects are addressed through Kolmogorov–Smirnov tests and quantile-quantile plots.
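
    The convolution step has a simple numerical equivalent: sample the difference between draws from the two peer-group distributions and smooth it with a kernel density estimate. A sketch with made-up lognormal energy-use data; the database's actual peer-grouping, neighborhood and correlation adjustments are omitted.

        import numpy as np
        from scipy.stats import gaussian_kde

        rng = np.random.default_rng(2)
        pre = rng.lognormal(mean=5.0, sigma=0.35, size=400)    # pre-retrofit peer group
        post = rng.lognormal(mean=4.8, sigma=0.35, size=350)   # post-retrofit peer group

        # Savings = pre - post; for independent groups the savings PDF is the
        # convolution of the two use PDFs, approximated here by resampling
        savings = rng.choice(pre, 20000) - rng.choice(post, 20000)
        savings_pdf = gaussian_kde(savings)                    # KDE smoothing

        print("P(savings > 0) ≈", np.mean(savings > 0))
        print("median savings ≈", np.median(savings))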

  18. Evaluation of uncertainties in mean and extreme precipitation under climate change for northwestern Mediterranean watersheds from high-resolution Med and Euro-CORDEX ensembles

    NASA Astrophysics Data System (ADS)

    Colmet-Daage, Antoine; Sanchez-Gomez, Emilia; Ricci, Sophie; Llovel, Cécile; Borrell Estupina, Valérie; Quintana-Seguí, Pere; Llasat, Maria Carmen; Servat, Eric

    2018-01-01

    The climate change impact on mean and extreme precipitation events in the northern Mediterranean region is assessed using high-resolution EuroCORDEX and MedCORDEX simulations. The focus is on three regions, Lez and Aude located in France and Muga located in northeastern Spain, and eight pairs of global and regional climate models are analyzed with respect to the SAFRAN product. First the model skills are evaluated in terms of bias for the precipitation annual cycle over the historical period. Then future changes in extreme precipitation, under two emission scenarios, are estimated through the computation of past/future change coefficients of quantile-ranked model precipitation outputs. Over the 1981-2010 period, the cumulative precipitation is overestimated for most models over the mountainous regions and underestimated over the coastal regions, in autumn and at the higher-order quantiles. The ensemble mean and the spread for the future period remain unchanged under the RCP4.5 scenario and decrease under the RCP8.5 scenario. Extreme precipitation events are intensified over the three catchments, with a smaller ensemble spread under RCP8.5 revealing more evident changes, especially in the later part of the 21st century.

  19. Estimation of descriptive statistics for multiply censored water quality data

    USGS Publications Warehouse

    Helsel, Dennis R.; Cohn, Timothy A.

    1988-01-01

    This paper extends the work of Gilliom and Helsel (1986) on procedures for estimating descriptive statistics of water quality data that contain "less than" observations. Previously, procedures were evaluated when only one detection limit was present. Here we investigate the performance of estimators for data that have multiple detection limits. Probability plotting and maximum likelihood methods perform substantially better than the simple substitution procedures now commonly in use. Therefore, simple substitution procedures (e.g., substitution of the detection limit) should be avoided. Probability plotting methods are more robust than maximum likelihood methods to misspecification of the parent distribution, and their use should be encouraged in the typical situation where the parent distribution is unknown. When utilized correctly, "less than" values frequently contain nearly as much information for estimating population moments and quantiles as would the same observations had the detection limit been below them.
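
    As an illustration of the maximum likelihood option compared here, censored observations can enter the likelihood through the CDF at their detection limits. A sketch for a lognormal fit, assuming left-censored data with several detection limits; the paper's probability-plotting alternative is not shown.

        import numpy as np
        from scipy.optimize import minimize
        from scipy.stats import norm

        def lognormal_mle_censored(values, censored):
            """MLE of a lognormal fit to data with multiple detection limits.

            `values` holds measured concentrations for detects and the
            detection limit for censored observations; `censored` is a
            boolean mask.  Censored points contribute log CDF(limit) terms.
            """
            x = np.log(np.asarray(values, dtype=float))
            cen = np.asarray(censored, dtype=bool)

            def negloglik(p):
                mu, log_sigma = p
                sigma = np.exp(log_sigma)
                ll = norm.logpdf(x[~cen], mu, sigma).sum()
                ll += norm.logcdf(x[cen], mu, sigma).sum()
                return -ll

            res = minimize(negloglik, x0=(x.mean(), 0.0), method="Nelder-Mead")
            return res.x[0], np.exp(res.x[1])

        conc = [0.5, 0.5, 1.2, 3.4, 0.1, 2.2, 5.6, 1.0, 0.1, 4.1]   # two detection limits
        cens = [True, True, False, False, True, False, False, False, True, False]
        mu, sigma = lognormal_mle_censored(conc, cens)
        print(f"log-scale mean {mu:.2f}, sd {sigma:.2f}")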

  20. Determinants of Academic Attainment in the United States: A Quantile Regression Analysis of Test Scores

    ERIC Educational Resources Information Center

    Haile, Getinet Astatike; Nguyen, Anh Ngoc

    2008-01-01

    We investigate the determinants of high school students' academic attainment in mathematics, reading and science in the United States; focusing particularly on possible differential impacts of ethnicity and family background across the distribution of test scores. Using data from the NELS2000 and employing quantile regression, we find two…

  1. Longitudinal analysis of the strengths and difficulties questionnaire scores of the Millennium Cohort Study children in England using M-quantile random-effects regression.

    PubMed

    Tzavidis, Nikos; Salvati, Nicola; Schmid, Timo; Flouri, Eirini; Midouhas, Emily

    2016-02-01

    Multilevel modelling is a popular approach for longitudinal data analysis. Statistical models conventionally target a parameter at the centre of a distribution. However, when the distribution of the data is asymmetric, modelling other location parameters, e.g. percentiles, may be more informative. We present a new approach, M-quantile random-effects regression, for modelling multilevel data. The proposed method is used for modelling location parameters of the distribution of the strengths and difficulties questionnaire scores of children in England who participate in the Millennium Cohort Study. Quantile mixed models are also considered. The analyses offer insights to child psychologists about the differential effects of risk factors on children's outcomes.

  2. Heterogeneous effects of oil shocks on exchange rates: evidence from a quantile regression approach.

    PubMed

    Su, Xianfang; Zhu, Huiming; You, Wanhai; Ren, Yinghua

    2016-01-01

    The determinants of exchange rates have attracted considerable attention among researchers over the past several decades. Most studies, however, ignore the possibility that the impact of oil shocks on exchange rates could vary across the exchange rate returns distribution. We employ a quantile regression approach to address this issue. Our results indicate that the effect of oil shocks on exchange rates is heterogeneous across quantiles. A large US dollar depreciation or appreciation tends to heighten the effects of oil shocks on exchange rate returns. Positive oil demand shocks lead to appreciation pressures in oil-exporting countries, and this result is robust across lower and upper return distributions. These results offer rich and useful information for investors and decision-makers.

  3. Censored data treatment using additional information in intelligent medical systems

    NASA Astrophysics Data System (ADS)

    Zenkova, Z. N.

    2015-11-01

    Statistical procedures are an important part of modern intelligent medical systems. They are used for processing, mining and analysis of different types of data about patients and their diseases; they help to make various decisions regarding diagnosis, treatment, medication or surgery, etc. In many cases the data can be censored or incomplete. It is well known that censoring considerably reduces the efficiency of statistical procedures. In this paper the author makes a brief review of the approaches which allow improvement of the procedures using additional information, and describes a modified estimation of an unknown cumulative distribution function involving additional information about a quantile which is known exactly. The additional information is used by applying a projection of a classical estimator onto a set of estimators with certain properties. The Kaplan-Meier estimator is considered as an estimator of the unknown cumulative distribution function, and the properties of the modified estimator are investigated for the case of single right censoring by means of simulations.
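
    For reference, the unmodified Kaplan-Meier estimator that serves as the starting point is compact to write down. A minimal sketch for right-censored data; the projection step that enforces the exactly known quantile is the paper's contribution and is not reproduced here.

        import numpy as np

        def kaplan_meier(times, events):
            """Kaplan-Meier survival curve for right-censored data.

            times: observed times; events: 1 if the event occurred, 0 if
            censored.  Returns distinct event times and S(t) at those times.
            """
            t = np.asarray(times, dtype=float)
            d = np.asarray(events, dtype=int)
            event_times = np.unique(t[d == 1])
            at_risk = np.array([(t >= u).sum() for u in event_times])
            deaths = np.array([((t == u) & (d == 1)).sum() for u in event_times])
            return event_times, np.cumprod(1.0 - deaths / at_risk)

        times = [2.0, 3.0, 3.0, 5.0, 7.0, 8.0, 9.0]
        events = [1, 1, 0, 1, 0, 1, 1]
        print(*kaplan_meier(times, events), sep="\n")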

  4. Bias and Variance Approximations for Estimators of Extreme Quantiles

    DTIC Science & Technology

    1988-11-01


  5. A Psychological Model for Aggregating Judgments of Magnitude

    NASA Astrophysics Data System (ADS)

    Merkle, Edgar C.; Steyvers, Mark

    In this paper, we develop and illustrate a psychologically motivated model for aggregating judgments of magnitude across experts. The model assumes that experts' judgments are perturbed from the truth by both systematic biases and random error, and it provides aggregated estimates that are implicitly based on the application of nonlinear weights to individual judgments. The model is also easily extended to situations where experts report multiple quantile judgments. We apply the model to expert judgments concerning flange leaks in a chemical plant, illustrating its use and comparing it to baseline measures.

  6. Gender Gaps in Mathematics, Science and Reading Achievements in Muslim Countries: Evidence from Quantile Regression Analyses

    ERIC Educational Resources Information Center

    Shafiq, M. Najeeb

    2011-01-01

    Using quantile regression analyses, this study examines gender gaps in mathematics, science, and reading in Azerbaijan, Indonesia, Jordan, the Kyrgyz Republic, Qatar, Tunisia, and Turkey among 15 year-old students. The analyses show that girls in Azerbaijan achieve as well as boys in mathematics and science and overachieve in reading. In Jordan,…

  7. Gender Gaps in Mathematics, Science and Reading Achievements in Muslim Countries: A Quantile Regression Approach

    ERIC Educational Resources Information Center

    Shafiq, M. Najeeb

    2013-01-01

    Using quantile regression analyses, this study examines gender gaps in mathematics, science, and reading in Azerbaijan, Indonesia, Jordan, the Kyrgyz Republic, Qatar, Tunisia, and Turkey among 15-year-old students. The analyses show that girls in Azerbaijan achieve as well as boys in mathematics and science and overachieve in reading. In Jordan,…

  8. A Quantile Regression Approach to Understanding the Relations among Morphological Awareness, Vocabulary, and Reading Comprehension in Adult Basic Education Students

    ERIC Educational Resources Information Center

    Tighe, Elizabeth L.; Schatschneider, Christopher

    2016-01-01

    The purpose of this study was to investigate the joint and unique contributions of morphological awareness and vocabulary knowledge at five reading comprehension levels in adult basic education (ABE) students. We introduce the statistical technique of multiple quantile regression, which enabled us to assess the predictive utility of morphological…

  9. Trait Mindfulness as a Limiting Factor for Residual Depressive Symptoms: An Explorative Study Using Quantile Regression

    PubMed Central

    Radford, Sholto; Eames, Catrin; Brennan, Kate; Lambert, Gwladys; Crane, Catherine; Williams, J. Mark G.; Duggan, Danielle S.; Barnhofer, Thorsten

    2014-01-01

    Mindfulness has been suggested to be an important protective factor for emotional health. However, this effect might vary with regard to context. This study applied a novel statistical approach, quantile regression, in order to investigate the relation between trait mindfulness and residual depressive symptoms in individuals with a history of recurrent depression, while taking into account symptom severity and number of episodes as contextual factors. Rather than fitting to a single indicator of central tendency, quantile regression allows exploration of relations across the entire range of the response variable. Analysis of self-report data from 274 participants with a history of three or more previous episodes of depression showed that relatively higher levels of mindfulness were associated with relatively lower levels of residual depressive symptoms. This relationship was most pronounced near the upper end of the response distribution and moderated by the number of previous episodes of depression at the higher quantiles. The findings suggest that with lower levels of mindfulness, residual symptoms are less constrained and more likely to be influenced by other factors. Further, the limiting effect of mindfulness on residual symptoms is most salient in those with higher numbers of episodes. PMID:24988072

  10. Approximating Long-Term Statistics Early in the Global Precipitation Measurement Era

    NASA Technical Reports Server (NTRS)

    Stanley, Thomas; Kirschbaum, Dalia B.; Huffman, George J.; Adler, Robert F.

    2017-01-01

    Long-term precipitation records are vital to many applications, especially the study of extreme events. The Tropical Rainfall Measuring Mission (TRMM) has served this need, but TRMM's successor mission, Global Precipitation Measurement (GPM), does not yet provide a long-term record. Quantile mapping, the conversion of values across paired empirical distributions, offers a simple, established means to approximate such long-term statistics, but only within appropriately defined domains. This method was applied to a case study in Central America, demonstrating that quantile mapping between TRMM and GPM data maintains the performance of a real-time landslide model. Use of quantile mapping could bring the benefits of the latest satellite-based precipitation dataset to existing user communities such as those for hazard assessment, crop forecasting, numerical weather prediction, and disease tracking.
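
    Empirical quantile mapping itself fits in a few lines: locate each new value on one sample's quantile function and read off the matching quantile of the other. A generic sketch on toy data, not the specific TRMM/GPM processing chain:

        import numpy as np

        def quantile_map(values, src_sample, dst_sample, n_probs=99):
            """Map values from the source distribution onto the target one.

            Each value is located on the source sample's quantile function and
            replaced by the target sample's value at the same probability
            (e.g., GPM-era data expressed on the long TRMM record's scale).
            """
            probs = np.linspace(0.01, 0.99, n_probs)
            src_q = np.quantile(src_sample, probs)
            dst_q = np.quantile(dst_sample, probs)
            return np.interp(values, src_q, dst_q)

        rng = np.random.default_rng(3)
        trmm = rng.gamma(0.6, 12.0, 6000)       # long historical record (toy data)
        gpm = rng.gamma(0.6, 10.0, 1500)        # shorter, differently scaled record
        print(quantile_map([5.0, 25.0, 60.0], gpm, trmm))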

  11. Percentile-Based ETCCDI Temperature Extremes Indices for CMIP5 Model Output: New Results through Semiparametric Quantile Regression Approach

    NASA Astrophysics Data System (ADS)

    Li, L.; Yang, C.

    2017-12-01

    Climate extremes often manifest as rare events, in terms of surface air temperature and precipitation, with an annual recurrence period. In order to represent the manifold characteristics of climate extremes for monitoring and analysis, the Expert Team on Climate Change Detection and Indices (ETCCDI) worked out a set of 27 core indices based on daily temperature and precipitation data, describing extreme weather and climate events on an annual basis. The CLIMDEX project (http://www.climdex.org) has produced public domain datasets of such indices for data from a variety of sources, including output from global climate models (GCMs) participating in the Coupled Model Intercomparison Project Phase 5 (CMIP5). Among the 27 ETCCDI indices, there are six percentile-based temperature extremes indices that fall into two groups: exceedance rates (ER) (TN10p, TN90p, TX10p and TX90p) and durations (CSDI and WSDI). Percentiles must be estimated prior to the calculation of the indices, and could be more or less biased by the adopted algorithm. Such biases will in turn be propagated to the final results of the indices. CLIMDEX used an empirical quantile estimator combined with a bootstrap resampling procedure to reduce the inhomogeneity in the annual series of the ER indices. However, some problems remain in the CLIMDEX datasets, namely overestimated climate variability due to unaccounted autocorrelation in the daily temperature data, seasonally varying biases, and inconsistency between the algorithms applied to the ER indices and to the duration indices. We now present new results for the six indices through a semiparametric quantile regression approach for the CMIP5 model output. By using the base-period data as a whole and taking seasonality and autocorrelation into account, this approach successfully addresses the aforementioned issues and yields consistent results. The new datasets cover the historical and three projected (RCP2.6, RCP4.5 and RCP8.5) emission scenarios run with a 19-member multimodel ensemble. We analyze changes in the six indices on global and regional scales over the 21st century relative to either the base period 1961-1990 or the reference period 1981-2000, and compare the results with those based on the CLIMDEX datasets.
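
    A bare-bones version of one ER index makes the dependence on percentile estimation concrete. The sketch computes a TX90p-style exceedance rate from a single whole-base-period percentile; as the record notes, CLIMDEX instead uses calendar-day percentiles with bootstrap resampling, and the paper replaces both with quantile regression.

        import numpy as np

        def tx90p(daily_tmax, base_mask):
            """Percent of days with daily maximum temperature above the
            base-period 90th percentile (simplified exceedance-rate index)."""
            threshold = np.quantile(daily_tmax[base_mask], 0.90)
            return 100.0 * np.mean(daily_tmax > threshold)

        rng = np.random.default_rng(4)
        tmax = np.concatenate([rng.normal(15, 8, 365 * 30),      # base period
                               rng.normal(16, 8, 365 * 20)])     # later, warmer years
        base = np.arange(tmax.size) < 365 * 30
        print(f"TX90p over the full record: {tx90p(tmax, base):.1f}%")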

  12. CO-occurring exposure to perchlorate, nitrate and thiocyanate alters thyroid function in healthy pregnant women

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Horton, Megan K., E-mail: megan.horton@mssm.edu; Blount, Benjamin C.; Valentin-Blasini, Liza

    Background: Adequate maternal thyroid function during pregnancy is necessary for normal fetal brain development, making pregnancy a critical window of vulnerability to thyroid disrupting insults. Sodium/iodide symporter (NIS) inhibitors, namely perchlorate, nitrate, and thiocyanate, have been shown individually to competitively inhibit uptake of iodine by the thyroid. Several epidemiologic studies examined the association between these individual exposures and thyroid function. Few studies have examined the effect of this chemical mixture on thyroid function during pregnancy. Objectives: We examined the cross sectional association between urinary perchlorate, thiocyanate and nitrate concentrations and thyroid function among healthy pregnant women living in New York City using weighted quantile sum (WQS) regression. Methods: We measured thyroid stimulating hormone (TSH) and free thyroxine (FreeT4) in blood samples; perchlorate, thiocyanate, nitrate and iodide in urine samples collected from 284 pregnant women at 12 (±2.8) weeks gestation. We examined associations between urinary analyte concentrations and TSH or FreeT4 using linear regression or WQS, adjusting for gestational age, urinary iodide and creatinine. Results: Individual analyte concentrations in urine were significantly correlated (Spearman's r 0.4–0.5, p<0.001). Linear regression analyses did not suggest associations between individual concentrations and thyroid function. The WQS revealed a significant positive association between the weighted sum of urinary concentrations of the three analytes and increased TSH. Perchlorate had the largest weight in the index, indicating the largest contribution to the WQS. Conclusions: Co-exposure to perchlorate, nitrate and thiocyanate may alter maternal thyroid function, specifically TSH, during pregnancy. - Highlights: • Perchlorate, nitrate, thiocyanate and iodide measured in maternal urine. • Thyroid function (TSH and Free T4) measured in maternal blood. • Weighted quantile sum (WQS) regression examined complex mixture effect. • WQS identified an inverse association between the exposure mixture and maternal TSH. • Perchlorate indicated as the ‘bad actor’ of the mixture.
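
    In outline, WQS regression scores each exposure into quantile bins, forms a single weighted index with nonnegative weights summing to one, and regresses the outcome on that index. The sketch below is a stripped-down illustration of that structure, assuming a continuous outcome and squared-error fitting; WQS as practiced additionally uses bootstrap ensembles and split-sample validation, omitted here.

        import numpy as np
        from scipy.optimize import minimize

        def wqs_fit(X, y, n_quantiles=4):
            """Weighted quantile sum regression, in outline: quantile-score the
            exposures, then fit y ~ b0 + b1 * (scores @ w) with w >= 0, sum(w) = 1."""
            # Rank-transform each exposure column into quantile bins 0..n-1
            scores = np.column_stack([
                np.searchsorted(np.quantile(col, np.linspace(0, 1, n_quantiles + 1)[1:-1]), col)
                for col in X.T
            ]).astype(float)
            p = X.shape[1]

            def sse(theta):
                w, b0, b1 = theta[:p], theta[p], theta[p + 1]
                return np.sum((y - b0 - b1 * scores @ w) ** 2)

            cons = ({"type": "eq", "fun": lambda t: t[:p].sum() - 1.0},)
            bounds = [(0, 1)] * p + [(None, None)] * 2
            x0 = np.concatenate([np.full(p, 1.0 / p), [y.mean(), 0.0]])
            res = minimize(sse, x0, bounds=bounds, constraints=cons, method="SLSQP")
            return res.x[:p], res.x[p], res.x[p + 1]

        rng = np.random.default_rng(5)
        X = rng.lognormal(size=(300, 3))                 # three toy exposures
        y = 1.0 + 0.8 * (X[:, 0] > np.median(X[:, 0])) + rng.normal(0, 0.5, 300)
        w, b0, b1 = wqs_fit(X, y)
        print("estimated weights:", np.round(w, 2))      # first exposure should dominate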

  13. Tests of Sunspot Number Sequences: 3. Effects of Regression Procedures on the Calibration of Historic Sunspot Data

    NASA Astrophysics Data System (ADS)

    Lockwood, M.; Owens, M. J.; Barnard, L.; Usoskin, I. G.

    2016-11-01

    We use sunspot-group observations from the Royal Greenwich Observatory (RGO) to investigate the effects of intercalibrating data from observers with different visual acuities. The tests are made by counting the number of groups [RB] above a variable cut-off threshold of observed total whole spot area (uncorrected for foreshortening) to simulate what a lower-acuity observer would have seen. The synthesised annual means of RB are then re-scaled to the full observed RGO group number [RA] using a variety of regression techniques. It is found that a very high correlation between RA and RB (r_{AB} > 0.98) does not prevent large errors in the intercalibration (for example sunspot-maximum values can be over 30% too large even for such levels of r_{AB}). In generating the backbone sunspot number [R_{BB}], Svalgaard and Schatten (Solar Phys., 2016) force regression fits to pass through the scatter-plot origin, which generates unreliable fits (the residuals do not form a normal distribution) and causes sunspot-cycle amplitudes to be exaggerated in the intercalibrated data. It is demonstrated that the use of Quantile-Quantile ("Q-Q") plots to test for a normal distribution is a useful indicator of erroneous and misleading regression fits. Ordinary least-squares linear fits, not forced to pass through the origin, are sometimes reliable (although the optimum method used is shown to be different when matching peak and average sunspot-group numbers). However, other fits are only reliable if non-linear regression is used. From these results it is entirely possible that the inflation of solar-cycle amplitudes in the backbone group sunspot number as one goes back in time, relative to related solar-terrestrial parameters, is entirely caused by the use of inappropriate and non-robust regression techniques to calibrate the sunspot data.
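
    The Q-Q diagnostic advocated here is easy to reproduce: fit the intercalibration with and without forcing the fit through the origin, then check the residuals for normality. A toy sketch with synthetic counts; scipy's probplot can draw the Q-Q plot, and the Shapiro-Wilk p-value serves as a numeric stand-in for the visual check.

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(6)
        r_a = rng.gamma(4.0, 2.0, 80)                       # full-acuity group counts
        r_b = 0.7 * r_a - 1.5 + rng.normal(0.0, 0.8, 80)    # lower-acuity observer

        # Slope of the fit forced through the origin vs. an ordinary fit
        k_origin = (r_a @ r_b) / (r_b @ r_b)
        free = stats.linregress(r_b, r_a)

        resid_origin = r_a - k_origin * r_b
        resid_free = r_a - (free.intercept + free.slope * r_b)

        # Non-normal residuals flag the origin-forced fit as suspect;
        # stats.probplot(resid_origin, dist="norm") gives the visual Q-Q check
        print("origin-forced Shapiro p:", stats.shapiro(resid_origin).pvalue)
        print("free-intercept Shapiro p:", stats.shapiro(resid_free).pvalue)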

  14. Optimal regionalization of extreme value distributions for flood estimation

    NASA Astrophysics Data System (ADS)

    Asadi, Peiman; Engelke, Sebastian; Davison, Anthony C.

    2018-01-01

    Regionalization methods have long been used to estimate high return levels of river discharges at ungauged locations on a river network. In these methods, discharge measurements from a homogeneous group of similar, gauged, stations are used to estimate high quantiles at a target location that has no observations. The similarity of this group to the ungauged location is measured in terms of a hydrological distance based on differences in physical and meteorological catchment attributes. We develop a statistical method for estimation of high return levels based on regionalizing the parameters of a generalized extreme value distribution. The group of stations is chosen by optimizing over the attribute weights of the hydrological distance, ensuring similarity and in-group homogeneity. Our method is applied to discharge data from the Rhine basin in Switzerland, and its performance at ungauged locations is compared to that of other regionalization methods. For gauged locations we show how our approach improves the estimation uncertainty for long return periods by combining local measurements with those from the chosen group.
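
    The final estimation step, fitting a generalized extreme value (GEV) distribution and reading off a return level as a high quantile, can be sketched as below on synthetic annual maxima; the regionalization itself (pooling records from the optimized station group) is the paper's contribution and is not shown.

        import numpy as np
        from scipy.stats import genextreme

        def gev_return_level(annual_max_discharge, return_period_years):
            """Fit a GEV to annual maxima and return the T-year return level,
            i.e. the (1 - 1/T) quantile of the fitted distribution."""
            shape, loc, scale = genextreme.fit(annual_max_discharge)
            return genextreme.ppf(1.0 - 1.0 / return_period_years,
                                  shape, loc=loc, scale=scale)

        rng = np.random.default_rng(7)
        annual_max = genextreme.rvs(-0.1, loc=300.0, scale=80.0,
                                    size=60, random_state=rng)
        print(f"100-year level ≈ {gev_return_level(annual_max, 100):.0f} m³/s")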

  15. Inferring river bathymetry via Image-to-Depth Quantile Transformation (IDQT)

    USGS Publications Warehouse

    Legleiter, Carl

    2016-01-01

    Conventional, regression-based methods of inferring depth from passive optical image data undermine the advantages of remote sensing for characterizing river systems. This study introduces and evaluates a more flexible framework, Image-to-Depth Quantile Transformation (IDQT), that involves linking the frequency distribution of pixel values to that of depth. In addition, a new image processing workflow involving deep water correction and Minimum Noise Fraction (MNF) transformation can reduce a hyperspectral data set to a single variable related to depth and thus suitable for input to IDQT. Applied to a gravel bed river, IDQT avoided negative depth estimates along channel margins and underpredictions of pool depth. Depth retrieval accuracy (R² = 0.79) and precision (0.27 m) were comparable to an established band ratio-based method, although a small shallow bias (0.04 m) was observed. Several ways of specifying distributions of pixel values and depths were evaluated but had negligible impact on the resulting depth estimates, implying that IDQT was robust to these implementation details. In essence, IDQT uses frequency distributions of pixel values and depths to achieve an aspatial calibration; the image itself provides information on the spatial distribution of depths. The approach thus reduces sensitivity to misalignment between field and image data sets and allows greater flexibility in the timing of field data collection relative to image acquisition, a significant advantage in dynamic channels. IDQT also creates new possibilities for depth retrieval in the absence of field data if a model could be used to predict the distribution of depths within a reach.
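
    In skeleton form, IDQT is a quantile-matching operation between an image-derived variable and field-measured depths. A sketch under the assumption that the (MNF-reduced) pixel variable increases monotonically with depth; the deep-water correction and MNF steps are omitted. Because the output is interpolated within the observed depth quantiles, it cannot go negative, mirroring the behaviour noted in the record.

        import numpy as np

        def idqt(pixel_values, depth_sample, n_probs=199):
            """Assign each pixel the depth whose quantile in the measured depth
            distribution matches the pixel value's quantile in the image."""
            probs = np.linspace(0.005, 0.995, n_probs)
            pix_q = np.quantile(pixel_values, probs)
            dep_q = np.quantile(depth_sample, probs)
            return np.interp(pixel_values, pix_q, dep_q)

        rng = np.random.default_rng(8)
        depths = rng.gamma(2.0, 0.4, 500)                # field-surveyed depths (m)
        pixels = rng.normal(0.0, 1.0, 100_000)           # image variable, one per pixel
        depth_map = idqt(pixels, depths)
        print(depth_map.min(), depth_map.max())          # bounded by observed depths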

  16. Downscaling of daily precipitation using a hybrid model of Artificial Neural Network, Wavelet, and Quantile Mapping in Gharehsoo River Basin, Iran

    NASA Astrophysics Data System (ADS)

    Taie Semiromi, M.; Koch, M.

    2017-12-01

    Although linear/regression statistical downscaling methods are straightforward and widely used, and can be applied to a single predictor-predictand pair or to spatial fields of predictors-predictands, their greatest constraint is the requirement of a normal distribution of the predictor and predictand values; they therefore cannot be used to predict the distribution of daily rainfall, which is typically non-normal. To overcome this limitation, the current study introduces a newly developed hybrid technique combining Artificial Neural Networks (ANNs), wavelets, and Quantile Mapping (QM) for downscaling of daily precipitation at 10 rain-gauge stations located in the Gharehsoo River Basin, Iran. For daily precipitation downscaling, the study makes use of the Second Generation Canadian Earth System Model (CanESM2) developed by the Canadian Centre for Climate Modeling and Analysis (CCCma). Climate projections are available for three representative concentration pathways (RCPs), namely RCP 2.6, RCP 4.5 and RCP 8.5, up to 2100. In this regard, 26 National Centers for Environmental Prediction (NCEP) reanalysis large-scale variables with potential physical relationships to precipitation were selected as candidate predictors. Afterwards, predictor screening was conducted using correlation, partial correlation and explained variance between predictors and the predictand (precipitation). Depending on the rain-gauge station, two or three predictors were selected, whose decomposed details (D) and approximations (A) obtained from discrete wavelet analysis were fed as inputs to the neural networks. After downscaling of daily precipitation, bias correction was conducted using quantile mapping. Of the complete time series available, i.e. 1978-2005, two thirds, namely 1978-1996, was used for calibration of QM, and the remainder, i.e. 1997-2005, was used for validation. Results showed that the proposed hybrid method, supported by QM for bias correction, could simulate daily precipitation quite satisfactorily. Results also indicated that under all RCPs, precipitation will have decreased by roughly 12% by 2100, with a smaller decrease under RCP 8.5 than under RCP 4.5.

  17. Uncertainties of flood frequency estimation approaches based on continuous simulation using data resampling

    NASA Astrophysics Data System (ADS)

    Arnaud, Patrick; Cantet, Philippe; Odry, Jean

    2017-11-01

    Flood frequency analyses (FFAs) are needed for flood risk management. Many methods exist, ranging from classical purely statistical approaches to more complex approaches based on process simulation. The results of these methods are associated with uncertainties that are sometimes difficult to estimate due to the complexity of the approaches or the number of parameters, especially for process simulation. This is the case for the simulation-based FFA approach presented in this paper, called SHYREG, in which a rainfall generator is coupled with a simple rainfall-runoff model; we attempt to estimate the uncertainties due to the estimation of the seven parameters needed to estimate flood frequencies. The six parameters of the rainfall generator are mean values, so their theoretical distribution is known and can be used to estimate the generator uncertainties. In contrast, the theoretical distribution of the single hydrological model parameter is unknown; consequently, a bootstrap method is applied to estimate the calibration uncertainties. The propagation of uncertainty from the rainfall generator to the hydrological model is also taken into account. This method is applied to 1112 basins throughout France. Uncertainties coming from the SHYREG method and from purely statistical approaches are compared, and the results are discussed according to the length of the recorded observations, basin size and basin location. Uncertainties of the SHYREG method decrease as the basin size increases or as the length of the recorded flow increases. Moreover, the results show that the confidence intervals of the SHYREG method are relatively small despite the complexity of the method and the number of parameters (seven). This is due to the stability of the parameters and takes into account the dependence of uncertainties due to the rainfall model and the hydrological calibration. Indeed, the uncertainties on the flow quantiles are of the same order of magnitude as those associated with the use of a statistical law with two parameters (here the generalised extreme value Type I distribution) and clearly lower than those associated with the use of a three-parameter law (here the generalised extreme value Type II distribution). For extreme flood quantiles, the uncertainties are mostly due to the rainfall generator because of the progressive saturation of the hydrological model.
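
    The bootstrap component is generic enough to sketch: resample the calibration data with replacement, re-estimate, and read confidence bounds from the resampled statistics. Below is a simplified analogue that bootstraps a flood quantile directly; SHYREG actually bootstraps its hydrological parameter and propagates it through the simulation chain.

        import numpy as np

        def bootstrap_quantile_ci(annual_floods, prob=0.99, n_boot=2000,
                                  alpha=0.10, seed=9):
            """Percentile-bootstrap confidence interval for a flood quantile."""
            rng = np.random.default_rng(seed)
            data = np.asarray(annual_floods, dtype=float)
            stats = np.array([
                np.quantile(rng.choice(data, data.size, replace=True), prob)
                for _ in range(n_boot)
            ])
            return np.quantile(stats, [alpha / 2.0, 1.0 - alpha / 2.0])

        rng = np.random.default_rng(0)
        floods = rng.gumbel(200.0, 60.0, 40)             # 40 years of annual maxima
        lo, hi = bootstrap_quantile_ci(floods)
        print(f"90% CI for the 100-year quantile: [{lo:.0f}, {hi:.0f}]")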

  18. Assessing the pollution risk of a groundwater source field at western Laizhou Bay under seawater intrusion

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zeng, Xiankui; Wu, Jichun; Wang, Dong, E-mail: wangdong@nju.edu.cn

    Coastal areas have great significance for human living, economy and society development in the world. With the rapid increase of pressures from human activities and climate change, the safety of groundwater resources is under threat from seawater intrusion in coastal areas. The area of Laizhou Bay is one of the most seriously seawater-intruded areas in China; the phenomenon was first recognized there in the middle of the 1970s. This study assessed the pollution risk of a groundwater source field of the western Laizhou Bay area by inferring the probability distribution of groundwater Cl⁻ concentration. The numerical model of the seawater intrusion process is built using SEAWAT4. The parameter uncertainty of this model is evaluated by Markov Chain Monte Carlo (MCMC) simulation, with DREAM(ZS) as the sampling algorithm. Then, the predictive distribution of Cl⁻ concentration at the groundwater source field is inferred using the samples of model parameters obtained from MCMC. After that, the pollution risk of the groundwater source field is assessed via the predictive quantiles of Cl⁻ concentration. The results of model calibration and verification demonstrate that the DREAM(ZS)-based MCMC is efficient and reliable for estimating model parameters given the current observations. At the 95% confidence level, the groundwater source point will not be polluted by seawater intrusion in the five years 2015–2019. In addition, the 2.5% and 97.5% predictive quantiles show that the Cl⁻ concentration of the groundwater source field always varies between 175 mg/l and 200 mg/l. - Highlights: • The parameter uncertainty of the seawater intrusion model is evaluated by MCMC. • The groundwater source field won't be polluted by seawater intrusion in the next five years. • The pollution risk is assessed by the predictive quantiles of Cl⁻ concentration.

  19. Numerical analysis of the accuracy of bivariate quantile distributions utilizing copulas compared to the GUM supplement 2 for oil pressure balance uncertainties

    NASA Astrophysics Data System (ADS)

    Ramnath, Vishal

    2017-11-01

    In the field of pressure metrology the effective area is Ae = A0 (1 + λP), where A0 is the zero-pressure area and λ is the distortion coefficient, and the conventional practice is to construct univariate probability density functions (PDFs) for A0 and λ. As a result, analytical generalized non-Gaussian bivariate joint PDFs have not featured prominently in pressure metrology. Recently, quantile functions based on the extended lambda distribution have been successfully utilized for summarizing univariate arbitrary PDF distributions of gas pressure balances. Motivated by this development, we investigate the feasibility and utility of extending and applying quantile functions to systems which naturally exhibit bivariate PDFs. Our approach is to utilize the GUM Supplement 1 methodology to solve and generate Monte Carlo based multivariate uncertainty data for an oil based pressure balance laboratory standard that is used to generate known high pressures, and which is in turn cross-floated against another pressure balance transfer standard in order to deduce the transfer standard's respective area. We then numerically analyse the uncertainty data by formulating and constructing an approximate bivariate quantile distribution that directly couples A0 and λ, in order to compare and contrast its accuracy with an exact GUM Supplement 2 based uncertainty quantification analysis.

  20. Predicting Posttraumatic Stress Symptom Prevalence and Local Distribution after an Earthquake with Scarce Data.

    PubMed

    Dussaillant, Francisca; Apablaza, Mauricio

    2017-08-01

    After a major earthquake, the assignment of scarce mental health emergency personnel to different geographic areas is crucial to the effective management of the crisis. The scarce information that is available in the aftermath of a disaster may be valuable in helping predict where the populations most in need are located. The objectives of this study were to derive algorithms to predict posttraumatic stress (PTS) symptom prevalence and local distribution after an earthquake and to test whether there are algorithms that require few input data and are still reasonably predictive. A rich database of PTS symptoms, collected after Chile's 2010 earthquake and tsunami, was used. Several model specifications for the mean and centiles of the distribution of PTS symptoms, together with posttraumatic stress disorder (PTSD) prevalence, were estimated via linear and quantile regressions. The models varied in the set of covariates included. Adjusted R² for the most liberal specifications (in terms of the number of covariates included) ranged from 0.62 to 0.74, depending on the outcome. When only including peak ground acceleration (PGA), poverty rate, and household damage in linear and quadratic form, predictive capacity was still good (adjusted R² from 0.59 to 0.67). Information about local poverty, household damage, and PGA can be used as an aid to predict PTS symptom prevalence and local distribution after an earthquake. This can help improve the assignment of mental health personnel to the affected localities. Dussaillant F, Apablaza M. Predicting posttraumatic stress symptom prevalence and local distribution after an earthquake with scarce data. Prehosp Disaster Med. 2017;32(4):357-367.

  1. On the Computation of Optimal Designs for Certain Time Series Models with Applications to Optimal Quantile Selection for Location or Scale Parameter Estimation.

    DTIC Science & Technology

    1981-07-01

    process is observed over all of (0,1], the reproducing kernel Hilbert space (RKHS) techniques developed by Parzen (1961a, 1961b) may be used to construct ... covariance kernel, R, for the process (1.1) is the reproducing kernel for a reproducing kernel Hilbert space (RKHS) which will be denoted as H(R) (cf. ... (2.6), it is known that (cf. Eubank, Smith and Smith (1981a, 1981b)), i) H(R) is a Hilbert function space consisting of functions which satisfy, for f ∈ H(R)

  2. Quantifying Population-Level Risks Using an Individual-Based Model: Sea Otters, Harlequin Ducks, and the Exxon Valdez Oil Spill

    PubMed Central

    Harwell, Mark A; Gentile, John H; Parker, Keith R

    2012-01-01

    Ecological risk assessments need to advance beyond evaluating risks to individuals that are largely based on toxicity studies conducted on a few species under laboratory conditions, to assessing population-level risks to the environment, including considerations of variability and uncertainty. Two individual-based models (IBMs), recently developed to assess current risks to sea otters and seaducks in Prince William Sound more than 2 decades after the Exxon Valdez oil spill (EVOS), are used to explore population-level risks. In each case, the models had previously shown that there were essentially no remaining risks to individuals from polycyclic aromatic hydrocarbons (PAHs) derived from the EVOS. New sensitivity analyses are reported here in which hypothetical environmental exposures to PAHs were heuristically increased until assimilated doses reached toxicity reference values (TRVs) derived at the no-observed-adverse-effects and lowest-observed-adverse-effects levels (NOAEL and LOAEL, respectively). For the sea otters, this was accomplished by artificially increasing the number of sea otter pits that would intersect remaining patches of subsurface oil residues by orders of magnitude over actual estimated rates. Similarly, in the seaduck assessment, the PAH concentrations in the constituents of diet, sediments, and seawater were increased in proportion to their relative contributions to the assimilated doses by orders of magnitude over measured environmental concentrations, to reach the NOAEL and LOAEL thresholds. The stochastic IBMs simulated millions of individuals. From these outputs, frequency distributions were derived of assimilated doses for populations of 500 000 sea otters or seaducks in each of 7 or 8 classes, respectively. Doses to several selected quantiles were analyzed, ranging from the 1-in-1000th most-exposed individuals (99.9% quantile) to the median-exposed individuals (50% quantile). The resulting families of quantile curves provide the basis for characterizing the environmental thresholds below which no population-level effects could be detected and above which population-level effects would be expected to become manifest. This approach provides risk managers an enhanced understanding of the risks to populations under various conditions and assumptions, whether under hypothetically increased exposure regimes, as demonstrated here, or in situations in which actual exposures are near toxic effects levels. This study shows that individual-based models are especially amenable and appropriate for conducting population-level risk assessments, and that they can readily be used to answer questions about the risks to individuals and populations across a variety of exposure conditions. Integr Environ Assess Manag 2012; 8: 503–522. © 2012 SETAC PMID:22275071

  3. Quantifying how the full local distribution of daily precipitation is changing and its uncertainties

    NASA Astrophysics Data System (ADS)

    Stainforth, David; Chapman, Sandra; Watkins, Nicholas

    2016-04-01

    The study of the consequences of global warming would benefit from quantification of geographical patterns of change at specific thresholds or quantiles, and from a better understanding of the intrinsic uncertainties in such quantities. For precipitation, a range of indices have been developed which focus on high percentiles (e.g. rainfall falling on days above the 99th percentile) and on absolute extremes (e.g. maximum annual one day precipitation), but scientific assessments are best undertaken in the context of changes in the whole climatic distribution. Furthermore, the relevant thresholds for climate-vulnerable policy decisions, adaptation planning and impact assessments vary according to the specific sector and location of interest. We present a methodology which maintains the flexibility to provide information at different thresholds for different downstream users, both scientists and decision makers. We develop a method[1,2] for analysing local climatic timeseries to assess which quantiles of the local climatic distribution show the greatest and most robust changes in daily precipitation data. We extract from the data quantities that characterize the changes in time of the likelihood of daily precipitation above a threshold and of the amount of precipitation on those days. Our method is a simple mathematical deconstruction of how the difference between two observations from two different time periods can be assigned to the combination of natural statistical variability and/or the consequences of secular climate change. This deconstruction facilitates an assessment of how fast different quantiles of precipitation distributions are changing. This involves not only determining which quantiles and geographical locations show the greatest and smallest changes, but also those at which uncertainty undermines the ability to make confident statements about any change there may be. We demonstrate this approach using E-OBS gridded data[3], which are timeseries of local daily precipitation across Europe over the last 60+ years. We treat geographical location and precipitation as independent variables and thus obtain as outputs the geographical pattern of change at given thresholds of precipitation. This information is model-independent, thus providing data of direct value in model calibration and assessment. [1] S C Chapman, D A Stainforth, N W Watkins, 2013, On Estimating Local Long Term Climate Trends, Phil. Trans. R. Soc. A, 371, 20120287 [2] S C Chapman, D A Stainforth, N W Watkins, 2015, Limits to the quantification of local climate change, ERL, 10, 094018 [3] M R Haylock et al. 2008: A European daily high-resolution gridded dataset of surface temperature and precipitation. J. Geophys. Res (Atmospheres), 113, D20119

  4. Quantifying population-level risks using an individual-based model: sea otters, Harlequin Ducks, and the Exxon Valdez oil spill.

    PubMed

    Harwell, Mark A; Gentile, John H; Parker, Keith R

    2012-07-01

    Ecological risk assessments need to advance beyond evaluating risks to individuals that are largely based on toxicity studies conducted on a few species under laboratory conditions, to assessing population-level risks to the environment, including considerations of variability and uncertainty. Two individual-based models (IBMs), recently developed to assess current risks to sea otters and seaducks in Prince William Sound more than 2 decades after the Exxon Valdez oil spill (EVOS), are used to explore population-level risks. In each case, the models had previously shown that there were essentially no remaining risks to individuals from polycyclic aromatic hydrocarbons (PAHs) derived from the EVOS. New sensitivity analyses are reported here in which hypothetical environmental exposures to PAHs were heuristically increased until assimilated doses reached toxicity reference values (TRVs) derived at the no-observed-adverse-effects and lowest-observed-adverse-effects levels (NOAEL and LOAEL, respectively). For the sea otters, this was accomplished by artificially increasing the number of sea otter pits that would intersect remaining patches of subsurface oil residues by orders of magnitude over actual estimated rates. Similarly, in the seaduck assessment, the PAH concentrations in the constituents of diet, sediments, and seawater were increased in proportion to their relative contributions to the assimilated doses by orders of magnitude over measured environmental concentrations, to reach the NOAEL and LOAEL thresholds. The stochastic IBMs simulated millions of individuals. From these outputs, frequency distributions were derived of assimilated doses for populations of 500,000 sea otters or seaducks in each of 7 or 8 classes, respectively. Doses to several selected quantiles were analyzed, ranging from the 1-in-1000th most-exposed individuals (99.9% quantile) to the median-exposed individuals (50% quantile). The resulting families of quantile curves provide the basis for characterizing the environmental thresholds below which no population-level effects could be detected and above which population-level effects would be expected to become manifest. This approach provides risk managers an enhanced understanding of the risks to populations under various conditions and assumptions, whether under hypothetically increased exposure regimes, as demonstrated here, or in situations in which actual exposures are near toxic effects levels. This study shows that individual-based models are especially amenable and appropriate for conducting population-level risk assessments, and that they can readily be used to answer questions about the risks to individuals and populations across a variety of exposure conditions. Copyright © 2012 SETAC.
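
    The quantile-summarization step of such an analysis can be sketched with numpy alone; the dose distribution and the TRV values below are synthetic placeholders, not the study's outputs.

```python
# Sketch: quantiles of assimilated dose from individual-based simulation output,
# compared against hypothetical TRVs (all values below are illustrative only).
import numpy as np

rng = np.random.default_rng(1)
doses = rng.lognormal(mean=0.0, sigma=1.2, size=500_000)  # synthetic population of doses

NOAEL, LOAEL = 10.0, 30.0   # hypothetical toxicity reference values
for q in (0.50, 0.90, 0.99, 0.999):
    d = np.quantile(doses, q)
    status = "exceeds LOAEL" if d > LOAEL else "exceeds NOAEL" if d > NOAEL else "below NOAEL"
    print(f"{q:6.1%} quantile dose = {d:7.2f}  ({status})")
```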

  5. The World Health Organization Fetal Growth Charts: A Multinational Longitudinal Study of Ultrasound Biometric Measurements and Estimated Fetal Weight.

    PubMed

    Kiserud, Torvid; Piaggio, Gilda; Carroli, Guillermo; Widmer, Mariana; Carvalho, José; Neerup Jensen, Lisa; Giordano, Daniel; Cecatti, José Guilherme; Abdel Aleem, Hany; Talegawkar, Sameera A; Benachi, Alexandra; Diemert, Anke; Tshefu Kitoto, Antoinette; Thinkhamrop, Jadsada; Lumbiganon, Pisake; Tabor, Ann; Kriplani, Alka; Gonzalez Perez, Rogelio; Hecher, Kurt; Hanson, Mark A; Gülmezoglu, A Metin; Platt, Lawrence D

    2017-01-01

    Perinatal mortality and morbidity continue to be major global health challenges strongly associated with prematurity and reduced fetal growth, an issue of further interest given the mounting evidence that fetal growth in general is linked to degrees of risk of common noncommunicable diseases in adulthood. Against this background, WHO made it a high priority to provide the present fetal growth charts for estimated fetal weight (EFW) and common ultrasound biometric measurements intended for worldwide use. We conducted a multinational prospective observational longitudinal study of fetal growth in low-risk singleton pregnancies of women of high or middle socioeconomic status and without known environmental constraints on fetal growth. Centers in ten countries (Argentina, Brazil, Democratic Republic of the Congo, Denmark, Egypt, France, Germany, India, Norway, and Thailand) recruited participants who had reliable information on last menstrual period and gestational age confirmed by crown-rump length measured at 8-13 wk of gestation. Participants had anthropometric and nutritional assessments and seven scheduled ultrasound examinations during pregnancy. Fifty-two participants withdrew consent, and 1,387 participated in the study. At study entry, median maternal age was 28 y (interquartile range [IQR] 25-31), median height was 162 cm (IQR 157-168), median weight was 61 kg (IQR 55-68), 58% of the women were nulliparous, and median daily caloric intake was 1,840 cal (IQR 1,487-2,222). The median pregnancy duration was 39 wk (IQR 38-40) although there were significant differences between countries, the largest difference being 12 d (95% CI 8-16). The median birthweight was 3,300 g (IQR 2,980-3,615). There were differences in birthweight between countries, e.g., India had significantly smaller neonates than the other countries, even after adjusting for gestational age. Thirty-one women had a miscarriage, and three fetuses had intrauterine death. The 8,203 sets of ultrasound measurements were scrutinized for outliers and leverage points, and those measurements taken at 14 to 40 wk were selected for analysis. A total of 7,924 sets of ultrasound measurements were analyzed by quantile regression to establish longitudinal reference intervals for fetal head circumference, biparietal diameter, humerus length, abdominal circumference, femur length and its ratio with head circumference and with biparietal diameter, and EFW. There was asymmetric distribution of growth of EFW: a slightly wider distribution among the lower percentiles during early weeks shifted to a notably expanded distribution of the higher percentiles in late pregnancy. Male fetuses were larger than female fetuses as measured by EFW, but the disparity was smaller in the lower quantiles of the distribution (3.5%) and larger in the upper quantiles (4.5%). Maternal age and maternal height were associated with a positive effect on EFW, particularly in the lower tail of the distribution, of the order of 2% to 3% for each additional 10 y of age of the mother and 1% to 2% for each additional 10 cm of height. Maternal weight was associated with a small positive effect on EFW, especially in the higher tail of the distribution, of the order of 1.0% to 1.5% for each additional 10 kg of bodyweight of the mother. Parous women had heavier fetuses than nulliparous women, with the disparity being greater in the lower quantiles of the distribution, of the order of 1% to 1.5%, and diminishing in the upper quantiles. 
There were also significant differences in growth of EFW between countries. In spite of the multinational nature of the study, sample size is a limiting factor for generalization of the charts. This study provides WHO fetal growth charts for EFW and common ultrasound biometric measurements, and shows variation between different parts of the world.

  6. The World Health Organization Fetal Growth Charts: A Multinational Longitudinal Study of Ultrasound Biometric Measurements and Estimated Fetal Weight

    PubMed Central

    Carroli, Guillermo; Widmer, Mariana; Neerup Jensen, Lisa; Giordano, Daniel; Abdel Aleem, Hany; Talegawkar, Sameera A.; Benachi, Alexandra; Diemert, Anke; Tshefu Kitoto, Antoinette; Thinkhamrop, Jadsada; Lumbiganon, Pisake; Tabor, Ann; Kriplani, Alka; Gonzalez Perez, Rogelio; Hecher, Kurt; Hanson, Mark A.; Gülmezoglu, A. Metin; Platt, Lawrence D.

    2017-01-01

    Background: Perinatal mortality and morbidity continue to be major global health challenges strongly associated with prematurity and reduced fetal growth, an issue of further interest given the mounting evidence that fetal growth in general is linked to degrees of risk of common noncommunicable diseases in adulthood. Against this background, WHO made it a high priority to provide the present fetal growth charts for estimated fetal weight (EFW) and common ultrasound biometric measurements intended for worldwide use. Methods and Findings: We conducted a multinational prospective observational longitudinal study of fetal growth in low-risk singleton pregnancies of women of high or middle socioeconomic status and without known environmental constraints on fetal growth. Centers in ten countries (Argentina, Brazil, Democratic Republic of the Congo, Denmark, Egypt, France, Germany, India, Norway, and Thailand) recruited participants who had reliable information on last menstrual period and gestational age confirmed by crown–rump length measured at 8–13 wk of gestation. Participants had anthropometric and nutritional assessments and seven scheduled ultrasound examinations during pregnancy. Fifty-two participants withdrew consent, and 1,387 participated in the study. At study entry, median maternal age was 28 y (interquartile range [IQR] 25–31), median height was 162 cm (IQR 157–168), median weight was 61 kg (IQR 55–68), 58% of the women were nulliparous, and median daily caloric intake was 1,840 cal (IQR 1,487–2,222). The median pregnancy duration was 39 wk (IQR 38–40) although there were significant differences between countries, the largest difference being 12 d (95% CI 8–16). The median birthweight was 3,300 g (IQR 2,980–3,615). There were differences in birthweight between countries, e.g., India had significantly smaller neonates than the other countries, even after adjusting for gestational age. Thirty-one women had a miscarriage, and three fetuses had intrauterine death. The 8,203 sets of ultrasound measurements were scrutinized for outliers and leverage points, and those measurements taken at 14 to 40 wk were selected for analysis. A total of 7,924 sets of ultrasound measurements were analyzed by quantile regression to establish longitudinal reference intervals for fetal head circumference, biparietal diameter, humerus length, abdominal circumference, femur length and its ratio with head circumference and with biparietal diameter, and EFW. There was asymmetric distribution of growth of EFW: a slightly wider distribution among the lower percentiles during early weeks shifted to a notably expanded distribution of the higher percentiles in late pregnancy. Male fetuses were larger than female fetuses as measured by EFW, but the disparity was smaller in the lower quantiles of the distribution (3.5%) and larger in the upper quantiles (4.5%). Maternal age and maternal height were associated with a positive effect on EFW, particularly in the lower tail of the distribution, of the order of 2% to 3% for each additional 10 y of age of the mother and 1% to 2% for each additional 10 cm of height. Maternal weight was associated with a small positive effect on EFW, especially in the higher tail of the distribution, of the order of 1.0% to 1.5% for each additional 10 kg of bodyweight of the mother. Parous women had heavier fetuses than nulliparous women, with the disparity being greater in the lower quantiles of the distribution, of the order of 1% to 1.5%, and diminishing in the upper quantiles. There were also significant differences in growth of EFW between countries. In spite of the multinational nature of the study, sample size is a limiting factor for generalization of the charts. Conclusions: This study provides WHO fetal growth charts for EFW and common ultrasound biometric measurements, and shows variation between different parts of the world. PMID:28118360

  7. Teaching for All? Teach For America’s Effects across the Distribution of Student Achievement

    PubMed Central

    Penner, Emily K.

    2016-01-01

    This paper examines the effect of Teach For America (TFA) on the distribution of student achievement in elementary school. It extends previous research by estimating quantile treatment effects (QTE) to examine how student achievement in TFA and non-TFA classrooms differs across the broader distribution of student achievement. It also updates prior distributional work on TFA by correcting for previously unidentified missing data and estimating unconditional, rather than conditional QTE. Consistent with previous findings, results reveal a positive impact of TFA teachers across the distribution of math achievement. In reading, however, relative to veteran non-TFA teachers, students at the bottom of the reading distribution score worse in TFA classrooms, and students in the upper half of the distribution perform better. PMID:27668032
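
    Without covariates or the paper's missing-data corrections, unconditional quantile treatment effects reduce to differences of group quantiles, as in this sketch on synthetic scores.

```python
# Sketch: unconditional quantile "treatment effects" as simple differences of
# group quantiles (no covariate adjustment, unlike the paper's estimator).
import numpy as np

rng = np.random.default_rng(2)
control = rng.normal(0.0, 1.0, 5_000)            # synthetic test scores
treated = rng.normal(0.05, 1.0, 5_000) - 0.15 * (rng.uniform(size=5_000) < 0.3)

for q in (0.1, 0.25, 0.5, 0.75, 0.9):
    qte = np.quantile(treated, q) - np.quantile(control, q)
    print(f"QTE at q={q:4.2f}: {qte:+.3f}")
```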

  8. Secure Learning and Learning for Security: Research in the Intersection

    DTIC Science & Technology

    2010-05-13

    researchers to consider how Machine Learning and Statistics might be leveraged for constructing intelligent attacks. In a similar vein, security... [figure residue: Q-Q plots ("Theoretical Quantiles" vs. "Sample Quantiles") of residuals in Flow 144, and a panel titled "Comparing Actual and Synthetic"]

  9. Bias correction of daily satellite precipitation data using genetic algorithm

    NASA Astrophysics Data System (ADS)

    Pratama, A. W.; Buono, A.; Hidayat, R.; Harsa, H.

    2018-05-01

    Climate Hazards Group InfraRed Precipitation with Stations (CHIRPS) is produced by blending the satellite-only Climate Hazards Group InfraRed Precipitation (CHIRP) with station observation data. The blending process is aimed at reducing the bias of CHIRP. However, the biases of CHIRPS in statistical moments and quantile values remain high during the wet season over Java Island. This paper presents a bias correction scheme to adjust the statistical moments of CHIRP using observed precipitation data. The scheme combines a genetic algorithm with a nonlinear power transformation, and the results were evaluated across different seasons and different elevation levels. The experimental results revealed that the scheme robustly reduces the bias in variance (around 100% reduction) and leads to reductions in the first- and second-quantile biases. However, the bias in the third quantile was only reduced during dry months. Across elevation levels, the performance of the bias correction process differs significantly only on the skewness indicator.
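
    A sketch of the power-transformation idea, with a brute-force grid search standing in for the paper's genetic algorithm and with synthetic "gauge" and "satellite" series; the objective matches the mean and standard deviation of the observations.

```python
# Sketch: nonlinear power-transformation bias correction y = a * x**b, with the
# (a, b) search done by a brute-force grid instead of the paper's genetic algorithm.
import numpy as np

rng = np.random.default_rng(3)
obs = rng.gamma(2.0, 5.0, 3_000)                           # synthetic gauge rainfall
sat = 0.6 * obs ** 1.2 + rng.normal(0, 1, 3_000).clip(0)   # biased "satellite" product

best = None
for a in np.linspace(0.5, 3.0, 51):
    for b in np.linspace(0.5, 1.5, 51):
        corr = a * sat ** b
        # objective: match the mean and standard deviation of the observations
        loss = (corr.mean() - obs.mean()) ** 2 + (corr.std() - obs.std()) ** 2
        if best is None or loss < best[0]:
            best = (loss, a, b)

_, a, b = best
print(f"a={a:.2f}, b={b:.2f}; corrected var={np.var(a * sat ** b):.1f}, obs var={np.var(obs):.1f}")
```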

  10. Distributional Analysis in Educational Evaluation: A Case Study from the New York City Voucher Program

    PubMed Central

    Domina, Thurston; Penner, Emily; Hoynes, Hilary

    2014-01-01

    We use quantile treatment effects estimation to examine the consequences of the random-assignment New York City School Choice Scholarship Program (NYCSCSP) across the distribution of student achievement. Our analyses suggest that the program had negligible and statistically insignificant effects across the skill distribution. In addition to contributing to the literature on school choice, the paper illustrates several ways in which distributional effects estimation can enrich educational research: First, we demonstrate that moving beyond a focus on mean effects estimation makes it possible to generate and test new hypotheses about the heterogeneity of educational treatment effects that speak to the justification for many interventions. Second, we demonstrate that distributional effects can uncover issues even with well-studied datasets by forcing analysts to view their data in new ways. Finally, such estimates highlight where in the overall national achievement distribution test scores of children exposed to particular interventions lie; this is important for exploring the external validity of the intervention’s effects. PMID:26207158

  11. Non-Susceptible Landslide Areas in Italy and in the Mediterranean Region

    NASA Astrophysics Data System (ADS)

    Alvioli, Massimiliano; Ardizzone, Francesca; Guzzetti, Fausto; Marchesini, Ivan; Rossi, Mauro

    2014-05-01

    Landslide susceptibility is the likelihood of a landslide occurring in a given area. Over the past three decades, researchers and planning and environmental organisations have worked to assess landslide susceptibility at different geographical scales, and to produce maps portraying landslide susceptibility zonation. Little effort was made to determine where landslides are not expected, where susceptibility is null or negligible. This is surprising because planners and decision makers are also interested in knowing where landslides are not foreseen, or cannot occur in an area. We propose a method for the definition of non-susceptible landslide areas at the synoptic scale. We applied the method in Italy and to the territory surrounding the Mediterranean Sea, and we produced two synoptic-scale maps showing areas where landslides are not expected in Italy and in the Mediterranean area. To construct the method we used digital terrain elevation and landslide information. The digital terrain consisted of the 3-arc-second SRTM DEM, and the landslide information was obtained for 13 areas in Italy where landslide inventory maps were available to us. We tested three different models to determine the non-susceptible landslide areas, including a linear model (LR), a quantile linear model (QLR), and a quantile non-linear model (QNL). Model performances were evaluated using independent landslide information represented by the Italian Landslide Inventory (Inventario Fenomeni Franosi in Italia - IFFI). Best results were obtained using the QNL model. The corresponding zonation of non-susceptible landslide areas was intersected in a GIS with geographical census data for Italy. The results show that 57.5% of the population of Italy (in 2001) was located in areas where landslide susceptibility was expected to be null or negligible, while the remaining 42.5% was in areas where landslide susceptibility was significant or not negligible. We applied the QNL model to the landmasses surrounding the Mediterranean Sea, and we tested the synoptic non-susceptibility zonation using independent landslide information for three study areas in Spain. Results proved that the QNL model was capable of determining where landslide susceptibility is expected to be negligible in the Mediterranean area. We expect our results to be applicable in similar study areas, facilitating the identification of non-susceptible and susceptible landslide areas at the synoptic scale.

  12. Modelling average maximum daily temperature using r largest order statistics: An application to South African data

    PubMed Central

    2018-01-01

    Natural hazards (events that may cause actual disasters) are established in the literature as major causes of various massive and destructive problems worldwide. The occurrences of earthquakes, floods and heat waves affect millions of people through several impacts. These include cases of hospitalisation, loss of lives and economic challenges. The focus of this study was on the risk reduction of the disasters that occur because of extremely high temperatures and heat waves. Modelling average maximum daily temperature (AMDT) guards against the disaster risk and may also help countries towards preparing for extreme heat. This study discusses the use of the r largest order statistics approach of extreme value theory towards modelling AMDT over the period of 11 years, that is, 2000–2010. A generalised extreme value distribution for r largest order statistics is fitted to the annual maxima. This is performed in an effort to study the behaviour of the r largest order statistics. The method of maximum likelihood is used in estimating the target parameters and the frequency of occurrences of the hottest days is assessed. The study presents a case study of South Africa in which the data for the non-winter season (September–April of each year) are used. The meteorological data used are the AMDT that are collected by the South African Weather Service and provided by Eskom. The estimation of the shape parameter reveals evidence of a Weibull class as an appropriate distribution for modelling AMDT in South Africa. The extreme quantiles for specified return periods are estimated using the quantile function and the best model is chosen through the use of the deviance statistic with the support of the graphical diagnostic tools. The Entropy Difference Test (EDT) is used as a specification test for diagnosing the fit of the models to the data.
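
    scipy fits the ordinary block-maxima GEV (the r = 1 case only; the r-largest likelihood used in the paper would need a custom implementation), but the return-level quantile calculation is the same. The annual maxima below are synthetic.

```python
# Sketch: block-maxima GEV fit and return-level quantiles with scipy
# (r = 1 only; not the paper's r-largest likelihood). Data are synthetic.
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(4)
annual_max = genextreme.rvs(c=0.2, loc=32.0, scale=2.0, size=60, random_state=rng)

c, loc, scale = genextreme.fit(annual_max)   # scipy's c > 0 <=> bounded (Weibull-type) tail
print(f"fitted shape c = {c:+.2f}")
for T in (10, 50, 100):                      # return periods in years
    level = genextreme.ppf(1 - 1 / T, c, loc=loc, scale=scale)
    print(f"{T:3d}-yr return level: {level:.1f}")
```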

  13. Regional Frequency Analysis of Annual Maximum Streamflow in Gipuzkoa (Spain)

    NASA Astrophysics Data System (ADS)

    Erro, J.; López, J. J.

    2012-04-01

    Extreme streamflow events have been an important cause of recent flooding in Gipuzkoa, and any change in the magnitude of such events may have severe impacts upon urban structures such as dams, urban drainage systems and flood defences, and cause failures to occur. So a regional frequency analysis of annual maximum streamflow was developed for Gipuzkoa, using the well-known L-moments approach together with the index-flood procedure, and following the four steps that characterize it: initial screening of the data, identification of homogeneous regions, choice of the appropriate frequency distribution and estimation of quantiles for different return periods. The preliminary study, completed in 2009, was based on the observations recorded at 22 stations distributed throughout the area. A primary filtering of the data revealed the absence of jumps, inconsistencies and changes in trends within the series, and the discordancy measures showed that none of the sites used in the analysis had to be considered discordant with the others. Regionalization was performed by cluster analysis, grouping the stations according to eight physical site characteristics: latitude, longitude, drainage basin area, elevation, main channel length of the basin, slope, annual mean rainfall and annual maximum rainfall. It resulted in two groups - one cluster with the 18 sites of small-medium basin area, and a second cluster with the 4 remaining sites of major basin area - in which the homogeneity criteria were tested and satisfied. However, the short length of the series, together with the introduction of the observations of 2010 and the inclusion of a historic extreme streamflow event that occurred in northern Spain in November 2011, completely changed the results. With this consideration and adjustment, all Gipuzkoa could be treated as a homogeneous region. The goodness-of-fit measures indicated that the Generalized Logistic (GLO) is the only suitable distribution to characterize Gipuzkoa. Using the regional L-moment algorithm, quantiles associated with return periods of interest were estimated, and Monte Carlo simulation was used to compute RMSE, bias and error bounds for the estimates.
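
    The building block of the L-moments approach is the set of sample L-moment ratios; below is a direct implementation of the standard unbiased estimators, applied to synthetic annual maxima.

```python
# Sketch: sample L-moments (mean, L-CV, L-skewness, L-kurtosis) as used in
# Hosking-Wallis regional frequency analysis; direct unbiased estimators.
import numpy as np

def sample_lmoments(x):
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    i = np.arange(1, n + 1)
    b0 = x.mean()
    b1 = np.sum((i - 1) / (n - 1) * x) / n
    b2 = np.sum((i - 1) * (i - 2) / ((n - 1) * (n - 2)) * x) / n
    b3 = np.sum((i - 1) * (i - 2) * (i - 3) / ((n - 1) * (n - 2) * (n - 3)) * x) / n
    l1, l2 = b0, 2 * b1 - b0
    l3, l4 = 6 * b2 - 6 * b1 + b0, 20 * b3 - 30 * b2 + 12 * b1 - b0
    return l1, l2 / l1, l3 / l2, l4 / l2   # mean, L-CV, t3, t4

rng = np.random.default_rng(5)
amax = rng.gumbel(loc=100.0, scale=30.0, size=40)   # synthetic annual maxima
l1, lcv, t3, t4 = sample_lmoments(amax)
print(f"l1={l1:.1f}  L-CV={lcv:.3f}  t3={t3:.3f}  t4={t4:.3f}")
```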

  14. Case–control and prospective studies of dietary α-linolenic acid intake and prostate cancer risk: a meta-analysis

    PubMed Central

    Carleton, Amanda J; Sievenpiper, John L; de Souza, Russell; McKeown-Eyssen, Gail; Jenkins, David J A

    2013-01-01

    Objective: α-Linolenic acid (ALA) is considered to be a cardioprotective nutrient; however, some epidemiological studies have suggested that dietary ALA intake increases the risk of prostate cancer. The main objective was to conduct a systematic review and meta-analysis of case–control and prospective studies investigating the association between dietary ALA intake and prostate cancer risk. Design: A systematic review and meta-analysis were conducted by searching MEDLINE and EMBASE for relevant prospective and case–control studies. Included studies: We included all prospective cohort, case–control, nested case-cohort and nested case–control studies that investigated the effect of dietary ALA intake on the incidence (or diagnosis) of prostate cancer and provided relative risk (RR), HR or OR estimates. Primary outcome measure: Data were pooled using the generic inverse variance method with a random effects model from studies that compared the highest ALA quantile with the lowest ALA quantile. Risk estimates were expressed as RR with 95% CIs. Heterogeneity was assessed by χ2 and quantified by I2. Results: Data from five prospective and seven case–control studies were pooled. The overall RR estimate showed ALA intake to be positively but non-significantly associated with prostate cancer risk (1.08 (0.90 to 1.29), p=0.40; I2=85%), but the interpretation was complicated by evidence of heterogeneity not explained by study design. A weak, non-significant protective effect of ALA intake on prostate cancer risk in the prospective studies became significant (0.91 (0.83 to 0.99), p=0.02) without evidence of heterogeneity (I2=8%, p=0.35) on removal of one study during sensitivity analyses. Conclusions: This analysis failed to confirm an association between dietary ALA intake and prostate cancer risk. Larger and longer observational and interventional studies are needed to define the role of ALA and prostate cancer. PMID:23674441
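
    A sketch of generic inverse-variance pooling with DerSimonian-Laird random effects, on made-up highest-versus-lowest-quantile log risk ratios (not the studies' data).

```python
# Sketch: generic inverse-variance pooling, DerSimonian-Laird random effects.
# The study estimates and standard errors below are illustrative only.
import numpy as np

logrr = np.log([1.30, 0.95, 1.10, 0.85, 1.45])    # per-study log RRs
se = np.array([0.15, 0.10, 0.20, 0.12, 0.25])     # per-study standard errors

w = 1 / se**2                                     # fixed-effect weights
fixed = np.sum(w * logrr) / np.sum(w)
Q = np.sum(w * (logrr - fixed) ** 2)              # Cochran's Q
k = len(logrr)
tau2 = max(0.0, (Q - (k - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))

wr = 1 / (se**2 + tau2)                           # random-effects weights
pooled = np.sum(wr * logrr) / np.sum(wr)
se_pooled = np.sqrt(1 / np.sum(wr))
lo, hi = np.exp(pooled + np.array([-1.96, 1.96]) * se_pooled)
I2 = max(0.0, (Q - (k - 1)) / Q) * 100
print(f"RR = {np.exp(pooled):.2f} (95% CI {lo:.2f} to {hi:.2f}), I^2 = {I2:.0f}%")
```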

  15. Geostatistical Interpolation of Particle-Size Curves in Heterogeneous Aquifers

    NASA Astrophysics Data System (ADS)

    Guadagnini, A.; Menafoglio, A.; Secchi, P.

    2013-12-01

    We address the problem of predicting the spatial field of particle-size curves (PSCs) from measurements associated with soil samples collected at a discrete set of locations within an aquifer system. Proper estimates of the full PSC are relevant to applications related to groundwater hydrology, soil science and geochemistry and aimed at modeling physical and chemical processes occurring in heterogeneous earth systems. Hence, we focus on providing kriging estimates of the entire PSC at unsampled locations. To this end, we treat particle-size curves as cumulative distribution functions, model their densities as functional compositional data and analyze them by embedding these into the Hilbert space of compositional functions endowed with the Aitchison geometry. On this basis, we develop a new geostatistical methodology for the analysis of spatially dependent functional compositional data. Our functional compositional kriging (FCK) approach allows predicting the entire particle-size curve at unsampled locations, together with a quantification of the associated uncertainty, by fully exploiting both the functional form of the data and their compositional nature. This is a key advantage of our approach with respect to traditional methodologies, which treat only a set of selected features (e.g., quantiles) of PSCs. Embedding the full PSC into a geostatistical analysis enables one to provide a complete characterization of the spatial distribution of lithotypes in a reservoir, eventually leading to improved predictions of soil hydraulic attributes through pedotransfer functions as well as of soil geochemical parameters which are relevant in sorption/desorption and cation exchange processes. We test our new method on PSCs sampled along a borehole located within an alluvial aquifer near the city of Tuebingen, Germany. The quality of FCK predictions is assessed through leave-one-out cross-validation. A comparison between hydraulic conductivity estimates obtained via the FCK approach and those predicted by classical kriging of effective particle diameters (i.e., quantiles of the PSCs) is finally performed.

  16. Merging Multi-model CMIP5/PMIP3 Past-1000 Ensemble Simulations with Tree Ring Proxy Data by Optimal Interpolation Approach

    NASA Astrophysics Data System (ADS)

    Chen, Xin; Luo, Yong; Xing, Pei; Nie, Suping; Tian, Qinhua

    2015-04-01

    Two sets of gridded annual mean surface air temperature over the Northern Hemisphere for the past millennium were constructed by employing the optimal interpolation (OI) method to merge tree-ring proxy records with simulations from CMIP5 (the fifth phase of the Climate Model Intercomparison Project). Both the uncertainties in the proxy reconstructions and those in the model simulations can be taken into account in the OI algorithm. To better preserve physically coordinated features and the spatial-temporal completeness of climate variability in the 7 sets of model results, we performed an Empirical Orthogonal Function (EOF) analysis to truncate the ensemble mean field as the first guess (background field) for OI. 681 temperature-sensitive tree-ring chronologies were collected and screened from the International Tree Ring Data Bank (ITRDB) and the Past Global Changes (PAGES-2k) project. First, two methods (variance matching and linear regression) were employed to calibrate the tree-ring chronologies individually against instrumental data (CRUTEM4v). In addition, we removed the bias of both the background field and the proxy records relative to the instrumental dataset. Second, the time-varying background error covariance matrix (B) and the static "observation" error covariance matrix (R) were calculated for the OI frame. In our scheme, matrix B is calculated locally, and "observation" error covariances are partially considered in the R matrix (covariances between pairs of tree-ring sites that are very close to each other are counted), which differs from the traditional assumption that the R matrix should be diagonal. Comparing our results, it turns out that the regional averaged series are not sensitive to the choice of calibration method. Quantile-quantile plots indicate that the regional climatologies based on both methods agree better with the regional PAGES-2k reconstruction in the 20th-century warming period than in the Little Ice Age (LIA). A larger volcanic cooling response over Asia and Europe in the context of the recent millennium is detected in our datasets than is revealed in the regional reconstructions from the PAGES-2k network. Verification experiments showed that the merging approach reconciles the proxy data and the model ensemble simulations in an optimal way (with smaller errors than either of them). Further research is needed to improve the error estimation.
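
    The OI analysis step itself is compact; below is a toy one-dimensional sketch with a synthetic background, covariances and "proxy" observations. The paper's locally computed B and partially non-diagonal R are not reproduced here.

```python
# Sketch: one optimal-interpolation (OI) analysis step on a toy 1-D grid,
# x_a = x_b + K (y - H x_b) with K = B H^T (H B H^T + R)^(-1). All values synthetic.
import numpy as np

n, p = 50, 5                                   # grid points, "proxy" sites
rng = np.random.default_rng(10)
xb = np.zeros(n)                               # background (e.g., ensemble-mean anomaly)
truth = np.sin(np.linspace(0, np.pi, n))       # synthetic true field
H = np.zeros((p, n))                           # observation operator: point sampling
H[np.arange(p), rng.choice(n, p, replace=False)] = 1.0
y = H @ truth + rng.normal(0, 0.1, p)          # noisy "proxy" observations

dist = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
B = 0.5 * np.exp(-dist / 10.0)                 # background error covariance
R = 0.1**2 * np.eye(p)                         # diagonal observation errors (simplified)

K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)   # gain matrix
xa = xb + K @ (y - H @ xb)                     # analysis
rmse = lambda x: np.sqrt(np.mean((x - truth) ** 2))
print(f"RMSE background = {rmse(xb):.3f}, analysis = {rmse(xa):.3f}")
```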

  17. Historical floods in flood frequency analysis: Is this game worth the candle?

    NASA Astrophysics Data System (ADS)

    Strupczewski, Witold G.; Kochanek, Krzysztof; Bogdanowicz, Ewa

    2017-11-01

    In flood frequency analysis (FFA) the profit from inclusion of historical information on the largest pre-instrumental floods depends primarily on the reliability of the information, i.e. the accuracy of the magnitude and return period of the floods. This study is focused on the possible theoretical maximum gain in accuracy of estimates of upper quantiles that can be obtained by incorporating the largest historical floods of known return periods into the FFA. We assumed a simple case: N years of systematic records of annual maximum flows and either the one largest (XM1) or the two largest (XM1 and XM2) flood peak flows in a historical M-year long period. The problem is explored by Monte Carlo simulations with the maximum likelihood (ML) method. Both correct and false distributional assumptions are considered. In the first case the two-parameter extreme value models (Gumbel, log-Gumbel, Weibull) with various coefficients of variation serve as parent distributions. In the case of an unknown parent distribution, the Weibull distribution was assumed as the estimating model and the truncated Gumbel as the parent distribution. The return periods of XM1 and XM2 are determined from the parent distribution. The results are then compared with the case when the return periods of XM1 and XM2 are defined by their plotting positions. The results are presented in terms of bias, root mean square error and the probability of overestimation of the quantile with a 100-year return period. The results of the research indicate that the maximal profit of including pre-instrumental floods in the FFA may prove smaller than the cost of reconstructing the historical hydrological information.

  18. The impact of the 2008 financial crisis on food security and food expenditures in Mexico: a disproportionate effect on the vulnerable.

    PubMed

    Vilar-Compte, Mireya; Sandoval-Olascoaga, Sebastian; Bernal-Stuart, Ana; Shimoga, Sandhya; Vargas-Bustamante, Arturo

    2015-11-01

    The present paper investigated the impact of the 2008 financial crisis on food security in Mexico and how it disproportionately affected vulnerable households. A generalized ordered logistic regression was estimated to assess the impact of the crisis on households' food security status. An ordinary least squares and a quantile regression were estimated to evaluate the effect of the financial crisis on a continuous proxy measure of food security, defined as the share of a household's current income devoted to food expenditures. Both analyses were performed using pooled cross-sectional data from the Mexican National Household Income and Expenditure Survey 2008 and 2010. The analytical sample included 29,468 households in 2008 and 27,654 in 2010. The generalized ordered logistic model showed that the financial crisis significantly (P<0·05) decreased the probability of being food secure, mildly or moderately food insecure, compared with being severely food insecure (OR=0·74). A similar but smaller effect was found when comparing severely and moderately food-insecure households with mildly food-insecure and food-secure households (OR=0·81). The ordinary least squares model showed that the crisis significantly (P<0·05) increased the share of total income spent on food (β coefficient of 0·02). The quantile regression confirmed the findings suggested by the generalized ordered logistic model, showing that the effects of the crisis were more profound among poorer households. The results suggest that households that were more vulnerable before the financial crisis experienced a greater worsening of food insecurity with the crisis. Findings were consistent across both measures of food security: one based on self-reported experience and the other based on food spending.

  19. Impact of Community-Based HIV/AIDS Treatment on Household Incomes in Uganda

    PubMed Central

    Feulefack, Joseph F.; Luckert, Martin K.; Mohapatra, Sandeep; Cash, Sean B.; Alibhai, Arif; Kipp, Walter

    2013-01-01

    Though health benefits to households in developing countries from antiretroviral treatment (ART) programs are widely reported in the literature, specific estimates regarding impacts of treatments on household incomes are rare. This type of information is important to governments and donors, as it is an indication of returns to their ART investments, and to better understand the role of HIV/AIDS in development. The objective of this study is to estimate the impact of a community-based ART program on household incomes in a previously underserved rural region of Uganda. A community-based ART program, based largely on labor contributions from community volunteers, was implemented and evaluated. All households with HIV/AIDS patients enrolled in the treatment programme (n = 134 households) were surveyed five times; once at the beginning of the treatment and every three months thereafter for a period of one year. Data were collected on household income from cash earnings and value of own production. The analysis, using ordinary least squares and quantile regressions, identifies the impact of the ART program on household incomes over the first year of the treatment, while controlling for heterogeneity in household characteristics and temporal changes. As a result of the treatment, health conditions of virtually all patients improved, and household incomes increased by approximately 30% to 40%, regardless of household income quantile. These increases in income, however, varied significantly depending on socio-demographic and socio-economic control variables. Overall, results show large and significant impacts of the ART program on household incomes, suggesting large returns to public investments in ART, and that treating HIV/AIDS is an important precondition for development. Moreover, development programs that invest in human capital and build wealth are important complements that can increase the returns to ART programs. PMID:23840347

  20. Realistic sampling of anisotropic correlogram parameters for conditional simulation of daily rainfields

    NASA Astrophysics Data System (ADS)

    Gyasi-Agyei, Yeboah

    2018-01-01

    This paper establishes a link between the spatial structure of radar rainfall, which provides the more robust description of spatial structure, and gauge rainfall, for improved daily rainfield simulation conditioned on limited gauge data in regions with or without radar records. A two-dimensional anisotropic exponential function, with parameters of major and minor axis lengths and direction, is used to describe the correlogram (spatial structure) of daily rainfall in the Gaussian domain. The link is a copula-based joint distribution of the radar-derived correlogram parameters that uses the gauge-derived correlogram parameters and maximum daily temperature as covariates of the Box-Cox power exponential margins and Gumbel copula. While the gauge-derived, radar-derived and copula-derived correlogram parameters reproduced the mean estimates similarly under leave-one-out cross-validation of ordinary kriging, the gauge-derived parameters yielded a higher standard deviation (SD) of the Gaussian quantile, which reflects uncertainty, in over 90% of cases. However, the distributions of the SD generated by the radar-derived and the copula-derived parameters could not be distinguished. For the validation case, the percentage of cases with higher SD from the gauge-derived parameter sets decreased to 81.2% and 86.6% for the non-calibration and the calibration periods, respectively. It has been observed that a 1% reduction in the Gaussian quantile SD can cause over a 39% reduction in the SD of the median rainfall estimate, the actual reduction being dependent on the distribution of rainfall of the day. Hence the main advantage of using the most correct radar correlogram parameters is to reduce the uncertainty associated with conditional simulations that rely on SD through kriging.

  1. Customized Fetal Growth Charts for Parents' Characteristics, Race, and Parity by Quantile Regression Analysis: A Cross-sectional Multicenter Italian Study.

    PubMed

    Ghi, Tullio; Cariello, Luisa; Rizzo, Ludovica; Ferrazzi, Enrico; Periti, Enrico; Prefumo, Federico; Stampalija, Tamara; Viora, Elsa; Verrotti, Carla; Rizzo, Giuseppe

    2016-01-01

    The purpose of this study was to construct fetal biometric charts between 16 and 40 weeks' gestation that were customized for parental characteristics, race, and parity, using quantile regression analysis. In a multicenter cross-sectional study, 8070 sonographic examinations from low-risk pregnancies between 16 and 40 weeks' gestation were analyzed. The fetal measurements obtained were biparietal diameter, head circumference, abdominal circumference, and femur diaphysis length. Quantile regression was used to examine the impact of parental height and weight, parity, and race across biometric percentiles for the fetal measurements considered. Paternal and maternal height were significant covariates for all of the measurements considered (P < .05). Maternal weight significantly influenced head circumference, abdominal circumference, and femur diaphysis length. Parity was significantly associated with biparietal diameter and head circumference. Central African race was associated with head circumference and femur diaphysis length, whereas North African race was only associated with femur diaphysis length. In this study we constructed customized biometric growth charts using quantile regression in a large cohort of low-risk pregnancies. These charts offer the advantage of defining individualized normal ranges of fetal biometric parameters at each specific percentile corrected for parental height and weight, parity, and race. This study supports the importance of including these variables in routine sonographic screening for fetal growth abnormalities.

  2. Machine learning approaches for estimation of prediction interval for the model output.

    PubMed

    Shrestha, Durga L; Solomatine, Dimitri P

    2006-03-01

    A novel method for estimating prediction uncertainty using machine learning techniques is presented. Uncertainty is expressed in the form of the two quantiles (constituting the prediction interval) of the underlying distribution of prediction errors. The idea is to partition the input space into different zones or clusters having similar model errors using fuzzy c-means clustering. The prediction interval is constructed for each cluster on the basis of the empirical distribution of the errors associated with all instances belonging to that cluster, and propagated from each cluster to the individual examples according to their membership grades in each cluster. Then a regression model is built for in-sample data using the computed prediction limits as targets, and finally, this model is applied to estimate the prediction intervals (limits) for out-of-sample data. The method was tested on artificial and real hydrologic data sets using various machine learning techniques. Preliminary results show that the method is superior to other methods for estimating the prediction interval. A new method for evaluating the performance of prediction interval estimation is proposed as well.
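
    A simplified sketch of the clustering-plus-empirical-quantiles idea, with hard k-means (scikit-learn assumed available) standing in for fuzzy c-means; the membership-grade propagation and downstream regression steps are omitted.

```python
# Sketch: cluster-wise empirical prediction intervals from model errors.
# Hard k-means stands in for the paper's fuzzy c-means (no membership grades).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(6)
X = rng.uniform(0, 10, size=(2_000, 1))              # model input
errors = rng.normal(0, 0.2 + 0.1 * X[:, 0], 2_000)   # heteroscedastic residuals

km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
for c in range(4):
    e = errors[km.labels_ == c]
    lo, hi = np.quantile(e, [0.05, 0.95])            # 90% prediction interval
    print(f"cluster {c}: 90% PI for the error = [{lo:+.2f}, {hi:+.2f}]")
```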

  3. An evaluation of two-channel ChIP-on-chip and DNA methylation microarray normalization strategies

    PubMed Central

    2012-01-01

    Background: The combination of chromatin immunoprecipitation with two-channel microarray technology enables genome-wide mapping of binding sites of DNA-interacting proteins (ChIP-on-chip) or sites with methylated CpG di-nucleotides (DNA methylation microarray). These powerful tools are the gateway to understanding gene transcription regulation. Since the goals of such studies, the sample preparation procedures, the microarray content and the study design all differ from those of transcriptomics microarrays, the data pre-processing strategies traditionally applied to transcriptomics microarrays may not be appropriate. In particular, the main challenge of the normalization of "regulation microarrays" is (i) to make the data of individual microarrays quantitatively comparable and (ii) to keep the signals of the enriched probes, representing DNA sequences from the precipitate, as distinguishable as possible from the signals of the un-enriched probes, representing DNA sequences largely absent from the precipitate. Results: We compare several widely used normalization approaches (VSN, LOWESS, quantile, T-quantile, Tukey's biweight scaling, Peng's method) applied to a selection of regulation microarray datasets, ranging from DNA methylation to transcription factor binding and histone modification studies. Through comparison of the data distributions of control probes and gene promoter probes before and after normalization, and assessment of the power to identify known enriched genomic regions after normalization, we demonstrate that there are clear differences in performance between normalization procedures. Conclusion: T-quantile normalization applied separately on the channels and Tukey's biweight scaling outperform other methods in terms of the conservation of enriched and un-enriched signal separation, as well as in identification of genomic regions known to be enriched. T-quantile normalization is preferable as it additionally improves comparability between microarrays. In contrast, popular normalization approaches like quantile, LOWESS, Peng's method and VSN normalization alter the data distributions of regulation microarrays to such an extent that using these approaches will impact the reliability of the downstream analysis substantially. PMID:22276688
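
    Plain quantile normalization, one of the compared methods, is easy to state in code: each array is mapped onto the mean quantile profile. The T-quantile and scaling variants discussed above differ in which probe subsets and channels the mapping is applied to; the matrix below is synthetic.

```python
# Sketch: standard quantile normalization of a probes x arrays matrix
# (each array's sorted values are replaced by the mean sorted profile).
import numpy as np

def quantile_normalize(M):
    order = np.argsort(M, axis=0)              # per-array ranks
    ref = np.sort(M, axis=0).mean(axis=1)      # mean quantile profile
    out = np.empty_like(M, dtype=float)
    for j in range(M.shape[1]):
        out[order[:, j], j] = ref              # write profile back in rank order
    return out

rng = np.random.default_rng(7)
M = rng.lognormal(size=(1_000, 4)) * np.array([1.0, 1.5, 0.7, 2.0])  # array-wise scale bias
N = quantile_normalize(M)
print(np.quantile(N, [0.25, 0.5, 0.75], axis=0))   # identical quartiles for every array
```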

  4. Quantile-Specific Penetrance of Genes Affecting Lipoproteins, Adiposity and Height

    PubMed Central

    Williams, Paul T.

    2012-01-01

    Quantile-dependent penetrance is proposed to occur when the phenotypic expression of a SNP depends upon the population percentile of the phenotype. To illustrate the phenomenon, quantiles of height, body mass index (BMI), and plasma lipids and lipoproteins were compared to genetic risk scores (GRS) derived from single nucleotide polymorphisms (SNPs) having established genome-wide significance: 180 SNPs for height, 32 for BMI, 37 for low-density lipoprotein (LDL)-cholesterol, 47 for high-density lipoprotein (HDL)-cholesterol, 52 for total cholesterol, and 31 for triglycerides in 1930 subjects. Both phenotypes and GRSs were adjusted for sex, age, study, and smoking status. Quantile regression showed that the slope of the genotype-phenotype relationships increased with the percentile of BMI (P = 0.002), LDL-cholesterol (P = 3×10−8), HDL-cholesterol (P = 5×10−6), total cholesterol (P = 2.5×10−6), and triglyceride distribution (P = 7.5×10−6), but not height (P = 0.09). Compared to a GRS's phenotypic effect at the 10th population percentile, its effect at the 90th percentile was 4.2-fold greater for BMI, 4.9-fold greater for LDL-cholesterol, 1.9-fold greater for HDL-cholesterol, 3.1-fold greater for total cholesterol, and 3.3-fold greater for triglycerides. Moreover, the effect of the rs1558902 (FTO) risk allele was 6.7-fold greater at the 90th than the 10th percentile of the BMI distribution, and that of the rs3764261 (CETP) risk allele was 2.4-fold greater at the 90th than the 10th percentile of the HDL-cholesterol distribution. Conceptually, it may be useful to distinguish environmental effects on the phenotype that in turn alter a gene's phenotypic expression (quantile-dependent penetrance) from environmental effects affecting the gene's phenotypic expression directly (gene-environment interaction). PMID:22235250

  5. Applying quantile regression for modeling equivalent property damage only crashes to identify accident blackspots.

    PubMed

    Washington, Simon; Haque, Md Mazharul; Oh, Jutaek; Lee, Dongmin

    2014-05-01

    Hot spot identification (HSID) aims to identify potential sites (roadway segments, intersections, crosswalks, interchanges, ramps, etc.) with disproportionately high crash risk relative to similar sites. An inefficient HSID methodology might result in either identifying a safe site as high risk (false positive) or a high-risk site as safe (false negative), and consequently lead to the misuse of available public funds, to poor investment decisions, and to inefficient risk management practice. Current HSID methods suffer from issues like underreporting of minor-injury and property damage only (PDO) crashes, challenges in accounting for crash severity within the methodology, and selection of a proper safety performance function to model crash data that is often heavily skewed by a preponderance of zeros. Addressing these challenges, this paper proposes a combination of a PDO equivalency calculation and a quantile regression technique to identify hot spots in a transportation network. In particular, issues related to underreporting and crash severity are tackled by incorporating equivalent PDO crashes, whilst the concerns related to the non-count nature of equivalent PDO crashes and the skewness of crash data are addressed by the non-parametric quantile regression technique. The proposed method identifies covariate effects on various quantiles of a population, rather than the population mean like most methods in practice, which more closely corresponds with how hot spots are identified in practice. The proposed methodology is illustrated using rural road segment data from Korea and compared against the traditional EB method with negative binomial regression. Application of a quantile regression model to equivalent PDO crashes enables identification of a set of high-risk sites that reflect the true safety costs to society, simultaneously reduces the influence of under-reported PDO and minor-injury crashes, and overcomes the limitation of the traditional NB model in dealing with the preponderance-of-zeros problem and right-skewed datasets. Copyright © 2014 Elsevier Ltd. All rights reserved.

  6. Multivariate quantile mapping bias correction: an N-dimensional probability density function transform for climate model simulations of multiple variables

    NASA Astrophysics Data System (ADS)

    Cannon, Alex J.

    2018-01-01

    Most bias correction algorithms used in climatology, for example quantile mapping, are applied to univariate time series. They neglect the dependence between different variables. Those that are multivariate often correct only limited measures of joint dependence, such as Pearson or Spearman rank correlation. Here, an image processing technique designed to transfer colour information from one image to another—the N-dimensional probability density function transform—is adapted for use as a multivariate bias correction algorithm (MBCn) for climate model projections/predictions of multiple climate variables. MBCn is a multivariate generalization of quantile mapping that transfers all aspects of an observed continuous multivariate distribution to the corresponding multivariate distribution of variables from a climate model. When applied to climate model projections, changes in quantiles of each variable between the historical and projection period are also preserved. The MBCn algorithm is demonstrated on three case studies. First, the method is applied to an image processing example with characteristics that mimic a climate projection problem. Second, MBCn is used to correct a suite of 3-hourly surface meteorological variables from the Canadian Centre for Climate Modelling and Analysis Regional Climate Model (CanRCM4) across a North American domain. Components of the Canadian Forest Fire Weather Index (FWI) System, a complicated set of multivariate indices that characterizes the risk of wildfire, are then calculated and verified against observed values. Third, MBCn is used to correct biases in the spatial dependence structure of CanRCM4 precipitation fields. Results are compared against a univariate quantile mapping algorithm, which neglects the dependence between variables, and two multivariate bias correction algorithms, each of which corrects a different form of inter-variable correlation structure. MBCn outperforms these alternatives, often by a large margin, particularly for annual maxima of the FWI distribution and spatiotemporal autocorrelation of precipitation fields.
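
    The univariate building block that MBCn generalizes is empirical quantile mapping; below is a sketch on synthetic gamma-distributed "observed" and "model" samples.

```python
# Sketch: empirical quantile mapping of one model variable onto observations,
# the univariate building block that MBCn applies repeatedly under rotations.
import numpy as np

rng = np.random.default_rng(8)
obs = rng.gamma(2.0, 3.0, 5_000)       # observed historical values
mod = rng.gamma(2.5, 2.0, 5_000)       # biased model values, same period

def quantile_map(x, mod_hist, obs_hist):
    # push each value through the model's empirical CDF, then invert the observed CDF
    q = np.searchsorted(np.sort(mod_hist), x) / len(mod_hist)
    return np.quantile(obs_hist, q.clip(0, 1))

corrected = quantile_map(mod, mod, obs)
print(f"obs mean {obs.mean():.2f} | model mean {mod.mean():.2f} | corrected mean {corrected.mean():.2f}")
```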

  7. Statistical bias correction method applied on CMIP5 datasets over the Indian region during the summer monsoon season for climate change applications

    NASA Astrophysics Data System (ADS)

    Prasanna, V.

    2018-01-01

    This study makes use of temperature and precipitation from CMIP5 climate model output for climate change application studies over the Indian region during the summer monsoon season (JJAS). Bias correction of temperature and precipitation from CMIP5 GCM simulation results with respect to observations is discussed in detail. The non-linear statistical bias correction is a suitable bias correction method for climate change data because it is simple and does not add artificial uncertainties to the impact assessment of climate change scenarios for climate change application studies (agricultural production changes) in the future. The simple statistical bias correction uses observational constraints on the GCM baseline, and the projected results are scaled with respect to the changing magnitude in future scenarios, varying from one model to the other. Two types of bias correction techniques are shown here: (1) a simple bias correction using a percentile-based quantile-mapping algorithm and (2) a simple but improved bias correction method, a cumulative distribution function (CDF; Weibull distribution function)-based quantile-mapping algorithm. This study shows that the percentile-based quantile mapping method gives results similar to the CDF (Weibull)-based quantile mapping method, and the two methods are comparable. The bias correction is applied to temperature and precipitation variables for the present climate and future projected data to make use of it in a simple statistical model to understand the future changes in crop production over the Indian region during the summer monsoon season. In total, 12 CMIP5 models are used for the Historical (1901-2005), RCP4.5 (2005-2100), and RCP8.5 (2005-2100) scenarios. The climate index from each CMIP5 model and the observed agricultural yield index over the Indian region are used in a regression model to project the changes in agricultural yield over India under the RCP4.5 and RCP8.5 scenarios. The results revealed a better convergence of model projections in the bias-corrected data compared to the uncorrected data. The study can be extended to localized regional domains aimed at understanding the changes in agricultural productivity in the future with an agro-economic or a simple statistical model. The statistical model indicated that total food grain yield is going to increase over the Indian region in the future: the increase is approximately 50 kg/ha under the RCP4.5 scenario and approximately 90 kg/ha under the RCP8.5 scenario, from 2001 until the end of 2100. Many studies use bias correction techniques, but this study applies the bias correction technique to future climate scenario data from CMIP5 models and applies it to crop statistics to find future crop yield changes over the Indian region.

  8. Chronic kidney disease, cerebral blood flow, and white matter volume in hypertensive adults.

    PubMed

    Tamura, Manjula Kurella; Pajewski, Nicholas M; Bryan, R Nick; Weiner, Daniel E; Diamond, Matthew; Van Buren, Peter; Taylor, Addison; Beddhu, Srinivasan; Rosendorff, Clive; Jahanian, Hesamoddin; Zaharchuk, Greg

    2016-03-29

    To determine the relation between markers of kidney disease-estimated glomerular filtration rate (eGFR) and urine albumin to creatinine ratio (UACR)-with cerebral blood flow (CBF) and white matter volume (WMV) in hypertensive adults. We used baseline data collected from 665 nondiabetic hypertensive adults aged ≥50 years participating in the Systolic Blood Pressure Intervention Trial (SPRINT). We used arterial spin labeling to measure CBF and structural 3T images to segment tissue into normal and abnormal WMV. We used quantile regression to estimate the association between eGFR and UACR with CBF and abnormal WMV, adjusting for sociodemographic and clinical characteristics. There were 218 participants (33%) with eGFR <60 mL/min/1.73 m(2) and 146 participants (22%) with UACR ≥30 mg/g. Reduced eGFR was independently associated with higher adjusted median CBF, but not with abnormal WMV. Conversely, in adjusted analyses, there was a linear independent association between UACR and larger abnormal WMV, but not with CBF. Compared to participants with neither marker of CKD (eGFR ≥60 mL/min/1.73 m(2) and UACR <30 mg/g), median CBF was 5.03 mL/100 g/min higher (95% confidence interval [CI] 0.78, 9.29) and abnormal WMV was 0.63 cm(3) larger (95% CI 0.08, 1.17) among participants with both markers of CKD (eGFR <60 mL/min/1.73 m(2) and UACR ≥30 mg/g). Among nondiabetic hypertensive adults, reduced eGFR was associated with higher CBF and higher UACR was associated with larger abnormal WMV. © 2016 American Academy of Neurology.

  9. Deriving the number of jobs in proximity services from the number of inhabitants in French rural municipalities.

    PubMed

    Lenormand, Maxime; Huet, Sylvie; Deffuant, Guillaume

    2012-01-01

    We use a minimum requirement approach to derive the number of jobs in proximity services per inhabitant in French rural municipalities. We first classify the municipalities according to their time distance in minutes by car to the municipality where the inhabitants most frequently go to get services (called MFM). For each set corresponding to a range of time distance to the MFM, we perform a quantile regression estimating the minimum number of service jobs per inhabitant, which we interpret as an estimate of the number of proximity jobs per inhabitant. We observe that the minimum number of service jobs per inhabitant is smaller in small municipalities. Moreover, for municipalities of similar sizes, when the distance to the MFM increases, the number of jobs in proximity services per inhabitant increases.
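
    The minimum-requirement floor can be read as a low conditional quantile; below is a sketch fitting the 5th conditional quantile of service jobs per inhabitant against municipality size with statsmodels' QuantReg, on synthetic municipalities.

```python
# Sketch: a "minimum requirement" floor estimated as a low conditional quantile
# (here the 5th) of service jobs per inhabitant against municipality size.
import numpy as np
import statsmodels.api as sm
from statsmodels.regression.quantile_regression import QuantReg

rng = np.random.default_rng(9)
pop = rng.integers(100, 20_000, 1_500).astype(float)               # synthetic populations
jobs_per_inh = 0.01 + 0.004 * np.log10(pop) + rng.exponential(0.01, 1_500)

X = sm.add_constant(np.log10(pop))
floor = QuantReg(jobs_per_inh, X).fit(q=0.05)
print(floor.params)   # intercept and slope of the minimum-requirement line
```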

  10. Robust Inference of Risks of Large Portfolios

    PubMed Central

    Fan, Jianqing; Han, Fang; Liu, Han; Vickers, Byron

    2016-01-01

    We propose a bootstrap-based robust high-confidence level upper bound (Robust H-CLUB) for assessing the risks of large portfolios. The proposed approach exploits rank-based and quantile-based estimators, and can be viewed as a robust extension of the H-CLUB procedure (Fan et al., 2015). Such an extension allows us to handle possibly misspecified models and heavy-tailed data, which are stylized features in financial returns. Under mixing conditions, we analyze the proposed approach and demonstrate its advantage over H-CLUB. We further provide thorough numerical results to back up the developed theory, and also apply the proposed method to analyze a stock market dataset. PMID:27818569

  11. Stress indicators based on airborne thermal imagery for field phenotyping a heterogeneous tree population for response to water constraints

    PubMed Central

    Virlet, Nicolas; Lebourgeois, Valentine; Martinez, Sébastien; Costes, Evelyne; Labbé, Sylvain; Regnard, Jean-Luc

    2014-01-01

    As field phenotyping of plant response to water constraints constitutes a bottleneck for breeding programmes, airborne thermal imagery can contribute to assessing the water status of a wide range of individuals simultaneously. However, the presence of mixed soil–plant pixels in heterogeneous plant cover complicates the interpretation of canopy temperature. Moran’s Water Deficit Index (WDI = 1 − ETact/ETmax), which was designed to overcome this difficulty, was compared with surface minus air temperature (Ts − Ta) as a water stress indicator. As parameterization of the theoretical equations for WDI computation is difficult, particularly when applied to genotypes with large architectural variability, a simplified procedure based on quantile regression was proposed to delineate the Vegetation Index–Temperature (VIT) scatterplot. The sensitivity of WDI to variations in wet and dry references was assessed by applying more or less stringent quantile levels. The different stress indicators tested on a series of airborne multispectral images (RGB, near-infrared, and thermal infrared) of a population of 122 apple hybrids, under two irrigation regimes, significantly discriminated the tree water statuses. For each acquisition date, the statistical method efficiently delineated the VIT scatterplot, while the limits obtained using the theoretical approach overlapped it, leading to inconsistent WDI values. Once water constraint was established, the different stress indicators were linearly correlated to the stem water potential among a tree subset. Ts − Ta showed a strong sensitivity to evaporative demand, which limited its relevancy for temporal comparisons. Finally, the statistical approach of WDI appeared the most suitable for high-throughput phenotyping. PMID:25080086
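
    A hedged sketch of the simplified procedure follows: quantile regressions at a low and a high quantile delineate the wet and dry edges of the VIT scatterplot, and WDI is read off as each point's relative position between them. Function names, quantile levels, and the toy data are illustrative assumptions, not the authors' parameterization.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def wdi_from_vit(vi, dT, q_wet=0.05, q_dry=0.95):
    """WDI from a Vegetation Index-Temperature (VIT) scatterplot.

    Wet and dry edges are delineated by quantile regressions of the
    surface-air temperature difference dT = Ts - Ta on the vegetation
    index; each point's WDI (= 1 - ETact/ETmax) is its relative
    position between the two edges.
    """
    df = pd.DataFrame({"dT": dT, "vi": vi})
    wet_edge = smf.quantreg("dT ~ vi", df).fit(q=q_wet).predict(df)
    dry_edge = smf.quantreg("dT ~ vi", df).fit(q=q_dry).predict(df)
    return (df["dT"] - wet_edge) / (dry_edge - wet_edge)

# toy usage: hotter canopies at low vegetation cover
rng = np.random.default_rng(2)
vi = rng.uniform(0.1, 0.9, 500)
dT = 4.0 - 5.0 * vi + rng.uniform(0.0, 3.0, 500)
wdi = wdi_from_vit(vi, dT)
```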

  12. Forecasting conditional climate-change using a hybrid approach

    USGS Publications Warehouse

    Esfahani, Akbar Akbari; Friedel, Michael J.

    2014-01-01

    A novel approach is proposed to forecast the likelihood of climate change across spatial landscape gradients. This hybrid approach involves reconstructing past precipitation and temperature using the self-organizing map technique; determining quantile trends in the climate-change variables by quantile regression modeling; and computing conditional forecasts of climate-change variables based on self-similarity in quantile trends using the fractionally differenced auto-regressive integrated moving average technique. The proposed modeling approach is applied to states (Arizona, California, Colorado, Nevada, New Mexico, and Utah) in the southwestern U.S., where conditional forecasts of climate-change variables are evaluated against recent (2012) observations, at a future time period (2030), and as future trends (2009–2059). These results have broad economic, political, and social implications because they quantify uncertainty in climate-change forecasts affecting various sectors of society. Another benefit of the proposed hybrid approach is that it can be extended to any spatiotemporal scale provided self-similarity exists.

  13. A method to preserve trends in quantile mapping bias correction of climate modeled temperature

    NASA Astrophysics Data System (ADS)

    Grillakis, Manolis G.; Koutroulis, Aristeidis G.; Daliakopoulos, Ioannis N.; Tsanis, Ioannis K.

    2017-09-01

    Bias correction of climate variables is a standard practice in climate change impact (CCI) studies. Various methodologies have been developed within the framework of quantile mapping. However, it is well known that quantile mapping may significantly modify the long-term statistics due to the time dependency of the temperature bias. Here, a method to overcome this issue without compromising the day-to-day correction statistics is presented. The methodology separates the modeled temperature signal into a normalized and a residual component relative to the modeled reference period climatology, in order to adjust the biases only for the former and preserve the signal of the latter. The results show that this method allows for the preservation of the originally modeled long-term signal in the mean, the standard deviation, and higher and lower percentiles of temperature. To illustrate the improvements, the methodology is tested on daily time series obtained from five EURO-CORDEX regional climate models (RCMs).
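
    The decompose-correct-recompose idea can be sketched as follows. The running-mean residual and the window length are stand-ins for the authors' exact decomposition, chosen only to make the sketch self-contained; edge effects at the series ends are ignored.

```python
import numpy as np

def qm(x, ref_mod, ref_obs, probs=np.linspace(1, 99, 99)):
    """Empirical quantile mapping calibrated on a reference period."""
    return np.interp(x, np.percentile(ref_mod, probs),
                     np.percentile(ref_obs, probs))

def trend_preserving_qm(t_mod, t_obs_ref, n_ref, window=365):
    """Bias-correct only the normalized part of a temperature series.

    The modeled series is split into a slowly varying residual (here a
    running mean of departures from the reference climatology) and a
    normalized remainder; quantile mapping is applied to the remainder
    only, so the long-term modeled signal is preserved.
    """
    clim_ref = t_mod[:n_ref].mean()                     # reference climatology
    kernel = np.ones(window) / window
    residual = np.convolve(t_mod - clim_ref, kernel, mode="same")
    normalized = t_mod - residual                       # detrended component
    corrected = qm(normalized, normalized[:n_ref], t_obs_ref)
    return corrected + residual                         # restore the signal
```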

  14. Examining Predictive Validity of Oral Reading Fluency Slope in Upper Elementary Grades Using Quantile Regression.

    PubMed

    Cho, Eunsoo; Capin, Philip; Roberts, Greg; Vaughn, Sharon

    2017-07-01

    Within multitiered instructional delivery models, progress monitoring is a key mechanism for determining whether a child demonstrates an adequate response to instruction. One measure commonly used to monitor the reading progress of students is oral reading fluency (ORF). This study examined the extent to which ORF slope predicts reading comprehension outcomes for fifth-grade struggling readers (n = 102) participating in an intensive reading intervention. Quantile regression models showed that ORF slope significantly predicted performance on a sentence-level fluency and comprehension assessment, regardless of the students' reading skills, controlling for initial ORF performance. However, ORF slope was differentially predictive of a passage-level comprehension assessment based on students' reading skills when controlling for initial ORF status. Results showed that ORF explained unique variance for struggling readers whose posttest performance was at the upper quantiles at the end of the reading intervention, but slope was not a significant predictor of passage-level comprehension for students whose reading problems were the most difficult to remediate.

  15. Using the Quantile Mapping to improve a weather generator

    NASA Astrophysics Data System (ADS)

    Chen, Y.; Themessl, M.; Gobiet, A.

    2012-04-01

    We developed a weather generator (WG) using statistical and stochastic methods, among them quantile mapping (QM), Monte Carlo sampling, auto-regression, and empirical orthogonal functions (EOF). One of the important steps in the WG is the use of QM, through which all variables, whatever their original distributions, are transformed into normally distributed variables. The WG can therefore work on normally distributed variables, which greatly facilitates the treatment of random numbers in the WG. Monte Carlo sampling and auto-regression are used to generate the realizations; EOFs are employed to preserve spatial relationships and the relationships between different meteorological variables. We have established a complete model named WGQM (weather generator and quantile mapping), which can be applied flexibly to generate daily or hourly time series. For example, with 30-year daily (hourly) data and 100-year monthly (daily) data as input, 100-year daily (hourly) data can be produced reasonably well. Evaluation experiments with WGQM have been carried out in the area of Austria, and the evaluation results will be presented.
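
    The QM step that maps an arbitrarily distributed variable onto a normal one can be sketched as a normal-score transform; the plotting-position choice i/(n+1) below is an assumption for illustration, not necessarily the authors' implementation.

```python
import numpy as np
from scipy import stats

def to_normal_scores(x):
    """Quantile-map a sample of any distribution onto N(0, 1).

    Empirical CDF positions (plotting positions i/(n+1)) are pushed
    through the inverse standard normal CDF.
    """
    u = stats.rankdata(x) / (len(x) + 1)   # CDF positions in (0, 1)
    return stats.norm.ppf(u)

# e.g. gamma-distributed daily precipitation -> roughly standard normal
precip = np.random.default_rng(3).gamma(0.8, 6.0, 1000)
z = to_normal_scores(precip)
```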

  16. The role of ensemble post-processing for modeling the ensemble tail

    NASA Astrophysics Data System (ADS)

    Van De Vyver, Hans; Van Schaeybroeck, Bert; Vannitsem, Stéphane

    2016-04-01

    Over the past decades, the numerical weather prediction community has witnessed a paradigm shift from deterministic to probabilistic forecast and state estimation (Buizza and Leutbecher, 2015; Buizza et al., 2008), in an attempt to quantify the uncertainties associated with initial-condition and model errors. An important benefit of a probabilistic framework is the improved prediction of extreme events. However, one may ask to what extent such model estimates contain information on the occurrence probability of extreme events and how this information can be optimally extracted. Different approaches have been proposed and applied to real-world systems which, based on extreme value theory, allow the estimation of extreme-event probabilities conditional on forecasts and state estimates (Ferro, 2007; Friederichs, 2010). Using ensemble predictions generated with a model of low dimensionality, a thorough investigation is presented quantifying the change in predictability of extreme events associated with ensemble post-processing and other influencing factors, including the finite ensemble size, lead time, model assumptions, and the use of different covariates (ensemble mean, maximum, spread...) for modeling the tail distribution. Tail modeling is performed by deriving extreme-quantile estimates using a peaks-over-threshold representation (generalized Pareto distribution) or quantile regression. Common ensemble post-processing methods aim to improve mostly the ensemble mean and spread of a raw forecast (Van Schaeybroeck and Vannitsem, 2015). Conditional tail modeling, on the other hand, is a post-processing in itself, focusing on the tails only. Therefore, it is unclear how applying ensemble post-processing prior to conditional tail modeling impacts the skill of extreme-event predictions. This work investigates this question in detail. Buizza, Leutbecher, and Isaksen, 2008: Potential use of an ensemble of analyses in the ECMWF Ensemble Prediction System, Q. J. R. Meteorol. Soc. 134: 2051-2066. Buizza and Leutbecher, 2015: The forecast skill horizon, Q. J. R. Meteorol. Soc. 141: 3366-3382. Ferro, 2007: A probability model for verifying deterministic forecasts of extreme events. Weather and Forecasting 22 (5), 1089-1100. Friederichs, 2010: Statistical downscaling of extreme precipitation events using extreme value theory. Extremes 13, 109-132. Van Schaeybroeck and Vannitsem, 2015: Ensemble post-processing using member-by-member approaches: theoretical aspects. Q. J. R. Meteorol. Soc., 141: 807-818.
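
    A minimal sketch of the peaks-over-threshold tail model follows, using scipy's generalized Pareto fit and the standard GPD quantile inversion; the threshold choice and the toy data are illustrative assumptions, not the study's configuration.

```python
import numpy as np
from scipy import stats

def pot_quantile(x, u, p):
    """Extreme quantile via a peaks-over-threshold GPD fit.

    Excesses over the threshold u are fitted with a generalized Pareto
    distribution; the p-quantile of the full distribution is then
        x_p = u + (sigma/xi) * (((1 - p) / zeta_u)**(-xi) - 1),
    where zeta_u is the fraction of observations exceeding u.
    """
    exc = x[x > u] - u
    xi, _, sigma = stats.genpareto.fit(exc, floc=0)   # shape, loc, scale
    zeta = exc.size / x.size
    if abs(xi) < 1e-6:                                # exponential-tail limit
        return u + sigma * np.log(zeta / (1 - p))
    return u + sigma / xi * (((1 - p) / zeta) ** (-xi) - 1)

x = np.random.default_rng(4).gumbel(10.0, 2.0, 20000)  # toy forecast errors
print(pot_quantile(x, u=np.quantile(x, 0.95), p=0.999))
```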

  17. Not dead yet: the seasonal water relations of two perennial ferns during California's exceptional drought.

    PubMed

    Baer, Alex; Wheeler, James K; Pittermann, Jarmila

    2016-04-01

    The understory of the redwood forests of California's coast harbors perennial ferns, including Polystichum munitum and Dryopteris arguta. Unusual for ferns, these species are adapted to the characteristic Mediterranean-type dry season, but the mechanisms of tolerance have not been studied. The water relations of P. munitum and D. arguta were surveyed for over a year, including measures of water potential (Ψ), stomatal conductance (gs) and frond stipe hydraulic conductivity (K). A dehydration and re-watering experiment on potted P. munitum plants corroborated the field data. The seasonal Ψ varied from 0 to below -3 MPa in both species, with gs and K generally tracking Ψ; the loss of K rarely exceeded 80%. Quantile regression analysis showed that, at the 0.1 quantile, 50% of K was lost at -2.58 and -3.84 MPa in P. munitum and D. arguta, respectively. The hydraulic recovery of re-watered plants was attributed to capillarity. The seasonal water relations of P. munitum and D. arguta are variable, but consistent with laboratory-based estimates of drought tolerance. Hydraulic and Ψ recovery following rain allows perennial ferns to survive severe drought, but prolonged water deficit, coupled with insect damage, may hamper frond survival. The legacy effects of drought on reproductive capacity and community dynamics are unknown. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.

  18. [Socioeconomic factors conditioning obesity in adults. Evidence based on quantile regression and panel data].

    PubMed

    Temporelli, Karina L; Viego, Valentina N

    2016-08-01

    Objective To measure the effect of socioeconomic variables on the prevalence of obesity. Factors such as income level, urbanization, incorporation of women into the labor market, and access to unhealthy foods are considered in this paper. Method Econometric estimates of the proportion of obese men and women by country were calculated using models based on panel data and quantile regressions, with data from 192 countries for the period 2002-2005. Levels of per capita income, urbanization, the income/Big Mac price ratio, and labor indicators for the female population were considered as explanatory variables. Results The factors that influence obesity in adults differ between men and women; accessibility of fast food is related to male obesity, while the employment mode causes higher rates in women. The underlying socioeconomic factors for obesity also differ depending on the magnitude of the problem in each country: in countries with low prevalence, a greater income level favors the transition to obesogenic habits, while a higher income level mitigates the problem in countries with high rates of obesity. Discussion Identifying the socioeconomic causes of the significant increase in the prevalence of obesity is essential for the implementation of effective prevention strategies, since this condition not only affects the quality of life of those who suffer from it but also puts pressure on health systems due to the treatment costs of associated diseases.

  19. Health and nutritional status of children in Ethiopia: do maternal characteristics matter?

    PubMed

    Seid, Abdu Kedir

    2013-03-01

    In Ethiopia, despite some recent improvements, the health and nutritional status of children is very poor. A better understanding of the main socioeconomic determinants of child health and nutrition is essential to address the problem and make appropriate interventions. In the present study, an attempt is made to explore the effect of maternal characteristics on the health and nutritional status of under-five children using the 2005 Ethiopian Demographic and Health Survey. The health and nutritional status of children are measured using the two widely used anthropometric indicators height-for-age (HAZ) and weight-for-height (WHZ). In the ordinary least squares (OLS) estimation, it is observed that maternal characteristics have a significant impact on child health and nutritional status. The magnitudes of the coefficients, however, are found to slightly increase when maternal education is instrumented in the 2SLS estimation. Moreover, in the quantile regression (QR) estimation, the impacts of maternal characteristics are observed to vary between long-term and current child health and nutritional status.

  20. A 6-year trend of the healthcare costs of arthritis in a population-based cohort of older women.

    PubMed

    Lo, Tkt; Parkinson, Lynne; Cunich, Michelle; Byles, Julie

    2016-06-01

    To provide an accurate representation of the economic burden of arthritis by estimating the adjusted incremental healthcare cost of arthritis at multiple percentiles and reporting the cost trends across time. A healthcare cost study based on health survey and linked administrative data, where costs were estimated from the government's perspective in dollars per person per year. Quantile regression was used to estimate the adjusted incremental cost at the 25th, 50th, 75th, 90th, and 95th percentiles. Data from 4287 older Australian women were included. The median incremental healthcare cost of arthritis was, in 2012 Australian dollars, $480 (95% CI: $498-759) in 2009; however, 5% of individuals had 5 times higher costs than the 'average individual' with arthritis. The healthcare cost of arthritis did not increase significantly from 2003 to 2009. The healthcare cost of arthritis represents a substantial burden for governments. Future research should continue to monitor the economic burden of arthritis.

  1. An analysis of annual maximum streamflows in Terengganu, Malaysia using TL-moments approach

    NASA Astrophysics Data System (ADS)

    Ahmad, Ummi Nadiah; Shabri, Ani; Zakaria, Zahrahtul Amani

    2013-02-01

    The TL-moments approach has been used to determine the best-fitting distributions to represent the annual series of maximum streamflow data over 12 stations in Terengganu, Malaysia. TL-moments with different trimming values are used to estimate the parameters of the selected distributions, namely the generalized Pareto (GPA), generalized logistic, and generalized extreme value distributions. The influence of TL-moments on the estimated probability distribution functions is examined by evaluating the relative root mean square error and relative bias of quantile estimates through Monte Carlo simulations. Boxplots are used to show the location of the median and the dispersion of the data, which helps in reaching decisive conclusions. In most cases, the results show that the GPA distribution fitted by TL-moments with the single smallest value trimmed from the conceptual sample (TL-moments (1,0)) was the most appropriate for describing the annual maximum streamflow series at the majority of stations in Terengganu, Malaysia.
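
    For orientation, the sketch below computes plain (untrimmed) sample L-moments from probability-weighted moments; the trimmed TL-moment weights used in the paper are not reproduced here, and the toy data are an assumption.

```python
import numpy as np

def l_moments(x):
    """First three sample L-moments via probability-weighted moments.

    This is the untrimmed case; TL-moments such as TL(1,0) additionally
    damp the influence of the smallest observation via modified weights.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    j = np.arange(1, n + 1)
    b0 = x.mean()
    b1 = np.sum((j - 1) / (n - 1) * x) / n
    b2 = np.sum((j - 1) * (j - 2) / ((n - 1) * (n - 2)) * x) / n
    l1, l2, l3 = b0, 2 * b1 - b0, 6 * b2 - 6 * b1 + b0
    return l1, l2, l3 / l2     # L-location, L-scale, L-skewness (tau3)

flows = np.random.default_rng(5).gumbel(300.0, 80.0, 40)  # toy annual maxima
print(l_moments(flows))
```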

  2. Hourly Wind Speed Interval Prediction in Arid Regions

    NASA Astrophysics Data System (ADS)

    Chaouch, M.; Ouarda, T.

    2013-12-01

    The long and extended warm and dry summers and the low rates of rain and humidity are the main factors that explain the increase in electricity consumption in hot arid regions. In such regions, ventilating and air-conditioning installations, typically the most energy-intensive among energy consumption activities, are essential for securing healthy, safe and suitable indoor thermal conditions for building occupants and stored materials. The use of renewable energy resources such as solar and wind represents one of the most relevant solutions to the challenge of increasing electricity demand. In recent years, wind energy has been gaining importance among researchers worldwide. Wind energy is intermittent in nature, and hence power system scheduling and dynamic control of wind turbines require an estimate of wind energy. Accurate forecasting of wind speed is a challenging task in wind energy research. In fact, due to the large variability of wind speed caused by the unpredictable and dynamic nature of the earth's atmosphere, there are many fluctuations in wind power production. This inherent variability of wind speed is the main cause of the uncertainty observed in wind power generation. Furthermore, wind power forecasts might be obtained indirectly by modeling the wind speed series and then transforming the forecasts through a power curve. Wind speed forecasting techniques have received substantial attention recently and several models have been developed. Broadly, two main approaches have been proposed in the literature: (1) physical models such as numerical weather forecasts and (2) statistical models such as autoregressive integrated moving average (ARIMA) models and neural networks. While the initial focus in the literature has been on point forecasts, the need to quantify forecast uncertainty and communicate the risk of extreme ramp events has led to an interest in producing probabilistic forecasts. In a short-term context, probabilistic forecasts might be more relevant than point forecasts for planners building scenarios. In this paper, we are interested in estimating predictive intervals of the hourly wind speed measures in a few cities in the United Arab Emirates (UAE). More precisely, given a wind speed time series, our target is to forecast the wind speed at any specific hour during the day and provide in addition an interval with the coverage probability 0

  3. Mapping the changing pattern of local climate as an observed distribution

    NASA Astrophysics Data System (ADS)

    Chapman, Sandra; Stainforth, David; Watkins, Nicholas

    2013-04-01

    It is at local scales that the impacts of climate change will be felt directly and at which adaptation planning decisions must be made. This requires quantifying the geographical patterns in trends at specific quantiles in distributions of variables such as daily temperature or precipitation. Here we focus on these local changes and on the way observational data can be analysed to inform us about the pattern of local climate change. We present a method [1] for analysing local climatic timeseries data to assess which quantiles of the local climatic distribution show the greatest and most robust trends. We demonstrate this approach using E-OBS gridded data [2] timeseries of local daily temperature from specific locations across Europe over the last 60 years. Our method extracts the changing cumulative distribution function over time and uses a simple mathematical deconstruction of how the difference between two observations from two different time periods can be assigned to the combination of natural statistical variability and/or the consequences of secular climate change. This deconstruction facilitates an assessment of the sensitivity of different quantiles of the distributions to changing climate. Geographical location and temperature are treated as independent variables; we thus obtain as outputs the pattern of variation in sensitivity with temperature (or occurrence likelihood) and with geographical location. The outputs reveal many regionally consistent patterns of response of potential value in adaptation planning. We discuss methods to quantify and map the robustness of these observed sensitivities and their statistical likelihood. This also quantifies the level of detail needed from climate models if they are to be used as tools to assess climate change impact. [1] S C Chapman, D A Stainforth, N W Watkins, 2013, On Estimating Local Long Term Climate Trends, Phil. Trans. R. Soc. A, in press. [2] Haylock, M.R., N. Hofstra, A.M.G. Klein Tank, E.J. Klok, P.D. Jones and M. New, 2008: A European daily high-resolution gridded dataset of surface temperature and precipitation. J. Geophys. Res. (Atmospheres), 113, D20119, doi:10.1029/2008JD10201

  4. Project Lifespan-based Nonstationary Hydrologic Design Methods for Changing Environment

    NASA Astrophysics Data System (ADS)

    Xiong, L.

    2017-12-01

    Under a changing environment, we must associate design floods with the design life period of projects to ensure that the hydrologic design is really relevant to the operation of the hydrologic projects, because the design value for a given exceedance probability over the project life period would be significantly different from that over other time periods of the same length due to the nonstationarity of probability distributions. Several hydrologic design methods that take the design life period of projects into account have been proposed in recent years, i.e. the expected number of exceedances (ENE), design life level (DLL), equivalent reliability (ER), and average design life level (ADLL). Among the four methods to be compared, both the ENE and ER methods are return period-based methods, while DLL and ADLL are risk/reliability-based methods which estimate design values for given probability values of risk or reliability. However, the four methods can be unified under a general framework through a relationship transforming the so-called representative reliability (RRE) into the return period, i.e. m = 1/(1 − RRE), in which the return period m is computed from the representative reliability RRE. The results of nonstationary design quantiles and associated confidence intervals calculated by ENE, ER and ADLL were very similar, since ENE or ER is a special case of, or has a similar expression form to, ADLL. In particular, the design quantiles calculated by ENE and ADLL were the same when the return period was equal to the length of the design life. In addition, DLL can yield similar design values if the relationship between DLL and ER/ADLL return periods is considered. Furthermore, ENE, ER and ADLL had good adaptability to either an increasing or decreasing situation, yielding neither too large nor too small design quantiles. This is important for applications of nonstationary hydrologic design methods in actual practice because of the concern of choosing the emerging nonstationary methods versus the traditional stationary methods. There is still a long way to go for the conceptual transition from stationarity to nonstationarity in hydrologic design.

  5. GIS-aided low flow mapping

    NASA Astrophysics Data System (ADS)

    Saghafian, B.; Mohammadi, A.

    2003-04-01

    Most studies involving water resources allocation, water quality, hydropower generation, and allowable water withdrawal and transfer require estimation of low flows. Normally, frequency analysis on at-station D-day low flow data is performed to derive various T-yr return period values. However, this analysis is restricted to the locations of hydrometric stations where the flow discharge is measured. Regional analysis is therefore conducted to relate the at-station low flow quantiles to watershed characteristics. This enables the transposition of low flow quantiles to ungauged sites. Nevertheless, a procedure to map the regional regression relations for the entire stream network, within the bounds of the relations, is particularly helpful when one studies and weighs alternative sites for a water resources project. In this study, we used a GIS-aided procedure for low flow mapping in Gilan province, part of the northern region of Iran. Gilan enjoys a humid climate with an average of 1100 mm annual precipitation. Although rich in water resources, the highly populated area is quite dependent on a minimum amount of water to sustain the vast rice farming and to maintain the flow discharge required for water quality purposes. To carry out the low flow analysis, a total of 36 hydrometric stations with sufficient and reliable discharge data were identified in the region. The average area of the watersheds was 250 sq. km. Log-Pearson type 3 was found to be the best distribution for flow durations over 60 days, while the log-normal fitted the shorter-duration series well. Low flows with return periods of 2, 5, 10, 25, 50, and 100 years were then computed. Cluster analysis identified two homogeneous areas. Although various watershed parameters were examined in factor analysis, the results showed that watershed area, length of the main stream, and annual precipitation were the most effective low flow parameters. The regression equations were then mapped with the aid of GIS based on flow accumulation maps and the corresponding spatially averaged values of other parameters over the upslope area of all stream pixels exceeding a certain threshold area. Such a map clearly shows the spatial variation of low flow quantiles along the stream network and enables the study of low flow profiles along any stream.

  6. Calorie Labeling in Chain Restaurants and Body Weight: Evidence from New York.

    PubMed

    Restrepo, Brandon J

    2017-10-01

    This study analyzes the impact of local mandatory calorie labeling laws implemented by New York jurisdictions on body weight. The analysis indicates that on average the point-of-purchase provision of calorie information on chain restaurant menus reduced body mass index (BMI) by 1.5% and lowered the risk of obesity by 12%. Quantile regression results indicate that calorie labeling has similar impacts across the BMI distribution. An analysis of heterogeneity suggests that calorie labeling has a larger impact on the body weight of lower income individuals, especially lower income minorities. The estimated impacts of calorie labeling on physical activity, smoking, and the consumption of alcoholic beverages, fruits, and vegetables are small in magnitude, which suggests that other margins of adjustment drive the body-weight impacts estimated here. Copyright © 2016 John Wiley & Sons, Ltd.

  7. Robust fitting for neuroreceptor mapping.

    PubMed

    Chang, Chung; Ogden, R Todd

    2009-03-15

    Among many other uses, positron emission tomography (PET) can be used in studies to estimate the density of a neuroreceptor at each location throughout the brain by measuring the concentration of a radiotracer over time and modeling its kinetics. There are a variety of kinetic models in common usage and these typically rely on nonlinear least-squares (LS) algorithms for parameter estimation. However, PET data often contain artifacts (such as uncorrected head motion) and so the assumptions on which the LS methods are based may be violated. Quantile regression (QR) provides a robust alternative to LS methods and has been used successfully in many applications. We consider fitting various kinetic models to PET data using QR and study the relative performance of the methods via simulation. A data adaptive method for choosing between LS and QR is proposed and the performance of this method is also studied.
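
    The LS-versus-QR robustness contrast can be sketched on a toy linear problem; the paper's kinetic models are nonlinear and are not reproduced here, and the contamination pattern below is an illustrative assumption standing in for motion-corrupted frames.

```python
import numpy as np
import statsmodels.api as sm

# a straight-line fit contaminated by artifact-like outliers
rng = np.random.default_rng(6)
t = np.linspace(0.0, 90.0, 60)
y = 3.0 + 0.05 * t + rng.normal(0.0, 0.2, t.size)
y[::10] += 4.0                              # gross outliers

X = sm.add_constant(t)
ols = sm.OLS(y, X).fit()                    # least squares, pulled by outliers
med = sm.QuantReg(y, X).fit(q=0.5)          # median regression, robust
print(ols.params, med.params)
```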

  8. Identifying the Safety Factors over Traffic Signs in State Roads using a Panel Quantile Regression Approach.

    PubMed

    Šarić, Željko; Xu, Xuecai; Duan, Li; Babić, Darko

    2018-06-20

    This study investigated the interactions between accident rate and traffic signs on state roads in Croatia, accommodating the heterogeneity attributed to unobserved factors. Data from 130 state roads between 2012 and 2016 were collected from the Traffic Accident Database System maintained by the Republic of Croatia Ministry of the Interior. To address the heterogeneity, a panel quantile regression model was proposed, in which the quantile regression model offers a more complete view and a highly comprehensive analysis of the relationship between accident rate and traffic signs, while the panel data model accommodates the heterogeneity attributed to unobserved factors. Results revealed that (1) low visibility increased the rate of accidents with material damage (MD) and death or injury (DI); (2) the number of mandatory signs and the number of warning signs were more likely to reduce the accident rate; and (3) average speed limit and the number of invalid traffic signs per km were associated with a high accident rate. To our knowledge, this is the first attempt to analyze the interactions between accident consequences and traffic signs using a panel quantile regression model. By involving visibility, the present study demonstrates that low visibility causes a relatively higher risk of MD and DI; it is noteworthy that average speed limit corresponds positively with accident rate; the number of mandatory signs and the number of warning signs are more likely to reduce the accident rate; and the number of invalid traffic signs per km is significant for the accident rate, so regular maintenance should be performed for a safer roadway environment.

  9. Smooth quantile normalization.

    PubMed

    Hicks, Stephanie C; Okrah, Kwame; Paulson, Joseph N; Quackenbush, John; Irizarry, Rafael A; Bravo, Héctor Corrada

    2018-04-01

    Between-sample normalization is a critical step in genomic data analysis to remove systematic bias and unwanted technical variation in high-throughput data. Global normalization methods are based on the assumption that observed variability in global properties is due to technical reasons and is unrelated to the biology of interest. For example, some methods correct for differences in sequencing read counts by scaling features to have similar median values across samples, but these fail to reduce other forms of unwanted technical variation. Methods such as quantile normalization transform the statistical distributions across samples to be the same and assume global differences in the distribution are induced by only technical variation. However, it remains unclear how to proceed with normalization if these assumptions are violated, for example, if there are global differences in the statistical distributions between biological conditions or groups, and external information, such as negative or control features, is not available. Here, we introduce a generalization of quantile normalization, referred to as smooth quantile normalization (qsmooth), which is based on the assumption that the statistical distribution of each sample should be the same (or have the same distributional shape) within biological groups or conditions, but allowing that they may differ between groups. We illustrate the advantages of our method on several high-throughput datasets with global differences in distributions corresponding to different biological conditions. We also perform a Monte Carlo simulation study to illustrate the bias-variance tradeoff and root mean squared error of qsmooth compared to other global normalization methods. A software implementation is available from https://github.com/stephaniehicks/qsmooth.
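
    For contrast with qsmooth, the sketch below implements the classic quantile normalization that forces every sample onto one common reference distribution; it is the baseline whose global-distribution assumption qsmooth relaxes, not the qsmooth algorithm itself, and the toy matrix is an assumption.

```python
import numpy as np
from scipy import stats

def quantile_normalize(mat):
    """Classic quantile normalization (samples in columns).

    Every column is forced onto a common reference distribution: the
    row-wise mean of the sorted columns. qsmooth relaxes exactly this
    step, letting the reference differ between biological groups.
    """
    ranks = stats.rankdata(mat, method="ordinal", axis=0)   # per-column ranks
    ref = np.sort(mat, axis=0).mean(axis=1)                 # reference quantiles
    return ref[ranks - 1]                                   # rank -> reference value

mat = np.random.default_rng(7).lognormal(size=(1000, 6))    # toy expression matrix
norm = quantile_normalize(mat)   # identical marginal distribution per column
```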

  10. An impact of environmental changes on flows in the reach scale under a range of climatic conditions

    NASA Astrophysics Data System (ADS)

    Karamuz, Emilia; Romanowicz, Renata J.

    2016-04-01

    The present paper combines detection and identification of the causes of changes in flow regime at cross-sections along the Middle River Vistula reach using different methods. Two main experimental set-ups (designs) have been applied to study the changes: a moving three-year window and a low- and high-flow event-based approach. In the first experiment, a Stochastic Transfer Function (STF) model and a quantile-based statistical analysis of flow patterns were compared. These two methods are based on the analysis of changes in the STF model parameters and in the standardised differences of flow quantile values. In the second experiment, in addition to the STF-based model, a 1-D distributed model, MIKE11, was applied. The first step of the procedure used in the study is to define the river reaches that have recorded information on land use and water management changes. The second task is to perform the moving window analysis of standardised differences of flow quantiles and moving window optimisation of the STF model for flow routing. The third step consists of an optimisation of the STF and MIKE11 models for high- and low-flow events. The final step is to analyse the results and relate the standardised quantile changes and model parameter changes to historical land use changes and water management practices. Results indicate that both models give a consistent assessment of changes in the channel for medium and high flows. ACKNOWLEDGEMENTS This research was supported by the Institute of Geophysics Polish Academy of Sciences through the Young Scientist Grant no. 3b/IGF PAN/2015.

  11. Evaluation of CMIP5 continental precipitation simulations relative to satellite-based gauge-adjusted observations

    NASA Astrophysics Data System (ADS)

    Mehran, A.; AghaKouchak, A.; Phillips, T. J.

    2014-02-01

    The objective of this study is to cross-validate 34 Coupled Model Intercomparison Project Phase 5 (CMIP5) historical simulations of precipitation against the Global Precipitation Climatology Project (GPCP) data, quantifying model pattern discrepancies and biases for both entire distributions and their upper tails. The results of the volumetric hit index (VHI) analysis of the total monthly precipitation amounts show that most CMIP5 simulations are in good agreement with GPCP patterns in many areas but that their replication of observed precipitation over arid regions and certain subcontinental regions (e.g., northern Eurasia, eastern Russia, and central Australia) is problematic. Overall, the VHI of the multimodel ensemble mean and median are also superior to that of the individual CMIP5 models. However, at high quantiles of the reference data (75th and 90th percentiles), all climate models display low skill in simulating precipitation, except over North America, the Amazon, and Central Africa. Analyses of total bias (B) in CMIP5 simulations reveal that most models overestimate precipitation over regions of complex topography (e.g., western North and South America and southern Africa and Asia), while underestimating it over arid regions. Also, while most climate model simulations show low biases over Europe, intermodel variations in bias over Australia and Amazonia are considerable. The quantile bias analyses indicate that CMIP5 simulations are even more biased at high quantiles of precipitation. It is found that a simple mean field bias removal improves the overall B and VHI values but does not make a significant improvement at high quantiles of precipitation.

  12. Statistical characterization of a large geochemical database and effect of sample size

    USGS Publications Warehouse

    Zhang, C.; Manheim, F.T.; Hinde, J.; Grossman, J.N.

    2005-01-01

    The authors investigated statistical distributions for concentrations of chemical elements from the National Geochemical Survey (NGS) database of the U.S. Geological Survey. At the time of this study, the NGS data set encompassed 48,544 stream sediment and soil samples from the conterminous United States analyzed by ICP-AES following a 4-acid near-total digestion. This report includes 27 elements: Al, Ca, Fe, K, Mg, Na, P, Ti, Ba, Ce, Co, Cr, Cu, Ga, La, Li, Mn, Nb, Nd, Ni, Pb, Sc, Sr, Th, V, Y and Zn. The goal and challenge for the statistical overview was to delineate chemical distributions in a complex, heterogeneous data set spanning a large geographic range (the conterminous United States) and many different geological provinces and rock types. After declustering to create a uniform spatial sample distribution with 16,511 samples, histograms and quantile-quantile (Q-Q) plots were employed to delineate subpopulations that have coherent chemical and mineral affinities. Probability groupings are discerned by changes in slope (kinks) on the plots. Major rock-forming elements, e.g., Al, Ca, K and Na, tend to display linear segments on normal Q-Q plots. These segments can commonly be linked to petrologic or mineralogical associations. For example, linear segments on K and Na plots reflect dilution of clay minerals by quartz sand (low in K and Na). Minor and trace element relationships are best displayed on lognormal Q-Q plots. These sensitively reflect discrete relationships in subpopulations within the wide range of the data. For example, small but distinctly log-linear subpopulations for Pb, Cu, Zn and Ag are interpreted to represent ore-grade enrichment of naturally occurring minerals such as sulfides. None of the 27 chemical elements could pass the test for either normal or lognormal distribution on the declustered data set. Part of the reason relates to the presence of mixtures of subpopulations and outliers. Random samples of the data set with successively smaller numbers of data points showed that few elements passed standard statistical tests for normality or log-normality until the sample size decreased to a few hundred data points. Large sample size enhances the power of statistical tests and leads to rejection of most statistical hypotheses for real data sets. For large sample sizes (e.g., n > 1000), graphical methods such as histograms, stem-and-leaf displays, and probability plots are recommended for rough judgement of the probability distribution if needed. © 2005 Elsevier Ltd. All rights reserved.

  13. Transit-time and age distributions for nonlinear time-dependent compartmental systems.

    PubMed

    Metzler, Holger; Müller, Markus; Sierra, Carlos A

    2018-02-06

    Many processes in nature are modeled using compartmental systems (reservoir/pool/box systems). Usually, they are expressed as a set of first-order differential equations describing the transfer of matter across a network of compartments. The concepts of the age of matter in compartments and the time required for particles to transit the system are important diagnostics of these models, with applications to a wide range of scientific questions. Until now, explicit formulas for transit-time and age distributions of nonlinear time-dependent compartmental systems were not available. We compute densities for these types of systems under the assumption of well-mixed compartments. Assuming that a solution of the nonlinear system is available at least numerically, we show how to construct a linear time-dependent system with the same solution trajectory. We demonstrate how to exploit this solution to compute transit-time and age distributions as functions of given start values and initial age distributions. Furthermore, we derive equations for the time evolution of quantiles and moments of the age distributions. Our results generalize available density formulas for the linear time-independent case and mean-age formulas for the linear time-dependent case. As an example, we apply our formulas to a nonlinear and a linear version of a simple global carbon cycle model driven by a time-dependent input signal which represents fossil fuel additions. We derive time-dependent age distributions for all compartments and calculate the time it takes to remove fossil carbon in a business-as-usual scenario.

  14. Removing technical variability in RNA-seq data using conditional quantile normalization.

    PubMed

    Hansen, Kasper D; Irizarry, Rafael A; Wu, Zhijin

    2012-04-01

    The ability to measure gene expression on a genome-wide scale is one of the most promising accomplishments in molecular biology. Microarrays, the technology that first permitted this, were riddled with problems due to unwanted sources of variability. Many of these problems are now mitigated, after a decade's worth of statistical methodology development. The recently developed RNA sequencing (RNA-seq) technology has generated much excitement in part due to claims of reduced variability in comparison to microarrays. However, we show that RNA-seq data demonstrate unwanted and obscuring variability similar to what was first observed in microarrays. In particular, we find guanine-cytosine content (GC-content) has a strong sample-specific effect on gene expression measurements that, if left uncorrected, leads to false positives in downstream results. We also report on commonly observed data distortions that demonstrate the need for data normalization. Here, we describe a statistical methodology that improves precision by 42% without loss of accuracy. Our resulting conditional quantile normalization algorithm combines robust generalized regression to remove systematic bias introduced by deterministic features such as GC-content and quantile normalization to correct for global distortions.

  15. Health Insurance Dynamics: Methodological Considerations and a Comparison of Estimates from Two Surveys.

    PubMed

    Graves, John A; Mishra, Pranita

    2016-10-01

    To highlight key methodological issues in studying insurance dynamics and to compare estimates across two commonly used surveys. Nonelderly uninsured adults and children sampled between 2001 and 2011 in the Medical Expenditure Panel Survey and the Survey of Income and Program Participation. We utilized nonparametric Kaplan-Meier methods to estimate quantiles (25th, 50th, and 75th percentiles) of the distribution of uninsured spells. We compared estimates obtained across surveys and across different methodological approaches to address issues like attrition, seam bias, censoring and truncation, and survey weighting methods. All data were drawn from publicly available household surveys. Estimated uninsured spell durations in the MEPS were longer than those observed in the SIPP. There were few changes in spell durations between 2001 and 2011, with median durations of 14 months among adults and 5-7 months among children in the MEPS, and 8 months (adults) and 4 months (children) in the SIPP. The use of panel survey data to study insurance dynamics presents a unique set of methodological challenges. Researchers should consider key analytic and survey design trade-offs when choosing which survey can best suit their research goals. © Health Research and Educational Trust.

  16. The use of historical information for regional frequency analysis of extreme skew surge

    NASA Astrophysics Data System (ADS)

    Frau, Roberto; Andreewsky, Marc; Bernardara, Pietro

    2018-03-01

    The design of effective coastal protections requires an adequate estimation of the annual occurrence probability of rare events associated with return periods of up to 10³ years. Regional frequency analysis (RFA) has been proven to be an applicable way to estimate extreme events by sorting regional data into large, spatially distributed datasets. Nowadays, historical data are available to provide new insight into past events. The use of historical information would increase the precision and reliability of regional extreme quantile estimation. However, historical data come from significant extreme events that were not recorded by tide gauges. They are usually isolated data points, different from the continuous data produced by systematic tide gauge measurements. This complicates the definition of the duration of the observation period, which is crucial for the frequency estimation of extreme occurrences. For this reason, we introduce here the concept of credible duration. The proposed RFA method (hereinafter referenced as FAB, from the names of the authors) allows the use of historical data together with systematic data as a result of the credible duration concept.

  17. Value-at-Risk analysis using ARMAX GARCHX approach for estimating risk of banking subsector stock return’s

    NASA Astrophysics Data System (ADS)

    Dewi Ratih, Iis; Sutijo Supri Ulama, Brodjol; Prastuti, Mike

    2018-03-01

    Value at Risk (VaR) is one of the statistical methods used to measure market risk by estimating the worst loss over a given time period at a given confidence level. The accuracy of this measure is very important in determining the amount of capital that a company must hold against possible losses, because greater risk implies greater potential losses at a given probability level. VaR calculation is therefore of particular interest to stock market researchers and practitioners, with the aim of obtaining more accurate estimates. In this research, a risk analysis of four banking subsector stocks, Bank Rakyat Indonesia, Bank Mandiri, Bank Central Asia and Bank Negara Indonesia, is carried out. Stock returns are expected to be influenced by exogenous variables, namely the ICI and the exchange rate. Therefore, stock risk is estimated using the ARMAX-GARCHX VaR method. Calculating the VaR value with the ARMAX-GARCHX approach using a window of 500 observations gives more accurate results. Overall, Bank Central Asia was the only bank that had the estimated maximum loss in the 5% quantile.
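
    As a baseline for comparison, the sketch below computes plain historical-simulation VaR as an empirical return quantile; the ARMAX-GARCHX conditional model used in the paper is not reproduced here, and the toy return series is an assumption.

```python
import numpy as np

def historical_var(returns, alpha=0.05):
    """One-period historical-simulation Value at Risk.

    VaR at level alpha is the loss exceeded with probability alpha,
    i.e. the negated alpha-quantile of the return distribution. The
    paper instead models returns with ARMAX-GARCHX and reads the
    quantile from the fitted conditional distribution.
    """
    return -np.quantile(returns, alpha)

rets = np.random.default_rng(8).normal(0.0005, 0.02, 500)  # toy daily returns
print(historical_var(rets))                                # 5% one-day VaR
```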

  18. Temporal development of extreme precipitation in Germany projected by EURO-CORDEX simulations

    NASA Astrophysics Data System (ADS)

    Brendel, Christoph; Deutschländer, Thomas

    2017-04-01

    A sustainable operation of transport infrastructure requires an enhanced resilience to the increasing impacts of climate change and related extreme meteorological events. To meet this challenge, the German Federal Ministry of Transport and Digital Infrastructure (BMVI) commenced a comprehensive national research program on safe and sustainable transport in Germany. A network of departmental research institutes addresses the "Adaptation of the German transport infrastructure towards climate change and extreme events". Various studies have already identified an increase in average global precipitation for the 20th century. There is some indication that these increases are most visible in a rising frequency of precipitation extremes. However, the changes are highly variable between regions and seasons. With a further increase of atmospheric greenhouse gas concentrations in the 21st century, the likelihood of occurrence of such extreme events will continue to rise. A kernel estimator has been used in order to obtain a robust estimate of the temporal development of extreme precipitation events projected by an ensemble of EURO-CORDEX simulations. The kernel estimator measures the intensity of the Poisson point process, indicating temporal changes in the frequency of extreme events. Extreme precipitation events were selected using the peaks-over-threshold (POT) method with the 90th, 95th and 99th quantiles of daily precipitation sums as thresholds. Application of this non-parametric approach with relative thresholds renders the use of a bias correction non-mandatory. In addition, in comparison to fitting an extreme value theory (EVT) distribution, the method is completely unsusceptible to outliers. First results show an overall increase of extreme precipitation events for Germany until the end of the 21st century. However, major differences between seasons, quantiles and the three Representative Concentration Pathways (RCP 2.6, 4.5, and 8.5) have been identified. For instance, the frequency of extreme precipitation events more than triples in the most extreme scenario. Regional differences are rather small, with the largest increase in northern Germany, particularly in coastal regions, and the weakest increase in the most southern parts of Germany.

  19. Quantifying alteration of river flow regime by large reservoirs in France

    NASA Astrophysics Data System (ADS)

    Cipriani, Thomas; Sauquet, Eric

    2017-04-01

    Reservoirs may strongly modify river flow regime. Knowing the alterations is important to better understand the biological and physical patterns along the river network. However, data are not necessarily available to carry out an analysis of modifications at a national scale, e.g. due to industrial interests or to a lack of measurements. The objective of this study is to quantify the changes in a set of hydrological indices due to large reservoirs in France, combining different data sources. The analysis is based on a comparison between influenced discharges (observed discharges) and natural discharges available from: (i) gauging stations upstream of the dam, (ii) regionalization procedures (Sauquet et al., 2008; Sauquet and Catalogne, 2011; Cipriani et al., 2012), or (iii) historical data free from human influence close to the dam location. The impact of large reservoirs is assessed considering different facets of the river flow regime, including flood quantiles, low flow characteristics, quantiles from the flow duration curve and the twelve mean monthly discharges. The departures from the index representative of natural conditions quantify the effect of reservoir management on the river flow regime. The analysis is based on 62 study cases. Results show a large spread in impact depending on the purposes of the reservoirs and the season of interest. Results also point out inconsistencies in the data (the water balance between outflow and inflow downstream of the dam is not warranted) due to uncertainties in mean monthly discharges and to imperfect knowledge of inflows and outflows. Lastly, we suggest a typology of hydrological alterations based on the purposes of the reservoirs. Cipriani T., Toilliez T., Sauquet E. (2012). Estimating 10 year return period peak flows and flood durations at ungauged locations in France. La Houille Blanche, 4-5: 5-13, doi: 10.1051/lhb/2012024. Sauquet E., Catalogne C. (2011). Comparison of catchment grouping methods for flow duration curve estimation at ungauged sites in France. Hydrology and Earth System Sciences, 15: 2421-2435, doi:10.5194/hess-15-2421-2011. Sauquet E., Gottschalk L., Krasovskaïa I. (2008). Estimating mean monthly runoff at ungauged locations: an application to France. Hydrology Research, 39(5-6): 403-423.

  20. Age- and sex-specific reference limits for creatinine, cystatin C and the estimated glomerular filtration rate.

    PubMed

    Hannemann, Anke; Friedrich, Nele; Dittmann, Kathleen; Spielhagen, Christin; Wallaschofski, Henri; Völzke, Henry; Rettig, Rainer; Endlich, Karlhans; Lendeckel, Uwe; Stracke, Sylvia; Nauck, Matthias

    2011-11-14

    Early detection of patients with chronic kidney disease is of great importance. This study developed reference limits for serum creatinine and serum cystatin C concentrations and for the estimated glomerular filtration rate (eGFR) in healthy subjects from the general population aged 25-65 years. This study defined a reference population including 985 subjects from the first follow-up of the Study of Health in Pomerania. Serum creatinine was measured with a modified kinetic Jaffé method. Serum cystatin C was measured with a nephelometric assay. The eGFR was calculated from serum creatinine according to the Cockcroft-Gault (eGFR(CG)) and the Modification of Diet in Renal Disease (eGFR(MDRD)) equation, respectively, as well as from serum cystatin C according to the formula by Larsson (eGFR(Larsson)). Non-parametric quantile regression was used to estimate the reference limits. For serum creatinine and serum cystatin C the 95th percentile and for eGFR(CG), eGFR(MDRD) and eGFR(Larsson) the 5th percentile were selected as reference limits. All data was weighted to reflect the age- and sex-structure of the German population in 2008. The reference limits for serum creatinine (men: 1.11-1.23 mg/dL; women: 0.93-1.00 mg/dL) and serum cystatin C levels (men: 0.92-1.04 mg/L; women: 0.84-1.02 mg/L) increased with advancing age. The reference limits for eGFR decreased with increasing age (eGFR(CG) men: 106.0-64.7 mL/min, women 84.4-57.9 mL/min; eGFR(MDRD) men: 82.5-62.2 mL/min/1.73 m², women 75.0-58.2 mL/min/1.73 m²; eGFR(Larsson) men: 85.5-72.9 mL/min, women 94.5-75.7 mL/min). This study presents age- and sex-specific reference limits for five measures of renal function based on quantile regression models.
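
    The reference-limit construction can be sketched with a (linear, for simplicity) quantile regression at the 95th percentile; the synthetic creatinine-age data below are illustrative assumptions, not the study's data or its non-parametric estimator.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# toy reference-limit curve: 95th percentile of serum creatinine vs. age
rng = np.random.default_rng(9)
age = rng.uniform(25, 65, 800)
crea = 0.80 + 0.004 * age + rng.gamma(2.0, 0.05, 800)   # mg/dL, synthetic
df = pd.DataFrame({"crea": crea, "age": age})

upper = smf.quantreg("crea ~ age", df).fit(q=0.95)
print(upper.predict(pd.DataFrame({"age": [30.0, 50.0, 65.0]})))  # limits by age
```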

  1. Synoptic and meteorological drivers of extreme ozone concentrations over Europe

    NASA Astrophysics Data System (ADS)

    Otero, Noelia Felipe; Sillmann, Jana; Schnell, Jordan L.; Rust, Henning W.; Butler, Tim

    2016-04-01

    The present work assesses the relationship between local and synoptic meteorological conditions and surface ozone concentration over Europe in spring and summer months, during the period 1998-2012, using a new interpolated data set of observed surface ozone concentrations over the European domain. Along with local meteorological conditions, the influence of large-scale atmospheric circulation on surface ozone is addressed through a set of airflow indices computed with a novel implementation of a grid-by-grid weather type classification across Europe. Drivers of surface ozone over the full distribution of maximum daily 8-hour average values are investigated, along with drivers of the extreme high percentiles and exceedances of air quality guideline thresholds. Three different regression techniques are applied: multiple linear regression to assess the drivers of maximum daily ozone, logistic regression to assess the probability of threshold exceedances, and quantile regression to estimate the meteorological influence on extreme values, as represented by the 95th percentile. The relative importance of the input parameters (predictors) is assessed by a backward stepwise regression procedure that allows the identification of the most important predictors in each model. Spatial patterns of model performance exhibit distinct variations between regions. The inclusion of ozone persistence is particularly relevant over Southern Europe. In general, the best model performance is found over Central Europe, where maximum temperature plays an important role as a driver of maximum daily ozone as well as of its extreme values, especially during the warmer months.

  2. Adolescent mental health and earnings inequalities in adulthood: evidence from the Young-HUNT Study.

    PubMed

    Evensen, Miriam; Lyngstad, Torkild Hovde; Melkevik, Ole; Reneflot, Anne; Mykletun, Arnstein

    2017-02-01

    Previous studies have shown that adolescent mental health problems are associated with lower employment probabilities and risk of unemployment. The evidence on how earnings are affected is much weaker, and few have addressed whether any association reflects unobserved characteristics and whether the consequences of mental health problems vary across the earnings distribution. A population-based Norwegian health survey linked to administrative registry data (N=7885) was used to estimate how adolescents' mental health problems (separate indicators of internalising, conduct, and attention problems and total sum scores) affect earnings (≥30 years) in young adulthood. We used linear regression with fixed-effects models comparing either students within schools or siblings within families. Unconditional quantile regressions were used to explore differentials across the earnings distribution. Mental health problems in adolescence reduce average earnings in adulthood, and associations are robust to control for observed family background and school fixed effects. For some, but not all mental health problems, associations are also robust in sibling fixed-effects models, where all stable family factors are controlled. Further, we found much larger earnings loss below the 25th centile. Adolescent mental health problems reduce adult earnings, especially among individuals in the lower tail of the earnings distribution. Preventing mental health problems in adolescence may increase future earnings.

  3. Relationships between lead biomarkers and diurnal salivary cortisol indices in pregnant women from Mexico City: a cross-sectional study

    PubMed Central

    2014-01-01

    Background Lead (Pb) exposure during pregnancy may increase the risk of adverse maternal, infant, or childhood health outcomes by interfering with hypothalamic-pituitary-adrenal-axis function. We examined relationships between maternal blood or bone Pb concentrations and features of diurnal cortisol profiles in 936 pregnant women from Mexico City. Methods From 2007–11 we recruited women from hospitals/clinics affiliated with the Mexican Social Security System. Pb was measured in blood (BPb) during the second trimester and in mothers’ tibia and patella 1 month postpartum. We characterized maternal HPA-axis function using 10 timed salivary cortisol measurements collected over 2 days (mean: 19.7, range: 14–35 weeks gestation). We used linear mixed models to examine the relationship between Pb biomarkers and cortisol area under the curve (AUC), awakening response (CAR), and diurnal slope. Results After adjustment for confounders, women in the highest quintile of BPb concentrations had a reduced CAR (Ratio: −13%; Confidence Interval [CI]: −24, 1, p-value for trend < 0.05) compared to women in the lowest quintile. Tibia/patella Pb concentrations were not associated with CAR, but diurnal cortisol slopes were suggestively flatter among women in the highest patella Pb quantile compared to women in the lowest quantile (Ratio: 14%; CI: −2, 33). BPb and bone Pb concentrations were not associated with cortisol AUC. Conclusions Concurrent blood Pb levels were associated with cortisol awakening response in these pregnant women and this might explain adverse health outcomes associated with Pb. Further research is needed to confirm these results and determine if other environmental chemicals disrupt hypothalamic-pituitary-adrenal-axis function during pregnancy. PMID:24916609

  4. A nonparametric method for assessment of interactions in a median regression model for analyzing right censored data.

    PubMed

    Lee, MinJae; Rahbar, Mohammad H; Talebi, Hooshang

    2018-01-01

    We propose a nonparametric test for interactions when we are concerned with investigation of the simultaneous effects of two or more factors in a median regression model with right censored survival data. Our approach is developed to detect interaction in special situations where the covariates have a finite number of levels with a limited number of observations in each level, and it allows varying levels of variance and censorship at different levels of the covariates. Through simulation studies, we compare the power of detecting an interaction between the study group variable and a covariate using our proposed procedure with that of the Cox proportional hazards (PH) model and the censored quantile regression model. We also assess the impact of censoring rate and type on the standard error of the estimators of parameters. Finally, we illustrate application of our proposed method to real-life data from the Prospective Observational Multicenter Major Trauma Transfusion (PROMMTT) study to test an interaction effect between type of injury and study sites using median time for a trauma patient to receive three units of red blood cells. The results from simulation studies indicate that our procedure performs better than both the Cox PH model and the censored quantile regression model in terms of statistical power for detecting the interaction, especially when the number of observations is small. It is also relatively less sensitive to censoring rates, and even to the presence of censoring that is conditionally independent given the levels of the covariates.

  5. Hybrid ARIMAX quantile regression method for forecasting short term electricity consumption in east java

    NASA Astrophysics Data System (ADS)

    Prastuti, M.; Suhartono; Salehah, NA

    2018-04-01

    The need for energy supply, especially electricity, has been increasing in Indonesia in recent years. Furthermore, high electricity usage by people at different times of day gives rise to heteroscedasticity. Estimating the electricity supply that can fulfil the community's needs is therefore very important, but heteroscedasticity often makes electricity forecasting difficult. An accurate forecast of electricity consumption is one of the key challenges for energy providers in planning resources and services and in taking control actions to balance electricity supply and demand. In this paper, a hybrid ARIMAX Quantile Regression (ARIMAX-QR) approach is proposed to predict short-term electricity consumption in East Java. This method is also compared to time series regression using the RMSE, MAPE, and MdAPE criteria. The data used in this research were half-hourly electricity consumption during the period September 2015 to April 2016. The results show that the proposed approach is a competitive alternative for forecasting short-term electricity consumption in East Java. ARIMAX-QR using lag values and dummy variables as predictors yields more accurate predictions on both in-sample and out-of-sample data. Moreover, both the time series regression and ARIMAX-QR methods with additional lag values as predictors capture the patterns in the data accurately, producing better predictions than models that do not use additional lag variables.
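
    A minimal sketch of a two-stage hybrid of this flavour, assuming Python/statsmodels; the series, the ARIMA order, the dummy coding and the lag choices are illustrative assumptions rather than the paper's specification:

        import numpy as np
        import pandas as pd
        import statsmodels.api as sm
        import statsmodels.formula.api as smf

        # Invented half-hourly load with a daily cycle; dummies mark time of day.
        rng = np.random.default_rng(2)
        n = 48 * 100
        tod = np.tile(np.arange(48), n // 48)
        y = 100 + 20 * np.sin(2 * np.pi * tod / 48) + rng.normal(0, 5, n)
        exog = pd.get_dummies(tod, drop_first=True).astype(float)

        # Stage 1: ARIMAX captures the mean dynamics with exogenous dummies.
        arimax = sm.tsa.SARIMAX(y, exog=exog, order=(1, 0, 1)).fit(disp=0)

        # Stage 2: quantile regression on lagged load gives percentile forecasts
        # that tolerate the heteroscedasticity in consumption.
        df = pd.DataFrame({"y": y})
        df["lag1"] = df["y"].shift(1)
        df["lag48"] = df["y"].shift(48)
        qr95 = smf.quantreg("y ~ lag1 + lag48", df.dropna()).fit(q=0.95)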

  6. Estimation of effects of factors related to preschooler body mass index using quantile regression model.

    PubMed

    Kim, Hee Soon; Park, Yun Hee; Park, Hyun Bong; Kim, Su Hee

    2014-12-01

    The purpose of this study was to investigate Korean preschoolers' obesity-related factors through an ecological approach and to identify Korean preschoolers' obesity-related factors and the different effects of ecological variables on body mass index and its quantiles through an ecological approach. The study design was cross-sectional. Through convenience sampling, 241 cases were collected from three kindergartens and seven nurseries in the Seoul metropolitan area and Kyunggi Province in April 2013 using self-administered questionnaires from preschoolers' mothers and homeroom teachers. Results of ordinary least square regression analysis show that mother's sedentary behavior (p < .001), sedentary behavior parenting (p = .039), healthy eating parenting (p = .027), physical activity-related social capital (p = .029) were significant factors of preschoolers' body mass index. While in the 5% body mass index distribution group, gender (p = .031), preference for physical activity (p = .015), mother's sedentary behavior parenting (p = .032), healthy eating parenting (p = .005), and teacher's sedentary behavior (p = .037) showed significant influences. In the 25% group, the effects of gender and preference for physical activity were no longer significant. In the 75% and 95% group, only mother's sedentary behavior showed a statistically significant influence (p < .001, p = .012 respectively). Efforts to lower the obesity rate of preschoolers should focus on their environment, especially on the sedentary behavior of mothers, as mothers are the main nurturers of this age group. Copyright © 2014. Published by Elsevier B.V.

  7. Assessment of Weighted Quantile Sum Regression for Modeling Chemical Mixtures and Cancer Risk

    PubMed Central

    Czarnota, Jenna; Gennings, Chris; Wheeler, David C

    2015-01-01

    In evaluation of cancer risk related to environmental chemical exposures, the effect of many chemicals on disease is ultimately of interest. However, because of potentially strong correlations among chemicals that occur together, traditional regression methods suffer from collinearity effects, including regression coefficient sign reversal and variance inflation. In addition, penalized regression methods designed to remediate collinearity may have limitations in selecting the truly bad actors among many correlated components. The recently proposed method of weighted quantile sum (WQS) regression attempts to overcome these problems by estimating a body burden index, which identifies important chemicals in a mixture of correlated environmental chemicals. Our focus was on assessing through simulation studies the accuracy of WQS regression in detecting subsets of chemicals associated with health outcomes (binary and continuous) in site-specific analyses and in non-site-specific analyses. We also evaluated the performance of the penalized regression methods of lasso, adaptive lasso, and elastic net in correctly classifying chemicals as bad actors or unrelated to the outcome. We based the simulation study on data from the National Cancer Institute Surveillance Epidemiology and End Results Program (NCI-SEER) case–control study of non-Hodgkin lymphoma (NHL) to achieve realistic exposure situations. Our results showed that WQS regression had good sensitivity and specificity across a variety of conditions considered in this study. The shrinkage methods had a tendency to incorrectly identify a large number of components, especially in the case of strong association with the outcome. PMID:26005323
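
    A bare-bones sketch of the WQS idea (quartile-score the exposures, then estimate non-negative weights that sum to one), assuming Python/scipy; the data, the quartile scoring and the single least-squares fit are simplifying assumptions, since the full method additionally averages weights over many bootstrap samples and a held-out validation step, which this sketch omits:

        import numpy as np
        from scipy.optimize import minimize

        # Invented mixture: 10 correlated exposures, two true "bad actors".
        rng = np.random.default_rng(3)
        n, p = 500, 10
        X = rng.multivariate_normal(np.zeros(p), 0.5 * np.eye(p) + 0.5, size=n)
        y = 1.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 1, n)

        # Quartile scores 0-3: the "quantile" part of weighted quantile sum.
        Q = np.argsort(np.argsort(X, axis=0), axis=0) * 4 // n

        def sse(params):
            w, b0, b1 = params[:p], params[p], params[p + 1]
            return np.sum((y - (b0 + b1 * Q @ w)) ** 2)

        cons = [{"type": "eq", "fun": lambda pr: np.sum(pr[:p]) - 1.0}]
        bounds = [(0, 1)] * p + [(None, None)] * 2
        start = np.r_[np.full(p, 1.0 / p), 0.0, 0.0]
        res = minimize(sse, start, bounds=bounds, constraints=cons)
        weights = res.x[:p]   # large weights flag the important chemicals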

  9. Influence of Donor and Recipient CYP3A4, CYP3A5, and ABCB1 Genotypes on Clinical Outcomes and Nephrotoxicity in Liver Transplant Recipients.

    PubMed

    Debette-Gratien, Marilyne; Woillard, Jean-Baptiste; Picard, Nicolas; Sebagh, Mylène; Loustaud-Ratti, Véronique; Sautereau, Denis; Samuel, Didier; Marquet, Pierre

    2016-10-01

    This study investigated the influence of the CYP3A4*22, CYP3A5*3, and ABCB1 exons 12, 21, and 26 polymorphisms in donors and recipients on clinical outcomes and renal function in 170 liver transplant patients on cyclosporin A (CsA) or tacrolimus (Tac). Allelic discrimination assays were used for genotyping. Multivariate time-dependent Cox proportional hazard models, multiple linear regression using the generalized estimating equation and linear mixed-effect models were used for statistical analysis. Expression of CYP3A5 by either or both the donor and the recipient was significantly associated with lower Tac, but not CsA, dose-normalized trough levels. In the whole population, graft loss was only significantly associated with longer exposure to high calcineurin inhibitor (CNI) concentrations (hazard ratio, 6.93; 95% confidence interval, 2.13-22.55; P = 0.00129), whereas in the Tac subgroup, the risk of graft loss was significantly higher in recipient CYP3A5*1 expressers (hazard ratio, 3.39; 95% confidence interval, 1.52-7.58; P = 0.0028). Renal function was significantly associated with: (1) baseline modification of diet in renal disease (β = 0.51 ± 0.05; P < 0.0001); (2) duration of patient follow-up (per visit, β = -0.98 ± 0.22; P < 0.0001); and (3) CNI exposure (per quantile increase, β = -2.42 ± 0.59; P < 0.0001). No genetic factor was associated with patient survival, acute rejection, liver function test results, recurrence of viral or other initial liver disease, or renal function. This study confirms the effect of CYP3A5*3 on tacrolimus dose requirement in liver transplantation and shows unexpected associations between the type of, and exposure to, CNI and either chronic rejection or graft loss. None of the genetic polymorphisms studied had a noticeable impact on renal function degradation at 10 years.

  10. Alarm Limits for Intraoperative Drug Infusions: A Report From the Multicenter Perioperative Outcomes Group.

    PubMed

    Berman, Mitchell F; Iyer, Nikhil; Freudzon, Leon; Wang, Shuang; Freundlich, Robert E; Housey, Michelle; Kheterpal, Sachin

    2017-10-01

    Continuous medication infusions are commonly used during surgical procedures. Alarm settings for infusion pumps are considered important for patient safety, but limits are not created in a standardized manner from actual usage data. We estimated 90th and 95th percentile infusion rates from a national database for potential use as upper limit alarm settings. We extracted infusion rate data from 17 major hospitals using intraoperative records provided by the Multicenter Perioperative Outcomes Group for adult surgery between 2008 and 2014. Seven infusions were selected for study: propofol, remifentanil, dexmedetomidine, norepinephrine, phenylephrine, nitroglycerin, and esmolol. Each dosage entry for an infusion during a procedure was included. We estimated the 50th, 90th, and 95th percentile levels for each infusion across institutions, and performed quantile regression to examine factors that might affect the percentile rates, such as use in general anesthesia versus sedation. The median 90th and 95th percentile infusion rates (with interquartile range) for propofol were 150 (140-150) and 170 (150-200) μg/kg/min. Quantile regression demonstrated higher 90th and 95th percentile rates during sedation for gastrointestinal endoscopy than for all surgical procedures performed under general anesthesia. For selected vasoactive medications, the corresponding median 90th and 95th percentile rates (with interquartile range) were norepinephrine 14.0 (9.8-18.1) and 18.3 (12.6-23.9) μg/min, and phenylephrine 60 (55-80) and 80 (75-100) μg/min. Alarm settings based on infusion rate percentile limits would be triggered at predictable rates; i.e., the 95th percentile would be exceeded and an alarm sounded during 1 in 20 infusion rate entries. As a result, institutions could establish pump alarm settings consistent with desired alarm frequency using their own or externally validated usage data. Further study will be needed to determine the optimal percentile for infusion alarm settings.
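
    The percentile-limit idea itself is simple to reproduce; a sketch assuming Python/pandas, with invented drugs and rates:

        import numpy as np
        import pandas as pd

        # Invented infusion-rate entries pooled across procedures.
        rng = np.random.default_rng(4)
        df = pd.DataFrame({
            "drug": rng.choice(["propofol", "phenylephrine"], 10000),
            "rate": rng.gamma(4.0, 25.0, 10000),
        })

        # Candidate upper alarm limits per drug: with the 95th percentile as the
        # limit, an alarm would fire on roughly 1 in 20 rate entries.
        limits = df.groupby("drug")["rate"].quantile([0.50, 0.90, 0.95]).unstack()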

  11. The impact of the 2008 financial crisis on food security and food expenditures in Mexico: a disproportionate effect on the vulnerable

    PubMed Central

    Vilar-Compte, Mireya; Sandoval-Olascoaga, Sebastian; Bernal-Stuart, Ana; Shimoga, Sandhya; Vargas-Bustamante, Arturo

    2015-01-01

    Objective The present paper investigated the impact of the 2008 financial crisis on food security in Mexico and how it disproportionally affected vulnerable households. Design A generalized ordered logistic regression was estimated to assess the impact of the crisis on households’ food security status. An ordinary least squares and a quantile regression were estimated to evaluate the effect of the financial crisis on a continuous proxy measure of food security defined as the share of a household’s current income devoted to food expenditures. Setting Both analyses were performed using pooled cross-sectional data from the Mexican National Household Income and Expenditure Survey 2008 and 2010. Subjects The analytical sample included 29 468 households in 2008 and 27 654 in 2010. Results The generalized ordered logistic model showed that the financial crisis significantly (P < 0·05) decreased the probability of being food secure, mildly or moderately food insecure, compared with being severely food insecure (OR = 0·74). A similar but smaller effect was found when comparing severely and moderately food-insecure households with mildly food-insecure and food-secure households (OR = 0·81). The ordinary least squares model showed that the crisis significantly (P < 0·05) increased the share of total income spent on food (β coefficient of 0·02). The quantile regression confirmed the findings suggested by the generalized ordered logistic model, showing that the effects of the crisis were more profound among poorer households. Conclusions The results suggest that households that were more vulnerable before the financial crisis saw a worsened effect in terms of food insecurity with the crisis. Findings were consistent with both measures of food security – one based on self-reported experience and the other based on food spending. PMID:25428800

  12. Coronary artery calcium distributions in older persons in the AGES-Reykjavik study

    PubMed Central

    Gudmundsson, Elias Freyr; Gudnason, Vilmundur; Sigurdsson, Sigurdur; Launer, Lenore J.; Harris, Tamara B.; Aspelund, Thor

    2013-01-01

    Coronary Artery Calcium (CAC) is a sign of advanced atherosclerosis and an independent risk factor for cardiac events. Here, we describe CAC distributions in an unselected aged population and compare modelling methods to characterize the CAC distribution. CAC is difficult to model because it has a skewed and zero-inflated distribution with over-dispersion. Data are from the AGES-Reykjavik sample, a large population-based study (2002-2006) of 5,764 persons aged 66-96 years in Iceland. Linear regressions using logarithmic and Box-Cox transformations of CAC+1, quantile regression and a Zero-Inflated Negative Binomial model (ZINB) were applied. Methods were compared visually and with the PRESS statistic, R2 and the number of detected associations with concurrently measured variables. There were pronounced differences in CAC according to sex, age, history of coronary events and presence of plaque in the carotid artery. Associations with conventional coronary artery disease (CAD) risk factors varied between the sexes. The ZINB model provided the best results with respect to the PRESS statistic, R2, and predicted proportion of zero scores. The ZINB model detected similar numbers of associations as the linear regression on ln(CAC+1), and usually with the same risk factors. PMID:22990371
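
    For the zero-inflated negative binomial piece, a minimal sketch assuming Python/statsmodels; the simulated outcome and covariates are invented, not AGES-Reykjavik values:

        import numpy as np
        import statsmodels.api as sm
        from statsmodels.discrete.count_model import ZeroInflatedNegativeBinomialP

        # Invented CAC-like outcome: structural zeros plus over-dispersed counts.
        rng = np.random.default_rng(5)
        n = 2000
        age = rng.uniform(66, 96, n)
        male = rng.integers(0, 2, n)
        counts = rng.negative_binomial(1, 0.02, n)
        cac = np.where(rng.random(n) < 0.35, 0, counts)

        # Same covariates in the count and zero-inflation parts (a choice).
        X = sm.add_constant(np.column_stack([age, male]))
        zinb = ZeroInflatedNegativeBinomialP(cac, X, exog_infl=X, p=2).fit(
            maxiter=500, disp=0)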

  13. Drought variability in six catchments in the UK

    NASA Astrophysics Data System (ADS)

    Kwok-Pan, Chun; Onof, Christian; Wheater, Howard

    2010-05-01

    Drought is fundamentally related to persistently low precipitation levels. Changes in global and regional drought patterns are suggested by numerous recent climate change studies. However, most climate change adaptation measures operate at the catchment scale, and the development of a framework for studying persistence in precipitation is still at an early stage. Two stochastic approaches for modelling the drought severity index (DSI) are proposed to investigate possible changes in droughts in six catchments in the UK: the autoregressive integrated moving average (ARIMA) approach and the generalised linear model (GLM) approach. Results of ARIMA modelling show that mean sea level pressure and possibly the North Atlantic Oscillation (NAO) index are important climate variables for short-term drought forecasts, whereas relative humidity is not a significant climate variable despite its high correlation with the DSI series. By simulating rainfall series, the GLM approach can provide the probability density function of the DSI. GLM simulations indicate that changes in the 10th and 50th quantiles of drought events are more noticeable than changes in the 90th quantile, which corresponds to extreme droughts. The possibility of extending the GLM approach to support risk-based water management is also discussed.

  14. Fish and invertebrate flow-biology relationships to support the determination of ecological flows for North Carolina

    USGS Publications Warehouse

    Phelan, Jennifer; Cuffney, Thomas F.; Patterson, Lauren A.; Eddy, Michele; Dykes, Robert; Pearsall, Sam; Goudreau, Chris; Mead, Jim; Tarver, Fred

    2017-01-01

    A method was developed to characterize fish and invertebrate responses to flow alteration in the state of North Carolina. This method involved using 80th percentile linear quantile regressions to relate six flow metrics to the diversity of riffle-run fish and benthic Ephemeroptera, Plecoptera, and Trichoptera (EPT) richness. All twelve flow-biology relationships were found to be significant, with both benthos and fish showing negative responses to ecodeficits and reductions in flow. The responses of benthic richness to reduced flows were consistent and generally greater than that of fish diversity. However, the riffle-run fish guild showed the greatest reductions in diversity in response to summer ecodeficits. The directional consistency and differential seasonal sensitivities of fish and invertebrates to reductions in flow highlight the need to consider seasonality when managing flows. In addition, all relationships were linear, and therefore do not provide clear thresholds to support ecological flow determinations and flow prescriptions to prevent the degradation of fish and invertebrate communities in North Carolina rivers and streams. A method of setting ecological flows based on the magnitude of change in biological condition that is acceptable to society is explored.

  15. Statistical downscaling of precipitation using long short-term memory recurrent neural networks

    NASA Astrophysics Data System (ADS)

    Misra, Saptarshi; Sarkar, Sudeshna; Mitra, Pabitra

    2017-11-01

    Hydrological impacts of global climate change on the regional scale are generally assessed by downscaling large-scale climatic variables, simulated by General Circulation Models (GCMs), to regional, small-scale hydrometeorological variables such as precipitation and temperature. In this study, we propose a new statistical downscaling model based on a Recurrent Neural Network with Long Short-Term Memory which captures the spatio-temporal dependencies in local rainfall. Previous studies have used several other methods, such as linear regression, quantile regression, kernel regression, beta regression, and artificial neural networks. Deep neural networks and recurrent neural networks have been shown to be highly promising in modeling complex and highly non-linear relationships between input and output variables in different domains, and hence we investigated their performance in the task of statistical downscaling. We have tested this model on two datasets—one on precipitation in the Mahanadi basin in India and the second on precipitation in the Campbell River basin in Canada. Our autoencoder coupled long short-term memory recurrent neural network model performs the best compared to other existing methods on both datasets with respect to temporal cross-correlation, mean squared error, and capturing the extremes.
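
    A minimal stand-in for the sequence-to-one part of such a model, assuming Python/Keras; the window length, predictor count and architecture are invented, and the paper's autoencoder coupling is omitted:

        import numpy as np
        from tensorflow import keras

        # Invented tensors: 14-day windows of 8 large-scale predictors mapped
        # to next-day local rainfall.
        rng = np.random.default_rng(6)
        X = rng.normal(size=(2000, 14, 8)).astype("float32")
        y = np.maximum(0, X[:, -1, 0] + rng.normal(0, 0.3, 2000)).astype("float32")

        model = keras.Sequential([
            keras.layers.Input(shape=(14, 8)),
            keras.layers.LSTM(32),                      # temporal dependence
            keras.layers.Dense(1, activation="relu"),   # rainfall is non-negative
        ])
        model.compile(optimizer="adam", loss="mse")
        model.fit(X, y, epochs=5, batch_size=64, verbose=0)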

  16. Experimental and environmental factors affect spurious detection of ecological thresholds

    USGS Publications Warehouse

    Daily, Jonathan P.; Hitt, Nathaniel P.; Smith, David; Snyder, Craig D.

    2012-01-01

    Threshold detection methods are increasingly popular for assessing nonlinear responses to environmental change, but their statistical performance remains poorly understood. We simulated linear change in stream benthic macroinvertebrate communities and evaluated the performance of commonly used threshold detection methods based on model fitting (piecewise quantile regression [PQR]), data partitioning (nonparametric change point analysis [NCPA]), and a hybrid approach (significant zero crossings [SiZer]). We demonstrated that false detection of ecological thresholds (type I errors) and inferences on threshold locations are influenced by sample size, rate of linear change, and frequency of observations across the environmental gradient (i.e., sample-environment distribution, SED). However, the relative importance of these factors varied among statistical methods and between inference types. False detection rates were influenced primarily by user-selected parameters for PQR (τ) and SiZer (bandwidth) and secondarily by sample size (for PQR) and SED (for SiZer). In contrast, the location of reported thresholds was influenced primarily by SED. Bootstrapped confidence intervals for NCPA threshold locations revealed strong correspondence to SED. We conclude that the choice of statistical methods for threshold detection should be matched to experimental and environmental constraints to minimize false detection rates and avoid spurious inferences regarding threshold location.

  17. Can quantile mapping improve precipitation extremes from regional climate models?

    NASA Astrophysics Data System (ADS)

    Tani, Satyanarayana; Gobiet, Andreas

    2015-04-01

    The ability of quantile mapping to accurately bias-correct precipitation extremes is investigated in this study. We developed new methods by extending standard quantile mapping (QMα) to improve the quality of bias-corrected extreme precipitation events as simulated by regional climate model (RCM) output. The new QM version (QMβ) was developed by combining parametric and nonparametric bias correction methods. The new nonparametric method is tested with and without a controlling shape parameter (QMβ1 and QMβ0, respectively). Bias corrections are applied to hindcast simulations for a small ensemble of RCMs at six different locations over Europe. We examined the quality of the extremes for these three bias correction methods through split-sample and cross-validation approaches; the split-sample approach mimics the application to future climate scenarios. A cross-validation framework with particular focus on new extremes was developed. Error characteristics, q-q plots and Mean Absolute Error (MAEx) skill scores are used for evaluation. We demonstrate the unstable behaviour of the correction function at higher quantiles with QMα, whereas the correction functions for QMβ0 and QMβ1 are smoother, with QMβ1 providing the most reasonable correction values. The q-q plots demonstrate that all bias correction methods are capable of producing new extremes, but QMβ1 reproduces new extremes with lower biases in all seasons compared to QMα and QMβ0. Our results clearly demonstrate the inherent limitations of empirical bias correction methods employed for extremes, particularly new extremes, and our findings reveal that the new bias correction method (QMβ1) produces more reliable climate scenarios for new extremes. These findings present a methodology that can better capture future extreme precipitation events, which is necessary to improve regional climate change impact studies.
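
    The standard empirical quantile mapping that the QMβ variants extend can be sketched in a few lines of Python/numpy (the distributions and quantile grid below are illustrative):

        import numpy as np

        def quantile_map(model_hist, obs_hist, model_fut):
            # Map each future model value through the hindcast quantiles onto
            # the observed quantiles; np.interp clamps values beyond the
            # calibration range, which is exactly where extremes suffer and
            # where parametric/smoothed extensions help.
            qs = np.linspace(0.01, 0.99, 99)
            mq = np.quantile(model_hist, qs)
            oq = np.quantile(obs_hist, qs)
            return np.interp(model_fut, mq, oq)

        rng = np.random.default_rng(7)
        obs = rng.gamma(2.0, 5.0, 3000)   # "observed" precipitation
        mod = rng.gamma(2.0, 4.0, 3000)   # biased model hindcast
        fut = rng.gamma(2.2, 4.5, 1000)   # future run containing new extremes
        corrected = quantile_map(mod, obs, fut)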

  18. Diagnostic Imaging Services in Magnet and Non-Magnet Hospitals: Trends in Utilization and Costs.

    PubMed

    Jayawardhana, Jayani; Welton, John M

    2015-12-01

    The purpose of this study was to better understand trends in utilization and costs of diagnostic imaging services at Magnet hospitals (MHs) and non-Magnet hospitals (NMHs). A data set was created by merging hospital-level data from the American Hospital Association's annual survey and Medicare cost reports, individual-level inpatient data from the Healthcare Cost and Utilization Project, and Magnet recognition status data from the American Nurses Credentialing Center. A descriptive analysis was conducted to evaluate the trends in utilization and costs of CT, MRI, and ultrasound procedures among MHs and NMHs in urban locations between 2000 and 2006 from the following ten states: Arizona, California, Colorado, Florida, Iowa, Maryland, North Carolina, New Jersey, New York, and Washington. When matched by bed size, severity of illness (case mix index), and clinical technological sophistication (Saidin index) quantiles, MHs in higher quantiles showed higher rates of utilization of imaging services for MRI, CT, and ultrasound in comparison with NMHs in the same quantiles. However, average costs of MRI, CT, and ultrasound were lower at MHs in comparison with NMHs in the same quantiles. Overall, MHs that are larger in size (number of beds), serve more severely ill patients (case mix index), and are more technologically sophisticated (Saidin index) show higher utilization of diagnostic imaging services, although costs per procedure at MHs are lower in comparison with similar NMHs, indicating possible cost efficiency at MHs. Further research is necessary to understand the relationship between the utilization of diagnostic imaging services among MHs and its impact on patient outcomes.

  19. Use of Quantile Regression to Determine the Impact on Total Health Care Costs of Surgical Site Infections Following Common Ambulatory Procedures.

    PubMed

    Olsen, Margaret A; Tian, Fang; Wallace, Anna E; Nickel, Katelin B; Warren, David K; Fraser, Victoria J; Selvam, Nandini; Hamilton, Barton H

    2017-02-01

    To determine the impact of surgical site infections (SSIs) on health care costs following common ambulatory surgical procedures throughout the cost distribution. Data on costs of SSIs following ambulatory surgery are sparse, particularly variation beyond just mean costs. We performed a retrospective cohort study of persons undergoing cholecystectomy, breast-conserving surgery, anterior cruciate ligament reconstruction, and hernia repair from December 31, 2004 to December 31, 2010 using commercial insurer claims data. SSIs within 90 days post-procedure were identified; infections during a hospitalization or requiring surgery were considered serious. We used quantile regression, controlling for patient, operative, and postoperative factors to examine the impact of SSIs on 180-day health care costs throughout the cost distribution. The incidence of serious and nonserious SSIs was 0.8% and 0.2%, respectively, after 21,062 anterior cruciate ligament reconstruction, 0.5% and 0.3% after 57,750 cholecystectomy, 0.6% and 0.5% after 60,681 hernia, and 0.8% and 0.8% after 42,489 breast-conserving surgery procedures. Serious SSIs were associated with significantly higher costs than nonserious SSIs for all 4 procedures throughout the cost distribution. The attributable cost of serious SSIs increased for both cholecystectomy and hernia repair as the quantile of total costs increased ($38,410 for cholecystectomy with serious SSI vs no SSI at the 70th percentile of costs, up to $89,371 at the 90th percentile). SSIs, particularly serious infections resulting in hospitalization or surgical treatment, were associated with significantly increased health care costs after 4 common surgical procedures. Quantile regression illustrated the differential effect of serious SSIs on health care costs at the upper end of the cost distribution.

  20. Evaluation of CMIP5 continental precipitation simulations relative to satellite-based gauge-adjusted observations

    DOE PAGES

    Mehran, Ali; AghaKouchak, Amir; Phillips, Thomas J.

    2014-02-25

    Numerous studies have emphasized that climate simulations are subject to various biases and uncertainties. The objective of this study is to cross-validate 34 Coupled Model Intercomparison Project Phase 5 (CMIP5) historical simulations of precipitation against the Global Precipitation Climatology Project (GPCP) data, quantifying model pattern discrepancies and biases for both entire data distributions and their upper tails. The results of the Volumetric Hit Index (VHI) analysis of the total monthly precipitation amounts show that most CMIP5 simulations are in good agreement with GPCP patterns in many areas, but that their replication of observed precipitation over arid regions and certain sub-continental regions (e.g., northern Eurasia, eastern Russia, central Australia) is problematic. Overall, the VHI of the multi-model ensemble mean and median also are superior to that of the individual CMIP5 models. However, at high quantiles of reference data (e.g., the 75th and 90th percentiles), all climate models display low skill in simulating precipitation, except over North America, the Amazon, and central Africa. Analyses of total bias (B) in CMIP5 simulations reveal that most models overestimate precipitation over regions of complex topography (e.g. western North and South America and southern Africa and Asia), while underestimating it over arid regions. Also, while most climate model simulations show low biases over Europe, inter-model variations in bias over Australia and Amazonia are considerable. The Quantile Bias (QB) analyses indicate that CMIP5 simulations are even more biased at high quantiles of precipitation. Lastly, we found that a simple mean-field bias removal improves the overall B and VHI values, but does not make a significant improvement in these model performance metrics at high quantiles of precipitation.

  2. Towards a systematic approach to comparing distributions used in flood frequency analysis

    NASA Astrophysics Data System (ADS)

    Bobée, B.; Cavadias, G.; Ashkar, F.; Bernier, J.; Rasmussen, P.

    1993-02-01

    The estimation of flood quantiles from available streamflow records has been a topic of extensive research in this century. However, the large number of distributions and estimation methods proposed in the scientific literature has led to a state of confusion, and a gap prevails between theory and practice. This concerns both at-site and regional flood frequency estimation. To facilitate the work of "hydrologists, designers of hydraulic structures, irrigation engineers and planners of water resources", the World Meteorological Organization recently published a report which surveys and compares current methodologies, and recommends a number of statistical distributions and estimation procedures. This report is an important step towards the clarification of this difficult topic, but we think that it does not effectively satisfy the needs of practitioners as intended, because it contains some statements which are not statistically justified and which require further discussion. In the present paper we review commonly used procedures for flood frequency estimation, point out some of the reasons for the present state of confusion concerning the advantages and disadvantages of the various methods, and propose the broad lines of a possible comparison strategy. We recommend that the results of such comparisons be discussed in an international forum of experts, with the purpose of attaining a more coherent and broadly accepted strategy for estimating floods.

  3. Likelihood-based confidence intervals for estimating floods with given return periods

    NASA Astrophysics Data System (ADS)

    Martins, Eduardo Sávio P. R.; Clarke, Robin T.

    1993-06-01

    This paper discusses aspects of the calculation of likelihood-based confidence intervals for T-year floods, with particular reference to (1) the two-parameter gamma distribution; (2) the Gumbel distribution; (3) the two-parameter log-normal distribution, and other distributions related to the normal by Box-Cox transformations. Calculation of the confidence limits is straightforward using the Nelder-Mead algorithm with a constraint incorporated, although care is necessary to ensure convergence either of the Nelder-Mead algorithm, or of the Newton-Raphson calculation of maximum-likelihood estimates. Methods are illustrated using records from 18 gauging stations in the basin of the River Itajai-Acu, State of Santa Catarina, southern Brazil. A small and restricted simulation compared likelihood-based confidence limits with those given by use of the central limit theorem; for the same confidence probability, the confidence limits of the simulation were wider than those of the central limit theorem, which failed more frequently to contain the true quantile being estimated. The paper discusses possible applications of likelihood-based confidence intervals in other areas of hydrological analysis.
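
    A compact sketch of a profile-likelihood interval for a T-year quantile, here for the Gumbel case with Nelder-Mead as mentioned in the paper, assuming Python/scipy and an invented annual-maximum series:

        import numpy as np
        from scipy.optimize import minimize, minimize_scalar
        from scipy.stats import chi2, gumbel_r

        rng = np.random.default_rng(8)
        ams = gumbel_r.rvs(loc=100, scale=30, size=40, random_state=rng)
        T = 100
        yT = -np.log(-np.log(1 - 1 / T))   # Gumbel reduced variate

        def nll(mu, beta):
            return -np.sum(gumbel_r.logpdf(ams, loc=mu, scale=beta))

        # MLE via Nelder-Mead (log-scale keeps beta positive), then the T-year
        # quantile q = mu + beta * yT.
        mle = minimize(lambda p: nll(p[0], np.exp(p[1])),
                       x0=[ams.mean(), np.log(ams.std())], method="Nelder-Mead")
        q_hat = mle.x[0] + np.exp(mle.x[1]) * yT

        # Profile out beta with q held fixed; the 95% interval collects the q
        # values whose deviance lies within the chi-square(1) cutoff.
        def profile(q):
            return minimize_scalar(lambda b: nll(q - b * yT, b),
                                   bounds=(0.1 * ams.std(), 10 * ams.std()),
                                   method="bounded").fun

        cutoff = mle.fun + chi2.ppf(0.95, df=1) / 2
        grid = np.linspace(0.7 * q_hat, 1.6 * q_hat, 400)
        inside = [q for q in grid if profile(q) <= cutoff]
        ci = (min(inside), max(inside))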

  4. magicaxis: Pretty scientific plotting with minor-tick and log minor-tick support

    NASA Astrophysics Data System (ADS)

    Robotham, Aaron S. G.

    2016-04-01

    The R suite magicaxis makes useful and pretty plots for scientific plotting and includes functions for base plotting, with particular emphasis on pretty axis labelling in a number of circumstances that arise often in scientific work. It also includes functions for generating images and contours that reflect the 2D quantile levels of the data, designed particularly for the output of MCMC posteriors, where visualizing the location of the 68% and 95% 2D quantiles for covariant parameters is a necessary part of post-MCMC analysis. In addition, it can generate low and high error bars, and it allows clipping of values, rejection of bad values, and log stretching.

  5. Evaluation of normalization methods in mammalian microRNA-Seq data

    PubMed Central

    Garmire, Lana Xia; Subramaniam, Shankar

    2012-01-01

    Simple total tag count normalization is inadequate for microRNA sequencing data generated with next generation sequencing technology. However, a systematic evaluation of normalization methods on microRNA sequencing data has so far been lacking. We comprehensively evaluate seven commonly used normalization methods, including global normalization, Lowess normalization, Trimmed Mean Method (TMM), quantile normalization, scaling normalization, variance stabilization, and the invariant method. We assess these methods on two individual experimental data sets with the empirical statistical metrics of mean square error (MSE) and the Kolmogorov-Smirnov (K-S) statistic. Additionally, we evaluate the methods with results from quantitative PCR validation. Our results consistently show that Lowess normalization and quantile normalization perform the best, whereas TMM, a method applied to RNA-Sequencing normalization, performs the worst. The poor performance of TMM normalization is further evidenced by abnormal results from the test of differential expression (DE) of microRNA-Seq data. Compared with the choice of model used for DE, the choice of normalization method is the primary factor affecting the results of DE. In summary, Lowess normalization and quantile normalization are recommended for normalizing microRNA-Seq data, whereas the TMM method should be used with caution. PMID:22532701
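
    Quantile normalization itself is a short algorithm: rank each library, then replace every value by the mean of the values sharing its rank across libraries. A sketch assuming Python/pandas with an invented count matrix:

        import numpy as np
        import pandas as pd

        def quantile_normalize(df):
            # Reference distribution: mean of the column-wise sorted values.
            ref = np.sort(df.values, axis=0).mean(axis=1)
            out = df.copy()
            for col in df.columns:
                order = df[col].rank(method="first").astype(int) - 1
                out[col] = ref[order.values]
            return out

        # Invented miRNA matrix: rows = miRNAs, columns = libraries.
        rng = np.random.default_rng(9)
        counts = pd.DataFrame(rng.poisson([50, 100, 80], size=(500, 3)),
                              columns=["lib1", "lib2", "lib3"])
        normalized = quantile_normalize(np.log1p(counts))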

  6. Bumps in river profiles: uncertainty assessment and smoothing using quantile regression techniques

    NASA Astrophysics Data System (ADS)

    Schwanghart, Wolfgang; Scherler, Dirk

    2017-12-01

    The analysis of longitudinal river profiles is an important tool for studying landscape evolution. However, characterizing river profiles based on digital elevation models (DEMs) suffers from errors and artifacts that particularly prevail along valley bottoms. The aim of this study is to characterize uncertainties that arise from the analysis of river profiles derived from different, near-globally available DEMs. We devised new algorithms - quantile carving and the CRS algorithm - that rely on quantile regression to enable hydrological correction and the uncertainty quantification of river profiles. We find that globally available DEMs commonly overestimate river elevations in steep topography. The distributions of elevation errors become increasingly wider and right skewed if adjacent hillslope gradients are steep. Our analysis indicates that the AW3D DEM has the highest precision and lowest bias for the analysis of river profiles in mountainous topography. The new 12 m resolution TanDEM-X DEM has a very low precision, most likely due to the combined effect of steep valley walls and the presence of water surfaces in valley bottoms. Compared to the conventional approaches of carving and filling, we find that our new approach is able to reduce the elevation bias and errors in longitudinal river profiles.

  7. L-moments and TL-moments of the generalized lambda distribution

    USGS Publications Warehouse

    Asquith, W.H.

    2007-01-01

    The 4-parameter generalized lambda distribution (GLD) is a flexible distribution capable of mimicking the shapes of many distributions and data samples, including those with heavy tails. The method of L-moments and the recently developed method of trimmed L-moments (TL-moments) are attractive techniques for parameter estimation for heavy-tailed distributions for which the L- and TL-moments have been defined. Analytical solutions for the first five L- and TL-moments in terms of GLD parameters are derived. Unfortunately, numerical methods are needed to compute the parameters from the L- or TL-moments. Algorithms are suggested for parameter estimation. Application of the GLD using both L- and TL-moment parameter estimates from example data is demonstrated, and comparison with the L-moment fit of the 4-parameter kappa distribution is made. A small simulation study of the 98th percentile (far-right tail) is conducted for a heavy-tailed GLD with high-outlier contamination. The simulations show, with respect to estimation of the 98th percentile, that TL-moments are less biased (more robust) in the presence of high-outlier contamination. However, the robustness comes at the expense of considerably more sampling variability.
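
    The first few sample L-moments are easy to compute from probability-weighted moments; a sketch assuming Python/numpy (the heavy-tailed test sample is invented):

        import numpy as np

        def sample_lmoments(x):
            # Unbiased probability-weighted moments b0..b3 of the ordered sample.
            x = np.sort(np.asarray(x, dtype=float))
            n = len(x)
            i = np.arange(1, n + 1)
            b0 = x.mean()
            b1 = np.sum((i - 1) / (n - 1) * x) / n
            b2 = np.sum((i - 1) * (i - 2) / ((n - 1) * (n - 2)) * x) / n
            b3 = np.sum((i - 1) * (i - 2) * (i - 3)
                        / ((n - 1) * (n - 2) * (n - 3)) * x) / n
            l1, l2 = b0, 2 * b1 - b0
            l3 = 6 * b2 - 6 * b1 + b0
            l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
            return l1, l2, l3 / l2, l4 / l2   # mean, L-scale, L-skew, L-kurtosis

        rng = np.random.default_rng(10)
        heavy = rng.pareto(3.0, 1000) + 1     # invented heavy-tailed sample
        print(sample_lmoments(heavy))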

  8. Measuring bulrush culm relationships to estimate plant biomass within a southern California treatment wetland

    USGS Publications Warehouse

    Daniels, Joan S. (Thullen); Cade, Brian S.; Sartoris, James J.

    2010-01-01

    Assessment of emergent vegetation biomass can be time consuming and labor intensive. To establish a less onerous, yet accurate, method for determining emergent plant biomass than direct measurement, we collected vegetation data over a six-year period and modeled biomass using easily obtained variables: culm (stem) diameter, culm height and culm density. From 1998 through 2005, we collected emergent vegetation samples (Schoenoplectus californicus and Schoenoplectus acutus) at a constructed treatment wetland in San Jacinto, California during spring and fall. Various statistical models were fitted to the data to determine the strongest relationships. We found that the nonlinear relationship CB = β0 × DH^β1 × 10^ε, where CB was dry culm biomass (g m−2), DH was density of culms × average height of culms in a plot, and β0 and β1 were parameters to estimate, proved to be the best fit for predicting dried-live above-ground biomass of the two Schoenoplectus species. The random error distribution, ε, was either assumed to be normally distributed for mean regression estimates or assumed to be an unspecified continuous distribution for quantile regression estimates.
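
    Because log10 linearizes the relationship (log10 CB = log10 β0 + β1 log10 DH + ε), the mean fit reduces to ordinary least squares and the quantile estimates come from quantile regression on the same form; a sketch assuming Python/statsmodels with invented culm data:

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        # Invented culm data: DH = density x average height, CB = dry biomass.
        rng = np.random.default_rng(11)
        dh = rng.uniform(50, 5000, 200)
        cb = 0.9 * dh ** 0.8 * 10 ** rng.normal(0, 0.1, 200)
        df = pd.DataFrame({"logCB": np.log10(cb), "logDH": np.log10(dh)})

        mean_fit = smf.ols("logCB ~ logDH", df).fit()            # mean regression
        q90_fit = smf.quantreg("logCB ~ logDH", df).fit(q=0.9)   # quantile version
        b0 = 10 ** mean_fit.params["Intercept"]                  # back-transformed
        b1 = mean_fit.params["logDH"]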

  9. Preliminary assessment of factors influencing riverine fish communities in Massachusetts.

    USGS Publications Warehouse

    Armstrong, David S.; Richards, Todd A.; Brandt, Sara L.

    2010-01-01

    The U.S. Geological Survey, in cooperation with the Massachusetts Department of Conservation and Recreation (MDCR), Massachusetts Department of Environmental Protection (MDEP), and the Massachusetts Department of Fish and Game (MDFG), conducted a preliminary investigation of fish communities in small- to medium-sized Massachusetts streams. The objective of this investigation was to determine relations between fish-community characteristics and anthropogenic alteration, including flow alteration and impervious cover, relative to the effect of physical basin and land-cover (environmental) characteristics. Fish data were obtained for 756 fish-sampling sites from the Massachusetts Division of Fisheries and Wildlife fish-community database. A review of the literature was used to select a set of fish metrics responsive to flow alteration. Fish metrics tested include two fish-community metrics (fluvial-fish relative abundance and fluvial-fish species richness), and five indicator species metrics (relative abundance of brook trout, blacknose dace, fallfish, white sucker, and redfin pickerel). Streamflows were simulated for each fish-sampling site using the Sustainable Yield Estimator application (SYE). Daily streamflows and the SYE water-use database were used to determine a set of indicators of flow alteration, including percent alteration of August median flow, water-use intensity, and withdrawal and return-flow fraction. The contributing areas to the fish-sampling sites were delineated and used with a Geographic Information System (GIS) to determine a set of environmental characteristics, including elevation, basin slope, percent sand and gravel, percent wetland, and percent open water, and a set of anthropogenic-alteration variables, including impervious cover and dam density. Two analytical techniques, quantile regression and generalized linear modeling, were applied to determine the association between fish-response variables and the selected environmental and anthropogenic explanatory variables. Quantile regression indicated that flow alteration and impervious cover were negatively associated with both fluvial-fish relative abundance and fluvial-fish species richness. Three generalized linear models (GLMs) were developed to quantify the response of fish communities to multiple environmental and anthropogenic variables. Flow-alteration variables are statistically significant for the fluvial-fish relative-abundance model. Impervious cover is statistically significant for the fluvial-fish relative-abundance, fluvial-fish species richness, and brook trout relative-abundance models. The variables in the equations were demonstrated to be significant, and the variability explained by the models, as measured by the correlation between observed and predicted values, ranges from 39 to 65 percent. The GLM models indicated that, keeping all other variables the same, a one-unit (1 percent) increase in the percent depletion or percent surcharging of August median flow would result in a 0.4-percent decrease in the relative abundance (in counts per hour) of fluvial fish and that the relative abundance of fluvial fish was expected to be about 55 percent lower in net-depleted streams than in net-surcharged streams. The GLM models also indicated that a unit increase in impervious cover resulted in a 5.5-percent decrease in the relative abundance of fluvial fish and a 2.5-percent decrease in fluvial-fish species richness.

  10. Residual uncertainty estimation using instance-based learning with applications to hydrologic forecasting

    NASA Astrophysics Data System (ADS)

    Wani, Omar; Beckers, Joost V. L.; Weerts, Albrecht H.; Solomatine, Dimitri P.

    2017-08-01

    A non-parametric method is applied to quantify residual uncertainty in hydrologic streamflow forecasting. This method acts as a post-processor on deterministic model forecasts and generates a residual uncertainty distribution. Based on instance-based learning, it uses a k nearest-neighbour search for similar historical hydrometeorological conditions to determine uncertainty intervals from a set of historical errors, i.e. discrepancies between past forecast and observation. The performance of this method is assessed using test cases of hydrologic forecasting in two UK rivers: the Severn and Brue. Forecasts in retrospect were made and their uncertainties were estimated using kNN resampling and two alternative uncertainty estimators: quantile regression (QR) and uncertainty estimation based on local errors and clustering (UNEEC). Results show that kNN uncertainty estimation produces accurate and narrow uncertainty intervals with good probability coverage. Analysis also shows that the performance of this technique depends on the choice of search space. Nevertheless, the accuracy and reliability of uncertainty intervals generated using kNN resampling are at least comparable to those produced by QR and UNEEC. It is concluded that kNN uncertainty estimation is an interesting alternative to other post-processors, like QR and UNEEC, for estimating forecast uncertainty. Apart from its concept being simple and well understood, an advantage of this method is that it is relatively easy to implement.
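
    The kNN resampling step reduces to a nearest-neighbour search plus empirical quantiles of the pooled errors; a sketch assuming Python/scikit-learn with invented forecast conditions:

        import numpy as np
        from sklearn.neighbors import NearestNeighbors

        # Invented history: hydrometeorological features and forecast errors.
        rng = np.random.default_rng(12)
        X_hist = rng.normal(size=(5000, 3))
        errors = rng.normal(0, 1 + 0.5 * np.abs(X_hist[:, 0]))  # heteroscedastic
        x_new = np.array([[1.2, -0.3, 0.7]])

        # Pool the errors of the k most similar past situations and read off
        # empirical quantiles as the predictive interval for the new forecast.
        k = 200
        nn = NearestNeighbors(n_neighbors=k).fit(X_hist)
        _, idx = nn.kneighbors(x_new)
        interval = np.quantile(errors[idx[0]], [0.05, 0.5, 0.95])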

  11. Estimation of local extreme suspended sediment concentrations in California Rivers.

    PubMed

    Tramblay, Yves; Saint-Hilaire, André; Ouarda, Taha B M J; Moatar, Florentina; Hecht, Barry

    2010-09-01

    The total amount of suspended sediment load carried by a stream during a year is usually transported during one or several extreme events related to high river flow and intense rainfall, leading to very high suspended sediment concentrations (SSCs). In this study, quantiles of SSC derived from annual maxima, along with the 99th percentile of the SSC series, are estimated locally in a site-specific approach using regional information. Analyses of relationships between physiographic characteristics and the selected indicators were undertaken using localities within a 5-km radius draining to each sampling site. Multiple regression models were built to test the regional estimation of these indicators of suspended sediment transport. To assess the accuracy of the estimates, a jackknife resampling procedure was used to compute the relative bias and root mean square error of the models. Results show that for the 19 stations considered in California, the extreme SSCs can be estimated with 40-60% uncertainty, depending on the presence of flow regulation in the basin. This modelling approach is likely to prove functional in other Mediterranean-climate watersheds since it appears useful in California, where geologic, climatic, physiographic, and land-use conditions are highly variable.

  12. A comparison of moment-based methods of estimation for the log Pearson type 3 distribution

    NASA Astrophysics Data System (ADS)

    Koutrouvelis, I. A.; Canavos, G. C.

    2000-06-01

    The log Pearson type 3 distribution is a very important model in statistical hydrology, especially for modeling annual flood series. In this paper we compare the various methods based on moments for estimating quantiles of this distribution. Besides the methods of direct and mixed moments which were found most successful in previous studies and the well-known indirect method of moments, we develop generalized direct moments and generalized mixed moments methods and a new method of adaptive mixed moments. The last method chooses the orders of two moments for the original observations by utilizing information contained in the sample itself. The results of Monte Carlo experiments demonstrated the superiority of this method in estimating flood events of high return periods when a large sample is available and in estimating flood events of low return periods regardless of the sample size. In addition, a comparison of simulation and asymptotic results shows that the adaptive method may be used for the construction of meaningful confidence intervals for design events based on the asymptotic theory even with small samples. The simulation results also point to the specific members of the class of generalized moments estimates which maintain small values for bias and/or mean square error.
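
    The indirect method of moments for the log Pearson type 3 distribution reduces to matching the mean, standard deviation and skew of the log-transformed flows; a sketch assuming Python/scipy with an invented flood series:

        import numpy as np
        from scipy.stats import pearson3, skew

        rng = np.random.default_rng(13)
        flows = rng.lognormal(mean=6.0, sigma=0.5, size=60)   # invented series

        # Fit Pearson III to log10 flows by moments, then back-transform the
        # quantile to get the T-year flood (here T = 100).
        logs = np.log10(flows)
        g = skew(logs, bias=False)
        q100_log = pearson3.ppf(1 - 1 / 100, g,
                                loc=logs.mean(), scale=logs.std(ddof=1))
        q100 = 10 ** q100_log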

  13. Financial risk protection from social health insurance.

    PubMed

    Barnes, Kayleigh; Mukherji, Arnab; Mullen, Patrick; Sood, Neeraj

    2017-09-01

    This paper estimates the impact of social health insurance on financial risk by utilizing data from a natural experiment created by the phased roll-out of a social health insurance program for the poor in India. We estimate the distributional impact of insurance on out-of-pocket costs and incorporate these results with a stylized expected utility model to compute associated welfare effects. We adjust the standard model, accounting for conditions of developing countries by incorporating consumption floors, informal borrowing, and asset selling, which allow us to separate the value of financial risk reduction from consumption smoothing and asset protection. Results show that insurance reduces out-of-pocket costs, particularly in higher quantiles of the distribution. We find reductions in the frequency and amount of money borrowed for health reasons. Finally, we find that the value of financial risk reduction outweighs total per household costs of the insurance program by two to five times.

  14. Changes in seasonal streamflow extremes experienced in rivers of Northwestern South America (Colombia)

    NASA Astrophysics Data System (ADS)

    Pierini, J. O.; Restrepo, J. C.; Aguirre, J.; Bustamante, A. M.; Velásquez, G. J.

    2017-04-01

    A measure of the variability in seasonal extreme streamflow was estimated for the Colombian Caribbean coast, using monthly time series of freshwater discharge from ten watersheds. The aim was to detect modifications in the monthly streamflow distribution, seasonal trends, variance, and extreme monthly values. A 20-year moving window, advanced in successive 1-year steps, was applied to the monthly series to analyze the seasonal variability of streamflow. The seasonal-windowed data were fitted with the Gamma distribution function; scale and shape parameters were computed using maximum likelihood estimation (MLE) and the bootstrap method with 1000 resamples. A trend analysis was performed for each windowed series, allowing detection of the window with the maximum absolute trend values. Significant temporal shifts in the seasonal streamflow distribution and quantiles (QT) were obtained for different frequencies. Wet and dry extreme periods increased significantly in the last decades, although the increase did not occur simultaneously throughout the region. Some locations exhibited continuous increases only at minimum QT.
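
    A minimal sketch of the windowed Gamma fitting with bootstrapped parameter uncertainty, assuming a single synthetic series and fewer resamples than the study's 1000:

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(2)
        years = np.arange(1950, 2011)
        flow = rng.gamma(shape=2.0, scale=50.0, size=years.size)  # seasonal flows (synthetic)

        window, n_boot = 20, 200    # the study uses 1000 bootstrap resamples
        for start in range(0, years.size - window + 1, 10):  # every 10th window for brevity
            sample = flow[start:start + window]
            shape, _, scale = stats.gamma.fit(sample, floc=0)   # MLE, location fixed at 0
            boot_shapes = [stats.gamma.fit(rng.choice(sample, size=window), floc=0)[0]
                           for _ in range(n_boot)]
            lo, hi = np.percentile(boot_shapes, [5, 95])
            print(f"{years[start]}-{years[start + window - 1]}: "
                  f"shape={shape:.2f} (90% bootstrap CI {lo:.2f}-{hi:.2f})")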

  15. High Risk Flash Flood Rainstorm Mapping Based on Regional L-moments Approach

    NASA Astrophysics Data System (ADS)

    Ding, Hui; Liao, Yifan; Lin, Bingzhang

    2017-04-01

    Difficulties and complexities in elaborating flash flood early-warning and forecasting systems prompt hydrologists to develop techniques that can substantially reduce the disastrous outcome of a flash flood in advance. This paper proposes an approach for specifying, within a relatively large region, those areas that are at high risk of flash flooding in terms of rainfall intensity. It is accomplished through the design of the High Risk Flash Flood Rainstorm Area (HRFFRA), based on statistical analysis of historical rainfall data, synoptic analysis of prevailing storm rainfalls, and field surveys of historical flash flood events in the region. A HRFFRA is defined as an area potentially hit by high-intensity precipitation of a given duration and return period that may cause a flash flood disaster in the area. This paper presents in detail the development of the HRFFRA through the application of the end-to-end Regional L-moments Approach (RLMA) to precipitation frequency analysis, in combination with spatial interpolation techniques, in Jiangxi Province, South China Mainland. Among others, the concept of a hydrometeorologically homogeneous region, the precision of frequency analysis in terms of parameter estimation, the accuracy of quantiles in terms of uncertainties, and the consistency adjustments of quantiles over durations and space are addressed. Finally, the mapping of the HRFFRA and an internet-based, visualized, user-friendly data server for the HRFFRA are introduced. Key words: HRFFRA; Flash Flood; RLMA; rainfall intensity; hydrometeorologically homogeneous region.
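
    The approach rests on sample L-moments; a minimal sketch of their computation from probability-weighted moments, using Hosking's standard unbiased estimators (this illustrates only the building block, not the full RLMA workflow):

        import numpy as np

        def sample_lmoments(x):
            """First three sample L-moments via probability-weighted moments."""
            x = np.sort(np.asarray(x, dtype=float))
            n = x.size
            i = np.arange(1, n + 1)
            b0 = x.mean()
            b1 = np.sum((i - 1) / (n - 1) * x) / n
            b2 = np.sum((i - 1) * (i - 2) / ((n - 1) * (n - 2)) * x) / n
            l1 = b0                      # L-mean
            l2 = 2 * b1 - b0             # L-scale
            l3 = 6 * b2 - 6 * b1 + b0    # third L-moment
            return l1, l2, l3

        rain = [23.0, 41.5, 30.2, 55.8, 28.9, 64.1, 37.7, 45.3]   # annual maxima (synthetic)
        l1, l2, l3 = sample_lmoments(rain)
        print(f"L-CV = {l2 / l1:.3f}, L-skewness = {l3 / l2:.3f}")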

  16. Overweight and Obesity in Southern Italy: their association with social and life-style characteristics and their effect on levels of biologic markers.

    PubMed

    Osella, Alberto R; Díaz, María Del Pilar; Cozzolongo, Rafaelle; Bonfiglio, Caterina; Franco, Isabella; Abrescia, Daniela Isabel; Bianco, Antonella; Giampiero, Elba Silvana; Petruzzi, José; Elsa, Lanzilota; Mario, Correale; Mastrosimni, Anna María; Giocchino, Leandro

    2014-01-01

    In the last decades, overweight and obesity have been transformed from minor public health issues into a major threat to public health, affecting the most affluent societies as well as less developed ones. The aims were to estimate overweight and obesity prevalence in adults and their association with some social determinants, and to assess the effect of these two conditions on levels of biologic and biochemical characteristics, by means of a population-based study. A random sample of the general population of Putignano was drawn. All participants completed a general pre-coded questionnaire and a food frequency questionnaire; anthropometric measures were taken and a venous blood sample was drawn. All subjects underwent liver ultrasonography. Data were described by means of tables, and quantile regression was then performed. Overall prevalence of overweight and obesity was 34.5% and 16.1%, respectively. Both overweight and obesity were more frequent among male, married, and low socio-economic position subjects. The frequency of normal weight increased with higher levels of education. Overweight and obese subjects more frequently had nonalcoholic fatty liver disease, hypertension, and altered biochemical markers. Quantile regression showed a statistically significant association of age (peaking at about 64.8 years), gender (female), and low levels of education with both overweight and obesity. Wine intake of more than 10 g/day was associated with overweight. Prevention and treatment of overweight/obesity on a population-wide basis are needed. Population-based strategies should also improve social and physical environmental contexts to support healthful lifestyles.

  17. SLOPE—ADAPTIVE VARIABLE SELECTION VIA CONVEX OPTIMIZATION

    PubMed Central

    Bogdan, Małgorzata; van den Berg, Ewout; Sabatti, Chiara; Su, Weijie; Candès, Emmanuel J.

    2015-01-01

    We introduce a new estimator for the vector of coefficients β in the linear model y = Xβ + z, where X has dimensions n × p with p possibly larger than n. SLOPE, short for Sorted L-One Penalized Estimation, is the solution to $\min_{b \in \mathbb{R}^p} \frac{1}{2} \|y - Xb\|_{\ell_2}^2 + \lambda_1 |b|_{(1)} + \lambda_2 |b|_{(2)} + \cdots + \lambda_p |b|_{(p)}$, where λ1 ≥ λ2 ≥ … ≥ λp ≥ 0 and |b|(1) ≥ |b|(2) ≥ … ≥ |b|(p) are the decreasing absolute values of the entries of b. This is a convex program, and we demonstrate a solution algorithm whose computational complexity is roughly comparable to that of classical ℓ1 procedures such as the Lasso. Here, the regularizer is a sorted ℓ1 norm, which penalizes the regression coefficients according to their rank: the higher the rank (that is, the stronger the signal), the larger the penalty. This is similar to the Benjamini and Hochberg [J. Roy. Statist. Soc. Ser. B 57 (1995) 289-300] procedure (BH), which compares more significant p-values with more stringent thresholds. One notable choice of the sequence {λi} is given by the BH critical values $\lambda_{\mathrm{BH}}(i) = z(1 - i \cdot q / (2p))$, where q ∈ (0, 1) and z(α) is the α-quantile of a standard normal distribution. SLOPE aims to provide finite sample guarantees on the selected model; of special interest is the false discovery rate (FDR), defined as the expected proportion of irrelevant regressors among all selected predictors. Under orthogonal designs, SLOPE with λBH provably controls FDR at level q. Moreover, it also appears to have appreciable inferential properties under more general designs X while having substantial power, as demonstrated in a series of experiments running on both simulated and real data. PMID:26709357
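
    As a small illustration, a sketch that computes the BH critical values and evaluates the sorted-ℓ1 regularizer (the proximal algorithm that actually solves the SLOPE program is not reproduced here):

        import numpy as np
        from scipy.stats import norm

        def lambda_bh(p, q):
            """BH critical values: lambda_i = Phi^{-1}(1 - i*q/(2p)), i = 1..p."""
            i = np.arange(1, p + 1)
            return norm.ppf(1 - i * q / (2 * p))

        def sorted_l1(b, lam):
            """Sorted-l1 norm: sum_i lam_i * |b|_(i), with |b| sorted decreasingly."""
            return np.sum(lam * np.sort(np.abs(b))[::-1])

        lam = lambda_bh(p=5, q=0.1)
        b = np.array([0.2, -3.0, 0.0, 1.1, -0.4])
        print(lam)                 # decreasing, so larger |b| receive larger penalties
        print(sorted_l1(b, lam))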

  18. The N-shaped environmental Kuznets curve: an empirical evaluation using a panel quantile regression approach.

    PubMed

    Allard, Alexandra; Takman, Johanna; Uddin, Gazi Salah; Ahmed, Ali

    2018-02-01

    We evaluate the N-shaped environmental Kuznets curve (EKC) using panel quantile regression analysis. We investigate the relationship between CO2 emissions and GDP per capita for 74 countries over the period 1994-2012. We include additional explanatory variables, such as renewable energy consumption, technological development, trade, and institutional quality. We find evidence for the N-shaped EKC in all income groups, except for the upper-middle-income countries. Heterogeneous characteristics are, however, observed over the N-shaped EKC. Finally, we find a negative relationship between renewable energy consumption and CO2 emissions, which highlights the importance of promoting greener energy in order to combat global warming.
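
    The N shape corresponds to a cubic income term; a minimal sketch of such a test with statsmodels' quantile regression on synthetic data (the study's panel estimator and control variables are not reproduced):

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(3)
        n = 500
        gdp = rng.uniform(0.5, 5.0, n)                    # GDP per capita (synthetic units)
        co2 = 1 + 2*gdp - 0.9*gdp**2 + 0.12*gdp**3 + rng.normal(0, 0.3, n)  # N-shaped + noise
        df = pd.DataFrame({"co2": co2, "gdp": gdp})

        # Fit the cubic specification at several conditional quantiles
        for tau in (0.25, 0.50, 0.75):
            fit = smf.quantreg("co2 ~ gdp + I(gdp**2) + I(gdp**3)", df).fit(q=tau)
            print(tau, fit.params.round(3).to_dict())
        # An N-shaped EKC shows up as positive, negative, positive signs on the
        # linear, squared, and cubic terms, respectively.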

  19. Bias Correction of Satellite Precipitation Products (SPPs) using a User-friendly Tool: A Step in Enhancing Technical Capacity

    NASA Astrophysics Data System (ADS)

    Rushi, B. R.; Ellenburg, W. L.; Adams, E. C.; Flores, A.; Limaye, A. S.; Valdés-Pineda, R.; Roy, T.; Valdés, J. B.; Mithieu, F.; Omondi, S.

    2017-12-01

    SERVIR, a joint NASA-USAID initiative, works to build capacity in Earth observation technologies in developing countries for improved environmental decision making in the areas of weather and climate, water and disasters, food security, and land use/land cover. SERVIR partners with leading regional organizations in Eastern and Southern Africa, the Hindu Kush-Himalaya, the Mekong region, and West Africa to achieve its objectives. SERVIR develops hydrological applications to address specific needs articulated by key stakeholders, and daily rainfall estimates are a vital input for these applications. Satellite-derived rainfall is subject to systematic biases that need to be corrected before it can be used for any hydrologic application such as real-time or seasonal forecasting. SERVIR and the SWAAT team at the University of Arizona have co-developed an open-source, user-friendly tool implementing rainfall bias-correction approaches for SPPs. The bias-correction tools were developed based on linear scaling and quantile mapping techniques. A set of SPPs, such as PERSIANN-CCS, TMPA-RT, and CMORPH, are bias corrected using Climate Hazards Group InfraRed Precipitation with Station (CHIRPS) data, which incorporates ground-based precipitation observations. The tool also contains a component included to improve the monthly mean of CHIRPS using precipitation products of the Global Surface Summary of the Day (GSOD) database developed by the National Climatic Data Center (NCDC). The tool takes input from the command line, which makes it user-friendly and applicable on any operating platform without prior programming skills. This presentation will focus on this bias-correction tool for SPPs, including application scenarios.
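
    A minimal sketch of empirical quantile mapping, one of the two bias-correction techniques named above, under simplifying assumptions (no dry-day handling or monthly stratification, which an operational tool would need). Linear scaling would instead multiply by ref.mean()/sat.mean():

        import numpy as np

        def quantile_map(sat, ref, values):
            """Map 'values' from the satellite distribution onto the reference one."""
            sat = np.sort(sat)
            ref = np.sort(ref)
            # Empirical non-exceedance probability of each input value in 'sat'
            probs = np.searchsorted(sat, values, side="right") / sat.size
            probs = np.clip(probs, 0.0, 1.0)
            # Corresponding quantile of the reference (CHIRPS-like) distribution
            return np.quantile(ref, probs)

        rng = np.random.default_rng(4)
        sat = rng.gamma(1.5, 4.0, 3000)     # biased satellite rainfall (synthetic)
        ref = rng.gamma(2.0, 3.0, 3000)     # gauge-adjusted reference rainfall (synthetic)
        new = rng.gamma(1.5, 4.0, 5)        # new satellite values to correct
        print(quantile_map(sat, ref, new))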

  20. Understanding the Impact of Extreme Temperature on Crop Production in Karnataka in India

    NASA Astrophysics Data System (ADS)

    Mahato, S.; Murari, K. K.; Jayaraman, T.

    2017-12-01

    The impact of extreme temperature on crop yield is seldom explored in work on climate change impacts on agriculture, and existing studies are restricted mainly to crops such as wheat and maize. Since different agro-climatic zones bear different crops and cropping patterns, it is important to explore the nature of the impact of changes in climate variables on agricultural systems under differential conditions. This study explores the effects of temperature rise on the major crops paddy, jowar, ragi, and tur in the state of Karnataka in southern India. The choice of the unit of study for understanding the impact of climate variability on crop yields is largely restricted by the availability of data for the unit. While previous studies have dealt with this issue by replacing yield with NDVI at finer resolution, the use of an index in place of yield data has its limitations and may not reflect the true estimates. For this study, the unit considered is the taluk, i.e. the sub-district level. Crop yields for each taluk were obtained for the years 1995 to 2011 by aggregating point yield data from crop cutting experiments for each year across the taluks. The long-term temperature data show a significantly increasing trend that ranges between 0.6 and 0.75 °C across Karnataka. Further, the analysis suggests a warming trend in seasonal average temperature for the Kharif and Rabi seasons across districts. The study also found that many districts exhibit a tendency toward occurrence of extreme temperature days, which is of particular concern because exposure of crops to extreme temperature has negative consequences for crop production and productivity. Using growing degree days (GDD), extreme degree days (EDD), and total seasonal rainfall as predictor variables, a fixed-effects model shows that EDD is a more influential parameter than GDD and rainfall, with a statistically significant negative effect in most cases. Further, quantile regression was used to evaluate the robustness of the EDD estimates in relation to crop yield; the estimates were robust across quantiles for most of the crops studied, indicating a strong negative influence of exposure to extreme temperature on crop yield in the region.
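
    A minimal sketch of the two temperature predictors under their common definitions; the base and extreme thresholds below are illustrative assumptions, since the abstract does not state the cutoffs used:

        import numpy as np

        def gdd_edd(tavg, t_base=10.0, t_extreme=34.0):
            """Growing degree days and extreme degree days over a season.

            GDD accumulates daily warmth above t_base (capped at t_extreme);
            EDD accumulates warmth beyond t_extreme. Thresholds are illustrative.
            """
            tavg = np.asarray(tavg, dtype=float)
            gdd = np.clip(np.minimum(tavg, t_extreme) - t_base, 0.0, None).sum()
            edd = np.clip(tavg - t_extreme, 0.0, None).sum()
            return gdd, edd

        season = [22.0, 28.5, 33.0, 36.2, 35.1, 30.0, 25.4]   # daily mean temps (synthetic)
        print(gdd_edd(season))   # EDD > 0 only on the two days above 34 degrees C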

  1. Extreme climatic events drive mammal irruptions: regression analysis of 100-year trends in desert rainfall and temperature

    PubMed Central

    Greenville, Aaron C; Wardle, Glenda M; Dickman, Chris R

    2012-01-01

    Extreme climatic events, such as flooding rains, extended decadal droughts and heat waves have been identified increasingly as important regulators of natural populations. Climate models predict that global warming will drive changes in rainfall and increase the frequency and severity of extreme events. Consequently, to anticipate how organisms will respond we need to document how changes in extremes of temperature and rainfall compare to trends in the mean values of these variables and over what spatial scales the patterns are consistent. Using the longest historical weather records available for central Australia – 100 years – and quantile regression methods, we investigate if extreme climate events have changed at similar rates to median events, if annual rainfall has increased in variability, and if the frequency of large rainfall events has increased over this period. Specifically, we compared local (individual weather stations) and regional (Simpson Desert) spatial scales, and quantified trends in median (50th quantile) and extreme weather values (5th, 10th, 90th, and 95th quantiles). We found that median and extreme annual minimum and maximum temperatures have increased at both spatial scales over the past century. Rainfall changes have been inconsistent across the Simpson Desert; individual weather stations showed increases in annual rainfall, increased frequency of large rainfall events or more prolonged droughts, depending on the location. In contrast to our prediction, we found no evidence that intra-annual rainfall had become more variable over time. Using long-term live-trapping records (22 years) of desert small mammals as a case study, we demonstrate that irruptive events are driven by extreme rainfalls (>95th quantile) and that increases in the magnitude and frequency of extreme rainfall events are likely to drive changes in the populations of these species through direct and indirect changes in predation pressure and wildfires. PMID:23170202
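
    A minimal sketch of the trend analysis described above: regress annual rainfall on year at the median and at extreme quantiles and compare slopes (statsmodels on synthetic data):

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(5)
        year = np.arange(1912, 2012)
        # Synthetic rainfall whose upper tail grows faster than its median
        rain = 200 + 0.1*(year - 1912) + rng.gamma(2.0, 40 + 0.4*(year - 1912))
        df = pd.DataFrame({"year": year, "rain": rain})

        for tau in (0.05, 0.50, 0.95):
            slope = smf.quantreg("rain ~ year", df).fit(q=tau).params["year"]
            print(f"{int(tau*100)}th-quantile trend: {slope:+.3f} mm/yr")
        # Diverging slopes across quantiles indicate changing extremes relative
        # to the median, as tested in the study.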

  2. Pooling biomarker data from different studies of disease risk, with a focus on endogenous hormones

    PubMed Central

    Key, Timothy J; Appleby, Paul N; Allen, Naomi E; Reeves, Gillian K

    2010-01-01

    Large numbers of observations are needed to provide adequate power in epidemiological studies of biomarkers and cancer risk. However, there are currently few large mature studies with adequate numbers of cases with biospecimens available. Therefore pooling biomarker measures from different studies is a valuable approach, enabling investigators to make robust estimates of risk and to examine associations in subgroups of the population. The ideal situation is to have standardized methods in all studies so that the biomarker data can be pooled in their original units. However, even when the studies do not have standardized methods, as with existing studies on hormones and cancer, a simple approach using study-specific quantiles or percentage increases can provide substantial information on the relationship of the biomarker with cancer risk. PMID:20233851

  3. [Determinants of equity in financing medicines in Argentina: an empirical study].

    PubMed

    Dondo, Mariana; Monsalvo, Mauricio; Garibaldi, Lucas A

    2016-01-01

    Medicines are an important part of household health spending. A progressive system for financing drugs is thus essential for an equitable health system. Some authors have proposed that the determinants of equity in drug financing are socioeconomic, demographic, and associated with public interventions, but little progress has been made in the empirical evaluation and quantification of their relative importance. The current study estimated quantile regressions at the provincial level in Argentina and found that old age (> 65 years), unemployment, the existence of a public pharmaceutical laboratory, treatment transfers, and a health system orientated to primary care were important predictors of progressive payment schemes. Low income, weak institutions, and insufficient infrastructure and services were associated with the most regressive social responses to health needs, thereby aggravating living conditions and limiting development opportunities.

  4. Modelling the behaviour of unemployment rates in the US over time and across space

    NASA Astrophysics Data System (ADS)

    Holmes, Mark J.; Otero, Jesús; Panagiotidis, Theodore

    2013-11-01

    This paper provides evidence that unemployment rates across US states are stationary and therefore behave according to the natural rate hypothesis. We provide new insights by considering the effect of key variables on the speed of adjustment associated with unemployment shocks. A high-dimensional VAR analysis of the half-lives associated with shocks to unemployment rates in pairs of states suggests that the distance between states exerts a positive influence and vacancy rates a negative one. We find that higher homeownership rates do not lead to longer half-lives. When the symmetry assumption is relaxed through quantile regression, support for the Oswald hypothesis, through a positive relationship between homeownership rates and half-lives, is found at the higher quantiles.
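
    For reference, the half-life implied by an estimated AR(1) adjustment coefficient rho is ln(0.5)/ln(rho); a one-line sketch:

        import numpy as np

        def half_life(rho):
            """Periods until half of a shock has dissipated in an AR(1) process."""
            return np.log(0.5) / np.log(rho)

        print(half_life(0.9))   # ~6.6 periods; slower decay (rho -> 1) means longer half-lives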

  5. Comparison between changes in flood hazard and risk in Spain using historical information

    NASA Astrophysics Data System (ADS)

    Llasat, Maria-Carmen; Mediero, Luis; Garrote, Luis; Gilabert, Joan

    2015-04-01

    Recently, the COST Action ES0901 "European procedures for flood frequency estimation (FloodFreq)" had as its objective "the comparison and evaluation of methods for flood frequency estimation under the various climatologic and geographic conditions found in Europe". It highlighted the improvement that regional analyses provide over at-site estimates in terms of the uncertainty of quantile estimates. In the case of Spain, a regional analysis was carried out at the national scale, which allows identifying the flow threshold corresponding to a given return period from the observed flow series recorded at a gauging station. In addition, Mediero et al. (2014) studied the possible influence of non-stationarity on flood series for the period 1942-2009. In parallel, Barnolas and Llasat (2007), among others, collected documentary information on catastrophic flood events in Spain over the last centuries. Traditionally, the first approach ("top-down") identifies a flood as catastrophic when it exceeds the 500-year return period flood, whereas the second one ("bottom-up") accounts for flood damages (Llasat et al., 2005). This study presents a comparison between both approaches, discussing the potential factors that can lead to discrepancies between them, as well as accounting for information about major changes experienced in the catchment that could lead to changes in flood hazard and risk.

  6. Evaluating changes to reservoir rule curves using historical water-level data

    USGS Publications Warehouse

    Mower, Ethan; Miranda, Leandro E.

    2013-01-01

    Flood control reservoirs are typically managed through rule curves (i.e. target water levels) which control the storage and release timing of flood waters. Changes to rule curves are often contemplated and requested by various user groups and management agencies with no information available about the actual flood risk of such requests. Methods of estimating flood risk in reservoirs are not easily available to those unfamiliar with hydrological models that track water movement through a river basin. We developed a quantile regression model that uses readily available daily water-level data to estimate risk of spilling. Our model provided a relatively simple process for estimating the maximum applicable water level under a specific flood risk for any day of the year. This water level represents an upper-limit umbrella under which water levels can be operated in a variety of ways. Our model allows the visualization of water-level management under a user-specified flood risk and provides a framework for incorporating the effect of a changing environment on water-level management in reservoirs, but is not designed to replace existing hydrological models. The model can improve communication and collaboration among agencies responsible for managing natural resources dependent on reservoir water levels.
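
    A minimal sketch of the underlying idea, assuming harmonic day-of-year terms and a 5% spill risk (the published model's covariates and quantile level may differ):

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(6)
        doy = rng.integers(1, 366, size=2000)
        # Synthetic water levels with a seasonal cycle plus noise
        level = 100 + 3*np.sin(2*np.pi*doy/365) + rng.normal(0, 1.5, doy.size)
        df = pd.DataFrame({"level": level,
                           "s": np.sin(2*np.pi*doy/365),
                           "c": np.cos(2*np.pi*doy/365)})

        # The 95th-percentile curve: levels kept below it imply ~5% spill risk
        fit = smf.quantreg("level ~ s + c", df).fit(q=0.95)
        grid = pd.DataFrame({"s": np.sin(2*np.pi*np.arange(1, 366)/365),
                             "c": np.cos(2*np.pi*np.arange(1, 366)/365)})
        upper_limit = fit.predict(grid)   # one maximum applicable water level per day
        print(upper_limit.head())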

  7. Analyses of single nucleotide polymorphisms in selected nutrient-sensitive genes in weight-regain prevention: the DIOGENES study.

    PubMed

    Larsen, Lesli H; Angquist, Lars; Vimaleswaran, Karani S; Hager, Jörg; Viguerie, Nathalie; Loos, Ruth J F; Handjieva-Darlenska, Teodora; Jebb, Susan A; Kunesova, Marie; Larsen, Thomas M; Martinez, J Alfredo; Papadaki, Angeliki; Pfeiffer, Andreas F H; van Baak, Marleen A; Sørensen, Thorkild Ia; Holst, Claus; Langin, Dominique; Astrup, Arne; Saris, Wim H M

    2012-05-01

    Differences in the interindividual response to dietary intervention could be modified by genetic variation in nutrient-sensitive genes. This study examined single nucleotide polymorphisms (SNPs) in presumed nutrient-sensitive candidate genes for obesity and obesity-related diseases for main and dietary interaction effects on weight, waist circumference, and fat mass regain over 6 mo. In total, 742 participants who had lost ≥ 8% of their initial body weight were randomly assigned to follow 1 of 5 different ad libitum diets with different glycemic indexes and contents of dietary protein. The SNP main and SNP-diet interaction effects were analyzed by using linear regression models, corrected for multiple testing by using Bonferroni correction and evaluated by using quantile-quantile (Q-Q) plots. After correction for multiple testing, none of the SNPs were significantly associated with weight, waist circumference, or fat mass regain. Q-Q plots showed that ALOX5AP rs4769873 showed a higher observed than predicted P value for the association with less waist circumference regain over 6 mo (-3.1 cm/allele; 95% CI: -4.6, -1.6; P/Bonferroni-corrected P = 0.000039/0.076), independently of diet. Additional associations were identified by using Q-Q plots for SNPs in ALOX5AP, TNF, and KCNJ11 for main effects; in LPL and TUB for glycemic index interaction effects on waist circumference regain; in GHRL, CCK, MLXIPL, and LEPR on weight; in PPARC1A, PCK2, ALOX5AP, PYY, and ADRB3 on waist circumference; and in PPARD, FABP1, PLAUR, and LPIN1 on fat mass regain for dietary protein interaction. The observed effects of SNP-diet interactions on weight, waist, and fat mass regain suggest that genetic variation in nutrient-sensitive genes can modify the response to diet. This trial was registered at clinicaltrials.gov as NCT00390637.

  8. Methodological study of affine transformations of gene expression data with proposed robust non-parametric multi-dimensional normalization method.

    PubMed

    Bengtsson, Henrik; Hössjer, Ola

    2006-03-01

    Low-level processing and normalization of microarray data are among the most important steps in microarray analysis, with profound impact on downstream analysis. Multiple methods have been suggested to date, but it is not clear which is the best. It is therefore important to further study the different normalization methods in detail, and the nature of microarray data in general. A methodological study of affine models for gene expression data is carried out. Focus is on two-channel comparative studies, but the findings generalize also to single- and multi-channel data. The discussion applies to spotted as well as in-situ synthesized microarray data. Existing normalization methods such as curve-fit ("lowess") normalization, parallel and perpendicular translation normalization, and quantile normalization, as well as dye-swap normalization, are revisited in the light of the affine model, and their strengths and weaknesses are investigated in this context. As a direct result of this study, we propose a robust non-parametric multi-dimensional affine normalization method, which can be applied to any number of microarrays with any number of channels, either individually or all at once. A high-quality cDNA microarray data set with spike-in controls is used to demonstrate the power of the affine model and the proposed normalization method. We find that an affine model can explain non-linear intensity-dependent systematic effects in observed log-ratios. Affine normalization removes such artifacts for non-differentially expressed genes and ensures that symmetry between negative and positive log-ratios is obtained, which is fundamental when identifying differentially expressed genes. In addition, affine normalization makes the empirical distributions in different channels more equal, which is the purpose of quantile normalization, and may also explain why dye-swap normalization works or fails. All methods are made available in the aroma package, which is a platform-independent package for R.
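
    A minimal sketch of quantile normalization, one of the methods revisited above: each array (column) is forced to share the same empirical distribution by replacing every value with the mean of the values at its rank (ties are resolved arbitrarily in this simplified version):

        import numpy as np

        def quantile_normalize(x):
            """Quantile-normalize the columns of x (rows = genes, cols = arrays)."""
            order = np.argsort(x, axis=0)                    # sort order per column
            ranks = np.argsort(order, axis=0)                # rank of each value
            mean_by_rank = np.sort(x, axis=0).mean(axis=1)   # target distribution
            return mean_by_rank[ranks]

        x = np.array([[5.0, 4.0, 3.0],
                      [2.0, 1.0, 4.0],
                      [3.0, 4.0, 6.0],
                      [4.0, 2.0, 8.0]])
        print(quantile_normalize(x))   # every column now has identical sorted values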

  9. Desertification, salinization, and biotic homogenization in a dryland river ecosystem

    USGS Publications Warehouse

    Miyazono, S.; Patino, Reynaldo; Taylor, C.M.

    2015-01-01

    This study determined long-term changes in fish assemblages, river discharge, salinity, and local precipitation, and examined hydrological drivers of biotic homogenization in a dryland river ecosystem, the Trans-Pecos region of the Rio Grande/Rio Bravo del Norte (USA/Mexico). Historical (1977-1989) and current (2010-2011) fish assemblages were analyzed by rarefaction analysis (species richness), nonmetric multidimensional scaling (composition/variability), multiresponse permutation procedures (composition), and paired t-test (variability). Trends in hydrological conditions (1970s-2010s) were examined by Kendall tau and quantile regression, and associations between streamflow and specific conductance (salinity) by generalized linear models. Since the 1970s, species richness and variability of fish assemblages decreased in the Rio Grande below the confluence with the Rio Conchos (Mexico), a major tributary, but not above it. There was increased representation of lower-flow/higher-salinity tolerant species, thus making fish communities below the confluence taxonomically and functionally more homogeneous to those above it. Unlike findings elsewhere, this biotic homogenization was due primarily to changes in the relative abundances of native species. While Rio Conchos discharge was > 2-fold higher than Rio Grande discharge above their confluence, Rio Conchos discharge decreased during the study period causing Rio Grande discharge below the confluence to also decrease. Rio Conchos salinity is lower than Rio Grande salinity above their confluence and, as Rio Conchos discharge decreased, it caused Rio Grande salinity below the confluence to increase (reduced dilution). Trends in discharge did not correspond to trends in precipitation except at extreme-high (90th quantile) levels. In conclusion, decreasing discharge from the Rio Conchos has led to decreasing flow and increasing salinity in the Rio Grande below the confluence. This spatially uneven desertification and salinization of the Rio Grande has in turn led to a region-wide homogenization of hydrological conditions and of taxonomic and functional attributes of fish assemblages.

  10. Desertification, salinization, and biotic homogenization in a dryland river ecosystem.

    PubMed

    Miyazono, Seiji; Patiño, Reynaldo; Taylor, Christopher M

    2015-04-01

    This study determined long-term changes in fish assemblages, river discharge, salinity, and local precipitation, and examined hydrological drivers of biotic homogenization in a dryland river ecosystem, the Trans-Pecos region of the Rio Grande/Rio Bravo del Norte (USA/Mexico). Historical (1977-1989) and current (2010-2011) fish assemblages were analyzed by rarefaction analysis (species richness), nonmetric multidimensional scaling (composition/variability), multiresponse permutation procedures (composition), and paired t-test (variability). Trends in hydrological conditions (1970s-2010s) were examined by Kendall tau and quantile regression, and associations between streamflow and specific conductance (salinity) by generalized linear models. Since the 1970s, species richness and variability of fish assemblages decreased in the Rio Grande below the confluence with the Rio Conchos (Mexico), a major tributary, but not above it. There was increased representation of lower-flow/higher-salinity tolerant species, thus making fish communities below the confluence taxonomically and functionally more homogeneous to those above it. Unlike findings elsewhere, this biotic homogenization was due primarily to changes in the relative abundances of native species. While Rio Conchos discharge was >2-fold higher than Rio Grande discharge above their confluence, Rio Conchos discharge decreased during the study period causing Rio Grande discharge below the confluence to also decrease. Rio Conchos salinity is lower than Rio Grande salinity above their confluence and, as Rio Conchos discharge decreased, it caused Rio Grande salinity below the confluence to increase (reduced dilution). Trends in discharge did not correspond to trends in precipitation except at extreme-high (90th quantile) levels. In conclusion, decreasing discharge from the Rio Conchos has led to decreasing flow and increasing salinity in the Rio Grande below the confluence. This spatially uneven desertification and salinization of the Rio Grande has in turn led to a region-wide homogenization of hydrological conditions and of taxonomic and functional attributes of fish assemblages. Copyright © 2015 Elsevier B.V. All rights reserved.

  11. An algorithm for computing moments-based flood quantile estimates when historical flood information is available

    USGS Publications Warehouse

    Cohn, T.A.; Lane, W.L.; Baier, W.G.

    1997-01-01

    This paper presents the expected moments algorithm (EMA), a simple and efficient method for incorporating historical and paleoflood information into flood frequency studies. EMA can utilize three types of at-site flood information: systematic stream gage record; information about the magnitude of historical floods; and knowledge of the number of years in the historical period when no large flood occurred. EMA employs an iterative procedure to compute method-of-moments parameter estimates. Initial parameter estimates are calculated from systematic stream gage data. These moments are then updated by including the measured historical peaks and the expected moments, given the previously estimated parameters, of the below-threshold floods from the historical period. The updated moments result in new parameter estimates, and the last two steps are repeated until the algorithm converges. Monte Carlo simulations compare EMA, Bulletin 17B's [United States Water Resources Council, 1982] historically weighted moments adjustment, and maximum likelihood estimators when fitting the three parameters of the log-Pearson type III distribution. These simulations demonstrate that EMA is more efficient than the Bulletin 17B method, and that it is nearly as efficient as maximum likelihood estimation (MLE). The experiments also suggest that EMA has two advantages over MLE when dealing with the log-Pearson type III distribution: It appears that EMA estimates always exist and that they are unique, although neither result has been proven. EMA can be used with binomial or interval-censored data and with any distributional family amenable to method-of-moments estimation.

  12. An algorithm for computing moments-based flood quantile estimates when historical flood information is available

    NASA Astrophysics Data System (ADS)

    Cohn, T. A.; Lane, W. L.; Baier, W. G.

    This paper presents the expected moments algorithm (EMA), a simple and efficient method for incorporating historical and paleoflood information into flood frequency studies. EMA can utilize three types of at-site flood information: systematic stream gage record; information about the magnitude of historical floods; and knowledge of the number of years in the historical period when no large flood occurred. EMA employs an iterative procedure to compute method-of-moments parameter estimates. Initial parameter estimates are calculated from systematic stream gage data. These moments are then updated by including the measured historical peaks and the expected moments, given the previously estimated parameters, of the below-threshold floods from the historical period. The updated moments result in new parameter estimates, and the last two steps are repeated until the algorithm converges. Monte Carlo simulations compare EMA, Bulletin 17B's [United States Water Resources Council, 1982] historically weighted moments adjustment, and maximum likelihood estimators when fitting the three parameters of the log-Pearson type III distribution. These simulations demonstrate that EMA is more efficient than the Bulletin 17B method, and that it is nearly as efficient as maximum likelihood estimation (MLE). The experiments also suggest that EMA has two advantages over MLE when dealing with the log-Pearson type III distribution: It appears that EMA estimates always exist and that they are unique, although neither result has been proven. EMA can be used with binomial or interval-censored data and with any distributional family amenable to method-of-moments estimation.

  13. Inclusion of historical information in flood frequency analysis using a Bayesian MCMC technique: a case study for the power dam Orlík, Czech Republic

    NASA Astrophysics Data System (ADS)

    Gaál, Ladislav; Szolgay, Ján; Kohnová, Silvia; Hlavčová, Kamila; Viglione, Alberto

    2010-01-01

    The paper deals with at-site flood frequency estimation in the case when information on past hydrological events of extraordinary magnitude is also available. For the joint frequency analysis of systematic observations and historical data, the Bayesian framework is chosen, which, through adequately defined likelihood functions, allows for the incorporation of different sources of hydrological information, e.g., maximum annual flood peaks, historical events, as well as measurement errors. The distribution of the parameters of the fitted distribution function and the confidence intervals of the flood quantiles are derived by means of the Markov chain Monte Carlo (MCMC) simulation technique. The paper presents a sensitivity analysis related to the choice of the most influential parameters of the statistical model, which are the length of the historical period h and the perception threshold X0. These are involved in the statistical model under the assumption that, except for the events termed ‘historical’, none of the (unknown) peak discharges from the historical period h should have exceeded the threshold X0. Both higher values of h and lower values of X0 lead to narrower confidence intervals of the estimated flood quantiles; however, it is emphasized that one should be prudent in selecting these parameters, in order to avoid making inferences with wrong assumptions about the unknown hydrological events that occurred in the past. The Bayesian MCMC methodology is presented on the example of the maximum discharges observed during the warm half year at the station Vltava-Kamýk (Czech Republic) in the period 1877-2002. Although the 2002 flood peak, which is related to the vast flooding that affected a large part of Central Europe at that time, occurred in the near past, in the analysis it is treated virtually as a ‘historical’ event in order to illustrate some crucial aspects of including information on extreme historical floods into at-site flood frequency analyses.
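
    A sketch of the kind of joint likelihood involved, assuming a GEV model: systematic and historical peaks enter through the density, while the h − k non-exceedance years enter through the CDF at the perception threshold X0 (the paper's actual distribution and priors may differ; in the Bayesian setting this function would be combined with priors inside an MCMC sampler):

        import numpy as np
        from scipy.stats import genextreme

        def log_likelihood(params, systematic, historical, h, x0):
            """Joint log-likelihood of the systematic record plus historical information.

            h: length of the historical period; x0: perception threshold, assumed
            not exceeded except by the k recorded historical events.
            """
            shape, loc, scale = params
            if scale <= 0:
                return -np.inf
            ll = genextreme.logpdf(systematic, shape, loc, scale).sum()
            ll += genextreme.logpdf(historical, shape, loc, scale).sum()
            k = len(historical)
            ll += (h - k) * genextreme.logcdf(x0, shape, loc, scale)  # censored years
            return ll

        sys_peaks = np.array([310., 450., 280., 520., 390.])   # gauged maxima (synthetic)
        hist_peaks = np.array([900.])                          # one extraordinary flood
        print(log_likelihood((0.1, 350., 100.), sys_peaks, hist_peaks, h=120, x0=800.))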

  14. A collision risk model to predict avian fatalities at wind facilities: an example using golden eagles, Aquila chrysaetos

    USGS Publications Warehouse

    New, Leslie; Bjerre, Emily; Millsap, Brian A.; Otto, Mark C.; Runge, Michael C.

    2015-01-01

    Wind power is a major candidate in the search for clean, renewable energy. Beyond the technical and economic challenges of wind energy development are environmental issues that may restrict its growth. Avian fatalities due to collisions with rotating turbine blades are a leading concern and there is considerable uncertainty surrounding avian collision risk at wind facilities. This uncertainty is not reflected in many models currently used to predict the avian fatalities that would result from proposed wind developments. We introduce a method to predict fatalities at wind facilities, based on pre-construction monitoring. Our method can directly incorporate uncertainty into the estimates of avian fatalities and can be updated if information on the true number of fatalities becomes available from post-construction carcass monitoring. Our model considers only three parameters: hazardous footprint, bird exposure to turbines and collision probability. By using a Bayesian analytical framework we account for uncertainties in these values, which are then reflected in our predictions and can be reduced through subsequent data collection. The simplicity of our approach makes it accessible to ecologists concerned with the impact of wind development, as well as to managers, policy makers and industry interested in its implementation in real-world decision contexts. We demonstrate the utility of our method by predicting golden eagle (Aquila chrysaetos) fatalities at a wind installation in the United States. Using pre-construction data, we predicted 7.48 eagle fatalities per year (95% CI: (1.1, 19.81)). The U.S. Fish and Wildlife Service uses the 80th quantile (11.0 eagle fatalities per year) in their permitting process to ensure there is only a 20% chance a wind facility exceeds the authorized fatalities. Once data were available from two years of post-construction monitoring, we updated the fatality estimate to 4.8 eagle fatalities per year (95% CI: (1.76, 9.4); 80th quantile, 6.3). In this case, the increased precision in the fatality prediction lowered the level of authorized take, and thus lowered the required amount of compensatory mitigation.
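
    A minimal sketch of the prediction logic: draw the three parameters from priors, propagate them to a fatality distribution, and read off the 80th quantile used in permitting. The distributions below are placeholders, not the study's fitted priors:

        import numpy as np

        rng = np.random.default_rng(7)
        n = 100_000

        # Illustrative priors (placeholders, not the study's fitted distributions)
        exposure = rng.gamma(4.0, 500.0, n)      # eagle-minutes of exposure per year
        hazard_frac = rng.beta(2.0, 8.0, n)      # fraction of the site that is hazardous
        collision_p = rng.beta(2.0, 1000.0, n)   # collision probability per exposure unit

        fatalities = exposure * hazard_frac * collision_p
        print(f"median: {np.median(fatalities):.2f}")
        print(f"80th quantile (permitting value): {np.quantile(fatalities, 0.80):.2f}")
        # With post-construction carcass counts, the same framework updates these
        # draws via Bayes' rule, typically narrowing the interval.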

  15. A Collision Risk Model to Predict Avian Fatalities at Wind Facilities: An Example Using Golden Eagles, Aquila chrysaetos

    PubMed Central

    New, Leslie; Bjerre, Emily; Millsap, Brian; Otto, Mark C.; Runge, Michael C.

    2015-01-01

    Wind power is a major candidate in the search for clean, renewable energy. Beyond the technical and economic challenges of wind energy development are environmental issues that may restrict its growth. Avian fatalities due to collisions with rotating turbine blades are a leading concern and there is considerable uncertainty surrounding avian collision risk at wind facilities. This uncertainty is not reflected in many models currently used to predict the avian fatalities that would result from proposed wind developments. We introduce a method to predict fatalities at wind facilities, based on pre-construction monitoring. Our method can directly incorporate uncertainty into the estimates of avian fatalities and can be updated if information on the true number of fatalities becomes available from post-construction carcass monitoring. Our model considers only three parameters: hazardous footprint, bird exposure to turbines and collision probability. By using a Bayesian analytical framework we account for uncertainties in these values, which are then reflected in our predictions and can be reduced through subsequent data collection. The simplicity of our approach makes it accessible to ecologists concerned with the impact of wind development, as well as to managers, policy makers and industry interested in its implementation in real-world decision contexts. We demonstrate the utility of our method by predicting golden eagle (Aquila chrysaetos) fatalities at a wind installation in the United States. Using pre-construction data, we predicted 7.48 eagle fatalities per year (95% CI: (1.1, 19.81)). The U.S. Fish and Wildlife Service uses the 80th quantile (11.0 eagle fatalities per year) in their permitting process to ensure there is only a 20% chance a wind facility exceeds the authorized fatalities. Once data were available from two years of post-construction monitoring, we updated the fatality estimate to 4.8 eagle fatalities per year (95% CI: (1.76, 9.4); 80th quantile, 6.3). In this case, the increased precision in the fatality prediction lowered the level of authorized take, and thus lowered the required amount of compensatory mitigation. PMID:26134412

  16. Correcting systematic inflation in genetic association tests that consider interaction effects: application to a genome-wide association study of posttraumatic stress disorder.

    PubMed

    Almli, Lynn M; Duncan, Richard; Feng, Hao; Ghosh, Debashis; Binder, Elisabeth B; Bradley, Bekh; Ressler, Kerry J; Conneely, Karen N; Epstein, Michael P

    2014-12-01

    Genetic association studies of psychiatric outcomes often consider interactions with environmental exposures and, in particular, apply tests that jointly consider gene and gene-environment interaction effects for analysis. Using a genome-wide association study (GWAS) of posttraumatic stress disorder (PTSD), we report that heteroscedasticity (defined as variability in outcome that differs by the value of the environmental exposure) can invalidate traditional joint tests of gene and gene-environment interaction. To identify the cause of bias in traditional joint tests of gene and gene-environment interaction in a PTSD GWAS and determine whether proposed robust joint tests are insensitive to this problem. The PTSD GWAS data set consisted of 3359 individuals (978 men and 2381 women) from the Grady Trauma Project (GTP), a cohort study from Atlanta, Georgia. The GTP performed genome-wide genotyping of participants and collected environmental exposures using the Childhood Trauma Questionnaire and Trauma Experiences Inventory. We performed joint interaction testing of the Beck Depression Inventory and modified PTSD Symptom Scale in the GTP GWAS. We assessed systematic bias in our interaction analyses using quantile-quantile plots and genome-wide inflation factors. Application of the traditional joint interaction test to the GTP GWAS yielded systematic inflation across different outcomes and environmental exposures (inflation-factor estimates ranging from 1.07 to 1.21), whereas application of the robust joint test to the same data set yielded no such inflation (inflation-factor estimates ranging from 1.01 to 1.02). Simulated data further revealed that the robust joint test is valid in different heteroscedasticity models, whereas the traditional joint test is invalid. The robust joint test also has power similar to the traditional joint test when heteroscedasticity is not an issue. We believe the robust joint test should be used in candidate-gene studies and GWASs of psychiatric outcomes that consider environmental interactions. To make the procedure useful for applied investigators, we created a software tool that can be called from the popular PLINK package for analysis.
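
    The genome-wide inflation factor reported here is conventionally the median association chi-square divided by its expected null median; a minimal sketch from a vector of p-values:

        import numpy as np
        from scipy.stats import chi2

        def inflation_factor(pvals):
            """Genomic control lambda: median observed chi2 (1 df) over the null median."""
            observed = chi2.ppf(1.0 - np.asarray(pvals), df=1)
            return np.median(observed) / chi2.ppf(0.5, df=1)   # null median ~ 0.455

        rng = np.random.default_rng(8)
        p_null = rng.uniform(size=100_000)       # well-calibrated tests
        print(inflation_factor(p_null))          # ~1.0; inflated tests give values > 1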

  17. Strategies to take into account variations in extreme rainfall events for design storms in urban area: an example over Naples (Southern Italy)

    NASA Astrophysics Data System (ADS)

    Mercogliano, P.; Rianna, G.

    2017-12-01

    Eminent works have highlighted that available observations display ongoing increases in extreme rainfall events, while climate models project further increases for the future. Although constraints in rainfall observation networks and uncertainties in climate modelling currently affect such investigations in a significant way, the huge impacts potentially induced by climate change (CC) suggest adopting effective adaptation measures in order to take proper precautions. In this regard, design storms are used by engineers to size hydraulic infrastructures potentially affected by direct (e.g. pluvial/urban flooding) and indirect (e.g. river flooding) effects of extreme rainfall events. Usually they are expressed as IDF curves, mathematical relationships between rainfall Intensity, Duration, and the return period (Frequency). They are estimated by interpreting past rainfall records through statistical theories of extremes under the assumption of stationary conditions, and are therefore unsuitable under climate change. In this work, a methodology to estimate future variations in IDF curves is presented and applied to the city of Naples (Southern Italy). The Equidistance Quantile Matching approach proposed by Srivastav et al. (2014) is adopted. According to it, daily and subdaily maximum precipitation observations [a] and the analogous daily data provided by climate projections for current [b] and future time spans [c] are interpreted in IDF terms through the Generalized Extreme Value (GEV) approach. Quantile-based mapping is then used to establish statistical relationships between the cumulative distribution functions resulting from the GEV fits of [a] and [b] (spatial downscaling) and of [b] and [c] (temporal downscaling). Coupling the resulting relations permits generating IDF curves under the CC assumption. To account for uncertainties in future projections, all climate simulations available for the area in the Euro-CORDEX multimodel ensemble at 0.11° (about 12 km) are considered under three different concentration scenarios (RCP2.6, RCP4.5 and RCP8.5). The results appear largely influenced by models, RCPs and the time horizon of interest; nevertheless, clear indications of increases are detectable, although with different magnitudes over the different precipitation durations.

  18. Correcting Systematic Inflation in Genetic Association Tests That Consider Interaction Effects

    PubMed Central

    Almli, Lynn M.; Duncan, Richard; Feng, Hao; Ghosh, Debashis; Binder, Elisabeth B.; Bradley, Bekh; Ressler, Kerry J.; Conneely, Karen N.; Epstein, Michael P.

    2015-01-01

    IMPORTANCE Genetic association studies of psychiatric outcomes often consider interactions with environmental exposures and, in particular, apply tests that jointly consider gene and gene-environment interaction effects for analysis. Using a genome-wide association study (GWAS) of posttraumatic stress disorder (PTSD), we report that heteroscedasticity (defined as variability in outcome that differs by the value of the environmental exposure) can invalidate traditional joint tests of gene and gene-environment interaction. OBJECTIVES To identify the cause of bias in traditional joint tests of gene and gene-environment interaction in a PTSD GWAS and determine whether proposed robust joint tests are insensitive to this problem. DESIGN, SETTING, AND PARTICIPANTS The PTSD GWAS data set consisted of 3359 individuals (978 men and 2381 women) from the Grady Trauma Project (GTP), a cohort study from Atlanta, Georgia. The GTP performed genome-wide genotyping of participants and collected environmental exposures using the Childhood Trauma Questionnaire and Trauma Experiences Inventory. MAIN OUTCOMES AND MEASURES We performed joint interaction testing of the Beck Depression Inventory and modified PTSD Symptom Scale in the GTP GWAS. We assessed systematic bias in our interaction analyses using quantile-quantile plots and genome-wide inflation factors. RESULTS Application of the traditional joint interaction test to the GTP GWAS yielded systematic inflation across different outcomes and environmental exposures (inflation-factor estimates ranging from 1.07 to 1.21), whereas application of the robust joint test to the same data set yielded no such inflation (inflation-factor estimates ranging from 1.01 to 1.02). Simulated data further revealed that the robust joint test is valid in different heteroscedasticity models, whereas the traditional joint test is invalid. The robust joint test also has power similar to the traditional joint test when heteroscedasticity is not an issue. CONCLUSIONS AND RELEVANCE We believe the robust joint test should be used in candidate-gene studies and GWASs of psychiatric outcomes that consider environmental interactions. To make the procedure useful for applied investigators, we created a software tool that can be called from the popular PLINK package for analysis. PMID:25354142

  19. A Collision Risk Model to Predict Avian Fatalities at Wind Facilities: An Example Using Golden Eagles, Aquila chrysaetos.

    PubMed

    New, Leslie; Bjerre, Emily; Millsap, Brian; Otto, Mark C; Runge, Michael C

    2015-01-01

    Wind power is a major candidate in the search for clean, renewable energy. Beyond the technical and economic challenges of wind energy development are environmental issues that may restrict its growth. Avian fatalities due to collisions with rotating turbine blades are a leading concern and there is considerable uncertainty surrounding avian collision risk at wind facilities. This uncertainty is not reflected in many models currently used to predict the avian fatalities that would result from proposed wind developments. We introduce a method to predict fatalities at wind facilities, based on pre-construction monitoring. Our method can directly incorporate uncertainty into the estimates of avian fatalities and can be updated if information on the true number of fatalities becomes available from post-construction carcass monitoring. Our model considers only three parameters: hazardous footprint, bird exposure to turbines and collision probability. By using a Bayesian analytical framework we account for uncertainties in these values, which are then reflected in our predictions and can be reduced through subsequent data collection. The simplicity of our approach makes it accessible to ecologists concerned with the impact of wind development, as well as to managers, policy makers and industry interested in its implementation in real-world decision contexts. We demonstrate the utility of our method by predicting golden eagle (Aquila chrysaetos) fatalities at a wind installation in the United States. Using pre-construction data, we predicted 7.48 eagle fatalities per year (95% CI: (1.1, 19.81)). The U.S. Fish and Wildlife Service uses the 80th quantile (11.0 eagle fatalities per year) in their permitting process to ensure there is only a 20% chance a wind facility exceeds the authorized fatalities. Once data were available from two years of post-construction monitoring, we updated the fatality estimate to 4.8 eagle fatalities per year (95% CI: (1.76, 9.4); 80th quantile, 6.3). In this case, the increased precision in the fatality prediction lowered the level of authorized take, and thus lowered the required amount of compensatory mitigation.

  20. Prediction of allosteric sites and mediating interactions through bond-to-bond propensities

    NASA Astrophysics Data System (ADS)

    Amor, B. R. C.; Schaub, M. T.; Yaliraki, S. N.; Barahona, M.

    2016-08-01

    Allostery is a fundamental mechanism of biological regulation, in which binding of a molecule at a distant location affects the active site of a protein. Allosteric sites provide targets to fine-tune protein activity, yet we lack computational methodologies to predict them. Here we present an efficient graph-theoretical framework to reveal allosteric interactions (atoms and communication pathways strongly coupled to the active site) without a priori information of their location. Using an atomistic graph with energy-weighted covalent and weak bonds, we define a bond-to-bond propensity quantifying the non-local effect of instantaneous bond fluctuations propagating through the protein. Significant interactions are then identified using quantile regression. We exemplify our method with three biologically important proteins: caspase-1, CheY, and h-Ras, correctly predicting key allosteric interactions, whose significance is additionally confirmed against a reference set of 100 proteins. The almost-linear scaling of our method renders it suitable for high-throughput searches for candidate allosteric sites.

  1. Prediction of allosteric sites and mediating interactions through bond-to-bond propensities

    PubMed Central

    Amor, B. R. C.; Schaub, M. T.; Yaliraki, S. N.; Barahona, M.

    2016-01-01

    Allostery is a fundamental mechanism of biological regulation, in which binding of a molecule at a distant location affects the active site of a protein. Allosteric sites provide targets to fine-tune protein activity, yet we lack computational methodologies to predict them. Here we present an efficient graph-theoretical framework to reveal allosteric interactions (atoms and communication pathways strongly coupled to the active site) without a priori information of their location. Using an atomistic graph with energy-weighted covalent and weak bonds, we define a bond-to-bond propensity quantifying the non-local effect of instantaneous bond fluctuations propagating through the protein. Significant interactions are then identified using quantile regression. We exemplify our method with three biologically important proteins: caspase-1, CheY, and h-Ras, correctly predicting key allosteric interactions, whose significance is additionally confirmed against a reference set of 100 proteins. The almost-linear scaling of our method renders it suitable for high-throughput searches for candidate allosteric sites. PMID:27561351

  2. Age-, sex-, and education-specific norms for an extended CERAD Neuropsychological Assessment Battery-Results from the population-based LIFE-Adult-Study.

    PubMed

    Luck, Tobias; Pabst, Alexander; Rodriguez, Francisca S; Schroeter, Matthias L; Witte, Veronica; Hinz, Andreas; Mehnert, Anja; Engel, Christoph; Loeffler, Markus; Thiery, Joachim; Villringer, Arno; Riedel-Heller, Steffi G

    2018-05-01

    To provide new age-, sex-, and education-specific reference values for an extended version of the well-established Consortium to Establish a Registry for Alzheimer's Disease Neuropsychological Assessment Battery (CERAD-NAB) that additionally includes the Trail Making Test and the Verbal Fluency Test-S-Words. Norms were calculated based on the cognitive performances of n = 1,888 dementia-free participants (60-79 years) from the population-based German LIFE-Adult-Study. Multiple regressions were used to examine the association of the CERAD-NAB scores with age, sex, and education. In order to calculate the norms, quantile and censored quantile regression analyses were performed, estimating marginal means of the test scores at the 2.28th, 6.68th, 10th, 15.87th, 25th, 50th, 75th, and 90th percentiles for age-, sex-, and education-specific subgroups. Multiple regression analyses revealed that younger age was significantly associated with better cognitive performance in 15 CERAD-NAB measures, and higher education with better cognitive performance in all 17 measures. Women performed significantly better than men in 12 measures, and men better than women in four measures. The determined norms indicate ceiling effects for cognitive performance in the Boston Naming, Word List Recognition, Constructional Praxis Copying, and Constructional Praxis Recall tests. The new norms for the extended CERAD-NAB will be useful for evaluating dementia-free German-speaking adults in a broad variety of relevant cognitive domains. The extended CERAD-NAB follows more closely the criteria for the new DSM-5 Mild and Major Neurocognitive Disorder. Additionally, it could be further developed to include a test for social cognition. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  3. Exploring prediction uncertainty of spatial data in geostatistical and machine learning Approaches

    NASA Astrophysics Data System (ADS)

    Klump, J. F.; Fouedjio, F.

    2017-12-01

    Geostatistical methods such as kriging with external drift, as well as machine learning techniques such as quantile regression forest, have been used intensively for modelling spatial data. In addition to providing predictions for target variables, both approaches are able to deliver a quantification of the uncertainty associated with the prediction at a target location. Geostatistical approaches are, in essence, designed to provide such prediction uncertainties, and their behaviour is well understood. However, they often require significant data pre-processing and rely on assumptions that are rarely met in practice. Machine learning algorithms such as random forest regression, on the other hand, require less data pre-processing and are non-parametric. This makes the application of machine learning algorithms to geostatistical problems an attractive proposition. The objective of this study is to compare kriging with external drift and quantile regression forest with respect to their ability to deliver reliable prediction uncertainties for spatial data. In our comparison we use both simulated and real-world datasets. Apart from classical performance indicators, the comparisons make use of accuracy plots, probability interval width plots, and visual examination of the uncertainty maps provided by the two approaches. By comparing random forest regression to kriging we found that both methods produced comparable maps of estimated values for our variables of interest. However, the measure of uncertainty provided by random forest appears to be quite different from the measure of uncertainty provided by kriging. In particular, the lack of spatial context can give misleading results in areas without ground truth data. These preliminary results raise questions about assessing the risks associated with decisions based on the predictions from geostatistical and machine learning algorithms in a spatial context, e.g. mineral exploration.
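
    A minimal sketch of quantile-based prediction intervals from a tree ensemble; scikit-learn's gradient boosting with quantile (pinball) loss is used here as a readily available stand-in for quantile regression forest:

        import numpy as np
        from sklearn.ensemble import GradientBoostingRegressor

        rng = np.random.default_rng(9)
        X = rng.uniform(0, 10, size=(500, 2))               # e.g. spatial coordinates
        y = np.sin(X[:, 0]) + 0.1 * X[:, 1] + rng.normal(0, 0.2 + 0.02 * X[:, 0], 500)

        models = {q: GradientBoostingRegressor(loss="quantile", alpha=q).fit(X, y)
                  for q in (0.05, 0.50, 0.95)}

        X_new = np.array([[2.0, 5.0], [8.0, 1.0]])
        lower = models[0.05].predict(X_new)
        upper = models[0.95].predict(X_new)
        print(upper - lower)   # interval width = model's local prediction uncertainty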

  4. What do we gain with Probabilistic Flood Loss Models?

    NASA Astrophysics Data System (ADS)

    Schroeter, K.; Kreibich, H.; Vogel, K.; Merz, B.; Lüdtke, S.

    2015-12-01

    The reliability of flood loss models is a prerequisite for their practical usefulness. Traditional univariate damage models, such as depth-damage curves, often fail to reproduce the variability of observed flood damage. Innovative multivariate probabilistic modelling approaches are promising to capture and quantify the uncertainty involved and thus to improve the basis for decision making. In this study we compare the predictive capability of two probabilistic modelling approaches, namely Bagging Decision Trees and Bayesian Networks, and traditional stage-damage functions cast in a probabilistic framework. For model evaluation we use empirical damage data available from computer-aided telephone interviews compiled after the floods of 2002, 2005, 2006, and 2013 in the Elbe and Danube catchments in Germany. We carry out a split-sample test by sub-setting the damage records: one subset is used to derive the models, and the remaining records are used to evaluate their predictive performance. Further, we stratify the sample according to catchments, which allows us to study model performance in a spatial transfer context. Flood damage estimation is carried out at the scale of individual buildings in terms of relative damage. The predictive performance of the models is assessed in terms of systematic deviations (mean bias), precision (mean absolute error), and reliability, represented by the proportion of observations that fall within the predictive interval bounded by the 5% and 95% quantiles. The reliability of the probabilistic predictions within validation runs decreases only slightly and achieves very good coverage of observations within the predictive interval. Probabilistic models provide quantitative information about prediction uncertainty, which is crucial for assessing the reliability of model predictions and improves the usefulness of model results.
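
    The reliability measure used here (the share of observations inside the interval bounded by the 5% and 95% quantile predictions) reduces to a few lines; a toy sketch with synthetic arrays:

    ```python
    import numpy as np

    def interval_coverage(y_obs, lower, upper):
        """Share of observations falling inside their predictive interval."""
        y_obs, lower, upper = map(np.asarray, (y_obs, lower, upper))
        return np.mean((y_obs >= lower) & (y_obs <= upper))

    # toy check: a well-calibrated 5%-95% interval should cover about 90%
    rng = np.random.default_rng(2)
    y = rng.normal(size=10_000)
    print(interval_coverage(y, np.full_like(y, np.quantile(y, 0.05)),
                            np.full_like(y, np.quantile(y, 0.95))))
    ```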

  5. Confronting uncertainty in flood damage predictions

    NASA Astrophysics Data System (ADS)

    Schröter, Kai; Kreibich, Heidi; Vogel, Kristin; Merz, Bruno

    2015-04-01

    Reliable flood damage models are a prerequisite for the practical usefulness of model results. Traditional univariate damage models, such as depth-damage curves, often fail to reproduce the variability of observed flood damage. Innovative multivariate probabilistic modelling approaches are promising to capture and quantify the uncertainty involved and thus to improve the basis for decision making. In this study we compare the predictive capability of two probabilistic modelling approaches, namely Bagging Decision Trees and Bayesian Networks. For model evaluation we use empirical damage data available from computer-aided telephone interviews compiled after the floods of 2002, 2005 and 2006 in the Elbe and Danube catchments in Germany. We carry out a split-sample test by sub-setting the damage records: one subset is used to derive the models, and the remaining records are used to evaluate their predictive performance. Further, we stratify the sample according to catchments, which allows us to study model performance in a spatial transfer context. Flood damage estimation is carried out at the scale of individual buildings in terms of relative damage. The predictive performance of the models is assessed in terms of systematic deviations (mean bias), precision (mean absolute error), and reliability, represented by the proportion of observations that fall within the predictive interval bounded by the 5% and 95% quantiles. The reliability of the probabilistic predictions within validation runs decreases only slightly and achieves very good coverage of observations within the predictive interval. Probabilistic models provide quantitative information about prediction uncertainty, which is crucial for assessing the reliability of model predictions and improves the usefulness of model results.

  6. Development of Growth Charts of Pakistani Children Aged 4-15 Years Using Quantile Regression: A Cross-sectional Study

    PubMed Central

    Khan, Nazeer; Siddiqui, Junaid S; Baig-Ansari, Naila

    2018-01-01

    Background Growth charts are essential tools used by pediatricians as well as public health researchers in assessing and monitoring the well-being of pediatric populations. Development of these growth charts, especially for children above five years of age, is challenging and requires current anthropometric data and advanced statistical analysis. These growth charts are generally presented as a series of smooth centile curves. A number of modeling approaches are available for generating growth charts, and applying these to national datasets is important for generating country-specific reference growth charts. Objective To demonstrate that quantile regression (QR) is a viable statistical approach for constructing growth reference charts and to assess the applicability of the World Health Organization (WHO) 2007 growth standards to a large Pakistani population of school-going children. Methodology This is a secondary data analysis using anthropometric data of 9,515 students from a Pakistani survey conducted between 2007 and 2014 in four cities of Pakistan. Growth reference charts were created using QR as well as the LMS (Box-Cox transformation (L), the median (M), and the generalized coefficient of variation (S)) method and then compared with the WHO 2007 growth standards. Results Centile values estimated by the LMS method and the QR procedure showed few differences. The centile values obtained from the QR procedure for BMI-for-age, weight-for-age, and height-for-age of Pakistani children were lower than the corresponding WHO 2007 centiles. Conclusion QR should be considered as an alternative method for developing growth charts, given its simplicity and the fact that it requires no transformation of the data. The WHO 2007 standards are not suitable for Pakistani children. PMID:29632748
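
    A minimal sketch of the QR route to centile curves, assuming synthetic height-for-age data and statsmodels' QuantReg (the LMS comparison is not reproduced); a cubic polynomial in age gives smooth centiles with no Box-Cox transformation of the data:

    ```python
    import numpy as np
    from statsmodels.regression.quantile_regression import QuantReg

    rng = np.random.default_rng(3)
    age = rng.uniform(4, 15, 800)
    # heteroscedastic synthetic heights: the spread grows with age
    height = 90 + 6 * age + rng.normal(0, 3 + 0.4 * age, 800)

    def design(a):
        return np.column_stack([np.ones_like(a), a, a**2, a**3])

    grid = np.linspace(4, 15, 50)
    centiles = {}
    for tau in (0.03, 0.10, 0.50, 0.90, 0.97):
        fit = QuantReg(height, design(age)).fit(q=tau)
        centiles[tau] = design(grid) @ np.asarray(fit.params)  # smooth centile curve
    print({t: c[:3].round(1) for t, c in centiles.items()})
    ```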

  7. Development of Growth Charts of Pakistani Children Aged 4-15 Years Using Quantile Regression: A Cross-sectional Study.

    PubMed

    Iftikhar, Sundus; Khan, Nazeer; Siddiqui, Junaid S; Baig-Ansari, Naila

    2018-02-02

    Background Growth charts are essential tools used by pediatricians as well as public health researchers in assessing and monitoring the well-being of pediatric populations. Development of these growth charts, especially for children above five years of age, is challenging and requires current anthropometric data and advanced statistical analysis. These growth charts are generally presented as a series of smooth centile curves. A number of modeling approaches are available for generating growth charts, and applying these to national datasets is important for generating country-specific reference growth charts. Objective To demonstrate that quantile regression (QR) is a viable statistical approach for constructing growth reference charts and to assess the applicability of the World Health Organization (WHO) 2007 growth standards to a large Pakistani population of school-going children. Methodology This is a secondary data analysis using anthropometric data of 9,515 students from a Pakistani survey conducted between 2007 and 2014 in four cities of Pakistan. Growth reference charts were created using QR as well as the LMS (Box-Cox transformation (L), the median (M), and the generalized coefficient of variation (S)) method and then compared with the WHO 2007 growth standards. Results Centile values estimated by the LMS method and the QR procedure showed few differences. The centile values obtained from the QR procedure for BMI-for-age, weight-for-age, and height-for-age of Pakistani children were lower than the corresponding WHO 2007 centiles. Conclusion QR should be considered as an alternative method for developing growth charts, given its simplicity and the fact that it requires no transformation of the data. The WHO 2007 standards are not suitable for Pakistani children.

  8. Using quantile regression to examine the effects of inequality across the mortality distribution in U.S. counties

    PubMed Central

    Yang, Tse-Chuan; Chen, Vivian Yi-Ju; Shoff, Carla; Matthews, Stephen A.

    2012-01-01

    The U.S. has experienced a resurgence of income inequality in recent decades. The evidence regarding the mortality implications of this phenomenon has been mixed. This study employs a method rarely used in mortality research, quantile regression (QR), to provide insight into the ongoing debate over whether income inequality is a determinant of mortality and to investigate the varying relationship between inequality and mortality throughout the mortality distribution. Analyzing a U.S. dataset in which the five-year (1998–2002) average mortality rates were combined with other county-level covariates, we found that the association between inequality and mortality was not constant throughout the mortality distribution, and that the impact of inequality on mortality steadily increased until the 80th percentile. When accounting for all potential confounders, inequality was significantly and positively related to mortality; however, this inequality–mortality relationship did not hold across the mortality distribution. A series of Wald tests confirmed this varying inequality–mortality relationship, especially between the lower and upper tails. The large variation in the estimated coefficients of the Gini index suggested that inequality had the greatest influence on counties with a mortality rate of roughly 9.95 deaths per 1000 population (80th percentile). Furthermore, our results suggest that traditional analytic methods that focus on the mean or median of the dependent variable can be, at most, applied to a narrow 20 percent of observations. This study demonstrates the value of QR. Our findings provide some insight into why the existing evidence for the inequality–mortality relationship is mixed and suggest that analytical issues may play a role in clarifying whether inequality is a robust determinant of population health. PMID:22497847
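
    The paper's central device, re-estimating the inequality coefficient at several points of the mortality distribution, looks roughly like this in statsmodels; the county data below are entirely synthetic, with an inequality effect deliberately built to strengthen in the upper tail:

    ```python
    import numpy as np
    from statsmodels.regression.quantile_regression import QuantReg

    rng = np.random.default_rng(4)
    n = 1000
    gini = rng.uniform(0.35, 0.55, n)
    poverty = rng.uniform(0.05, 0.30, n)
    # heteroscedastic error term makes the Gini effect grow across quantiles
    mortality = 6 + 2 * gini + 5 * poverty + (1 + 3 * gini) * rng.normal(0, 0.4, n)

    X = np.column_stack([np.ones(n), gini, poverty])
    for tau in (0.10, 0.25, 0.50, 0.75, 0.80, 0.90):
        beta = np.asarray(QuantReg(mortality, X).fit(q=tau).params)
        print(f"tau={tau:.2f}  Gini coefficient: {beta[1]:5.2f}")
    ```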

  9. Physical Activity and Pediatric Obesity: A Quantile Regression Analysis

    PubMed Central

    Mitchell, Jonathan A.; Dowda, Marsha; Pate, Russell R.; Kordas, Katarzyna; Froberg, Karsten; Sardinha, Luís B.; Kolle, Elin; Page, Angela

    2016-01-01

    Purpose We aimed to determine whether moderate-to-vigorous physical activity (MVPA) and sedentary behavior (SB) were independently associated with body mass index (BMI) and waist circumference (WC) in children and adolescents. Methods Data from the International Children’s Accelerometry Database (ICAD) were used to address our objectives (N=11,115; 6-18y; 51% female). We calculated age- and gender-specific BMI and WC Z-scores and used accelerometry to estimate MVPA and total SB. Self-reported television viewing was used as a measure of leisure-time SB. Quantile regression was used to analyze the data. Results MVPA and total SB were associated with lower and higher BMI and WC Z-scores, respectively. These associations were strongest at the higher percentiles of the Z-score distributions. After including MVPA and total SB in the same model, the MVPA associations remained, but the SB associations were no longer present. For example, each additional hour per day of MVPA was not associated with BMI Z-score at the 10th percentile (b=-0.02, P=0.170), but was associated with lower BMI Z-score at the 50th (b=-0.19, P<0.001) and 90th percentiles (b=-0.41, P<0.001). More television viewing was associated with higher BMI and WC, and these associations were strongest at the higher percentiles of the Z-score distributions, even after adjustment for MVPA and total SB. Conclusions Our observation of stronger associations at the higher percentiles indicates that increasing MVPA and decreasing television viewing at the population level could shift the upper tails of the BMI and WC frequency distributions to lower values, thereby lowering the number of children and adolescents classified as obese. PMID:27755284

  10. How important are determinants of obesity measured at the individual level for explaining geographic variation in body mass index distributions? Observational evidence from Canada using Quantile Regression and Blinder-Oaxaca Decomposition.

    PubMed

    Dutton, Daniel J; McLaren, Lindsay

    2016-04-01

    Obesity prevalence varies between geographic regions in Canada. The reasons for this variation are unclear but most likely implicate both individual-level and population-level factors. The objective of this study was to examine whether equalising correlates of body mass index (BMI) across these geographic regions could be reasonably expected to reduce differences in BMI distributions between regions. Using data from three cycles of the Canadian Community Health Survey (CCHS) 2001, 2003 and 2007 for males and females, we modelled between-region BMI cross-sectionally using quantile regression and Blinder-Oaxaca decomposition of the quantile regression results. We show that while individual-level variables (ie, age, income, education, physical activity level, fruit and vegetable consumption, smoking status, drinking status, family doctor status, rural status, employment in the past 12 months and marital status) may be important correlates of BMI within geographic regions, those variables are not capable of explaining variation in BMI between regions. Equalisation of common correlates of BMI between regions cannot be reasonably expected to reduce differences in the BMI distributions between regions.

  11. Forecasting peak asthma admissions in London: an application of quantile regression models.

    PubMed

    Soyiri, Ireneous N; Reidpath, Daniel D; Sarran, Christophe

    2013-07-01

    Asthma is a chronic condition of great public health concern globally. The associated morbidity, mortality and healthcare utilisation place an enormous burden on healthcare infrastructure and services. This study demonstrates a multistage quantile regression approach to predicting excess demand for health care services in the form of daily asthma admissions in London, using retrospective data from the Hospital Episode Statistics, weather and air quality. Trivariate quantile regression models (QRM) of daily asthma admissions were fitted to a 14-day range of lags of environmental factors, accounting for seasonality in a hold-in sample of the data. Representative lags were pooled to form multivariate predictive models, selected through a systematic backward stepwise reduction approach. Models were cross-validated using a hold-out sample of the data, and their respective root mean square error measures, sensitivity, specificity and predictive values were compared. Two of the predictive models were able to detect extreme numbers of daily asthma admissions at sensitivity levels of 76% and 62%, with specificities of 66% and 76%. Their positive predictive values were slightly higher for the hold-out sample (29% and 28%) than for the hold-in model development sample (16% and 18%). QRMs can be used in multiple stages to select suitable variables for forecasting extreme asthma events. The associations between asthma and environmental factors, including temperature, ozone and carbon monoxide, can be exploited to predict future events using QRMs.

  12. Forecasting peak asthma admissions in London: an application of quantile regression models

    NASA Astrophysics Data System (ADS)

    Soyiri, Ireneous N.; Reidpath, Daniel D.; Sarran, Christophe

    2013-07-01

    Asthma is a chronic condition of great public health concern globally. The associated morbidity, mortality and healthcare utilisation place an enormous burden on healthcare infrastructure and services. This study demonstrates a multistage quantile regression approach to predicting excess demand for health care services in the form of daily asthma admissions in London, using retrospective data from the Hospital Episode Statistics, weather and air quality. Trivariate quantile regression models (QRM) of daily asthma admissions were fitted to a 14-day range of lags of environmental factors, accounting for seasonality in a hold-in sample of the data. Representative lags were pooled to form multivariate predictive models, selected through a systematic backward stepwise reduction approach. Models were cross-validated using a hold-out sample of the data, and their respective root mean square error measures, sensitivity, specificity and predictive values were compared. Two of the predictive models were able to detect extreme numbers of daily asthma admissions at sensitivity levels of 76% and 62%, with specificities of 66% and 76%. Their positive predictive values were slightly higher for the hold-out sample (29% and 28%) than for the hold-in model development sample (16% and 18%). QRMs can be used in multiple stages to select suitable variables for forecasting extreme asthma events. The associations between asthma and environmental factors, including temperature, ozone and carbon monoxide, can be exploited to predict future events using QRMs.

  13. Moisture availability constraints on the leaf area to sapwood area ratio: analysis of measurements on Australian evergreen angiosperm trees

    NASA Astrophysics Data System (ADS)

    Togashi, Henrique; Prentice, Colin; Evans, Bradley; Forrester, David; Drake, Paul; Feikema, Paul; Brooksbank, Kim; Eamus, Derek; Taylor, Daniel

    2014-05-01

    The leaf area to sapwood area ratio (LA:SA) is a key plant trait that links photosynthesis to transpiration. Pipe model theory states that the sapwood cross-sectional area of a stem or branch at any point should scale isometrically with the area of leaves distal to that point. Optimization theory further suggests that LA:SA should decrease towards drier climates. Although acclimation of LA:SA to climate has been reported within species, much less is known about the scaling of this trait with climate among species. We compiled LA:SA measurements from 184 species of Australian evergreen angiosperm trees. The pipe model was broadly confirmed, based on measurements on branches and trunks of trees from one to 27 years old. We found considerable scatter in LA:SA among species. However, quantile regression showed strong (0.2 < R1 < 0.65) positive relationships between two climatic moisture indices and the lowermost (5%) and uppermost (5-15%) quantiles of log LA:SA, suggesting that moisture availability constrains the envelope of minimum and maximum values of LA:SA typical for any given climate. Interspecific differences in plant hydraulic conductivity are probably responsible for the large scatter of values in the mid-quantile range and may be an important determinant of tree morphology.

  14. Morphological and moisture availability controls of the leaf area-to-sapwood area ratio: analysis of measurements on Australian trees.

    PubMed

    Togashi, Henrique Furstenau; Prentice, Iain Colin; Evans, Bradley John; Forrester, David Ian; Drake, Paul; Feikema, Paul; Brooksbank, Kim; Eamus, Derek; Taylor, Daniel

    2015-03-01

    The leaf area-to-sapwood area ratio (LA:SA) is a key plant trait that links photosynthesis to transpiration. The pipe model theory states that the sapwood cross-sectional area of a stem or branch at any point should scale isometrically with the area of leaves distal to that point. Optimization theory further suggests that LA:SA should decrease toward drier climates. Although acclimation of LA:SA to climate has been reported within species, much less is known about the scaling of this trait with climate among species. We compiled LA:SA measurements from 184 species of Australian evergreen angiosperm trees. The pipe model was broadly confirmed, based on measurements on branches and trunks of trees from one to 27 years old. Despite considerable scatter in LA:SA among species, quantile regression showed strong (0.2 < R1 < 0.65) positive relationships between two climatic moisture indices and the lowermost (5%) and uppermost (5-15%) quantiles of log LA:SA, suggesting that moisture availability constrains the envelope of minimum and maximum values of LA:SA typical for any given climate. Interspecific differences in plant hydraulic conductivity are probably responsible for the large scatter of values in the mid-quantile range and may be an important determinant of tree morphology.

  15. Morphological and moisture availability controls of the leaf area-to-sapwood area ratio: analysis of measurements on Australian trees

    PubMed Central

    Togashi, Henrique Furstenau; Prentice, Iain Colin; Evans, Bradley John; Forrester, David Ian; Drake, Paul; Feikema, Paul; Brooksbank, Kim; Eamus, Derek; Taylor, Daniel

    2015-01-01

    The leaf area-to-sapwood area ratio (LA:SA) is a key plant trait that links photosynthesis to transpiration. The pipe model theory states that the sapwood cross-sectional area of a stem or branch at any point should scale isometrically with the area of leaves distal to that point. Optimization theory further suggests that LA:SA should decrease toward drier climates. Although acclimation of LA:SA to climate has been reported within species, much less is known about the scaling of this trait with climate among species. We compiled LA:SA measurements from 184 species of Australian evergreen angiosperm trees. The pipe model was broadly confirmed, based on measurements on branches and trunks of trees from one to 27 years old. Despite considerable scatter in LA:SA among species, quantile regression showed strong (0.2 < R1 < 0.65) positive relationships between two climatic moisture indices and the lowermost (5%) and uppermost (5–15%) quantiles of log LA:SA, suggesting that moisture availability constrains the envelope of minimum and maximum values of LA:SA typical for any given climate. Interspecific differences in plant hydraulic conductivity are probably responsible for the large scatter of values in the mid-quantile range and may be an important determinant of tree morphology. PMID:25859331

  16. The heterogeneous effects of urbanization and income inequality on CO2 emissions in BRICS economies: evidence from panel quantile regression.

    PubMed

    Zhu, Huiming; Xia, Hang; Guo, Yawei; Peng, Cheng

    2018-04-12

    This paper empirically examines the effects of urbanization and income inequality on CO2 emissions in the BRICS economies (i.e., Brazil, Russia, India, China, and South Africa) over the period 1994-2013. The method used is panel quantile regression, which takes into account both unobserved individual heterogeneity and distributional heterogeneity. Our empirical results indicate that urbanization has a significant and negative impact on carbon emissions, except in the 80th, 90th, and 95th quantiles. We also quantitatively investigate the direct and indirect effects of urbanization on carbon emissions, and the results show that urbanization's effect may be underestimated if its indirect effect is ignored. In addition, in middle- and high-emission countries, income inequality has a significant and positive impact on carbon emissions. The results also indicate that in the BRICS economies there is an inverted U-shaped environmental Kuznets curve (EKC) between GDP per capita and carbon emissions. These conclusions have important policy implications: policymakers should try to narrow the income gap between the rich and the poor to improve environmental quality, and while the BRICS economies can speed up urbanization to reduce carbon emissions, they must improve energy efficiency and use clean energy to the greatest extent possible in the process.

  17. A Bayesian Approach for Summarizing and Modeling Time-Series Exposure Data with Left Censoring.

    PubMed

    Houseman, E Andres; Virji, M Abbas

    2017-08-01

    Direct reading instruments are valuable tools for measuring exposure as they provide real-time measurements for rapid decision making. However, their use is limited to general survey applications, in part due to issues related to their performance. Moreover, statistical analysis of real-time data is complicated by autocorrelation among successive measurements, non-stationary time series, and the presence of left-censoring due to the limit of detection (LOD). A Bayesian framework is proposed that accounts for non-stationary autocorrelation and LOD issues in exposure time-series data in order to model workplace factors that affect exposure and to estimate summary statistics for tasks or other covariates of interest. A spline-based approach is used to model non-stationary autocorrelation with relatively few assumptions about the autocorrelation structure. Left-censoring is addressed by integrating over the left tail of the distribution. The model is fit using Markov chain Monte Carlo within a Bayesian paradigm. The method can flexibly account for hierarchical relationships, random effects and fixed effects of covariates. The method is implemented using the rjags package in R, and is illustrated by applying it to real-time exposure data. Estimates for task means and covariates from the Bayesian model are compared to those from conventional frequentist models including linear regression, mixed-effects, and time-series models with different autocorrelation structures. Simulation studies are also conducted to evaluate method performance. Simulation studies with the percentage of measurements below the LOD ranging from 0 to 50% showed the lowest root mean squared errors for task means and the least biased standard deviations from the Bayesian model compared to the frequentist models across all levels of LOD. In the application, task means from the Bayesian model were similar to means from the frequentist models, while the standard deviations differed. Parameter estimates for covariates were significant in some frequentist models, but in the Bayesian model their credible intervals contained zero; such discrepancies were observed in multiple datasets. Variance components from the Bayesian model reflected substantial autocorrelation, consistent with the frequentist models, except for the autoregressive moving average model. Plots of means from the Bayesian model showed good fit to the observed data. The proposed Bayesian model provides an approach for modeling non-stationary autocorrelation in a hierarchical modeling framework to estimate task means, standard deviations, quantiles, and parameter estimates for covariates that are less biased and have better performance characteristics than some contemporary methods.
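
    The left-censoring device described above (integrating over the left tail below the LOD) is easiest to see in a stripped-down, frequentist, i.i.d. version; the Bayesian and autocorrelation machinery of the paper is deliberately omitted, and all names and values here are illustrative:

    ```python
    import numpy as np
    from scipy import stats
    from scipy.optimize import minimize

    def censored_normal_nll(params, x, lod):
        """Detected values contribute a density term; each non-detect (NaN)
        contributes the probability mass below the LOD (left-tail integral)."""
        mu, log_sigma = params
        sigma = np.exp(log_sigma)
        detected = x[~np.isnan(x)]
        n_cens = int(np.isnan(x).sum())
        ll = stats.norm.logpdf(detected, mu, sigma).sum()
        ll += n_cens * stats.norm.logcdf(lod, mu, sigma)
        return -ll

    rng = np.random.default_rng(5)
    raw = rng.normal(1.0, 0.8, 200)
    lod = 0.3
    obs = np.where(raw < lod, np.nan, raw)   # non-detects recorded as NaN

    fit = minimize(censored_normal_nll, x0=[0.0, 0.0], args=(obs, lod))
    print("mu, sigma:", fit.x[0], np.exp(fit.x[1]))
    ```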

  18. Estimation of suspended-sediment rating curves and mean suspended-sediment loads

    USGS Publications Warehouse

    Crawford, Charles G.

    1991-01-01

    A simulation study was done to evaluate: (1) the accuracy and precision of parameter estimates for the bias-corrected, transformed-linear and non-linear models obtained by the method of least squares; and (2) the accuracy of mean suspended-sediment loads calculated by the flow-duration, rating-curve method using model parameters obtained by the alternative methods. Parameter estimates obtained by least squares for the bias-corrected, transformed-linear model were considerably more precise than those obtained for the non-linear or weighted non-linear model. The accuracy of parameter estimates obtained for the bias-corrected, transformed-linear and weighted non-linear models was similar and was much greater than the accuracy obtained by non-linear least squares. The improved parameter estimates obtained by the bias-corrected, transformed-linear or weighted non-linear model yield estimates of mean suspended-sediment load calculated by the flow-duration, rating-curve method that are more accurate and precise than those obtained for the non-linear model.
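
    A compact sketch of a bias-corrected, transformed-linear rating curve follows; the report does not name its correction, so Duan's smearing estimator is used here as one standard choice, with synthetic discharge and sediment data:

    ```python
    import numpy as np

    rng = np.random.default_rng(6)
    Q = rng.lognormal(3, 1, 300)                                   # discharge
    S = np.exp(0.5 + 1.6 * np.log(Q) + rng.normal(0, 0.4, 300))    # sediment

    # transformed-linear rating curve: log S = a + b log Q
    b, a = np.polyfit(np.log(Q), np.log(S), 1)
    resid = np.log(S) - (a + b * np.log(Q))

    # the naive back-transform exp(a + b log Q) is biased low; Duan's smearing
    # factor (mean of the exponentiated residuals) corrects the retransformation
    smear = np.exp(resid).mean()
    S_hat = smear * np.exp(a + b * np.log(Q))
    print(f"bias-correction factor: {smear:.3f}")
    ```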

  19. Factors Associated with Adherence to Adjuvant Endocrine Therapy Among Privately Insured and Newly Diagnosed Breast Cancer Patients: A Quantile Regression Analysis.

    PubMed

    Farias, Albert J; Hansen, Ryan N; Zeliadt, Steven B; Ornelas, India J; Li, Christopher I; Thompson, Beti

    2016-08-01

    Adherence to adjuvant endocrine therapy (AET) for estrogen receptor-positive breast cancer remains suboptimal, which suggests that women are not getting the full benefit of the treatment to reduce breast cancer recurrence and mortality. The majority of studies on adherence to AET focus on identifying factors among women at the highest levels of adherence and provide little insight into the factors that influence medication use across the distribution of adherence. The objective of this study was to understand how factors influence adherence among women across low and high levels of adherence. A retrospective evaluation was conducted using the Truven Health MarketScan Commercial Claims and Encounters Database from 2007-2011. Privately insured women aged 18-64 years who were recently diagnosed and treated for breast cancer and who initiated AET within 12 months of primary treatment were assessed. Adherence was measured as the proportion of days covered (PDC) over a 12-month period. Simultaneous multivariable quantile regression was used to assess the association between adherence and treatment and demographic factors, use of mail order pharmacies, medication switching, and out-of-pocket costs. The effect of each variable was examined at the 40th, 60th, 80th, and 95th quantiles. Among the 6,863 women in the cohort, mail order pharmacies had the greatest influence on adherence at the 40th quantile, associated with a 29.6% (95% CI = 22.2-37.0) higher PDC compared with retail pharmacies. An out-of-pocket cost for a 30-day supply of AET greater than $20 was associated with an 8.6% (95% CI = 2.8-14.4) lower PDC versus $0-$9.99. The main factors that influenced adherence at the 95th quantile were mail order pharmacies, associated with a 4.4% higher PDC (95% CI = 3.8-5.0) versus retail pharmacies, and switching AET medication 2 or more times, associated with a 5.6% lower PDC versus not switching (95% CI = 2.3-9.0). Factors associated with adherence differed across quantiles. Addressing the use of mail order pharmacies and out-of-pocket costs for AET may have the greatest influence on improving adherence among women with low adherence.

  20. High throughput nonparametric probability density estimation.

    PubMed

    Farmer, Jenny; Jacobs, Donald

    2018-01-01

    In high throughput applications, such as those found in bioinformatics and finance, it is important to determine accurate probability distribution functions despite only minimal information about data characteristics, and without using human subjectivity. Such an automated process for univariate data is implemented to achieve this goal by merging the maximum entropy method with single order statistics and maximum likelihood. The only required properties of the random variables are that they are continuous and that they are, or can be approximated as, independent and identically distributed. A quasi-log-likelihood function based on single order statistics for sampled uniform random data is used to empirically construct a sample-size-invariant universal scoring function. A probability density estimate is then determined by iteratively improving trial cumulative distribution functions, where better estimates are quantified by the scoring function that identifies atypical fluctuations. This criterion resists under- and overfitting the data, as an alternative to employing the Bayesian or Akaike information criterion. Multiple estimates for the probability density reflect uncertainties due to statistical fluctuations in random samples. Scaled quantile residual plots are also introduced as an effective diagnostic for visualizing the quality of the estimated probability densities. Benchmark tests show that estimates for the probability density function (PDF) converge to the true PDF as sample size increases on particularly difficult test probability densities that include cases with discontinuities, multi-resolution scales, heavy tails, and singularities. These results indicate that the method has general applicability for high throughput statistical inference.

  1. High throughput nonparametric probability density estimation

    PubMed Central

    Farmer, Jenny; Jacobs, Donald

    2018-01-01

    In high throughput applications, such as those found in bioinformatics and finance, it is important to determine accurate probability distribution functions despite only minimal information about data characteristics, and without using human subjectivity. Such an automated process for univariate data is implemented to achieve this goal by merging the maximum entropy method with single order statistics and maximum likelihood. The only required properties of the random variables are that they are continuous and that they are, or can be approximated as, independent and identically distributed. A quasi-log-likelihood function based on single order statistics for sampled uniform random data is used to empirically construct a sample-size-invariant universal scoring function. A probability density estimate is then determined by iteratively improving trial cumulative distribution functions, where better estimates are quantified by the scoring function that identifies atypical fluctuations. This criterion resists under- and overfitting the data, as an alternative to employing the Bayesian or Akaike information criterion. Multiple estimates for the probability density reflect uncertainties due to statistical fluctuations in random samples. Scaled quantile residual plots are also introduced as an effective diagnostic for visualizing the quality of the estimated probability densities. Benchmark tests show that estimates for the probability density function (PDF) converge to the true PDF as sample size increases on particularly difficult test probability densities that include cases with discontinuities, multi-resolution scales, heavy tails, and singularities. These results indicate that the method has general applicability for high throughput statistical inference. PMID:29750803

  2. A predictive estimation method for carbon dioxide transport by data-driven modeling with a physically-based data model

    NASA Astrophysics Data System (ADS)

    Jeong, Jina; Park, Eungyu; Han, Weon Shik; Kim, Kue-Young; Jun, Seong-Chun; Choung, Sungwook; Yun, Seong-Taek; Oh, Junho; Kim, Hyun-Jun

    2017-11-01

    In this study, a data-driven method for predicting CO2 leaks and associated concentrations from geological CO2 sequestration is developed. Several candidate models are compared based on their reproducibility and predictive capability for CO2 concentration measurements from the Environment Impact Evaluation Test (EIT) site in Korea. Based on the data mining results, a one-dimensional solution of the advective-dispersive equation for steady flow (i.e., the Ogata-Banks solution) is found to be most representative of the test data, and this model is adopted as the data model for the developed method. In the validation step, the method is applied to estimate future CO2 concentrations, with reference estimates given by the Ogata-Banks solution and part of the earlier data used as the training dataset. From the analysis, it is found that the ensemble mean of multiple estimates based on the developed method shows high prediction accuracy relative to the reference estimates. In addition, the majority of the data to be predicted fall within the proposed quantile interval, which suggests adequate representation of the uncertainty by the developed method. Therefore, incorporating a reasonable physically-based data model enhances the prediction capability of the data-driven model. The proposed method is not confined to estimation of CO2 concentrations and may be applied to various real-time monitoring data from subsurface sites to develop automated control, management or decision-making systems.
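
    For reference, the Ogata-Banks solution invoked here is the classical one-dimensional advection-dispersion solution for steady flow with a constant-concentration inlet boundary; with source concentration $C_0$, pore velocity $v$, and dispersion coefficient $D$, it reads

    $$C(x,t) = \frac{C_0}{2}\left[\operatorname{erfc}\!\left(\frac{x - vt}{2\sqrt{Dt}}\right) + \exp\!\left(\frac{vx}{D}\right)\operatorname{erfc}\!\left(\frac{x + vt}{2\sqrt{Dt}}\right)\right].$$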

  3. Easy and accurate variance estimation of the nonparametric estimator of the partial area under the ROC curve and its application.

    PubMed

    Yu, Jihnhee; Yang, Luge; Vexler, Albert; Hutson, Alan D

    2016-06-15

    The receiver operating characteristic (ROC) curve is a popular technique with applications such as evaluating the accuracy of a biomarker in discriminating between disease and non-disease groups. A common measure of the accuracy of a given diagnostic marker is the area under the ROC curve (AUC). In contrast with the AUC, the partial area under the ROC curve (pAUC) considers only the region corresponding to certain specificities (i.e., true-negative rates), and it can often be clinically more relevant than examining the entire ROC curve. The pAUC is commonly estimated based on a U-statistic with a plug-in sample quantile, making the estimator a non-traditional U-statistic. In this article, we propose an accurate and easy method to obtain the variance of the nonparametric pAUC estimator. The proposed method is easy to implement both for a single biomarker test and for the comparison of two correlated biomarkers, because it simply adapts the existing variance estimator for U-statistics. We show the accuracy and other advantages of the proposed variance estimation method by broadly comparing it with previously existing methods. Further, we develop an empirical likelihood inference method based on the proposed variance estimator through a simple implementation. In an application, we demonstrate that, depending on whether inference is based on the AUC or the pAUC, we can reach different decisions about the prognostic ability of the same set of biomarkers. Copyright © 2016 John Wiley & Sons, Ltd.
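
    Not the paper's U-statistic estimator, but a quick empirical pAUC (the trapezoidal area under the ROC curve restricted to low false-positive rates, i.e., high specificities) can be sketched with scikit-learn as follows; the data are simulated:

    ```python
    import numpy as np
    from sklearn.metrics import roc_curve

    def partial_auc(y_true, scores, fpr_max=0.2):
        """Area under the empirical ROC curve for FPR in [0, fpr_max],
        i.e., specificities of at least 1 - fpr_max."""
        fpr, tpr, _ = roc_curve(y_true, scores)
        tpr_cut = np.interp(fpr_max, fpr, tpr)     # ROC height at the cutoff
        keep = fpr <= fpr_max
        x = np.append(fpr[keep], fpr_max)
        y = np.append(tpr[keep], tpr_cut)
        return np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2)   # trapezoid rule

    rng = np.random.default_rng(7)
    labels = np.r_[np.zeros(200), np.ones(200)]
    scores = np.r_[rng.normal(0, 1, 200), rng.normal(1, 1, 200)]
    print(partial_auc(labels, scores))
    ```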

  4. Robust estimation for partially linear models with large-dimensional covariates

    PubMed Central

    Zhu, LiPing; Li, RunZe; Cui, HengJian

    2014-01-01

    We are concerned with robust estimation procedures for estimating the parameters in partially linear models with large-dimensional covariates. To enhance interpretability, we suggest implementing a nonconcave regularization method in the robust estimation procedure to select important covariates from the linear component. We establish consistency for both the linear and the nonlinear components when the covariate dimension diverges at the rate of o(n), where n is the sample size. We show that the robust estimate of the linear component performs asymptotically as well as its oracle counterpart, which assumes the baseline function and the unimportant covariates to be known a priori. With a consistent estimator of the linear component, we estimate the nonparametric component by robust local linear regression. It is proved that the robust estimate of the nonlinear component performs asymptotically as well as if the linear component were known in advance. Comprehensive simulation studies are carried out and an application is presented to examine the finite-sample performance of the proposed procedures. PMID:24955087

  5. Robust estimation for partially linear models with large-dimensional covariates.

    PubMed

    Zhu, LiPing; Li, RunZe; Cui, HengJian

    2013-10-01

    We are concerned with robust estimation procedures for estimating the parameters in partially linear models with large-dimensional covariates. To enhance interpretability, we suggest implementing a nonconcave regularization method in the robust estimation procedure to select important covariates from the linear component. We establish consistency for both the linear and the nonlinear components when the covariate dimension diverges at the rate of o(n), where n is the sample size. We show that the robust estimate of the linear component performs asymptotically as well as its oracle counterpart, which assumes the baseline function and the unimportant covariates to be known a priori. With a consistent estimator of the linear component, we estimate the nonparametric component by robust local linear regression. It is proved that the robust estimate of the nonlinear component performs asymptotically as well as if the linear component were known in advance. Comprehensive simulation studies are carried out and an application is presented to examine the finite-sample performance of the proposed procedures.

  6. Tractable Experiment Design via Mathematical Surrogates

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Williams, Brian J.

    This presentation summarizes the development and implementation of quantitative design criteria motivated by targeted inference objectives for identifying new, potentially expensive computational or physical experiments. The first application is concerned with estimating features of quantities of interest arising from complex computational models, such as quantiles or failure probabilities. A sequential strategy is proposed for iterative refinement of the importance distributions used to efficiently sample the uncertain inputs to the computational model. In the second application, effective use of mathematical surrogates is investigated to help alleviate the analytical and numerical intractability often associated with Bayesian experiment design. This approach allows for the incorporation of prior information into the design process without the need for gross simplification of the design criterion. Illustrative examples of both design problems will be presented as an argument for the relevance of these research problems.

  7. An evaluation of the effectiveness of a risk-based monitoring approach implemented with clinical trials involving implantable cardiac medical devices.

    PubMed

    Diani, Christopher A; Rock, Angie; Moll, Phil

    2017-12-01

    Background Risk-based monitoring is a concept endorsed by the Food and Drug Administration to improve clinical trial data quality by focusing monitoring efforts on critical data elements and higher risk investigator sites. BIOTRONIK approached this by implementing a comprehensive strategy that assesses risk and data quality through a combination of operational controls and data surveillance. This publication demonstrates the effectiveness of a data-driven risk assessment methodology when used in conjunction with a tailored monitoring plan. Methods We developed a data-driven risk assessment system to rank 133 investigator sites comprising 3442 subjects and identify those sites that pose a potential risk to the integrity of data collected in implantable cardiac device clinical trials. This included identification of specific risk factors and a weighted scoring mechanism. We conducted trend analyses for risk assessment data collected over 1 year to assess the overall impact of our data surveillance process combined with other operational monitoring efforts. Results Trend analyses of key risk factors revealed an improvement in the quality of data collected during the observation period. Three risk factors (follow-up compliance rate, unavailability of critical data, and noncompliance rate) correspond closely with the Food and Drug Administration's risk-based monitoring guidance document. Among these three risk factors, 100% (12/12) of quantiles analyzed showed an increase in data quality. Of these, 67% (8/12) of the improving trends in the worst performing quantiles had p-values less than 0.05, and 17% (2/12) had p-values between 0.05 and 0.06. Among the poorest performing site quantiles, there was a statistically significant decrease in subject follow-up noncompliance rates, protocol noncompliance rates, and the incidence of missing critical data. Conclusion One year after implementation of a comprehensive strategy for risk-based monitoring, including a data-driven risk assessment methodology to target on-site monitoring visits, statistically significant improvement was seen in a majority of measurable risk factors at the worst performing site quantiles. For the three risk factors most critical to the overall compliance of cardiac rhythm management medical device studies (follow-up compliance rate, unavailability of critical data, and noncompliance rate), we measured significant improvement in data quality. Although the worst performing site quantiles improved, though not significantly, in some risk factors such as subject attrition, the data-driven risk assessment highlighted key areas on which to continue focusing both on-site and centralized monitoring efforts. Data-driven surveillance of clinical trial performance provides actionable observations that can improve site performance. Clinical trials utilizing risk-based monitoring by leveraging a data-driven quality assessment combined with specific operational procedures may achieve improvements in data quality and resource efficiency.

  8. Two-sided Topp-Leone Weibull distribution

    NASA Astrophysics Data System (ADS)

    Podeang, Krittaya; Bodhisuwan, Winai

    2017-11-01

    In this paper, we introduce a general class of lifetime distributions, called the two-sided Topp-Leone generated family of distributions. A special case of the new family is the two-sided Topp-Leone Weibull distribution, which uses the two-sided Topp-Leone distribution as a generator for the Weibull distribution. The two-sided Topp-Leone Weibull distribution exhibits several density shapes, including decreasing, unimodal, and bimodal, which makes it more flexible than the Weibull distribution. Its quantile function is presented, and parameter estimation by maximum likelihood is discussed. The proposed distribution is applied to a strength data set, a data set of remission times of bladder cancer patients, and a data set of times to failure of turbochargers. We compare the proposed distribution to the Topp-Leone generated Weibull distribution. In conclusion, the two-sided Topp-Leone Weibull distribution performs similarly to the Topp-Leone generated Weibull distribution on the first and second data sets, but fits better than the Topp-Leone generated Weibull distribution on the third.
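
    The practical role of a quantile function in proposals like this is random-variate generation by inverse-transform sampling; the sketch below uses the plain Weibull baseline (the two-sided Topp-Leone Weibull quantile function itself is not reproduced here):

    ```python
    import numpy as np

    def weibull_quantile(u, shape, scale):
        """Weibull inverse CDF: Q(u) = scale * (-log(1 - u))**(1/shape)."""
        return scale * (-np.log1p(-u)) ** (1.0 / shape)

    # inverse-transform sampling: push Uniform(0,1) draws through the quantile function
    rng = np.random.default_rng(8)
    u = rng.uniform(size=5)
    print(weibull_quantile(u, shape=1.5, scale=2.0))
    ```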

  9. A BEHAVIORAL ECONOMIC MODEL OF ALCOHOL ADVERTISING AND PRICE

    PubMed Central

    SAFFER, HENRY; DAVE, DHAVAL; GROSSMAN, MICHAEL

    2016-01-01

    This paper presents a new empirical study of the effects of televised alcohol advertising and alcohol price on alcohol consumption. A novel feature of this study is that the empirical work is guided by insights from behavioral economic theory. Unlike the theory used in most prior studies, this theory predicts that restrictions on alcohol advertising on TV would be more effective in reducing consumption for individuals with high consumption levels but less effective for individuals with low consumption levels. The estimation employs data from the National Longitudinal Survey of Youth, and the empirical model is estimated with quantile regressions. The results show that advertising has a small positive effect on consumption and that this effect is relatively larger at high consumption levels. The continuing importance of alcohol taxes is also supported. Education is employed as a proxy for self-regulation, and the results are consistent with this assumption. The key conclusion is that restrictions on alcohol advertising on TV would have a small negative effect on drinking, and this effect would be larger for heavy drinkers. PMID:25919364

  10. Independent technical review and analysis of hydraulic modeling and hydrology under low-flow conditions of the Des Plaines River near Riverside, Illinois

    USGS Publications Warehouse

    Over, Thomas M.; Straub, Timothy D.; Hortness, Jon E.; Murphy, Elizabeth A.

    2012-01-01

    The U.S. Geological Survey (USGS) has operated a streamgage and published daily flows for the Des Plaines River at Riverside since October 1, 1943. A HEC-RAS model has been developed to estimate the effect of the removal of Hofmann Dam, near the gage, on low-flow elevations in the reach extending approximately 3 miles upstream from the dam. The Village of Riverside, the Illinois Department of Natural Resources-Office of Water Resources (IDNR-OWR), and the U.S. Army Corps of Engineers-Chicago District (USACE-Chicago) are interested in verifying the performance of the HEC-RAS model for specific low-flow conditions and in obtaining estimates of selected daily flow quantiles and other low-flow statistics for a period of record that best represents current hydrologic conditions. Because the USGS publishes streamflow records for the Des Plaines River system and provides unbiased analyses of flows and stream hydraulic characteristics, the USGS served as an Independent Technical Reviewer (ITR) for this study.

  11. A quality-of-life-oriented endpoint for comparing therapies.

    PubMed

    Gelber, R D; Gelman, R S; Goldhirsch, A

    1989-09-01

    An endpoint, time without symptoms of disease and toxicity of treatment (TWiST), is defined to provide a single measure of length and quality of survival. Time with subjective side effects of treatment and time with unpleasant symptoms of disease are subtracted from overall survival time to calculate TWiST for each patient. The purpose of this paper is to describe the construction of this endpoint, and to elaborate on its interpretation for patient care decision-making. Estimating the distribution of TWiST using actuarial methods is shown by simulation studies to be biased as a result of induced dependency between TWiST and its censoring distribution. Considering the distribution of TWiST accumulated within a specified time from start of therapy, L, allows one to reduce this bias by substituting estimated TWiST for censored values and provides a method to evaluate the "payback" period for early toxic effects. Quantile distance plots provide graphical representations for treatment comparisons. The analysis of Ludwig Trial III evaluating toxic adjuvant therapies versus a no-treatment control group for postmenopausal women with node-positive breast cancer illustrates the methodology.

  12. Estimation of Value-at-Risk for Energy Commodities via CAViaR Model

    NASA Astrophysics Data System (ADS)

    Xiliang, Zhao; Xi, Zhu

    This paper uses the Conditional Autoregressive Value at Risk (CAViaR) model proposed by Engle and Manganelli (2004) to evaluate the value-at-risk of daily spot prices of Brent crude oil and West Texas Intermediate (WTI) crude oil over the period May 21, 1987 to November 18, 2008. The accuracy of the estimates from the CAViaR model, Normal-GARCH, and GED-GARCH is then compared. The results show that all the methods perform well at the low confidence level (95%): GED-GARCH is best for the spot WTI price, while Normal-GARCH and Adaptive-CAViaR are best for the spot Brent price. However, at the high confidence level (99%), Normal-GARCH performs well for spot WTI, whereas GED-GARCH and all four CAViaR specifications do well for the spot Brent price; Normal-GARCH performs badly for the spot Brent price. The results suggest that CAViaR performs as well as GED-GARCH, since CAViaR models the quantile autoregression directly, but it does not outperform GED-GARCH, although it does outperform Normal-GARCH.
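
    For orientation, one of the four Engle and Manganelli (2004) specifications, the symmetric absolute value CAViaR, models the conditional quantile $f_t$ directly as an autoregression driven by the lagged return $y_{t-1}$:

    $$f_t(\beta) = \beta_1 + \beta_2\, f_{t-1}(\beta) + \beta_3\, \lvert y_{t-1} \rvert,$$

    with the parameters estimated by minimizing the quantile (pinball) loss rather than a likelihood.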

  13. Fitting monthly Peninsula Malaysian rainfall using Tweedie distribution

    NASA Astrophysics Data System (ADS)

    Yunus, R. M.; Hasan, M. M.; Zubairi, Y. Z.

    2017-09-01

    In this study, the Tweedie distribution was used to fit monthly rainfall data from 24 monitoring stations in Peninsular Malaysia for the period from January 2008 to April 2015. The aim of the study is to determine whether distributions within the Tweedie family fit the monthly Malaysian rainfall data well. Within the Tweedie family, the gamma distribution is generally used for fitting rainfall totals; however, the Poisson-gamma distribution is more useful for describing two important features of rainfall patterns: the occurrences (dry months) and the amounts (wet months). First, the appropriate distribution of the monthly rainfall was identified within the Tweedie family for each station. Then, a Tweedie generalised linear model (GLM) with no explanatory variables was used to model the monthly rainfall data. Graphical representation was used to assess model appropriateness. QQ plots of quantile residuals show that the Tweedie models fit the monthly rainfall data better for the majority of the stations on the west coast and in the midlands than for those on the east coast of the Peninsula. This significant finding suggests that the best-fitting distribution depends on the geographical location of the monitoring station. In this paper, a simple model is developed for generating synthetic rainfall data for use in various areas, including agriculture and irrigation. We have shown that data simulated using the Tweedie distribution have a frequency histogram fairly similar to that of the actual data. Both the mean number of rainfall events and the mean amount of rain for a month were estimated simultaneously in cases where the Poisson-gamma distribution fit the data reasonably well. This work thus complements previous studies that fit the rainfall amount and the occurrence of rainfall events separately, each to a different distribution.
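
    A Tweedie GLM of this kind is available off the shelf; the sketch below simulates compound Poisson-gamma "rainfall" and fits it with scikit-learn's TweedieRegressor, where a power parameter strictly between 1 and 2 selects the Poisson-gamma member that permits exact zeros (dry months) alongside continuous positive totals (wet months):

    ```python
    import numpy as np
    from sklearn.linear_model import TweedieRegressor

    rng = np.random.default_rng(9)
    n = 300
    month = rng.integers(1, 13, n)
    X = np.column_stack([np.sin(2 * np.pi * month / 12),
                         np.cos(2 * np.pi * month / 12)])

    # compound Poisson-gamma rainfall: a Poisson number of events per month,
    # each with a gamma-distributed amount; k = 0 gives an exactly dry month
    events = rng.poisson(2 + np.sin(2 * np.pi * month / 12), n)
    rain = np.array([rng.gamma(2.0, 10.0, k).sum() for k in events])

    model = TweedieRegressor(power=1.5, alpha=0.0, link="log").fit(X, rain)
    print(model.predict(X[:5]).round(1))
    ```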

  14. Integrating mindfulness training in school health education to promote healthy behaviors in adolescents: Feasibility and preliminary effects on exercise and dietary habits.

    PubMed

    Salmoirago-Blotcher, Elena; Druker, Susan; Frisard, Christine; Dunsiger, Shira I; Crawford, Sybil; Meleo-Meyer, Florence; Bock, Beth; Pbert, Lori

    2018-03-01

    Whether mindfulness training (MT) can improve healthy behaviors is unknown. This study sought to determine the feasibility and acceptability of integrating MT into school-based health education (primary outcomes) and to explore its possible effects on healthy behaviors (exploratory outcomes). Two high schools in Massachusetts (2014-2015) were randomized to health education plus MT (HE-MT) (one session/week for 8 weeks) or to health education plus attention control (HE-AC). Dietary habits (24-h dietary recalls) and moderate-to-vigorous physical activity (MVPA/7-day recalls) were assessed at baseline, end of treatment (EOT), and 6 months thereafter. Quantile regression and linear mixed models were used, respectively, to estimate effects on MVPA and dietary outcomes, adjusting for confounders. We recruited 53 9th graders (30 HE-MT, 23 HE-AC; average age 14.5 years, 60% white, 59% female). Retention was 100% (EOT) and 96% (6 months); attendance was 96% (both conditions), with moderate-to-high satisfaction ratings. Among students with higher MVPA at baseline, MVPA was higher in HE-MT vs. HE-AC at both EOT (median difference = 81 min/week, p = 0.005) and 6 months (p = 0.004). Among males, median MVPA was higher (median difference = 99 min/week) in HE-MT vs. HE-AC at both EOT (p = 0.056) and 6 months (p = 0.04). No differences were noted in dietary habits. In sum, integrating MT into school-based health education was feasible and acceptable and had promising effects on MVPA among male and more active adolescents. These findings suggest that MT may improve healthy behaviors in adolescents and deserve replication in larger, rigorous studies.

  15. The Massachusetts Sustainable-Yield Estimator: A decision-support tool to assess water availability at ungaged stream locations in Massachusetts

    USGS Publications Warehouse

    Archfield, Stacey A.; Vogel, Richard M.; Steeves, Peter A.; Brandt, Sara L.; Weiskel, Peter K.; Garabedian, Stephen P.

    2010-01-01

    Federal, State, and local water-resource managers require a variety of data and modeling tools to better understand water resources. The U.S. Geological Survey, in cooperation with the Massachusetts Department of Environmental Protection, has developed a statewide, interactive decision-support tool to meet this need. The decision-support tool, referred to as the Massachusetts Sustainable-Yield Estimator (MA SYE), provides screening-level estimates of the sustainable yield of a basin, defined as the difference between the unregulated streamflow and some user-specified quantity of water that must remain in the stream to support such functions as recreational activities or aquatic habitat. The MA SYE tool was designed, in part, because the quantity of surface water available in a basin is a time-varying quantity subject to competing demands for water. To compute sustainable yield, the MA SYE tool estimates a time series of unregulated, daily mean streamflow for a 44-year period of record spanning October 1, 1960, through September 30, 2004. Selected streamflow quantiles from an unregulated, daily flow-duration curve are estimated by solving six regression equations that are a function of physical and climate basin characteristics at an ungaged site on a stream of interest. Streamflow is then interpolated between the estimated quantiles to obtain a continuous daily flow-duration curve. A time series of unregulated daily streamflow subsequently is created by transferring the timing of the daily streamflow at a reference streamgage to the ungaged site by equating exceedance probabilities of contemporaneous flow at the two locations. One of 66 reference streamgages is selected by kriging, a geostatistical method used here to map the spatial relation among correlations between the time series of the logarithm of daily streamflows at each reference streamgage and the ungaged site. Estimated unregulated, daily mean streamflows show good agreement with observed unregulated, daily mean streamflows at 18 streamgages located across southern New England: Nash-Sutcliffe efficiency goodness-of-fit values are between 0.69 and 0.98, and percent root-mean-square-error values are between 19 and 283 percent. The MA SYE tool also provides an estimate of streamflow adjusted for current (2000-04) water withdrawals and discharges, using a spatially referenced database of permitted groundwater and surface-water withdrawal and discharge volumes. For a user-selected basin, the database is queried to obtain the locations of water withdrawal or discharge volumes within the basin. Groundwater and surface-water withdrawals and discharges are subtracted from and added to, respectively, the unregulated daily streamflow at an ungaged site to obtain a streamflow time series that includes the effects of these withdrawals and discharges. Users also have the option of applying an analytical solution to the time-varying groundwater withdrawal and discharge volumes that takes into account the effects of aquifer properties on the timing and magnitude of streamflow alteration. The MA SYE tool assumes that groundwater and surface-water divides are coincident. For areas of southeastern Massachusetts and Cape Cod where this assumption is known to be violated, groundwater-flow models are used to estimate average monthly streamflows at fixed locations. There are several limitations to the quality and quantity of the spatially referenced database of groundwater and surface-water withdrawals and discharges. The adjusted streamflow values do not account for the effects on streamflow of climate change, septic-system discharge, impervious area, non-public water-supply withdrawals of less than 100,000 gallons per day, and impounded surface-water bodies.
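
    The quantile-interpolation and timing-transfer steps can be sketched in Python as follows. The exceedance probabilities and flow values are invented stand-ins for the outputs of the six regression equations, and the log-linear interpolation is an assumption chosen for illustration, not the tool's documented implementation.

      import numpy as np

      # Hypothetical streamflow quantiles at the ungaged site, as if estimated
      # from the regression equations (exceedance probability, flow in cfs).
      exceed_prob = np.array([0.02, 0.10, 0.30, 0.50, 0.75, 0.99])
      flow_quant  = np.array([850., 320., 140., 80., 35., 4.])

      def fdc_flow(p):
          """Flow at exceedance probability p, interpolated in log space
          to give a continuous flow-duration curve for the ungaged site."""
          return np.exp(np.interp(p, exceed_prob, np.log(flow_quant)))

      # Transfer the timing from a reference streamgage: for each day, compute
      # the exceedance probability of the gaged flow and read the ungaged
      # flow-duration curve at the same probability.
      ref_daily = np.array([500., 410., 120., 60., 95., 300., 700.])  # gage flows (cfs)
      ranks = ref_daily.argsort().argsort()                  # 0 = smallest flow
      p_exceed = 1.0 - (ranks + 0.5) / len(ref_daily)        # plotting positions
      ungaged_daily = fdc_flow(p_exceed)
      print(ungaged_daily.round(1))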

  16. Assessing the potential for improving S2S forecast skill through multimodel ensembling

    NASA Astrophysics Data System (ADS)

    Vigaud, N.; Robertson, A. W.; Tippett, M. K.; Wang, L.; Bell, M. J.

    2016-12-01

    Non-linear logistic regression is well suited to probability forecasting and has been successfully applied in the past to ensemble weather and climate predictions, providing access to the full probability distribution without any Gaussian assumption. However, little work has been done at sub-monthly lead times, where relatively small re-forecast ensemble sizes and record lengths present new challenges for which post-processing avenues have yet to be investigated. A promising approach consists of extending the definition of non-linear logistic regression by including the quantile of the forecast distribution as one of the predictors. So-called extended logistic regression (ELR), which yields mutually consistent individual threshold probabilities, is here applied to ECMWF, CFSv2 and CMA re-forecasts from the S2S database to produce rainfall probabilities at weekly resolution. The ELR model is trained on seasonally varying tercile categories computed for lead times of 1 to 4 weeks. It is then tested in a cross-validated manner, i.e., in a way that permits real-time predictability applications, to produce rainfall tercile probabilities from individual weekly hindcasts, which are finally combined by equal pooling. Results will be discussed over a broader North American region, where individual and MME forecasts generated out to 4 weeks lead are characterized by good probabilistic reliability but low sharpness, exhibiting systematically more skill in winter than in summer.
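
    A minimal sketch of the ELR idea, assuming synthetic forecast-observation pairs: the training data are stacked over tercile thresholds, and a function of the threshold (here its square root, a common but assumed choice) enters as a predictor. Because one model covers all thresholds, the fitted threshold probabilities are mutually consistent rather than crossing.

      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(1)

      # Synthetic ensemble-mean rainfall forecasts and verifying observations.
      n = 300
      ens_mean = rng.gamma(2.0, 5.0, size=n)
      obs = np.clip(ens_mean + rng.normal(0.0, 6.0, size=n), 0.0, None)

      # Climatological tercile thresholds act as the quantile predictors.
      thresholds = np.quantile(obs, [1/3, 2/3])

      # Stack the data: one row per (forecast, threshold) pair; the binary
      # response is whether the observation fell at or below that threshold.
      x, gq, y = [], [], []
      for q in thresholds:
          x.append(ens_mean)
          gq.append(np.full(n, np.sqrt(q)))       # g(q): assumed sqrt transform
          y.append((obs <= q).astype(float))
      X = sm.add_constant(np.column_stack([np.concatenate(x), np.concatenate(gq)]))
      fit = sm.Logit(np.concatenate(y), X).fit(disp=0)

      # Tercile probabilities for a new forecast value:
      f_new = 12.0
      p = [fit.predict([1.0, f_new, np.sqrt(q)])[0] for q in thresholds]
      print("P(lower tercile) =", round(p[0], 3),
            "| P(at or below upper threshold) =", round(p[1], 3))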

  17. Traffic Predictive Control: Case Study and Evaluation

    DOT National Transportation Integrated Search

    2017-06-26

    This project developed a quantile regression method for predicting future traffic flow at a signalized intersection by combining both historical and real-time data. The algorithm exploits nonlinear correlations in historical measurements and efficien...
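
    The abstract above is truncated, so the project's exact algorithm is not shown here. As one plausible realization of quantile prediction of traffic flow from historical and real-time data, the following Python sketch trains gradient-boosted quantile regressors on synthetic intersection data; the feature set, model class, and numbers are all assumptions, not the project's method.

      import numpy as np
      from sklearn.ensemble import GradientBoostingRegressor

      rng = np.random.default_rng(2)

      # Synthetic data: features are time of day and the latest real-time flow
      # reading; the target is flow 15 minutes ahead (veh/h).
      n = 2000
      minute = rng.integers(0, 1440, size=n)
      daily = 600 + 400 * np.sin(2 * np.pi * (minute - 420) / 1440)  # historical pattern
      recent_flow = daily + rng.normal(0, 60, size=n)                # real-time reading
      future_flow = daily + 0.5 * (recent_flow - daily) + rng.normal(0, 40, size=n)

      X = np.column_stack([minute, recent_flow])

      # One model per quantile; the pinball (quantile) loss makes each model
      # estimate a conditional quantile rather than the conditional mean.
      models = {q: GradientBoostingRegressor(loss="quantile", alpha=q).fit(X, future_flow)
                for q in (0.1, 0.5, 0.9)}

      x_now = np.array([[510, 900.0]])   # 8:30 a.m., current flow 900 veh/h
      for q, m in models.items():
          print(f"q={q}: {m.predict(x_now)[0]:.0f} veh/h")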

  18. Using instant messaging to enhance the interpersonal relationships of Taiwanese adolescents: evidence from quantile regression analysis.

    PubMed

    Lee, Yueh-Chiang; Sun, Ya Chung

    2009-01-01

    Even though internet use by adolescents has grown exponentially, little is known about the correlation between their interaction via Instant Messaging (IM) and the evolution of their interpersonal relationships in real life. In the present study, 369 junior high school students in Taiwan responded to questions regarding their IM usage and dispositional measures of real-life interpersonal relationships. Descriptive statistics, factor analysis, and quantile regression methods were used to analyze the data. Results indicate that (1) IM helps define adolescents' self-identity (forming and maintaining individual friendships) and social identity (belonging to a peer group), and (2) the development of interpersonal relationships is affected by IM use, as adolescents appear to use IM to improve their interpersonal relationships in real life.

  19. Streamflow trends in the United States

    USGS Publications Warehouse

    Lins, H.F.; Slack, J.R.

    1999-01-01

    Secular trends in streamflow are evaluated for 395 climate-sensitive streamgaging stations in the conterminous United States using the non-parametric Mann-Kendall test. Trends are calculated for selected quantiles of discharge, from the 0th to the 100th percentile, to evaluate differences between low-, medium-, and high-flow regimes during the twentieth century. Two general patterns emerge: trends are most prevalent in the annual minimum (Q0) to median (Q50) flow categories and least prevalent in the annual maximum (Q100) category; and, at all but the highest quantiles, streamflow has increased across broad sections of the United States. Decreases appear only in parts of the Pacific Northwest and the Southeast. Systematic patterns are less apparent in the Q100 flow. Hydrologically, these results indicate that the conterminous United States is getting wetter, but less extreme.
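
    For reference, the Mann-Kendall test used in this analysis can be implemented in a few lines of Python. This version uses the normal approximation without a tie correction, applied to a synthetic annual median-flow (Q50) series; the data are invented for illustration.

      import numpy as np
      from scipy.stats import norm

      def mann_kendall(series):
          """Mann-Kendall trend test (normal approximation, no tie correction)."""
          x = np.asarray(series, dtype=float)
          n = len(x)
          # S counts concordant minus discordant pairs over all i < j.
          s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
          var_s = n * (n - 1) * (2 * n + 5) / 18.0
          z = (s - np.sign(s)) / np.sqrt(var_s) if s != 0 else 0.0  # continuity correction
          p = 2 * (1 - norm.cdf(abs(z)))                            # two-sided p-value
          return s, z, p

      # Hypothetical annual Q50 series with a mild upward trend plus noise:
      rng = np.random.default_rng(3)
      q50 = 100 + 0.4 * np.arange(60) + rng.normal(0, 8, size=60)
      print(mann_kendall(q50))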

  20. Disturbance automated reference toolset (DART): Assessing patterns in ecological recovery from energy development on the Colorado Plateau

    USGS Publications Warehouse

    Nauman, Travis; Duniway, Michael C.; Villarreal, Miguel; Poitras, Travis

    2017-01-01

    A new disturbance automated reference toolset (DART) was developed to monitor human land-surface impacts using soil type and ecological context. DART identifies reference areas with similar soils, topography, and geology, and compares the condition of the disturbed site to that of the reference areas using a quantile-based approach built on a satellite vegetation index. DART was able to represent 26–55% of the variation in relative differences in bare ground and 26–41% of the variation in total foliar cover when comparing sites with nearby ecological reference areas using the Soil-Adjusted Total Vegetation Index (SATVI). Assessment of ecological recovery at oil and gas pads on the Colorado Plateau with DART revealed that more than half of the well-pads were below the 25th percentile of reference areas. Machine-learning trend analysis of poorly recovering well-pads (quantile < 0.23) had out-of-bag error rates between 37 and 40%, indicating moderate association with the environmental and management variables hypothesized to influence recovery. Well-pads in grasslands (median quantile [MQ] = 13%), blackbrush (Coleogyne ramosissima) shrublands (MQ = 18%), arid canyon complexes (MQ = 18%), warmer areas with more summer-dominated precipitation, and state-administered areas (MQ = 12%) had low recovery rates. Results showcase the usefulness of DART for assessing discrete surface land disturbances and highlight the need for more targeted rehabilitation efforts at oil and gas well-pads in the arid southwestern US.
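
    The core quantile comparison in DART can be sketched simply: rank a disturbed site's vegetation-index value within the distribution of its matched reference pixels. The SATVI values below are synthetic, and the 0.25 cutoff echoes the abstract's 25th-percentile benchmark.

      import numpy as np
      from scipy.stats import percentileofscore

      rng = np.random.default_rng(4)

      # Hypothetical SATVI values for pixels in the matched ecological reference
      # area (similar soils, topography, geology) and for one reclaimed well-pad.
      reference_satvi = rng.normal(0.32, 0.05, size=500)
      wellpad_satvi = 0.24

      # The well-pad's recovery quantile: where its vegetation index falls
      # within the reference distribution; values below 0.25 flag poor recovery.
      q = percentileofscore(reference_satvi, wellpad_satvi) / 100.0
      print(f"recovery quantile = {q:.2f}", "-> poor recovery" if q < 0.25 else "")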
