Sample records for bayesian geostatistical modelling

  1. EFFICIENT MODEL-FITTING AND MODEL-COMPARISON FOR HIGH-DIMENSIONAL BAYESIAN GEOSTATISTICAL MODELS. (R826887)

    EPA Science Inventory

    Geostatistical models are appropriate for spatially distributed data measured at irregularly spaced locations. We propose an efficient Markov chain Monte Carlo (MCMC) algorithm for fitting Bayesian geostatistical models with substantial numbers of unknown parameters to sizable...

  2. [Bayesian geostatistical prediction of soil organic carbon contents of solonchak soils in nor-thern Tarim Basin, Xinjiang, China.

    PubMed

    Wu, Wei Mo; Wang, Jia Qiang; Cao, Qi; Wu, Jia Ping

    2017-02-01

    Accurate prediction of soil organic carbon (SOC) distribution is crucial for soil resources utilization and conservation, climate change adaptation, and ecosystem health. In this study, we selected a 1300 m×1700 m solonchak sampling area in northern Tarim Basin, Xinjiang, China, and collected a total of 144 soil samples (5-10 cm). The objectives of this study were to build a Baye-sian geostatistical model to predict SOC content, and to assess the performance of the Bayesian model for the prediction of SOC content by comparing with other three geostatistical approaches [ordinary kriging (OK), sequential Gaussian simulation (SGS), and inverse distance weighting (IDW)]. In the study area, soil organic carbon contents ranged from 1.59 to 9.30 g·kg -1 with a mean of 4.36 g·kg -1 and a standard deviation of 1.62 g·kg -1 . Sample semivariogram was best fitted by an exponential model with the ratio of nugget to sill being 0.57. By using the Bayesian geostatistical approach, we generated the SOC content map, and obtained the prediction variance, upper 95% and lower 95% of SOC contents, which were then used to evaluate the prediction uncertainty. Bayesian geostatistical approach performed better than that of the OK, SGS and IDW, demonstrating the advantages of Bayesian approach in SOC prediction.

  3. Approaches in highly parameterized inversion: bgaPEST, a Bayesian geostatistical approach implementation with PEST: documentation and instructions

    USGS Publications Warehouse

    Fienen, Michael N.; D'Oria, Marco; Doherty, John E.; Hunt, Randall J.

    2013-01-01

    The application bgaPEST is a highly parameterized inversion software package implementing the Bayesian Geostatistical Approach in a framework compatible with the parameter estimation suite PEST. Highly parameterized inversion refers to cases in which parameters are distributed in space or time and are correlated with one another. The Bayesian aspect of bgaPEST is related to Bayesian probability theory in which prior information about parameters is formally revised on the basis of the calibration dataset used for the inversion. Conceptually, this approach formalizes the conditionality of estimated parameters on the specific data and model available. The geostatistical component of the method refers to the way in which prior information about the parameters is used. A geostatistical autocorrelation function is used to enforce structure on the parameters to avoid overfitting and unrealistic results. Bayesian Geostatistical Approach is designed to provide the smoothest solution that is consistent with the data. Optionally, users can specify a level of fit or estimate a balance between fit and model complexity informed by the data. Groundwater and surface-water applications are used as examples in this text, but the possible uses of bgaPEST extend to any distributed parameter applications.

  4. Preferential sampling and Bayesian geostatistics: Statistical modeling and examples.

    PubMed

    Cecconi, Lorenzo; Grisotto, Laura; Catelan, Dolores; Lagazio, Corrado; Berrocal, Veronica; Biggeri, Annibale

    2016-08-01

    Preferential sampling refers to any situation in which the spatial process and the sampling locations are not stochastically independent. In this paper, we present two examples of geostatistical analysis in which the usual assumption of stochastic independence between the point process and the measurement process is violated. To account for preferential sampling, we specify a flexible and general Bayesian geostatistical model that includes a shared spatial random component. We apply the proposed model to two different case studies that allow us to highlight three different modeling and inferential aspects of geostatistical modeling under preferential sampling: (1) continuous or finite spatial sampling frame; (2) underlying causal model and relevant covariates; and (3) inferential goals related to mean prediction surface or prediction uncertainty. © The Author(s) 2016.

  5. Bayesian approach for three-dimensional aquifer characterization at the Hanford 300 Area

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Murakami, Haruko; Chen, X.; Hahn, Melanie S.

    2010-10-21

    This study presents a stochastic, three-dimensional characterization of a heterogeneous hydraulic conductivity field within DOE's Hanford 300 Area site, Washington, by assimilating large-scale, constant-rate injection test data with small-scale, three-dimensional electromagnetic borehole flowmeter (EBF) measurement data. We first inverted the injection test data to estimate the transmissivity field, using zeroth-order temporal moments of pressure buildup curves. We applied a newly developed Bayesian geostatistical inversion framework, the method of anchored distributions (MAD), to obtain a joint posterior distribution of geostatistical parameters and local log-transmissivities at multiple locations. The unique aspects of MAD that make it suitable for this purpose are itsmore » ability to integrate multi-scale, multi-type data within a Bayesian framework and to compute a nonparametric posterior distribution. After we combined the distribution of transmissivities with depth-discrete relative-conductivity profile from EBF data, we inferred the three-dimensional geostatistical parameters of the log-conductivity field, using the Bayesian model-based geostatistics. Such consistent use of the Bayesian approach throughout the procedure enabled us to systematically incorporate data uncertainty into the final posterior distribution. The method was tested in a synthetic study and validated using the actual data that was not part of the estimation. Results showed broader and skewed posterior distributions of geostatistical parameters except for the mean, which suggests the importance of inferring the entire distribution to quantify the parameter uncertainty.« less

  6. Analysis of dengue fever risk using geostatistics model in bone regency

    NASA Astrophysics Data System (ADS)

    Amran, Stang, Mallongi, Anwar

    2017-03-01

    This research aim is to analysis of dengue fever risk based on Geostatistics model in Bone Regency. Risk levels of dengue fever are denoted by parameter of Binomial distribution. Effect of temperature, rainfalls, elevation, and larvae abundance are investigated through Geostatistics model. Bayesian hierarchical method is used in estimation process. Using dengue fever data in eleven locations this research shows that temperature and rainfall have significant effect of dengue fever risk in Bone regency.

  7. Bayesian Geostatistical Analysis and Ecoclimatic Determinants of Corynebacterium pseudotuberculosis Infection among Horses

    PubMed Central

    Boysen, Courtney; Davis, Elizabeth G.; Beard, Laurie A.; Lubbers, Brian V.; Raghavan, Ram K.

    2015-01-01

    Kansas witnessed an unprecedented outbreak in Corynebacterium pseudotuberculosis infection among horses, a disease commonly referred to as pigeon fever during fall 2012. Bayesian geostatistical models were developed to identify key environmental and climatic risk factors associated with C. pseudotuberculosis infection in horses. Positive infection status among horses (cases) was determined by positive test results for characteristic abscess formation, positive bacterial culture on purulent material obtained from a lanced abscess (n = 82), or positive serologic evidence of exposure to organism (≥1:512)(n = 11). Horses negative for these tests (n = 172)(controls) were considered free of infection. Information pertaining to horse demographics and stabled location were obtained through review of medical records and/or contact with horse owners via telephone. Covariate information for environmental and climatic determinants were obtained from USDA (soil attributes), USGS (land use/land cover), and NASA MODIS and NASA Prediction of Worldwide Renewable Resources (climate). Candidate covariates were screened using univariate regression models followed by Bayesian geostatistical models with and without covariates. The best performing model indicated a protective effect for higher soil moisture content (OR = 0.53, 95% CrI = 0.25, 0.71), and detrimental effects for higher land surface temperature (≥35°C) (OR = 2.81, 95% CrI = 2.21, 3.85) and habitat fragmentation (OR = 1.31, 95% CrI = 1.27, 2.22) for C. pseudotuberculosis infection status in horses, while age, gender and breed had no effect. Preventative and ecoclimatic significance of these findings are discussed. PMID:26473728

  8. Bayesian Geostatistical Analysis and Ecoclimatic Determinants of Corynebacterium pseudotuberculosis Infection among Horses.

    PubMed

    Boysen, Courtney; Davis, Elizabeth G; Beard, Laurie A; Lubbers, Brian V; Raghavan, Ram K

    2015-01-01

    Kansas witnessed an unprecedented outbreak in Corynebacterium pseudotuberculosis infection among horses, a disease commonly referred to as pigeon fever during fall 2012. Bayesian geostatistical models were developed to identify key environmental and climatic risk factors associated with C. pseudotuberculosis infection in horses. Positive infection status among horses (cases) was determined by positive test results for characteristic abscess formation, positive bacterial culture on purulent material obtained from a lanced abscess (n = 82), or positive serologic evidence of exposure to organism (≥ 1:512)(n = 11). Horses negative for these tests (n = 172)(controls) were considered free of infection. Information pertaining to horse demographics and stabled location were obtained through review of medical records and/or contact with horse owners via telephone. Covariate information for environmental and climatic determinants were obtained from USDA (soil attributes), USGS (land use/land cover), and NASA MODIS and NASA Prediction of Worldwide Renewable Resources (climate). Candidate covariates were screened using univariate regression models followed by Bayesian geostatistical models with and without covariates. The best performing model indicated a protective effect for higher soil moisture content (OR = 0.53, 95% CrI = 0.25, 0.71), and detrimental effects for higher land surface temperature (≥ 35°C) (OR = 2.81, 95% CrI = 2.21, 3.85) and habitat fragmentation (OR = 1.31, 95% CrI = 1.27, 2.22) for C. pseudotuberculosis infection status in horses, while age, gender and breed had no effect. Preventative and ecoclimatic significance of these findings are discussed.

  9. Spatial mapping and prediction of Plasmodium falciparum infection risk among school-aged children in Côte d'Ivoire.

    PubMed

    Houngbedji, Clarisse A; Chammartin, Frédérique; Yapi, Richard B; Hürlimann, Eveline; N'Dri, Prisca B; Silué, Kigbafori D; Soro, Gotianwa; Koudou, Benjamin G; Assi, Serge-Brice; N'Goran, Eliézer K; Fantodji, Agathe; Utzinger, Jürg; Vounatsou, Penelope; Raso, Giovanna

    2016-09-07

    In Côte d'Ivoire, malaria remains a major public health issue, and thus a priority to be tackled. The aim of this study was to identify spatially explicit indicators of Plasmodium falciparum infection among school-aged children and to undertake a model-based spatial prediction of P. falciparum infection risk using environmental predictors. A cross-sectional survey was conducted, including parasitological examinations and interviews with more than 5,000 children from 93 schools across Côte d'Ivoire. A finger-prick blood sample was obtained from each child to determine Plasmodium species-specific infection and parasitaemia using Giemsa-stained thick and thin blood films. Household socioeconomic status was assessed through asset ownership and household characteristics. Children were interviewed for preventive measures against malaria. Environmental data were gathered from satellite images and digitized maps. A Bayesian geostatistical stochastic search variable selection procedure was employed to identify factors related to P. falciparum infection risk. Bayesian geostatistical logistic regression models were used to map the spatial distribution of P. falciparum infection and to predict the infection prevalence at non-sampled locations via Bayesian kriging. Complete data sets were available from 5,322 children aged 5-16 years across Côte d'Ivoire. P. falciparum was the predominant species (94.5 %). The Bayesian geostatistical variable selection procedure identified land cover and socioeconomic status as important predictors for infection risk with P. falciparum. Model-based prediction identified high P. falciparum infection risk in the north, central-east, south-east, west and south-west of Côte d'Ivoire. Low-risk areas were found in the south-eastern area close to Abidjan and the south-central and west-central part of the country. The P. falciparum infection risk and related uncertainty estimates for school-aged children in Côte d'Ivoire represent the most up-to-date malaria risk maps. These tools can be used for spatial targeting of malaria control interventions.

  10. Application of Bayesian geostatistics for evaluation of mass discharge uncertainty at contaminated sites

    NASA Astrophysics Data System (ADS)

    Troldborg, Mads; Nowak, Wolfgang; Lange, Ida V.; Santos, Marta C.; Binning, Philip J.; Bjerg, Poul L.

    2012-09-01

    Mass discharge estimates are increasingly being used when assessing risks of groundwater contamination and designing remedial systems at contaminated sites. Such estimates are, however, rather uncertain as they integrate uncertain spatial distributions of both concentration and groundwater flow. Here a geostatistical simulation method for quantifying the uncertainty of the mass discharge across a multilevel control plane is presented. The method accounts for (1) heterogeneity of both the flow field and the concentration distribution through Bayesian geostatistics, (2) measurement uncertainty, and (3) uncertain source zone and transport parameters. The method generates conditional realizations of the spatial flow and concentration distribution. An analytical macrodispersive transport solution is employed to simulate the mean concentration distribution, and a geostatistical model of the Box-Cox transformed concentration data is used to simulate observed deviations from this mean solution. By combining the flow and concentration realizations, a mass discharge probability distribution is obtained. The method has the advantage of avoiding the heavy computational burden of three-dimensional numerical flow and transport simulation coupled with geostatistical inversion. It may therefore be of practical relevance to practitioners compared to existing methods that are either too simple or computationally demanding. The method is demonstrated on a field site contaminated with chlorinated ethenes. For this site, we show that including a physically meaningful concentration trend and the cosimulation of hydraulic conductivity and hydraulic gradient across the transect helps constrain the mass discharge uncertainty. The number of sampling points required for accurate mass discharge estimation and the relative influence of different data types on mass discharge uncertainty is discussed.

  11. A Bayesian geostatistical approach for evaluating the uncertainty of contaminant mass discharges from point sources

    NASA Astrophysics Data System (ADS)

    Troldborg, M.; Nowak, W.; Binning, P. J.; Bjerg, P. L.

    2012-12-01

    Estimates of mass discharge (mass/time) are increasingly being used when assessing risks of groundwater contamination and designing remedial systems at contaminated sites. Mass discharge estimates are, however, prone to rather large uncertainties as they integrate uncertain spatial distributions of both concentration and groundwater flow velocities. For risk assessments or any other decisions that are being based on mass discharge estimates, it is essential to address these uncertainties. We present a novel Bayesian geostatistical approach for quantifying the uncertainty of the mass discharge across a multilevel control plane. The method decouples the flow and transport simulation and has the advantage of avoiding the heavy computational burden of three-dimensional numerical flow and transport simulation coupled with geostatistical inversion. It may therefore be of practical relevance to practitioners compared to existing methods that are either too simple or computationally demanding. The method is based on conditional geostatistical simulation and accounts for i) heterogeneity of both the flow field and the concentration distribution through Bayesian geostatistics (including the uncertainty in covariance functions), ii) measurement uncertainty, and iii) uncertain source zone geometry and transport parameters. The method generates multiple equally likely realizations of the spatial flow and concentration distribution, which all honour the measured data at the control plane. The flow realizations are generated by analytical co-simulation of the hydraulic conductivity and the hydraulic gradient across the control plane. These realizations are made consistent with measurements of both hydraulic conductivity and head at the site. An analytical macro-dispersive transport solution is employed to simulate the mean concentration distribution across the control plane, and a geostatistical model of the Box-Cox transformed concentration data is used to simulate observed deviations from this mean solution. By combining the flow and concentration realizations, a mass discharge probability distribution is obtained. Tests show that the decoupled approach is both efficient and able to provide accurate uncertainty estimates. The method is demonstrated on a Danish field site contaminated with chlorinated ethenes. For this site, we show that including a physically meaningful concentration trend and the co-simulation of hydraulic conductivity and hydraulic gradient across the transect helps constrain the mass discharge uncertainty. The number of sampling points required for accurate mass discharge estimation and the relative influence of different data types on mass discharge uncertainty is discussed.

  12. Spatial heterogeneity and risk factors for stunting among children under age five in Ethiopia: A Bayesian geo-statistical model.

    PubMed

    Hagos, Seifu; Hailemariam, Damen; WoldeHanna, Tasew; Lindtjørn, Bernt

    2017-01-01

    Understanding the spatial distribution of stunting and underlying factors operating at meso-scale is of paramount importance for intervention designing and implementations. Yet, little is known about the spatial distribution of stunting and some discrepancies are documented on the relative importance of reported risk factors. Therefore, the present study aims at exploring the spatial distribution of stunting at meso- (district) scale, and evaluates the effect of spatial dependency on the identification of risk factors and their relative contribution to the occurrence of stunting and severe stunting in a rural area of Ethiopia. A community based cross sectional study was conducted to measure the occurrence of stunting and severe stunting among children aged 0-59 months. Additionally, we collected relevant information on anthropometric measures, dietary habits, parent and child-related demographic and socio-economic status. Latitude and longitude of surveyed households were also recorded. Local Anselin Moran's I was calculated to investigate the spatial variation of stunting prevalence and identify potential local pockets (hotspots) of high prevalence. Finally, we employed a Bayesian geo-statistical model, which accounted for spatial dependency structure in the data, to identify potential risk factors for stunting in the study area. Overall, the prevalence of stunting and severe stunting in the district was 43.7% [95%CI: 40.9, 46.4] and 21.3% [95%CI: 19.5, 23.3] respectively. We identified statistically significant clusters of high prevalence of stunting (hotspots) in the eastern part of the district and clusters of low prevalence (cold spots) in the western. We found out that the inclusion of spatial structure of the data into the Bayesian model has shown to improve the fit for stunting model. The Bayesian geo-statistical model indicated that the risk of stunting increased as the child's age increased (OR 4.74; 95% Bayesian credible interval [BCI]:3.35-6.58) and among boys (OR 1.28; 95%BCI; 1.12-1.45). However, maternal education and household food security were found to be protective against stunting and severe stunting. Stunting prevalence may vary across space at different scale. For this, it's important that nutrition studies and, more importantly, control interventions take into account this spatial heterogeneity in the distribution of nutritional deficits and their underlying associated factors. The findings of this study also indicated that interventions integrating household food insecurity in nutrition programs in the district might help to avert the burden of stunting.

  13. Bayesian Geostatistical Modeling of Malaria Indicator Survey Data in Angola

    PubMed Central

    Gosoniu, Laura; Veta, Andre Mia; Vounatsou, Penelope

    2010-01-01

    The 2006–2007 Angola Malaria Indicator Survey (AMIS) is the first nationally representative household survey in the country assessing coverage of the key malaria control interventions and measuring malaria-related burden among children under 5 years of age. In this paper, the Angolan MIS data were analyzed to produce the first smooth map of parasitaemia prevalence based on contemporary nationwide empirical data in the country. Bayesian geostatistical models were fitted to assess the effect of interventions after adjusting for environmental, climatic and socio-economic factors. Non-linear relationships between parasitaemia risk and environmental predictors were modeled by categorizing the covariates and by employing two non-parametric approaches, the B-splines and the P-splines. The results of the model validation showed that the categorical model was able to better capture the relationship between parasitaemia prevalence and the environmental factors. Model fit and prediction were handled within a Bayesian framework using Markov chain Monte Carlo (MCMC) simulations. Combining estimates of parasitaemia prevalence with the number of children under we obtained estimates of the number of infected children in the country. The population-adjusted prevalence ranges from in Namibe province to in Malanje province. The odds of parasitaemia in children living in a household with at least ITNs per person was by 41% lower (CI: 14%, 60%) than in those with fewer ITNs. The estimates of the number of parasitaemic children produced in this paper are important for planning and implementing malaria control interventions and for monitoring the impact of prevention and control activities. PMID:20351775

  14. Systematic evaluation of sequential geostatistical resampling within MCMC for posterior sampling of near-surface geophysical inverse problems

    NASA Astrophysics Data System (ADS)

    Ruggeri, Paolo; Irving, James; Holliger, Klaus

    2015-08-01

    We critically examine the performance of sequential geostatistical resampling (SGR) as a model proposal mechanism for Bayesian Markov-chain-Monte-Carlo (MCMC) solutions to near-surface geophysical inverse problems. Focusing on a series of simple yet realistic synthetic crosshole georadar tomographic examples characterized by different numbers of data, levels of data error and degrees of model parameter spatial correlation, we investigate the efficiency of three different resampling strategies with regard to their ability to generate statistically independent realizations from the Bayesian posterior distribution. Quite importantly, our results show that, no matter what resampling strategy is employed, many of the examined test cases require an unreasonably high number of forward model runs to produce independent posterior samples, meaning that the SGR approach as currently implemented will not be computationally feasible for a wide range of problems. Although use of a novel gradual-deformation-based proposal method can help to alleviate these issues, it does not offer a full solution. Further, we find that the nature of the SGR is found to strongly influence MCMC performance; however no clear rule exists as to what set of inversion parameters and/or overall proposal acceptance rate will allow for the most efficient implementation. We conclude that although the SGR methodology is highly attractive as it allows for the consideration of complex geostatistical priors as well as conditioning to hard and soft data, further developments are necessary in the context of novel or hybrid MCMC approaches for it to be considered generally suitable for near-surface geophysical inversions.

  15. Quantifying natural delta variability using a multiple-point geostatistics prior uncertainty model

    NASA Astrophysics Data System (ADS)

    Scheidt, Céline; Fernandes, Anjali M.; Paola, Chris; Caers, Jef

    2016-10-01

    We address the question of quantifying uncertainty associated with autogenic pattern variability in a channelized transport system by means of a modern geostatistical method. This question has considerable relevance for practical subsurface applications as well, particularly those related to uncertainty quantification relying on Bayesian approaches. Specifically, we show how the autogenic variability in a laboratory experiment can be represented and reproduced by a multiple-point geostatistical prior uncertainty model. The latter geostatistical method requires selection of a limited set of training images from which a possibly infinite set of geostatistical model realizations, mimicking the training image patterns, can be generated. To that end, we investigate two methods to determine how many training images and what training images should be provided to reproduce natural autogenic variability. The first method relies on distance-based clustering of overhead snapshots of the experiment; the second method relies on a rate of change quantification by means of a computer vision algorithm termed the demon algorithm. We show quantitatively that with either training image selection method, we can statistically reproduce the natural variability of the delta formed in the experiment. In addition, we study the nature of the patterns represented in the set of training images as a representation of the "eigenpatterns" of the natural system. The eigenpattern in the training image sets display patterns consistent with previous physical interpretations of the fundamental modes of this type of delta system: a highly channelized, incisional mode; a poorly channelized, depositional mode; and an intermediate mode between the two.

  16. An interactive Bayesian geostatistical inverse protocol for hydraulic tomography

    USGS Publications Warehouse

    Fienen, Michael N.; Clemo, Tom; Kitanidis, Peter K.

    2008-01-01

    Hydraulic tomography is a powerful technique for characterizing heterogeneous hydrogeologic parameters. An explicit trade-off between characterization based on measurement misfit and subjective characterization using prior information is presented. We apply a Bayesian geostatistical inverse approach that is well suited to accommodate a flexible model with the level of complexity driven by the data and explicitly considering uncertainty. Prior information is incorporated through the selection of a parameter covariance model characterizing continuity and providing stability. Often, discontinuities in the parameter field, typically caused by geologic contacts between contrasting lithologic units, necessitate subdivision into zones across which there is no correlation among hydraulic parameters. We propose an interactive protocol in which zonation candidates are implied from the data and are evaluated using cross validation and expert knowledge. Uncertainty introduced by limited knowledge of dynamic regional conditions is mitigated by using drawdown rather than native head values. An adjoint state formulation of MODFLOW-2000 is used to calculate sensitivities which are used both for the solution to the inverse problem and to guide protocol decisions. The protocol is tested using synthetic two-dimensional steady state examples in which the wells are located at the edge of the region of interest.

  17. Bayesian assessment of the expected data impact on prediction confidence in optimal sampling design

    NASA Astrophysics Data System (ADS)

    Leube, P. C.; Geiges, A.; Nowak, W.

    2012-02-01

    Incorporating hydro(geo)logical data, such as head and tracer data, into stochastic models of (subsurface) flow and transport helps to reduce prediction uncertainty. Because of financial limitations for investigation campaigns, information needs toward modeling or prediction goals should be satisfied efficiently and rationally. Optimal design techniques find the best one among a set of investigation strategies. They optimize the expected impact of data on prediction confidence or related objectives prior to data collection. We introduce a new optimal design method, called PreDIA(gnosis) (Preposterior Data Impact Assessor). PreDIA derives the relevant probability distributions and measures of data utility within a fully Bayesian, generalized, flexible, and accurate framework. It extends the bootstrap filter (BF) and related frameworks to optimal design by marginalizing utility measures over the yet unknown data values. PreDIA is a strictly formal information-processing scheme free of linearizations. It works with arbitrary simulation tools, provides full flexibility concerning measurement types (linear, nonlinear, direct, indirect), allows for any desired task-driven formulations, and can account for various sources of uncertainty (e.g., heterogeneity, geostatistical assumptions, boundary conditions, measurement values, model structure uncertainty, a large class of model errors) via Bayesian geostatistics and model averaging. Existing methods fail to simultaneously provide these crucial advantages, which our method buys at relatively higher-computational costs. We demonstrate the applicability and advantages of PreDIA over conventional linearized methods in a synthetic example of subsurface transport. In the example, we show that informative data is often invisible for linearized methods that confuse zero correlation with statistical independence. Hence, PreDIA will often lead to substantially better sampling designs. Finally, we extend our example to specifically highlight the consideration of conceptual model uncertainty.

  18. Constraining geostatistical models with hydrological data to improve prediction realism

    NASA Astrophysics Data System (ADS)

    Demyanov, V.; Rojas, T.; Christie, M.; Arnold, D.

    2012-04-01

    Geostatistical models reproduce spatial correlation based on the available on site data and more general concepts about the modelled patters, e.g. training images. One of the problem of modelling natural systems with geostatistics is in maintaining realism spatial features and so they agree with the physical processes in nature. Tuning the model parameters to the data may lead to geostatistical realisations with unrealistic spatial patterns, which would still honour the data. Such model would result in poor predictions, even though although fit the available data well. Conditioning the model to a wider range of relevant data provide a remedy that avoid producing unrealistic features in spatial models. For instance, there are vast amounts of information about the geometries of river channels that can be used in describing fluvial environment. Relations between the geometrical channel characteristics (width, depth, wave length, amplitude, etc.) are complex and non-parametric and are exhibit a great deal of uncertainty, which is important to propagate rigorously into the predictive model. These relations can be described within a Bayesian approach as multi-dimensional prior probability distributions. We propose a way to constrain multi-point statistics models with intelligent priors obtained from analysing a vast collection of contemporary river patterns based on previously published works. We applied machine learning techniques, namely neural networks and support vector machines, to extract multivariate non-parametric relations between geometrical characteristics of fluvial channels from the available data. An example demonstrates how ensuring geological realism helps to deliver more reliable prediction of a subsurface oil reservoir in a fluvial depositional environment.

  19. APPLICATION OF BAYESIAN AND GEOSTATISTICAL MODELING TO THE ENVIRONMENTAL MONITORING OF CS-137 AT THE IDAHO NATIONAL LABORATORY

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kara G. Eby

    2010-08-01

    At the Idaho National Laboratory (INL) Cs-137 concentrations above the U.S. Environmental Protection Agency risk-based threshold of 0.23 pCi/g may increase the risk of human mortality due to cancer. As a leader in nuclear research, the INL has been conducting nuclear activities for decades. Elevated anthropogenic radionuclide levels including Cs-137 are a result of atmospheric weapons testing, the Chernobyl accident, and nuclear activities occurring at the INL site. Therefore environmental monitoring and long-term surveillance of Cs-137 is required to evaluate risk. However, due to the large land area involved, frequent and comprehensive monitoring is limited. Developing a spatial model thatmore » predicts Cs-137 concentrations at unsampled locations will enhance the spatial characterization of Cs-137 in surface soils, provide guidance for an efficient monitoring program, and pinpoint areas requiring mitigation strategies. The predictive model presented herein is based on applied geostatistics using a Bayesian analysis of environmental characteristics across the INL site, which provides kriging spatial maps of both Cs-137 estimates and prediction errors. Comparisons are presented of two different kriging methods, showing that the use of secondary information (i.e., environmental characteristics) can provide improved prediction performance in some areas of the INL site.« less

  20. Estimation of Fine Particulate Matter in Taipei Using Landuse Regression and Bayesian Maximum Entropy Methods

    PubMed Central

    Yu, Hwa-Lung; Wang, Chih-Hsih; Liu, Ming-Che; Kuo, Yi-Ming

    2011-01-01

    Fine airborne particulate matter (PM2.5) has adverse effects on human health. Assessing the long-term effects of PM2.5 exposure on human health and ecology is often limited by a lack of reliable PM2.5 measurements. In Taipei, PM2.5 levels were not systematically measured until August, 2005. Due to the popularity of geographic information systems (GIS), the landuse regression method has been widely used in the spatial estimation of PM concentrations. This method accounts for the potential contributing factors of the local environment, such as traffic volume. Geostatistical methods, on other hand, account for the spatiotemporal dependence among the observations of ambient pollutants. This study assesses the performance of the landuse regression model for the spatiotemporal estimation of PM2.5 in the Taipei area. Specifically, this study integrates the landuse regression model with the geostatistical approach within the framework of the Bayesian maximum entropy (BME) method. The resulting epistemic framework can assimilate knowledge bases including: (a) empirical-based spatial trends of PM concentration based on landuse regression, (b) the spatio-temporal dependence among PM observation information, and (c) site-specific PM observations. The proposed approach performs the spatiotemporal estimation of PM2.5 levels in the Taipei area (Taiwan) from 2005–2007. PMID:21776223

  1. Estimation of fine particulate matter in Taipei using landuse regression and bayesian maximum entropy methods.

    PubMed

    Yu, Hwa-Lung; Wang, Chih-Hsih; Liu, Ming-Che; Kuo, Yi-Ming

    2011-06-01

    Fine airborne particulate matter (PM2.5) has adverse effects on human health. Assessing the long-term effects of PM2.5 exposure on human health and ecology is often limited by a lack of reliable PM2.5 measurements. In Taipei, PM2.5 levels were not systematically measured until August, 2005. Due to the popularity of geographic information systems (GIS), the landuse regression method has been widely used in the spatial estimation of PM concentrations. This method accounts for the potential contributing factors of the local environment, such as traffic volume. Geostatistical methods, on other hand, account for the spatiotemporal dependence among the observations of ambient pollutants. This study assesses the performance of the landuse regression model for the spatiotemporal estimation of PM2.5 in the Taipei area. Specifically, this study integrates the landuse regression model with the geostatistical approach within the framework of the Bayesian maximum entropy (BME) method. The resulting epistemic framework can assimilate knowledge bases including: (a) empirical-based spatial trends of PM concentration based on landuse regression, (b) the spatio-temporal dependence among PM observation information, and (c) site-specific PM observations. The proposed approach performs the spatiotemporal estimation of PM2.5 levels in the Taipei area (Taiwan) from 2005-2007.

  2. Bayesian geostatistics in health cartography: the perspective of malaria.

    PubMed

    Patil, Anand P; Gething, Peter W; Piel, Frédéric B; Hay, Simon I

    2011-06-01

    Maps of parasite prevalences and other aspects of infectious diseases that vary in space are widely used in parasitology. However, spatial parasitological datasets rarely, if ever, have sufficient coverage to allow exact determination of such maps. Bayesian geostatistics (BG) is a method for finding a large sample of maps that can explain a dataset, in which maps that do a better job of explaining the data are more likely to be represented. This sample represents the knowledge that the analyst has gained from the data about the unknown true map. BG provides a conceptually simple way to convert these samples to predictions of features of the unknown map, for example regional averages. These predictions account for each map in the sample, yielding an appropriate level of predictive precision.

  3. Bayesian geostatistics in health cartography: the perspective of malaria

    PubMed Central

    Patil, Anand P.; Gething, Peter W.; Piel, Frédéric B.; Hay, Simon I.

    2011-01-01

    Maps of parasite prevalences and other aspects of infectious diseases that vary in space are widely used in parasitology. However, spatial parasitological datasets rarely, if ever, have sufficient coverage to allow exact determination of such maps. Bayesian geostatistics (BG) is a method for finding a large sample of maps that can explain a dataset, in which maps that do a better job of explaining the data are more likely to be represented. This sample represents the knowledge that the analyst has gained from the data about the unknown true map. BG provides a conceptually simple way to convert these samples to predictions of features of the unknown map, for example regional averages. These predictions account for each map in the sample, yielding an appropriate level of predictive precision. PMID:21420361

  4. Restricted spatial regression in practice: Geostatistical models, confounding, and robustness under model misspecification

    USGS Publications Warehouse

    Hanks, Ephraim M.; Schliep, Erin M.; Hooten, Mevin B.; Hoeting, Jennifer A.

    2015-01-01

    In spatial generalized linear mixed models (SGLMMs), covariates that are spatially smooth are often collinear with spatially smooth random effects. This phenomenon is known as spatial confounding and has been studied primarily in the case where the spatial support of the process being studied is discrete (e.g., areal spatial data). In this case, the most common approach suggested is restricted spatial regression (RSR) in which the spatial random effects are constrained to be orthogonal to the fixed effects. We consider spatial confounding and RSR in the geostatistical (continuous spatial support) setting. We show that RSR provides computational benefits relative to the confounded SGLMM, but that Bayesian credible intervals under RSR can be inappropriately narrow under model misspecification. We propose a posterior predictive approach to alleviating this potential problem and discuss the appropriateness of RSR in a variety of situations. We illustrate RSR and SGLMM approaches through simulation studies and an analysis of malaria frequencies in The Gambia, Africa.

  5. Intelligent Decisions Need Intelligent Choice of Models and Data - a Bayesian Justifiability Analysis for Models with Vastly Different Complexity

    NASA Astrophysics Data System (ADS)

    Nowak, W.; Schöniger, A.; Wöhling, T.; Illman, W. A.

    2016-12-01

    Model-based decision support requires justifiable models with good predictive capabilities. This, in turn, calls for a fine adjustment between predictive accuracy (small systematic model bias that can be achieved with rather complex models), and predictive precision (small predictive uncertainties that can be achieved with simpler models with fewer parameters). The implied complexity/simplicity trade-off depends on the availability of informative data for calibration. If not available, additional data collection can be planned through optimal experimental design. We present a model justifiability analysis that can compare models of vastly different complexity. It rests on Bayesian model averaging (BMA) to investigate the complexity/performance trade-off dependent on data availability. Then, we disentangle the complexity component from the performance component. We achieve this by replacing actually observed data by realizations of synthetic data predicted by the models. This results in a "model confusion matrix". Based on this matrix, the modeler can identify the maximum model complexity that can be justified by the available (or planned) amount and type of data. As a side product, the matrix quantifies model (dis-)similarity. We apply this analysis to aquifer characterization via hydraulic tomography, comparing four models with a vastly different number of parameters (from a homogeneous model to geostatistical random fields). As a testing scenario, we consider hydraulic tomography data. Using subsets of these data, we determine model justifiability as a function of data set size. The test case shows that geostatistical parameterization requires a substantial amount of hydraulic tomography data to be justified, while a zonation-based model can be justified with more limited data set sizes. The actual model performance (as opposed to model justifiability), however, depends strongly on the quality of prior geological information.

  6. Spatial analysis and risk mapping of soil-transmitted helminth infections in Brazil, using Bayesian geostatistical models.

    PubMed

    Scholte, Ronaldo G C; Schur, Nadine; Bavia, Maria E; Carvalho, Edgar M; Chammartin, Frédérique; Utzinger, Jürg; Vounatsou, Penelope

    2013-11-01

    Soil-transmitted helminths (Ascaris lumbricoides, Trichuris trichiura and hookworm) negatively impact the health and wellbeing of hundreds of millions of people, particularly in tropical and subtropical countries, including Brazil. Reliable maps of the spatial distribution and estimates of the number of infected people are required for the control and eventual elimination of soil-transmitted helminthiasis. We used advanced Bayesian geostatistical modelling, coupled with geographical information systems and remote sensing to visualize the distribution of the three soil-transmitted helminth species in Brazil. Remotely sensed climatic and environmental data, along with socioeconomic variables from readily available databases were employed as predictors. Our models provided mean prevalence estimates for A. lumbricoides, T. trichiura and hookworm of 15.6%, 10.1% and 2.5%, respectively. By considering infection risk and population numbers at the unit of the municipality, we estimate that 29.7 million Brazilians are infected with A. lumbricoides, 19.2 million with T. trichiura and 4.7 million with hookworm. Our model-based maps identified important risk factors related to the transmission of soiltransmitted helminths and confirm that environmental variables are closely associated with indices of poverty. Our smoothed risk maps, including uncertainty, highlight areas where soil-transmitted helminthiasis control interventions are most urgently required, namely in the North and along most of the coastal areas of Brazil. We believe that our predictive risk maps are useful for disease control managers for prioritising control interventions and for providing a tool for more efficient surveillance-response mechanisms.

  7. A flexible Bayesian assessment for the expected impact of data on prediction confidence for optimal sampling designs

    NASA Astrophysics Data System (ADS)

    Leube, Philipp; Geiges, Andreas; Nowak, Wolfgang

    2010-05-01

    Incorporating hydrogeological data, such as head and tracer data, into stochastic models of subsurface flow and transport helps to reduce prediction uncertainty. Considering limited financial resources available for the data acquisition campaign, information needs towards the prediction goal should be satisfied in a efficient and task-specific manner. For finding the best one among a set of design candidates, an objective function is commonly evaluated, which measures the expected impact of data on prediction confidence, prior to their collection. An appropriate approach to this task should be stochastically rigorous, master non-linear dependencies between data, parameters and model predictions, and allow for a wide variety of different data types. Existing methods fail to fulfill all these requirements simultaneously. For this reason, we introduce a new method, denoted as CLUE (Cross-bred Likelihood Uncertainty Estimator), that derives the essential distributions and measures of data utility within a generalized, flexible and accurate framework. The method makes use of Bayesian GLUE (Generalized Likelihood Uncertainty Estimator) and extends it to an optimal design method by marginalizing over the yet unknown data values. Operating in a purely Bayesian Monte-Carlo framework, CLUE is a strictly formal information processing scheme free of linearizations. It provides full flexibility associated with the type of measurements (linear, non-linear, direct, indirect) and accounts for almost arbitrary sources of uncertainty (e.g. heterogeneity, geostatistical assumptions, boundary conditions, model concepts) via stochastic simulation and Bayesian model averaging. This helps to minimize the strength and impact of possible subjective prior assumptions, that would be hard to defend prior to data collection. Our study focuses on evaluating two different uncertainty measures: (i) expected conditional variance and (ii) expected relative entropy of a given prediction goal. The applicability and advantages are shown in a synthetic example. Therefor, we consider a contaminant source, posing a threat on a drinking water well in an aquifer. Furthermore, we assume uncertainty in geostatistical parameters, boundary conditions and hydraulic gradient. The two mentioned measures evaluate the sensitivity of (1) general prediction confidence and (2) exceedance probability of a legal regulatory threshold value on sampling locations.

  8. Goal-oriented Site Characterization in Hydrogeological Applications: An Overview

    NASA Astrophysics Data System (ADS)

    Nowak, W.; de Barros, F.; Rubin, Y.

    2011-12-01

    In this study, we address the importance of goal-oriented site characterization. Given the multiple sources of uncertainty in hydrogeological applications, information needs of modeling, prediction and decision support should be satisfied with efficient and rational field campaigns. In this work, we provide an overview of an optimal sampling design framework based on Bayesian decision theory, statistical parameter inference and Bayesian model averaging. It optimizes the field sampling campaign around decisions on environmental performance metrics (e.g., risk, arrival times, etc.) while accounting for parametric and model uncertainty in the geostatistical characterization, in forcing terms, and measurement error. The appealing aspects of the framework lie on its goal-oriented character and that it is directly linked to the confidence in a specified decision. We illustrate how these concepts could be applied in a human health risk problem where uncertainty from both hydrogeological and health parameters are accounted.

  9. Integrating Address Geocoding, Land Use Regression, and Spatiotemporal Geostatistical Estimation for Groundwater Tetrachloroethylene

    PubMed Central

    Messier, Kyle P.; Akita, Yasuyuki; Serre, Marc L.

    2012-01-01

    Geographic Information Systems (GIS) based techniques are cost-effective and efficient methods used by state agencies and epidemiology researchers for estimating concentration and exposure. However, budget limitations have made statewide assessments of contamination difficult, especially in groundwater media. Many studies have implemented address geocoding, land use regression, and geostatistics independently, but this is the first to examine the benefits of integrating these GIS techniques to address the need of statewide exposure assessments. A novel framework for concentration exposure is introduced that integrates address geocoding, land use regression (LUR), below detect data modeling, and Bayesian Maximum Entropy (BME). A LUR model was developed for Tetrachloroethylene that accounts for point sources and flow direction. We then integrate the LUR model into the BME method as a mean trend while also modeling below detects data as a truncated Gaussian probability distribution function. We increase available PCE data 4.7 times from previously available databases through multistage geocoding. The LUR model shows significant influence of dry cleaners at short ranges. The integration of the LUR model as mean trend in BME results in a 7.5% decrease in cross validation mean square error compared to BME with a constant mean trend. PMID:22264162

  10. Integrating address geocoding, land use regression, and spatiotemporal geostatistical estimation for groundwater tetrachloroethylene.

    PubMed

    Messier, Kyle P; Akita, Yasuyuki; Serre, Marc L

    2012-03-06

    Geographic information systems (GIS) based techniques are cost-effective and efficient methods used by state agencies and epidemiology researchers for estimating concentration and exposure. However, budget limitations have made statewide assessments of contamination difficult, especially in groundwater media. Many studies have implemented address geocoding, land use regression, and geostatistics independently, but this is the first to examine the benefits of integrating these GIS techniques to address the need of statewide exposure assessments. A novel framework for concentration exposure is introduced that integrates address geocoding, land use regression (LUR), below detect data modeling, and Bayesian Maximum Entropy (BME). A LUR model was developed for tetrachloroethylene that accounts for point sources and flow direction. We then integrate the LUR model into the BME method as a mean trend while also modeling below detects data as a truncated Gaussian probability distribution function. We increase available PCE data 4.7 times from previously available databases through multistage geocoding. The LUR model shows significant influence of dry cleaners at short ranges. The integration of the LUR model as mean trend in BME results in a 7.5% decrease in cross validation mean square error compared to BME with a constant mean trend.

  11. Characterizing regional-scale temporal evolution of air dose rates after the Fukushima Daiichi Nuclear Power Plant accident.

    PubMed

    Wainwright, Haruko M; Seki, Akiyuki; Mikami, Satoshi; Saito, Kimiaki

    2018-09-01

    In this study, we quantify the temporal changes of air dose rates in the regional scale around the Fukushima Dai-ichi Nuclear Power Plant in Japan, and predict the spatial distribution of air dose rates in the future. We first apply the Bayesian geostatistical method developed by Wainwright et al. (2017) to integrate multiscale datasets including ground-based walk and car surveys, and airborne surveys, all of which have different scales, resolutions, spatial coverage, and accuracy. This method is based on geostatistics to represent spatial heterogeneous structures, and also on Bayesian hierarchical models to integrate multiscale, multi-type datasets in a consistent manner. We apply this method to the datasets from three years: 2014 to 2016. The temporal changes among the three integrated maps enables us to characterize the spatiotemporal dynamics of radiation air dose rates. The data-driven ecological decay model is then coupled with the integrated map to predict future dose rates. Results show that the air dose rates are decreasing consistently across the region. While slower in the forested region, the decrease is particularly significant in the town area. The decontamination has contributed to significant reduction of air dose rates. By 2026, the air dose rates will continue to decrease, and the area above 3.8 μSv/h will be almost fully contained within the non-residential forested zone. Copyright © 2018 Elsevier Ltd. All rights reserved.

  12. Geostatistical estimation of forest biomass in interior Alaska combining Landsat-derived tree cover, sampled airborne lidar and field observations

    NASA Astrophysics Data System (ADS)

    Babcock, Chad; Finley, Andrew O.; Andersen, Hans-Erik; Pattison, Robert; Cook, Bruce D.; Morton, Douglas C.; Alonzo, Michael; Nelson, Ross; Gregoire, Timothy; Ene, Liviu; Gobakken, Terje; Næsset, Erik

    2018-06-01

    The goal of this research was to develop and examine the performance of a geostatistical coregionalization modeling approach for combining field inventory measurements, strip samples of airborne lidar and Landsat-based remote sensing data products to predict aboveground biomass (AGB) in interior Alaska's Tanana Valley. The proposed modeling strategy facilitates pixel-level mapping of AGB density predictions across the entire spatial domain. Additionally, the coregionalization framework allows for statistically sound estimation of total AGB for arbitrary areal units within the study area---a key advance to support diverse management objectives in interior Alaska. This research focuses on appropriate characterization of prediction uncertainty in the form of posterior predictive coverage intervals and standard deviations. Using the framework detailed here, it is possible to quantify estimation uncertainty for any spatial extent, ranging from pixel-level predictions of AGB density to estimates of AGB stocks for the full domain. The lidar-informed coregionalization models consistently outperformed their counterpart lidar-free models in terms of point-level predictive performance and total AGB precision. Additionally, the inclusion of Landsat-derived forest cover as a covariate further improved estimation precision in regions with lower lidar sampling intensity. Our findings also demonstrate that model-based approaches that do not explicitly account for residual spatial dependence can grossly underestimate uncertainty, resulting in falsely precise estimates of AGB. On the other hand, in a geostatistical setting, residual spatial structure can be modeled within a Bayesian hierarchical framework to obtain statistically defensible assessments of uncertainty for AGB estimates.

  13. Multiobjective design of aquifer monitoring networks for optimal spatial prediction and geostatistical parameter estimation

    NASA Astrophysics Data System (ADS)

    Alzraiee, Ayman H.; Bau, Domenico A.; Garcia, Luis A.

    2013-06-01

    Effective sampling of hydrogeological systems is essential in guiding groundwater management practices. Optimal sampling of groundwater systems has previously been formulated based on the assumption that heterogeneous subsurface properties can be modeled using a geostatistical approach. Therefore, the monitoring schemes have been developed to concurrently minimize the uncertainty in the spatial distribution of systems' states and parameters, such as the hydraulic conductivity K and the hydraulic head H, and the uncertainty in the geostatistical model of system parameters using a single objective function that aggregates all objectives. However, it has been shown that the aggregation of possibly conflicting objective functions is sensitive to the adopted aggregation scheme and may lead to distorted results. In addition, the uncertainties in geostatistical parameters affect the uncertainty in the spatial prediction of K and H according to a complex nonlinear relationship, which has often been ineffectively evaluated using a first-order approximation. In this study, we propose a multiobjective optimization framework to assist the design of monitoring networks of K and H with the goal of optimizing their spatial predictions and estimating the geostatistical parameters of the K field. The framework stems from the combination of a data assimilation (DA) algorithm and a multiobjective evolutionary algorithm (MOEA). The DA algorithm is based on the ensemble Kalman filter, a Monte-Carlo-based Bayesian update scheme for nonlinear systems, which is employed to approximate the posterior uncertainty in K, H, and the geostatistical parameters of K obtained by collecting new measurements. Multiple MOEA experiments are used to investigate the trade-off among design objectives and identify the corresponding monitoring schemes. The methodology is applied to design a sampling network for a shallow unconfined groundwater system located in Rocky Ford, Colorado. Results indicate that the effect of uncertainties associated with the geostatistical parameters on the spatial prediction might be significantly alleviated (by up to 80% of the prior uncertainty in K and by 90% of the prior uncertainty in H) by sampling evenly distributed measurements with a spatial measurement density of more than 1 observation per 60 m × 60 m grid block. In addition, exploration of the interaction of objective functions indicates that the ability of head measurements to reduce the uncertainty associated with the correlation scale is comparable to the effect of hydraulic conductivity measurements.

  14. Bayesian geostatistical modelling of soil-transmitted helminth survey data in the People's Republic of China.

    PubMed

    Lai, Ying-Si; Zhou, Xiao-Nong; Utzinger, Jürg; Vounatsou, Penelope

    2013-12-18

    Soil-transmitted helminth infections affect tens of millions of individuals in the People's Republic of China (P.R. China). There is a need for high-resolution estimates of at-risk areas and number of people infected to enhance spatial targeting of control interventions. However, such information is not yet available for P.R. China. A geo-referenced database compiling surveys pertaining to soil-transmitted helminthiasis, carried out from 2000 onwards in P.R. China, was established. Bayesian geostatistical models relating the observed survey data with potential climatic, environmental and socioeconomic predictors were developed and used to predict at-risk areas at high spatial resolution. Predictors were extracted from remote sensing and other readily accessible open-source databases. Advanced Bayesian variable selection methods were employed to develop a parsimonious model. Our results indicate that the prevalence of soil-transmitted helminth infections in P.R. China considerably decreased from 2005 onwards. Yet, some 144 million people were estimated to be infected in 2010. High prevalence (>20%) of the roundworm Ascaris lumbricoides infection was predicted for large areas of Guizhou province, the southern part of Hubei and Sichuan provinces, while the northern part and the south-eastern coastal-line areas of P.R. China had low prevalence (<5%). High infection prevalence (>20%) with hookworm was found in Hainan, the eastern part of Sichuan and the southern part of Yunnan provinces. High infection prevalence (>20%) with the whipworm Trichuris trichiura was found in a few small areas of south P.R. China. Very low prevalence (<0.1%) of hookworm and whipworm infections were predicted for the northern parts of P.R. China. We present the first model-based estimates for soil-transmitted helminth infections throughout P.R. China at high spatial resolution. Our prediction maps provide useful information for the spatial targeting of soil-transmitted helminthiasis control interventions and for long-term monitoring and surveillance in the frame of enhanced efforts to control and eliminate the public health burden of these parasitic worm infections.

  15. Bayesian geostatistical modelling of soil-transmitted helminth survey data in the People’s Republic of China

    PubMed Central

    2013-01-01

    Background Soil-transmitted helminth infections affect tens of millions of individuals in the People’s Republic of China (P.R. China). There is a need for high-resolution estimates of at-risk areas and number of people infected to enhance spatial targeting of control interventions. However, such information is not yet available for P.R. China. Methods A geo-referenced database compiling surveys pertaining to soil-transmitted helminthiasis, carried out from 2000 onwards in P.R. China, was established. Bayesian geostatistical models relating the observed survey data with potential climatic, environmental and socioeconomic predictors were developed and used to predict at-risk areas at high spatial resolution. Predictors were extracted from remote sensing and other readily accessible open-source databases. Advanced Bayesian variable selection methods were employed to develop a parsimonious model. Results Our results indicate that the prevalence of soil-transmitted helminth infections in P.R. China considerably decreased from 2005 onwards. Yet, some 144 million people were estimated to be infected in 2010. High prevalence (>20%) of the roundworm Ascaris lumbricoides infection was predicted for large areas of Guizhou province, the southern part of Hubei and Sichuan provinces, while the northern part and the south-eastern coastal-line areas of P.R. China had low prevalence (<5%). High infection prevalence (>20%) with hookworm was found in Hainan, the eastern part of Sichuan and the southern part of Yunnan provinces. High infection prevalence (>20%) with the whipworm Trichuris trichiura was found in a few small areas of south P.R. China. Very low prevalence (<0.1%) of hookworm and whipworm infections were predicted for the northern parts of P.R. China. Conclusions We present the first model-based estimates for soil-transmitted helminth infections throughout P.R. China at high spatial resolution. Our prediction maps provide useful information for the spatial targeting of soil-transmitted helminthiasis control interventions and for long-term monitoring and surveillance in the frame of enhanced efforts to control and eliminate the public health burden of these parasitic worm infections. PMID:24350825

  16. National-scale aboveground biomass geostatistical mapping with FIA inventory and GLAS data: Preparation for sparsely sampled lidar assisted forest inventory

    NASA Astrophysics Data System (ADS)

    Babcock, C. R.; Finley, A. O.; Andersen, H. E.; Moskal, L. M.; Morton, D. C.; Cook, B.; Nelson, R.

    2017-12-01

    Upcoming satellite lidar missions, such as GEDI and IceSat-2, are designed to collect laser altimetry data from space for narrow bands along orbital tracts. As a result lidar metric sets derived from these sources will not be of complete spatial coverage. This lack of complete coverage, or sparsity, means traditional regression approaches that consider lidar metrics as explanatory variables (without error) cannot be used to generate wall-to-wall maps of forest inventory variables. We implement a coregionalization framework to jointly model sparsely sampled lidar information and point-referenced forest variable measurements to create wall-to-wall maps with full probabilistic uncertainty quantification of all inputs. We inform the model with USFS Forest Inventory and Analysis (FIA) in-situ forest measurements and GLAS lidar data to spatially predict aboveground forest biomass (AGB) across the contiguous US. We cast our model within a Bayesian hierarchical framework to better model complex space-varying correlation structures among the lidar metrics and FIA data, which yields improved prediction and uncertainty assessment. To circumvent computational difficulties that arise when fitting complex geostatistical models to massive datasets, we use a Nearest Neighbor Gaussian process (NNGP) prior. Results indicate that a coregionalization modeling approach to leveraging sampled lidar data to improve AGB estimation is effective. Further, fitting the coregionalization model within a Bayesian mode of inference allows for AGB quantification across scales ranging from individual pixel estimates of AGB density to total AGB for the continental US with uncertainty. The coregionalization framework examined here is directly applicable to future spaceborne lidar acquisitions from GEDI and IceSat-2. Pairing these lidar sources with the extensive FIA forest monitoring plot network using a joint prediction framework, such as the coregionalization model explored here, offers the potential to improve forest AGB accounting certainty and provide maps for post-model fitting analysis of the spatial distribution of AGB.

  17. Mapping malaria risk among children in Côte d'Ivoire using Bayesian geo-statistical models.

    PubMed

    Raso, Giovanna; Schur, Nadine; Utzinger, Jürg; Koudou, Benjamin G; Tchicaya, Emile S; Rohner, Fabian; N'goran, Eliézer K; Silué, Kigbafori D; Matthys, Barbara; Assi, Serge; Tanner, Marcel; Vounatsou, Penelope

    2012-05-09

    In Côte d'Ivoire, an estimated 767,000 disability-adjusted life years are due to malaria, placing the country at position number 14 with regard to the global burden of malaria. Risk maps are important to guide control interventions, and hence, the aim of this study was to predict the geographical distribution of malaria infection risk in children aged <16 years in Côte d'Ivoire at high spatial resolution. Using different data sources, a systematic review was carried out to compile and geo-reference survey data on Plasmodium spp. infection prevalence in Côte d'Ivoire, focusing on children aged <16 years. The period from 1988 to 2007 was covered. A suite of Bayesian geo-statistical logistic regression models was fitted to analyse malaria risk. Non-spatial models with and without exchangeable random effect parameters were compared to stationary and non-stationary spatial models. Non-stationarity was modelled assuming that the underlying spatial process is a mixture of separate stationary processes in each ecological zone. The best fitting model based on the deviance information criterion was used to predict Plasmodium spp. infection risk for entire Côte d'Ivoire, including uncertainty. Overall, 235 data points at 170 unique survey locations with malaria prevalence data for individuals aged <16 years were extracted. Most data points (n = 182, 77.4%) were collected between 2000 and 2007. A Bayesian non-stationary regression model showed the best fit with annualized rainfall and maximum land surface temperature identified as significant environmental covariates. This model was used to predict malaria infection risk at non-sampled locations. High-risk areas were mainly found in the north-central and western area, while relatively low-risk areas were located in the north at the country border, in the north-east, in the south-east around Abidjan, and in the central-west between two high prevalence areas. The malaria risk map at high spatial resolution gives an important overview of the geographical distribution of the disease in Côte d'Ivoire. It is a useful tool for the national malaria control programme and can be utilized for spatial targeting of control interventions and rational resource allocation.

  18. Mapping malaria risk among children in Côte d’Ivoire using Bayesian geo-statistical models

    PubMed Central

    2012-01-01

    Background In Côte d’Ivoire, an estimated 767,000 disability-adjusted life years are due to malaria, placing the country at position number 14 with regard to the global burden of malaria. Risk maps are important to guide control interventions, and hence, the aim of this study was to predict the geographical distribution of malaria infection risk in children aged <16 years in Côte d’Ivoire at high spatial resolution. Methods Using different data sources, a systematic review was carried out to compile and geo-reference survey data on Plasmodium spp. infection prevalence in Côte d’Ivoire, focusing on children aged <16 years. The period from 1988 to 2007 was covered. A suite of Bayesian geo-statistical logistic regression models was fitted to analyse malaria risk. Non-spatial models with and without exchangeable random effect parameters were compared to stationary and non-stationary spatial models. Non-stationarity was modelled assuming that the underlying spatial process is a mixture of separate stationary processes in each ecological zone. The best fitting model based on the deviance information criterion was used to predict Plasmodium spp. infection risk for entire Côte d’Ivoire, including uncertainty. Results Overall, 235 data points at 170 unique survey locations with malaria prevalence data for individuals aged <16 years were extracted. Most data points (n = 182, 77.4%) were collected between 2000 and 2007. A Bayesian non-stationary regression model showed the best fit with annualized rainfall and maximum land surface temperature identified as significant environmental covariates. This model was used to predict malaria infection risk at non-sampled locations. High-risk areas were mainly found in the north-central and western area, while relatively low-risk areas were located in the north at the country border, in the north-east, in the south-east around Abidjan, and in the central-west between two high prevalence areas. Conclusion The malaria risk map at high spatial resolution gives an important overview of the geographical distribution of the disease in Côte d’Ivoire. It is a useful tool for the national malaria control programme and can be utilized for spatial targeting of control interventions and rational resource allocation. PMID:22571469

  19. Geostatistics and Bayesian updating for transmissivity estimation in a multiaquifer system in Manitoba, Canada.

    PubMed

    Kennedy, Paula L; Woodbury, Allan D

    2002-01-01

    In ground water flow and transport modeling, the heterogeneous nature of porous media has a considerable effect on the resulting flow and solute transport. Some method of generating the heterogeneous field from a limited dataset of uncertain measurements is required. Bayesian updating is one method that interpolates from an uncertain dataset using the statistics of the underlying probability distribution function. In this paper, Bayesian updating was used to determine the heterogeneous natural log transmissivity field for a carbonate and a sandstone aquifer in southern Manitoba. It was determined that the transmissivity in m2/sec followed a natural log normal distribution for both aquifers with a mean of -7.2 and - 8.0 for the carbonate and sandstone aquifers, respectively. The variograms were calculated using an estimator developed by Li and Lake (1994). Fractal nature was not evident in the variogram from either aquifer. The Bayesian updating heterogeneous field provided good results even in cases where little data was available. A large transmissivity zone in the sandstone aquifer was created by the Bayesian procedure, which is not a reflection of any deterministic consideration, but is a natural outcome of updating a prior probability distribution function with observations. The statistical model returns a result that is very reasonable; that is homogeneous in regions where little or no information is available to alter an initial state. No long range correlation trends or fractal behavior of the log-transmissivity field was observed in either aquifer over a distance of about 300 km.

  20. Spatial Modelling of Soil-Transmitted Helminth Infections in Kenya: A Disease Control Planning Tool

    PubMed Central

    Pullan, Rachel L.; Gething, Peter W.; Smith, Jennifer L.; Mwandawiro, Charles S.; Sturrock, Hugh J. W.; Gitonga, Caroline W.; Hay, Simon I.; Brooker, Simon

    2011-01-01

    Background Implementation of control of parasitic diseases requires accurate, contemporary maps that provide intervention recommendations at policy-relevant spatial scales. To guide control of soil transmitted helminths (STHs), maps are required of the combined prevalence of infection, indicating where this prevalence exceeds an intervention threshold of 20%. Here we present a new approach for mapping the observed prevalence of STHs, using the example of Kenya in 2009. Methods and Findings Observed prevalence data for hookworm, Ascaris lumbricoides and Trichuris trichiura were assembled for 106,370 individuals from 945 cross-sectional surveys undertaken between 1974 and 2009. Ecological and climatic covariates were extracted from high-resolution satellite data and matched to survey locations. Bayesian space-time geostatistical models were developed for each species, and were used to interpolate the probability that infection prevalence exceeded the 20% threshold across the country for both 1989 and 2009. Maps for each species were integrated to estimate combined STH prevalence using the law of total probability and incorporating a correction factor to adjust for associations between species. Population census data were combined with risk models and projected to estimate the population at risk and requiring treatment in 2009. In most areas for 2009, there was high certainty that endemicity was below the 20% threshold, with areas of endemicity ≥20% located around the shores of Lake Victoria and on the coast. Comparison of the predicted distributions for 1989 and 2009 show how observed STH prevalence has gradually decreased over time. The model estimated that a total of 2.8 million school-age children live in districts which warrant mass treatment. Conclusions Bayesian space-time geostatistical models can be used to reliably estimate the combined observed prevalence of STH and suggest that a quarter of Kenya's school-aged children live in areas of high prevalence and warrant mass treatment. As control is successful in reducing infection levels, updated models can be used to refine decision making in helminth control. PMID:21347451

  1. Sequential Bayesian Geostatistical Inversion and Evaluation of Combined Data Worth for Aquifer Characterization at the Hanford 300 Area

    NASA Astrophysics Data System (ADS)

    Murakami, H.; Chen, X.; Hahn, M. S.; Over, M. W.; Rockhold, M. L.; Vermeul, V.; Hammond, G. E.; Zachara, J. M.; Rubin, Y.

    2010-12-01

    Subsurface characterization for predicting groundwater flow and contaminant transport requires us to integrate large and diverse datasets in a consistent manner, and quantify the associated uncertainty. In this study, we sequentially assimilated multiple types of datasets for characterizing a three-dimensional heterogeneous hydraulic conductivity field at the Hanford 300 Area. The datasets included constant-rate injection tests, electromagnetic borehole flowmeter tests, lithology profile and tracer tests. We used the method of anchored distributions (MAD), which is a modular-structured Bayesian geostatistical inversion method. MAD has two major advantages over the other inversion methods. First, it can directly infer a joint distribution of parameters, which can be used as an input in stochastic simulations for prediction. In MAD, in addition to typical geostatistical structural parameters, the parameter vector includes multiple point values of the heterogeneous field, called anchors, which capture local trends and reduce uncertainty in the prediction. Second, MAD allows us to integrate the datasets sequentially in a Bayesian framework such that it updates the posterior distribution, as a new dataset is included. The sequential assimilation can decrease computational burden significantly. We applied MAD to assimilate different combinations of the datasets, and then compared the inversion results. For the injection and tracer test assimilation, we calculated temporal moments of pressure build-up and breakthrough curves, respectively, to reduce the data dimension. A massive parallel flow and transport code PFLOTRAN is used for simulating the tracer test. For comparison, we used different metrics based on the breakthrough curves not used in the inversion, such as mean arrival time, peak concentration and early arrival time. This comparison intends to yield the combined data worth, i.e. which combination of the datasets is the most effective for a certain metric, which will be useful for guiding the further characterization effort at the site and also the future characterization projects at the other sites.

  2. Targeting trachoma control through risk mapping: the example of Southern Sudan.

    PubMed

    Clements, Archie C A; Kur, Lucia W; Gatpan, Gideon; Ngondi, Jeremiah M; Emerson, Paul M; Lado, Mounir; Sabasio, Anthony; Kolaczinski, Jan H

    2010-08-17

    Trachoma is a major cause of blindness in Southern Sudan. Its distribution has only been partially established and many communities in need of intervention have therefore not been identified or targeted. The present study aimed to develop a tool to improve targeting of survey and control activities. A national trachoma risk map was developed using Bayesian geostatistics models, incorporating trachoma prevalence data from 112 geo-referenced communities surveyed between 2001 and 2009. Logistic regression models were developed using active trachoma (trachomatous inflammation follicular and/or trachomatous inflammation intense) in 6345 children aged 1-9 years as the outcome, and incorporating fixed effects for age, long-term average rainfall (interpolated from weather station data) and land cover (i.e. vegetation type, derived from satellite remote sensing), as well as geostatistical random effects describing spatial clustering of trachoma. The model predicted the west of the country to be at no or low trachoma risk. Trachoma clusters in the central, northern and eastern areas had a radius of 8 km after accounting for the fixed effects. In Southern Sudan, large-scale spatial variation in the risk of active trachoma infection is associated with aridity. Spatial prediction has identified likely high-risk areas to be prioritized for more data collection, potentially to be followed by intervention.

  3. Targeting Trachoma Control through Risk Mapping: The Example of Southern Sudan

    PubMed Central

    Clements, Archie C. A.; Kur, Lucia W.; Gatpan, Gideon; Ngondi, Jeremiah M.; Emerson, Paul M.; Lado, Mounir; Sabasio, Anthony; Kolaczinski, Jan H.

    2010-01-01

    Background Trachoma is a major cause of blindness in Southern Sudan. Its distribution has only been partially established and many communities in need of intervention have therefore not been identified or targeted. The present study aimed to develop a tool to improve targeting of survey and control activities. Methods/Principal Findings A national trachoma risk map was developed using Bayesian geostatistics models, incorporating trachoma prevalence data from 112 geo-referenced communities surveyed between 2001 and 2009. Logistic regression models were developed using active trachoma (trachomatous inflammation follicular and/or trachomatous inflammation intense) in 6345 children aged 1–9 years as the outcome, and incorporating fixed effects for age, long-term average rainfall (interpolated from weather station data) and land cover (i.e. vegetation type, derived from satellite remote sensing), as well as geostatistical random effects describing spatial clustering of trachoma. The model predicted the west of the country to be at no or low trachoma risk. Trachoma clusters in the central, northern and eastern areas had a radius of 8 km after accounting for the fixed effects. Conclusion In Southern Sudan, large-scale spatial variation in the risk of active trachoma infection is associated with aridity. Spatial prediction has identified likely high-risk areas to be prioritized for more data collection, potentially to be followed by intervention. PMID:20808910

  4. On the use of posterior predictive probabilities and prediction uncertainty to tailor informative sampling for parasitological surveillance in livestock.

    PubMed

    Musella, Vincenzo; Rinaldi, Laura; Lagazio, Corrado; Cringoli, Giuseppe; Biggeri, Annibale; Catelan, Dolores

    2014-09-15

    Model-based geostatistics and Bayesian approaches are appropriate in the context of Veterinary Epidemiology when point data have been collected by valid study designs. The aim is to predict a continuous infection risk surface. Little work has been done on the use of predictive infection probabilities at farm unit level. In this paper we show how to use predictive infection probability and related uncertainty from a Bayesian kriging model to draw a informative samples from the 8794 geo-referenced sheep farms of the Campania region (southern Italy). Parasitological data come from a first cross-sectional survey carried out to study the spatial distribution of selected helminths in sheep farms. A grid sampling was performed to select the farms for coprological examinations. Faecal samples were collected for 121 sheep farms and the presence of 21 different helminths were investigated using the FLOTAC technique. The 21 responses are very different in terms of geographical distribution and prevalence of infection. The observed prevalence range is from 0.83% to 96.69%. The distributions of the posterior predictive probabilities for all the 21 parasites are very heterogeneous. We show how the results of the Bayesian kriging model can be used to plan a second wave survey. Several alternatives can be chosen depending on the purposes of the second survey: weight by posterior predictive probabilities, their uncertainty or combining both information. The proposed Bayesian kriging model is simple, and the proposed samping strategy represents a useful tool to address targeted infection control treatments and surbveillance campaigns. It is easily extendable to other fields of research. Copyright © 2014 Elsevier B.V. All rights reserved.

  5. High-Dimensional Bayesian Geostatistics

    PubMed Central

    Banerjee, Sudipto

    2017-01-01

    With the growing capabilities of Geographic Information Systems (GIS) and user-friendly software, statisticians today routinely encounter geographically referenced data containing observations from a large number of spatial locations and time points. Over the last decade, hierarchical spatiotemporal process models have become widely deployed statistical tools for researchers to better understand the complex nature of spatial and temporal variability. However, fitting hierarchical spatiotemporal models often involves expensive matrix computations with complexity increasing in cubic order for the number of spatial locations and temporal points. This renders such models unfeasible for large data sets. This article offers a focused review of two methods for constructing well-defined highly scalable spatiotemporal stochastic processes. Both these processes can be used as “priors” for spatiotemporal random fields. The first approach constructs a low-rank process operating on a lower-dimensional subspace. The second approach constructs a Nearest-Neighbor Gaussian Process (NNGP) that ensures sparse precision matrices for its finite realizations. Both processes can be exploited as a scalable prior embedded within a rich hierarchical modeling framework to deliver full Bayesian inference. These approaches can be described as model-based solutions for big spatiotemporal datasets. The models ensure that the algorithmic complexity has ~ n floating point operations (flops), where n the number of spatial locations (per iteration). We compare these methods and provide some insight into their methodological underpinnings. PMID:29391920

  6. High-Dimensional Bayesian Geostatistics.

    PubMed

    Banerjee, Sudipto

    2017-06-01

    With the growing capabilities of Geographic Information Systems (GIS) and user-friendly software, statisticians today routinely encounter geographically referenced data containing observations from a large number of spatial locations and time points. Over the last decade, hierarchical spatiotemporal process models have become widely deployed statistical tools for researchers to better understand the complex nature of spatial and temporal variability. However, fitting hierarchical spatiotemporal models often involves expensive matrix computations with complexity increasing in cubic order for the number of spatial locations and temporal points. This renders such models unfeasible for large data sets. This article offers a focused review of two methods for constructing well-defined highly scalable spatiotemporal stochastic processes. Both these processes can be used as "priors" for spatiotemporal random fields. The first approach constructs a low-rank process operating on a lower-dimensional subspace. The second approach constructs a Nearest-Neighbor Gaussian Process (NNGP) that ensures sparse precision matrices for its finite realizations. Both processes can be exploited as a scalable prior embedded within a rich hierarchical modeling framework to deliver full Bayesian inference. These approaches can be described as model-based solutions for big spatiotemporal datasets. The models ensure that the algorithmic complexity has ~ n floating point operations (flops), where n the number of spatial locations (per iteration). We compare these methods and provide some insight into their methodological underpinnings.

  7. Model-Based Geostatistical Mapping of the Prevalence of Onchocerca volvulus in West Africa

    PubMed Central

    O’Hanlon, Simon J.; Slater, Hannah C.; Cheke, Robert A.; Boatin, Boakye A.; Coffeng, Luc E.; Pion, Sébastien D. S.; Boussinesq, Michel; Zouré, Honorat G. M.; Stolk, Wilma A.; Basáñez, María-Gloria

    2016-01-01

    Background The initial endemicity (pre-control prevalence) of onchocerciasis has been shown to be an important determinant of the feasibility of elimination by mass ivermectin distribution. We present the first geostatistical map of microfilarial prevalence in the former Onchocerciasis Control Programme in West Africa (OCP) before commencement of antivectorial and antiparasitic interventions. Methods and Findings Pre-control microfilarial prevalence data from 737 villages across the 11 constituent countries in the OCP epidemiological database were used as ground-truth data. These 737 data points, plus a set of statistically selected environmental covariates, were used in a Bayesian model-based geostatistical (B-MBG) approach to generate a continuous surface (at pixel resolution of 5 km x 5km) of microfilarial prevalence in West Africa prior to the commencement of the OCP. Uncertainty in model predictions was measured using a suite of validation statistics, performed on bootstrap samples of held-out validation data. The mean Pearson’s correlation between observed and estimated prevalence at validation locations was 0.693; the mean prediction error (average difference between observed and estimated values) was 0.77%, and the mean absolute prediction error (average magnitude of difference between observed and estimated values) was 12.2%. Within OCP boundaries, 17.8 million people were deemed to have been at risk, 7.55 million to have been infected, and mean microfilarial prevalence to have been 45% (range: 2–90%) in 1975. Conclusions and Significance This is the first map of initial onchocerciasis prevalence in West Africa using B-MBG. Important environmental predictors of infection prevalence were identified and used in a model out-performing those without spatial random effects or environmental covariates. Results may be compared with recent epidemiological mapping efforts to find areas of persisting transmission. These methods may be extended to areas where data are sparse, and may be used to help inform the feasibility of elimination with current and novel tools. PMID:26771545

  8. Bayesian Hierarchical Modeling for Big Data Fusion in Soil Hydrology

    NASA Astrophysics Data System (ADS)

    Mohanty, B.; Kathuria, D.; Katzfuss, M.

    2016-12-01

    Soil moisture datasets from remote sensing (RS) platforms (such as SMOS and SMAP) and reanalysis products from land surface models are typically available on a coarse spatial granularity of several square km. Ground based sensors on the other hand provide observations on a finer spatial scale (meter scale or less) but are sparsely available. Soil moisture is affected by high variability due to complex interactions between geologic, topographic, vegetation and atmospheric variables. Hydrologic processes usually occur at a scale of 1 km or less and therefore spatially ubiquitous and temporally periodic soil moisture products at this scale are required to aid local decision makers in agriculture, weather prediction and reservoir operations. Past literature has largely focused on downscaling RS soil moisture for a small extent of a field or a watershed and hence the applicability of such products has been limited. The present study employs a spatial Bayesian Hierarchical Model (BHM) to derive soil moisture products at a spatial scale of 1 km for the state of Oklahoma by fusing point scale Mesonet data and coarse scale RS data for soil moisture and its auxiliary covariates such as precipitation, topography, soil texture and vegetation. It is seen that the BHM model handles change of support problems easily while performing accurate uncertainty quantification arising from measurement errors and imperfect retrieval algorithms. The computational challenge arising due to the large number of measurements is tackled by utilizing basis function approaches and likelihood approximations. The BHM model can be considered as a complex Bayesian extension of traditional geostatistical prediction methods (such as Kriging) for large datasets in the presence of uncertainties.

  9. Obtaining parsimonious hydraulic conductivity fields using head and transport observations: A Bayesian geostatistical parameter estimation approach

    NASA Astrophysics Data System (ADS)

    Fienen, M.; Hunt, R.; Krabbenhoft, D.; Clemo, T.

    2009-08-01

    Flow path delineation is a valuable tool for interpreting the subsurface hydrogeochemical environment. Different types of data, such as groundwater flow and transport, inform different aspects of hydrogeologic parameter values (hydraulic conductivity in this case) which, in turn, determine flow paths. This work combines flow and transport information to estimate a unified set of hydrogeologic parameters using the Bayesian geostatistical inverse approach. Parameter flexibility is allowed by using a highly parameterized approach with the level of complexity informed by the data. Despite the effort to adhere to the ideal of minimal a priori structure imposed on the problem, extreme contrasts in parameters can result in the need to censor correlation across hydrostratigraphic bounding surfaces. These partitions segregate parameters into facies associations. With an iterative approach in which partitions are based on inspection of initial estimates, flow path interpretation is progressively refined through the inclusion of more types of data. Head observations, stable oxygen isotopes (18O/16O ratios), and tritium are all used to progressively refine flow path delineation on an isthmus between two lakes in the Trout Lake watershed, northern Wisconsin, United States. Despite allowing significant parameter freedom by estimating many distributed parameter values, a smooth field is obtained.

  10. Obtaining parsimonious hydraulic conductivity fields using head and transport observations: A Bayesian geostatistical parameter estimation approach

    USGS Publications Warehouse

    Fienen, M.; Hunt, R.; Krabbenhoft, D.; Clemo, T.

    2009-01-01

    Flow path delineation is a valuable tool for interpreting the subsurface hydrogeochemical environment. Different types of data, such as groundwater flow and transport, inform different aspects of hydrogeologic parameter values (hydraulic conductivity in this case) which, in turn, determine flow paths. This work combines flow and transport information to estimate a unified set of hydrogeologic parameters using the Bayesian geostatistical inverse approach. Parameter flexibility is allowed by using a highly parameterized approach with the level of complexity informed by the data. Despite the effort to adhere to the ideal of minimal a priori structure imposed on the problem, extreme contrasts in parameters can result in the need to censor correlation across hydrostratigraphic bounding surfaces. These partitions segregate parameters into facies associations. With an iterative approach in which partitions are based on inspection of initial estimates, flow path interpretation is progressively refined through the inclusion of more types of data. Head observations, stable oxygen isotopes (18O/16O ratios), and tritium are all used to progressively refine flow path delineation on an isthmus between two lakes in the Trout Lake watershed, northern Wisconsin, United States. Despite allowing significant parameter freedom by estimating many distributed parameter values, a smooth field is obtained.

  11. Carbon Tetrachloride Emissions from the US during 2008 - 2012 Derived from Atmospheric Data Using Bayesian and Geostatistical Inversions

    NASA Astrophysics Data System (ADS)

    Hu, L.; Montzka, S. A.; Miller, B.; Andrews, A. E.; Miller, J. B.; Lehman, S.; Sweeney, C.; Miller, S. M.; Thoning, K. W.; Siso, C.; Atlas, E. L.; Blake, D. R.; De Gouw, J. A.; Gilman, J.; Dutton, G. S.; Elkins, J. W.; Hall, B. D.; Chen, H.; Fischer, M. L.; Mountain, M. E.; Nehrkorn, T.; Biraud, S.; Tans, P. P.

    2015-12-01

    Global atmospheric observations suggest substantial ongoing emissions of carbon tetrachloride (CCl4) despite a 100% phase-out of production for dispersive uses since 1996 in developed countries and 2010 in other countries. Little progress has been made in understanding the causes of these ongoing emissions or identifying their contributing sources. In this study, we employed multiple inverse modeling techniques (i.e. Bayesian and geostatistical inversions) to assimilate CCl4 mole fractions observed from the National Oceanic and Atmospheric Administration (NOAA) flask-air sampling network over the US, and quantify its national and regional emissions during 2008 - 2012. Average national total emissions of CCl4 between 2008 and 2012 determined from these observations and an ensemble of inversions range between 2.1 and 6.1 Gg yr-1. This emission is substantially larger than the mean of 0.06 Gg/yr reported to the US EPA Toxics Release Inventory over these years, suggesting that under-reported emissions or non-reporting sources make up the bulk of CCl4 emissions from the US. But while the inventory does not account for the magnitude of observationally-derived CCl4 emissions, the regional distribution of derived and inventory emissions is similar. Furthermore, when considered relative to the distribution of uncapped landfills or population, the variability in measured mole fractions was most consistent with the distribution of industrial sources (i.e., those from the Toxics Release Inventory). Our results suggest that emissions from the US only account for a small fraction of the global on-going emissions of CCl4 (30 - 80 Gg yr-1 over this period). Finally, to ascertain the importance of the US emissions relative to the unaccounted global emission rate we considered multiple approaches to extrapolate our results to other countries and the globe.

  12. Estimation of Groundwater Radon in North Carolina Using Land Use Regression and Bayesian Maximum Entropy.

    PubMed

    Messier, Kyle P; Campbell, Ted; Bradley, Philip J; Serre, Marc L

    2015-08-18

    Radon ((222)Rn) is a naturally occurring chemically inert, colorless, and odorless radioactive gas produced from the decay of uranium ((238)U), which is ubiquitous in rocks and soils worldwide. Exposure to (222)Rn is likely the second leading cause of lung cancer after cigarette smoking via inhalation; however, exposure through untreated groundwater is also a contributing factor to both inhalation and ingestion routes. A land use regression (LUR) model for groundwater (222)Rn with anisotropic geological and (238)U based explanatory variables is developed, which helps elucidate the factors contributing to elevated (222)Rn across North Carolina. The LUR is also integrated into the Bayesian Maximum Entropy (BME) geostatistical framework to increase accuracy and produce a point-level LUR-BME model of groundwater (222)Rn across North Carolina including prediction uncertainty. The LUR-BME model of groundwater (222)Rn results in a leave-one out cross-validation r(2) of 0.46 (Pearson correlation coefficient = 0.68), effectively predicting within the spatial covariance range. Modeled results of (222)Rn concentrations show variability among intrusive felsic geological formations likely due to average bedrock (238)U defined on the basis of overlying stream-sediment (238)U concentrations that is a widely distributed consistently analyzed point-source data.

  13. Bayesian Geostatistical Model-Based Estimates of Soil-Transmitted Helminth Infection in Nigeria, Including Annual Deworming Requirements

    PubMed Central

    Oluwole, Akinola S.; Ekpo, Uwem F.; Karagiannis-Voules, Dimitrios-Alexios; Abe, Eniola M.; Olamiju, Francisca O.; Isiyaku, Sunday; Okoronkwo, Chukwu; Saka, Yisa; Nebe, Obiageli J.; Braide, Eka I.; Mafiana, Chiedu F.; Utzinger, Jürg; Vounatsou, Penelope

    2015-01-01

    Background The acceleration of the control of soil-transmitted helminth (STH) infections in Nigeria, emphasizing preventive chemotherapy, has become imperative in light of the global fight against neglected tropical diseases. Predictive risk maps are an important tool to guide and support control activities. Methodology STH infection prevalence data were obtained from surveys carried out in 2011 using standard protocols. Data were geo-referenced and collated in a nationwide, geographic information system database. Bayesian geostatistical models with remotely sensed environmental covariates and variable selection procedures were utilized to predict the spatial distribution of STH infections in Nigeria. Principal Findings We found that hookworm, Ascaris lumbricoides, and Trichuris trichiura infections are endemic in 482 (86.8%), 305 (55.0%), and 55 (9.9%) locations, respectively. Hookworm and A. lumbricoides infection co-exist in 16 states, while the three species are co-endemic in 12 states. Overall, STHs are endemic in 20 of the 36 states of Nigeria, including the Federal Capital Territory of Abuja. The observed prevalence at endemic locations ranged from 1.7% to 51.7% for hookworm, from 1.6% to 77.8% for A. lumbricoides, and from 1.0% to 25.5% for T. trichiura. Model-based predictions ranged from 0.7% to 51.0% for hookworm, from 0.1% to 82.6% for A. lumbricoides, and from 0.0% to 18.5% for T. trichiura. Our models suggest that day land surface temperature and dense vegetation are important predictors of the spatial distribution of STH infection in Nigeria. In 2011, a total of 5.7 million (13.8%) school-aged children were predicted to be infected with STHs in Nigeria. Mass treatment at the local government area level for annual or bi-annual treatment of the school-aged population in Nigeria in 2011, based on World Health Organization prevalence thresholds, were estimated at 10.2 million tablets. Conclusions/Significance The predictive risk maps and estimated deworming needs presented here will be helpful for escalating the control and spatial targeting of interventions against STH infections in Nigeria. PMID:25909633

  14. Bayesian geostatistical model-based estimates of soil-transmitted helminth infection in Nigeria, including annual deworming requirements.

    PubMed

    Oluwole, Akinola S; Ekpo, Uwem F; Karagiannis-Voules, Dimitrios-Alexios; Abe, Eniola M; Olamiju, Francisca O; Isiyaku, Sunday; Okoronkwo, Chukwu; Saka, Yisa; Nebe, Obiageli J; Braide, Eka I; Mafiana, Chiedu F; Utzinger, Jürg; Vounatsou, Penelope

    2015-04-01

    The acceleration of the control of soil-transmitted helminth (STH) infections in Nigeria, emphasizing preventive chemotherapy, has become imperative in light of the global fight against neglected tropical diseases. Predictive risk maps are an important tool to guide and support control activities. STH infection prevalence data were obtained from surveys carried out in 2011 using standard protocols. Data were geo-referenced and collated in a nationwide, geographic information system database. Bayesian geostatistical models with remotely sensed environmental covariates and variable selection procedures were utilized to predict the spatial distribution of STH infections in Nigeria. We found that hookworm, Ascaris lumbricoides, and Trichuris trichiura infections are endemic in 482 (86.8%), 305 (55.0%), and 55 (9.9%) locations, respectively. Hookworm and A. lumbricoides infection co-exist in 16 states, while the three species are co-endemic in 12 states. Overall, STHs are endemic in 20 of the 36 states of Nigeria, including the Federal Capital Territory of Abuja. The observed prevalence at endemic locations ranged from 1.7% to 51.7% for hookworm, from 1.6% to 77.8% for A. lumbricoides, and from 1.0% to 25.5% for T. trichiura. Model-based predictions ranged from 0.7% to 51.0% for hookworm, from 0.1% to 82.6% for A. lumbricoides, and from 0.0% to 18.5% for T. trichiura. Our models suggest that day land surface temperature and dense vegetation are important predictors of the spatial distribution of STH infection in Nigeria. In 2011, a total of 5.7 million (13.8%) school-aged children were predicted to be infected with STHs in Nigeria. Mass treatment at the local government area level for annual or bi-annual treatment of the school-aged population in Nigeria in 2011, based on World Health Organization prevalence thresholds, were estimated at 10.2 million tablets. The predictive risk maps and estimated deworming needs presented here will be helpful for escalating the control and spatial targeting of interventions against STH infections in Nigeria.

  15. Risk mapping of clonorchiasis in the People’s Republic of China: A systematic review and Bayesian geostatistical analysis

    PubMed Central

    Lai, Ying-Si; Zhou, Xiao-Nong; Pan, Zhi-Heng; Utzinger, Jürg; Vounatsou, Penelope

    2017-01-01

    Background Clonorchiasis, one of the most important food-borne trematodiases, affects more than 12 million people in the People’s Republic of China (P.R. China). Spatially explicit risk estimates of Clonorchis sinensis infection are needed in order to target control interventions. Methodology Georeferenced survey data pertaining to infection prevalence of C. sinensis in P.R. China from 2000 onwards were obtained via a systematic review in PubMed, ISI Web of Science, Chinese National Knowledge Internet, and Wanfang Data from January 1, 2000 until January 10, 2016, with no restriction of language or study design. Additional disease data were provided by the National Institute of Parasitic Diseases, Chinese Center for Diseases Control and Prevention in Shanghai. Environmental and socioeconomic proxies were extracted from remote-sensing and other data sources. Bayesian variable selection was carried out to identify the most important predictors of C. sinensis risk. Geostatistical models were applied to quantify the association between infection risk and the predictors of the disease, and to predict the risk of infection across P.R. China at high spatial resolution (over a grid with grid cell size of 5×5 km). Principal findings We obtained clonorchiasis survey data at 633 unique locations in P.R. China. We observed that the risk of C. sinensis infection increased over time, particularly from 2005 onwards. We estimate that around 14.8 million (95% Bayesian credible interval 13.8–15.8 million) people in P.R. China were infected with C. sinensis in 2010. Highly endemic areas (≥ 20%) were concentrated in southern and northeastern parts of the country. The provinces with the highest risk of infection and the largest number of infected people were Guangdong, Guangxi, and Heilongjiang. Conclusions/Significance Our results provide spatially relevant information for guiding clonorchiasis control interventions in P.R. China. The trend toward higher risk of C. sinensis infection in the recent past urges the Chinese government to pay more attention to the public health importance of clonorchiasis and to target interventions to high-risk areas. PMID:28253272

  16. Iowa radon leukaemia study: a hierarchical population risk model for spatially correlated exposure measured with error.

    PubMed

    Smith, Brian J; Zhang, Lixun; Field, R William

    2007-11-10

    This paper presents a Bayesian model that allows for the joint prediction of county-average radon levels and estimation of the associated leukaemia risk. The methods are motivated by radon data from an epidemiologic study of residential radon in Iowa that include 2726 outdoor and indoor measurements. Prediction of county-average radon is based on a geostatistical model for the radon data which assumes an underlying continuous spatial process. In the radon model, we account for uncertainties due to incomplete spatial coverage, spatial variability, characteristic differences between homes, and detector measurement error. The predicted radon averages are, in turn, included as a covariate in Poisson models for incident cases of acute lymphocytic (ALL), acute myelogenous (AML), chronic lymphocytic (CLL), and chronic myelogenous (CML) leukaemias reported to the Iowa cancer registry from 1973 to 2002. Since radon and leukaemia risk are modelled simultaneously in our approach, the resulting risk estimates accurately reflect uncertainties in the predicted radon exposure covariate. Posterior mean (95 per cent Bayesian credible interval) estimates of the relative risk associated with a 1 pCi/L increase in radon for ALL, AML, CLL, and CML are 0.91 (0.78-1.03), 1.01 (0.92-1.12), 1.06 (0.96-1.16), and 1.12 (0.98-1.27), respectively. Copyright 2007 John Wiley & Sons, Ltd.

  17. Modern space/time geostatistics using river distances: data integration of turbidity and E. coli measurements to assess fecal contamination along the Raritan River in New Jersey.

    PubMed

    Money, Eric S; Carter, Gail P; Serre, Marc L

    2009-05-15

    Escherichia coli (E. coli) is a widely used indicator of fecal contamination in water bodies. External contact and subsequent ingestion of bacteria coming from fecal contamination can lead to harmful health effects. Since E. coli data are sometimes limited, the objective of this study is to use secondary information in the form of turbidity to improve the assessment of E. coli at unmonitored locations. We obtained all E. coli and turbidity monitoring data available from existing monitoring networks for the 2000-2006 time period for the Raritan River Basin, New Jersey. Using collocated measurements, we developed a predictive model of E. coli from turbidity data. Using this model, soft data are constructed for E. coli given turbidity measurements at 739 space/time locations where only turbidity was measured. Finally, the Bayesian Maximum Entropy (BME) method of modern space/time geostatistics was used for the data integration of monitored and predicted E. coli data to produce maps showing E. coli concentration estimated daily across the river basin. The addition of soft data in conjunction with the use of river distances reduced estimation error by about 30%. Furthermore, based on these maps, up to 35% of river miles in the Raritan Basin had a probability of E coli impairment greater than 90% on the most polluted day of the study period.

  18. Three-Dimensional Bayesian Geostatistical Aquifer Characterization at the Hanford 300 Area using Tracer Test Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Xingyuan; Murakami, Haruko; Hahn, Melanie S.

    2012-06-01

    Tracer testing under natural or forced gradient flow holds the potential to provide useful information for characterizing subsurface properties, through monitoring, modeling and interpretation of the tracer plume migration in an aquifer. Non-reactive tracer experiments were conducted at the Hanford 300 Area, along with constant-rate injection tests and electromagnetic borehole flowmeter (EBF) profiling. A Bayesian data assimilation technique, the method of anchored distributions (MAD) [Rubin et al., 2010], was applied to assimilate the experimental tracer test data with the other types of data and to infer the three-dimensional heterogeneous structure of the hydraulic conductivity in the saturated zone of themore » Hanford formation. In this study, the Bayesian prior information on the underlying random hydraulic conductivity field was obtained from previous field characterization efforts using the constant-rate injection tests and the EBF data. The posterior distribution of the conductivity field was obtained by further conditioning the field on the temporal moments of tracer breakthrough curves at various observation wells. MAD was implemented with the massively-parallel three-dimensional flow and transport code PFLOTRAN to cope with the highly transient flow boundary conditions at the site and to meet the computational demands of MAD. A synthetic study proved that the proposed method could effectively invert tracer test data to capture the essential spatial heterogeneity of the three-dimensional hydraulic conductivity field. Application of MAD to actual field data shows that the hydrogeological model, when conditioned on the tracer test data, can reproduce the tracer transport behavior better than the field characterized without the tracer test data. This study successfully demonstrates that MAD can sequentially assimilate multi-scale multi-type field data through a consistent Bayesian framework.« less

  19. Stochastic Inversion of 2D Magnetotelluric Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Jinsong

    2010-07-01

    The algorithm is developed to invert 2D magnetotelluric (MT) data based on sharp boundary parametrization using a Bayesian framework. Within the algorithm, we consider the locations and the resistivity of regions formed by the interfaces are as unknowns. We use a parallel, adaptive finite-element algorithm to forward simulate frequency-domain MT responses of 2D conductivity structure. Those unknown parameters are spatially correlated and are described by a geostatistical model. The joint posterior probability distribution function is explored by Markov Chain Monte Carlo (MCMC) sampling methods. The developed stochastic model is effective for estimating the interface locations and resistivity. Most importantly, itmore » provides details uncertainty information on each unknown parameter. Hardware requirements: PC, Supercomputer, Multi-platform, Workstation; Software requirements C and Fortan; Operation Systems/version is Linux/Unix or Windows« less

  20. Assessment of groundwater level estimation uncertainty using sequential Gaussian simulation and Bayesian bootstrapping

    NASA Astrophysics Data System (ADS)

    Varouchakis, Emmanouil; Hristopulos, Dionissios

    2015-04-01

    Space-time geostatistical approaches can improve the reliability of dynamic groundwater level models in areas with limited spatial and temporal data. Space-time residual Kriging (STRK) is a reliable method for spatiotemporal interpolation that can incorporate auxiliary information. The method usually leads to an underestimation of the prediction uncertainty. The uncertainty of spatiotemporal models is usually estimated by determining the space-time Kriging variance or by means of cross validation analysis. For de-trended data the former is not usually applied when complex spatiotemporal trend functions are assigned. A Bayesian approach based on the bootstrap idea and sequential Gaussian simulation are employed to determine the uncertainty of the spatiotemporal model (trend and covariance) parameters. These stochastic modelling approaches produce multiple realizations, rank the prediction results on the basis of specified criteria and capture the range of the uncertainty. The correlation of the spatiotemporal residuals is modeled using a non-separable space-time variogram based on the Spartan covariance family (Hristopulos and Elogne 2007, Varouchakis and Hristopulos 2013). We apply these simulation methods to investigate the uncertainty of groundwater level variations. The available dataset consists of bi-annual (dry and wet hydrological period) groundwater level measurements in 15 monitoring locations for the time period 1981 to 2010. The space-time trend function is approximated using a physical law that governs the groundwater flow in the aquifer in the presence of pumping. The main objective of this research is to compare the performance of two simulation methods for prediction uncertainty estimation. In addition, we investigate the performance of the Spartan spatiotemporal covariance function for spatiotemporal geostatistical analysis. Hristopulos, D.T. and Elogne, S.N. 2007. Analytic properties and covariance functions for a new class of generalized Gibbs random fields. IΕΕΕ Transactions on Information Theory, 53:4667-4467. Varouchakis, E.A. and Hristopulos, D.T. 2013. Improvement of groundwater level prediction in sparsely gauged basins using physical laws and local geographic features as auxiliary variables. Advances in Water Resources, 52:34-49. Research supported by the project SPARTA 1591: "Development of Space-Time Random Fields based on Local Interaction Models and Applications in the Processing of Spatiotemporal Datasets". "SPARTA" is implemented under the "ARISTEIA" Action of the operational programme Education and Lifelong Learning and is co-funded by the European Social Fund (ESF) and National Resources.

  1. An LUR/BME framework to estimate PM2.5 explained by on road mobile and stationary sources.

    PubMed

    Reyes, Jeanette M; Serre, Marc L

    2014-01-01

    Knowledge of particulate matter concentrations <2.5 μm in diameter (PM2.5) across the United States is limited due to sparse monitoring across space and time. Epidemiological studies need accurate exposure estimates in order to properly investigate potential morbidity and mortality. Previous works have used geostatistics and land use regression (LUR) separately to quantify exposure. This work combines both methods by incorporating a large area variability LUR model that accounts for on road mobile emissions and stationary source emissions along with data that take into account incompleteness of PM2.5 monitors into the modern geostatistical Bayesian Maximum Entropy (BME) framework to estimate PM2.5 across the United States from 1999 to 2009. A cross-validation was done to determine the improvement of the estimate due to the LUR incorporation into BME. These results were applied to known diseases to determine predicted mortality coming from total PM2.5 as well as PM2.5 explained by major contributing sources. This method showed a mean squared error reduction of over 21.89% oversimple kriging. PM2.5 explained by on road mobile emissions and stationary emissions contributed to nearly 568,090 and 306,316 deaths, respectively, across the United States from 1999 to 2007.

  2. Bayesian modeling to assess populated areas impacted by radiation from Fukushima

    NASA Astrophysics Data System (ADS)

    Hultquist, C.; Cervone, G.

    2017-12-01

    Citizen-led movements producing spatio-temporal big data are increasingly important sources of information about populations that are impacted by natural disasters. Citizen science can be used to fill gaps in disaster monitoring data, in addition to inferring human exposure and vulnerability to extreme environmental impacts. As a response to the 2011 release of radiation from Fukushima, Japan, the Safecast project began collecting open radiation data which grew to be a global dataset of over 70 million measurements to date. This dataset is spatially distributed primarily where humans are located and demonstrates abnormal patterns of population movements as a result of the disaster. Previous work has demonstrated that Safecast is highly correlated in comparison to government radiation observations. However, there is still a scientific need to understand the geostatistical variability of Safecast data and to assess how reliable the data are over space and time. The Bayesian hierarchical approach can be used to model the spatial distribution of datasets and flexibly integrate new flows of data without losing previous information. This enables an understanding of uncertainty in the spatio-temporal data to inform decision makers on areas of high levels of radiation where populations are located. Citizen science data can be scientifically evaluated and used as a critical source of information about populations that are impacted by a disaster.

  3. Reservoir studies with geostatistics to forecast performance

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tang, R.W.; Behrens, R.A.; Emanuel, A.S.

    1991-05-01

    In this paper example geostatistics and streamtube applications are presented for waterflood and CO{sub 2} flood in two low-permeability sandstone reservoirs. Thy hybrid approach of combining fine vertical resolution in cross-sectional models with streamtubes resulted in models that showed water channeling and provided realistic performance estimates. Results indicate that the combination of detailed geostatistical cross sections and fine-grid streamtube models offers a systematic approach for realistic performance forecasts.

  4. Modelling the geographical distribution of soil-transmitted helminth infections in Bolivia.

    PubMed

    Chammartin, Frédérique; Scholte, Ronaldo G C; Malone, John B; Bavia, Mara E; Nieto, Prixia; Utzinger, Jürg; Vounatsou, Penelope

    2013-05-25

    The prevalence of infection with the three common soil-transmitted helminths (i.e. Ascaris lumbricoides, Trichuris trichiura, and hookworm) in Bolivia is among the highest in Latin America. However, the spatial distribution and burden of soil-transmitted helminthiasis are poorly documented. We analysed historical survey data using Bayesian geostatistical models to identify determinants of the distribution of soil-transmitted helminth infections, predict the geographical distribution of infection risk, and assess treatment needs and costs in the frame of preventive chemotherapy. Rigorous geostatistical variable selection identified the most important predictors of A. lumbricoides, T. trichiura, and hookworm transmission. Results show that precipitation during the wettest quarter above 400 mm favours the distribution of A. lumbricoides. Altitude has a negative effect on T. trichiura. Hookworm is sensitive to temperature during the coldest month. We estimate that 38.0%, 19.3%, and 11.4% of the Bolivian population is infected with A. lumbricoides, T. trichiura, and hookworm, respectively. Assuming independence of the three infections, 48.4% of the population is infected with any soil-transmitted helminth. Empirical-based estimates, according to treatment recommendations by the World Health Organization, suggest a total of 2.9 million annualised treatments for the control of soil-transmitted helminthiasis in Bolivia. We provide estimates of soil-transmitted helminth infections in Bolivia based on high-resolution spatial prediction and an innovative variable selection approach. However, the scarcity of the data suggests that a national survey is required for more accurate mapping that will govern spatial targeting of soil-transmitted helminthiasis control.

  5. Modern Space/Time Geostatistics using River Distances: Data Integration of Turbidity and E.coli Measurements to Assess Fecal Contamination Along the Raritan River in New Jersey

    PubMed Central

    Money, Eric S.; Carter, Gail P.; Serre, Marc L.

    2009-01-01

    Escherichia coli (E.coli) is a widely used indicator of fecal contamination in water bodies. External contact and subsequent ingestion of bacteria coming from fecal contamination can lead to harmful health effects. Since E.coli data are sometimes limited, the objective of this study is to use secondary information in the form of turbidity to improve the assessment of E.coli at un-monitored locations. We obtained all E.coli and turbidity monitoring data available from existing monitoring networks for the 2000 – 2006 time period for the Raritan River Basin, New Jersey. Using collocated measurements we developed a predictive model of E.coli from turbidity data. Using this model, soft data are constructed for E.coli given turbidity measurements at 739 space/time locations where only turbidity was measured. Finally, the Bayesian Maximum Entropy (BME) method of modern space/time geostatistics was used for the data integration of monitored and predicted E.coli data to produce maps showing E.coli concentration estimated daily across the river basin. The addition of soft data in conjunction with the use of river distances reduced estimation error by about 30%. Furthermore, based on these maps, up to 35% of river miles in the Raritan Basin had a probability of E.coli impairment greater than 90% on the most polluted day of the study period. PMID:19544881

  6. A multiscale Bayesian data integration approach for mapping air dose rates around the Fukushima Daiichi Nuclear Power Plant.

    PubMed

    Wainwright, Haruko M; Seki, Akiyuki; Chen, Jinsong; Saito, Kimiaki

    2017-02-01

    This paper presents a multiscale data integration method to estimate the spatial distribution of air dose rates in the regional scale around the Fukushima Daiichi Nuclear Power Plant. We integrate various types of datasets, such as ground-based walk and car surveys, and airborne surveys, all of which have different scales, resolutions, spatial coverage, and accuracy. This method is based on geostatistics to represent spatial heterogeneous structures, and also on Bayesian hierarchical models to integrate multiscale, multi-type datasets in a consistent manner. The Bayesian method allows us to quantify the uncertainty in the estimates, and to provide the confidence intervals that are critical for robust decision-making. Although this approach is primarily data-driven, it has great flexibility to include mechanistic models for representing radiation transport or other complex correlations. We demonstrate our approach using three types of datasets collected at the same time over Fukushima City in Japan: (1) coarse-resolution airborne surveys covering the entire area, (2) car surveys along major roads, and (3) walk surveys in multiple neighborhoods. Results show that the method can successfully integrate three types of datasets and create an integrated map (including the confidence intervals) of air dose rates over the domain in high resolution. Moreover, this study provides us with various insights into the characteristics of each dataset, as well as radiocaesium distribution. In particular, the urban areas show high heterogeneity in the contaminant distribution due to human activities as well as large discrepancy among different surveys due to such heterogeneity. Copyright © 2016 Elsevier Ltd. All rights reserved.

  7. Hydraulic Conductivity Estimation using Bayesian Model Averaging and Generalized Parameterization

    NASA Astrophysics Data System (ADS)

    Tsai, F. T.; Li, X.

    2006-12-01

    Non-uniqueness in parameterization scheme is an inherent problem in groundwater inverse modeling due to limited data. To cope with the non-uniqueness problem of parameterization, we introduce a Bayesian Model Averaging (BMA) method to integrate a set of selected parameterization methods. The estimation uncertainty in BMA includes the uncertainty in individual parameterization methods as the within-parameterization variance and the uncertainty from using different parameterization methods as the between-parameterization variance. Moreover, the generalized parameterization (GP) method is considered in the geostatistical framework in this study. The GP method aims at increasing the flexibility of parameterization through the combination of a zonation structure and an interpolation method. The use of BMP with GP avoids over-confidence in a single parameterization method. A normalized least-squares estimation (NLSE) is adopted to calculate the posterior probability for each GP. We employee the adjoint state method for the sensitivity analysis on the weighting coefficients in the GP method. The adjoint state method is also applied to the NLSE problem. The proposed methodology is implemented to the Alamitos Barrier Project (ABP) in California, where the spatially distributed hydraulic conductivity is estimated. The optimal weighting coefficients embedded in GP are identified through the maximum likelihood estimation (MLE) where the misfits between the observed and calculated groundwater heads are minimized. The conditional mean and conditional variance of the estimated hydraulic conductivity distribution using BMA are obtained to assess the estimation uncertainty.

  8. Adjusting for sampling variability in sparse data: geostatistical approaches to disease mapping

    PubMed Central

    2011-01-01

    Background Disease maps of crude rates from routinely collected health data indexed at a small geographical resolution pose specific statistical problems due to the sparse nature of the data. Spatial smoothers allow areas to borrow strength from neighboring regions to produce a more stable estimate of the areal value. Geostatistical smoothers are able to quantify the uncertainty in smoothed rate estimates without a high computational burden. In this paper, we introduce a uniform model extension of Bayesian Maximum Entropy (UMBME) and compare its performance to that of Poisson kriging in measures of smoothing strength and estimation accuracy as applied to simulated data and the real data example of HIV infection in North Carolina. The aim is to produce more reliable maps of disease rates in small areas to improve identification of spatial trends at the local level. Results In all data environments, Poisson kriging exhibited greater smoothing strength than UMBME. With the simulated data where the true latent rate of infection was known, Poisson kriging resulted in greater estimation accuracy with data that displayed low spatial autocorrelation, while UMBME provided more accurate estimators with data that displayed higher spatial autocorrelation. With the HIV data, UMBME performed slightly better than Poisson kriging in cross-validatory predictive checks, with both models performing better than the observed data model with no smoothing. Conclusions Smoothing methods have different advantages depending upon both internal model assumptions that affect smoothing strength and external data environments, such as spatial correlation of the observed data. Further model comparisons in different data environments are required to provide public health practitioners with guidelines needed in choosing the most appropriate smoothing method for their particular health dataset. PMID:21978359

  9. Adjusting for sampling variability in sparse data: geostatistical approaches to disease mapping.

    PubMed

    Hampton, Kristen H; Serre, Marc L; Gesink, Dionne C; Pilcher, Christopher D; Miller, William C

    2011-10-06

    Disease maps of crude rates from routinely collected health data indexed at a small geographical resolution pose specific statistical problems due to the sparse nature of the data. Spatial smoothers allow areas to borrow strength from neighboring regions to produce a more stable estimate of the areal value. Geostatistical smoothers are able to quantify the uncertainty in smoothed rate estimates without a high computational burden. In this paper, we introduce a uniform model extension of Bayesian Maximum Entropy (UMBME) and compare its performance to that of Poisson kriging in measures of smoothing strength and estimation accuracy as applied to simulated data and the real data example of HIV infection in North Carolina. The aim is to produce more reliable maps of disease rates in small areas to improve identification of spatial trends at the local level. In all data environments, Poisson kriging exhibited greater smoothing strength than UMBME. With the simulated data where the true latent rate of infection was known, Poisson kriging resulted in greater estimation accuracy with data that displayed low spatial autocorrelation, while UMBME provided more accurate estimators with data that displayed higher spatial autocorrelation. With the HIV data, UMBME performed slightly better than Poisson kriging in cross-validatory predictive checks, with both models performing better than the observed data model with no smoothing. Smoothing methods have different advantages depending upon both internal model assumptions that affect smoothing strength and external data environments, such as spatial correlation of the observed data. Further model comparisons in different data environments are required to provide public health practitioners with guidelines needed in choosing the most appropriate smoothing method for their particular health dataset.

  10. Random vectors and spatial analysis by geostatistics for geotechnical applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Young, D.S.

    1987-08-01

    Geostatistics is extended to the spatial analysis of vector variables by defining the estimation variance and vector variogram in terms of the magnitude of difference vectors. Many random variables in geotechnology are in vectorial terms rather than scalars, and its structural analysis requires those sample variable interpolations to construct and characterize structural models. A better local estimator will result in greater quality of input models; geostatistics can provide such estimators; kriging estimators. The efficiency of geostatistics for vector variables is demonstrated in a case study of rock joint orientations in geological formations. The positive cross-validation encourages application of geostatistics tomore » spatial analysis of random vectors in geoscience as well as various geotechnical fields including optimum site characterization, rock mechanics for mining and civil structures, cavability analysis of block cavings, petroleum engineering, and hydrologic and hydraulic modelings.« less

  11. Gstat: a program for geostatistical modelling, prediction and simulation

    NASA Astrophysics Data System (ADS)

    Pebesma, Edzer J.; Wesseling, Cees G.

    1998-01-01

    Gstat is a computer program for variogram modelling, and geostatistical prediction and simulation. It provides a generic implementation of the multivariable linear model with trends modelled as a linear function of coordinate polynomials or of user-defined base functions, and independent or dependent, geostatistically modelled, residuals. Simulation in gstat comprises conditional or unconditional (multi-) Gaussian sequential simulation of point values or block averages, or (multi-) indicator sequential simulation. Besides many of the popular options found in other geostatistical software packages, gstat offers the unique combination of (i) an interactive user interface for modelling variograms and generalized covariances (residual variograms), that uses the device-independent plotting program gnuplot for graphical display, (ii) support for several ascii and binary data and map file formats for input and output, (iii) a concise, intuitive and flexible command language, (iv) user customization of program defaults, (v) no built-in limits, and (vi) free, portable ANSI-C source code. This paper describes the class of problems gstat can solve, and addresses aspects of efficiency and implementation, managing geostatistical projects, and relevant technical details.

  12. Stochastic modeling of a lava-flow aquifer system

    USGS Publications Warehouse

    Cronkite-Ratcliff, Collin; Phelps, Geoffrey A.

    2014-01-01

    This report describes preliminary three-dimensional geostatistical modeling of a lava-flow aquifer system using a multiple-point geostatistical model. The purpose of this study is to provide a proof-of-concept for this modeling approach. An example of the method is demonstrated using a subset of borehole geologic data and aquifer test data from a portion of the Calico Hills Formation, a lava-flow aquifer system that partially underlies Pahute Mesa, Nevada. Groundwater movement in this aquifer system is assumed to be controlled by the spatial distribution of two geologic units—rhyolite lava flows and zeolitized tuffs. The configuration of subsurface lava flows and tuffs is largely unknown because of limited data. The spatial configuration of the lava flows and tuffs is modeled by using a multiple-point geostatistical simulation algorithm that generates a large number of alternative realizations, each honoring the available geologic data and drawn from a geologic conceptual model of the lava-flow aquifer system as represented by a training image. In order to demonstrate how results from the geostatistical model could be analyzed in terms of available hydrologic data, a numerical simulation of part of an aquifer test was applied to the realizations of the geostatistical model.

  13. Application of geostatistics to risk assessment.

    PubMed

    Thayer, William C; Griffith, Daniel A; Goodrum, Philip E; Diamond, Gary L; Hassett, James M

    2003-10-01

    Geostatistics offers two fundamental contributions to environmental contaminant exposure assessment: (1) a group of methods to quantitatively describe the spatial distribution of a pollutant and (2) the ability to improve estimates of the exposure point concentration by exploiting the geospatial information present in the data. The second contribution is particularly valuable when exposure estimates must be derived from small data sets, which is often the case in environmental risk assessment. This article addresses two topics related to the use of geostatistics in human and ecological risk assessments performed at hazardous waste sites: (1) the importance of assessing model assumptions when using geostatistics and (2) the use of geostatistics to improve estimates of the exposure point concentration (EPC) in the limited data scenario. The latter topic is approached here by comparing design-based estimators that are familiar to environmental risk assessors (e.g., Land's method) with geostatistics, a model-based estimator. In this report, we summarize the basics of spatial weighting of sample data, kriging, and geostatistical simulation. We then explore the two topics identified above in a case study, using soil lead concentration data from a Superfund site (a skeet and trap range). We also describe several areas where research is needed to advance the use of geostatistics in environmental risk assessment.

  14. Large scale air pollution estimation method combining land use regression and chemical transport modeling in a geostatistical framework.

    PubMed

    Akita, Yasuyuki; Baldasano, Jose M; Beelen, Rob; Cirach, Marta; de Hoogh, Kees; Hoek, Gerard; Nieuwenhuijsen, Mark; Serre, Marc L; de Nazelle, Audrey

    2014-04-15

    In recognition that intraurban exposure gradients may be as large as between-city variations, recent air pollution epidemiologic studies have become increasingly interested in capturing within-city exposure gradients. In addition, because of the rapidly accumulating health data, recent studies also need to handle large study populations distributed over large geographic domains. Even though several modeling approaches have been introduced, a consistent modeling framework capturing within-city exposure variability and applicable to large geographic domains is still missing. To address these needs, we proposed a modeling framework based on the Bayesian Maximum Entropy method that integrates monitoring data and outputs from existing air quality models based on Land Use Regression (LUR) and Chemical Transport Models (CTM). The framework was applied to estimate the yearly average NO2 concentrations over the region of Catalunya in Spain. By jointly accounting for the global scale variability in the concentration from the output of CTM and the intraurban scale variability through LUR model output, the proposed framework outperformed more conventional approaches.

  15. Using river distance and existing hydrography data can improve the geostatistical estimation of fish tissue mercury at unsampled locations.

    PubMed

    Money, Eric S; Sackett, Dana K; Aday, D Derek; Serre, Marc L

    2011-09-15

    Mercury in fish tissue is a major human health concern. Consumption of mercury-contaminated fish poses risks to the general population, including potentially serious developmental defects and neurological damage in young children. Therefore, it is important to accurately identify areas that have the potential for high levels of bioaccumulated mercury. However, due to time and resource constraints, it is difficult to adequately assess fish tissue mercury on a basin wide scale. We hypothesized that, given the nature of fish movement along streams, an analytical approach that takes into account distance traveled along these streams would improve the estimation accuracy for fish tissue mercury in unsampled streams. Therefore, we used a river-based Bayesian Maximum Entropy framework (river-BME) for modern space/time geostatistics to estimate fish tissue mercury at unsampled locations in the Cape Fear and Lumber Basins in eastern North Carolina. We also compared the space/time geostatistical estimation using river-BME to the more traditional Euclidean-based BME approach, with and without the inclusion of a secondary variable. Results showed that this river-based approach reduced the estimation error of fish tissue mercury by more than 13% and that the median estimate of fish tissue mercury exceeded the EPA action level of 0.3 ppm in more than 90% of river miles for the study domain.

  16. A Novel Approach of Understanding and Incorporating Error of Chemical Transport Models into a Geostatistical Framework

    NASA Astrophysics Data System (ADS)

    Reyes, J.; Vizuete, W.; Serre, M. L.; Xu, Y.

    2015-12-01

    The EPA employs a vast monitoring network to measure ambient PM2.5 concentrations across the United States with one of its goals being to quantify exposure within the population. However, there are several areas of the country with sparse monitoring spatially and temporally. One means to fill in these monitoring gaps is to use PM2.5 modeled estimates from Chemical Transport Models (CTMs) specifically the Community Multi-scale Air Quality (CMAQ) model. CMAQ is able to provide complete spatial coverage but is subject to systematic and random error due to model uncertainty. Due to the deterministic nature of CMAQ, often these uncertainties are not quantified. Much effort is employed to quantify the efficacy of these models through different metrics of model performance. Currently evaluation is specific to only locations with observed data. Multiyear studies across the United States are challenging because the error and model performance of CMAQ are not uniform over such large space/time domains. Error changes regionally and temporally. Because of the complex mix of species that constitute PM2.5, CMAQ error is also a function of increasing PM2.5 concentration. To address this issue we introduce a model performance evaluation for PM2.5 CMAQ that is regionalized and non-linear. This model performance evaluation leads to error quantification for each CMAQ grid. Areas and time periods of error being better qualified. The regionalized error correction approach is non-linear and is therefore more flexible at characterizing model performance than approaches that rely on linearity assumptions and assume homoscedasticity of CMAQ predictions errors. Corrected CMAQ data are then incorporated into the modern geostatistical framework of Bayesian Maximum Entropy (BME). Through cross validation it is shown that incorporating error-corrected CMAQ data leads to more accurate estimates than just using observed data by themselves.

  17. Integrated geostatistics for modeling fluid contacts and shales in Prudhoe Bay

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Perez, G.; Chopra, A.K.; Severson, C.D.

    1997-12-01

    Geostatistics techniques are being used increasingly to model reservoir heterogeneity at a wide range of scales. A variety of techniques is now available with differing underlying assumptions, complexity, and applications. This paper introduces a novel method of geostatistics to model dynamic gas-oil contacts and shales in the Prudhoe Bay reservoir. The method integrates reservoir description and surveillance data within the same geostatistical framework. Surveillance logs and shale data are transformed to indicator variables. These variables are used to evaluate vertical and horizontal spatial correlation and cross-correlation of gas and shale at different times and to develop variogram models. Conditional simulationmore » techniques are used to generate multiple three-dimensional (3D) descriptions of gas and shales that provide a measure of uncertainty. These techniques capture the complex 3D distribution of gas-oil contacts through time. The authors compare results of the geostatistical method with conventional techniques as well as with infill wells drilled after the study. Predicted gas-oil contacts and shale distributions are in close agreement with gas-oil contacts observed at infill wells.« less

  18. 3D Sedimentological and geophysical studies of clastic reservoir analogs: Facies architecture, reservoir properties, and flow behavior within delta front facies elements of the Cretaceous Wall Creek Member, Frontier Formation, Wyoming

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Christopher D. White

    2009-12-21

    Significant volumes of oil and gas occur in reservoirs formed by ancient river deltas. This has implications for the spatial distribution of rock types and the variation of transport properties. A between mudstones and sandstones may form baffles that influence productivity and recovery efficiency. Diagenetic processes such as compaction, dissolution, and cementation can also alter flow properties. A better understanding of these properties and improved methods will allow improved reservoir development planning and increased recovery of oil and gas from deltaic reservoirs. Surface exposures of ancient deltaic rocks provide a high-resolution view of variability. Insights gleaned from these exposures canmore » be used to model analogous reservoirs, for which data is sparser. The Frontier Formation in central Wyoming provides an opportunity for high-resolution models. The same rocks exposed in the Tisdale anticline are productive in nearby oil fields. Kilometers of exposure are accessible, and bedding-plane exposures allow use of high-resolution ground-penetrating radar. This study combined geologic interpretations, maps, vertical sections, core data, and ground-penetrating radar to construct geostatistical and flow models. Strata-conforming grids were use to reproduce the observed geometries. A new Bayesian method integrates outcrop, core, and radar amplitude and phase data. The proposed method propagates measurement uncertainty and yields an ensemble of plausible models for calcite concretions. These concretions affect flow significantly. Models which integrate more have different flow responses from simpler models, as demonstrated an exhaustive two-dimensional reference image and in three dimensions. This method is simple to implement within widely available geostatistics packages. Significant volumes of oil and gas occur in reservoirs that are inferred to have been formed by ancient river deltas. This geologic setting has implications for the spatial distribution of rock types (\\Eg sandstones and mudstones) and the variation of transport properties (\\Eg permeability and porosity) within bodies of a particular rock type. Both basin-wide processes such as sea-level change and the autocyclicity of deltaic processes commonly cause deltaic reservoirs to have large variability in rock properties; in particular, alternations between mudstones and sandstones may form baffles and trends in rock body permeability can influence productivity and recovery efficiency. In addition, diagenetic processes such as compaction, dissolution, and cementation can alter the spatial pattern of flow properties. A better understanding of these properties, and improved methods to model the properties and their effects, will allow improved reservoir development planning and increased recovery of oil and gas from deltaic reservoirs. Surface exposures of ancient deltaic rocks provide a high resolution, low uncertainty view of subsurface variability. Patterns and insights gleaned from these exposures can be used to model analogous reservoirs, for which data is much sparser. This approach is particularly attractive when reservoir formations are exposed at the surface. The Frontier Formation in central Wyoming provides an opportunity for high resolution characterization. The same rocks exposed in the vicinity of the Tisdale anticline are productive in nearby oil fields, including Salt Creek. Many kilometers of good-quality exposure are accessible, and the common bedding-plane exposures allow use of shallow-penetration, high-resolution electromagnetic methods known as ground-penetrating radar. This study combined geologic interpretations, maps, vertical sections, core data, and ground-penetrating radar to construct high-resolution geostatistical and flow models for the Wall Creek Member of the Frontier Formation. Stratal-conforming grids were use to reproduce the progradational and aggradational geometries observed in outcrop and radar data. A new, Bayesian method integrates outcrop--derived statistics, core observations of concretions, and radar amplitude and phase data. The proposed method consistently propagates measurement uncertainty through the model-building process, and yields an ensemble of plausible models for diagenetic calcite concretions. These concretions have a statistically significant on flow. Furthermore, neither geostatistical data from the outcrops nor geophysical data from radar is sufficient: models which integrate these data have significantly different flow responses. This was demonstrated both for an exhaustive two-dimensional reference image and in three dimensions, using flow simulations. This project wholly supported one PhD student and part of the education of an additional MS and PhD student. It helped to sponsor 6 refereed articles and 8 conference or similar presentations.« less

  19. Comparative study of transient hydraulic tomography with varying parameterizations and zonations: Laboratory sandbox investigation

    NASA Astrophysics Data System (ADS)

    Luo, Ning; Zhao, Zhanfeng; Illman, Walter A.; Berg, Steven J.

    2017-11-01

    Transient hydraulic tomography (THT) is a robust method of aquifer characterization to estimate the spatial distributions (or tomograms) of both hydraulic conductivity (K) and specific storage (Ss). However, the highly-parameterized nature of the geostatistical inversion approach renders it computationally intensive for large-scale investigations. In addition, geostatistics-based THT may produce overly smooth tomograms when head data used to constrain the inversion is limited. Therefore, alternative model conceptualizations for THT need to be examined. To investigate this, we simultaneously calibrated different groundwater models with varying parameterizations and zonations using two cases of different pumping and monitoring data densities from a laboratory sandbox. Specifically, one effective parameter model, four geology-based zonation models with varying accuracy and resolution, and five geostatistical models with different prior information are calibrated. Model performance is quantitatively assessed by examining the calibration and validation results. Our study reveals that highly parameterized geostatistical models perform the best among the models compared, while the zonation model with excellent knowledge of stratigraphy also yields comparable results. When few pumping tests with sparse monitoring intervals are available, the incorporation of accurate or simplified geological information into geostatistical models reveals more details in heterogeneity and yields more robust validation results. However, results deteriorate when inaccurate geological information are incorporated. Finally, our study reveals that transient inversions are necessary to obtain reliable K and Ss estimates for making accurate predictions of transient drawdown events.

  20. Incorporating geologic information into hydraulic tomography: A general framework based on geostatistical approach

    NASA Astrophysics Data System (ADS)

    Zha, Yuanyuan; Yeh, Tian-Chyi J.; Illman, Walter A.; Onoe, Hironori; Mok, Chin Man W.; Wen, Jet-Chau; Huang, Shao-Yang; Wang, Wenke

    2017-04-01

    Hydraulic tomography (HT) has become a mature aquifer test technology over the last two decades. It collects nonredundant information of aquifer heterogeneity by sequentially stressing the aquifer at different wells and collecting aquifer responses at other wells during each stress. The collected information is then interpreted by inverse models. Among these models, the geostatistical approaches, built upon the Bayesian framework, first conceptualize hydraulic properties to be estimated as random fields, which are characterized by means and covariance functions. They then use the spatial statistics as prior information with the aquifer response data to estimate the spatial distribution of the hydraulic properties at a site. Since the spatial statistics describe the generic spatial structures of the geologic media at the site rather than site-specific ones (e.g., known spatial distributions of facies, faults, or paleochannels), the estimates are often not optimal. To improve the estimates, we introduce a general statistical framework, which allows the inclusion of site-specific spatial patterns of geologic features. Subsequently, we test this approach with synthetic numerical experiments. Results show that this approach, using conditional mean and covariance that reflect site-specific large-scale geologic features, indeed improves the HT estimates. Afterward, this approach is applied to HT surveys at a kilometer-scale-fractured granite field site with a distinct fault zone. We find that by including fault information from outcrops and boreholes for HT analysis, the estimated hydraulic properties are improved. The improved estimates subsequently lead to better prediction of flow during a different pumping test at the site.

  1. Geostatistical applications in ground-water modeling in south-central Kansas

    USGS Publications Warehouse

    Ma, T.-S.; Sophocleous, M.; Yu, Y.-S.

    1999-01-01

    This paper emphasizes the supportive role of geostatistics in applying ground-water models. Field data of 1994 ground-water level, bedrock, and saltwater-freshwater interface elevations in south-central Kansas were collected and analyzed using the geostatistical approach. Ordinary kriging was adopted to estimate initial conditions for ground-water levels and topography of the Permian bedrock at the nodes of a finite difference grid used in a three-dimensional numerical model. Cokriging was used to estimate initial conditions for the saltwater-freshwater interface. An assessment of uncertainties in the estimated data is presented. The kriged and cokriged estimation variances were analyzed to evaluate the adequacy of data employed in the modeling. Although water levels and bedrock elevations are well described by spherical semivariogram models, additional data are required for better cokriging estimation of the interface data. The geostatistically analyzed data were employed in a numerical model of the Siefkes site in the project area. Results indicate that the computed chloride concentrations and ground-water drawdowns reproduced the observed data satisfactorily.This paper emphasizes the supportive role of geostatistics in applying ground-water models. Field data of 1994 ground-water level, bedrock, and saltwater-freshwater interface elevations in south-central Kansas were collected and analyzed using the geostatistical approach. Ordinary kriging was adopted to estimate initial conditions for ground-water levels and topography of the Permian bedrock at the nodes of a finite difference grid used in a three-dimensional numerical model. Cokriging was used to estimate initial conditions for the saltwater-freshwater interface. An assessment of uncertainties in the estimated data is presented. The kriged and cokriged estimation variances were analyzed to evaluate the adequacy of data employed in the modeling. Although water levels and bedrock elevations are well described by spherical semivariogram models, additional data are required for better cokriging estimation of the interface data. The geostatistically analyzed data were employed in a numerical model of the Siefkes site in the project area. Results indicate that the computed chloride concentrations and ground-water drawdowns reproduced the observed data satisfactorily.

  2. Geostatistics: a new tool for describing spatially-varied surface conditions from timber harvested and burned hillslopes

    Treesearch

    Peter R. Robichaud

    1997-01-01

    Geostatistics provides a method to describe the spatial continuity of many natural phenomena. Spatial models are based upon the concept of scaling, kriging and conditional simulation. These techniques were used to describe the spatially-varied surface conditions on timber harvest and burned hillslopes. Geostatistical techniques provided estimates of the ground cover (...

  3. Robust spatialization of soil water content at the scale of an agricultural field using geophysical and geostatistical methods

    NASA Astrophysics Data System (ADS)

    Henine, Hocine; Tournebize, Julien; Laurent, Gourdol; Christophe, Hissler; Cournede, Paul-Henry; Clement, Remi

    2017-04-01

    Research on the Critical Zone (CZ) is a prerequisite for undertaking issues related to ecosystemic services that human societies rely on (nutrient cycles, water supply and quality). However, while the upper part of CZ (vegetation, soil, surface water) is readily accessible, knowledge of the subsurface remains limited, due to the point-scale character of conventional direct observations. While the potential for geophysical methods to overcome this limitation is recognized, the translation of the geophysical information into physical properties or states of interest remains a challenge (e.g. the translation of soil electrical resistivity into soil water content). In this study, we propose a geostatistical framework using the Bayesian Maximum Entropy (BME) approach to assimilate geophysical and point-scale data. We especially focus on the prediction of the spatial distribution of soil water content using (1) TDR point-scale measurements of soil water content, which are considered as accurate data, and (2) soil water content data derived from electrical resistivity measurements, which are uncertain data but spatially dense. We used a synthetic dataset obtained with a vertical 2D domain to evaluate the performance of this geostatistical approach. Spatio-temporal simulations of soil water content were carried out using Hydrus-software for different scenarios: homogeneous or heterogeneous hydraulic conductivity distribution, and continuous or punctual infiltration pattern. From the simulations of soil water content, conceptual soil resistivity models were built using a forward modeling approach and point sampling of water content values, vertically ranged, were done. These two datasets are similar to field measurements of soil electrical resistivity (using electrical resistivity tomography, ERT) and soil water content (using TDR probes) obtained at the Boissy-le-Chatel site, in Orgeval catchment (East of Paris, France). We then integrated them into a specialization framework to predict the soil water content distribution and the results were compared to initial simulations (Hydrus results). We obtained more reliable water content specialization models when using the BME method. The presented approach integrates ERT and TDR measurements, and results demonstrate that its use significantly improves the spatial distribution of water content estimations. The approach will be applied to the experimental dataset collected at the Boissy le Châtel site where ERT data were collected daily during one hydrological year, using Syscal pro 48 electrodes (with a financial support of Equipex-Critex) and 10 TDR probes were used to monitor water content variation. Hourly hydrological survey (tile drainage discharge, precipitation, evapotranspiration variables and water table depth) were conducted at the same site. Data analysis and the application of geostatistical framework on the experimental dataset of 2015-2016 show satisfactory results and are reliable with the hydrological behavior of the study site.

  4. Three-dimensional Bayesian geostatistical aquifer characterization at the Hanford 300 Area using tracer test data

    NASA Astrophysics Data System (ADS)

    Chen, Xingyuan; Murakami, Haruko; Hahn, Melanie S.; Hammond, Glenn E.; Rockhold, Mark L.; Zachara, John M.; Rubin, Yoram

    2012-06-01

    Tracer tests performed under natural or forced gradient flow conditions can provide useful information for characterizing subsurface properties, through monitoring, modeling, and interpretation of the tracer plume migration in an aquifer. Nonreactive tracer experiments were conducted at the Hanford 300 Area, along with constant-rate injection tests and electromagnetic borehole flowmeter tests. A Bayesian data assimilation technique, the method of anchored distributions (MAD) (Rubin et al., 2010), was applied to assimilate the experimental tracer test data with the other types of data and to infer the three-dimensional heterogeneous structure of the hydraulic conductivity in the saturated zone of the Hanford formation.In this study, the Bayesian prior information on the underlying random hydraulic conductivity field was obtained from previous field characterization efforts using constant-rate injection and borehole flowmeter test data. The posterior distribution of the conductivity field was obtained by further conditioning the field on the temporal moments of tracer breakthrough curves at various observation wells. MAD was implemented with the massively parallel three-dimensional flow and transport code PFLOTRAN to cope with the highly transient flow boundary conditions at the site and to meet the computational demands of MAD. A synthetic study proved that the proposed method could effectively invert tracer test data to capture the essential spatial heterogeneity of the three-dimensional hydraulic conductivity field. Application of MAD to actual field tracer data at the Hanford 300 Area demonstrates that inverting for spatial heterogeneity of hydraulic conductivity under transient flow conditions is challenging and more work is needed.

  5. A geostatistical approach to the change-of-support problem and variable-support data fusion in spatial analysis

    NASA Astrophysics Data System (ADS)

    Wang, Jun; Wang, Yang; Zeng, Hui

    2016-01-01

    A key issue to address in synthesizing spatial data with variable-support in spatial analysis and modeling is the change-of-support problem. We present an approach for solving the change-of-support and variable-support data fusion problems. This approach is based on geostatistical inverse modeling that explicitly accounts for differences in spatial support. The inverse model is applied here to produce both the best predictions of a target support and prediction uncertainties, based on one or more measurements, while honoring measurements. Spatial data covering large geographic areas often exhibit spatial nonstationarity and can lead to computational challenge due to the large data size. We developed a local-window geostatistical inverse modeling approach to accommodate these issues of spatial nonstationarity and alleviate computational burden. We conducted experiments using synthetic and real-world raster data. Synthetic data were generated and aggregated to multiple supports and downscaled back to the original support to analyze the accuracy of spatial predictions and the correctness of prediction uncertainties. Similar experiments were conducted for real-world raster data. Real-world data with variable-support were statistically fused to produce single-support predictions and associated uncertainties. The modeling results demonstrate that geostatistical inverse modeling can produce accurate predictions and associated prediction uncertainties. It is shown that the local-window geostatistical inverse modeling approach suggested offers a practical way to solve the well-known change-of-support problem and variable-support data fusion problem in spatial analysis and modeling.

  6. Applications of geostatistics and Markov models for logo recognition

    NASA Astrophysics Data System (ADS)

    Pham, Tuan

    2003-01-01

    Spatial covariances based on geostatistics are extracted as representative features of logo or trademark images. These spatial covariances are different from other statistical features for image analysis in that the structural information of an image is independent of the pixel locations and represented in terms of spatial series. We then design a classifier in the sense of hidden Markov models to make use of these geostatistical sequential data to recognize the logos. High recognition rates are obtained from testing the method against a public-domain logo database.

  7. Development of an Anisotropic Geological-Based Land Use Regression and Bayesian Maximum Entropy Model for Estimating Groundwater Radon across Northing Carolina

    NASA Astrophysics Data System (ADS)

    Messier, K. P.; Serre, M. L.

    2015-12-01

    Radon (222Rn) is a naturally occurring chemically inert, colorless, and odorless radioactive gas produced from the decay of uranium (238U), which is ubiquitous in rocks and soils worldwide. Exposure to 222Rn is likely the second leading cause of lung cancer after cigarette smoking via inhalation; however, exposure through untreated groundwater is also a contributing factor to both inhalation and ingestion routes. A land use regression (LUR) model for groundwater 222Rn with anisotropic geological and 238U based explanatory variables is developed, which helps elucidate the factors contributing to elevated 222Rn across North Carolina. Geological and uranium based variables are constructed in elliptical buffers surrounding each observation such that they capture the lateral geometric anisotropy present in groundwater 222Rn. Moreover, geological features are defined at three different geological spatial scales to allow the model to distinguish between large area and small area effects of geology on groundwater 222Rn. The LUR is also integrated into the Bayesian Maximum Entropy (BME) geostatistical framework to increase accuracy and produce a point-level LUR-BME model of groundwater 222Rn across North Carolina including prediction uncertainty. The LUR-BME model of groundwater 222Rn results in a leave-one out cross-validation of 0.46 (Pearson correlation coefficient= 0.68), effectively predicting within the spatial covariance range. Modeled results of 222Rn concentrations show variability among Intrusive Felsic geological formations likely due to average bedrock 238U defined on the basis of overlying stream-sediment 238U concentrations that is a widely distributed consistently analyzed point-source data.

  8. Analysis of Feature Intervisibility and Cumulative Visibility Using GIS, Bayesian and Spatial Statistics: A Study from the Mandara Mountains, Northern Cameroon

    PubMed Central

    Wright, David K.; MacEachern, Scott; Lee, Jaeyong

    2014-01-01

    The locations of diy-geδ-bay (DGB) sites in the Mandara Mountains, northern Cameroon are hypothesized to occur as a function of their ability to see and be seen from points on the surrounding landscape. A series of geostatistical, two-way and Bayesian logistic regression analyses were performed to test two hypotheses related to the intervisibility of the sites to one another and their visual prominence on the landscape. We determine that the intervisibility of the sites to one another is highly statistically significant when compared to 10 stratified-random permutations of DGB sites. Bayesian logistic regression additionally demonstrates that the visibility of the sites to points on the surrounding landscape is statistically significant. The location of sites appears to have also been selected on the basis of lower slope than random permutations of sites. Using statistical measures, many of which are not commonly employed in archaeological research, to evaluate aspects of visibility on the landscape, we conclude that the placement of DGB sites improved their conspicuousness for enhanced ritual, social cooperation and/or competition purposes. PMID:25383883

  9. Co-endemicity of Pulmonary Tuberculosis and Intestinal Helminth Infection in the People’s Republic of China

    PubMed Central

    Li, Xin-Xu; Ren, Zhou-Peng; Wang, Li-Xia; Zhang, Hui; Jiang, Shi-Wen; Chen, Jia-Xu; Wang, Jin-Feng; Zhou, Xiao-Nong

    2016-01-01

    Both pulmonary tuberculosis (PTB) and intestinal helminth infection (IHI) affect millions of individuals every year in China. However, the national-scale estimation of prevalence predictors and prevalence maps for these diseases, as well as co-endemic relative risk (RR) maps of both diseases’ prevalence are not well developed. There are co-endemic, high prevalence areas of both diseases, whose delimitation is essential for devising effective control strategies. Bayesian geostatistical logistic regression models including socio-economic, climatic, geographical and environmental predictors were fitted separately for active PTB and IHI based on data from the national surveys for PTB and major human parasitic diseases that were completed in 2010 and 2004, respectively. Prevalence maps and co-endemic RR maps were constructed for both diseases by means of Bayesian Kriging model and Bayesian shared component model capable of appraising the fraction of variance of spatial RRs shared by both diseases, and those specific for each one, under an assumption that there are unobserved covariates common to both diseases. Our results indicate that gross domestic product (GDP) per capita had a negative association, while rural regions, the arid and polar zones and elevation had positive association with active PTB prevalence; for the IHI prevalence, GDP per capita and distance to water bodies had a negative association, the equatorial and warm zones and the normalized difference vegetation index had a positive association. Moderate to high prevalence of active PTB and low prevalence of IHI were predicted in western regions, low to moderate prevalence of active PTB and low prevalence of IHI were predicted in north-central regions and the southeast coastal regions, and moderate to high prevalence of active PTB and high prevalence of IHI were predicted in the south-western regions. Thus, co-endemic areas of active PTB and IHI were located in the south-western regions of China, which might be determined by socio-economic factors, such as GDP per capita. PMID:27088504

  10. Three-Dimensional Bayesian Geostatistical Aquifer Characterization at the Hanford 300 Area using Tracer Test Data

    NASA Astrophysics Data System (ADS)

    Chen, X.; Murakami, H.; Hahn, M. S.; Hammond, G. E.; Rockhold, M. L.; Rubin, Y.

    2010-12-01

    Tracer testing under natural or forced gradient flow provides useful information for characterizing subsurface properties, by monitoring and modeling the tracer plume migration in a heterogeneous aquifer. At the Hanford 300 Area, non-reactive tracer experiments, in addition to constant-rate injection tests and electromagnetic borehole flowmeter (EBF) profiling, were conducted to characterize the heterogeneous hydraulic conductivity field. A Bayesian data assimilation technique, method of anchored distributions (MAD), is applied to assimilate the experimental tracer test data and to infer the three-dimensional heterogeneous structure of the hydraulic conductivity in the saturated zone of the Hanford formation. In this study, the prior information of the underlying random hydraulic conductivity field was obtained from previous field characterization efforts using the constant-rate injection tests and the EBF data. The posterior distribution of the random field is obtained by further conditioning the field on the temporal moments of tracer breakthrough curves at various observation wells. The parallel three-dimensional flow and transport code PFLOTRAN is implemented to cope with the highly transient flow boundary conditions at the site and to meet the computational demand of the proposed method. The validation results show that the field conditioned on the tracer test data better reproduces the tracer transport behavior compared to the field characterized previously without the tracer test data. A synthetic study proves that the proposed method can effectively assimilate tracer test data to capture the essential spatial heterogeneity of the three-dimensional hydraulic conductivity field. These characterization results will improve conceptual models developed for the site, including reactive transport models. The study successfully demonstrates the capability of MAD to assimilate multi-scale multi-type field data within a consistent Bayesian framework. The MAD framework can potentially be applied to combine geophysical data with other types of data in site characterization.

  11. A practical primer on geostatistics

    USGS Publications Warehouse

    Olea, Ricardo A.

    2009-01-01

    The Challenge—Most geological phenomena are extraordinarily complex in their interrelationships and vast in their geographical extension. Ordinarily, engineers and geoscientists are faced with corporate or scientific requirements to properly prepare geological models with measurements involving a small fraction of the entire area or volume of interest. Exact description of a system such as an oil reservoir is neither feasible nor economically possible. The results are necessarily uncertain. Note that the uncertainty is not an intrinsic property of the systems; it is the result of incomplete knowledge by the observer.The Aim of Geostatistics—The main objective of geostatistics is the characterization of spatial systems that are incompletely known, systems that are common in geology. A key difference from classical statistics is that geostatistics uses the sampling location of every measurement. Unless the measurements show spatial correlation, the application of geostatistics is pointless. Ordinarily the need for additional knowledge goes beyond a few points, which explains the display of results graphically as fishnet plots, block diagrams, and maps.Geostatistical Methods—Geostatistics is a collection of numerical techniques for the characterization of spatial attributes using primarily two tools: probabilistic models, which are used for spatial data in a manner similar to the way in which time-series analysis characterizes temporal data, or pattern recognition techniques. The probabilistic models are used as a way to handle uncertainty in results away from sampling locations, making a radical departure from alternative approaches like inverse distance estimation methods.Differences with Time Series—On dealing with time-series analysis, users frequently concentrate their attention on extrapolations for making forecasts. Although users of geostatistics may be interested in extrapolation, the methods work at their best interpolating. This simple difference has significant methodological implications.Historical Remarks—As a discipline, geostatistics was firmly established in the 1960s by the French engineer Georges Matheron, who was interested in the appraisal of ore reserves in mining. Geostatistics did not develop overnight. Like other disciplines, it has built on previous results, many of which were formulated with different objectives in various fields.Pioneers—Seminal ideas conceptually related to what today we call geostatistics or spatial statistics are found in the work of several pioneers, including: 1940s: A.N. Kolmogorov in turbulent flow and N. Wiener in stochastic processing; 1950s: D. Krige in mining; 1960s: B. Mathern in forestry and L.S. Gandin in meteorologyCalculations—Serious applications of geostatistics require the use of digital computers. Although for most geostatistical techniques rudimentary implementation from scratch is fairly straightforward, coding programs from scratch is recommended only as part of a practice that may help users to gain a better grasp of the formulations.Software—For professional work, the reader should employ software packages that have been thoroughly tested to handle any sampling scheme, that run as efficiently as possible, and that offer graphic capabilities for the analysis and display of results. This primer employs primarily the package Stanford Geomodeling Software (SGeMS) - recently developed at the Energy Resources Engineering Department at Stanford University - as a way to show how to obtain results practically. This applied side of the primer should not be interpreted as the notes being a manual for the use of SGeMS. The main objective of the primer is to help the reader gain an understanding of the fundamental concepts and tools in geostatistics.Organization of the Primer—The chapters of greatest importance are those covering kriging and simulation. All other materials are peripheral and are included for better comprehension of these main geostatistical modeling tools. The choice of kriging versus simulation is often a big puzzle to the uninitiated, let alone the different variants of both of them. Chapters 14, 18, and 19 are intended to shed light on those subjects. The critical aspect of assessing and modeling spatial correlation is covered in chapter 7. Chapters 2 and 3 review relevant concepts in classical statistics.Course Objectives—This course offers stochastic solutions to common problems in the characterization of complex geological systems. At the end of the course, participants should have: an understanding of the theoretical foundations of geostatistics; a good grasp of its possibilities and limitations; and reasonable familiarity with the SGeMS software, thus opening the possibility of practically applying geostatistics.

  12. Qualitative and quantitative comparison of geostatistical techniques of porosity prediction from the seismic and logging data: a case study from the Blackfoot Field, Alberta, Canada

    NASA Astrophysics Data System (ADS)

    Maurya, S. P.; Singh, K. H.; Singh, N. P.

    2018-05-01

    In present study, three recently developed geostatistical methods, single attribute analysis, multi-attribute analysis and probabilistic neural network algorithm have been used to predict porosity in inter well region for Blackfoot field, Alberta, Canada, an offshore oil field. These techniques make use of seismic attributes, generated by model based inversion and colored inversion techniques. The principle objective of the study is to find the suitable combination of seismic inversion and geostatistical techniques to predict porosity and identification of prospective zones in 3D seismic volume. The porosity estimated from these geostatistical approaches is corroborated with the well log porosity. The results suggest that all the three implemented geostatistical methods are efficient and reliable to predict the porosity but the multi-attribute and probabilistic neural network analysis provide more accurate and high resolution porosity sections. A low impedance (6000-8000 m/s g/cc) and high porosity (> 15%) zone is interpreted from inverted impedance and porosity sections respectively between 1060 and 1075 ms time interval and is characterized as reservoir. The qualitative and quantitative results demonstrate that of all the employed geostatistical methods, the probabilistic neural network along with model based inversion is the most efficient method for predicting porosity in inter well region.

  13. Geostatistical modelling of household malaria in Malawi

    NASA Astrophysics Data System (ADS)

    Chirombo, J.; Lowe, R.; Kazembe, L.

    2012-04-01

    Malaria is one of the most important diseases in the world today, common in tropical and subtropical areas with sub-Saharan Africa being the region most burdened, including Malawi. This region has the right combination of biotic and abiotic components, including socioeconomic, climatic and environmental factors that sustain transmission of the disease. Differences in these conditions across the country consequently lead to spatial variation in risk of the disease. Analysis of nationwide survey data that takes into account this spatial variation is crucial in a resource constrained country like Malawi for targeted allocation of scare resources in the fight against malaria. Previous efforts to map malaria risk in Malawi have been based on limited data collected from small surveys. The Malaria Indicator Survey conducted in 2010 is the most comprehensive malaria survey carried out in Malawi and provides point referenced data for the study. The data has been shown to be spatially correlated. We use Bayesian logistic regression models with spatial correlation to model the relationship between malaria presence in children and covariates such as socioeconomic status of households and meteorological conditions. This spatial model is then used to assess how malaria varies spatially and a malaria risk map for Malawi is produced. By taking intervention measures into account, the developed model is used to assess whether they have an effect on the spatial distribution of the disease and Bayesian kriging is used to predict areas where malaria risk is more likely to increase. It is hoped that this study can help reveal areas that require more attention from the authorities in the continuing fight against malaria, particularly in children under the age of five.

  14. Overview and technical and practical aspects for use of geostatistics in hazardous-, toxic-, and radioactive-waste-site investigations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bossong, C.R.; Karlinger, M.R.; Troutman, B.M.

    1999-10-01

    Technical and practical aspects of applying geostatistics are developed for individuals involved in investigation at hazardous-, toxic-, and radioactive-waste sites. Important geostatistical concepts, such as variograms and ordinary, universal, and indicator kriging, are described in general terms for introductory purposes and in more detail for practical applications. Variogram modeling using measured ground-water elevation data is described in detail to illustrate principles of stationarity, anisotropy, transformations, and cross validation. Several examples of kriging applications are described using ground-water-level elevations, bedrock elevations, and ground-water-quality data. A review of contemporary literature and selected public domain software associated with geostatistics also is provided, asmore » is a discussion of alternative methods for spatial modeling, including inverse distance weighting, triangulation, splines, trend-surface analysis, and simulation.« less

  15. Latin hypercube sampling and geostatistical modeling of spatial uncertainty in a spatially explicit forest landscape model simulation

    Treesearch

    Chonggang Xu; Hong S. He; Yuanman Hu; Yu Chang; Xiuzhen Li; Rencang Bu

    2005-01-01

    Geostatistical stochastic simulation is always combined with Monte Carlo method to quantify the uncertainty in spatial model simulations. However, due to the relatively long running time of spatially explicit forest models as a result of their complexity, it is always infeasible to generate hundreds or thousands of Monte Carlo simulations. Thus, it is of great...

  16. Developing a new Bayesian Risk Index for risk evaluation of soil contamination.

    PubMed

    Albuquerque, M T D; Gerassis, S; Sierra, C; Taboada, J; Martín, J E; Antunes, I M H R; Gallego, J R

    2017-12-15

    Industrial and agricultural activities heavily constrain soil quality. Potentially Toxic Elements (PTEs) are a threat to public health and the environment alike. In this regard, the identification of areas that require remediation is crucial. In the herein research a geochemical dataset (230 samples) comprising 14 elements (Cu, Pb, Zn, Ag, Ni, Mn, Fe, As, Cd, V, Cr, Ti, Al and S) was gathered throughout eight different zones distinguished by their main activity, namely, recreational, agriculture/livestock and heavy industry in the Avilés Estuary (North of Spain). Then a stratified systematic sampling method was used at short, medium, and long distances from each zone to obtain a representative picture of the total variability of the selected attributes. The information was then combined in four risk classes (Low, Moderate, High, Remediation) following reference values from several sediment quality guidelines (SQGs). A Bayesian analysis, inferred for each zone, allowed the characterization of PTEs correlations, the unsupervised learning network technique proving to be the best fit. Based on the Bayesian network structure obtained, Pb, As and Mn were selected as key contamination parameters. For these 3 elements, the conditional probability obtained was allocated to each observed point, and a simple, direct index (Bayesian Risk Index-BRI) was constructed as a linear rating of the pre-defined risk classes weighted by the previously obtained probability. Finally, the BRI underwent geostatistical modeling. One hundred Sequential Gaussian Simulations (SGS) were computed. The Mean Image and the Standard Deviation maps were obtained, allowing the definition of High/Low risk clusters (Local G clustering) and the computation of spatial uncertainty. High-risk clusters are mainly distributed within the area with the highest altitude (agriculture/livestock) showing an associated low spatial uncertainty, clearly indicating the need for remediation. Atmospheric emissions, mainly derived from the metallurgical industry, contribute to soil contamination by PTEs. Copyright © 2017 Elsevier B.V. All rights reserved.

  17. Estimation of geotechnical parameters on the basis of geophysical methods and geostatistics

    NASA Astrophysics Data System (ADS)

    Brom, Aleksander; Natonik, Adrianna

    2017-12-01

    The paper presents possible implementation of ordinary cokriging and geophysical investigation on humidity data acquired in geotechnical studies. The Author describes concept of geostatistics, terminology of geostatistical modelling, spatial correlation functions, principles of solving cokriging systems, advantages of (co-)kriging in comparison with other interpolation methods, obstacles in this type of attempt. Cross validation and discussion of results was performed with an indication of prospect of applying similar procedures in various researches..

  18. Geostatistical regularization operators for geophysical inverse problems on irregular meshes

    NASA Astrophysics Data System (ADS)

    Jordi, C.; Doetsch, J.; Günther, T.; Schmelzbach, C.; Robertsson, J. OA

    2018-05-01

    Irregular meshes allow to include complicated subsurface structures into geophysical modelling and inverse problems. The non-uniqueness of these inverse problems requires appropriate regularization that can incorporate a priori information. However, defining regularization operators for irregular discretizations is not trivial. Different schemes for calculating smoothness operators on irregular meshes have been proposed. In contrast to classical regularization constraints that are only defined using the nearest neighbours of a cell, geostatistical operators include a larger neighbourhood around a particular cell. A correlation model defines the extent of the neighbourhood and allows to incorporate information about geological structures. We propose an approach to calculate geostatistical operators for inverse problems on irregular meshes by eigendecomposition of a covariance matrix that contains the a priori geological information. Using our approach, the calculation of the operator matrix becomes tractable for 3-D inverse problems on irregular meshes. We tested the performance of the geostatistical regularization operators and compared them against the results of anisotropic smoothing in inversions of 2-D surface synthetic electrical resistivity tomography (ERT) data as well as in the inversion of a realistic 3-D cross-well synthetic ERT scenario. The inversions of 2-D ERT and seismic traveltime field data with geostatistical regularization provide results that are in good accordance with the expected geology and thus facilitate their interpretation. In particular, for layered structures the geostatistical regularization provides geologically more plausible results compared to the anisotropic smoothness constraints.

  19. Geotechnical parameter spatial distribution stochastic analysis based on multi-precision information assimilation

    NASA Astrophysics Data System (ADS)

    Wang, C.; Rubin, Y.

    2014-12-01

    Spatial distribution of important geotechnical parameter named compression modulus Es contributes considerably to the understanding of the underlying geological processes and the adequate assessment of the Es mechanics effects for differential settlement of large continuous structure foundation. These analyses should be derived using an assimilating approach that combines in-situ static cone penetration test (CPT) with borehole experiments. To achieve such a task, the Es distribution of stratum of silty clay in region A of China Expo Center (Shanghai) is studied using the Bayesian-maximum entropy method. This method integrates rigorously and efficiently multi-precision of different geotechnical investigations and sources of uncertainty. Single CPT samplings were modeled as a rational probability density curve by maximum entropy theory. Spatial prior multivariate probability density function (PDF) and likelihood PDF of the CPT positions were built by borehole experiments and the potential value of the prediction point, then, preceding numerical integration on the CPT probability density curves, the posterior probability density curve of the prediction point would be calculated by the Bayesian reverse interpolation framework. The results were compared between Gaussian Sequential Stochastic Simulation and Bayesian methods. The differences were also discussed between single CPT samplings of normal distribution and simulated probability density curve based on maximum entropy theory. It is shown that the study of Es spatial distributions can be improved by properly incorporating CPT sampling variation into interpolation process, whereas more informative estimations are generated by considering CPT Uncertainty for the estimation points. Calculation illustrates the significance of stochastic Es characterization in a stratum, and identifies limitations associated with inadequate geostatistical interpolation techniques. This characterization results will provide a multi-precision information assimilation method of other geotechnical parameters.

  20. Geostatistical Prediction of Microbial Water Quality Throughout a Stream Network Using Meteorology, Land Cover, and Spatiotemporal Autocorrelation.

    PubMed

    Holcomb, David A; Messier, Kyle P; Serre, Marc L; Rowny, Jakob G; Stewart, Jill R

    2018-06-25

    Predictive modeling is promising as an inexpensive tool to assess water quality. We developed geostatistical predictive models of microbial water quality that empirically modeled spatiotemporal autocorrelation in measured fecal coliform (FC) bacteria concentrations to improve prediction. We compared five geostatistical models featuring different autocorrelation structures, fit to 676 observations from 19 locations in North Carolina's Jordan Lake watershed using meteorological and land cover predictor variables. Though stream distance metrics (with and without flow-weighting) failed to improve prediction over the Euclidean distance metric, incorporating temporal autocorrelation substantially improved prediction over the space-only models. We predicted FC throughout the stream network daily for one year, designating locations "impaired", "unimpaired", or "unassessed" if the probability of exceeding the state standard was ≥90%, ≤10%, or >10% but <90%, respectively. We could assign impairment status to more of the stream network on days any FC were measured, suggesting frequent sample-based monitoring remains necessary, though implementing spatiotemporal predictive models may reduce the number of concurrent sampling locations required to adequately assess water quality. Together, these results suggest that prioritizing sampling at different times and conditions using geographically sparse monitoring networks is adequate to build robust and informative geostatistical models of water quality impairment.

  1. Modelling Geomechanical Heterogeneity of Rock Masses Using Direct and Indirect Geostatistical Conditional Simulation Methods

    NASA Astrophysics Data System (ADS)

    Eivazy, Hesameddin; Esmaieli, Kamran; Jean, Raynald

    2017-12-01

    An accurate characterization and modelling of rock mass geomechanical heterogeneity can lead to more efficient mine planning and design. Using deterministic approaches and random field methods for modelling rock mass heterogeneity is known to be limited in simulating the spatial variation and spatial pattern of the geomechanical properties. Although the applications of geostatistical techniques have demonstrated improvements in modelling the heterogeneity of geomechanical properties, geostatistical estimation methods such as Kriging result in estimates of geomechanical variables that are not fully representative of field observations. This paper reports on the development of 3D models for spatial variability of rock mass geomechanical properties using geostatistical conditional simulation method based on sequential Gaussian simulation. A methodology to simulate the heterogeneity of rock mass quality based on the rock mass rating is proposed and applied to a large open-pit mine in Canada. Using geomechanical core logging data collected from the mine site, a direct and an indirect approach were used to model the spatial variability of rock mass quality. The results of the two modelling approaches were validated against collected field data. The study aims to quantify the risks of pit slope failure and provides a measure of uncertainties in spatial variability of rock mass properties in different areas of the pit.

  2. An emission-weighted proximity model for air pollution exposure assessment.

    PubMed

    Zou, Bin; Wilson, J Gaines; Zhan, F Benjamin; Zeng, Yongnian

    2009-08-15

    Among the most common spatial models for estimating personal exposure are Traditional Proximity Models (TPMs). Though TPMs are straightforward to configure and interpret, they are prone to extensive errors in exposure estimates and do not provide prospective estimates. To resolve these inherent problems with TPMs, we introduce here a novel Emission Weighted Proximity Model (EWPM) to improve the TPM, which takes into consideration the emissions from all sources potentially influencing the receptors. EWPM performance was evaluated by comparing the normalized exposure risk values of sulfur dioxide (SO(2)) calculated by EWPM with those calculated by TPM and monitored observations over a one-year period in two large Texas counties. In order to investigate whether the limitations of TPM in potential exposure risk prediction without recorded incidence can be overcome, we also introduce a hybrid framework, a 'Geo-statistical EWPM'. Geo-statistical EWPM is a synthesis of Ordinary Kriging Geo-statistical interpolation and EWPM. The prediction results are presented as two potential exposure risk prediction maps. The performance of these two exposure maps in predicting individual SO(2) exposure risk was validated with 10 virtual cases in prospective exposure scenarios. Risk values for EWPM were clearly more agreeable with the observed concentrations than those from TPM. Over the entire study area, the mean SO(2) exposure risk from EWPM was higher relative to TPM (1.00 vs. 0.91). The mean bias of the exposure risk values of 10 virtual cases between EWPM and 'Geo-statistical EWPM' are much smaller than those between TPM and 'Geo-statistical TPM' (5.12 vs. 24.63). EWPM appears to more accurately portray individual exposure relative to TPM. The 'Geo-statistical EWPM' effectively augments the role of the standard proximity model and makes it possible to predict individual risk in future exposure scenarios resulting in adverse health effects from environmental pollution.

  3. spMC: an R-package for 3D lithological reconstructions based on spatial Markov chains

    NASA Astrophysics Data System (ADS)

    Sartore, Luca; Fabbri, Paolo; Gaetan, Carlo

    2016-09-01

    The paper presents the spatial Markov Chains (spMC) R-package and a case study of subsoil simulation/prediction located in a plain site of Northeastern Italy. spMC is a quite complete collection of advanced methods for data inspection, besides spMC implements Markov Chain models to estimate experimental transition probabilities of categorical lithological data. Furthermore, simulation methods based on most known prediction methods (as indicator Kriging and CoKriging) were implemented in spMC package. Moreover, other more advanced methods are available for simulations, e.g. path methods and Bayesian procedures, that exploit the maximum entropy. Since the spMC package was developed for intensive geostatistical computations, part of the code is implemented for parallel computations via the OpenMP constructs. A final analysis of this computational efficiency compares the simulation/prediction algorithms by using different numbers of CPU cores, and considering the example data set of the case study included in the package.

  4. A path-level exact parallelization strategy for sequential simulation

    NASA Astrophysics Data System (ADS)

    Peredo, Oscar F.; Baeza, Daniel; Ortiz, Julián M.; Herrero, José R.

    2018-01-01

    Sequential Simulation is a well known method in geostatistical modelling. Following the Bayesian approach for simulation of conditionally dependent random events, Sequential Indicator Simulation (SIS) method draws simulated values for K categories (categorical case) or classes defined by K different thresholds (continuous case). Similarly, Sequential Gaussian Simulation (SGS) method draws simulated values from a multivariate Gaussian field. In this work, a path-level approach to parallelize SIS and SGS methods is presented. A first stage of re-arrangement of the simulation path is performed, followed by a second stage of parallel simulation for non-conflicting nodes. A key advantage of the proposed parallelization method is to generate identical realizations as with the original non-parallelized methods. Case studies are presented using two sequential simulation codes from GSLIB: SISIM and SGSIM. Execution time and speedup results are shown for large-scale domains, with many categories and maximum kriging neighbours in each case, achieving high speedup results in the best scenarios using 16 threads of execution in a single machine.

  5. Applications of Geostatistics in Plant Nematology

    PubMed Central

    Wallace, M. K.; Hawkins, D. M.

    1994-01-01

    The application of geostatistics to plant nematology was made by evaluating soil and nematode data acquired from 200 soil samples collected from the Ap horizon of a reed canary-grass field in northern Minnesota. Geostatistical concepts relevant to nematology include semi-variogram modelling, kriging, and change of support calculations. Soil and nematode data generally followed a spherical semi-variogram model, with little random variability associated with soil data and large inherent variability for nematode data. Block kriging of soil and nematode data provided useful contour maps of the data. Change of snpport calculations indicated that most of the random variation in nematode data was due to short-range spatial variability in the nematode population densities. PMID:19279938

  6. Applications of geostatistics in plant nematology.

    PubMed

    Wallace, M K; Hawkins, D M

    1994-12-01

    The application of geostatistics to plant nematology was made by evaluating soil and nematode data acquired from 200 soil samples collected from the A(p) horizon of a reed canary-grass field in northern Minnesota. Geostatistical concepts relevant to nematology include semi-variogram modelling, kriging, and change of support calculations. Soil and nematode data generally followed a spherical semi-variogram model, with little random variability associated with soil data and large inherent variability for nematode data. Block kriging of soil and nematode data provided useful contour maps of the data. Change of snpport calculations indicated that most of the random variation in nematode data was due to short-range spatial variability in the nematode population densities.

  7. A hierarchical spatial model for well yield in complex aquifers

    NASA Astrophysics Data System (ADS)

    Montgomery, J.; O'sullivan, F.

    2017-12-01

    Efficiently siting and managing groundwater wells requires reliable estimates of the amount of water that can be produced, or the well yield. This can be challenging to predict in highly complex, heterogeneous fractured aquifers due to the uncertainty around local hydraulic properties. Promising statistical approaches have been advanced in recent years. For instance, kriging and multivariate regression analysis have been applied to well test data with limited but encouraging levels of prediction accuracy. Additionally, some analytical solutions to diffusion in homogeneous porous media have been used to infer "effective" properties consistent with observed flow rates or drawdown. However, this is an under-specified inverse problem with substantial and irreducible uncertainty. We describe a flexible machine learning approach capable of combining diverse datasets with constraining physical and geostatistical models for improved well yield prediction accuracy and uncertainty quantification. Our approach can be implemented within a hierarchical Bayesian framework using Markov Chain Monte Carlo, which allows for additional sources of information to be incorporated in priors to further constrain and improve predictions and reduce the model order. We demonstrate the usefulness of this approach using data from over 7,000 wells in a fractured bedrock aquifer.

  8. A GIS Tool for evaluating and improving NEXRAD and its application in distributed hydrologic modeling

    NASA Astrophysics Data System (ADS)

    Zhang, X.; Srinivasan, R.

    2008-12-01

    In this study, a user friendly GIS tool was developed for evaluating and improving NEXRAD using raingauge data. This GIS tool can automatically read in raingauge and NEXRAD data, evaluate the accuracy of NEXRAD for each time unit, implement several geostatistical methods to improve the accuracy of NEXRAD through raingauge data, and output spatial precipitation map for distributed hydrologic model. The geostatistical methods incorporated in this tool include Simple Kriging with varying local means, Kriging with External Drift, Regression Kriging, Co-Kriging, and a new geostatistical method that was newly developed by Li et al. (2008). This tool was applied in two test watersheds at hourly and daily temporal scale. The preliminary cross-validation results show that incorporating raingauge data to calibrate NEXRAD can pronouncedly change the spatial pattern of NEXRAD and improve its accuracy. Using different geostatistical methods, the GIS tool was applied to produce long term precipitation input for a distributed hydrologic model - Soil and Water Assessment Tool (SWAT). Animated video was generated to vividly illustrate the effect of using different precipitation input data on distributed hydrologic modeling. Currently, this GIS tool is developed as an extension of SWAT, which is used as water quantity and quality modeling tool by USDA and EPA. The flexible module based design of this tool also makes it easy to be adapted for other hydrologic models for hydrological modeling and water resources management.

  9. Combining geostatistics with Moran's I analysis for mapping soil heavy metals in Beijing, China.

    PubMed

    Huo, Xiao-Ni; Li, Hong; Sun, Dan-Feng; Zhou, Lian-Di; Li, Bao-Guo

    2012-03-01

    Production of high quality interpolation maps of heavy metals is important for risk assessment of environmental pollution. In this paper, the spatial correlation characteristics information obtained from Moran's I analysis was used to supplement the traditional geostatistics. According to Moran's I analysis, four characteristics distances were obtained and used as the active lag distance to calculate the semivariance. Validation of the optimality of semivariance demonstrated that using the two distances where the Moran's I and the standardized Moran's I, Z(I) reached a maximum as the active lag distance can improve the fitting accuracy of semivariance. Then, spatial interpolation was produced based on the two distances and their nested model. The comparative analysis of estimation accuracy and the measured and predicted pollution status showed that the method combining geostatistics with Moran's I analysis was better than traditional geostatistics. Thus, Moran's I analysis is a useful complement for geostatistics to improve the spatial interpolation accuracy of heavy metals.

  10. Combining Geostatistics with Moran’s I Analysis for Mapping Soil Heavy Metals in Beijing, China

    PubMed Central

    Huo, Xiao-Ni; Li, Hong; Sun, Dan-Feng; Zhou, Lian-Di; Li, Bao-Guo

    2012-01-01

    Production of high quality interpolation maps of heavy metals is important for risk assessment of environmental pollution. In this paper, the spatial correlation characteristics information obtained from Moran’s I analysis was used to supplement the traditional geostatistics. According to Moran’s I analysis, four characteristics distances were obtained and used as the active lag distance to calculate the semivariance. Validation of the optimality of semivariance demonstrated that using the two distances where the Moran’s I and the standardized Moran’s I, Z(I) reached a maximum as the active lag distance can improve the fitting accuracy of semivariance. Then, spatial interpolation was produced based on the two distances and their nested model. The comparative analysis of estimation accuracy and the measured and predicted pollution status showed that the method combining geostatistics with Moran’s I analysis was better than traditional geostatistics. Thus, Moran’s I analysis is a useful complement for geostatistics to improve the spatial interpolation accuracy of heavy metals. PMID:22690179

  11. Linear models of coregionalization for multivariate lattice data: Order-dependent and order-free cMCARs.

    PubMed

    MacNab, Ying C

    2016-08-01

    This paper concerns with multivariate conditional autoregressive models defined by linear combination of independent or correlated underlying spatial processes. Known as linear models of coregionalization, the method offers a systematic and unified approach for formulating multivariate extensions to a broad range of univariate conditional autoregressive models. The resulting multivariate spatial models represent classes of coregionalized multivariate conditional autoregressive models that enable flexible modelling of multivariate spatial interactions, yielding coregionalization models with symmetric or asymmetric cross-covariances of different spatial variation and smoothness. In the context of multivariate disease mapping, for example, they facilitate borrowing strength both over space and cross variables, allowing for more flexible multivariate spatial smoothing. Specifically, we present a broadened coregionalization framework to include order-dependent, order-free, and order-robust multivariate models; a new class of order-free coregionalized multivariate conditional autoregressives is introduced. We tackle computational challenges and present solutions that are integral for Bayesian analysis of these models. We also discuss two ways of computing deviance information criterion for comparison among competing hierarchical models with or without unidentifiable prior parameters. The models and related methodology are developed in the broad context of modelling multivariate data on spatial lattice and illustrated in the context of multivariate disease mapping. The coregionalization framework and related methods also present a general approach for building spatially structured cross-covariance functions for multivariate geostatistics. © The Author(s) 2016.

  12. A MS-lesion pattern discrimination plot based on geostatistics.

    PubMed

    Marschallinger, Robert; Schmidt, Paul; Hofmann, Peter; Zimmer, Claus; Atkinson, Peter M; Sellner, Johann; Trinka, Eugen; Mühlau, Mark

    2016-03-01

    A geostatistical approach to characterize MS-lesion patterns based on their geometrical properties is presented. A dataset of 259 binary MS-lesion masks in MNI space was subjected to directional variography. A model function was fit to express the observed spatial variability in x, y, z directions by the geostatistical parameters Range and Sill. Parameters Range and Sill correlate with MS-lesion pattern surface complexity and total lesion volume. A scatter plot of ln(Range) versus ln(Sill), classified by pattern anisotropy, enables a consistent and clearly arranged presentation of MS-lesion patterns based on geometry: the so-called MS-Lesion Pattern Discrimination Plot. The geostatistical approach and the graphical representation of results are considered efficient exploratory data analysis tools for cross-sectional, follow-up, and medication impact analysis.

  13. TiConverter: A training image converting tool for multiple-point geostatistics

    NASA Astrophysics Data System (ADS)

    Fadlelmula F., Mohamed M.; Killough, John; Fraim, Michael

    2016-11-01

    TiConverter is a tool developed to ease the application of multiple-point geostatistics whether by the open source Stanford Geostatistical Modeling Software (SGeMS) or other available commercial software. TiConverter has a user-friendly interface and it allows the conversion of 2D training images into numerical representations in four different file formats without the need for additional code writing. These are the ASCII (.txt), the geostatistical software library (GSLIB) (.txt), the Isatis (.dat), and the VTK formats. It performs the conversion based on the RGB color system. In addition, TiConverter offers several useful tools including image resizing, smoothing, and segmenting tools. The purpose of this study is to introduce the TiConverter, and to demonstrate its application and advantages with several examples from the literature.

  14. Resolving the Antarctic contribution to sea-level rise: a hierarchical modelling framework.

    PubMed

    Zammit-Mangion, Andrew; Rougier, Jonathan; Bamber, Jonathan; Schön, Nana

    2014-06-01

    Determining the Antarctic contribution to sea-level rise from observational data is a complex problem. The number of physical processes involved (such as ice dynamics and surface climate) exceeds the number of observables, some of which have very poor spatial definition. This has led, in general, to solutions that utilise strong prior assumptions or physically based deterministic models to simplify the problem. Here, we present a new approach for estimating the Antarctic contribution, which only incorporates descriptive aspects of the physically based models in the analysis and in a statistical manner. By combining physical insights with modern spatial statistical modelling techniques, we are able to provide probability distributions on all processes deemed to play a role in both the observed data and the contribution to sea-level rise. Specifically, we use stochastic partial differential equations and their relation to geostatistical fields to capture our physical understanding and employ a Gaussian Markov random field approach for efficient computation. The method, an instantiation of Bayesian hierarchical modelling, naturally incorporates uncertainty in order to reveal credible intervals on all estimated quantities. The estimated sea-level rise contribution using this approach corroborates those found using a statistically independent method. © 2013 The Authors. Environmetrics Published by John Wiley & Sons, Ltd.

  15. Resolving the Antarctic contribution to sea-level rise: a hierarchical modelling framework†

    PubMed Central

    Zammit-Mangion, Andrew; Rougier, Jonathan; Bamber, Jonathan; Schön, Nana

    2014-01-01

    Determining the Antarctic contribution to sea-level rise from observational data is a complex problem. The number of physical processes involved (such as ice dynamics and surface climate) exceeds the number of observables, some of which have very poor spatial definition. This has led, in general, to solutions that utilise strong prior assumptions or physically based deterministic models to simplify the problem. Here, we present a new approach for estimating the Antarctic contribution, which only incorporates descriptive aspects of the physically based models in the analysis and in a statistical manner. By combining physical insights with modern spatial statistical modelling techniques, we are able to provide probability distributions on all processes deemed to play a role in both the observed data and the contribution to sea-level rise. Specifically, we use stochastic partial differential equations and their relation to geostatistical fields to capture our physical understanding and employ a Gaussian Markov random field approach for efficient computation. The method, an instantiation of Bayesian hierarchical modelling, naturally incorporates uncertainty in order to reveal credible intervals on all estimated quantities. The estimated sea-level rise contribution using this approach corroborates those found using a statistically independent method. © 2013 The Authors. Environmetrics Published by John Wiley & Sons, Ltd. PMID:25505370

  16. Predicting polycyclic aromatic hydrocarbons using a mass fraction approach in a geostatistical framework across North Carolina.

    PubMed

    Reyes, Jeanette M; Hubbard, Heidi F; Stiegel, Matthew A; Pleil, Joachim D; Serre, Marc L

    2018-01-09

    Currently in the United States there are no regulatory standards for ambient concentrations of polycyclic aromatic hydrocarbons (PAHs), a class of organic compounds with known carcinogenic species. As such, monitoring data are not routinely collected resulting in limited exposure mapping and epidemiologic studies. This work develops the log-mass fraction (LMF) Bayesian maximum entropy (BME) geostatistical prediction method used to predict the concentration of nine particle-bound PAHs across the US state of North Carolina. The LMF method develops a relationship between a relatively small number of collocated PAH and fine Particulate Matter (PM2.5) samples collected in 2005 and applies that relationship to a larger number of locations where PM2.5 is routinely monitored to more broadly estimate PAH concentrations across the state. Cross validation and mapping results indicate that by incorporating both PAH and PM2.5 data, the LMF BME method reduces mean squared error by 28.4% and produces more realistic spatial gradients compared to the traditional kriging approach based solely on observed PAH data. The LMF BME method efficiently creates PAH predictions in a PAH data sparse and PM2.5 data rich setting, opening the door for more expansive epidemiologic exposure assessments of ambient PAH.

  17. Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets.

    PubMed

    Datta, Abhirup; Banerjee, Sudipto; Finley, Andrew O; Gelfand, Alan E

    2016-01-01

    Spatial process models for analyzing geostatistical data entail computations that become prohibitive as the number of spatial locations become large. This article develops a class of highly scalable nearest-neighbor Gaussian process (NNGP) models to provide fully model-based inference for large geostatistical datasets. We establish that the NNGP is a well-defined spatial process providing legitimate finite-dimensional Gaussian densities with sparse precision matrices. We embed the NNGP as a sparsity-inducing prior within a rich hierarchical modeling framework and outline how computationally efficient Markov chain Monte Carlo (MCMC) algorithms can be executed without storing or decomposing large matrices. The floating point operations (flops) per iteration of this algorithm is linear in the number of spatial locations, thereby rendering substantial scalability. We illustrate the computational and inferential benefits of the NNGP over competing methods using simulation studies and also analyze forest biomass from a massive U.S. Forest Inventory dataset at a scale that precludes alternative dimension-reducing methods. Supplementary materials for this article are available online.

  18. Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets

    PubMed Central

    Datta, Abhirup; Banerjee, Sudipto; Finley, Andrew O.; Gelfand, Alan E.

    2018-01-01

    Spatial process models for analyzing geostatistical data entail computations that become prohibitive as the number of spatial locations become large. This article develops a class of highly scalable nearest-neighbor Gaussian process (NNGP) models to provide fully model-based inference for large geostatistical datasets. We establish that the NNGP is a well-defined spatial process providing legitimate finite-dimensional Gaussian densities with sparse precision matrices. We embed the NNGP as a sparsity-inducing prior within a rich hierarchical modeling framework and outline how computationally efficient Markov chain Monte Carlo (MCMC) algorithms can be executed without storing or decomposing large matrices. The floating point operations (flops) per iteration of this algorithm is linear in the number of spatial locations, thereby rendering substantial scalability. We illustrate the computational and inferential benefits of the NNGP over competing methods using simulation studies and also analyze forest biomass from a massive U.S. Forest Inventory dataset at a scale that precludes alternative dimension-reducing methods. Supplementary materials for this article are available online. PMID:29720777

  19. Hydrogeologic unit flow characterization using transition probability geostatistics.

    PubMed

    Jones, Norman L; Walker, Justin R; Carle, Steven F

    2005-01-01

    This paper describes a technique for applying the transition probability geostatistics method for stochastic simulation to a MODFLOW model. Transition probability geostatistics has some advantages over traditional indicator kriging methods including a simpler and more intuitive framework for interpreting geologic relationships and the ability to simulate juxtapositional tendencies such as fining upward sequences. The indicator arrays generated by the transition probability simulation are converted to layer elevation and thickness arrays for use with the new Hydrogeologic Unit Flow package in MODFLOW 2000. This makes it possible to preserve complex heterogeneity while using reasonably sized grids and/or grids with nonuniform cell thicknesses.

  20. Poly-Omic Prediction of Complex Traits: OmicKriging

    PubMed Central

    Wheeler, Heather E.; Aquino-Michaels, Keston; Gamazon, Eric R.; Trubetskoy, Vassily V.; Dolan, M. Eileen; Huang, R. Stephanie; Cox, Nancy J.; Im, Hae Kyung

    2014-01-01

    High-confidence prediction of complex traits such as disease risk or drug response is an ultimate goal of personalized medicine. Although genome-wide association studies have discovered thousands of well-replicated polymorphisms associated with a broad spectrum of complex traits, the combined predictive power of these associations for any given trait is generally too low to be of clinical relevance. We propose a novel systems approach to complex trait prediction, which leverages and integrates similarity in genetic, transcriptomic, or other omics-level data. We translate the omic similarity into phenotypic similarity using a method called Kriging, commonly used in geostatistics and machine learning. Our method called OmicKriging emphasizes the use of a wide variety of systems-level data, such as those increasingly made available by comprehensive surveys of the genome, transcriptome, and epigenome, for complex trait prediction. Furthermore, our OmicKriging framework allows easy integration of prior information on the function of subsets of omics-level data from heterogeneous sources without the sometimes heavy computational burden of Bayesian approaches. Using seven disease datasets from the Wellcome Trust Case Control Consortium (WTCCC), we show that OmicKriging allows simple integration of sparse and highly polygenic components yielding comparable performance at a fraction of the computing time of a recently published Bayesian sparse linear mixed model method. Using a cellular growth phenotype, we show that integrating mRNA and microRNA expression data substantially increases performance over either dataset alone. Using clinical statin response, we show improved prediction over existing methods. PMID:24799323

  1. Assessing the resolution-dependent utility of tomograms for geostatistics

    USGS Publications Warehouse

    Day-Lewis, F. D.; Lane, J.W.

    2004-01-01

    Geophysical tomograms are used increasingly as auxiliary data for geostatistical modeling of aquifer and reservoir properties. The correlation between tomographic estimates and hydrogeologic properties is commonly based on laboratory measurements, co-located measurements at boreholes, or petrophysical models. The inferred correlation is assumed uniform throughout the interwell region; however, tomographic resolution varies spatially due to acquisition geometry, regularization, data error, and the physics underlying the geophysical measurements. Blurring and inversion artifacts are expected in regions traversed by few or only low-angle raypaths. In the context of radar traveltime tomography, we derive analytical models for (1) the variance of tomographic estimates, (2) the spatially variable correlation with a hydrologic parameter of interest, and (3) the spatial covariance of tomographic estimates. Synthetic examples demonstrate that tomograms of qualitative value may have limited utility for geostatistics; moreover, the imprint of regularization may preclude inference of meaningful spatial statistics from tomograms.

  2. Geostatistical borehole image-based mapping of karst-carbonate aquifer pores

    USGS Publications Warehouse

    Michael Sukop,; Cunningham, Kevin J.

    2016-01-01

    Quantification of the character and spatial distribution of porosity in carbonate aquifers is important as input into computer models used in the calculation of intrinsic permeability and for next-generation, high-resolution groundwater flow simulations. Digital, optical, borehole-wall image data from three closely spaced boreholes in the karst-carbonate Biscayne aquifer in southeastern Florida are used in geostatistical experiments to assess the capabilities of various methods to create realistic two-dimensional models of vuggy megaporosity and matrix-porosity distribution in the limestone that composes the aquifer. When the borehole image data alone were used as the model training image, multiple-point geostatistics failed to detect the known spatial autocorrelation of vuggy megaporosity and matrix porosity among the three boreholes, which were only 10 m apart. Variogram analysis and subsequent Gaussian simulation produced results that showed a realistic conceptualization of horizontal continuity of strata dominated by vuggy megaporosity and matrix porosity among the three boreholes.

  3. Assessment and modeling of the groundwater hydrogeochemical quality parameters via geostatistical approaches

    NASA Astrophysics Data System (ADS)

    Karami, Shawgar; Madani, Hassan; Katibeh, Homayoon; Fatehi Marj, Ahmad

    2018-03-01

    Geostatistical methods are one of the advanced techniques used for interpolation of groundwater quality data. The results obtained from geostatistics will be useful for decision makers to adopt suitable remedial measures to protect the quality of groundwater sources. Data used in this study were collected from 78 wells in Varamin plain aquifer located in southeast of Tehran, Iran, in 2013. Ordinary kriging method was used in this study to evaluate groundwater quality parameters. According to what has been mentioned in this paper, seven main quality parameters (i.e. total dissolved solids (TDS), sodium adsorption ratio (SAR), electrical conductivity (EC), sodium (Na+), total hardness (TH), chloride (Cl-) and sulfate (SO4 2-)), have been analyzed and interpreted by statistical and geostatistical methods. After data normalization by Nscore method in WinGslib software, variography as a geostatistical tool to define spatial regression was compiled and experimental variograms were plotted by GS+ software. Then, the best theoretical model was fitted to each variogram based on the minimum RSS. Cross validation method was used to determine the accuracy of the estimated data. Eventually, estimation maps of groundwater quality were prepared in WinGslib software and estimation variance map and estimation error map were presented to evaluate the quality of estimation in each estimated point. Results showed that kriging method is more accurate than the traditional interpolation methods.

  4. Mapping, Bayesian Geostatistical Analysis and Spatial Prediction of Lymphatic Filariasis Prevalence in Africa

    PubMed Central

    Slater, Hannah; Michael, Edwin

    2013-01-01

    There is increasing interest to control or eradicate the major neglected tropical diseases. Accurate modelling of the geographic distributions of parasitic infections will be crucial to this endeavour. We used 664 community level infection prevalence data collated from the published literature in conjunction with eight environmental variables, altitude and population density, and a multivariate Bayesian generalized linear spatial model that allows explicit accounting for spatial autocorrelation and incorporation of uncertainty in input data and model parameters, to construct the first spatially-explicit map describing LF prevalence distribution in Africa. We also ran the best-fit model against predictions made by the HADCM3 and CCCMA climate models for 2050 to predict the likely distributions of LF under future climate and population changes. We show that LF prevalence is strongly influenced by spatial autocorrelation between locations but is only weakly associated with environmental covariates. Infection prevalence, however, is found to be related to variations in population density. All associations with key environmental/demographic variables appear to be complex and non-linear. LF prevalence is predicted to be highly heterogenous across Africa, with high prevalences (>20%) estimated to occur primarily along coastal West and East Africa, and lowest prevalences predicted for the central part of the continent. Error maps, however, indicate a need for further surveys to overcome problems with data scarcity in the latter and other regions. Analysis of future changes in prevalence indicates that population growth rather than climate change per se will represent the dominant factor in the predicted increase/decrease and spread of LF on the continent. We indicate that these results could play an important role in aiding the development of strategies that are best able to achieve the goals of parasite elimination locally and globally in a manner that may also account for the effects of future climate change on parasitic infection. PMID:23951194

  5. Geostatistical Model-Based Estimates of Schistosomiasis Prevalence among Individuals Aged ≤20 Years in West Africa

    PubMed Central

    Schur, Nadine; Hürlimann, Eveline; Garba, Amadou; Traoré, Mamadou S.; Ndir, Omar; Ratard, Raoult C.; Tchuem Tchuenté, Louis-Albert; Kristensen, Thomas K.; Utzinger, Jürg; Vounatsou, Penelope

    2011-01-01

    Background Schistosomiasis is a water-based disease that is believed to affect over 200 million people with an estimated 97% of the infections concentrated in Africa. However, these statistics are largely based on population re-adjusted data originally published by Utroska and colleagues more than 20 years ago. Hence, these estimates are outdated due to large-scale preventive chemotherapy programs, improved sanitation, water resources development and management, among other reasons. For planning, coordination, and evaluation of control activities, it is essential to possess reliable schistosomiasis prevalence maps. Methodology We analyzed survey data compiled on a newly established open-access global neglected tropical diseases database (i) to create smooth empirical prevalence maps for Schistosoma mansoni and S. haematobium for individuals aged ≤20 years in West Africa, including Cameroon, and (ii) to derive country-specific prevalence estimates. We used Bayesian geostatistical models based on environmental predictors to take into account potential clustering due to common spatially structured exposures. Prediction at unobserved locations was facilitated by joint kriging. Principal Findings Our models revealed that 50.8 million individuals aged ≤20 years in West Africa are infected with either S. mansoni, or S. haematobium, or both species concurrently. The country prevalence estimates ranged between 0.5% (The Gambia) and 37.1% (Liberia) for S. mansoni, and between 17.6% (The Gambia) and 51.6% (Sierra Leone) for S. haematobium. We observed that the combined prevalence for both schistosome species is two-fold lower in Gambia than previously reported, while we found an almost two-fold higher estimate for Liberia (58.3%) than reported before (30.0%). Our predictions are likely to overestimate overall country prevalence, since modeling was based on children and adolescents up to the age of 20 years who are at highest risk of infection. Conclusion/Significance We present the first empirical estimates for S. mansoni and S. haematobium prevalence at high spatial resolution throughout West Africa. Our prediction maps allow prioritizing of interventions in a spatially explicit manner, and will be useful for monitoring and evaluation of schistosomiasis control programs. PMID:21695107

  6. Conditioning geostatistical simulations of a heterogeneous paleo-fluvial bedrock aquifer using lithologs and pumping tests

    NASA Astrophysics Data System (ADS)

    Niazi, A.; Bentley, L. R.; Hayashi, M.

    2016-12-01

    Geostatistical simulations are used to construct heterogeneous aquifer models. Optimally, such simulations should be conditioned with both lithologic and hydraulic data. We introduce an approach to condition lithologic geostatistical simulations of a paleo-fluvial bedrock aquifer consisting of relatively high permeable sandstone channels embedded in relatively low permeable mudstone using hydraulic data. The hydraulic data consist of two-hour single well pumping tests extracted from the public water well database for a 250-km2 watershed in Alberta, Canada. First, lithologic models of the entire watershed are simulated and conditioned with hard lithological data using transition probability - Markov chain geostatistics (TPROGS). Then, a segment of the simulation around a pumping well is used to populate a flow model (FEFLOW) with either sand or mudstone. The values of the hydraulic conductivity and specific storage of sand and mudstone are then adjusted to minimize the difference between simulated and actual pumping test data using the parameter estimation program PEST. If the simulated pumping test data do not adequately match the measured data, the lithologic model is updated by locally deforming the lithology distribution using the probability perturbation method and the model parameters are again updated with PEST. This procedure is repeated until the simulated and measured data agree within a pre-determined tolerance. The procedure is repeated for each well that has pumping test data. The method creates a local groundwater model that honors both the lithologic model and pumping test data and provides estimates of hydraulic conductivity and specific storage. Eventually, the simulations will be integrated into a watershed-scale groundwater model.

  7. Monte Carlo Analysis of Reservoir Models Using Seismic Data and Geostatistical Models

    NASA Astrophysics Data System (ADS)

    Zunino, A.; Mosegaard, K.; Lange, K.; Melnikova, Y.; Hansen, T. M.

    2013-12-01

    We present a study on the analysis of petroleum reservoir models consistent with seismic data and geostatistical constraints performed on a synthetic reservoir model. Our aim is to invert directly for structure and rock bulk properties of the target reservoir zone. To infer the rock facies, porosity and oil saturation seismology alone is not sufficient but a rock physics model must be taken into account, which links the unknown properties to the elastic parameters. We then combine a rock physics model with a simple convolutional approach for seismic waves to invert the "measured" seismograms. To solve this inverse problem, we employ a Markov chain Monte Carlo (MCMC) method, because it offers the possibility to handle non-linearity, complex and multi-step forward models and provides realistic estimates of uncertainties. However, for large data sets the MCMC method may be impractical because of a very high computational demand. To face this challenge one strategy is to feed the algorithm with realistic models, hence relying on proper prior information. To address this problem, we utilize an algorithm drawn from geostatistics to generate geologically plausible models which represent samples of the prior distribution. The geostatistical algorithm learns the multiple-point statistics from prototype models (in the form of training images), then generates thousands of different models which are accepted or rejected by a Metropolis sampler. To further reduce the computation time we parallelize the software and run it on multi-core machines. The solution of the inverse problem is then represented by a collection of reservoir models in terms of facies, porosity and oil saturation, which constitute samples of the posterior distribution. We are finally able to produce probability maps of the properties we are interested in by performing statistical analysis on the collection of solutions.

  8. Spatial uncertainty of a geoid undulation model in Guayaquil, Ecuador

    NASA Astrophysics Data System (ADS)

    Chicaiza, E. G.; Leiva, C. A.; Arranz, J. J.; Buenańo, X. E.

    2017-06-01

    Geostatistics is a discipline that deals with the statistical analysis of regionalized variables. In this case study, geostatistics is used to estimate geoid undulation in the rural area of Guayaquil town in Ecuador. The geostatistical approach was chosen because the estimation error of prediction map is getting. Open source statistical software R and mainly geoR, gstat and RGeostats libraries were used. Exploratory data analysis (EDA), trend and structural analysis were carried out. An automatic model fitting by Iterative Least Squares and other fitting procedures were employed to fit the variogram. Finally, Kriging using gravity anomaly of Bouguer as external drift and Universal Kriging were used to get a detailed map of geoid undulation. The estimation uncertainty was reached in the interval [-0.5; +0.5] m for errors and a maximum estimation standard deviation of 2 mm in relation with the method of interpolation applied. The error distribution of the geoid undulation map obtained in this study provides a better result than Earth gravitational models publicly available for the study area according the comparison with independent validation points. The main goal of this paper is to confirm the feasibility to use geoid undulations from Global Navigation Satellite Systems and leveling field measurements and geostatistical techniques methods in order to use them in high-accuracy engineering projects.

  9. The Use of Geostatistics in the Study of Floral Phenology of Vulpia geniculata (L.) Link

    PubMed Central

    León Ruiz, Eduardo J.; García Mozo, Herminia; Domínguez Vilches, Eugenio; Galán, Carmen

    2012-01-01

    Traditionally phenology studies have been focused on changes through time, but there exist many instances in ecological research where it is necessary to interpolate among spatially stratified samples. The combined use of Geographical Information Systems (GIS) and Geostatistics can be an essential tool for spatial analysis in phenological studies. Geostatistics are a family of statistics that describe correlations through space/time and they can be used for both quantifying spatial correlation and interpolating unsampled points. In the present work, estimations based upon Geostatistics and GIS mapping have enabled the construction of spatial models that reflect phenological evolution of Vulpia geniculata (L.) Link throughout the study area during sampling season. Ten sampling points, scattered troughout the city and low mountains in the “Sierra de Córdoba” were chosen to carry out the weekly phenological monitoring during flowering season. The phenological data were interpolated by applying the traditional geostatitical method of Kriging, which was used to ellaborate weekly estimations of V. geniculata phenology in unsampled areas. Finally, the application of Geostatistics and GIS to create phenological maps could be an essential complement in pollen aerobiological studies, given the increased interest in obtaining automatic aerobiological forecasting maps. PMID:22629169

  10. The use of geostatistics in the study of floral phenology of Vulpia geniculata (L.) link.

    PubMed

    León Ruiz, Eduardo J; García Mozo, Herminia; Domínguez Vilches, Eugenio; Galán, Carmen

    2012-01-01

    Traditionally phenology studies have been focused on changes through time, but there exist many instances in ecological research where it is necessary to interpolate among spatially stratified samples. The combined use of Geographical Information Systems (GIS) and Geostatistics can be an essential tool for spatial analysis in phenological studies. Geostatistics are a family of statistics that describe correlations through space/time and they can be used for both quantifying spatial correlation and interpolating unsampled points. In the present work, estimations based upon Geostatistics and GIS mapping have enabled the construction of spatial models that reflect phenological evolution of Vulpia geniculata (L.) Link throughout the study area during sampling season. Ten sampling points, scattered throughout the city and low mountains in the "Sierra de Córdoba" were chosen to carry out the weekly phenological monitoring during flowering season. The phenological data were interpolated by applying the traditional geostatitical method of Kriging, which was used to elaborate weekly estimations of V. geniculata phenology in unsampled areas. Finally, the application of Geostatistics and GIS to create phenological maps could be an essential complement in pollen aerobiological studies, given the increased interest in obtaining automatic aerobiological forecasting maps.

  11. Geostatistical Borehole Image-Based Mapping of Karst-Carbonate Aquifer Pores.

    PubMed

    Sukop, Michael C; Cunningham, Kevin J

    2016-03-01

    Quantification of the character and spatial distribution of porosity in carbonate aquifers is important as input into computer models used in the calculation of intrinsic permeability and for next-generation, high-resolution groundwater flow simulations. Digital, optical, borehole-wall image data from three closely spaced boreholes in the karst-carbonate Biscayne aquifer in southeastern Florida are used in geostatistical experiments to assess the capabilities of various methods to create realistic two-dimensional models of vuggy megaporosity and matrix-porosity distribution in the limestone that composes the aquifer. When the borehole image data alone were used as the model training image, multiple-point geostatistics failed to detect the known spatial autocorrelation of vuggy megaporosity and matrix porosity among the three boreholes, which were only 10 m apart. Variogram analysis and subsequent Gaussian simulation produced results that showed a realistic conceptualization of horizontal continuity of strata dominated by vuggy megaporosity and matrix porosity among the three boreholes. © 2015, National Ground Water Association.

  12. CarbonTracker-Lagrange: A Framework for Greenhouse Gas Flux Estimation at Regional to Continental Scales

    NASA Astrophysics Data System (ADS)

    Andrews, A. E.

    2016-12-01

    CarbonTracker-Lagrange (CT-L) is a flexible modeling framework developed to take advantage of newly available atmospheric data for CO2 and other long-lived gases such as CH4 and N2O. The North American atmospheric CO2 measurement network has grown from three sites in 2004 to >100 sites in 2015. The US network includes tall tower, mountaintop, surface, and aircraft sites in the NOAA Global Greenhouse Gas Reference Network along with sites maintained by university, government and private sector researchers. The Canadian network is operated by Environment and Climate Change Canada. This unprecedented dataset can provide spatially and temporally resolved CO2 emissions and uptake flux estimates and quantitative information about drivers of variability, such as drought and temperature. CT-L is a platform for systematic comparison of data assimilation techniques and evaluation of assumed prior, model and observation errors. A novel feature of CT-L is the optimization of boundary values along with surface fluxes, leveraging vertically resolved data available from NOAA's aircraft sampling program. CT-L uses observation footprints (influence functions) from the Weather Research and Forecasting/Stochastic Time-Inverted Lagrangian Transport (WRF-STILT) modeling system to relate atmospheric measurements to upwind fluxes and boundary values. Footprints are pre-computed and the optimization algorithms are efficient, so many variants of the calculation can be performed. Fluxes are adjusted using Bayesian or Geostatistical methods to provide optimal agreement with observations. Satellite measurements of CO2 and CH4 from GOSAT are available starting in July 2009 and from OCO-2 since September 2014. With support from the NASA Carbon Monitoring System, we are developing flux estimation strategies that use remote sensing and in situ data together, including geostatistical inversions using satellite retrievals of solar-induced chlorophyll fluorescence. CT-L enables quantitative investigation of what new measurements would best complement the existing carbon observing system. We are also working to implement multi-species inversions for CO2 flux estimation using CO2 data along with CO, δ13CO2, COS and radiocarbon observations and for CH4 flux estimation using data for various hydrocarbons.

  13. Delineating Hydrofacies Spatial Distribution by Integrating Ensemble Data Assimilation and Indicator Geostatistics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Song, Xuehang; Chen, Xingyuan; Ye, Ming

    2015-07-01

    This study develops a new framework of facies-based data assimilation for characterizing spatial distribution of hydrofacies and estimating their associated hydraulic properties. This framework couples ensemble data assimilation with transition probability-based geostatistical model via a parameterization based on a level set function. The nature of ensemble data assimilation makes the framework efficient and flexible to be integrated with various types of observation data. The transition probability-based geostatistical model keeps the updated hydrofacies distributions under geological constrains. The framework is illustrated by using a two-dimensional synthetic study that estimates hydrofacies spatial distribution and permeability in each hydrofacies from transient head data.more » Our results show that the proposed framework can characterize hydrofacies distribution and associated permeability with adequate accuracy even with limited direct measurements of hydrofacies. Our study provides a promising starting point for hydrofacies delineation in complex real problems.« less

  14. A geostatistics-informed hierarchical sensitivity analysis method for complex groundwater flow and transport modeling: GEOSTATISTICAL SENSITIVITY ANALYSIS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dai, Heng; Chen, Xingyuan; Ye, Ming

    Sensitivity analysis is an important tool for quantifying uncertainty in the outputs of mathematical models, especially for complex systems with a high dimension of spatially correlated parameters. Variance-based global sensitivity analysis has gained popularity because it can quantify the relative contribution of uncertainty from different sources. However, its computational cost increases dramatically with the complexity of the considered model and the dimension of model parameters. In this study we developed a hierarchical sensitivity analysis method that (1) constructs an uncertainty hierarchy by analyzing the input uncertainty sources, and (2) accounts for the spatial correlation among parameters at each level ofmore » the hierarchy using geostatistical tools. The contribution of uncertainty source at each hierarchy level is measured by sensitivity indices calculated using the variance decomposition method. Using this methodology, we identified the most important uncertainty source for a dynamic groundwater flow and solute transport in model at the Department of Energy (DOE) Hanford site. The results indicate that boundary conditions and permeability field contribute the most uncertainty to the simulated head field and tracer plume, respectively. The relative contribution from each source varied spatially and temporally as driven by the dynamic interaction between groundwater and river water at the site. By using a geostatistical approach to reduce the number of realizations needed for the sensitivity analysis, the computational cost of implementing the developed method was reduced to a practically manageable level. The developed sensitivity analysis method is generally applicable to a wide range of hydrologic and environmental problems that deal with high-dimensional spatially-distributed parameters.« less

  15. Use of geostatistics for remediation planning to transcend urban political boundaries.

    PubMed

    Milillo, Tammy M; Sinha, Gaurav; Gardella, Joseph A

    2012-11-01

    Soil remediation plans are often dictated by areas of jurisdiction or property lines instead of scientific information. This study exemplifies how geostatistically interpolated surfaces can substantially improve remediation planning. Ordinary kriging, ordinary co-kriging, and inverse distance weighting spatial interpolation methods were compared for analyzing surface and sub-surface soil sample data originally collected by the US EPA and researchers at the University at Buffalo in Hickory Woods, an industrial-residential neighborhood in Buffalo, NY, where both lead and arsenic contamination is present. Past clean-up efforts estimated contamination levels from point samples, but parcel and agency jurisdiction boundaries were used to define remediation sites, rather than geostatistical models estimating the spatial behavior of the contaminants in the soil. Residents were understandably dissatisfied with the arbitrariness of the remediation plan. In this study we show how geostatistical mapping and participatory assessment can make soil remediation scientifically defensible, socially acceptable, and economically feasible. Copyright © 2012 Elsevier Ltd. All rights reserved.

  16. Geostatistical uncertainty of assessing air quality using high-spatial-resolution lichen data: A health study in the urban area of Sines, Portugal.

    PubMed

    Ribeiro, Manuel C; Pinho, P; Branquinho, C; Llop, Esteve; Pereira, Maria J

    2016-08-15

    In most studies correlating health outcomes with air pollution, personal exposure assignments are based on measurements collected at air-quality monitoring stations not coinciding with health data locations. In such cases, interpolators are needed to predict air quality in unsampled locations and to assign personal exposures. Moreover, a measure of the spatial uncertainty of exposures should be incorporated, especially in urban areas where concentrations vary at short distances due to changes in land use and pollution intensity. These studies are limited by the lack of literature comparing exposure uncertainty derived from distinct spatial interpolators. Here, we addressed these issues with two interpolation methods: regression Kriging (RK) and ordinary Kriging (OK). These methods were used to generate air-quality simulations with a geostatistical algorithm. For each method, the geostatistical uncertainty was drawn from generalized linear model (GLM) analysis. We analyzed the association between air quality and birth weight. Personal health data (n=227) and exposure data were collected in Sines (Portugal) during 2007-2010. Because air-quality monitoring stations in the city do not offer high-spatial-resolution measurements (n=1), we used lichen data as an ecological indicator of air quality (n=83). We found no significant difference in the fit of GLMs with any of the geostatistical methods. With RK, however, the models tended to fit better more often and worse less often. Moreover, the geostatistical uncertainty results showed a marginally higher mean and precision with RK. Combined with lichen data and land-use data of high spatial resolution, RK is a more effective geostatistical method for relating health outcomes with air quality in urban areas. This is particularly important in small cities, which generally do not have expensive air-quality monitoring stations with high spatial resolution. Further, alternative ways of linking human activities with their environment are needed to improve human well-being. Copyright © 2016 Elsevier B.V. All rights reserved.

  17. Spatial distribution of Munida intermedia and M. sarsi (crustacea: Anomura) on the Galician continental shelf (NW Spain): Application of geostatistical analysis

    NASA Astrophysics Data System (ADS)

    Freire, J.; González-Gurriarán, E.; Olaso, I.

    1992-12-01

    Geostatistical methodology was used to analyse spatial structure and distribution of the epibenthic crustaceans Munida intermedia and M. sarsi within sets of data which had been collected during three survey cruises carried out on the Galician continental shelf (1983 and 1984). This study investigates the feasibility of using geostatistics for data collected according to traditional methods and of enhancing such methodology. The experimental variograms were calculated (pooled variance minus spatial covariance between samples taken one pair at a time vs. distance) and fitted to a 'spherical' model. The spatial structure model was used to estimate the abundance and distribution of the populations studied using the technique of kriging. The species display spatial structures, which are well marked during high density periods and in some areas (especially northern shelf). Geostatistical analysis allows identification of the density gradients in space as well as the patch grain along the continental shelf of 16-25 km diameter for M. intermedia and 12-20 km for M. sarsi. Patches of both species have a consistent location throughout the different cruises. As in other geographical areas, M. intermedia and M. sarsi usually appear at depths ranging from 200 to 500 m, with the highest densities in the continental shelf area located between Fisterra and Estaca de Bares. Althouh sampling was not originally designed specifically for geostatistics, this assay provides a measurement of spatial covariance, and shows variograms with variable structure depending on population density and geographical area. These ideas are useful in improving the design of future sampling cruises.

  18. Hydrogeologic Unit Flow Characterization Using Transition Probability Geostatistics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jones, N L; Walker, J R; Carle, S F

    2003-11-21

    This paper describes a technique for applying the transition probability geostatistics method for stochastic simulation to a MODFLOW model. Transition probability geostatistics has several advantages over traditional indicator kriging methods including a simpler and more intuitive framework for interpreting geologic relationships and the ability to simulate juxtapositional tendencies such as fining upwards sequences. The indicator arrays generated by the transition probability simulation are converted to layer elevation and thickness arrays for use with the new Hydrogeologic Unit Flow (HUF) package in MODFLOW 2000. This makes it possible to preserve complex heterogeneity while using reasonably sized grids. An application of themore » technique involving probabilistic capture zone delineation for the Aberjona Aquifer in Woburn, Ma. is included.« less

  19. Nonpoint Source Solute Transport Normal to Aquifer Bedding in Heterogeneous, Markov Chain Random Fields

    NASA Astrophysics Data System (ADS)

    Zhang, H.; Harter, T.; Sivakumar, B.

    2005-12-01

    Facies-based geostatistical models have become important tools for the stochastic analysis of flow and transport processes in heterogeneous aquifers. However, little is known about the dependency of these processes on the parameters of facies- based geostatistical models. This study examines the nonpoint source solute transport normal to the major bedding plane in the presence of interconnected high conductivity (coarse- textured) facies in the aquifer medium and the dependence of the transport behavior upon the parameters of the constitutive facies model. A facies-based Markov chain geostatistical model is used to quantify the spatial variability of the aquifer system hydrostratigraphy. It is integrated with a groundwater flow model and a random walk particle transport model to estimate the solute travel time probability distribution functions (pdfs) for solute flux from the water table to the bottom boundary (production horizon) of the aquifer. The cases examined include, two-, three-, and four-facies models with horizontal to vertical facies mean length anisotropy ratios, ek, from 25:1 to 300:1, and with a wide range of facies volume proportions (e.g, from 5% to 95% coarse textured facies). Predictions of travel time pdfs are found to be significantly affected by the number of hydrostratigraphic facies identified in the aquifer, the proportions of coarse-textured sediments, the mean length of the facies (particularly the ratio of length to thickness of coarse materials), and - to a lesser degree - the juxtapositional preference among the hydrostratigraphic facies. In transport normal to the sedimentary bedding plane, travel time pdfs are not log- normally distributed as is often assumed. Also, macrodispersive behavior (variance of the travel time pdf) was found to not be a unique function of the conductivity variance. The skewness of the travel time pdf varied from negatively skewed to strongly positively skewed within the parameter range examined. We also show that the Markov chain approach may give significantly different travel time pdfs when compared to the more commonly used Gaussian random field approach even though the first and second order moments in the geostatistical distribution of the lnK field are identical. The choice of the appropriate geostatistical model is therefore critical in the assessment of nonpoint source transport.

  20. Nonpoint source solute transport normal to aquifer bedding in heterogeneous, Markov chain random fields

    NASA Astrophysics Data System (ADS)

    Zhang, Hua; Harter, Thomas; Sivakumar, Bellie

    2006-06-01

    Facies-based geostatistical models have become important tools for analyzing flow and mass transport processes in heterogeneous aquifers. Yet little is known about the relationship between these latter processes and the parameters of facies-based geostatistical models. In this study, we examine the transport of a nonpoint source solute normal (perpendicular) to the major bedding plane of an alluvial aquifer medium that contains multiple geologic facies, including interconnected, high-conductivity (coarse textured) facies. We also evaluate the dependence of the transport behavior on the parameters of the constitutive facies model. A facies-based Markov chain geostatistical model is used to quantify the spatial variability of the aquifer system's hydrostratigraphy. It is integrated with a groundwater flow model and a random walk particle transport model to estimate the solute traveltime probability density function (pdf) for solute flux from the water table to the bottom boundary (the production horizon) of the aquifer. The cases examined include two-, three-, and four-facies models, with mean length anisotropy ratios for horizontal to vertical facies, ek, from 25:1 to 300:1 and with a wide range of facies volume proportions (e.g., from 5 to 95% coarse-textured facies). Predictions of traveltime pdfs are found to be significantly affected by the number of hydrostratigraphic facies identified in the aquifer. Those predictions of traveltime pdfs also are affected by the proportions of coarse-textured sediments, the mean length of the facies (particularly the ratio of length to thickness of coarse materials), and, to a lesser degree, the juxtapositional preference among the hydrostratigraphic facies. In transport normal to the sedimentary bedding plane, traveltime is not lognormally distributed as is often assumed. Also, macrodispersive behavior (variance of the traveltime) is found not to be a unique function of the conductivity variance. For the parameter range examined, the third moment of the traveltime pdf varies from negatively skewed to strongly positively skewed. We also show that the Markov chain approach may give significantly different traveltime distributions when compared to the more commonly used Gaussian random field approach, even when the first- and second-order moments in the geostatistical distribution of the lnK field are identical. The choice of the appropriate geostatistical model is therefore critical in the assessment of nonpoint source transport, and uncertainty about that choice must be considered in evaluating the results.

  1. The total probabilities from high-resolution ensemble forecasting of floods

    NASA Astrophysics Data System (ADS)

    Olav Skøien, Jon; Bogner, Konrad; Salamon, Peter; Smith, Paul; Pappenberger, Florian

    2015-04-01

    Ensemble forecasting has for a long time been used in meteorological modelling, to give an indication of the uncertainty of the forecasts. As meteorological ensemble forecasts often show some bias and dispersion errors, there is a need for calibration and post-processing of the ensembles. Typical methods for this are Bayesian Model Averaging (Raftery et al., 2005) and Ensemble Model Output Statistics (EMOS) (Gneiting et al., 2005). There are also methods for regionalizing these methods (Berrocal et al., 2007) and for incorporating the correlation between lead times (Hemri et al., 2013). To make optimal predictions of floods along the stream network in hydrology, we can easily use the ensemble members as input to the hydrological models. However, some of the post-processing methods will need modifications when regionalizing the forecasts outside the calibration locations, as done by Hemri et al. (2013). We present a method for spatial regionalization of the post-processed forecasts based on EMOS and top-kriging (Skøien et al., 2006). We will also look into different methods for handling the non-normality of runoff and the effect on forecasts skills in general and for floods in particular. Berrocal, V. J., Raftery, A. E. and Gneiting, T.: Combining Spatial Statistical and Ensemble Information in Probabilistic Weather Forecasts, Mon. Weather Rev., 135(4), 1386-1402, doi:10.1175/MWR3341.1, 2007. Gneiting, T., Raftery, A. E., Westveld, A. H. and Goldman, T.: Calibrated Probabilistic Forecasting Using Ensemble Model Output Statistics and Minimum CRPS Estimation, Mon. Weather Rev., 133(5), 1098-1118, doi:10.1175/MWR2904.1, 2005. Hemri, S., Fundel, F. and Zappa, M.: Simultaneous calibration of ensemble river flow predictions over an entire range of lead times, Water Resour. Res., 49(10), 6744-6755, doi:10.1002/wrcr.20542, 2013. Raftery, A. E., Gneiting, T., Balabdaoui, F. and Polakowski, M.: Using Bayesian Model Averaging to Calibrate Forecast Ensembles, Mon. Weather Rev., 133(5), 1155-1174, doi:10.1175/MWR2906.1, 2005. Skøien, J. O., Merz, R. and Blöschl, G.: Top-kriging - Geostatistics on stream networks, Hydrol. Earth Syst. Sci., 10(2), 277-287, 2006.

  2. Modelling the Ecological Comorbidity of Acute Respiratory Infection, Diarrhoea and Stunting among Children Under the Age of 5 Years in Somalia.

    PubMed

    Kinyoki, Damaris K; Manda, Samuel O; Moloney, Grainne M; Odundo, Elijah O; Berkley, James A; Noor, Abdisalan M; Kandala, Ngianga-Bakwin

    2017-04-01

    The aim of this study was to assess spatial co-occurrence of acute respiratory infections (ARI), diarrhoea and stunting among children of the age between 6 and 59 months in Somalia. Data were obtained from routine biannual nutrition surveys conducted by the Food and Agriculture Organization 2007-2010. A Bayesian hierarchical geostatistical shared component model was fitted to the residual spatial components of the three health conditions. Risk maps of the common spatial effects at 1×1 km resolution were derived. The empirical correlations of the enumeration area proportion were 0.37, 0.63 and 0.66 for ARI and stunting, diarrhoea and stunting and ARI and diarrhoea, respectively. Spatially, the posterior residual effects ranged 0.03-20.98, 0.16-6.37 and 0.08-9.66 for shared component between ARI and stunting, diarrhoea and stunting and ARI and diarrhoea, respectively. The analysis showed clearly that the spatial shared component between ARI, diarrhoea and stunting was higher in the southern part of the country. Interventions aimed at controlling and mitigating the adverse effects of these three childhood health conditions should focus on their common putative risk factors, particularly in the South in Somalia.

  3. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Davis, J.M.

    Three outcrop studies were conducted in deposits of different depositional environments. At each site, permeability measurements were obtained with an air-minipermeameter developed as part of this study. In addition, the geological units were mapped with either surveying, photographs, or both. Geostatistical analysis of the permeability data was performed to estimate the characteristics of the probability distribution function and the spatial correlation structure. The information obtained from the geological mapping was then compared with the results of the geostatistical analysis for any relationships that may exist. The main field site was located in the Albuquerque Basin of central New Mexico atmore » an outcrop of the Pliocene-Pleistocene Sierra Ladrones Formation. The second study was conducted on the walls of waste pits in alluvial fan deposits at the Nevada Test Site. The third study was conducted on an outcrop of an eolian deposit (miocene) south of Socorro, New Mexico. The results of the three studies were then used to construct a conceptual model relating depositional environment to geostatistical models of heterogeneity. The model presented is largely qualitative but provides a basis for further hypothesis formulation and testing.« less

  4. A multiple-point geostatistical approach to quantifying uncertainty for flow and transport simulation in geologically complex environments

    NASA Astrophysics Data System (ADS)

    Cronkite-Ratcliff, C.; Phelps, G. A.; Boucher, A.

    2011-12-01

    In many geologic settings, the pathways of groundwater flow are controlled by geologic heterogeneities which have complex geometries. Models of these geologic heterogeneities, and consequently, their effects on the simulated pathways of groundwater flow, are characterized by uncertainty. Multiple-point geostatistics, which uses a training image to represent complex geometric descriptions of geologic heterogeneity, provides a stochastic approach to the analysis of geologic uncertainty. Incorporating multiple-point geostatistics into numerical models provides a way to extend this analysis to the effects of geologic uncertainty on the results of flow simulations. We present two case studies to demonstrate the application of multiple-point geostatistics to numerical flow simulation in complex geologic settings with both static and dynamic conditioning data. Both cases involve the development of a training image from a complex geometric description of the geologic environment. Geologic heterogeneity is modeled stochastically by generating multiple equally-probable realizations, all consistent with the training image. Numerical flow simulation for each stochastic realization provides the basis for analyzing the effects of geologic uncertainty on simulated hydraulic response. The first case study is a hypothetical geologic scenario developed using data from the alluvial deposits in Yucca Flat, Nevada. The SNESIM algorithm is used to stochastically model geologic heterogeneity conditioned to the mapped surface geology as well as vertical drill-hole data. Numerical simulation of groundwater flow and contaminant transport through geologic models produces a distribution of hydraulic responses and contaminant concentration results. From this distribution of results, the probability of exceeding a given contaminant concentration threshold can be used as an indicator of uncertainty about the location of the contaminant plume boundary. The second case study considers a characteristic lava-flow aquifer system in Pahute Mesa, Nevada. A 3D training image is developed by using object-based simulation of parametric shapes to represent the key morphologic features of rhyolite lava flows embedded within ash-flow tuffs. In addition to vertical drill-hole data, transient pressure head data from aquifer tests can be used to constrain the stochastic model outcomes. The use of both static and dynamic conditioning data allows the identification of potential geologic structures that control hydraulic response. These case studies demonstrate the flexibility of the multiple-point geostatistics approach for considering multiple types of data and for developing sophisticated models of geologic heterogeneities that can be incorporated into numerical flow simulations.

  5. PCTO-SIM: Multiple-point geostatistical modeling using parallel conditional texture optimization

    NASA Astrophysics Data System (ADS)

    Pourfard, Mohammadreza; Abdollahifard, Mohammad J.; Faez, Karim; Motamedi, Sayed Ahmad; Hosseinian, Tahmineh

    2017-05-01

    Multiple-point Geostatistics is a well-known general statistical framework by which complex geological phenomena have been modeled efficiently. Pixel-based and patch-based are two major categories of these methods. In this paper, the optimization-based category is used which has a dual concept in texture synthesis as texture optimization. Our extended version of texture optimization uses the energy concept to model geological phenomena. While honoring the hard point, the minimization of our proposed cost function forces simulation grid pixels to be as similar as possible to training images. Our algorithm has a self-enrichment capability and creates a richer training database from a sparser one through mixing the information of all surrounding patches of the simulation nodes. Therefore, it preserves pattern continuity in both continuous and categorical variables very well. It also shows a fuzzy result in its every realization similar to the expected result of multi realizations of other statistical models. While the main core of most previous Multiple-point Geostatistics methods is sequential, the parallel main core of our algorithm enabled it to use GPU efficiently to reduce the CPU time. One new validation method for MPS has also been proposed in this paper.

  6. GEOSTATISTICS FOR WASTE MANAGEMENT: A USER'S MANUAL FOR THE GEOPACK (VERSION 1.0) GEOSTATISTICAL SOFTWARE SYSTEM

    EPA Science Inventory

    GEOPACK, a comprehensive user-friendly geostatistical software system, was developed to help in the analysis of spatially correlated data. The software system was developed to be used by scientists, engineers, regulators, etc., with little experience in geostatistical techniques...

  7. Geostatistics and petroleum geology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hohn, M.E.

    1988-01-01

    This book examines purpose and use of geostatistics in exploration and development of oil and gas with an emphasis on appropriate and pertinent case studies. It present an overview of geostatistics. Topics covered include: The semivariogram; Linear estimation; Multivariate geostatistics; Nonlinear estimation; From indicator variables to nonparametric estimation; and More detail, less certainty; conditional simulation.

  8. GEOSTATISTICS FOR WASTE MANAGEMENT: A USER'S MANUEL FOR THE GEOPACK (VERSION 1.0) GEOSTATISTICAL SOFTWARE SYSTEM

    EPA Science Inventory

    A comprehensive, user-friendly geostatistical software system called GEOPACk has been developed. The purpose of this software is to make available the programs necessary to undertake a geostatistical analysis of spatially correlated data. The programs were written so that they ...

  9. High resolution age-structured mapping of childhood vaccination coverage in low and middle income countries.

    PubMed

    Utazi, C Edson; Thorley, Julia; Alegana, Victor A; Ferrari, Matthew J; Takahashi, Saki; Metcalf, C Jessica E; Lessler, Justin; Tatem, Andrew J

    2018-03-14

    The expansion of childhood vaccination programs in low and middle income countries has been a substantial public health success story. Indicators of the performance of intervention programmes such as coverage levels and numbers covered are typically measured through national statistics or at the scale of large regions due to survey design, administrative convenience or operational limitations. These mask heterogeneities and 'coldspots' of low coverage that may allow diseases to persist, even if overall coverage is high. Hence, to decrease inequities and accelerate progress towards disease elimination goals, fine-scale variation in coverage should be better characterized. Using measles as an example, cluster-level Demographic and Health Surveys (DHS) data were used to map vaccination coverage at 1 km spatial resolution in Cambodia, Mozambique and Nigeria for varying age-group categories of children under five years, using Bayesian geostatistical techniques built on a suite of publicly available geospatial covariates and implemented via Markov Chain Monte Carlo (MCMC) methods. Measles vaccination coverage was found to be strongly predicted by just 4-5 covariates in geostatistical models, with remoteness consistently selected as a key variable. The output 1 × 1 km maps revealed significant heterogeneities within the three countries that were not captured using province-level summaries. Integration with population data showed that at the time of the surveys, few districts attained the 80% coverage, that is one component of the WHO Global Vaccine Action Plan 2020 targets. The elimination of vaccine-preventable diseases requires a strong evidence base to guide strategies and inform efficient use of limited resources. The approaches outlined here provide a route to moving beyond large area summaries of vaccination coverage that mask epidemiologically-important heterogeneities to detailed maps that capture subnational vulnerabilities. The output datasets are built on open data and methods, and in flexible format that can be aggregated to more operationally-relevant administrative unit levels. Copyright © 2018 The Author(s). Published by Elsevier Ltd.. All rights reserved.

  10. Geostatistical modelling of soil-transmitted helminth infection in Cambodia: do socioeconomic factors improve predictions?

    PubMed

    Karagiannis-Voules, Dimitrios-Alexios; Odermatt, Peter; Biedermann, Patricia; Khieu, Virak; Schär, Fabian; Muth, Sinuon; Utzinger, Jürg; Vounatsou, Penelope

    2015-01-01

    Soil-transmitted helminth infections are intimately connected with poverty. Yet, there is a paucity of using socioeconomic proxies in spatially explicit risk profiling. We compiled household-level socioeconomic data pertaining to sanitation, drinking-water, education and nutrition from readily available Demographic and Health Surveys, Multiple Indicator Cluster Surveys and World Health Surveys for Cambodia and aggregated the data at village level. We conducted a systematic review to identify parasitological surveys and made every effort possible to extract, georeference and upload the data in the open source Global Neglected Tropical Diseases database. Bayesian geostatistical models were employed to spatially align the village-aggregated socioeconomic predictors with the soil-transmitted helminth infection data. The risk of soil-transmitted helminth infection was predicted at a grid of 1×1km covering Cambodia. Additionally, two separate individual-level spatial analyses were carried out, for Takeo and Preah Vihear provinces, to assess and quantify the association between soil-transmitted helminth infection and socioeconomic indicators at an individual level. Overall, we obtained socioeconomic proxies from 1624 locations across the country. Surveys focussing on soil-transmitted helminth infections were extracted from 16 sources reporting data from 238 unique locations. We found that the risk of soil-transmitted helminth infection from 2000 onwards was considerably lower than in surveys conducted earlier. Population-adjusted prevalences for school-aged children from 2000 onwards were 28.7% for hookworm, 1.5% for Ascaris lumbricoides and 0.9% for Trichuris trichiura. Surprisingly, at the country-wide analyses, we did not find any significant association between soil-transmitted helminth infection and village-aggregated socioeconomic proxies. Based also on the individual-level analyses we conclude that socioeconomic proxies might not be good predictors at an aggregated large-scale analysis due to their large between- and within-village heterogeneity. Specific information of both the infection risk and potential predictors might be needed to obtain any existing association. The presented soil-transmitted helminth infection risk estimates for Cambodia can be used for guiding and evaluating control and elimination efforts. Copyright © 2014. Published by Elsevier B.V.

  11. A Modular GIS-Based Software Architecture for Model Parameter Estimation using the Method of Anchored Distributions (MAD)

    NASA Astrophysics Data System (ADS)

    Ames, D. P.; Osorio-Murillo, C.; Over, M. W.; Rubin, Y.

    2012-12-01

    The Method of Anchored Distributions (MAD) is an inverse modeling technique that is well-suited for estimation of spatially varying parameter fields using limited observations and Bayesian methods. This presentation will discuss the design, development, and testing of a free software implementation of the MAD technique using the open source DotSpatial geographic information system (GIS) framework, R statistical software, and the MODFLOW groundwater model. This new tool, dubbed MAD-GIS, is built using a modular architecture that supports the integration of external analytical tools and models for key computational processes including a forward model (e.g. MODFLOW, HYDRUS) and geostatistical analysis (e.g. R, GSLIB). The GIS-based graphical user interface provides a relatively simple way for new users of the technique to prepare the spatial domain, to identify observation and anchor points, to perform the MAD analysis using a selected forward model, and to view results. MAD-GIS uses the Managed Extensibility Framework (MEF) provided by the Microsoft .NET programming platform to support integration of different modeling and analytical tools at run-time through a custom "driver." Each driver establishes a connection with external programs through a programming interface, which provides the elements for communicating with core MAD software. This presentation gives an example of adapting the MODFLOW to serve as the external forward model in MAD-GIS for inferring the distribution functions of key MODFLOW parameters. Additional drivers for other models are being developed and it is expected that the open source nature of the project will engender the development of additional model drivers by 3rd party scientists.

  12. Predicting ectotherm disease vector spread—benefits from multidisciplinary approaches and directions forward

    NASA Astrophysics Data System (ADS)

    Thomas, Stephanie Margarete; Beierkuhnlein, Carl

    2013-05-01

    The occurrence of ectotherm disease vectors outside of their previous distribution area and the emergence of vector-borne diseases can be increasingly observed at a global scale and are accompanied by a growing number of studies which investigate the vast range of determining factors and their causal links. Consequently, a broad span of scientific disciplines is involved in tackling these complex phenomena. First, we evaluate the citation behaviour of relevant scientific literature in order to clarify the question "do scientists consider results of other disciplines to extend their expertise?" We then highlight emerging tools and concepts useful for risk assessment. Correlative models (regression-based, machine-learning and profile techniques), mechanistic models (basic reproduction number R 0) and methods of spatial regression, interaction and interpolation are described. We discuss further steps towards multidisciplinary approaches regarding new tools and emerging concepts to combine existing approaches such as Bayesian geostatistical modelling, mechanistic models which avoid the need for parameter fitting, joined correlative and mechanistic models, multi-criteria decision analysis and geographic profiling. We take the quality of both occurrence data for vector, host and disease cases, and data of the predictor variables into consideration as both determine the accuracy of risk area identification. Finally, we underline the importance of multidisciplinary research approaches. Even if the establishment of communication networks between scientific disciplines and the share of specific methods is time consuming, it promises new insights for the surveillance and control of vector-borne diseases worldwide.

  13. Modeling dolomitized carbonate-ramp reservoirs: A case study of the Seminole San Andres unit. Part 2 -- Seismic modeling, reservoir geostatistics, and reservoir simulation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, F.P.; Dai, J.; Kerans, C.

    1998-11-01

    In part 1 of this paper, the authors discussed the rock-fabric/petrophysical classes for dolomitized carbonate-ramp rocks, the effects of rock fabric and pore type on petrophysical properties, petrophysical models for analyzing wireline logs, the critical scales for defining geologic framework, and 3-D geologic modeling. Part 2 focuses on geophysical and engineering characterizations, including seismic modeling, reservoir geostatistics, stochastic modeling, and reservoir simulation. Synthetic seismograms of 30 to 200 Hz were generated to study the level of seismic resolution required to capture the high-frequency geologic features in dolomitized carbonate-ramp reservoirs. Outcrop data were collected to investigate effects of sampling interval andmore » scale-up of block size on geostatistical parameters. Semivariogram analysis of outcrop data showed that the sill of log permeability decreases and the correlation length increases with an increase of horizontal block size. Permeability models were generated using conventional linear interpolation, stochastic realizations without stratigraphic constraints, and stochastic realizations with stratigraphic constraints. Simulations of a fine-scale Lawyer Canyon outcrop model were used to study the factors affecting waterflooding performance. Simulation results show that waterflooding performance depends strongly on the geometry and stacking pattern of the rock-fabric units and on the location of production and injection wells.« less

  14. Usability and potential of geostatistics for spatial discrimination of multiple sclerosis lesion patterns.

    PubMed

    Marschallinger, Robert; Golaszewski, Stefan M; Kunz, Alexander B; Kronbichler, Martin; Ladurner, Gunther; Hofmann, Peter; Trinka, Eugen; McCoy, Mark; Kraus, Jörg

    2014-01-01

    In multiple sclerosis (MS) the individual disease courses are very heterogeneous among patients and biomarkers for setting the diagnosis and the estimation of the prognosis for individual patients would be very helpful. For this purpose, we are developing a multidisciplinary method and workflow for the quantitative, spatial, and spatiotemporal analysis and characterization of MS lesion patterns from MRI with geostatistics. We worked on a small data set involving three synthetic and three real-world MS lesion patterns, covering a wide range of possible MS lesion configurations. After brain normalization, MS lesions were extracted and the resulting binary 3-dimensional models of MS lesion patterns were subject to geostatistical indicator variography in three orthogonal directions. By applying geostatistical indicator variography, we were able to describe the 3-dimensional spatial structure of MS lesion patterns in a standardized manner. Fitting a model function to the empirical variograms, spatial characteristics of the MS lesion patterns could be expressed and quantified by two parameters. An orthogonal plot of these parameters enabled a well-arranged comparison of the involved MS lesion patterns. This method in development is a promising candidate to complement standard image-based statistics by incorporating spatial quantification. The work flow is generic and not limited to analyzing MS lesion patterns. It can be completely automated for the screening of radiological archives. Copyright © 2013 by the American Society of Neuroimaging.

  15. Topsoil moisture mapping using geostatistical techniques under different Mediterranean climatic conditions.

    PubMed

    Martínez-Murillo, J F; Hueso-González, P; Ruiz-Sinoga, J D

    2017-10-01

    Soil mapping has been considered as an important factor in the widening of Soil Science and giving response to many different environmental questions. Geostatistical techniques, through kriging and co-kriging techniques, have made possible to improve the understanding of eco-geomorphologic variables, e.g., soil moisture. This study is focused on mapping of topsoil moisture using geostatistical techniques under different Mediterranean climatic conditions (humid, dry and semiarid) in three small watersheds and considering topography and soil properties as key factors. A Digital Elevation Model (DEM) with a resolution of 1×1m was derived from a topographical survey as well as soils were sampled to analyzed soil properties controlling topsoil moisture, which was measured during 4-years. Afterwards, some topography attributes were derived from the DEM, the soil properties analyzed in laboratory, and the topsoil moisture was modeled for the entire watersheds applying three geostatistical techniques: i) ordinary kriging; ii) co-kriging considering as co-variate topography attributes; and iii) co-kriging ta considering as co-variates topography attributes and gravel content. The results indicated topsoil moisture was more accurately mapped in the dry and semiarid watersheds when co-kriging procedure was performed. The study is a contribution to improve the efficiency and accuracy of studies about the Mediterranean eco-geomorphologic system and soil hydrology in field conditions. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. Improved Assimilation of Streamflow and Satellite Soil Moisture with the Evolutionary Particle Filter and Geostatistical Modeling

    NASA Astrophysics Data System (ADS)

    Yan, Hongxiang; Moradkhani, Hamid; Abbaszadeh, Peyman

    2017-04-01

    Assimilation of satellite soil moisture and streamflow data into hydrologic models using has received increasing attention over the past few years. Currently, these observations are increasingly used to improve the model streamflow and soil moisture predictions. However, the performance of this land data assimilation (DA) system still suffers from two limitations: 1) satellite data scarcity and quality; and 2) particle weight degeneration. In order to overcome these two limitations, we propose two possible solutions in this study. First, the general Gaussian geostatistical approach is proposed to overcome the limitation in the space/time resolution of satellite soil moisture products thus improving their accuracy at uncovered/biased grid cells. Secondly, an evolutionary PF approach based on Genetic Algorithm (GA) and Markov Chain Monte Carlo (MCMC), the so-called EPF-MCMC, is developed to further reduce weight degeneration and improve the robustness of the land DA system. This study provides a detailed analysis of the joint and separate assimilation of streamflow and satellite soil moisture into a distributed Sacramento Soil Moisture Accounting (SAC-SMA) model, with the use of recently developed EPF-MCMC and the general Gaussian geostatistical approach. Performance is assessed over several basins in the USA selected from Model Parameter Estimation Experiment (MOPEX) and located in different climate regions. The results indicate that: 1) the general Gaussian approach can predict the soil moisture at uncovered grid cells within the expected satellite data quality threshold; 2) assimilation of satellite soil moisture inferred from the general Gaussian model can significantly improve the soil moisture predictions; and 3) in terms of both deterministic and probabilistic measures, the EPF-MCMC can achieve better streamflow predictions. These results recommend that the geostatistical model is a helpful tool to aid the remote sensing technique and the EPF-MCMC is a reliable and effective DA approach in hydrologic applications.

  17. Identification of high-permeability subsurface structures with multiple point geostatistics and normal score ensemble Kalman filter

    NASA Astrophysics Data System (ADS)

    Zovi, Francesco; Camporese, Matteo; Hendricks Franssen, Harrie-Jan; Huisman, Johan Alexander; Salandin, Paolo

    2017-05-01

    Alluvial aquifers are often characterized by the presence of braided high-permeable paleo-riverbeds, which constitute an interconnected preferential flow network whose localization is of fundamental importance to predict flow and transport dynamics. Classic geostatistical approaches based on two-point correlation (i.e., the variogram) cannot describe such particular shapes. In contrast, multiple point geostatistics can describe almost any kind of shape using the empirical probability distribution derived from a training image. However, even with a correct training image the exact positions of the channels are uncertain. State information like groundwater levels can constrain the channel positions using inverse modeling or data assimilation, but the method should be able to handle non-Gaussianity of the parameter distribution. Here the normal score ensemble Kalman filter (NS-EnKF) was chosen as the inverse conditioning algorithm to tackle this issue. Multiple point geostatistics and NS-EnKF have already been tested in synthetic examples, but in this study they are used for the first time in a real-world case study. The test site is an alluvial unconfined aquifer in northeastern Italy with an extension of approximately 3 km2. A satellite training image showing the braid shapes of the nearby river and electrical resistivity tomography (ERT) images were used as conditioning data to provide information on channel shape, size, and position. Measured groundwater levels were assimilated with the NS-EnKF to update the spatially distributed groundwater parameters (hydraulic conductivity and storage coefficients). Results from the study show that the inversion based on multiple point geostatistics does not outperform the one with a multiGaussian model and that the information from the ERT images did not improve site characterization. These results were further evaluated with a synthetic study that mimics the experimental site. The synthetic results showed that only for a much larger number of conditioning piezometric heads, multiple point geostatistics and ERT could improve aquifer characterization. This shows that state of the art stochastic methods need to be supported by abundant and high-quality subsurface data.

  18. Three-dimensional imaging of aquifer and aquitard heterogeneity via transient hydraulic tomography at a highly heterogeneous field site

    NASA Astrophysics Data System (ADS)

    Zhao, Zhanfeng; Illman, Walter A.

    2018-04-01

    Previous studies have shown that geostatistics-based transient hydraulic tomography (THT) is robust for subsurface heterogeneity characterization through the joint inverse modeling of multiple pumping tests. However, the hydraulic conductivity (K) and specific storage (Ss) estimates can be smooth or even erroneous for areas where pumping/observation densities are low. This renders the imaging of interlayer and intralayer heterogeneity of highly contrasting materials including their unit boundaries difficult. In this study, we further test the performance of THT by utilizing existing and newly collected pumping test data of longer durations that showed drawdown responses in both aquifer and aquitard units at a field site underlain by a highly heterogeneous glaciofluvial deposit. The robust performance of the THT is highlighted through the comparison of different degrees of model parameterization including: (1) the effective parameter approach; (2) the geological zonation approach relying on borehole logs; and (3) the geostatistical inversion approach considering different prior information (with/without geological data). Results reveal that the simultaneous analysis of eight pumping tests with the geostatistical inverse model yields the best results in terms of model calibration and validation. We also find that the joint interpretation of long-term drawdown data from aquifer and aquitard units is necessary in mapping their full heterogeneous patterns including intralayer variabilities. Moreover, as geological data are included as prior information in the geostatistics-based THT analysis, the estimated K values increasingly reflect the vertical distribution patterns of permeameter-estimated K in both aquifer and aquitard units. Finally, the comparison of various THT approaches reveals that differences in the estimated K and Ss tomograms result in significantly different transient drawdown predictions at observation ports.

  19. On the importance of geological data for hydraulic tomography analysis: Laboratory sandbox study

    NASA Astrophysics Data System (ADS)

    Zhao, Zhanfeng; Illman, Walter A.; Berg, Steven J.

    2016-11-01

    This paper investigates the importance of geological data in Hydraulic Tomography (HT) through sandbox experiments. In particular, four groundwater models with homogeneous geological units constructed with borehole data of varying accuracy are jointly calibrated with multiple pumping test data of two different pumping and observation densities. The results are compared to those from a geostatistical inverse model. Model calibration and validation performances are quantitatively assessed using drawdown scatterplots. We find that accurate and inaccurate geological models can be well calibrated, despite the estimated K values for the poor geological models being quite different from the actual values. Model validation results reveal that inaccurate geological models yield poor drawdown predictions, but using more calibration data improves its predictive capability. Moreover, model comparisons among a highly parameterized geostatistical and layer-based geological models show that, (1) as the number of pumping tests and monitoring locations are reduced, the performance gap between the approaches decreases, and (2) a simplified geological model with a fewer number of layers is more reliable than the one based on the wrong description of stratigraphy. Finally, using a geological model as prior information in geostatistical inverse models results in the preservation of geological features, especially in areas where drawdown data are not available. Overall, our sandbox results emphasize the importance of incorporating geological data in HT surveys when data from pumping tests is sparse. These findings have important implications for field applications of HT where well distances are large.

  20. The moving-window Bayesian maximum entropy framework: estimation of PM(2.5) yearly average concentration across the contiguous United States.

    PubMed

    Akita, Yasuyuki; Chen, Jiu-Chiuan; Serre, Marc L

    2012-09-01

    Geostatistical methods are widely used in estimating long-term exposures for epidemiological studies on air pollution, despite their limited capabilities to handle spatial non-stationarity over large geographic domains and the uncertainty associated with missing monitoring data. We developed a moving-window (MW) Bayesian maximum entropy (BME) method and applied this framework to estimate fine particulate matter (PM(2.5)) yearly average concentrations over the contiguous US. The MW approach accounts for the spatial non-stationarity, while the BME method rigorously processes the uncertainty associated with data missingness in the air-monitoring system. In the cross-validation analyses conducted on a set of randomly selected complete PM(2.5) data in 2003 and on simulated data with different degrees of missing data, we demonstrate that the MW approach alone leads to at least 17.8% reduction in mean square error (MSE) in estimating the yearly PM(2.5). Moreover, the MWBME method further reduces the MSE by 8.4-43.7%, with the proportion of incomplete data increased from 18.3% to 82.0%. The MWBME approach leads to significant reductions in estimation error and thus is recommended for epidemiological studies investigating the effect of long-term exposure to PM(2.5) across large geographical domains with expected spatial non-stationarity.

  1. Bayesian Maximum Entropy space/time estimation of surface water chloride in Maryland using river distances.

    PubMed

    Jat, Prahlad; Serre, Marc L

    2016-12-01

    Widespread contamination of surface water chloride is an emerging environmental concern. Consequently accurate and cost-effective methods are needed to estimate chloride along all river miles of potentially contaminated watersheds. Here we introduce a Bayesian Maximum Entropy (BME) space/time geostatistical estimation framework that uses river distances, and we compare it with Euclidean BME to estimate surface water chloride from 2005 to 2014 in the Gunpowder-Patapsco, Severn, and Patuxent subbasins in Maryland. River BME improves the cross-validation R 2 by 23.67% over Euclidean BME, and river BME maps are significantly different than Euclidean BME maps, indicating that it is important to use river BME maps to assess water quality impairment. The river BME maps of chloride concentration show wide contamination throughout Baltimore and Columbia-Ellicott cities, the disappearance of a clean buffer separating these two large urban areas, and the emergence of multiple localized pockets of contamination in surrounding areas. The number of impaired river miles increased by 0.55% per year in 2005-2009 and by 1.23% per year in 2011-2014, corresponding to a marked acceleration of the rate of impairment. Our results support the need for control measures and increased monitoring of unassessed river miles. Copyright © 2016. Published by Elsevier Ltd.

  2. Using geostatistics to evaluate cleanup goals

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Marcon, M.F.; Hopkins, L.P.

    1995-12-01

    Geostatistical analysis is a powerful predictive tool typically used to define spatial variability in environmental data. The information from a geostatistical analysis using kriging, a geostatistical. tool, can be taken a step further to optimize sampling location and frequency and help quantify sampling uncertainty in both the remedial investigation and remedial design at a hazardous waste site. Geostatistics were used to quantify sampling uncertainty in attainment of a risk-based cleanup goal and determine the optimal sampling frequency necessary to delineate the horizontal extent of impacted soils at a Gulf Coast waste site.

  3. Comments on ``Use of conditional simulation in nuclear waste site performance assessment`` by Carol Gotway

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Downing, D.J.

    1993-10-01

    This paper discusses Carol Gotway`s paper, ``The Use of Conditional Simulation in Nuclear Waste Site Performance Assessment.`` The paper centers on the use of conditional simulation and the use of geostatistical methods to simulate an entire field of values for subsequent use in a complex computer model. The issues of sampling designs for geostatistics, semivariogram estimation and anisotropy, turning bands method for random field generation, and estimation of the comulative distribution function are brought out.

  4. The worth of data in predicting aquitard continuity in hydrogeological design

    NASA Astrophysics Data System (ADS)

    James, Bruce R.; Freeze, R. Allan

    1993-07-01

    A Bayesian decision framework is developed for addressing questions of hydrogeological data worth associated with engineering design at sites in heterogeneous geological environments. The specific case investigated is one of remedial contaminant containment in an aquifer underlain by an aquitard of uncertain continuity. The framework is used to evaluate the worth of hard and soft data in investigating the aquitard's continuity. The analysis consists of four modules: (1) an aquitard realization generator based on indicator kriging, (2) a procedure for the Bayesian updating of the uncertainty with respect to aquitard windows, (3) a Monte Carlo simulation model for advective contaminant transport, and (4) an economic decision model. A sensitivity analysis for a generic design example involving a design decision between a no-action alternative and a containment alternative indicates that the data worth of a single borehole providing a hard point datum was more sensitive to economic parameters than to hydrogeological or geostatistical parameters. For this case, data worth is very sensitive to the projected cost of containment, the discount rate, and the estimated cost of failure. When it comes to hydrogeological parameters, such as the representative hydraulic conductivity of the aquitard or underlying aquifer, the sensitivity analysis indicates that it is more important to know whether the field value is above or below some threshold value than it is to know its actual numerical value. A good conceptual understanding of the site geology is important in estimating prior uncertainties. The framework was applied in a retrospective fashion to the design of a remediation program for soil contaminated by radioactive waste disposal at the Savannah River site in South Carolina. The cost-effectiveness of different patterns of boreholes was studied. A contour map is presented for the net expected value of sample information (EVSI) for a single borehole. The net EVSI of patterns of precise point measurements is also compared to that of an imprecise seismic survey.

  5. Can Geostatistical Models Represent Nature's Variability? An Analysis Using Flume Experiments

    NASA Astrophysics Data System (ADS)

    Scheidt, C.; Fernandes, A. M.; Paola, C.; Caers, J.

    2015-12-01

    The lack of understanding in the Earth's geological and physical processes governing sediment deposition render subsurface modeling subject to large uncertainty. Geostatistics is often used to model uncertainty because of its capability to stochastically generate spatially varying realizations of the subsurface. These methods can generate a range of realizations of a given pattern - but how representative are these of the full natural variability? And how can we identify the minimum set of images that represent this natural variability? Here we use this minimum set to define the geostatistical prior model: a set of training images that represent the range of patterns generated by autogenic variability in the sedimentary environment under study. The proper definition of the prior model is essential in capturing the variability of the depositional patterns. This work starts with a set of overhead images from an experimental basin that showed ongoing autogenic variability. We use the images to analyze the essential characteristics of this suite of patterns. In particular, our goal is to define a prior model (a minimal set of selected training images) such that geostatistical algorithms, when applied to this set, can reproduce the full measured variability. A necessary prerequisite is to define a measure of variability. In this study, we measure variability using a dissimilarity distance between the images. The distance indicates whether two snapshots contain similar depositional patterns. To reproduce the variability in the images, we apply an MPS algorithm to the set of selected snapshots of the sedimentary basin that serve as training images. The training images are chosen from among the initial set by using the distance measure to ensure that only dissimilar images are chosen. Preliminary investigations show that MPS can reproduce fairly accurately the natural variability of the experimental depositional system. Furthermore, the selected training images provide process information. They fall into three basic patterns: a channelized end member, a sheet flow end member, and one intermediate case. These represent the continuum between autogenic bypass or erosion, and net deposition.

  6. Geostatistical regularization of inverse models for the retrieval of vegetation biophysical variables

    NASA Astrophysics Data System (ADS)

    Atzberger, C.; Richter, K.

    2009-09-01

    The robust and accurate retrieval of vegetation biophysical variables using radiative transfer models (RTM) is seriously hampered by the ill-posedness of the inverse problem. With this research we further develop our previously published (object-based) inversion approach [Atzberger (2004)]. The object-based RTM inversion takes advantage of the geostatistical fact that the biophysical characteristics of nearby pixel are generally more similar than those at a larger distance. A two-step inversion based on PROSPECT+SAIL generated look-up-tables is presented that can be easily implemented and adapted to other radiative transfer models. The approach takes into account the spectral signatures of neighboring pixel and optimizes a common value of the average leaf angle (ALA) for all pixel of a given image object, such as an agricultural field. Using a large set of leaf area index (LAI) measurements (n = 58) acquired over six different crops of the Barrax test site, Spain), we demonstrate that the proposed geostatistical regularization yields in most cases more accurate and spatially consistent results compared to the traditional (pixel-based) inversion. Pros and cons of the approach are discussed and possible future extensions presented.

  7. Geostatistical applications in environmental remediation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stewart, R.N.; Purucker, S.T.; Lyon, B.F.

    1995-02-01

    Geostatistical analysis refers to a collection of statistical methods for addressing data that vary in space. By incorporating spatial information into the analysis, geostatistics has advantages over traditional statistical analysis for problems with a spatial context. Geostatistics has a history of success in earth science applications, and its popularity is increasing in other areas, including environmental remediation. Due to recent advances in computer technology, geostatistical algorithms can be executed at a speed comparable to many standard statistical software packages. When used responsibly, geostatistics is a systematic and defensible tool can be used in various decision frameworks, such as the Datamore » Quality Objectives (DQO) process. At every point in the site, geostatistics can estimate both the concentration level and the probability or risk of exceeding a given value. Using these probability maps can assist in identifying clean-up zones. Given any decision threshold and an acceptable level of risk, the probability maps identify those areas that are estimated to be above or below the acceptable risk. Those areas that are above the threshold are of the most concern with regard to remediation. In addition to estimating clean-up zones, geostatistics can assist in designing cost-effective secondary sampling schemes. Those areas of the probability map with high levels of estimated uncertainty are areas where more secondary sampling should occur. In addition, geostatistics has the ability to incorporate soft data directly into the analysis. These data include historical records, a highly correlated secondary contaminant, or expert judgment. The role of geostatistics in environmental remediation is a tool that in conjunction with other methods can provide a common forum for building consensus.« less

  8. Comparing the performance of geostatistical models with additional information from covariates for sewage plume characterization.

    PubMed

    Del Monego, Maurici; Ribeiro, Paulo Justiniano; Ramos, Patrícia

    2015-04-01

    In this work, kriging with covariates is used to model and map the spatial distribution of salinity measurements gathered by an autonomous underwater vehicle in a sea outfall monitoring campaign aiming to distinguish the effluent plume from the receiving waters and characterize its spatial variability in the vicinity of the discharge. Four different geostatistical linear models for salinity were assumed, where the distance to diffuser, the west-east positioning, and the south-north positioning were used as covariates. Sample variograms were fitted by the Matèrn models using weighted least squares and maximum likelihood estimation methods as a way to detect eventual discrepancies. Typically, the maximum likelihood method estimated very low ranges which have limited the kriging process. So, at least for these data sets, weighted least squares showed to be the most appropriate estimation method for variogram fitting. The kriged maps show clearly the spatial variation of salinity, and it is possible to identify the effluent plume in the area studied. The results obtained show some guidelines for sewage monitoring if a geostatistical analysis of the data is in mind. It is important to treat properly the existence of anomalous values and to adopt a sampling strategy that includes transects parallel and perpendicular to the effluent dispersion.

  9. A Comparison of Traditional, Step-Path, and Geostatistical Techniques in the Stability Analysis of a Large Open Pit

    NASA Astrophysics Data System (ADS)

    Mayer, J. M.; Stead, D.

    2017-04-01

    With the increased drive towards deeper and more complex mine designs, geotechnical engineers are often forced to reconsider traditional deterministic design techniques in favour of probabilistic methods. These alternative techniques allow for the direct quantification of uncertainties within a risk and/or decision analysis framework. However, conventional probabilistic practices typically discretize geological materials into discrete, homogeneous domains, with attributes defined by spatially constant random variables, despite the fact that geological media display inherent heterogeneous spatial characteristics. This research directly simulates this phenomenon using a geostatistical approach, known as sequential Gaussian simulation. The method utilizes the variogram which imposes a degree of controlled spatial heterogeneity on the system. Simulations are constrained using data from the Ok Tedi mine site in Papua New Guinea and designed to randomly vary the geological strength index and uniaxial compressive strength using Monte Carlo techniques. Results suggest that conventional probabilistic techniques have a fundamental limitation compared to geostatistical approaches, as they fail to account for the spatial dependencies inherent to geotechnical datasets. This can result in erroneous model predictions, which are overly conservative when compared to the geostatistical results.

  10. A geostatistical extreme-value framework for fast simulation of natural hazard events

    PubMed Central

    Stephenson, David B.

    2016-01-01

    We develop a statistical framework for simulating natural hazard events that combines extreme value theory and geostatistics. Robust generalized additive model forms represent generalized Pareto marginal distribution parameters while a Student’s t-process captures spatial dependence and gives a continuous-space framework for natural hazard event simulations. Efficiency of the simulation method allows many years of data (typically over 10 000) to be obtained at relatively little computational cost. This makes the model viable for forming the hazard module of a catastrophe model. We illustrate the framework by simulating maximum wind gusts for European windstorms, which are found to have realistic marginal and spatial properties, and validate well against wind gust measurements. PMID:27279768

  11. Transition probability-based stochastic geological modeling using airborne geophysical data and borehole data

    NASA Astrophysics Data System (ADS)

    He, Xin; Koch, Julian; Sonnenborg, Torben O.; Jørgensen, Flemming; Schamper, Cyril; Christian Refsgaard, Jens

    2014-04-01

    Geological heterogeneity is a very important factor to consider when developing geological models for hydrological purposes. Using statistically based stochastic geological simulations, the spatial heterogeneity in such models can be accounted for. However, various types of uncertainties are associated with both the geostatistical method and the observation data. In the present study, TProGS is used as the geostatistical modeling tool to simulate structural heterogeneity for glacial deposits in a head water catchment in Denmark. The focus is on how the observation data uncertainty can be incorporated in the stochastic simulation process. The study uses two types of observation data: borehole data and airborne geophysical data. It is commonly acknowledged that the density of the borehole data is usually too sparse to characterize the horizontal heterogeneity. The use of geophysical data gives an unprecedented opportunity to obtain high-resolution information and thus to identify geostatistical properties more accurately especially in the horizontal direction. However, since such data are not a direct measurement of the lithology, larger uncertainty of point estimates can be expected as compared to the use of borehole data. We have proposed a histogram probability matching method in order to link the information on resistivity to hydrofacies, while considering the data uncertainty at the same time. Transition probabilities and Markov Chain models are established using the transformed geophysical data. It is shown that such transformation is in fact practical; however, the cutoff value for dividing the resistivity data into facies is difficult to determine. The simulated geological realizations indicate significant differences of spatial structure depending on the type of conditioning data selected. It is to our knowledge the first time that grid-to-grid airborne geophysical data including the data uncertainty are used in conditional geostatistical simulations in TProGS. Therefore, it provides valuable insights regarding the advantages and challenges of using such comprehensive data.

  12. Site characterization methodology for aquifers in support of bioreclamation activities. Volume 2: Borehole flowmeter technique, tracer tests, geostatistics and geology. Final report, August 1987-September 1989

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Young, S.C.

    1993-08-01

    This report discusses a field demonstration of a methodology for characterizing an aquifer's geohydrology in the detail required to design an optimum network of wells and/or infiltration galleries for bioreclamation systems. The project work was conducted on a 1-hectare test site at Columbus AFB, Mississippi. The technical report is divided into two volumes. Volume I describes the test site and the well network, the assumptions, and the application of equations that define groundwater flow to a well, the results of three large-scale aquifer tests, and the results of 160 single-pump tests. Volume II describes the bore hole flowmeter tests, themore » tracer tests, the geological investigations, the geostatistical analysis and the guidelines for using groundwater models to design bioreclamation systems. Site characterization, Hydraulic conductivity, Groundwater flow, Geostatistics, Geohydrology, Monitoring wells.« less

  13. Geostatistical simulations for radon indoor with a nested model including the housing factor.

    PubMed

    Cafaro, C; Giovani, C; Garavaglia, M

    2016-01-01

    The radon prone areas definition is matter of many researches in radioecology, since radon is considered a leading cause of lung tumours, therefore the authorities ask for support to develop an appropriate sanitary prevention strategy. In this paper, we use geostatistical tools to elaborate a definition accounting for some of the available information about the dwellings. Co-kriging is the proper interpolator used in geostatistics to refine the predictions by using external covariates. In advance, co-kriging is not guaranteed to improve significantly the results obtained by applying the common lognormal kriging. Here, instead, such multivariate approach leads to reduce the cross-validation residual variance to an extent which is deemed as satisfying. Furthermore, with the application of Monte Carlo simulations, the paradigm provides a more conservative radon prone areas definition than the one previously made by lognormal kriging. Copyright © 2015 Elsevier Ltd. All rights reserved.

  14. Introduction to Geostatistics

    NASA Astrophysics Data System (ADS)

    Kitanidis, P. K.

    1997-05-01

    Introduction to Geostatistics presents practical techniques for engineers and earth scientists who routinely encounter interpolation and estimation problems when analyzing data from field observations. Requiring no background in statistics, and with a unique approach that synthesizes classic and geostatistical methods, this book offers linear estimation methods for practitioners and advanced students. Well illustrated with exercises and worked examples, Introduction to Geostatistics is designed for graduate-level courses in earth sciences and environmental engineering.

  15. Integration of geology, geostatistics, well logs and pressure data to model a heterogeneous supergiant field in Iran

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Samimi, B.; Bagherpour, H.; Nioc, A.

    1995-08-01

    The geological reservoir study of the supergiant Ahwaz field significantly improved the history matching process in many aspects, particularly the development of a geostatistical model which allowed a sound basis for changes and by delivering much needed accurate estimates of grid block vertical permeabilities. The geostatistical reservoir evaluation was facilitated by using the Heresim package and litho-stratigraphic zonations for the entire field. For each of the geological zones, 3-dimensional electrolithofacies and petrophysical property distributions (realizations) were treated which captured the heterogeneities which significantly affected fluid flow. However, as this level of heterogeneity was at a significantly smaller scale than themore » flow simulation grid blocks, a scaling up effort was needed to derive the effective flow properties of the blocks (porosity, horizontal and vertical permeability, and water saturation). The properties relating to the static reservoir description were accurately derived by using stream tube techniques developed in-house whereas, the relative permeabilities of the grid block were derived by dynamic pseudo relative permeability techniques. The prediction of vertical and lateral communication and water encroachment was facilitated by a close integration of pressure, saturation data, geostatistical modelling and sedimentological studies of the depositional environments and paleocurrents. The nature of reservoir barriers and baffles varied both vertically and laterally in this heterogeneous reservoir. Maps showing differences in pressure between zones after years of production served as a guide to integrating the static geological studies to the dynamic behaviour of each of the 16 reservoir zones. The use of deep wells being drilled to a deeper reservoir provided data to better understand the sweep efficiency and the continuity of barriers and baffles.« less

  16. Spatial analysis of groundwater levels using Fuzzy Logic and geostatistical tools

    NASA Astrophysics Data System (ADS)

    Theodoridou, P. G.; Varouchakis, E. A.; Karatzas, G. P.

    2017-12-01

    The spatial variability evaluation of the water table of an aquifer provides useful information in water resources management plans. Geostatistical methods are often employed to map the free surface of an aquifer. In geostatistical analysis using Kriging techniques the selection of the optimal variogram is very important for the optimal method performance. This work compares three different criteria to assess the theoretical variogram that fits to the experimental one: the Least Squares Sum method, the Akaike Information Criterion and the Cressie's Indicator. Moreover, variable distance metrics such as the Euclidean, Minkowski, Manhattan, Canberra and Bray-Curtis are applied to calculate the distance between the observation and the prediction points, that affects both the variogram calculation and the Kriging estimator. A Fuzzy Logic System is then applied to define the appropriate neighbors for each estimation point used in the Kriging algorithm. The two criteria used during the Fuzzy Logic process are the distance between observation and estimation points and the groundwater level value at each observation point. The proposed techniques are applied to a data set of 250 hydraulic head measurements distributed over an alluvial aquifer. The analysis showed that the Power-law variogram model and Manhattan distance metric within ordinary kriging provide the best results when the comprehensive geostatistical analysis process is applied. On the other hand, the Fuzzy Logic approach leads to a Gaussian variogram model and significantly improves the estimation performance. The two different variogram models can be explained in terms of a fractional Brownian motion approach and of aquifer behavior at local scale. Finally, maps of hydraulic head spatial variability and of predictions uncertainty are constructed for the area with the two different approaches comparing their advantages and drawbacks.

  17. Bayesian inversion using a geologically realistic and discrete model space

    NASA Astrophysics Data System (ADS)

    Jaeggli, C.; Julien, S.; Renard, P.

    2017-12-01

    Since the early days of groundwater modeling, inverse methods play a crucial role. Many research and engineering groups aim to infer extensive knowledge of aquifer parameters from a sparse set of observations. Despite decades of dedicated research on this topic, there are still several major issues to be solved. In the hydrogeological framework, one is often confronted with underground structures that present very sharp contrasts of geophysical properties. In particular, subsoil structures such as karst conduits, channels, faults, or lenses, strongly influence groundwater flow and transport behavior of the underground. For this reason it can be essential to identify their location and shape very precisely. Unfortunately, when inverse methods are specially trained to consider such complex features, their computation effort often becomes unaffordably high. The following work is an attempt to solve this dilemma. We present a new method that is, in some sense, a compromise between the ergodicity of Markov chain Monte Carlo (McMC) methods and the efficient handling of data by the ensemble based Kalmann filters. The realistic and complex random fields are generated by a Multiple-Point Statistics (MPS) tool. Nonetheless, it is applicable with any conditional geostatistical simulation tool. Furthermore, the algorithm is independent of any parametrization what becomes most important when two parametric systems are equivalent (permeability and resistivity, speed and slowness, etc.). When compared to two existing McMC schemes, the computational effort was divided by a factor of 12.

  18. Geostatistical radar-raingauge combination with nonparametric correlograms: methodological considerations and application in Switzerland

    NASA Astrophysics Data System (ADS)

    Schiemann, R.; Erdin, R.; Willi, M.; Frei, C.; Berenguer, M.; Sempere-Torres, D.

    2011-05-01

    Modelling spatial covariance is an essential part of all geostatistical methods. Traditionally, parametric semivariogram models are fit from available data. More recently, it has been suggested to use nonparametric correlograms obtained from spatially complete data fields. Here, both estimation techniques are compared. Nonparametric correlograms are shown to have a substantial negative bias. Nonetheless, when combined with the sample variance of the spatial field under consideration, they yield an estimate of the semivariogram that is unbiased for small lag distances. This justifies the use of this estimation technique in geostatistical applications. Various formulations of geostatistical combination (Kriging) methods are used here for the construction of hourly precipitation grids for Switzerland based on data from a sparse realtime network of raingauges and from a spatially complete radar composite. Two variants of Ordinary Kriging (OK) are used to interpolate the sparse gauge observations. In both OK variants, the radar data are only used to determine the semivariogram model. One variant relies on a traditional parametric semivariogram estimate, whereas the other variant uses the nonparametric correlogram. The variants are tested for three cases and the impact of the semivariogram model on the Kriging prediction is illustrated. For the three test cases, the method using nonparametric correlograms performs equally well or better than the traditional method, and at the same time offers great practical advantages. Furthermore, two variants of Kriging with external drift (KED) are tested, both of which use the radar data to estimate nonparametric correlograms, and as the external drift variable. The first KED variant has been used previously for geostatistical radar-raingauge merging in Catalonia (Spain). The second variant is newly proposed here and is an extension of the first. Both variants are evaluated for the three test cases as well as an extended evaluation period. It is found that both methods yield merged fields of better quality than the original radar field or fields obtained by OK of gauge data. The newly suggested KED formulation is shown to be beneficial, in particular in mountainous regions where the quality of the Swiss radar composite is comparatively low. An analysis of the Kriging variances shows that none of the methods tested here provides a satisfactory uncertainty estimate. A suitable variable transformation is expected to improve this.

  19. Geostatistical radar-raingauge combination with nonparametric correlograms: methodological considerations and application in Switzerland

    NASA Astrophysics Data System (ADS)

    Schiemann, R.; Erdin, R.; Willi, M.; Frei, C.; Berenguer, M.; Sempere-Torres, D.

    2010-09-01

    Modelling spatial covariance is an essential part of all geostatistical methods. Traditionally, parametric semivariogram models are fit from available data. More recently, it has been suggested to use nonparametric correlograms obtained from spatially complete data fields. Here, both estimation techniques are compared. Nonparametric correlograms are shown to have a substantial negative bias. Nonetheless, when combined with the sample variance of the spatial field under consideration, they yield an estimate of the semivariogram that is unbiased for small lag distances. This justifies the use of this estimation technique in geostatistical applications. Various formulations of geostatistical combination (Kriging) methods are used here for the construction of hourly precipitation grids for Switzerland based on data from a sparse realtime network of raingauges and from a spatially complete radar composite. Two variants of Ordinary Kriging (OK) are used to interpolate the sparse gauge observations. In both OK variants, the radar data are only used to determine the semivariogram model. One variant relies on a traditional parametric semivariogram estimate, whereas the other variant uses the nonparametric correlogram. The variants are tested for three cases and the impact of the semivariogram model on the Kriging prediction is illustrated. For the three test cases, the method using nonparametric correlograms performs equally well or better than the traditional method, and at the same time offers great practical advantages. Furthermore, two variants of Kriging with external drift (KED) are tested, both of which use the radar data to estimate nonparametric correlograms, and as the external drift variable. The first KED variant has been used previously for geostatistical radar-raingauge merging in Catalonia (Spain). The second variant is newly proposed here and is an extension of the first. Both variants are evaluated for the three test cases as well as an extended evaluation period. It is found that both methods yield merged fields of better quality than the original radar field or fields obtained by OK of gauge data. The newly suggested KED formulation is shown to be beneficial, in particular in mountainous regions where the quality of the Swiss radar composite is comparatively low. An analysis of the Kriging variances shows that none of the methods tested here provides a satisfactory uncertainty estimate. A suitable variable transformation is expected to improve this.

  20. Moving beyond qualitative evaluations of Bayesian models of cognition.

    PubMed

    Hemmer, Pernille; Tauber, Sean; Steyvers, Mark

    2015-06-01

    Bayesian models of cognition provide a powerful way to understand the behavior and goals of individuals from a computational point of view. Much of the focus in the Bayesian cognitive modeling approach has been on qualitative model evaluations, where predictions from the models are compared to data that is often averaged over individuals. In many cognitive tasks, however, there are pervasive individual differences. We introduce an approach to directly infer individual differences related to subjective mental representations within the framework of Bayesian models of cognition. In this approach, Bayesian data analysis methods are used to estimate cognitive parameters and motivate the inference process within a Bayesian cognitive model. We illustrate this integrative Bayesian approach on a model of memory. We apply the model to behavioral data from a memory experiment involving the recall of heights of people. A cross-validation analysis shows that the Bayesian memory model with inferred subjective priors predicts withheld data better than a Bayesian model where the priors are based on environmental statistics. In addition, the model with inferred priors at the individual subject level led to the best overall generalization performance, suggesting that individual differences are important to consider in Bayesian models of cognition.

  1. Application of geostatistics to coal-resource characterization and mine planning. Final report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kauffman, P.W.; Walton, D.R.; Martuneac, L.

    1981-12-01

    Geostatistics is a proven method of ore reserve estimation in many non-coal mining areas but little has been published concerning its application to coal resources. This report presents the case for using geostatistics for coal mining applications and describes how a coal mining concern can best utilize geostatistical techniques for coal resource characterization and mine planning. An overview of the theory of geostatistics is also presented. Many of the applications discussed are documented in case studies that are a part of the report. The results of an exhaustive literature search are presented and recommendations are made for needed future researchmore » and demonstration projects.« less

  2. Precipitation estimation in mountainous terrain using multivariate geostatistics. Part I: structural analysis

    USGS Publications Warehouse

    Hevesi, Joseph A.; Istok, Jonathan D.; Flint, Alan L.

    1992-01-01

    Values of average annual precipitation (AAP) are desired for hydrologic studies within a watershed containing Yucca Mountain, Nevada, a potential site for a high-level nuclear-waste repository. Reliable values of AAP are not yet available for most areas within this watershed because of a sparsity of precipitation measurements and the need to obtain measurements over a sufficient length of time. To estimate AAP over the entire watershed, historical precipitation data and station elevations were obtained from a network of 62 stations in southern Nevada and southeastern California. Multivariate geostatistics (cokriging) was selected as an estimation method because of a significant (p = 0.05) correlation of r = .75 between the natural log of AAP and station elevation. A sample direct variogram for the transformed variable, TAAP = ln [(AAP) 1000], was fitted with an isotropic, spherical model defined by a small nugget value of 5000, a range of 190 000 ft, and a sill value equal to the sample variance of 163 151. Elevations for 1531 additional locations were obtained from topographic maps to improve the accuracy of cokriged estimates. A sample direct variogram for elevation was fitted with an isotropic model consisting of a nugget value of 5500 and three nested transition structures: a Gaussian structure with a range of 61 000 ft, a spherical structure with a range of 70 000 ft, and a quasi-stationary, linear structure. The use of an isotropic, stationary model for elevation was considered valid within a sliding-neighborhood radius of 120 000 ft. The problem of fitting a positive-definite, nonlinear model of coregionalization to an inconsistent sample cross variogram for TAAP and elevation was solved by a modified use of the Cauchy-Schwarz inequality. A selected cross-variogram model consisted of two nested structures: a Gaussian structure with a range of 61 000 ft and a spherical structure with a range of 190 000 ft. Cross validation was used for model selection and for comparing the geostatistical model with six alternate estimation methods. Multivariate geostatistics provided the best cross-validation results.

  3. Uncertainty in training image-based inversion of hydraulic head data constrained to ERT data: Workflow and case study

    NASA Astrophysics Data System (ADS)

    Hermans, Thomas; Nguyen, Frédéric; Caers, Jef

    2015-07-01

    In inverse problems, investigating uncertainty in the posterior distribution of model parameters is as important as matching data. In recent years, most efforts have focused on techniques to sample the posterior distribution with reasonable computational costs. Within a Bayesian context, this posterior depends on the prior distribution. However, most of the studies ignore modeling the prior with realistic geological uncertainty. In this paper, we propose a workflow inspired by a Popper-Bayes philosophy that data should first be used to falsify models, then only be considered for matching. We propose a workflow consisting of three steps: (1) in defining the prior, we interpret multiple alternative geological scenarios from literature (architecture of facies) and site-specific data (proportions of facies). Prior spatial uncertainty is modeled using multiple-point geostatistics, where each scenario is defined using a training image. (2) We validate these prior geological scenarios by simulating electrical resistivity tomography (ERT) data on realizations of each scenario and comparing them to field ERT in a lower dimensional space. In this second step, the idea is to probabilistically falsify scenarios with ERT, meaning that scenarios which are incompatible receive an updated probability of zero while compatible scenarios receive a nonzero updated belief. (3) We constrain the hydrogeological model with hydraulic head and ERT using a stochastic search method. The workflow is applied to a synthetic and a field case studies in an alluvial aquifer. This study highlights the importance of considering and estimating prior uncertainty (without data) through a process of probabilistic falsification.

  4. Characterizing Volumetric Strain at Brady Hot Springs, Nevada, USA Using Geodetic Data, Numerical Models, and Prior Information

    NASA Astrophysics Data System (ADS)

    Reinisch, E. C.; Feigl, K. L.; Cardiff, M. A.; Morency, C.; Kreemer, C.; Akerley, J.

    2017-12-01

    Time-dependent deformation has been observed at Brady Hot Springs using data from the Global Positioning System (GPS) and interferometric synthetic aperture radar (InSAR) [e.g., Ali et al. 2016, http://dx.doi.org/10.1016/j.geothermics.2016.01.008]. We seek to determine the geophysical process governing the observed subsidence. As two end-member hypotheses, we consider thermal contraction and a decrease in pore fluid pressure. A decrease in temperature would cause contraction in the subsurface and subsidence at the surface. A decrease in pore fluid pressure would allow the volume of pores to shrink and also produce subsidence. To simulate these processes, we use a dislocation model that assumes uniform elastic properties in a half space [Okada, 1985]. The parameterization consists of many cubic volume elements (voxels), each of which contracts by closing its three mutually orthogonal bisecting square surfaces. Then we use linear inversion to solve for volumetric strain in each voxel given a measurement of range change. To differentiate between the two possible hypotheses, we use a Bayesian framework with geostatistical prior information. We perform inversion using each prior to decide if one leads to a more geophysically reasonable interpretation than the other. This work is part of a project entitled "Poroelastic Tomography by Adjoint Inverse Modeling of Data from Seismology, Geodesy, and Hydrology" and is supported by the Geothermal Technology Office of the U.S. Department of Energy [DE-EE0006760].

  5. Developing a spatial-statistical model and map of historical malaria prevalence in Botswana using a staged variable selection procedure

    PubMed Central

    Craig, Marlies H; Sharp, Brian L; Mabaso, Musawenkosi LH; Kleinschmidt, Immo

    2007-01-01

    Background Several malaria risk maps have been developed in recent years, many from the prevalence of infection data collated by the MARA (Mapping Malaria Risk in Africa) project, and using various environmental data sets as predictors. Variable selection is a major obstacle due to analytical problems caused by over-fitting, confounding and non-independence in the data. Testing and comparing every combination of explanatory variables in a Bayesian spatial framework remains unfeasible for most researchers. The aim of this study was to develop a malaria risk map using a systematic and practicable variable selection process for spatial analysis and mapping of historical malaria risk in Botswana. Results Of 50 potential explanatory variables from eight environmental data themes, 42 were significantly associated with malaria prevalence in univariate logistic regression and were ranked by the Akaike Information Criterion. Those correlated with higher-ranking relatives of the same environmental theme, were temporarily excluded. The remaining 14 candidates were ranked by selection frequency after running automated step-wise selection procedures on 1000 bootstrap samples drawn from the data. A non-spatial multiple-variable model was developed through step-wise inclusion in order of selection frequency. Previously excluded variables were then re-evaluated for inclusion, using further step-wise bootstrap procedures, resulting in the exclusion of another variable. Finally a Bayesian geo-statistical model using Markov Chain Monte Carlo simulation was fitted to the data, resulting in a final model of three predictor variables, namely summer rainfall, mean annual temperature and altitude. Each was independently and significantly associated with malaria prevalence after allowing for spatial correlation. This model was used to predict malaria prevalence at unobserved locations, producing a smooth risk map for the whole country. Conclusion We have produced a highly plausible and parsimonious model of historical malaria risk for Botswana from point-referenced data from a 1961/2 prevalence survey of malaria infection in 1–14 year old children. After starting with a list of 50 potential variables we ended with three highly plausible predictors, by applying a systematic and repeatable staged variable selection procedure that included a spatial analysis, which has application for other environmentally determined infectious diseases. All this was accomplished using general-purpose statistical software. PMID:17892584

  6. Spatial analysis of the distribution of Spodoptera frugiperda (J.E. Smith) (Lepidoptera: Noctuidae) and losses in maize crop productivity using geostatistics.

    PubMed

    Farias, Paulo R S; Barbosa, José C; Busoli, Antonio C; Overal, William L; Miranda, Vicente S; Ribeiro, Susane M

    2008-01-01

    The fall armyworm, Spodoptera frugiperda (J.E. Smith), is one of the chief pests of maize in the Americas. The study of its spatial distribution is fundamental for designing correct control strategies, improving sampling methods, determining actual and potential crop losses, and adopting precise agricultural techniques. In São Paulo state, Brazil, a maize field was sampled at weekly intervals, from germination through harvest, for caterpillar densities, using quadrates. In each of 200 quadrates, 10 plants were sampled per week. Harvest weights were obtained in the field for each quadrate, and ear diameters and lengths were also sampled (15 ears per quadrate) and used to estimate potential productivity of the quadrate. Geostatistical analyses of caterpillar densities showed greatest ranges for small caterpillars when semivariograms were adjusted for a spherical model that showed greatest fit. As the caterpillars developed in the field, their spatial distribution became increasingly random, as shown by a model adjusted to a straight line, indicating a lack of spatial dependence among samples. Harvest weight and ear length followed the spherical model, indicating the existence of spatial variability of the production parameters in the maize field. Geostatistics shows promise for the application of precise methods in the integrated control of pests.

  7. Inversion using a new low-dimensional representation of complex binary geological media based on a deep neural network

    NASA Astrophysics Data System (ADS)

    Laloy, Eric; Hérault, Romain; Lee, John; Jacques, Diederik; Linde, Niklas

    2017-12-01

    Efficient and high-fidelity prior sampling and inversion for complex geological media is still a largely unsolved challenge. Here, we use a deep neural network of the variational autoencoder type to construct a parametric low-dimensional base model parameterization of complex binary geological media. For inversion purposes, it has the attractive feature that random draws from an uncorrelated standard normal distribution yield model realizations with spatial characteristics that are in agreement with the training set. In comparison with the most commonly used parametric representations in probabilistic inversion, we find that our dimensionality reduction (DR) approach outperforms principle component analysis (PCA), optimization-PCA (OPCA) and discrete cosine transform (DCT) DR techniques for unconditional geostatistical simulation of a channelized prior model. For the considered examples, important compression ratios (200-500) are achieved. Given that the construction of our parameterization requires a training set of several tens of thousands of prior model realizations, our DR approach is more suited for probabilistic (or deterministic) inversion than for unconditional (or point-conditioned) geostatistical simulation. Probabilistic inversions of 2D steady-state and 3D transient hydraulic tomography data are used to demonstrate the DR-based inversion. For the 2D case study, the performance is superior compared to current state-of-the-art multiple-point statistics inversion by sequential geostatistical resampling (SGR). Inversion results for the 3D application are also encouraging.

  8. Development and comparison of Bayesian modularization method in uncertainty assessment of hydrological models

    NASA Astrophysics Data System (ADS)

    Li, L.; Xu, C.-Y.; Engeland, K.

    2012-04-01

    With respect to model calibration, parameter estimation and analysis of uncertainty sources, different approaches have been used in hydrological models. Bayesian method is one of the most widely used methods for uncertainty assessment of hydrological models, which incorporates different sources of information into a single analysis through Bayesian theorem. However, none of these applications can well treat the uncertainty in extreme flows of hydrological models' simulations. This study proposes a Bayesian modularization method approach in uncertainty assessment of conceptual hydrological models by considering the extreme flows. It includes a comprehensive comparison and evaluation of uncertainty assessments by a new Bayesian modularization method approach and traditional Bayesian models using the Metropolis Hasting (MH) algorithm with the daily hydrological model WASMOD. Three likelihood functions are used in combination with traditional Bayesian: the AR (1) plus Normal and time period independent model (Model 1), the AR (1) plus Normal and time period dependent model (Model 2) and the AR (1) plus multi-normal model (Model 3). The results reveal that (1) the simulations derived from Bayesian modularization method are more accurate with the highest Nash-Sutcliffe efficiency value, and (2) the Bayesian modularization method performs best in uncertainty estimates of entire flows and in terms of the application and computational efficiency. The study thus introduces a new approach for reducing the extreme flow's effect on the discharge uncertainty assessment of hydrological models via Bayesian. Keywords: extreme flow, uncertainty assessment, Bayesian modularization, hydrological model, WASMOD

  9. The moving-window Bayesian Maximum Entropy framework: Estimation of PM2.5 yearly average concentration across the contiguous United States

    PubMed Central

    Akita, Yasuyuki; Chen, Jiu-Chiuan; Serre, Marc L.

    2013-01-01

    Geostatistical methods are widely used in estimating long-term exposures for air pollution epidemiological studies, despite their limited capabilities to handle spatial non-stationarity over large geographic domains and uncertainty associated with missing monitoring data. We developed a moving-window (MW) Bayesian Maximum Entropy (BME) method and applied this framework to estimate fine particulate matter (PM2.5) yearly average concentrations over the contiguous U.S. The MW approach accounts for the spatial non-stationarity, while the BME method rigorously processes the uncertainty associated with data missingnees in the air monitoring system. In the cross-validation analyses conducted on a set of randomly selected complete PM2.5 data in 2003 and on simulated data with different degrees of missing data, we demonstrate that the MW approach alone leads to at least 17.8% reduction in mean square error (MSE) in estimating the yearly PM2.5. Moreover, the MWBME method further reduces the MSE by 8.4% to 43.7% with the proportion of incomplete data increased from 18.3% to 82.0%. The MWBME approach leads to significant reductions in estimation error and thus is recommended for epidemiological studies investigating the effect of long-term exposure to PM2.5 across large geographical domains with expected spatial non-stationarity. PMID:22739679

  10. Reservoir property grids improve with geostatistics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vogt, J.

    1993-09-01

    Visualization software, reservoir simulators and many other E and P software applications need reservoir property grids as input. Using geostatistics, as compared to other gridding methods, to produce these grids leads to the best output from the software programs. For the purpose stated herein, geostatistics is simply two types of gridding methods. Mathematically, these methods are based on minimizing or duplicating certain statistical properties of the input data. One geostatical method, called kriging, is used when the highest possible point-by-point accuracy is desired. The other method, called conditional simulation, is used when one wants statistics and texture of the resultingmore » grid to be the same as for the input data. In the following discussion, each method is explained, compared to other gridding methods, and illustrated through example applications. Proper use of geostatistical data in flow simulations, use of geostatistical data for history matching, and situations where geostatistics has no significant advantage over other methods, also will be covered.« less

  11. ON THE GEOSTATISTICAL APPROACH TO THE INVERSE PROBLEM. (R825689C037)

    EPA Science Inventory

    Abstract

    The geostatistical approach to the inverse problem is discussed with emphasis on the importance of structural analysis. Although the geostatistical approach is occasionally misconstrued as mere cokriging, in fact it consists of two steps: estimation of statist...

  12. A Bayesian Nonparametric Approach to Test Equating

    ERIC Educational Resources Information Center

    Karabatsos, George; Walker, Stephen G.

    2009-01-01

    A Bayesian nonparametric model is introduced for score equating. It is applicable to all major equating designs, and has advantages over previous equating models. Unlike the previous models, the Bayesian model accounts for positive dependence between distributions of scores from two tests. The Bayesian model and the previous equating models are…

  13. Model Diagnostics for Bayesian Networks

    ERIC Educational Resources Information Center

    Sinharay, Sandip

    2006-01-01

    Bayesian networks are frequently used in educational assessments primarily for learning about students' knowledge and skills. There is a lack of works on assessing fit of Bayesian networks. This article employs the posterior predictive model checking method, a popular Bayesian model checking tool, to assess fit of simple Bayesian networks. A…

  14. Improved characterization of heterogeneous permeability in saline aquifers from transient pressure data during freshwater injection

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kang, Peter K.; Lee, Jonghyun; Fu, Xiaojing

    Managing recharge of freshwater into saline aquifers requires accurate estimation of the heterogeneous permeability field for maximizing injection and recovery efficiency. Here we present a methodology for subsurface characterization in saline aquifers that takes advantage of the density difference between the injected freshwater and the ambient saline groundwater. We combine high-resolution forward modeling of density-driven flow with an efficient Bayesian geostatistical inversion algorithm. In the presence of a density difference between the injected and ambient fluids due to differences in salinity, the pressure field is coupled to the spatial distribution of salinity. This coupling renders the pressure field transient: themore » time evolution of the salinity distribution controls the density distribution which then leads to a time-evolving pressure distribution. We exploit this coupling between pressure and salinity to obtain an improved characterization of the permeability field without multiple pumping tests or additional salinity measurements. We show that the inversion performance improves with an increase in the mixed convection ratio—the relative importance between viscous forces from injection and buoyancy forces from density difference. Thus, our work shows that measuring transient pressure data at multiple sampling points during freshwater injection into saline aquifers can be an effective strategy for aquifer characterization, key to the successful management of aquifer recharge.« less

  15. Improved characterization of heterogeneous permeability in saline aquifers from transient pressure data during freshwater injection

    DOE PAGES

    Kang, Peter K.; Lee, Jonghyun; Fu, Xiaojing; ...

    2017-05-31

    Managing recharge of freshwater into saline aquifers requires accurate estimation of the heterogeneous permeability field for maximizing injection and recovery efficiency. Here we present a methodology for subsurface characterization in saline aquifers that takes advantage of the density difference between the injected freshwater and the ambient saline groundwater. We combine high-resolution forward modeling of density-driven flow with an efficient Bayesian geostatistical inversion algorithm. In the presence of a density difference between the injected and ambient fluids due to differences in salinity, the pressure field is coupled to the spatial distribution of salinity. This coupling renders the pressure field transient: themore » time evolution of the salinity distribution controls the density distribution which then leads to a time-evolving pressure distribution. We exploit this coupling between pressure and salinity to obtain an improved characterization of the permeability field without multiple pumping tests or additional salinity measurements. We show that the inversion performance improves with an increase in the mixed convection ratio—the relative importance between viscous forces from injection and buoyancy forces from density difference. Thus, our work shows that measuring transient pressure data at multiple sampling points during freshwater injection into saline aquifers can be an effective strategy for aquifer characterization, key to the successful management of aquifer recharge.« less

  16. Spatial co-distribution of neglected tropical diseases in the East African Great Lakes region: revisiting the justification for integrated control

    PubMed Central

    Clements, Archie C. A.; Deville, Marie-Alice; Ndayishimiye, Onésime; Brooker, Simon; Fenwick, Alan

    2010-01-01

    Summary OBJECTIVE To determine spatial patterns of co-endemicity of schistosomiasis mansoni and the soil-transmitted helminths (STHs) Ascaris lumbricoides, Trichuris trichiura and hookworm in the Great Lakes region of East Africa, to help plan integrated neglected tropical disease programmes in this region. METHOD Parasitological surveys were conducted in Uganda, Tanzania, Kenya and Burundi in 28 213 children in 404 schools. Bayesian geostatistical models were used to interpolate prevalence of these infections across the study area. Interpolated prevalence maps were overlaid to determine areas of co-endemicity. RESULTS In the Great Lakes region, prevalence was 18.1% for Schistosoma mansoni, 50.0% for hookworm, 6.8% for A. lumbricoides and 6.8% for T. trichiura. Hookworm infection was ubiquitous, whereas S. mansoni, A. lumbricoides and T. trichiura were highly focal. Most areas were endemic (prevalence ≥10%) or hyperendemic (prevalence ≥50%) for one or more STHs, whereas endemic areas for schistosomiasis mansoni were restricted to foci adjacent large perennial water bodies. CONCLUSION Because of the ubiquity of hookworm, treatment programmes are required for STH throughout the region but efficient schistosomiasis control should only be targeted at limited high-risk areas. Therefore, integration of schistosomiasis with STH control is only indicated in limited foci in East Africa. PMID:20409287

  17. Exploring prediction uncertainty of spatial data in geostatistical and machine learning Approaches

    NASA Astrophysics Data System (ADS)

    Klump, J. F.; Fouedjio, F.

    2017-12-01

    Geostatistical methods such as kriging with external drift as well as machine learning techniques such as quantile regression forest have been intensively used for modelling spatial data. In addition to providing predictions for target variables, both approaches are able to deliver a quantification of the uncertainty associated with the prediction at a target location. Geostatistical approaches are, by essence, adequate for providing such prediction uncertainties and their behaviour is well understood. However, they often require significant data pre-processing and rely on assumptions that are rarely met in practice. Machine learning algorithms such as random forest regression, on the other hand, require less data pre-processing and are non-parametric. This makes the application of machine learning algorithms to geostatistical problems an attractive proposition. The objective of this study is to compare kriging with external drift and quantile regression forest with respect to their ability to deliver reliable prediction uncertainties of spatial data. In our comparison we use both simulated and real world datasets. Apart from classical performance indicators, comparisons make use of accuracy plots, probability interval width plots, and the visual examinations of the uncertainty maps provided by the two approaches. By comparing random forest regression to kriging we found that both methods produced comparable maps of estimated values for our variables of interest. However, the measure of uncertainty provided by random forest seems to be quite different to the measure of uncertainty provided by kriging. In particular, the lack of spatial context can give misleading results in areas without ground truth data. These preliminary results raise questions about assessing the risks associated with decisions based on the predictions from geostatistical and machine learning algorithms in a spatial context, e.g. mineral exploration.

  18. Geostatistical Analysis of Mesoscale Spatial Variability and Error in SeaWiFS and MODIS/Aqua Global Ocean Color Data

    NASA Astrophysics Data System (ADS)

    Glover, David M.; Doney, Scott C.; Oestreich, William K.; Tullo, Alisdair W.

    2018-01-01

    Mesoscale (10-300 km, weeks to months) physical variability strongly modulates the structure and dynamics of planktonic marine ecosystems via both turbulent advection and environmental impacts upon biological rates. Using structure function analysis (geostatistics), we quantify the mesoscale biological signals within global 13 year SeaWiFS (1998-2010) and 8 year MODIS/Aqua (2003-2010) chlorophyll a ocean color data (Level-3, 9 km resolution). We present geographical distributions, seasonality, and interannual variability of key geostatistical parameters: unresolved variability or noise, resolved variability, and spatial range. Resolved variability is nearly identical for both instruments, indicating that geostatistical techniques isolate a robust measure of biophysical mesoscale variability largely independent of measurement platform. In contrast, unresolved variability in MODIS/Aqua is substantially lower than in SeaWiFS, especially in oligotrophic waters where previous analysis identified a problem for the SeaWiFS instrument likely due to sensor noise characteristics. Both records exhibit a statistically significant relationship between resolved mesoscale variability and the low-pass filtered chlorophyll field horizontal gradient magnitude, consistent with physical stirring acting on large-scale gradient as an important factor supporting observed mesoscale variability. Comparable horizontal length scales for variability are found from tracer-based scaling arguments and geostatistical decorrelation. Regional variations between these length scales may reflect scale dependence of biological mechanisms that also create variability directly at the mesoscale, for example, enhanced net phytoplankton growth in coastal and frontal upwelling and convective mixing regions. Global estimates of mesoscale biophysical variability provide an improved basis for evaluating higher resolution, coupled ecosystem-ocean general circulation models, and data assimilation.

  19. Geostatistical mapping of effluent-affected sediment distribution on the Palos Verdes shelf

    USGS Publications Warehouse

    Murray, C.J.; Lee, H.J.; Hampton, M.A.

    2002-01-01

    Geostatistical techniques were used to study the spatial continuity of the thickness of effluent-affected sediment in the offshore Palos Verdes Margin area. The thickness data were measured directly from cores and indirectly from high-frequency subbottom profiles collected over the Palos Verdes Margin. Strong spatial continuity of the sediment thickness data was identified, with a maximum range of correlation in excess of 1.4 km. The spatial correlation showed a marked anisotropy, and was more than twice as continuous in the alongshore direction as in the cross-shelf direction. Sequential indicator simulation employing models fit to the thickness data variograms was used to map the distribution of the sediment, and to quantify the uncertainty in those estimates. A strong correlation between sediment thickness data and measurements of the mass of the contaminant p,p???-DDE per unit area was identified. A calibration based on the bivariate distribution of the thickness and p,p???-DDE data was applied using Markov-Bayes indicator simulation to extend the geostatistical study and map the contamination levels in the sediment. Integrating the map grids produced by the geostatistical study of the two variables indicated that 7.8 million m3 of effluent-affected sediment exist in the map area, containing approximately 61-72 Mg (metric tons) of p,p???-DDE. Most of the contaminated sediment (about 85% of the sediment and 89% of the p,p???-DDE) occurs in water depths < 100 m. The geostatistical study also indicated that the samples available for mapping are well distributed and the uncertainty of the estimates of the thickness and contamination level of the sediments is lowest in areas where the contaminated sediment is most prevalent. ?? 2002 Elsevier Science Ltd. All rights reserved.

  20. Geostatistics for spatial genetic structures: study of wild populations of perennial ryegrass.

    PubMed

    Monestiez, P; Goulard, M; Charmet, G

    1994-04-01

    Methods based on geostatistics were applied to quantitative traits of agricultural interest measured on a collection of 547 wild populations of perennial ryegrass in France. The mathematical background of these methods, which resembles spatial autocorrelation analysis, is briefly described. When a single variable is studied, the spatial structure analysis is similar to spatial autocorrelation analysis, and a spatial prediction method, called "kriging", gives a filtered map of the spatial pattern over all the sampled area. When complex interactions of agronomic traits with different evaluation sites define a multivariate structure for the spatial analysis, geostatistical methods allow the spatial variations to be broken down into two main spatial structures with ranges of 120 km and 300 km, respectively. The predicted maps that corresponded to each range were interpreted as a result of the isolation-by-distance model and as a consequence of selection by environmental factors. Practical collecting methodology for breeders may be derived from such spatial structures.

  1. Bayesian Model Averaging for Propensity Score Analysis

    ERIC Educational Resources Information Center

    Kaplan, David; Chen, Jianshen

    2013-01-01

    The purpose of this study is to explore Bayesian model averaging in the propensity score context. Previous research on Bayesian propensity score analysis does not take into account model uncertainty. In this regard, an internally consistent Bayesian framework for model building and estimation must also account for model uncertainty. The…

  2. Introduction to This Special Issue on Geostatistics and Geospatial Techniques in Remote Sensing

    NASA Technical Reports Server (NTRS)

    Atkinson, Peter; Quattrochi, Dale A.; Goodman, H. Michael (Technical Monitor)

    2000-01-01

    The germination of this special Computers & Geosciences (C&G) issue began at the Royal Geographical Society (with the Institute of British Geographers) (RGS-IBG) annual meeting in January 1997 held at the University of Exeter, UK. The snow and cold of the English winter were tempered greatly by warm and cordial discussion of how to stimulate and enhance cooperation on geostatistical and geospatial research in remote sensing 'across the big pond' between UK and US researchers. It was decided that one way forward would be to hold parallel sessions in 1998 on geostatistical and geospatial research in remote sensing at appropriate venues in both the UK and the US. Selected papers given at these sessions would be published as special issues of C&G on the UK side and Photogrammetric Engineering and Remote Sensing (PE&RS) on the US side. These issues would highlight the commonality in research on geostatistical and geospatial research in remote sensing on both sides of the Atlantic Ocean. As a consequence, a session on "Geostatistics and Geospatial Techniques for Remote Sensing of Land Surface Processes" was held at the RGS-IBG annual meeting in Guildford, Surrey, UK in January 1998, organized by the Modeling and Advanced Techniques Special Interest Group (MAT SIG) of the Remote Sensing Society (RSS). A similar session was held at the Association of American Geographers (AAG) annual meeting in Boston, Massachusetts in March 1998, sponsored by the AAG's Remote Sensing Specialty Group (RSSG). The 10 papers that make up this issue of C&G, comprise 7 papers from the UK and 3 papers from the LIS. We are both co-editors of each of the journal special issues, with the lead editor of each journal issue being from their respective side of the Atlantic. The special issue of PE&RS (vol. 65) that constitutes the other half of this co-edited journal series was published in early 1999, comprising 6 papers by US authors. We are indebted to the International Association for Mathematical Geology for allowing us to use C&G as a vehicle to convey how geostatistics and geospatial techniques can be used to analyze remote sensing and other types of spatial data. We see this special issue of C&G. and its complementary issue of PE&RS. as a testament to the vitality and interest in the application of geostatistical and geospatial techniques in remote sensing. We also see these special journal issues as the beginning of a fruitful. and hopefully long-term relationship, between American and British geographers and other researchers interested in geostatistical and geospatial techniques applied to remote sensing and other spatial data.

  3. Spatial and temporal distribution of soil-transmitted helminth infection in sub-Saharan Africa: a systematic review and geostatistical meta-analysis.

    PubMed

    Karagiannis-Voules, Dimitrios-Alexios; Biedermann, Patricia; Ekpo, Uwem F; Garba, Amadou; Langer, Erika; Mathieu, Els; Midzi, Nicholas; Mwinzi, Pauline; Polderman, Anton M; Raso, Giovanna; Sacko, Moussa; Talla, Idrissa; Tchuenté, Louis-Albert Tchuem; Touré, Seydou; Winkler, Mirko S; Utzinger, Jürg; Vounatsou, Penelope

    2015-01-01

    Interest is growing in predictive risk mapping for neglected tropical diseases (NTDs), particularly to scale up preventive chemotherapy, surveillance, and elimination efforts. Soil-transmitted helminths (hookworm, Ascaris lumbricoides, and Trichuris trichiura) are the most widespread NTDs, but broad geographical analyses are scarce. We aimed to predict the spatial and temporal distribution of soil-transmitted helminth infections, including the number of infected people and treatment needs, across sub-Saharan Africa. We systematically searched PubMed, Web of Knowledge, and African Journal Online from inception to Dec 31, 2013, without language restrictions, to identify georeferenced surveys. We extracted data from household surveys on sources of drinking water, sanitation, and women's level of education. Bayesian geostatistical models were used to align the data in space and estimate risk of with hookworm, A lumbricoides, and T trichiura over a grid of roughly 1 million pixels at a spatial resolution of 5 × 5 km. We calculated anthelmintic treatment needs on the basis of WHO guidelines (treatment of all school-aged children once per year where prevalence in this population is 20-50% or twice per year if prevalence is greater than 50%). We identified 459 relevant survey reports that referenced 6040 unique locations. We estimate that the prevalence of hookworm, A lumbricoides, and T trichiura among school-aged children from 2000 onwards was 16·5%, 6·6%, and 4·4%. These estimates are between 52% and 74% lower than those in surveys done before 2000, and have become similar to values for the entire communities. We estimated that 126 million doses of anthelmintic treatments are required per year. Patterns of soil-transmitted helminth infection in sub-Saharan Africa have changed and the prevalence of infection has declined substantially in this millennium, probably due to socioeconomic development and large-scale deworming programmes. The global control strategy should be reassessed, with emphasis given also to adults to progress towards local elimination. Swiss National Science Foundation and European Research Council. Copyright © 2015 Elsevier Ltd. All rights reserved.

  4. Challenges of DHS and MIS to capture the entire pattern of malaria parasite risk and intervention effects in countries with different ecological zones: the case of Cameroon.

    PubMed

    Massoda Tonye, Salomon G; Kouambeng, Celestin; Wounang, Romain; Vounatsou, Penelope

    2018-04-06

    In 2011, the demographic and health survey (DHS) in Cameroon was combined with the multiple indicator cluster survey. Malaria parasitological data were collected, but the survey period did not overlap with the high malaria transmission season. A malaria indicator survey (MIS) was also conducted during the same year, within the malaria peak transmission season. This study compares estimates of the geographical distribution of malaria parasite risk and of the effects of interventions obtained from the DHS and MIS survey data. Bayesian geostatistical models were applied on DHS and MIS data to obtain georeferenced estimates of the malaria parasite prevalence and to assess the effects of interventions. Climatic predictors were retrieved from satellite sources. Geostatistical variable selection was used to identify the most important climatic predictors and indicators of malaria interventions. The overall observed malaria parasite risk among children was 33 and 30% in the DHS and MIS data, respectively. Both datasets identified the Normalized Difference Vegetation Index and the altitude as important predictors of the geographical distribution of the disease. However, MIS selected additional climatic factors as important disease predictors. The magnitude of the estimated malaria parasite risk at national level was similar in both surveys. Nevertheless, DHS estimates lower risk in the North and Coastal areas. MIS did not find any important intervention effects, although DHS revealed that the proportion of population with an insecticide-treated nets access in their household was statistically important. An important negative relationship between malaria parasitaemia and socioeconomic factors, such as the level of mother's education, place of residence and the household welfare were captured by both surveys. Timing of the malaria survey influences estimates of the geographical distribution of disease risk, especially in settings with seasonal transmission. In countries with different ecological zones and thus different seasonal patterns, a single survey may not be able to identify all high risk areas. A continuous MIS or a combination of MIS, health information system data and data from sentinel sites may be able to capture the disease risk distribution in space across different seasons.

  5. Bayesian Inference for Functional Dynamics Exploring in fMRI Data.

    PubMed

    Guo, Xuan; Liu, Bing; Chen, Le; Chen, Guantao; Pan, Yi; Zhang, Jing

    2016-01-01

    This paper aims to review state-of-the-art Bayesian-inference-based methods applied to functional magnetic resonance imaging (fMRI) data. Particularly, we focus on one specific long-standing challenge in the computational modeling of fMRI datasets: how to effectively explore typical functional interactions from fMRI time series and the corresponding boundaries of temporal segments. Bayesian inference is a method of statistical inference which has been shown to be a powerful tool to encode dependence relationships among the variables with uncertainty. Here we provide an introduction to a group of Bayesian-inference-based methods for fMRI data analysis, which were designed to detect magnitude or functional connectivity change points and to infer their functional interaction patterns based on corresponding temporal boundaries. We also provide a comparison of three popular Bayesian models, that is, Bayesian Magnitude Change Point Model (BMCPM), Bayesian Connectivity Change Point Model (BCCPM), and Dynamic Bayesian Variable Partition Model (DBVPM), and give a summary of their applications. We envision that more delicate Bayesian inference models will be emerging and play increasingly important roles in modeling brain functions in the years to come.

  6. Stochastic hydrogeology: what professionals really need?

    PubMed

    Renard, Philippe

    2007-01-01

    Quantitative hydrogeology celebrated its 150th anniversary in 2006. Geostatistics is younger but has had a very large impact in hydrogeology. Today, geostatistics is used routinely to interpolate deterministically most of the parameters that are required to analyze a problem or make a quantitative analysis. In a small number of cases, geostatistics is combined with deterministic approaches to forecast uncertainty. At a more academic level, geostatistics is used extensively to study physical processes in heterogeneous aquifers. Yet, there is an important gap between the academic use and the routine applications of geostatistics. The reasons for this gap are diverse. These include aspects related to the hydrogeology consulting market, technical reasons such as the lack of widely available software, but also a number of misconceptions. A change in this situation requires acting at different levels. First, regulators must be convinced of the benefit of using geostatistics. Second, the economic potential of the approach must be emphasized to customers. Third, the relevance of the theories needs to be increased. Last, but not least, software, data sets, and computing infrastructure such as grid computing need to be widely available.

  7. Geostatistical analysis of data on air temperature and plant phenology from Baden-Württemberg (Germany) as a basis for regional scaled models of climate change.

    PubMed

    Schröder, Winfried; Schmidt, Gunther; Hasenclever, Judith

    2006-09-01

    The rise of the air temperature is assured to be part of the global climatic change, but there is still a lack of knowledge about its effects at a regional scale. The article tackles the correlation of air temperature with the phenology of selected plants by the example of Baden-Württemberg to provide a spatial valid data base for regional climate change models. To this end, the data on air temperature and plant phenology, gathered from measurement sites without congruent coverage, were correlated after performing geostatistical analysis and estimation. In addition, geostatistics are used to analyze and cartographically depict the spatial structure of the phenology of plants in spring and in summer. The statistical analysis reveals a significant relationship between the rising air temperature and the earlier beginning of phenological phases like blooming or fruit maturation: From 1991 to 1999 spring time, as indicated by plant phenology, has begun up to 15 days earlier than from 1961 to 1990. As shown by geostatistics, this holds true for the whole territory of Baden-Württemberg. The effects of the rise of air temperature should be investigated not only by monitoring biological individuals, as for example plants, but on an ecosystem level as well. In Germany, the environmental monitoring should be supplemented by the study of the effects of the climatic change in ecosystems. Because air temperature and humidity have a great influence on the temporal and spatial distribution of pathogen carriers (vectors) and pathogens, mapping of the environmental determinants of vector and pathogen distribution in space and time should be performed in order to identify hot spots for risk assessment and further detailed epidemiological studies.

  8. Large-scale inverse model analyses employing fast randomized data reduction

    NASA Astrophysics Data System (ADS)

    Lin, Youzuo; Le, Ellen B.; O'Malley, Daniel; Vesselinov, Velimir V.; Bui-Thanh, Tan

    2017-08-01

    When the number of observations is large, it is computationally challenging to apply classical inverse modeling techniques. We have developed a new computationally efficient technique for solving inverse problems with a large number of observations (e.g., on the order of 107 or greater). Our method, which we call the randomized geostatistical approach (RGA), is built upon the principal component geostatistical approach (PCGA). We employ a data reduction technique combined with the PCGA to improve the computational efficiency and reduce the memory usage. Specifically, we employ a randomized numerical linear algebra technique based on a so-called "sketching" matrix to effectively reduce the dimension of the observations without losing the information content needed for the inverse analysis. In this way, the computational and memory costs for RGA scale with the information content rather than the size of the calibration data. Our algorithm is coded in Julia and implemented in the MADS open-source high-performance computational framework (http://mads.lanl.gov). We apply our new inverse modeling method to invert for a synthetic transmissivity field. Compared to a standard geostatistical approach (GA), our method is more efficient when the number of observations is large. Most importantly, our method is capable of solving larger inverse problems than the standard GA and PCGA approaches. Therefore, our new model inversion method is a powerful tool for solving large-scale inverse problems. The method can be applied in any field and is not limited to hydrogeological applications such as the characterization of aquifer heterogeneity.

  9. Characterization and geostatistical mapping of water salinity: A case study of terminal complex in the Oued Righ Valley (southern Algeria)

    NASA Astrophysics Data System (ADS)

    Belkesier, Mohamed Saleh; Zeddouri, Aziez; Halassa, Younes; Kechiched, Rabah

    2018-05-01

    The region of Oued Righ contains large quantities of groundwater hosted by the three aquifers: the Terminal Complex (CT), the Continental Intercalary (CI) and the phreatic aquifer. The present study is focused on the water from CT aquifer in order to characterize their salinity using geostatistical tool for maping. Indeed, water in this aquifer show a high mineralization exceeding the OMS standards. The main hydro-chemical facies of this water is Chloride-Sodium and Sulfate-Sodium. The elementary statistics have been performed on the physico-chemical analysis from 97 wells whereas 766 wells were analyzed on salinity and are used for the geostatistical mapping. The obtained results show a spatial evolution of the salinity toward the direction South to the North. The salinity is locally strong in the central part of Oued Righ valley. The non-parametric geostatistic of indicator kriging was performed on the salinity data using a cut-off of 5230 mg/l which represents the average value in the studied area. The indicator Kriging allows the estimation of salinity probabilities I (5230 mg / l) in the water of the CT aquifer using bloc model (500 x 500 m). The automatic mapping is used to visualize the distribution of the kriged probabilities of salinity. These results can help to ensure a rational and a selective exploitation of groundwater according the salinity contents.

  10. Mine planning and emission control strategies using geostatistics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Martino, F.; Kim, Y.C.

    1983-03-01

    This paper reviews the past four years' research efforts performed jointly by the University of Arizona and the Homer City Owners in which geostatistics were applied to solve various problems associated with coal characterization, mine planning, and development of emission control strategies. Because geostatistics is the only technique which can quantify the degree of confidence associated with a given estimate (or prediction), it played an important role throughout the research efforts. Through geostatistics, it was learned that there is an urgent need for closely spaced sample information, if short-term coal quality predictions are to be made for mine planning purposes.

  11. The use of sequential indicator simulation to characterize geostatistical uncertainty; Yucca Mountain Site Characterization Project

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hansen, K.M.

    1992-10-01

    Sequential indicator simulation (SIS) is a geostatistical technique designed to aid in the characterization of uncertainty about the structure or behavior of natural systems. This report discusses a simulation experiment designed to study the quality of uncertainty bounds generated using SIS. The results indicate that, while SIS may produce reasonable uncertainty bounds in many situations, factors like the number and location of available sample data, the quality of variogram models produced by the user, and the characteristics of the geologic region to be modeled, can all have substantial effects on the accuracy and precision of estimated confidence limits. It ismore » recommended that users of SIS conduct validation studies for the technique on their particular regions of interest before accepting the output uncertainty bounds.« less

  12. Bayesian structural equation modeling in sport and exercise psychology.

    PubMed

    Stenling, Andreas; Ivarsson, Andreas; Johnson, Urban; Lindwall, Magnus

    2015-08-01

    Bayesian statistics is on the rise in mainstream psychology, but applications in sport and exercise psychology research are scarce. In this article, the foundations of Bayesian analysis are introduced, and we will illustrate how to apply Bayesian structural equation modeling in a sport and exercise psychology setting. More specifically, we contrasted a confirmatory factor analysis on the Sport Motivation Scale II estimated with the most commonly used estimator, maximum likelihood, and a Bayesian approach with weakly informative priors for cross-loadings and correlated residuals. The results indicated that the model with Bayesian estimation and weakly informative priors provided a good fit to the data, whereas the model estimated with a maximum likelihood estimator did not produce a well-fitting model. The reasons for this discrepancy between maximum likelihood and Bayesian estimation are discussed as well as potential advantages and caveats with the Bayesian approach.

  13. Geostatistics and petroleum geology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hohn, M.E.

    1988-01-01

    The book reviewed is designed as a practical guide to geostatistics or kriging for the petroleum geologists. The author's aim in the book is to explain geostatistics as a working tool for petroleum geologists through extensive use of case-study material mostly drawn from his own research in gas potential evaluation in West Virginia. Theory and mathematics are pared down to immediate needs.

  14. Bayesian model reduction and empirical Bayes for group (DCM) studies

    PubMed Central

    Friston, Karl J.; Litvak, Vladimir; Oswal, Ashwini; Razi, Adeel; Stephan, Klaas E.; van Wijk, Bernadette C.M.; Ziegler, Gabriel; Zeidman, Peter

    2016-01-01

    This technical note describes some Bayesian procedures for the analysis of group studies that use nonlinear models at the first (within-subject) level – e.g., dynamic causal models – and linear models at subsequent (between-subject) levels. Its focus is on using Bayesian model reduction to finesse the inversion of multiple models of a single dataset or a single (hierarchical or empirical Bayes) model of multiple datasets. These applications of Bayesian model reduction allow one to consider parametric random effects and make inferences about group effects very efficiently (in a few seconds). We provide the relatively straightforward theoretical background to these procedures and illustrate their application using a worked example. This example uses a simulated mismatch negativity study of schizophrenia. We illustrate the robustness of Bayesian model reduction to violations of the (commonly used) Laplace assumption in dynamic causal modelling and show how its recursive application can facilitate both classical and Bayesian inference about group differences. Finally, we consider the application of these empirical Bayesian procedures to classification and prediction. PMID:26569570

  15. Performance prediction using geostatistics and window reservoir simulation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fontanilla, J.P.; Al-Khalawi, A.A.; Johnson, S.G.

    1995-11-01

    This paper is the first window model study in the northern area of a large carbonate reservoir in Saudi Arabia. It describes window reservoir simulation with geostatistics to model uneven water encroachment in the southwest producing area of the northern portion of the reservoir. In addition, this paper describes performance predictions that investigate the sweep efficiency of the current peripheral waterflood. A 50 x 50 x 549 (240 m. x 260 m. x 0.15 m. average grid block size) geological model was constructed with geostatistics software. Conditional simulation was used to obtain spatial distributions of porosity and volume of dolomite.more » Core data transforms were used to obtain horizontal and vertical permeability distributions. Simple averaging techniques were used to convert the 549-layer geological model to a 50 x 50 x 10 (240 m. x 260 m. x 8 m. average grid block size) window reservoir simulation model. Flux injectors and flux producers were assigned to the outermost grid blocks. Historical boundary flux rates were obtained from a coarsely-ridded full-field model. Pressure distribution, water cuts, GORs, and recent flowmeter data were history matched. Permeability correction factors and numerous parameter adjustments were required to obtain the final history match. The permeability correction factors were based on pressure transient permeability-thickness analyses. The prediction phase of the study evaluated the effects of infill drilling, the use of artificial lifts, workovers, horizontal wells, producing rate constraints, and tight zone development to formulate depletion strategies for the development of this area. The window model will also be used to investigate day-to-day reservoir management problems in this area.« less

  16. An introduction to using Bayesian linear regression with clinical data.

    PubMed

    Baldwin, Scott A; Larson, Michael J

    2017-11-01

    Statistical training psychology focuses on frequentist methods. Bayesian methods are an alternative to standard frequentist methods. This article provides researchers with an introduction to fundamental ideas in Bayesian modeling. We use data from an electroencephalogram (EEG) and anxiety study to illustrate Bayesian models. Specifically, the models examine the relationship between error-related negativity (ERN), a particular event-related potential, and trait anxiety. Methodological topics covered include: how to set up a regression model in a Bayesian framework, specifying priors, examining convergence of the model, visualizing and interpreting posterior distributions, interval estimates, expected and predicted values, and model comparison tools. We also discuss situations where Bayesian methods can outperform frequentist methods as well has how to specify more complicated regression models. Finally, we conclude with recommendations about reporting guidelines for those using Bayesian methods in their own research. We provide data and R code for replicating our analyses. Copyright © 2017 Elsevier Ltd. All rights reserved.

  17. Assessment of nitrate pollution in the Grand Morin aquifers (France): combined use of geostatistics and physically based modeling.

    PubMed

    Flipo, Nicolas; Jeannée, Nicolas; Poulin, Michel; Even, Stéphanie; Ledoux, Emmanuel

    2007-03-01

    The objective of this work is to combine several approaches to better understand nitrate fate in the Grand Morin aquifers (2700 km(2)), part of the Seine basin. cawaqs results from the coupling of the hydrogeological model newsam with the hydrodynamic and biogeochemical model of river ProSe. cawaqs is coupled with the agronomic model Stics in order to simulate nitrate migration in basins. First, kriging provides a satisfactory representation of aquifer nitrate contamination from local observations, to set initial conditions for the physically based model. Then associated confidence intervals, derived from data using geostatistics, are used to validate cawaqs results. Results and evaluation obtained from the combination of these approaches are given (period 1977-1988). Then cawaqs is used to simulate nitrate fate for a 20-year period (1977-1996). The mean nitrate concentrations increase in aquifers is 0.09 mgN L(-1)yr(-1), resulting from an average infiltration flux of 3500 kgN.km(-2)yr(-1).

  18. Effects of error covariance structure on estimation of model averaging weights and predictive performance

    USGS Publications Warehouse

    Lu, Dan; Ye, Ming; Meyer, Philip D.; Curtis, Gary P.; Shi, Xiaoqing; Niu, Xu-Feng; Yabusaki, Steve B.

    2013-01-01

    When conducting model averaging for assessing groundwater conceptual model uncertainty, the averaging weights are often evaluated using model selection criteria such as AIC, AICc, BIC, and KIC (Akaike Information Criterion, Corrected Akaike Information Criterion, Bayesian Information Criterion, and Kashyap Information Criterion, respectively). However, this method often leads to an unrealistic situation in which the best model receives overwhelmingly large averaging weight (close to 100%), which cannot be justified by available data and knowledge. It was found in this study that this problem was caused by using the covariance matrix, CE, of measurement errors for estimating the negative log likelihood function common to all the model selection criteria. This problem can be resolved by using the covariance matrix, Cek, of total errors (including model errors and measurement errors) to account for the correlation between the total errors. An iterative two-stage method was developed in the context of maximum likelihood inverse modeling to iteratively infer the unknown Cek from the residuals during model calibration. The inferred Cek was then used in the evaluation of model selection criteria and model averaging weights. While this method was limited to serial data using time series techniques in this study, it can be extended to spatial data using geostatistical techniques. The method was first evaluated in a synthetic study and then applied to an experimental study, in which alternative surface complexation models were developed to simulate column experiments of uranium reactive transport. It was found that the total errors of the alternative models were temporally correlated due to the model errors. The iterative two-stage method using Cekresolved the problem that the best model receives 100% model averaging weight, and the resulting model averaging weights were supported by the calibration results and physical understanding of the alternative models. Using Cek obtained from the iterative two-stage method also improved predictive performance of the individual models and model averaging in both synthetic and experimental studies.

  19. Rural Poverty Dynamics and Refugee Communities in South Africa: A Spatial-Temporal Model.

    PubMed

    Sartorius, Kurt; Sartorius, Benn; Tollman, Stephen; Schatz, Enid; Kirsten, Johann; Collinson, Mark

    2013-01-01

    The assimilation of refugees into their host community economic structures is often problematic. The paper investigates the ability of refugees in rural South Africa to accumulate assets over time relative to their host community. Bayesian spatial-temporal modelling was employed to analyse a longitudinal database that indicated that the asset accumulation rate of former Mozambican refugee households was similar to their host community; however, they were unable to close the wealth gap. A series of geo-statistical wealth maps illustrate that there is a spatial element to the higher levels of absolute poverty in the former refugee villages. The primary reason for this is their physical location in drier conditions that are established further away from facilities and infrastructure. Neighbouring South African villages in close proximity, however, display lower levels of absolute poverty, suggesting that the spatial location of the refugees only partially explains their disadvantaged situation. In this regard, the results indicate that the wealth of former refugee households continues to be more compromised by higher mortality levels, poorer education, and less access to high-return employment opportunities. The long-term impact of low initial asset status appears to be perpetuated in this instance by difficulties in obtaining legal status in order to access state pensions, facilities, and opportunities. The usefulness of the results is that they can be used to sharpen the targeting of differentiated policy in a given geographical area for refugee communities in rural Africa. Copyright © 2011 John Wiley & Sons, Ltd.

  20. Rural Poverty Dynamics and Refugee Communities in South Africa: A Spatial–Temporal Model

    PubMed Central

    Sartorius, Kurt; Sartorius, Benn; Tollman, Stephen; Schatz, Enid; Kirsten, Johann; Collinson, Mark

    2013-01-01

    The assimilation of refugees into their host community economic structures is often problematic. The paper investigates the ability of refugees in rural South Africa to accumulate assets over time relative to their host community. Bayesian spatial–temporal modelling was employed to analyse a longitudinal database that indicated that the asset accumulation rate of former Mozambican refugee households was similar to their host community; however, they were unable to close the wealth gap. A series of geo-statistical wealth maps illustrate that there is a spatial element to the higher levels of absolute poverty in the former refugee villages. The primary reason for this is their physical location in drier conditions that are established further away from facilities and infrastructure. Neighbouring South African villages in close proximity, however, display lower levels of absolute poverty, suggesting that the spatial location of the refugees only partially explains their disadvantaged situation. In this regard, the results indicate that the wealth of former refugee households continues to be more compromised by higher mortality levels, poorer education, and less access to high-return employment opportunities. The long-term impact of low initial asset status appears to be perpetuated in this instance by difficulties in obtaining legal status in order to access state pensions, facilities, and opportunities. The usefulness of the results is that they can be used to sharpen the targeting of differentiated policy in a given geographical area for refugee communities in rural Africa. Copyright © 2011 John Wiley & Sons, Ltd. PMID:24348199

  1. Siting Background Towers to Characterize Incoming Air for Urban Greenhouse Gas Estimation: A Case Study in the Washington, DC/Baltimore Area

    NASA Astrophysics Data System (ADS)

    Mueller, K.; Yadav, V.; Lopez-Coto, I.; Karion, A.; Gourdji, S.; Martin, C.; Whetstone, J.

    2018-03-01

    There is increased interest in understanding urban greenhouse gas (GHG) emissions. To accurately estimate city emissions, the influence of extraurban fluxes must first be removed from urban greenhouse gas (GHG) observations. This is especially true for regions, such as the U.S. Northeastern Corridor-Baltimore/Washington, DC (NEC-B/W), downwind of large fluxes. To help site background towers for the NEC-B/W, we use a coupled Bayesian Information Criteria and geostatistical regression approach to help site four background locations that best explain CO2 variability due to extraurban fluxes modeled at 12 urban towers. The synthetic experiment uses an atmospheric transport and dispersion model coupled with two different flux inventories to create modeled observations and evaluate 15 candidate towers located along the urban domain for February and July 2013. The analysis shows that the average ratios of extraurban inflow to total modeled enhancements at urban towers are 21% to 36% in February and 31% to 43% in July. In July, the incoming air dominates the total variability of synthetic enhancements at the urban towers (R2 = 0.58). Modeled observations from the selected background towers generally capture the variability in the synthetic CO2 enhancements at urban towers (R2 = 0.75, root-mean-square error (RMSE) = 3.64 ppm; R2 = 0.43, RMSE = 4.96 ppm for February and July). However, errors associated with representing background air can be up to 10 ppm for any given observation even with an optimal background tower configuration. More sophisticated methods may be necessary to represent background air to accurately estimate urban GHG emissions.

  2. Prediction of sedimentary facies of x-oilfield in northwest of China by geostatistical inversion

    NASA Astrophysics Data System (ADS)

    Lei, Zhao; Ling, Ke; Tingting, He

    2017-03-01

    In the early stage of oilfield development, there are only a few wells and well spacing can reach several kilometers. for the alluvial fans and other heterogeneous reservoirs, information from wells alone is not sufficient to derive detailed reservoir information. In this paper, the method of calculating sand thickness through geostatistics inversion is studied, and quantitative relationships between each sedimentary micro-facies are analyzed by combining with single well sedimentary facies. Further, the sedimentary facies plane distribution based on seismic inversion is obtained by combining with sedimentary model, providing the geological basis for the next exploration and deployment.

  3. Combining area-based and individual-level data in the geostatistical mapping of late-stage cancer incidence.

    PubMed

    Goovaerts, Pierre

    2009-01-01

    This paper presents a geostatistical approach to incorporate individual-level data (e.g. patient residences) and area-based data (e.g. rates recorded at census tract level) into the mapping of late-stage cancer incidence, with an application to breast cancer in three Michigan counties. Spatial trends in cancer incidence are first estimated from census data using area-to-point binomial kriging. This prior model is then updated using indicator kriging and individual-level data. Simulation studies demonstrate the benefits of this two-step approach over methods (kernel density estimation and indicator kriging) that process only residence data.

  4. G6PD Deficiency Prevalence and Estimates of Affected Populations in Malaria Endemic Countries: A Geostatistical Model-Based Map

    PubMed Central

    Howes, Rosalind E.; Piel, Frédéric B.; Patil, Anand P.; Nyangiri, Oscar A.; Gething, Peter W.; Dewi, Mewahyu; Hogg, Mariana M.; Battle, Katherine E.; Padilla, Carmencita D.; Baird, J. Kevin; Hay, Simon I.

    2012-01-01

    Background Primaquine is a key drug for malaria elimination. In addition to being the only drug active against the dormant relapsing forms of Plasmodium vivax, primaquine is the sole effective treatment of infectious P. falciparum gametocytes, and may interrupt transmission and help contain the spread of artemisinin resistance. However, primaquine can trigger haemolysis in patients with a deficiency in glucose-6-phosphate dehydrogenase (G6PDd). Poor information is available about the distribution of individuals at risk of primaquine-induced haemolysis. We present a continuous evidence-based prevalence map of G6PDd and estimates of affected populations, together with a national index of relative haemolytic risk. Methods and Findings Representative community surveys of phenotypic G6PDd prevalence were identified for 1,734 spatially unique sites. These surveys formed the evidence-base for a Bayesian geostatistical model adapted to the gene's X-linked inheritance, which predicted a G6PDd allele frequency map across malaria endemic countries (MECs) and generated population-weighted estimates of affected populations. Highest median prevalence (peaking at 32.5%) was predicted across sub-Saharan Africa and the Arabian Peninsula. Although G6PDd prevalence was generally lower across central and southeast Asia, rarely exceeding 20%, the majority of G6PDd individuals (67.5% median estimate) were from Asian countries. We estimated a G6PDd allele frequency of 8.0% (interquartile range: 7.4–8.8) across MECs, and 5.3% (4.4–6.7) within malaria-eliminating countries. The reliability of the map is contingent on the underlying data informing the model; population heterogeneity can only be represented by the available surveys, and important weaknesses exist in the map across data-sparse regions. Uncertainty metrics are used to quantify some aspects of these limitations in the map. Finally, we assembled a database of G6PDd variant occurrences to inform a national-level index of relative G6PDd haemolytic risk. Asian countries, where variants were most severe, had the highest relative risks from G6PDd. Conclusions G6PDd is widespread and spatially heterogeneous across most MECs where primaquine would be valuable for malaria control and elimination. The maps and population estimates presented here reflect potential risk of primaquine-associated harm. In the absence of non-toxic alternatives to primaquine, these results represent additional evidence to help inform safe use of this valuable, yet dangerous, component of the malaria-elimination toolkit. Please see later in the article for the Editors' Summary PMID:23152723

  5. G6PD deficiency prevalence and estimates of affected populations in malaria endemic countries: a geostatistical model-based map.

    PubMed

    Howes, Rosalind E; Piel, Frédéric B; Patil, Anand P; Nyangiri, Oscar A; Gething, Peter W; Dewi, Mewahyu; Hogg, Mariana M; Battle, Katherine E; Padilla, Carmencita D; Baird, J Kevin; Hay, Simon I

    2012-01-01

    Primaquine is a key drug for malaria elimination. In addition to being the only drug active against the dormant relapsing forms of Plasmodium vivax, primaquine is the sole effective treatment of infectious P. falciparum gametocytes, and may interrupt transmission and help contain the spread of artemisinin resistance. However, primaquine can trigger haemolysis in patients with a deficiency in glucose-6-phosphate dehydrogenase (G6PDd). Poor information is available about the distribution of individuals at risk of primaquine-induced haemolysis. We present a continuous evidence-based prevalence map of G6PDd and estimates of affected populations, together with a national index of relative haemolytic risk. Representative community surveys of phenotypic G6PDd prevalence were identified for 1,734 spatially unique sites. These surveys formed the evidence-base for a Bayesian geostatistical model adapted to the gene's X-linked inheritance, which predicted a G6PDd allele frequency map across malaria endemic countries (MECs) and generated population-weighted estimates of affected populations. Highest median prevalence (peaking at 32.5%) was predicted across sub-Saharan Africa and the Arabian Peninsula. Although G6PDd prevalence was generally lower across central and southeast Asia, rarely exceeding 20%, the majority of G6PDd individuals (67.5% median estimate) were from Asian countries. We estimated a G6PDd allele frequency of 8.0% (interquartile range: 7.4-8.8) across MECs, and 5.3% (4.4-6.7) within malaria-eliminating countries. The reliability of the map is contingent on the underlying data informing the model; population heterogeneity can only be represented by the available surveys, and important weaknesses exist in the map across data-sparse regions. Uncertainty metrics are used to quantify some aspects of these limitations in the map. Finally, we assembled a database of G6PDd variant occurrences to inform a national-level index of relative G6PDd haemolytic risk. Asian countries, where variants were most severe, had the highest relative risks from G6PDd. G6PDd is widespread and spatially heterogeneous across most MECs where primaquine would be valuable for malaria control and elimination. The maps and population estimates presented here reflect potential risk of primaquine-associated harm. In the absence of non-toxic alternatives to primaquine, these results represent additional evidence to help inform safe use of this valuable, yet dangerous, component of the malaria-elimination toolkit. Please see later in the article for the Editors' Summary.

  6. Role of malnutrition and parasite infections in the spatial variation in children's anaemia risk in northern Angola.

    PubMed

    Soares Magalhães, Ricardo J; Langa, Antonio; Pedro, João Mário; Sousa-Figueiredo, José Carlos; Clements, Archie C A; Vaz Nery, Susana

    2013-05-01

    Anaemia is known to have an impact on child development and mortality and is a severe public health problem in most countries in sub-Saharan Africa. We investigated the consistency between ecological and individual-level approaches to anaemia mapping by building spatial anaemia models for children aged ≤15 years using different modelling approaches. We aimed to (i) quantify the role of malnutrition, malaria, Schistosoma haematobium and soil-transmitted helminths (STHs) in anaemia endemicity; and (ii) develop a high resolution predictive risk map of anaemia for the municipality of Dande in northern Angola. We used parasitological survey data for children aged ≤15 years to build Bayesian geostatistical models of malaria (PfPR≤15), S. haematobium, Ascaris lumbricoides and Trichuris trichiura and predict small-scale spatial variations in these infections. Malnutrition, PfPR≤15, and S. haematobium infections were significantly associated with anaemia risk. An estimated 12.5%, 15.6% and 9.8% of anaemia cases could be averted by treating malnutrition, malaria and S. haematobium, respectively. Spatial clusters of high risk of anaemia (>86%) were identified. Using an individual-level approach to anaemia mapping at a small spatial scale, we found that anaemia in children aged ≤15 years is highly heterogeneous and that malnutrition and parasitic infections are important contributors to the spatial variation in anaemia risk. The results presented in this study can help inform the integration of the current provincial malaria control programme with ancillary micronutrient supplementation and control of neglected tropical diseases such as urogenital schistosomiasis and STH infections.

  7. GY SAMPLING THEORY AND GEOSTATISTICS: ALTERNATE MODELS OF VARIABILITY IN CONTINUOUS MEDIA

    EPA Science Inventory



    In the sampling theory developed by Pierre Gy, sample variability is modeled as the sum of a set of seven discrete error components. The variogram used in geostatisties provides an alternate model in which several of Gy's error components are combined in a continuous mode...

  8. Development and comparison in uncertainty assessment based Bayesian modularization method in hydrological modeling

    NASA Astrophysics Data System (ADS)

    Li, Lu; Xu, Chong-Yu; Engeland, Kolbjørn

    2013-04-01

    SummaryWith respect to model calibration, parameter estimation and analysis of uncertainty sources, various regression and probabilistic approaches are used in hydrological modeling. A family of Bayesian methods, which incorporates different sources of information into a single analysis through Bayes' theorem, is widely used for uncertainty assessment. However, none of these approaches can well treat the impact of high flows in hydrological modeling. This study proposes a Bayesian modularization uncertainty assessment approach in which the highest streamflow observations are treated as suspect information that should not influence the inference of the main bulk of the model parameters. This study includes a comprehensive comparison and evaluation of uncertainty assessments by our new Bayesian modularization method and standard Bayesian methods using the Metropolis-Hastings (MH) algorithm with the daily hydrological model WASMOD. Three likelihood functions were used in combination with standard Bayesian method: the AR(1) plus Normal model independent of time (Model 1), the AR(1) plus Normal model dependent on time (Model 2) and the AR(1) plus Multi-normal model (Model 3). The results reveal that the Bayesian modularization method provides the most accurate streamflow estimates measured by the Nash-Sutcliffe efficiency and provide the best in uncertainty estimates for low, medium and entire flows compared to standard Bayesian methods. The study thus provides a new approach for reducing the impact of high flows on the discharge uncertainty assessment of hydrological models via Bayesian method.

  9. Bayesian models: A statistical primer for ecologists

    USGS Publications Warehouse

    Hobbs, N. Thompson; Hooten, Mevin B.

    2015-01-01

    Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods—in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach.Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probability and develops a step-by-step sequence of connected ideas, including basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and inference from single and multiple models. This unique book places less emphasis on computer coding, favoring instead a concise presentation of the mathematical statistics needed to understand how and why Bayesian analysis works. It also explains how to write out properly formulated hierarchical Bayesian models and use them in computing, research papers, and proposals.This primer enables ecologists to understand the statistical principles behind Bayesian modeling and apply them to research, teaching, policy, and management.Presents the mathematical and statistical foundations of Bayesian modeling in language accessible to non-statisticiansCovers basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and moreDeemphasizes computer coding in favor of basic principlesExplains how to write out properly factored statistical expressions representing Bayesian models

  10. Geostatistical three-dimensional modeling of oolite shoals, St. Louis Limestone, southwest Kansas

    USGS Publications Warehouse

    Qi, L.; Carr, T.R.; Goldstein, R.H.

    2007-01-01

    In the Hugoton embayment of southwestern Kansas, reservoirs composed of relatively thin (<4 m; <13.1 ft) oolitic deposits within the St. Louis Limestone have produced more than 300 million bbl of oil. The geometry and distribution of oolitic deposits control the heterogeneity of the reservoirs, resulting in exploration challenges and relatively low recovery. Geostatistical three-dimensional (3-D) models were constructed to quantify the geometry and spatial distribution of oolitic reservoirs, and the continuity of flow units within Big Bow and Sand Arroyo Creek fields. Lithofacies in uncored wells were predicted from digital logs using a neural network. The tilting effect from the Laramide orogeny was removed to construct restored structural surfaces at the time of deposition. Well data and structural maps were integrated to build 3-D models of oolitic reservoirs using stochastic simulations with geometry data. Three-dimensional models provide insights into the distribution, the external and internal geometry of oolitic deposits, and the sedimentologic processes that generated reservoir intervals. The structural highs and general structural trend had a significant impact on the distribution and orientation of the oolitic complexes. The depositional pattern and connectivity analysis suggest an overall aggradation of shallow-marine deposits during pulses of relative sea level rise followed by deepening near the top of the St. Louis Limestone. Cemented oolitic deposits were modeled as barriers and baffles and tend to concentrate at the edge of oolitic complexes. Spatial distribution of porous oolitic deposits controls the internal geometry of rock properties. Integrated geostatistical modeling methods can be applicable to other complex carbonate or siliciclastic reservoirs in shallow-marine settings. Copyright ?? 2007. The American Association of Petroleum Geologists. All rights reserved.

  11. Taking a statistical approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wild, M.; Rouhani, S.

    1995-02-01

    A typical site investigation entails extensive sampling and monitoring. In the past, sampling plans have been designed on purely ad hoc bases, leading to significant expenditures and, in some cases, collection of redundant information. In many instances, sampling costs exceed the true worth of the collected data. The US Environmental Protection Agency (EPA) therefore has advocated the use of geostatistics to provide a logical framework for sampling and analysis of environmental data. Geostatistical methodology uses statistical techniques for the spatial analysis of a variety of earth-related data. The use of geostatistics was developed by the mining industry to estimate oremore » concentrations. The same procedure is effective in quantifying environmental contaminants in soils for risk assessments. Unlike classical statistical techniques, geostatistics offers procedures to incorporate the underlying spatial structure of the investigated field. Sample points spaced close together tend to be more similar than samples spaced further apart. This can guide sampling strategies and determine complex contaminant distributions. Geostatistic techniques can be used to evaluate site conditions on the basis of regular, irregular, random and even spatially biased samples. In most environmental investigations, it is desirable to concentrate sampling in areas of known or suspected contamination. The rigorous mathematical procedures of geostatistics allow for accurate estimates at unsampled locations, potentially reducing sampling requirements. The use of geostatistics serves as a decision-aiding and planning tool and can significantly reduce short-term site assessment costs, long-term sampling and monitoring needs, as well as lead to more accurate and realistic remedial design criteria.« less

  12. Bayesian model reduction and empirical Bayes for group (DCM) studies.

    PubMed

    Friston, Karl J; Litvak, Vladimir; Oswal, Ashwini; Razi, Adeel; Stephan, Klaas E; van Wijk, Bernadette C M; Ziegler, Gabriel; Zeidman, Peter

    2016-03-01

    This technical note describes some Bayesian procedures for the analysis of group studies that use nonlinear models at the first (within-subject) level - e.g., dynamic causal models - and linear models at subsequent (between-subject) levels. Its focus is on using Bayesian model reduction to finesse the inversion of multiple models of a single dataset or a single (hierarchical or empirical Bayes) model of multiple datasets. These applications of Bayesian model reduction allow one to consider parametric random effects and make inferences about group effects very efficiently (in a few seconds). We provide the relatively straightforward theoretical background to these procedures and illustrate their application using a worked example. This example uses a simulated mismatch negativity study of schizophrenia. We illustrate the robustness of Bayesian model reduction to violations of the (commonly used) Laplace assumption in dynamic causal modelling and show how its recursive application can facilitate both classical and Bayesian inference about group differences. Finally, we consider the application of these empirical Bayesian procedures to classification and prediction. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  13. A study of finite mixture model: Bayesian approach on financial time series data

    NASA Astrophysics Data System (ADS)

    Phoong, Seuk-Yen; Ismail, Mohd Tahir

    2014-07-01

    Recently, statistician have emphasized on the fitting finite mixture model by using Bayesian method. Finite mixture model is a mixture of distributions in modeling a statistical distribution meanwhile Bayesian method is a statistical method that use to fit the mixture model. Bayesian method is being used widely because it has asymptotic properties which provide remarkable result. In addition, Bayesian method also shows consistency characteristic which means the parameter estimates are close to the predictive distributions. In the present paper, the number of components for mixture model is studied by using Bayesian Information Criterion. Identify the number of component is important because it may lead to an invalid result. Later, the Bayesian method is utilized to fit the k-component mixture model in order to explore the relationship between rubber price and stock market price for Malaysia, Thailand, Philippines and Indonesia. Lastly, the results showed that there is a negative effect among rubber price and stock market price for all selected countries.

  14. Incorporating reservoir heterogeneity with geostatistics to investigate waterflood recoveries

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wolcott, D.S.; Chopra, A.K.

    1993-03-01

    This paper presents an investigation of infill drilling performance and reservoir continuity with geostatistics and a reservoir simulator. The geostatistical technique provides many possible realizations and realistic descriptions of reservoir heterogeneity. Correlation between recovery efficiency and thickness of individual sand subunits is shown. Additional recovery from infill drilling results from thin, discontinuous subunits. The technique may be applied to variations in continuity for other sandstone reservoirs.

  15. Combined SEM/AVS and attenuation of concentration models for the assessment of bioavailability and mobility of metals in sediments of Sepetiba Bay (SE Brazil).

    PubMed

    Ribeiro, Andreza Portella; Figueiredo, Ana Maria Graciano; dos Santos, José Osman; Dantas, Elizabeth; Cotrim, Marycel Elena Barboza; Figueira, Rubens Cesar Lopes; Silva Filho, Emmanoel V; Wasserman, Julio Cesar

    2013-03-15

    This study proposes a new methodology to study contamination, bioavailability and mobility of metals (Cd, Cu, Ni, Pb, and Zn) using chemical and geostatistics approaches in marine sediments of Sepetiba Bay (SE Brazil). The chemical model of SEM (simultaneously extracted metals)/AVS (acid volatile sulfides) ratio uses a technique of cold acid extraction of metals to evaluate their bioavailability, and the geostatistical model of attenuation of concentrations estimates the mobility of metals. By coupling the two it was observed that Sepetiba Port, the urban area of Sepetiba and the riverine discharges may constitute potential sources of metals to Sepetiba Bay. The metals are concentrated in the NE area of the bay, where they tend to have their lowest mobility, as shown by the attenuation model, and are not bioavailable, as they tend to associate with sulfide and organic matter originated in the mangrove forests of nearby Guaratiba area. Copyright © 2013 Elsevier Ltd. All rights reserved.

  16. Bayesian Data-Model Fit Assessment for Structural Equation Modeling

    ERIC Educational Resources Information Center

    Levy, Roy

    2011-01-01

    Bayesian approaches to modeling are receiving an increasing amount of attention in the areas of model construction and estimation in factor analysis, structural equation modeling (SEM), and related latent variable models. However, model diagnostics and model criticism remain relatively understudied aspects of Bayesian SEM. This article describes…

  17. Comparison of geostatistical interpolation and remote sensing techniques for estimating long-term exposure to ambient PM2.5 concentrations across the continental United States.

    PubMed

    Lee, Seung-Jae; Serre, Marc L; van Donkelaar, Aaron; Martin, Randall V; Burnett, Richard T; Jerrett, Michael

    2012-12-01

    A better understanding of the adverse health effects of chronic exposure to fine particulate matter (PM2.5) requires accurate estimates of PM2.5 variation at fine spatial scales. Remote sensing has emerged as an important means of estimating PM2.5 exposures, but relatively few studies have compared remote-sensing estimates to those derived from monitor-based data. We evaluated and compared the predictive capabilities of remote sensing and geostatistical interpolation. We developed a space-time geostatistical kriging model to predict PM2.5 over the continental United States and compared resulting predictions to estimates derived from satellite retrievals. The kriging estimate was more accurate for locations that were about 100 km from a monitoring station, whereas the remote sensing estimate was more accurate for locations that were > 100 km from a monitoring station. Based on this finding, we developed a hybrid map that combines the kriging and satellite-based PM2.5 estimates. We found that for most of the populated areas of the continental United States, geostatistical interpolation produced more accurate estimates than remote sensing. The differences between the estimates resulting from the two methods, however, were relatively small. In areas with extensive monitoring networks, the interpolation may provide more accurate estimates, but in the many areas of the world without such monitoring, remote sensing can provide useful exposure estimates that perform nearly as well.

  18. A geostatistical approach for quantification of contaminant mass discharge uncertainty using multilevel sampler measurements

    NASA Astrophysics Data System (ADS)

    Li, K. Betty; Goovaerts, Pierre; Abriola, Linda M.

    2007-06-01

    Contaminant mass discharge across a control plane downstream of a dense nonaqueous phase liquid (DNAPL) source zone has great potential to serve as a metric for the assessment of the effectiveness of source zone treatment technologies and for the development of risk-based source-plume remediation strategies. However, too often the uncertainty of mass discharge estimated in the field is not accounted for in the analysis. In this paper, a geostatistical approach is proposed to estimate mass discharge and to quantify its associated uncertainty using multilevel transect measurements of contaminant concentration (C) and hydraulic conductivity (K). The approach adapts the p-field simulation algorithm to propagate and upscale the uncertainty of mass discharge from the local uncertainty models of C and K. Application of this methodology to numerically simulated transects shows that, with a regular sampling pattern, geostatistics can provide an accurate model of uncertainty for the transects that are associated with low levels of source mass removal (i.e., transects that have a large percentage of contaminated area). For high levels of mass removal (i.e., transects with a few hot spots and large areas of near-zero concentration), a total sampling area equivalent to 6˜7% of the transect is required to achieve accurate uncertainty modeling. A comparison of the results for different measurement supports indicates that samples taken with longer screen lengths may lead to less accurate models of mass discharge uncertainty. The quantification of mass discharge uncertainty, in the form of a probability distribution, will facilitate risk assessment associated with various remediation strategies.

  19. Soil moisture estimation by assimilating L-band microwave brightness temperature with geostatistics and observation localization.

    PubMed

    Han, Xujun; Li, Xin; Rigon, Riccardo; Jin, Rui; Endrizzi, Stefano

    2015-01-01

    The observation could be used to reduce the model uncertainties with data assimilation. If the observation cannot cover the whole model area due to spatial availability or instrument ability, how to do data assimilation at locations not covered by observation? Two commonly used strategies were firstly described: One is covariance localization (CL); the other is observation localization (OL). Compared with CL, OL is easy to parallelize and more efficient for large-scale analysis. This paper evaluated OL in soil moisture profile characterizations, in which the geostatistical semivariogram was used to fit the spatial correlated characteristics of synthetic L-Band microwave brightness temperature measurement. The fitted semivariogram model and the local ensemble transform Kalman filter algorithm are combined together to weight and assimilate the observations within a local region surrounding the grid cell of land surface model to be analyzed. Six scenarios were compared: 1_Obs with one nearest observation assimilated, 5_Obs with no more than five nearest local observations assimilated, and 9_Obs with no more than nine nearest local observations assimilated. The scenarios with no more than 16, 25, and 36 local observations were also compared. From the results we can conclude that more local observations involved in assimilation will improve estimations with an upper bound of 9 observations in this case. This study demonstrates the potentials of geostatistical correlation representation in OL to improve data assimilation of catchment scale soil moisture using synthetic L-band microwave brightness temperature, which cannot cover the study area fully in space due to vegetation effects.

  20. Soil Moisture Estimation by Assimilating L-Band Microwave Brightness Temperature with Geostatistics and Observation Localization

    PubMed Central

    Han, Xujun; Li, Xin; Rigon, Riccardo; Jin, Rui; Endrizzi, Stefano

    2015-01-01

    The observation could be used to reduce the model uncertainties with data assimilation. If the observation cannot cover the whole model area due to spatial availability or instrument ability, how to do data assimilation at locations not covered by observation? Two commonly used strategies were firstly described: One is covariance localization (CL); the other is observation localization (OL). Compared with CL, OL is easy to parallelize and more efficient for large-scale analysis. This paper evaluated OL in soil moisture profile characterizations, in which the geostatistical semivariogram was used to fit the spatial correlated characteristics of synthetic L-Band microwave brightness temperature measurement. The fitted semivariogram model and the local ensemble transform Kalman filter algorithm are combined together to weight and assimilate the observations within a local region surrounding the grid cell of land surface model to be analyzed. Six scenarios were compared: 1_Obs with one nearest observation assimilated, 5_Obs with no more than five nearest local observations assimilated, and 9_Obs with no more than nine nearest local observations assimilated. The scenarios with no more than 16, 25, and 36 local observations were also compared. From the results we can conclude that more local observations involved in assimilation will improve estimations with an upper bound of 9 observations in this case. This study demonstrates the potentials of geostatistical correlation representation in OL to improve data assimilation of catchment scale soil moisture using synthetic L-band microwave brightness temperature, which cannot cover the study area fully in space due to vegetation effects. PMID:25635771

  1. Bayesian multimodel inference for dose-response studies

    USGS Publications Warehouse

    Link, W.A.; Albers, P.H.

    2007-01-01

    Statistical inference in dose?response studies is model-based: The analyst posits a mathematical model of the relation between exposure and response, estimates parameters of the model, and reports conclusions conditional on the model. Such analyses rarely include any accounting for the uncertainties associated with model selection. The Bayesian inferential system provides a convenient framework for model selection and multimodel inference. In this paper we briefly describe the Bayesian paradigm and Bayesian multimodel inference. We then present a family of models for multinomial dose?response data and apply Bayesian multimodel inferential methods to the analysis of data on the reproductive success of American kestrels (Falco sparveriuss) exposed to various sublethal dietary concentrations of methylmercury.

  2. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Carle, S F

    Compositional data are represented as vector variables with individual vector components ranging between zero and a positive maximum value representing a constant sum constraint, usually unity (or 100 percent). The earth sciences are flooded with spatial distributions of compositional data, such as concentrations of major ion constituents in natural waters (e.g. mole, mass, or volume fractions), mineral percentages, ore grades, or proportions of mutually exclusive categories (e.g. a water-oil-rock system). While geostatistical techniques have become popular in earth science applications since the 1970s, very little attention has been paid to the unique mathematical properties of geostatistical formulations involving compositional variables.more » The book 'Geostatistical Analysis of Compositional Data' by Vera Pawlowsky-Glahn and Ricardo Olea (Oxford University Press, 2004), unlike any previous book on geostatistics, directly confronts the mathematical difficulties inherent to applying geostatistics to compositional variables. The book righteously justifies itself with prodigious referencing to previous work addressing nonsensical ranges of estimated values and error, spurious correlation, and singular cross-covariance matrices.« less

  3. Imprecise (fuzzy) information in geostatistics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bardossy, A.; Bogardi, I.; Kelly, W.E.

    1988-05-01

    A methodology based on fuzzy set theory for the utilization of imprecise data in geostatistics is presented. A common problem preventing a broader use of geostatistics has been the insufficient amount of accurate measurement data. In certain cases, additional but uncertain (soft) information is available and can be encoded as subjective probabilities, and then the soft kriging method can be applied (Journal, 1986). In other cases, a fuzzy encoding of soft information may be more realistic and simplify the numerical calculations. Imprecise (fuzzy) spatial information on the possible variogram is integrated into a single variogram which is used in amore » fuzzy kriging procedure. The overall uncertainty of prediction is represented by the estimation variance and the calculated membership function for each kriged point. The methodology is applied to the permeability prediction of a soil liner for hazardous waste containment. The available number of hard measurement data (20) was not enough for a classical geostatistical analysis. An additional 20 soft data made it possible to prepare kriged contour maps using the fuzzy geostatistical procedure.« less

  4. Spatial prediction of Plasmodium falciparum prevalence in Somalia

    PubMed Central

    Noor, Abdisalan M; Clements, Archie CA; Gething, Peter W; Moloney, Grainne; Borle, Mohammed; Shewchuk, Tanya; Hay, Simon I; Snow, Robert W

    2008-01-01

    Background Maps of malaria distribution are vital for optimal allocation of resources for anti-malarial activities. There is a lack of reliable contemporary malaria maps in endemic countries in sub-Saharan Africa. This problem is particularly acute in low malaria transmission countries such as those located in the horn of Africa. Methods Data from a national malaria cluster sample survey in 2005 and routine cluster surveys in 2007 were assembled for Somalia. Rapid diagnostic tests were used to examine the presence of Plasmodium falciparum parasites in finger-prick blood samples obtained from individuals across all age-groups. Bayesian geostatistical models, with environmental and survey covariates, were used to predict continuous maps of malaria prevalence across Somalia and to define the uncertainty associated with the predictions. Results For analyses the country was divided into north and south. In the north, the month of survey, distance to water, precipitation and temperature had no significant association with P. falciparum prevalence when spatial correlation was taken into account. In contrast, all the covariates, except distance to water, were significantly associated with parasite prevalence in the south. The inclusion of covariates improved model fit for the south but not for the north. Model precision was highest in the south. The majority of the country had a predicted prevalence of < 5%; areas with ≥ 5% prevalence were predominantly in the south. Conclusion The maps showed that malaria transmission in Somalia varied from hypo- to meso-endemic. However, even after including the selected covariates in the model, there still remained a considerable amount of unexplained spatial variation in parasite prevalence, indicating effects of other factors not captured in the study. Nonetheless the maps presented here provide the best contemporary information on malaria prevalence in Somalia. PMID:18717998

  5. Spatial prediction of Plasmodium falciparum prevalence in Somalia.

    PubMed

    Noor, Abdisalan M; Clements, Archie C A; Gething, Peter W; Moloney, Grainne; Borle, Mohammed; Shewchuk, Tanya; Hay, Simon I; Snow, Robert W

    2008-08-21

    Maps of malaria distribution are vital for optimal allocation of resources for anti-malarial activities. There is a lack of reliable contemporary malaria maps in endemic countries in sub-Saharan Africa. This problem is particularly acute in low malaria transmission countries such as those located in the horn of Africa. Data from a national malaria cluster sample survey in 2005 and routine cluster surveys in 2007 were assembled for Somalia. Rapid diagnostic tests were used to examine the presence of Plasmodium falciparum parasites in finger-prick blood samples obtained from individuals across all age-groups. Bayesian geostatistical models, with environmental and survey covariates, were used to predict continuous maps of malaria prevalence across Somalia and to define the uncertainty associated with the predictions. For analyses the country was divided into north and south. In the north, the month of survey, distance to water, precipitation and temperature had no significant association with P. falciparum prevalence when spatial correlation was taken into account. In contrast, all the covariates, except distance to water, were significantly associated with parasite prevalence in the south. The inclusion of covariates improved model fit for the south but not for the north. Model precision was highest in the south. The majority of the country had a predicted prevalence of < 5%; areas with > or = 5% prevalence were predominantly in the south. The maps showed that malaria transmission in Somalia varied from hypo- to meso-endemic. However, even after including the selected covariates in the model, there still remained a considerable amount of unexplained spatial variation in parasite prevalence, indicating effects of other factors not captured in the study. Nonetheless the maps presented here provide the best contemporary information on malaria prevalence in Somalia.

  6. A guide to Bayesian model selection for ecologists

    USGS Publications Warehouse

    Hooten, Mevin B.; Hobbs, N.T.

    2015-01-01

    The steady upward trend in the use of model selection and Bayesian methods in ecological research has made it clear that both approaches to inference are important for modern analysis of models and data. However, in teaching Bayesian methods and in working with our research colleagues, we have noticed a general dissatisfaction with the available literature on Bayesian model selection and multimodel inference. Students and researchers new to Bayesian methods quickly find that the published advice on model selection is often preferential in its treatment of options for analysis, frequently advocating one particular method above others. The recent appearance of many articles and textbooks on Bayesian modeling has provided welcome background on relevant approaches to model selection in the Bayesian framework, but most of these are either very narrowly focused in scope or inaccessible to ecologists. Moreover, the methodological details of Bayesian model selection approaches are spread thinly throughout the literature, appearing in journals from many different fields. Our aim with this guide is to condense the large body of literature on Bayesian approaches to model selection and multimodel inference and present it specifically for quantitative ecologists as neutrally as possible. We also bring to light a few important and fundamental concepts relating directly to model selection that seem to have gone unnoticed in the ecological literature. Throughout, we provide only a minimal discussion of philosophy, preferring instead to examine the breadth of approaches as well as their practical advantages and disadvantages. This guide serves as a reference for ecologists using Bayesian methods, so that they can better understand their options and can make an informed choice that is best aligned with their goals for inference.

  7. A geostatistical approach to predicting sulfur content in the Pittsburgh coal bed

    USGS Publications Warehouse

    Watson, W.D.; Ruppert, L.F.; Bragg, L.J.; Tewalt, S.J.

    2001-01-01

    The US Geological Survey (USGS) is completing a national assessment of coal resources in the five top coal-producing regions in the US. Point-located data provide measurements on coal thickness and sulfur content. The sample data and their geologic interpretation represent the most regionally complete and up-to-date assessment of what is known about top-producing US coal beds. The sample data are analyzed using a combination of geologic and Geographic Information System (GIS) models to estimate tonnages and qualities of the coal beds. Traditionally, GIS practitioners use contouring to represent geographical patterns of "similar" data values. The tonnage and grade of coal resources are then assessed by using the contour lines as references for interpolation. An assessment taken to this point is only indicative of resource quantity and quality. Data users may benefit from a statistical approach that would allow them to better understand the uncertainty and limitations of the sample data. To develop a quantitative approach, geostatistics were applied to the data on coal sulfur content from samples taken in the Pittsburgh coal bed (located in the eastern US, in the southwestern part of the state of Pennsylvania, and in adjoining areas in the states of Ohio and West Virginia). Geostatistical methods that account for regional and local trends were applied to blocks 2.7 mi (4.3 km) on a side. The data and geostatistics support conclusions concerning the average sulfur content and its degree of reliability at regional- and economic-block scale over the large, contiguous part of the Pittsburgh outcrop, but not to a mine scale. To validate the method, a comparison was made with the sulfur contents in sample data taken from 53 coal mines located in the study area. The comparison showed a high degree of similarity between the sulfur content in the mine samples and the sulfur content represented by the geostatistically derived contours. Published by Elsevier Science B.V.

  8. On the value of incorporating spatial statistics in large-scale geophysical inversions: the SABRe case

    NASA Astrophysics Data System (ADS)

    Kokkinaki, A.; Sleep, B. E.; Chambers, J. E.; Cirpka, O. A.; Nowak, W.

    2010-12-01

    Electrical Resistance Tomography (ERT) is a popular method for investigating subsurface heterogeneity. The method relies on measuring electrical potential differences and obtaining, through inverse modeling, the underlying electrical conductivity field, which can be related to hydraulic conductivities. The quality of site characterization strongly depends on the utilized inversion technique. Standard ERT inversion methods, though highly computationally efficient, do not consider spatial correlation of soil properties; as a result, they often underestimate the spatial variability observed in earth materials, thereby producing unrealistic subsurface models. Also, these methods do not quantify the uncertainty of the estimated properties, thus limiting their use in subsequent investigations. Geostatistical inverse methods can be used to overcome both these limitations; however, they are computationally expensive, which has hindered their wide use in practice. In this work, we compare a standard Gauss-Newton smoothness constrained least squares inversion method against the quasi-linear geostatistical approach using the three-dimensional ERT dataset of the SABRe (Source Area Bioremediation) project. The two methods are evaluated for their ability to: a) produce physically realistic electrical conductivity fields that agree with the wide range of data available for the SABRe site while being computationally efficient, and b) provide information on the spatial statistics of other parameters of interest, such as hydraulic conductivity. To explore the trade-off between inversion quality and computational efficiency, we also employ a 2.5-D forward model with corrections for boundary conditions and source singularities. The 2.5-D model accelerates the 3-D geostatistical inversion method. New adjoint equations are developed for the 2.5-D forward model for the efficient calculation of sensitivities. Our work shows that spatial statistics can be incorporated in large-scale ERT inversions to improve the inversion results without making them computationally prohibitive.

  9. On the Adequacy of Bayesian Evaluations of Categorization Models: Reply to Vanpaemel and Lee (2012)

    ERIC Educational Resources Information Center

    Wills, Andy J.; Pothos, Emmanuel M.

    2012-01-01

    Vanpaemel and Lee (2012) argued, and we agree, that the comparison of formal models can be facilitated by Bayesian methods. However, Bayesian methods neither precede nor supplant our proposals (Wills & Pothos, 2012), as Bayesian methods can be applied both to our proposals and to their polar opposites. Furthermore, the use of Bayesian methods to…

  10. Geostatistical Methods For Determination of Roughness, Topography, And Changes of Antarctic Ice Streams From SAR And Radar Altimeter Data

    NASA Technical Reports Server (NTRS)

    Herzfeld, Ute C.

    2002-01-01

    The central objective of this project has been the development of geostatistical methods fro mapping elevation and ice surface characteristics from satellite radar altimeter (RA) and Syntheitc Aperture Radar (SAR) data. The main results are an Atlas of elevation maps of Antarctica, from GEOSAT RA data and an Atlas from ERS-1 RA data, including a total of about 200 maps with 3 km grid resolution. Maps and digital terrain models are applied to monitor and study changes in Antarctic ice streams and glaciers, including Lambert Glacier/Amery Ice Shelf, Mertz and Ninnis Glaciers, Jutulstraumen Glacier, Fimbul Ice Shelf, Slessor Glacier, Williamson Glacier and others.

  11. Uncertainty aggregation and reduction in structure-material performance prediction

    NASA Astrophysics Data System (ADS)

    Hu, Zhen; Mahadevan, Sankaran; Ao, Dan

    2018-02-01

    An uncertainty aggregation and reduction framework is presented for structure-material performance prediction. Different types of uncertainty sources, structural analysis model, and material performance prediction model are connected through a Bayesian network for systematic uncertainty aggregation analysis. To reduce the uncertainty in the computational structure-material performance prediction model, Bayesian updating using experimental observation data is investigated based on the Bayesian network. It is observed that the Bayesian updating results will have large error if the model cannot accurately represent the actual physics, and that this error will be propagated to the predicted performance distribution. To address this issue, this paper proposes a novel uncertainty reduction method by integrating Bayesian calibration with model validation adaptively. The observation domain of the quantity of interest is first discretized into multiple segments. An adaptive algorithm is then developed to perform model validation and Bayesian updating over these observation segments sequentially. Only information from observation segments where the model prediction is highly reliable is used for Bayesian updating; this is found to increase the effectiveness and efficiency of uncertainty reduction. A composite rotorcraft hub component fatigue life prediction model, which combines a finite element structural analysis model and a material damage model, is used to demonstrate the proposed method.

  12. Bayesian selection of misspecified models is overconfident and may cause spurious posterior probabilities for phylogenetic trees.

    PubMed

    Yang, Ziheng; Zhu, Tianqi

    2018-02-20

    The Bayesian method is noted to produce spuriously high posterior probabilities for phylogenetic trees in analysis of large datasets, but the precise reasons for this overconfidence are unknown. In general, the performance of Bayesian selection of misspecified models is poorly understood, even though this is of great scientific interest since models are never true in real data analysis. Here we characterize the asymptotic behavior of Bayesian model selection and show that when the competing models are equally wrong, Bayesian model selection exhibits surprising and polarized behaviors in large datasets, supporting one model with full force while rejecting the others. If one model is slightly less wrong than the other, the less wrong model will eventually win when the amount of data increases, but the method may become overconfident before it becomes reliable. We suggest that this extreme behavior may be a major factor for the spuriously high posterior probabilities for evolutionary trees. The philosophical implications of our results to the application of Bayesian model selection to evaluate opposing scientific hypotheses are yet to be explored, as are the behaviors of non-Bayesian methods in similar situations.

  13. Accounting for regional background and population size in the detection of spatial clusters and outliers using geostatistical filtering and spatial neutral models: the case of lung cancer in Long Island, New York

    PubMed Central

    Goovaerts, Pierre; Jacquez, Geoffrey M

    2004-01-01

    Background Complete Spatial Randomness (CSR) is the null hypothesis employed by many statistical tests for spatial pattern, such as local cluster or boundary analysis. CSR is however not a relevant null hypothesis for highly complex and organized systems such as those encountered in the environmental and health sciences in which underlying spatial pattern is present. This paper presents a geostatistical approach to filter the noise caused by spatially varying population size and to generate spatially correlated neutral models that account for regional background obtained by geostatistical smoothing of observed mortality rates. These neutral models were used in conjunction with the local Moran statistics to identify spatial clusters and outliers in the geographical distribution of male and female lung cancer in Nassau, Queens, and Suffolk counties, New York, USA. Results We developed a typology of neutral models that progressively relaxes the assumptions of null hypotheses, allowing for the presence of spatial autocorrelation, non-uniform risk, and incorporation of spatially heterogeneous population sizes. Incorporation of spatial autocorrelation led to fewer significant ZIP codes than found in previous studies, confirming earlier claims that CSR can lead to over-identification of the number of significant spatial clusters or outliers. Accounting for population size through geostatistical filtering increased the size of clusters while removing most of the spatial outliers. Integration of regional background into the neutral models yielded substantially different spatial clusters and outliers, leading to the identification of ZIP codes where SMR values significantly depart from their regional background. Conclusion The approach presented in this paper enables researchers to assess geographic relationships using appropriate null hypotheses that account for the background variation extant in real-world systems. In particular, this new methodology allows one to identify geographic pattern above and beyond background variation. The implementation of this approach in spatial statistical software will facilitate the detection of spatial disparities in mortality rates, establishing the rationale for targeted cancer control interventions, including consideration of health services needs, and resource allocation for screening and diagnostic testing. It will allow researchers to systematically evaluate how sensitive their results are to assumptions implicit under alternative null hypotheses. PMID:15272930

  14. Bayesian Fundamentalism or Enlightenment? On the explanatory status and theoretical contributions of Bayesian models of cognition.

    PubMed

    Jones, Matt; Love, Bradley C

    2011-08-01

    The prominence of Bayesian modeling of cognition has increased recently largely because of mathematical advances in specifying and deriving predictions from complex probabilistic models. Much of this research aims to demonstrate that cognitive behavior can be explained from rational principles alone, without recourse to psychological or neurological processes and representations. We note commonalities between this rational approach and other movements in psychology - namely, Behaviorism and evolutionary psychology - that set aside mechanistic explanations or make use of optimality assumptions. Through these comparisons, we identify a number of challenges that limit the rational program's potential contribution to psychological theory. Specifically, rational Bayesian models are significantly unconstrained, both because they are uninformed by a wide range of process-level data and because their assumptions about the environment are generally not grounded in empirical measurement. The psychological implications of most Bayesian models are also unclear. Bayesian inference itself is conceptually trivial, but strong assumptions are often embedded in the hypothesis sets and the approximation algorithms used to derive model predictions, without a clear delineation between psychological commitments and implementational details. Comparing multiple Bayesian models of the same task is rare, as is the realization that many Bayesian models recapitulate existing (mechanistic level) theories. Despite the expressive power of current Bayesian models, we argue they must be developed in conjunction with mechanistic considerations to offer substantive explanations of cognition. We lay out several means for such an integration, which take into account the representations on which Bayesian inference operates, as well as the algorithms and heuristics that carry it out. We argue this unification will better facilitate lasting contributions to psychological theory, avoiding the pitfalls that have plagued previous theoretical movements.

  15. Mapping the Risk of Soil-Transmitted Helminthic Infections in the Philippines

    PubMed Central

    Leonardo, Lydia; Gray, Darren J.; Carabin, Hélène; Halton, Kate; McManus, Donald P.; Williams, Gail M.; Rivera, Pilarita; Saniel, Ofelia; Hernandez, Leda; Yakob, Laith; McGarvey, Stephen T.; Clements, Archie C. A.

    2015-01-01

    Background In order to increase the efficient allocation of soil-transmitted helminth (STH) disease control resources in the Philippines, we aimed to describe for the first time the spatial variation in the prevalence of A. lumbricoides, T. trichiura and hookworm across the country, quantify the association between the physical environment and spatial variation of STH infection and develop predictive risk maps for each infection. Methodology/Principal Findings Data on STH infection from 35,573 individuals across the country were geolocated at the barangay level and included in the analysis. The analysis was stratified geographically in two major regions: 1) Luzon and the Visayas and 2) Mindanao. Bayesian geostatistical models of STH prevalence were developed, including age and sex of individuals and environmental variables (rainfall, land surface temperature and distance to inland water bodies) as predictors, and diagnostic uncertainty was incorporated. The role of environmental variables was different between regions of the Philippines. This analysis revealed that while A. lumbricoides and T. trichiura infections were widespread and highly endemic, hookworm infections were more circumscribed to smaller foci in the Visayas and Mindanao. Conclusions/Significance This analysis revealed significant spatial variation in STH infection prevalence within provinces of the Philippines. This suggests that a spatially targeted approach to STH interventions, including mass drug administration, is warranted. When financially possible, additional STH surveys should be prioritized to high-risk areas identified by our study in Luzon. PMID:26368819

  16. Planning spatial sampling of the soil from an uncertain reconnaissance variogram

    NASA Astrophysics Data System (ADS)

    Lark, R. Murray; Hamilton, Elliott M.; Kaninga, Belinda; Maseka, Kakoma K.; Mutondo, Moola; Sakala, Godfrey M.; Watts, Michael J.

    2017-12-01

    An estimated variogram of a soil property can be used to support a rational choice of sampling intensity for geostatistical mapping. However, it is known that estimated variograms are subject to uncertainty. In this paper we address two practical questions. First, how can we make a robust decision on sampling intensity, given the uncertainty in the variogram? Second, what are the costs incurred in terms of oversampling because of uncertainty in the variogram model used to plan sampling? To achieve this we show how samples of the posterior distribution of variogram parameters, from a computational Bayesian analysis, can be used to characterize the effects of variogram parameter uncertainty on sampling decisions. We show how one can select a sample intensity so that a target value of the kriging variance is not exceeded with some specified probability. This will lead to oversampling, relative to the sampling intensity that would be specified if there were no uncertainty in the variogram parameters. One can estimate the magnitude of this oversampling by treating the tolerable grid spacing for the final sample as a random variable, given the target kriging variance and the posterior sample values. We illustrate these concepts with some data on total uranium content in a relatively sparse sample of soil from agricultural land near mine tailings in the Copperbelt Province of Zambia.

  17. How the Bayesians Got Their Beliefs (and What Those Beliefs Actually Are): Comment on Bowers and Davis (2012)

    ERIC Educational Resources Information Center

    Griffiths, Thomas L.; Chater, Nick; Norris, Dennis; Pouget, Alexandre

    2012-01-01

    Bowers and Davis (2012) criticize Bayesian modelers for telling "just so" stories about cognition and neuroscience. Their criticisms are weakened by not giving an accurate characterization of the motivation behind Bayesian modeling or the ways in which Bayesian models are used and by not evaluating this theoretical framework against specific…

  18. Geostatistical noise filtering of geophysical images : application to unexploded ordnance (UXO) sites.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Saito, Hirotaka; McKenna, Sean Andrew; Coburn, Timothy C.

    2004-07-01

    Geostatistical and non-geostatistical noise filtering methodologies, factorial kriging and a low-pass filter, and a region growing method are applied to analytic signal magnetometer images at two UXO contaminated sites to delineate UXO target areas. Overall delineation performance is improved by removing background noise. Factorial kriging slightly outperforms the low-pass filter but there is no distinct difference between them in terms of finding anomalies of interest.

  19. Bayesian Regression with Network Prior: Optimal Bayesian Filtering Perspective

    PubMed Central

    Qian, Xiaoning; Dougherty, Edward R.

    2017-01-01

    The recently introduced intrinsically Bayesian robust filter (IBRF) provides fully optimal filtering relative to a prior distribution over an uncertainty class ofjoint random process models, whereas formerly the theory was limited to model-constrained Bayesian robust filters, for which optimization was limited to the filters that are optimal for models in the uncertainty class. This paper extends the IBRF theory to the situation where there are both a prior on the uncertainty class and sample data. The result is optimal Bayesian filtering (OBF), where optimality is relative to the posterior distribution derived from the prior and the data. The IBRF theories for effective characteristics and canonical expansions extend to the OBF setting. A salient focus of the present work is to demonstrate the advantages of Bayesian regression within the OBF setting over the classical Bayesian approach in the context otlinear Gaussian models. PMID:28824268

  20. Modeling Diagnostic Assessments with Bayesian Networks

    ERIC Educational Resources Information Center

    Almond, Russell G.; DiBello, Louis V.; Moulder, Brad; Zapata-Rivera, Juan-Diego

    2007-01-01

    This paper defines Bayesian network models and examines their applications to IRT-based cognitive diagnostic modeling. These models are especially suited to building inference engines designed to be synchronous with the finer grained student models that arise in skills diagnostic assessment. Aspects of the theory and use of Bayesian network models…

  1. Comparing spatially varying coefficient models: a case study examining violent crime rates and their relationships to alcohol outlets and illegal drug arrests

    NASA Astrophysics Data System (ADS)

    Wheeler, David C.; Waller, Lance A.

    2009-03-01

    In this paper, we compare and contrast a Bayesian spatially varying coefficient process (SVCP) model with a geographically weighted regression (GWR) model for the estimation of the potentially spatially varying regression effects of alcohol outlets and illegal drug activity on violent crime in Houston, Texas. In addition, we focus on the inherent coefficient shrinkage properties of the Bayesian SVCP model as a way to address increased coefficient variance that follows from collinearity in GWR models. We outline the advantages of the Bayesian model in terms of reducing inflated coefficient variance, enhanced model flexibility, and more formal measuring of model uncertainty for prediction. We find spatially varying effects for alcohol outlets and drug violations, but the amount of variation depends on the type of model used. For the Bayesian model, this variation is controllable through the amount of prior influence placed on the variance of the coefficients. For example, the spatial pattern of coefficients is similar for the GWR and Bayesian models when a relatively large prior variance is used in the Bayesian model.

  2. Philosophy and the practice of Bayesian statistics

    PubMed Central

    Gelman, Andrew; Shalizi, Cosma Rohilla

    2015-01-01

    A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory. We draw on the literature on the consistency of Bayesian updating and also on our experience of applied work in social science. Clarity about these matters should benefit not just philosophy of science, but also statistical practice. At best, the inductivist view has encouraged researchers to fit and compare models without checking them; at worst, theorists have actively discouraged practitioners from performing model checking because it does not fit into their framework. PMID:22364575

  3. Philosophy and the practice of Bayesian statistics.

    PubMed

    Gelman, Andrew; Shalizi, Cosma Rohilla

    2013-02-01

    A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory. We draw on the literature on the consistency of Bayesian updating and also on our experience of applied work in social science. Clarity about these matters should benefit not just philosophy of science, but also statistical practice. At best, the inductivist view has encouraged researchers to fit and compare models without checking them; at worst, theorists have actively discouraged practitioners from performing model checking because it does not fit into their framework. © 2012 The British Psychological Society.

  4. A stochastic-geometric model of soil variation in Pleistocene patterned ground

    NASA Astrophysics Data System (ADS)

    Lark, Murray; Meerschman, Eef; Van Meirvenne, Marc

    2013-04-01

    In this paper we examine the spatial variability of soil in parent material with complex spatial structure which arises from complex non-linear geomorphic processes. We show that this variability can be better-modelled by a stochastic-geometric model than by a standard Gaussian random field. The benefits of the new model are seen in the reproduction of features of the target variable which influence processes like water movement and pollutant dispersal. Complex non-linear processes in the soil give rise to properties with non-Gaussian distributions. Even under a transformation to approximate marginal normality, such variables may have a more complex spatial structure than the Gaussian random field model of geostatistics can accommodate. In particular the extent to which extreme values of the variable are connected in spatially coherent regions may be misrepresented. As a result, for example, geostatistical simulation generally fails to reproduce the pathways for preferential flow in an environment where coarse infill of former fluvial channels or coarse alluvium of braided streams creates pathways for rapid movement of water. Multiple point geostatistics has been developed to deal with this problem. Multiple point methods proceed by sampling from a set of training images which can be assumed to reproduce the non-Gaussian behaviour of the target variable. The challenge is to identify appropriate sources of such images. In this paper we consider a mode of soil variation in which the soil varies continuously, exhibiting short-range lateral trends induced by local effects of the factors of soil formation which vary across the region of interest in an unpredictable way. The trends in soil variation are therefore only apparent locally, and the soil variation at regional scale appears random. We propose a stochastic-geometric model for this mode of soil variation called the Continuous Local Trend (CLT) model. We consider a case study of soil formed in relict patterned ground with pronounced lateral textural variations arising from the presence of infilled ice-wedges of Pleistocene origin. We show how knowledge of the pedogenetic processes in this environment, along with some simple descriptive statistics, can be used to select and fit a CLT model for the apparent electrical conductivity (ECa) of the soil. We use the model to simulate realizations of the CLT process, and compare these with realizations of a fitted Gaussian random field. We show how statistics that summarize the spatial coherence of regions with small values of ECa, which are expected to have coarse texture and so larger saturated hydraulic conductivity, are better reproduced by the CLT model than by the Gaussian random field. This suggests that the CLT model could be used to generate an unlimited supply of training images to allow multiple point geostatistical simulation or prediction of this or similar variables.

  5. Bayesian inference based on dual generalized order statistics from the exponentiated Weibull model

    NASA Astrophysics Data System (ADS)

    Al Sobhi, Mashail M.

    2015-02-01

    Bayesian estimation for the two parameters and the reliability function of the exponentiated Weibull model are obtained based on dual generalized order statistics (DGOS). Also, Bayesian prediction bounds for future DGOS from exponentiated Weibull model are obtained. The symmetric and asymmetric loss functions are considered for Bayesian computations. The Markov chain Monte Carlo (MCMC) methods are used for computing the Bayes estimates and prediction bounds. The results have been specialized to the lower record values. Comparisons are made between Bayesian and maximum likelihood estimators via Monte Carlo simulation.

  6. Fundamentals and Recent Developments in Approximate Bayesian Computation

    PubMed Central

    Lintusaari, Jarno; Gutmann, Michael U.; Dutta, Ritabrata; Kaski, Samuel; Corander, Jukka

    2017-01-01

    Abstract Bayesian inference plays an important role in phylogenetics, evolutionary biology, and in many other branches of science. It provides a principled framework for dealing with uncertainty and quantifying how it changes in the light of new evidence. For many complex models and inference problems, however, only approximate quantitative answers are obtainable. Approximate Bayesian computation (ABC) refers to a family of algorithms for approximate inference that makes a minimal set of assumptions by only requiring that sampling from a model is possible. We explain here the fundamentals of ABC, review the classical algorithms, and highlight recent developments. [ABC; approximate Bayesian computation; Bayesian inference; likelihood-free inference; phylogenetics; simulator-based models; stochastic simulation models; tree-based models.] PMID:28175922

  7. Integrating geophysical data for mapping the contamination of industrial sites by polycyclic aromatic hydrocarbons: A geostatistical approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Colin, P.; Nicoletis, S.; Froidevaux, R.

    1996-12-31

    A case study is presented of building a map showing the probability that the concentration in polycyclic aromatic hydrocarbon (PAH) exceeds a critical threshold. This assessment is based on existing PAH sample data (direct information) and on an electrical resistivity survey (indirect information). Simulated annealing is used to build a model of the range of possible values for PAH concentrations and of the bivariate relationship between PAH concentrations and electrical resistivity. The geostatistical technique of simple indicator kriging is then used, together with the probabilistic model, to infer, at each node of a grid, the range of possible values whichmore » the PAH concentration can take. The risk map is then extracted for this characterization of the local uncertainty. The difference between this risk map and a traditional iso-concentration map is then discussed in terms of decision-making.« less

  8. An assessment of gas emanation hazard using a geographic information system and geostatistics.

    PubMed

    Astorri, F; Beaubien, S E; Ciotoli, G; Lombardi, S

    2002-03-01

    This paper describes the use of geostatistical analysis and GIS techniques to assess gas emanation hazards. The Mt. Vulsini volcanic district was selected for this study because of the wide range of natural phenomena locally present that affect gas migration in the near surface. In addition, soil gas samples that were collected in this area should allow for a calibration between the generated risk/hazard models and the measured distribution of toxic gas species at surface. The approach used during this study consisted of three general stages. First data were digitally organized into thematic layers, then software functions in the GIS program "ArcView" were used to compare and correlate these various layers, and then finally the produced "potential-risk" map was compared with radon soil gas data in order to validate the model and/or to select zones for further, more-detailed soil gas investigations.

  9. Use of geostatistics to predict virus decay rates for determination of septic tank setback distances.

    PubMed Central

    Yates, M V; Yates, S R; Warrick, A W; Gerba, C P

    1986-01-01

    Water samples were collected from 71 public drinking-water supply wells in the Tucson, Ariz., basin. Virus decay rates in the water samples were determined with MS-2 coliphage as a model virus. The correlations between the virus decay rates and the sample locations were shown by fitting a spherical model to the experimental semivariogram. Kriging, a geostatistical technique, was used to calculate virus decay rates at unsampled locations by using the known values at nearby wells. Based on the regional characteristics of groundwater flow and the kriged estimates of virus decay rates, a contour map of the area was constructed. The map shows the variation in separation distances that would have to be maintained between wells and sources of contamination to afford similar degrees of protection from viral contamination of the drinking water in wells throughout the basin. PMID:3532954

  10. G STL: the geostatistical template library in C++

    NASA Astrophysics Data System (ADS)

    Remy, Nicolas; Shtuka, Arben; Levy, Bruno; Caers, Jef

    2002-10-01

    The development of geostatistics has been mostly accomplished by application-oriented engineers in the past 20 years. The focus on concrete applications gave birth to many algorithms and computer programs designed to address different issues, such as estimating or simulating a variable while possibly accounting for secondary information such as seismic data, or integrating geological and geometrical data. At the core of any geostatistical data integration methodology is a well-designed algorithm. Yet, despite their obvious differences, all these algorithms share many commonalities on which to build a geostatistics programming library, lest the resulting library is poorly reusable and difficult to expand. Building on this observation, we design a comprehensive, yet flexible and easily reusable library of geostatistics algorithms in C++. The recent advent of the generic programming paradigm allows us elegantly to express the commonalities of the geostatistical algorithms into computer code. Generic programming, also referred to as "programming with concepts", provides a high level of abstraction without loss of efficiency. This last point is a major gain over object-oriented programming which often trades efficiency for abstraction. It is not enough for a numerical library to be reusable, it also has to be fast. Because generic programming is "programming with concepts", the essential step in the library design is the careful identification and thorough definition of these concepts shared by most geostatistical algorithms. Building on these definitions, a generic and expandable code can be developed. To show the advantages of such a generic library, we use G STL to build two sequential simulation programs working on two different types of grids—a surface with faults and an unstructured grid—without requiring any change to the G STL code.

  11. High Performance Geostatistical Modeling of Biospheric Resources

    NASA Astrophysics Data System (ADS)

    Pedelty, J. A.; Morisette, J. T.; Smith, J. A.; Schnase, J. L.; Crosier, C. S.; Stohlgren, T. J.

    2004-12-01

    We are using parallel geostatistical codes to study spatial relationships among biospheric resources in several study areas. For example, spatial statistical models based on large- and small-scale variability have been used to predict species richness of both native and exotic plants (hot spots of diversity) and patterns of exotic plant invasion. However, broader use of geostastics in natural resource modeling, especially at regional and national scales, has been limited due to the large computing requirements of these applications. To address this problem, we implemented parallel versions of the kriging spatial interpolation algorithm. The first uses the Message Passing Interface (MPI) in a master/slave paradigm on an open source Linux Beowulf cluster, while the second is implemented with the new proprietary Xgrid distributed processing system on an Xserve G5 cluster from Apple Computer, Inc. These techniques are proving effective and provide the basis for a national decision support capability for invasive species management that is being jointly developed by NASA and the US Geological Survey.

  12. Uncovering hidden heterogeneity: Geo-statistical models illuminate the fine scale effects of boating infrastructure on sediment characteristics and contaminants.

    PubMed

    Hedge, L H; Dafforn, K A; Simpson, S L; Johnston, E L

    2017-06-30

    Infrastructure associated with coastal communities is likely to not only directly displace natural systems, but also leave environmental footprints' that stretch over multiple scales. Some coastal infrastructure will, there- fore, generate a hidden layer of habitat heterogeneity in sediment systems that is not immediately observable in classical impact assessment frameworks. We examine the hidden heterogeneity associated with one of the most ubiquitous coastal modifications; dense swing moorings fields. Using a model based geo-statistical framework we highlight the variation in sedimentology throughout mooring fields and reference locations. Moorings were correlated with patches of sediment with larger particle sizes, and associated metal(loid) concentrations in these patches were depressed. Our work highlights two important ideas i) mooring fields create a mosaic of habitat in which contamination decreases and grain sizes increase close to moorings, and ii) model- based frameworks provide an information rich, easy-to-interpret way to communicate complex analyses to stakeholders. Crown Copyright © 2017. Published by Elsevier Ltd. All rights reserved.

  13. SRS 2010 Vegetation Inventory GeoStatistical Mapping Results for Custom Reaction Intensity and Total Dead Fuels.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Edwards, Lloyd A.; Paresol, Bernard

    This report of the geostatistical analysis results of the fire fuels response variables, custom reaction intensity and total dead fuels is but a part of an SRS 2010 vegetation inventory project. For detailed description of project, theory and background including sample design, methods, and results please refer to USDA Forest Service Savannah River Site internal report “SRS 2010 Vegetation Inventory GeoStatistical Mapping Report”, (Edwards & Parresol 2013).

  14. Landscape scale mapping of forest inventory data by nearest neighbor classification

    Treesearch

    Andrew Lister

    2009-01-01

    One of the goals of the Forest Service, U.S. Department of Agriculture's Forest Inventory and Analysis (FIA) program is large-area mapping. FIA scientists have tried many methods in the past, including geostatistical methods, linear modeling, nonlinear modeling, and simple choropleth and dot maps. Mapping methods that require individual model-based maps to be...

  15. Efficient fuzzy Bayesian inference algorithms for incorporating expert knowledge in parameter estimation

    NASA Astrophysics Data System (ADS)

    Rajabi, Mohammad Mahdi; Ataie-Ashtiani, Behzad

    2016-05-01

    Bayesian inference has traditionally been conceived as the proper framework for the formal incorporation of expert knowledge in parameter estimation of groundwater models. However, conventional Bayesian inference is incapable of taking into account the imprecision essentially embedded in expert provided information. In order to solve this problem, a number of extensions to conventional Bayesian inference have been introduced in recent years. One of these extensions is 'fuzzy Bayesian inference' which is the result of integrating fuzzy techniques into Bayesian statistics. Fuzzy Bayesian inference has a number of desirable features which makes it an attractive approach for incorporating expert knowledge in the parameter estimation process of groundwater models: (1) it is well adapted to the nature of expert provided information, (2) it allows to distinguishably model both uncertainty and imprecision, and (3) it presents a framework for fusing expert provided information regarding the various inputs of the Bayesian inference algorithm. However an important obstacle in employing fuzzy Bayesian inference in groundwater numerical modeling applications is the computational burden, as the required number of numerical model simulations often becomes extremely exhaustive and often computationally infeasible. In this paper, a novel approach of accelerating the fuzzy Bayesian inference algorithm is proposed which is based on using approximate posterior distributions derived from surrogate modeling, as a screening tool in the computations. The proposed approach is first applied to a synthetic test case of seawater intrusion (SWI) in a coastal aquifer. It is shown that for this synthetic test case, the proposed approach decreases the number of required numerical simulations by an order of magnitude. Then the proposed approach is applied to a real-world test case involving three-dimensional numerical modeling of SWI in Kish Island, located in the Persian Gulf. An expert elicitation methodology is developed and applied to the real-world test case in order to provide a road map for the use of fuzzy Bayesian inference in groundwater modeling applications.

  16. Coupling geostatistics to detailed reservoir description allows better visualization and more accurate characterization/simulation of turbidite reservoirs: Elk Hills oil field, California

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Allan, M.E.; Wilson, M.L.; Wightman, J.

    1996-12-31

    The Elk Hills giant oilfield, located in the southern San Joaquin Valley of California, has produced 1.1 billion barrels of oil from Miocene and shallow Pliocene reservoirs. 65% of the current 64,000 BOPD production is from the pressure-supported, deeper Miocene turbidite sands. In the turbidite sands of the 31 S structure, large porosity & permeability variations in the Main Body B and Western 31 S sands cause problems with the efficiency of the waterflooding. These variations have now been quantified and visualized using geostatistics. The end result is a more detailed reservoir characterization for simulation. Traditional reservoir descriptions based onmore » marker correlations, cross-sections and mapping do not provide enough detail to capture the short-scale stratigraphic heterogeneity needed for adequate reservoir simulation. These deterministic descriptions are inadequate to tie with production data as the thinly bedded sand/shale sequences blur into a falsely homogenous picture. By studying the variability of the geologic & petrophysical data vertically within each wellbore and spatially from well to well, a geostatistical reservoir description has been developed. It captures the natural variability of the sands and shales that was lacking from earlier work. These geostatistical studies allow the geologic and petrophysical characteristics to be considered in a probabilistic model. The end-product is a reservoir description that captures the variability of the reservoir sequences and can be used as a more realistic starting point for history matching and reservoir simulation.« less

  17. Assessment of spatial distribution of fallout radionuclides through geostatistics concept.

    PubMed

    Mabit, L; Bernard, C

    2007-01-01

    After introducing geostatistics concept and its utility in environmental science and especially in Fallout Radionuclide (FRN) spatialisation, a case study for cesium-137 ((137)Cs) redistribution at the field scale using geostatistics is presented. On a Canadian agricultural field, geostatistics coupled with a Geographic Information System (GIS) was used to test three different techniques of interpolation [Ordinary Kriging (OK), Inverse Distance Weighting power one (IDW1) and two (IDW2)] to create a (137)Cs map and to establish a radioisotope budget. Following the optimization of variographic parameters, an experimental semivariogram was developed to determine the spatial dependence of (137)Cs. It was adjusted to a spherical isotropic model with a range of 30 m and a very small nugget effect. This (137)Cs semivariogram showed a good autocorrelation (R(2)=0.91) and was well structured ('nugget-to-sill' ratio of 4%). It also revealed that the sampling strategy was adequate to reveal the spatial correlation of (137)Cs. The spatial redistribution of (137)Cs was estimated by Ordinary Kriging and IDW to produce contour maps. A radioisotope budget was established for the 2.16 ha agricultural field under investigation. It was estimated that around 2 x 10(7)Bq of (137)Cs were missing (around 30% of the total initial fallout) and were exported by physical processes (runoff and erosion processes) from the area under investigation. The cross-validation analysis showed that in the case of spatially structured data, OK is a better interpolation method than IDW1 or IDW2 for the assessment of potential radioactive contamination and/or pollution.

  18. Coupling geostatistics to detailed reservoir description allows better visualization and more accurate characterization/simulation of turbidite reservoirs: Elk Hills oil field, California

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Allan, M.E.; Wilson, M.L.; Wightman, J.

    1996-01-01

    The Elk Hills giant oilfield, located in the southern San Joaquin Valley of California, has produced 1.1 billion barrels of oil from Miocene and shallow Pliocene reservoirs. 65% of the current 64,000 BOPD production is from the pressure-supported, deeper Miocene turbidite sands. In the turbidite sands of the 31 S structure, large porosity permeability variations in the Main Body B and Western 31 S sands cause problems with the efficiency of the waterflooding. These variations have now been quantified and visualized using geostatistics. The end result is a more detailed reservoir characterization for simulation. Traditional reservoir descriptions based on markermore » correlations, cross-sections and mapping do not provide enough detail to capture the short-scale stratigraphic heterogeneity needed for adequate reservoir simulation. These deterministic descriptions are inadequate to tie with production data as the thinly bedded sand/shale sequences blur into a falsely homogenous picture. By studying the variability of the geologic petrophysical data vertically within each wellbore and spatially from well to well, a geostatistical reservoir description has been developed. It captures the natural variability of the sands and shales that was lacking from earlier work. These geostatistical studies allow the geologic and petrophysical characteristics to be considered in a probabilistic model. The end-product is a reservoir description that captures the variability of the reservoir sequences and can be used as a more realistic starting point for history matching and reservoir simulation.« less

  19. Geospatial interpolation and mapping of tropospheric ozone pollution using geostatistics.

    PubMed

    Kethireddy, Swatantra R; Tchounwou, Paul B; Ahmad, Hafiz A; Yerramilli, Anjaneyulu; Young, John H

    2014-01-10

    Tropospheric ozone (O3) pollution is a major problem worldwide, including in the United States of America (USA), particularly during the summer months. Ozone oxidative capacity and its impact on human health have attracted the attention of the scientific community. In the USA, sparse spatial observations for O3 may not provide a reliable source of data over a geo-environmental region. Geostatistical Analyst in ArcGIS has the capability to interpolate values in unmonitored geo-spaces of interest. In this study of eastern Texas O3 pollution, hourly episodes for spring and summer 2012 were selectively identified. To visualize the O3 distribution, geostatistical techniques were employed in ArcMap. Using ordinary Kriging, geostatistical layers of O3 for all the studied hours were predicted and mapped at a spatial resolution of 1 kilometer. A decent level of prediction accuracy was achieved and was confirmed from cross-validation results. The mean prediction error was close to 0, the root mean-standardized-prediction error was close to 1, and the root mean square and average standard errors were small. O3 pollution map data can be further used in analysis and modeling studies. Kriging results and O3 decadal trends indicate that the populace in Houston-Sugar Land-Baytown, Dallas-Fort Worth-Arlington, Beaumont-Port Arthur, San Antonio, and Longview are repeatedly exposed to high levels of O3-related pollution, and are prone to the corresponding respiratory and cardiovascular health effects. Optimization of the monitoring network proves to be an added advantage for the accurate prediction of exposure levels.

  20. 3-D transient hydraulic tomography in unconfined aquifers with fast drainage response

    NASA Astrophysics Data System (ADS)

    Cardiff, M.; Barrash, W.

    2011-12-01

    We investigate, through numerical experiments, the viability of three-dimensional transient hydraulic tomography (3DTHT) for identifying the spatial distribution of groundwater flow parameters (primarily, hydraulic conductivity K) in permeable, unconfined aquifers. To invert the large amount of transient data collected from 3DTHT surveys, we utilize an iterative geostatistical inversion strategy in which outer iterations progressively increase the number of data points fitted and inner iterations solve the quasi-linear geostatistical formulas of Kitanidis. In order to base our numerical experiments around realistic scenarios, we utilize pumping rates, geometries, and test lengths similar to those attainable during 3DTHT field campaigns performed at the Boise Hydrogeophysical Research Site (BHRS). We also utilize hydrologic parameters that are similar to those observed at the BHRS and in other unconsolidated, unconfined fluvial aquifers. In addition to estimating K, we test the ability of 3DTHT to estimate both average storage values (specific storage Ss and specific yield Sy) as well as spatial variability in storage coefficients. The effects of model conceptualization errors during unconfined 3DTHT are investigated including: (1) assuming constant storage coefficients during inversion and (2) assuming stationary geostatistical parameter variability. Overall, our findings indicate that estimation of K is slightly degraded if storage parameters must be jointly estimated, but that this effect is quite small compared with the degradation of estimates due to violation of "structural" geostatistical assumptions. Practically, we find for our scenarios that assuming constant storage values during inversion does not appear to have a significant effect on K estimates or uncertainty bounds.

  1. Bayesian cross-validation for model evaluation and selection, with application to the North American Breeding Bird Survey

    USGS Publications Warehouse

    Link, William; Sauer, John R.

    2016-01-01

    The analysis of ecological data has changed in two important ways over the last 15 years. The development and easy availability of Bayesian computational methods has allowed and encouraged the fitting of complex hierarchical models. At the same time, there has been increasing emphasis on acknowledging and accounting for model uncertainty. Unfortunately, the ability to fit complex models has outstripped the development of tools for model selection and model evaluation: familiar model selection tools such as Akaike's information criterion and the deviance information criterion are widely known to be inadequate for hierarchical models. In addition, little attention has been paid to the evaluation of model adequacy in context of hierarchical modeling, i.e., to the evaluation of fit for a single model. In this paper, we describe Bayesian cross-validation, which provides tools for model selection and evaluation. We describe the Bayesian predictive information criterion and a Bayesian approximation to the BPIC known as the Watanabe-Akaike information criterion. We illustrate the use of these tools for model selection, and the use of Bayesian cross-validation as a tool for model evaluation, using three large data sets from the North American Breeding Bird Survey.

  2. A multiple-point geostatistical method for characterizing uncertainty of subsurface alluvial units and its effects on flow and transport

    USGS Publications Warehouse

    Cronkite-Ratcliff, C.; Phelps, G.A.; Boucher, A.

    2012-01-01

    This report provides a proof-of-concept to demonstrate the potential application of multiple-point geostatistics for characterizing geologic heterogeneity and its effect on flow and transport simulation. The study presented in this report is the result of collaboration between the U.S. Geological Survey (USGS) and Stanford University. This collaboration focused on improving the characterization of alluvial deposits by incorporating prior knowledge of geologic structure and estimating the uncertainty of the modeled geologic units. In this study, geologic heterogeneity of alluvial units is characterized as a set of stochastic realizations, and uncertainty is indicated by variability in the results of flow and transport simulations for this set of realizations. This approach is tested on a hypothetical geologic scenario developed using data from the alluvial deposits in Yucca Flat, Nevada. Yucca Flat was chosen as a data source for this test case because it includes both complex geologic and hydrologic characteristics and also contains a substantial amount of both surface and subsurface geologic data. Multiple-point geostatistics is used to model geologic heterogeneity in the subsurface. A three-dimensional (3D) model of spatial variability is developed by integrating alluvial units mapped at the surface with vertical drill-hole data. The SNESIM (Single Normal Equation Simulation) algorithm is used to represent geologic heterogeneity stochastically by generating 20 realizations, each of which represents an equally probable geologic scenario. A 3D numerical model is used to simulate groundwater flow and contaminant transport for each realization, producing a distribution of flow and transport responses to the geologic heterogeneity. From this distribution of flow and transport responses, the frequency of exceeding a given contaminant concentration threshold can be used as an indicator of uncertainty about the location of the contaminant plume boundary.

  3. Bayesian networks for maritime traffic accident prevention: benefits and challenges.

    PubMed

    Hänninen, Maria

    2014-12-01

    Bayesian networks are quantitative modeling tools whose applications to the maritime traffic safety context are becoming more popular. This paper discusses the utilization of Bayesian networks in maritime safety modeling. Based on literature and the author's own experiences, the paper studies what Bayesian networks can offer to maritime accident prevention and safety modeling and discusses a few challenges in their application to this context. It is argued that the capability of representing rather complex, not necessarily causal but uncertain relationships makes Bayesian networks an attractive modeling tool for the maritime safety and accidents. Furthermore, as the maritime accident and safety data is still rather scarce and has some quality problems, the possibility to combine data with expert knowledge and the easy way of updating the model after acquiring more evidence further enhance their feasibility. However, eliciting the probabilities from the maritime experts might be challenging and the model validation can be tricky. It is concluded that with the utilization of several data sources, Bayesian updating, dynamic modeling, and hidden nodes for latent variables, Bayesian networks are rather well-suited tools for the maritime safety management and decision-making. Copyright © 2014 Elsevier Ltd. All rights reserved.

  4. Bayesian Framework for Water Quality Model Uncertainty Estimation and Risk Management

    EPA Science Inventory

    A formal Bayesian methodology is presented for integrated model calibration and risk-based water quality management using Bayesian Monte Carlo simulation and maximum likelihood estimation (BMCML). The primary focus is on lucid integration of model calibration with risk-based wat...

  5. Preliminary Groundwater Simulations To Compare Different Reconstruction Methods of 3-d Alluvial Heterogeneity

    NASA Astrophysics Data System (ADS)

    Teles, V.; de Marsily, G.; Delay, F.; Perrier, E.

    Alluvial floodplains are extremely heterogeneous aquifers, whose three-dimensional structures are quite difficult to model. In general, when representing such structures, the medium heterogeneity is modeled with classical geostatistical or Boolean meth- ods. Another approach, still in its infancy, is called the genetic method because it simulates the generation of the medium by reproducing sedimentary processes. We developed a new genetic model to obtain a realistic three-dimensional image of allu- vial media. It does not simulate the hydrodynamics of sedimentation but uses semi- empirical and statistical rules to roughly reproduce fluvial deposition and erosion. The main processes, either at the stream scale or at the plain scale, are modeled by simple rules applied to "sediment" entities or to conceptual "erosion" entities. The model was applied to a several kilometer long portion of the Aube River floodplain (France) and reproduced the deposition and erosion cycles that occurred during the inferred climate periods (15 000 BP to present). A three-dimensional image of the aquifer was gener- ated, by extrapolating the two-dimensional information collected on a cross-section of the floodplain. Unlike geostatistical methods, this extrapolation does not use a statis- tical spatial analysis of the data, but a genetic analysis, which leads to a more realistic structure. Groundwater flow and transport simulations in the alluvium were carried out with a three-dimensional flow code or simulator (MODFLOW), using different rep- resentations of the alluvial reservoir of the Aube River floodplain: first an equivalent homogeneous medium, and then different heterogeneous media built either with the traditional geostatistical approach simulating the permeability distribution, or with the new genetic model presented here simulating sediment facies. In the latter case, each deposited entity of a given lithology was assigned a constant hydraulic conductivity value. Results of these models have been compared to assess the value of the genetic approach and will be presented.

  6. Use of geostatistics to determine the spatial distribution and infestation rate of leaf-cutting ant nests (Hymenoptera: Formicidae) in eucalyptus plantations.

    PubMed

    Lasmar, O; Zanetti, R; dos Santos, A; Fernandes, B V

    2012-08-01

    One of the fundamental steps in pest sampling is the assessment of the population distribution in the field. Several studies have investigated the distribution and appropriate sampling methods for leaf-cutting ants; however, more reliable methods are still required, such as those that use geostatistics. The objective of this study was to determine the spatial distribution and infestation rate of leaf-cutting ant nests in eucalyptus plantations by using geostatistics. The study was carried out in 2008 in two eucalyptus stands in Paraopeba, Minas Gerais, Brazil. All of the nests in the studied area were located and used for the generation of GIS maps, and the spatial pattern of distribution was determined considering the number and size of nests. Each analysis and map was made using the R statistics program and the geoR package. The nest spatial distribution in a savanna area of Minas Gerais was clustered to a certain extent. The models generated allowed the production of kriging maps of areas infested with leaf-cutting ants, where chemical intervention would be necessary, reducing the control costs, impact on humans, and the environment.

  7. Accuracy and uncertainty analysis of soil Bbf spatial distribution estimation at a coking plant-contaminated site based on normalization geostatistical technologies.

    PubMed

    Liu, Geng; Niu, Junjie; Zhang, Chao; Guo, Guanlin

    2015-12-01

    Data distribution is usually skewed severely by the presence of hot spots in contaminated sites. This causes difficulties for accurate geostatistical data transformation. Three types of typical normal distribution transformation methods termed the normal score, Johnson, and Box-Cox transformations were applied to compare the effects of spatial interpolation with normal distribution transformation data of benzo(b)fluoranthene in a large-scale coking plant-contaminated site in north China. Three normal transformation methods decreased the skewness and kurtosis of the benzo(b)fluoranthene, and all the transformed data passed the Kolmogorov-Smirnov test threshold. Cross validation showed that Johnson ordinary kriging has a minimum root-mean-square error of 1.17 and a mean error of 0.19, which was more accurate than the other two models. The area with fewer sampling points and that with high levels of contamination showed the largest prediction standard errors based on the Johnson ordinary kriging prediction map. We introduce an ideal normal transformation method prior to geostatistical estimation for severely skewed data, which enhances the reliability of risk estimation and improves the accuracy for determination of remediation boundaries.

  8. Downscaling remotely sensed imagery using area-to-point cokriging and multiple-point geostatistical simulation

    NASA Astrophysics Data System (ADS)

    Tang, Yunwei; Atkinson, Peter M.; Zhang, Jingxiong

    2015-03-01

    A cross-scale data integration method was developed and tested based on the theory of geostatistics and multiple-point geostatistics (MPG). The goal was to downscale remotely sensed images while retaining spatial structure by integrating images at different spatial resolutions. During the process of downscaling, a rich spatial correlation model in the form of a training image was incorporated to facilitate reproduction of similar local patterns in the simulated images. Area-to-point cokriging (ATPCK) was used as locally varying mean (LVM) (i.e., soft data) to deal with the change of support problem (COSP) for cross-scale integration, which MPG cannot achieve alone. Several pairs of spectral bands of remotely sensed images were tested for integration within different cross-scale case studies. The experiment shows that MPG can restore the spatial structure of the image at a fine spatial resolution given the training image and conditioning data. The super-resolution image can be predicted using the proposed method, which cannot be realised using most data integration methods. The results show that ATPCK-MPG approach can achieve greater accuracy than methods which do not account for the change of support issue.

  9. A Comparison of General Diagnostic Models (GDM) and Bayesian Networks Using a Middle School Mathematics Test

    ERIC Educational Resources Information Center

    Wu, Haiyan

    2013-01-01

    General diagnostic models (GDMs) and Bayesian networks are mathematical frameworks that cover a wide variety of psychometric models. Both extend latent class models, and while GDMs also extend item response theory (IRT) models, Bayesian networks can be parameterized using discretized IRT. The purpose of this study is to examine similarities and…

  10. Water Table Uncertainties due to Uncertainties in Structure and Properties of an Unconfined Aquifer.

    PubMed

    Hauser, Juerg; Wellmann, Florian; Trefry, Mike

    2018-03-01

    We consider two sources of geology-related uncertainty in making predictions of the steady-state water table elevation for an unconfined aquifer. That is the uncertainty in the depth to base of the aquifer and in the hydraulic conductivity distribution within the aquifer. Stochastic approaches to hydrological modeling commonly use geostatistical techniques to account for hydraulic conductivity uncertainty within the aquifer. In the absence of well data allowing derivation of a relationship between geophysical and hydrological parameters, the use of geophysical data is often limited to constraining the structural boundaries. If we recover the base of an unconfined aquifer from an analysis of geophysical data, then the associated uncertainties are a consequence of the geophysical inversion process. In this study, we illustrate this by quantifying water table uncertainties for the unconfined aquifer formed by the paleochannel network around the Kintyre Uranium deposit in Western Australia. The focus of the Bayesian parametric bootstrap approach employed for the inversion of the available airborne electromagnetic data is the recovery of the base of the paleochannel network and the associated uncertainties. This allows us to then quantify the associated influences on the water table in a conceptualized groundwater usage scenario and compare the resulting uncertainties with uncertainties due to an uncertain hydraulic conductivity distribution within the aquifer. Our modeling shows that neither uncertainties in the depth to the base of the aquifer nor hydraulic conductivity uncertainties alone can capture the patterns of uncertainty in the water table that emerge when the two are combined. © 2017, National Ground Water Association.

  11. Perceptual decision making: drift-diffusion model is equivalent to a Bayesian model

    PubMed Central

    Bitzer, Sebastian; Park, Hame; Blankenburg, Felix; Kiebel, Stefan J.

    2014-01-01

    Behavioral data obtained with perceptual decision making experiments are typically analyzed with the drift-diffusion model. This parsimonious model accumulates noisy pieces of evidence toward a decision bound to explain the accuracy and reaction times of subjects. Recently, Bayesian models have been proposed to explain how the brain extracts information from noisy input as typically presented in perceptual decision making tasks. It has long been known that the drift-diffusion model is tightly linked with such functional Bayesian models but the precise relationship of the two mechanisms was never made explicit. Using a Bayesian model, we derived the equations which relate parameter values between these models. In practice we show that this equivalence is useful when fitting multi-subject data. We further show that the Bayesian model suggests different decision variables which all predict equal responses and discuss how these may be discriminated based on neural correlates of accumulated evidence. In addition, we discuss extensions to the Bayesian model which would be difficult to derive for the drift-diffusion model. We suggest that these and other extensions may be highly useful for deriving new experiments which test novel hypotheses. PMID:24616689

  12. Construction of monitoring model and algorithm design on passenger security during shipping based on improved Bayesian network.

    PubMed

    Wang, Jiali; Zhang, Qingnian; Ji, Wenfeng

    2014-01-01

    A large number of data is needed by the computation of the objective Bayesian network, but the data is hard to get in actual computation. The calculation method of Bayesian network was improved in this paper, and the fuzzy-precise Bayesian network was obtained. Then, the fuzzy-precise Bayesian network was used to reason Bayesian network model when the data is limited. The security of passengers during shipping is affected by various factors, and it is hard to predict and control. The index system that has the impact on the passenger safety during shipping was established on basis of the multifield coupling theory in this paper. Meanwhile, the fuzzy-precise Bayesian network was applied to monitor the security of passengers in the shipping process. The model was applied to monitor the passenger safety during shipping of a shipping company in Hainan, and the effectiveness of this model was examined. This research work provides guidance for guaranteeing security of passengers during shipping.

  13. Construction of Monitoring Model and Algorithm Design on Passenger Security during Shipping Based on Improved Bayesian Network

    PubMed Central

    Wang, Jiali; Zhang, Qingnian; Ji, Wenfeng

    2014-01-01

    A large number of data is needed by the computation of the objective Bayesian network, but the data is hard to get in actual computation. The calculation method of Bayesian network was improved in this paper, and the fuzzy-precise Bayesian network was obtained. Then, the fuzzy-precise Bayesian network was used to reason Bayesian network model when the data is limited. The security of passengers during shipping is affected by various factors, and it is hard to predict and control. The index system that has the impact on the passenger safety during shipping was established on basis of the multifield coupling theory in this paper. Meanwhile, the fuzzy-precise Bayesian network was applied to monitor the security of passengers in the shipping process. The model was applied to monitor the passenger safety during shipping of a shipping company in Hainan, and the effectiveness of this model was examined. This research work provides guidance for guaranteeing security of passengers during shipping. PMID:25254227

  14. Hierarchical Bayesian Modeling of Fluid-Induced Seismicity

    NASA Astrophysics Data System (ADS)

    Broccardo, M.; Mignan, A.; Wiemer, S.; Stojadinovic, B.; Giardini, D.

    2017-11-01

    In this study, we present a Bayesian hierarchical framework to model fluid-induced seismicity. The framework is based on a nonhomogeneous Poisson process with a fluid-induced seismicity rate proportional to the rate of injected fluid. The fluid-induced seismicity rate model depends upon a set of physically meaningful parameters and has been validated for six fluid-induced case studies. In line with the vision of hierarchical Bayesian modeling, the rate parameters are considered as random variables. We develop both the Bayesian inference and updating rules, which are used to develop a probabilistic forecasting model. We tested the Basel 2006 fluid-induced seismic case study to prove that the hierarchical Bayesian model offers a suitable framework to coherently encode both epistemic uncertainty and aleatory variability. Moreover, it provides a robust and consistent short-term seismic forecasting model suitable for online risk quantification and mitigation.

  15. Estimating virus occurrence using Bayesian modeling in multiple drinking water systems of the United States

    USGS Publications Warehouse

    Varughese, Eunice A.; Brinkman, Nichole E; Anneken, Emily M; Cashdollar, Jennifer S; Fout, G. Shay; Furlong, Edward T.; Kolpin, Dana W.; Glassmeyer, Susan T.; Keely, Scott P

    2017-01-01

    incorporated into a Bayesian model to more accurately determine viral load in both source and treated water. Results of the Bayesian model indicated that viruses are present in source water and treated water. By using a Bayesian framework that incorporates inhibition, as well as many other parameters that affect viral detection, this study offers an approach for more accurately estimating the occurrence of viral pathogens in environmental waters.

  16. A local approach for focussed Bayesian fusion

    NASA Astrophysics Data System (ADS)

    Sander, Jennifer; Heizmann, Michael; Goussev, Igor; Beyerer, Jürgen

    2009-04-01

    Local Bayesian fusion approaches aim to reduce high storage and computational costs of Bayesian fusion which is separated from fixed modeling assumptions. Using the small world formalism, we argue why this proceeding is conform with Bayesian theory. Then, we concentrate on the realization of local Bayesian fusion by focussing the fusion process solely on local regions that are task relevant with a high probability. The resulting local models correspond then to restricted versions of the original one. In a previous publication, we used bounds for the probability of misleading evidence to show the validity of the pre-evaluation of task specific knowledge and prior information which we perform to build local models. In this paper, we prove the validity of this proceeding using information theoretic arguments. For additional efficiency, local Bayesian fusion can be realized in a distributed manner. Here, several local Bayesian fusion tasks are evaluated and unified after the actual fusion process. For the practical realization of distributed local Bayesian fusion, software agents are predestinated. There is a natural analogy between the resulting agent based architecture and criminal investigations in real life. We show how this analogy can be used to improve the efficiency of distributed local Bayesian fusion additionally. Using a landscape model, we present an experimental study of distributed local Bayesian fusion in the field of reconnaissance, which highlights its high potential.

  17. Local Geostatistical Models and Big Data in Hydrological and Ecological Applications

    NASA Astrophysics Data System (ADS)

    Hristopulos, Dionissios

    2015-04-01

    The advent of the big data era creates new opportunities for environmental and ecological modelling but also presents significant challenges. The availability of remote sensing images and low-cost wireless sensor networks implies that spatiotemporal environmental data to cover larger spatial domains at higher spatial and temporal resolution for longer time windows. Handling such voluminous data presents several technical and scientific challenges. In particular, the geostatistical methods used to process spatiotemporal data need to overcome the dimensionality curse associated with the need to store and invert large covariance matrices. There are various mathematical approaches for addressing the dimensionality problem, including change of basis, dimensionality reduction, hierarchical schemes, and local approximations. We present a Stochastic Local Interaction (SLI) model that can be used to model local correlations in spatial data. SLI is a random field model suitable for data on discrete supports (i.e., regular lattices or irregular sampling grids). The degree of localization is determined by means of kernel functions and appropriate bandwidths. The strength of the correlations is determined by means of coefficients. In the "plain vanilla" version the parameter set involves scale and rigidity coefficients as well as a characteristic length. The latter determines in connection with the rigidity coefficient the correlation length of the random field. The SLI model is based on statistical field theory and extends previous research on Spartan spatial random fields [2,3] from continuum spaces to explicitly discrete supports. The SLI kernel functions employ adaptive bandwidths learned from the sampling spatial distribution [1]. The SLI precision matrix is expressed explicitly in terms of the model parameter and the kernel function. Hence, covariance matrix inversion is not necessary for parameter inference that is based on leave-one-out cross validation. This property helps to overcome a significant computational bottleneck of geostatistical models due to the poor scaling of the matrix inversion [4,5]. We present applications to real and simulated data sets, including the Walker lake data, and we investigate the SLI performance using various statistical cross validation measures. References [1] T. Hofmann, B. Schlkopf, A.J. Smola, Annals of Statistics, 36, 1171-1220 (2008). [2] D. T. Hristopulos, SIAM Journal on Scientific Computing, 24(6): 2125-2162 (2003). [3] D. T. Hristopulos and S. N. Elogne, IEEE Transactions on Signal Processing, 57(9): 3475-3487 (2009) [4] G. Jona Lasinio, G. Mastrantonio, and A. Pollice, Statistical Methods and Applications, 22(1):97-112 (2013) [5] Sun, Y., B. Li, and M. G. Genton (2012). Geostatistics for large datasets. In: Advances and Challenges in Space-time Modelling of Natural Events, Lecture Notes in Statistics, pp. 55-77. Springer, Berlin-Heidelberg.

  18. Modeling and scaleup of steamflood in a heterogeneous reservoir

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dehghani, K.; Basham, W.M.; Durlofsky, L.J.

    1995-11-01

    A series of simulation runs was conducted for different geostatistically derived cross-sectional models to study the degree of heterogeneity required for proper modeling of steamfloods in a thick, heavy-oil reservoir with thin diatomite barriers Different methods for coarsening the most detailed models were applied, and performance predictions for the coarsened and detailed models compared. Use of a general scaleup method provided the most accurate coarse grid models.

  19. A Variational Bayes Genomic-Enabled Prediction Model with Genotype × Environment Interaction

    PubMed Central

    Montesinos-López, Osval A.; Montesinos-López, Abelardo; Crossa, José; Montesinos-López, José Cricelio; Luna-Vázquez, Francisco Javier; Salinas-Ruiz, Josafhat; Herrera-Morales, José R.; Buenrostro-Mariscal, Raymundo

    2017-01-01

    There are Bayesian and non-Bayesian genomic models that take into account G×E interactions. However, the computational cost of implementing Bayesian models is high, and becomes almost impossible when the number of genotypes, environments, and traits is very large, while, in non-Bayesian models, there are often important and unsolved convergence problems. The variational Bayes method is popular in machine learning, and, by approximating the probability distributions through optimization, it tends to be faster than Markov Chain Monte Carlo methods. For this reason, in this paper, we propose a new genomic variational Bayes version of the Bayesian genomic model with G×E using half-t priors on each standard deviation (SD) term to guarantee highly noninformative and posterior inferences that are not sensitive to the choice of hyper-parameters. We show the complete theoretical derivation of the full conditional and the variational posterior distributions, and their implementations. We used eight experimental genomic maize and wheat data sets to illustrate the new proposed variational Bayes approximation, and compared its predictions and implementation time with a standard Bayesian genomic model with G×E. Results indicated that prediction accuracies are slightly higher in the standard Bayesian model with G×E than in its variational counterpart, but, in terms of computation time, the variational Bayes genomic model with G×E is, in general, 10 times faster than the conventional Bayesian genomic model with G×E. For this reason, the proposed model may be a useful tool for researchers who need to predict and select genotypes in several environments. PMID:28391241

  20. A Variational Bayes Genomic-Enabled Prediction Model with Genotype × Environment Interaction.

    PubMed

    Montesinos-López, Osval A; Montesinos-López, Abelardo; Crossa, José; Montesinos-López, José Cricelio; Luna-Vázquez, Francisco Javier; Salinas-Ruiz, Josafhat; Herrera-Morales, José R; Buenrostro-Mariscal, Raymundo

    2017-06-07

    There are Bayesian and non-Bayesian genomic models that take into account G×E interactions. However, the computational cost of implementing Bayesian models is high, and becomes almost impossible when the number of genotypes, environments, and traits is very large, while, in non-Bayesian models, there are often important and unsolved convergence problems. The variational Bayes method is popular in machine learning, and, by approximating the probability distributions through optimization, it tends to be faster than Markov Chain Monte Carlo methods. For this reason, in this paper, we propose a new genomic variational Bayes version of the Bayesian genomic model with G×E using half-t priors on each standard deviation (SD) term to guarantee highly noninformative and posterior inferences that are not sensitive to the choice of hyper-parameters. We show the complete theoretical derivation of the full conditional and the variational posterior distributions, and their implementations. We used eight experimental genomic maize and wheat data sets to illustrate the new proposed variational Bayes approximation, and compared its predictions and implementation time with a standard Bayesian genomic model with G×E. Results indicated that prediction accuracies are slightly higher in the standard Bayesian model with G×E than in its variational counterpart, but, in terms of computation time, the variational Bayes genomic model with G×E is, in general, 10 times faster than the conventional Bayesian genomic model with G×E. For this reason, the proposed model may be a useful tool for researchers who need to predict and select genotypes in several environments. Copyright © 2017 Montesinos-López et al.

  1. Bayesian Models for Astrophysical Data Using R, JAGS, Python, and Stan

    NASA Astrophysics Data System (ADS)

    Hilbe, Joseph M.; de Souza, Rafael S.; Ishida, Emille E. O.

    2017-05-01

    This comprehensive guide to Bayesian methods in astronomy enables hands-on work by supplying complete R, JAGS, Python, and Stan code, to use directly or to adapt. It begins by examining the normal model from both frequentist and Bayesian perspectives and then progresses to a full range of Bayesian generalized linear and mixed or hierarchical models, as well as additional types of models such as ABC and INLA. The book provides code that is largely unavailable elsewhere and includes details on interpreting and evaluating Bayesian models. Initial discussions offer models in synthetic form so that readers can easily adapt them to their own data; later the models are applied to real astronomical data. The consistent focus is on hands-on modeling, analysis of data, and interpretations that address scientific questions. A must-have for astronomers, its concrete approach will also be attractive to researchers in the sciences more generally.

  2. Cost-effective water quality assessment through the integration of monitoring data and modeling results

    NASA Astrophysics Data System (ADS)

    Lobuglio, Joseph N.; Characklis, Gregory W.; Serre, Marc L.

    2007-03-01

    Sparse monitoring data and error inherent in water quality models make the identification of waters not meeting regulatory standards uncertain. Additional monitoring can be implemented to reduce this uncertainty, but it is often expensive. These costs are currently a major concern, since developing total maximum daily loads, as mandated by the Clean Water Act, will require assessing tens of thousands of water bodies across the United States. This work uses the Bayesian maximum entropy (BME) method of modern geostatistics to integrate water quality monitoring data together with model predictions to provide improved estimates of water quality in a cost-effective manner. This information includes estimates of uncertainty and can be used to aid probabilistic-based decisions concerning the status of a water (i.e., impaired or not impaired) and the level of monitoring needed to characterize the water for regulatory purposes. This approach is applied to the Catawba River reservoir system in western North Carolina as a means of estimating seasonal chlorophyll a concentration. Mean concentration and confidence intervals for chlorophyll a are estimated for 66 reservoir segments over an 11-year period (726 values) based on 219 measured seasonal averages and 54 model predictions. Although the model predictions had a high degree of uncertainty, integration of modeling results via BME methods reduced the uncertainty associated with chlorophyll estimates compared with estimates made solely with information from monitoring efforts. Probabilistic predictions of future chlorophyll levels on one reservoir are used to illustrate the cost savings that can be achieved by less extensive and rigorous monitoring methods within the BME framework. While BME methods have been applied in several environmental contexts, employing these methods as a means of integrating monitoring and modeling results, as well as application of this approach to the assessment of surface water monitoring networks, represent unexplored areas of research.

  3. Geostatistics and spatial analysis in biological anthropology.

    PubMed

    Relethford, John H

    2008-05-01

    A variety of methods have been used to make evolutionary inferences based on the spatial distribution of biological data, including reconstructing population history and detection of the geographic pattern of natural selection. This article provides an examination of geostatistical analysis, a method used widely in geology but which has not often been applied in biological anthropology. Geostatistical analysis begins with the examination of a variogram, a plot showing the relationship between a biological distance measure and the geographic distance between data points and which provides information on the extent and pattern of spatial correlation. The results of variogram analysis are used for interpolating values of unknown data points in order to construct a contour map, a process known as kriging. The methods of geostatistical analysis and discussion of potential problems are applied to a large data set of anthropometric measures for 197 populations in Ireland. The geostatistical analysis reveals two major sources of spatial variation. One pattern, seen for overall body and craniofacial size, shows an east-west cline most likely reflecting the combined effects of past population dispersal and settlement. The second pattern is seen for craniofacial height and shows an isolation by distance pattern reflecting rapid spatial changes in the midlands region of Ireland, perhaps attributable to the genetic impact of the Vikings. The correspondence of these results with other analyses of these data and the additional insights generated from variogram analysis and kriging illustrate the potential utility of geostatistical analysis in biological anthropology. (c) 2008 Wiley-Liss, Inc.

  4. Automated connectionist-geostatistical classification as an approach to identify sea ice and land ice types, properties and provinces

    NASA Astrophysics Data System (ADS)

    Goetz-Weiss, L. R.; Herzfeld, U. C.; Trantow, T.; Hunke, E. C.; Maslanik, J. A.; Crocker, R. I.

    2016-12-01

    An important problem in model-data comparison is the identification of parameters that can be extracted from observational data as well as used in numerical models, which are typically based on idealized physical processes. Here, we present a suite of approaches to characterization and classification of sea ice and land ice types, properties and provinces based on several types of remote-sensing data. Applications will be given to not only illustrate the approach, but employ it in model evaluation and understanding of physical processes. (1) In a geostatistical characterization, spatial sea-ice properties in the Chukchi and Beaufort Sea and in Elsoon Lagoon are derived from analysis of RADARSAT and ERS-2 SAR data. (2) The analysis is taken further by utilizing multi-parameter feature vectors as inputs for unsupervised and supervised statistical classification, which facilitates classification of different sea-ice types. (3) Characteristic sea-ice parameters, as resultant from the classification, can then be applied in model evaluation, as demonstrated for the ridging scheme of the Los Alamos sea ice model, CICE, using high-resolution altimeter and image data collected from unmanned aircraft over Fram Strait during the Characterization of Arctic Sea Ice Experiment (CASIE). The characteristic parameters chosen in this application are directly related to deformation processes, which also underly the ridging scheme. (4) The method that is capable of the most complex classification tasks is the connectionist-geostatistical classification method. This approach has been developed to identify currently up to 18 different crevasse types in order to map progression of the surge through the complex Bering-Bagley Glacier System, Alaska, in 2011-2014. The analysis utilizes airborne altimeter data and video image data and satellite image data. Results of the crevasse classification are compare to fracture modeling and found to match.

  5. COST-EFFECTIVE SAMPLING FOR SPATIALLY DISTRIBUTED PHENOMENA

    EPA Science Inventory

    Various measures of sampling plan cost and loss are developed and analyzed as they relate to a variety of multidisciplinary sampling techniques. The sampling choices examined include methods from design-based sampling, model-based sampling, and geostatistics. Graphs and tables ar...

  6. Bayesian Analysis of Nonlinear Structural Equation Models with Nonignorable Missing Data

    ERIC Educational Resources Information Center

    Lee, Sik-Yum

    2006-01-01

    A Bayesian approach is developed for analyzing nonlinear structural equation models with nonignorable missing data. The nonignorable missingness mechanism is specified by a logistic regression model. A hybrid algorithm that combines the Gibbs sampler and the Metropolis-Hastings algorithm is used to produce the joint Bayesian estimates of…

  7. Dynamic Bayesian Network Modeling of Game Based Diagnostic Assessments. CRESST Report 837

    ERIC Educational Resources Information Center

    Levy, Roy

    2014-01-01

    Digital games offer an appealing environment for assessing student proficiencies, including skills and misconceptions in a diagnostic setting. This paper proposes a dynamic Bayesian network modeling approach for observations of student performance from an educational video game. A Bayesian approach to model construction, calibration, and use in…

  8. Bayesian techniques for analyzing group differences in the Iowa Gambling Task: A case study of intuitive and deliberate decision-makers.

    PubMed

    Steingroever, Helen; Pachur, Thorsten; Šmíra, Martin; Lee, Michael D

    2018-06-01

    The Iowa Gambling Task (IGT) is one of the most popular experimental paradigms for comparing complex decision-making across groups. Most commonly, IGT behavior is analyzed using frequentist tests to compare performance across groups, and to compare inferred parameters of cognitive models developed for the IGT. Here, we present a Bayesian alternative based on Bayesian repeated-measures ANOVA for comparing performance, and a suite of three complementary model-based methods for assessing the cognitive processes underlying IGT performance. The three model-based methods involve Bayesian hierarchical parameter estimation, Bayes factor model comparison, and Bayesian latent-mixture modeling. We illustrate these Bayesian methods by applying them to test the extent to which differences in intuitive versus deliberate decision style are associated with differences in IGT performance. The results show that intuitive and deliberate decision-makers behave similarly on the IGT, and the modeling analyses consistently suggest that both groups of decision-makers rely on similar cognitive processes. Our results challenge the notion that individual differences in intuitive and deliberate decision styles have a broad impact on decision-making. They also highlight the advantages of Bayesian methods, especially their ability to quantify evidence in favor of the null hypothesis, and that they allow model-based analyses to incorporate hierarchical and latent-mixture structures.

  9. Invited commentary: Lost in estimation--searching for alternatives to markov chains to fit complex Bayesian models.

    PubMed

    Molitor, John

    2012-03-01

    Bayesian methods have seen an increase in popularity in a wide variety of scientific fields, including epidemiology. One of the main reasons for their widespread application is the power of the Markov chain Monte Carlo (MCMC) techniques generally used to fit these models. As a result, researchers often implicitly associate Bayesian models with MCMC estimation procedures. However, Bayesian models do not always require Markov-chain-based methods for parameter estimation. This is important, as MCMC estimation methods, while generally quite powerful, are complex and computationally expensive and suffer from convergence problems related to the manner in which they generate correlated samples used to estimate probability distributions for parameters of interest. In this issue of the Journal, Cole et al. (Am J Epidemiol. 2012;175(5):368-375) present an interesting paper that discusses non-Markov-chain-based approaches to fitting Bayesian models. These methods, though limited, can overcome some of the problems associated with MCMC techniques and promise to provide simpler approaches to fitting Bayesian models. Applied researchers will find these estimation approaches intuitively appealing and will gain a deeper understanding of Bayesian models through their use. However, readers should be aware that other non-Markov-chain-based methods are currently in active development and have been widely published in other fields.

  10. The Bayesian reader: explaining word recognition as an optimal Bayesian decision process.

    PubMed

    Norris, Dennis

    2006-04-01

    This article presents a theory of visual word recognition that assumes that, in the tasks of word identification, lexical decision, and semantic categorization, human readers behave as optimal Bayesian decision makers. This leads to the development of a computational model of word recognition, the Bayesian reader. The Bayesian reader successfully simulates some of the most significant data on human reading. The model accounts for the nature of the function relating word frequency to reaction time and identification threshold, the effects of neighborhood density and its interaction with frequency, and the variation in the pattern of neighborhood density effects seen in different experimental tasks. Both the general behavior of the model and the way the model predicts different patterns of results in different tasks follow entirely from the assumption that human readers approximate optimal Bayesian decision makers. ((c) 2006 APA, all rights reserved).

  11. Geological, geomechanical and geostatistical assessment of rockfall hazard in San Quirico Village (Abruzzo, Italy)

    NASA Astrophysics Data System (ADS)

    Chiessi, Vittorio; D'Orefice, Maurizio; Scarascia Mugnozza, Gabriele; Vitale, Valerio; Cannese, Christian

    2010-07-01

    This paper describes the results of a rockfall hazard assessment for the village of San Quirico (Abruzzo region, Italy) based on an engineering-geological model. After the collection of geological, geomechanical, and geomorphological data, the rockfall hazard assessment was performed based on two separate approaches: i) simulation of detachment of rock blocks and their downhill movement using a GIS; and ii) application of geostatistical techniques to the analysis of georeferenced observations of previously fallen blocks, in order to assess the probability of arrival of blocks due to potential future collapses. The results show that the trajectographic analysis is significantly influenced by the input parameters, with particular reference to the coefficients of restitution values. In order to solve this problem, the model was calibrated based on repeated field observations. The geostatistical approach is useful because it gives the best estimation of point-source phenomena such as rockfalls; however, the sensitivity of results to basic assumptions, e.g. assessment of variograms and choice of a threshold value, may be problematic. Consequently, interpolations derived from different variograms have been used and compared among them; hence, those showing the lowest errors were adopted. The data sets which were statistically analysed are relevant to both kinetic energy and surveyed rock blocks in the accumulation area. The obtained maps highlight areas susceptible to rock block arrivals, and show that the area accommodating the new settlement of S. Quirico Village has the highest level of hazard according to both probabilistic and deterministic methods.

  12. Assessment of geostatistical features for object-based image classification of contrasted landscape vegetation cover

    NASA Astrophysics Data System (ADS)

    de Oliveira Silveira, Eduarda Martiniano; de Menezes, Michele Duarte; Acerbi Júnior, Fausto Weimar; Castro Nunes Santos Terra, Marcela; de Mello, José Márcio

    2017-07-01

    Accurate mapping and monitoring of savanna and semiarid woodland biomes are needed to support the selection of areas of conservation, to provide sustainable land use, and to improve the understanding of vegetation. The potential of geostatistical features, derived from medium spatial resolution satellite imagery, to characterize contrasted landscape vegetation cover and improve object-based image classification is studied. The study site in Brazil includes cerrado sensu stricto, deciduous forest, and palm swamp vegetation cover. Sentinel 2 and Landsat 8 images were acquired and divided into objects, for each of which a semivariogram was calculated using near-infrared (NIR) and normalized difference vegetation index (NDVI) to extract the set of geostatistical features. The features selected by principal component analysis were used as input data to train a random forest algorithm. Tests were conducted, combining spectral and geostatistical features. Change detection evaluation was performed using a confusion matrix and its accuracies. The semivariogram curves were efficient to characterize spatial heterogeneity, with similar results using NIR and NDVI from Sentinel 2 and Landsat 8. Accuracy was significantly greater when combining geostatistical features with spectral data, suggesting that this method can improve image classification results.

  13. Bayesian flood forecasting methods: A review

    NASA Astrophysics Data System (ADS)

    Han, Shasha; Coulibaly, Paulin

    2017-08-01

    Over the past few decades, floods have been seen as one of the most common and largely distributed natural disasters in the world. If floods could be accurately forecasted in advance, then their negative impacts could be greatly minimized. It is widely recognized that quantification and reduction of uncertainty associated with the hydrologic forecast is of great importance for flood estimation and rational decision making. Bayesian forecasting system (BFS) offers an ideal theoretic framework for uncertainty quantification that can be developed for probabilistic flood forecasting via any deterministic hydrologic model. It provides suitable theoretical structure, empirically validated models and reasonable analytic-numerical computation method, and can be developed into various Bayesian forecasting approaches. This paper presents a comprehensive review on Bayesian forecasting approaches applied in flood forecasting from 1999 till now. The review starts with an overview of fundamentals of BFS and recent advances in BFS, followed with BFS application in river stage forecasting and real-time flood forecasting, then move to a critical analysis by evaluating advantages and limitations of Bayesian forecasting methods and other predictive uncertainty assessment approaches in flood forecasting, and finally discusses the future research direction in Bayesian flood forecasting. Results show that the Bayesian flood forecasting approach is an effective and advanced way for flood estimation, it considers all sources of uncertainties and produces a predictive distribution of the river stage, river discharge or runoff, thus gives more accurate and reliable flood forecasts. Some emerging Bayesian forecasting methods (e.g. ensemble Bayesian forecasting system, Bayesian multi-model combination) were shown to overcome limitations of single model or fixed model weight and effectively reduce predictive uncertainty. In recent years, various Bayesian flood forecasting approaches have been developed and widely applied, but there is still room for improvements. Future research in the context of Bayesian flood forecasting should be on assimilation of various sources of newly available information and improvement of predictive performance assessment methods.

  14. Bayesian modeling of flexible cognitive control

    PubMed Central

    Jiang, Jiefeng; Heller, Katherine; Egner, Tobias

    2014-01-01

    “Cognitive control” describes endogenous guidance of behavior in situations where routine stimulus-response associations are suboptimal for achieving a desired goal. The computational and neural mechanisms underlying this capacity remain poorly understood. We examine recent advances stemming from the application of a Bayesian learner perspective that provides optimal prediction for control processes. In reviewing the application of Bayesian models to cognitive control, we note that an important limitation in current models is a lack of a plausible mechanism for the flexible adjustment of control over conflict levels changing at varying temporal scales. We then show that flexible cognitive control can be achieved by a Bayesian model with a volatility-driven learning mechanism that modulates dynamically the relative dependence on recent and remote experiences in its prediction of future control demand. We conclude that the emergent Bayesian perspective on computational mechanisms of cognitive control holds considerable promise, especially if future studies can identify neural substrates of the variables encoded by these models, and determine the nature (Bayesian or otherwise) of their neural implementation. PMID:24929218

  15. Bayesian generalized linear mixed modeling of Tuberculosis using informative priors.

    PubMed

    Ojo, Oluwatobi Blessing; Lougue, Siaka; Woldegerima, Woldegebriel Assefa

    2017-01-01

    TB is rated as one of the world's deadliest diseases and South Africa ranks 9th out of the 22 countries with hardest hit of TB. Although many pieces of research have been carried out on this subject, this paper steps further by inculcating past knowledge into the model, using Bayesian approach with informative prior. Bayesian statistics approach is getting popular in data analyses. But, most applications of Bayesian inference technique are limited to situations of non-informative prior, where there is no solid external information about the distribution of the parameter of interest. The main aim of this study is to profile people living with TB in South Africa. In this paper, identical regression models are fitted for classical and Bayesian approach both with non-informative and informative prior, using South Africa General Household Survey (GHS) data for the year 2014. For the Bayesian model with informative prior, South Africa General Household Survey dataset for the year 2011 to 2013 are used to set up priors for the model 2014.

  16. Bayesian statistics in medicine: a 25 year review.

    PubMed

    Ashby, Deborah

    2006-11-15

    This review examines the state of Bayesian thinking as Statistics in Medicine was launched in 1982, reflecting particularly on its applicability and uses in medical research. It then looks at each subsequent five-year epoch, with a focus on papers appearing in Statistics in Medicine, putting these in the context of major developments in Bayesian thinking and computation with reference to important books, landmark meetings and seminal papers. It charts the growth of Bayesian statistics as it is applied to medicine and makes predictions for the future. From sparse beginnings, where Bayesian statistics was barely mentioned, Bayesian statistics has now permeated all the major areas of medical statistics, including clinical trials, epidemiology, meta-analyses and evidence synthesis, spatial modelling, longitudinal modelling, survival modelling, molecular genetics and decision-making in respect of new technologies.

  17. Bayesian Parameter Inference and Model Selection by Population Annealing in Systems Biology

    PubMed Central

    Murakami, Yohei

    2014-01-01

    Parameter inference and model selection are very important for mathematical modeling in systems biology. Bayesian statistics can be used to conduct both parameter inference and model selection. Especially, the framework named approximate Bayesian computation is often used for parameter inference and model selection in systems biology. However, Monte Carlo methods needs to be used to compute Bayesian posterior distributions. In addition, the posterior distributions of parameters are sometimes almost uniform or very similar to their prior distributions. In such cases, it is difficult to choose one specific value of parameter with high credibility as the representative value of the distribution. To overcome the problems, we introduced one of the population Monte Carlo algorithms, population annealing. Although population annealing is usually used in statistical mechanics, we showed that population annealing can be used to compute Bayesian posterior distributions in the approximate Bayesian computation framework. To deal with un-identifiability of the representative values of parameters, we proposed to run the simulations with the parameter ensemble sampled from the posterior distribution, named “posterior parameter ensemble”. We showed that population annealing is an efficient and convenient algorithm to generate posterior parameter ensemble. We also showed that the simulations with the posterior parameter ensemble can, not only reproduce the data used for parameter inference, but also capture and predict the data which was not used for parameter inference. Lastly, we introduced the marginal likelihood in the approximate Bayesian computation framework for Bayesian model selection. We showed that population annealing enables us to compute the marginal likelihood in the approximate Bayesian computation framework and conduct model selection depending on the Bayes factor. PMID:25089832

  18. An introduction to Bayesian statistics in health psychology.

    PubMed

    Depaoli, Sarah; Rus, Holly M; Clifton, James P; van de Schoot, Rens; Tiemensma, Jitske

    2017-09-01

    The aim of the current article is to provide a brief introduction to Bayesian statistics within the field of health psychology. Bayesian methods are increasing in prevalence in applied fields, and they have been shown in simulation research to improve the estimation accuracy of structural equation models, latent growth curve (and mixture) models, and hierarchical linear models. Likewise, Bayesian methods can be used with small sample sizes since they do not rely on large sample theory. In this article, we discuss several important components of Bayesian statistics as they relate to health-based inquiries. We discuss the incorporation and impact of prior knowledge into the estimation process and the different components of the analysis that should be reported in an article. We present an example implementing Bayesian estimation in the context of blood pressure changes after participants experienced an acute stressor. We conclude with final thoughts on the implementation of Bayesian statistics in health psychology, including suggestions for reviewing Bayesian manuscripts and grant proposals. We have also included an extensive amount of online supplementary material to complement the content presented here, including Bayesian examples using many different software programmes and an extensive sensitivity analysis examining the impact of priors.

  19. Malaria Risk Mapping for Control in the Republic of Sudan

    PubMed Central

    Noor, Abdisalan M.; ElMardi, Khalid A.; Abdelgader, Tarig M.; Patil, Anand P.; Amine, Ahmed A. A.; Bakhiet, Sahar; Mukhtar, Maowia M.; Snow, Robert W.

    2012-01-01

    Evidence shows that malaria risk maps are rarely tailored to address national control program ambitions. Here, we generate a malaria risk map adapted for malaria control in Sudan. Community Plasmodium falciparum parasite rate (PfPR) data from 2000 to 2010 were assembled and were standardized to 2–10 years of age (PfPR2–10). Space-time Bayesian geostatistical methods were used to generate a map of malaria risk for 2010. Surfaces of aridity, urbanization, irrigation schemes, and refugee camps were combined with the PfPR2–10 map to tailor the epidemiological stratification for appropriate intervention design. In 2010, a majority of the geographical area of the Sudan had risk of < 1% PfPR2–10. Areas of meso- and hyperendemic risk were located in the south. About 80% of Sudan's population in 2011 was in the areas in the desert, urban centers, or where risk was < 1% PfPR2–10. Aggregated data suggest reducing risks in some high transmission areas since the 1960s. PMID:23033400

  20. A Bayesian Approach for Summarizing and Modeling Time-Series Exposure Data with Left Censoring.

    PubMed

    Houseman, E Andres; Virji, M Abbas

    2017-08-01

    Direct reading instruments are valuable tools for measuring exposure as they provide real-time measurements for rapid decision making. However, their use is limited to general survey applications in part due to issues related to their performance. Moreover, statistical analysis of real-time data is complicated by autocorrelation among successive measurements, non-stationary time series, and the presence of left-censoring due to limit-of-detection (LOD). A Bayesian framework is proposed that accounts for non-stationary autocorrelation and LOD issues in exposure time-series data in order to model workplace factors that affect exposure and estimate summary statistics for tasks or other covariates of interest. A spline-based approach is used to model non-stationary autocorrelation with relatively few assumptions about autocorrelation structure. Left-censoring is addressed by integrating over the left tail of the distribution. The model is fit using Markov-Chain Monte Carlo within a Bayesian paradigm. The method can flexibly account for hierarchical relationships, random effects and fixed effects of covariates. The method is implemented using the rjags package in R, and is illustrated by applying it to real-time exposure data. Estimates for task means and covariates from the Bayesian model are compared to those from conventional frequentist models including linear regression, mixed-effects, and time-series models with different autocorrelation structures. Simulations studies are also conducted to evaluate method performance. Simulation studies with percent of measurements below the LOD ranging from 0 to 50% showed lowest root mean squared errors for task means and the least biased standard deviations from the Bayesian model compared to the frequentist models across all levels of LOD. In the application, task means from the Bayesian model were similar to means from the frequentist models, while the standard deviations were different. Parameter estimates for covariates were significant in some frequentist models, but in the Bayesian model their credible intervals contained zero; such discrepancies were observed in multiple datasets. Variance components from the Bayesian model reflected substantial autocorrelation, consistent with the frequentist models, except for the auto-regressive moving average model. Plots of means from the Bayesian model showed good fit to the observed data. The proposed Bayesian model provides an approach for modeling non-stationary autocorrelation in a hierarchical modeling framework to estimate task means, standard deviations, quantiles, and parameter estimates for covariates that are less biased and have better performance characteristics than some of the contemporary methods. Published by Oxford University Press on behalf of the British Occupational Hygiene Society 2017.

  1. A Hierarchical Multivariate Bayesian Approach to Ensemble Model output Statistics in Atmospheric Prediction

    DTIC Science & Technology

    2017-09-01

    efficacy of statistical post-processing methods downstream of these dynamical model components with a hierarchical multivariate Bayesian approach to...Bayesian hierarchical modeling, Markov chain Monte Carlo methods , Metropolis algorithm, machine learning, atmospheric prediction 15. NUMBER OF PAGES...scale processes. However, this dissertation explores the efficacy of statistical post-processing methods downstream of these dynamical model components

  2. Bayesian Learning and the Psychology of Rule Induction

    ERIC Educational Resources Information Center

    Endress, Ansgar D.

    2013-01-01

    In recent years, Bayesian learning models have been applied to an increasing variety of domains. While such models have been criticized on theoretical grounds, the underlying assumptions and predictions are rarely made concrete and tested experimentally. Here, I use Frank and Tenenbaum's (2011) Bayesian model of rule-learning as a case study to…

  3. Properties of the Bayesian Knowledge Tracing Model

    ERIC Educational Resources Information Center

    van de Sande, Brett

    2013-01-01

    Bayesian Knowledge Tracing is used very widely to model student learning. It comes in two different forms: The first form is the Bayesian Knowledge Tracing "hidden Markov model" which predicts the probability of correct application of a skill as a function of the number of previous opportunities to apply that skill and the model…

  4. Bayesian Analysis of Longitudinal Data Using Growth Curve Models

    ERIC Educational Resources Information Center

    Zhang, Zhiyong; Hamagami, Fumiaki; Wang, Lijuan Lijuan; Nesselroade, John R.; Grimm, Kevin J.

    2007-01-01

    Bayesian methods for analyzing longitudinal data in social and behavioral research are recommended for their ability to incorporate prior information in estimating simple and complex models. We first summarize the basics of Bayesian methods before presenting an empirical example in which we fit a latent basis growth curve model to achievement data…

  5. Testing students' e-learning via Facebook through Bayesian structural equation modeling.

    PubMed

    Salarzadeh Jenatabadi, Hashem; Moghavvemi, Sedigheh; Wan Mohamed Radzi, Che Wan Jasimah Bt; Babashamsi, Parastoo; Arashi, Mohammad

    2017-01-01

    Learning is an intentional activity, with several factors affecting students' intention to use new learning technology. Researchers have investigated technology acceptance in different contexts by developing various theories/models and testing them by a number of means. Although most theories/models developed have been examined through regression or structural equation modeling, Bayesian analysis offers more accurate data analysis results. To address this gap, the unified theory of acceptance and technology use in the context of e-learning via Facebook are re-examined in this study using Bayesian analysis. The data (S1 Data) were collected from 170 students enrolled in a business statistics course at University of Malaya, Malaysia, and tested with the maximum likelihood and Bayesian approaches. The difference between the two methods' results indicates that performance expectancy and hedonic motivation are the strongest factors influencing the intention to use e-learning via Facebook. The Bayesian estimation model exhibited better data fit than the maximum likelihood estimator model. The results of the Bayesian and maximum likelihood estimator approaches are compared and the reasons for the result discrepancy are deliberated.

  6. Testing students’ e-learning via Facebook through Bayesian structural equation modeling

    PubMed Central

    Moghavvemi, Sedigheh; Wan Mohamed Radzi, Che Wan Jasimah Bt; Babashamsi, Parastoo; Arashi, Mohammad

    2017-01-01

    Learning is an intentional activity, with several factors affecting students’ intention to use new learning technology. Researchers have investigated technology acceptance in different contexts by developing various theories/models and testing them by a number of means. Although most theories/models developed have been examined through regression or structural equation modeling, Bayesian analysis offers more accurate data analysis results. To address this gap, the unified theory of acceptance and technology use in the context of e-learning via Facebook are re-examined in this study using Bayesian analysis. The data (S1 Data) were collected from 170 students enrolled in a business statistics course at University of Malaya, Malaysia, and tested with the maximum likelihood and Bayesian approaches. The difference between the two methods’ results indicates that performance expectancy and hedonic motivation are the strongest factors influencing the intention to use e-learning via Facebook. The Bayesian estimation model exhibited better data fit than the maximum likelihood estimator model. The results of the Bayesian and maximum likelihood estimator approaches are compared and the reasons for the result discrepancy are deliberated. PMID:28886019

  7. Goal-Oriented Intelligence in Optimization of Distributed Parameter Systems

    DTIC Science & Technology

    2004-08-01

    Yarus, and R.L. Chambers, editors, AAPG Computer Applications in geology, No. 3, The American Association of Petroleum Geologists, Tulsa, OK, USA...Stochastic Modeling and Geostatistics – Principles, Methods, and Case Studies, AAPG Computer Applications in geology, No. 3, The American

  8. When mechanism matters: Bayesian forecasting using models of ecological diffusion

    USGS Publications Warehouse

    Hefley, Trevor J.; Hooten, Mevin B.; Russell, Robin E.; Walsh, Daniel P.; Powell, James A.

    2017-01-01

    Ecological diffusion is a theory that can be used to understand and forecast spatio-temporal processes such as dispersal, invasion, and the spread of disease. Hierarchical Bayesian modelling provides a framework to make statistical inference and probabilistic forecasts, using mechanistic ecological models. To illustrate, we show how hierarchical Bayesian models of ecological diffusion can be implemented for large data sets that are distributed densely across space and time. The hierarchical Bayesian approach is used to understand and forecast the growth and geographic spread in the prevalence of chronic wasting disease in white-tailed deer (Odocoileus virginianus). We compare statistical inference and forecasts from our hierarchical Bayesian model to phenomenological regression-based methods that are commonly used to analyse spatial occurrence data. The mechanistic statistical model based on ecological diffusion led to important ecological insights, obviated a commonly ignored type of collinearity, and was the most accurate method for forecasting.

  9. GEOSTATISTICAL SAMPLING DESIGNS FOR HAZARDOUS WASTE SITES

    EPA Science Inventory

    This chapter discusses field sampling design for environmental sites and hazardous waste sites with respect to random variable sampling theory, Gy's sampling theory, and geostatistical (kriging) sampling theory. The literature often presents these sampling methods as an adversari...

  10. [Spatial distribution pattern of Chilo suppressalis analyzed by classical method and geostatistics].

    PubMed

    Yuan, Zheming; Fu, Wei; Li, Fangyi

    2004-04-01

    Two original samples of Chilo suppressalis and their grid, random and sequence samples were analyzed by classical method and geostatistics to characterize the spatial distribution pattern of C. suppressalis. The limitations of spatial distribution analysis with classical method, especially influenced by the original position of grid, were summarized rather completely. On the contrary, geostatistics characterized well the spatial distribution pattern, congregation intensity and spatial heterogeneity of C. suppressalis. According to geostatistics, the population was up to Poisson distribution in low density. As for higher density population, its distribution was up to aggregative, and the aggregation intensity and dependence range were 0.1056 and 193 cm, respectively. Spatial heterogeneity was also found in the higher density population. Its spatial correlativity in line direction was more closely than that in row direction, and the dependence ranges in line and row direction were 115 and 264 cm, respectively.

  11. Geostatistics and GIS: tools for characterizing environmental contamination.

    PubMed

    Henshaw, Shannon L; Curriero, Frank C; Shields, Timothy M; Glass, Gregory E; Strickland, Paul T; Breysse, Patrick N

    2004-08-01

    Geostatistics is a set of statistical techniques used in the analysis of georeferenced data that can be applied to environmental contamination and remediation studies. In this study, the 1,1-dichloro-2,2-bis(p-chlorophenyl)ethylene (DDE) contamination at a Superfund site in western Maryland is evaluated. Concern about the site and its future clean up has triggered interest within the community because residential development surrounds the area. Spatial statistical methods, of which geostatistics is a subset, are becoming increasingly popular, in part due to the availability of geographic information system (GIS) software in a variety of application packages. In this article, the joint use of ArcGIS software and the R statistical computing environment are demonstrated as an approach for comprehensive geostatistical analyses. The spatial regression method, kriging, is used to provide predictions of DDE levels at unsampled locations both within the site and the surrounding areas where residential development is ongoing.

  12. Bayesian naturalness, simplicity, and testability applied to the B ‑ L MSSM GUT

    NASA Astrophysics Data System (ADS)

    Fundira, Panashe; Purves, Austin

    2018-04-01

    Recent years have seen increased use of Bayesian model comparison to quantify notions such as naturalness, simplicity, and testability, especially in the area of supersymmetric model building. After demonstrating that Bayesian model comparison can resolve a paradox that has been raised in the literature concerning the naturalness of the proton mass, we apply Bayesian model comparison to GUTs, an area to which it has not been applied before. We find that the GUTs are substantially favored over the nonunifying puzzle model. Of the GUTs we consider, the B ‑ L MSSM GUT is the most favored, but the MSSM GUT is almost equally favored.

  13. Surge of Bering Glacier and Bagley Ice Field: Parameterization of surge characteristics based on automated analysis of crevasse image data and laser altimeter data

    NASA Astrophysics Data System (ADS)

    Stachura, M.; Herzfeld, U. C.; McDonald, B.; Weltman, A.; Hale, G.; Trantow, T.

    2012-12-01

    The dynamical processes that occur during the surge of a large, complex glacier system are far from being understood. The aim of this paper is to derive a parameterization of surge characteristics that captures the principle processes and can serve as the basis for a dynamic surge model. Innovative mathematical methods are introduced that facilitate derivation of such a parameterization from remote-sensing observations. Methods include automated geostatistical characterization and connectionist-geostatistical classification of dynamic provinces and deformation states, using the vehicle of crevasse patterns. These methods are applied to analyze satellite and airborne image and laser altimeter data collected during the current surge of Bering Glacier and Bagley Ice Field, Alaska.

  14. Linking in situ LAI and fine resolution remote sensing data to map reference LAI over cropland and grassland using geostatistical regression method

    NASA Astrophysics Data System (ADS)

    He, Yaqian; Bo, Yanchen; Chai, Leilei; Liu, Xiaolong; Li, Aihua

    2016-08-01

    Leaf Area Index (LAI) is an important parameter of vegetation structure. A number of moderate resolution LAI products have been produced in urgent need of large scale vegetation monitoring. High resolution LAI reference maps are necessary to validate these LAI products. This study used a geostatistical regression (GR) method to estimate LAI reference maps by linking in situ LAI and Landsat TM/ETM+ and SPOT-HRV data over two cropland and two grassland sites. To explore the discrepancies of employing different vegetation indices (VIs) on estimating LAI reference maps, this study established the GR models for different VIs, including difference vegetation index (DVI), normalized difference vegetation index (NDVI), and ratio vegetation index (RVI). To further assess the performance of the GR model, the results from the GR and Reduced Major Axis (RMA) models were compared. The results show that the performance of the GR model varies between the cropland and grassland sites. At the cropland sites, the GR model based on DVI provides the best estimation, while at the grassland sites, the GR model based on DVI performs poorly. Compared to the RMA model, the GR model improves the accuracy of reference LAI maps in terms of root mean square errors (RMSE) and bias.

  15. Geostatistical analysis of fault and joint measurements in Austin Chalk, Superconducting Super Collider Site, Texas

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mace, R.E.; Nance, H.S.; Laubach, S.E.

    1995-06-01

    Faults and joints are conduits for ground-water flow and targets for horizontal drilling in the petroleum industry. Spacing and size distribution are rarely predicted accurately by current structural models or documented adequately by conventional borehole or outcrop samples. Tunnel excavations present opportunities to measure fracture attributes in continuous subsurface exposures. These fracture measurements ran be used to improve structural models, guide interpretation of conventional borehole and outcrop data, and geostatistically quantify spatial and spacing characteristics for comparison to outcrop data or for generating distributions of fracture for numerical flow and transport modeling. Structure maps of over 9 mi of nearlymore » continuous tunnel excavations in Austin Chalk at the Superconducting Super Collider (SSC) site in Ellis County, Texas, provide a unique database of fault and joint populations for geostatistical analysis. Observationally, small faults (<10 ft. throw) occur in clusters or swarms that have as many as 24 faults, fault swarms are as much as 2,000 ft. wide and appear to be on average 1,000 ft. apart, and joints are in swarms spaced 500 to more than 2l,000 ft. apart. Semi-variograms show varying degrees of spatial correlation. These variograms have structured sills that correlate directly to highs and lows in fracture frequency observed in the tunnel. Semi-variograms generated with respect to fracture spacing and number also have structured sills, but tend to not show any near-field correlation. The distribution of fault spacing can be described with a negative exponential, which suggests a random distribution. However, there is clearly some structure and clustering in the spacing data as shown by running average and variograms, which implies that a number of different methods should be utilized to characterize fracture spacing.« less

  16. Comparison Between Linear and Non-parametric Regression Models for Genome-Enabled Prediction in Wheat

    PubMed Central

    Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne

    2012-01-01

    In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models. PMID:23275882

  17. Comparison between linear and non-parametric regression models for genome-enabled prediction in wheat.

    PubMed

    Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne

    2012-12-01

    In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.

  18. Fast genomic predictions via Bayesian G-BLUP and multilocus models of threshold traits including censored Gaussian data.

    PubMed

    Kärkkäinen, Hanni P; Sillanpää, Mikko J

    2013-09-04

    Because of the increased availability of genome-wide sets of molecular markers along with reduced cost of genotyping large samples of individuals, genomic estimated breeding values have become an essential resource in plant and animal breeding. Bayesian methods for breeding value estimation have proven to be accurate and efficient; however, the ever-increasing data sets are placing heavy demands on the parameter estimation algorithms. Although a commendable number of fast estimation algorithms are available for Bayesian models of continuous Gaussian traits, there is a shortage for corresponding models of discrete or censored phenotypes. In this work, we consider a threshold approach of binary, ordinal, and censored Gaussian observations for Bayesian multilocus association models and Bayesian genomic best linear unbiased prediction and present a high-speed generalized expectation maximization algorithm for parameter estimation under these models. We demonstrate our method with simulated and real data. Our example analyses suggest that the use of the extra information present in an ordered categorical or censored Gaussian data set, instead of dichotomizing the data into case-control observations, increases the accuracy of genomic breeding values predicted by Bayesian multilocus association models or by Bayesian genomic best linear unbiased prediction. Furthermore, the example analyses indicate that the correct threshold model is more accurate than the directly used Gaussian model with a censored Gaussian data, while with a binary or an ordinal data the superiority of the threshold model could not be confirmed.

  19. Fast Genomic Predictions via Bayesian G-BLUP and Multilocus Models of Threshold Traits Including Censored Gaussian Data

    PubMed Central

    Kärkkäinen, Hanni P.; Sillanpää, Mikko J.

    2013-01-01

    Because of the increased availability of genome-wide sets of molecular markers along with reduced cost of genotyping large samples of individuals, genomic estimated breeding values have become an essential resource in plant and animal breeding. Bayesian methods for breeding value estimation have proven to be accurate and efficient; however, the ever-increasing data sets are placing heavy demands on the parameter estimation algorithms. Although a commendable number of fast estimation algorithms are available for Bayesian models of continuous Gaussian traits, there is a shortage for corresponding models of discrete or censored phenotypes. In this work, we consider a threshold approach of binary, ordinal, and censored Gaussian observations for Bayesian multilocus association models and Bayesian genomic best linear unbiased prediction and present a high-speed generalized expectation maximization algorithm for parameter estimation under these models. We demonstrate our method with simulated and real data. Our example analyses suggest that the use of the extra information present in an ordered categorical or censored Gaussian data set, instead of dichotomizing the data into case-control observations, increases the accuracy of genomic breeding values predicted by Bayesian multilocus association models or by Bayesian genomic best linear unbiased prediction. Furthermore, the example analyses indicate that the correct threshold model is more accurate than the directly used Gaussian model with a censored Gaussian data, while with a binary or an ordinal data the superiority of the threshold model could not be confirmed. PMID:23821618

  20. Enhancing multiple-point geostatistical modeling: 1. Graph theory and pattern adjustment

    NASA Astrophysics Data System (ADS)

    Tahmasebi, Pejman; Sahimi, Muhammad

    2016-03-01

    In recent years, higher-order geostatistical methods have been used for modeling of a wide variety of large-scale porous media, such as groundwater aquifers and oil reservoirs. Their popularity stems from their ability to account for qualitative data and the great flexibility that they offer for conditioning the models to hard (quantitative) data, which endow them with the capability for generating realistic realizations of porous formations with very complex channels, as well as features that are mainly a barrier to fluid flow. One group of such models consists of pattern-based methods that use a set of data points for generating stochastic realizations by which the large-scale structure and highly-connected features are reproduced accurately. The cross correlation-based simulation (CCSIM) algorithm, proposed previously by the authors, is a member of this group that has been shown to be capable of simulating multimillion cell models in a matter of a few CPU seconds. The method is, however, sensitive to pattern's specifications, such as boundaries and the number of replicates. In this paper the original CCSIM algorithm is reconsidered and two significant improvements are proposed for accurately reproducing large-scale patterns of heterogeneities in porous media. First, an effective boundary-correction method based on the graph theory is presented by which one identifies the optimal cutting path/surface for removing the patchiness and discontinuities in the realization of a porous medium. Next, a new pattern adjustment method is proposed that automatically transfers the features in a pattern to one that seamlessly matches the surrounding patterns. The original CCSIM algorithm is then combined with the two methods and is tested using various complex two- and three-dimensional examples. It should, however, be emphasized that the methods that we propose in this paper are applicable to other pattern-based geostatistical simulation methods.

  1. [Spatial variability of soil nutrients based on geostatistics combined with GIS--a case study in Zunghua City of Hebei Province].

    PubMed

    Guo, X; Fu, B; Ma, K; Chen, L

    2000-08-01

    Geostatistics combined with GIS was applied to analyze the spatial variability of soil nutrients in topsoil (0-20 cm) in Zunghua City of Hebei Province. GIS can integrate attribute data with geographical data of system variables, which makes the application of geostatistics technique for large spatial scale more convenient. Soil nutrient data in this study included available N (alkaline hydrolyzing nitrogen), total N, available K, available P and organic matter. The results showed that the semivariograms of soil nutrients were best described by spherical model, except for that of available K, which was best fitted by complex structure of exponential model and linear with sill model. The spatial variability of available K was mainly produced by structural factor, while that of available N, total N, available P and organic matter was primarily caused by random factor. However, their spatial heterogeneity degree was different: the degree of total N and organic matter was higher, and that of available P and available N was lower. The results also indicated that the spatial correlation of the five tested soil nutrients at this large scale was moderately dependent. The ranges of available N and available P were almost same, which were 5 km and 5.5 km, respectively. The range of total N was up to 18 km, and that of organic matter was 8.5 km. For available K, the spatial variability scale primarily expressed exponential model between 0-3.5 km, but linear with sill model between 3.5-25.5 km. In addition, five soil nutrients exhibited different isotropic ranges. Available N and available P were isotropic through the whole research range (0-28 km). The isotropic range of available K was 0-8 km, and that of total N and organic matter was 0-10 km.

  2. APPLICATION OF BAYESIAN MONTE CARLO ANALYSIS TO A LAGRANGIAN PHOTOCHEMICAL AIR QUALITY MODEL. (R824792)

    EPA Science Inventory

    Uncertainties in ozone concentrations predicted with a Lagrangian photochemical air quality model have been estimated using Bayesian Monte Carlo (BMC) analysis. Bayesian Monte Carlo analysis provides a means of combining subjective "prior" uncertainty estimates developed ...

  3. A SEMIPARAMETRIC BAYESIAN MODEL FOR CIRCULAR-LINEAR REGRESSION

    EPA Science Inventory

    We present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is usefu...

  4. Multiple-point statistical simulation for hydrogeological models: 3-D training image development and conditioning strategies

    NASA Astrophysics Data System (ADS)

    Høyer, Anne-Sophie; Vignoli, Giulio; Mejer Hansen, Thomas; Thanh Vu, Le; Keefer, Donald A.; Jørgensen, Flemming

    2017-12-01

    Most studies on the application of geostatistical simulations based on multiple-point statistics (MPS) to hydrogeological modelling focus on relatively fine-scale models and concentrate on the estimation of facies-level structural uncertainty. Much less attention is paid to the use of input data and optimal construction of training images. For instance, even though the training image should capture a set of spatial geological characteristics to guide the simulations, the majority of the research still relies on 2-D or quasi-3-D training images. In the present study, we demonstrate a novel strategy for 3-D MPS modelling characterized by (i) realistic 3-D training images and (ii) an effective workflow for incorporating a diverse group of geological and geophysical data sets. The study covers an area of 2810 km2 in the southern part of Denmark. MPS simulations are performed on a subset of the geological succession (the lower to middle Miocene sediments) which is characterized by relatively uniform structures and dominated by sand and clay. The simulated domain is large and each of the geostatistical realizations contains approximately 45 million voxels with size 100 m × 100 m × 5 m. Data used for the modelling include water well logs, high-resolution seismic data, and a previously published 3-D geological model. We apply a series of different strategies for the simulations based on data quality, and develop a novel method to effectively create observed spatial trends. The training image is constructed as a relatively small 3-D voxel model covering an area of 90 km2. We use an iterative training image development strategy and find that even slight modifications in the training image create significant changes in simulations. Thus, this study shows how to include both the geological environment and the type and quality of input information in order to achieve optimal results from MPS modelling. We present a practical workflow to build the training image and effectively handle different types of input information to perform large-scale geostatistical modelling.

  5. A Bayesian alternative for multi-objective ecohydrological model specification

    NASA Astrophysics Data System (ADS)

    Tang, Yating; Marshall, Lucy; Sharma, Ashish; Ajami, Hoori

    2018-01-01

    Recent studies have identified the importance of vegetation processes in terrestrial hydrologic systems. Process-based ecohydrological models combine hydrological, physical, biochemical and ecological processes of the catchments, and as such are generally more complex and parametric than conceptual hydrological models. Thus, appropriate calibration objectives and model uncertainty analysis are essential for ecohydrological modeling. In recent years, Bayesian inference has become one of the most popular tools for quantifying the uncertainties in hydrological modeling with the development of Markov chain Monte Carlo (MCMC) techniques. The Bayesian approach offers an appealing alternative to traditional multi-objective hydrologic model calibrations by defining proper prior distributions that can be considered analogous to the ad-hoc weighting often prescribed in multi-objective calibration. Our study aims to develop appropriate prior distributions and likelihood functions that minimize the model uncertainties and bias within a Bayesian ecohydrological modeling framework based on a traditional Pareto-based model calibration technique. In our study, a Pareto-based multi-objective optimization and a formal Bayesian framework are implemented in a conceptual ecohydrological model that combines a hydrological model (HYMOD) and a modified Bucket Grassland Model (BGM). Simulations focused on one objective (streamflow/LAI) and multiple objectives (streamflow and LAI) with different emphasis defined via the prior distribution of the model error parameters. Results show more reliable outputs for both predicted streamflow and LAI using Bayesian multi-objective calibration with specified prior distributions for error parameters based on results from the Pareto front in the ecohydrological modeling. The methodology implemented here provides insight into the usefulness of multiobjective Bayesian calibration for ecohydrologic systems and the importance of appropriate prior distributions in such approaches.

  6. Diagnostic accuracy of a bayesian latent group analysis for the detection of malingering-related poor effort.

    PubMed

    Ortega, Alonso; Labrenz, Stephan; Markowitsch, Hans J; Piefke, Martina

    2013-01-01

    In the last decade, different statistical techniques have been introduced to improve assessment of malingering-related poor effort. In this context, we have recently shown preliminary evidence that a Bayesian latent group model may help to optimize classification accuracy using a simulation research design. In the present study, we conducted two analyses. Firstly, we evaluated how accurately this Bayesian approach can distinguish between participants answering in an honest way (honest response group) and participants feigning cognitive impairment (experimental malingering group). Secondly, we tested the accuracy of our model in the differentiation between patients who had real cognitive deficits (cognitively impaired group) and participants who belonged to the experimental malingering group. All Bayesian analyses were conducted using the raw scores of a visual recognition forced-choice task (2AFC), the Test of Memory Malingering (TOMM, Trial 2), and the Word Memory Test (WMT, primary effort subtests). The first analysis showed 100% accuracy for the Bayesian model in distinguishing participants of both groups with all effort measures. The second analysis showed outstanding overall accuracy of the Bayesian model when estimates were obtained from the 2AFC and the TOMM raw scores. Diagnostic accuracy of the Bayesian model diminished when using the WMT total raw scores. Despite, overall diagnostic accuracy can still be considered excellent. The most plausible explanation for this decrement is the low performance in verbal recognition and fluency tasks of some patients of the cognitively impaired group. Additionally, the Bayesian model provides individual estimates, p(zi |D), of examinees' effort levels. In conclusion, both high classification accuracy levels and Bayesian individual estimates of effort may be very useful for clinicians when assessing for effort in medico-legal settings.

  7. Probabilistic Inference: Task Dependency and Individual Differences of Probability Weighting Revealed by Hierarchical Bayesian Modeling

    PubMed Central

    Boos, Moritz; Seer, Caroline; Lange, Florian; Kopp, Bruno

    2016-01-01

    Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modeling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities) by two (likelihoods) design. Five computational models of cognitive processes were compared with the observed behavior. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted) S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model's success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modeling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modeling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision. PMID:27303323

  8. Estimating Tree Height-Diameter Models with the Bayesian Method

    PubMed Central

    Duan, Aiguo; Zhang, Jianguo; Xiang, Congwei

    2014-01-01

    Six candidate height-diameter models were used to analyze the height-diameter relationships. The common methods for estimating the height-diameter models have taken the classical (frequentist) approach based on the frequency interpretation of probability, for example, the nonlinear least squares method (NLS) and the maximum likelihood method (ML). The Bayesian method has an exclusive advantage compared with classical method that the parameters to be estimated are regarded as random variables. In this study, the classical and Bayesian methods were used to estimate six height-diameter models, respectively. Both the classical method and Bayesian method showed that the Weibull model was the “best” model using data1. In addition, based on the Weibull model, data2 was used for comparing Bayesian method with informative priors with uninformative priors and classical method. The results showed that the improvement in prediction accuracy with Bayesian method led to narrower confidence bands of predicted value in comparison to that for the classical method, and the credible bands of parameters with informative priors were also narrower than uninformative priors and classical method. The estimated posterior distributions for parameters can be set as new priors in estimating the parameters using data2. PMID:24711733

  9. Estimating tree height-diameter models with the Bayesian method.

    PubMed

    Zhang, Xiongqing; Duan, Aiguo; Zhang, Jianguo; Xiang, Congwei

    2014-01-01

    Six candidate height-diameter models were used to analyze the height-diameter relationships. The common methods for estimating the height-diameter models have taken the classical (frequentist) approach based on the frequency interpretation of probability, for example, the nonlinear least squares method (NLS) and the maximum likelihood method (ML). The Bayesian method has an exclusive advantage compared with classical method that the parameters to be estimated are regarded as random variables. In this study, the classical and Bayesian methods were used to estimate six height-diameter models, respectively. Both the classical method and Bayesian method showed that the Weibull model was the "best" model using data1. In addition, based on the Weibull model, data2 was used for comparing Bayesian method with informative priors with uninformative priors and classical method. The results showed that the improvement in prediction accuracy with Bayesian method led to narrower confidence bands of predicted value in comparison to that for the classical method, and the credible bands of parameters with informative priors were also narrower than uninformative priors and classical method. The estimated posterior distributions for parameters can be set as new priors in estimating the parameters using data2.

  10. A spatial model for a stream networks of Citarik River with the environmental variables: potential of hydrogen (PH) and temperature

    NASA Astrophysics Data System (ADS)

    Bachrudin, A.; Mohamed, N. B.; Supian, S.; Sukono; Hidayat, Y.

    2018-03-01

    Application of existing geostatistical theory of stream networks provides a number of interesting and challenging problems. Most of statistical tools in the traditional geostatistics have been based on a Euclidean distance such as autocovariance functions, but for stream data is not permissible since it deals with a stream distance. To overcome this autocovariance developed a model based on the distance the flow with using convolution kernel approach (moving average construction). Spatial model for a stream networks is widely used to monitor environmental on a river networks. In a case study of a river in province of West Java, the objective of this paper is to analyze a capability of a predictive on two environmental variables, potential of hydrogen (PH) and temperature using ordinary kriging. Several the empirical results show: (1) The best fit of autocovariance functions for temperature and potential hydrogen (ph) of Citarik River is linear which also yields the smallest root mean squared prediction error (RMSPE), (2) the spatial correlation values between the locations on upstream and on downstream of Citarik river exhibit decreasingly

  11. Accurate Biomass Estimation via Bayesian Adaptive Sampling

    NASA Technical Reports Server (NTRS)

    Wheeler, Kevin R.; Knuth, Kevin H.; Castle, Joseph P.; Lvov, Nikolay

    2005-01-01

    The following concepts were introduced: a) Bayesian adaptive sampling for solving biomass estimation; b) Characterization of MISR Rahman model parameters conditioned upon MODIS landcover. c) Rigorous non-parametric Bayesian approach to analytic mixture model determination. d) Unique U.S. asset for science product validation and verification.

  12. ORTHONORMAL RESIDUALS IN GEOSTATISTICS: MODEL CRITICISM AND PARAMETER ESTIMATION. (R825689C037)

    EPA Science Inventory

    The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...

  13. A Simulated Annealing based Optimization Algorithm for Automatic Variogram Model Fitting

    NASA Astrophysics Data System (ADS)

    Soltani-Mohammadi, Saeed; Safa, Mohammad

    2016-09-01

    Fitting a theoretical model to an experimental variogram is an important issue in geostatistical studies because if the variogram model parameters are tainted with uncertainty, the latter will spread in the results of estimations and simulations. Although the most popular fitting method is fitting by eye, in some cases use is made of the automatic fitting method on the basis of putting together the geostatistical principles and optimization techniques to: 1) provide a basic model to improve fitting by eye, 2) fit a model to a large number of experimental variograms in a short time, and 3) incorporate the variogram related uncertainty in the model fitting. Effort has been made in this paper to improve the quality of the fitted model by improving the popular objective function (weighted least squares) in the automatic fitting. Also, since the variogram model function (£) and number of structures (m) too affect the model quality, a program has been provided in the MATLAB software that can present optimum nested variogram models using the simulated annealing method. Finally, to select the most desirable model from among the single/multi-structured fitted models, use has been made of the cross-validation method, and the best model has been introduced to the user as the output. In order to check the capability of the proposed objective function and the procedure, 3 case studies have been presented.

  14. The Applications of Model-Based Geostatistics in Helminth Epidemiology and Control

    PubMed Central

    Magalhães, Ricardo J. Soares; Clements, Archie C.A.; Patil, Anand P.; Gething, Peter W.; Brooker, Simon

    2011-01-01

    Funding agencies are dedicating substantial resources to tackle helminth infections. Reliable maps of the distribution of helminth infection can assist these efforts by targeting control resources to areas of greatest need. The ability to define the distribution of infection at regional, national and subnational levels has been enhanced greatly by the increased availability of good quality survey data and the use of model-based geostatistics (MBG), enabling spatial prediction in unsampled locations. A major advantage of MBG risk mapping approaches is that they provide a flexible statistical platform for handling and representing different sources of uncertainty, providing plausible and robust information on the spatial distribution of infections to inform the design and implementation of control programmes. Focussing on schistosomiasis and soil-transmitted helminthiasis, with additional examples for lymphatic filariasis and onchocerciasis, we review the progress made to date with the application of MBG tools in large-scale, real-world control programmes and propose a general framework for their application to inform integrative spatial planning of helminth disease control programmes. PMID:21295680

  15. Using geostatistical methods to estimate snow water equivalence distribution in a mountain watershed

    USGS Publications Warehouse

    Balk, B.; Elder, K.; Baron, Jill S.

    1998-01-01

    Knowledge of the spatial distribution of snow water equivalence (SWE) is necessary to adequately forecast the volume and timing of snowmelt runoff.  In April 1997, peak accumulation snow depth and density measurements were independently taken in the Loch Vale watershed (6.6 km2), Rocky Mountain National Park, Colorado.  Geostatistics and classical statistics were used to estimate SWE distribution across the watershed.  Snow depths were spatially distributed across the watershed through kriging interpolation methods which provide unbiased estimates that have minimum variances.  Snow densities were spatially modeled through regression analysis.  Combining the modeled depth and density with snow-covered area (SCA produced an estimate of the spatial distribution of SWE.  The kriged estimates of snow depth explained 37-68% of the observed variance in the measured depths.  Steep slopes, variably strong winds, and complex energy balance in the watershed contribute to a large degree of heterogeneity in snow depth.

  16. Multivariate Analysis and Modeling of Sediment Pollution Using Neural Network Models and Geostatistics

    NASA Astrophysics Data System (ADS)

    Golay, Jean; Kanevski, Mikhaïl

    2013-04-01

    The present research deals with the exploration and modeling of a complex dataset of 200 measurement points of sediment pollution by heavy metals in Lake Geneva. The fundamental idea was to use multivariate Artificial Neural Networks (ANN) along with geostatistical models and tools in order to improve the accuracy and the interpretability of data modeling. The results obtained with ANN were compared to those of traditional geostatistical algorithms like ordinary (co)kriging and (co)kriging with an external drift. Exploratory data analysis highlighted a great variety of relationships (i.e. linear, non-linear, independence) between the 11 variables of the dataset (i.e. Cadmium, Mercury, Zinc, Copper, Titanium, Chromium, Vanadium and Nickel as well as the spatial coordinates of the measurement points and their depth). Then, exploratory spatial data analysis (i.e. anisotropic variography, local spatial correlations and moving window statistics) was carried out. It was shown that the different phenomena to be modeled were characterized by high spatial anisotropies, complex spatial correlation structures and heteroscedasticity. A feature selection procedure based on General Regression Neural Networks (GRNN) was also applied to create subsets of variables enabling to improve the predictions during the modeling phase. The basic modeling was conducted using a Multilayer Perceptron (MLP) which is a workhorse of ANN. MLP models are robust and highly flexible tools which can incorporate in a nonlinear manner different kind of high-dimensional information. In the present research, the input layer was made of either two (spatial coordinates) or three neurons (when depth as auxiliary information could possibly capture an underlying trend) and the output layer was composed of one (univariate MLP) to eight neurons corresponding to the heavy metals of the dataset (multivariate MLP). MLP models with three input neurons can be referred to as Artificial Neural Networks with EXternal drift (ANNEX). Moreover, the exact number of output neurons and the selection of the corresponding variables were based on the subsets created during the exploratory phase. Concerning hidden layers, no restriction were made and multiple architectures were tested. For each MLP model, the quality of the modeling procedure was assessed by variograms: if the variogram of the residuals demonstrates pure nugget effect and if the level of the nugget exactly corresponds to the nugget value of the theoretical variogram of the corresponding variable, all the structured information has been correctly extracted without overfitting. Finally, it is worth mentioning that simple MLP models are not always able to remove all the spatial correlation structure from the data. In that case, Neural Network Residual Kriging (NNRK) can be carried out and risk assessment can be conducted with Neural Network Residual Simulations (NNRS). Finally, the results of the ANNEX models were compared to those of ordinary (co)kriging and (co)kriging with an external drift. It was shown that the ANNEX models performed better than traditional geostatistical algorithms when the relationship between the variable of interest and the auxiliary predictor was not linear. References Kanevski, M. and Maignan, M. (2004). Analysis and Modelling of Spatial Environmental Data. Lausanne: EPFL Press.

  17. Sparse Event Modeling with Hierarchical Bayesian Kernel Methods

    DTIC Science & Technology

    2016-01-05

    SECURITY CLASSIFICATION OF: The research objective of this proposal was to develop a predictive Bayesian kernel approach to model count data based on...several predictive variables. Such an approach, which we refer to as the Poisson Bayesian kernel model , is able to model the rate of occurrence of...which adds specificity to the model and can make nonlinear data more manageable. Early results show that the 1. REPORT DATE (DD-MM-YYYY) 4. TITLE

  18. Associated patterns of insecticide resistance in field populations of malaria vectors across Africa.

    PubMed

    Hancock, Penelope A; Wiebe, Antoinette; Gleave, Katherine A; Bhatt, Samir; Cameron, Ewan; Trett, Anna; Weetman, David; Smith, David L; Hemingway, Janet; Coleman, Michael; Gething, Peter W; Moyes, Catherine L

    2018-06-05

    The development of insecticide resistance in African malaria vectors threatens the continued efficacy of important vector control methods that rely on a limited set of insecticides. To understand the operational significance of resistance we require quantitative information about levels of resistance in field populations to the suite of vector control insecticides. Estimation of resistance is complicated by the sparsity of observations in field populations, variation in resistance over time and space at local and regional scales, and cross-resistance between different insecticide types. Using observations of the prevalence of resistance in mosquito species from the Anopheles gambiae complex sampled from 1,183 locations throughout Africa, we applied Bayesian geostatistical models to quantify patterns of covariation in resistance phenotypes across different insecticides. For resistance to the three pyrethroids tested, deltamethrin, permethrin, and λ-cyhalothrin, we found consistent forms of covariation across sub-Saharan Africa and covariation between resistance to these pyrethroids and resistance to DDT. We found no evidence of resistance interactions between carbamate and organophosphate insecticides or between these insecticides and those from other classes. For pyrethroids and DDT we found significant associations between predicted mean resistance and the observed frequency of kdr mutations in the Vgsc gene in field mosquito samples, with DDT showing the strongest association. These results improve our capacity to understand and predict resistance patterns throughout Africa and can guide the development of monitoring strategies. Copyright © 2018 the Author(s). Published by PNAS.

  19. Eco-hydrological Wireless Sensor Network and upscaling method research in the Heihe River Basin, China

    NASA Astrophysics Data System (ADS)

    Jin, Rui; kang, Jian

    2017-04-01

    Wireless Sensor Networks are recognized as one of most important near-surface components of GEOSS (Global Earth Observation System of Systems), with flourish development of low-cost, robust and integrated data loggers and sensors. A nested eco-hydrological wireless sensor network (EHWSN) was installed in the up- and middle-reaches of the Heihe River Basin, operated to obtain multi-scale observation of soil moisture, soil temperature and land surface temperature from 2012 till now. The spatial distribution of EHWSN was optimally designed based on the geo-statistical theory, with the aim to capture the spatial variations and temporal dynamics of soil moisture and soil temperature, and to produce ground truth at grid scale for validating the related remote sensing products and model simulation in the heterogeneous land surface. In terms of upscaling research, we have developed a set of method to aggregate multi-point WSN observations to grid scale ( 1km), including regression kriging estimation to utilize multi-resource remote sensing auxiliary information, block kriging with homogeneous measurement errors, and bayesian-based upscaling algorithm that utilizes MODIS-derived apparent thermal inertia. All the EHWSN observation are organized as datasets to be freely published at http://westdc.westgis.ac.cn/hiwater. EHWSN integrates distributed observation nodes to achieve an automated, intelligent and remote-controllable network that provides superior integrated, standardized and automated observation capabilities for hydrological and ecological processes research at the basin scale.

  20. Using rank-order geostatistics for spatial interpolation of highly skewed data in a heavy-metal contaminated site.

    PubMed

    Juang, K W; Lee, D Y; Ellsworth, T R

    2001-01-01

    The spatial distribution of a pollutant in contaminated soils is usually highly skewed. As a result, the sample variogram often differs considerably from its regional counterpart and the geostatistical interpolation is hindered. In this study, rank-order geostatistics with standardized rank transformation was used for the spatial interpolation of pollutants with a highly skewed distribution in contaminated soils when commonly used nonlinear methods, such as logarithmic and normal-scored transformations, are not suitable. A real data set of soil Cd concentrations with great variation and high skewness in a contaminated site of Taiwan was used for illustration. The spatial dependence of ranks transformed from Cd concentrations was identified and kriging estimation was readily performed in the standardized-rank space. The estimated standardized rank was back-transformed into the concentration space using the middle point model within a standardized-rank interval of the empirical distribution function (EDF). The spatial distribution of Cd concentrations was then obtained. The probability of Cd concentration being higher than a given cutoff value also can be estimated by using the estimated distribution of standardized ranks. The contour maps of Cd concentrations and the probabilities of Cd concentrations being higher than the cutoff value can be simultaneously used for delineation of hazardous areas of contaminated soils.

  1. Regional flow duration curves: Geostatistical techniques versus multivariate regression

    USGS Publications Warehouse

    Pugliese, Alessio; Farmer, William H.; Castellarin, Attilio; Archfield, Stacey A.; Vogel, Richard M.

    2016-01-01

    A period-of-record flow duration curve (FDC) represents the relationship between the magnitude and frequency of daily streamflows. Prediction of FDCs is of great importance for locations characterized by sparse or missing streamflow observations. We present a detailed comparison of two methods which are capable of predicting an FDC at ungauged basins: (1) an adaptation of the geostatistical method, Top-kriging, employing a linear weighted average of dimensionless empirical FDCs, standardised with a reference streamflow value; and (2) regional multiple linear regression of streamflow quantiles, perhaps the most common method for the prediction of FDCs at ungauged sites. In particular, Top-kriging relies on a metric for expressing the similarity between catchments computed as the negative deviation of the FDC from a reference streamflow value, which we termed total negative deviation (TND). Comparisons of these two methods are made in 182 largely unregulated river catchments in the southeastern U.S. using a three-fold cross-validation algorithm. Our results reveal that the two methods perform similarly throughout flow-regimes, with average Nash-Sutcliffe Efficiencies 0.566 and 0.662, (0.883 and 0.829 on log-transformed quantiles) for the geostatistical and the linear regression models, respectively. The differences between the reproduction of FDC's occurred mostly for low flows with exceedance probability (i.e. duration) above 0.98.

  2. Bayesian generalized linear mixed modeling of Tuberculosis using informative priors

    PubMed Central

    Woldegerima, Woldegebriel Assefa

    2017-01-01

    TB is rated as one of the world’s deadliest diseases and South Africa ranks 9th out of the 22 countries with hardest hit of TB. Although many pieces of research have been carried out on this subject, this paper steps further by inculcating past knowledge into the model, using Bayesian approach with informative prior. Bayesian statistics approach is getting popular in data analyses. But, most applications of Bayesian inference technique are limited to situations of non-informative prior, where there is no solid external information about the distribution of the parameter of interest. The main aim of this study is to profile people living with TB in South Africa. In this paper, identical regression models are fitted for classical and Bayesian approach both with non-informative and informative prior, using South Africa General Household Survey (GHS) data for the year 2014. For the Bayesian model with informative prior, South Africa General Household Survey dataset for the year 2011 to 2013 are used to set up priors for the model 2014. PMID:28257437

  3. A Fast Surrogate-facilitated Data-driven Bayesian Approach to Uncertainty Quantification of a Regional Groundwater Flow Model with Structural Error

    NASA Astrophysics Data System (ADS)

    Xu, T.; Valocchi, A. J.; Ye, M.; Liang, F.

    2016-12-01

    Due to simplification and/or misrepresentation of the real aquifer system, numerical groundwater flow and solute transport models are usually subject to model structural error. During model calibration, the hydrogeological parameters may be overly adjusted to compensate for unknown structural error. This may result in biased predictions when models are used to forecast aquifer response to new forcing. In this study, we extend a fully Bayesian method [Xu and Valocchi, 2015] to calibrate a real-world, regional groundwater flow model. The method uses a data-driven error model to describe model structural error and jointly infers model parameters and structural error. In this study, Bayesian inference is facilitated using high performance computing and fast surrogate models. The surrogate models are constructed using machine learning techniques to emulate the response simulated by the computationally expensive groundwater model. We demonstrate in the real-world case study that explicitly accounting for model structural error yields parameter posterior distributions that are substantially different from those derived by the classical Bayesian calibration that does not account for model structural error. In addition, the Bayesian with error model method gives significantly more accurate prediction along with reasonable credible intervals.

  4. Spatiotemporal mathematical modelling of mutations of the dhps gene in African Plasmodium falciparum.

    PubMed

    Flegg, Jennifer A; Patil, Anand P; Venkatesan, Meera; Roper, Cally; Naidoo, Inbarani; Hay, Simon I; Sibley, Carol Hopkins; Guerin, Philippe J

    2013-07-17

    Plasmodium falciparum has repeatedly evolved resistance to first-line anti-malarial drugs, thwarting efforts to control and eliminate the disease and in some period of time this contributed largely to an increase in mortality. Here a mathematical model was developed to map the spatiotemporal trends in the distribution of mutations in the P. falciparum dihydropteroate synthetase (dhps) gene that confer resistance to the anti-malarial sulphadoxine, and are a useful marker for the combination of alleles in dhfr and dhps that is highly correlated with resistance to sulphadoxine-pyrimethamine (SP). The aim of this study was to present a proof of concept for spatiotemporal modelling of trends in anti-malarial drug resistance that can be applied to monitor trends in resistance to components of artemisinin combination therapy (ACT) or other anti-malarials, as they emerge or spread. Prevalence measurements of single nucleotide polymorphisms in three codon positions of the dihydropteroate synthetase (dhps) gene from published studies of dhps mutations across Africa were used. A model-based geostatistics approach was adopted to create predictive surfaces of the dhps540E mutation over the spatial domain of sub-Saharan Africa from 1990-2010. The statistical model was implemented within a Bayesian framework and hence quantified the associated uncertainty of the prediction of the prevalence of the dhps540E mutation in sub-Saharan Africa. The maps presented visualize the changing prevalence of the dhps540E mutation in sub-Saharan Africa. These allow prediction of space-time trends in the parasite resistance to SP, and provide probability distributions of resistance prevalence in places where no data are available as well as insight on the spread of resistance in a way that the data alone do not allow. The results of this work will be extended to design optimal sampling strategies for the future molecular surveillance of resistance, providing a proof of concept for similar techniques to design optimal strategies to monitor resistance to ACT.

  5. The Bayesian Revolution Approaches Psychological Development

    ERIC Educational Resources Information Center

    Shultz, Thomas R.

    2007-01-01

    This commentary reviews five articles that apply Bayesian ideas to psychological development, some with psychology experiments, some with computational modeling, and some with both experiments and modeling. The reviewed work extends the current Bayesian revolution into tasks often studied in children, such as causal learning and word learning, and…

  6. Hydraulic Tomography and the Curse of Storativity

    NASA Astrophysics Data System (ADS)

    Cirpka, O. A.; Li, W.; Englert, A.

    2006-12-01

    Pumping tests are among the most common techniques for hydrogeological site investigation. Their traditional analysis is based on fitting analytical expressions to measured time series of drawdown. These expressions were derived for homogeneous conditions, whereas all natural aquifers are heterogeneous. The mentioned conceptual inconsistency complicates the hydrogeological interpretation of the obtained coefficients. In particularly, it has been shown that the heterogeneity of transmissivity is aliased to variability in the estimated storativity. In hydraulic tomography, multiple pumping tests are jointly analyzed. The hydraulic parameters to be estimated are allowed to fluctuate in space. For regularization, a geostatistical smoothness criterion may be introduced. Thus, the inversion results in the most likely spatial distribution of parameters that is consistent with the drawdown measurements and follows a predefined geostatistical model. Applying the restricted maximum likelihood approach, the parameters of the prior covariance function (i.e., the prior variance and correlation length) can be inferred from the data as well. We have applied the quasi-linear geostatistical approach of inverse modeling to drawdown measurements of multiple, overlapping pumping tests performed at the test site Krauthausen near Jülich, Germany. To reduce the computational costs, we have characterized the drawdown curves by their temporal moments. In the estimation of the geostatistical parameters, the measurement error of heads turned out to be of vital importance. The less we trust the data, the larger is the estimated correlation length, resulting in a more uniform distribution of transmissivity. Similar to conventional pumping test analysis, the data analysis point to a high variability of storativity although the properties making up storativity are known to be only mildly heterogeneous. We conjecture that the unresolved small-scale spatial variability of conductivity is mapped to variability of storativity. This is rather unfortunate since reliable field data on the variability of storativity are missing. The study underscores that structural information is difficult to extract from hydraulic data alone. Information on length scales and major deterministic features may be gained by geophysical surveying, even if rock-laws directly relating geophysical to hydraulic properties are considered unreliable.

  7. Incorporating approximation error in surrogate based Bayesian inversion

    NASA Astrophysics Data System (ADS)

    Zhang, J.; Zeng, L.; Li, W.; Wu, L.

    2015-12-01

    There are increasing interests in applying surrogates for inverse Bayesian modeling to reduce repetitive evaluations of original model. In this way, the computational cost is expected to be saved. However, the approximation error of surrogate model is usually overlooked. This is partly because that it is difficult to evaluate the approximation error for many surrogates. Previous studies have shown that, the direct combination of surrogates and Bayesian methods (e.g., Markov Chain Monte Carlo, MCMC) may lead to biased estimations when the surrogate cannot emulate the highly nonlinear original system. This problem can be alleviated by implementing MCMC in a two-stage manner. However, the computational cost is still high since a relatively large number of original model simulations are required. In this study, we illustrate the importance of incorporating approximation error in inverse Bayesian modeling. Gaussian process (GP) is chosen to construct the surrogate for its convenience in approximation error evaluation. Numerical cases of Bayesian experimental design and parameter estimation for contaminant source identification are used to illustrate this idea. It is shown that, once the surrogate approximation error is well incorporated into Bayesian framework, promising results can be obtained even when the surrogate is directly used, and no further original model simulations are required.

  8. Moving in Parallel Toward a Modern Modeling Epistemology: Bayes Factors and Frequentist Modeling Methods.

    PubMed

    Rodgers, Joseph Lee

    2016-01-01

    The Bayesian-frequentist debate typically portrays these statistical perspectives as opposing views. However, both Bayesian and frequentist statisticians have expanded their epistemological basis away from a singular focus on the null hypothesis, to a broader perspective involving the development and comparison of competing statistical/mathematical models. For frequentists, statistical developments such as structural equation modeling and multilevel modeling have facilitated this transition. For Bayesians, the Bayes factor has facilitated this transition. The Bayes factor is treated in articles within this issue of Multivariate Behavioral Research. The current presentation provides brief commentary on those articles and more extended discussion of the transition toward a modern modeling epistemology. In certain respects, Bayesians and frequentists share common goals.

  9. A Bayesian Model of the Memory Colour Effect.

    PubMed

    Witzel, Christoph; Olkkonen, Maria; Gegenfurtner, Karl R

    2018-01-01

    According to the memory colour effect, the colour of a colour-diagnostic object is not perceived independently of the object itself. Instead, it has been shown through an achromatic adjustment method that colour-diagnostic objects still appear slightly in their typical colour, even when they are colourimetrically grey. Bayesian models provide a promising approach to capture the effect of prior knowledge on colour perception and to link these effects to more general effects of cue integration. Here, we model memory colour effects using prior knowledge about typical colours as priors for the grey adjustments in a Bayesian model. This simple model does not involve any fitting of free parameters. The Bayesian model roughly captured the magnitude of the measured memory colour effect for photographs of objects. To some extent, the model predicted observed differences in memory colour effects across objects. The model could not account for the differences in memory colour effects across different levels of realism in the object images. The Bayesian model provides a particularly simple account of memory colour effects, capturing some of the multiple sources of variation of these effects.

  10. A Bayesian Model of the Memory Colour Effect

    PubMed Central

    Olkkonen, Maria; Gegenfurtner, Karl R.

    2018-01-01

    According to the memory colour effect, the colour of a colour-diagnostic object is not perceived independently of the object itself. Instead, it has been shown through an achromatic adjustment method that colour-diagnostic objects still appear slightly in their typical colour, even when they are colourimetrically grey. Bayesian models provide a promising approach to capture the effect of prior knowledge on colour perception and to link these effects to more general effects of cue integration. Here, we model memory colour effects using prior knowledge about typical colours as priors for the grey adjustments in a Bayesian model. This simple model does not involve any fitting of free parameters. The Bayesian model roughly captured the magnitude of the measured memory colour effect for photographs of objects. To some extent, the model predicted observed differences in memory colour effects across objects. The model could not account for the differences in memory colour effects across different levels of realism in the object images. The Bayesian model provides a particularly simple account of memory colour effects, capturing some of the multiple sources of variation of these effects. PMID:29760874

  11. GIS, geostatistics, metadata banking, and tree-based models for data analysis and mapping in environmental monitoring and epidemiology.

    PubMed

    Schröder, Winfried

    2006-05-01

    By the example of environmental monitoring, some applications of geographic information systems (GIS), geostatistics, metadata banking, and Classification and Regression Trees (CART) are presented. These tools are recommended for mapping statistically estimated hot spots of vectors and pathogens. GIS were introduced as tools for spatially modelling the real world. The modelling can be done by mapping objects according to the spatial information content of data. Additionally, this can be supported by geostatistical and multivariate statistical modelling. This is demonstrated by the example of modelling marine habitats of benthic communities and of terrestrial ecoregions. Such ecoregionalisations may be used to predict phenomena based on the statistical relation between measurements of an interesting phenomenon such as, e.g., the incidence of medically relevant species and correlated characteristics of the ecoregions. The combination of meteorological data and data on plant phenology can enhance the spatial resolution of the information on climate change. To this end, meteorological and phenological data have to be correlated. To enable this, both data sets which are from disparate monitoring networks have to be spatially connected by means of geostatistical estimation. This is demonstrated by the example of transformation of site-specific data on plant phenology into surface data. The analysis allows for spatial comparison of the phenology during the two periods 1961-1990 and 1991-2002 covering whole Germany. The changes in both plant phenology and air temperature were proved to be statistically significant. Thus, they can be combined by GIS overlay technique to enhance the spatial resolution of the information on the climate change and use them for the prediction of vector incidences at the regional scale. The localisation of such risk hot spots can be done by geometrically merging surface data on promoting factors. This is demonstrated by the example of the transfer of heavy metals through soils. The predicted hot spots of heavy metal transfer can be validated empirically by measurement data which can be inquired by a metadata base linked with a geographic information system. A corresponding strategy for the detection of vector hot spots in medical epidemiology is recommended. Data on incidences and habitats of the Anophelinae in the marsh regions of Lower Saxony (Germany) were used to calculate a habitat model by CART, which together with climate data and data on ecoregions can be further used for the prediction of habitats of medically relevant vector species. In the future, this approach should be supported by an internet-based information system consisting of three components: metadata questionnaire, metadata base, and GIS to link metadata, surface data, and measurement data on incidences and habitats of medically relevant species and related data on climate, phenology, and ecoregional characteristic conditions.

  12. Comparing different approaches - data mining, geostatistic, and deterministic pedology - to assess the frequency of WRB Reference Soil Groups in the Italian soil regions

    NASA Astrophysics Data System (ADS)

    Lorenzetti, Romina; Barbetti, Roberto; L'Abate, Giovanni; Fantappiè, Maria; Costantini, Edoardo A. C.

    2013-04-01

    Estimating frequency of soil classes in map unit is always affected by some degree of uncertainty, especially at small scales, with a larger generalization. The aim of this study was to compare different possible approaches - data mining, geostatistic, deterministic pedology - to assess the frequency of WRB Reference Soil Groups (RSG) in the major Italian soil regions. In the soil map of Italy (Costantini et al., 2012), a list of the first five RSG was reported in each major 10 soil regions. The soil map was produced using the national soil geodatabase, which stored 22,015 analyzed and classified pedons, 1,413 soil typological unit (STU) and a set of auxiliary variables (lithology, land-use, DEM). Other variables were added, to better consider the influence of soil forming factors (slope, soil aridity index, carbon stock, soil inorganic carbon content, clay, sand, geography of soil regions and soil systems) and a grid at 1 km mesh was set up. The traditional deterministic pedology assessed the STU frequency according to the expert judgment presence in every elementary landscape which formed the mapping unit. Different data mining techniques were firstly compared in their ability to predict RSG through auxiliary variables (neural networks, random forests, boosted tree, supported vector machine (SVM)). We selected SVM according to the result of a testing set. A SVM model is a representation of the examples as points in space, mapped so that examples of separate categories are divided by a clear gap that is as wide as possible. The geostatistic algorithm we used was an indicator collocated cokriging. The class values of the auxiliary variables, available at all the points of the grid, were transformed in indicator variables (values 0, 1). A principal component analysis allowed us to select the variables that were able to explain the largest variability, and to correlate each RSG with the first principal component, which explained the 51% of the total variability. The principal component was used as collocated variable. The results were as many probability maps as the estimated WRB classes. They were summed up in a unique map, with the most probable class at each pixel. The first five more frequent RSG resulting from the three methods were compared. The outcomes were validated with a subset of the 10% of the pedons, kept out before the elaborations. The error estimate was produced for each estimated RSG. The first results, obtained in one of the most widespread soil region (plains and low hills of central and southern Italy) showed that the first two frequency classes were the same for all the three methods. The deterministic method differed from the others at the third position, while the statistical methods inverted the third and fourth position. An advantage of the SVM was the possibility to use in the same elaboration numeric and categorical variable, without any previous transformation, which reduced the processing time. A Bayesian validation indicated that the SVM method was as reliable as the indicator collocated cokriging, and better than the deterministic pedological approach.

  13. Past and ongoing shifts in Joshua tree distribution support future modeled range contraction

    Treesearch

    Kenneth L. Cole; Kirsten Ironside; Jon Eischeid; Gregg Garfin; Phillip B. Duffy; Chris Toney

    2011-01-01

    The future distribution of the Joshua tree (Yucca brevifolia) is projected by combining a geostatistical analysis of 20th-century climates over its current range, future modeled climates, and paleoecological data showing its response to a past similar climate change. As climate rapidly warmed ~11 700 years ago, the range of Joshua tree contracted, leaving only the...

  14. Accounting for aquifer heterogeneity from geological data to management tools.

    PubMed

    Blouin, Martin; Martel, Richard; Gloaguen, Erwan

    2013-01-01

    A nested workflow of multiple-point geostatistics (MPG) and sequential Gaussian simulation (SGS) was tested on a study area of 6 km(2) located about 20 km northwest of Quebec City, Canada. In order to assess its geological and hydrogeological parameter heterogeneity and to provide tools to evaluate uncertainties in aquifer management, direct and indirect field measurements are used as inputs in the geostatistical simulations to reproduce large and small-scale heterogeneities. To do so, the lithological information is first associated to equivalent hydrogeological facies (hydrofacies) according to hydraulic properties measured at several wells. Then, heterogeneous hydrofacies (HF) realizations are generated using a prior geological model as training image (TI) with the MPG algorithm. The hydraulic conductivity (K) heterogeneity modeling within each HF is finally computed using SGS algorithm. Different K models are integrated in a finite-element hydrogeological model to calculate multiple transport simulations. Different scenarios exhibit variations in mass transport path and dispersion associated with the large- and small-scale heterogeneity respectively. Three-dimensional maps showing the probability of overpassing different thresholds are presented as examples of management tools. © 2012, The Author(s). Groundwater © 2012, National Ground Water Association.

  15. The geostatistical approach for structural and stratigraphic framework analysis of offshore NW Bonaparte Basin, Australia

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wahid, Ali, E-mail: ali.wahid@live.com; Salim, Ahmed Mohamed Ahmed, E-mail: mohamed.salim@petronas.com.my; Yusoff, Wan Ismail Wan, E-mail: wanismail-wanyusoff@petronas.com.my

    2016-02-01

    Geostatistics or statistical approach is based on the studies of temporal and spatial trend, which depend upon spatial relationships to model known information of variable(s) at unsampled locations. The statistical technique known as kriging was used for petrophycial and facies analysis, which help to assume spatial relationship to model the geological continuity between the known data and the unknown to produce a single best guess of the unknown. Kriging is also known as optimal interpolation technique, which facilitate to generate best linear unbiased estimation of each horizon. The idea is to construct a numerical model of the lithofacies and rockmore » properties that honor available data and further integrate with interpreting seismic sections, techtonostratigraphy chart with sea level curve (short term) and regional tectonics of the study area to find the structural and stratigraphic growth history of the NW Bonaparte Basin. By using kriging technique the models were built which help to estimate different parameters like horizons, facies, and porosities in the study area. The variograms were used to determine for identification of spatial relationship between data which help to find the depositional history of the North West (NW) Bonaparte Basin.« less

  16. Using Bayesian Networks to Improve Knowledge Assessment

    ERIC Educational Resources Information Center

    Millan, Eva; Descalco, Luis; Castillo, Gladys; Oliveira, Paula; Diogo, Sandra

    2013-01-01

    In this paper, we describe the integration and evaluation of an existing generic Bayesian student model (GBSM) into an existing computerized testing system within the Mathematics Education Project (PmatE--Projecto Matematica Ensino) of the University of Aveiro. This generic Bayesian student model had been previously evaluated with simulated…

  17. Bayesian Posterior Odds Ratios: Statistical Tools for Collaborative Evaluations

    ERIC Educational Resources Information Center

    Hicks, Tyler; Rodríguez-Campos, Liliana; Choi, Jeong Hoon

    2018-01-01

    To begin statistical analysis, Bayesians quantify their confidence in modeling hypotheses with priors. A prior describes the probability of a certain modeling hypothesis apart from the data. Bayesians should be able to defend their choice of prior to a skeptical audience. Collaboration between evaluators and stakeholders could make their choices…

  18. Geostatistical and GIS analysis of the spatial variability of alluvial gold content in Ngoura-Colomines area, Eastern Cameroon: Implications for the exploration of primary gold deposit

    NASA Astrophysics Data System (ADS)

    Takodjou Wambo, Jonas Didero; Ganno, Sylvestre; Djonthu Lahe, Yannick Sthopira; Kouankap Nono, Gus Djibril; Fossi, Donald Hermann; Tchouatcha, Milan Stafford; Nzenti, Jean Paul

    2018-06-01

    Linear and nonlinear geostatistic is commonly used in ore grade estimation and seldom used in Geographical Information System (GIS) technology. In this study, we suggest an approach based on geostatistic linear ordinary kriging (OK) and Geographical Information System (GIS) techniques to investigate the spatial distribution of alluvial gold content, mineralized and gangue layers thicknesses from 73 pits at the Ngoura-Colomines area with the aim to determine controlling factors for the spatial distribution of mineralization and delineate the most prospective area for primary gold mineralization. Gold content varies between 0.1 and 4.6 g/m3 and has been broadly grouped into three statistical classes. These classes have been spatially subdivided into nine zones using ordinary kriging model based on physical and topographical characteristics. Both mineralized and barren layer thicknesses show randomly spatial distribution, and there is no correlation between these parameters and the gold content. This approach has shown that the Ngoura-Colomines area is located in a large shear zone compatible with the Riedel fault system composed of P and P‧ fractures oriented NE-SW and NNE-SSW respectively; E-W trending R fractures and R‧ fractures with NW-SE trends that could have contributed significantly to the establishment of this gold mineralization. The combined OK model and GIS analysis have led to the delineation of Colomines, Tissongo, Madubal and Boutou villages as the most prospective areas for the exploration of primary gold deposit in the study area.

  19. Field-scale soil moisture space-time geostatistical modeling for complex Palouse landscapes in the inland Pacific Northwest

    NASA Astrophysics Data System (ADS)

    Chahal, M. K.; Brown, D. J.; Brooks, E. S.; Campbell, C.; Cobos, D. R.; Vierling, L. A.

    2012-12-01

    Estimating soil moisture content continuously over space and time using geo-statistical techniques supports the refinement of process-based watershed hydrology models and the application of soil process models (e.g. biogeochemical models predicting greenhouse gas fluxes) to complex landscapes. In this study, we model soil profile volumetric moisture content for five agricultural fields with loess soils in the Palouse region of Eastern Washington and Northern Idaho. Using a combination of stratification and space-filling techniques, we selected 42 representative and distributed measurement locations in the Cook Agronomy Farm (Pullman, WA) and 12 locations each in four additional grower fields that span the precipitation gradient across the Palouse. At each measurement location, soil moisture was measured on an hourly basis at five different depths (30, 60, 90, 120, and 150 cm) using Decagon 5-TE/5-TM soil moisture sensors (Decagon Devices, Pullman, WA, USA). This data was collected over three years for the Cook Agronomy Farm and one year for each of the grower fields. In addition to ordinary kriging, we explored the correlation of volumetric water content with external, spatially exhaustive indices derived from terrain models, optical remote sensing imagery, and proximal soil sensing data (electromagnetic induction and VisNIR penetrometer)

  20. Characterizing the spatial structure of endangered species habitat using geostatistical analysis of IKONOS imagery

    USGS Publications Warehouse

    Wallace, C.S.A.; Marsh, S.E.

    2005-01-01

    Our study used geostatistics to extract measures that characterize the spatial structure of vegetated landscapes from satellite imagery for mapping endangered Sonoran pronghorn habitat. Fine spatial resolution IKONOS data provided information at the scale of individual trees or shrubs that permitted analysis of vegetation structure and pattern. We derived images of landscape structure by calculating local estimates of the nugget, sill, and range variogram parameters within 25 ?? 25-m image windows. These variogram parameters, which describe the spatial autocorrelation of the 1-m image pixels, are shown in previous studies to discriminate between different species-specific vegetation associations. We constructed two independent models of pronghorn landscape preference by coupling the derived measures with Sonoran pronghorn sighting data: a distribution-based model and a cluster-based model. The distribution-based model used the descriptive statistics for variogram measures at pronghorn sightings, whereas the cluster-based model used the distribution of pronghorn sightings within clusters of an unsupervised classification of derived images. Both models define similar landscapes, and validation results confirm they effectively predict the locations of an independent set of pronghorn sightings. Such information, although not a substitute for field-based knowledge of the landscape and associated ecological processes, can provide valuable reconnaissance information to guide natural resource management efforts. ?? 2005 Taylor & Francis Group Ltd.

  1. Semivariogram modeling by weighted least squares

    USGS Publications Warehouse

    Jian, X.; Olea, R.A.; Yu, Y.-S.

    1996-01-01

    Permissible semivariogram models are fundamental for geostatistical estimation and simulation of attributes having a continuous spatiotemporal variation. The usual practice is to fit those models manually to experimental semivariograms. Fitting by weighted least squares produces comparable results to fitting manually in less time, systematically, and provides an Akaike information criterion for the proper comparison of alternative models. We illustrate the application of a computer program with examples showing the fitting of simple and nested models. Copyright ?? 1996 Elsevier Science Ltd.

  2. BCM: toolkit for Bayesian analysis of Computational Models using samplers.

    PubMed

    Thijssen, Bram; Dijkstra, Tjeerd M H; Heskes, Tom; Wessels, Lodewyk F A

    2016-10-21

    Computational models in biology are characterized by a large degree of uncertainty. This uncertainty can be analyzed with Bayesian statistics, however, the sampling algorithms that are frequently used for calculating Bayesian statistical estimates are computationally demanding, and each algorithm has unique advantages and disadvantages. It is typically unclear, before starting an analysis, which algorithm will perform well on a given computational model. We present BCM, a toolkit for the Bayesian analysis of Computational Models using samplers. It provides efficient, multithreaded implementations of eleven algorithms for sampling from posterior probability distributions and for calculating marginal likelihoods. BCM includes tools to simplify the process of model specification and scripts for visualizing the results. The flexible architecture allows it to be used on diverse types of biological computational models. In an example inference task using a model of the cell cycle based on ordinary differential equations, BCM is significantly more efficient than existing software packages, allowing more challenging inference problems to be solved. BCM represents an efficient one-stop-shop for computational modelers wishing to use sampler-based Bayesian statistics.

  3. Two Approaches to Calibration in Metrology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Campanelli, Mark

    2014-04-01

    Inferring mathematical relationships with quantified uncertainty from measurement data is common to computational science and metrology. Sufficient knowledge of measurement process noise enables Bayesian inference. Otherwise, an alternative approach is required, here termed compartmentalized inference, because collection of uncertain data and model inference occur independently. Bayesian parameterized model inference is compared to a Bayesian-compatible compartmentalized approach for ISO-GUM compliant calibration problems in renewable energy metrology. In either approach, model evidence can help reduce model discrepancy.

  4. Bayesian data analysis in population ecology: motivations, methods, and benefits

    USGS Publications Warehouse

    Dorazio, Robert

    2016-01-01

    During the 20th century ecologists largely relied on the frequentist system of inference for the analysis of their data. However, in the past few decades ecologists have become increasingly interested in the use of Bayesian methods of data analysis. In this article I provide guidance to ecologists who would like to decide whether Bayesian methods can be used to improve their conclusions and predictions. I begin by providing a concise summary of Bayesian methods of analysis, including a comparison of differences between Bayesian and frequentist approaches to inference when using hierarchical models. Next I provide a list of problems where Bayesian methods of analysis may arguably be preferred over frequentist methods. These problems are usually encountered in analyses based on hierarchical models of data. I describe the essentials required for applying modern methods of Bayesian computation, and I use real-world examples to illustrate these methods. I conclude by summarizing what I perceive to be the main strengths and weaknesses of using Bayesian methods to solve ecological inference problems.

  5. [Evaluation of estimation of prevalence ratio using bayesian log-binomial regression model].

    PubMed

    Gao, W L; Lin, H; Liu, X N; Ren, X W; Li, J S; Shen, X P; Zhu, S L

    2017-03-10

    To evaluate the estimation of prevalence ratio ( PR ) by using bayesian log-binomial regression model and its application, we estimated the PR of medical care-seeking prevalence to caregivers' recognition of risk signs of diarrhea in their infants by using bayesian log-binomial regression model in Openbugs software. The results showed that caregivers' recognition of infant' s risk signs of diarrhea was associated significantly with a 13% increase of medical care-seeking. Meanwhile, we compared the differences in PR 's point estimation and its interval estimation of medical care-seeking prevalence to caregivers' recognition of risk signs of diarrhea and convergence of three models (model 1: not adjusting for the covariates; model 2: adjusting for duration of caregivers' education, model 3: adjusting for distance between village and township and child month-age based on model 2) between bayesian log-binomial regression model and conventional log-binomial regression model. The results showed that all three bayesian log-binomial regression models were convergence and the estimated PRs were 1.130(95 %CI : 1.005-1.265), 1.128(95 %CI : 1.001-1.264) and 1.132(95 %CI : 1.004-1.267), respectively. Conventional log-binomial regression model 1 and model 2 were convergence and their PRs were 1.130(95 % CI : 1.055-1.206) and 1.126(95 % CI : 1.051-1.203), respectively, but the model 3 was misconvergence, so COPY method was used to estimate PR , which was 1.125 (95 %CI : 1.051-1.200). In addition, the point estimation and interval estimation of PRs from three bayesian log-binomial regression models differed slightly from those of PRs from conventional log-binomial regression model, but they had a good consistency in estimating PR . Therefore, bayesian log-binomial regression model can effectively estimate PR with less misconvergence and have more advantages in application compared with conventional log-binomial regression model.

  6. Soil texture and organic carbon fractions predicted from near-infrared spectroscopy and geostatistics

    USDA-ARS?s Scientific Manuscript database

    Field-specific management could help achieve agricultural sustainability by increasing production and decreasing environmental impacts. Near-infrared spectroscopy (NIRS) and geostatistics are relatively unexplored tools that could reduce time, labor, and costs of soil analysis. Our objective was to ...

  7. A Preliminary Bayesian Analysis of Incomplete Longitudinal Data from a Small Sample: Methodological Advances in an International Comparative Study of Educational Inequality

    ERIC Educational Resources Information Center

    Hsieh, Chueh-An; Maier, Kimberly S.

    2009-01-01

    The capacity of Bayesian methods in estimating complex statistical models is undeniable. Bayesian data analysis is seen as having a range of advantages, such as an intuitive probabilistic interpretation of the parameters of interest, the efficient incorporation of prior information to empirical data analysis, model averaging and model selection.…

  8. Bayesian model selection: Evidence estimation based on DREAM simulation and bridge sampling

    NASA Astrophysics Data System (ADS)

    Volpi, Elena; Schoups, Gerrit; Firmani, Giovanni; Vrugt, Jasper A.

    2017-04-01

    Bayesian inference has found widespread application in Earth and Environmental Systems Modeling, providing an effective tool for prediction, data assimilation, parameter estimation, uncertainty analysis and hypothesis testing. Under multiple competing hypotheses, the Bayesian approach also provides an attractive alternative to traditional information criteria (e.g. AIC, BIC) for model selection. The key variable for Bayesian model selection is the evidence (or marginal likelihood) that is the normalizing constant in the denominator of Bayes theorem; while it is fundamental for model selection, the evidence is not required for Bayesian inference. It is computed for each hypothesis (model) by averaging the likelihood function over the prior parameter distribution, rather than maximizing it as by information criteria; the larger a model evidence the more support it receives among a collection of hypothesis as the simulated values assign relatively high probability density to the observed data. Hence, the evidence naturally acts as an Occam's razor, preferring simpler and more constrained models against the selection of over-fitted ones by information criteria that incorporate only the likelihood maximum. Since it is not particularly easy to estimate the evidence in practice, Bayesian model selection via the marginal likelihood has not yet found mainstream use. We illustrate here the properties of a new estimator of the Bayesian model evidence, which provides robust and unbiased estimates of the marginal likelihood; the method is coined Gaussian Mixture Importance Sampling (GMIS). GMIS uses multidimensional numerical integration of the posterior parameter distribution via bridge sampling (a generalization of importance sampling) of a mixture distribution fitted to samples of the posterior distribution derived from the DREAM algorithm (Vrugt et al., 2008; 2009). Some illustrative examples are presented to show the robustness and superiority of the GMIS estimator with respect to other commonly used approaches in the literature.

  9. Comparison of the spatial patterns of schistosomiasis in Zimbabwe at two points in time, spaced twenty-nine years apart: is climate variability of importance?

    PubMed

    Pedersen, Ulrik B; Karagiannis-Voules, Dimitrios-Alexios; Midzi, Nicholas; Mduluza, Tkafira; Mukaratirwa, Samson; Fensholt, Rasmus; Vennervald, Birgitte J; Kristensen, Thomas K; Vounatsou, Penelope; Stensgaard, Anna-Sofie

    2017-05-08

    Temperature, precipitation and humidity are known to be important factors for the development of schistosome parasites as well as their intermediate snail hosts. Climate therefore plays an important role in determining the geographical distribution of schistosomiasis and it is expected that climate change will alter distribution and transmission patterns. Reliable predictions of distribution changes and likely transmission scenarios are key to efficient schistosomiasis intervention-planning. However, it is often difficult to assess the direction and magnitude of the impact on schistosomiasis induced by climate change, as well as the temporal transferability and predictive accuracy of the models, as prevalence data is often only available from one point in time. We evaluated potential climate-induced changes on the geographical distribution of schistosomiasis in Zimbabwe using prevalence data from two points in time, 29 years apart; to our knowledge, this is the first study investigating this over such a long time period. We applied historical weather data and matched prevalence data of two schistosome species (Schistosoma haematobium and S. mansoni). For each time period studied, a Bayesian geostatistical model was fitted to a range of climatic, environmental and other potential risk factors to identify significant predictors that could help us to obtain spatially explicit schistosomiasis risk estimates for Zimbabwe. The observed general downward trend in schistosomiasis prevalence for Zimbabwe from 1981 and the period preceding a survey and control campaign in 2010 parallels a shift towards a drier and warmer climate. However, a statistically significant relationship between climate change and the change in prevalence could not be established.

  10. Dynamic Bayesian network modeling for longitudinal brain morphometry

    PubMed Central

    Chen, Rong; Resnick, Susan M; Davatzikos, Christos; Herskovits, Edward H

    2011-01-01

    Identifying interactions among brain regions from structural magnetic-resonance images presents one of the major challenges in computational neuroanatomy. We propose a Bayesian data-mining approach to the detection of longitudinal morphological changes in the human brain. Our method uses a dynamic Bayesian network to represent evolving inter-regional dependencies. The major advantage of dynamic Bayesian network modeling is that it can represent complicated interactions among temporal processes. We validated our approach by analyzing a simulated atrophy study, and found that this approach requires only a small number of samples to detect the ground-truth temporal model. We further applied dynamic Bayesian network modeling to a longitudinal study of normal aging and mild cognitive impairment — the Baltimore Longitudinal Study of Aging. We found that interactions among regional volume-change rates for the mild cognitive impairment group are different from those for the normal-aging group. PMID:21963916

  11. Variational learning and bits-back coding: an information-theoretic view to Bayesian learning.

    PubMed

    Honkela, Antti; Valpola, Harri

    2004-07-01

    The bits-back coding first introduced by Wallace in 1990 and later by Hinton and van Camp in 1993 provides an interesting link between Bayesian learning and information-theoretic minimum-description-length (MDL) learning approaches. The bits-back coding allows interpreting the cost function used in the variational Bayesian method called ensemble learning as a code length in addition to the Bayesian view of misfit of the posterior approximation and a lower bound of model evidence. Combining these two viewpoints provides interesting insights to the learning process and the functions of different parts of the model. In this paper, the problem of variational Bayesian learning of hierarchical latent variable models is used to demonstrate the benefits of the two views. The code-length interpretation provides new views to many parts of the problem such as model comparison and pruning and helps explain many phenomena occurring in learning.

  12. Combining binary decision tree and geostatistical methods to estimate snow distribution in a mountain watershed

    USGS Publications Warehouse

    Balk, Benjamin; Elder, Kelly

    2000-01-01

    We model the spatial distribution of snow across a mountain basin using an approach that combines binary decision tree and geostatistical techniques. In April 1997 and 1998, intensive snow surveys were conducted in the 6.9‐km2 Loch Vale watershed (LVWS), Rocky Mountain National Park, Colorado. Binary decision trees were used to model the large‐scale variations in snow depth, while the small‐scale variations were modeled through kriging interpolation methods. Binary decision trees related depth to the physically based independent variables of net solar radiation, elevation, slope, and vegetation cover type. These decision tree models explained 54–65% of the observed variance in the depth measurements. The tree‐based modeled depths were then subtracted from the measured depths, and the resulting residuals were spatially distributed across LVWS through kriging techniques. The kriged estimates of the residuals were added to the tree‐based modeled depths to produce a combined depth model. The combined depth estimates explained 60–85% of the variance in the measured depths. Snow densities were mapped across LVWS using regression analysis. Snow‐covered area was determined from high‐resolution aerial photographs. Combining the modeled depths and densities with a snow cover map produced estimates of the spatial distribution of snow water equivalence (SWE). This modeling approach offers improvement over previous methods of estimating SWE distribution in mountain basins.

  13. Classifying emotion in Twitter using Bayesian network

    NASA Astrophysics Data System (ADS)

    Surya Asriadie, Muhammad; Syahrul Mubarok, Mohamad; Adiwijaya

    2018-03-01

    Language is used to express not only facts, but also emotions. Emotions are noticeable from behavior up to the social media statuses written by a person. Analysis of emotions in a text is done in a variety of media such as Twitter. This paper studies classification of emotions on twitter using Bayesian network because of its ability to model uncertainty and relationships between features. The result is two models based on Bayesian network which are Full Bayesian Network (FBN) and Bayesian Network with Mood Indicator (BNM). FBN is a massive Bayesian network where each word is treated as a node. The study shows the method used to train FBN is not very effective to create the best model and performs worse compared to Naive Bayes. F1-score for FBN is 53.71%, while for Naive Bayes is 54.07%. BNM is proposed as an alternative method which is based on the improvement of Multinomial Naive Bayes and has much lower computational complexity compared to FBN. Even though it’s not better compared to FBN, the resulting model successfully improves the performance of Multinomial Naive Bayes. F1-Score for Multinomial Naive Bayes model is 51.49%, while for BNM is 52.14%.

  14. Geostatistical Investigations of Displacements on the Basis of Data from the Geodetic Monitoring of a Hydrotechnical Object

    NASA Astrophysics Data System (ADS)

    Namysłowska-Wilczyńska, Barbara; Wynalek, Janusz

    2017-12-01

    Geostatistical methods make the analysis of measurement data possible. This article presents the problems directed towards the use of geostatistics in spatial analysis of displacements based on geodetic monitoring. Using methods of applied (spatial) statistics, the research deals with interesting and current issues connected to space-time analysis, modeling displacements and deformations, as applied to any large-area objects on which geodetic monitoring is conducted (e.g., water dams, urban areas in the vicinity of deep excavations, areas at a macro-regional scale subject to anthropogenic influences caused by mining, etc.). These problems are very crucial, especially for safety assessment of important hydrotechnical constructions, as well as for modeling and estimating mining damage. Based on the geodetic monitoring data, a substantial basic empirical material was created, comprising many years of research results concerning displacements of controlled points situated on the crown and foreland of an exemplary earth dam, and used to assess the behaviour and safety of the object during its whole operating period. A research method at a macro-regional scale was applied to investigate some phenomena connected with the operation of the analysed big hydrotechnical construction. Applying a semivariogram function enabled the spatial variability analysis of displacements. Isotropic empirical semivariograms were calculated and then, theoretical parameters of analytical functions were determined, which approximated the courses of the mentioned empirical variability measure. Using ordinary (block) kriging at the grid nodes of an elementary spatial grid covering the analysed object, the values of the Z* estimated means of displacements were calculated together with the accompanying assessment of uncertainty estimation - a standard deviation of estimation σk. Raster maps of the distribution of estimated averages Z* and raster maps of deviations of estimation σk (in perspective) were obtained for selected years (1995 and 2007), taking the ground height 136 m a.s.l. into calculation. To calculate raster maps of Z* interpolated values, methods of quick interpolation were also used, such as the technique of the inverse distance squares, a linear model of kriging, a spline kriging, which made the recognition of the general background of displacements possible, without the accuracy assessment of Z* value estimation, i.e., the value of σk. These maps are also related to 1995 and 2007 and the elevation. As a result of applying these techniques, clear boundaries of subsiding areas, upthrusting and also horizontal displacements on the examined hydrotechnical object were marked out, which can be interpreted as areas of local deformations of the object, important for the safety of the construction. The effect of geostatistical research conducted, including the structural analysis, semivariograms modeling, estimating the displacements of the hydrotechnical object, are rich cartographic characteristic (semivariograms, raster maps, block diagrams), which present the spatial visualization of the conducted various analyses of the monitored displacements. The prepared geostatistical model (3D) of displacement variability (analysed within the area of the dam, during its operating period and including its height) will be useful not only in the correct assessment of displacements and deformations, but it will also make it possible to forecast these phenomena, which is crucial when the operating safety of such constructions is taken into account.

  15. Additive Genetic Variability and the Bayesian Alphabet

    PubMed Central

    Gianola, Daniel; de los Campos, Gustavo; Hill, William G.; Manfredi, Eduardo; Fernando, Rohan

    2009-01-01

    The use of all available molecular markers in statistical models for prediction of quantitative traits has led to what could be termed a genomic-assisted selection paradigm in animal and plant breeding. This article provides a critical review of some theoretical and statistical concepts in the context of genomic-assisted genetic evaluation of animals and crops. First, relationships between the (Bayesian) variance of marker effects in some regression models and additive genetic variance are examined under standard assumptions. Second, the connection between marker genotypes and resemblance between relatives is explored, and linkages between a marker-based model and the infinitesimal model are reviewed. Third, issues associated with the use of Bayesian models for marker-assisted selection, with a focus on the role of the priors, are examined from a theoretical angle. The sensitivity of a Bayesian specification that has been proposed (called “Bayes A”) with respect to priors is illustrated with a simulation. Methods that can solve potential shortcomings of some of these Bayesian regression procedures are discussed briefly. PMID:19620397

  16. Bayesian network modeling applied to coastal geomorphology: lessons learned from a decade of experimentation and application

    NASA Astrophysics Data System (ADS)

    Plant, N. G.; Thieler, E. R.; Gutierrez, B.; Lentz, E. E.; Zeigler, S. L.; Van Dongeren, A.; Fienen, M. N.

    2016-12-01

    We evaluate the strengths and weaknesses of Bayesian networks that have been used to address scientific and decision-support questions related to coastal geomorphology. We will provide an overview of coastal geomorphology research that has used Bayesian networks and describe what this approach can do and when it works (or fails to work). Over the past decade, Bayesian networks have been formulated to analyze the multi-variate structure and evolution of coastal morphology and associated human and ecological impacts. The approach relates observable system variables to each other by estimating discrete correlations. The resulting Bayesian-networks make predictions that propagate errors, conduct inference via Bayes rule, or both. In scientific applications, the model results are useful for hypothesis testing, using confidence estimates to gage the strength of tests while applications to coastal resource management are aimed at decision-support, where the probabilities of desired ecosystems outcomes are evaluated. The range of Bayesian-network applications to coastal morphology includes emulation of high-resolution wave transformation models to make oceanographic predictions, morphologic response to storms and/or sea-level rise, groundwater response to sea-level rise and morphologic variability, habitat suitability for endangered species, and assessment of monetary or human-life risk associated with storms. All of these examples are based on vast observational data sets, numerical model output, or both. We will discuss the progression of our experiments, which has included testing whether the Bayesian-network approach can be implemented and is appropriate for addressing basic and applied scientific problems and evaluating the hindcast and forecast skill of these implementations. We will present and discuss calibration/validation tests that are used to assess the robustness of Bayesian-network models and we will compare these results to tests of other models. This will demonstrate how Bayesian networks are used to extract new insights about coastal morphologic behavior, assess impacts to societal and ecological systems, and communicate probabilistic predictions to decision makers.

  17. A comment on priors for Bayesian occupancy models.

    PubMed

    Northrup, Joseph M; Gerber, Brian D

    2018-01-01

    Understanding patterns of species occurrence and the processes underlying these patterns is fundamental to the study of ecology. One of the more commonly used approaches to investigate species occurrence patterns is occupancy modeling, which can account for imperfect detection of a species during surveys. In recent years, there has been a proliferation of Bayesian modeling in ecology, which includes fitting Bayesian occupancy models. The Bayesian framework is appealing to ecologists for many reasons, including the ability to incorporate prior information through the specification of prior distributions on parameters. While ecologists almost exclusively intend to choose priors so that they are "uninformative" or "vague", such priors can easily be unintentionally highly informative. Here we report on how the specification of a "vague" normally distributed (i.e., Gaussian) prior on coefficients in Bayesian occupancy models can unintentionally influence parameter estimation. Using both simulated data and empirical examples, we illustrate how this issue likely compromises inference about species-habitat relationships. While the extent to which these informative priors influence inference depends on the data set, researchers fitting Bayesian occupancy models should conduct sensitivity analyses to ensure intended inference, or employ less commonly used priors that are less informative (e.g., logistic or t prior distributions). We provide suggestions for addressing this issue in occupancy studies, and an online tool for exploring this issue under different contexts.

  18. Final Report, DOE Early Career Award: Predictive modeling of complex physical systems: new tools for statistical inference, uncertainty quantification, and experimental design

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Marzouk, Youssef

    Predictive simulation of complex physical systems increasingly rests on the interplay of experimental observations with computational models. Key inputs, parameters, or structural aspects of models may be incomplete or unknown, and must be developed from indirect and limited observations. At the same time, quantified uncertainties are needed to qualify computational predictions in the support of design and decision-making. In this context, Bayesian statistics provides a foundation for inference from noisy and limited data, but at prohibitive computional expense. This project intends to make rigorous predictive modeling *feasible* in complex physical systems, via accelerated and scalable tools for uncertainty quantification, Bayesianmore » inference, and experimental design. Specific objectives are as follows: 1. Develop adaptive posterior approximations and dimensionality reduction approaches for Bayesian inference in high-dimensional nonlinear systems. 2. Extend accelerated Bayesian methodologies to large-scale {\\em sequential} data assimilation, fully treating nonlinear models and non-Gaussian state and parameter distributions. 3. Devise efficient surrogate-based methods for Bayesian model selection and the learning of model structure. 4. Develop scalable simulation/optimization approaches to nonlinear Bayesian experimental design, for both parameter inference and model selection. 5. Demonstrate these inferential tools on chemical kinetic models in reacting flow, constructing and refining thermochemical and electrochemical models from limited data. Demonstrate Bayesian filtering on canonical stochastic PDEs and in the dynamic estimation of inhomogeneous subsurface properties and flow fields.« less

  19. RockFall analyst: A GIS extension for three-dimensional and spatially distributed rockfall hazard modeling

    NASA Astrophysics Data System (ADS)

    Lan, Hengxing; Derek Martin, C.; Lim, C. H.

    2007-02-01

    Geographic information system (GIS) modeling is used in combination with three-dimensional (3D) rockfall process modeling to assess rockfall hazards. A GIS extension, RockFall Analyst (RA), which is capable of effectively handling large amounts of geospatial information relative to rockfall behaviors, has been developed in ArcGIS using ArcObjects and C#. The 3D rockfall model considers dynamic processes on a cell plane basis. It uses inputs of distributed parameters in terms of raster and polygon features created in GIS. Two major components are included in RA: particle-based rockfall process modeling and geostatistics-based rockfall raster modeling. Rockfall process simulation results, 3D rockfall trajectories and their velocity features either for point seeders or polyline seeders are stored in 3D shape files. Distributed raster modeling, based on 3D rockfall trajectories and a spatial geostatistical technique, represents the distribution of spatial frequency, the flying and/or bouncing height, and the kinetic energy of falling rocks. A distribution of rockfall hazard can be created by taking these rockfall characteristics into account. A barrier analysis tool is also provided in RA to aid barrier design. An application of these modeling techniques to a case study is provided. The RA has been tested in ArcGIS 8.2, 8.3, 9.0 and 9.1.

  20. Modeling of surface dust concentration in snow cover at industrial area using neural networks and kriging

    NASA Astrophysics Data System (ADS)

    Sergeev, A. P.; Tarasov, D. A.; Buevich, A. G.; Shichkin, A. V.; Tyagunov, A. G.; Medvedev, A. N.

    2017-06-01

    Modeling of spatial distribution of pollutants in the urbanized territories is difficult, especially if there are multiple emission sources. When monitoring such territories, it is often impossible to arrange the necessary detailed sampling. Because of this, the usual methods of analysis and forecasting based on geostatistics are often less effective. Approaches based on artificial neural networks (ANNs) demonstrate the best results under these circumstances. This study compares two models based on ANNs, which are multilayer perceptron (MLP) and generalized regression neural networks (GRNNs) with the base geostatistical method - kriging. Models of the spatial dust distribution in the snow cover around the existing copper quarry and in the area of emissions of a nickel factory were created. To assess the effectiveness of the models three indices were used: the mean absolute error (MAE), the root-mean-square error (RMSE), and the relative root-mean-square error (RRMSE). Taking into account all indices the model of GRNN proved to be the most accurate which included coordinates of the sampling points and the distance to the likely emission source as input parameters for the modeling. Maps of spatial dust distribution in the snow cover were created in the study area. It has been shown that the models based on ANNs were more accurate than the kriging, particularly in the context of a limited data set.

  1. Bayesian estimation inherent in a Mexican-hat-type neural network

    NASA Astrophysics Data System (ADS)

    Takiyama, Ken

    2016-05-01

    Brain functions, such as perception, motor control and learning, and decision making, have been explained based on a Bayesian framework, i.e., to decrease the effects of noise inherent in the human nervous system or external environment, our brain integrates sensory and a priori information in a Bayesian optimal manner. However, it remains unclear how Bayesian computations are implemented in the brain. Herein, I address this issue by analyzing a Mexican-hat-type neural network, which was used as a model of the visual cortex, motor cortex, and prefrontal cortex. I analytically demonstrate that the dynamics of an order parameter in the model corresponds exactly to a variational inference of a linear Gaussian state-space model, a Bayesian estimation, when the strength of recurrent synaptic connectivity is appropriately stronger than that of an external stimulus, a plausible condition in the brain. This exact correspondence can reveal the relationship between the parameters in the Bayesian estimation and those in the neural network, providing insight for understanding brain functions.

  2. A non-parametric postprocessor for bias-correcting multi-model ensemble forecasts of hydrometeorological and hydrologic variables

    NASA Astrophysics Data System (ADS)

    Brown, James; Seo, Dong-Jun

    2010-05-01

    Operational forecasts of hydrometeorological and hydrologic variables often contain large uncertainties, for which ensemble techniques are increasingly used. However, the utility of ensemble forecasts depends on the unbiasedness of the forecast probabilities. We describe a technique for quantifying and removing biases from ensemble forecasts of hydrometeorological and hydrologic variables, intended for use in operational forecasting. The technique makes no a priori assumptions about the distributional form of the variables, which is often unknown or difficult to model parametrically. The aim is to estimate the conditional cumulative distribution function (ccdf) of the observed variable given a (possibly biased) real-time ensemble forecast from one or several forecasting systems (multi-model ensembles). The technique is based on Bayesian optimal linear estimation of indicator variables, and is analogous to indicator cokriging (ICK) in geostatistics. By developing linear estimators for the conditional expectation of the observed variable at many thresholds, ICK provides a discrete approximation of the full ccdf. Since ICK minimizes the conditional error variance of the indicator expectation at each threshold, it effectively minimizes the Continuous Ranked Probability Score (CRPS) when infinitely many thresholds are employed. However, the ensemble members used as predictors in ICK, and other bias-correction techniques, are often highly cross-correlated, both within and between models. Thus, we propose an orthogonal transform of the predictors used in ICK, which is analogous to using their principal components in the linear system of equations. This leads to a well-posed problem in which a minimum number of predictors are used to provide maximum information content in terms of the total variance explained. The technique is used to bias-correct precipitation ensemble forecasts from the NCEP Global Ensemble Forecast System (GEFS), for which independent validation results are presented. Extension to multimodel ensembles from the NCEP GFS and Short Range Ensemble Forecast (SREF) systems is also proposed.

  3. Forward modeling of gravity data using geostatistically generated subsurface density variations

    USGS Publications Warehouse

    Phelps, Geoffrey

    2016-01-01

    Using geostatistical models of density variations in the subsurface, constrained by geologic data, forward models of gravity anomalies can be generated by discretizing the subsurface and calculating the cumulative effect of each cell (pixel). The results of such stochastically generated forward gravity anomalies can be compared with the observed gravity anomalies to find density models that match the observed data. These models have an advantage over forward gravity anomalies generated using polygonal bodies of homogeneous density because generating numerous realizations explores a larger region of the solution space. The stochastic modeling can be thought of as dividing the forward model into two components: that due to the shape of each geologic unit and that due to the heterogeneous distribution of density within each geologic unit. The modeling demonstrates that the internally heterogeneous distribution of density within each geologic unit can contribute significantly to the resulting calculated forward gravity anomaly. Furthermore, the stochastic models match observed statistical properties of geologic units, the solution space is more broadly explored by producing a suite of successful models, and the likelihood of a particular conceptual geologic model can be compared. The Vaca Fault near Travis Air Force Base, California, can be successfully modeled as a normal or strike-slip fault, with the normal fault model being slightly more probable. It can also be modeled as a reverse fault, although this structural geologic configuration is highly unlikely given the realizations we explored.

  4. Adapting geostatistics to analyze spatial and temporal trends in weed populations

    USDA-ARS?s Scientific Manuscript database

    Geostatistics were originally developed in mining to estimate the location, abundance and quality of ore over large areas from soil samples to optimize future mining efforts. Here, some of these methods were adapted to weeds to account for a limited distribution area (i.e., inside a field), variatio...

  5. Estimating Solar PV Output Using Modern Space/Time Geostatistics (Presentation)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lee, S. J.; George, R.; Bush, B.

    2009-04-29

    This presentation describes a project that uses mapping techniques to predict solar output at subhourly resolution at any spatial point, develop a methodology that is applicable to natural resources in general, and demonstrate capability of geostatistical techniques to predict the output of a potential solar plant.

  6. A Nonparametric Geostatistical Method For Estimating Species Importance

    Treesearch

    Andrew J. Lister; Rachel Riemann; Michael Hoppus

    2001-01-01

    Parametric statistical methods are not always appropriate for conducting spatial analyses of forest inventory data. Parametric geostatistical methods such as variography and kriging are essentially averaging procedures, and thus can be affected by extreme values. Furthermore, non normal distributions violate the assumptions of analyses in which test statistics are...

  7. A JACKNIFE APPROACH TO EXAMINE UNCERTAINTY AND TEMPORAL CHANGES IN THE SPATIL CORRELATION OF A VOC PLUME

    EPA Science Inventory

    ABSTRACT: The application of geostatistics to spatial interpolation of time-invariant properties in ground-water studies (such as transmissivity or aquifer thickness) is well documented. The use of geostatistics on time-variant conditions such as ground-water quality is also be...

  8. Development of an Integrated Team Training Design and Assessment Architecture to Support Adaptability in Healthcare Teams

    DTIC Science & Technology

    2016-10-01

    and implementation of embedded, adaptive feedback and performance assessment. The investigators also initiated work designing a Bayesian Belief ...training; Teamwork; Adaptive performance; Leadership; Simulation; Modeling; Bayesian belief networks (BBN) 16. SECURITY CLASSIFICATION OF: 17. LIMITATION...Trauma teams Team training Teamwork Adaptability Adaptive performance Leadership Simulation Modeling Bayesian belief networks (BBN) 6

  9. A Bayesian Network Approach to Modeling Learning Progressions and Task Performance. CRESST Report 776

    ERIC Educational Resources Information Center

    West, Patti; Rutstein, Daisy Wise; Mislevy, Robert J.; Liu, Junhui; Choi, Younyoung; Levy, Roy; Crawford, Aaron; DiCerbo, Kristen E.; Chappel, Kristina; Behrens, John T.

    2010-01-01

    A major issue in the study of learning progressions (LPs) is linking student performance on assessment tasks to the progressions. This report describes the challenges faced in making this linkage using Bayesian networks to model LPs in the field of computer networking. The ideas are illustrated with exemplar Bayesian networks built on Cisco…

  10. Nonparametric Bayesian Modeling for Automated Database Schema Matching

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ferragut, Erik M; Laska, Jason A

    2015-01-01

    The problem of merging databases arises in many government and commercial applications. Schema matching, a common first step, identifies equivalent fields between databases. We introduce a schema matching framework that builds nonparametric Bayesian models for each field and compares them by computing the probability that a single model could have generated both fields. Our experiments show that our method is more accurate and faster than the existing instance-based matching algorithms in part because of the use of nonparametric Bayesian models.

  11. Development of dynamic Bayesian models for web application test management

    NASA Astrophysics Data System (ADS)

    Azarnova, T. V.; Polukhin, P. V.; Bondarenko, Yu V.; Kashirina, I. L.

    2018-03-01

    The mathematical apparatus of dynamic Bayesian networks is an effective and technically proven tool that can be used to model complex stochastic dynamic processes. According to the results of the research, mathematical models and methods of dynamic Bayesian networks provide a high coverage of stochastic tasks associated with error testing in multiuser software products operated in a dynamically changing environment. Formalized representation of the discrete test process as a dynamic Bayesian model allows us to organize the logical connection between individual test assets for multiple time slices. This approach gives an opportunity to present testing as a discrete process with set structural components responsible for the generation of test assets. Dynamic Bayesian network-based models allow us to combine in one management area individual units and testing components with different functionalities and a direct influence on each other in the process of comprehensive testing of various groups of computer bugs. The application of the proposed models provides an opportunity to use a consistent approach to formalize test principles and procedures, methods used to treat situational error signs, and methods used to produce analytical conclusions based on test results.

  12. Trans-dimensional and hierarchical Bayesian approaches toward rigorous estimation of seismic sources and structures in the Northeast Asia

    NASA Astrophysics Data System (ADS)

    Kim, Seongryong; Tkalčić, Hrvoje; Mustać, Marija; Rhie, Junkee; Ford, Sean

    2016-04-01

    A framework is presented within which we provide rigorous estimations for seismic sources and structures in the Northeast Asia. We use Bayesian inversion methods, which enable statistical estimations of models and their uncertainties based on data information. Ambiguities in error statistics and model parameterizations are addressed by hierarchical and trans-dimensional (trans-D) techniques, which can be inherently implemented in the Bayesian inversions. Hence reliable estimation of model parameters and their uncertainties is possible, thus avoiding arbitrary regularizations and parameterizations. Hierarchical and trans-D inversions are performed to develop a three-dimensional velocity model using ambient noise data. To further improve the model, we perform joint inversions with receiver function data using a newly developed Bayesian method. For the source estimation, a novel moment tensor inversion method is presented and applied to regional waveform data of the North Korean nuclear explosion tests. By the combination of new Bayesian techniques and the structural model, coupled with meaningful uncertainties related to each of the processes, more quantitative monitoring and discrimination of seismic events is possible.

  13. Bayesian Models Leveraging Bioactivity and Cytotoxicity Information for Drug Discovery

    PubMed Central

    Ekins, Sean; Reynolds, Robert C.; Kim, Hiyun; Koo, Mi-Sun; Ekonomidis, Marilyn; Talaue, Meliza; Paget, Steve D.; Woolhiser, Lisa K.; Lenaerts, Anne J.; Bunin, Barry A.; Connell, Nancy; Freundlich, Joel S.

    2013-01-01

    SUMMARY Identification of unique leads represents a significant challenge in drug discovery. This hurdle is magnified in neglected diseases such as tuberculosis. We have leveraged public high-throughput screening (HTS) data, to experimentally validate virtual screening approach employing Bayesian models built with bioactivity information (single-event model) as well as bioactivity and cytotoxicity information (dual-event model). We virtually screen a commercial library and experimentally confirm actives with hit rates exceeding typical HTS results by 1-2 orders of magnitude. The first dual-event Bayesian model identified compounds with antitubercular whole-cell activity and low mammalian cell cytotoxicity from a published set of antimalarials. The most potent hit exhibits the in vitro activity and in vitro/in vivo safety profile of a drug lead. These Bayesian models offer significant economies in time and cost to drug discovery. PMID:23521795

  14. Bayesian data analysis for newcomers.

    PubMed

    Kruschke, John K; Liddell, Torrin M

    2018-02-01

    This article explains the foundational concepts of Bayesian data analysis using virtually no mathematical notation. Bayesian ideas already match your intuitions from everyday reasoning and from traditional data analysis. Simple examples of Bayesian data analysis are presented that illustrate how the information delivered by a Bayesian analysis can be directly interpreted. Bayesian approaches to null-value assessment are discussed. The article clarifies misconceptions about Bayesian methods that newcomers might have acquired elsewhere. We discuss prior distributions and explain how they are not a liability but an important asset. We discuss the relation of Bayesian data analysis to Bayesian models of mind, and we briefly discuss what methodological problems Bayesian data analysis is not meant to solve. After you have read this article, you should have a clear sense of how Bayesian data analysis works and the sort of information it delivers, and why that information is so intuitive and useful for drawing conclusions from data.

  15. Evaluation of calibration efficacy under different levels of uncertainty

    DOE PAGES

    Heo, Yeonsook; Graziano, Diane J.; Guzowski, Leah; ...

    2014-06-10

    This study examines how calibration performs under different levels of uncertainty in model input data. It specifically assesses the efficacy of Bayesian calibration to enhance the reliability of EnergyPlus model predictions. A Bayesian approach can be used to update uncertain values of parameters, given measured energy-use data, and to quantify the associated uncertainty.We assess the efficacy of Bayesian calibration under a controlled virtual-reality setup, which enables rigorous validation of the accuracy of calibration results in terms of both calibrated parameter values and model predictions. Case studies demonstrate the performance of Bayesian calibration of base models developed from audit data withmore » differing levels of detail in building design, usage, and operation.« less

  16. Universal Darwinism As a Process of Bayesian Inference.

    PubMed

    Campbell, John O

    2016-01-01

    Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus, natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an "experiment" in the external world environment, and the results of that "experiment" or the "surprise" entailed by predicted and actual outcomes of the "experiment." Minimization of free energy implies that the implicit measure of "surprise" experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature.

  17. Universal Darwinism As a Process of Bayesian Inference

    PubMed Central

    Campbell, John O.

    2016-01-01

    Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus, natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an “experiment” in the external world environment, and the results of that “experiment” or the “surprise” entailed by predicted and actual outcomes of the “experiment.” Minimization of free energy implies that the implicit measure of “surprise” experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature. PMID:27375438

  18. Spatial correlation of shear-wave velocity in the San Francisco Bay Area sediments

    USGS Publications Warehouse

    Thompson, E.M.; Baise, L.G.; Kayen, R.E.

    2007-01-01

    Ground motions recorded within sedimentary basins are variable over short distances. One important cause of the variability is that local soil properties are variable at all scales. Regional hazard maps developed for predicting site effects are generally derived from maps of surficial geology; however, recent studies have shown that mapped geologic units do not correlate well with the average shear-wave velocity of the upper 30 m, Vs(30). We model the horizontal variability of near-surface soil shear-wave velocity in the San Francisco Bay Area to estimate values in unsampled locations in order to account for site effects in a continuous manner. Previous geostatistical studies of soil properties have shown horizontal correlations at the scale of meters to tens of meters while the vertical correlations are on the order of centimeters. In this paper we analyze shear-wave velocity data over regional distances and find that surface shear-wave velocity is correlated at horizontal distances up to 4 km based on data from seismic cone penetration tests and the spectral analysis of surface waves. We propose a method to map site effects by using geostatistical methods based on the shear-wave velocity correlation structure within a sedimentary basin. If used in conjunction with densely spaced shear-wave velocity profiles in regions of high seismic risk, geostatistical methods can produce reliable continuous maps of site effects. ?? 2006 Elsevier Ltd. All rights reserved.

  19. A Bayesian hierarchical diffusion model decomposition of performance in Approach–Avoidance Tasks

    PubMed Central

    Krypotos, Angelos-Miltiadis; Beckers, Tom; Kindt, Merel; Wagenmakers, Eric-Jan

    2015-01-01

    Common methods for analysing response time (RT) tasks, frequently used across different disciplines of psychology, suffer from a number of limitations such as the failure to directly measure the underlying latent processes of interest and the inability to take into account the uncertainty associated with each individual's point estimate of performance. Here, we discuss a Bayesian hierarchical diffusion model and apply it to RT data. This model allows researchers to decompose performance into meaningful psychological processes and to account optimally for individual differences and commonalities, even with relatively sparse data. We highlight the advantages of the Bayesian hierarchical diffusion model decomposition by applying it to performance on Approach–Avoidance Tasks, widely used in the emotion and psychopathology literature. Model fits for two experimental data-sets demonstrate that the model performs well. The Bayesian hierarchical diffusion model overcomes important limitations of current analysis procedures and provides deeper insight in latent psychological processes of interest. PMID:25491372

  20. Multivariate Bayesian modeling of known and unknown causes of events--an application to biosurveillance.

    PubMed

    Shen, Yanna; Cooper, Gregory F

    2012-09-01

    This paper investigates Bayesian modeling of known and unknown causes of events in the context of disease-outbreak detection. We introduce a multivariate Bayesian approach that models multiple evidential features of every person in the population. This approach models and detects (1) known diseases (e.g., influenza and anthrax) by using informative prior probabilities and (2) unknown diseases (e.g., a new, highly contagious respiratory virus that has never been seen before) by using relatively non-informative prior probabilities. We report the results of simulation experiments which support that this modeling method can improve the detection of new disease outbreaks in a population. A contribution of this paper is that it introduces a multivariate Bayesian approach for jointly modeling both known and unknown causes of events. Such modeling has general applicability in domains where the space of known causes is incomplete. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.

  1. Model Comparison of Bayesian Semiparametric and Parametric Structural Equation Models

    ERIC Educational Resources Information Center

    Song, Xin-Yuan; Xia, Ye-Mao; Pan, Jun-Hao; Lee, Sik-Yum

    2011-01-01

    Structural equation models have wide applications. One of the most important issues in analyzing structural equation models is model comparison. This article proposes a Bayesian model comparison statistic, namely the "L[subscript nu]"-measure for both semiparametric and parametric structural equation models. For illustration purposes, we consider…

  2. Improving imperfect data from health management information systems in Africa using space-time geostatistics.

    PubMed

    Gething, Peter W; Noor, Abdisalan M; Gikandi, Priscilla W; Ogara, Esther A A; Hay, Simon I; Nixon, Mark S; Snow, Robert W; Atkinson, Peter M

    2006-06-01

    Reliable and timely information on disease-specific treatment burdens within a health system is critical for the planning and monitoring of service provision. Health management information systems (HMIS) exist to address this need at national scales across Africa but are failing to deliver adequate data because of widespread underreporting by health facilities. Faced with this inadequacy, vital public health decisions often rely on crudely adjusted regional and national estimates of treatment burdens. This study has taken the example of presumed malaria in outpatients within the largely incomplete Kenyan HMIS database and has defined a geostatistical modelling framework that can predict values for all data that are missing through space and time. The resulting complete set can then be used to define treatment burdens for presumed malaria at any level of spatial and temporal aggregation. Validation of the model has shown that these burdens are quantified to an acceptable level of accuracy at the district, provincial, and national scale. The modelling framework presented here provides, to our knowledge for the first time, reliable information from imperfect HMIS data to support evidence-based decision-making at national and sub-national levels.

  3. A geostatistical approach to data harmonization - Application to radioactivity exposure data

    NASA Astrophysics Data System (ADS)

    Baume, O.; Skøien, J. O.; Heuvelink, G. B. M.; Pebesma, E. J.; Melles, S. J.

    2011-06-01

    Environmental issues such as air, groundwater pollution and climate change are frequently studied at spatial scales that cross boundaries between political and administrative regions. It is common for different administrations to employ different data collection methods. If these differences are not taken into account in spatial interpolation procedures then biases may appear and cause unrealistic results. The resulting maps may show misleading patterns and lead to wrong interpretations. Also, errors will propagate when these maps are used as input to environmental process models. In this paper we present and apply a geostatistical model that generalizes the universal kriging model such that it can handle heterogeneous data sources. The associated best linear unbiased estimation and prediction (BLUE and BLUP) equations are presented and it is shown that these lead to harmonized maps from which estimated biases are removed. The methodology is illustrated with an example of country bias removal in a radioactivity exposure assessment for four European countries. The application also addresses multicollinearity problems in data harmonization, which arise when both artificial bias factors and natural drifts are present and cannot easily be distinguished. Solutions for handling multicollinearity are suggested and directions for further investigations proposed.

  4. Improving Imperfect Data from Health Management Information Systems in Africa Using Space–Time Geostatistics

    PubMed Central

    Gething, Peter W; Noor, Abdisalan M; Gikandi, Priscilla W; Ogara, Esther A. A; Hay, Simon I; Nixon, Mark S; Snow, Robert W; Atkinson, Peter M

    2006-01-01

    Background Reliable and timely information on disease-specific treatment burdens within a health system is critical for the planning and monitoring of service provision. Health management information systems (HMIS) exist to address this need at national scales across Africa but are failing to deliver adequate data because of widespread underreporting by health facilities. Faced with this inadequacy, vital public health decisions often rely on crudely adjusted regional and national estimates of treatment burdens. Methods and Findings This study has taken the example of presumed malaria in outpatients within the largely incomplete Kenyan HMIS database and has defined a geostatistical modelling framework that can predict values for all data that are missing through space and time. The resulting complete set can then be used to define treatment burdens for presumed malaria at any level of spatial and temporal aggregation. Validation of the model has shown that these burdens are quantified to an acceptable level of accuracy at the district, provincial, and national scale. Conclusions The modelling framework presented here provides, to our knowledge for the first time, reliable information from imperfect HMIS data to support evidence-based decision-making at national and sub-national levels. PMID:16719557

  5. Scale Mixture Models with Applications to Bayesian Inference

    NASA Astrophysics Data System (ADS)

    Qin, Zhaohui S.; Damien, Paul; Walker, Stephen

    2003-11-01

    Scale mixtures of uniform distributions are used to model non-normal data in time series and econometrics in a Bayesian framework. Heteroscedastic and skewed data models are also tackled using scale mixture of uniform distributions.

  6. Rigorous Approach in Investigation of Seismic Structure and Source Characteristicsin Northeast Asia: Hierarchical and Trans-dimensional Bayesian Inversion

    NASA Astrophysics Data System (ADS)

    Mustac, M.; Kim, S.; Tkalcic, H.; Rhie, J.; Chen, Y.; Ford, S. R.; Sebastian, N.

    2015-12-01

    Conventional approaches to inverse problems suffer from non-linearity and non-uniqueness in estimations of seismic structures and source properties. Estimated results and associated uncertainties are often biased by applied regularizations and additional constraints, which are commonly introduced to solve such problems. Bayesian methods, however, provide statistically meaningful estimations of models and their uncertainties constrained by data information. In addition, hierarchical and trans-dimensional (trans-D) techniques are inherently implemented in the Bayesian framework to account for involved error statistics and model parameterizations, and, in turn, allow more rigorous estimations of the same. Here, we apply Bayesian methods throughout the entire inference process to estimate seismic structures and source properties in Northeast Asia including east China, the Korean peninsula, and the Japanese islands. Ambient noise analysis is first performed to obtain a base three-dimensional (3-D) heterogeneity model using continuous broadband waveforms from more than 300 stations. As for the tomography of surface wave group and phase velocities in the 5-70 s band, we adopt a hierarchical and trans-D Bayesian inversion method using Voronoi partition. The 3-D heterogeneity model is further improved by joint inversions of teleseismic receiver functions and dispersion data using a newly developed high-efficiency Bayesian technique. The obtained model is subsequently used to prepare 3-D structural Green's functions for the source characterization. A hierarchical Bayesian method for point source inversion using regional complete waveform data is applied to selected events from the region. The seismic structure and source characteristics with rigorously estimated uncertainties from the novel Bayesian methods provide enhanced monitoring and discrimination of seismic events in northeast Asia.

  7. Semiparametric Thurstonian Models for Recurrent Choices: A Bayesian Analysis

    ERIC Educational Resources Information Center

    Ansari, Asim; Iyengar, Raghuram

    2006-01-01

    We develop semiparametric Bayesian Thurstonian models for analyzing repeated choice decisions involving multinomial, multivariate binary or multivariate ordinal data. Our modeling framework has multiple components that together yield considerable flexibility in modeling preference utilities, cross-sectional heterogeneity and parameter-driven…

  8. Next Steps in Bayesian Structural Equation Models: Comments on, Variations of, and Extensions to Muthen and Asparouhov (2012)

    ERIC Educational Resources Information Center

    Rindskopf, David

    2012-01-01

    Muthen and Asparouhov (2012) made a strong case for the advantages of Bayesian methodology in factor analysis and structural equation models. I show additional extensions and adaptations of their methods and show how non-Bayesians can take advantage of many (though not all) of these advantages by using interval restrictions on parameters. By…

  9. Careful with Those Priors: A Note on Bayesian Estimation in Two-Parameter Logistic Item Response Theory Models

    ERIC Educational Resources Information Center

    Marcoulides, Katerina M.

    2018-01-01

    This study examined the use of Bayesian analysis methods for the estimation of item parameters in a two-parameter logistic item response theory model. Using simulated data under various design conditions with both informative and non-informative priors, the parameter recovery of Bayesian analysis methods were examined. Overall results showed that…

  10. A Bayesian Approach to Person Fit Analysis in Item Response Theory Models. Research Report.

    ERIC Educational Resources Information Center

    Glas, Cees A. W.; Meijer, Rob R.

    A Bayesian approach to the evaluation of person fit in item response theory (IRT) models is presented. In a posterior predictive check, the observed value on a discrepancy variable is positioned in its posterior distribution. In a Bayesian framework, a Markov Chain Monte Carlo procedure can be used to generate samples of the posterior distribution…

  11. A Tutorial Introduction to Bayesian Models of Cognitive Development

    ERIC Educational Resources Information Center

    Perfors, Amy; Tenenbaum, Joshua B.; Griffiths, Thomas L.; Xu, Fei

    2011-01-01

    We present an introduction to Bayesian inference as it is used in probabilistic models of cognitive development. Our goal is to provide an intuitive and accessible guide to the "what", the "how", and the "why" of the Bayesian approach: what sorts of problems and data the framework is most relevant for, and how and why it may be useful for…

  12. Modeling of Academic Achievement of Primary School Students in Ethiopia Using Bayesian Multilevel Approach

    ERIC Educational Resources Information Center

    Sebro, Negusse Yohannes; Goshu, Ayele Taye

    2017-01-01

    This study aims to explore Bayesian multilevel modeling to investigate variations of average academic achievement of grade eight school students. A sample of 636 students is randomly selected from 26 private and government schools by a two-stage stratified sampling design. Bayesian method is used to estimate the fixed and random effects. Input and…

  13. A Simulation Study Comparison of Bayesian Estimation with Conventional Methods for Estimating Unknown Change Points

    ERIC Educational Resources Information Center

    Wang, Lijuan; McArdle, John J.

    2008-01-01

    The main purpose of this research is to evaluate the performance of a Bayesian approach for estimating unknown change points using Monte Carlo simulations. The univariate and bivariate unknown change point mixed models were presented and the basic idea of the Bayesian approach for estimating the models was discussed. The performance of Bayesian…

  14. Estimation of parameter uncertainty for an activated sludge model using Bayesian inference: a comparison with the frequentist method.

    PubMed

    Zonta, Zivko J; Flotats, Xavier; Magrí, Albert

    2014-08-01

    The procedure commonly used for the assessment of the parameters included in activated sludge models (ASMs) relies on the estimation of their optimal value within a confidence region (i.e. frequentist inference). Once optimal values are estimated, parameter uncertainty is computed through the covariance matrix. However, alternative approaches based on the consideration of the model parameters as probability distributions (i.e. Bayesian inference), may be of interest. The aim of this work is to apply (and compare) both Bayesian and frequentist inference methods when assessing uncertainty for an ASM-type model, which considers intracellular storage and biomass growth, simultaneously. Practical identifiability was addressed exclusively considering respirometric profiles based on the oxygen uptake rate and with the aid of probabilistic global sensitivity analysis. Parameter uncertainty was thus estimated according to both the Bayesian and frequentist inferential procedures. Results were compared in order to evidence the strengths and weaknesses of both approaches. Since it was demonstrated that Bayesian inference could be reduced to a frequentist approach under particular hypotheses, the former can be considered as a more generalist methodology. Hence, the use of Bayesian inference is encouraged for tackling inferential issues in ASM environments.

  15. Optimal health and disease management using spatial uncertainty: a geographic characterization of emergent artemisinin-resistant Plasmodium falciparum distributions in Southeast Asia.

    PubMed

    Grist, Eric P M; Flegg, Jennifer A; Humphreys, Georgina; Mas, Ignacio Suay; Anderson, Tim J C; Ashley, Elizabeth A; Day, Nicholas P J; Dhorda, Mehul; Dondorp, Arjen M; Faiz, M Abul; Gething, Peter W; Hien, Tran T; Hlaing, Tin M; Imwong, Mallika; Kindermans, Jean-Marie; Maude, Richard J; Mayxay, Mayfong; McDew-White, Marina; Menard, Didier; Nair, Shalini; Nosten, Francois; Newton, Paul N; Price, Ric N; Pukrittayakamee, Sasithon; Takala-Harrison, Shannon; Smithuis, Frank; Nguyen, Nhien T; Tun, Kyaw M; White, Nicholas J; Witkowski, Benoit; Woodrow, Charles J; Fairhurst, Rick M; Sibley, Carol Hopkins; Guerin, Philippe J

    2016-10-24

    Artemisinin-resistant Plasmodium falciparum malaria parasites are now present across much of mainland Southeast Asia, where ongoing surveys are measuring and mapping their spatial distribution. These efforts require substantial resources. Here we propose a generic 'smart surveillance' methodology to identify optimal candidate sites for future sampling and thus map the distribution of artemisinin resistance most efficiently. The approach uses the 'uncertainty' map generated iteratively by a geostatistical model to determine optimal locations for subsequent sampling. The methodology is illustrated using recent data on the prevalence of the K13-propeller polymorphism (a genetic marker of artemisinin resistance) in the Greater Mekong Subregion. This methodology, which has broader application to geostatistical mapping in general, could improve the quality and efficiency of drug resistance mapping and thereby guide practical operations to eliminate malaria in affected areas.

  16. Bayesian estimation of differential transcript usage from RNA-seq data.

    PubMed

    Papastamoulis, Panagiotis; Rattray, Magnus

    2017-11-27

    Next generation sequencing allows the identification of genes consisting of differentially expressed transcripts, a term which usually refers to changes in the overall expression level. A specific type of differential expression is differential transcript usage (DTU) and targets changes in the relative within gene expression of a transcript. The contribution of this paper is to: (a) extend the use of cjBitSeq to the DTU context, a previously introduced Bayesian model which is originally designed for identifying changes in overall expression levels and (b) propose a Bayesian version of DRIMSeq, a frequentist model for inferring DTU. cjBitSeq is a read based model and performs fully Bayesian inference by MCMC sampling on the space of latent state of each transcript per gene. BayesDRIMSeq is a count based model and estimates the Bayes Factor of a DTU model against a null model using Laplace's approximation. The proposed models are benchmarked against the existing ones using a recent independent simulation study as well as a real RNA-seq dataset. Our results suggest that the Bayesian methods exhibit similar performance with DRIMSeq in terms of precision/recall but offer better calibration of False Discovery Rate.

  17. Development of uncertainty-based work injury model using Bayesian structural equation modelling.

    PubMed

    Chatterjee, Snehamoy

    2014-01-01

    This paper proposed a Bayesian method-based structural equation model (SEM) of miners' work injury for an underground coal mine in India. The environmental and behavioural variables for work injury were identified and causal relationships were developed. For Bayesian modelling, prior distributions of SEM parameters are necessary to develop the model. In this paper, two approaches were adopted to obtain prior distribution for factor loading parameters and structural parameters of SEM. In the first approach, the prior distributions were considered as a fixed distribution function with specific parameter values, whereas, in the second approach, prior distributions of the parameters were generated from experts' opinions. The posterior distributions of these parameters were obtained by applying Bayesian rule. The Markov Chain Monte Carlo sampling in the form Gibbs sampling was applied for sampling from the posterior distribution. The results revealed that all coefficients of structural and measurement model parameters are statistically significant in experts' opinion-based priors, whereas, two coefficients are not statistically significant when fixed prior-based distributions are applied. The error statistics reveals that Bayesian structural model provides reasonably good fit of work injury with high coefficient of determination (0.91) and less mean squared error as compared to traditional SEM.

  18. BUMPER: the Bayesian User-friendly Model for Palaeo-Environmental Reconstruction

    NASA Astrophysics Data System (ADS)

    Holden, Phil; Birks, John; Brooks, Steve; Bush, Mark; Hwang, Grace; Matthews-Bird, Frazer; Valencia, Bryan; van Woesik, Robert

    2017-04-01

    We describe the Bayesian User-friendly Model for Palaeo-Environmental Reconstruction (BUMPER), a Bayesian transfer function for inferring past climate and other environmental variables from microfossil assemblages. The principal motivation for a Bayesian approach is that the palaeoenvironment is treated probabilistically, and can be updated as additional data become available. Bayesian approaches therefore provide a reconstruction-specific quantification of the uncertainty in the data and in the model parameters. BUMPER is fully self-calibrating, straightforward to apply, and computationally fast, requiring 2 seconds to build a 100-taxon model from a 100-site training-set on a standard personal computer. We apply the model's probabilistic framework to generate thousands of artificial training-sets under ideal assumptions. We then use these to demonstrate both the general applicability of the model and the sensitivity of reconstructions to the characteristics of the training-set, considering assemblage richness, taxon tolerances, and the number of training sites. We demonstrate general applicability to real data, considering three different organism types (chironomids, diatoms, pollen) and different reconstructed variables. In all of these applications an identically configured model is used, the only change being the input files that provide the training-set environment and taxon-count data.

  19. Geostatistics for environmental and geotechnical applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rouhani, S.; Srivastava, R.M.; Desbarats, A.J.

    1996-12-31

    This conference was held January 26--27, 1995 in Phoenix, Arizona. The purpose of this conference was to provide a multidisciplinary forum for exchange of state-of-the-art information on the technology of geostatistics and its applicability for environmental studies, especially site characterization. Individual papers have been processed separately for inclusion in the appropriate data bases.

  20. Geostatistics as a tool to define various categories of resources

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sabourin, R.

    1983-02-01

    Definition of 'measured' and 'indicated' resources tend to be vague. Yet, the calculation of such categories of resources in a mineral deposit calls for specific technical criteria. The author discusses how a geostatistical methodology provides the technical criteria required to classify reasonably assured resources by levels of assurance of their existence.

  1. Integration of GIS, Geostatistics, and 3-D Technology to Assess the Spatial Distribution of Soil Moisture

    NASA Technical Reports Server (NTRS)

    Betts, M.; Tsegaye, T.; Tadesse, W.; Coleman, T. L.; Fahsi, A.

    1998-01-01

    The spatial and temporal distribution of near surface soil moisture is of fundamental importance to many physical, biological, biogeochemical, and hydrological processes. However, knowledge of these space-time dynamics and the processes which control them remains unclear. The integration of geographic information systems (GIS) and geostatistics together promise a simple mechanism to evaluate and display the spatial and temporal distribution of this vital hydrologic and physical variable. Therefore, this research demonstrates the use of geostatistics and GIS to predict and display soil moisture distribution under vegetated and non-vegetated plots. The research was conducted at the Winfred Thomas Agricultural Experiment Station (WTAES), Hazel Green, Alabama. Soil moisture measurement were done on a 10 by 10 m grid from tall fescue grass (GR), alfalfa (AA), bare rough (BR), and bare smooth (BS) plots. Results indicated that variance associated with soil moisture was higher for vegetated plots than non-vegetated plots. The presence of vegetation in general contributed to the spatial variability of soil moisture. Integration of geostatistics and GIS can improve the productivity of farm lands and the precision of farming.

  2. Application of bayesian networks to real-time flood risk estimation

    NASA Astrophysics Data System (ADS)

    Garrote, L.; Molina, M.; Blasco, G.

    2003-04-01

    This paper presents the application of a computational paradigm taken from the field of artificial intelligence - the bayesian network - to model the behaviour of hydrologic basins during floods. The final goal of this research is to develop representation techniques for hydrologic simulation models in order to define, develop and validate a mechanism, supported by a software environment, oriented to build decision models for the prediction and management of river floods in real time. The emphasis is placed on providing decision makers with tools to incorporate their knowledge of basin behaviour, usually formulated in terms of rainfall-runoff models, in the process of real-time decision making during floods. A rainfall-runoff model is only a step in the process of decision making. If a reliable rainfall forecast is available and the rainfall-runoff model is well calibrated, decisions can be based mainly on model results. However, in most practical situations, uncertainties in rainfall forecasts or model performance have to be incorporated in the decision process. The computation paradigm adopted for the simulation of hydrologic processes is the bayesian network. A bayesian network is a directed acyclic graph that represents causal influences between linked variables. Under this representation, uncertain qualitative variables are related through causal relations quantified with conditional probabilities. The solution algorithm allows the computation of the expected probability distribution of unknown variables conditioned to the observations. An approach to represent hydrologic processes by bayesian networks with temporal and spatial extensions is presented in this paper, together with a methodology for the development of bayesian models using results produced by deterministic hydrologic simulation models

  3. Model-based Bayesian inference for ROC data analysis

    NASA Astrophysics Data System (ADS)

    Lei, Tianhu; Bae, K. Ty

    2013-03-01

    This paper presents a study of model-based Bayesian inference to Receiver Operating Characteristics (ROC) data. The model is a simple version of general non-linear regression model. Different from Dorfman model, it uses a probit link function with a covariate variable having zero-one two values to express binormal distributions in a single formula. Model also includes a scale parameter. Bayesian inference is implemented by Markov Chain Monte Carlo (MCMC) method carried out by Bayesian analysis Using Gibbs Sampling (BUGS). Contrast to the classical statistical theory, Bayesian approach considers model parameters as random variables characterized by prior distributions. With substantial amount of simulated samples generated by sampling algorithm, posterior distributions of parameters as well as parameters themselves can be accurately estimated. MCMC-based BUGS adopts Adaptive Rejection Sampling (ARS) protocol which requires the probability density function (pdf) which samples are drawing from be log concave with respect to the targeted parameters. Our study corrects a common misconception and proves that pdf of this regression model is log concave with respect to its scale parameter. Therefore, ARS's requirement is satisfied and a Gaussian prior which is conjugate and possesses many analytic and computational advantages is assigned to the scale parameter. A cohort of 20 simulated data sets and 20 simulations from each data set are used in our study. Output analysis and convergence diagnostics for MCMC method are assessed by CODA package. Models and methods by using continuous Gaussian prior and discrete categorical prior are compared. Intensive simulations and performance measures are given to illustrate our practice in the framework of model-based Bayesian inference using MCMC method.

  4. Learning Kriging by an instructive program.

    NASA Astrophysics Data System (ADS)

    Cuador, José

    2016-04-01

    There are three types of problem classification: the deterministic, the approximated and the stochastic problems. First, in the deterministic problems the law of the phenomenon and the data are known in the entire domain and for each instant of time. In the approximated problems, the law of the phenomenon behavior is unknown but the data can be known in the entire domain and for each instant of time. In the stochastic problems much of the law and the data are unknown in the domain, so in this case the spatial behavior of the data can only be explained with probabilistic laws. This is the most important reason why the students of geo-sciences careers and others related careers need to take courses in advance estimation methods. A good example of this situation is the estimation grades in ore mineral deposit for which the Geostatistics was formalized by G. Matheron in 1962 [6]. Geostatistics is defined as the application of the theory of Random Function to the recognition and estimation of natural phenomenon [4]. Nowadays, Geostatistics is widely used in several fields of earth sciences, for example: Mining, Oil exploration, Environment, Agricultural, Forest and others [3]. It provides a wide variety of tools for spatial data analysis and allows analysing models which are subjected to degrees of uncertainty with the rigor of mathematics and formal statistical analysis [9]. Adequate models for the Kriging interpolator has been developed according to the data behavior; however there are two key steps in applying this interpolator properly: the semivariogram determination and the Kriging neighborhood selection. The main objective of this paper is to present these two elements using an instructive program.

  5. Geostatistical Modeling of the Spatial Distribution of Sediment Oxygen Demand Within a Coastal Plain Blackwater Watershed

    USDA-ARS?s Scientific Manuscript database

    Blackwater streams of the Georgia Coastal Plain are often listed as impaired due to chronically low DO levels. Previous research has shown that high sediment oxygen demand (SOD) values, a hypothesized cause of lowered DO within these waters, are significantly positively correlated with TOC within th...

  6. Periodicity in spatial data and geostatistical models: autocorrelation between patches

    Treesearch

    Volker C. Radeloff; Todd F. Miller; Hong S. He; David J. Mladenoff

    2000-01-01

    Several recent studies in landscape ecology have found periodicity in correlograms or semi-variograms calculated, for instance, from spatial data of soils, forests, or animal populations. Some of the studies interpreted this as an indication of regular or periodic landscape patterns. This interpretation is in disagreement with other studies that doubt whether such...

  7. Area-to-point regression kriging for pan-sharpening

    NASA Astrophysics Data System (ADS)

    Wang, Qunming; Shi, Wenzhong; Atkinson, Peter M.

    2016-04-01

    Pan-sharpening is a technique to combine the fine spatial resolution panchromatic (PAN) band with the coarse spatial resolution multispectral bands of the same satellite to create a fine spatial resolution multispectral image. In this paper, area-to-point regression kriging (ATPRK) is proposed for pan-sharpening. ATPRK considers the PAN band as the covariate. Moreover, ATPRK is extended with a local approach, called adaptive ATPRK (AATPRK), which fits a regression model using a local, non-stationary scheme such that the regression coefficients change across the image. The two geostatistical approaches, ATPRK and AATPRK, were compared to the 13 state-of-the-art pan-sharpening approaches summarized in Vivone et al. (2015) in experiments on three separate datasets. ATPRK and AATPRK produced more accurate pan-sharpened images than the 13 benchmark algorithms in all three experiments. Unlike the benchmark algorithms, the two geostatistical solutions precisely preserved the spectral properties of the original coarse data. Furthermore, ATPRK can be enhanced by a local scheme in AATRPK, in cases where the residuals from a global regression model are such that their spatial character varies locally.

  8. Fast Geostatistical Inversion using Randomized Matrix Decompositions and Sketchings for Heterogeneous Aquifer Characterization

    NASA Astrophysics Data System (ADS)

    O'Malley, D.; Le, E. B.; Vesselinov, V. V.

    2015-12-01

    We present a fast, scalable, and highly-implementable stochastic inverse method for characterization of aquifer heterogeneity. The method utilizes recent advances in randomized matrix algebra and exploits the structure of the Quasi-Linear Geostatistical Approach (QLGA), without requiring a structured grid like Fast-Fourier Transform (FFT) methods. The QLGA framework is a more stable version of Gauss-Newton iterates for a large number of unknown model parameters, but provides unbiased estimates. The methods are matrix-free and do not require derivatives or adjoints, and are thus ideal for complex models and black-box implementation. We also incorporate randomized least-square solvers and data-reduction methods, which speed up computation and simulate missing data points. The new inverse methodology is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). Julia is an advanced high-level scientific programing language that allows for efficient memory management and utilization of high-performance computational resources. Inversion results based on series of synthetic problems with steady-state and transient calibration data are presented.

  9. The applications of model-based geostatistics in helminth epidemiology and control.

    PubMed

    Magalhães, Ricardo J Soares; Clements, Archie C A; Patil, Anand P; Gething, Peter W; Brooker, Simon

    2011-01-01

    Funding agencies are dedicating substantial resources to tackle helminth infections. Reliable maps of the distribution of helminth infection can assist these efforts by targeting control resources to areas of greatest need. The ability to define the distribution of infection at regional, national and subnational levels has been enhanced greatly by the increased availability of good quality survey data and the use of model-based geostatistics (MBG), enabling spatial prediction in unsampled locations. A major advantage of MBG risk mapping approaches is that they provide a flexible statistical platform for handling and representing different sources of uncertainty, providing plausible and robust information on the spatial distribution of infections to inform the design and implementation of control programmes. Focussing on schistosomiasis and soil-transmitted helminthiasis, with additional examples for lymphatic filariasis and onchocerciasis, we review the progress made to date with the application of MBG tools in large-scale, real-world control programmes and propose a general framework for their application to inform integrative spatial planning of helminth disease control programmes. Copyright © 2011 Elsevier Ltd. All rights reserved.

  10. Kriging in the Shadows: Geostatistical Interpolation for Remote Sensing

    NASA Technical Reports Server (NTRS)

    Rossi, Richard E.; Dungan, Jennifer L.; Beck, Louisa R.

    1994-01-01

    It is often useful to estimate obscured or missing remotely sensed data. Traditional interpolation methods, such as nearest-neighbor or bilinear resampling, do not take full advantage of the spatial information in the image. An alternative method, a geostatistical technique known as indicator kriging, is described and demonstrated using a Landsat Thematic Mapper image in southern Chiapas, Mexico. The image was first classified into pasture and nonpasture land cover. For each pixel that was obscured by cloud or cloud shadow, the probability that it was pasture was assigned by the algorithm. An exponential omnidirectional variogram model was used to characterize the spatial continuity of the image for use in the kriging algorithm. Assuming a cutoff probability level of 50%, the error was shown to be 17% with no obvious spatial bias but with some tendency to categorize nonpasture as pasture (overestimation). While this is a promising result, the method's practical application in other missing data problems for remotely sensed images will depend on the amount and spatial pattern of the unobscured pixels and missing pixels and the success of the spatial continuity model used.

  11. Bayesian Modeling of a Human MMORPG Player

    NASA Astrophysics Data System (ADS)

    Synnaeve, Gabriel; Bessière, Pierre

    2011-03-01

    This paper describes an application of Bayesian programming to the control of an autonomous avatar in a multiplayer role-playing game (the example is based on World of Warcraft). We model a particular task, which consists of choosing what to do and to select which target in a situation where allies and foes are present. We explain the model in Bayesian programming and show how we could learn the conditional probabilities from data gathered during human-played sessions.

  12. A comment on priors for Bayesian occupancy models

    PubMed Central

    Gerber, Brian D.

    2018-01-01

    Understanding patterns of species occurrence and the processes underlying these patterns is fundamental to the study of ecology. One of the more commonly used approaches to investigate species occurrence patterns is occupancy modeling, which can account for imperfect detection of a species during surveys. In recent years, there has been a proliferation of Bayesian modeling in ecology, which includes fitting Bayesian occupancy models. The Bayesian framework is appealing to ecologists for many reasons, including the ability to incorporate prior information through the specification of prior distributions on parameters. While ecologists almost exclusively intend to choose priors so that they are “uninformative” or “vague”, such priors can easily be unintentionally highly informative. Here we report on how the specification of a “vague” normally distributed (i.e., Gaussian) prior on coefficients in Bayesian occupancy models can unintentionally influence parameter estimation. Using both simulated data and empirical examples, we illustrate how this issue likely compromises inference about species-habitat relationships. While the extent to which these informative priors influence inference depends on the data set, researchers fitting Bayesian occupancy models should conduct sensitivity analyses to ensure intended inference, or employ less commonly used priors that are less informative (e.g., logistic or t prior distributions). We provide suggestions for addressing this issue in occupancy studies, and an online tool for exploring this issue under different contexts. PMID:29481554

  13. The Bayesian boom: good thing or bad?

    PubMed Central

    Hahn, Ulrike

    2014-01-01

    A series of high-profile critiques of Bayesian models of cognition have recently sparked controversy. These critiques question the contribution of rational, normative considerations in the study of cognition. The present article takes central claims from these critiques and evaluates them in light of specific models. Closer consideration of actual examples of Bayesian treatments of different cognitive phenomena allows one to defuse these critiques showing that they cannot be sustained across the diversity of applications of the Bayesian framework for cognitive modeling. More generally, there is nothing in the Bayesian framework that would inherently give rise to the deficits that these critiques perceive, suggesting they have been framed at the wrong level of generality. At the same time, the examples are used to demonstrate the different ways in which consideration of rationality uniquely benefits both theory and practice in the study of cognition. PMID:25152738

  14. Calibrating Bayesian Network Representations of Social-Behavioral Models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Whitney, Paul D.; Walsh, Stephen J.

    2010-04-08

    While human behavior has long been studied, recent and ongoing advances in computational modeling present opportunities for recasting research outcomes in human behavior. In this paper we describe how Bayesian networks can represent outcomes of human behavior research. We demonstrate a Bayesian network that represents political radicalization research – and show a corresponding visual representation of aspects of this research outcome. Since Bayesian networks can be quantitatively compared with external observations, the representation can also be used for empirical assessments of the research which the network summarizes. For a political radicalization model based on published research, we show this empiricalmore » comparison with data taken from the Minorities at Risk Organizational Behaviors database.« less

  15. A comprehensive probabilistic analysis model of oil pipelines network based on Bayesian network

    NASA Astrophysics Data System (ADS)

    Zhang, C.; Qin, T. X.; Jiang, B.; Huang, C.

    2018-02-01

    Oil pipelines network is one of the most important facilities of energy transportation. But oil pipelines network accident may result in serious disasters. Some analysis models for these accidents have been established mainly based on three methods, including event-tree, accident simulation and Bayesian network. Among these methods, Bayesian network is suitable for probabilistic analysis. But not all the important influencing factors are considered and the deployment rule of the factors has not been established. This paper proposed a probabilistic analysis model of oil pipelines network based on Bayesian network. Most of the important influencing factors, including the key environment condition and emergency response are considered in this model. Moreover, the paper also introduces a deployment rule for these factors. The model can be used in probabilistic analysis and sensitive analysis of oil pipelines network accident.

  16. An evaluation of Bayesian techniques for controlling model complexity and selecting inputs in a neural network for short-term load forecasting.

    PubMed

    Hippert, Henrique S; Taylor, James W

    2010-04-01

    Artificial neural networks have frequently been proposed for electricity load forecasting because of their capabilities for the nonlinear modelling of large multivariate data sets. Modelling with neural networks is not an easy task though; two of the main challenges are defining the appropriate level of model complexity, and choosing the input variables. This paper evaluates techniques for automatic neural network modelling within a Bayesian framework, as applied to six samples containing daily load and weather data for four different countries. We analyse input selection as carried out by the Bayesian 'automatic relevance determination', and the usefulness of the Bayesian 'evidence' for the selection of the best structure (in terms of number of neurones), as compared to methods based on cross-validation. Copyright 2009 Elsevier Ltd. All rights reserved.

  17. An evaluation of the Bayesian approach to fitting the N-mixture model for use with pseudo-replicated count data

    USGS Publications Warehouse

    Toribo, S.G.; Gray, B.R.; Liang, S.

    2011-01-01

    The N-mixture model proposed by Royle in 2004 may be used to approximate the abundance and detection probability of animal species in a given region. In 2006, Royle and Dorazio discussed the advantages of using a Bayesian approach in modelling animal abundance and occurrence using a hierarchical N-mixture model. N-mixture models assume replication on sampling sites, an assumption that may be violated when the site is not closed to changes in abundance during the survey period or when nominal replicates are defined spatially. In this paper, we studied the robustness of a Bayesian approach to fitting the N-mixture model for pseudo-replicated count data. Our simulation results showed that the Bayesian estimates for abundance and detection probability are slightly biased when the actual detection probability is small and are sensitive to the presence of extra variability within local sites.

  18. Bayesian analysis of CCDM models

    NASA Astrophysics Data System (ADS)

    Jesus, J. F.; Valentim, R.; Andrade-Oliveira, F.

    2017-09-01

    Creation of Cold Dark Matter (CCDM), in the context of Einstein Field Equations, produces a negative pressure term which can be used to explain the accelerated expansion of the Universe. In this work we tested six different spatially flat models for matter creation using statistical criteria, in light of SNe Ia data: Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and Bayesian Evidence (BE). These criteria allow to compare models considering goodness of fit and number of free parameters, penalizing excess of complexity. We find that JO model is slightly favoured over LJO/ΛCDM model, however, neither of these, nor Γ = 3αH0 model can be discarded from the current analysis. Three other scenarios are discarded either because poor fitting or because of the excess of free parameters. A method of increasing Bayesian evidence through reparameterization in order to reducing parameter degeneracy is also developed.

  19. Refining value-at-risk estimates using a Bayesian Markov-switching GJR-GARCH copula-EVT model.

    PubMed

    Sampid, Marius Galabe; Hasim, Haslifah M; Dai, Hongsheng

    2018-01-01

    In this paper, we propose a model for forecasting Value-at-Risk (VaR) using a Bayesian Markov-switching GJR-GARCH(1,1) model with skewed Student's-t innovation, copula functions and extreme value theory. A Bayesian Markov-switching GJR-GARCH(1,1) model that identifies non-constant volatility over time and allows the GARCH parameters to vary over time following a Markov process, is combined with copula functions and EVT to formulate the Bayesian Markov-switching GJR-GARCH(1,1) copula-EVT VaR model, which is then used to forecast the level of risk on financial asset returns. We further propose a new method for threshold selection in EVT analysis, which we term the hybrid method. Empirical and back-testing results show that the proposed VaR models capture VaR reasonably well in periods of calm and in periods of crisis.

  20. Comparing interval estimates for small sample ordinal CFA models

    PubMed Central

    Natesan, Prathiba

    2015-01-01

    Robust maximum likelihood (RML) and asymptotically generalized least squares (AGLS) methods have been recommended for fitting ordinal structural equation models. Studies show that some of these methods underestimate standard errors. However, these studies have not investigated the coverage and bias of interval estimates. An estimate with a reasonable standard error could still be severely biased. This can only be known by systematically investigating the interval estimates. The present study compares Bayesian, RML, and AGLS interval estimates of factor correlations in ordinal confirmatory factor analysis models (CFA) for small sample data. Six sample sizes, 3 factor correlations, and 2 factor score distributions (multivariate normal and multivariate mildly skewed) were studied. Two Bayesian prior specifications, informative and relatively less informative were studied. Undercoverage of confidence intervals and underestimation of standard errors was common in non-Bayesian methods. Underestimated standard errors may lead to inflated Type-I error rates. Non-Bayesian intervals were more positive biased than negatively biased, that is, most intervals that did not contain the true value were greater than the true value. Some non-Bayesian methods had non-converging and inadmissible solutions for small samples and non-normal data. Bayesian empirical standard error estimates for informative and relatively less informative priors were closer to the average standard errors of the estimates. The coverage of Bayesian credibility intervals was closer to what was expected with overcoverage in a few cases. Although some Bayesian credibility intervals were wider, they reflected the nature of statistical uncertainty that comes with the data (e.g., small sample). Bayesian point estimates were also more accurate than non-Bayesian estimates. The results illustrate the importance of analyzing coverage and bias of interval estimates, and how ignoring interval estimates can be misleading. Therefore, editors and policymakers should continue to emphasize the inclusion of interval estimates in research. PMID:26579002

  1. Comparing interval estimates for small sample ordinal CFA models.

    PubMed

    Natesan, Prathiba

    2015-01-01

    Robust maximum likelihood (RML) and asymptotically generalized least squares (AGLS) methods have been recommended for fitting ordinal structural equation models. Studies show that some of these methods underestimate standard errors. However, these studies have not investigated the coverage and bias of interval estimates. An estimate with a reasonable standard error could still be severely biased. This can only be known by systematically investigating the interval estimates. The present study compares Bayesian, RML, and AGLS interval estimates of factor correlations in ordinal confirmatory factor analysis models (CFA) for small sample data. Six sample sizes, 3 factor correlations, and 2 factor score distributions (multivariate normal and multivariate mildly skewed) were studied. Two Bayesian prior specifications, informative and relatively less informative were studied. Undercoverage of confidence intervals and underestimation of standard errors was common in non-Bayesian methods. Underestimated standard errors may lead to inflated Type-I error rates. Non-Bayesian intervals were more positive biased than negatively biased, that is, most intervals that did not contain the true value were greater than the true value. Some non-Bayesian methods had non-converging and inadmissible solutions for small samples and non-normal data. Bayesian empirical standard error estimates for informative and relatively less informative priors were closer to the average standard errors of the estimates. The coverage of Bayesian credibility intervals was closer to what was expected with overcoverage in a few cases. Although some Bayesian credibility intervals were wider, they reflected the nature of statistical uncertainty that comes with the data (e.g., small sample). Bayesian point estimates were also more accurate than non-Bayesian estimates. The results illustrate the importance of analyzing coverage and bias of interval estimates, and how ignoring interval estimates can be misleading. Therefore, editors and policymakers should continue to emphasize the inclusion of interval estimates in research.

  2. Bayesian demography 250 years after Bayes

    PubMed Central

    Bijak, Jakub; Bryant, John

    2016-01-01

    Bayesian statistics offers an alternative to classical (frequentist) statistics. It is distinguished by its use of probability distributions to describe uncertain quantities, which leads to elegant solutions to many difficult statistical problems. Although Bayesian demography, like Bayesian statistics more generally, is around 250 years old, only recently has it begun to flourish. The aim of this paper is to review the achievements of Bayesian demography, address some misconceptions, and make the case for wider use of Bayesian methods in population studies. We focus on three applications: demographic forecasts, limited data, and highly structured or complex models. The key advantages of Bayesian methods are the ability to integrate information from multiple sources and to describe uncertainty coherently. Bayesian methods also allow for including additional (prior) information next to the data sample. As such, Bayesian approaches are complementary to many traditional methods, which can be productively re-expressed in Bayesian terms. PMID:26902889

  3. Robust geostatistical analysis of spatial data

    NASA Astrophysics Data System (ADS)

    Papritz, Andreas; Künsch, Hans Rudolf; Schwierz, Cornelia; Stahel, Werner A.

    2013-04-01

    Most of the geostatistical software tools rely on non-robust algorithms. This is unfortunate, because outlying observations are rather the rule than the exception, in particular in environmental data sets. Outliers affect the modelling of the large-scale spatial trend, the estimation of the spatial dependence of the residual variation and the predictions by kriging. Identifying outliers manually is cumbersome and requires expertise because one needs parameter estimates to decide which observation is a potential outlier. Moreover, inference after the rejection of some observations is problematic. A better approach is to use robust algorithms that prevent automatically that outlying observations have undue influence. Former studies on robust geostatistics focused on robust estimation of the sample variogram and ordinary kriging without external drift. Furthermore, Richardson and Welsh (1995) proposed a robustified version of (restricted) maximum likelihood ([RE]ML) estimation for the variance components of a linear mixed model, which was later used by Marchant and Lark (2007) for robust REML estimation of the variogram. We propose here a novel method for robust REML estimation of the variogram of a Gaussian random field that is possibly contaminated by independent errors from a long-tailed distribution. It is based on robustification of estimating equations for the Gaussian REML estimation (Welsh and Richardson, 1997). Besides robust estimates of the parameters of the external drift and of the variogram, the method also provides standard errors for the estimated parameters, robustified kriging predictions at both sampled and non-sampled locations and kriging variances. Apart from presenting our modelling framework, we shall present selected simulation results by which we explored the properties of the new method. This will be complemented by an analysis a data set on heavy metal contamination of the soil in the vicinity of a metal smelter. Marchant, B.P. and Lark, R.M. 2007. Robust estimation of the variogram by residual maximum likelihood. Geoderma 140: 62-72. Richardson, A.M. and Welsh, A.H. 1995. Robust restricted maximum likelihood in mixed linear models. Biometrics 51: 1429-1439. Welsh, A.H. and Richardson, A.M. 1997. Approaches to the robust estimation of mixed models. In: Handbook of Statistics Vol. 15, Elsevier, pp. 343-384.

  4. Hybrid Optimal Design of the Eco-Hydrological Wireless Sensor Network in the Middle Reach of the Heihe River Basin, China

    PubMed Central

    Kang, Jian; Li, Xin; Jin, Rui; Ge, Yong; Wang, Jinfeng; Wang, Jianghao

    2014-01-01

    The eco-hydrological wireless sensor network (EHWSN) in the middle reaches of the Heihe River Basin in China is designed to capture the spatial and temporal variability and to estimate the ground truth for validating the remote sensing productions. However, there is no available prior information about a target variable. To meet both requirements, a hybrid model-based sampling method without any spatial autocorrelation assumptions is developed to optimize the distribution of EHWSN nodes based on geostatistics. This hybrid model incorporates two sub-criteria: one for the variogram modeling to represent the variability, another for improving the spatial prediction to evaluate remote sensing productions. The reasonability of the optimized EHWSN is validated from representativeness, the variogram modeling and the spatial accuracy through using 15 types of simulation fields generated with the unconditional geostatistical stochastic simulation. The sampling design shows good representativeness; variograms estimated by samples have less than 3% mean error relative to true variograms. Then, fields at multiple scales are predicted. As the scale increases, estimated fields have higher similarities to simulation fields at block sizes exceeding 240 m. The validations prove that this hybrid sampling method is effective for both objectives when we do not know the characteristics of an optimized variables. PMID:25317762

  5. Accounting for pH heterogeneity and variability in modelling human health risks from cadmium in contaminated land.

    PubMed

    Gay, J Rebecca; Korre, Anna

    2009-07-01

    The authors have previously published a methodology which combines quantitative probabilistic human health risk assessment and spatial statistical methods (geostatistics) to produce an assessment, incorporating uncertainty, of risks to human health from exposure to contaminated land. The model assumes a constant soil to plant concentration factor (CF(veg)) when calculating intake of contaminants. This model is modified here to enhance its use in a situation where CF(veg) varies according to soil pH, as is the case for cadmium. The original methodology uses sequential indicator simulation (SIS) to map soil concentration estimates for one contaminant across a site. A real, age-stratified population is mapped across the contaminated area, and intake of soil contaminants by individuals is calculated probabilistically using an adaptation of the Contaminated Land Exposure Assessment (CLEA) model. The proposed improvement involves not only the geostatistical estimation of the contaminant concentration, but also that of soil pH, which in turn leads to a variable CF(veg) estimate which influences the human intake results. The results presented demonstrate that taking pH into account can influence the outcome of the risk assessment greatly. It is proposed that a similar adaptation could be used for other combinations of soil variables which influence CF(veg).

  6. Hybrid optimal design of the eco-hydrological wireless sensor network in the middle reach of the Heihe River Basin, China.

    PubMed

    Kang, Jian; Li, Xin; Jin, Rui; Ge, Yong; Wang, Jinfeng; Wang, Jianghao

    2014-10-14

    The eco-hydrological wireless sensor network (EHWSN) in the middle reaches of the Heihe River Basin in China is designed to capture the spatial and temporal variability and to estimate the ground truth for validating the remote sensing productions. However, there is no available prior information about a target variable. To meet both requirements, a hybrid model-based sampling method without any spatial autocorrelation assumptions is developed to optimize the distribution of EHWSN nodes based on geostatistics. This hybrid model incorporates two sub-criteria: one for the variogram modeling to represent the variability, another for improving the spatial prediction to evaluate remote sensing productions. The reasonability of the optimized EHWSN is validated from representativeness, the variogram modeling and the spatial accuracy through using 15 types of simulation fields generated with the unconditional geostatistical stochastic simulation. The sampling design shows good representativeness; variograms estimated by samples have less than 3% mean error relative to true variograms. Then, fields at multiple scales are predicted. As the scale increases, estimated fields have higher similarities to simulation fields at block sizes exceeding 240 m. The validations prove that this hybrid sampling method is effective for both objectives when we do not know the characteristics of an optimized variables.

  7. Inverse modeling of hydraulic tests in fractured crystalline rock based on a transition probability geostatistical approach

    NASA Astrophysics Data System (ADS)

    Blessent, Daniela; Therrien, René; Lemieux, Jean-Michel

    2011-12-01

    This paper presents numerical simulations of a series of hydraulic interference tests conducted in crystalline bedrock at Olkiluoto (Finland), a potential site for the disposal of the Finnish high-level nuclear waste. The tests are in a block of crystalline bedrock of about 0.03 km3 that contains low-transmissivity fractures. Fracture density, orientation, and fracture transmissivity are estimated from Posiva Flow Log (PFL) measurements in boreholes drilled in the rock block. On the basis of those data, a geostatistical approach relying on a transitional probability and Markov chain models is used to define a conceptual model based on stochastic fractured rock facies. Four facies are defined, from sparsely fractured bedrock to highly fractured bedrock. Using this conceptual model, three-dimensional groundwater flow is then simulated to reproduce interference pumping tests in either open or packed-off boreholes. Hydraulic conductivities of the fracture facies are estimated through automatic calibration using either hydraulic heads or both hydraulic heads and PFL flow rates as targets for calibration. The latter option produces a narrower confidence interval for the calibrated hydraulic conductivities, therefore reducing the associated uncertainty and demonstrating the usefulness of the measured PFL flow rates. Furthermore, the stochastic facies conceptual model is a suitable alternative to discrete fracture network models to simulate fluid flow in fractured geological media.

  8. A Bayesian approach to meta-analysis of plant pathology studies.

    PubMed

    Mila, A L; Ngugi, H K

    2011-01-01

    Bayesian statistical methods are used for meta-analysis in many disciplines, including medicine, molecular biology, and engineering, but have not yet been applied for quantitative synthesis of plant pathology studies. In this paper, we illustrate the key concepts of Bayesian statistics and outline the differences between Bayesian and classical (frequentist) methods in the way parameters describing population attributes are considered. We then describe a Bayesian approach to meta-analysis and present a plant pathological example based on studies evaluating the efficacy of plant protection products that induce systemic acquired resistance for the management of fire blight of apple. In a simple random-effects model assuming a normal distribution of effect sizes and no prior information (i.e., a noninformative prior), the results of the Bayesian meta-analysis are similar to those obtained with classical methods. Implementing the same model with a Student's t distribution and a noninformative prior for the effect sizes, instead of a normal distribution, yields similar results for all but acibenzolar-S-methyl (Actigard) which was evaluated only in seven studies in this example. Whereas both the classical (P = 0.28) and the Bayesian analysis with a noninformative prior (95% credibility interval [CRI] for the log response ratio: -0.63 to 0.08) indicate a nonsignificant effect for Actigard, specifying a t distribution resulted in a significant, albeit variable, effect for this product (CRI: -0.73 to -0.10). These results confirm the sensitivity of the analytical outcome (i.e., the posterior distribution) to the choice of prior in Bayesian meta-analyses involving a limited number of studies. We review some pertinent literature on more advanced topics, including modeling of among-study heterogeneity, publication bias, analyses involving a limited number of studies, and methods for dealing with missing data, and show how these issues can be approached in a Bayesian framework. Bayesian meta-analysis can readily include information not easily incorporated in classical methods, and allow for a full evaluation of competing models. Given the power and flexibility of Bayesian methods, we expect them to become widely adopted for meta-analysis of plant pathology studies.

  9. Introduction to Bayesian statistical approaches to compositional analyses of transgenic crops 1. Model validation and setting the stage.

    PubMed

    Harrison, Jay M; Breeze, Matthew L; Harrigan, George G

    2011-08-01

    Statistical comparisons of compositional data generated on genetically modified (GM) crops and their near-isogenic conventional (non-GM) counterparts typically rely on classical significance testing. This manuscript presents an introduction to Bayesian methods for compositional analysis along with recommendations for model validation. The approach is illustrated using protein and fat data from two herbicide tolerant GM soybeans (MON87708 and MON87708×MON89788) and a conventional comparator grown in the US in 2008 and 2009. Guidelines recommended by the US Food and Drug Administration (FDA) in conducting Bayesian analyses of clinical studies on medical devices were followed. This study is the first Bayesian approach to GM and non-GM compositional comparisons. The evaluation presented here supports a conclusion that a Bayesian approach to analyzing compositional data can provide meaningful and interpretable results. We further describe the importance of method validation and approaches to model checking if Bayesian approaches to compositional data analysis are to be considered viable by scientists involved in GM research and regulation. Copyright © 2011 Elsevier Inc. All rights reserved.

  10. A surrogate-based sensitivity quantification and Bayesian inversion of a regional groundwater flow model

    NASA Astrophysics Data System (ADS)

    Chen, Mingjie; Izady, Azizallah; Abdalla, Osman A.; Amerjeed, Mansoor

    2018-02-01

    Bayesian inference using Markov Chain Monte Carlo (MCMC) provides an explicit framework for stochastic calibration of hydrogeologic models accounting for uncertainties; however, the MCMC sampling entails a large number of model calls, and could easily become computationally unwieldy if the high-fidelity hydrogeologic model simulation is time consuming. This study proposes a surrogate-based Bayesian framework to address this notorious issue, and illustrates the methodology by inverse modeling a regional MODFLOW model. The high-fidelity groundwater model is approximated by a fast statistical model using Bagging Multivariate Adaptive Regression Spline (BMARS) algorithm, and hence the MCMC sampling can be efficiently performed. In this study, the MODFLOW model is developed to simulate the groundwater flow in an arid region of Oman consisting of mountain-coast aquifers, and used to run representative simulations to generate training dataset for BMARS model construction. A BMARS-based Sobol' method is also employed to efficiently calculate input parameter sensitivities, which are used to evaluate and rank their importance for the groundwater flow model system. According to sensitivity analysis, insensitive parameters are screened out of Bayesian inversion of the MODFLOW model, further saving computing efforts. The posterior probability distribution of input parameters is efficiently inferred from the prescribed prior distribution using observed head data, demonstrating that the presented BMARS-based Bayesian framework is an efficient tool to reduce parameter uncertainties of a groundwater system.

  11. Predicting the Future as Bayesian Inference: People Combine Prior Knowledge with Observations when Estimating Duration and Extent

    ERIC Educational Resources Information Center

    Griffiths, Thomas L.; Tenenbaum, Joshua B.

    2011-01-01

    Predicting the future is a basic problem that people have to solve every day and a component of planning, decision making, memory, and causal reasoning. In this article, we present 5 experiments testing a Bayesian model of predicting the duration or extent of phenomena from their current state. This Bayesian model indicates how people should…

  12. Variations on Bayesian Prediction and Inference

    DTIC Science & Technology

    2016-05-09

    inference 2.2.1 Background There are a number of statistical inference problems that are not generally formulated via a full probability model...problem of inference about an unknown parameter, the Bayesian approach requires a full probability 1. REPORT DATE (DD-MM-YYYY) 4. TITLE AND...the problem of inference about an unknown parameter, the Bayesian approach requires a full probability model/likelihood which can be an obstacle

  13. Impact of radionuclide spatial variability on groundwater quality downstream from a shallow waste burial in the Chernobyl Exclusion Zone

    NASA Astrophysics Data System (ADS)

    Nguyen, H. L.; de Fouquet, C.; Courbet, C.; Simonucci, C. A.

    2016-12-01

    The effects of spatial variability of hydraulic parameters and initial groundwater plume localization on the possible extent of groundwater pollution plumes have already been broadly studied. However, only a few studies, such as Kjeldsen et al. (1995), take into account the effect of source term spatial variability. We explore this question with the 90Sr migration modeling from a shallow waste burial located in the Chernobyl Exclusion Zone to the underlying sand aquifer. Our work is based upon groundwater sampled once or twice a year since 1995 until 2015 from about 60 piezometers and more than 3,000 137Cs soil activity measurements. These measurements were taken in 1999 from one of the trenches dug after the explosion of the Chernobyl nuclear power plant, the so-called "T22 Trench", where radioactive waste was buried in 1987. The geostatistical analysis of 137Cs activity data in soils from Bugai et al. (2005) is first reconsidered to delimit the trench borders using georadar data as a covariable and to perform geostatistical simulations in order to evaluate the uncertainties of this inventory. 90Sr activity in soils is derived from 137Cs/154Eu and 90Sr/154Eu activity ratios in Chernobyl hot fuel particles (Bugai et al., 2003). Meanwhile, a coupled 1D non saturated/3D saturated transient transport model is constructed under the MELODIE software (IRSN, 2009). The previous 90Sr transport model developed by Bugai et al. (2012) did not take into account the effect of water table fluctuations highlighted by Van Meir et al. (2007) which may cause some discrepancies between model predictions and field observations. They are thus reproduced on a 1D vertical non saturated model. The equiprobable radionuclide localization maps produced by the geostatistical simulations are selected to illustrate different heterogeneities in the radionuclide inventory and are implemented in the 1D model. The obtained activity fluxes from all the 1D vertical models are then injected in a 3D saturated transient model to assess the extent of the radionuclide plume in the groundwater and its most likely evolution over time by taking into account uncertainties associated with the source term spatial variability.

  14. Assessing spatial uncertainty in reservoir characterization for carbon sequestration planning using public well-log data: A case study

    USGS Publications Warehouse

    Venteris, E.R.; Carter, K.M.

    2009-01-01

    Mapping and characterization of potential geologic reservoirs are key components in planning carbon dioxide (CO2) injection projects. The geometry of target and confining layers is vital to ensure that the injected CO2 remains in a supercritical state and is confined to the target layer. Also, maps of injection volume (porosity) are necessary to estimate sequestration capacity at undrilled locations. Our study uses publicly filed geophysical logs and geostatistical modeling methods to investigate the reliability of spatial prediction for oil and gas plays in the Medina Group (sandstone and shale facies) in northwestern Pennsylvania. Specifically, the modeling focused on two targets: the Grimsby Formation and Whirlpool Sandstone. For each layer, thousands of data points were available to model structure and thickness but only hundreds were available to support volumetric modeling because of the rarity of density-porosity logs in the public records. Geostatistical analysis based on this data resulted in accurate structure models, less accurate isopach models, and inconsistent models of pore volume. Of the two layers studied, only the Whirlpool Sandstone data provided for a useful spatial model of pore volume. Where reliable models for spatial prediction are absent, the best predictor available for unsampled locations is the mean value of the data, and potential sequestration sites should be planned as close as possible to existing wells with volumetric data. ?? 2009. The American Association of Petroleum Geologists/Division of Environmental Geosciences. All rights reserved.

  15. Pattern-Based Inverse Modeling for Characterization of Subsurface Flow Models with Complex Geologic Heterogeneity

    NASA Astrophysics Data System (ADS)

    Golmohammadi, A.; Jafarpour, B.; M Khaninezhad, M. R.

    2017-12-01

    Calibration of heterogeneous subsurface flow models leads to ill-posed nonlinear inverse problems, where too many unknown parameters are estimated from limited response measurements. When the underlying parameters form complex (non-Gaussian) structured spatial connectivity patterns, classical variogram-based geostatistical techniques cannot describe the underlying connectivity patterns. Modern pattern-based geostatistical methods that incorporate higher-order spatial statistics are more suitable for describing such complex spatial patterns. Moreover, when the underlying unknown parameters are discrete (geologic facies distribution), conventional model calibration techniques that are designed for continuous parameters cannot be applied directly. In this paper, we introduce a novel pattern-based model calibration method to reconstruct discrete and spatially complex facies distributions from dynamic flow response data. To reproduce complex connectivity patterns during model calibration, we impose a feasibility constraint to ensure that the solution follows the expected higher-order spatial statistics. For model calibration, we adopt a regularized least-squares formulation, involving data mismatch, pattern connectivity, and feasibility constraint terms. Using an alternating directions optimization algorithm, the regularized objective function is divided into a continuous model calibration problem, followed by mapping the solution onto the feasible set. The feasibility constraint to honor the expected spatial statistics is implemented using a supervised machine learning algorithm. The two steps of the model calibration formulation are repeated until the convergence criterion is met. Several numerical examples are used to evaluate the performance of the developed method.

  16. Posterior Predictive Model Checking in Bayesian Networks

    ERIC Educational Resources Information Center

    Crawford, Aaron

    2014-01-01

    This simulation study compared the utility of various discrepancy measures within a posterior predictive model checking (PPMC) framework for detecting different types of data-model misfit in multidimensional Bayesian network (BN) models. The investigated conditions were motivated by an applied research program utilizing an operational complex…

  17. Metrics for evaluating performance and uncertainty of Bayesian network models

    Treesearch

    Bruce G. Marcot

    2012-01-01

    This paper presents a selected set of existing and new metrics for gauging Bayesian network model performance and uncertainty. Selected existing and new metrics are discussed for conducting model sensitivity analysis (variance reduction, entropy reduction, case file simulation); evaluating scenarios (influence analysis); depicting model complexity (numbers of model...

  18. Technical note: Bayesian calibration of dynamic ruminant nutrition models.

    PubMed

    Reed, K F; Arhonditsis, G B; France, J; Kebreab, E

    2016-08-01

    Mechanistic models of ruminant digestion and metabolism have advanced our understanding of the processes underlying ruminant animal physiology. Deterministic modeling practices ignore the inherent variation within and among individual animals and thus have no way to assess how sources of error influence model outputs. We introduce Bayesian calibration of mathematical models to address the need for robust mechanistic modeling tools that can accommodate error analysis by remaining within the bounds of data-based parameter estimation. For the purpose of prediction, the Bayesian approach generates a posterior predictive distribution that represents the current estimate of the value of the response variable, taking into account both the uncertainty about the parameters and model residual variability. Predictions are expressed as probability distributions, thereby conveying significantly more information than point estimates in regard to uncertainty. Our study illustrates some of the technical advantages of Bayesian calibration and discusses the future perspectives in the context of animal nutrition modeling. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  19. Bayesian logistic regression approaches to predict incorrect DRG assignment.

    PubMed

    Suleiman, Mani; Demirhan, Haydar; Boyd, Leanne; Girosi, Federico; Aksakalli, Vural

    2018-05-07

    Episodes of care involving similar diagnoses and treatments and requiring similar levels of resource utilisation are grouped to the same Diagnosis-Related Group (DRG). In jurisdictions which implement DRG based payment systems, DRGs are a major determinant of funding for inpatient care. Hence, service providers often dedicate auditing staff to the task of checking that episodes have been coded to the correct DRG. The use of statistical models to estimate an episode's probability of DRG error can significantly improve the efficiency of clinical coding audits. This study implements Bayesian logistic regression models with weakly informative prior distributions to estimate the likelihood that episodes require a DRG revision, comparing these models with each other and to classical maximum likelihood estimates. All Bayesian approaches had more stable model parameters than maximum likelihood. The best performing Bayesian model improved overall classification per- formance by 6% compared to maximum likelihood, with a 34% gain compared to random classification, respectively. We found that the original DRG, coder and the day of coding all have a significant effect on the likelihood of DRG error. Use of Bayesian approaches has improved model parameter stability and classification accuracy. This method has already lead to improved audit efficiency in an operational capacity.

  20. Geostatistical enhancement of european hydrological predictions

    NASA Astrophysics Data System (ADS)

    Pugliese, Alessio; Castellarin, Attilio; Parajka, Juraj; Arheimer, Berit; Bagli, Stefano; Mazzoli, Paolo; Montanari, Alberto; Blöschl, Günter

    2016-04-01

    Geostatistical Enhancement of European Hydrological Prediction (GEEHP) is a research experiment developed within the EU funded SWITCH-ON project, which proposes to conduct comparative experiments in a virtual laboratory in order to share water-related information and tackle changes in the hydrosphere for operational needs (http://www.water-switch-on.eu). The main objective of GEEHP deals with the prediction of streamflow indices and signatures in ungauged basins at different spatial scales. In particular, among several possible hydrological signatures we focus in our experiment on the prediction of flow-duration curves (FDCs) along the stream-network, which has attracted an increasing scientific attention in the last decades due to the large number of practical and technical applications of the curves (e.g. hydropower potential estimation, riverine habitat suitability and ecological assessments, etc.). We apply a geostatistical procedure based on Top-kriging, which has been recently shown to be particularly reliable and easy-to-use regionalization approach, employing two different type of streamflow data: pan-European E-HYPE simulations (http://hypeweb.smhi.se/europehype) and observed daily streamflow series collected in two pilot study regions, i.e. Tyrol (merging data from Austrian and Italian stream gauging networks) and Sweden. The merger of the two study regions results in a rather large area (~450000 km2) and might be considered as a proxy for a pan-European application of the approach. In a first phase, we implement a bidirectional validation, i.e. E-HYPE catchments are set as training sites to predict FDCs at the same sites where observed data are available, and vice-versa. Such a validation procedure reveals (1) the usability of the proposed approach for predicting the FDCs over the entire river network of interest using alternatively observed data and E-HYPE simulations and (2) the accuracy of E-HYPE-based predictions of FDCs in ungauged sites. In a second phase, we develop a module, to be added to the flow-duration curve prediction framework, capable of enhancing E-HYPE-based predictions of FDCs by modelling the residuals obtained from the first phase. Among all possible methods, we apply geostatistical modelling of residuals and, alternatively, regional regression, so that residuals between empirical and E-HYPE-base predicted FDCs are described in terms of geomorphological and climatic catchment descriptors.

  1. Inhomogeneities detection in annual precipitation time series in Portugal using direct sequential simulation

    NASA Astrophysics Data System (ADS)

    Caineta, Júlio; Ribeiro, Sara; Costa, Ana Cristina; Henriques, Roberto; Soares, Amílcar

    2014-05-01

    Climate data homogenisation is of major importance in monitoring climate change, the validation of weather forecasting, general circulation and regional atmospheric models, modelling of erosion, drought monitoring, among other studies of hydrological and environmental impacts. This happens because non-climate factors can cause time series discontinuities which may hide the true climatic signal and patterns, thus potentially bias the conclusions of those studies. In the last two decades, many methods have been developed to identify and remove these inhomogeneities. One of those is based on geostatistical simulation (DSS - direct sequential simulation), where local probability density functions (pdf) are calculated at candidate monitoring stations, using spatial and temporal neighbouring observations, and then are used for detection of inhomogeneities. This approach has been previously applied to detect inhomogeneities in four precipitation series (wet day count) from a network with 66 monitoring stations located in the southern region of Portugal (1980-2001). This study revealed promising results and the potential advantages of geostatistical techniques for inhomogeneities detection in climate time series. This work extends the case study presented before and investigates the application of the geostatistical stochastic approach to ten precipitation series that were previously classified as inhomogeneous by one of six absolute homogeneity tests (Mann-Kendall test, Wald-Wolfowitz runs test, Von Neumann ratio test, Standard normal homogeneity test (SNHT) for a single break, Pettit test, and Buishand range test). Moreover, a sensibility analysis is implemented to investigate the number of simulated realisations that should be used to accurately infer the local pdfs. Accordingly, the number of simulations per iteration is increased from 50 to 500, which resulted in a more representative local pdf. A set of default and recommended settings is provided, which will help other users to implement this method. The need of user intervention is reduced to a minimum through the usage of a cross-platform script. Finally, as in the previous study, the results are compared with those from the SNHT, Pettit and Buishand range tests, which were applied to composite (ratio) reference series. Acknowledgements: The authors gratefully acknowledge the financial support of "Fundação para a Ciência e Tecnologia" (FCT), Portugal, through the research project PTDC/GEO-MET/4026/2012 ("GSIMCLI - Geostatistical simulation with local distributions for the homogenization and interpolation of climate data").

  2. Assessment of parametric uncertainty for groundwater reactive transport modeling,

    USGS Publications Warehouse

    Shi, Xiaoqing; Ye, Ming; Curtis, Gary P.; Miller, Geoffery L.; Meyer, Philip D.; Kohler, Matthias; Yabusaki, Steve; Wu, Jichun

    2014-01-01

    The validity of using Gaussian assumptions for model residuals in uncertainty quantification of a groundwater reactive transport model was evaluated in this study. Least squares regression methods explicitly assume Gaussian residuals, and the assumption leads to Gaussian likelihood functions, model parameters, and model predictions. While the Bayesian methods do not explicitly require the Gaussian assumption, Gaussian residuals are widely used. This paper shows that the residuals of the reactive transport model are non-Gaussian, heteroscedastic, and correlated in time; characterizing them requires using a generalized likelihood function such as the formal generalized likelihood function developed by Schoups and Vrugt (2010). For the surface complexation model considered in this study for simulating uranium reactive transport in groundwater, parametric uncertainty is quantified using the least squares regression methods and Bayesian methods with both Gaussian and formal generalized likelihood functions. While the least squares methods and Bayesian methods with Gaussian likelihood function produce similar Gaussian parameter distributions, the parameter distributions of Bayesian uncertainty quantification using the formal generalized likelihood function are non-Gaussian. In addition, predictive performance of formal generalized likelihood function is superior to that of least squares regression and Bayesian methods with Gaussian likelihood function. The Bayesian uncertainty quantification is conducted using the differential evolution adaptive metropolis (DREAM(zs)) algorithm; as a Markov chain Monte Carlo (MCMC) method, it is a robust tool for quantifying uncertainty in groundwater reactive transport models. For the surface complexation model, the regression-based local sensitivity analysis and Morris- and DREAM(ZS)-based global sensitivity analysis yield almost identical ranking of parameter importance. The uncertainty analysis may help select appropriate likelihood functions, improve model calibration, and reduce predictive uncertainty in other groundwater reactive transport and environmental modeling.

  3. Prospective evaluation of a Bayesian model to predict organizational change.

    PubMed

    Molfenter, Todd; Gustafson, Dave; Kilo, Chuck; Bhattacharya, Abhik; Olsson, Jesper

    2005-01-01

    This research examines a subjective Bayesian model's ability to predict organizational change outcomes and sustainability of those outcomes for project teams participating in a multi-organizational improvement collaborative.

  4. An application of geostatistics and fractal geometry for reservoir characterization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aasum, Y.; Kelkar, M.G.; Gupta, S.P.

    1991-03-01

    This paper presents an application of geostatistics and fractal geometry concepts for 2D characterization of rock properties (k and {phi}) in a dolomitic, layered-cake reservoir. The results indicate that lack of closely spaced data yield effectively random distributions of properties. Further, incorporation of geology reduces uncertainties in fractal interpolation of wellbore properties.

  5. Regression and Geostatistical Techniques: Considerations and Observations from Experiences in NE-FIA

    Treesearch

    Rachel Riemann; Andrew Lister

    2005-01-01

    Maps of forest variables improve our understanding of the forest resource by allowing us to view and analyze it spatially. The USDA Forest Service's Northeastern Forest Inventory and Analysis unit (NE-FIA) has used geostatistical techniques, particularly stochastic simulation, to produce maps and spatial data sets of FIA variables. That work underscores the...

  6. Spatial interpolation of forest conditions using co-conditional geostatistical simulation

    Treesearch

    H. Todd Mowrer

    2000-01-01

    In recent work the author used the geostatistical Monte Carlo technique of sequential Gaussian simulation (s.G.s.) to investigate uncertainty in a GIS analysis of potential old-growth forest areas. The current study compares this earlier technique to that of co-conditional simulation, wherein the spatial cross-correlations between variables are included. As in the...

  7. Geostatistics applied to gas reservoirs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Meunier, G.; Coulomb, C.; Laille, J.P.

    1989-09-01

    The spatial distribution of many of the physical parameters connected with a gas reservoir is of primary interest to both engineers and geologists throughout the study, development, and operation of a field. It is therefore desirable for the distribution to be capable of statistical interpretation, to have a simple graphical representation, and to allow data to be entered from either two- or three-dimensional grids. To satisfy these needs while dealing with the geographical variables, new methods have been developed under the name geostatistics. This paper describes briefly the theory of geostatistics and its most recent improvements for the specific problemmore » of subsurface description. The external-drift technique has been emphasized in particular, and in addition, four case studies related to gas reservoirs are presented.« less

  8. Geostatistical Characteristic of Space -Time Variation in Underground Water Selected Quality Parameters in Klodzko Water Intake Area (SW Part of Poland)

    NASA Astrophysics Data System (ADS)

    Namysłowska-Wilczyńska, Barbara

    2016-04-01

    This paper presents selected results of research connected with the development of a (3D) geostatistical hydrogeochemical model of the Klodzko Drainage Basin, dedicated to the spatial and time variation in the selected quality parameters of underground water in the Klodzko water intake area (SW part of Poland). The research covers the period 2011÷2012. Spatial analyses of the variation in various quality parameters, i.e, contents of: ammonium ion [gNH4+/m3], NO3- (nitrate ion) [gNO3/m3], PO4-3 (phosphate ion) [gPO4-3/m3], total organic carbon C (TOC) [gC/m3], pH redox potential and temperature C [degrees], were carried out on the basis of the chemical determinations of the quality parameters of underground water samples taken from the wells in the water intake area. Spatial and time variation in the quality parameters was analyzed on the basis of archival data (period 1977÷1999) for 22 (pump and siphon) wells with a depth ranging from 9.5 to 38.0 m b.g.l., later data obtained (November 2011) from tests of water taken from 14 existing wells. The wells were built in the years 1954÷1998. The water abstraction depth (difference between the terrain elevation and the dynamic water table level) is ranged from 276÷286 m a.s.l., with an average of 282.05 m a.s.l. Dynamic water table level is contained between 6.22 m÷16.44 m b.g.l., with a mean value of 9.64 m b.g.l. The latest data (January 2012) acquired from 3 new piezometers, with a depth of 9÷10m, which were made in other locations in the relevant area. Thematic databases, containing original data on coordinates X, Y (latitude, longitude) and Z (terrain elevation and time - years) and on regionalized variables, i.e. the underground water quality parameters in the Klodzko water intake area determined for different analytical configurations (22 wells, 14 wells, 14 wells + 3 piezometers), were created. Both archival data (acquired in the years 1977÷1999) and the latest data (collected in 2011÷2012) were analyzed. These data were subjected to spatial analyses using statistical and geostatistical methods. The evaluation of basic statistics of the investigated quality parameters, including their histograms of distributions, scatter diagrams between these parameters and also correlation coefficients r were presented in this article. The directional semivariogram function and the ordinary (block) kriging procedure were used to build the 3D geostatistical model. The geostatistical parameters of the theoretical models of directional semivariograms of the studied water quality parameters, calculated along the time interval and along the wells depth (taking into account the terrain elevation), were used in the ordinary (block) kriging estimation. The obtained results of estimation, i.e. block diagrams allowed to determine the levels of increased values Z* of studied underground water quality parameters. Analysis of the variability in the selected quality parameters of underground water for an analyzed area in Klodzko water intake was enriched by referring to the results of geostatistical studies carried out for underground water quality parameters and also for a treated water and in Klodzko water supply system (iron Fe, manganese Mn, ammonium ion NH4+ contents), discussed in earlier works. Spatial and time variation in the latter-mentioned parameters was analysed on the basis of the data (2007÷2011, 2008÷2011). Generally, the behaviour of the underground water quality parameters has been found to vary in space and time. Thanks to the spatial analyses of the variation in the quality parameters in the Kłodzko underground water intake area some regularities (trends) in the variation in water quality have been identified.

  9. Bayesian truncation errors in chiral effective field theory: model checking and accounting for correlations

    NASA Astrophysics Data System (ADS)

    Melendez, Jordan; Wesolowski, Sarah; Furnstahl, Dick

    2017-09-01

    Chiral effective field theory (EFT) predictions are necessarily truncated at some order in the EFT expansion, which induces an error that must be quantified for robust statistical comparisons to experiment. A Bayesian model yields posterior probability distribution functions for these errors based on expectations of naturalness encoded in Bayesian priors and the observed order-by-order convergence pattern of the EFT. As a general example of a statistical approach to truncation errors, the model was applied to chiral EFT for neutron-proton scattering using various semi-local potentials of Epelbaum, Krebs, and Meißner (EKM). Here we discuss how our model can learn correlation information from the data and how to perform Bayesian model checking to validate that the EFT is working as advertised. Supported in part by NSF PHY-1614460 and DOE NUCLEI SciDAC DE-SC0008533.

  10. A FAST BAYESIAN METHOD FOR UPDATING AND FORECASTING HOURLY OZONE LEVELS

    EPA Science Inventory

    A Bayesian hierarchical space-time model is proposed by combining information from real-time ambient AIRNow air monitoring data, and output from a computer simulation model known as the Community Multi-scale Air Quality (Eta-CMAQ) forecast model. A model validation analysis shows...

  11. Bayesian methods for characterizing unknown parameters of material models

    DOE PAGES

    Emery, J. M.; Grigoriu, M. D.; Field Jr., R. V.

    2016-02-04

    A Bayesian framework is developed for characterizing the unknown parameters of probabilistic models for material properties. In this framework, the unknown parameters are viewed as random and described by their posterior distributions obtained from prior information and measurements of quantities of interest that are observable and depend on the unknown parameters. The proposed Bayesian method is applied to characterize an unknown spatial correlation of the conductivity field in the definition of a stochastic transport equation and to solve this equation by Monte Carlo simulation and stochastic reduced order models (SROMs). As a result, the Bayesian method is also employed tomore » characterize unknown parameters of material properties for laser welds from measurements of peak forces sustained by these welds.« less

  12. Bayesian methods for characterizing unknown parameters of material models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Emery, J. M.; Grigoriu, M. D.; Field Jr., R. V.

    A Bayesian framework is developed for characterizing the unknown parameters of probabilistic models for material properties. In this framework, the unknown parameters are viewed as random and described by their posterior distributions obtained from prior information and measurements of quantities of interest that are observable and depend on the unknown parameters. The proposed Bayesian method is applied to characterize an unknown spatial correlation of the conductivity field in the definition of a stochastic transport equation and to solve this equation by Monte Carlo simulation and stochastic reduced order models (SROMs). As a result, the Bayesian method is also employed tomore » characterize unknown parameters of material properties for laser welds from measurements of peak forces sustained by these welds.« less

  13. Bayesian methods in reliability

    NASA Astrophysics Data System (ADS)

    Sander, P.; Badoux, R.

    1991-11-01

    The present proceedings from a course on Bayesian methods in reliability encompasses Bayesian statistical methods and their computational implementation, models for analyzing censored data from nonrepairable systems, the traits of repairable systems and growth models, the use of expert judgment, and a review of the problem of forecasting software reliability. Specific issues addressed include the use of Bayesian methods to estimate the leak rate of a gas pipeline, approximate analyses under great prior uncertainty, reliability estimation techniques, and a nonhomogeneous Poisson process. Also addressed are the calibration sets and seed variables of expert judgment systems for risk assessment, experimental illustrations of the use of expert judgment for reliability testing, and analyses of the predictive quality of software-reliability growth models such as the Weibull order statistics.

  14. Bayesian accounts of covert selective attention: A tutorial review.

    PubMed

    Vincent, Benjamin T

    2015-05-01

    Decision making and optimal observer models offer an important theoretical approach to the study of covert selective attention. While their probabilistic formulation allows quantitative comparison to human performance, the models can be complex and their insights are not always immediately apparent. Part 1 establishes the theoretical appeal of the Bayesian approach, and introduces the way in which probabilistic approaches can be applied to covert search paradigms. Part 2 presents novel formulations of Bayesian models of 4 important covert attention paradigms, illustrating optimal observer predictions over a range of experimental manipulations. Graphical model notation is used to present models in an accessible way and Supplementary Code is provided to help bridge the gap between model theory and practical implementation. Part 3 reviews a large body of empirical and modelling evidence showing that many experimental phenomena in the domain of covert selective attention are a set of by-products. These effects emerge as the result of observers conducting Bayesian inference with noisy sensory observations, prior expectations, and knowledge of the generative structure of the stimulus environment.

  15. Bayesian Action–Perception Computational Model: Interaction of Production and Recognition of Cursive Letters

    PubMed Central

    Gilet, Estelle; Diard, Julien; Bessière, Pierre

    2011-01-01

    In this paper, we study the collaboration of perception and action representations involved in cursive letter recognition and production. We propose a mathematical formulation for the whole perception–action loop, based on probabilistic modeling and Bayesian inference, which we call the Bayesian Action–Perception (BAP) model. Being a model of both perception and action processes, the purpose of this model is to study the interaction of these processes. More precisely, the model includes a feedback loop from motor production, which implements an internal simulation of movement. Motor knowledge can therefore be involved during perception tasks. In this paper, we formally define the BAP model and show how it solves the following six varied cognitive tasks using Bayesian inference: i) letter recognition (purely sensory), ii) writer recognition, iii) letter production (with different effectors), iv) copying of trajectories, v) copying of letters, and vi) letter recognition (with internal simulation of movements). We present computer simulations of each of these cognitive tasks, and discuss experimental predictions and theoretical developments. PMID:21674043

  16. Finding Bayesian Optimal Designs for Nonlinear Models: A Semidefinite Programming-Based Approach.

    PubMed

    Duarte, Belmiro P M; Wong, Weng Kee

    2015-08-01

    This paper uses semidefinite programming (SDP) to construct Bayesian optimal design for nonlinear regression models. The setup here extends the formulation of the optimal designs problem as an SDP problem from linear to nonlinear models. Gaussian quadrature formulas (GQF) are used to compute the expectation in the Bayesian design criterion, such as D-, A- or E-optimality. As an illustrative example, we demonstrate the approach using the power-logistic model and compare results in the literature. Additionally, we investigate how the optimal design is impacted by different discretising schemes for the design space, different amounts of uncertainty in the parameter values, different choices of GQF and different prior distributions for the vector of model parameters, including normal priors with and without correlated components. Further applications to find Bayesian D-optimal designs with two regressors for a logistic model and a two-variable generalised linear model with a gamma distributed response are discussed, and some limitations of our approach are noted.

  17. Finding Bayesian Optimal Designs for Nonlinear Models: A Semidefinite Programming-Based Approach

    PubMed Central

    Duarte, Belmiro P. M.; Wong, Weng Kee

    2014-01-01

    Summary This paper uses semidefinite programming (SDP) to construct Bayesian optimal design for nonlinear regression models. The setup here extends the formulation of the optimal designs problem as an SDP problem from linear to nonlinear models. Gaussian quadrature formulas (GQF) are used to compute the expectation in the Bayesian design criterion, such as D-, A- or E-optimality. As an illustrative example, we demonstrate the approach using the power-logistic model and compare results in the literature. Additionally, we investigate how the optimal design is impacted by different discretising schemes for the design space, different amounts of uncertainty in the parameter values, different choices of GQF and different prior distributions for the vector of model parameters, including normal priors with and without correlated components. Further applications to find Bayesian D-optimal designs with two regressors for a logistic model and a two-variable generalised linear model with a gamma distributed response are discussed, and some limitations of our approach are noted. PMID:26512159

  18. A systematic review of Bayesian articles in psychology: The last 25 years.

    PubMed

    van de Schoot, Rens; Winter, Sonja D; Ryan, Oisín; Zondervan-Zwijnenburg, Mariëlle; Depaoli, Sarah

    2017-06-01

    Although the statistical tools most often used by researchers in the field of psychology over the last 25 years are based on frequentist statistics, it is often claimed that the alternative Bayesian approach to statistics is gaining in popularity. In the current article, we investigated this claim by performing the very first systematic review of Bayesian psychological articles published between 1990 and 2015 (n = 1,579). We aim to provide a thorough presentation of the role Bayesian statistics plays in psychology. This historical assessment allows us to identify trends and see how Bayesian methods have been integrated into psychological research in the context of different statistical frameworks (e.g., hypothesis testing, cognitive models, IRT, SEM, etc.). We also describe take-home messages and provide "big-picture" recommendations to the field as Bayesian statistics becomes more popular. Our review indicated that Bayesian statistics is used in a variety of contexts across subfields of psychology and related disciplines. There are many different reasons why one might choose to use Bayes (e.g., the use of priors, estimating otherwise intractable models, modeling uncertainty, etc.). We found in this review that the use of Bayes has increased and broadened in the sense that this methodology can be used in a flexible manner to tackle many different forms of questions. We hope this presentation opens the door for a larger discussion regarding the current state of Bayesian statistics, as well as future trends. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  19. A Gibbs sampler for Bayesian analysis of site-occupancy data

    USGS Publications Warehouse

    Dorazio, Robert M.; Rodriguez, Daniel Taylor

    2012-01-01

    1. A Bayesian analysis of site-occupancy data containing covariates of species occurrence and species detection probabilities is usually completed using Markov chain Monte Carlo methods in conjunction with software programs that can implement those methods for any statistical model, not just site-occupancy models. Although these software programs are quite flexible, considerable experience is often required to specify a model and to initialize the Markov chain so that summaries of the posterior distribution can be estimated efficiently and accurately. 2. As an alternative to these programs, we develop a Gibbs sampler for Bayesian analysis of site-occupancy data that include covariates of species occurrence and species detection probabilities. This Gibbs sampler is based on a class of site-occupancy models in which probabilities of species occurrence and detection are specified as probit-regression functions of site- and survey-specific covariate measurements. 3. To illustrate the Gibbs sampler, we analyse site-occupancy data of the blue hawker, Aeshna cyanea (Odonata, Aeshnidae), a common dragonfly species in Switzerland. Our analysis includes a comparison of results based on Bayesian and classical (non-Bayesian) methods of inference. We also provide code (based on the R software program) for conducting Bayesian and classical analyses of site-occupancy data.

  20. Spatio-temporal mapping of Madagascar's Malaria Indicator Survey results to assess Plasmodium falciparum endemicity trends between 2011 and 2016.

    PubMed

    Kang, Su Yun; Battle, Katherine E; Gibson, Harry S; Ratsimbasoa, Arsène; Randrianarivelojosia, Milijaona; Ramboarina, Stéphanie; Zimmerman, Peter A; Weiss, Daniel J; Cameron, Ewan; Gething, Peter W; Howes, Rosalind E

    2018-05-23

    Reliable measures of disease burden over time are necessary to evaluate the impact of interventions and assess sub-national trends in the distribution of infection. Three Malaria Indicator Surveys (MISs) have been conducted in Madagascar since 2011. They provide a valuable resource to assess changes in burden that is complementary to the country's routine case reporting system. A Bayesian geostatistical spatio-temporal model was developed in an integrated nested Laplace approximation framework to map the prevalence of Plasmodium falciparum malaria infection among children from 6 to 59 months in age across Madagascar for 2011, 2013 and 2016 based on the MIS datasets. The model was informed by a suite of environmental and socio-demographic covariates known to influence infection prevalence. Spatio-temporal trends were quantified across the country. Despite a relatively small decrease between 2013 and 2016, the prevalence of malaria infection has increased substantially in all areas of Madagascar since 2011. In 2011, almost half (42.3%) of the country's population lived in areas of very low malaria risk (<1% parasite prevalence), but by 2016, this had dropped to only 26.7% of the population. Meanwhile, the population in high transmission areas (prevalence >20%) increased from only 2.2% in 2011 to 9.2% in 2016. A comparison of the model-based estimates with the raw MIS results indicates there was an underestimation of the situation in 2016, since the raw figures likely associated with survey timings were delayed until after the peak transmission season. Malaria remains an important health problem in Madagascar. The monthly and annual prevalence maps developed here provide a way to evaluate the magnitude of change over time, taking into account variability in survey input data. These methods can contribute to monitoring sub-national trends of malaria prevalence in Madagascar as the country aims for geographically progressive elimination.

Top