Preferential sampling and Bayesian geostatistics: Statistical modeling and examples.
Cecconi, Lorenzo; Grisotto, Laura; Catelan, Dolores; Lagazio, Corrado; Berrocal, Veronica; Biggeri, Annibale
2016-08-01
Preferential sampling refers to any situation in which the spatial process and the sampling locations are not stochastically independent. In this paper, we present two examples of geostatistical analysis in which the usual assumption of stochastic independence between the point process and the measurement process is violated. To account for preferential sampling, we specify a flexible and general Bayesian geostatistical model that includes a shared spatial random component. We apply the proposed model to two different case studies that allow us to highlight three different modeling and inferential aspects of geostatistical modeling under preferential sampling: (1) continuous or finite spatial sampling frame; (2) underlying causal model and relevant covariates; and (3) inferential goals related to mean prediction surface or prediction uncertainty. PMID:27566774
Bayesian geostatistical modeling of Malaria Indicator Survey data in Angola.
Gosoniu, Laura; Veta, Andre Mia; Vounatsou, Penelope
2010-01-01
The 2006-2007 Angola Malaria Indicator Survey (AMIS) is the first nationally representative household survey in the country assessing coverage of the key malaria control interventions and measuring malaria-related burden among children under 5 years of age. In this paper, the Angolan MIS data were analyzed to produce the first smooth map of parasitaemia prevalence based on contemporary nationwide empirical data in the country. Bayesian geostatistical models were fitted to assess the effect of interventions after adjusting for environmental, climatic and socio-economic factors. Non-linear relationships between parasitaemia risk and environmental predictors were modeled by categorizing the covariates and by employing two non-parametric approaches, the B-splines and the P-splines. The results of the model validation showed that the categorical model was able to better capture the relationship between parasitaemia prevalence and the environmental factors. Model fit and prediction were handled within a Bayesian framework using Markov chain Monte Carlo (MCMC) simulations. Combining estimates of parasitaemia prevalence with the number of children under we obtained estimates of the number of infected children in the country. The population-adjusted prevalence ranges from in Namibe province to in Malanje province. The odds of parasitaemia in children living in a household with at least ITNs per person was by 41% lower (CI: 14%, 60%) than in those with fewer ITNs. The estimates of the number of parasitaemic children produced in this paper are important for planning and implementing malaria control interventions and for monitoring the impact of prevention and control activities. PMID:20351775
Bayesian Geostatistical Modeling of Malaria Indicator Survey Data in Angola
Gosoniu, Laura; Veta, Andre Mia; Vounatsou, Penelope
2010-01-01
The 2006–2007 Angola Malaria Indicator Survey (AMIS) is the first nationally representative household survey in the country assessing coverage of the key malaria control interventions and measuring malaria-related burden among children under 5 years of age. In this paper, the Angolan MIS data were analyzed to produce the first smooth map of parasitaemia prevalence based on contemporary nationwide empirical data in the country. Bayesian geostatistical models were fitted to assess the effect of interventions after adjusting for environmental, climatic and socio-economic factors. Non-linear relationships between parasitaemia risk and environmental predictors were modeled by categorizing the covariates and by employing two non-parametric approaches, the B-splines and the P-splines. The results of the model validation showed that the categorical model was able to better capture the relationship between parasitaemia prevalence and the environmental factors. Model fit and prediction were handled within a Bayesian framework using Markov chain Monte Carlo (MCMC) simulations. Combining estimates of parasitaemia prevalence with the number of children under we obtained estimates of the number of infected children in the country. The population-adjusted prevalence ranges from in Namibe province to in Malanje province. The odds of parasitaemia in children living in a household with at least ITNs per person was by 41% lower (CI: 14%, 60%) than in those with fewer ITNs. The estimates of the number of parasitaemic children produced in this paper are important for planning and implementing malaria control interventions and for monitoring the impact of prevention and control activities. PMID:20351775
Schur, Nadine; Hürlimann, Eveline; Stensgaard, Anna-Sofie; Chimfwembe, Kingford; Mushinge, Gabriel; Simoonga, Christopher; Kabatereine, Narcis B; Kristensen, Thomas K; Utzinger, Jürg; Vounatsou, Penelope
2013-11-01
Schistosomiasis remains one of the most prevalent parasitic diseases in the tropics and subtropics, but current statistics are outdated due to demographic and ecological transformations and ongoing control efforts. Reliable risk estimates are important to plan and evaluate interventions in a spatially explicit and cost-effective manner. We analysed a large ensemble of georeferenced survey data derived from an open-access neglected tropical diseases database to create smooth empirical prevalence maps for Schistosoma mansoni and Schistosoma haematobium for a total of 13 countries of eastern Africa. Bayesian geostatistical models based on climatic and other environmental data were used to account for potential spatial clustering in spatially structured exposures. Geostatistical variable selection was employed to reduce the set of covariates. Alignment factors were implemented to combine surveys on different age-groups and to acquire separate estimates for individuals aged ≤20 years and entire communities. Prevalence estimates were combined with population statistics to obtain country-specific numbers of Schistosoma infections. We estimate that 122 million individuals in eastern Africa are currently infected with either S. mansoni, or S. haematobium, or both species concurrently. Country-specific population-adjusted prevalence estimates range between 12.9% (Uganda) and 34.5% (Mozambique) for S. mansoni and between 11.9% (Djibouti) and 40.9% (Mozambique) for S. haematobium. Our models revealed that infection risk in Burundi, Eritrea, Ethiopia, Kenya, Rwanda, Somalia and Sudan might be considerably higher than previously reported, while in Mozambique and Tanzania, the risk might be lower than current estimates suggest. Our empirical, large-scale, high-resolution infection risk estimates for S. mansoni and S. haematobium in eastern Africa can guide future control interventions and provide a benchmark for subsequent monitoring and evaluation activities. PMID:22019933
Predictive risk mapping of schistosomiasis in Brazil using Bayesian geostatistical models.
Scholte, Ronaldo G C; Gosoniu, Laura; Malone, John B; Chammartin, Frédérique; Utzinger, Jürg; Vounatsou, Penelope
2014-04-01
Schistosomiasis is one of the most common parasitic diseases in tropical and subtropical areas, including Brazil. A national control programme was initiated in Brazil in the mid-1970s and proved successful in terms of morbidity control, as the number of cases with hepato-splenic involvement was reduced significantly. To consolidate control and move towards elimination, there is a need for reliable maps on the spatial distribution of schistosomiasis, so that interventions can target communities at highest risk. The purpose of this study was to map the distribution of Schistosoma mansoni in Brazil. We utilized readily available prevalence data from the national schistosomiasis control programme for the years 2005-2009, derived remotely sensed climatic and environmental data and obtained socioeconomic data from various sources. Data were collated into a geographical information system and Bayesian geostatistical models were developed. Model-based maps identified important risk factors related to the transmission of S. mansoni and confirmed that environmental variables are closely associated with indices of poverty. Our smoothed predictive risk map, including uncertainty, highlights priority areas for intervention, namely the northern parts of North and Southeast regions and the eastern part of Northeast region. Our predictive risk map provides a useful tool for to strengthen existing surveillance-response mechanisms. PMID:24361640
Kara G. Eby
2010-08-01
At the Idaho National Laboratory (INL) Cs-137 concentrations above the U.S. Environmental Protection Agency risk-based threshold of 0.23 pCi/g may increase the risk of human mortality due to cancer. As a leader in nuclear research, the INL has been conducting nuclear activities for decades. Elevated anthropogenic radionuclide levels including Cs-137 are a result of atmospheric weapons testing, the Chernobyl accident, and nuclear activities occurring at the INL site. Therefore environmental monitoring and long-term surveillance of Cs-137 is required to evaluate risk. However, due to the large land area involved, frequent and comprehensive monitoring is limited. Developing a spatial model that predicts Cs-137 concentrations at unsampled locations will enhance the spatial characterization of Cs-137 in surface soils, provide guidance for an efficient monitoring program, and pinpoint areas requiring mitigation strategies. The predictive model presented herein is based on applied geostatistics using a Bayesian analysis of environmental characteristics across the INL site, which provides kriging spatial maps of both Cs-137 estimates and prediction errors. Comparisons are presented of two different kriging methods, showing that the use of secondary information (i.e., environmental characteristics) can provide improved prediction performance in some areas of the INL site.
2013-01-01
Background Soil-transmitted helminth infections affect tens of millions of individuals in the People’s Republic of China (P.R. China). There is a need for high-resolution estimates of at-risk areas and number of people infected to enhance spatial targeting of control interventions. However, such information is not yet available for P.R. China. Methods A geo-referenced database compiling surveys pertaining to soil-transmitted helminthiasis, carried out from 2000 onwards in P.R. China, was established. Bayesian geostatistical models relating the observed survey data with potential climatic, environmental and socioeconomic predictors were developed and used to predict at-risk areas at high spatial resolution. Predictors were extracted from remote sensing and other readily accessible open-source databases. Advanced Bayesian variable selection methods were employed to develop a parsimonious model. Results Our results indicate that the prevalence of soil-transmitted helminth infections in P.R. China considerably decreased from 2005 onwards. Yet, some 144 million people were estimated to be infected in 2010. High prevalence (>20%) of the roundworm Ascaris lumbricoides infection was predicted for large areas of Guizhou province, the southern part of Hubei and Sichuan provinces, while the northern part and the south-eastern coastal-line areas of P.R. China had low prevalence (<5%). High infection prevalence (>20%) with hookworm was found in Hainan, the eastern part of Sichuan and the southern part of Yunnan provinces. High infection prevalence (>20%) with the whipworm Trichuris trichiura was found in a few small areas of south P.R. China. Very low prevalence (<0.1%) of hookworm and whipworm infections were predicted for the northern parts of P.R. China. Conclusions We present the first model-based estimates for soil-transmitted helminth infections throughout P.R. China at high spatial resolution. Our prediction maps provide useful information for the spatial targeting of
Oluwole, Akinola S.; Ekpo, Uwem F.; Karagiannis-Voules, Dimitrios-Alexios; Abe, Eniola M.; Olamiju, Francisca O.; Isiyaku, Sunday; Okoronkwo, Chukwu; Saka, Yisa; Nebe, Obiageli J.; Braide, Eka I.; Mafiana, Chiedu F.; Utzinger, Jürg; Vounatsou, Penelope
2015-01-01
Background The acceleration of the control of soil-transmitted helminth (STH) infections in Nigeria, emphasizing preventive chemotherapy, has become imperative in light of the global fight against neglected tropical diseases. Predictive risk maps are an important tool to guide and support control activities. Methodology STH infection prevalence data were obtained from surveys carried out in 2011 using standard protocols. Data were geo-referenced and collated in a nationwide, geographic information system database. Bayesian geostatistical models with remotely sensed environmental covariates and variable selection procedures were utilized to predict the spatial distribution of STH infections in Nigeria. Principal Findings We found that hookworm, Ascaris lumbricoides, and Trichuris trichiura infections are endemic in 482 (86.8%), 305 (55.0%), and 55 (9.9%) locations, respectively. Hookworm and A. lumbricoides infection co-exist in 16 states, while the three species are co-endemic in 12 states. Overall, STHs are endemic in 20 of the 36 states of Nigeria, including the Federal Capital Territory of Abuja. The observed prevalence at endemic locations ranged from 1.7% to 51.7% for hookworm, from 1.6% to 77.8% for A. lumbricoides, and from 1.0% to 25.5% for T. trichiura. Model-based predictions ranged from 0.7% to 51.0% for hookworm, from 0.1% to 82.6% for A. lumbricoides, and from 0.0% to 18.5% for T. trichiura. Our models suggest that day land surface temperature and dense vegetation are important predictors of the spatial distribution of STH infection in Nigeria. In 2011, a total of 5.7 million (13.8%) school-aged children were predicted to be infected with STHs in Nigeria. Mass treatment at the local government area level for annual or bi-annual treatment of the school-aged population in Nigeria in 2011, based on World Health Organization prevalence thresholds, were estimated at 10.2 million tablets. Conclusions/Significance The predictive risk maps and estimated
Model Selection for Geostatistical Models
Hoeting, Jennifer A.; Davis, Richard A.; Merton, Andrew A.; Thompson, Sandra E.
2006-02-01
We consider the problem of model selection for geospatial data. Spatial correlation is typically ignored in the selection of explanatory variables and this can influence model selection results. For example, the inclusion or exclusion of particular explanatory variables may not be apparent when spatial correlation is ignored. To address this problem, we consider the Akaike Information Criterion (AIC) as applied to a geostatistical model. We offer a heuristic derivation of the AIC in this context and provide simulation results that show that using AIC for a geostatistical model is superior to the often used approach of ignoring spatial correlation in the selection of explanatory variables. These ideas are further demonstrated via a model for lizard abundance. We also employ the principle of minimum description length (MDL) to variable selection for the geostatistical model. The effect of sampling design on the selection of explanatory covariates is also explored.
An interactive Bayesian geostatistical inverse protocol for hydraulic tomography
Fienen, Michael N.; Clemo, Tom; Kitanidis, Peter K.
2008-01-01
Hydraulic tomography is a powerful technique for characterizing heterogeneous hydrogeologic parameters. An explicit trade-off between characterization based on measurement misfit and subjective characterization using prior information is presented. We apply a Bayesian geostatistical inverse approach that is well suited to accommodate a flexible model with the level of complexity driven by the data and explicitly considering uncertainty. Prior information is incorporated through the selection of a parameter covariance model characterizing continuity and providing stability. Often, discontinuities in the parameter field, typically caused by geologic contacts between contrasting lithologic units, necessitate subdivision into zones across which there is no correlation among hydraulic parameters. We propose an interactive protocol in which zonation candidates are implied from the data and are evaluated using cross validation and expert knowledge. Uncertainty introduced by limited knowledge of dynamic regional conditions is mitigated by using drawdown rather than native head values. An adjoint state formulation of MODFLOW-2000 is used to calculate sensitivities which are used both for the solution to the inverse problem and to guide protocol decisions. The protocol is tested using synthetic two-dimensional steady state examples in which the wells are located at the edge of the region of interest.
2010-01-01
Background The Zambia Malaria Indicator Survey (ZMIS) of 2006 was the first nation-wide malaria survey, which combined parasitological data with other malaria indicators such as net use, indoor residual spraying and household related aspects. The survey was carried out by the Zambian Ministry of Health and partners with the objective of estimating the coverage of interventions and malaria related burden in children less than five years. In this study, the ZMIS data were analysed in order (i) to estimate an empirical high-resolution parasitological risk map in the country and (ii) to assess the relation between malaria interventions and parasitaemia risk after adjusting for environmental and socio-economic confounders. Methods The parasitological risk was predicted from Bayesian geostatistical and spatially independent models relating parasitaemia risk and environmental/climatic predictors of malaria. A number of models were fitted to capture the (potential) non-linearity in the malaria-environment relation and to identify the elapsing time between environmental effects and parasitaemia risk. These models included covariates (a) in categorical scales and (b) in penalized and basis splines terms. Different model validation methods were used to identify the best fitting model. Model-based risk predictions at unobserved locations were obtained via Bayesian predictive distributions for the best fitting model. Results Model validation indicated that linear environmental predictors were able to fit the data as well as or even better than more complex non-linear terms and that the data do not support spatial dependence. Overall the averaged population-adjusted parasitaemia risk was 20.0% in children less than five years with the highest risk predicted in the northern (38.3%) province. The odds of parasitaemia in children living in a household with at least one bed net decreases by 40% (CI: 12%, 61%) compared to those without bed nets. Conclusions The map of parasitaemia
Geostatistical Modeling of Pore Velocity
Devary, J.L.; Doctor, P.G.
1981-06-01
A significant part of evaluating a geologic formation as a nuclear waste repository involves the modeling of contaminant transport in the surrounding media in the event the repository is breached. The commonly used contaminant transport models are deterministic. However, the spatial variability of hydrologic field parameters introduces uncertainties into contaminant transport predictions. This paper discusses the application of geostatistical techniques to the modeling of spatially varying hydrologic field parameters required as input to contaminant transport analyses. Kriging estimation techniques were applied to Hanford Reservation field data to calculate hydraulic conductivity and the ground-water potential gradients. These quantities were statistically combined to estimate the groundwater pore velocity and to characterize the pore velocity estimation error. Combining geostatistical modeling techniques with product error propagation techniques results in an effective stochastic characterization of groundwater pore velocity, a hydrologic parameter required for contaminant transport analyses.
Bayesian geostatistics in health cartography: the perspective of malaria.
Patil, Anand P; Gething, Peter W; Piel, Frédéric B; Hay, Simon I
2011-06-01
Maps of parasite prevalences and other aspects of infectious diseases that vary in space are widely used in parasitology. However, spatial parasitological datasets rarely, if ever, have sufficient coverage to allow exact determination of such maps. Bayesian geostatistics (BG) is a method for finding a large sample of maps that can explain a dataset, in which maps that do a better job of explaining the data are more likely to be represented. This sample represents the knowledge that the analyst has gained from the data about the unknown true map. BG provides a conceptually simple way to convert these samples to predictions of features of the unknown map, for example regional averages. These predictions account for each map in the sample, yielding an appropriate level of predictive precision. PMID:21420361
Bayesian Geostatistical Analysis and Prediction of Rhodesian Human African Trypanosomiasis
Wardrop, Nicola A.; Atkinson, Peter M.; Gething, Peter W.; Fèvre, Eric M.; Picozzi, Kim; Kakembo, Abbas S. L.; Welburn, Susan C.
2010-01-01
Background The persistent spread of Rhodesian human African trypanosomiasis (HAT) in Uganda in recent years has increased concerns of a potential overlap with the Gambian form of the disease. Recent research has aimed to increase the evidence base for targeting control measures by focusing on the environmental and climatic factors that control the spatial distribution of the disease. Objectives One recent study used simple logistic regression methods to explore the relationship between prevalence of Rhodesian HAT and several social, environmental and climatic variables in two of the most recently affected districts of Uganda, and suggested the disease had spread into the study area due to the movement of infected, untreated livestock. Here we extend this study to account for spatial autocorrelation, incorporate uncertainty in input data and model parameters and undertake predictive mapping for risk of high HAT prevalence in future. Materials and Methods Using a spatial analysis in which a generalised linear geostatistical model is used in a Bayesian framework to account explicitly for spatial autocorrelation and incorporate uncertainty in input data and model parameters we are able to demonstrate a more rigorous analytical approach, potentially resulting in more accurate parameter and significance estimates and increased predictive accuracy, thereby allowing an assessment of the validity of the livestock movement hypothesis given more robust parameter estimation and appropriate assessment of covariate effects. Results Analysis strongly supports the theory that Rhodesian HAT was imported to the study area via the movement of untreated, infected livestock from endemic areas. The confounding effect of health care accessibility on the spatial distribution of Rhodesian HAT and the linkages between the disease's distribution and minimum land surface temperature have also been confirmed via the application of these methods. Conclusions Predictive mapping indicates an
Fienen, Michael N.; D'Oria, Marco; Doherty, John E.; Hunt, Randall J.
2013-01-01
The application bgaPEST is a highly parameterized inversion software package implementing the Bayesian Geostatistical Approach in a framework compatible with the parameter estimation suite PEST. Highly parameterized inversion refers to cases in which parameters are distributed in space or time and are correlated with one another. The Bayesian aspect of bgaPEST is related to Bayesian probability theory in which prior information about parameters is formally revised on the basis of the calibration dataset used for the inversion. Conceptually, this approach formalizes the conditionality of estimated parameters on the speciﬁc data and model available. The geostatistical component of the method refers to the way in which prior information about the parameters is used. A geostatistical autocorrelation function is used to enforce structure on the parameters to avoid overﬁtting and unrealistic results. Bayesian Geostatistical Approach is designed to provide the smoothest solution that is consistent with the data. Optionally, users can specify a level of ﬁt or estimate a balance between ﬁt and model complexity informed by the data. Groundwater and surface-water applications are used as examples in this text, but the possible uses of bgaPEST extend to any distributed parameter applications.
NASA Astrophysics Data System (ADS)
Troldborg, M.; Nowak, W.; Binning, P. J.; Bjerg, P. L.
2012-12-01
Estimates of mass discharge (mass/time) are increasingly being used when assessing risks of groundwater contamination and designing remedial systems at contaminated sites. Mass discharge estimates are, however, prone to rather large uncertainties as they integrate uncertain spatial distributions of both concentration and groundwater flow velocities. For risk assessments or any other decisions that are being based on mass discharge estimates, it is essential to address these uncertainties. We present a novel Bayesian geostatistical approach for quantifying the uncertainty of the mass discharge across a multilevel control plane. The method decouples the flow and transport simulation and has the advantage of avoiding the heavy computational burden of three-dimensional numerical flow and transport simulation coupled with geostatistical inversion. It may therefore be of practical relevance to practitioners compared to existing methods that are either too simple or computationally demanding. The method is based on conditional geostatistical simulation and accounts for i) heterogeneity of both the flow field and the concentration distribution through Bayesian geostatistics (including the uncertainty in covariance functions), ii) measurement uncertainty, and iii) uncertain source zone geometry and transport parameters. The method generates multiple equally likely realizations of the spatial flow and concentration distribution, which all honour the measured data at the control plane. The flow realizations are generated by analytical co-simulation of the hydraulic conductivity and the hydraulic gradient across the control plane. These realizations are made consistent with measurements of both hydraulic conductivity and head at the site. An analytical macro-dispersive transport solution is employed to simulate the mean concentration distribution across the control plane, and a geostatistical model of the Box-Cox transformed concentration data is used to simulate observed
NASA Astrophysics Data System (ADS)
Painter, S. L.; Jiang, Y.; Woodbury, A. D.
2002-12-01
The Edwards Aquifer, a highly heterogeneous karst aquifer located in south central Texas, is the sole source of drinking water for more than one million people. Hydraulic conductivity (K) measurements in the Edwards Aquifer are sparse, highly variable (log-K variance of 6.4), and are mostly from single-well drawdown tests that are appropriate for the spatial scale of a few meters. To support ongoing efforts to develop a groundwater management (MODFLOW) model of the San Antonio segment of the Edwards Aquifer, a multistep procedure was developed to assign hydraulic parameters to the 402 m x 402 m computational cells intended for the management model. The approach used a combination of nonparametric geostatistical analysis, stochastic simulation, numerical upscaling, and automatic model calibration based on Bayesian updating [1,2]. Indicator correlograms reveal a nested spatial structure in the well-test K of the confined zone, with practical correlation ranges of 3,600 and 15,000 meters and a large nugget effect. The fitted geostatistical model was used in unconditional stochastic simulations by the sequential indicator simulation method. The resulting realizations of K, defined at the scale of the well tests, were then numerically upscaled to the block scale. A new geostatistical model was fitted to the upscaled values. The upscaled model was then used to cokrige the block-scale K based on the well-test K. The resulting K map was then converted to transmissivity (T) using deterministically mapped aquifer thickness. When tested in a forward groundwater model, the upscaled T reproduced hydraulic heads better than a simple kriging of the well-test values (mean error of -3.9 meter and mean-absolute-error of 12 meters, as compared with -13 and 17 meters for the simple kriging). As the final step in the study, the upscaled T map was used as the prior distribution in an inverse procedure based on Bayesian updating [1,2]. When input to the forward groundwater model, the
Boysen, Courtney; Davis, Elizabeth G.; Beard, Laurie A.; Lubbers, Brian V.; Raghavan, Ram K.
2015-01-01
Kansas witnessed an unprecedented outbreak in Corynebacterium pseudotuberculosis infection among horses, a disease commonly referred to as pigeon fever during fall 2012. Bayesian geostatistical models were developed to identify key environmental and climatic risk factors associated with C. pseudotuberculosis infection in horses. Positive infection status among horses (cases) was determined by positive test results for characteristic abscess formation, positive bacterial culture on purulent material obtained from a lanced abscess (n = 82), or positive serologic evidence of exposure to organism (≥1:512)(n = 11). Horses negative for these tests (n = 172)(controls) were considered free of infection. Information pertaining to horse demographics and stabled location were obtained through review of medical records and/or contact with horse owners via telephone. Covariate information for environmental and climatic determinants were obtained from USDA (soil attributes), USGS (land use/land cover), and NASA MODIS and NASA Prediction of Worldwide Renewable Resources (climate). Candidate covariates were screened using univariate regression models followed by Bayesian geostatistical models with and without covariates. The best performing model indicated a protective effect for higher soil moisture content (OR = 0.53, 95% CrI = 0.25, 0.71), and detrimental effects for higher land surface temperature (≥35°C) (OR = 2.81, 95% CrI = 2.21, 3.85) and habitat fragmentation (OR = 1.31, 95% CrI = 1.27, 2.22) for C. pseudotuberculosis infection status in horses, while age, gender and breed had no effect. Preventative and ecoclimatic significance of these findings are discussed. PMID:26473728
Boysen, Courtney; Davis, Elizabeth G; Beard, Laurie A; Lubbers, Brian V; Raghavan, Ram K
2015-01-01
Kansas witnessed an unprecedented outbreak in Corynebacterium pseudotuberculosis infection among horses, a disease commonly referred to as pigeon fever during fall 2012. Bayesian geostatistical models were developed to identify key environmental and climatic risk factors associated with C. pseudotuberculosis infection in horses. Positive infection status among horses (cases) was determined by positive test results for characteristic abscess formation, positive bacterial culture on purulent material obtained from a lanced abscess (n = 82), or positive serologic evidence of exposure to organism (≥ 1:512)(n = 11). Horses negative for these tests (n = 172)(controls) were considered free of infection. Information pertaining to horse demographics and stabled location were obtained through review of medical records and/or contact with horse owners via telephone. Covariate information for environmental and climatic determinants were obtained from USDA (soil attributes), USGS (land use/land cover), and NASA MODIS and NASA Prediction of Worldwide Renewable Resources (climate). Candidate covariates were screened using univariate regression models followed by Bayesian geostatistical models with and without covariates. The best performing model indicated a protective effect for higher soil moisture content (OR = 0.53, 95% CrI = 0.25, 0.71), and detrimental effects for higher land surface temperature (≥ 35°C) (OR = 2.81, 95% CrI = 2.21, 3.85) and habitat fragmentation (OR = 1.31, 95% CrI = 1.27, 2.22) for C. pseudotuberculosis infection status in horses, while age, gender and breed had no effect. Preventative and ecoclimatic significance of these findings are discussed. PMID:26473728
High Performance Geostatistical Modeling of Biospheric Resources
NASA Astrophysics Data System (ADS)
Pedelty, J. A.; Morisette, J. T.; Smith, J. A.; Schnase, J. L.; Crosier, C. S.; Stohlgren, T. J.
2004-12-01
We are using parallel geostatistical codes to study spatial relationships among biospheric resources in several study areas. For example, spatial statistical models based on large- and small-scale variability have been used to predict species richness of both native and exotic plants (hot spots of diversity) and patterns of exotic plant invasion. However, broader use of geostastics in natural resource modeling, especially at regional and national scales, has been limited due to the large computing requirements of these applications. To address this problem, we implemented parallel versions of the kriging spatial interpolation algorithm. The first uses the Message Passing Interface (MPI) in a master/slave paradigm on an open source Linux Beowulf cluster, while the second is implemented with the new proprietary Xgrid distributed processing system on an Xserve G5 cluster from Apple Computer, Inc. These techniques are proving effective and provide the basis for a national decision support capability for invasive species management that is being jointly developed by NASA and the US Geological Survey.
Fienen, M.; Hunt, R.; Krabbenhoft, D.; Clemo, T.
2009-01-01
Flow path delineation is a valuable tool for interpreting the subsurface hydrogeochemical environment. Different types of data, such as groundwater flow and transport, inform different aspects of hydrogeologie parameter values (hydraulic conductivity in this case) which, in turn, determine flow paths. This work combines flow and transport information to estimate a unified set of hydrogeologic parameters using the Bayesian geostatistical inverse approach. Parameter flexibility is allowed by using a highly parameterized approach with the level of complexity informed by the data. Despite the effort to adhere to the ideal of minimal a priori structure imposed on the problem, extreme contrasts in parameters can result in the need to censor correlation across hydrostratigraphic bounding surfaces. These partitions segregate parameters into faci??s associations. With an iterative approach in which partitions are based on inspection of initial estimates, flow path interpretation is progressively refined through the inclusion of more types of data. Head observations, stable oxygen isotopes (18O/16O) ratios), and tritium are all used to progressively refine flow path delineation on an isthmus between two lakes in the Trout Lake watershed, northern Wisconsin, United States. Despite allowing significant parameter freedom by estimating many distributed parameter values, a smooth field is obtained. Copyright 2009 by the American Geophysical Union.
Geostatistical modelling of household malaria in Malawi
NASA Astrophysics Data System (ADS)
Chirombo, J.; Lowe, R.; Kazembe, L.
2012-04-01
Malaria is one of the most important diseases in the world today, common in tropical and subtropical areas with sub-Saharan Africa being the region most burdened, including Malawi. This region has the right combination of biotic and abiotic components, including socioeconomic, climatic and environmental factors that sustain transmission of the disease. Differences in these conditions across the country consequently lead to spatial variation in risk of the disease. Analysis of nationwide survey data that takes into account this spatial variation is crucial in a resource constrained country like Malawi for targeted allocation of scare resources in the fight against malaria. Previous efforts to map malaria risk in Malawi have been based on limited data collected from small surveys. The Malaria Indicator Survey conducted in 2010 is the most comprehensive malaria survey carried out in Malawi and provides point referenced data for the study. The data has been shown to be spatially correlated. We use Bayesian logistic regression models with spatial correlation to model the relationship between malaria presence in children and covariates such as socioeconomic status of households and meteorological conditions. This spatial model is then used to assess how malaria varies spatially and a malaria risk map for Malawi is produced. By taking intervention measures into account, the developed model is used to assess whether they have an effect on the spatial distribution of the disease and Bayesian kriging is used to predict areas where malaria risk is more likely to increase. It is hoped that this study can help reveal areas that require more attention from the authorities in the continuing fight against malaria, particularly in children under the age of five.
Hanks, Ephraim M.; Schliep, Erin M.; Hooten, Mevin B.; Hoeting, Jennifer A.
2015-01-01
In spatial generalized linear mixed models (SGLMMs), covariates that are spatially smooth are often collinear with spatially smooth random effects. This phenomenon is known as spatial confounding and has been studied primarily in the case where the spatial support of the process being studied is discrete (e.g., areal spatial data). In this case, the most common approach suggested is restricted spatial regression (RSR) in which the spatial random effects are constrained to be orthogonal to the fixed effects. We consider spatial confounding and RSR in the geostatistical (continuous spatial support) setting. We show that RSR provides computational benefits relative to the confounded SGLMM, but that Bayesian credible intervals under RSR can be inappropriately narrow under model misspecification. We propose a posterior predictive approach to alleviating this potential problem and discuss the appropriateness of RSR in a variety of situations. We illustrate RSR and SGLMM approaches through simulation studies and an analysis of malaria frequencies in The Gambia, Africa.
NASA Astrophysics Data System (ADS)
Hu, L.; Montzka, S. A.; Miller, B.; Andrews, A. E.; Miller, J. B.; Lehman, S.; Sweeney, C.; Miller, S. M.; Thoning, K. W.; Siso, C.; Atlas, E. L.; Blake, D. R.; De Gouw, J. A.; Gilman, J.; Dutton, G. S.; Elkins, J. W.; Hall, B. D.; Chen, H.; Fischer, M. L.; Mountain, M. E.; Nehrkorn, T.; Biraud, S.; Tans, P. P.
2015-12-01
Global atmospheric observations suggest substantial ongoing emissions of carbon tetrachloride (CCl4) despite a 100% phase-out of production for dispersive uses since 1996 in developed countries and 2010 in other countries. Little progress has been made in understanding the causes of these ongoing emissions or identifying their contributing sources. In this study, we employed multiple inverse modeling techniques (i.e. Bayesian and geostatistical inversions) to assimilate CCl4 mole fractions observed from the National Oceanic and Atmospheric Administration (NOAA) flask-air sampling network over the US, and quantify its national and regional emissions during 2008 - 2012. Average national total emissions of CCl4 between 2008 and 2012 determined from these observations and an ensemble of inversions range between 2.1 and 6.1 Gg yr-1. This emission is substantially larger than the mean of 0.06 Gg/yr reported to the US EPA Toxics Release Inventory over these years, suggesting that under-reported emissions or non-reporting sources make up the bulk of CCl4 emissions from the US. But while the inventory does not account for the magnitude of observationally-derived CCl4 emissions, the regional distribution of derived and inventory emissions is similar. Furthermore, when considered relative to the distribution of uncapped landfills or population, the variability in measured mole fractions was most consistent with the distribution of industrial sources (i.e., those from the Toxics Release Inventory). Our results suggest that emissions from the US only account for a small fraction of the global on-going emissions of CCl4 (30 - 80 Gg yr-1 over this period). Finally, to ascertain the importance of the US emissions relative to the unaccounted global emission rate we considered multiple approaches to extrapolate our results to other countries and the globe.
Fractal and geostatistical methods for modeling of a fracture network
Chiles, J.P.
1988-08-01
The modeling of fracture networks is useful for fluid flow and rock mechanics studies. About 6600 fracture traces were recorded on drifts of a uranium mine in a granite massif. The traces have an extension of 0.20-20 m. The network was studied by fractal and by geostatistical methods but can be considered neither as a fractal with a constant dimension nor a set of purely randomly located fractures. Two kinds of generalization of conventional models can still provide more flexibility for the characterization of the network: (a) a nonscaling fractal model with variable similarity dimension (for a 2-D network of traces, the dimension varying from 2 for the 10-m scale to 1 for the centimeter scale, (b) a parent-daughter model with a regionalized density; the geostatistical study allows a 3-D model to be established where: fractures are assumed to be discs; fractures are grouped in clusters or swarms; and fracturation density is regionalized (with two ranges at about 30 and 300 m). The fractal model is easy to fit and to simulate along a line, but 2-D and 3-D simulations are more difficult. The geostatistical model is more complex, but easy to simulate, even in 3-D.
Stochastic Local Interaction (SLI) model: Bridging machine learning and geostatistics
NASA Astrophysics Data System (ADS)
Hristopulos, Dionissios T.
2015-12-01
Machine learning and geostatistics are powerful mathematical frameworks for modeling spatial data. Both approaches, however, suffer from poor scaling of the required computational resources for large data applications. We present the Stochastic Local Interaction (SLI) model, which employs a local representation to improve computational efficiency. SLI combines geostatistics and machine learning with ideas from statistical physics and computational geometry. It is based on a joint probability density function defined by an energy functional which involves local interactions implemented by means of kernel functions with adaptive local kernel bandwidths. SLI is expressed in terms of an explicit, typically sparse, precision (inverse covariance) matrix. This representation leads to a semi-analytical expression for interpolation (prediction), which is valid in any number of dimensions and avoids the computationally costly covariance matrix inversion.
Examples of improved reservoir modeling through geostatistical data integration
Bashore, W.M.; Araktingi, U.G.
1994-12-31
Results from four case studies are presented to demonstrate improvements in reservoir modeling and subsequent flow predictions through various uses of geostatistical integration methods. Specifically, these cases highlight improvements gained from (1) better understanding of reservoir geometries through 3D visualization, (2) forward modeling to assess the value of new data prior to acquisition and integration, (3) assessment of reduced uncertainty in porosity prediction through integration of seismic acoustic impedance, and (4) integration of crosswell tomographic and reflection data. The intent of each of these examples is to quantify the add-value of geological and geophysical data integration in engineering terms such as fluid-flow results and reservoir property predictions.
Slater, Hannah; Michael, Edwin
2013-01-01
There is increasing interest to control or eradicate the major neglected tropical diseases. Accurate modelling of the geographic distributions of parasitic infections will be crucial to this endeavour. We used 664 community level infection prevalence data collated from the published literature in conjunction with eight environmental variables, altitude and population density, and a multivariate Bayesian generalized linear spatial model that allows explicit accounting for spatial autocorrelation and incorporation of uncertainty in input data and model parameters, to construct the first spatially-explicit map describing LF prevalence distribution in Africa. We also ran the best-fit model against predictions made by the HADCM3 and CCCMA climate models for 2050 to predict the likely distributions of LF under future climate and population changes. We show that LF prevalence is strongly influenced by spatial autocorrelation between locations but is only weakly associated with environmental covariates. Infection prevalence, however, is found to be related to variations in population density. All associations with key environmental/demographic variables appear to be complex and non-linear. LF prevalence is predicted to be highly heterogenous across Africa, with high prevalences (>20%) estimated to occur primarily along coastal West and East Africa, and lowest prevalences predicted for the central part of the continent. Error maps, however, indicate a need for further surveys to overcome problems with data scarcity in the latter and other regions. Analysis of future changes in prevalence indicates that population growth rather than climate change per se will represent the dominant factor in the predicted increase/decrease and spread of LF on the continent. We indicate that these results could play an important role in aiding the development of strategies that are best able to achieve the goals of parasite elimination locally and globally in a manner that may also account
Model-Based Geostatistical Mapping of the Prevalence of Onchocerca volvulus in West Africa
O’Hanlon, Simon J.; Slater, Hannah C.; Cheke, Robert A.; Boatin, Boakye A.; Coffeng, Luc E.; Pion, Sébastien D. S.; Boussinesq, Michel; Zouré, Honorat G. M.; Stolk, Wilma A.; Basáñez, María-Gloria
2016-01-01
Background The initial endemicity (pre-control prevalence) of onchocerciasis has been shown to be an important determinant of the feasibility of elimination by mass ivermectin distribution. We present the first geostatistical map of microfilarial prevalence in the former Onchocerciasis Control Programme in West Africa (OCP) before commencement of antivectorial and antiparasitic interventions. Methods and Findings Pre-control microfilarial prevalence data from 737 villages across the 11 constituent countries in the OCP epidemiological database were used as ground-truth data. These 737 data points, plus a set of statistically selected environmental covariates, were used in a Bayesian model-based geostatistical (B-MBG) approach to generate a continuous surface (at pixel resolution of 5 km x 5km) of microfilarial prevalence in West Africa prior to the commencement of the OCP. Uncertainty in model predictions was measured using a suite of validation statistics, performed on bootstrap samples of held-out validation data. The mean Pearson’s correlation between observed and estimated prevalence at validation locations was 0.693; the mean prediction error (average difference between observed and estimated values) was 0.77%, and the mean absolute prediction error (average magnitude of difference between observed and estimated values) was 12.2%. Within OCP boundaries, 17.8 million people were deemed to have been at risk, 7.55 million to have been infected, and mean microfilarial prevalence to have been 45% (range: 2–90%) in 1975. Conclusions and Significance This is the first map of initial onchocerciasis prevalence in West Africa using B-MBG. Important environmental predictors of infection prevalence were identified and used in a model out-performing those without spatial random effects or environmental covariates. Results may be compared with recent epidemiological mapping efforts to find areas of persisting transmission. These methods may be extended to areas where
Bayesian Model Averaging for Propensity Score Analysis
ERIC Educational Resources Information Center
Kaplan, David; Chen, Jianshen
2013-01-01
The purpose of this study is to explore Bayesian model averaging in the propensity score context. Previous research on Bayesian propensity score analysis does not take into account model uncertainty. In this regard, an internally consistent Bayesian framework for model building and estimation must also account for model uncertainty. The…
Bayesian stable isotope mixing models
In this paper we review recent advances in Stable Isotope Mixing Models (SIMMs) and place them into an over-arching Bayesian statistical framework which allows for several useful extensions. SIMMs are used to quantify the proportional contributions of various sources to a mixtur...
NASA Astrophysics Data System (ADS)
Illman, Walter A.; Berg, Steven J.; Zhao, Zhanfeng
2015-05-01
The robust performance of hydraulic tomography (HT) based on geostatistics has been demonstrated through numerous synthetic, laboratory, and field studies. While geostatistical inverse methods offer many advantages, one key disadvantage is its highly parameterized nature, which renders it computationally intensive for large-scale problems. Another issue is that geostatistics-based HT may produce overly smooth images of subsurface heterogeneity when there are few monitoring interval data. Therefore, some may question the utility of the geostatistical inversion approach in certain situations and seek alternative approaches. To investigate these issues, we simultaneously calibrated different groundwater models with varying subsurface conceptualizations and parameter resolutions using a laboratory sandbox aquifer. The compared models included: (1) isotropic and anisotropic effective parameter models; (2) a heterogeneous model that faithfully represents the geological features; and (3) a heterogeneous model based on geostatistical inverse modeling. The performance of these models was assessed by quantitatively examining the results from model calibration and validation. Calibration data consisted of steady state drawdown data from eight pumping tests and validation data consisted of data from 16 separate pumping tests not used in the calibration effort. Results revealed that the geostatistical inversion approach performed the best among the approaches compared, although the geological model that faithfully represented stratigraphy came a close second. In addition, when the number of pumping tests available for inverse modeling was small, the geological modeling approach yielded more robust validation results. This suggests that better knowledge of stratigraphy obtained via geophysics or other means may contribute to improved results for HT.
Bayesian kinematic earthquake source models
NASA Astrophysics Data System (ADS)
Minson, S. E.; Simons, M.; Beck, J. L.; Genrich, J. F.; Galetzka, J. E.; Chowdhury, F.; Owen, S. E.; Webb, F.; Comte, D.; Glass, B.; Leiva, C.; Ortega, F. H.
2009-12-01
Most coseismic, postseismic, and interseismic slip models are based on highly regularized optimizations which yield one solution which satisfies the data given a particular set of regularizing constraints. This regularization hampers our ability to answer basic questions such as whether seismic and aseismic slip overlap or instead rupture separate portions of the fault zone. We present a Bayesian methodology for generating kinematic earthquake source models with a focus on large subduction zone earthquakes. Unlike classical optimization approaches, Bayesian techniques sample the ensemble of all acceptable models presented as an a posteriori probability density function (PDF), and thus we can explore the entire solution space to determine, for example, which model parameters are well determined and which are not, or what is the likelihood that two slip distributions overlap in space. Bayesian sampling also has the advantage that all a priori knowledge of the source process can be used to mold the a posteriori ensemble of models. Although very powerful, Bayesian methods have up to now been of limited use in geophysical modeling because they are only computationally feasible for problems with a small number of free parameters due to what is called the "curse of dimensionality." However, our methodology can successfully sample solution spaces of many hundreds of parameters, which is sufficient to produce finite fault kinematic earthquake models. Our algorithm is a modification of the tempered Markov chain Monte Carlo (tempered MCMC or TMCMC) method. In our algorithm, we sample a "tempered" a posteriori PDF using many MCMC simulations running in parallel and evolutionary computation in which models which fit the data poorly are preferentially eliminated in favor of models which better predict the data. We present results for both synthetic test problems as well as for the 2007 Mw 7.8 Tocopilla, Chile earthquake, the latter of which is constrained by InSAR, local high
Geostatistical modeling of uncertainty, simulation, and proposed applications in GIScience
NASA Astrophysics Data System (ADS)
Doucette, Peter; Dolloff, John; Lenihan, Michael
2015-05-01
Geostatistical modeling of spatial uncertainty has its roots in the mining, water and oil reservoir exploration communities, and has great potential for broader applications as proposed in this paper. This paper describes the underlying statistical models and their use in both the estimation of quantities of interest and the Monte-Carlo simulation of their uncertainty or errors, including their variance or expected magnitude and their spatial correlations or inter-relationships. These quantities can include 2D or 3D terrain locations, feature vertex locations, or any specified attributes whose statistical properties vary spatially. The simulation of spatial uncertainty or errors is a practical and powerful tool for understanding the effects of error propagation in complex systems. This paper describes various simulation techniques and trades-off their generality with complexity and speed. One technique recently proposed by the authors, Fast Sequential Simulation, has the ability to simulate tens of millions of errors with specifiable variance and spatial correlations in a few seconds on a lap-top computer. This ability allows for the timely evaluation of resultant output errors or the performance of a "down-stream" module or application. It also allows for near-real time evaluation when such a simulation capability is built into the application itself.
Frequentist tests for Bayesian models
NASA Astrophysics Data System (ADS)
Lucy, L. B.
2016-04-01
Analogues of the frequentist chi-square and F tests are proposed for testing goodness-of-fit and consistency for Bayesian models. Simple examples exhibit these tests' detection of inconsistency between consecutive experiments with identical parameters, when the first experiment provides the prior for the second. In a related analysis, a quantitative measure is derived for judging the degree of tension between two different experiments with partially overlapping parameter vectors.
Flexible Bayesian Human Fecundity Models
Kim, Sungduk; Sundaram, Rajeshwari; Buck Louis, Germaine M.; Pyper, Cecilia
2016-01-01
Human fecundity is an issue of considerable interest for both epidemiological and clinical audiences, and is dependent upon a couple’s biologic capacity for reproduction coupled with behaviors that place a couple at risk for pregnancy. Bayesian hierarchical models have been proposed to better model the conception probabilities by accounting for the acts of intercourse around the day of ovulation, i.e., during the fertile window. These models can be viewed in the framework of a generalized nonlinear model with an exponential link. However, a fixed choice of link function may not always provide the best fit, leading to potentially biased estimates for probability of conception. Motivated by this, we propose a general class of models for fecundity by relaxing the choice of the link function under the generalized nonlinear model framework. We use a sample from the Oxford Conception Study (OCS) to illustrate the utility and fit of this general class of models for estimating human conception. Our findings reinforce the need for attention to be paid to the choice of link function in modeling conception, as it may bias the estimation of conception probabilities. Various properties of the proposed models are examined and a Markov chain Monte Carlo sampling algorithm was developed for implementing the Bayesian computations. The deviance information criterion measure and logarithm of pseudo marginal likelihood are used for guiding the choice of links. The supplemental material section contains technical details of the proof of the theorem stated in the paper, and contains further simulation results and analysis.
Bayesian Networks for Social Modeling
Whitney, Paul D.; White, Amanda M.; Walsh, Stephen J.; Dalton, Angela C.; Brothers, Alan J.
2011-03-28
This paper describes a body of work developed over the past five years. The work addresses the use of Bayesian network (BN) models for representing and predicting social/organizational behaviors. The topics covered include model construction, validation, and use. These topics show the bulk of the lifetime of such model, beginning with construction, moving to validation and other aspects of model ‘critiquing’, and finally demonstrating how the modeling approach might be used to inform policy analysis. To conclude, we discuss limitations of using BN for this activity and suggest remedies to address those limitations. The primary benefits of using a well-developed computational, mathematical, and statistical modeling structure, such as BN, are 1) there are significant computational, theoretical and capability bases on which to build 2) ability to empirically critique the model, and potentially evaluate competing models for a social/behavioral phenomena.
Modeling Diagnostic Assessments with Bayesian Networks
ERIC Educational Resources Information Center
Almond, Russell G.; DiBello, Louis V.; Moulder, Brad; Zapata-Rivera, Juan-Diego
2007-01-01
This paper defines Bayesian network models and examines their applications to IRT-based cognitive diagnostic modeling. These models are especially suited to building inference engines designed to be synchronous with the finer grained student models that arise in skills diagnostic assessment. Aspects of the theory and use of Bayesian network models…
Giardina, Federica; Franke, Jonas; Vounatsou, Penelope
2015-01-01
The study of malaria spatial epidemiology has benefited from recent advances in geographic information system and geostatistical modelling. Significant progress in earth observation technologies has led to the development of moderate, high and very high resolution imagery. Extensive literature exists on the relationship between malaria and environmental/climatic factors in different geographical areas, but few studies have linked human malaria parasitemia survey data with remote sensing-derived land cover/land use variables and very few have used Earth Observation products. Comparison among the different resolution products to model parasitemia has not yet been investigated. In this study, we probe a proximity measure to incorporate different land cover classes and assess the effect of the spatial resolution of remotely sensed land cover and elevation on malaria risk estimation in Mozambique after adjusting for other environmental factors at a fixed spatial resolution. We used data from the Demographic and Health survey carried out in 2011, which collected malaria parasitemia data on children from 0 to 5 years old, analysing them with a Bayesian geostatistical model. We compared the risk predicted using land cover and elevation at moderate resolution with the risk obtained employing the same variables at high resolution. We used elevation data at moderate and high resolution and the land cover layer from the Moderate Resolution Imaging Spectroradiometer as well as the one produced by MALAREO, a project covering part of Mozambique during 2010-2012 that was funded by the European Union's 7th Framework Program. Moreover, the number of infected children was predicted at different spatial resolutions using AFRIPOP population data and the enhanced population data generated by the MALAREO project for comparison of estimates. The Bayesian geostatistical model showed that the main determinants of malaria presence are precipitation and day temperature. However, the presence
Use of geostatistical modeling to capture complex geology in finite-element analyses
Rautman, C.A.; Longenbaugh, R.S.; Ryder, E.E.
1995-12-01
This paper summarizes a number of transient thermal analyses performed for a representative two-dimensional cross section of volcanic tuffs at Yucca Mountain using the finite element, nonlinear heat-conduction code COYOTE-II. In addition to conventional design analyses, in which material properties are formulated as a uniform single material and as horizontally layered, internally uniform matters, an attempt was made to increase the resemblance of the thermal property field to the actual geology by creating two fairly complex, geologically realistic models. The first model was created by digitizing an existing two-dimensional geologic cross section of Yucca Mountain. The second model was created using conditional geostatistical simulation. Direct mapping of geostatistically generated material property fields onto finite element computational meshes was demonstrated to yield temperature fields approximately equivalent to those generated through more conventional procedures. However, the ability to use the geostatistical models offers a means of simplifying the physical-process analyses.
NASA Astrophysics Data System (ADS)
Kosack, Christian; Vogt, Christian; Rath, Volker; Marquart, Gabriele
2010-05-01
The knowledge of the permeability distribution at depth is of primary concern for any geothermal reservoir engineering. However, permeability might change over orders of magnitude even for a single rock type and is additionally controlled by tectonic or engineered fracturing of the rocks. During reservoir exploration pumping tests are regularly performed where tracer marked water is pumped in one borehole and retrieved at one or a few others. At the European Enhanced Geothermal System (EGS) test site at Soultz-sous-Forêts three wells had been drilled in the granitic bedrock down to 4 to 5 km and were hydraulically stimulated to enhance the hydraulic connectivity between the wells. In July 2005, a tracer circulation test was carried out in order to estimate the changes of the hydraulic properties. Therefore a tracer was injected into the well GPK3 for 19 hours at a rate of 0.015 m3 s-1 and a concentration of 0.389 mol m-3. Tracer concentration was measured in the production wells over the following 5 months, while the produced water was re-injected into GPK3. This experiment demonstrated a good hydraulic connection between GPK3 and one of the production wells, GPK2, while a very low connectivity was observed in the other one, GPK4. We tested three different approaches simulating the pumping experiment with the numerical simulator shemat_suite in a simplified 3D model of the site in order to study their respective potential to estimate a reliable permeability distribution for the Soultz reservoir: A full-physics gradient-based Bayesian inversion, a massive Monte Carlo approach with geostatistic analysis, and an Ensemble-Kalman-Filter (EnKF) assimilation. A common feature in all models is a high permeability zone which acts as main flow area and transports most of the tracer. It is assumed to be associated with the fault zone cutting through the boreholes GPK2 and GPK3. With the Bayesian Inversion we were able to estimate a parameter set consisting of porosity
NASA Astrophysics Data System (ADS)
Yan, Hongxiang; Moradkhani, Hamid
2016-08-01
Assimilation of satellite soil moisture and streamflow data into a distributed hydrologic model has received increasing attention over the past few years. This study provides a detailed analysis of the joint and separate assimilation of streamflow and Advanced Scatterometer (ASCAT) surface soil moisture into a distributed Sacramento Soil Moisture Accounting (SAC-SMA) model, with the use of recently developed particle filter-Markov chain Monte Carlo (PF-MCMC) method. Performance is assessed over the Salt River Watershed in Arizona, which is one of the watersheds without anthropogenic effects in Model Parameter Estimation Experiment (MOPEX). A total of five data assimilation (DA) scenarios are designed and the effects of the locations of streamflow gauges and the ASCAT soil moisture on the predictions of soil moisture and streamflow are assessed. In addition, a geostatistical model is introduced to overcome the significantly biased satellite soil moisture and also discontinuity issue. The results indicate that: (1) solely assimilating outlet streamflow can lead to biased soil moisture estimation; (2) when the study area can only be partially covered by the satellite data, the geostatistical approach can estimate the soil moisture for those uncovered grid cells; (3) joint assimilation of streamflow and soil moisture from geostatistical modeling can further improve the surface soil moisture prediction. This study recommends that the geostatistical model is a helpful tool to aid the remote sensing technique and the hydrologic DA study.
Bayesian Calibration of Microsimulation Models.
Rutter, Carolyn M; Miglioretti, Diana L; Savarino, James E
2009-12-01
Microsimulation models that describe disease processes synthesize information from multiple sources and can be used to estimate the effects of screening and treatment on cancer incidence and mortality at a population level. These models are characterized by simulation of individual event histories for an idealized population of interest. Microsimulation models are complex and invariably include parameters that are not well informed by existing data. Therefore, a key component of model development is the choice of parameter values. Microsimulation model parameter values are selected to reproduce expected or known results though the process of model calibration. Calibration may be done by perturbing model parameters one at a time or by using a search algorithm. As an alternative, we propose a Bayesian method to calibrate microsimulation models that uses Markov chain Monte Carlo. We show that this approach converges to the target distribution and use a simulation study to demonstrate its finite-sample performance. Although computationally intensive, this approach has several advantages over previously proposed methods, including the use of statistical criteria to select parameter values, simultaneous calibration of multiple parameters to multiple data sources, incorporation of information via prior distributions, description of parameter identifiability, and the ability to obtain interval estimates of model parameters. We develop a microsimulation model for colorectal cancer and use our proposed method to calibrate model parameters. The microsimulation model provides a good fit to the calibration data. We find evidence that some parameters are identified primarily through prior distributions. Our results underscore the need to incorporate multiple sources of variability (i.e., due to calibration data, unknown parameters, and estimated parameters and predicted values) when calibrating and applying microsimulation models. PMID:20076767
Bayesian Calibration of Microsimulation Models
Rutter, Carolyn M.; Miglioretti, Diana L.; Savarino, James E.
2009-01-01
Microsimulation models that describe disease processes synthesize information from multiple sources and can be used to estimate the effects of screening and treatment on cancer incidence and mortality at a population level. These models are characterized by simulation of individual event histories for an idealized population of interest. Microsimulation models are complex and invariably include parameters that are not well informed by existing data. Therefore, a key component of model development is the choice of parameter values. Microsimulation model parameter values are selected to reproduce expected or known results though the process of model calibration. Calibration may be done by perturbing model parameters one at a time or by using a search algorithm. As an alternative, we propose a Bayesian method to calibrate microsimulation models that uses Markov chain Monte Carlo. We show that this approach converges to the target distribution and use a simulation study to demonstrate its finite-sample performance. Although computationally intensive, this approach has several advantages over previously proposed methods, including the use of statistical criteria to select parameter values, simultaneous calibration of multiple parameters to multiple data sources, incorporation of information via prior distributions, description of parameter identifiability, and the ability to obtain interval estimates of model parameters. We develop a microsimulation model for colorectal cancer and use our proposed method to calibrate model parameters. The microsimulation model provides a good fit to the calibration data. We find evidence that some parameters are identified primarily through prior distributions. Our results underscore the need to incorporate multiple sources of variability (i.e., due to calibration data, unknown parameters, and estimated parameters and predicted values) when calibrating and applying microsimulation models. PMID:20076767
Integrated geostatistics for modeling fluid contacts and shales in Prudhoe Bay
Perez, G.; Chopra, A.K.; Severson, C.D.
1997-12-01
Geostatistics techniques are being used increasingly to model reservoir heterogeneity at a wide range of scales. A variety of techniques is now available with differing underlying assumptions, complexity, and applications. This paper introduces a novel method of geostatistics to model dynamic gas-oil contacts and shales in the Prudhoe Bay reservoir. The method integrates reservoir description and surveillance data within the same geostatistical framework. Surveillance logs and shale data are transformed to indicator variables. These variables are used to evaluate vertical and horizontal spatial correlation and cross-correlation of gas and shale at different times and to develop variogram models. Conditional simulation techniques are used to generate multiple three-dimensional (3D) descriptions of gas and shales that provide a measure of uncertainty. These techniques capture the complex 3D distribution of gas-oil contacts through time. The authors compare results of the geostatistical method with conventional techniques as well as with infill wells drilled after the study. Predicted gas-oil contacts and shale distributions are in close agreement with gas-oil contacts observed at infill wells.
Sparse Bayesian infinite factor models
Bhattacharya, A.; Dunson, D. B.
2011-01-01
We focus on sparse modelling of high-dimensional covariance matrices using Bayesian latent factor models. We propose a multiplicative gamma process shrinkage prior on the factor loadings which allows introduction of infinitely many factors, with the loadings increasingly shrunk towards zero as the column index increases. We use our prior on a parameter-expanded loading matrix to avoid the order dependence typical in factor analysis models and develop an efficient Gibbs sampler that scales well as data dimensionality increases. The gain in efficiency is achieved by the joint conjugacy property of the proposed prior, which allows block updating of the loadings matrix. We propose an adaptive Gibbs sampler for automatically truncating the infinite loading matrix through selection of the number of important factors. Theoretical results are provided on the support of the prior and truncation approximation bounds. A fast algorithm is proposed to produce approximate Bayes estimates. Latent factor regression methods are developed for prediction and variable selection in applications with high-dimensional correlated predictors. Operating characteristics are assessed through simulation studies, and the approach is applied to predict survival times from gene expression data. PMID:23049129
A geostatistical methodology to assess the accuracy of unsaturated flow models
Smoot, J.L.; Williams, R.E.
1996-04-01
The Pacific Northwest National Laboratory spatiotemporal movement of water injected into (PNNL) has developed a Hydrologic unsaturated sediments at the Hanford Site in Evaluation Methodology (HEM) to assist the Washington State was used to develop a new U.S. Nuclear Regulatory Commission in method for evaluating mathematical model evaluating the potential that infiltrating meteoric predictions. Measured water content data were water will produce leachate at commercial low- interpolated geostatistically to a 16 x 16 x 36 level radioactive waste disposal sites. Two key grid at several time intervals. Then a issues are raised in the HEM: (1) evaluation of mathematical model was used to predict water mathematical models that predict facility content at the same grid locations at the selected performance, and (2) estimation of the times. Node-by-node comparison of the uncertainty associated with these mathematical mathematical model predictions with the model predictions. The technical objective of geostatistically interpolated values was this research is to adapt geostatistical tools conducted. The method facilitates a complete commonly used for model parameter estimation accounting and categorization of model error at to the problem of estimating the spatial every node. The comparison suggests that distribution of the dependent variable to be model results generally are within measurement calculated by the model. To fulfill this error. The worst model error occurs in silt objective, a database describing the lenses and is in excess of measurement error.
Rautman, C.A.
1995-09-01
Two-dimensional, heterogeneous, spatially correlated models of thermal conductivity and bulk density have been created for a representative, east-west cross section of Yucca Mountain, Nevada, using geostatistical simulation. The thermal conductivity models are derived from spatially correlated, surrogate material-property models of porosity, through a multiple linear-regression equation, which expresses thermal conductivity as a function of porosity and initial temperature and saturation. Bulk-density values were obtained through a similar, linear-regression relationship with porosity. The use of a surrogate-property allows the use of spatially much-more-abundant porosity measurements to condition the simulations. Modeling was conducted in stratigraphic coordinates to represent original depositional continuity of material properties and the completed models were transformed to real-world coordinates to capture present-day tectonic tilting and faulting of the material-property units. Spatial correlation lengths required for geostatistical modeling were assumed, but are based on the results of previous transect-sampling and geostatistical-modeling work.
NASA Astrophysics Data System (ADS)
Schöniger, Anneli; Illman, Walter A.; Wöhling, Thomas; Nowak, Wolfgang
2015-12-01
Groundwater modelers face the challenge of how to assign representative parameter values to the studied aquifer. Several approaches are available to parameterize spatial heterogeneity in aquifer parameters. They differ in their conceptualization and complexity, ranging from homogeneous models to heterogeneous random fields. While it is common practice to invest more effort into data collection for models with a finer resolution of heterogeneities, there is a lack of advice which amount of data is required to justify a certain level of model complexity. In this study, we propose to use concepts related to Bayesian model selection to identify this balance. We demonstrate our approach on the characterization of a heterogeneous aquifer via hydraulic tomography in a sandbox experiment (Illman et al., 2010). We consider four increasingly complex parameterizations of hydraulic conductivity: (1) Effective homogeneous medium, (2) geology-based zonation, (3) interpolation by pilot points, and (4) geostatistical random fields. First, we investigate the shift in justified complexity with increasing amount of available data by constructing a model confusion matrix. This matrix indicates the maximum level of complexity that can be justified given a specific experimental setup. Second, we determine which parameterization is most adequate given the observed drawdown data. Third, we test how the different parameterizations perform in a validation setup. The results of our test case indicate that aquifer characterization via hydraulic tomography does not necessarily require (or justify) a geostatistical description. Instead, a zonation-based model might be a more robust choice, but only if the zonation is geologically adequate.
Bayesian Data-Model Fit Assessment for Structural Equation Modeling
ERIC Educational Resources Information Center
Levy, Roy
2011-01-01
Bayesian approaches to modeling are receiving an increasing amount of attention in the areas of model construction and estimation in factor analysis, structural equation modeling (SEM), and related latent variable models. However, model diagnostics and model criticism remain relatively understudied aspects of Bayesian SEM. This article describes…
Karagiannis-Voules, Dimitrios-Alexios; Odermatt, Peter; Biedermann, Patricia; Khieu, Virak; Schär, Fabian; Muth, Sinuon; Utzinger, Jürg; Vounatsou, Penelope
2015-01-01
Soil-transmitted helminth infections are intimately connected with poverty. Yet, there is a paucity of using socioeconomic proxies in spatially explicit risk profiling. We compiled household-level socioeconomic data pertaining to sanitation, drinking-water, education and nutrition from readily available Demographic and Health Surveys, Multiple Indicator Cluster Surveys and World Health Surveys for Cambodia and aggregated the data at village level. We conducted a systematic review to identify parasitological surveys and made every effort possible to extract, georeference and upload the data in the open source Global Neglected Tropical Diseases database. Bayesian geostatistical models were employed to spatially align the village-aggregated socioeconomic predictors with the soil-transmitted helminth infection data. The risk of soil-transmitted helminth infection was predicted at a grid of 1×1km covering Cambodia. Additionally, two separate individual-level spatial analyses were carried out, for Takeo and Preah Vihear provinces, to assess and quantify the association between soil-transmitted helminth infection and socioeconomic indicators at an individual level. Overall, we obtained socioeconomic proxies from 1624 locations across the country. Surveys focussing on soil-transmitted helminth infections were extracted from 16 sources reporting data from 238 unique locations. We found that the risk of soil-transmitted helminth infection from 2000 onwards was considerably lower than in surveys conducted earlier. Population-adjusted prevalences for school-aged children from 2000 onwards were 28.7% for hookworm, 1.5% for Ascaris lumbricoides and 0.9% for Trichuris trichiura. Surprisingly, at the country-wide analyses, we did not find any significant association between soil-transmitted helminth infection and village-aggregated socioeconomic proxies. Based also on the individual-level analyses we conclude that socioeconomic proxies might not be good predictors at an
NASA Astrophysics Data System (ADS)
Linde, Niklas; Lochbühler, Tobias; Dogan, Mine; Van Dam, Remke L.
2015-12-01
We propose a new framework to compare alternative geostatistical descriptions of a given site. Multiple realizations of each of the considered geostatistical models and their corresponding tomograms (based on inversion of noise-contaminated simulated data) are used as a multivariate training image. The training image is scanned with a direct sampling algorithm to obtain conditional realizations of hydraulic conductivity that are not only in agreement with the geostatistical model, but also honor the spatially varying resolution of the site-specific tomogram. Model comparison is based on the quality of the simulated geophysical data from the ensemble of conditional realizations. The tomogram in this study is obtained by inversion of cross-hole ground-penetrating radar (GPR) first-arrival travel time data acquired at the MAcro-Dispersion Experiment (MADE) site in Mississippi (USA). Various heterogeneity descriptions ranging from multi-Gaussian fields to fields with complex multiple-point statistics inferred from outcrops are considered. Under the assumption that the relationship between porosity and hydraulic conductivity inferred from local measurements is valid, we find that conditioned multi-Gaussian realizations and derivatives thereof can explain the crosshole geophysical data. A training image based on an aquifer analog from Germany was found to be in better agreement with the geophysical data than the one based on the local outcrop, which appears to under-represent high hydraulic conductivity zones. These findings are only based on the information content in a single resolution-limited tomogram and extending the analysis to tracer or higher resolution surface GPR data might lead to different conclusions (e.g., that discrete facies boundaries are necessary). Our framework makes it possible to identify inadequate geostatistical models and petrophysical relationships, effectively narrowing the space of possible heterogeneity representations.
A conceptual sedimentological-geostatistical model of aquifer heterogeneity based on outcrop studies
Davis, J.M.
1994-01-01
Three outcrop studies were conducted in deposits of different depositional environments. At each site, permeability measurements were obtained with an air-minipermeameter developed as part of this study. In addition, the geological units were mapped with either surveying, photographs, or both. Geostatistical analysis of the permeability data was performed to estimate the characteristics of the probability distribution function and the spatial correlation structure. The information obtained from the geological mapping was then compared with the results of the geostatistical analysis for any relationships that may exist. The main field site was located in the Albuquerque Basin of central New Mexico at an outcrop of the Pliocene-Pleistocene Sierra Ladrones Formation. The second study was conducted on the walls of waste pits in alluvial fan deposits at the Nevada Test Site. The third study was conducted on an outcrop of an eolian deposit (miocene) south of Socorro, New Mexico. The results of the three studies were then used to construct a conceptual model relating depositional environment to geostatistical models of heterogeneity. The model presented is largely qualitative but provides a basis for further hypothesis formulation and testing.
The Bayesian bridge between simple and universal kriging
Omre, H.; Halvorsen, K.B. )
1989-10-01
Kriging techniques are suited well for evaluation of continuous, spatial phenomena. Bayesian statistics are characterized by using prior qualified guesses on the model parameters. By merging kriging techniques and Bayesian theory, prior guesses may be used in a spatial setting. Partial knowledge of model parameters defines a continuum of models between what is named simple and universal kriging in geostatistical terminology. The Bayesian approach to kriging is developed and discussed, and a case study concerning depth conversion of seismic reflection times is presented.
An Integrated Bayesian Model for DIF Analysis
ERIC Educational Resources Information Center
Soares, Tufi M.; Goncalves, Flavio B.; Gamerman, Dani
2009-01-01
In this article, an integrated Bayesian model for differential item functioning (DIF) analysis is proposed. The model is integrated in the sense of modeling the responses along with the DIF analysis. This approach allows DIF detection and explanation in a simultaneous setup. Previous empirical studies and/or subjective beliefs about the item…
Heterogeneous Factor Analysis Models: A Bayesian Approach.
ERIC Educational Resources Information Center
Ansari, Asim; Jedidi, Kamel; Dube, Laurette
2002-01-01
Developed Markov Chain Monte Carlo procedures to perform Bayesian inference, model checking, and model comparison in heterogeneous factor analysis. Tested the approach with synthetic data and data from a consumption emotion study involving 54 consumers. Results show that traditional psychometric methods cannot fully capture the heterogeneity in…
Survey of Bayesian Models for Modelling of Stochastic Temporal Processes
Ng, B
2006-10-12
This survey gives an overview of popular generative models used in the modeling of stochastic temporal systems. In particular, this survey is organized into two parts. The first part discusses the discrete-time representations of dynamic Bayesian networks and dynamic relational probabilistic models, while the second part discusses the continuous-time representation of continuous-time Bayesian networks.
NASA Astrophysics Data System (ADS)
Hodgetts, David; Burnham, Brian; Head, William; Jonathan, Atunima; Rarity, Franklin; Seers, Thomas; Spence, Guy
2013-04-01
In the hydrocarbon industry stochastic approaches are the main method by which reservoirs are modelled. These stochastic modelling approaches require geostatistical information on the geometry and distribution of the geological elements of the reservoir. As the reservoir itself cannot be viewed directly (only indirectly via seismic and/or well log data) this leads to a great deal of uncertainty in the geostatistics used, therefore outcrop analogues are characterised to help obtain the geostatistical information required to model the reservoir. Lidar derived Digital Outcrop Model's (DOM's) provide the ability to collect large quantities of statistical information on the geological architecture of the outcrop, far more than is possible by field work alone as the DOM allows accurate measurements to be made in normally inaccessible parts of the exposure. This increases the size of the measured statistical dataset, which in turn results in an increase in statistical significance. There are, however, many problems and biases in the data which cannot be overcome by sample size alone. These biases, for example, may relate to the orientation, size and quality of exposure, as well as the resolution of the DOM itself. Stochastic modelling used in the hydrocarbon industry fall mainly into 4 generic approaches: 1) Object Modelling where the geology is defined by a set of simplistic shapes (such as channels), where parameters such as width, height and orientation, among others, can be defined. 2) Sequential Indicator Simulations where geological shapes are less well defined and the size and distribution are defined using variograms. 3) Multipoint statistics where training images are used to define shapes and relationships between geological elements and 4) Discrete Fracture Networks for fractures reservoirs where information on fracture size and distribution are required. Examples of using DOM's to assist with each of these modelling approaches are presented, highlighting the
Modelling the presence of disease under spatial misalignment using Bayesian latent Gaussian models.
Barber, Xavier; Conesa, David; Lladosa, Silvia; López-Quílez, Antonio
2016-01-01
Modelling patterns of the spatial incidence of diseases using local environmental factors has been a growing problem in the last few years. Geostatistical models have become popular lately because they allow estimating and predicting the underlying disease risk and relating it with possible risk factors. Our approach to these models is based on the fact that the presence/absence of a disease can be expressed with a hierarchical Bayesian spatial model that incorporates the information provided by the geographical and environmental characteristics of the region of interest. Nevertheless, our main interest here is to tackle the misalignment problem arising when information about possible covariates are partially (or totally) different than those of the observed locations and those in which we want to predict. As a result, we present two different models depending on the fact that there is uncertainty on the covariates or not. In both cases, Bayesian inference on the parameters and prediction of presence/absence in new locations are made by considering the model as a latent Gaussian model, which allows the use of the integrated nested Laplace approximation. In particular, the spatial effect is implemented with the stochastic partial differential equation approach. The methodology is evaluated on the presence of the Fasciola hepatica in Galicia, a North-West region of Spain. PMID:27087038
Can Geostatistical Models Represent Nature's Variability? An Analysis Using Flume Experiments
NASA Astrophysics Data System (ADS)
Scheidt, C.; Fernandes, A. M.; Paola, C.; Caers, J.
2015-12-01
The lack of understanding in the Earth's geological and physical processes governing sediment deposition render subsurface modeling subject to large uncertainty. Geostatistics is often used to model uncertainty because of its capability to stochastically generate spatially varying realizations of the subsurface. These methods can generate a range of realizations of a given pattern - but how representative are these of the full natural variability? And how can we identify the minimum set of images that represent this natural variability? Here we use this minimum set to define the geostatistical prior model: a set of training images that represent the range of patterns generated by autogenic variability in the sedimentary environment under study. The proper definition of the prior model is essential in capturing the variability of the depositional patterns. This work starts with a set of overhead images from an experimental basin that showed ongoing autogenic variability. We use the images to analyze the essential characteristics of this suite of patterns. In particular, our goal is to define a prior model (a minimal set of selected training images) such that geostatistical algorithms, when applied to this set, can reproduce the full measured variability. A necessary prerequisite is to define a measure of variability. In this study, we measure variability using a dissimilarity distance between the images. The distance indicates whether two snapshots contain similar depositional patterns. To reproduce the variability in the images, we apply an MPS algorithm to the set of selected snapshots of the sedimentary basin that serve as training images. The training images are chosen from among the initial set by using the distance measure to ensure that only dissimilar images are chosen. Preliminary investigations show that MPS can reproduce fairly accurately the natural variability of the experimental depositional system. Furthermore, the selected training images provide
NASA Astrophysics Data System (ADS)
de Fouquet, Chantal; Malherbe, Laure; Ung, Anthony
2011-07-01
Deterministic models have become essential tools to forecast and map concentration fields of atmospheric pollutants like ozone. Those models are regularly updated and improved by incorporating recent theoretical developments and using more precise input data. Unavoidable differences with in situ measurements still remain, which need to be better understood. This study investigates those discrepancies in a geostatistical framework by comparing the temporal variability of ozone hourly surface concentrations simulated by a chemistry-transport model, CHIMERE, and measured across France. More than 200 rural and urban background monitoring sites are considered. The relationship between modelled and observed data is complex. Ozone concentrations evolve according to various time scales. CHIMERE correctly accounts for those different scales of variability but is usually unable to reproduce the exact magnitude of each temporal component. Such difficulty cannot be entirely attributed to the difference in spatial support between grid cell averages and punctual observations. As a result of this exploratory analysis, the common multivariate geostatistical model, known as the linear model of coregionalization, is used to describe the temporal variability of ozone hourly concentrations and the relationship between simulated and observed values at each observation point. The fitted parameters of the model can then be interpreted. Their distribution in space provides objective criteria to delimitate the areas where the chemistry-transport model is more or less reliable.
NASA Astrophysics Data System (ADS)
Golay, Jean; Kanevski, Mikhaïl
2013-04-01
The present research deals with the exploration and modeling of a complex dataset of 200 measurement points of sediment pollution by heavy metals in Lake Geneva. The fundamental idea was to use multivariate Artificial Neural Networks (ANN) along with geostatistical models and tools in order to improve the accuracy and the interpretability of data modeling. The results obtained with ANN were compared to those of traditional geostatistical algorithms like ordinary (co)kriging and (co)kriging with an external drift. Exploratory data analysis highlighted a great variety of relationships (i.e. linear, non-linear, independence) between the 11 variables of the dataset (i.e. Cadmium, Mercury, Zinc, Copper, Titanium, Chromium, Vanadium and Nickel as well as the spatial coordinates of the measurement points and their depth). Then, exploratory spatial data analysis (i.e. anisotropic variography, local spatial correlations and moving window statistics) was carried out. It was shown that the different phenomena to be modeled were characterized by high spatial anisotropies, complex spatial correlation structures and heteroscedasticity. A feature selection procedure based on General Regression Neural Networks (GRNN) was also applied to create subsets of variables enabling to improve the predictions during the modeling phase. The basic modeling was conducted using a Multilayer Perceptron (MLP) which is a workhorse of ANN. MLP models are robust and highly flexible tools which can incorporate in a nonlinear manner different kind of high-dimensional information. In the present research, the input layer was made of either two (spatial coordinates) or three neurons (when depth as auxiliary information could possibly capture an underlying trend) and the output layer was composed of one (univariate MLP) to eight neurons corresponding to the heavy metals of the dataset (multivariate MLP). MLP models with three input neurons can be referred to as Artificial Neural Networks with EXternal
Hierarchical Bayesian model updating for structural identification
NASA Astrophysics Data System (ADS)
Behmanesh, Iman; Moaveni, Babak; Lombaert, Geert; Papadimitriou, Costas
2015-12-01
A new probabilistic finite element (FE) model updating technique based on Hierarchical Bayesian modeling is proposed for identification of civil structural systems under changing ambient/environmental conditions. The performance of the proposed technique is investigated for (1) uncertainty quantification of model updating parameters, and (2) probabilistic damage identification of the structural systems. Accurate estimation of the uncertainty in modeling parameters such as mass or stiffness is a challenging task. Several Bayesian model updating frameworks have been proposed in the literature that can successfully provide the "parameter estimation uncertainty" of model parameters with the assumption that there is no underlying inherent variability in the updating parameters. However, this assumption may not be valid for civil structures where structural mass and stiffness have inherent variability due to different sources of uncertainty such as changing ambient temperature, temperature gradient, wind speed, and traffic loads. Hierarchical Bayesian model updating is capable of predicting the overall uncertainty/variability of updating parameters by assuming time-variability of the underlying linear system. A general solution based on Gibbs Sampler is proposed to estimate the joint probability distributions of the updating parameters. The performance of the proposed Hierarchical approach is evaluated numerically for uncertainty quantification and damage identification of a 3-story shear building model. Effects of modeling errors and incomplete modal data are considered in the numerical study.
Normativity, interpretation, and Bayesian models
Oaksford, Mike
2014-01-01
It has been suggested that evaluative normativity should be expunged from the psychology of reasoning. A broadly Davidsonian response to these arguments is presented. It is suggested that two distinctions, between different types of rationality, are more permeable than this argument requires and that the fundamental objection is to selecting theories that make the most rational sense of the data. It is argued that this is inevitable consequence of radical interpretation where understanding others requires assuming they share our own norms of reasoning. This requires evaluative normativity and it is shown that when asked to evaluate others’ arguments participants conform to rational Bayesian norms. It is suggested that logic and probability are not in competition and that the variety of norms is more limited than the arguments against evaluative normativity suppose. Moreover, the universality of belief ascription suggests that many of our norms are universal and hence evaluative. It is concluded that the union of evaluative normativity and descriptive psychology implicit in Davidson and apparent in the psychology of reasoning is a good thing. PMID:24860519
Boden, Sven; Rogiers, Bart; Jacques, Diederik
2013-09-01
Decommissioning of nuclear building structures usually leads to large amounts of low level radioactive waste. Using a reliable method to determine the contamination depth is indispensable prior to the start of decontamination works and also for minimizing the radioactive waste volume and the total workload. The method described in this paper is based on geostatistical modeling of in situ gamma-ray spectroscopy measurements using the multiple photo peak method. The method has been tested on the floor of the waste gas surge tank room within the BR3 (Belgian Reactor 3) decommissioning project and has delivered adequate results. PMID:23722072
Posterior Predictive Bayesian Phylogenetic Model Selection
Lewis, Paul O.; Xie, Wangang; Chen, Ming-Hui; Fan, Yu; Kuo, Lynn
2014-01-01
We present two distinctly different posterior predictive approaches to Bayesian phylogenetic model selection and illustrate these methods using examples from green algal protein-coding cpDNA sequences and flowering plant rDNA sequences. The Gelfand–Ghosh (GG) approach allows dissection of an overall measure of model fit into components due to posterior predictive variance (GGp) and goodness-of-fit (GGg), which distinguishes this method from the posterior predictive P-value approach. The conditional predictive ordinate (CPO) method provides a site-specific measure of model fit useful for exploratory analyses and can be combined over sites yielding the log pseudomarginal likelihood (LPML) which is useful as an overall measure of model fit. CPO provides a useful cross-validation approach that is computationally efficient, requiring only a sample from the posterior distribution (no additional simulation is required). Both GG and CPO add new perspectives to Bayesian phylogenetic model selection based on the predictive abilities of models and complement the perspective provided by the marginal likelihood (including Bayes Factor comparisons) based solely on the fit of competing models to observed data. [Bayesian; conditional predictive ordinate; CPO; L-measure; LPML; model selection; phylogenetics; posterior predictive.] PMID:24193892
Building on crossvalidation for increasing the quality of geostatistical modeling
Olea, R.A.
2012-01-01
The random function is a mathematical model commonly used in the assessment of uncertainty associated with a spatially correlated attribute that has been partially sampled. There are multiple algorithms for modeling such random functions, all sharing the requirement of specifying various parameters that have critical influence on the results. The importance of finding ways to compare the methods and setting parameters to obtain results that better model uncertainty has increased as these algorithms have grown in number and complexity. Crossvalidation has been used in spatial statistics, mostly in kriging, for the analysis of mean square errors. An appeal of this approach is its ability to work with the same empirical sample available for running the algorithms. This paper goes beyond checking estimates by formulating a function sensitive to conditional bias. Under ideal conditions, such function turns into a straight line, which can be used as a reference for preparing measures of performance. Applied to kriging, deviations from the ideal line provide sensitivity to the semivariogram lacking in crossvalidation of kriging errors and are more sensitive to conditional bias than analyses of errors. In terms of stochastic simulation, in addition to finding better parameters, the deviations allow comparison of the realizations resulting from the applications of different methods. Examples show improvements of about 30% in the deviations and approximately 10% in the square root of mean square errors between reasonable starting modelling and the solutions according to the new criteria. ?? 2011 US Government.
A Bayesian Model of Sensory Adaptation
Sato, Yoshiyuki; Aihara, Kazuyuki
2011-01-01
Recent studies reported two opposite types of adaptation in temporal perception. Here, we propose a Bayesian model of sensory adaptation that exhibits both types of adaptation. We regard adaptation as the adaptive updating of estimations of time-evolving variables, which determine the mean value of the likelihood function and that of the prior distribution in a Bayesian model of temporal perception. On the basis of certain assumptions, we can analytically determine the mean behavior in our model and identify the parameters that determine the type of adaptation that actually occurs. The results of our model suggest that we can control the type of adaptation by controlling the statistical properties of the stimuli presented. PMID:21541346
Joint space-time geostatistical model for air quality surveillance
NASA Astrophysics Data System (ADS)
Russo, A.; Soares, A.; Pereira, M. J.
2009-04-01
Air pollution and peoples' generalized concern about air quality are, nowadays, considered to be a global problem. Although the introduction of rigid air pollution regulations has reduced pollution from industry and power stations, the growing number of cars on the road poses a new pollution problem. Considering the characteristics of the atmospheric circulation and also the residence times of certain pollutants in the atmosphere, a generalized and growing interest on air quality issues led to research intensification and publication of several articles with quite different levels of scientific depth. As most natural phenomena, air quality can be seen as a space-time process, where space-time relationships have usually quite different characteristics and levels of uncertainty. As a result, the simultaneous integration of space and time is not an easy task to perform. This problem is overcome by a variety of methodologies. The use of stochastic models and neural networks to characterize space-time dispersion of air quality is becoming a common practice. The main objective of this work is to produce an air quality model which allows forecasting critical concentration episodes of a certain pollutant by means of a hybrid approach, based on the combined use of neural network models and stochastic simulations. A stochastic simulation of the spatial component with a space-time trend model is proposed to characterize critical situations, taking into account data from the past and a space-time trend from the recent past. To identify near future critical episodes, predicted values from neural networks are used at each monitoring station. In this paper, we describe the design of a hybrid forecasting tool for ambient NO2 concentrations in Lisbon, Portugal.
Estimating malaria burden in Nigeria: a geostatistical modelling approach.
Onyiri, Nnadozie
2015-01-01
This study has produced a map of malaria prevalence in Nigeria based on available data from the Mapping Malaria Risk in Africa (MARA) database, including all malaria prevalence surveys in Nigeria that could be geolocated, as well as data collected during fieldwork in Nigeria between March and June 2007. Logistic regression was fitted to malaria prevalence to identify significant demographic (age) and environmental covariates in STATA. The following environmental covariates were included in the spatial model: the normalized difference vegetation index, the enhanced vegetation index, the leaf area index, the land surface temperature for day and night, land use/landcover (LULC), distance to water bodies, and rainfall. The spatial model created suggests that the two main environmental covariates correlating with malaria presence were land surface temperature for day and rainfall. It was also found that malaria prevalence increased with distance to water bodies up to 4 km. The malaria risk map estimated from the spatial model shows that malaria prevalence in Nigeria varies from 20% in certain areas to 70% in others. The highest prevalence rates were found in the Niger Delta states of Rivers and Bayelsa, the areas surrounding the confluence of the rivers Niger and Benue, and also isolated parts of the north-eastern and north-western parts of the country. Isolated patches of low malaria prevalence were found to be scattered around the country with northern Nigeria having more such areas than the rest of the country. Nigeria's belt of middle regions generally has malaria prevalence of 40% and above. PMID:26618305
Bayesian network modelling of upper gastrointestinal bleeding
NASA Astrophysics Data System (ADS)
Aisha, Nazziwa; Shohaimi, Shamarina; Adam, Mohd Bakri
2013-09-01
Bayesian networks are graphical probabilistic models that represent causal and other relationships between domain variables. In the context of medical decision making, these models have been explored to help in medical diagnosis and prognosis. In this paper, we discuss the Bayesian network formalism in building medical support systems and we learn a tree augmented naive Bayes Network (TAN) from gastrointestinal bleeding data. The accuracy of the TAN in classifying the source of gastrointestinal bleeding into upper or lower source is obtained. The TAN achieves a high classification accuracy of 86% and an area under curve of 92%. A sensitivity analysis of the model shows relatively high levels of entropy reduction for color of the stool, history of gastrointestinal bleeding, consistency and the ratio of blood urea nitrogen to creatinine. The TAN facilitates the identification of the source of GIB and requires further validation.
Del Monego, Maurici; Ribeiro, Paulo Justiniano; Ramos, Patrícia
2015-04-01
In this work, kriging with covariates is used to model and map the spatial distribution of salinity measurements gathered by an autonomous underwater vehicle in a sea outfall monitoring campaign aiming to distinguish the effluent plume from the receiving waters and characterize its spatial variability in the vicinity of the discharge. Four different geostatistical linear models for salinity were assumed, where the distance to diffuser, the west-east positioning, and the south-north positioning were used as covariates. Sample variograms were fitted by the Matèrn models using weighted least squares and maximum likelihood estimation methods as a way to detect eventual discrepancies. Typically, the maximum likelihood method estimated very low ranges which have limited the kriging process. So, at least for these data sets, weighted least squares showed to be the most appropriate estimation method for variogram fitting. The kriged maps show clearly the spatial variation of salinity, and it is possible to identify the effluent plume in the area studied. The results obtained show some guidelines for sewage monitoring if a geostatistical analysis of the data is in mind. It is important to treat properly the existence of anomalous values and to adopt a sampling strategy that includes transects parallel and perpendicular to the effluent dispersion. PMID:25345922
A Bayesian model for visual space perception
NASA Technical Reports Server (NTRS)
Curry, R. E.
1972-01-01
A model for visual space perception is proposed that contains desirable features in the theories of Gibson and Brunswik. This model is a Bayesian processor of proximal stimuli which contains three important elements: an internal model of the Markov process describing the knowledge of the distal world, the a priori distribution of the state of the Markov process, and an internal model relating state to proximal stimuli. The universality of the model is discussed and it is compared with signal detection theory models. Experimental results of Kinchla are used as a special case.
Cay, Tayfun; Uyan, Mevlut
2009-12-01
Groundwater is one of the most important resources used for drinking and utility and irrigation purposes in the city of Konya, Turkey, as in many areas. The purpose of this study is to evaluate spatial and temporal changes in the level of groundwater by using geostatistical methods based on data from 91 groundwater wells during the period 1999 to 2003. Geostatistical methods have been used widely as a convenient tool to make decisions on the management of groundwater levels. To evaluate the spatial and temporal changes in the level of the groundwater, a vector-based geographic information system software package, ArcGIS 9.1 (Environmental Systems Research Institute, Redlands, California), was used for the application of an ordinary kriging method, with cross-validation leading to the estimation of groundwater levels. The average value of variogram (spherical model) for the spatial analysis was approximately 2150 m. Results of ordinary kriging for groundwater level drops were underestimated by 17%. Cross-validation errors were within an acceptable level. The kriging model also helps to detect risk-prone areas for groundwater abstraction. PMID:20099631
Rhea, L.; Person, M.; Marsily, G. de; Ledoux, E.; Galli, A.
1994-11-01
This paper critically evaluates the utility of two different geostatistical methods in tracing long-distance oil migration through sedimentary basins. Geostatistical models of petroleum migration based on kriging and the conditional simulation method are assessed by comparing them to {open_quotes}known{close_quotes} oil migration rates and directions through a numerical carrier bed. In this example, the numerical carrier bed, which serves as {open_quotes}ground truth{close_quotes} in the study, incorporates a synthetic permeability field generated using the method of turning bands. Different representations of lateral permeability heterogeneity of the carrier bed are incorporated into a quasi-three-dimensional model of secondary oil migration. The geometric configuration of the carrier bed is intended to represent migration conditions within the center of a saucer-shaped intracratonic sag basin. In all of the numerical experiments, oil is sourced in the lowest 10% of a saucer-shaped carrier bed and migrates 10-14 km outward in a radial fashion by buoyancy. The effects of vertical permeability variations on secondary oil migration were not considered in the study.
NASA Astrophysics Data System (ADS)
Atzberger, C.; Richter, K.
2009-09-01
The robust and accurate retrieval of vegetation biophysical variables using radiative transfer models (RTM) is seriously hampered by the ill-posedness of the inverse problem. With this research we further develop our previously published (object-based) inversion approach [Atzberger (2004)]. The object-based RTM inversion takes advantage of the geostatistical fact that the biophysical characteristics of nearby pixel are generally more similar than those at a larger distance. A two-step inversion based on PROSPECT+SAIL generated look-up-tables is presented that can be easily implemented and adapted to other radiative transfer models. The approach takes into account the spectral signatures of neighboring pixel and optimizes a common value of the average leaf angle (ALA) for all pixel of a given image object, such as an agricultural field. Using a large set of leaf area index (LAI) measurements (n = 58) acquired over six different crops of the Barrax test site, Spain), we demonstrate that the proposed geostatistical regularization yields in most cases more accurate and spatially consistent results compared to the traditional (pixel-based) inversion. Pros and cons of the approach are discussed and possible future extensions presented.
Bayesian population modeling of drug dosing adherence.
Fellows, Kelly; Stoneking, Colin J; Ramanathan, Murali
2015-10-01
Adherence is a frequent contributing factor to variations in drug concentrations and efficacy. The purpose of this work was to develop an integrated population model to describe variation in adherence, dose-timing deviations, overdosing and persistence to dosing regimens. The hybrid Markov chain-von Mises method for modeling adherence in individual subjects was extended to the population setting using a Bayesian approach. Four integrated population models for overall adherence, the two-state Markov chain transition parameters, dose-timing deviations, overdosing and persistence were formulated and critically compared. The Markov chain-Monte Carlo algorithm was used for identifying distribution parameters and for simulations. The model was challenged with medication event monitoring system data for 207 hypertension patients. The four Bayesian models demonstrated good mixing and convergence characteristics. The distributions of adherence, dose-timing deviations, overdosing and persistence were markedly non-normal and diverse. The models varied in complexity and the method used to incorporate inter-dependence with the preceding dose in the two-state Markov chain. The model that incorporated a cooperativity term for inter-dependence and a hyperbolic parameterization of the transition matrix probabilities was identified as the preferred model over the alternatives. The simulated probability densities from the model satisfactorily fit the observed probability distributions of adherence, dose-timing deviations, overdosing and persistence parameters in the sample patients. The model also adequately described the median and observed quartiles for these parameters. The Bayesian model for adherence provides a parsimonious, yet integrated, description of adherence in populations. It may find potential applications in clinical trial simulations and pharmacokinetic-pharmacodynamic modeling. PMID:26319548
Bayesian model selection analysis of WMAP3
Parkinson, David; Mukherjee, Pia; Liddle, Andrew R.
2006-06-15
We present a Bayesian model selection analysis of WMAP3 data using our code CosmoNest. We focus on the density perturbation spectral index n{sub S} and the tensor-to-scalar ratio r, which define the plane of slow-roll inflationary models. We find that while the Bayesian evidence supports the conclusion that n{sub S}{ne}1, the data are not yet powerful enough to do so at a strong or decisive level. If tensors are assumed absent, the current odds are approximately 8 to 1 in favor of n{sub S}{ne}1 under our assumptions, when WMAP3 data is used together with external data sets. WMAP3 data on its own is unable to distinguish between the two models. Further, inclusion of r as a parameter weakens the conclusion against the Harrison-Zel'dovich case (n{sub S}=1, r=0), albeit in a prior-dependent way. In appendices we describe the CosmoNest code in detail, noting its ability to supply posterior samples as well as to accurately compute the Bayesian evidence. We make a first public release of CosmoNest, now available at www.cosmonest.org.
Bayesian Model Selection for Group Studies
Stephan, Klaas Enno; Penny, Will D.; Daunizeau, Jean; Moran, Rosalyn J.; Friston, Karl J.
2009-01-01
Bayesian model selection (BMS) is a powerful method for determining the most likely among a set of competing hypotheses about the mechanisms that generated observed data. BMS has recently found widespread application in neuroimaging, particularly in the context of dynamic causal modelling (DCM). However, so far, combining BMS results from several subjects has relied on simple (fixed effects) metrics, e.g. the group Bayes factor (GBF), that do not account for group heterogeneity or outliers. In this paper, we compare the GBF with two random effects methods for BMS at the between-subject or group level. These methods provide inference on model-space using a classical and Bayesian perspective respectively. First, a classical (frequentist) approach uses the log model evidence as a subject-specific summary statistic. This enables one to use analysis of variance to test for differences in log-evidences over models, relative to inter-subject differences. We then consider the same problem in Bayesian terms and describe a novel hierarchical model, which is optimised to furnish a probability density on the models themselves. This new variational Bayes method rests on treating the model as a random variable and estimating the parameters of a Dirichlet distribution which describes the probabilities for all models considered. These probabilities then define a multinomial distribution over model space, allowing one to compute how likely it is that a specific model generated the data of a randomly chosen subject as well as the exceedance probability of one model being more likely than any other model. Using empirical and synthetic data, we show that optimising a conditional density of the model probabilities, given the log-evidences for each model over subjects, is more informative and appropriate than both the GBF and frequentist tests of the log-evidences. In particular, we found that the hierarchical Bayesian approach is considerably more robust than either of the other
Bayesian structural equation modeling in sport and exercise psychology.
Stenling, Andreas; Ivarsson, Andreas; Johnson, Urban; Lindwall, Magnus
2015-08-01
Bayesian statistics is on the rise in mainstream psychology, but applications in sport and exercise psychology research are scarce. In this article, the foundations of Bayesian analysis are introduced, and we will illustrate how to apply Bayesian structural equation modeling in a sport and exercise psychology setting. More specifically, we contrasted a confirmatory factor analysis on the Sport Motivation Scale II estimated with the most commonly used estimator, maximum likelihood, and a Bayesian approach with weakly informative priors for cross-loadings and correlated residuals. The results indicated that the model with Bayesian estimation and weakly informative priors provided a good fit to the data, whereas the model estimated with a maximum likelihood estimator did not produce a well-fitting model. The reasons for this discrepancy between maximum likelihood and Bayesian estimation are discussed as well as potential advantages and caveats with the Bayesian approach. PMID:26442771
Local Geostatistical Models and Big Data in Hydrological and Ecological Applications
NASA Astrophysics Data System (ADS)
Hristopulos, Dionissios
2015-04-01
The advent of the big data era creates new opportunities for environmental and ecological modelling but also presents significant challenges. The availability of remote sensing images and low-cost wireless sensor networks implies that spatiotemporal environmental data to cover larger spatial domains at higher spatial and temporal resolution for longer time windows. Handling such voluminous data presents several technical and scientific challenges. In particular, the geostatistical methods used to process spatiotemporal data need to overcome the dimensionality curse associated with the need to store and invert large covariance matrices. There are various mathematical approaches for addressing the dimensionality problem, including change of basis, dimensionality reduction, hierarchical schemes, and local approximations. We present a Stochastic Local Interaction (SLI) model that can be used to model local correlations in spatial data. SLI is a random field model suitable for data on discrete supports (i.e., regular lattices or irregular sampling grids). The degree of localization is determined by means of kernel functions and appropriate bandwidths. The strength of the correlations is determined by means of coefficients. In the "plain vanilla" version the parameter set involves scale and rigidity coefficients as well as a characteristic length. The latter determines in connection with the rigidity coefficient the correlation length of the random field. The SLI model is based on statistical field theory and extends previous research on Spartan spatial random fields [2,3] from continuum spaces to explicitly discrete supports. The SLI kernel functions employ adaptive bandwidths learned from the sampling spatial distribution [1]. The SLI precision matrix is expressed explicitly in terms of the model parameter and the kernel function. Hence, covariance matrix inversion is not necessary for parameter inference that is based on leave-one-out cross validation. This property
Bayesian Nonparametric Models for Multiway Data Analysis.
Xu, Zenglin; Yan, Feng; Qi, Yuan
2015-02-01
Tensor decomposition is a powerful computational tool for multiway data analysis. Many popular tensor decomposition approaches-such as the Tucker decomposition and CANDECOMP/PARAFAC (CP)-amount to multi-linear factorization. They are insufficient to model (i) complex interactions between data entities, (ii) various data types (e.g., missing data and binary data), and (iii) noisy observations and outliers. To address these issues, we propose tensor-variate latent nonparametric Bayesian models for multiway data analysis. We name these models InfTucker. These new models essentially conduct Tucker decomposition in an infinite feature space. Unlike classical tensor decomposition models, our new approaches handle both continuous and binary data in a probabilistic framework. Unlike previous Bayesian models on matrices and tensors, our models are based on latent Gaussian or t processes with nonlinear covariance functions. Moreover, on network data, our models reduce to nonparametric stochastic blockmodels and can be used to discover latent groups and predict missing interactions. To learn the models efficiently from data, we develop a variational inference technique and explore properties of the Kronecker product for computational efficiency. Compared with a classical variational implementation, this technique reduces both time and space complexities by several orders of magnitude. On real multiway and network data, our new models achieved significantly higher prediction accuracy than state-of-art tensor decomposition methods and blockmodels. PMID:26353255
Modelling ambient ozone in an urban area using an objective model and geostatistical algorithms
NASA Astrophysics Data System (ADS)
Moral, Francisco J.; Rebollo, Francisco J.; Valiente, Pablo; López, Fernando; Muñoz de la Peña, Arsenio
2012-12-01
Ground-level tropospheric ozone is one of the air pollutants of most concern. Ozone levels continue to exceed both target values and the long-term objectives established in EU legislation to protect human health and prevent damage to ecosystems, agricultural crops and materials. Researchers or decision-makers frequently need information about atmospheric pollution patterns in urbanized areas. The preparation of this type of information is a complex task, due to the influence of several factors and their variability over time. In this work, some results of urban ozone distribution patterns in the city of Badajoz, which is the largest (140,000 inhabitants) and most industrialized city in Extremadura region (southwest Spain) are shown. Twelve sampling campaigns, one per month, were carried out to measure ambient air ozone concentrations, during periods that were selected according to favourable conditions to ozone production, using an automatic portable analyzer. Later, to evaluate the overall ozone level at each sampling location during the time interval considered, the measured ozone data were analysed using a new methodology based on the formulation of the Rasch model. As a result, a measure of overall ozone level which consolidates the monthly ground-level ozone measurements was obtained, getting moreover information about the influence on the overall ozone level of each monthly ozone measure. Finally, overall ozone level at locations where no measurements were available was estimated with geostatistical techniques and hazard assessment maps based on the spatial distribution of ozone were also generated.
On Bayesian estimation of marginal structural models.
Saarela, Olli; Stephens, David A; Moodie, Erica E M; Klein, Marina B
2015-06-01
The purpose of inverse probability of treatment (IPT) weighting in estimation of marginal treatment effects is to construct a pseudo-population without imbalances in measured covariates, thus removing the effects of confounding and informative censoring when performing inference. In this article, we formalize the notion of such a pseudo-population as a data generating mechanism with particular characteristics, and show that this leads to a natural Bayesian interpretation of IPT weighted estimation. Using this interpretation, we are able to propose the first fully Bayesian procedure for estimating parameters of marginal structural models using an IPT weighting. Our approach suggests that the weights should be derived from the posterior predictive treatment assignment and censoring probabilities, answering the question of whether and how the uncertainty in the estimation of the weights should be incorporated in Bayesian inference of marginal treatment effects. The proposed approach is compared to existing methods in simulated data, and applied to an analysis of the Canadian Co-infection Cohort. PMID:25677103
Geostatistical three-dimensional modeling of oolite shoals, St. Louis Limestone, southwest Kansas
Qi, L.; Carr, T.R.; Goldstein, R.H.
2007-01-01
In the Hugoton embayment of southwestern Kansas, reservoirs composed of relatively thin (<4 m; <13.1 ft) oolitic deposits within the St. Louis Limestone have produced more than 300 million bbl of oil. The geometry and distribution of oolitic deposits control the heterogeneity of the reservoirs, resulting in exploration challenges and relatively low recovery. Geostatistical three-dimensional (3-D) models were constructed to quantify the geometry and spatial distribution of oolitic reservoirs, and the continuity of flow units within Big Bow and Sand Arroyo Creek fields. Lithofacies in uncored wells were predicted from digital logs using a neural network. The tilting effect from the Laramide orogeny was removed to construct restored structural surfaces at the time of deposition. Well data and structural maps were integrated to build 3-D models of oolitic reservoirs using stochastic simulations with geometry data. Three-dimensional models provide insights into the distribution, the external and internal geometry of oolitic deposits, and the sedimentologic processes that generated reservoir intervals. The structural highs and general structural trend had a significant impact on the distribution and orientation of the oolitic complexes. The depositional pattern and connectivity analysis suggest an overall aggradation of shallow-marine deposits during pulses of relative sea level rise followed by deepening near the top of the St. Louis Limestone. Cemented oolitic deposits were modeled as barriers and baffles and tend to concentrate at the edge of oolitic complexes. Spatial distribution of porous oolitic deposits controls the internal geometry of rock properties. Integrated geostatistical modeling methods can be applicable to other complex carbonate or siliciclastic reservoirs in shallow-marine settings. Copyright ?? 2007. The American Association of Petroleum Geologists. All rights reserved.
Bayesian Kinematic Finite Fault Source Models (Invited)
NASA Astrophysics Data System (ADS)
Minson, S. E.; Simons, M.; Beck, J. L.
2010-12-01
Finite fault earthquake source models are inherently under-determined: there is no unique solution to the inverse problem of determining the rupture history at depth as a function of time and space when our data are only limited observations at the Earth's surface. Traditional inverse techniques rely on model constraints and regularization to generate one model from the possibly broad space of all possible solutions. However, Bayesian methods allow us to determine the ensemble of all possible source models which are consistent with the data and our a priori assumptions about the physics of the earthquake source. Until now, Bayesian techniques have been of limited utility because they are computationally intractable for problems with as many free parameters as kinematic finite fault models. We have developed a methodology called Cascading Adaptive Tempered Metropolis In Parallel (CATMIP) which allows us to sample very high-dimensional problems in a parallel computing framework. The CATMIP algorithm combines elements of simulated annealing and genetic algorithms with the Metropolis algorithm to dynamically optimize the algorithm's efficiency as it runs. We will present synthetic performance tests of finite fault models made with this methodology as well as a kinematic source model for the 2007 Mw 7.7 Tocopilla, Chile earthquake. This earthquake was well recorded by multiple ascending and descending interferograms and a network of high-rate GPS stations whose records can be used as near-field seismograms.
Enhancing multiple-point geostatistical modeling: 1. Graph theory and pattern adjustment
NASA Astrophysics Data System (ADS)
Tahmasebi, Pejman; Sahimi, Muhammad
2016-03-01
In recent years, higher-order geostatistical methods have been used for modeling of a wide variety of large-scale porous media, such as groundwater aquifers and oil reservoirs. Their popularity stems from their ability to account for qualitative data and the great flexibility that they offer for conditioning the models to hard (quantitative) data, which endow them with the capability for generating realistic realizations of porous formations with very complex channels, as well as features that are mainly a barrier to fluid flow. One group of such models consists of pattern-based methods that use a set of data points for generating stochastic realizations by which the large-scale structure and highly-connected features are reproduced accurately. The cross correlation-based simulation (CCSIM) algorithm, proposed previously by the authors, is a member of this group that has been shown to be capable of simulating multimillion cell models in a matter of a few CPU seconds. The method is, however, sensitive to pattern's specifications, such as boundaries and the number of replicates. In this paper the original CCSIM algorithm is reconsidered and two significant improvements are proposed for accurately reproducing large-scale patterns of heterogeneities in porous media. First, an effective boundary-correction method based on the graph theory is presented by which one identifies the optimal cutting path/surface for removing the patchiness and discontinuities in the realization of a porous medium. Next, a new pattern adjustment method is proposed that automatically transfers the features in a pattern to one that seamlessly matches the surrounding patterns. The original CCSIM algorithm is then combined with the two methods and is tested using various complex two- and three-dimensional examples. It should, however, be emphasized that the methods that we propose in this paper are applicable to other pattern-based geostatistical simulation methods.
A Bayesian Shrinkage Approach for AMMI Models.
da Silva, Carlos Pereira; de Oliveira, Luciano Antonio; Nuvunga, Joel Jorge; Pamplona, Andrezza Kéllen Alves; Balestre, Marcio
2015-01-01
Linear-bilinear models, especially the additive main effects and multiplicative interaction (AMMI) model, are widely applicable to genotype-by-environment interaction (GEI) studies in plant breeding programs. These models allow a parsimonious modeling of GE interactions, retaining a small number of principal components in the analysis. However, one aspect of the AMMI model that is still debated is the selection criteria for determining the number of multiplicative terms required to describe the GE interaction pattern. Shrinkage estimators have been proposed as selection criteria for the GE interaction components. In this study, a Bayesian approach was combined with the AMMI model with shrinkage estimators for the principal components. A total of 55 maize genotypes were evaluated in nine different environments using a complete blocks design with three replicates. The results show that the traditional Bayesian AMMI model produces low shrinkage of singular values but avoids the usual pitfalls in determining the credible intervals in the biplot. On the other hand, Bayesian shrinkage AMMI models have difficulty with the credible interval for model parameters, but produce stronger shrinkage of the principal components, converging to GE matrices that have more shrinkage than those obtained using mixed models. This characteristic allowed more parsimonious models to be chosen, and resulted in models being selected that were similar to those obtained by the Cornelius F-test (α = 0.05) in traditional AMMI models and cross validation based on leave-one-out. This characteristic allowed more parsimonious models to be chosen and more GEI pattern retained on the first two components. The resulting model chosen by posterior distribution of singular value was also similar to those produced by the cross-validation approach in traditional AMMI models. Our method enables the estimation of credible interval for AMMI biplot plus the choice of AMMI model based on direct posterior
A Bayesian Shrinkage Approach for AMMI Models
de Oliveira, Luciano Antonio; Nuvunga, Joel Jorge; Pamplona, Andrezza Kéllen Alves
2015-01-01
Linear-bilinear models, especially the additive main effects and multiplicative interaction (AMMI) model, are widely applicable to genotype-by-environment interaction (GEI) studies in plant breeding programs. These models allow a parsimonious modeling of GE interactions, retaining a small number of principal components in the analysis. However, one aspect of the AMMI model that is still debated is the selection criteria for determining the number of multiplicative terms required to describe the GE interaction pattern. Shrinkage estimators have been proposed as selection criteria for the GE interaction components. In this study, a Bayesian approach was combined with the AMMI model with shrinkage estimators for the principal components. A total of 55 maize genotypes were evaluated in nine different environments using a complete blocks design with three replicates. The results show that the traditional Bayesian AMMI model produces low shrinkage of singular values but avoids the usual pitfalls in determining the credible intervals in the biplot. On the other hand, Bayesian shrinkage AMMI models have difficulty with the credible interval for model parameters, but produce stronger shrinkage of the principal components, converging to GE matrices that have more shrinkage than those obtained using mixed models. This characteristic allowed more parsimonious models to be chosen, and resulted in models being selected that were similar to those obtained by the Cornelius F-test (α = 0.05) in traditional AMMI models and cross validation based on leave-one-out. This characteristic allowed more parsimonious models to be chosen and more GEI pattern retained on the first two components. The resulting model chosen by posterior distribution of singular value was also similar to those produced by the cross-validation approach in traditional AMMI models. Our method enables the estimation of credible interval for AMMI biplot plus the choice of AMMI model based on direct posterior
Model Comparison of Bayesian Semiparametric and Parametric Structural Equation Models
ERIC Educational Resources Information Center
Song, Xin-Yuan; Xia, Ye-Mao; Pan, Jun-Hao; Lee, Sik-Yum
2011-01-01
Structural equation models have wide applications. One of the most important issues in analyzing structural equation models is model comparison. This article proposes a Bayesian model comparison statistic, namely the "L[subscript nu]"-measure for both semiparametric and parametric structural equation models. For illustration purposes, we consider…
A Nonparametric Bayesian Model for Nested Clustering.
Lee, Juhee; Müller, Peter; Zhu, Yitan; Ji, Yuan
2016-01-01
We propose a nonparametric Bayesian model for clustering where clusters of experimental units are determined by a shared pattern of clustering another set of experimental units. The proposed model is motivated by the analysis of protein activation data, where we cluster proteins such that all proteins in one cluster give rise to the same clustering of patients. That is, we define clusters of proteins by the way that patients group with respect to the corresponding protein activations. This is in contrast to (almost) all currently available models that use shared parameters in the sampling model to define clusters. This includes in particular model based clustering, Dirichlet process mixtures, product partition models, and more. We show results for two typical biostatistical inference problems that give rise to clustering. PMID:26519174
Bayesian nonparametric models for ranked set sampling.
Gemayel, Nader; Stasny, Elizabeth A; Wolfe, Douglas A
2015-04-01
Ranked set sampling (RSS) is a data collection technique that combines measurement with judgment ranking for statistical inference. This paper lays out a formal and natural Bayesian framework for RSS that is analogous to its frequentist justification, and that does not require the assumption of perfect ranking or use of any imperfect ranking models. Prior beliefs about the judgment order statistic distributions and their interdependence are embodied by a nonparametric prior distribution. Posterior inference is carried out by means of Markov chain Monte Carlo techniques, and yields estimators of the judgment order statistic distributions (and of functionals of those distributions). PMID:25326663
Bayesian POT modeling for historical data
NASA Astrophysics Data System (ADS)
Parent, Eric; Bernier, Jacques
2003-04-01
When designing hydraulic structures, civil engineers have to evaluate design floods, i.e. events generally much rarer that the ones that have already been systematically recorded. To extrapolate towards extreme value events, taking advantage of further information such as historical data, has been an early concern among hydrologists. Most methods described in the hydrological literature are designed from a frequentist interpretation of probabilities, although such probabilities are commonly interpreted as subjective decisional bets by the end user. This paper adopts a Bayesian setting to deal with the classical Poisson-Pareto peak over treshold (POT) model when a sample of historical data is available. Direct probalistic statements can be made about the unknown parameters, thus improving communication with decision makers. On the Garonne case study, we point out that twelve historical events, however imprecise they might be, greatly reduce uncertainty. The 90% credible interval for the 1000 year flood becomes 40% smaller when taking into account historical data. Any kind of uncertainty (model uncertainty, imprecise range for historical events, missing data) can be incorporated into the decision analysis. Tractable and versatile data augmentation algorithms are implemented by Monte Carlo Markov Chain tools. Advantage is taken from a semi-conjugate prior, flexible enough to elicit expert knowledge about extreme behavior of the river flows. The data augmentation algorithm allows to deal with imprecise historical data in the POT model. A direct hydrological meaning is given to the latent variables, which are the Bayesian keytool to model unobserved past floods in the historical series.
Model feedback in Bayesian propensity score estimation.
Zigler, Corwin M; Watts, Krista; Yeh, Robert W; Wang, Yun; Coull, Brent A; Dominici, Francesca
2013-03-01
Methods based on the propensity score comprise one set of valuable tools for comparative effectiveness research and for estimating causal effects more generally. These methods typically consist of two distinct stages: (1) a propensity score stage where a model is fit to predict the propensity to receive treatment (the propensity score), and (2) an outcome stage where responses are compared in treated and untreated units having similar values of the estimated propensity score. Traditional techniques conduct estimation in these two stages separately; estimates from the first stage are treated as fixed and known for use in the second stage. Bayesian methods have natural appeal in these settings because separate likelihoods for the two stages can be combined into a single joint likelihood, with estimation of the two stages carried out simultaneously. One key feature of joint estimation in this context is "feedback" between the outcome stage and the propensity score stage, meaning that quantities in a model for the outcome contribute information to posterior distributions of quantities in the model for the propensity score. We provide a rigorous assessment of Bayesian propensity score estimation to show that model feedback can produce poor estimates of causal effects absent strategies that augment propensity score adjustment with adjustment for individual covariates. We illustrate this phenomenon with a simulation study and with a comparative effectiveness investigation of carotid artery stenting versus carotid endarterectomy among 123,286 Medicare beneficiaries hospitlized for stroke in 2006 and 2007. PMID:23379793
Experience With Bayesian Image Based Surface Modeling
NASA Technical Reports Server (NTRS)
Stutz, John C.
2005-01-01
Bayesian surface modeling from images requires modeling both the surface and the image generation process, in order to optimize the models by comparing actual and generated images. Thus it differs greatly, both conceptually and in computational difficulty, from conventional stereo surface recovery techniques. But it offers the possibility of using any number of images, taken under quite different conditions, and by different instruments that provide independent and often complementary information, to generate a single surface model that fuses all available information. I describe an implemented system, with a brief introduction to the underlying mathematical models and the compromises made for computational efficiency. I describe successes and failures achieved on actual imagery, where we went wrong and what we did right, and how our approach could be improved. Lastly I discuss how the same approach can be extended to distinct types of instruments, to achieve true sensor fusion.
A Hierarchical Bayesian Model for Crowd Emotions.
Urizar, Oscar J; Baig, Mirza S; Barakova, Emilia I; Regazzoni, Carlo S; Marcenaro, Lucio; Rauterberg, Matthias
2016-01-01
Estimation of emotions is an essential aspect in developing intelligent systems intended for crowded environments. However, emotion estimation in crowds remains a challenging problem due to the complexity in which human emotions are manifested and the capability of a system to perceive them in such conditions. This paper proposes a hierarchical Bayesian model to learn in unsupervised manner the behavior of individuals and of the crowd as a single entity, and explore the relation between behavior and emotions to infer emotional states. Information about the motion patterns of individuals are described using a self-organizing map, and a hierarchical Bayesian network builds probabilistic models to identify behaviors and infer the emotional state of individuals and the crowd. This model is trained and tested using data produced from simulated scenarios that resemble real-life environments. The conducted experiments tested the efficiency of our method to learn, detect and associate behaviors with emotional states yielding accuracy levels of 74% for individuals and 81% for the crowd, similar in performance with existing methods for pedestrian behavior detection but with novel concepts regarding the analysis of crowds. PMID:27458366
A Hierarchical Bayesian Model for Crowd Emotions
Urizar, Oscar J.; Baig, Mirza S.; Barakova, Emilia I.; Regazzoni, Carlo S.; Marcenaro, Lucio; Rauterberg, Matthias
2016-01-01
Estimation of emotions is an essential aspect in developing intelligent systems intended for crowded environments. However, emotion estimation in crowds remains a challenging problem due to the complexity in which human emotions are manifested and the capability of a system to perceive them in such conditions. This paper proposes a hierarchical Bayesian model to learn in unsupervised manner the behavior of individuals and of the crowd as a single entity, and explore the relation between behavior and emotions to infer emotional states. Information about the motion patterns of individuals are described using a self-organizing map, and a hierarchical Bayesian network builds probabilistic models to identify behaviors and infer the emotional state of individuals and the crowd. This model is trained and tested using data produced from simulated scenarios that resemble real-life environments. The conducted experiments tested the efficiency of our method to learn, detect and associate behaviors with emotional states yielding accuracy levels of 74% for individuals and 81% for the crowd, similar in performance with existing methods for pedestrian behavior detection but with novel concepts regarding the analysis of crowds. PMID:27458366
Jones, Matt; Love, Bradley C
2011-08-01
The prominence of Bayesian modeling of cognition has increased recently largely because of mathematical advances in specifying and deriving predictions from complex probabilistic models. Much of this research aims to demonstrate that cognitive behavior can be explained from rational principles alone, without recourse to psychological or neurological processes and representations. We note commonalities between this rational approach and other movements in psychology - namely, Behaviorism and evolutionary psychology - that set aside mechanistic explanations or make use of optimality assumptions. Through these comparisons, we identify a number of challenges that limit the rational program's potential contribution to psychological theory. Specifically, rational Bayesian models are significantly unconstrained, both because they are uninformed by a wide range of process-level data and because their assumptions about the environment are generally not grounded in empirical measurement. The psychological implications of most Bayesian models are also unclear. Bayesian inference itself is conceptually trivial, but strong assumptions are often embedded in the hypothesis sets and the approximation algorithms used to derive model predictions, without a clear delineation between psychological commitments and implementational details. Comparing multiple Bayesian models of the same task is rare, as is the realization that many Bayesian models recapitulate existing (mechanistic level) theories. Despite the expressive power of current Bayesian models, we argue they must be developed in conjunction with mechanistic considerations to offer substantive explanations of cognition. We lay out several means for such an integration, which take into account the representations on which Bayesian inference operates, as well as the algorithms and heuristics that carry it out. We argue this unification will better facilitate lasting contributions to psychological theory, avoiding the pitfalls
Effect on Prediction when Modeling Covariates in Bayesian Nonparametric Models.
Cruz-Marcelo, Alejandro; Rosner, Gary L; Müller, Peter; Stewart, Clinton F
2013-04-01
In biomedical research, it is often of interest to characterize biologic processes giving rise to observations and to make predictions of future observations. Bayesian nonparametric methods provide a means for carrying out Bayesian inference making as few assumptions about restrictive parametric models as possible. There are several proposals in the literature for extending Bayesian nonparametric models to include dependence on covariates. Limited attention, however, has been directed to the following two aspects. In this article, we examine the effect on fitting and predictive performance of incorporating covariates in a class of Bayesian nonparametric models by one of two primary ways: either in the weights or in the locations of a discrete random probability measure. We show that different strategies for incorporating continuous covariates in Bayesian nonparametric models can result in big differences when used for prediction, even though they lead to otherwise similar posterior inferences. When one needs the predictive density, as in optimal design, and this density is a mixture, it is better to make the weights depend on the covariates. We demonstrate these points via a simulated data example and in an application in which one wants to determine the optimal dose of an anticancer drug used in pediatric oncology. PMID:23687472
BAYESIAN MODEL DETERMINATION FOR GEOSTATISTICAL REGRESSION MODELS. (R829095C001)
The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...
Merging Digital Surface Models Implementing Bayesian Approaches
NASA Astrophysics Data System (ADS)
Sadeq, H.; Drummond, J.; Li, Z.
2016-06-01
In this research different DSMs from different sources have been merged. The merging is based on a probabilistic model using a Bayesian Approach. The implemented data have been sourced from very high resolution satellite imagery sensors (e.g. WorldView-1 and Pleiades). It is deemed preferable to use a Bayesian Approach when the data obtained from the sensors are limited and it is difficult to obtain many measurements or it would be very costly, thus the problem of the lack of data can be solved by introducing a priori estimations of data. To infer the prior data, it is assumed that the roofs of the buildings are specified as smooth, and for that purpose local entropy has been implemented. In addition to the a priori estimations, GNSS RTK measurements have been collected in the field which are used as check points to assess the quality of the DSMs and to validate the merging result. The model has been applied in the West-End of Glasgow containing different kinds of buildings, such as flat roofed and hipped roofed buildings. Both quantitative and qualitative methods have been employed to validate the merged DSM. The validation results have shown that the model was successfully able to improve the quality of the DSMs and improving some characteristics such as the roof surfaces, which consequently led to better representations. In addition to that, the developed model has been compared with the well established Maximum Likelihood model and showed similar quantitative statistical results and better qualitative results. Although the proposed model has been applied on DSMs that were derived from satellite imagery, it can be applied to any other sourced DSMs.
Bayesian Models of Graphs, Arrays and Other Exchangeable Random Structures.
Orbanz, Peter; Roy, Daniel M
2015-02-01
The natural habitat of most Bayesian methods is data represented by exchangeable sequences of observations, for which de Finetti's theorem provides the theoretical foundation. Dirichlet process clustering, Gaussian process regression, and many other parametric and nonparametric Bayesian models fall within the remit of this framework; many problems arising in modern data analysis do not. This article provides an introduction to Bayesian models of graphs, matrices, and other data that can be modeled by random structures. We describe results in probability theory that generalize de Finetti's theorem to such data and discuss their relevance to nonparametric Bayesian modeling. With the basic ideas in place, we survey example models available in the literature; applications of such models include collaborative filtering, link prediction, and graph and network analysis. We also highlight connections to recent developments in graph theory and probability, and sketch the more general mathematical foundation of Bayesian methods for other types of data beyond sequences and arrays. PMID:26353253
Integration of geostatistical techniques and intuitive geology in the 3-D modeling process
Heine, C.J.; Cooper, D.H.
1995-08-01
The development of 3-D geologic models for reservoir description and simulation has traditionally relied on the computer derived interpolation of well data in a geocelluar stratigraphic framework. The quality of the interpolation has been directly dependent on the nature of the interpolation method, and ability of the interpolation scheme to accurately predict the value of geologic attributes away from the well. Typically, interpolation methods employ deterministic or geostatistical algorithms which offer limited capacity for integrating data derived from secondary analyses. These secondary analyses, which might include the results from 3-D seismic inversion, borehole imagery studies, or deductive reasoning, introduce a subjective component into what would otherwise be restricted to a purely mathematical treatment of geologic data. At Saudi ARAMCO an increased emphasis is being placed on the role of the reservoir geologist in the development of 3-D geologic models. Quantitative results, based on numerical computations, are being enhanced with intuitive geology, derived from years of cumulative professional experience and expertise. Techniques such as template modeling and modified conditional simulation, are yielding 3-D geologic models, which not only more accurately reflect the geology of the reservoir, but also preserve geologic detail throughout the simulation process. This incorporation of secondary data sources and qualitative analysis has been successfully demonstrated in a clastic reservoir environment in Central Saudi Arabia, and serves as a prototype for future 3-D geologic model development.
Integration of geostatistical techniques and intuitive geology in the 3-D modeling process
Heine, C.J.; Cooper, D.H. )
1996-01-01
The development of 3-D geologic models for reservoir description and simulation has traditionally relied on the computer derived interpolation of well data in a geocelluar stratigraphic framework. The quality of the interpolation has been directly dependent on the nature of the interpolation method, and ability of the Interpolation scheme to accurately predict the value of geologic attributes away from the well. Typically, interpolation methods employ deterministic or geostatistical algorithms which offer limited capacity for Integrating data derived from secondary analyses. These secondary analyses, which might include the results from 3-D seismic inversion, borehole imagery studies, or deductive reasoning, introduce a subjective component into what would otherwise be restricted to a purely mathematical treatment of geologic data. At Saudi ARAMCO an increased emphases is being placed on the role of the reservoir geologist in the development of 3-D geologic models. Quantitative results, based on numerical computations, are being enhanced with intuitive geology, derived from years of cumulative professional experience and expertise. Techniques such as template modeling and modified conditional simulation, are yielding 3-D geologic models, which not only more accurately reflect the geology of the reservoir, but also preserve geologic detail throughout the simulation process. This incorporation of secondary data sources and qualitative analysis has been successfully demonstrated in a clastic reservoir environment in Central Saudi Arabia, and serves as a prototype for future 3-D geologic model development.
Integration of geostatistical techniques and intuitive geology in the 3-D modeling process
Heine, C.J.; Cooper, D.H.
1996-12-31
The development of 3-D geologic models for reservoir description and simulation has traditionally relied on the computer derived interpolation of well data in a geocelluar stratigraphic framework. The quality of the interpolation has been directly dependent on the nature of the interpolation method, and ability of the Interpolation scheme to accurately predict the value of geologic attributes away from the well. Typically, interpolation methods employ deterministic or geostatistical algorithms which offer limited capacity for Integrating data derived from secondary analyses. These secondary analyses, which might include the results from 3-D seismic inversion, borehole imagery studies, or deductive reasoning, introduce a subjective component into what would otherwise be restricted to a purely mathematical treatment of geologic data. At Saudi ARAMCO an increased emphases is being placed on the role of the reservoir geologist in the development of 3-D geologic models. Quantitative results, based on numerical computations, are being enhanced with intuitive geology, derived from years of cumulative professional experience and expertise. Techniques such as template modeling and modified conditional simulation, are yielding 3-D geologic models, which not only more accurately reflect the geology of the reservoir, but also preserve geologic detail throughout the simulation process. This incorporation of secondary data sources and qualitative analysis has been successfully demonstrated in a clastic reservoir environment in Central Saudi Arabia, and serves as a prototype for future 3-D geologic model development.
NASA Astrophysics Data System (ADS)
Blessent, Daniela; Therrien, René; Lemieux, Jean-Michel
2011-12-01
This paper presents numerical simulations of a series of hydraulic interference tests conducted in crystalline bedrock at Olkiluoto (Finland), a potential site for the disposal of the Finnish high-level nuclear waste. The tests are in a block of crystalline bedrock of about 0.03 km3 that contains low-transmissivity fractures. Fracture density, orientation, and fracture transmissivity are estimated from Posiva Flow Log (PFL) measurements in boreholes drilled in the rock block. On the basis of those data, a geostatistical approach relying on a transitional probability and Markov chain models is used to define a conceptual model based on stochastic fractured rock facies. Four facies are defined, from sparsely fractured bedrock to highly fractured bedrock. Using this conceptual model, three-dimensional groundwater flow is then simulated to reproduce interference pumping tests in either open or packed-off boreholes. Hydraulic conductivities of the fracture facies are estimated through automatic calibration using either hydraulic heads or both hydraulic heads and PFL flow rates as targets for calibration. The latter option produces a narrower confidence interval for the calibrated hydraulic conductivities, therefore reducing the associated uncertainty and demonstrating the usefulness of the measured PFL flow rates. Furthermore, the stochastic facies conceptual model is a suitable alternative to discrete fracture network models to simulate fluid flow in fractured geological media.
Howes, Rosalind E.; Piel, Frédéric B.; Patil, Anand P.; Nyangiri, Oscar A.; Gething, Peter W.; Dewi, Mewahyu; Hogg, Mariana M.; Battle, Katherine E.; Padilla, Carmencita D.; Baird, J. Kevin; Hay, Simon I.
2012-01-01
Background Primaquine is a key drug for malaria elimination. In addition to being the only drug active against the dormant relapsing forms of Plasmodium vivax, primaquine is the sole effective treatment of infectious P. falciparum gametocytes, and may interrupt transmission and help contain the spread of artemisinin resistance. However, primaquine can trigger haemolysis in patients with a deficiency in glucose-6-phosphate dehydrogenase (G6PDd). Poor information is available about the distribution of individuals at risk of primaquine-induced haemolysis. We present a continuous evidence-based prevalence map of G6PDd and estimates of affected populations, together with a national index of relative haemolytic risk. Methods and Findings Representative community surveys of phenotypic G6PDd prevalence were identified for 1,734 spatially unique sites. These surveys formed the evidence-base for a Bayesian geostatistical model adapted to the gene's X-linked inheritance, which predicted a G6PDd allele frequency map across malaria endemic countries (MECs) and generated population-weighted estimates of affected populations. Highest median prevalence (peaking at 32.5%) was predicted across sub-Saharan Africa and the Arabian Peninsula. Although G6PDd prevalence was generally lower across central and southeast Asia, rarely exceeding 20%, the majority of G6PDd individuals (67.5% median estimate) were from Asian countries. We estimated a G6PDd allele frequency of 8.0% (interquartile range: 7.4–8.8) across MECs, and 5.3% (4.4–6.7) within malaria-eliminating countries. The reliability of the map is contingent on the underlying data informing the model; population heterogeneity can only be represented by the available surveys, and important weaknesses exist in the map across data-sparse regions. Uncertainty metrics are used to quantify some aspects of these limitations in the map. Finally, we assembled a database of G6PDd variant occurrences to inform a national-level index of
Model parameter updating using Bayesian networks
Treml, C. A.; Ross, Timothy J.
2004-01-01
This paper outlines a model parameter updating technique for a new method of model validation using a modified model reference adaptive control (MRAC) framework with Bayesian Networks (BNs). The model parameter updating within this method is generic in the sense that the model/simulation to be validated is treated as a black box. It must have updateable parameters to which its outputs are sensitive, and those outputs must have metrics that can be compared to that of the model reference, i.e., experimental data. Furthermore, no assumptions are made about the statistics of the model parameter uncertainty, only upper and lower bounds need to be specified. This method is designed for situations where a model is not intended to predict a complete point-by-point time domain description of the item/system behavior; rather, there are specific points, features, or events of interest that need to be predicted. These specific points are compared to the model reference derived from actual experimental data. The logic for updating the model parameters to match the model reference is formed via a BN. The nodes of this BN consist of updateable model input parameters and the specific output values or features of interest. Each time the model is executed, the input/output pairs are used to adapt the conditional probabilities of the BN. Each iteration further refines the inferred model parameters to produce the desired model output. After parameter updating is complete and model inputs are inferred, reliabilities for the model output are supplied. Finally, this method is applied to a simulation of a resonance control cooling system for a prototype coupled cavity linac. The results are compared to experimental data.
A Bayesian model for cluster detection.
Wakefield, Jonathan; Kim, Albert
2013-09-01
The detection of areas in which the risk of a particular disease is significantly elevated, leading to an excess of cases, is an important enterprise in spatial epidemiology. Various frequentist approaches have been suggested for the detection of "clusters" within a hypothesis testing framework. Unfortunately, these suffer from a number of drawbacks including the difficulty in specifying a p-value threshold at which to call significance, the inherent multiplicity problem, and the possibility of multiple clusters. In this paper, we suggest a Bayesian approach to detecting "areas of clustering" in which the study region is partitioned into, possibly multiple, "zones" within which the risk is either at a null, or non-null, level. Computation is carried out using Markov chain Monte Carlo, tuned to the model that we develop. The method is applied to leukemia data in upstate New York. PMID:23476026
Bayesian model selection for LISA pathfinder
NASA Astrophysics Data System (ADS)
Karnesis, Nikolaos; Nofrarias, Miquel; Sopuerta, Carlos F.; Gibert, Ferran; Armano, Michele; Audley, Heather; Congedo, Giuseppe; Diepholz, Ingo; Ferraioli, Luigi; Hewitson, Martin; Hueller, Mauro; Korsakova, Natalia; McNamara, Paul W.; Plagnol, Eric; Vitale, Stefano
2014-03-01
The main goal of the LISA Pathfinder (LPF) mission is to fully characterize the acceleration noise models and to test key technologies for future space-based gravitational-wave observatories similar to the eLISA concept. The data analysis team has developed complex three-dimensional models of the LISA Technology Package (LTP) experiment onboard the LPF. These models are used for simulations, but, more importantly, they will be used for parameter estimation purposes during flight operations. One of the tasks of the data analysis team is to identify the physical effects that contribute significantly to the properties of the instrument noise. A way of approaching this problem is to recover the essential parameters of a LTP model fitting the data. Thus, we want to define the simplest model that efficiently explains the observations. To do so, adopting a Bayesian framework, one has to estimate the so-called Bayes factor between two competing models. In our analysis, we use three main different methods to estimate it: the reversible jump Markov chain Monte Carlo method, the Schwarz criterion, and the Laplace approximation. They are applied to simulated LPF experiments in which the most probable LTP model that explains the observations is recovered. The same type of analysis presented in this paper is expected to be followed during flight operations. Moreover, the correlation of the output of the aforementioned methods with the design of the experiment is explored.
Modeling residual hydrologic errors with Bayesian inference
NASA Astrophysics Data System (ADS)
Smith, Tyler; Marshall, Lucy; Sharma, Ashish
2015-09-01
Hydrologic modelers are confronted with the challenge of producing estimates of the uncertainty associated with model predictions across an array of catchments and hydrologic flow regimes. Formal Bayesian approaches are commonly employed for parameter calibration and uncertainty analysis, but are often criticized for making strong assumptions about the nature of model residuals via the likelihood function that may not be well satisfied (or even checked). This technical note outlines a residual error model (likelihood function) specification framework that aims to provide guidance for the application of more appropriate residual error models through a nested approach that is both flexible and extendible. The framework synthesizes many previously employed residual error models and has been applied to four synthetic datasets (of differing error structure) and a real dataset from the Black River catchment in Queensland, Australia. Each residual error model was investigated and assessed under a top-down approach focused on its ability to properly characterize the errors. The results of these test applications indicate that a multifaceted assessment strategy is necessary to determine the adequacy of an individual likelihood function.
A Tutorial Introduction to Bayesian Models of Cognitive Development
ERIC Educational Resources Information Center
Perfors, Amy; Tenenbaum, Joshua B.; Griffiths, Thomas L.; Xu, Fei
2011-01-01
We present an introduction to Bayesian inference as it is used in probabilistic models of cognitive development. Our goal is to provide an intuitive and accessible guide to the "what", the "how", and the "why" of the Bayesian approach: what sorts of problems and data the framework is most relevant for, and how and why it may be useful for…
Implementing Relevance Feedback in the Bayesian Network Retrieval Model.
ERIC Educational Resources Information Center
de Campos, Luis M.; Fernandez-Luna, Juan M.; Huete, Juan F.
2003-01-01
Discussion of relevance feedback in information retrieval focuses on a proposal for the Bayesian Network Retrieval Model. Bases the proposal on the propagation of partial evidences in the Bayesian network, representing new information obtained from the user's relevance judgments to compute the posterior relevance probabilities of the documents…
Bayesian Student Modeling and the Problem of Parameter Specification.
ERIC Educational Resources Information Center
Millan, Eva; Agosta, John Mark; Perez de la Cruz, Jose Luis
2001-01-01
Discusses intelligent tutoring systems and the application of Bayesian networks to student modeling. Considers reasons for not using Bayesian networks, including the computational complexity of the algorithms and the difficulty of knowledge acquisition, and proposes an approach to simplify knowledge acquisition that applies causal independence to…
A SEMIPARAMETRIC BAYESIAN MODEL FOR CIRCULAR-LINEAR REGRESSION
We present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is usefu...
Warnery, E; Ielsch, G; Lajaunie, C; Cale, E; Wackernagel, H; Debayle, C; Guillevic, J
2015-01-01
information, which is exhaustive throughout France, could help in estimating the telluric gamma dose rates. Such an approach is possible using multivariate geostatistics and cokriging. Multi-collocated cokriging has been performed on 1*1 km(2) cells over the domain. This model used gamma dose rate measurement results and GUP classes. Our results provide useful information on the variability of the natural terrestrial gamma radiation in France ('natural background') and exposure data for epidemiological studies and risk assessment from low dose chronic exposures. PMID:25464050
NASA Astrophysics Data System (ADS)
Vasquez, D. A.; Swift, J. N.; Tan, S.; Darrah, T. H.
2013-12-01
The integration of precise geochemical analyses with quantitative engineering modeling into an interactive GIS system allows for a sophisticated and efficient method of reservoir engineering and characterization. Geographic Information Systems (GIS) is utilized as an advanced technique for oil field reservoir analysis by combining field engineering and geological/geochemical spatial datasets with the available systematic modeling and mapping methods to integrate the information into a spatially correlated first-hand approach in defining surface and subsurface characteristics. Three key methods of analysis include: 1) Geostatistical modeling to create a static and volumetric 3-dimensional representation of the geological body, 2) Numerical modeling to develop a dynamic and interactive 2-dimensional model of fluid flow across the reservoir and 3) Noble gas geochemistry to further define the physical conditions, components and history of the geologic system. Results thus far include using engineering algorithms for interpolating electrical well log properties across the field (spontaneous potential, resistivity) yielding a highly accurate and high-resolution 3D model of rock properties. Results so far also include using numerical finite difference methods (crank-nicholson) to solve for equations describing the distribution of pressure across field yielding a 2D simulation model of fluid flow across reservoir. Ongoing noble gas geochemistry results will also include determination of the source, thermal maturity and the extent/style of fluid migration (connectivity, continuity and directionality). Future work will include developing an inverse engineering algorithm to model for permeability, porosity and water saturation.This combination of new and efficient technological and analytical capabilities is geared to provide a better understanding of the field geology and hydrocarbon dynamics system with applications to determine the presence of hydrocarbon pay zones (or
Bayesian analysis of the backreaction models
Kurek, Aleksandra; Bolejko, Krzysztof; Szydlowski, Marek
2010-03-15
We present a Bayesian analysis of four different types of backreaction models, which are based on the Buchert equations. In this approach, one considers a solution to the Einstein equations for a general matter distribution and then an average of various observable quantities is taken. Such an approach became of considerable interest when it was shown that it could lead to agreement with observations without resorting to dark energy. In this paper we compare the {Lambda}CDM model and the backreaction models with type Ia supernovae, baryon acoustic oscillations, and cosmic microwave background data, and find that the former is favored. However, the tested models were based on some particular assumptions about the relation between the average spatial curvature and the backreaction, as well as the relation between the curvature and curvature index. In this paper we modified the latter assumption, leaving the former unchanged. We find that, by varying the relation between the curvature and curvature index, we can obtain a better fit. Therefore, some further work is still needed--in particular, the relation between the backreaction and the curvature should be revisited in order to fully determine the feasibility of the backreaction models to mimic dark energy.
Scale Mixture Models with Applications to Bayesian Inference
NASA Astrophysics Data System (ADS)
Qin, Zhaohui S.; Damien, Paul; Walker, Stephen
2003-11-01
Scale mixtures of uniform distributions are used to model non-normal data in time series and econometrics in a Bayesian framework. Heteroscedastic and skewed data models are also tackled using scale mixture of uniform distributions.
Stochastic model updating utilizing Bayesian approach and Gaussian process model
NASA Astrophysics Data System (ADS)
Wan, Hua-Ping; Ren, Wei-Xin
2016-03-01
Stochastic model updating (SMU) has been increasingly applied in quantifying structural parameter uncertainty from responses variability. SMU for parameter uncertainty quantification refers to the problem of inverse uncertainty quantification (IUQ), which is a nontrivial task. Inverse problem solved with optimization usually brings about the issues of gradient computation, ill-conditionedness, and non-uniqueness. Moreover, the uncertainty present in response makes the inverse problem more complicated. In this study, Bayesian approach is adopted in SMU for parameter uncertainty quantification. The prominent strength of Bayesian approach for IUQ problem is that it solves IUQ problem in a straightforward manner, which enables it to avoid the previous issues. However, when applied to engineering structures that are modeled with a high-resolution finite element model (FEM), Bayesian approach is still computationally expensive since the commonly used Markov chain Monte Carlo (MCMC) method for Bayesian inference requires a large number of model runs to guarantee the convergence. Herein we reduce computational cost in two aspects. On the one hand, the fast-running Gaussian process model (GPM) is utilized to approximate the time-consuming high-resolution FEM. On the other hand, the advanced MCMC method using delayed rejection adaptive Metropolis (DRAM) algorithm that incorporates local adaptive strategy with global adaptive strategy is employed for Bayesian inference. In addition, we propose the use of the powerful variance-based global sensitivity analysis (GSA) in parameter selection to exclude non-influential parameters from calibration parameters, which yields a reduced-order model and thus further alleviates the computational burden. A simulated aluminum plate and a real-world complex cable-stayed pedestrian bridge are presented to illustrate the proposed framework and verify its feasibility.
A guide to Bayesian model selection for ecologists
Hooten, Mevin B.; Hobbs, N.T.
2015-01-01
The steady upward trend in the use of model selection and Bayesian methods in ecological research has made it clear that both approaches to inference are important for modern analysis of models and data. However, in teaching Bayesian methods and in working with our research colleagues, we have noticed a general dissatisfaction with the available literature on Bayesian model selection and multimodel inference. Students and researchers new to Bayesian methods quickly find that the published advice on model selection is often preferential in its treatment of options for analysis, frequently advocating one particular method above others. The recent appearance of many articles and textbooks on Bayesian modeling has provided welcome background on relevant approaches to model selection in the Bayesian framework, but most of these are either very narrowly focused in scope or inaccessible to ecologists. Moreover, the methodological details of Bayesian model selection approaches are spread thinly throughout the literature, appearing in journals from many different fields. Our aim with this guide is to condense the large body of literature on Bayesian approaches to model selection and multimodel inference and present it specifically for quantitative ecologists as neutrally as possible. We also bring to light a few important and fundamental concepts relating directly to model selection that seem to have gone unnoticed in the ecological literature. Throughout, we provide only a minimal discussion of philosophy, preferring instead to examine the breadth of approaches as well as their practical advantages and disadvantages. This guide serves as a reference for ecologists using Bayesian methods, so that they can better understand their options and can make an informed choice that is best aligned with their goals for inference.
NASA Astrophysics Data System (ADS)
You, Jiong; Pei, Zhiyuan
2015-01-01
With the development of remote sensing technology, its applications in agriculture monitoring systems, crop mapping accuracy, and spatial distribution are more and more being explored by administrators and users. Uncertainty in crop mapping is profoundly affected by the spatial pattern of spectral reflectance values obtained from the applied remote sensing data. Errors in remotely sensed crop cover information and the propagation in derivative products need to be quantified and handled correctly. Therefore, this study discusses the methods of error modeling for uncertainty characterization in crop mapping using GF-1 multispectral imagery. An error modeling framework based on geostatistics is proposed, which introduced the sequential Gaussian simulation algorithm to explore the relationship between classification errors and the spectral signature from remote sensing data source. On this basis, a misclassification probability model to produce a spatially explicit classification error probability surface for the map of a crop is developed, which realizes the uncertainty characterization for crop mapping. In this process, trend surface analysis was carried out to generate a spatially varying mean response and the corresponding residual response with spatial variation for the spectral bands of GF-1 multispectral imagery. Variogram models were employed to measure the spatial dependence in the spectral bands and the derived misclassification probability surfaces. Simulated spectral data and classification results were quantitatively analyzed. Through experiments using data sets from a region in the low rolling country located at the Yangtze River valley, it was found that GF-1 multispectral imagery can be used for crop mapping with a good overall performance, the proposal error modeling framework can be used to quantify the uncertainty in crop mapping, and the misclassification probability model can summarize the spatial variation in map accuracy and is helpful for
Bayesian Test of Significance for Conditional Independence: The Multinomial Model
NASA Astrophysics Data System (ADS)
de Morais Andrade, Pablo; Stern, Julio; de Bragança Pereira, Carlos
2014-03-01
Conditional independence tests (CI tests) have received special attention lately in Machine Learning and Computational Intelligence related literature as an important indicator of the relationship among the variables used by their models. In the field of Probabilistic Graphical Models (PGM)--which includes Bayesian Networks (BN) models--CI tests are especially important for the task of learning the PGM structure from data. In this paper, we propose the Full Bayesian Significance Test (FBST) for tests of conditional independence for discrete datasets. FBST is a powerful Bayesian test for precise hypothesis, as an alternative to frequentist's significance tests (characterized by the calculation of the \\emph{p-value}).
Bayesian analysis of a disability model for lung cancer survival.
Armero, C; Cabras, S; Castellanos, M E; Perra, S; Quirós, A; Oruezábal, M J; Sánchez-Rubio, J
2016-02-01
Bayesian reasoning, survival analysis and multi-state models are used to assess survival times for Stage IV non-small-cell lung cancer patients and the evolution of the disease over time. Bayesian estimation is done using minimum informative priors for the Weibull regression survival model, leading to an automatic inferential procedure. Markov chain Monte Carlo methods have been used for approximating posterior distributions and the Bayesian information criterion has been considered for covariate selection. In particular, the posterior distribution of the transition probabilities, resulting from the multi-state model, constitutes a very interesting tool which could be useful to help oncologists and patients make efficient and effective decisions. PMID:22767866
Constructive Epistemic Modeling: A Hierarchical Bayesian Model Averaging Method
NASA Astrophysics Data System (ADS)
Tsai, F. T. C.; Elshall, A. S.
2014-12-01
Constructive epistemic modeling is the idea that our understanding of a natural system through a scientific model is a mental construct that continually develops through learning about and from the model. Using the hierarchical Bayesian model averaging (HBMA) method [1], this study shows that segregating different uncertain model components through a BMA tree of posterior model probabilities, model prediction, within-model variance, between-model variance and total model variance serves as a learning tool [2]. First, the BMA tree of posterior model probabilities permits the comparative evaluation of the candidate propositions of each uncertain model component. Second, systemic model dissection is imperative for understanding the individual contribution of each uncertain model component to the model prediction and variance. Third, the hierarchical representation of the between-model variance facilitates the prioritization of the contribution of each uncertain model component to the overall model uncertainty. We illustrate these concepts using the groundwater modeling of a siliciclastic aquifer-fault system. The sources of uncertainty considered are from geological architecture, formation dip, boundary conditions and model parameters. The study shows that the HBMA analysis helps in advancing knowledge about the model rather than forcing the model to fit a particularly understanding or merely averaging several candidate models. [1] Tsai, F. T.-C., and A. S. Elshall (2013), Hierarchical Bayesian model averaging for hydrostratigraphic modeling: Uncertainty segregation and comparative evaluation. Water Resources Research, 49, 5520-5536, doi:10.1002/wrcr.20428. [2] Elshall, A.S., and F. T.-C. Tsai (2014). Constructive epistemic modeling of groundwater flow with geological architecture and boundary condition uncertainty under Bayesian paradigm, Journal of Hydrology, 517, 105-119, doi: 10.1016/j.jhydrol.2014.05.027.
Entropic Priors and Bayesian Model Selection
NASA Astrophysics Data System (ADS)
Brewer, Brendon J.; Francis, Matthew J.
2009-12-01
We demonstrate that the principle of maximum relative entropy (ME), used judiciously, can ease the specification of priors in model selection problems. The resulting effect is that models that make sharp predictions are disfavoured, weakening the usual Bayesian ``Occam's Razor.'' This is illustrated with a simple example involving what Jaynes called a ``sure thing'' hypothesis. Jaynes' resolution of the situation involved introducing a large number of alternative ``sure thing'' hypotheses that were possible before we observed the data. However, in more complex situations, it may not be possible to explicitly enumerate large numbers of alternatives. The entropic priors formalism produces the desired result without modifying the hypothesis space or requiring explicit enumeration of alternatives; all that is required is a good model for the prior predictive distribution for the data. This idea is illustrated with a simple rigged-lottery example, and we outline how this idea may help to resolve a recent debate amongst cosmologists: is dark energy a cosmological constant, or has it evolved with time in some way? And how shall we decide, when the data are in?
Integrative variable selection via Bayesian model uncertainty.
Quintana, M A; Conti, D V
2013-12-10
We are interested in developing integrative approaches for variable selection problems that incorporate external knowledge on a set of predictors of interest. In particular, we have developed an integrative Bayesian model uncertainty (iBMU) method, which formally incorporates multiple sources of data via a second-stage probit model on the probability that any predictor is associated with the outcome of interest. Using simulations, we demonstrate that iBMU leads to an increase in power to detect true marginal associations over more commonly used variable selection techniques, such as least absolute shrinkage and selection operator and elastic net. In addition, iBMU leads to a more efficient model search algorithm over the basic BMU method even when the predictor-level covariates are only modestly informative. The increase in power and efficiency of our method becomes more substantial as the predictor-level covariates become more informative. Finally, we demonstrate the power and flexibility of iBMU for integrating both gene structure and functional biomarker information into a candidate gene study investigating over 50 genes in the brain reward system and their role with smoking cessation from the Pharmacogenetics of Nicotine Addiction and Treatment Consortium. PMID:23824835
Two-Stage Bayesian Model Averaging in Endogenous Variable Models.
Lenkoski, Alex; Eicher, Theo S; Raftery, Adrian E
2014-01-01
Economic modeling in the presence of endogeneity is subject to model uncertainty at both the instrument and covariate level. We propose a Two-Stage Bayesian Model Averaging (2SBMA) methodology that extends the Two-Stage Least Squares (2SLS) estimator. By constructing a Two-Stage Unit Information Prior in the endogenous variable model, we are able to efficiently combine established methods for addressing model uncertainty in regression models with the classic technique of 2SLS. To assess the validity of instruments in the 2SBMA context, we develop Bayesian tests of the identification restriction that are based on model averaged posterior predictive p-values. A simulation study showed that 2SBMA has the ability to recover structure in both the instrument and covariate set, and substantially improves the sharpness of resulting coefficient estimates in comparison to 2SLS using the full specification in an automatic fashion. Due to the increased parsimony of the 2SBMA estimate, the Bayesian Sargan test had a power of 50 percent in detecting a violation of the exogeneity assumption, while the method based on 2SLS using the full specification had negligible power. We apply our approach to the problem of development accounting, and find support not only for institutions, but also for geography and integration as development determinants, once both model uncertainty and endogeneity have been jointly addressed. PMID:24223471
Calibrating Bayesian Network Representations of Social-Behavioral Models
Whitney, Paul D.; Walsh, Stephen J.
2010-04-08
While human behavior has long been studied, recent and ongoing advances in computational modeling present opportunities for recasting research outcomes in human behavior. In this paper we describe how Bayesian networks can represent outcomes of human behavior research. We demonstrate a Bayesian network that represents political radicalization research – and show a corresponding visual representation of aspects of this research outcome. Since Bayesian networks can be quantitatively compared with external observations, the representation can also be used for empirical assessments of the research which the network summarizes. For a political radicalization model based on published research, we show this empirical comparison with data taken from the Minorities at Risk Organizational Behaviors database.
Bayesian model reduction and empirical Bayes for group (DCM) studies.
Friston, Karl J; Litvak, Vladimir; Oswal, Ashwini; Razi, Adeel; Stephan, Klaas E; van Wijk, Bernadette C M; Ziegler, Gabriel; Zeidman, Peter
2016-03-01
This technical note describes some Bayesian procedures for the analysis of group studies that use nonlinear models at the first (within-subject) level - e.g., dynamic causal models - and linear models at subsequent (between-subject) levels. Its focus is on using Bayesian model reduction to finesse the inversion of multiple models of a single dataset or a single (hierarchical or empirical Bayes) model of multiple datasets. These applications of Bayesian model reduction allow one to consider parametric random effects and make inferences about group effects very efficiently (in a few seconds). We provide the relatively straightforward theoretical background to these procedures and illustrate their application using a worked example. This example uses a simulated mismatch negativity study of schizophrenia. We illustrate the robustness of Bayesian model reduction to violations of the (commonly used) Laplace assumption in dynamic causal modelling and show how its recursive application can facilitate both classical and Bayesian inference about group differences. Finally, we consider the application of these empirical Bayesian procedures to classification and prediction. PMID:26569570
Bayesian model reduction and empirical Bayes for group (DCM) studies
Friston, Karl J.; Litvak, Vladimir; Oswal, Ashwini; Razi, Adeel; Stephan, Klaas E.; van Wijk, Bernadette C.M.; Ziegler, Gabriel; Zeidman, Peter
2016-01-01
This technical note describes some Bayesian procedures for the analysis of group studies that use nonlinear models at the first (within-subject) level – e.g., dynamic causal models – and linear models at subsequent (between-subject) levels. Its focus is on using Bayesian model reduction to finesse the inversion of multiple models of a single dataset or a single (hierarchical or empirical Bayes) model of multiple datasets. These applications of Bayesian model reduction allow one to consider parametric random effects and make inferences about group effects very efficiently (in a few seconds). We provide the relatively straightforward theoretical background to these procedures and illustrate their application using a worked example. This example uses a simulated mismatch negativity study of schizophrenia. We illustrate the robustness of Bayesian model reduction to violations of the (commonly used) Laplace assumption in dynamic causal modelling and show how its recursive application can facilitate both classical and Bayesian inference about group differences. Finally, we consider the application of these empirical Bayesian procedures to classification and prediction. PMID:26569570
NASA Astrophysics Data System (ADS)
Wingle, William L.; Poeter, Eileen P.; McKenna, Sean A.
1999-05-01
UNCERT is a 2D and 3D geostatistics, uncertainty analysis and visualization software package applied to ground water flow and contaminant transport modeling. It is a collection of modules that provides tools for linear regression, univariate statistics, semivariogram analysis, inverse-distance gridding, trend-surface analysis, simple and ordinary kriging and discrete conditional indicator simulation. Graphical user interfaces for MODFLOW and MT3D, ground water flow and contaminant transport models, are provided for streamlined data input and result analysis. Visualization tools are included for displaying data input and output. These include, but are not limited to, 2D and 3D scatter plots, histograms, box and whisker plots, 2D contour maps, surface renderings of 2D gridded data and 3D views of gridded data. By design, UNCERT's graphical user interface and visualization tools facilitate model design and analysis. There are few built in restrictions on data set sizes and each module (with two exceptions) can be run in either graphical or batch mode. UNCERT is in the public domain and is available from the World Wide Web with complete on-line and printable (PDF) documentation. UNCERT is written in ANSI-C with a small amount of FORTRAN77, for UNIX workstations running X-Windows and Motif (or Lesstif). This article discusses the features of each module and demonstrates how they can be used individually and in combination. The tools are applicable to a wide range of fields and are currently used by researchers in the ground water, mining, mathematics, chemistry and geophysics, to name a few disciplines.
NASA Astrophysics Data System (ADS)
Matiatos, Ioannis; Varouhakis, Emmanouil A.; Papadopoulou, Maria P.
2015-04-01
level and nitrate concentrations were produced and compared with those obtained from groundwater and mass transport numerical models. Preliminary results showed similar efficiency of the spatiotemporal geostatistical method with the numerical models. However data requirements of the former model were significantly less. Advantages and disadvantages of the methods performance were analysed and discussed indicating the characteristics of the different approaches.
BAYESIAN METHODS FOR REGIONAL-SCALE EUTROPHICATION MODELS. (R830887)
We demonstrate a Bayesian classification and regression tree (CART) approach to link multiple environmental stressors to biological responses and quantify uncertainty in model predictions. Such an approach can: (1) report prediction uncertainty, (2) be consistent with the amou...
Multivariate Bayesian Models of Extreme Rainfall
NASA Astrophysics Data System (ADS)
Rahill-Marier, B.; Devineni, N.; Lall, U.; Farnham, D.
2013-12-01
Accounting for spatial heterogeneity in extreme rainfall has important ramifications in hydrological design and climate models alike. Traditional methods, including areal reduction factors and kriging, are sensitive to catchment shape assumptions and return periods, and do not explicitly model spatial dependence between between data points. More recent spatially dense rainfall simulators depend on newer data sources such as radar and may struggle to reproduce extremes because of physical assumptions in the model and short historical records. Rain gauges offer the longest historical record, key when considering rainfall extremes and changes over time, and particularly relevant in today's environment of designing for climate change. In this paper we propose a probabilistic approach of accounting for spatial dependence using the lengthy but spatially disparate hourly rainfall network in the greater New York City area. We build a hierarchical Bayesian model allowing extremes at one station to co-vary with concurrent rainfall fields occurring at other stations. Subsequently we pool across the extreme rainfall fields of all stations, and demonstrate that the expected catchment-wide events are significantly lower when considering spatial fields instead of maxima-only fields. We additionally demonstrate the importance of using concurrent spatial fields, rather than annual maxima, in producing covariance matrices that describe true storm dynamics. This approach is also unique in that it considers short duration storms - from one hour to twenty-four hours - rather than the daily values typically derived from rainfall gauges. The same methodology can be extended to include the radar fields available in the past decade. The hierarchical multilevel approach lends itself easily to integration of long-record parameters and short-record parameters at a station or regional level. In addition climate covariates can be introduced to support the relationship of spatial covariance with
Which level of model complexity is justified by your data? A Bayesian answer
NASA Astrophysics Data System (ADS)
Schöniger, Anneli; Illman, Walter; Wöhling, Thomas; Nowak, Wolfgang
2016-04-01
When judging the plausibility and utility of a subsurface flow or transport model, the question of justifiability arises: which level of model complexity can still be justified by the available calibration data? Although it is common sense that more data are needed to reasonably constrain the parameter space of a more complex model, there is a lack of tools that can objectively quantify model justifiability as a function of the available data. We propose an approach to determine model justifiability in the context of comparing alternative conceptual models. Our approach rests on Bayesian model averaging (BMA). BMA yields posterior model probabilities that point the modeler to an optimal trade-off between model performance in reproducing a given calibration data set and model complexity. To find out which level of complexity can be justified by the available data, we disentangle the complexity component of the trade-off from its performance counterpart. Technically, we remove the performance component from the BMA analysis by replacing the actually observed data values with potential measurement values as predicted by the models. Our proposed analysis results in a "model confusion matrix". Based on this matrix, the modeler can identify the maximum level of model complexity that could possibly be justified by the available amount and type of data. As a side product, model (dis-)similarity is revealed. We have applied the model justifiability analysis to a case of aquifer characterization via hydraulic tomography. Four models of vastly different complexity have been proposed to represent the heterogeneity in hydraulic conductivity of a sandbox aquifer, ranging from a homogeneous medium to geostatistical random fields. We have used drawdown data from two to six pumping tests to condition the models and to determine model justifiability as a function of data set size. Our test case shows that a geostatistical parameterization scheme requires a substantial amount of
Viswanathan, Hari S; Robinson, Bruce A; Gable, Carl W; Carey, James W
2003-01-01
Retardation of certain radionuclides due to sorption to zeolitic minerals is considered one of the major barriers to contaminant transport in the unsaturated zone of Yucca Mountain. However, zeolitically altered areas are lower in permeability than unaltered regions, which raises the possibility that contaminants might bypass the sorptive zeolites. The relationship between hydrologic and chemical properties must be understood to predict the transport of radionuclides through zeolitically altered areas. In this study, we incorporate mineralogical information into an unsaturated zone transport model using geostatistical techniques to correlate zeolitic abundance to hydrologic and chemical properties. Geostatistical methods are used to develop variograms, kriging maps, and conditional simulations of zeolitic abundance. We then investigate, using flow and transport modeling on a heterogeneous field, the relationship between percent zeolitic alteration, permeability changes due to alteration, sorption due to alteration, and their overall effect on radionuclide transport. We compare these geostatistical simulations to a simplified threshold method in which each spatial location in the model is assigned either zeolitic or vitric properties based on the zeolitic abundance at that location. A key conclusion is that retardation due to sorption predicted by using the continuous distribution is larger than the retardation predicted by the threshold method. The reason for larger retardation when using the continuous distribution is a small but significant sorption at locations with low zeolitic abundance. If, for practical reasons, models with homogeneous properties within each layer are used, we recommend setting nonzero K(d)s in the vitric tuffs to mimic the more rigorous continuous distribution simulations. Regions with high zeolitic abundance may not be as effective in retarding radionuclides such as Neptunium since these rocks are lower in permeability and contaminants can
A Bayesian observer model constrained by efficient coding can explain 'anti-Bayesian' percepts.
Wei, Xue-Xin; Stocker, Alan A
2015-10-01
Bayesian observer models provide a principled account of the fact that our perception of the world rarely matches physical reality. The standard explanation is that our percepts are biased toward our prior beliefs. However, reported psychophysical data suggest that this view may be simplistic. We propose a new model formulation based on efficient coding that is fully specified for any given natural stimulus distribution. The model makes two new and seemingly anti-Bayesian predictions. First, it predicts that perception is often biased away from an observer's prior beliefs. Second, it predicts that stimulus uncertainty differentially affects perceptual bias depending on whether the uncertainty is induced by internal or external noise. We found that both model predictions match reported perceptual biases in perceived visual orientation and spatial frequency, and were able to explain data that have not been explained before. The model is general and should prove applicable to other perceptual variables and tasks. PMID:26343249
Yang, Kun; Zhou, Xiao-Nong; Wu, Xiao-Hua; Steinmann, Peter; Wang, Xian-Hong; Yang, Guo-Jing; Utzinger, Jürg; Li, Hong-Jun
2009-09-01
Detailed knowledge of how local landscape patterns influence the distribution of Oncomelania hupensis, the intermediate host snail of Schistosoma japonicum, might facilitate more effective schistosomiasis control. We selected 12 villages in a mountainous area of Eryuan County, Yunnan Province, People's Republic of China, and developed Bayesian geostatistical models to explore heterogeneities of landscape composition in relation to distribution of O. hupensis. The best-fitting spatio-temporal model indicated that the snail density was significantly correlated with environmental factors. Specifically, snail density was positively correlated with wetness and inversely correlated with the normalized difference vegetation index and mollusciciding, and snail density decreased as landscape patterns became more uniform. However, the distribution of infected snails was not significantly correlated with any of the investigated environmental factors and landscape metrics. Our enhanced understanding of O. hupensis ecology is important for spatial targeting of schistosomiasis control interventions. PMID:19706906
Evaluating Individualized Reading Programs: A Bayesian Model.
ERIC Educational Resources Information Center
Maxwell, Martha
Simple Bayesian approaches can be applied to answer specific questions in evaluating an individualized reading program. A small reading and study skills program located in the counseling center of a major research university collected and compiled data on student characteristics such as class, number of sessions attended, grade point average, and…
Technical note: Bayesian calibration of dynamic ruminant nutrition models.
Reed, K F; Arhonditsis, G B; France, J; Kebreab, E
2016-08-01
Mechanistic models of ruminant digestion and metabolism have advanced our understanding of the processes underlying ruminant animal physiology. Deterministic modeling practices ignore the inherent variation within and among individual animals and thus have no way to assess how sources of error influence model outputs. We introduce Bayesian calibration of mathematical models to address the need for robust mechanistic modeling tools that can accommodate error analysis by remaining within the bounds of data-based parameter estimation. For the purpose of prediction, the Bayesian approach generates a posterior predictive distribution that represents the current estimate of the value of the response variable, taking into account both the uncertainty about the parameters and model residual variability. Predictions are expressed as probability distributions, thereby conveying significantly more information than point estimates in regard to uncertainty. Our study illustrates some of the technical advantages of Bayesian calibration and discusses the future perspectives in the context of animal nutrition modeling. PMID:27179874
Using consensus bayesian network to model the reactive oxygen species regulatory pathway.
Hu, Liangdong; Wang, Limin
2013-01-01
Bayesian network is one of the most successful graph models for representing the reactive oxygen species regulatory pathway. With the increasing number of microarray measurements, it is possible to construct the bayesian network from microarray data directly. Although large numbers of bayesian network learning algorithms have been developed, when applying them to learn bayesian networks from microarray data, the accuracies are low due to that the databases they used to learn bayesian networks contain too few microarray data. In this paper, we propose a consensus bayesian network which is constructed by combining bayesian networks from relevant literatures and bayesian networks learned from microarray data. It would have a higher accuracy than the bayesian networks learned from one database. In the experiment, we validated the bayesian network combination algorithm on several classic machine learning databases and used the consensus bayesian network to model the Escherichia coli's ROS pathway. PMID:23457624
Lee, K.H.
1997-09-01
Numerical and geostatistical analyses show that the artificial smoothing effect of kriging removes high permeability flow paths from hydrogeologic data sets, reducing simulated contaminant transport rates in heterogeneous vadose zone systems. therefore, kriging alone is not recommended for estimating the spatial distribution of soil hydraulic properties for contaminant transport analysis at vadose zone sites. Vadose zone transport if modeled more effectively by combining kriging with stochastic simulation to better represent the high degree of spatial variability usually found in the hydraulic properties of field soils. However, kriging is a viable technique for estimating the initial mass distribution of contaminants in the subsurface.
Geostatistical simulations for radon indoor with a nested model including the housing factor.
Cafaro, C; Giovani, C; Garavaglia, M
2016-01-01
The radon prone areas definition is matter of many researches in radioecology, since radon is considered a leading cause of lung tumours, therefore the authorities ask for support to develop an appropriate sanitary prevention strategy. In this paper, we use geostatistical tools to elaborate a definition accounting for some of the available information about the dwellings. Co-kriging is the proper interpolator used in geostatistics to refine the predictions by using external covariates. In advance, co-kriging is not guaranteed to improve significantly the results obtained by applying the common lognormal kriging. Here, instead, such multivariate approach leads to reduce the cross-validation residual variance to an extent which is deemed as satisfying. Furthermore, with the application of Monte Carlo simulations, the paradigm provides a more conservative radon prone areas definition than the one previously made by lognormal kriging. PMID:26547362
Samimi, B.; Bagherpour, H.; Nioc, A.
1995-08-01
The geological reservoir study of the supergiant Ahwaz field significantly improved the history matching process in many aspects, particularly the development of a geostatistical model which allowed a sound basis for changes and by delivering much needed accurate estimates of grid block vertical permeabilities. The geostatistical reservoir evaluation was facilitated by using the Heresim package and litho-stratigraphic zonations for the entire field. For each of the geological zones, 3-dimensional electrolithofacies and petrophysical property distributions (realizations) were treated which captured the heterogeneities which significantly affected fluid flow. However, as this level of heterogeneity was at a significantly smaller scale than the flow simulation grid blocks, a scaling up effort was needed to derive the effective flow properties of the blocks (porosity, horizontal and vertical permeability, and water saturation). The properties relating to the static reservoir description were accurately derived by using stream tube techniques developed in-house whereas, the relative permeabilities of the grid block were derived by dynamic pseudo relative permeability techniques. The prediction of vertical and lateral communication and water encroachment was facilitated by a close integration of pressure, saturation data, geostatistical modelling and sedimentological studies of the depositional environments and paleocurrents. The nature of reservoir barriers and baffles varied both vertically and laterally in this heterogeneous reservoir. Maps showing differences in pressure between zones after years of production served as a guide to integrating the static geological studies to the dynamic behaviour of each of the 16 reservoir zones. The use of deep wells being drilled to a deeper reservoir provided data to better understand the sweep efficiency and the continuity of barriers and baffles.
Social Science and the Bayesian Probability Explanation Model
NASA Astrophysics Data System (ADS)
Yin, Jie; Zhao, Lei
2014-03-01
C. G. Hempel, one of the logical empiricists, who builds up his probability explanation model by using the empiricist view of probability, this model encountered many difficulties in the scientific explanation in which Hempel is difficult to make a reasonable defense. Based on the bayesian probability theory, the Bayesian probability model provides an approach of a subjective probability explanation based on the subjective probability, using the subjectivist view of probability. On the one hand, this probability model establishes the epistemological status of the subject in the social science; On the other hand, it provides a feasible explanation model for the social scientific explanation, which has important methodological significance.
Bayesian calibration of a flood inundation model using spatial data
NASA Astrophysics Data System (ADS)
Hall, Jim W.; Manning, Lucy J.; Hankin, Robin K. S.
2011-05-01
Bayesian theory of model calibration provides a coherent framework for distinguishing and encoding multiple sources of uncertainty in probabilistic predictions of flooding. This paper demonstrates the use of a Bayesian approach to computer model calibration, where the calibration data are in the form of spatial observations of flood extent. The Bayesian procedure involves generating posterior distributions of the flood model calibration parameters and observation error, as well as a Gaussian model inadequacy function, which represents the discrepancy between the best model predictions and reality. The approach is first illustrated with a simple didactic example and is then applied to a flood model of a reach of the river Thames in the UK. A predictive spatial distribution of flooding is generated for a flood of given severity.
Bayesian Estimation of the Logistic Positive Exponent IRT Model
ERIC Educational Resources Information Center
Bolfarine, Heleno; Bazan, Jorge Luis
2010-01-01
A Bayesian inference approach using Markov Chain Monte Carlo (MCMC) is developed for the logistic positive exponent (LPE) model proposed by Samejima and for a new skewed Logistic Item Response Theory (IRT) model, named Reflection LPE model. Both models lead to asymmetric item characteristic curves (ICC) and can be appropriate because a symmetric…
NASA Astrophysics Data System (ADS)
Moradkhani, Hamid; Yan, Hongxiang
2016-04-01
Soil moisture simulation and prediction are increasingly used to characterize agricultural droughts but the process suffers from data scarcity and quality. The satellite soil moisture observations could be used to improve model predictions with data assimilation. Remote sensing products, however, are typically discontinuous in spatial-temporal coverages; while simulated soil moisture products are potentially biased due to the errors in forcing data, parameters, and deficiencies of model physics. This study attempts to provide a detailed analysis of the joint and separate assimilation of streamflow and Advanced Scatterometer (ASCAT) surface soil moisture into a fully distributed hydrologic model, with the use of recently developed particle filter-Markov chain Monte Carlo (PF-MCMC) method. A geostatistical model is introduced to overcome the satellite soil moisture discontinuity issue where satellite data does not cover the whole study region or is significantly biased, and the dominant land cover is dense vegetation. The results indicate that joint assimilation of soil moisture and streamflow has minimal effect in improving the streamflow prediction, however, the surface soil moisture field is significantly improved. The combination of DA and geostatistical approach can further improve the surface soil moisture prediction.
Estimating Tree Height-Diameter Models with the Bayesian Method
Duan, Aiguo; Zhang, Jianguo; Xiang, Congwei
2014-01-01
Six candidate height-diameter models were used to analyze the height-diameter relationships. The common methods for estimating the height-diameter models have taken the classical (frequentist) approach based on the frequency interpretation of probability, for example, the nonlinear least squares method (NLS) and the maximum likelihood method (ML). The Bayesian method has an exclusive advantage compared with classical method that the parameters to be estimated are regarded as random variables. In this study, the classical and Bayesian methods were used to estimate six height-diameter models, respectively. Both the classical method and Bayesian method showed that the Weibull model was the “best” model using data1. In addition, based on the Weibull model, data2 was used for comparing Bayesian method with informative priors with uninformative priors and classical method. The results showed that the improvement in prediction accuracy with Bayesian method led to narrower confidence bands of predicted value in comparison to that for the classical method, and the credible bands of parameters with informative priors were also narrower than uninformative priors and classical method. The estimated posterior distributions for parameters can be set as new priors in estimating the parameters using data2. PMID:24711733
Estimating tree height-diameter models with the Bayesian method.
Zhang, Xiongqing; Duan, Aiguo; Zhang, Jianguo; Xiang, Congwei
2014-01-01
Six candidate height-diameter models were used to analyze the height-diameter relationships. The common methods for estimating the height-diameter models have taken the classical (frequentist) approach based on the frequency interpretation of probability, for example, the nonlinear least squares method (NLS) and the maximum likelihood method (ML). The Bayesian method has an exclusive advantage compared with classical method that the parameters to be estimated are regarded as random variables. In this study, the classical and Bayesian methods were used to estimate six height-diameter models, respectively. Both the classical method and Bayesian method showed that the Weibull model was the "best" model using data1. In addition, based on the Weibull model, data2 was used for comparing Bayesian method with informative priors with uninformative priors and classical method. The results showed that the improvement in prediction accuracy with Bayesian method led to narrower confidence bands of predicted value in comparison to that for the classical method, and the credible bands of parameters with informative priors were also narrower than uninformative priors and classical method. The estimated posterior distributions for parameters can be set as new priors in estimating the parameters using data2. PMID:24711733
On the Adequacy of Bayesian Evaluations of Categorization Models: Reply to Vanpaemel and Lee (2012)
ERIC Educational Resources Information Center
Wills, Andy J.; Pothos, Emmanuel M.
2012-01-01
Vanpaemel and Lee (2012) argued, and we agree, that the comparison of formal models can be facilitated by Bayesian methods. However, Bayesian methods neither precede nor supplant our proposals (Wills & Pothos, 2012), as Bayesian methods can be applied both to our proposals and to their polar opposites. Furthermore, the use of Bayesian methods to…
Bayesian Network Models for Local Dependence among Observable Outcome Variables
ERIC Educational Resources Information Center
Almond, Russell G.; Mulder, Joris; Hemat, Lisa A.; Yan, Duanli
2009-01-01
Bayesian network models offer a large degree of flexibility for modeling dependence among observables (item outcome variables) from the same task, which may be dependent. This article explores four design patterns for modeling locally dependent observations: (a) no context--ignores dependence among observables; (b) compensatory context--introduces…
On the Bayesian Nonparametric Generalization of IRT-Type Models
ERIC Educational Resources Information Center
San Martin, Ernesto; Jara, Alejandro; Rolin, Jean-Marie; Mouchart, Michel
2011-01-01
We study the identification and consistency of Bayesian semiparametric IRT-type models, where the uncertainty on the abilities' distribution is modeled using a prior distribution on the space of probability measures. We show that for the semiparametric Rasch Poisson counts model, simple restrictions ensure the identification of a general…
Bayesian non-parametrics and the probabilistic approach to modelling
Ghahramani, Zoubin
2013-01-01
Modelling is fundamental to many fields of science and engineering. A model can be thought of as a representation of possible data one could predict from a system. The probabilistic approach to modelling uses probability theory to express all aspects of uncertainty in the model. The probabilistic approach is synonymous with Bayesian modelling, which simply uses the rules of probability theory in order to make predictions, compare alternative models, and learn model parameters and structure from data. This simple and elegant framework is most powerful when coupled with flexible probabilistic models. Flexibility is achieved through the use of Bayesian non-parametrics. This article provides an overview of probabilistic modelling and an accessible survey of some of the main tools in Bayesian non-parametrics. The survey covers the use of Bayesian non-parametrics for modelling unknown functions, density estimation, clustering, time-series modelling, and representing sparsity, hierarchies, and covariance structure. More specifically, it gives brief non-technical overviews of Gaussian processes, Dirichlet processes, infinite hidden Markov models, Indian buffet processes, Kingman’s coalescent, Dirichlet diffusion trees and Wishart processes. PMID:23277609
Semiparametric Thurstonian Models for Recurrent Choices: A Bayesian Analysis
ERIC Educational Resources Information Center
Ansari, Asim; Iyengar, Raghuram
2006-01-01
We develop semiparametric Bayesian Thurstonian models for analyzing repeated choice decisions involving multinomial, multivariate binary or multivariate ordinal data. Our modeling framework has multiple components that together yield considerable flexibility in modeling preference utilities, cross-sectional heterogeneity and parameter-driven…
A General Bayesian Model for Testlets: Theory and Applications.
ERIC Educational Resources Information Center
Wang, Xiaohui; Bradlow, Eric T.; Wainer, Howard
2002-01-01
Proposes a modified version of commonly employed item response models in a fully Bayesian framework and obtains inferences under the model using Markov chain Monte Carlo techniques. Demonstrates use of the model in a series of simulations and with operational data from the North Carolina Test of Computer Skills and the Test of Spoken English…
Ashton, Ruth A.; Kefyalew, Takele; Rand, Alison; Sime, Heven; Assefa, Ashenafi; Mekasha, Addis; Edosa, Wasihun; Tesfaye, Gezahegn; Cano, Jorge; Teka, Hiwot; Reithinger, Richard; Pullan, Rachel L.; Drakeley, Chris J.; Brooker, Simon J.
2015-01-01
Ethiopia has a diverse ecology and geography resulting in spatial and temporal variation in malaria transmission. Evidence-based strategies are thus needed to monitor transmission intensity and target interventions. A purposive selection of dried blood spots collected during cross-sectional school-based surveys in Oromia Regional State, Ethiopia, were tested for presence of antibodies against Plasmodium falciparum and P. vivax antigens. Spatially explicit binomial models of seroprevalence were created for each species using a Bayesian framework, and used to predict seroprevalence at 5 km resolution across Oromia. School seroprevalence showed a wider prevalence range than microscopy for both P. falciparum (0–50% versus 0–12.7%) and P. vivax (0–53.7% versus 0–4.5%), respectively. The P. falciparum model incorporated environmental predictors and spatial random effects, while P. vivax seroprevalence first-order trends were not adequately explained by environmental variables, and a spatial smoothing model was developed. This is the first demonstration of serological indicators being used to detect large-scale heterogeneity in malaria transmission using samples from cross-sectional school-based surveys. The findings support the incorporation of serological indicators into periodic large-scale surveillance such as Malaria Indicator Surveys, and with particular utility for low transmission and elimination settings. PMID:25962770
A Practical Primer on Geostatistics
Olea, Ricardo A.
2009-01-01
THE CHALLENGE Most geological phenomena are extraordinarily complex in their interrelationships and vast in their geographical extension. Ordinarily, engineers and geoscientists are faced with corporate or scientific requirements to properly prepare geological models with measurements involving a small fraction of the entire area or volume of interest. Exact description of a system such as an oil reservoir is neither feasible nor economically possible. The results are necessarily uncertain. Note that the uncertainty is not an intrinsic property of the systems; it is the result of incomplete knowledge by the observer. THE AIM OF GEOSTATISTICS The main objective of geostatistics is the characterization of spatial systems that are incompletely known, systems that are common in geology. A key difference from classical statistics is that geostatistics uses the sampling location of every measurement. Unless the measurements show spatial correlation, the application of geostatistics is pointless. Ordinarily the need for additional knowledge goes beyond a few points, which explains the display of results graphically as fishnet plots, block diagrams, and maps. GEOSTATISTICAL METHODS Geostatistics is a collection of numerical techniques for the characterization of spatial attributes using primarily two tools: probabilistic models, which are used for spatial data in a manner similar to the way in which time-series analysis characterizes temporal data, or pattern recognition techniques. The probabilistic models are used as a way to handle uncertainty in results away from sampling locations, making a radical departure from alternative approaches like inverse distance estimation methods. DIFFERENCES WITH TIME SERIES On dealing with time-series analysis, users frequently concentrate their attention on extrapolations for making forecasts. Although users of geostatistics may be interested in extrapolation, the methods work at their best interpolating. This simple difference has
NASA Astrophysics Data System (ADS)
Goeckede, M.; Yadav, V.; Mueller, K. L.; Gourdji, S. M.; Michalak, A. M.; Law, B. E.
2011-12-01
We designed a framework to train biogeophysics-biogeochemistry process models using atmospheric inverse modeling, multiple databases characterizing biosphere-atmosphere exchange, and advanced geostatistics. Our main objective is to reduce uncertainties in carbon cycle and climate projections by exploring the full spectrum of process representation, data assimilation and statistical tools currently available. Incorporating multiple high-quality data sources like eddy-covariance flux databases or biometric inventories has the potential to produce a rigorous data-constrained process model implementation. However, representation errors may bias spatially explicit model output when upscaling to regional to global scales. Atmospheric inverse modeling can be used to validate the regional representativeness of the fluxes, but each piece of prior information from the surface databases limits the ability of the inverse model to characterize the carbon cycle from the perspective of the atmospheric observations themselves. The use of geostatistical inverse modeling (GIM) holds the potential to overcome these limitations, replacing rigid prior patterns with information on how flux fields are correlated across time and space, as well as ancillary environmental data related to the carbon fluxes. We present results from a regional scale data assimilation study that focuses on generating terrestrial CO2 fluxes at high spatial and temporal resolution in the Pacific Northwest United States. Our framework couples surface fluxes from different biogeochemistry process models to very high resolution atmospheric transport using mesoscale modeling (WRF) and Lagrangian Particle dispersion (STILT). We use GIM to interpret the spatiotemporal differences between bottom-up and top-down flux fields. GIM results make it possible to link those differences to input parameters and processes, strengthening model parameterization and process understanding. Results are compared against independent
Involving stakeholders in building integrated fisheries models using Bayesian methods.
Haapasaari, Päivi; Mäntyniemi, Samu; Kuikka, Sakari
2013-06-01
A participatory Bayesian approach was used to investigate how the views of stakeholders could be utilized to develop models to help understand the Central Baltic herring fishery. In task one, we applied the Bayesian belief network methodology to elicit the causal assumptions of six stakeholders on factors that influence natural mortality, growth, and egg survival of the herring stock in probabilistic terms. We also integrated the expressed views into a meta-model using the Bayesian model averaging (BMA) method. In task two, we used influence diagrams to study qualitatively how the stakeholders frame the management problem of the herring fishery and elucidate what kind of causalities the different views involve. The paper combines these two tasks to assess the suitability of the methodological choices to participatory modeling in terms of both a modeling tool and participation mode. The paper also assesses the potential of the study to contribute to the development of participatory modeling practices. It is concluded that the subjective perspective to knowledge, that is fundamental in Bayesian theory, suits participatory modeling better than a positivist paradigm that seeks the objective truth. The methodology provides a flexible tool that can be adapted to different kinds of needs and challenges of participatory modeling. The ability of the approach to deal with small data sets makes it cost-effective in participatory contexts. However, the BMA methodology used in modeling the biological uncertainties is so complex that it needs further development before it can be introduced to wider use in participatory contexts. PMID:23604267
Involving Stakeholders in Building Integrated Fisheries Models Using Bayesian Methods
NASA Astrophysics Data System (ADS)
Haapasaari, Päivi; Mäntyniemi, Samu; Kuikka, Sakari
2013-06-01
A participatory Bayesian approach was used to investigate how the views of stakeholders could be utilized to develop models to help understand the Central Baltic herring fishery. In task one, we applied the Bayesian belief network methodology to elicit the causal assumptions of six stakeholders on factors that influence natural mortality, growth, and egg survival of the herring stock in probabilistic terms. We also integrated the expressed views into a meta-model using the Bayesian model averaging (BMA) method. In task two, we used influence diagrams to study qualitatively how the stakeholders frame the management problem of the herring fishery and elucidate what kind of causalities the different views involve. The paper combines these two tasks to assess the suitability of the methodological choices to participatory modeling in terms of both a modeling tool and participation mode. The paper also assesses the potential of the study to contribute to the development of participatory modeling practices. It is concluded that the subjective perspective to knowledge, that is fundamental in Bayesian theory, suits participatory modeling better than a positivist paradigm that seeks the objective truth. The methodology provides a flexible tool that can be adapted to different kinds of needs and challenges of participatory modeling. The ability of the approach to deal with small data sets makes it cost-effective in participatory contexts. However, the BMA methodology used in modeling the biological uncertainties is so complex that it needs further development before it can be introduced to wider use in participatory contexts.
A Bayesian Approach for Analyzing Longitudinal Structural Equation Models
ERIC Educational Resources Information Center
Song, Xin-Yuan; Lu, Zhao-Hua; Hser, Yih-Ing; Lee, Sik-Yum
2011-01-01
This article considers a Bayesian approach for analyzing a longitudinal 2-level nonlinear structural equation model with covariates, and mixed continuous and ordered categorical variables. The first-level model is formulated for measures taken at each time point nested within individuals for investigating their characteristics that are dynamically…
Small Sample Properties of Bayesian Multivariate Autoregressive Time Series Models
ERIC Educational Resources Information Center
Price, Larry R.
2012-01-01
The aim of this study was to compare the small sample (N = 1, 3, 5, 10, 15) performance of a Bayesian multivariate vector autoregressive (BVAR-SEM) time series model relative to frequentist power and parameter estimation bias. A multivariate autoregressive model was developed based on correlated autoregressive time series vectors of varying…
Bayesian Estimation of the DINA Model with Gibbs Sampling
ERIC Educational Resources Information Center
Culpepper, Steven Andrew
2015-01-01
A Bayesian model formulation of the deterministic inputs, noisy "and" gate (DINA) model is presented. Gibbs sampling is employed to simulate from the joint posterior distribution of item guessing and slipping parameters, subject attribute parameters, and latent class probabilities. The procedure extends concepts in Béguin and Glas,…
Bayesian Finite Mixtures for Nonlinear Modeling of Educational Data.
ERIC Educational Resources Information Center
Tirri, Henry; And Others
A Bayesian approach for finding latent classes in data is discussed. The approach uses finite mixture models to describe the underlying structure in the data and demonstrate that the possibility of using full joint probability models raises interesting new prospects for exploratory data analysis. The concepts and methods discussed are illustrated…
Bayesian Semiparametric Structural Equation Models with Latent Variables
ERIC Educational Resources Information Center
Yang, Mingan; Dunson, David B.
2010-01-01
Structural equation models (SEMs) with latent variables are widely useful for sparse covariance structure modeling and for inferring relationships among latent variables. Bayesian SEMs are appealing in allowing for the incorporation of prior information and in providing exact posterior distributions of unknowns, including the latent variables. In…
Grisotto, Laura; Consonni, Dario; Cecconi, Lorenzo; Catelan, Dolores; Lagazio, Corrado; Bertazzi, Pier Alberto; Baccini, Michela; Biggeri, Annibale
2016-01-01
In this paper the focus is on environmental statistics, with the aim of estimating the concentration surface and related uncertainty of an air pollutant. We used air quality data recorded by a network of monitoring stations within a Bayesian framework to overcome difficulties in accounting for prediction uncertainty and to integrate information provided by deterministic models based on emissions meteorology and chemico-physical characteristics of the atmosphere. Several authors have proposed such integration, but all the proposed approaches rely on representativeness and completeness of existing air pollution monitoring networks. We considered the situation in which the spatial process of interest and the sampling locations are not independent. This is known in the literature as the preferential sampling problem, which if ignored in the analysis, can bias geostatistical inferences. We developed a Bayesian geostatistical model to account for preferential sampling with the main interest in statistical integration and uncertainty. We used PM10 data arising from the air quality network of the Environmental Protection Agency of Lombardy Region (Italy) and numerical outputs from the deterministic model. We specified an inhomogeneous Poisson process for the sampling locations intensities and a shared spatial random component model for the dependence between the spatial location of monitors and the pollution surface. We found greater predicted standard deviation differences in areas not properly covered by the air quality network. In conclusion, in this context inferences on prediction uncertainty may be misleading when geostatistical modelling does not take into account preferential sampling. PMID:27087040
Bayesian log-periodic model for financial crashes
NASA Astrophysics Data System (ADS)
Rodríguez-Caballero, Carlos Vladimir; Knapik, Oskar
2014-10-01
This paper introduces a Bayesian approach in econophysics literature about financial bubbles in order to estimate the most probable time for a financial crash to occur. To this end, we propose using noninformative prior distributions to obtain posterior distributions. Since these distributions cannot be performed analytically, we develop a Markov Chain Monte Carlo algorithm to draw from posterior distributions. We consider three Bayesian models that involve normal and Student's t-distributions in the disturbances and an AR(1)-GARCH(1,1) structure only within the first case. In the empirical part of the study, we analyze a well-known example of financial bubble - the S&P 500 1987 crash - to show the usefulness of the three methods under consideration and crashes of Merval-94, Bovespa-97, IPCMX-94, Hang Seng-97 using the simplest method. The novelty of this research is that the Bayesian models provide 95% credible intervals for the estimated crash time.
Bayesian methods for characterizing unknown parameters of material models
Emery, J. M.; Grigoriu, M. D.; Field Jr., R. V.
2016-02-04
A Bayesian framework is developed for characterizing the unknown parameters of probabilistic models for material properties. In this framework, the unknown parameters are viewed as random and described by their posterior distributions obtained from prior information and measurements of quantities of interest that are observable and depend on the unknown parameters. The proposed Bayesian method is applied to characterize an unknown spatial correlation of the conductivity field in the definition of a stochastic transport equation and to solve this equation by Monte Carlo simulation and stochastic reduced order models (SROMs). As a result, the Bayesian method is also employed tomore » characterize unknown parameters of material properties for laser welds from measurements of peak forces sustained by these welds.« less
Bayesian Joint Modelling for Object Localisation in Weakly Labelled Images.
Shi, Zhiyuan; Hospedales, Timothy M; Xiang, Tao
2015-10-01
We address the problem of localisation of objects as bounding boxes in images and videos with weak labels. This weakly supervised object localisation problem has been tackled in the past using discriminative models where each object class is localised independently from other classes. In this paper, a novel framework based on Bayesian joint topic modelling is proposed, which differs significantly from the existing ones in that: (1) All foreground object classes are modelled jointly in a single generative model that encodes multiple object co-existence so that "explaining away" inference can resolve ambiguity and lead to better learning and localisation. (2) Image backgrounds are shared across classes to better learn varying surroundings and "push out" objects of interest. (3) Our model can be learned with a mixture of weakly labelled and unlabelled data, allowing the large volume of unlabelled images on the Internet to be exploited for learning. Moreover, the Bayesian formulation enables the exploitation of various types of prior knowledge to compensate for the limited supervision offered by weakly labelled data, as well as Bayesian domain adaptation for transfer learning. Extensive experiments on the PASCAL VOC, ImageNet and YouTube-Object videos datasets demonstrate the effectiveness of our Bayesian joint model for weakly supervised object localisation. PMID:26340253
Maximum Likelihood Bayesian Averaging of Spatial Variability Models in Unsaturated Fractured Tuff
Ye, Ming; Neuman, Shlomo P.; Meyer, Philip D.
2004-05-25
Hydrologic analyses typically rely on a single conceptual-mathematical model. Yet hydrologic environments are open and complex, rendering them prone to multiple interpretations and mathematical descriptions. Adopting only one of these may lead to statistical bias and underestimation of uncertainty. Bayesian Model Averaging (BMA) provides an optimal way to combine the predictions of several competing models and to assess their joint predictive uncertainty. However, it tends to be computationally demanding and relies heavily on prior information about model parameters. We apply a maximum likelihood (ML) version of BMA (MLBMA) to seven alternative variogram models of log air permeability data from single-hole pneumatic injection tests in six boreholes at the Apache Leap Research Site (ALRS) in central Arizona. Unbiased ML estimates of variogram and drift parameters are obtained using Adjoint State Maximum Likelihood Cross Validation in conjunction with Universal Kriging and Generalized L east Squares. Standard information criteria provide an ambiguous ranking of the models, which does not justify selecting one of them and discarding all others as is commonly done in practice. Instead, we eliminate some of the models based on their negligibly small posterior probabilities and use the rest to project the measured log permeabilities by kriging onto a rock volume containing the six boreholes. We then average these four projections, and associated kriging variances, using the posterior probability of each model as weight. Finally, we cross-validate the results by eliminating from consideration all data from one borehole at a time, repeating the above process, and comparing the predictive capability of MLBMA with that of each individual model. We find that MLBMA is superior to any individual geostatistical model of log permeability among those we consider at the ALRS.
Modeling error distributions of growth curve models through Bayesian methods.
Zhang, Zhiyong
2016-06-01
Growth curve models are widely used in social and behavioral sciences. However, typical growth curve models often assume that the errors are normally distributed although non-normal data may be even more common than normal data. In order to avoid possible statistical inference problems in blindly assuming normality, a general Bayesian framework is proposed to flexibly model normal and non-normal data through the explicit specification of the error distributions. A simulation study shows when the distribution of the error is correctly specified, one can avoid the loss in the efficiency of standard error estimates. A real example on the analysis of mathematical ability growth data from the Early Childhood Longitudinal Study, Kindergarten Class of 1998-99 is used to show the application of the proposed methods. Instructions and code on how to conduct growth curve analysis with both normal and non-normal error distributions using the the MCMC procedure of SAS are provided. PMID:26019004
Hierarchical Bayesian spatial models for multispecies conservation planning and monitoring.
Carroll, Carlos; Johnson, Devin S; Dunk, Jeffrey R; Zielinski, William J
2010-12-01
Biologists who develop and apply habitat models are often familiar with the statistical challenges posed by their data's spatial structure but are unsure of whether the use of complex spatial models will increase the utility of model results in planning. We compared the relative performance of nonspatial and hierarchical Bayesian spatial models for three vertebrate and invertebrate taxa of conservation concern (Church's sideband snails [Monadenia churchi], red tree voles [Arborimus longicaudus], and Pacific fishers [Martes pennanti pacifica]) that provide examples of a range of distributional extents and dispersal abilities. We used presence-absence data derived from regional monitoring programs to develop models with both landscape and site-level environmental covariates. We used Markov chain Monte Carlo algorithms and a conditional autoregressive or intrinsic conditional autoregressive model framework to fit spatial models. The fit of Bayesian spatial models was between 35 and 55% better than the fit of nonspatial analogue models. Bayesian spatial models outperformed analogous models developed with maximum entropy (Maxent) methods. Although the best spatial and nonspatial models included similar environmental variables, spatial models provided estimates of residual spatial effects that suggested how ecological processes might structure distribution patterns. Spatial models built from presence-absence data improved fit most for localized endemic species with ranges constrained by poorly known biogeographic factors and for widely distributed species suspected to be strongly affected by unmeasured environmental variables or population processes. By treating spatial effects as a variable of interest rather than a nuisance, hierarchical Bayesian spatial models, especially when they are based on a common broad-scale spatial lattice (here the national Forest Inventory and Analysis grid of 24 km(2) hexagons), can increase the relevance of habitat models to multispecies
Measuring Learning Progressions Using Bayesian Modeling in Complex Assessments
ERIC Educational Resources Information Center
Rutstein, Daisy Wise
2012-01-01
This research examines issues regarding model estimation and robustness in the use of Bayesian Inference Networks (BINs) for measuring Learning Progressions (LPs). It provides background information on LPs and how they might be used in practice. Two simulation studies are performed, along with real data examples. The first study examines the case…
Probabilistic climate change predictions applying Bayesian model averaging.
Min, Seung-Ki; Simonis, Daniel; Hense, Andreas
2007-08-15
This study explores the sensitivity of probabilistic predictions of the twenty-first century surface air temperature (SAT) changes to different multi-model averaging methods using available simulations from the Intergovernmental Panel on Climate Change fourth assessment report. A way of observationally constrained prediction is provided by training multi-model simulations for the second half of the twentieth century with respect to long-term components. The Bayesian model averaging (BMA) produces weighted probability density functions (PDFs) and we compare two methods of estimating weighting factors: Bayes factor and expectation-maximization algorithm. It is shown that Bayesian-weighted PDFs for the global mean SAT changes are characterized by multi-modal structures from the middle of the twenty-first century onward, which are not clearly seen in arithmetic ensemble mean (AEM). This occurs because BMA tends to select a few high-skilled models and down-weight the others. Additionally, Bayesian results exhibit larger means and broader PDFs in the global mean predictions than the unweighted AEM. Multi-modality is more pronounced in the continental analysis using 30-year mean (2070-2099) SATs while there is only a little effect of Bayesian weighting on the 5-95% range. These results indicate that this approach to observationally constrained probabilistic predictions can be highly sensitive to the method of training, particularly for the later half of the twenty-first century, and that a more comprehensive approach combining different regions and/or variables is required. PMID:17569647
Shortlist B: A Bayesian Model of Continuous Speech Recognition
ERIC Educational Resources Information Center
Norris, Dennis; McQueen, James M.
2008-01-01
A Bayesian model of continuous speech recognition is presented. It is based on Shortlist (D. Norris, 1994; D. Norris, J. M. McQueen, A. Cutler, & S. Butterfield, 1997) and shares many of its key assumptions: parallel competitive evaluation of multiple lexical hypotheses, phonologically abstract prelexical and lexical representations, a feedforward…
Andrade, A I A S S; Stigter, T Y
2013-04-01
In this study multivariate and geostatistical methods are jointly applied to model the spatial and temporal distribution of arsenic (As) concentrations in shallow groundwater as a function of physicochemical, hydrogeological and land use parameters, as well as to assess the related uncertainty. The study site is located in the Mondego River alluvial body in Central Portugal, where maize, rice and some vegetable crops dominate. In a first analysis scatter plots are used, followed by the application of principal component analysis to two different data matrices, of 112 and 200 samples, with the aim of detecting associations between As levels and other quantitative parameters. In the following phase explanatory models of As are created through factorial regression based on correspondence analysis, integrating both quantitative and qualitative parameters. Finally, these are combined with indicator-geostatistical techniques to create maps indicating the predicted probability of As concentrations in groundwater exceeding the current global drinking water guideline of 10 μg/l. These maps further allow assessing the uncertainty and representativeness of the monitoring network. A clear effect of the redox state on the presence of As is observed, and together with significant correlations with dissolved oxygen, nitrate, sulfate, iron, manganese and alkalinity, points towards the reductive dissolution of Fe (hydr)oxides as the essential mechanism of As release. The association of high As values with rice crop, known to promote reduced environments due to ponding, further corroborates this hypothesis. An additional source of As from fertilizers cannot be excluded, as the correlation with As is higher where rice is associated with vegetables, normally associated with higher fertilization rates. The best explanatory model of As occurrence integrates the parameters season, crop type, well and water depth, nitrate and Eh, though a model without the last two parameters also gives
Resolution-matrix-constrained model updates for bayesian seismic tomography
NASA Astrophysics Data System (ADS)
Fontanini, Francesco; Bleibinhaus, Florian
2015-04-01
One of the most important issues of interpreting seismic tomography models is the need to provide a quantification of their uncertainty. Bayesian approach to inverse problems offers a rigorous way to quantitatively estimate this uncertainty at the price of an higher computation time. Optimizing bayesian algorithms is therefore a key problem. We are developing a multivariate model-updating scheme that makes use of the constraints provided by the Model Resolution Matrix , aiming to a more efficient sampling of the model space. The Resolution Matrix relates the true model to the estimate, its off-diagonal values provide a set of trade-off relations between model parameters used in our algorithm to obtain optimized model updates.
Implementation of the Iterative Proportion Fitting Algorithm for Geostatistical Facies Modeling
Li Yupeng Deutsch, Clayton V.
2012-06-15
In geostatistics, most stochastic algorithm for simulation of categorical variables such as facies or rock types require a conditional probability distribution. The multivariate probability distribution of all the grouped locations including the unsampled location permits calculation of the conditional probability directly based on its definition. In this article, the iterative proportion fitting (IPF) algorithm is implemented to infer this multivariate probability. Using the IPF algorithm, the multivariate probability is obtained by iterative modification to an initial estimated multivariate probability using lower order bivariate probabilities as constraints. The imposed bivariate marginal probabilities are inferred from profiles along drill holes or wells. In the IPF process, a sparse matrix is used to calculate the marginal probabilities from the multivariate probability, which makes the iterative fitting more tractable and practical. This algorithm can be extended to higher order marginal probability constraints as used in multiple point statistics. The theoretical framework is developed and illustrated with estimation and simulation example.
A Bayesian approach to parameter estimation in HIV dynamical models.
Putter, H; Heisterkamp, S H; Lange, J M A; de Wolf, F
2002-08-15
In the context of a mathematical model describing HIV infection, we discuss a Bayesian modelling approach to a non-linear random effects estimation problem. The model and the data exhibit a number of features that make the use of an ordinary non-linear mixed effects model intractable: (i) the data are from two compartments fitted simultaneously against the implicit numerical solution of a system of ordinary differential equations; (ii) data from one compartment are subject to censoring; (iii) random effects for one variable are assumed to be from a beta distribution. We show how the Bayesian framework can be exploited by incorporating prior knowledge on some of the parameters, and by combining the posterior distributions of the parameters to obtain estimates of quantities of interest that follow from the postulated model. PMID:12210633
Uncertainties in ozone concentrations predicted with a Lagrangian photochemical air quality model have been estimated using Bayesian Monte Carlo (BMC) analysis. Bayesian Monte Carlo analysis provides a means of combining subjective "prior" uncertainty estimates developed ...
A Bayesian nonlinear mixed-effects disease progression model
Kim, Seongho; Jang, Hyejeong; Wu, Dongfeng; Abrams, Judith
2016-01-01
A nonlinear mixed-effects approach is developed for disease progression models that incorporate variation in age in a Bayesian framework. We further generalize the probability model for sensitivity to depend on age at diagnosis, time spent in the preclinical state and sojourn time. The developed models are then applied to the Johns Hopkins Lung Project data and the Health Insurance Plan for Greater New York data using Bayesian Markov chain Monte Carlo and are compared with the estimation method that does not consider random-effects from age. Using the developed models, we obtain not only age-specific individual-level distributions, but also population-level distributions of sensitivity, sojourn time and transition probability. PMID:26798562
A Bayesian population PBPK model for multiroute chloroform exposure
Yang, Yuching; Xu, Xu; Georgopoulos, Panos G.
2011-01-01
A Bayesian hierarchical model was developed to estimate the parameters in a physiologically based pharmacokinetic (PBPK) model for chloroform using prior information and biomarker data from different exposure pathways. In particular, the model provides a quantitative description of the changes in physiological parameters associated with hot-water bath and showering scenarios. Through Bayesian inference, uncertainties in the PBPK parameters were reduced from the prior distributions. Prediction of biomarker data with the calibrated PBPK model was improved by the calibration. The posterior results indicate that blood flow rates varied under two different exposure scenarios, with a two-fold increase of the skin's blood flow rate predicted in the hot-bath scenario. This result highlights the importance of considering scenario-specific parameters in PBPK modeling. To demonstrate the application of a probability approach in toxicological assessment, results from the posterior distributions from this calibrated model were used to predict target tissue dose based on the rate of chloroform metabolized in liver. This study demonstrates the use of the Bayesian approach to optimize PBPK model parameters for typical household exposure scenarios. PMID:19471319
HIBAYES: Global 21-cm Bayesian Monte-Carlo Model Fitting
NASA Astrophysics Data System (ADS)
Zwart, Jonathan T. L.; Price, Daniel; Bernardi, Gianni
2016-06-01
HIBAYES implements fully-Bayesian extraction of the sky-averaged (global) 21-cm signal from the Cosmic Dawn and Epoch of Reionization in the presence of foreground emission. User-defined likelihood and prior functions are called by the sampler PyMultiNest (ascl:1606.005) in order to jointly explore the full (signal plus foreground) posterior probability distribution and evaluate the Bayesian evidence for a given model. Implemented models, for simulation and fitting, include gaussians (HI signal) and polynomials (foregrounds). Some simple plotting and analysis tools are supplied. The code can be extended to other models (physical or empirical), to incorporate data from other experiments, or to use alternative Monte-Carlo sampling engines as required.
Bayesian point event modeling in spatial and environmental epidemiology.
Lawson, Andrew B
2012-10-01
This paper reviews the current state of point event modeling in spatial epidemiology from a Bayesian perspective. Point event (or case event) data arise when geo-coded addresses of disease events are available. Often, this level of spatial resolution would not be accessible due to medical confidentiality constraints. However, for the examination of small spatial scales, it is important to be capable of examining point process data directly. Models for such data are usually formulated based on point process theory. In addition, special conditioning arguments can lead to simpler Bernoulli likelihoods and logistic spatial models. Goodness-of-fit diagnostics and Bayesian residuals are also considered. Applications within putative health hazard risk assessment, cluster detection, and linkage to environmental risk fields (misalignment) are considered. PMID:23035034
Application of the Bayesian dynamic survival model in medicine.
He, Jianghua; McGee, Daniel L; Niu, Xufeng
2010-02-10
The Bayesian dynamic survival model (BDSM), a time-varying coefficient survival model from the Bayesian prospective, was proposed in early 1990s but has not been widely used or discussed. In this paper, we describe the model structure of the BDSM and introduce two estimation approaches for BDSMs: the Markov Chain Monte Carlo (MCMC) approach and the linear Bayesian (LB) method. The MCMC approach estimates model parameters through sampling and is computationally intensive. With the newly developed geoadditive survival models and software BayesX, the BDSM is available for general applications. The LB approach is easier in terms of computations but it requires the prespecification of some unknown smoothing parameters. In a simulation study, we use the LB approach to show the effects of smoothing parameters on the performance of the BDSM and propose an ad hoc method for identifying appropriate values for those parameters. We also demonstrate the performance of the MCMC approach compared with the LB approach and a penalized partial likelihood method available in software R packages. A gastric cancer trial is utilized to illustrate the application of the BDSM. PMID:20014356
Bayesian approach for network modeling of brain structural features
NASA Astrophysics Data System (ADS)
Joshi, Anand A.; Joshi, Shantanu H.; Leahy, Richard M.; Shattuck, David W.; Dinov, Ivo; Toga, Arthur W.
2010-03-01
Brain connectivity patterns are useful in understanding brain function and organization. Anatomical brain connectivity is largely determined using the physical synaptic connections between neurons. In contrast statistical brain connectivity in a given brain population refers to the interaction and interdependencies of statistics of multitudes of brain features including cortical area, volume, thickness etc. Traditionally, this dependence has been studied by statistical correlations of cortical features. In this paper, we propose the use of Bayesian network modeling for inferring statistical brain connectivity patterns that relate to causal (directed) as well as non-causal (undirected) relationships between cortical surface areas. We argue that for multivariate cortical data, the Bayesian model provides for a more accurate representation by removing the effect of confounding correlations that get introduced due to canonical dependence between the data. Results are presented for a population of 466 brains, where a SEM (structural equation modeling) approach is used to generate a Bayesian network model, as well as a dependency graph for the joint distribution of cortical areas.
Bayesian Inference of High-Dimensional Dynamical Ocean Models
NASA Astrophysics Data System (ADS)
Lin, J.; Lermusiaux, P. F. J.; Lolla, S. V. T.; Gupta, A.; Haley, P. J., Jr.
2015-12-01
This presentation addresses a holistic set of challenges in high-dimension ocean Bayesian nonlinear estimation: i) predict the probability distribution functions (pdfs) of large nonlinear dynamical systems using stochastic partial differential equations (PDEs); ii) assimilate data using Bayes' law with these pdfs; iii) predict the future data that optimally reduce uncertainties; and (iv) rank the known and learn the new model formulations themselves. Overall, we allow the joint inference of the state, equations, geometry, boundary conditions and initial conditions of dynamical models. Examples are provided for time-dependent fluid and ocean flows, including cavity, double-gyre and Strait flows with jets and eddies. The Bayesian model inference, based on limited observations, is illustrated first by the estimation of obstacle shapes and positions in fluid flows. Next, the Bayesian inference of biogeochemical reaction equations and of their states and parameters is presented, illustrating how PDE-based machine learning can rigorously guide the selection and discovery of complex ecosystem models. Finally, the inference of multiscale bottom gravity current dynamics is illustrated, motivated in part by classic overflows and dense water formation sites and their relevance to climate monitoring and dynamics. This is joint work with our MSEAS group at MIT.
A localization model to localize multiple sources using Bayesian inference
NASA Astrophysics Data System (ADS)
Dunham, Joshua Rolv
Accurate localization of a sound source in a room setting is important in both psychoacoustics and architectural acoustics. Binaural models have been proposed to explain how the brain processes and utilizes the interaural time differences (ITDs) and interaural level differences (ILDs) of sound waves arriving at the ears of a listener in determining source location. Recent work shows that applying Bayesian methods to this problem is proving fruitful. In this thesis, pink noise samples are convolved with head-related transfer functions (HRTFs) and compared to combinations of one and two anechoic speech signals convolved with different HRTFs or binaural room impulse responses (BRIRs) to simulate room positions. Through exhaustive calculation of Bayesian posterior probabilities and using a maximal likelihood approach, model selection will determine the number of sources present, and parameter estimation will result in azimuthal direction of the source(s).
Slice sampling technique in Bayesian extreme of gold price modelling
NASA Astrophysics Data System (ADS)
Rostami, Mohammad; Adam, Mohd Bakri; Ibrahim, Noor Akma; Yahya, Mohamed Hisham
2013-09-01
In this paper, a simulation study of Bayesian extreme values by using Markov Chain Monte Carlo via slice sampling algorithm is implemented. We compared the accuracy of slice sampling with other methods for a Gumbel model. This study revealed that slice sampling algorithm offers more accurate and closer estimates with less RMSE than other methods . Finally we successfully employed this procedure to estimate the parameters of Malaysia extreme gold price from 2000 to 2011.
How to Address Measurement Noise in Bayesian Model Averaging
NASA Astrophysics Data System (ADS)
Schöniger, A.; Wöhling, T.; Nowak, W.
2014-12-01
When confronted with the challenge of selecting one out of several competing conceptual models for a specific modeling task, Bayesian model averaging is a rigorous choice. It ranks the plausibility of models based on Bayes' theorem, which yields an optimal trade-off between performance and complexity. With the resulting posterior model probabilities, their individual predictions are combined into a robust weighted average and the overall predictive uncertainty (including conceptual uncertainty) can be quantified. This rigorous framework does, however, not yet explicitly consider statistical significance of measurement noise in the calibration data set. This is a major drawback, because model weights might be instable due to the uncertainty in noisy data, which may compromise the reliability of model ranking. We present a new extension to the Bayesian model averaging framework that explicitly accounts for measurement noise as a source of uncertainty for the weights. This enables modelers to assess the reliability of model ranking for a specific application and a given calibration data set. Also, the impact of measurement noise on the overall prediction uncertainty can be determined. Technically, our extension is built within a Monte Carlo framework. We repeatedly perturb the observed data with random realizations of measurement error. Then, we determine the robustness of the resulting model weights against measurement noise. We quantify the variability of posterior model weights as weighting variance. We add this new variance term to the overall prediction uncertainty analysis within the Bayesian model averaging framework to make uncertainty quantification more realistic and "complete". We illustrate the importance of our suggested extension with an application to soil-plant model selection, based on studies by Wöhling et al. (2013, 2014). Results confirm that noise in leaf area index or evaporation rate observations produces a significant amount of weighting
NASA Astrophysics Data System (ADS)
Chahal, M. K.; Brown, D. J.; Brooks, E. S.; Campbell, C.; Cobos, D. R.; Vierling, L. A.
2012-12-01
Estimating soil moisture content continuously over space and time using geo-statistical techniques supports the refinement of process-based watershed hydrology models and the application of soil process models (e.g. biogeochemical models predicting greenhouse gas fluxes) to complex landscapes. In this study, we model soil profile volumetric moisture content for five agricultural fields with loess soils in the Palouse region of Eastern Washington and Northern Idaho. Using a combination of stratification and space-filling techniques, we selected 42 representative and distributed measurement locations in the Cook Agronomy Farm (Pullman, WA) and 12 locations each in four additional grower fields that span the precipitation gradient across the Palouse. At each measurement location, soil moisture was measured on an hourly basis at five different depths (30, 60, 90, 120, and 150 cm) using Decagon 5-TE/5-TM soil moisture sensors (Decagon Devices, Pullman, WA, USA). This data was collected over three years for the Cook Agronomy Farm and one year for each of the grower fields. In addition to ordinary kriging, we explored the correlation of volumetric water content with external, spatially exhaustive indices derived from terrain models, optical remote sensing imagery, and proximal soil sensing data (electromagnetic induction and VisNIR penetrometer)
Bayesian regression model for seasonal forecast of precipitation over Korea
NASA Astrophysics Data System (ADS)
Jo, Seongil; Lim, Yaeji; Lee, Jaeyong; Kang, Hyun-Suk; Oh, Hee-Seok
2012-08-01
In this paper, we apply three different Bayesian methods to the seasonal forecasting of the precipitation in a region around Korea (32.5°N-42.5°N, 122.5°E-132.5°E). We focus on the precipitation of summer season (June-July-August; JJA) for the period of 1979-2007 using the precipitation produced by the Global Data Assimilation and Prediction System (GDAPS) as predictors. Through cross-validation, we demonstrate improvement for seasonal forecast of precipitation in terms of root mean squared error (RMSE) and linear error in probability space score (LEPS). The proposed methods yield RMSE of 1.09 and LEPS of 0.31 between the predicted and observed precipitations, while the prediction using GDAPS output only produces RMSE of 1.20 and LEPS of 0.33 for CPC Merged Analyzed Precipitation (CMAP) data. For station-measured precipitation data, the RMSE and LEPS of the proposed Bayesian methods are 0.53 and 0.29, while GDAPS output is 0.66 and 0.33, respectively. The methods seem to capture the spatial pattern of the observed precipitation. The Bayesian paradigm incorporates the model uncertainty as an integral part of modeling in a natural way. We provide a probabilistic forecast integrating model uncertainty.
AIC, BIC, Bayesian evidence against the interacting dark energy model
NASA Astrophysics Data System (ADS)
Szydłowski, Marek; Krawiec, Adam; Kurek, Aleksandra; Kamionka, Michał
2015-01-01
Recent astronomical observations have indicated that the Universe is in a phase of accelerated expansion. While there are many cosmological models which try to explain this phenomenon, we focus on the interacting CDM model where an interaction between the dark energy and dark matter sectors takes place. This model is compared to its simpler alternative—the CDM model. To choose between these models the likelihood ratio test was applied as well as the model comparison methods (employing Occam's principle): the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the Bayesian evidence. Using the current astronomical data: type Ia supernova (Union2.1), , baryon acoustic oscillation, the Alcock-Paczynski test, and the cosmic microwave background data, we evaluated both models. The analyses based on the AIC indicated that there is less support for the interacting CDM model when compared to the CDM model, while those based on the BIC indicated that there is strong evidence against it in favor of the CDM model. Given the weak or almost non-existing support for the interacting CDM model and bearing in mind Occam's razor we are inclined to reject this model.
Todd, M Jason; Lowrance, R Richard; Goovaerts, Pierre; Vellidis, George; Pringle, Catherine M
2010-10-15
Blackwater streams are found throughout the Coastal Plain of the southeastern United States and are characterized by a series of instream floodplain swamps that play a critical role in determining the water quality of these systems. Within the state of Georgia, many of these streams are listed in violation of the state's dissolved oxygen (DO) standard. Previous work has shown that sediment oxygen demand (SOD) is elevated in instream floodplain swamps and due to these areas of intense oxygen demand, these locations play a major role in determining the oxygen balance of the watershed as a whole. This work also showed SOD rates to be positively correlated with the concentration of total organic carbon. This study builds on previous work by using geostatistics and Sequential Gaussian Simulation to investigate the patchiness and distribution of total organic carbon (TOC) at the reach scale. This was achieved by interpolating TOC observations and simulated SOD rates based on a linear regression. Additionally, this study identifies areas within the stream system prone to high SOD at representative 3rd and 5th order locations. Results show that SOD was spatially correlated with the differences in distribution of TOC at both locations and that these differences in distribution are likely a result of the differing hydrologic regime and watershed position. Mapping of floodplain soils at the watershed scale shows that areas of organic sediment are widespread and become more prevalent in higher order streams. DO dynamics within blackwater systems are a complicated mix of natural and anthropogenic influences, but this paper illustrates the importance of instream swamps in enhancing SOD at the watershed scale. Moreover, our study illustrates the influence of instream swamps on oxygen demand while providing support that many of these systems are naturally low in DO. PMID:20938491
Todd, M. Jason; Lowrance, R. Richard; Goovaerts, Pierre; Vellidis, George; Pringle, Catherine M.
2010-01-01
Blackwater streams are found throughout the Coastal Plain of the southeastern United States and are characterized by a series of instream floodplain swamps that play a critical role in determining the water quality of these systems. Within the state of Georgia, many of these streams are listed in violation of the state’s dissolved oxygen (DO) standard. Previous work has shown that sediment oxygen demand (SOD) is elevated in instream floodplain swamps and due to these areas of intense oxygen demand, these locations play a major role in determining the oxygen balance of the watershed as a whole. This work also showed SOD rates to be positively correlated with the concentration of total organic carbon. This study builds on previous work by using geostatistics and Sequential Gaussian Simulation to investigate the patchiness and distribution of total organic carbon (TOC) at the reach scale. This was achieved by interpolating TOC observations and simulated SOD rates based on a linear regression. Additionally, this study identifies areas within the stream system prone to high SOD at representative 3rd and 5th order locations. Results show that SOD was spatially correlated with the differences in distribution of TOC at both locations and that these differences in distribution are likely a result of the differing hydrologic regime and watershed position. Mapping of floodplain soils at the watershed scale shows that areas of organic sediment are widespread and become more prevalent in higher order streams. DO dynamics within blackwater systems are a complicated mix of natural and anthropogenic influences, but this paper illustrates the importance of instream swamps in enhancing SOD at the watershed scale. Moreover, our study illustrates the influence of instream swamps on oxygen demand while providing support that many of these systems are naturally low in DO. PMID:20938491
Dissecting Magnetar Variability with Bayesian Hierarchical Models
NASA Astrophysics Data System (ADS)
Huppenkothen, Daniela; Brewer, Brendon J.; Hogg, David W.; Murray, Iain; Frean, Marcus; Elenbaas, Chris; Watts, Anna L.; Levin, Yuri; van der Horst, Alexander J.; Kouveliotou, Chryssa
2015-09-01
Neutron stars are a prime laboratory for testing physical processes under conditions of strong gravity, high density, and extreme magnetic fields. Among the zoo of neutron star phenomena, magnetars stand out for their bursting behavior, ranging from extremely bright, rare giant flares to numerous, less energetic recurrent bursts. The exact trigger and emission mechanisms for these bursts are not known; favored models involve either a crust fracture and subsequent energy release into the magnetosphere, or explosive reconnection of magnetic field lines. In the absence of a predictive model, understanding the physical processes responsible for magnetar burst variability is difficult. Here, we develop an empirical model that decomposes magnetar bursts into a superposition of small spike-like features with a simple functional form, where the number of model components is itself part of the inference problem. The cascades of spikes that we model might be formed by avalanches of reconnection, or crust rupture aftershocks. Using Markov Chain Monte Carlo sampling augmented with reversible jumps between models with different numbers of parameters, we characterize the posterior distributions of the model parameters and the number of components per burst. We relate these model parameters to physical quantities in the system, and show for the first time that the variability within a burst does not conform to predictions from ideas of self-organized criticality. We also examine how well the properties of the spikes fit the predictions of simplified cascade models for the different trigger mechanisms.
NASA Astrophysics Data System (ADS)
Parasyris, Antonios E.; Spanoudaki, Katerina; Kampanis, Nikolaos A.
2016-04-01
Groundwater level monitoring networks provide essential information for water resources management, especially in areas with significant groundwater exploitation for agricultural and domestic use. Given the high maintenance costs of these networks, development of tools, which can be used by regulators for efficient network design is essential. In this work, a monitoring network optimisation tool is presented. The network optimisation tool couples geostatistical modelling based on the Spartan family variogram with a genetic algorithm method and is applied to Mires basin in Crete, Greece, an area of high socioeconomic and agricultural interest, which suffers from groundwater overexploitation leading to a dramatic decrease of groundwater levels. The purpose of the optimisation tool is to determine which wells to exclude from the monitoring network because they add little or no beneficial information to groundwater level mapping of the area. Unlike previous relevant investigations, the network optimisation tool presented here uses Ordinary Kriging with the recently-established non-differentiable Spartan variogram for groundwater level mapping, which, based on a previous geostatistical study in the area leads to optimal groundwater level mapping. Seventy boreholes operate in the area for groundwater abstraction and water level monitoring. The Spartan variogram gives overall the most accurate groundwater level estimates followed closely by the power-law model. The geostatistical model is coupled to an integer genetic algorithm method programmed in MATLAB 2015a. The algorithm is used to find the set of wells whose removal leads to the minimum error between the original water level mapping using all the available wells in the network and the groundwater level mapping using the reduced well network (error is defined as the 2-norm of the difference between the original mapping matrix with 70 wells and the mapping matrix of the reduced well network). The solution to the
Bayesian Transformation Models for Multivariate Survival Data
DE CASTRO, MÁRIO; CHEN, MING-HUI; IBRAHIM, JOSEPH G.; KLEIN, JOHN P.
2014-01-01
In this paper we propose a general class of gamma frailty transformation models for multivariate survival data. The transformation class includes the commonly used proportional hazards and proportional odds models. The proposed class also includes a family of cure rate models. Under an improper prior for the parameters, we establish propriety of the posterior distribution. A novel Gibbs sampling algorithm is developed for sampling from the observed data posterior distribution. A simulation study is conducted to examine the properties of the proposed methodology. An application to a data set from a cord blood transplantation study is also reported. PMID:24904194
NASA Astrophysics Data System (ADS)
Ly, S.; Charles, C.; Degré, A.
2011-07-01
Spatial interpolation of precipitation data is of great importance for hydrological modelling. Geostatistical methods (kriging) are widely applied in spatial interpolation from point measurement to continuous surfaces. The first step in kriging computation is the semi-variogram modelling which usually used only one variogram model for all-moment data. The objective of this paper was to develop different algorithms of spatial interpolation for daily rainfall on 1 km2 regular grids in the catchment area and to compare the results of geostatistical and deterministic approaches. This study leaned on 30-yr daily rainfall data of 70 raingages in the hilly landscape of the Ourthe and Ambleve catchments in Belgium (2908 km2). This area lies between 35 and 693 m in elevation and consists of river networks, which are tributaries of the Meuse River. For geostatistical algorithms, seven semi-variogram models (logarithmic, power, exponential, Gaussian, rational quadratic, spherical and penta-spherical) were fitted to daily sample semi-variogram on a daily basis. These seven variogram models were also adopted to avoid negative interpolated rainfall. The elevation, extracted from a digital elevation model, was incorporated into multivariate geostatistics. Seven validation raingages and cross validation were used to compare the interpolation performance of these algorithms applied to different densities of raingages. We found that between the seven variogram models used, the Gaussian model was the most frequently best fit. Using seven variogram models can avoid negative daily rainfall in ordinary kriging. The negative estimates of kriging were observed for convective more than stratiform rain. The performance of the different methods varied slightly according to the density of raingages, particularly between 8 and 70 raingages but it was much different for interpolation using 4 raingages. Spatial interpolation with the geostatistical and Inverse Distance Weighting (IDW) algorithms
Bayesian inference and model comparison for metallic fatigue data
NASA Astrophysics Data System (ADS)
Babuška, Ivo; Sawlan, Zaid; Scavino, Marco; Szabó, Barna; Tempone, Raúl
2016-06-01
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. To assess the robustness of the estimation of the quantile functions, we obtain bootstrap confidence bands by stratified resampling with respect to the cycle ratio. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions. We implement and apply Bayesian model comparison methods, such as Bayes factor ranking and predictive information criteria based on cross-validation techniques under various a priori scenarios.
3-D model-based Bayesian classification
Soenneland, L.; Tenneboe, P.; Gehrmann, T.; Yrke, O.
1994-12-31
The challenging task of the interpreter is to integrate different pieces of information and combine them into an earth model. The sophistication level of this earth model might vary from the simplest geometrical description to the most complex set of reservoir parameters related to the geometrical description. Obviously the sophistication level also depend on the completeness of the available information. The authors describe the interpreter`s task as a mapping between the observation space and the model space. The information available to the interpreter exists in observation space and the task is to infer a model in model-space. It is well-known that this inversion problem is non-unique. Therefore any attempt to find a solution depend son constraints being added in some manner. The solution will obviously depend on which constraints are introduced and it would be desirable to allow the interpreter to modify the constraints in a problem-dependent manner. They will present a probabilistic framework that gives the interpreter the tools to integrate the different types of information and produce constrained solutions. The constraints can be adapted to the problem at hand.
Bayesian Local Contamination Models for Multivariate Outliers
Page, Garritt L.; Dunson, David B.
2013-01-01
In studies where data are generated from multiple locations or sources it is common for there to exist observations that are quite unlike the majority. Motivated by the application of establishing a reference value in an inter-laboratory setting when outlying labs are present, we propose a local contamination model that is able to accommodate unusual multivariate realizations in a flexible way. The proposed method models the process level of a hierarchical model using a mixture with a parametric component and a possibly nonparametric contamination. Much of the flexibility in the methodology is achieved by allowing varying random subsets of the elements in the lab-specific mean vectors to be allocated to the contamination component. Computational methods are developed and the methodology is compared to three other possible approaches using a simulation study. We apply the proposed method to a NIST/NOAA sponsored inter-laboratory study which motivated the methodological development. PMID:24363465
Predicting coastal cliff erosion using a Bayesian probabilistic model
Hapke, C.; Plant, N.
2010-01-01
Regional coastal cliff retreat is difficult to model due to the episodic nature of failures and the along-shore variability of retreat events. There is a growing demand, however, for predictive models that can be used to forecast areas vulnerable to coastal erosion hazards. Increasingly, probabilistic models are being employed that require data sets of high temporal density to define the joint probability density function that relates forcing variables (e.g. wave conditions) and initial conditions (e.g. cliff geometry) to erosion events. In this study we use a multi-parameter Bayesian network to investigate correlations between key variables that control and influence variations in cliff retreat processes. The network uses Bayesian statistical methods to estimate event probabilities using existing observations. Within this framework, we forecast the spatial distribution of cliff retreat along two stretches of cliffed coast in Southern California. The input parameters are the height and slope of the cliff, a descriptor of material strength based on the dominant cliff-forming lithology, and the long-term cliff erosion rate that represents prior behavior. The model is forced using predicted wave impact hours. Results demonstrate that the Bayesian approach is well-suited to the forward modeling of coastal cliff retreat, with the correct outcomes forecast in 70-90% of the modeled transects. The model also performs well in identifying specific locations of high cliff erosion, thus providing a foundation for hazard mapping. This approach can be employed to predict cliff erosion at time-scales ranging from storm events to the impacts of sea-level rise at the century-scale. ?? 2010.
Bayesian sensitivity analysis of bifurcating nonlinear models
NASA Astrophysics Data System (ADS)
Becker, W.; Worden, K.; Rowson, J.
2013-01-01
Sensitivity analysis allows one to investigate how changes in input parameters to a system affect the output. When computational expense is a concern, metamodels such as Gaussian processes can offer considerable computational savings over Monte Carlo methods, albeit at the expense of introducing a data modelling problem. In particular, Gaussian processes assume a smooth, non-bifurcating response surface. This work highlights a recent extension to Gaussian processes which uses a decision tree to partition the input space into homogeneous regions, and then fits separate Gaussian processes to each region. In this way, bifurcations can be modelled at region boundaries and different regions can have different covariance properties. To test this method, both the treed and standard methods were applied to the bifurcating response of a Duffing oscillator and a bifurcating FE model of a heart valve. It was found that the treed Gaussian process provides a practical way of performing uncertainty and sensitivity analysis on large, potentially-bifurcating models, which cannot be dealt with by using a single GP, although an open problem remains how to manage bifurcation boundaries that are not parallel to coordinate axes.
Bayesian calibration of hyperelastic constitutive models of soft tissue.
Madireddy, Sandeep; Sista, Bhargava; Vemaganti, Kumar
2016-06-01
There is inherent variability in the experimental response used to characterize the hyperelastic mechanical response of soft tissues. This has to be accounted for while estimating the parameters in the constitutive models to obtain reliable estimates of the quantities of interest. The traditional least squares method of parameter estimation does not give due importance to this variability. We use a Bayesian calibration framework based on nested Monte Carlo sampling to account for the variability in the experimental data and its effect on the estimated parameters through a systematic probability-based treatment. We consider three different constitutive models to represent the hyperelastic nature of soft tissue: Mooney-Rivlin model, exponential model, and Ogden model. Three stress-strain data sets corresponding to the deformation of agarose gel, bovine liver tissue, and porcine brain tissue are considered. Bayesian fits and parameter estimates are compared with the corresponding least squares values. Finally, we propagate the uncertainty in the parameters to a quantity of interest (QoI), namely the force-indentation response, to study the effect of model form on the values of the QoI. Our results show that the quality of the fit alone is insufficient to determine the adequacy of the model, and due importance has to be given to the maximum likelihood value, the landscape of the likelihood distribution, and model complexity. PMID:26751706
DPpackage: Bayesian Non- and Semi-parametric Modelling in R.
Jara, Alejandro; Hanson, Timothy E; Quintana, Fernando A; Müller, Peter; Rosner, Gary L
2011-04-01
Data analysis sometimes requires the relaxation of parametric assumptions in order to gain modeling flexibility and robustness against mis-specification of the probability model. In the Bayesian context, this is accomplished by placing a prior distribution on a function space, such as the space of all probability distributions or the space of all regression functions. Unfortunately, posterior distributions ranging over function spaces are highly complex and hence sampling methods play a key role. This paper provides an introduction to a simple, yet comprehensive, set of programs for the implementation of some Bayesian non- and semi-parametric models in R, DPpackage. Currently DPpackage includes models for marginal and conditional density estimation, ROC curve analysis, interval-censored data, binary regression data, item response data, longitudinal and clustered data using generalized linear mixed models, and regression data using generalized additive models. The package also contains functions to compute pseudo-Bayes factors for model comparison, and for eliciting the precision parameter of the Dirichlet process prior. To maximize computational efficiency, the actual sampling for each model is carried out using compiled FORTRAN. PMID:21796263
Meyer, Swen; Blaschek, Michael; Duttmann, Rainer; Ludwig, Ralf
2016-02-01
According to current climate projections, Mediterranean countries are at high risk for an even pronounced susceptibility to changes in the hydrological budget and extremes. These changes are expected to have severe direct impacts on the management of water resources, agricultural productivity and drinking water supply. Current projections of future hydrological change, based on regional climate model results and subsequent hydrological modeling schemes, are very uncertain and poorly validated. The Rio Mannu di San Sperate Basin, located in Sardinia, Italy, is one test site of the CLIMB project. The Water Simulation Model (WaSiM) was set up to model current and future hydrological conditions. The availability of measured meteorological and hydrological data is poor as it is common for many Mediterranean catchments. In this study we conducted a soil sampling campaign in the Rio Mannu catchment. We tested different deterministic and hybrid geostatistical interpolation methods on soil textures and tested the performance of the applied models. We calculated a new soil texture map based on the best prediction method. The soil model in WaSiM was set up with the improved new soil information. The simulation results were compared to standard soil parametrization. WaSiMs was validated with spatial evapotranspiration rates using the triangle method (Jiang and Islam, 1999). WaSiM was driven with the meteorological forcing taken from 4 different ENSEMBLES climate projections for a reference (1971-2000) and a future (2041-2070) times series. The climate change impact was assessed based on differences between reference and future time series. The simulated results show a reduction of all hydrological quantities in the future in the spring season. Furthermore simulation results reveal an earlier onset of dry conditions in the catchment. We show that a solid soil model setup based on short-term field measurements can improve long-term modeling results, which is especially important
Estimating anatomical trajectories with Bayesian mixed-effects modeling
Ziegler, G.; Penny, W.D.; Ridgway, G.R.; Ourselin, S.; Friston, K.J.
2015-01-01
We introduce a mass-univariate framework for the analysis of whole-brain structural trajectories using longitudinal Voxel-Based Morphometry data and Bayesian inference. Our approach to developmental and aging longitudinal studies characterizes heterogeneous structural growth/decline between and within groups. In particular, we propose a probabilistic generative model that parameterizes individual and ensemble average changes in brain structure using linear mixed-effects models of age and subject-specific covariates. Model inversion uses Expectation Maximization (EM), while voxelwise (empirical) priors on the size of individual differences are estimated from the data. Bayesian inference on individual and group trajectories is realized using Posterior Probability Maps (PPM). In addition to parameter inference, the framework affords comparisons of models with varying combinations of model order for fixed and random effects using model evidence. We validate the model in simulations and real MRI data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) project. We further demonstrate how subject specific characteristics contribute to individual differences in longitudinal volume changes in healthy subjects, Mild Cognitive Impairment (MCI), and Alzheimer's Disease (AD). PMID:26190405
Estimating anatomical trajectories with Bayesian mixed-effects modeling.
Ziegler, G; Penny, W D; Ridgway, G R; Ourselin, S; Friston, K J
2015-11-01
We introduce a mass-univariate framework for the analysis of whole-brain structural trajectories using longitudinal Voxel-Based Morphometry data and Bayesian inference. Our approach to developmental and aging longitudinal studies characterizes heterogeneous structural growth/decline between and within groups. In particular, we propose a probabilistic generative model that parameterizes individual and ensemble average changes in brain structure using linear mixed-effects models of age and subject-specific covariates. Model inversion uses Expectation Maximization (EM), while voxelwise (empirical) priors on the size of individual differences are estimated from the data. Bayesian inference on individual and group trajectories is realized using Posterior Probability Maps (PPM). In addition to parameter inference, the framework affords comparisons of models with varying combinations of model order for fixed and random effects using model evidence. We validate the model in simulations and real MRI data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) project. We further demonstrate how subject specific characteristics contribute to individual differences in longitudinal volume changes in healthy subjects, Mild Cognitive Impairment (MCI), and Alzheimer's Disease (AD). PMID:26190405
Bayesian analysis of physiologically based toxicokinetic and toxicodynamic models.
Hack, C Eric
2006-04-17
Physiologically based toxicokinetic (PBTK) and toxicodynamic (TD) models of bromate in animals and humans would improve our ability to accurately estimate the toxic doses in humans based on available animal studies. These mathematical models are often highly parameterized and must be calibrated in order for the model predictions of internal dose to adequately fit the experimentally measured doses. Highly parameterized models are difficult to calibrate and it is difficult to obtain accurate estimates of uncertainty or variability in model parameters with commonly used frequentist calibration methods, such as maximum likelihood estimation (MLE) or least squared error approaches. The Bayesian approach called Markov chain Monte Carlo (MCMC) analysis can be used to successfully calibrate these complex models. Prior knowledge about the biological system and associated model parameters is easily incorporated in this approach in the form of prior parameter distributions, and the distributions are refined or updated using experimental data to generate posterior distributions of parameter estimates. The goal of this paper is to give the non-mathematician a brief description of the Bayesian approach and Markov chain Monte Carlo analysis, how this technique is used in risk assessment, and the issues associated with this approach. PMID:16466842
Lack of confidence in approximate Bayesian computation model choice.
Robert, Christian P; Cornuet, Jean-Marie; Marin, Jean-Michel; Pillai, Natesh S
2011-09-13
Approximate Bayesian computation (ABC) have become an essential tool for the analysis of complex stochastic models. Grelaud et al. [(2009) Bayesian Anal 3:427-442] advocated the use of ABC for model choice in the specific case of Gibbs random fields, relying on an intermodel sufficiency property to show that the approximation was legitimate. We implemented ABC model choice in a wide range of phylogenetic models in the Do It Yourself-ABC (DIY-ABC) software [Cornuet et al. (2008) Bioinformatics 24:2713-2719]. We now present arguments as to why the theoretical arguments for ABC model choice are missing, because the algorithm involves an unknown loss of information induced by the use of insufficient summary statistics. The approximation error of the posterior probabilities of the models under comparison may thus be unrelated with the computational effort spent in running an ABC algorithm. We then conclude that additional empirical verifications of the performances of the ABC procedure as those available in DIY-ABC are necessary to conduct model choice. PMID:21876135
Bayesian partial linear model for skewed longitudinal data.
Tang, Yuanyuan; Sinha, Debajyoti; Pati, Debdeep; Lipsitz, Stuart; Lipshultz, Steven
2015-07-01
Unlike majority of current statistical models and methods focusing on mean response for highly skewed longitudinal data, we present a novel model for such data accommodating a partially linear median regression function, a skewed error distribution and within subject association structures. We provide theoretical justifications for our methods including asymptotic properties of the posterior and associated semiparametric Bayesian estimators. We also provide simulation studies to investigate the finite sample properties of our methods. Several advantages of our method compared with existing methods are demonstrated via analysis of a cardiotoxicity study of children of HIV-infected mothers. PMID:25792623
Goodness-of-fit diagnostics for Bayesian hierarchical models.
Yuan, Ying; Johnson, Valen E
2012-03-01
This article proposes methodology for assessing goodness of fit in Bayesian hierarchical models. The methodology is based on comparing values of pivotal discrepancy measures (PDMs), computed using parameter values drawn from the posterior distribution, to known reference distributions. Because the resulting diagnostics can be calculated from standard output of Markov chain Monte Carlo algorithms, their computational costs are minimal. Several simulation studies are provided, each of which suggests that diagnostics based on PDMs have higher statistical power than comparable posterior-predictive diagnostic checks in detecting model departures. The proposed methodology is illustrated in a clinical application; an application to discrete data is described in supplementary material. PMID:22050079
A study of finite mixture model: Bayesian approach on financial time series data
NASA Astrophysics Data System (ADS)
Phoong, Seuk-Yen; Ismail, Mohd Tahir
2014-07-01
Recently, statistician have emphasized on the fitting finite mixture model by using Bayesian method. Finite mixture model is a mixture of distributions in modeling a statistical distribution meanwhile Bayesian method is a statistical method that use to fit the mixture model. Bayesian method is being used widely because it has asymptotic properties which provide remarkable result. In addition, Bayesian method also shows consistency characteristic which means the parameter estimates are close to the predictive distributions. In the present paper, the number of components for mixture model is studied by using Bayesian Information Criterion. Identify the number of component is important because it may lead to an invalid result. Later, the Bayesian method is utilized to fit the k-component mixture model in order to explore the relationship between rubber price and stock market price for Malaysia, Thailand, Philippines and Indonesia. Lastly, the results showed that there is a negative effect among rubber price and stock market price for all selected countries.
Genealogical Working Distributions for Bayesian Model Testing with Phylogenetic Uncertainty.
Baele, Guy; Lemey, Philippe; Suchard, Marc A
2016-03-01
Marginal likelihood estimates to compare models using Bayes factors frequently accompany Bayesian phylogenetic inference. Approaches to estimate marginal likelihoods have garnered increased attention over the past decade. In particular, the introduction of path sampling (PS) and stepping-stone sampling (SS) into Bayesian phylogenetics has tremendously improved the accuracy of model selection. These sampling techniques are now used to evaluate complex evolutionary and population genetic models on empirical data sets, but considerable computational demands hamper their widespread adoption. Further, when very diffuse, but proper priors are specified for model parameters, numerical issues complicate the exploration of the priors, a necessary step in marginal likelihood estimation using PS or SS. To avoid such instabilities, generalized SS (GSS) has recently been proposed, introducing the concept of "working distributions" to facilitate--or shorten--the integration process that underlies marginal likelihood estimation. However, the need to fix the tree topology currently limits GSS in a coalescent-based framework. Here, we extend GSS by relaxing the fixed underlying tree topology assumption. To this purpose, we introduce a "working" distribution on the space of genealogies, which enables estimating marginal likelihoods while accommodating phylogenetic uncertainty. We propose two different "working" distributions that help GSS to outperform PS and SS in terms of accuracy when comparing demographic and evolutionary models applied to synthetic data and real-world examples. Further, we show that the use of very diffuse priors can lead to a considerable overestimation in marginal likelihood when using PS and SS, while still retrieving the correct marginal likelihood using both GSS approaches. The methods used in this article are available in BEAST, a powerful user-friendly software package to perform Bayesian evolutionary analyses. PMID:26526428
NASA Astrophysics Data System (ADS)
Over, M. W.; Wollschlaeger, U.; Osorio-Murillo, C. A.; Ames, D. P.; Rubin, Y.
2013-12-01
Good estimates for water retention and hydraulic conductivity functions are essential for accurate modeling of the nonlinear water dynamics of unsaturated soils. Parametric mathematical models for these functions are utilized in numerical applications of vadose zone dynamics; therefore, characterization of the model parameters to represent in situ soil properties is the goal of many inversion or calibration techniques. A critical, statistical challenge of existing approaches is the subjective, user-definition of a likelihood function or objective function - a step known to introduce bias in the results. We present a methodology for Bayesian inversion where the likelihood function is inferred directly from the simulation data, which eliminates subjectivity. Additionally, our approach assumes that there is no one parameterization that is appropriate for soils, but rather that the parameters are randomly distributed. This introduces the familiar concept from groundwater hydrogeology of structural models into vadose zone applications, but without attempting to apply geostatistics, which is extremely difficult in unsaturated problems. We validate our robust statistical approach on field data obtained during a multi-layer, natural boundary condition experiment and compare with previous optimizations using the same data. Our confidence intervals for the water retention and hydraulic conductivity functions as well as joint posterior probability distributions of the Mualem-van Genuchten parameters compare well with the previous work. The entire analysis was carried out using the free, open-source MAD# software available at http://mad.codeplex.com/.
Bayesian joint modeling of longitudinal and spatial survival AIDS data.
Martins, Rui; Silva, Giovani L; Andreozzi, Valeska
2016-08-30
Joint analysis of longitudinal and survival data has received increasing attention in the recent years, especially for analyzing cancer and AIDS data. As both repeated measurements (longitudinal) and time-to-event (survival) outcomes are observed in an individual, a joint modeling is more appropriate because it takes into account the dependence between the two types of responses, which are often analyzed separately. We propose a Bayesian hierarchical model for jointly modeling longitudinal and survival data considering functional time and spatial frailty effects, respectively. That is, the proposed model deals with non-linear longitudinal effects and spatial survival effects accounting for the unobserved heterogeneity among individuals living in the same region. This joint approach is applied to a cohort study of patients with HIV/AIDS in Brazil during the years 2002-2006. Our Bayesian joint model presents considerable improvements in the estimation of survival times of the Brazilian HIV/AIDS patients when compared with those obtained through a separate survival model and shows that the spatial risk of death is the same across the different Brazilian states. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26990773
Structural and parameter uncertainty in Bayesian cost-effectiveness models
Jackson, Christopher H; Sharples, Linda D; Thompson, Simon G
2010-01-01
Health economic decision models are subject to various forms of uncertainty, including uncertainty about the parameters of the model and about the model structure. These uncertainties can be handled within a Bayesian framework, which also allows evidence from previous studies to be combined with the data. As an example, we consider a Markov model for assessing the cost-effectiveness of implantable cardioverter defibrillators. Using Markov chain Monte Carlo posterior simulation, uncertainty about the parameters of the model is formally incorporated in the estimates of expected cost and effectiveness. We extend these methods to include uncertainty about the choice between plausible model structures. This is accounted for by averaging the posterior distributions from the competing models using weights that are derived from the pseudo-marginal-likelihood and the deviance information criterion, which are measures of expected predictive utility. We also show how these cost-effectiveness calculations can be performed efficiently in the widely used software WinBUGS. PMID:20383261
Quantum-Like Bayesian Networks for Modeling Decision Making
Moreira, Catarina; Wichert, Andreas
2016-01-01
In this work, we explore an alternative quantum structure to perform quantum probabilistic inferences to accommodate the paradoxical findings of the Sure Thing Principle. We propose a Quantum-Like Bayesian Network, which consists in replacing classical probabilities by quantum probability amplitudes. However, since this approach suffers from the problem of exponential growth of quantum parameters, we also propose a similarity heuristic that automatically fits quantum parameters through vector similarities. This makes the proposed model general and predictive in contrast to the current state of the art models, which cannot be generalized for more complex decision scenarios and that only provide an explanatory nature for the observed paradoxes. In the end, the model that we propose consists in a nonparametric method for estimating inference effects from a statistical point of view. It is a statistical model that is simpler than the previous quantum dynamic and quantum-like models proposed in the literature. We tested the proposed network with several empirical data from the literature, mainly from the Prisoner's Dilemma game and the Two Stage Gambling game. The results obtained show that the proposed quantum Bayesian Network is a general method that can accommodate violations of the laws of classical probability theory and make accurate predictions regarding human decision-making in these scenarios. PMID:26858669
Predictive RANS simulations via Bayesian Model-Scenario Averaging
Edeling, W.N.; Cinnella, P.; Dwight, R.P.
2014-10-15
The turbulence closure model is the dominant source of error in most Reynolds-Averaged Navier–Stokes simulations, yet no reliable estimators for this error component currently exist. Here we develop a stochastic, a posteriori error estimate, calibrated to specific classes of flow. It is based on variability in model closure coefficients across multiple flow scenarios, for multiple closure models. The variability is estimated using Bayesian calibration against experimental data for each scenario, and Bayesian Model-Scenario Averaging (BMSA) is used to collate the resulting posteriors, to obtain a stochastic estimate of a Quantity of Interest (QoI) in an unmeasured (prediction) scenario. The scenario probabilities in BMSA are chosen using a sensor which automatically weights those scenarios in the calibration set which are similar to the prediction scenario. The methodology is applied to the class of turbulent boundary-layers subject to various pressure gradients. For all considered prediction scenarios the standard-deviation of the stochastic estimate is consistent with the measurement ground truth. Furthermore, the mean of the estimate is more consistently accurate than the individual model predictions.
Assessing uncertainty in a stand growth model by Bayesian synthesis
Green, E.J.; MacFarlane, D.W.; Valentine, H.T.; Strawderman, W.E.
1999-11-01
The Bayesian synthesis method (BSYN) was used to bound the uncertainty in projections calculated with PIPESTEM, a mechanistic model of forest growth. The application furnished posterior distributions of (a) the values of the model's parameters, and (b) the values of three of the model's output variables--basal area per unit land area, average tree height, and tree density--at different points in time. Confidence or credible intervals for the output variables were obtained directly from the posterior distributions. The application also provides estimates of correlation among the parameters and output variables. BSYN, which originally was applied to a population dynamics model for bowhead whales, is generally applicable to deterministic models. Extension to two or more linked models is discussed. A simple worked example is included in an appendix.
A Bayesian approach to biokinetic models of internally- deposited radionuclides
NASA Astrophysics Data System (ADS)
Amer, Mamun F.
Bayesian methods were developed and applied to estimate parameters of biokinetic models of internally deposited radionuclides for the first time. Marginal posterior densities for the parameters, given the available data, were obtained and graphed. These densities contain all the information available about the parameters and fully describe their uncertainties. Two different numerical integration methods were employed to approximate the multi-dimensional integrals needed to obtain these densities and to verify our results. One numerical method was based on Gaussian quadrature. The other method was a lattice rule that was developed by Conroy. The lattice rule method is applied here for the first time in conjunction with Bayesian analysis. Computer codes were developed in Mathematica's own programming language to perform the integrals. Several biokinetic models were studied. The first model was a single power function, a/ t-b that was used to describe 226Ra whole body retention data for long periods of time in many patients. The posterior odds criterion for model identification was applied to select, from among some competing models, the best model to represent 226Ra retention in man. The highest model posterior was attained by the single power function. Posterior densities for the model parameters were obtained for each patient. Also, predictive densities for retention, given the available retention values and some selected times, were obtained. These predictive densities characterize the uncertainties in the unobservable retention values taking into consideration the uncertainties of other parameters in the model. The second model was a single exponential function, α e-/beta t, that was used to represent one patient's whole body retention as well as total excretion of 137Cs. Missing observations (censored data) in the two responses were replaced by unknown parameters and were handled in the same way other model parameters are treated. By applying the Bayesian
Assessing global vegetation activity using spatio-temporal Bayesian modelling
NASA Astrophysics Data System (ADS)
Mulder, Vera L.; van Eck, Christel M.; Friedlingstein, Pierre; Regnier, Pierre A. G.
2016-04-01
This work demonstrates the potential of modelling vegetation activity using a hierarchical Bayesian spatio-temporal model. This approach allows modelling changes in vegetation and climate simultaneous in space and time. Changes of vegetation activity such as phenology are modelled as a dynamic process depending on climate variability in both space and time. Additionally, differences in observed vegetation status can be contributed to other abiotic ecosystem properties, e.g. soil and terrain properties. Although these properties do not change in time, they do change in space and may provide valuable information in addition to the climate dynamics. The spatio-temporal Bayesian models were calibrated at a regional scale because the local trends in space and time can be better captured by the model. The regional subsets were defined according to the SREX segmentation, as defined by the IPCC. Each region is considered being relatively homogeneous in terms of large-scale climate and biomes, still capturing small-scale (grid-cell level) variability. Modelling within these regions is hence expected to be less uncertain due to the absence of these large-scale patterns, compared to a global approach. This overall modelling approach allows the comparison of model behavior for the different regions and may provide insights on the main dynamic processes driving the interaction between vegetation and climate within different regions. The data employed in this study encompasses the global datasets for soil properties (SoilGrids), terrain properties (Global Relief Model based on SRTM DEM and ETOPO), monthly time series of satellite-derived vegetation indices (GIMMS NDVI3g) and climate variables (Princeton Meteorological Forcing Dataset). The findings proved the potential of a spatio-temporal Bayesian modelling approach for assessing vegetation dynamics, at a regional scale. The observed interrelationships of the employed data and the different spatial and temporal trends support
Bayesian Gaussian Copula Factor Models for Mixed Data
Murray, Jared S.; Dunson, David B.; Carin, Lawrence; Lucas, Joseph E.
2013-01-01
Gaussian factor models have proven widely useful for parsimoniously characterizing dependence in multivariate data. There is a rich literature on their extension to mixed categorical and continuous variables, using latent Gaussian variables or through generalized latent trait models acommodating measurements in the exponential family. However, when generalizing to non-Gaussian measured variables the latent variables typically influence both the dependence structure and the form of the marginal distributions, complicating interpretation and introducing artifacts. To address this problem we propose a novel class of Bayesian Gaussian copula factor models which decouple the latent factors from the marginal distributions. A semiparametric specification for the marginals based on the extended rank likelihood yields straightforward implementation and substantial computational gains. We provide new theoretical and empirical justifications for using this likelihood in Bayesian inference. We propose new default priors for the factor loadings and develop efficient parameter-expanded Gibbs sampling for posterior computation. The methods are evaluated through simulations and applied to a dataset in political science. The models in this paper are implemented in the R package bfa.1 PMID:23990691
Bayesian Models for fMRI Data Analysis
Zhang, Linlin; Guindani, Michele; Vannucci, Marina
2015-01-01
Functional magnetic resonance imaging (fMRI), a noninvasive neuroimaging method that provides an indirect measure of neuronal activity by detecting blood flow changes, has experienced an explosive growth in the past years. Statistical methods play a crucial role in understanding and analyzing fMRI data. Bayesian approaches, in particular, have shown great promise in applications. A remarkable feature of fully Bayesian approaches is that they allow a flexible modeling of spatial and temporal correlations in the data. This paper provides a review of the most relevant models developed in recent years. We divide methods according to the objective of the analysis. We start from spatio-temporal models for fMRI data that detect task-related activation patterns. We then address the very important problem of estimating brain connectivity. We also touch upon methods that focus on making predictions of an individual's brain activity or a clinical or behavioral response. We conclude with a discussion of recent integrative models that aim at combining fMRI data with other imaging modalities, such as EEG/MEG and DTI data, measured on the same subjects. We also briefly discuss the emerging field of imaging genetics. PMID:25750690
Approximate Bayesian computation for forward modeling in cosmology
NASA Astrophysics Data System (ADS)
Akeret, Joël; Refregier, Alexandre; Amara, Adam; Seehars, Sebastian; Hasner, Caspar
2015-08-01
Bayesian inference is often used in cosmology and astrophysics to derive constraints on model parameters from observations. This approach relies on the ability to compute the likelihood of the data given a choice of model parameters. In many practical situations, the likelihood function may however be unavailable or intractable due to non-gaussian errors, non-linear measurements processes, or complex data formats such as catalogs and maps. In these cases, the simulation of mock data sets can often be made through forward modeling. We discuss how Approximate Bayesian Computation (ABC) can be used in these cases to derive an approximation to the posterior constraints using simulated data sets. This technique relies on the sampling of the parameter set, a distance metric to quantify the difference between the observation and the simulations and summary statistics to compress the information in the data. We first review the principles of ABC and discuss its implementation using a Population Monte-Carlo (PMC) algorithm and the Mahalanobis distance metric. We test the performance of the implementation using a Gaussian toy model. We then apply the ABC technique to the practical case of the calibration of image simulations for wide field cosmological surveys. We find that the ABC analysis is able to provide reliable parameter constraints for this problem and is therefore a promising technique for other applications in cosmology and astrophysics. Our implementation of the ABC PMC method is made available via a public code release.
Model Selection in Historical Research Using Approximate Bayesian Computation
Rubio-Campillo, Xavier
2016-01-01
Formal Models and History Computational models are increasingly being used to study historical dynamics. This new trend, which could be named Model-Based History, makes use of recently published datasets and innovative quantitative methods to improve our understanding of past societies based on their written sources. The extensive use of formal models allows historians to re-evaluate hypotheses formulated decades ago and still subject to debate due to the lack of an adequate quantitative framework. The initiative has the potential to transform the discipline if it solves the challenges posed by the study of historical dynamics. These difficulties are based on the complexities of modelling social interaction, and the methodological issues raised by the evaluation of formal models against data with low sample size, high variance and strong fragmentation. Case Study This work examines an alternate approach to this evaluation based on a Bayesian-inspired model selection method. The validity of the classical Lanchester’s laws of combat is examined against a dataset comprising over a thousand battles spanning 300 years. Four variations of the basic equations are discussed, including the three most common formulations (linear, squared, and logarithmic) and a new variant introducing fatigue. Approximate Bayesian Computation is then used to infer both parameter values and model selection via Bayes Factors. Impact Results indicate decisive evidence favouring the new fatigue model. The interpretation of both parameter estimations and model selection provides new insights into the factors guiding the evolution of warfare. At a methodological level, the case study shows how model selection methods can be used to guide historical research through the comparison between existing hypotheses and empirical evidence. PMID:26730953
Uncovering Transcriptional Regulatory Networks by Sparse Bayesian Factor Model
NASA Astrophysics Data System (ADS)
Meng, Jia; Zhang, Jianqiu(Michelle); Qi, Yuan(Alan); Chen, Yidong; Huang, Yufei
2010-12-01
The problem of uncovering transcriptional regulation by transcription factors (TFs) based on microarray data is considered. A novel Bayesian sparse correlated rectified factor model (BSCRFM) is proposed that models the unknown TF protein level activity, the correlated regulations between TFs, and the sparse nature of TF-regulated genes. The model admits prior knowledge from existing database regarding TF-regulated target genes based on a sparse prior and through a developed Gibbs sampling algorithm, a context-specific transcriptional regulatory network specific to the experimental condition of the microarray data can be obtained. The proposed model and the Gibbs sampling algorithm were evaluated on the simulated systems, and results demonstrated the validity and effectiveness of the proposed approach. The proposed model was then applied to the breast cancer microarray data of patients with Estrogen Receptor positive ([InlineEquation not available: see fulltext.]) status and Estrogen Receptor negative ([InlineEquation not available: see fulltext.]) status, respectively.
Efficient multilevel brain tumor segmentation with integrated bayesian model classification.
Corso, J J; Sharon, E; Dube, S; El-Saden, S; Sinha, U; Yuille, A
2008-05-01
We present a new method for automatic segmentation of heterogeneous image data that takes a step toward bridging the gap between bottom-up affinity-based segmentation methods and top-down generative model based approaches. The main contribution of the paper is a Bayesian formulation for incorporating soft model assignments into the calculation of affinities, which are conventionally model free. We integrate the resulting model-aware affinities into the multilevel segmentation by weighted aggregation algorithm, and apply the technique to the task of detecting and segmenting brain tumor and edema in multichannel magnetic resonance (MR) volumes. The computationally efficient method runs orders of magnitude faster than current state-of-the-art techniques giving comparable or improved results. Our quantitative results indicate the benefit of incorporating model-aware affinities into the segmentation process for the difficult case of glioblastoma multiforme brain tumor. PMID:18450536
Emulation: A fast stochastic Bayesian method to eliminate model space
NASA Astrophysics Data System (ADS)
Roberts, Alan; Hobbs, Richard; Goldstein, Michael
2010-05-01
Joint inversion of large 3D datasets has been the goal of geophysicists ever since the datasets first started to be produced. There are two broad approaches to this kind of problem, traditional deterministic inversion schemes and more recently developed Bayesian search methods, such as MCMC (Markov Chain Monte Carlo). However, using both these kinds of schemes has proved prohibitively expensive, both in computing power and time cost, due to the normally very large model space which needs to be searched using forward model simulators which take considerable time to run. At the heart of strategies aimed at accomplishing this kind of inversion is the question of how to reliably and practicably reduce the size of the model space in which the inversion is to be carried out. Here we present a practical Bayesian method, known as emulation, which can address this issue. Emulation is a Bayesian technique used with considerable success in a number of technical fields, such as in astronomy, where the evolution of the universe has been modelled using this technique, and in the petroleum industry where history matching is carried out of hydrocarbon reservoirs. The method of emulation involves building a fast-to-compute uncertainty-calibrated approximation to a forward model simulator. We do this by modelling the output data from a number of forward simulator runs by a computationally cheap function, and then fitting the coefficients defining this function to the model parameters. By calibrating the error of the emulator output with respect to the full simulator output, we can use this to screen out large areas of model space which contain only implausible models. For example, starting with what may be considered a geologically reasonable prior model space of 10000 models, using the emulator we can quickly show that only models which lie within 10% of that model space actually produce output data which is plausibly similar in character to an observed dataset. We can thus much
Bayesian Learning of a Language Model from Continuous Speech
NASA Astrophysics Data System (ADS)
Neubig, Graham; Mimura, Masato; Mori, Shinsuke; Kawahara, Tatsuya
We propose a novel scheme to learn a language model (LM) for automatic speech recognition (ASR) directly from continuous speech. In the proposed method, we first generate phoneme lattices using an acoustic model with no linguistic constraints, then perform training over these phoneme lattices, simultaneously learning both lexical units and an LM. As a statistical framework for this learning problem, we use non-parametric Bayesian statistics, which make it possible to balance the learned model's complexity (such as the size of the learned vocabulary) and expressive power, and provide a principled learning algorithm through the use of Gibbs sampling. Implementation is performed using weighted finite state transducers (WFSTs), which allow for the simple handling of lattice input. Experimental results on natural, adult-directed speech demonstrate that LMs built using only continuous speech are able to significantly reduce ASR phoneme error rates. The proposed technique of joint Bayesian learning of lexical units and an LM over lattices is shown to significantly contribute to this improvement.
Dummer, T J B; Yu, Z M; Nauta, L; Murimboh, J D; Parker, L
2015-02-01
Arsenic is a naturally occurring class 1 human carcinogen that is widespread in private drinking water wells throughout the province of Nova Scotia in Canada. In this paper we explore the spatial variation in toenail arsenic concentrations (arsenic body burden) in Nova Scotia. We describe the regional distribution of arsenic concentrations in private well water supplies in the province, and evaluate the geological and environmental features associated with higher levels of arsenic in well water. We develop geostatistical process models to predict high toenail arsenic concentrations and high well water arsenic concentrations, which have utility for studies where no direct measurements of arsenic body burden or arsenic exposure are available. 892 men and women who participated in the Atlantic Partnership for Tomorrow's Health Project provided both drinking water and toenail clipping samples. Information on socio-demographic, lifestyle and health factors was obtained with a set of standardized questionnaires. Anthropometric indices and arsenic concentrations in drinking water and toenails were measured. In addition, data on arsenic concentrations in 10,498 private wells were provided by the Nova Scotia Department of Environment. We utilised stepwise multivariable logistic regression modelling to develop separate statistical models to: a) predict high toenail arsenic concentrations (defined as toenail arsenic levels ≥0.12 μg g(-1)) and b) predict high well water arsenic concentrations (defined as well water arsenic levels ≥5.0 μg L(-1)). We found that the geological and environmental information that predicted well water arsenic concentrations can also be used to accurately predict toenail arsenic concentrations. We conclude that geological and environmental factors contributing to arsenic contamination in well water are the major contributing influences on arsenic body burden among Nova Scotia residents. Further studies are warranted to assess appropriate
NASA Astrophysics Data System (ADS)
Tadeu Pereira, Gener; Ribeiro de Oliveira, Ismênia; De Bortoli Teixeira, Daniel; Arantes Camargo, Livia; Rodrigo Panosso, Alan; Marques, José, Jr.
2015-04-01
Phosphorus is one of the limiting nutrients for sugarcane development in Brazilian soils. The spatial variability of this nutrient is great, defined by the properties that control its adsorption and desorption reactions. Spatial estimates to characterize this variability are based on geostatistical interpolation. Thus, the assessment of the uncertainty of estimates associated with the spatial distribution of available P (Plabile) is decisive to optimize the use of phosphate fertilizers. The purpose of this study was to evaluate the performance of sequential Gaussian simulation (sGs) and ordinary kriging (OK) in the modeling of uncertainty in available P estimates. A sampling grid with 626 points was established in a 200-ha experimental sugarcane field in Tabapuã, São Paulo State, Brazil. The soil was sampled in the crossover points of a regular grid with intervals of 50 m. From the observations, 63 points, approximately 10% of sampled points were randomly selected before the geostatistical modeling of the composition of a data set used in the validation process modeling, while the remaining 563 points were used for the predictions variable in a place not sampled. The sGs generated 200 realizations. From the realizations generated, different measures of estimation and uncertainty were obtained. The standard deviation, calculated point to point, all simulated maps provided the map of deviation, used to assess local uncertainty. The visual analysis of maps of the E-type and KO showed that the spatial patterns produced by both methods were similar, however, it was possible to observe the characteristic smoothing effect of the KO especially in regions with extreme values. The Standardized variograms of selected realizations sGs showed both range and model similar to the variogram of the Observed date of Plabile. The variogram KO showed a distinct structure of the observed data, underestimating the variability over short distances, presenting parabolic behavior near
Multimodel Bayesian analysis of groundwater data worth
NASA Astrophysics Data System (ADS)
Xue, Liang; Zhang, Dongxiao; Guadagnini, Alberto; Neuman, Shlomo P.
2014-11-01
We explore the way in which uncertain descriptions of aquifer heterogeneity and groundwater flow impact one's ability to assess the worth of collecting additional data. We do so on the basis of Maximum Likelihood Bayesian Model Averaging (MLBMA) by accounting jointly for uncertainties in geostatistical and flow model structures and parameter (hydraulic conductivity) as well as system state (hydraulic head) estimates, given uncertain measurements of one or both variables. Previous description of our approach was limited to geostatistical models based solely on hydraulic conductivity data. Here we implement the approach on a synthetic example of steady state flow in a two-dimensional random log hydraulic conductivity field with and without recharge by embedding an inverse stochastic moment solution of groundwater flow in MLBMA. A moment-equations-based geostatistical inversion method is utilized to circumvent the need for computationally expensive numerical Monte Carlo simulations. The approach is compatible with either deterministic or stochastic flow models and consistent with modern statistical methods of parameter estimation, admitting but not requiring prior information about the parameters. It allows but does not require approximating lead predictive statistical moments of system states by linearization while updating model posterior probabilities and parameter estimates on the basis of potential new data both before and after such data are actually collected.
Collective opinion formation model under Bayesian updating and confirmation bias.
Nishi, Ryosuke; Masuda, Naoki
2013-06-01
We propose a collective opinion formation model with a so-called confirmation bias. The confirmation bias is a psychological effect with which, in the context of opinion formation, an individual in favor of an opinion is prone to misperceive new incoming information as supporting the current belief of the individual. Our model modifies a Bayesian decision-making model for single individuals [M. Rabin and J. L. Schrag, Q. J. Econ. 114, 37 (1999)] for the case of a well-mixed population of interacting individuals in the absence of the external input. We numerically simulate the model to show that all the agents eventually agree on one of the two opinions only when the confirmation bias is weak. Otherwise, the stochastic population dynamics ends up creating a disagreement configuration (also called polarization), particularly for large system sizes. A strong confirmation bias allows various final disagreement configurations with different fractions of the individuals in favor of the opposite opinions. PMID:23848643
A kinematic model for Bayesian tracking of cyclic human motion
NASA Astrophysics Data System (ADS)
Greif, Thomas; Lienhart, Rainer
2010-01-01
We introduce a two-dimensional kinematic model for cyclic motions of humans, which is suitable for the use as temporal prior in any Bayesian tracking framework. This human motion model is solely based on simple kinematic properties: the joint accelerations. Distributions of joint accelerations subject to the cycle progress are learned from training data. We present results obtained by applying the introduced model to the cyclic motion of backstroke swimming in a Kalman filter framework that represents the posterior distribution by a Gaussian. We experimentally evaluate the sensitivity of the motion model with respect to the frequency and noise level of assumed appearance-based pose measurements by simulating various fidelities of the pose measurements using ground truth data.
Bayesian Dose-Response Modeling in Sparse Data
NASA Astrophysics Data System (ADS)
Kim, Steven B.
This book discusses Bayesian dose-response modeling in small samples applied to two different settings. The first setting is early phase clinical trials, and the second setting is toxicology studies in cancer risk assessment. In early phase clinical trials, experimental units are humans who are actual patients. Prior to a clinical trial, opinions from multiple subject area experts are generally more informative than the opinion of a single expert, but we may face a dilemma when they have disagreeing prior opinions. In this regard, we consider compromising the disagreement and compare two different approaches for making a decision. In addition to combining multiple opinions, we also address balancing two levels of ethics in early phase clinical trials. The first level is individual-level ethics which reflects the perspective of trial participants. The second level is population-level ethics which reflects the perspective of future patients. We extensively compare two existing statistical methods which focus on each perspective and propose a new method which balances the two conflicting perspectives. In toxicology studies, experimental units are living animals. Here we focus on a potential non-monotonic dose-response relationship which is known as hormesis. Briefly, hormesis is a phenomenon which can be characterized by a beneficial effect at low doses and a harmful effect at high doses. In cancer risk assessments, the estimation of a parameter, which is known as a benchmark dose, can be highly sensitive to a class of assumptions, monotonicity or hormesis. In this regard, we propose a robust approach which considers both monotonicity and hormesis as a possibility. In addition, We discuss statistical hypothesis testing for hormesis and consider various experimental designs for detecting hormesis based on Bayesian decision theory. Past experiments have not been optimally designed for testing for hormesis, and some Bayesian optimal designs may not be optimal under a
NASA Astrophysics Data System (ADS)
Mendes, B. S.; Draper, D.
2008-12-01
The issue of model uncertainty and model choice is central in any groundwater modeling effort [Neuman and Wierenga, 2003]; among the several approaches to the problem we favour using Bayesian statistics because it is a method that integrates in a natural way uncertainties (arising from any source) and experimental data. In this work, we experiment with several Bayesian approaches to model choice, focusing primarily on demonstrating the usefulness of the Reversible Jump Markov Chain Monte Carlo (RJMCMC) simulation method [Green, 1995]; this is an extension of the now- common MCMC methods. Standard MCMC techniques approximate posterior distributions for quantities of interest, often by creating a random walk in parameter space; RJMCMC allows the random walk to take place between parameter spaces with different dimensionalities. This fact allows us to explore state spaces that are associated with different deterministic models for experimental data. Our work is exploratory in nature; we restrict our study to comparing two simple transport models applied to a data set gathered to estimate the breakthrough curve for a tracer compound in groundwater. One model has a mean surface based on a simple advection dispersion differential equation; the second model's mean surface is also governed by a differential equation but in two dimensions. We focus on artificial data sets (in which truth is known) to see if model identification is done correctly, but we also address the issues of over and under-paramerization, and we compare RJMCMC's performance with other traditional methods for model selection and propagation of model uncertainty, including Bayesian model averaging, BIC and DIC.References Neuman and Wierenga (2003). A Comprehensive Strategy of Hydrogeologic Modeling and Uncertainty Analysis for Nuclear Facilities and Sites. NUREG/CR-6805, Division of Systems Analysis and Regulatory Effectiveness Office of Nuclear Regulatory Research, U. S. Nuclear Regulatory Commission
Advanced REACH Tool: A Bayesian Model for Occupational Exposure Assessment
McNally, Kevin; Warren, Nicholas; Fransman, Wouter; Entink, Rinke Klein; Schinkel, Jody; van Tongeren, Martie; Cherrie, John W.; Kromhout, Hans; Schneider, Thomas; Tielemans, Erik
2014-01-01
This paper describes a Bayesian model for the assessment of inhalation exposures in an occupational setting; the methodology underpins a freely available web-based application for exposure assessment, the Advanced REACH Tool (ART). The ART is a higher tier exposure tool that combines disparate sources of information within a Bayesian statistical framework. The information is obtained from expert knowledge expressed in a calibrated mechanistic model of exposure assessment, data on inter- and intra-individual variability in exposures from the literature, and context-specific exposure measurements. The ART provides central estimates and credible intervals for different percentiles of the exposure distribution, for full-shift and long-term average exposures. The ART can produce exposure estimates in the absence of measurements, but the precision of the estimates improves as more data become available. The methodology presented in this paper is able to utilize partially analogous data, a novel approach designed to make efficient use of a sparsely populated measurement database although some additional research is still required before practical implementation. The methodology is demonstrated using two worked examples: an exposure to copper pyrithione in the spraying of antifouling paints and an exposure to ethyl acetate in shoe repair. PMID:24665110
Bayesian predictive modeling for genomic based personalized treatment selection.
Ma, Junsheng; Stingo, Francesco C; Hobbs, Brian P
2016-06-01
Efforts to personalize medicine in oncology have been limited by reductive characterizations of the intrinsically complex underlying biological phenomena. Future advances in personalized medicine will rely on molecular signatures that derive from synthesis of multifarious interdependent molecular quantities requiring robust quantitative methods. However, highly parameterized statistical models when applied in these settings often require a prohibitively large database and are sensitive to proper characterizations of the treatment-by-covariate interactions, which in practice are difficult to specify and may be limited by generalized linear models. In this article, we present a Bayesian predictive framework that enables the integration of a high-dimensional set of genomic features with clinical responses and treatment histories of historical patients, providing a probabilistic basis for using the clinical and molecular information to personalize therapy for future patients. Our work represents one of the first attempts to define personalized treatment assignment rules based on large-scale genomic data. We use actual gene expression data acquired from The Cancer Genome Atlas in the settings of leukemia and glioma to explore the statistical properties of our proposed Bayesian approach for personalizing treatment selection. The method is shown to yield considerable improvements in predictive accuracy when compared to penalized regression approaches. PMID:26575856
Parameter Estimation and Parameterization Uncertainty Using Bayesian Model Averaging
NASA Astrophysics Data System (ADS)
Tsai, F. T.; Li, X.
2007-12-01
This study proposes Bayesian model averaging (BMA) to address parameter estimation uncertainty arisen from non-uniqueness in parameterization methods. BMA provides a means of incorporating multiple parameterization methods for prediction through the law of total probability, with which an ensemble average of hydraulic conductivity distribution is obtained. Estimation uncertainty is described by the BMA variances, which contain variances within and between parameterization methods. BMA shows the facts that considering more parameterization methods tends to increase estimation uncertainty and estimation uncertainty is always underestimated using a single parameterization method. Two major problems in applying BMA to hydraulic conductivity estimation using a groundwater inverse method will be discussed in the study. The first problem is the use of posterior probabilities in BMA, which tends to single out one best method and discard other good methods. This problem arises from Occam's window that only accepts models in a very narrow range. We propose a variance window to replace Occam's window to cope with this problem. The second problem is the use of Kashyap information criterion (KIC), which makes BMA tend to prefer high uncertain parameterization methods due to considering the Fisher information matrix. We found that Bayesian information criterion (BIC) is a good approximation to KIC and is able to avoid controversial results. We applied BMA to hydraulic conductivity estimation in the 1,500-foot sand aquifer in East Baton Rouge Parish, Louisiana.
Optimal inference with suboptimal models: Addiction and active Bayesian inference
Schwartenbeck, Philipp; FitzGerald, Thomas H.B.; Mathys, Christoph; Dolan, Ray; Wurst, Friedrich; Kronbichler, Martin; Friston, Karl
2015-01-01
When casting behaviour as active (Bayesian) inference, optimal inference is defined with respect to an agent’s beliefs – based on its generative model of the world. This contrasts with normative accounts of choice behaviour, in which optimal actions are considered in relation to the true structure of the environment – as opposed to the agent’s beliefs about worldly states (or the task). This distinction shifts an understanding of suboptimal or pathological behaviour away from aberrant inference as such, to understanding the prior beliefs of a subject that cause them to behave less ‘optimally’ than our prior beliefs suggest they should behave. Put simply, suboptimal or pathological behaviour does not speak against understanding behaviour in terms of (Bayes optimal) inference, but rather calls for a more refined understanding of the subject’s generative model upon which their (optimal) Bayesian inference is based. Here, we discuss this fundamental distinction and its implications for understanding optimality, bounded rationality and pathological (choice) behaviour. We illustrate our argument using addictive choice behaviour in a recently described ‘limited offer’ task. Our simulations of pathological choices and addictive behaviour also generate some clear hypotheses, which we hope to pursue in ongoing empirical work. PMID:25561321
Advanced REACH Tool: a Bayesian model for occupational exposure assessment.
McNally, Kevin; Warren, Nicholas; Fransman, Wouter; Entink, Rinke Klein; Schinkel, Jody; van Tongeren, Martie; Cherrie, John W; Kromhout, Hans; Schneider, Thomas; Tielemans, Erik
2014-06-01
This paper describes a Bayesian model for the assessment of inhalation exposures in an occupational setting; the methodology underpins a freely available web-based application for exposure assessment, the Advanced REACH Tool (ART). The ART is a higher tier exposure tool that combines disparate sources of information within a Bayesian statistical framework. The information is obtained from expert knowledge expressed in a calibrated mechanistic model of exposure assessment, data on inter- and intra-individual variability in exposures from the literature, and context-specific exposure measurements. The ART provides central estimates and credible intervals for different percentiles of the exposure distribution, for full-shift and long-term average exposures. The ART can produce exposure estimates in the absence of measurements, but the precision of the estimates improves as more data become available. The methodology presented in this paper is able to utilize partially analogous data, a novel approach designed to make efficient use of a sparsely populated measurement database although some additional research is still required before practical implementation. The methodology is demonstrated using two worked examples: an exposure to copper pyrithione in the spraying of antifouling paints and an exposure to ethyl acetate in shoe repair. PMID:24665110
Modeling the Climatology of Tornado Occurrence with Bayesian Inference
NASA Astrophysics Data System (ADS)
Cheng, Vincent Y. S.
Our mechanistic understanding of tornadic environments has significantly improved by the recent technological enhancements in the detection of tornadoes as well as the advances of numerical weather predictive modeling. Nonetheless, despite the decades of active research, prediction of tornado occurrence remains one of the most difficult problems in meteorological and climate science. In our efforts to develop predictive tools for tornado occurrence, there are a number of issues to overcome, such as the treatment of inconsistent tornado records, the consideration of suitable combination of atmospheric predictors, and the selection of appropriate resolution to accommodate the variability in time and space. In this dissertation, I address each of these topics by undertaking three empirical (statistical) modeling studies, where I examine the signature of different atmospheric factors influencing the tornado occurrence, the sampling biases in tornado observations, and the optimal spatiotemporal resolution for studying tornado occurrence. In the first study, I develop a novel Bayesian statistical framework to assess the probability of tornado occurrence in Canada, in which the sampling bias of tornado observations and the linkage between lightning climatology and tornadogenesis are considered. The results produced reasonable probability estimates of tornado occurrence for the under-sampled areas in the model domain. The same study also delineated the geographical variability in the lightning-tornado relationship across Canada. In the second study, I present a novel modeling framework to examine the relative importance of several key atmospheric variables (e.g., convective available potential energy, 0-3 km storm-relative helicity, 0-6 km bulk wind difference, 0-tropopause vertical wind shear) on tornado activity in North America. I found that the variable quantifying the updraft strength is more important during the warm season, whereas the effects of wind
Model for Aggregated Water Heater Load Using Dynamic Bayesian Networks
Vlachopoulou, Maria; Chin, George; Fuller, Jason C.; Lu, Shuai; Kalsi, Karanjit
2012-07-19
The transition to the new generation power grid, or “smart grid”, requires novel ways of using and analyzing data collected from the grid infrastructure. Fundamental functionalities like demand response (DR), that the smart grid needs, rely heavily on the ability of the energy providers and distributors to forecast the load behavior of appliances under different DR strategies. This paper presents a new model of aggregated water heater load, based on dynamic Bayesian networks (DBNs). The model has been validated against simulated data from an open source distribution simulation software (GridLAB-D). The results presented in this paper demonstrate that the DBN model accurately tracks the load profile curves of aggregated water heaters under different testing scenarios.
Aggregated Residential Load Modeling Using Dynamic Bayesian Networks
Vlachopoulou, Maria; Chin, George; Fuller, Jason C.; Lu, Shuai
2014-09-28
Abstract—It is already obvious that the future power grid will have to address higher demand for power and energy, and to incorporate renewable resources of different energy generation patterns. Demand response (DR) schemes could successfully be used to manage and balance power supply and demand under operating conditions of the future power grid. To achieve that, more advanced tools for DR management of operations and planning are necessary that can estimate the available capacity from DR resources. In this research, a Dynamic Bayesian Network (DBN) is derived, trained, and tested that can model aggregated load of Heating, Ventilation, and Air Conditioning (HVAC) systems. DBNs can provide flexible and powerful tools for both operations and planing, due to their unique analytical capabilities. The DBN model accuracy and flexibility of use is demonstrated by testing the model under different operational scenarios.
Performance and Prediction: Bayesian Modelling of Fallible Choice in Chess
NASA Astrophysics Data System (ADS)
Haworth, Guy; Regan, Ken; di Fatta, Giuseppe
Evaluating agents in decision-making applications requires assessing their skill and predicting their behaviour. Both are well developed in Poker-like situations, but less so in more complex game and model domains. This paper addresses both tasks by using Bayesian inference in a benchmark space of reference agents. The concepts are explained and demonstrated using the game of chess but the model applies generically to any domain with quantifiable options and fallible choice. Demonstration applications address questions frequently asked by the chess community regarding the stability of the rating scale, the comparison of players of different eras and/or leagues, and controversial incidents possibly involving fraud. The last include alleged under-performance, fabrication of tournament results, and clandestine use of computer advice during competition. Beyond the model world of games, the aim is to improve fallible human performance in complex, high-value tasks.
Development of a Bayesian Belief Network Runway Incursion Model
NASA Technical Reports Server (NTRS)
Green, Lawrence L.
2014-01-01
In a previous paper, a statistical analysis of runway incursion (RI) events was conducted to ascertain their relevance to the top ten Technical Challenges (TC) of the National Aeronautics and Space Administration (NASA) Aviation Safety Program (AvSP). The study revealed connections to perhaps several of the AvSP top ten TC. That data also identified several primary causes and contributing factors for RI events that served as the basis for developing a system-level Bayesian Belief Network (BBN) model for RI events. The system-level BBN model will allow NASA to generically model the causes of RI events and to assess the effectiveness of technology products being developed under NASA funding. These products are intended to reduce the frequency of RI events in particular, and to improve runway safety in general. The development, structure and assessment of that BBN for RI events by a Subject Matter Expert panel are documented in this paper.