Sample records for spatial regression model

  1. Accounting for spatial effects in land use regression for urban air pollution modeling.

    PubMed

    Bertazzon, Stefania; Johnson, Markey; Eccles, Kristin; Kaplan, Gilaad G

    2015-01-01

    In order to accurately assess air pollution risks, health studies require spatially resolved pollution concentrations. Land-use regression (LUR) models estimate ambient concentrations at a fine spatial scale. However, spatial effects such as spatial non-stationarity and spatial autocorrelation can reduce the accuracy of LUR estimates by increasing regression errors and uncertainty; and statistical methods for resolving these effects--e.g., spatially autoregressive (SAR) and geographically weighted regression (GWR) models--may be difficult to apply simultaneously. We used an alternate approach to address spatial non-stationarity and spatial autocorrelation in LUR models for nitrogen dioxide. Traditional models were re-specified to include a variable capturing wind speed and direction, and re-fit as GWR models. Mean R² values for the resulting GWR-wind models (summer: 0.86, winter: 0.73) showed a 10-20% improvement over traditional LUR models. GWR-wind models effectively addressed both spatial effects and produced meaningful predictive models. These results suggest a useful method for improving spatially explicit models. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  2. Importance of spatial autocorrelation in modeling bird distributions at a continental scale

    USGS Publications Warehouse

    Bahn, V.; O'Connor, R.J.; Krohn, W.B.

    2006-01-01

    Spatial autocorrelation in species' distributions has been recognized as inflating the probability of a type I error in hypothesis tests, causing biases in variable selection, and violating the assumption of independence of error terms in models such as correlation or regression. However, it remains unclear whether these problems occur at all spatial resolutions and extents, and under which conditions spatially explicit modeling techniques are superior. Our goal was to determine whether spatial models were superior at large extents and across many different species. In addition, we investigated the importance of purely spatial effects in distribution patterns relative to the variation that could be explained through environmental conditions. We studied distribution patterns of 108 bird species in the conterminous United States using ten years of data from the Breeding Bird Survey. We compared the performance of spatially explicit regression models with non-spatial regression models using Akaike's information criterion. In addition, we partitioned the variance in species distributions into an environmental, a pure spatial and a shared component. The spatially explicit conditional autoregressive regression models strongly outperformed the ordinary least squares regression models. In addition, partialling out the spatial component underlying the species' distributions showed that an average of 17% of the explained variation could be attributed to purely spatial effects independent of the spatial autocorrelation induced by the underlying environmental variables. We concluded that location in the range and neighborhood play an important role in the distribution of species. Spatially explicit models are expected to yield better predictions especially for mobile species such as birds, even in coarse-grained models with a large extent. © Ecography.

  3. Spatial Double Generalized Beta Regression Models: Extensions and Application to Study Quality of Education in Colombia

    ERIC Educational Resources Information Center

    Cepeda-Cuervo, Edilberto; Núñez-Antón, Vicente

    2013-01-01

    In this article, a proposed Bayesian extension of the generalized beta spatial regression models is applied to the analysis of the quality of education in Colombia. We briefly revise the beta distribution and describe the joint modeling approach for the mean and dispersion parameters in the spatial regression models' setting. Finally, we motivate…

  4. Spatial Assessment of Model Errors from Four Regression Techniques

    Treesearch

    Lianjun Zhang; Jeffrey H. Gove; Jeffrey H. Gove

    2005-01-01

    Forest modelers have attempted to account for the spatial autocorrelations among trees in growth and yield models by applying alternative regression techniques such as linear mixed models (LMM), generalized additive models (GAM), and geographically weighted regression (GWR). However, the model errors are commonly assessed using average errors across the entire study...

  5. A spatially explicit approach to the study of socio-demographic inequality in the spatial distribution of trees across Boston neighborhoods.

    PubMed

    Duncan, Dustin T; Kawachi, Ichiro; Kum, Susan; Aldstadt, Jared; Piras, Gianfranco; Matthews, Stephen A; Arbia, Giuseppe; Castro, Marcia C; White, Kellee; Williams, David R

    2014-04-01

    The racial/ethnic and income composition of neighborhoods often influences local amenities, including the potential spatial distribution of trees, which are important for population health and community wellbeing, particularly in urban areas. This ecological study used spatial analytical methods to assess the relationship between neighborhood socio-demographic characteristics (i.e. minority racial/ethnic composition and poverty) and tree density at the census tract level in Boston, Massachusetts (US). We examined spatial autocorrelation with the Global Moran's I for all study variables and in the ordinary least squares (OLS) regression residuals as well as computed Spearman correlations non-adjusted and adjusted for spatial autocorrelation between socio-demographic characteristics and tree density. Next, we fit traditional regressions (i.e. OLS regression models) and spatial regressions (i.e. spatial simultaneous autoregressive models), as appropriate. We found significant positive spatial autocorrelation for all neighborhood socio-demographic characteristics (Global Moran's I range from 0.24 to 0.86, all P = 0.001), for tree density (Global Moran's I = 0.452, P = 0.001), and in the OLS regression residuals (Global Moran's I range from 0.32 to 0.38, all P < 0.001). Therefore, we fit the spatial simultaneous autoregressive models. There was a negative correlation between neighborhood percent non-Hispanic Black and tree density (rS = -0.19; conventional P-value = 0.016; spatially adjusted P-value = 0.299) as well as a negative correlation between predominantly non-Hispanic Black (over 60% Black) neighborhoods and tree density (rS = -0.18; conventional P-value = 0.019; spatially adjusted P-value = 0.180). While the conventional OLS regression model found a marginally significant inverse relationship between Black neighborhoods and tree density, we found no statistically significant relationship between neighborhood socio-demographic composition and tree density in the spatial regression models. Methodologically, our study suggests the need to take into account spatial autocorrelation as findings/conclusions can change when the spatial autocorrelation is ignored. Substantively, our findings suggest no need for policy intervention vis-à-vis trees in Boston, though we hasten to add that replication studies, and more nuanced data on tree quality, age and diversity are needed.
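
The Global Moran's I statistic used in the record above can be illustrated with a short sketch. This is not the authors' code: the function names, the toy chain of four areal units, and the one-sided permutation test are assumptions made for illustration only. As a general fact about such tests, a pseudo p-value of 0.001 is the smallest value attainable with 999 permutations.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I for values on n areal units.

    values  : length-n observations
    weights : n x n spatial weights matrix with a zero diagonal
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()                     # deviations from the mean
    num = (w * np.outer(z, z)).sum()     # sum over i, j of w_ij * z_i * z_j
    den = (z ** 2).sum()
    return (len(x) / w.sum()) * num / den

def permutation_p_value(values, weights, n_perm=999, seed=0):
    """One-sided pseudo p-value for positive spatial autocorrelation."""
    rng = np.random.default_rng(seed)
    observed = morans_i(values, weights)
    sims = np.array([morans_i(rng.permutation(values), weights)
                     for _ in range(n_perm)])
    return (np.sum(sims >= observed) + 1) / (n_perm + 1)

# Hypothetical example: four units in a row, neighbors share an edge.
chain = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]])
ramp = [1.0, 2.0, 3.0, 4.0]         # smooth gradient: positive autocorrelation
alternating = [1.0, 4.0, 1.0, 4.0]  # checkerboard: negative autocorrelation
print(morans_i(ramp, chain), morans_i(alternating, chain))
```

Applying the same function to OLS residuals rather than to raw variables gives the residual diagnostic described in the abstract.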

  6. LiDAR based prediction of forest biomass using hierarchical models with spatially varying coefficients

    USGS Publications Warehouse

    Babcock, Chad; Finley, Andrew O.; Bradford, John B.; Kolka, Randall K.; Birdsey, Richard A.; Ryan, Michael G.

    2015-01-01

    Many studies and production inventory systems have shown the utility of coupling covariates derived from Light Detection and Ranging (LiDAR) data with forest variables measured on georeferenced inventory plots through regression models. The objective of this study was to propose and assess the use of a Bayesian hierarchical modeling framework that accommodates both residual spatial dependence and non-stationarity of model covariates through the introduction of spatial random effects. We explored this objective using four forest inventory datasets that are part of the North American Carbon Program, each comprising point-referenced measures of above-ground forest biomass and discrete LiDAR. For each dataset, we considered at least five regression model specifications of varying complexity. Models were assessed based on goodness of fit criteria and predictive performance using a 10-fold cross-validation procedure. Results showed that the addition of spatial random effects to the regression model intercept improved fit and predictive performance in the presence of substantial residual spatial dependence. Additionally, in some cases, allowing either some or all regression slope parameters to vary spatially, via the addition of spatial random effects, further improved model fit and predictive performance. In other instances, models showed improved fit but decreased predictive performance—indicating over-fitting and underscoring the need for cross-validation to assess predictive ability. The proposed Bayesian modeling framework provided access to pixel-level posterior predictive distributions that were useful for uncertainty mapping, diagnosing spatial extrapolation issues, revealing missing model covariates, and discovering locally significant parameters.

  7. [Prediction and spatial distribution of recruitment trees of natural secondary forest based on geographically weighted Poisson model].

    PubMed

    Zhang, Ling Yu; Liu, Zhao Gang

    2017-12-01

    Based on the data collected from 108 permanent plots of the forest resources survey in Maoershan Experimental Forest Farm during 2004-2016, this study investigated the spatial distribution of recruitment trees in natural secondary forest by global Poisson regression and geographically weighted Poisson regression (GWPR) with four bandwidths of 2.5, 5, 10 and 15 km. The simulation effects of the 5 regressions and the factors influencing the recruitment trees in stands were analyzed, a description was given to the spatial autocorrelation of the regression residuals on global and local levels using Moran's I. The results showed that the spatial distribution of the number of natural secondary forest recruitment was significantly influenced by stands and topographic factors, especially average DBH. The GWPR model with small scale (2.5 km) had high accuracy of model fitting, a large range of model parameter estimates was generated, and the localized spatial distribution effect of the model parameters was obtained. The GWPR model at small scale (2.5 and 5 km) had produced a small range of model residuals, and the stability of the model was improved. The global spatial autocorrelation of the GWPR model residual at the small scale (2.5 km) was the lowest, and the local spatial autocorrelation was significantly reduced, in which an ideal spatial distribution pattern of small clusters with different observations was formed. The local model at small scale (2.5 km) was much better than the global model in the simulation effect on the spatial distribution of recruitment tree number.

  8. Using an autologistic regression model to identify spatial risk factors and spatial risk patterns of hand, foot and mouth disease (HFMD) in Mainland China

    PubMed Central

    2014-01-01

    Background: There have been large-scale outbreaks of hand, foot and mouth disease (HFMD) in Mainland China over the last decade. These events varied greatly across the country. It is necessary to identify the spatial risk factors and spatial distribution patterns of HFMD for public health control and prevention. Climate risk factors associated with HFMD occurrence have been recognized. However, few studies have discussed the socio-economic determinants of HFMD risk at a spatial scale. Methods: HFMD records in Mainland China in May 2008 were collected. Both climate and socio-economic factors were selected as potential risk exposures of HFMD. Odds ratio (OR) was used to identify the spatial risk factors. A spatial autologistic regression model was employed to obtain OR values for each exposure and to model the spatial distribution patterns of HFMD risk. Results: Results showed that both climate and socio-economic variables were spatial risk factors for HFMD transmission in Mainland China. The statistically significant risk factors are monthly average precipitation (OR = 1.4354), monthly average temperature (OR = 1.379), monthly average wind speed (OR = 1.186), the number of industrial enterprises above designated size (OR = 17.699), the population density (OR = 1.953), and the proportion of student population (OR = 1.286). The spatial autologistic regression model has good goodness of fit (ROC = 0.817) and prediction accuracy (correct ratio = 78.45%) for HFMD occurrence. The autologistic regression model also reduces the contribution of the residual term in the ordinary logistic regression model significantly, from 17.25 to 1.25 for the odds ratio. Based on the prediction results of the spatial model, we obtained a map of the probability of HFMD occurrence that shows the spatial distribution pattern and local epidemic risk over Mainland China. Conclusions: The autologistic regression model was used to identify spatial risk factors and model spatial risk patterns of HFMD. HFMD occurrences were found to be spatially heterogeneous over Mainland China, a pattern related to both the climate and socio-economic variables. The combination of socio-economic and climate exposures can explain the HFMD occurrences more comprehensively and objectively than climate exposures alone. The modeled probability of HFMD occurrence at the county level reveals not only the spatial trends, but also the local details of epidemic risk, even in the regions where there were no HFMD case records. PMID:24731248
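
Two quantities in the record above lend themselves to a brief illustration: the autocovariate that distinguishes an autologistic model from an ordinary logistic one, and the odds ratio obtained from a fitted coefficient. The sketch below is not the study's implementation. It assumes a common textbook form of the autocovariate (the weighted mean of the binary response over each unit's neighbors) and uses hypothetical function names and toy data.

```python
import numpy as np

def autocovariate(y, weights):
    """Weighted mean of the response over each unit's neighbors.

    Added as an extra predictor, this term turns a logistic regression
    into a simple autologistic one (assumed form, for illustration).
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    row_sums = w.sum(axis=1)
    out = np.zeros(len(y))               # units without neighbors get 0
    np.divide(w @ y, row_sums, out=out, where=row_sums > 0)
    return out

def odds_ratio(beta):
    """Odds ratio implied by a logistic regression coefficient."""
    return float(np.exp(beta))

# Hypothetical example: four counties in a row, 1 = disease present.
neighbors = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]])
presence = [1, 1, 0, 0]
print(autocovariate(presence, neighbors))
print(odds_ratio(np.log(1.953)))         # coefficient log(1.953) gives OR 1.953
```

An odds ratio above 1, such as the 1.953 reported for population density, means the odds of occurrence rise as the predictor increases.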

  9. A spatially explicit approach to the study of socio-demographic inequality in the spatial distribution of trees across Boston neighborhoods

    PubMed Central

    Duncan, Dustin T.; Kawachi, Ichiro; Kum, Susan; Aldstadt, Jared; Piras, Gianfranco; Matthews, Stephen A.; Arbia, Giuseppe; Castro, Marcia C.; White, Kellee; Williams, David R.

    2017-01-01

    The racial/ethnic and income composition of neighborhoods often influences local amenities, including the potential spatial distribution of trees, which are important for population health and community wellbeing, particularly in urban areas. This ecological study used spatial analytical methods to assess the relationship between neighborhood socio-demographic characteristics (i.e. minority racial/ethnic composition and poverty) and tree density at the census tract level in Boston, Massachusetts (US). We examined spatial autocorrelation with the Global Moran’s I for all study variables and in the ordinary least squares (OLS) regression residuals as well as computed Spearman correlations non-adjusted and adjusted for spatial autocorrelation between socio-demographic characteristics and tree density. Next, we fit traditional regressions (i.e. OLS regression models) and spatial regressions (i.e. spatial simultaneous autoregressive models), as appropriate. We found significant positive spatial autocorrelation for all neighborhood socio-demographic characteristics (Global Moran’s I range from 0.24 to 0.86, all P=0.001), for tree density (Global Moran’s I=0.452, P=0.001), and in the OLS regression residuals (Global Moran’s I range from 0.32 to 0.38, all P<0.001). Therefore, we fit the spatial simultaneous autoregressive models. There was a negative correlation between neighborhood percent non-Hispanic Black and tree density (rS=−0.19; conventional P-value=0.016; spatially adjusted P-value=0.299) as well as a negative correlation between predominantly non-Hispanic Black (over 60% Black) neighborhoods and tree density (rS=−0.18; conventional P-value=0.019; spatially adjusted P-value=0.180). While the conventional OLS regression model found a marginally significant inverse relationship between Black neighborhoods and tree density, we found no statistically significant relationship between neighborhood socio-demographic composition and tree density in the spatial regression models. Methodologically, our study suggests the need to take into account spatial autocorrelation as findings/conclusions can change when the spatial autocorrelation is ignored. Substantively, our findings suggest no need for policy intervention vis-à-vis trees in Boston, though we hasten to add that replication studies, and more nuanced data on tree quality, age and diversity are needed. PMID:29354668

  10. Simulating land-use changes by incorporating spatial autocorrelation and self-organization in CLUE-S modeling: a case study in Zengcheng District, Guangzhou, China

    NASA Astrophysics Data System (ADS)

    Mei, Zhixiong; Wu, Hao; Li, Shiyun

    2018-06-01

    The Conversion of Land Use and its Effects at Small regional extent (CLUE-S), which is a widely used model for land-use simulation, utilizes logistic regression to estimate the relationships between land use and its drivers, and thus, predict land-use change probabilities. However, logistic regression disregards possible spatial autocorrelation and self-organization in land-use data. Autologistic regression can depict spatial autocorrelation but cannot address self-organization, while logistic regression by considering only self-organization (NE-logistic regression) fails to capture spatial autocorrelation. Therefore, this study developed a regression (NE-autologistic regression) method, which incorporated both spatial autocorrelation and self-organization, to improve CLUE-S. The Zengcheng District of Guangzhou, China was selected as the study area. The land-use data of 2001, 2005, and 2009, as well as 10 typical driving factors, were used to validate the proposed regression method and the improved CLUE-S model. Then, three future land-use scenarios in 2020: the natural growth scenario, ecological protection scenario, and economic development scenario, were simulated using the improved model. Validation results showed that NE-autologistic regression performed better than logistic regression, autologistic regression, and NE-logistic regression in predicting land-use change probabilities. The spatial allocation accuracy and kappa values of NE-autologistic-CLUE-S were higher than those of logistic-CLUE-S, autologistic-CLUE-S, and NE-logistic-CLUE-S for the simulations of two periods, 2001-2009 and 2005-2009, which proved that the improved CLUE-S model achieved the best simulation and was thereby effective to a certain extent. The scenario simulation results indicated that under all three scenarios, traffic land and residential/industrial land would increase, whereas arable land and unused land would decrease during 2009-2020. Apparent differences also existed in the simulated change sizes and locations of each land-use type under different scenarios. The results not only demonstrate the validity of the improved model but also provide a valuable reference for relevant policy-makers.

  11. Modelling space of spread Dengue Hemorrhagic Fever (DHF) in Central Java use spatial durbin model

    NASA Astrophysics Data System (ADS)

    Ispriyanti, Dwi; Prahutama, Alan; Taryono, Arkadina PN

    2018-05-01

    Dengue Hemorrhagic Fever (DHF) is one of the major public health problems in Indonesia. From year to year, DHF causes Extraordinary Events in most parts of Indonesia, especially Central Java. Central Java consists of 35 districts or cities, each close to the others. Spatial regression is an analysis that estimates the influence of independent variables on the dependent variable while accounting for the influence of neighboring regions. Spatial regression models include the spatial autoregressive model (SAR), the spatial error model (SEM) and the spatial autoregressive moving average model (SARMA). The spatial Durbin model (SDM) is a development of SAR in which both the dependent and independent variables have spatial influence. In this research, the dependent variable is the number of DHF sufferers. The independent variables observed are population density, number of hospitals, residents and health centers, and mean years of schooling. From the multiple regression model test, the variables that significantly affect the spread of DHF are the population and mean years of schooling. Of the models using queen contiguity and rook contiguity, the best is the SDM with queen contiguity because it has the smallest AIC value, 494.12. Factors that generally affect the spread of DHF in Central Java Province are the population size and the mean years of schooling.

  12. Functional CAR models for large spatially correlated functional datasets.

    PubMed

    Zhang, Lin; Baladandayuthapani, Veerabhadran; Zhu, Hongxiao; Baggerly, Keith A; Majewski, Tadeusz; Czerniak, Bogdan A; Morris, Jeffrey S

    2016-01-01

    We develop a functional conditional autoregressive (CAR) model for spatially correlated data for which functions are collected on areal units of a lattice. Our model performs functional response regression while accounting for spatial correlations with potentially nonseparable and nonstationary covariance structure, in both the space and functional domains. We show theoretically that our construction leads to a CAR model at each functional location, with spatial covariance parameters varying and borrowing strength across the functional domain. Using basis transformation strategies, the nonseparable spatial-functional model is computationally scalable to enormous functional datasets, generalizable to different basis functions, and can be used on functions defined on higher dimensional domains such as images. Through simulation studies, we demonstrate that accounting for the spatial correlation in our modeling leads to improved functional regression performance. Applied to a high-throughput spatially correlated copy number dataset, the model identifies genetic markers not identified by comparable methods that ignore spatial correlations.

  13. Gaussian Process Regression Model in Spatial Logistic Regression

    NASA Astrophysics Data System (ADS)

    Sofro, A.; Oktaviarina, A.

    2018-01-01

    Spatial analysis has developed very quickly in the last decade. One of the favorite approaches is based on the neighbourhood of the region. Unfortunately, there are some limitations such as difficulty in prediction. Therefore, we offer Gaussian process regression (GPR) to accommodate the issue. In this paper, we will focus on spatial modeling with GPR for binomial data with logit link function. The performance of the model will be investigated. We will discuss the inference of how to estimate the parameters and hyper-parameters and to predict as well. Furthermore, simulation studies will be explained in the last section.

  14. Evaluating the utility of companion animal tick surveillance practices for monitoring spread and occurrence of human Lyme disease in West Virginia, 2014-2016.

    PubMed

    Hendricks, Brian; Mark-Carew, Miguella; Conley, Jamison

    2017-11-13

    Domestic dogs and cats are potentially effective sentinel populations for monitoring occurrence and spread of Lyme disease. Few studies have evaluated the public health utility of sentinel programmes using geo-analytic approaches. Confirmed Lyme disease cases diagnosed by physicians and ticks submitted by veterinarians to the West Virginia State Health Department were obtained for 2014-2016. Ticks were identified to species, and only Ixodes scapularis were incorporated in the analysis. Separate ordinary least squares (OLS) and spatial lag regression models were conducted to estimate the association between average numbers of Ix. scapularis collected on pets and human Lyme disease incidence. Regression residuals were visualised using Local Moran's I as a diagnostic tool to identify spatial dependence. Statistically significant associations were identified between average numbers of Ix. scapularis collected from dogs and human Lyme disease in the OLS (β=20.7, P<0.001) and spatial lag (β=12.0, P=0.002) regression. No significant associations were identified for cats in either regression model. Statistically significant (P≤0.05) spatial dependence was identified in all regression models. Local Moran's I maps produced for spatial lag regression residuals indicated a decrease in model over- and under-estimation, but identified a higher number of statistically significant outliers than OLS regression. Results support previous conclusions that dogs are effective sentinel populations for monitoring risk of human exposure to Lyme disease. Findings reinforce the utility of spatial analysis of surveillance data, and highlight West Virginia's unique position within the eastern United States in regards to Lyme disease occurrence.

  15. Prediction of hourly PM2.5 using a space-time support vector regression model

    NASA Astrophysics Data System (ADS)

    Yang, Wentao; Deng, Min; Xu, Feng; Wang, Hang

    2018-05-01

    Real-time air quality prediction has been an active field of research in atmospheric environmental science. The existing methods of machine learning are widely used to predict pollutant concentrations because of their enhanced ability to handle complex non-linear relationships. However, because pollutant concentration data, as typical geospatial data, also exhibit spatial heterogeneity and spatial dependence, they may violate the assumptions of independent and identically distributed random variables in most of the machine learning methods. As a result, a space-time support vector regression model is proposed to predict hourly PM2.5 concentrations. First, to address spatial heterogeneity, spatial clustering is executed to divide the study area into several homogeneous or quasi-homogeneous subareas. To handle spatial dependence, a Gauss vector weight function is then developed to determine spatial autocorrelation variables as part of the input features. Finally, a local support vector regression model with spatial autocorrelation variables is established for each subarea. Experimental data on PM2.5 concentrations in Beijing are used to verify whether the results of the proposed model are superior to those of other methods.

  16. Deciphering factors controlling groundwater arsenic spatial variability in Bangladesh

    NASA Astrophysics Data System (ADS)

    Tan, Z.; Yang, Q.; Zheng, C.; Zheng, Y.

    2017-12-01

    Elevated concentrations of geogenic arsenic in groundwater have been found in many countries to exceed 10 μg/L, the WHO's guideline value for drinking water. A common yet unexplained characteristic of groundwater arsenic spatial distribution is the extensive variability at various spatial scales. This study investigates factors influencing the spatial variability of groundwater arsenic in Bangladesh to improve the accuracy of models predicting arsenic exceedance rate spatially. A novel boosted regression tree method is used to establish a weak-learning ensemble model, which is compared to a linear model using a conventional stepwise logistic regression method. The boosted regression tree models offer the advantage of parametric interaction when big datasets are analyzed in comparison to the logistic regression. The point data set (n=3,538) of groundwater hydrochemistry with 19 parameters was obtained by the British Geological Survey in 2001. The spatial data sets of geological parameters (n=13) were from the Consortium for Spatial Information, Technical University of Denmark, University of East Anglia and the FAO, while the soil parameters (n=42) were from the Harmonized World Soil Database. The aforementioned parameters were regressed to categorical groundwater arsenic concentrations below or above three thresholds: 5 μg/L, 10 μg/L and 50 μg/L to identify respective controlling factors. Boosted regression tree method outperformed logistic regression methods in all three threshold levels in terms of accuracy, specificity and sensitivity, resulting in an improvement of spatial distribution map of probability of groundwater arsenic exceeding all three thresholds when compared to disjunctive-kriging interpolated spatial arsenic map using the same groundwater arsenic dataset. Boosted regression tree models also show that the most important controlling factors of groundwater arsenic distribution include groundwater iron content and well depth for all three thresholds. The probability of a well with iron content higher than 5 mg/L to contain greater than 5 μg/L, 10 μg/L and 50 μg/L As is estimated to be more than 91%, 85% and 51%, respectively, while the probability of a well from depth more than 160 m to contain more than 5 μg/L, 10 μg/L and 50 μg/L As is estimated to be less than 38%, 25% and 14%, respectively.

  17. Schistosomiasis Breeding Environment Situation Analysis in Dongting Lake Area

    NASA Astrophysics Data System (ADS)

    Li, Chuanrong; Jia, Yuanyuan; Ma, Lingling; Liu, Zhaoyan; Qian, Yonggang

    2013-01-01

    Monitoring the environmental characteristics, such as vegetation and soil moisture, that govern the spatial and temporal distribution of Oncomelania hupensis (O. hupensis) is of vital importance to schistosomiasis prevention and control. In this study, the relationship between environmental factors derived from remotely sensed data and the density of O. hupensis was first analyzed by a multiple linear regression model. Secondly, the spatial structure of the regression residual was investigated by the semi-variogram method. Thirdly, spatial analysis of the regression residual and the multiple linear regression model were both employed to estimate the spatial variation of O. hupensis density. Finally, the approach was used to monitor and predict the spatial and temporal variations of Oncomelania in the Dongting Lake region, China. The areas of potential O. hupensis habitats were predicted and the influence of the Three Gorges Dam (TGD) project on the density of O. hupensis was analyzed.
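
The semi-variogram step mentioned in the record above can be sketched as follows. This is a minimal illustration of the classical (Matheron) empirical estimator applied to regression residuals, not the authors' procedure. The function name, the distance bins, and the toy transect are assumptions.

```python
import numpy as np

def empirical_semivariogram(coords, residuals, bin_edges):
    """Classical empirical semivariogram of residuals.

    For each distance bin, returns half the mean squared difference
    between residuals of all point pairs whose separation falls in the
    bin (NaN where a bin contains no pairs).
    """
    coords = np.asarray(coords, dtype=float)
    r = np.asarray(residuals, dtype=float)
    i, j = np.triu_indices(len(r), k=1)            # each pair counted once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (r[i] - r[j]) ** 2
    gamma = np.full(len(bin_edges) - 1, np.nan)
    for b in range(len(bin_edges) - 1):
        in_bin = (dist >= bin_edges[b]) & (dist < bin_edges[b + 1])
        if in_bin.any():
            gamma[b] = 0.5 * sqdiff[in_bin].mean()
    return gamma

# Hypothetical example: four sites on a transect, residuals alternate.
sites = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
resid = [0.0, 1.0, 0.0, 1.0]
print(empirical_semivariogram(sites, resid, bin_edges=[0.5, 1.5, 2.5]))
```

A semivariogram of residuals that rises with distance before levelling off indicates spatial structure the regression has not captured, which is what motivates combining the regression with a spatial model of its residuals.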

  18. Restricted spatial regression in practice: Geostatistical models, confounding, and robustness under model misspecification

    USGS Publications Warehouse

    Hanks, Ephraim M.; Schliep, Erin M.; Hooten, Mevin B.; Hoeting, Jennifer A.

    2015-01-01

    In spatial generalized linear mixed models (SGLMMs), covariates that are spatially smooth are often collinear with spatially smooth random effects. This phenomenon is known as spatial confounding and has been studied primarily in the case where the spatial support of the process being studied is discrete (e.g., areal spatial data). In this case, the most common approach suggested is restricted spatial regression (RSR) in which the spatial random effects are constrained to be orthogonal to the fixed effects. We consider spatial confounding and RSR in the geostatistical (continuous spatial support) setting. We show that RSR provides computational benefits relative to the confounded SGLMM, but that Bayesian credible intervals under RSR can be inappropriately narrow under model misspecification. We propose a posterior predictive approach to alleviating this potential problem and discuss the appropriateness of RSR in a variety of situations. We illustrate RSR and SGLMM approaches through simulation studies and an analysis of malaria frequencies in The Gambia, Africa.

  19. [Spatial differentiation and impact factors of Yutian Oasis's soil surface salt based on GWR model].

    PubMed

    Yuan, Yu Yun; Wahap, Halik; Guan, Jing Yun; Lu, Long Hui; Zhang, Qin Qin

    2016-10-01

    In this paper, topsoil salinity data gathered from 24 sampling sites in the Yutian Oasis were used, nine different kinds of environmental variables closely related to soil salinity were selected as influencing factors, then, the spatial distribution characteristics of topsoil salinity and spatial heterogeneity of influencing factors were analyzed by combining the spatial autocorrelation with traditional regression analysis and geographically weighted regression model. Results showed that the topsoil salinity in Yutian Oasis was not of random distribution but had strong spatial dependence, and the spatial autocorrelation index for topsoil salinity was 0.479. Groundwater salinity, groundwater depth, elevation and temperature were the main factors influencing topsoil salt accumulation in arid land oases and they were spatially heterogeneous. The nine selected environmental variables except soil pH had significant influences on topsoil salinity with spatial disparity. GWR model was superior to the OLS model on interpretation and estimation of spatial non-stationary data, also had a remarkable advantage in visualization of modeling parameters.

  20. Implementations of geographically weighted lasso in spatial data with multicollinearity (Case study: Poverty modeling of Java Island)

    NASA Astrophysics Data System (ADS)

    Setiyorini, Anis; Suprijadi, Jadi; Handoko, Budhi

    2017-03-01

    Geographically Weighted Regression (GWR) is a regression model that takes into account the effect of spatial heterogeneity. In applications of GWR, inference on regression coefficients is often of interest, as is estimation and prediction of the response variable. Empirical studies have demonstrated that local correlation between explanatory variables can lead to estimated regression coefficients in GWR that are strongly correlated, a condition known as multicollinearity. This in turn produces large standard errors for the estimated regression coefficients and is hence problematic for inference on relationships between variables. Geographically Weighted Lasso (GWL) is a method capable of dealing with both spatial heterogeneity and local multicollinearity in spatial data sets. GWL is a further development of the GWR method that adds a LASSO (Least Absolute Shrinkage and Selection Operator) constraint in parameter estimation. In this study, GWL is applied using a fixed exponential kernel weights matrix to build a poverty model for Java Island, Indonesia. The results of applying GWL to the poverty data show that the method stabilizes regression coefficients in the presence of multicollinearity and produces lower prediction and estimation errors for the response variable than GWR does.
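
    The local fitting step that GWR and GWL share can be sketched as a kernel-weighted least-squares fit at a single location. The sketch covers the unpenalized GWR step only and omits the LASSO constraint that GWL adds. The kernel form exp(-d / h), the bandwidth value, the simulated data, and the function names are assumptions for illustration, not details taken from the study.

```python
import numpy as np

def exponential_kernel(distances, bandwidth):
    """Fixed exponential kernel: weights decay with distance from the target."""
    return np.exp(-distances / bandwidth)

def gwr_local_coefficients(coords, X, y, target, bandwidth):
    """Weighted least-squares coefficients at one target location:
    beta(target) = (X' W X)^{-1} X' W y, with W = diag(kernel weights)."""
    d = np.linalg.norm(coords - target, axis=1)
    w = exponential_kernel(d, bandwidth)
    Xw = X * w[:, None]                      # rows of X scaled by their weight
    return np.linalg.solve(X.T @ Xw, Xw.T @ y)

# Hypothetical data whose slope drifts from west to east.
rng = np.random.default_rng(1)
n = 200
coords = rng.uniform(size=(n, 2))
x1 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1])
y = 1.0 + (1.0 + 2.0 * coords[:, 0]) * x1 + 0.05 * rng.normal(size=n)

west = gwr_local_coefficients(coords, X, y, np.array([0.1, 0.5]), bandwidth=0.1)
east = gwr_local_coefficients(coords, X, y, np.array([0.9, 0.5]), bandwidth=0.1)
print(west[1], east[1])  # the local slope is larger in the east
```

    Repeating the fit at every observation location yields the coefficient surfaces that GWR maps. GWL would replace the closed-form solve with an L1-penalized fit on the same weighted data.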

  1. Developing and testing a global-scale regression model to quantify mean annual streamflow

    NASA Astrophysics Data System (ADS)

    Barbarossa, Valerio; Huijbregts, Mark A. J.; Hendriks, A. Jan; Beusen, Arthur H. W.; Clavreul, Julie; King, Henry; Schipper, Aafke M.

    2017-01-01

    Quantifying mean annual flow of rivers (MAF) at ungauged sites is essential for assessments of global water supply, ecosystem integrity and water footprints. MAF can be quantified with spatially explicit process-based models, which might be overly time-consuming and data-intensive for this purpose, or with empirical regression models that predict MAF based on climate and catchment characteristics. Yet, regression models have mostly been developed at a regional scale and the extent to which they can be extrapolated to other regions is not known. In this study, we developed a global-scale regression model for MAF based on a dataset unprecedented in size, using observations of discharge and catchment characteristics from 1885 catchments worldwide, measuring between 2 and 10^6 km^2. In addition, we compared the performance of the regression model with the predictive ability of the spatially explicit global hydrological model PCR-GLOBWB by comparing results from both models to independent measurements. We obtained a regression model explaining 89% of the variance in MAF based on catchment area and catchment averaged mean annual precipitation and air temperature, slope and elevation. The regression model performed better than PCR-GLOBWB for the prediction of MAF, as root-mean-square error (RMSE) values were lower (0.29-0.38 compared to 0.49-0.57) and the modified index of agreement (d) was higher (0.80-0.83 compared to 0.72-0.75). Our regression model can be applied globally to estimate MAF at any point of the river network, thus providing a feasible alternative to spatially explicit process-based global hydrological models.
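
    The two skill scores quoted above can be sketched in plain Python. The abstract does not give the formula for the modified index of agreement, so the sketch assumes the absolute-difference form usually attributed to Willmott, d = 1 - sum|O - P| / sum(|P - mean(O)| + |O - mean(O)|). The observed and predicted values are made-up numbers, not data from the study.

```python
import math

def rmse(obs, pred):
    """Root-mean-square error between observations and predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def modified_index_of_agreement(obs, pred):
    """Assumed form: 1 - sum|O - P| / sum(|P - mean(O)| + |O - mean(O)|).
    Ranges from 0 (no agreement) to 1 (perfect agreement)."""
    mean_obs = sum(obs) / len(obs)
    num = sum(abs(o - p) for o, p in zip(obs, pred))
    den = sum(abs(p - mean_obs) + abs(o - mean_obs) for o, p in zip(obs, pred))
    return 1.0 - num / den

# Hypothetical observed and predicted values, with a constant bias of +1.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]

print(rmse(observed, predicted))                         # 1.0
print(modified_index_of_agreement(observed, predicted))  # 1 - 4/9
```

    Lower RMSE and higher d both indicate better agreement, which is the direction of the comparison reported for the regression model against PCR-GLOBWB.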

  2. Hierarchical Bayesian spatial models for predicting multiple forest variables using waveform LiDAR, hyperspectral imagery, and large inventory datasets

    USGS Publications Warehouse

    Finley, Andrew O.; Banerjee, Sudipto; Cook, Bruce D.; Bradford, John B.

    2013-01-01

    In this paper we detail a multivariate spatial regression model that couples LiDAR, hyperspectral and forest inventory data to predict forest outcome variables at a high spatial resolution. The proposed model is used to analyze forest inventory data collected on the US Forest Service Penobscot Experimental Forest (PEF), ME, USA. In addition to helping meet the regression model's assumptions, results from the PEF analysis suggest that the addition of multivariate spatial random effects improves model fit and predictive ability, compared with two commonly applied modeling approaches. This improvement results from explicitly modeling the covariation among forest outcome variables and spatial dependence among observations through the random effects. Direct application of such multivariate models to even moderately large datasets is often computationally infeasible because of cubic order matrix algorithms involved in estimation. We apply a spatial dimension reduction technique to help overcome this computational hurdle without sacrificing richness in modeling.

  3. Eigenvector Spatial Filtering Regression Modeling of Ground PM2.5 Concentrations Using Remotely Sensed Data.

    PubMed

    Zhang, Jingyi; Li, Bin; Chen, Yumin; Chen, Meijie; Fang, Tao; Liu, Yongfeng

    2018-06-11

    This paper proposes a regression model using the Eigenvector Spatial Filtering (ESF) method to estimate ground PM 2.5 concentrations. Covariates are derived from remotely sensed data including aerosol optical depth, normalized difference vegetation index, surface temperature, air pressure, relative humidity, height of planetary boundary layer and digital elevation model. In addition, cultural variables such as factory densities and road densities are also used in the model. With the Yangtze River Delta region as the study area, we constructed ESF-based Regression (ESFR) models at different time scales, using data for the period between December 2015 and November 2016. We found that the ESFR models effectively filtered spatial autocorrelation in the OLS residuals and resulted in increases in the goodness-of-fit metrics as well as reductions in residual standard errors and cross-validation errors, compared to the classic OLS models. The annual ESFR model explained 70% of the variability in PM 2.5 concentrations, 16.7% more than the non-spatial OLS model. With the ESFR models, we performed detailed analyses on the spatial and temporal distributions of PM 2.5 concentrations in the study area. The model predictions are lower than ground observations but match the general trend. The experiment shows that ESFR provides a promising approach to PM 2.5 analysis and prediction.
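
    In eigenvector spatial filtering, the extra regressors are eigenvectors of the doubly centred spatial weights matrix M C M, where M = I - 11'/n. A minimal NumPy sketch follows. The 4 x 4 rook-adjacency grid, the choice of three eigenvectors, and the function names are assumptions for illustration. The study's own weights matrix and eigenvector selection rule are not given in the abstract.

```python
import numpy as np

def rook_grid_weights(rows, cols):
    """Binary rook-adjacency matrix for a regular grid (hypothetical layout)."""
    n = rows * cols
    W = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:               # neighbour to the east
                W[i, i + 1] = W[i + 1, i] = 1.0
            if r + 1 < rows:               # neighbour to the south
                W[i, i + cols] = W[i + cols, i] = 1.0
    return W

def esf_eigenvectors(W, k):
    """Top-k eigenpairs of M C M, with C the symmetrised weights and
    M = I - 11'/n the centring projector."""
    n = W.shape[0]
    C = 0.5 * (W + W.T)
    M = np.eye(n) - np.ones((n, n)) / n
    vals, vecs = np.linalg.eigh(M @ C @ M)  # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1]          # reorder to descending
    return vals[order][:k], vecs[:, order[:k]]

W = rook_grid_weights(4, 4)
eigvals, E = esf_eigenvectors(W, k=3)

# The selected eigenvectors are appended to the design matrix as covariates.
X = np.column_stack([np.ones(W.shape[0]), E])
print(eigvals)
```

    Eigenvectors with large positive eigenvalues describe map patterns with strong positive spatial autocorrelation. Including them as covariates is what lets the regression absorb spatial structure that would otherwise remain in the OLS residuals.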

  4. Modeling vertebrate diversity in Oregon using satellite imagery

    NASA Astrophysics Data System (ADS)

    Cablk, Mary Elizabeth

    Vertebrate diversity was modeled for the state of Oregon using a parametric approach to regression tree analysis. This exploratory data analysis effectively modeled the non-linear relationships between vertebrate richness and phenology, terrain, and climate. Phenology was derived from time-series NOAA-AVHRR satellite imagery for the year 1992 using two methods: principal component analysis and derivation of EROS data center greenness metrics. These two measures of spatial and temporal vegetation condition incorporated the critical temporal element in this analysis. The first three principal components were shown to contain spatial and temporal information about the landscape and discriminated phenologically distinct regions in Oregon. Principal components 2 and 3, 6 greenness metrics, elevation, slope, aspect, annual precipitation, and annual seasonal temperature difference were investigated as correlates to amphibians, birds, all vertebrates, reptiles, and mammals. Variation explained for each regression tree by taxa were: amphibians (91%), birds (67%), all vertebrates (66%), reptiles (57%), and mammals (55%). Spatial statistics were used to quantify the pattern of each taxa and assess validity of resulting predictions from regression tree models. Regression tree analysis was relatively robust against spatial autocorrelation in the response data and graphical results indicated models were well fit to the data.

  5. Spatial interpolation schemes of daily precipitation for hydrologic modeling

    USGS Publications Warehouse

    Hwang, Y.; Clark, M.R.; Rajagopalan, B.; Leavesley, G.

    2012-01-01

    Distributed hydrologic models typically require spatial estimates of precipitation interpolated from sparsely located observational points to the specific grid points. We compare and contrast the performance of regression-based statistical methods for the spatial estimation of precipitation in two hydrologically different basins and confirm that widely used regression-based estimation schemes fail to describe the realistic spatial variability of the daily precipitation field. The methods assessed are: (1) inverse distance weighted average; (2) multiple linear regression (MLR); (3) climatological MLR; and (4) locally weighted polynomial regression (LWP). In order to improve the performance of the interpolations, the authors propose a two-step regression technique for effective daily precipitation estimation. In this simple two-step estimation process, precipitation occurrence is first generated via a logistic regression model before estimating the amount of precipitation separately on wet days. This process generated the precipitation occurrence, amount, and spatial correlation effectively. A distributed hydrologic model (PRMS) was used for the impact analysis in daily time step simulation. Multiple simulations suggested noticeable differences between the input alternatives generated by three different interpolation schemes. Differences are shown in overall simulation error against the observations, degree of explained variability, and seasonal volumes. Simulated streamflows also showed different characteristics in mean, maximum, minimum, and peak flows. Given the same parameter optimization technique, LWP input showed the least streamflow error in Alapaha basin and CMLR input showed the least error (still very close to LWP) in Animas basin. All of the two-step interpolation inputs resulted in lower streamflow error compared to the directly interpolated inputs. © 2011 Springer-Verlag.
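
    The first method in the list, the inverse distance weighted average, can be sketched in plain Python. The distance exponent of 2, the station layout, and the precipitation values are assumptions for illustration. The abstract does not state which exponent the study used.

```python
import math

def idw_estimate(stations, values, target, power=2.0):
    """Inverse-distance-weighted estimate at `target` from station values.
    Weights are 1 / distance**power; an exact station hit returns its value."""
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in zip(stations, values):
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return value  # target coincides with a station
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total

# Hypothetical gauges (coordinates in km) and daily precipitation (mm).
stations = [(0.0, 0.0), (2.0, 0.0)]
precip = [10.0, 20.0]

midpoint = idw_estimate(stations, precip, (1.0, 0.0))  # equidistant: plain mean
near_west = idw_estimate(stations, precip, (0.5, 0.0)) # pulled toward 10 mm
print(midpoint, near_west)
```

    Because every estimate is a weighted mean of the gauge values, it always lies between the smallest and largest observation, which smooths the interpolated field.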

  6. A spatially filtered multilevel model to account for spatial dependency: application to self-rated health status in South Korea

    PubMed Central

    2014-01-01

    Background This study aims to suggest an approach that integrates multilevel models and eigenvector spatial filtering methods and apply it to a case study of self-rated health status in South Korea. In many previous health-related studies, multilevel models and single-level spatial regression are used separately. However, the two methods should be used in conjunction because the objectives of both approaches are important in health-related analyses. The multilevel model enables the simultaneous analysis of both individual and neighborhood factors influencing health outcomes. However, the results of conventional multilevel models are potentially misleading when spatial dependency across neighborhoods exists. Spatial dependency in health-related data indicates that health outcomes in nearby neighborhoods are more similar to each other than those in distant neighborhoods. Spatial regression models can address this problem by modeling spatial dependency. This study explores the possibility of integrating a multilevel model and eigenvector spatial filtering, an advanced spatial regression for addressing spatial dependency in datasets. Methods In this spatially filtered multilevel model, eigenvectors function as additional explanatory variables accounting for unexplained spatial dependency within the neighborhood-level error. The specification addresses the inability of conventional multilevel models to account for spatial dependency, and thereby, generates more robust outputs. Results The findings show that sex, employment status, monthly household income, and perceived levels of stress are significantly associated with self-rated health status. Residents living in neighborhoods with low deprivation and a high doctor-to-resident ratio tend to report higher health status. 
The spatially filtered multilevel model provides unbiased estimations and improves the explanatory power of the model compared to conventional multilevel models although there are no changes in the signs of parameters and the significance levels between the two models in this case study. Conclusions The integrated approach proposed in this paper is a useful tool for understanding the geographical distribution of self-rated health status within a multilevel framework. In future research, it would be useful to apply the spatially filtered multilevel model to other datasets in order to clarify the differences between the two models. It is anticipated that this integrated method will also out-perform conventional models when it is used in other contexts. PMID:24571639

  7. Spatial variability of excess mortality during prolonged dust events in a high-density city: a time-stratified spatial regression approach.

    PubMed

    Wong, Man Sing; Ho, Hung Chak; Yang, Lin; Shi, Wenzhong; Yang, Jinxin; Chan, Ta-Chien

    2017-07-24

    Dust events have long been recognized to be associated with a higher mortality risk. However, no study has investigated how prolonged dust events affect the spatial variability of mortality across districts in a downwind city. In this study, we applied a spatial regression approach to estimate the district-level mortality during two extreme dust events in Hong Kong. We compared spatial and non-spatial models to evaluate the ability of each regression to estimate mortality. We also compared prolonged dust events with non-dust events to determine the influences of community factors on mortality across the city. The density of a built environment (estimated by the sky view factor) had positive association with excess mortality in each district, while socioeconomic deprivation contributed by lower income and lower education induced higher mortality impact in each territory planning unit during a prolonged dust event. Based on the model comparison, spatial error modelling with the 1st order of queen contiguity consistently outperformed other models. The high-risk areas with higher increase in mortality were located in an urban high-density environment with higher socioeconomic deprivation. Our model design shows the ability to predict spatial variability of mortality risk during an extreme weather event that is not able to be estimated based on traditional time-series analysis or ecological studies. Our spatial protocol can be used for public health surveillance, sustainable planning and disaster preparation when relevant data are available.
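
    First-order queen contiguity treats two areal units as neighbours when they share an edge or a corner. The study's districts are irregular polygons, so the regular grid below is only a simplified stand-in, and the function names are hypothetical.

```python
def queen_neighbors(rows, cols):
    """First-order queen contiguity on a rows x cols grid.
    Cells are numbered row by row; returns {cell: [neighbour cells]}."""
    neighbors = {}
    for r in range(rows):
        for c in range(cols):
            adjacent = []
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # a cell is not its own neighbour
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        adjacent.append(rr * cols + cc)
            neighbors[r * cols + c] = adjacent
    return neighbors

def row_standardized_weights(neighbors):
    """Give each neighbour of a cell the weight 1 / (number of neighbours)."""
    return {cell: {j: 1.0 / len(adj) for j in adj}
            for cell, adj in neighbors.items()}

nb = queen_neighbors(3, 3)
weights = row_standardized_weights(nb)
print(nb[0])       # corner cell: [1, 3, 4]
print(len(nb[4]))  # centre cell: 8 neighbours
```

    A weights structure of this kind is what a spatial error model uses to define which units' error terms are allowed to be correlated.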

  8. Publically accessible decision support system of the spatially referenced regressions on watershed attributes (SPARROW) model and model enhancements in South Carolina

    Treesearch

    Celeste Journey; Anne B. Hoos; David E. Ladd; John W. Brakebill; Richard A. Smith

    2016-01-01

    The U.S. Geological Survey (USGS) National Water Quality Assessment program has developed a web-based decision support system (DSS) to provide free public access to the steady-state SPAtially Referenced Regressions On Watershed attributes (SPARROW) model simulation results on nutrient conditions in streams and rivers and to offer scenario testing capabilities for...

  9. Area-to-point regression kriging for pan-sharpening

    NASA Astrophysics Data System (ADS)

    Wang, Qunming; Shi, Wenzhong; Atkinson, Peter M.

    2016-04-01

    Pan-sharpening is a technique to combine the fine spatial resolution panchromatic (PAN) band with the coarse spatial resolution multispectral bands of the same satellite to create a fine spatial resolution multispectral image. In this paper, area-to-point regression kriging (ATPRK) is proposed for pan-sharpening. ATPRK considers the PAN band as the covariate. Moreover, ATPRK is extended with a local approach, called adaptive ATPRK (AATPRK), which fits a regression model using a local, non-stationary scheme such that the regression coefficients change across the image. The two geostatistical approaches, ATPRK and AATPRK, were compared to the 13 state-of-the-art pan-sharpening approaches summarized in Vivone et al. (2015) in experiments on three separate datasets. ATPRK and AATPRK produced more accurate pan-sharpened images than the 13 benchmark algorithms in all three experiments. Unlike the benchmark algorithms, the two geostatistical solutions precisely preserved the spectral properties of the original coarse data. Furthermore, ATPRK can be enhanced by a local scheme in AATPRK, in cases where the residuals from a global regression model are such that their spatial character varies locally.

  10. Spatial structure, sampling design and scale in remotely-sensed imagery of a California savanna woodland

    NASA Technical Reports Server (NTRS)

    Mcgwire, K.; Friedl, M.; Estes, J. E.

    1993-01-01

    This article describes research related to sampling techniques for establishing linear relations between land surface parameters and remotely-sensed data. Predictive relations are estimated between percentage tree cover in a savanna environment and a normalized difference vegetation index (NDVI) derived from the Thematic Mapper sensor. Spatial autocorrelation in original measurements and regression residuals is examined using semi-variogram analysis at several spatial resolutions. Sampling schemes are then tested to examine the effects of autocorrelation on predictive linear models in cases of small sample sizes. Regression models between image and ground data are affected by the spatial resolution of analysis. Reducing the influence of spatial autocorrelation by enforcing minimum distances between samples may also improve empirical models which relate ground parameters to satellite data.
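
    The semi-variogram analysis mentioned above rests on the empirical semivariance, gamma(h) = sum of (z_i - z_j)^2 over pairs separated by roughly h, divided by 2 N(h). A plain-Python sketch with hypothetical sample locations, values, and distance bins:

```python
import math

def empirical_semivariogram(coords, values, bin_width, n_bins):
    """Empirical semivariance per distance bin.
    Bin b collects pairs whose separation d satisfies
    b * bin_width <= d < (b + 1) * bin_width. Empty bins yield None."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            b = int(d // bin_width)
            if b < n_bins:
                sums[b] += (values[i] - values[j]) ** 2
                counts[b] += 1
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]

# Hypothetical transect with a steady trend in the measured value.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [0.0, 1.0, 2.0, 3.0]

gamma = empirical_semivariogram(coords, values, bin_width=1.0, n_bins=4)
print(gamma)  # [None, 0.5, 2.0, 4.5]
```

    The distance at which the semivariance levels off indicates how far apart samples must be before they can be treated as approximately independent, which is the rationale for enforcing a minimum distance between samples.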

  11. Hyper-Spectral Image Analysis With Partially Latent Regression and Spatial Markov Dependencies

    NASA Astrophysics Data System (ADS)

    Deleforge, Antoine; Forbes, Florence; Ba, Sileye; Horaud, Radu

    2015-09-01

    Hyper-spectral data can be analyzed to recover physical properties at large planetary scales. This involves resolving inverse problems which can be addressed within machine learning, with the advantage that, once a relationship between physical parameters and spectra has been established in a data-driven fashion, the learned relationship can be used to estimate physical parameters for new hyper-spectral observations. Within this framework, we propose a spatially-constrained and partially-latent regression method which maps high-dimensional inputs (hyper-spectral images) onto low-dimensional responses (physical parameters such as the local chemical composition of the soil). The proposed regression model comprises two key features. Firstly, it combines a Gaussian mixture of locally-linear mappings (GLLiM) with a partially-latent response model. While the former makes high-dimensional regression tractable, the latter makes it possible to deal with physical parameters that cannot be observed or, more generally, with data contaminated by experimental artifacts that cannot be explained with noise models. Secondly, spatial constraints are introduced in the model through a Markov random field (MRF) prior which provides a spatial structure to the Gaussian-mixture hidden variables. Experiments conducted on a database composed of remotely sensed observations of the planet Mars collected by the Mars Express orbiter demonstrate the effectiveness of the proposed model.

  12. Comparing spatially varying coefficient models: a case study examining violent crime rates and their relationships to alcohol outlets and illegal drug arrests

    NASA Astrophysics Data System (ADS)

    Wheeler, David C.; Waller, Lance A.

    2009-03-01

    In this paper, we compare and contrast a Bayesian spatially varying coefficient process (SVCP) model with a geographically weighted regression (GWR) model for the estimation of the potentially spatially varying regression effects of alcohol outlets and illegal drug activity on violent crime in Houston, Texas. In addition, we focus on the inherent coefficient shrinkage properties of the Bayesian SVCP model as a way to address increased coefficient variance that follows from collinearity in GWR models. We outline the advantages of the Bayesian model in terms of reducing inflated coefficient variance, enhanced model flexibility, and more formal measuring of model uncertainty for prediction. We find spatially varying effects for alcohol outlets and drug violations, but the amount of variation depends on the type of model used. For the Bayesian model, this variation is controllable through the amount of prior influence placed on the variance of the coefficients. For example, the spatial pattern of coefficients is similar for the GWR and Bayesian models when a relatively large prior variance is used in the Bayesian model.

  13. Spatiotemporal variability of urban growth factors: A global and local perspective on the megacity of Mumbai

    NASA Astrophysics Data System (ADS)

    Shafizadeh-Moghadam, Hossein; Helbich, Marco

    2015-03-01

    The rapid growth of megacities requires special attention among urban planners worldwide, and particularly in Mumbai, India, where growth is very pronounced. To cope with the planning challenges this will bring, developing a retrospective understanding of urban land-use dynamics and the underlying driving-forces behind urban growth is a key prerequisite. This research uses regression-based land-use change models - and in particular non-spatial logistic regression models (LR) and auto-logistic regression models (ALR) - for the Mumbai region over the period 1973-2010, in order to determine the drivers behind spatiotemporal urban expansion. Both global models are complemented by a local, spatial model, the so-called geographically weighted logistic regression (GWLR) model, one that explicitly permits variations in driving-forces across space. The study comes to two main conclusions. First, both global models suggest similar driving-forces behind urban growth over time, revealing that LRs and ALRs result in estimated coefficients with comparable magnitudes. Second, all the local coefficients show distinctive temporal and spatial variations. It is therefore concluded that GWLR aids our understanding of urban growth processes, and so can assist context-related planning and policymaking activities when seeking to secure a sustainable urban future.

  14. Crime Modeling using Spatial Regression Approach

    NASA Astrophysics Data System (ADS)

    Saleh Ahmar, Ansari; Adiatma; Kasim Aidid, M.

    2018-01-01

    Criminal acts in Indonesia increase in both variety and quantity every year. Murder, rape, assault, vandalism, theft, fraud, fencing, and other offences make people feel unsafe. The risk of society being exposed to crime is measured by the number of cases reported to the police: the higher the number of reports to the police, the higher the number of crimes in the region. In this research, criminality in South Sulawesi, Indonesia, is modeled with the risk of society being exposed to crime as the dependent variable. Modeling is done with an area approach using the Spatial Autoregressive (SAR) and Spatial Error Model (SEM) methods. The independent variables are population density, the number of poor people, GDP per capita, unemployment, and the human development index (HDI). The spatial regression analysis shows that there is no spatial dependence, in either lag or error form, in South Sulawesi.

  15. Comparing spatial regression to random forests for large ...

    EPA Pesticide Factsheets

    Environmental data may be “large” due to number of records, number of covariates, or both. Random forests has a reputation for good predictive performance when using many covariates, whereas spatial regression, when using reduced rank methods, has a reputation for good predictive performance when using many records. In this study, we compare these two techniques using a data set containing the macroinvertebrate multimetric index (MMI) at 1859 stream sites with over 200 landscape covariates. Our primary goal is predicting MMI at over 1.1 million perennial stream reaches across the USA. For spatial regression modeling, we develop two new methods to accommodate large data: (1) a procedure that estimates optimal Box-Cox transformations to linearize covariate relationships; and (2) a computationally efficient covariate selection routine that takes into account spatial autocorrelation. We show that our new methods lead to cross-validated performance similar to random forests, but that there is an advantage for spatial regression when quantifying the uncertainty of the predictions. Simulations are used to clarify advantages for each method. This research investigates different approaches for modeling and mapping national stream condition. We use MMI data from the EPA's National Rivers and Streams Assessment and predictors from StreamCat (Hill et al., 2015). Previous studies have focused on modeling the MMI condition classes (i.e., good, fair, and po

  16. Application of spatial and non-spatial data analysis in determination of the factors that impact municipal solid waste generation rates in Turkey

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Keser, Saniye; Duzgun, Sebnem; Department of Geodetic and Geographic Information Technologies, Middle East Technical University, 06800 Ankara

    Highlights: ► Spatial autocorrelation exists in municipal solid waste generation rates for different provinces in Turkey. ► Traditional non-spatial regression models may not provide sufficient information for better solid waste management. ► Unemployment rate is a global variable that significantly impacts the waste generation rates in Turkey. ► Significances of global parameters may diminish at local scale for some provinces. ► GWR model can be used to create clusters of cities for solid waste management. Abstract: In studies focusing on the factors that impact solid waste generation habits and rates, the potential spatial dependency in solid waste generation data is not considered in relating the waste generation rates to its determinants. In this study, spatial dependency is taken into account in determination of the significant socio-economic and climatic factors that may be of importance for the municipal solid waste (MSW) generation rates in different provinces of Turkey. Simultaneous spatial autoregression (SAR) and geographically weighted regression (GWR) models are used for the spatial data analyses. Similar to ordinary least squares regression (OLSR), regression coefficients are global in SAR model. In other words, the effect of a given independent variable on a dependent variable is valid for the whole country. Unlike OLSR or SAR, GWR reveals the local impact of a given factor (or independent variable) on the waste generation rates of different provinces. Results show that provinces within closer neighborhoods have similar MSW generation rates. On the other hand, this spatial autocorrelation is not very high for the exploratory variables considered in the study. OLSR and SAR models have similar regression coefficients. GWR is useful to indicate the local determinants of MSW generation rates. GWR model can be utilized to plan waste management activities at local scale including waste minimization, collection, treatment, and disposal. At global scale, the MSW generation rates in Turkey are significantly related to unemployment rate and asphalt-paved roads ratio. Yet, significances of these variables may diminish at local scale for some provinces. At local scale, different factors may be important in affecting MSW generation rates.

  17. Mapping the Climate of Puerto Rico, Vieques and Culebra.

    Treesearch

    Christopher Daly; E. H. Helmer; Maya Quinones

    2003-01-01

    Spatially explicit climate data contribute to watershed resource management, mapping vegetation type with satellite imagery, mapping present and hypothetical future ecological zones, and predicting species distributions. The regression based Parameter-elevation Regressions on Independent Slopes Model (PRISM) uses spatial data sets, a knowledge base and expert...

  18. Spatial Statistical Network Models for Stream and River Temperature in the Chesapeake Bay Watershed, USA

    EPA Science Inventory

    Regional temperature models are needed for characterizing and mapping stream thermal regimes, establishing reference conditions, predicting future impacts and identifying critical thermal refugia. Spatial statistical models have been developed to improve regression modeling techn...

  19. Revisiting crash spatial heterogeneity: A Bayesian spatially varying coefficients approach.

    PubMed

    Xu, Pengpeng; Huang, Helai; Dong, Ni; Wong, S C

    2017-01-01

    This study was performed to investigate the spatially varying relationships between crash frequency and related risk factors. A Bayesian spatially varying coefficients model was elaborately introduced as a methodological alternative to simultaneously account for the unstructured and spatially structured heterogeneity of the regression coefficients in predicting crash frequencies. The proposed method was appealing in that the parameters were modeled via a conditional autoregressive prior distribution, which involved a single set of random effects and a spatial correlation parameter with extreme values corresponding to pure unstructured or pure spatially correlated random effects. A case study using a three-year crash dataset from the Hillsborough County, Florida, was conducted to illustrate the proposed model. Empirical analysis confirmed the presence of both unstructured and spatially correlated variations in the effects of contributory factors on severe crash occurrences. The findings also suggested that ignoring spatially structured heterogeneity may result in biased parameter estimates and incorrect inferences, while assuming the regression coefficients to be spatially clustered only is probably subject to the issue of over-smoothness. Copyright © 2016 Elsevier Ltd. All rights reserved.

  20. Consequences of kriging and land use regression for PM2.5 predictions in epidemiologic analyses: Insights into spatial variability using high-resolution satellite data

    PubMed Central

    Alexeeff, Stacey E.; Schwartz, Joel; Kloog, Itai; Chudnovsky, Alexandra; Koutrakis, Petros; Coull, Brent A.

    2016-01-01

    Many epidemiological studies use predicted air pollution exposures as surrogates for true air pollution levels. These predicted exposures contain exposure measurement error, yet simulation studies have typically found negligible bias in resulting health effect estimates. However, previous studies typically assumed a statistical spatial model for air pollution exposure, which may be oversimplified. We address this shortcoming by assuming a realistic, complex exposure surface derived from fine-scale (1km x 1km) remote-sensing satellite data. Using simulation, we evaluate the accuracy of epidemiological health effect estimates in linear and logistic regression when using spatial air pollution predictions from kriging and land use regression models. We examined chronic (long-term) and acute (short-term) exposure to air pollution. Results varied substantially across different scenarios. Exposure models with low out-of-sample R2 yielded severe biases in the health effect estimates of some models, ranging from 60% upward bias to 70% downward bias. One land use regression exposure model with greater than 0.9 out-of-sample R2 yielded upward biases up to 13% for acute health effect estimates. Almost all models drastically underestimated the standard errors. Land use regression models performed better in chronic effects simulations. These results can help researchers when interpreting health effect estimates in these types of studies. PMID:24896768

  1. Spatial measurement error and correction by spatial SIMEX in linear regression models when using predicted air pollution exposures.

    PubMed

    Alexeeff, Stacey E; Carroll, Raymond J; Coull, Brent

    2016-04-01

    Spatial modeling of air pollution exposures is widespread in air pollution epidemiology research as a way to improve exposure assessment. However, there are key sources of exposure model uncertainty when air pollution is modeled, including estimation error and model misspecification. We examine the use of predicted air pollution levels in linear health effect models under a measurement error framework. For the prediction of air pollution exposures, we consider a universal Kriging framework, which may include land-use regression terms in the mean function and a spatial covariance structure for the residuals. We derive the bias induced by estimation error and by model misspecification in the exposure model, and we find that a misspecified exposure model can induce asymptotic bias in the effect estimate of air pollution on health. We propose a new spatial simulation extrapolation (SIMEX) procedure, and we demonstrate that the procedure has good performance in correcting this asymptotic bias. We illustrate spatial SIMEX in a study of air pollution and birthweight in Massachusetts. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
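
    To illustrate the general simulation-extrapolation idea only, the sketch below implements classical SIMEX for a simple linear regression with independent additive measurement error. It is not the spatial SIMEX procedure proposed in the paper, which the abstract does not describe in detail. The function name `simex_slope`, the known error standard deviation `sigma_u`, the quadratic extrapolant, and the simulated data are all assumptions.

```python
import numpy as np

def simex_slope(w, y, sigma_u, lambdas=(0.0, 0.5, 1.0, 1.5, 2.0), n_sim=50, seed=0):
    """Classical SIMEX estimate of a regression slope.

    w       : error-prone exposure (true exposure plus noise with s.d. sigma_u)
    y       : outcome
    Returns (simex_estimate, naive_estimate).
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(w, dtype=float)
    y = np.asarray(y, dtype=float)
    mean_slopes = []
    for lam in lambdas:
        slopes = []
        for _ in range(n_sim):
            # Simulation step: add extra error with variance lam * sigma_u**2
            w_lam = w + np.sqrt(lam) * sigma_u * rng.standard_normal(w.shape)
            slopes.append(np.polyfit(w_lam, y, 1)[0])
        mean_slopes.append(np.mean(slopes))
    # Extrapolation step: fit a quadratic in lambda and evaluate at lambda = -1
    coef = np.polyfit(lambdas, mean_slopes, 2)
    return np.polyval(coef, -1.0), mean_slopes[0]

# Simulated data with a true slope of 2 (illustrative only)
rng = np.random.default_rng(1)
x_true = rng.normal(size=5000)
w_obs = x_true + 0.5 * rng.normal(size=5000)
y_obs = 1.0 + 2.0 * x_true + 0.5 * rng.normal(size=5000)

simex_est, naive_est = simex_slope(w_obs, y_obs, sigma_u=0.5)
```

    In this setting the naive slope is attenuated toward zero, and the extrapolated estimate should lie closer to the true slope, although the quadratic extrapolant does not remove the bias entirely.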

  2. Influence of landscape-scale factors in limiting brook trout populations in Pennsylvania streams

    USGS Publications Warehouse

    Kocovsky, P.M.; Carline, R.F.

    2006-01-01

    Landscapes influence the capacity of streams to produce trout through their effect on water chemistry and other factors at the reach scale. Trout abundance also fluctuates over time; thus, to thoroughly understand how spatial factors at landscape scales affect trout populations, one must assess the changes in populations over time to provide a context for interpreting the importance of spatial factors. We used data from the Pennsylvania Fish and Boat Commission's fisheries management database to investigate spatial factors that affect the capacity of streams to support brook trout Salvelinus fontinalis and to provide models useful for their management. We assessed the relative importance of spatial and temporal variation by calculating variance components and comparing relative standard errors for spatial and temporal variation. We used binary logistic regression to predict the presence of harvestable-length brook trout and multiple linear regression to assess the mechanistic links between landscapes and trout populations and to predict population density. The variance in trout density among streams was equal to or greater than the temporal variation for several streams, indicating that differences among sites affect population density. Logistic regression models correctly predicted the absence of harvestable-length brook trout in 60% of validation samples. The r² value for the linear regression model predicting density was 0.3, indicating low predictive ability. Both logistic and linear regression models supported buffering capacity against acid episodes as an important mechanistic link between landscapes and trout populations. Although our models fail to predict trout densities precisely, their success at elucidating the mechanistic links between landscapes and trout populations, in concert with the importance of spatial variation, increases our understanding of factors affecting brook trout abundance and will help managers and private groups to protect and enhance populations of wild brook trout. © Copyright by the American Fisheries Society 2006.

  3. The effect of occlusion on the semantics of projective spatial terms: a case study in grounding language in perception.

    PubMed

    Kelleher, John D; Ross, Robert J; Sloan, Colm; Mac Namee, Brian

    2011-02-01

    Although data-driven spatial template models provide a practical and cognitively motivated mechanism for characterizing spatial term meaning, the influence of perceptual rather than solely geometric and functional properties has yet to be systematically investigated. In the light of this, in this paper, we investigate the effects of the perceptual phenomenon of object occlusion on the semantics of projective terms. We did this by conducting a study to test whether object occlusion had a noticeable effect on the acceptance values assigned to projective terms with respect to a 2.5-dimensional visual stimulus. Based on the data collected, a regression model was constructed and presented. Subsequent analysis showed that the regression model that included the occlusion factor outperformed an adaptation of Regier & Carlson's well-regarded AVS model for that same spatial configuration.

  4. Pragmatic estimation of a spatio-temporal air quality model with irregular monitoring data

    NASA Astrophysics Data System (ADS)

    Sampson, Paul D.; Szpiro, Adam A.; Sheppard, Lianne; Lindström, Johan; Kaufman, Joel D.

    2011-11-01

    Statistical analyses of health effects of air pollution have increasingly used GIS-based covariates for prediction of ambient air quality in "land use" regression models. More recently these spatial regression models have accounted for spatial correlation structure in combining monitoring data with land use covariates. We present a flexible spatio-temporal modeling framework and pragmatic, multi-step estimation procedure that accommodates essentially arbitrary patterns of missing data with respect to an ideally complete space by time matrix of observations on a network of monitoring sites. The methodology incorporates a model for smooth temporal trends with coefficients varying in space according to Partial Least Squares regressions on a large set of geographic covariates and nonstationary modeling of spatio-temporal residuals from these regressions. This work was developed to provide spatial point predictions of PM 2.5 concentrations for the Multi-Ethnic Study of Atherosclerosis and Air Pollution (MESA Air) using irregular monitoring data derived from the AQS regulatory monitoring network and supplemental short-time scale monitoring campaigns conducted to better predict intra-urban variation in air quality. We demonstrate the interpretation and accuracy of this methodology in modeling data from 2000 through 2006 in six U.S. metropolitan areas and establish a basis for likelihood-based estimation.

  5. [Spatial patterns and influence factors of specialization in tea cultivation based on geographically weighted regression model: A case study of Anxi County of Fujian Province, China].

    PubMed

    Shui, Wei; DU, Yong; Chen, Yi Ping; Jian, Xiao Mei; Fan, Bing Xiong

    2017-04-18

    Anxi County, specializing in tea cultivation, was taken as a case in this research. Pearson correlation analysis, ordinary least squares model (OLS) and geographically weighted regression model (GWR) were used to select four primary influence factors of specialization in tea cultivation (i.e., the average elevation, net income per capita, proportion of agricultural population, and the distance from roads) by analyzing the specialization degree of each town of Anxi County. Meanwhile, the spatial patterns of specialization in tea cultivation of Anxi County were evaluated. The results indicated that specialization in tea cultivation of Anxi County showed an obvious spatial auto-correlation, and a spatial pattern with "low-middle-high" circle structure, which was similar to Von Thünen's circle structure model, appeared from the county town to its surrounding region. Meanwhile, GWR (0.624) had a better fitting degree than OLS (0.595), and GWR could reasonably expound the spatial data. Contrary to the agricultural location theory of Von Thünen's model, which indicated that distance from market was a determination factor, the specialization degree of tea cultivation in Anxi was mainly decided by natural conditions of mountain area, instead of the social factors. Specialization degree of tea cultivation was positively correlated with the average elevation, net income per capita and the proportion of agricultural population, while a negative correlation was found between the distance from roads and specialization degree of tea cultivation. Coefficients of regression between the specialization degree of tea cultivation and two factors (i.e., the average elevation and net income per capita) showed a spatial pattern of higher level in the north direction and lower level in the south direction. On the contrary, the regression coefficients for the proportion of agricultural population increased from south to north of Anxi County. 
Furthermore, regression coefficient for the distance from roads showed a spatial pattern of higher level in the northeast direction and lower level in the southwest direction of Anxi County.
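
    The spatially varying coefficients reported above come from fitting a separate weighted regression at each location. The sketch below shows that idea in minimal form, assuming a Gaussian distance kernel with a fixed bandwidth. The study's kernel and bandwidth are not given in the abstract, and the function name `gwr_coefficients` and the simulated data are hypothetical.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Fit one weighted least-squares regression per location (Gaussian kernel).

    Returns an array with one row of coefficients (intercept first) per location.
    """
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    X1 = np.column_stack([np.ones(len(y)), X])     # add intercept column
    betas = np.empty((len(y), X1.shape[1]))
    for i, c in enumerate(coords):
        d = np.linalg.norm(coords - c, axis=1)     # distances to focal location
        w = np.exp(-0.5 * (d / bandwidth) ** 2)    # nearer points weigh more
        XtW = X1.T * w
        betas[i] = np.linalg.solve(XtW @ X1, XtW @ y)
    return betas

# Simulated data: the slope on x increases from south to north (illustrative only)
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 1.0, size=(400, 2))
northing = coords[:, 1]
x = rng.normal(size=400)
y = (1.0 + 2.0 * northing) * x + rng.normal(scale=0.1, size=400)

local = gwr_coefficients(coords, x, y, bandwidth=0.15)
global_slope = np.polyfit(x, y, 1)[0]              # single OLS slope for comparison
north_mean = local[northing > 0.8, 1].mean()
south_mean = local[northing < 0.2, 1].mean()
```

    A global OLS fit returns one slope for the whole area, whereas the local fits recover the north-south gradient built into the simulated data.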

  6. Bayesian structured additive regression modeling of epidemic data: application to cholera

    PubMed Central

    2012-01-01

    Background: A significant interest in spatial epidemiology lies in identifying associated risk factors that enhance the risk of infection. Most studies, however, make no, or limited, use of the spatial structure of the data, as well as possible nonlinear effects of the risk factors. Methods: We develop a Bayesian Structured Additive Regression model for cholera epidemic data. Model estimation and inference are based on a fully Bayesian approach via Markov Chain Monte Carlo (MCMC) simulations. The model is applied to cholera epidemic data in the Kumasi Metropolis, Ghana. Proximity to refuse dumps, density of refuse dumps, and proximity to potential cholera reservoirs were modeled as continuous functions; presence of slum settlers and population density were modeled as fixed effects, whereas spatial references to the communities were modeled as structured and unstructured spatial effects. Results: We observe that the risk of cholera is associated with slum settlements and high population density. The risk of cholera is equal and lower for communities with fewer refuse dumps, but variable and higher for communities with more refuse dumps. The risk is also lower for communities distant from refuse dumps and potential cholera reservoirs. The results also indicate distinct spatial variation in the risk of cholera infection. Conclusion: The study highlights the usefulness of a Bayesian semi-parametric regression model in analyzing public health data. These findings could serve as novel information to help health planners and policy makers in making effective decisions to control or prevent cholera epidemics. PMID:22866662

  7. Evaluation of land use regression models in Detroit, Michigan

    EPA Science Inventory

    Introduction: Land use regression (LUR) models have emerged as a cost-effective tool for characterizing exposure in epidemiologic health studies. However, little critical attention has been focused on validation of these models as a step toward temporal and spatial extension of ...

  8. Spatial Autocorrelation Approaches to Testing Residuals from Least Squares Regression.

    PubMed

    Chen, Yanguang

    2016-01-01

    In geo-statistics, the Durbin-Watson test is frequently employed to detect the presence of residual serial correlation from least squares regression analyses. However, the Durbin-Watson statistic is only suitable for ordered time or spatial series. If the variables comprise cross-sectional data coming from spatial random sampling, the test will be ineffectual because the value of Durbin-Watson's statistic depends on the sequence of data points. This paper develops two new statistics for testing serial correlation of residuals from least squares regression based on spatial samples. By analogy with the new form of Moran's index, an autocorrelation coefficient is defined with a standardized residual vector and a normalized spatial weight matrix. Then by analogy with the Durbin-Watson statistic, two types of new serial correlation indices are constructed. As a case study, the two newly presented statistics are applied to a spatial sample of 29 China's regions. These results show that the new spatial autocorrelation models can be used to test the serial correlation of residuals from regression analysis. In practice, the new statistics can make up for the deficiencies of the Durbin-Watson test.
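
    As a point of reference for the residual tests discussed above, the sketch below computes the standard global Moran's I for OLS residuals. The two new statistics proposed in the paper are not reproduced, because the abstract does not give their formulas. The function name `morans_i`, the chain-shaped neighbourhood matrix, and the example data are assumptions.

```python
import numpy as np

def morans_i(values, W):
    """Global Moran's I: (n / sum(W)) * (z' W z) / (z' z), z = centred values."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    W = np.asarray(W, dtype=float)
    return (len(z) / W.sum()) * (z @ W @ z) / (z @ z)

# Hypothetical transect of 30 sites; each site neighbours the adjacent sites
n = 30
W = np.zeros((n, n))
idx = np.arange(n - 1)
W[idx, idx + 1] = 1.0
W[idx + 1, idx] = 1.0

# The response has a smooth spatial component that the regression omits
s = np.arange(n)
x = np.cos(s)
y = 2.0 * x + np.sin(s / 5.0)

X1 = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
residuals = y - X1 @ beta

i_resid = morans_i(residuals, W)                    # strongly positive
i_alternating = morans_i(np.tile([1.0, -1.0], n // 2), W)   # perfectly negative
```

    Unlike the Durbin-Watson statistic, this index depends on the neighbourhood matrix `W` rather than on the order in which the observations are listed.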

  9. Estimating riparian understory vegetation cover with beta regression and copula models

    USGS Publications Warehouse

    Eskelson, Bianca N.I.; Madsen, Lisa; Hagar, Joan C.; Temesgen, Hailemariam

    2011-01-01

    Understory vegetation communities are critical components of forest ecosystems. As a result, the importance of modeling understory vegetation characteristics in forested landscapes has become more apparent. Abundance measures such as shrub cover are bounded between 0 and 1, exhibit heteroscedastic error variance, and are often subject to spatial dependence. These distributional features tend to be ignored when shrub cover data are analyzed. The beta distribution has been used successfully to describe the frequency distribution of vegetation cover. Beta regression models ignoring spatial dependence (BR) and accounting for spatial dependence (BRdep) were used to estimate percent shrub cover as a function of topographic conditions and overstory vegetation structure in riparian zones in western Oregon. The BR models showed poor explanatory power (pseudo-R2 ≤ 0.34) but outperformed ordinary least-squares (OLS) and generalized least-squares (GLS) regression models with logit-transformed response in terms of mean square prediction error and absolute bias. We introduce a copula (COP) model that is based on the beta distribution and accounts for spatial dependence. A simulation study was designed to illustrate the effects of incorrectly assuming normality, equal variance, and spatial independence. It showed that BR, BRdep, and COP models provide unbiased parameter estimates, whereas OLS and GLS models result in slightly biased estimates for two of the three parameters. On the basis of the simulation study, 93–97% of the GLS, BRdep, and COP confidence intervals covered the true parameters, whereas OLS and BR only resulted in 84–88% coverage, which demonstrated the superiority of GLS, BRdep, and COP over OLS and BR models in providing standard errors for the parameter estimates in the presence of spatial dependence.

  10. The Association between Environmental Factors and Scarlet Fever Incidence in Beijing Region: Using GIS and Spatial Regression Models

    PubMed Central

    Mahara, Gehendra; Wang, Chao; Yang, Kun; Chen, Sipeng; Guo, Jin; Gao, Qi; Wang, Wei; Wang, Quanyi; Guo, Xiuhua

    2016-01-01

    (1) Background: Evidence regarding scarlet fever and its relationship with meteorological, including air pollution factors, is not very available. This study aimed to examine the relationship between ambient air pollutants and meteorological factors with scarlet fever occurrence in Beijing, China. (2) Methods: A retrospective ecological study was carried out to distinguish the epidemic characteristics of scarlet fever incidence in Beijing districts from 2013 to 2014. Daily incidence and corresponding air pollutant and meteorological data were used to develop the model. Global Moran’s I statistic and Anselin’s local Moran’s I (LISA) were applied to detect the spatial autocorrelation (spatial dependency) and clusters of scarlet fever incidence. The spatial lag model (SLM) and spatial error model (SEM) including ordinary least squares (OLS) models were then applied to probe the association between scarlet fever incidence and meteorological including air pollution factors. (3) Results: Among the 5491 cases, more than half (62%) were male, and more than one-third (37.8%) were female, with the annual average incidence rate 14.64 per 100,000 population. Spatial autocorrelation analysis exhibited the existence of spatial dependence; therefore, we applied spatial regression models. After comparing the values of R-square, log-likelihood and the Akaike information criterion (AIC) among the three models, the OLS model (R2 = 0.0741, log likelihood = −1819.69, AIC = 3665.38), SLM (R2 = 0.0786, log likelihood = −1819.04, AIC = 3665.08) and SEM (R2 = 0.0743, log likelihood = −1819.67, AIC = 3665.36), identified that the spatial lag model (SLM) was best for model fit for the regression model. There was a positive significant association between nitrogen oxide (p = 0.027), rainfall (p = 0.036) and sunshine hour (p = 0.048), while the relative humidity (p = 0.034) had an adverse association with scarlet fever incidence in SLM. 
(4) Conclusions: Our findings indicated that meteorological, as well as air pollutant factors may increase the incidence of scarlet fever; these findings may help to guide scarlet fever control programs and targeting the intervention. PMID:27827946
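
    The model comparison reported above can be checked directly from the published figures. The sketch below ranks the three models by AIC and computes each model's difference from the best one. The dictionary layout and variable names are illustrative; the numbers are taken from the abstract.

```python
# Fit statistics as reported in the abstract
models = {
    "OLS": {"r2": 0.0741, "loglik": -1819.69, "aic": 3665.38},
    "SLM": {"r2": 0.0786, "loglik": -1819.04, "aic": 3665.08},
    "SEM": {"r2": 0.0743, "loglik": -1819.67, "aic": 3665.36},
}

# Lower AIC indicates the preferred model
best = min(models, key=lambda name: models[name]["aic"])

# AIC difference of each model relative to the best one
delta_aic = {name: round(m["aic"] - models[best]["aic"], 2)
             for name, m in models.items()}

for name in sorted(models, key=lambda k: models[k]["aic"]):
    print(f"{name}: AIC = {models[name]['aic']:.2f}, delta = {delta_aic[name]:.2f}")
```

    The spatial lag model has the lowest AIC, but the gaps to the other two models are well under one AIC unit, so the ranking rests on small differences.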

  11. The Association between Environmental Factors and Scarlet Fever Incidence in Beijing Region: Using GIS and Spatial Regression Models.

    PubMed

    Mahara, Gehendra; Wang, Chao; Yang, Kun; Chen, Sipeng; Guo, Jin; Gao, Qi; Wang, Wei; Wang, Quanyi; Guo, Xiuhua

    2016-11-04

    (1) Background: Evidence regarding scarlet fever and its relationship with meteorological, including air pollution factors, is not very available. This study aimed to examine the relationship between ambient air pollutants and meteorological factors with scarlet fever occurrence in Beijing, China. (2) Methods: A retrospective ecological study was carried out to distinguish the epidemic characteristics of scarlet fever incidence in Beijing districts from 2013 to 2014. Daily incidence and corresponding air pollutant and meteorological data were used to develop the model. Global Moran's I statistic and Anselin's local Moran's I (LISA) were applied to detect the spatial autocorrelation (spatial dependency) and clusters of scarlet fever incidence. The spatial lag model (SLM) and spatial error model (SEM) including ordinary least squares (OLS) models were then applied to probe the association between scarlet fever incidence and meteorological including air pollution factors. (3) Results: Among the 5491 cases, more than half (62%) were male, and more than one-third (37.8%) were female, with the annual average incidence rate 14.64 per 100,000 population. Spatial autocorrelation analysis exhibited the existence of spatial dependence; therefore, we applied spatial regression models. After comparing the values of R-square, log-likelihood and the Akaike information criterion (AIC) among the three models, the OLS model (R² = 0.0741, log likelihood = -1819.69, AIC = 3665.38), SLM (R² = 0.0786, log likelihood = -1819.04, AIC = 3665.08) and SEM (R² = 0.0743, log likelihood = -1819.67, AIC = 3665.36), identified that the spatial lag model (SLM) was best for model fit for the regression model. There was a positive significant association between nitrogen oxide ( p = 0.027), rainfall ( p = 0.036) and sunshine hour ( p = 0.048), while the relative humidity ( p = 0.034) had an adverse association with scarlet fever incidence in SLM. 
(4) Conclusions: Our findings indicated that meteorological, as well as air pollutant factors may increase the incidence of scarlet fever; these findings may help to guide scarlet fever control programs and targeting the intervention.

  12. Mapping the spatial pattern of temperate forest above ground biomass by integrating airborne lidar with Radarsat-2 imagery via geostatistical models

    NASA Astrophysics Data System (ADS)

    Li, Wang; Niu, Zheng; Gao, Shuai; Wang, Cheng

    2014-11-01

    Light Detection and Ranging (LiDAR) and Synthetic Aperture Radar (SAR) are two competitive active remote sensing techniques in forest above ground biomass estimation, which is important for forest management and global climate change study. This study aims to further explore their capabilities in temperate forest above ground biomass (AGB) estimation by emphasizing the spatial auto-correlation of variables obtained from these two remote sensing tools, which is a usually overlooked aspect in remote sensing applications to vegetation studies. Remote sensing variables including airborne LiDAR metrics, backscattering coefficient for different SAR polarizations and their ratio variables for Radarsat-2 imagery were calculated. First, simple linear regression (SLR) models were established between the field-estimated above ground biomass and the remote sensing variables. Pearson's correlation coefficient (R2) was used to find which LiDAR metric showed the most significant correlation with the regression residuals and could be selected as co-variable in regression co-kriging (RCoKrig). Second, regression co-kriging was conducted by choosing the regression residuals as dependent variable and the LiDAR metric (Hmean) with highest R2 as co-variable. Third, above ground biomass over the study area was estimated using the SLR model and the RCoKrig model, respectively. The results for these two models were validated using the same ground points. Results showed that both methods achieved satisfactory prediction accuracy, while regression co-kriging showed the lower estimation error. The results indicate that the regression co-kriging model is feasible and effective in mapping the spatial pattern of AGB in the temperate forest using Radarsat-2 data calibrated by airborne LiDAR metrics.

  13. A spatial-temporal regression model to predict daily outdoor residential PAH concentrations in an epidemiologic study in Fresno, CA

    NASA Astrophysics Data System (ADS)

    Noth, Elizabeth M.; Hammond, S. Katharine; Biging, Gregory S.; Tager, Ira B.

    2011-05-01

    Background: Polycyclic aromatic hydrocarbons (PAHs) are generated as a byproduct of combustion, and are associated with respiratory symptoms and increased risk of asthma attacks. Objectives: To assign daily, outdoor exposures to participants in the Fresno Asthmatic Children's Environment Study (FACES) using land use regression models for the sum of 4-, 5- and 6-ring PAHs (PAH456). Methods: PAH data were collected daily at the EPA Supersite in Fresno, CA from 10/2000 through 2/2007. From 2/2002 to 2/2003, intensive air pollution sampling was conducted at 83 homes of participants in the FACES study. These measurement data were combined with meteorological data, source data, and other spatial variables to form a land use regression model to assign daily exposure at all FACES homes for all years of the study (2001-2008). Results: The model for daily, outdoor residential PAH456 concentrations accounted for 80% of the between-home variability and 18% of the within-home variability. Both temporal and spatial variables were significant in the model. Traffic characteristics and home heating fuel were the main spatial explanatory variables. Conclusions: Because spatial and temporal distributions of PAHs vary on an intra-urban scale, the location of the child's home within the urban setting plays an important role in the level of exposure that each child has to PAHs.

  14. The Role of Auxiliary Variables in Deterministic and Deterministic-Stochastic Spatial Models of Air Temperature in Poland

    NASA Astrophysics Data System (ADS)

    Szymanowski, Mariusz; Kryza, Maciej

    2017-02-01

    Our study examines the role of auxiliary variables in the process of spatial modelling and mapping of climatological elements, with air temperature in Poland used as an example. The multivariable algorithms are the most frequently applied for spatialization of air temperature, and their results in many studies are proved to be better in comparison to those obtained by various one-dimensional techniques. In most of the previous studies, two main strategies were used to perform multidimensional spatial interpolation of air temperature. First, it was accepted that all variables significantly correlated with air temperature should be incorporated into the model. Second, it was assumed that the more spatial variation of air temperature was deterministically explained, the better was the quality of spatial interpolation. The main goal of the paper was to examine both above-mentioned assumptions. The analysis was performed using data from 250 meteorological stations and for 69 air temperature cases aggregated on different levels: from daily means to 10-year annual mean. Two cases were considered for detailed analysis. The set of potential auxiliary variables covered 11 environmental predictors of air temperature. Another purpose of the study was to compare the results of interpolation given by various multivariable methods using the same set of explanatory variables. Two regression models: multiple linear (MLR) and geographically weighted (GWR) method, as well as their extensions to the regression-kriging form, MLRK and GWRK, respectively, were examined. Stepwise regression was used to select variables for the individual models and the cross-validation method was used to validate the results with a special attention paid to statistically significant improvement of the model using the mean absolute error (MAE) criterion. The main results of this study led to rejection of both assumptions considered. 
Usually, including more than two or three of the most significantly correlated auxiliary variables does not improve the quality of the spatial model. The effects of introduction of certain variables into the model were not climatologically justified and were seen on maps as unexpected and undesired artefacts. The results confirm, in accordance with previous studies, that in the case of air temperature distribution, the spatial process is non-stationary; thus, the local GWR model performs better than the global MLR if they are specified using the same set of auxiliary variables. If only GWR residuals are autocorrelated, the geographically weighted regression-kriging (GWRK) model seems to be optimal for air temperature spatial interpolation.

  15. Use of geographically weighted logistic regression to quantify spatial variation in the environmental and sociodemographic drivers of leptospirosis in Fiji: a modelling study.

    PubMed

    Mayfield, Helen J; Lowry, John H; Watson, Conall H; Kama, Mike; Nilles, Eric J; Lau, Colleen L

    2018-05-01

    Leptospirosis is a globally important zoonotic disease, with complex exposure pathways that depend on interactions between human beings, animals, and the environment. Major drivers of outbreaks include flooding, urbanisation, poverty, and agricultural intensification. The intensity of these drivers and their relative importance vary between geographical areas; however, non-spatial regression methods are incapable of capturing the spatial variations. This study aimed to explore the use of geographically weighted logistic regression (GWLR) to provide insights into the ecoepidemiology of human leptospirosis in Fiji. We obtained field data from a cross-sectional community survey done in 2013 in the three main islands of Fiji. A blood sample obtained from each participant (aged 1-90 years) was tested for anti-Leptospira antibodies and household locations were recorded using GPS receivers. We used GWLR to quantify the spatial variation in the relative importance of five environmental and sociodemographic covariates (cattle density, distance to river, poverty rate, residential setting [urban or rural], and maximum rainfall in the wettest month) on leptospirosis transmission in Fiji. We developed two models, one using GWLR and one with standard logistic regression; for each model, the dependent variable was the presence or absence of anti-Leptospira antibodies. GWLR results were compared with results obtained with standard logistic regression, and used to produce a predictive risk map and maps showing the spatial variation in odds ratios (OR) for each covariate. The dataset contained location information for 2046 participants from 1922 households representing 81 communities. The Akaike information criterion value of the GWLR model was 1935·2 compared with 1254·2 for the standard logistic regression model, indicating that the GWLR model was more efficient. Both models produced similar OR for the covariates, but GWLR also detected spatial variation in the effect of each covariate. Maximum rainfall had the least variation across space (median OR 1·30, IQR 1·27-1·35), and distance to river varied the most (1·45, 1·35-2·05). The predictive risk map indicated that the highest risk was in the interior of Viti Levu, and the agricultural region and southern end of Vanua Levu. GWLR provided a valuable method for modelling spatial heterogeneity of covariates for leptospirosis infection and their relative importance over space. Results of GWLR could be used to inform more place-specific interventions, particularly for diseases with strong environmental or sociodemographic drivers of transmission. Funding: WHO, Australian National Health & Medical Research Council, University of Queensland, UK Medical Research Council, Chadwick Trust. Copyright © 2018 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license. All rights reserved.

  16. Can We Use Regression Modeling to Quantify Mean Annual Streamflow at a Global-Scale?

    NASA Astrophysics Data System (ADS)

    Barbarossa, V.; Huijbregts, M. A. J.; Hendriks, J. A.; Beusen, A.; Clavreul, J.; King, H.; Schipper, A.

    2016-12-01

    Quantifying mean annual flow of rivers (MAF) at ungauged sites is essential for a number of applications, including assessments of global water supply, ecosystem integrity and water footprints. MAF can be quantified with spatially explicit process-based models, which might be overly time-consuming and data-intensive for this purpose, or with empirical regression models that predict MAF based on climate and catchment characteristics. Yet, regression models have mostly been developed at a regional scale and the extent to which they can be extrapolated to other regions is not known. In this study, we developed a global-scale regression model for MAF using observations of discharge and catchment characteristics from 1,885 catchments worldwide, ranging from 2 to 10⁶ km² in size. In addition, we compared the performance of the regression model with the predictive ability of the spatially explicit global hydrological model PCR-GLOBWB [van Beek et al., 2011] by comparing results from both models to independent measurements. We obtained a regression model explaining 89% of the variance in MAF based on catchment area, mean annual precipitation and air temperature, average slope and elevation. The regression model performed better than PCR-GLOBWB for the prediction of MAF, as root-mean-square error values were lower (0.29 - 0.38 compared to 0.49 - 0.57) and the modified index of agreement was higher (0.80 - 0.83 compared to 0.72 - 0.75). Our regression model can be applied globally at any point of the river network, provided that the input parameters are within the range of values employed in the calibration of the model. The performance is reduced for water-scarce regions, and further research should focus on improving this aspect of regression-based global hydrological models.

  17. Accounting for and predicting the influence of spatial autocorrelation in water quality modeling

    NASA Astrophysics Data System (ADS)

    Miralha, L.; Kim, D.

    2017-12-01

    Although many studies have attempted to investigate the spatial trends of water quality, more attention is yet to be paid to the consequences of considering and ignoring the spatial autocorrelation (SAC) that exists in water quality parameters. Several studies have mentioned the importance of accounting for SAC in water quality modeling, as well as the differences in outcomes between models that account for and ignore SAC. However, the capacity to predict the magnitude of such differences is still ambiguous. In this study, we hypothesized that SAC inherently possessed by a response variable (i.e., water quality parameter) influences the outcomes of spatial modeling. We evaluated whether the level of inherent SAC is associated with changes in R², Akaike Information Criterion (AIC), and residual SAC (rSAC), after accounting for SAC during the modeling procedure. The main objective was to analyze whether water quality parameters with higher Moran's I values (inherent SAC measure) undergo a greater increase in R² and a greater reduction in both AIC and rSAC. We compared a non-spatial model (OLS) to two spatial regression approaches (spatial lag and error models). Predictor variables were the principal components of topographic (elevation and slope), land cover, and hydrological soil group variables. We acquired these data from federal online sources (e.g. USGS). Ten watersheds were selected, each in a different state of the USA. Results revealed that water quality parameters with higher inherent SAC showed a substantial increase in R² and a decrease in rSAC after performing spatial regressions. However, AIC values did not show significant changes. Overall, the higher the level of inherent SAC in water quality variables, the greater the improvement in model performance. This indicates a linear and direct relationship between the spatial model outcomes (R² and rSAC) and the degree of SAC in each water quality variable. Therefore, our study suggests that the inherent level of SAC in response variables can predict improvements in models even before performing spatial regression approaches. We also recognize the constraints of this research and suggest that further studies focus on better ways of defining spatial neighborhoods, considering the differences among stations set in tributaries near to each other and in upstream areas.
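
    For orientation, the three model types compared in this abstract are usually written as below. These are the standard textbook forms, given here as an assumption; the abstract does not state the authors' exact specification. The symbols are illustrative: W is a row-standardized spatial weight matrix, ρ and λ are spatial autoregressive parameters, and ε is an independent error term.

```latex
\begin{align}
\text{OLS:} \quad & y = X\beta + \varepsilon \\
\text{Spatial lag:} \quad & y = \rho W y + X\beta + \varepsilon \\
\text{Spatial error:} \quad & y = X\beta + u, \qquad u = \lambda W u + \varepsilon
\end{align}
```

    Setting ρ = 0 or λ = 0 reduces either spatial model to OLS, which is why the three can be compared on R², AIC, and residual SAC.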

  18. A hydrologic network supporting spatially referenced regression modeling in the Chesapeake Bay watershed

    USGS Publications Warehouse

    Brakebill, J.W.; Preston, S.D.

    2003-01-01

    The U.S. Geological Survey has developed a methodology for statistically relating nutrient sources and land-surface characteristics to nutrient loads of streams. The methodology is referred to as SPAtially Referenced Regressions On Watershed attributes (SPARROW), and relates measured stream nutrient loads to nutrient sources using nonlinear statistical regression models. A spatially detailed digital hydrologic network of stream reaches, stream-reach characteristics such as mean streamflow, water velocity, reach length, and travel time, and their associated watersheds supports the regression models. This network serves as the primary framework for spatially referencing potential nutrient source information such as atmospheric deposition, septic systems, point-sources, land use, land cover, and agricultural sources and land-surface characteristics such as land use, land cover, average-annual precipitation and temperature, slope, and soil permeability. In the Chesapeake Bay watershed, which covers parts of Delaware, Maryland, Pennsylvania, New York, Virginia, West Virginia, and Washington D.C., SPARROW was used to generate models estimating loads of total nitrogen and total phosphorus representing 1987 and 1992 land-surface conditions. The 1987 models used a hydrologic network derived from an enhanced version of the U.S. Environmental Protection Agency's digital River Reach File, and coarse-resolution Digital Elevation Models (DEMs). A new hydrologic network was created to support the 1992 models by generating stream reaches representing surface-water pathways defined by flow direction and flow accumulation algorithms from higher-resolution DEMs. On a reach-by-reach basis, stream reach characteristics essential to the modeling were transferred to the newly generated pathways or reaches from the enhanced River Reach File used to support the 1987 models. To complete the new network, watersheds for each reach were generated using the direction of surface-water flow derived from the DEMs. This network improves upon existing digital stream data by increasing the level of spatial detail and providing consistency between the reach locations and topography. The hydrologic network also aids in illustrating the spatial patterns of predicted nutrient loads and sources contributed locally to each stream, and the percentages of nutrient load that reach Chesapeake Bay.

  19. Spatial quantile regression using INLA with applications to childhood overweight in Malawi.

    PubMed

    Mtambo, Owen P L; Masangwi, Salule J; Kazembe, Lawrence N M

    2015-04-01

    Analyses of childhood overweight have mainly used mean regression. However, using quantile regression is more appropriate as it provides flexibility to analyse the determinants of overweight corresponding to quantiles of interest. The main objective of this study was to fit a Bayesian additive quantile regression model with structured spatial effects for childhood overweight in Malawi using the 2010 Malawi DHS data. Inference was fully Bayesian using the R-INLA package. The significant determinants of childhood overweight ranged from socio-demographic factors such as type of residence to child and maternal factors such as child age and maternal BMI. We observed significant positive structured spatial effects on childhood overweight in some districts of Malawi. We recommend that childhood malnutrition policy makers consider timely interventions based on the risk factors identified in this paper, including spatial targets of interventions. Copyright © 2015 Elsevier Ltd. All rights reserved.
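
    The contrast between mean and quantile regression drawn above can be illustrated with the check (pinball) loss that quantile regression minimizes. The sketch below is not the Bayesian R-INLA model of the study; it uses made-up numbers and a hypothetical helper, `best_constant`, to show that the constant minimizing the check loss at level tau is a sample quantile, which is far less sensitive to an extreme value than the mean.

```python
import numpy as np

def check_loss(residuals, tau):
    """Sum of the quantile-regression check loss at quantile level tau."""
    r = np.asarray(residuals, dtype=float)
    # Positive residuals are weighted by tau, negative ones by (1 - tau).
    return float(np.sum(r * (tau - (r < 0))))

# Made-up sample with one extreme value (illustrative only).
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
candidates = np.linspace(0.0, 100.0, 10001)  # grid of candidate constants

def best_constant(tau):
    """Grid-search the constant that minimizes the check loss (hypothetical helper)."""
    losses = [check_loss(y - c, tau) for c in candidates]
    return float(candidates[int(np.argmin(losses))])

median_fit = best_constant(0.5)   # close to the sample median, 3
upper_fit = best_constant(0.9)    # driven by the upper tail
mean_fit = float(y.mean())        # 22, pulled up by the extreme value
```

    A full quantile regression replaces the constant with a linear predictor and minimizes the same loss over its coefficients.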

  20. Influences of spatial and temporal variation on fish-habitat relationships defined by regression quantiles

    Treesearch

    Jason B. Dunham; Brian S. Cade; James W. Terrell

    2002-01-01

    We used regression quantiles to model potentially limiting relationships between the standing crop of cutthroat trout Oncorhynchus clarki and measures of stream channel morphology. Regression quantile models indicated that variation in fish density was inversely related to the width:depth ratio of streams but not to stream width or depth alone. The...

  1. The Bayesian group lasso for confounded spatial data

    USGS Publications Warehouse

    Hefley, Trevor J.; Hooten, Mevin B.; Hanks, Ephraim M.; Russell, Robin E.; Walsh, Daniel P.

    2017-01-01

    Generalized linear mixed models for spatial processes are widely used in applied statistics. In many applications of the spatial generalized linear mixed model (SGLMM), the goal is to obtain inference about regression coefficients while achieving optimal predictive ability. When implementing the SGLMM, multicollinearity among covariates and the spatial random effects can make computation challenging and influence inference. We present a Bayesian group lasso prior with a single tuning parameter that can be chosen to optimize predictive ability of the SGLMM and jointly regularize the regression coefficients and spatial random effect. We implement the group lasso SGLMM using efficient Markov chain Monte Carlo (MCMC) algorithms and demonstrate how multicollinearity among covariates and the spatial random effect can be monitored as a derived quantity. To test our method, we compared several parameterizations of the SGLMM using simulated data and two examples from plant ecology and disease ecology. In all examples, problematic levels of multicollinearity occurred and influenced sampling efficiency and inference. We found that the group lasso prior resulted in roughly twice the effective sample size for MCMC samples of regression coefficients and can have higher and less variable predictive accuracy based on out-of-sample data when compared to the standard SGLMM.

  2. Advantages of geographically weighted regression for modeling benthic substrate in two Greater Yellowstone Ecosystem streams

    USGS Publications Warehouse

    Sheehan, Kenneth R.; Strager, Michael P.; Welsh, Stuart A.

    2013-01-01

    Stream habitat assessments are commonplace in fish management, and often involve nonspatial analysis methods for quantifying or predicting habitat, such as ordinary least squares regression (OLS). Spatial relationships, however, often exist among stream habitat variables. For example, water depth, water velocity, and benthic substrate sizes within streams are often spatially correlated and may exhibit spatial nonstationarity or inconsistency in geographic space. Thus, analysis methods should address spatial relationships within habitat datasets. In this study, OLS and a recently developed method, geographically weighted regression (GWR), were used to model benthic substrate from water depth and water velocity data at two stream sites within the Greater Yellowstone Ecosystem. For data collection, each site was represented by a grid of 0.1 m² cells, where actual values of water depth, water velocity, and benthic substrate class were measured for each cell. Accuracies of regressed substrate class data by OLS and GWR methods were calculated by comparing maps, parameter estimates, and determination coefficient r². For analysis of data from both sites, Akaike’s Information Criterion corrected for sample size indicated the best approximating model for the data resulted from GWR and not from OLS. Adjusted r² values also supported GWR as a better approach than OLS for prediction of substrate. This study supports GWR (a spatial analysis approach) over nonspatial OLS methods for prediction of habitat for stream habitat assessments.

  3. Spatial regression analysis of traffic crashes in Seoul.

    PubMed

    Rhee, Kyoung-Ah; Kim, Joon-Ki; Lee, Young-ihn; Ulfarsson, Gudmundur F

    2016-06-01

    Traffic crashes can be spatially correlated events and the analysis of the distribution of traffic crash frequency requires evaluation of parameters that reflect spatial properties and correlation. Typically this spatial aspect of crash data is not used in everyday practice by planning agencies and this contributes to a gap between research and practice. A database of traffic crashes in Seoul, Korea, in 2010 was developed at the traffic analysis zone (TAZ) level with a number of GIS developed spatial variables. Practical spatial models using available software were estimated. The spatial error model was determined to be better than the spatial lag model and an ordinary least squares baseline regression. A geographically weighted regression model provided useful insights about localization of effects. The results found that an increased length of roads with speed limit below 30 km/h and a higher ratio of residents below age of 15 were correlated with lower traffic crash frequency, while a higher ratio of residents who moved to the TAZ, more vehicle-kilometers traveled, and a greater number of access points with speed limit difference between side roads and mainline above 30 km/h all increased the number of traffic crashes. This suggests, for example, that better control or design for merging lower speed roads with higher speed roads is important. A key result is that the length of bus-only center lanes had the largest effect on increasing traffic crashes. This is important as bus-only center lanes with bus stop islands have been increasingly used to improve transit times. Hence the potential negative safety impacts of such systems need to be studied further and mitigated through improved design of pedestrian access to center bus stop islands. Copyright © 2016 Elsevier Ltd. All rights reserved.

  4. Spatial Autocorrelation Approaches to Testing Residuals from Least Squares Regression

    PubMed Central

    Chen, Yanguang

    2016-01-01

    In geo-statistics, the Durbin-Watson test is frequently employed to detect the presence of residual serial correlation from least squares regression analyses. However, the Durbin-Watson statistic is only suitable for ordered time or spatial series. If the variables comprise cross-sectional data coming from spatial random sampling, the test will be ineffectual because the value of Durbin-Watson’s statistic depends on the sequence of data points. This paper develops two new statistics for testing serial correlation of residuals from least squares regression based on spatial samples. By analogy with the new form of Moran’s index, an autocorrelation coefficient is defined with a standardized residual vector and a normalized spatial weight matrix. Then by analogy with the Durbin-Watson statistic, two types of new serial correlation indices are constructed. As a case study, the two newly presented statistics are applied to a spatial sample of 29 regions of China. These results show that the new spatial autocorrelation models can be used to test the serial correlation of residuals from regression analysis. In practice, the new statistics can make up for the deficiencies of the Durbin-Watson test. PMID:26800271
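
    As a point of reference for the idea of testing regression residuals with a spatial weight matrix, the sketch below computes the classic global Moran's I on OLS residuals. It is not one of the two new statistics proposed in the paper, whose formulas are not given in the abstract. The six-site layout, the adjacency weights, and the response values are all made up for illustration.

```python
import numpy as np

def morans_i(values, weights):
    """Classic global Moran's I for a 1-D array and an n x n weight matrix."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()                      # centre the values
    w = np.asarray(weights, dtype=float)
    n = z.size
    return float((n / w.sum()) * (z @ w @ z) / (z @ z))

# Hypothetical layout: six sites on a line; neighbours are adjacent sites.
n = 6
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0

# Made-up predictor and response.
x = np.arange(n, dtype=float)
y = np.array([0.9, 2.3, 2.8, 4.4, 4.9, 6.2])

# Ordinary least squares fit, then Moran's I of the residuals.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta
moran_resid = morans_i(residuals, W)
```

    Unlike the Durbin-Watson statistic, the value depends on the weight matrix W rather than on the order in which the observations are listed.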

  5. Logistic regression accuracy across different spatial and temporal scales for a wide-ranging species, the marbled murrelet

    Treesearch

    Carolyn B. Meyer; Sherri L. Miller; C. John Ralph

    2004-01-01

    The scale at which habitat variables are measured affects the accuracy of resource selection functions in predicting animal use of sites. We used logistic regression models for a wide-ranging species, the marbled murrelet, (Brachyramphus marmoratus) in a large region in California to address how much changing the spatial or temporal scale of...

  6. Landscape-scale consequences of differential tree mortality from catastrophic wind disturbance in the Amazon.

    PubMed

    Rifai, Sami W; Urquiza Muñoz, José D; Negrón-Juárez, Robinson I; Ramírez Arévalo, Fredy R; Tello-Espinoza, Rodil; Vanderwel, Mark C; Lichstein, Jeremy W; Chambers, Jeffrey Q; Bohlman, Stephanie A

    2016-10-01

    Wind disturbance can create large forest blowdowns, which greatly reduce live biomass and add uncertainty to the strength of the Amazon carbon sink. Observational studies from within the central Amazon have quantified blowdown size and estimated total mortality but have not determined which trees are most likely to die from a catastrophic wind disturbance. Also, the impact of spatial dependence upon tree mortality from wind disturbance has seldom been quantified, which is important because wind disturbance often kills clusters of trees due to large treefalls killing surrounding neighbors. We examine (1) the causes of differential mortality between adult trees from a 300-ha blowdown event in the Peruvian region of the northwestern Amazon, (2) how accounting for spatial dependence affects mortality predictions, and (3) how incorporating both differential mortality and spatial dependence affects the landscape-level estimation of necromass produced from the blowdown. Standard regression and spatial regression models were used to estimate how stem diameter, wood density, elevation, and a satellite-derived disturbance metric influenced the probability of tree death from the blowdown event. The model parameters regarding tree characteristics, topography, and spatial autocorrelation of the field data were then used to determine the consequences of non-random mortality for landscape production of necromass through a simulation model. Tree mortality was highly non-random within the blowdown, where tree mortality rates were highest for trees that were large, had low wood density, and were located at high elevation. Of the differential mortality models, the non-spatial models overpredicted necromass, whereas the spatial model slightly underpredicted necromass. When parameterized from the same field data, the spatial regression model with differential mortality estimated only 7.5% more dead trees across the entire blowdown than the random mortality model, yet it estimated 51% greater necromass. We suggest that predictions of forest carbon loss from wind disturbance are sensitive to not only the underlying spatial dependence of observations, but also the biological differences between individuals that promote differential levels of mortality. © 2016 by the Ecological Society of America.

  7. Effects of urban form on the urban heat island effect based on spatial regression model.

    PubMed

    Yin, Chaohui; Yuan, Man; Lu, Youpeng; Huang, Yaping; Liu, Yanfang

    2018-09-01

    The urban heat island (UHI) effect is becoming more of a concern with the accelerated process of urbanization. However, few studies have examined the effect of urban form on land surface temperature (LST) especially from an urban planning perspective. This paper used a spatial regression model to investigate the effects of both land use composition and urban form on LST in Wuhan City, China, based on the regulatory planning management unit. Landsat ETM+ image data was used to estimate LST. Land use composition was calculated by impervious surface area proportion, vegetated area proportion, and water proportion, while urban form indicators included sky view factor (SVF), building density, and floor area ratio (FAR). We first tested for spatial autocorrelation of urban LST, which confirmed that a traditional regression method would be invalid. A spatial error model (SEM) was chosen because its parameters were better than those of a spatial lag model (SLM). The results showed that urban form metrics should be the focus for mitigation efforts of UHI effects. In addition, analysis of the relationship between urban form and UHI effect based on the regulatory planning management unit was helpful for promoting corresponding UHI effect mitigation rules in practice. Finally, the spatial regression model was recommended to be an appropriate method for dealing with problems related to the urban thermal environment. Results suggested that the impact of urbanization on the UHI effect can be mitigated not only by balancing various land use types, but also by optimizing urban form, which is even more effective. This research expands the scientific understanding of effects of urban form on UHI by explicitly analyzing indicators closely related to urban detailed planning at the level of regulatory planning management unit. In addition, it may provide important insights and effective regulation measures for urban planners to mitigate future UHI effects. Copyright © 2018 Elsevier B.V. All rights reserved.

  8. Spatial patterns of March and September streamflow trends in Pacific Northwest Streams, 1958-2008

    USGS Publications Warehouse

    Chang, Heejun; Jung, Il-Won; Steele, Madeline; Gannett, Marshall

    2012-01-01

    Summer streamflow is a vital water resource for municipal and domestic water supplies, irrigation, salmonid habitat, recreation, and water-related ecosystem services in the Pacific Northwest (PNW) in the United States. This study detects significant negative trends in September absolute streamflow in a majority of 68 stream-gauging stations located on unregulated streams in the PNW from 1958 to 2008. The proportion of March streamflow to annual streamflow increases in most stations over 1,000 m elevation, with a baseflow index of less than 50, while absolute March streamflow does not increase in most stations. The declining trends of September absolute streamflow are strongly associated with seven-day low flow, January–March maximum temperature trends, and the size of the basin (19–7,260 km²), while the increasing trends of the fraction of March streamflow are associated with elevation, April 1 snow water equivalent, March precipitation, center timing of streamflow, and October–December minimum temperature trends. Compared with ordinary least squares (OLS) estimated regression models, spatial error regression and geographically weighted regression (GWR) models effectively remove spatial autocorrelation in residuals. The GWR model results show spatial gradients of local R² values with consistently higher local R² values in the northern Cascades. This finding illustrates that different hydrologic landscape factors, such as geology and seasonal distribution of precipitation, also influence streamflow trends in the PNW. In addition, our spatial analysis model results show that considering various geographic factors helps clarify the dynamics of streamflow trends over a large geographical area, supporting a spatial analysis approach over aspatial OLS-estimated regression models for predicting streamflow trends. Results indicate that transitional rain–snow surface water-dominated basins are likely to have reduced summer streamflow under warming scenarios. Consequently, a better understanding of the relationships among summer streamflow, precipitation, snowmelt, elevation, and geology can help water managers predict the response of regional summer streamflow to global warming.

  9. Comparison of multinomial logistic regression and logistic regression: which is more efficient in allocating land use?

    NASA Astrophysics Data System (ADS)

    Lin, Yingzhi; Deng, Xiangzheng; Li, Xing; Ma, Enjun

    2014-12-01

    Spatially explicit simulation of land use change is the basis for estimating the effects of land use and cover change on energy fluxes, ecology and the environment. At the pixel level, logistic regression is one of the most common approaches used in spatially explicit land use allocation models to determine the relationship between land use and its causal factors in driving land use change, and thereby to evaluate land use suitability. However, these models have a drawback in that they do not determine/allocate land use based on the direct relationship between land use change and its driving factors. Consequently, a multinomial logistic regression method was introduced to address this flaw, and thereby, judge the suitability of a type of land use in any given pixel in a case study area of the Jiangxi Province, China. A comparison of the two regression methods indicated that the proportion of correctly allocated pixels using multinomial logistic regression was 92.98%, which was 8.47% higher than that obtained using logistic regression. Paired t-test results also showed that pixels were more clearly distinguished by multinomial logistic regression than by logistic regression. In conclusion, multinomial logistic regression is a more efficient and accurate method for the spatial allocation of land use changes. The application of this method in future land use change studies may improve the accuracy of predicting the effects of land use and cover change on energy fluxes, ecology, and environment.

  10. Using Structured Additive Regression Models to Estimate Risk Factors of Malaria: Analysis of 2010 Malawi Malaria Indicator Survey Data

    PubMed Central

    Chirombo, James; Lowe, Rachel; Kazembe, Lawrence

    2014-01-01

    Background: After years of implementing Roll Back Malaria (RBM) interventions, the changing landscape of malaria in terms of risk factors and spatial pattern has not been fully investigated. This paper uses the 2010 malaria indicator survey data to investigate if known malaria risk factors remain relevant after many years of interventions. Methods: We adopted a structured additive logistic regression model that allowed for spatial correlation, to more realistically estimate malaria risk factors. Our model included child and household level covariates, as well as climatic and environmental factors. Continuous variables were modelled by assuming second order random walk priors, while spatial correlation was specified as a Markov random field prior, with fixed effects assigned diffuse priors. Inference was fully Bayesian resulting in an under five malaria risk map for Malawi. Results: Malaria risk increased with increasing age of the child. With respect to socio-economic factors, the greater the household wealth, the lower the malaria prevalence. A general decline in malaria risk was observed as altitude increased. Minimum temperatures and average total rainfall in the three months preceding the survey did not show a strong association with disease risk. Conclusions: The structured additive regression model offered a flexible extension to standard regression models by enabling simultaneous modelling of possible nonlinear effects of continuous covariates, spatial correlation and heterogeneity, while estimating usual fixed effects of categorical and continuous observed variables. Our results confirmed that malaria epidemiology is a complex interaction of biotic and abiotic factors, both at the individual, household and community level and that risk factors are still relevant many years after extensive implementation of RBM activities. PMID:24991915

  11. Using structured additive regression models to estimate risk factors of malaria: analysis of 2010 Malawi malaria indicator survey data.

    PubMed

    Chirombo, James; Lowe, Rachel; Kazembe, Lawrence

    2014-01-01

    After years of implementing Roll Back Malaria (RBM) interventions, the changing landscape of malaria in terms of risk factors and spatial pattern has not been fully investigated. This paper uses the 2010 malaria indicator survey data to investigate if known malaria risk factors remain relevant after many years of interventions. We adopted a structured additive logistic regression model that allowed for spatial correlation, to more realistically estimate malaria risk factors. Our model included child and household level covariates, as well as climatic and environmental factors. Continuous variables were modelled by assuming second order random walk priors, while spatial correlation was specified as a Markov random field prior, with fixed effects assigned diffuse priors. Inference was fully Bayesian resulting in an under five malaria risk map for Malawi. Malaria risk increased with increasing age of the child. With respect to socio-economic factors, the greater the household wealth, the lower the malaria prevalence. A general decline in malaria risk was observed as altitude increased. Minimum temperatures and average total rainfall in the three months preceding the survey did not show a strong association with disease risk. The structured additive regression model offered a flexible extension to standard regression models by enabling simultaneous modelling of possible nonlinear effects of continuous covariates, spatial correlation and heterogeneity, while estimating usual fixed effects of categorical and continuous observed variables. Our results confirmed that malaria epidemiology is a complex interaction of biotic and abiotic factors, both at the individual, household and community level and that risk factors are still relevant many years after extensive implementation of RBM activities.

  12. Spatial regression methods capture prediction uncertainty in species distribution model projections through time

    Treesearch

    Alan K. Swanson; Solomon Z. Dobrowski; Andrew O. Finley; James H. Thorne; Michael K. Schwartz

    2013-01-01

    The uncertainty associated with species distribution model (SDM) projections is poorly characterized, despite its potential value to decision makers. Error estimates from most modelling techniques have been shown to be biased due to their failure to account for spatial autocorrelation (SAC) of residual error. Generalized linear mixed models (GLMM) have the ability to...

  13. An Analysis of San Diego's Housing Market Using a Geographically Weighted Regression Approach

    NASA Astrophysics Data System (ADS)

    Grant, Christina P.

    San Diego County real estate transaction data was evaluated with a set of linear models calibrated by ordinary least squares and geographically weighted regression (GWR). The goal of the analysis was to determine whether the spatial effects assumed to be in the data are best studied globally with no spatial terms, globally with a fixed effects submarket variable, or locally with GWR. A total of 18,050 single-family residential sales that closed in the six months between April 2014 and September 2014 were used in the analysis. Diagnostic statistics including AICc, R², Global Moran's I, and visual inspection of diagnostic plots and maps indicate superior model performance by GWR as compared to both global regressions.

  14. Optimizing landslide susceptibility zonation: Effects of DEM spatial resolution and slope unit delineation on logistic regression models

    NASA Astrophysics Data System (ADS)

    Schlögel, R.; Marchesini, I.; Alvioli, M.; Reichenbach, P.; Rossi, M.; Malet, J.-P.

    2018-01-01

    We perform landslide susceptibility zonation with slope units using three digital elevation models (DEMs) of varying spatial resolution of the Ubaye Valley (South French Alps). In so doing, we applied a recently developed algorithm automating slope unit delineation, given a number of parameters, in order to optimize simultaneously the partitioning of the terrain and the performance of a logistic regression susceptibility model. The method allowed us to obtain optimal slope units for each available DEM spatial resolution. For each resolution, we studied the susceptibility model performance by analyzing in detail the relevance of the conditioning variables. The analysis is based on landslide morphology data, considering either the whole landslide or only the source area outline as inputs. The procedure allowed us to select the most useful information, in terms of DEM spatial resolution, thematic variables and landslide inventory, in order to obtain the most reliable slope unit-based landslide susceptibility assessment.

  15. Spatial patterns of species richness in New World coral snakes and the metabolic theory of ecology

    NASA Astrophysics Data System (ADS)

    Terribile, Levi Carina; Diniz-Filho, José Alexandre Felizola

    2009-03-01

    The metabolic theory of ecology (MTE) has attracted great interest because it proposes an explanation for species diversity gradients based on temperature-metabolism relationships of organisms. Here we analyse the spatial richness pattern of 73 coral snake species from the New World in the context of MTE. We first analysed the association between ln-transformed richness and environmental variables, including the inverse transformation of annual temperature (1/kT). We used eigenvector-based spatial filtering to remove the residual spatial autocorrelation in the data and geographically weighted regression to account for non-stationarity in data. In a model I regression (OLS), the observed slope between ln-richness and 1/kT was -0.626 (r² = 0.413), but a model II regression generated a much steeper slope (-0.975). When we added additional environmental correlates and the spatial filters in the OLS model, the R² increased to 0.863 and the partial regression coefficient of 1/kT was -0.676. The GWR detected highly significant non-stationarity in the data, and the median of local slopes of ln-richness against 1/kT was -0.38. Our results expose several problems regarding the assumptions needed to test MTE: although the slope of OLS fell within that predicted by the theory and the dataset complied with the assumption of temperature-independence of average body size, the fact that coral snakes consist of a restricted taxonomic group and the non-stationarity of slopes across geographical space makes MTE invalid to explain richness in this case. Also, it is clear that other ecological and historical factors are important drivers of species richness patterns and must be taken into account both in theoretical modeling and data analysis.
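
    The gap between the model I and model II slopes reported above can be checked with simple arithmetic. The sketch assumes the model II fit was a reduced major axis (RMA) regression, which the abstract does not state; under that assumption the RMA slope equals the OLS slope divided by the absolute correlation coefficient.

```python
import math

# Values reported in the abstract.
slope_ols = -0.626   # model I (OLS) slope of ln-richness on 1/kT
r_squared = 0.413    # coefficient of determination of that fit

# Assumption: model II = reduced major axis, so slope_rma = slope_ols / |r|.
r_abs = math.sqrt(r_squared)
slope_rma = slope_ols / r_abs
```

    The result is roughly -0.97, close to the reported model II slope of -0.975, which is consistent with (but does not prove) an RMA fit.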

  16. Improving Global Models of Remotely Sensed Ocean Chlorophyll Content Using Partial Least Squares and Geographically Weighted Regression

    NASA Astrophysics Data System (ADS)

    Gholizadeh, H.; Robeson, S. M.

    2015-12-01

    Empirical models have been widely used to estimate global chlorophyll content from remotely sensed data. Here, we focus on the standard NASA empirical models that use blue-green band ratios. These band ratio ocean color (OC) algorithms are in the form of fourth-order polynomials and the parameters of these polynomials (i.e. coefficients) are estimated from the NASA bio-Optical Marine Algorithm Data set (NOMAD). Most of the points in this data set have been sampled from tropical and temperate regions. However, polynomial coefficients obtained from this data set are used to estimate chlorophyll content in all ocean regions with different properties such as sea-surface temperature, salinity, and downwelling/upwelling patterns. Further, the polynomial terms in these models are highly correlated. In sum, the limitations of these empirical models are as follows: 1) the independent variables within the empirical models, in their current form, are correlated (multicollinear), and 2) current algorithms are global approaches and are based on the spatial stationarity assumption, so they are independent of location. The multicollinearity problem is resolved by using partial least squares (PLS). PLS, which transforms the data into a set of independent components, can be considered as a combined form of principal component regression (PCR) and multiple regression. Geographically weighted regression (GWR) is also used to investigate the validity of the spatial stationarity assumption. GWR solves a regression model over each sample point by using the observations within its neighbourhood. PLS results show that the empirical method underestimates chlorophyll content in high latitudes, including the Southern Ocean region, when compared to PLS (see Figure 1). Cluster analysis of GWR coefficients also shows that the spatial stationarity assumption in empirical models is not likely a valid assumption.
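
    The description of GWR as a regression solved at each sample point from the observations in its neighbourhood can be sketched as a set of locally weighted least squares fits. This is a minimal illustration, not the authors' implementation: the Gaussian distance kernel, the fixed bandwidth, the function name `gwr_coefficients`, and the synthetic data are all assumptions.

```python
import numpy as np

def gwr_coefficients(coords, x, y, bandwidth):
    """Local [intercept, slope...] at every sample point (Gaussian kernel assumed)."""
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), x])   # add an intercept column
    betas = np.empty((len(y), X.shape[1]))
    for i, centre in enumerate(coords):
        dist = np.linalg.norm(coords - centre, axis=1)
        w = np.exp(-0.5 * (dist / bandwidth) ** 2)   # nearer points weigh more
        XtW = X.T * w                                # weights applied per observation
        betas[i] = np.linalg.solve(XtW @ X, XtW @ y)
    return betas

# Synthetic example: the true slope drifts from west (u = 0) to east (u = 10).
rng = np.random.default_rng(0)
u = rng.uniform(0.0, 10.0, size=200)
coords = np.column_stack([u, np.zeros_like(u)])
x = rng.normal(size=200)
y_varying = 2.0 + (1.0 + 0.2 * u) * x   # location-dependent relationship
y_constant = 2.0 + 3.0 * x              # spatially stationary relationship

local_varying = gwr_coefficients(coords, x, y_varying, bandwidth=1.0)
local_constant = gwr_coefficients(coords, x, y_constant, bandwidth=1.0)
```

    When the relationship is stationary, every local fit returns the same coefficients as a global regression; when it is not, the local slopes vary across space, which is the kind of pattern a cluster analysis of GWR coefficients looks for.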

  17. Quantile regression models of animal habitat relationships

    USGS Publications Warehouse

    Cade, Brian S.

    2003-01-01

    Typically, all factors that limit an organism are not measured and included in statistical models used to investigate relationships with their environment. If important unmeasured variables interact multiplicatively with the measured variables, the statistical models often will have heterogeneous response distributions with unequal variances. Quantile regression is an approach for estimating the conditional quantiles of a response variable distribution in the linear model, providing a more complete view of possible causal relationships between variables in ecological processes. Chapter 1 introduces quantile regression and discusses the ordering characteristics, interval nature, sampling variation, weighting, and interpretation of estimates for homogeneous and heterogeneous regression models. Chapter 2 evaluates performance of quantile rankscore tests used for hypothesis testing and constructing confidence intervals for linear quantile regression estimates (0 ≤ τ ≤ 1). A permutation F test maintained better Type I errors than the Chi-square T test for models with smaller n, greater number of parameters p, and more extreme quantiles τ. Both versions of the test required weighting to maintain correct Type I errors when there was heterogeneity under the alternative model. An example application related trout densities to stream channel width:depth. Chapter 3 evaluates a drop in dispersion, F-ratio like permutation test for hypothesis testing and constructing confidence intervals for linear quantile regression estimates (0 ≤ τ ≤ 1). Chapter 4 simulates from a large (N = 10,000) finite population representing grid areas on a landscape to demonstrate various forms of hidden bias that might occur when the effect of a measured habitat variable on some animal was confounded with the effect of another unmeasured variable (spatially and not spatially structured). 
Depending on whether interactions of the measured habitat and unmeasured variable were negative (interference interactions) or positive (facilitation interactions), either upper (τ > 0.5) or lower (τ < 0.5) quantile regression parameters were less biased than mean rate parameters. Sampling (n = 20 - 300) simulations demonstrated that confidence intervals constructed by inverting rankscore tests provided valid coverage of these biased parameters. Quantile regression was used to estimate effects of physical habitat resources on a bivalve mussel (Macomona liliana) in a New Zealand harbor by modeling the spatial trend surface as a cubic polynomial of location coordinates.
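
    As a rough illustration of how quantile regression responds to heterogeneous variances, the sketch below fits straight lines for several quantiles τ by minimising the check (pinball) loss. It is a brute-force toy, not the rankscore or permutation procedures evaluated in the record above: it relies on the fact that a minimiser of the check loss for a two-parameter line can be found among lines through pairs of data points, and the wedge-shaped data set and the helper names are assumptions.

```python
from itertools import combinations

def check_loss(u, tau):
    """Check (pinball) loss for a residual u at quantile tau."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def fit_quantile_line(x, y, tau):
    """Return (intercept, slope) minimising total check loss.

    Brute force over all lines through two data points; only practical
    for small samples. Production code would use a linear-programming solver.
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line is not a valid candidate
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        loss = sum(check_loss(yk - (intercept + slope * xk), tau)
                   for xk, yk in zip(x, y))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]

# Assumed wedge-shaped data: the response is a multiple of x, so the
# spread of y grows with x (unequal variances).
x, y = [], []
for xi in range(1, 11):
    for m in (0.2, 0.4, 0.6, 0.8, 1.0):
        x.append(float(xi))
        y.append(m * xi)

lower = fit_quantile_line(x, y, tau=0.1)
median = fit_quantile_line(x, y, tau=0.5)
upper = fit_quantile_line(x, y, tau=0.9)
print(lower[1], median[1], upper[1])
```

    The fitted slopes differ across quantiles even though a single mean regression would report only one rate of change, which is the motivation for looking at upper or lower quantiles when unmeasured factors limit the response.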

  18. Sparse modeling of spatial environmental variables associated with asthma

    PubMed Central

    Chang, Timothy S.; Gangnon, Ronald E.; Page, C. David; Buckingham, William R.; Tandias, Aman; Cowan, Kelly J.; Tomasallo, Carrie D.; Arndt, Brian G.; Hanrahan, Lawrence P.; Guilbert, Theresa W.

    2014-01-01

    Geographically distributed environmental factors influence the burden of diseases such as asthma. Our objective was to identify sparse environmental variables associated with asthma diagnosis gathered from a large electronic health record (EHR) dataset while controlling for spatial variation. An EHR dataset from the University of Wisconsin’s Family Medicine, Internal Medicine and Pediatrics Departments was obtained for 199,220 patients aged 5–50 years over a three-year period. Each patient’s home address was geocoded to one of 3,456 geographic census block groups. Over one thousand block group variables were obtained from a commercial database. We developed a Sparse Spatial Environmental Analysis (SASEA). Using this method, the environmental variables were first dimensionally reduced with sparse principal component analysis. Logistic thin plate regression spline modeling was then used to identify block group variables associated with asthma from sparse principal components. The addresses of patients from the EHR dataset were distributed throughout the majority of Wisconsin’s geography. Logistic thin plate regression spline modeling captured spatial variation of asthma. Four sparse principal components identified via model selection consisted of food at home, dog ownership, household size, and disposable income variables. In rural areas, dog ownership and renter occupied housing units from significant sparse principal components were associated with asthma. Our main contribution is the incorporation of sparsity in spatial modeling. SASEA sequentially added sparse principal components to Logistic thin plate regression spline modeling. This method allowed association of geographically distributed environmental factors with asthma using EHR and environmental datasets. SASEA can be applied to other diseases with environmental risk factors. PMID:25533437

  19. Sparse modeling of spatial environmental variables associated with asthma.

    PubMed

    Chang, Timothy S; Gangnon, Ronald E; David Page, C; Buckingham, William R; Tandias, Aman; Cowan, Kelly J; Tomasallo, Carrie D; Arndt, Brian G; Hanrahan, Lawrence P; Guilbert, Theresa W

    2015-02-01

    Geographically distributed environmental factors influence the burden of diseases such as asthma. Our objective was to identify sparse environmental variables associated with asthma diagnosis gathered from a large electronic health record (EHR) dataset while controlling for spatial variation. An EHR dataset from the University of Wisconsin's Family Medicine, Internal Medicine and Pediatrics Departments was obtained for 199,220 patients aged 5-50 years over a three-year period. Each patient's home address was geocoded to one of 3456 geographic census block groups. Over one thousand block group variables were obtained from a commercial database. We developed a Sparse Spatial Environmental Analysis (SASEA). Using this method, the environmental variables were first dimensionally reduced with sparse principal component analysis. Logistic thin plate regression spline modeling was then used to identify block group variables associated with asthma from sparse principal components. The addresses of patients from the EHR dataset were distributed throughout the majority of Wisconsin's geography. Logistic thin plate regression spline modeling captured spatial variation of asthma. Four sparse principal components identified via model selection consisted of food at home, dog ownership, household size, and disposable income variables. In rural areas, dog ownership and renter occupied housing units from significant sparse principal components were associated with asthma. Our main contribution is the incorporation of sparsity in spatial modeling. SASEA sequentially added sparse principal components to Logistic thin plate regression spline modeling. This method allowed association of geographically distributed environmental factors with asthma using EHR and environmental datasets. SASEA can be applied to other diseases with environmental risk factors. Copyright © 2014 Elsevier Inc. All rights reserved.

  20. GIS-based spatial regression and prediction of water quality in river networks: A case study in Iowa

    USGS Publications Warehouse

    Yang, X.; Jin, W.

    2010-01-01

    Nonpoint source pollution is the leading cause of the U.S.'s water quality problems. One important component of nonpoint source pollution control is an understanding of what and how watershed-scale conditions influence ambient water quality. This paper investigated the use of spatial regression to evaluate the impacts of watershed characteristics on stream NO3NO2-N concentration in the Cedar River Watershed, Iowa. An Arc Hydro geodatabase was constructed to organize various datasets on the watershed. Spatial regression models were developed to evaluate the impacts of watershed characteristics on stream NO3NO2-N concentration and predict NO3NO2-N concentration at unmonitored locations. Unlike the traditional ordinary least squares (OLS) method, the spatial regression method incorporates the potential spatial correlation among the observations in its coefficient estimation. Study results show that NO3NO2-N observations in the Cedar River Watershed are spatially correlated, and by ignoring the spatial correlation, the OLS method tends to over-estimate the impacts of watershed characteristics on stream NO3NO2-N concentration. In conjunction with kriging, the spatial regression method not only makes better stream NO3NO2-N concentration predictions than the OLS method, but also gives estimates of the uncertainty of the predictions, which provides useful information for optimizing the design of stream monitoring networks. It is a promising tool for better managing and controlling nonpoint source pollution. © 2010 Elsevier Ltd.

  1. Improving the Spatial Prediction of Soil Organic Carbon Stocks in a Complex Tropical Mountain Landscape by Methodological Specifications in Machine Learning Approaches.

    PubMed

    Ließ, Mareike; Schmidt, Johannes; Glaser, Bruno

    2016-01-01

    Tropical forests are significant carbon sinks and their soils' carbon storage potential is immense. However, little is known about the soil organic carbon (SOC) stocks of tropical mountain areas whose complex soil-landscape and difficult accessibility pose a challenge to spatial analysis. The choice of methodology for spatial prediction is of high importance to improve the expected poor model results in case of low predictor-response correlations. Four aspects were considered to improve model performance in predicting SOC stocks of the organic layer of a tropical mountain forest landscape: different spatial predictor settings, predictor selection strategies, various machine learning algorithms and model tuning. Five machine learning algorithms (random forests, artificial neural networks, multivariate adaptive regression splines, boosted regression trees and support vector machines) were trained and tuned to predict SOC stocks from predictors derived from a digital elevation model and satellite image. Topographical predictors were calculated with a GIS search radius of 45 to 615 m. Finally, three predictor selection strategies were applied to the total set of 236 predictors. All machine learning algorithms, including the model tuning and predictor selection, were compared via five repetitions of a tenfold cross-validation. The boosted regression tree algorithm resulted in the overall best model. SOC stocks ranged from 0.2 to 17.7 kg m⁻², displaying a huge variability with diffuse insolation and curvatures of different scale guiding the spatial pattern. Predictor selection and model tuning improved the models' predictive performance in all five machine learning algorithms. The rather low number of selected predictors favours forward compared to backward selection procedures. Choosing predictors by their individual performance was outperformed by the two procedures that accounted for predictor interaction.
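
    The resampling scheme named in this record, five repetitions of a tenfold cross-validation, can be sketched as an index generator. This is only an illustration of the splitting logic under assumed settings: the sample size of 53, the seed and the function name `repeated_kfold` are hypothetical, and no model fitting is shown.

```python
import random

def repeated_kfold(n_samples, n_folds=10, n_repeats=5, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for rep in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)  # a fresh random partition for every repetition
        for fold in range(n_folds):
            test_idx = idx[fold::n_folds]  # every n_folds-th shuffled index
            held_out = set(test_idx)
            train_idx = [i for i in idx if i not in held_out]
            yield rep, fold, train_idx, test_idx

# Hypothetical sample size; each split would be used to fit and score a model.
splits = list(repeated_kfold(n_samples=53))
print(len(splits))
```

    Within each repetition every sample is held out exactly once, so tuning and predictor selection can be compared on the same 50 train/test splits across algorithms.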

  2. Spatial prediction of landslides using a hybrid machine learning approach based on Random Subspace and Classification and Regression Trees

    NASA Astrophysics Data System (ADS)

    Pham, Binh Thai; Prakash, Indra; Tien Bui, Dieu

    2018-02-01

    A hybrid machine learning approach of Random Subspace (RSS) and Classification And Regression Trees (CART) is proposed to develop a model named RSSCART for spatial prediction of landslides. This model is a combination of the RSS method, which is known as an efficient ensemble technique, and CART, which is a state-of-the-art classifier. The Luc Yen district of Yen Bai province, a prominent landslide-prone area of Viet Nam, was selected for the model development. Performance of the RSSCART model was evaluated through the Receiver Operating Characteristic (ROC) curve, statistical analysis methods, and the Chi Square test. Results were compared with other benchmark landslide models, namely Support Vector Machines (SVM), single CART, Naïve Bayes Trees (NBT), and Logistic Regression (LR). In developing the model, ten important landslide affecting factors related to geomorphology, geology and geo-environment were considered, namely slope angles, elevation, slope aspect, curvature, lithology, distance to faults, distance to rivers, distance to roads, and rainfall. Performance of the RSSCART model (AUC = 0.841) is the best compared with other popular landslide models, namely SVM (0.835), single CART (0.822), NBT (0.821), and LR (0.723). These results indicate that the RSSCART model is a promising method for spatial landslide prediction.
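
    The AUC values used to rank the models above summarise the ROC curve in a single number. A minimal sketch of one standard way to compute it, as the share of (landslide, non-landslide) pairs in which the landslide location receives the higher score, is given below. The labels and scores are invented for illustration and are not taken from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical susceptibility scores: 1 = landslide, 0 = no landslide.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(auc(labels, scores))
```

    Here eight of the nine positive-negative pairs are ordered correctly, so the AUC is 8/9. A value of 0.5 would correspond to random ranking and 1.0 to perfect separation.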

  3. Evaluation of Land use Regression Models for NO2 in El Paso, Texas, USA

    EPA Science Inventory

    Developing suitable exposure estimates for air pollution health studies is problematic due to spatial and temporal variation in concentrations and often limited monitoring data. Though land use regression models (LURs) are often used for this purpose, their applicability to later...

  4. Spatially resolved regression analysis of pre-treatment FDG, FLT and Cu-ATSM PET from post-treatment FDG PET: an exploratory study

    PubMed Central

    Bowen, Stephen R; Chappell, Richard J; Bentzen, Søren M; Deveau, Michael A; Forrest, Lisa J; Jeraj, Robert

    2012-01-01

    Purpose: To quantify associations between pre-radiotherapy and post-radiotherapy PET parameters via spatially resolved regression. Materials and methods: Ten canine sinonasal cancer patients underwent PET/CT scans of [18F]FDG (FDGpre), [18F]FLT (FLTpre), and [61Cu]Cu-ATSM (Cu-ATSMpre). Following radiotherapy regimens of 50 Gy in 10 fractions, veterinary patients underwent FDG PET/CT scans at three months (FDGpost). Regression of standardized uptake values in baseline FDGpre, FLTpre and Cu-ATSMpre tumour voxels to those in FDGpost images was performed for linear, log-linear, generalized-linear and mixed-fit linear models. Goodness-of-fit in regression coefficients was assessed by R². Hypothesis testing of coefficients over the patient population was performed. Results: Multivariate linear model fits of FDGpre to FDGpost were significantly positive over the population (FDGpost~0.17 FDGpre, p=0.03), and classified slopes of RECIST non-responders and responders to be different (0.37 vs. 0.07, p=0.01). Generalized-linear model fits related FDGpre to FDGpost by a linear power law (FDGpost~FDGpre^0.93, p<0.001). Univariate mixture model fits of FDGpre improved R² from 0.17 to 0.52. Neither baseline FLT PET nor Cu-ATSM PET uptake contributed statistically significant multivariate regression coefficients. Conclusions: Spatially resolved regression analysis indicates that pre-treatment FDG PET uptake is most strongly associated with three-month post-treatment FDG PET uptake in this patient population, though associations are histopathology-dependent. PMID:22682748

  5. Estimates of nitrate loads and yields from groundwater to streams in the Chesapeake Bay watershed based on land use and geology

    USGS Publications Warehouse

    Terziotti, Silvia; Capel, Paul D.; Tesoriero, Anthony J.; Hopple, Jessica A.; Kronholm, Scott C.

    2018-03-07

    The water quality of the Chesapeake Bay may be adversely affected by dissolved nitrate carried in groundwater discharge to streams. To estimate the concentrations, loads, and yields of nitrate from groundwater to streams for the Chesapeake Bay watershed, a regression model was developed based on measured nitrate concentrations from 156 small streams with watersheds less than 500 square miles (mi²) at baseflow. The regression model has three predictive variables: geologic unit, percent developed land, and percent agricultural land. Comparisons of estimated and actual values within geologic units were closely matched. The coefficient of determination (R²) for the model was 0.6906. The model was used to calculate baseflow nitrate concentrations at over 83,000 National Hydrography Dataset Plus Version 2 catchments and aggregated to 1,966 total 12-digit hydrologic units in the Chesapeake Bay watershed. The modeled output geospatial data layers provided estimated annual loads and yields of nitrate from groundwater into streams. The spatial distribution of annual nitrate yields from groundwater estimated by this method was compared to the total watershed yields of all sources estimated from a Chesapeake Bay SPAtially Referenced Regressions On Watershed attributes (SPARROW) water-quality model. The comparison showed similar spatial patterns. The regression model for groundwater contribution had similar but lower yields, suggesting that groundwater is an important source of nitrogen for streams in the Chesapeake Bay watershed.

  6. Evaluation of land use regression models (LURs) for nitrogen dioxide and benzene in four U.S. Cities.

    EPA Science Inventory

    Spatial analysis studies have included application of land use regression models (LURs) for health and air quality assessments. Recent LUR studies have collected nitrogen dioxide (NO2) and volatile organic compounds (VOCs) using passive samplers at urban air monitoring networks ...

  7. Improving the Spatial Prediction of Soil Organic Carbon Stocks in a Complex Tropical Mountain Landscape by Methodological Specifications in Machine Learning Approaches

    PubMed Central

    Ließ, Mareike; Schmidt, Johannes; Glaser, Bruno

    2016-01-01

    Tropical forests are significant carbon sinks and their soils’ carbon storage potential is immense. However, little is known about the soil organic carbon (SOC) stocks of tropical mountain areas whose complex soil-landscape and difficult accessibility pose a challenge to spatial analysis. The choice of methodology for spatial prediction is of high importance to improve the expected poor model results in case of low predictor-response correlations. Four aspects were considered to improve model performance in predicting SOC stocks of the organic layer of a tropical mountain forest landscape: different spatial predictor settings, predictor selection strategies, various machine learning algorithms and model tuning. Five machine learning algorithms (random forests, artificial neural networks, multivariate adaptive regression splines, boosted regression trees and support vector machines) were trained and tuned to predict SOC stocks from predictors derived from a digital elevation model and satellite image. Topographical predictors were calculated with a GIS search radius of 45 to 615 m. Finally, three predictor selection strategies were applied to the total set of 236 predictors. All machine learning algorithms, including the model tuning and predictor selection, were compared via five repetitions of a tenfold cross-validation. The boosted regression tree algorithm resulted in the overall best model. SOC stocks ranged from 0.2 to 17.7 kg m⁻², displaying a huge variability with diffuse insolation and curvatures of different scale guiding the spatial pattern. Predictor selection and model tuning improved the models’ predictive performance in all five machine learning algorithms. The rather low number of selected predictors favours forward compared to backward selection procedures. Choosing predictors by their individual performance was outperformed by the two procedures that accounted for predictor interaction. PMID:27128736

  8. A spatial analysis of the determinants of pneumonia and influenza hospitalizations in Ontario (1992-2001).

    PubMed

    Crighton, Eric J; Elliott, Susan J; Moineddin, Rahim; Kanaroglou, Pavlos; Upshur, Ross

    2007-04-01

    Previous research on the determinants of pneumonia and influenza has focused primarily on the role of individual level biological and behavioural risk factors resulting in partial explanations and largely curative approaches to reducing the disease burden. This study examines the geographic patterns of pneumonia and influenza hospitalizations and the role that broad ecologic-level factors may have in determining them. We conducted a county level, retrospective, ecologic study of pneumonia and influenza hospitalizations in the province of Ontario, Canada, between 1992 and 2001 (N=241,803), controlling for spatial dependence in the data. Non-spatial and spatial regression models were estimated using a range of environmental, social, economic, behavioural, and health care predictors. Results revealed low education to be positively associated with hospitalization rates over all age groups and both genders. The Aboriginal population variable was also positively associated in most models except for the 65+-year age group. Behavioural factors (daily smoking and heavy drinking), environmental factors (passive smoking, poor housing, temperature), and health care factors (influenza vaccination) were all significantly associated in different age and gender-specific models. The use of spatial error regression models allowed for unbiased estimation of regression parameters and their significance levels. These findings demonstrate the importance of broad age and gender-specific population-level factors in determining pneumonia and influenza hospitalizations, and illustrate the need for place and population-specific policies that take these factors into consideration.

  9. Mapping and spatial-temporal modeling of Bromus tectorum invasion in central Utah

    NASA Astrophysics Data System (ADS)

    Jin, Zhenyu

    Cheatgrass, or Downy Brome, is an exotic winter annual weed native to the Mediterranean region. Since its introduction to the U.S., it has become a significant weed and aggressive invader of sagebrush, pinion-juniper, and other shrub communities, where it can completely out-compete native grasses and shrubs. In this research, remotely sensed data combined with field collected data are used to investigate the distribution of the cheatgrass in Central Utah, to characterize the trend of the NDVI time-series of cheatgrass, and to construct a spatially explicit population-based model to simulate the spatial-temporal dynamics of the cheatgrass. This research proposes a method for mapping the canopy closure of invasive species using remotely sensed data acquired at different dates. Different invasive species have their own distinguished phenologies and the satellite images in different dates could be used to capture the phenology. The results of cheatgrass abundance prediction have a good fit with the field data for both linear regression and regression tree models, although the regression tree model has better performance than the linear regression model. To characterize the trend of NDVI time-series of cheatgrass, a novel smoothing algorithm named RMMEH is presented in this research to overcome some drawbacks of many other algorithms. By comparing the performance of RMMEH in smoothing a 16-day composite of the MODIS NDVI time-series with that of two other methods, which are the 4253EH, twice and the MVI, we have found that RMMEH not only keeps the original valid NDVI points, but also effectively removes the spurious spikes. The reconstructed NDVI time-series of different land covers are of higher quality and have smoother temporal trend. To simulate the spatial-temporal dynamics of cheatgrass, a spatially explicit population-based model is built applying remotely sensed data. 
The comparison between the model output and the ground truth of cheatgrass closure demonstrates that the model could successfully simulate the spatial-temporal dynamics of cheatgrass in a simple cheatgrass-dominant environment. The simulation of the functional response of different prescribed fire rates also shows that this model is helpful to answer management questions like, "What are the effects of prescribed fire to invasive species?" It demonstrates that a medium fire rate of 10% can successfully prevent cheatgrass invasion.

  10. Preliminary results of spatial modeling of selected forest health variables in Georgia

    Treesearch

    Brock Stewart; Chris J. Cieszewski

    2009-01-01

    Variables relating to forest health monitoring, such as mortality, are difficult to predict and model. We present here the results of fitting various spatial regression models to these variables. We interpolate plot-level values compiled from the Forest Inventory and Analysis National Information Management System (FIA-NIMS) data that are related to forest health....

  11. Space, race, and poverty: Spatial inequalities in walkable neighborhood amenities?

    PubMed Central

    Aldstadt, Jared; Whalen, John; White, Kellee; Castro, Marcia C.; Williams, David R.

    2017-01-01

    BACKGROUND Multiple and varied benefits have been suggested for increased neighborhood walkability. However, spatial inequalities in neighborhood walkability likely exist and may be attributable, in part, to residential segregation. OBJECTIVE Utilizing a spatial demographic perspective, we evaluated potential spatial inequalities in walkable neighborhood amenities across census tracts in Boston, MA (US). METHODS The independent variables included minority racial/ethnic population percentages and percent of families in poverty. Walkable neighborhood amenities were assessed with a composite measure. Spatial autocorrelation in key study variables was first calculated with the Global Moran’s I statistic. Then, Spearman correlations between neighborhood socio-demographic characteristics and walkable neighborhood amenities were calculated as well as Spearman correlations accounting for spatial autocorrelation. We fit ordinary least squares (OLS) regression and spatial autoregressive models, when appropriate, as a final step. RESULTS Significant positive spatial autocorrelation was found in neighborhood socio-demographic characteristics (e.g. census tract percent Black), but not walkable neighborhood amenities or in the OLS regression residuals. Spearman correlations between neighborhood socio-demographic characteristics and walkable neighborhood amenities were not statistically significant, nor were neighborhood socio-demographic characteristics significantly associated with walkable neighborhood amenities in OLS regression models. CONCLUSIONS Our results suggest that there is residential segregation in Boston and that spatial inequalities do not necessarily show up using a composite measure.
COMMENTS Future research in other geographic areas (including international contexts) and using different definitions of neighborhoods (including small-area definitions) should evaluate if spatial inequalities are found using composite measures but also should use measures of specific neighborhood amenities. PMID:29046612
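
    The Global Moran's I statistic mentioned in the methods can be written out in a few lines. The sketch below uses the usual definition, I = (n / S0) · Σij wij zi zj / Σi zi², where z are deviations from the mean and S0 is the sum of all weights. The six-unit chain of neighbours and both value series are assumptions chosen so the result is easy to check by hand. They are not census tract data, and no significance test is included.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weights matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total weight
    cross = sum(weights[i][j] * z[i] * z[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(zi * zi for zi in z)

def chain_weights(n):
    """Binary contiguity weights for n units arranged in a line (assumed layout)."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = w[i + 1][i] = 1.0
    return w

w = chain_weights(6)
clustered = morans_i([1, 2, 3, 4, 5, 6], w)    # similar values sit together
alternating = morans_i([1, 6, 1, 6, 1, 6], w)  # neighbours are dissimilar
print(clustered, alternating)
```

    The smoothly increasing series gives a positive value (0.6) and the alternating series a negative one (-1.0), matching the reading of positive spatial autocorrelation as similar values clustering in neighbouring units.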

  12. The relative roles of environment, history and local dispersal in controlling the distributions of common tree and shrub species in a tropical forest landscape, Panama

    USGS Publications Warehouse

    Svenning, J.-C.; Engelbrecht, B.M.J.; Kinner, D.A.; Kursar, T.A.; Stallard, R.F.; Wright, S.J.

    2006-01-01

    We used regression models and information-theoretic model selection to assess the relative importance of environment, local dispersal and historical contingency as controls of the distributions of 26 common plant species in tropical forest on Barro Colorado Island (BCI), Panama. We censused eighty-eight 0.09-ha plots scattered across the landscape. Environmental control, local dispersal and historical contingency were represented by environmental variables (soil moisture, slope, soil type, distance to shore, old-forest presence), a spatial autoregressive parameter (ρ), and four spatial trend variables, respectively. We built regression models, representing all combinations of the three hypotheses, for each species. The probability that the best model included the environmental variables, spatial trend variables and ρ averaged 33%, 64% and 50% across the study species, respectively. The environmental variables, spatial trend variables, ρ, and a simple intercept model received the strongest support for 4, 15, 5 and 2 species, respectively. Comparing the model results to information on species traits showed that species with strong spatial trends produced few and heavy diaspores, while species with strong soil moisture relationships were particularly drought-sensitive. In conclusion, history and local dispersal appeared to be the dominant controls of the distributions of common plant species on BCI. Copyright © 2006 Cambridge University Press.

  13. Local spatial variations analysis of smear-positive tuberculosis in Xinjiang using Geographically Weighted Regression model.

    PubMed

    Wei, Wang; Yuan-Yuan, Jin; Ci, Yan; Ahan, Alayi; Ming-Qin, Cao

    2016-10-06

    The spatial interplay between socioeconomic factors and tuberculosis (TB) cases contributes to the understanding of regional tuberculosis burdens. Historically, local Poisson Geographically Weighted Regression (GWR) has allowed for the identification of the geographic disparities of TB cases and their relevant socioeconomic determinants, thereby forecasting local regression coefficients for the relations between the incidence of TB and its socioeconomic determinants. Therefore, the aims of this study were to: (1) identify the socioeconomic determinants of geographic disparities of smear-positive TB in Xinjiang, China; (2) confirm whether the incidence of smear-positive TB and its associated socioeconomic determinants demonstrate spatial variability; and (3) compare the performance of two main models: the Ordinary Least Squares (OLS) regression model and the local GWR model. Reported smear-positive TB cases in Xinjiang were extracted from the TB surveillance system database during 2004-2010. The average number of smear-positive TB cases notified in Xinjiang was collected from 98 districts/counties. The population density (POPden), proportion of minorities (PROmin), number of infectious disease network reporting agencies (NUMagen), proportion of agricultural population (PROagr), and per capita annual gross domestic product (per capita GDP) were gathered from the Xinjiang Statistical Yearbook covering a period from 2004 to 2010. The OLS model and GWR model were then utilized to investigate socioeconomic determinants of smear-positive TB cases. GeoDa 1.6.7 and GWR 4.0 software were used for data analysis. Our findings indicate that the relations between the average number of smear-positive TB cases notified in Xinjiang and their socioeconomic determinants (POPden, PROmin, NUMagen, PROagr, and per capita GDP) were significantly spatially non-stationary.
    This means that in some areas more smear-positive TB cases could be related to higher socioeconomic determinant regression coefficients, but in other areas more smear-positive TB cases were associated with lower socioeconomic determinant regression coefficients. We also found that the GWR model could better differentiate geographically the relationships between the average number of smear-positive TB cases and their socioeconomic determinants, and that it fit the dataset better (adjusted R² = 0.912, AICc = 1107.22) than the OLS model (adjusted R² = 0.768, AICc = 1196.74). POPden, PROmin, NUMagen, PROagr, and per capita GDP are socioeconomic determinants of smear-positive TB cases. Comprehending the spatial heterogeneity of POPden, PROmin, NUMagen, PROagr, per capita GDP, and smear-positive TB cases could provide valuable information for TB prevention and control strategies.

  14. Modeling Fire Occurrence at the City Scale: A Comparison between Geographically Weighted Regression and Global Linear Regression.

    PubMed

    Song, Chao; Kwan, Mei-Po; Zhu, Jiping

    2017-04-08

    An increasing number of fires are occurring with the rapid development of cities, resulting in increased risk for human beings and the environment. This study compares geographically weighted regression-based models, including geographically weighted regression (GWR) and geographically and temporally weighted regression (GTWR), which integrates spatial and temporal effects and global linear regression models (LM) for modeling fire risk at the city scale. The results show that the road density and the spatial distribution of enterprises have the strongest influences on fire risk, which implies that we should focus on areas where roads and enterprises are densely clustered. In addition, locations with a large number of enterprises have fewer fire ignition records, probably because of strict management and prevention measures. A changing number of significant variables across space indicate that heterogeneity mainly exists in the northern and eastern rural and suburban areas of Hefei city, where human-related facilities or road construction are only clustered in the city sub-centers. GTWR can capture small changes in the spatiotemporal heterogeneity of the variables while GWR and LM cannot. An approach that integrates space and time enables us to better understand the dynamic changes in fire risk. Thus governments can use the results to manage fire safety at the city scale.

  15. Modeling Fire Occurrence at the City Scale: A Comparison between Geographically Weighted Regression and Global Linear Regression

    PubMed Central

    Song, Chao; Kwan, Mei-Po; Zhu, Jiping

    2017-01-01

    An increasing number of fires are occurring with the rapid development of cities, resulting in increased risk for human beings and the environment. This study compares geographically weighted regression-based models, including geographically weighted regression (GWR) and geographically and temporally weighted regression (GTWR), which integrates spatial and temporal effects, with global linear regression models (LM) for modeling fire risk at the city scale. The results show that the road density and the spatial distribution of enterprises have the strongest influences on fire risk, which implies that we should focus on areas where roads and enterprises are densely clustered. In addition, locations with a large number of enterprises have fewer fire ignition records, probably because of strict management and prevention measures. A changing number of significant variables across space indicates that heterogeneity mainly exists in the northern and eastern rural and suburban areas of Hefei city, where human-related facilities or road construction are only clustered in the city sub-centers. GTWR can capture small changes in the spatiotemporal heterogeneity of the variables while GWR and LM cannot. An approach that integrates space and time enables us to better understand the dynamic changes in fire risk. Thus governments can use the results to manage fire safety at the city scale. PMID:28397745

  16. Environmental, Spatial, and Sociodemographic Factors Associated with Nonfatal Injuries in Indonesia.

    PubMed

    Irianti, Sri; Prasetyoputra, Puguh

    2017-01-01

    Background. The determinants of injuries and their reoccurrence in Indonesia are not well understood, despite their importance in the prevention of injuries. Therefore, this study seeks to investigate the environmental, spatial, and sociodemographic factors associated with the reoccurrence of injuries among Indonesian people. Methods. Data from the 2013 round of the Indonesia Baseline Health Research (IBHR 2013) were analysed using a two-part hurdle regression model. A logit regression model was chosen for the zero-hurdle part, while a zero-truncated negative binomial regression model was selected for the counts part. Odds ratio (OR) and incidence rate ratio (IRR) were the measures of association, respectively. Results. The results suggest that living in a household with a distant drinking water source, residing in slum areas, residing in Eastern Indonesia, having low educational attainment, being a man, and being poorer are positively related to the likelihood of experiencing injury. Moreover, being a farmer or fisherman, having low educational attainment, and being a man are positively associated with the frequency of injuries. Conclusion. This study would be useful to prioritise injury prevention programs in Indonesia based on the environmental, spatial, and sociodemographic characteristics.

  17. Approximating prediction uncertainty for random forest regression models

    Treesearch

    John W. Coulston; Christine E. Blinn; Valerie A. Thomas; Randolph H. Wynne

    2016-01-01

    Machine learning approaches such as random forest have increased for the spatial modeling and mapping of continuous variables. Random forest is a non-parametric ensemble approach, and unlike traditional regression approaches there is no direct quantification of prediction error. Understanding prediction uncertainty is important when using model-based continuous maps as...

  18. Modeling stream network-scale variation in Coho salmon overwinter survival and smolt size

    Treesearch

    Joseph L. Ebersole; Mike E. Colvin; Parker J. Wigington; Scott G. Leibowitz; Joan P. Baker; Jana E. Compton; Bruce A. Miller; Michael A. Carins; Bruce P. Hansen; Henry R. La Vigne

    2009-01-01

    We used multiple regression and hierarchical mixed-effects models to examine spatial patterns of overwinter survival and size at smolting in juvenile coho salmon Oncorhynchus kisutch in relation to habitat attributes across an extensive stream network in southwestern Oregon over 3 years. Contributing basin area explained the majority of spatial...

  19. Semiparametric regression during 2003–2007

    PubMed Central

    Ruppert, David; Wand, M.P.; Carroll, Raymond J.

    2010-01-01

    Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application. PMID:20305800

  20. Spatial analysis of instream nitrogen loads and factors controlling nitrogen delivery to streams in the southeastern United States using spatially referenced regression on watershed attributes (SPARROW) and regional classification frameworks

    USGS Publications Warehouse

    Hoos, A.B.; McMahon, G.

    2009-01-01

    Understanding how nitrogen transport across the landscape varies with landscape characteristics is important for developing sound nitrogen management policies. We used a spatially referenced regression analysis (SPARROW) to examine landscape characteristics influencing delivery of nitrogen from sources in a watershed to stream channels. Modelled landscape delivery ratio varies widely (by a factor of 4) among watersheds in the southeastern United States - higher in the western part (Tennessee, Alabama, and Mississippi) than in the eastern part, and the average value for the region is lower compared to other parts of the nation. When we model landscape delivery ratio as a continuous function of local-scale landscape characteristics, we estimate a spatial pattern that varies as a function of soil and climate characteristics but exhibits spatial structure in residuals (observed load minus predicted load). The spatial pattern of modelled landscape delivery ratio and the spatial pattern of residuals coincide spatially with Level III ecoregions and also with hydrologic landscape regions. Subsequent incorporation into the model of these frameworks as regional scale variables improves estimation of landscape delivery ratio, evidenced by reduced spatial bias in residuals, and suggests that cross-scale processes affect nitrogen attenuation on the landscape. The model-fitted coefficient values are logically consistent with the hypothesis that broad-scale classifications of hydrologic response help to explain differential rates of nitrogen attenuation, controlling for local-scale landscape characteristics. Negative model coefficients for hydrologic landscape regions where the primary flow path is shallow ground water suggest that a lower fraction of nitrogen mass will be delivered to streams; this relation is reversed for regions where the primary flow path is overland flow.

  1. Spatial analysis of instream nitrogen loads and factors controlling nitrogen delivery to streams in the southeastern United States using spatially referenced regression on watershed attributes (SPARROW) and regional classification frameworks

    USGS Publications Warehouse

    Hoos, Anne B.; McMahon, Gerard

    2009-01-01

    Understanding how nitrogen transport across the landscape varies with landscape characteristics is important for developing sound nitrogen management policies. We used a spatially referenced regression analysis (SPARROW) to examine landscape characteristics influencing delivery of nitrogen from sources in a watershed to stream channels. Modelled landscape delivery ratio varies widely (by a factor of 4) among watersheds in the southeastern United States—higher in the western part (Tennessee, Alabama, and Mississippi) than in the eastern part, and the average value for the region is lower compared to other parts of the nation. When we model landscape delivery ratio as a continuous function of local-scale landscape characteristics, we estimate a spatial pattern that varies as a function of soil and climate characteristics but exhibits spatial structure in residuals (observed load minus predicted load). The spatial pattern of modelled landscape delivery ratio and the spatial pattern of residuals coincide spatially with Level III ecoregions and also with hydrologic landscape regions. Subsequent incorporation into the model of these frameworks as regional scale variables improves estimation of landscape delivery ratio, evidenced by reduced spatial bias in residuals, and suggests that cross-scale processes affect nitrogen attenuation on the landscape. The model-fitted coefficient values are logically consistent with the hypothesis that broad-scale classifications of hydrologic response help to explain differential rates of nitrogen attenuation, controlling for local-scale landscape characteristics. Negative model coefficients for hydrologic landscape regions where the primary flow path is shallow ground water suggest that a lower fraction of nitrogen mass will be delivered to streams; this relation is reversed for regions where the primary flow path is overland flow.

  2. A land use regression model for ambient ultrafine particles in Montreal, Canada: A comparison of linear regression and a machine learning approach.

    PubMed

    Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne

    2016-04-01

    Existing evidence suggests that ambient ultrafine particles (UFPs) (<0.1µm) may contribute to acute cardiorespiratory morbidity. However, few studies have examined the long-term health effects of these pollutants owing in part to a need for exposure surfaces that can be applied in large population-based studies. To address this need, we developed a land use regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R(2)=0.58 vs. 0.55) or a cross-validation procedure (R(2)=0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.

  3. Gbm.auto: A software tool to simplify spatial modelling and Marine Protected Area planning

    PubMed Central

    Officer, Rick; Clarke, Maurice; Reid, David G.; Brophy, Deirdre

    2017-01-01

    Boosted Regression Trees: excellent for data-poor spatial management but hard to use. Marine resource managers and scientists often advocate spatial approaches to manage data-poor species. Existing spatial prediction and management techniques are either insufficiently robust, struggle with sparse input data, or make suboptimal use of multiple explanatory variables. Boosted Regression Trees feature excellent performance and are well suited to modelling the distribution of data-limited species, but are extremely complicated and time-consuming to learn and use, hindering access for a wide potential user base and therefore limiting uptake and usage. BRTs automated and simplified for accessible general use with rich feature set. We have built a software suite in R which integrates pre-existing functions with new tailor-made functions to automate the processing and predictive mapping of species abundance data: by automating and greatly simplifying Boosted Regression Tree spatial modelling, the gbm.auto R package suite makes this powerful statistical modelling technique more accessible to potential users in the ecological and modelling communities. The package and its documentation allow the user to generate maps of predicted abundance, visualise the representativeness of those abundance maps and to plot the relative influence of explanatory variables and their relationship to the response variables. Databases of the processed model objects and a report explaining all the steps taken within the model are also generated. The package includes a previously unavailable Decision Support Tool which combines estimated escapement biomass (the percentage of an exploited population which must be retained each year to conserve it) with the predicted abundance maps to generate maps showing the location and size of habitat that should be protected to conserve the target stocks (candidate MPAs), based on stakeholder priorities, such as the minimisation of fishing effort displacement. Gbm.auto for management in various settings. By bridging the gap between advanced statistical methods for species distribution modelling and conservation science, management and policy, these tools can allow improved spatial abundance predictions, and therefore better management, decision-making, and conservation. Although this package was built to support spatial management of a data-limited marine elasmobranch fishery, it should be equally applicable to spatial abundance modelling, area protection, and stakeholder engagement in various scenarios. PMID:29216310

  4. A spatial regression procedure for evaluating the relationship between AVHRR-NDVI and climate in the northern Great Plains

    USGS Publications Warehouse

    Ji, Lei; Peters, Albert J.

    2004-01-01

    The relationship between vegetation and climate in the grassland and cropland of the northern US Great Plains was investigated with Normalized Difference Vegetation Index (NDVI) (1989–1993) images derived from the Advanced Very High Resolution Radiometer (AVHRR), and climate data from automated weather stations. The relationship was quantified using a spatial regression technique that adjusts for spatial autocorrelation inherent in these data. Conventional regression techniques used frequently in previous studies are not adequate, because they are based on the assumption of independent observations. Six climate variables during the growing season (precipitation, potential evapotranspiration, daily maximum and minimum air temperature, soil temperature, and solar irradiation) were regressed on NDVI derived from a 10-km weather station buffer. The regression model identified precipitation and potential evapotranspiration as the most significant climatic variables, indicating that the water balance is the most important factor controlling vegetation condition at an annual timescale. The model indicates that 46% and 24% of variation in NDVI is accounted for by climate in grassland and cropland, respectively, indicating that grassland vegetation has a more pronounced response to climate variation than cropland. Other factors contributing to NDVI variation include environmental factors (soil, groundwater and terrain), human manipulation of crops, and sensor variation.

  5. Spatial modeling in ecology: the flexibility of eigenfunction spatial analyses.

    PubMed

    Griffith, Daniel A; Peres-Neto, Pedro R

    2006-10-01

    Recently, analytical approaches based on the eigenfunctions of spatial configuration matrices have been proposed in order to consider explicitly spatial predictors. The present study demonstrates the usefulness of eigenfunctions in spatial modeling applied to ecological problems and shows equivalencies of and differences between the two current implementations of this methodology. The two approaches in this category are the distance-based (DB) eigenvector maps proposed by P. Legendre and his colleagues, and spatial filtering based upon geographic connectivity matrices (i.e., topology-based; CB) developed by D. A. Griffith and his colleagues. In both cases, the goal is to create spatial predictors that can be easily incorporated into conventional regression models. One important advantage of these two approaches over any other spatial approach is that they provide a flexible tool that allows the full range of general and generalized linear modeling theory to be applied to ecological and geographical problems in the presence of nonzero spatial autocorrelation.

  6. Multicollinearity in spatial genetics: separating the wheat from the chaff using commonality analyses.

    PubMed

    Prunier, J G; Colyn, M; Legendre, X; Nimon, K F; Flamand, M C

    2015-01-01

    Direct gradient analyses in spatial genetics provide unique opportunities to describe the inherent complexity of genetic variation in wildlife species and are the object of many methodological developments. However, multicollinearity among explanatory variables is a systemic issue in multivariate regression analyses and is likely to cause serious difficulties in properly interpreting results of direct gradient analyses, with the risk of erroneous conclusions, misdirected research and inefficient or counterproductive conservation measures. Using simulated data sets along with linear and logistic regressions on distance matrices, we illustrate how commonality analysis (CA), a detailed variance-partitioning procedure that was recently introduced in the field of ecology, can be used to deal with nonindependence among spatial predictors. By decomposing model fit indices into unique and common (or shared) variance components, CA allows identifying the location and magnitude of multicollinearity, revealing spurious correlations and thus thoroughly improving the interpretation of multivariate regressions. Despite a few inherent limitations, especially in the case of resistance model optimization, this review highlights the great potential of CA to account for complex multicollinearity patterns in spatial genetics and identifies future applications and lines of research. We strongly urge spatial geneticists to systematically investigate commonalities when performing direct gradient analyses. © 2014 John Wiley & Sons Ltd.
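
For the simplest case of two correlated predictors, the decomposition of model fit into unique and common components can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' procedure: it uses ordinary least squares on simulated vectors rather than regressions on distance matrices, and the function names are hypothetical.

```python
import numpy as np

def r_squared(y, *predictors):
    """R-squared of an OLS fit of y on the given predictors plus an intercept."""
    y = np.asarray(y, dtype=float)
    design = np.column_stack(
        [np.ones(len(y))] + [np.asarray(p, dtype=float) for p in predictors]
    )
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)

def commonality_two_predictors(y, x1, x2):
    """Split the full-model R-squared into two unique parts and one common part."""
    r1 = r_squared(y, x1)
    r2 = r_squared(y, x2)
    r12 = r_squared(y, x1, x2)
    return {
        "unique_x1": r12 - r2,      # fit lost when x1 is dropped
        "unique_x2": r12 - r1,      # fit lost when x2 is dropped
        "common": r1 + r2 - r12,    # fit shared by x1 and x2
        "total": r12,
    }

# Simulated, deliberately collinear predictors (illustrative values only).
rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=200)
y = x1 + x2 + rng.normal(size=200)

parts = commonality_two_predictors(y, x1, x2)
```

A large `common` value relative to the two unique values signals that most of the explained variance cannot be attributed to either predictor alone, which is the multicollinearity pattern the abstract warns about.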

  7. Determinants of single family residential water use across scales in four western US cities.

    PubMed

    Chang, Heejun; Bonnette, Matthew Ryan; Stoker, Philip; Crow-Miller, Britt; Wentz, Elizabeth

    2017-10-15

    A growing body of literature examines urban water sustainability with increasing evidence that locally-based physical and social spatial interactions contribute to water use. These studies, however, are based on single-city analysis and often fail to consider whether these interactions occur more generally. We examine a multi-city comparison using a common set of spatially-explicit water, socioeconomic, and biophysical data. We investigate the relative importance of variables for explaining the variations of single family residential (SFR) water uses at Census Block Group (CBG) and Census Tract (CT) scales in four representative western US cities (Austin, Phoenix, Portland, and Salt Lake City), which cover a wide range of climate and development density. We used both ordinary least squares regression and spatial error regression models to identify the influence of spatial dependence on water use patterns. Our results show that older downtown areas show lower water use than newer suburban areas in all four cities. Tax assessed value and building age are the main determinants of SFR water use across the four cities regardless of the scale. Impervious surface area becomes an important variable for summer water use in all cities, and it is important in all seasons for arid environments such as Phoenix. CT level analysis shows better model predictability than CBG analysis. In all cities, seasons, and spatial scales, spatial error regression models better explain the variations of SFR water use. Such a spatially-varying relationship of urban water consumption provides additional evidence for the need to integrate urban land use planning and municipal water planning. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. Assessing spatial inequalities in accessing community pharmacies: a mixed geographically weighted approach.

    PubMed

    Domnich, Alexander; Arata, Lucia; Amicizia, Daniela; Signori, Alessio; Gasparini, Roberto; Panatto, Donatella

    2016-11-16

    Geographical accessibility is an important determinant for the utilisation of community pharmacies. The present study explored patterns of spatial accessibility with respect to pharmacies in Liguria, Italy, a region with particular geographical and demographic features. Municipal density of pharmacies was proxied as the number of pharmacies per capita and per km(2), and spatial autocorrelation analysis was performed to identify spatial clusters. Both non-spatial and spatial models were constructed to predict the study outcome. Spatial autocorrelation analysis showed a highly significant clustered pattern in the density of pharmacies per capita (I=0.082) and per km(2) (I=0.295). Potentially under-supplied areas were mostly located in the mountainous hinterland. Ordinary least-squares (OLS) regressions established a significant positive relationship between the density of pharmacies and income among municipalities located at high altitudes, while no such association was observed in lower-lying areas. However, residuals of the OLS models were spatially auto-correlated. The best-fitting mixed geographically weighted regression (GWR) models outperformed the corresponding OLS models. Pharmacies per capita were best predicted by two local predictors (altitude and proportion of immigrants) and two global ones (proportion of elderly residents and income), while the local terms population, mean altitude and rural status and the global term income functioned as independent variables predicting pharmacies per km(2). The density of pharmacies in Liguria was found to be associated with both socio-economic and landscape factors. Mapping of mixed GWR results would be helpful to policy-makers.
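
The I values quoted above are presumably global Moran's I statistics; the abstract does not say how the spatial weights were built. The sketch below computes Moran's I for a toy example, assuming a binary contiguity matrix for four areas arranged in a row. The data and the helper name `morans_i` are illustrative only.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I for values observed on areas linked by a weights matrix."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()                 # deviations from the mean
    n = x.size
    s0 = w.sum()                     # sum of all spatial weights
    return (n / s0) * (z @ w @ z) / (z @ z)

# Four areas in a row; neighbours are areas that share a border (assumed).
w = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

trend = morans_i([1.0, 2.0, 3.0, 4.0], w)        # similar neighbours: positive I
alternating = morans_i([1.0, 4.0, 1.0, 4.0], w)  # dissimilar neighbours: negative I
```

Values above the expectation under no autocorrelation, -1/(n-1), indicate clustering of similar values. Judging significance, as the study did, additionally requires a permutation or analytical test that this sketch omits.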

  9. Influences of spatial and temporal variation on fish-habitat relationships defined by regression quantiles

    USGS Publications Warehouse

    Dunham, J.B.; Cade, B.S.; Terrell, J.W.

    2002-01-01

    We used regression quantiles to model potentially limiting relationships between the standing crop of cutthroat trout Oncorhynchus clarki and measures of stream channel morphology. Regression quantile models indicated that variation in fish density was inversely related to the width:depth ratio of streams but not to stream width or depth alone. The spatial and temporal stability of model predictions were examined across years and streams, respectively. Variation in fish density with width:depth ratio (10th-90th regression quantiles) modeled for streams sampled in 1993-1997 predicted the variation observed in 1998-1999, indicating similar habitat relationships across years. Both linear and nonlinear models described the limiting relationships well, the latter performing slightly better. Although estimated relationships were transferable in time, results were strongly dependent on the influence of spatial variation in fish density among streams. Density changes with width:depth ratio in a single stream were responsible for the significant (P < 0.10) negative slopes estimated for the higher quantiles (>80th). This suggests that stream-scale factors other than width:depth ratio play a more direct role in determining population density. Much of the variation in densities of cutthroat trout among streams was attributed to the occurrence of nonnative brook trout Salvelinus fontinalis (a possible competitor) or connectivity to migratory habitats. Regression quantiles can be useful for estimating the effects of limiting factors when ecological responses are highly variable, but our results indicate that spatiotemporal variability in the data should be explicitly considered. In this study, data from individual streams and stream-specific characteristics (e.g., the occurrence of nonnative species and habitat connectivity) strongly affected our interpretation of the relationship between width:depth ratio and fish density.

  10. Effects of land cover, topography, and built structure on seasonal water quality at multiple spatial scales.

    PubMed

    Pratt, Bethany; Chang, Heejun

    2012-03-30

    The relationship among land cover, topography, built structure and stream water quality in the Portland Metro region of Oregon and Clark County, Washington areas, USA, is analyzed using ordinary least squares (OLS) and geographically weighted (GWR) multiple regression models. Two scales of analysis, a sectional watershed and a buffer, offered a local and a global investigation of the sources of stream pollutants. Model accuracy, measured by R(2) values, fluctuated according to the scale, season, and regression method used. While most wet season water quality parameters are associated with urban land covers, most dry season water quality parameters are related to topographic features such as elevation and slope. GWR models, which take into consideration local relations of spatial autocorrelation, had stronger results than OLS regression models. In the multiple regression models, sectioned watershed results were consistently better than the sectioned buffer results, except for dry season pH and stream temperature parameters. This suggests that while riparian land cover does have an effect on water quality, a wider contributing area needs to be included in order to account for distant sources of pollutants. Copyright © 2012 Elsevier B.V. All rights reserved.

  11. Determination of riverbank erosion probability using Locally Weighted Logistic Regression

    NASA Astrophysics Data System (ADS)

    Ioannidou, Elena; Flori, Aikaterini; Varouchakis, Emmanouil A.; Giannakis, Georgios; Vozinaki, Anthi Eirini K.; Karatzas, George P.; Nikolaidis, Nikolaos

    2015-04-01

    Riverbank erosion is a natural geomorphologic process that affects the fluvial environment. The most important issue concerning riverbank erosion is the identification of the vulnerable locations. An alternative to the usual hydrodynamic models to predict vulnerable locations is to quantify the probability of erosion occurrence. This can be achieved by identifying the underlying relations between riverbank erosion and the geomorphological or hydrological variables that prevent or stimulate erosion. Thus, riverbank erosion can be determined by a regression model using independent variables that are considered to affect the erosion process. The impact of such variables may vary spatially; therefore, a non-stationary regression model is preferred over a stationary equivalent. Locally Weighted Regression (LWR) is proposed as a suitable choice. This method can be extended to predict the binary presence or absence of erosion based on a series of independent local variables by using the logistic regression model. It is referred to as Locally Weighted Logistic Regression (LWLR). Logistic regression is a type of regression analysis used for predicting the outcome of a categorical dependent variable (e.g. binary response) based on one or more predictor variables. The method can be combined with LWR to assign weights to local independent variables of the dependent one. LWR allows model parameters to vary over space in order to reflect spatial heterogeneity. The probabilities of the possible outcomes are modelled as a function of the independent variables using a logistic function. Logistic regression measures the relationship between a categorical dependent variable and, usually, one or several continuous independent variables by converting the dependent variable to probability scores. Then, a logistic regression is formed, which predicts success or failure of a given binary variable (e.g. erosion presence or absence) for any value of the independent variables. The erosion occurrence probability can be calculated in conjunction with the model deviance regarding the independent variables tested. The most straightforward measure for goodness of fit is the G statistic. It is a simple and effective way to study and evaluate the Logistic Regression model efficiency and the reliability of each independent variable. The developed statistical model is applied to the Koiliaris River Basin on the island of Crete, Greece. Two datasets of river bank slope, river cross-section width and indications of erosion were available for the analysis (12 and 8 locations). Two different types of spatial dependence functions, exponential and tricubic, were examined to determine the local spatial dependence of the independent variables at the measurement locations. The results show a significant improvement when the tricubic function is applied as the erosion probability is accurately predicted at all eight validation locations. Results for the model deviance show that cross-section width is more important than bank slope in the estimation of erosion probability along the Koiliaris riverbanks. The proposed statistical model is a useful tool that quantifies the erosion probability along the riverbanks and can be used to assist in managing erosion and flooding events. Acknowledgements: This work is part of an on-going THALES project (CYBERSENSORS - High Frequency Monitoring System for Integrated Water Resources Management of Rivers). The project has been co-financed by the European Union (European Social Fund - ESF) and Greek national funds through the Operational Program "Education and Lifelong Learning" of the National Strategic Reference Framework (NSRF) - Research Funding Program: THALES. Investing in knowledge society through the European Social Fund.
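
A locally weighted logistic fit of the kind described above can be sketched as follows. Several pieces are assumptions rather than details from the abstract: the textbook forms of the tricube and exponential kernels, the bandwidth `h`, the small ridge term added for numerical stability, and the toy data. The function names are hypothetical.

```python
import numpy as np

def tricube_kernel(dist, h):
    """Tricube weights: 1 at distance 0, falling to 0 at distance h and beyond."""
    u = np.clip(np.asarray(dist, dtype=float) / h, 0.0, 1.0)
    return (1.0 - u ** 3) ** 3

def exponential_kernel(dist, h):
    """Exponential weights: decay smoothly and never reach exactly 0."""
    return np.exp(-np.asarray(dist, dtype=float) / h)

def weighted_logistic_fit(x, y, weights, ridge=1e-3, n_iter=50):
    """Newton iterations for a logistic regression with observation weights."""
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), np.asarray(x, dtype=float)])
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(design @ beta)))
        # Gradient and Hessian of the weighted, ridge-penalised log-likelihood.
        grad = design.T @ (weights * (y - p)) - ridge * beta
        hess = (design.T * (weights * p * (1.0 - p))) @ design
        hess += ridge * np.eye(design.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Toy data: eight sites along a river, one standardised predictor, erosion 0/1.
position = np.arange(8.0)                      # distance along the river (assumed)
predictor = np.array([-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0])
erosion = np.array([0, 0, 1, 0, 1, 0, 1, 1])

target = 3.5                                   # location where the model is fitted
w = tricube_kernel(np.abs(position - target), h=10.0)
beta_local = weighted_logistic_fit(predictor, erosion, w)
# Predicted erosion probability at the target for a predictor value of 1.0.
prob = 1.0 / (1.0 + np.exp(-(beta_local[0] + beta_local[1] * 1.0)))
```

Repeating the fit at each location of interest, with weights recomputed from the distances to that location, gives coefficients that vary over space. Swapping `tricube_kernel` for `exponential_kernel` changes only how quickly distant sites lose influence.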

  12. Mapping soil textural fractions across a large watershed in north-east Florida.

    PubMed

    Lamsal, S; Mishra, U

    2010-08-01

    Assessing regional-scale spatial variation in soil properties and mapping their distribution are constrained by sparse data, which are collected through field surveys that are labor intensive and cost prohibitive. We explored geostatistical (ordinary kriging-OK), regression (Regression Tree-RT), and hybrid methods (RT plus residual Sequential Gaussian Simulation-SGS) to map soil textural fractions across the Santa Fe River Watershed (3585 km(2)) in north-east Florida. Soil samples collected from four depths (L1: 0-30 cm, L2: 30-60 cm, L3: 60-120 cm, and L4: 120-180 cm) at 141 locations were analyzed for soil textural fractions (sand, silt and clay contents), and combined with textural data (15 profiles) assembled under the Florida Soil Characterization program. Textural fractions in L1 and L2 were autocorrelated, and spatially mapped across the watershed. OK performance was poor, which may be attributed to the sparse sampling. RT model structure varied among textural fractions, and the variation explained by the model ranged from 25% for L1 silt to 61% for L2 clay content. Regression residuals were simulated using SGS, and the average of the simulated residuals was used to approximate a regression residual distribution map, which was added to the regression trend maps. Independent validation of the prediction maps showed that regression models performed slightly better than OK, and regression combined with the average of simulated regression residuals improved predictions beyond the regression model. Sand content >90% in both 0-30 and 30-60 cm covered 80.6% of the watershed area. Copyright 2010 Elsevier Ltd. All rights reserved.

  13. Fine-Scale Exposure to Allergenic Pollen in the Urban Environment: Evaluation of Land Use Regression Approach.

    PubMed

    Hjort, Jan; Hugg, Timo T; Antikainen, Harri; Rusanen, Jarmo; Sofiev, Mikhail; Kukkonen, Jaakko; Jaakkola, Maritta S; Jaakkola, Jouni J K

    2016-05-01

    Despite the recent developments in physically and chemically based analysis of atmospheric particles, no models exist for resolving the spatial variability of pollen concentration at urban scale. We developed a land use regression (LUR) approach for predicting spatial fine-scale allergenic pollen concentrations in the Helsinki metropolitan area, Finland, and evaluated the performance of the models against available empirical data. We used grass pollen data monitored at 16 sites in an urban area during the peak pollen season and geospatial environmental data. The main statistical method was generalized linear model (GLM). GLM-based LURs explained 79% of the spatial variation in the grass pollen data based on all samples, and 47% of the variation when samples from two sites with very high concentrations were excluded. In model evaluation, prediction errors ranged from 6% to 26% of the observed range of grass pollen concentrations. Our findings support the use of geospatial data-based statistical models to predict the spatial variation of allergenic grass pollen concentrations at intra-urban scales. A remote sensing-based vegetation index was the strongest predictor of pollen concentrations for exposure assessments at local scales. The LUR approach provides new opportunities to estimate the relations between environmental determinants and allergenic pollen concentration in human-modified environments at fine spatial scales. This approach could potentially be applied to estimate retrospectively pollen concentrations to be used for long-term exposure assessments. Hjort J, Hugg TT, Antikainen H, Rusanen J, Sofiev M, Kukkonen J, Jaakkola MS, Jaakkola JJ. 2016. Fine-scale exposure to allergenic pollen in the urban environment: evaluation of land use regression approach. Environ Health Perspect 124:619-626; http://dx.doi.org/10.1289/ehp.1509761.

  14. Application of geographically-weighted regression analysis to assess risk factors for malaria hotspots in Keur Soce health and demographic surveillance site.

    PubMed

    Ndiath, Mansour M; Cisse, Badara; Ndiaye, Jean Louis; Gomis, Jules F; Bathiery, Ousmane; Dia, Anta Tal; Gaye, Oumar; Faye, Babacar

    2015-11-18

    In Senegal, considerable efforts have been made to reduce malaria morbidity and mortality during the last decade. This resulted in a marked decrease of malaria cases. With the decline of malaria cases, transmission has become sparse in most Senegalese health districts. This study investigated malaria hotspots in Keur Soce sites by using geographically-weighted regression. Because of the occurrence of hotspots, spatial modelling of malaria cases could have a considerable effect in disease surveillance. This study explored and analysed the spatial relationships between malaria occurrence and socio-economic and environmental factors in small communities in Keur Soce, Senegal, using 6 months of passive surveillance. Geographically-weighted regression was used to explore the spatial variability of relationships between malaria incidence or persistence and the selected socio-economic and human predictors. A model comparison between ordinary least squares (OLS) and geographically-weighted regression was also explored. A vector dataset (spatial) of the study area at village level and statistical data (non-spatial) on confirmed malaria cases, socio-economic status (bed net use), population data (size of the household) and environmental factors (temperature, rainfall) were used in this exploratory analysis. ArcMap 10.2 and Stata 11 were used to perform malaria hotspots analysis. From June to December, a total of 408 confirmed malaria cases were notified. The explanatory variables (household size, housing materials, sleeping rooms, sheep and distance to breeding site) returned significant t values of -0.25, 2.3, 4.39, 1.25 and 2.36, respectively. The OLS global model explained about 70% (adjusted R(2) = 0.70) of the variation in malaria occurrence with AIC = 756.23. The geographically-weighted regression of malaria hotspots resulted in a coefficient intercept ranging from 1.89 to 6.22 with a median of 3.5.
Large positive values are distributed mainly in the southeast of the district where hotspots are more accurate while low values are mainly found in the centre and in the north. Geographically-weighted regression and OLS showed important risk factors of malaria hotspots in Keur Soce. The outputs of such models can be a useful tool to understand occurrence of malaria hotspots in Senegal. An understanding of geographical variation and determination of the core areas of the disease may provide an explanation regarding possible proximal and distal contributors to malaria elimination in Senegal.
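    To make the contrast between a global OLS fit and geographically-weighted regression concrete, the following is a minimal sketch of the idea behind GWR, not the ArcMap or Stata procedure used in the study. It assumes a single predictor, a Gaussian distance kernel and a fixed bandwidth; the function name `gwr_simple` and all of its parameters are hypothetical.

```python
import math


def gwr_simple(coords, x, y, bandwidth):
    """Fit y = a + b*x separately at every observation location.

    Each local fit is a weighted least squares regression in which nearby
    observations receive larger Gaussian kernel weights than distant ones.
    Returns a list of (intercept, slope) pairs, one per location.
    """
    results = []
    for (u0, v0) in coords:
        # Gaussian kernel weight of every observation relative to (u0, v0)
        w = [math.exp(-0.5 * (math.hypot(u - u0, v - v0) / bandwidth) ** 2)
             for (u, v) in coords]
        total = sum(w)
        x_mean = sum(wi * xi for wi, xi in zip(w, x)) / total
        y_mean = sum(wi * yi for wi, yi in zip(w, y)) / total
        sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - x_mean) * (yi - y_mean)
                  for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx
        results.append((y_mean - slope * x_mean, slope))
    return results


if __name__ == "__main__":
    # Hypothetical village coordinates, predictor and response values
    coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 2.0)]
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [4.0, 9.0, 10.0, 15.0, 21.0]
    for location, (a, b) in zip(coords, gwr_simple(coords, x, y, 1.0)):
        print(location, a, b)
```

    Because every location gets its own intercept and slope, the output is a surface of coefficients rather than a single global estimate, which is why the abstract reports a range and a median for the intercept.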

  15. When homogeneity meets heterogeneity: the geographically weighted regression with spatial lag approach to prenatal care utilization

    PubMed Central

    Shoff, Carla; Chen, Vivian Yi-Ju; Yang, Tse-Chuan

    2014-01-01

    Using geographically weighted regression (GWR), a recent study by Shoff and colleagues (2012) investigated the place-specific risk factors for prenatal care utilization in the US and found that most of the relationships between late or no prenatal care and its determinants are spatially heterogeneous. However, the GWR approach may be subject to the confounding effect of spatial homogeneity. The goal of this study is to address this concern by including both spatial homogeneity and heterogeneity in the analysis. Specifically, we employ an analytic framework where a spatially lagged (SL) effect of the dependent variable is incorporated into the GWR model, which is called GWR-SL. Using this innovative framework, we found evidence to argue that spatial homogeneity is neglected in the study by Shoff et al. (2012) and that the results change after considering the spatially lagged effect of prenatal care utilization. The GWR-SL approach allows us to gain a place-specific understanding of prenatal care utilization in US counties. In addition, we compared the GWR-SL results with the results of conventional approaches (i.e., OLS and spatial lag models) and found that GWR-SL is the preferred modeling approach. The new findings help us to better estimate how the predictors are associated with prenatal care utilization across space, and determine whether and how the level of prenatal care utilization in neighboring counties matters. PMID:24893033

  16. Improving satellite-based PM2.5 estimates in China using Gaussian processes modeling in a Bayesian hierarchical setting.

    PubMed

    Yu, Wenxi; Liu, Yang; Ma, Zongwei; Bi, Jun

    2017-08-01

    Using satellite-based aerosol optical depth (AOD) measurements and statistical models to estimate ground-level PM2.5 is a promising way to fill the areas that are not covered by ground PM2.5 monitors. The statistical models used in previous studies are primarily Linear Mixed Effects (LME) and Geographically Weighted Regression (GWR) models. In this study, we developed a new regression model between PM2.5 and AOD using Gaussian processes in a Bayesian hierarchical setting. Gaussian processes model the stochastic nature of the spatial random effects, where the mean surface and the covariance function are specified. The spatial stochastic process is incorporated under the Bayesian hierarchical framework to explain the variation of PM2.5 concentrations together with other factors, such as AOD, spatial and non-spatial random effects. We evaluate the results of our model and compare them with those of other, conventional statistical models (GWR and LME) by within-sample model fitting and out-of-sample validation (cross validation, CV). The results show that our model possesses a CV result (R² = 0.81) that reflects higher accuracy than that of GWR and LME (0.74 and 0.48, respectively). Our results indicate that Gaussian process models have the potential to improve the accuracy of satellite-based PM2.5 estimates.

  17. Spatial analysis and land use regression of VOCs and NO(2) from school-based urban air monitoring in Detroit/Dearborn, USA.

    PubMed

    Mukerjee, Shaibal; Smith, Luther A; Johnson, Mary M; Neas, Lucas M; Stallings, Casson A

    2009-08-01

    Passive ambient air sampling for nitrogen dioxide (NO(2)) and volatile organic compounds (VOCs) was conducted at 25 school and two compliance sites in Detroit and Dearborn, Michigan, USA during the summer of 2005. Geographic Information System (GIS) data were calculated at each of 116 schools. The 25 selected schools were monitored to assess and model intra-urban gradients of air pollutants to evaluate impact of traffic and urban emissions on pollutant levels. Schools were chosen to be statistically representative of urban land use variables such as distance to major roadways, traffic intensity around the schools, distance to nearest point sources, population density, and distance to nearest border crossing. Two approaches were used to investigate spatial variability. First, Kruskal-Wallis analyses and pairwise comparisons on data from the schools examined coarse spatial differences based on city section and distance from heavily trafficked roads. Secondly, spatial variation on a finer scale and as a response to multiple factors was evaluated through land use regression (LUR) models via multiple linear regression. For weeklong exposures, VOCs did not exhibit spatial variability by city section or distance from major roads; NO(2) was significantly elevated in a section dominated by traffic and industrial influence versus a residential section. Somewhat in contrast to coarse spatial analyses, LUR results revealed spatial gradients in NO(2) and selected VOCs across the area. The process used to select spatially representative sites for air sampling and the results of coarse and fine spatial variability of air pollutants provide insights that may guide future air quality studies in assessing intra-urban gradients.

  18. Spatial analysis of land use and shallow groundwater vulnerability in the watershed adjacent to Assateague Island National Seashore, Maryland and Virginia, USA

    USGS Publications Warehouse

    LaMotte, A.E.; Greene, E.A.

    2007-01-01

    Spatial relations between land use and groundwater quality in the watershed adjacent to Assateague Island National Seashore, Maryland and Virginia, USA were analyzed by the use of two spatial models. One model used a logit analysis and the other was based on geostatistics. The models were developed and compared on the basis of existing concentrations of nitrate as nitrogen in samples from 529 domestic wells. The models were applied to produce spatial probability maps that show areas in the watershed where concentrations of nitrate in groundwater are likely to exceed a predetermined management threshold value. Maps of the watershed generated by logistic regression and probability kriging analysis showing where the probability of nitrate concentrations would exceed 3 mg/L (>0.50) compared favorably. Logistic regression was less dependent on the spatial distribution of sampled wells, and identified an additional high probability area within the watershed that was missed by probability kriging. The spatial probability maps could be used to determine the natural or anthropogenic factors that best explain the occurrence and distribution of elevated concentrations of nitrate (or other constituents) in shallow groundwater. This information can be used by local land-use planners, ecologists, and managers to protect water supplies and identify land-use planning solutions and monitoring programs in vulnerable areas. © 2006 Springer-Verlag.

  19. Wildlife tradeoffs based on landscape models of habitat preference

    USGS Publications Warehouse

    Loehle, C.; Mitchell, M.S.; White, M.

    2000-01-01

    Wildlife tradeoffs based on landscape models of habitat preference were presented. Multiscale logistic regression models were used and based on these models a spatial optimization technique was utilized to generate optimal maps. The tradeoffs were analyzed by gradually increasing the weighting on a single species in the objective function over a series of simulations. Results indicated that efficiency of habitat management for species diversity could be maximized for small landscapes by incorporating spatial context.

  20. Global-scale high-resolution (~1 km) modelling of mean, maximum and minimum annual streamflow

    NASA Astrophysics Data System (ADS)

    Barbarossa, Valerio; Huijbregts, Mark; Hendriks, Jan; Beusen, Arthur; Clavreul, Julie; King, Henry; Schipper, Aafke

    2017-04-01

    Quantifying mean, maximum and minimum annual flow (AF) of rivers at ungauged sites is essential for a number of applications, including assessments of global water supply, ecosystem integrity and water footprints. AF metrics can be quantified with spatially explicit process-based models, which might be overly time-consuming and data-intensive for this purpose, or with empirical regression models that predict AF metrics based on climate and catchment characteristics. Yet, so far, regression models have mostly been developed at a regional scale and the extent to which they can be extrapolated to other regions is not known. We developed global-scale regression models that quantify mean, maximum and minimum AF as a function of catchment area and catchment-averaged slope, elevation, and mean, maximum and minimum annual precipitation and air temperature. We then used these models to obtain global 30 arc-seconds (~1 km) maps of mean, maximum and minimum AF for each year from 1960 through 2015, based on a newly developed hydrologically conditioned digital elevation model. We calibrated our regression models based on observations of discharge and catchment characteristics from about 4,000 catchments worldwide, ranging from 100 to 10^6 km² in size, and validated them against independent measurements as well as the output of a number of process-based global hydrological models (GHMs). The variance explained by our regression models ranged up to 90% and the performance of the models compared well with the performance of existing GHMs. Yet, our AF maps provide a level of spatial detail that cannot yet be achieved by current GHMs.

  1. Digital hydrologic networks supporting applications related to spatially referenced regression modeling

    USGS Publications Warehouse

    Brakebill, John W.; Wolock, David M.; Terziotti, Silvia

    2011-01-01

    Digital hydrologic networks depicting surface-water pathways and their associated drainage catchments provide a key component to hydrologic analysis and modeling. Collectively, they form common spatial units that can be used to frame the descriptions of aquatic and watershed processes. In addition, they provide the ability to simulate and route the movement of water and associated constituents throughout the landscape. Digital hydrologic networks have evolved from derivatives of mapping products to detailed, interconnected, spatially referenced networks of water pathways, drainage areas, and stream and watershed characteristics. These properties are important because they enhance the ability to spatially evaluate factors that affect the sources and transport of water-quality constituents at various scales. SPAtially Referenced Regressions On Watershed attributes (SPARROW), a process-based/statistical model, relies on a digital hydrologic network in order to establish relations between quantities of monitored contaminant flux, contaminant sources, and the associated physical characteristics affecting contaminant transport. Digital hydrologic networks modified from the River Reach File (RF1) and National Hydrography Dataset (NHD) geospatial datasets provided frameworks for SPARROW in six regions of the conterminous United States. In addition, characteristics of the modified RF1 were used to update estimates of mean-annual streamflow. This produced more current flow estimates for use in SPARROW modeling.

  2. Spatial analysis of relative humidity during ungauged periods in a mountainous region

    NASA Astrophysics Data System (ADS)

    Um, Myoung-Jin; Kim, Yeonjoo

    2017-08-01

    Although atmospheric humidity influences environmental and agricultural conditions, thereby influencing plant growth, human health, and air pollution, efforts to develop spatial maps of atmospheric humidity using statistical approaches have thus far been limited. This study therefore aims to develop statistical approaches for inferring the spatial distribution of relative humidity (RH) for a mountainous island, for which data are not uniformly available across the region. A multiple regression analysis based on various mathematical models was used to identify the optimal model for estimating monthly RH by incorporating not only temperature but also location and elevation. Based on the regression analysis, we extended the monthly RH data from weather stations to cover the ungauged periods when no RH observations were available. Then, two different types of station-based data, the observational data and the data extended via the regression model, were used to form grid-based data with a resolution of 100 m. The grid-based data that used the extended station-based data captured the increasing RH trend along an elevation gradient. Furthermore, annual RH values averaged over the regions were examined. Decreasing temporal trends were found in most cases, with magnitudes varying based on the season and region.

  3. Potential habitat distribution for the freshwater diatom Didymosphenia geminata in the continental US

    USGS Publications Warehouse

    Kumar, S.; Spaulding, S.A.; Stohlgren, T.J.; Hermann, K.A.; Schmidt, T.S.; Bahls, L.L.

    2009-01-01

    The diatom Didymosphenia geminata is a single-celled alga found in lakes, streams, and rivers. Nuisance blooms of D. geminata affect the diversity, abundance, and productivity of other aquatic organisms. Because D. geminata can be transported by humans on waders and other gear, accurate spatial prediction of habitat suitability is urgently needed for early detection and rapid response, as well as for evaluation of monitoring and control programs. We compared four modeling methods to predict D. geminata's habitat distribution; two methods use presence-absence data (logistic regression and classification and regression tree [CART]), and two involve presence data (maximum entropy model [Maxent] and genetic algorithm for rule-set production [GARP]). Using these methods, we evaluated spatially explicit, bioclimatic and environmental variables as predictors of diatom distribution. The Maxent model provided the most accurate predictions, followed by logistic regression, CART, and GARP. The most suitable habitats were predicted to occur in the western US, in relatively cool sites, and at high elevations with a high base-flow index. The results provide insights into the factors that affect the distribution of D. geminata and a spatial basis for the prediction of nuisance blooms. © The Ecological Society of America.

  4. Characterization of the spatial variability of soil available zinc at various sampling densities using grouped soil type information.

    PubMed

    Song, Xiao-Dong; Zhang, Gan-Lin; Liu, Feng; Li, De-Cheng; Zhao, Yu-Guo

    2016-11-01

    Anthropogenic activities and natural processes introduce high uncertainty into the modeling of spatial variation in soil available zinc (AZn) in plain river network regions. Four datasets with different sampling densities were split over the Qiaocheng district of Bozhou City, China. Differences in AZn concentrations among soil types were analyzed by principal component analysis (PCA). Since stationarity was not indicated and the effective ranges of the four datasets were larger than the sampling extent (about 400 m), two investigation tools, namely the F3 test and the stationarity index (SI), were employed to test for local non-stationarity. Geographically weighted regression (GWR) was used to describe the spatial heterogeneity of AZn concentrations under the non-stationarity assumption. GWR based on grouped soil type information (GWRG for short) was proposed so as to benefit the local modeling of soil AZn within each soil-landscape unit. For reference, the multiple linear regression (MLR) model, a global regression technique, was also employed and incorporated the same predictors as in the GWR models. Validation results based on 100 realizations demonstrated that GWRG outperformed MLR and can produce similar or better accuracy than the GWR approach. Nevertheless, GWRG can generate better soil maps than GWR for limited soil data. A two-sample t test of the produced soil maps also confirmed significantly different means. Variogram analysis of the model residuals exhibited weak spatial correlation, rejecting the use of hybrid kriging techniques. As a heuristic statistical method, the GWRG was beneficial in this study and potentially for other soil properties.

  5. Development of land-use regression models for exposure assessment to ultrafine particles in Rome, Italy

    NASA Astrophysics Data System (ADS)

    Cattani, Giorgio; Gaeta, Alessandra; di Menno di Bucchianico, Alessandro; de Santis, Antonella; Gaddi, Raffaela; Cusano, Mariacarmela; Ancona, Carla; Badaloni, Chiara; Forastiere, Francesco; Gariazzo, Claudio; Sozzi, Roberto; Inglessis, Marco; Silibello, Camillo; Salvatori, Elisabetta; Manes, Fausto; Cesaroni, Giulia; The Viias Study Group

    2017-05-01

    The health effects of long-term exposure to ultrafine particles (UFPs) are poorly understood. Fine-resolution data on spatial contrasts in ambient UFP concentrations are needed. This study aimed to assess the spatial variability of total particle number concentrations (PNC, a proxy for UFPs) in the city of Rome, Italy, using land use regression (LUR) models, and the corresponding exposure of the resident population. PNC were measured using condensation particle counters at the building facade of 28 homes throughout the city. Three 7-day monitoring periods were carried out during cold, warm and intermediate seasons. Geographic Information System predictor variables, with buffers of varying size, were evaluated to model spatial variations of PNC. A stepwise forward selection procedure was used to develop a "base" linear regression model according to the European Study of Cohorts for Air Pollution Effects project methodology. Other variables were then included in more enhanced models and their capability of improving model performance was evaluated. Four LUR models were developed. Local variation in UFPs in the study area can be largely explained by the ratio of traffic intensity and distance to the nearest major road. The best model (adjusted R2 = 0.71; root mean square error = ±1,572 particles/cm³, leave one out cross validated R2 = 0.68) was achieved by regressing building and street configuration variables against the residuals from the "base" model, which added 3% more to the total variance explained. Urban green and population density in a 5,000 m buffer around each home were also relevant predictors. The spatial contrast in ambient PNC across the large conurbation of Rome was successfully assessed. The average exposure of subjects living in the study area was 16,006 particles/cm³ (SD 2,165 particles/cm³, range: 11,075-28,632 particles/cm³).
A total of 203,886 subjects (16%) live in Rome within 50 m of a high-traffic road and experience the highest exposure levels (18,229 particles/cm³). The results will be used to estimate the long-term health effects of ultrafine particle exposure of participants in Rome.

  6. A comparison of adaptive sampling designs and binary spatial models: A simulation study using a census of Bromus inermis

    USGS Publications Warehouse

    Irvine, Kathryn M.; Thornton, Jamie; Backus, Vickie M.; Hohmann, Matthew G.; Lehnhoff, Erik A.; Maxwell, Bruce D.; Michels, Kurt; Rew, Lisa

    2013-01-01

    Commonly in environmental and ecological studies, species distribution data are recorded as presence or absence throughout a spatial domain of interest. Field based studies typically collect observations by sampling a subset of the spatial domain. We consider the effects of six different adaptive and two non-adaptive sampling designs and choice of three binary models on both predictions to unsampled locations and parameter estimation of the regression coefficients (species–environment relationships). Our simulation study is unique compared to others to date in that we virtually sample a true known spatial distribution of a nonindigenous plant species, Bromus inermis. The census of B. inermis provides a good example of a species distribution that is both sparsely (1.9 % prevalence) and patchily distributed. We find that modeling the spatial correlation using a random effect with an intrinsic Gaussian conditionally autoregressive prior distribution was equivalent or superior to Bayesian autologistic regression in terms of predicting to un-sampled areas when strip adaptive cluster sampling was used to survey B. inermis. However, inferences about the relationships between B. inermis presence and environmental predictors differed between the two spatial binary models. The strip adaptive cluster designs we investigate provided a significant advantage in terms of Markov chain Monte Carlo chain convergence when trying to model a sparsely distributed species across a large area. In general, there was little difference in the choice of neighborhood, although the adaptive king was preferred when transects were randomly placed throughout the spatial domain.

  7. Reduced Lung Cancer Mortality With Lower Atmospheric Pressure.

    PubMed

    Merrill, Ray M; Frutos, Aaron

    2018-01-01

    Research has shown that higher altitude is associated with lower risk of lung cancer and improved survival among patients. The current study assessed the influence of county-level atmospheric pressure (a measure reflecting both altitude and temperature) on age-adjusted lung cancer mortality rates in the contiguous United States, with 2 forms of spatial regression. Ordinary least squares regression and geographically weighted regression models were used to evaluate the impact of climate and other selected variables on lung cancer mortality, based on 2974 counties. Atmospheric pressure was significantly positively associated with lung cancer mortality, after controlling for sunlight, precipitation, PM2.5 (µg/m³), current smoker, and other selected variables. Positive county-level β coefficient estimates (P < .05) for atmospheric pressure were observed throughout the United States, higher in the eastern half of the country. The spatial regression models showed that atmospheric pressure is positively associated with age-adjusted lung cancer mortality rates, after controlling for other selected variables.

  8. Neighborhood social capital and crime victimization: comparison of spatial regression analysis and hierarchical regression analysis.

    PubMed

    Takagi, Daisuke; Ikeda, Ken'ichi; Kawachi, Ichiro

    2012-11-01

    Crime is an important determinant of public health outcomes, including quality of life, mental well-being, and health behavior. A body of research has documented the association between community social capital and crime victimization. The association between social capital and crime victimization has been examined at multiple levels of spatial aggregation, ranging from entire countries, to states, metropolitan areas, counties, and neighborhoods. In multilevel analysis, the spatial boundaries at level 2 are most often drawn from administrative boundaries (e.g., Census tracts in the U.S.). One problem with adopting administrative definitions of neighborhoods is that it ignores spatial spillover. We conducted a study of social capital and crime victimization in one ward of Tokyo city, using a spatial Durbin model with an inverse-distance weighting matrix that assigned each respondent a unique level of "exposure" to social capital based on all other residents' perceptions. The study is based on a postal questionnaire sent to residents of Arakawa Ward, Tokyo, aged 20-69 years. The response rate was 43.7%. We examined the contextual influence of generalized trust, perceptions of reciprocity, two types of social network variables, as well as two principal components of social capital (constructed from the above four variables). Our outcome measure was self-reported crime victimization in the last five years. In the spatial Durbin model, we found that neighborhood generalized trust, reciprocity, supportive networks and two principal components of social capital were each inversely associated with crime victimization. By contrast, a multilevel regression performed with the same data (using administrative neighborhood boundaries) found generally null associations between neighborhood social capital and crime. Spatial regression methods may be more appropriate for investigating the contextual influence of social capital in homogeneous cultural settings such as Japan.
Copyright © 2012 Elsevier Ltd. All rights reserved.
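    The "exposure" construction described above can be illustrated with a row-standardized inverse-distance weight matrix. The sketch below covers only that step, not the estimation of the spatial Durbin model, and it assumes plain Euclidean distances between distinct respondent locations. The function names, coordinates and scores are hypothetical.

```python
import math


def inverse_distance_weights(coords):
    """Build a row-standardized inverse-distance weight matrix.

    The weight of respondent j for respondent i is proportional to 1 / d_ij;
    the diagonal is zero and every row sums to one. Coincident points would
    cause a division by zero and are assumed not to occur.
    """
    n = len(coords)
    matrix = []
    for i in range(n):
        row = [0.0 if i == j else 1.0 / math.dist(coords[i], coords[j])
               for j in range(n)]
        total = sum(row)
        matrix.append([w / total for w in row])
    return matrix


def spatial_lag(weights, values):
    """Weighted average of all other respondents' values, one per respondent."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


if __name__ == "__main__":
    coords = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]  # hypothetical locations
    trust = [10.0, 20.0, 40.0]                     # hypothetical trust scores
    print(spatial_lag(inverse_distance_weights(coords), trust))
```

    Each respondent therefore receives a different exposure value even within the same administrative unit, which is the property the abstract contrasts with fixed neighborhood boundaries.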

  9. Applying spatial regression to evaluate risk factors for microbiological contamination of urban groundwater sources in Juba, South Sudan

    NASA Astrophysics Data System (ADS)

    Engström, Emma; Mörtberg, Ulla; Karlström, Anders; Mangold, Mikael

    2017-06-01

    This study developed methodology for statistically assessing groundwater contamination mechanisms. It focused on microbial water pollution in low-income regions. Risk factors for faecal contamination of groundwater-fed drinking-water sources were evaluated in a case study in Juba, South Sudan. The study was based on counts of thermotolerant coliforms in water samples from 129 sources, collected by the humanitarian aid organisation Médecins Sans Frontières in 2010. The factors included hydrogeological settings, land use and socio-economic characteristics. The results showed that the residuals of a conventional probit regression model had a significant positive spatial autocorrelation (Moran's I = 3.05, I-stat = 9.28); therefore, a spatial model was developed that had better goodness-of-fit to the observations. The most significant factor in this model (p-value 0.005) was the distance from a water source to the nearest Tukul area, an area with informal settlements that lack sanitation services. It is thus recommended that future remediation and monitoring efforts in the city be concentrated in such low-income regions. The spatial model differed from the conventional approach: in contrast with the latter case, lowland topography was not significant at the 5% level, as the p-value was 0.074 in the spatial model and 0.040 in the traditional model. This study showed that statistical risk-factor assessments of groundwater contamination need to consider spatial interactions when the water sources are located close to each other. Future studies might further investigate the cut-off distance that reflects spatial autocorrelation. In particular, these results inform research on urban groundwater quality.
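    For readers unfamiliar with the residual check mentioned above, the following is a minimal sketch of the standard global Moran's I statistic. It assumes a simple binary neighbour matrix on a chain of sites rather than the distance-based weights a groundwater study would use, and it does not compute the significance test. The function names and the data are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


def chain_weights(n):
    """Binary weights linking each site only to its immediate neighbours on a line."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    residuals = [1.0, 2.0, 3.0, 4.0]  # hypothetical model residuals
    print(morans_i(residuals, chain_weights(len(residuals))))
```

    Values that rise steadily along the chain give a positive statistic (similar neighbours), while strictly alternating values give a negative one.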

  10. Explorative spatial analysis of traffic accident statistics and road mortality among the provinces of Turkey.

    PubMed

    Erdogan, Saffet

    2009-10-01

    The aim of the study is to describe the inter-province differences in traffic accidents and mortality on roads of Turkey. Two different risk indicators were used to evaluate the road safety performance of the provinces in Turkey. These indicators are the ratios between the number of persons killed in road traffic accidents (1) and the number of accidents (2) (numerators) and their exposure to traffic risk (denominator). Population and the number of registered motor vehicles in the provinces were used as denominators individually. Spatial analyses were performed on the mean annual rate of deaths and on the number of fatal accidents that were calculated for the period of 2001-2006. Empirical Bayes smoothing was used to remove background noise from the raw death and accident rates because of the sparsely populated provinces and the small numbers of accidents and deaths in some provinces. Global and local spatial autocorrelation analyses were performed to show whether the provinces with high rates of deaths and accidents show clustering or are located close together by chance. The spatial distribution of provinces with high rates of deaths and accidents was nonrandom and detected as clustered with significance of P<0.05 with spatial autocorrelation analyses. Regions with high concentration of fatal accidents and deaths were located in the provinces that contain the roads connecting the Istanbul, Ankara, and Antalya provinces. Accident and death rates were also modeled with some independent variables such as number of motor vehicles, length of roads, and so forth using geographically weighted regression analysis with forward step-wise elimination. The level of statistical significance was taken as P<0.05. Large differences were found between the rates of deaths and accidents according to denominators in the provinces.
The geographically weighted regression analyses produced significantly better predictions for both accident rates and death rates than did ordinary least squares regressions, as indicated by adjusted R(2) values. Geographically weighted regression provided values of 0.89-0.99 adjusted R(2) for death and accident rates, compared with 0.88-0.95, respectively, by ordinary least squares regressions. Geographically weighted regression has the potential to reveal local patterns in the spatial distribution of rates, which would be ignored by the ordinary least squares regression approach. The application of spatial analysis and modeling of accident statistics and death rates at provincial level in Turkey will help to identify provinces with outstandingly high accident and death rates. This could support more efficient road safety management in Turkey.

  11. Evaluating Bayesian spatial methods for modelling species distributions with clumped and restricted occurrence data.

    PubMed

    Redding, David W; Lucas, Tim C D; Blackburn, Tim M; Jones, Kate E

    2017-01-01

    Statistical approaches for inferring the spatial distribution of taxa (Species Distribution Models, SDMs) commonly rely on available occurrence data, which is often clumped and geographically restricted. Although available SDM methods address some of these factors, they could be more directly and accurately modelled using a spatially-explicit approach. Software to fit models with spatial autocorrelation parameters in SDMs is now widely available, but whether such approaches for inferring SDMs aid predictions compared to other methodologies is unknown. Here, within a simulated environment using 1000 generated species' ranges, we compared the performance of two commonly used non-spatial SDM methods (Maximum Entropy Modelling, MAXENT and boosted regression trees, BRT), to a spatial Bayesian SDM method (fitted using R-INLA), when the underlying data exhibit varying combinations of clumping and geographic restriction. Finally, we tested how any recommended methodological settings designed to account for spatially non-random patterns in the data impact inference. The spatial Bayesian SDM method was the most consistently accurate method, being in the top 2 most accurate methods in 7 out of 8 data sampling scenarios. Within high-coverage sample datasets, all methods performed fairly similarly. When sampling points were randomly spread, BRT had a 1-3% greater accuracy over the other methods and when samples were clumped, the spatial Bayesian SDM method had a 4%-8% better AUC score. Alternatively, when sampling points were restricted to a small section of the true range all methods were on average 10-12% less accurate, with greater variation among the methods. Model inference under the recommended settings to account for autocorrelation was not impacted by clumping or restriction of data, except for the complexity of the spatial regression term in the spatial Bayesian model. Methods, such as those made available by R-INLA, can be successfully used to account for spatial autocorrelation in an SDM context and, by taking account of random effects, produce outputs that can better elucidate the role of covariates in predicting species occurrence. Given that it is often unclear what the drivers are behind data clumping in an empirical occurrence dataset, or indeed how geographically restricted these data are, spatially-explicit Bayesian SDMs may be the better choice when modelling the spatial distribution of target species.

  12. Student Moon Observations and Spatial-Scientific Reasoning

    ERIC Educational Resources Information Center

    Cole, Merryn; Wilhelm, Jennifer; Yang, Hongwei

    2015-01-01

    Relationships between sixth grade students' moon journaling and students' spatial-scientific reasoning after implementation of an Earth/Space unit were examined. Teachers used the project-based Realistic Explorations in Astronomical Learning curriculum. We used a regression model to analyze the relationship between the students' Lunar Phases…

  13. The geography of recreational open space: influence of neighborhood racial composition and neighborhood poverty.

    PubMed

    Duncan, Dustin T; Kawachi, Ichiro; White, Kellee; Williams, David R

    2013-08-01

    The geography of recreational open space might be inequitable in terms of minority neighborhood racial/ethnic composition and neighborhood poverty, perhaps due in part to residential segregation. This study evaluated the association between minority neighborhood racial/ethnic composition, neighborhood poverty, and recreational open space in Boston, Massachusetts (US). Across Boston census tracts, we computed percent non-Hispanic Black, percent Hispanic, and percent families in poverty as well as recreational open space density. We evaluated spatial autocorrelation in study variables and in the ordinary least squares (OLS) regression residuals via the Global Moran's I. We then computed Spearman correlations between the census tract socio-demographic characteristics and recreational open space density, including correlations adjusted for spatial autocorrelation. After this, we computed OLS regressions or spatial regressions as appropriate. Significant positive spatial autocorrelation was found for neighborhood socio-demographic characteristics (all p value = 0.001). We found marginally significant positive spatial autocorrelation in recreational open space (Global Moran's I = 0.082; p value = 0.053). However, we found no spatial autocorrelation in the OLS regression residuals, which indicated that spatial models were not appropriate. There was a negative correlation between census tract percent non-Hispanic Black and recreational open space density (r_s = -0.22; conventional p value = 0.005; spatially adjusted p value = 0.019) as well as a negative correlation between predominantly non-Hispanic Black census tracts (>60 % non-Hispanic Black in a census tract) and recreational open space density (r_s = -0.23; conventional p value = 0.003; spatially adjusted p value = 0.007). In bivariate and multivariate OLS models, percent non-Hispanic Black in a census tract and predominantly Black census tracts were associated with decreased density of recreational open space (p value < 0.001). Consistent with several previous studies in other geographic locales, we found that Black neighborhoods in Boston were less likely to have recreational open spaces, indicating the need for policy interventions promoting equitable access. Such interventions may contribute to reductions and disparities in obesity.

  14. Variable selection and model choice in geoadditive regression models.

    PubMed

    Kneib, Thomas; Hothorn, Torsten; Tutz, Gerhard

    2009-06-01

    Model choice and variable selection are issues of major concern in practical regression analyses, arising in many biometric applications such as habitat suitability analyses, where the aim is to identify the influence of potentially many environmental conditions on certain species. We describe regression models for breeding bird communities that facilitate both model choice and variable selection, by a boosting algorithm that works within a class of geoadditive regression models comprising spatial effects, nonparametric effects of continuous covariates, interaction surfaces, and varying coefficients. The major modeling components are penalized splines and their bivariate tensor product extensions. All smooth model terms are represented as the sum of a parametric component and a smooth component with one degree of freedom to obtain a fair comparison between the model terms. A generic representation of the geoadditive model allows us to devise a general boosting algorithm that automatically performs model choice and variable selection.

  15. Field Scale Spatial Modelling of Surface Soil Quality Attributes in Controlled Traffic Farming

    NASA Astrophysics Data System (ADS)

    Guenette, Kris; Hernandez-Ramirez, Guillermo

    2017-04-01

    The employment of controlled traffic farming (CTF) can yield improvements to soil quality attributes through the confinement of equipment traffic to tramlines within the field. There is a need to quantify and explain the spatial heterogeneity of soil quality attributes affected by CTF to further improve our understanding and modelling ability of field scale soil dynamics. Soil properties such as available nitrogen (AN), pH, soil total nitrogen (STN), soil organic carbon (SOC), bulk density, macroporosity, soil quality S-Index, plant available water capacity (PAWC) and unsaturated hydraulic conductivity (Km) were analysed and compared among trafficked and un-trafficked areas. We contrasted standard geostatistical methods such as ordinary kriging (OK) and covariate kriging (COK) as well as the hybrid method of regression kriging (ROK) to predict the spatial distribution of soil properties across two annual cropland sites actively employing CTF in Alberta, Canada. Field scale variability was quantified more accurately through the inclusion of covariates; however, the use of ROK was shown to improve model accuracy despite the regression model composition limiting the robustness of the ROK method. The exclusion of traffic from the un-trafficked areas displayed significant improvements to bulk density, macroporosity and Km while subsequently enhancing AN, STN and SOC. The ability of the regression models and the ROK method to account for spatial trends led to the highest goodness-of-fit and lowest error achieved for the soil physical properties, as the rigid traffic regime of CTF altered their spatial distribution at the field scale. Conversely, the COK method produced the best predictions for the soil nutrient properties and Km. The use of terrain covariates derived from light detection and ranging (LiDAR), such as elevation and topographic position index (TPI), yielded the best models in the COK method at the field scale.

  16. Demand-supply dynamics in tourism systems: A spatio-temporal GIS analysis. The Alberta ski industry case study

    NASA Astrophysics Data System (ADS)

    Bertazzon, Stefania

    The present research focuses on the interaction of supply and demand of down-hill ski tourism in the province of Alberta. The main hypothesis is that the demand for skiing depends on the socio-economic and demographic characteristics of the population living in the province and outside it. A second, consequent hypothesis is that the development of ski resorts (supply) is a response to the demand for skiing. From the latter derives the hypothesis of a dynamic interaction between supply (ski resorts) and demand (skiers). Such interaction occurs in space, within a range determined by physical distance and the means available to overcome it. The above hypotheses implicitly define interactions that take place in space and evolve over time. The hypotheses are tested by temporal, spatial, and spatio-temporal regression models, using the best available data and the latest commercially available software. The main purpose of this research is to explore analytical techniques to model spatial, temporal, and spatio-temporal dynamics in the context of regional science. The completion of the present research has produced more significant contributions than was originally expected. Many of the unexpected contributions resulted from theoretical and applied needs arising from the application of spatial regression models. Spatial regression models are a new and largely under-applied technique. The models are fairly complex and a considerable amount of preparatory work is needed, prior to their specification and estimation. Most of this work is specific to the field of application. The originality of the solutions devised is increased by the lack of applications in the field of tourism. The scarcity of applications in other fields adds to their value for other applications. The estimation of spatio-temporal models has been only partially attained in the present research. This apparent limitation is due to the novelty and complexity of the analytical methods applied. 
    This opens new directions for further work in the field of spatial analysis, in conjunction with the development of specific software.

  17. Latin hypercube approach to estimate uncertainty in ground water vulnerability

    USGS Publications Warehouse

    Gurdak, J.J.; McCray, J.E.; Thyne, G.; Qi, S.L.

    2007-01-01

    A methodology is proposed to quantify prediction uncertainty associated with ground water vulnerability models that were developed through an approach that coupled multivariate logistic regression with a geographic information system (GIS). This method uses Latin hypercube sampling (LHS) to illustrate the propagation of input error and estimate uncertainty associated with the logistic regression predictions of ground water vulnerability. Central to the proposed method is the assumption that prediction uncertainty in ground water vulnerability models is a function of input error propagation from uncertainty in the estimated logistic regression model coefficients (model error) and the values of explanatory variables represented in the GIS (data error). Input probability distributions that represent both model and data error sources of uncertainty were simultaneously sampled using a Latin hypercube approach with logistic regression calculations of probability of elevated nonpoint source contaminants in ground water. The resulting probability distribution represents the prediction intervals and associated uncertainty of the ground water vulnerability predictions. The method is illustrated through a ground water vulnerability assessment of the High Plains regional aquifer. Results of the LHS simulations reveal significant prediction uncertainties that vary spatially across the regional aquifer. Additionally, the proposed method enables a spatial deconstruction of the prediction uncertainty that can lead to improved prediction of ground water vulnerability. © 2007 National Ground Water Association.
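
    As a rough illustration of the idea in this abstract, the sketch below draws a Latin hypercube sample over two logistic regression coefficients (model error) and one explanatory variable (data error) and propagates it to a predicted probability. All means, standard deviations and the assumption of independent normal inputs are hypothetical choices for the example, not values or methods taken from the study.

```python
import numpy as np
from statistics import NormalDist

def latin_hypercube(n_samples, n_dims, rng):
    """Uniform(0, 1) sample with exactly one draw per equal-width stratum in each dimension."""
    # One random point inside each of the n_samples strata, per dimension.
    u = (np.arange(n_samples)[:, None] + rng.random((n_samples, n_dims))) / n_samples
    # Shuffle the strata independently in every dimension.
    for j in range(n_dims):
        u[:, j] = rng.permutation(u[:, j])
    return u

rng = np.random.default_rng(0)
u = latin_hypercube(1000, 3, rng)

# Hypothetical input distributions: intercept, slope (model error) and x (data error).
means = np.array([-2.0, 0.8, 1.5])
sds = np.array([0.3, 0.1, 0.4])

# Map the uniform draws to normal draws through the inverse CDF.
inv_cdf = np.vectorize(NormalDist().inv_cdf)
draws = means + sds * inv_cdf(np.clip(u, 1e-12, 1 - 1e-12))

# Propagate every draw through the logistic model.
b0, b1, x = draws.T
prob = 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))

# Empirical 95% interval of the predicted probability at this one location.
lo, hi = np.percentile(prob, [2.5, 97.5])
print("95% prediction interval:", lo, hi)
```

    Repeating the calculation for every map cell would give the kind of spatially varying uncertainty the abstract refers to.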

  18. Local-scale spatial modelling for interpolating climatic temperature variables to predict agricultural plant suitability

    NASA Astrophysics Data System (ADS)

    Webb, Mathew A.; Hall, Andrew; Kidd, Darren; Minasny, Budiman

    2016-05-01

    Assessment of local spatial climatic variability is important in the planning of planting locations for horticultural crops. This study investigated three regression-based calibration methods (i.e. traditional versus two optimized methods) to relate short-term 12-month data series from 170 temperature loggers and 4 weather station sites with data series from nearby long-term Australian Bureau of Meteorology climate stations. The techniques trialled to interpolate climatic temperature variables, such as frost risk, growing degree days (GDDs) and chill hours, were regression kriging (RK), regression trees (RTs) and random forests (RFs). All three calibration methods produced accurate results, with the RK-based calibration method delivering the most accurate validation measures: coefficients of determination (R²) of 0.92, 0.97 and 0.95 and root-mean-square errors of 1.30, 0.80 and 1.31 °C, for daily minimum, daily maximum and hourly temperatures, respectively. Compared with the traditional method of calibration using direct linear regression between short-term and long-term stations, the RK-based calibration method improved R² and reduced root-mean-square error (RMSE) by at least 5 % and 0.47 °C for daily minimum temperature, 1 % and 0.23 °C for daily maximum temperature and 3 % and 0.33 °C for hourly temperature. Spatial modelling indicated insignificant differences between the interpolation methods, with the RK technique tending to be the slightly better method due to the high degree of spatial autocorrelation between logger sites.

  19. Predicting the occurrence of wildfires with binary structured additive regression models.

    PubMed

    Ríos-Pena, Laura; Kneib, Thomas; Cadarso-Suárez, Carmen; Marey-Pérez, Manuel

    2017-02-01

    Wildfires are one of the main environmental problems facing societies today, and in the case of Galicia (north-west Spain), they are the main cause of forest destruction. This paper used binary structured additive regression (STAR) for modelling the occurrence of wildfires in Galicia. Binary STAR models are a recent contribution to the classical logistic regression and binary generalized additive models. Their main advantage lies in their flexibility for modelling non-linear effects, while simultaneously incorporating spatial and temporal variables directly, thereby making it possible to reveal possible relationships among the variables considered. The results showed that the occurrence of wildfires depends on many covariates which display variable behaviour across space and time, and which largely determine the likelihood of ignition of a fire. The joint possibility of working on spatial scales with a resolution of 1 × 1 km cells and mapping predictions in a colour range makes STAR models a useful tool for plotting and predicting wildfire occurrence. Lastly, it will facilitate the development of fire behaviour models, which can be invaluable when it comes to drawing up fire-prevention and firefighting plans. Copyright © 2016 Elsevier Ltd. All rights reserved.

  20. Robust geographically weighted regression of modeling the Air Polluter Standard Index (APSI)

    NASA Astrophysics Data System (ADS)

    Warsito, Budi; Yasin, Hasbi; Ispriyanti, Dwi; Hoyyi, Abdul

    2018-05-01

    The Geographically Weighted Regression (GWR) model has been widely applied in many practical fields to explore the spatial heterogeneity of a regression model. However, this method is inherently not robust to outliers. Outliers commonly exist in data sets and may lead to a distorted estimate of the underlying regression model. One solution for handling outliers in the regression model is to use robust models; the resulting model is called Robust Geographically Weighted Regression (RGWR). This research aims to aid the government in the policy making process related to air pollution mitigation by developing a standard index model for air polluter (Air Polluter Standard Index - APSI) based on the RGWR approach. In this research, we also consider seven variables that are directly related to the air pollution level, which are the traffic velocity, the population density, the business center aspect, the air humidity, the wind velocity, the air temperature, and the area size of the urban forest. The best model is determined by the smallest AIC value. There are significant differences between Regression and RGWR in this case, but Basic GWR using the Gaussian kernel is the best model for modeling APSI because it has the smallest AIC.
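
    For readers unfamiliar with basic GWR with a Gaussian kernel, the following is a minimal sketch that fits one weighted least-squares regression per location. The synthetic data, the fixed bandwidth and the function name are assumptions made for illustration; the robust (RGWR) variant and the AIC-based model comparison from the abstract are not implemented here.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Local regression coefficients (intercept first) at every observation location."""
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    X1 = np.column_stack([np.ones(len(y)), X])      # design matrix with intercept
    betas = np.empty((len(y), X1.shape[1]))
    for i, c in enumerate(coords):
        d = np.linalg.norm(coords - c, axis=1)      # distances to location i
        w = np.exp(-0.5 * (d / bandwidth) ** 2)     # Gaussian kernel weights
        XtW = X1.T * w                              # X' W without forming a diagonal matrix
        betas[i] = np.linalg.solve(XtW @ X1, XtW @ y)
    return betas

# Synthetic example: the slope of x grows from west to east.
rng = np.random.default_rng(1)
coords = rng.uniform(0, 10, size=(200, 2))
x = rng.normal(size=200)
true_slope = 1 + 0.5 * coords[:, 0]
y = 2 + true_slope * x + rng.normal(scale=0.1, size=200)

betas = gwr_coefficients(coords, x, y, bandwidth=1.5)
print("local slope range:", betas[:, 1].min(), betas[:, 1].max())
```

    With a very large bandwidth all weights approach one and every local fit collapses to the ordinary global regression, which is a convenient sanity check.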

  1. Spatial distribution of psychotic disorders in an urban area of France: an ecological study.

    PubMed

    Pignon, Baptiste; Schürhoff, Franck; Baudin, Grégoire; Ferchiou, Aziz; Richard, Jean-Romain; Saba, Ghassen; Leboyer, Marion; Kirkbride, James B; Szöke, Andrei

    2016-05-18

    Previous analyses of neighbourhood variations of non-affective psychotic disorders (NAPD) have focused mainly on incidence. However, prevalence studies provide important insights on factors associated with disease evolution as well as for healthcare resource allocation. This study aimed to investigate the distribution of prevalent NAPD cases in an urban area in France. The number of cases in each neighbourhood was modelled as a function of potential confounders and ecological variables, namely: migrant density, economic deprivation and social fragmentation. This was modelled using statistical models of increasing complexity: frequentist models (using Poisson and negative binomial regressions), and several Bayesian models. For each model, the validity of its assumptions was checked and its fit to the data was compared, in order to test for possible spatial variation in prevalence. Data showed significant overdispersion (invalidating the Poisson regression model) and residual autocorrelation (suggesting the need to use Bayesian models). The best Bayesian model was Leroux's model (i.e. a model with both strong correlation between neighbouring areas and weaker correlation between areas further apart), with economic deprivation as an explanatory variable (OR = 1.13, 95% CI [1.02-1.25]). In comparison with frequentist methods, the Bayesian model showed a better fit. The number of cases showed non-random spatial distribution and was linked to economic deprivation.

  2. High Incidence of Breast Cancer in Light-Polluted Areas with Spatial Effects in Korea.

    PubMed

    Kim, Yun Jeong; Park, Man Sik; Lee, Eunil; Choi, Jae Wook

    2016-01-01

    We have reported a high prevalence of breast cancer in light-polluted areas in Korea. However, it is necessary to analyze the spatial effects of light polluted areas on breast cancer because light pollution levels are correlated with region proximity to central urbanized areas in studied cities. In this study, we applied a spatial regression method (an intrinsic conditional autoregressive [iCAR] model) to analyze the relationship between the incidence of breast cancer and artificial light at night (ALAN) levels in 25 regions including central city, urbanized, and rural areas. By Poisson regression analysis, there was a significant correlation between ALAN, alcohol consumption rates, and the incidence of breast cancer. We also found significant spatial effects between ALAN and the incidence of breast cancer, with a decrease in the deviance information criterion (DIC) from 374.3 to 348.6 and an increase in R² from 0.574 to 0.667. Therefore, spatial analysis (an iCAR model) is more appropriate for assessing ALAN effects on breast cancer. To our knowledge, this study is the first to show spatial effects of light pollution on breast cancer, despite the limitations of an ecological study. We suggest that a decrease in ALAN could reduce breast cancer more than expected because of spatial effects.

  3. Drought Patterns Forecasting using an Auto-Regressive Logistic Model

    NASA Astrophysics Data System (ADS)

    del Jesus, M.; Sheffield, J.; Méndez Incera, F. J.; Losada, I. J.; Espejo, A.

    2014-12-01

    Drought is characterized by a water deficit that may manifest across a large range of spatial and temporal scales. Drought may create important socio-economic consequences, many times of catastrophic dimensions. A quantifiable definition of drought is elusive because depending on its impacts, consequences and generation mechanism, different water deficit periods may be identified as a drought by virtue of some definitions but not by others. Droughts are linked to the water cycle and, although a climate change signal may not have emerged yet, they are also intimately linked to climate. In this work we develop an auto-regressive logistic model for drought prediction at different temporal scales that makes use of a spatially explicit framework. Our model allows covariates, continuous or categorical, to be included to improve the performance of the auto-regressive component. Our approach makes use of dimensionality reduction (principal component analysis) and classification techniques (K-Means and maximum dissimilarity) to simplify the representation of complex climatic patterns, such as sea surface temperature (SST) and sea level pressure (SLP), while including information on their spatial structure, i.e. considering their spatial patterns. This procedure allows us to include in the analysis multivariate representations of complex climatic phenomena, such as the El Niño-Southern Oscillation. We also explore the impact of other climate-related variables such as sun spots. The model allows the uncertainty of the forecasts to be quantified and can be easily adapted to make predictions under future climatic scenarios. The framework herein presented may be extended to other applications such as flash flood analysis, or risk assessment of natural hazards.

  4. Investigation of the marked and long-standing spatial inhomogeneity of the Hungarian suicide rate: a spatial regression approach.

    PubMed

    Balint, Lajos; Dome, Peter; Daroczi, Gergely; Gonda, Xenia; Rihmer, Zoltan

    2014-02-01

    In the last century Hungary had astonishingly high suicide rates characterized by marked regional within-country inequalities, a spatial pattern which has been quite stable over time. We aimed to explain the above phenomenon at the level of micro-regions (n=175) in the period between 2005 and 2011. Our dependent variable was the age and gender standardized mortality ratio (SMR) for suicide while explanatory variables were factors which are supposed to influence suicide risk, such as measures of religious and political integration, travel time accessibility of psychiatric services, alcohol consumption, unemployment and disability pensionery. When applying the ordinary least squares regression model, the residuals were found to be spatially autocorrelated, which indicates the violation of the assumption on the independence of error terms and, accordingly, the necessity of applying a spatial autoregressive (SAR) model to handle this problem. According to our calculations the SARlag model was a better way (versus the SARerr model) of addressing the problem of spatial autocorrelation; furthermore, its substantive meaning is more convenient. SMR was significantly associated with the "political integration" variable in a negative and with "lack of religious integration" and "disability pensionery" variables in a positive manner. Associations were not significant for the remaining explanatory variables. Several important psychiatric variables were not available at the level of micro-regions. We conducted our analysis on aggregate data. Our results may draw attention to the relevance and abiding validity of the classic Durkheimian suicide risk factors - such as lack of social integration - apropos of the spatial pattern of Hungarian suicides. © 2013 Published by Elsevier B.V.

  5. Systems and methods for knowledge discovery in spatial data

    DOEpatents

    Obradovic, Zoran; Fiez, Timothy E.; Vucetic, Slobodan; Lazarevic, Aleksandar; Pokrajac, Dragoljub; Hoskinson, Reed L.

    2005-03-08

    Systems and methods are provided for knowledge discovery in spatial data as well as to systems and methods for optimizing recipes used in spatial environments such as may be found in precision agriculture. A spatial data analysis and modeling module is provided which allows users to interactively and flexibly analyze and mine spatial data. The spatial data analysis and modeling module applies spatial data mining algorithms through a number of steps. The data loading and generation module obtains or generates spatial data and allows for basic partitioning. The inspection module provides basic statistical analysis. The preprocessing module smoothes and cleans the data and allows for basic manipulation of the data. The partitioning module provides for more advanced data partitioning. The prediction module applies regression and classification algorithms on the spatial data. The integration module enhances prediction methods by combining and integrating models. The recommendation module provides the user with site-specific recommendations as to how to optimize a recipe for a spatial environment such as a fertilizer recipe for an agricultural field.

  6. Digital Hydrologic Networks Supporting Applications Related to Spatially Referenced Regression Modeling

    USGS Publications Warehouse

    Brakebill, J.W.; Wolock, D.M.; Terziotti, S.E.

    2011-01-01

    Digital hydrologic networks depicting surface-water pathways and their associated drainage catchments provide a key component to hydrologic analysis and modeling. Collectively, they form common spatial units that can be used to frame the descriptions of aquatic and watershed processes. In addition, they provide the ability to simulate and route the movement of water and associated constituents throughout the landscape. Digital hydrologic networks have evolved from derivatives of mapping products to detailed, interconnected, spatially referenced networks of water pathways, drainage areas, and stream and watershed characteristics. These properties are important because they enhance the ability to spatially evaluate factors that affect the sources and transport of water-quality constituents at various scales. SPAtially Referenced Regressions On Watershed attributes (SPARROW), a process-based/statistical model, relies on a digital hydrologic network in order to establish relations between quantities of monitored contaminant flux, contaminant sources, and the associated physical characteristics affecting contaminant transport. Digital hydrologic networks modified from the River Reach File (RF1) and National Hydrography Dataset (NHD) geospatial datasets provided frameworks for SPARROW in six regions of the conterminous United States. In addition, characteristics of the modified RF1 were used to update estimates of mean-annual streamflow. This produced more current flow estimates for use in SPARROW modeling. © 2011 American Water Resources Association. This article is a U.S. Government work and is in the public domain in the USA.

  7. Predictive landslide susceptibility mapping using spatial information in the Pechabun area of Thailand

    NASA Astrophysics Data System (ADS)

    Oh, Hyun-Joo; Lee, Saro; Chotikasathien, Wisut; Kim, Chang Hwan; Kwon, Ju Hyoung

    2009-04-01

    For predictive landslide susceptibility mapping, this study applied and verified a probability model (the frequency ratio) and a statistical model (logistic regression) at Pechabun, Thailand, using a geographic information system (GIS) and remote sensing. Landslide locations were identified in the study area from interpretation of aerial photographs and field surveys, and maps of the topography, geology and land cover were compiled into a spatial database. The factors that influence landslide occurrence, such as slope gradient, slope aspect and curvature of topography and distance from drainage, were calculated from the topographic database. Lithology and distance from fault were extracted and calculated from the geology database. Land cover was classified from a Landsat TM satellite image. The frequency ratio and logistic regression coefficient were overlaid for landslide susceptibility mapping as each factor’s ratings. Then the landslide susceptibility map was verified and compared using the existing landslide locations. In the verification, the frequency ratio model showed 76.39% and the logistic regression model showed 70.42% prediction accuracy. The method can be used to reduce hazards associated with landslides and to plan land cover.
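
    The frequency ratio used in this abstract can be sketched as follows, under the common definition that a class's ratio is its share of landslide cells divided by its share of all cells. The tiny raster of slope classes and landslide flags is hypothetical and only meant to show the arithmetic.

```python
import numpy as np

def frequency_ratio(factor_class, landslide):
    """Frequency ratio per class: share of landslide cells / share of all cells."""
    factor_class = np.asarray(factor_class)
    landslide = np.asarray(landslide, dtype=bool)
    ratios = {}
    for c in np.unique(factor_class):
        in_class = factor_class == c
        share_landslide = landslide[in_class].sum() / landslide.sum()
        share_area = in_class.sum() / factor_class.size
        ratios[c.item()] = float(share_landslide / share_area)
    return ratios

# Hypothetical data: slope class (0 = gentle, 1 = moderate, 2 = steep) for ten cells
# and whether a mapped landslide falls in each cell.
slope_class = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2])
landslide = np.array([0, 0, 0, 1, 0, 1, 0, 1, 1, 1])

fr = frequency_ratio(slope_class, landslide)
print(fr)
```

    A ratio above 1 marks a class in which landslides are over-represented relative to its area; summing the ratios of each cell's classes across factors is one common way to form a susceptibility index.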

  8. Mathematical models application for mapping soils spatial distribution on the example of the farm from the North of Udmurt Republic of Russia

    NASA Astrophysics Data System (ADS)

    Dokuchaev, P. M.; Meshalkina, J. L.; Yaroslavtsev, A. M.

    2018-01-01

    A comparative analysis of geospatial soil modeling using multinomial logistic regression, decision tree, random forest, regression tree and support vector machine algorithms was conducted. Visual interpretation of the resulting digital maps and their comparison with the existing map, together with quantitative assessment of the overall detection accuracy for individual soil groups and of the models' kappa, showed that the multiple logistic regression, support vector machine and random forest models, applied to spatial prediction of the distribution of the conditional soil groups, can be reliably used for mapping the study area. Detection was most accurate for lightly eroded and moderately eroded sod-podzolic soils (Phaeozems Albic). In second place, according to the mean overall accuracy of the prediction, were non-eroded and warp sod-podzolic soils, as well as sod-gley soils (Umbrisols Gleyic) and alluvial soils (Fluvisols Dystric, Umbric). Heavily eroded sod-podzolic soils and gray forest soils (Phaeozems Albic) were detected worst of all by the automatic classification methods.

  9. Remote sensing and GIS-based landslide hazard analysis and cross-validation using multivariate logistic regression model on three test areas in Malaysia

    NASA Astrophysics Data System (ADS)

    Pradhan, Biswajeet

    2010-05-01

    This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of the application of logistic regression coefficients in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross application of logistic regression coefficients in the other two areas, the case of Selangor based on the logistic coefficients of Cameron showed the highest (90%) prediction accuracy, whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross application model yields reasonable results which can be used for preliminary landslide hazard mapping.
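
    The cross-application idea described above (scoring one area with coefficients fitted in another) can be sketched as follows. This is only an illustration of the logistic model form; the factor names, coefficient values and intercepts are hypothetical and are not the values reported in the paper.

```python
import math


def landslide_probability(factors, coefficients, intercept):
    """Logistic model: p = 1 / (1 + exp(-z)), with z = intercept + sum(b_i * x_i)."""
    z = intercept + sum(coefficients[name] * value for name, value in factors.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients fitted in two different areas.
coef_area_a = {"slope": 0.08, "ndvi": -1.5}
coef_area_b = {"slope": 0.05, "ndvi": -2.0}

# One hypothetical grid cell located in area A.
cell = {"slope": 30.0, "ndvi": 0.4}

# "Own" application: area A cell scored with area A coefficients.
p_own = landslide_probability(cell, coef_area_a, intercept=-2.0)
# "Cross" application: the same cell scored with area B coefficients.
p_cross = landslide_probability(cell, coef_area_b, intercept=-2.0)
```

    Comparing maps built from `p_own` and `p_cross` against field-verified landslide locations is the kind of check the abstract reports as own-area versus cross-area accuracy.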

  10. [Mapping environmental vulnerability from ETM + data in the Yellow River Mouth Area].

    PubMed

    Wang, Rui-Yan; Yu, Zhen-Wen; Xia, Yan-Ling; Wang, Xiang-Feng; Zhao, Geng-Xing; Jiang, Shu-Qian

    2013-10-01

    Environmental vulnerability retrieval is important to support continuing data. The spatial distribution of regional environmental vulnerability was obtained through remote sensing retrieval. In view of soil and vegetation, an environmental vulnerability evaluation index system was built, and the environmental vulnerability of sampling points was calculated by the AHP-fuzzy method; then the correlation between the sampling points' environmental vulnerability and the ETM+ spectral reflectance ratio, including some kinds of converted data, was analyzed to determine the sensitive spectral parameters. Based on that, models of correlation analysis, traditional regression, BP neural network and support vector regression were used to explain the quantitative relationship between the spectral reflectance and the environmental vulnerability. With this model, the environmental vulnerability distribution was retrieved in the Yellow River Mouth Area. The results showed that the correlations between the environmental vulnerability and the spring NDVI, the September NDVI and the spring brightness were better than the others, so these were selected as the sensitive spectral parameters. The model precision results showed that, in addition to the support vector model, the other models reached the significant level. All the multi-variable regressions were better than all the one-variable regressions, and the model accuracy of the BP neural network was the best. This study will serve as a reliable theoretical reference for large spatial scale environmental vulnerability estimation based on remote sensing data.

  11. Trees grow on money: urban tree canopy cover and environmental justice.

    PubMed

    Schwarz, Kirsten; Fragkias, Michail; Boone, Christopher G; Zhou, Weiqi; McHale, Melissa; Grove, J Morgan; O'Neil-Dunne, Jarlath; McFadden, Joseph P; Buckley, Geoffrey L; Childers, Dan; Ogden, Laura; Pincetl, Stephanie; Pataki, Diane; Whitmer, Ali; Cadenasso, Mary L

    2015-01-01

    This study examines the distributional equity of urban tree canopy (UTC) cover for Baltimore, MD, Los Angeles, CA, New York, NY, Philadelphia, PA, Raleigh, NC, Sacramento, CA, and Washington, D.C. using high spatial resolution land cover data and census data. Data are analyzed at the Census Block Group levels using Spearman's correlation, ordinary least squares regression (OLS), and a spatial autoregressive model (SAR). Across all cities there is a strong positive correlation between UTC cover and median household income. Negative correlations between race and UTC cover exist in bivariate models for some cities, but they are generally not observed using multivariate regressions that include additional variables on income, education, and housing age. SAR models result in higher r-square values compared to the OLS models across all cities, suggesting that spatial autocorrelation is an important feature of our data. Similarities among cities can be found based on shared characteristics of climate, race/ethnicity, and size. Our findings suggest that a suite of variables, including income, contribute to the distribution of UTC cover. These findings can help target simultaneous strategies for UTC goals and environmental justice concerns.

  12. Proximity to natural amenities: A seemingly unrelated hedonic regression model with spatial durbin and spatial error processes

    Treesearch

    German M. Izon; Michael S. Hand; Daniel W. Mccollum; Jennifer A. Thacher; Robert P. Berrens

    2016-01-01

    The existing literature suggests that the presence of natural amenities, such as open spaces, can be highly valued and affect economic decisions about where people live and work. This article contributes to previous research by testing this hypothesis using a unique micro-level data set and by examining spatial variations in income levels and housing prices in the...

  13. Carbon emissions risk map from deforestation in the tropical Amazon

    NASA Astrophysics Data System (ADS)

    Ometto, J.; Soler, L. S.; Assis, T. D.; Oliveira, P. V.; Aguiar, A. P.

    2011-12-01

    This work aims to estimate the carbon emissions from tropical deforestation in the Brazilian Amazon associated with the risk assessment of future land use change. The emissions are estimated by incorporating temporal deforestation dynamics, accounting for the biophysical and socioeconomic heterogeneity in the region, as well as secondary forest growth dynamics in abandoned areas. The land cover change model that supported the risk assessment of deforestation was run based on linear regressions. This method takes into account the spatial heterogeneity of deforestation, as the spatial variables adopted to fit the final regression model comprise environmental aspects, economic attractiveness, accessibility and land tenure structure. After fitting suitable regression models for each land cover category, the potential of each cell to be deforested (at 25x25 km and 5x5 km resolution) in the near future was used to calculate the risk assessment of land cover change. The carbon emissions model combines high-resolution new forest clear-cut mapping and four alternative sources of spatial information on biomass distribution for different vegetation types. The risk assessment map of CO2 emissions was obtained by crossing the simulation results of the historical land cover changes with a map of aboveground biomass contained in the remaining forest. This final map represents the risk of CO2 emissions at 25x25 km and 5x5 km until 2020, under a scenario of carbon emission reduction target.

  14. Fire frequency in the Interior Columbia River Basin: Building regional models from fire history data

    USGS Publications Warehouse

    McKenzie, D.; Peterson, D.L.; Agee, James K.

    2000-01-01

    Fire frequency affects vegetation composition and successional pathways; thus it is essential to understand fire regimes in order to manage natural resources at broad spatial scales. Fire history data are lacking for many regions for which fire management decisions are being made, so models are needed to estimate past fire frequency where local data are not yet available. We developed multiple regression models and tree-based (classification and regression tree, or CART) models to predict fire return intervals across the interior Columbia River basin at 1-km resolution, using georeferenced fire history, potential vegetation, cover type, and precipitation databases. The models combined semiqualitative methods and rigorous statistics. The fire history data are of uneven quality; some estimates are based on only one tree, and many are not cross-dated. Therefore, we weighted the models based on data quality and performed a sensitivity analysis of the effects on the models of estimation errors that are due to lack of cross-dating. The regression models predict fire return intervals from 1 to 375 yr for forested areas, whereas the tree-based models predict a range of 8 to 150 yr. Both types of models predict latitudinal and elevational gradients of increasing fire return intervals. Examination of regional-scale output suggests that, although the tree-based models explain more of the variation in the original data, the regression models are less likely to produce extrapolation errors. Thus, the models serve complementary purposes in elucidating the relationships among fire frequency, the predictor variables, and spatial scale. The models can provide local managers with quantitative information and provide data to initialize coarse-scale fire-effects models, although predictions for individual sites should be treated with caution because of the varying quality and uneven spatial coverage of the fire history database. 
The models also demonstrate the integration of qualitative and quantitative methods when requisite data for fully quantitative models are unavailable. They can be tested by comparing new, independent fire history reconstructions against their predictions and can be continually updated, as better fire history data become available.

  15. Distributed Lag Models: Examining Associations between the Built Environment and Health

    PubMed Central

    Baek, Jonggyu; Sánchez, Brisa N.; Berrocal, Veronica J.; Sanchez-Vaznaugh, Emma V.

    2016-01-01

    Built environment factors constrain individual level behaviors and choices, and thus are receiving increasing attention to assess their influence on health. Traditional regression methods have been widely used to examine associations between built environment measures and health outcomes, where a fixed, pre-specified spatial scale (e.g., 1 mile buffer) is used to construct environment measures. However, the spatial scale for these associations remains largely unknown and misspecifying it introduces bias. We propose the use of distributed lag models (DLMs) to describe the association between built environment features and health as a function of distance from the locations of interest and circumvent a-priori selection of a spatial scale. Based on simulation studies, we demonstrate that traditional regression models produce associations biased away from the null when there is spatial correlation among the built environment features. Inference based on DLMs is robust under a range of scenarios of the built environment. We use this innovative application of DLMs to examine the association between the availability of convenience stores near California public schools, which may affect children’s dietary choices both through direct access to junk food and exposure to advertisement, and children’s body mass index z-scores (BMIz). PMID:26414942

  16. Can dispersion modeling of air pollution be improved by land-use regression? An example from Stockholm, Sweden

    PubMed Central

    Korek, Michal; Johansson, Christer; Svensson, Nina; Lind, Tomas; Beelen, Rob; Hoek, Gerard; Pershagen, Göran; Bellander, Tom

    2017-01-01

    Both dispersion modeling (DM) and land-use regression modeling (LUR) are often used for assessment of long-term air pollution exposure in epidemiological studies, but seldom in combination. We developed a hybrid DM–LUR model using 93 biweekly observations of NOx at 31 sites in greater Stockholm (Sweden). The DM was based on spatially resolved topographic, physiographic and emission data, and hourly meteorological data from a diagnostic wind model. Other data were from land use, meteorology and routine monitoring of NOx. We built a linear regression model for NOx, using a stepwise forward selection of covariates. The resulting model predicted observed NOx (R2=0.89) better than the DM without covariates (R2=0.68, P-interaction <0.001) and with minimal apparent bias. The model included (in descending order of importance) DM, traffic intensity on the nearest street, population (number of inhabitants) within 100 m radius, global radiation (direct sunlight plus diffuse or scattered light) and urban contribution to NOx levels (routine urban NOx, less routine rural NOx). Our results indicate that there is a potential for improving estimates of air pollutant concentrations based on DM, by incorporating further spatial characteristics of the immediate surroundings, possibly accounting for imperfections in the emission data. PMID:27485990

  17. Can dispersion modeling of air pollution be improved by land-use regression? An example from Stockholm, Sweden.

    PubMed

    Korek, Michal; Johansson, Christer; Svensson, Nina; Lind, Tomas; Beelen, Rob; Hoek, Gerard; Pershagen, Göran; Bellander, Tom

    2017-11-01

    Both dispersion modeling (DM) and land-use regression modeling (LUR) are often used for assessment of long-term air pollution exposure in epidemiological studies, but seldom in combination. We developed a hybrid DM-LUR model using 93 biweekly observations of NOx at 31 sites in greater Stockholm (Sweden). The DM was based on spatially resolved topographic, physiographic and emission data, and hourly meteorological data from a diagnostic wind model. Other data were from land use, meteorology and routine monitoring of NOx. We built a linear regression model for NOx, using a stepwise forward selection of covariates. The resulting model predicted observed NOx (R2=0.89) better than the DM without covariates (R2=0.68, P-interaction <0.001) and with minimal apparent bias. The model included (in descending order of importance) DM, traffic intensity on the nearest street, population (number of inhabitants) within 100 m radius, global radiation (direct sunlight plus diffuse or scattered light) and urban contribution to NOx levels (routine urban NOx, less routine rural NOx). Our results indicate that there is a potential for improving estimates of air pollutant concentrations based on DM, by incorporating further spatial characteristics of the immediate surroundings, possibly accounting for imperfections in the emission data.

  18. Spatial occupancy models applied to atlas data show Southern Ground Hornbills strongly depend on protected areas.

    PubMed

    Broms, Kristin M; Johnson, Devin S; Altwegg, Res; Conquest, Loveday L

    2014-03-01

    Determining the range of a species and exploring species-habitat associations are central questions in ecology and can be answered by analyzing presence-absence data. Often, both the sampling of sites and the desired area of inference involve neighboring sites; thus, positive spatial autocorrelation between these sites is expected. Using survey data for the Southern Ground Hornbill (Bucorvus leadbeateri) from the Southern African Bird Atlas Project, we compared advantages and disadvantages of three increasingly complex models for species occupancy: an occupancy model that accounted for nondetection but assumed all sites were independent, and two spatial occupancy models that accounted for both nondetection and spatial autocorrelation. We modeled the spatial autocorrelation with an intrinsic conditional autoregressive (ICAR) model and with a restricted spatial regression (RSR) model. Both spatial models can readily be applied to any other gridded, presence-absence data set using a newly introduced R package. The RSR model provided the best inference and was able to capture small-scale variation that the other models did not. It showed that ground hornbills are strongly dependent on protected areas in the north of their South African range, but less so further south. The ICAR models did not capture any spatial autocorrelation in the data, and they took an order of magnitude longer than the RSR models to run. Thus, the RSR occupancy model appears to be an attractive choice for modeling occurrences at large spatial domains, while accounting for imperfect detection and spatial autocorrelation.

  19. USE OF GIS AND ANCILLARY VARIABLES TO PREDICT VOLATILE ORGANIC COMPOUND AND NITROGEN DIOXIDE LEVELS AT UNMONITORED LOCATIONS

    EPA Science Inventory

    This paper presents a GIS-based regression spatial method, known as land-use regression (LUR) modeling, to estimate ambient air pollution exposures used in the EPA El Paso Children's Health Study. Passive measurements of select volatile organic compounds (VOC) and nitrogen dioxi...

  20. Spatial regression analysis on 32 years of total column ozone data

    NASA Astrophysics Data System (ADS)

    Knibbe, J. S.; van der A, R. J.; de Laat, A. T. J.

    2014-08-01

    Multiple-regression analyses have been performed on 32 years of total ozone column data that was spatially gridded with a 1 × 1.5° resolution. The total ozone data consist of the MSR (Multi Sensor Reanalysis; 1979-2008) and 2 years of assimilated SCIAMACHY (SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY) ozone data (2009-2010). The two-dimensionality in this data set allows us to perform the regressions locally and investigate spatial patterns of regression coefficients and their explanatory power. Seasonal dependencies of ozone on regressors are included in the analysis. A new physically oriented model is developed to parameterize stratospheric ozone. Ozone variations on nonseasonal timescales are parameterized by explanatory variables describing the solar cycle, stratospheric aerosols, the quasi-biennial oscillation (QBO), El Niño-Southern Oscillation (ENSO) and stratospheric alternative halogens which are parameterized by the effective equivalent stratospheric chlorine (EESC). For several explanatory variables, seasonally adjusted versions of these explanatory variables are constructed to account for the difference in their effect on ozone throughout the year. To account for seasonal variation in ozone, explanatory variables describing the polar vortex, geopotential height, potential vorticity and average day length are included. Results of this regression model are compared to that of a similar analysis based on a more commonly applied statistically oriented model. The physically oriented model provides spatial patterns in the regression results for each explanatory variable. 
The EESC has a significant depleting effect on ozone at mid- and high latitudes, the solar cycle affects ozone positively mostly in the Southern Hemisphere, stratospheric aerosols affect ozone negatively at high northern latitudes, the effect of QBO is positive and negative in the tropics and mid- to high latitudes, respectively, and ENSO affects ozone negatively between 30° N and 30° S, particularly over the Pacific. The contribution of explanatory variables describing seasonal ozone variation is generally large at mid- to high latitudes. We observe ozone increases with potential vorticity and day length and ozone decreases with geopotential height and variable ozone effects due to the polar vortex in regions to the north and south of the polar vortices. Recovery of ozone is identified globally. However, recovery rates and uncertainties strongly depend on choices that can be made in defining the explanatory variables. The application of several trend models, each with their own pros and cons, yields a large range of recovery rate estimates. Overall these results suggest that care has to be taken in determining ozone recovery rates, in particular for the Antarctic ozone hole.

  1. An hourly regression model for ultrafine particles in a near-highway urban area

    PubMed Central

    Patton, Allison P.; Collins, Caitlin; Naumova, Elena N.; Zamore, Wig; Brugge, Doug; Durant, John L.

    2015-01-01

    Estimating ultrafine particle number concentrations (PNC) near highways for exposure assessment in chronic health studies requires models capable of capturing PNC spatial and temporal variations over the course of a full year. The objectives of this work were to describe the relationship between near-highway PNC and potential predictors, and to build and validate hourly log-linear regression models. PNC was measured near Interstate 93 (I-93) in Somerville, MA (USA) using a mobile monitoring platform driven for 234 hours on 43 days between August 2009 and September 2010. Compared to urban background, PNC levels were consistently elevated within 100–200 m of I-93, with gradients impacted by meteorological and traffic conditions. Temporal and spatial variables including wind speed and direction, temperature, highway traffic, and distance to I-93 and major roads contributed significantly to the full regression model. Cross-validated model R2 values ranged from 0.38–0.47, with higher values achieved (0.43–0.53) when short-duration PNC spikes were removed. The model predicts highest PNC near major roads and on cold days with low wind speeds. The model allows estimation of hourly ambient PNC at 20-m resolution in a near-highway neighborhood. PMID:24559198

  2. Tularosa Basin Play Fairway: Weights of Evidence Models

    DOE Data Explorer

    Adam Brandt

    2015-12-01

    These models are related to weights of evidence play fairway analysis of the Tularosa Basin, New Mexico and Texas. They were created through Spatial Data Modeler: ArcMAP 9.3 geoprocessing tools for spatial data modeling using weights of evidence, logistic regression, fuzzy logic and neural networks. They were used to identify high and low values for potential geothermal plays. The results are relative not only within the Tularosa Basin, but also throughout New Mexico, Utah, Nevada, and other places where high to moderate enthalpy geothermal systems are present (training sites).

  3. Accounting for groundwater in stream fish thermal habitat responses to climate change

    USGS Publications Warehouse

    Snyder, Craig D.; Hitt, Nathaniel P.; Young, John A.

    2015-01-01

    Forecasting climate change effects on aquatic fauna and their habitat requires an understanding of how water temperature responds to changing air temperature (i.e., thermal sensitivity). Previous efforts to forecast climate effects on brook trout habitat have generally assumed uniform air-water temperature relationships over large areas that cannot account for groundwater inputs and other processes that operate at finer spatial scales. We developed regression models that accounted for groundwater influences on thermal sensitivity from measured air-water temperature relationships within forested watersheds in eastern North America (Shenandoah National Park, USA, 78 sites in 9 watersheds). We used these reach-scale models to forecast climate change effects on stream temperature and brook trout thermal habitat, and compared our results to previous forecasts based upon large-scale models. Observed stream temperatures were generally less sensitive to air temperature than previously assumed, and we attribute this to the moderating effect of shallow groundwater inputs. Predicted groundwater temperatures from air-water regression models corresponded well to observed groundwater temperatures elsewhere in the study area. Predictions of brook trout future habitat loss derived from our fine-grained models were far less pessimistic than those from prior models developed at coarser spatial resolutions. However, our models also revealed spatial variation in thermal sensitivity within and among catchments resulting in a patchy distribution of thermally suitable habitat. Habitat fragmentation due to thermal barriers therefore may have an increasingly important role for trout population viability in headwater streams. Our results demonstrate that simple adjustments to air-water temperature regression models can provide a powerful and cost-effective approach for predicting future stream temperatures while accounting for effects of groundwater.

  4. Complex Environmental Data Modelling Using Adaptive General Regression Neural Networks

    NASA Astrophysics Data System (ADS)

    Kanevski, Mikhail

    2015-04-01

    The research deals with an adaptation and application of Adaptive General Regression Neural Networks (GRNN) to high dimensional environmental data. GRNN [1,2,3] are efficient modelling tools both for spatial and temporal data and are based on nonparametric kernel methods closely related to the classical Nadaraya-Watson estimator. Adaptive GRNN, using anisotropic kernels, can also be applied to feature selection tasks when working with high dimensional data [1,3]. In the present research Adaptive GRNN are used to study geospatial data predictability and relevant feature selection using both simulated and real data case studies. The original raw data were either three dimensional monthly precipitation data or monthly wind speeds embedded into 13 dimensional space constructed by geographical coordinates and geo-features calculated from digital elevation model. GRNN were applied in two different ways: 1) adaptive GRNN with the resulting list of features ordered according to their relevancy; and 2) adaptive GRNN applied to evaluate all possible models N [in case of wind fields N=(2^13 -1)=8191] and rank them according to the cross-validation error. In both cases training was carried out using a leave-one-out procedure. An important result of the study is that the set of the most relevant features depends on the month (strong seasonal effect) and year. The predictabilities of precipitation and wind field patterns, estimated using the cross-validation and testing errors of raw and shuffled data, were studied in detail. The results of both approaches were qualitatively and quantitatively compared. In conclusion, Adaptive GRNN with their ability to select features and efficient modelling of complex high dimensional data can be widely used in automatic/on-line mapping and as an integrated part of environmental decision support systems. 1. Kanevski M., Pozdnoukhov A., Timonin V. Machine Learning for Spatial Environmental Data. Theory, applications and software. EPFL Press. With a CD: data, software, guides. (2009). 2. Kanevski M. Spatial Predictions of Soil Contamination Using General Regression Neural Networks. Systems Research and Information Systems, Volume 8, number 4, 1999. 3. Robert S., Foresti L., Kanevski M. Spatial prediction of monthly wind speeds in complex terrain with adaptive general regression neural networks. International Journal of Climatology, 33 pp. 1793-1804, 2013.
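
    The Nadaraya-Watson estimator and the leave-one-out procedure mentioned in this abstract can be sketched in a few lines. This is a simplified one-dimensional, isotropic version with a single Gaussian bandwidth, not the adaptive anisotropic GRNN of the study; the function names, the toy data and the candidate bandwidths are hypothetical.

```python
import math


def nadaraya_watson(x_query, xs, ys, sigma):
    """Gaussian-kernel weighted average of ys, evaluated at x_query (1-D inputs)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    weights = [math.exp(-((x_query - x) ** 2) / (2.0 * sigma ** 2)) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase sigma")
    return sum(w * y for w, y in zip(weights, ys)) / total


def loo_mse(xs, ys, sigma):
    """Leave-one-out mean squared error for a given bandwidth."""
    errors = []
    for i in range(len(xs)):
        # Predict point i from all the other points.
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        pred = nadaraya_watson(xs[i], rest_x, rest_y, sigma)
        errors.append((pred - ys[i]) ** 2)
    return sum(errors) / len(errors)


# Hypothetical toy data and candidate bandwidths.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 1.0, 4.0, 9.0, 16.0]
candidates = [0.25, 0.5, 1.0, 2.0]
best_sigma = min(candidates, key=lambda s: loo_mse(xs, ys, s))
```

    An anisotropic variant would use one bandwidth per input dimension; a very large tuned bandwidth for a dimension would then suggest that the feature contributes little, which is the intuition behind kernel-based feature selection.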

  5. Modeling spatial effects of PM2.5 on term low birth weight in Los Angeles County

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Coker, Eric; Ghosh, Jokay; Jerrett, Michael

    Air pollution epidemiological studies suggest that elevated exposure to fine particulate matter (PM2.5) is associated with higher prevalence of term low birth weight (TLBW). Previous studies have generally assumed the exposure–response of PM2.5 on TLBW to be the same throughout a large geographical area. Health effects related to PM2.5 exposures, however, may not be uniformly distributed spatially, creating a need for studies that explicitly investigate the spatial distribution of the exposure–response relationship between individual-level exposure to PM2.5 and TLBW. Here, we examine the overall and spatially varying exposure–response relationship between PM2.5 and TLBW throughout urban Los Angeles (LA) County, California. We estimated PM2.5 from a combination of land use regression (LUR), aerosol optical depth from remote sensing, and atmospheric modeling techniques. Exposures were assigned to LA County individual pregnancies identified from electronic birth certificates between the years 1995-2006 (N=1,359,284) provided by the California Department of Public Health. We used a single pollutant multivariate logistic regression model, with multilevel spatially structured and unstructured random effects set in a Bayesian framework to estimate global and spatially varying pollutant effects on TLBW at the census tract level. Overall, increased PM2.5 level was associated with higher prevalence of TLBW county-wide. The spatial random effects model, however, demonstrated that the exposure–response for PM2.5 and TLBW was not uniform across urban LA County. Rather, the magnitude and certainty of the exposure–response estimates for PM2.5 on log odds of TLBW were greatest in the urban core of Central and Southern LA County census tracts. These results suggest that the effects may be spatially patterned, and that simply estimating global pollutant effects obscures disparities suggested by spatial patterns of effects. Studies that incorporate spatial multilevel modeling with random coefficients allow us to identify areas where air pollutant effects on adverse birth outcomes may be most severe and policies to further reduce air pollution might be most effective. - Highlights: • We model the spatial dependency of PM2.5 effects on term low birth weight (TLBW). • PM2.5 effects on TLBW are shown to vary spatially across urban LA County. • Modeling spatial dependency of PM2.5 health effects may identify effect 'hotspots'. • Birth outcomes studies should consider the spatial dependency of PM2.5 effects.

  6. An improved geographically weighted regression model for PM2.5 concentration estimation in large areas

    NASA Astrophysics Data System (ADS)

    Zhai, Liang; Li, Shuang; Zou, Bin; Sang, Huiyong; Fang, Xin; Xu, Shan

    2018-05-01

    Considering the spatially non-stationary contributions of environmental variables to PM2.5 variations, the geographically weighted regression (GWR) modeling method has been widely used to estimate PM2.5 concentrations. However, most of the GWR models in reported studies so far were established based on predictors screened through pretreatment correlation analysis, and this process might omit factors that really drive PM2.5 variations. This study therefore developed a best subsets regression (BSR) enhanced principal component analysis-GWR (PCA-GWR) modeling approach to estimate PM2.5 concentration by fully considering all the potential variables' contributions simultaneously. The performance comparison experiment between PCA-GWR and regular GWR was conducted in the Beijing-Tianjin-Hebei (BTH) region over a one-year period. Results indicated that the PCA-GWR modeling outperforms the regular GWR modeling, with obviously higher model fitting- and cross-validation-based adjusted R2 and lower RMSE. Meanwhile, the distribution map of PM2.5 concentration from PCA-GWR modeling also clearly depicts more spatial variation details in contrast to the one from regular GWR modeling. It can be concluded that the BSR enhanced PCA-GWR modeling could be a reliable way for effective air pollution concentration estimation in the future by involving all the potential predictor variables' contributions to PM2.5 variations.
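
    The core of plain GWR, as opposed to the BSR-enhanced PCA-GWR of this study, is a weighted least squares fit repeated at each location, with weights that decay with distance. The sketch below assumes a single predictor and a Gaussian distance kernel; the function name, coordinates, data and bandwidth are hypothetical.

```python
import math


def gwr_local_fit(target, coords, x, y, bandwidth):
    """Weighted least squares fit of y = a + b * x at one target location.

    Observations are weighted by a Gaussian kernel of their distance to the
    target, so nearby observations dominate the local coefficients.
    Returns (intercept, slope).
    """
    weights = [
        math.exp(-((cx - target[0]) ** 2 + (cy - target[1]) ** 2) / (2.0 * bandwidth ** 2))
        for cx, cy in coords
    ]
    w_sum = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / w_sum
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / w_sum
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    if sxx == 0.0:
        raise ValueError("predictor has no weighted variance near the target")
    sxy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical monitoring sites: the x-y relationship differs between west and east.
coords = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 0.0), (10.0, 1.0), (11.0, 0.0)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [1.0, 2.0, 3.0, 3.0, 6.0, 9.0]  # y = 1 * x in the west, y = 3 * x in the east

_, slope_west = gwr_local_fit((0.0, 0.0), coords, x, y, bandwidth=1.0)
_, slope_east = gwr_local_fit((10.0, 0.0), coords, x, y, bandwidth=1.0)
```

    A single global regression on these toy data would return one compromise slope, whereas the local fits recover the two regimes; this is the spatial non-stationarity that GWR is meant to capture.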

  7. Phenomapping of rangelands in South Africa using time series of RapidEye data

    NASA Astrophysics Data System (ADS)

    Parplies, André; Dubovyk, Olena; Tewes, Andreas; Mund, Jan-Peter; Schellberg, Jürgen

    2016-12-01

    Phenomapping is an approach which allows the derivation of spatial patterns of vegetation phenology and rangeland productivity based on time series of vegetation indices. In our study, we propose a new spatial mapping approach which combines phenometrics derived from high resolution (HR) satellite time series with spatial logistic regression modeling to discriminate land management systems in rangelands. From the RapidEye time series for selected rangelands in South Africa, we calculated bi-weekly noise reduced Normalized Difference Vegetation Index (NDVI) images. For the growing season of 2011–2012, we further derived principal phenology metrics such as start, end and length of growing season and related phenological variables such as amplitude, left derivative and small integral of the NDVI curve. We then mapped these phenometrics across two different tenure systems, communal and commercial, at the very detailed spatial resolution of 5 m. The result of a binary logistic regression (BLR) has shown that the amplitude and the left derivative of the NDVI curve were statistically significant. These indicators are useful to discriminate commercial from communal rangeland systems. We conclude that phenomapping combined with spatial modeling is a powerful tool that allows efficient aggregation of phenology and productivity metrics for spatially explicit analysis of the relationships of crop phenology with site conditions and management. This approach has particular potential for disaggregated and patchy environments such as in farming systems in semi-arid South Africa, where phenology varies considerably among and within years. Further, we see a strong perspective for phenomapping to support spatially explicit modelling of vegetation.

  8. An Introduction to Macro-Level Spatial Nonstationarity: a Geographically Weighted Regression Analysis of Diabetes and Poverty

    PubMed Central

    Siordia, Carlos; Saenz, Joseph; Tom, Sarah E.

    2014-01-01

    Type II diabetes is a growing health problem in the United States. Understanding geographic variation in diabetes prevalence will inform where resources for management and prevention should be allocated. Investigations of the correlates of diabetes prevalence have largely ignored how spatial nonstationarity might play a role in the macro-level distribution of diabetes. This paper introduces the reader to the concept of spatial nonstationarity: variance in statistical relationships as a function of geographical location. Since spatial nonstationarity means different predictors can have varying effects on model outcomes, we make use of a geographically weighted regression to calculate correlates of diabetes as a function of geographic location. By doing so, we demonstrate an exploratory example in which the diabetes-poverty macro-level statistical relationship varies as a function of location. In particular, we provide evidence that when predicting macro-level diabetes prevalence, poverty is not always positively associated with diabetes. PMID:25414731
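
The idea of a relationship that changes sign with location can be illustrated with a minimal sketch of geographically weighted regression. This is not the authors' analysis: the data are synthetic, the single predictor, the Gaussian kernel, and the bandwidth values are assumptions chosen for illustration, and the function name `gwr_local_fit` is hypothetical.

```python
import math

def gwr_local_fit(coords, x, y, target, bandwidth):
    """Fit y = a + b*x by weighted least squares at one target location.

    Each observation is weighted by a Gaussian kernel of its distance
    to the target, so nearby sites dominate the local estimate.
    """
    weights = []
    for cx, cy in coords:
        d2 = (cx - target[0]) ** 2 + (cy - target[1]) ** 2
        weights.append(math.exp(-0.5 * d2 / bandwidth ** 2))
    sw = sum(weights)
    mx = sum(w * xi for w, xi in zip(weights, x)) / sw
    my = sum(w * yi for w, yi in zip(weights, y)) / sw
    sxx = sum(w * (xi - mx) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - mx) * (yi - my) for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Synthetic data: ten sites on a line. In the west (sites 0-4) the outcome
# rises with the predictor; in the east (sites 5-9) it falls.
coords = [(float(i), 0.0) for i in range(10)]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1 + 2 * v for v in x[:5]] + [10 - v for v in x[5:]]

_, west_slope = gwr_local_fit(coords, x, y, target=(0.0, 0.0), bandwidth=1.0)
_, east_slope = gwr_local_fit(coords, x, y, target=(9.0, 0.0), bandwidth=1.0)
# A very large bandwidth gives every site nearly equal weight (global OLS).
_, global_slope = gwr_local_fit(coords, x, y, target=(0.0, 0.0), bandwidth=1e6)

print(west_slope, east_slope, global_slope)
```

In this toy setting the single global slope averages over two opposite local relationships, which is the kind of masking that a local model is meant to expose.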

  9. An Introduction to Macro-Level Spatial Nonstationarity: a Geographically Weighted Regression Analysis of Diabetes and Poverty.

    PubMed

    Siordia, Carlos; Saenz, Joseph; Tom, Sarah E

    2012-01-01

    Type II diabetes is a growing health problem in the United States. Understanding geographic variation in diabetes prevalence will inform where resources for management and prevention should be allocated. Investigations of the correlates of diabetes prevalence have largely ignored how spatial nonstationarity might play a role in the macro-level distribution of diabetes. This paper introduces the reader to the concept of spatial nonstationarity: variance in statistical relationships as a function of geographical location. Since spatial nonstationarity means different predictors can have varying effects on model outcomes, we make use of a geographically weighted regression to calculate correlates of diabetes as a function of geographic location. By doing so, we demonstrate an exploratory example in which the diabetes-poverty macro-level statistical relationship varies as a function of location. In particular, we provide evidence that when predicting macro-level diabetes prevalence, poverty is not always positively associated with diabetes.

  10. Analysis of the impact of immigration on labour market using spatial models

    NASA Astrophysics Data System (ADS)

    Polonyankina, Tatiana

    2017-07-01

    This paper investigates the impact of immigration on employment and unemployment in a host country. The question to answer is: how does employment/unemployment in the host country change after an increase in the number of immigrants? The analysis takes into account only legal immigrants during a recession period. The model combines classical regression of cross-sectional data with spatial econometric models in which cross-sectional dependencies are captured by a spatial matrix. The intention is to use spatial models to analyse the sensitivity of the employment/unemployment rate to a change in the share of immigration in a region. The panel data are based on the Labour Force Survey and on macro data available from Eurostat for three European countries (Germany, Austria and the Czech Republic), grouped into cells by NUTS regions over a recession period.

  11. Spectral-Spatial Shared Linear Regression for Hyperspectral Image Classification.

    PubMed

    Haoliang Yuan; Yuan Yan Tang

    2017-04-01

    Classification of the pixels in a hyperspectral image (HSI) is an important task and has been widely applied in many practical applications. Its major challenge is the high-dimensional, small-sample-size problem. To deal with this problem, many subspace learning (SL) methods have been developed to reduce the dimension of the pixels while preserving the important discriminant information. Motivated by the ridge linear regression (RLR) framework for SL, we propose a spectral-spatial shared linear regression method (SSSLR) for extracting the feature representation. Compared with RLR, our proposed SSSLR has the following two advantages. First, we utilize a convex set to explore the spatial structure for computing the linear projection matrix. Second, we utilize a shared structure learning model, which is formed by the original data space and a hidden feature space, to learn a more discriminant linear projection matrix for classification. To optimize our proposed method, an efficient iterative algorithm is proposed. Experimental results on two popular HSI data sets, i.e., Indian Pines and Salinas, demonstrate that our proposed method outperforms many SL methods.

  12. Trees Grow on Money: Urban Tree Canopy Cover and Environmental Justice

    PubMed Central

    Schwarz, Kirsten; Fragkias, Michail; Boone, Christopher G.; Zhou, Weiqi; McHale, Melissa; Grove, J. Morgan; O’Neil-Dunne, Jarlath; McFadden, Joseph P.; Buckley, Geoffrey L.; Childers, Dan; Ogden, Laura; Pincetl, Stephanie; Pataki, Diane; Whitmer, Ali; Cadenasso, Mary L.

    2015-01-01

    This study examines the distributional equity of urban tree canopy (UTC) cover for Baltimore, MD, Los Angeles, CA, New York, NY, Philadelphia, PA, Raleigh, NC, Sacramento, CA, and Washington, D.C. using high spatial resolution land cover data and census data. Data are analyzed at the Census Block Group levels using Spearman’s correlation, ordinary least squares regression (OLS), and a spatial autoregressive model (SAR). Across all cities there is a strong positive correlation between UTC cover and median household income. Negative correlations between race and UTC cover exist in bivariate models for some cities, but they are generally not observed using multivariate regressions that include additional variables on income, education, and housing age. SAR models result in higher r-square values compared to the OLS models across all cities, suggesting that spatial autocorrelation is an important feature of our data. Similarities among cities can be found based on shared characteristics of climate, race/ethnicity, and size. Our findings suggest that a suite of variables, including income, contribute to the distribution of UTC cover. These findings can help target simultaneous strategies for UTC goals and environmental justice concerns. PMID:25830303

  13. Wilderness and primitive area recreation participation and consumption: an examination of demographic and spatial factors

    Treesearch

    J. Michael Bowker; D. Murphy; H. Ken Cordell; Donald B.K. English; J.C. Bergstrom; C.M. Starbuck; C.J. Betz; G.T. Green

    2006-01-01

    This paper explores the influence of demographic and spatial variables on individual participation and consumption of wildland area recreation. Data from the National Survey on Recreation and the Environment are combined with geographical information system-based distance measures to develop nonlinear regression models used to predict both participation and the number...

  14. Spatio-temporal water quality mapping from satellite images using geographically and temporally weighted regression

    NASA Astrophysics Data System (ADS)

    Chu, Hone-Jay; Kong, Shish-Jeng; Chang, Chih-Hua

    2018-03-01

    The turbidity (TB) of a water body varies with time and space. Water quality is traditionally estimated via linear regression based on satellite images. However, estimating and mapping water quality require a spatio-temporal nonstationary model, while TB mapping necessitates the use of geographically and temporally weighted regression (GTWR) and geographically weighted regression (GWR) models, both of which are more precise than linear regression. Given the temporal nonstationary models for mapping water quality, GTWR offers the best option for estimating regional water quality. Compared with GWR, GTWR provides highly reliable information for water quality mapping, boasts a relatively high goodness of fit, improves the explanation of variance from 44% to 87%, and shows a sufficient space-time explanatory power. The seasonal patterns of TB and the main spatial patterns of TB variability can be identified using the estimated TB maps from GTWR and by conducting an empirical orthogonal function (EOF) analysis.

  15. Esophageal wall dose-surface maps do not improve the predictive performance of a multivariable NTCP model for acute esophageal toxicity in advanced stage NSCLC patients treated with intensity-modulated (chemo-)radiotherapy.

    PubMed

    Dankers, Frank; Wijsman, Robin; Troost, Esther G C; Monshouwer, René; Bussink, Johan; Hoffmann, Aswin L

    2017-05-07

    In our previous work, a multivariable normal-tissue complication probability (NTCP) model for acute esophageal toxicity (AET) Grade  ⩾2 after highly conformal (chemo-)radiotherapy for non-small cell lung cancer (NSCLC) was developed using multivariable logistic regression analysis incorporating clinical parameters and mean esophageal dose (MED). Since the esophagus is a tubular organ, spatial information of the esophageal wall dose distribution may be important in predicting AET. We investigated whether the incorporation of esophageal wall dose-surface data with spatial information improves the predictive power of our established NTCP model. For 149 NSCLC patients treated with highly conformal radiation therapy esophageal wall dose-surface histograms (DSHs) and polar dose-surface maps (DSMs) were generated. DSMs were used to generate new DSHs and dose-length-histograms that incorporate spatial information of the dose-surface distribution. From these histograms dose parameters were derived and univariate logistic regression analysis showed that they correlated significantly with AET. Following our previous work, new multivariable NTCP models were developed using the most significant dose histogram parameters based on univariate analysis (19 in total). However, the 19 new models incorporating esophageal wall dose-surface data with spatial information did not show improved predictive performance (area under the curve, AUC range 0.79-0.84) over the established multivariable NTCP model based on conventional dose-volume data (AUC  =  0.84). For prediction of AET, based on the proposed multivariable statistical approach, spatial information of the esophageal wall dose distribution is of no added value and it is sufficient to only consider MED as a predictive dosimetric parameter.

  16. Esophageal wall dose-surface maps do not improve the predictive performance of a multivariable NTCP model for acute esophageal toxicity in advanced stage NSCLC patients treated with intensity-modulated (chemo-)radiotherapy

    NASA Astrophysics Data System (ADS)

    Dankers, Frank; Wijsman, Robin; Troost, Esther G. C.; Monshouwer, René; Bussink, Johan; Hoffmann, Aswin L.

    2017-05-01

    In our previous work, a multivariable normal-tissue complication probability (NTCP) model for acute esophageal toxicity (AET) Grade  ⩾2 after highly conformal (chemo-)radiotherapy for non-small cell lung cancer (NSCLC) was developed using multivariable logistic regression analysis incorporating clinical parameters and mean esophageal dose (MED). Since the esophagus is a tubular organ, spatial information of the esophageal wall dose distribution may be important in predicting AET. We investigated whether the incorporation of esophageal wall dose-surface data with spatial information improves the predictive power of our established NTCP model. For 149 NSCLC patients treated with highly conformal radiation therapy esophageal wall dose-surface histograms (DSHs) and polar dose-surface maps (DSMs) were generated. DSMs were used to generate new DSHs and dose-length-histograms that incorporate spatial information of the dose-surface distribution. From these histograms dose parameters were derived and univariate logistic regression analysis showed that they correlated significantly with AET. Following our previous work, new multivariable NTCP models were developed using the most significant dose histogram parameters based on univariate analysis (19 in total). However, the 19 new models incorporating esophageal wall dose-surface data with spatial information did not show improved predictive performance (area under the curve, AUC range 0.79-0.84) over the established multivariable NTCP model based on conventional dose-volume data (AUC  =  0.84). For prediction of AET, based on the proposed multivariable statistical approach, spatial information of the esophageal wall dose distribution is of no added value and it is sufficient to only consider MED as a predictive dosimetric parameter.

  17. Spatial distribution of water supply in the coterminous United States

    Treesearch

    Thomas C. Brown; Michael T. Hobbins; Jorge A. Ramirez

    2008-01-01

    Available water supply across the contiguous 48 states was estimated as precipitation minus evapotranspiration using data for the period 1953-1994. Precipitation estimates were taken from the Parameter-elevation Regressions on Independent Slopes Model (PRISM). Evapotranspiration was estimated using two models, the Advection-Aridity model and the Zhang model. The...

  18. Spatial association of public sports facilities with body mass index in Korea.

    PubMed

    Han, Eun Jin; Kang, Kiyeon; Sohn, So Young

    2018-05-07

    Governments and also local councils create and enforce their own regional public health care plans for the problem of overweight and obesity in the population. Public sports facilities can help these plans. In this paper, we investigated the contribution of public sports facilities to the reduction of the obesity of local residents. We used the data obtained from the Fifth Korea National Health and Nutrition Examination Surveys and measured the degree of obesity using body mass index (BMI). We conducted various spatial regression analyses, including the global Moran's I test and local indicators of spatial autocorrelation analysis, finding that there exists spatial dependence in the error term of the spatial regression model for BMI. However, we also observed that the number of local public sports facilities is not significantly related to local BMI. This result can be caused by the low utilization ratio and an unbalanced spatial distribution of local public sports facilities. Based on our findings, we suggest that local councils need to improve the quality of public sports facilities, encouraging the establishment of preferred types of public sports facilities.
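
The global Moran's I statistic mentioned above can be computed directly from a spatial weights matrix. The following is a minimal sketch under stated assumptions: the six-area chain, the binary adjacency weights, and the values are synthetic and have nothing to do with the Korean BMI data, and the function name `morans_i` is hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for values z and a spatial weights matrix W.

    I = (n / S0) * sum_ij w_ij (z_i - m)(z_j - m) / sum_i (z_i - m)^2,
    where m is the mean of z and S0 is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * cross / sum(d * d for d in dev)

# Six areas in a row; each area neighbours the one before and after it
# (binary adjacency weights, an assumption made for this example).
n = 6
W = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]

clustered = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]    # similar values sit together
alternating = [1.0, 5.0, 1.0, 5.0, 1.0, 5.0]  # neighbours always differ

i_clustered = morans_i(clustered, W)
i_alternating = morans_i(alternating, W)
print(i_clustered, i_alternating)
```

Positive values indicate that neighbouring areas tend to be alike and negative values that they tend to differ; judging significance would additionally require a permutation or analytical test, which this sketch omits.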

  19. Estimation of Fine Particulate Matter in Taipei Using Landuse Regression and Bayesian Maximum Entropy Methods

    PubMed Central

    Yu, Hwa-Lung; Wang, Chih-Hsih; Liu, Ming-Che; Kuo, Yi-Ming

    2011-01-01

    Fine airborne particulate matter (PM2.5) has adverse effects on human health. Assessing the long-term effects of PM2.5 exposure on human health and ecology is often limited by a lack of reliable PM2.5 measurements. In Taipei, PM2.5 levels were not systematically measured until August 2005. Due to the popularity of geographic information systems (GIS), the landuse regression method has been widely used in the spatial estimation of PM concentrations. This method accounts for the potential contributing factors of the local environment, such as traffic volume. Geostatistical methods, on the other hand, account for the spatiotemporal dependence among the observations of ambient pollutants. This study assesses the performance of the landuse regression model for the spatiotemporal estimation of PM2.5 in the Taipei area. Specifically, this study integrates the landuse regression model with the geostatistical approach within the framework of the Bayesian maximum entropy (BME) method. The resulting epistemic framework can assimilate knowledge bases including: (a) empirical-based spatial trends of PM concentration based on landuse regression, (b) the spatio-temporal dependence among PM observation information, and (c) site-specific PM observations. The proposed approach performs the spatiotemporal estimation of PM2.5 levels in the Taipei area (Taiwan) from 2005–2007. PMID:21776223

  20. Estimation of fine particulate matter in Taipei using landuse regression and bayesian maximum entropy methods.

    PubMed

    Yu, Hwa-Lung; Wang, Chih-Hsih; Liu, Ming-Che; Kuo, Yi-Ming

    2011-06-01

    Fine airborne particulate matter (PM2.5) has adverse effects on human health. Assessing the long-term effects of PM2.5 exposure on human health and ecology is often limited by a lack of reliable PM2.5 measurements. In Taipei, PM2.5 levels were not systematically measured until August 2005. Due to the popularity of geographic information systems (GIS), the landuse regression method has been widely used in the spatial estimation of PM concentrations. This method accounts for the potential contributing factors of the local environment, such as traffic volume. Geostatistical methods, on the other hand, account for the spatiotemporal dependence among the observations of ambient pollutants. This study assesses the performance of the landuse regression model for the spatiotemporal estimation of PM2.5 in the Taipei area. Specifically, this study integrates the landuse regression model with the geostatistical approach within the framework of the Bayesian maximum entropy (BME) method. The resulting epistemic framework can assimilate knowledge bases including: (a) empirical-based spatial trends of PM concentration based on landuse regression, (b) the spatio-temporal dependence among PM observation information, and (c) site-specific PM observations. The proposed approach performs the spatiotemporal estimation of PM2.5 levels in the Taipei area (Taiwan) from 2005-2007.

  1. Role of Aedes aegypti (Linnaeus) and Aedes albopictus (Skuse) in local dengue epidemics in Taiwan.

    PubMed

    Tsai, Pui-Jen; Teng, Hwa-Jen

    2016-11-09

    Aedes mosquitoes in Taiwan mainly comprise Aedes albopictus and Ae. aegypti. However, the species contributing to autochthonous dengue spread and the extent at which it occurs remain unclear. Thus, in this study, we spatially analyzed real data to determine spatial features related to local dengue incidence and mosquito density, particularly that of Ae. albopictus and Ae. aegypti. We used bivariate Moran's I statistic and geographically weighted regression (GWR) spatial methods to analyze the globally spatial dependence and locally regressed relationship between (1) imported dengue incidences and Breteau indices (BIs) of Ae. albopictus, (2) imported dengue incidences and BI of Ae. aegypti, (3) autochthonous dengue incidences and BI of Ae. albopictus, (4) autochthonous dengue incidences and BI of Ae. aegypti, (5) all dengue incidences and BI of Ae. albopictus, (6) all dengue incidences and BI of Ae. aegypti, (7) BI of Ae. albopictus and human population density, and (8) BI of Ae. aegypti and human population density in 348 townships in Taiwan. In the GWR models, regression coefficients of spatially regressed relationships between the incidence of autochthonous dengue and vector density of Ae. aegypti were significant and positive in most townships in Taiwan. However, Ae. albopictus had significant but negative regression coefficients in clusters of dengue epidemics. In the global bivariate Moran's index, spatial dependence between the incidence of autochthonous dengue and vector density of Ae. aegypti was significant and exhibited positive correlation in Taiwan (bivariate Moran's index = 0.51). However, Ae. albopictus exhibited positively significant but low correlation (bivariate Moran's index = 0.06). Similar results were observed in the two spatial methods between all dengue incidences and Aedes mosquitoes (Ae. aegypti and Ae. albopictus). The regression coefficients of spatially regressed relationships between imported dengue cases and Aedes mosquitoes (Ae. aegypti and Ae. albopictus) were significant in 348 townships in Taiwan. The results indicated that local Aedes mosquitoes do not contribute to the dengue incidence of imported cases. The density of Ae. aegypti positively correlated with the density of human population. By contrast, the density of Ae. albopictus negatively correlated with the density of human population in the areas of southern Taiwan. The results indicated that Ae. aegypti has more opportunities for human-mosquito contact in dengue endemic areas in southern Taiwan. Ae. aegypti, but not Ae. albopictus, and human population density in southern Taiwan are closely associated with an increased risk of autochthonous dengue incidence.

  2. Regional assessments of the Nation's water quality—Improved understanding of stream nutrient sources through enhanced modeling capabilities

    USGS Publications Warehouse

    Preston, Stephen D.; Alexander, Richard B.; Woodside, Michael D.

    2011-01-01

    The U.S. Geological Survey (USGS) recently completed assessments of stream nutrients in six major regions extending over much of the conterminous United States. SPARROW (SPAtially Referenced Regressions On Watershed attributes) models were developed for each region to explain spatial patterns in monitored stream nutrient loads in relation to human activities and natural resources and processes. The model information, reported by stream reach and catchment, provides contrasting views of the spatial patterns of nutrient source contributions, including those from urban (wastewater effluent and diffuse runoff from developed land), agricultural (farm fertilizers and animal manure), and specific background sources (atmospheric nitrogen deposition, soil phosphorus, forest nitrogen fixation, and channel erosion).

  3. The importance of regional models in assessing canine cancer incidences in Switzerland

    PubMed Central

    Leyk, Stefan; Brunsdon, Christopher; Graf, Ramona; Pospischil, Andreas; Fabrikant, Sara Irina

    2018-01-01

    Fitting canine cancer incidences through a conventional regression model assumes constant statistical relationships across the study area in estimating the model coefficients. However, it is often more realistic to consider that these relationships may vary over space. Such a condition, known as spatial non-stationarity, implies that the model coefficients need to be estimated locally. In these kinds of local models, the geographic scale, or spatial extent, employed for coefficient estimation may also have a pervasive influence. This is because important variations in the local model coefficients across geographic scales may impact the understanding of local relationships. In this study, we fitted canine cancer incidences across Swiss municipal units through multiple regional models. We computed diagnostic summaries across the different regional models, and contrasted them with the diagnostics of the conventional regression model, using value-by-alpha maps and scalograms. The results of this comparative assessment enabled us to identify variations in the goodness-of-fit and coefficient estimates. We detected spatially non-stationary relationships, in particular, for the variables related to biological risk factors. These variations in the model coefficients were more important at small geographic scales, making a case for the need to model canine cancer incidences locally in contrast to more conventional global approaches. However, we contend that prior to undertaking local modeling efforts, a deeper understanding of the effects of geographic scale is needed to better characterize and identify local model relationships. PMID:29652921

  4. The importance of regional models in assessing canine cancer incidences in Switzerland.

    PubMed

    Boo, Gianluca; Leyk, Stefan; Brunsdon, Christopher; Graf, Ramona; Pospischil, Andreas; Fabrikant, Sara Irina

    2018-01-01

    Fitting canine cancer incidences through a conventional regression model assumes constant statistical relationships across the study area in estimating the model coefficients. However, it is often more realistic to consider that these relationships may vary over space. Such a condition, known as spatial non-stationarity, implies that the model coefficients need to be estimated locally. In these kinds of local models, the geographic scale, or spatial extent, employed for coefficient estimation may also have a pervasive influence. This is because important variations in the local model coefficients across geographic scales may impact the understanding of local relationships. In this study, we fitted canine cancer incidences across Swiss municipal units through multiple regional models. We computed diagnostic summaries across the different regional models, and contrasted them with the diagnostics of the conventional regression model, using value-by-alpha maps and scalograms. The results of this comparative assessment enabled us to identify variations in the goodness-of-fit and coefficient estimates. We detected spatially non-stationary relationships, in particular, for the variables related to biological risk factors. These variations in the model coefficients were more important at small geographic scales, making a case for the need to model canine cancer incidences locally in contrast to more conventional global approaches. However, we contend that prior to undertaking local modeling efforts, a deeper understanding of the effects of geographic scale is needed to better characterize and identify local model relationships.

  5. Panel regressions to estimate low-flow response to rainfall variability in ungaged basins

    USGS Publications Warehouse

    Bassiouni, Maoya; Vogel, Richard M.; Archfield, Stacey A.

    2016-01-01

    Multicollinearity and omitted-variable bias are major limitations to developing multiple linear regression models to estimate streamflow characteristics in ungaged areas and varying rainfall conditions. Panel regression is used to overcome limitations of traditional regression methods, and obtain reliable model coefficients, in particular to understand the elasticity of streamflow to rainfall. Using annual rainfall and selected basin characteristics at 86 gaged streams in the Hawaiian Islands, regional regression models for three stream classes were developed to estimate the annual low-flow duration discharges. Three panel-regression structures (random effects, fixed effects, and pooled) were compared to traditional regression methods, in which space is substituted for time. Results indicated that panel regression generally was able to reproduce the temporal behavior of streamflow and reduce the standard errors of model coefficients compared to traditional regression, even for models in which the unobserved heterogeneity between streams is significant and the variance inflation factor for rainfall is much greater than 10. This is because both spatial and temporal variability were better characterized in panel regression. In a case study, regional rainfall elasticities estimated from panel regressions were applied to ungaged basins on Maui, using available rainfall projections to estimate plausible changes in surface-water availability and usable stream habitat for native species. The presented panel-regression framework is shown to offer benefits over existing traditional hydrologic regression methods for developing robust regional relations to investigate streamflow response in a changing climate.
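
The contrast between pooled and fixed-effects panel regression described above can be sketched on toy data. This is an illustration only, not the authors' model: the two synthetic "streams", the linear (rather than elasticity) form, and the function names `ols_slope` and `within_slope` are assumptions.

```python
def ols_slope(x, y):
    """Slope of an ordinary least squares fit of y on x (with intercept)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def within_slope(panel):
    """Fixed-effects (within) slope: demean x and y per unit, then pool."""
    xd, yd = [], []
    for x, y in panel.values():
        mx = sum(x) / len(x)
        my = sum(y) / len(y)
        xd += [xi - mx for xi in x]
        yd += [yi - my for yi in y]
    return sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

# Two synthetic streams observed over three years. Both respond to rainfall
# with slope 2, but stream A has a high baseline flow and low rainfall while
# stream B has a low baseline and high rainfall (unobserved heterogeneity).
panel = {
    "A": ([1.0, 2.0, 3.0], [12.0, 14.0, 16.0]),  # flow = 10 + 2 * rain
    "B": ([4.0, 5.0, 6.0], [8.0, 10.0, 12.0]),   # flow = 0 + 2 * rain
}

all_x = [v for x, _ in panel.values() for v in x]
all_y = [v for _, y in panel.values() for v in y]

pooled = ols_slope(all_x, all_y)   # ignores which stream each row is from
within = within_slope(panel)       # removes each stream's own mean first
print(pooled, within)
```

Here the pooled fit confuses the between-stream difference with the rainfall response and even gets the sign wrong, while the within estimator recovers the common temporal slope.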

  6. Panel regressions to estimate low-flow response to rainfall variability in ungaged basins

    NASA Astrophysics Data System (ADS)

    Bassiouni, Maoya; Vogel, Richard M.; Archfield, Stacey A.

    2016-12-01

    Multicollinearity and omitted-variable bias are major limitations to developing multiple linear regression models to estimate streamflow characteristics in ungaged areas and varying rainfall conditions. Panel regression is used to overcome limitations of traditional regression methods, and obtain reliable model coefficients, in particular to understand the elasticity of streamflow to rainfall. Using annual rainfall and selected basin characteristics at 86 gaged streams in the Hawaiian Islands, regional regression models for three stream classes were developed to estimate the annual low-flow duration discharges. Three panel-regression structures (random effects, fixed effects, and pooled) were compared to traditional regression methods, in which space is substituted for time. Results indicated that panel regression generally was able to reproduce the temporal behavior of streamflow and reduce the standard errors of model coefficients compared to traditional regression, even for models in which the unobserved heterogeneity between streams is significant and the variance inflation factor for rainfall is much greater than 10. This is because both spatial and temporal variability were better characterized in panel regression. In a case study, regional rainfall elasticities estimated from panel regressions were applied to ungaged basins on Maui, using available rainfall projections to estimate plausible changes in surface-water availability and usable stream habitat for native species. The presented panel-regression framework is shown to offer benefits over existing traditional hydrologic regression methods for developing robust regional relations to investigate streamflow response in a changing climate.

  7. The impact of anthropogenic emissions and meteorological conditions on the spatial variation of ambient SO2 concentrations: A panel study of 113 Chinese cities.

    PubMed

    Yang, Xue; Wang, Shaojian; Zhang, Wenzhong; Zhan, Dongsheng; Li, Jiaming

    2017-04-15

    China has received increased international criticism in recent years in relation to its air pollution levels, both in terms of the transmission of pollutants across international borders and the attendant adverse health effects being witnessed. Whilst existing research has examined the factors influencing ambient air pollutant concentrations, previous studies have failed to adequately explore the determinants of such concentrations from either a source or diffusion perspective. This study addressed both source (specifically, anthropogenic emissions) and diffusion (namely, meteorological conditions) indicators, in order to detect their respective impacts on the spatial variations seen in the distribution of air pollution. Spatial panel data for 113 major cities in China were processed using a range of global regression models (the ordinary least squares model, the spatial lag model, and the spatial error model) as well as a local, geographically weighted regression (GWR) model. Results from the study suggest that in 2014, average SO2 concentrations exceeded China's first-level target. The most polluted cities were found to be predominantly located in northern China, while less polluted cities were located in southern China. Global regression results indicated that precipitation exerts a significant effect on SO2 reduction (p<0.001) and that a regional increase of 1 mm in precipitation can reduce SO2 concentrations by 0.026 μg/m3. Both emission and temperature factors were found to aggravate SO2 concentrations, although no such significant correlation was found in relation to wind speed. GWR results suggest that the association between SO2 and its factors varied over space. Increased emissions were found to be able to produce more pollution in the northwest than in other parts of the country. Higher wind speeds and temperatures in northwestern areas were shown to reinforce SO2 pollution, while in southern regions, they had the opposite effect.
Further, increased precipitation was found to exert a greater inhibitory effect on SO2 pollution in the country's northeast than in other areas. Our findings could provide a detailed reference for formulating regionally specific emission reduction policies in China. Copyright © 2016 Elsevier B.V. All rights reserved.
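
    The idea that an association "varies over space" under GWR can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a Gaussian distance kernel, a single predictor, and entirely hypothetical coordinates, values, and bandwidth. A separate weighted least-squares line is fitted at each location, so nearby observations dominate each local slope.

```python
import math

def gaussian_weight(distance, bandwidth):
    """Gaussian kernel: weight 1 at distance 0, decaying smoothly with distance."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def local_fit(target, coords, x, y, bandwidth):
    """Return (intercept, slope) of a weighted least-squares line centred on `target`."""
    w = [gaussian_weight(math.dist(target, c), bandwidth) for c in coords]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data: two clusters of cities with different emission-to-SO2 slopes.
coords = [(0, 0), (1, 0), (2, 0), (8, 0), (9, 0), (10, 0)]
emissions = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
so2 = [10.0, 12.0, 14.0, 10.0, 15.0, 20.0]  # slope 2 in the west, 5 in the east

# One local slope per city; the bandwidth of 1.5 is an arbitrary assumption.
local_slopes = [local_fit(c, coords, emissions, so2, bandwidth=1.5)[1] for c in coords]
```

    In this toy setting the local slopes in the western cluster stay close to 2 and those in the eastern cluster close to 5, which is the kind of spatial non-stationarity a single global coefficient would average away.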

  8. Detecting spatio-temporal changes in agricultural land use in Heilongjiang province, China using MODIS time-series data and a random forest regression model

    NASA Astrophysics Data System (ADS)

    Hu, Q.; Friedl, M. A.; Wu, W.

    2017-12-01

    Accurate and timely information regarding the spatial distribution of crop types and their changes is essential for acreage surveys, yield estimation, water management, and agricultural production decision-making. In recent years, increasing population, dietary shifts and climate change have driven drastic changes in China's agricultural land use. However, no maps are currently available that document the spatial and temporal patterns of these agricultural land use changes. Because of its short revisit period, rich spectral bands and global coverage, MODIS time series data has been shown to have great potential for detecting the seasonal dynamics of different crop types. However, its inherently coarse spatial resolution limits the accuracy with which crops can be identified from MODIS in regions with small fields or complex agricultural landscapes. To evaluate this more carefully and specifically understand the strengths and weaknesses of MODIS data for crop-type mapping, we used MODIS time-series imagery to map the sub-pixel fractional crop area for four major crop types (rice, corn, soybean and wheat) at 500-m spatial resolution for Heilongjiang province, one of the most important grain-production regions in China where recent agricultural land use change has been rapid and pronounced. To do this, a random forest regression (RF-g) model was constructed to estimate the percentage of each sub-pixel crop type in 2006, 2011 and 2016. Crop type maps generated through expert visual interpretation of high spatial resolution images (i.e., Landsat and SPOT data) were used to calibrate the regression model. Five different time series of vegetation indices (155 features) derived from different spectral channels of MODIS land surface reflectance (MOD09A1) data were used as candidate features for the RF-g model. An out-of-bag strategy and backward elimination approach was applied to select the optimal spectra-temporal feature subset for each crop type. 
The resulting crop maps were assessed in two ways: (1) wall-to-wall pixel comparison with corresponding high spatial resolution reference maps; and (2) county-level comparison with census data. Based on these derived maps, changes in crop type, total area, and spatial patterns of change in Heilongjiang province during 2006-2016 were analyzed.

  9. Evaluating the spatial variation of total mercury in young-of-year yellow perch (Perca flavescens), surface water and upland soil for watershed-lake systems within the southern Boreal Shield.

    PubMed

    Gabriel, Mark C; Kolka, Randy; Wickman, Trent; Nater, Ed; Woodruff, Laurel

    2009-06-15

    The primary objective of this research is to investigate relationships between mercury in upland soil, lake water and fish tissue and explore the cause for the observed spatial variation of THg in age one yellow perch (Perca flavescens) for ten lakes within the Superior National Forest. Spatial relationships between yellow perch THg tissue concentration and a total of 45 watershed and water chemistry parameters were evaluated for two separate years: 2005 and 2006. Results show agreement with other studies where watershed area, lake water pH, nutrient levels (specifically dissolved NO(3)(-)-N) and dissolved iron are important factors controlling and/or predicting fish THg level. Exceeding all was the strong dependence of yellow perch THg level on soil A-horizon THg and, in particular, soil O-horizon THg concentrations (Spearman rho=0.81). Soil B-horizon THg concentration was significantly correlated (Pearson r=0.75) with lake water THg concentration. Lakes surrounded by a greater percentage of shrub wetlands (peatlands) had higher fish tissue THg levels, thus it is highly possible that these wetlands are main locations for mercury methylation. Stepwise regression was used to develop empirical models for the purpose of predicting the spatial variation in yellow perch THg over the studied region. The 2005 regression model demonstrates it is possible to obtain good prediction (up to 60% variance description) of resident yellow perch THg level using upland soil O-horizon THg as the only independent variable. The 2006 model shows even greater prediction (r(2)=0.73, with an overall 10 ng/g [tissue, wet weight] margin of error), using lake water dissolved iron and watershed area as the only model independent variables. The developed regression models in this study can help with interpreting THg concentrations in low trophic level fish species for untested lakes of the greater Superior National Forest and surrounding Boreal ecosystem.

  10. The rubber plantation environment and Lassa fever epidemics in Liberia, 2008-2012: a spatial regression.

    PubMed

    Olugasa, Babasola O; Dogba, John B; Ogunro, Bamidele; Odigie, Eugene A; Nykoi, Jomah; Ojo, Johnson F; Taiwo, Olalekan; Kamara, Abraham; Mulbah, Charles K; Fasunla, Ayotunde J

    2014-10-01

    As Lassa fever continues to be a public health challenge in West Africa, it is critical to produce good maps of its risk pattern for use in active surveillance and control intervention. We identified eight spatial features related to the rubber plantation environment and used them as explanatory variables for Lassa fever (LF) outbreaks on the Uniroyal Liberian Agricultural Company (LAC) rubber plantation environment in Grand Bassa County, Liberia. We computed classical and spatial lag regression models on all spatial features, including proximity of residential camp to rubber tree-edge, main road in the plantation, LAC hospital, rice farmland, household refuse dump, human population density, post-harvest storage density of rice and density of rodent deterrent on rice storage. We found significant (p=0.0024) spatial autocorrelation between LF cases and the spatial features we have considered. We concluded that the rubber plantation environment influenced Mastomys species' breeding and transmission of Lassa virus along spatial scale to humans. The risk factors identified in this study offered a baseline for more effective surveillance and control of LF in the post-civil conflict Liberia. Copyright © 2014 Elsevier Ltd. All rights reserved.

  11. A regression-kriging model for estimation of rainfall in the Laohahe basin

    NASA Astrophysics Data System (ADS)

    Wang, Hong; Ren, Li L.; Liu, Gao H.

    2009-10-01

    This paper presents a multivariate geostatistical algorithm called regression-kriging (RK) for predicting the spatial distribution of rainfall by incorporating five topographic/geographic factors of latitude, longitude, altitude, slope and aspect. The technique is illustrated using rainfall data collected at 52 rain gauges from the Laohahe basin in northeast China during 1986-2005. Rainfall data from 44 stations were selected for modeling and the remaining 8 stations were used for model validation. To eliminate multicollinearity, the five explanatory factors were first transformed using factor analysis with three Principal Components (PCs) extracted. The rainfall data were then fitted using step-wise regression and residuals interpolated using SK. The regression coefficients were estimated by generalized least squares (GLS), which takes the spatial heteroskedasticity between rainfall and PCs into account. Finally, the rainfall prediction based on RK was compared with that predicted from ordinary kriging (OK) and ordinary least squares (OLS) multiple regression (MR). Because correlated topographic factors are taken into account, RK improves the efficiency of predictions. RK achieved a lower relative root mean square error (RMSE) (44.67%) than MR (49.23%) and OK (73.60%) and a lower bias than MR and OK (23.82 versus 30.89 and 32.15 mm) for annual rainfall. It is much more effective for the wet season than for the dry season. RK is suitable for estimation of rainfall in areas where there are no stations nearby and where topography has a major influence on rainfall.
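
    The validation metrics quoted above can be sketched as follows. The abstract does not define them, so this sketch assumes that relative RMSE is the RMSE expressed as a percentage of the mean observed value and that bias is the mean prediction error; the station values are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)

def relative_rmse(observed, predicted):
    """RMSE as a percentage of the mean observed value (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    return 100.0 * rmse(observed, predicted) / mean_obs

def bias(observed, predicted):
    """Mean error, predicted minus observed (assumed definition)."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)

# Hypothetical annual rainfall (mm) at four validation stations.
observed = [400.0, 500.0, 600.0, 500.0]
predicted = [450.0, 450.0, 650.0, 550.0]

rel = relative_rmse(observed, predicted)  # RMSE of 50 mm over a mean of 500 mm
err = bias(observed, predicted)           # average over-prediction in mm
```

    Computing both metrics on held-out stations, as the study did with its 8 validation gauges, is what allows RK, MR and OK to be compared on equal terms.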

  12. Comparing Different Approaches for Mapping Urban Vegetation Cover from Landsat ETM+ Data: A Case Study on Brussels

    PubMed Central

    Van de Voorde, Tim; Vlaeminck, Jeroen; Canters, Frank

    2008-01-01

    Urban growth and its related environmental problems call for sustainable urban management policies to safeguard the quality of urban environments. Vegetation plays an important part in this as it provides ecological, social, health and economic benefits to a city's inhabitants. Remotely sensed data are of great value to monitor urban green and despite the clear advantages of contemporary high resolution images, the benefits of medium resolution data should not be discarded. The objective of this research was to estimate fractional vegetation cover from a Landsat ETM+ image with sub-pixel classification, and to compare accuracies obtained with multiple stepwise regression analysis, linear spectral unmixing and multi-layer perceptrons (MLP) at the level of meaningful urban spatial entities. Despite the small, but nevertheless statistically significant differences at pixel level between the alternative approaches, the spatial pattern of vegetation cover and estimation errors is clearly distinctive at neighbourhood level. At this spatially aggregated level, a simple regression model appears to attain sufficient accuracy. For mapping at a spatially more detailed level, the MLP seems to be the most appropriate choice. Brightness normalisation only appeared to affect the linear models, especially the linear spectral unmixing. PMID:27879914

  13. The spatial and temporal association of neighborhood drug markets and rates of sexually transmitted infections in an urban setting.

    PubMed

    Jennings, Jacky M; Woods, Stacy E; Curriero, Frank C

    2013-09-01

    This study examined temporal and spatial relationships between neighborhood drug markets and gonorrhea among census block groups from 2002 to 2005. This was a spatial, longitudinal ecologic study. Poisson regression was used with adjustment in final models for socioeconomic status, residential stability and vacant housing. Increased drug market arrests were significantly associated with an 11% increase in gonorrhea (adjusted relative risk (ARR) 1.11; 95% CI 1.05, 1.16). Increased drug market arrests in adjacent neighborhoods were significantly associated with a 27% increase in gonorrhea (ARR 1.27; 95% CI 1.16, 1.36), independent of focal neighborhood drug markets. Increased drug market arrests in the previous year in focal neighborhoods were not associated with gonorrhea (ARR 1.04; 95% CI 0.98, 1.10), adjusting for focal and adjacent drug markets. While the temporal association was not supported, our findings support an associative link between drug markets and gonorrhea. The findings suggest that drug markets and their associated sexual networks may extend beyond local neighborhood boundaries, indicating the importance of including spatial lags in regression models investigating these associations. Copyright © 2013 Elsevier Ltd. All rights reserved.
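
    The "spatial lag" covariate and the reading of an adjusted relative risk as a percentage change can be sketched as below. This is an illustration only, not the study's code: it assumes the lag is the plain average over adjacent units (row-standardised weights), and the arrest counts and adjacency list are hypothetical.

```python
def spatial_lag(values, neighbors):
    """Average of each unit's neighbours' values (row-standardised weights)."""
    return [
        sum(values[j] for j in neighbors[i]) / len(neighbors[i])
        for i in range(len(values))
    ]

def percent_change(relative_risk):
    """Convert a relative risk into the percentage change it implies."""
    return (relative_risk - 1.0) * 100.0

# Hypothetical drug market arrests for four block groups arranged in a row.
arrests = [10.0, 20.0, 30.0, 40.0]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

# Lagged covariate: what is happening in the adjacent block groups.
lagged_arrests = spatial_lag(arrests, neighbors)

focal_pct = percent_change(1.11)     # ARR for focal neighbourhood arrests
adjacent_pct = percent_change(1.27)  # ARR for adjacent neighbourhood arrests
```

    Entering both `arrests` and `lagged_arrests` as predictors is what lets a model separate the focal effect from the adjacent-neighbourhood effect reported in the abstract.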

  14. The spatial and temporal association of neighborhood drug markets and rates of sexually transmitted infections in an urban setting

    PubMed Central

    Jennings, Jacky M.; Woods, Stacy E.; Curriero, Frank C.

    2013-01-01

    This study examined temporal and spatial relationships between neighborhood drug markets and gonorrhea among census block groups from 2002 to 2005. This was a spatial, longitudinal ecologic study. Poisson regression was used with adjustment in final models for socioeconomic status, residential stability and vacant housing. Increased drug market arrests were significantly associated with an 11% increase in gonorrhea (Adjusted Relative Risk (ARR) 1.11; 95% CI 1.05, 1.16). Increased drug market arrests in adjacent neighborhoods were significantly associated with a 27% increase in gonorrhea (ARR 1.27; 95% CI 1.16, 1.36), independent of focal neighborhood drug markets. Increased drug market arrests in the previous year in focal neighborhoods were not associated with gonorrhea (ARR 1.04; 95% CI 0.98, 1.10), adjusting for focal and adjacent drug markets. While the temporal association was not supported, our findings support an associative link between drug markets and gonorrhea. The findings suggest that drug markets and their associated sexual networks may extend beyond local neighborhood boundaries, indicating the importance of including spatial lags in regression models investigating these associations. PMID:23872251

  15. Status and Trends of Nitrogen Loads to Estuaries of the Conterminous U.S.

    EPA Science Inventory

    We applied regional SPARROW (SPAtially Referenced Regressions On Watershed attributes) models to estimate status and trends of potential nitrogen loads to estuaries of the conterminous United States. The original SPARROW models predict average detrended loads by source based on ...

  16. Modeling stream network-scale variation in coho salmon overwinter survival and smolt size

    EPA Science Inventory

    We used multiple regression and hierarchical mixed-effects models to examine spatial patterns of overwinter survival and size at smolting in juvenile coho salmon Oncorhynchus kisutch in relation to habitat attributes across an extensive stream network in southwestern Oregon over ...

  17. A consistent positive association between landscape simplification and insecticide use across the Midwestern US from 1997 through 2012

    DOE PAGES

    Meehan, Timothy D.; Gratton, Claudio

    2015-10-27

    During 2007, counties across the Midwestern US with relatively high levels of landscape simplification (i.e., widespread replacement of seminatural habitats with cultivated crops) had relatively high crop-pest abundances which, in turn, were associated with relatively high insecticide application. These results suggested a positive relationship between landscape simplification and insecticide use, mediated by landscape effects on crop pests or their natural enemies. A follow-up study, in the same region but using different statistical methods, explored the relationship between landscape simplification and insecticide use between 1987 and 2007, and concluded that the relationship varied substantially in sign and strength across years. Here, we explore this relationship from 1997 through 2012, using a single dataset and two different analytical approaches. We demonstrate that, when using ordinary least squares (OLS) regression, the relationship between landscape simplification and insecticide use is, indeed, quite variable over time. However, the residuals from OLS models show strong spatial autocorrelation, indicating spatial structure in the data not accounted for by explanatory variables, and violating a standard assumption of OLS. When modeled using spatial regression techniques, relationships between landscape simplification and insecticide use were consistently positive between 1997 and 2012, and model fits were dramatically improved. We argue that spatial regression methods are more appropriate for these data, and conclude that there remains compelling correlative support for a link between landscape simplification and insecticide use in the Midwestern US. We discuss the limitations of inference from this and related studies, and suggest improved data collection campaigns for better understanding links between landscape structure, crop-pest pressure, and pest-management practices.
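
    A common way to check regression residuals for spatial autocorrelation is Moran's I. The abstract does not name the statistic the authors used, so the sketch below is an assumption on that point; it also assumes binary adjacency weights and uses hypothetical residuals for four counties arranged in a row.

```python
def morans_i(values, neighbors):
    """Moran's I with binary adjacency weights.

    `neighbors[i]` lists the indices adjacent to unit i; each listed pair
    contributes a weight of 1.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                       # deviations from the mean
    cross = sum(z[i] * z[j] for i in range(n) for j in neighbors[i])
    s0 = sum(len(neighbors[i]) for i in range(n))        # sum of all weights
    return (n / s0) * cross / sum(zi * zi for zi in z)

# Hypothetical adjacency: counties 0-1-2-3 in a line.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

clustered = morans_i([1.0, 1.0, -1.0, -1.0], neighbors)    # similar residuals adjoin: positive (about 0.33)
alternating = morans_i([1.0, -1.0, 1.0, -1.0], neighbors)  # opposite residuals adjoin: negative (about -1)
```

    A clearly positive value for OLS residuals, as in the clustered case, signals spatial structure that the explanatory variables have not absorbed, which is the motivation the abstract gives for moving to spatial regression.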

  18. A consistent positive association between landscape simplification and insecticide use across the Midwestern US from 1997 through 2012

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Meehan, Timothy D.; Gratton, Claudio

    During 2007, counties across the Midwestern US with relatively high levels of landscape simplification (i.e., widespread replacement of seminatural habitats with cultivated crops) had relatively high crop-pest abundances which, in turn, were associated with relatively high insecticide application. These results suggested a positive relationship between landscape simplification and insecticide use, mediated by landscape effects on crop pests or their natural enemies. A follow-up study, in the same region but using different statistical methods, explored the relationship between landscape simplification and insecticide use between 1987 and 2007, and concluded that the relationship varied substantially in sign and strength across years. Here, we explore this relationship from 1997 through 2012, using a single dataset and two different analytical approaches. We demonstrate that, when using ordinary least squares (OLS) regression, the relationship between landscape simplification and insecticide use is, indeed, quite variable over time. However, the residuals from OLS models show strong spatial autocorrelation, indicating spatial structure in the data not accounted for by explanatory variables, and violating a standard assumption of OLS. When modeled using spatial regression techniques, relationships between landscape simplification and insecticide use were consistently positive between 1997 and 2012, and model fits were dramatically improved. We argue that spatial regression methods are more appropriate for these data, and conclude that there remains compelling correlative support for a link between landscape simplification and insecticide use in the Midwestern US. We discuss the limitations of inference from this and related studies, and suggest improved data collection campaigns for better understanding links between landscape structure, crop-pest pressure, and pest-management practices.

  19. Geographic dimensions of heat-related mortality in seven U.S. cities.

    PubMed

    Hondula, David M; Davis, Robert E; Saha, Michael V; Wegner, Carleigh R; Veazey, Lindsay M

    2015-04-01

    Spatially targeted interventions may help protect the public when extreme heat occurs. Health outcome data are increasingly being used to map intra-urban variability in heat-health risks, but there has been little effort to compare patterns and risk factors between cities. We sought to identify places within large metropolitan areas where the mortality rate is highest on hot summer days and determine if characteristics of high-risk areas are consistent from one city to another. A Poisson regression model was adapted to quantify temperature-mortality relationships at the postal code scale based on 2.1 million records of daily all-cause mortality counts from seven U.S. cities. Multivariate spatial regression models were then used to determine the demographic and environmental variables most closely associated with intra-city variability in risk. Significant mortality increases on extreme heat days were confined to 12-44% of postal codes comprising each city. Places with greater risk had more developed land, young, elderly, and minority residents, and lower income and educational attainment, but the key explanatory variables varied from one city to another. Regression models accounted for 14-34% of the spatial variability in heat-related mortality. The results emphasize the need for public health plans for heat to be locally tailored and not assume that pre-identified vulnerability indicators are universally applicable. As known risk factors accounted for no more than one third of the spatial variability in heat-health outcomes, consideration of health outcome data is important in efforts to identify and protect residents of the places where the heat-related health risks are the highest. Copyright © 2015 Elsevier Inc. All rights reserved.

  20. Spatial Variability of Plant Available Water, Soil Organic Carbon, and Microbial Biomass under Divergent Land Uses: A Comparison among Regression-Kriging, Cokriging, and Regression-Cokriging

    NASA Astrophysics Data System (ADS)

    Kiani, M.; Hernandez Ramirez, G.; Quideau, S.

    2016-12-01

    Improved knowledge about the spatial variability of plant available water (PAW), soil organic carbon (SOC), and microbial biomass carbon (MBC) as affected by land-use systems can underpin the identification and inventory of beneficial ecosystem goods and services in both agricultural and wild lands. Little research has been done that addresses the spatial patterns of PAW, SOC, and MBC under different land use types at a field scale. Therefore, we collected 56 soil samples (5-10 cm depth increment), using a nested cyclic sampling design within both a native grassland (NG) site and an irrigated cultivated (IC) site located near Brooks, Alberta. We characterized the spatial heterogeneities of PAW, SOC, and MBC under NG and IC using classical statistics and several geostatistical methods: ordinary kriging (OK), regression-kriging (RK), cokriging (COK), and regression-cokriging (RCOK). Converting the native grassland to irrigated cultivated land altered soil pore distribution by reducing macroporosity, which led to lower saturated water content and half the hydraulic conductivity in IC compared to NG. This conversion also decreased the relative abundance of gram-negative bacteria, while increasing both the proportion of gram-positive bacteria and MBC concentration. At both studied sites, the best fitted spatial model was Gaussian, based on lower RSS and higher R2 as criteria. The IC site had a stronger degree of spatial dependence and a longer range of spatial autocorrelation, revealing a homogenization of the spatial variability of soil properties as a result of intensive, recurrent agricultural activities. Comparison of the OK, RK, COK, and RCOK approaches indicated that the cokriging method performed best, demonstrating a profound improvement in the accuracy of spatial estimations of PAW, SOC, and MBC.
It seems that the combination of terrain covariates such as elevation and depth-to-water with kriging techniques offers more capability for incorporating explicit ancillary information in predictive soil mapping. Overall, identification of spatial patterns of soil properties in agricultural lands gives a bird's eye view to land owners to implement and improve management practices which lead to more sustainable production.

  1. Estimation of aboveground biomass in Mediterranean forests by statistical modelling of ASTER fraction images

    NASA Astrophysics Data System (ADS)

    Fernández-Manso, O.; Fernández-Manso, A.; Quintano, C.

    2014-09-01

    Aboveground biomass (AGB) estimation from optical satellite data is usually based on regression models of original or synthetic bands. To overcome the poor relation between AGB and spectral bands due to mixed-pixels when a medium spatial resolution sensor is considered, we propose to base the AGB estimation on fraction images from Linear Spectral Mixture Analysis (LSMA). Our study area is a managed Mediterranean pine woodland (Pinus pinaster Ait.) in central Spain. A total of 1033 circular field plots were used to estimate AGB from Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) optical data. We applied Pearson correlation statistics and stepwise multiple regression to identify suitable predictors from the set of variables of original bands, fraction imagery, Normalized Difference Vegetation Index and Tasselled Cap components. Four linear models and one nonlinear model were tested. A linear combination of ASTER band 2 (red, 0.630-0.690 μm), band 8 (short wave infrared 5, 2.295-2.365 μm) and green vegetation fraction (from LSMA) was the best AGB predictor (adjusted R2 = 0.632; the cross-validated root-mean-squared error of estimated AGB was 13.3 Mg ha-1, or 37.7%), rather than other combinations of the above cited independent variables. Results indicated that using ASTER fraction images in regression models improves the AGB estimation in Mediterranean pine forests. The spatial distribution of the estimated AGB, based on a multiple linear regression model, may be used as baseline information for forest managers in future studies, such as quantifying the regional carbon budget, fuel accumulation or monitoring of management practices.

  2. Predictive modeling of hazardous waste landfill total above-ground biomass using passive optical and LIDAR remotely sensed data

    NASA Astrophysics Data System (ADS)

    Hadley, Brian Christopher

    This dissertation assessed remotely sensed data and geospatial modeling technique(s) to map the spatial distribution of total above-ground biomass present on the surface of the Savannah River National Laboratory's (SRNL) Mixed Waste Management Facility (MWMF) hazardous waste landfill. Ordinary least squares (OLS) regression, regression kriging, and tree-structured regression were employed to model the empirical relationship between in-situ measured Bahia (Paspalum notatum Flugge) and Centipede [Eremochloa ophiuroides (Munro) Hack.] grass biomass against an assortment of explanatory variables extracted from fine spatial resolution passive optical and LIDAR remotely sensed data. Explanatory variables included: (1) discrete channels of visible, near-infrared (NIR), and short-wave infrared (SWIR) reflectance, (2) spectral vegetation indices (SVI), (3) spectral mixture analysis (SMA) modeled fractions, (4) narrow-band derivative-based vegetation indices, and (5) LIDAR derived topographic variables (i.e. elevation, slope, and aspect). Results showed that a linear combination of the first- (1DZ_DGVI), second- (2DZ_DGVI), and third-derivative of green vegetation indices (3DZ_DGVI) calculated from hyperspectral data recorded over the 400-960 nm wavelengths of the electromagnetic spectrum explained the largest percentage of statistical variation (R2 = 0.5184) in the total above-ground biomass measurements. In general, the topographic variables did not correlate well with the MWMF biomass data, accounting for less than five percent of the statistical variation. It was concluded that tree-structured regression represented the optimum geospatial modeling technique due to a combination of model performance and efficiency/flexibility factors.

  3. [Detecting the moisture content of forest surface soil based on the microwave remote sensing technology].

    PubMed

    Li, Ming Ze; Gao, Yuan Ke; Di, Xue Ying; Fan, Wen Yi

    2016-03-01

    The moisture content of forest surface soil is an important parameter in forest ecosystems. It is practically significant for forest ecosystem related research to use microwave remote sensing technology for rapid and accurate estimation of the moisture content of forest surface soil. With the aid of TDR-300 soil moisture content measuring instrument, the moisture contents of forest surface soils of 120 sample plots at Tahe Forestry Bureau of Daxing'anling region in Heilongjiang Province were measured. Taking the moisture content of forest surface soil as the dependent variable and the polarization decomposition parameters of C band Quad-pol SAR data as independent variables, two types of quantitative estimation models (multilinear regression model and BP-neural network model) for predicting moisture content of forest surface soils were developed. The spatial distribution of moisture content of forest surface soil on the regional scale was then derived with model inversion. Results showed that the model precision was 86.0% and 89.4% with RMSE of 3.0% and 2.7% for the multilinear regression model and the BP-neural network model, respectively. It indicated that the BP-neural network model had a better performance than the multilinear regression model in quantitative estimation of the moisture content of forest surface soil. The spatial distribution of forest surface soil moisture content in the study area was then obtained by using the BP neural network model simulation with the Quad-pol SAR data.

  4. Violent crime in San Antonio, Texas: an application of spatial epidemiological methods.

    PubMed

    Sparks, Corey S

    2011-12-01

    Violent crimes are rarely considered a public health problem or investigated using epidemiological methods. But patterns of violent crime and other health conditions are often affected by similar characteristics of the built environment. In this paper, methods and perspectives from spatial epidemiology are used in an analysis of violent crimes in San Antonio, TX. Bayesian statistical methods are used to examine the contextual influence of several aspects of the built environment. Additionally, spatial regression models using Bayesian model specifications are used to examine spatial patterns of violent crime risk. Results indicate that the determinants of violent crime depend on the model specification, but are primarily related to the built environment and neighborhood socioeconomic conditions. Results are discussed within the context of a rapidly growing urban area with a diverse population. Copyright © 2011 Elsevier Ltd. All rights reserved.

  5. Post-Modeling Histogram Matching of Maps Produced Using Regression Trees

    Treesearch

    Andrew J. Lister; Tonya W. Lister

    2006-01-01

    Spatial predictive models often use statistical techniques that in some way rely on averaging of values. Estimates from linear modeling are known to be susceptible to truncation of variance when the independent (predictor) variables are measured with error. A straightforward post-processing technique (histogram matching) for attempting to mitigate this effect is...

  6. Large-area forest inventory regression modeling: spatial scale considerations

    Treesearch

    James A. Westfall

    2015-01-01

    In many forest inventories, statistical models are employed to predict values for attributes that are difficult and/or time-consuming to measure. In some applications, models are applied across a large geographic area, which assumes the relationship between the response variable and predictors is ubiquitously invariable within the area. The extent to which this...

  7. Characterizing the spatial distribution of ambient ultrafine particles in Toronto, Canada: A land use regression model.

    PubMed

    Weichenthal, Scott; Van Ryswyk, Keith; Goldstein, Alon; Shekarrizfard, Maryam; Hatzopoulou, Marianne

    2016-01-01

    Exposure models are needed to evaluate the chronic health effects of ambient ultrafine particles (<0.1 μm) (UFPs). We developed a land use regression model for ambient UFPs in Toronto, Canada using mobile monitoring data collected during summer/winter 2010-2011. In total, 405 road segments were included in the analysis. The final model explained 67% of the spatial variation in mean UFPs and included terms for the logarithm of distances to highways, major roads, the central business district, Pearson airport, and bus routes as well as variables for the number of on-street trees, parks, open space, and the length of bus routes within a 100 m buffer. There was no systematic difference between measured and predicted values when the model was evaluated in an external dataset, although the R(2) value decreased (R(2) = 50%). This model will be used to evaluate the chronic health effects of UFPs using population-based cohorts in the Toronto area. Crown Copyright © 2015. Published by Elsevier Ltd. All rights reserved.

  8. Empirical Assessment of Spatial Prediction Methods for Location Cost Adjustment Factors

    PubMed Central

    Migliaccio, Giovanni C.; Guindani, Michele; D'Incognito, Maria; Zhang, Linlin

    2014-01-01

    In the feasibility stage, the correct prediction of construction costs ensures that budget requirements are met from the start of a project's lifecycle. A very common approach for performing quick order-of-magnitude estimates is based on using Location Cost Adjustment Factors (LCAFs) that compute historically based costs by project location. Numerous LCAF datasets are commercially available in North America, but they do not cover every location. Hence, LCAFs for un-sampled locations need to be inferred through spatial interpolation or prediction methods. Currently, practitioners tend to select the value for a location using only one variable, namely the shortest linear distance between two sites. However, construction costs could be affected by socio-economic variables, as suggested by macroeconomic theories. Using a commonly used set of LCAFs, the City Cost Indexes (CCI) by RSMeans, and the socio-economic variables included in the ESRI Community Sourcebook, this article provides several contributions to the body of knowledge. First, the accuracy of various spatial prediction methods in estimating LCAF values for un-sampled locations was evaluated and compared with that of spatial interpolation methods. Two regression-based prediction models were selected: a global regression analysis and a geographically weighted regression (GWR) analysis. Once these models were compared against interpolation methods, the results showed that GWR is the most appropriate way to model CCI as a function of multiple covariates. The outcome of GWR, for each covariate, was studied for all 48 states in the contiguous US. As a direct consequence of spatial non-stationarity, it was possible to discuss the influence of each single covariate differently from state to state. In addition, the article includes a first attempt to determine whether the observed variability in cost index values could be, at least partially, explained by independent socio-economic variables. PMID:25018582
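
    The geographically weighted regression (GWR) named in the record above can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the fixed bandwidth, the function name `gwr_coefficients` and the synthetic data are all assumptions chosen for illustration. At each location, an ordinary weighted least-squares fit is run with weights that decay with distance, so each location gets its own coefficients.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Fit one weighted least-squares model per location (hypothetical helper)."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])        # design matrix with intercept
    betas = np.empty((n, A.shape[1]))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)  # Gaussian kernel (assumed)
        Aw = A * w[:, None]
        # Solve (A' W A) beta = A' W y for the local coefficients.
        betas[i] = np.linalg.solve(A.T @ Aw, Aw.T @ y)
    return betas

# Synthetic data (assumed): the slope on x grows from west to east.
rng = np.random.default_rng(0)
n = 300
coords = rng.uniform(0.0, 10.0, size=(n, 2))
x = rng.normal(size=n)
y = 2.0 + (1.0 + 0.3 * coords[:, 0]) * x + rng.normal(scale=0.1, size=n)

betas = gwr_coefficients(coords, x, y, bandwidth=2.0)
west = betas[coords[:, 0] < 3.0, 1].mean()
east = betas[coords[:, 0] > 7.0, 1].mean()
print(f"mean local slope, west: {west:.2f}, east: {east:.2f}")
```

    A single global regression would report one slope for the whole area; the local fits recover the west-to-east drift, which is the kind of spatial non-stationarity the record describes.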

  9. Spatial patterns of arrests, police assault and addiction treatment center locations in Tijuana, Mexico.

    PubMed

    Werb, Dan; Strathdee, Steffanie A; Vera, Alicia; Arredondo, Jaime; Beletsky, Leo; Gonzalez-Zuniga, Patricia; Gaines, Tommi

    2016-07-01

    In the context of a public health-oriented drug policy reform in Mexico, we assessed the spatial distribution of police encounters among people who inject drugs (PWID) in Tijuana, determined the association between these encounters and the location of addiction treatment centers and explored the association between police encounters and treatment access. Geographically weighted regression (GWR) and logistic regression analysis using prospective spatial data from a community-recruited cohort of PWID in Tijuana and official geographical arrest data from the Tijuana Municipal Police Department. Tijuana, Mexico. A total of 608 participants (median age 37; 28.4% female) in the prospective Proyecto El Cuete cohort study recruited between January and December 2011. We compared the mean distance of police encounters and a randomly distributed set of events to treatment centers. GWR was undertaken to model the spatial relationship between police interactions and treatment centers. Logistic regression analysis was used to investigate factors associated with reporting police interactions. During the study period, 27.5% of police encounters occurred within 500 m of treatment centers. The GWR model suggested spatial correlation between encounters and treatment centers (global R(2)  = 0.53). Reporting a need for addiction treatment was associated with reporting arrest and police assault [adjusted odds ratio = 2.74, 95% confidence interval (CI) = 1.25-6.02, P = 0.012]. A geospatial analysis suggests that, in Mexico, people who inject drugs are at greater risk of being a victim of police violence if they consider themselves in need of addiction treatment, and their interactions with police appear to be more frequent around treatment centers. © 2016 Society for the Study of Addiction.

  10. SPATIAL PATTERNS OF ARRESTS, POLICE ASSAULT, AND ADDICTION TREATMENT CENTER LOCATIONS IN TIJUANA, MEXICO

    PubMed Central

    Werb, D; Strathdee, SA; Vera, A; Arredondo, J; Beletsky, L; Gonzalez-Zuniga, P; Gaines, T

    2016-01-01

    Aims In the context of a public health-oriented drug policy reform in Mexico, we assessed the spatial distribution of police encounters among people who inject drugs (PWID) in Tijuana; determined the association between these encounters and the location of addiction treatment centers; and explored the association between police encounters and treatment access. Design Geographically weighted regression (GWR) and logistic regression analysis using prospective spatial data from a community-recruited cohort of PWID in Tijuana and official geographic arrest data from the Tijuana Municipal Police Department. Setting Tijuana, Mexico. Participants 608 participants (median age 37; 28.4% female) in the prospective Proyecto El Cuete cohort study recruited between January and December 2011. Measurements We compared the mean distance of police encounters and a randomly distributed set of events to treatment centers. GWR was undertaken to model the spatial relationship between police interactions and treatment centers. Logistic regression analysis was used to investigate factors associated with reporting police interactions. Findings During the study period, 27.5% of police encounters occurred within 500 meters of treatment centers. The GWR model suggested spatial correlation between encounters and treatment centers (Global R2 = 0.53). Reporting a need for addiction treatment was associated with reporting arrest and police assault (Adjusted Odds Ratio = 2.74, 95% Confidence Interval [CI]: 1.25–6.02, p = 0.012). Conclusions A geospatial analysis suggests that in Mexico, people who inject drugs are at greater risk of being a victim of police violence if they consider themselves in need of addiction treatment, and their interactions with police appear to be more frequent around treatment centres. PMID:26879179

  11. Spatial Heterogeneity in the Effects of Immigration and Diversity on Neighborhood Homicide Rates

    PubMed Central

    Graif, Corina; Sampson, Robert J.

    2010-01-01

    This paper examines the connection of immigration and diversity to homicide by advancing a recently developed approach to modeling spatial dynamics—geographically weighted regression. In contrast to traditional global averaging, we argue on substantive grounds that neighborhood characteristics vary in their effects across neighborhood space, a process of “spatial heterogeneity.” Much like treatment-effect heterogeneity and distinct from spatial spillover, our analysis finds considerable evidence that neighborhood characteristics in Chicago vary significantly in predicting homicide, in some cases showing countervailing effects depending on spatial location. In general, however, immigrant concentration is either unrelated or inversely related to homicide, whereas language diversity is consistently linked to lower homicide. The results shed new light on the immigration-homicide nexus and suggest the pitfalls of global averaging models that hide the reality of a highly diversified and spatially stratified metropolis. PMID:20671811

  12. Digital data used to relate nutrient inputs to water quality in the Chesapeake Bay watershed

    USGS Publications Warehouse

    Brakebill, John W.; Preston, Stephen D.

    1999-01-01

    Digital data sets were compiled by the U.S. Geological Survey (USGS) and used as input for a collection of Spatially Referenced Regressions On Watershed attributes for the Chesapeake Bay region. These regressions relate streamwater loads to nutrient sources and the factors that affect the transport of these nutrients throughout the watershed. A digital segmented network based on watershed boundaries serves as the primary foundation for spatially referencing total nitrogen and total phosphorus source and land-surface characteristic data sets within a Geographic Information System. Digital data sets of atmospheric wet deposition of nitrate, point-source discharge locations, land cover, and agricultural sources such as fertilizer and manure were created and compiled from numerous sources and represent nitrogen and phosphorus inputs. Some land-surface characteristics representing factors that affect the transport of nutrients include land use, land cover, average annual precipitation and temperature, slope, and soil permeability. Nutrient input and land-surface characteristic data sets merged with the segmented watershed network provide the spatial detail by watershed segment required by the models. Nutrient stream loads were estimated for total nitrogen, total phosphorus, nitrate/nitrite, ammonium, phosphate, and total suspended solids at as many as 109 sites within the Chesapeake Bay watershed. The total nitrogen and total phosphorus load estimates are the dependent variables for the regressions and were used for model calibration. Other nutrient-load estimates may be used for calibration in future applications of the models.

  13. Estimating maize production in Kenya using NDVI: Some statistical considerations

    USGS Publications Warehouse

    Lewis, J.E.; Rowland, James; Nadeau, A.

    1998-01-01

    A regression model approach using a normalized difference vegetation index (NDVI) has the potential for estimating crop production in East Africa. However, before production estimation can become a reality, the underlying model assumptions and statistical nature of the sample data (NDVI and crop production) must be examined rigorously. Annual maize production statistics from 1982-90 for 36 agricultural districts within Kenya were used as the dependent variable; median area NDVI (independent variable) values from each agricultural district and year were extracted from the annual maximum NDVI data set. The input data and the statistical association of NDVI with maize production for Kenya were tested systematically for the following items: (1) homogeneity of the data when pooling the sample, (2) gross data errors and influence points, (3) serial (time) correlation, (4) spatial autocorrelation and (5) stability of the regression coefficients. The results of using a simple regression model with NDVI as the only independent variable are encouraging (r 0.75, p 0.05) and illustrate that NDVI can be a responsive indicator of maize production, especially in areas of high NDVI spatial variability, which coincide with areas of production variability in Kenya.

  14. Data-driven discovery of partial differential equations.

    PubMed

    Rudy, Samuel H; Brunton, Steven L; Proctor, Joshua L; Kutz, J Nathan

    2017-04-01

    We propose a sparse regression method capable of discovering the governing partial differential equation(s) of a given system by time series measurements in the spatial domain. The regression framework relies on sparsity-promoting techniques to select the nonlinear and partial derivative terms of the governing equations that most accurately represent the data, bypassing a combinatorially large search through all possible candidate models. The method balances model complexity and regression accuracy by selecting a parsimonious model via Pareto analysis. Time series measurements can be made in an Eulerian framework, where the sensors are fixed spatially, or in a Lagrangian framework, where the sensors move with the dynamics. The method is computationally efficient, robust, and demonstrated to work on a variety of canonical problems spanning a number of scientific domains including Navier-Stokes, the quantum harmonic oscillator, and the diffusion equation. Moreover, the method is capable of disambiguating between potentially nonunique dynamical terms by using multiple time series taken with different initial data. Thus, for a traveling wave, the method can distinguish between a linear wave equation and the Korteweg-de Vries equation, for instance. The method provides a promising new technique for discovering governing equations and physical laws in parameterized spatiotemporal systems, where first-principles derivations are intractable.
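
    The sparsity-promoting selection of terms described in the record above can be sketched with sequentially thresholded least squares. This is a simplified stand-in and not necessarily the authors' exact algorithm; the function name `stlsq`, the threshold value and the random candidate library are assumptions for illustration. In a real application the columns of `theta` would hold candidate terms (for example, a field and its spatial derivatives) and `target` would hold the time derivative.

```python
import numpy as np

def stlsq(theta, target, threshold=0.1, max_iter=10):
    """Sequentially thresholded least squares (hypothetical helper)."""
    coef, *_ = np.linalg.lstsq(theta, target, rcond=None)
    for _ in range(max_iter):
        small = np.abs(coef) < threshold
        coef[small] = 0.0                     # prune weak candidate terms
        keep = ~small
        if not keep.any():
            break
        # Refit using only the surviving terms.
        coef[keep], *_ = np.linalg.lstsq(theta[:, keep], target, rcond=None)
    return coef

# Synthetic library of six candidate terms (assumed); only two are active.
rng = np.random.default_rng(0)
theta = rng.normal(size=(200, 6))
true_coef = np.array([0.0, 1.5, 0.0, 0.0, -0.8, 0.0])
target = theta @ true_coef + 0.01 * rng.normal(size=200)

coef = stlsq(theta, target)
print(np.round(coef, 3))
```

    The threshold trades model complexity against accuracy: raising it removes more terms, which is the balance the abstract attributes to its Pareto analysis.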

  15. Evaluating neighborhood structures for modeling intercity diffusion of large-scale dengue epidemics.

    PubMed

    Wen, Tzai-Hung; Hsu, Ching-Shun; Hu, Ming-Che

    2018-05-03

    Dengue fever is a vector-borne infectious disease that is transmitted by contact between vector mosquitoes and susceptible hosts. The literature has addressed the issue on quantifying the effect of individual mobility on dengue transmission. However, there are methodological concerns in the spatial regression model configuration for examining the effect of intercity-scale human mobility on dengue diffusion. The purposes of the study are to investigate the influence of neighborhood structures on intercity epidemic progression from pre-epidemic to epidemic periods and to compare definitions of different neighborhood structures for interpreting the spread of dengue epidemics. We proposed a framework for assessing the effect of model configurations on dengue incidence in 2014 and 2015, which were the most severe outbreaks in 70 years in Taiwan. Compared with the conventional model configuration in spatial regression analysis, our proposed model used a radiation model, which reflects population flow between townships, as a spatial weight to capture the structure of human mobility. The results of our model demonstrate better model fitting performance, indicating that the structure of human mobility has better explanatory power in dengue diffusion than the geometric structure of administration boundaries and geographic distance between centroids of cities. We also identified spatial-temporal hierarchy of dengue diffusion: dengue incidence would be influenced by its immediate neighboring townships during pre-epidemic and epidemic periods, and also with more distant neighbors (based on mobility) in pre-epidemic periods. Our findings suggest that the structure of population mobility could more reasonably capture urban-to-urban interactions, which implies that the hub cities could be a "bridge" for large-scale transmission and make townships that immediately connect to hub cities more vulnerable to dengue epidemics.

  16. Using a GIS model to assess terrestrial salamander response to alternative forest management plans

    Treesearch

    Eric J. Gustafson; Nathan L. Murphy; Thomas R. Crow

    2001-01-01

    A GIS model predicting the spatial distribution of terrestrial salamander abundance based on topography and forest age was developed using parameters derived from the literature. The model was tested by sampling salamander abundance across the full range of site conditions used in the model. A regression of the predictions of our GIS model against these sample data...

  17. Modeling animal movements using stochastic differential equations

    Treesearch

    Haiganoush K. Preisler; Alan A. Ager; Bruce K. Johnson; John G. Kie

    2004-01-01

    We describe the use of bivariate stochastic differential equations (SDE) for modeling movements of 216 radiocollared female Rocky Mountain elk at the Starkey Experimental Forest and Range in northeastern Oregon. Spatially and temporally explicit vector fields were estimated using approximating difference equations and nonparametric regression techniques. Estimated...

  18. Comprehensive Status and Trends of Nitrogen Loads to Estuaries in the Conterminous United States: Pacific Coast Results

    EPA Science Inventory

    We applied regional SPARROW (SPAtially Referenced Regressions On Watershed attributes) models to estimate status and trends of potential nitrogen loads to estuaries of the conterminous United States. The original Regional SPARROW models predict average detrended loads by source ...

  19. Subpixel Snow Cover Mapping from MODIS Data by Nonparametric Regression Splines

    NASA Astrophysics Data System (ADS)

    Akyurek, Z.; Kuter, S.; Weber, G. W.

    2016-12-01

    Spatial extent of snow cover is often considered as one of the key parameters in climatological, hydrological and ecological modeling due to its energy storage, high reflectance in the visible and NIR regions of the electromagnetic spectrum, significant heat capacity and insulating properties. A significant challenge in snow mapping by remote sensing (RS) is the trade-off between the temporal and spatial resolution of satellite imageries. In order to tackle this issue, machine learning-based subpixel snow mapping methods, like Artificial Neural Networks (ANNs), from low or moderate resolution images have been proposed. Multivariate Adaptive Regression Splines (MARS) is a nonparametric regression tool that can build flexible models for high dimensional and complex nonlinear data. Although MARS is not often employed in RS, it has various successful implementations such as estimation of vertical total electron content in ionosphere, atmospheric correction and classification of satellite images. This study is the first attempt in RS to evaluate the applicability of MARS for subpixel snow cover mapping from MODIS data. Total 16 MODIS-Landsat ETM+ image pairs taken over European Alps between March 2000 and April 2003 were used in the study. MODIS top-of-atmospheric reflectance, NDSI, NDVI and land cover classes were used as predictor variables. Cloud-covered, cloud shadow, water and bad-quality pixels were excluded from further analysis by a spatial mask. MARS models were trained and validated by using reference fractional snow cover (FSC) maps generated from higher spatial resolution Landsat ETM+ binary snow cover maps. A multilayer feed-forward ANN with one hidden layer trained with backpropagation was also developed. The mutual comparison of obtained MARS and ANN models was accomplished on independent test areas. 
The MARS model performed better than the ANN model, with an average RMSE of 0.1288 over the independent test areas, whereas the average RMSE of the ANN model was 0.1500. MARS estimates for low FSC values (i.e., FSC < 0.3) were better than those of ANN. Both ANN and MARS tended to overestimate medium FSC values (i.e., 0.3 < FSC < 0.7).

  20. Relative importance of management, meteorological and environmental factors in the spatial distribution of Fasciola hepatica in dairy cattle in a temperate climate zone.

    PubMed

    Bennema, S C; Ducheyne, E; Vercruysse, J; Claerebout, E; Hendrickx, G; Charlier, J

    2011-02-01

    Fasciola hepatica, a trematode parasite with a worldwide distribution, is the cause of important production losses in the dairy industry. Diagnosis is hampered by the fact that the infection is mostly subclinical. To increase awareness and develop regionally adapted control methods, knowledge on the spatial distribution of economically important infection levels is needed. Previous studies modelling the spatial distribution of F. hepatica are mostly based on single cross-sectional samplings and have focussed on climatic and environmental factors, often ignoring management factors. This study investigated the associations between management, climatic and environmental factors affecting the spatial distribution of infection with F. hepatica in dairy herds in a temperate climate zone (Flanders, Belgium) over three consecutive years. A bulk-tank milk antibody ELISA was used to measure F. hepatica infection levels in a random sample of 1762 dairy herds in the autumns of 2006, 2007 and 2008. The infection levels were included in a Geographic Information System together with meteorological, environmental and management parameters. Logistic regression models were used to determine associations between possible risk factors and infection levels. The prevalence and spatial distribution of F. hepatica was relatively stable, with small interannual differences in prevalence and location of clusters. The logistic regression model based on both management and climatic/environmental factors included the factors: annual rainfall, mowing of pastures, proportion of grazed grass in the diet and length of grazing season as significant predictors and described the spatial distribution of F. hepatica better than the model based on climatic/environmental factors only (annual rainfall, elevation and slope, soil type), with an Area Under the Curve of the Receiver Operating Characteristic of 0.68 compared with 0.62. 
The results indicate that in temperate climate zones without large climatic and environmental variation, management factors affect the spatial distribution of F. hepatica, and should be included in future spatial distribution models. Copyright © 2010 Australian Society for Parasitology Inc. Published by Elsevier Ltd. All rights reserved.

  1. Geo-additive modelling of malaria in Burundi

    PubMed Central

    2011-01-01

    Background Malaria is a major public health issue in Burundi in terms of both morbidity and mortality, with around 2.5 million clinical cases and more than 15,000 deaths each year. It is still the single main cause of mortality in pregnant women and children below five years of age. Because of the severe health and economic burden of malaria, there is still a growing need for methods that will help to understand the influencing factors. Several studies have been done on the subject, yielding different results as to which factors are most responsible for the increase in malaria transmission. This paper considers the modelling of the dependence of malaria cases on spatial determinants and climatic covariates including rainfall, temperature and humidity in Burundi. Methods The analysis carried out in this work exploits real monthly data collected in the area of Burundi over 12 years (1996-2007). Semi-parametric regression models are used. The spatial analysis is based on a geo-additive model using provinces as the geographic units of study. The spatial effect is split into structured (correlated) and unstructured (uncorrelated) components. Inference is fully Bayesian and uses Markov chain Monte Carlo techniques. The effects of the continuous covariates are modelled by cubic p-splines with 20 equidistant knots and second order random walk penalty. For the spatially correlated effect, a Markov random field prior is chosen. The spatially uncorrelated effects are assumed to be i.i.d. Gaussian. The effects of climatic covariates and the effects of other spatial determinants are estimated simultaneously in a unified regression framework. 
Results The results obtained from the proposed model suggest that although malaria incidence in a given month is strongly positively associated with the minimum temperature of the previous months, regional patterns of malaria that are related to factors other than climatic variables have been identified, without being able to explain them. Conclusions In this paper, semiparametric models are used to model the effects of both climatic covariates and spatial effects on malaria distribution in Burundi. The results obtained from the proposed models suggest a strong positive association between malaria incidence in a given month and the minimum temperature of the previous month. From the spatial effects, important spatial patterns of malaria that are related to factors other than climatic variables are identified. Potential explanations (factors) could be related to socio-economic conditions, food shortage, limited access to health care service, precarious housing, promiscuity, poor hygienic conditions, limited access to drinking water, land use (rice paddies for example), displacement of the population (due to armed conflicts). PMID:21835010

  2. A Skew-t space-varying regression model for the spectral analysis of resting state brain activity.

    PubMed

    Ismail, Salimah; Sun, Wenqi; Nathoo, Farouk S; Babul, Arif; Moiseev, Alexader; Beg, Mirza Faisal; Virji-Babul, Naznin

    2013-08-01

    It is known that in many neurological disorders such as Down syndrome, main brain rhythms shift their frequencies slightly, and characterizing the spatial distribution of these shifts is of interest. This article reports on the development of a Skew-t mixed model for the spatial analysis of resting state brain activity in healthy controls and individuals with Down syndrome. Time series of oscillatory brain activity are recorded using magnetoencephalography, and spectral summaries are examined at multiple sensor locations across the scalp. We focus on the mean frequency of the power spectral density, and use space-varying regression to examine associations with age, gender and Down syndrome across several scalp regions. Spatial smoothing priors are incorporated based on a multivariate Markov random field, and the markedly non-Gaussian nature of the spectral response variable is accommodated by the use of a Skew-t distribution. A range of models representing different assumptions on the association structure and response distribution are examined, and we conduct model selection using the deviance information criterion. Our analysis suggests region-specific differences between healthy controls and individuals with Down syndrome, particularly in the left and right temporal regions, and produces smoothed maps indicating the scalp topography of the estimated differences.

  3. Spatial prediction of wheat Septoria leaf blotch (Septoria tritici) disease severity in central Ethiopia

    USGS Publications Warehouse

    Wakie, Tewodros; Kumar, Sunil; Senay, Gabriel; Takele, Abera; Lencho, Alemu

    2016-01-01

    A number of studies have reported the presence of wheat septoria leaf blotch (Septoria tritici; SLB) disease in Ethiopia. However, the environmental factors associated with SLB disease, and areas under risk of SLB disease, have not been studied. Here, we tested the hypothesis that environmental variables can adequately explain observed SLB disease severity levels in West Shewa, Central Ethiopia. Specifically, we identified 50 environmental variables and assessed their relationships with SLB disease severity. Geographically referenced disease severity data were obtained from the field, and linear regression and Boosted Regression Trees (BRT) modeling approaches were used for developing spatial models. Moderate-resolution imaging spectroradiometer (MODIS) derived vegetation indices and land surface temperature (LST) variables highly influenced SLB model predictions. Soil and topographic variables did not sufficiently explain observed SLB disease severity variation in this study. Our results show that wheat growing areas in Central Ethiopia, including highly productive districts, are at risk of SLB disease. The study demonstrates the integration of field data with modeling approaches such as BRT for predicting the spatial patterns of severity of a pathogenic wheat disease in Central Ethiopia. Our results can aid Ethiopia's wheat disease monitoring efforts, while our methods can be replicated for testing related hypotheses elsewhere.

  4. Retrieval of total suspended matter concentrations from high resolution WorldView-2 imagery: a case study of inland rivers

    NASA Astrophysics Data System (ADS)

    Shi, Liangliang; Mao, Zhihua; Wang, Zheng

    2018-02-01

    Satellite imagery plays an important role in monitoring the water quality of lakes and coastal waters, but it has scarcely been applied to inland rivers. This paper tests the feasibility of applying regression models to quantify and map the concentration of total suspended matter (CTSM) in inland rivers that have a large spatial scale and a wide CTSM dynamic range, using high resolution WorldView-2 satellite remote sensing data. An empirical approach quantifies CTSM through the integrated use of WorldView-2 multispectral data and 21 in situ CTSM measurements. Radiometric, geometric and atmospheric corrections are applied during image processing to derive the surface reflectance, which is then related to CTSM using single-variable and multivariable regression techniques. The regression results show that the single near-infrared (NIR) band 8 of WorldView-2 has a relatively strong relationship (R2=0.93) with CTSM. Different prediction models were developed from various combinations of WorldView-2 bands, and the Akaike Information Criterion was used to choose the best model. The model involving bands 1, 3, 5, and 8 of WorldView-2 performed best, with an R2 of 0.92 and an SEE of 53.30 g/m3. Spatial distribution maps were produced using the best multiple regression model. The results indicate that it is feasible to apply the empirical model with high resolution satellite imagery to retrieve the CTSM of inland rivers in routine water quality monitoring.
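
    Choosing among band combinations with the Akaike Information Criterion, as the record above describes, can be sketched as follows. The data are synthetic and the four columns are hypothetical stand-ins for band reflectances; the AIC form n ln(RSS/n) + 2k is the usual one for Gaussian least squares and is assumed here, since the abstract does not state which form was used.

```python
import itertools
import numpy as np

def ols_aic(X, y):
    """AIC of an ordinary least-squares fit with intercept (hypothetical helper)."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    k = A.shape[1] + 1                 # coefficients plus the error variance
    return n * np.log(rss / n) + 2 * k

# Synthetic stand-in for 21 in situ samples and four band reflectances.
rng = np.random.default_rng(1)
n = 21
bands = rng.uniform(0.0, 0.3, size=(n, 4))
# Assumed ground truth: only columns 0 and 3 drive the response.
ctsm = 50 + 300 * bands[:, 0] + 900 * bands[:, 3] + rng.normal(0, 5, n)

# Score every non-empty combination of columns and keep the lowest AIC.
scores = {}
for r in range(1, 5):
    for subset in itertools.combinations(range(4), r):
        scores[subset] = ols_aic(bands[:, list(subset)], ctsm)
best_subset = min(scores, key=scores.get)
print("lowest-AIC columns:", best_subset)
```

    Because AIC penalizes each extra coefficient, a combination with a slightly lower R2 can still be preferred over a single-band model or a larger one, depending on how the fit and the penalty balance.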

  5. Spatial Surface PM2.5 Concentration Estimates for Wildfire Smoke Plumes in the Western U.S. Using Satellite Retrievals and Data Assimilation Techniques

    NASA Astrophysics Data System (ADS)

    Loria Salazar, S. M.; Holmes, H.

    2015-12-01

    Health effects studies of aerosol pollution have been extended spatially using data assimilation techniques that combine surface PM2.5 concentrations and Aerosol Optical Depth (AOD) from satellite retrievals. While most of these models were developed for the dark-vegetated eastern U.S., they are being used in the semi-arid western U.S. to remotely sense atmospheric aerosol concentrations. These models are helpful to understand the spatial variability of surface PM2.5 concentrations in the western U.S. because of the sparse network of surface monitoring stations. However, the models developed for the eastern U.S. are not robust in the western U.S. due to different aerosol formation mechanisms, transport phenomena, and optical properties. This region is a challenge because of complex terrain, anthropogenic and biogenic emissions, secondary organic aerosol formation, smoke from wildfires, and low background aerosol concentrations. This research concentrates on the use and evaluation of satellite remote sensing to estimate surface PM2.5 concentrations from AOD satellite retrievals over California and Nevada during the summer months of 2012 and 2013. The aim of this investigation is to incorporate a spatial statistical model that uses AOD from AERONET as well as MODIS, surface PM2.5 concentrations, and land-use regression to characterize spatial surface PM2.5 concentrations. The land use regression model uses traditional inputs (e.g. meteorology, population density, terrain) and non-traditional variables (e.g. Fire INventory from NCAR (FINN) emissions and MODIS albedo product) to account for variability related to smoke plume trajectories and land use. The results will be used in a spatially resolved health study to determine the association between wildfire smoke exposure and cardiorespiratory health endpoints. This relationship can be used with future projections of wildfire emissions related to climate change and droughts to quantify the expected health impact.

  6. The basis function approach for modeling autocorrelation in ecological data

    USGS Publications Warehouse

    Hefley, Trevor J.; Broms, Kristin M.; Brost, Brian M.; Buderman, Frances E.; Kay, Shannon L.; Scharf, Henry; Tipton, John; Williams, Perry J.; Hooten, Mevin B.

    2017-01-01

    Analyzing ecological data often requires modeling the autocorrelation created by spatial and temporal processes. Many seemingly disparate statistical methods used to account for autocorrelation can be expressed as regression models that include basis functions. Basis functions also enable ecologists to modify a wide range of existing ecological models in order to account for autocorrelation, which can improve inference and predictive accuracy. Furthermore, understanding the properties of basis functions is essential for evaluating the fit of spatial or time-series models, detecting a hidden form of collinearity, and analyzing large data sets. We present important concepts and properties related to basis functions and illustrate several tools and techniques ecologists can use when modeling autocorrelation in ecological data.

  7. Selection of a Geostatistical Method to Interpolate Soil Properties of the State Crop Testing Fields using Attributes of a Digital Terrain Model

    NASA Astrophysics Data System (ADS)

    Sahabiev, I. A.; Ryazanov, S. S.; Kolcova, T. G.; Grigoryan, B. R.

    2018-03-01

    The three most common techniques to interpolate soil properties at a field scale—ordinary kriging (OK), regression kriging with multiple linear regression drift model (RK + MLR), and regression kriging with principal component regression drift model (RK + PCR)—were examined. The results of the performed study were compiled into an algorithm of choosing the most appropriate soil mapping technique. Relief attributes were used as the auxiliary variables. When spatial dependence of a target variable was strong, the OK method showed more accurate interpolation results, and the inclusion of the auxiliary data resulted in an insignificant improvement in prediction accuracy. According to the algorithm, the RK + PCR method effectively eliminates multicollinearity of explanatory variables. However, if the number of predictors is less than ten, the probability of multicollinearity is reduced, and application of the PCR becomes irrational. In that case, the multiple linear regression should be used instead.

  8. Conduct urban agglomeration with the baton of transportation.

    DOT National Transportation Integrated Search

    2013-12-01

    A key indicator of traffic activity patterns is commuting distance. Shorter commuting distances yield less traffic, fewer emissions, and lower energy consumption. This study develops a spatial error seemingly unrelated regression model to investiga...

  9. MERGANSER - An Empirical Model to Predict Fish and Loon Mercury in New England Lakes

    EPA Science Inventory

    MERGANSER (MERcury Geo-spatial AssessmeNtS for the New England Region) is an empirical least-squares multiple regression model using mercury (Hg) deposition and readily obtainable lake and watershed features to predict fish (fillet) and common loon (blood) Hg in New England lakes...

  10. EPA Office of Water (OW): 2002 SPARROW Total NP (Catchments)

    EPA Pesticide Factsheets

    SPARROW (SPAtially Referenced Regressions On Watershed attributes) is a watershed modeling tool with output that allows the user to interpret water quality monitoring data at the regional and sub-regional scale. The model relates in-stream water-quality measurements to spatially referenced characteristics of watersheds, including pollutant sources and environmental factors that affect rates of pollutant delivery to streams from the land and aquatic, in-stream processing. The core of the model consists of a nonlinear regression equation describing the non-conservative transport of contaminants from point and non-point (or "diffuse") sources on land to rivers and through the stream and river network. SPARROW estimates contaminant concentrations, loads (or "mass," which is the product of concentration and streamflow), and yields in streams (mass of nitrogen and of phosphorus entering a stream per acre of land). It empirically estimates the origin and fate of contaminants in streams and receiving bodies, and quantifies uncertainties in model predictions. The model predictions are illustrated through detailed maps that provide information about contaminant loadings and source contributions at multiple scales for specific stream reaches, basins, or other geographic areas.

  11. The integration of geophysical and enhanced Moderate Resolution Imaging Spectroradiometer Normalized Difference Vegetation Index data into a rule-based, piecewise regression-tree model to estimate cheatgrass beginning of spring growth

    USGS Publications Warehouse

    Boyte, Stephen P.; Wylie, Bruce K.; Major, Donald J.; Brown, Jesslyn F.

    2015-01-01

    Cheatgrass exhibits spatial and temporal phenological variability across the Great Basin as described by ecological models formed using remote sensing and other spatial data-sets. We developed a rule-based, piecewise regression-tree model trained on 99 points that used three data-sets – latitude, elevation, and start of season time based on remote sensing input data – to estimate cheatgrass beginning of spring growth (BOSG) in the northern Great Basin. The model was then applied to map the location and timing of cheatgrass spring growth for the entire area. The model was strong (R2 = 0.85) and predicted an average cheatgrass BOSG across the study area of 29 March–4 April. Of early cheatgrass BOSG areas, 65% occurred at elevations below 1452 m. The highest proportion of cheatgrass BOSG occurred between mid-April and late May. Predicted cheatgrass BOSG in this study matched well with previous Great Basin cheatgrass green-up studies.

  12. Exploring the spatially varying innovation capacity of the US counties in the framework of Griliches' knowledge production function: a mixed GWR approach

    NASA Astrophysics Data System (ADS)

    Kang, Dongwoo; Dall'erba, Sandy

    2016-04-01

    Griliches' knowledge production function has been increasingly adopted at the regional level where location-specific conditions drive the spatial differences in knowledge creation dynamics. However, the large majority of such studies rely on a traditional regression approach that assumes spatially homogenous marginal effects of knowledge input factors. This paper extends the authors' previous work (Kang and Dall'erba in Int Reg Sci Rev, 2015. doi: 10.1177/0160017615572888) to investigate the spatial heterogeneity in the marginal effects by using nonparametric local modeling approaches such as geographically weighted regression (GWR) and mixed GWR with two distinct samples of the US Metropolitan Statistical Area (MSA) and non-MSA counties. The results indicate a high degree of spatial heterogeneity in the marginal effects of the knowledge input variables, more specifically for the local and distant spillovers of private knowledge measured across MSA counties. On the other hand, local academic knowledge spillovers are found to display spatially homogenous elasticities in both MSA and non-MSA counties. Our results highlight the strengths and weaknesses of each county's innovation capacity and suggest policy implications for regional innovation strategies.
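
    The local-fitting idea behind GWR can be sketched in a few lines. This is a minimal illustration only, not the authors' mixed GWR specification: it assumes a single predictor, a Gaussian distance kernel, and a fixed bandwidth, and the function name `gwr_local_fit` and all data values are hypothetical.

```python
import math

def gwr_local_fit(coords, x, y, site, bandwidth):
    """Weighted simple regression y = a + b*x centred on `site`.

    Observations are weighted by a Gaussian kernel of their distance to
    `site`, so nearby points dominate the local coefficient estimates.
    """
    weights = [math.exp(-0.5 * (math.dist(p, site) / bandwidth) ** 2) for p in coords]
    total = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Made-up data: the western sites follow y = 1 + 2x, the eastern sites y = 1 + 5x.
coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [3.0, 5.0, 7.0, 6.0, 11.0, 16.0]

west = gwr_local_fit(coords, x, y, site=(1, 0), bandwidth=1.0)
east = gwr_local_fit(coords, x, y, site=(11, 0), bandwidth=1.0)
print("west (intercept, slope):", west)
print("east (intercept, slope):", east)
```

    A single global regression would report one compromise slope for these data, whereas the local fits recover a different marginal effect in each region, which is the kind of spatial heterogeneity the abstract describes.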

  13. The centrifugal and centripetal force influence on spatial competition of agricultural land in Bandung Metropolitan Region

    NASA Astrophysics Data System (ADS)

    Sadewo, E.

    2017-06-01

    Agricultural activity has suffered a massive land functional shift caused by market mechanisms in the Bandung metropolitan region (BMR). We argue that the existence of agricultural land in the urban spatial structure is the result of interaction between centrifugal and centripetal forces in spatial competition. This research aims to explore how several recognized centrifugal and centripetal forces influence the existence of agricultural land in BMR land development. The analysis using multivariate regression indicates that there exists spatial competition between population density and degree of urbanization with agricultural land areas. Its extended spatial regression model suggested that the neighboring situation plays an important role in preserving agricultural land areas in BMR. Meanwhile, the influence of distance between the location of the city center and employment opportunities is found to be insignificant in the spatial competition. This is opposed to the theory of von Thünen and the monocentric model in general. One possible explanation of this condition is that the assumption of centrality is not met. In addition, the agricultural land density decay in the southern parts of the area was related to its geographical conditions as protected areas or areas unfavorable for farming activity. It is suggested that BMR was in the early phase of polycentric development. Hence, better policies that redirect development to the southern part of the region are needed, as well as population control and regulation of land use.

  14. Statistical modeling of landslide hazard using GIS

    Treesearch

    Peter V. Gorsevski; Randy B. Foltz; Paul E. Gessler; Terrance W. Cundy

    2001-01-01

    A model for spatial prediction of landslide hazard was applied to a watershed affected by landslide events that occurred during the winter of 1995-96, following heavy rains, and snowmelt. Digital elevation data with 22.86 m x 22.86 m resolution was used for deriving topographic attributes used for modeling. The model is based on the combination of logistic regression...

  15. Calibrating MODIS aerosol optical depth for predicting daily PM2.5 concentrations via statistical downscaling.

    PubMed

    Chang, Howard H; Hu, Xuefei; Liu, Yang

    2014-07-01

    There has been a growing interest in the use of satellite-retrieved aerosol optical depth (AOD) to estimate ambient concentrations of PM2.5 (particulate matter <2.5 μm in aerodynamic diameter). With their broad spatial coverage, satellite data can increase the spatial-temporal availability of air quality data beyond ground monitoring measurements and potentially improve exposure assessment for population-based health studies. This paper describes a statistical downscaling approach that brings together (1) recent advances in PM2.5 land use regression models utilizing AOD and (2) statistical data fusion techniques for combining air quality data sets that have different spatial resolutions. Statistical downscaling assumes the associations between AOD and PM2.5 concentrations to be spatially and temporally dependent and offers two key advantages. First, it enables us to use gridded AOD data to predict PM2.5 concentrations at spatial point locations. Second, the unified hierarchical framework provides straightforward uncertainty quantification in the predicted PM2.5 concentrations. The proposed methodology is applied to a data set of daily AOD values in southeastern United States during the period 2003-2005. Via cross-validation experiments, our model had an out-of-sample prediction R(2) of 0.78 and a root mean-squared error (RMSE) of 3.61 μg/m(3) between observed and predicted daily PM2.5 concentrations. This corresponds to a 10% decrease in RMSE compared with the same land use regression model without AOD as a predictor. Prediction performances of spatial-temporal interpolations to locations and on days without monitoring PM2.5 measurements were also examined.

  16. Assessing Local Model Adequacy in Bayesian Hierarchical Models Using the Partitioned Deviance Information Criterion

    PubMed Central

    Wheeler, David C.; Hickson, DeMarc A.; Waller, Lance A.

    2010-01-01

    Many diagnostic tools and goodness-of-fit measures, such as the Akaike information criterion (AIC) and the Bayesian deviance information criterion (DIC), are available to evaluate the overall adequacy of linear regression models. In addition, visually assessing adequacy in models has become an essential part of any regression analysis. In this paper, we focus on a spatial consideration of the local DIC measure for model selection and goodness-of-fit evaluation. We use a partitioning of the DIC into the local DIC, leverage, and deviance residuals to assess local model fit and influence for both individual observations and groups of observations in a Bayesian framework. We use visualization of the local DIC and differences in local DIC between models to assist in model selection and to visualize the global and local impacts of adding covariates or model parameters. We demonstrate the utility of the local DIC in assessing model adequacy using HIV prevalence data from pregnant women in the Butare province of Rwanda during 1989-1993 using a range of linear model specifications, from global effects only to spatially varying coefficient models, and a set of covariates related to sexual behavior. Results of applying the diagnostic visualization approach include more refined model selection and greater understanding of the models as applied to the data. PMID:21243121

  17. The Schaake shuffle: A method for reconstructing space-time variability in forecasted precipitation and temperature fields

    USGS Publications Warehouse

    Clark, M.R.; Gangopadhyay, S.; Hay, L.; Rajagopalan, B.; Wilby, R.

    2004-01-01

    A number of statistical methods that are used to provide local-scale ensemble forecasts of precipitation and temperature do not contain realistic spatial covariability between neighboring stations or realistic temporal persistence for subsequent forecast lead times. To demonstrate this point, output from a global-scale numerical weather prediction model is used in a stepwise multiple linear regression approach to downscale precipitation and temperature to individual stations located in and around four study basins in the United States. Output from the forecast model is downscaled for lead times up to 14 days. Residuals in the regression equation are modeled stochastically to provide 100 ensemble forecasts. The precipitation and temperature ensembles from this approach have a poor representation of the spatial variability and temporal persistence. The spatial correlations for downscaled output are considerably lower than observed spatial correlations at short forecast lead times (e.g., less than 5 days) when there is high accuracy in the forecasts. At longer forecast lead times, the downscaled spatial correlations are close to zero. Similarly, the observed temporal persistence is only partly present at short forecast lead times. A method is presented for reordering the ensemble output in order to recover the space-time variability in precipitation and temperature fields. In this approach, the ensemble members for a given forecast day are ranked and matched with the rank of precipitation and temperature data from days randomly selected from similar dates in the historical record. The ensembles are then reordered to correspond to the original order of the selection of historical data. Using this approach, the observed intersite correlations, intervariable correlations, and the observed temporal persistence are almost entirely recovered. This reordering methodology also has applications for recovering the space-time variability in modeled streamflow. © 2004 American Meteorological Society.
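
    The rank-matching step described in this abstract can be sketched as follows. This is a simplified reading of the reordering for one station and one forecast day, not the authors' code: the function name `schaake_shuffle` and all numbers are hypothetical, and the random selection of historical dates is assumed to have happened already.

```python
def schaake_shuffle(ensemble, historical):
    """Reorder ensemble members so their rank order matches the historical sample.

    `ensemble` holds the forecast members for one station and day;
    `historical` holds observations at the same station on the selected
    historical dates (one date per member, in selection order).
    """
    if len(ensemble) != len(historical):
        raise ValueError("need one historical date per ensemble member")
    sorted_members = sorted(ensemble)
    # order[k] is the index of the date holding the k-th smallest observation.
    order = sorted(range(len(historical)), key=lambda i: historical[i])
    shuffled = [0.0] * len(ensemble)
    for rank, date_index in enumerate(order):
        shuffled[date_index] = sorted_members[rank]
    return shuffled

# Made-up observations at two stations on the same four historical dates.
hist_a = [10.0, 2.0, 7.0, 5.0]
hist_b = [9.0, 1.0, 8.0, 4.0]
# Made-up, independently generated ensemble members for each station.
ens_a = [3.0, 1.0, 4.0, 2.0]
ens_b = [0.5, 2.5, 1.5, 3.5]

print(schaake_shuffle(ens_a, hist_a))
print(schaake_shuffle(ens_b, hist_b))
```

    Because both stations are reordered against the same historical dates, member k at one station is paired with member k at the other in the same way the observations were paired, which is how the intersite rank structure is carried over. The values within each ensemble are unchanged; only their order differs.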

  18. Scaling field data to calibrate and validate moderate spatial resolution remote sensing models

    USGS Publications Warehouse

    Baccini, A.; Friedl, M.A.; Woodcock, C.E.; Zhu, Z.

    2007-01-01

    Validation and calibration are essential components of nearly all remote sensing-based studies. In both cases, ground measurements are collected and then related to the remote sensing observations or model results. In many situations, and particularly in studies that use moderate resolution remote sensing, a mismatch exists between the sensor's field of view and the scale at which in situ measurements are collected. The use of in situ measurements for model calibration and validation, therefore, requires a robust and defensible method to spatially aggregate ground measurements to the scale at which the remotely sensed data are acquired. This paper examines this challenge and specifically considers two different approaches for aggregating field measurements to match the spatial resolution of moderate spatial resolution remote sensing data: (a) landscape stratification; and (b) averaging of fine spatial resolution maps. The results show that an empirically estimated stratification based on a regression tree method provides a statistically defensible and operational basis for performing this type of procedure. 

  19. A heteroskedastic error covariance matrix estimator using a first-order conditional autoregressive Markov simulation for deriving asympotical efficient estimates from ecological sampled Anopheles arabiensis aquatic habitat covariates

    PubMed Central

    Jacob, Benjamin G; Griffith, Daniel A; Muturi, Ephantus J; Caamano, Erick X; Githure, John I; Novak, Robert J

    2009-01-01

    Background: Autoregressive regression coefficients for Anopheles arabiensis aquatic habitat models are usually assessed using global error techniques and are reported as error covariance matrices. A global statistic, however, will summarize error estimates from multiple habitat locations. This makes it difficult to identify where there are clusters of An. arabiensis aquatic habitats of acceptable prediction. It is therefore useful to conduct some form of spatial error analysis to detect clusters of An. arabiensis aquatic habitats based on uncertainty residuals from individual sampled habitats. In this research, a method of error estimation for spatial simulation models was demonstrated using autocorrelation indices and eigenfunction spatial filters to distinguish among the effects of parameter uncertainty on a stochastic simulation of ecological sampled Anopheles aquatic habitat covariates. A test for diagnostic checking error residuals in an An. arabiensis aquatic habitat model may enable intervention efforts targeting productive habitat clusters, based on larval/pupal productivity, by using the asymptotic distribution of parameter estimates from a residual autocovariance matrix. The models considered in this research extend a normal regression analysis previously considered in the literature. Methods: Field and remote-sampled data were collected during July 2006 to December 2007 in Karima rice-village complex in Mwea, Kenya. SAS 9.1.4® was used to explore univariate statistics, correlations, distributions, and to generate global autocorrelation statistics from the ecological sampled datasets. A local autocorrelation index was also generated using spatial covariance parameters (i.e., Moran's Indices) in a SAS/GIS® database. The Moran's statistic was decomposed into orthogonal and uncorrelated synthetic map pattern components using a Poisson model with a gamma-distributed mean (i.e. negative binomial regression). The eigenfunction values from the spatial configuration matrices were then used to define expectations for prior distributions using a Markov chain Monte Carlo (MCMC) algorithm. A set of posterior means were defined in WinBUGS 1.4.3®. After the model had converged, samples from the conditional distributions were used to summarize the posterior distribution of the parameters. Thereafter, a spatial residual trend analysis was used to evaluate variance uncertainty propagation in the model using an autocovariance error matrix. Results: By specifying coefficient estimates in a Bayesian framework, the covariate number of tillers was found to be a significant predictor, positively associated with An. arabiensis aquatic habitats. The spatial filter models accounted for approximately 19% redundant locational information in the ecological sampled An. arabiensis aquatic habitat data. In the residual error estimation model there was significant positive autocorrelation (i.e., clustering of habitats in geographic space) based on log-transformed larval/pupal data and the sampled covariate depth of habitat. Conclusion: An autocorrelation error covariance matrix and a spatial filter analysis can prioritize mosquito control strategies by providing a computationally attractive and feasible description of variance uncertainty estimates for correctly identifying clusters of prolific An. arabiensis aquatic habitats based on larval/pupal productivity. PMID:19772590
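
    The global Moran's index referred to in this abstract can be computed directly from its textbook definition. The sketch below is illustrative only and does not reproduce the SAS or WinBUGS workflow: it assumes binary neighbour weights on a hypothetical chain of four sites, and the helper names and values are made up.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` under a square spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)

def chain_weights(n):
    """Binary adjacency for n sites along a line (hypothetical layout)."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]

w = chain_weights(4)
clustered = morans_i([1.0, 1.0, 5.0, 5.0], w)    # similar values sit side by side
alternating = morans_i([1.0, 5.0, 1.0, 5.0], w)  # neighbours always differ
print(clustered, alternating)
```

    Positive values indicate that neighbouring sites tend to carry similar values (clustering), and negative values indicate that neighbours tend to differ. In this toy layout the clustered pattern gives a positive index and the alternating pattern a negative one.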

  20. Prediction of erodibility in Oxisols using iron oxides, soil color and diffuse reflectance spectroscopy

    NASA Astrophysics Data System (ADS)

    Arantes Camargo, Livia; Marques, José, Jr.

    2015-04-01

    The prediction of erodibility using indirect methods such as diffuse reflectance spectroscopy could facilitate the characterization of the spatial variability in large areas and optimize implementation of conservation practices. The aim of this study was to evaluate the prediction of interrill erodibility (Ki) and rill erodibility (Kr) by means of iron oxides content and soil color using multiple linear regression and diffuse reflectance spectroscopy (DRS) using partial least squares regression (PLSR). The soils were collected from three geomorphic surfaces and analyzed for chemical, physical and mineralogical properties, and scanned in the visible and infrared spectral range. Maps of spatial distribution of Ki and Kr were built with the values calculated by the calibrated models that obtained the best accuracy using geostatistics. Interrill and rill erodibility presented negative correlation with iron extracted by dithionite-citrate-bicarbonate, hematite, and chroma, confirming the influence of iron oxides in soil structural stability. Hematite and hue were the attributes that most contributed in calibration models by multiple linear regression for the prediction of Ki (R2 = 0.55) and Kr (R2 = 0.53). Diffuse reflectance spectroscopy via PLSR predicted interrill and rill erodibility with high accuracy (R2adj = 0.76 and 0.81, respectively, and RPD > 2.0) in the range of the visible spectrum (380-800 nm) and allowed the characterization of the spatial variability of these attributes by geostatistics.

  1. Evaluation of land use regression models for NO2 in El Paso, Texas, USA

    PubMed Central

    Gonzales, Melissa; Myers, Orrin; Smith, Luther; Olvera, Hector A.; Mukerjee, Shaibal; Li, Wen-Whai; Pingitore, Nicholas; Amaya, Maria; Burchiel, Scott; Berwick, Marianne

    2012-01-01

    Developing suitable exposure estimates for air pollution health studies is problematic due to spatial and temporal variation in concentrations and often limited monitoring data. Though land use regression models (LURs) are often used for this purpose, their applicability to later periods of time, larger geographic areas, and seasonal variation is largely untested. We evaluate a series of mixed model LURs to describe the spatial-temporal gradients of NO2 across El Paso County, Texas based on measurements collected during cool and warm seasons in 2006–2007 (2006–7). We also evaluated performance of a general additive model (GAM) developed for central El Paso in 1999 to assess spatial gradients across the County in 2006–7. Five LURs were developed iteratively from the study data and their predictions were averaged to provide robust nitrogen dioxide (NO2) concentration gradients across the county. Despite differences in sampling time frame, model covariates and model estimation methods, predicted NO2 concentration gradients were similar in the current study as compared to the 1999 study. Through a comprehensive LUR modeling campaign, it was shown that the nature of the most influential predictive variables remained the same for El Paso between 1999 and 2006–7. The similar LUR results obtained here demonstrate that, at least for El Paso, LURs developed from prior years may still be applicable to assess exposure conditions in subsequent years and in different seasons when seasonal variation is taken into consideration. PMID:22728301

  2. When Deriving the Spatial QRS-T Angle from the 12-lead ECG, which Transform is More Frank: Regression or Inverse Dower?

    NASA Technical Reports Server (NTRS)

    Schlegel, Todd T.; Cortez, Daniel

    2010-01-01

    Our primary objective was to ascertain which commonly used 12-to-Frank-lead transformation yields spatial QRS-T angle values closest to those obtained from simultaneously collected true Frank-lead recordings. Simultaneous 12-lead and Frank XYZ-lead recordings were analyzed for 100 post-myocardial infarction patients and 50 controls. Relative agreement, with true Frank-lead results, of 12-to-Frank-lead transformed results for the spatial QRS-T angle using Kors regression versus inverse Dower was assessed via ANOVA, Lin's concordance and Bland-Altman plots. Spatial QRS-T angles from the true Frank leads were not significantly different than those derived from the Kors regression-related transformation but were significantly smaller than those derived from the inverse Dower-related transformation (P less than 0.001). Independent of method, spatial mean QRS-T angles were also always significantly larger than spatial maximum (peaks) QRS-T angles. Spatial QRS-T angles are best approximated by regression-related transforms. Spatial mean and spatial peaks QRS-T angles should also not be used interchangeably.

  3. Downscaling soil moisture over East Asia through multi-sensor data fusion and optimization of regression trees

    NASA Astrophysics Data System (ADS)

    Park, Seonyoung; Im, Jungho; Park, Sumin; Rhee, Jinyoung

    2017-04-01

    Soil moisture is one of the most important keys for understanding regional and global climate systems. Soil moisture is directly related to agricultural processes as well as hydrological processes because soil moisture highly influences vegetation growth and determines water supply in the agroecosystem. Accurate monitoring of the spatiotemporal pattern of soil moisture is important. Soil moisture has been generally provided through in situ measurements at stations. Although field survey from in situ measurements provides accurate soil moisture with high temporal resolution, it requires high cost and does not provide the spatial distribution of soil moisture over large areas. Microwave satellite-based approaches (e.g., the Advanced Microwave Scanning Radiometer on the Earth Observing System (AMSR2), the Advanced Scatterometer (ASCAT), and Soil Moisture Active Passive (SMAP)) and numerical models such as Global Land Data Assimilation System (GLDAS) and Modern-Era Retrospective Analysis for Research and Applications (MERRA) provide spatiotemporally continuous soil moisture products at global scale. However, since those global soil moisture products have coarse spatial resolution (25-40 km), their applications for agriculture and water resources at local and regional scales are very limited. Thus, soil moisture downscaling is needed to overcome the limitation of the spatial resolution of soil moisture products. In this study, GLDAS soil moisture data were downscaled to 1 km spatial resolution through the integration of AMSR2 and ASCAT soil moisture data, Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM), and Moderate Resolution Imaging Spectroradiometer (MODIS) data (Land Surface Temperature, Normalized Difference Vegetation Index, and land cover) using modified regression trees over East Asia from 2013 to 2015. Modified regression trees were implemented using Cubist, a commercial software tool based on machine learning. An optimization based on pruning of rules derived from the modified regression trees was conducted. Root Mean Square Error (RMSE) and the correlation coefficient (r) were used to optimize the rules, and finally 59 rules from modified regression trees were produced. The results show high validation r (0.79) and low validation RMSE (0.0556 m3/m3). The 1 km downscaled soil moisture was evaluated using ground soil moisture data at 14 stations, and both soil moisture data showed similar temporal patterns (average r=0.51 and average RMSE=0.041). The spatial distribution of the 1 km downscaled soil moisture corresponded well with GLDAS soil moisture that caught both extremely dry and wet regions. Correlation between GLDAS and the 1 km downscaled soil moisture during growing season was positive (mean r=0.35) in most regions.
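
    The two validation statistics quoted in this abstract, RMSE and the correlation coefficient r, have standard definitions that can be written out directly. The sketch below uses made-up soil moisture values purely for illustration and says nothing about how Cubist computes or applies them.

```python
import math

def rmse(predicted, observed):
    """Root mean square error between paired predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))

def pearson_r(predicted, observed):
    """Pearson correlation coefficient between paired series."""
    n = len(observed)
    mean_p, mean_o = sum(predicted) / n, sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_p * var_o)

# Hypothetical volumetric soil moisture (m3/m3) at five validation sites.
downscaled = [0.21, 0.25, 0.30, 0.18, 0.27]
in_situ = [0.20, 0.28, 0.29, 0.15, 0.31]

print("RMSE:", rmse(downscaled, in_situ))
print("r:", pearson_r(downscaled, in_situ))
```

    The two measures answer different questions: r reflects how well the downscaled product tracks the pattern of the observations, while RMSE also penalizes a constant offset, so a product can have a high r and still a large RMSE.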

  4. Spatial vulnerability assessments by regression kriging

    NASA Astrophysics Data System (ADS)

    Pásztor, László; Laborczi, Annamária; Takács, Katalin; Szatmári, Gábor

    2016-04-01

    Two fairly different complex environmental phenomena that cause natural hazards were mapped based on a combined spatial inference approach. Their behaviour is related to various environmental factors and the applied approach enables the inclusion of several, spatially exhaustive auxiliary variables that are available for mapping. Inland excess water (IEW) is an interrelated natural and human induced phenomenon that causes several problems in the flat-land regions of Hungary, which cover nearly half of the country. The term 'inland excess water' refers to the occurrence of inundations outside the flood levee that originate from sources differing from flood overflow; it is surplus surface water forming due to the lack of runoff, insufficient absorption capability of soil or the upwelling of groundwater. There is a multiplicity of definitions, which indicate the complexity of processes that govern this phenomenon. Most of the definitions have a common part, namely, that inland excess water is temporary water inundation that occurs in flat-lands due to both precipitation and groundwater emerging on the surface as substantial sources. Radon gas is produced in the radioactive decay chain of uranium, which is an element that is naturally present in soils. Radon is transported mainly by diffusion and convection mechanisms through the soil depending mainly on soil physical and meteorological parameters and can enter and accumulate in buildings. Health risk originating from indoor radon concentration attributed to natural factors is characterized by geogenic radon potential (GRP). In addition to geology and meteorology, physical soil properties play a significant role in the determination of GRP. Identification of areas with high risk requires spatial modelling, that is, mapping of specific natural hazards. In both cases external environmental factors determine the behaviour of the target process (occurrence/frequency of IEW and grade of GRP respectively). Spatial auxiliary information representing IEW or GRP forming environmental factors was taken into account to support the spatial inference of the locally experienced IEW frequency and measured GRP values respectively. An efficient spatial prediction methodology was applied to construct reliable maps, namely regression kriging (RK) using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. Firstly the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which is interpolated by kriging. The final map is the sum of the two component predictions. Application of RK also provides the possibility of inherent accuracy assessment. The resulting maps are characterized by global and local measures of their accuracy. Additionally the method enables interval estimation for spatial extension of the areas of predefined risk categories. All of these outputs provide a useful contribution to spatial planning, action planning and decision making. Acknowledgement: Our work was partly supported by the Hungarian National Scientific Research Foundation (OTKA, Grant No. K105167).
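
    The two-part structure of regression kriging described in this abstract (a regression trend plus kriged residuals, summed) can be sketched as below. This is a toy illustration under stated assumptions, not the authors' workflow: it uses one auxiliary variable instead of multiple regression, simple kriging of the residuals with an assumed zero mean, and an exponential covariance whose sill and range are fixed by hand rather than fitted to a variogram. All names and data are hypothetical.

```python
import math

def fit_ols(x, y):
    """Closed-form simple linear regression y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return mean_y - b * mean_x, b

def exp_cov(h, sill=1.0, corr_range=2.0):
    """Exponential covariance; sill and range are assumed, not fitted."""
    return sill * math.exp(-h / corr_range)

def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def regression_kriging(coords, covariate, target, new_coord, new_covariate):
    """Predict at new_coord as regression trend plus kriged residual."""
    a, b = fit_ols(covariate, target)
    residuals = [t - (a + b * c) for t, c in zip(target, covariate)]
    cov_matrix = [[exp_cov(math.dist(p, q)) for q in coords] for p in coords]
    cov_new = [exp_cov(math.dist(p, new_coord)) for p in coords]
    weights = solve(cov_matrix, cov_new)  # simple kriging weights
    kriged_residual = sum(w * r for w, r in zip(weights, residuals))
    return (a + b * new_covariate) + kriged_residual

# Made-up sample: locations, one auxiliary variable, and the target variable.
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0)]
aux = [1.0, 2.0, 3.0, 4.0]
obs = [2.1, 3.9, 6.2, 7.8]
print(regression_kriging(sites, aux, obs, new_coord=(1.0, 1.0), new_covariate=2.5))
```

    Two properties follow from this construction: at a sampled location the prediction reproduces the observation, because kriging without a nugget interpolates the residuals exactly, and far from all samples the kriged residual fades toward zero so the prediction falls back to the regression trend.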

  5. Exploring prediction uncertainty of spatial data in geostatistical and machine learning Approaches

    NASA Astrophysics Data System (ADS)

    Klump, J. F.; Fouedjio, F.

    2017-12-01

    Geostatistical methods such as kriging with external drift as well as machine learning techniques such as quantile regression forest have been intensively used for modelling spatial data. In addition to providing predictions for target variables, both approaches are able to deliver a quantification of the uncertainty associated with the prediction at a target location. Geostatistical approaches are, by essence, adequate for providing such prediction uncertainties and their behaviour is well understood. However, they often require significant data pre-processing and rely on assumptions that are rarely met in practice. Machine learning algorithms such as random forest regression, on the other hand, require less data pre-processing and are non-parametric. This makes the application of machine learning algorithms to geostatistical problems an attractive proposition. The objective of this study is to compare kriging with external drift and quantile regression forest with respect to their ability to deliver reliable prediction uncertainties of spatial data. In our comparison we use both simulated and real world datasets. Apart from classical performance indicators, comparisons make use of accuracy plots, probability interval width plots, and the visual examinations of the uncertainty maps provided by the two approaches. By comparing random forest regression to kriging we found that both methods produced comparable maps of estimated values for our variables of interest. However, the measure of uncertainty provided by random forest seems to be quite different to the measure of uncertainty provided by kriging. In particular, the lack of spatial context can give misleading results in areas without ground truth data. These preliminary results raise questions about assessing the risks associated with decisions based on the predictions from geostatistical and machine learning algorithms in a spatial context, e.g. mineral exploration.

  6. Correlates of county-level nonviral sexually transmitted infection hot spots in the US: application of hot spot analysis and spatial logistic regression.

    PubMed

    Chang, Brian A; Pearson, William S; Owusu-Edusei, Kwame

    2017-04-01

    We used a combination of hot spot analysis (HSA) and spatial regression to examine county-level hot spot correlates for the most commonly reported nonviral sexually transmitted infections (STIs) in the 48 contiguous states in the United States (US). We obtained reported county-level total case rates of chlamydia, gonorrhea, and primary and secondary (P&S) syphilis in all counties in the 48 contiguous states from national surveillance data and computed temporally smoothed rates using 2008-2012 data. Covariates were obtained from county-level multiyear (2008-2012) American Community Surveys from the US census. We conducted HSA to identify hot spot counties for all three STIs. We then applied spatial logistic regression with the spatial error model to determine the association between the identified hot spots and the covariates. HSA indicated that ≥84% of hot spots for each STI were in the South. Spatial regression results indicated that a 10-unit increase in the percentage of Black non-Hispanics was associated with ≈42% (P < 0.01) [≈22% (P < 0.01), for Hispanics] increase in the odds of being a hot spot county for chlamydia and gonorrhea, and ≈27% (P < 0.01) [≈11% (P < 0.01) for Hispanics] for P&S syphilis. Compared with the other regions (West, Midwest, and Northeast), counties in the South were 6.5 (P < 0.01; chlamydia), 9.6 (P < 0.01; gonorrhea), and 4.7 (P < 0.01; P&S syphilis) times more likely to be hot spots. Our study provides important information on hot spot clusters of nonviral STIs in the entire United States, including associations between hot spot counties and sociodemographic factors. Published by Elsevier Inc.
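
    The reported odds changes can be rescaled to other increments. The sketch below assumes the covariate enters the logit linearly, so that odds ratios multiply across units; the function name is hypothetical, and only the 42% per 10-unit figure comes from the abstract.

```python
import math


def per_unit_odds_ratio(odds_ratio, units):
    """Odds ratio for a 1-unit change implied by an odds ratio over `units`."""
    beta = math.log(odds_ratio) / units  # logit coefficient per unit
    return math.exp(beta)


# A 42% increase in odds per 10-unit increase corresponds to OR = 1.42.
or_1 = per_unit_odds_ratio(1.42, 10)

# Under the same linear-logit assumption, a 20-unit increase compounds.
or_20 = or_1 ** 20
print(or_1, or_20)
```

    The rescaling is multiplicative, so a 20-unit increase corresponds to 1.42 squared rather than to twice the 42% increase.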

  7. Spatial patterns of development drive water use

    USGS Publications Warehouse

    Sanchez, G.M.; Smith, J.W.; Terando, Adam J.; Sun, G.; Meentemeyer, R.K.

    2018-01-01

    Water availability is becoming more uncertain as human populations grow, cities expand into rural regions and the climate changes. In this study, we examine the functional relationship between water use and the spatial patterns of developed land across the rapidly growing region of the southeastern United States. We quantified the spatial pattern of developed land within census tract boundaries, including multiple metrics of density and configuration. Through non‐spatial and spatial regression approaches we examined relationships and spatial dependencies between the spatial pattern metrics, socio‐economic and environmental variables and two water use variables: a) domestic water use, and b) total development‐related water use (a combination of public supply, domestic self‐supply and industrial self‐supply). Metrics describing the spatial patterns of development had the highest measure of relative importance (accounting for 53% of model's explanatory power), explaining significantly more variance in water use compared to socio‐economic or environmental variables commonly used to estimate water use. Integrating metrics characterizing the spatial pattern of development into water use models is likely to increase their utility and could facilitate water‐efficient land use planning.

  8. Spatial Patterns of Development Drive Water Use

    NASA Astrophysics Data System (ADS)

    Sanchez, G. M.; Smith, J. W.; Terando, A.; Sun, G.; Meentemeyer, R. K.

    2018-03-01

    Water availability is becoming more uncertain as human populations grow, cities expand into rural regions and the climate changes. In this study, we examine the functional relationship between water use and the spatial patterns of developed land across the rapidly growing region of the southeastern United States. We quantified the spatial pattern of developed land within census tract boundaries, including multiple metrics of density and configuration. Through non-spatial and spatial regression approaches we examined relationships and spatial dependencies between the spatial pattern metrics, socio-economic and environmental variables and two water use variables: a) domestic water use, and b) total development-related water use (a combination of public supply, domestic self-supply and industrial self-supply). Metrics describing the spatial patterns of development had the highest measure of relative importance (accounting for 53% of model's explanatory power), explaining significantly more variance in water use compared to socio-economic or environmental variables commonly used to estimate water use. Integrating metrics characterizing the spatial pattern of development into water use models is likely to increase their utility and could facilitate water-efficient land use planning.

  9. Classification and regression trees

    Treesearch

    G. G. Moisen

    2008-01-01

    Frequently, ecologists are interested in exploring ecological relationships, describing patterns and processes, or making spatial or temporal predictions. These purposes often can be addressed by modeling the relationship between some outcome or response and a set of features or explanatory variables.

  10. Accounting for autocorrelation in multi-drug resistant tuberculosis predictors using a set of parsimonious orthogonal eigenvectors aggregated in geographic space.

    PubMed

    Jacob, Benjamin J; Krapp, Fiorella; Ponce, Mario; Gottuzzo, Eduardo; Griffith, Daniel A; Novak, Robert J

    2010-05-01

    Spatial autocorrelation is problematic for classical hierarchical cluster detection tests commonly used in multi-drug resistant tuberculosis (MDR-TB) analyses as considerable random error can occur. Therefore, when MDR-TB clusters are spatially autocorrelated the assumption that the clusters are independently random is invalid. In this research, a product moment correlation coefficient (i.e., the Moran's coefficient) was used to quantify local spatial variation in multiple clinical and environmental predictor variables sampled in San Juan de Lurigancho, Lima, Peru. Initially, QuickBird 0.61 m data, encompassing visible bands and the near infra-red bands, were selected to synthesize images of land cover attributes of the study site. Data of residential addresses of individual patients with smear-positive MDR-TB were geocoded, prevalence rates calculated and then digitally overlaid onto the satellite data within a 2 km buffer of 31 georeferenced health centers, using a 10 m2 grid-based algorithm. Geographical information system (GIS)-gridded measurements of each health center were generated based on preliminary base maps of the georeferenced data aggregated to block groups and census tracts within each buffered area. A three-dimensional model of the study site was constructed based on a digital elevation model (DEM) to determine terrain covariates associated with the sampled MDR-TB covariates. Pearson's correlation was used to evaluate the linear relationship between the DEM and the sampled MDR-TB data. A SAS/GIS(R) module was then used to calculate univariate statistics and to perform linear and non-linear regression analyses using the sampled predictor variables. The estimates generated from a global autocorrelation analysis were then spatially decomposed into empirical orthogonal bases using a negative binomial regression with a non-homogeneous mean. Results of the DEM analyses indicated a statistically non-significant, linear relationship between georeferenced health centers and the sampled covariate elevation. The data exhibited positive spatial autocorrelation and the decomposition of Moran's coefficient into uncorrelated, orthogonal map pattern components revealed global spatial heterogeneities necessary to capture latent autocorrelation in the MDR-TB model. It was thus shown that Poisson regression analyses and spatial eigenvector mapping can elucidate the mechanics of MDR-TB transmission by prioritizing clinical and environmental-sampled predictor variables for identifying high risk populations.

  11. Using Historical Atlas Data to Develop High-Resolution Distribution Models of Freshwater Fishes

    PubMed Central

    Huang, Jian; Frimpong, Emmanuel A.

    2015-01-01

    Understanding the spatial pattern of species distributions is fundamental in biogeography, and conservation and resource management applications. Most species distribution models (SDMs) require or prefer species presence and absence data for adequate estimation of model parameters. However, observations with unreliable or unreported species absences dominate and limit the implementation of SDMs. Presence-only models generally yield less accurate predictions of species distribution, and make it difficult to incorporate spatial autocorrelation. The availability of large amounts of historical presence records for freshwater fishes of the United States provides an opportunity for deriving reliable absences from data reported as presence-only, when sampling was predominantly community-based. In this study, we used boosted regression trees (BRT), logistic regression, and MaxEnt models to assess the performance of a historical metacommunity database with inferred absences, for modeling fish distributions, investigating the effect of model choice and data properties thereby. With models of the distribution of 76 native, non-game fish species of varied traits and rarity attributes in four river basins across the United States, we show that model accuracy depends on data quality (e.g., sample size, location precision), species’ rarity, statistical modeling technique, and consideration of spatial autocorrelation. The cross-validation area under the receiver-operating-characteristic curve (AUC) tended to be high in the spatial presence-absence models at the highest level of resolution for species with large geographic ranges and small local populations. Prevalence affected training but not validation AUC. The key habitat predictors identified and the fish-habitat relationships evaluated through partial dependence plots corroborated most previous studies. The community-based SDM framework broadens our capability to model species distributions by innovatively removing the constraint of lack of species absence data, thus providing a robust prediction of distribution for stream fishes in other regions where historical data exist, and for other taxa (e.g., benthic macroinvertebrates, birds) usually observed by community-based sampling designs. PMID:26075902

  12. Spatial variation of natural radiation and childhood leukaemia incidence in Great Britain.

    PubMed

    Richardson, S; Monfort, C; Green, M; Draper, G; Muirhead, C

    This paper describes an analysis of the geographical variation of childhood leukaemia incidence in Great Britain over a 15 year period in relation to natural radiation (gamma and radon). Data at the level of the 459 district level local authorities in England, Wales and regional districts in Scotland are analysed in two complementary ways: first, by Poisson regressions with the inclusion of environmental covariates and a smooth spatial structure; secondly, by a hierarchical Bayesian model in which extra-Poisson variability is modelled explicitly in terms of spatial and non-spatial components. From this analysis, we deduce a strong indication that a main part of the variability is accounted for by a local neighbourhood 'clustering' structure. This structure is furthermore relatively stable over the 15 year period for the lymphocytic leukaemias which make up the majority of observed cases. We found no evidence of a positive association of childhood leukaemia incidence with outdoor or indoor gamma radiation levels. There is no consistent evidence of any association with radon levels. Indeed, in the Poisson regressions, a significant positive association was only observed for one 5-year period, a result which is not compatible with a stable environmental effect. Moreover, this positive association became clearly non-significant when over-dispersion relative to the Poisson distribution was taken into account.

  13. Spatial Patterns and Impacts of Environmental and Climatic Factors on Canine Sinonasal Aspergillosis in Northern California

    PubMed Central

    Magro, Monise; Sykes, Jane; Vishkautsan, Polina; Martínez-López, Beatriz

    2017-01-01

    Sinonasal aspergillosis (SNA) causes chronic nasal discharge in dogs and has a worldwide distribution, although most reports of SNA in North America originate from the western USA. SNA is mainly caused by Aspergillus fumigatus, a ubiquitous saprophytic filamentous fungus. Infection is thought to follow inhalation of spores. SNA is a disease of the nasal cavity and/or sinuses with variable degrees of local invasion and destruction. While some host factors appear to predispose to SNA (such as belonging to a dolichocephalic breed), environmental risk factors have been scarcely studied. Because A. fumigatus is also the main cause of invasive aspergillosis in humans, unraveling the distribution and the environmental and climatic risk factors for this agent in dogs would be of great benefit for public health studies, advancing understanding of both distribution and risk factors in humans. In this study, we reviewed electronic medical records of 250 dogs diagnosed with SNA between 1990 and 2014 at the University of California Davis Veterinary Medical Teaching Hospital (VMTH). A 145-mile radius catchment area around the VMTH was selected. Data were aggregated by zip code and incorporated into a multivariate logistic regression model. The logistic regression model was compared to an autologistic regression model to evaluate the effect of spatial autocorrelation. Traffic density, active composting sites, and environmental and climatic factors related with wind and temperature were significantly associated with increase in disease occurrence in dogs. Results provide valuable information about the risk factors and spatial distribution of SNA in dogs in Northern California. Our ultimate goal is to utilize the results to investigate risk-based interventions, promote awareness, and serve as a model for further studies of aspergillosis in humans. PMID:28717638

  14. Spatial Patterns and Impacts of Environmental and Climatic Factors on Canine Sinonasal Aspergillosis in Northern California.

    PubMed

    Magro, Monise; Sykes, Jane; Vishkautsan, Polina; Martínez-López, Beatriz

    2017-01-01

    Sinonasal aspergillosis (SNA) causes chronic nasal discharge in dogs and has a worldwide distribution, although most reports of SNA in North America originate from the western USA. SNA is mainly caused by Aspergillus fumigatus, a ubiquitous saprophytic filamentous fungus. Infection is thought to follow inhalation of spores. SNA is a disease of the nasal cavity and/or sinuses with variable degrees of local invasion and destruction. While some host factors appear to predispose to SNA (such as belonging to a dolichocephalic breed), environmental risk factors have been scarcely studied. Because A. fumigatus is also the main cause of invasive aspergillosis in humans, unraveling the distribution and the environmental and climatic risk factors for this agent in dogs would be of great benefit for public health studies, advancing understanding of both distribution and risk factors in humans. In this study, we reviewed electronic medical records of 250 dogs diagnosed with SNA between 1990 and 2014 at the University of California Davis Veterinary Medical Teaching Hospital (VMTH). A 145-mile radius catchment area around the VMTH was selected. Data were aggregated by zip code and incorporated into a multivariate logistic regression model. The logistic regression model was compared to an autologistic regression model to evaluate the effect of spatial autocorrelation. Traffic density, active composting sites, and environmental and climatic factors related with wind and temperature were significantly associated with increase in disease occurrence in dogs. Results provide valuable information about the risk factors and spatial distribution of SNA in dogs in Northern California. Our ultimate goal is to utilize the results to investigate risk-based interventions, promote awareness, and serve as a model for further studies of aspergillosis in humans.

  15. Environmental characteristics associated with pedestrian-motor vehicle collisions in Denver, Colorado.

    PubMed

    Sebert Kuhlmann, Anne K; Brett, John; Thomas, Deborah; Sain, Stephan R

    2009-09-01

    We examined patterns of pedestrian-motor vehicle collisions and associated environmental characteristics in Denver, Colorado. We integrated publicly available data on motor vehicle collisions, liquor licenses, land use, and sociodemographic characteristics to analyze spatial patterns and other characteristics of collisions involving pedestrians. We developed both linear and spatially weighted regression models of these collisions. Spatial analysis revealed global clustering of pedestrian-motor vehicle collisions with concentrations in downtown, in a contiguous neighborhood, and along major arterial streets. Walking to work, population density, and liquor license outlet density all contributed significantly to both linear and spatial models of collisions involving pedestrians and were each significantly associated with these collisions. These models, constructed with data from Denver, identified conditions that likely contribute to patterns of pedestrian-motor vehicle collisions. Should these models be verified elsewhere, they will have implications for future research directions, public policy to enhance pedestrian safety, and public health programs aimed at decreasing unintentional injury from pedestrian-motor vehicle collisions and promoting walking as a routine physical activity.

  16. Networks of volatility spillovers among stock markets

    NASA Astrophysics Data System (ADS)

    Baumöhl, Eduard; Kočenda, Evžen; Lyócsa, Štefan; Výrost, Tomáš

    2018-01-01

    In our network analysis of 40 developed, emerging and frontier stock markets during the 2006-2014 period, we describe and model volatility spillovers during both the global financial crisis and tranquil periods. The resulting market interconnectedness is depicted by fitting a spatial model incorporating several exogenous characteristics. We document the presence of significant temporal proximity effects between markets and somewhat weaker temporal effects with regard to the US equity market - volatility spillovers decrease when markets are characterized by greater temporal proximity. Volatility spillovers also present a high degree of interconnectedness, which is measured by high spatial autocorrelation. This finding is confirmed by spatial regression models showing that indirect effects are much stronger than direct effects; i.e., market-related changes in 'neighboring' markets (within a network) affect volatility spillovers more than changes in the given market alone, suggesting that spatial effects simply cannot be ignored when modeling stock market relationships. Our results also link spillovers of escalating magnitude with increasing market size, market liquidity and economic openness.

  17. Exploring the Mechanisms of Ecological Land Change Based on the Spatial Autoregressive Model: A Case Study of the Poyang Lake Eco-Economic Zone, China

    PubMed Central

    Xie, Hualin; Liu, Zhifei; Wang, Peng; Liu, Guiying; Lu, Fucai

    2013-01-01

    Ecological land is one of the key resources and conditions for the survival of humans because it can provide ecosystem services and is particularly important to public health and safety. It is extremely valuable for effective ecological management to explore the evolution mechanisms of ecological land. Based on spatial statistical analyses, we explored the spatial disparities and primary potential drivers of ecological land change in the Poyang Lake Eco-economic Zone of China. The results demonstrated that the global Moran’s I value is 0.1646 during the 1990 to 2005 time period and indicated significant positive spatial correlation (p < 0.05). The results also imply that the clustering trend of ecological land changes weakened in the study area. Some potential driving forces were identified by applying the spatial autoregressive model in this study. The results demonstrated that the higher economic development level and industrialization rate were the main drivers for the faster change of ecological land in the study area. This study also tested the superiority of the spatial autoregressive model to study the mechanisms of ecological land change by comparing it with the traditional linear regressive model. PMID:24384778
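
    The global Moran's I reported above summarizes how similar neighbouring values are. The following is a minimal sketch of the standard statistic, assuming a user-supplied spatial weights matrix; the four-region example and its contiguity weights are hypothetical and unrelated to the Poyang Lake data.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and spatial weights w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    num = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


# Hypothetical example: four regions in a row; adjacent regions are neighbours.
values = [1.0, 2.0, 3.0, 4.0]
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

i_stat = morans_i(values, weights)
print(i_stat)
```

    Positive values indicate that similar values cluster in space; judging significance, as in the p < 0.05 result above, would additionally require a permutation test or an analytical variance, which this sketch omits.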

  18. A Geographic Weighted Regression for Rural Highways Crashes Modelling Using the Gaussian and Tricube Kernels: A Case Study of USA Rural Highways

    NASA Astrophysics Data System (ADS)

    Aghayari, M.; Pahlavani, P.; Bigdeli, B.

    2017-09-01

    According to a World Health Organization (WHO) report, driving incidents are among the eight leading causes of death in the world. The purpose of this paper is to develop a regression method for the parameters that affect highway crashes. Traditional methods assume that the data are completely independent and the environment is homogeneous, whereas crashes are spatial events that occur in geographic space and are described by spatial data. Spatial data have properties such as spatial autocorrelation and spatial non-stationarity that make them more difficult to work with. The proposed method was applied to a set of records of fatal crashes that occurred on highways connecting eight eastern states of the US, recorded between 2007 and 2009. In this study, we used the GWR method with two kernels, Gaussian and tricube. The number of casualties was taken as the dependent variable, and number of persons in crash, road alignment, number of lanes, pavement type, surface condition, road fence, light condition, vehicle type, weather, drunk driver, speed limitation, harmful event, road profile, and junction type were taken as explanatory variables, following previous studies that used the GWR method. We compared the results with those of the OLS method. The R² was 0.0654 for the OLS method and 0.9196 for the proposed method, which implies that the proposed GWR is a better regression method for rural highway crashes.
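
    In GWR, each local regression weights observations by their distance from the regression point. The sketch below assumes the commonly used fixed-bandwidth forms of the Gaussian and tricube kernels and a single predictor; the abstract does not state the exact kernel definitions or bandwidths, so the function names, bandwidth, and data are hypothetical.

```python
import math


def gaussian_kernel(d, h):
    """Gaussian weight: decays smoothly and never reaches zero."""
    return math.exp(-0.5 * (d / h) ** 2)


def tricube_kernel(d, h):
    """Tricube weight: drops to exactly zero at and beyond the bandwidth h."""
    return (1.0 - (d / h) ** 3) ** 3 if d < h else 0.0


def local_fit(x, y, dists, h, kernel):
    """Weighted least squares for y = a + b*x at one regression point."""
    w = [kernel(d, h) for d in dists]
    sw = sum(w)
    xm = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ym = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xm) * (yi - ym) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xm) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return ym - b * xm, b


# Hypothetical data lying exactly on y = 2 + 3x; dists are the distances
# from the regression point to each observation.
x = [0.0, 1.0, 2.0, 3.0]
y = [2.0, 5.0, 8.0, 11.0]
dists = [0.0, 1.0, 2.0, 6.0]

a_g, b_g = local_fit(x, y, dists, 5.0, gaussian_kernel)
a_t, b_t = local_fit(x, y, dists, 5.0, tricube_kernel)
print(a_g, b_g, a_t, b_t)
```

    The practical difference is that the tricube kernel excludes observations beyond the bandwidth entirely, whereas the Gaussian kernel only down-weights them.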

  19. Modeling current climate conditions for forest pest risk assessment

    Treesearch

    Frank H. Koch; John W. Coulston

    2010-01-01

    Current information on broad-scale climatic conditions is essential for assessing potential distribution of forest pests. At present, sophisticated spatial interpolation approaches such as the Parameter-elevation Regressions on Independent Slopes Model (PRISM) are used to create high-resolution climatic data sets. Unfortunately, these data sets are based on 30-year...

  20. Urban change analysis and future growth of Istanbul.

    PubMed

    Akın, Anıl; Sunar, Filiz; Berberoğlu, Süha

    2015-08-01

    This study is aimed at analyzing urban change within Istanbul and assessing the city's future growth potential for the year 2040 using appropriate modeling approaches. Urban growth is a major driving force of land-use change, and spatial and temporal components of urbanization can be identified through accurate spatial modeling. In this context, widely used urban modeling approaches, such as the Markov chain and logistic regression based on cellular automata (CA), were used to simulate urban growth within Istanbul. The distance from each pixel to the urban and road classes, elevation, and slope, together with municipality and land use maps (as an excluded layer), were identified as factors. Calibration data were obtained from remotely sensed data recorded in 1972, 1986, and 2013. Validation was performed by overlaying the simulated and actual 2013 urban maps, and a kappa index of agreement was derived. The results indicate that urban expansion will influence mainly forest areas during the time period of 2013-2040. The urban expansion was predicted as 429 and 327 km² with the Markov chain and logistic regression models, respectively.
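
    The validation step above compares a simulated map with an observed one. The following is a minimal sketch of Cohen's kappa for two flattened class maps, assuming that the kappa index of agreement used in the study is the standard chance-corrected form; the eight-pixel maps are hypothetical.

```python
def kappa(simulated, actual):
    """Cohen's kappa between two equally long sequences of class labels."""
    n = len(simulated)
    classes = set(simulated) | set(actual)
    # Observed agreement: share of pixels with identical labels.
    po = sum(s == a for s, a in zip(simulated, actual)) / n
    # Agreement expected by chance from the marginal class frequencies.
    pe = sum(
        (simulated.count(c) / n) * (actual.count(c) / n) for c in classes
    )
    return (po - pe) / (1 - pe)


# Hypothetical maps: 1 = urban, 0 = non-urban.
simulated = [1, 1, 0, 0, 1, 0, 0, 0]
actual = [1, 0, 0, 0, 1, 1, 0, 0]

k = kappa(simulated, actual)
print(k)
```

    Kappa is 1 for identical maps and 0 when agreement is no better than chance, which makes it stricter than the raw share of matching pixels.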

  1. Hierarchical spatial models for predicting pygmy rabbit distribution and relative abundance

    USGS Publications Warehouse

    Wilson, T.L.; Odei, J.B.; Hooten, M.B.; Edwards, T.C.

    2010-01-01

    Conservationists routinely use species distribution models to plan conservation, restoration and development actions, while ecologists use them to infer process from pattern. These models tend to work well for common or easily observable species, but are of limited utility for rare and cryptic species. This may be because honest accounting of known observation bias and spatial autocorrelation are rarely included, thereby limiting statistical inference of resulting distribution maps. We specified and implemented a spatially explicit Bayesian hierarchical model for a cryptic mammal species (pygmy rabbit Brachylagus idahoensis). Our approach used two levels of indirect sign that are naturally hierarchical (burrows and faecal pellets) to build a model that allows for inference on regression coefficients as well as spatially explicit model parameters. We also produced maps of rabbit distribution (occupied burrows) and relative abundance (number of burrows expected to be occupied by pygmy rabbits). The model demonstrated statistically rigorous spatial prediction by including spatial autocorrelation and measurement uncertainty. We demonstrated flexibility of our modelling framework by depicting probabilistic distribution predictions using different assumptions of pygmy rabbit habitat requirements. Spatial representations of the variance of posterior predictive distributions were obtained to evaluate heterogeneity in model fit across the spatial domain. Leave-one-out cross-validation was conducted to evaluate the overall model fit. Synthesis and applications. Our method draws on the strengths of previous work, thereby bridging and extending two active areas of ecological research: species distribution models and multi-state occupancy modelling. Our framework can be extended to encompass both larger extents and other species for which direct estimation of abundance is difficult. © 2010 The Authors. Journal compilation © 2010 British Ecological Society.

  2. A Development of Nonstationary Regional Frequency Analysis Model with Large-scale Climate Information: Its Application to Korean Watershed

    NASA Astrophysics Data System (ADS)

    Kim, Jin-Young; Kwon, Hyun-Han; Kim, Hung-Soo

    2015-04-01

    The existing regional frequency analysis has the disadvantage that it is difficult to consider geographical characteristics in estimating areal rainfall. In this regard, this study aims to develop a nonstationary regional frequency analysis based on a hierarchical Bayesian model, in which spatial patterns of the design rainfall are explicitly incorporated through geographical information (e.g. latitude, longitude and altitude). This study assumes that the parameters of the Gumbel (or GEV) distribution are a function of geographical characteristics within a general linear regression framework. Posterior distributions of the regression parameters are estimated by the Bayesian Markov Chain Monte Carlo (MCMC) method, and the identified functional relationship is used to spatially interpolate the parameters of the distributions by using digital elevation models (DEM) as inputs. The proposed model is applied to derive design rainfalls over the entire Han-river watershed. It was found that the proposed Bayesian regional frequency analysis model showed similar results compared to L-moment based regional frequency analysis. In addition, the model showed an advantage in terms of quantifying uncertainty of the design rainfall and estimating the areal rainfall considering geographical information. Finally, comprehensive discussion on design rainfall in the context of nonstationarity will be presented. KEYWORDS: Regional frequency analysis, Nonstationary, Spatial information, Bayesian. Acknowledgement: This research was supported by a grant (14AWMP-B082564-01) from Advanced Water Management Research Program funded by Ministry of Land, Infrastructure and Transport of Korean government.

  3. LiDAR based prediction of forest biomass using hierarchical models with spatially varying coefficients

    Treesearch

    Chad Babcock; Andrew O. Finley; John B. Bradford; Randy Kolka; Richard Birdsey; Michael G. Ryan

    2015-01-01

    Many studies and production inventory systems have shown the utility of coupling covariates derived from Light Detection and Ranging (LiDAR) data with forest variables measured on georeferenced inventory plots through regression models. The objective of this study was to propose and assess the use of a Bayesian hierarchical modeling framework that accommodates both...

  4. Landslide-susceptibility analysis using light detection and ranging-derived digital elevation models and logistic regression models: a case study in Mizunami City, Japan

    NASA Astrophysics Data System (ADS)

    Wang, Liang-Jie; Sawada, Kazuhide; Moriguchi, Shuji

    2013-01-01

    To mitigate the damage caused by landslide disasters, different mathematical models have been applied to predict landslide spatial distribution characteristics. Although some researchers have achieved excellent results around the world, few studies take the spatial resolution of the database into account. Four digital elevation models (DEMs), with resolutions ranging from 2 to 20 m and derived from light detection and ranging technology, are used to analyze landslide susceptibility in Mizunami City, Gifu Prefecture, Japan. Fifteen landslide-causative factors are considered using a logistic-regression approach to create models for landslide potential analysis. Pre-existing landslide bodies are used to evaluate the performance of the four models. The results revealed that the 20-m model had the highest classification accuracy (71.9%), whereas the 2-m model had the lowest value (68.7%). In the 2-m model, 89.4% of the landslide bodies fit in the medium to very high categories. For the 20-m model, only 83.3% of the landslide bodies were concentrated in the medium to very high classes. When the cell size decreases from 20 to 2 m, the area under the relative operative characteristic increases from 0.68 to 0.77. Therefore, higher-resolution DEMs would provide better results for landslide-susceptibility mapping.

  5. The basis function approach for modeling autocorrelation in ecological data.

    PubMed

    Hefley, Trevor J; Broms, Kristin M; Brost, Brian M; Buderman, Frances E; Kay, Shannon L; Scharf, Henry R; Tipton, John R; Williams, Perry J; Hooten, Mevin B

    2017-03-01

    Analyzing ecological data often requires modeling the autocorrelation created by spatial and temporal processes. Many seemingly disparate statistical methods used to account for autocorrelation can be expressed as regression models that include basis functions. Basis functions also enable ecologists to modify a wide range of existing ecological models in order to account for autocorrelation, which can improve inference and predictive accuracy. Furthermore, understanding the properties of basis functions is essential for evaluating the fit of spatial or time-series models, detecting a hidden form of collinearity, and analyzing large data sets. We present important concepts and properties related to basis functions and illustrate several tools and techniques ecologists can use when modeling autocorrelation in ecological data. © 2016 by the Ecological Society of America.

  6. Data-driven discovery of partial differential equations

    PubMed Central

    Rudy, Samuel H.; Brunton, Steven L.; Proctor, Joshua L.; Kutz, J. Nathan

    2017-01-01

    We propose a sparse regression method capable of discovering the governing partial differential equation(s) of a given system by time series measurements in the spatial domain. The regression framework relies on sparsity-promoting techniques to select the nonlinear and partial derivative terms of the governing equations that most accurately represent the data, bypassing a combinatorially large search through all possible candidate models. The method balances model complexity and regression accuracy by selecting a parsimonious model via Pareto analysis. Time series measurements can be made in an Eulerian framework, where the sensors are fixed spatially, or in a Lagrangian framework, where the sensors move with the dynamics. The method is computationally efficient, robust, and demonstrated to work on a variety of canonical problems spanning a number of scientific domains including Navier-Stokes, the quantum harmonic oscillator, and the diffusion equation. Moreover, the method is capable of disambiguating between potentially nonunique dynamical terms by using multiple time series taken with different initial data. Thus, for a traveling wave, the method can distinguish between a linear wave equation and the Korteweg–de Vries equation, for instance. The method provides a promising new technique for discovering governing equations and physical laws in parameterized spatiotemporal systems, where first-principles derivations are intractable. PMID:28508044

  7. Estimation of Subpixel Snow-Covered Area by Nonparametric Regression Splines

    NASA Astrophysics Data System (ADS)

    Kuter, S.; Akyürek, Z.; Weber, G.-W.

    2016-10-01

    Measurement of the areal extent of snow cover with high accuracy plays an important role in hydrological and climate modeling. Remotely-sensed data acquired by earth-observing satellites offer great advantages for timely monitoring of snow cover. However, the main obstacle is the tradeoff between temporal and spatial resolution of satellite imageries. Soft or subpixel classification of low or moderate resolution satellite images is a preferred technique to overcome this problem. The most frequently employed snow cover fraction methods applied on Moderate Resolution Imaging Spectroradiometer (MODIS) data have evolved from spectral unmixing and empirical Normalized Difference Snow Index (NDSI) methods to latest machine learning-based artificial neural networks (ANNs). This study demonstrates the implementation of subpixel snow-covered area estimation based on the state-of-the-art nonparametric spline regression method, namely, Multivariate Adaptive Regression Splines (MARS). MARS models were trained by using MODIS top of atmospheric reflectance values of bands 1-7 as predictor variables. Reference percentage snow cover maps were generated from higher spatial resolution Landsat ETM+ binary snow cover maps. A multilayer feed-forward ANN with one hidden layer trained with backpropagation was also employed to estimate the percentage snow-covered area on the same data set. The results indicated that the developed MARS model performed better than th

  8. Evaluating the spatial variation of total mercury in young-of-year yellow perch (Perca flavescens), surface water and upland soil for watershed-lake systems within the southern Boreal Shield

    USGS Publications Warehouse

    Gabriel, M.C.; Kolka, R.; Wickman, T.; Nater, E.; Woodruff, Laurel G.

    2009-01-01

    The primary objective of this research is to investigate relationships between mercury in upland soil, lake water and fish tissue and explore the cause for the observed spatial variation of THg in age one yellow perch (Perca flavescens) for ten lakes within the Superior National Forest. Spatial relationships between yellow perch THg tissue concentration and a total of 45 watershed and water chemistry parameters were evaluated for two separate years: 2005 and 2006. Results show agreement with other studies where watershed area, lake water pH, nutrient levels (specifically dissolved NO3−-N) and dissolved iron are important factors controlling and/or predicting fish THg level. Exceeding all was the strong dependence of yellow perch THg level on soil A-horizon THg and, in particular, soil O-horizon THg concentrations (Spearman ρ = 0.81). Soil B-horizon THg concentration was significantly correlated (Pearson r = 0.75) with lake water THg concentration. Lakes surrounded by a greater percentage of shrub wetlands (peatlands) had higher fish tissue THg levels, thus it is highly possible that these wetlands are main locations for mercury methylation. Stepwise regression was used to develop empirical models for the purpose of predicting the spatial variation in yellow perch THg over the studied region. The 2005 regression model demonstrates it is possible to obtain good prediction (up to 60% variance description) of resident yellow perch THg level using upland soil O-horizon THg as the only independent variable. The 2006 model shows even greater prediction (r2 = 0.73, with an overall 10 ng/g [tissue, wet weight] margin of error), using lake water dissolved iron and watershed area as the only model independent variables. The developed regression models in this study can help with interpreting THg concentrations in low trophic level fish species for untested lakes of the greater Superior National Forest and surrounding Boreal ecosystem.

  9. Measuring the value of air quality: application of the spatial hedonic model.

    PubMed

    Kim, Seung Gyu; Cho, Seong-Hoon; Lambert, Dayton M; Roberts, Roland K

    2010-03-01

    This study applies a hedonic model to assess the economic benefits of air quality improvement following the 1990 Clean Air Act Amendment at the county level in the lower 48 United States. An instrumental variable approach that combines geographically weighted regression and spatial autoregression methods (GWR-SEM) is adopted to simultaneously account for spatial heterogeneity and spatial autocorrelation. SEM mitigates spatial dependency while GWR addresses spatial heterogeneity by allowing response coefficients to vary across observations. Positive amenity values of improved air quality are found in four major clusters: (1) in East Kentucky and most of Georgia around the Southern Appalachian area; (2) in a few counties in Illinois; (3) on the border of Oklahoma and Kansas, on the border of Kansas and Nebraska, and in east Texas; and (4) in a few counties in Montana. Clusters of significant positive amenity values may exist because of a combination of intense air pollution and consumer awareness of diminishing air quality.

  10. Geographically weighted regression model on poverty indicator

    NASA Astrophysics Data System (ADS)

    Slamet, I.; Nugroho, N. F. T. A.; Muslich

    2017-12-01

    In this research, we applied geographically weighted regression (GWR) to analyze poverty in Central Java. We use a Gaussian kernel as the weighting function. GWR uses the diagonal matrix obtained by evaluating the Gaussian kernel function as the weight matrix in the regression model. The kernel weights are used to handle spatial effects in the data so that a model can be obtained for each location. The purpose of this paper is to model poverty percentage data in Central Java province using GWR with a Gaussian kernel weighting function and to determine the influencing factors in each regency/city of the province. Based on the research, we obtained a geographically weighted regression model with a Gaussian kernel weighting function for the poverty percentage data in Central Java province. We found that the percentage of the population working as farmers, the population growth rate, the percentage of households with regular sanitation, and BPJS beneficiaries are the variables that affect the percentage of poverty in Central Java province. The coefficient of determination R2 is 68.64%. The districts fall into two categories, each influenced by a different set of significant factors.
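
    The local fitting step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a single predictor plus an intercept, Euclidean distances, and a fixed bandwidth, and the names gaussian_weights and gwr_fit_at are hypothetical. Each observation i receives the weight w_i = exp(-0.5 * (d_i / h)^2), where d_i is its distance to the focal location and h is the bandwidth, and a weighted least squares fit is then solved separately for each focal location.

```python
import math


def gaussian_weights(focal, coords, bandwidth):
    """Gaussian kernel weights w_i = exp(-0.5 * (d_i / bandwidth)**2)."""
    return [
        math.exp(-0.5 * (math.dist(focal, c) / bandwidth) ** 2)
        for c in coords
    ]


def gwr_fit_at(focal, coords, x, y, bandwidth):
    """Weighted least squares fit of y = b0 + b1 * x, localized at `focal`.

    The weights form the diagonal of the GWR weight matrix for this
    location; with one predictor the normal equations have a closed form.
    """
    w = gaussian_weights(focal, coords, bandwidth)
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1


# Hypothetical toy data: ten locations in a row whose true slope grows
# from west (index 0) to east (index 9).
coords = [(float(i), 0.0) for i in range(10)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0]
y = [(1.0 + i) * xi for i, xi in enumerate(x)]

west = gwr_fit_at((0.0, 0.0), coords, x, y, bandwidth=1.0)
east = gwr_fit_at((9.0, 0.0), coords, x, y, bandwidth=1.0)
print("local slope (west):", west[1])
print("local slope (east):", east[1])
```

    Because a separate coefficient vector is obtained at every location, the local slopes in this toy example differ between the western and eastern ends, which is the kind of spatial variation in influencing factors that the abstract reports across regencies.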

  11. Pattern Recognition Analysis of Age-Related Retinal Ganglion Cell Signatures in the Human Eye

    PubMed Central

    Yoshioka, Nayuta; Zangerl, Barbara; Nivison-Smith, Lisa; Khuu, Sieu K.; Jones, Bryan W.; Pfeiffer, Rebecca L.; Marc, Robert E.; Kalloniatis, Michael

    2017-01-01

    Purpose To characterize macular ganglion cell layer (GCL) changes with age and provide a framework to assess changes in ocular disease. This study used data clustering to analyze macular GCL patterns from optical coherence tomography (OCT) in a large cohort of subjects without ocular disease. Methods Single eyes of 201 patients evaluated at the Centre for Eye Health (Sydney, Australia) were retrospectively enrolled (age range, 20–85); 8 × 8 grid locations obtained from Spectralis OCT macular scans were analyzed with unsupervised classification into statistically separable classes sharing common GCL thickness and change with age. The resulting classes and gridwise data were fitted with linear and segmented linear regression curves. Additionally, normalized data were analyzed to determine regression as a percentage. Accuracy of each model was examined through comparison of predicted 50-year-old equivalent macular GCL thickness for the entire cohort to a true 50-year-old reference cohort. Results Pattern recognition clustered GCL thickness across the macula into five to eight spatially concentric classes. F-test demonstrated segmented linear regression to be the most appropriate model for macular GCL change. The pattern recognition–derived and normalized model revealed less difference between the predicted macular GCL thickness and the reference cohort (average ± SD 0.19 ± 0.92 and −0.30 ± 0.61 μm) than a gridwise model (average ± SD 0.62 ± 1.43 μm). Conclusions Pattern recognition successfully identified statistically separable macular areas that undergo a segmented linear reduction with age. This regression model better predicted macular GCL thickness. The various unique spatial patterns revealed by pattern recognition combined with core GCL thickness data provide a framework to analyze GCL loss in ocular disease. PMID:28632847

  12. Simulating stream transport of nutrients in the eastern United States, 2002, using a spatially-referenced regression model and 1:100,000-scale hydrography

    USGS Publications Warehouse

    Hoos, Anne B.; Moore, Richard B.; Garcia, Ana Maria; Noe, Gregory B.; Terziotti, Silvia E.; Johnston, Craig M.; Dennis, Robin L.

    2013-01-01

    Existing Spatially Referenced Regression on Watershed attributes (SPARROW) nutrient models for the northeastern and southeastern regions of the United States were recalibrated to achieve a hydrographically consistent model with which to assess nutrient sources and stream transport and investigate specific management questions about the effects of wetlands and atmospheric deposition on nutrient transport. Recalibrated nitrogen models for the northeast and southeast were sufficiently similar to be merged into a single nitrogen model for the eastern United States. The atmospheric deposition source in the nitrogen model has been improved to account for individual components of atmospheric input, derived from emissions from agricultural manure, agricultural livestock, vehicles, power plants, other industry, and background sources. This accounting makes it possible to simulate the effects of altering an individual component of atmospheric deposition, such as nitrate emissions from vehicles or power plants. Regional differences in transport of phosphorus through wetlands and reservoirs were investigated and resulted in two distinct phosphorus models for the northeast and southeast. The recalibrated nitrogen and phosphorus models account explicitly for the influence of wetlands on regional-scale land-phase and aqueous-phase transport of nutrients and therefore allow comparison of the water-quality functions of different wetland systems over large spatial scales. Seven wetland systems were associated with enhanced transport of either nitrogen or phosphorus in streams, probably because of the export of dissolved organic nitrogen and bank erosion. Six wetland systems were associated with mitigating the delivery of either nitrogen or phosphorus to streams, probably because of sedimentation, phosphate sorption, and ground water infiltration.

  13. Modeling pluvial flooding damage in urban environments: spatial relationships between citizens' complaints and overland catchment areas

    NASA Astrophysics Data System (ADS)

    Gaitan, Santiago; ten Veldhuis, Marie-Claire; van de Giesen, Nick

    2013-04-01

    Extreme weather events such as floods and storms are expected to cause severe economic losses in The Netherlands. Cumulative damage due to pluvial flooding can be considerable, especially in lowland areas where this type of floods occurs relatively frequently. Currently, in The Netherlands, water-related damages to property and contents are covered through private insurance. As pluvial flooding is becoming heavier and more likely to occur, sound modelling of damages is required to ensure that insurance systems are able to stand as an adaptation measure. Current damage models based on rainfall intensity, registries of insurance claims, and classifications of building types are unable to fully explain damage variability. Further developments assessing additional explanatory factors and reducing uncertainties, are required in order to significantly explain damage. In this study, urban topography is used as an explanatory factor for modelling of urban pluvial flooding. Flood damage is evaluated based on complaints data, a valuable resource for assessing vulnerability to urban pluvial flooding. Though previous research has shown coincidences between the localization of high complaint counts and large size catchments areas in Rotterdam, additional research is needed to establish the precise spatial relationship of those two variables. This additional task is the focus of the presented work. To that end a data base of complaints, that was made available by the Municipality Administration of the City, will be analysed. It comprises close to 36800 complaints from 2004 to 2011. The geographical position of the registries is aggregated into 4 to 6-digit Postal Code zones, which represents entire streets or relative positions along a street, respectively. The Municipality also provided the DEM, characterized by a spatial resolution of 0.5 m × 0.5 m, a vertical precision of 5 cm, and an accuracy better than two standard deviations of 15 cm. 
    First, the localization of complaints will be tested for spatial randomness: the distribution of Global Moran's I will be used as a measure of spatial aggregation of complaints. We expect high values of spatial aggregation, which would confirm the existence of a spatial structure in the distribution of complaints. Then we will probe how much the extent of catchment areas influences that distribution. This will be done through both an ordinary least squares regression and a geographically weighted regression. By contrasting the results of these two regressions, the relationship between complaints and size of catchment area across the urban environment will be evaluated. The results will confirm whether complaints have a spatial distribution pattern. Furthermore, the results will provide insight into the importance of the size of catchment areas as a significant factor for complaints distribution, and for the assessment of urban vulnerability to pluvial flooding in the City of Rotterdam.
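
    The Global Moran's I statistic mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the study's workflow: the binary adjacency matrix for zones arranged in a row and the complaint counts are hypothetical, and only the statistic itself is computed, without the reference distribution needed for a significance test. The statistic is I = (n / S0) * sum_ij(w_ij * z_i * z_j) / sum_i(z_i^2), where z_i are deviations from the mean and S0 is the sum of all weights.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` under a square spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(weights[i][j] for i in range(n) for j in range(n))
    cross = sum(
        weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n)
    )
    total = sum(zi * zi for zi in z)
    return (n / s0) * (cross / total)


def chain_weights(n):
    """Binary adjacency for n zones in a row (hypothetical layout)."""
    return [
        [1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
        for i in range(n)
    ]


w = chain_weights(6)
clustered = [5.0, 5.0, 5.0, 0.0, 0.0, 0.0]    # high counts sit together
alternating = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]  # neighbours always differ
print("clustered:", morans_i(clustered, w))
print("alternating:", morans_i(alternating, w))
```

    Positive values indicate that similar counts cluster in neighbouring zones, and negative values indicate that neighbours tend to differ.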

  14. Spatial analysis of soybean canopy response to soybean cyst nematodes (Heterodera glycines) in eastern Arkansas: An approach to future precision agriculture technology application

    NASA Astrophysics Data System (ADS)

    Kulkarni, Subodh

    2008-10-01

    Heterodera glycines Ichinohe, commonly known as soybean cyst nematode (SCN), is a serious and widespread pathogen of soybean in the US. The present research primarily investigated the feasibility of detecting SCN infestation in the field using aerial images and ground-level spectrometric sensing. Non-spatial and spatial linear regression analyses were performed to correlate SCN population densities with the Normalized Difference Vegetation Index (NDVI) and Green NDVI (GNDVI) derived from soybean canopy spectra. Field data were obtained from two fields, Field A and Field B, under different nematode control strategies in 2003 and 2004. Analysis of aerial image data from July 18, 2004 from Field A showed a significant relationship between SCN population at planting and the GNDVI (R2 = 0.17 at p = 0.0006). Linear regression analysis revealed that SCN had a small effect on yield (R2 = 0.14 at p = 0.0001, RMSEP = 1052.42 kg ha-1) and GNDVI (R2 = 0.17 at p = 0.0006, RMSEP = 0.087) derived from the aerial imagery on a single date. However, the spatial regression analysis based on a spherical semivariogram showed that the RMSEP was 0.037 for the GNDVI on July 18, 2004 and 427.32 kg ha-1 for yield on October 14, 2003, indicating better model performance. For the July 18, 2004 data from Field B, the relationship between NDVI and the cyst counts at planting was significant (R2 = 0.5 at p = 0.0468). Non-spatial analyses of the ground-level spectrometric data for the first field showed that NDVI and GNDVI were correlated with cyst counts at planting (R2 = 0.34 and 0.27 at p = 0.0015 and 0.0127, respectively), and GNDVI was correlated with egg counts at planting (R2 = 0.27 at p = 0.0118). Both NDVI and GNDVI were correlated with egg counts at flowering (R2 = 0.34 and 0.27 at p = 0.0013 and 0.0018, respectively). However, a paired t-test to validate the above relationships showed that the predicted values of NDVI and GNDVI were significantly different.
    The statistical evidence suggested that variability in vegetation indices was caused by SCN infestation. Comparison of estimators such as -2 RLL, AIC, and BIC of non-spatial and spatial models affirmed that incorporating the spatial covariance structure of observations improved model performance. These results demonstrated a limited potential of aerial imaging and ground-level spectrometry for detecting nematode infestation in the field. However, it is strongly recommended that more multisite, multiyear trials be performed to establish and validate empirical models to quantify SCN population densities and their impact on soybean canopy reflectance.
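
    For readers unfamiliar with the two vegetation indices used in this record, the standard definitions can be sketched as follows. The reflectance values are hypothetical and are not taken from the study; the function names are illustrative. NDVI contrasts near-infrared (NIR) with red reflectance, and GNDVI replaces the red band with the green band.

```python
def normalized_difference(band_a, band_b):
    """Generic normalized difference (a - b) / (a + b)."""
    return (band_a - band_b) / (band_a + band_b)


def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return normalized_difference(nir, red)


def gndvi(nir, green):
    """Green NDVI: (NIR - Green) / (NIR + Green)."""
    return normalized_difference(nir, green)


# Hypothetical canopy reflectances (fractions between 0 and 1).
nir, red, green = 0.5, 0.1, 0.25
print("NDVI:", ndvi(nir, red))
print("GNDVI:", gndvi(nir, green))
```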

  15. Noninvasive diagnostics of skin microphysical parameters based on spatially resolved diffuse reflectance spectroscopy

    NASA Astrophysics Data System (ADS)

    Lisenko, S. A.; Kugeiko, M. M.

    2013-01-01

    The ability to determine noninvasively microphysical parameters (MPPs) of skin characteristic of malignant melanoma was demonstrated. The MPPs were the melanin content in dermis, saturation of tissue with blood vessels, and concentration and effective size of tissue scatterers. The proposed method was based on spatially resolved spectral measurements of skin diffuse reflectance and multiple regressions between linearly independent measurement components and skin MPPs. The regressions were established by modeling radiation transfer in skin with a wide variation of its MPPs. Errors in the determination of skin MPPs were estimated using fiber-optic measurements of its diffuse reflectance at wavelengths of commercially available semiconductor diode lasers (578, 625, 660, 760, and 806 nm) at source-detector separations of 0.23-1.38 mm.

  16. Exploring the Spatial Association between Social Deprivation and Cardiovascular Disease Mortality at the Neighborhood Level.

    PubMed

    Ford, Mary Margaret; Highfield, Linda D

    2016-01-01

    Cardiovascular disease (CVD), the leading cause of death in the United States, is impacted by neighborhood-level factors including social deprivation. To measure the association between social deprivation and CVD mortality in Harris County, Texas, global (Ordinary Least Squares (OLS)) and local (Geographically Weighted Regression (GWR)) models were built. The models explored the spatial variation in the relationship at a census-tract level while controlling for age, income by race, and education. A significant and spatially varying association (p < .01) was found between social deprivation and CVD mortality, when controlling for all other factors in the model. The GWR model provided a better model fit over the analogous OLS model (R2 = .65 vs. .57), reinforcing the importance of geography and neighborhood of residence in the relationship between social deprivation and CVD mortality. Findings from the GWR model can be used to identify neighborhoods at greatest risk for poor health outcomes and to inform the placement of community-based interventions.

  17. Smooth individual level covariates adjustment in disease mapping.

    PubMed

    Huque, Md Hamidul; Anderson, Craig; Walton, Richard; Woolford, Samuel; Ryan, Louise

    2018-05-01

    Spatial models for disease mapping should ideally account for covariates measured both at individual and area levels. The newly available "indiCAR" model fits the popular conditional autoregressive (CAR) model by accommodating both individual and group level covariates while adjusting for spatial correlation in the disease rates. This algorithm has been shown to be effective but assumes log-linear associations between individual level covariates and outcome. In many studies, the relationship between individual level covariates and the outcome may be non-log-linear, and methods to track such nonlinearity between individual level covariate and outcome in spatial regression modeling are not well developed. In this paper, we propose a new algorithm, smooth-indiCAR, to fit an extension to the popular conditional autoregressive model that can accommodate both linear and nonlinear individual level covariate effects while adjusting for group level covariates and spatial correlation in the disease rates. In this formulation, the effect of a continuous individual level covariate is accommodated via penalized splines. We describe a two-step estimation procedure to obtain reliable estimates of individual and group level covariate effects where both individual and group level covariate effects are estimated separately. This distributed computing framework enhances its application in the Big Data domain with a large number of individual/group level covariates. We evaluate the performance of smooth-indiCAR through simulation. Our results indicate that the smooth-indiCAR method provides reliable estimates of all regression and random effect parameters. We illustrate our proposed methodology with an analysis of data on neutropenia admissions in New South Wales (NSW), Australia. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  18. Three-dimensional mapping of soil chemical characteristics at micrometric scale: Statistical prediction by combining 2D SEM-EDX data and 3D X-ray computed micro-tomographic images

    NASA Astrophysics Data System (ADS)

    Hapca, Simona

    2015-04-01

    Many soil properties and functions emerge from interactions of physical, chemical and biological processes at microscopic scales, which can be understood only by integrating techniques that traditionally are developed within separate disciplines. While recent advances in imaging techniques, such as X-ray computed tomography (X-ray CT), offer the possibility to reconstruct the 3D physical structure at fine resolutions, for the distribution of chemicals in soil, existing methods, based on scanning electron microscope (SEM) and energy dispersive X-ray detection (EDX), allow for characterization of the chemical composition only on 2D surfaces. At present, direct 3D measurement techniques are still lacking; sequential sectioning of soils, followed by 2D mapping of chemical elements and interpolation to 3D, is an alternative which is explored in this study. Specifically, we develop an integrated experimental and theoretical framework which combines the 3D X-ray CT imaging technique with 2D SEM-EDX and uses spatial statistics methods to map the chemical composition of soil in 3D. The procedure involves three stages: 1) scanning a resin-impregnated soil cube by X-ray CT, followed by precision cutting to produce parallel thin slices, the surfaces of which are scanned by SEM-EDX, 2) alignment of the 2D chemical maps within the internal 3D structure of the soil cube, and 3) development of spatial statistics methods to predict the chemical composition of 3D soil based on the observed 2D chemical and 3D physical data. Specifically, three statistical models consisting of a regression tree, a regression-tree kriging model and a cokriging model were used to predict the 3D spatial distribution of carbon, silicon, iron and oxygen in soil, these chemical elements showing a good spatial agreement between the X-ray grayscale intensities and the corresponding 2D SEM-EDX data.
    Due to the spatial correlation between the physical and chemical data, the regression-tree model showed great potential in predicting chemical composition, in particular for iron, which is generally sparsely distributed in soil. For carbon, silicon and oxygen, which are more densely distributed, the additional kriging of the regression-tree residuals significantly improved the prediction, whereas prediction based on co-kriging was less consistent across replicates, underperforming regression-tree kriging. The present study shows great potential in integrating geo-statistical methods with imaging techniques to unveil the 3D chemical structure of soil at very fine scales, the framework being suitable to be further applied to other types of imaging data such as images of biological thin sections for characterization of microbial distribution. Key words: X-ray CT, SEM-EDX, segmentation techniques, spatial correlation, 3D soil images, 2D chemical maps.

  19. Bayesian quantitative precipitation forecasts in terms of quantiles

    NASA Astrophysics Data System (ADS)

    Bentzien, Sabrina; Friederichs, Petra

    2014-05-01

    Ensemble prediction systems (EPS) for numerical weather predictions on the mesoscale are particularly developed to obtain probabilistic guidance for high impact weather. An EPS issues not only a deterministic future state of the atmosphere but also a sample of possible future states. Ensemble postprocessing then translates such a sample of forecasts into probabilistic measures. This study focuses on probabilistic quantitative precipitation forecasts in terms of quantiles. Quantiles are particularly suitable to describe precipitation at various locations, since no assumption is required on the distribution of precipitation. The focus is on the prediction during high-impact events and is related to the Volkswagen Stiftung funded project WEX-MOP (Mesoscale Weather Extremes - Theory, Spatial Modeling and Prediction). Quantile forecasts are derived from the raw ensemble and via quantile regression. The neighborhood method and time-lagging are effective tools to inexpensively increase the ensemble spread, which results in more reliable forecasts especially for extreme precipitation events. Since an EPS provides a large amount of potentially informative predictors, a variable selection is required in order to obtain a stable statistical model. A Bayesian formulation of quantile regression allows for inference about the selection of predictive covariates by the use of appropriate prior distributions. Moreover, the implementation of an additional process layer for the regression parameters accounts for spatial variations of the parameters. Bayesian quantile regression and its spatially adaptive extension are illustrated for the German-focused mesoscale weather prediction ensemble COSMO-DE-EPS, which has run (pre)operationally since December 2010 at the German Meteorological Service (DWD). Objective out-of-sample verification uses the quantile score (QS), a weighted absolute error between quantile forecasts and observations.
    The QS is a proper scoring function and can be decomposed into reliability, resolution and uncertainty parts. A quantile reliability plot gives detailed insights into the predictive performance of the quantile forecasts.
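
    The quantile score described above as a weighted absolute error can be sketched with the standard check (pinball) loss. This is a minimal illustration with hypothetical forecast and observation values, and the function names are not taken from the study. For a quantile level tau and the difference u = observation - forecast, the loss is tau * u when u >= 0 and (tau - 1) * u when u < 0, so under-forecasts and over-forecasts are weighted asymmetrically.

```python
def check_loss(u, tau):
    """Check (pinball) loss: tau * u if u >= 0, else (tau - 1) * u."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def quantile_score(forecasts, observations, tau):
    """Mean check loss of tau-quantile forecasts against observations."""
    losses = [
        check_loss(obs - fc, tau)
        for fc, obs in zip(forecasts, observations)
    ]
    return sum(losses) / len(losses)


# Hypothetical precipitation values in mm.
forecasts = [1.0, 2.0]
observations = [2.0, 0.0]
print("QS at tau = 0.5:", quantile_score(forecasts, observations, 0.5))
print("QS at tau = 0.9:", quantile_score(forecasts, observations, 0.9))
```

    At tau = 0.5 the score equals half the mean absolute error; at high quantile levels an observation above the forecast is penalized far more heavily than one below it. Lower scores indicate better quantile forecasts.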

  20. Multinomial Logistic Regression Predicted Probability Map To Visualize The Influence Of Socio-Economic Factors On Breast Cancer Occurrence in Southern Karnataka

    NASA Astrophysics Data System (ADS)

    Madhu, B.; Ashok, N. C.; Balasubramanian, S.

    2014-11-01

    Multinomial logistic regression analysis was used to develop a statistical model that can predict the probability of breast cancer in Southern Karnataka using breast cancer occurrence data from 2007-2011. Independent socio-economic variables describing breast cancer occurrence, such as age, education, occupation, parity, type of family, health insurance coverage, residential locality and socioeconomic status, were obtained for each case. The models were developed as follows: i) Spatial visualization of the urban-rural distribution of breast cancer cases obtained from the Bharat Hospital and Institute of Oncology. ii) Socio-economic risk factors describing the breast cancer occurrences were compiled for each case. These data were then analysed using multinomial logistic regression analysis in SPSS statistical software; relations between the occurrence of breast cancer across socio-economic status and the influence of other socio-economic variables were evaluated, and multinomial logistic regression models were constructed. iii) The model that best predicted the occurrence of breast cancer was identified. This multivariate logistic regression model was entered into a geographic information system, and maps showing the predicted probability of breast cancer occurrence in Southern Karnataka were created. This study demonstrates that multinomial logistic regression is a valuable tool for developing models that predict the probability of breast cancer occurrence in Southern Karnataka.

  1. Modelling typhoid risk in Dhaka Metropolitan Area of Bangladesh: the role of socio-economic and environmental factors

    PubMed Central

    2013-01-01

    Background Developing countries in South Asia, such as Bangladesh, bear a disproportionate burden of diarrhoeal diseases such as Cholera, Typhoid and Paratyphoid. These seem to be aggravated by a number of social and environmental factors such as lack of access to safe drinking water, overcrowdedness and poor hygiene brought about by poverty. Some socioeconomic data can be obtained from census data whilst others are more difficult to elucidate. This study considers a range of both census data and spatial data from other sources, including remote sensing, as potential predictors of typhoid risk. Typhoid data are aggregated from hospital admission records for the period from 2005 to 2009. The spatial and statistical structures of the data are analysed and Principal Axis Factoring is used to reduce the degree of co-linearity in the data. The resulting factors are combined into a Quality of Life index, which in turn is used in a regression model of typhoid occurrence and risk. Results The three Principal Factors used together explain 87% of the variance in the initial candidate predictors, which eminently qualifies them for use as a set of uncorrelated explanatory variables in a linear regression model. Initial regression results using Ordinary Least Squares (OLS) were disappointing; this was explainable by analysis of the spatial autocorrelation inherent in the Principal Factors. The use of Geographically Weighted Regression caused a considerable increase in the predictive power of regressions based on these factors. The best prediction, determined by analysis of the Akaike Information Criterion (AIC), was found when the three factors were combined into a quality of life index, using a method previously published by others, and had a coefficient of determination of 73%.
Conclusions The typhoid occurrence/risk prediction equation was used to develop the first risk map showing areas of Dhaka Metropolitan Area whose inhabitants are at greater or lesser risk of typhoid infection. This, coupled with seasonal information on typhoid incidence also reported in this paper, has the potential to advise public health professionals on developing prevention strategies such as targeted vaccination. PMID:23497202

  2. Modelling typhoid risk in Dhaka metropolitan area of Bangladesh: the role of socio-economic and environmental factors.

    PubMed

    Corner, Robert J; Dewan, Ashraf M; Hashizume, Masahiro

    2013-03-16

    Developing countries in South Asia, such as Bangladesh, bear a disproportionate burden of diarrhoeal diseases such as cholera, typhoid and paratyphoid. These seem to be aggravated by a number of social and environmental factors such as lack of access to safe drinking water, overcrowding and poor hygiene brought about by poverty. Some socioeconomic data can be obtained from census data whilst others are more difficult to elucidate. This study considers a range of both census data and spatial data from other sources, including remote sensing, as potential predictors of typhoid risk. Typhoid data are aggregated from hospital admission records for the period from 2005 to 2009. The spatial and statistical structures of the data are analysed and principal axis factoring is used to reduce the degree of co-linearity in the data. The resulting factors are combined into a quality of life index, which in turn is used in a regression model of typhoid occurrence and risk. The three principal factors used together explain 87% of the variance in the initial candidate predictors, which eminently qualifies them for use as a set of uncorrelated explanatory variables in a linear regression model. Initial regression results using ordinary least squares (OLS) were disappointing; this was explainable by analysis of the spatial autocorrelation inherent in the principal factors. The use of geographically weighted regression caused a considerable increase in the predictive power of regressions based on these factors. The best prediction, determined by analysis of the Akaike information criterion (AIC), was found when the three factors were combined into a quality of life index, using a method previously published by others, and had a coefficient of determination of 73%. The typhoid occurrence/risk prediction equation was used to develop the first risk map showing areas of Dhaka metropolitan area whose inhabitants are at greater or lesser risk of typhoid infection. This, coupled with seasonal information on typhoid incidence also reported in this paper, has the potential to advise public health professionals on developing prevention strategies such as targeted vaccination.

  3. Acute Effects of Nitrogen Dioxide on Cardiovascular Mortality in Beijing: An Exploration of Spatial Heterogeneity and the District-specific Predictors

    NASA Astrophysics Data System (ADS)

    Luo, Kai; Li, Runkui; Li, Wenjing; Wang, Zongshuang; Ma, Xinming; Zhang, Ruiming; Fang, Xin; Wu, Zhenglai; Cao, Yang; Xu, Qun

    2016-12-01

    The exploration of spatial variation and predictors of the effects of nitrogen dioxide (NO2) on fatal health outcomes is still sparse. In a multilevel case-crossover study in Beijing, China, we used mixed Cox proportional hazard model to examine the citywide effects and conditional logistic regression to evaluate the district-specific effects of NO2 on cardiovascular mortality. District-specific predictors that could be related to the spatial pattern of NO2 effects were examined by robust regression models. We found that a 10 μg/m3 increase in daily mean NO2 concentration was associated with a 1.89% [95% confidence interval (CI): 1.33-2.45%], 2.07% (95% CI: 1.23-2.91%) and 1.95% (95% CI: 1.16-2.72%) increase in daily total cardiovascular (lag03), cerebrovascular (lag03) and ischemic heart disease (lag02) mortality, respectively. For spatial variation of NO2 effects across 16 districts, significant effects were only observed in 5, 4 and 2 districts for the above three outcomes, respectively. Generally, NO2 was likely having greater adverse effects on districts with larger population, higher consumption of coal and more civilian vehicles. Our results suggested independent and spatially varied effects of NO2 on total and subcategory cardiovascular mortalities. The identification of districts with higher risk can provide important insights for reducing NO2 related health hazards.

  4. A spectral-spatial-dynamic hierarchical Bayesian (SSD-HB) model for estimating soybean yield

    NASA Astrophysics Data System (ADS)

    Kazama, Yoriko; Kujirai, Toshihiro

    2014-10-01

    A method called a "spectral-spatial-dynamic hierarchical-Bayesian (SSD-HB) model," which can deal with many parameters (such as spectral and weather information all together) by reducing the occurrence of multicollinearity, is proposed. Experiments conducted on soybean yields in Brazil fields with a RapidEye satellite image indicate that the proposed SSD-HB model can predict soybean yield with a higher degree of accuracy than other estimation methods commonly used in remote-sensing applications. In the case of the SSD-HB model, the mean absolute error between estimated yield of the target area and actual yield is 0.28 t/ha, compared to 0.34 t/ha when conventional PLS regression was applied, showing the potential effectiveness of the proposed model.

  5. Asymptotics of nonparametric L-1 regression models with dependent data

    PubMed Central

    ZHAO, ZHIBIAO; WEI, YING; LIN, DENNIS K.J.

    2013-01-01

    We investigate asymptotic properties of least-absolute-deviation or median quantile estimates of the location and scale functions in nonparametric regression models with dependent data from multiple subjects. Under a general dependence structure that allows for longitudinal data and some spatially correlated data, we establish uniform Bahadur representations for the proposed median quantile estimates. The obtained Bahadur representations provide deep insights into the asymptotic behavior of the estimates. Our main theoretical development is based on studying the modulus of continuity of kernel weighted empirical process through a coupling argument. Progesterone data is used for an illustration. PMID:24955016

  6. Spatial landscape model to characterize biological diversity using R statistical computing environment.

    PubMed

    Singh, Hariom; Garg, R D; Karnatak, Harish C; Roy, Arijit

    2018-01-15

    Due to urbanization and population growth, the degradation of natural forests and associated biodiversity is now widely recognized as a global environmental concern. Hence, there is an urgent need for rapid, priority assessment and monitoring of biodiversity using state-of-the-art tools and technologies. The main purpose of this research article is to develop and implement a new methodological approach to characterize biological diversity using a spatial model developed during the study, the Spatial Biodiversity Model (SBM). The developed model is a scale-, resolution- and location-independent solution for spatial biodiversity richness modelling. The platform-independent computation model is based on parallel computation. The biodiversity model, based on open-source software, has been implemented on the R statistical computing platform. It provides information on high-disturbance and high-biological-richness areas through different landscape indices and site-specific information (e.g. forest fragmentation (FR), disturbance index (DI) etc.). The model has been developed based on a case study of the Indian landscape; however, it can be implemented in any part of the world. As a case study, SBM has been tested for Uttarakhand state in India. Inputs for landscape ecology are derived through multi-criteria decision making (MCDM) techniques in an interactive command line environment. MCDM with sensitivity analysis in the spatial domain has been carried out to illustrate the model stability and robustness. Furthermore, spatial regression analysis has been performed to validate the output. Copyright © 2017 Elsevier Ltd. All rights reserved.

  7. Spatial modelling of landscape aesthetic potential in urban-rural fringes.

    PubMed

    Sahraoui, Yohan; Clauzel, Céline; Foltête, Jean-Christophe

    2016-10-01

    The aesthetic potential of landscape has to be modelled to provide tools for land-use planning. This involves identifying landscape attributes and revealing individuals' landscape preferences. Landscape aesthetic judgments of individuals (n = 1420) were studied by means of a photo-based survey. A set of landscape visibility metrics was created to measure landscape composition and configuration in each photograph using spatial data. These metrics were used as explanatory variables in multiple linear regressions to explain aesthetic judgments. We demonstrate that landscape aesthetic judgments may be synthesized in three consensus groups. The statistical results obtained show that landscape visibility metrics have good explanatory power. Ultimately, we propose a spatial modelling of landscape aesthetic potential based on these results combined with systematic computation of visibility metrics. Copyright © 2016 Elsevier Ltd. All rights reserved.

  8. Can spatial statistical river temperature models be transferred between catchments?

    NASA Astrophysics Data System (ADS)

    Jackson, Faye L.; Fryer, Robert J.; Hannah, David M.; Malcolm, Iain A.

    2017-09-01

    There has been increasing use of spatial statistical models to understand and predict river temperature (Tw) from landscape covariates. However, it is not financially or logistically feasible to monitor all rivers and the transferability of such models has not been explored. This paper uses Tw data from four river catchments collected in August 2015 to assess how well spatial regression models predict the maximum 7-day rolling mean of daily maximum Tw (Twmax) within and between catchments. Models were fitted for each catchment separately using (1) landscape covariates only (LS models) and (2) landscape covariates and an air temperature (Ta) metric (LS_Ta models). All the LS models included upstream catchment area and three included a river network smoother (RNS) that accounted for unexplained spatial structure. The LS models transferred reasonably to other catchments, at least when predicting relative levels of Twmax. However, the predictions were biased when mean Twmax differed between catchments. The RNS was needed to characterise and predict finer-scale spatially correlated variation. Because the RNS was unique to each catchment and thus non-transferable, predictions were better within catchments than between catchments. A single model fitted to all catchments found no interactions between the landscape covariates and catchment, suggesting that the landscape relationships were transferable. The LS_Ta models transferred less well, with particularly poor performance when the relationship with the Ta metric was physically implausible or required extrapolation outside the range of the data. A single model fitted to all catchments found catchment-specific relationships between Twmax and the Ta metric, indicating that the Ta metric was not transferable. These findings improve our understanding of the transferability of spatial statistical river temperature models and provide a foundation for developing new approaches for predicting Tw at unmonitored locations across multiple catchments and larger spatial scales.

  9. Modeling the spatio-temporal heterogeneity in the PM10-PM2.5 relationship

    NASA Astrophysics Data System (ADS)

    Chu, Hone-Jay; Huang, Bo; Lin, Chuan-Yao

    2015-02-01

    This paper explores the spatio-temporal patterns of particulate matter (PM) in Taiwan based on a series of methods. Using fuzzy c-means clustering first, the spatial heterogeneity (six clusters) in the PM data collected between 2005 and 2009 in Taiwan are identified and the industrial and urban areas of Taiwan (southwestern, west central, northwestern, and northern Taiwan) are found to have high PM concentrations. The PM10-PM2.5 relationship is then modeled with global ordinary least squares regression, geographically weighted regression (GWR), and geographically and temporally weighted regression (GTWR). The GTWR and GWR produce consistent results; however, GTWR provides more detailed information of spatio-temporal variations of the PM10-PM2.5 relationship. The results also show that GTWR provides a relatively high goodness of fit and sufficient space-time explanatory power. In particular, the PM2.5 or PM10 varies with time and space, depending on weather conditions and the spatial distribution of land use and emission patterns in local areas. Such information can be used to determine patterns of spatio-temporal heterogeneity in PM that will allow the control of pollutants and the reduction of public exposure.
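
    To make the geographically weighted regression (GWR) idea in this record concrete, here is a minimal sketch rather than the authors' implementation. It assumes a single predictor, a Gaussian distance kernel and a fixed bandwidth; the function name `gwr_local_fit`, the coordinates and the PM values are all hypothetical and chosen only so that the local slopes differ between two regions.

```python
import math

def gwr_local_fit(coords, x, y, target, bandwidth):
    """Weighted least-squares fit of y = a + b*x centred on `target`.

    Each observation is weighted by a Gaussian kernel of its distance to
    `target`, so nearby observations dominate the local estimate.
    (Hypothetical helper for illustration only.)
    """
    weights = []
    for cx, cy in coords:
        d = math.hypot(cx - target[0], cy - target[1])
        weights.append(math.exp(-0.5 * (d / bandwidth) ** 2))
    sw = sum(weights)
    # Weighted means of predictor and response
    mx = sum(w * xi for w, xi in zip(weights, x)) / sw
    my = sum(w * yi for w, yi in zip(weights, y)) / sw
    # Weighted sums of squares and cross-products
    sxx = sum(w * (xi - mx) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - mx) * (yi - my) for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

# Made-up monitoring sites: three in the "west", three in the "east".
coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
pm10 = [20.0, 40.0, 60.0, 20.0, 40.0, 60.0]
pm25 = [10.0, 20.0, 30.0, 16.0, 32.0, 48.0]  # ratio 0.5 in the west, 0.8 in the east

_, west_slope = gwr_local_fit(coords, pm10, pm25, target=(1, 0), bandwidth=1.0)
_, east_slope = gwr_local_fit(coords, pm10, pm25, target=(11, 0), bandwidth=1.0)
# A very large bandwidth gives every site the same weight, i.e. a global OLS fit.
_, global_slope = gwr_local_fit(coords, pm10, pm25, target=(6, 0), bandwidth=1e9)

print(west_slope, east_slope, global_slope)
```

    With a small bandwidth the two regions yield different local slopes, while the global fit averages them into a single coefficient; this is the kind of spatial heterogeneity a GWR is meant to expose. A temporally weighted variant would add a time distance to the kernel, which this sketch does not attempt.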

  10. [Analysis of influence on spatial distribution of fishing ground for Antarctic krill fishery in the northern South Shetland Islands based on GWR model].

    PubMed

    Chen, Lyu Feng; Zhu, Guo Ping

    2018-03-01

    Based on Antarctic krill fishery and marine environmental data collected by scientific observers, we used a geographically weighted regression (GWR) model to analyze the effects of factors with spatial attributes, i.e., depth of krill swarm (DKS) and distance from fishing position to shore (DTS), and of sea surface temperature (SST), on the spatial distribution of the fishing ground in the northern South Shetland Islands. The results showed that there was no significant aggregation in the spatial distribution of catch per unit fishing effort (CPUE). Positive spatial autocorrelations among the three factors were observed in 2010 and 2013, but not in 2012 and 2016. Results from the GWR model showed that the extent of the impacts on the spatial distribution of CPUE varied among the three factors, following the order DKS>SST>DTS. Compared to DKS and DTS, the impact of SST on the spatial distribution of CPUE showed opposite trends in the eastern and western parts of the South Shetland Islands. Negative correlations occurred for the spatial effects of DKS and DTS on the distribution of CPUE, though with inter-annual and regional variation. Our results provide a methodological reference for research on the underlying mechanism of fishing ground formation in the Antarctic krill fishery.

  11. Human impact on sediment fluxes within the Blue Nile and Atbara River basins

    NASA Astrophysics Data System (ADS)

    Balthazar, Vincent; Vanacker, Veerle; Girma, Atkilt; Poesen, Jean; Golla, Semunesh

    2013-01-01

    A regional assessment of the spatial variability in sediment yields allows filling the gap between detailed, process-based understanding of erosion at field scale and empirical sediment flux models at global scale. In this paper, we focus on the intrabasin variability in sediment yield within the Blue Nile and Atbara basins as biophysical and anthropogenic factors are presumably acting together to accelerate soil erosion. The Blue Nile and Atbara River systems are characterized by an important spatial variability in sediment fluxes, with area-specific sediment yield (SSY) values ranging between 4 and 4935 t/km2/y. Statistical analyses show that 41% of the observed variation in SSY can be explained by remote sensing proxy data of surface vegetation cover, rainfall intensity, mean annual temperature, and human impact. The comparison of a locally adapted regression model with global predictive sediment flux models indicates that global flux models such as the ART and BQART models are less suited to capture the spatial variability in area-specific sediment yields (SSY), but they are very efficient to predict absolute sediment yields (SY). We developed a modified version of the BQART model that estimates the human influence on sediment yield based on a high resolution composite measure of local human impact (human footprint index) instead of countrywide estimates of GNP/capita. Our modified version of the BQART is able to explain 80% of the observed variation in SY for the Blue Nile and Atbara basins and thereby performs only slightly less than locally adapted regression models.

  12. Spatio-temporal modeling of chronic PM 10 exposure for the Nurses' Health Study

    NASA Astrophysics Data System (ADS)

    Yanosky, Jeff D.; Paciorek, Christopher J.; Schwartz, Joel; Laden, Francine; Puett, Robin; Suh, Helen H.

    2008-06-01

    Chronic epidemiological studies of airborne particulate matter (PM) have typically characterized the chronic PM exposures of their study populations using city- or county-wide ambient concentrations, which limit the studies to areas where nearby monitoring data are available and which ignore within-city spatial gradients in ambient PM concentrations. To provide more spatially refined and precise chronic exposure measures, we used a Geographic Information System (GIS)-based spatial smoothing model to predict monthly outdoor PM10 concentrations in the northeastern and midwestern United States. This model included monthly smooth spatial terms and smooth regression terms of GIS-derived and meteorological predictors. Using cross-validation and other pre-specified selection criteria, terms for distance to road by road class, urban land use, block group and county population density, point- and area-source PM10 emissions, elevation, wind speed, and precipitation were found to be important determinants of PM10 concentrations and were included in the final model. Final model performance was strong (cross-validation R2=0.62), with little bias (-0.4 μg m-3) and high precision (6.4 μg m-3). The final model (with monthly spatial terms) performed better than a model with seasonal spatial terms (cross-validation R2=0.54). The addition of GIS-derived and meteorological predictors improved predictive performance over spatial smoothing (cross-validation R2=0.51) or inverse distance weighted interpolation (cross-validation R2=0.29) methods alone and increased the spatial resolution of predictions. The model performed well in both rural and urban areas, across seasons, and across the entire time period. The strong model performance demonstrates its suitability as a means to estimate individual-specific chronic PM10 exposures for large populations.
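
    The inverse distance weighted (IDW) interpolation that this record uses as a baseline can be sketched in a few lines. This is a generic illustration, not the study's code: the function name `idw_predict`, the power of 2 and the monitor values are assumptions made for the example.

```python
import math

def idw_predict(sites, values, target, power=2.0):
    """Inverse distance weighted prediction at `target`.

    Each monitor contributes with weight 1 / distance**power; if the
    target coincides with a monitor, that monitor's value is returned.
    (Hypothetical helper for illustration only.)
    """
    num = 0.0
    den = 0.0
    for (sx, sy), v in zip(sites, values):
        d = math.hypot(sx - target[0], sy - target[1])
        if d == 0.0:
            return v
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den

# Two made-up monitors with PM10 readings in arbitrary units.
sites = [(0.0, 0.0), (2.0, 0.0)]
pm10 = [10.0, 30.0]

midpoint = idw_predict(sites, pm10, (1.0, 0.0))     # equidistant from both monitors
near_first = idw_predict(sites, pm10, (0.5, 0.0))   # closer to the first monitor
at_monitor = idw_predict(sites, pm10, (0.0, 0.0))   # exactly on the first monitor

print(midpoint, near_first, at_monitor)
```

    IDW uses only the monitor locations and readings, which is consistent with the record's finding that adding GIS-derived and meteorological predictors improved on interpolation alone.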

  13. Monthly Rainfall Erosivity Assessment for Switzerland

    NASA Astrophysics Data System (ADS)

    Schmidt, Simon; Meusburger, Katrin; Alewell, Christine

    2016-04-01

    Water erosion is crucially controlled by rainfall erosivity, which is quantified from the kinetic energy of raindrop impact and associated surface runoff. Rainfall erosivity is often expressed as the R-factor in soil erosion risk models like the Universal Soil Loss Equation (USLE) and its revised version (RUSLE). Just like precipitation, the rainfall erosivity of Switzerland has a characteristic seasonal dynamic throughout the year. This inter-annual variability is to be assessed by a monthly and seasonal modelling approach. We used a network of 86 precipitation gauging stations with a 10-minute temporal resolution to calculate long-term average monthly R-factors. Stepwise regression and Monte Carlo Cross Validation (MCCV) were used to select spatial covariates to explain the spatial pattern of the R-factor for each month across Switzerland. The regionalized monthly R-factor is mapped by its individual regression equation and the ordinary kriging interpolation of its residuals (Regression-Kriging). As covariates, a variety of precipitation indicator data were included, such as snow height, a combination of hourly gauging measurements and radar observations (CombiPrecip), mean monthly alpine precipitation (EURO4M-APGD) and monthly precipitation sums (Rhires). Topographic parameters were also significant explanatory variables for single months. The comparison of all 12 monthly rainfall erosivity maps showed seasonality, with highest rainfall erosivity in summer (June, July, and August) and lowest rainfall erosivity in winter months. Besides the inter-annual temporal regime, a seasonal spatial variability was detectable. Spatial maps of monthly rainfall erosivity are presented for the first time for Switzerland. The assessment of the spatial and temporal dynamic behaviour of the R-factor is valuable for the identification of more susceptible seasons and regions as well as for the application of selective erosion control measures. A combination with monthly vegetation cover (C-factor) maps would enable the assessment of seasonal dynamics of erosion processes in Switzerland.

  14. A spatial analysis of social and economic determinants of tuberculosis in Brazil.

    PubMed

    Harling, Guy; Castro, Marcia C

    2014-01-01

    We investigated the spatial distribution, and social and economic correlates, of tuberculosis in Brazil between 2002 and 2009 using municipality-level age/sex-standardized tuberculosis notification data. Rates were very strongly spatially autocorrelated, being notably high in urban areas on the eastern seaboard and in the west of the country. Non-spatial ecological regression analyses found higher rates associated with urbanicity, population density, poor economic conditions, household crowding, non-white population and worse health and healthcare indicators. These associations remained in spatial conditional autoregressive models, although the effect of poverty appeared partially confounded by urbanicity, race and spatial autocorrelation, and partially mediated by household crowding. Our analysis highlights both the multiple relationships between socioeconomic factors and tuberculosis in Brazil, and the importance of accounting for spatial factors in analysing socioeconomic determinants of tuberculosis. © 2013 Published by Elsevier Ltd.

  15. Use of Forest Inventory and Analysis information in wildlife habitat modeling: a process for linking multiple scales

    Treesearch

    Thomas C. Edwards; Gretchen G. Moisen; Tracey S. Frescino; Joshua L. Lawler

    2002-01-01

    We describe our collective efforts to develop and apply methods for using FIA data to model forest resources and wildlife habitat. Our work demonstrates how flexible regression techniques, such as generalized additive models, can be linked with spatially explicit environmental information for the mapping of forest type and structure. We illustrate how these maps of...

  16. Regional interpretation of water-quality monitoring data

    USGS Publications Warehouse

    Smith, Richard A.; Schwarz, Gregory E.; Alexander, Richard B.

    1997-01-01

    We describe a method for using spatially referenced regressions of contaminant transport on watershed attributes (SPARROW) in regional water-quality assessment. The method is designed to reduce the problems of data interpretation caused by sparse sampling, network bias, and basin heterogeneity. The regression equation relates measured transport rates in streams to spatially referenced descriptors of pollution sources and land-surface and stream-channel characteristics. Regression models of total phosphorus (TP) and total nitrogen (TN) transport are constructed for a region defined as the nontidal conterminous United States. Observed TN and TP transport rates are derived from water-quality records for 414 stations in the National Stream Quality Accounting Network. Nutrient sources identified in the equations include point sources, applied fertilizer, livestock waste, nonagricultural land, and atmospheric deposition (TN only). Surface characteristics found to be significant predictors of land-water delivery include soil permeability, stream density, and temperature (TN only). Estimated instream decay coefficients for the two contaminants decrease monotonically with increasing stream size. TP transport is found to be significantly reduced by reservoir retention. Spatial referencing of basin attributes in relation to the stream channel network greatly increases their statistical significance and model accuracy. The method is used to estimate the proportion of watersheds in the conterminous United States (i.e., hydrologic cataloging units) with outflow TP concentrations less than the criterion of 0.1 mg/L, and to classify cataloging units according to local TN yield (kg/km2/yr).

  17. Spatial analysis of ambulance response times related to prehospital cardiac arrests in the city-state of Singapore.

    PubMed

    Earnest, Arul; Hock Ong, Marcus Eng; Shahidah, Nur; Min Ng, Wen; Foo, Chuanyang; Nott, David John

    2012-01-01

    The main objective of this study was to establish the spatial variation in ambulance response times for out-of-hospital cardiac arrests (OHCAs) in the city-state of Singapore. The secondary objective involved studying the relationships between various covariates, such as traffic condition and time and day of collapse, and ambulance response times. The study design was observational and ecological in nature. Data on OHCAs were collected from a nationally representative database for the period October 2001 to October 2004. We used the conditional autoregressive (CAR) model to analyze the data. Within the Bayesian framework of analysis, we used a Weibull regression model that took into account spatial random effects. The regression model was used to study the independent effects of each covariate. Our results showed that there was spatial heterogeneity in the ambulance response times in Singapore. Generally, areas in the far outskirts (suburbs), such as Boon Lay (in the west) and Sembawang (in the north), fared badly in terms of ambulance response times. This improved when adjusted for key covariates, including distance from the nearest fire station. Ambulance response time was also associated with better traffic conditions, weekend OHCAs, distance from the nearest fire station, and OHCAs occurring during nonpeak driving hours. For instance, the hazard ratio for good ambulance response time was 2.35 (95% credible interval [CI] 1.97-2.81) when traffic conditions were light and 1.72 (95% CI 1.51-1.97) when traffic conditions were moderate, as compared with heavy traffic. We found a clear spatial gradient for ambulance response times, with far-outlying areas' exhibiting poorer response times. Our study highlights the utility of this novel approach, which may be helpful for planning emergency medical services and public emergency responses.

  18. The effects of spatial autoregressive dependencies on inference in ordinary least squares: a geometric approach

    NASA Astrophysics Data System (ADS)

    Smith, Tony E.; Lee, Ka Lok

    2012-01-01

    There is a common belief that the presence of residual spatial autocorrelation in ordinary least squares (OLS) regression leads to inflated significance levels in beta coefficients and, in particular, inflated levels relative to the more efficient spatial error model (SEM). However, our simulations show that this is not always the case. Hence, the purpose of this paper is to examine this question from a geometric viewpoint. The key idea is to characterize the OLS test statistic in terms of angle cosines and examine the geometric implications of this characterization. Our first result is to show that if the explanatory variables in the regression exhibit no spatial autocorrelation, then the distribution of test statistics for individual beta coefficients in OLS is independent of any spatial autocorrelation in the error term. Hence, inferences about betas exhibit all the optimality properties of the classic uncorrelated error case. However, a second more important series of results show that if spatial autocorrelation is present in both the dependent and explanatory variables, then the conventional wisdom is correct. In particular, even when an explanatory variable is statistically independent of the dependent variable, such joint spatial dependencies tend to produce "spurious correlation" that results in over-rejection of the null hypothesis. The underlying geometric nature of this problem is clarified by illustrative examples. The paper concludes with a brief discussion of some possible remedies for this problem.
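
    Residual spatial autocorrelation of the kind discussed in this record is commonly quantified with Moran's I. The abstract does not say which statistic the authors used, so the following is only an assumed, minimal sketch; the function name `morans_i`, the four-site chain of neighbours and the residual values are hypothetical.

```python
def morans_i(values, weights):
    """Moran's I for `values` given a square spatial weights matrix.

    I = (n / S0) * sum_ij w_ij * z_i * z_j / sum_i z_i**2,
    where z are deviations from the mean and S0 is the sum of all weights.
    (Hypothetical helper for illustration only.)
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)

# Four sites along a line; each site neighbours the adjacent one(s).
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered_residuals = [1.0, 1.0, -1.0, -1.0]     # similar values sit next to each other
alternating_residuals = [1.0, -1.0, 1.0, -1.0]   # neighbours always differ

i_clustered = morans_i(clustered_residuals, w)
i_alternating = morans_i(alternating_residuals, w)

print(i_clustered, i_alternating)
```

    Positive values indicate that neighbouring residuals tend to be alike and negative values that they tend to differ. Judging significance would additionally require a permutation test or an analytical variance, which this sketch omits.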

  19. Re-assessing acalculia: Distinguishing spatial and purely arithmetical deficits in right-hemisphere damaged patients.

    PubMed

    Benavides-Varela, S; Piva, D; Burgio, F; Passarini, L; Rolma, G; Meneghello, F; Semenza, C

    2017-03-01

    Arithmetical deficits in right-hemisphere damaged patients have been traditionally considered secondary to visuo-spatial impairments, although the exact relationship between the two deficits has rarely been assessed. The present study implemented a voxelwise lesion analysis among 30 right-hemisphere damaged patients and a controlled, matched-sample, cross-sectional analysis with 35 cognitively normal controls regressing three composite cognitive measures on standardized numerical measures. The results showed that patients and controls significantly differ in Number comprehension, Transcoding, and Written operations, particularly subtractions and multiplications. The percentage of patients performing below the cutoffs ranged between 27% and 47% across these tasks. Spatial errors were associated with extensive lesions in fronto-temporo-parietal regions -which frequently lead to neglect- whereas pure arithmetical errors appeared related to more confined lesions in the right angular gyrus and its proximity. Stepwise regression models consistently revealed that spatial errors were primarily predicted by composite measures of visuo-spatial attention/neglect and representational abilities. Conversely, specific errors of arithmetic nature linked to representational abilities only. Crucially, the proportion of arithmetical errors (ranging from 65% to 100% across tasks) was higher than that of spatial ones. These findings thus suggest that unilateral right hemisphere lesions can directly affect core numerical/arithmetical processes, and that right-hemisphere acalculia is not only ascribable to visuo-spatial deficits as traditionally thought. Copyright © 2017 Elsevier Ltd. All rights reserved.

  20. Spatial design and strength of spatial signal: Effects on covariance estimation

    USGS Publications Warehouse

    Irvine, Kathryn M.; Gitelman, Alix I.; Hoeting, Jennifer A.

    2007-01-01

    In a spatial regression context, scientists are often interested in a physical interpretation of components of the parametric covariance function. For example, spatial covariance parameter estimates in ecological settings have been interpreted to describe spatial heterogeneity or “patchiness” in a landscape that cannot be explained by measured covariates. In this article, we investigate the influence of the strength of spatial dependence on maximum likelihood (ML) and restricted maximum likelihood (REML) estimates of covariance parameters in an exponential-with-nugget model, and we also examine these influences under different sampling designs—specifically, lattice designs and more realistic random and cluster designs—at differing intensities of sampling (n=144 and 361). We find that neither ML nor REML estimates perform well when the range parameter and/or the nugget-to-sill ratio is large—ML tends to underestimate the autocorrelation function and REML produces highly variable estimates of the autocorrelation function. The best estimates of both the covariance parameters and the autocorrelation function come under the cluster sampling design and large sample sizes. As a motivating example, we consider a spatial model for stream sulfate concentration.
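
    For readers unfamiliar with the exponential-with-nugget model named in this record, a minimal sketch of one common parameterization follows. The abstract does not give the exact form the authors used, so the parameterization C(h) = partial_sill * exp(-h / range_param) for h > 0, with C(0) = partial_sill + nugget, is an assumption, as are the function names and the numeric values.

```python
import math

def exp_nugget_cov(h, partial_sill, range_param, nugget):
    """Covariance at separation distance h under an exponential model with nugget.

    At h == 0 the covariance is the full sill (partial sill + nugget);
    for h > 0 only the spatially structured part remains and decays with h.
    (Assumed parameterization, for illustration only.)
    """
    if h == 0:
        return partial_sill + nugget
    return partial_sill * math.exp(-h / range_param)

def nugget_to_sill(partial_sill, nugget):
    """Share of the total variance that is spatially unstructured."""
    return nugget / (partial_sill + nugget)

def autocorrelation(h, partial_sill, range_param, nugget):
    """Autocorrelation function: covariance at h divided by the variance."""
    return (exp_nugget_cov(h, partial_sill, range_param, nugget)
            / exp_nugget_cov(0, partial_sill, range_param, nugget))

# Made-up parameter values.
variance = exp_nugget_cov(0, partial_sill=2.0, range_param=5.0, nugget=0.5)
cov_at_range = exp_nugget_cov(5.0, partial_sill=2.0, range_param=5.0, nugget=0.5)
ratio = nugget_to_sill(partial_sill=2.0, nugget=0.5)
rho_at_range = autocorrelation(5.0, partial_sill=2.0, range_param=5.0, nugget=0.5)

print(variance, cov_at_range, ratio, rho_at_range)
```

    A larger range parameter makes the correlation decay more slowly with distance, and a larger nugget-to-sill ratio lowers the correlation at every positive distance; these are the two regimes in which the record reports that ML and REML estimates perform poorly.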

  1. Equivalence of MAXENT and Poisson point process models for species distribution modeling in ecology.

    PubMed

    Renner, Ian W; Warton, David I

    2013-03-01

    Modeling the spatial distribution of a species is a fundamental problem in ecology. A number of modeling methods have been developed, an extremely popular one being MAXENT, a maximum entropy modeling approach. In this article, we show that MAXENT is equivalent to a Poisson regression model and hence is related to a Poisson point process model, differing only in the intercept term, which is scale-dependent in MAXENT. We illustrate a number of improvements to MAXENT that follow from these relations. In particular, a point process model approach facilitates methods for choosing the appropriate spatial resolution, assessing model adequacy, and choosing the LASSO penalty parameter, all currently unavailable to MAXENT. The equivalence result represents a significant step in the unification of the species distribution modeling literature. Copyright © 2013, The International Biometric Society.
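
    The equivalence stated in this abstract can be illustrated with a simplified, unpenalized derivation. The notation is assumed for illustration and is not the authors': $y_j$ is the number of presence records in grid cell $j$ of $m$ cells, $n = \sum_j y_j$, $z_j$ is the covariate vector of cell $j$, $\alpha$ is the Poisson intercept, and $\beta$ holds the slopes.

```latex
% Poisson regression log-likelihood on cell counts (log link)
\ell_{\mathrm{Pois}}(\alpha,\beta)
  = \sum_{j=1}^{m}\Bigl[\,y_j\,(\alpha + z_j^{\top}\beta) - e^{\alpha + z_j^{\top}\beta}\Bigr] + \text{const}

% Setting the derivative with respect to \alpha to zero gives the profiled intercept
\hat{\alpha}(\beta) = \log n - \log \sum_{k=1}^{m} e^{z_k^{\top}\beta}

% Substituting \hat{\alpha}(\beta) back leaves a likelihood in \beta alone
\ell_{\mathrm{Pois}}\bigl(\hat{\alpha}(\beta),\beta\bigr)
  = \sum_{j=1}^{m} y_j \log \frac{e^{z_j^{\top}\beta}}{\sum_{k=1}^{m} e^{z_k^{\top}\beta}}
    + n\log n - n + \text{const}
```

    The first term is the log-likelihood of a distribution proportional to $e^{z_j^{\top}\beta}$ over the cells, the form MAXENT fits, so under these assumptions the slope estimates coincide and only the intercept differs, consistent with the abstract's statement.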

  2. Mapping malaria risk among children in Côte d'Ivoire using Bayesian geo-statistical models.

    PubMed

    Raso, Giovanna; Schur, Nadine; Utzinger, Jürg; Koudou, Benjamin G; Tchicaya, Emile S; Rohner, Fabian; N'goran, Eliézer K; Silué, Kigbafori D; Matthys, Barbara; Assi, Serge; Tanner, Marcel; Vounatsou, Penelope

    2012-05-09

    In Côte d'Ivoire, an estimated 767,000 disability-adjusted life years are due to malaria, placing the country at position number 14 with regard to the global burden of malaria. Risk maps are important to guide control interventions, and hence, the aim of this study was to predict the geographical distribution of malaria infection risk in children aged <16 years in Côte d'Ivoire at high spatial resolution. Using different data sources, a systematic review was carried out to compile and geo-reference survey data on Plasmodium spp. infection prevalence in Côte d'Ivoire, focusing on children aged <16 years. The period from 1988 to 2007 was covered. A suite of Bayesian geo-statistical logistic regression models was fitted to analyse malaria risk. Non-spatial models with and without exchangeable random effect parameters were compared to stationary and non-stationary spatial models. Non-stationarity was modelled assuming that the underlying spatial process is a mixture of separate stationary processes in each ecological zone. The best fitting model based on the deviance information criterion was used to predict Plasmodium spp. infection risk for entire Côte d'Ivoire, including uncertainty. Overall, 235 data points at 170 unique survey locations with malaria prevalence data for individuals aged <16 years were extracted. Most data points (n = 182, 77.4%) were collected between 2000 and 2007. A Bayesian non-stationary regression model showed the best fit with annualized rainfall and maximum land surface temperature identified as significant environmental covariates. This model was used to predict malaria infection risk at non-sampled locations. High-risk areas were mainly found in the north-central and western area, while relatively low-risk areas were located in the north at the country border, in the north-east, in the south-east around Abidjan, and in the central-west between two high prevalence areas. 
The malaria risk map at high spatial resolution gives an important overview of the geographical distribution of the disease in Côte d'Ivoire. It is a useful tool for the national malaria control programme and can be utilized for spatial targeting of control interventions and rational resource allocation.

  3. Mapping malaria risk among children in Côte d’Ivoire using Bayesian geo-statistical models

    PubMed Central

    2012-01-01

    Background In Côte d’Ivoire, an estimated 767,000 disability-adjusted life years are due to malaria, placing the country at position number 14 with regard to the global burden of malaria. Risk maps are important to guide control interventions, and hence, the aim of this study was to predict the geographical distribution of malaria infection risk in children aged <16 years in Côte d’Ivoire at high spatial resolution. Methods Using different data sources, a systematic review was carried out to compile and geo-reference survey data on Plasmodium spp. infection prevalence in Côte d’Ivoire, focusing on children aged <16 years. The period from 1988 to 2007 was covered. A suite of Bayesian geo-statistical logistic regression models was fitted to analyse malaria risk. Non-spatial models with and without exchangeable random effect parameters were compared to stationary and non-stationary spatial models. Non-stationarity was modelled assuming that the underlying spatial process is a mixture of separate stationary processes in each ecological zone. The best fitting model based on the deviance information criterion was used to predict Plasmodium spp. infection risk for entire Côte d’Ivoire, including uncertainty. Results Overall, 235 data points at 170 unique survey locations with malaria prevalence data for individuals aged <16 years were extracted. Most data points (n = 182, 77.4%) were collected between 2000 and 2007. A Bayesian non-stationary regression model showed the best fit with annualized rainfall and maximum land surface temperature identified as significant environmental covariates. This model was used to predict malaria infection risk at non-sampled locations. High-risk areas were mainly found in the north-central and western area, while relatively low-risk areas were located in the north at the country border, in the north-east, in the south-east around Abidjan, and in the central-west between two high prevalence areas. 
Conclusion The malaria risk map at high spatial resolution gives an important overview of the geographical distribution of the disease in Côte d’Ivoire. It is a useful tool for the national malaria control programme and can be utilized for spatial targeting of control interventions and rational resource allocation. PMID:22571469

  4. Differential shift in spatial bias over time depends on observers' initial bias: Observer subtypes, or regression to the mean?

    PubMed

    Newman, Daniel P; Loughnane, Gerard M; Abe, Rafael; Zoratti, Marco T R; Martins, Ana C P; van den Bogert, Petra C; Kelly, Simon P; O'Connell, Redmond G; Bellgrove, Mark A

    2014-11-01

    Healthy subjects typically exhibit a subtle bias of visuospatial attention favouring left space that is commonly termed 'pseudoneglect'. This bias is attenuated, or shifted rightwards, with decreasing alertness over time, consistent with theoretical models proposing that pseudoneglect is a result of the right hemisphere's dominance in regulating attention. Although this 'time-on-task effect' for spatial bias is observed when averaging across whole samples of healthy participants, Benwell, Thut, Learmonth, and Harvey (2013b; "Spatial attention: differential shifts in pseudoneglect direction with time-on-task and initial bias support the idea of observer subtypes", Neuropsychologia, 51(13), 2747-2756) recently presented evidence that the direction and magnitude of bias exhibited by the participant early in the task (left biased, no bias, or right biased) were stable traits that predicted the direction of the subsequent time-on-task shift in spatial bias. That is, the spatial bias of participants who were initially left biased shifted in a rightward direction with time, whereas that of participants who were initially right biased shifted in a leftward direction. If valid, the data of Benwell et al. are potentially important and may demand a re-evaluation of current models of the neural networks governing spatial attention. Here we use two novel spatial attention tasks in an attempt to confirm the results of Benwell et al. We show that rather than being indicative of true participant subtypes, these data patterns are likely driven, at least in part, by 'regression towards the mean' arising from the analysis method employed. Although evidence supports the contention that trait-like individual differences in spatial bias exist within the healthy population, no clear evidence is yet available for participant/observer subtypes in the direction of time-on-task shift in spatial biases. Copyright © 2014 Elsevier Ltd. All rights reserved.
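
    The regression-towards-the-mean artefact described in this abstract can be illustrated with a small simulation. This is a sketch under assumed values (sample size, noise levels, and the ±0.5 grouping threshold are all hypothetical) and does not reproduce the authors' tasks or analysis. Each simulated participant has a stable bias measured twice with independent noise and no true time-on-task shift.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

trait = rng.normal(0.0, 1.0, n)            # stable individual bias (negative = leftward)
early = trait + rng.normal(0.0, 1.0, n)    # noisy early-task measurement
late = trait + rng.normal(0.0, 1.0, n)     # noisy late-task measurement, no true shift

shift = late - early

# Group participants by their *early* score, as in a subtype analysis.
left = early < -0.5                        # classified as initially left-biased
right = early > 0.5                        # classified as initially right-biased

mean_shift_left = shift[left].mean()       # comes out positive (apparent rightward shift)
mean_shift_right = shift[right].mean()     # comes out negative (apparent leftward shift)
overall_shift = shift.mean()               # close to zero: no real shift was simulated
```

    Because grouping uses the same noisy early score that enters the shift, the two groups appear to move in opposite directions even though no subtype effect was built into the data.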

  5. Regional regression models of watershed suspended-sediment discharge for the eastern United States

    NASA Astrophysics Data System (ADS)

    Roman, David C.; Vogel, Richard M.; Schwarz, Gregory E.

    2012-11-01

    Summary: Estimates of mean annual watershed sediment discharge, derived from long-term measurements of suspended-sediment concentration and streamflow, often are not available at locations of interest. The goal of this study was to develop multivariate regression models to enable prediction of mean annual suspended-sediment discharge from available basin characteristics useful for most ungaged river locations in the eastern United States. The models are based on long-term mean sediment discharge estimates and explanatory variables obtained from a combined dataset of 1201 US Geological Survey (USGS) stations derived from a SPAtially Referenced Regression on Watershed attributes (SPARROW) study and the Geospatial Attributes of Gages for Evaluating Streamflow (GAGES) database. The resulting regional regression models, summarized for major US water resources regions 1-8, exhibited prediction R2 values ranging from 76.9% to 92.7% and corresponding average model prediction errors ranging from 56.5% to 124.3%. Results from cross-validation experiments suggest that a majority of the models will perform similarly to calibration runs. The 36-parameter regional regression models also outperformed a 16-parameter national SPARROW model of suspended-sediment discharge and indicate that mean annual sediment loads in the eastern United States generally correlate with a combination of basin area, land use patterns, seasonal precipitation, soil composition, hydrologic modification, and to a lesser extent, topography.

  6. Regional regression models of watershed suspended-sediment discharge for the eastern United States

    USGS Publications Warehouse

    Roman, David C.; Vogel, Richard M.; Schwarz, Gregory E.

    2012-01-01

    Estimates of mean annual watershed sediment discharge, derived from long-term measurements of suspended-sediment concentration and streamflow, often are not available at locations of interest. The goal of this study was to develop multivariate regression models to enable prediction of mean annual suspended-sediment discharge from available basin characteristics useful for most ungaged river locations in the eastern United States. The models are based on long-term mean sediment discharge estimates and explanatory variables obtained from a combined dataset of 1201 US Geological Survey (USGS) stations derived from a SPAtially Referenced Regression on Watershed attributes (SPARROW) study and the Geospatial Attributes of Gages for Evaluating Streamflow (GAGES) database. The resulting regional regression models, summarized for major US water resources regions 1–8, exhibited prediction R2 values ranging from 76.9% to 92.7% and corresponding average model prediction errors ranging from 56.5% to 124.3%. Results from cross-validation experiments suggest that a majority of the models will perform similarly to calibration runs. The 36-parameter regional regression models also outperformed a 16-parameter national SPARROW model of suspended-sediment discharge and indicate that mean annual sediment loads in the eastern United States generally correlate with a combination of basin area, land use patterns, seasonal precipitation, soil composition, hydrologic modification, and to a lesser extent, topography.

  7. Satellite-based high-resolution PM2.5 estimation over the Beijing-Tianjin-Hebei region of China using an improved geographically and temporally weighted regression model.

    PubMed

    He, Qingqing; Huang, Bo

    2018-05-01

    Ground fine particulate matter (PM2.5) concentrations at high spatial resolution are substantially required for determining the population exposure to PM2.5 over densely populated urban areas. However, most studies for China have generated PM2.5 estimations at a coarse resolution (≥10 km) due to the limitation of satellite aerosol optical depth (AOD) product in spatial resolution. In this study, the 3 km AOD data fused using the Moderate Resolution Imaging Spectroradiometer (MODIS) Collection 6 AOD products were employed to estimate the ground PM2.5 concentrations over the Beijing-Tianjin-Hebei (BTH) region of China from January 2013 to December 2015. An improved geographically and temporally weighted regression (iGTWR) model incorporating seasonal characteristics within the data was developed, which achieved comparable performance to the standard GTWR model for the days with paired PM2.5-AOD samples (cross-validation (CV) R2 = 0.82) and showed better predictive power for the days without PM2.5-AOD pairs (the R2 increased from 0.24 to 0.46 in CV). Both iGTWR and GTWR (CV R2 = 0.84) significantly outperformed the daily geographically weighted regression model (CV R2 = 0.66). Also, the fused 3 km AODs improved data availability and presented more spatial gradients, thereby enhancing model performance compared with the MODIS original 3/10 km AOD product. As a result, ground PM2.5 concentrations at higher resolution were well represented, allowing, e.g., short-term pollution events and long-term PM2.5 trend to be identified, which, in turn, indicated that concerns about air pollution in the BTH region are justified despite its decreasing trend from 2013 to 2015. Copyright © 2018 Elsevier Ltd. All rights reserved.

  8. Land use regression models to assess air pollution exposure in Mexico City using finer spatial and temporal input parameters.

    PubMed

    Son, Yeongkwon; Osornio-Vargas, Álvaro R; O'Neill, Marie S; Hystad, Perry; Texcalac-Sangrador, José L; Ohman-Strickland, Pamela; Meng, Qingyu; Schwander, Stephan

    2018-05-17

    The Mexico City Metropolitan Area (MCMA) is one of the largest and most populated urban environments in the world and experiences high air pollution levels. The objective was to develop models that estimate pollutant concentrations at fine spatiotemporal scales and provide improved air pollution exposure assessments for health studies in Mexico City. We developed finer spatiotemporal land use regression (LUR) models for PM2.5, PM10, O3, NO2, CO and SO2 using mixed effect models with the Least Absolute Shrinkage and Selection Operator (LASSO). Hourly traffic density was included as a temporal variable besides meteorological and holiday variables. Models of hourly, daily, monthly, 6-monthly and annual averages were developed and evaluated using traditional and novel indices. The developed spatiotemporal LUR models yielded predicted concentrations with good spatial and temporal agreements with measured pollutant levels except for the hourly PM2.5, PM10 and SO2. Most of the LUR models met performance goals based on the standardized indices. LUR models with temporal scales greater than one hour were successfully developed using mixed effect models with LASSO and showed superior model performance compared to earlier LUR models, especially for time scales of a day or longer. The newly developed LUR models will be further refined with ongoing Mexico City air pollution sampling campaigns to improve personal exposure assessments. Copyright © 2018. Published by Elsevier B.V.

  9. Blending Multiple Nitrogen Dioxide Data Sources for Neighborhood Estimates of Long-Term Exposure for Health Research.

    PubMed

    Hanigan, Ivan C; Williamson, Grant J; Knibbs, Luke D; Horsley, Joshua; Rolfe, Margaret I; Cope, Martin; Barnett, Adrian G; Cowie, Christine T; Heyworth, Jane S; Serre, Marc L; Jalaludin, Bin; Morgan, Geoffrey G

    2017-11-07

    Exposure to traffic-related nitrogen dioxide (NO2) air pollution is associated with adverse health outcomes. Average pollutant concentrations for fixed monitoring sites are often used to estimate exposures for health studies; however, these can be imprecise due to difficulty and cost of spatial modeling at the resolution of neighborhoods (e.g., a scale of tens of meters) rather than at a coarse scale (around several kilometers). The objective of this study was to derive improved estimates of neighborhood NO2 concentrations by blending measurements with modeled predictions in Sydney, Australia (a low pollution environment). We implemented the Bayesian maximum entropy approach to blend data with uncertainty defined using informative priors. We compiled NO2 data from fixed-site monitors, chemical transport models, and satellite-based land use regression models to estimate neighborhood annual average NO2. The spatial model produced a posterior probability density function of estimated annual average concentrations that spanned an order of magnitude from 3 to 35 ppb. Validation using independent data showed improvement, with root mean squared error improvement of 6% compared with the land use regression model and 16% over the chemical transport model. These estimates will be used in studies of health effects and should minimize misclassification bias.

  10. Depth and Medium-Scale Spatial Processes Influence Fish Assemblage Structure of Unconsolidated Habitats in a Subtropical Marine Park

    PubMed Central

    Schultz, Arthur L.; Malcolm, Hamish A.; Bucher, Daniel J.; Linklater, Michelle; Smith, Stephen D. A.

    2014-01-01

    Where biological datasets are spatially limited, abiotic surrogates have been advocated to inform objective planning for Marine Protected Areas. However, this approach assumes close correlation between abiotic and biotic patterns. The Solitary Islands Marine Park, northern NSW, Australia, currently uses a habitat classification system (HCS) to assist with planning, but this is based only on data for reefs. We used Baited Remote Underwater Videos (BRUVs) to survey fish assemblages of unconsolidated substrata at different depths, distances from shore, and across an along-shore spatial scale of tens of km (2 transects) to examine how well the HCS works for this dominant habitat. We used multivariate regression modelling to examine the importance of these, and other environmental factors (backscatter intensity, fine-scale bathymetric variation and rugosity), in structuring fish assemblages. There were significant differences in fish assemblages across depths, distance from shore, and over the medium spatial scale of the study: together, these factors generated the optimum model in multivariate regression. However, marginal tests suggested that backscatter intensity, which itself is a surrogate for sediment type and hardness, might also influence fish assemblages and needs further investigation. Species richness was significantly different across all factors; however, total MaxN only differed significantly between locations. This study demonstrates that the pre-existing abiotic HCS only partially represents the range of fish assemblages of unconsolidated habitats in the region. PMID:24824998

  11. Effects of environmental amenities and locational disamenities on home values in the Santa Cruz watershed: a hedonic analysis using census data

    USGS Publications Warehouse

    Arora, Gaurav; Frisvold, George; Norman, Laura

    2014-01-01

    For this study, we used the hedonic pricing method to measure the effects of natural amenities on home prices in the U.S. side of the Santa Cruz Watershed. We employed multivariate spatial regression techniques to estimate how different factors affect median home values in 613 census block groups of the 2000 Census, accounting for spatial autocorrelation, spatial lags, and/or spatial heterogeneity in the data. Diagnostic tests suggest that failure to account for the hedonic model can be classified as (1) physical features of the housing stock, (2) neighborhood characteristics, and (3) environmental attributes. Census data was combined with GIS data for vegetation and land cover, land administration, measures of species richness and open space, and proximity to amenities and disamenities. Census block groups close to the US-Mexico border of airports/air bases were negative. Results suggest that policies to maintain biodiversity and open space provide economic benefits to homeowners, reflected in higher home values. Future research will quantify the marginal effects of regression explanatory variables on home values to assess their economic and policy significance. These marginal effects will be used as input indicators to discern potential economic impacts of various scenarios in the Santa Cruz Watershed Ecosystem Portfolio Model (SCWEPM). Future research will also expand this effort into the Mexican portion of the watershed.

  12. Global patterns of current and future road infrastructure

    NASA Astrophysics Data System (ADS)

    Meijer, Johan R.; Huijbregts, Mark A. J.; Schotten, Kees C. G. J.; Schipper, Aafke M.

    2018-06-01

    Georeferenced information on road infrastructure is essential for spatial planning, socio-economic assessments and environmental impact analyses. Yet current global road maps are typically outdated or characterized by spatial bias in coverage. In the Global Roads Inventory Project we gathered, harmonized and integrated nearly 60 geospatial datasets on road infrastructure into a global roads dataset. The resulting dataset covers 222 countries and includes over 21 million km of roads, which is two to three times the total length in the currently best available country-based global roads datasets. We then related total road length per country to country area, population density, GDP and OECD membership, resulting in a regression model with adjusted R2 of 0.90, and found that the highest road densities are associated with densely populated and wealthier countries. Applying our regression model to future population densities and GDP estimates from the Shared Socioeconomic Pathway (SSP) scenarios, we obtained a tentative estimate of 3.0–4.7 million km additional road length for the year 2050. Large increases in road length were projected for developing nations in some of the world’s last remaining wilderness areas, such as the Amazon, the Congo basin and New Guinea. This highlights the need for accurate spatial road datasets to underpin strategic spatial planning in order to reduce the impacts of roads in remaining pristine ecosystems.

  13. Nitrogen dioxide concentrations in neighborhoods adjacent to a commercial airport: a land use regression modeling study

    PubMed Central

    2010-01-01

    Background There is growing concern in communities surrounding airports regarding the contribution of various emission sources (such as aircraft and ground support equipment) to nearby ambient concentrations. We used extensive monitoring of nitrogen dioxide (NO2) in neighborhoods surrounding T.F. Green Airport in Warwick, RI, and land-use regression (LUR) modeling techniques to determine the impact of proximity to the airport and local traffic on these concentrations. Methods Palmes diffusion tube samplers were deployed along the airport's fence line and within surrounding neighborhoods for one to two weeks. In total, 644 measurements were collected over three sampling campaigns (October 2007, March 2008 and June 2008) and each sampling location was geocoded. GIS-based variables were created as proxies for local traffic and airport activity. A forward stepwise regression methodology was employed to create general linear models (GLMs) of NO2 variability near the airport. The effect of local meteorology on associations with GIS-based variables was also explored. Results Higher concentrations of NO2 were seen near the airport terminal, entrance roads to the terminal, and near major roads, with qualitatively consistent spatial patterns between seasons. In our final multivariate model (R2 = 0.32), the local influences of highways and arterial/collector roads were statistically significant, as were local traffic density and distance to the airport terminal (all p < 0.001). Local meteorology did not significantly affect associations with principal GIS variables, and the regression model structure was robust to various model-building approaches. Conclusion Our study has shown that there are clear local variations in NO2 in the neighborhoods that surround an urban airport, which are spatially consistent across seasons. 
LUR modeling demonstrated a strong influence of local traffic, except the smallest roads that predominate in residential areas, as well as proximity to the airport terminal. PMID:21083910

  14. Nitrogen dioxide concentrations in neighborhoods adjacent to a commercial airport: a land use regression modeling study.

    PubMed

    Adamkiewicz, Gary; Hsu, Hsiao-Hsien; Vallarino, Jose; Melly, Steven J; Spengler, John D; Levy, Jonathan I

    2010-11-17

    There is growing concern in communities surrounding airports regarding the contribution of various emission sources (such as aircraft and ground support equipment) to nearby ambient concentrations. We used extensive monitoring of nitrogen dioxide (NO2) in neighborhoods surrounding T.F. Green Airport in Warwick, RI, and land-use regression (LUR) modeling techniques to determine the impact of proximity to the airport and local traffic on these concentrations. Palmes diffusion tube samplers were deployed along the airport's fence line and within surrounding neighborhoods for one to two weeks. In total, 644 measurements were collected over three sampling campaigns (October 2007, March 2008 and June 2008) and each sampling location was geocoded. GIS-based variables were created as proxies for local traffic and airport activity. A forward stepwise regression methodology was employed to create general linear models (GLMs) of NO2 variability near the airport. The effect of local meteorology on associations with GIS-based variables was also explored. Higher concentrations of NO2 were seen near the airport terminal, entrance roads to the terminal, and near major roads, with qualitatively consistent spatial patterns between seasons. In our final multivariate model (R2 = 0.32), the local influences of highways and arterial/collector roads were statistically significant, as were local traffic density and distance to the airport terminal (all p < 0.001). Local meteorology did not significantly affect associations with principal GIS variables, and the regression model structure was robust to various model-building approaches. Our study has shown that there are clear local variations in NO2 in the neighborhoods that surround an urban airport, which are spatially consistent across seasons. LUR modeling demonstrated a strong influence of local traffic, except the smallest roads that predominate in residential areas, as well as proximity to the airport terminal.

  15. Do hospitals respond to rivals' quality and efficiency? A spatial panel econometric analysis.

    PubMed

    Longo, Francesco; Siciliani, Luigi; Gravelle, Hugh; Santos, Rita

    2017-09-01

    We investigate whether hospitals in the English National Health Service change their quality or efficiency in response to changes in quality or efficiency of neighbouring hospitals. We first provide a theoretical model that predicts that a hospital will not respond to changes in the efficiency of its rivals but may change its quality or efficiency in response to changes in the quality of rivals, though the direction of the response is ambiguous. We use data on eight quality measures (including mortality, emergency readmissions, patient reported outcome, and patient satisfaction) and six efficiency measures (including bed occupancy, cancelled operations, and costs) for public hospitals between 2010/11 and 2013/14 to estimate both spatial cross-sectional and spatial fixed- and random-effects panel data models. We find that although quality and efficiency measures are unconditionally spatially correlated, the spatial regression models suggest that a hospital's quality or efficiency does not respond to its rivals' quality or efficiency, except for a hospital's overall mortality that is positively associated with that of its rivals. The results are robust to allowing for spatially correlated covariates and errors and to instrumenting rivals' quality and efficiency. Copyright © 2017 John Wiley & Sons, Ltd.

  16. Updated estimates of long-term average dissolved-solids loading in streams and rivers of the Upper Colorado River Basin

    USGS Publications Warehouse

    Tillman, Fred D.; Anning, David W.

    2014-01-01

    The Colorado River and its tributaries supply water to more than 35 million people in the United States and 3 million people in Mexico, irrigating over 4.5 million acres of farmland, and annually generating about 12 billion kilowatt hours of hydroelectric power. The Upper Colorado River Basin, part of the Colorado River Basin, encompasses more than 110,000 mi2 and is the source of much of the more than 9 million tons of dissolved solids that annually flow past the Hoover Dam. High dissolved-solids concentrations in the river are the cause of substantial economic damages to users, primarily in reduced agricultural crop yields and corrosion, with damages estimated to be greater than 300 million dollars annually. In 1974, the Colorado River Basin Salinity Control Act created the Colorado River Basin Salinity Control Program to investigate and implement a broad range of salinity control measures. A 2009 study by the U.S. Geological Survey, supported by the Salinity Control Program, used the Spatially Referenced Regressions on Watershed Attributes surface-water quality model to examine dissolved-solids supply and transport within the Upper Colorado River Basin. Dissolved-solids loads developed for 218 monitoring sites were used to calibrate the 2009 Upper Colorado River Basin Spatially Referenced Regressions on Watershed Attributes dissolved-solids model. This study updates and develops new dissolved-solids loading estimates for 323 Upper Colorado River Basin monitoring sites using streamflow and dissolved-solids concentration data through 2012, to support a planned Spatially Referenced Regressions on Watershed Attributes modeling effort that will investigate the contributions to dissolved-solids loads from irrigation and rangeland practices.

  17. A New Hybrid Spatio-temporal Model for Estimating Daily Multi-year PM2.5 Concentrations Across Northeastern USA Using High Resolution Aerosol Optical Depth Data

    NASA Technical Reports Server (NTRS)

    Kloog, Itai; Chudnovsky, Alexandra A.; Just, Allan C.; Nordio, Francesco; Koutrakis, Petros; Coull, Brent A.; Lyapustin, Alexei; Wang, Yujie; Schwartz, Joel

    2014-01-01

    The use of satellite-based aerosol optical depth (AOD) to estimate fine particulate matter (PM2.5) for epidemiology studies has increased substantially over the past few years. These recent studies often report moderate predictive power, which can generate downward bias in effect estimates. In addition, AOD measurements have only moderate spatial resolution, and have substantial missing data. We make use of recent advances in MODIS satellite data processing algorithms (Multi-Angle Implementation of Atmospheric Correction, MAIAC), which allow us to use 1 km (versus currently available 10 km) resolution AOD data. We developed and cross-validated models to predict daily PM2.5 at a 1 x 1 km resolution across the northeastern USA (New England, New York and New Jersey) for the years 2003-2011, allowing us to better differentiate daily and long term exposure between urban, suburban, and rural areas. Additionally, we developed an approach that allows us to generate daily high-resolution 200 m localized predictions representing deviations from the area 1 x 1 km grid predictions. We used mixed models regressing PM2.5 measurements against day-specific random intercepts, and fixed and random AOD and temperature slopes. We then use generalized additive mixed models with spatial smoothing to generate grid cell predictions when AOD was missing. Finally, to get 200 m localized predictions, we regressed the residuals from the final model for each monitor against the local spatial and temporal variables at each monitoring site. Our model performance was excellent (mean out-of-sample R2 = 0.88). The spatial and temporal components of the out-of-sample results also presented very good fits to the withheld data (R2 = 0.87, R2 = 0.87). In addition, our results revealed very little bias in the predicted concentrations (Slope of predictions versus withheld observations = 0.99).
Our daily model results show high predictive accuracy at high spatial resolutions and will be useful in reconstructing exposure histories for epidemiological studies across this region.

  18. A New Hybrid Spatio-Temporal Model For Estimating Daily Multi-Year PM2.5 Concentrations Across Northeastern USA Using High Resolution Aerosol Optical Depth Data.

    PubMed

    Kloog, Itai; Chudnovsky, Alexandra A; Just, Allan C; Nordio, Francesco; Koutrakis, Petros; Coull, Brent A; Lyapustin, Alexei; Wang, Yujie; Schwartz, Joel

    2014-10-01

    The use of satellite-based aerosol optical depth (AOD) to estimate fine particulate matter (PM2.5) for epidemiology studies has increased substantially over the past few years. These recent studies often report moderate predictive power, which can generate downward bias in effect estimates. In addition, AOD measurements have only moderate spatial resolution, and have substantial missing data. We make use of recent advances in MODIS satellite data processing algorithms (Multi-Angle Implementation of Atmospheric Correction, MAIAC), which allow us to use 1 km (versus currently available 10 km) resolution AOD data. We developed and cross validated models to predict daily PM2.5 at a 1 × 1 km resolution across the northeastern USA (New England, New York and New Jersey) for the years 2003-2011, allowing us to better differentiate daily and long term exposure between urban, suburban, and rural areas. Additionally, we developed an approach that allows us to generate daily high-resolution 200 m localized predictions representing deviations from the area 1 × 1 km grid predictions. We used mixed models regressing PM2.5 measurements against day-specific random intercepts, and fixed and random AOD and temperature slopes. We then use generalized additive mixed models with spatial smoothing to generate grid cell predictions when AOD was missing. Finally, to get 200 m localized predictions, we regressed the residuals from the final model for each monitor against the local spatial and temporal variables at each monitoring site. Our model performance was excellent (mean out-of-sample R2 = 0.88). The spatial and temporal components of the out-of-sample results also presented very good fits to the withheld data (R2 = 0.87, R2 = 0.87). In addition, our results revealed very little bias in the predicted concentrations (Slope of predictions versus withheld observations = 0.99). 
Our daily model results show high predictive accuracy at high spatial resolutions and will be useful in reconstructing exposure histories for epidemiological studies across this region.
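
    As a rough illustration of the last stage described in this abstract (regressing monitor residuals on local variables so that a grid prediction can be localized), the following Python sketch fits a one-predictor least-squares line to residuals. The function names, the single hypothetical covariate (`local_covariate`) and the toy numbers are assumptions made for illustration; they are not the authors' model or data, and the mixed-model and additive-model stages are not reproduced.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def localized_prediction(grid_value, local_value, a, b):
    """Grid-cell prediction plus the residual expected from the local covariate."""
    return grid_value + a + b * local_value


# Toy data (hypothetical): observed PM2.5 at four monitors, the 1 km grid
# prediction for each monitor's cell, and one local covariate per monitor.
observed = [10.0, 12.5, 9.0, 14.0]
grid_pred = [10.5, 11.5, 10.0, 12.0]
local_covariate = [0.0, 2.0, -1.0, 4.0]

# Residuals are what the grid-scale prediction failed to explain.
residuals = [o - g for o, g in zip(observed, grid_pred)]
a, b = fit_line(local_covariate, residuals)
print(localized_prediction(11.0, 1.0, a, b))
```

    The design point is that the local term is additive: the grid prediction is left untouched and only the part it failed to explain is modelled against local variables.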

  19. A New Hybrid Spatio-Temporal Model For Estimating Daily Multi-Year PM2.5 Concentrations Across Northeastern USA Using High Resolution Aerosol Optical Depth Data

    PubMed Central

    Kloog, Itai; Chudnovsky, Alexandra A.; Just, Allan C.; Nordio, Francesco; Koutrakis, Petros; Coull, Brent A.; Lyapustin, Alexei; Wang, Yujie; Schwartz, Joel

    2017-01-01

    Background. The use of satellite-based aerosol optical depth (AOD) to estimate fine particulate matter (PM2.5) for epidemiology studies has increased substantially over the past few years. These recent studies often report moderate predictive power, which can generate downward bias in effect estimates. In addition, AOD measurements have only moderate spatial resolution, and have substantial missing data. Methods. We make use of recent advances in MODIS satellite data processing algorithms (Multi-Angle Implementation of Atmospheric Correction, MAIAC), which allow us to use 1 km (versus currently available 10 km) resolution AOD data. We developed and cross validated models to predict daily PM2.5 at a 1 × 1 km resolution across the northeastern USA (New England, New York and New Jersey) for the years 2003–2011, allowing us to better differentiate daily and long term exposure between urban, suburban, and rural areas. Additionally, we developed an approach that allows us to generate daily high-resolution 200 m localized predictions representing deviations from the area 1 × 1 km grid predictions. We used mixed models regressing PM2.5 measurements against day-specific random intercepts, and fixed and random AOD and temperature slopes. We then use generalized additive mixed models with spatial smoothing to generate grid cell predictions when AOD was missing. Finally, to get 200 m localized predictions, we regressed the residuals from the final model for each monitor against the local spatial and temporal variables at each monitoring site. Results. Our model performance was excellent (mean out-of-sample R2 = 0.88). The spatial and temporal components of the out-of-sample results also presented very good fits to the withheld data (R2 = 0.87, R2 = 0.87). In addition, our results revealed very little bias in the predicted concentrations (Slope of predictions versus withheld observations = 0.99). 
Conclusion. Our daily model results show high predictive accuracy at high spatial resolutions and will be useful in reconstructing exposure histories for epidemiological studies across this region. PMID:28966552

  20. Multi-decadal trend and space-time variability of sea level over the Indian Ocean since the 1950s: impact of decadal climate modes

    NASA Astrophysics Data System (ADS)

    Han, W.; Stammer, D.; Meehl, G. A.; Hu, A.; Sienz, F.

    2016-12-01

    Sea level varies on decadal and multi-decadal timescales over the Indian Ocean. The variations are not spatially uniform, and can deviate considerably from the global mean sea level rise (SLR) due to various geophysical processes. One of these processes is the change of ocean circulation, which can be partly attributed to natural internal modes of climate variability. Over the Indian Ocean, the most influential climate modes on decadal and multi-decadal timescales are the Interdecadal Pacific Oscillation (IPO) and decadal variability of the Indian Ocean dipole (IOD). Here, we first analyze observational datasets to investigate the impacts of IPO and IOD on spatial patterns of decadal and interdecadal (hereafter decadal) sea level variability and multi-decadal trend over the Indian Ocean since the 1950s, using a new statistical approach, the Bayesian Dynamical Linear regression Model (DLM). The Bayesian DLM overcomes the limitation of "time-constant (static)" regression coefficients in the conventional multiple linear regression model by allowing the coefficients to vary with time, thereby measuring the "time-evolving (dynamical)" relationship between climate modes and sea level. For the multi-decadal sea level trend since the 1950s, our results show that climate modes and non-climate modes (the part that cannot be explained by climate modes) have comparable contributions in magnitude but with different spatial patterns, with each dominating different regions of the Indian Ocean. For decadal variability, climate modes are the major contributors to sea level variations over most regions of the tropical Indian Ocean. The relative importance of IPO and decadal variability of IOD, however, varies spatially. For example, while IOD decadal variability dominates IPO in the eastern equatorial basin (85E-100E, 5S-5N), IPO dominates IOD in causing sea level variations in the tropical southwest Indian Ocean (45E-65E, 12S-2S). 
To help decipher the possible contribution of external forcing to the multi-decadal sea level trend and decadal variability, we also analyze the model outputs from NCAR's Community Earth System Model (CESM) Large Ensemble Experiments, and compare the results with our observational analyses.
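
    The contrast between static and time-varying regression coefficients drawn in this abstract can be sketched with a scalar Kalman filter for a dynamic linear model, y[t] = x[t] * beta[t] + noise, in which beta[t] follows a random walk. This is a minimal illustration only: the function name, the variance settings and the toy series are assumptions, and the study's full Bayesian DLM (several climate-mode predictors, gridded sea level, posterior inference) is not reproduced here.

```python
def dlm_filter(y, x, obs_var=1.0, state_var=0.1, beta0=0.0, p0=10.0):
    """Filtered estimates of a random-walk regression coefficient.

    Model: y[t] = x[t] * beta[t] + e[t],  beta[t] = beta[t-1] + w[t],
    with Var(e) = obs_var and Var(w) = state_var.
    """
    beta, p = beta0, p0
    estimates = []
    for yt, xt in zip(y, x):
        p = p + state_var                # predict: the coefficient may drift
        innovation = yt - xt * beta      # one-step-ahead forecast error
        s = xt * xt * p + obs_var        # variance of that forecast error
        gain = p * xt / s                # Kalman gain
        beta = beta + gain * innovation  # update the coefficient estimate
        p = (1.0 - gain * xt) * p        # update its variance
        estimates.append(beta)
    return estimates


# Toy series (hypothetical): the true coefficient switches from 1 to 3.
x = [1.0] * 40
y = [1.0] * 20 + [3.0] * 20
betas = dlm_filter(y, x)
print(round(betas[19], 2), round(betas[-1], 2))
```

    A static regression fitted to the same toy series would return a single coefficient between 1 and 3, whereas the filtered estimate tracks the switch; `state_var` controls how quickly it is allowed to adapt.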

  1. Regression tree modeling of forest NPP using site conditions and climate variables across eastern USA

    NASA Astrophysics Data System (ADS)

    Kwon, Y.

    2013-12-01

    As evidence of global warming continues to accumulate, being able to predict forest response to climate change, such as the expected rise in temperature and precipitation, will be vital for maintaining the sustainability and productivity of forests. Mapping forest species redistribution under climate change scenarios has been successful; however, most species redistribution maps lack the mechanistic understanding needed to explain why trees grow under the novel conditions of a changing climate. A distributional map is only capable of predicting under the equilibrium assumption that the communities would exist following a prolonged period under the new climate. In this context, forest NPP, as a surrogate for growth rate (the most important facet determining stand dynamics), can lead to valid predictions of the transition stage to a new vegetation-climate equilibrium, as it represents changes in forest structure reflecting site conditions and climate factors. The objective of this study is to develop a forest growth map using regression tree analysis by extracting large-scale non-linear structures from both the field-based FIA and remotely sensed MODIS data sets. The major issue addressed in this approach is the non-linear spatial patterns of forest attributes. Forest inventory data showed complex spatial patterns that reflect environmental states and processes originating at different spatial scales. At broad scales, non-linear spatial trends in forest attributes and the mixture of continuous and discrete types of environmental variables make traditional statistical (multivariate regression) and geostatistical (kriging) models inefficient. This calls into question some traditional underlying assumptions about spatial trends that are uncritically accepted in forest data. To resolve the controversy surrounding the suitability of forest data, regression tree analyses are performed using the software See5 and Cubist. 
Four publicly available data sets were obtained. First, the field-based Forest Inventory and Analysis (USDA Forest Service) data set for the 31 easternmost states of the United States. Second, 8-day composites of MODIS Land Cover, FPAR, LAI and GPP/NPP data were obtained from Jan 2001 to Dec 2004 (182 composites in total), and each product was filtered by pixel-level quality assurance data to select the best-quality pixels. Third, 30-year averaged climate data were collected from the National Oceanic and Atmospheric Administration (NOAA), and five climatic variables were obtained: monthly temperature, precipitation, annual heating and cooling days, and annual frost-free days. Fourth, topographic data were obtained from a digital elevation model (1 km by 1 km). This research will provide a better understanding of large-scale forest responses to environmental factors that will be beneficial for the development of important forest management applications.
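
    For readers unfamiliar with regression trees, the core step is choosing a threshold on a predictor that minimizes the squared error obtained when each side of the split is predicted by its own mean. The Python sketch below shows that single split on toy numbers. It is not the See5 or Cubist algorithm named in the abstract, and the function names, variable names and data are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, total_sse) for the best single split on x."""
    pairs = sorted(zip(x, y))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold separates identical predictor values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2.0
        left = [yv for xv, yv in pairs if xv <= threshold]
        right = [yv for xv, yv in pairs if xv > threshold]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best


# Toy data (hypothetical): mean annual temperature vs. forest NPP.
temperature = [4.0, 5.0, 6.0, 12.0, 13.0, 14.0]
npp = [300.0, 320.0, 310.0, 620.0, 600.0, 610.0]
threshold, error = best_split(temperature, npp)
print(threshold, error)
```

    A full tree applies the same search recursively to each side and across all predictors, which is how it can capture the non-linear, mixed continuous and discrete structure the abstract describes.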

  2. Remote sensing estimation of the total phosphorus concentration in a large lake using band combinations and regional multivariate statistical modeling techniques.

    PubMed

    Gao, Yongnian; Gao, Junfeng; Yin, Hongbin; Liu, Chuansheng; Xia, Ting; Wang, Jing; Huang, Qi

    2015-03-15

    Remote sensing has been widely used for water quality monitoring, but most of these monitoring studies have only focused on a few water quality variables, such as chlorophyll-a, turbidity, and total suspended solids, which have typically been considered optically active variables. Remote sensing presents a challenge in estimating the phosphorus concentration in water. The total phosphorus (TP) in lakes has been estimated from remotely sensed observations, primarily using the simple individual band ratio or their natural logarithm and the statistical regression method based on the field TP data and the spectral reflectance. In this study, we investigated the possibility of establishing a spatial modeling scheme to estimate the TP concentration of a large lake from multi-spectral satellite imagery using band combinations and regional multivariate statistical modeling techniques, and we tested the applicability of the spatial modeling scheme. The results showed that HJ-1A CCD multi-spectral satellite imagery can be used to estimate the TP concentration in a lake. The correlation and regression analysis showed a highly significant positive relationship between the TP concentration and certain remotely sensed combination variables. The proposed modeling scheme had a higher accuracy for the TP concentration estimation in the large lake compared with the traditional individual band ratio method and the whole-lake scale regression-modeling scheme. The TP concentration values showed a clear spatial variability and were high in western Lake Chaohu and relatively low in eastern Lake Chaohu. The northernmost portion, the northeastern coastal zone and the southeastern portion of western Lake Chaohu had the highest TP concentrations, and the other regions had the lowest TP concentration values, except for the coastal zone of eastern Lake Chaohu. 
These results strongly suggested that the proposed modeling scheme, i.e., the band combinations and the regional multivariate statistical modeling techniques, demonstrated advantages for estimating the TP concentration in a large lake and had a strong potential for universal application for the TP concentration estimation in large lake waters worldwide. Copyright © 2014 Elsevier Ltd. All rights reserved.

  3. Spatial and Temporal Changes of Aerosol Optical Depth and its Driving Factors Based on Modis in Jiangsu Province

    NASA Astrophysics Data System (ADS)

    Jiang, C.; Xu, Q.; Gu, Y. K.; Qian, X. Y.; He, J. N.

    2018-04-01

    Aerosol Optical Depth (AOD) is of great value for studying air mass and its changes. In this paper, we studied the spatial-temporal changes of AOD and its driving factors based on spatial autocorrelation model, gravity model and multiple regression analysis in Jiangsu Province from 2007 to 2016. The results showed that in terms of spatial distribution, the southern AOD value is higher, and the high-value aggregation areas are significant, while the northern AOD value is lower, but the low-value aggregation areas constantly change. The AOD gravity centers showed a clear point-like aggregation. In terms of temporal changes, the overall AOD in Jiangsu Province increased year by year in fluctuation. In terms of driving factors, the total amount of vehicles, precipitation and temperature are important factors for the growth of AOD.

  4. An exploration of spatial patterns of seasonal diarrhoeal morbidity in Thailand.

    PubMed

    McCormick, B J J; Alonso, W J; Miller, M A

    2012-07-01

    Studies of temporal and spatial patterns of diarrhoeal disease can suggest putative aetiological agents and environmental or socioeconomic drivers. Here, the seasonal patterns of monthly acute diarrhoeal morbidity in Thailand, where diarrhoeal morbidity is increasing, are explored. Climatic data (2003-2006) and Thai Ministry of Health annual reports (2003-2009) were used to construct a spatially weighted panel regression model. Seasonal patterns of diarrhoeal disease were generally bimodal with aetiological agents peaking at different times of the year. There is a strong association between daily mean temperature and precipitation and the incidence of hospitalization due to acute diarrhoea in Thailand leading to a distinct spatial pattern in the seasonal pattern of diarrhoea. Model performance varied across the country in relation to per capita GDP and population density. While climatic factors are likely to drive the general pattern of diarrhoeal disease in Thailand, the seasonality of diarrhoeal disease is dampened in affluent urban populations.

  5. Development of a microscale land use regression model for predicting NO2 concentrations at a heavy trafficked suburban area in Auckland, NZ.

    PubMed

    Weissert, L F; Salmond, J A; Miskell, G; Alavi-Shoshtari, M; Williams, D E

    2018-04-01

    Land use regression (LUR) analysis has become a key method to explain air pollutant concentrations at unmeasured sites at city or country scales, but little is known about the applicability of LUR at microscales. We present a microscale LUR model developed for a heavy trafficked section of road in Auckland, New Zealand. We also test the within-city transferability of LUR models developed at different spatial scales (local scale and city scale). Nitrogen dioxide (NO2) was measured during summer at 40 sites and a LUR model was developed based on standard criteria. The results showed that LUR models are able to capture the microscale variability with the model explaining 66% of the variability in NO2 concentrations. Predictor variables identified at this scale were street width, distance to major road, presence of awnings and number of bus stops, with the latter three also being important determinants at the local scale. This highlights the importance of street and building configurations for individual exposure at the street level. However, within-city transferability was limited with the number of bus stops being the only significant predictor variable at all spatial scales and locations tested, indicating the strong influence of diesel emissions related to bus traffic. These findings show that air quality monitoring is necessary at a high spatial density within cities in capturing small-scale variability in NO2 concentrations at the street level and assessing individual exposure to traffic related air pollutants. Copyright © 2017. Published by Elsevier B.V.

  6. Modelling the Relationship Between Land Surface Temperature and Landscape Patterns of Land Use Land Cover Classification Using Multi Linear Regression Models

    NASA Astrophysics Data System (ADS)

    Bernales, A. M.; Antolihao, J. A.; Samonte, C.; Campomanes, F.; Rojas, R. J.; dela Serna, A. M.; Silapan, J.

    2016-06-01

    The threat of the ailments related to urbanization like heat stress is very prevalent. There are a lot of things that can be done to lessen the effect of urbanization to the surface temperature of the area like using green roofs or planting trees in the area. So land use really matters in both increasing and decreasing surface temperature. It is known that there is a relationship between land use land cover (LULC) and land surface temperature (LST). Quantifying this relationship in terms of a mathematical model is very important so as to provide a way to predict LST based on the LULC alone. This study aims to examine the relationship between LST and LULC as well as to create a model that can predict LST using class-level spatial metrics from LULC. LST was derived from a Landsat 8 image and LULC classification was derived from LiDAR and Orthophoto datasets. Class-level spatial metrics were created in FRAGSTATS with the LULC and LST as inputs and these metrics were analysed using a statistical framework. Multi linear regression was done to create models that would predict LST for each class and it was found that the spatial metric "Effective mesh size" was a top predictor for LST in 6 out of 7 classes. The model created can still be refined by adding a temporal aspect by analysing the LST of another farming period (for rural areas) and looking for common predictors between LSTs of these two different farming periods.

  7. Predicting nitrogen loading with land-cover composition: how can watershed size affect model performance?

    PubMed

    Zhang, Tao; Yang, Xiaojun

    2013-01-01

    Watershed-wide land-cover proportions can be used to predict the in-stream non-point source pollutant loadings through regression modeling. However, the model performance can vary greatly across different study sites and among various watersheds. Existing literature has shown that this type of regression modeling tends to perform better for large watersheds than for small ones, and that such a performance variation has been largely linked with different interwatershed landscape heterogeneity levels. The purpose of this study is to further examine the previously mentioned empirical observation based on a set of watersheds in the northern part of Georgia (USA) to explore the underlying causes of the variation in model performance. Through the combined use of the neutral landscape modeling approach and a spatially explicit nutrient loading model, we tested whether the regression model performance variation over the watershed groups ranging in size is due to the different watershed landscape heterogeneity levels. We adopted three neutral landscape modeling criteria that were tied with different similarity levels in watershed landscape properties and used the nutrient loading model to estimate the nitrogen loads for these neutral watersheds. Then we compared the regression model performance for the real and neutral landscape scenarios, respectively. We found that watershed size can affect the regression model performance both directly and indirectly. Along with the indirect effect through interwatershed heterogeneity, watershed size can directly affect the model performance over the watersheds varying in size. We also found that the regression model performance can be more significantly affected by other physiographic properties shaping nitrogen delivery effectiveness than the watershed land-cover heterogeneity. 
This study contrasts with many existing studies because it goes beyond hypothesis formulation based on empirical observations and into hypothesis testing to explore the fundamental mechanism.

  8. Non-Parametric Blur Map Regression for Depth of Field Extension.

    PubMed

    D'Andres, Laurent; Salvador, Jordi; Kochale, Axel; Susstrunk, Sabine

    2016-04-01

    Real camera systems have a limited depth of field (DOF) which may cause an image to be degraded due to visible misfocus or too shallow DOF. In this paper, we present a blind deblurring pipeline able to restore such images by slightly extending their DOF and recovering sharpness in regions slightly out of focus. To address this severely ill-posed problem, our algorithm relies first on the estimation of the spatially varying defocus blur. Drawing on local frequency image features, a machine learning approach based on the recently introduced regression tree fields is used to train a model able to regress a coherent defocus blur map of the image, labeling each pixel by the scale of a defocus point spread function. A non-blind spatially varying deblurring algorithm is then used to properly extend the DOF of the image. The good performance of our algorithm is assessed both quantitatively, using realistic ground truth data obtained with a novel approach based on a plenoptic camera, and qualitatively with real images.

  9. Environmental Characteristics Associated With Pedestrian–Motor Vehicle Collisions in Denver, Colorado

    PubMed Central

    Sebert Kuhlmann, Anne K.; Thomas, Deborah; Sain, Stephan R.

    2009-01-01

    Objectives. We examined patterns of pedestrian–motor vehicle collisions and associated environmental characteristics in Denver, Colorado. Methods. We integrated publicly available data on motor vehicle collisions, liquor licenses, land use, and sociodemographic characteristics to analyze spatial patterns and other characteristics of collisions involving pedestrians. We developed both linear and spatially weighted regression models of these collisions. Results. Spatial analysis revealed global clustering of pedestrian–motor vehicle collisions with concentrations in downtown, in a contiguous neighborhood, and along major arterial streets. Walking to work, population density, and liquor license outlet density all contributed significantly to both linear and spatial models of collisions involving pedestrians and were each significantly associated with these collisions. Conclusions. These models, constructed with data from Denver, identified conditions that likely contribute to patterns of pedestrian–motor vehicle collisions. Should these models be verified elsewhere, they will have implications for future research directions, public policy to enhance pedestrian safety, and public health programs aimed at decreasing unintentional injury from pedestrian–motor vehicle collisions and promoting walking as a routine physical activity. PMID:19608966

  10. A comparative analysis of two highly spatially resolved European atmospheric emission inventories

    NASA Astrophysics Data System (ADS)

    Ferreira, J.; Guevara, M.; Baldasano, J. M.; Tchepel, O.; Schaap, M.; Miranda, A. I.; Borrego, C.

    2013-08-01

    A reliable emissions inventory is highly important for air quality modelling applications, especially at regional or local scales, which require high resolutions. Consequently, higher resolution emission inventories have been developed that are suitable for regional air quality modelling. This research performs an inter-comparative analysis of different spatial disaggregation methodologies of atmospheric emission inventories. This study is based on two different European emission inventories with different spatial resolutions: 1) the EMEP (European Monitoring and Evaluation Programme) inventory and 2) an emission inventory developed by the TNO (Netherlands Organisation for Applied Scientific Research). These two emission inventories were converted into three distinct gridded emission datasets as follows: (i) the EMEP emission inventory was disaggregated by area (EMEParea) and (ii) following a more complex methodology (HERMES-DIS - High-Elective Resolution Modelling Emissions System - DISaggregation module) to understand and evaluate the influence of different disaggregation methods; and (iii) the TNO gridded emissions, which are based on different emission data sources and different disaggregation methods. A predefined common grid with a spatial resolution of 12 × 12 km2 was used to compare the three datasets spatially. The inter-comparative analysis was performed by source sector (SNAP - Selected Nomenclature for Air Pollution) with emission totals for selected pollutants. It included the computation of difference maps (to focus on the spatial variability of emission differences) and a linear regression analysis to calculate the coefficients of determination and to quantitatively measure differences. From the spatial analysis, greater differences were found for residential/commercial combustion (SNAP02), solvent use (SNAP06) and road transport (SNAP07). 
These findings were related to the different spatial disaggregation that was conducted by the TNO and HERMES-DIS for the first two sectors and to the distinct data sources that were used by the TNO and HERMES-DIS for road transport. Regarding the regression analysis, the greatest correlation occurred between the EMEParea and HERMES-DIS because the latter is derived from the first, which does not occur for the TNO emissions. The greatest correlations were encountered for agriculture NH3 emissions, due to the common use of the CORINE Land Cover database for disaggregation. The point source emissions (energy industries, industrial processes, industrial combustion and extraction/distribution of fossil fuels) resulted in the lowest coefficients of determination. The spatial variability of SOx differed among the emissions that were obtained from the different disaggregation methods. In conclusion, HERMES-DIS and TNO are two distinct emission inventories, both very well discretized and detailed, suitable for air quality modelling. However, the different databases and distinct disaggregation methodologies that were used certainly result in different spatial emission patterns. This fact should be considered when applying regional atmospheric chemical transport models. Future work will focus on the evaluation of air quality models performance and sensitivity to these spatial discrepancies in emission inventories. Air quality modelling will benefit from the availability of appropriate resolution, consistent and reliable emission inventories.
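
    The regression comparison described in this abstract amounts to regressing one gridded dataset on another, cell by cell on the common grid, and reporting the coefficient of determination. A minimal Python sketch follows; the function name and the toy cell values are assumptions for illustration, not data from either inventory.

```python
def r_squared(a, b):
    """Coefficient of determination of the least-squares line b ~ a.

    For simple linear regression this equals the squared Pearson correlation.
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    saa = sum((v - mean_a) ** 2 for v in a)
    sbb = sum((v - mean_b) ** 2 for v in b)
    sab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    return (sab * sab) / (saa * sbb)


# Toy emissions (hypothetical) for the same four cells of a common grid.
inventory_1 = [1.0, 2.0, 3.0, 4.0]
inventory_2 = [2.1, 3.9, 6.2, 7.8]
print(r_squared(inventory_1, inventory_2))
```

    A high value only says that the two grids vary together across cells; it does not show that their emission totals agree, which is why the study also reports difference maps.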

  11. A Comparative Assessment of the Influences of Human Impacts on Soil Cd Concentrations Based on Stepwise Linear Regression, Classification and Regression Tree, and Random Forest Models

    PubMed Central

    Qiu, Lefeng; Wang, Kai; Long, Wenli; Wang, Ke; Hu, Wei; Amable, Gabriel S.

    2016-01-01

    Soil cadmium (Cd) contamination has attracted a great deal of attention because of its detrimental effects on animals and humans. This study aimed to develop and compare the performances of stepwise linear regression (SLR), classification and regression tree (CART) and random forest (RF) models in the prediction and mapping of the spatial distribution of soil Cd and to identify likely sources of Cd accumulation in Fuyang County, eastern China. Soil Cd data from 276 topsoil (0–20 cm) samples were collected and randomly divided into calibration (222 samples) and validation datasets (54 samples). Auxiliary data, including detailed land use information, soil organic matter, soil pH, and topographic data, were incorporated into the models to simulate the soil Cd concentrations and further identify the main factors influencing soil Cd variation. The predictive models for soil Cd concentration exhibited acceptable overall accuracies (72.22% for SLR, 70.37% for CART, and 75.93% for RF). The SLR model exhibited the largest predicted deviation, with a mean error (ME) of 0.074 mg/kg, a mean absolute error (MAE) of 0.160 mg/kg, and a root mean squared error (RMSE) of 0.274 mg/kg, and the RF model produced the results closest to the observed values, with an ME of 0.002 mg/kg, an MAE of 0.132 mg/kg, and an RMSE of 0.198 mg/kg. The RF model also exhibited the greatest R2 value (0.772). The CART model predictions closely followed, with ME, MAE, RMSE, and R2 values of 0.013 mg/kg, 0.154 mg/kg, 0.230 mg/kg and 0.644, respectively. The three prediction maps generally exhibited similar and realistic spatial patterns of soil Cd contamination. The heavily Cd-affected areas were primarily located in the alluvial valley plain of the Fuchun River and its tributaries because of the dramatic industrialization and urbanization processes that have occurred there. The most important variable for explaining high levels of soil Cd accumulation was the presence of metal smelting industries. 
The good performance of the RF model was attributable to its ability to handle the non-linear and hierarchical relationships between soil Cd and environmental variables. These results confirm that the RF approach is promising for the prediction and spatial distribution mapping of soil Cd at the regional scale. PMID:26964095
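
    The three error statistics reported in this abstract are straightforward to compute from paired predictions and observations. A minimal Python sketch, taking the error as prediction minus observation (a sign convention assumed here, since the abstract does not state it) and using hypothetical toy values rather than the study's data:

```python
import math


def error_metrics(predicted, observed):
    """Return (ME, MAE, RMSE) with error defined as predicted - observed."""
    errors = [p - o for p, o in zip(predicted, observed)]
    n = len(errors)
    me = sum(errors) / n                              # signed bias
    mae = sum(abs(e) for e in errors) / n             # typical error size
    rmse = math.sqrt(sum(e * e for e in errors) / n)  # penalizes large errors
    return me, mae, rmse


# Toy soil Cd values in mg/kg (hypothetical, not from the study).
predicted = [0.25, 0.75, 0.5, 1.0]
observed = [0.5, 0.5, 0.5, 0.5]
me, mae, rmse = error_metrics(predicted, observed)
print(me, mae, rmse)
```

    ME can be near zero while MAE and RMSE are not, because positive and negative errors cancel in the mean; this is why all three are usually reported together.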

  12. A Comparative Assessment of the Influences of Human Impacts on Soil Cd Concentrations Based on Stepwise Linear Regression, Classification and Regression Tree, and Random Forest Models.

    PubMed

    Qiu, Lefeng; Wang, Kai; Long, Wenli; Wang, Ke; Hu, Wei; Amable, Gabriel S

    2016-01-01

    Soil cadmium (Cd) contamination has attracted a great deal of attention because of its detrimental effects on animals and humans. This study aimed to develop and compare the performances of stepwise linear regression (SLR), classification and regression tree (CART) and random forest (RF) models in the prediction and mapping of the spatial distribution of soil Cd and to identify likely sources of Cd accumulation in Fuyang County, eastern China. Soil Cd data from 276 topsoil (0-20 cm) samples were collected and randomly divided into calibration (222 samples) and validation datasets (54 samples). Auxiliary data, including detailed land use information, soil organic matter, soil pH, and topographic data, were incorporated into the models to simulate the soil Cd concentrations and further identify the main factors influencing soil Cd variation. The predictive models for soil Cd concentration exhibited acceptable overall accuracies (72.22% for SLR, 70.37% for CART, and 75.93% for RF). The SLR model exhibited the largest predicted deviation, with a mean error (ME) of 0.074 mg/kg, a mean absolute error (MAE) of 0.160 mg/kg, and a root mean squared error (RMSE) of 0.274 mg/kg, and the RF model produced the results closest to the observed values, with an ME of 0.002 mg/kg, an MAE of 0.132 mg/kg, and an RMSE of 0.198 mg/kg. The RF model also exhibited the greatest R2 value (0.772). The CART model predictions closely followed, with ME, MAE, RMSE, and R2 values of 0.013 mg/kg, 0.154 mg/kg, 0.230 mg/kg and 0.644, respectively. The three prediction maps generally exhibited similar and realistic spatial patterns of soil Cd contamination. The heavily Cd-affected areas were primarily located in the alluvial valley plain of the Fuchun River and its tributaries because of the dramatic industrialization and urbanization processes that have occurred there. The most important variable for explaining high levels of soil Cd accumulation was the presence of metal smelting industries. 
The good performance of the RF model was attributable to its ability to handle the non-linear and hierarchical relationships between soil Cd and environmental variables. These results confirm that the RF approach is promising for the prediction and spatial distribution mapping of soil Cd at the regional scale.
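
    The error statistics quoted above (ME, MAE, RMSE, R2) can be sketched as follows. This is a minimal illustration, not the authors' code: the sample values are hypothetical, the sign convention (predicted minus observed) is an assumption, and R2 is computed as 1 - SS_res/SS_tot because the abstract does not say which definition was used.

```python
import math


def error_metrics(observed, predicted):
    """Return ME, MAE, RMSE and R2 for paired observed/predicted values."""
    n = len(observed)
    # Assumed sign convention: residual = predicted - observed.
    residuals = [p - o for o, p in zip(observed, predicted)]
    me = sum(residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    # Assumed definition of R2 (coefficient of determination).
    r2 = 1.0 - ss_res / ss_tot
    return {"ME": me, "MAE": mae, "RMSE": rmse, "R2": r2}


# Hypothetical soil Cd values in mg/kg (not data from the study).
observed = [0.20, 0.35, 0.50, 0.80]
predicted = [0.25, 0.30, 0.55, 0.70]
metrics = error_metrics(observed, predicted)
```

    A model with an ME near zero but a large RMSE is unbiased on average yet imprecise, which is why the abstract reports the statistics together.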

  13. Structured Additive Quantile Regression for Assessing the Determinants of Childhood Anemia in Rwanda.

    PubMed

    Habyarimana, Faustin; Zewotir, Temesgen; Ramroop, Shaun

    2017-06-17

    Childhood anemia is among the most significant health problems faced by public health departments in developing countries. This study aims at assessing the determinants and possible spatial effects associated with childhood anemia in Rwanda. The 2014/2015 Rwanda Demographic and Health Survey (RDHS) data were used. The analysis was done using the structured spatial additive quantile regression model. The findings of this study revealed that the child's age; the duration of breastfeeding; the gender of the child; the nutritional status of the child (whether underweight and/or wasting); whether the child had a fever; whether the child had a cough in the two weeks prior to the survey; whether the child received vitamin A supplementation in the six weeks before the survey; the household wealth index; the literacy of the mother; the mother's anemia status; and the mother's age at the birth are all significant factors associated with childhood anemia in Rwanda. Furthermore, significant structured spatial location effects on childhood anemia were found.

  14. Smooth Scalar-on-Image Regression via Spatial Bayesian Variable Selection

    PubMed Central

    Goldsmith, Jeff; Huang, Lei; Crainiceanu, Ciprian M.

    2013-01-01

    We develop scalar-on-image regression models when images are registered multidimensional manifolds. We propose a fast and scalable Bayes inferential procedure to estimate the image coefficient. The central idea is the combination of an Ising prior distribution, which controls a latent binary indicator map, and an intrinsic Gaussian Markov random field, which controls the smoothness of the nonzero coefficients. The model is fit using a single-site Gibbs sampler, which allows fitting within minutes for hundreds of subjects with predictor images containing thousands of locations. The code is simple and is provided in less than one page in the Appendix. We apply this method to a neuroimaging study where cognitive outcomes are regressed on measures of white matter microstructure at every voxel of the corpus callosum for hundreds of subjects. PMID:24729670

  15. Detecting influential observations in nonlinear regression modeling of groundwater flow

    USGS Publications Warehouse

    Yager, Richard M.

    1998-01-01

    Nonlinear regression is used to estimate optimal parameter values in models of groundwater flow to ensure that differences between predicted and observed heads and flows do not result from nonoptimal parameter values. Parameter estimates can be affected, however, by observations that disproportionately influence the regression, such as outliers that exert undue leverage on the objective function. Certain statistics developed for linear regression can be used to detect influential observations in nonlinear regression if the models are approximately linear. This paper discusses the application of Cook's D, which measures the effect of omitting a single observation on a set of estimated parameter values, and the statistical parameter DFBETAS, which quantifies the influence of an observation on each parameter. The influence statistics were used to (1) identify the influential observations in the calibration of a three-dimensional, groundwater flow model of a fractured-rock aquifer through nonlinear regression, and (2) quantify the effect of omitting influential observations on the set of estimated parameter values. Comparison of the spatial distribution of Cook's D with plots of model sensitivity shows that influential observations correspond to areas where the model heads are most sensitive to certain parameters, and where predicted groundwater flow rates are largest. Five of the six discharge observations were identified as influential, indicating that reliable measurements of groundwater flow rates are valuable data in model calibration. DFBETAS are computed and examined for an alternative model of the aquifer system to identify a parameterization error in the model design that resulted in overestimation of the effect of anisotropy on horizontal hydraulic conductivity.
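
    For the linear case that the abstract refers to, Cook's D can be sketched from residuals and leverages. The example below is an assumption-laden illustration for a one-predictor least-squares fit with hypothetical data; it is not the groundwater model from the paper, and DFBETAS is omitted for brevity.

```python
def cooks_distance(x, y):
    """Cook's D for each observation of a simple linear regression."""
    n = len(x)
    p = 2  # number of estimated parameters: intercept and slope
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    intercept = y_mean - slope * x_mean
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual variance estimate with n - p degrees of freedom.
    s2 = sum(e * e for e in residuals) / (n - p)
    # Leverage of each observation in a one-predictor model.
    leverages = [1.0 / n + (xi - x_mean) ** 2 / sxx for xi in x]
    return [
        (e * e / (p * s2)) * (h / (1.0 - h) ** 2)
        for e, h in zip(residuals, leverages)
    ]


# Hypothetical observations; the last point lies far from the others in x.
x = [1.0, 2.0, 3.0, 4.0, 10.0]
y = [1.1, 1.9, 3.2, 3.9, 6.0]
distances = cooks_distance(x, y)
most_influential = max(range(len(x)), key=lambda i: distances[i])
```

    In this toy data set the isolated point has by far the highest leverage, so it dominates the ranking even though its residual is modest.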

  16. Mixed conditional logistic regression for habitat selection studies.

    PubMed

    Duchesne, Thierry; Fortin, Daniel; Courbin, Nicolas

    2010-05-01

    1. Resource selection functions (RSFs) are becoming a dominant tool in habitat selection studies. RSF coefficients can be estimated with unconditional (standard) and conditional logistic regressions. While the advantage of mixed-effects models is recognized for standard logistic regression, mixed conditional logistic regression remains largely overlooked in ecological studies. 2. We demonstrate the significance of mixed conditional logistic regression for habitat selection studies. First, we use spatially explicit models to illustrate how mixed-effects RSFs can be useful in the presence of inter-individual heterogeneity in selection and when the assumption of independence from irrelevant alternatives (IIA) is violated. The IIA hypothesis states that the strength of preference for habitat type A over habitat type B does not depend on the other habitat types also available. Secondly, we demonstrate the significance of mixed-effects models to evaluate habitat selection of free-ranging bison Bison bison. 3. When movement rules were homogeneous among individuals and the IIA assumption was respected, fixed-effects RSFs adequately described habitat selection by simulated animals. In situations violating the inter-individual homogeneity and IIA assumptions, however, RSFs were best estimated with mixed-effects regressions, and fixed-effects models could even provide faulty conclusions. 4. Mixed-effects models indicate that bison did not select farmlands, but exhibited strong inter-individual variations in their response to farmlands. Less than half of the bison preferred farmlands over forests. Conversely, the fixed-effect model simply suggested an overall selection for farmlands. 5. Conditional logistic regression is recognized as a powerful approach to evaluate habitat selection when resource availability changes. This regression is increasingly used in ecological studies, but almost exclusively in the context of fixed-effects models. 
Fitness maximization can imply differences in trade-offs among individuals, which can yield inter-individual differences in selection and lead to departure from IIA. These situations are best modelled with mixed-effects models. Mixed-effects conditional logistic regression should become a valuable tool for ecological research.

  17. Spatiotemporal analysis of the relationship between socioeconomic factors and stroke in the Portuguese mainland population under 65 years old.

    PubMed

    Oliveira, André; Cabral, António J R; Mendes, Jorge M; Martins, Maria R O; Cabral, Pedro

    2015-11-04

    Stroke risk has been shown to display varying patterns of geographic distribution amongst countries but also between regions of the same country. Traditionally a disease of older persons, a global 25% increase in incidence instead was noticed between 1990 and 2010 in persons aged 20-≤64 years, particularly in low- and medium-income countries. Understanding spatial disparities in the association between socioeconomic factors and stroke is critical to target public health initiatives aiming to mitigate or prevent this disease, including in younger persons. We aimed to identify socioeconomic determinants of geographic disparities of stroke risk in people <65 years old, in municipalities of mainland Portugal, and the spatiotemporal variation of the association between these determinants and stroke risk during two study periods (1992-1996 and 2002-2006). Poisson and negative binomial global regression models were used to explore determinants of disease risk. Geographically weighted regression (GWR) represents a distinctive approach, allowing estimation of local regression coefficients. Models for both study periods were identified. Significant variables included education attainment, work hours per week and unemployment. Local Poisson GWR models achieved the best fit and evidenced spatially varying regression coefficients. Spatiotemporal inequalities were observed in significant variables, with dissimilarities between men and women. This study contributes to a better understanding of the relationship between stroke and socioeconomic factors in the population <65 years of age, one age group seldom analysed separately. It can thus help to improve the targeting of public health initiatives, even more in a context of economic crisis.
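
    The idea of local regression coefficients can be sketched with a Gaussian-kernel weighted least-squares fit at a single location. This is a simplified illustration only: it uses a linear (Gaussian) response with one covariate rather than the Poisson GWR of the study, and the coordinates, values, and bandwidth are all hypothetical.

```python
import math


def gwr_local_fit(target, coords, x, y, bandwidth):
    """Weighted least-squares intercept and slope at one target location."""
    # Gaussian distance-decay weights: nearby observations count more.
    weights = [
        math.exp(-0.5 * (math.dist(target, c) / bandwidth) ** 2) for c in coords
    ]
    w_sum = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / w_sum
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / w_sum
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    sxy = sum(
        w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y)
    )
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data: the relationship is y = 1 + 2x in the west
# and y = 1 - x in the east.
coords = [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1), (11, 0)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [3.0, 5.0, 7.0, 0.0, -1.0, -2.0]
west_intercept, west_slope = gwr_local_fit((0, 0), coords, x, y, bandwidth=1.0)
east_intercept, east_slope = gwr_local_fit((10, 0), coords, x, y, bandwidth=1.0)
```

    A single global regression on these data would average the two opposing slopes, whereas the local fits recover each regional relationship.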

  18. Local regression type methods applied to the study of geophysics and high frequency financial data

    NASA Astrophysics Data System (ADS)

    Mariani, M. C.; Basu, K.

    2014-09-01

    In this work we applied locally weighted scatterplot smoothing techniques (Lowess/Loess) to geophysical and high frequency financial data. We first analyze and apply this technique to the California earthquake geological data. A spatial analysis was performed to show that the estimation of the earthquake magnitude at a fixed location is very accurate, up to a relative error of 0.01%. We also applied the same method to a high frequency data set arising in the financial sector and obtained similarly satisfactory results. The application of this approach to the two different data sets demonstrates that the overall method is accurate and efficient, and the Lowess approach is much more desirable than the Loess method. Previous works studied time series analysis; in this paper our local regression models perform a spatial analysis of the geophysical data, providing different information. For the high frequency data, our models estimate the curve of best fit where data are dependent on time.

  19. Evaluation of Land Use Regression Models for Nitrogen Dioxide and Benzene in Four US Cities

    PubMed Central

    Mukerjee, Shaibal; Smith, Luther; Neas, Lucas; Norris, Gary

    2012-01-01

    Spatial analysis studies have included the application of land use regression models (LURs) for health and air quality assessments. Recent LUR studies have collected nitrogen dioxide (NO2) and volatile organic compounds (VOCs) using passive samplers at urban air monitoring networks in El Paso and Dallas, TX, Detroit, MI, and Cleveland, OH to assess spatial variability and source influences. LURs were successfully developed to estimate pollutant concentrations throughout the study areas. Comparisons of development and predictive capabilities of LURs from these four cities are presented to address this issue of uniform application of LURs across study areas. Traffic and other urban variables were important predictors in the LURs although city-specific influences (such as border crossings) were also important. In addition, transferability of variables or LURs from one city to another may be problematic due to intercity differences and data availability or comparability. Thus, developing common predictors in future LURs may be difficult. PMID:23226985

  20. Detecting the spatial and temporal variability of chlorophyll-a concentration and total suspended solids in Apalachicola Bay, Florida using MODIS imagery

    USGS Publications Warehouse

    Wang, Hongqing; Hladik, C.M.; Huang, W.; Milla, K.; Edmiston, L.; Harwell, M.A.; Schalles, J.F.

    2010-01-01

    Apalachicola Bay, Florida, accounts for 90% of Florida's and 10% of the nation's eastern oyster (Crassostrea virginica) harvesting. Chlorophyll-a concentration and total suspended solids (TSS) are two important water quality variables, among other environmental factors such as salinity, for eastern oyster production in Apalachicola Bay. In this research, we developed regression models of the relationships between the reflectance of the Moderate-Resolution Imaging Spectroradiometer (MODIS) Terra 250 m data and the two water quality variables based on the Bay-wide field data collected during 14-17 October 2002, a relatively dry period, and 3-5 April 2006, a relatively wet period, respectively. Then we selected the best regression models (highest coefficient of determination, R2) to derive Bay-wide maps of chlorophyll-a concentration and TSS for the two periods. The MODIS-derived maps revealed large spatial and temporal variations in chlorophyll-a concentration and TSS across the entire Apalachicola Bay. © 2010 Taylor & Francis.

  1. Factor analysis and multiple regression between topography and precipitation on Jeju Island, Korea

    NASA Astrophysics Data System (ADS)

    Um, Myoung-Jin; Yun, Hyeseon; Jeong, Chang-Sam; Heo, Jun-Haeng

    2011-11-01

    In this study, new factors that influence precipitation were extracted from geographic variables using factor analysis, which allow for an accurate estimation of orographic precipitation. Correlation analysis was also used to examine the relationship between nine topographic variables from digital elevation models (DEMs) and the precipitation in Jeju Island. In addition, a spatial analysis was performed in order to verify the validity of the regression model. From the results of the correlation analysis, it was found that all of the topographic variables had a positive correlation with the precipitation. The relations between the variables also changed in accordance with a change in the precipitation duration. However, upon examining the correlation matrix, no significant relationship between the latitude and the aspect was found. According to the factor analysis, eight topographic variables (latitude being the exception) were found to have a direct influence on the precipitation. Three factors were then extracted from the eight topographic variables. By directly comparing the multiple regression model with the factors (model 1) to the multiple regression model with the topographic variables (model 3), it was found that model 1 did not violate the limits of statistical significance and multicollinearity. As such, model 1 was considered to be appropriate for estimating the precipitation when taking into account the topography. In the study of model 1, the multiple regression model using factor analysis was found to be the best method for estimating the orographic precipitation on Jeju Island.

  2. Variability in results from negative binomial models for Lyme disease measured at different spatial scales.

    PubMed

    Tran, Phoebe; Waller, Lance

    2015-01-01

    Lyme disease has been the subject of many studies due to increasing incidence rates year after year and the severe complications that can arise in later stages of the disease. Negative binomial models have been used to model Lyme disease in the past with some success. However, there has been little focus on the reliability and consistency of these models when they are used to study Lyme disease at multiple spatial scales. This study seeks to explore how sensitive/consistent negative binomial models are when they are used to study Lyme disease at different spatial scales (at the regional and sub-regional levels). The study area includes the thirteen states in the Northeastern United States with the highest Lyme disease incidence during the 2002-2006 period. Lyme disease incidence at county level for the period of 2002-2006 was linked with several previously identified key landscape and climatic variables in a negative binomial regression model for the Northeastern region and two smaller sub-regions (the New England sub-region and the Mid-Atlantic sub-region). This study found that negative binomial models, indeed, were sensitive/inconsistent when used at different spatial scales. We discuss various plausible explanations for such behavior of negative binomial models. Further investigation of the inconsistency and sensitivity of negative binomial models when used at different spatial scales is important for not only future Lyme disease studies and Lyme disease risk assessment/management but any study that requires use of this model type in a spatial context. Copyright © 2014 Elsevier Inc. All rights reserved.

  3. Comparison of five modelling techniques to predict the spatial distribution and abundance of seabirds

    USGS Publications Warehouse

    O'Connell, Allan F.; Gardner, Beth; Oppel, Steffen; Meirinho, Ana; Ramírez, Iván; Miller, Peter I.; Louzao, Maite

    2012-01-01

    Knowledge about the spatial distribution of seabirds at sea is important for conservation. During marine conservation planning, logistical constraints preclude seabird surveys covering the complete area of interest and spatial distribution of seabirds is frequently inferred from predictive statistical models. Increasingly complex models are available to relate the distribution and abundance of pelagic seabirds to environmental variables, but a comparison of their usefulness for delineating protected areas for seabirds is lacking. Here we compare the performance of five modelling techniques (generalised linear models, generalised additive models, Random Forest, boosted regression trees, and maximum entropy) to predict the distribution of Balearic Shearwaters (Puffinus mauretanicus) along the coast of the western Iberian Peninsula. We used ship transect data from 2004 to 2009 and 13 environmental variables to predict occurrence and density, and evaluated predictive performance of all models using spatially segregated test data. Predicted distribution varied among the different models, although predictive performance varied little. An ensemble prediction that combined results from all five techniques was robust and confirmed the existence of marine important bird areas for Balearic Shearwaters in Portugal and Spain. Our predictions suggested additional areas that would be of high priority for conservation and could be proposed as protected areas. Abundance data were extremely difficult to predict, and none of five modelling techniques provided a reliable prediction of spatial patterns. We advocate the use of ensemble modelling that combines the output of several methods to predict the spatial distribution of seabirds, and use these predictions to target separate surveys assessing the abundance of seabirds in areas of regular use.

  4. Prediction of spatially explicit rainfall intensity-duration thresholds for post-fire debris-flow generation in the western United States

    NASA Astrophysics Data System (ADS)

    Staley, Dennis; Negri, Jacquelyn; Kean, Jason

    2016-04-01

    Population expansion into fire-prone steeplands has resulted in an increase in post-fire debris-flow risk in the western United States. Logistic regression methods for determining debris-flow likelihood and the calculation of empirical rainfall intensity-duration thresholds for debris-flow initiation represent two common approaches for characterizing hazard and reducing risk. Logistic regression models are currently being used to rapidly assess debris-flow hazard in response to design storms of known intensities (e.g. a 10-year recurrence interval rainstorm). Empirical rainfall intensity-duration thresholds comprise a major component of the United States Geological Survey (USGS) and the National Weather Service (NWS) debris-flow early warning system at a regional scale in southern California. However, these two modeling approaches remain independent, with each approach having limitations that do not allow for synergistic local-scale (e.g. drainage-basin scale) characterization of debris-flow hazard during intense rainfall. The current logistic regression equations consider rainfall a unique independent variable, which prevents the direct calculation of the relation between rainfall intensity and debris-flow likelihood. Regional (e.g. mountain range or physiographic province scale) rainfall intensity-duration thresholds fail to provide insight into the basin-scale variability of post-fire debris-flow hazard and require an extensive database of historical debris-flow occurrence and rainfall characteristics. Here, we present a new approach that combines traditional logistic regression and intensity-duration threshold methodologies. 
This method allows for local characterization of the likelihood that a debris flow will occur at a given rainfall intensity, direct calculation of the rainfall rates that will result in a given likelihood, and calculation of spatially explicit rainfall intensity-duration thresholds for debris-flow generation in recently burned areas. Our approach synthesizes the two methods by incorporating measured rainfall intensity into each model variable (based on measures of topographic steepness, burn severity and surface properties) within the logistic regression equation. This approach provides a more realistic representation of the relation between rainfall intensity and debris-flow likelihood, as likelihood values asymptotically approach zero when rainfall intensity approaches 0 mm/h, and increase with more intense rainfall. Model performance was evaluated by comparing predictions to several existing regional thresholds. The model, based upon training data collected in southern California, USA, has proven to accurately predict rainfall intensity-duration thresholds for other areas in the western United States not included in the original training dataset. In addition, the improved logistic regression model shows promise for emergency planning purposes and real-time, site-specific early warning. With further validation, this model may permit the prediction of spatially-explicit intensity-duration thresholds for debris-flow generation in areas where empirically derived regional thresholds do not exist. This improvement would permit the expansion of the early-warning system into other regions susceptible to post-fire debris flow.
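
    One way to picture the combined approach is a logistic model in which rainfall intensity multiplies every basin variable, so the equation can be inverted for the intensity that yields a chosen likelihood. The sketch below assumes that functional form; the intercept, coefficients, and basin values are hypothetical placeholders, not the fitted values from the study.

```python
import math

# Hypothetical placeholders, not coefficients from the study.
INTERCEPT = -3.6
COEFS = (0.4, 0.7, 0.2)      # steepness, burn severity, surface properties
TERRAIN = (0.3, 0.5, 0.25)   # basin-specific values of the three variables


def debris_flow_likelihood(intensity, terrain, coefs, intercept):
    """Likelihood of a debris flow at a given rainfall intensity (mm/h)."""
    # Intensity scales each basin variable inside the linear predictor.
    linear = intercept + intensity * sum(c * t for c, t in zip(coefs, terrain))
    return 1.0 / (1.0 + math.exp(-linear))


def intensity_threshold(probability, terrain, coefs, intercept):
    """Rainfall intensity (mm/h) at which the likelihood equals probability."""
    logit = math.log(probability / (1.0 - probability))
    return (logit - intercept) / sum(c * t for c, t in zip(coefs, terrain))


threshold = intensity_threshold(0.5, TERRAIN, COEFS, INTERCEPT)
check = debris_flow_likelihood(threshold, TERRAIN, COEFS, INTERCEPT)
```

    Because the basin terms appear in the denominator of the inverted equation, steeper or more severely burned basins reach a given likelihood at a lower rainfall intensity, which is what makes the thresholds spatially explicit.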

  5. Visuo-spatial abilities are key for young children's verbal number skills.

    PubMed

    Cornu, Véronique; Schiltz, Christine; Martin, Romain; Hornung, Caroline

    2018-02-01

    Children's development of verbal number skills (i.e., counting abilities and knowledge of the number names) presents a milestone in mathematical development. Different factors such as visuo-spatial and verbal abilities have been discussed as contributing to the development of these foundational skills. To understand the cognitive nature of verbal number skills in young children, the current study assessed the relation of preschoolers' verbal and visuo-spatial abilities to their verbal number skills. In total, 141 children aged 5 or 6 years participated in the current study. Verbal number skills were regressed on vocabulary, phonological awareness and visuo-spatial abilities, and verbal and visuo-spatial working memory in a structural equation model. Only visuo-spatial abilities emerged as a significant predictor of verbal number skills in the estimated model. Our results suggest that visuo-spatial abilities contribute to a larger extent to children's verbal number skills than verbal abilities. From a theoretical point of view, these results suggest a visuo-spatial, rather than a verbal, grounding of verbal number skills. These results are potentially informative for the conception of early mathematics assessments and interventions. Copyright © 2017 Elsevier Inc. All rights reserved.

  6. Measuring the contribution of water and green space amenities to housing values: an application and comparison of spatially weighted hedonic models

    Treesearch

    Seong-Hoon Cho; J. Michael Bowker; William M. Park

    2006-01-01

    This study estimates the influence of proximity to water bodies and park amenities on residential housing values in Knox County, Tennessee, using the hedonic price approach. Values for proximity to water bodies and parks are first estimated globally with a standard ordinary least squares (OLS) model. A locally weighted regression model is then employed to investigate...

  7. Predicting the geographic distribution of a species from presence-only data subject to detection errors

    USGS Publications Warehouse

    Dorazio, Robert M.

    2012-01-01

    Several models have been developed to predict the geographic distribution of a species by combining measurements of covariates of occurrence at locations where the species is known to be present with measurements of the same covariates at other locations where species occurrence status (presence or absence) is unknown. In the absence of species detection errors, spatial point-process models and binary-regression models for case-augmented surveys provide consistent estimators of a species’ geographic distribution without prior knowledge of species prevalence. In addition, these regression models can be modified to produce estimators of species abundance that are asymptotically equivalent to those of the spatial point-process models. However, if species presence locations are subject to detection errors, neither class of models provides a consistent estimator of covariate effects unless the covariates of species abundance are distinct and independently distributed from the covariates of species detection probability. These analytical results are illustrated using simulation studies of data sets that contain a wide range of presence-only sample sizes. Analyses of presence-only data of three avian species observed in a survey of landbirds in western Montana and northern Idaho are compared with site-occupancy analyses of detections and nondetections of these species.

  8. A High-Dimensional, Multivariate Copula Approach to Modeling Multivariate Agricultural Price Relationships and Tail Dependencies

    Treesearch

    Xuan Chi; Barry Goodwin

    2012-01-01

    Spatial and temporal relationships among agricultural prices have been an important topic of applied research for many years. Such research is used to investigate the performance of markets and to examine linkages up and down the marketing chain. This research has empirically evaluated price linkages by using correlation and regression models and, later, linear and...

  9. Land use regression modeling of oxidative potential of fine particles, NO2, PM2.5 mass and association to type two diabetes mellitus

    NASA Astrophysics Data System (ADS)

    Hellack, Bryan; Sugiri, Dorothea; Schins, Roel P. F.; Schikowski, Tamara; Krämer, Ursula; Kuhlbusch, Thomas A. J.; Hoffmann, Barbara

    2017-12-01

    While land use regression models (LUR) are commonly used, e.g. for the prediction of spatially variable air pollutant mass concentrations, they are scarcely used for predicting the oxidative potential (OP), a suggested unifying predictor of health effects. Therefore a LUR model was developed to examine if long-term OP of fine particulate exposure can be reasonably predicted by LUR modeling and whether it is related to health effects in a study region comprised of urban and rural areas. Four 14-day sampling periods over 1 year at 40 sites in the western Ruhr Area and adjacent northern rural area, Germany, in 2002/2003 were conducted and annual nitrogen dioxide (NO2), fine particles (PM2.5), and OP were calculated. LUR models were developed to estimate spatially-resolved annual OP, NO2 and PM2.5 concentrations. The model performance was checked by leave-one-out cross validation (LOOCV) and Cox regression was used to analyze the association of modeled residential OP and NO2 with incident type 2 diabetes mellitus (T2DM) in 1784 elderly women during a mean follow-up of 16 years (baseline 1985-1994). The measured OP and NO2 concentrations were moderately correlated (rSpearman 0.57). The LUR models explained 62% and 92% of the OP and NO2 variance (adjusted LOOCV R2 57% and 90%). PM10 emission from combustion in a 5000 m buffer was the most important predictor for OP and NO2. Modeled pollutants were highly correlated (rSpearman 0.87). Model quality for OP was sensitive to the inclusion of a single influential measurement site. For PM2.5 mass only an insufficient model with a low explained variance of 22% (adjusted R2) was developed so no health effects analyses were conducted with estimated PM2.5. Increases in OP and NO2 were associated with an increase in risk of T2DM by a hazard ratio of 1.38 (95% CI 1.06-1.80) and 1.39 (95% CI 1.07-1.81) per interquartile range of OP and NO2, respectively.
We conclude that spatially-resolved OP can be predicted by LUR modeling, but future work is needed to investigate the possibility to increase OP model quality with refined predictors.
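
    The leave-one-out check mentioned above can be sketched as follows. This is an illustration under stated assumptions: a one-predictor least-squares model stands in for the multi-predictor LUR models, the site values are hypothetical, and the R2 is unadjusted.

```python
def fit_line(x, y):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    return y_mean - slope * x_mean, slope


def loocv_r2(x, y):
    """R2 computed from predictions made with each site held out in turn."""
    predictions = []
    for i in range(len(x)):
        # Refit the model without site i, then predict site i.
        intercept, slope = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        predictions.append(intercept + slope * x[i])
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical predictor (e.g. emissions in a buffer) and measured values.
predictor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
measured = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
cv_r2 = loocv_r2(predictor, measured)
```

    For a least-squares fit the held-out residuals are never smaller in magnitude than the in-sample ones, which is consistent with the LOOCV R2 values in the abstract being lower than the explained variance.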

  10. Prediction of Ba, Mn and Zn for tropical soils using iron oxides and magnetic susceptibility

    NASA Astrophysics Data System (ADS)

    Marques Júnior, José; Arantes Camargo, Livia; Reynaldo Ferracciú Alleoni, Luís; Tadeu Pereira, Gener; De Bortoli Teixeira, Daniel; Santos Rabelo de Souza Bahia, Angelica

    2017-04-01

    Agricultural activity is an important source of potentially toxic elements (PTEs) in soil worldwide but particularly in heavily farmed areas. Spatial distribution characterization of PTE contents in farming areas is crucial to assess further environmental impacts caused by soil contamination. Designing prediction models becomes quite useful to characterize the spatial variability of continuous variables, as it allows prediction of soil attributes that might be difficult to attain in a large number of samples through conventional methods. This study aimed to evaluate, in three geomorphic surfaces of Oxisols, the capacity for predicting PTEs (Ba, Mn, Zn) and their spatial variability using iron oxides and magnetic susceptibility (MS). Soil samples were collected from three geomorphic surfaces and analyzed for chemical, physical, and mineralogical properties, as well as magnetic susceptibility (MS). PTE prediction models were calibrated by multiple linear regression (MLR). MLR calibration accuracy was evaluated using the coefficient of determination (R2). PTE spatial distribution maps were built using the values calculated by the calibrated models that reached the best accuracy by means of geostatistics. The high correlations between the attributes clay, MS, hematite (Hm), iron oxides extracted by sodium dithionite-citrate-bicarbonate (Fed), and iron oxides extracted using acid ammonium oxalate (Feo) with the elements Ba, Mn, and Zn enabled them to be selected as predictors for PTEs. Stepwise multiple linear regression showed that MS and Fed were the best PTE predictors individually, as they promoted no significant increase in R2 when two or more attributes were considered together. The MS-calibrated models for Ba, Mn, and Zn prediction exhibited R2 values of 0.88, 0.66, and 0.55, respectively.
These are promising results since MS is a fast, cheap, and non-destructive tool, allowing the prediction of a large number of samples, which in turn enables detailed mapping of large areas. MS predicted values enabled the characterization and the understanding of spatial variability of the studied PTEs.

  11. Spatially distributed modeling of soil organic carbon across China with improved accuracy

    NASA Astrophysics Data System (ADS)

    Li, Qi-quan; Zhang, Hao; Jiang, Xin-ye; Luo, Youlin; Wang, Chang-quan; Yue, Tian-xiang; Li, Bing; Gao, Xue-song

    2017-06-01

    There is a need for more detailed spatial information on soil organic carbon (SOC) for the accurate estimation of SOC stock and earth system models. As it is effective to use environmental factors as auxiliary variables to improve the prediction accuracy of spatially distributed modeling, a combined method (HASM_EF) was developed to predict the spatial pattern of SOC across China using high accuracy surface modeling (HASM), artificial neural network (ANN), and principal component analysis (PCA) to introduce land uses, soil types, climatic factors, topographic attributes, and vegetation cover as predictors. The performance of HASM_EF was compared with ordinary kriging (OK), OK and HASM each combined with land uses and soil types (OK_LS and HASM_LS, respectively), and regression kriging combined with land uses and soil types (RK_LS). Results showed that HASM_EF obtained the lowest prediction errors and the ratio of performance to deviation (RPD) presented the relative improvements of 89.91%, 63.77%, 55.86%, and 42.14%, respectively, compared to the other four methods. Furthermore, HASM_EF generated more details and more realistic spatial information on SOC. The improved performance of HASM_EF can be attributed to the introduction of more environmental factors, to explicit consideration of the multicollinearity of selected factors and the spatial nonstationarity and nonlinearity of relationships between SOC and selected factors, and to the performance of HASM and ANN. This method may serve as a useful tool in providing more precise spatial information on soil parameters for global modeling across large areas.

  12. A new approach for continuous estimation of baseflow using discrete water quality data: Method description and comparison with baseflow estimates from two existing approaches

    USGS Publications Warehouse

    Miller, Matthew P.; Johnson, Henry M.; Susong, David D.; Wolock, David M.

    2015-01-01

    Understanding how watershed characteristics and climate influence the baseflow component of stream discharge is a topic of interest to both the scientific and water management communities. Therefore, the development of baseflow estimation methods is a topic of active research. Previous studies have demonstrated that graphical hydrograph separation (GHS) and conductivity mass balance (CMB) methods can be applied to stream discharge data to estimate daily baseflow. While CMB is generally considered to be a more objective approach than GHS, its application across broad spatial scales is limited by a lack of high frequency specific conductance (SC) data. We propose a new method that uses discrete SC data, which are widely available, to estimate baseflow at a daily time step using the CMB method. The proposed approach involves the development of regression models that relate discrete SC concentrations to stream discharge and time. Regression-derived CMB baseflow estimates were more similar to baseflow estimates obtained using a CMB approach with measured high frequency SC data than were the GHS baseflow estimates at twelve snowmelt dominated streams and rivers. There was a near perfect fit between the regression-derived and measured CMB baseflow estimates at sites where the regression models were able to accurately predict daily SC concentrations. We propose that the regression-derived approach could be applied to estimate baseflow at large numbers of sites, thereby enabling future investigations of watershed and climatic characteristics that influence the baseflow component of stream discharge across large spatial scales.
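
The regression-derived CMB idea in this record can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two-component mixing equation is the form commonly used for conductivity mass balance, the log-linear regression of SC on discharge is an assumed stand-in for the paper's regression on discharge and time (the time terms are omitted here), and all function names and end-member values are hypothetical.

```python
import numpy as np

def fit_sc_regression(q_sampled, sc_sampled):
    """Fit SC = a + b * ln(Q) to discrete samples (assumed model form)."""
    q_sampled = np.asarray(q_sampled, dtype=float)
    sc_sampled = np.asarray(sc_sampled, dtype=float)
    design = np.column_stack([np.ones_like(q_sampled), np.log(q_sampled)])
    coef, *_ = np.linalg.lstsq(design, sc_sampled, rcond=None)
    return coef

def predict_sc(coef, q_daily):
    """Predict daily SC from daily discharge with the fitted coefficients."""
    return coef[0] + coef[1] * np.log(np.asarray(q_daily, dtype=float))

def cmb_baseflow(q, sc, sc_bf, sc_ro):
    """Two-component mass balance: Q_bf = Q * (SC - SC_ro) / (SC_bf - SC_ro).

    sc_bf and sc_ro are the baseflow and runoff end-member conductances.
    The baseflow fraction is clipped to [0, 1] so that baseflow never
    exceeds total discharge.
    """
    q = np.asarray(q, dtype=float)
    frac = (np.asarray(sc, dtype=float) - sc_ro) / (sc_bf - sc_ro)
    return q * np.clip(frac, 0.0, 1.0)

# Hypothetical discrete samples (discharge, SC) and a short daily discharge record.
coef = fit_sc_regression([2.0, 5.0, 20.0, 50.0], [480.0, 400.0, 250.0, 160.0])
daily_q = np.array([3.0, 10.0, 40.0])
daily_bf = cmb_baseflow(daily_q, predict_sc(coef, daily_q), sc_bf=500.0, sc_ro=100.0)
print(daily_bf)
```

Because SC usually falls as discharge rises, the estimated baseflow fraction shrinks during high-flow periods, which is the behavior the mass balance is meant to capture.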

  13. [Spatial epidemiological study on malaria epidemics in Hainan province].

    PubMed

    Wen, Liang; Shi, Run-He; Fang, Li-Qun; Xu, De-Zhong; Li, Cheng-Yi; Wang, Yong; Yuan, Zheng-Quan; Zhang, Hui

    2008-06-01

    The aims were to better understand the spatial distribution of malaria epidemics in Hainan province, to explore the relationship between malaria epidemics and environmental factors, and to develop a prediction model for malaria epidemics. Data on malaria and meteorological factors were collected in all 19 counties in Hainan province from May to Oct., 2000, and the proportions of land use types in these counties in this period were extracted from a digital map of land use in Hainan province. Land surface temperatures (LST) were extracted from MODIS images, and elevations of these counties were extracted from a DEM of Hainan province. The coefficients of correlation between malaria incidence and these environmental factors were then calculated with SPSS 13.0, and negative binomial regression analysis was done using SAS 9.0. The incidence of malaria showed (1) positive correlations with elevation and with the proportions of forest land area and grassland area; (2) negative correlations with the proportion of cultivated area, with urban and rural residential and industrial enterprise area, and with LST; (3) no correlations with meteorological factors, the proportion of water area, or unused land area. The prediction model of malaria obtained from the negative binomial regression analysis was: I (monthly, unit: 1/1,000,000) = exp(-1.672 - 0.399 × LST). The spatial distribution of malaria epidemics was associated with some environmental factors, and a prediction model of malaria epidemics could be developed with indexes extracted from satellite remote sensing images.
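
The reported prediction equation can be evaluated directly. The short sketch below only restates the published coefficients; the function name is hypothetical, and the assumption that LST is supplied in the same units the authors used (the abstract does not state them) is ours.

```python
import math

def predicted_monthly_incidence(lst):
    """Monthly malaria incidence (per 1,000,000) from the reported model.

    Implements I = exp(-1.672 - 0.399 * LST) with the coefficients given in
    the abstract. LST units are not stated in the abstract (assumption: the
    same units as the authors' MODIS-derived values).
    """
    return math.exp(-1.672 - 0.399 * lst)

# Effect of a one-unit rise in LST: incidence is multiplied by exp(-0.399).
rate_ratio = predicted_monthly_incidence(26.0) / predicted_monthly_incidence(25.0)
print(round(rate_ratio, 2))
```

The rate ratio of about 0.67 means that each one-unit increase in LST corresponds to roughly a one-third drop in predicted incidence, consistent with the negative correlation reported for LST.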

  14. Human motion tracking by temporal-spatial local gaussian process experts.

    PubMed

    Zhao, Xu; Fu, Yun; Liu, Yuncai

    2011-04-01

    Human pose estimation via motion tracking systems can be considered as a regression problem within a discriminative framework. It is always a challenging task to model the mapping from observation space to state space because of the high-dimensional characteristic in the multimodal conditional distribution. In order to build the mapping, existing techniques usually involve a large set of training samples in the learning process, and they are limited in their capability to deal with multimodality. We propose, in this work, a novel online sparse Gaussian Process (GP) regression model to recover 3-D human motion in monocular videos. Particularly, we investigate the fact that for a given test input, its output is mainly determined by the training samples potentially residing in its local neighborhood and defined in the unified input-output space. This leads to a local mixture GP experts system composed of different local GP experts, each of which dominates a mapping behavior with the specific covariance function adapting to a local region. To handle the multimodality, we combine both temporal and spatial information to obtain two categories of local experts. The temporal and spatial experts are integrated into a seamless hybrid system, which is automatically self-initialized and robust for visual tracking of nonlinear human motion. Learning and inference are extremely efficient as all the local experts are defined online within very small neighborhoods. Extensive experiments on two real-world databases, HumanEva and PEAR, demonstrate the effectiveness of our proposed model, which significantly improves on the performance of existing models.
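
The core idea of a local GP expert, predicting from only the training samples near the test input, can be sketched as below. This is a simplified illustration and not the authors' model: neighbors are chosen in input space only (the paper uses a unified input-output space and separate temporal and spatial experts), the kernel is a fixed RBF, and every name and parameter value is hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.5, variance=1.0):
    """Squared-exponential covariance between the rows of a and b."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=2)
    return variance * np.exp(-0.5 * d2 / length_scale ** 2)

def local_gp_predict(x_train, y_train, x_test, k=8, noise=1e-6):
    """GP posterior mean and variance at x_test using its k nearest samples."""
    dist = np.linalg.norm(x_train - x_test, axis=1)
    idx = np.argsort(dist)[:k]                      # the local neighborhood
    x_loc, y_loc = x_train[idx], y_train[idx]
    k_loc = rbf_kernel(x_loc, x_loc) + noise * np.eye(len(idx))
    k_star = rbf_kernel(x_loc, x_test[None, :])     # shape (k, 1)
    mean = (k_star.T @ np.linalg.solve(k_loc, y_loc))[0]
    prior_var = rbf_kernel(x_test[None, :], x_test[None, :])[0, 0]
    var = prior_var - (k_star.T @ np.linalg.solve(k_loc, k_star))[0, 0]
    return mean, max(var, 0.0)                      # clip rounding noise

# Toy 1-D regression problem: recover sin(x) from 50 samples.
x_train = np.linspace(0.0, 2.0 * np.pi, 50)[:, None]
y_train = np.sin(x_train[:, 0])
mean, var = local_gp_predict(x_train, y_train, np.array([1.0]))
print(mean, var)
```

Each prediction solves a k-by-k system instead of one sized by the full training set, which is why restricting experts to very small neighborhoods keeps learning and inference cheap.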

  15. The spatial heterogeneity between Japanese encephalitis incidence distribution and environmental variables in Nepal.

    PubMed

    Impoinvil, Daniel E; Solomon, Tom; Schluter, W William; Rayamajhi, Ajit; Bichha, Ram Padarath; Shakya, Geeta; Caminade, Cyril; Baylis, Matthew

    2011-01-01

    To identify potential environmental drivers of Japanese Encephalitis virus (JE) transmission in Nepal, we conducted an ecological study to determine the spatial association between 2005 Nepal JE incidence, and climate, agricultural, and land-cover variables at district level. District-level data on JE cases were examined using Local Indicators of Spatial Association (LISA) analysis to identify spatial clusters from 2004 to 2008 and 2005 data was used to fit a spatial lag regression model with climate, agriculture and land-cover variables. Prior to 2006, there was a single large cluster of JE cases located in the Far-West and Mid-West terai regions of Nepal. After 2005, the distribution of JE cases in Nepal shifted with clusters found in the central hill areas. JE incidence during the 2005 epidemic had a stronger association with May mean monthly temperature and April mean monthly total precipitation compared to mean annual temperature and precipitation. A parsimonious spatial lag regression model revealed, 1) a significant negative relationship between JE incidence and April precipitation, 2) a significant positive relationship between JE incidence and percentage of irrigated land 3) a non-significant negative relationship between JE incidence and percentage of grassland cover, and 4) a unimodal non-significant relationship between JE Incidence and pig-to-human ratio. JE cases clustered in the terai prior to 2006 where it seemed to shift to the Kathmandu region in subsequent years. The spatial pattern of JE cases during the 2005 epidemic in Nepal was significantly associated with low precipitation and the percentage of irrigated land. Despite the availability of an effective vaccine, it is still important to understand environmental drivers of JEV transmission since the enzootic cycle of JEV transmission is not likely to be totally interrupted. 
Understanding the spatial dynamics of JE risk factors may be useful in providing important information to the Nepal immunization program.

  16. Identifying and characterizing hepatitis C virus hotspots in Massachusetts: a spatial epidemiological approach.

    PubMed

    Stopka, Thomas J; Goulart, Michael A; Meyers, David J; Hutcheson, Marga; Barton, Kerri; Onofrey, Shauna; Church, Daniel; Donahue, Ashley; Chui, Kenneth K H

    2017-04-20

    Hepatitis C virus (HCV) infections have increased during the past decade but little is known about geographic clustering patterns. We used a unique analytical approach, combining geographic information systems (GIS), spatial epidemiology, and statistical modeling to identify and characterize HCV hotspots, statistically significant clusters of census tracts with elevated HCV counts and rates. We compiled sociodemographic and HCV surveillance data (n = 99,780 cases) for Massachusetts census tracts (n = 1464) from 2002 to 2013. We used a five-step spatial epidemiological approach, calculating incremental spatial autocorrelations and Getis-Ord Gi* statistics to identify clusters. We conducted logistic regression analyses to determine factors associated with the HCV hotspots. We identified nine HCV clusters, with the largest in Boston, New Bedford/Fall River, Worcester, and Springfield (p < 0.05). In multivariable analyses, we found that HCV hotspots were independently and positively associated with the percent of the population that was Hispanic (adjusted odds ratio [AOR]: 1.07; 95% confidence interval [CI]: 1.04, 1.09) and the percent of households receiving food stamps (AOR: 1.83; 95% CI: 1.22, 2.74). HCV hotspots were independently and negatively associated with the percent of the population that were high school graduates or higher (AOR: 0.91; 95% CI: 0.89, 0.93) and the percent of the population in the "other" race/ethnicity category (AOR: 0.88; 95% CI: 0.85, 0.91). We identified locations where HCV clusters were a concern, and where enhanced HCV prevention, treatment, and care can help combat the HCV epidemic in Massachusetts. GIS, spatial epidemiological and statistical analyses provided a rigorous approach to identify hotspot clusters of disease, which can inform public health policy and intervention targeting. 
Further studies that incorporate spatiotemporal cluster analyses, Bayesian spatial and geostatistical models, spatially weighted regression analyses, and assessment of associations between HCV clustering and the built environment are needed to expand upon our combined spatial epidemiological and statistical methods.
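
The hotspot step can be illustrated with the Getis-Ord Gi* statistic in its standard z-score form. This is a sketch on hypothetical data, not the study's GIS workflow: the binary weights (each site, itself included, and its immediate neighbors on a line) and the toy case counts are assumptions made for illustration.

```python
import numpy as np

def getis_ord_gi_star(values, weights):
    """Gi* z-score for every site.

    values  : length-n array of the attribute (for example, case rates)
    weights : n-by-n spatial weights matrix; for Gi* the diagonal is
              non-zero because each site counts as its own neighbor
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = len(x)
    x_bar = x.mean()
    s = np.sqrt((x ** 2).mean() - x_bar ** 2)
    w_sum = w.sum(axis=1)
    w2_sum = (w ** 2).sum(axis=1)
    numerator = w @ x - x_bar * w_sum
    denominator = s * np.sqrt((n * w2_sum - w_sum ** 2) / (n - 1))
    return numerator / denominator

# Hypothetical example: 20 sites on a line, with three adjacent high values.
n_sites = 20
rates = np.zeros(n_sites)
rates[9:12] = 10.0
site = np.arange(n_sites)
weights = (np.abs(site[:, None] - site[None, :]) <= 1).astype(float)
z = getis_ord_gi_star(rates, weights)
print(np.where(z > 1.96)[0])  # sites flagged as hotspots at about p < 0.05
```

Sites whose z-score exceeds 1.96 sit in neighborhoods with unusually high values; in practice the z-scores are often adjusted for multiple testing before hotspots are declared.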

  17. The Spatial Heterogeneity between Japanese Encephalitis Incidence Distribution and Environmental Variables in Nepal

    PubMed Central

    Impoinvil, Daniel E.; Solomon, Tom; Schluter, W. William; Rayamajhi, Ajit; Bichha, Ram Padarath; Shakya, Geeta; Caminade, Cyril; Baylis, Matthew

    2011-01-01

    Background: To identify potential environmental drivers of Japanese Encephalitis virus (JE) transmission in Nepal, we conducted an ecological study to determine the spatial association between 2005 Nepal JE incidence, and climate, agricultural, and land-cover variables at district level. Methods: District-level data on JE cases were examined using Local Indicators of Spatial Association (LISA) analysis to identify spatial clusters from 2004 to 2008 and 2005 data was used to fit a spatial lag regression model with climate, agriculture and land-cover variables. Results: Prior to 2006, there was a single large cluster of JE cases located in the Far-West and Mid-West terai regions of Nepal. After 2005, the distribution of JE cases in Nepal shifted with clusters found in the central hill areas. JE incidence during the 2005 epidemic had a stronger association with May mean monthly temperature and April mean monthly total precipitation compared to mean annual temperature and precipitation. A parsimonious spatial lag regression model revealed, 1) a significant negative relationship between JE incidence and April precipitation, 2) a significant positive relationship between JE incidence and percentage of irrigated land 3) a non-significant negative relationship between JE incidence and percentage of grassland cover, and 4) a unimodal non-significant relationship between JE Incidence and pig-to-human ratio. Conclusion: JE cases clustered in the terai prior to 2006 where it seemed to shift to the Kathmandu region in subsequent years. The spatial pattern of JE cases during the 2005 epidemic in Nepal was significantly associated with low precipitation and the percentage of irrigated land. Despite the availability of an effective vaccine, it is still important to understand environmental drivers of JEV transmission since the enzootic cycle of JEV transmission is not likely to be totally interrupted. 
Understanding the spatial dynamics of JE risk factors may be useful in providing important information to the Nepal immunization program. PMID:21811573

  18. Regionalization of monthly rainfall erosivity patterns in Switzerland

    NASA Astrophysics Data System (ADS)

    Schmidt, Simon; Alewell, Christine; Panagos, Panos; Meusburger, Katrin

    2016-10-01

    One major controlling factor of water erosion is rainfall erosivity, which is quantified as the product of total storm energy and a maximum 30 min intensity (I30). Rainfall erosivity is often expressed as R-factor in soil erosion risk models like the Universal Soil Loss Equation (USLE) and its revised version (RUSLE). As rainfall erosivity is closely correlated with rainfall amount and intensity, the rainfall erosivity of Switzerland can be expected to have a regional characteristic and seasonal dynamic throughout the year. This intra-annual variability was mapped by a monthly modeling approach to assess simultaneously spatial and monthly patterns of rainfall erosivity. So far only national seasonal means and regional annual means exist for Switzerland. We used a network of 87 precipitation gauging stations with a 10 min temporal resolution to calculate long-term monthly mean R-factors. Stepwise generalized linear regression (GLM) and leave-one-out cross-validation (LOOCV) were used to select spatial covariates which explain the spatial and temporal patterns of the R-factor for each month across Switzerland. The monthly R-factor is mapped by summing the predicted R-factor of the regression equation and the corresponding residuals of the regression, which are interpolated by ordinary kriging (regression-kriging). As spatial covariates, a variety of precipitation indicator data has been included such as snow depths, a combination product of hourly precipitation measurements and radar observations (CombiPrecip), daily Alpine precipitation (EURO4M-APGD), and monthly precipitation sums (RhiresM). Topographic parameters (elevation, slope) were also significant explanatory variables for single months. The comparison of the 12 monthly rainfall erosivity maps showed a distinct seasonality with the highest rainfall erosivity in summer (June, July, and August) influenced by intense rainfall events. Winter months have the lowest rainfall erosivity. 
A proportion of 62 % of the total annual rainfall erosivity is identified within four months only (June-September). The highest erosion risk can be expected in July, where not only rainfall erosivity but also erosivity density is high. In addition to the intra-annual temporal regime, a spatial variability of this seasonality was detectable between different regions of Switzerland. The assessment of the dynamic behavior of the R-factor is valuable for the identification of susceptible seasons and regions.
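
Regression-kriging, as used in this record, adds a regression trend to kriged regression residuals. The sketch below shows that structure on synthetic data. It is not the authors' workflow: a plain least-squares trend replaces their stepwise GLM, the exponential variogram parameters are fixed by hand rather than fitted, and all names and values are hypothetical.

```python
import numpy as np

def exp_variogram(h, sill=1.0, vrange=3.0):
    """Exponential variogram without nugget (assumed, not fitted)."""
    return sill * (1.0 - np.exp(-h / vrange))

def design(covariates):
    """Design matrix with an intercept column."""
    return np.column_stack([np.ones(len(covariates)), covariates])

def ordinary_krige(coords, values, targets, sill=1.0, vrange=3.0):
    """Ordinary kriging of values observed at coords onto target locations."""
    n = len(values)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    system = np.ones((n + 1, n + 1))       # last row and column: sum(w) = 1
    system[:n, :n] = exp_variogram(d, sill, vrange)
    system[n, n] = 0.0
    d0 = np.linalg.norm(coords[:, None, :] - targets[None, :, :], axis=2)
    rhs = np.vstack([exp_variogram(d0, sill, vrange), np.ones((1, len(targets)))])
    w = np.linalg.solve(system, rhs)[:n]   # drop the Lagrange multiplier row
    return w.T @ values

def regression_kriging(coords, covs, z, t_coords, t_covs):
    """Trend from least squares plus ordinary-kriged residuals."""
    beta, *_ = np.linalg.lstsq(design(covs), z, rcond=None)
    residuals = z - design(covs) @ beta
    return design(t_covs) @ beta + ordinary_krige(coords, residuals, t_coords)

# Synthetic stations: one covariate plus a smooth spatial signal.
gen = np.random.default_rng(0)
coords = gen.uniform(0.0, 10.0, size=(20, 2))
covs = gen.normal(size=(20, 1))
z = 2.0 + 3.0 * covs[:, 0] + 0.5 * np.sin(coords[:, 0])
grid = np.array([[2.5, 2.5], [7.5, 7.5]])
grid_covs = np.array([[0.0], [1.0]])
print(regression_kriging(coords, covs, z, grid, grid_covs))
```

With a zero nugget, kriging interpolates the residuals exactly, so the combined prediction reproduces the observations at the station locations while the regression carries the covariate signal between them.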

  19. Spatially Explicit Estimates of Suspended Sediment and Bedload Transport Rates for Western Oregon and Northwestern California

    NASA Astrophysics Data System (ADS)

    O'Connor, J. E.; Wise, D. R.; Mangano, J.; Jones, K.

    2015-12-01

    Empirical analyses of suspended sediment and bedload transport give estimates of sediment flux for western Oregon and northwestern California. The estimates of both bedload and suspended load are from regression models relating measured annual sediment yield to geologic, physiographic, and climatic properties of contributing basins. The best models include generalized geology and either slope or precipitation. The best-fit suspended-sediment model is based on basin geology, precipitation, and area of recent wildfire. It explains 65% of the variance for 68 suspended sediment measurement sites within the model area. Predicted suspended sediment yields range from no yield from the High Cascades geologic province to 200 tonnes/km2-yr in the northern Oregon Coast Range and 1000 tonnes/km2-yr in recently burned areas of the northern Klamath terrane. Bed-material yield is similarly estimated from a regression model based on 22 sites of measured bed-material transport, mostly from reservoir accumulation analyses but also from several bedload measurement programs. The resulting best-fit regression is based on basin slope and the presence/absence of the Klamath geologic terrane. For the Klamath terrane, bed-material yield is twice that of the other geologic provinces. This model explains more than 80% of the variance of the better-quality measurements. Predicted bed-material yields range up to 350 tonnes/km2-yr in steep areas of the Klamath terrane. Applying these regressions to small individual watersheds (mean size: 66 km2 for bed-material; 3 km2 for suspended sediment) and cumulating totals down the hydrologic network (but also decreasing the bed-material flux by experimentally determined attrition rates) gives spatially explicit estimates of both bed-material and suspended sediment flux. This enables assessment of several management issues, including the effects of dams on bedload transport, instream gravel mining, habitat formation processes, and water quality. 
The combined fluxes can also be compared to long-term rock uplift and cosmogenically determined landscape erosion rates.

  20. Characterizing optical properties and spatial heterogeneity of human ovarian tissue using spatial frequency domain imaging

    NASA Astrophysics Data System (ADS)

    Nandy, Sreyankar; Mostafa, Atahar; Kumavor, Patrick D.; Sanders, Melinda; Brewer, Molly; Zhu, Quing

    2016-10-01

    A spatial frequency domain imaging (SFDI) system was developed for characterizing ex vivo human ovarian tissue using wide-field absorption and scattering properties and their spatial heterogeneities. Based on the observed differences between absorption and scattering images of different ovarian tissue groups, six parameters were quantitatively extracted. These are the mean absorption and scattering, spatial heterogeneities of both absorption and scattering maps measured by a standard deviation, and a fitting error of a Gaussian model fitted to normalized mean Radon transform of the absorption and scattering maps. A logistic regression model was used for classification of malignant and normal ovarian tissues. A sensitivity of 95%, specificity of 100%, and area under the curve of 0.98 were obtained using six parameters extracted from the SFDI images. The preliminary results demonstrate the diagnostic potential of the SFDI method for quantitative characterization of wide-field optical properties and the spatial distribution heterogeneity of human ovarian tissue. SFDI could be an extremely robust and valuable tool for evaluation of the ovary and detection of neoplastic changes of ovarian cancer.

  1. The effects of climate downscaling technique and observational data set on modeled ecological responses.

    PubMed

    Pourmokhtarian, Afshin; Driscoll, Charles T; Campbell, John L; Hayhoe, Katharine; Stoner, Anne M K

    2016-07-01

    Assessments of future climate change impacts on ecosystems typically rely on multiple climate model projections, but often utilize only one downscaling approach trained on one set of observations. Here, we explore the extent to which modeled biogeochemical responses to changing climate are affected by the selection of the climate downscaling method and training observations used at the montane landscape of the Hubbard Brook Experimental Forest, New Hampshire, USA. We evaluated three downscaling methods: the delta method (or the change factor method), monthly quantile mapping (Bias Correction-Spatial Disaggregation, or BCSD), and daily quantile regression (Asynchronous Regional Regression Model, or ARRM). Additionally, we trained outputs from four atmosphere-ocean general circulation models (AOGCMs) (CCSM3, HadCM3, PCM, and GFDL-CM2.1) driven by higher (A1fi) and lower (B1) future emissions scenarios on two sets of observations (1/8° resolution grid vs. individual weather station) to generate the high-resolution climate input for the forest biogeochemical model PnET-BGC (eight ensembles of six runs). The choice of downscaling approach and spatial resolution of the observations used to train the downscaling model impacted modeled soil moisture and streamflow, which in turn affected forest growth, net N mineralization, net soil nitrification, and stream chemistry. All three downscaling methods were highly sensitive to the observations used, resulting in projections that were significantly different between station-based and grid-based observations. The choice of downscaling method also slightly affected the results, although not as much as the choice of observations. Using spatially smoothed gridded observations and/or methods that do not resolve sub-monthly shifts in the distribution of temperature and/or precipitation can produce biased results in model applications run at greater temporal and/or spatial resolutions. 
These results underscore the importance of carefully considering field observations used for training, as well as the downscaling method used to generate climate change projections, for smaller-scale modeling studies. Different sources of variability including selection of AOGCM, emissions scenario, downscaling technique, and data used for training downscaling models, result in a wide range of projected forest ecosystem responses to future climate change. © 2016 by the Ecological Society of America.
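
Of the three downscaling methods compared, the delta (change factor) method is the simplest to write down. The sketch below uses the common formulation, which is an assumption here because the abstract names the method without giving a formula: an additive monthly shift for temperature and a multiplicative monthly factor for precipitation. All names and numbers are hypothetical.

```python
import numpy as np

def delta_downscale(obs, months, gcm_hist_clim, gcm_future_clim, multiplicative=False):
    """Apply monthly change factors from a climate model to observations.

    obs             : observed series (station or grid cell)
    months          : month index (0-11) of each observation
    gcm_hist_clim   : 12 monthly means of the model's historical run
    gcm_future_clim : 12 monthly means of the model's future run
    multiplicative  : False adds the change (typical for temperature);
                      True scales by the ratio (typical for precipitation)
    """
    obs = np.asarray(obs, dtype=float)
    months = np.asarray(months)
    hist = np.asarray(gcm_hist_clim, dtype=float)[months]
    future = np.asarray(gcm_future_clim, dtype=float)[months]
    if multiplicative:
        return obs * (future / hist)
    return obs + (future - hist)

# Hypothetical case: the model warms every month by 2 degrees and
# increases precipitation by 10 percent.
hist_temp = np.full(12, 5.0)
future_temp = hist_temp + 2.0
print(delta_downscale([0.0, 10.0], [0, 6], hist_temp, future_temp))
hist_precip = np.full(12, 80.0)
future_precip = hist_precip * 1.1
print(delta_downscale([100.0, 50.0], [0, 6], hist_precip, future_precip, multiplicative=True))
```

Because the observed series supplies all day-to-day variability, the result depends directly on which observations are used, which matches the sensitivity to training observations reported in the abstract.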

  2. A flexible cure rate model for spatially correlated survival data based on generalized extreme value distribution and Gaussian process priors.

    PubMed

    Li, Dan; Wang, Xia; Dey, Dipak K

    2016-09-01

    Our present work proposes a new survival model in a Bayesian context to analyze right-censored survival data for populations with a surviving fraction, assuming that the log failure time follows a generalized extreme value distribution. Many applications require a more flexible modeling of covariate information than a simple linear or parametric form for all covariate effects. It is also necessary to include the spatial variation in the model, since it is sometimes unexplained by the covariates considered in the analysis. Therefore, the nonlinear covariate effects and the spatial effects are incorporated into the systematic component of our model. Gaussian processes (GPs) provide a natural framework for modeling potentially nonlinear relationship and have recently become extremely powerful in nonlinear regression. Our proposed model adopts a semiparametric Bayesian approach by imposing a GP prior on the nonlinear structure of continuous covariate. With the consideration of data availability and computational complexity, the conditionally autoregressive distribution is placed on the region-specific frailties to handle spatial correlation. The flexibility and gains of our proposed model are illustrated through analyses of simulated data examples as well as a dataset involving a colon cancer clinical trial from the state of Iowa. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  3. Stability of Major Geogenic Cations in Drinking Water-An Issue of Public Health Importance: A Danish Study, 1980–2017.

    PubMed

    Wodschow, Kirstine; Hansen, Birgitte; Schullehner, Jörg; Ersbøll, Annette Kjær

    2018-06-08

    Concentrations and spatial variations of the four cations Na, K, Mg and Ca are known to some extent for groundwater and to a lesser extent for drinking water. Using Denmark as a case, the purpose of this study was to analyze the spatial and temporal variations in the major cations in drinking water. The results will contribute to a better exposure estimation in future studies of the association between cations and diseases. Spatial and temporal variations and the association with aquifer types were analyzed with spatial scan statistics, linear regression and a multilevel mixed-effects linear regression model. About 65,000 water samples of each cation (1980–2017) were included in the study. Mean concentrations for 1980–2017 were 31.4 mg/L, 3.5 mg/L, 12.1 mg/L and 84.5 mg/L for Na, K, Mg and Ca, respectively. An expected west-east trend in concentrations was confirmed, mainly explained by variations in aquifer types. The trend in concentration was stable for about 31–45% of the public water supply areas. It is therefore recommended that the exposure estimate in future health related studies not only be based on a single mean value, but that temporal and spatial variations should also be included.

  4. Monitoring and Assessment of Military Installation Land Condition under Training Disturbance Using Remote Sensing

    NASA Astrophysics Data System (ADS)

    Rijal, Santosh

    Various military training activities are conducted on more than 11.3 million hectares of land (> 5,500 training sites) in the United States (U.S.). These training activities directly and indirectly degrade the land. Land degradation can impede continuous military training. In order to sustain long term training missions and Army combat readiness, the environmental conditions of the military installations need to be carefully monitored and assessed. Furthermore, the National Environmental Policy Act of 1969 (NEPA) and the U.S. Army Regulation 200-2 require the DoD to minimize the environmental impacts of training and document the environmental consequences of their actions. To achieve these objectives, the Department of Army initiated an Integrated Training Area Management (ITAM) program to manage training lands through assessing their environmental requirements and establishing policies and procedures to achieve optimum, sustainable use of training lands. One of the programs under ITAM, Range and Training Land Assessment (RTLA), was established to collect field-based data for monitoring installations' environmental condition. Due to the high cost and inefficiencies involved in the collection of field data, the RTLA program was stopped in several military installations. Therefore, there has been a strong need to develop an efficient and low cost remote sensing based methodology for assessing and monitoring land conditions of military installations. It is also important to make a long-term assessment of installation land condition for understanding cumulative impacts of continuous military training activities. Additionally, it is unclear to what extent, compared to non-military land, military training activities have degraded the land condition of military installations. 
The first paper of this dissertation developed a soil-erosion-relevant, image-derived cover factor (ICF) based on linear spectral mixture (LSM) analysis to assess and monitor the land condition of military land and compare it with non-military land. The results from this study can provide Fort Riley (FR) land managers with information on the spatial variation and temporal trend of land condition in FR. FR land managers can also use this method for monitoring their land condition at a very low cost. This method can thus be applied to other military installations as well as non-military lands. Furthermore, one of the most significant environmental problems in military installations of the U.S. is the formation of gullies due to the intensive use of military vehicles. However, to our knowledge, no remote sensing based method has been developed and used to detect gullies in military installations. In the second paper of this dissertation, a light detection and ranging (LiDAR) derived digital elevation model (DEM) of 2010 and WorldView-2 images of 2010 were used to quantify the gullies in FR. This method can be easily applied to assess gullies in non-military installations. On the other hand, modeling the land condition of military installations is critical to understanding the spatial and temporal pattern of military training induced disturbance and land recovery. In the third paper, it was assumed that the military training induced disturbance was spatially auto-correlated, and thus four regression models, i) linear stepwise regression (LSR), ii) logistic regression (LR), iii) geographically weighted linear regression (GWR), and iv) geographically weighted logistic regression (GWLR), were developed and compared using remote sensing image derived spectral variables for years 1990, 1997, 1998, 1999, and 2001. 
It was found that the spatial distribution of the military training disturbance was well demonstrated by all the regression models, with higher intensities of military training disturbance in the northwest and central west parts of the installation. Compared to the other regression models, GWR estimated the land condition of FR most accurately. This result demonstrated the applicability of a local variability based regression model for accurately predicting land condition. Different plant communities of military installations respond differently to military training induced disturbance. Information on the spatial distribution of plant species in military installations is important to gain insight into the resilient capacity of the land following disturbances. For this purpose, in the fourth paper, hyperspectral in-situ data were collected from FR and KPBS in the summer of 2015 using a hyperspectral instrument. Principal component analysis (PCA) and band relative importance (BRI) were used to identify the relative importance of each of the spectral bands. The results from this study provided useful information about the optimal wavelengths that help distinguish different plant species of FR and can be easily used with high resolution hyperspectral images for mapping the spatial distribution of the plant species. This information will be helpful for the sustainable management of the tallgrass prairie ecosystem. (Abstract shortened by ProQuest.).
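
Geographically weighted regression, the best-performing model in this record, fits a separate weighted least-squares regression at each location so that coefficients can vary across space. The sketch below is a minimal illustration on synthetic data, not the dissertation's model: the Gaussian kernel, the fixed bandwidth, and all names are assumptions.

```python
import numpy as np

def gwr_coefficients(coords, x, y, target, bandwidth):
    """Local regression coefficients (intercept first) at one target location.

    Observations are weighted by a Gaussian kernel of their distance to the
    target, so nearby samples dominate the local fit.
    """
    d = np.linalg.norm(coords - target, axis=1)
    w = np.exp(-0.5 * (d / bandwidth) ** 2)
    design = np.column_stack([np.ones(len(y)), x])
    sw = np.sqrt(w)                      # weighted LS via row scaling
    beta, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    return beta

# Synthetic data in which the slope grows from west to east.
gen = np.random.default_rng(0)
coords = gen.uniform(0.0, 10.0, size=(200, 2))
x = gen.normal(size=(200, 1))
y = (1.0 + 0.2 * coords[:, 0]) * x[:, 0]
beta_west = gwr_coefficients(coords, x, y, np.array([1.0, 5.0]), bandwidth=2.0)
beta_east = gwr_coefficients(coords, x, y, np.array([9.0, 5.0]), bandwidth=2.0)
print(beta_west[1], beta_east[1])
```

A single global regression would report one average slope for this data, whereas the local fits recover the west-to-east gradient. In practice the bandwidth is usually chosen by cross-validation or an information criterion rather than fixed by hand.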

  5. MERGANSER- Predicting Mercury Levels in Fish and Loons in New England Lakes

    EPA Science Inventory

    MERGANSER (MERcury Geo-spatial AssesmentS for the New England Region) is an empirical least squares multiple regression model using atmospheric deposition of mercury (Hg) and readily obtainable lake and watershed features to predict fish and common loon Hg (as methyl mercury) in ...

  6. A basin-scale approach to estimating stream temperatures of tributaries to the lower Klamath River, California

    USGS Publications Warehouse

    Flint, L.E.; Flint, A.L.

    2008-01-01

    Stream temperature is an important component of salmonid habitat and is often above levels suitable for fish survival in the Lower Klamath River in northern California. The objective of this study was to provide boundary conditions for models that are assessing stream temperature on the main stem for the purpose of developing strategies to manage stream conditions using Total Maximum Daily Loads. For model input, hourly stream temperatures for 36 tributaries were estimated for 1 Jan. 2001 through 31 Oct. 2004. A basin-scale approach incorporating spatially distributed energy balance data was used to estimate the stream temperatures with measured air temperature and relative humidity data and simulated solar radiation, including topographic shading and corrections for cloudiness. Regression models were developed on the basis of available stream temperature data to predict temperatures for unmeasured periods of time and for unmeasured streams. The most significant factor in matching measured minimum and maximum stream temperatures was the seasonality of the estimate. Adding minimum and maximum air temperature to the regression model improved the estimate, and air temperature data over the region are available and easily distributed spatially. The addition of simulated solar radiation and vapor saturation deficit to the regression model significantly improved predictions of maximum stream temperature but was not required to predict minimum stream temperature. The average SE in estimated maximum daily stream temperature for the individual basins was 0.9 ± 0.6 °C at the 95% confidence interval. Copyright © 2008 by the American Society of Agronomy, Crop Science Society of America, and Soil Science Society of America. All rights reserved.

  7. Predicting active-layer soil thickness using topographic variables at a small watershed scale

    PubMed Central

    Li, Aidi; Tan, Xing; Wu, Wei; Liu, Hongbin; Zhu, Jie

    2017-01-01

    Knowledge about the spatial distribution of active-layer (AL) soil thickness is indispensable for ecological modeling, precision agriculture, and land resource management. However, it is difficult to obtain the details on AL soil thickness by using conventional soil survey method. In this research, the objective is to investigate the possibility and accuracy of mapping the spatial distribution of AL soil thickness through random forest (RF) model by using terrain variables at a small watershed scale. A total of 1113 soil samples collected from the slope fields were randomly divided into calibration (770 soil samples) and validation (343 soil samples) sets. Seven terrain variables including elevation, aspect, relative slope position, valley depth, flow path length, slope height, and topographic wetness index were derived from a digital elevation map (30 m). The RF model was compared with multiple linear regression (MLR), geographically weighted regression (GWR) and support vector machines (SVM) approaches based on the validation set. Model performance was evaluated by precision criteria of mean error (ME), mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R2). Comparative results showed that RF outperformed MLR, GWR and SVM models. The RF gave better values of ME (0.39 cm), MAE (7.09 cm), and RMSE (10.85 cm) and higher R2 (62%). The sensitivity analysis demonstrated that the DEM had less uncertainty than the AL soil thickness. The outcome of the RF model indicated that elevation, flow path length and valley depth were the most important factors affecting the AL soil thickness variability across the watershed. These results demonstrated the RF model is a promising method for predicting spatial distribution of AL soil thickness using terrain parameters. PMID:28877196
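
    The four validation criteria named in this abstract (ME, MAE, RMSE, R2) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the sign convention for the error (predicted minus observed), the use of 1 - SS_res/SS_tot for R2 (the paper may instead report a squared correlation), and the sample thickness values are all assumptions.

```python
import numpy as np

def validation_metrics(observed, predicted):
    """Return ME, MAE, RMSE and R2 for a validation set (hypothetical helper)."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    err = pred - obs                      # assumed sign convention
    ss_res = np.sum(err ** 2)             # residual sum of squares
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return {
        "ME": err.mean(),                 # mean error (bias)
        "MAE": np.abs(err).mean(),        # mean absolute error
        "RMSE": np.sqrt(np.mean(err ** 2)),
        "R2": 1.0 - ss_res / ss_tot,      # assumed definition of R2
    }

# Made-up soil thickness values in cm, for illustration only.
obs = [30.0, 45.0, 50.0, 62.0]
pred = [32.0, 44.0, 47.0, 65.0]
metrics = validation_metrics(obs, pred)
```

    ME can be close to zero even when individual errors are large, because positive and negative errors cancel; that is why it is usually read together with MAE and RMSE.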

  8. Analysis of an Environmental Exposure Health Questionnaire in a Metropolitan Minority Population Utilizing Logistic Regression and Support Vector Machines

    PubMed Central

    Chen, Chau-Kuang; Bruce, Michelle; Tyler, Lauren; Brown, Claudine; Garrett, Angelica; Goggins, Susan; Lewis-Polite, Brandy; Weriwoh, Mirabel L; Juarez, Paul D.; Hood, Darryl B.; Skelton, Tyler

    2014-01-01

    The goal of this study was to analyze a 54-item instrument for assessment of perception of exposure to environmental contaminants within the context of the built environment, or exposome. This exposome was defined in five domains to include 1) home and hobby, 2) school, 3) community, 4) occupation, and 5) exposure history. Interviews were conducted with child-bearing-age minority women at Metro Nashville General Hospital at Meharry Medical College. Data were analyzed utilizing DTReg software for Support Vector Machine (SVM) modeling followed by an SPSS package for a logistic regression model. The target (outcome) variable of interest was respondent's residence by ZIP code. The results demonstrate that the rank order of important variables with respect to SVM modeling versus traditional logistic regression models is almost identical. This is the first study documenting that SVM analysis has discriminate power for determination of higher-ordered spatial relationships on an environmental exposure history questionnaire. PMID:23395953

  9. Analysis of an environmental exposure health questionnaire in a metropolitan minority population utilizing logistic regression and Support Vector Machines.

    PubMed

    Chen, Chau-Kuang; Bruce, Michelle; Tyler, Lauren; Brown, Claudine; Garrett, Angelica; Goggins, Susan; Lewis-Polite, Brandy; Weriwoh, Mirabel L; Juarez, Paul D; Hood, Darryl B; Skelton, Tyler

    2013-02-01

    The goal of this study was to analyze a 54-item instrument for assessment of perception of exposure to environmental contaminants within the context of the built environment, or exposome. This exposome was defined in five domains to include 1) home and hobby, 2) school, 3) community, 4) occupation, and 5) exposure history. Interviews were conducted with child-bearing-age minority women at Metro Nashville General Hospital at Meharry Medical College. Data were analyzed utilizing DTReg software for Support Vector Machine (SVM) modeling followed by an SPSS package for a logistic regression model. The target (outcome) variable of interest was respondent's residence by ZIP code. The results demonstrate that the rank order of important variables with respect to SVM modeling versus traditional logistic regression models is almost identical. This is the first study documenting that SVM analysis has discriminate power for determination of higher-ordered spatial relationships on an environmental exposure history questionnaire.

  10. Estimates of diffuse phosphorus sources in surface waters of the United States using a spatially referenced watershed model

    USGS Publications Warehouse

    Alexander, R.B.; Smith, R.A.; Schwarz, G.E.

    2004-01-01

    The statistical watershed model SPARROW (SPAtially Referenced Regression On Watershed attributes) was used to estimate the sources and transport of total phosphorus (TP) in surface waters of the United States. We calibrated the model using stream measurements of TP from 336 watersheds of mixed land use and spatial data on topography, soils, stream hydrography, and land use (agriculture, forest, shrub/grass, urban). The model explained 87% of the spatial variability in log transformed stream TP flux (kg yr-1). Predictions of stream yield (kg ha-1 yr-1) were typically within 45% of the observed values at the monitoring sites. The model identified appreciable effects of soils, streams, and reservoirs on TP transport. The estimated aquatic rates of phosphorus removal declined with increasing stream size and rates of water flushing in reservoirs (i.e. areal hydraulic loads). A phosphorus budget for the 2.9 million km2 Mississippi River Basin provides a detailed accounting of TP delivery to streams, the removal of TP in surface waters, and the stream export of TP from major interior watersheds for sources associated with each land-use type. © US Government 2004.

  11. Spatial analysis of agri-environmental policy uptake and expenditure in Scotland.

    PubMed

    Yang, Anastasia L; Rounsevell, Mark D A; Wilson, Ronald M; Haggett, Claire

    2014-01-15

    Agri-environment is one of the most widely supported rural development policy measures in Scotland in terms of number of participants and expenditure. It comprises 69 management options and sub-options that are delivered primarily through the competitive 'Rural Priorities scheme'. Understanding the spatial determinants of uptake and expenditure would assist policy-makers in guiding future policy targeting efforts for the rural environment. This study is unique in examining the spatial dependency and determinants of Scotland's agri-environmental measures and categorised options uptake and payments at the parish level. Spatial econometrics is applied to test the influence of 40 explanatory variables on farming characteristics, land capability, designated sites, accessibility and population. Results identified spatial dependency for each of the dependent variables, which supported the use of spatially-explicit models. The goodness of fit of the spatial models was better than for the aspatial regression models. There was also notable improvement in the models for participation compared with the models for expenditure. Furthermore a range of expected explanatory variables were found to be significant and varied according to the dependent variable used. The majority of models for both payment and uptake showed a significant positive relationship with SSSI (Sites of Special Scientific Interest), which are designated sites prioritised in Scottish policy. These results indicate that environmental targeting efforts by the government for AEP uptake in designated sites can be effective. However habitats outside of SSSI, termed here the 'wider countryside' may not be sufficiently competitive to receive funding in the current policy system. Copyright © 2013 Elsevier Ltd. All rights reserved.

  12. Spatial autocorrelation analysis of health care hotspots in Taiwan in 2006

    PubMed Central

    2009-01-01

    Background Spatial analytical techniques and models are often used in epidemiology to identify spatial anomalies (hotspots) in disease regions. These analytical approaches can be used to not only identify the location of such hotspots, but also their spatial patterns. Methods In this study, we utilize spatial autocorrelation methodologies, including Global Moran's I and Local Getis-Ord statistics, to describe and map spatial clusters, and areas in which these are situated, for the 20 leading causes of death in Taiwan. In addition, we use the fit to a logistic regression model to test the characteristics of similarity and dissimilarity by gender. Results Gender is compared in efforts to formulate the common spatial risk. The mean found by local spatial autocorrelation analysis is utilized to identify spatial cluster patterns. There is naturally great interest in discovering the relationship between the leading causes of death and well-documented spatial risk factors. For example, in Taiwan, we found the geographical distribution of clusters where there is a prevalence of tuberculosis to closely correspond to the location of aboriginal townships. Conclusions Cluster mapping helps to clarify issues such as the spatial aspects of both internal and external correlations for leading health care events. This is of great aid in assessing spatial risk factors, which in turn facilitates the planning of the most advantageous types of health care policies and implementation of effective health care services. PMID:20003460
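
    As a rough illustration of the Global Moran's I statistic mentioned in the Methods, the sketch below computes I = (n / S0) * (z'Wz) / (z'z), where z holds the deviations from the mean and S0 is the sum of all spatial weights. The four-area adjacency matrix and the values are hypothetical and are not taken from the Taiwan data.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I for values observed on n areas with an n x n weight matrix."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()                 # deviations from the mean
    s0 = w.sum()                     # sum of all spatial weights
    return (len(x) / s0) * (z @ w @ z) / (z @ z)

# Hypothetical example: four areas in a row, neighbours share a border.
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

clustered = morans_i([1, 2, 3, 4], w)     # similar values sit next to each other
alternating = morans_i([1, 4, 1, 4], w)   # neighbours are dissimilar
```

    Positive values indicate that similar values cluster in space and negative values indicate that neighbours tend to differ; significance is normally assessed separately, for example by permutation.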

  13. Landscape controls on total and methyl Hg in the Upper Hudson River basin, New York, USA

    USGS Publications Warehouse

    Burns, Douglas A.; Riva-Murray, K.; Bradley, P.M.; Aiken, G.R.; Brigham, M.E.

    2012-01-01

    Approaches are needed to better predict spatial variation in riverine Hg concentrations across heterogeneous landscapes that include mountains, wetlands, and open waters. We applied multivariate linear regression to determine the landscape factors and chemical variables that best account for the spatial variation of total Hg (THg) and methyl Hg (MeHg) concentrations in 27 sub-basins across the 493 km2 upper Hudson River basin in the Adirondack Mountains of New York. THg concentrations varied by sixfold, and those of MeHg by 40-fold in synoptic samples collected at low-to-moderate flow, during spring and summer of 2006 and 2008. Bivariate linear regression relations of THg and MeHg concentrations with either percent wetland area or DOC concentrations were significant but could account for only about 1/3 of the variation in these Hg forms in summer. In contrast, multivariate linear regression relations that included metrics of (1) hydrogeomorphology, (2) riparian/wetland area, and (3) open water, explained about 66% to >90% of spatial variation in each Hg form in spring and summer samples. These metrics reflect the influence of basin morphometry and riparian soils on Hg source and transport, and the role of open water as a Hg sink. Multivariate models based solely on these landscape metrics generally accounted for as much or more of the variation in Hg concentrations than models based on chemical and physical metrics, and show great promise for identifying waters with expected high Hg concentrations in the Adirondack region and similar glaciated riverine ecosystems.

  14. Sensitivity of snowpack storage to precipitation and temperature using spatial and temporal analog models

    NASA Astrophysics Data System (ADS)

    Luce, Charles H.; Lopez-Burgos, Viviana; Holden, Zachary

    2014-12-01

    Empirical sensitivity analyses are important for evaluation of the effects of a changing climate on water resources and ecosystems. Although mechanistic models are commonly applied for evaluation of climate effects for snowmelt, empirical relationships provide a first-order validation of the various postulates required for their implementation. Previous studies of empirical sensitivity for April 1 snow water equivalent (SWE) in the western United States were developed by regressing interannual variations in SWE to winter precipitation and temperature. This offers a temporal analog for climate change, positing that a warmer future looks like warmer years. Spatial analogs are used to hypothesize that a warmer future may look like warmer places, and are frequently applied alternatives for complex processes, or states/metrics that show little interannual variability (e.g., forest cover). We contrast spatial and temporal analogs for sensitivity of April 1 SWE and the mean residence time of snow (SRT) using data from 524 Snowpack Telemetry (SNOTEL) stations across the western U.S. We built relatively strong models using spatial analogs to relate temperature and precipitation climatology to snowpack climatology (April 1 SWE, R2=0.87, and SRT, R2=0.81). Although the poorest temporal analog relationships were in areas showing the highest sensitivity to warming, spatial analog models showed consistent performance throughout the range of temperature and precipitation. Generally, slopes from the spatial relationships showed greater thermal sensitivity than the temporal analogs, and high elevation stations showed greater vulnerability using a spatial analog than shown in previous modeling and sensitivity studies. The spatial analog models provide a simple perspective to evaluate potential futures and may be useful in further evaluation of snowpack with warming.

  15. Grassland and cropland net ecosystem production of the U.S. Great Plains: Regression tree model development and comparative analysis

    USGS Publications Warehouse

    Wylie, Bruce K.; Howard, Daniel; Dahal, Devendra; Gilmanov, Tagir; Ji, Lei; Zhang, Li; Smith, Kelcy

    2016-01-01

    This paper presents the methodology and results of two ecological-based net ecosystem production (NEP) regression tree models capable of upscaling measurements made at various flux tower sites throughout the U.S. Great Plains. Separate grassland and cropland NEP regression tree models were trained using various remote sensing data and other biogeophysical data, along with 15 flux towers contributing to the grassland model and 15 flux towers for the cropland model. The models yielded weekly mean daily grassland and cropland NEP maps of the U.S. Great Plains at 250 m resolution for 2000–2008. The grassland and cropland NEP maps were spatially summarized and statistically compared. The results of this study indicate that grassland and cropland ecosystems generally performed as weak net carbon (C) sinks, absorbing more C from the atmosphere than they released from 2000 to 2008. Grasslands demonstrated higher carbon sink potential (139 g C·m−2·year−1) than non-irrigated croplands. A closer look into the weekly time series reveals the C fluctuation through time and space for each land cover type.

  16. Non-Gaussian spatiotemporal simulation of multisite daily precipitation: downscaling framework

    NASA Astrophysics Data System (ADS)

    Ben Alaya, M. A.; Ouarda, T. B. M. J.; Chebana, F.

    2018-01-01

    Probabilistic regression approaches for downscaling daily precipitation are very useful. They provide the whole conditional distribution at each forecast step to better represent the temporal variability. The question addressed in this paper is: how to simulate spatiotemporal characteristics of multisite daily precipitation from probabilistic regression models? Recent publications point out the complexity of multisite properties of daily precipitation and highlight the need for using a non-Gaussian flexible tool. This work proposes a reasonable compromise between simplicity and flexibility avoiding model misspecification. A suitable nonparametric bootstrapping (NB) technique is adopted. A downscaling model which merges a vector generalized linear model (VGLM as a probabilistic regression tool) and the proposed bootstrapping technique is introduced to simulate realistic multisite precipitation series. The model is applied to data sets from the southern part of the province of Quebec, Canada. It is shown that the model is capable of reproducing both at-site properties and the spatial structure of daily precipitations. Results indicate the superiority of the proposed NB technique, over a multivariate autoregressive Gaussian framework (i.e. Gaussian copula).

  17. Predicting spatio-temporal failure in large scale observational and micro scale experimental systems

    NASA Astrophysics Data System (ADS)

    de las Heras, Alejandro; Hu, Yong

    2006-10-01

    Forecasting has become an essential part of modern thought, but the practical limitations still are manifold. We addressed future rates of change by comparing models that take into account time, and models that focus more on space. Cox regression confirmed that linear change can be safely assumed in the short-term. Spatially explicit Poisson regression provided a ceiling value for the number of deforestation spots. With several observed and estimated rates, it was decided to forecast using the more robust assumptions. A Markov-chain cellular automaton thus projected 5-year deforestation in the Amazonian Arc of Deforestation, showing that even a stable rate of change would largely deplete the forest area. More generally, resolution and implementation of the existing models could explain many of the modelling difficulties still affecting forecasting.

  18. A review of spatio-temporal modelling of quadrat count data with application to striga occurrence in a pearl millet field

    NASA Astrophysics Data System (ADS)

    Hess, Dale; van Lieshout, Marie-Colette; Payne, Bill; Stein, Alfred

    This paper describes how spatial statistical techniques may be used to analyse weed occurrence in tropical fields. Quadrat counts of weed numbers are available over a series of years, as well as data on explanatory variables, and the aim is to smooth the data and assess spatial and temporal trends. We review a range of models for correlated count data. As an illustration, we consider data on striga infestation of a 60 × 24 m² millet field in Niger collected from 1985 until 1991, modelled by independent Poisson counts and a prior auto regression term enforcing spatial coherence. The smoothed fields show the presence of a seed bank; the estimated model parameters indicate a decay in the striga numbers over time, as well as a clear correlation with the amount of rainfall in 15 consecutive days following the sowing date. Such results could contribute to precision agriculture as a guide to more cost-effective striga control strategies.

  19. Measurement error in epidemiologic studies of air pollution based on land-use regression models.

    PubMed

    Basagaña, Xavier; Aguilera, Inmaculada; Rivera, Marcela; Agis, David; Foraster, Maria; Marrugat, Jaume; Elosua, Roberto; Künzli, Nino

    2013-10-15

    Land-use regression (LUR) models are increasingly used to estimate air pollution exposure in epidemiologic studies. These models use air pollution measurements taken at a small set of locations and modeling based on geographical covariates for which data are available at all study participant locations. The process of LUR model development commonly includes a variable selection procedure. When LUR model predictions are used as explanatory variables in a model for a health outcome, measurement error can lead to bias of the regression coefficients and to inflation of their variance. In previous studies dealing with spatial predictions of air pollution, bias was shown to be small while most of the effect of measurement error was on the variance. In this study, we show that in realistic cases where LUR models are applied to health data, bias in health-effect estimates can be substantial. This bias depends on the number of air pollution measurement sites, the number of available predictors for model selection, and the amount of explainable variability in the true exposure. These results should be taken into account when interpreting health effects from studies that used LUR models.

  20. SPATIAL ANALYSIS OF AIR POLLUTION AND DEVELOPMENT OF A LAND-USE REGRESSION (LUR) MODEL IN AN URBAN AIRSHED

    EPA Science Inventory

    The Detroit Children's Health Study is an epidemiologic study examining associations between chronic ambient environmental exposures to gaseous air pollutants and respiratory health outcomes among elementary school-age children in an urban airshed. The exposure component of this...

  1. Spatial assessment of soluble solid contents on apple slices using hyperspectral imaging

    USDA-ARS's Scientific Manuscript database

    A partial least squares regression (PLSR) model to map internal soluble solids content (SSC) of apples using visible/near-infrared (VNIR) hyperspectral imaging was developed. The reflectance spectra of sliced apples were extracted from hyperspectral absorbance images obtained in the 400–1000 nm rang...

  2. Does context matter for the relationship between deprivation and all-cause mortality? The West vs. the rest of Scotland

    PubMed Central

    2011-01-01

    Background A growing body of research emphasizes the importance of contextual factors on health outcomes. Using postcode sector data for Scotland (UK), this study tests the hypothesis of spatial heterogeneity in the relationship between area-level deprivation and mortality to determine if contextual differences in the West vs. the rest of Scotland influence this relationship. Research into health inequalities frequently fails to recognise spatial heterogeneity in the deprivation-health relationship, assuming that global relationships apply uniformly across geographical areas. In this study, exploratory spatial data analysis methods are used to assess local patterns in deprivation and mortality. Spatial regression models are then implemented to examine the relationship between deprivation and mortality more formally. Results The initial exploratory spatial data analysis reveals concentrations of high standardized mortality ratios (SMR) and deprivation (hotspots) in the West of Scotland and concentrations of low values (coldspots) for both variables in the rest of the country. The main spatial regression result is that deprivation is the only variable that is highly significantly correlated with all-cause mortality in all models. However, in contrast to the expected spatial heterogeneity in the deprivation-mortality relationship, this relation does not vary between regions in any of the models. This result is robust to a number of specifications, including weighting for population size, controlling for spatial autocorrelation and heteroskedasticity, assuming a non-linear relationship between mortality and socio-economic deprivation, separating the dependent variable into male and female SMRs, and distinguishing between West, North and Southeast regions. 
    The rejection of the hypothesis of spatial heterogeneity in the relationship between socio-economic deprivation and mortality complements prior research on the stability of the deprivation-mortality relationship over time. Conclusions The homogeneity we found in the deprivation-mortality relationship across the regions of Scotland and the absence of a contextualized effect of region highlights the importance of taking a broader strategic policy that can combat the toxic impacts of socio-economic deprivation on health. Focusing on a few specific places (e.g. 15% of the poorest areas) to concentrate resources might be a good start but the impact of socio-economic deprivation on mortality is not restricted to a few places. A comprehensive strategy that can be sustained over time might be needed to interrupt the linkages between poverty and mortality. PMID:21569408

  3. Geospatial Predictive Modelling for Climate Mapping of Selected Severe Weather Phenomena Over Poland: A Methodological Approach

    NASA Astrophysics Data System (ADS)

    Walawender, Ewelina; Walawender, Jakub P.; Ustrnul, Zbigniew

    2017-02-01

    The main purpose of the study is to introduce methods for mapping the spatial distribution of the occurrence of selected atmospheric phenomena (thunderstorms, fog, glaze and rime) over Poland from 1966 to 2010 (45 years). Limited in situ observations as well as the discontinuous and location-dependent nature of these phenomena make traditional interpolation inappropriate. Spatially continuous maps were created with the use of geospatial predictive modelling techniques. For each given phenomenon, an algorithm identifying its favourable meteorological and environmental conditions was created on the basis of observations recorded at 61 weather stations in Poland. Annual frequency maps presenting the probability of a day with a thunderstorm, fog, glaze or rime were created with the use of a modelled, gridded dataset by implementing predefined algorithms. Relevant explanatory variables were derived from NCEP/NCAR reanalysis and downscaled with the use of a Regional Climate Model. The resulting maps of favourable meteorological conditions were found to be valuable and representative on the country scale but at different correlation (r) strength against in situ data (from r = 0.84 for thunderstorms to r = 0.15 for fog). A weak correlation between gridded estimates of fog occurrence and observations data indicated the very local nature of this phenomenon. For this reason, additional environmental predictors of fog occurrence were also examined. Topographic parameters derived from the SRTM elevation model and reclassified CORINE Land Cover data were used as the external, explanatory variables for the multiple linear regression kriging used to obtain the final map. The regression model explained 89% of annual frequency of fog variability in the study area. Regression residuals were interpolated via simple kriging.
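
    The regression kriging step described at the end of this abstract (a linear regression trend plus simple kriging of the residuals) might be sketched as below. This is only an illustration under stated assumptions: the exponential covariance model, the sill and range values, the function names, and the synthetic station data are all hypothetical, and in practice the covariance parameters would be fitted to a residual variogram rather than assumed.

```python
import numpy as np

def exp_cov(dist, sill, corr_range):
    """Exponential covariance model (an assumed choice, no nugget)."""
    return sill * np.exp(-dist / corr_range)

def regression_kriging(coords, X, y, new_coords, new_X, sill, corr_range):
    """OLS trend plus simple kriging (zero mean) of the OLS residuals."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)      # regression coefficients
    resid = y - A @ beta                              # residuals at stations
    d_obs = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    d_new = np.linalg.norm(new_coords[:, None, :] - coords[None, :, :], axis=2)
    C = exp_cov(d_obs, sill, corr_range)              # station-to-station covariance
    c0 = exp_cov(d_new, sill, corr_range)             # target-to-station covariance
    weights = np.linalg.solve(C, c0.T)                # simple kriging weights
    resid_new = weights.T @ resid                     # interpolated residuals
    trend_new = np.column_stack([np.ones(len(new_X)), new_X]) @ beta
    return trend_new + resid_new

# Synthetic stand-in data: six stations, one terrain predictor.
rng = np.random.default_rng(42)
coords = rng.uniform(0, 1, size=(6, 2))
elevation = rng.uniform(100, 500, size=6)
fog_days = 20 + 0.05 * elevation + rng.normal(0, 2, size=6)

# Predicting back at the stations reproduces the observations exactly,
# because simple kriging without a nugget is an exact interpolator.
pred = regression_kriging(coords, elevation, fog_days,
                          coords, elevation, sill=4.0, corr_range=0.5)
```

    Away from the stations the prediction relaxes toward the regression trend as distance grows beyond the assumed correlation range.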

  4. Exploration of walking behavior in Vermont using spatial regression.

    DOT National Transportation Integrated Search

    2015-06-01

    This report focuses on the relationship between walking and its contributing factors by applying spatial regression methods. Using the Vermont data from the New England Transportation Survey (NETS), walking variables as well as 170 independent va...

  5. Shifts of environmental and phytoplankton variables in a regulated river: A spatial-driven analysis.

    PubMed

    Sabater-Liesa, Laia; Ginebreda, Antoni; Barceló, Damià

    2018-06-18

    The longitudinal structure of the environmental and phytoplankton variables was investigated in the Ebro River (NE Spain), which is heavily affected by water abstraction and regulation. A first exploration indicated that the phytoplankton community did not resist the impact of reservoirs and barely recovered downstream of them. The spatial analysis showed that the responses of the phytoplankton and environmental variables were not uniform. The two set of variables revealed spatial variability discontinuities and river fragmentation upstream and downstream from the reservoirs. Reservoirs caused the replacement of spatially heterogeneous habitats by homogeneous spatially distributed water bodies, these new environmental conditions downstream benefiting the opportunist and cosmopolitan algal taxa. The application of a spatial auto-regression model to algal biomass (chlorophyll-a) permitted to capture the relevance and contribution of extra-local influences in the river ecosystem. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.

  6. Bias and uncertainty in regression-calibrated models of groundwater flow in heterogeneous media

    USGS Publications Warehouse

    Cooley, R.L.; Christensen, S.

    2006-01-01

    Groundwater models need to account for detailed but generally unknown spatial variability (heterogeneity) of the hydrogeologic model inputs. To address this problem we replace the large, m-dimensional stochastic vector β that reflects both small and large scales of heterogeneity in the inputs by a lumped or smoothed m-dimensional approximation γθ*, where γ is an interpolation matrix and θ* is a stochastic vector of parameters. Vector θ* has small enough dimension to allow its estimation with the available data. The consequence of the replacement is that model function f(γθ*) written in terms of the approximate inputs is in error with respect to the same model function written in terms of β, f(β), which is assumed to be nearly exact. The difference f(β) - f(γθ*), termed model error, is spatially correlated, generates prediction biases, and causes standard confidence and prediction intervals to be too small. Model error is accounted for in the weighted nonlinear regression methodology developed to estimate θ* and assess model uncertainties by incorporating the second-moment matrix of the model errors into the weight matrix. Techniques developed by statisticians to analyze classical nonlinear regression methods are extended to analyze the revised method. The analysis develops analytical expressions for bias terms reflecting the interaction of model nonlinearity and model error, for correction factors needed to adjust the sizes of confidence and prediction intervals for this interaction, and for correction factors needed to adjust the sizes of confidence and prediction intervals for possible use of a diagonal weight matrix in place of the correct one. If terms expressing the degree of intrinsic nonlinearity for f(β) and f(γθ*) are small, then most of the biases are small and the correction factors are reduced in magnitude. 
    Biases, correction factors, and confidence and prediction intervals were obtained for a test problem for which model error is large to test robustness of the methodology. Numerical results conform with the theoretical analysis. © 2005 Elsevier Ltd. All rights reserved.

  7. Price, Weather, and 'Acreage Abandonment' in Western Great Plains Wheat Culture.

    NASA Astrophysics Data System (ADS)

    Michaels, Patrick J.

    1983-07-01

    Multivariate analyses of acreage abandonment patterns in the U.S. Great Plains winter wheat region indicate that the major mode of variation is an in-phase oscillation confined to the western half of the overall area, which is also the area with lowest average yields. This is one of the more agroclimatically marginal environments in the United States, with wide interannual fluctuations in both climate and profitability. We developed a multiple regression model to determine the relative roles of weather and expected price in the decision not to harvest. The overall model explained 77% of the spatial and temporal variation in abandonment. The 36.5% of the non-spatial variation was explained by two simple transformations of climatic data from three monthly aggregates: September-October, November-February and March-April. Price factors, expressed as indexed future delivery quotations, were barely significant, with only between 3 and 5% of the non-spatial variation explained, depending upon the model. The model was based upon weather, climate and price data from 1932 through 1975. It was tested by sequentially withholding three-year blocks of data, and using the respecified regression coefficients, along with observed weather and price, to estimate abandonment in the withheld years. Error analyses indicate no loss of model fidelity in the test mode. Also, prediction errors in the 1970-75 period, characterized by widely fluctuating prices, were not different from those in the rest of the model. The overall results suggest that the perceived quality of the crop, as influenced by weather, is a much more important determinant of the abandonment decision than are expected returns based upon price considerations.
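
    The test procedure described here (withhold a three-year block, refit the regression on the remaining years, then predict the withheld years) can be sketched roughly as follows. The function name, the ordinary least squares fit via numpy, and the synthetic noise-free predictors are assumptions made for illustration; they do not reproduce the study's weather and price variables.

```python
import numpy as np

def blocked_holdout_errors(years, X, y, block_len=3):
    """Refit OLS with each consecutive block of years withheld; return prediction errors."""
    years = np.asarray(years)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    errors = np.empty(len(y))
    unique_years = np.unique(years)
    for start in range(0, len(unique_years), block_len):
        block = unique_years[start:start + block_len]
        test = np.isin(years, block)                  # rows in the withheld block
        coef, *_ = np.linalg.lstsq(design[~test], y[~test], rcond=None)
        errors[test] = design[test] @ coef - y[test]  # out-of-sample error
    return errors

# Synthetic stand-in for the 1932-1975 record: two predictors, no noise.
rng = np.random.default_rng(0)
years = np.arange(1932, 1976)
X = rng.normal(size=(len(years), 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1]
errors = blocked_holdout_errors(years, X, y)
```

    Withholding contiguous blocks rather than random years keeps temporally adjacent, possibly correlated, observations out of the fitting set, which gives a more honest picture of out-of-sample error. With 44 years and three-year blocks, the final block in this sketch holds only two years.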

  8. Implications of a lightning-rich tundra biome for permafrost carbon and vegetation dynamics

    NASA Astrophysics Data System (ADS)

    Chen, Y.; Veraverbeke, S.; Randerson, J. T.

    2017-12-01

    Lightning is a major ignition source of wildfires in circumpolar boreal forests but rarely occurs in arctic tundra. While theoretical and empirical work suggests that climate change will increase lightning strikes in temperate regions, much less is known about future changes in lightning across terrestrial ecosystems at high northern latitudes. Here we analyzed the spatial and temporal patterns of lightning flash rate (FR) from the satellite observations and surface detection networks. Regression models between the observed FR from the Optical Transient Detector on the MicroLab-1 satellite (later renamed OV-1) and meteorological parameters, including surface temperature (T), convective available potential energy (CAPE), and convective precipitation (CP) from ECMWF (European Centre for Medium-Range Weather Forecasts) ERA-interim reanalysis, were established and assessed. We found that FR had significant linear correlations with CAPE and CP, and a strong non-linear relationship with T. The statistical model based on T and CP can reproduce most of the spatial and temporal variability in FR in the circumpolar region. By using the regression model and meteorological predictions from 24 earth system models in the Coupled Model Intercomparison Project Phase 5 (CMIP5), we estimated the spatial distribution of FR by the end of the 21st century. Due to increases in surface temperature and convection, modeled FR shows substantial increase in northern biomes, including a 338% change in arctic tundra and a 185% change in regions with permafrost soil carbon reservoirs. These changes highlight a new mechanism by which permafrost carbon is vulnerable to the sustained impacts of climate warming. Increased fire in a warmer and lightning-rich future near the treeline has the potential to accelerate the northward migration of trees, which may further enhance warming and the abundance of lightning strikes.

  9. Normal tissue complication probability (NTCP) modelling using spatial dose metrics and machine learning methods for severe acute oral mucositis resulting from head and neck radiotherapy.

    PubMed

    Dean, Jamie A; Wong, Kee H; Welsh, Liam C; Jones, Ann-Britt; Schick, Ulrike; Newbold, Kate L; Bhide, Shreerang A; Harrington, Kevin J; Nutting, Christopher M; Gulliford, Sarah L

    2016-07-01

    Severe acute mucositis commonly results from head and neck (chemo)radiotherapy. A predictive model of mucositis could guide clinical decision-making and inform treatment planning. We aimed to generate such a model using spatial dose metrics and machine learning. Predictive models of severe acute mucositis were generated using radiotherapy dose (dose-volume and spatial dose metrics) and clinical data. Penalised logistic regression, support vector classification and random forest classification (RFC) models were generated and compared. Internal validation was performed (with 100-iteration cross-validation), using multiple metrics, including area under the receiver operating characteristic curve (AUC) and calibration slope, to assess performance. Associations between covariates and severe mucositis were explored using the models. The dose-volume-based models (standard) performed as well as those incorporating spatial information. Discrimination was similar between models, but the RFCstandard had the best calibration. The mean AUC and calibration slope for this model were 0.71 (s.d.=0.09) and 3.9 (s.d.=2.2), respectively. The volumes of oral cavity receiving intermediate and high doses were associated with severe mucositis. The RFCstandard model performance is modest-to-good, but should be improved, and requires external validation. Reducing the volumes of oral cavity receiving intermediate and high doses may reduce mucositis incidence. Copyright © 2016 The Author(s). Published by Elsevier Ireland Ltd. All rights reserved.

  10. A Hybrid Model for Spatially and Temporally Resolved Ozone Exposures in the Continental United States

    PubMed Central

    Di, Qian; Rowland, Sebastian; Koutrakis, Petros; Schwartz, Joel

    2017-01-01

    Ground-level ozone is an important atmospheric oxidant, which exhibits considerable spatial and temporal variability in its concentration level. Existing modeling approaches for ground-level ozone include chemical transport models, land-use regression, Kriging, and data fusion of chemical transport models with monitoring data. Each of these methods has both strengths and weaknesses. Combining those complementary approaches could improve model performance. Meanwhile, satellite-based total column ozone, combined with ozone vertical profile, is another potential input. We propose a hybrid model that integrates the above variables to achieve spatially and temporally resolved exposure assessments for ground-level ozone. We used a neural network for its capacity to model interactions and nonlinearity. Convolutional layers, which use convolution kernels to aggregate nearby information, were added to the neural network to account for spatial and temporal autocorrelation. We trained the model with AQS 8-hour daily maximum ozone in the continental United States from 2000 to 2012 and tested it with left out monitoring sites. Cross-validated R2 on the left out monitoring sites ranged from 0.74 to 0.80 (mean 0.76) for predictions on 1 km×1 km grid cells, which indicates good model performance. Model performance remains good even at low ozone concentrations. The prediction results facilitate epidemiological studies to assess the health effect of ozone in the long term and the short term. PMID:27332675

  11. Spatial prediction of soil texture in region Centre (France) from summary data

    NASA Astrophysics Data System (ADS)

    Dobarco, Mercedes Roman; Saby, Nicolas; Paroissien, Jean-Baptiste; Orton, Tom G.

    2015-04-01

    Soil texture is a key controlling factor of important soil functions like water and nutrient holding capacity, retention of pollutants, drainage, soil biodiversity, and C cycling. High resolution soil texture maps enhance our understanding of the spatial distribution of soil properties and provide valuable information for decision making and crop management, environmental protection, and hydrological planning. We predicted the soil texture of agricultural topsoils in the Region Centre (France) combining regression and area-to-point kriging. Soil texture data were collected from the French soil-test database (BDAT), which is populated with soil analyses performed at farmers' request. To protect the anonymity of the farms the data were aggregated by commune. In a first step, summary statistics of environmental covariates by commune were used to develop prediction models with Cubist, boosted regression trees, and random forests. In a second step the residuals of each individual observation were summarized by commune and kriged following the method developed by Orton et al. (2012). This approach made it possible to include non-linear relationships among covariates and soil texture while accounting for the uncertainty on areal means in the area-to-point kriging step. Independent validation of the models was done using data from the systematic soil monitoring network of French soils. Future work will compare the performance of these models with a non-stationary variance geostatistical model using the most important covariates and summary statistics of texture data. The results will show whether the latter, statistically more challenging approach significantly improves texture predictions or whether the simpler area-to-point regression kriging can offer satisfactory results.
The application of area-to-point regression kriging at national level using BDAT data has the potential to improve soil texture predictions for agricultural topsoils, especially when combined with existing maps (i.e., model ensemble).

  12. A comparison of multi-spectral, multi-angular, and multi-temporal remote sensing datasets for fractional shrub canopy mapping in Arctic Alaska

    USGS Publications Warehouse

    Selkowitz, D.J.

    2010-01-01

    Shrub cover appears to be increasing across many areas of the Arctic tundra biome, and increasing shrub cover in the Arctic has the potential to significantly impact global carbon budgets and the global climate system. For most of the Arctic, however, there is no existing baseline inventory of shrub canopy cover, as existing maps of Arctic vegetation provide little information about the density of shrub cover at a moderate spatial resolution across the region. Remotely-sensed fractional shrub canopy maps can provide this necessary baseline inventory of shrub cover. In this study, we compare the accuracy of fractional shrub canopy (> 0.5 m tall) maps derived from multi-spectral, multi-angular, and multi-temporal datasets from Landsat imagery at 30 m spatial resolution, Moderate Resolution Imaging SpectroRadiometer (MODIS) imagery at 250 m and 500 m spatial resolution, and MultiAngle Imaging Spectroradiometer (MISR) imagery at 275 m spatial resolution for a 1067 km2 study area in Arctic Alaska. The study area is centered at 69°N, ranges in elevation from 130 to 770 m, is composed primarily of rolling topography with gentle slopes less than 10°, and is free of glaciers and perennial snow cover. Shrubs > 0.5 m in height cover 2.9% of the study area and are primarily confined to patches associated with specific landscape features. Reference fractional shrub canopy is determined from in situ shrub canopy measurements and a high spatial resolution IKONOS image swath. Regression tree models are constructed to estimate fractional canopy cover at 250 m using different combinations of input data from Landsat, MODIS, and MISR. Results indicate that multi-spectral data provide substantially more accurate estimates of fractional shrub canopy cover than multi-angular or multi-temporal data. 
Higher spatial resolution datasets also provide more accurate estimates of fractional shrub canopy cover (aggregated to moderate spatial resolutions) than lower spatial resolution datasets, an expected result for a study area where most shrub cover is concentrated in narrow patches associated with rivers, drainages, and slopes. Including the middle infrared bands available from Landsat and MODIS in the regression tree models (in addition to the four standard visible and near-infrared spectral bands) typically results in a slight boost in accuracy. Including the multi-angular red band data available from MISR in the regression tree models, however, typically boosts accuracy more substantially, resulting in moderate resolution fractional shrub canopy estimates approaching the accuracy of estimates derived from the much higher spatial resolution Landsat sensor. Given the poor availability of snow and cloud-free Landsat scenes in many areas of the Arctic and the promising results demonstrated here by the MISR sensor, MISR may be the best choice for large area fractional shrub canopy mapping in the Alaskan Arctic for the period 2000-2009.

  13. Evaluation of Ordinary Least Square (OLS) and Geographically Weighted Regression (GWR) for Water Quality Monitoring: A Case Study for the Estimation of Salinity

    NASA Astrophysics Data System (ADS)

    Nazeer, Majid; Bilal, Muhammad

    2018-04-01

    Landsat-5 Thematic Mapper (TM) data have been used to estimate salinity in the coastal area of Hong Kong. Four adjacent Landsat TM images were used in this study, which were atmospherically corrected using the Second Simulation of the Satellite Signal in the Solar Spectrum (6S) radiative transfer code. The atmospherically corrected images were further used to develop models for salinity using Ordinary Least Square (OLS) regression and Geographically Weighted Regression (GWR) based on in situ data of October 2009. Results show that the coefficient of determination (R2) of 0.42 between the OLS estimated and in situ measured salinity is much lower than that of the GWR model, which is two times higher (R2 = 0.86). This indicates that the GWR model is better able than the OLS regression model to predict salinity and to show its spatial heterogeneity. It was observed that the salinity was high in Deep Bay (north-western part of Hong Kong), which might be due to industrial waste disposal, whereas the salinity was estimated to be constant (32 practical salinity units) towards the open sea.
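
    To illustrate why a geographically weighted fit can reach a higher R2 than a single global OLS fit, the sketch below fits one distance-weighted least-squares regression per location. It is a minimal sketch and not the study's implementation: the data are synthetic, the Gaussian kernel and fixed bandwidth are assumptions, and `gwr_coefficients` is a hypothetical helper name.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Fit one weighted least-squares regression per observation location.

    coords: (n, 2) locations; X: (n, p) predictors without intercept; y: (n,).
    Returns an (n, p + 1) array of local coefficients (intercept first).
    """
    n = len(y)
    A = np.column_stack([np.ones(n), X])           # design matrix with intercept
    betas = np.empty((n, A.shape[1]))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)    # Gaussian distance-decay weights
        Aw = A * w[:, None]
        betas[i] = np.linalg.solve(A.T @ Aw, Aw.T @ y)
    return betas

def r_squared(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic data (assumption): the slope drifts from west to east.
rng = np.random.default_rng(0)
n = 200
coords = rng.uniform(0, 10, size=(n, 2))
x = rng.normal(size=n)
y = 1.0 + (0.5 + 0.3 * coords[:, 0]) * x + rng.normal(scale=0.1, size=n)

A = np.column_stack([np.ones(n), x])
beta_ols, *_ = np.linalg.lstsq(A, y, rcond=None)   # one global coefficient set
r2_ols = r_squared(y, A @ beta_ols)

betas = gwr_coefficients(coords, x[:, None], y, bandwidth=1.0)
r2_gwr = r_squared(y, np.sum(A * betas, axis=1))   # each point uses its local fit
print(f"OLS R2 = {r2_ols:.2f}, GWR R2 = {r2_gwr:.2f}")
```

    Both R2 values here are in-sample, so the comparison favours the more flexible local model; a fair assessment would also use held-out locations.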

  14. Characterizing the relationship between Asian tiger mosquito abundance and habitat in urban New Jersey

    NASA Astrophysics Data System (ADS)

    Ferwerda, Carolin

    2009-12-01

    Since its introduction to North America in 1987, the Asian tiger mosquito (Aedes albopictus) has spread rapidly. Due to its unique ecology and preference for container breeding sites, Ae. albopictus commonly inhabits urban/suburban areas and is often in close contact with humans. An aggressive pest, this mosquito species is a vector of multiple arboviruses. In order for mosquito control efforts to remain effective, control of this important vector must be guided by spatially explicit habitat models that aid in predicting mosquito outbreaks. Using linear regression, I determined the relationship between adult Ae. albopictus abundance and climate, census, and land use factors in nine urban/suburban study sites in central New Jersey. Systematically collected adult counts (females and males) from July to October 2008 served as estimates of abundance. Fine-scale land use/land cover data were obtained from object-oriented classifications of 2007 CIR orthophotos in Definiens eCognition. Mosquito abundance data were tested for spatial autocorrelation via Moran's I, semivariograms, and hotspot analysis in order to reveal consistent patterns in abundance. Spatial pattern analysis produced little evidence of consistent spatial autocorrelation, though several sites exhibited recurring hotspots, especially in areas near residential housing and vegetation. Stepwise multiple regression was able to explain 20-25 percent of variation in Ae. albopictus abundance at the 'backyard' or cell level and 72-78 percent of variation in abundance at the 'neighborhood' or study site level. Meteorological variables (temperature on the trap date and precipitation), census variables (vacant housing units and population density), and more detailed land use/land cover classes (deciduous woody vegetation, rights-of-way and vacant lots) were frequently selected in all eight models, though many other independent variables were included in the individual models. 
The results of the spatial statistics suggest that clustering may occur at a broader extent, while the superior predictive ability of the site level models over the finer grain cell level models supports this conclusion. Future work should focus on validating these models with 2009 field data and testing whether finer grain weather and census data enhance the models' predictive ability. Given the major differences between individual county models, future studies should further explore variations in Ae. albopictus habitat preferences in different geographic locations.
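
    Moran's I, one of the autocorrelation tests named in this abstract, can be computed directly from a value vector and a spatial weights matrix. The sketch below is illustrative only: the four-site chain and its binary neighbour weights are assumptions chosen so the result can be checked by hand, and they are unrelated to the trap layout used in the study.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I: (n / sum(w)) * (z' W z) / (z' z), with z mean-centred."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    w = np.asarray(weights, dtype=float)
    return (len(z) / w.sum()) * (z @ w @ z) / (z @ z)

def chain_weights(n):
    """Binary weights linking each site to its immediate neighbours on a line."""
    w = np.zeros((n, n))
    idx = np.arange(n - 1)
    w[idx, idx + 1] = 1.0
    w[idx + 1, idx] = 1.0
    return w

w = chain_weights(4)
clustered = [1.0, 1.0, -1.0, -1.0]      # similar values sit next to each other
alternating = [1.0, -1.0, 1.0, -1.0]    # every neighbour has the opposite value

i_clustered = morans_i(clustered, w)
i_alternating = morans_i(alternating, w)
print(i_clustered, i_alternating)
```

    Positive values indicate that neighbouring sites resemble each other and negative values that they differ; judging significance would additionally need a permutation or analytical test, which this sketch omits.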

  15. Predicting space telerobotic operator training performance from human spatial ability assessment

    NASA Astrophysics Data System (ADS)

    Liu, Andrew M.; Oman, Charles M.; Galvan, Raquel; Natapoff, Alan

    2013-11-01

    Our goal was to determine whether existing tests of spatial ability can predict an astronaut's qualification test performance after robotic training. Because training astronauts to be qualified robotics operators is so long and expensive, NASA is interested in tools that can predict robotics performance before training begins. Currently, the Astronaut Office does not have a validated tool to predict robotics ability as part of its astronaut selection or training process. Commonly used tests of human spatial ability may provide such a tool to predict robotics ability. We tested the spatial ability of 50 active astronauts who had completed at least one robotics training course, then used logistic regression models to analyze the correlation between spatial ability test scores and the astronauts' performance in their evaluation test at the end of the training course. The fit of the logistic function to our data is statistically significant for several spatial tests. However, the prediction performance of the logistic model depends on the criterion threshold assumed. To clarify the critical selection issues, we show how the probability of correct classification vs. misclassification varies as a function of the mental rotation test criterion level. Since the costs of misclassification are low, the logistic models of spatial ability and robotic performance are reliable enough only to be used to customize regular and remedial training. We suggest several changes in tracking performance throughout robotics training that could improve the range and reliability of predictive models.

  16. A Predictive Risk Model for A(H7N9) Human Infections Based on Spatial-Temporal Autocorrelation and Risk Factors: China, 2013–2014

    PubMed Central

    Dong, Wen; Yang, Kun; Xu, Quan-Li; Yang, Yu-Lian

    2015-01-01

    This study investigated the spatial distribution, spatial autocorrelation, temporal cluster, spatial-temporal autocorrelation and probable risk factors of H7N9 outbreaks in humans from March 2013 to December 2014 in China. The results showed that the epidemic spread with significant spatial-temporal autocorrelation. In order to describe the spatial-temporal autocorrelation of H7N9, an improved model was developed by introducing a spatial-temporal factor in this paper. Logistic regression analyses were utilized to investigate the risk factors associated with their distribution, and nine risk factors were significantly associated with the occurrence of A(H7N9) human infections: the spatial-temporal factor φ (OR = 2546669.382, p < 0.001), migration route (OR = 0.993, p < 0.01), river (OR = 0.861, p < 0.001), lake (OR = 0.992, p < 0.001), road (OR = 0.906, p < 0.001), railway (OR = 0.980, p < 0.001), temperature (OR = 1.170, p < 0.01), precipitation (OR = 0.615, p < 0.001) and relative humidity (OR = 1.337, p < 0.001). The improved model obtained a better prediction performance and a higher fitting accuracy than the traditional model: in the improved model 90.1% (91/101) of the cases during February 2014 occurred in the high risk areas (predictive risk > 0.70) of the predictive risk map, whereas only 44.6% (45/101) did so for the traditional model, and the fitting accuracy of the improved model (91.6%) was superior to that of the traditional model (86.1%). The predictive risk map generated based on the improved model revealed that the east and southeast of China were the high risk areas of A(H7N9) human infections in February 2014. These results provided baseline data for the control and prevention of future human infections. PMID:26633446
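
    The odds ratios listed in this abstract are the exponentials of logistic regression coefficients, so each one can be converted back to a log-odds coefficient and read as a multiplicative change in odds per unit of the covariate. The sketch below uses three of the reported values; the covariate units are not given in the abstract, so "one unit" is left abstract, and the helper names are hypothetical.

```python
import math

# Odds ratios copied from the abstract; units of the covariates are not stated there.
odds_ratios = {"temperature": 1.170, "precipitation": 0.615, "relative humidity": 1.337}

def log_odds_coefficient(odds_ratio):
    """Logistic regression coefficient implied by an odds ratio: beta = ln(OR)."""
    return math.log(odds_ratio)

def odds_multiplier(odds_ratio, delta):
    """Factor by which the odds change when the covariate rises by `delta` units."""
    return odds_ratio ** delta

coefficients = {name: log_odds_coefficient(v) for name, v in odds_ratios.items()}
for name, beta in coefficients.items():
    direction = "raises" if beta > 0 else "lowers"
    print(f"{name}: beta = {beta:+.3f}, a one-unit increase {direction} the odds")
```

    An odds ratio below 1, as for precipitation, corresponds to a negative coefficient, meaning lower odds of infection as the covariate increases with the other covariates held fixed.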

  17. Spatial/Temporal Variations of Crime: A Routine Activity Theory Perspective.

    PubMed

    de Melo, Silas Nogueira; Pereira, Débora V S; Andresen, Martin A; Matias, Lindon Fonseca

    2018-05-01

    Temporal and spatial patterns of crime in Campinas, Brazil, are analyzed considering the relevance of routine activity theory in a Latin American context. We use geo-referenced criminal event data, 2010-2013, analyzing spatial patterns using census tracts and temporal patterns considering seasons, months, days, and hours. Our analyses include difference in means tests, count-based regression models, and Kulldorff's scan test. We find that crime in Campinas, Brazil, exhibits both temporal and spatial-temporal patterns. However, the presence of these patterns at the different temporal scales varies by crime type. Specifically, not all crime types have statistically significant temporal patterns at all scales of analysis. As such, routine activity theory works well to explain temporal and spatial-temporal patterns of crime in Campinas, Brazil. However, local knowledge of Brazilian culture is necessary for understanding a portion of these crime patterns.

  18. Not to put too fine a point on it - does increasing precision of geographic referencing improve species distribution models for a wide-ranging migratory bat?

    USGS Publications Warehouse

    Hayes, Mark A.; Ozenberger, Katharine; Cryan, Paul M.; Wunder, Michael B.

    2015-01-01

    Bat specimens held in natural history museum collections can provide insights into the distribution of species. However, there are several important sources of spatial error associated with natural history specimens that may influence the analysis and mapping of bat species distributions. We analyzed the importance of geographic referencing and error correction in species distribution modeling (SDM) using occurrence records of hoary bats (Lasiurus cinereus). This species is known to migrate long distances and is a species of increasing concern due to fatalities documented at wind energy facilities in North America. We used 3,215 museum occurrence records collected from 1950–2000 for hoary bats in North America. We compared SDM performance using five approaches: generalized linear models, multivariate adaptive regression splines, boosted regression trees, random forest, and maximum entropy models. We evaluated results using three SDM performance metrics (AUC, sensitivity, and specificity) and two data sets: one comprised of the original occurrence data, and a second data set consisting of these same records after the locations were adjusted to correct for identifiable spatial errors. The increase in precision improved the mean estimated spatial error associated with hoary bat records from 5.11 km to 1.58 km, and this reduction in error resulted in a slight increase in all three SDM performance metrics. These results provide insights into the importance of geographic referencing and the value of correcting spatial errors in modeling the distribution of a wide-ranging bat species. We conclude that the considerable time and effort invested in carefully increasing the precision of the occurrence locations in this data set was not worth the marginal gains in improved SDM performance, and it seems likely that gains would be similar for other bat species that range across large areas of the continent, migrate, and are habitat generalists.

  19. Modelling spatial patterns of urban growth in Africa

    PubMed Central

    Linard, Catherine; Tatem, Andrew J.; Gilbert, Marius

    2013-01-01

    The population of Africa is predicted to double over the next 40 years, driving exceptionally high urban expansion rates that will induce significant socio-economic, environmental and health changes. In order to prepare for these changes, it is important to better understand urban growth dynamics in Africa and better predict the spatial pattern of rural-urban conversions. Previous work on urban expansion has been carried out at the city level or at the global level with a relatively coarse 5–10 km resolution. The main objective of the present paper was to develop a modelling approach at an intermediate scale in order to identify factors that influence spatial patterns of urban expansion in Africa. Boosted Regression Tree models were developed to predict the spatial pattern of rural-urban conversions in every large African city. Urban change data between circa 1990 and circa 2000 available for 20 large cities across Africa were used as training data. Results showed that the urban land in a 1 km neighbourhood and the accessibility to the city centre were the most influential variables. Results obtained were generally more accurate than results obtained using a distance-based urban expansion model and showed that the spatial pattern of small, compact and fast growing cities was easier to simulate than that of cities with lower population densities and a lower growth rate. The simulation method developed here will allow the production of spatially detailed urban expansion forecasts for 2020 and 2025 for Africa, data that are increasingly required by global change modellers. PMID:25152552

  20. Tundra plant biomass distribution and environmental constraints on the North Slope of Alaska

    NASA Astrophysics Data System (ADS)

    Berner, L. T.; Jantz, P.; Goetz, S. J.

    2017-12-01

    Rising temperatures are increasing plant productivity and biomass in the Arctic tundra, with pronounced greening having occurred in northern Alaska during recent decades. Increasing plant biomass will drive biogeochemical and biophysical feedback to regional climate; however, the amount and spatial distribution of plant biomass remains highly uncertain in these northern ecosystems. In this study, we mapped both plant aboveground biomass (AGB) and the shrub component across the North Slope of Alaska at 30 m spatial resolution by combining satellite and field measurements, and then examined how the spatial distribution of AGB was constrained by regional climate and local topography. Specifically, we developed regression models for predicting AGB based on the Normalized Difference Vegetation Index (NDVI) derived from Landsat satellite imagery. These regression models incorporated previously published field measurements from 27 tundra locations and showed strong relationships between AGB and peak summer NDVI (r2=0.75-0.80). We then predicted AGB across the study area by combining these regression models with a peak summer NDVI composite mosaic derived from over 2,000 Landsat scenes acquired between 2007 and 2016. We also created uncertainty maps using a Monte Carlo approach. The resulting biomass maps indicated that plant AGB averaged 0.72 kg m-2 (95% CI = 0.50-1.01 kg m-2) and totaled 108 Tg (75-153 Tg) across the domain, with shrub AGB accounting for about 44% of plant AGB. Plant and shrub AGB peaked in riparian areas, where permafrost active layers are generally deeper and nutrients more readily available. Plant and shrub AGB were also strongly influenced by summer temperature, with average plant AGB doubling and shrub AGB quadrupling between areas with the coldest and warmest summers. Furthermore, the contribution of shrub AGB to total plant AGB increased with increasing summer temperatures. 
Future warming will likely increase plant AGB and the contribution from shrubs in this area, particularly in riparian areas. These plant biomass maps provide an important, spatially explicit baseline for evaluating ecosystem-climate feedbacks associated with ongoing environmental change. These maps may also inform management assessments of North Slope ecosystems and associated wildlife.
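
    The core of the workflow in this abstract, computing NDVI from red and near-infrared reflectance and regressing field biomass on it, can be sketched as follows. Everything numeric here is an assumption for illustration: the reflectance ranges, the linear form of the relationship, and its coefficients are invented, and only the sample size of 27 echoes the number of field locations mentioned above.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red)

# Synthetic reflectances and biomass for 27 hypothetical field sites.
rng = np.random.default_rng(1)
red = rng.uniform(0.03, 0.10, size=27)
nir = rng.uniform(0.20, 0.45, size=27)
v = ndvi(nir, red)
agb = 2.0 * v - 0.5 + rng.normal(scale=0.05, size=27)   # invented linear relationship

# Ordinary least-squares fit of biomass on NDVI, then the fit's r2.
slope, intercept = np.polyfit(v, agb, deg=1)
pred = slope * v + intercept
r2 = 1.0 - np.sum((agb - pred) ** 2) / np.sum((agb - agb.mean()) ** 2)
print(f"AGB = {slope:.2f} * NDVI + {intercept:.2f}, r2 = {r2:.2f}")
```

    Applying the fitted line to an NDVI mosaic pixel by pixel would give a biomass map; repeating the fit on resampled or perturbed inputs is one way to approximate the kind of Monte Carlo uncertainty map the abstract describes.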

  1. Land-use regression with long-term satellite-based greenness index and culture-specific sources to model PM2.5 spatial-temporal variability.

    PubMed

    Wu, Chih-Da; Chen, Yu-Cheng; Pan, Wen-Chi; Zeng, Yu-Ting; Chen, Mu-Jean; Guo, Yue Leon; Lung, Shih-Chun Candice

    2017-05-01

    This study utilized a long-term satellite-based vegetation index, and considered culture-specific emission sources (temples and Chinese restaurants) with Land-use Regression (LUR) modelling to estimate the spatial-temporal variability of PM2.5 using data from Taipei metropolis, which exhibits typical Asian city characteristics. Annual average PM2.5 concentrations from 2006 to 2012 of 17 air quality monitoring stations established by Environmental Protection Administration of Taiwan were used for model development. PM2.5 measurements from 2013 were used for external data verification. Monthly Normalized Difference Vegetation Index (NDVI) images coupled with buffer analysis were used to assess the spatial-temporal variations of greenness surrounding the monitoring sites. The distribution of temples and Chinese restaurants were included to represent the emission contributions from incense and joss money burning, and gas cooking, respectively. Spearman correlation coefficient and stepwise regression were used for LUR model development, and 10-fold cross-validation and external data verification were applied to verify the model reliability. The results showed a strongly negative correlation (r: -0.71 to -0.77) between NDVI and PM2.5 while temples (r: 0.52 to 0.66) and Chinese restaurants (r: 0.31 to 0.44) were positively correlated to PM2.5 concentrations. With the adjusted model R2 of 0.89, a cross-validated adj-R2 of 0.90, and external validated R2 of 0.83, the high explanatory power of the resultant model was confirmed. Moreover, the averaged NDVI within a 1750 m circular buffer (p < 0.01), the number of Chinese restaurants within a 1750 m buffer (p < 0.01), and the number of temples within a 750 m buffer (p = 0.06) were selected as important predictors during the stepwise selection procedures. According to the partial R2, NDVI explained 66% of PM2.5 variation and was the dominant variable in the developed model. 
We suggest future studies consider these three factors when establishing LUR models for estimating PM2.5 in other Asian cities. Copyright © 2017 Elsevier Ltd. All rights reserved.

  2. Mapping urban environmental noise: a land use regression method.

    PubMed

    Xie, Dan; Liu, Yi; Chen, Jining

    2011-09-01

    Forecasting and preventing urban noise pollution are major challenges in urban environmental management. Most existing efforts, including experiment-based models, statistical models, and noise mapping, however, have limited capacity to explain the association between urban growth and corresponding noise change. Therefore, these conventional methods can hardly forecast urban noise at a given outlook of development layout. This paper, for the first time, introduces a land use regression method, which has been applied for simulating urban air quality for a decade, to construct an urban noise model (LUNOS) in Dalian Municipality, Northeast China. The LUNOS model describes noise as a dependent variable of the various surrounding land areas via a regression function. The results suggest that a linear model performs better in fitting monitoring data, and there is no significant difference in the LUNOS outputs when applied to different spatial scales. As the LUNOS facilitates a better understanding of the association between land use and urban environmental noise in comparison to conventional methods, it can be regarded as a promising tool for noise prediction for planning purposes and as an aid to smart decision-making.

  3. Pyrogenic carbon distribution in mineral topsoils of the northeastern United States

    USGS Publications Warehouse

    Jauss, Verena; Sullivan, Patrick J.; Sanderman, Jonathan; Smith, David; Lehmann, Johannes

    2017-01-01

    Due to its slow turnover rates in soil, pyrogenic carbon (PyC) is considered an important C pool and relevant to climate change processes. Therefore, the amounts of soil PyC were compared to environmental covariates over an area of 327,757 km2 in the northeastern United States in order to understand the controls on PyC distribution over large areas. Topsoil (defined as the soil A horizon, after removal of any organic horizons) samples were collected at 165 field sites in a generalised random tessellation stratified design that corresponded to approximately 1 site per 1600 km2 and PyC was estimated from diffuse reflectance mid-infrared spectroscopy measurements using a partial least-squares regression analysis in conjunction with a large database of PyC measurements based on a solid-state 13C nuclear magnetic resonance spectroscopy technique. Three spatial models were applied to the data in order to relate critical environmental covariates to the changes in spatial density of PyC over the landscape. Regional mean density estimates of PyC were 11.0 g kg−1 (0.84 Gg km−2) for Ordinary Kriging, 25.8 g kg−1 (12.2 Gg km−2) for Multivariate Linear Regression, and 26.1 g kg−1 (12.4 Gg km−2) for Bayesian Regression Kriging. Akaike Information Criterion (AIC) indicated that the Multivariate Linear Regression model performed best (AIC = 842.6; n = 165) compared to Ordinary Kriging (AIC = 982.4) and Bayesian Regression Kriging (AIC = 979.2). Soil PyC concentrations correlated well with total soil sulphur (P < 0.001; n = 165), plant tissue lignin (P = 0.003), and drainage class (P = 0.008). This suggests the opportunity of including related environmental parameters in the spatial assessment of PyC in soils. Better estimates of the contribution of PyC to the global carbon cycle will thus also require more accurate assessments of these covariates.

  4. Modelling the spatial distribution of Fasciola hepatica in dairy cattle in Europe.

    PubMed

    Ducheyne, Els; Charlier, Johannes; Vercruysse, Jozef; Rinaldi, Laura; Biggeri, Annibale; Demeler, Janina; Brandt, Christina; De Waal, Theo; Selemetas, Nikolaos; Höglund, Johan; Kaba, Jaroslaw; Kowalczyk, Slawomir J; Hendrickx, Guy

    2015-03-26

    A harmonized sampling approach in combination with spatial modelling is required to update current knowledge of fasciolosis in dairy cattle in Europe. Within the scope of the EU project GLOWORM, samples from 3,359 randomly selected farms in 849 municipalities in Belgium, Germany, Ireland, Poland and Sweden were collected and their infection status assessed using an indirect bulk tank milk (BTM) enzyme-linked immunosorbent assay (ELISA). Dairy farms were considered exposed when the optical density ratio (ODR) exceeded the 0.3 cut-off. Two ensemble-modelling techniques, Random Forests (RF) and Boosted Regression Trees (BRT), were used to obtain the spatial distribution of the probability of exposure to Fasciola hepatica using remotely sensed environmental variables (1-km spatial resolution) and interpolated values from meteorological stations as predictors. The median ODRs amounted to 0.31, 0.12, 0.54, 0.25 and 0.44 for Belgium, Germany, Ireland, Poland and southern Sweden, respectively. Using the 0.3 threshold, 571 municipalities were categorized as positive and 429 as negative. RF was seen as capable of predicting the spatial distribution of exposure with an area under the receiver operating characteristic (ROC) curve (AUC) of 0.83 (0.96 for BRT). Both models identified rainfall and temperature as the most important factors for probability of exposure. Areas of high and low exposure were identified by both models, with BRT better at discriminating between low-probability and high-probability exposure; this model may therefore be more useful in practice. Given a harmonized sampling strategy, it should be possible to generate robust spatial models for fasciolosis in dairy cattle in Europe to be used as input for temporal models and for the detection of deviations in baseline probability. Further research is required for model output in areas outside the eco-climatic range investigated.
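
    Two quantities in this abstract lend themselves to a short illustration: the 0.3 ODR cut-off and the AUC used to score the classifiers. The sketch below is an assumption-laden illustration rather than the study's pipeline: is_exposed and auc are hypothetical helper names, the toy scores and labels are invented, and the cut-off is applied to the reported country medians only to show the rule (the study applied it to individual farms).

```python
def is_exposed(odr, cutoff=0.3):
    """Exposure rule from the abstract: ODR strictly above the cut-off."""
    return odr > cutoff


def auc(scores, labels):
    """Area under the ROC curve as the probability that a randomly chosen
    positive case scores higher than a randomly chosen negative case
    (ties count one half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Median ODRs reported in the abstract, used here only to illustrate the rule.
median_odr = {"Belgium": 0.31, "Germany": 0.12, "Ireland": 0.54,
              "Poland": 0.25, "southern Sweden": 0.44}
above_cutoff = sorted(c for c, v in median_odr.items() if is_exposed(v))

# Invented predicted probabilities and true exposure labels (hypothetical).
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auc = auc(toy_scores, toy_labels)
print(above_cutoff, toy_auc)
```

    The pairwise formulation is quadratic in the number of cases, which is acceptable for a sketch; a rank-based computation would be preferable for thousands of farms.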

  5. Calibrating MODIS aerosol optical depth for predicting daily PM2.5 concentrations via statistical downscaling

    PubMed Central

    Chang, Howard H.; Hu, Xuefei; Liu, Yang

    2014-01-01

    There has been a growing interest in the use of satellite-retrieved aerosol optical depth (AOD) to estimate ambient concentrations of PM2.5 (particulate matter <2.5 μm in aerodynamic diameter). With their broad spatial coverage, satellite data can increase the spatial–temporal availability of air quality data beyond ground monitoring measurements and potentially improve exposure assessment for population-based health studies. This paper describes a statistical downscaling approach that brings together (1) recent advances in PM2.5 land use regression models utilizing AOD and (2) statistical data fusion techniques for combining air quality data sets that have different spatial resolutions. Statistical downscaling assumes the associations between AOD and PM2.5 concentrations to be spatially and temporally dependent and offers two key advantages. First, it enables us to use gridded AOD data to predict PM2.5 concentrations at spatial point locations. Second, the unified hierarchical framework provides straightforward uncertainty quantification in the predicted PM2.5 concentrations. The proposed methodology is applied to a data set of daily AOD values in the southeastern United States during the period 2003–2005. Via cross-validation experiments, our model had an out-of-sample prediction R2 of 0.78 and a root mean-squared error (RMSE) of 3.61 μg/m3 between observed and predicted daily PM2.5 concentrations. This corresponds to a 10% decrease in RMSE compared with the same land use regression model without AOD as a predictor. Prediction performances of spatial–temporal interpolations to locations and on days without monitoring PM2.5 measurements were also examined. PMID:24368510
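
    The out-of-sample metrics quoted in this abstract (R2, RMSE, and the relative decrease in RMSE) can be sketched as follows. This is a hedged illustration only: the function names and the four toy observations are hypothetical, R2 is computed here as 1 − SS_res/SS_tot (the abstract does not say which definition was used), and the implied baseline RMSE is back-calculated from the rounded "10%" figure, so it is approximate.

```python
import math


def rmse(observed, predicted):
    """Root mean-squared error between paired observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def relative_decrease(baseline, improved):
    """Fractional reduction of an error metric relative to a baseline."""
    return (baseline - improved) / baseline


# Invented held-out PM2.5 values in ug/m3 (hypothetical, for illustration).
observed = [10.0, 12.0, 9.0, 15.0]
predicted = [11.0, 11.0, 10.0, 14.0]
toy_rmse = rmse(observed, predicted)
toy_r2 = r_squared(observed, predicted)

# Reported RMSE with AOD is 3.61 ug/m3; a 10% decrease implies a baseline
# of roughly 3.61 / 0.9 (approximate, because "10%" is itself rounded).
implied_baseline = 3.61 / (1.0 - 0.10)
print(toy_rmse, round(toy_r2, 3), round(implied_baseline, 2))
```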

  6. The Spatial Distribution of Hepatitis C Virus Infections and Associated Determinants--An Application of a Geographically Weighted Poisson Regression for Evidence-Based Screening Interventions in Hotspots.

    PubMed

    Kauhl, Boris; Heil, Jeanne; Hoebe, Christian J P A; Schweikart, Jürgen; Krafft, Thomas; Dukers-Muijrers, Nicole H T M

    2015-01-01

    Hepatitis C Virus (HCV) infections are a major cause of liver disease. A large proportion of these infections remain hidden to care due to their mostly asymptomatic nature. Population-based screening and screening targeted on behavioural risk groups have not proven to be effective in revealing these hidden infections. Therefore, more practically applicable approaches to target screenings are necessary. Geographic Information Systems (GIS) and spatial epidemiological methods may provide a more feasible basis for screening interventions through the identification of hotspots as well as demographic and socio-economic determinants. Analysed data included all HCV tests (n = 23,800) performed in the southern area of the Netherlands between 2002 and 2008. HCV positivity was defined as a positive immunoblot or polymerase chain reaction test. Population data were matched to the geocoded HCV test data. The spatial scan statistic was applied to detect areas with elevated HCV risk. We applied global regression models to determine associations between population-based determinants and HCV risk. Geographically weighted Poisson regression models were then constructed to determine local differences of the association between HCV risk and population-based determinants. HCV prevalence varied geographically and clustered in urban areas. The main population at risk were middle-aged males, non-western immigrants and divorced persons. Socio-economic determinants consisted of one-person households, persons with low income and mean property value. However, the association between HCV risk and demographic as well as socio-economic determinants displayed strong regional and intra-urban differences. The detection of local hotspots in our study may serve as a basis for prioritization of areas for future targeted interventions. Demographic and socio-economic determinants associated with HCV risk show regional differences, underlining that a one-size-fits-all approach even within small geographic areas may not be appropriate. Future screening interventions need to consider the spatially varying association between HCV risk and associated demographic and socio-economic determinants.

  7. Machine learning modeling of plant phenology based on coupling satellite and gridded meteorological dataset

    NASA Astrophysics Data System (ADS)

    Czernecki, Bartosz; Nowosad, Jakub; Jabłońska, Katarzyna

    2018-04-01

    Changes in the timing of plant phenological phases are important proxies in contemporary climate research. However, most of the commonly used traditional phenological observations do not give any coherent spatial information. While consistent spatial data can be obtained from airborne sensors and preprocessed gridded meteorological data, not many studies robustly benefit from these data sources. Therefore, the main aim of this study is to create and evaluate different statistical models for reconstructing, predicting, and improving quality of phenological phases monitoring with the use of satellite and meteorological products. A quality-controlled dataset of the 13 BBCH plant phenophases in Poland was collected for the period 2007-2014. For each phenophase, statistical models were built using the most commonly applied regression-based machine learning techniques, such as multiple linear regression, lasso, principal component regression, generalized boosted models, and random forest. The quality of the models was estimated using a k-fold cross-validation. The obtained results showed varying potential for coupling meteorological derived indices with remote sensing products in terms of phenological modeling; however, application of both data sources improves models' accuracy from 0.6 to 4.6 days in terms of obtained RMSE. It is shown that a robust prediction of early phenological phases is mostly related to meteorological indices, whereas for autumn phenophases, there is a stronger information signal provided by satellite-derived vegetation metrics. Choosing a specific set of predictors and applying robust preprocessing procedures is more important for final results than the selection of a particular statistical model. The average RMSE for the best models of all phenophases is 6.3, while the individual RMSE vary seasonally from 3.5 to 10 days. Models give a reliable proxy for ground observations with RMSE below 5 days for early spring and late spring phenophases. For other phenophases, RMSE are higher and rise up to 9-10 days in the case of the earliest spring phenophases.

  8. Strategies for minimizing sample size for use in airborne LiDAR-based forest inventory

    USGS Publications Warehouse

    Junttila, Virpi; Finley, Andrew O.; Bradford, John B.; Kauranne, Tuomo

    2013-01-01

    Recently airborne Light Detection And Ranging (LiDAR) has emerged as a highly accurate remote sensing modality to be used in operational scale forest inventories. Inventories conducted with the help of LiDAR are most often model-based, i.e. they use variables derived from LiDAR point clouds as the predictive variables that are to be calibrated using field plots. The measurement of the necessary field plots is a time-consuming and statistically sensitive process. Because of this, current practice often presumes hundreds of plots to be collected. But since these plots are only used to calibrate regression models, it should be possible to minimize the number of plots needed by carefully selecting the plots to be measured. In the current study, we compare several systematic and random methods for calibration plot selection, with the specific aim that they be used in LiDAR based regression models for forest parameters, especially above-ground biomass. The primary criteria compared are based on both spatial representativity as well as on their coverage of the variability of the forest features measured. In the former case, it is important also to take into account spatial auto-correlation between the plots. The results indicate that choosing the plots in a way that ensures ample coverage of both spatial and feature space variability improves the performance of the corresponding models, and that adequate coverage of the variability in the feature space is the most important condition that should be met by the set of plots collected.

  9. Land-use regression panel models of NO2 concentrations in Seoul, Korea

    NASA Astrophysics Data System (ADS)

    Kim, Youngkook; Guldmann, Jean-Michel

    2015-04-01

    Transportation and land-use activities are major air pollution contributors. Since their shares of emissions vary across space and time, so do air pollution concentrations. Despite these variations, panel data have rarely been used in land-use regression (LUR) modeling of air pollution. In addition, the complex interactions between traffic flows, land uses, and meteorological variables have not been satisfactorily investigated in LUR models. The purpose of this research is to develop and estimate nitrogen dioxide (NO2) panel models based on the LUR framework with data for Seoul, Korea, accounting for the impacts of these variables, and their interactions with spatial and temporal dummy variables. The panel data vary over several scales: daily (24 h), seasonally (4), and spatially (34 intra-urban measurement locations). To enhance model explanatory power, wind direction and distance decay effects are accounted for. The results show that vehicle-kilometers-traveled (VKT) and solar radiation have statistically strong positive and negative impacts, respectively, on NO2 concentrations across the four seasonal models. In addition, there are significant interactions with the dummy variables, pointing to VKT and solar radiation effects on NO2 concentrations that vary with time and intra-urban location. The results also show that residential, commercial, and industrial land uses, and wind speed, temperature, and humidity, all impact NO2 concentrations. The R2 vary between 0.95 and 0.98.

  10. Using geostatistical methods to estimate snow water equivalence distribution in a mountain watershed

    USGS Publications Warehouse

    Balk, B.; Elder, K.; Baron, Jill S.

    1998-01-01

    Knowledge of the spatial distribution of snow water equivalence (SWE) is necessary to adequately forecast the volume and timing of snowmelt runoff. In April 1997, peak accumulation snow depth and density measurements were independently taken in the Loch Vale watershed (6.6 km2), Rocky Mountain National Park, Colorado. Geostatistics and classical statistics were used to estimate SWE distribution across the watershed. Snow depths were spatially distributed across the watershed through kriging interpolation methods which provide unbiased estimates that have minimum variances. Snow densities were spatially modeled through regression analysis. Combining the modeled depth and density with snow-covered area (SCA) produced an estimate of the spatial distribution of SWE. The kriged estimates of snow depth explained 37-68% of the observed variance in the measured depths. Steep slopes, variably strong winds, and complex energy balance in the watershed contribute to a large degree of heterogeneity in snow depth.
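
    The combination of modeled depth, modeled density and snow-covered area described in this abstract can be sketched per grid cell. This is a minimal sketch under stated assumptions, not the authors' procedure: SWE is taken as depth times the ratio of snow density to water density (the standard definition), scaling by a fractional SCA is an assumption about how SCA entered the calculation, and the function name and the three example cells are hypothetical.

```python
WATER_DENSITY = 1000.0  # kg/m^3


def swe_m(depth_m, snow_density_kg_m3, sca_fraction=1.0):
    """Snow water equivalent (metres of water) for one grid cell.

    depth_m            -- modeled snow depth in metres (e.g. from kriging)
    snow_density_kg_m3 -- modeled snow density (e.g. from regression)
    sca_fraction       -- assumed fraction of the cell that is snow covered
    """
    if not 0.0 <= sca_fraction <= 1.0:
        raise ValueError("sca_fraction must lie in [0, 1]")
    return depth_m * snow_density_kg_m3 / WATER_DENSITY * sca_fraction


# Invented grid cells: (depth in m, density in kg/m^3, SCA fraction).
cells = [(2.0, 300.0, 1.0), (1.0, 250.0, 0.5), (0.0, 300.0, 0.0)]
cell_swe = [swe_m(depth, density, sca) for depth, density, sca in cells]
mean_swe = sum(cell_swe) / len(cell_swe)
print(cell_swe, mean_swe)
```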

  11. Using Dual Regression to Investigate Network Shape and Amplitude in Functional Connectivity Analyses

    PubMed Central

    Nickerson, Lisa D.; Smith, Stephen M.; Öngür, Döst; Beckmann, Christian F.

    2017-01-01

    Independent Component Analysis (ICA) is one of the most popular techniques for the analysis of resting state FMRI data because it has several advantageous properties when compared with other techniques. Most notably, in contrast to a conventional seed-based correlation analysis, it is model-free and multivariate, thus switching the focus from evaluating the functional connectivity of single brain regions identified a priori to evaluating brain connectivity in terms of all brain resting state networks (RSNs) that simultaneously engage in oscillatory activity. Furthermore, typical seed-based analysis characterizes RSNs in terms of spatially distributed patterns of correlation (typically by means of simple Pearson's coefficients) and thereby confounds together amplitude information of oscillatory activity and noise. ICA and other regression techniques, on the other hand, retain magnitude information and therefore can be sensitive to both changes in the spatially distributed nature of correlations (differences in the spatial pattern or “shape”) as well as the amplitude of the network activity. Furthermore, motion can mimic amplitude effects so it is crucial to use a technique that retains such information to ensure that connectivity differences are accurately localized. In this work, we investigate the dual regression approach that is frequently applied with group ICA to assess group differences in resting state functional connectivity of brain networks. We show how ignoring amplitude effects and excessive motion corrupt connectivity maps and result in spurious connectivity differences. We also show how to implement the dual regression to retain amplitude information and how to use dual regression outputs to identify potential motion effects. Two key findings are that using a technique that retains magnitude information, e.g., dual regression, and using strict motion criteria are crucial for controlling both network amplitude and motion-related amplitude effects, respectively, in resting state connectivity analyses. We illustrate these concepts using realistic simulated resting state FMRI data and in vivo data acquired in healthy subjects and patients with bipolar disorder and schizophrenia. PMID:28348512

  12. Subgrid spatial variability of soil hydraulic functions for hydrological modelling

    NASA Astrophysics Data System (ADS)

    Kreye, Phillip; Meon, Günter

    2016-07-01

    State-of-the-art hydrological applications require a process-based, spatially distributed hydrological model. The model must reproduce runoff characteristics well. At the same time, it should be able to describe the processes at a subcatchment scale in a physically credible way. The objective of this study is to present a robust procedure to generate various sets of parameterisations of soil hydraulic functions for the description of soil heterogeneity on a subgrid scale. Relations between Rosetta-generated values of saturated hydraulic conductivity (Ks) and van Genuchten's parameters of soil hydraulic functions were statistically analysed. A universal function that is valid for the complete bandwidth of Ks values could not be found. After concentrating on natural texture classes, strong correlations were identified for all parameters. The obtained regression results were used to parameterise sets of hydraulic functions for each soil class. The methodology presented in this study is applicable to a wide range of spatial scales and does not need input data from field studies. The developments were implemented into a hydrological modelling system.

  13. Study of non-Hodgkin's lymphoma mortality associated with industrial pollution in Spain, using Poisson models

    PubMed Central

    Ramis, Rebeca; Vidal, Enrique; García-Pérez, Javier; Lope, Virginia; Aragonés, Nuria; Pérez-Gómez, Beatriz; Pollán, Marina; López-Abente, Gonzalo

    2009-01-01

    Background: Non-Hodgkin's lymphomas (NHLs) have been linked to proximity to industrial areas, but evidence regarding the health risk posed by residence near pollutant industries is very limited. The European Pollutant Emission Register (EPER) is a public register that furnishes valuable information on industries that release pollutants to air and water, along with their geographical location. This study sought to explore the relationship between NHL mortality in small areas in Spain and environmental exposure to pollutant emissions from EPER-registered industries, using three Poisson-regression-based mathematical models. Methods: Observed cases were drawn from mortality registries in Spain for the period 1994–2003. Industries were grouped into the following sectors: energy; metal; mineral; organic chemicals; waste; paper; food; and use of solvents. Populations having an industry within a radius of 1, 1.5, or 2 kilometres from the municipal centroid were deemed to be exposed. Municipalities outside those radii were considered as reference populations. The relative risks (RRs) associated with proximity to pollutant industries were estimated using the following methods: Poisson Regression; mixed Poisson model with random provincial effect; and spatial autoregressive modelling (BYM model). Results: Only proximity of paper industries to population centres (>2 km) could be associated with a greater risk of NHL mortality (mixed model: RR:1.24, 95% CI:1.09–1.42; BYM model: RR:1.21, 95% CI:1.01–1.45; Poisson model: RR:1.16, 95% CI:1.06–1.27). Spatial models yielded higher estimates. Conclusion: The reported association between exposure to air pollution from the paper, pulp and board industry and NHL mortality is independent of the model used. Inclusion of spatial random effects terms in the risk estimate improves the study of associations between environmental exposures and mortality. The EPER could be of great utility when studying the effects of industrial pollution on the health of the population. PMID:19159450
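
    Relative risks with 95% intervals of the kind reported in this abstract are conventionally obtained by exponentiating a log-scale coefficient and its Wald limits. The sketch below illustrates that arithmetic for the plain Poisson model only. It is an assumption, not the authors' code: the helper names are hypothetical, the standard error is back-calculated from the rounded interval 1.06–1.27, and the BYM estimate is a Bayesian credible interval to which this Wald construction does not strictly apply.

```python
import math

Z95 = 1.96  # standard normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper, z=Z95):
    """Back out the log-scale standard error from a reported RR interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def rr_with_ci(beta, se, z=Z95):
    """Relative risk and Wald interval from a log-scale coefficient."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


# Poisson model figures from the abstract: RR 1.16, 95% CI 1.06-1.27.
beta = math.log(1.16)
se = se_from_ci(1.06, 1.27)
rr, ci_low, ci_high = rr_with_ci(beta, se)
print(f"RR = {rr:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}, SE(log RR) = {se:.4f}")
```

    Because the reported limits are rounded to two decimals, the recovered standard error is approximate; the round trip nevertheless reproduces the published interval to the reported precision.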

  14. Classification of Large-Scale Remote Sensing Images for Automatic Identification of Health Hazards: Smoke Detection Using an Autologistic Regression Classifier.

    PubMed

    Wolters, Mark A; Dean, C B

    2017-01-01

    Remote sensing images from Earth-orbiting satellites are a potentially rich data source for monitoring and cataloguing atmospheric health hazards that cover large geographic regions. A method is proposed for classifying such images into hazard and nonhazard regions using the autologistic regression model, which may be viewed as a spatial extension of logistic regression. The method includes a novel and simple approach to parameter estimation that makes it well suited to handling the large and high-dimensional datasets arising from satellite-borne instruments. The methodology is demonstrated on both simulated images and a real application to the identification of forest fire smoke.

  15. Inequalities in tobacco outlet density by race, ethnicity and socioeconomic status, 2012, USA: results from the ASPiRE Study.

    PubMed

    Lee, Joseph G L; Sun, Dennis L; Schleicher, Nina M; Ribisl, Kurt M; Luke, Douglas A; Henriksen, Lisa

    2017-05-01

    Evidence of racial/ethnic inequalities in tobacco outlet density is limited by: (1) reliance on studies from single counties or states, (2) limited attention to spatial dependence, and (3) an unclear theory-based relationship between neighbourhood composition and tobacco outlet density. In 97 counties from the contiguous USA, we calculated the 2012 density of likely tobacco outlets (N=90 407), defined as tobacco outlets per 1000 population in census tracts (n=17 667). We used 2 spatial regression techniques, (1) a spatial errors approach in GeoDa software and (2) fitting a covariance function to the errors using a distance matrix of all tract centroids. We examined density as a function of race, ethnicity, income and 2 indicators identified from city planning literature to indicate neighbourhood stability (vacant housing, renter-occupied housing). The average density was 1.3 tobacco outlets per 1000 persons. Both spatial regression approaches yielded similar results. In unadjusted models, tobacco outlet density was positively associated with the proportion of black residents and negatively associated with the proportion of Asian residents, white residents and median household income. There was no association with the proportion of Hispanic residents. Indicators of neighbourhood stability explained the disproportionate density associated with black residential composition, but inequalities by income persisted in multivariable models. Data from a large sample of US counties and results from 2 techniques to address spatial dependence strengthen evidence of inequalities in tobacco outlet density by race and income. Further research is needed to understand the underlying mechanisms in order to strengthen interventions. Published by the BMJ Publishing Group Limited.

  16. Exploring the effects of spatial autocorrelation when identifying key drivers of wildlife crop-raiding.

    PubMed

    Songhurst, Anna; Coulson, Tim

    2014-03-01

    Few universal trends in spatial patterns of wildlife crop-raiding have been found. Variations in wildlife ecology and movements, and human spatial use have been identified as causes of this apparent unpredictability. However, varying patterns of spatial autocorrelation (SA) in human-wildlife conflict (HWC) data could also contribute. We explicitly explore the effects of SA on wildlife crop-raiding data in order to facilitate the design of future HWC studies. We conducted a comparative survey of raided and nonraided fields to determine key drivers of crop-raiding. Data were subsampled at different spatial scales to select independent raiding data points. The model derived from all data was fitted to subsample data sets. Model parameters from these models were compared to determine the effect of SA. Most methods used to account for SA in data attempt to correct for the change in P-values; yet, by subsampling data at broader spatial scales, we identified changes in regression estimates. We consequently advocate reporting model parameters across a range of spatial scales to help biological interpretation. Patterns of SA vary spatially in our crop-raiding data. Spatial distribution of fields should therefore be considered when choosing the spatial scale for analyses of HWC studies. Robust key drivers of elephant crop-raiding included raiding history of a field and distance of field to a main elephant pathway. Understanding spatial patterns and determining reliable socio-ecological drivers of wildlife crop-raiding is paramount for designing mitigation and land-use planning strategies to reduce HWC. Spatial patterns of HWC are complex, determined by multiple factors acting at more than one scale; therefore, studies need to be designed with an understanding of the effects of SA. Our methods are accessible to a variety of practitioners to assess the effects of SA, thereby improving the reliability of conservation management actions.
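
    Subsampling at successively broader spatial scales, as described in this abstract, can be approximated by thinning points so that no two retained points lie closer than a chosen distance. The sketch below is one possible scheme and is not taken from the paper: the greedy rule, the function name thin_by_distance, the coordinates and the distance units are all assumptions made for illustration.

```python
import math


def thin_by_distance(points, min_dist):
    """Greedy spatial thinning: keep a point only if it lies at least
    min_dist from every point already kept. The result depends on the
    input order, so shuffling before thinning is advisable in practice."""
    kept = []
    for p in points:
        if all(math.dist(p, q) >= min_dist for q in kept):
            kept.append(p)
    return kept


# Invented field coordinates (hypothetical units, e.g. km).
fields = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (5.5, 0.0), (10.0, 0.0)]

# One subsample per spatial scale; a model would then be refitted to each
# subsample and its coefficients compared across scales.
subsamples = {d: thin_by_distance(fields, d) for d in (0.5, 2.0, 6.0)}
for scale, pts in subsamples.items():
    print(scale, len(pts), pts)
```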

  17. Optimal Sparse Upstream Sensor Placement for Hydrokinetic Turbines

    NASA Astrophysics Data System (ADS)

    Cavagnaro, Robert; Strom, Benjamin; Ross, Hannah; Hill, Craig; Polagye, Brian

    2016-11-01

    Accurate measurement of the flow field incident upon a hydrokinetic turbine is critical for performance evaluation during testing and setting boundary conditions in simulation. Additionally, turbine controllers may leverage real-time flow measurements. Particle image velocimetry (PIV) is capable of rendering a flow field over a wide spatial domain in a controlled, laboratory environment. However, PIV's lack of suitability for natural marine environments, high cost, and intensive post-processing diminish its potential for control applications. Conversely, sensors such as acoustic Doppler velocimeters (ADVs), are designed for field deployment and real-time measurement, but over a small spatial domain. Sparsity-promoting regression analysis such as LASSO is utilized to improve the efficacy of point measurements for real-time applications by determining optimal spatial placement for a small number of ADVs using a training set of PIV velocity fields and turbine data. The study is conducted in a flume (0.8 m2 cross-sectional area, 1 m/s flow) with laboratory-scale axial and cross-flow turbines. Predicted turbine performance utilizing the optimal sparse sensor network and associated regression model is compared to actual performance with corresponding PIV measurements.

  18. Estimating carbon and showing impacts of drought using satellite data in regression-tree models

    USGS Publications Warehouse

    Boyte, Stephen; Wylie, Bruce K.; Howard, Danny; Dahal, Devendra; Gilmanov, Tagir G.

    2018-01-01

    Integrating spatially explicit biogeophysical and remotely sensed data into regression-tree models enables the spatial extrapolation of training data over large geographic spaces, allowing a better understanding of broad-scale ecosystem processes. The current study presents annual gross primary production (GPP) and annual ecosystem respiration (RE) for 2000–2013 in several short-statured vegetation types using carbon flux data from towers that are located strategically across the conterminous United States (CONUS). We calculate carbon fluxes (annual net ecosystem production [NEP]) for each year in our study period, which includes 2012 when drought and higher-than-normal temperatures influence vegetation productivity in large parts of the study area. We present and analyse carbon flux dynamics in the CONUS to better understand how drought affects GPP, RE, and NEP. Model accuracy metrics show strong correlation coefficients (r) (r ≥ 94%) between training and estimated data for both GPP and RE. Overall, average annual GPP, RE, and NEP are relatively constant throughout the study period except during 2012 when almost 60% less carbon is sequestered than normal. These results allow us to conclude that this modelling method effectively estimates carbon dynamics through time and allows the exploration of impacts of meteorological anomalies and vegetation types on carbon dynamics.

  19. Boosted Regression Tree Models to Explain Watershed ...

    EPA Pesticide Factsheets

    Boosted regression tree (BRT) models were developed to quantify the nonlinear relationships between landscape variables and nutrient concentrations in a mesoscale mixed land cover watershed during base-flow conditions. Factors that affect instream biological components, based on the Index of Biotic Integrity (IBI), were also analyzed. Seasonal BRT models at two spatial scales (watershed and riparian buffered area [RBA]) for nitrite-nitrate (NO2-NO3), total Kjeldahl nitrogen, and total phosphorus (TP) and annual models for the IBI score were developed. Two primary factors — location within the watershed (i.e., geographic position, stream order, and distance to a downstream confluence) and percentage of urban land cover (both scales) — emerged as important predictor variables. Latitude and longitude interacted with other factors to explain the variability in summer NO2-NO3 concentrations and IBI scores. BRT results also suggested that location might be associated with indicators of sources (e.g., land cover), runoff potential (e.g., soil and topographic factors), and processes not easily represented by spatial data indicators. Runoff indicators (e.g., Hydrological Soil Group D and Topographic Wetness Indices) explained a substantial portion of the variability in nutrient concentrations as did point sources for TP in the summer months. The results from our BRT approach can help prioritize areas for nutrient management in mixed-use and heavily impacted watershed

  20. Aboveground biomass mapping in French Guiana by combining remote sensing, forest inventories and environmental data

    NASA Astrophysics Data System (ADS)

    Fayad, Ibrahim; Baghdadi, Nicolas; Guitet, Stéphane; Bailly, Jean-Stéphane; Hérault, Bruno; Gond, Valéry; El Hajj, Mahmoud; Tong Minh, Dinh Ho

    2016-10-01

    Mapping forest aboveground biomass (AGB) has become an important task, particularly for the reporting of carbon stocks and changes. AGB can be mapped using synthetic aperture radar data (SAR) or passive optical data. However, these data are insensitive to high AGB levels (>150 Mg/ha, and >300 Mg/ha for P-band), which are commonly found in tropical forests. Studies have mapped the rough variations in AGB by combining optical and environmental data at regional and global scales. Nevertheless, these maps cannot represent local variations in AGB in tropical forests. In this paper, we hypothesize that the problem of misrepresenting local variations in AGB and AGB estimation with good precision occurs because of both methodological limits (signal saturation or dilution bias) and a lack of adequate calibration data in this range of AGB values. We test this hypothesis by developing a calibrated regression model to predict variations in high AGB values (mean >300 Mg/ha) in French Guiana by a methodological approach for spatial extrapolation with data from the optical geoscience laser altimeter system (GLAS), forest inventories, radar, optics, and environmental variables for spatial inter- and extrapolation. Given their higher point count, GLAS data allow a wider coverage of AGB values. We find that the metrics from GLAS footprints are correlated with field AGB estimations (R2 = 0.54, RMSE = 48.3 Mg/ha) with no bias for high values. First, predictive models, including remote-sensing, environmental variables and spatial correlation functions, allow us to obtain "wall-to-wall" AGB maps over French Guiana with an RMSE for the in situ AGB estimates of ∼50 Mg/ha and R2 = 0.66 at a 1-km grid size. We conclude that a calibrated regression model based on GLAS with dependent environmental data can produce good AGB predictions even for high AGB values if the calibration data fit the AGB range. We also demonstrate that small temporal and spatial mismatches between field data and GLAS footprints are not a problem for regional and global calibrated regression models because field data aim to predict large and deep tendencies in AGB variations from environmental gradients and do not aim to represent high but stochastic and temporally limited variations from forest dynamics. Thus, we advocate including a greater variety of data, even if less precise and shifted, to better represent high AGB values in global models and to improve the fitting of these models for high values.

  1. High Resolution Mapping of Soil Properties Using Remote Sensing Variables in South-Western Burkina Faso: A Comparison of Machine Learning and Multiple Linear Regression Models.

    PubMed

    Forkuor, Gerald; Hounkpatin, Ozias K L; Welp, Gerhard; Thiel, Michael

    2017-01-01

    Accurate and detailed spatial soil information is essential for environmental modelling, risk assessment and decision making. The use of Remote Sensing data as secondary sources of information in digital soil mapping has been found to be cost effective and less time consuming compared to traditional soil mapping approaches. But the potentials of Remote Sensing data in improving knowledge of local scale soil information in West Africa have not been fully explored. This study investigated the use of high spatial resolution satellite data (RapidEye and Landsat), terrain/climatic data and laboratory analysed soil samples to map the spatial distribution of six soil properties (sand, silt, clay, cation exchange capacity [CEC], soil organic carbon [SOC] and nitrogen) in a 580 km² agricultural watershed in south-western Burkina Faso. Four statistical prediction models were tested and compared: multiple linear regression (MLR), random forest regression (RFR), support vector machine (SVM) and stochastic gradient boosting (SGB). Internal validation was conducted by cross validation while the predictions were validated against an independent set of soil samples considering the modelling area and an extrapolation area. Model performance statistics revealed that the machine learning techniques performed marginally better than the MLR, with the RFR providing in most cases the highest accuracy. The inability of MLR to handle non-linear relationships between dependent and independent variables was found to be a limitation in accurately predicting soil properties at unsampled locations. Satellite data acquired during ploughing or early crop development stages (e.g. May, June) were found to be the most important spectral predictors while elevation, temperature and precipitation came up as prominent terrain/climatic variables in predicting soil properties.
The results further showed that shortwave infrared and near infrared channels of Landsat 8 as well as soil specific indices of redness, coloration and saturation were prominent predictors in digital soil mapping. Considering the increased availability of free Remote Sensing data (e.g. Landsat, SRTM, Sentinels), soil information at local and regional scales in data poor regions such as West Africa can be improved with relatively little financial and human resources.

  2. Mutant mouse models and their contribution to our knowledge of corpus luteum development, function and regression.

    PubMed

    Henkes, Luiz E; Davis, John S; Rueda, Bo R

    2003-11-10

    The corpus luteum is a unique organ, which is transitory in nature. The development, maintenance and regression of the corpus luteum are regulated by endocrine, paracrine and autocrine signaling events. Defining the specific mediators of luteal development, maintenance and regression has been difficult and often perplexing due to the complexity that stems from the variety of cell types that make up the luteal tissue. Moreover, some regulators may serve dual functions as a luteotropic and luteolytic agent depending on the temporal and spatial environment in which they are expressed. As a result, some confusion is present in the interpretation of in vitro and in vivo studies. More recently investigators have utilized mutant mouse models to define the functional significance of specific gene products. The goal of this mini-review is to identify and discuss mutant mouse models that have luteal anomalies, which may provide some clues as to the significance of specific regulators of corpus luteum function.

  3. Additive hazards regression and partial likelihood estimation for ecological monitoring data across space.

    PubMed

    Lin, Feng-Chang; Zhu, Jun

    2012-01-01

    We develop continuous-time models for the analysis of environmental or ecological monitoring data such that subjects are observed at multiple monitoring time points across space. Of particular interest are additive hazards regression models where the baseline hazard function can take on flexible forms. We consider time-varying covariates and take into account spatial dependence via autoregression in space and time. We develop statistical inference for the regression coefficients via partial likelihood. Asymptotic properties, including consistency and asymptotic normality, are established for parameter estimates under suitable regularity conditions. Feasible algorithms utilizing existing statistical software packages are developed for computation. We also consider a simpler additive hazards model with homogeneous baseline hazard and develop hypothesis testing for homogeneity. A simulation study demonstrates that the statistical inference using partial likelihood has sound finite-sample properties and offers a viable alternative to maximum likelihood estimation. For illustration, we analyze data from an ecological study that monitors bark beetle colonization of red pines in a plantation of Wisconsin.

  4. Landscape features and attractants that predispose grizzly bears to risk of conflicts with humans: A spatial and temporal analysis on privately owned agricultural land

    NASA Astrophysics Data System (ADS)

    Wilson, Seth Mark

    Grizzly bear (Ursus arctos) deaths in the US tend to be concentrated on the periphery of core habitats. These deaths were often preceded by conflicts with humans. Management removals of "nuisance" and/or habituated grizzly bears are a leading cause of death in many populations. This exploratory study focuses on the conditions that lead to human-grizzly bear conflicts on private lands near core habitat. I examined spatial associations among reported human-grizzly bear conflicts during 1986-2001, landscape features, and agricultural attractants in north-central Montana. I surveyed 61 of a possible 64 active livestock related land users and I used geographic information system (GIS) techniques to collect information on cattle and sheep pasture locations, seasons of use, and bone yard (carcass dump) and beehive locations. I used GIS spatial analyses, univariate tests, and logistic regression models to explore the associations among conflicts, landscape features, and attractants. A majority (75%) of conflicts were found in distinct seasonal conflict hotspots. Conflict hotspots with spatial overlap were associated with riparian vegetation, bone yards, and beehives in close proximity to one another and accounted for 62% of all conflicts. Consistently available seasonal attractants in overlapping hotspots such as calving areas, sheep lambing areas and spring, summer, and fall sheep and cattle pastures appear to perpetuate the occurrence of conflicts. I found that lambing areas and spring and summer sheep pastures were strongly associated with conflict locations as were cattle calving areas, spring cow/calf pastures, fall pastures, and bone yards. Logistic regression modeling revealed that the presence of riparian vegetation within a 1.6 km search radius strongly influenced the likelihood of conflict. After controlling for riparian vegetation, I found that unmanaged bone yards and both unfenced and fenced beehives increased the odds of conflict.
For every 1 km moved away from spring, summer, and fall sheep and cattle pastures, the odds of conflict decreased. The model confirmed the existence of conflict hotspots and illustrated that a collection of attractants beyond the effects of riparian vegetation were associated with conflicts. Contour probability plots of logistic regression models showed good predictive capacity. We discuss these findings and offer management recommendations.

  5. Spatial variability in levels of benzene, formaldehyde, and total benzene, toluene, ethylbenzene and xylenes in New York City: a land-use regression study.

    PubMed

    Kheirbek, Iyad; Johnson, Sarah; Ross, Zev; Pezeshki, Grant; Ito, Kazuhiko; Eisl, Holger; Matte, Thomas

    2012-07-31

    Hazardous air pollutant exposures are common in urban areas contributing to increased risk of cancer and other adverse health outcomes. While recent analyses indicate that New York City residents experience significantly higher cancer risks attributable to hazardous air pollutant exposures than the United States as a whole, limited data exist to assess intra-urban variability in air toxics exposures. To assess intra-urban spatial variability in exposures to common hazardous air pollutants, street-level air sampling for volatile organic compounds and aldehydes was conducted at 70 sites throughout New York City during the spring of 2011. Land-use regression models were developed using a subset of 59 sites and validated against the remaining 11 sites to describe the relationship between concentrations of benzene, total BTEX (benzene, toluene, ethylbenzene, xylenes) and formaldehyde to indicators of local sources, adjusting for temporal variation. Total BTEX levels exhibited the most spatial variability, followed by benzene and formaldehyde (coefficient of variation of temporally adjusted measurements of 0.57, 0.35, 0.22, respectively). Total roadway length within 100 m, traffic signal density within 400 m of monitoring sites, and an indicator of temporal variation explained 65% of the total variability in benzene while 70% of the total variability in BTEX was accounted for by traffic signal density within 450 m, density of permitted solvent-use industries within 500 m, and an indicator of temporal variation. Measures of temporal variation, traffic signal density within 400 m, road length within 100 m, and interior building area within 100 m (indicator of heating fuel combustion) predicted 83% of the total variability of formaldehyde. The models built with the modeling subset were found to predict concentrations well, predicting 62% to 68% of monitored values at validation sites. 
Traffic and point source emissions cause substantial variation in street-level exposures to common toxic volatile organic compounds in New York City. Land-use regression models were successfully developed for benzene, formaldehyde, and total BTEX using spatial indicators of on-road vehicle emissions and emissions from stationary sources. These estimates will improve the understanding of health effects of individual pollutants in complex urban pollutant mixtures and inform local air quality improvement efforts that reduce disparities in exposure.

  6. Wilderness recreation participation: Projections for the next half century

    Treesearch

    J. M. Bowker; D. Murphy; H. K. Cordell; D. B. K. English; J. C. Bergstrom; C. M. Starbuck; C. J. Betz; G. T. Green; P. Reed

    2007-01-01

    This paper explores the influence of demographic and spatial variables on individual participation in wildland area recreation. Data from the National Survey on Recreation and the Environment (NSRE) are combined with GIS-based distance measures to develop nonlinear regression models used to predict both participation and the number of days of participation in...

  7. EMI-Sensor Data to Identify Areas of Manure Accumulation on a Feedlot Surface

    USDA-ARS's Scientific Manuscript database

    A study was initiated to test the validity of using electromagnetic induction (EMI) survey data, a prediction-based sampling strategy and ordinary linear regression modeling to predict spatially variable feedlot surface manure accumulation. A 30 m × 60 m feedlot pen with a central mound was selecte...

  8. Analysis of Pollution Hazard Intensity: A Spatial Epidemiology Case Study of Soil Pb Contamination

    PubMed Central

    Ha, Hoehun; Rogerson, Peter A.; Olson, James R.; Han, Daikwon; Bian, Ling; Shao, Wanyun

    2016-01-01

    Heavy industrialization has resulted in the contamination of soil by metals from anthropogenic sources in Anniston, Alabama. This situation calls for increased public awareness of the soil contamination issue and better knowledge of the main factors contributing to the potential sources contaminating residential soil. The purpose of this spatial epidemiology research is to describe the effects of physical factors on the concentration of lead (Pb) in soil in Anniston AL, and to determine the socioeconomic and demographic characteristics of those residing in areas with higher soil contamination. Spatial regression models are used to account for spatial dependencies using these explanatory variables. After accounting for covariates and multicollinearity, results of the analysis indicate that lead concentration in soils varies markedly in the vicinity of a specific foundry (Foundry A), and that proximity to railroads explained a significant amount of spatial variation in soil lead concentration. Moreover, elevated soil lead levels were identified as a concern in industrial sites, neighborhoods with a high density of old housing, a high percentage of African American population, and a low percent of occupied housing units. The use of spatial modelling allows for better identification of significant factors that are correlated with soil lead concentrations. PMID:27649221

  9. Analysis of Pollution Hazard Intensity: A Spatial Epidemiology Case Study of Soil Pb Contamination.

    PubMed

    Ha, Hoehun; Rogerson, Peter A; Olson, James R; Han, Daikwon; Bian, Ling; Shao, Wanyun

    2016-09-14

    Heavy industrialization has resulted in the contamination of soil by metals from anthropogenic sources in Anniston, Alabama. This situation calls for increased public awareness of the soil contamination issue and better knowledge of the main factors contributing to the potential sources contaminating residential soil. The purpose of this spatial epidemiology research is to describe the effects of physical factors on the concentration of lead (Pb) in soil in Anniston AL, and to determine the socioeconomic and demographic characteristics of those residing in areas with higher soil contamination. Spatial regression models are used to account for spatial dependencies using these explanatory variables. After accounting for covariates and multicollinearity, results of the analysis indicate that lead concentration in soils varies markedly in the vicinity of a specific foundry (Foundry A), and that proximity to railroads explained a significant amount of spatial variation in soil lead concentration. Moreover, elevated soil lead levels were identified as a concern in industrial sites, neighborhoods with a high density of old housing, a high percentage of African American population, and a low percent of occupied housing units. The use of spatial modelling allows for better identification of significant factors that are correlated with soil lead concentrations.

  10. Hydroacoustic estimation of zooplankton biomass at two shoal complexes in the Apostle Islands Region of Lake Superior

    USGS Publications Warehouse

    Holbrook, B.V.; Hrabik, T.R.; Branstrator, D.K.; Yule, D.L.; Stockwell, J.D.

    2006-01-01

    Hydroacoustics can be used to assess zooplankton populations, however, backscatter must be scaled to be biologically meaningful. In this study, we used a general model to correlate site-specific hydroacoustic backscatter with zooplankton dry weight biomass estimated from net tows. The relationship between zooplankton dry weight and backscatter was significant (p < 0.001) and explained 76% of the variability in the dry weight data. We applied this regression to hydroacoustic data collected monthly in 2003 and 2004 at two shoals in the Apostle Island Region of Lake Superior. After applying the regression model to convert hydroacoustic backscatter to zooplankton dry weight biomass, we used geostatistics to analyze the mean and variance, and ordinary kriging to create spatial zooplankton distribution maps. The mean zooplankton dry weight biomass estimates from plankton net tows and hydroacoustics were not significantly different (p = 0.19) but the hydroacoustic data had a significantly lower coefficient of variation (p < 0.001). The maps of zooplankton distribution illustrated spatial trends in zooplankton dry weight biomass that were not discernable from the overall means.

  11. Data-driven mapping of the potential mountain permafrost distribution.

    PubMed

    Deluigi, Nicola; Lambiel, Christophe; Kanevski, Mikhail

    2017-07-15

    Existing mountain permafrost distribution models generally offer a good overview of the potential extent of this phenomenon at a regional scale. They are however not always able to reproduce the high spatial discontinuity of permafrost at the micro-scale (scale of a specific landform; ten to several hundreds of meters). To address this limitation, we tested an alternative modelling approach using three classification algorithms belonging to statistics and machine learning: Logistic regression, Support Vector Machines and Random forests. These supervised learning techniques infer a classification function from labelled training data (pixels of permafrost absence and presence) with the aim of predicting the permafrost occurrence where it is unknown. The research was carried out in a 588 km² area of the Western Swiss Alps. Permafrost evidence was mapped from ortho-image interpretation (rock glacier inventorying) and field data (mainly geoelectrical and thermal data). The relationship between selected permafrost evidence and permafrost controlling factors was computed with the mentioned techniques. Classification performances, assessed with AUROC, were 0.81 for Logistic regression, 0.85 for Support Vector Machines and 0.88 for Random forests. The adopted machine learning algorithms proved efficient for permafrost distribution modelling, with results consistent with the field reality. The high resolution of the input dataset (10 m) allows elaborating maps at the micro-scale with a modelled permafrost spatial distribution less optimistic than classic spatial models. Moreover, the probability output of adopted algorithms offers a more precise overview of the potential distribution of mountain permafrost than proposing simple indexes of the permafrost favorability. These encouraging results also open the way to new possibilities of permafrost data analysis and mapping. Copyright © 2017 Elsevier B.V. All rights reserved.

  12. Spatial-temporal trend for mother-to-child transmission of HIV up to infancy and during pre-Option B+ in western Kenya, 2007-13.

    PubMed

    Waruru, Anthony; Achia, Thomas N O; Muttai, Hellen; Ng'ang'a, Lucy; Zielinski-Gutierrez, Emily; Ochanda, Boniface; Katana, Abraham; Young, Peter W; Tobias, James L; Juma, Peter; De Cock, Kevin M; Tylleskär, Thorkild

    2018-01-01

    Using spatial-temporal analyses to understand coverage and trends in elimination of mother-to-child transmission of HIV (e-MTCT) efforts may be helpful in ensuring timely services are delivered to the right place. We present spatial-temporal analysis of seven years of HIV early infant diagnosis (EID) data collected from 12 districts in western Kenya from January 2007 to November 2013, during pre-Option B+ use. We included in the analysis infants up to one year old. We performed trend analysis using extended Cochran-Mantel-Haenszel stratified test and logistic regression models to examine trends and associations of infant HIV status at first diagnosis with: early diagnosis (<8 weeks after birth), age at specimen collection, infant ever having breastfed, use of single dose nevirapine, and maternal antiretroviral therapy status. We examined these covariates and fitted spatial and spatial-temporal semiparametric Poisson regression models to explain HIV-infection rates using the R-INLA (integrated nested Laplace approximation) package. We calculated new infections per 100,000 live births and used Quantum GIS to map fitted MTCT estimates for each district in Nyanza region. Median age was two months, interquartile range 1.5-5.8 months. Unadjusted pooled positive rate was 11.8% in the seven-year period and declined from 19.7% in 2007 to 7.0% in 2013, p < 0.01. Uptake of testing ≤8 weeks after birth was under 50% in 2007 and increased to 64.1% by 2013, p < 0.01. By 2013, the overall standardized MTCT rate was 447 infections per 100,000 live births. Based on Bayesian deviance information criterion comparisons, the spatial-temporal model with maternal and infant covariates was best in explaining geographical variation in MTCT. Improved EID uptake and reduced MTCT rates are indicators of progress towards e-MTCT. Cojoined analysis of time and covariates in a spatial context provides a robust approach for explaining differences in programmatic impact over time.
During this pre-Option B+ period, the prevention of mother-to-child transmission program in this region has not achieved the e-MTCT target of ≤50 infections per 100,000 live births. Geographical disparities in program achievements may signify gaps in spatial distribution of e-MTCT efforts and could indicate areas needing further resources and interventions.

  13. Forecasting urban water demand: A meta-regression analysis.

    PubMed

    Sebri, Maamar

    2016-12-01

    Water managers and planners require accurate water demand forecasts over the short-, medium- and long-term for many purposes. These range from assessing water supply needs over spatial and temporal patterns to optimizing future investments and planning future allocations across competing sectors. This study surveys the empirical literature on the urban water demand forecasting using the meta-analytical approach. Specifically, using more than 600 estimates, a meta-regression analysis is conducted to identify explanations of cross-studies variation in accuracy of urban water demand forecasting. Our study finds that accuracy depends significantly on study characteristics, including demand periodicity, modeling method, forecasting horizon, model specification and sample size. The meta-regression results remain robust to different estimators employed as well as to a series of sensitivity checks performed. The importance of these findings lies in the conclusions and implications drawn out for regulators and policymakers and for academics alike. Copyright © 2016. Published by Elsevier Ltd.

  14. Mapping extreme rainfall in the Northwest Portugal region: statistical analysis and spatial modelling

    NASA Astrophysics Data System (ADS)

    Santos, Monica; Fragoso, Marcelo

    2010-05-01

    Extreme precipitation events are one of the causes of natural hazards, such as floods and landslides, which makes their investigation important; this research aims to contribute to the study of the extreme rainfall patterns in a Portuguese mountainous area. The study area is centred on the Arcos de Valdevez county, located in the northwest region of Portugal, the rainiest of the country, with more than 3000 mm of annual rainfall at the Peneda-Gerês mountain system. This work focuses on two main subjects related to precipitation variability in the study area. First, a statistical analysis of several precipitation parameters is carried out, using daily data from 17 rain-gauges with a complete record for the 1960-1995 period. This approach aims to evaluate the main spatial contrasts regarding different aspects of the rainfall regime, described by ten parameters and indices of precipitation extremes (e.g. mean annual precipitation, the annual frequency of precipitation days, wet spell durations, maximum daily precipitation, maximum of precipitation in 30 days, number of days with rainfall exceeding 100 mm and estimated maximum daily rainfall for a return period of 100 years). The results show that the highest precipitation amounts (from annual to daily scales) and the higher frequency of very abundant rainfall events occur in the Serra da Peneda and Gerês mountains, in contrast to the valleys of the Lima, Minho and Vez rivers, with lower precipitation amounts and less frequent heavy storms. The second purpose of this work is to find a method of mapping extreme rainfall in this mountainous region, investigating the complex influence of the relief (e.g. elevation, topography) on the precipitation patterns, as well as other geographical variables (e.g. distance from coast, latitude), applying tested geo-statistical techniques (Goovaerts, 2000; Diodato, 2005).
Models of linear regression were applied to evaluate the influence of different geographical variables (altitude, latitude, distance from sea and distance to the highest orographic barrier) on the rainfall behaviours described by the studied variables. The techniques of spatial interpolation evaluated include univariate and multivariate methods: cokriging, kriging, IDW (inverse distance weighted) and multiple linear regression. Validation procedures were used, assessing the estimated errors in the analysis of descriptive statistics of the models. Multiple linear regression models produced satisfactory results for 70% of the rainfall parameters, as suggested by a lower average percentage error. However, the results also demonstrate that there is no unique and ideal model; the best choice depends on the rainfall parameter under consideration. The unsatisfactory results obtained for some rainfall parameters were probably caused by constraints such as the spatial complexity of the precipitation patterns and the deficient spatial coverage of the territory by the rain-gauge network. References: Diodato, N. (2005). The influence of topographic co-variables on the spatial variability of precipitation over small regions of complex terrain. International Journal of Climatology, 25(3), 351-363. Goovaerts, P. (2000). Geostatistical approaches for incorporating elevation into the spatial interpolation of rainfall. Journal of Hydrology, 228, 113-129.

  15. Modeling Culex tarsalis abundance on the northern Colorado front range using a landscape-level approach.

    PubMed

    Schurich, Jessica A; Kumar, Sunil; Eisen, Lars; Moore, Chester G

    2014-03-01

    Remote sensing and Geographic Information System (GIS) data can be used to identify larval mosquito habitats and predict species distribution and abundance across a landscape. An understanding of the landscape features that impact abundance and dispersal can then be applied operationally in mosquito control efforts to reduce the transmission of mosquito-borne pathogens. In an effort to better understand the effects of landscape heterogeneity on the abundance of the West Nile virus (WNV) vector Culex tarsalis, we determined associations between GIS-based environmental data at multiple spatial extents and monthly abundance of adult Cx. tarsalis in Larimer County and Weld County, CO. Mosquito data were collected from Centers for Disease Control and Prevention miniature light traps operated as part of local WNV surveillance efforts. Multiple regression models were developed for prediction of monthly Cx. tarsalis abundance for June, July, and August using 4 years of data collected over 2007-10. The models explained monthly adult mosquito abundance with accuracies ranging from 51-61% in Fort Collins and 57-88% in Loveland-Johnstown. Models derived using landscape-level predictors indicated that adult Cx. tarsalis abundance is negatively correlated with elevation. In this case, low-elevation areas likely more abundantly include habitats for Cx. tarsalis. Model output indicated that the perimeter of larval sites is a significant predictor of Cx. tarsalis abundance at a spatial extent of 500 m in Loveland-Johnstown in all months examined. The contribution of irrigated crops at a spatial extent of 500 m improved model fit in August in both Fort Collins and Loveland-Johnstown. These results emphasize the significance of irrigation and the manual control of water across the landscape to provide viable larval habitats for Cx. tarsalis in the study area. Results from multiple regression models can be applied operationally to identify areas of larval Cx. 
tarsalis production (irrigated crops lands and standing water) and assign priority in larval treatments to areas with a high density of larval sites at relevant spatial extents around urban locations.

  16. Spatial models reveal the microclimatic buffering capacity of old-growth forests

    PubMed Central

    Frey, Sarah J. K.; Hadley, Adam S.; Johnson, Sherri L.; Schulze, Mark; Jones, Julia A.; Betts, Matthew G.

    2016-01-01

    Climate change is predicted to cause widespread declines in biodiversity, but these predictions are derived from coarse-resolution climate models applied at global scales. Such models lack the capacity to incorporate microclimate variability, which is critical to biodiversity microrefugia. In forested montane regions, microclimate is thought to be influenced by combined effects of elevation, microtopography, and vegetation, but their relative effects at fine spatial scales are poorly known. We used boosted regression trees to model the spatial distribution of fine-scale, under-canopy air temperatures in mountainous terrain. Spatial models predicted observed independent test data well (r = 0.87). As expected, elevation strongly predicted temperatures, but vegetation and microtopography also exerted critical effects. Old-growth vegetation characteristics, measured using LiDAR (light detection and ranging), appeared to have an insulating effect; maximum spring monthly temperatures decreased by 2.5°C across the observed gradient in old-growth structure. These cooling effects across a gradient in forest structure are of similar magnitude to 50-year forecasts of the Intergovernmental Panel on Climate Change and therefore have the potential to mitigate climate warming at local scales. Management strategies to conserve old-growth characteristics and to curb current rates of primary forest loss could maintain microrefugia, enhancing biodiversity persistence in mountainous systems under climate warming. PMID:27152339

  17. Effects of preference heterogeneity among landowners on spatial conservation prioritization.

    PubMed

    Nielsen, Anne Sofie Elberg; Strange, Niels; Bruun, Hans Henrik; Jacobsen, Jette Bredahl

    2017-06-01

    The participation of private landowners in conservation is crucial to efficient biodiversity conservation. This is especially the case in settings where the share of private ownership is large and the economic costs associated with land acquisition are high. We used probit regression analysis and historical participation data to examine the likelihood of participation of Danish forest owners in a voluntary conservation program. We used the results to spatially predict the likelihood of participation of all forest owners in Denmark. We merged spatial data on the presence of forest, cadastral information on participation contracts, and individual-level socioeconomic information about the forest owners and their households. We included predicted participation in a probability model for species survival. Uninformed and informed (included land owner characteristics) models were then incorporated into a spatial prioritization for conservation of unmanaged forests. The choice models are based on sociodemographic data on the entire population of Danish forest owners and historical data on their participation in conservation schemes. Inclusion in the model of information on private landowners' willingness to supply land for conservation yielded at intermediate budget levels up to 30% more expected species coverage than the uninformed prioritization scheme. Our landowner-choice model provides an example of moving toward more implementable conservation planning. © 2016 Society for Conservation Biology.

  18. Spatial models reveal the microclimatic buffering capacity of old-growth forests.

    PubMed

    Frey, Sarah J K; Hadley, Adam S; Johnson, Sherri L; Schulze, Mark; Jones, Julia A; Betts, Matthew G

    2016-04-01

    Climate change is predicted to cause widespread declines in biodiversity, but these predictions are derived from coarse-resolution climate models applied at global scales. Such models lack the capacity to incorporate microclimate variability, which is critical to biodiversity microrefugia. In forested montane regions, microclimate is thought to be influenced by combined effects of elevation, microtopography, and vegetation, but their relative effects at fine spatial scales are poorly known. We used boosted regression trees to model the spatial distribution of fine-scale, under-canopy air temperatures in mountainous terrain. Spatial models predicted observed independent test data well (r = 0.87). As expected, elevation strongly predicted temperatures, but vegetation and microtopography also exerted critical effects. Old-growth vegetation characteristics, measured using LiDAR (light detection and ranging), appeared to have an insulating effect; maximum spring monthly temperatures decreased by 2.5°C across the observed gradient in old-growth structure. These cooling effects across a gradient in forest structure are of similar magnitude to 50-year forecasts of the Intergovernmental Panel on Climate Change and therefore have the potential to mitigate climate warming at local scales. Management strategies to conserve old-growth characteristics and to curb current rates of primary forest loss could maintain microrefugia, enhancing biodiversity persistence in mountainous systems under climate warming.

  19. Remote sensing of impervious surface growth: A framework for quantifying urban expansion and re-densification mechanisms

    NASA Astrophysics Data System (ADS)

    Shahtahmassebi, Amir Reza; Song, Jie; Zheng, Qing; Blackburn, George Alan; Wang, Ke; Huang, Ling Yan; Pan, Yi; Moore, Nathan; Shahtahmassebi, Golnaz; Sadrabadi Haghighi, Reza; Deng, Jing Song

    2016-04-01

    A substantial body of literature has accumulated on the topic of using remotely sensed data to map impervious surfaces which are widely recognized as an important indicator of urbanization. However, the remote sensing of impervious surface growth has not been successfully addressed. This study proposes a new framework for deriving and summarizing urban expansion and re-densification using time series of impervious surface fractions (ISFs) derived from remotely sensed imagery. This approach integrates multiple endmember spectral mixture analysis (MESMA), analysis of regression residuals, spatial statistics (Getis_Ord) and urban growth theories; hence, the framework is abbreviated as MRGU. The performance of MRGU was compared with commonly used change detection techniques in order to evaluate the effectiveness of the approach. The results suggested that the ISF regression residuals were optimal for detecting impervious surface changes while Getis_Ord was effective for mapping hotspot regions in the regression residuals image. Moreover, the MRGU outputs agreed with the mechanisms proposed in several existing urban growth theories, but importantly the outputs enable the refinement of such models by explicitly accounting for the spatial distribution of both expansion and re-densification mechanisms. Based on Landsat data, the MRGU is somewhat restricted in its ability to measure re-densification in the urban core but this may be improved through the use of higher spatial resolution satellite imagery. The paper ends with an assessment of the present gaps in remote sensing of impervious surface growth and suggests some solutions. The application of impervious surface fractions in urban change detection is a stimulating new research idea which is driving future research with new models and algorithms.

  20. Mapping the spatial distribution and time evolution of snow water equivalent with passive microwave measurements

    USGS Publications Warehouse

    Guo, J.; Tsang, L.; Josberger, E.G.; Wood, A.W.; Hwang, J.-N.; Lettenmaier, D.P.

    2003-01-01

    This paper presents an algorithm that estimates the spatial distribution and temporal evolution of snow water equivalent and snow depth based on passive remote sensing measurements. It combines the inversion of passive microwave remote sensing measurements via dense media radiative transfer modeling results with snow accumulation and melt model predictions to yield improved estimates of snow depth and snow water equivalent, at a pixel resolution of 5 arc-min. In the inversion, snow grain size evolution is constrained based on pattern matching by using the local snow temperature history. This algorithm is applied to produce spatial snow maps of Upper Rio Grande River basin in Colorado. The simulation results are compared with that of the snow accumulation and melt model and a linear regression method. The quantitative comparison with the ground truth measurements from four Snowpack Telemetry (SNOTEL) sites in the basin shows that this algorithm is able to improve the estimation of snow parameters.

  1. Source apportionment of soil heavy metals using robust absolute principal component scores-robust geographically weighted regression (RAPCS-RGWR) receptor model.

    PubMed

    Qu, Mingkai; Wang, Yan; Huang, Biao; Zhao, Yongcun

    2018-06-01

    The traditional source apportionment models, such as absolute principal component scores-multiple linear regression (APCS-MLR), are usually susceptible to outliers, which may be widely present in the regional geochemical dataset. Furthermore, the models are merely built on variable space instead of geographical space and thus cannot effectively capture the local spatial characteristics of each source contributions. To overcome the limitations, a new receptor model, robust absolute principal component scores-robust geographically weighted regression (RAPCS-RGWR), was proposed based on the traditional APCS-MLR model. Then, the new method was applied to the source apportionment of soil metal elements in a region of Wuhan City, China as a case study. Evaluations revealed that: (i) RAPCS-RGWR model had better performance than APCS-MLR model in the identification of the major sources of soil metal elements, and (ii) source contributions estimated by RAPCS-RGWR model were more close to the true soil metal concentrations than that estimated by APCS-MLR model. It is shown that the proposed RAPCS-RGWR model is a more effective source apportionment method than APCS-MLR (i.e., non-robust and global model) in dealing with the regional geochemical dataset. Copyright © 2018 Elsevier B.V. All rights reserved.

  2. Jobs and the resource curse in the sun: The effects of oil production on female labor force participation in California counties from 1980-2010

    NASA Astrophysics Data System (ADS)

    Zavala, Gabriel

    This study aims to evaluate the relationship between oil income and the female labor force participation rate in California for the years of 1980, 1990, 2000 and 2010 using panel linear regression models. This study also aims to visualize the spatial patterns of both variables in California through Hot Spot analysis at the county level for the same years. The regression found no sign of a relationship between oil income and female labor force participation rate but did find evidence of a positive relationship between two income control variables and the female labor force participation rate. The hot spot analysis also found that female labor force participation cold spots are not spatially correlated with oil production hot spots. These findings contribute new methodologies at a finer scale to the very nuanced discussion of the resource curse in the United States.

  3. Modeling habitat and environmental factors affecting mosquito abundance in Chesapeake, Virginia

    NASA Astrophysics Data System (ADS)

    Bellows, Alan Scott

    The models I present in this dissertation were designed to enable mosquito control agencies in the mid-Atlantic region that oversee large jurisdictions to rapidly track the spatial and temporal distributions of mosquito species, especially those species known to be vectors of eastern equine encephalitis and West Nile virus. I was able to keep these models streamlined, user-friendly, and not cost-prohibitive using empirically based digital data to analyze mosquito-abundance patterns in real landscapes. This research is presented in three major chapters: (II) a series of semi-static habitat suitability indices (HSI) grounded on well-documented associations between mosquito abundance and environmental variables, (III) a dynamic model for predicting both spatial and temporal mosquito abundance based on a topographic soil moisture index and recent weather patterns, and (IV) a set of protocols laid out to aid mosquito control agencies for the use of these models. The HSIs (Chapter II) were based on relationships of mosquitoes to digital surrogates of soil moisture and vegetation characteristics. These models grouped mosquitoes species derived from similarities in habitat requirements, life-cycle type, and vector competence. Quantification of relationships was determined using multiple linear regression models. As in Chapter II, relationships between mosquito abundance and environmental factors in Chapter III were quantified using regression models. However, because this model was, in part, a function of changes in weather patterns, it enables the prediction of both 'where' and 'when' mosquito outbreaks are likely to occur. This model is distinctive among similar studies in the literature because of my use of NOAA's NEXRAD Doppler radar (3-hr precipitation accumulation data) to quantify the spatial and temporal distributions in precipitation accumulation. 
    Chapter IV is unique among the chapters in this dissertation because, in lieu of presenting new research, it summarizes the preprocessing steps and analyses used in the HSIs and the dynamic, weather-based model generated in Chapters II and III. The purpose of this chapter is to provide the reader and potential users with the necessary protocols for modeling the spatial and temporal abundances and distributions of mosquitoes, with emphasis on Culiseta melanura, in a real-world landscape of the mid-Atlantic region. This chapter also provides enhancements that could easily be incorporated into an environmentally sensitive integrated pest management program.

  4. Peak-flow characteristics of Virginia streams

    USGS Publications Warehouse

    Austin, Samuel H.; Krstolic, Jennifer L.; Wiegand, Ute

    2011-01-01

    Peak-flow annual exceedance probabilities, also called probability-percent chance flow estimates, and regional regression equations are provided describing the peak-flow characteristics of Virginia streams. Statistical methods are used to evaluate peak-flow data. Analysis of Virginia peak-flow data collected from 1895 through 2007 is summarized. Methods are provided for estimating unregulated peak flow of gaged and ungaged streams. Station peak-flow characteristics identified by fitting the logarithms of annual peak flows to a Log Pearson Type III frequency distribution yield annual exceedance probabilities of 0.5, 0.4292, 0.2, 0.1, 0.04, 0.02, 0.01, 0.005, and 0.002 for 476 streamgaging stations. Stream basin characteristics computed using spatial data and a geographic information system are used as explanatory variables in regional regression model equations for six physiographic regions to estimate regional annual exceedance probabilities at gaged and ungaged sites. Weighted peak-flow values that combine annual exceedance probabilities computed from gaging station data and from regional regression equations provide improved peak-flow estimates. Text, figures, and lists are provided summarizing selected peak-flow sites, delineated physiographic regions, peak-flow estimates, basin characteristics, regional regression model equations, error estimates, definitions, data sources, and candidate regression model equations. This study supersedes previous studies of peak flows in Virginia.
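    The weighting step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the two estimates are combined in log10 space with weights inversely proportional to their variances, which is a common convention for this kind of weighting; the abstract does not state the exact formula used in the report. The function name and the variance inputs are hypothetical.

```python
import math


def weighted_peak_flow(q_station, var_station, q_regression, var_regression):
    """Combine two peak-flow estimates for the same exceedance probability.

    Assumption (not taken from the abstract): both variances are expressed
    in log10 units and the estimates are treated as independent, so each
    estimate is weighted by the variance of the other one.
    """
    log_station = math.log10(q_station)
    log_regression = math.log10(q_regression)
    log_weighted = (
        log_station * var_regression + log_regression * var_station
    ) / (var_station + var_regression)
    return 10 ** log_weighted
```

    With equal variances the result is the geometric mean of the two estimates; as the station variance shrinks (a long gaging record), the weighted value moves toward the station estimate.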

  5. Modeling the Land Use/Cover Change in an Arid Region Oasis City Constrained by Water Resource and Environmental Policy Change using Cellular Automata Model

    NASA Astrophysics Data System (ADS)

    Hu, X.; Li, X.; Lu, L.

    2017-12-01

    Land use/cover change (LUCC) is an important subject in research on global environmental change and sustainable development. Spatial simulation of land use/cover change is a key component of LUCC research, and it is difficult because of the complexity of the system. The cellular automata (CA) model plays an irreplaceable role in simulating land use/cover change processes because of its powerful spatial computing capability. However, the majority of current CA land use/cover models are binary-state models that cannot provide more general information about the overall spatial pattern of land use/cover change. Here, a multi-state logistic-regression-based Markov cellular automata (MLRMCA) model and a multi-state artificial-neural-network-based Markov cellular automata (MANNMCA) model were developed and used to simulate the complex land use/cover evolutionary process in an arid-region oasis city constrained by water resources and environmental policy change, Zhangye city, during the period 1990-2010. The results indicated that the MANNMCA model was superior to the MLRMCA model in simulation accuracy, suggesting that combining the artificial neural network with CA can more effectively capture the complex relationships between land use/cover change and a set of spatial variables. Although the MLRMCA model also had some advantages, the MANNMCA model was more appropriate for simulating complex land use/cover dynamics. The two proposed models were effective and reliable, and could reflect the spatial evolution of regional land use/cover changes. These results also have potential implications for the impact assessment of water resources, ecological restoration, and sustainable urban development in arid areas.

  6. Comparing spatial regression to random forests for large environmental data sets

    EPA Science Inventory

    Environmental data may be “large” due to number of records, number of covariates, or both. Random forests has a reputation for good predictive performance when using many covariates, whereas spatial regression, when using reduced rank methods, has a reputatio...

  7. Spatial Durbin model analysis macroeconomic loss due to natural disasters

    NASA Astrophysics Data System (ADS)

    Kusrini, D. E.; Mukhtasor

    2015-03-01

    The damage and losses caused by natural disasters in Indonesia are very large. This study therefore aimed to analyze the effects of natural disasters on macroeconomic losses in 115 cities/districts across Java during 2012. Previous studies suggest that spatial dependence is present in this setting, so the analysis uses a spatial regression approach, namely the Spatial Durbin Model (SDM). The significant predictor variable is population, and the predictor variable with a significant spatial weighting is the number of disaster occurrences, i.e., disasters in one region have an impact on neighboring regions. Moran's I computed with Queen contiguity weights also showed significant results, meaning that the incidence of disasters in one region will decrease the value of GDP in other regions.
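    As an illustration of the Moran's I statistic with Queen contiguity weights named in this abstract, the following is a minimal sketch on a regular grid. The grid layout and the function names are assumptions made for the example; the study's districts are irregular polygons, so a real analysis would derive contiguity from district boundaries. The Spatial Durbin Model itself is not shown.

```python
def queen_neighbors(rows, cols):
    """Map each grid cell index to the cells sharing an edge or a corner."""
    neighbors = {}
    for r in range(rows):
        for c in range(cols):
            neighbors[r * cols + c] = [
                (r + dr) * cols + (c + dc)
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
                and 0 <= r + dr < rows
                and 0 <= c + dc < cols
            ]
    return neighbors


def morans_i(values, neighbors):
    """Global Moran's I with binary (not row-standardized) weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(len(js) for js in neighbors.values())
    cross = sum(dev[i] * dev[j] for i, js in neighbors.items() for j in js)
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * (cross / variance_sum)
```

    Values that cluster in space give a positive statistic, while values that alternate between adjacent columns give a negative one. Judging significance, as the abstract reports, would additionally require a permutation or normal-approximation test, which is omitted here.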

  8. Advances in Parameter and Uncertainty Quantification Using Bayesian Hierarchical Techniques with a Spatially Referenced Watershed Model (Invited)

    NASA Astrophysics Data System (ADS)

    Alexander, R. B.; Boyer, E. W.; Schwarz, G. E.; Smith, R. A.

    2013-12-01

    Estimating water and material stores and fluxes in watershed studies is frequently complicated by uncertainties in quantifying hydrological and biogeochemical effects of factors such as land use, soils, and climate. Although these process-related effects are commonly measured and modeled in separate catchments, researchers are especially challenged by their complexity across catchments and diverse environmental settings, leading to a poor understanding of how model parameters and prediction uncertainties vary spatially. To address these concerns, we illustrate the use of Bayesian hierarchical modeling techniques with a dynamic version of the spatially referenced watershed model SPARROW (SPAtially Referenced Regression On Watershed attributes). The dynamic SPARROW model is designed to predict streamflow and other water cycle components (e.g., evapotranspiration, soil and groundwater storage) for monthly varying hydrological regimes, using mechanistic functions, mass conservation constraints, and statistically estimated parameters. In this application, the model domain includes nearly 30,000 NHD (National Hydrologic Data) stream reaches and their associated catchments in the Susquehanna River Basin. We report the results of our comparisons of alternative models of varying complexity, including models with different explanatory variables as well as hierarchical models that account for spatial and temporal variability in model parameters and variance (error) components. The model errors are evaluated for changes with season and catchment size and correlations in time and space. The hierarchical models consist of a two-tiered structure in which climate forcing parameters are modeled as random variables, conditioned on watershed properties. Quantification of spatial and temporal variations in the hydrological parameters and model uncertainties in this approach leads to more efficient (lower variance) and less biased model predictions throughout the river network. 
    Moreover, predictions of water-balance components are reported according to probabilistic metrics (e.g., percentiles, prediction intervals) that include both parameter and model uncertainties. These improvements in predictions of streamflow dynamics can inform the development of more accurate predictions of spatial and temporal variations in biogeochemical stores and fluxes (e.g., nutrients and carbon) in watersheds.

  9. Comparison of spatial association approaches for landscape mapping of soil organic carbon stocks

    NASA Astrophysics Data System (ADS)

    Miller, B. A.; Koszinski, S.; Wehrhan, M.; Sommer, M.

    2015-03-01

    The distribution of soil organic carbon (SOC) can be variable at small analysis scales, but consideration of its role in regional and global issues demands the mapping of large extents. There are many different strategies for mapping SOC, among which is to model the variables needed to calculate the SOC stock indirectly or to model the SOC stock directly. The purpose of this research is to compare direct and indirect approaches to mapping SOC stocks from rule-based, multiple linear regression models applied at the landscape scale via spatial association. The final products for both strategies are high-resolution maps of SOC stocks (kg m-2), covering an area of 122 km2, with accompanying maps of estimated error. For the direct modelling approach, the estimated error map was based on the internal error estimations from the model rules. For the indirect approach, the estimated error map was produced by spatially combining the error estimates of component models via standard error propagation equations. We compared these two strategies for mapping SOC stocks on the basis of the qualities of the resulting maps as well as the magnitude and distribution of the estimated error. The direct approach produced a map with less spatial variation than the map produced by the indirect approach. The increased spatial variation represented by the indirect approach improved R2 values for the topsoil and subsoil stocks. Although the indirect approach had a lower mean estimated error for the topsoil stock, the mean estimated error for the total SOC stock (topsoil + subsoil) was lower for the direct approach. For these reasons, we recommend the direct approach to modelling SOC stocks be considered a more conservative estimate of the SOC stocks' spatial distribution.

  10. Comparison of spatial association approaches for landscape mapping of soil organic carbon stocks

    NASA Astrophysics Data System (ADS)

    Miller, B. A.; Koszinski, S.; Wehrhan, M.; Sommer, M.

    2014-11-01

    The distribution of soil organic carbon (SOC) can be variable at small analysis scales, but consideration of its role in regional and global issues demands the mapping of large extents. There are many different strategies for mapping SOC, among which are to model the variables needed to calculate the SOC stock indirectly or to model the SOC stock directly. The purpose of this research is to compare direct and indirect approaches to mapping SOC stocks from rule-based, multiple linear regression models applied at the landscape scale via spatial association. The final products for both strategies are high-resolution maps of SOC stocks (kg m-2), covering an area of 122 km2, with accompanying maps of estimated error. For the direct modelling approach, the estimated error map was based on the internal error estimations from the model rules. For the indirect approach, the estimated error map was produced by spatially combining the error estimates of component models via standard error propagation equations. We compared these two strategies for mapping SOC stocks on the basis of the qualities of the resulting maps as well as the magnitude and distribution of the estimated error. The direct approach produced a map with less spatial variation than the map produced by the indirect approach. The increased spatial variation represented by the indirect approach improved R2 values for the topsoil and subsoil stocks. Although the indirect approach had a lower mean estimated error for the topsoil stock, the mean estimated error for the total SOC stock (topsoil + subsoil) was lower for the direct approach. For these reasons, we recommend the direct approach to modelling SOC stocks be considered a more conservative estimate of the SOC stocks' spatial distribution.

  11. Description and spatial inference of soil drainage using matrix soil colours in the Lower Hunter Valley, New South Wales, Australia

    PubMed Central

    McBratney, Alex B.; Minasny, Budiman

    2018-01-01

    Soil colour is often used as a general purpose indicator of internal soil drainage. In this study we developed a necessarily simple model of soil drainage which combines the tacit knowledge of the soil surveyor with observed matrix soil colour descriptions. From built up knowledge of the soils in our Lower Hunter Valley, New South Wales study area, the sequence of well-draining → imperfectly draining → poorly draining soils generally follows the colour sequence of red → brown → yellow → grey → black soil matrix colours. For each soil profile, soil drainage is estimated somewhere on a continuous index of between 5 (very well drained) and 1 (very poorly drained) based on the proximity or similarity to reference soil colours of the soil drainage colour sequence. The estimation of drainage index at each profile incorporates the whole-profile descriptions of soil colour where necessary, and is weighted such that observation of soil colour at depth and/or dominantly observed horizons are given more preference than observations near the soil surface. The soil drainage index, by definition disregards surficial soil horizons and consolidated and semi-consolidated parent materials. With the view to understanding the spatial distribution of soil drainage we digitally mapped the index across our study area. Spatial inference of the drainage index was made using Cubist regression tree model combined with residual kriging. Environmental covariates for deterministic inference were principally terrain variables derived from a digital elevation model. Pearson’s correlation coefficients indicated the variables most strongly correlated with soil drainage were topographic wetness index (−0.34), mid-slope position (−0.29), multi-resolution valley bottom flatness index (−0.29) and vertical distance to channel network (VDCN) (0.26). From the regression tree modelling, two linear models of soil drainage were derived. 
    The partitioning of models was based upon threshold criteria of VDCN. Validation of the regression kriging model using a withheld dataset resulted in a root mean square error of 0.90 soil drainage index units. Concordance between observations and predictions was 0.49. Given the scale of mapping, and inherent subjectivity of soil colour description, these results are acceptable. Furthermore, the spatial distribution of soil drainage predicted in our study area is attuned with our mental model developed over successive field surveys. Our approach, while exclusively calibrated for the conditions observed in our study area, can be generalised once the unique soil colour and soil drainage relationship is expertly defined for an area or region in question. With such rules established, the quantitative components of the method would remain unchanged. PMID:29682425
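    The two validation measures reported in this abstract can be sketched as follows. The sketch assumes that "concordance" refers to Lin's concordance correlation coefficient, which the abstract does not state explicitly; the function names are hypothetical, and the inputs stand for withheld observations and the corresponding model predictions.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def lins_concordance(observed, predicted):
    """Lin's concordance correlation coefficient (population moments)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / n
    return 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)
```

    Unlike Pearson's correlation, this coefficient penalizes systematic bias: predictions that are perfectly correlated with the observations but shifted by a constant still score below 1.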

  12. Estimation of Total Nitrogen and Phosphorus in New England Streams Using Spatially Referenced Regression Models

    USGS Publications Warehouse

    Moore, Richard Bridge; Johnston, Craig M.; Robinson, Keith W.; Deacon, Jeffrey R.

    2004-01-01

    The U.S. Geological Survey (USGS), in cooperation with the U.S. Environmental Protection Agency (USEPA) and the New England Interstate Water Pollution Control Commission (NEIWPCC), has developed a water-quality model, called SPARROW (Spatially Referenced Regressions on Watershed Attributes), to assist in regional total maximum daily load (TMDL) and nutrient-criteria activities in New England. SPARROW is a spatially detailed, statistical model that uses regression equations to relate total nitrogen and phosphorus (nutrient) stream loads to nutrient sources and watershed characteristics. The statistical relations in these equations are then used to predict nutrient loads in unmonitored streams. The New England SPARROW models are built using a hydrologic network of 42,000 stream reaches and associated watersheds. Watershed boundaries are defined for each stream reach in the network through the use of a digital elevation model and existing digitized watershed divides. Nutrient source data is from permitted wastewater discharge data from USEPA's Permit Compliance System (PCS), various land-use sources, and atmospheric deposition. Physical watershed characteristics include drainage area, land use, streamflow, time-of-travel, stream density, percent wetlands, slope of the land surface, and soil permeability. The New England SPARROW models for total nitrogen and total phosphorus have R-squared values of 0.95 and 0.94, with mean square errors of 0.16 and 0.23, respectively. Variables that were statistically significant in the total nitrogen model include permitted municipal-wastewater discharges, atmospheric deposition, agricultural area, and developed land area. Total nitrogen stream-loss rates were significant only in streams with average annual flows less than or equal to 2.83 cubic meters per second. In streams larger than this, there is nondetectable in-stream loss of annual total nitrogen in New England. 
    Variables that were statistically significant in the total phosphorus model include discharges for municipal wastewater-treatment facilities and pulp and paper facilities, developed land area, agricultural area, and forested area. For total phosphorus, loss rates were significant for reservoirs with surface areas of 10 square kilometers or less, and in streams with flows less than or equal to 2.83 cubic meters per second. Applications of SPARROW for evaluating nutrient loading in New England waters include estimates of the spatial distributions of total nitrogen and phosphorus yields, sources of the nutrients, and the potential for delivery of those yields to receiving waters. This information can be used to (1) predict ranges in nutrient levels in surface waters, (2) identify the environmental variables that are statistically significant predictors of nutrient levels in streams, (3) evaluate monitoring efforts for better determination of nutrient loads, and (4) evaluate management options for reducing nutrient loads to achieve water-quality goals.

  13. Description and spatial inference of soil drainage using matrix soil colours in the Lower Hunter Valley, New South Wales, Australia.

    PubMed

    Malone, Brendan P; McBratney, Alex B; Minasny, Budiman

    2018-01-01

    Soil colour is often used as a general purpose indicator of internal soil drainage. In this study we developed a necessarily simple model of soil drainage which combines the tacit knowledge of the soil surveyor with observed matrix soil colour descriptions. From built up knowledge of the soils in our Lower Hunter Valley, New South Wales study area, the sequence of well-draining → imperfectly draining → poorly draining soils generally follows the colour sequence of red → brown → yellow → grey → black soil matrix colours. For each soil profile, soil drainage is estimated somewhere on a continuous index of between 5 (very well drained) and 1 (very poorly drained) based on the proximity or similarity to reference soil colours of the soil drainage colour sequence. The estimation of drainage index at each profile incorporates the whole-profile descriptions of soil colour where necessary, and is weighted such that observation of soil colour at depth and/or dominantly observed horizons are given more preference than observations near the soil surface. The soil drainage index, by definition disregards surficial soil horizons and consolidated and semi-consolidated parent materials. With the view to understanding the spatial distribution of soil drainage we digitally mapped the index across our study area. Spatial inference of the drainage index was made using Cubist regression tree model combined with residual kriging. Environmental covariates for deterministic inference were principally terrain variables derived from a digital elevation model. Pearson's correlation coefficients indicated the variables most strongly correlated with soil drainage were topographic wetness index (-0.34), mid-slope position (-0.29), multi-resolution valley bottom flatness index (-0.29) and vertical distance to channel network (VDCN) (0.26). From the regression tree modelling, two linear models of soil drainage were derived. 
    The partitioning of models was based upon threshold criteria of VDCN. Validation of the regression kriging model using a withheld dataset resulted in a root mean square error of 0.90 soil drainage index units. Concordance between observations and predictions was 0.49. Given the scale of mapping, and inherent subjectivity of soil colour description, these results are acceptable. Furthermore, the spatial distribution of soil drainage predicted in our study area is attuned with our mental model developed over successive field surveys. Our approach, while exclusively calibrated for the conditions observed in our study area, can be generalised once the unique soil colour and soil drainage relationship is expertly defined for an area or region in question. With such rules established, the quantitative components of the method would remain unchanged.

  14. Estimating top-of-atmosphere thermal infrared radiance using MERRA-2 atmospheric data

    NASA Astrophysics Data System (ADS)

    Kleynhans, Tania; Montanaro, Matthew; Gerace, Aaron; Kanan, Christopher

    2017-05-01

    Thermal infrared satellite images have been widely used in environmental studies. However, satellites have limited temporal resolution, e.g., 16-day Landsat or 1- to 2-day Terra MODIS. This paper investigates the use of the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) reanalysis data product, produced by NASA's Global Modeling and Assimilation Office (GMAO), to predict global top-of-atmosphere (TOA) thermal infrared radiance. The high temporal resolution of the MERRA-2 data product presents opportunities for novel research and applications. Various methods were applied to estimate TOA radiance from MERRA-2 variables, namely (1) a parameterized physics-based method, (2) linear regression models, and (3) non-linear Support Vector Regression. Model prediction accuracy was evaluated using temporally and spatially coincident Moderate Resolution Imaging Spectroradiometer (MODIS) thermal infrared data as reference data. This research found that Support Vector Regression with a radial basis function kernel produced the lowest error rates. Sources of errors are discussed and defined. Further research is currently being conducted to train deep learning models to predict TOA thermal radiance.

  15. Visual-spatial abilities relate to mathematics achievement in children with heavy prenatal alcohol exposure

    PubMed Central

    Crocker, N.; Riley, E.P.; Mattson, S.N.

    2014-01-01

    Objective: The current study examined the relationship between mathematics and attention, working memory, and visual memory in children with heavy prenatal alcohol exposure and controls. Method: Fifty-six children (29 AE, 27 CON) were administered measures of global mathematics achievement (WRAT-3 Arithmetic & WISC-III Written Arithmetic), attention (WISC-III Digit Span forward and Spatial Span forward), working memory (WISC-III Digit Span backward and Spatial Span backward), and visual memory (CANTAB Spatial Recognition Memory and Pattern Recognition Memory). The contribution of cognitive domains to mathematics achievement was analyzed using linear regression techniques. Attention, working memory and visual memory data were entered together on step 1 followed by group on step 2, and the interaction terms on step 3. Results: Model 1 accounted for a significant amount of variance in both mathematics achievement measures; however, model fit improved with the addition of group on step 2. Significant predictors of mathematics achievement were Spatial Span forward and backward and Spatial Recognition Memory. Conclusions: These findings suggest that deficits in spatial processing may be related to math impairments seen in FASD. In addition, prenatal alcohol exposure was associated with deficits in mathematics achievement, above and beyond the contribution of general cognitive abilities. PMID:25000323

  16. Visual-spatial abilities relate to mathematics achievement in children with heavy prenatal alcohol exposure.

    PubMed

    Crocker, Nicole; Riley, Edward P; Mattson, Sarah N

    2015-01-01

    The current study examined the relationship between mathematics and attention, working memory, and visual memory in children with heavy prenatal alcohol exposure and controls. Subjects were 56 children (29 AE, 27 CON) who were administered measures of global mathematics achievement (WRAT-3 Arithmetic & WISC-III Written Arithmetic), attention (WISC-III Digit Span forward and Spatial Span forward), working memory (WISC-III Digit Span backward and Spatial Span backward), and visual memory (CANTAB Spatial Recognition Memory and Pattern Recognition Memory). The contribution of cognitive domains to mathematics achievement was analyzed using linear regression techniques. Attention, working memory, and visual memory data were entered together on Step 1 followed by group on Step 2, and the interaction terms on Step 3. Model 1 accounted for a significant amount of variance in both mathematics achievement measures; however, model fit improved with the addition of group on Step 2. Significant predictors of mathematics achievement were Spatial Span forward and backward and Spatial Recognition Memory. These findings suggest that deficits in spatial processing may be related to math impairments seen in FASD. In addition, prenatal alcohol exposure was associated with deficits in mathematics achievement, above and beyond the contribution of general cognitive abilities. PsycINFO Database Record (c) 2015 APA, all rights reserved.

  17. Identifying Flood-Related Infectious Diseases in Anhui Province, China: A Spatial and Temporal Analysis

    PubMed Central

    Gao, Lu; Zhang, Ying; Ding, Guoyong; Liu, Qiyong; Jiang, Baofa

    2016-01-01

    The aim of this study was to explore infectious diseases related to the 2007 Huai River flood in Anhui Province, China. The study was based on the notified incidences of infectious diseases between June 29 and July 25 from 2004 to 2011. Daily incidences of notified diseases in 2007 were compared with the corresponding daily incidences during the same period in the other years (from 2004 to 2011, except 2007) by Poisson regression analysis. Spatial autocorrelation analysis was used to test the distribution pattern of the diseases. Spatial regression models were then performed to examine the association between the incidence of each disease and flood, considering lag effects and other confounders. After controlling for the other meteorological and socioeconomic factors, malaria (odds ratio [OR] = 3.67, 95% confidence interval [CI] = 1.77–7.61), diarrhea (OR = 2.16, 95% CI = 1.24–3.78), and hepatitis A virus (HAV) infection (OR = 6.11, 95% CI = 1.04–35.84) were significantly related to the 2007 Huai River flood in both the spatial and temporal analyses. Special attention should be given to developing public health preparation and interventions with a focus on malaria, diarrhea, and HAV infection in the study region. PMID:26903612
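
The reported odds ratios and confidence intervals can be sanity-checked on the log scale. The following is a minimal sketch, assuming the intervals are Wald-type (symmetric around ln(OR)), which the abstract does not state; the function name `log_scale_summary` and the z value of 1.96 are illustrative choices, not taken from the study.

```python
import math

def log_scale_summary(odds_ratio, ci_low, ci_high, z=1.96):
    """Return (ln OR, midpoint of the log CI, implied standard error).

    Assumes a Wald-type interval: ln(OR) +/- z * SE. Under that assumption
    the midpoint of the log-scale interval should match ln(OR).
    """
    log_or = math.log(odds_ratio)
    log_mid = (math.log(ci_low) + math.log(ci_high)) / 2
    std_err = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return log_or, log_mid, std_err

# Values quoted in the abstract above.
REPORTED = {
    "malaria": (3.67, 1.77, 7.61),
    "diarrhea": (2.16, 1.24, 3.78),
    "HAV infection": (6.11, 1.04, 35.84),
}

if __name__ == "__main__":
    for disease, (or_, low, high) in REPORTED.items():
        log_or, log_mid, se = log_scale_summary(or_, low, high)
        print(f"{disease}: ln(OR)={log_or:.3f}, CI midpoint={log_mid:.3f}, SE={se:.3f}")
```

A wide interval such as the one for HAV infection corresponds to a large implied standard error on the log scale, which is why its lower bound sits close to 1 despite the large point estimate.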

  18. Comparing methods of measuring geographic patterns in temporal trends: an application to county-level heart disease mortality in the United States, 1973 to 2010.

    PubMed

    Vaughan, Adam S; Kramer, Michael R; Waller, Lance A; Schieb, Linda J; Greer, Sophia; Casper, Michele

    2015-05-01

    To demonstrate the implications of choosing analytical methods for quantifying spatiotemporal trends, we compare the assumptions, implementation, and outcomes of popular methods using county-level heart disease mortality in the United States between 1973 and 2010. We applied four regression-based approaches (joinpoint regression, both aspatial and spatial generalized linear mixed models, and Bayesian space-time model) and compared resulting inferences for geographic patterns of local estimates of annual percent change and associated uncertainty. The average local percent change in heart disease mortality from each method was -4.5%, with the Bayesian model having the smallest range of values. The associated uncertainty in percent change differed markedly across the methods, with the Bayesian space-time model producing the narrowest range of variance (0.0-0.8). The geographic pattern of percent change was consistent across methods with smaller declines in the South Central United States and larger declines in the Northeast and Midwest. However, the geographic patterns of uncertainty differed markedly between methods. The similarity of results, including geographic patterns, for magnitude of percent change across these methods validates the underlying spatial pattern of declines in heart disease mortality. However, marked differences in degree of uncertainty indicate that Bayesian modeling offers substantially more precise estimates. Copyright © 2015 Elsevier Inc. All rights reserved.

  19. [Prediction of soil nutrients spatial distribution based on neural network model combined with geostatistics].

    PubMed

    Li, Qi-Quan; Wang, Chang-Quan; Zhang, Wen-Jiang; Yu, Yong; Li, Bing; Yang, Juan; Bai, Gen-Chuan; Cai, Yan

    2013-02-01

    In this study, a radial basis function neural network model combined with ordinary kriging (RBFNN_OK) was adopted to predict the spatial distribution of soil nutrients (organic matter and total N) in a typical hilly region of Sichuan Basin, Southwest China, and the performance of this method was compared with that of ordinary kriging (OK) and regression kriging (RK). All three methods produced similar soil nutrient maps. However, as compared with those obtained by the multiple linear regression model, the correlation coefficients between the measured values and the predicted values of soil organic matter and total N obtained by the neural network model increased by 12.3% and 16.5%, respectively, suggesting that the neural network model could more accurately capture the complicated relationships between soil nutrients and quantitative environmental factors. The error analyses of the prediction values of 469 validation points indicated that the mean absolute error (MAE), mean relative error (MRE), and root mean squared error (RMSE) of RBFNN_OK were 6.9%, 7.4%, and 5.1% (for soil organic matter), and 4.9%, 6.1%, and 4.6% (for soil total N) smaller than those of OK (P<0.01), and 2.4%, 2.6%, and 1.8% (for soil organic matter), and 2.1%, 2.8%, and 2.2% (for soil total N) smaller than those of RK, respectively (P<0.05).

  20. An integrated fiber-optic probe combined with support vector regression for fast estimation of optical properties of turbid media.

    PubMed

    Zhou, Yang; Fu, Xiaping; Ying, Yibin; Fang, Zhenhuan

    2015-06-23

    A fiber-optic probe system was developed to estimate the optical properties of turbid media based on spatially resolved diffuse reflectance. Because of the limitations in numerical calculation of radiative transfer equation (RTE), diffusion approximation (DA) and Monte Carlo simulations (MC), support vector regression (SVR) was introduced to model the relationship between diffuse reflectance values and optical properties. The SVR models of four collection fibers were trained by phantoms in the calibration set with a wide range of optical properties which represented products of different applications, then the optical properties of phantoms in the prediction set were predicted after an optimal searching on SVR models. The results indicated that the SVR model was capable of describing the relationship with little deviation in forward validation. The correlation coefficients (R) of reduced scattering coefficient μ'(s) and absorption coefficient μ(a) in the prediction set were 0.9907 and 0.9980, respectively. The root mean square errors of prediction (RMSEP) of μ'(s) and μ(a) in inverse validation were 0.411 cm(-1) and 0.338 cm(-1), respectively. The results indicated that the integrated fiber-optic probe system combined with the SVR model was suitable for fast and accurate estimation of optical properties of turbid media based on spatially resolved diffuse reflectance. Copyright © 2015 Elsevier B.V. All rights reserved.

  1. Modelling Ecuador's rainfall distribution according to geographical characteristics.

    NASA Astrophysics Data System (ADS)

    Tobar, Vladimiro; Wyseure, Guido

    2017-04-01

    It is known that rainfall is affected by terrain characteristics, and some studies have focussed on its distribution over complex terrain. Ecuador's temporal and spatial rainfall distribution is affected by its location on the ITCZ, the marine currents in the Pacific, the Amazon rainforest, and the Andes mountain range. Although all these factors are important, we think that the last one may hold a key for modelling spatial and temporal distribution of rainfall. The study considered 30 years of monthly data from 319 rainfall stations having at least 10 years of data available. The relatively low density of stations and their location in accessible sites near to main roads or rivers leave large and important areas ungauged, making it inappropriate to rely on traditional interpolation techniques to estimate regional rainfall for water balance. The aim of this research was to come up with a useful model for seasonal rainfall distribution in Ecuador based on geographical characteristics to allow its spatial generalization. The target for modelling was the seasonal rainfall, characterized by nine percentiles for each one of the 12 months of the year that results in 108 response variables, later on reduced to four principal components comprising 94% of the total variability. Predictor variables for the model were: geographic coordinates, elevation, main wind effects from the Amazon and Coast, Valley and Hill indexes, and average and maximum elevation above the selected rainfall station to the east and to the west, for each one of 18 directions (50-135°, by 5°) adding up to 79 predictors. A multiple linear regression model by the Elastic-net algorithm with cross-validation was applied for each of the PCs as response to select the most important ones from the 79 predictor variables. The Elastic-net algorithm deals well with collinearity problems, while allowing variable selection in a blended approach between the Ridge and Lasso regression.
The model fitting produced explained variances of 59%, 81%, 49% and 17% for PC1, PC2, PC3 and PC4, respectively, backing up the hypothesis of good correlation between geographical characteristics and seasonal rainfall patterns (comprised in the four principal components). With the obtained coefficients from the regression, the 108 rainfall percentiles for each station were back estimated giving very good results when compared with the original ones, with an overall 60% explained variance.
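
The blend between Ridge and Lasso mentioned in this abstract can be illustrated with a small coordinate-descent solver. This is only a sketch under stated assumptions: it minimizes (1/2n)·||y − Xw||² + α·ρ·||w||₁ + ½·α·(1 − ρ)·||w||² on centred, standardized data, and it omits the cross-validation step the authors used to tune the penalty. The names `elastic_net`, `alpha`, and `l1_ratio` are illustrative and not taken from the study.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the Lasso part of the update)."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def elastic_net(X, y, alpha=0.1, l1_ratio=0.5, n_iter=200):
    """Coordinate descent for the elastic-net objective.

    Assumes the columns of X are standardized and y is centred, so no
    intercept is fitted. l1_ratio=1 gives Lasso, l1_ratio=0 gives Ridge.
    """
    n_samples, n_features = X.shape
    coef = np.zeros(n_features)
    col_scale = (X ** 2).sum(axis=0) / n_samples
    for _ in range(n_iter):
        for j in range(n_features):
            # Residual with feature j's current contribution added back in.
            partial_resid = y - X @ coef + X[:, j] * coef[j]
            rho = X[:, j] @ partial_resid / n_samples
            coef[j] = soft_threshold(rho, alpha * l1_ratio) / (
                col_scale[j] + alpha * (1.0 - l1_ratio)
            )
    return coef

if __name__ == "__main__":
    # Synthetic stand-in for "many predictors, few of them relevant".
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 10))
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * rng.standard_normal(200)
    print(elastic_net(X, y - y.mean()))
```

The L1 term drives the coefficients of irrelevant predictors to exactly zero (variable selection), while the L2 term stabilizes the fit when predictors are collinear, as directional elevation indexes are likely to be.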

  2. Satellite derived bathymetry: mapping the Irish coastline

    NASA Astrophysics Data System (ADS)

    Monteys, X.; Cahalane, C.; Harris, P.; Hanafin, J.

    2017-12-01

    Ireland has a varied coastline in excess of 3000 km in length largely characterized by extended shallow environments. The coastal shallow water zone can be a challenging and costly environment in which to acquire bathymetry and other oceanographic data using traditional survey methods or airborne LiDAR techniques as demonstrated in the Irish INFOMAR program. Thus, large coastal areas in Ireland, and much of the coastal zone worldwide, remain unmapped using modern techniques and are poorly understood. Earth Observation (EO) missions are currently being used to derive timely, cost effective, and quality controlled information for mapping and monitoring coastal environments. Different wavelengths of the solar light penetrate the water column to different depths and are routinely sensed by EO satellites. A large selection of multispectral imagery (MS) from many platforms was examined, as well as imagery from small aircraft and drones. A number of bays representing very different coastal environments were explored in turn. The project's workflow is created by building a catalogue of satellite and field bathymetric data to assess the suitability of imagery captured at a range of spatial, spectral and temporal resolutions. Turbidity indices are derived from the multispectral information. Finally, a number of spatial regression models using water-leaving radiance parameters and field calibration data are examined. Our assessment reveals that spatial regression algorithms have the potential to significantly improve the accuracy of the predictions up to 10m WD and offer a better handle on the error and uncertainty budget. The four spatial models investigated show better adjustments than the basic non-spatial model. Accuracy of the predictions is better than 10% WD at 95% confidence. Future work will focus on improving the accuracy of the predictions incorporating an analytical model in conjunction with improved empirical methods.
The recently launched ESA Sentinel 2 will become the primary focus of study. Satellite bathymetry and coastal mapping products, and remarkably, their repeatability over time, can offer solutions to important coastal zone management issues and address key challenges in the critical line between shoreline changes and human activity, particularly in the light of future climate change scenarios.

  3. Empirical water depth predictions in Dublin Bay based on satellite EO multispectral imagery and multibeam data using spatially weighted geographical analysis

    NASA Astrophysics Data System (ADS)

    Monteys, Xavier; Harris, Paul; Caloca, Silvia

    2014-05-01

    The coastal shallow water zone can be a challenging and expensive environment within which to acquire bathymetry and other oceanographic data using traditional survey methods. Dangers and limited swath coverage make some of these areas unfeasible to survey using ship borne systems, and turbidity can preclude marine LIDAR. As a result, an extensive part of the coastline worldwide remains completely unmapped. Satellite EO multispectral data, after processing, allows timely, cost efficient and quality controlled information to be used for planning, monitoring, and regulating coastal environments. It has the potential to deliver repetitive derivation of medium resolution bathymetry, coastal water properties and seafloor characteristics in shallow waters. Over the last 30 years satellite passive imaging methods for bathymetry extraction, implementing analytical or empirical methods, have had limited success predicting water depths. Different wavelengths of the solar light penetrate the water column to varying depths. They can provide acceptable results up to 20 m but become less accurate in deeper waters. The study area is located in the inner part of Dublin Bay, on the East coast of Ireland. The region investigated is a C-shaped inlet covering an area 10 km long and 5 km wide with water depths ranging from 0 to 10 m. The methodology employed in this research uses a ratio of reflectance from SPOT 5 satellite bands, differing from standard linear transform algorithms. High accuracy water depths were derived using multibeam data. The final empirical model uses spatially weighted geographical tools to retrieve predicted depths. The results of this paper confirm that SPOT satellite scenes are suitable to predict depths using empirical models in very shallow embayments. Spatial regression models show better adjustments in the predictions over non-spatial models.
The spatial regression equation used provides realistic results down to 6 m below the water surface, with reliable and error controlled depths. Bathymetric extraction approaches involving satellite imagery data are regarded as a fast, successful and economically advantageous solution to automatic water depth calculation in shallow and complex environments.

  4. Geostatistical modelling of household malaria in Malawi

    NASA Astrophysics Data System (ADS)

    Chirombo, J.; Lowe, R.; Kazembe, L.

    2012-04-01

    Malaria is one of the most important diseases in the world today, common in tropical and subtropical areas with sub-Saharan Africa being the region most burdened, including Malawi. This region has the right combination of biotic and abiotic components, including socioeconomic, climatic and environmental factors that sustain transmission of the disease. Differences in these conditions across the country consequently lead to spatial variation in risk of the disease. Analysis of nationwide survey data that takes into account this spatial variation is crucial in a resource constrained country like Malawi for targeted allocation of scarce resources in the fight against malaria. Previous efforts to map malaria risk in Malawi have been based on limited data collected from small surveys. The Malaria Indicator Survey conducted in 2010 is the most comprehensive malaria survey carried out in Malawi and provides point referenced data for the study. The data has been shown to be spatially correlated. We use Bayesian logistic regression models with spatial correlation to model the relationship between malaria presence in children and covariates such as socioeconomic status of households and meteorological conditions. This spatial model is then used to assess how malaria varies spatially and a malaria risk map for Malawi is produced. By taking intervention measures into account, the developed model is used to assess whether they have an effect on the spatial distribution of the disease and Bayesian kriging is used to predict areas where malaria risk is more likely to increase. It is hoped that this study can help reveal areas that require more attention from the authorities in the continuing fight against malaria, particularly in children under the age of five.

  5. Organic carbon stock modelling for the quantification of the carbon sinks in terrestrial ecosystems

    NASA Astrophysics Data System (ADS)

    Durante, Pilar; Algeet, Nur; Oyonarte, Cecilio

    2017-04-01

    Given the recent environmental policies derived from the serious threats caused by global change, practical measures to decrease net CO2 emissions have to be put in place. Regarding this, carbon sequestration is a major measure to reduce atmospheric CO2 concentrations within a short and medium term, where terrestrial ecosystems play a basic role as carbon sinks. Development of tools for quantification, assessment and management of organic carbon in ecosystems at different scales and management scenarios is essential to achieve these commitments. The aim of this study is to establish a methodological framework for the modeling of this tool, applied to sustainable land use planning and management at spatial and temporal scale. The methodology for carbon stock estimation in ecosystems is based on merger techniques between carbon stored in soils and aerial biomass. For this purpose, both a spatial variability map of soil organic carbon (SOC) and algorithms for calculation of forest species biomass will be created. For the modelling of the SOC spatial distribution at different map scales, it is necessary to fit in and screen the available information of soil database legacy. Subsequently, SOC modelling will be based on the SCORPAN model, a quantitative model used to assess the correlation among soil-forming factors measured at the same site location. These factors will be selected from both static (terrain morphometric variables) and dynamic variables (climatic variables and vegetation indexes -NDVI-), giving the model its spatio-temporal character. After the predictive model, spatial inference techniques will be used to achieve the final map and to extrapolate the data to unavailable information areas (automated random forest regression kriging). The estimated uncertainty will be calculated to assess the model performance at different scale approaches.
Organic carbon modelling of aerial biomass will be estimated using LiDAR (Light Detection And Ranging) algorithms. The available LiDAR databases will be used. LiDAR statistics (which describe the LiDAR point cloud data to calculate forest stand parameters) will be correlated with different canopy cover variables. The regression models applied to the total area will produce a continuous geo-information map for each canopy variable. The CO2 estimation will be calculated by dry-mass conversion factors for each forest species (C kg-CO2 kg equivalent). The result is the organic carbon modelling at spatio-temporal scale with different levels of uncertainty associated to the predictive models and diverse detailed scales. However, one of the main expected problems is due to the heterogeneous spatial distribution of the soil information, which influences the prediction of the models at different spatial scales and, consequently, at SOC map scale. Besides this, the variability and mixture of the forest species of the aerial biomass decrease the accuracy of the organic carbon assessment.
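
The "C kg-CO2 kg equivalent" conversion mentioned above follows from molar masses: one kilogram of carbon corresponds to roughly 44.009/12.011 ≈ 3.66 kg of CO2. The sketch below shows that arithmetic. The function names and the carbon fraction passed in the example are hypothetical placeholders; the study's species-specific dry-mass factors are not given in the abstract.

```python
# Standard atomic weights (g/mol).
CARBON_MOLAR_MASS = 12.011
OXYGEN_MOLAR_MASS = 15.999
CO2_MOLAR_MASS = CARBON_MOLAR_MASS + 2 * OXYGEN_MOLAR_MASS

# kg of CO2 that contain 1 kg of carbon (about 3.66).
CO2_PER_CARBON = CO2_MOLAR_MASS / CARBON_MOLAR_MASS

def carbon_to_co2(carbon_kg):
    """Convert a mass of carbon to its CO2-equivalent mass."""
    return carbon_kg * CO2_PER_CARBON

def biomass_to_co2(dry_biomass_kg, carbon_fraction):
    """Convert dry biomass to CO2 equivalent.

    carbon_fraction is the share of dry mass that is carbon; it is
    species-specific and must be supplied by the caller (hypothetical input).
    """
    if not 0.0 <= carbon_fraction <= 1.0:
        raise ValueError("carbon_fraction must lie between 0 and 1")
    return carbon_to_co2(dry_biomass_kg * carbon_fraction)

if __name__ == "__main__":
    # 0.5 is an illustrative carbon fraction, not a value from the study.
    print(biomass_to_co2(100.0, 0.5))
```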

  6. Predictability of depression severity based on posterior alpha oscillations.

    PubMed

    Jiang, H; Popov, T; Jylänki, P; Bi, K; Yao, Z; Lu, Q; Jensen, O; van Gerven, M A J

    2016-04-01

    We aimed to integrate neural data and an advanced machine learning technique to predict individual major depressive disorder (MDD) patient severity. MEG data was acquired from 22 MDD patients and 22 healthy controls (HC) resting awake with eyes closed. Individual power spectra were calculated by a Fourier transform. Sources were reconstructed via beamforming technique. Bayesian linear regression was applied to predict depression severity based on the spatial distribution of oscillatory power. In MDD patients, decreased theta (4-8 Hz) and alpha (8-14 Hz) power was observed in fronto-central and posterior areas respectively, whereas increased beta (14-30 Hz) power was observed in fronto-central regions. In particular, posterior alpha power was negatively related to depression severity. The Bayesian linear regression model showed significant depression severity prediction performance based on the spatial distribution of both alpha (r=0.68, p=0.0005) and beta power (r=0.56, p=0.007) respectively. Our findings point to a specific alteration of oscillatory brain activity in MDD patients during rest as characterized from MEG data in terms of spectral and spatial distribution. The proposed model yielded a quantitative and objective estimation for the depression severity, which in turn has a potential for diagnosis and monitoring of the recovery process. Copyright © 2016 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.

  7. A Land System representation for global assessments and land-use modeling.

    PubMed

    van Asselen, Sanneke; Verburg, Peter H

    2012-10-01

    Current global scale land-change models used for integrated assessments and climate modeling are based on classifications of land cover. However, land-use management intensity and livestock keeping are also important aspects of land use, and are an integrated part of land systems. This article aims to classify, map, and to characterize Land Systems (LS) at a global scale and analyze the spatial determinants of these systems. Besides proposing such a classification, the article tests if global assessments can be based on globally uniform allocation rules. Land cover, livestock, and agricultural intensity data are used to map LS using a hierarchical classification method. Logistic regressions are used to analyze variation in spatial determinants of LS. The analysis of the spatial determinants of LS indicates strong associations between LS and a range of socioeconomic and biophysical indicators of human-environment interactions. The set of identified spatial determinants of a LS differs among regions and scales, especially for (mosaic) cropland systems, grassland systems with livestock, and settlements. (Semi-)Natural LS have more similar spatial determinants across regions and scales. Using LS in global models is expected to result in a more accurate representation of land use capturing important aspects of land systems and land architecture: the variation in land cover and the link between land-use intensity and landscape composition. Because the set of most important spatial determinants of LS varies among regions and scales, land-change models that include the human drivers of land change are best parameterized at sub-global level, where similar biophysical, socioeconomic and cultural conditions prevail in the specific regions. © 2012 Blackwell Publishing Ltd.

  8. Assessing NARCCAP climate model effects using spatial confidence regions.

    PubMed

    French, Joshua P; McGinnis, Seth; Schwartzman, Armin

    2017-01-01

    We assess similarities and differences between model effects for the North American Regional Climate Change Assessment Program (NARCCAP) climate models using varying classes of linear regression models. Specifically, we consider how the average temperature effect differs for the various global and regional climate model combinations, including assessment of possible interaction between the effects of global and regional climate models. We use both pointwise and simultaneous inference procedures to identify regions where global and regional climate model effects differ. We also show conclusively that results from pointwise inference are misleading, and that accounting for multiple comparisons is important for making proper inference.

  9. Pitfalls in statistical landslide susceptibility modelling

    NASA Astrophysics Data System (ADS)

    Schröder, Boris; Vorpahl, Peter; Märker, Michael; Elsenbeer, Helmut

    2010-05-01

    The use of statistical methods is a well-established approach to predict landslide occurrence probabilities and to assess landslide susceptibility. This is achieved by applying statistical methods relating historical landslide inventories to topographic indices as predictor variables. In our contribution, we compare several new and powerful methods developed in machine learning and well-established in landscape ecology and macroecology for predicting the distribution of shallow landslides in tropical mountain rainforests in southern Ecuador (among others: boosted regression trees, multivariate adaptive regression splines, maximum entropy). Although these methods are powerful, we think it is necessary to follow a basic set of guidelines to avoid some pitfalls regarding data sampling, predictor selection, and model quality assessment, especially if a comparison of different models is contemplated. We therefore suggest to apply a novel toolbox to evaluate approaches to the statistical modelling of landslide susceptibility. Additionally, we propose some methods to open the "black box" as an inherent part of machine learning methods in order to achieve further explanatory insights into preparatory factors that control landslides. Sampling of training data should be guided by hypotheses regarding processes that lead to slope failure taking into account their respective spatial scales. This approach leads to the selection of a set of candidate predictor variables considered on adequate spatial scales. This set should be checked for multicollinearity in order to facilitate model response curve interpretation. Model quality assesses how well a model is able to reproduce independent observations of its response variable. This includes criteria to evaluate different aspects of model performance, i.e. model discrimination, model calibration, and model refinement. 
In order to assess a possible violation of the assumption of independence in the training samples or a possible lack of explanatory information in the chosen set of predictor variables, the model residuals need to be checked for spatial autocorrelation. Therefore, we calculate spline correlograms. In addition to this, we investigate partial dependency plots and bivariate interaction plots considering possible interactions between predictors to improve model interpretation. Aiming at presenting this toolbox for model quality assessment, we investigate the influence of strategies in the construction of training datasets for statistical models on model quality.
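
A residual check of this kind can be sketched with global Moran's I. Note that this swaps in a simpler diagnostic than the spline correlograms the authors computed: Moran's I gives a single summary value rather than autocorrelation as a function of distance. The rook-contiguity weights on a regular grid and the function names are assumptions made for illustration.

```python
def rook_weights(coords):
    """Binary weights: 1 if two integer grid cells share an edge, else 0."""
    n = len(coords)
    weights = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(coords):
        for j, (xj, yj) in enumerate(coords):
            if abs(xi - xj) + abs(yi - yj) == 1:
                weights[i][j] = 1.0
    return weights

def morans_i(values, weights):
    """Global Moran's I: (n / S0) * sum_ij w_ij d_i d_j / sum_i d_i^2."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    total_weight = sum(sum(row) for row in weights)
    return (n / total_weight) * (cross / sum(d * d for d in dev))

if __name__ == "__main__":
    grid = [(x, y) for x in range(4) for y in range(4)]
    w = rook_weights(grid)
    trend_residuals = [float(x) for x, _ in grid]          # smooth gradient
    checker_residuals = [float((x + y) % 2) for x, y in grid]  # alternating
    print(morans_i(trend_residuals, w), morans_i(checker_residuals, w))
```

Values well above zero indicate that neighbouring residuals resemble each other, which points to a missing predictor or non-independent samples; values near zero are what an adequately specified model should produce.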

  10. Characteristics and Impact of Imperviousness From a GIS-based Hydrological Perspective

    NASA Astrophysics Data System (ADS)

    Moglen, G. E.; Kim, S.

    2005-12-01

    With the concern that imperviousness can be differently quantified depending on data sources and methods, this study assessed imperviousness estimates using two different data sources: land use and land cover. Year 2000 land use developed by the Maryland Department of Planning was utilized to estimate imperviousness by assigning imperviousness coefficients to unique land use categories. These estimates were compared with imperviousness estimates based on satellite-derived land cover from the 2001 National Land Cover Dataset. Our study developed the relationships between these two estimates in the form of regression equations to convert imperviousness derived from one data source to the other. The regression equations are considered reliable, based on goodness-of-fit measures. Furthermore, this study examined how quantitatively different imperviousness estimates affect the prediction of hydrological response both in the flow regime and in the thermal regime. We assessed the relationships between indicators of hydrological response and imperviousness-descriptors. As indicators of flow variability, coefficient of variance, lag-one autocorrelation, and mean daily flow change were calculated based on measured mean daily stream flow from the water year 1997 to 2003. For thermal variability, indicators such as percent-days of surge, degree-day, and mean daily temperature difference were calculated based on measured stream temperature over several basins in Maryland. To describe imperviousness through the hydrological process, GIS-based spatially distributed hydrological models were developed based on a water-balance method and the SCS-CN method. Imperviousness estimates from land use and land cover were used as predictors in these models to examine the effect of imperviousness using different data sources on the prediction of hydrological response. Indicators of hydrological response were also regressed on aggregate imperviousness.
This allowed for identifying if hydrological response is more sensitive to spatially distributed imperviousness or aggregate (lumped) imperviousness. The regressions between indicators of hydrological response and imperviousness-descriptors were evaluated by examining goodness-of-fit measures such as explained variance or relative standard error. The results show that imperviousness estimates using land use are better predictors of flow variability and thermal variability than imperviousness estimates using land cover. Also, this study reveals that flow variability is more sensitive to spatially distributed models than lumped models, while thermal variability is equally responsive to both models. The findings from this study can be further examined from a policy perspective with regard to policies that are based on a threshold concept for imperviousness impacts on the ecological and hydrological system.
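
The three flow-variability indicators named in this abstract can be illustrated with a minimal sketch. The abstract does not give formulas, so the definitions below (population standard deviation over the mean, correlation of each day with the next, and mean absolute day-to-day change) are assumptions, and the function names and flow values are hypothetical.

```python
from statistics import mean, pstdev

def coefficient_of_variation(flows):
    """Population standard deviation divided by the mean (assumed definition)."""
    return pstdev(flows) / mean(flows)

def lag_one_autocorrelation(flows):
    """Correlation between each day's flow and the next day's flow,
    using the overall mean as the reference (assumed definition)."""
    m = mean(flows)
    num = sum((a - m) * (b - m) for a, b in zip(flows, flows[1:]))
    den = sum((a - m) ** 2 for a in flows)
    return num / den

def mean_daily_flow_change(flows):
    """Average absolute change between consecutive days (assumed definition)."""
    return mean(abs(b - a) for a, b in zip(flows, flows[1:]))

# Hypothetical mean daily stream flows for one short period.
daily_flow = [2.0, 2.5, 9.0, 6.0, 3.5, 3.0, 2.5]
indicators = {
    "cv": coefficient_of_variation(daily_flow),
    "lag1": lag_one_autocorrelation(daily_flow),
    "mean_change": mean_daily_flow_change(daily_flow),
}
```

Each indicator could then be regressed on an imperviousness estimate, one value per basin, which is the comparison the abstract describes.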

  11. Spatial Modeling for Groundwater Arsenic Levels in North Carolina

    PubMed Central

    Kim, Dohyeong; Miranda, Marie Lynn; Tootoo, Joshua; Bradley, Phil; Gelfand, Alan E.

    2013-01-01

    To examine environmental and geologic determinants of arsenic in groundwater, detailed geologic data were integrated with well water arsenic concentration data and well construction data for 471 private wells in Orange County, NC, via a geographic information system. For the statistical analysis, the geologic units were simplified into four generalized categories based on rock type and interpreted mode of deposition/emplacement. The geologic transitions from rocks of a primary pyroclastic origin to rocks of volcaniclastic sedimentary origin were designated as polylines. The data were fitted to a left-censored regression model to identify key determinants of arsenic levels in groundwater. A Bayesian spatial random effects model was then developed to capture any spatial patterns in groundwater arsenic residuals into model estimation. Statistical model results indicate (1) wells close to a transition zone or fault are more likely to contain detectible arsenic; (2) welded tuffs and hydrothermal quartz bodies are associated with relatively higher groundwater arsenic concentrations and even higher for those proximal to a pluton; and (3) wells of greater depth are more likely to contain elevated arsenic. This modeling effort informs policy intervention by creating three-dimensional maps of predicted arsenic levels in groundwater for any location and depth in the area. PMID:21528844

  12. Spatial modeling for groundwater arsenic levels in North Carolina.

    PubMed

    Kim, Dohyeong; Miranda, Marie Lynn; Tootoo, Joshua; Bradley, Phil; Gelfand, Alan E

    2011-06-01

    To examine environmental and geologic determinants of arsenic in groundwater, detailed geologic data were integrated with well water arsenic concentration data and well construction data for 471 private wells in Orange County, NC, via a geographic information system. For the statistical analysis, the geologic units were simplified into four generalized categories based on rock type and interpreted mode of deposition/emplacement. The geologic transitions from rocks of a primary pyroclastic origin to rocks of volcaniclastic sedimentary origin were designated as polylines. The data were fitted to a left-censored regression model to identify key determinants of arsenic levels in groundwater. A Bayesian spatial random effects model was then developed to capture any spatial patterns in groundwater arsenic residuals into model estimation. Statistical model results indicate (1) wells close to a transition zone or fault are more likely to contain detectible arsenic; (2) welded tuffs and hydrothermal quartz bodies are associated with relatively higher groundwater arsenic concentrations and even higher for those proximal to a pluton; and (3) wells of greater depth are more likely to contain elevated arsenic. This modeling effort informs policy intervention by creating three-dimensional maps of predicted arsenic levels in groundwater for any location and depth in the area.

  13. Multivariate analysis applied to the study of spatial distributions found in drug-eluting stent coatings by confocal Raman microscopy.

    PubMed

    Balss, Karin M; Long, Frederick H; Veselov, Vladimir; Orana, Argjenta; Akerman-Revis, Eugena; Papandreou, George; Maryanoff, Cynthia A

    2008-07-01

    Multivariate data analysis was applied to confocal Raman measurements on stents coated with the polymers and drug used in the CYPHER Sirolimus-eluting Coronary Stents. Partial least-squares (PLS) regression was used to establish three independent calibration curves for the coating constituents: sirolimus, poly(n-butyl methacrylate) [PBMA], and poly(ethylene-co-vinyl acetate) [PEVA]. The PLS calibrations were based on average spectra generated from each spatial location profiled. The PLS models were tested on six unknown stent samples to assess accuracy and precision. The wt % difference between PLS predictions and laboratory assay values for sirolimus was less than 1 wt % for the composite of the six unknowns, while the polymer models were estimated to be less than 0.5 wt % difference for the combined samples. The linearity and specificity of the three PLS models were also demonstrated. In contrast to earlier univariate models, the PLS models achieved mass balance with better accuracy. This analysis was extended to evaluate the spatial distribution of the three constituents. Quantitative bitmap images of drug-eluting stent coatings are presented for the first time to assess the local distribution of components.

  14. Comparison of modeling methods to predict the spatial distribution of deep-sea coral and sponge in the Gulf of Alaska

    NASA Astrophysics Data System (ADS)

    Rooper, Christopher N.; Zimmermann, Mark; Prescott, Megan M.

    2017-08-01

    Deep-sea coral and sponge ecosystems are widespread throughout most of Alaska's marine waters, and are associated with many different species of fishes and invertebrates. These ecosystems are vulnerable to the effects of commercial fishing activities and climate change. We compared four commonly used species distribution models (general linear models, generalized additive models, boosted regression trees and random forest models) and an ensemble model to predict the presence or absence and abundance of six groups of benthic invertebrate taxa in the Gulf of Alaska. All four model types performed adequately on training data for predicting presence and absence, with regression forest models having the best overall performance measured by the area under the receiver-operating-curve (AUC). The models also performed well on the test data for presence and absence with average AUCs ranging from 0.66 to 0.82. For the test data, ensemble models performed the best. For abundance data, there was an obvious demarcation in performance between the two regression-based methods (general linear models and generalized additive models), and the tree-based models. The boosted regression tree and random forest models out-performed the other models by a wide margin on both the training and testing data. However, there was a significant drop-off in performance for all models of invertebrate abundance ( 50%) when moving from the training data to the testing data. Ensemble model performance was between the tree-based and regression-based methods. The maps of predictions from the models for both presence and abundance agreed very well across model types, with an increase in variability in predictions for the abundance data. 
We conclude that where data conform well to the modeled distribution (such as the presence-absence data and binomial distribution in this study), the four types of models will provide similar results, although the regression-type models may be more consistent with biological theory. For data with highly zero-inflated and non-normal distributions, such as the abundance data from this study, the tree-based methods performed better. Ensemble models that averaged predictions across the four model types performed better than the GLM or GAM models but slightly poorer than the tree-based methods, suggesting ensemble models might be more robust to overfitting than tree methods, while mitigating some of the disadvantages in predictive performance of regression methods.
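
As a rough illustration of the evaluation described in this abstract, the sketch below averages presence predictions across models and scores them with the area under the ROC curve. The unweighted average, the function names, and all numbers are assumptions for illustration; the abstract does not state how the ensemble was weighted.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a
    presence site scores higher than an absence site (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def ensemble_mean(predictions_by_model):
    """Unweighted average of per-site predictions across models (assumption)."""
    return [sum(site) / len(site) for site in zip(*predictions_by_model)]

# Hypothetical presence (1) / absence (0) observations at six test sites.
observed = [1, 0, 1, 0, 1, 0]
# Hypothetical predicted probabilities from four model types.
model_predictions = {
    "glm": [0.6, 0.4, 0.7, 0.5, 0.4, 0.3],
    "gam": [0.7, 0.3, 0.6, 0.6, 0.5, 0.2],
    "brt": [0.9, 0.2, 0.8, 0.4, 0.6, 0.1],
    "rf":  [0.8, 0.3, 0.9, 0.3, 0.7, 0.2],
}
ensemble = ensemble_mean(list(model_predictions.values()))
scores = {name: auc(observed, p) for name, p in model_predictions.items()}
scores["ensemble"] = auc(observed, ensemble)
```

The pairwise formulation of AUC is quadratic in the number of sites, which is fine for a sketch but slow for large surveys.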

  15. Multi-Scale Approach for Predicting Fish Species Distributions across Coral Reef Seascapes

    PubMed Central

    Pittman, Simon J.; Brown, Kerry A.

    2011-01-01

    Two of the major limitations to effective management of coral reef ecosystems are a lack of information on the spatial distribution of marine species and a paucity of data on the interacting environmental variables that drive distributional patterns. Advances in marine remote sensing, together with the novel integration of landscape ecology and advanced niche modelling techniques provide an unprecedented opportunity to reliably model and map marine species distributions across many kilometres of coral reef ecosystems. We developed a multi-scale approach using three-dimensional seafloor morphology and across-shelf location to predict spatial distributions for five common Caribbean fish species. Seascape topography was quantified from high resolution bathymetry at five spatial scales (5–300 m radii) surrounding fish survey sites. Model performance and map accuracy were assessed for two high performing machine-learning algorithms: Boosted Regression Trees (BRT) and Maximum Entropy Species Distribution Modelling (MaxEnt). The three most important predictors were geographical location across the shelf, followed by a measure of topographic complexity. Predictor contribution differed among species, yet rarely changed across spatial scales. BRT provided ‘outstanding’ model predictions (AUC = >0.9) for three of five fish species. MaxEnt provided ‘outstanding’ model predictions for two of five species, with the remaining three models considered ‘excellent’ (AUC = 0.8–0.9). In contrast, MaxEnt spatial predictions were markedly more accurate (92% map accuracy) than BRT (68% map accuracy). We demonstrate that reliable spatial predictions for a range of key fish species can be achieved by modelling the interaction between the geographical location across the shelf and the topographic heterogeneity of seafloor structure.
This multi-scale, analytic approach is an important new cost-effective tool to accurately delineate essential fish habitat and support conservation prioritization in marine protected area design, zoning in marine spatial planning, and ecosystem-based fisheries management. PMID:21637787

  16. Multi-scale approach for predicting fish species distributions across coral reef seascapes.

    PubMed

    Pittman, Simon J; Brown, Kerry A

    2011-01-01

    Two of the major limitations to effective management of coral reef ecosystems are a lack of information on the spatial distribution of marine species and a paucity of data on the interacting environmental variables that drive distributional patterns. Advances in marine remote sensing, together with the novel integration of landscape ecology and advanced niche modelling techniques provide an unprecedented opportunity to reliably model and map marine species distributions across many kilometres of coral reef ecosystems. We developed a multi-scale approach using three-dimensional seafloor morphology and across-shelf location to predict spatial distributions for five common Caribbean fish species. Seascape topography was quantified from high resolution bathymetry at five spatial scales (5-300 m radii) surrounding fish survey sites. Model performance and map accuracy were assessed for two high performing machine-learning algorithms: Boosted Regression Trees (BRT) and Maximum Entropy Species Distribution Modelling (MaxEnt). The three most important predictors were geographical location across the shelf, followed by a measure of topographic complexity. Predictor contribution differed among species, yet rarely changed across spatial scales. BRT provided 'outstanding' model predictions (AUC = >0.9) for three of five fish species. MaxEnt provided 'outstanding' model predictions for two of five species, with the remaining three models considered 'excellent' (AUC = 0.8-0.9). In contrast, MaxEnt spatial predictions were markedly more accurate (92% map accuracy) than BRT (68% map accuracy). We demonstrate that reliable spatial predictions for a range of key fish species can be achieved by modelling the interaction between the geographical location across the shelf and the topographic heterogeneity of seafloor structure.
This multi-scale, analytic approach is an important new cost-effective tool to accurately delineate essential fish habitat and support conservation prioritization in marine protected area design, zoning in marine spatial planning, and ecosystem-based fisheries management.

  17. Influence of Scale Effect and Model Performance in Downscaling ASTER Land Surface Temperatures to a Very High Spatial Resolution in an Agricultural Area

    NASA Astrophysics Data System (ADS)

    Zhou, J.; Li, G.; Liu, S.; Zhan, W.; Zhang, X.

    2015-12-01

    At present, land surface temperatures (LSTs) can be generated from thermal infrared remote sensing with spatial resolutions from ~100 m to tens of kilometers. However, LSTs with high spatial resolution, e.g. tens of meters, are still lacking. The purpose of LST downscaling is to generate LSTs with finer spatial resolutions than their native spatial resolutions. Statistical linear or nonlinear regression models are most frequently used for LST downscaling. The basic assumption of these models is that the relationships between LST and its descriptors are scale-invariant, an assumption that has been questioned but rarely investigated. In addition, few studies have downscaled satellite LST or TIR data to a high spatial resolution, i.e. 100 m or finer. Without LSTs at high spatial resolution, the requirements of applications such as evapotranspiration mapping at the field scale cannot be satisfied. By selecting a dynamically developing agricultural oasis as the study area, the aim of this study is to downscale the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) LSTs to 15 m, to satisfy the requirement of evapotranspiration mapping at the field scale. Twelve ASTER images from May to September in 2012, covering the entire growth stage of maize, were selected. Four statistical models were evaluated, including one global model, one piecewise model, and two local models. The influence of the scale effect on LST downscaling was quantified. The downscaled LSTs were evaluated in terms of accuracy and image quality. Results demonstrate that the influence of the scale effect varies according to the model and the maize growth stage. A significant influence of about -4 K to 6 K existed at the early stage, and a weaker influence existed in the middle stage. When compared with the ground-measured LSTs, the downscaled LSTs from the global and local models yielded higher accuracies and better image qualities than the local models.
In addition to the vegetation indices, the surface albedo is an important descriptor for downscaling LST through explaining its spatial variation induced by soil moisture.

  18. The application of neural network model to the simulation nitrous oxide emission in the hydro-fluctuation belt of Three Gorges Reservoir

    NASA Astrophysics Data System (ADS)

    Song, Lanlan

    2017-04-01

    Nitrous oxide is a much more potent greenhouse gas than carbon dioxide. However, the estimation of N2O flux is usually clouded with uncertainty, mainly due to high spatial and temporal variations. This also hampers the development of general mechanistic models for N2O emission, as most previously developed models were empirical or exhibited low predictability with numerous assumptions. In this study, we tested General Regression Neural Networks (GRNN) as an alternative to classic empirical models for simulating N2O emission in riparian zones of reservoirs. GRNN and nonlinear regression (NLR) were applied to estimate the N2O flux from 1-year observations in riparian zones of the Three Gorges Reservoir. NLR resulted in lower prediction power and higher residuals compared to GRNN. Although the nonlinear regression model estimated similar average values of N2O, it could not capture the fluctuation patterns accurately. In contrast, the GRNN model achieved a fairly high predictability, with an R2 of 0.59 for model validation, 0.77 for model calibration (training), and a low root mean square error (RMSE), indicating a high capacity to simulate the dynamics of N2O flux. According to a sensitivity analysis of the GRNN, nonlinear relationships between input variables and N2O flux were well explained. Our results suggest that the GRNN developed in this study performs better than nonlinear regression in simulating variations in N2O flux.
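
For readers unfamiliar with the method, a general regression neural network predicts with a Gaussian-kernel weighted average of the training targets. The sketch below shows that core calculation only; the input variables, the smoothing parameter value, and the data are hypothetical and are not taken from the study.

```python
import math

def grnn_predict(x, train_x, train_y, sigma):
    """GRNN prediction: training targets averaged with Gaussian weights
    that decay with squared distance from the query point x.
    Note: a very small sigma with a distant query can underflow all
    weights to zero, which would raise ZeroDivisionError."""
    weights = [math.exp(-math.dist(x, xi) ** 2 / (2 * sigma ** 2))
               for xi in train_x]
    return sum(w * y for w, y in zip(weights, train_y)) / sum(weights)

# Hypothetical training set: (soil temperature, soil moisture) -> N2O flux.
train_x = [(10.0, 0.20), (15.0, 0.30), (20.0, 0.35), (25.0, 0.25)]
train_y = [5.0, 12.0, 30.0, 18.0]

# sigma (the smoothing parameter) is the only quantity to tune;
# the value here is an arbitrary illustration.
flux = grnn_predict((18.0, 0.32), train_x, train_y, sigma=3.0)
```

In practice the inputs would be standardized first, since a single sigma is applied to every dimension, and sigma would be chosen by validation.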

  19. Discovering Communicable Models from Earth Science Data

    NASA Technical Reports Server (NTRS)

    Schwabacher, Mark; Langley, Pat; Potter, Christopher; Klooster, Steven; Torregrosa, Alicia

    2002-01-01

    This chapter describes how we used regression rules to improve upon results previously published in the Earth science literature. In such a scientific application of machine learning, it is crucially important for the learned models to be understandable and communicable. We recount how we selected a learning algorithm to maximize communicability, and then describe two visualization techniques that we developed to aid in understanding the model by exploiting the spatial nature of the data. We also report how evaluating the learned models across time let us discover an error in the data.

  20. Modeling the spatial distribution of African buffalo (Syncerus caffer) in the Kruger National Park, South Africa

    PubMed Central

    Hughes, Kristen; Budke, Christine M.; Ward, Michael P.; Kerry, Ruth; Ingram, Ben

    2017-01-01

    The population density of wildlife reservoirs contributes to disease transmission risk for domestic animals. The objective of this study was to model the African buffalo distribution of the Kruger National Park. A secondary objective was to collect field data to evaluate models and determine environmental predictors of buffalo detection. Spatial distribution models were created using buffalo census information and archived data from previous research. Field data were collected during the dry (August 2012) and wet (January 2013) seasons using a random walk design. The fit of the prediction models was assessed descriptively and formally by calculating the root mean square error (rMSE) of deviations from field observations. Logistic regression was used to estimate the effects of environmental variables on the detection of buffalo herds and linear regression was used to identify predictors of larger herd sizes. A zero-inflated Poisson model produced distributions that were most consistent with expected buffalo behavior. Field data confirmed that environmental factors including season (P = 0.008), vegetation type (P = 0.002), and vegetation density (P = 0.010) were significant predictors of buffalo detection. Bachelor herds were more likely to be detected in dense vegetation (P = 0.005) and during the wet season (P = 0.022) compared to the larger mixed-sex herds. Static distribution models for African buffalo can produce biologically reasonable results but environmental factors have significant effects and therefore could be used to improve model performance. Accurate distribution models are critical for the evaluation of disease risk and to model disease transmission. PMID:28902858
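
The formal fit measure mentioned in this abstract, the root mean square error of model predictions against field observations, can be written as a short sketch. The function name and the counts below are hypothetical.

```python
import math

def rmse(predicted, observed):
    """Root mean square error between paired predictions and observations."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical predicted and field-observed buffalo counts per survey cell.
predicted_counts = [12.0, 0.0, 30.0, 5.0]
observed_counts = [10.0, 2.0, 25.0, 5.0]
error = rmse(predicted_counts, observed_counts)
```

A lower value indicates predictions closer to the field data, so competing distribution models can be ranked on it.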

  1. Neighborhood Landscape Spatial Patterns and Land Surface Temperature: An Empirical Study on Single-Family Residential Areas in Austin, Texas.

    PubMed

    Kim, Jun-Hyun; Gu, Donghwan; Sohn, Wonmin; Kil, Sung-Ho; Kim, Hwanyong; Lee, Dong-Kun

    2016-09-02

    Rapid urbanization has accelerated land use and land cover changes, and generated the urban heat island effect (UHI). Previous studies have reported positive effects of neighborhood landscapes on mitigating urban surface temperatures. However, the influence of neighborhood landscape spatial patterns on enhancing cooling effects has not yet been fully investigated. The main objective of this study was to assess the relationships between neighborhood landscape spatial patterns and land surface temperatures (LST) by using multi-regression models considering spatial autocorrelation issues. To measure the influence of neighborhood landscape spatial patterns on LST, this study analyzed neighborhood environments of 15,862 single-family houses in Austin, Texas, USA. Using aerial photos, geographic information systems (GIS), and remote sensing, FRAGSTATS was employed to calculate values of several landscape indices used to measure neighborhood landscape spatial patterns. After controlling for the spatial autocorrelation effect, results showed that larger and better-connected landscape spatial patterns were positively correlated with lower LST values in neighborhoods, while more fragmented and isolated neighborhood landscape patterns were negatively related to the reduction of LST.

  2. Neighborhood Landscape Spatial Patterns and Land Surface Temperature: An Empirical Study on Single-Family Residential Areas in Austin, Texas

    PubMed Central

    Kim, Jun-Hyun; Gu, Donghwan; Sohn, Wonmin; Kil, Sung-Ho; Kim, Hwanyong; Lee, Dong-Kun

    2016-01-01

    Rapid urbanization has accelerated land use and land cover changes, and generated the urban heat island effect (UHI). Previous studies have reported positive effects of neighborhood landscapes on mitigating urban surface temperatures. However, the influence of neighborhood landscape spatial patterns on enhancing cooling effects has not yet been fully investigated. The main objective of this study was to assess the relationships between neighborhood landscape spatial patterns and land surface temperatures (LST) by using multi-regression models considering spatial autocorrelation issues. To measure the influence of neighborhood landscape spatial patterns on LST, this study analyzed neighborhood environments of 15,862 single-family houses in Austin, Texas, USA. Using aerial photos, geographic information systems (GIS), and remote sensing, FRAGSTATS was employed to calculate values of several landscape indices used to measure neighborhood landscape spatial patterns. After controlling for the spatial autocorrelation effect, results showed that larger and better-connected landscape spatial patterns were positively correlated with lower LST values in neighborhoods, while more fragmented and isolated neighborhood landscape patterns were negatively related to the reduction of LST. PMID:27598186

  3. Extrapolating intensified forest inventory data to the surrounding landscape using landsat

    Treesearch

    Evan B. Brooks; John W. Coulston; Valerie A. Thomas; Randolph H. Wynne

    2015-01-01

    In 2011, a collection of spatially intensified plots was established on three of the Experimental Forests and Ranges (EFRs) sites with the intent of facilitating FIA program objectives for regional extrapolation. Characteristic coefficients from harmonic regression (HR) analysis of associated Landsat stacks are used as inputs into a conditional random forests model to...

  4. Ecological and Topographic Features of Volcanic Ash-Influenced Forest Soils

    Treesearch

    Mark Kimsey; Brian Gardner; Alan Busacca

    2007-01-01

    Volcanic ash distribution and thickness were determined for a forested region of north-central Idaho. Mean ash thickness and multiple linear regression analyses were used to model the effect of environmental variables on ash thickness. Slope and slope curvature relationships with volcanic ash thickness varied on a local spatial scale across the study area. Ash...

  5. Quantitative characterization of the regressive ecological succession by fractal analysis of plant spatial patterns

    USGS Publications Warehouse

    Alados, C.L.; Pueyo, Y.; Giner, M.L.; Navarro, T.; Escos, J.; Barroso, F.; Cabezudo, B.; Emlen, J.M.

    2003-01-01

    We studied the effect of grazing on the degree of regression of successional vegetation dynamic in a semi-arid Mediterranean matorral. We quantified the spatial distribution patterns of the vegetation by fractal analyses, using the fractal information dimension and spatial autocorrelation measured by detrended fluctuation analyses (DFA). It is the first time that fractal analysis of plant spatial patterns has been used to characterize the regressive ecological succession. Plant spatial patterns were compared over a long-term grazing gradient (low, medium and heavy grazing pressure) and on ungrazed sites for two different plant communities: a middle dense matorral of Chamaerops and Periploca at Sabinar-Romeral and a middle dense matorral of Chamaerops, Rhamnus and Ulex at Requena-Montano. The two communities differed also in the microclimatic characteristics (sea oriented at the Sabinar-Romeral site and inland oriented at the Requena-Montano site). The information fractal dimension increased as we moved from a middle dense matorral to discontinuous and scattered matorral and, finally to the late regressive succession, at Stipa steppe stage. At this stage a drastic change in the fractal dimension revealed a change in the vegetation structure, accurately indicating end successional vegetation stages. Long-term correlation analysis (DFA) revealed that an increase in grazing pressure leads to unpredictability (randomness) in species distributions, a reduction in diversity, and an increase in cover of the regressive successional species, e.g. Stipa tenacissima L. These comparisons provide a quantitative characterization of the successional dynamic of plant spatial patterns in response to grazing perturbation gradient. © 2002 Elsevier Science B.V. All rights reserved.

  6. Modeling the Spatial and Temporal Variation of Monthly and Seasonal Precipitation on the Nevada Test Site and Vicinity, 1960-2006

    USGS Publications Warehouse

    Blainey, Joan B.; Webb, Robert H.; Magirl, Christopher S.

    2007-01-01

    The Nevada Test Site (NTS), located in the climatic transition zone between the Mojave and Great Basin Deserts, has a network of precipitation gages that is unusually dense for this region. This network measures monthly and seasonal variation in a landscape with diverse topography. Precipitation data from 125 climate stations on or near the NTS were used to spatially interpolate precipitation for each month during the period of 1960 through 2006 at high spatial resolution (30 m). The data were collected at climate stations using manual and/or automated techniques. The spatial interpolation method, applied to monthly accumulations of precipitation, is based on a distance-weighted multivariate regression between the amount of precipitation and the station location and elevation. This report summarizes the temporal and spatial characteristics of the available precipitation records for the period 1960 to 2006, examines the temporal and spatial variability of precipitation during the period of record, and discusses some extremes in seasonal precipitation on the NTS.
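
A distance-weighted regression of precipitation on station location and elevation can be sketched as a weighted least-squares fit evaluated at one target cell. The report's actual weighting scheme is not given in this abstract, so the inverse-distance weights, the `power` parameter, the function names, and the station values below are all assumptions for illustration.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def predict_precip(target, stations, power=2.0):
    """Fit precip ~ 1 + x + y + elevation with inverse-distance weights
    centred on the target, then evaluate the fit at the target.
    target = (x, y, elevation); stations = [(x, y, elevation, precip), ...].
    Needs at least four stations that do not make the fit degenerate."""
    tx, ty, tz = target
    rows, weights, values = [], [], []
    for sx, sy, sz, p in stations:
        d = math.hypot(sx - tx, sy - ty)
        weights.append(1.0 / (d ** power + 1e-9))  # small term avoids 1/0
        rows.append([1.0, sx, sy, sz])
        values.append(p)
    k = 4
    # Weighted normal equations: (X' W X) beta = X' W y.
    xtwx = [[sum(w * r[i] * r[j] for w, r in zip(weights, rows))
             for j in range(k)] for i in range(k)]
    xtwy = [sum(w * r[i] * v for w, r, v in zip(weights, rows, values))
            for i in range(k)]
    beta = solve(xtwx, xtwy)
    return beta[0] + beta[1] * tx + beta[2] * ty + beta[3] * tz

# Hypothetical stations: (easting km, northing km, elevation m, monthly precip mm).
stations = [(0, 0, 1000, 12.0), (10, 0, 1400, 18.0), (0, 10, 1200, 15.0),
            (10, 10, 1900, 27.0), (5, 5, 1100, 13.0)]
cell_estimate = predict_precip((3.0, 4.0, 1300.0), stations)
```

Repeating the fit for every grid cell yields a surface in which nearby stations dominate each local regression while elevation still shapes the estimate.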

  7. Determination of Spatially Resolved Tablet Density and Hardness Using Near-Infrared Chemical Imaging (NIR-CI).

    PubMed

    Talwar, Sameer; Roopwani, Rahul; Anderson, Carl A; Buckner, Ira S; Drennen, James K

    2017-08-01

    Near-infrared chemical imaging (NIR-CI) combines spectroscopy with digital imaging, enabling spatially resolved analysis and characterization of pharmaceutical samples. Hardness and relative density are critical quality attributes (CQA) that affect tablet performance. Intra-sample density or hardness variability can reveal deficiencies in formulation design or the tableting process. This study was designed to develop NIR-CI methods to predict spatially resolved tablet density and hardness. The method was implemented using a two-step procedure. First, NIR-CI was used to develop a relative density/solid fraction (SF) prediction method for pure microcrystalline cellulose (MCC) compacts only. A partial least squares (PLS) model for predicting SF was generated by regressing the spectra of certain representative pixels selected from each image against the compact SF. Pixel selection was accomplished with a threshold based on the Euclidean distance from the median tablet spectrum. Second, micro-indentation was performed on the calibration compacts to obtain hardness values. A univariate model was developed by relating the empirical hardness values to the NIR-CI predicted SF at the micro-indented pixel locations: this model generated spatially resolved hardness predictions for the entire tablet surface.
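
The pixel-selection step described in this abstract, a threshold on Euclidean distance from the median tablet spectrum, can be sketched as follows. The abstract does not say whether pixels inside or outside the threshold were kept; this sketch assumes pixels within the threshold are the representative ones. The function name, threshold, and spectra are hypothetical.

```python
import math
from statistics import median

def select_pixels(spectra, threshold):
    """Return indices of pixels whose spectrum lies within `threshold`
    (Euclidean distance) of the band-wise median spectrum.
    Keeping the near-median pixels is an assumption of this sketch."""
    median_spectrum = [median(band) for band in zip(*spectra)]
    return [i for i, s in enumerate(spectra)
            if math.dist(s, median_spectrum) <= threshold]

# Hypothetical image: one three-band spectrum per pixel.
pixel_spectra = [
    [0.50, 0.62, 0.71],
    [0.51, 0.60, 0.70],
    [0.49, 0.61, 0.72],
    [0.90, 0.95, 0.99],  # outlier, e.g. an edge or defect pixel
]
representative = select_pixels(pixel_spectra, threshold=0.05)
```

The spectra of the selected pixels would then be regressed against the compact's solid fraction to build the PLS calibration.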

  8. Simulation of population-based commuter exposure to NO₂ using different air pollution models.

    PubMed

    Ragettli, Martina S; Tsai, Ming-Yi; Braun-Fahrländer, Charlotte; de Nazelle, Audrey; Schindler, Christian; Ineichen, Alex; Ducret-Stich, Regina E; Perez, Laura; Probst-Hensch, Nicole; Künzli, Nino; Phuleria, Harish C

    2014-05-12

    We simulated commuter routes and long-term exposure to traffic-related air pollution during commute in a representative population sample in Basel (Switzerland), and evaluated three air pollution models with different spatial resolution for estimating commute exposures to nitrogen dioxide (NO2) as a marker of long-term exposure to traffic-related air pollution. Our approach includes spatially and temporally resolved data on actual commuter routes, travel modes and three air pollution models. Annual mean NO2 commuter exposures were similar between models. However, we found more within-city and within-subject variability in annual mean (±SD) NO2 commuter exposure with a high resolution dispersion model (40 ± 7 µg m(-3), range: 21-61) than with a dispersion model with a lower resolution (39 ± 5 µg m(-3); range: 24-51), and a land use regression model (41 ± 5 µg m(-3); range: 24-54). Highest median cumulative exposures were calculated along motorized transport and bicycle routes, and the lowest for walking. For estimating commuter exposure within a city and being interested also in small-scale variability between roads, a model with a high resolution is recommended. For larger scale epidemiological health assessment studies, models with a coarser spatial resolution are likely sufficient, especially when study areas include suburban and rural areas.

  9. Food Crops Response to Climate Change

    NASA Astrophysics Data System (ADS)

    Butler, E.; Huybers, P.

    2009-12-01

    Projections of future climate show a warming world and heterogeneous changes in precipitation. Generally, warming temperatures indicate a decrease in crop yields where they are currently grown. However, warmer climate will also open up new areas at high latitudes for crop production. Thus, there is a question whether the warmer climate with decreased yields but potentially increased growing area will produce a net increase or decrease of overall food crop production. We explore this question through a multiple linear regression model linking temperature and precipitation to crop yield. Prior studies have emphasised temporal regression, which indicates uniformly decreased yields but neglects the potentially increased area opened up for crop production. This study provides a complement to the prior work by exploring this spatial variation. We explore this subject with a multiple linear regression model from temperature, precipitation and crop yield data over the United States. The United States was chosen as the training region for the model because there are good crop data available over the same time frame as climate data and presumably the yield from crops in the United States is optimized with respect to potential yield. We study corn, soybeans, sorghum, hard red winter wheat and soft red winter wheat using monthly averages of temperature and precipitation from NCEP reanalysis and yearly yield data from the National Agriculture Statistics Service for 1948-2008. The use of monthly averaged temperature and precipitation, which neglects extreme events that can have a significant impact on crops, limits this study, as does the exclusive use of United States agricultural data. The GFDL 2.1 model under a 720ppm CO2 scenario provides temperature and precipitation fields for 2040-2100 which are used to explore how the spatial regions available for crop production will change under these new conditions.

  10. Source characterization and exposure modeling of gas-phase polycyclic aromatic hydrocarbon (PAH) concentrations in Southern California

    NASA Astrophysics Data System (ADS)

    Masri, Shahir; Li, Lianfa; Dang, Andy; Chung, Judith H.; Chen, Jiu-Chiuan; Fan, Zhi-Hua (Tina); Wu, Jun

    2018-03-01

    Airborne exposures to polycyclic aromatic hydrocarbons (PAHs) are associated with adverse health outcomes. Because personal air measurements of PAHs are labor intensive and costly, spatial PAH exposure models are useful for epidemiological studies. However, few studies provide adequate spatial coverage to reflect intra-urban variability of ambient PAHs. In this study, we collected 39-40 weekly gas-phase PAH samples in southern California twice in summer and twice in winter, 2009, in order to characterize PAH source contributions and develop spatial models that can estimate gas-phase PAH concentrations at a high resolution. A spatial mixed regression model was constructed, including such variables as roadway, traffic, land-use, vegetation index, commercial cooking facilities, meteorology, and population density. Cross validation of the model resulted in an R2 of 0.66 for summer and 0.77 for winter. Results showed higher total PAH concentrations in winter. Pyrogenic sources, such as fossil fuels and diesel exhaust, were the dominant contributors to total PAHs. PAH sources varied by season, with a higher fossil fuel and wood burning contribution in winter. Spatial autocorrelation accounted for a substantial amount of the variance in total PAH concentrations for both winter (56%) and summer (19%). In summer, other key variables explaining the variance included meteorological factors (9%), population density (15%), and roadway length (21%). In winter, the variance was also explained by traffic density (16%). In this study, source characterization confirmed the dominant contribution of traffic and other fossil fuel sources to total measured gas-phase PAH concentrations, while a spatial exposure model identified key predictors of PAH concentrations. Gas-phase PAH source characterization and exposure estimation are of high utility to epidemiologists and policy makers interested in understanding the health impacts of gas-phase PAHs and strategies to reduce emissions.

  11. Predicting the spatial extent of liquefaction from geospatial and earthquake specific parameters

    USGS Publications Warehouse

    Zhu, Jing; Baise, Laurie G.; Thompson, Eric M.; Wald, David J.; Knudsen, Keith L.; Deodatis, George; Ellingwood, Bruce R.; Frangopol, Dan M.

    2014-01-01

    The spatially extensive damage from the 2010-2011 Christchurch, New Zealand earthquake events is a reminder of the need for liquefaction hazard maps for anticipating damage from future earthquakes. Liquefaction hazard mapping has traditionally relied on detailed geologic mapping and expensive site studies. These traditional techniques are difficult to apply globally for rapid response or loss estimation. We have developed a logistic regression model to predict the probability of liquefaction occurrence in coastal sedimentary areas as a function of simple and globally available geospatial features (e.g., derived from digital elevation models) and standard earthquake-specific intensity data (e.g., peak ground acceleration). Some of the geospatial explanatory variables that we consider are taken from the hydrology community, which has a long tradition of using remotely sensed data as proxies for subsurface parameters. As a result of using high resolution, remotely-sensed, and spatially continuous data as a proxy for important subsurface parameters such as soil density and soil saturation, and by using a probabilistic modeling framework, our liquefaction model inherently includes the natural spatial variability of liquefaction occurrence and provides an estimate of spatial extent of liquefaction for a given earthquake. To provide a quantitative check on how the predicted probabilities relate to spatial extent of liquefaction, we report the frequency of observed liquefaction features within a range of predicted probabilities. The percentage of liquefaction is the areal extent of observed liquefaction within a given probability contour. The regional model and the results show that there is a strong relationship between the predicted probability and the observed percentage of liquefaction. Visual inspection of the probability contours for each event also indicates that the pattern of liquefaction is well represented by the model.
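
    The two steps described in this abstract, a logistic model for liquefaction probability and a check of observed liquefaction frequency within ranges of predicted probability, can be sketched as follows. The feature names, coefficients and intercept are placeholders chosen only for illustration; the abstract does not report the fitted values.

```python
import math


def liquefaction_probability(features, coefficients, intercept):
    """Logistic model: probability = 1 / (1 + exp(-(b0 + sum(b_k * x_k))))."""
    z = intercept + sum(coefficients[name] * features[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-z))


def observed_fraction_by_bin(probabilities, observed, edges):
    """Fraction of cells with observed liquefaction in each [lo, hi) probability bin.

    Bins are half-open, so a probability exactly equal to the top edge is not counted.
    Returns None for a bin that contains no cells.
    """
    fractions = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        hits = [obs for p, obs in zip(probabilities, observed) if lo <= p < hi]
        fractions.append(sum(hits) / len(hits) if hits else None)
    return fractions


# Placeholder coefficients (hypothetical, not from the study):
# peak ground acceleration in g and a DEM-derived slope in degrees.
COEFFICIENTS = {"pga_g": 4.0, "slope_deg": -0.3}
INTERCEPT = -2.0

p = liquefaction_probability({"pga_g": 0.5, "slope_deg": 0.0}, COEFFICIENTS, INTERCEPT)

# Hypothetical predicted probabilities and observed outcomes (1 = liquefaction seen)
fractions = observed_fraction_by_bin(
    [0.1, 0.2, 0.6, 0.7, 0.9], [0, 0, 1, 0, 1], [0.0, 0.5, 1.0]
)
```

    If the model is well calibrated, the observed fraction should rise from one probability bin to the next, which is the relationship the abstract reports.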

  12. Source Characterization and Exposure Modeling of Gas-Phase Polycyclic Aromatic Hydrocarbon (PAH) Concentrations in Southern California.

    PubMed

    Masri, Shahir; Li, Lianfa; Dang, Andy; Chung, Judith H; Chen, Jiu-Chiuan; Fan, Zhi-Hua Tina; Wu, Jun

    2018-03-01

    Airborne exposures to polycyclic aromatic hydrocarbons (PAHs) are associated with adverse health outcomes. Because personal air measurements of PAHs are labor intensive and costly, spatial PAH exposure models are useful for epidemiological studies. However, few studies provide adequate spatial coverage to reflect intra-urban variability of ambient PAHs. In this study, we collected 39-40 weekly gas-phase PAH samples in southern California twice in summer and twice in winter, 2009, in order to characterize PAH source contributions and develop spatial models that can estimate gas-phase PAH concentrations at a high resolution. A spatial mixed regression model was constructed, including such variables as roadway, traffic, land-use, vegetation index, commercial cooking facilities, meteorology, and population density. Cross validation of the model resulted in an R2 of 0.66 for summer and 0.77 for winter. Results showed higher total PAH concentrations in winter. Pyrogenic sources, such as fossil fuels and diesel exhaust, were the dominant contributors to total PAHs. PAH sources varied by season, with a higher fossil fuel and wood burning contribution in winter. Spatial autocorrelation accounted for a substantial amount of the variance in total PAH concentrations for both winter (56%) and summer (19%). In summer, other key variables explaining the variance included meteorological factors (9%), population density (15%), and roadway length (21%). In winter, the variance was also explained by traffic density (16%). In this study, source characterization confirmed the dominant contribution of traffic and other fossil fuel sources to total measured gas-phase PAH concentrations, while a spatial exposure model identified key predictors of PAH concentrations. Gas-phase PAH source characterization and exposure estimation are of high utility to epidemiologists and policy makers interested in understanding the health impacts of gas-phase PAHs and strategies to reduce emissions.

  13. Trophic dilution of cyclic volatile methylsiloxanes (cVMS) in the pelagic marine food web of Tokyo Bay, Japan.

    PubMed

    Powell, David E; Suganuma, Noriyuki; Kobayashi, Keiji; Nakamura, Tsutomu; Ninomiya, Kouzo; Matsumura, Kozaburo; Omura, Naoki; Ushioka, Satoshi

    2017-02-01

    Bioaccumulation and trophic transfer of cyclic volatile methylsiloxanes (cVMS), specifically octamethylcyclotetrasiloxane (D4), decamethylcyclopentasiloxane (D5), and dodecamethylcyclohexasiloxane (D6), were evaluated in the pelagic marine food web of Tokyo Bay, Japan. Polychlorinated biphenyl (PCB) congeners, which are "legacy" chemicals known to bioaccumulate in aquatic organisms and biomagnify across aquatic food webs, were used as a benchmark chemical (CB-180) to calibrate the sampled food web and as a reference chemical (CB-153) to validate the results. Trophic magnification factors (TMFs) were calculated from slopes of ordinary least-squares (OLS) regression models and slopes of bootstrap regression models, which were used as robust alternatives to the OLS models. Various regression models were developed that incorporated benchmarking to control bias associated with experimental design, food web dynamics, and trophic level structure. There was no evidence from any of the regression models to suggest biomagnification of cVMS in Tokyo Bay. Rather, the regression models indicated that trophic dilution of cVMS, not trophic magnification, occurred across the sampled food web. Comparison of results for Tokyo Bay to results from other studies indicated that bioaccumulation of cVMS was not related to type of food web (pelagic vs demersal), environment (marine vs freshwater), species composition, or location. Rather, results suggested that differences between study areas were likely related to food web dynamics and variable conditions of exposure resulting from non-uniform patterns of organism movement across spatial concentration gradients. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
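
    A minimal sketch of deriving a TMF from a regression slope, with a bootstrap interval as a robust check, is given below. It assumes the common convention that log10-transformed concentrations are regressed on trophic level and that TMF = 10^slope; the abstract does not state the transformation or the benchmarking scheme used, and the data values are hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope of the ordinary least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def trophic_magnification_factor(trophic_levels, log10_conc):
    """TMF = 10 ** slope (assumed convention); TMF < 1 suggests trophic dilution."""
    return 10 ** ols_slope(trophic_levels, log10_conc)


def bootstrap_tmf_interval(trophic_levels, log10_conc, n_boot=1000, seed=0):
    """Approximate 95% percentile interval for the TMF from resampled (x, y) pairs."""
    rng = random.Random(seed)
    n = len(trophic_levels)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [trophic_levels[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # degenerate resample: slope undefined, draw again
        ys = [log10_conc[i] for i in idx]
        estimates.append(10 ** ols_slope(xs, ys))
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]


# Hypothetical samples: trophic level and log10 lipid-normalised concentration
levels = [2.0, 2.4, 2.9, 3.1, 3.6, 4.0]
log_conc = [1.82, 1.75, 1.74, 1.66, 1.65, 1.58]

tmf = trophic_magnification_factor(levels, log_conc)
tmf_low, tmf_high = bootstrap_tmf_interval(levels, log_conc)
```

    Under this convention, an interval lying entirely below 1 would be consistent with trophic dilution, and one entirely above 1 with biomagnification.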

  14. [Sociodemographic context of homicide in Mexico City: a spatial analysis].

    PubMed

    Fuentes Flores, César; Sánchez Salinas, Omar

    2015-12-01

    Investigate the spatial distribution pattern of the homicide rate and its relation to sociodemographic features in the Benito Juárez, Coyoacán, and Cuauhtémoc districts of Mexico City in 2010. Inferential cross-sectional study that uses spatial analysis methods to study the spatial association of the homicide rate and demographic features. Spatial association was determined through the location quotient, multiple regression analysis, and the use of geographically weighted regression. Homicides show a heterogeneous location pattern with high rates in areas with non-residential land use, low population density, and low marginalization. Spatial analysis tools are powerful instruments for the design of prevention- and recreation-focused public safety policies that aim to reduce mortality from external causes such as homicides.
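
    For readers unfamiliar with the location quotient mentioned in this abstract, a minimal sketch follows. It assumes the common definition, the local rate divided by the rate for the whole study region; the abstract does not state which variant the authors used, and the counts below are hypothetical.

```python
def location_quotient(local_events, local_population, total_events, total_population):
    """Ratio of the local event rate to the rate in the whole study region.

    Values above 1 indicate a local rate higher than the regional rate.
    """
    local_rate = local_events / local_population
    regional_rate = total_events / total_population
    return local_rate / regional_rate


# Hypothetical area: 10 homicides among 1,000 residents,
# in a region with 50 homicides among 10,000 residents.
lq = location_quotient(10, 1000, 50, 10000)
```

    In this made-up case the area's rate is twice the regional rate, so the quotient is about 2.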

  15. A generalized model for estimating the energy density of invertebrates

    USGS Publications Warehouse

    James, Daniel A.; Csargo, Isak J.; Von Eschen, Aaron; Thul, Megan D.; Baker, James M.; Hayer, Cari-Ann; Howell, Jessica; Krause, Jacob; Letvin, Alex; Chipps, Steven R.

    2012-01-01

    Invertebrate energy density (ED) values are traditionally measured using bomb calorimetry. However, many researchers rely on a few published literature sources to obtain ED values because of time and sampling constraints on measuring ED with bomb calorimetry. Literature values often do not account for spatial or temporal variability associated with invertebrate ED. Thus, these values can be unreliable for use in models and other ecological applications. We evaluated the generality of the relationship between invertebrate ED and proportion of dry-to-wet mass (pDM). We then developed and tested a regression model to predict ED from pDM based on a taxonomically, spatially, and temporally diverse sample of invertebrates representing 28 orders in aquatic (freshwater, estuarine, and marine) and terrestrial (temperate and arid) habitats from 4 continents and 2 oceans. Samples included invertebrates collected in all seasons over the last 19 y. Evaluation of these data revealed a significant relationship between ED and pDM (r2  =  0.96, p < 0.0001), where ED (as J/g wet mass) was estimated from pDM as ED  =  22,960pDM − 174.2. Model evaluation showed that nearly all (98.8%) of the variability between observed and predicted values for invertebrate ED could be attributed to residual error in the model. Regression of observed on predicted values revealed that the 97.5% joint confidence region included the intercept of 0 (−103.0 ± 707.9) and slope of 1 (1.01 ± 0.12). Use of this model requires that only dry and wet mass measurements be obtained, resulting in significant time, sample size, and cost savings compared to traditional bomb calorimetry approaches. This model should prove useful for a wide range of ecological studies because it is unaffected by taxonomic, seasonal, or spatial variability.
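
    The regression reported in this abstract, ED = 22,960pDM − 174.2 (J/g wet mass), can be applied directly once dry and wet mass are measured. The sketch below wraps it in two small functions; the function names and the sample masses are hypothetical.

```python
def proportion_dry_mass(dry_mass_g, wet_mass_g):
    """pDM: proportion of dry to wet mass."""
    if wet_mass_g <= 0 or not 0 <= dry_mass_g <= wet_mass_g:
        raise ValueError("expected 0 <= dry mass <= wet mass and wet mass > 0")
    return dry_mass_g / wet_mass_g


def energy_density_wet(p_dm):
    """Energy density in J/g wet mass from the equation quoted in the abstract."""
    if not 0.0 <= p_dm <= 1.0:
        raise ValueError("pDM is a proportion and must lie in [0, 1]")
    return 22960.0 * p_dm - 174.2


# Hypothetical sample: 0.5 g dry mass from 2.5 g wet mass, so pDM = 0.2
p_dm = proportion_dry_mass(0.5, 2.5)
ed = energy_density_wet(p_dm)  # 22960 * 0.2 - 174.2 = 4417.8 J/g wet mass
```

    Because the intercept is negative, very small pDM values give negative estimates, so the equation is presumably only meaningful within the range of pDM values sampled in the study.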

  16. High Resolution Mapping of Soil Properties Using Remote Sensing Variables in South-Western Burkina Faso: A Comparison of Machine Learning and Multiple Linear Regression Models

    PubMed Central

    Welp, Gerhard; Thiel, Michael

    2017-01-01

    Accurate and detailed spatial soil information is essential for environmental modelling, risk assessment and decision making. The use of Remote Sensing data as secondary sources of information in digital soil mapping has been found to be cost effective and less time consuming compared to traditional soil mapping approaches. But the potentials of Remote Sensing data in improving knowledge of local scale soil information in West Africa have not been fully explored. This study investigated the use of high spatial resolution satellite data (RapidEye and Landsat), terrain/climatic data and laboratory analysed soil samples to map the spatial distribution of six soil properties–sand, silt, clay, cation exchange capacity (CEC), soil organic carbon (SOC) and nitrogen–in a 580 km2 agricultural watershed in south-western Burkina Faso. Four statistical prediction models–multiple linear regression (MLR), random forest regression (RFR), support vector machine (SVM), stochastic gradient boosting (SGB)–were tested and compared. Internal validation was conducted by cross validation while the predictions were validated against an independent set of soil samples considering the modelling area and an extrapolation area. Model performance statistics revealed that the machine learning techniques performed marginally better than the MLR, with the RFR providing in most cases the highest accuracy. The inability of MLR to handle non-linear relationships between dependent and independent variables was found to be a limitation in accurately predicting soil properties at unsampled locations. Satellite data acquired during ploughing or early crop development stages (e.g. May, June) were found to be the most important spectral predictors while elevation, temperature and precipitation came up as prominent terrain/climatic variables in predicting soil properties. The results further showed that shortwave infrared and near infrared channels of Landsat8 as well as soil specific indices of redness, coloration and saturation were prominent predictors in digital soil mapping. Considering the increased availability of free Remote Sensing data (e.g. Landsat, SRTM, Sentinels), soil information at local and regional scales in data poor regions such as West Africa can be improved with relatively little financial and human resources. PMID:28114334

  17. A pseudo-penalized quasi-likelihood approach to the spatial misalignment problem with non-normal data.

    PubMed

    Lopiano, Kenneth K; Young, Linda J; Gotway, Carol A

    2014-09-01

    Spatially referenced datasets arising from multiple sources are routinely combined to assess relationships among various outcomes and covariates. The geographical units associated with the data, such as the geographical coordinates or areal-level administrative units, are often spatially misaligned, that is, observed at different locations or aggregated over different geographical units. As a result, the covariate is often predicted at the locations where the response is observed. The method used to align disparate datasets must be accounted for when subsequently modeling the aligned data. Here we consider the case where kriging is used to align datasets in point-to-point and point-to-areal misalignment problems when the response variable is non-normally distributed. If the relationship is modeled using generalized linear models, the additional uncertainty induced from using the kriging mean as a covariate introduces a Berkson error structure. In this article, we develop a pseudo-penalized quasi-likelihood algorithm to account for the additional uncertainty when estimating regression parameters and associated measures of uncertainty. The method is applied to a point-to-point example assessing the relationship between low-birth weights and PM2.5 levels after the onset of the largest wildfire in Florida history, the Bugaboo scrub fire. A point-to-areal misalignment problem is presented where the relationship between asthma events in Florida's counties and PM2.5 levels after the onset of the fire is assessed. Finally, the method is evaluated using a simulation study. Our results indicate the method performs well in terms of coverage for 95% confidence intervals and naive methods that ignore the additional uncertainty tend to underestimate the variability associated with parameter estimates. The underestimation is most profound in Poisson regression models. © 2014, The International Biometric Society.

  18. Governance and Regional Variation of Homicide Rates: Evidence From Cross-National Data.

    PubMed

    Cao, Liqun; Zhang, Yan

    2017-01-01

    Criminological theories of cross-national studies of homicide have underestimated the effects of quality governance of liberal democracy and region. Data sets from several sources are combined and a comprehensive model of homicide is proposed. Results of the spatial regression model, which controls for the effect of spatial autocorrelation, show that quality governance, human development, economic inequality, and ethnic heterogeneity are statistically significant in predicting homicide. In addition, regions of Latin America and non-Muslim Sub-Saharan Africa have significantly higher rates of homicides ceteris paribus while the effects of East Asian countries and Islamic societies are not statistically significant. These findings are consistent with the expectation of the new modernization and regional theories. © The Author(s) 2015.
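
    The abstract does not say how spatial autocorrelation was measured before being controlled for. One widely used diagnostic is Moran's I, sketched below as an assumption rather than as the authors' method; the values and the neighbour structure are hypothetical.

```python
def morans_i(values, weights):
    """Moran's I for a list of values and a square spatial weights matrix.

    I = (n / W) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2,
    where m is the mean of the values and W is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    weight_total = sum(sum(row) for row in weights)
    numerator = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    denominator = sum(d * d for d in dev)
    return (n / weight_total) * numerator / denominator


# Four hypothetical countries arranged in a line; neighbours share a border.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = morans_i([1.0, 2.0, 3.0, 4.0], W)      # similar values adjacent
alternating = morans_i([1.0, -1.0, 1.0, -1.0], W)  # dissimilar values adjacent
```

    Positive values indicate that neighbouring units tend to have similar rates, which is the situation in which a non-spatial regression would understate its standard errors.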

  19. Modeling and predicting urban growth pattern of the Tokyo metropolitan area based on cellular automata

    NASA Astrophysics Data System (ADS)

    Zhao, Yaolong; Zhao, Junsan; Murayama, Yuji

    2008-10-01

    The period of high economic growth in Japan, which began in the latter half of the 1950s, led to a massive migration of population from rural regions to the Tokyo metropolitan area. This phenomenon brought about rapid urban growth and changes in urban structure in this area. The purpose of this study is to establish a constrained CA (Cellular Automata) model with GIS (Geographical Information Systems) to simulate the urban growth pattern in the Tokyo metropolitan area, with a view to predicting urban form and landscape for the near future. Urban land use is classified into multiple categories for interpreting the effect of interaction among land-use categories in the spatial process of urban growth. Driving factors of the urban growth pattern, such as land condition, railway network, land-use zoning, random perturbation, and neighborhood interaction, are explored and integrated into this model. These driving factors are calibrated based on exploratory spatial data analysis (ESDA), spatial statistics, logistic regression, and a "trial and error" approach. The simulation is assessed at both macro and micro classification levels in three ways: visual approach, fractal dimension, and spatial metrics. Results indicate that this model provides an effective prototype to simulate and predict the urban growth pattern of the Tokyo metropolitan area.
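
    To make the idea of a constrained cellular automaton concrete, the sketch below implements one update step with three of the ingredients named in this abstract: neighborhood interaction, a zoning constraint, and random perturbation. It uses a single urban/non-urban state rather than the multiple land-use categories of the study, and the threshold, noise range and grids are hypothetical.

```python
import random


def ca_step(grid, allowed, threshold=3, noise=0.0, rng=None):
    """One synchronous update of a binary urban-growth cellular automaton.

    grid[r][c] is 1 for urban and 0 otherwise; allowed[r][c] is False where
    development is prohibited (the constraint). A non-urban, allowed cell
    becomes urban when its count of urban neighbours (8-cell neighbourhood)
    plus a uniform random perturbation reaches the threshold.
    """
    rng = rng or random.Random(0)
    n_rows, n_cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]
    for r in range(n_rows):
        for c in range(n_cols):
            if grid[r][c] == 1 or not allowed[r][c]:
                continue
            neighbours = sum(
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(n_rows, r + 2))
                for cc in range(max(0, c - 1), min(n_cols, c + 2))
                if (rr, cc) != (r, c)
            )
            if neighbours + rng.uniform(-noise, noise) >= threshold:
                new_grid[r][c] = 1
    return new_grid


# Hypothetical 3 x 3 area with an urban strip along the top row
start = [[1, 1, 1], [0, 0, 0], [0, 0, 0]]
unconstrained = [[True] * 3 for _ in range(3)]
protected_centre = [[True, True, True], [True, False, True], [True, True, True]]

grown = ca_step(start, unconstrained)       # centre cell has 3 urban neighbours
blocked = ca_step(start, protected_centre)  # same cell, but zoning forbids growth
```

    Calibration, as described in the abstract, would amount to tuning quantities such as the threshold and the noise range until simulated growth matches observed land-use change.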

  20. Retrieval and Mapping of Heavy Metal Concentration in Soil Using Time Series Landsat 8 Imagery

    NASA Astrophysics Data System (ADS)

    Fang, Y.; Xu, L.; Peng, J.; Wang, H.; Wong, A.; Clausi, D. A.

    2018-04-01

    Heavy metal pollution is a critical and long-standing global environmental problem. The traditional approach to obtaining heavy metal concentrations, which relies on field sampling and lab testing, is expensive and time consuming. Many related studies use spectrometer data to build a relational model between heavy metal concentration and spectral information, and then apply the model to hyperspectral imagery for prediction; however, this approach can hardly map the soil metal concentration of an area quickly and accurately because of the discrepancies between spectrometer data and remote sensing imagery. Taking advantage of the easy accessibility of Landsat 8 data, this study uses Landsat 8 imagery to retrieve soil Cu concentration and map its distribution in the study area. To enlarge the spectral information for more accurate retrieval and mapping, 11 single-date Landsat 8 images from 2013-2017 are selected to form a time series. Three regression methods, partial least squares regression (PLSR), artificial neural network (ANN) and support vector regression (SVR), are used for model construction. After an unbiased comparison of these models, the best model is selected to map the Cu concentration distribution. The produced distribution map shows good spatial autocorrelation and consistency with the mining area locations.
