Spatial Assessment of Model Errors from Four Regression Techniques
Lianjun Zhang; Jeffrey H. Gove
2005-01-01
Forest modelers have attempted to account for the spatial autocorrelations among trees in growth and yield models by applying alternative regression techniques such as linear mixed models (LMM), generalized additive models (GAM), and geographically weighted regression (GWR). However, the model errors are commonly assessed using average errors across the entire study...
Spatial interpolation schemes of daily precipitation for hydrologic modeling
Hwang, Y.; Clark, M.R.; Rajagopalan, B.; Leavesley, G.
2012-01-01
Distributed hydrologic models typically require spatial estimates of precipitation interpolated from sparsely located observational points to the specific grid points. We compare and contrast the performance of regression-based statistical methods for the spatial estimation of precipitation in two hydrologically different basins and confirm that widely used regression-based estimation schemes fail to describe the realistic spatial variability of the daily precipitation field. The methods assessed are: (1) inverse distance weighted average; (2) multiple linear regression (MLR); (3) climatological MLR (CMLR); and (4) locally weighted polynomial regression (LWP). In order to improve the performance of the interpolations, the authors propose a two-step regression technique for effective daily precipitation estimation. In this simple two-step estimation process, precipitation occurrence is first generated via a logistic regression model before the amount of precipitation is estimated separately on wet days. This process effectively reproduces precipitation occurrence, amount, and spatial correlation. A distributed hydrologic model (PRMS) was used for the impact analysis in daily time step simulation. Multiple simulations suggested noticeable differences between the input alternatives generated by three different interpolation schemes. Differences are shown in overall simulation error against the observations, degree of explained variability, and seasonal volumes. Simulated streamflows also showed different characteristics in mean, maximum, minimum, and peak flows. Given the same parameter optimization technique, the LWP input showed the least streamflow error in the Alapaha basin and the CMLR input showed the least error (still very close to LWP) in the Animas basin. All of the two-step interpolation inputs resulted in lower streamflow error compared to the directly interpolated inputs. © 2011 Springer-Verlag.
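The two-step scheme described above can be sketched on synthetic station data. Everything here (a single elevation covariate, the coefficient values, lognormal wet-day amounts) is an illustrative assumption, not the authors' actual model:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Synthetic stations: elevation (km) as the single covariate (hypothetical).
n = 500
elev = rng.uniform(0.0, 2.0, n)
X = np.column_stack([np.ones(n), elev])

# Occurrence probability rises with elevation; wet-day amounts are lognormal.
p_true = 1.0 / (1.0 + np.exp(-(-1.0 + 1.5 * elev)))
wet = rng.random(n) < p_true
amount = np.where(wet, np.exp(1.0 + 0.8 * elev + rng.normal(0.0, 0.3, n)), 0.0)

# Step 1: logistic regression for precipitation occurrence.
def nll(beta):
    z = X @ beta
    return np.sum(np.logaddexp(0.0, z) - wet * z)

b_occ = minimize(nll, np.zeros(2)).x

# Step 2: linear regression for log-amount, fitted on wet days only.
b_amt, *_ = np.linalg.lstsq(X[wet], np.log(amount[wet]), rcond=None)

# Combined daily estimate at a target grid point: P(wet) * E[amount | wet].
def estimate(e):
    x = np.array([1.0, e])
    p = 1.0 / (1.0 + np.exp(-(x @ b_occ)))
    return p * np.exp(x @ b_amt)
```

Separating occurrence from amount avoids the zero-inflation that defeats a single regression on daily totals.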
Importance of spatial autocorrelation in modeling bird distributions at a continental scale
Bahn, V.; O'Connor, R.J.; Krohn, W.B.
2006-01-01
Spatial autocorrelation in species' distributions has been recognized as inflating the probability of a type I error in hypothesis tests, causing biases in variable selection, and violating the assumption of independence of error terms in models such as correlation or regression. However, it remains unclear whether these problems occur at all spatial resolutions and extents, and under which conditions spatially explicit modeling techniques are superior. Our goal was to determine whether spatial models were superior at large extents and across many different species. In addition, we investigated the importance of purely spatial effects in distribution patterns relative to the variation that could be explained through environmental conditions. We studied distribution patterns of 108 bird species in the conterminous United States using ten years of data from the Breeding Bird Survey. We compared the performance of spatially explicit regression models with non-spatial regression models using Akaike's information criterion. In addition, we partitioned the variance in species distributions into an environmental, a pure spatial and a shared component. The spatially-explicit conditional autoregressive regression models strongly outperformed the ordinary least squares regression models. In addition, partialling out the spatial component underlying the species' distributions showed that an average of 17% of the explained variation could be attributed to purely spatial effects independent of the spatial autocorrelation induced by the underlying environmental variables. We concluded that location in the range and neighborhood play an important role in the distribution of species. Spatially explicit models are expected to yield better predictions especially for mobile species such as birds, even in coarse-grained models with a large extent. © Ecography.
Alexeeff, Stacey E; Carroll, Raymond J; Coull, Brent
2016-04-01
Spatial modeling of air pollution exposures is widespread in air pollution epidemiology research as a way to improve exposure assessment. However, there are key sources of exposure model uncertainty when air pollution is modeled, including estimation error and model misspecification. We examine the use of predicted air pollution levels in linear health effect models under a measurement error framework. For the prediction of air pollution exposures, we consider a universal Kriging framework, which may include land-use regression terms in the mean function and a spatial covariance structure for the residuals. We derive the bias induced by estimation error and by model misspecification in the exposure model, and we find that a misspecified exposure model can induce asymptotic bias in the effect estimate of air pollution on health. We propose a new spatial simulation extrapolation (SIMEX) procedure, and we demonstrate that the procedure has good performance in correcting this asymptotic bias. We illustrate spatial SIMEX in a study of air pollution and birthweight in Massachusetts. © The Author 2015. Published by Oxford University Press. All rights reserved.
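The SIMEX idea (deliberately adding extra measurement error at increasing levels, then extrapolating the fitted effect back to the no-error case) can be illustrated on a toy linear health model. The error variance, true slope, and quadratic extrapolant below are illustrative assumptions, not the paper's spatial procedure:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
x = rng.normal(0.0, 1.0, n)              # true exposure
w = x + rng.normal(0.0, 0.7, n)          # measured exposure (classical error, sd known here)
y = 2.0 * x + rng.normal(0.0, 1.0, n)    # health outcome; true slope = 2

def slope(a, b):
    return np.cov(a, b)[0, 1] / np.var(a)

naive = slope(w, y)                      # attenuated toward zero

# SIMEX: add extra error at levels lambda, average slopes, extrapolate to -1.
lams = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
sl = []
for lam in lams:
    reps = [slope(w + rng.normal(0.0, np.sqrt(lam) * 0.7, n), y) for _ in range(50)]
    sl.append(np.mean(reps))
coef = np.polyfit(lams, sl, 2)           # quadratic extrapolant in lambda
simex = np.polyval(coef, -1.0)           # lambda = -1 corresponds to zero error
```

The spatial SIMEX of the paper replaces the independent added noise with spatially structured perturbations, but the extrapolation logic is the same.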
Benavides-Varela, S; Piva, D; Burgio, F; Passarini, L; Rolma, G; Meneghello, F; Semenza, C
2017-03-01
Arithmetical deficits in right-hemisphere damaged patients have been traditionally considered secondary to visuo-spatial impairments, although the exact relationship between the two deficits has rarely been assessed. The present study implemented a voxelwise lesion analysis among 30 right-hemisphere damaged patients and a controlled, matched-sample, cross-sectional analysis with 35 cognitively normal controls regressing three composite cognitive measures on standardized numerical measures. The results showed that patients and controls significantly differ in Number comprehension, Transcoding, and Written operations, particularly subtractions and multiplications. The percentage of patients performing below the cutoffs ranged between 27% and 47% across these tasks. Spatial errors were associated with extensive lesions in fronto-temporo-parietal regions, which frequently lead to neglect, whereas pure arithmetical errors appeared related to more confined lesions in the right angular gyrus and its proximity. Stepwise regression models consistently revealed that spatial errors were primarily predicted by composite measures of visuo-spatial attention/neglect and representational abilities. Conversely, specific errors of an arithmetical nature were linked to representational abilities only. Crucially, the proportion of arithmetical errors (ranging from 65% to 100% across tasks) was higher than that of spatial ones. These findings thus suggest that unilateral right hemisphere lesions can directly affect core numerical/arithmetical processes, and that right-hemisphere acalculia is not only ascribable to visuo-spatial deficits as traditionally thought. Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Setiyorini, Anis; Suprijadi, Jadi; Handoko, Budhi
2017-03-01
Geographically Weighted Regression (GWR) is a regression model that takes into account the spatial heterogeneity effect. In the application of GWR, inference on regression coefficients is often of interest, as is estimation and prediction of the response variable. Empirical studies have demonstrated that local correlation between explanatory variables can lead to estimated regression coefficients in GWR that are strongly correlated, a condition named multicollinearity. This in turn results in large standard errors for the estimated regression coefficients and is hence problematic for inference on relationships between variables. Geographically Weighted Lasso (GWL) is a method capable of dealing with spatial heterogeneity and local multicollinearity in spatial data sets. GWL is a further development of the GWR method, which adds a LASSO (Least Absolute Shrinkage and Selection Operator) constraint in parameter estimation. In this study, GWL is applied using a fixed exponential kernel weights matrix to establish a poverty model for Java Island, Indonesia. The results of applying GWL to the poverty datasets show that this method stabilizes regression coefficients in the presence of multicollinearity and produces lower prediction and estimation error of the response variable than GWR does.
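A minimal sketch of the GWL idea: at each regression point, observations are weighted by a fixed exponential kernel of distance, and a LASSO (soft-thresholding) penalty stabilizes locally collinear coefficients. The data, bandwidth, and penalty below are hypothetical, and only a single regression point is fitted:

```python
import numpy as np

rng = np.random.default_rng(2)

def gwl_fit(X, y, w, lam, n_iter=500):
    """Weighted lasso at one regression point via coordinate descent;
    lam = 0 reduces to plain geographically weighted least squares."""
    Xw = X * np.sqrt(w)[:, None]
    yw = y * np.sqrt(w)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            r = yw - Xw @ beta + Xw[:, j] * beta[j]
            rho = Xw[:, j] @ r
            z = Xw[:, j] @ Xw[:, j]
            if j == 0:
                beta[j] = rho / z                                  # intercept unpenalized
            else:
                beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / z
    return beta

# Synthetic data with two nearly collinear covariates (local multicollinearity).
n = 200
coords = rng.uniform(0.0, 10.0, (n, 2))
x1 = rng.normal(0.0, 1.0, n)
x2 = x1 + rng.normal(0.0, 0.3, n)
y = 1.0 + 2.0 * x1 + rng.normal(0.0, 0.5, n)
X = np.column_stack([np.ones(n), x1, x2])

# Fixed exponential kernel weights around one regression point.
d = np.linalg.norm(coords - np.array([5.0, 5.0]), axis=1)
w = np.exp(-d / 2.0)

beta_gwr = gwl_fit(X, y, w, lam=0.0)   # GWR: coefficients split unstably over x1, x2
beta_gwl = gwl_fit(X, y, w, lam=5.0)   # GWL: LASSO shrinks the redundant coefficient
```

In a full GWL this fit is repeated at every location, and the penalty is typically chosen by cross-validation rather than fixed as here.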
Yang, Shun-hua; Zhang, Hai-tao; Guo, Long; Ren, Yan
2015-06-01
Relative elevation and stream power index were selected as auxiliary variables based on correlation analysis for mapping soil organic matter. Geographically weighted regression kriging (GWRK) and regression kriging (RK) were used for spatial interpolation of soil organic matter and compared with ordinary kriging (OK), which acts as a control. The results indicated that soil organic matter was significantly positively correlated with relative elevation whilst it had a significantly negative correlation with stream power index. Semivariance analysis showed that both soil organic matter content and its residuals (including the ordinary least squares regression residual and the GWR residual) had strong spatial autocorrelation. Interpolation accuracies by different methods were estimated based on a data set of 98 validation samples. Results showed that the mean error (ME), mean absolute error (MAE) and root mean square error (RMSE) of RK were respectively 39.2%, 17.7% and 20.6% lower than the corresponding values of OK, with a relative improvement (RI) of 20.63. GWRK showed a similar tendency, with its ME, MAE and RMSE respectively 60.6%, 23.7% and 27.6% lower than those of OK, and an RI of 59.79. Therefore, both RK and GWRK significantly improved the accuracy of OK interpolation of soil organic matter due to their incorporation of auxiliary variables. In addition, GWRK performed clearly better than RK in this study, and its improved performance should be attributed to the consideration of sample spatial locations.
Alan K. Swanson; Solomon Z. Dobrowski; Andrew O. Finley; James H. Thorne; Michael K. Schwartz
2013-01-01
The uncertainty associated with species distribution model (SDM) projections is poorly characterized, despite its potential value to decision makers. Error estimates from most modelling techniques have been shown to be biased due to their failure to account for spatial autocorrelation (SAC) of residual error. Generalized linear mixed models (GLMM) have the ability to...
Latin hypercube approach to estimate uncertainty in ground water vulnerability
Gurdak, J.J.; McCray, J.E.; Thyne, G.; Qi, S.L.
2007-01-01
A methodology is proposed to quantify prediction uncertainty associated with ground water vulnerability models that were developed through an approach that coupled multivariate logistic regression with a geographic information system (GIS). This method uses Latin hypercube sampling (LHS) to illustrate the propagation of input error and estimate uncertainty associated with the logistic regression predictions of ground water vulnerability. Central to the proposed method is the assumption that prediction uncertainty in ground water vulnerability models is a function of input error propagation from uncertainty in the estimated logistic regression model coefficients (model error) and the values of explanatory variables represented in the GIS (data error). Input probability distributions that represent both model and data error sources of uncertainty were simultaneously sampled using a Latin hypercube approach with logistic regression calculations of probability of elevated nonpoint source contaminants in ground water. The resulting probability distribution represents the prediction intervals and associated uncertainty of the ground water vulnerability predictions. The method is illustrated through a ground water vulnerability assessment of the High Plains regional aquifer. Results of the LHS simulations reveal significant prediction uncertainties that vary spatially across the regional aquifer. Additionally, the proposed method enables a spatial deconstruction of the prediction uncertainty that can lead to improved prediction of ground water vulnerability. © 2007 National Ground Water Association.
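The LHS propagation step can be sketched for a single map cell, sampling both coefficient uncertainty (model error) and covariate uncertainty (data error) through the logistic prediction. All coefficient values, standard errors, and cell values below are hypothetical:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)

def lhs(n_samples, n_dims, rng):
    """Basic Latin hypercube sample on [0,1]^d: one draw per equal-probability stratum."""
    u = (rng.random((n_samples, n_dims)) + np.arange(n_samples)[:, None]) / n_samples
    for j in range(n_dims):
        rng.shuffle(u[:, j])
    return u

# Hypothetical logistic vulnerability model with two explanatory variables.
beta_mean = np.array([-1.0, 0.8, -0.5])   # fitted coefficients (model error source)
beta_se = np.array([0.3, 0.2, 0.15])      # coefficient standard errors
x_mean = np.array([1.0, 2.0, 1.5])        # GIS cell values (data error source)
x_se = np.array([0.0, 0.4, 0.3])          # the intercept column carries no data error

n = 1000
u = lhs(n, 6, rng)
betas = norm.ppf(u[:, :3]) * beta_se + beta_mean   # sample model error
xs = norm.ppf(u[:, 3:]) * x_se + x_mean            # sample data error
p = 1.0 / (1.0 + np.exp(-np.sum(betas * xs, axis=1)))

# The spread of p is the prediction interval for this cell's vulnerability.
lo, med, hi = np.percentile(p, [2.5, 50.0, 97.5])
```

Repeating this per cell yields the spatially varying uncertainty maps the paper describes; a production version would also respect the coefficient covariance matrix rather than treating coefficients as independent.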
Crime Modeling using Spatial Regression Approach
NASA Astrophysics Data System (ADS)
Saleh Ahmar, Ansari; Adiatma; Kasim Aidid, M.
2018-01-01
Acts of criminality in Indonesia increase in both variety and quantity every year, including murder, rape, assault, vandalism, theft, fraud, fencing, and other cases that make people feel unsafe. The risk of society's exposure to crime is measured by the number of cases reported to police institutions: the higher the number of reports, the higher the crime rate in the region. This research models criminality in South Sulawesi, Indonesia, with society's exposure to the risk of crime as the dependent variable. Modeling follows an areal approach using the Spatial Autoregressive (SAR) and Spatial Error Model (SEM) methods. The independent variables used are population density, the number of poor people, GDP per capita, unemployment, and the human development index (HDI). The spatial regression analysis shows that there are no spatial dependencies, in either lag or error form, in South Sulawesi.
Accounting for spatial effects in land use regression for urban air pollution modeling.
Bertazzon, Stefania; Johnson, Markey; Eccles, Kristin; Kaplan, Gilaad G
2015-01-01
In order to accurately assess air pollution risks, health studies require spatially resolved pollution concentrations. Land-use regression (LUR) models estimate ambient concentrations at a fine spatial scale. However, spatial effects such as spatial non-stationarity and spatial autocorrelation can reduce the accuracy of LUR estimates by increasing regression errors and uncertainty; and statistical methods for resolving these effects--e.g., spatially autoregressive (SAR) and geographically weighted regression (GWR) models--may be difficult to apply simultaneously. We used an alternate approach to address spatial non-stationarity and spatial autocorrelation in LUR models for nitrogen dioxide. Traditional models were re-specified to include a variable capturing wind speed and direction, and re-fit as GWR models. Mean R(2) values for the resulting GWR-wind models (summer: 0.86, winter: 0.73) showed a 10-20% improvement over traditional LUR models. GWR-wind models effectively addressed both spatial effects and produced meaningful predictive models. These results suggest a useful method for improving spatially explicit models. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Smith, Tony E.; Lee, Ka Lok
2012-01-01
There is a common belief that the presence of residual spatial autocorrelation in ordinary least squares (OLS) regression leads to inflated significance levels in beta coefficients and, in particular, inflated levels relative to the more efficient spatial error model (SEM). However, our simulations show that this is not always the case. Hence, the purpose of this paper is to examine this question from a geometric viewpoint. The key idea is to characterize the OLS test statistic in terms of angle cosines and examine the geometric implications of this characterization. Our first result is to show that if the explanatory variables in the regression exhibit no spatial autocorrelation, then the distribution of test statistics for individual beta coefficients in OLS is independent of any spatial autocorrelation in the error term. Hence, inferences about betas exhibit all the optimality properties of the classic uncorrelated error case. However, a second more important series of results show that if spatial autocorrelation is present in both the dependent and explanatory variables, then the conventional wisdom is correct. In particular, even when an explanatory variable is statistically independent of the dependent variable, such joint spatial dependencies tend to produce "spurious correlation" that results in over-rejection of the null hypothesis. The underlying geometric nature of this problem is clarified by illustrative examples. The paper concludes with a brief discussion of some possible remedies for this problem.
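The paper's two regimes can be reproduced in a small Monte Carlo sketch: OLS t-tests on a spatially autocorrelated response keep roughly nominal size when the explanatory variable is iid, but over-reject when the explanatory variable is itself spatially autocorrelated. The grid size, autoregressive parameter, and trial count below are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)

# Rook-contiguity weights on a 15x15 grid, row-standardized.
m = 15
n = m * m
W = np.zeros((n, n))
for i in range(m):
    for j in range(m):
        for di, dj in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
            a, b = i + di, j + dj
            if 0 <= a < m and 0 <= b < m:
                W[i * m + j, a * m + b] = 1.0
W /= W.sum(axis=1, keepdims=True)

def sem_field(rho):
    """Spatial-error field u = (I - rho W)^(-1) eps."""
    return np.linalg.solve(np.eye(n) - rho * W, rng.normal(0.0, 1.0, n))

def reject_rate(x_spatial, trials=200):
    rej = 0
    for _ in range(trials):
        x = sem_field(0.8) if x_spatial else rng.normal(0.0, 1.0, n)
        y = sem_field(0.8)                 # y is spatial noise, independent of x
        X = np.column_stack([np.ones(n), x])
        b, *_ = np.linalg.lstsq(X, y, rcond=None)
        e = y - X @ b
        se = np.sqrt((e @ e) / (n - 2) * np.linalg.inv(X.T @ X)[1, 1])
        rej += abs(b[1] / se) > 1.96       # naive OLS t-test at the 5% level
    return rej / trials

r_iid = reject_rate(False)      # x not autocorrelated: size stays near nominal
r_spatial = reject_rate(True)   # x and y both autocorrelated: over-rejection
```

Since x and y are generated independently, any rejection is spurious; the gap between the two rates is exactly the "spurious correlation" effect the paper analyzes geometrically.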
Enhancing hyperspectral spatial resolution using multispectral image fusion: A wavelet approach
NASA Astrophysics Data System (ADS)
Jazaeri, Amin
High spectral and spatial resolution images have a significant impact in remote sensing applications. Because both spatial and spectral resolutions of spaceborne sensors are fixed by design and it is not possible to further increase the spatial or spectral resolution, techniques such as image fusion must be applied to achieve such goals. This dissertation introduces the concept of wavelet fusion between hyperspectral and multispectral sensors in order to enhance the spectral and spatial resolution of a hyperspectral image. To test the robustness of this concept, images from Hyperion (hyperspectral sensor) and Advanced Land Imager (multispectral sensor) were first co-registered and then fused using different wavelet algorithms. A regression-based fusion algorithm was also implemented for comparison purposes. The results show that the fused images using a combined bi-linear wavelet-regression algorithm have less error than other methods when compared to the ground truth. In addition, a combined regression-wavelet algorithm shows more immunity to misalignment of the pixels due to the lack of proper registration. The quantitative measures of average mean square error show that the performance of wavelet-based methods degrades when the spatial resolution of hyperspectral images becomes eight times less than its corresponding multispectral image. Regardless of what method of fusion is utilized, the main challenge in image fusion is image registration, which is also a very time intensive process. Because the combined regression wavelet technique is computationally expensive, a hybrid technique based on regression and wavelet methods was also implemented to decrease computational overhead. However, the gain in faster computation was offset by the introduction of more error in the outcome. 
The secondary objective of this dissertation is to examine the feasibility and sensor requirements for image fusion for future NASA missions in order to be able to perform onboard image fusion. In this process, the main challenge of image registration was resolved by registering the input images using transformation matrices of previously acquired data. The composite image resulted from the fusion process remarkably matched the ground truth, indicating the possibility of real time onboard fusion processing.
Estimating riparian understory vegetation cover with beta regression and copula models
Eskelson, Bianca N.I.; Madsen, Lisa; Hagar, Joan C.; Temesgen, Hailemariam
2011-01-01
Understory vegetation communities are critical components of forest ecosystems. As a result, the importance of modeling understory vegetation characteristics in forested landscapes has become more apparent. Abundance measures such as shrub cover are bounded between 0 and 1, exhibit heteroscedastic error variance, and are often subject to spatial dependence. These distributional features tend to be ignored when shrub cover data are analyzed. The beta distribution has been used successfully to describe the frequency distribution of vegetation cover. Beta regression models ignoring spatial dependence (BR) and accounting for spatial dependence (BRdep) were used to estimate percent shrub cover as a function of topographic conditions and overstory vegetation structure in riparian zones in western Oregon. The BR models showed poor explanatory power (pseudo-R2 ≤ 0.34) but outperformed ordinary least-squares (OLS) and generalized least-squares (GLS) regression models with logit-transformed response in terms of mean square prediction error and absolute bias. We introduce a copula (COP) model that is based on the beta distribution and accounts for spatial dependence. A simulation study was designed to illustrate the effects of incorrectly assuming normality, equal variance, and spatial independence. It showed that BR, BRdep, and COP models provide unbiased parameter estimates, whereas OLS and GLS models result in slightly biased estimates for two of the three parameters. On the basis of the simulation study, 93–97% of the GLS, BRdep, and COP confidence intervals covered the true parameters, whereas OLS and BR only resulted in 84–88% coverage, which demonstrated the superiority of GLS, BRdep, and COP over OLS and BR models in providing standard errors for the parameter estimates in the presence of spatial dependence.
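Beta regression itself is straightforward to fit by maximum likelihood; a minimal non-spatial sketch (the BR model in the authors' notation) with a logit link for the mean and a constant precision parameter, on simulated cover data, might look like:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln, expit

rng = np.random.default_rng(5)
n = 300
x = rng.uniform(-1.0, 1.0, n)                 # e.g. a topographic covariate
mu_true = expit(0.5 + 1.2 * x)                # mean cover via logit link
phi_true = 10.0                               # precision parameter
y = rng.beta(mu_true * phi_true, (1.0 - mu_true) * phi_true)
y = np.clip(y, 1e-6, 1.0 - 1e-6)              # keep strictly inside (0, 1)
X = np.column_stack([np.ones(n), x])

def nll(theta):
    """Negative log-likelihood of the beta regression (mu, phi parameterization)."""
    beta, logphi = theta[:2], theta[2]
    mu = expit(X @ beta)
    phi = np.exp(logphi)
    a, b = mu * phi, (1.0 - mu) * phi
    return -np.sum(gammaln(phi) - gammaln(a) - gammaln(b)
                   + (a - 1.0) * np.log(y) + (b - 1.0) * np.log(1.0 - y))

fit = minimize(nll, np.array([0.0, 0.0, np.log(5.0)]), method="Nelder-Mead",
               options={"maxiter": 5000, "xatol": 1e-8, "fatol": 1e-8})
b0, b1 = fit.x[:2]
phi_hat = np.exp(fit.x[2])
```

The BRdep and copula models of the paper extend this likelihood with a spatial dependence structure; that extension is omitted here.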
Zhang, Jingyi; Li, Bin; Chen, Yumin; Chen, Meijie; Fang, Tao; Liu, Yongfeng
2018-06-11
This paper proposes a regression model using the Eigenvector Spatial Filtering (ESF) method to estimate ground PM 2.5 concentrations. Covariates are derived from remotely sensed data including aerosol optical depth, normal differential vegetation index, surface temperature, air pressure, relative humidity, height of planetary boundary layer and digital elevation model. In addition, cultural variables such as factory densities and road densities are also used in the model. With the Yangtze River Delta region as the study area, we constructed ESF-based Regression (ESFR) models at different time scales, using data for the period between December 2015 and November 2016. We found that the ESFR models effectively filtered spatial autocorrelation in the OLS residuals and resulted in increases in the goodness-of-fit metrics as well as reductions in residual standard errors and cross-validation errors, compared to the classic OLS models. The annual ESFR model explained 70% of the variability in PM 2.5 concentrations, 16.7% more than the non-spatial OLS model. With the ESFR models, we performed detail analyses on the spatial and temporal distributions of PM 2.5 concentrations in the study area. The model predictions are lower than ground observations but match the general trend. The experiment shows that ESFR provides a promising approach to PM 2.5 analysis and prediction.
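A minimal sketch of eigenvector spatial filtering: eigenvectors of the doubly centred contiguity matrix MCM represent orthogonal map patterns with distinct Moran coefficients, and adding the most positively autocorrelated ones as covariates absorbs spatial structure that a plain OLS model leaves in its residuals. The grid, contiguity definition, and synthetic response below are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(6)

# Binary rook-contiguity matrix C on a 12x12 grid.
m = 12
n = m * m
C = np.zeros((n, n))
for i in range(m):
    for j in range(m):
        k = i * m + j
        if i + 1 < m:
            C[k, k + m] = C[k + m, k] = 1.0
        if j + 1 < m:
            C[k, k + 1] = C[k + 1, k] = 1.0

# Moran eigenvectors: eigendecomposition of the doubly centred matrix M C M.
M = np.eye(n) - np.ones((n, n)) / n
evals, evecs = np.linalg.eigh(M @ C @ M)
E = evecs[:, np.argsort(evals)[::-1][:15]]   # 15 most positively autocorrelated patterns

# Synthetic response: one covariate plus a spatially structured component.
x = rng.normal(0.0, 1.0, n)
y = 1.0 + 2.0 * x + E[:, :3] @ np.array([5.0, 4.0, 3.0]) + rng.normal(0.0, 0.5, n)

def r2(Xd):
    b, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    return 1.0 - np.sum((y - Xd @ b) ** 2) / np.sum((y - y.mean()) ** 2)

r2_ols = r2(np.column_stack([np.ones(n), x]))        # spatial pattern left in residuals
r2_esf = r2(np.column_stack([np.ones(n), x, E]))     # eigenvectors filter it out
```

In practice the candidate eigenvectors are selected by a stepwise criterion rather than taken as a fixed block, but the gain in fit over OLS is the same mechanism as in the ESFR models above.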
Jacob, Benjamin G; Griffith, Daniel A; Muturi, Ephantus J; Caamano, Erick X; Githure, John I; Novak, Robert J
2009-01-01
Background Autoregressive regression coefficients for Anopheles arabiensis aquatic habitat models are usually assessed using global error techniques and are reported as error covariance matrices. A global statistic, however, will summarize error estimates from multiple habitat locations. This makes it difficult to identify where there are clusters of An. arabiensis aquatic habitats of acceptable prediction. It is therefore useful to conduct some form of spatial error analysis to detect clusters of An. arabiensis aquatic habitats based on uncertainty residuals from individual sampled habitats. In this research, a method of error estimation for spatial simulation models was demonstrated using autocorrelation indices and eigenfunction spatial filters to distinguish among the effects of parameter uncertainty on a stochastic simulation of ecologically sampled Anopheles aquatic habitat covariates. A test for diagnostic checking error residuals in an An. arabiensis aquatic habitat model may enable intervention efforts targeting productive habitat clusters, based on larval/pupal productivity, by using the asymptotic distribution of parameter estimates from a residual autocovariance matrix. The models considered in this research extend a normal regression analysis previously considered in the literature. Methods Field and remote-sampled data were collected during July 2006 to December 2007 in Karima rice-village complex in Mwea, Kenya. SAS 9.1.4® was used to explore univariate statistics, correlations, distributions, and to generate global autocorrelation statistics from the ecologically sampled datasets. A local autocorrelation index was also generated using spatial covariance parameters (i.e., Moran's Indices) in a SAS/GIS® database. The Moran's statistic was decomposed into orthogonal and uncorrelated synthetic map pattern components using a Poisson model with a gamma-distributed mean (i.e. negative binomial regression).
The eigenfunction values from the spatial configuration matrices were then used to define expectations for prior distributions using a Markov chain Monte Carlo (MCMC) algorithm. A set of posterior means were defined in WinBUGS 1.4.3®. After the model had converged, samples from the conditional distributions were used to summarize the posterior distribution of the parameters. Thereafter, a spatial residual trend analysis was used to evaluate variance uncertainty propagation in the model using an autocovariance error matrix. Results By specifying coefficient estimates in a Bayesian framework, the covariate number of tillers was found to be a significant predictor, positively associated with An. arabiensis aquatic habitats. The spatial filter models accounted for approximately 19% redundant locational information in the ecologically sampled An. arabiensis aquatic habitat data. In the residual error estimation model there was significant positive autocorrelation (i.e., clustering of habitats in geographic space) based on log-transformed larval/pupal data and the sampled covariate depth of habitat. Conclusion An autocorrelation error covariance matrix and a spatial filter analysis can prioritize mosquito control strategies by providing a computationally attractive and feasible description of variance uncertainty estimates for correctly identifying clusters of prolific An. arabiensis aquatic habitats based on larval/pupal productivity. PMID:19772590
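The global Moran's I underlying the residual clustering analysis above, with permutation-based inference, can be sketched as follows; the grid, weights, and residual field are synthetic:

```python
import numpy as np

rng = np.random.default_rng(7)

def morans_i(z, W):
    """Global Moran's I for values z under spatial weights W."""
    zc = z - z.mean()
    return len(z) / W.sum() * (zc @ W @ zc) / (zc @ zc)

# Row-standardized rook weights on a 10x10 grid.
m = 10
n = m * m
W = np.zeros((n, n))
for i in range(m):
    for j in range(m):
        for di, dj in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
            a, b = i + di, j + dj
            if 0 <= a < m and 0 <= b < m:
                W[i * m + j, a * m + b] = 1.0
W /= W.sum(axis=1, keepdims=True)

# Residuals with built-in clustering: a smooth north-south trend plus noise.
rows = np.repeat(np.arange(m), m)
resid = 0.3 * rows + rng.normal(0.0, 1.0, n)

I_obs = morans_i(resid, W)

# Permutation test: shuffling residuals across locations breaks the clustering.
perm = np.array([morans_i(rng.permutation(resid), W) for _ in range(999)])
p_val = (1 + np.sum(perm >= I_obs)) / 1000
```

A significant positive I on model residuals, as reported in the Results above, indicates that habitats with similar prediction errors cluster in geographic space.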
Alexeeff, Stacey E.; Schwartz, Joel; Kloog, Itai; Chudnovsky, Alexandra; Koutrakis, Petros; Coull, Brent A.
2016-01-01
Many epidemiological studies use predicted air pollution exposures as surrogates for true air pollution levels. These predicted exposures contain exposure measurement error, yet simulation studies have typically found negligible bias in resulting health effect estimates. However, previous studies typically assumed a statistical spatial model for air pollution exposure, which may be oversimplified. We address this shortcoming by assuming a realistic, complex exposure surface derived from fine-scale (1 km × 1 km) remote-sensing satellite data. Using simulation, we evaluate the accuracy of epidemiological health effect estimates in linear and logistic regression when using spatial air pollution predictions from kriging and land use regression models. We examined chronic (long-term) and acute (short-term) exposure to air pollution. Results varied substantially across different scenarios. Exposure models with low out-of-sample R2 yielded severe biases in the health effect estimates of some models, ranging from 60% upward bias to 70% downward bias. One land use regression exposure model with greater than 0.9 out-of-sample R2 yielded upward biases up to 13% for acute health effect estimates. Almost all models drastically underestimated the standard errors. Land use regression models performed better in chronic effects simulations. These results can help researchers when interpreting health effect estimates in these types of studies. PMID:24896768
German M. Izon; Michael S. Hand; Daniel W. Mccollum; Jennifer A. Thacher; Robert P. Berrens
2016-01-01
The existing literature suggests that the presence of natural amenities, such as open spaces, can be highly valued and affect economic decisions about where people live and work. This article contributes to previous research by testing this hypothesis using a unique micro-level data set and by examining spatial variations in income levels and housing prices in the...
Comparison of spatial association approaches for landscape mapping of soil organic carbon stocks
NASA Astrophysics Data System (ADS)
Miller, B. A.; Koszinski, S.; Wehrhan, M.; Sommer, M.
2015-03-01
The distribution of soil organic carbon (SOC) can be variable at small analysis scales, but consideration of its role in regional and global issues demands the mapping of large extents. There are many different strategies for mapping SOC, among which are to model the variables needed to calculate the SOC stock indirectly or to model the SOC stock directly. The purpose of this research is to compare direct and indirect approaches to mapping SOC stocks from rule-based, multiple linear regression models applied at the landscape scale via spatial association. The final products for both strategies are high-resolution maps of SOC stocks (kg m-2), covering an area of 122 km2, with accompanying maps of estimated error. For the direct modelling approach, the estimated error map was based on the internal error estimations from the model rules. For the indirect approach, the estimated error map was produced by spatially combining the error estimates of component models via standard error propagation equations. We compared these two strategies for mapping SOC stocks on the basis of the qualities of the resulting maps as well as the magnitude and distribution of the estimated error. The direct approach produced a map with less spatial variation than the map produced by the indirect approach. The increased spatial variation represented by the indirect approach improved R2 values for the topsoil and subsoil stocks. Although the indirect approach had a lower mean estimated error for the topsoil stock, the mean estimated error for the total SOC stock (topsoil + subsoil) was lower for the direct approach. For these reasons, we recommend the direct approach to modelling SOC stocks be considered a more conservative estimate of the SOC stocks' spatial distribution.
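The standard error propagation equations used for the indirect approach can be sketched for a single map cell, assuming independent component errors: relative errors combine in quadrature for the product (concentration × bulk density × depth), and absolute errors combine in quadrature for the topsoil + subsoil sum. All numbers below are hypothetical:

```python
import math

# Component predictions for one map cell (hypothetical values / standard errors).
soc_conc, se_conc = 15.0, 2.0    # SOC concentration, g per kg
bulk_den, se_bd = 1.4, 0.1       # bulk density, g per cm^3
depth, se_depth = 0.3, 0.02      # horizon thickness, m

# Indirect stock estimate: product of the component models (unit factor omitted).
stock = soc_conc * bulk_den * depth

# Product rule: relative standard errors combine in quadrature.
rel_se = math.sqrt((se_conc / soc_conc) ** 2
                   + (se_bd / bulk_den) ** 2
                   + (se_depth / depth) ** 2)
se_stock = stock * rel_se

# Sum rule for total SOC (topsoil + subsoil): absolute errors in quadrature.
sub_stock, se_sub = 4.2, 0.9
total = stock + sub_stock
se_total = math.sqrt(se_stock ** 2 + se_sub ** 2)
```

Because each component model contributes its own error term, the indirect estimate accumulates uncertainty that the direct model expresses only once, which is consistent with the error comparison reported above.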
NASA Astrophysics Data System (ADS)
Webb, Mathew A.; Hall, Andrew; Kidd, Darren; Minasny, Budiman
2016-05-01
Assessment of local spatial climatic variability is important in the planning of planting locations for horticultural crops. This study investigated three regression-based calibration methods (i.e. traditional versus two optimized methods) to relate short-term 12-month data series from 170 temperature loggers and 4 weather station sites with data series from nearby long-term Australian Bureau of Meteorology climate stations. The techniques trialled to interpolate climatic temperature variables, such as frost risk, growing degree days (GDDs) and chill hours, were regression kriging (RK), regression trees (RTs) and random forests (RFs). All three calibration methods produced accurate results, with the RK-based calibration method delivering the most accurate validation measures: coefficients of determination (R²) of 0.92, 0.97 and 0.95 and root-mean-square errors of 1.30, 0.80 and 1.31 °C, for daily minimum, daily maximum and hourly temperatures, respectively. Compared with the traditional method of calibration using direct linear regression between short-term and long-term stations, the RK-based calibration method improved R² and reduced root-mean-square error (RMSE) by at least 5% and 0.47 °C for daily minimum temperature, 1% and 0.23 °C for daily maximum temperature and 3% and 0.33 °C for hourly temperature. Spatial modelling indicated insignificant differences between the interpolation methods, with the RK technique tending to be the slightly better method due to the high degree of spatial autocorrelation between logger sites.
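Regression kriging as trialled above combines a regression trend on covariates with kriging of the trend residuals. A minimal sketch of that idea, assuming a fixed exponential covariance rather than a variogram fitted to the residuals; all parameter values and data shapes here are hypothetical:

```python
import numpy as np

def exp_cov(h, sill=1.0, rng=500.0):
    # exponential covariance model (assumed, not fitted)
    return sill * np.exp(-h / rng)

def regression_kriging(X, coords, y, X0, coords0, sill=1.0, rng=500.0, nugget=1e-6):
    """Minimal regression-kriging sketch: linear trend on covariates,
    simple kriging of the residuals at the prediction locations."""
    A = np.column_stack([np.ones(len(y)), X])      # design with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # trend coefficients
    resid = y - A @ beta
    D = np.linalg.norm(coords[:, None] - coords[None, :], axis=2)
    C = exp_cov(D, sill, rng) + nugget * np.eye(len(y))
    preds = []
    for x0, c0 in zip(np.atleast_2d(X0), np.atleast_2d(coords0)):
        d0 = np.linalg.norm(coords - c0, axis=1)
        w = np.linalg.solve(C, exp_cov(d0, sill, rng))  # kriging weights
        preds.append(np.r_[1.0, x0] @ beta + w @ resid)
    return np.array(preds)
```

A real application would estimate the variogram from the residuals before kriging; the fixed covariance above only illustrates the two-part structure.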
Hayes, Mark A.; Ozenberger, Katharine; Cryan, Paul M.; Wunder, Michael B.
2015-01-01
Bat specimens held in natural history museum collections can provide insights into the distribution of species. However, there are several important sources of spatial error associated with natural history specimens that may influence the analysis and mapping of bat species distributions. We analyzed the importance of geographic referencing and error correction in species distribution modeling (SDM) using occurrence records of hoary bats (Lasiurus cinereus). This species is known to migrate long distances and is a species of increasing concern due to fatalities documented at wind energy facilities in North America. We used 3,215 museum occurrence records collected from 1950–2000 for hoary bats in North America. We compared SDM performance using five approaches: generalized linear models, multivariate adaptive regression splines, boosted regression trees, random forest, and maximum entropy models. We evaluated results using three SDM performance metrics (AUC, sensitivity, and specificity) and two data sets: one comprised of the original occurrence data, and a second data set consisting of these same records after the locations were adjusted to correct for identifiable spatial errors. The increase in precision reduced the mean estimated spatial error associated with hoary bat records from 5.11 km to 1.58 km, and this reduction in error resulted in a slight increase in all three SDM performance metrics. These results provide insights into the importance of geographic referencing and the value of correcting spatial errors in modeling the distribution of a wide-ranging bat species. We conclude that the considerable time and effort invested in carefully increasing the precision of the occurrence locations in this data set was not worth the marginal gains in improved SDM performance, and it seems likely that gains would be similar for other bat species that range across large areas of the continent, migrate, and are habitat generalists.
Conduct urban agglomeration with the baton of transportation.
DOT National Transportation Integrated Search
2013-12-01
A key indicator of traffic activity patterns is commuting distance. Shorter commuting distances yield less traffic, fewer emissions, and lower energy consumption. This study develops a spatial error seemingly unrelated regression model to investiga...
Determinants of single family residential water use across scales in four western US cities.
Chang, Heejun; Bonnette, Matthew Ryan; Stoker, Philip; Crow-Miller, Britt; Wentz, Elizabeth
2017-10-15
A growing body of literature examines urban water sustainability with increasing evidence that locally-based physical and social spatial interactions contribute to water use. These studies however are based on single-city analysis and often fail to consider whether these interactions occur more generally. We examine a multi-city comparison using a common set of spatially-explicit water, socioeconomic, and biophysical data. We investigate the relative importance of variables for explaining the variations of single family residential (SFR) water use at Census Block Group (CBG) and Census Tract (CT) scales in four representative western US cities (Austin, Phoenix, Portland, and Salt Lake City), which cover a wide range of climate and development density. We used both ordinary least squares regression and spatial error regression models to identify the influence of spatial dependence on water use patterns. Our results show that older downtown areas have lower water use than newer suburban areas in all four cities. Tax assessed value and building age are the main determinants of SFR water use across the four cities regardless of the scale. Impervious surface area becomes an important variable for summer water use in all cities, and it is important in all seasons for arid environments such as Phoenix. CT level analysis shows better model predictability than CBG analysis. In all cities, seasons, and spatial scales, spatial error regression models better explain the variations of SFR water use. Such a spatially-varying relationship of urban water consumption provides additional evidence for the need to integrate urban land use planning and municipal water planning. Copyright © 2017 Elsevier B.V. All rights reserved.
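A standard diagnostic behind preferring a spatial error model over OLS, as in the study above, is Moran's I computed on the OLS residuals. A minimal sketch, assuming a row-standardized spatial weight matrix; the toy adjacency below is hypothetical:

```python
import numpy as np

def morans_i(values, W):
    """Moran's I for a variable (e.g. OLS residuals) given a
    row-standardized spatial weight matrix W."""
    z = values - values.mean()
    n = len(z)
    return (n / W.sum()) * (z @ W @ z) / (z @ z)

def row_standardize(W):
    # each row of weights sums to one
    return W / W.sum(axis=1, keepdims=True)

# toy 2x2 grid with rook adjacency (hypothetical neighborhood structure)
W = row_standardize(np.array([[0, 1, 1, 0],
                              [1, 0, 0, 1],
                              [1, 0, 0, 1],
                              [0, 1, 1, 0]], float))
```

Positive I on the residuals indicates spatial clustering that OLS ignores, motivating the spatial error specification.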
Approximating prediction uncertainty for random forest regression models
John W. Coulston; Christine E. Blinn; Valerie A. Thomas; Randolph H. Wynne
2016-01-01
Machine learning approaches such as random forest are increasingly used for the spatial modeling and mapping of continuous variables. Random forest is a non-parametric ensemble approach, and unlike traditional regression approaches there is no direct quantification of prediction error. Understanding prediction uncertainty is important when using model-based continuous maps as...
Kierepka, E M; Latch, E K
2016-01-01
Landscape genetics is a powerful tool for conservation because it identifies landscape features that are important for maintaining genetic connectivity between populations within heterogeneous landscapes. However, using landscape genetics in poorly understood species presents a number of challenges, namely, limited life history information for the focal population and spatially biased sampling. Both obstacles can reduce power in statistics, particularly in individual-based studies. In this study, we genotyped 233 American badgers in Wisconsin at 12 microsatellite loci to identify alternative statistical approaches that can be applied to poorly understood species in an individual-based framework. Badgers are protected in Wisconsin owing to an overall lack of life history information, so our study utilized partial redundancy analysis (RDA) and spatially lagged regressions to quantify how three landscape factors (Wisconsin River, Ecoregions and land cover) impacted gene flow. We also performed simulations to quantify errors created by spatially biased sampling. Statistical analyses first found that geographic distance was an important influence on gene flow, mainly driven by fine-scale positive spatial autocorrelations. After controlling for geographic distance, both RDA and regressions found that Wisconsin River and Agriculture were correlated with genetic differentiation. However, only Agriculture had an acceptable type I error rate (3–5%) to be considered biologically relevant. Collectively, this study highlights the benefits of combining robust statistics and error assessment via simulations and provides a method for hypothesis testing in individual-based landscape genetics. PMID:26243136
Modelling space of spread Dengue Hemorrhagic Fever (DHF) in Central Java use spatial durbin model
NASA Astrophysics Data System (ADS)
Ispriyanti, Dwi; Prahutama, Alan; Taryono, Arkadina PN
2018-05-01
Dengue Hemorrhagic Fever is one of the major public health problems in Indonesia. From year to year, DHF causes Extraordinary Events in most parts of Indonesia, especially Central Java. Central Java consists of 35 districts or cities, each located close to the others. Spatial regression is an analysis that estimates the influence of independent variables on a dependent variable while accounting for effects between regions. Spatial regression modeling includes the spatial autoregressive model (SAR), the spatial error model (SEM) and the spatial autoregressive moving average (SARMA) model. The spatial Durbin model (SDM) is a development of SAR in which both the dependent and independent variables have spatial influence. In this research, the dependent variable is the number of DHF sufferers. The independent variables observed are population density, number of hospitals, population, number of health centers, and mean years of schooling. From the multiple regression model test, the variables that significantly affect the spread of DHF are population and mean years of schooling. Using queen contiguity and rook contiguity, the best model produced is the SDM with queen contiguity, which has the smallest AIC value of 494.12. The factors that generally affect the spread of DHF in Central Java Province are population and mean years of schooling.
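For reference, the model families named above can be written in their standard forms (a textbook formulation, not reproduced from the source; W denotes the spatial weight matrix built from queen or rook contiguity):

```latex
% Spatial autoregressive model (SAR): spatially lagged dependent variable
y = \rho W y + X\beta + \varepsilon
% Spatial error model (SEM): autoregressive disturbance
y = X\beta + u, \qquad u = \lambda W u + \varepsilon
% Spatial Durbin model (SDM): SAR plus spatially lagged covariates
y = \rho W y + X\beta + W X\theta + \varepsilon,
\qquad \varepsilon \sim N(0, \sigma^{2} I)
```

The SDM nests SAR (θ = 0), which is why model selection here reduces to comparing AIC values across contiguity definitions.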
Bias and uncertainty in regression-calibrated models of groundwater flow in heterogeneous media
Cooley, R.L.; Christensen, S.
2006-01-01
Groundwater models need to account for detailed but generally unknown spatial variability (heterogeneity) of the hydrogeologic model inputs. To address this problem we replace the large, m-dimensional stochastic vector β that reflects both small and large scales of heterogeneity in the inputs by a lumped or smoothed m-dimensional approximation Kβ*, where K is an interpolation matrix and β* is a stochastic vector of parameters. Vector β* has small enough dimension to allow its estimation with the available data. The consequence of the replacement is that the model function f(Kβ*) written in terms of the approximate inputs is in error with respect to the same model function written in terms of β, f(β), which is assumed to be nearly exact. The difference f(β) - f(Kβ*), termed model error, is spatially correlated, generates prediction biases, and causes standard confidence and prediction intervals to be too small. Model error is accounted for in the weighted nonlinear regression methodology developed to estimate β* and assess model uncertainties by incorporating the second-moment matrix of the model errors into the weight matrix. Techniques developed by statisticians to analyze classical nonlinear regression methods are extended to analyze the revised method. The analysis develops analytical expressions for bias terms reflecting the interaction of model nonlinearity and model error, for correction factors needed to adjust the sizes of confidence and prediction intervals for this interaction, and for correction factors needed to adjust the sizes of confidence and prediction intervals for possible use of a diagonal weight matrix in place of the correct one. If terms expressing the degree of intrinsic nonlinearity for f(β) and f(Kβ*) are small, then most of the biases are small and the correction factors are reduced in magnitude.
Biases, correction factors, and confidence and prediction intervals were obtained for a test problem for which model error is large to test robustness of the methodology. Numerical results conform with the theoretical analysis. © 2005 Elsevier Ltd. All rights reserved.
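The construction described in this abstract can be summarized compactly (writing β for the full input vector, K for the interpolation matrix, β* for the lumped parameters, and C_ε for the observation-error covariance; the weight-matrix expression is a sketch of the general idea, not the paper's exact formula):

```latex
% model error: exact inputs versus lumped approximation
e = f(\beta) - f(K\beta^{*})
% the weighted regression's weight matrix incorporates the
% second-moment matrix of the model errors
\omega^{-1} = C_{\epsilon} + \operatorname{E}\!\left[ e\, e^{\top} \right]
```

Because e is spatially correlated, ignoring the second term makes ω too optimistic, which is exactly why the standard intervals come out too small.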
Gaudio, Jennifer L; Snowdon, Charles T
2008-11-01
Animals living in stable home ranges have many potential cues to locate food. Spatial and color cues are important for wild Callitrichids (marmosets and tamarins). Field studies have assigned the highest priority to distal spatial cues for determining the location of food resources with color cues serving as a secondary cue to assess relative ripeness, once a food source is located. We tested two hypotheses with captive cotton-top tamarins: (a) Tamarins will demonstrate higher rates of initial learning when rewarded for attending to spatial cues versus color cues. (b) Tamarins will show higher rates of correct responses when transferred from color cues to spatial cues than from spatial cues to color cues. The results supported both hypotheses. Tamarins rewarded based on spatial location made significantly more correct choices and fewer errors than tamarins rewarded based on color cues during initial learning. Furthermore, tamarins trained on color cues showed significantly increased correct responses and decreased errors when cues were reversed to reward spatial cues. Subsequent reversal to color cues induced a regression in performance. For tamarins spatial cues appear more salient than color cues in a foraging task. (PsycINFO Database Record (c) 2008 APA, all rights reserved).
Li, Qi-Quan; Wang, Chang-Quan; Zhang, Wen-Jiang; Yu, Yong; Li, Bing; Yang, Juan; Bai, Gen-Chuan; Cai, Yan
2013-02-01
In this study, a radial basis function neural network model combined with ordinary kriging (RBFNN_OK) was adopted to predict the spatial distribution of soil nutrients (organic matter and total N) in a typical hilly region of Sichuan Basin, Southwest China, and the performance of this method was compared with that of ordinary kriging (OK) and regression kriging (RK). All the three methods produced similar soil nutrient maps. However, as compared with those obtained by the multiple linear regression model, the correlation coefficients between the measured values and the predicted values of soil organic matter and total N obtained by the neural network model increased by 12.3% and 16.5%, respectively, suggesting that the neural network model could more accurately capture the complicated relationships between soil nutrients and quantitative environmental factors. The error analyses of the prediction values of 469 validation points indicated that the mean absolute error (MAE), mean relative error (MRE), and root mean squared error (RMSE) of RBFNN_OK were 6.9%, 7.4%, and 5.1% (for soil organic matter), and 4.9%, 6.1%, and 4.6% (for soil total N) smaller than those of OK (P<0.01), and 2.4%, 2.6%, and 1.8% (for soil organic matter), and 2.1%, 2.8%, and 2.2% (for soil total N) smaller than those of RK, respectively (P<0.05).
Quantile regression models of animal habitat relationships
Cade, Brian S.
2003-01-01
Typically, all factors that limit an organism are not measured and included in statistical models used to investigate relationships with their environment. If important unmeasured variables interact multiplicatively with the measured variables, the statistical models often will have heterogeneous response distributions with unequal variances. Quantile regression is an approach for estimating the conditional quantiles of a response variable distribution in the linear model, providing a more complete view of possible causal relationships between variables in ecological processes. Chapter 1 introduces quantile regression and discusses the ordering characteristics, interval nature, sampling variation, weighting, and interpretation of estimates for homogeneous and heterogeneous regression models. Chapter 2 evaluates performance of quantile rankscore tests used for hypothesis testing and constructing confidence intervals for linear quantile regression estimates (0 ≤ τ ≤ 1). A permutation F test maintained better Type I errors than the Chi-square T test for models with smaller n, greater number of parameters p, and more extreme quantiles τ. Both versions of the test required weighting to maintain correct Type I errors when there was heterogeneity under the alternative model. An example application related trout densities to stream channel width:depth. Chapter 3 evaluates a drop in dispersion, F-ratio like permutation test for hypothesis testing and constructing confidence intervals for linear quantile regression estimates (0 ≤ τ ≤ 1). Chapter 4 simulates from a large (N = 10,000) finite population representing grid areas on a landscape to demonstrate various forms of hidden bias that might occur when the effect of a measured habitat variable on some animal was confounded with the effect of another unmeasured variable (spatially and not spatially structured). 
Depending on whether interactions of the measured habitat and unmeasured variable were negative (interference interactions) or positive (facilitation interactions), either upper (τ > 0.5) or lower (τ < 0.5) quantile regression parameters were less biased than mean rate parameters. Sampling (n = 20 - 300) simulations demonstrated that confidence intervals constructed by inverting rankscore tests provided valid coverage of these biased parameters. Quantile regression was used to estimate effects of physical habitat resources on a bivalve mussel (Macomona liliana) in a New Zealand harbor by modeling the spatial trend surface as a cubic polynomial of location coordinates.
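The check (pinball) loss underlying the quantile regression estimates discussed above can be illustrated directly: minimizing it over a constant recovers the sample quantile, the intercept-only quantile regression. A sketch, not code from the dissertation:

```python
import numpy as np

def pinball_loss(residuals, tau):
    """Quantile-regression check (pinball) loss: weight tau on positive
    residuals (under-prediction), 1 - tau on negative residuals."""
    r = np.asarray(residuals, float)
    return np.where(r >= 0, tau * r, (tau - 1) * r).sum()

def fit_constant_quantile(y, tau, grid=None):
    """Minimizing the check loss over a constant candidate set recovers
    the tau-th sample quantile."""
    y = np.asarray(y, float)
    grid = y if grid is None else np.asarray(grid, float)
    losses = [pinball_loss(y - c, tau) for c in grid]
    return grid[int(np.argmin(losses))]
```

Replacing the constant with a linear predictor x'β(τ) and minimizing the same loss gives the linear quantile regression estimates (0 ≤ τ ≤ 1) evaluated in the chapters above.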
Wong, Man Sing; Ho, Hung Chak; Yang, Lin; Shi, Wenzhong; Yang, Jinxin; Chan, Ta-Chien
2017-07-24
Dust events have long been recognized to be associated with a higher mortality risk. However, no study has investigated how prolonged dust events affect the spatial variability of mortality across districts in a downwind city. In this study, we applied a spatial regression approach to estimate the district-level mortality during two extreme dust events in Hong Kong. We compared spatial and non-spatial models to evaluate the ability of each regression to estimate mortality. We also compared prolonged dust events with non-dust events to determine the influences of community factors on mortality across the city. The density of a built environment (estimated by the sky view factor) had positive association with excess mortality in each district, while socioeconomic deprivation contributed by lower income and lower education induced higher mortality impact in each territory planning unit during a prolonged dust event. Based on the model comparison, spatial error modelling with the 1st order of queen contiguity consistently outperformed other models. The high-risk areas with higher increase in mortality were located in an urban high-density environment with higher socioeconomic deprivation. Our model design shows the ability to predict spatial variability of mortality risk during an extreme weather event that is not able to be estimated based on traditional time-series analysis or ecological studies. Our spatial protocol can be used for public health surveillance, sustainable planning and disaster preparation when relevant data are available.
Multiscale measurement error models for aggregated small area health data.
Aregay, Mehreteab; Lawson, Andrew B; Faes, Christel; Kirby, Russell S; Carroll, Rachel; Watjou, Kevin
2016-08-01
Spatial data are often aggregated from a finer (smaller) to a coarser (larger) geographical level. The process of data aggregation induces a scaling effect which smoothes the variation in the data. To address the scaling problem, multiscale models that link the convolution models at different scale levels via the shared random effect have been proposed. One of the main goals in aggregated health data is to investigate the relationship between predictors and an outcome at different geographical levels. In this paper, we extend multiscale models to examine whether a predictor effect at a finer level holds true at a coarser level. To adjust for predictor uncertainty due to aggregation, we applied measurement error models in the framework of the multiscale approach. To assess the benefit of using multiscale measurement error models, we compare the performance of multiscale models with and without measurement error in both real and simulated data. We found that ignoring the measurement error in multiscale models underestimates the regression coefficient, while it overestimates the variance of the spatially structured random effect. On the other hand, accounting for the measurement error in multiscale models provides a better model fit and unbiased parameter estimates. © The Author(s) 2016.
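The underestimation described above, where ignoring predictor measurement error shrinks the regression coefficient, is easy to demonstrate by simulation under the classical error model. This sketch is illustrative (hypothetical data, not the multiscale model itself):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
x_true = rng.normal(0.0, 1.0, n)
y = 2.0 * x_true + rng.normal(0.0, 0.5, n)   # true slope = 2
x_noisy = x_true + rng.normal(0.0, 1.0, n)   # classical measurement error

def ols_slope(x, y):
    # simple centered OLS slope
    x_c, y_c = x - x.mean(), y - y.mean()
    return (x_c @ y_c) / (x_c @ x_c)

slope_true = ols_slope(x_true, y)    # close to 2
slope_noisy = ols_slope(x_noisy, y)  # attenuated toward 2 * 1/(1+1) = 1
```

The attenuation factor is var(x)/(var(x) + var(error)), which is what a measurement error model corrects for.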
Spatial association of public sports facilities with body mass index in Korea.
Han, Eun Jin; Kang, Kiyeon; Sohn, So Young
2018-05-07
Governments and also local councils create and enforce their own regional public health care plans for the problem of overweight and obesity in the population. Public sports facilities can help these plans. In this paper, we investigated the contribution of public sports facilities to the reduction of the obesity of local residents. We used the data obtained from the Fifth Korea National Health and Nutrition Examination Surveys and measured the degree of obesity using body mass index (BMI). We conducted various spatial regression analyses, including the global Moran's I test and local indicators of spatial autocorrelation analysis, finding that spatial dependence exists in the error term of the spatial regression model for BMI. However, we also observed that the number of local public sports facilities is not significantly related to local BMI. This result can be caused by the low utilization ratio and an unbalanced spatial distribution of local public sports facilities. Based on our findings, we suggest that local councils need to improve the quality of public sports facilities and encourage the establishment of preferred types of public sports facilities.
Uncertainty Analysis in Large Area Aboveground Biomass Mapping
NASA Astrophysics Data System (ADS)
Baccini, A.; Carvalho, L.; Dubayah, R.; Goetz, S. J.; Friedl, M. A.
2011-12-01
Satellite and aircraft-based remote sensing observations are being more frequently used to generate spatially explicit estimates of aboveground carbon stock of forest ecosystems. Because deforestation and forest degradation account for circa 10% of anthropogenic carbon emissions to the atmosphere, policy mechanisms are increasingly recognized as a low-cost mitigation option to reduce carbon emissions. They are, however, contingent upon the capacity to accurately measure carbon stored in the forests. Here we examine the sources of uncertainty and error propagation in generating maps of aboveground biomass. We focus on characterizing uncertainties associated with maps at the pixel and spatially aggregated national scales. We pursue three strategies to describe the error and uncertainty properties of aboveground biomass maps: (1) model-based assessment using confidence intervals derived from linear regression methods; (2) data-mining algorithms such as regression trees and ensembles of these; and (3) empirical assessments using independently collected data sets. The latter effort explores error propagation using field data acquired within satellite-based lidar (GLAS) acquisitions versus alternative in situ methods that rely upon field measurements that have not been systematically collected for this purpose (e.g. from forest inventory data sets). A key goal of our effort is to provide multi-level characterizations yielding both pixel and biome-level estimates of uncertainty at different scales.
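Pixel-level uncertainty for a regression-based biomass map, in the spirit of strategy (1), can be sketched with a bootstrap of the fitted model. This is an illustrative stand-in with hypothetical data, not the authors' procedure:

```python
import numpy as np

def bootstrap_prediction_interval(X, y, X0, n_boot=500, alpha=0.05, seed=1):
    """Refit a linear model on bootstrap resamples of the calibration
    data and take percentile intervals of the predictions at X0."""
    rng = np.random.default_rng(seed)
    A = np.column_stack([np.ones(len(y)), X])    # calibration design
    A0 = np.column_stack([np.ones(len(X0)), X0]) # prediction design
    preds = np.empty((n_boot, len(X0)))
    for b in range(n_boot):
        idx = rng.integers(0, len(y), len(y))    # resample with replacement
        beta, *_ = np.linalg.lstsq(A[idx], y[idx], rcond=None)
        preds[b] = A0 @ beta
    lo, hi = np.percentile(preds, [100 * alpha / 2, 100 * (1 - alpha / 2)], axis=0)
    return lo, hi
```

Applying this at every pixel (and then aggregating) is one way to carry per-pixel intervals up to national-scale uncertainty statements.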
Dorazio, Robert M.
2012-01-01
Several models have been developed to predict the geographic distribution of a species by combining measurements of covariates of occurrence at locations where the species is known to be present with measurements of the same covariates at other locations where species occurrence status (presence or absence) is unknown. In the absence of species detection errors, spatial point-process models and binary-regression models for case-augmented surveys provide consistent estimators of a species’ geographic distribution without prior knowledge of species prevalence. In addition, these regression models can be modified to produce estimators of species abundance that are asymptotically equivalent to those of the spatial point-process models. However, if species presence locations are subject to detection errors, neither class of models provides a consistent estimator of covariate effects unless the covariates of species abundance are distinct and independently distributed from the covariates of species detection probability. These analytical results are illustrated using simulation studies of data sets that contain a wide range of presence-only sample sizes. Analyses of presence-only data of three avian species observed in a survey of landbirds in western Montana and northern Idaho are compared with site-occupancy analyses of detections and nondetections of these species.
Regression techniques for oceanographic parameter retrieval using space-borne microwave radiometry
NASA Technical Reports Server (NTRS)
Hofer, R.; Njoku, E. G.
1981-01-01
Variations of conventional multiple regression techniques are applied to the problem of remote sensing of oceanographic parameters from space. The techniques are specifically adapted to the scanning multichannel microwave radiometer (SMMR) launched on the Seasat and Nimbus 7 satellites to determine ocean surface temperature, wind speed, and atmospheric water content. The retrievals are studied primarily from a theoretical viewpoint, to illustrate the retrieval error structure, the relative importances of different radiometer channels, and the tradeoffs between spatial resolution and retrieval accuracy. Comparisons between regressions using simulated and actual SMMR data are discussed; they show similar behavior.
Measurement Error Correction for Predicted Spatiotemporal Air Pollution Exposures.
Keller, Joshua P; Chang, Howard H; Strickland, Matthew J; Szpiro, Adam A
2017-05-01
Air pollution cohort studies are frequently analyzed in two stages, first modeling exposure then using predicted exposures to estimate health effects in a second regression model. The difference between predicted and unobserved true exposures introduces a form of measurement error in the second stage health model. Recent methods for spatial data correct for measurement error with a bootstrap and by requiring that the study design ensure spatial compatibility, that is, monitor and subject locations are drawn from the same spatial distribution. These methods have not previously been applied to spatiotemporal exposure data. We analyzed the association between fine particulate matter (PM2.5) and birth weight in the US state of Georgia using records with estimated date of conception during 2002-2005 (n = 403,881). We predicted trimester-specific PM2.5 exposure using a complex spatiotemporal exposure model. To improve spatial compatibility, we restricted to mothers residing in counties with a PM2.5 monitor (n = 180,440). We accounted for additional measurement error via a nonparametric bootstrap. Third trimester PM2.5 exposure was associated with lower birth weight in the uncorrected (-2.4 g per 1 μg/m³ difference in exposure; 95% confidence interval [CI]: -3.9, -0.8) and bootstrap-corrected (-2.5 g, 95% CI: -4.2, -0.8) analyses. Results for the unrestricted analysis were attenuated (-0.66 g, 95% CI: -1.7, 0.35). This study presents a novel application of measurement error correction for spatiotemporal air pollution exposures. Our results demonstrate the importance of spatial compatibility between monitor and subject locations and provide evidence of the association between air pollution exposure and birth weight.
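The two-stage structure described above (predict exposure from monitors, then regress health on predicted exposure) and a nonparametric bootstrap over the whole pipeline can be sketched as follows. The setup is hypothetical and far simpler than the paper's spatiotemporal exposure model:

```python
import numpy as np

def two_stage_effect(mon_X, mon_z, subj_X, subj_y):
    """Stage 1: exposure model fit on monitors; Stage 2: health
    coefficient on predicted exposure at subject locations."""
    A = np.column_stack([np.ones(len(mon_z)), mon_X])
    g, *_ = np.linalg.lstsq(A, mon_z, rcond=None)   # stage-1 coefficients
    exposure_hat = np.column_stack([np.ones(len(subj_y)), subj_X]) @ g
    e = exposure_hat - exposure_hat.mean()
    return (e @ (subj_y - subj_y.mean())) / (e @ e)  # stage-2 slope

def bootstrap_se(mon_X, mon_z, subj_X, subj_y, n_boot=200, seed=0):
    """Resample monitors and subjects together so stage-1 uncertainty
    propagates into the stage-2 standard error."""
    rng = np.random.default_rng(seed)
    est = []
    for _ in range(n_boot):
        mi = rng.integers(0, len(mon_z), len(mon_z))
        si = rng.integers(0, len(subj_y), len(subj_y))
        est.append(two_stage_effect(mon_X[mi], mon_z[mi], subj_X[si], subj_y[si]))
    return float(np.std(est))
```

A naive second-stage standard error would treat the predicted exposures as known; bootstrapping both stages captures the extra uncertainty the paper corrects for.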
Evrendilek, Fatih
2007-12-12
This study aims at quantifying spatio-temporal dynamics of monthly mean daily incident photosynthetically active radiation (PAR) over a vast and complex terrain such as Turkey. The spatial interpolation method of universal kriging, and the combination of multiple linear regression (MLR) models and map algebra techniques, were implemented to generate surface maps of PAR with a grid resolution of 500 x 500 m as a function of five geographical and 14 climatic variables. Performance of the geostatistical and MLR models was compared using mean prediction error (MPE), root-mean-square prediction error (RMSPE), average standard prediction error (ASE), mean standardized prediction error (MSPE), root-mean-square standardized prediction error (RMSSPE), and adjusted coefficient of determination (R²adj). The best-fit MLR- and universal kriging-generated models of monthly mean daily PAR were validated against an independent 37-year observed dataset of 35 climate stations derived from 160 stations across Turkey by the Jackknifing method. The spatial variability patterns of monthly mean daily incident PAR were more accurately reflected in the surface maps created by the MLR-based models than in those created by the universal kriging method, in particular for spring (May) and autumn (November). The MLR-based spatial interpolation algorithms of PAR described in this study indicated the significance of the multifactor approach to understanding and mapping spatio-temporal dynamics of PAR for a complex terrain over meso-scales.
Post-Modeling Histogram Matching of Maps Produced Using Regression Trees
Andrew J. Lister; Tonya W. Lister
2006-01-01
Spatial predictive models often use statistical techniques that in some way rely on averaging of values. Estimates from linear modeling are known to be susceptible to truncation of variance when the independent (predictor) variables are measured with error. A straightforward post-processing technique (histogram matching) for attempting to mitigate this effect is...
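Histogram matching as a post-processing step can be sketched as a rank-preserving quantile mapping of the predicted values onto a reference distribution (e.g. the observed training values); the example numbers below are hypothetical:

```python
import numpy as np

def histogram_match(predicted, reference):
    """Map each prediction to the reference-distribution quantile at its
    rank, restoring variance truncated by regression averaging while
    preserving the spatial ordering of the predictions."""
    predicted = np.asarray(predicted, float)
    ranks = np.argsort(np.argsort(predicted))      # 0..n-1 rank of each value
    quantiles = (ranks + 0.5) / len(predicted)     # plotting-position quantiles
    return np.quantile(np.asarray(reference, float), quantiles)

# shrunken predictions regain the reference spread but keep their ordering
pred = np.array([4.5, 5.0, 5.5, 5.2, 4.8])
ref = np.array([1.0, 3.0, 5.0, 7.0, 9.0])
matched = histogram_match(pred, ref)
```

Because only ranks are used, the map's spatial pattern is untouched; only the marginal distribution is stretched to match the reference.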
NASA Astrophysics Data System (ADS)
Lisenko, S. A.; Kugeiko, M. M.
2013-01-01
The ability to determine noninvasively microphysical parameters (MPPs) of skin characteristic of malignant melanoma was demonstrated. The MPPs were the melanin content in dermis, saturation of tissue with blood vessels, and concentration and effective size of tissue scatterers. The proposed method was based on spatially resolved spectral measurements of skin diffuse reflectance and multiple regressions between linearly independent measurement components and skin MPPs. The regressions were established by modeling radiation transfer in skin with a wide variation of its MPPs. Errors in the determination of skin MPPs were estimated using fiber-optic measurements of its diffuse reflectance at wavelengths of commercially available semiconductor diode lasers (578, 625, 660, 760, and 806 nm) at source-detector separations of 0.23-1.38 mm.
Jacob, Benjamin G; Novak, Robert J; Toe, Laurent; Sanfo, Moussa S; Afriyie, Abena N; Ibrahim, Mohammed A; Griffith, Daniel A; Unnasch, Thomas R
2012-01-01
The standard methods for regression analyses of clustered riverine larval habitat data of Simulium damnosum s.l., a major black-fly vector of onchocerciasis, postulate models relating observational ecological-sampled parameter estimators to prolific habitats without accounting for residual intra-cluster error correlation effects. Generally, this correlation comes from two sources: (1) the design of the random effects and their assumed covariance from the multiple levels within the regression model; and (2) the correlation structure of the residuals. Unfortunately, inconspicuous errors in residual intra-cluster correlation estimates can overstate precision in forecasted S. damnosum s.l. riverine larval habitat explanatory attributes regardless of how they are treated (e.g., independent, autoregressive, Toeplitz, etc.). In this research, the geographical locations for multiple riverine-based S. damnosum s.l. larval ecosystem habitats sampled from 2 pre-established epidemiological sites in Togo were identified and recorded from July 2009 to June 2010. Initially, the data were aggregated in PROC GENMOD. An agglomerative hierarchical residual cluster-based analysis was then performed. The sampled clustered study site data was then analyzed for statistical correlations using Monthly Biting Rates (MBR). Euclidean distance measurements and terrain-related geomorphological statistics were then generated in ArcGIS. A digital overlay was then performed, also in ArcGIS, using the georeferenced ground coordinates of high and low density clusters stratified by Annual Biting Rates (ABR). This data was overlain onto multitemporal sub-meter pixel resolution satellite data (i.e., QuickBird 0.61 m wavebands). Orthogonal spatial filter eigenvectors were then generated in SAS/GIS.
Univariate and non-linear regression-based models (i.e., Logistic, Poisson and Negative Binomial) were also employed to determine probability distributions and to identify statistically significant parameter estimators from the sampled data. Thereafter, Durbin-Watson test statistics were used to test the null hypothesis that the regression residuals were not autocorrelated against the alternative that the residuals followed an autoregressive process, using PROC AUTOREG. Bayesian uncertainty matrices were also constructed employing normal priors for each of the sampled estimators in PROC MCMC. The residuals revealed both spatially structured and unstructured error effects in the high and low ABR-stratified clusters. The analyses also revealed that the estimators levels of turbidity and presence of rocks were statistically significant for the high-ABR-stratified clusters, while the estimators distance between habitats and floating vegetation were important for the low-ABR-stratified cluster. Varying and constant coefficient regression models, ABR-stratified GIS-generated clusters, sub-meter resolution satellite imagery, a robust residual intra-cluster diagnostic test, MBR-based histograms, eigendecomposition spatial filter algorithms and Bayesian matrices can enable accurate autoregressive estimation of latent uncertainty effects and other residual error probabilities (i.e., heteroskedasticity) for testing correlations between georeferenced S. damnosum s.l. riverine larval habitat estimators. The asymptotic distribution of the resulting residual-adjusted intra-cluster predictor error autocovariate coefficients can thereafter be established, while estimates of the asymptotic variance can lead to the construction of approximate confidence intervals for accurately targeting productive S. damnosum s.l. habitats based on spatiotemporal field-sampled count data.
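The Durbin-Watson statistic used above has a simple closed form: the ratio of the summed squared successive residual differences to the summed squared residuals. A minimal numpy sketch on synthetic residuals (this illustrates the statistic itself, not the authors' SAS workflow):

```python
import numpy as np

def durbin_watson(residuals):
    """Durbin-Watson statistic: ~2 for uncorrelated residuals,
    toward 0 for positive and toward 4 for negative autocorrelation."""
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))

rng = np.random.default_rng(42)
white = rng.normal(size=500)        # uncorrelated regression residuals
ar1 = np.empty(500)                 # AR(1) residuals with rho = 0.8
ar1[0] = white[0]
for t in range(1, 500):
    ar1[t] = 0.8 * ar1[t - 1] + rng.normal()

print(round(durbin_watson(white), 2))   # close to 2
print(round(durbin_watson(ar1), 2))     # well below 2
```

A value near 2 is consistent with the null hypothesis of no first-order autocorrelation; values approaching 0 indicate the positive autoregressive alternative that PROC AUTOREG tests against.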
Measurement error in epidemiologic studies of air pollution based on land-use regression models.
Basagaña, Xavier; Aguilera, Inmaculada; Rivera, Marcela; Agis, David; Foraster, Maria; Marrugat, Jaume; Elosua, Roberto; Künzli, Nino
2013-10-15
Land-use regression (LUR) models are increasingly used to estimate air pollution exposure in epidemiologic studies. These models combine air pollution measurements taken at a small set of locations with modeling based on geographical covariates for which data are available at all study participant locations. The process of LUR model development commonly includes a variable selection procedure. When LUR model predictions are used as explanatory variables in a model for a health outcome, measurement error can lead to bias in the regression coefficients and to inflation of their variance. In previous studies dealing with spatial predictions of air pollution, bias was shown to be small, while most of the effect of measurement error was on the variance. In this study, we show that in realistic cases where LUR models are applied to health data, bias in health-effect estimates can be substantial. This bias depends on the number of air pollution measurement sites, the number of available predictors for model selection, and the amount of explainable variability in the true exposure. These results should be taken into account when interpreting health effects from studies that used LUR models.
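The bias mechanism can be illustrated with a toy simulation: when classical-type error is added to the exposure used in the health model, the estimated coefficient is attenuated toward zero. This is only a sketch of one error component with made-up numbers, not a reproduction of the study's LUR-specific analysis (LUR error also has a Berkson-like part, which behaves differently):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
true_exposure = rng.normal(30, 10, n)   # hypothetical pollutant level
beta = 0.5                              # "true" health effect
outcome = beta * true_exposure + rng.normal(0, 5, n)

# exposure predictions contaminated by classical measurement error
slopes = {}
for err_sd in (0.0, 5.0, 10.0):
    predicted = true_exposure + rng.normal(0, err_sd, n)
    slopes[err_sd] = np.polyfit(predicted, outcome, 1)[0]
    print(f"error sd {err_sd:4.1f}: estimated beta = {slopes[err_sd]:.3f}")
```

The expected attenuation factor is var(X) / (var(X) + var(error)), so the estimated slope shrinks from 0.5 toward 0.25 as the error standard deviation grows to 10.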
Developing and testing a global-scale regression model to quantify mean annual streamflow
NASA Astrophysics Data System (ADS)
Barbarossa, Valerio; Huijbregts, Mark A. J.; Hendriks, A. Jan; Beusen, Arthur H. W.; Clavreul, Julie; King, Henry; Schipper, Aafke M.
2017-01-01
Quantifying mean annual flow of rivers (MAF) at ungauged sites is essential for assessments of global water supply, ecosystem integrity and water footprints. MAF can be quantified with spatially explicit process-based models, which might be overly time-consuming and data-intensive for this purpose, or with empirical regression models that predict MAF based on climate and catchment characteristics. Yet, regression models have mostly been developed at a regional scale, and the extent to which they can be extrapolated to other regions is not known. In this study, we developed a global-scale regression model for MAF based on a dataset unprecedented in size, using observations of discharge and catchment characteristics from 1885 catchments worldwide, with areas ranging from 2 to 10^6 km2. In addition, we compared the performance of the regression model with the predictive ability of the spatially explicit global hydrological model PCR-GLOBWB by comparing results from both models to independent measurements. We obtained a regression model explaining 89% of the variance in MAF based on catchment area and catchment-averaged mean annual precipitation and air temperature, slope and elevation. The regression model performed better than PCR-GLOBWB for the prediction of MAF, as root-mean-square error (RMSE) values were lower (0.29-0.38 compared to 0.49-0.57) and the modified index of agreement (d) was higher (0.80-0.83 compared to 0.72-0.75). Our regression model can be applied globally to estimate MAF at any point of the river network, thus providing a feasible alternative to spatially explicit process-based global hydrological models.
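The two evaluation metrics used above, RMSE and the modified index of agreement (d), are easy to compute. A minimal sketch with made-up log-transformed MAF values, assuming the absolute-value "modified" form of Willmott's index:

```python
import numpy as np

def rmse(obs, pred):
    return float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(obs)) ** 2)))

def modified_index_of_agreement(obs, pred):
    """Willmott's modified index d (absolute-value form), bounded in [0, 1]."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    num = np.sum(np.abs(pred - obs))
    den = np.sum(np.abs(pred - obs.mean()) + np.abs(obs - obs.mean()))
    return float(1.0 - num / den)

obs = np.array([2.3, 1.1, 4.0, 3.2, 5.5])    # hypothetical log10 MAF, observed
pred = np.array([2.0, 1.4, 3.8, 3.5, 5.1])   # hypothetical model predictions
print(rmse(obs, pred), modified_index_of_agreement(obs, pred))
```

Unlike RMSE, the modified index is dimensionless, so it allows the regression model and PCR-GLOBWB to be ranked on a common 0-1 scale.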
Restoring method for missing data of spatial structural stress monitoring based on correlation
NASA Astrophysics Data System (ADS)
Zhang, Zeyu; Luo, Yaozhi
2017-07-01
Long-term monitoring of spatial structures is of great importance for the full understanding of their performance and safety. Missing segments in the monitoring data record will affect the data analysis and safety assessment of the structure. Based on the long-term monitoring data of the steel structure of the Hangzhou Olympic Center Stadium, the correlation between the stress changes at the measuring points is studied, and an interpolation method for the missing stress data is proposed. To fit the correlation, stress data from correlated measuring points are selected over the 3 months of the season in which the data are missing. Daytime and nighttime data are fitted separately for interpolation. For simple linear regression, when a single point's correlation coefficient is 0.9 or higher, the average interpolation error is about 5%. For multiple linear regression, the interpolation accuracy does not increase significantly once the number of correlated points exceeds 6. The stress baseline value of the construction step should be calculated before interpolating missing data in the construction stage, and the average error is within 10%. The interpolation error for continuous missing data is slightly larger than that for discrete missing data. The data missing rate for this method should not exceed 30%. Finally, a measuring point's missing monitoring data are restored to verify the validity of the method.
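A minimal sketch of the single-correlated-point case: fit a simple linear regression between a target sensor and a correlated sensor (correlation coefficient >= 0.9) on the available readings, then restore a missing span. The series below are synthetic stand-ins for the stadium's stress data:

```python
import numpy as np

rng = np.random.default_rng(1)
days = 90                                   # three months of daily readings
target = 20 + 5 * np.sin(np.linspace(0, 6, days)) + rng.normal(0, 0.3, days)
neighbor = 0.8 * target + 3 + rng.normal(0, 0.3, days)   # correlated point

r = np.corrcoef(target, neighbor)[0, 1]
missing = slice(40, 50)                     # pretend these readings were lost
keep = np.ones(days, bool)
keep[missing] = False

# simple linear regression of the target point on the correlated point
b, a = np.polyfit(neighbor[keep], target[keep], 1)
restored = a + b * neighbor[missing]
avg_err = float(np.mean(np.abs(restored - target[missing]) / target[missing]))
print(f"r = {r:.2f}, average relative error = {100 * avg_err:.1f}%")
```

With a correlation this strong, the average relative error of the restored values stays in the few-percent range, in line with the ~5% figure reported for the single-point case.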
Predicting active-layer soil thickness using topographic variables at a small watershed scale
Li, Aidi; Tan, Xing; Wu, Wei; Liu, Hongbin; Zhu, Jie
2017-01-01
Knowledge about the spatial distribution of active-layer (AL) soil thickness is indispensable for ecological modeling, precision agriculture, and land resource management. However, it is difficult to obtain details on AL soil thickness using conventional soil survey methods. The objective of this research is to investigate the possibility and accuracy of mapping the spatial distribution of AL soil thickness with a random forest (RF) model using terrain variables at a small watershed scale. A total of 1113 soil samples collected from the slope fields were randomly divided into calibration (770 soil samples) and validation (343 soil samples) sets. Seven terrain variables including elevation, aspect, relative slope position, valley depth, flow path length, slope height, and topographic wetness index were derived from a digital elevation model (30 m). The RF model was compared with multiple linear regression (MLR), geographically weighted regression (GWR) and support vector machine (SVM) approaches based on the validation set. Model performance was evaluated by the precision criteria of mean error (ME), mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R2). Comparative results showed that RF outperformed the MLR, GWR and SVM models. The RF gave better values of ME (0.39 cm), MAE (7.09 cm), and RMSE (10.85 cm) and a higher R2 (62%). The sensitivity analysis demonstrated that the DEM had less uncertainty than the AL soil thickness. The outcome of the RF model indicated that elevation, flow path length and valley depth were the most important factors affecting AL soil thickness variability across the watershed. These results demonstrate that the RF model is a promising method for predicting the spatial distribution of AL soil thickness using terrain parameters. PMID:28877196
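A sketch of this kind of model comparison on synthetic data (the terrain-thickness relationship below is invented purely for illustration, and scikit-learn is assumed to be available):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
n = 1100                                   # roughly the 1113 samples in the study
elev = rng.uniform(200, 800, n)            # synthetic terrain variables
twi = rng.uniform(2, 14, n)
flowlen = rng.uniform(0, 500, n)
# made-up nonlinear "soil thickness" response plus noise
thickness = (30 + 25 * np.sin(elev / 60) + 2 * np.sqrt(flowlen)
             - 0.8 * twi + rng.normal(0, 4, n))
X = np.column_stack([elev, twi, flowlen])

Xtr, Xte, ytr, yte = train_test_split(X, thickness, test_size=0.3, random_state=0)
rmses = {}
for name, model in [("MLR", LinearRegression()),
                    ("RF", RandomForestRegressor(n_estimators=200, random_state=0))]:
    pred = model.fit(Xtr, ytr).predict(Xte)
    rmses[name] = float(np.sqrt(np.mean((pred - yte) ** 2)))
    print(name, "RMSE:", round(rmses[name], 2))
```

On a strongly nonlinear response like this one, the random forest's validation RMSE is clearly below the linear model's, mirroring the ranking reported in the study.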
Lee, Joseph G L; Sun, Dennis L; Schleicher, Nina M; Ribisl, Kurt M; Luke, Douglas A; Henriksen, Lisa
2017-05-01
Evidence of racial/ethnic inequalities in tobacco outlet density is limited by: (1) reliance on studies from single counties or states, (2) limited attention to spatial dependence, and (3) an unclear theory-based relationship between neighbourhood composition and tobacco outlet density. In 97 counties from the contiguous USA, we calculated the 2012 density of likely tobacco outlets (N=90 407), defined as tobacco outlets per 1000 population in census tracts (n=17 667). We used 2 spatial regression techniques, (1) a spatial errors approach in GeoDa software and (2) fitting a covariance function to the errors using a distance matrix of all tract centroids. We examined density as a function of race, ethnicity, income and 2 indicators identified from city planning literature to indicate neighbourhood stability (vacant housing, renter-occupied housing). The average density was 1.3 tobacco outlets per 1000 persons. Both spatial regression approaches yielded similar results. In unadjusted models, tobacco outlet density was positively associated with the proportion of black residents and negatively associated with the proportion of Asian residents, white residents and median household income. There was no association with the proportion of Hispanic residents. Indicators of neighbourhood stability explained the disproportionate density associated with black residential composition, but inequalities by income persisted in multivariable models. Data from a large sample of US counties and results from 2 techniques to address spatial dependence strengthen evidence of inequalities in tobacco outlet density by race and income. Further research is needed to understand the underlying mechanisms in order to strengthen interventions. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.
Complex Environmental Data Modelling Using Adaptive General Regression Neural Networks
NASA Astrophysics Data System (ADS)
Kanevski, Mikhail
2015-04-01
The research deals with an adaptation and application of Adaptive General Regression Neural Networks (GRNN) to high dimensional environmental data. GRNN [1,2,3] are efficient modelling tools for both spatial and temporal data and are based on nonparametric kernel methods closely related to the classical Nadaraya-Watson estimator. Adaptive GRNN, using anisotropic kernels, can also be applied to feature selection tasks when working with high dimensional data [1,3]. In the present research, Adaptive GRNN are used to study geospatial data predictability and relevant feature selection using both simulated and real data case studies. The original raw data were either three-dimensional monthly precipitation data or monthly wind speeds embedded into a 13-dimensional space constructed from geographical coordinates and geo-features calculated from a digital elevation model. GRNN were applied in two different ways: 1) adaptive GRNN with the resulting list of features ordered according to their relevancy; and 2) adaptive GRNN applied to evaluate all N possible models [in the case of wind fields, N=(2^13 -1)=8191] and rank them according to the cross-validation error. In both cases training was carried out using a leave-one-out procedure. An important result of the study is that the set of the most relevant features depends on the month (strong seasonal effect) and year. The predictabilities of precipitation and wind field patterns, estimated using the cross-validation and testing errors of raw and shuffled data, were studied in detail. The results of both approaches were qualitatively and quantitatively compared. In conclusion, Adaptive GRNN, with their ability to select features and efficiently model complex high dimensional data, can be widely used in automatic/on-line mapping and as an integrated part of environmental decision support systems. 1. Kanevski M., Pozdnoukhov A., Timonin V. Machine Learning for Spatial Environmental Data: Theory, Applications and Software (with a CD: data, software, guides). EPFL Press, 2009. 2. Kanevski M. Spatial Predictions of Soil Contamination Using General Regression Neural Networks. Systems Research and Information Systems, 8(4), 1999. 3. Robert S., Foresti L., Kanevski M. Spatial prediction of monthly wind speeds in complex terrain with adaptive general regression neural networks. International Journal of Climatology, 33, 1793-1804, 2013.
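A GRNN is essentially Nadaraya-Watson kernel regression, and the adaptive/anisotropic variant amounts to one bandwidth per input feature, tuned by cross-validation. A minimal numpy sketch on synthetic 2-D data (not the 13-dimensional wind setup of the abstract):

```python
import numpy as np

def grnn_predict(Xtr, ytr, Xq, sigmas):
    """Nadaraya-Watson / GRNN prediction with an anisotropic Gaussian kernel:
    one bandwidth per input feature."""
    d = (Xq[:, None, :] - Xtr[None, :, :]) / sigmas      # scaled differences
    w = np.exp(-0.5 * np.sum(d ** 2, axis=2))            # kernel weights
    return (w @ ytr) / np.sum(w, axis=1)

rng = np.random.default_rng(3)
X = rng.uniform(0, 10, (200, 2))          # e.g. two spatial coordinates
y = np.sin(X[:, 0]) + 0.1 * X[:, 1] + rng.normal(0, 0.1, 200)

def loo_rmse(sigmas):
    """Leave-one-out RMSE for one candidate bandwidth vector."""
    n = len(X)
    errs = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i
        errs[i] = grnn_predict(X[mask], y[mask], X[i:i + 1], sigmas)[0] - y[i]
    return float(np.sqrt(np.mean(errs ** 2)))

e_aniso = loo_rmse(np.array([0.5, 2.0]))   # narrow along the wiggly axis
e_smooth = loo_rmse(np.array([5.0, 5.0]))  # deliberately oversmoothed
print(e_aniso, e_smooth)
```

The anisotropic pair gives a much lower leave-one-out error than the oversmoothed isotropic choice; ranking candidate bandwidth (or feature) configurations by this cross-validation error is the selection criterion the abstract describes.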
Influence of landscape-scale factors in limiting brook trout populations in Pennsylvania streams
Kocovsky, P.M.; Carline, R.F.
2006-01-01
Landscapes influence the capacity of streams to produce trout through their effect on water chemistry and other factors at the reach scale. Trout abundance also fluctuates over time; thus, to thoroughly understand how spatial factors at landscape scales affect trout populations, one must assess the changes in populations over time to provide a context for interpreting the importance of spatial factors. We used data from the Pennsylvania Fish and Boat Commission's fisheries management database to investigate spatial factors that affect the capacity of streams to support brook trout Salvelinus fontinalis and to provide models useful for their management. We assessed the relative importance of spatial and temporal variation by calculating variance components and comparing relative standard errors for spatial and temporal variation. We used binary logistic regression to predict the presence of harvestable-length brook trout and multiple linear regression to assess the mechanistic links between landscapes and trout populations and to predict population density. The variance in trout density among streams was equal to or greater than the temporal variation for several streams, indicating that differences among sites affect population density. Logistic regression models correctly predicted the absence of harvestable-length brook trout in 60% of validation samples. The r2-value for the linear regression model predicting density was 0.3, indicating low predictive ability. Both logistic and linear regression models supported buffering capacity against acid episodes as an important mechanistic link between landscapes and trout populations.
Although our models fail to predict trout densities precisely, their success at elucidating the mechanistic links between landscapes and trout populations, in concert with the importance of spatial variation, increases our understanding of factors affecting brook trout abundance and will help managers and private groups to protect and enhance populations of wild brook trout. © Copyright by the American Fisheries Society 2006.
Characterizing error distributions for MISR and MODIS optical depth data
NASA Astrophysics Data System (ADS)
Paradise, S.; Braverman, A.; Kahn, R.; Wilson, B.
2008-12-01
The Multi-angle Imaging SpectroRadiometer (MISR) and Moderate Resolution Imaging Spectroradiometer (MODIS) on NASA's EOS satellites collect massive, long term data records on aerosol amounts and particle properties. MISR and MODIS have different but complementary sampling characteristics. In order to realize maximum scientific benefit from these data, the nature of their error distributions must be quantified and understood so that discrepancies between them can be rectified and their information combined in the most beneficial way. By 'error' we mean all sources of discrepancies between the true value of the quantity of interest and the measured value, including instrument measurement errors, artifacts of retrieval algorithms, and differential spatial and temporal sampling characteristics. Previously in [Paradise et al., Fall AGU 2007: A12A-05] we presented a unified, global analysis and comparison of MISR and MODIS measurement biases and variances over the lives of the missions. We used AErosol RObotic NETwork (AERONET) data as ground truth and evaluated MISR and MODIS optical depth distributions relative to AERONET using simple linear regression. However, AERONET data are themselves instrumental measurements subject to sources of uncertainty. In this talk, we discuss results from an improved analysis of MISR and MODIS error distributions that uses errors-in-variables regression, accounting for uncertainties in both the dependent and independent variables. We demonstrate on optical depth data, but the method is generally applicable to other aerosol properties as well.
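Errors-in-variables regression has a closed form in the simplest case: Deming regression with a known ratio delta of the two error variances. The sketch below contrasts it with OLS on synthetic optical-depth-like data; it illustrates the idea only and is not the specific analysis of the talk:

```python
import numpy as np

def deming_slope(x, y, delta=1.0):
    """Errors-in-variables (Deming) slope, with delta the ratio of the
    y-error variance to the x-error variance."""
    sxx = np.var(x, ddof=1)
    syy = np.var(y, ddof=1)
    sxy = np.cov(x, y, ddof=1)[0, 1]
    t = syy - delta * sxx
    return float((t + np.sqrt(t ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy))

rng = np.random.default_rng(5)
n = 1000
truth = rng.uniform(0.05, 0.8, n)                 # "true" aerosol optical depth
aeronet = truth + rng.normal(0, 0.05, n)          # ground truth has error too
satellite = truth + rng.normal(0, 0.05, n)        # unbiased instrument, slope 1

ols = float(np.polyfit(aeronet, satellite, 1)[0]) # attenuated toward zero
dem = deming_slope(aeronet, satellite, delta=1.0) # closer to the true slope 1
print(round(ols, 3), round(dem, 3))
```

Because OLS treats the AERONET values as error-free, its slope is biased low; accounting for error in both variables removes that attenuation, which is the motivation given in the abstract.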
Spatial regression analysis of traffic crashes in Seoul.
Rhee, Kyoung-Ah; Kim, Joon-Ki; Lee, Young-ihn; Ulfarsson, Gudmundur F
2016-06-01
Traffic crashes can be spatially correlated events and the analysis of the distribution of traffic crash frequency requires evaluation of parameters that reflect spatial properties and correlation. Typically this spatial aspect of crash data is not used in everyday practice by planning agencies and this contributes to a gap between research and practice. A database of traffic crashes in Seoul, Korea, in 2010 was developed at the traffic analysis zone (TAZ) level with a number of GIS developed spatial variables. Practical spatial models using available software were estimated. The spatial error model was determined to be better than the spatial lag model and an ordinary least squares baseline regression. A geographically weighted regression model provided useful insights about localization of effects. The results found that an increased length of roads with speed limit below 30 km/h and a higher ratio of residents below age of 15 were correlated with lower traffic crash frequency, while a higher ratio of residents who moved to the TAZ, more vehicle-kilometers traveled, and a greater number of access points with speed limit difference between side roads and mainline above 30 km/h all increased the number of traffic crashes. This suggests, for example, that better control or design for merging lower speed roads with higher speed roads is important. A key result is that the length of bus-only center lanes had the largest effect on increasing traffic crashes. This is important as bus-only center lanes with bus stop islands have been increasingly used to improve transit times. Hence the potential negative safety impacts of such systems need to be studied further and mitigated through improved design of pedestrian access to center bus stop islands. Copyright © 2016 Elsevier Ltd. All rights reserved.
Climatological Modeling of Monthly Air Temperature and Precipitation in Egypt through GIS Techniques
NASA Astrophysics Data System (ADS)
El Kenawy, A.
2009-09-01
This paper describes a method for modeling and mapping four climatic variables (maximum temperature, minimum temperature, mean temperature and total precipitation) in Egypt using a multiple regression approach implemented in a GIS environment. In this model, a set of variables including latitude, longitude, elevation within a distance of 5, 10 and 15 km, slope, aspect, distance to the Mediterranean Sea, distance to the Red Sea, distance to the Nile, ratio between land and water masses within a radius of 5, 10 and 15 km, the Normalized Difference Vegetation Index (NDVI), the Normalized Difference Water Index (NDWI), the Normalized Difference Temperature Index (NDTI) and reflectance are included as independent variables. These variables were integrated as raster layers in MiraMon software at a spatial resolution of 1 km. The climatic variables were treated as dependent variables and averaged from 39 quality-controlled and homogenized series distributed across the entire country over the period 1957-2006. For each climatic variable, digital and objective maps were finally obtained using the multiple regression coefficients at monthly, seasonal and annual timescales. The accuracy of these maps was assessed through cross-validation between predicted and observed values using a set of statistics including the coefficient of determination (R2), root mean square error (RMSE), mean absolute error (MAE), mean bias error (MBE) and Willmott's D statistic. These maps are valuable in terms of spatial resolution as well as the number of observatories involved in the analysis.
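The cross-validation statistics listed (R2, RMSE, MAE, MBE and Willmott's D) can all be computed in a few lines. A sketch with hypothetical observed and predicted monthly maximum temperatures:

```python
import numpy as np

def validation_stats(obs, pred):
    """Standard agreement statistics for predicted vs observed values."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    err = pred - obs
    r2 = float(np.corrcoef(obs, pred)[0, 1] ** 2)
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mae = float(np.mean(np.abs(err)))
    mbe = float(np.mean(err))               # mean bias error: sign matters
    # Willmott's index of agreement D, bounded in [0, 1]
    d = float(1 - np.sum(err ** 2) / np.sum(
        (np.abs(pred - obs.mean()) + np.abs(obs - obs.mean())) ** 2))
    return {"R2": r2, "RMSE": rmse, "MAE": mae, "MBE": mbe, "D": d}

obs = np.array([12.1, 14.3, 18.9, 22.4, 25.0, 27.2])   # hypothetical Tmax, degC
pred = np.array([12.8, 13.9, 19.5, 21.8, 25.9, 26.7])
stats = validation_stats(obs, pred)
print(stats)
```

MBE keeps the sign of the errors and so detects systematic over- or under-prediction that RMSE and MAE hide; D complements R2 by penalizing bias as well as scatter.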
Spatial regression test for ensuring temperature data quality in southern Spain
NASA Astrophysics Data System (ADS)
Estévez, J.; Gavilán, P.; García-Marín, A. P.
2018-01-01
Quality assurance of meteorological data is crucial for ensuring the reliability of applications and models that use such data as input variables, especially in the field of environmental sciences. Spatial validation of meteorological data is based on the application of quality control procedures using data from neighbouring stations to assess the validity of data from a candidate station (the station of interest). These kinds of tests, referred to in the literature as spatial consistency tests, take data from neighbouring stations in order to estimate the corresponding measurement at the candidate station. These estimations can be made by weighting values according to the distance between the stations or to the coefficient of correlation, among other methods. The test applied in this study relies on statistical decision-making and uses a weighting based on the standard error of the estimate. This paper summarizes the results of applying this test to maximum, minimum and mean temperature data from the Agroclimatic Information Network of Andalusia (southern Spain). The quality control procedure includes a decision based on a factor f, the fraction of potential outliers for each station across the region. Using GIS techniques, the geographic distribution of the detected errors has also been analysed. Finally, the performance of the test was assessed by evaluating its effectiveness in detecting known errors.
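A sketch of the core of such a spatial regression test, assuming the common formulation in which each neighbour's regression estimate is weighted by the inverse square of its standard error of estimate and a flag is raised when the observation falls outside f times the combined standard error (here f = 3; all series are synthetic):

```python
import numpy as np

rng = np.random.default_rng(11)
days, n_nb = 365, 4
regional = 15 + 10 * np.sin(np.linspace(0, 2 * np.pi, days))  # seasonal signal
candidate = regional + rng.normal(0, 1.0, days)               # station of interest
neighbors = np.stack([regional + rng.normal(0, 1.0, days) for _ in range(n_nb)])

# per-neighbour regression estimate of the candidate and its standard error
preds, ses = [], []
for nb in neighbors:
    b, a = np.polyfit(nb, candidate, 1)
    est_i = a + b * nb
    preds.append(est_i)
    ses.append(np.std(candidate - est_i, ddof=2))  # standard error of estimate
preds, ses = np.array(preds), np.array(ses)

w = 1.0 / ses ** 2                                 # weight by SE of estimate
est = (w[:, None] * preds).sum(axis=0) / w.sum()   # combined spatial estimate
s = np.sqrt(n_nb / w.sum())                        # combined standard error

corrupt = candidate.copy()
corrupt[100] += 12.0                               # inject one gross error
flags = np.abs(corrupt - est) > 3.0 * s            # spatial consistency flag
print(int(flags.sum()), bool(flags[100]))
```

The injected error is flagged while the rest of the year passes, illustrating how the standard-error weighting lets the most reliable neighbours dominate the estimate.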
Crighton, Eric J; Elliott, Susan J; Moineddin, Rahim; Kanaroglou, Pavlos; Upshur, Ross
2007-04-01
Previous research on the determinants of pneumonia and influenza has focused primarily on the role of individual-level biological and behavioural risk factors, resulting in partial explanations and largely curative approaches to reducing the disease burden. This study examines the geographic patterns of pneumonia and influenza hospitalizations and the role that broad ecologic-level factors may have in determining them. We conducted a county-level, retrospective, ecologic study of pneumonia and influenza hospitalizations in the province of Ontario, Canada, between 1992 and 2001 (N=241,803), controlling for spatial dependence in the data. Non-spatial and spatial regression models were estimated using a range of environmental, social, economic, behavioural, and health care predictors. Results revealed low education to be positively associated with hospitalization rates over all age groups and both genders. The Aboriginal population variable was also positively associated in most models, except for the 65+ age group. Behavioural factors (daily smoking and heavy drinking), environmental factors (passive smoking, poor housing, temperature), and health care factors (influenza vaccination) were all significantly associated in different age- and gender-specific models. The use of spatial error regression models allowed for unbiased estimation of regression parameters and their significance levels. These findings demonstrate the importance of broad age- and gender-specific population-level factors in determining pneumonia and influenza hospitalizations, and illustrate the need for place- and population-specific policies that take these factors into consideration.
Dam, Jan S; Yavari, Nazila; Sørensen, Søren; Andersson-Engels, Stefan
2005-07-10
We present a fast and accurate method for real-time determination of the absorption coefficient, the scattering coefficient, and the anisotropy factor of thin turbid samples using simple continuous-wave noncoherent light sources. The three optical properties are extracted from recordings of angularly resolved transmittance in addition to spatially resolved diffuse reflectance and transmittance. The applied multivariate calibration and prediction techniques are based on multiple polynomial regression in combination with a Newton-Raphson algorithm. The numerical test results based on Monte Carlo simulations showed mean prediction errors of approximately 0.5% for all three optical properties within ranges typical for biological media. Preliminary experimental results are also presented, yielding errors of approximately 5%. Thus the presented methods show a substantial potential for simultaneous absorption and scattering characterization of turbid media.
Chang, Brian A; Pearson, William S; Owusu-Edusei, Kwame
2017-04-01
We used a combination of hot spot analysis (HSA) and spatial regression to examine county-level hot spot correlates for the most commonly reported nonviral sexually transmitted infections (STIs) in the 48 contiguous states in the United States (US). We obtained reported county-level total case rates of chlamydia, gonorrhea, and primary and secondary (P&S) syphilis in all counties in the 48 contiguous states from national surveillance data and computed temporally smoothed rates using 2008-2012 data. Covariates were obtained from county-level multiyear (2008-2012) American Community Surveys from the US census. We conducted HSA to identify hot spot counties for all three STIs. We then applied spatial logistic regression with the spatial error model to determine the association between the identified hot spots and the covariates. HSA indicated that ≥84% of hot spots for each STI were in the South. Spatial regression results indicated that, a 10-unit increase in the percentage of Black non-Hispanics was associated with ≈42% (P < 0.01) [≈22% (P < 0.01), for Hispanics] increase in the odds of being a hot spot county for chlamydia and gonorrhea, and ≈27% (P < 0.01) [≈11% (P < 0.01) for Hispanics] for P&S syphilis. Compared with the other regions (West, Midwest, and Northeast), counties in the South were 6.5 (P < 0.01; chlamydia), 9.6 (P < 0.01; gonorrhea), and 4.7 (P < 0.01; P&S syphilis) times more likely to be hot spots. Our study provides important information on hot spot clusters of nonviral STIs in the entire United States, including associations between hot spot counties and sociodemographic factors. Published by Elsevier Inc.
NASA Astrophysics Data System (ADS)
Li, Wang; Niu, Zheng; Gao, Shuai; Wang, Cheng
2014-11-01
Light Detection and Ranging (LiDAR) and Synthetic Aperture Radar (SAR) are two competitive active remote sensing techniques for forest above ground biomass estimation, which is important for forest management and global climate change studies. This study aims to further explore their capabilities in temperate forest above ground biomass (AGB) estimation by emphasizing the spatial auto-correlation of variables obtained from these two remote sensing tools, a usually overlooked aspect in remote sensing applications to vegetation studies. Remote sensing variables including airborne LiDAR metrics, backscattering coefficients for different SAR polarizations and their ratio variables for Radarsat-2 imagery were calculated. First, simple linear regression (SLR) models were established between the field-estimated above ground biomass and the remote sensing variables. Pearson's correlation coefficient (R2) was used to find which LiDAR metric showed the most significant correlation with the regression residuals and could be selected as a co-variable in regression co-kriging (RCoKrig). Second, regression co-kriging was conducted by choosing the regression residuals as the dependent variable and the LiDAR metric (Hmean) with the highest R2 as the co-variable. Third, above ground biomass over the study area was estimated using the SLR model and the RCoKrig model, respectively. The results for these two models were validated using the same ground points. Results showed that both methods achieved satisfactory prediction accuracy, while regression co-kriging showed the lower estimation error. This demonstrates that the regression co-kriging model is feasible and effective for mapping the spatial pattern of AGB in temperate forest using Radarsat-2 data calibrated by airborne LiDAR metrics.
Estimating top-of-atmosphere thermal infrared radiance using MERRA-2 atmospheric data
NASA Astrophysics Data System (ADS)
Kleynhans, Tania; Montanaro, Matthew; Gerace, Aaron; Kanan, Christopher
2017-05-01
Thermal infrared satellite images have been widely used in environmental studies. However, satellites have limited temporal resolution, e.g., 16 days for Landsat or 1 to 2 days for Terra MODIS. This paper investigates the use of the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) reanalysis data product, produced by NASA's Global Modeling and Assimilation Office (GMAO), to predict global top-of-atmosphere (TOA) thermal infrared radiance. The high temporal resolution of the MERRA-2 data product presents opportunities for novel research and applications. Various methods were applied to estimate TOA radiance from MERRA-2 variables, namely (1) a parameterized physics-based method, (2) linear regression models, and (3) nonlinear support vector regression. Model prediction accuracy was evaluated using temporally and spatially coincident Moderate Resolution Imaging Spectroradiometer (MODIS) thermal infrared data as reference data. This research found that support vector regression with a radial basis function kernel produced the lowest error rates. Sources of error are discussed and defined. Further research is currently being conducted to train deep learning models to predict TOA thermal radiance.
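A sketch of the support-vector-regression comparison on invented stand-ins for MERRA-2 columns (scikit-learn assumed; the response function below is purely illustrative, not a radiative transfer model):

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(2)
n = 1500
t_sfc = rng.uniform(260, 310, n)     # hypothetical surface temperature, K
cwv = rng.uniform(5, 60, n)          # hypothetical column water vapor, mm
# made-up nonlinear "TOA radiance" response plus noise
radiance = (0.02 * (t_sfc - 260) ** 2 / 50
            + 8 * np.exp(-cwv / 20) + rng.normal(0, 0.2, n))

X = np.column_stack([t_sfc, cwv])
tr = rng.random(n) < 0.7             # random train/test split
models = {
    "linear": LinearRegression(),
    "svr-rbf": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0)),
}
rmses = {}
for name, m in models.items():
    pred = m.fit(X[tr], radiance[tr]).predict(X[~tr])
    rmses[name] = float(np.sqrt(np.mean((pred - radiance[~tr]) ** 2)))
    print(name, "RMSE:", round(rmses[name], 3))
```

With an RBF kernel on standardized inputs, SVR tracks a nonlinear response that a linear model cannot, consistent with the paper's finding that RBF SVR produced the lowest error rates.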
Van de Voorde, Tim; Vlaeminck, Jeroen; Canters, Frank
2008-01-01
Urban growth and its related environmental problems call for sustainable urban management policies to safeguard the quality of urban environments. Vegetation plays an important part in this as it provides ecological, social, health and economic benefits to a city's inhabitants. Remotely sensed data are of great value to monitor urban green and despite the clear advantages of contemporary high resolution images, the benefits of medium resolution data should not be discarded. The objective of this research was to estimate fractional vegetation cover from a Landsat ETM+ image with sub-pixel classification, and to compare accuracies obtained with multiple stepwise regression analysis, linear spectral unmixing and multi-layer perceptrons (MLP) at the level of meaningful urban spatial entities. Despite the small, but nevertheless statistically significant differences at pixel level between the alternative approaches, the spatial pattern of vegetation cover and estimation errors is clearly distinctive at neighbourhood level. At this spatially aggregated level, a simple regression model appears to attain sufficient accuracy. For mapping at a spatially more detailed level, the MLP seems to be the most appropriate choice. Brightness normalisation only appeared to affect the linear models, especially the linear spectral unmixing. PMID:27879914
Effects of urban form on the urban heat island effect based on spatial regression model.
Yin, Chaohui; Yuan, Man; Lu, Youpeng; Huang, Yaping; Liu, Yanfang
2018-09-01
The urban heat island (UHI) effect is becoming more of a concern with the accelerated process of urbanization. However, few studies have examined the effect of urban form on land surface temperature (LST), especially from an urban planning perspective. This paper used a spatial regression model to investigate the effects of both land use composition and urban form on LST in Wuhan City, China, based on the regulatory planning management unit. Landsat ETM+ image data were used to estimate LST. Land use composition was characterized by impervious surface area proportion, vegetated area proportion, and water proportion, while urban form indicators included sky view factor (SVF), building density, and floor area ratio (FAR). We first tested for spatial autocorrelation of urban LST, which confirmed that a traditional regression method would be invalid. A spatial error model (SEM) was chosen because its fit statistics were better than those of a spatial lag model (SLM). The results showed that urban form metrics should be the focus for mitigation efforts of UHI effects. In addition, analysis of the relationship between urban form and UHI effect based on the regulatory planning management unit was helpful for promoting corresponding UHI effect mitigation rules in practice. Finally, the spatial regression model was recommended as an appropriate method for dealing with problems related to the urban thermal environment. Results suggested that the impact of urbanization on the UHI effect can be mitigated not only by balancing various land use types, but also by optimizing urban form, which is even more effective. This research expands the scientific understanding of effects of urban form on UHI by explicitly analyzing indicators closely related to urban detailed planning at the level of the regulatory planning management unit. In addition, it may provide important insights and effective regulation measures for urban planners to mitigate future UHI effects. Copyright © 2018 Elsevier B.V. 
All rights reserved.
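The first step the record describes, testing for spatial autocorrelation before choosing a spatial model, is usually done with Moran's I. A minimal numpy sketch on a toy grid (the lattice layout and data are illustrative):

```python
import numpy as np

def morans_i(z, W):
    """Moran's I for values z at n locations with spatial weights W (zero diagonal)."""
    z = z - z.mean()
    n = len(z)
    return n * (z @ W @ z) / (W.sum() * (z @ z))

# Rook-contiguity weights on a 10x10 grid of analysis units.
side = 10
n = side * side
W = np.zeros((n, n))
for i in range(side):
    for j in range(side):
        k = i * side + j
        if i + 1 < side: W[k, k + side] = W[k + side, k] = 1
        if j + 1 < side: W[k, k + 1] = W[k + 1, k] = 1

rng = np.random.default_rng(1)
trend = np.add.outer(np.linspace(0, 1, side), np.linspace(0, 1, side)).ravel()
clustered = trend + 0.05 * rng.normal(size=n)   # smooth spatial gradient
random_field = rng.normal(size=n)               # no spatial structure

I_clustered = morans_i(clustered, W)
I_random = morans_i(random_field, W)
```

A strongly positive I on the residuals of an ordinary regression is what justifies moving to a spatial error or spatial lag specification, as the Wuhan study does.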
Spatial patterns of March and September streamflow trends in Pacific Northwest Streams, 1958-2008
Chang, Heejun; Jung, Il-Won; Steele, Madeline; Gannett, Marshall
2012-01-01
Summer streamflow is a vital water resource for municipal and domestic water supplies, irrigation, salmonid habitat, recreation, and water-related ecosystem services in the Pacific Northwest (PNW) in the United States. This study detects significant negative trends in September absolute streamflow in a majority of 68 stream-gauging stations located on unregulated streams in the PNW from 1958 to 2008. The proportion of March streamflow to annual streamflow increases in most stations over 1,000 m elevation with a baseflow index of less than 50, while absolute March streamflow does not increase in most stations. The declining trends of September absolute streamflow are strongly associated with seven-day low flow, January–March maximum temperature trends, and the size of the basin (19–7,260 km²), while the increasing trends of the fraction of March streamflow are associated with elevation, April 1 snow water equivalent, March precipitation, center timing of streamflow, and October–December minimum temperature trends. Compared with ordinary least squares (OLS) regression models, spatial error regression and geographically weighted regression (GWR) models effectively remove spatial autocorrelation in residuals. The GWR model results show spatial gradients of local R² values, with consistently higher local R² values in the northern Cascades. This finding illustrates that different hydrologic landscape factors, such as geology and seasonal distribution of precipitation, also influence streamflow trends in the PNW. In addition, our spatial analysis model results show that considering various geographic factors helps clarify the dynamics of streamflow trends over a large geographical area, supporting a spatial analysis approach over aspatial OLS regression models for predicting streamflow trends. Results indicate that transitional rain–snow surface water-dominated basins are likely to have reduced summer streamflow under warming scenarios. 
Consequently, a better understanding of the relationships among summer streamflow, precipitation, snowmelt, elevation, and geology can help water managers predict the response of regional summer streamflow to global warming.
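GWR, used in the streamflow-trend study above (and in the Zhang and Gove comparison), fits a separate weighted least-squares regression at each location, down-weighting distant observations. A minimal numpy sketch of that core step with a Gaussian distance kernel (the bandwidth and data are illustrative; real GWR software also calibrates the bandwidth, e.g. by cross-validation):

```python
import numpy as np

def gwr_fit(coords, X, y, bandwidth):
    """Local WLS fit at each observation location; returns one coefficient row per site."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])          # add intercept
    betas = np.empty((n, Xd.shape[1]))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)    # Gaussian kernel weights
        Xw = Xd * w[:, None]
        betas[i] = np.linalg.solve(Xd.T @ Xw, Xw.T @ y)
    return betas

rng = np.random.default_rng(2)
coords = rng.uniform(0, 10, size=(150, 2))         # station locations
x = rng.normal(size=150)                           # a single predictor
slope = 1.0 + 0.2 * coords[:, 0]                   # coefficient drifts west to east
y = slope * x + 0.1 * rng.normal(size=150)

betas = gwr_fit(coords, x[:, None], y, bandwidth=2.0)
```

Mapping the local slope column of `betas` is what produces the spatial gradients of coefficients and local R² values the abstract reports.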
Allegrini, Franco; Braga, Jez W B; Moreira, Alessandro C O; Olivieri, Alejandro C
2018-06-29
A new multivariate regression model, named Error Covariance Penalized Regression (ECPR), is presented. Following a penalized regression strategy, the proposed model incorporates information about the measurement error structure of the system, using the error covariance matrix (ECM) as a penalization term. Results are reported from both simulations and experimental data based on replicate mid- and near-infrared (MIR and NIR) spectral measurements. The results for ECPR are better under non-iid conditions when compared with traditional first-order multivariate methods such as ridge regression (RR), principal component regression (PCR) and partial least-squares regression (PLS). Copyright © 2018 Elsevier B.V. All rights reserved.
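The abstract does not state the ECPR estimator explicitly. One plausible ridge-style reading, shown purely as an illustration and not as the authors' formula, uses the error covariance matrix Sigma as the penalty metric, beta = (XᵀX + lam·Sigma)⁻¹ Xᵀy, which reduces to ordinary ridge regression when Sigma is the identity:

```python
import numpy as np

def ecpr_like(X, y, Sigma, lam):
    # Ridge-style closed form with an error-covariance penalty metric (illustrative).
    return np.linalg.solve(X.T @ X + lam * Sigma, X.T @ y)

rng = np.random.default_rng(3)
n, p = 100, 8
X = rng.normal(size=(n, p))                    # stand-in for spectral predictors
beta_true = np.zeros(p)
beta_true[:3] = [1.0, -2.0, 0.5]
y = X @ beta_true + rng.normal(scale=0.5, size=n)

Sigma = np.eye(p)                              # identity penalty = ordinary ridge
beta_hat = ecpr_like(X, y, Sigma, lam=1.0)
```

Under non-iid measurement errors, replacing the identity with an estimated error covariance matrix is what lets the penalty down-weight directions dominated by noise, which is the mechanism the abstract credits for ECPR's advantage over RR, PCR and PLS.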
NASA Astrophysics Data System (ADS)
Gao, Jing; Burt, James E.
2017-12-01
This study investigates the usefulness of a per-pixel bias-variance error decomposition (BVD) for understanding and improving spatially-explicit data-driven models of continuous variables in environmental remote sensing (ERS). BVD is a model evaluation method that originated in machine learning and has not previously been examined for ERS applications. Demonstrated with a showcase regression tree model mapping land imperviousness (0-100%) using Landsat images, our results showed that BVD can reveal sources of estimation errors, map how these sources vary across space, reveal the effects of various model characteristics on estimation accuracy, and enable in-depth comparison of different error metrics. Specifically, BVD bias maps can help analysts identify and delineate model spatial non-stationarity; BVD variance maps can indicate potential effects of ensemble methods (e.g. bagging) and inform efficient training sample allocation - training samples should capture the full complexity of the modeled process, and more samples should be allocated to regions with more complex underlying processes rather than regions covering larger areas. Through examining the relationships between model characteristics and their effects on estimation accuracy revealed by BVD for both absolute and squared errors (i.e. error is the absolute or the squared value of the difference between observation and estimate), we found that the two error metrics embody different diagnostic emphases, can lead to different conclusions about the same model, and may suggest different solutions for performance improvement. We emphasize BVD's strength in revealing the connection between model characteristics and estimation accuracy, as understanding this relationship empowers analysts to effectively steer performance through model adjustments.
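A per-location bias-variance decomposition can be approximated by refitting the model on bootstrap resamples and splitting each location's expected squared error into squared bias and variance. A minimal sketch with a deliberately underfit linear model standing in for the paper's regression trees (all data are synthetic; the true surface must be known here, which in practice is replaced by held-out reference data):

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(0, 1, 200)
f = np.sin(2 * np.pi * x)                      # true surface at each "pixel"
y = f + 0.2 * rng.normal(size=200)             # noisy observations

B = 300
preds = np.empty((B, len(x)))
for b in range(B):
    idx = rng.integers(0, len(x), len(x))      # bootstrap resample
    coef = np.polyfit(x[idx], y[idx], deg=1)   # underfit model: straight line
    preds[b] = np.polyval(coef, x)

mean_pred = preds.mean(axis=0)
bias2 = (mean_pred - f) ** 2                   # per-location squared bias
variance = preds.var(axis=0)                   # per-location variance
```

For this underfit model the bias map dominates, mirroring the paper's point that bias maps flag structural problems (e.g. non-stationarity) while variance maps flag instability that ensembling such as bagging can reduce.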
Parresol, B. R.; Scott, D. A.; Zarnoch, S. J.; ...
2017-12-15
Spatially explicit mapping of forest productivity is important to assess many forest management alternatives. We assessed the relationship between mapped variables and site index of forests ranging from southern pine plantations to natural hardwoods on a 74,000-ha landscape in South Carolina, USA. Mapped features used in the analysis were soil association, land use condition in 1951, depth to groundwater, slope and aspect. Basal area, species composition, age and height were the tree variables measured. Linear modelling identified that plot basal area, depth to groundwater, soil association and the interactions between depth to groundwater and forest group, and between land use in 1951 and forest group were related to site index (SI) (R² = 0.37), but this model had regression attenuation. We then used structural equation modeling to incorporate error-in-measurement corrections for basal area and groundwater to remove bias in the model. We validated this model using 89 independent observations and found the 95% confidence intervals for the slope and intercept of an observed vs. predicted site index error-corrected regression included zero and one, respectively, indicating a good fit. With error in measurement incorporated, only basal area, soil association, and the interaction between forest groups and land use were important predictors (R² = 0.57). Thus, we were able to develop an unbiased model of SI that could be applied to create a spatially explicit map based primarily on soils as modified by past (land use and forest type) and recent forest management (basal area).
Qiu, Lefeng; Wang, Kai; Long, Wenli; Wang, Ke; Hu, Wei; Amable, Gabriel S.
2016-01-01
Soil cadmium (Cd) contamination has attracted a great deal of attention because of its detrimental effects on animals and humans. This study aimed to develop and compare the performances of stepwise linear regression (SLR), classification and regression tree (CART) and random forest (RF) models in the prediction and mapping of the spatial distribution of soil Cd and to identify likely sources of Cd accumulation in Fuyang County, eastern China. Soil Cd data from 276 topsoil (0–20 cm) samples were collected and randomly divided into calibration (222 samples) and validation datasets (54 samples). Auxiliary data, including detailed land use information, soil organic matter, soil pH, and topographic data, were incorporated into the models to simulate the soil Cd concentrations and further identify the main factors influencing soil Cd variation. The predictive models for soil Cd concentration exhibited acceptable overall accuracies (72.22% for SLR, 70.37% for CART, and 75.93% for RF). The SLR model exhibited the largest predicted deviation, with a mean error (ME) of 0.074 mg/kg, a mean absolute error (MAE) of 0.160 mg/kg, and a root mean squared error (RMSE) of 0.274 mg/kg, and the RF model produced the results closest to the observed values, with an ME of 0.002 mg/kg, an MAE of 0.132 mg/kg, and an RMSE of 0.198 mg/kg. The RF model also exhibited the greatest R2 value (0.772). The CART model predictions closely followed, with ME, MAE, RMSE, and R2 values of 0.013 mg/kg, 0.154 mg/kg, 0.230 mg/kg and 0.644, respectively. The three prediction maps generally exhibited similar and realistic spatial patterns of soil Cd contamination. The heavily Cd-affected areas were primarily located in the alluvial valley plain of the Fuchun River and its tributaries because of the dramatic industrialization and urbanization processes that have occurred there. The most important variable for explaining high levels of soil Cd accumulation was the presence of metal smelting industries. 
The good performance of the RF model was attributable to its ability to handle the non-linear and hierarchical relationships between soil Cd and environmental variables. These results confirm that the RF approach is promising for the prediction and spatial distribution mapping of soil Cd at the regional scale. PMID:26964095
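The soil-Cd study ranks its models by mean error (ME), mean absolute error (MAE), and root mean squared error (RMSE). These are one-liners, sketched here so the reported numbers are unambiguous; note that ME keeps its sign, so its convention matters (prediction minus observation is assumed below; the paper does not state which it uses):

```python
import numpy as np

def mean_error(y_obs, y_pred):
    # Signed bias; positive means over-prediction under this convention.
    return float(np.mean(y_pred - y_obs))

def mean_absolute_error(y_obs, y_pred):
    return float(np.mean(np.abs(y_pred - y_obs)))

def root_mean_squared_error(y_obs, y_pred):
    return float(np.sqrt(np.mean((y_pred - y_obs) ** 2)))

# Illustrative Cd concentrations (mg/kg), not the study's data.
y_obs = np.array([0.20, 0.35, 0.10, 0.50])
y_pred = np.array([0.25, 0.30, 0.15, 0.45])

me = mean_error(y_obs, y_pred)
mae = mean_absolute_error(y_obs, y_pred)
rmse = root_mean_squared_error(y_obs, y_pred)
```

RMSE is always at least MAE and penalizes large misses more heavily, which is why the two can rank models differently when error distributions have heavy tails.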
How many stakes are required to measure the mass balance of a glacier?
Fountain, A.G.; Vecchia, A.
1999-01-01
Glacier mass balance is estimated for South Cascade Glacier and Maclure Glacier using a one-dimensional regression of mass balance with altitude as an alternative to the traditional approach of contouring mass balance values. One attractive feature of regression is that it can be applied to sparse data sets where contouring is not possible, and it can provide an objective error estimate for the result. Regression methods yielded mass balance values equivalent to contouring methods. The effect of the number of mass balance measurements on the final value for the glacier showed that sample sizes as small as five stakes provided reasonable estimates, although the error estimates were greater than for larger sample sizes. Different spatial patterns of measurement locations showed no appreciable influence on the final value as long as different surface altitudes were intermittently sampled over the altitude range of the glacier. Two different regression equations were examined, a quadratic and a piecewise linear spline, and comparison of results showed little sensitivity to the type of equation. These results point to the dominant effect of the gradient of mass balance with altitude of alpine glaciers compared to transverse variations. The number of mass balance measurements required to determine the glacier balance appears to be scale invariant for small glaciers, and five to ten stakes are sufficient.
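The balance-gradient approach above fits point balance b(z) against altitude and then integrates the fitted curve over the glacier's area-altitude distribution. A minimal sketch of the quadratic variant with a five-stake network (all stake balances, altitudes and hypsometry are illustrative, not South Cascade data; altitudes are centered before fitting for numerical conditioning):

```python
import numpy as np

# Five stakes: altitude (m) and measured point balance (m w.e.).
z_stakes = np.array([1700.0, 1800.0, 1900.0, 2000.0, 2100.0])
b_stakes = np.array([-3.1, -1.9, -0.8, 0.2, 1.0])

zc = (z_stakes - 1900.0) / 100.0               # centered, scaled altitude
coef = np.polyfit(zc, b_stakes, deg=2)         # quadratic b(z)

# Area-altitude distribution: area (km^2) per 100 m band, at band midpoints.
z_bands = np.array([1650.0, 1750.0, 1850.0, 1950.0, 2050.0, 2150.0])
area = np.array([0.3, 0.6, 0.9, 1.0, 0.7, 0.4])

# Glacier-wide balance = area-weighted mean of b(z) over the hypsometry.
b_bands = np.polyval(coef, (z_bands - 1900.0) / 100.0)
b_glacier = float(np.average(b_bands, weights=area))
```

Swapping the quadratic for a piecewise linear spline changes only the `polyfit` line, which is consistent with the paper's finding that results are insensitive to the choice of equation.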
Sergeev, A. P.; Tarasov, D. A.; Buevich, A. G.; Shichkin, A. V.; Tyagunov, A. G.; Medvedev, A. N.
2017-06-01
Modeling of the spatial distribution of pollutants in urbanized territories is difficult, especially if there are multiple emission sources. When monitoring such territories, it is often impossible to arrange the necessary detailed sampling. Because of this, the usual methods of analysis and forecasting based on geostatistics are often less effective. Approaches based on artificial neural networks (ANNs) demonstrate the best results under these circumstances. This study compares two models based on ANNs, a multilayer perceptron (MLP) and a generalized regression neural network (GRNN), with the baseline geostatistical method, kriging. Models of the spatial dust distribution in the snow cover around an existing copper quarry and in the area of emissions of a nickel factory were created. To assess the effectiveness of the models, three indices were used: the mean absolute error (MAE), the root-mean-square error (RMSE), and the relative root-mean-square error (RRMSE). Taking all indices into account, the GRNN model, which included the coordinates of the sampling points and the distance to the likely emission source as input parameters, proved to be the most accurate. Maps of spatial dust distribution in the snow cover were created for the study area. It has been shown that the models based on ANNs were more accurate than kriging, particularly in the context of a limited data set.
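A GRNN is, at prediction time, equivalent to Nadaraya-Watson kernel regression: each estimate is a Gaussian-weighted average of the training targets. A minimal numpy sketch for dust-in-snow-style data (the sampling points, source location and smoothing parameter `sigma` are all invented for illustration):

```python
import numpy as np

def grnn_predict(X_train, y_train, X_new, sigma):
    """GRNN prediction = Gaussian-kernel weighted average of training targets."""
    d2 = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2 * sigma ** 2))
    return (w @ y_train) / w.sum(axis=1)

rng = np.random.default_rng(8)
pts = rng.uniform(0, 1, size=(120, 2))          # sampling point coordinates
source = np.array([0.7, 0.3])                   # hypothetical emission source
dist = np.linalg.norm(pts - source, axis=1)
dust = np.exp(-3 * dist) + 0.05 * rng.normal(size=120)  # dust load decays with distance

grid = np.column_stack([g.ravel() for g in np.meshgrid(np.linspace(0, 1, 20),
                                                       np.linspace(0, 1, 20))])
dust_map = grnn_predict(pts, dust, grid, sigma=0.1)

at_source = grnn_predict(pts, dust, np.array([[0.7, 0.3]]), sigma=0.1)[0]
far_away = grnn_predict(pts, dust, np.array([[0.0, 1.0]]), sigma=0.1)[0]
```

Adding distance-to-source as an extra input column, as the study did, simply extends `X_train` and `X_new` by one feature; no other change is needed.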
Fire frequency in the Interior Columbia River Basin: Building regional models from fire history data
McKenzie, D.; Peterson, D.L.; Agee, James K.
2000-01-01
Fire frequency affects vegetation composition and successional pathways; thus it is essential to understand fire regimes in order to manage natural resources at broad spatial scales. Fire history data are lacking for many regions for which fire management decisions are being made, so models are needed to estimate past fire frequency where local data are not yet available. We developed multiple regression models and tree-based (classification and regression tree, or CART) models to predict fire return intervals across the interior Columbia River basin at 1-km resolution, using georeferenced fire history, potential vegetation, cover type, and precipitation databases. The models combined semiqualitative methods and rigorous statistics. The fire history data are of uneven quality; some estimates are based on only one tree, and many are not cross-dated. Therefore, we weighted the models based on data quality and performed a sensitivity analysis of the effects on the models of estimation errors that are due to lack of cross-dating. The regression models predict fire return intervals from 1 to 375 yr for forested areas, whereas the tree-based models predict a range of 8 to 150 yr. Both types of models predict latitudinal and elevational gradients of increasing fire return intervals. Examination of regional-scale output suggests that, although the tree-based models explain more of the variation in the original data, the regression models are less likely to produce extrapolation errors. Thus, the models serve complementary purposes in elucidating the relationships among fire frequency, the predictor variables, and spatial scale. The models can provide local managers with quantitative information and provide data to initialize coarse-scale fire-effects models, although predictions for individual sites should be treated with caution because of the varying quality and uneven spatial coverage of the fire history database. 
The models also demonstrate the integration of qualitative and quantitative methods when requisite data for fully quantitative models are unavailable. They can be tested by comparing new, independent fire history reconstructions against their predictions and can be continually updated, as better fire history data become available.
Estimating effects of limiting factors with regression quantiles
Cade, B.S.; Terrell, J.W.; Schroeder, R.L.
1999-01-01
In a recent Concepts paper in Ecology, Thomson et al. emphasized that assumptions of conventional correlation and regression analyses fundamentally conflict with the ecological concept of limiting factors, and they called for new statistical procedures to address this problem. The analytical issue is that unmeasured factors may be the active limiting constraint and may induce a pattern of unequal variation in the biological response variable through an interaction with the measured factors. Consequently, changes near the maxima, rather than at the center of response distributions, are better estimates of the effects expected when the observed factor is the active limiting constraint. Regression quantiles provide estimates for linear models fit to any part of a response distribution, including near the upper bounds, and require minimal assumptions about the form of the error distribution. Regression quantiles extend the concept of one-sample quantiles to the linear model by solving an optimization problem of minimizing an asymmetric function of absolute errors. Rank-score tests for regression quantiles provide tests of hypotheses and confidence intervals for parameters in linear models with heteroscedastic errors, conditions likely to occur in models of limiting ecological relations. We used selected regression quantiles (e.g., 5th, 10th, ..., 95th) and confidence intervals to test hypotheses that parameters equal zero for estimated changes in average annual acorn biomass due to forest canopy cover of oak (Quercus spp.) and oak species diversity. Regression quantiles also were used to estimate changes in glacier lily (Erythronium grandiflorum) seedling numbers as a function of lily flower numbers, rockiness, and pocket gopher (Thomomys talpoides fossor) activity, data that motivated the query by Thomson et al. for new statistical procedures. 
Both example applications showed that effects of limiting factors estimated by changes in some upper regression quantile (e.g., 90-95th) were greater than if effects were estimated by changes in the means from standard linear model procedures. Estimating a range of regression quantiles (e.g., 5-95th) provides a comprehensive description of biological response patterns for exploratory and inferential analyses in observational studies of limiting factors, especially when sampling large spatial and temporal scales.
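Regression quantiles minimize the asymmetric absolute ("pinball") loss, and that minimization is a linear program. A minimal scipy sketch with heteroscedastic data of the kind the abstract describes (illustrative data; production analyses would use a dedicated package such as R's quantreg):

```python
import numpy as np
from scipy.optimize import linprog

def quantile_regression(X, y, tau):
    """Fit y ~ X at quantile tau by linear programming."""
    n, p = X.shape
    # Variables: beta (free), u+ >= 0, u- >= 0, with y = X beta + u+ - u-.
    c = np.concatenate([np.zeros(p), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:p]

rng = np.random.default_rng(5)
n = 300
x = rng.uniform(0, 10, n)
# Spread grows with x, mimicking an unmeasured limiting factor.
y = 1.0 + 0.5 * x + (0.2 + 0.3 * x) * rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

beta_90 = quantile_regression(X, y, 0.90)   # near the upper bound of the response
beta_50 = quantile_regression(X, y, 0.50)   # median
```

As the abstract argues for limiting-factor data, the 90th-quantile slope exceeds the median slope here because the upper edge of the response fans out with x.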
Relationships of Measurement Error and Prediction Error in Observed-Score Regression
ERIC Educational Resources Information Center
Moses, Tim
2012-01-01
The focus of this paper is assessing the impact of measurement errors on the prediction error of an observed-score regression. Measures are presented and described for decomposing the linear regression's prediction error variance into parts attributable to the true score variance and the error variances of the dependent variable and the predictor…
2014-01-01
Background: This study aims to suggest an approach that integrates multilevel models and eigenvector spatial filtering methods and apply it to a case study of self-rated health status in South Korea. In many previous health-related studies, multilevel models and single-level spatial regression are used separately. However, the two methods should be used in conjunction because the objectives of both approaches are important in health-related analyses. The multilevel model enables the simultaneous analysis of both individual and neighborhood factors influencing health outcomes. However, the results of conventional multilevel models are potentially misleading when spatial dependency across neighborhoods exists. Spatial dependency in health-related data indicates that health outcomes in nearby neighborhoods are more similar to each other than those in distant neighborhoods. Spatial regression models can address this problem by modeling spatial dependency. This study explores the possibility of integrating a multilevel model and eigenvector spatial filtering, an advanced spatial regression for addressing spatial dependency in datasets. Methods: In this spatially filtered multilevel model, eigenvectors function as additional explanatory variables accounting for unexplained spatial dependency within the neighborhood-level error. The specification addresses the inability of conventional multilevel models to account for spatial dependency, and thereby generates more robust outputs. Results: The findings show that sex, employment status, monthly household income, and perceived levels of stress are significantly associated with self-rated health status. Residents living in neighborhoods with low deprivation and a high doctor-to-resident ratio tend to report higher health status. 
The spatially filtered multilevel model provides unbiased estimations and improves the explanatory power of the model compared to conventional multilevel models, although there are no changes in the signs of parameters and the significance levels between the two models in this case study. Conclusions: The integrated approach proposed in this paper is a useful tool for understanding the geographical distribution of self-rated health status within a multilevel framework. In future research, it would be useful to apply the spatially filtered multilevel model to other datasets in order to clarify the differences between the two models. It is anticipated that this integrated method will also out-perform conventional models when it is used in other contexts. PMID:24571639
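Eigenvector spatial filtering derives its extra regressors from the doubly centered spatial weights matrix MCM, where M = I - 11ᵀ/n; eigenvectors with large positive eigenvalues represent positively autocorrelated map patterns. A minimal numpy sketch on a toy lattice of "neighborhoods" (the lattice, weights and selection threshold are illustrative; real applications select eigenvectors by stepwise search against the residuals):

```python
import numpy as np

# Rook-contiguity weights C on a 6x6 lattice of neighborhoods.
side = 6
n = side * side
C = np.zeros((n, n))
for i in range(side):
    for j in range(side):
        k = i * side + j
        if i + 1 < side: C[k, k + side] = C[k + side, k] = 1
        if j + 1 < side: C[k, k + 1] = C[k + 1, k] = 1

M = np.eye(n) - np.ones((n, n)) / n            # centering projector
vals, vecs = np.linalg.eigh(M @ C @ M)         # symmetric, so eigh is appropriate

order = np.argsort(vals)[::-1]                 # descending eigenvalue order
vals, vecs = vals[order], vecs[:, order]

# Keep eigenvectors with strongly positive eigenvalues (positive autocorrelation).
keep = vals > 0.25 * vals[0]
E = vecs[:, keep]                              # candidate spatial filter variables
```

The columns of `E` would be appended to the neighborhood-level design matrix; because each is orthogonal to the constant vector, they absorb spatial dependency without disturbing the intercept, which is how the spatially filtered multilevel model removes the dependency left in the neighborhood-level error.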
Hjort, Jan; Hugg, Timo T; Antikainen, Harri; Rusanen, Jarmo; Sofiev, Mikhail; Kukkonen, Jaakko; Jaakkola, Maritta S; Jaakkola, Jouni J K
2016-05-01
Despite the recent developments in physically and chemically based analysis of atmospheric particles, no models exist for resolving the spatial variability of pollen concentration at urban scale. We developed a land use regression (LUR) approach for predicting spatial fine-scale allergenic pollen concentrations in the Helsinki metropolitan area, Finland, and evaluated the performance of the models against available empirical data. We used grass pollen data monitored at 16 sites in an urban area during the peak pollen season and geospatial environmental data. The main statistical method was the generalized linear model (GLM). GLM-based LURs explained 79% of the spatial variation in the grass pollen data based on all samples, and 47% of the variation when samples from two sites with very high concentrations were excluded. In model evaluation, prediction errors ranged from 6% to 26% of the observed range of grass pollen concentrations. Our findings support the use of geospatial data-based statistical models to predict the spatial variation of allergenic grass pollen concentrations at intra-urban scales. A remote sensing-based vegetation index was the strongest predictor of pollen concentrations for exposure assessments at local scales. The LUR approach provides new opportunities to estimate the relations between environmental determinants and allergenic pollen concentration in human-modified environments at fine spatial scales. This approach could potentially be applied to retrospectively estimate pollen concentrations for use in long-term exposure assessments. Hjort J, Hugg TT, Antikainen H, Rusanen J, Sofiev M, Kukkonen J, Jaakkola MS, Jaakkola JJ. 2016. Fine-scale exposure to allergenic pollen in the urban environment: evaluation of land use regression approach. Environ Health Perspect 124:619-626; http://dx.doi.org/10.1289/ehp.1509761.
Can We Use Regression Modeling to Quantify Mean Annual Streamflow at a Global-Scale?
NASA Astrophysics Data System (ADS)
Barbarossa, V.; Huijbregts, M. A. J.; Hendriks, J. A.; Beusen, A.; Clavreul, J.; King, H.; Schipper, A.
2016-12-01
Quantifying mean annual flow of rivers (MAF) at ungauged sites is essential for a number of applications, including assessments of global water supply, ecosystem integrity and water footprints. MAF can be quantified with spatially explicit process-based models, which might be overly time-consuming and data-intensive for this purpose, or with empirical regression models that predict MAF based on climate and catchment characteristics. Yet, regression models have mostly been developed at a regional scale and the extent to which they can be extrapolated to other regions is not known. In this study, we developed a global-scale regression model for MAF using observations of discharge and catchment characteristics from 1,885 catchments worldwide, ranging from 2 to 10⁶ km² in size. In addition, we compared the performance of the regression model with the predictive ability of the spatially explicit global hydrological model PCR-GLOBWB [van Beek et al., 2011] by comparing results from both models to independent measurements. We obtained a regression model explaining 89% of the variance in MAF based on catchment area, mean annual precipitation and air temperature, average slope and elevation. The regression model performed better than PCR-GLOBWB for the prediction of MAF, as root-mean-square error values were lower (0.29-0.38 compared to 0.49-0.57) and the modified index of agreement was higher (0.80-0.83 compared to 0.72-0.75). Our regression model can be applied globally at any point of the river network, provided that the input parameters are within the range of values employed in the calibration of the model. The performance is reduced for water scarce regions and further research should focus on improving such an aspect for regression-based global hydrological models.
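The modified index of agreement reported above is Willmott's d1, built from absolute rather than squared deviations: d1 = 1 - Σ|P - O| / Σ(|P - Ō| + |O - Ō|). A minimal sketch (the log-transformed MAF values are invented for illustration):

```python
import numpy as np

def modified_index_of_agreement(obs, pred):
    """Willmott's modified index of agreement d1 (0 = no skill, 1 = perfect)."""
    obar = obs.mean()
    return 1 - np.abs(pred - obs).sum() / (np.abs(pred - obar) + np.abs(obs - obar)).sum()

rng = np.random.default_rng(7)
log_maf_obs = rng.normal(3.0, 1.0, size=100)             # illustrative log-scale MAF
log_maf_good = log_maf_obs + 0.1 * rng.normal(size=100)  # accurate model
log_maf_poor = log_maf_obs + 1.0 * rng.normal(size=100)  # noisy model

d1_good = float(modified_index_of_agreement(log_maf_obs, log_maf_good))
d1_poor = float(modified_index_of_agreement(log_maf_obs, log_maf_poor))
```

Because d1 uses absolute deviations, it is less dominated by a few large-discharge catchments than squared-error skill scores, which is a sensible property when catchment sizes span eight orders of magnitude.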
Local regression type methods applied to the study of geophysics and high frequency financial data
Mariani, M. C.; Basu, K.
2014-09-01
In this work we applied locally weighted scatterplot smoothing techniques (Lowess/Loess) to geophysical and high frequency financial data. We first analyze and apply this technique to the California earthquake geological data. A spatial analysis was performed to show that the estimation of the earthquake magnitude at a fixed location is accurate to within a relative error of 0.01%. We also applied the same method to a high frequency data set arising in the financial sector and obtained similarly satisfactory results. The application of this approach to the two different data sets demonstrates that the overall method is accurate and efficient, and the Lowess approach is much more desirable than the Loess method. Previous works focused on time series analysis; in this paper our local regression models perform a spatial analysis of the geophysical data, providing complementary information. For the high frequency data, our models estimate the curve of best fit where data are dependent on time.
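The core of Lowess is a weighted linear fit at each evaluation point, with tricube weights over the nearest neighbors. A minimal numpy sketch of that degree-1 smoother (robustness iterations, which full Lowess adds to down-weight outliers, are omitted; data and the `frac` span are illustrative):

```python
import numpy as np

def lowess(x, y, frac=0.3):
    """Degree-1 locally weighted regression (no robustness iterations)."""
    n = len(x)
    k = max(2, int(np.ceil(frac * n)))                # neighborhood size
    y_hat = np.empty(n)
    A = np.column_stack([np.ones(n), x])
    for i in range(n):
        d = np.abs(x - x[i])
        h = np.sort(d)[k - 1]                         # bandwidth = k-th nearest distance
        w = np.clip(1 - (d / h) ** 3, 0, None) ** 3   # tricube weights
        Aw = A * w[:, None]
        beta = np.linalg.solve(A.T @ Aw, Aw.T @ y)    # local weighted least squares
        y_hat[i] = beta[0] + beta[1] * x[i]
    return y_hat

rng = np.random.default_rng(6)
x = np.sort(rng.uniform(0, 4, 200))
y_true = np.exp(-x) * np.sin(3 * x)
y = y_true + 0.2 * rng.normal(size=200)

y_smooth = lowess(x, y, frac=0.2)
rmse_raw = float(np.sqrt(np.mean((y - y_true) ** 2)))
rmse_smooth = float(np.sqrt(np.mean((y_smooth - y_true) ** 2)))
```

Extending this to the spatial setting the paper describes means replacing the scalar distance `|x - x[i]|` with a two-dimensional Euclidean distance between sample locations.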
Giltrap, Donna L; Ausseil, Anne-Gaëlle E
2016-01-01
The availability of detailed input data frequently limits the application of process-based models at large scale. In this study, we produced simplified meta-models of the simulated nitrous oxide (N2O) emission factors (EF) using NZ-DNDC. Monte Carlo simulations were performed and the results investigated using multiple regression analysis to produce simplified meta-models of EF. These meta-models were then used to estimate direct N2O emissions from grazed pastures in New Zealand. New Zealand EF maps were generated using the meta-models with data from national scale soil maps. Direct emissions of N2O from grazed pasture were calculated by multiplying the EF map with a nitrogen (N) input map. Three meta-models were considered. Model 1 included only the soil organic carbon in the top 30 cm (SOC30), Model 2 also included a clay content factor, and Model 3 added the interaction between SOC30 and clay. The median annual national direct N2O emissions from grazed pastures estimated using each model (assuming model errors were purely random) were: 9.6 GgN (Model 1), 13.6 GgN (Model 2), and 11.9 GgN (Model 3). These values corresponded to an average EF of 0.53%, 0.75% and 0.63% respectively, while the corresponding average EF using New Zealand national inventory values was 0.67%. If the model error can be assumed to be independent for each pixel, then the 95% confidence interval for the N2O emissions was of the order of ±0.4-0.7%, which is much lower than for existing methods. However, spatial correlations in the model errors could invalidate this assumption. Under the extreme assumption that the model error for each pixel was identical, the 95% confidence interval was approximately ±100-200%. Therefore further work is needed to assess the degree of spatial correlation in the model errors. Copyright © 2015 Elsevier B.V. All rights reserved.
Estimating maize production in Kenya using NDVI: Some statistical considerations
Lewis, J.E.; Rowland, James; Nadeau, A.
1998-01-01
A regression model approach using a normalized difference vegetation index (NDVI) has the potential for estimating crop production in East Africa. However, before production estimation can become a reality, the underlying model assumptions and statistical nature of the sample data (NDVI and crop production) must be examined rigorously. Annual maize production statistics from 1982-90 for 36 agricultural districts within Kenya were used as the dependent variable; median area NDVI (independent variable) values from each agricultural district and year were extracted from the annual maximum NDVI data set. The input data and the statistical association of NDVI with maize production for Kenya were tested systematically for the following items: (1) homogeneity of the data when pooling the sample, (2) gross data errors and influence points, (3) serial (time) correlation, (4) spatial autocorrelation and (5) stability of the regression coefficients. The results of using a simple regression model with NDVI as the only independent variable are encouraging (r = 0.75, p < 0.05) and illustrate that NDVI can be a responsive indicator of maize production, especially in areas of high NDVI spatial variability, which coincide with areas of production variability in Kenya.
Balint, Lajos; Dome, Peter; Daroczi, Gergely; Gonda, Xenia; Rihmer, Zoltan
2014-02-01
In the last century Hungary had astonishingly high suicide rates characterized by marked regional within-country inequalities, a spatial pattern which has been quite stable over time. Our aim was to explain this phenomenon at the level of micro-regions (n = 175) for the period between 2005 and 2011. Our dependent variable was the age- and gender-standardized mortality ratio (SMR) for suicide, while the explanatory variables were factors supposed to influence suicide risk, such as measures of religious and political integration, travel-time accessibility of psychiatric services, alcohol consumption, unemployment and disability pensionery. When applying the ordinary least squares regression model, the residuals were found to be spatially autocorrelated, indicating a violation of the assumption of independent error terms and, accordingly, the need for a spatial autoregressive (SAR) model to handle this problem. According to our calculations, the SARlag model addressed the problem of spatial autocorrelation better than the SARerr model, and its substantive meaning is more convenient. SMR was significantly associated with the "political integration" variable in a negative and with the "lack of religious integration" and "disability pensionery" variables in a positive manner. Associations were not significant for the remaining explanatory variables. Several important psychiatric variables were not available at the level of micro-regions. We conducted our analysis on aggregate data. Our results may draw attention to the relevance and abiding validity of the classic Durkheimian suicide risk factors - such as lack of social integration - apropos of the spatial pattern of Hungarian suicides. © 2013 Published by Elsevier B.V.
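The diagnostic step described above (testing OLS residuals for spatial autocorrelation) is usually done with global Moran's I. A numpy-only sketch on a hypothetical 5 x 5 lattice of regions with rook-adjacency weights; the lattice, weights, and test patterns are illustrative assumptions:

```python
import numpy as np

def morans_i(z, W):
    """Global Moran's I of values z under a binary spatial weights matrix W."""
    z = np.asarray(z, float) - np.mean(z)
    n = len(z)
    return (n / W.sum()) * (z @ W @ z) / (z @ z)

# Hypothetical 5x5 lattice of regions with rook (edge-sharing) adjacency
side = 5
coords = [(i, j) for i in range(side) for j in range(side)]
W = np.zeros((side * side, side * side))
for a, (i, j) in enumerate(coords):
    for b, (k, l) in enumerate(coords):
        if abs(i - k) + abs(j - l) == 1:
            W[a, b] = 1.0

trend = np.array([i + j for i, j in coords], float)              # smooth gradient
checker = np.array([(-1) ** (i + j) for i, j in coords], float)  # alternating pattern
```

A smooth spatial gradient yields strongly positive I (clustering, as found in the residuals above), while a checkerboard yields strongly negative I.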
Huard, Edouard; Derelle, Sophie; Jaeck, Julien; Nghiem, Jean; Haïdar, Riad; Primot, Jérôme
2018-03-05
A challenging point in the prediction of the image quality of infrared imaging systems is the evaluation of the detector modulation transfer function (MTF). In this paper, we present a linear method to get a 2D continuous MTF from sparse spectral data. Within the method, an object with a predictable sparse spatial spectrum is imaged by the focal plane array. The sparse data is then treated to return the 2D continuous MTF with the hypothesis that all the pixels have an identical spatial response. The linearity of the treatment is a key point to estimate directly the error bars of the resulting detector MTF. The test bench will be presented along with measurement tests on a 25 μm pitch InGaAs detector.
NASA Astrophysics Data System (ADS)
Steger, Stefan; Brenning, Alexander; Bell, Rainer; Glade, Thomas
2016-12-01
There is unanimous agreement that a precise spatial representation of past landslide occurrences is a prerequisite to produce high quality statistical landslide susceptibility models. Even though perfectly accurate landslide inventories rarely exist, investigations of how landslide inventory-based errors propagate into subsequent statistical landslide susceptibility models are scarce. The main objective of this research was to systematically examine whether and how inventory-based positional inaccuracies of different magnitudes influence modelled relationships, validation results, variable importance and the visual appearance of landslide susceptibility maps. The study was conducted for a landslide-prone site located in the districts of Amstetten and Waidhofen an der Ybbs, eastern Austria, where an earth-slide point inventory was available. The methodological approach comprised an artificial introduction of inventory-based positional errors into the present landslide data set and an in-depth evaluation of subsequent modelling results. Positional errors were introduced by artificially changing the original landslide position by a mean distance of 5, 10, 20, 50 and 120 m. The resulting differently precise response variables were separately used to train logistic regression models. Odds ratios of predictor variables provided insights into modelled relationships. Cross-validation and spatial cross-validation enabled an assessment of predictive performances and permutation-based variable importance. All analyses were additionally carried out with synthetically generated data sets to further verify the findings under rather controlled conditions. The results revealed that an increasing positional inventory-based error was generally related to increasing distortions of modelling and validation results. However, the findings also highlighted that interdependencies between inventory-based spatial inaccuracies and statistical landslide susceptibility models are complex. 
The systematic comparisons of 12 models provided valuable evidence that the respective error-propagation was not only determined by the degree of positional inaccuracy inherent in the landslide data, but also by the spatial representation of landslides and the environment, landslide magnitude, the characteristics of the study area, the selected classification method and an interplay of predictors within multiple variable models. Based on the results, we deduced that a direct propagation of minor to moderate inventory-based positional errors into modelling results can be partly counteracted by adapting the modelling design (e.g. generalization of input data, opting for strongly generalizing classifiers). Since positional errors within landslide inventories are common and subsequent modelling and validation results are likely to be distorted, the potential existence of inventory-based positional inaccuracies should always be considered when assessing landslide susceptibility by means of empirical models.
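The error-introduction experiment described above can be sketched with synthetic data: landslide presence is generated from a smooth "slope" field, recorded coordinates are jittered to mimic positional error, and a one-predictor logistic regression is scored by AUC. The field, sample size, and error magnitudes are all illustrative assumptions, not the study's actual setup:

```python
import numpy as np

def slope_field(x, y):
    # Hypothetical smooth terrain attribute over the study area
    return np.sin(3.0 * x) * np.cos(3.0 * y)

def fit_logit(f, y, iters=2000, lr=0.5):
    """One-predictor logistic regression fitted by gradient descent (sketch)."""
    w, b = 0.0, 0.0
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(w * f + b)))
        w -= lr * np.mean((p - y) * f)
        b -= lr * np.mean(p - y)
    return w, b

def auc(score, y):
    """Area under the ROC curve via the Mann-Whitney rank statistic."""
    order = np.argsort(score)
    ranks = np.empty(len(score))
    ranks[order] = np.arange(1, len(score) + 1)
    npos = int(y.sum())
    nneg = len(y) - npos
    return (ranks[y == 1].sum() - npos * (npos + 1) / 2) / (npos * nneg)

rng = np.random.default_rng(0)
n = 500
x, y_ = rng.uniform(0, 4, n), rng.uniform(0, 4, n)
p_true = 1.0 / (1.0 + np.exp(-4.0 * slope_field(x, y_)))
label = (rng.random(n) < p_true).astype(float)

def auc_with_jitter(sd):
    # Extract the predictor at the (mis)recorded landslide positions
    f = slope_field(x + rng.normal(0, sd, n), y_ + rng.normal(0, sd, n))
    w, b = fit_logit(f, label)
    return auc(w * f + b, label)

auc_clean, auc_noisy = auc_with_jitter(0.02), auc_with_jitter(0.5)
```

With a large positional error the extracted predictor decorrelates from the true terrain attribute and discrimination degrades, mirroring the distortions reported above.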
Wagner, Brian J.; Gorelick, Steven M.
1986-01-01
A simulation nonlinear multiple-regression methodology for estimating parameters that characterize the transport of contaminants is developed and demonstrated. Finite difference contaminant transport simulation is combined with a nonlinear weighted least squares multiple-regression procedure. The technique provides optimal parameter estimates and gives statistics for assessing the reliability of these estimates under certain general assumptions about the distributions of the random measurement errors. Monte Carlo analysis is used to estimate parameter reliability for a hypothetical homogeneous soil column for which concentration data contain large random measurement errors. The value of data collected spatially versus data collected temporally was investigated for estimation of velocity, dispersion coefficient, effective porosity, first-order decay rate, and zero-order production. The use of spatial data gave estimates that were 2–3 times more reliable than estimates based on temporal data for all parameters except velocity. Comparison of estimated linear and nonlinear confidence intervals based upon Monte Carlo analysis showed that the linear approximation is poor for dispersion coefficient and zero-order production coefficient when data are collected over time. In addition, examples demonstrate transport parameter estimation for two real one-dimensional systems. First, the longitudinal dispersivity and effective porosity of an unsaturated soil are estimated using laboratory column data. We compare the reliability of estimates based upon data from individual laboratory experiments versus estimates based upon pooled data from several experiments. Second, the simulation nonlinear regression procedure is extended to include an additional governing equation that describes delayed storage during contaminant transport. The model is applied to analyze the trends, variability, and interrelationship of parameters in a mountain stream in northern California.
Li, Jie; Fang, Xiangming
2010-01-01
Automated geocoding of patient addresses is an important data assimilation component of many spatial epidemiologic studies. Inevitably, the geocoding process results in positional errors. Positional errors incurred by automated geocoding tend to reduce the power of tests for disease clustering and otherwise affect spatial analytic methods. However, there are reasons to believe that the errors may often be positively spatially correlated and that this may mitigate their deleterious effects on spatial analyses. In this article, we demonstrate explicitly that the positional errors associated with automated geocoding of a dataset of more than 6000 addresses in Carroll County, Iowa are spatially autocorrelated. Furthermore, through two simulation studies of disease processes, including one in which the disease process is overlain upon the Carroll County addresses, we show that spatial autocorrelation among geocoding errors maintains the power of two tests for disease clustering at a level higher than that which would occur if the errors were independent. Implications of these results for cluster detection, privacy protection, and measurement-error modeling of geographic health data are discussed. PMID:20087879
Price, Weather, and `Acreage Abandonment' in Western Great Plains Wheat Culture.
NASA Astrophysics Data System (ADS)
Michaels, Patrick J.
1983-07-01
Multivariate analyses of acreage abandonment patterns in the U.S. Great Plains winter wheat region indicate that the major mode of variation is an in-phase oscillation confined to the western half of the overall area, which is also the area with lowest average yields. This is one of the more agroclimatically marginal environments in the United States, with wide interannual fluctuations in both climate and profitability. We developed a multiple regression model to determine the relative roles of weather and expected price in the decision not to harvest. The overall model explained 77% of the spatial and temporal variation in abandonment. Of the non-spatial variation, 36.5% was explained by two simple transformations of climatic data from three monthly aggregates: September-October, November-February and March-April. Price factors, expressed as indexed future delivery quotations, were barely significant, with only between 3 and 5% of the non-spatial variation explained, depending upon the model. The model was based upon weather, climate and price data from 1932 through 1975. It was tested by sequentially withholding three-year blocks of data, and using the respecified regression coefficients, along with observed weather and price, to estimate abandonment in the withheld years. Error analyses indicate no loss of model fidelity in the test mode. Also, prediction errors in the 1970-75 period, characterized by widely fluctuating prices, were not different from those in the rest of the model. The overall results suggest that the perceived quality of the crop, as influenced by weather, is a much more important determinant of the abandonment decision than are expected returns based upon price considerations.
Mahara, Gehendra; Wang, Chao; Yang, Kun; Chen, Sipeng; Guo, Jin; Gao, Qi; Wang, Wei; Wang, Quanyi; Guo, Xiuhua
2016-01-01
(1) Background: Evidence regarding scarlet fever and its relationship with meteorological factors, including air pollution, is scarce. This study aimed to examine the relationship between ambient air pollutants and meteorological factors and scarlet fever occurrence in Beijing, China. (2) Methods: A retrospective ecological study was carried out to distinguish the epidemic characteristics of scarlet fever incidence in Beijing districts from 2013 to 2014. Daily incidence and corresponding air pollutant and meteorological data were used to develop the model. The global Moran's I statistic and Anselin's local Moran's I (LISA) were applied to detect spatial autocorrelation (spatial dependency) and clusters of scarlet fever incidence. The spatial lag model (SLM), the spatial error model (SEM) and ordinary least squares (OLS) models were then applied to probe the association between scarlet fever incidence and meteorological and air pollution factors. (3) Results: Among the 5491 cases, more than half (62%) were male and more than one-third (37.8%) were female, with an annual average incidence rate of 14.64 per 100,000 population. Spatial autocorrelation analysis exhibited the existence of spatial dependence; therefore, we applied spatial regression models. A comparison of R-squared, log-likelihood and the Akaike information criterion (AIC) across the three models (OLS: R² = 0.0741, log likelihood = -1819.69, AIC = 3665.38; SLM: R² = 0.0786, log likelihood = -1819.04, AIC = 3665.08; SEM: R² = 0.0743, log likelihood = -1819.67, AIC = 3665.36) identified the spatial lag model (SLM) as the best-fitting regression model. In the SLM, nitrogen oxide (p = 0.027), rainfall (p = 0.036) and sunshine hours (p = 0.048) had significant positive associations with scarlet fever incidence, while relative humidity (p = 0.034) had a negative association.
(4) Conclusions: Our findings indicated that meteorological as well as air pollutant factors may increase the incidence of scarlet fever; these findings may help to guide scarlet fever control programs and target interventions. PMID:27827946
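The model-selection step reported above (choosing the SLM by AIC) follows directly from the AIC definition, AIC = 2k - 2 ln L, where k is the number of fitted parameters and lower values are better. A short sketch using the abstract's reported AIC values (the parameter count in the recomputation example is hypothetical):

```python
def aic(loglik, k):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * loglik

# Reported AIC values from the abstract; the model with the smallest AIC wins
reported = {"OLS": 3665.38, "SLM": 3665.08, "SEM": 3665.36}
best = min(reported, key=reported.get)

# Recomputing from a log-likelihood, e.g. a hypothetical model with k = 3
example = aic(-10.0, 3)   # 2*3 - 2*(-10) = 26
```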
NASA Astrophysics Data System (ADS)
Nandy, Sreyankar; Mostafa, Atahar; Kumavor, Patrick D.; Sanders, Melinda; Brewer, Molly; Zhu, Quing
2016-10-01
A spatial frequency domain imaging (SFDI) system was developed for characterizing ex vivo human ovarian tissue using wide-field absorption and scattering properties and their spatial heterogeneities. Based on the observed differences between absorption and scattering images of different ovarian tissue groups, six parameters were quantitatively extracted. These are the mean absorption and scattering, spatial heterogeneities of both absorption and scattering maps measured by a standard deviation, and a fitting error of a Gaussian model fitted to normalized mean Radon transform of the absorption and scattering maps. A logistic regression model was used for classification of malignant and normal ovarian tissues. A sensitivity of 95%, specificity of 100%, and area under the curve of 0.98 were obtained using six parameters extracted from the SFDI images. The preliminary results demonstrate the diagnostic potential of the SFDI method for quantitative characterization of wide-field optical properties and the spatial distribution heterogeneity of human ovarian tissue. SFDI could be an extremely robust and valuable tool for evaluation of the ovary and detection of neoplastic changes of ovarian cancer.
Empirical tools for simulating salinity in the estuaries in Everglades National Park, Florida
NASA Astrophysics Data System (ADS)
Marshall, F. E.; Smith, D. T.; Nickerson, D. M.
2011-12-01
Salinity in a shallow estuary is affected by upland freshwater inputs (surface runoff, stream/canal flows, groundwater), atmospheric processes (precipitation, evaporation), marine connectivity, and wind patterns. In Everglades National Park (ENP) in South Florida, the unique Everglades ecosystem exists as an interconnected system of fresh, brackish, and salt water marshes, mangroves, and open water. For this effort a coastal aquifer conceptual model of the Everglades hydrologic system was used with traditional correlation and regression hydrologic techniques to create a series of multiple linear regression (MLR) salinity models from observed hydrologic, marine, and weather data. The 37 ENP MLR salinity models cover most of the estuarine areas of ENP and produce daily salinity simulations that are capable of estimating 65-80% of the daily variability in salinity depending upon the model. The Root Mean Squared Error is typically about 2-4 salinity units, and there is little bias in the predictions. However, the absolute error of a model prediction in the nearshore embayments and the mangrove zone of Florida Bay may be relatively large for a particular daily simulation during the seasonal transitions. Comparisons show that the models group regionally by similar independent variables and salinity regimes. The MLR salinity models have approximately the same expected range of simulation accuracy and error as higher spatial resolution salinity models.
Olson, Scott A.
2003-01-01
The stream-gaging network in New Hampshire was analyzed for its effectiveness in providing regional information on peak-flood flow, mean-flow, and low-flow frequency. The data available for analysis were from stream-gaging stations in New Hampshire and selected stations in adjacent States. The principles of generalized-least-squares regression analysis were applied to develop regional regression equations that relate streamflow-frequency characteristics to watershed characteristics. Regression equations were developed for (1) the instantaneous peak flow with a 100-year recurrence interval, (2) the mean-annual flow, and (3) the 7-day, 10-year low flow. Active and discontinued stream-gaging stations with 10 or more years of flow data were used to develop the regression equations. Each stream-gaging station in the network was evaluated and ranked on the basis of how much the data from that station contributed to the cost-weighted sampling-error component of the regression equation. The potential effect of data from proposed and new stream-gaging stations on the sampling error also was evaluated. The stream-gaging network was evaluated for conditions in water year 2000 and for estimated conditions under various network strategies if an additional 5 years and 20 years of streamflow data were collected. The effectiveness of the stream-gaging network in providing regional streamflow information could be improved for all three flow characteristics with the collection of additional flow data, both temporally and spatially. With additional years of data collection, the greatest reduction in the average sampling error of the regional regression equations was found for the peak- and low-flow characteristics. 
In general, additional data collection at stream-gaging stations with unregulated flow, relatively short-term record (less than 20 years), and drainage areas smaller than 45 square miles contributed the largest cost-weighted reduction to the average sampling error of the regional estimating equations. The results of the network analyses can be used to prioritize the continued operation of active stations, the reactivation of discontinued stations, or the activation of new stations to maximize the regional information content provided by the stream-gaging network. Final decisions regarding altering the New Hampshire stream-gaging network would require the consideration of the many uses of the streamflow data serving local, State, and Federal interests.
ERIC Educational Resources Information Center
Shear, Benjamin R.; Zumbo, Bruno D.
2013-01-01
Type I error rates in multiple regression, and hence the chance for false positive research findings, can be drastically inflated when multiple regression models are used to analyze data that contain random measurement error. This article shows the potential for inflated Type I error rates in commonly encountered scenarios and provides new…
Operational quality control of daily precipitation using spatio-climatological consistency testing
NASA Astrophysics Data System (ADS)
Scherrer, S. C.; Croci-Maspoli, M.; van Geijtenbeek, D.; Naguel, C.; Appenzeller, C.
2010-09-01
Quality control (QC) of meteorological data is of utmost importance for climate-related decisions. The search for an effective automated QC of precipitation data has proven difficult, and many weather services, including MeteoSwiss, still rely mainly on manual inspection of daily precipitation. However, manpower limitations force many weather services to move towards less labour-intensive and more automated QC, with the challenge of keeping data quality high. In the last decade, several approaches have been presented to objectify daily precipitation QC. Here we present a spatio-climatological approach that will be implemented operationally at MeteoSwiss. It combines information from the event-based spatial distribution of each day's precipitation field with historical information on the interpolation error in different precipitation intensity intervals. Expert judgement shows that the system is able to detect potential outliers very well (hardly any missed errors) without creating too many false alarms that need human inspection. 50-80% of all flagged values have been classified as real errors by the data editor. This is much better than the roughly 15-20% using standard spatial regression tests. Very helpful in the QC process is the automatic redistribution of accumulated several-day sums. Manual inspection in operations can be reduced and the QC of precipitation objectified substantially.
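The consistency test described above can be sketched as a leave-one-out comparison of each station against a spatial estimate from its neighbors, with an intensity-dependent tolerance. The IDW estimator, intensity bins, and thresholds below are illustrative assumptions; operationally the tolerances would come from historical interpolation-error statistics:

```python
import numpy as np

def qc_flags(values, coords, bins, tols):
    """Flag a station when it deviates from the neighbor-based estimate
    by more than the tolerance for that intensity interval (sketch)."""
    flags = []
    for i in range(len(values)):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.where(d > 0, 1.0 / np.maximum(d, 1e-9) ** 2, 0.0)  # leave-one-out IDW
        est = w @ values / w.sum()
        tol = tols[np.searchsorted(bins, est, side="right")]
        flags.append(abs(values[i] - est) > tol)
    return np.array(flags)

# Hypothetical network: four consistent stations and one gross outlier (mm/day)
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [3.0, 3.0]])
values = np.array([5.0, 6.0, 5.0, 4.0, 60.0])
bins = np.array([1.0, 10.0])          # dry / moderate / heavy intensity intervals
tols = np.array([2.0, 8.0, 30.0])     # assumed tolerance per intensity interval
flags = qc_flags(values, coords, bins, tols)
```

Only the gross outlier exceeds its tolerance, so human inspection is needed for one station rather than five.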
A regression-kriging model for estimation of rainfall in the Laohahe basin
NASA Astrophysics Data System (ADS)
Wang, Hong; Ren, Li L.; Liu, Gao H.
2009-10-01
This paper presents a multivariate geostatistical algorithm called regression-kriging (RK) for predicting the spatial distribution of rainfall by incorporating five topographic/geographic factors: latitude, longitude, altitude, slope and aspect. The technique is illustrated using rainfall data collected at 52 rain gauges in the Laohahe basin in northeast China during 1986-2005. Rainfall data from 44 stations were selected for modeling and the remaining 8 stations were used for model validation. To eliminate multicollinearity, the five explanatory factors were first transformed using factor analysis, with three principal components (PCs) extracted. The rainfall data were then fitted using step-wise regression and the residuals interpolated using simple kriging (SK). The regression coefficients were estimated by generalized least squares (GLS), which takes the spatial heteroskedasticity between rainfall and the PCs into account. Finally, rainfall prediction based on RK was compared with that predicted from ordinary kriging (OK) and ordinary least squares (OLS) multiple regression (MR). Because correlated topographic factors are taken into account, RK improves the efficiency of the predictions. RK achieved a lower relative root mean square error (RMSE) (44.67%) than MR (49.23%) and OK (73.60%) and a lower bias than MR and OK (23.82 versus 30.89 and 32.15 mm) for annual rainfall. It is much more effective for the wet season than for the dry season. RK is suitable for estimation of rainfall in areas where there are no stations nearby and where topography has a major influence on rainfall.
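The two-stage structure described above (a regression trend plus kriged residuals) can be sketched in numpy. For brevity the sketch uses OLS rather than GLS for the trend, skips the factor-analysis step, and assumes a fixed exponential covariance for the residuals; the covariance range and sill are hypothetical, not fitted to a variogram:

```python
import numpy as np

def regression_kriging(X, coords, y, X0, coords0, rng_dist=2.0, sill=1.0):
    """Predict at new sites as regression trend + simple kriging of residuals."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)     # trend (OLS stand-in for GLS)
    resid = y - A @ beta

    def cov(p, q):  # exponential covariance between two sets of sites
        d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=2)
        return sill * np.exp(-d / rng_dist)

    K = cov(coords, coords) + 1e-8 * np.eye(len(y))  # tiny nugget for stability
    w = np.linalg.solve(K, cov(coords, coords0))     # simple kriging weights
    trend0 = np.column_stack([np.ones(len(X0)), X0]) @ beta
    return trend0 + w.T @ resid

# Hypothetical gauges: rainfall depends linearly on one topographic covariate
rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, (44, 2))     # 44 modeling stations
X = coords[:, 0] * 0.1                   # stand-in covariate (e.g. altitude)
y = 300.0 + 50.0 * X                     # annual rainfall, mm (noise-free demo)
coords0 = rng.uniform(0, 10, (8, 2))     # 8 validation sites
X0 = coords0[:, 0] * 0.1
pred = regression_kriging(X, coords, y, X0, coords0)
```

When the trend explains the data exactly, the kriged residual term vanishes and RK reduces to the regression prediction; with real data the residual term corrects the trend locally.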
A spectral-spatial-dynamic hierarchical Bayesian (SSD-HB) model for estimating soybean yield
NASA Astrophysics Data System (ADS)
Kazama, Yoriko; Kujirai, Toshihiro
2014-10-01
A method called a "spectral-spatial-dynamic hierarchical-Bayesian (SSD-HB) model," which can deal with many parameters (such as spectral and weather information all together) by reducing the occurrence of multicollinearity, is proposed. Experiments conducted on soybean yields in Brazil fields with a RapidEye satellite image indicate that the proposed SSD-HB model can predict soybean yield with a higher degree of accuracy than other estimation methods commonly used in remote-sensing applications. In the case of the SSD-HB model, the mean absolute error between estimated yield of the target area and actual yield is 0.28 t/ha, compared to 0.34 t/ha when conventional PLS regression was applied, showing the potential effectiveness of the proposed model.
Use of machine learning methods to reduce predictive error of groundwater models.
Xu, Tianfang; Valocchi, Albert J; Choi, Jaesik; Amir, Eyal
2014-01-01
Quantitative analyses of groundwater flow and transport typically rely on a physically-based model, which is inherently subject to error. Errors in model structure, parameters, and data lead to both random and systematic error even in the output of a calibrated model. We develop complementary data-driven models (DDMs) to reduce the predictive error of physically-based groundwater models. Two machine learning techniques, instance-based weighting and support vector regression, are used to build the DDMs. This approach is illustrated using two real-world case studies of the Republican River Compact Administration model and the Spokane Valley-Rathdrum Prairie model. The two groundwater models have different hydrogeologic settings, parameterization, and calibration methods. In the first case study, cluster analysis is introduced for data preprocessing to make the DDMs more robust and computationally efficient. The DDMs reduce the root-mean-square error (RMSE) of the temporal, spatial, and spatiotemporal prediction of piezometric head of the groundwater model by 82%, 60%, and 48%, respectively. In the second case study, the DDMs reduce the RMSE of the temporal prediction of piezometric head of the groundwater model by 77%. It is further demonstrated that the effectiveness of the DDMs depends on the existence and extent of the structure in the error of the physically-based model. © 2013, National GroundWater Association.
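One of the two techniques named above, instance-based weighting, can be sketched as a distance-weighted nearest-neighbor model of the physical model's error: learn the error on training instances, predict it at new inputs, and add the predicted error back to the physical model's output. The groundwater "model" and its systematic bias below are synthetic stand-ins, not either case-study model:

```python
import numpy as np

def knn_error_model(X_train, err_train, X_new, k=5):
    """Predict the physical model's error at new inputs as a
    distance-weighted average over the k nearest training instances."""
    out = []
    for xn in X_new:
        d = np.linalg.norm(X_train - xn, axis=1)
        idx = np.argsort(d)[:k]
        w = 1.0 / (d[idx] + 1e-9)
        out.append(w @ err_train[idx] / w.sum())
    return np.array(out)

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, (200, 2))                     # model inputs (e.g. stresses)
truth = np.sin(2 * np.pi * X[:, 0]) + X[:, 1]       # "observed" heads
physical = truth - 0.5 * X[:, 0] ** 2               # biased physical model output
error = truth - physical                            # structured model error

tr, te = slice(0, 150), slice(150, 200)
corr = physical[te] + knn_error_model(X[tr], error[tr], X[te])

rmse_raw = np.sqrt(np.mean((truth[te] - physical[te]) ** 2))
rmse_corr = np.sqrt(np.mean((truth[te] - corr) ** 2))
```

The correction only helps because the error has structure in the inputs; if the error were pure noise, the data-driven model would learn nothing, which matches the abstract's closing observation.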
NASA Astrophysics Data System (ADS)
Petković, Dalibor; Shamshirband, Shahaboddin; Saboohi, Hadi; Ang, Tan Fong; Anuar, Nor Badrul; Rahman, Zulkanain Abdul; Pavlović, Nenad T.
2014-07-01
The quantitative assessment of image quality is an important consideration in any type of imaging system. The modulation transfer function (MTF) is a graphical description of the sharpness and contrast of an imaging system or of its individual components; it is also known as the spatial frequency response. The MTF curve has different meanings according to the corresponding frequency. The MTF of an optical system specifies the contrast transmitted by the system as a function of image size, and is determined by the inherent optical properties of the system. In this study, the polynomial and radial basis function (RBF) kernels are applied in Support Vector Regression (SVR) to estimate and predict the MTF value of the actual optical system according to experimental tests. Instead of minimizing the observed training error, SVR_poly and SVR_rbf attempt to minimize the generalization error bound so as to achieve generalized performance. The experimental results show that an improvement in predictive accuracy and capability of generalization can be achieved by the SVR_rbf approach compared to the SVR_poly soft-computing methodology.
Local indicators of geocoding accuracy (LIGA): theory and application
Jacquez, Geoffrey M; Rommel, Robert
2009-01-01
Background Although sources of positional error in geographic locations (e.g. geocoding error) used for describing and modeling spatial patterns are widely acknowledged, research on how such error impacts the statistical results has been limited. In this paper we explore techniques for quantifying the perturbability of spatial weights to different specifications of positional error. Results We find that a family of curves describes the relationship between perturbability and positional error, and use these curves to evaluate sensitivity of alternative spatial weight specifications to positional error both globally (when all locations are considered simultaneously) and locally (to identify those locations that would benefit most from increased geocoding accuracy). We evaluate the approach in simulation studies, and demonstrate it using a case-control study of bladder cancer in south-eastern Michigan. Conclusion Three results are significant. First, the shape of the probability distributions of positional error (e.g. circular, elliptical, cross) has little impact on the perturbability of spatial weights, which instead depends on the mean positional error. Second, our methodology allows researchers to evaluate the sensitivity of spatial statistics to positional accuracy for specific geographies. This has substantial practical implications since it makes possible routine sensitivity analysis of spatial statistics to positional error arising in geocoded street addresses, global positioning systems, LIDAR and other geographic data. Third, those locations with high perturbability (most sensitive to positional error) and high leverage (that contribute the most to the spatial weight being considered) will benefit the most from increased positional accuracy. These are rapidly identified using a new visualization tool we call the LIGA scatterplot. 
Herein lies a paradox for spatial analysis: for a given level of positional error, increasing sample density to more accurately follow the underlying population distribution increases perturbability and introduces error into the spatial weights matrix. In some studies positional error may not impact the statistical results; in others it might invalidate the results. We therefore must understand the relationships between positional accuracy and the perturbability of the spatial weights in order to have confidence in a study's results. PMID:19863795
Sajjadi, Seyed Ali; Zolfaghari, Ghasem; Adab, Hamed; Allahabadi, Ahmad; Delsouz, Mehri
2017-01-01
This paper presents the levels of PM2.5 and PM10 at different stations in the city of Sabzevar, Iran. Furthermore, this study evaluates spatial interpolation methods for determining PM2.5 and PM10 concentrations in the city. Particulate matter was measured by a Haz-Dust EPAM at 48 stations. Then, four interpolation models, including Radial Basis Functions (RBF), Inverse Distance Weighting (IDW), Ordinary Kriging (OK), and Universal Kriging (UK), were used to investigate the status of air pollution in the city. Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE) were employed to compare the four models. The results showed that PM2.5 concentrations at the stations were between 10 and 500 μg/m³, and PM10 concentrations for all 48 stations ranged from 20 to 1500 μg/m³. The concentrations obtained over the nine-month period exceeded the standard limits. The values of MAPE, RMSE, MBE, and MAE differed among methods. The results indicated that the MAPE of the IDW method was lower than that of the other methods (41.05 for PM2.5 and 25.89 for PM10). The best interpolation method for particulate matter (PM2.5 and PM10) therefore appeared to be IDW. •PM10 and PM2.5 concentration measurements were performed during the warm period of 2016, a risky time in terms of particulate matter.•Concentrations of PM2.5 and PM10 were measured by a monitoring device, the environmental dust model Haz-Dust EPAM 5000.•Interpolation is used to convert data from observation points to continuous fields, allowing spatial patterns sampled by these measurements to be compared with spatial patterns of other spatial entities.
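The IDW scheme and the three error metrics the abstract compares can be sketched in a few lines of numpy, here validated by leave-one-out cross-validation on a synthetic 48-station pollution field (the data and the power parameter are illustrative assumptions, not the study's measurements):

```python
import numpy as np

def idw(xy_obs, z_obs, xy_query, power=2.0, eps=1e-12):
    # inverse distance weighted interpolation of point observations
    d = np.linalg.norm(xy_query[:, None, :] - xy_obs[None, :, :], axis=-1)
    w = 1.0 / (d ** power + eps)
    return (w * z_obs).sum(axis=1) / w.sum(axis=1)

def rmse(a, b): return float(np.sqrt(np.mean((a - b) ** 2)))
def mae(a, b):  return float(np.mean(np.abs(a - b)))
def mape(a, b): return float(100 * np.mean(np.abs((a - b) / b)))

# synthetic pollution field: smooth gradient plus one hotspot
rng = np.random.default_rng(1)
xy = rng.uniform(0, 10, (48, 2))                       # 48 "stations"
z = 50 + 5 * xy[:, 0] + 80 * np.exp(-((xy - 5) ** 2).sum(1) / 4)

# leave-one-out cross-validation of the IDW surface
pred = np.array([idw(np.delete(xy, i, 0), np.delete(z, i), xy[i:i + 1])[0]
                 for i in range(len(z))])
print(rmse(pred, z), mae(pred, z), mape(pred, z))
```

The same cross-validated RMSE/MAE/MAPE comparison would apply unchanged to kriging predictors; only the `idw` call swaps out.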
Accounting for and predicting the influence of spatial autocorrelation in water quality modeling
NASA Astrophysics Data System (ADS)
Miralha, L.; Kim, D.
2017-12-01
Although many studies have attempted to investigate the spatial trends of water quality, more attention is yet to be paid to the consequences of considering and ignoring the spatial autocorrelation (SAC) that exists in water quality parameters. Several studies have mentioned the importance of accounting for SAC in water quality modeling, as well as the differences in outcomes between models that account for and ignore SAC. However, the capacity to predict the magnitude of such differences is still ambiguous. In this study, we hypothesized that the SAC inherently possessed by a response variable (i.e., water quality parameter) influences the outcomes of spatial modeling. We evaluated whether the level of inherent SAC is associated with changes in R², Akaike Information Criterion (AIC), and residual SAC (rSAC) after accounting for SAC during the modeling procedure. The main objective was to analyze whether water quality parameters with higher Moran's I values (an inherent SAC measure) undergo a greater increase in R² and a greater reduction in both AIC and rSAC. We compared a non-spatial model (OLS) to two spatial regression approaches (spatial lag and error models). Predictor variables were the principal components of topographic (elevation and slope), land cover, and hydrological soil group variables. We acquired these data from federal online sources (e.g. USGS). Ten watersheds were selected, each in a different state of the USA. Results revealed that water quality parameters with higher inherent SAC showed a substantial increase in R² and decrease in rSAC after performing spatial regressions. However, AIC values did not show significant changes. Overall, the higher the level of inherent SAC in water quality variables, the greater the improvement in model performance. This indicates a linear and direct relationship between the spatial model outcomes (R² and rSAC) and the degree of SAC in each water quality variable. 
Therefore, our study suggests that the inherent level of SAC in response variables can predict improvements in models even before performing spatial regression approaches. We also recognize the constraints of this research and suggest that further studies focus on better ways of defining spatial neighborhoods, considering the differences between stations located in nearby tributaries and those in upstream areas.
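Moran's I, the inherent-SAC measure this abstract leans on, is straightforward to compute from a spatial weights matrix. A minimal numpy sketch using a row-standardised 1-D adjacency structure (the chain-of-stations layout is an illustrative assumption):

```python
import numpy as np

def morans_i(z, W):
    # global Moran's I of values z under spatial weights matrix W
    z = np.asarray(z, float) - np.mean(z)
    n, s0 = len(z), W.sum()
    return (n / s0) * (z @ W @ z) / (z @ z)

# 1-D chain of stations: neighbours are adjacent sites, rows standardised
n = 50
W = np.zeros((n, n))
idx = np.arange(n - 1)
W[idx, idx + 1] = W[idx + 1, idx] = 1.0
W /= W.sum(axis=1, keepdims=True)

rng = np.random.default_rng(2)
smooth = np.cumsum(rng.normal(size=n))      # spatially autocorrelated signal
noise = rng.normal(size=n)                  # spatially independent signal
i_smooth = morans_i(smooth, W)
i_noise = morans_i(noise, W)
print(i_smooth, i_noise)
```

The autocorrelated series yields I near 1, the independent one near the null expectation of -1/(n-1), which is the contrast the study exploits: response variables with high I are the ones predicted to gain most from a spatial lag or error model.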
NASA Astrophysics Data System (ADS)
Szymanowski, Mariusz; Kryza, Maciej
2017-02-01
Our study examines the role of auxiliary variables in the process of spatial modelling and mapping of climatological elements, with air temperature in Poland used as an example. Multivariable algorithms are the most frequently applied for spatialization of air temperature, and their results in many studies have proved better than those obtained by various one-dimensional techniques. In most of the previous studies, two main strategies were used to perform multidimensional spatial interpolation of air temperature. First, it was accepted that all variables significantly correlated with air temperature should be incorporated into the model. Second, it was assumed that the more the spatial variation of air temperature was deterministically explained, the better the quality of spatial interpolation. The main goal of the paper was to examine both above-mentioned assumptions. The analysis was performed using data from 250 meteorological stations and for 69 air temperature cases aggregated at different levels: from daily means to a 10-year annual mean. Two cases were considered for detailed analysis. The set of potential auxiliary variables covered 11 environmental predictors of air temperature. Another purpose of the study was to compare the results of interpolation given by various multivariable methods using the same set of explanatory variables. Two regression models, multiple linear regression (MLR) and geographically weighted regression (GWR), as well as their extensions to regression-kriging form (MLRK and GWRK, respectively), were examined. Stepwise regression was used to select variables for the individual models, and cross-validation was used to validate the results, with special attention paid to statistically significant improvement of the model using the mean absolute error (MAE) criterion. The main results of this study led to rejection of both assumptions considered. 
Usually, including more than two or three of the most significantly correlated auxiliary variables does not improve the quality of the spatial model. The effects of introducing certain variables into the model were not climatologically justified and appeared on maps as unexpected and undesired artefacts. The results confirm, in accordance with previous studies, that in the case of air temperature distribution the spatial process is non-stationary; thus, the local GWR model performs better than the global MLR if they are specified using the same set of auxiliary variables. If only GWR residuals are autocorrelated, the geographically weighted regression-kriging (GWRK) model seems to be optimal for air temperature spatial interpolation.
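The local-beats-global finding (GWR over MLR under non-stationarity) can be reproduced with a toy experiment: local weighted least squares with a Gaussian distance kernel against a single global OLS fit, on synthetic data whose regression coefficient drifts across the domain. Everything here (the drifting lapse rate, bandwidth `bw`, station layout) is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
coords = rng.uniform(0, 10, (n, 2))
elev = rng.uniform(0.0, 1.0, n)
# the elevation coefficient drifts across the domain -> non-stationary process
slope_local = 2.0 + 0.5 * coords[:, 0]
temp = 20.0 - slope_local * elev + 0.1 * rng.normal(size=n)

train, test = np.arange(0, 250), np.arange(250, n)
Xd = np.column_stack([np.ones(n), elev])

def ols_predict(Xd_tr, y_tr, Xd_te):
    # global multiple linear regression (MLR analogue)
    beta, *_ = np.linalg.lstsq(Xd_tr, y_tr, rcond=None)
    return Xd_te @ beta

def gwr_predict(coords_tr, Xd_tr, y_tr, coords_te, Xd_te, bw=1.5):
    # GWR analogue: weighted least squares refit at each prediction site
    out = np.empty(len(Xd_te))
    for i, (c, xq) in enumerate(zip(coords_te, Xd_te)):
        w = np.exp(-((coords_tr - c) ** 2).sum(1) / (2 * bw ** 2))
        A = Xd_tr.T @ (Xd_tr * w[:, None])
        b = Xd_tr.T @ (w * y_tr)
        out[i] = xq @ np.linalg.solve(A, b)
    return out

p_ols = ols_predict(Xd[train], temp[train], Xd[test])
p_gwr = gwr_predict(coords[train], Xd[train], temp[train],
                    coords[test], Xd[test])
err = lambda p: float(np.sqrt(np.mean((temp[test] - p) ** 2)))
rmse_ols, rmse_gwr = err(p_ols), err(p_gwr)
print(rmse_ols, rmse_gwr)
```

Because the global model can only fit the average coefficient, its hold-out RMSE is dominated by the spatial drift that the local fits absorb.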
NASA Technical Reports Server (NTRS)
Dong, D.; Fang, P.; Bock, F.; Webb, F.; Prawirondirdjo, L.; Kedar, S.; Jamason, P.
2006-01-01
Spatial filtering is an effective way to improve the precision of coordinate time series for regional GPS networks by reducing so-called common mode errors, thereby providing better resolution for detecting weak or transient deformation signals. The commonly used approach to regional filtering assumes that the common mode error is spatially uniform, which is a good approximation for networks of hundreds of kilometers extent, but breaks down as the spatial extent increases. A more rigorous approach should remove the assumption of spatially uniform distribution and let the data themselves reveal the spatial distribution of the common mode error. Principal component analysis (PCA) and the Karhunen-Loeve expansion (KLE) both decompose network time series into a set of temporally varying modes and their spatial responses; they therefore provide a mathematical framework for spatiotemporal filtering. We apply the combination of PCA and KLE to daily station coordinate time series of the Southern California Integrated GPS Network (SCIGN) for the period 2000 to 2004. We demonstrate that spatially and temporally correlated common mode errors are the dominant error source in daily GPS solutions. The spatial characteristics of the common mode errors are close to uniform for the east, north, and vertical components alike, which implies a very long wavelength source for the common mode errors compared to the spatial extent of the GPS network in southern California. Furthermore, the common mode errors exhibit temporally nonrandom patterns.
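The PCA-based filtering step can be sketched with an SVD on a synthetic network: a shared common-mode signal plus station-specific noise, with the leading mode removed. The network size, noise level, and random-walk common mode are illustrative assumptions, not SCIGN data:

```python
import numpy as np

rng = np.random.default_rng(4)
n_days, n_sta = 300, 20
common = np.cumsum(rng.normal(size=n_days))          # common-mode error signal
local = 0.3 * rng.normal(size=(n_days, n_sta))       # station-specific noise
series = common[:, None] + local                     # network coordinate residuals

centered = series - series.mean(axis=0)
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
mode1 = np.outer(U[:, 0] * s[0], Vt[0])              # leading PCA mode
filtered = centered - mode1                          # spatiotemporal filtering

print(centered.std(), filtered.std())
```

Removing the first mode collapses the scatter to the station-noise floor, and the near-uniform entries of `Vt[0]` play the role of the paper's close-to-uniform spatial responses; KLE generalizes this by letting the spatial responses be data-derived rather than assumed uniform.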
Bayesian quantitative precipitation forecasts in terms of quantiles
NASA Astrophysics Data System (ADS)
Bentzien, Sabrina; Friederichs, Petra
2014-05-01
Ensemble prediction systems (EPS) for numerical weather predictions on the mesoscale are particularly developed to obtain probabilistic guidance for high impact weather. An EPS issues not just a deterministic future state of the atmosphere but a sample of possible future states. Ensemble postprocessing then translates such a sample of forecasts into probabilistic measures. This study focuses on probabilistic quantitative precipitation forecasts in terms of quantiles. Quantiles are particularly suitable to describe precipitation at various locations, since no assumption is required on the distribution of precipitation. The focus is on prediction during high-impact events, related to the Volkswagen Stiftung funded project WEX-MOP (Mesoscale Weather Extremes - Theory, Spatial Modeling and Prediction). Quantile forecasts are derived from the raw ensemble and via quantile regression. Neighborhood methods and time-lagging are effective tools to inexpensively increase the ensemble spread, which results in more reliable forecasts, especially for extreme precipitation events. Since an EPS provides a large number of potentially informative predictors, variable selection is required in order to obtain a stable statistical model. A Bayesian formulation of quantile regression allows for inference about the selection of predictive covariates through the use of appropriate prior distributions. Moreover, the implementation of an additional process layer for the regression parameters accounts for spatial variations of the parameters. Bayesian quantile regression and its spatially adaptive extension are illustrated for the German-focused mesoscale weather prediction ensemble COSMO-DE-EPS, which has run (pre)operationally since December 2010 at the German Meteorological Service (DWD). Objective out-of-sample verification uses the quantile score (QS), a weighted absolute error between quantile forecasts and observations. 
The QS is a proper scoring function and can be decomposed into reliability, resolution, and uncertainty parts. A quantile reliability plot gives detailed insight into the predictive performance of the quantile forecasts.
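The quantile score is the mean pinball loss, and its propriety means a calibrated quantile forecast scores no worse than a biased one. A minimal numpy sketch on a skewed, precipitation-like synthetic sample (the gamma distribution and the +3 bias are illustrative assumptions):

```python
import numpy as np

def quantile_score(tau, q_forecast, y_obs):
    # mean pinball loss: the weighted absolute error the QS denotes
    u = y_obs - q_forecast
    return float(np.mean(np.where(u >= 0, tau * u, (tau - 1) * u)))

rng = np.random.default_rng(5)
y = rng.gamma(2.0, 2.0, size=5000)           # skewed, precipitation-like sample
tau = 0.9
q_cal = np.quantile(y, tau)                  # calibrated 90% quantile forecast
qs_cal = quantile_score(tau, q_cal, y)
qs_off = quantile_score(tau, q_cal + 3.0, y) # biased forecast
print(qs_cal, qs_off)
```

Because the sample quantile minimizes the empirical pinball loss, `qs_cal` is strictly smaller than `qs_off`, which is exactly the property that makes the QS usable for objective out-of-sample verification.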
Discovering Communicable Scientific Knowledge from Spatio-Temporal Data
NASA Technical Reports Server (NTRS)
Schwabacher, Mark; Langley, Pat; Norvig, Peter (Technical Monitor)
2001-01-01
This paper describes how we used regression rules to improve upon a result previously published in the Earth science literature. In such a scientific application of machine learning, it is crucially important for the learned models to be understandable and communicable. We recount how we selected a learning algorithm to maximize communicability, and then describe two visualization techniques that we developed to aid in understanding the model by exploiting the spatial nature of the data. We also report how evaluating the learned models across time let us discover an error in the data.
Discovering Communicable Models from Earth Science Data
NASA Technical Reports Server (NTRS)
Schwabacher, Mark; Langley, Pat; Potter, Christopher; Klooster, Steven; Torregrosa, Alicia
2002-01-01
This chapter describes how we used regression rules to improve upon results previously published in the Earth science literature. In such a scientific application of machine learning, it is crucially important for the learned models to be understandable and communicable. We recount how we selected a learning algorithm to maximize communicability, and then describe two visualization techniques that we developed to aid in understanding the model by exploiting the spatial nature of the data. We also report how evaluating the learned models across time let us discover an error in the data.
Quantifying the uncertainty of regional and national estimates of soil carbon stocks
NASA Astrophysics Data System (ADS)
Papritz, Andreas
2013-04-01
At regional and national scales, carbon (C) stocks are frequently estimated by means of regression models. Such statistical models link measurements of carbon stocks, recorded for a set of soil profiles or soil cores, to covariates that characterize soil formation conditions and land management. A prerequisite is that these covariates are available for any location within a region of interest G because they are used along with the fitted regression coefficients to predict the carbon stocks at the nodes of a fine-meshed grid that is laid over G. The mean C stock in G is then estimated by the arithmetic mean of the stock predictions for the grid nodes. Apart from the mean stock, the precision of the estimate is often also of interest, for example to judge whether the mean C stock has changed significantly between two inventories. The standard error of the estimated mean stock in G can be computed from the regression results as well. Two issues are thereby important: (i) How large is the area of G relative to the support of the measurements? (ii) Are the residuals of the regression model spatially auto-correlated, or is the assumption of statistical independence tenable? Both issues are correctly handled if one adopts a geostatistical block kriging approach for estimating the mean C stock within a region and its standard error. In the presentation I shall summarize the main ideas of external drift block kriging. To compute the standard error of the mean stock, one has in principle to sum the elements of a potentially very large covariance matrix of point prediction errors, but I shall show that the required term can be approximated very well by Monte Carlo techniques. I shall further illustrate with a few examples how the standard error of the mean stock estimate changes with the size of G and with the strength of the auto-correlation of the regression residuals. 
As an application a robust variant of block kriging is used to quantify the mean carbon stock stored in the soils of Swiss forests (Nussbaum et al., 2012). Nussbaum, M., Papritz, A., Baltensweiler, A., and Walthert, L. (2012). Organic carbon stocks of swiss forest soils. Final report, Institute of Terrestrial Ecosystems, ETH Zürich and Swiss Federal Institute for Forest, Snow and Landscape Research (WSL), pp. 51, http://e-collection.library.ethz.ch/eserv/eth:6027/eth-6027-01.pdf
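The Monte Carlo shortcut mentioned above — approximating the average of the large point-error covariance matrix by sampling point pairs instead of forming it — can be sketched in numpy. The exponential covariance, grid size, and range parameter are illustrative assumptions:

```python
import numpy as np

def exp_cov(d, sill=1.0, r=5.0):
    # exponential covariance of point prediction errors (illustrative model)
    return sill * np.exp(-d / r)

rng = np.random.default_rng(6)
ax = np.arange(40.0)
grid = np.stack(np.meshgrid(ax, ax), -1).reshape(-1, 2)   # prediction grid over G
n = len(grid)

# exact: variance of the regional mean = average covariance over all point pairs
d_full = np.linalg.norm(grid[:, None, :] - grid[None, :, :], axis=-1)
var_exact = exp_cov(d_full).mean()

# Monte Carlo: average covariance over randomly sampled pairs only
pairs = rng.integers(0, n, size=(200_000, 2))
d_mc = np.linalg.norm(grid[pairs[:, 0]] - grid[pairs[:, 1]], axis=1)
var_mc = exp_cov(d_mc).mean()

print(np.sqrt(var_exact), np.sqrt(var_mc))
```

For this 1600-node grid the full matrix is still computable, which lets the sketch verify that the sampled average lands within a few percent of the exact value; for national grids with millions of nodes only the sampled version remains feasible.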
Troutman, Brent M.
1982-01-01
Errors in runoff prediction caused by input data errors are analyzed by treating precipitation-runoff models as regression (conditional expectation) models. Independent variables of the regression consist of precipitation and other input measurements; the dependent variable is runoff. In models using erroneous input data, prediction errors are inflated and estimates of expected storm runoff for given observed input variables are biased. This bias in expected runoff estimation results in biased parameter estimates if these parameter estimates are obtained by a least squares fit of predicted to observed runoff values. The problems of error inflation and bias are examined in detail for a simple linear regression of runoff on rainfall and for a nonlinear U.S. Geological Survey precipitation-runoff model. Some implications for flood frequency analysis are considered. A case study using a set of data from Turtle Creek near Dallas, Texas, illustrates the problems of model input errors.
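The bias in the simple linear rainfall-runoff case can be reproduced in a few lines: with classical measurement error in the input, the least-squares slope attenuates by the factor var(x)/(var(x)+σ_u²). The data below are synthetic; the 0.6 runoff coefficient and error scales are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20_000
rain_true = rng.gamma(2.0, 10.0, size=n)            # true basin rainfall
runoff = 0.6 * rain_true + rng.normal(0, 2.0, n)    # linear rainfall-runoff response

sigma_u = 8.0                                       # input measurement error
rain_obs = rain_true + rng.normal(0, sigma_u, n)    # erroneous input data

def slope(x, y):
    # least-squares slope of y on x
    x = x - x.mean(); y = y - y.mean()
    return float((x @ y) / (x @ x))

b_true = slope(rain_true, runoff)
b_obs = slope(rain_obs, runoff)
# classical attenuation: E[b_obs] ~= 0.6 * var(rain) / (var(rain) + sigma_u**2)
print(b_true, b_obs)
```

Here var(rain) = 200, so the fitted slope shrinks toward 0.6 · 200/264 ≈ 0.45: a biased parameter estimate obtained, exactly as the abstract describes, by fitting predicted to observed runoff with erroneous inputs.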
Students’ Errors in Geometry Viewed from Spatial Intelligence
NASA Astrophysics Data System (ADS)
Riastuti, N.; Mardiyana, M.; Pramudya, I.
2017-09-01
Geometry is one of the difficult topics because students must be able to visualize, describe images, draw shapes, and recognize kinds of shapes. The aim of this study is to describe student errors, based on Newman's Error Analysis, in solving geometry problems, viewed from spatial intelligence. This research uses a descriptive qualitative method with a purposive sampling technique. The data in this research are the results of a geometry material test and interviews with 8th graders of a junior high school in Indonesia. The results of this study show that each category of spatial intelligence exhibits a different type of error in solving geometry problems. Errors are mostly made by students with low spatial intelligence because of their deficiencies in visual abilities. Analysis of student errors viewed from spatial intelligence is expected to help students reflect when solving geometry problems.
Field Scale Spatial Modelling of Surface Soil Quality Attributes in Controlled Traffic Farming
NASA Astrophysics Data System (ADS)
Guenette, Kris; Hernandez-Ramirez, Guillermo
2017-04-01
The employment of controlled traffic farming (CTF) can yield improvements to soil quality attributes through the confinement of equipment traffic to tramlines within the field. There is a need to quantify and explain the spatial heterogeneity of soil quality attributes affected by CTF to further improve our understanding and modelling ability of field scale soil dynamics. Soil properties such as available nitrogen (AN), pH, soil total nitrogen (STN), soil organic carbon (SOC), bulk density, macroporosity, soil quality S-Index, plant available water capacity (PAWC), and unsaturated hydraulic conductivity (Km) were analysed and compared among trafficked and un-trafficked areas. We contrasted standard geostatistical methods such as ordinary kriging (OK) and covariate kriging (COK), as well as the hybrid method of regression kriging (ROK), to predict the spatial distribution of soil properties across two annual cropland sites actively employing CTF in Alberta, Canada. Field scale variability was quantified more accurately through the inclusion of covariates; however, the use of ROK was shown to improve model accuracy despite the regression model composition limiting the robustness of the ROK method. The exclusion of traffic from the un-trafficked areas displayed significant improvements to bulk density, macroporosity, and Km while subsequently enhancing AN, STN, and SOC. The ability of the regression models and the ROK method to account for spatial trends led to the highest goodness-of-fit and lowest error achieved for the soil physical properties, as the rigid traffic regime of CTF altered their spatial distribution at the field scale. Conversely, the COK method produced the most optimal predictions for the soil nutrient properties and Km. The use of terrain covariates derived from light detection and ranging (LiDAR), such as elevation and topographic position index (TPI), yielded the best models in the COK method at the field scale.
NASA Astrophysics Data System (ADS)
Park, Seonyoung; Im, Jungho; Park, Sumin; Rhee, Jinyoung
2017-04-01
Soil moisture is one of the most important keys for understanding regional and global climate systems. Soil moisture is directly related to agricultural processes as well as hydrological processes because soil moisture highly influences vegetation growth and determines water supply in the agroecosystem. Accurate monitoring of the spatiotemporal pattern of soil moisture is important. Soil moisture has generally been provided through in situ measurements at stations. Although field survey from in situ measurements provides accurate soil moisture with high temporal resolution, it is costly and does not provide the spatial distribution of soil moisture over large areas. Microwave satellite-based approaches (e.g., the Advanced Microwave Scanning Radiometer on the Earth Observing System (AMSR2), the Advanced Scatterometer (ASCAT), and Soil Moisture Active Passive (SMAP)) and numerical models such as the Global Land Data Assimilation System (GLDAS) and Modern-Era Retrospective Analysis for Research and Applications (MERRA) provide spatiotemporally continuous soil moisture products at global scale. However, since those global soil moisture products have coarse spatial resolution (~25-40 km), their applications for agriculture and water resources at local and regional scales are very limited. Thus, soil moisture downscaling is needed to overcome the limitation of the spatial resolution of soil moisture products. In this study, GLDAS soil moisture data were downscaled to 1 km spatial resolution through the integration of AMSR2 and ASCAT soil moisture data, the Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM), and Moderate Resolution Imaging Spectroradiometer (MODIS) data—Land Surface Temperature, Normalized Difference Vegetation Index, and Land Cover—using modified regression trees over East Asia from 2013 to 2015. Modified regression trees were implemented using Cubist, a commercial software tool based on machine learning. 
An optimization based on pruning of rules derived from the modified regression trees was conducted. Root Mean Square Error (RMSE) and correlation coefficients (r) were used to optimize the rules, and finally 59 rules from the modified regression trees were produced. The results show high validation r (0.79) and low validation RMSE (0.0556 m³/m³). The 1 km downscaled soil moisture was evaluated using ground soil moisture data at 14 stations, and both soil moisture datasets showed similar temporal patterns (average r=0.51 and average RMSE=0.041). The spatial distribution of the 1 km downscaled soil moisture corresponded well with GLDAS soil moisture, capturing both extremely dry and wet regions. Correlation between GLDAS and the 1 km downscaled soil moisture during the growing season was positive (mean r=0.35) in most regions.
Spatial Support Vector Regression to Detect Silent Errors in the Exascale Era
DOE Office of Scientific and Technical Information (OSTI.GOV)
Subasi, Omer; Di, Sheng; Bautista-Gomez, Leonardo
As the exascale era approaches, the increasing capacity of high-performance computing (HPC) systems with targeted power and energy budget goals introduces significant challenges in reliability. Silent data corruptions (SDCs), or silent errors, are one of the major sources that corrupt the execution results of HPC applications without being detected. In this work, we explore a low-memory-overhead SDC detector, by leveraging epsilon-insensitive support vector machine regression, to detect SDCs occurring in HPC applications that can be characterized by an impact error bound. The key contributions are threefold. (1) Our design takes spatial features (i.e., neighbouring data values for each data point in a snapshot) into the training data, such that little memory overhead (less than 1%) is introduced. (2) We provide an in-depth study of the detection ability and performance with different parameters, and we optimize the detection range carefully. (3) Experiments with eight real-world HPC applications show that our detector can achieve detection sensitivity (i.e., recall) up to 99% while suffering a false positive rate of less than 1% in most cases. Our detector incurs low performance overhead, 5% on average, for all benchmarks studied in the paper. Compared with other state-of-the-art techniques, our detector exhibits the best tradeoff considering detection ability and overheads.
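The core idea — predict each data point from its spatial neighbours and flag residuals beyond an error bound — can be sketched without the SVR machinery. The snippet below substitutes a two-neighbour linear stencil for the paper's epsilon-insensitive SVR predictor; the smooth field, the injected spike, and `epsilon` are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(8)
field = np.sin(np.linspace(0, 6 * np.pi, 1000)) + 0.001 * rng.normal(size=1000)

# inject one silent data corruption (a bit-flip-like spike)
corrupt = field.copy()
corrupt[400] += 0.5

# predict each interior point from its two spatial neighbours
# (a linear stencil standing in for the epsilon-insensitive SVR predictor)
pred = 0.5 * (corrupt[:-2] + corrupt[2:])
resid = np.abs(corrupt[1:-1] - pred)

epsilon = 0.05                       # impact error bound for detection
flagged = np.where(resid > epsilon)[0] + 1   # +1 maps back to point indices
print(flagged)
```

Because the corrupted value also enters its neighbours' predictions, the spike is flagged at three adjacent indices; on a smooth HPC snapshot the residual floor stays far below `epsilon`, which is why such detectors can keep false positives low.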
Do hospitals respond to rivals' quality and efficiency? A spatial panel econometric analysis.
Longo, Francesco; Siciliani, Luigi; Gravelle, Hugh; Santos, Rita
2017-09-01
We investigate whether hospitals in the English National Health Service change their quality or efficiency in response to changes in quality or efficiency of neighbouring hospitals. We first provide a theoretical model that predicts that a hospital will not respond to changes in the efficiency of its rivals but may change its quality or efficiency in response to changes in the quality of rivals, though the direction of the response is ambiguous. We use data on eight quality measures (including mortality, emergency readmissions, patient reported outcome, and patient satisfaction) and six efficiency measures (including bed occupancy, cancelled operations, and costs) for public hospitals between 2010/11 and 2013/14 to estimate both spatial cross-sectional and spatial fixed- and random-effects panel data models. We find that although quality and efficiency measures are unconditionally spatially correlated, the spatial regression models suggest that a hospital's quality or efficiency does not respond to its rivals' quality or efficiency, except for a hospital's overall mortality that is positively associated with that of its rivals. The results are robust to allowing for spatially correlated covariates and errors and to instrumenting rivals' quality and efficiency. Copyright © 2017 John Wiley & Sons, Ltd.
Insaf, Tabassum Z; Talbot, Thomas
2016-07-01
To assess the geographic distribution of Low Birth Weight (LBW) in New York State among singleton births using a spatial regression approach in order to identify priority areas for public health actions. LBW was defined as birth weight less than 2500 g. Geocoded data from 562,586 birth certificates in New York State (years 2008-2012) were merged with 2010 census data at the tract level. To provide stable estimates and maintain confidentiality, data were aggregated to yield 1268 areas of analysis. LBW prevalence among singleton births was related with area-level behavioral, socioeconomic and demographic characteristics using a Poisson mixed effects spatial error regression model. Observed low birth weight showed statistically significant auto-correlation in our study area (Moran's I = 0.16, p = 0.0005). After over-dispersion correction and accounting for fixed effects for selected social determinants, spatial autocorrelation was fully accounted for (Moran's I = -0.007, p = 0.241). The proportion of LBW was higher in areas with larger Hispanic or Black populations and high smoking prevalence. Smoothed maps with predicted prevalence were developed to identify areas at high risk of LBW. Spatial patterns of residual variation were analyzed to identify unique risk factors. Neighborhood racial composition contributes to disparities in LBW prevalence beyond differences in behavioral and socioeconomic factors. Small-area analyses of LBW can identify areas for targeted interventions and display unique local patterns that should be accounted for in prevention strategies. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Regression-assisted deconvolution.
McIntyre, Julie; Stefanski, Leonard A
2011-06-30
We present a semi-parametric deconvolution estimator for the density function of a random variable X that is measured with error, a common challenge in many epidemiological studies. Traditional deconvolution estimators rely only on assumptions about the distribution of X and the error in its measurement, and ignore information available in auxiliary variables. Our method assumes the availability of a covariate vector statistically related to X by a mean-variance function regression model, where regression errors are normally distributed and independent of the measurement errors. Simulations suggest that the estimator achieves a much lower integrated squared error than the observed-data kernel density estimator when models are correctly specified and the assumption of normal regression errors is met. We illustrate the method using anthropometric measurements of newborns to estimate the density function of newborn length. Copyright © 2011 John Wiley & Sons, Ltd.
Chang, Howard H; Hu, Xuefei; Liu, Yang
2014-07-01
There has been a growing interest in the use of satellite-retrieved aerosol optical depth (AOD) to estimate ambient concentrations of PM2.5 (particulate matter <2.5 μm in aerodynamic diameter). With their broad spatial coverage, satellite data can increase the spatial-temporal availability of air quality data beyond ground monitoring measurements and potentially improve exposure assessment for population-based health studies. This paper describes a statistical downscaling approach that brings together (1) recent advances in PM2.5 land use regression models utilizing AOD and (2) statistical data fusion techniques for combining air quality data sets that have different spatial resolutions. Statistical downscaling assumes the associations between AOD and PM2.5 concentrations to be spatially and temporally dependent and offers two key advantages. First, it enables us to use gridded AOD data to predict PM2.5 concentrations at spatial point locations. Second, the unified hierarchical framework provides straightforward uncertainty quantification in the predicted PM2.5 concentrations. The proposed methodology is applied to a data set of daily AOD values in southeastern United States during the period 2003-2005. Via cross-validation experiments, our model had an out-of-sample prediction R² of 0.78 and a root mean-squared error (RMSE) of 3.61 μg/m³ between observed and predicted daily PM2.5 concentrations. This corresponds to a 10% decrease in RMSE compared with the same land use regression model without AOD as a predictor. Prediction performances of spatial-temporal interpolations to locations and on days without monitoring PM2.5 measurements were also examined.
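The out-of-sample R² and RMSE used in that cross-validation are computed as below; the synthetic "observed" and "predicted" PM2.5 values are illustrative assumptions chosen so the error scale roughly echoes the abstract's 3.61 μg/m³, not the study's data:

```python
import numpy as np

def oos_r2_rmse(y_obs, y_pred):
    # out-of-sample R^2 and RMSE between observed and predicted values
    ss_res = np.sum((y_obs - y_pred) ** 2)
    ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    rmse = float(np.sqrt(np.mean((y_obs - y_pred) ** 2)))
    return float(r2), rmse

rng = np.random.default_rng(9)
pm_obs = np.clip(rng.normal(15, 7, 500), 1, None)   # synthetic daily PM2.5, μg/m³
pm_pred = pm_obs + rng.normal(0, 3.6, 500)          # predictions with ~3.6 μg/m³ error
r2, rmse = oos_r2_rmse(pm_obs, pm_pred)
print(r2, rmse)
```

Note that out-of-sample R² is computed against held-out observations and can go negative if predictions are worse than the observed mean, which is why it is the honest counterpart of the in-sample fit statistic.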
Canceling the momentum in a phase-shifting algorithm to eliminate spatially uniform errors.
Hibino, Kenichi; Kim, Yangjin
2016-08-10
In phase-shifting interferometry, phase modulation nonlinearity causes both spatially uniform and nonuniform errors in the measured phase. Conventional linear-detuning error-compensating algorithms only eliminate the spatially variable error component. The uniform error is proportional to the inertial momentum of the data-sampling weight of a phase-shifting algorithm. This paper proposes a design approach to cancel the momentum by using characteristic polynomials in the Z-transform space and shows that an arbitrary M-frame algorithm can be modified to a new (M+2)-frame algorithm that acquires new symmetry to eliminate the uniform error.
A regularization corrected score method for nonlinear regression models with covariate error.
Zucker, David M; Gorfine, Malka; Li, Yi; Tadesse, Mahlet G; Spiegelman, Donna
2013-03-01
Many regression analyses involve explanatory variables that are measured with error, and failing to account for this error is well known to lead to biased point and interval estimates of the regression coefficients. We present here a new general method for adjusting for covariate error. Our method consists of an approximate version of the Stefanski-Nakamura corrected score approach, using the method of regularization to obtain an approximate solution of the relevant integral equation. We develop the theory in the setting of classical likelihood models; this setting covers, for example, linear regression, nonlinear regression, logistic regression, and Poisson regression. The method is extremely general in terms of the types of measurement error models covered, and is a functional method in the sense of not involving assumptions on the distribution of the true covariate. We discuss the theoretical properties of the method and present simulation results in the logistic regression setting (univariate and multivariate). For illustration, we apply the method to data from the Harvard Nurses' Health Study concerning the relationship between physical activity and breast cancer mortality in the period following a diagnosis of breast cancer. Copyright © 2013, The International Biometric Society.
NASA Astrophysics Data System (ADS)
Brokamp, Cole; Jandarov, Roman; Rao, M. B.; LeMasters, Grace; Ryan, Patrick
2017-02-01
Exposure assessment for elemental components of particulate matter (PM) using land use modeling is a complex problem due to the high spatial and temporal variations in pollutant concentrations at the local scale. Land use regression (LUR) models may fail to capture complex interactions and non-linear relationships between pollutant concentrations and land use variables. The increasing availability of big spatial data and machine learning methods presents an opportunity for improvement in PM exposure assessment models. In this manuscript, our objective was to develop a novel land use random forest (LURF) model and compare its accuracy and precision to a LUR model for elemental components of PM in Cincinnati, Ohio. PM smaller than 2.5 μm (PM2.5) and eleven elemental components were measured at 24 sampling stations from the Cincinnati Childhood Allergy and Air Pollution Study (CCAAPS). Over 50 different predictors associated with transportation, physical features, community socioeconomic characteristics, greenspace, land cover, and emission point sources were used to construct LUR and LURF models. Cross validation was used to quantify and compare model performance. LURF and LUR models were created for aluminum (Al), copper (Cu), iron (Fe), potassium (K), manganese (Mn), nickel (Ni), lead (Pb), sulfur (S), silicon (Si), vanadium (V), zinc (Zn), and total PM2.5 in the CCAAPS study area. LURF utilized a more diverse and greater number of predictors than LUR, and the LURF models for Al, K, Mn, Pb, Si, Zn, TRAP, and PM2.5 all showed a decrease in fractional predictive error of at least 5% compared to their LUR counterparts. LURF models for Al, Cu, Fe, K, Mn, Pb, Si, Zn, TRAP, and PM2.5 all had a cross-validated fractional predictive error of less than 30%. Furthermore, LUR models showed a differential exposure assessment bias and had a higher prediction error variance. Random forest and other machine learning methods may provide more accurate exposure assessment.
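The abstract does not define "fractional predictive error"; a common convention is cross-validated RMSE divided by the mean observed concentration. A minimal sketch, assuming that definition and using leave-one-out cross-validation on a one-predictor linear model with synthetic station data (not the CCAAPS values):

```python
import random

random.seed(0)

# synthetic station data: concentration vs. a single land-use predictor
xs = [random.uniform(0, 10) for _ in range(24)]
ys = [3.0 + 0.8 * x + random.gauss(0, 0.6) for x in xs]

def fit(x, y):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) \
        / sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

# leave-one-out cross-validated prediction errors
errs = []
for i in range(len(xs)):
    xtr = xs[:i] + xs[i + 1:]
    ytr = ys[:i] + ys[i + 1:]
    a, b = fit(xtr, ytr)
    errs.append((a + b * xs[i]) - ys[i])

rmse = (sum(e * e for e in errs) / len(errs)) ** 0.5
fpe = rmse / (sum(ys) / len(ys))     # fractional predictive error
print(round(fpe, 3))
```

The same loop works unchanged if `fit` is swapped for a random forest, which is how a LURF/LUR comparison like the one above can be made on identical folds.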
Calibration of limited-area ensemble precipitation forecasts for hydrological predictions
NASA Astrophysics Data System (ADS)
Diomede, Tommaso; Marsigli, Chiara; Montani, Andrea; Nerozzi, Fabrizio; Paccagnella, Tiziana
2015-04-01
The main objective of this study is to investigate the impact of calibration for limited-area ensemble precipitation forecasts, to be used for driving discharge predictions up to 5 days in advance. A reforecast dataset, which spans 30 years, based on the Consortium for Small Scale Modeling Limited-Area Ensemble Prediction System (COSMO-LEPS) was used for testing the calibration strategy. Three calibration techniques were applied: quantile-to-quantile mapping, linear regression, and analogs. The performance of these methodologies was evaluated in terms of statistical scores for the precipitation forecasts operationally provided by COSMO-LEPS in the years 2003-2007 over Germany, Switzerland, and the Emilia-Romagna region (northern Italy). The analog-based method was preferred because of its capability to correct position errors and spread deficiencies. A suitable spatial domain for the analog search can help to handle model spatial errors as systematic errors. However, the performance of the analog-based method may degrade when only a limited training dataset is available. A sensitivity test on the length of the training dataset over which to perform the analog search was therefore performed. The quantile-to-quantile mapping and linear regression methods were less effective, mainly because the forecast-analysis relation was not strong for the available training dataset. A comparison between calibration based on the deterministic reforecast and calibration based on the full operational ensemble used as training dataset was also considered, with the aim of evaluating whether reforecasts are worth their considerable computational cost for calibration. The verification of the calibration process was then performed by coupling ensemble precipitation forecasts with a distributed rainfall-runoff model.
This test was carried out for a medium-sized catchment located in Emilia-Romagna, showing a beneficial impact of the analog-based method on the reduction of missed events for discharge predictions.
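Of the three calibration techniques compared above, quantile-to-quantile mapping is the simplest to sketch: each forecast value is replaced by the observed-climatology value at the same empirical quantile. A minimal empirical (nearest-rank) version with toy training data, not the COSMO-LEPS reforecasts:

```python
def quantile_map(value, fcst_train, obs_train):
    """Map a forecast value to the observation with the same empirical quantile."""
    f = sorted(fcst_train)
    o = sorted(obs_train)
    n = len(f)
    # empirical CDF position of `value` within the forecast training sample
    rank = sum(1 for v in f if v <= value)
    q = min(max(rank / n, 1 / (2 * n)), 1 - 1 / (2 * n))
    # read the observed quantile at the same probability (nearest rank)
    return o[min(int(q * n), n - 1)]

# toy training data: the model systematically over-forecasts precipitation
fcst_train = [2.0 * i for i in range(1, 101)]   # 2, 4, ..., 200
obs_train = [1.0 * i for i in range(1, 101)]    # 1, 2, ..., 100

print(quantile_map(100.0, fcst_train, obs_train))  # roughly halves the forecast
```

Note the limitation the abstract alludes to: the mapping corrects the marginal distribution value-by-value but, unlike the analog method, cannot move a misplaced precipitation feature in space.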
Panel regressions to estimate low-flow response to rainfall variability in ungaged basins
Bassiouni, Maoya; Vogel, Richard M.; Archfield, Stacey A.
2016-01-01
Multicollinearity and omitted-variable bias are major limitations to developing multiple linear regression models to estimate streamflow characteristics in ungaged areas and varying rainfall conditions. Panel regression is used to overcome limitations of traditional regression methods, and obtain reliable model coefficients, in particular to understand the elasticity of streamflow to rainfall. Using annual rainfall and selected basin characteristics at 86 gaged streams in the Hawaiian Islands, regional regression models for three stream classes were developed to estimate the annual low-flow duration discharges. Three panel-regression structures (random effects, fixed effects, and pooled) were compared to traditional regression methods, in which space is substituted for time. Results indicated that panel regression generally was able to reproduce the temporal behavior of streamflow and reduce the standard errors of model coefficients compared to traditional regression, even for models in which the unobserved heterogeneity between streams is significant and the variance inflation factor for rainfall is much greater than 10. This is because both spatial and temporal variability were better characterized in panel regression. In a case study, regional rainfall elasticities estimated from panel regressions were applied to ungaged basins on Maui, using available rainfall projections to estimate plausible changes in surface-water availability and usable stream habitat for native species. The presented panel-regression framework is shown to offer benefits over existing traditional hydrologic regression methods for developing robust regional relations to investigate streamflow response in a changing climate.
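The fixed-effects ("within") variant of the panel structures compared above can be sketched by demeaning each basin's series before OLS, which removes time-invariant basin effects that would otherwise bias a pooled regression. Synthetic data below, not the Hawaiian dataset:

```python
import random

random.seed(2)

# 10 basins x 20 years; each basin has its own intercept (unobserved
# heterogeneity) correlated with its mean rainfall, which biases pooled OLS
beta = 0.5
panels = []
for g in range(10):
    alpha = float(g)                 # basin effect
    base = 10.0 + 2.0 * g            # mean rainfall correlated with basin effect
    rain = [base + random.gauss(0, 1) for _ in range(20)]
    flow = [alpha + beta * r + random.gauss(0, 0.2) for r in rain]
    panels.append((rain, flow))

def slope(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) \
        / sum((a - mx) ** 2 for a in x)

# pooled OLS: ignores basin effects, conflates spatial and temporal variability
pool_x = [r for rain, _ in panels for r in rain]
pool_y = [f for _, flow in panels for f in flow]
pooled = slope(pool_x, pool_y)

# fixed effects: demean within each basin, then pool the deviations
fe_x, fe_y = [], []
for rain, flow in panels:
    mr, mf = sum(rain) / len(rain), sum(flow) / len(flow)
    fe_x += [r - mr for r in rain]
    fe_y += [f - mf for f in flow]
within = slope(fe_x, fe_y)

print(round(pooled, 2), round(within, 2))
```

The within estimator recovers the true temporal elasticity (0.5 here) while the pooled slope is pulled toward the cross-sectional relation, the same omitted-variable bias that traditional space-for-time regressions suffer from.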
Somarathna, P D S N; Minasny, Budiman; Malone, Brendan P; Stockmann, Uta; McBratney, Alex B
2018-08-01
Spatial modelling of environmental data commonly considers spatial variability as the only source of uncertainty. In reality, however, measurement errors should also be accounted for. In recent years, infrared spectroscopy has been shown to offer low-cost, yet invaluable, information needed for digital soil mapping at meaningful spatial scales for land management. However, spectrally inferred soil carbon data are known to be less accurate than laboratory-analysed measurements. This study establishes a methodology to filter out the measurement error variability by incorporating the measurement error variance in the spatial covariance structure of the model. The study was carried out in the Lower Hunter Valley, New South Wales, Australia, where a combination of laboratory-measured and vis-NIR- and MIR-inferred topsoil and subsoil carbon data is available. We investigated the applicability of residual maximum likelihood (REML) and Markov chain Monte Carlo (MCMC) simulation methods to generate parameters of the Matérn covariance function directly from the data in the presence of measurement error. The results revealed that the measurement error can be effectively filtered out through the proposed technique. When the measurement error was filtered from the data, the prediction variance almost halved, which ultimately yielded greater certainty in spatial predictions of soil carbon. Further, the MCMC technique was successfully used to define the posterior distribution of the measurement error. This is an important outcome, as the MCMC technique can be used to estimate the measurement error if it is not explicitly quantified. Although this study dealt with soil carbon data, the method is amenable to filtering the measurement error of any kind of continuous spatial environmental data. Copyright © 2018 Elsevier B.V. All rights reserved.
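The key modelling idea above, adding the measurement error variance to the spatial covariance structure, can be sketched with a Matérn (ν = 3/2) covariance plus an error ("nugget") term that contributes only at zero separation. Parameter values are illustrative, not those estimated for the Hunter Valley data:

```python
import math

def matern32(d, sigma2=1.0, rho=100.0, tau2=0.25):
    """Matérn covariance (nu = 3/2) plus measurement-error variance tau2.

    tau2 enters only at d == 0, i.e. the error is spatially uncorrelated,
    so it can be separated from the smooth spatial signal during estimation.
    """
    a = math.sqrt(3.0) * d / rho
    c = sigma2 * (1.0 + a) * math.exp(-a)
    return c + (tau2 if d == 0 else 0.0)

# at distance zero the variance includes the measurement error ...
print(matern32(0.0))                  # sigma2 + tau2 = 1.25
# ... while the spatial component decays smoothly with distance
print(round(matern32(100.0), 3))
```

In a REML or MCMC fit, `tau2` is estimated jointly with `sigma2` and `rho`; kriging predictions then smooth through (rather than honour) the noisy spectral measurements, which is what drives the reported halving of prediction variance.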
Reusser, D.A.; Lee, H.
2008-01-01
Habitat models can be used to predict the distributions of marine and estuarine non-indigenous species (NIS) over several spatial scales. At an estuary scale, our goal is to predict the estuaries most likely to be invaded, but at a habitat scale, the goal is to predict the specific locations within an estuary that are most vulnerable to invasion. As an initial step in evaluating several habitat models, model performance for a suite of benthic species with reasonably well-known distributions on the Pacific coast of the US needs to be compared. We discuss the utility of non-parametric multiplicative regression (NPMR) for predicting habitat- and estuary-scale distributions of native and NIS. NPMR incorporates interactions among variables, allows qualitative and categorical variables, and utilizes data on absence as well as presence. Preliminary results indicate that NPMR generally performs well at both spatial scales and that distributions of NIS are predicted as well as those of native species. For most species, latitude was the single best predictor, although similar model performance could be obtained at both spatial scales with combinations of other habitat variables. Errors of commission were more frequent at a habitat scale, with omission and commission errors approximately equal at an estuary scale. © 2008 International Council for the Exploration of the Sea. Published by Oxford Journals. All rights reserved.
Waltemeyer, Scott D.
2006-01-01
Estimates of the magnitude and frequency of peak discharges are necessary for the reliable flood-hazard mapping in the Navajo Nation in Arizona, Utah, Colorado, and New Mexico. The Bureau of Indian Affairs, U.S. Army Corps of Engineers, and Navajo Nation requested that the U.S. Geological Survey update estimates of peak discharge magnitude for gaging stations in the region and update regional equations for estimation of peak discharge and frequency at ungaged sites. Equations were developed for estimating the magnitude of peak discharges for recurrence intervals of 2, 5, 10, 25, 50, 100, and 500 years at ungaged sites using data collected through 1999 at 146 gaging stations, an additional 13 years of peak-discharge data since a 1997 investigation, which used gaging-station data through 1986. The equations for estimation of peak discharges at ungaged sites were developed for flood regions 8, 11, high elevation, and 6 and are delineated on the basis of the hydrologic codes from the 1997 investigation. Peak discharges for selected recurrence intervals were determined at gaging stations by fitting observed data to a log-Pearson Type III distribution with adjustments for a low-discharge threshold and a zero skew coefficient. A low-discharge threshold was applied to frequency analysis of 82 of the 146 gaging stations. This application provides an improved fit of the log-Pearson Type III frequency distribution. Use of the low-discharge threshold generally eliminated the peak discharge having a recurrence interval of less than 1.4 years in the probability-density function. Within each region, logarithms of the peak discharges for selected recurrence intervals were related to logarithms of basin and climatic characteristics using stepwise ordinary least-squares regression techniques for exploratory data analysis. 
Generalized least-squares regression techniques, an improved regression procedure that accounts for temporal and spatial sampling errors, then were applied to the same data used in the ordinary least-squares regression analyses. The average standard error of prediction, which includes average sampling error and average standard error of regression, ranged from 45 to 83 percent for the 100-year flood; for region 8 it was 53 percent. The estimated standard error of prediction for a hybrid method for region 11 was large in the 1997 investigation, and no distinction of floods produced from a high-elevation region was presented in that investigation. Overall, the equations based on generalized least-squares regression techniques are considered to be more reliable than those in the 1997 report because of the increased length of record and improved GIS methods. Flood-frequency relations can be transferred to an ungaged site either by direct application of the regional regression equation or, for an ungaged site on a stream that has a gaging station upstream or downstream, by using the drainage-area ratio and the drainage-area exponent from the regional regression equation of the respective region.
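The log-Pearson Type III fitting described above is typically done by method of moments on the log-transformed peaks, with the T-year quantile built from a frequency factor K(g, z); the Wilson-Hilferty approximation of K used below is a standard textbook form (Bulletin 17B style), applied to synthetic peaks rather than the Navajo Nation records:

```python
import math
import random
import statistics

random.seed(3)

# synthetic annual peak discharges (cfs), roughly lognormal
peaks = [math.exp(random.gauss(8.0, 0.5)) for _ in range(60)]
logs = [math.log10(q) for q in peaks]

m = statistics.mean(logs)
s = statistics.stdev(logs)
n = len(logs)
g = (n * sum((x - m) ** 3 for x in logs)) / ((n - 1) * (n - 2) * s ** 3)  # skew

def freq_factor(g, z):
    """Wilson-Hilferty approximation of the Pearson III frequency factor."""
    if abs(g) < 1e-9:
        return z          # zero skew reduces to the normal quantile
    k = g / 6.0
    return (2.0 / g) * ((1.0 + k * z - k * k) ** 3 - 1.0)

z100 = 2.32635            # standard normal quantile for 1% annual exceedance
q100 = 10 ** (m + freq_factor(g, z100) * s)
print(round(q100))
```

The low-discharge threshold and weighted-skew adjustments mentioned in the abstract would modify `logs` and `g` before this quantile step; the frequency-factor mechanics are unchanged.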
Chen, Li; Gao, Shuang; Zhang, Hui; Sun, Yanling; Ma, Zhenxing; Vedal, Sverre; Mao, Jian; Bai, Zhipeng
2018-05-03
Concentrations of particulate matter with aerodynamic diameter <2.5 μm (PM2.5) are relatively high in China. Estimation of PM2.5 exposure is complex because PM2.5 exhibits complex spatiotemporal patterns. To improve the validity of exposure predictions, several methods have been developed and applied worldwide. A hybrid approach combining a land use regression (LUR) model and Bayesian Maximum Entropy (BME) interpolation of the LUR space-time residuals was developed to estimate PM2.5 concentrations on a national scale in China. This hybrid model could potentially provide more valid predictions than a commonly used LUR model. The LUR/BME model had good performance characteristics, with R2 = 0.82 and a root mean square error (RMSE) of 4.6 μg/m3. Prediction errors of the LUR/BME model were reduced by incorporating soft data accounting for data uncertainty, with the R2 increasing by 6%. The performance of LUR/BME is better than that of OK/BME. The LUR/BME model is the most accurate fine-spatial-scale PM2.5 model developed to date for China. Copyright © 2018. Published by Elsevier Ltd.
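The hybrid idea above, regress first and then spatially interpolate the regression residuals, can be sketched with ordinary least squares plus inverse-distance weighting of residuals standing in for the BME step (BME proper requires a full space-time covariance treatment; the monitor values here are invented):

```python
def idw(pt, pts, vals, p=2.0):
    """Inverse-distance-weighted interpolation of residuals at `pt`."""
    num = den = 0.0
    for (x, y), v in zip(pts, vals):
        d2 = (pt[0] - x) ** 2 + (pt[1] - y) ** 2
        if d2 == 0:
            return v                      # exact at a monitoring site
        w = 1.0 / d2 ** (p / 2)
        num += w * v
        den += w
    return num / den

# toy monitors: coordinates, a land-use predictor, observed PM2.5
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
lu =    [1.0,        2.0,        3.0,        4.0]
pm =    [10.5,       12.4,       13.6,       16.1]

# stage 1: simple one-predictor LUR
mx, my = sum(lu) / 4, sum(pm) / 4
b = sum((x - mx) * (y - my) for x, y in zip(lu, pm)) \
    / sum((x - mx) ** 2 for x in lu)
a = my - b * mx
resid = [y - (a + b * x) for x, y in zip(lu, pm)]

# stage 2: add the interpolated residual to the LUR prediction
def predict(pt, lu_value):
    return a + b * lu_value + idw(pt, sites, resid)

# at a monitoring site the hybrid reproduces the observation exactly
print(round(predict((0.0, 0.0), 1.0), 1))   # 10.5
```

Between monitors the residual surface corrects the systematic part of the LUR error, which is the same role the BME interpolation plays at national scale.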
Spatially explicit estimation of aboveground boreal forest biomass in the Yukon River Basin, Alaska
Ji, Lei; Wylie, Bruce K.; Brown, Dana R. N.; Peterson, Birgit E.; Alexander, Heather D.; Mack, Michelle C.; Rover, Jennifer R.; Waldrop, Mark P.; McFarland, Jack W.; Chen, Xuexia; Pastick, Neal J.
2015-01-01
Quantification of aboveground biomass (AGB) in Alaska’s boreal forest is essential to the accurate evaluation of terrestrial carbon stocks and dynamics in northern high-latitude ecosystems. Our goal was to map AGB at 30 m resolution for the boreal forest in the Yukon River Basin of Alaska using Landsat data and ground measurements. We acquired Landsat images to generate a 3-year (2008–2010) composite of top-of-atmosphere reflectance for six bands as well as the brightness temperature (BT). We constructed a multiple regression model using field-observed AGB and Landsat-derived reflectance, BT, and vegetation indices. A basin-wide boreal forest AGB map at 30 m resolution was generated by applying the regression model to the Landsat composite. The fivefold cross-validation with field measurements had a mean absolute error (MAE) of 25.7 Mg ha−1 (relative MAE 47.5%) and a mean bias error (MBE) of 4.3 Mg ha−1 (relative MBE 7.9%). The boreal forest AGB product was compared with lidar-based vegetation height data; the comparison indicated that there was a significant correlation between the two data sets.
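The two error summaries reported above, MAE for accuracy and MBE for bias, are easy to conflate; a small helper makes the distinction explicit (the values are illustrative, not the Yukon data):

```python
def mae_mbe(obs, pred):
    """Mean absolute error (magnitude) and mean bias error (sign-preserving)."""
    n = len(obs)
    errors = [p - o for o, p in zip(obs, pred)]
    mae = sum(abs(e) for e in errors) / n
    mbe = sum(errors) / n
    mean_obs = sum(obs) / n
    # relative versions, in percent of the mean observation
    return mae, mbe, 100 * mae / mean_obs, 100 * mbe / mean_obs

obs = [40.0, 60.0, 80.0, 100.0]          # Mg/ha
pred = [50.0, 50.0, 90.0, 110.0]

mae, mbe, rel_mae, rel_mbe = mae_mbe(obs, pred)
print(mae, mbe)   # 10.0 5.0 -- large per-plot errors, smaller net bias
```

As in the abstract (relative MAE 47.5% vs. relative MBE 7.9%), positive and negative errors partly cancel in the MBE, so a map can be noisy plot-to-plot yet nearly unbiased in aggregate.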
Lee, Mihye; Kloog, Itai; Chudnovsky, Alexandra; Lyapustin, Alexei; Wang, Yujie; Melly, Steven; Coull, Brent; Koutrakis, Petros; Schwartz, Joel
2016-01-01
Numerous studies have demonstrated that fine particulate matter (PM2.5, particles smaller than 2.5 μm in aerodynamic diameter) is associated with adverse health outcomes. The use of ground monitoring stations of PM2.5 to assess personal exposure, however, induces measurement error. Land use regression provides spatially resolved predictions, but land use terms do not vary temporally. Meanwhile, the advent of satellite-retrieved aerosol optical depth (AOD) products has made it possible to predict the spatial and temporal patterns of PM2.5 exposures. In this paper, we used AOD data with other PM2.5 variables, such as meteorological variables, land use regression, and spatial smoothing, to predict daily concentrations of PM2.5 at a 1 km2 resolution of the southeastern United States, including the seven states of Georgia, North Carolina, South Carolina, Alabama, Tennessee, Mississippi, and Florida, for the years 2003 through 2011. We divided the study area into 3 regions and applied separate mixed-effects models to calibrate AOD using ground PM2.5 measurements and other spatiotemporal predictors. Using 10-fold cross-validation, we obtained out-of-sample R2 values of 0.77, 0.81, and 0.70 with the square root of the mean squared prediction errors (RMSPE) of 2.89, 2.51, and 2.82 μg/m3 for regions 1, 2, and 3, respectively. The slopes of the relationships between predicted PM2.5 and held-out measurements were approximately 1, indicating no bias between the observed and modeled PM2.5 concentrations. Predictions can be used in epidemiological studies investigating the effects of both acute and chronic exposures to PM2.5. Our model results will also extend the existing studies on PM2.5, which have mostly focused on urban areas due to the paucity of monitors in rural areas. PMID:26082149
Guo, Ying; Little, Roderick J; McConnell, Daniel S
2012-01-01
Covariate measurement error is common in epidemiologic studies. Current methods for correcting measurement error with information from external calibration samples are insufficient to provide valid adjusted inferences. We consider the problem of estimating the regression of an outcome Y on covariates X and Z, where Y and Z are observed, X is unobserved, but a variable W that measures X with error is observed. Information about measurement error is provided in an external calibration sample where data on X and W (but not Y and Z) are recorded. We describe a method that uses summary statistics from the calibration sample to create multiple imputations of the missing values of X in the regression sample, so that the regression coefficients of Y on X and Z and associated standard errors can be estimated using simple multiple imputation combining rules, yielding valid statistical inferences under the assumption of a multivariate normal distribution. The proposed method is shown by simulation to provide better inferences than existing methods, namely the naive method, classical calibration, and regression calibration, particularly for correction for bias and achieving nominal confidence levels. We also illustrate our method with an example using linear regression to examine the relation between serum reproductive hormone concentrations and bone mineral density loss in midlife women in the Michigan Bone Health and Metabolism Study. Existing methods fail to adjust appropriately for bias due to measurement error in the regression setting, particularly when measurement error is substantial. The proposed method corrects this deficiency.
Spatial durbin error model for human development index in Province of Central Java.
NASA Astrophysics Data System (ADS)
Septiawan, A. R.; Handajani, S. S.; Martini, T. S.
2018-05-01
The Human Development Index (HDI) is an indicator used to measure success in building the quality of human life, describing how people access development outcomes in terms of income, health, and education. HDI in Central Java has improved every year; in 2016 it was 69.98%, an increase of 0.49% over the previous year. The objective of this study was to apply the spatial Durbin error model, using queen-contiguity spatial weights, to model HDI in Central Java Province. The spatial Durbin error model was chosen because it accounts for both spatial dependence in the errors and spatial dependence in the independent variables. The factors considered are life expectancy, mean years of schooling, expected years of schooling, and purchasing power parity. Based on the results of the research, a spatial Durbin error model for HDI in Central Java was obtained in which the influencing factors are life expectancy, mean years of schooling, expected years of schooling, and purchasing power parity.
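The queen-contiguity weights used above treat any two regions sharing an edge or a corner as neighbors; on a regular grid this is simple to construct (row-standardized, as is typical for spatial regression models; actual regency boundaries would of course be irregular):

```python
def queen_weights(nrows, ncols):
    """Row-standardized queen-contiguity weight matrix for an nrows x ncols grid."""
    n = nrows * ncols
    W = [[0.0] * n for _ in range(n)]
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            # all cells sharing an edge or a corner with (r, c)
            nbrs = [(r + dr, c + dc)
                    for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                    if (dr, dc) != (0, 0)
                    and 0 <= r + dr < nrows and 0 <= c + dc < ncols]
            for rr, cc in nbrs:
                W[i][rr * ncols + cc] = 1.0 / len(nbrs)
    return W

W = queen_weights(3, 3)
# the center cell of a 3x3 grid has 8 queen neighbors, each weighted 1/8
print(sum(1 for w in W[4] if w > 0))   # 8
```

In the spatial Durbin error model this W enters twice, once lagging the covariates (WX) and once in the error process, which is why the weight specification matters so much for the fitted coefficients.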
Bayesian learning for spatial filtering in an EEG-based brain-computer interface.
Zhang, Haihong; Yang, Huijuan; Guan, Cuntai
2013-07-01
Spatial filtering for EEG feature extraction and classification is an important tool in brain-computer interface. However, there is generally no established theory that links spatial filtering directly to Bayes classification error. To address this issue, this paper proposes and studies a Bayesian analysis theory for spatial filtering in relation to Bayes error. Following the maximum entropy principle, we introduce a gamma probability model for describing single-trial EEG power features. We then formulate and analyze the theoretical relationship between Bayes classification error and the so-called Rayleigh quotient, which is a function of spatial filters and basically measures the ratio in power features between two classes. This paper also reports our extensive study that examines the theory and its use in classification, using three publicly available EEG data sets and state-of-the-art spatial filtering techniques and various classifiers. Specifically, we validate the positive relationship between Bayes error and Rayleigh quotient in real EEG power features. Finally, we demonstrate that the Bayes error can be practically reduced by applying a new spatial filter with lower Rayleigh quotient.
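The Rayleigh quotient referred to above, for a spatial filter w and class covariance matrices C1 and C2, is R(w) = (wᵀC1w)/(wᵀC2w); a minimal two-channel computation with toy covariances (not EEG data) shows how a filter's quotient reflects class contrast:

```python
def quad_form(M, w):
    """Quadratic form w' M w."""
    return sum(w[i] * M[i][j] * w[j]
               for i in range(len(w)) for j in range(len(w)))

def rayleigh_quotient(C1, C2, w):
    """Ratio of band power between two classes after spatial filtering with w."""
    return quad_form(C1, w) / quad_form(C2, w)

# toy class covariances: class 2 has much more power along the first axis
C1 = [[1.0, 0.0], [0.0, 1.0]]
C2 = [[4.0, 0.0], [0.0, 1.0]]

w_a = [1.0, 0.0]   # projects onto the discriminative axis
w_b = [0.0, 1.0]   # projects onto the non-discriminative axis

print(rayleigh_quotient(C1, C2, w_a))   # 0.25 -- strong class contrast
print(rayleigh_quotient(C1, C2, w_b))   # 1.0  -- no contrast
```

A quotient far from 1 in either direction separates the classes; the paper's contribution is linking this quantity analytically to the Bayes classification error under a gamma model for power features.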
Sharpening method of satellite thermal image based on the geographical statistical model
NASA Astrophysics Data System (ADS)
Qi, Pengcheng; Hu, Shixiong; Zhang, Haijun; Guo, Guangmeng
2016-04-01
To improve the effectiveness of thermal sharpening in mountainous regions, with closer attention to the laws of land-surface energy balance, a thermal sharpening method based on a geographical statistical model (GSM) is proposed. Explanatory variables were selected from the processes of the land-surface energy budget and thermal infrared electromagnetic radiation transmission, and high-spatial-resolution (57 m) raster layers were generated for these variables through spatial simulation or by using other raster data as proxies. Based on this, the locally adaptive statistical relationship between brightness temperature (BT) and the explanatory variables, i.e., the GSM, was built at 1026-m resolution using the method of multivariate adaptive regression splines. Finally, the GSM was applied to the high-resolution (57-m) explanatory variables; thus, the high-resolution (57-m) BT image was obtained. This method produced a sharpening result with low error and good visual quality. The method avoids a blind choice of explanatory variables and removes the dependence on synchronous imagery in the visible and near-infrared bands. The influences of the explanatory-variable combination, the sampling method, and the residual error correction on sharpening results were analyzed in detail, and their influence mechanisms are reported herein.
A method to estimate the effect of deformable image registration uncertainties on daily dose mapping
Murphy, Martin J.; Salguero, Francisco J.; Siebers, Jeffrey V.; Staub, David; Vaman, Constantin
2012-01-01
Purpose: To develop a statistical sampling procedure for spatially-correlated uncertainties in deformable image registration and then use it to demonstrate their effect on daily dose mapping. Methods: Sequential daily CT studies are acquired to map anatomical variations prior to fractionated external beam radiotherapy. The CTs are deformably registered to the planning CT to obtain displacement vector fields (DVFs). The DVFs are used to accumulate the dose delivered each day onto the planning CT. Each DVF has spatially-correlated uncertainties associated with it. Principal components analysis (PCA) is applied to measured DVF error maps to produce decorrelated principal component modes of the errors. The modes are sampled independently and reconstructed to produce synthetic registration error maps. The synthetic error maps are convolved with dose mapped via deformable registration to model the resulting uncertainty in the dose mapping. The results are compared to the dose mapping uncertainty that would result from uncorrelated DVF errors that vary randomly from voxel to voxel. Results: The error sampling method is shown to produce synthetic DVF error maps that are statistically indistinguishable from the observed error maps. Spatially-correlated DVF uncertainties modeled by our procedure produce patterns of dose mapping error that are different from that due to randomly distributed uncertainties. Conclusions: Deformable image registration uncertainties have complex spatial distributions. The authors have developed and tested a method to decorrelate the spatial uncertainties and make statistical samples of highly correlated error maps. The sample error maps can be used to investigate the effect of DVF uncertainties on daily dose mapping via deformable image registration. An initial demonstration of this methodology shows that dose mapping uncertainties can be sensitive to spatial patterns in the DVF uncertainties. PMID:22320766
De Sá Teixeira, Nuno Alexandre
2014-12-01
Given its conspicuous nature, gravity has been acknowledged by several research lines as a prime factor in structuring the spatial perception of one's environment. One such line of enquiry has focused on errors in spatial localization aimed at the vanishing location of moving objects - it has been systematically reported that humans mislocalize spatial positions forward, in the direction of motion (representational momentum) and downward in the direction of gravity (representational gravity). Moreover, spatial localization errors were found to evolve dynamically with time in a pattern congruent with an anticipated trajectory (representational trajectory). The present study attempts to ascertain the degree to which vestibular information plays a role in these phenomena. Human observers performed a spatial localization task while tilted to varying degrees and referring to the vanishing locations of targets moving along several directions. A Fourier decomposition of the obtained spatial localization errors revealed that although spatial errors were increased "downward" mainly along the body's longitudinal axis (idiotropic dominance), the degree of misalignment between the latter and physical gravity modulated the time course of the localization responses. This pattern is surmised to reflect increased uncertainty about the internal model when faced with conflicting cues regarding the perceived "downward" direction.
Smith, S. Jerrod; Lewis, Jason M.; Graves, Grant M.
2015-09-28
Generalized-least-squares multiple-linear regression analysis was used to formulate regression relations between peak-streamflow frequency statistics and basin characteristics. Contributing drainage area was the only basin characteristic determined to be statistically significant for all annual exceedance probabilities and was the only basin characteristic used in regional regression equations for estimating peak-streamflow frequency statistics on unregulated streams in and near the Oklahoma Panhandle. The regression model pseudo-coefficient of determination, converted to percent, for the Oklahoma Panhandle regional regression equations ranged from about 38 to 63 percent. The standard errors of prediction and the standard model errors for the Oklahoma Panhandle regional regression equations ranged from about 84 to 148 percent and from about 76 to 138 percent, respectively. These errors were comparable to those reported for regional peak-streamflow frequency regression equations for the High Plains areas of Texas and Colorado. The root mean square errors for the Oklahoma Panhandle regional regression equations (ranging from 3,170 to 92,000 cubic feet per second) were less than the root mean square errors for the Oklahoma statewide regression equations (ranging from 18,900 to 412,000 cubic feet per second); therefore, the Oklahoma Panhandle regional regression equations produce more accurate peak-streamflow statistic estimates for the irrigated period of record in the Oklahoma Panhandle than do the Oklahoma statewide regression equations. The regression equations developed in this report are applicable to streams that are not substantially affected by regulation, impoundment, or surface-water withdrawals. These regression equations are intended for use at stream sites with contributing drainage areas less than or equal to about 2,060 square miles, the maximum value for the independent variable used in the regression analysis.
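A single-predictor regional equation of the kind described above is typically fit in log space. The sketch below uses ordinary least squares as a stand-in for the generalized least squares actually used in the report, and the area/flow pairs are invented:

```python
import math

# Synthetic example: contributing drainage area (mi^2) vs a peak-streamflow
# statistic (ft^3/s); numbers invented, and OLS stands in for GLS.
area = [10, 50, 120, 400, 900, 2060]
peak = [800, 2900, 5600, 14000, 26000, 48000]

# Regional equations of this kind are typically fit as:
# log10(Qp) = b0 + b1 * log10(A)
x = [math.log10(a) for a in area]
y = [math.log10(q) for q in peak]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
     / sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar

def predict_peak(a_mi2):
    """Estimate peak flow (ft^3/s) from drainage area (mi^2)."""
    return 10 ** (b0 + b1 * math.log10(a_mi2))

print(round(b1, 2))  # slope < 1: peak flow grows sublinearly with area
```

The applicability limit quoted in the abstract (about 2,060 square miles) corresponds to the largest value of the single independent variable used in such a fit.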
Detecting Spatial Patterns in Biological Array Experiments
ROOT, DAVID E.; KELLEY, BRIAN P.; STOCKWELL, BRENT R.
2005-01-01
Chemical genetic screening and DNA and protein microarrays are among a number of increasingly important and widely used biological research tools that involve large numbers of parallel experiments arranged in a spatial array. It is often difficult to ensure that uniform experimental conditions are present throughout the entire array, and as a result, one often observes systematic spatially correlated errors, especially when array experiments are performed using robots. Here, the authors apply techniques based on the discrete Fourier transform to identify and quantify spatially correlated errors superimposed on a spatially random background. They demonstrate that these techniques are effective in identifying common spatially systematic errors in high-throughput 384-well microplate assay data. In addition, the authors employ a statistical test to allow for automatic detection of such errors. Software tools for using this approach are provided. PMID:14567791
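The core idea above, using the discrete Fourier transform to separate spatially systematic errors from a random background, can be shown on one 24-column plate row. The every-other-column bias is a synthetic stand-in for a robotic dispensing artifact:

```python
import math

# One row of a 384-well plate (24 columns): uniform background plus a
# systematic every-other-column bias, mimicking a robotic artifact.
cols = 24
signal = [1.0 + (0.3 if c % 2 else 0.0) for c in range(cols)]

def dft_power(x):
    """Power spectrum of a real signal at frequencies k = 0..n/2."""
    n = len(x)
    power = []
    for k in range(n // 2 + 1):
        re = sum(x[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(x[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        power.append((re * re + im * im) / n)
    return power

p = dft_power(signal)
# Skip the DC term (k = 0); the alternating bias puts its power at k = n/2.
peak_k = max(range(1, len(p)), key=lambda k: p[k])
print(peak_k)  # 12, i.e. a period of 2 columns
```

A spatially random background spreads its power across all frequencies, whereas a systematic artifact concentrates power at a few frequencies, which is what makes an automatic statistical test possible.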
NASA Astrophysics Data System (ADS)
Caimmi, R.
2011-08-01
Concerning bivariate least squares linear regression, the classical approach pursued for functional models in earlier attempts ( York, 1966, 1969) is reviewed using a new formalism in terms of deviation (matrix) traces which, for unweighted data, reduce to usual quantities leaving aside an unessential (but dimensional) multiplicative factor. Within the framework of classical error models, the dependent variable relates to the independent variable according to the usual additive model. The classes of linear models considered are regression lines in the general case of correlated errors in X and in Y for weighted data, and in the opposite limiting situations of (i) uncorrelated errors in X and in Y, and (ii) completely correlated errors in X and in Y. The special case of (C) generalized orthogonal regression is considered in detail together with well known subcases, namely: (Y) errors in X negligible (ideally null) with respect to errors in Y; (X) errors in Y negligible (ideally null) with respect to errors in X; (O) genuine orthogonal regression; (R) reduced major-axis regression. In the limit of unweighted data, the results determined for functional models are compared with their counterparts related to extreme structural models i.e. the instrumental scatter is negligible (ideally null) with respect to the intrinsic scatter ( Isobe et al., 1990; Feigelson and Babu, 1992). While regression line slope and intercept estimators for functional and structural models necessarily coincide, the contrary holds for related variance estimators even if the residuals obey a Gaussian distribution, with the exception of Y models. An example of astronomical application is considered, concerning the [O/H]-[Fe/H] empirical relations deduced from five samples related to different stars and/or different methods of oxygen abundance determination. 
For selected samples and assigned methods, different regression models yield consistent results within the errors (±σ) for both heteroscedastic and homoscedastic data. Conversely, samples related to different methods produce discrepant results due to the presence of (still undetected) systematic errors, which implies that no definitive statement can be made at present. A comparison is also made between different expressions of regression line slope and intercept variance estimators, where fractional discrepancies are found not to exceed a few percent, growing to about 20% in the presence of large-dispersion data. An extension of the formalism to structural models is left to a forthcoming paper.
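The slope estimators labelled (Y), (X), (O) and (R) in the abstract above can be written out for unweighted data; the toy points below are invented, and the formulas are the standard textbook ones rather than the paper's trace formalism:

```python
import math

# Toy data with comparable scatter in both X and Y (values illustrative).
xs = [1.0, 2.1, 2.9, 4.2, 5.0, 6.1]
ys = [1.2, 2.0, 3.3, 3.9, 5.2, 5.9]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
sxx = sum((x - mx) ** 2 for x in xs) / n
syy = sum((y - my) ** 2 for y in ys) / n
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

b_y = sxy / sxx                  # (Y): errors only in Y (ordinary OLS)
b_x = syy / sxy                  # (X): errors only in X (inverse regression)
b_rma = math.copysign(math.sqrt(syy / sxx), sxy)   # (R): reduced major axis
# (O): orthogonal regression, assuming equal error variances in X and Y:
b_orth = ((syy - sxx) + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)

# With |r| < 1 the symmetric estimators fall between the two OLS extremes.
print(b_y <= b_rma <= b_x and b_y <= b_orth <= b_x)  # True
```

The (Y) and (X) slopes bracket the symmetric estimators because b_y = r·√(syy/sxx) and b_x = √(syy/sxx)/r, so the choice of error model matters most when the correlation is weak.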
Strong, Laurence L.
2012-01-01
A prototype knowledge- and object-based image analysis model was developed to inventory and map least tern and piping plover habitat on the Missouri River, USA. The model has been used to inventory the state of sandbars annually for 4 segments of the Missouri River since 2006 using QuickBird imagery. Interpretation of the state of sandbars is difficult when images for the segment are acquired at different river stages and different states of vegetation phenology and canopy cover. Concurrent QuickBird and RapidEye images were classified using the model and the spatial correspondence of classes in the land cover and sandbar maps were analysed for the spatial extent of the images and at nest locations for both bird species. Omission and commission errors were low for unvegetated land cover classes used for nesting by both bird species and for land cover types with continuous vegetation cover and water. Errors were larger for land cover classes characterized by a mixture of sand and vegetation. Sandbar classification decisions are made using information on land cover class proportions and disagreement between sandbar classes was resolved using fuzzy membership possibilities. Regression analysis of area for a paired sample of 47 sandbars indicated an average positive bias, 1.15 ha, for RapidEye that did not vary with sandbar size. RapidEye has potential to reduce temporal uncertainty about least tern and piping plover habitat but would not be suitable for mapping sandbar erosion, and characterization of sandbar shapes or vegetation patches at fine spatial resolution.
NASA Astrophysics Data System (ADS)
Gao, Cheng-Yan; Wang, Guan-Yu; Zhang, Hao; Deng, Fu-Guo
2017-01-01
We present a self-error-correction spatial-polarization hyperentanglement distribution scheme for N-photon systems in a hyperentangled Greenberger-Horne-Zeilinger state over arbitrary collective-noise channels. In our scheme, the errors of spatial entanglement can first be averted by encoding the spatial-polarization hyperentanglement into time-bin entanglement with identical polarization and defined spatial modes before it is transmitted over the fiber channels. After transmission over the noisy channels, the polarization errors introduced by the depolarizing noise can be corrected by resorting to the time-bin entanglement. Finally, the parties in quantum communication can in principle share maximally hyperentangled states with a success probability of 100%.
Theorizing Land Cover and Land Use Change: The Peasant Economy of Colonization in the Amazon Basin
NASA Technical Reports Server (NTRS)
Caldas, Marcellus; Walker, Robert; Arima, Eugenio; Perz, Stephen; Aldrich, Stephen; Simmons, Cynthia
2007-01-01
This paper addresses deforestation processes in the Amazon basin. It deploys a methodology combining remote sensing and survey-based fieldwork to examine, with regression analysis, the impact of household structure and economic circumstances on deforestation decisions made by colonist farmers in the forest frontiers of Brazil. Unlike most previous regression-based studies, the methodology implemented analyzes behavior at the level of the individual property. The regressions correct for endogenous relationships between key variables, and spatial autocorrelation, as necessary. Variables used in the analysis are specified, in part, by a theoretical development integrating the Chayanovian concept of the peasant household with spatial considerations stemming from von Thuenen. The results from the empirical model indicate that demographic characteristics of households, as well as market factors, affect deforestation in the Amazon. Thus, statistical results from studies that do not include household-scale information may be subject to error. From a policy perspective, the results suggest that environmental policies in the Amazon based on market incentives to small farmers may not be as effective as hoped, given the importance of household factors in catalyzing the demand for land. The paper concludes by noting that household decisions regarding land use and deforestation are not independent of broader social circumstances, and that a full understanding of Amazonian deforestation will require insight into why poor families find it necessary to settle the frontier in the first place.
NASA Astrophysics Data System (ADS)
Wang, Rong; Andrews, Elisabeth; Balkanski, Yves; Boucher, Olivier; Myhre, Gunnar; Samset, Bjørn Hallvard; Schulz, Michael; Schuster, Gregory L.; Valari, Myrto; Tao, Shu
2018-02-01
There is high uncertainty in the direct radiative forcing of black carbon (BC), an aerosol that strongly absorbs solar radiation. The observation-constrained estimate, which is several times larger than the bottom-up estimate, is influenced by the spatial representativeness error due to the mesoscale inhomogeneity of the aerosol fields and the relatively low resolution of global chemistry-transport models. Here we evaluated the spatial representativeness error for two widely used observational networks (AErosol RObotic NETwork and Global Atmosphere Watch) by downscaling the geospatial grid in a global model of BC aerosol absorption optical depth to 0.1° × 0.1°. Comparing the models at a spatial resolution of 2° × 2° with BC aerosol absorption at AErosol RObotic NETwork sites (which are commonly located near emission hot spots) tends to cause a global spatial representativeness error of 30%, as a positive bias for the current top-down estimate of global BC direct radiative forcing. By contrast, the global spatial representativeness error will be 7% for the Global Atmosphere Watch network, because the sites are located in such a way that there are almost an equal number of sites with positive or negative representativeness error.
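The mechanism behind the representativeness error above is simple to demonstrate: a coarse model cell reports the mean of many fine cells, so a site sitting on an emission hot spot sees a much higher value than the model does. The field below is entirely synthetic:

```python
# A 2-degree model cell reported as the mean of 20 x 20 fine (0.1-degree)
# cells; the monitoring site sits on an emission hot spot (values invented).
fine = [[1.0 for _ in range(20)] for _ in range(20)]
fine[10][10] = 25.0                       # hot-spot cell containing the site
coarse_mean = sum(sum(row) for row in fine) / 400
site_value = fine[10][10]
rep_error = (coarse_mean - site_value) / site_value
print(round(rep_error, 2))  # -0.96: the coarse cell underestimates the site
```

Because AERONET-style sites cluster near such hot spots, scaling a coarse model up to match them inflates the global estimate, which is the sign of the 30% bias reported above.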
Hypothesis Testing Using Factor Score Regression
Devlieger, Ines; Mayer, Axel; Rosseel, Yves
2015-01-01
In this article, an overview is given of four methods to perform factor score regression (FSR), namely regression FSR, Bartlett FSR, the bias avoiding method of Skrondal and Laake, and the bias correcting method of Croon. The bias correcting method is extended to include a reliable standard error. The four methods are compared with each other and with structural equation modeling (SEM) by using analytic calculations and two Monte Carlo simulation studies to examine their finite sample characteristics. Several performance criteria are used, such as the bias using the unstandardized and standardized parameterization, efficiency, mean square error, standard error bias, type I error rate, and power. The results show that the bias correcting method, with the newly developed standard error, is the only suitable alternative for SEM. While it has a higher standard error bias than SEM, it has a comparable bias, efficiency, mean square error, power, and type I error rate. PMID:29795886
Wetherbee, G.A.; Latysh, N.E.; Gordon, J.D.
2005-01-01
Data from the U.S. Geological Survey (USGS) collocated-sampler program for the National Atmospheric Deposition Program/National Trends Network (NADP/NTN) are used to estimate the overall error of NADP/NTN measurements. Absolute errors are estimated by comparison of paired measurements from collocated instruments. Spatial and temporal differences in absolute error were identified and are consistent with longitudinal distributions of NADP/NTN measurements and spatial differences in precipitation characteristics. The magnitude of error for calcium, magnesium, ammonium, nitrate, and sulfate concentrations, specific conductance, and sample volume is of minor environmental significance to data users. Data collected after a 1994 sample-handling protocol change are prone to less absolute error than data collected prior to 1994. Absolute errors are smaller during non-winter months than during winter months for selected constituents at sites where frozen precipitation is common. Minimum resolvable differences are estimated for different regions of the USA to aid spatial and temporal watershed analyses.
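The collocated-sampler comparison above reduces to statistics of paired differences. The sketch below uses invented concentrations, and the 90th-percentile definition of a minimum resolvable difference is one plausible convention, not necessarily the USGS one:

```python
# Paired weekly concentrations (mg/L) from a primary and a collocated
# sampler at one site (synthetic values for illustration).
primary    = [0.21, 0.35, 0.18, 0.50, 0.27, 0.41]
collocated = [0.23, 0.33, 0.19, 0.46, 0.28, 0.44]

abs_err = sorted(abs(p - c) for p, c in zip(primary, collocated))
n = len(abs_err)
median_abs_err = (abs_err[n // 2 - 1] + abs_err[n // 2]) / 2 if n % 2 == 0 \
    else abs_err[n // 2]
# One way to define a minimum resolvable difference: the 90th-percentile
# absolute error; smaller differences cannot be distinguished from noise.
mrd = abs_err[int(0.9 * (n - 1))]
print(round(median_abs_err, 2), round(mrd, 2))  # 0.02 0.03
```

Computed per region or per season, such statistics support exactly the spatial and temporal comparisons of absolute error described in the abstract.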
NASA Astrophysics Data System (ADS)
Cattani, Giorgio; Gaeta, Alessandra; di Menno di Bucchianico, Alessandro; de Santis, Antonella; Gaddi, Raffaela; Cusano, Mariacarmela; Ancona, Carla; Badaloni, Chiara; Forastiere, Francesco; Gariazzo, Claudio; Sozzi, Roberto; Inglessis, Marco; Silibello, Camillo; Salvatori, Elisabetta; Manes, Fausto; Cesaroni, Giulia; The Viias Study Group
2017-05-01
The health effects of long-term exposure to ultrafine particles (UFPs) are poorly understood. Data on spatial contrasts in ambient UFP concentrations are needed at fine resolution. This study aimed to assess the spatial variability of total particle number concentrations (PNC, a proxy for UFPs) in the city of Rome, Italy, using land use regression (LUR) models, and the corresponding exposure of the resident population. PNC were measured using condensation particle counters at the building facade of 28 homes throughout the city. Three 7-day monitoring periods were carried out during cold, warm and intermediate seasons. Geographic Information System predictor variables, with buffers of varying size, were evaluated to model spatial variations of PNC. A stepwise forward selection procedure was used to develop a "base" linear regression model according to the European Study of Cohorts for Air Pollution Effects project methodology. Other variables were then included in more enhanced models and their capability of improving model performance was evaluated. Four LUR models were developed. Local variation in UFPs in the study area can be largely explained by the ratio of traffic intensity and distance to the nearest major road. The best model (adjusted R2 = 0.71; root mean square error = ±1,572 particles/cm³, leave-one-out cross-validated R2 = 0.68) was achieved by regressing building and street configuration variables against residuals from the "base" model, which added 3% more to the total variance explained. Urban green and population density in a 5,000 m buffer around each home were also relevant predictors. The spatial contrast in ambient PNC across the large conurbation of Rome was successfully assessed. The average exposure of subjects living in the study area was 16,006 particles/cm³ (SD 2,165 particles/cm³, range: 11,075-28,632 particles/cm³). 
A total of 203,886 subjects (16%) live in Rome within 50 m of a high-traffic road and experience the highest exposure levels (18,229 particles/cm³). The results will be used to estimate the long-term health effects of ultrafine particle exposure of participants in Rome.
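The two-stage procedure above, pick a "base" predictor by forward selection and then regress its residuals on further variables, can be sketched with one-variable OLS steps. The six sites, values, and predictor names below are all invented:

```python
# Forward selection for a LUR "base" model using one-variable OLS steps;
# site values and predictor names are invented for the sketch.
def ols_1d(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) \
         / sum((xi - mx) ** 2 for xi in x)
    b0 = my - b1 * mx
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    r2 = 1 - sum(e * e for e in resid) / sum((yi - my) ** 2 for yi in y)
    return b0, b1, r2, resid

pnc      = [12.0, 14.5, 18.0, 21.0, 25.5, 28.0]  # PNC (x1000 particles/cm3)
traffic  = [1.0, 2.0, 3.5, 4.5, 6.0, 7.0]        # traffic intensity / distance
greenery = [5.0, 4.0, 4.5, 3.0, 2.5, 1.0]        # urban green in a buffer

# Step 1: the candidate with the highest R^2 becomes the "base" model.
cands = {"traffic": traffic, "greenery": greenery}
base_name = max(cands, key=lambda k: ols_1d(cands[k], pnc)[2])
_, _, base_r2, resid = ols_1d(cands[base_name], pnc)
# Step 2: regress base-model residuals on a remaining predictor, as the
# abstract does for building and street-configuration variables.
other = next(k for k in cands if k != base_name)
_, _, added_r2, _ = ols_1d(cands[other], resid)
print(base_name, round(base_r2, 3))
```

The residual regression in step 2 is what lets a secondary variable "add 3% more to the total variance explained" without refitting the base model.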
Notes on power of normality tests of error terms in regression models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Střelec, Luboš
2015-03-10
Normality is one of the basic assumptions in applying statistical procedures. For example, in linear regression most of the inferential procedures are based on the assumption of normality, i.e. the disturbance vector is assumed to be normally distributed. Failure to assess non-normality of the error terms may lead to incorrect results of usual statistical inference techniques such as the t-test or F-test. Thus, error terms should be normally distributed in order to allow us to make exact inferences. As a consequence, normally distributed stochastic errors are necessary in order to draw inferences that are not misleading, which explains the necessity and importance of robust tests of normality. Therefore, the aim of this contribution is to discuss normality testing of error terms in regression models. In this contribution, we introduce the general RT class of robust tests for normality, and present and discuss the trade-off between power and robustness of selected classical and robust normality tests of error terms in regression models.
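A classical moment-based normality test applied to regression residuals illustrates the problem the contribution addresses. The sketch below uses the Jarque-Bera statistic (a standard classical test; the paper's RT class is a different, robust family) on simulated residuals:

```python
import math
import random

# Jarque-Bera normality check on two sets of simulated residuals; under
# normality the statistic is approximately chi-square with 2 df.
def jarque_bera(resid):
    n = len(resid)
    m = sum(resid) / n
    s2 = sum((e - m) ** 2 for e in resid) / n
    skew = sum((e - m) ** 3 for e in resid) / n / s2 ** 1.5
    kurt = sum((e - m) ** 4 for e in resid) / n / s2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)

rng = random.Random(1)
normal_resid = [rng.gauss(0.0, 1.0) for _ in range(500)]   # well-behaved errors
skewed_resid = [rng.expovariate(1.0) for _ in range(500)]  # skewed errors
jb_normal = jarque_bera(normal_resid)
jb_skewed = jarque_bera(skewed_resid)
print(jb_skewed > jb_normal)  # True: the skewed residuals are flagged
```

Compared against the chi-square(2) critical value of 5.99 at the 5% level, the skewed residuals are rejected decisively, which is the kind of power-versus-robustness behaviour the contribution studies across test families.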
Monitoring Building Deformation with InSAR: Experiments and Validation.
Yang, Kui; Yan, Li; Huang, Guoman; Chen, Chu; Wu, Zhengpeng
2016-12-20
Synthetic Aperture Radar Interferometry (InSAR) techniques are increasingly applied for monitoring land subsidence. The advantages of InSAR include high accuracy and the ability to cover large areas; nevertheless, research validating the use of InSAR on building deformation is limited. In this paper, we test the monitoring capability of InSAR in experiments on two landmark buildings, the Bohai Building and the China Theater, located in Tianjin, China. They were selected as real examples for comparing InSAR and leveling approaches to building deformation. Ten TerraSAR-X images spanning half a year were used in Permanent Scatterer InSAR processing. The extracted InSAR results were processed considering the diversity in both direction and spatial distribution, and were compared with true leveling values in both Ordinary Least Squares (OLS) regression and measurement-of-error analyses. The detailed experimental results for the Bohai Building and the China Theater showed a high correlation between InSAR results and the leveling values. At the same time, the two Root Mean Square Error (RMSE) indexes had values of approximately 1 mm. These analyses show that a millimeter level of accuracy can be achieved with the InSAR technique when measuring building deformation. We discuss the differences in accuracy between OLS regression and measurement-of-error analyses, and compare them with the accuracy index of leveling in order to propose InSAR accuracy levels appropriate for monitoring building deformation. After assessing the advantages and limitations of InSAR techniques in monitoring buildings, further applications are evaluated.
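The OLS-plus-RMSE validation described above amounts to a few lines once InSAR and leveling values are paired at the same points. The measurements below are synthetic stand-ins, not the Bohai Building data:

```python
import math

# Paired deformation measurements (mm) at persistent-scatterer points;
# synthetic values standing in for the InSAR-vs-leveling comparison.
leveling = [-3.2, -1.0, 0.5, 2.1, 4.0, 6.3]
insar    = [-3.0, -1.4, 0.9, 1.8, 4.3, 6.0]

n = len(leveling)
mx, my = sum(leveling) / n, sum(insar) / n
sxx = sum((x - mx) ** 2 for x in leveling)
syy = sum((y - my) ** 2 for y in insar)
sxy = sum((x - mx) * (y - my) for x, y in zip(leveling, insar))
slope = sxy / sxx                  # OLS regression of InSAR on leveling
r = sxy / math.sqrt(sxx * syy)     # correlation between the two techniques
rmse = math.sqrt(sum((y - x) ** 2 for x, y in zip(leveling, insar)) / n)
print(round(r, 3), round(rmse, 2))  # high correlation, sub-millimetre RMSE
```

A slope near 1 with high correlation indicates agreement in trend, while the RMSE of the raw differences is the "approximately 1 mm" accuracy figure of merit the paper reports.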
Gurdak, Jason J.; Qi, Sharon L.; Geisler, Michael L.
2009-01-01
The U.S. Geological Survey Raster Error Propagation Tool (REPTool) is a custom tool for use with the Environmental System Research Institute (ESRI) ArcGIS Desktop application to estimate error propagation and prediction uncertainty in raster processing operations and geospatial modeling. REPTool is designed to introduce concepts of error and uncertainty in geospatial data and modeling and provide users of ArcGIS Desktop a geoprocessing tool and methodology to consider how error affects geospatial model output. Similar to other geoprocessing tools available in ArcGIS Desktop, REPTool can be run from a dialog window, from the ArcMap command line, or from a Python script. REPTool consists of public-domain, Python-based packages that implement Latin Hypercube Sampling within a probabilistic framework to track error propagation in geospatial models and quantitatively estimate the uncertainty of the model output. Users may specify error for each input raster or model coefficient represented in the geospatial model. The error for the input rasters may be specified as either spatially invariant or spatially variable across the spatial domain. Users may specify model output as a distribution of uncertainty for each raster cell. REPTool uses the Relative Variance Contribution method to quantify the relative error contribution from the two primary components in the geospatial model - errors in the model input data and coefficients of the model variables. REPTool is appropriate for many types of geospatial processing operations, modeling applications, and related research questions, including applications that consider spatially invariant or spatially variable error in geospatial data.
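The core of the approach above, Latin Hypercube Sampling of uncertain inputs propagated through a raster operation, can be sketched independently of ArcGIS. The toy model and names below are invented and do not reflect REPTool's actual interface:

```python
import random

def latin_hypercube(n, rng):
    """n stratified uniform(0,1) draws: one from each of n equal bins."""
    samples = [(i + rng.random()) / n for i in range(n)]
    rng.shuffle(samples)
    return samples

# Toy geospatial model: output = coef * input raster, with a spatially
# invariant uncertain coefficient (uniform 0.9..1.1); values invented.
input_raster = [[10.0, 12.0], [8.0, 15.0]]
rng = random.Random(42)
n = 200
coef_samples = [0.9 + 0.2 * u for u in latin_hypercube(n, rng)]

# Propagate: each raster cell gets a distribution of outputs, not one value.
cell_dists = [[[c * v for c in coef_samples] for v in row]
              for row in input_raster]
mean00 = sum(cell_dists[0][0]) / n
spread00 = max(cell_dists[0][0]) - min(cell_dists[0][0])
print(round(mean00, 1), round(spread00, 1))  # ≈ 10.0 and ≈ 2.0
```

Stratifying the draws guarantees the whole 0.9..1.1 range is covered with far fewer samples than plain Monte Carlo, which is why LHS suits per-cell uncertainty rasters.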
Entropy of space-time outcome in a movement speed-accuracy task.
Hsieh, Tsung-Yu; Pacheco, Matheus Maia; Newell, Karl M
2015-12-01
The experiment reported here was set up to investigate the space-time entropy of movement outcome as a function of a range of spatial (10, 20 and 30 cm) and temporal (250-2500 ms) criteria in a discrete aiming task. Considered separately, the variability and information entropy of the spatial and temporal movement errors respectively increased and decreased as movement velocity increased. However, the joint space-time entropy was lowest when the relative contribution of spatial and temporal task criteria was comparable (i.e., mid-range of space-time constraints), and it increased with a greater trade-off between spatial or temporal task demands, revealing a U-shaped function across space-time task criteria. The traditional speed-accuracy functions of spatial error and temporal error considered independently mapped to this joint space-time U-shaped entropy function. The trade-off in movement tasks with joint space-time criteria is between spatial error and timing error, rather than movement speed and accuracy. Copyright © 2015 Elsevier B.V. All rights reserved.
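The opposing entropy trends on the two dimensions can be reproduced with binned Shannon entropy on simulated trials. The error spreads below are invented, chosen only so that fast movements are spatially loose but temporally tight and slow movements the reverse:

```python
import math
import random

def shannon_entropy(vals, bin_w):
    """Entropy (bits) of values discretized into bins of width bin_w."""
    counts = {}
    for v in vals:
        k = int(v // bin_w)
        counts[k] = counts.get(k, 0) + 1
    n = len(vals)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

rng = random.Random(3)
# Synthetic aiming trials: (spatial error cm, temporal error ms).
fast = [(rng.gauss(0, 2.0), rng.gauss(0, 10.0)) for _ in range(2000)]
slow = [(rng.gauss(0, 0.5), rng.gauss(0, 40.0)) for _ in range(2000)]

hs_fast = shannon_entropy([s for s, _ in fast], 0.5)   # spatial, 0.5 cm bins
hs_slow = shannon_entropy([s for s, _ in slow], 0.5)
ht_fast = shannon_entropy([t for _, t in fast], 10.0)  # temporal, 10 ms bins
ht_slow = shannon_entropy([t for _, t in slow], 10.0)
print(hs_fast > hs_slow, ht_fast < ht_slow)  # True True: the trade-off
```

Summing the two marginal entropies at each speed hints at why the joint entropy is U-shaped: the total is smallest when neither dimension dominates.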
Sampson, Maureen L; Gounden, Verena; van Deventer, Hendrik E; Remaley, Alan T
2016-02-01
The main drawback of the periodic analysis of quality control (QC) material is that test performance is not monitored in time periods between QC analyses, potentially leading to the reporting of faulty test results. The objective of this study was to develop a patient based QC procedure for the more timely detection of test errors. Results from a Chem-14 panel measured on the Beckman LX20 analyzer were used to develop the model. Each test result was predicted from the other 13 members of the panel by multiple regression, which resulted in correlation coefficients between the predicted and measured result of >0.7 for 8 of the 14 tests. A logistic regression model, which utilized the measured test result, the predicted test result, the day of the week and time of day, was then developed for predicting test errors. The output of the logistic regression was tallied by a daily CUSUM approach and used to predict test errors, with a fixed specificity of 90%. The mean average run length (ARL) before error detection by CUSUM-Logistic Regression (CSLR) was 20 with a mean sensitivity of 97%, which was considerably shorter than the mean ARL of 53 (sensitivity 87.5%) for a simple prediction model that only used the measured result for error detection. A CUSUM-Logistic Regression analysis of patient laboratory data can be an effective approach for the rapid and sensitive detection of clinical laboratory errors. Published by Elsevier Inc.
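The prediction-plus-CUSUM idea above can be sketched with one analyte predicted from a correlated panel member. Everything below is invented: the sodium/chloride relation, the injected shift, and the CUSUM parameters are illustrative, not the study's fitted model:

```python
import random

rng = random.Random(7)
# Synthetic panel: sodium predicted from chloride (strongly co-varying),
# with a +5 unit analyzer shift injected at sample 60.
def draw(i):
    cl = rng.gauss(100.0, 4.0)
    na = 40.0 + cl + rng.gauss(0.0, 1.0)
    if i >= 60:
        na += 5.0                      # simulated calibration error
    return cl, na

data = [draw(i) for i in range(120)]
cusum, alarm_at = 0.0, None
k, h = 1.0, 8.0                        # CUSUM slack and decision threshold
for i, (cl, na) in enumerate(data):
    resid = na - (40.0 + cl)           # measured minus model-predicted sodium
    cusum = max(0.0, cusum + resid - k)
    if cusum > h and alarm_at is None:
        alarm_at = i
print(alarm_at)  # alarms within a few samples of the shift at index 60
```

Because every patient result feeds the tally, the error is caught a few samples after onset rather than at the next scheduled QC run, which is the shorter average run length the study reports.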
NASA Astrophysics Data System (ADS)
Santos, Monica; Fragoso, Marcelo
2010-05-01
Extreme precipitation events are one of the causes of natural hazards, such as floods and landslides, making their investigation important, and this research aims to contribute to the study of extreme rainfall patterns in a Portuguese mountainous area. The study area is centred on the Arcos de Valdevez county, located in the northwest region of Portugal, the rainiest of the country, with more than 3000 mm of annual rainfall at the Peneda-Gerês mountain system. This work focuses on two main subjects related to the precipitation variability of the study area. First, a statistical analysis of several precipitation parameters is carried out, using daily data from 17 rain-gauges with a complete record for the 1960-1995 period. This approach aims to evaluate the main spatial contrasts regarding different aspects of the rainfall regime, described by ten parameters and indices of precipitation extremes (e.g. mean annual precipitation, the annual frequency of precipitation days, wet spell durations, maximum daily precipitation, maximum precipitation in 30 days, number of days with rainfall exceeding 100 mm and estimated maximum daily rainfall for a return period of 100 years). The results show that the highest precipitation amounts (from annual to daily scales) and the higher frequency of very abundant rainfall events occur in the Serra da Peneda and Gerês mountains, in contrast to the valleys of the Lima, Minho and Vez rivers, with lower precipitation amounts and less frequent heavy storms. The second purpose of this work is to find a method for mapping extreme rainfall in this mountainous region, investigating the complex influence of the relief (e.g. elevation, topography) on the precipitation patterns, as well as other geographical variables (e.g. distance from coast, latitude), applying tested geo-statistical techniques (Goovaerts, 2000; Diodato, 2005). 
Models of linear regression were applied to evaluate the influence of different geographical variables (altitude, latitude, distance from sea and distance to the highest orographic barrier) on the rainfall behaviour described by the studied variables. The techniques of spatial interpolation evaluated include univariate and multivariate methods: cokriging, kriging, IDW (inverse distance weighted) and multiple linear regression. Validation procedures were used, assessing the estimation errors through descriptive statistics of the models. Multiple linear regression models produced satisfactory results for 70% of the rainfall parameters, as indicated by lower average percentage errors. However, the results also demonstrate that there is no single ideal model; the best choice depends on the rainfall parameter under consideration. The unsatisfactory results for some rainfall parameters were probably due to constraints such as the spatial complexity of the precipitation patterns and the deficient spatial coverage of the territory by the rain-gauge network. References Diodato, N. (2005). The influence of topographic co-variables on the spatial variability of precipitation over small regions of complex terrain. International Journal of Climatology, 25(3), 351-363. Goovaerts, P. (2000). Geostatistical approaches for incorporating elevation into the spatial interpolation of rainfall. Journal of Hydrology, 228, 113-129.
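One of the interpolation methods compared above, IDW with leave-one-out validation, is short enough to sketch in full. The gauge coordinates and precipitation totals below are invented stand-ins for the Arcos de Valdevez network:

```python
import math

# Rain gauges as (x, y, annual precipitation in mm); coordinates and
# values are invented for the sketch.
gauges = [(0, 0, 3100), (1, 0, 2900), (0, 1, 3000),
          (1, 1, 2750), (2, 1, 2500), (2, 2, 2300)]

def idw(x, y, pts, power=2):
    """Inverse-distance-weighted estimate at (x, y)."""
    num = den = 0.0
    for gx, gy, gz in pts:
        d2 = (x - gx) ** 2 + (y - gy) ** 2
        if d2 == 0.0:
            return float(gz)          # exactly at a gauge
        w = d2 ** (-power / 2.0)
        num += w * gz
        den += w
    return num / den

# Leave-one-out validation: drop each gauge and predict it from the rest.
errors = [idw(gx, gy, gauges[:i] + gauges[i + 1:]) - gz
          for i, (gx, gy, gz) in enumerate(gauges)]
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
print(round(rmse, 1))  # cross-validated error of the IDW scheme (mm)
```

Repeating the same leave-one-out loop for each candidate interpolator (kriging, cokriging, regression) gives the comparable error statistics on which the model ranking in the abstract rests.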
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gill, Robert; Bush, Evan; Loutzenhiser, Peter, E-mail: peter.loutzenhiser@me.gatech.edu
2015-12-15
A systematic methodology for characterizing a novel and newly fabricated high-flux solar simulator is presented. The high-flux solar simulator consists of seven xenon short-arc lamps mounted in truncated ellipsoidal reflectors. Characterization of the spatial radiative heat flux distribution was performed using calorimetric measurements of heat flow coupled with CCD camera imaging of a Lambertian target mounted in the focal plane. The calorimetric measurements and images of the Lambertian target were obtained in two separate runs under identical conditions. Detailed modeling of the high-flux solar simulator was accomplished using Monte Carlo ray tracing to capture radiative heat transport. A least-squares regression model was used to combine the Monte Carlo radiative heat transfer analysis with the experimental data to account for manufacturing defects. The Monte Carlo ray tracing was calibrated by regressing modeled radiative heat flux as a function of specular error and electric-power-to-radiation conversion onto measured radiative heat flux from experimental results. Specular error and electric-power-to-radiation conversion efficiency were 5.92 ± 0.05 mrad and 0.537 ± 0.004, respectively. An average radiative heat flux with 95% error bounds of 4880 ± 223 kW·m⁻² was measured over a 40 mm diameter with a cavity-type calorimeter with an apparent absorptivity of 0.994. The Monte Carlo ray tracing resulted in an average radiative heat flux of 893.3 kW·m⁻² for a single lamp, comparable to the measured radiative heat flux with 95% error bounds of 892.5 ± 105.3 kW·m⁻² from calorimetry.
Calibration of the carbon isotope composition (δ13C) of benthic foraminifera
NASA Astrophysics Data System (ADS)
Schmittner, Andreas; Bostock, Helen C.; Cartapanis, Olivier; Curry, William B.; Filipsson, Helena L.; Galbraith, Eric D.; Gottschalk, Julia; Herguera, Juan Carlos; Hoogakker, Babette; Jaccard, Samuel L.; Lisiecki, Lorraine E.; Lund, David C.; Martínez-Méndez, Gema; Lynch-Stieglitz, Jean; Mackensen, Andreas; Michel, Elisabeth; Mix, Alan C.; Oppo, Delia W.; Peterson, Carlye D.; Repschläger, Janne; Sikes, Elisabeth L.; Spero, Howard J.; Waelbroeck, Claire
2017-06-01
The carbon isotope composition (δ13C) of seawater provides valuable insight into ocean circulation, air-sea exchange, the biological pump, and the global carbon cycle and is reflected by the δ13C of foraminifera tests. Here more than 1700 δ13C observations of the benthic foraminifera genus Cibicides from late Holocene sediments (δ13CCibnat) are compiled and compared with newly updated estimates of the natural (preindustrial) water column δ13C of dissolved inorganic carbon (δ13CDICnat) as part of the international Ocean Circulation and Carbon Cycling (OC3) project. Using selection criteria based on the spatial distance between samples, we find high correlation between δ13CCibnat and δ13CDICnat, confirming earlier work. Regression analyses indicate significant carbonate ion ([CO₃²⁻]; (-2.6 ± 0.4) × 10⁻³‰ (μmol kg⁻¹)⁻¹) and pressure ((-4.9 ± 1.7) × 10⁻⁵‰ m⁻¹ of depth) effects, which we use to propose a new global calibration for predicting δ13CDICnat from δ13CCibnat. This calibration is shown to remove some systematic regional biases and decrease errors compared with the one-to-one relationship (δ13CDICnat = δ13CCibnat). However, these effects and the error reductions are relatively small, which suggests that most conclusions from previous studies using a one-to-one relationship remain robust. The remaining standard error of the regression is generally σ ≅ 0.25‰, with larger values found in the southeast Atlantic and Antarctic (σ ≅ 0.4‰) and for species other than Cibicides wuellerstorfi. Discussion of species effects and possible sources of the remaining errors may aid future attempts to improve the use of the benthic δ13C record.
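The proposed calibration is, at its core, a multiple linear regression of δ13CDICnat on δ13CCibnat, carbonate ion concentration and depth. A sketch of such a fit on synthetic data follows; the predictor ranges and noise level are assumptions, loosely scaled to the effect sizes quoted in the abstract, not the study's actual data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
# Synthetic predictors (ranges are illustrative only)
d13c_cib = rng.uniform(-0.5, 1.5, n)    # benthic foram d13C (permil)
co3 = rng.uniform(50.0, 250.0, n)       # carbonate ion (umol/kg)
depth = rng.uniform(1000.0, 5000.0, n)  # water depth (m)

# Generate d13C_DIC with effect sizes of the same order as the abstract,
# plus a residual sigma ~ 0.25 permil
d13c_dic = (d13c_cib - 2.6e-3 * (co3 - 150.0) - 4.9e-5 * (depth - 3000.0)
            + rng.normal(0.0, 0.25, n))

# Multiple linear regression: d13C_DIC ~ 1 + d13C_Cib + [CO3] + depth
X = np.column_stack([np.ones(n), d13c_cib, co3, depth])
coef, *_ = np.linalg.lstsq(X, d13c_dic, rcond=None)
```

With 200 samples the fitted carbonate-ion and depth coefficients recover the generating values to well within their standard errors, which is the same logic the calibration applies to the real compilation.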
NASA Astrophysics Data System (ADS)
Fernández-Manso, O.; Fernández-Manso, A.; Quintano, C.
2014-09-01
Aboveground biomass (AGB) estimation from optical satellite data is usually based on regression models of original or synthetic bands. To overcome the poor relation between AGB and spectral bands due to mixed pixels when a medium spatial resolution sensor is considered, we propose to base the AGB estimation on fraction images from Linear Spectral Mixture Analysis (LSMA). Our study area is a managed Mediterranean pine woodland (Pinus pinaster Ait.) in central Spain. A total of 1033 circular field plots were used to estimate AGB from Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) optical data. We applied Pearson correlation statistics and stepwise multiple regression to identify suitable predictors from the set of variables of original bands, fraction imagery, Normalized Difference Vegetation Index and Tasselled Cap components. Four linear models and one nonlinear model were tested. A linear combination of ASTER band 2 (red, 0.630-0.690 μm), band 8 (short wave infrared 5, 2.295-2.365 μm) and green vegetation fraction (from LSMA) was the best AGB predictor (Radj² = 0.632; the cross-validated root-mean-square error of estimated AGB was 13.3 Mg ha⁻¹, or 37.7%), outperforming other combinations of the above-cited independent variables. Results indicated that using ASTER fraction images in regression models improves AGB estimation in Mediterranean pine forests. The spatial distribution of the estimated AGB, based on a multiple linear regression model, may be used as baseline information for forest managers in future studies, such as quantifying the regional carbon budget, fuel accumulation or monitoring of management practices.
Jacob, Benjamin J; Krapp, Fiorella; Ponce, Mario; Gottuzzo, Eduardo; Griffith, Daniel A; Novak, Robert J
2010-05-01
Spatial autocorrelation is problematic for the classical hierarchical cluster detection tests commonly used in multi-drug resistant tuberculosis (MDR-TB) analyses, as considerable random error can occur. Therefore, when MDR-TB clusters are spatially autocorrelated, the assumption that the clusters are independently random is invalid. In this research, a product moment correlation coefficient (i.e., Moran's coefficient) was used to quantify local spatial variation in multiple clinical and environmental predictor variables sampled in San Juan de Lurigancho, Lima, Peru. Initially, QuickBird 0.61 m data, encompassing the visible and near-infrared bands, were selected to synthesize images of land cover attributes of the study site. Residential addresses of individual patients with smear-positive MDR-TB were geocoded, prevalence rates were calculated, and the data were then digitally overlaid onto the satellite data within a 2 km buffer of 31 georeferenced health centers, using a 10 m² grid-based algorithm. Geographical information system (GIS)-gridded measurements of each health center were generated based on preliminary base maps of the georeferenced data aggregated to block groups and census tracts within each buffered area. A three-dimensional model of the study site was constructed from a digital elevation model (DEM) to determine terrain covariates associated with the sampled MDR-TB covariates. Pearson's correlation was used to evaluate the linear relationship between the DEM and the sampled MDR-TB data. A SAS/GIS® module was then used to calculate univariate statistics and to perform linear and non-linear regression analyses using the sampled predictor variables. The estimates generated from a global autocorrelation analysis were then spatially decomposed into empirical orthogonal bases using a negative binomial regression with a non-homogeneous mean.
Results of the DEM analyses indicated a statistically non-significant linear relationship between georeferenced health centers and the sampled covariate elevation. The data exhibited positive spatial autocorrelation, and the decomposition of Moran's coefficient into uncorrelated, orthogonal map pattern components revealed global spatial heterogeneities necessary to capture latent autocorrelation in the MDR-TB model. It was thus shown that Poisson regression analyses and spatial eigenvector mapping can elucidate the mechanics of MDR-TB transmission by prioritizing clinical and environmental predictor variables for identifying high-risk populations.
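Moran's coefficient, the product moment correlation used here to quantify spatial autocorrelation, can be computed directly from a values vector and a spatial weights matrix. A minimal sketch; the four-cell example is invented, not from the study:

```python
def morans_i(values, weights):
    """Moran's I for values[i] with spatial weights[i][j] (0 on the diagonal)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)          # sum of all weights
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)

# Four cells on a line with rook adjacency; a smooth gradient of values
# is positively autocorrelated (I > 0)
vals = [1.0, 2.0, 3.0, 4.0]
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
i_stat = morans_i(vals, w)
```

For this gradient the statistic works out to exactly 1/3, confirming positive autocorrelation; shuffled values would push it toward its slightly negative expected value under independence, -1/(n-1).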
Evaluate error correction ability of magnetorheological finishing by smoothing spectral function
NASA Astrophysics Data System (ADS)
Wang, Jia; Fan, Bin; Wan, Yongjian; Shi, Chunyan; Zhuo, Bin
2014-08-01
Power spectral density (PSD) is entrenched in optics design and manufacturing as a characterization of mid-to-high spatial frequency (MHSF) errors. The smoothing spectral function (SSF) is a newly proposed parameter, based on the PSD, for evaluating the error correction ability of computer-controlled optical surfacing (CCOS) technologies. As a typical deterministic, sub-aperture finishing technology based on CCOS, magnetorheological finishing (MRF) inevitably introduces MHSF errors. The SSF is employed to investigate the ability of the MRF process to correct errors at different spatial frequencies. The surface figures and PSD curves of a workpiece machined by MRF are presented. By calculating the SSF curve, the correction ability of MRF for errors at different spatial frequencies is expressed as a normalized numerical value.
Gabriel, Mark C; Kolka, Randy; Wickman, Trent; Nater, Ed; Woodruff, Laurel
2009-06-15
The primary objective of this research is to investigate relationships between mercury in upland soil, lake water and fish tissue and to explore the cause of the observed spatial variation of THg in age-one yellow perch (Perca flavescens) for ten lakes within the Superior National Forest. Spatial relationships between yellow perch THg tissue concentration and a total of 45 watershed and water chemistry parameters were evaluated for two separate years: 2005 and 2006. Results agree with other studies in which watershed area, lake water pH, nutrient levels (specifically dissolved NO₃⁻-N) and dissolved iron are important factors controlling and/or predicting fish THg level. Strongest of all was the dependence of yellow perch THg level on soil A-horizon THg and, in particular, soil O-horizon THg concentrations (Spearman ρ = 0.81). Soil B-horizon THg concentration was significantly correlated (Pearson r = 0.75) with lake water THg concentration. Lakes surrounded by a greater percentage of shrub wetlands (peatlands) had higher fish tissue THg levels; it is therefore highly possible that these wetlands are the main locations for mercury methylation. Stepwise regression was used to develop empirical models for predicting the spatial variation in yellow perch THg over the studied region. The 2005 regression model demonstrates that it is possible to obtain good prediction (up to 60% variance description) of resident yellow perch THg level using upland soil O-horizon THg as the only independent variable. The 2006 model shows even greater prediction (r² = 0.73, with an overall 10 ng/g [tissue, wet weight] margin of error), using lake water dissolved iron and watershed area as the only independent variables. The regression models developed in this study can help with interpreting THg concentrations in low trophic level fish species for untested lakes of the greater Superior National Forest and the surrounding boreal ecosystem.
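The reported soil-fish association is a Spearman rank correlation. For illustration, the no-ties rank-difference formula is compact enough to verify by hand; the THg values below are invented, not the study's measurements:

```python
def spearman_rho(x, y):
    """Spearman rank correlation (assumes no tied values) via 1 - 6*sum(d^2)/(n(n^2-1))."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d2 / (n * (n ** 2 - 1))

# Hypothetical soil O-horizon THg (ng/g) vs fish THg (ug/g): a perfectly
# monotonic relationship gives rho = 1 regardless of its shape
soil = [10.0, 25.0, 40.0, 55.0, 80.0]
fish = [0.02, 0.05, 0.06, 0.09, 0.15]
rho = spearman_rho(soil, fish)
```

Because Spearman works on ranks, it captures the monotonic soil-to-fish dependence even when the underlying relationship is nonlinear, which is presumably why it was preferred to Pearson for that comparison.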
Generation, Validation, and Application of Abundance Map Reference Data for Spectral Unmixing
NASA Astrophysics Data System (ADS)
Williams, McKay D.
Reference data ("ground truth") maps traditionally have been used to assess the accuracy of imaging spectrometer classification algorithms. However, these reference data can be prohibitively expensive to produce, often do not include sub-pixel abundance estimates necessary to assess spectral unmixing algorithms, and lack published validation reports. Our research proposes methodologies to efficiently generate, validate, and apply abundance map reference data (AMRD) to airborne remote sensing scenes. We generated scene-wide AMRD for three different remote sensing scenes using our remotely sensed reference data (RSRD) technique, which spatially aggregates unmixing results from fine scale imagery (e.g., 1-m Ground Sample Distance (GSD)) to co-located coarse scale imagery (e.g., 10-m GSD or larger). We validated the accuracy of this methodology by estimating AMRD in 51 randomly-selected 10 m x 10 m plots, using seven independent methods and observers, including field surveys by two observers, imagery analysis by two observers, and RSRD using three algorithms. Results indicated statistically-significant differences between all versions of AMRD, suggesting that all forms of reference data need to be validated. Given these significant differences between the independent versions of AMRD, we proposed that the mean of all (MOA) versions of reference data for each plot and class were most likely to represent true abundances. We then compared each version of AMRD to MOA. Best case accuracy was achieved by a version of imagery analysis, which had a mean coverage area error of 2.0%, with a standard deviation of 5.6%. One of the RSRD algorithms was nearly as accurate, achieving a mean error of 3.0%, with a standard deviation of 6.3%, showing the potential of RSRD-based AMRD generation. 
Application of validated AMRD to specific coarse scale imagery involved three main parts: 1) spatial alignment of coarse and fine scale imagery, 2) aggregation of fine scale abundances to produce coarse scale imagery-specific AMRD, and 3) demonstration of comparisons between coarse scale unmixing abundances and AMRD. Spatial alignment was performed using our scene-wide spectral comparison (SWSC) algorithm, which aligned imagery with accuracy approaching the distance of a single fine scale pixel. We compared simple rectangular aggregation to coarse sensor point spread function (PSF) aggregation, and found that the PSF approach returned lower error, but that rectangular aggregation more accurately estimated true abundances at ground level. We demonstrated various metrics for comparing unmixing results to AMRD, including mean absolute error (MAE) and linear regression (LR). We additionally introduced reference data mean adjusted MAE (MA-MAE), and reference data confidence interval adjusted MAE (CIA-MAE), which account for known error in the reference data itself. MA-MAE analysis indicated that fully constrained linear unmixing of coarse scale imagery across all three scenes returned an error of 10.83% per class and pixel, with regression analysis yielding a slope = 0.85, intercept = 0.04, and R2 = 0.81. Our reference data research has demonstrated a viable methodology to efficiently generate, validate, and apply AMRD to specific examples of airborne remote sensing imagery, thereby enabling direct quantitative assessment of spectral unmixing performance.
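Of the comparison metrics listed, plain MAE between per-pixel unmixing abundances and the reference abundances is the simplest. A sketch with invented abundance fractions (the MA-MAE and CIA-MAE adjustments are not reproduced here):

```python
def mae(est, ref):
    """Mean absolute error between unmixing abundance estimates and reference abundances."""
    return sum(abs(e - r) for e, r in zip(est, ref)) / len(est)

# Hypothetical per-pixel abundances for one class (fractions of pixel area)
unmixed =   [0.20, 0.55, 0.10, 0.80, 0.40]
reference = [0.25, 0.50, 0.05, 0.75, 0.50]
err = mae(unmixed, reference)
```

Here the error is 0.06, i.e. 6% coverage-area error per pixel, directly comparable to the percentage errors quoted in the abstract.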
Accounting for measurement error in log regression models with applications to accelerated testing.
Richardson, Robert; Tolley, H Dennis; Evenson, William E; Lunt, Barry M
2018-01-01
In regression settings, parameter estimates will be biased when the explanatory variables are measured with error. This bias can significantly affect modeling goals. In particular, accelerated lifetime testing involves an extrapolation of the fitted model, and a small amount of bias in parameter estimates may result in a significant increase in the bias of the extrapolated predictions. Additionally, bias may arise when the stochastic component of a log regression model is assumed to be multiplicative when the actual underlying stochastic component is additive. To account for these possible sources of bias, a log regression model with measurement error and additive error is approximated by a weighted regression model which can be estimated using Iteratively Re-weighted Least Squares. Using the reduced Eyring equation in an accelerated testing setting, the model is compared to previously accepted approaches to modeling accelerated testing data with both simulations and real data.
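The weighted-regression approximation described above is fitted by iteratively reweighted least squares. A generic IRLS loop for a straight-line fit is sketched below, reweighting by inverse squared fitted values (one common variance assumption for additive error on an otherwise multiplicative model; not necessarily the authors' exact weighting, and the data are invented):

```python
def weighted_fit(x, y, w):
    """Weighted least-squares line y ~ a + b*x; returns (intercept, slope)."""
    sw = sum(w)
    xm = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ym = sum(wi * yi for wi, yi in zip(w, y)) / sw
    b = (sum(wi * (xi - xm) * (yi - ym) for wi, xi, yi in zip(w, x, y))
         / sum(wi * (xi - xm) ** 2 for wi, xi in zip(w, x)))
    return ym - b * xm, b

def irls(x, y, n_iter=20):
    """IRLS: start unweighted, then reweight by inverse squared fitted values
    (i.e., variance assumed proportional to the squared mean)."""
    w = [1.0] * len(x)
    for _ in range(n_iter):
        a, b = weighted_fit(x, y, w)
        w = [1.0 / max(a + b * xi, 1e-6) ** 2 for xi in x]
    return a, b

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]   # roughly y = 2x with small noise
a, b = irls(x, y)
```

Each pass solves an ordinary weighted regression, then updates the weights from the current fit; for well-behaved data the weights stabilize after a few iterations.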
Correction of mid-spatial-frequency errors by smoothing in spin motion for CCOS
NASA Astrophysics Data System (ADS)
Zhang, Yizhong; Wei, Chaoyang; Shao, Jianda; Xu, Xueke; Liu, Shijie; Hu, Chen; Zhang, Haichao; Gu, Haojin
2015-08-01
Smoothing is a convenient and efficient way to correct mid-spatial-frequency errors. Quantifying the smoothing effect allows improvements in efficiency when finishing precision optics. A series of experiments in spin motion was performed to study the smoothing effect on the correction of mid-spatial-frequency errors. Some experiments used the same pitch tool at different spinning speeds, and others the same spinning speed with different tools. Shu's model was introduced and improved to describe and compare the smoothing efficiency at different spinning speeds and with different tools. The experimental results show that the mid-spatial-frequency errors on the initial surface were nearly smoothed out after the spin-motion process, and that the number of smoothing passes can be estimated by the model before processing. The method was also applied to smooth an aspherical component that had an obvious mid-spatial-frequency error after magnetorheological finishing (MRF) processing. As a result, a high-precision aspheric optical component was obtained with PV = 0.1λ and RMS = 0.01λ.
Andrews, Elisabeth; Balkanski, Yves; Boucher, Olivier; Myhre, Gunnar; Samset, Bjørn Hallvard; Schulz, Michael; Schuster, Gregory L.; Valari, Myrto; Tao, Shu
2018-01-01
There is high uncertainty in the direct radiative forcing of black carbon (BC), an aerosol that strongly absorbs solar radiation. The observation‐constrained estimate, which is several times larger than the bottom‐up estimate, is influenced by the spatial representativeness error due to the mesoscale inhomogeneity of the aerosol fields and the relatively low resolution of global chemistry‐transport models. Here we evaluated the spatial representativeness error for two widely used observational networks (AErosol RObotic NETwork and Global Atmosphere Watch) by downscaling the geospatial grid in a global model of BC aerosol absorption optical depth to 0.1° × 0.1°. Comparing the models at a spatial resolution of 2° × 2° with BC aerosol absorption at AErosol RObotic NETwork sites (which are commonly located near emission hot spots) tends to cause a global spatial representativeness error of 30%, as a positive bias for the current top‐down estimate of global BC direct radiative forcing. By contrast, the global spatial representativeness error will be 7% for the Global Atmosphere Watch network, because the sites are located in such a way that there are almost an equal number of sites with positive or negative representativeness error. PMID:29937603
ERIC Educational Resources Information Center
Li, Deping; Oranje, Andreas
2007-01-01
Two versions of a general method for approximating standard error of regression effect estimates within an IRT-based latent regression model are compared. The general method is based on Binder's (1983) approach, accounting for complex samples and finite populations by Taylor series linearization. In contrast, the current National Assessment of…
Waltemeyer, Scott D.
2008-01-01
Estimates of the magnitude and frequency of peak discharges are necessary for the reliable design of bridges, culverts, and open-channel hydraulic analysis, and for flood-hazard mapping in New Mexico and surrounding areas. The U.S. Geological Survey, in cooperation with the New Mexico Department of Transportation, updated estimates of peak-discharge magnitude for gaging stations in the region and updated regional equations for estimation of peak discharge and frequency at ungaged sites. Equations were developed for estimating the magnitude of peak discharges for recurrence intervals of 2, 5, 10, 25, 50, 100, and 500 years at ungaged sites by use of data collected through 2004 for 293 gaging stations on unregulated streams that have 10 or more years of record. Peak discharges for selected recurrence intervals were determined at gaging stations by fitting observed data to a log-Pearson Type III distribution with adjustments for a low-discharge threshold and a zero skew coefficient. A low-discharge threshold was applied to the frequency analysis of 140 of the 293 gaging stations. This application provides an improved fit of the log-Pearson Type III frequency distribution. Use of the low-discharge threshold generally eliminated peak discharges having a recurrence interval of less than 1.4 years from the probability-density function. Within each of the nine regions, logarithms of the maximum peak discharges for selected recurrence intervals were related to logarithms of basin and climatic characteristics by using stepwise ordinary least-squares regression techniques for exploratory data analysis. Generalized least-squares regression techniques, an improved regression procedure that accounts for time and spatial sampling errors, then were applied to the same data used in the ordinary least-squares regression analyses.
The average standard error of prediction, which includes average sampling error and average standard error of regression, ranged from 38 to 93 percent (mean value is 62, and median value is 59) for the 100-year flood. The 1996 investigation standard error of prediction for the flood regions ranged from 41 to 96 percent (mean value is 67, and median value is 68) for the 100-year flood that was analyzed by using generalized least-squares regression analysis. Overall, the equations based on generalized least-squares regression techniques are more reliable than those in the 1996 report because of the increased length of record and improved geographic information system (GIS) method to determine basin and climatic characteristics. Flood-frequency estimates can be made for ungaged sites upstream or downstream from gaging stations by using a method that transfers flood-frequency data at the gaging station to the ungaged site by using a drainage-area ratio adjustment equation. The peak discharge for a given recurrence interval at the gaging station, drainage-area ratio, and the drainage-area exponent from the regional regression equation of the respective region is used to transfer the peak discharge for the recurrence interval to the ungaged site. Maximum observed peak discharge as related to drainage area was determined for New Mexico. Extreme events are commonly used in the design and appraisal of bridge crossings and other structures. Bridge-scour evaluations are commonly made by using the 500-year peak discharge for these appraisals. Peak-discharge data collected at 293 gaging stations and 367 miscellaneous sites were used to develop a maximum peak-discharge relation as an alternative method of estimating peak discharge of an extreme event such as a maximum probable flood.
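The drainage-area ratio transfer described above can be sketched as a one-line calculation; all numbers below are illustrative, not values from the report:

```python
def transfer_peak_discharge(q_gaged, area_gaged, area_ungaged, exponent):
    """Transfer a T-year peak discharge from a gage to an ungaged site on the
    same stream using the drainage-area ratio raised to the regional
    regression exponent."""
    return q_gaged * (area_ungaged / area_gaged) ** exponent

# Hypothetical values: 100-year peak of 500 ft^3/s at a 50 mi^2 gage,
# ungaged site draining 80 mi^2, regional exponent 0.6
q100_ungaged = transfer_peak_discharge(500.0, 50.0, 80.0, 0.6)
```

Because the exponent is below one, the transferred discharge grows more slowly than drainage area, reflecting the regional regression's sub-linear area scaling.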
Yang, Xue; Wang, Shaojian; Zhang, Wenzhong; Zhan, Dongsheng; Li, Jiaming
2017-04-15
China has received increased international criticism in recent years in relation to its air pollution levels, both in terms of the transmission of pollutants across international borders and the attendant adverse health effects being witnessed. Whilst existing research has examined the factors influencing ambient air pollutant concentrations, previous studies have failed to adequately explore the determinants of such concentrations from either a source or a diffusion perspective. This study addressed both source (specifically, anthropogenic emissions) and diffusion (namely, meteorological conditions) indicators, in order to detect their respective impacts on the spatial variations seen in the distribution of air pollution. Spatial panel data for 113 major cities in China were processed using a range of global regression models (the ordinary least squares model, the spatial lag model, and the spatial error model) as well as a local, geographically weighted regression (GWR) model. Results from the study suggest that in 2014, average SO₂ concentrations exceeded China's first-level target. The most polluted cities were found to be predominantly located in northern China, while less polluted cities were located in southern China. Global regression results indicated that precipitation exerts a significant effect on SO₂ reduction (p < 0.001) and that a regional increase of 1 mm in precipitation can reduce SO₂ concentrations by 0.026 μg/m³. Both emission and temperature factors were found to aggravate SO₂ concentrations, although no such significant correlation was found in relation to wind speed. GWR results suggest that the association between SO₂ and its factors varied over space. Increased emissions were found to produce more pollution in the northwest than in other parts of the country. Higher wind speeds and temperatures in northwestern areas were shown to reinforce SO₂ pollution, while in southern regions they had the opposite effect. Further, increased precipitation was found to exert a greater inhibitory effect on SO₂ pollution in the country's northeast than in other areas. Our findings could provide a detailed reference for formulating regionally specific emission reduction policies in China. Copyright © 2016 Elsevier B.V. All rights reserved.
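Unlike the global models, GWR fits a separate regression at each location using distance-decayed weights, which is how spatially varying (even sign-flipping) associations emerge. A minimal sketch with a Gaussian kernel, using an invented two-cluster dataset in which the precipitation-SO₂ slope reverses sign between regions:

```python
import math

def gwr_at(point, coords, x, y, bandwidth):
    """Local intercept/slope at `point` with Gaussian distance weights (basic GWR)."""
    w = [math.exp(-0.5 * (math.hypot(px - point[0], py - point[1]) / bandwidth) ** 2)
         for px, py in coords]
    sw = sum(w)
    xm = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ym = sum(wi * yi for wi, yi in zip(w, y)) / sw
    b = (sum(wi * (xi - xm) * (yi - ym) for wi, xi, yi in zip(w, x, y))
         / sum(wi * (xi - xm) ** 2 for wi, xi in zip(w, x)))
    return ym - b * xm, b

# Hypothetical cities: precipitation (mm) vs SO2 (ug/m3); the relation is
# negative in the "northern" cluster and positive in the "southern" cluster
coords = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
precip = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
so2 = [40.0, 38.0, 36.0, 34.0, 20.0, 22.0, 24.0, 26.0]
_, slope_north = gwr_at((0.5, 0.5), coords, precip, so2, bandwidth=2.0)
_, slope_south = gwr_at((10.5, 10.5), coords, precip, so2, bandwidth=2.0)
```

A single global OLS fit over all eight cities would average the two opposing slopes toward zero; the local fits recover the sign reversal that the abstract describes.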
NASA Astrophysics Data System (ADS)
Wu, Cheng; Zhen Yu, Jian
2018-03-01
Linear regression techniques are widely used in atmospheric science, but they are often improperly applied due to lack of consideration or inappropriate handling of measurement uncertainty. In this work, numerical experiments are performed to evaluate the performance of five linear regression techniques, significantly extending previous works by Chu and Saylor. The five techniques are ordinary least squares (OLS), Deming regression (DR), orthogonal distance regression (ODR), weighted ODR (WODR), and York regression (YR). We first introduce a new data generation scheme that employs the Mersenne twister (MT) pseudorandom number generator. The numerical simulations are also improved by (a) refining the parameterization of nonlinear measurement uncertainties, (b) inclusion of a linear measurement uncertainty, and (c) inclusion of WODR for comparison. Results show that DR, WODR and YR produce an accurate slope, but the intercept from WODR and YR is overestimated, and the degree of bias is more pronounced with a low-R² X-Y dataset. The importance of a proper weighting parameter λ in DR is investigated by sensitivity tests, and it is found that an improper λ in DR can lead to bias in both the slope and intercept estimates. Because the λ calculation depends on the actual form of the measurement error, it is essential to determine the exact form of the measurement error in the X-Y data during the measurement stage. If the a priori error in one of the variables is unknown, or the stated measurement error cannot be trusted, DR, WODR and YR provide the least-biased slope and intercept among all tested regression techniques. For these reasons, DR, WODR and YR are recommended for atmospheric studies when both X and Y data have measurement errors. An Igor Pro-based program (Scatter Plot) was developed to facilitate the implementation of error-in-variables regressions.
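Of the recommended techniques, Deming regression has a closed-form slope once the error-variance ratio λ is fixed. A sketch follows; it assumes the common convention of λ as the ratio of the y-error variance to the x-error variance (λ = 1 reduces to orthogonal regression), which should be checked against the paper's own definition:

```python
import math

def deming(x, y, lam=1.0):
    """Deming regression of y on x; lam is the assumed ratio of the
    y measurement-error variance to the x measurement-error variance."""
    n = len(x)
    xm, ym = sum(x) / n, sum(y) / n
    sxx = sum((xi - xm) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - ym) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - xm) * (yi - ym) for xi, yi in zip(x, y)) / (n - 1)
    b = (syy - lam * sxx
         + math.sqrt((syy - lam * sxx) ** 2 + 4 * lam * sxy ** 2)) / (2 * sxy)
    return ym - b * xm, b

# Noise-free line y = 1 + 2x: any sensible estimator should recover it exactly
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
a, b = deming(x, y)
```

On noisy data with errors in both variables, OLS attenuates the slope toward zero while this estimator remains consistent, which is the central point of the comparison above.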
A map overlay error model based on boundary geometry
Gaeuman, D.; Symanzik, J.; Schmidt, J.C.
2005-01-01
An error model for quantifying the magnitudes and variability of errors generated in the areas of polygons during spatial overlay of vector geographic information system layers is presented. Numerical simulation of polygon boundary displacements was used to propagate coordinate errors to spatial overlays. The model departs from most previous error models in that it incorporates spatial dependence of coordinate errors at the scale of the boundary segment. It can be readily adapted to match the scale of the error-boundary interactions responsible for error generation on a given overlay. The area of error generated by overlay depends on the sinuosity of polygon boundaries, as well as on the magnitude of the coordinate errors on the input layers. Asymmetry in boundary shape has relatively little effect on error generation. Overlay errors are affected by real differences in boundary positions on the input layers, as well as by errors in the boundary positions. Real differences between input layers tend to compensate for much of the error generated by coordinate errors. Thus, the area of change measured on an overlay layer produced by the XOR overlay operation will be more accurate if the area of real change depicted on the overlay is large. The model presented here considers these interactions, making it especially useful for error estimation in studies of landscape change over time. © 2005 The Ohio State University.
Spatial heterogeneity of type I error for local cluster detection tests
2014-01-01
Background Just as power, the type I error of cluster detection tests (CDTs) should be spatially assessed. Indeed, both the type I error and the power of CDTs have a spatial component, as CDTs both detect and locate clusters. In the case of type I error, the spatial distribution of wrongly detected clusters (WDCs) can be particularly affected by edge effect. This simulation study aims to describe the spatial distribution of WDCs and to confirm and quantify the presence of edge effect. Methods A simulation of 40 000 datasets was performed under the null hypothesis of risk homogeneity. The simulation design used realistic parameters from survey data on birth defects and, in particular, two baseline risks. The simulated datasets were analyzed using Kulldorff's spatial scan, a commonly used test whose behavior is otherwise well known. To describe the spatial distribution of type I error, we defined the participation rate for each spatial unit of the region. We used this indicator in a new statistical test proposed to confirm, as well as quantify, the edge effect. Results The predefined type I error of 5% was respected for both baseline risks. Results showed a strong edge effect in participation rates, with a descending gradient from center to edge, and WDCs more often centrally situated. Conclusions In routine analysis of real data, clusters on the edge of the region should be carefully considered, as they rarely occur when there is no cluster. Further work is needed to combine results from power studies with this work in order to optimize CDT performance. PMID:24885343
NASA Technical Reports Server (NTRS)
Kicklighter, David W.; Melillo, Jerry M.; Peterjohn, William T.; Rastetter, Edward B.; Mcguire, A. David; Steudler, Paul A.; Aber, John D.
1994-01-01
We examine the influence of aggregation errors on developing estimates of regional soil-CO2 flux from temperate forests. We find daily soil-CO2 fluxes to be more sensitive to changes in soil temperature (Q₁₀ = 3.08) than to changes in air temperature (Q₁₀ = 1.99). The direct use of mean monthly air temperatures with a daily flux model underestimates regional fluxes by approximately 4%. Temporal aggregation error varies with spatial resolution. Overall, our calibrated modeling approach reduces spatial aggregation error by 9.3% and temporal aggregation error by 15.5%. After minimizing spatial and temporal aggregation errors, mature temperate forest soils are estimated to contribute 12.9 Pg C/yr to the atmosphere as carbon dioxide. Georeferenced model estimates agree well with annual soil-CO2 fluxes measured during chamber studies in mature temperate forest stands around the globe.
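The Q10 relationship behind these flux estimates, and the temporal aggregation error it produces (a consequence of Jensen's inequality for a convex temperature response), can be illustrated directly; the reference flux and temperature series below are invented:

```python
def q10_flux(flux_ref, t_ref, t, q10):
    """Soil-CO2 flux at temperature t, given a reference flux at t_ref and a Q10 factor."""
    return flux_ref * q10 ** ((t - t_ref) / 10.0)

# Illustrative sensitivities: soil-temperature response (Q10 = 3.08)
# vs air-temperature response (Q10 = 1.99), relative to a unit flux at 10 C
soil_flux_20 = q10_flux(1.0, 10.0, 20.0, 3.08)
air_flux_20 = q10_flux(1.0, 10.0, 20.0, 1.99)

# Aggregation error: evaluating the convex Q10 curve at the mean temperature
# underestimates the mean of the daily fluxes
temps = [5.0, 10.0, 15.0, 20.0, 25.0]
mean_of_flux = sum(q10_flux(1.0, 10.0, t, 3.08) for t in temps) / len(temps)
flux_of_mean = q10_flux(1.0, 10.0, sum(temps) / len(temps), 3.08)
```

The gap between `mean_of_flux` and `flux_of_mean` is exactly the kind of temporal aggregation bias the study corrects when mean monthly temperatures are fed to a daily model.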
Local Linear Regression for Data with AR Errors.
Li, Runze; Li, Yan
2009-07-01
In many statistical applications, data are collected over time, and they are likely correlated. In this paper, we investigate how to incorporate the correlation information into the local linear regression. Under the assumption that the error process is an auto-regressive process, a new estimation procedure is proposed for the nonparametric regression by using local linear regression method and the profile least squares techniques. We further propose the SCAD penalized profile least squares method to determine the order of auto-regressive process. Extensive Monte Carlo simulation studies are conducted to examine the finite sample performance of the proposed procedure, and to compare the performance of the proposed procedures with the existing one. From our empirical studies, the newly proposed procedures can dramatically improve the accuracy of naive local linear regression with working-independent error structure. We illustrate the proposed methodology by an analysis of real data set.
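A bare-bones local linear estimator with a Gaussian kernel, leaving out the AR error structure and the profile least squares machinery of the paper, can be sketched as follows (the quadratic test function is an arbitrary choice):

```python
import math

def local_linear(x0, x, y, h):
    """Local linear estimate of m(x0) with a Gaussian kernel of bandwidth h."""
    w = [math.exp(-0.5 * ((xi - x0) / h) ** 2) for xi in x]
    sw = sum(w)
    xm = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ym = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xm) ** 2 for wi, xi in zip(w, x))
    b = sum(wi * (xi - xm) * (yi - ym) for wi, xi, yi in zip(w, x, y)) / sxx
    return ym + b * (x0 - xm)  # evaluate the local line at x0

# Noiseless quadratic: the local fit should track the curve closely
xs = [i / 10.0 for i in range(21)]       # grid 0.0 .. 2.0
ys = [xi ** 2 for xi in xs]
m_hat = local_linear(1.0, xs, ys, h=0.2)
```

Local linear fitting reproduces any exactly linear signal without bias; on curved signals the residual bias shrinks with the bandwidth h, which is the usual bias-variance trade-off the paper's procedures build on.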
Linear regression in astronomy. II
NASA Technical Reports Server (NTRS)
Feigelson, Eric D.; Babu, Gutti J.
1992-01-01
A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
7 CFR 275.23 - Determination of State agency program performance.
Code of Federal Regulations, 2011 CFR
2011-01-01
... NUTRITION SERVICE, DEPARTMENT OF AGRICULTURE FOOD STAMP AND FOOD DISTRIBUTION PROGRAM PERFORMANCE REPORTING... section, the adjusted regressed payment error rate shall be calculated to yield the State agency's payment error rate. The adjusted regressed payment error rate is given by r₁″ + r₂″. (ii) If FNS determines...
Spatial sampling considerations of the CERES (Clouds and Earth Radiant Energy System) instrument
NASA Astrophysics Data System (ADS)
Smith, G. L.; Manalo-Smith, Natividad; Priestley, Kory
2014-10-01
The CERES (Clouds and Earth Radiant Energy System) instrument is a scanning radiometer with three channels for measuring the Earth radiation budget. At present, CERES models are operating aboard the Terra, Aqua and Suomi/NPP spacecraft, and flights of CERES instruments are planned for the JPSS-1 spacecraft and its successors. CERES scans from one limb of the Earth to the other and back. The footprint size grows with distance from nadir simply due to geometry, so the size of the smallest features that can be resolved from the data increases, and spatial sampling errors increase, with nadir angle. This paper presents an analysis of the effect of nadir angle on spatial sampling errors of the CERES instrument. The analysis is performed in the Fourier domain. Spatial sampling errors are created by smoothing (blurring) of features that are the size of the footprint and smaller, and by inadequate sampling, which causes aliasing errors. These spatial sampling errors are computed in terms of the system transfer function, which is the Fourier transform of the point response function, the spacing of data points and the spatial spectrum of the radiance field.
Entropy of Movement Outcome in Space-Time.
Lai, Shih-Chiung; Hsieh, Tsung-Yu; Newell, Karl M
2015-07-01
Information entropy of the joint spatial and temporal (space-time) probability of discrete movement outcome was investigated in two experiments as a function of different movement strategies (space-time, space, and time instructional emphases), task goals (point-aiming and target-aiming) and movement speed-accuracy constraints. The variance of the movement spatial and temporal errors was reduced by instructional emphasis on the respective spatial or temporal dimension, but increased on the other dimension. The space-time entropy was lower in the target-aiming task than in the point-aiming task but did not differ between instructional emphases. However, the joint probabilistic measure of spatial and temporal entropy showed that spatial error is traded for timing error in tasks with space-time criteria and that the pattern of movement error depends on the dimension of the measurement process. The unified entropy measure of movement outcome in space-time reveals a new relation for the speed-accuracy trade-off.
Peak-flow characteristics of Virginia streams
Austin, Samuel H.; Krstolic, Jennifer L.; Wiegand, Ute
2011-01-01
Peak-flow annual exceedance probabilities, also called probability-percent chance flow estimates, and regional regression equations are provided describing the peak-flow characteristics of Virginia streams. Statistical methods are used to evaluate peak-flow data. Analysis of Virginia peak-flow data collected from 1895 through 2007 is summarized. Methods are provided for estimating unregulated peak flow of gaged and ungaged streams. Station peak-flow characteristics identified by fitting the logarithms of annual peak flows to a Log Pearson Type III frequency distribution yield annual exceedance probabilities of 0.5, 0.4292, 0.2, 0.1, 0.04, 0.02, 0.01, 0.005, and 0.002 for 476 streamgaging stations. Stream basin characteristics computed using spatial data and a geographic information system are used as explanatory variables in regional regression model equations for six physiographic regions to estimate regional annual exceedance probabilities at gaged and ungaged sites. Weighted peak-flow values that combine annual exceedance probabilities computed from gaging station data and from regional regression equations provide improved peak-flow estimates. Text, figures, and lists are provided summarizing selected peak-flow sites, delineated physiographic regions, peak-flow estimates, basin characteristics, regional regression model equations, error estimates, definitions, data sources, and candidate regression model equations. This study supersedes previous studies of peak flows in Virginia.
Exploring precipitation pattern scaling methodologies and robustness among CMIP5 models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kravitz, Ben; Lynch, Cary; Hartin, Corinne
Pattern scaling is a well-established method for approximating modeled spatial distributions of changes in temperature by assuming a time-invariant pattern that scales with changes in global mean temperature. We compare two methods of pattern scaling for annual mean precipitation (regression and epoch difference) and evaluate which method is better in particular circumstances by quantifying their robustness to interpolation/extrapolation in time, inter-model variations, and inter-scenario variations. Both the regression and epoch-difference methods (the two most commonly used methods of pattern scaling) have good absolute performance in reconstructing the climate model output, measured as an area-weighted root mean square error. We decompose the precipitation response in the RCP8.5 scenario into a CO2 portion and a non-CO2 portion. Extrapolating RCP8.5 patterns to reconstruct precipitation change in the RCP2.6 scenario results in large errors due to violations of pattern scaling assumptions when this CO2-/non-CO2-forcing decomposition is applied. As a result, the methodologies discussed in this paper can help provide precipitation fields to be utilized in other models (including integrated assessment models or impacts assessment models) for a wide variety of scenarios of future climate change.
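The two pattern-scaling methods compared above (regression and epoch difference) can be illustrated on a synthetic per-grid-cell precipitation field. This is a toy sketch, not the CMIP5 analysis: the number of cells, the epochs, and the noise levels are invented, and area weighting is omitted.

```python
import numpy as np

rng = np.random.default_rng(2)
years, ncell = 100, 50
Tglob = np.linspace(0.0, 4.0, years) + 0.05 * rng.standard_normal(years)
true_pat = rng.uniform(-0.5, 2.0, ncell)   # assumed local response per K

# local precipitation change = pattern * global-mean temperature + noise
P = true_pat[None, :] * Tglob[:, None] + 0.2 * rng.standard_normal((years, ncell))

# Method 1: regression -- slope of each cell's series on global mean T
Tc = Tglob - Tglob.mean()
pat_reg = (Tc @ (P - P.mean(axis=0))) / (Tc @ Tc)

# Method 2: epoch difference -- (late mean - early mean) / delta T
early, late = slice(0, 20), slice(80, 100)
dT = Tglob[late].mean() - Tglob[early].mean()
pat_epoch = (P[late].mean(axis=0) - P[early].mean(axis=0)) / dT

err_reg = np.sqrt(np.mean((pat_reg - true_pat) ** 2))
err_epoch = np.sqrt(np.mean((pat_epoch - true_pat) ** 2))
```

When the true pattern is time-invariant, as assumed here, both estimators recover it up to noise; the violations discussed in the abstract arise when that assumption breaks across scenarios.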
Precoded spatial multiplexing MIMO system with spatial component interleaver.
Gao, Xiang; Wu, Zhanji
In this paper, the performance of precoded bit-interleaved coded modulation (BICM) spatial multiplexing multiple-input multiple-output (MIMO) system with spatial component interleaver is investigated. For the ideal precoded spatial multiplexing MIMO system with spatial component interleaver based on singular value decomposition (SVD) of the MIMO channel, the average pairwise error probability (PEP) of coded bits is derived. Based on the PEP analysis, the optimum spatial Q-component interleaver design criterion is provided to achieve the minimum error probability. For the limited feedback precoded proposed scheme with linear zero forcing (ZF) receiver, in order to minimize a bound on the average probability of a symbol vector error, a novel effective signal-to-noise ratio (SNR)-based precoding matrix selection criterion and a simplified criterion are proposed. Based on the average mutual information (AMI)-maximization criterion, the optimal constellation rotation angles are investigated. Simulation results indicate that the optimized spatial multiplexing MIMO system with spatial component interleaver can achieve significant performance advantages compared to the conventional spatial multiplexing MIMO system.
Diffraction analysis of sidelobe characteristics of optical elements with ripple error
NASA Astrophysics Data System (ADS)
Zhao, Lei; Luo, Yupeng; Bai, Jian; Zhou, Xiangdong; Du, Juan; Liu, Qun; Luo, Yujie
2018-03-01
The ripple errors of a lens lead to optical damage in high-energy laser systems. Analysis of the sidelobe on the focal plane caused by ripple error provides a reference for evaluating the error and the imaging quality. In this paper, we analyze the diffraction characteristics of the sidelobe of optical elements with ripple errors. First, we analyze the characteristics of ripple error and build the relationship between ripple error and sidelobe: the sidelobe results from the diffraction of ripple errors, and the ripple error tends to be periodic due to the fabrication method used on the optical surface. Simulated experiments are carried out based on the angular spectrum method by characterizing ripple error as rotationally symmetric periodic structures. The influence of the two major ripple parameters, spatial frequency and peak-to-valley value, on the sidelobe is discussed. The results indicate that spatial frequency and peak-to-valley value both affect the sidelobe at the image plane: the peak-to-valley value is the major factor affecting the energy proportion of the sidelobe, while the spatial frequency is the major factor affecting its distribution at the image plane.
Adaptive Laplacian filtering for sensorimotor rhythm-based brain-computer interfaces.
Lu, Jun; McFarland, Dennis J; Wolpaw, Jonathan R
2013-02-01
Sensorimotor rhythms (SMRs) are 8-30 Hz oscillations in the electroencephalogram (EEG) recorded from the scalp over sensorimotor cortex that change with movement and/or movement imagery. Many brain-computer interface (BCI) studies have shown that people can learn to control SMR amplitudes and can use that control to move cursors and other objects in one, two or three dimensions. At the same time, if SMR-based BCIs are to be useful for people with neuromuscular disabilities, their accuracy and reliability must be improved substantially. These BCIs often use spatial filtering methods such as common average reference (CAR), Laplacian (LAP) filter or common spatial pattern (CSP) filter to enhance the signal-to-noise ratio of EEG. Here, we test the hypothesis that a new filter design, called an 'adaptive Laplacian (ALAP) filter', can provide better performance for SMR-based BCIs. An ALAP filter employs a Gaussian kernel to construct a smooth spatial gradient of channel weights and then simultaneously seeks the optimal kernel radius of this spatial filter and the regularization parameter of linear ridge regression. This optimization is based on minimizing the leave-one-out cross-validation error through a gradient descent method and is computationally feasible. Using a variety of kinds of BCI data from a total of 22 individuals, we compare the performances of ALAP filter to CAR, small LAP, large LAP and CSP filters. With a large number of channels and limited data, ALAP performs significantly better than CSP, CAR, small LAP and large LAP both in classification accuracy and in mean-squared error. Using fewer channels restricted to motor areas, ALAP is still superior to CAR, small LAP and large LAP, but equally matched to CSP. Thus, ALAP may help to improve the accuracy and robustness of SMR-based BCIs.
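One building block of the ALAP filter described above, ridge regression tuned by leave-one-out cross-validation, can be sketched using the closed-form LOO error (the Gaussian-kernel spatial filtering and the gradient-descent search over the kernel radius are omitted). The data, dimensions, and candidate-lambda grid are illustrative assumptions.

```python
import numpy as np

def ridge_loo_press(X, y, lam):
    """Closed-form leave-one-out error for ridge regression:
    e_i = (y_i - yhat_i) / (1 - H_ii), with H = X (X'X + lam I)^{-1} X'."""
    n, p = X.shape
    H = X @ np.linalg.inv(X.T @ X + lam * np.eye(p)) @ X.T
    resid = y - H @ y
    loo = resid / (1.0 - np.diag(H))
    return np.mean(loo ** 2)

rng = np.random.default_rng(3)
X = rng.standard_normal((80, 10))
beta = np.zeros(10)
beta[:3] = [1.0, -0.5, 0.25]                 # sparse synthetic signal
y = X @ beta + 0.3 * rng.standard_normal(80)

lams = np.logspace(-3, 3, 25)
press = [ridge_loo_press(X, y, l) for l in lams]
best_lam = lams[int(np.argmin(press))]
```

The abstract's filter minimizes this same LOO criterion, but jointly over the regularization parameter and the spatial kernel radius by gradient descent rather than a grid.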
Gómez-Alvarez, Fatima B; Jáuregui-Renaud, Kathrine
2011-02-01
We undertook this study to assess the correlation between the results of simple tests of spatial orientation and the occurrence of common psychological symptoms during the first 3 months after an acute, unilateral, peripheral, vestibular lesion. Ten vestibular patients were selected and agreed to participate in the study. During a 3-month follow-up, we recorded the static visual vertical (VV), the estimation error of reorientation in the yaw plane and the responses to a standardized questionnaire of balance symptoms, the Dizziness Handicap Inventory (DHI), the depersonalization/derealization inventory by Cox and Swinson (DD), the Dissociative Experiences Scale (DES), the 12-item General Health Questionnaire (GHQ-12), the Zung Instrument for Anxiety Disorders and the Hamilton Depression Rating Scale. At week 1, all patients showed a VV >2° and failed to reorient themselves effectively. They reported several balance symptoms and handicap as well as DD symptoms, including attention/concentration difficulties; 80% of the patients had a Hamilton score ≥8. At this time the balance symptom score correlated with the DHI. After 3 months, all scores decreased. Multiple regression analysis of the differences from baseline showed that the DD score difference was related to the difference on the balance score, the reorientation error and the DHI score (p < 0.01). No other linear relationships were observed (p > 0.5). During the acute phase of a unilateral, peripheral, vestibular lesion, patients may show poor spatial orientation concurrent with DD symptoms including attention/concentration difficulties, and somatic depression symptoms. After vestibular rehabilitation, DD symptoms decrease as the spatial orientation improves, even if somatic symptoms of depression persist. Copyright © 2011 IMSS. Published by Elsevier Inc. All rights reserved.
Adaptive Laplacian filtering for sensorimotor rhythm-based brain-computer interfaces
NASA Astrophysics Data System (ADS)
Lu, Jun; McFarland, Dennis J.; Wolpaw, Jonathan R.
2013-02-01
Objective. Sensorimotor rhythms (SMRs) are 8-30 Hz oscillations in the electroencephalogram (EEG) recorded from the scalp over sensorimotor cortex that change with movement and/or movement imagery. Many brain-computer interface (BCI) studies have shown that people can learn to control SMR amplitudes and can use that control to move cursors and other objects in one, two or three dimensions. At the same time, if SMR-based BCIs are to be useful for people with neuromuscular disabilities, their accuracy and reliability must be improved substantially. These BCIs often use spatial filtering methods such as common average reference (CAR), Laplacian (LAP) filter or common spatial pattern (CSP) filter to enhance the signal-to-noise ratio of EEG. Here, we test the hypothesis that a new filter design, called an ‘adaptive Laplacian (ALAP) filter’, can provide better performance for SMR-based BCIs. Approach. An ALAP filter employs a Gaussian kernel to construct a smooth spatial gradient of channel weights and then simultaneously seeks the optimal kernel radius of this spatial filter and the regularization parameter of linear ridge regression. This optimization is based on minimizing the leave-one-out cross-validation error through a gradient descent method and is computationally feasible. Main results. Using a variety of kinds of BCI data from a total of 22 individuals, we compare the performances of ALAP filter to CAR, small LAP, large LAP and CSP filters. With a large number of channels and limited data, ALAP performs significantly better than CSP, CAR, small LAP and large LAP both in classification accuracy and in mean-squared error. Using fewer channels restricted to motor areas, ALAP is still superior to CAR, small LAP and large LAP, but equally matched to CSP. Significance. Thus, ALAP may help to improve the accuracy and robustness of SMR-based BCIs.
Kumar, K Vasanth; Porkodi, K; Rocha, F
2008-01-15
A comparison of linear and non-linear regression methods in selecting the optimum isotherm was made for the experimental equilibrium data of basic red 9 sorption by activated carbon. The r(2) was used to select the best-fit linear theoretical isotherm. In the case of the non-linear regression method, six error functions, namely the coefficient of determination (r(2)), hybrid fractional error function (HYBRID), Marquardt's percent standard deviation (MPSD), the average relative error (ARE), the sum of the errors squared (ERRSQ) and the sum of the absolute errors (EABS), were used to predict the parameters involved in the two- and three-parameter isotherms and also to predict the optimum isotherm. Non-linear regression was found to be a better way to obtain the parameters involved in the isotherms and also the optimum isotherm. For the two-parameter isotherms, MPSD was found to be the best error function in minimizing the error distribution between the experimental equilibrium data and predicted isotherms. In the case of the three-parameter isotherms, r(2) was found to be the best error function to minimize the error distribution structure between experimental equilibrium data and theoretical isotherms. The present study showed that the size of the error function alone is not a deciding factor in choosing the optimum isotherm. In addition to the size of the error function, the theory behind the predicted isotherm should be verified with the help of experimental data while selecting the optimum isotherm. A coefficient of non-determination, K(2), was explained and was found to be very useful in identifying the best error function while selecting the optimum isotherm.
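The six error functions named in this abstract can be sketched as follows, using the formulas as they commonly appear in the sorption literature; the Langmuir parameters and the pseudo-experimental data below are invented for illustration.

```python
import numpy as np

def error_functions(qe_exp, qe_cal, n_param):
    """Common isotherm-fitting error functions (formulas as usually
    defined in the sorption literature; an assumption, not the paper's)."""
    n = qe_exp.size
    d = qe_exp - qe_cal
    ss_res = np.sum(d ** 2)
    ss_tot = np.sum((qe_exp - qe_exp.mean()) ** 2)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "HYBRID": 100.0 / (n - n_param) * np.sum(d ** 2 / qe_exp),
        "MPSD": 100.0 * np.sqrt(np.sum((d / qe_exp) ** 2) / (n - n_param)),
        "ARE": 100.0 / n * np.sum(np.abs(d / qe_exp)),
        "ERRSQ": ss_res,
        "EABS": np.sum(np.abs(d)),
    }

# Langmuir isotherm qe = qm*KL*Ce / (1 + KL*Ce), with invented parameters
Ce = np.linspace(5, 100, 10)
qm, KL = 50.0, 0.05
qe_cal = qm * KL * Ce / (1 + KL * Ce)
qe_exp = qe_cal * (1 + 0.02 * np.sin(Ce))    # pseudo-experimental data
errs = error_functions(qe_exp, qe_cal, n_param=2)
```

In a full non-linear fit, each error function would be minimized over (qm, KL), and, as the abstract notes, different functions can favor different parameter sets.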
Biases and Standard Errors of Standardized Regression Coefficients
ERIC Educational Resources Information Center
Yuan, Ke-Hai; Chan, Wai
2011-01-01
The paper obtains consistent standard errors (SE) and biases of order O(1/n) for the sample standardized regression coefficients with both random and given predictors. Analytical results indicate that the formulas for SEs given in popular text books are consistent only when the population value of the regression coefficient is zero. The sample…
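The standardized regression coefficients this paper analyzes can be computed in two equivalent ways, shown in the sketch below on synthetic data: rescaling raw OLS slopes by standard-deviation ratios versus regressing z-scored variables. The paper's SE and bias formulas are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
x1 = rng.standard_normal(n)
x2 = 0.5 * x1 + rng.standard_normal(n)       # correlated predictors
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.standard_normal(n)

X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]     # raw OLS coefficients

# standardized coefficients via rescaling: b_j * sd(x_j) / sd(y)
sy = y.std(ddof=1)
b_std = b[1:] * np.array([x1.std(ddof=1), x2.std(ddof=1)]) / sy

# equivalently, regress the z-scored response on z-scored predictors
Z = np.column_stack([(x1 - x1.mean()) / x1.std(ddof=1),
                     (x2 - x2.mean()) / x2.std(ddof=1)])
b_std2 = np.linalg.lstsq(np.column_stack([np.ones(n), Z]),
                         (y - y.mean()) / sy, rcond=None)[0][1:]
```

The paper's point is that because sd(x_j) and sd(y) are themselves estimated, the usual OLS SE formula applied to the z-scored regression is generally inconsistent.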
F. Mauro; Vicente J. Monleon; H. Temesgen; L.A. Ruiz
2017-01-01
Accounting for spatial correlation of LiDAR model errors can improve the precision of model-based estimators. To estimate spatial correlation, sample designs that provide close observations are needed, but their implementation might be prohibitively expensive. To quantify the gains obtained by accounting for the spatial correlation of model errors, we examined (
Eric H. Wharton; Tiberius Cunia
1987-01-01
Proceedings of a workshop co-sponsored by the USDA Forest Service, the State University of New York, and the Society of American Foresters. Presented were papers on the methodology of sample tree selection, tree biomass measurement, construction of biomass tables and estimation of their error, and combining the error of biomass tables with that of the sample plots or...
Is adult gait less susceptible than paediatric gait to hip joint centre regression equation error?
Kiernan, D; Hosking, J; O'Brien, T
2016-03-01
Hip joint centre (HJC) regression equation error during paediatric gait has recently been shown to have clinical significance. In relation to adult gait, it has been inferred that comparable errors with children in absolute HJC position may in fact result in less significant kinematic and kinetic error. This study investigated the clinical agreement of three commonly used regression equation sets (Bell et al., Davis et al. and Orthotrak) for adult subjects against the equations of Harrington et al. The relationship between HJC position error and subject size was also investigated for the Davis et al. set. Full 3-dimensional gait analysis was performed on 12 healthy adult subjects with data for each set compared to Harrington et al. The Gait Profile Score, Gait Variable Score and GDI-kinetic were used to assess clinical significance while differences in HJC position between the Davis and Harrington sets were compared to leg length and subject height using regression analysis. A number of statistically significant differences were present in absolute HJC position. However, all sets fell below the clinically significant thresholds (GPS <1.6°, GDI-Kinetic <3.6 points). Linear regression revealed a statistically significant relationship for both increasing leg length and increasing subject height with decreasing error in anterior/posterior and superior/inferior directions. Results confirm a negligible clinical error for adult subjects suggesting that any of the examined sets could be used interchangeably. Decreasing error with both increasing leg length and increasing subject height suggests that the Davis set should be used cautiously on smaller subjects. Copyright © 2016 Elsevier B.V. All rights reserved.
Multiple imputation of missing fMRI data in whole brain analysis
Vaden, Kenneth I.; Gebregziabher, Mulugeta; Kuchinsky, Stefanie E.; Eckert, Mark A.
2012-01-01
Whole brain fMRI analyses rarely include the entire brain because of missing data that result from data acquisition limits and susceptibility artifact, in particular. This missing data problem is typically addressed by omitting voxels from analysis, which may exclude brain regions that are of theoretical interest and increase the potential for Type II error at cortical boundaries or Type I error when spatial thresholds are used to establish significance. Imputation could significantly expand statistical map coverage, increase power, and enhance interpretations of fMRI results. We examined multiple imputation for group level analyses of missing fMRI data using methods that leverage the spatial information in fMRI datasets for both real and simulated data. Available case analysis, neighbor replacement, and regression based imputation approaches were compared in a general linear model framework to determine the extent to which these methods quantitatively (effect size) and qualitatively (spatial coverage) increased the sensitivity of group analyses. In both real and simulated data analysis, multiple imputation provided 1) variance that was most similar to estimates for voxels with no missing data, 2) fewer false positive errors in comparison to mean replacement, and 3) fewer false negative errors in comparison to available case analysis. Compared to the standard analysis approach of omitting voxels with missing data, imputation methods increased brain coverage in this study by 35% (from 33,323 to 45,071 voxels). In addition, multiple imputation increased the size of significant clusters by 58% and number of significant clusters across statistical thresholds, compared to the standard voxel omission approach. While neighbor replacement produced similar results, we recommend multiple imputation because it uses an informed sampling distribution to deal with missing data across subjects that can include neighbor values and other predictors. 
Multiple imputation is anticipated to be particularly useful for 1) large fMRI data sets with inconsistent missing voxels across subjects and 2) addressing the problem of increased artifact at ultra-high field, which significantly limits the extent of whole brain coverage and interpretations of results. PMID:22500925
NASA Astrophysics Data System (ADS)
Müller, Benjamin; Bernhardt, Matthias; Jackisch, Conrad; Schulz, Karsten
2016-09-01
For understanding water and solute transport processes, knowledge about the respective hydraulic properties is necessary. Commonly, hydraulic parameters are estimated via pedo-transfer functions using soil texture data to avoid cost-intensive measurements of hydraulic parameters in the laboratory. However, current soil texture information is only available at a coarse spatial resolution of 250 to 1000 m. Here, a method is presented to derive high-resolution (15 m) spatial topsoil texture patterns for the meso-scale Attert catchment (Luxembourg, 288 km2) from 28 images of ASTER (advanced spaceborne thermal emission and reflection radiometer) thermal remote sensing. A principal component analysis of the images reveals the most dominant thermal patterns (principal components, PCs), which are related to 212 fractional soil texture samples. Within a multiple linear regression framework, distributed soil texture information is estimated and related uncertainties are assessed. An overall root mean squared error (RMSE) of 12.7 percentage points (pp) lies well within and even below the range of recent studies on soil texture estimation, while requiring sparser sample setups and a less diverse set of basic spatial input. This approach will improve the generation of spatially distributed topsoil maps, particularly for hydrologic modeling purposes, and will expand the usage of thermal remote sensing products.
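The PCA-plus-multiple-linear-regression pipeline described above can be sketched on synthetic data. Everything here is a stand-in: the latent drivers, the 28 synthetic "scenes", and the sand-fraction model are invented, and no real ASTER imagery is involved.

```python
import numpy as np

rng = np.random.default_rng(5)
n_pix, n_img = 500, 28                        # pixels x thermal scenes (toy)
latent = rng.standard_normal((n_pix, 3))      # hidden texture drivers
scenes = latent @ rng.standard_normal((3, n_img)) \
         + 0.1 * rng.standard_normal((n_pix, n_img))

# principal components of the image stack via SVD on centered data
Xc = scenes - scenes.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
pcs = U[:, :3] * s[:3]                        # pixel scores on top 3 PCs

# synthetic sand fraction (pp) driven by the latent factors; fit MLR on PCs
sand = 40 + 10 * latent[:, 0] - 5 * latent[:, 1] + 2.0 * rng.standard_normal(n_pix)
A = np.column_stack([np.ones(n_pix), pcs])
coef, *_ = np.linalg.lstsq(A, sand, rcond=None)
rmse = np.sqrt(np.mean((A @ coef - sand) ** 2))
```

Because the PCs span (a rotation of) the latent drivers, the regression recovers the texture signal up to the added noise; in the real study the PC scores stand in for the dominant thermal patterns.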
Kraan, Casper; Aarts, Geert; Van der Meer, Jaap; Piersma, Theunis
2010-06-01
Ongoing statistical sophistication allows a shift from describing species' spatial distributions toward statistically disentangling the possible roles of environmental variables in shaping species distributions. Based on a landscape-scale benthic survey in the Dutch Wadden Sea, we show the merits of spatially explicit generalized estimating equations (GEE). The intertidal macrozoobenthic species, Macoma balthica, Cerastoderma edule, Marenzelleria viridis, Scoloplos armiger, Corophium volutator, and Urothoe poseidonis served as test cases, with median grain-size and inundation time as typical environmental explanatory variables. GEEs outperformed spatially naive generalized linear models (GLMs), and removed much residual spatial structure, indicating the importance of median grain-size and inundation time in shaping landscape-scale species distributions in the intertidal. GEE regression coefficients were smaller than those attained with GLM, and GEE standard errors were larger. The best fitting GEE for each species was used to predict species' density in relation to median grain-size and inundation time. Although no drastic changes were noted compared to previous work that described habitat suitability for benthic fauna in the Wadden Sea, our predictions provided more detailed and unbiased estimates of the determinants of species-environment relationships. We conclude that spatial GEEs offer the necessary methodological advances to further steps toward linking pattern to process.
NASA Astrophysics Data System (ADS)
Khaki, M.; Schumacher, M.; Forootan, E.; Kuhn, M.; Awange, J. L.; van Dijk, A. I. J. M.
2017-10-01
Assimilation of terrestrial water storage (TWS) information from the Gravity Recovery And Climate Experiment (GRACE) satellite mission can provide significant improvements in hydrological modelling. However, the rather coarse spatial resolution of GRACE TWS and its spatially correlated errors pose considerable challenges for achieving realistic assimilation results. Consequently, successful data assimilation depends on rigorous modelling of the full error covariance matrix of the GRACE TWS estimates, as well as realistic error behavior for hydrological model simulations. In this study, we assess the application of local analysis (LA) to maximize the contribution of GRACE TWS in hydrological data assimilation. For this, we assimilate GRACE TWS into the World-Wide Water Resources Assessment system (W3RA) over the Australian continent while applying LA and accounting for existing spatial correlations using the full error covariance matrix. GRACE TWS data are applied with different spatial resolutions including 1° to 5° grids, as well as basin averages. The ensemble-based sequential filtering technique of the Square Root Analysis (SQRA) is applied to assimilate TWS data into W3RA. For each spatial scale, the performance of the data assimilation is assessed through comparison with independent in-situ groundwater and soil moisture observations. Overall, the results demonstrate that LA is able to stabilize the inversion process (within the implementation of the SQRA filter), leading to smaller errors for all spatial scales considered, with an average RMSE improvement of 54% (e.g., 52.23 mm down to 26.80 mm) for all the cases with respect to groundwater in-situ measurements. Validating the assimilated results with groundwater observations indicates that LA leads to 13% better (in terms of RMSE) assimilation results compared to the cases with Gaussian error assumptions.
This highlights the great potential of LA and the use of the full error covariance matrix of GRACE TWS estimates for improved data assimilation results.
NASA Astrophysics Data System (ADS)
Rebolledo Coy, M. A.; Villanueva, O. M. B.; Bartz-Beielstein, T.; Ribbe, L.
2017-12-01
Rainfall measurement plays an important role in the understanding and modeling of the water cycle. However, the assessment of data-scarce regions using common rain gauge information cannot be done with a straightforward approach. Two of the main problems concerning rainfall assessment are the lack of a sufficiently dense grid of ground stations over extensive areas and the unstable spatial accuracy of Satellite Rainfall Estimates (SREs). Following previous work on SRE analysis and bias correction, we generate an ensemble model that corrects the bias error on a seasonal and yearly basis using six different state-of-the-art SREs (TRMM 3B42RT, TRMM 3B42v7, PERSIANN-CDR, CHIRPSv2, CMORPH and MSWEPv1.2) in a point-to-pixel approach for the studied period (2003-2015). Three different basins are evaluated: Magdalena in Colombia, Imperial in Chile and Paraiba do Sul in Brazil. Using Gaussian process regression and Bayesian robust regression, we model the behavior of the ground stations and evaluate goodness-of-fit using the modified Kling-Gupta efficiency (KGE'). Following this evaluation, the models are re-fitted by taking into account the error distribution at each point, and the corresponding KGE' is evaluated again. Both models were specified in the probabilistic language Stan. To improve the efficiency of the Gaussian model, a clustering of the data was implemented. We also compared the performance of both models in terms of uncertainty and stability against the raw input, concluding that both models represent the study areas better. The results show that the error displays an exponential behavior for days where precipitation was present, which allows the models to be corrected according to the observed rainfall values. The seasonal evaluations also show improved performance relative to the yearly evaluations.
The use of bias-corrected SREs for hydrologic purposes in data-scarce regions is highly recommended in order to merge the point values from the ground measurements with the spatial distribution of rainfall from the satellite estimates.
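The modified Kling-Gupta efficiency (KGE') used as the goodness-of-fit measure above can be sketched as follows, using the common formulation with a bias ratio and a variability ratio based on coefficients of variation; the synthetic series below are illustrative only.

```python
import numpy as np

def kge_prime(sim, obs):
    """Modified Kling-Gupta efficiency:
    KGE' = 1 - sqrt((r-1)^2 + (beta-1)^2 + (gamma-1)^2),
    beta = mean(sim)/mean(obs), gamma = CV(sim)/CV(obs)."""
    r = np.corrcoef(sim, obs)[0, 1]
    beta = sim.mean() / obs.mean()
    gamma = (sim.std(ddof=1) / sim.mean()) / (obs.std(ddof=1) / obs.mean())
    return 1.0 - np.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)

rng = np.random.default_rng(6)
obs = rng.gamma(2.0, 3.0, 365)                 # rainfall-like synthetic series
sim_good = obs + 0.5 * rng.standard_normal(365)
sim_biased = 1.5 * obs                         # perfectly correlated, biased
```

A perfect simulation scores 1; the purely biased series above is penalized only through the beta term, which is why KGE' is a natural target for bias-correction studies.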
Comparison of machine-learning methods for above-ground biomass estimation based on Landsat imagery
NASA Astrophysics Data System (ADS)
Wu, Chaofan; Shen, Huanhuan; Shen, Aihua; Deng, Jinsong; Gan, Muye; Zhu, Jinxia; Xu, Hongwei; Wang, Ke
2016-07-01
Biomass is a significant biophysical parameter of forest ecosystems, and accurate biomass estimation on the regional scale provides important information for carbon-cycle investigation and sustainable forest management. In this study, Landsat satellite imagery data combined with field-based measurements were integrated through comparisons of five regression approaches [stepwise linear regression, K-nearest neighbor, support vector regression, random forest (RF), and stochastic gradient boosting] with two different candidate variable strategies to implement the optimal spatial above-ground biomass (AGB) estimation. The results suggested that the RF algorithm exhibited the best performance by 10-fold cross-validation with respect to R2 (0.63) and root-mean-square error (26.44 ton/ha). Consequently, the map of estimated AGB was generated with a mean value of 89.34 ton/ha in northwestern Zhejiang Province, China, with a similar pattern to the distribution mode of local forest species. This research indicates that machine-learning approaches associated with Landsat imagery provide an economical way for biomass estimation. Moreover, ensemble methods using all candidate variables, especially for Landsat images, provide an alternative for regional biomass simulation.
Lamadrid-Figueroa, Héctor; Téllez-Rojo, Martha M; Angeles, Gustavo; Hernández-Ávila, Mauricio; Hu, Howard
2011-01-01
In-vivo measurement of bone lead by means of K-X-ray fluorescence (KXRF) is the preferred biological marker of chronic exposure to lead. Unfortunately, considerable measurement error associated with KXRF estimations can introduce bias in estimates of the effect of bone lead when this variable is included as the exposure in a regression model. Estimates of uncertainty reported by the KXRF instrument reflect the variance of the measurement error and, although they can be used to correct the measurement error bias, they are seldom used in epidemiological statistical analyses. Errors-in-variables regression (EIV) allows for correction of bias caused by measurement error in predictor variables, based on knowledge of the reliability of such variables. The authors propose a way to obtain reliability coefficients for bone lead measurements from uncertainty data reported by the KXRF instrument and compare, by the use of Monte Carlo simulations, results obtained using EIV regression models vs. those obtained by standard procedures. Results of the simulations show that Ordinary Least Squares (OLS) regression models provide severely biased estimates of effect, and that EIV provides nearly unbiased estimates. Although EIV effect estimates are more imprecise, their mean squared error is much smaller than that of OLS estimates. In conclusion, EIV is a better alternative than OLS for estimating the effect of bone lead when measured by KXRF. Copyright © 2010 Elsevier Inc. All rights reserved.
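The attenuation-and-correction mechanism the simulations demonstrate can be illustrated with a method-of-moments errors-in-variables sketch. The numbers (true slope 2.0, error standard deviation 0.8) are hypothetical, and the reliability coefficient is treated as known from the instrument's reported uncertainty, in the spirit of what the authors propose:

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta_true = 5000, 2.0
x_true = rng.normal(0.0, 1.0, n)          # latent true exposure (bone lead)
u = rng.normal(0.0, 0.8, n)               # instrument measurement error
x_obs = x_true + u                        # observed, error-prone exposure
y = beta_true * x_true + rng.normal(0.0, 1.0, n)

# Naive OLS on the error-prone predictor: attenuated slope
b_ols = np.cov(x_obs, y)[0, 1] / np.var(x_obs, ddof=1)

# Reliability coefficient lambda = var(X) / (var(X) + var(U)),
# assumed known here from the instrument's uncertainty estimates
lam = 1.0 / (1.0 + 0.8 ** 2)
b_eiv = b_ols / lam                       # method-of-moments EIV correction
```

With a reliability of about 0.61 here, the OLS slope lands near 1.2 while the corrected slope recovers the true value of 2, at the cost of a larger variance.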
Two Enhancements of the Logarithmic Least-Squares Method for Analyzing Subjective Comparisons
1989-03-25
error term. For this model, the total sum of squares (SSTO), defined as SSTO = Σ_{i=1}^{n} (y_i − ȳ)², can be partitioned into error and regression sums…of the regression line around the mean value. Mathematically, for the model given by equation A.4, SSTO = SSE + SSR (A.6), where SSTO is the total sum of squares (i.e., the variance of the y_i's), SSE is the error sum of squares, and SSR is the regression sum of squares. SSTO, SSE, and SSR are given
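The decomposition SSTO = SSE + SSR quoted in this snippet holds exactly for an OLS fit that includes an intercept; a quick numerical check on simulated data:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 50)
y = 3.0 + 0.5 * x + rng.normal(0, 1, 50)

# Fit simple OLS: y_hat = b0 + b1 * x
b1 = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

ssto = np.sum((y - y.mean()) ** 2)     # total sum of squares
sse = np.sum((y - y_hat) ** 2)         # error sum of squares
ssr = np.sum((y_hat - y.mean()) ** 2)  # regression sum of squares
# For OLS with an intercept, ssto == sse + ssr up to rounding
```

The identity fails for fits without an intercept or non-OLS estimators, which is why the partition is stated for this specific model.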
A New Test of Linear Hypotheses in OLS Regression under Heteroscedasticity of Unknown Form
ERIC Educational Resources Information Center
Cai, Li; Hayes, Andrew F.
2008-01-01
When the errors in an ordinary least squares (OLS) regression model are heteroscedastic, hypothesis tests involving the regression coefficients can have Type I error rates that are far from the nominal significance level. Asymptotically, this problem can be rectified with the use of a heteroscedasticity-consistent covariance matrix (HCCM)…
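One widely used HCCM estimator is HC3, which rescales each squared residual by its leverage before forming the covariance sandwich. A compact sketch of HC3 standard errors (this illustrates the covariance estimator, not the specific test proposed in the article):

```python
import numpy as np

def hc3_se(X, y):
    """HC3 heteroscedasticity-consistent standard errors for OLS.

    X: (n, p) design matrix including an intercept column.
    Returns (beta_hat, robust standard errors).
    """
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta                              # residuals
    h = np.einsum('ij,jk,ik->i', X, XtX_inv, X)   # leverages diag(H)
    omega = (e / (1.0 - h)) ** 2                  # HC3 weights
    cov = XtX_inv @ (X.T * omega) @ X @ XtX_inv   # sandwich estimator
    return beta, np.sqrt(np.diag(cov))
```

Hypothesis tests then use the robust standard errors in place of the classical ones, restoring approximately nominal Type I error rates under heteroscedasticity.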
Barnwell-Ménard, Jean-Louis; Li, Qing; Cohen, Alan A
2015-03-15
The loss of signal associated with categorizing a continuous variable is well known, and previous studies have demonstrated that this can lead to an inflation of Type-I error when the categorized variable is a confounder in a regression analysis estimating the effect of an exposure on an outcome. However, it is not known how the Type-I error may vary under different circumstances, including logistic versus linear regression, different distributions of the confounder, and different categorization methods. Here, we analytically quantified the effect of categorization and then performed a series of 9600 Monte Carlo simulations to estimate the Type-I error inflation associated with categorization of a confounder under different regression scenarios. We show that Type-I error is unacceptably high (>10% in most scenarios and often 100%). The only exception was when the variable categorized was a continuous mixture proxy for a genuinely dichotomous latent variable, where both the continuous proxy and the categorized variable are error-ridden proxies for the dichotomous latent variable. As expected, error inflation was also higher with larger sample size, fewer categories, and stronger associations between the confounder and the exposure or outcome. We provide online tools that can help researchers estimate the potential error inflation and understand how serious a problem this is. Copyright © 2014 John Wiley & Sons, Ltd.
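The inflation mechanism can be reproduced with a small Monte Carlo sketch: a continuous confounder is median-dichotomized before adjustment, and a nominal 5% test on an exposure with no true effect rejects far too often because of residual confounding. The simulation sizes here are illustrative, not the paper's 9600-run design:

```python
import numpy as np

rng = np.random.default_rng(3)

def slope_tstat(Z, y):
    # OLS t-statistic for the last column of design matrix Z
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    b = ZtZ_inv @ Z.T @ y
    resid = y - Z @ b
    s2 = resid @ resid / (len(y) - Z.shape[1])
    return b[-1] / np.sqrt(s2 * ZtZ_inv[-1, -1])

n, n_sims, rejections = 500, 400, 0
for _ in range(n_sims):
    c = rng.normal(size=n)                    # continuous confounder
    x = c + rng.normal(size=n)                # exposure, confounded by c
    y = c + rng.normal(size=n)                # outcome; x has NO true effect
    c_cat = (c > np.median(c)).astype(float)  # dichotomized confounder
    Z = np.column_stack([np.ones(n), c_cat, x])
    if abs(slope_tstat(Z, y)) > 1.96:         # nominal 5% two-sided test
        rejections += 1
type1 = rejections / n_sims                   # far above the nominal 0.05
```

Adjusting for the continuous confounder instead of its dichotomized version brings the rejection rate back to roughly 5%.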
Adam-Poupart, Ariane; Brand, Allan; Fournier, Michel; Jerrett, Michael; Smargiassi, Audrey
2014-09-01
Ambient air ozone (O3) is a pulmonary irritant that has been associated with respiratory health effects including increased lung inflammation and permeability, airway hyperreactivity, respiratory symptoms, and decreased lung function. Estimation of O3 exposure is a complex task because the pollutant exhibits complex spatiotemporal patterns. To refine the quality of exposure estimation, various spatiotemporal methods have been developed worldwide. We sought to compare the accuracy of three spatiotemporal models to predict summer ground-level O3 in Quebec, Canada. We developed a land-use mixed-effects regression (LUR) model based on readily available data (air quality and meteorological monitoring data, road networks information, latitude), a Bayesian maximum entropy (BME) model incorporating both O3 monitoring station data and the land-use mixed model outputs (BME-LUR), and a kriging method model based only on available O3 monitoring station data (BME kriging). We performed leave-one-station-out cross-validation and visually assessed the predictive capability of each model by examining the mean temporal and spatial distributions of the average estimated errors. The BME-LUR was the best predictive model (R2 = 0.653) with the lowest root mean-square error (RMSE = 7.06 ppb), followed by the LUR model (R2 = 0.466, RMSE = 8.747 ppb) and the BME kriging model (R2 = 0.414, RMSE = 9.164 ppb). Our findings suggest that errors of estimation in the interpolation of O3 concentrations with BME can be greatly reduced by incorporating outputs from a LUR model developed with readily available data.
Monitoring Building Deformation with InSAR: Experiments and Validation
Yang, Kui; Yan, Li; Huang, Guoman; Chen, Chu; Wu, Zhengpeng
2016-01-01
Synthetic Aperture Radar Interferometry (InSAR) techniques are increasingly applied for monitoring land subsidence. The advantages of InSAR include high accuracy and the ability to cover large areas; nevertheless, research validating the use of InSAR on building deformation is limited. In this paper, we test the monitoring capability of InSAR in experiments on two landmark buildings, the Bohai Building and the China Theater, located in Tianjin, China. They were selected as real examples for comparing InSAR and leveling approaches to building deformation. Ten TerraSAR-X images spanning half a year were used in Permanent Scatterer InSAR processing. The extracted InSAR results were processed considering the diversity in both direction and spatial distribution, and were compared with true leveling values in both Ordinary Least Squares (OLS) regression and measurement-of-error analyses. The detailed experimental results for the Bohai Building and the China Theater showed a high correlation between InSAR results and the leveling values. At the same time, the two Root Mean Square Error (RMSE) indexes had values of approximately 1 mm. These analyses show that a millimeter level of accuracy can be achieved by means of the InSAR technique when measuring building deformation. We discuss the differences in accuracy between OLS regression and measurement-of-error analyses, and compare them with the accuracy index of leveling, in order to propose InSAR accuracy levels appropriate for monitoring building deformation. After assessing the advantages and limitations of InSAR techniques in monitoring buildings, further applications are evaluated. PMID:27999403
NASA Astrophysics Data System (ADS)
Kang, Pilsang; Koo, Changhoi; Roh, Hokyu
2017-11-01
Since simple linear regression theory was established at the beginning of the 1900s, it has been used in a variety of fields. Unfortunately, it cannot be used directly for calibration. In practical calibrations, the observed measurements (the inputs) are subject to errors, and hence they vary, thus violating the assumption that the inputs are fixed. Therefore, in the case of calibration, the regression line fitted using the method of least squares is not consistent with the statistical properties of simple linear regression as already established based on this assumption. To resolve this problem, "classical regression" and "inverse regression" have been proposed. However, they do not completely resolve the problem. As a fundamental solution, we introduce "reversed inverse regression" along with a new methodology for deriving its statistical properties. In this study, the statistical properties of this regression are derived using the "error propagation rule" and the "method of simultaneous error equations" and are compared with those of the existing regression approaches. The accuracy of the statistical properties thus derived is investigated in a simulation study. We conclude that the newly proposed regression and methodology constitute the complete regression approach for univariate linear calibrations.
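The two existing approaches the paper contrasts, classical and inverse calibration, can be sketched on simulated instrument data. This illustrates the baseline methods only, not the proposed reversed inverse regression; the instrument parameters (intercept 2.0, slope 1.5, noise 0.2) are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(4)
x_std = np.linspace(0, 10, 11)                  # known reference standards
y_obs = 2.0 + 1.5 * x_std + rng.normal(0, 0.2, x_std.size)  # instrument readings

# Classical calibration: regress readings on standards, then invert the line
b1 = np.cov(x_std, y_obs)[0, 1] / np.var(x_std, ddof=1)
b0 = y_obs.mean() - b1 * x_std.mean()
def classical(y_new):
    return (y_new - b0) / b1

# Inverse calibration: regress standards directly on readings
d1 = np.cov(y_obs, x_std)[0, 1] / np.var(y_obs, ddof=1)
d0 = x_std.mean() - d1 * y_obs.mean()
def inverse(y_new):
    return d0 + d1 * y_new
```

Both estimators agree near the center of the calibration range but differ in bias and variance toward the extremes, which is the tension the paper's approach aims to resolve.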
Adam-Poupart, Ariane; Brand, Allan; Fournier, Michel; Jerrett, Michael
2014-01-01
Background: Ambient air ozone (O3) is a pulmonary irritant that has been associated with respiratory health effects including increased lung inflammation and permeability, airway hyperreactivity, respiratory symptoms, and decreased lung function. Estimation of O3 exposure is a complex task because the pollutant exhibits complex spatiotemporal patterns. To refine the quality of exposure estimation, various spatiotemporal methods have been developed worldwide. Objectives: We sought to compare the accuracy of three spatiotemporal models to predict summer ground-level O3 in Quebec, Canada. Methods: We developed a land-use mixed-effects regression (LUR) model based on readily available data (air quality and meteorological monitoring data, road networks information, latitude), a Bayesian maximum entropy (BME) model incorporating both O3 monitoring station data and the land-use mixed model outputs (BME-LUR), and a kriging method model based only on available O3 monitoring station data (BME kriging). We performed leave-one-station-out cross-validation and visually assessed the predictive capability of each model by examining the mean temporal and spatial distributions of the average estimated errors. Results: The BME-LUR was the best predictive model (R2 = 0.653) with the lowest root mean-square error (RMSE = 7.06 ppb), followed by the LUR model (R2 = 0.466, RMSE = 8.747 ppb) and the BME kriging model (R2 = 0.414, RMSE = 9.164 ppb). Conclusions: Our findings suggest that errors of estimation in the interpolation of O3 concentrations with BME can be greatly reduced by incorporating outputs from a LUR model developed with readily available data. Citation: Adam-Poupart A, Brand A, Fournier M, Jerrett M, Smargiassi A. 2014. Spatiotemporal modeling of ozone levels in Quebec (Canada): a comparison of kriging, land-use regression (LUR), and combined Bayesian maximum entropy–LUR approaches. Environ Health Perspect 122:970–976; http://dx.doi.org/10.1289/ehp.1306566 PMID:24879650
Accounting for substitution and spatial heterogeneity in a labelled choice experiment.
Lizin, S; Brouwer, R; Liekens, I; Broeckx, S
2016-10-01
Many environmental valuation studies using stated preference techniques are single-site studies that ignore essential spatial aspects, including possible substitution effects. In this paper, substitution effects are captured explicitly in the design of a labelled choice experiment and through the inclusion of different distance variables in the choice model specification. We test the effect of spatial heterogeneity on welfare estimates and transfer errors for minor and major river restoration works, and the transferability of river-specific utility functions, accounting for key variables such as site visitation, spatial clustering and income. River-specific utility functions appear to be transferable, resulting in low transfer errors. However, ignoring spatial heterogeneity increases transfer errors. Copyright © 2016 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Alexander, R. B.; Boyer, E. W.; Schwarz, G. E.; Smith, R. A.
2013-12-01
Estimating water and material stores and fluxes in watershed studies is frequently complicated by uncertainties in quantifying hydrological and biogeochemical effects of factors such as land use, soils, and climate. Although these process-related effects are commonly measured and modeled in separate catchments, researchers are especially challenged by their complexity across catchments and diverse environmental settings, leading to a poor understanding of how model parameters and prediction uncertainties vary spatially. To address these concerns, we illustrate the use of Bayesian hierarchical modeling techniques with a dynamic version of the spatially referenced watershed model SPARROW (SPAtially Referenced Regression On Watershed attributes). The dynamic SPARROW model is designed to predict streamflow and other water cycle components (e.g., evapotranspiration, soil and groundwater storage) for monthly varying hydrological regimes, using mechanistic functions, mass conservation constraints, and statistically estimated parameters. In this application, the model domain includes nearly 30,000 NHD (National Hydrography Dataset) stream reaches and their associated catchments in the Susquehanna River Basin. We report the results of our comparisons of alternative models of varying complexity, including models with different explanatory variables as well as hierarchical models that account for spatial and temporal variability in model parameters and variance (error) components. The model errors are evaluated for changes with season and catchment size and correlations in time and space. The hierarchical models consist of a two-tiered structure in which climate forcing parameters are modeled as random variables, conditioned on watershed properties. Quantification of spatial and temporal variations in the hydrological parameters and model uncertainties in this approach leads to more efficient (lower variance) and less biased model predictions throughout the river network.
Moreover, predictions of water-balance components are reported according to probabilistic metrics (e.g., percentiles, prediction intervals) that include both parameter and model uncertainties. These improvements in predictions of streamflow dynamics can inform the development of more accurate predictions of spatial and temporal variations in biogeochemical stores and fluxes (e.g., nutrients and carbon) in watersheds.
Chang, Howard H.; Hu, Xuefei; Liu, Yang
2014-01-01
There has been a growing interest in the use of satellite-retrieved aerosol optical depth (AOD) to estimate ambient concentrations of PM2.5 (particulate matter <2.5 μm in aerodynamic diameter). With their broad spatial coverage, satellite data can increase the spatial–temporal availability of air quality data beyond ground monitoring measurements and potentially improve exposure assessment for population-based health studies. This paper describes a statistical downscaling approach that brings together (1) recent advances in PM2.5 land use regression models utilizing AOD and (2) statistical data fusion techniques for combining air quality data sets that have different spatial resolutions. Statistical downscaling assumes the associations between AOD and PM2.5 concentrations to be spatially and temporally dependent and offers two key advantages. First, it enables us to use gridded AOD data to predict PM2.5 concentrations at spatial point locations. Second, the unified hierarchical framework provides straightforward uncertainty quantification in the predicted PM2.5 concentrations. The proposed methodology is applied to a data set of daily AOD values in southeastern United States during the period 2003–2005. Via cross-validation experiments, our model had an out-of-sample prediction R2 of 0.78 and a root mean-squared error (RMSE) of 3.61 μg/m3 between observed and predicted daily PM2.5 concentrations. This corresponds to a 10% decrease in RMSE compared with the same land use regression model without AOD as a predictor. Prediction performances of spatial–temporal interpolations to locations and on days without monitoring PM2.5 measurements were also examined. PMID:24368510
Assessment of Chlorophyll-a Algorithms Considering Different Trophic Statuses and Optimal Bands.
Salem, Salem Ibrahim; Higa, Hiroto; Kim, Hyungjun; Kobayashi, Hiroshi; Oki, Kazuo; Oki, Taikan
2017-07-31
Numerous algorithms have been proposed to retrieve chlorophyll-a concentrations in Case 2 waters; however, the retrieval accuracy is far from satisfactory. In this research, seven algorithms are assessed with different band combinations of multispectral and hyperspectral bands using linear (LN), quadratic polynomial (QP) and power (PW) regression approaches, resulting in altogether 43 algorithmic combinations. These algorithms are evaluated by using simulated and measured datasets to understand the strengths and limitations of these algorithms. Two simulated datasets comprising 500,000 reflectance spectra each, both based on wide ranges of inherent optical properties (IOPs), are generated for the calibration and validation stages. Results reveal that the regression approach (i.e., LN, QP, and PW) has more influence on the simulated dataset than on the measured one. The algorithms that incorporated linear regression provide the highest retrieval accuracy for the simulated dataset. Results from the simulated datasets reveal that the 3-band (3b) algorithms that incorporate the 665-nm band, the 680-nm band, and the band-tuning selection approach outperformed the other algorithms, with root mean square errors (RMSE) of 15.87 mg·m−3, 16.25 mg·m−3, and 19.05 mg·m−3, respectively. The spatial distribution of the best-performing algorithms, for various combinations of chlorophyll-a (Chla) and non-algal particle (NAP) concentrations, shows that the 3b_tuning_QP and 3b_680_QP outperform other algorithms in terms of minimum RMSE frequency of 33.19% and 60.52%, respectively. However, the two algorithms failed to accurately retrieve Chla for many combinations of Chla and NAP, particularly for low Chla and NAP concentrations. In addition, the spatial distribution emphasizes that no single algorithm can provide outstanding accuracy for Chla retrieval and that multiple algorithms should be combined to reduce the error.
Comparing the results of the measured and simulated datasets reveals that the algorithms that incorporate the 665-nm band outperform other algorithms for the measured dataset (RMSE = 36.84 mg·m−3), while algorithms that incorporate the band-tuning approach provide the highest retrieval accuracy for the simulated dataset (RMSE = 25.05 mg·m−3).
Assessment of Chlorophyll-a Algorithms Considering Different Trophic Statuses and Optimal Bands
Higa, Hiroto; Kobayashi, Hiroshi; Oki, Kazuo
2017-01-01
Numerous algorithms have been proposed to retrieve chlorophyll-a concentrations in Case 2 waters; however, the retrieval accuracy is far from satisfactory. In this research, seven algorithms are assessed with different band combinations of multispectral and hyperspectral bands using linear (LN), quadratic polynomial (QP) and power (PW) regression approaches, resulting in altogether 43 algorithmic combinations. These algorithms are evaluated by using simulated and measured datasets to understand the strengths and limitations of these algorithms. Two simulated datasets comprising 500,000 reflectance spectra each, both based on wide ranges of inherent optical properties (IOPs), are generated for the calibration and validation stages. Results reveal that the regression approach (i.e., LN, QP, and PW) has more influence on the simulated dataset than on the measured one. The algorithms that incorporated linear regression provide the highest retrieval accuracy for the simulated dataset. Results from the simulated datasets reveal that the 3-band (3b) algorithms that incorporate the 665-nm band, the 680-nm band, and the band-tuning selection approach outperformed the other algorithms, with root mean square errors (RMSE) of 15.87 mg·m−3, 16.25 mg·m−3, and 19.05 mg·m−3, respectively. The spatial distribution of the best-performing algorithms, for various combinations of chlorophyll-a (Chla) and non-algal particle (NAP) concentrations, shows that the 3b_tuning_QP and 3b_680_QP outperform other algorithms in terms of minimum RMSE frequency of 33.19% and 60.52%, respectively. However, the two algorithms failed to accurately retrieve Chla for many combinations of Chla and NAP, particularly for low Chla and NAP concentrations. In addition, the spatial distribution emphasizes that no single algorithm can provide outstanding accuracy for Chla retrieval and that multiple algorithms should be combined to reduce the error.
Comparing the results of the measured and simulated datasets reveals that the algorithms that incorporate the 665-nm band outperform other algorithms for the measured dataset (RMSE = 36.84 mg·m−3), while algorithms that incorporate the band-tuning approach provide the highest retrieval accuracy for the simulated dataset (RMSE = 25.05 mg·m−3). PMID:28758984
A new polishing process for large-aperture and high-precision aspheric surface
NASA Astrophysics Data System (ADS)
Nie, Xuqing; Li, Shengyi; Dai, Yifan; Song, Ci
2013-07-01
High-precision aspheric surfaces are difficult to achieve because of mid-spatial frequency (MSF) error in the finishing step. The influence of MSF error is studied through simulations and experiments. In this paper, a new polishing process based on magnetorheological finishing (MRF), smooth polishing (SP) and ion beam figuring (IBF) is proposed. A 400-mm-aperture parabolic surface is polished with this new process. Smooth polishing is applied after rough machining to control the MSF error. In the middle finishing step, most of the low-spatial frequency error is removed rapidly by MRF, then the MSF error is restricted by SP; finally, ion beam figuring is used to finish the surface. The surface accuracy is improved from the initial 37.691 nm (rms, 95% aperture) to a final 4.195 nm. The results show that the new polishing process is effective for manufacturing large-aperture, high-precision aspheric surfaces.
Deterministic error correction for nonlocal spatial-polarization hyperentanglement
Li, Tao; Wang, Guan-Yu; Deng, Fu-Guo; Long, Gui-Lu
2016-01-01
Hyperentanglement is an effective quantum source for quantum communication networks owing to its high capacity, low loss rate, and unusual capability of teleporting a quantum particle fully. Here we present a deterministic error-correction scheme for nonlocal spatial-polarization hyperentangled photon pairs over collective-noise channels. In our scheme, the spatial-polarization hyperentanglement is first encoded into a spatially defined time-bin entanglement with identical polarization before it is transmitted over collective-noise channels, which leads to the rejection of errors in the spatial entanglement during transmission. The polarization noise affecting the polarization entanglement can be corrected with a proper one-step decoding procedure. The two parties in quantum communication can, in principle, obtain a nonlocal maximally entangled spatial-polarization hyperentanglement in a deterministic way, which makes our protocol more convenient than others in long-distance quantum communication. PMID:26861681
Deterministic error correction for nonlocal spatial-polarization hyperentanglement.
Li, Tao; Wang, Guan-Yu; Deng, Fu-Guo; Long, Gui-Lu
2016-02-10
Hyperentanglement is an effective quantum source for quantum communication networks owing to its high capacity, low loss rate, and unusual capability of teleporting a quantum particle fully. Here we present a deterministic error-correction scheme for nonlocal spatial-polarization hyperentangled photon pairs over collective-noise channels. In our scheme, the spatial-polarization hyperentanglement is first encoded into a spatially defined time-bin entanglement with identical polarization before it is transmitted over collective-noise channels, which leads to the rejection of errors in the spatial entanglement during transmission. The polarization noise affecting the polarization entanglement can be corrected with a proper one-step decoding procedure. The two parties in quantum communication can, in principle, obtain a nonlocal maximally entangled spatial-polarization hyperentanglement in a deterministic way, which makes our protocol more convenient than others in long-distance quantum communication.
Hanigan, Ivan C; Williamson, Grant J; Knibbs, Luke D; Horsley, Joshua; Rolfe, Margaret I; Cope, Martin; Barnett, Adrian G; Cowie, Christine T; Heyworth, Jane S; Serre, Marc L; Jalaludin, Bin; Morgan, Geoffrey G
2017-11-07
Exposure to traffic-related nitrogen dioxide (NO2) air pollution is associated with adverse health outcomes. Average pollutant concentrations from fixed monitoring sites are often used to estimate exposures for health studies; however, these can be imprecise due to the difficulty and cost of spatial modeling at the resolution of neighborhoods (e.g., a scale of tens of meters) rather than at a coarse scale (around several kilometers). The objective of this study was to derive improved estimates of neighborhood NO2 concentrations by blending measurements with modeled predictions in Sydney, Australia (a low-pollution environment). We implemented the Bayesian maximum entropy approach to blend data with uncertainty defined using informative priors. We compiled NO2 data from fixed-site monitors, chemical transport models, and satellite-based land use regression models to estimate neighborhood annual average NO2. The spatial model produced a posterior probability density function of estimated annual average concentrations that spanned an order of magnitude, from 3 to 35 ppb. Validation using independent data showed improvement, with a root mean squared error improvement of 6% compared with the land use regression model and 16% over the chemical transport model. These estimates will be used in studies of health effects and should minimize misclassification bias.
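The full Bayesian maximum entropy machinery is involved, but its core idea of combining data sources weighted by their uncertainty can be illustrated with a simple precision-weighted blend. This is a drastic simplification with hypothetical inputs, not the study's method:

```python
import numpy as np

def precision_blend(means, variances):
    """Precision-weighted combination of independent estimates of the
    same quantity (e.g., a CTM prediction, a LUR prediction, and a
    monitor value). Lower-variance sources get more weight."""
    means = np.asarray(means, float)
    w = 1.0 / np.asarray(variances, float)   # precision = inverse variance
    mean = np.sum(w * means) / np.sum(w)
    var = 1.0 / np.sum(w)                    # blended variance shrinks
    return mean, var
```

Two equally uncertain estimates average evenly; a precise monitor value paired with an uncertain model prediction pulls the blend strongly toward the monitor.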
Post-processing through linear regression
NASA Astrophysics Data System (ADS)
van Schaeybroeck, B.; Vannitsem, S.
2011-03-01
Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified. These techniques are applied in the context of the Lorenz '63 system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors play an important role. Contrary to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR), which yield the correct variability and the largest correlation between ensemble error and spread, should be preferred.
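The baseline OLS member of this comparison is a standard MOS-style correction: regress observations on forecasts over a training period, then apply the fitted line to new forecasts. A sketch with synthetic biased forecasts (the bias and noise parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
truth = rng.normal(0, 2, 400)
fcst = 0.5 + 1.3 * truth + rng.normal(0, 0.5, 400)  # biased, over-dispersive forecast

def ols_postprocess(f_train, o_train):
    """Fit o ≈ a + b * f on a training period; return a corrector."""
    b = np.cov(f_train, o_train)[0, 1] / np.var(f_train, ddof=1)
    a = o_train.mean() - b * f_train.mean()
    return lambda f: a + b * np.asarray(f, float)

# Train on the first half, verify on the second half
correct = ols_postprocess(fcst[:200], truth[:200])
rmse_raw = np.sqrt(np.mean((fcst[200:] - truth[200:]) ** 2))
rmse_cor = np.sqrt(np.mean((correct(fcst[200:]) - truth[200:]) ** 2))
```

The corrected forecast removes the additive and multiplicative bias, but, as the article notes, OLS-corrected forecasts also have reduced variability, which the EVMOS and TDTR schemes are designed to restore.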
Ordinary least squares regression is indicated for studies of allometry.
Kilmer, J T; Rodríguez, R L
2017-01-01
When it comes to fitting simple allometric slopes through measurement data, evolutionary biologists have been torn between regression methods. On the one hand, there is the ordinary least squares (OLS) regression, which is commonly used across many disciplines of biology to fit lines through data, but which has a reputation for underestimating slopes when measurement error is present. On the other hand, there is the reduced major axis (RMA) regression, which is often recommended as a substitute for OLS regression in studies of allometry, but which has several weaknesses of its own. Here, we review statistical theory as it applies to evolutionary biology and studies of allometry. We point out that the concerns that arise from measurement error for OLS regression are small and straightforward to deal with, whereas RMA has several key properties that make it unfit for use in the field of allometry. The recommended approach for researchers interested in allometry is to use OLS regression on measurements taken with low (but realistically achievable) measurement error. If measurement error is unavoidable and relatively large, it is preferable to correct for slope attenuation rather than to turn to RMA regression, or to take the expected amount of attenuation into account when interpreting the data. © 2016 European Society For Evolutionary Biology.
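The relationships at stake here, that the RMA slope equals the OLS slope divided by |r| and that attenuation can be corrected by dividing the OLS slope by the reliability, can be checked numerically (illustrative slope and error variances, with the reliability assumed known):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 2000
x_true = rng.normal(0, 1, n)
y = 0.75 * x_true + rng.normal(0, 0.3, n)   # true allometric slope 0.75
x_meas = x_true + rng.normal(0, 0.2, n)     # modest measurement error in x

b_ols = np.cov(x_meas, y)[0, 1] / np.var(x_meas, ddof=1)  # attenuated
r = np.corrcoef(x_meas, y)[0, 1]
b_rma = np.sign(r) * y.std(ddof=1) / x_meas.std(ddof=1)   # RMA = OLS / |r|

# Attenuation correction: divide the OLS slope by the reliability
# lambda = var(true x) / var(measured x), assumed known here
lam = 1.0 / (1.0 + 0.2 ** 2)
b_corrected = b_ols / lam
```

With small measurement error the attenuation-corrected OLS slope recovers the true value, whereas the RMA slope reflects the ratio of standard deviations regardless of how weak the association is, which is the core of the authors' objection.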
Phenological features for winter rapeseed identification in Ukraine using satellite data
NASA Astrophysics Data System (ADS)
Kravchenko, Oleksiy
2014-05-01
Winter rapeseed is one of the major oilseed crops in Ukraine; it is characterized by high profitability and often grown in violation of crop-rotation requirements, leading to soil degradation. Therefore, rapeseed identification using satellite data is a promising direction for operational estimation of crop acreage and rotation control. The crop acreage of rapeseed is about 0.5-3% of the total area of Ukraine, which poses a major problem for identification using satellite data [1]. While winter rapeseed could be classified using biomass features observed during autumn vegetation, these features are quite unstable due to field-to-field differences in planting dates as well as spatial and temporal heterogeneity in soil moisture availability. For this reason, autumn biomass-level features can be used only locally (at NUTS-3 level) and are not suitable for large-scale, country-wide crop identification. We propose to use crop parameters at the flowering phenological stage for crop identification and present a method for parameter estimation using time series of moderate-resolution data. Rapeseed flowering can be observed as a bell-shaped peak in the red reflectance time series. However, the flowering period observable by satellite lasts only about two weeks, which is quite short given inevitable cloud-coverage issues. Thus we need daily time series to resolve the flowering peak, and for this reason we are limited to moderate-resolution data. We used daily atmospherically corrected MODIS data from the Terra and Aqua satellites within the 90-160 DOY period to perform feature calculations. An empirical BRDF correction is used to minimize angular effects. We used Gaussian Processes Regression (GPR) for temporal interpolation to minimize errors due to residual cloud coverage, atmospheric correction, and mixed-pixel problems. We estimate 12 parameters for each time series.
They are the red and near-infrared (NIR) reflectance and the timing at four stages: before and after flowering, at peak flowering, and at the maximum NIR level. We used a Support Vector Machine for data classification. The most relevant feature for classification is the flowering peak timing, followed by the flowering peak magnitude. The dependency of the peak time on latitude, as a sole feature, can be used to reject 90% of non-rapeseed pixels, which greatly reduces the imbalance of the classification problem. To assess the accuracy of our approach we performed a stratified area-frame sampling survey in Odessa region (NUTS-2 level) in 2013. The omission error is about 12.6%, while the commission error is higher, at the level of 22%. This is explained by the high viewing-angle composition criterion used in our approach to mitigate the high cloud-coverage problem. However, the errors are quite stable spatially and can easily be corrected by a regression technique. To do this we performed area estimation for Odessa region using a regression estimator and obtained good area-estimation accuracy, with a 4.6% error (1σ). [1] Gallego, F.J., et al., Efficiency assessment of using satellite data for crop area estimation in Ukraine. Int. J. Appl. Earth Observ. Geoinf. (2014), http://dx.doi.org/10.1016/j.jag.2013.12.013
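The GPR gap-filling step can be sketched as plain GP interpolation with a squared-exponential (RBF) kernel; the kernel choice and hyperparameters below are assumptions for illustration, not the authors' settings:

```python
import numpy as np

def gpr_interpolate(t_obs, y_obs, t_new, length=8.0, noise=0.01):
    """Minimal GP regression posterior mean for gap-filling a
    reflectance time series (assumed RBF kernel and hyperparameters)."""
    t_obs = np.asarray(t_obs, float)
    t_new = np.asarray(t_new, float)
    def k(a, b):  # squared-exponential kernel
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)
    K = k(t_obs, t_obs) + noise * np.eye(len(t_obs))  # add observation noise
    return k(t_new, t_obs) @ np.linalg.solve(K, np.asarray(y_obs, float))
```

Evaluating `t_new` on a dense daily grid yields a smooth series from which the flowering-peak timing and magnitude can be read off even when individual days are cloud-contaminated.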
Guan, Yongtao; Li, Yehua; Sinha, Rajita
2011-01-01
In a cocaine dependence treatment study, we use linear and nonlinear regression models to model posttreatment cocaine craving scores and first cocaine relapse time. A subset of the covariates are summary statistics derived from baseline daily cocaine use trajectories, such as baseline cocaine use frequency and average daily use amount. These summary statistics are subject to estimation error and can therefore cause biased estimators for the regression coefficients. Unlike classical measurement error problems, the error we encounter here is heteroscedastic with an unknown distribution, and there are no replicates for the error-prone variables or instrumental variables. We propose two robust methods to correct for the bias: a computationally efficient method-of-moments-based method for linear regression models and a subsampling extrapolation method that is generally applicable to both linear and nonlinear regression models. Simulations and an application to the cocaine dependence treatment data are used to illustrate the efficacy of the proposed methods. Asymptotic theory and variance estimation for the proposed subsampling extrapolation method and some additional simulation results are described in the online supplementary material. PMID:21984854
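The method-of-moments idea for linear models can be sketched in a simplified form. The paper treats heteroscedastic error with an unknown distribution; the sketch below assumes the per-subject error variances are known (a stand-in for variances estimable from trajectory length) and all numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
x = rng.normal(0.0, 1.0, n)              # true covariate (e.g. baseline use frequency)
beta = 2.0
y = 1.0 + beta * x + rng.normal(0.0, 1.0, n)

# Heteroscedastic estimation error with known per-subject variance,
# mimicking summary statistics computed from trajectories of varying length.
var_u = rng.uniform(0.2, 0.6, n)
w = x + rng.normal(0.0, np.sqrt(var_u))  # error-prone observed covariate

# Naive OLS slope is attenuated toward zero by the measurement error.
naive = np.cov(w, y, bias=True)[0, 1] / np.var(w)

# Moment correction: subtract the average error variance from var(w).
corrected = np.cov(w, y, bias=True)[0, 1] / (np.var(w) - var_u.mean())
print(round(naive, 2), round(corrected, 2))
```

The naive slope comes out near beta/(1 + mean error variance), while the corrected slope recovers the true coefficient; this is the attenuation/correction logic that motivates the moment-based estimator.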
Spatially distributed modeling of soil organic carbon across China with improved accuracy
NASA Astrophysics Data System (ADS)
Li, Qi-quan; Zhang, Hao; Jiang, Xin-ye; Luo, Youlin; Wang, Chang-quan; Yue, Tian-xiang; Li, Bing; Gao, Xue-song
2017-06-01
There is a need for more detailed spatial information on soil organic carbon (SOC) for the accurate estimation of SOC stock and earth system models. As it is effective to use environmental factors as auxiliary variables to improve the prediction accuracy of spatially distributed modeling, a combined method (HASM_EF) was developed to predict the spatial pattern of SOC across China using high accuracy surface modeling (HASM), artificial neural network (ANN), and principal component analysis (PCA) to introduce land uses, soil types, climatic factors, topographic attributes, and vegetation cover as predictors. The performance of HASM_EF was compared with ordinary kriging (OK), OK, and HASM combined, respectively, with land uses and soil types (OK_LS and HASM_LS), and regression kriging combined with land uses and soil types (RK_LS). Results showed that HASM_EF obtained the lowest prediction errors and the ratio of performance to deviation (RPD) presented the relative improvements of 89.91%, 63.77%, 55.86%, and 42.14%, respectively, compared to the other four methods. Furthermore, HASM_EF generated more details and more realistic spatial information on SOC. The improved performance of HASM_EF can be attributed to the introduction of more environmental factors, to explicit consideration of the multicollinearity of selected factors and the spatial nonstationarity and nonlinearity of relationships between SOC and selected factors, and to the performance of HASM and ANN. This method may play a useful tool in providing more precise spatial information on soil parameters for global modeling across large areas.
On the use of log-transformation vs. nonlinear regression for analyzing biological power laws.
Xiao, Xiao; White, Ethan P; Hooten, Mevin B; Durham, Susan L
2011-10-01
Power-law relationships are among the most well-studied functional relationships in biology. Recently the common practice of fitting power laws using linear regression (LR) on log-transformed data has been criticized, calling into question the conclusions of hundreds of studies. It has been suggested that nonlinear regression (NLR) is preferable, but no rigorous comparison of these two methods has been conducted. Using Monte Carlo simulations, we demonstrate that the error distribution determines which method performs better, with NLR better characterizing data with additive, homoscedastic, normal error and LR better characterizing data with multiplicative, heteroscedastic, lognormal error. Analysis of 471 biological power laws shows that both forms of error occur in nature. While previous analyses based on log-transformation appear to be generally valid, future analyses should choose methods based on a combination of biological plausibility and analysis of the error distribution. We provide detailed guidelines and associated computer code for doing so, including a model averaging approach for cases where the error structure is uncertain.
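The LR-vs-NLR comparison is easy to reproduce. A minimal sketch under multiplicative lognormal error (where LR on log-transformed data is the matching model); the exponent, prefactor, and noise level are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(2)
x = rng.uniform(1.0, 100.0, 400)
a, b = 2.0, 0.75                          # hypothetical power law y = a * x^b

# Multiplicative, heteroscedastic, lognormal error.
y_mult = a * x**b * rng.lognormal(0.0, 0.3, x.size)

# LR: ordinary least squares on log(y) = log(a) + b*log(x).
slope, intercept = np.polyfit(np.log(x), np.log(y_mult), 1)

# NLR: nonlinear least squares on the raw scale (assumes additive error).
(a_nlr, b_nlr), _ = curve_fit(lambda x, a, b: a * x**b, x, y_mult, p0=(1.0, 1.0))
print(round(slope, 3), round(b_nlr, 3))
```

With this error structure the log-scale slope is an unbiased, low-variance estimate of the exponent, while the raw-scale fit is noisier because it over-weights the largest observations; swapping in additive normal noise reverses the ranking, which is the paper's central point.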
Impact of Spatial Soil and Climate Input Data Aggregation on Regional Yield Simulations
Hoffmann, Holger; Zhao, Gang; Asseng, Senthold; Bindi, Marco; Biernath, Christian; Constantin, Julie; Coucheney, Elsa; Dechow, Rene; Doro, Luca; Eckersten, Henrik; Gaiser, Thomas; Grosz, Balázs; Heinlein, Florian; Kassie, Belay T.; Kersebaum, Kurt-Christian; Klein, Christian; Kuhnert, Matthias; Lewan, Elisabet; Moriondo, Marco; Nendel, Claas; Priesack, Eckart; Raynal, Helene; Roggero, Pier P.; Rötter, Reimund P.; Siebert, Stefan; Specka, Xenia; Tao, Fulu; Teixeira, Edmar; Trombi, Giacomo; Wallach, Daniel; Weihermüller, Lutz; Yeluripati, Jagadeesh; Ewert, Frank
2016-01-01
We show the error in water-limited yields simulated by crop models that is associated with spatially aggregated soil and climate input data. Crop simulations at large scales (regional, national, continental) frequently use input data of low resolution. Therefore, climate and soil data are often generated via averaging and sampling by area majority. This may bias simulated yields at large scales, with the bias varying widely across models. Thus, we evaluated the error associated with spatially aggregated soil and climate data for 14 crop models. Yields of winter wheat and silage maize were simulated under water-limited production conditions. We calculated this error from crop yields simulated at spatial resolutions from 1 to 100 km for the state of North Rhine-Westphalia, Germany. Most models showed yields biased by <15% when aggregating only soil data. The relative mean absolute error (rMAE) of most models using aggregated soil data was in the range of, or larger than, the inter-annual or inter-model variability in yields. This error increased further when both climate and soil data were aggregated. Distinct error patterns indicate that the rMAE may be estimated from a few soil variables. Illustrating the range of these aggregation effects across models, this study is a first step towards an ex-ante assessment of aggregation errors in large-scale simulations. PMID:27055028
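The rMAE metric used in the study is a one-liner to compute. A sketch with hypothetical yield values (t/ha), not data from the paper:

```python
import numpy as np

# Hypothetical simulated yields: a high-resolution run vs. the same model
# driven by spatially aggregated soil and climate input.
y_highres = np.array([7.8, 6.9, 8.4, 7.1, 6.5, 7.9])
y_agg = np.array([7.2, 7.4, 7.9, 7.6, 6.1, 8.3])

# Relative mean absolute error of the aggregated run against the
# high-resolution reference, in percent.
rmae = 100.0 * np.mean(np.abs(y_agg - y_highres)) / np.mean(y_highres)
print(round(rmae, 1))  # -> 6.5
```

Comparing this figure against the inter-annual standard deviation of the reference yields is what lets the authors judge whether the aggregation error is practically important.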
Hoyo, Javier Del; Choi, Heejoo; Burge, James H; Kim, Geon-Hee; Kim, Dae Wook
2017-06-20
The control of surface errors as a function of spatial frequency is critical during the fabrication of modern optical systems. Large-scale surface figure errors are controlled by a guided removal process, such as computer-controlled optical surfacing. Smaller-scale surface errors are controlled by polishing process parameters. Surface errors with periods of only a few millimeters may degrade the performance of an optical system, causing background noise from scattered light and reducing imaging contrast for large optical systems. Conventionally, microsurface roughness is reported as the root mean square over a high spatial frequency range, evaluated on a 0.5×0.5 mm local surface map with 500×500 pixels. This surface specification is not adequate to fully describe the characteristics required for advanced optical systems. The process for controlling and minimizing mid- to high-spatial-frequency surface errors with periods of up to ∼2-3 mm was investigated for many optical fabrication conditions using the measured surface power spectral density (PSD) of a finished Zerodur optical surface. The surface PSD was then systematically related to various fabrication process parameters, such as the grinding method, polishing interface material, and polishing compound. The retraceable experimental polishing conditions and processes used to produce an optimal optical surface PSD are presented.
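A surface PSD of the kind used here can be computed from a measured profile with an FFT. A minimal 1-D sketch on a synthetic trace (the sampling pitch, ripple period, and amplitudes are illustrative assumptions, not values from the paper); by Parseval's theorem the integral of the one-sided PSD recovers the RMS roughness.

```python
import numpy as np

# Hypothetical 1-D surface profile: a 0.5 mm trace sampled at 500 points
# (1 um pitch), with a mid-spatial-frequency ripple plus fine roughness.
n, dx = 500, 1e-3                              # samples, sampling step in mm
x = np.arange(n) * dx
rng = np.random.default_rng(3)
ripple = 5e-6 * np.sin(2 * np.pi * x / 0.05)   # 0.05 mm period component
rough = rng.normal(0.0, 1e-6, n)               # high-frequency roughness
z = ripple + rough

# One-sided PSD (units: height^2 per cycles/mm).
z = z - z.mean()
psd = (np.abs(np.fft.rfft(z)) ** 2) * dx / n
psd[1:-1] *= 2.0                               # fold in negative frequencies
freq = np.fft.rfftfreq(n, d=dx)                # spatial frequency, cycles/mm

# Parseval check: integrating the PSD over frequency returns the RMS^2.
rms_from_psd = np.sqrt(psd.sum() / (n * dx))
ratio = float(rms_from_psd / z.std())
print(round(ratio, 3))
```

Binning `psd` over `freq` ranges is then what separates the figure, mid-spatial-frequency, and roughness regimes that the different fabrication parameters control.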
Spatial abstraction for autonomous robot navigation.
Epstein, Susan L; Aroor, Anoop; Evanusa, Matthew; Sklar, Elizabeth I; Parsons, Simon
2015-09-01
Optimal navigation for a simulated robot relies on a detailed map and explicit path planning, an approach problematic for real-world robots that are subject to noise and error. This paper reports on autonomous robots that rely on local spatial perception, learning, and commonsense rationales instead. Despite realistic actuator error, learned spatial abstractions form a model that supports effective travel.
An improved portmanteau test for autocorrelated errors in interrupted time-series regression models.
Huitema, Bradley E; McKean, Joseph W
2007-08-01
A new portmanteau test for autocorrelation among the errors of interrupted time-series regression models is proposed. Simulation results demonstrate that the inferential properties of the proposed Q(H-M) test statistic are considerably more satisfactory than those of the well-known Ljung-Box test and moderately better than those of the Box-Pierce test. These conclusions generally hold for a wide variety of autoregressive (AR), moving average (MA), and ARMA error processes that are associated with time-series regression models of the form described in Huitema and McKean (2000a, 2000b).
Goldman, Gretchen T; Mulholland, James A; Russell, Armistead G; Strickland, Matthew J; Klein, Mitchel; Waller, Lance A; Tolbert, Paige E
2011-06-22
Two distinctly different types of measurement error are Berkson and classical. Impacts of measurement error in epidemiologic studies of ambient air pollution are expected to depend on error type. We characterize measurement error due to instrument imprecision and spatial variability as multiplicative (i.e. additive on the log scale) and model it over a range of error types to assess impacts on risk ratio estimates both on a per measurement unit basis and on a per interquartile range (IQR) basis in a time-series study in Atlanta. Daily measures of twelve ambient air pollutants were analyzed: NO2, NOx, O3, SO2, CO, PM10 mass, PM2.5 mass, and PM2.5 components sulfate, nitrate, ammonium, elemental carbon and organic carbon. Semivariogram analysis was applied to assess spatial variability. Error due to this spatial variability was added to a reference pollutant time-series on the log scale using Monte Carlo simulations. Each of these time-series was exponentiated and introduced to a Poisson generalized linear model of cardiovascular disease emergency department visits. Measurement error resulted in reduced statistical significance for the risk ratio estimates for all amounts (corresponding to different pollutants) and types of error. When modelled as classical-type error, risk ratios were attenuated, particularly for primary air pollutants, with average attenuation in risk ratios on a per unit of measurement basis ranging from 18% to 92% and on an IQR basis ranging from 18% to 86%. When modelled as Berkson-type error, risk ratios per unit of measurement were biased away from the null hypothesis by 2% to 31%, whereas risk ratios per IQR were attenuated (i.e. biased toward the null) by 5% to 34%. For the CO modelled error amount, a range of error types was simulated and effects on risk ratio bias and significance were observed. For multiplicative error, both the amount and type of measurement error impact health effect estimates in air pollution epidemiology.
By modelling instrument imprecision and spatial variability as different error types, we estimate direction and magnitude of the effects of error over a range of error types.
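The contrasting behaviour of classical and Berkson error can be demonstrated with a few lines of simulation. A sketch using a linear rather than Poisson model for clarity; the error magnitude and sample size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 20000
beta = 1.0
x_true = rng.normal(0.0, 1.0, n)          # true exposure (log scale)
y = beta * x_true + rng.normal(0.0, 1.0, n)

sigma_u = 0.7                              # hypothetical error magnitude

# Classical error: observed = true + noise -> the slope is attenuated
# toward the null by the factor var(x) / (var(x) + sigma_u^2).
x_classical = x_true + rng.normal(0.0, sigma_u, n)
b_classical = np.cov(x_classical, y, bias=True)[0, 1] / np.var(x_classical)

# Berkson error: true = assigned + noise. Simulate by drawing the assigned
# value first, then the truth around it -> the slope stays unbiased.
x_assigned = rng.normal(0.0, 1.0, n)
x_true_b = x_assigned + rng.normal(0.0, sigma_u, n)
y_b = beta * x_true_b + rng.normal(0.0, 1.0, n)
b_berkson = np.cov(x_assigned, y_b, bias=True)[0, 1] / np.var(x_assigned)

print(round(b_classical, 2), round(b_berkson, 2))
```

The classical slope lands near beta/(1 + sigma_u^2) while the Berkson slope stays near beta (with inflated standard errors), mirroring the attenuation-vs-variance trade-off the study quantifies for each pollutant.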
Documentation of procedures for textural/spatial pattern recognition techniques
NASA Technical Reports Server (NTRS)
Haralick, R. M.; Bryant, W. F.
1976-01-01
A C-130 aircraft was flown over the Sam Houston National Forest on March 21, 1973 at 10,000 feet altitude to collect multispectral scanner (MSS) data. Existing textural and spatial automatic processing techniques were used to classify the MSS imagery into specified timber categories. Several classification experiments were performed on these data using features selected from the spectral bands and a textural transform band. The results indicate that (1) spatial post-processing a classified image can cut the classification error to 1/2 or 1/3 of its initial value, (2) spatial post-processing the classified image using combined spectral and textural features produces a resulting image with less error than post-processing a classified image using only spectral features, and (3) classification without spatial post-processing using the combined spectral-textural features tends to produce about the same error rate as classification without spatial post-processing using only spectral features.
NASA Astrophysics Data System (ADS)
Bahi, Hicham; Rhinane, Hassan; Bensalmia, Ahmed
2016-10-01
Air temperature is considered an essential variable for the study and analysis of meteorological regimes and records. However, daily monitoring of this variable is very difficult to achieve: it requires a sufficient density of measurement stations, meteorological parks, and favourable logistics. The present work aims to establish a relationship between day and night land surface temperatures from MODIS data and daily measurements of air temperature acquired during 2011-2012 and provided by the Department of National Meteorology (DMN) of Casablanca, Morocco. The statistical analysis shows significant interdependence for night observations, with a coefficient of determination R2=0.921 and root mean square error RMSE=1.503 for Tmin, while the physical magnitude estimated from daytime MODIS observations shows a relatively coarse error, with R2=0.775 and RMSE=2.037 for Tmax. A method based on Gaussian process regression was applied to compute the spatial distribution of air temperature from MODIS throughout the city of Casablanca.
Incorporating Measurement Error from Modeled Air Pollution Exposures into Epidemiological Analyses.
Samoli, Evangelia; Butland, Barbara K
2017-12-01
Outdoor air pollution exposures used in epidemiological studies are commonly predicted from spatiotemporal models incorporating limited measurements, temporal factors, geographic information system variables, and/or satellite data. Measurement error in these exposure estimates leads to imprecise estimation of health effects and their standard errors. We reviewed methods for measurement error correction that have been applied in epidemiological studies that use model-derived air pollution data. We identified seven cohort studies and one panel study that have employed measurement error correction methods. These methods included regression calibration, risk set regression calibration, regression calibration with instrumental variables, the simulation extrapolation approach (SIMEX), and methods under the non-parametric or parameter bootstrap. Corrections resulted in small increases in the absolute magnitude of the health effect estimate and its standard error under most scenarios. Limited application of measurement error correction methods in air pollution studies may be attributed to the absence of exposure validation data and the methodological complexity of the proposed methods. Future epidemiological studies should consider in their design phase the requirements for the measurement error correction method to be later applied, while methodological advances are needed under the multi-pollutants setting.
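Of the correction methods listed, SIMEX is perhaps the easiest to illustrate. A sketch for a linear model with known classical error variance (the review's settings involve Poisson health models and estimated error structures; the coefficient, error variance, and lambda grid here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4000
beta, sigma_u = 1.5, 0.5
x = rng.normal(0.0, 1.0, n)
w = x + rng.normal(0.0, sigma_u, n)       # error-prone modeled exposure
y = beta * x + rng.normal(0.0, 1.0, n)

def slope(a, b):
    return np.cov(a, b, bias=True)[0, 1] / np.var(a)

# SIMEX: add extra error of variance lambda * sigma_u^2, record the naive
# slope at each lambda, then extrapolate the trend back to lambda = -1
# (the zero-measurement-error limit).
lambdas = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
means = []
for lam in lambdas:
    est = [slope(w + rng.normal(0.0, np.sqrt(lam) * sigma_u, n), y)
           for _ in range(50)]
    means.append(np.mean(est))

# Quadratic extrapolation to lambda = -1.
coef = np.polyfit(lambdas, means, 2)
beta_simex = float(np.polyval(coef, -1.0))
print(round(beta_simex, 2))
```

The naive slope at lambda = 0 is attenuated to about beta/(1 + sigma_u^2); the extrapolated estimate moves most of the way back to the true coefficient, with the residual gap reflecting the approximation error of the quadratic extrapolant.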
Duncan, Dustin T; Kawachi, Ichiro; Kum, Susan; Aldstadt, Jared; Piras, Gianfranco; Matthews, Stephen A; Arbia, Giuseppe; Castro, Marcia C; White, Kellee; Williams, David R
2014-04-01
The racial/ethnic and income composition of neighborhoods often influences local amenities, including the potential spatial distribution of trees, which are important for population health and community wellbeing, particularly in urban areas. This ecological study used spatial analytical methods to assess the relationship between neighborhood socio-demographic characteristics (i.e. minority racial/ethnic composition and poverty) and tree density at the census tract level in Boston, Massachusetts (US). We examined spatial autocorrelation with the Global Moran's I for all study variables and in the ordinary least squares (OLS) regression residuals, and computed Spearman correlations, non-adjusted and adjusted for spatial autocorrelation, between socio-demographic characteristics and tree density. Next, we fit traditional regressions (i.e. OLS regression models) and spatial regressions (i.e. spatial simultaneous autoregressive models), as appropriate. We found significant positive spatial autocorrelation for all neighborhood socio-demographic characteristics (Global Moran's I range 0.24 to 0.86, all P=0.001), for tree density (Global Moran's I=0.452, P=0.001), and in the OLS regression residuals (Global Moran's I range 0.32 to 0.38, all P<0.001). Therefore, we fit the spatial simultaneous autoregressive models. There was a negative correlation between neighborhood percent non-Hispanic Black and tree density (rS=-0.19; conventional P-value=0.016; spatially adjusted P-value=0.299) as well as a negative correlation between predominantly non-Hispanic Black (over 60% Black) neighborhoods and tree density (rS=-0.18; conventional P-value=0.019; spatially adjusted P-value=0.180).
While the conventional OLS regression model found a marginally significant inverse relationship between Black neighborhoods and tree density, we found no statistically significant relationship between neighborhood socio-demographic composition and tree density in the spatial regression models. Methodologically, our study suggests the need to take into account spatial autocorrelation as findings/conclusions can change when the spatial autocorrelation is ignored. Substantively, our findings suggest no need for policy intervention vis-à-vis trees in Boston, though we hasten to add that replication studies, and more nuanced data on tree quality, age and diversity are needed.
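The Global Moran's I statistic central to this study is straightforward to compute from first principles. A self-contained sketch on a hypothetical 5×5 grid of tract values with a smooth spatial gradient, using rook (shared-edge) contiguity weights; the grid and values are illustrative, not the Boston data.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I for a vector of areal values and a binary spatial
    weights matrix (zeros on the diagonal)."""
    z = values - values.mean()
    num = (weights * np.outer(z, z)).sum() / weights.sum()
    return float(num / (z @ z / z.size))

# Hypothetical 5x5 grid of tract tree densities with a smooth gradient.
side = 5
vals = np.add.outer(np.arange(side), np.arange(side)).astype(float).ravel()

# Rook contiguity: cells sharing an edge are neighbors.
n = side * side
W = np.zeros((n, n))
for i in range(side):
    for j in range(side):
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            a, b = i + di, j + dj
            if 0 <= a < side and 0 <= b < side:
                W[i * side + j, a * side + b] = 1.0

I = morans_i(vals, W)
print(round(I, 3))  # -> 0.75
```

A strong positive I like this (against an expectation of -1/(n-1) under independence) is the condition that led the authors to switch from OLS to spatial simultaneous autoregressive models.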
An Expert System for the Evaluation of Cost Models
1990-09-01
contrast to the condition of equal error variance, called homoscedasticity (Reference: Applied Linear Regression Models by John Neter, page 423). ... normal (Reference: Applied Linear Regression Models by John Neter, page 125). Error terms correlated over time are said to be autocorrelated or serially correlated (Reference: Applied Linear Regression Models by John Neter).
High dimensional linear regression models under long memory dependence and measurement error
NASA Astrophysics Data System (ADS)
Kaul, Abhishek
This dissertation consists of three chapters. The first chapter introduces the models under consideration and motivates the problems of interest; a brief literature review is also provided. The second chapter investigates the properties of the Lasso under long range dependent model errors. The Lasso is a computationally efficient approach to model selection and estimation, and its properties are well studied when the regression errors are independent and identically distributed. We study the case where the regression errors form a long memory moving average process. We establish a finite sample oracle inequality for the Lasso solution. We then show asymptotic sign consistency in this setup. These results are established in the high dimensional setup (p > n), where p can increase exponentially with n. Finally, we show the n^(1/2-d)-consistency of the Lasso, along with the oracle property of the adaptive Lasso, in the case where p is fixed; here d is the memory parameter of the stationary error sequence. The performance of the Lasso in the present setup is also analysed with a simulation study. The third chapter proposes and investigates the properties of a penalized quantile based estimator for measurement error models. Standard formulations of prediction problems in high dimensional regression models assume the availability of fully observed covariates and sub-Gaussian, homogeneous model errors. This makes these methods inapplicable to measurement error models, where covariates are unobservable and observations are possibly non sub-Gaussian and heterogeneous. We propose weighted penalized corrected quantile estimators for the regression parameter vector in linear regression models with additive measurement errors, where the unobservable covariates are nonrandom. The proposed estimators forgo the need for the above mentioned model assumptions.
We study these estimators in both the fixed dimension and high dimensional sparse setups, in the latter setup, the dimensionality can grow exponentially with the sample size. In the fixed dimensional setting we provide the oracle properties associated with the proposed estimators. In the high dimensional setting, we provide bounds for the statistical error associated with the estimation, that hold with asymptotic probability 1, thereby providing the ℓ1-consistency of the proposed estimator. We also establish the model selection consistency in terms of the correctly estimated zero components of the parameter vector. A simulation study that investigates the finite sample accuracy of the proposed estimator is also included in this chapter.
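The high-dimensional Lasso setting of the second chapter can be sketched empirically. The sketch below uses a strongly persistent AR(1) error process as a crude stand-in for the long memory moving average errors studied in the dissertation; the dimensions, sparsity pattern, and penalty level are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(7)
n, p = 200, 500                           # high-dimensional: p > n
beta = np.zeros(p)
beta[:5] = [3.0, -2.0, 1.5, -1.0, 2.5]    # sparse truth

X = rng.normal(size=(n, p))

# Persistent AR(1) errors as a rough proxy for long-range dependence.
e = np.zeros(n)
for i in range(1, n):
    e[i] = 0.8 * e[i - 1] + rng.normal()
y = X @ beta + e

fit = Lasso(alpha=0.25).fit(X, y)
support = np.flatnonzero(fit.coef_)
print(len(support))
```

Sign consistency in this simulation means the five true coefficients stay in the estimated support with the correct signs even though the errors are dependent; the dissertation's contribution is proving when this holds and at what rate under genuine long memory.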
Regional regression models of watershed suspended-sediment discharge for the eastern United States
NASA Astrophysics Data System (ADS)
Roman, David C.; Vogel, Richard M.; Schwarz, Gregory E.
2012-11-01
Estimates of mean annual watershed sediment discharge, derived from long-term measurements of suspended-sediment concentration and streamflow, often are not available at locations of interest. The goal of this study was to develop multivariate regression models to enable prediction of mean annual suspended-sediment discharge from available basin characteristics useful for most ungaged river locations in the eastern United States. The models are based on long-term mean sediment discharge estimates and explanatory variables obtained from a combined dataset of 1201 US Geological Survey (USGS) stations derived from a SPAtially Referenced Regression on Watershed attributes (SPARROW) study and the Geospatial Attributes of Gages for Evaluating Streamflow (GAGES) database. The resulting regional regression models, summarized for major US water resources regions 1-8, exhibited prediction R2 values ranging from 76.9% to 92.7% and corresponding average model prediction errors ranging from 56.5% to 124.3%. Results from cross-validation experiments suggest that a majority of the models will perform similarly to calibration runs. The 36-parameter regional regression models also outperformed a 16-parameter national SPARROW model of suspended-sediment discharge and indicate that mean annual sediment loads in the eastern United States generally correlate with a combination of basin area, land use patterns, seasonal precipitation, soil composition, hydrologic modification, and to a lesser extent, topography.
NASA Astrophysics Data System (ADS)
Yoshida, Kenichiro; Nishidate, Izumi; Ojima, Nobutoshi; Iwata, Kayoko
2014-01-01
To quantitatively evaluate skin chromophores over a wide region of curved skin surface, we propose an approach that suppresses the effect of the shading-derived error in the reflectance on the estimation of chromophore concentrations, without sacrificing the accuracy of that estimation. In our method, we use multiple regression analysis, assuming the absorbance spectrum as the response variable and the extinction coefficients of melanin, oxygenated hemoglobin, and deoxygenated hemoglobin as the predictor variables. The concentrations of melanin and total hemoglobin are determined from the multiple regression coefficients using compensation formulae (CF) based on the diffuse reflectance spectra derived from a Monte Carlo simulation. To suppress the shading-derived error, we investigated three different combinations of multiple regression coefficients for the CF. In vivo measurements with the forearm skin demonstrated that the proposed approach can reduce the estimation errors that are due to shading-derived errors in the reflectance. With the best combination of multiple regression coefficients, we estimated that the ratio of the error to the chromophore concentrations is about 10%. The proposed method does not require any measurements or assumptions about the shape of the subjects; this is an advantage over other studies related to the reduction of shading-derived errors.
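The core regression step of this approach, absorbance regressed on chromophore extinction spectra, can be sketched directly. The extinction values, wavelength count, and shading offset below are hypothetical placeholders (real molar extinction tables would be used in practice), and the intercept here plays the role of the baseline term that the shading error perturbs.

```python
import numpy as np

# Hypothetical extinction spectra (arbitrary units) at 6 wavelengths for
# melanin, oxygenated hemoglobin, and deoxygenated hemoglobin.
E = np.array([
    [9.0, 5.5, 6.0],
    [7.5, 3.2, 4.8],
    [6.2, 2.1, 2.0],
    [5.1, 4.4, 3.1],
    [4.3, 2.9, 3.3],
    [3.6, 1.2, 1.5],
])
c_true = np.array([0.8, 0.3, 0.2])        # chromophore concentrations

# Modified Beer-Lambert: absorbance = E @ c + baseline (scattering/shading).
A = E @ c_true + 0.15                      # constant shading offset

# Multiple regression with an intercept; the intercept soaks up the
# wavelength-independent part of the shading offset, so the chromophore
# coefficients are recovered unperturbed.
X = np.column_stack([E, np.ones(E.shape[0])])
coef, *_ = np.linalg.lstsq(X, A, rcond=None)
print(np.round(coef, 3))  # -> [0.8 0.3 0.2 0.15]
```

In the paper the regression coefficients are further mapped to concentrations through Monte Carlo-derived compensation formulae; this sketch only shows the multiple-regression decomposition they start from.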
NASA Astrophysics Data System (ADS)
Cao, Faxian; Yang, Zhijing; Ren, Jinchang; Ling, Wing-Kuen; Zhao, Huimin; Marshall, Stephen
2017-12-01
Although the sparse multinomial logistic regression (SMLR) has provided a useful tool for sparse classification, it suffers from inefficacy in dealing with high dimensional features and manually set initial regressor values. This has significantly constrained its applications for hyperspectral image (HSI) classification. In order to tackle these two drawbacks, an extreme sparse multinomial logistic regression (ESMLR) is proposed for effective classification of HSI. First, the HSI dataset is projected to a new feature space with randomly generated weight and bias. Second, an optimization model is established by the Lagrange multiplier method and the dual principle to automatically determine a good initial regressor for SMLR via minimizing the training error and the regressor value. Furthermore, the extended multi-attribute profiles (EMAPs) are utilized for extracting both the spectral and spatial features. A combinational linear multiple features learning (MFL) method is proposed to further enhance the features extracted by ESMLR and EMAPs. Finally, the logistic regression via the variable splitting and the augmented Lagrangian (LORSAL) is adopted in the proposed framework for reducing the computational time. Experiments are conducted on two well-known HSI datasets, namely the Indian Pines dataset and the Pavia University dataset, which have shown the fast and robust performance of the proposed ESMLR framework.
Gabriel, M.C.; Kolka, R.; Wickman, T.; Nater, E.; Woodruff, Laurel G.
2009-01-01
The primary objective of this research is to investigate relationships between mercury in upland soil, lake water and fish tissue and explore the cause for the observed spatial variation of THg in age one yellow perch (Perca flavescens) for ten lakes within the Superior National Forest. Spatial relationships between yellow perch THg tissue concentration and a total of 45 watershed and water chemistry parameters were evaluated for two separate years: 2005 and 2006. Results show agreement with other studies where watershed area, lake water pH, nutrient levels (specifically dissolved NO3−-N) and dissolved iron are important factors controlling and/or predicting fish THg level. Exceeding all was the strong dependence of yellow perch THg level on soil A-horizon THg and, in particular, soil O-horizon THg concentrations (Spearman ρ = 0.81). Soil B-horizon THg concentration was significantly correlated (Pearson r = 0.75) with lake water THg concentration. Lakes surrounded by a greater percentage of shrub wetlands (peatlands) had higher fish tissue THg levels, thus it is highly possible that these wetlands are main locations for mercury methylation. Stepwise regression was used to develop empirical models for the purpose of predicting the spatial variation in yellow perch THg over the studied region. The 2005 regression model demonstrates it is possible to obtain good prediction (up to 60% variance description) of resident yellow perch THg level using upland soil O-horizon THg as the only independent variable. The 2006 model shows even greater prediction (r2 = 0.73, with an overall 10 ng/g [tissue, wet weight] margin of error), using lake water dissolved iron and watershed area as the only model independent variables. The developed regression models in this study can help with interpreting THg concentrations in low trophic level fish species for untested lakes of the greater Superior National Forest and surrounding Boreal ecosystem.
The role of visual spatial attention in adult developmental dyslexia.
Collis, Nathan L; Kohnen, Saskia; Kinoshita, Sachiko
2013-01-01
The present study investigated the nature of visual spatial attention deficits in adults with developmental dyslexia, using a partial report task with five-letter, digit, and symbol strings. Participants responded by a manual key press to one of nine alternatives, which included other characters in the string, allowing an assessment of position errors as well as intrusion errors. The results showed that the dyslexic adults performed significantly worse than age-matched controls with letter and digit strings but not with symbol strings. Both groups produced W-shaped serial position functions with letter and digit strings. The dyslexics' deficits with letter string stimuli were limited to position errors, specifically at the string-interior positions 2 and 4. These errors correlated with letter transposition reading errors (e.g., reading slat as "salt"), but not with the Rapid Automatized Naming (RAN) task. Overall, these results suggest that the dyslexic adults have a visual spatial attention deficit; however, the deficit does not reflect a reduced span in visual-spatial attention, but a deficit in processing a string of letters in parallel, probably due to difficulty in the coding of letter position.
NASA Astrophysics Data System (ADS)
Owens, P. R.; Libohova, Z.; Seybold, C. A.; Wills, S. A.; Peaslee, S.; Beaudette, D.; Lindbo, D. L.
2017-12-01
The measurement errors and spatial prediction uncertainties of soil properties in the modeling community are usually assessed against measured values when available. However, of equal importance is the assessment of error and uncertainty impacts on cost-benefit analyses and risk assessments. Soil pH was selected as one of the most commonly measured soil properties used for liming recommendations. The objective of this study was to assess the error size from different sources and their implications with respect to management decisions. Error sources include measurement methods, laboratory sources, pedotransfer functions, database transactions, spatial aggregations, etc. Several databases of measured and predicted soil pH were used for this study, including the United States National Cooperative Soil Survey Characterization Database (NCSS-SCDB) and the US Soil Survey Geographic (SSURGO) Database. The distribution of errors among different sources, from measurement methods to spatial aggregation, showed a wide range of values. The greatest RMSE of 0.79 pH units was from spatial aggregation (SSURGO vs Kriging), while the measurement methods had the lowest RMSE of 0.06 pH units. Assuming the order of data acquisition based on the transaction distance, i.e. from measurement method to spatial aggregation, the RMSE increased from 0.06 to 0.8 pH units, suggesting an "error propagation". This has major implications for practitioners and the modeling community. Most soil liming rate recommendations are based on 0.1 pH unit increments, while the desired soil pH level increments are based on 0.4 to 0.5 pH units. Thus, even when the measured and desired target soil pH are the same, most guidelines recommend 1 ton ha-1 of lime, which translates into 111 ha-1 that the farmer has to factor into the cost-benefit analysis.
However, this analysis needs to be based on prediction uncertainties (0.5-1.0 pH units) rather than measurement errors (0.1 pH units), which would translate into a 555-1,111 investment that needs to be assessed against the risk. The modeling community can benefit from such analysis; however, error size and spatial distribution for global and regional predictions need to be assessed against the variability of other drivers and their impact on management decisions.
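The "error propagation" described above, with RMSE growing from 0.06 to roughly 0.8 pH units across acquisition steps, can be sketched under an independence assumption; the intermediate RMSE contributions below are illustrative placeholders, not values from the cited databases.

```python
import math

# Hypothetical RMSE contribution (pH units) of each acquisition step;
# only the first and last values echo the abstract, the rest are illustrative.
rmse_sources = {
    "measurement method": 0.06,
    "laboratory": 0.15,
    "pedotransfer function": 0.30,
    "spatial aggregation": 0.70,
}

# If the error sources are independent, their variances add, so the
# propagated RMSE is the root of the sum of squared contributions.
total_rmse = math.sqrt(sum(v ** 2 for v in rmse_sources.values()))
print(f"propagated RMSE: {total_rmse:.2f} pH units")
```

With contributions of this size the largest source dominates: the propagated total is within about 0.08 pH units of the spatial-aggregation term alone, consistent with the abstract's point that aggregation, not measurement, drives the overall uncertainty.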
Popa, Laurentiu S.; Hewitt, Angela L.; Ebner, Timothy J.
2012-01-01
The cerebellum has been implicated in processing motor errors required for online control of movement and motor learning. The dominant view is that Purkinje cell complex spike discharge signals motor errors. This study investigated whether errors are encoded in the simple spike discharge of Purkinje cells in monkeys trained to manually track a pseudo-randomly moving target. Four task error signals were evaluated based on cursor movement relative to target movement. Linear regression analyses based on firing residuals ensured that the modulation with a specific error parameter was independent of the other error parameters and kinematics. The results demonstrate that simple spike firing in lobules IV–VI is significantly correlated with position, distance and directional errors. Independent of the error signals, the same Purkinje cells encode kinematics. The strongest error modulation occurs at feedback timing. However, in 72% of cells at least one of the R2 temporal profiles resulting from regressing firing with individual errors exhibit two peak R2 values. For these bimodal profiles, the first peak is at a negative τ (lead) and a second peak at a positive τ (lag), implying that Purkinje cells encode both prediction and feedback about an error. For the majority of the bimodal profiles, the signs of the regression coefficients or preferred directions reverse at the times of the peaks. The sign reversal results in opposing simple spike modulation for the predictive and feedback components. Dual error representations may provide the signals needed to generate sensory prediction errors used to update a forward internal model. PMID:23115173
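The residual-based regression described above, which isolates one error parameter's contribution by partialling out the other parameters, can be sketched on synthetic data; all variable names and effect sizes here are illustrative assumptions, not values from the study.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Two correlated "error" parameters (purely illustrative names and sizes).
position_error = rng.normal(size=n)
direction_error = 0.6 * position_error + rng.normal(scale=0.8, size=n)

# Simulated firing rate depending on both parameters, plus noise.
firing = 2.0 * position_error + 1.0 * direction_error + rng.normal(size=n)

def residualize(y, x):
    """Residuals of y after least-squares regression on x (with intercept)."""
    A = np.column_stack([np.ones(len(y)), x])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return y - A @ beta

# Partial out direction_error from both firing and position_error, then
# regress residuals on residuals: the slope is position_error's
# independent contribution (the Frisch-Waugh-Lovell construction).
firing_resid = residualize(firing, direction_error)
pos_resid = residualize(position_error, direction_error)
slope = np.polyfit(pos_resid, firing_resid, 1)[0]
print(f"partial slope for position error: {slope:.2f}")  # close to the true 2.0
```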
Decomposition of Sources of Errors in Seasonal Streamflow Forecasting over the U.S. Sunbelt
NASA Technical Reports Server (NTRS)
Mazrooei, Amirhossein; Sinah, Tusshar; Sankarasubramanian, A.; Kumar, Sujay V.; Peters-Lidard, Christa D.
2015-01-01
Seasonal streamflow forecasts, contingent on climate information, can be utilized to ensure water supply for multiple uses including municipal demands, hydroelectric power generation, and planning of agricultural operations. However, uncertainties in the streamflow forecasts pose significant challenges to their utilization in real-time operations. In this study, we systematically decompose various sources of errors in developing seasonal streamflow forecasts from two Land Surface Models (LSMs) (Noah3.2 and CLM2), which are forced with downscaled and disaggregated climate forecasts. In particular, the study quantifies the relative contributions of the sources of errors from LSMs, climate forecasts, and downscaling/disaggregation techniques in developing seasonal streamflow forecasts. For this purpose, three-month-ahead seasonal precipitation forecasts from the ECHAM4.5 general circulation model (GCM) were statistically downscaled from 2.8deg to 1/8deg spatial resolution using principal component regression (PCR) and then temporally disaggregated from monthly to daily time step using a kernel-nearest neighbor (K-NN) approach. For the other climatic forcings, excluding precipitation, we considered the North American Land Data Assimilation System version 2 (NLDAS-2) hourly climatology over the years 1979 to 2010. The selected LSMs were then forced with precipitation forecasts and NLDAS-2 hourly climatology to develop retrospective seasonal streamflow forecasts over a period of 20 years (1991-2010). Finally, the performance of the LSMs in forecasting streamflow under different schemes was analyzed to quantify the relative contribution of various sources of errors. Our results indicate that the most dominant source of errors during the winter and fall seasons is the ECHAM4.5 precipitation forecasts, while the temporal disaggregation scheme contributes the maximum errors during the summer season.
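The principal component regression step used for downscaling can be sketched as follows; the latent-mode data below are purely illustrative stand-ins, not ECHAM4.5 fields.

```python
import numpy as np

rng = np.random.default_rng(5)

# Illustrative stand-in for coarse-grid predictor fields: p grid points
# driven by k latent large-scale modes, plus local noise.
n, p, k = 120, 20, 3
modes = rng.normal(size=(n, k))
loadings = rng.normal(size=(k, p))
X = modes @ loadings + 0.3 * rng.normal(size=(n, p))
y = modes @ np.array([1.0, -0.5, 0.8]) + 0.3 * rng.normal(size=n)  # local predictand

# PCA of the standardized predictors via SVD; keep the leading k components.
Xc = (X - X.mean(axis=0)) / X.std(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[:k].T

# Ordinary least squares on the retained principal component scores.
A = np.column_stack([np.ones(n), scores])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = A @ beta
r2 = 1 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"in-sample R^2 with {k} components: {r2:.2f}")
```

Regressing on a few leading components rather than all grid points avoids the collinearity of neighboring grid cells, which is the usual motivation for PCR in statistical downscaling.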
Amodeo, Dionisio A.; Grospe, Gena; Zang, Hui; Dwivedi, Yogesh; Ragozzino, Michael E.
2016-01-01
Central infusion of the Na+/K+-ATPase inhibitor ouabain in rats serves as an animal model of mania because it leads to hyperactivity and reproduces the ion dysregulation and reduced BDNF levels observed in bipolar disorder. Bipolar disorder is also associated with cognitive inflexibility and working memory deficits. It is unknown whether ouabain treatment in rats leads to similar cognitive flexibility and working memory deficits. The present study examined the effects of an intracerebral ventricular infusion of ouabain in rats on spontaneous alternation, probabilistic reversal learning and BDNF expression levels in the frontal cortex. Ouabain treatment significantly increased locomotor activity, but did not affect alternation performance in a Y-maze. Ouabain treatment selectively impaired reversal learning in a spatial discrimination task using an 80/20 probabilistic reinforcement procedure. The reversal learning deficit in ouabain-treated rats resulted from an impaired ability to maintain a new choice pattern (increased regressive errors). Ouabain treatment also decreased sensitivity to negative feedback during the initial phase of reversal learning. Expression of BDNF mRNA and protein was downregulated in the frontal cortex, and this downregulation correlated negatively with regressive errors. These findings suggest that the ouabain model of mania may be useful for understanding the neuropathophysiology that contributes to cognitive flexibility deficits and for testing potential treatments to alleviate cognitive deficits in bipolar disorder. PMID:27267245
Design Considerations of Polishing Lap for Computer-Controlled Cylindrical Polishing Process
NASA Technical Reports Server (NTRS)
Khan, Gufran S.; Gubarev, Mikhail; Arnold, William; Ramsey, Brian D.
2009-01-01
This paper establishes a relationship between the polishing process parameters and the generation of mid spatial-frequency error. Design considerations for the polishing lap and optimization of the process parameters (speeds, stroke, etc.) to keep the residual mid spatial-frequency error to a minimum are also presented.
Multicollinearity and Regression Analysis
NASA Astrophysics Data System (ADS)
Daoud, Jamal I.
2017-12-01
In regression analysis it is expected to have correlation between the response and the predictor(s), but correlation among the predictors is undesirable. The number of predictors included in the regression model depends on many factors, among them historical data, experience, etc. In the end, the selection of the most important predictors is a subjective choice made by the researcher. Multicollinearity is a phenomenon in which two or more predictors are correlated; when this happens, the standard errors of the coefficients increase [8]. Increased standard errors mean that the coefficients for some or all independent variables may not be found to be significantly different from zero. In other words, by overinflating the standard errors, multicollinearity makes some variables statistically insignificant when they should be significant. In this paper we focus on multicollinearity, its causes, and its consequences for the reliability of the regression model.
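The standard diagnostic for this situation is the variance inflation factor (VIF), computed from the R² of each predictor regressed on the remaining predictors. A minimal sketch on synthetic collinear data (the data and the informal cut-offs are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Two nearly collinear predictors plus one independent predictor.
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)   # strongly correlated with x1
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """Variance inflation factor of column j: 1 / (1 - R^2) from
    regressing column j on the remaining columns (with intercept)."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1.0 - resid.var() / y.var()
    return 1.0 / (1.0 - r2)

for j in range(X.shape[1]):
    print(f"VIF(x{j + 1}) = {vif(X, j):.1f}")
# x1 and x2 show very large VIFs; x3 stays near 1.
```

A VIF near 1 indicates no inflation; values above roughly 5-10 are commonly taken as a sign of problematic multicollinearity.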
SMOS salinity retrieval by using Support Vector Regression (SVR)
NASA Astrophysics Data System (ADS)
Katagis, Thomas; Fernández-Prieto, Diego; Marconcini, Mattia; Sabia, Roberto; Martinez, Justino
2013-04-01
The Soil Moisture and Ocean Salinity (SMOS) mission was launched in November 2009 within the framework of the European Space Agency (ESA) Living Planet programme. Over the oceans, it aims at providing Sea Surface Salinity (SSS) maps with spatial and temporal coverage adequate for large-scale oceanography. A comprehensive inversion scheme has been defined and implemented in the operational retrieval chain to allow proper SSS estimates in a single satellite overpass (L2 product) from the multi-angular brightness temperatures (TBs) measured by SMOS. The SMOS operational L2 salinity processor minimizes the difference between the measured and modeled TBs, including additional constraints on Sea Surface Temperature (SST) and wind speed auxiliary fields. In particular, by adopting a maximum-likelihood Bayesian approach, the inversion scheme retrieves salinity under an iterative convergence loop. However, although the implemented iterative technique is well established and robust, it is still prone to limitations; for instance, the presence of local minima in the cost function cannot be excluded. Moreover, previous studies have demonstrated that the background and observational terms of the cost function are not properly balanced, and this is likely to introduce errors in the retrieval procedure. In order to overcome such potential drawbacks, this study proposes a novel approach for SSS estimation based on the ɛ-insensitive Support Vector Regression (SVR), where both SMOS L1 measurements and auxiliary parameters are used as input. The SVR technique has already proved capable of high generalization and robustness in a variety of different applications, with limited complexity in handling the learning phase. Notably, instead of minimizing the observed training error, it attempts to minimize the generalization error bound so as to achieve generalized performance.
For this purpose, the original input domain is mapped into a higher-dimensionality space (where the function underlying the data is supposed to have increased flatness) and linear regression is performed there. The SVR training is performed using suitable in situ SSS data (i.e., ARGO buoy data) collected in a representative region of the ocean. So far, in situ data coming from a match-up ARGO database for November 2010 over the South Pacific constitute the preliminary benchmark of the study. Ongoing activities aim at extending this spatial and temporal frame to assess the robustness of the method. The in situ data have been collocated with SMOS TB measurements and additional parameters (e.g., SST and wind speed) in the learning phase of the SVR under various training/testing configurations. Afterwards, the SSS regression has been performed from the SMOS TBs or emissivities. Estimated SVR salinity fields are in general very well correlated with ARGO data. The analysis of the impact of the various features has been performed after a rigorous data filtering/flagging is applied, and misfit (SSS_SVR - SSS_ARGO) statistics have been computed. To assess the effectiveness of the proposed method, final results will be compared to those obtained using the official SMOS SSS retrieval algorithm.
Ding, Yi; Peng, Kai; Yu, Miao; Lu, Lei; Zhao, Kun
2017-08-01
The performance of the two selected spatial frequency phase unwrapping methods is limited by a phase error bound beyond which errors will occur in the fringe order, leading to a significant error in the recovered absolute phase map. In this paper, we propose a method to detect and correct the wrong fringe orders. Two constraints are introduced during the fringe order determination of the two selected spatial frequency phase unwrapping methods. A strategy to detect and correct the wrong fringe orders is also described. Compared with existing methods, we do not need to estimate a threshold associated with absolute phase values to determine the fringe order error, which makes the method more reliable and avoids the search procedure in detecting and correcting successive fringe order errors. The effectiveness of the proposed method is validated by experimental results.
Northrup, Joseph M.; Hooten, Mevin B.; Anderson, Charles R.; Wittemyer, George
2013-01-01
Habitat selection is a fundamental aspect of animal ecology, the understanding of which is critical to management and conservation. Global positioning system data from animals allow fine-scale assessments of habitat selection and typically are analyzed in a use-availability framework, whereby animal locations are contrasted with random locations (the availability sample). Although most use-availability methods are in fact spatial point process models, they often are fit using logistic regression. This framework offers numerous methodological challenges, for which the literature provides little guidance. Specifically, the size and spatial extent of the availability sample influences coefficient estimates potentially causing interpretational bias. We examined the influence of availability on statistical inference through simulations and analysis of serially correlated mule deer GPS data. Bias in estimates arose from incorrectly assessing and sampling the spatial extent of availability. Spatial autocorrelation in covariates, which is common for landscape characteristics, exacerbated the error in availability sampling leading to increased bias. These results have strong implications for habitat selection analyses using GPS data, which are increasingly prevalent in the literature. We recommend researchers assess the sensitivity of their results to their availability sample and, where bias is likely, take care with interpretations and use cross validation to assess robustness.
Ambient Ozone Exposure in Czech Forests: A GIS-Based Approach to Spatial Distribution Assessment
Hůnová, I.; Horálek, J.; Schreiberová, M.; Zapletal, M.
2012-01-01
Ambient ozone (O3) is an important phytotoxic pollutant, and detailed knowledge of its spatial distribution is becoming increasingly important. The aim of the paper is to compare different spatial interpolation techniques and to recommend the best approach for producing a reliable map for O3 with respect to its phytotoxic potential. For the evaluation we used real-time ambient O3 concentrations measured by UV absorbance at 24 Czech rural sites in the 2007 and 2008 vegetation seasons. We considered eleven approaches for the spatial interpolation used for the development of maps of mean vegetation season O3 concentrations and the AOT40F exposure index for forests. The uncertainty of the maps was assessed by cross-validation analysis, with the root mean square error (RMSE) of the map used as the criterion. Our results indicate that the optimal interpolation approach is linear regression of O3 data and altitude with subsequent interpolation of its residuals by ordinary kriging. The relative uncertainty of the map of the O3 mean for the vegetation season is less than 10% using the optimal method for both explored years, which is a very acceptable value. In the case of AOT40F, however, the relative uncertainty of the map is notably worse, reaching nearly 20% in both examined years. PMID:22566757
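The recommended two-step approach (linear regression of O3 on altitude, then spatial interpolation of the residuals) can be sketched as follows; inverse distance weighting stands in here for the ordinary kriging used in the paper, and all station data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stations: ozone increases with altitude plus a smooth spatial signal.
n = 24
xy = rng.uniform(0, 100, size=(n, 2))         # station coordinates (km)
alt = rng.uniform(200, 1200, size=n)          # altitude (m)
o3 = 30 + 0.02 * alt + 3 * np.sin(xy[:, 0] / 20) + rng.normal(scale=1, size=n)

# Step 1: linear regression of O3 on altitude.
A = np.column_stack([np.ones(n), alt])
beta, *_ = np.linalg.lstsq(A, o3, rcond=None)
resid = o3 - A @ beta

# Step 2: interpolate the regression residuals spatially. The paper uses
# ordinary kriging; inverse distance weighting is a simpler stand-in.
def idw(target, pts, vals, power=2.0):
    d = np.linalg.norm(pts - target, axis=1)
    if np.any(d < 1e-9):
        return float(vals[np.argmin(d)])
    w = 1.0 / d ** power
    return float(np.sum(w * vals) / np.sum(w))

# Predict at a new location with known altitude.
new_xy, new_alt = np.array([50.0, 50.0]), 800.0
prediction = beta[0] + beta[1] * new_alt + idw(new_xy, xy, resid)
print(f"predicted O3: {prediction:.1f}")
```

The residual interpolation step is what lets the map capture spatial structure the altitude regression alone misses.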
Hollands, Simon; Campbell, M Karen; Gilliland, Jason; Sarma, Sisira
2013-10-01
To investigate the association between fast-food restaurant density and adult body mass index (BMI) in Canada. Individual-level BMI and confounding variables were obtained from the 2007-2008 Canadian Community Health Survey master file. Locations of the fast-food and full-service chain restaurants and other non-chain restaurants were obtained from the 2008 Infogroup Canada business database. Food outlet density (fast-food, full-service and other) per 10,000 population was calculated for each Forward Sortation Area (FSA). Global (Moran's I) and local indicators of spatial autocorrelation of BMI were assessed. Ordinary least squares (OLS) and spatial auto-regressive error (SARE) methods were used to assess the association between local food environment and adult BMI in Canada. Global and local spatial autocorrelation of BMI were found in our univariate analysis. We found that OLS and SARE estimates were very similar in our multivariate models. An additional fast-food restaurant per 10,000 people at the FSA level is associated with a 0.022 kg/m(2) increase in BMI. On the other hand, other restaurant density is negatively related to BMI. Fast-food restaurant density is positively associated with BMI in Canada. Results suggest that restricting availability of fast-food in local neighborhoods may play a role in obesity prevention. © 2013.
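Global Moran's I, used above to detect spatial autocorrelation of BMI, can be computed directly from a spatial weights matrix. A minimal sketch on a synthetic gridded variable with rook-contiguity weights (all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)

# Illustrative BMI-like values on a 10x10 grid with a smooth spatial trend.
g = 10
xx, yy = np.meshgrid(np.arange(g), np.arange(g))
z = (25 + 0.3 * xx + rng.normal(scale=0.5, size=(g, g))).ravel()
coords = np.column_stack([xx.ravel(), yy.ravel()])

# Rook-contiguity spatial weights: 1 for orthogonal neighbours, else 0.
d = np.abs(coords[:, None, :] - coords[None, :, :]).sum(axis=2)
W = (d == 1).astype(float)

# Global Moran's I:
#   I = (n / S0) * sum_ij w_ij (z_i - mean)(z_j - mean) / sum_i (z_i - mean)^2
zc = z - z.mean()
n, S0 = len(z), W.sum()
moran_i = (n / S0) * (zc @ W @ zc) / (zc @ zc)
print(f"Moran's I = {moran_i:.2f}")  # positive: neighbouring values are similar
```

A value near zero indicates spatial randomness; a clearly positive value, as the spatial trend produces here, is the kind of autocorrelation that motivates spatial error models such as SARE over plain OLS.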
NASA Astrophysics Data System (ADS)
Zhao, Yaolong; Zhao, Junsan; Murayama, Yuji
2008-10-01
The period of high economic growth in Japan which began in the latter half of the 1950s led to a massive migration of population from rural regions to the Tokyo metropolitan area. This phenomenon brought about rapid urban growth and changes in urban structure in this area. The purpose of this study is to establish a constrained CA (Cellular Automata) model with GIS (Geographical Information Systems) to simulate the urban growth pattern in the Tokyo metropolitan area and to predict its urban form and landscape for the near future. Urban land-use is classified into multiple categories to interpret the effect of interaction among land-use categories in the spatial process of urban growth. Driving factors of the urban growth pattern, such as land condition, railway network, land-use zoning, random perturbation, and neighborhood interaction, are explored and integrated into the model. These driving factors are calibrated based on exploratory spatial data analysis (ESDA), spatial statistics, logistic regression, and a "trial and error" approach. The simulation is assessed at both macro and micro classification levels in three ways: a visual approach; fractal dimension; and spatial metrics. Results indicate that this model provides an effective prototype to simulate and predict the urban growth pattern of the Tokyo metropolitan area.
NASA Technical Reports Server (NTRS)
Schlegel, Todd T.; Cortez, Daniel
2010-01-01
Our primary objective was to ascertain which commonly used 12-to-Frank-lead transformation yields spatial QRS-T angle values closest to those obtained from simultaneously collected true Frank-lead recordings. Simultaneous 12-lead and Frank XYZ-lead recordings were analyzed for 100 post-myocardial infarction patients and 50 controls. Relative agreement, with true Frank-lead results, of 12-to-Frank-lead transformed results for the spatial QRS-T angle using Kors regression versus inverse Dower was assessed via ANOVA, Lin's concordance and Bland-Altman plots. Spatial QRS-T angles from the true Frank leads were not significantly different from those derived from the Kors regression-related transformation but were significantly smaller than those derived from the inverse Dower-related transformation (P less than 0.001). Independent of method, spatial mean QRS-T angles were also always significantly larger than spatial maximum (peaks) QRS-T angles. Spatial QRS-T angles are best approximated by regression-related transforms. Spatial mean and spatial peaks QRS-T angles should also not be used interchangeably.
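Whatever the lead transformation used, the spatial QRS-T angle itself is simply the angle between the 3-D QRS and T vectors. A minimal sketch with hypothetical Frank-lead vectors (the numbers are illustrative, not from the study):

```python
import numpy as np

# Illustrative spatial mean QRS and T vectors in Frank X, Y, Z coordinates
# (arbitrary units); real values would come from the vectorcardiogram.
qrs = np.array([1.2, 0.5, -0.3])
t = np.array([0.6, 0.4, 0.1])

# The spatial QRS-T angle is the angle between the two 3-D vectors.
cos_angle = np.dot(qrs, t) / (np.linalg.norm(qrs) * np.linalg.norm(t))
angle_deg = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
print(f"spatial QRS-T angle: {angle_deg:.1f} degrees")
```

The `np.clip` guards against floating-point round-off pushing the cosine slightly outside [-1, 1] for nearly parallel vectors.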
NASA Astrophysics Data System (ADS)
Reinhardt, Katja; Samimi, Cyrus
2018-01-01
While climatological data of high spatial resolution are largely available in most developed countries, the network of climatological stations in many other regions of the world still has large gaps. Especially for those regions, interpolation methods are important tools to fill these gaps and to improve the data base indispensable for climatological research. Over the last years, new hybrid methods of machine learning and geostatistics have been developed which provide innovative prospects in spatial predictive modelling. This study focuses on evaluating the performance of 12 different interpolation methods for the wind components u and v in a mountainous region of Central Asia, with a special focus on applying new hybrid methods to the spatial interpolation of wind data. This study is the first to evaluate and compare the performance of several of these hybrid methods. The overall aim is to determine whether an optimal interpolation method exists which can equally be applied to all pressure levels, or whether different interpolation methods have to be used for different pressure levels. Deterministic (inverse distance weighting) and geostatistical interpolation methods (ordinary kriging) were explored, which take into account only the initial values of u and v. In addition, more complex methods (generalized additive model, support vector machine and neural networks as single methods and as hybrid methods, as well as regression-kriging) that consider additional variables were applied. The analysis of the error indices revealed that regression-kriging provided the most accurate interpolation results for both wind components and all pressure heights.
At 200 and 500 hPa, regression-kriging is followed by the different kinds of neural networks and support vector machines, while for 850 hPa it is followed by the different types of support vector machines and ordinary kriging. Overall, explanatory variables improve the interpolation results.
NASA Astrophysics Data System (ADS)
Islamiyati, A.; Fatmawati; Chamidah, N.
2018-03-01
In longitudinal data with bi-responses, correlation arises in the measurements both between the subjects of observation and between the responses. This induces autocorrelation in the errors, which can be handled by using a covariance matrix. In this article, we estimate the covariance matrix based on the penalized spline regression model. The penalized spline involves knot points and smoothing parameters simultaneously in controlling the smoothness of the curve. Based on our simulation study, the estimated regression model of the weighted penalized spline with the covariance matrix gives a smaller error than the model without the covariance matrix.
ERIC Educational Resources Information Center
Cepeda-Cuervo, Edilberto; Núñez-Antón, Vicente
2013-01-01
In this article, a proposed Bayesian extension of the generalized beta spatial regression models is applied to the analysis of the quality of education in Colombia. We briefly review the beta distribution and describe the joint modeling approach for the mean and dispersion parameters in the spatial regression models' setting. Finally, we motivate…
ERIC Educational Resources Information Center
Huitema, Bradley E.; McKean, Joseph W.
2007-01-01
Regression models used in the analysis of interrupted time-series designs assume statistically independent errors. Four methods of evaluating this assumption are the Durbin-Watson (D-W), Huitema-McKean (H-M), Box-Pierce (B-P), and Ljung-Box (L-B) tests. These tests were compared with respect to Type I error and power under a wide variety of error…
Choi, Seung Hoan; Labadorf, Adam T; Myers, Richard H; Lunetta, Kathryn L; Dupuis, Josée; DeStefano, Anita L
2017-02-06
Next generation sequencing provides a count of RNA molecules in the form of short reads, yielding discrete, often highly non-normally distributed gene expression measurements. Although Negative Binomial (NB) regression has been generally accepted in the analysis of RNA sequencing (RNA-Seq) data, its appropriateness has not been exhaustively evaluated. We explore logistic regression as an alternative method for RNA-Seq studies designed to compare cases and controls, where disease status is modeled as a function of RNA-Seq reads using simulated and Huntington disease data. We evaluate the effect of adjusting for covariates that have an unknown relationship with gene expression. Finally, we incorporate the data adaptive method in order to compare false positive rates. When the sample size is small or the expression levels of a gene are highly dispersed, the NB regression shows inflated Type-I error rates but the Classical logistic and Bayes logistic (BL) regressions are conservative. Firth's logistic (FL) regression performs well or is slightly conservative. Large sample size and low dispersion generally make Type-I error rates of all methods close to nominal alpha levels of 0.05 and 0.01. However, Type-I error rates are controlled after applying the data adaptive method. The NB, BL, and FL regressions gain increased power with large sample size, large log2 fold-change, and low dispersion. The FL regression has comparable power to NB regression. We conclude that implementing the data adaptive method appropriately controls Type-I error rates in RNA-Seq analysis. Firth's logistic regression provides a concise statistical inference process and reduces spurious associations from inaccurately estimated dispersion parameters in the negative binomial framework.
Functional Additive Mixed Models
Scheipl, Fabian; Staicu, Ana-Maria; Greven, Sonja
2014-01-01
We propose an extensive framework for additive regression models for correlated functional responses, allowing for multiple partially nested or crossed functional random effects with flexible correlation structures for, e.g., spatial, temporal, or longitudinal functional data. Additionally, our framework includes linear and nonlinear effects of functional and scalar covariates that may vary smoothly over the index of the functional response. It accommodates densely or sparsely observed functional responses and predictors which may be observed with additional error and includes both spline-based and functional principal component-based terms. Estimation and inference in this framework is based on standard additive mixed models, allowing us to take advantage of established methods and robust, flexible algorithms. We provide easy-to-use open source software in the pffr() function for the R-package refund. Simulations show that the proposed method recovers relevant effects reliably, handles small sample sizes well and also scales to larger data sets. Applications with spatially and longitudinally observed functional data demonstrate the flexibility in modeling and interpretability of results of our approach. PMID:26347592
Yang, Limin; Huang, Chengquan; Homer, Collin G.; Wylie, Bruce K.; Coan, Michael
2003-01-01
A wide range of urban ecosystem studies, including urban hydrology, urban climate, land use planning, and resource management, require current and accurate geospatial data of urban impervious surfaces. We developed an approach to quantify urban impervious surfaces as a continuous variable by using multisensor and multisource datasets. Subpixel percent impervious surfaces at 30-m resolution were mapped using a regression tree model. The utility, practicality, and affordability of the proposed method for large-area imperviousness mapping were tested over three spatial scales (Sioux Falls, South Dakota, Richmond, Virginia, and the Chesapeake Bay areas of the United States). Average error of predicted versus actual percent impervious surface ranged from 8.8 to 11.4%, with correlation coefficients from 0.82 to 0.91. The approach is being implemented to map impervious surfaces for the entire United States as one of the major components of the circa 2000 national land cover database.
On the use of log-transformation vs. nonlinear regression for analyzing biological power laws
Xiao, X.; White, E.P.; Hooten, M.B.; Durham, S.L.
2011-01-01
Power-law relationships are among the most well-studied functional relationships in biology. Recently the common practice of fitting power laws using linear regression (LR) on log-transformed data has been criticized, calling into question the conclusions of hundreds of studies. It has been suggested that nonlinear regression (NLR) is preferable, but no rigorous comparison of these two methods has been conducted. Using Monte Carlo simulations, we demonstrate that the error distribution determines which method performs better, with NLR better characterizing data with additive, homoscedastic, normal error and LR better characterizing data with multiplicative, heteroscedastic, lognormal error. Analysis of 471 biological power laws shows that both forms of error occur in nature. While previous analyses based on log-transformation appear to be generally valid, future analyses should choose methods based on a combination of biological plausibility and analysis of the error distribution. We provide detailed guidelines and associated computer code for doing so, including a model averaging approach for cases where the error structure is uncertain. © 2011 by the Ecological Society of America.
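The LR-vs-NLR contrast can be reproduced in a few lines. The sketch below generates a power law y = a·x^b with multiplicative lognormal error (the case the abstract says favors LR) and fits it both ways; all parameter values are invented for illustration.

```python
# Sketch: fitting y = a * x^b by log-transform linear regression (LR)
# versus nonlinear least squares (NLR) under multiplicative lognormal error.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(1)
a_true, b_true = 2.0, 0.75
x = rng.uniform(1, 100, 500)
y = a_true * x**b_true * rng.lognormal(0.0, 0.3, x.size)   # multiplicative error

# LR: ordinary least squares on log-transformed data
slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
a_lr, b_lr = np.exp(intercept), slope

# NLR: nonlinear least squares on the original scale
(a_nl, b_nl), _ = curve_fit(lambda x, a, b: a * x**b, x, y, p0=[1.0, 1.0])

print(round(b_lr, 3), round(b_nl, 3))
```

With this error structure the log-scale fit matches the model assumptions, while NLR implicitly assumes additive homoscedastic error; repeating the draw many times is what the Monte Carlo comparison in the paper does.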
Optimal configurations of spatial scale for grid cell firing under noise and uncertainty
Towse, Benjamin W.; Barry, Caswell; Bush, Daniel; Burgess, Neil
2014-01-01
We examined the accuracy with which the location of an agent moving within an environment could be decoded from the simulated firing of systems of grid cells. Grid cells were modelled with Poisson spiking dynamics and organized into multiple ‘modules’ of cells, with firing patterns of similar spatial scale within modules and a wide range of spatial scales across modules. The number of grid cells per module, the spatial scaling factor between modules and the size of the environment were varied. Errors in decoded location can take two forms: small errors of precision and larger errors resulting from ambiguity in decoding periodic firing patterns. With enough cells per module (e.g. eight modules of 100 cells each) grid systems are highly robust to ambiguity errors, even over ranges much larger than the largest grid scale (e.g. over a 500 m range when the maximum grid scale is 264 cm). Results did not depend strongly on the precise organization of scales across modules (geometric, co-prime or random). However, independent spatial noise across modules, which would occur if modules receive independent spatial inputs and might increase with spatial uncertainty, dramatically degrades the performance of the grid system. This effect of spatial uncertainty can be mitigated by uniform expansion of grid scales. Thus, in the realistic regimes simulated here, the optimal overall scale for a grid system represents a trade-off between minimizing spatial uncertainty (requiring large scales) and maximizing precision (requiring small scales). Within this view, the temporary expansion of grid scales observed in novel environments may be an optimal response to increased spatial uncertainty induced by the unfamiliarity of the available spatial cues. PMID:24366144
Development of Super-Ensemble techniques for ocean analyses: the Mediterranean Sea case
NASA Astrophysics Data System (ADS)
Pistoia, Jenny; Pinardi, Nadia; Oddo, Paolo; Collins, Matthew; Korres, Gerasimos; Drillet, Yann
2017-04-01
Short-term ocean analyses for Sea Surface Temperature (SST) in the Mediterranean Sea can be improved by a statistical post-processing technique called super-ensemble. This technique consists of a multi-linear regression algorithm applied to a Multi-Physics Multi-Model Super-Ensemble (MMSE) dataset, a collection of different operational forecasting analyses together with ad-hoc simulations produced by modifying selected numerical model parameterizations. A new linear regression algorithm based on Empirical Orthogonal Function filtering techniques is capable of preventing overfitting problems, although the best performance is achieved when correlation is added to the super-ensemble structure using a simple spatial filter applied after the linear regression. Our outcomes show that super-ensemble performance depends on the selection of an unbiased operator and the length of the learning period, but the quality of the generating MMSE dataset has the largest impact on the MMSE analysis Root Mean Square Error (RMSE) evaluated with respect to observed satellite SST. The lowest RMSE analysis estimates result from the following choices: a 15-day training period, an overconfident MMSE dataset (a subset with the higher-quality ensemble members), and the least squares algorithm filtered a posteriori.
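A toy version of the EOF-filtered regression can illustrate the idea. Here the "observations" and ensemble members are synthetic scalar anomaly series, the EOF step is an SVD truncation of the member matrix, and all sizes and noise levels are assumptions, not the Mediterranean setup.

```python
# Sketch: super-ensemble = multi-linear regression of observations on
# EOF-filtered (PCA-truncated) ensemble members, on synthetic data.
import numpy as np

rng = np.random.default_rng(2)
n_time, n_members = 60, 8
truth = np.sin(np.linspace(0, 6, n_time))                     # "observed" SST anomaly
members = truth[:, None] + rng.normal(0, 0.3, (n_time, n_members))

# EOF filtering: keep the leading principal components of the member anomalies
anom = members - members.mean(axis=0)
U, s, Vt = np.linalg.svd(anom, full_matrices=False)
k = 3                                                         # retained EOF modes
Z = U[:, :k] * s[:k]                                          # PC time series

# multi-linear regression of the observations on the retained modes
G = np.column_stack([np.ones(n_time), Z])
coef, *_ = np.linalg.lstsq(G, truth, rcond=None)

rmse_se = np.sqrt(np.mean((G @ coef - truth) ** 2))           # super-ensemble fit
rmse_em = np.sqrt(np.mean((members.mean(axis=1) - truth) ** 2))  # plain ensemble mean
print(round(rmse_se, 3), round(rmse_em, 3))
```

Truncating to k modes is what limits overfitting: the regression sees only a few well-determined predictors instead of all raw members.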
Duncan, Dustin T.; Kawachi, Ichiro; Kum, Susan; Aldstadt, Jared; Piras, Gianfranco; Matthews, Stephen A.; Arbia, Giuseppe; Castro, Marcia C.; White, Kellee; Williams, David R.
2017-01-01
The racial/ethnic and income composition of neighborhoods often influences local amenities, including the potential spatial distribution of trees, which are important for population health and community wellbeing, particularly in urban areas. This ecological study used spatial analytical methods to assess the relationship between neighborhood socio-demographic characteristics (i.e. minority racial/ethnic composition and poverty) and tree density at the census tract level in Boston, Massachusetts (US). We examined spatial autocorrelation with the Global Moran’s I for all study variables and in the ordinary least squares (OLS) regression residuals, and computed Spearman correlations, non-adjusted and adjusted for spatial autocorrelation, between socio-demographic characteristics and tree density. Next, we fit traditional regressions (i.e. OLS regression models) and spatial regressions (i.e. spatial simultaneous autoregressive models), as appropriate. We found significant positive spatial autocorrelation for all neighborhood socio-demographic characteristics (Global Moran’s I range from 0.24 to 0.86, all P=0.001), for tree density (Global Moran’s I=0.452, P=0.001), and in the OLS regression residuals (Global Moran’s I range from 0.32 to 0.38, all P<0.001). Therefore, we fit the spatial simultaneous autoregressive models. There was a negative correlation between neighborhood percent non-Hispanic Black and tree density (rS=−0.19; conventional P-value=0.016; spatially adjusted P-value=0.299) as well as a negative correlation between predominantly non-Hispanic Black (over 60% Black) neighborhoods and tree density (rS=−0.18; conventional P-value=0.019; spatially adjusted P-value=0.180). 
While the conventional OLS regression model found a marginally significant inverse relationship between Black neighborhoods and tree density, we found no statistically significant relationship between neighborhood socio-demographic composition and tree density in the spatial regression models. Methodologically, our study suggests the need to take into account spatial autocorrelation as findings/conclusions can change when the spatial autocorrelation is ignored. Substantively, our findings suggest no need for policy intervention vis-à-vis trees in Boston, though we hasten to add that replication studies, and more nuanced data on tree quality, age and diversity are needed. PMID:29354668
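The Global Moran's I statistic that drives this analysis is straightforward to compute. The sketch below uses a synthetic 1D "tract" layout with row-standardized nearest-neighbour weights; the Boston data themselves are not reproduced.

```python
# Sketch: Global Moran's I for a spatially autocorrelated vs a random variable.
import numpy as np

def morans_i(x, W):
    # I = (n / sum(W)) * (z' W z) / (z' z), with z the mean-centered variable
    z = x - x.mean()
    return (len(x) / W.sum()) * (z @ W @ z) / (z @ z)

rng = np.random.default_rng(3)
n = 100
W = np.zeros((n, n))                       # adjacency: neighbours on a line
for i in range(n):
    for j in (i - 1, i + 1):
        if 0 <= j < n:
            W[i, j] = 1.0
W /= W.sum(axis=1, keepdims=True)          # row standardization

smooth = np.cumsum(rng.normal(size=n))     # spatially autocorrelated (random walk)
noise = rng.normal(size=n)                 # spatially random
i_smooth, i_noise = morans_i(smooth, W), morans_i(noise, W)
print(round(i_smooth, 2), round(i_noise, 2))
```

A value near +1 (the random-walk case) is the situation that invalidates conventional OLS P-values and motivates the spatial autoregressive models used in the study.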
Climbing fibers predict movement kinematics and performance errors.
Streng, Martha L; Popa, Laurentiu S; Ebner, Timothy J
2017-09-01
Requisite for understanding cerebellar function is a complete characterization of the signals provided by complex spike (CS) discharge of Purkinje cells, the output neurons of the cerebellar cortex. Numerous studies have provided insights into CS function, with the most predominant view being that they are evoked by error events. However, several reports suggest that CSs encode other aspects of movements and do not always respond to errors or unexpected perturbations. Here, we evaluated CS firing during a pseudo-random manual tracking task in the monkey (Macaca mulatta). This task provides extensive coverage of the work space and relative independence of movement parameters, delivering a robust data set to assess the signals that activate climbing fibers. Using reverse correlation, we determined feedforward and feedback CS firing probability maps with position, velocity, and acceleration, as well as position error, a measure of tracking performance. The direction and magnitude of the CS modulation were quantified using linear regression analysis. The major findings are that CSs significantly encode all three kinematic parameters and position error, with acceleration modulation particularly common. The modulation is not related to "events," either for position error or kinematics. Instead, CSs are spatially tuned and provide a linear representation of each parameter evaluated. The CS modulation is largely predictive. Similar analyses show that the simple spike firing is modulated by the same parameters as the CSs. Therefore, CSs carry a broader array of signals than previously described and argue for climbing fiber input having a prominent role in online motor control. NEW & NOTEWORTHY This article demonstrates that complex spike (CS) discharge of cerebellar Purkinje cells encodes multiple parameters of movement, including motor errors and kinematics. 
The CS firing is not driven by error or kinematic events; instead it provides a linear representation of each parameter. In contrast with the view that CSs carry feedback signals, the CSs are predominantly predictive of upcoming position errors and kinematics. Therefore, climbing fibers carry multiple and predictive signals for online motor control. Copyright © 2017 the American Physiological Society.
Discrete distributed strain sensing of intelligent structures
NASA Technical Reports Server (NTRS)
Anderson, Mark S.; Crawley, Edward F.
1992-01-01
Techniques are developed for the design of discrete highly distributed sensor systems for use in intelligent structures. First the functional requirements for such a system are presented. Discrete spatially averaging strain sensors are then identified as satisfying the functional requirements. A variety of spatial weightings for spatially averaging sensors are examined, and their wave number characteristics are determined. Preferable spatial weightings are identified. Several numerical integration rules used to integrate such sensors in order to determine the global deflection of the structure are discussed. A numerical simulation is conducted using point and rectangular sensors mounted on a cantilevered beam under static loading. Gage factor and sensor position uncertainties are incorporated to assess the absolute error and standard deviation of the error in the estimated tip displacement found by numerically integrating the sensor outputs. An experiment is carried out using a statically loaded cantilevered beam with five point sensors. It is found that in most cases the actual experimental error is within one standard deviation of the absolute error as found in the numerical simulation.
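The double numerical integration at the heart of this approach can be shown on a tiny example: surface strain on a cantilever is proportional to curvature, so integrating the sensor outputs twice with a quadrature rule recovers tip deflection. The beam properties, load, and five-sensor layout below are illustrative assumptions, not the paper's experimental values.

```python
# Sketch: cantilever tip displacement from point strain sensors by
# trapezoid-rule double integration (curvature -> slope -> deflection).
import numpy as np

L = 1.0                                # beam length (m)
x = np.linspace(0.0, L, 5)             # five point sensors along the beam
P_EI = 0.6                             # tip load over flexural rigidity, P/(E*I)
kappa = P_EI * (L - x)                 # curvature for a tip point load

# integrate twice from the clamped root (w = 0, w' = 0 at x = 0)
slope = np.concatenate([[0.0], np.cumsum(0.5 * (kappa[1:] + kappa[:-1]) * np.diff(x))])
w = np.concatenate([[0.0], np.cumsum(0.5 * (slope[1:] + slope[:-1]) * np.diff(x))])

exact_tip = P_EI * L**3 / 3.0          # closed form: P*L^3 / (3*E*I)
print(round(w[-1], 4), round(exact_tip, 4))
```

Even with only five samples the quadrature error stays well under 2% here; perturbing `kappa` with gage-factor and position noise, as in the paper's Monte Carlo study, would put error bars on `w[-1]`.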
Cohen, Michael X
2015-09-01
The purpose of this paper is to compare the effects of different spatial transformations applied to the same scalp-recorded EEG data. The spatial transformations applied are two referencing schemes (average and linked earlobes), the surface Laplacian, and beamforming (a distributed source localization procedure). EEG data were collected during a speeded reaction time task that provided a comparison of activity between error vs. correct responses. Analyses focused on time-frequency power, frequency band-specific inter-electrode connectivity, and within-subject cross-trial correlations between EEG activity and reaction time. Time-frequency power analyses showed similar patterns of midfrontal delta-theta power for errors compared to correct responses across all spatial transformations. Beamforming additionally revealed error-related anterior and lateral prefrontal beta-band activity. Within-subject brain-behavior correlations showed similar patterns of results across the spatial transformations, with the correlations being the weakest after beamforming. The most striking difference among the spatial transformations was seen in connectivity analyses: linked earlobe reference produced weak inter-site connectivity that was attributable to volume conduction (zero phase lag), while the average reference and Laplacian produced more interpretable connectivity results. Beamforming did not reveal any significant condition modulations of connectivity. Overall, these analyses show that some findings are robust to spatial transformations, while other findings, particularly those involving cross-trial analyses or connectivity, are more sensitive and may depend on the use of appropriate spatial transformations. Copyright © 2014 Elsevier B.V. All rights reserved.
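The referencing effect on connectivity described above can be demonstrated with synthetic channels: a signal common to all electrodes (e.g., a reference artifact or volume-conducted source) inflates zero-lag inter-site correlation, and the average reference removes most of it. The channel counts and noise model are invented.

```python
# Sketch: average-reference transform and its effect on zero-lag
# inter-electrode correlation (a volume-conduction-like confound).
import numpy as np

rng = np.random.default_rng(4)
n_chan, n_samp = 32, 1000
# each channel = independent activity + one signal shared by all channels
eeg = rng.normal(size=(n_chan, n_samp)) + rng.normal(size=(1, n_samp))

avg_ref = eeg - eeg.mean(axis=0, keepdims=True)   # re-reference to common average

r_raw = np.corrcoef(eeg[0], eeg[1])[0, 1]         # inflated by the shared signal
r_avg = np.corrcoef(avg_ref[0], avg_ref[1])[0, 1] # shared component removed
print(round(r_raw, 2), round(r_avg, 2))
```

This is the mechanism behind the paper's finding that linked-earlobe referencing yielded weak, zero-phase-lag connectivity while the average reference and Laplacian gave more interpretable results.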
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wei, J; Chao, M
2016-06-15
Purpose: To develop a novel strategy to extract the respiratory motion of the thoracic diaphragm from kilovoltage cone beam computed tomography (CBCT) projections by a constrained linear regression optimization technique. Methods: A parabolic function was identified as the geometric model and was employed to fit the shape of the diaphragm on the CBCT projections. The search was initialized by five manually placed seeds on a pre-selected projection image. Temporal redundancies, the enabling phenomenology in video compression and encoding techniques, are inherent in the dynamic properties of diaphragm motion. These redundancies, together with the geometrical shape of the diaphragm boundary and the associated algebraic constraint, significantly reduced the search space of viable parabolic parameters, which could then be effectively optimized by a constrained linear regression approach on the subsequent projections. The algebraic constraints stipulating the kinetic range of the motion, and the spatial constraint preventing unphysical deviations, allowed the optimal contour of the diaphragm to be obtained with minimal initialization. The algorithm was assessed on a fluoroscopic movie acquired at a fixed anterior-posterior direction and on kilovoltage CBCT projection image sets from four lung and two liver patients. Automatic tracing by the proposed algorithm and manual tracking by a human operator were compared in both the space and frequency domains. Results: The error between the estimated and manual detections for the fluoroscopic movie was 0.54 mm with a standard deviation (SD) of 0.45 mm, while the average error for the CBCT projections was 0.79 mm with an SD of 0.64 mm across all enrolled patients. This submillimeter accuracy demonstrates the promise of the proposed constrained linear regression approach for tracking diaphragm motion on rotational projection images. 
Conclusion: The new algorithm provides a potential solution for rendering diaphragm motion and ultimately improving tumor motion management for radiation therapy of cancer patients.
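The constrained fit can be sketched as a box-constrained linear least squares problem: a parabola y = a·x² + b·x + c fitted to edge points, with bounds standing in for the paper's algebraic and spatial constraints. The coefficients, bounds, and synthetic edge points below are all invented for illustration.

```python
# Sketch: constrained linear regression fit of a parabolic boundary,
# with box constraints keeping the solution physically plausible.
import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(5)
a_t, b_t, c_t = -0.02, 1.0, 5.0
x = np.linspace(0, 50, 40)                       # detector column coordinates
y = a_t * x**2 + b_t * x + c_t + rng.normal(0, 0.5, x.size)   # noisy edge points

A = np.column_stack([x**2, x, np.ones_like(x)])  # linear in the coefficients
# constrain the parabola to stay concave and near a previous frame's solution
res = lsq_linear(A, y, bounds=([-0.05, 0.5, 0.0], [0.0, 1.5, 10.0]))
a, b, c = res.x
print(round(a, 3), round(b, 3), round(c, 2))
```

Because the model is linear in (a, b, c), each projection is solved cheaply, which is what makes frame-by-frame tracking over a whole CBCT scan practical.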
NASA Astrophysics Data System (ADS)
Mitra, Ashis; Majumdar, Prabal Kumar; Bannerjee, Debamalya
2013-03-01
This paper presents a comparative analysis of two modeling methodologies for the prediction of air permeability of plain woven handloom cotton fabrics. Four basic fabric constructional parameters namely ends per inch, picks per inch, warp count and weft count have been used as inputs for artificial neural network (ANN) and regression models. Out of the four regression models tried, interaction model showed very good prediction performance with a meager mean absolute error of 2.017 %. However, ANN models demonstrated superiority over the regression models both in terms of correlation coefficient and mean absolute error. The ANN model with 10 nodes in the single hidden layer showed very good correlation coefficient of 0.982 and 0.929 and mean absolute error of only 0.923 and 2.043 % for training and testing data respectively.
Application and evaluation of ISVR method in QuickBird image fusion
NASA Astrophysics Data System (ADS)
Cheng, Bo; Song, Xiaolu
2014-05-01
QuickBird satellite images are widely used in many fields, and applications have put forward high requirements for the integration of the spatial and spectral information of the imagery. A fusion method for high-resolution remote sensing images based on ISVR is identified in this study. The core principle of ISVR is to take advantage of radiometric calibration to remove the effects of differing gains and errors among the satellites' sensors. After transformation from DN values to radiance, the multispectral image's energy is used to simulate the panchromatic band. Linear regression analysis is carried out in the simulation process to find a new synthetic panchromatic image that is highly linearly correlated with the original panchromatic image. In order to evaluate, test and compare the algorithm results, this paper applied ISVR and two other fusion methods in a comparative study of spatial and spectral information, taking the average gradient and the correlation coefficient as indicators. Experiments showed that this method could significantly improve the quality of the fused image, especially in preserving spectral information, maximizing the spectral information of the original multispectral images while maintaining abundant spatial information.
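The regression step of such a fusion scheme looks like the sketch below: fit the panchromatic band as a linear combination of multispectral radiance bands. The band weights and noise level are invented, not QuickBird's actual spectral response.

```python
# Sketch: simulate a panchromatic band from multispectral radiance bands
# by ordinary least squares, then check the linear correlation.
import numpy as np

rng = np.random.default_rng(6)
n_pix = 5000
ms = rng.uniform(0, 1, (n_pix, 4))                      # B, G, R, NIR radiance
# invented "true" pan response as a weighted band sum plus sensor noise
pan = ms @ np.array([0.15, 0.25, 0.35, 0.25]) + rng.normal(0, 0.01, n_pix)

A = np.column_stack([ms, np.ones(n_pix)])               # bands + offset term
w, *_ = np.linalg.lstsq(A, pan, rcond=None)
pan_sim = A @ w                                         # synthetic pan image

r = np.corrcoef(pan_sim, pan)[0, 1]
print(round(r, 3))
```

A high correlation between the synthetic and original pan bands is the precondition for injecting spatial detail without distorting the multispectral radiometry.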
Husak, G.J.; Marshall, M. T.; Michaelsen, J.; Pedreros, Diego; Funk, Christopher C.; Galu, G.
2008-01-01
Reliable estimates of cropped area (CA) in developing countries with chronic food shortages are essential for emergency relief and the design of appropriate market-based food security programs. Satellite interpretation of CA is an effective alternative to extensive and costly field surveys, which fail to represent the spatial heterogeneity at the country-level. Bias-corrected, texture based classifications show little deviation from actual crop inventories, when estimates derived from aerial photographs or field measurements are used to remove systematic errors in medium resolution estimates. In this paper, we demonstrate a hybrid high-medium resolution technique for Central Ethiopia that combines spatially limited unbiased estimates from IKONOS images, with spatially extensive Landsat ETM+ interpretations, land-cover, and SRTM-based topography. Logistic regression is used to derive the probability of a location being crop. These individual points are then aggregated to produce regional estimates of CA. District-level analysis of Landsat based estimates showed CA totals which supported the estimates of the Bureau of Agriculture and Rural Development. Continued work will evaluate the technique in other parts of Africa, while segmentation algorithms will be evaluated, in order to automate classification of medium resolution imagery for routine CA estimation in the future.
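The logistic-regression step can be sketched with synthetic covariates (the slope and texture variables here are invented stand-ins for the Landsat, land-cover and SRTM predictors), fitted by iteratively reweighted least squares and aggregated to a regional cropped-area fraction.

```python
# Sketch: per-pixel crop probability via logistic regression (Newton/IRLS),
# then aggregation of probabilities to a regional cropped-area estimate.
import numpy as np

rng = np.random.default_rng(7)
n = 3000
slope = rng.uniform(0, 30, n)                 # terrain slope (deg), invented
texture = rng.uniform(0, 1, n)                # image texture metric, invented
logit_true = 1.5 - 0.15 * slope + 2.0 * texture
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit_true))).astype(float)

X = np.column_stack([np.ones(n), slope, texture])
beta = np.zeros(3)
for _ in range(25):                           # Newton-Raphson on the log-likelihood
    p = 1 / (1 + np.exp(-X @ beta))
    W = p * (1 - p)
    beta += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (y - p))

p_hat = 1 / (1 + np.exp(-X @ beta))
print(round(p_hat.mean(), 3), round(y.mean(), 3))   # regional CA fraction vs labels
```

At the maximum-likelihood solution with an intercept, the mean predicted probability equals the observed crop fraction, which is why summing the per-pixel probabilities gives an unbiased regional estimate.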
Gutierrez-Corea, Federico-Vladimir; Manso-Callejo, Miguel-Angel; Moreno-Regidor, María-Pilar; Velasco-Gómez, Jesús
2014-01-01
This study was motivated by the need to improve densification of Global Horizontal Irradiance (GHI) observations, increasing the number of surface weather stations that observe it, using sensors with a sub-hour periodicity and examining the methods of spatial GHI estimation (by interpolation) with that periodicity in other locations. The aim of the present research project is to analyze the goodness of 15-minute GHI spatial estimations for five methods in the territory of Spain (three geo-statistical interpolation methods, one deterministic method and the HelioSat2 method, which is based on satellite images). The research concludes that, when the work area has adequate station density, the best method for estimating GHI every 15 min is Regression Kriging interpolation using GHI estimated from satellite images as one of the input variables. On the contrary, when station density is low, the best method is estimating GHI directly from satellite images. A comparison between the GHI observed by volunteer stations and the estimation model applied concludes that 67% of the volunteer stations analyzed present values within the margin of error (average of ±2 standard deviations). PMID:24732102
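Regression kriging, the best-performing method above, is a two-step procedure: regress station GHI on the satellite-derived covariate, then spatially interpolate the residuals. The sketch below uses synthetic stations and substitutes inverse-distance weighting for the kriging of residuals, so it illustrates the structure rather than the exact geostatistical step.

```python
# Sketch: two-step "regression kriging"-style GHI estimation, with IDW
# standing in for the residual kriging step. All station data are synthetic.
import numpy as np

rng = np.random.default_rng(8)
n_sta = 40
xy = rng.uniform(0, 100, (n_sta, 2))                      # station coordinates (km)
sat = rng.uniform(200, 800, n_sta)                        # satellite GHI estimate (W/m2)
ghi = 0.9 * sat + 30 + rng.normal(0, 20, n_sta)           # station observations

# step 1: linear regression on the satellite covariate
A = np.column_stack([sat, np.ones(n_sta)])
coef, *_ = np.linalg.lstsq(A, ghi, rcond=None)
resid = ghi - A @ coef

# step 2: interpolate the residuals to the target location (IDW here)
target, sat_target = np.array([50.0, 50.0]), 500.0
d = np.linalg.norm(xy - target, axis=1)
w = 1.0 / (d**2 + 1e-9)
est = coef[0] * sat_target + coef[1] + np.sum(w * resid) / np.sum(w)
print(round(est, 1))
```

The satellite image supplies the large-scale trend, and the residual interpolation corrects it locally where stations disagree with it, which is why the method degrades gracefully to the pure satellite estimate when station density is low.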
GLASS daytime all-wave net radiation product: Algorithm development and preliminary validation
Jiang, Bo; Liang, Shunlin; Ma, Han; ...
2016-03-09
Mapping surface all-wave net radiation (R n) is critically needed for various applications. Several existing R n products from numerical models and satellite observations have coarse spatial resolutions and their accuracies may not meet the requirements of land applications. In this study, we develop the Global LAnd Surface Satellite (GLASS) daytime R n product at a 5 km spatial resolution. Its algorithm for converting shortwave radiation to all-wave net radiation using the Multivariate Adaptive Regression Splines (MARS) model is determined after comparison with three other algorithms. The validation of the GLASS R n product based on high-quality in situ measurements in the United States shows a coefficient of determination value of 0.879, an average root mean square error value of 31.61 Wm -2, and an average bias of 17.59 Wm -2. Furthermore, we also compare our product/algorithm with another satellite product (CERES-SYN) and two reanalysis products (MERRA and JRA55), and find that the accuracy of the much higher spatial resolution GLASS R n product is satisfactory. The GLASS R n product from 2000 to the present is operational and freely available to the public.
NASA Astrophysics Data System (ADS)
Hatvani, István Gábor; Leuenberger, Markus; Kohán, Balázs; Kern, Zoltán
2017-09-01
Water stable isotopes preserved in ice cores provide essential information about polar precipitation. In the present study, multivariate regression and variogram analyses were conducted on 22 δ2H and 53 δ18O records from 60 ice cores covering the second half of the 20th century. Taking into account the multicollinearity of the explanatory variables, as well as the model's adjusted R2 and mean absolute error, longitude, elevation and distance from the coast were found to be the main independent geographical factors governing the spatial δ18O variability of firn/ice in the chosen Antarctic macro-region. After diminishing the effects of these factors, variography was used to obtain the weights for interpolation with kriging and to reveal the spatial autocorrelation structure of the dataset. This indicates an average area of influence with a radius of 350 km, which allows the determination of areas not yet covered by the spatial variability of the existing network of ice cores. Finally, the regional isoscape was obtained for the study area; this may be considered a first step towards a geostatistically improved isoscape for Antarctica.
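The detrend-then-variogram workflow can be sketched as follows. Site coordinates, driver values, and the δ18O model are all synthetic; the point is the two-step structure (multiple regression on geographic drivers, then an empirical semivariogram of the residuals).

```python
# Sketch: regress d18O on geographic drivers, then compute an empirical
# semivariogram of the residuals to probe spatial autocorrelation.
import numpy as np

rng = np.random.default_rng(9)
n = 53
xy = rng.uniform(0, 100, (n, 2))                      # crude planar site coordinates
elev = rng.uniform(0, 3500, n)
dist = rng.uniform(0, 1000, n)                        # distance from coast (km)
d18o = -20 + 0.02 * xy[:, 0] - 0.005 * elev - 0.01 * dist + rng.normal(0, 0.5, n)

# step 1: multiple regression on the geographical drivers
X = np.column_stack([np.ones(n), xy[:, 0], elev, dist])
beta, *_ = np.linalg.lstsq(X, d18o, rcond=None)
resid = d18o - X @ beta

# step 2: empirical semivariogram of the residuals, binned by separation
gamma = []
for lo, hi in [(0, 25), (25, 50), (50, 75)]:
    vals = [0.5 * (resid[i] - resid[j]) ** 2
            for i in range(n) for j in range(i + 1, n)
            if lo <= np.hypot(*(xy[i] - xy[j])) < hi]
    gamma.append(float(np.mean(vals)))
print([round(g, 3) for g in gamma])
```

With purely random residuals (as here) the semivariogram is flat near the residual variance; a rise toward a sill, as found in the paper, is what defines the radius of influence used for kriging.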
Water Quality Sensing and Spatio-Temporal Monitoring Structure with Autocorrelation Kernel Methods.
Vizcaíno, Iván P; Carrera, Enrique V; Muñoz-Romero, Sergio; Cumbal, Luis H; Rojo-Álvarez, José Luis
2017-10-16
Pollution of water resources is usually analyzed with monitoring campaigns, which consist of programmed sampling, measurement, and recording of the most representative water quality parameters. These campaign measurements yield a non-uniformly sampled spatio-temporal data structure characterizing complex dynamic phenomena. In this work, we propose an enhanced statistical interpolation method to provide water quality managers with statistically interpolated representations of spatio-temporal dynamics. Specifically, our proposal makes efficient use of the a priori available information in the quality parameter measurements through Support Vector Regression (SVR) based on Mercer's kernels. The methods are benchmarked against previously proposed methods in three segments of the Machángara River and one segment of the San Pedro River in Ecuador, and their different dynamics are shown in statistically interpolated spatio-temporal maps. The best interpolation performance in terms of mean absolute error was obtained by the SVR with Mercer's kernel given by either the Mahalanobis spatio-temporal covariance matrix or the bivariate estimated autocorrelation function. In particular, the autocorrelation kernel provides a significant improvement in estimation quality, consistently across all six water quality variables, which points out the relevance of including a priori knowledge of the problem.
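A minimal kernel-machine sketch of this kind of spatio-temporal interpolation is shown below. It uses kernel ridge regression (standing in for the paper's SVR) with a separable RBF Mercer kernel over (space, time); the river layout, lengthscales, and quality variable are invented.

```python
# Sketch: kernel-based spatio-temporal interpolation of a water quality
# parameter from non-uniform samples, via kernel ridge regression.
import numpy as np

rng = np.random.default_rng(10)
n = 120
st = np.column_stack([rng.uniform(0, 10, n),      # distance along river (km)
                      rng.uniform(0, 30, n)])     # sampling day
y = np.sin(st[:, 0]) + 0.05 * st[:, 1] + rng.normal(0, 0.1, n)  # quality parameter

def rbf(A, B, ls=(2.0, 10.0)):
    # anisotropic RBF Mercer kernel with per-dimension lengthscales
    d2 = sum(((A[:, k, None] - B[None, :, k]) / ls[k]) ** 2 for k in range(2))
    return np.exp(-0.5 * d2)

K = rbf(st, st)
alpha = np.linalg.solve(K + 0.1 * np.eye(n), y)   # ridge-regularized weights

grid = np.column_stack([np.linspace(0, 10, 50), np.full(50, 15.0)])  # day-15 profile
y_hat = rbf(grid, st) @ alpha                     # interpolated river profile

err = np.mean(np.abs(K @ alpha - y))              # in-sample MAE
print(round(err, 3))
```

Replacing the RBF with a kernel built from the estimated autocorrelation function, as the paper does, injects the measured spatio-temporal covariance structure directly into the interpolator.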
Spatial uncertainty analysis: Propagation of interpolation errors in spatially distributed models
Phillips, D.L.; Marks, D.G.
1996-01-01
In simulation modelling, it is desirable to quantify model uncertainties and provide not only point estimates for output variables but confidence intervals as well. Spatially distributed physical and ecological process models are becoming widely used, with runs being made over a grid of points that represent the landscape. This requires input values at each grid point, which often have to be interpolated from irregularly scattered measurement sites, e.g., weather stations. Interpolation introduces spatially varying errors which propagate through the model. We extended established uncertainty analysis methods to a spatial domain for quantifying spatial patterns of input variable interpolation errors and how they propagate through a model to affect the uncertainty of the model output. We applied this to a model of potential evapotranspiration (PET) as a demonstration. We modelled PET for three time periods in 1990 as a function of temperature, humidity, and wind on a 10-km grid across the U.S. portion of the Columbia River Basin. Temperature, humidity, and wind speed were interpolated using kriging from 700-1000 supporting data points. Kriging standard deviations (SD) were used to quantify the spatially varying interpolation uncertainties. For each of 5693 grid points, 100 Monte Carlo simulations were done, using the kriged values of temperature, humidity, and wind, plus random error terms determined by the kriging SDs and the correlations of interpolation errors among the three variables. For the spring season example, kriging SDs averaged 2.6 °C for temperature, 8.7% for relative humidity, and 0.38 m s-1 for wind. The resultant PET estimates had coefficients of variation (CVs) ranging from 14% to 27% for the 10-km grid cells. Maps of PET means and CVs showed the spatial patterns of PET with a measure of its uncertainty due to interpolation of the input variables. 
This methodology should be applicable to a variety of spatially distributed models using interpolated inputs.
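The Monte Carlo propagation described above can be sketched as follows. Everything here is an illustrative assumption (the toy grid, the stand-in PET function); only the kriging-SD magnitudes echo the abstract, and the cross-variable error correlations the paper used are omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical kriged means and kriging SDs on a tiny grid (assumed values;
# the SD magnitudes loosely match the paper's spring-season example).
temp_mean = np.full((4, 4), 15.0); temp_sd = np.full((4, 4), 2.6)   # deg C
rh_mean   = np.full((4, 4), 50.0); rh_sd   = np.full((4, 4), 8.7)   # percent
wind_mean = np.full((4, 4), 3.0);  wind_sd = np.full((4, 4), 0.38)  # m/s

def toy_pet(temp, rh, wind):
    """Stand-in for a PET model: any smooth function of the three inputs."""
    return 0.05 * temp * (1 - rh / 100.0) * (1 + 0.2 * wind)

# Monte Carlo: perturb each input by its kriging SD and rerun the model.
n_sim = 100
runs = np.empty((n_sim,) + temp_mean.shape)
for i in range(n_sim):
    t = temp_mean + rng.normal(0.0, temp_sd)
    h = np.clip(rh_mean + rng.normal(0.0, rh_sd), 0.0, 100.0)
    w = np.maximum(wind_mean + rng.normal(0.0, wind_sd), 0.0)
    runs[i] = toy_pet(t, h, w)

pet_mean = runs.mean(axis=0)
pet_cv = 100.0 * runs.std(axis=0) / pet_mean   # percent CV per grid cell
```

Maps of `pet_mean` and `pet_cv` would then play the role of the paper's PET mean and uncertainty maps.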
Three-dimensional FLASH Laser Radar Range Estimation via Blind Deconvolution
2009-10-01
scene can result in errors due to several factors including the optical spatial impulse response, detector blurring, photon noise, timing jitter, and...estimation error include spatial blur, detector blurring, noise, timing jitter, and inter-sample targets. Unlike previous research, this paper accounts...for pixel coupling by defining the range image mathematical model as a 2D convolution between the system spatial impulse response and the object (target
Zhou, Yang; Fu, Xiaping; Ying, Yibin; Fang, Zhenhuan
2015-06-23
A fiber-optic probe system was developed to estimate the optical properties of turbid media based on spatially resolved diffuse reflectance. Because of the limitations of numerical calculation of the radiative transfer equation (RTE), diffusion approximation (DA), and Monte Carlo (MC) simulations, support vector regression (SVR) was introduced to model the relationship between diffuse reflectance values and optical properties. The SVR models of four collection fibers were trained on phantoms in a calibration set with a wide range of optical properties representing products of different applications; then the optical properties of phantoms in a prediction set were predicted after an optimal search over the SVR models. The results indicated that the SVR model was capable of describing the relationship with little deviation in forward validation. The correlation coefficients (R) of the reduced scattering coefficient (μs') and absorption coefficient (μa) in the prediction set were 0.9907 and 0.9980, respectively. The root mean square errors of prediction (RMSEP) of μs' and μa in inverse validation were 0.411 cm-1 and 0.338 cm-1, respectively. The results indicated that the integrated fiber-optic probe system combined with the SVR model was suitable for fast and accurate estimation of the optical properties of turbid media based on spatially resolved diffuse reflectance. Copyright © 2015 Elsevier B.V. All rights reserved.
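A minimal sketch of the forward SVR step, with entirely synthetic stand-ins for the phantom data (the feature model, ranges, and hyperparameters are assumptions, not the study's; the study's inverse search over trained models is not shown):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(1)

# Synthetic "phantoms": reflectance at four collection fibers (features)
# vs. reduced scattering coefficient (target), with a toy optical model.
mu_s = rng.uniform(5.0, 25.0, 200)                       # cm^-1, assumed range
X = np.column_stack([np.exp(-mu_s / (3.0 + k)) + rng.normal(0, 0.01, 200)
                     for k in range(4)])                 # four fibers

# SVR maps reflectance values to the optical property.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
model.fit(X, mu_s)

rmse = np.sqrt(np.mean((model.predict(X) - mu_s) ** 2))  # calibration fit
```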
Regression dilution bias: tools for correction methods and sample size calculation.
Berglund, Lars
2012-08-01
Random errors in measurement of a risk factor will introduce downward bias of an estimated association to a disease or a disease marker. This phenomenon is called regression dilution bias. A bias correction may be made with data from a validity study or a reliability study. In this article we give a non-technical description of designs of reliability studies with emphasis on selection of individuals for a repeated measurement, assumptions of measurement error models, and correction methods for the slope in a simple linear regression model where the dependent variable is a continuous variable. Also, we describe situations where correction for regression dilution bias is not appropriate. The methods are illustrated with the association between insulin sensitivity measured with the euglycaemic insulin clamp technique and fasting insulin, where measurement of the latter variable carries noticeable random error. We provide software tools for estimation of a corrected slope in a simple linear regression model assuming data for a continuous dependent variable and a continuous risk factor from a main study and an additional measurement of the risk factor in a reliability study. Also, we supply programs for estimation of the number of individuals needed in the reliability study and for choice of its design. Our conclusion is that correction for regression dilution bias is seldom applied in epidemiological studies. This may cause important effects of risk factors with large measurement errors to be neglected.
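A simple simulated sketch of the attenuation and its correction (the data-generating values are assumptions; the correction divides the naive slope by a reliability ratio estimated from a repeated measurement, one standard approach for a continuous risk factor, not necessarily the authors' software implementation):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500

# Simulated data: true risk factor x, outcome y, and two error-prone
# measurements w1, w2 of x (the repeat mimics a reliability study).
x = rng.normal(0.0, 1.0, n)
y = 0.5 * x + rng.normal(0.0, 0.5, n)          # true slope 0.5
sigma_u = 1.0                                   # large measurement error
w1 = x + rng.normal(0.0, sigma_u, n)
w2 = x + rng.normal(0.0, sigma_u, n)

# Naive slope from regressing y on the single error-prone measurement:
naive = np.cov(w1, y)[0, 1] / np.var(w1, ddof=1)

# Reliability ratio lambda = var(x) / var(w), estimated from the replicates
# as cov of the two repeats over the variance of one repeat.
lam = np.cov(w1, w2)[0, 1] / np.var(w1, ddof=1)

corrected = naive / lam                         # attenuation-corrected slope
```

With `sigma_u = 1` the reliability ratio is about 0.5, so the naive slope is biased downward by roughly half, and the corrected slope recovers the true value.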
Bennett, Derrick A; Landry, Denise; Little, Julian; Minelli, Cosetta
2017-09-19
Several statistical approaches have been proposed to assess and correct for exposure measurement error. We aimed to provide a critical overview of the most common approaches used in nutritional epidemiology. MEDLINE, EMBASE, BIOSIS and CINAHL were searched for reports published in English up to May 2016 in order to ascertain studies that described methods aimed to quantify and/or correct for measurement error for a continuous exposure in nutritional epidemiology using a calibration study. We identified 126 studies, 43 of which described statistical methods and 83 that applied any of these methods to a real dataset. The statistical approaches in the eligible studies were grouped into: a) approaches to quantify the relationship between different dietary assessment instruments and "true intake", which were mostly based on correlation analysis and the method of triads; b) approaches to adjust point and interval estimates of diet-disease associations for measurement error, mostly based on regression calibration analysis and its extensions. Two approaches (multiple imputation and moment reconstruction) were identified that can deal with differential measurement error. For regression calibration, the most common approach to correct for measurement error used in nutritional epidemiology, it is crucial to ensure that its assumptions and requirements are fully met. Analyses that investigate the impact of departures from the classical measurement error model on regression calibration estimates can be helpful to researchers in interpreting their findings. With regard to the possible use of alternative methods when regression calibration is not appropriate, the choice of method should depend on the measurement error model assumed, the availability of suitable calibration study data and the potential for bias due to violation of the classical measurement error model assumptions. 
On the basis of this review, we provide some practical advice for the use of methods to assess and adjust for measurement error in nutritional epidemiology.
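The regression calibration approach highlighted above can be sketched in two steps on simulated data (all numbers are assumptions; the classical measurement error model is assumed to hold, which, as the review stresses, must be checked in practice):

```python
import numpy as np

rng = np.random.default_rng(3)

# Main study: outcome y and an error-prone dietary measurement w.
n = 1000
x = rng.normal(0, 1, n)                  # true intake (unobserved)
w = x + rng.normal(0, 1, n)              # error-prone instrument
y = 0.3 * x + rng.normal(0, 0.5, n)      # true diet-disease slope 0.3

# Calibration study: a subset with a reference measurement of intake.
m = 200
x_cal = rng.normal(0, 1, m)
w_cal = x_cal + rng.normal(0, 1, m)

# Step 1: in the calibration study, regress the reference on the
# instrument to estimate E[X | W] = a + b * W.
b = np.cov(w_cal, x_cal)[0, 1] / np.var(w_cal, ddof=1)
a = x_cal.mean() - b * w_cal.mean()

# Step 2: replace w by its calibrated prediction in the main-study model.
x_hat = a + b * w
slope_rc = np.cov(x_hat, y)[0, 1] / np.var(x_hat, ddof=1)

naive = np.cov(w, y)[0, 1] / np.var(w, ddof=1)   # attenuated estimate
```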
Bartlett, Jonathan W; Keogh, Ruth H
2018-06-01
Bayesian approaches for handling covariate measurement error are well established and yet arguably are still relatively little used by researchers. For some this is likely due to unfamiliarity or disagreement with the Bayesian inferential paradigm. For others a contributory factor is the inability of standard statistical packages to perform such Bayesian analyses. In this paper, we first give an overview of the Bayesian approach to handling covariate measurement error, and contrast it with regression calibration, arguably the most commonly adopted approach. We then argue why the Bayesian approach has a number of statistical advantages compared to regression calibration and demonstrate that implementing the Bayesian approach is usually quite feasible for the analyst. Next, we describe the closely related maximum likelihood and multiple imputation approaches and explain why we believe the Bayesian approach to generally be preferable. We then empirically compare the frequentist properties of regression calibration and the Bayesian approach through simulation studies. The flexibility of the Bayesian approach to handle both measurement error and missing data is then illustrated through an analysis of data from the Third National Health and Nutrition Examination Survey.
Brown, Judith A.; Bishop, Joseph E.
2016-07-20
An a posteriori error-estimation framework is introduced to quantify and reduce modeling errors resulting from approximating complex mesoscale material behavior with a simpler macroscale model. Such errors may be prevalent when modeling welds and additively manufactured structures, where spatial variations and material textures may be present in the microstructure. We consider a case where a <100> fiber texture develops in the longitudinal scanning direction of a weld. Transversely isotropic elastic properties are obtained through homogenization of a microstructural model with this texture and are considered the reference weld properties within the error-estimation framework. Conversely, isotropic elastic properties are considered approximate weld properties since they contain no representation of texture. Errors introduced by using isotropic material properties to represent a weld are assessed through a quantified error bound in the elastic regime. Lastly, an adaptive error reduction scheme is used to determine the optimal spatial variation of the isotropic weld properties to reduce the error bound.
Seasonal Differences in Spatial Scales of Chlorophyll-A Concentration in Lake TAIHU,CHINA
NASA Astrophysics Data System (ADS)
Bao, Y.; Tian, Q.; Sun, S.; Wei, H.; Tian, J.
2012-08-01
Spatial distribution of chlorophyll-a (chla) concentration in Lake Taihu is non-uniform and varies seasonally. Chla concentration retrieval algorithms were established separately using measured data and remote sensing images (HJ-1 CCD and MODIS data) from October 2010, March 2011, and September 2011. Semivariance parameters were then calculated at the 30-m, 250-m, and 500-m scales to analyze spatial heterogeneity in different seasons. Finally, based on the definitions of lumped chla (chlaL) and distributed chla (chlaD), a seasonal model of chla concentration scale error was built. The results indicated that the spatial distribution of chla concentration in spring was relatively uniform. In summer and autumn, chla concentration in the north of the lake, such as Meiliang Bay and Zhushan Bay, was higher than in the south of Lake Taihu. Chla concentration at different scales showed a similar structure within the same season but different structures across seasons. Chla concentration retrieved from MODIS 500-m data had the greatest scale error. The spatial scale error changed with the seasons and was higher in summer and autumn than in spring; the maximum relative error reached 23%.
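The semivariance calculation used for the scale analysis can be sketched as follows on a synthetic field (the field and lags are assumptions; a real analysis would bin all pixel-pair distances rather than a single direction):

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic chla-like field: a smooth trend plus noise on a 50x50 grid,
# standing in for a retrieved concentration map.
yy, xx = np.mgrid[0:50, 0:50]
field = np.sin(xx / 10.0) + np.cos(yy / 12.0) + rng.normal(0, 0.1, (50, 50))

def semivariance(z, lag):
    """Empirical semivariance gamma(h) along rows for an integer pixel lag:
    half the mean squared difference between pixels `lag` apart."""
    d = z[:, lag:] - z[:, :-lag]
    return 0.5 * np.mean(d ** 2)

# gamma(h) typically rises with lag until the sill; compare two scales.
gamma_1 = semivariance(field, 1)
gamma_10 = semivariance(field, 10)
```

Comparing gamma(h) curves computed from 30-m, 250-m, and 500-m imagery is what reveals how much spatial structure each resolution resolves.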
Using ridge regression in systematic pointing error corrections
NASA Technical Reports Server (NTRS)
Guiar, C. N.
1988-01-01
A pointing error model is used in the antenna calibration process. Data from spacecraft or radio star observations are used to determine the parameters in the model. However, the regression variables are not truly independent, displaying a condition known as multicollinearity. Ridge regression, a biased estimation technique, is used to combat the multicollinearity problem. Two data sets pertaining to Voyager 1 spacecraft tracking (days 105 and 106 of 1987) were analyzed using both linear least squares and ridge regression methods. The advantages and limitations of employing the technique are presented. The problem is not yet fully resolved.
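The contrast between least squares and ridge regression under multicollinearity can be sketched with a toy collinear design (the data are assumptions, not the Voyager tracking sets; the closed form below is standard ridge, with k = 0 reducing to ordinary least squares):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 60

# Collinear regressors, mimicking pointing-model terms that are not truly
# independent: x2 is nearly a copy of x1.
x1 = rng.normal(0, 1, n)
x2 = x1 + rng.normal(0, 0.01, n)
X = np.column_stack([x1, x2])
y = x1 + 2 * x2 + rng.normal(0, 0.1, n)

def ridge(X, y, k):
    """Biased ridge solution (X'X + kI)^-1 X'y; k = 0 gives least squares."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

beta_ols = ridge(X, y, 0.0)
beta_ridge = ridge(X, y, 1.0)
# Under multicollinearity the individual OLS coefficients are unstable,
# while ridge shrinks them toward stable values at the cost of a small bias.
```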
Practical Session: Simple Linear Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
Two exercises are proposed to illustrate simple linear regression. The first is based on Galton's famous data set on heredity. We use the lm R command and obtain coefficient estimates, the residual standard error, R², residuals, and so on. In the second example, devoted to data on the vapor tension of mercury, we fit a simple linear regression, predict values, and anticipate multiple linear regression. This practical session is an excerpt from practical exercises proposed by A. Dalalyan at ENPC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).
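The quantities lm() reports can be reproduced by hand; a sketch on simulated Galton-style data (the heights and slope are assumptions, not the real data set):

```python
import numpy as np

rng = np.random.default_rng(6)

# Simulated Galton-style data: child height regressed on mid-parent height,
# with a slope below 1 ("regression to the mean"); inches, assumed values.
parent = rng.normal(68.0, 1.8, 900)
child = 0.65 * parent + 24.0 + rng.normal(0.0, 2.2, 900)

# Least-squares estimates, R^2, and residual standard error.
b1 = np.cov(parent, child)[0, 1] / np.var(parent, ddof=1)
b0 = child.mean() - b1 * parent.mean()
resid = child - (b0 + b1 * parent)
r2 = 1 - resid.var(ddof=0) / child.var(ddof=0)
s = np.sqrt((resid ** 2).sum() / (len(parent) - 2))  # n - 2 deg. of freedom
```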
Spatial Resolution, Grayscale, and Error Diffusion Trade-offs: Impact on Display System Design
NASA Technical Reports Server (NTRS)
Gille, Jennifer L. (Principal Investigator)
1996-01-01
We examine technology trade-offs related to grayscale resolution, spatial resolution, and error diffusion for tessellated display systems. We present new empirical results from our psychophysical study of these trade-offs and compare them to the predictions of a model of human vision.
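As a concrete illustration of the error-diffusion side of these trade-offs, here is a sketch of Floyd-Steinberg dithering, the classic error-diffusion algorithm (the 32x32 mid-gray patch is an assumed test input; the paper's displays and psychophysical setup are not modeled):

```python
import numpy as np

def floyd_steinberg(img, levels=2):
    """Quantize a grayscale image in [0, 1] to `levels` gray levels,
    diffusing each pixel's quantization error to unprocessed neighbors."""
    out = img.astype(float).copy()
    h, w = out.shape
    q = levels - 1
    for y in range(h):
        for x in range(w):
            old = out[y, x]
            new = np.rint(old * q) / q          # nearest available level
            out[y, x] = new
            err = old - new
            if x + 1 < w:               out[y, x + 1]     += err * 7 / 16
            if y + 1 < h and x > 0:     out[y + 1, x - 1] += err * 3 / 16
            if y + 1 < h:               out[y + 1, x]     += err * 5 / 16
            if y + 1 < h and x + 1 < w: out[y + 1, x + 1] += err * 1 / 16
    return out

# A flat mid-gray patch dithers to a mix of black and white pixels whose
# spatial average preserves the gray level despite 1-bit quantization.
patch = np.full((32, 32), 0.5)
halftone = floyd_steinberg(patch)
```

This is the mechanism by which spatial resolution is traded for grayscale resolution: the eye's low-pass filtering averages the binary pattern back toward the intended level.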
Cawyer, Chase R; Anderson, Sarah B; Szychowski, Jeff M; Neely, Cherry; Owen, John
2018-03-01
To compare the accuracy of a new regression-derived formula developed from the National Fetal Growth Studies data to the common alternative method that uses the average of the gestational ages (GAs) calculated for each fetal biometric measurement (biparietal diameter, head circumference, abdominal circumference, and femur length). This retrospective cross-sectional study identified nonanomalous singleton pregnancies that had a crown-rump length plus at least 1 additional sonographic examination with complete fetal biometric measurements. With the use of the crown-rump length to establish the referent estimated date of delivery, each method's error (National Institute of Child Health and Human Development regression versus Hadlock average [Radiology 1984; 152:497-501]) at every examination was computed. Error, defined as the difference between the crown-rump length-derived GA and each method's predicted GA (weeks), was compared in 3 GA intervals: 1 (14 weeks-20 weeks 6 days), 2 (21 weeks-28 weeks 6 days), and 3 (≥29 weeks). In addition, the proportion of each method's examinations that had errors outside prespecified (±) day ranges was computed by using odds ratios. A total of 16,904 sonograms were identified. The overall and prespecified GA range subset mean errors were significantly smaller for the regression compared to the average (P < .01), and the regression had significantly lower odds of observing examinations outside the specified range of error in GA intervals 2 (odds ratio, 1.15; 95% confidence interval, 1.01-1.31) and 3 (odds ratio, 1.24; 95% confidence interval, 1.17-1.32) than the average method. In a contemporary unselected population of women dated by a crown-rump length-derived GA, the National Institute of Child Health and Human Development regression formula produced fewer estimates outside a prespecified margin of error than the commonly used Hadlock average; the differences were most pronounced for GA estimates at 29 weeks and later.
© 2017 by the American Institute of Ultrasound in Medicine.
Batistatou, Evridiki; McNamee, Roseanne
2012-12-10
It is known that measurement error leads to bias in assessing exposure effects, which can, however, be corrected if independent replicates are available. For expensive replicates, two-stage (2S) studies that produce data 'missing by design' may be preferred over a single-stage (1S) study, because in the second stage, measurement of replicates is restricted to a sample of first-stage subjects. Motivated by an occupational study on the acute effect of carbon black exposure on respiratory morbidity, we compare the performance of several bias-correction methods for both designs in a simulation study: an instrumental variable method (EVROS IV) based on grouping strategies, which had been recommended especially when measurement error is large, the regression calibration and the simulation extrapolation methods. For the 2S design, either the problem of 'missing' data was ignored or the 'missing' data were imputed using multiple imputations. Both in 1S and 2S designs, in the case of small or moderate measurement error, regression calibration was shown to be the preferred approach in terms of root mean square error. For 2S designs, regression calibration as implemented by Stata software is not recommended, in contrast to our implementation of this method; the 'problematic' implementation, however, improved substantially with the use of multiple imputations. The EVROS IV method, under a good/fairly good grouping, outperforms the regression calibration approach in both design scenarios when exposure mismeasurement is severe. Both in 1S and 2S designs with moderate or large measurement error, simulation extrapolation severely failed to correct for bias. Copyright © 2012 John Wiley & Sons, Ltd.
Simulation of wave propagation in three-dimensional random media
NASA Astrophysics Data System (ADS)
Coles, Wm. A.; Filice, J. P.; Frehlich, R. G.; Yadlowsky, M.
1995-04-01
Quantitative error analyses for the simulation of wave propagation in three-dimensional random media, when narrow angular scattering is assumed, are presented for plane-wave and spherical-wave geometry. This includes the errors that result from finite grid size, finite simulation dimensions, and the separation of the two-dimensional screens along the propagation direction. Simple error scalings are determined for power-law spectra of the random refractive indices of the media. The effects of a finite inner scale are also considered. The spatial spectra of the intensity errors are calculated and compared with the spatial spectra of
Recent developments in measurement and evaluation of FAC damage in power plants
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garud, Y.S.; Besuner, P.; Cohn, M.J.
1999-11-01
This paper describes some recent developments in the measurement and evaluation of flow-accelerated corrosion (FAC) damage in power plants. The evaluation focuses on data checking and smoothing to account for gross errors, noise, and uncertainty in the wall thickness measurements from ultrasonic or pulsed eddy-current data. Also, the evaluation method utilizes advanced regression analysis for spatial and temporal evolution of the wall loss, providing statistically robust predictions of wear rates and associated uncertainty. Results of the application of these new tools are presented for several components in actual service. More importantly, the practical implications of using these advances are discussed in relation to the likely impact on the scope and effectiveness of FAC-related inspection programs.
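The core of a wear-rate estimate with uncertainty can be sketched with a simple linear trend fit at one inspection location (all thickness values and the noise level are assumptions; the paper's method is more advanced, handling spatial correlation and gross-error screening):

```python
import numpy as np

rng = np.random.default_rng(7)

# Simulated wall-thickness history at one location: inches vs. years,
# with an assumed true wear rate of 5 mils/yr and measurement noise.
t = np.array([0.0, 2.0, 4.0, 6.0, 8.0, 10.0])
thickness = 0.500 - 0.005 * t + rng.normal(0.0, 0.002, t.size)

# Least-squares trend; the wear rate is the negative slope, and its standard
# error is the uncertainty an inspection program would carry forward.
A = np.column_stack([np.ones_like(t), t])
coef, *_ = np.linalg.lstsq(A, thickness, rcond=None)
wear_rate = -coef[1]                              # inches per year
dof = t.size - 2
s2 = np.sum((thickness - A @ coef) ** 2) / dof
se_rate = np.sqrt(s2 * np.linalg.inv(A.T @ A)[1, 1])
```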
Online Learners’ Reading Ability Detection Based on Eye-Tracking Sensors
Zhan, Zehui; Zhang, Lei; Mei, Hu; Fong, Patrick S. W.
2016-01-01
The detection of university online learners' reading ability is generally problematic and time-consuming. Thus, eye-tracking sensors were employed in this study to record temporal and spatial human eye movements. Learners' pupils, blinks, fixations, saccades, and regressions were recognized as primary indicators for detecting reading ability. A computational model was established from the empirical eye-tracking data by applying a multi-feature regularization machine learning mechanism based on a low-rank constraint. The model shows good generalization ability, with an error of only 4.9% across 100 random runs. It has obvious advantages in saving time and improving precision, requiring only 20 min of testing to predict an individual learner's reading ability. PMID:27626418
Garcia-Vargas, Gonzalo G; Rothenberg, Stephen J; Silbergeld, Ellen K; Weaver, Virginia; Zamoiski, Rachel; Resnick, Carol; Rubio-Andrade, Marisela; Parsons, Patrick J; Steuerwald, Amy J; Navas-Acién, Ana; Guallar, Eliseo
2014-11-01
High blood lead (BPb) levels in children and elevated soil and dust arsenic, cadmium, and lead were previously found in Torreón, northern Mexico, host to the world's fourth largest lead-zinc metal smelter. The objectives of this study were to determine spatial distributions of adolescents with higher BPb and creatinine-corrected urine total arsenic, cadmium, molybdenum, thallium, and uranium around the smelter. Cross-sectional study of 512 male and female subjects 12-15 years of age was conducted. We measured BPb by graphite furnace atomic absorption spectrometry and urine trace elements by inductively coupled plasma-mass spectrometry, with dynamic reaction cell mode for arsenic. We constructed multiple regression models including sociodemographic variables and adjusted for subject residence spatial correlation with spatial lag or error terms. We applied local indicators of spatial association statistics to model residuals to identify hot spots of significant spatial clusters of subjects with higher trace elements. We found spatial clusters of subjects with elevated BPb (range 3.6-14.7 μg/dl) and urine cadmium (0.18-1.14 μg/g creatinine) adjacent to and downwind of the smelter and elevated urine thallium (0.28-0.93 μg/g creatinine) and uranium (0.07-0.13 μg/g creatinine) near ore transport routes, former waste, and industrial discharge sites. The conclusion derived from this study was that spatial clustering of adolescents with high BPb and urine cadmium adjacent to and downwind of the smelter and residual waste pile, areas identified over a decade ago with high lead and cadmium in soil and dust, suggests that past and/or present plant operations continue to present health risks to children in those neighborhoods.
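The hot-spot step described above rests on local indicators of spatial association; a minimal local Moran's I sketch on a toy grid (the planted cluster, grid layout, and rook weights are assumptions, not the Torreón data or the study's exact weight specification):

```python
import numpy as np

rng = np.random.default_rng(8)

# Toy setting: standardized residuals on a 10x10 grid of neighborhoods,
# with a planted cluster of high values in one corner (the "smelter" side).
z = rng.normal(0, 1, (10, 10))
z[:3, :3] += 3.0
z = (z - z.mean()) / z.std()

def local_moran(z):
    """Local Moran's I with rook (4-neighbor) contiguity weights:
    each cell's value times the mean of its neighbors' values."""
    lag = np.zeros_like(z)
    cnt = np.zeros_like(z)
    lag[1:, :] += z[:-1, :]; cnt[1:, :] += 1
    lag[:-1, :] += z[1:, :]; cnt[:-1, :] += 1
    lag[:, 1:] += z[:, :-1]; cnt[:, 1:] += 1
    lag[:, :-1] += z[:, 1:]; cnt[:, :-1] += 1
    return z * (lag / cnt)        # positive where like values cluster

I_local = local_moran(z)
# The planted corner shows strong positive ("high-high") local association.
```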
Visuospatial memory computations during whole-body rotations in roll.
Van Pelt, S; Van Gisbergen, J A M; Medendorp, W P
2005-08-01
We used a memory-saccade task to test whether the location of a target, briefly presented before a whole-body rotation in roll, is stored in egocentric or in allocentric coordinates. To make this distinction, we exploited the fact that subjects, when tilted sideways in darkness, make systematic errors when indicating the direction of gravity (an allocentric task) even though they have a veridical percept of their self-orientation in space. We hypothesized that if spatial memory is coded allocentrically, these distortions affect the coding of remembered targets and their readout after a body rotation. Alternatively, if coding is egocentric, updating for body rotation becomes essential and errors in performance should be related to the amount of intervening rotation. Subjects (n = 6) were tested making saccades to remembered world-fixed targets after passive body tilts. Initial and final tilt angle ranged between -120 degrees CCW and 120 degrees CW. The results showed that subjects made large systematic directional errors in their saccades (up to 90 degrees). These errors did not occur in the absence of intervening body rotation, ruling out a memory degradation effect. Regression analysis showed that the errors were closely related to the amount of subjective allocentric distortion at both the initial and final tilt angle, rather than to the amount of intervening rotation. We conclude that the brain uses an allocentric reference frame, possibly gravity-based, to code visuospatial memories during whole-body tilts. This supports the notion that the brain can define information in multiple frames of reference, depending on sensory inputs and task demands.
Dudley, Robert W.
2015-12-03
The largest average errors of prediction are associated with regression equations for the lowest streamflows derived for months during which the lowest streamflows of the year occur (such as the 5 and 1 monthly percentiles for August and September). The regression equations have been derived on the basis of streamflow and basin characteristics data for unregulated, rural drainage basins without substantial streamflow or drainage modifications (for example, diversions and (or) regulation by dams or reservoirs, tile drainage, irrigation, channelization, and impervious paved surfaces); therefore, using the equations for regulated or urbanized basins with substantial streamflow or drainage modifications will yield results of unknown error. Input basin characteristics derived using techniques or datasets other than those documented in this report or using values outside the ranges used to develop these regression equations also will yield results of unknown error.
Spatial Autocorrelation Approaches to Testing Residuals from Least Squares Regression.
Chen, Yanguang
2016-01-01
In geostatistics, the Durbin-Watson test is frequently employed to detect the presence of residual serial correlation from least squares regression analyses. However, the Durbin-Watson statistic is only suitable for ordered time or spatial series. If the variables comprise cross-sectional data coming from spatial random sampling, the test will be ineffectual because the value of the Durbin-Watson statistic depends on the sequence of data points. This paper develops two new statistics for testing serial correlation of residuals from least squares regression based on spatial samples. By analogy with the new form of Moran's index, an autocorrelation coefficient is defined with a standardized residual vector and a normalized spatial weight matrix. Then, by analogy with the Durbin-Watson statistic, two types of new serial correlation indices are constructed. As a case study, the two newly presented statistics are applied to a spatial sample of 29 of China's regions. These results show that the new spatial autocorrelation models can be used to test the serial correlation of residuals from regression analysis. In practice, the new statistics can make up for the deficiencies of the Durbin-Watson test.
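The Moran-style residual statistic underlying this line of work can be sketched as follows (the locations, residuals, and inverse-distance weights are assumed toy inputs, not the paper's 29-region data or its exact statistics):

```python
import numpy as np

rng = np.random.default_rng(9)
n = 29

# Cross-sectional "regions" at random locations (spatial random sampling),
# with toy regression residuals attached.
xy = rng.uniform(0, 1, (n, 2))
e = rng.normal(0, 1, n)

# Row-normalized inverse-distance spatial weight matrix W (zero diagonal).
d = np.sqrt(((xy[:, None, :] - xy[None, :, :]) ** 2).sum(-1))
W = 1.0 / (d + np.eye(n))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)

def morans_i(e, W):
    """Moran's I of a residual vector under a row-normalized weight matrix:
    I = (z' W z) / (z' z) with z the centered residuals."""
    z = e - e.mean()
    return float(z @ W @ z / (z @ z))

I = morans_i(e, W)
# For independent residuals I sits near its null expectation -1/(n-1);
# values well above it indicate positive spatial correlation of residuals.
```

Unlike Durbin-Watson, this statistic depends only on the weight matrix, not on any ordering of the observations.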
NASA Astrophysics Data System (ADS)
Shen, S. S.
2014-12-01
This presentation describes a suite of global precipitation products reconstructed by a multivariate regression method using an empirical orthogonal function (EOF) expansion. The sampling errors of the reconstruction are estimated for each product datum entry. The maximum temporal coverage is 1850-present and the spatial coverage is quasi-global (75°S, 75°N). The temporal resolution ranges from 5-day, monthly, to seasonal and annual. The Global Precipitation Climatology Project (GPCP) precipitation data from 1979-2008 are used to calculate the EOFs. The Global Historical Climatology Network (GHCN) gridded data are used to calculate the regression coefficients for reconstructions. The sampling errors of the reconstruction are analyzed in detail for different EOF modes. Our reconstructed 1900-2011 time series of the global average annual precipitation shows a 0.024 (mm/day)/100a trend, which is very close to the trend derived from the mean of 25 models of the CMIP5 (Coupled Model Intercomparison Project Phase 5). Our reconstruction examples of 1983 El Niño precipitation and 1917 La Niña precipitation (Figure 1) demonstrate that the El Niño and La Niña precipitation patterns are well reflected in the first two EOFs. The validation of our reconstruction results with GPCP makes it possible to use the reconstruction as the benchmark data for climate models. This will help the climate modeling community to improve model precipitation mechanisms and reduce the systematic difference between observed global precipitation, which hovers at around 2.7 mm/day for reconstructions and GPCP, and model precipitations, which have a range of 2.6-3.3 mm/day for CMIP5. Our precipitation products are publicly available online, including digital data, precipitation animations, computer codes, readme files, and the user manual.
This work is a joint effort between San Diego State University (Sam Shen, Nancy Tafolla, Barbara Sperberg, and Melanie Thorn) and University of Maryland (Phil Arkin, Tom Smith, Li Ren, and Li Dai) and supported in part by the U.S. National Science Foundation (Awards No. AGS-1015926 and AGS-1015957).
Evaluation of the streamflow-gaging network of Alaska in providing regional streamflow information
Brabets, Timothy P.
1996-01-01
In 1906, the U.S. Geological Survey (USGS) began operating a network of streamflow-gaging stations in Alaska. The primary purpose of the streamflow-gaging network has been to provide peak flow, average flow, and low-flow characteristics to a variety of users. In 1993, the USGS began a study to evaluate the current network of 78 stations. The objectives of this study were to determine the adequacy of the existing network in predicting selected regional flow characteristics and to determine if providing additional streamflow-gaging stations could improve the network's ability to predict these characteristics. Alaska was divided into six distinct hydrologic regions: Arctic, Northwest, Southcentral, Southeast, Southwest, and Yukon. For each region, historical and current streamflow data were compiled. In Arctic, Northwest, and Southwest Alaska, insufficient data were available to develop regional regression equations. In these areas, proposed locations of streamflow-gaging stations were selected by using clustering techniques to define similar areas within a region and by spatial visual analysis using the precipitation, physiographic, and hydrologic unit maps of Alaska. Sufficient data existed in Southcentral and Southeast Alaska to use generalized least squares (GLS) procedures to develop regional regression equations to estimate the 50-year peak flow, annual average flow, and a low-flow statistic. GLS procedures were also used for Yukon Alaska but the results should be used with caution because the data do not have an adequate spatial distribution. Network analysis procedures were used for the Southcentral, Southeast, and Yukon regions. Network analysis indicates the reduction in the sampling error of the regional regression equation that can be obtained given different scenarios. For Alaska, a 10-year planning period was used.
One scenario showed the results of continuing the current network with no additional gaging stations and another scenario showed the results of adding gaging stations to the network. With the exception of the annual average discharge equation for Southeast Alaska, by adding gaging stations in all three regions, the sampling error was reduced to a greater extent than by not adding gaging stations. The proposed streamflow-gaging network for Alaska consists of 308 gaging stations, of which 32 are designated as index stations. If the proposed network can not be implemented in its entirety, then a lesser cost alternative would be to establish the index stations and to implement the network for a particular region.
Schistosomiasis Breeding Environment Situation Analysis in Dongting Lake Area
NASA Astrophysics Data System (ADS)
Li, Chuanrong; Jia, Yuanyuan; Ma, Lingling; Liu, Zhaoyan; Qian, Yonggang
2013-01-01
Monitoring the environmental characteristics, such as vegetation and soil moisture, of the spatial/temporal distribution of Oncomelania hupensis (O. hupensis) is of vital importance to schistosomiasis prevention and control. In this study, the relationship between environmental factors derived from remotely sensed data and the density of O. hupensis was first analyzed by a multiple linear regression model. Secondly, spatial analysis of the regression residual was investigated by the semi-variogram method. Thirdly, spatial analysis of the regression residual and the multiple linear regression model were both employed to estimate the spatial variation of O. hupensis density. Finally, the approach was used to monitor and predict the spatial and temporal variations of oncomelania in the Dongting Lake region, China. The areas of potential O. hupensis habitats were predicted, and the influence of the Three Gorges Dam (TGD) project on the density of O. hupensis was analyzed.
Detecting influential observations in nonlinear regression modeling of groundwater flow
Yager, Richard M.
1998-01-01
Nonlinear regression is used to estimate optimal parameter values in models of groundwater flow to ensure that differences between predicted and observed heads and flows do not result from nonoptimal parameter values. Parameter estimates can be affected, however, by observations that disproportionately influence the regression, such as outliers that exert undue leverage on the objective function. Certain statistics developed for linear regression can be used to detect influential observations in nonlinear regression if the models are approximately linear. This paper discusses the application of Cook's D, which measures the effect of omitting a single observation on a set of estimated parameter values, and the statistical parameter DFBETAS, which quantifies the influence of an observation on each parameter. The influence statistics were used to (1) identify the influential observations in the calibration of a three-dimensional, groundwater flow model of a fractured-rock aquifer through nonlinear regression, and (2) quantify the effect of omitting influential observations on the set of estimated parameter values. Comparison of the spatial distribution of Cook's D with plots of model sensitivity shows that influential observations correspond to areas where the model heads are most sensitive to certain parameters, and where predicted groundwater flow rates are largest. Five of the six discharge observations were identified as influential, indicating that reliable measurements of groundwater flow rates are valuable data in model calibration. DFBETAS are computed and examined for an alternative model of the aquifer system to identify a parameterization error in the model design that resulted in overestimation of the effect of anisotropy on horizontal hydraulic conductivity.
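The leave-one-out logic behind Cook's D and DFBETAS can be illustrated in the simpler linear-regression setting (the paper applies the analogous statistics to a groundwater-flow model calibrated by nonlinear regression). This is a hedged sketch on synthetic data; the simplified DFBETAS here reuses the full-sample coefficient standard errors rather than the leave-one-out version.

```python
import numpy as np

def influence_stats(X, y):
    """Cook's D and a simplified DFBETAS via explicit leave-one-out refits."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (n - k)              # residual variance
    XtX_inv = np.linalg.inv(X.T @ X)
    se_beta = np.sqrt(s2 * np.diag(XtX_inv))  # coefficient standard errors
    cooks = np.empty(n)
    dfbetas = np.empty((n, k))
    for i in range(n):
        keep = np.arange(n) != i
        beta_i, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        diff = beta - beta_i
        # Cook's D: scaled shift in fitted values when observation i is omitted
        cooks[i] = (diff @ X.T @ X @ diff) / (k * s2)
        dfbetas[i] = diff / se_beta           # per-coefficient influence
    return cooks, dfbetas

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(30), rng.normal(size=30)])
y = 2.0 + 3.0 * X[:, 1] + rng.normal(scale=0.5, size=30)
y[0] += 10.0                                  # inject one influential outlier
cooks, dfb = influence_stats(X, y)
```

The injected outlier dominates both statistics, which is exactly how the paper flags influential head and discharge observations.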
NASA Astrophysics Data System (ADS)
Kapoor, Mudit
The first part of my dissertation considers the estimation of a panel data model with error components that are both spatially and time-wise correlated. The dissertation combines widely used model for spatial correlation (Cliff and Ord (1973, 1981)) with the classical error component panel data model. I introduce generalizations of the generalized moments (GM) procedure suggested in Kelejian and Prucha (1999) for estimating the spatial autoregressive parameter in case of a single cross section. I then use those estimators to define feasible generalized least squares (GLS) procedures for the regression parameters. I give formal large sample results concerning the consistency of the proposed GM procedures, as well as the consistency and asymptotic normality of the proposed feasible GLS procedures. The new estimators remain computationally feasible even in large samples. The second part of my dissertation employs a Cliff-Ord-type model to empirically estimate the nature and extent of price competition in the US wholesale gasoline industry. I use data on average weekly wholesale gasoline price for 289 terminals (distribution facilities) in the US. Data on demand factors, cost factors and market structure that affect price are also used. I consider two time periods, a high demand period (August 1999) and a low demand period (January 2000). I find a high level of competition in prices between neighboring terminals. In particular, price in one terminal is significantly and positively correlated to the price of its neighboring terminal. Moreover, I find this to be much higher during the low demand period, as compared to the high demand period. In contrast to previous work, I include for each terminal the characteristics of the marginal customer by controlling for demand factors in the neighboring location. I find these demand factors to be important during period of high demand and insignificant during the low demand period. 
Furthermore, I have also considered spatial correlation in unobserved factors that affect price. I find it to be high and significant only during the low demand period. Not correcting for it leads to incorrect inferences regarding exogenous explanatory variables.
Low flow of streams in the Susquehanna River basin of New York
Randall, Allan D.
2011-01-01
The principal source of streamflow during periods of low flow in the Susquehanna River basin of New York is the discharge of groundwater from sand-and-gravel deposits. Spatial variation in low flow is mostly a function of differences in three watershed properties: the amount of water that is introduced to the watershed and available for runoff, the extent of surficial sand and gravel relative to till-mantled bedrock, and the extent of wetlands. These three properties were consistently significant in regression equations that were developed to estimate several indices of low flow expressed in cubic feet per second or in cubic feet per second per square mile. The equations explain 90 to 99 percent of the spatial variation in low flow. A few equations indicate that underflow that bypasses streamflow-measurement sites through permeable sand and gravel can significantly decrease low flows. Analytical and numerical groundwater-flow models indicate that spatial extent, hydraulic conductivity and thickness, storage capacity, and topography of stratified sandand- gravel deposits affect low-flow yields from those deposits. Model-simulated discharge of groundwater to streams at low flow reaches a maximum where hydraulic-conductivity values are about 15 feet per day (in valleys 0.5 mile wide) to 60 feet per day (in valleys 1 mile wide). These hydraulic-conductivity values are much larger than those that are considered typical of till and bedrock, but smaller than values reported for productive sand-and-gravel aquifers in some valley reaches in New York. Differences in the properties of till and bedrock and in land-surface slope or relief within the Susquehanna River basin of New York apparently have little effect on low flow. 
Three regression equations were selected for practical application in estimating 7-day mean low flows in cubic feet per second with 10-year and 2-year recurrence intervals, and 90-percent flow duration, at ungaged sites draining more than 30 square miles; standard errors were 0.88, 1.40, and 1.95 cubic feet per second, respectively. Equations that express low flows in cubic feet per second per square mile were selected for estimating these three indices at ungaged sites draining less than 30 square miles; standard errors were 0.012, 0.018, and 0.022 cubic feet per second per square mile, respectively.
Weighted linear regression using D2H and D2 as the independent variables
Hans T. Schreuder; Michael S. Williams
1998-01-01
Several error structures for weighted regression equations used for predicting volume were examined for two large data sets of felled and standing loblolly pine trees (Pinus taeda L.). The generally accepted model with variance of error proportional to the value of the covariate squared (D2H = diameter squared times height, or D...
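The error structure named in the abstract can be made concrete. Below is a weighted least-squares sketch for a no-intercept volume model V = b·D²H with error variance proportional to (D²H)², so the weights are 1/(D²H)²; the data are simulated, not the loblolly pine data sets.

```python
import numpy as np

rng = np.random.default_rng(2)
D = rng.uniform(10, 50, 200)          # diameter
H = rng.uniform(8, 30, 200)           # height
x = D**2 * H                          # combined covariate D^2 H
b_true = 0.002
# multiplicative error => error standard deviation proportional to x
V = b_true * x * (1 + rng.normal(scale=0.05, size=x.size))

w = 1.0 / x**2                        # weights = 1 / variance, up to a constant
# Closed-form WLS slope for a no-intercept model: sum(w x V) / sum(w x^2)
b_hat = np.sum(w * x * V) / np.sum(w * x * x)
```

With these weights the WLS slope reduces to the mean of V/x, which is why this error structure is attractive for volume equations: large trees no longer dominate the fit.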
Crop area estimation based on remotely-sensed data with an accurate but costly subsample
NASA Technical Reports Server (NTRS)
Gunst, R. F.
1983-01-01
Alternatives to sampling-theory stratified and regression estimators of crop production and timber biomass were examined. An alternative estimator that is viewed as especially promising is the errors-in-variables regression estimator. Investigations established the need for caution with this estimator when the ratio of two error variances is not precisely known.
ERIC Educational Resources Information Center
Rocconi, Louis M.
2011-01-01
Hierarchical linear models (HLM) address the problems associated with the unit-of-analysis issue, such as misestimated standard errors, heterogeneity of regression, and aggregation bias, by modeling all levels of interest simultaneously. Hierarchical linear modeling resolves the problem of misestimated standard errors by incorporating a unique random…
Applying Intelligent Algorithms to Automate the Identification of Error Factors.
Jin, Haizhe; Qu, Qingxing; Munechika, Masahiko; Sano, Masataka; Kajihara, Chisato; Duffy, Vincent G; Chen, Han
2018-05-03
Medical errors are the manifestation of defects occurring in medical processes. Extracting and identifying these defects as medical error factors is an effective approach to preventing medical errors. However, it is a difficult and time-consuming task that requires an analyst with a professional medical background, so a method is needed that extracts medical error factors while reducing the extraction difficulty. In this research, a systematic methodology to extract and identify error factors in the medical administration process was proposed. The design of the error report, extraction of the error factors, and identification of the error factors were analyzed. Based on 624 medical error cases across four medical institutes in Japan and China, 19 error-related items and their levels were extracted; these were then related to 12 error factors. The relational model between the error-related items and error factors was established with a genetic algorithm (GA)-back-propagation neural network (BPNN) model. Compared to BPNN, partial least squares regression, and support vector regression, GA-BPNN exhibited a higher overall prediction accuracy and could promptly identify the error factors from the error-related items. The combination of "error-related items, their different levels, and the GA-BPNN model" was proposed as an error-factor identification technology, which could automatically identify medical error factors.
Oh, Eric J; Shepherd, Bryan E; Lumley, Thomas; Shaw, Pamela A
2018-04-15
For time-to-event outcomes, a rich literature exists on the bias introduced by covariate measurement error in regression models, such as the Cox model, and methods of analysis to address this bias. By comparison, less attention has been given to understanding the impact or addressing errors in the failure time outcome. For many diseases, the timing of an event of interest (such as progression-free survival or time to AIDS progression) can be difficult to assess or reliant on self-report and therefore prone to measurement error. For linear models, it is well known that random errors in the outcome variable do not bias regression estimates. With nonlinear models, however, even random error or misclassification can introduce bias into estimated parameters. We compare the performance of 2 common regression models, the Cox and Weibull models, in the setting of measurement error in the failure time outcome. We introduce an extension of the SIMEX method to correct for bias in hazard ratio estimates from the Cox model and discuss other analysis options to address measurement error in the response. A formula to estimate the bias induced into the hazard ratio by classical measurement error in the event time for a log-linear survival model is presented. Detailed numerical studies are presented to examine the performance of the proposed SIMEX method under varying levels and parametric forms of the error in the outcome. We further illustrate the method with observational data on HIV outcomes from the Vanderbilt Comprehensive Care Clinic. Copyright © 2017 John Wiley & Sons, Ltd.
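The SIMEX idea the paper extends can be shown in its simplest, classical setting: measurement error in a covariate attenuates an OLS slope; SIMEX adds *extra* simulated error at increasing variance multipliers λ, fits at each level, and extrapolates the fitted slope back to the error-free case λ = -1. This is a generic illustration of the technique, not the paper's Cox-model extension for error in the event time.

```python
import numpy as np

rng = np.random.default_rng(3)
n, sigma_u = 5000, 0.8
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.3, size=n)
w = x + rng.normal(scale=sigma_u, size=n)   # error-prone covariate

def slope(xv, yv):
    return np.polyfit(xv, yv, 1)[0]

lambdas = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
slopes = []
for lam in lambdas:
    # average over replicates of added noise with variance lam * sigma_u^2
    reps = [slope(w + rng.normal(scale=np.sqrt(lam) * sigma_u, size=n), y)
            for _ in range(20)]
    slopes.append(np.mean(reps))

# quadratic extrapolation of slope(lambda) back to lambda = -1
coef = np.polyfit(lambdas, slopes, 2)
beta_simex = np.polyval(coef, -1.0)
beta_naive = slopes[0]
```

The naive slope is attenuated well below the true value of 2; the SIMEX extrapolation recovers much (though, with a quadratic extrapolant, not all) of the bias.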
Infant and Child Mortality in India in the Last Two Decades: A Geospatial Analysis
Singh, Abhishek; Pathak, Praveen Kumar; Chauhan, Rajesh Kumar; Pan, William
2011-01-01
Background: Studies examining the intricate interplay between poverty, female literacy, child malnutrition, and child mortality are rare in the demographic literature. Given the recent focus on Millennium Development Goals 4 (child survival) and 5 (maternal health), we explored whether the geographic regions that were underprivileged in terms of wealth, female literacy, child nutrition, or safe delivery were also grappling with an elevated risk of child mortality; whether there were any spatial outliers; and whether these relationships have undergone any significant change over time. Methodology: The present paper investigated these critical questions using data from household surveys such as NFHS 1992–1993, NFHS 1998–1999, and DLHS 2002–2004. For the first time, we employed geospatial techniques such as Moran's I, univariate LISA, bivariate LISA, spatial error regression, and spatiotemporal regression to address the research problem. For the geospatial analysis, we classified India into 76 natural regions based on the agro-climatic scheme proposed by Bhat and Zavier (1999) following the Census of India study, and all estimates were generated for each of the geographic regions. Results/Conclusions: This study brings out the stark intra-state and inter-regional disparities in infant and under-five mortality in India over the past two decades. It further reveals, for the first time, that geographic regions that were underprivileged in child nutrition, wealth, or female literacy were also likely to be disadvantaged in terms of infant and child survival, irrespective of the state to which they belong. While the role of economic status in explaining child malnutrition and child survival has weakened, the effect of mother's education has actually become stronger over time. PMID:22073208
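The global Moran's I statistic used above can be sketched on a toy lattice. The rook-contiguity weight matrix and the test surfaces are assumptions for illustration; the study applies the statistic to regional mortality rates with its own spatial weights.

```python
import numpy as np

def morans_i(values, W):
    """Global Moran's I: n * z'Wz / (sum(W) * z'z) for mean-centred z."""
    z = values - values.mean()
    n = len(z)
    return n * (z @ W @ z) / (W.sum() * (z @ z))

# 5x5 rook-contiguity weight matrix
side = 5
n = side * side
W = np.zeros((n, n))
for r in range(side):
    for c in range(side):
        i = r * side + c
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < side and 0 <= cc < side:
                W[i, rr * side + cc] = 1.0

# a smooth (positively autocorrelated) surface vs. an unstructured one
smooth = np.add.outer(np.arange(side), np.arange(side)).ravel().astype(float)
rng = np.random.default_rng(4)
noise = rng.normal(size=n)
i_smooth = morans_i(smooth, W)
i_noise = morans_i(noise, W)
```

A spatially smooth surface yields a strongly positive Moran's I, while spatially random values stay near the null expectation of -1/(n-1).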
NASA Astrophysics Data System (ADS)
Verkade, J. S.; Brown, J. D.; Reggiani, P.; Weerts, A. H.
2013-09-01
The ECMWF temperature and precipitation ensemble reforecasts are evaluated for biases in the mean, spread and forecast probabilities, and how these biases propagate to streamflow ensemble forecasts. The forcing ensembles are subsequently post-processed to reduce bias and increase skill, and to investigate whether this leads to improved streamflow ensemble forecasts. Multiple post-processing techniques are used: quantile-to-quantile transform, linear regression with an assumption of bivariate normality and logistic regression. Both the raw and post-processed ensembles are run through a hydrologic model of the river Rhine to create streamflow ensembles. The results are compared using multiple verification metrics and skill scores: relative mean error, Brier skill score and its decompositions, mean continuous ranked probability skill score and its decomposition, and the ROC score. Verification of the streamflow ensembles is performed at multiple spatial scales: relatively small headwater basins, large tributaries and the Rhine outlet at Lobith. The streamflow ensembles are verified against simulated streamflow, in order to isolate the effects of biases in the forcing ensembles and any improvements therein. The results indicate that the forcing ensembles contain significant biases, and that these cascade to the streamflow ensembles. Some of the bias in the forcing ensembles is unconditional in nature; this was resolved by a simple quantile-to-quantile transform. Improvements in conditional bias and skill of the forcing ensembles vary with forecast lead time, amount, and spatial scale, but are generally moderate. The translation to streamflow forecast skill is further muted, and several explanations are considered, including limitations in the modelling of the space-time covariability of the forcing ensembles and the presence of storages.
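The quantile-to-quantile transform that removed the unconditional bias above amounts to replacing each forecast value with the observed-climatology quantile at the same cumulative probability. A generic empirical quantile-mapping sketch on synthetic data (function name and distributions are assumptions):

```python
import numpy as np

def quantile_map(forecast, obs_climatology, fcst_climatology):
    """Map forecast values onto observed-climatology quantiles."""
    # empirical CDF position of each forecast value in the forecast climatology
    probs = np.searchsorted(np.sort(fcst_climatology), forecast, side="right") / len(fcst_climatology)
    probs = np.clip(probs, 0.0, 1.0)
    # same cumulative probabilities, read off the observed climatology
    return np.quantile(obs_climatology, probs)

rng = np.random.default_rng(5)
obs = rng.gamma(shape=2.0, scale=3.0, size=5000)              # "observations"
fcst_clim = 0.5 * rng.gamma(shape=2.0, scale=3.0, size=5000)  # dry-biased forecasts
new_fcst = quantile_map(fcst_clim[:100], obs, fcst_clim)
```

After mapping, the forecasts inherit the observed distribution's mean and spread, which is why the transform corrects unconditional bias but cannot add conditional skill.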
Lopiano, Kenneth K; Young, Linda J; Gotway, Carol A
2014-09-01
Spatially referenced datasets arising from multiple sources are routinely combined to assess relationships among various outcomes and covariates. The geographical units associated with the data, such as the geographical coordinates or areal-level administrative units, are often spatially misaligned, that is, observed at different locations or aggregated over different geographical units. As a result, the covariate is often predicted at the locations where the response is observed. The method used to align disparate datasets must be accounted for when subsequently modeling the aligned data. Here we consider the case where kriging is used to align datasets in point-to-point and point-to-areal misalignment problems when the response variable is non-normally distributed. If the relationship is modeled using generalized linear models, the additional uncertainty induced from using the kriging mean as a covariate introduces a Berkson error structure. In this article, we develop a pseudo-penalized quasi-likelihood algorithm to account for the additional uncertainty when estimating regression parameters and associated measures of uncertainty. The method is applied to a point-to-point example assessing the relationship between low-birth weights and PM2.5 levels after the onset of the largest wildfire in Florida history, the Bugaboo scrub fire. A point-to-areal misalignment problem is presented where the relationship between asthma events in Florida's counties and PM2.5 levels after the onset of the fire is assessed. Finally, the method is evaluated using a simulation study. Our results indicate the method performs well in terms of coverage for 95% confidence intervals and naive methods that ignore the additional uncertainty tend to underestimate the variability associated with parameter estimates. The underestimation is most profound in Poisson regression models. © 2014, The International Biometric Society.
Space, time, and the third dimension (model error)
Moss, Marshall E.
1979-01-01
The space-time tradeoff of hydrologic data collection (the ability to substitute spatial coverage for temporal extension of records or vice versa) is controlled jointly by the statistical properties of the phenomena that are being measured and by the model that is used to meld the information sources. The control exerted on the space-time tradeoff by the model and its accompanying errors has seldom been studied explicitly. The technique, known as Network Analyses for Regional Information (NARI), permits such a study of the regional regression model that is used to relate streamflow parameters to the physical and climatic characteristics of the drainage basin. The NARI technique shows that model improvement is a viable and sometimes necessary means of improving regional data collection systems. Model improvement provides an immediate increase in the accuracy of regional parameter estimation and also increases the information potential of future data collection. Model improvement, which can only be measured in a statistical sense, cannot be quantitatively estimated prior to its achievement; thus an attempt to upgrade a particular model entails a certain degree of risk on the part of the hydrologist.
Perceptual Color Characterization of Cameras
Vazquez-Corral, Javier; Connah, David; Bertalmío, Marcelo
2014-01-01
Color camera characterization, mapping outputs from the camera sensors to an independent color space, such as XYZ, is an important step in the camera processing pipeline. Until now, this procedure has been primarily solved by using a 3 × 3 matrix obtained via a least-squares optimization. In this paper, we propose to use the spherical sampling method, recently published by Finlayson et al., to perform a perceptual color characterization. In particular, we search for the 3 × 3 matrix that minimizes three different perceptual errors, one pixel based and two spatially based. For the pixel-based case, we minimize the CIE ΔE error, while for the spatial-based case, we minimize both the S-CIELAB error and the CID error measure. Our results demonstrate an improvement of approximately 3% for the ΔE error, 7% for the S-CIELAB error and 13% for the CID error measures. PMID:25490586
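The least-squares baseline that the paper improves upon is easy to state: find the 3×3 matrix M minimizing the Frobenius-norm error between mapped camera RGB and reference XYZ. A sketch on synthetic patches (the matrix and noise level are assumptions, not measured camera data):

```python
import numpy as np

rng = np.random.default_rng(6)
M_true = np.array([[0.41, 0.36, 0.18],
                   [0.21, 0.72, 0.07],
                   [0.02, 0.12, 0.95]])
rgb = rng.uniform(0, 1, size=(100, 3))                      # training patches
xyz = rgb @ M_true.T + rng.normal(scale=1e-3, size=(100, 3))  # noisy references

# Solve min_M || rgb @ M.T - xyz ||_F via ordinary least squares
M_fit, *_ = np.linalg.lstsq(rgb, xyz, rcond=None)
M_fit = M_fit.T   # so that xyz_row ≈ M_fit @ rgb_row
```

The paper's contribution is to replace this algebraic objective with perceptual error metrics (ΔE, S-CIELAB, CID) searched over candidate matrices via spherical sampling.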
Examining Impulse-Variability in Kicking.
Chappell, Andrew; Molina, Sergio L; McKibben, Jonathon; Stodden, David F
2016-07-01
This study examined variability in kicking speed and spatial accuracy to test the impulse-variability theory prediction of an inverted-U function and the speed-accuracy trade-off. Twenty-eight 18- to 25-year-old adults kicked a playground ball at various percentages (50-100%) of their maximum speed at a wall target. Speed variability and spatial error were analyzed using repeated-measures ANOVA with built-in polynomial contrasts. Results indicated a significant inverse linear trajectory for speed variability (p < .001, η² = .345) where the 50% and 60% maximum-speed conditions had significantly higher variability than the 100% condition. A significant quadratic fit was found for spatial error scores of mean radial error (p < .0001, η² = .474) and subject-centroid radial error (p < .0001, η² = .453). Findings suggest variability and accuracy of multijoint, ballistic skill performance may not follow the general principles of impulse-variability theory or the speed-accuracy trade-off.
NASA Astrophysics Data System (ADS)
Mei, Zhixiong; Wu, Hao; Li, Shiyun
2018-06-01
The Conversion of Land Use and its Effects at Small regional extent (CLUE-S) model, which is widely used for land-use simulation, utilizes logistic regression to estimate the relationships between land use and its drivers and thus predict land-use change probabilities. However, logistic regression disregards possible spatial autocorrelation and self-organization in land-use data. Autologistic regression can depict spatial autocorrelation but cannot address self-organization, while logistic regression considering only self-organization (NE-logistic regression) fails to capture spatial autocorrelation. Therefore, this study developed a regression (NE-autologistic regression) method, which incorporated both spatial autocorrelation and self-organization, to improve CLUE-S. The Zengcheng District of Guangzhou, China, was selected as the study area. The land-use data of 2001, 2005, and 2009, as well as 10 typical driving factors, were used to validate the proposed regression method and the improved CLUE-S model. Then, three future land-use scenarios for 2020 (natural growth, ecological protection, and economic development) were simulated using the improved model. Validation results showed that NE-autologistic regression performed better than logistic regression, autologistic regression, and NE-logistic regression in predicting land-use change probabilities. The spatial allocation accuracy and kappa values of NE-autologistic-CLUE-S were higher than those of logistic-CLUE-S, autologistic-CLUE-S, and NE-logistic-CLUE-S for the simulations of two periods, 2001-2009 and 2005-2009, which proved that the improved CLUE-S model achieved the best simulation and was thereby effective to a certain extent. The scenario simulation results indicated that under all three scenarios, traffic land and residential/industrial land would increase, whereas arable land and unused land would decrease during 2009-2020.
Apparent differences also existed in the simulated change sizes and locations of each land-use type under different scenarios. The results not only demonstrate the validity of the improved model but also provide a valuable reference for relevant policy-makers.
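The autologistic device above amounts to adding an autocovariate, a neighborhood average of the response, to an ordinary logistic model. A hedged sketch on a synthetic lattice: the rook-neighbor weights, the driver variable, and the plain Newton-Raphson fit are all illustrative assumptions, not the study's NE-autologistic implementation.

```python
import numpy as np

def fit_logistic(X, y, n_iter=30):
    """Plain Newton-Raphson (IRLS) logistic regression with a tiny ridge."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(X @ beta, -30, 30)))
        H = X.T @ (X * (p * (1 - p))[:, None]) + 1e-6 * np.eye(X.shape[1])
        beta += np.linalg.solve(H, X.T @ (y - p))
    return beta

rng = np.random.default_rng(7)
side = 20
n = side * side
# clustered binary land-use labels on a grid, plus one noisy ordinary driver
labels = (np.add.outer(np.arange(side), np.arange(side)) > side).ravel().astype(float)
x1 = labels + rng.normal(scale=1.0, size=n)

# autocovariate: row-standardized rook-neighbour average of the labels
auto = np.zeros(n)
for r in range(side):
    for c in range(side):
        nbrs = [(r + dr, c + dc) for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                if 0 <= r + dr < side and 0 <= c + dc < side]
        auto[r * side + c] = np.mean([labels[rr * side + cc] for rr, cc in nbrs])

X = np.column_stack([np.ones(n), x1, auto])
beta = fit_logistic(X, labels)
p_hat = 1.0 / (1.0 + np.exp(-np.clip(X @ beta, -30, 30)))
acc = np.mean((p_hat > 0.5) == (labels == 1))
```

With spatially clustered labels, the autocovariate carries most of the predictive weight, which is exactly the autocorrelation signal plain logistic regression discards.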
Left neglect dyslexia: Perseveration and reading error types.
Ronchi, Roberta; Algeri, Lorella; Chiapella, Laura; Gallucci, Marcello; Spada, Maria Simonetta; Vallar, Giuseppe
2016-08-01
Right-brain-damaged patients may show a reading disorder termed neglect dyslexia. Patients with left neglect dyslexia omit letters on the left-hand side (the beginning, when reading left-to-right) of the letter string, substitute them with other letters, and add letters to the left of the string. The aim of this study was to investigate the pattern of association, if any, between error types in patients with left neglect dyslexia and recurrent perseveration (a productive visuo-motor deficit characterized by the addition of marks) in target cancellation. Specifically, we aimed at assessing whether different productive symptoms (relative to the reading and visuo-motor domains) could be associated in patients with left spatial neglect. Fifty-four right-brain-damaged patients took part in the study: 50 of the 54 patients showed left spatial neglect, with 27 of them also exhibiting left neglect dyslexia. Neglect dyslexic patients who showed perseveration produced mainly substitution neglect errors in reading. Conversely, omissions were the prevailing reading error pattern in neglect dyslexic patients without perseveration. Addition reading errors were much less frequent. Different functional pathological mechanisms may underlie the omission and substitution reading errors committed by right-brain-damaged patients with left neglect dyslexia. One such mechanism, involving the defective stopping of inappropriate responses, may contribute both to recurrent perseveration in target cancellation and to substitution errors in reading. Productive pathological phenomena, together with deficits of spatial attention to events taking place on the left-hand side of space, shape the manifestations of neglect dyslexia and, more generally, of spatial neglect. Copyright © 2016 Elsevier Ltd. All rights reserved.
Methods for estimating flood frequency in Montana based on data through water year 1998
Parrett, Charles; Johnson, Dave R.
2004-01-01
Annual peak discharges having recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years (T-year floods) were determined for 660 gaged sites in Montana and in adjacent areas of Idaho, Wyoming, and Canada, based on data through water year 1998. The updated flood-frequency information was subsequently used in regression analyses, either ordinary or generalized least squares, to develop equations relating T-year floods to various basin and climatic characteristics, equations relating T-year floods to active-channel width, and equations relating T-year floods to bankfull width. The equations can be used to estimate flood frequency at ungaged sites. Montana was divided into eight regions, within which flood characteristics were considered to be reasonably homogeneous, and the three sets of regression equations were developed for each region. A measure of the overall reliability of the regression equations is the average standard error of prediction. The average standard errors of prediction for the equations based on basin and climatic characteristics ranged from 37.4 percent to 134.1 percent. Average standard errors of prediction for the equations based on active-channel width ranged from 57.2 percent to 141.3 percent. Average standard errors of prediction for the equations based on bankfull width ranged from 63.1 percent to 155.5 percent. In most regions, the equations based on basin and climatic characteristics generally had smaller average standard errors of prediction than equations based on active-channel or bankfull width. An exception was the Southeast Plains Region, where all equations based on active-channel width had smaller average standard errors of prediction than equations based on basin and climatic characteristics or bankfull width. Methods for weighting estimates derived from the basin- and climatic-characteristic equations and the channel-width equations also were developed. 
The weights were based on the cross correlation of residuals from the different methods and the average standard errors of prediction. When all three methods were combined, the average standard errors of prediction ranged from 37.4 percent to 120.2 percent. Weighting of estimates reduced the standard errors of prediction for all T-year flood estimates in four regions, reduced the standard errors of prediction for some T-year flood estimates in two regions, and provided no reduction in average standard error of prediction in two regions. A computer program for solving the regression equations, weighting estimates, and determining reliability of individual estimates was developed and placed on the USGS Montana District World Wide Web page. A new regression method, termed Region of Influence regression, also was tested. Test results indicated that the Region of Influence method was not as reliable as the regional equations based on generalized least squares regression. Two additional methods for estimating flood frequency at ungaged sites located on the same streams as gaged sites also are described. The first method, based on a drainage-area-ratio adjustment, is intended for use on streams where the ungaged site of interest is located near a gaged site. The second method, based on interpolation between gaged sites, is intended for use on streams that have two or more streamflow-gaging stations.
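The weighting of estimates from the different methods can be sketched in its simplest form: an inverse-variance combination based on each method's standard error. Note this is a deliberate simplification; the report's weights also account for the cross correlation of residuals between methods, which is omitted here, and all numbers below are hypothetical.

```python
import numpy as np

def combine(estimates, std_errors):
    """Inverse-variance weighted combination of independent estimates."""
    se = np.asarray(std_errors, float)
    w = 1.0 / se**2
    combined = np.sum(w * np.asarray(estimates, float)) / w.sum()
    combined_se = np.sqrt(1.0 / w.sum())  # valid only if estimates are independent
    return combined, combined_se

# hypothetical 100-year peak-flow estimates (cfs) from the three methods:
# basin/climatic characteristics, active-channel width, bankfull width
q100 = np.array([1200.0, 1500.0, 1350.0])
se = np.array([0.37, 0.57, 0.63]) * q100   # percent errors taken as absolute se
q_hat, q_se = combine(q100, se)
```

The combined estimate falls between the individual ones and carries a smaller standard error than any single method, mirroring the report's finding that weighting reduced prediction error in most regions.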
Exploring geographic variation in US mortality rates using a spatial Durbin approach
Yang, Tse-Chuan; Noah, Aggie; Shoff, Carla
2015-01-01
Previous studies focused on identifying the determinants of mortality in US counties have examined the relationships between mortality and explanatory covariates within a county only, and have ignored the well-documented spatial dependence of mortality. We challenge earlier literature by arguing that the mortality rate of a certain county may also be associated with the features of its neighboring counties beyond its own features. Drawing from both the spillover (i.e., same direction effect) and social relativity (i.e., opposite direction effect) perspectives, our spatial Durbin modeling results indicate that both theoretical perspectives provide valuable frameworks to guide the modeling of mortality variation in US counties. Our empirical findings support that the mortality rate of a certain county is associated with the features of its neighbors beyond its own features. Specifically, we found support for the spillover perspective in which the percentage of the Hispanic population, concentrated disadvantage, and the social capital of a specific county are negatively associated with the mortality rate in the specific county and also in neighboring counties. On the other hand, the following covariates fit the social relativity process: health insurance coverage, percentage of non-Hispanic other races, and income inequality. The direction of their associations with mortality in the specific county is opposite to that of the relationships with mortality in neighboring counties. Methodologically, spatial Durbin modeling addresses the shortcomings of traditional analytic approaches used in ecological mortality research such as ordinary least squares, spatial error, and spatial lag regression. Our results produce new insights drawn from unbiased estimates. PMID:25642156
Divided spatial attention and feature-mixing errors.
Golomb, Julie D
2015-11-01
Spatial attention is thought to play a critical role in feature binding. However, often multiple objects or locations are of interest in our environment, and we need to shift or split attention between them. Recent evidence has demonstrated that shifting and splitting spatial attention results in different types of feature-binding errors. In particular, when two locations are simultaneously sharing attentional resources, subjects are susceptible to feature-mixing errors; that is, they tend to report a color that is a subtle blend of the target color and the color at the other attended location. The present study was designed to test whether these feature-mixing errors are influenced by target-distractor similarity. Subjects were cued to split attention across two different spatial locations, and were subsequently presented with an array of colored stimuli, followed by a postcue indicating which color to report. Target-distractor similarity was manipulated by varying the distance in color space between the two attended stimuli. Probabilistic modeling in all cases revealed shifts in the response distribution consistent with feature-mixing errors; however, the patterns differed considerably across target-distractor color distances. With large differences in color, the findings replicated the mixing result, but with small color differences, repulsion was instead observed, with the reported target color shifted away from the other attended color.
Examining impulse-variability in overarm throwing.
Urbin, M A; Stodden, David; Boros, Rhonda; Shannon, David
2012-01-01
The purpose of this study was to examine variability in overarm throwing velocity and spatial output error at various percentages of maximum to test the prediction of an inverted-U function, as predicted by impulse-variability theory, and a speed-accuracy trade-off, as predicted by Fitts' law. Thirty subjects (16 skilled, 14 unskilled) were instructed to throw a tennis ball at seven percentages of their maximum velocity (40-100%) in random order (9 trials per condition) at a target 30 feet away. Throwing velocity was measured with a radar gun and interpreted as an index of overall systemic power output. Within-subject throwing velocity variability was examined using within-subjects repeated-measures ANOVAs (7 repeated conditions) with built-in polynomial contrasts. Spatial error was analyzed using mixed-model regression. Results indicated a quadratic fit, with variability in throwing velocity increasing from 40% up to 60%, where it peaked, and then decreasing at each subsequent interval to maximum (p < .001, η² = .555). There was no linear relationship between speed and accuracy. Overall, these data support the notion of an inverted-U function in overarm throwing velocity variability as both skilled and unskilled subjects approach maximum effort. However, these data do not support the notion of a speed-accuracy trade-off. The consistent demonstration of an inverted-U function associated with systemic power output variability indicates an enhanced capability to regulate aspects of force production and relative timing between segments as individuals approach maximum effort, even in a complex ballistic skill.
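The inverted-U test described above can be sketched by fitting a quadratic to within-subject velocity variability across the %max conditions; a negative leading coefficient with an interior vertex indicates an inverted-U. The SD values below are invented for illustration, not the study's data:

```python
import numpy as np

# Hypothetical condition means: SD of throwing velocity at each %max level.
pct = np.array([40, 50, 60, 70, 80, 90, 100], dtype=float)
sd_velocity = np.array([1.8, 2.3, 2.6, 2.4, 2.1, 1.7, 1.4])

a, b, c = np.polyfit(pct, sd_velocity, deg=2)
peak_pct = -b / (2 * a)          # vertex of the fitted parabola
print(a < 0, round(peak_pct))    # negative curvature, peak in the low-to-mid 60s
```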
Validation of Metrics as Error Predictors
NASA Astrophysics Data System (ADS)
Mendling, Jan
In this chapter, we test the validity of metrics that were defined in the previous chapter for predicting errors in EPC business process models. In Section 5.1, we provide an overview of how the analysis data is generated. Section 5.2 describes the sample of EPCs from practice that we use for the analysis. Here we discuss a disaggregation by the EPC model group and by error as well as a correlation analysis between metrics and error. Based on this sample, we calculate a logistic regression model for predicting error probability with the metrics as input variables in Section 5.3. In Section 5.4, we then test the regression function for an independent sample of EPC models from textbooks as a cross-validation. Section 5.5 summarizes the findings.
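The Section 5.3 step, fitting a logistic regression that predicts error probability from model metrics, can be sketched as follows. The metric values and error flags are simulated here for illustration; the real predictors are the metrics defined in the previous chapter:

```python
import numpy as np

# Simulate standardized metric values and a binary "model has an error" flag.
rng = np.random.default_rng(1)
n = 500
metrics = rng.normal(size=(n, 2))
true_w = np.array([1.5, -1.0])
p = 1 / (1 + np.exp(-(metrics @ true_w)))
has_error = rng.random(n) < p

# Fit logistic regression by Newton's method (IRLS), intercept included.
X = np.hstack([np.ones((n, 1)), metrics])
w = np.zeros(3)
for _ in range(25):
    mu = 1 / (1 + np.exp(-X @ w))
    grad = X.T @ (has_error - mu)
    H = X.T @ (X * (mu * (1 - mu))[:, None])
    w += np.linalg.solve(H, grad)
print(np.round(w, 1))   # intercept near 0, slopes roughly [1.5, -1.0]
```

The fitted coefficients give each metric's contribution to the log-odds of a model containing an error, which is how a metric's validity as an error predictor can be quantified.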
GIS-based spatial regression and prediction of water quality in river networks: A case study in Iowa
Yang, X.; Jin, W.
2010-01-01
Nonpoint source pollution is the leading cause of water quality problems in the United States. One important component of nonpoint source pollution control is an understanding of what and how watershed-scale conditions influence ambient water quality. This paper investigated the use of spatial regression to evaluate the impacts of watershed characteristics on stream NO3+NO2-N concentration in the Cedar River Watershed, Iowa. An Arc Hydro geodatabase was constructed to organize various datasets on the watershed. Spatial regression models were developed to evaluate the impacts of watershed characteristics on stream NO3+NO2-N concentration and predict NO3+NO2-N concentration at unmonitored locations. Unlike the traditional ordinary least squares (OLS) method, the spatial regression method incorporates the potential spatial correlation among the observations in its coefficient estimation. Study results show that NO3+NO2-N observations in the Cedar River Watershed are spatially correlated, and by ignoring the spatial correlation, the OLS method tends to over-estimate the impacts of watershed characteristics on stream NO3+NO2-N concentration. In conjunction with kriging, the spatial regression method not only makes better stream NO3+NO2-N concentration predictions than the OLS method, but also gives estimates of the uncertainty of the predictions, which provides useful information for optimizing the design of the stream monitoring network. It is a promising tool for better managing and controlling nonpoint source pollution. © 2010 Elsevier Ltd.
NASA Technical Reports Server (NTRS)
Kalton, G.
1983-01-01
A number of surveys were conducted to study the relationship between the level of aircraft or traffic noise exposure experienced by people living in a particular area and their annoyance with it. These surveys generally employ a clustered sample design which affects the precision of the survey estimates. Regression analysis of annoyance on noise measures and other variables is often an important component of the survey analysis. Formulae are presented for estimating the standard errors of regression coefficients and ratio of regression coefficients that are applicable with a two- or three-stage clustered sample design. Using a simple cost function, they also determine the optimum allocation of the sample across the stages of the sample design for the estimation of a regression coefficient.
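The effect of clustering on the precision of such survey estimates is usually summarized by the design effect. A minimal sketch, with illustrative numbers for the cluster size and intracluster correlation:

```python
import math

# Design effect for a clustered sample: with m respondents per cluster and
# intracluster correlation rho, variances are inflated by 1 + (m - 1) * rho
# relative to simple random sampling of the same total size.
def design_effect(m, rho):
    return 1 + (m - 1) * rho

srs_se = 0.05                        # hypothetical SE under simple random sampling
deff = design_effect(m=10, rho=0.1)  # 10 interviews per area, modest clustering
clustered_se = srs_se * math.sqrt(deff)
print(round(deff, 2), round(clustered_se, 3))  # 1.9 0.069
```

Balancing this variance inflation against the lower cost of interviewing within clusters is what drives the optimum allocation across sampling stages.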
Strand, Matthew; Sillau, Stefan; Grunwald, Gary K; Rabinovitch, Nathan
2014-02-10
Regression calibration provides a way to obtain unbiased estimators of fixed effects in regression models when one or more predictors are measured with error. Recent development of measurement error methods has focused on models that include interaction terms between measured-with-error predictors, and separately, methods for estimation in models that account for correlated data. In this work, we derive explicit and novel forms of regression calibration estimators and associated asymptotic variances for longitudinal models that include interaction terms, when data from instrumental and unbiased surrogate variables are available but not the actual predictors of interest. The longitudinal data are fit using linear mixed models that contain random intercepts and account for serial correlation and unequally spaced observations. The motivating application involves a longitudinal study of exposure to two pollutants (predictors) - outdoor fine particulate matter and cigarette smoke - and their association in interactive form with levels of a biomarker of inflammation, leukotriene E4 (LTE4, outcome), in asthmatic children. Because the exposure concentrations could not be directly observed, we used measurements from a fixed outdoor monitor and urinary cotinine concentrations as instrumental variables, and we used concentrations of fine ambient particulate matter and cigarette smoke measured with error by personal monitors as unbiased surrogate variables. We applied the derived regression calibration methods to estimate coefficients of the unobserved predictors and their interaction, allowing for direct comparison of toxicity of the different pollutants. We used simulations to verify accuracy of inferential methods based on asymptotic theory. Copyright © 2013 John Wiley & Sons, Ltd.
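The regression-calibration idea behind the paper's longitudinal estimators can be sketched in its simplest cross-sectional, single-predictor form: replace the error-prone surrogate W with E[X | W] before fitting the outcome model. The calibration slope is treated as known below for illustration; in practice it is estimated from replicates or instrumental variables, and the paper derives the longitudinal and interaction extensions:

```python
import numpy as np

# Simulated data: unobserved true exposure x, unbiased surrogate w, outcome y.
rng = np.random.default_rng(8)
n = 50_000
x = rng.normal(size=n)                 # true exposure (unobserved in practice)
w = x + rng.normal(size=n)             # surrogate with reliability 0.5
y = 2.0 * x + rng.normal(size=n)       # outcome model with beta = 2

naive = (w @ y) / (w @ w)              # regressing y on w: attenuated toward 0
lam = 0.5                              # Var(X)/Var(W) for this simulation
x_hat = lam * w                        # E[X | W] under joint normality
calibrated = (x_hat @ y) / (x_hat @ x_hat)
print(round(naive, 2), round(calibrated, 2))   # roughly 1.0 vs 2.0
```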
Spatiotemporal integration for tactile localization during arm movements: a probabilistic approach.
Maij, Femke; Wing, Alan M; Medendorp, W Pieter
2013-12-01
It has been shown that people make systematic errors in the localization of a brief tactile stimulus that is delivered to the index finger while they are making an arm movement. Here we modeled these spatial errors with a probabilistic approach, assuming that they follow from temporal uncertainty about the occurrence of the stimulus. In the model, this temporal uncertainty converts into a spatial likelihood about the external stimulus location, depending on arm velocity. We tested the prediction of the model that the localization errors depend on arm velocity. Participants (n = 8) were instructed to localize a tactile stimulus that was presented to their index finger while they were making either slow- or fast-targeted arm movements. Our results confirm the model's prediction that participants make larger localization errors when making faster arm movements. The model, which was used to fit the errors for both slow and fast arm movements simultaneously, accounted very well for all the characteristics of these data with temporal uncertainty in stimulus processing as the only free parameter. We conclude that spatial errors in dynamic tactile perception stem from the temporal precision with which tactile inputs are processed.
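The model's core conversion can be sketched in a few lines: temporal uncertainty about when the stimulus occurred maps into spatial uncertainty about where the finger was, scaled by arm velocity, so faster movements produce larger localization errors. The numbers below are illustrative, not the fitted parameter:

```python
# Spatial SD (m) of the localization likelihood for a given arm velocity (m/s),
# given temporal uncertainty sigma_t (s) about the stimulus occurrence.
def spatial_sigma(velocity, sigma_t=0.05):
    return velocity * sigma_t

slow, fast = spatial_sigma(0.3), spatial_sigma(1.2)
print(f"slow: {slow * 100:.1f} cm, fast: {fast * 100:.1f} cm")  # 1.5 cm vs 6.0 cm
```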
Benefit transfer and spatial heterogeneity of preferences for water quality improvements.
Martin-Ortega, J; Brouwer, R; Ojea, E; Berbel, J
2012-09-15
The improvement in water quality resulting from the implementation of the EU Water Framework Directive is expected to generate substantial non-market benefits. A widespread estimation of these benefits across Europe will require the application of benefit transfer. We use a spatially explicit valuation design to account for the spatial heterogeneity of preferences to help generate lower transfer errors. A map-based choice experiment is applied in the Guadalquivir River Basin (Spain), accounting simultaneously for the spatial distribution of water quality improvements and beneficiaries. Our results show that accounting for the spatial heterogeneity of preferences generally produces lower transfer errors. Copyright © 2012 Elsevier Ltd. All rights reserved.
Testing a Dynamic Field Account of Interactions between Spatial Attention and Spatial Working Memory
Johnson, Jeffrey S.; Spencer, John P.
2016-01-01
Studies examining the relationship between spatial attention and spatial working memory (SWM) have shown that discrimination responses are faster for targets appearing at locations that are being maintained in SWM, and that location memory is impaired when attention is withdrawn during the delay. These observations support the proposal that sustained attention is required for successful retention in SWM: if attention is withdrawn, memory representations are likely to fail, increasing errors. In the present study, this proposal is reexamined in light of a neural process model of SWM. On the basis of the model's functioning, we propose an alternative explanation for the observed decline in SWM performance when a secondary task is performed during retention: SWM representations drift systematically toward the location of targets appearing during the delay. To test this explanation, participants completed a color-discrimination task during the delay interval of a spatial recall task. In the critical shifting attention condition, the color stimulus could appear either toward or away from the memorized location relative to a midline reference axis. We hypothesized that if shifting attention during the delay leads to the failure of SWM representations, there should be an increase in the variance of recall errors but no change in directional error, regardless of the direction of the shift. Conversely, if shifting attention induces drift of SWM representations—as predicted by the model—there should be systematic changes in the pattern of spatial recall errors depending on the direction of the shift. Results were consistent with the latter possibility—recall errors were biased toward the location of discrimination targets appearing during the delay. PMID:26810574
Hendricks, Brian; Mark-Carew, Miguella; Conley, Jamison
2017-11-13
Domestic dogs and cats are potentially effective sentinel populations for monitoring occurrence and spread of Lyme disease. Few studies have evaluated the public health utility of sentinel programmes using geo-analytic approaches. Confirmed Lyme disease cases diagnosed by physicians and ticks submitted by veterinarians to the West Virginia State Health Department were obtained for 2014-2016. Ticks were identified to species, and only Ixodes scapularis were incorporated in the analysis. Separate ordinary least squares (OLS) and spatial lag regression models were conducted to estimate the association between average numbers of Ix. scapularis collected on pets and human Lyme disease incidence. Regression residuals were visualised using Local Moran's I as a diagnostic tool to identify spatial dependence. Statistically significant associations were identified between average numbers of Ix. scapularis collected from dogs and human Lyme disease in the OLS (β=20.7, P<0.001) and spatial lag (β=12.0, P=0.002) regressions. No significant associations were identified for cats in either regression model. Statistically significant (P≤0.05) spatial dependence was identified in all regression models. Local Moran's I maps produced for spatial lag regression residuals indicated a decrease in model over- and under-estimation, but identified a higher number of statistically significant outliers than OLS regression. Results support previous conclusions that dogs are effective sentinel populations for monitoring risk of human exposure to Lyme disease. Findings reinforce the utility of spatial analysis of surveillance data, and highlight West Virginia's unique position within the eastern United States with regard to Lyme disease occurrence.
Evaluation of normalization methods for cDNA microarray data by k-NN classification
Wu, Wei; Xing, Eric P; Myers, Connie; Mian, I Saira; Bissell, Mina J
2005-01-01
Background: Non-biological factors give rise to unwanted variations in cDNA microarray data. There are many normalization methods designed to remove such variations. However, to date there have been few published systematic evaluations of these techniques for removing variations arising from dye biases in the context of downstream, higher-order analytical tasks such as classification. Results: Ten location normalization methods that adjust spatial- and/or intensity-dependent dye biases, and three scale methods that adjust scale differences, were applied, individually and in combination, to five distinct, published, cancer biology-related cDNA microarray data sets. Leave-one-out cross-validation (LOOCV) classification error was employed as the quantitative end-point for assessing the effectiveness of a normalization method. In particular, a known classifier, k-nearest neighbor (k-NN), was estimated from data normalized using a given technique, and the LOOCV error rate of the ensuing model was computed. We found that k-NN classifiers are sensitive to dye biases in the data. Using NONRM and GMEDIAN as baseline methods, our results show that single-bias-removal techniques, which remove either spatial-dependent dye bias (referred to hereafter as the spatial effect) or intensity-dependent dye bias (referred to hereafter as the intensity effect), moderately reduce LOOCV classification errors, whereas double-bias-removal techniques, which remove both the spatial and intensity effects, reduce LOOCV classification errors even further. Of the 41 different strategies examined, three two-step processes, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, all of which removed the intensity effect globally and the spatial effect locally, appear to reduce LOOCV classification errors most consistently and effectively across all data sets. We also found that the investigated scale normalization methods do not reduce LOOCV classification error.
Conclusion: Using LOOCV error of k-NNs as the evaluation criterion, three double-bias-removal normalization strategies, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, outperform the other strategies for removing the spatial effect, intensity effect and scale differences from cDNA microarray data. The apparent sensitivity of k-NN LOOCV classification error to dye biases suggests that this criterion provides an informative measure for evaluating normalization methods. All the computational tools used in this study were implemented using the R language for statistical computing and graphics. PMID:16045803
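The evaluation loop used as the end-point above can be sketched as a minimal numpy k-NN with leave-one-out cross-validation. The two-class Gaussian data below stand in for a normalized expression matrix and sample labels:

```python
import numpy as np

# LOOCV error of a k-NN classifier: for each sample, predict its label from
# the k nearest remaining samples and count disagreements.
def loocv_knn_error(X, y, k=3):
    n = len(y)
    errors = 0
    for i in range(n):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                       # leave sample i out
        neighbors = y[np.argsort(d)[:k]]
        pred = np.bincount(neighbors).argmax()
        errors += pred != y[i]
    return errors / n

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 1, (30, 5)), rng.normal(3, 1, (30, 5))])
y = np.repeat([0, 1], 30)
err = loocv_knn_error(X, y)
print(err)   # well-separated classes give a low error rate
```

In the study's setting, a normalization method that better removes dye bias should yield a lower value of this error.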
NASA Astrophysics Data System (ADS)
Xu, Yadong; Serre, Marc L.; Reyes, Jeanette M.; Vizuete, William
2017-10-01
We have developed a Bayesian Maximum Entropy (BME) framework that integrates observations from a surface monitoring network and predictions from a Chemical Transport Model (CTM) to create improved exposure estimates that can be resolved into any spatial and temporal resolution. The flexibility of the framework allows for input of data in any choice of time scales and CTM predictions of any spatial resolution, with varying associated degrees of estimation error and cost in terms of implementation and computation. This study quantifies the impact of these choices on exposure estimation error by first comparing estimation errors when BME relied on ozone concentration data either as an hourly average, the daily maximum 8-h average (DM8A), or the daily 24-h average (D24A). Our analysis found that the use of DM8A and D24A data, although less computationally intensive, reduced estimation error more when compared to the use of hourly data. This was primarily due to the poorer CTM model performance in the hourly average predicted ozone. Our second analysis compared spatial variability and estimation errors when BME relied on CTM predictions with a grid cell resolution of 12 × 12 km² versus a coarser resolution of 36 × 36 km². Our analysis found that integrating the finer grid resolution CTM predictions not only reduced estimation error, but also increased the spatial variability in daily ozone estimates by 5 times. This improvement was due to the improved spatial gradients and model performance found in the finer resolved CTM simulation. The integration of observational and model predictions that is permitted in a BME framework continues to be a powerful approach for improving exposure estimates of ambient air pollution. The results of this analysis demonstrate the importance of also understanding model performance variability and its implications on exposure error.
Estimating standard errors in feature network models.
Frank, Laurence E; Heiser, Willem J
2007-05-01
Feature network models are graphical structures that represent proximity data in a discrete space while using the same formalism that is the basis of least squares methods employed in multidimensional scaling. Existing methods to derive a network model from empirical data only give the best-fitting network and yield no standard errors for the parameter estimates. The additivity properties of networks make it possible to consider the model as a univariate (multiple) linear regression problem with positivity restrictions on the parameters. In the present study, both theoretical and empirical standard errors are obtained for the constrained regression parameters of a network model with known features. The performance of both types of standard error is evaluated using Monte Carlo techniques.
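The key reduction described above, that with known features the network model becomes linear regression with positivity restrictions on the parameters, can be sketched with non-negative least squares. The feature incidence matrix and weights below are invented for illustration:

```python
import numpy as np
from scipy.optimize import nnls

# Invented data: a binary feature incidence matrix F and non-negative feature
# weights; observed "proximities" d are a noisy additive combination.
rng = np.random.default_rng(3)
F = rng.integers(0, 2, size=(40, 5)).astype(float)
true_w = np.array([0.0, 1.2, 0.4, 0.0, 2.0])
d = F @ true_w + rng.normal(scale=0.05, size=40)

# Non-negative least squares enforces the positivity restriction directly.
w_hat, resid = nnls(F, d)
print(np.round(w_hat, 1))   # close to true_w, all entries >= 0
```

Standard errors for these constrained estimates are what the paper derives, since off-the-shelf regression output does not account for the positivity restriction.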
NASA Technical Reports Server (NTRS)
Chen, Chien-Chung; Gardner, Chester S.
1989-01-01
Given the rms transmitter pointing error and the desired probability of bit error (PBE), it can be shown that an optimal transmitter antenna gain exists which minimizes the required transmitter power. Given the rms local oscillator tracking error, an optimum receiver antenna gain can be found which optimizes the receiver performance. The impact of pointing and tracking errors on the design of direct-detection pulse-position modulation (PPM) and heterodyne noncoherent frequency-shift keying (NCFSK) systems is then analyzed in terms of constraints on the antenna size and the power penalty incurred. It is shown that in the limit of large spatial tracking errors, the advantage in receiver sensitivity for the heterodyne system is quickly offset by the smaller antenna gain and the higher power penalty due to tracking errors. In contrast, for systems with small spatial tracking errors, the heterodyne system is superior because of its higher receiver sensitivity.
NASA Astrophysics Data System (ADS)
Merker, Claire; Ament, Felix; Clemens, Marco
2017-04-01
The quantification of measurement uncertainty for rain radar data remains challenging. Radar reflectivity measurements are affected, amongst other things, by calibration errors, noise, blocking and clutter, and attenuation. Their combined impact on measurement accuracy is difficult to quantify due to incomplete process understanding and complex interdependencies. An improved quality assessment of rain radar measurements is of interest for applications both in meteorology and hydrology, for example for precipitation ensemble generation, rainfall runoff simulations, or in data assimilation for numerical weather prediction. Especially a detailed description of the spatial and temporal structure of errors is beneficial in order to make best use of the areal precipitation information provided by radars. Radar precipitation ensembles are one promising approach to represent spatially variable radar measurement errors. We present a method combining ensemble radar precipitation nowcasting with data assimilation to estimate radar measurement uncertainty at each pixel. This combination of ensemble forecast and observation yields a consistent spatial and temporal evolution of the radar error field. We use an advection-based nowcasting method to generate an ensemble reflectivity forecast from initial data of a rain radar network. Subsequently, reflectivity data from single radars is assimilated into the forecast using the Local Ensemble Transform Kalman Filter. The spread of the resulting analysis ensemble provides a flow-dependent, spatially and temporally correlated reflectivity error estimate at each pixel. We will present first case studies that illustrate the method using data from a high-resolution X-band radar network.
Liu, Yan; Salvendy, Gavriel
2009-05-01
This paper aims to demonstrate the effects of measurement errors on psychometric measurements in ergonomics studies. A variety of sources can cause random measurement errors in ergonomics studies, and these errors can distort virtually every statistic computed and lead investigators to erroneous conclusions. The effects of measurement errors on the five most widely used statistical analysis tools are discussed and illustrated: correlation; ANOVA; linear regression; factor analysis; linear discriminant analysis. It has been shown that measurement errors can greatly attenuate correlations between variables, reduce the statistical power of ANOVA, distort (overestimate, underestimate or even change the sign of) regression coefficients, underrate the explanation contributions of the most important factors in factor analysis, and depreciate the significance of the discriminant function and the discrimination abilities of individual variables in discriminant analysis. The discussion is restricted to subjective scales and survey methods and their reliability estimates. Other methods applied in ergonomics research, such as physical and electrophysiological measurements and chemical and biomedical analysis methods, also have issues of measurement errors, but they are beyond the scope of this paper. As there has been increasing interest in the development and testing of theories in ergonomics research, it has become very important for ergonomics researchers to understand the effects of measurement errors on their experiment results, which the authors believe is critical to research progress in theory development and cumulative knowledge in the ergonomics field.
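The attenuation effect mentioned first in that list is easy to demonstrate: random measurement error shrinks an observed correlation toward zero by the square root of the product of the two reliabilities. A simulation with reliability 0.5 on both variables:

```python
import numpy as np

# True correlation between x and y is 0.8; adding unit-variance noise to each
# halves both reliabilities, so the observed correlation should be about
# 0.8 * sqrt(0.5 * 0.5) = 0.4.
rng = np.random.default_rng(4)
n = 200_000
x = rng.normal(size=n)
y = 0.8 * x + 0.6 * rng.normal(size=n)   # Var(y) = 1, corr(x, y) = 0.8
x_obs = x + rng.normal(size=n)           # reliability 0.5
y_obs = y + rng.normal(size=n)           # reliability 0.5

r_true = np.corrcoef(x, y)[0, 1]
r_obs = np.corrcoef(x_obs, y_obs)[0, 1]
print(round(r_true, 2), round(r_obs, 2))   # about 0.80 vs about 0.40
```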
A method to map errors in the deformable registration of 4DCT images
Vaman, Constantin; Staub, David; Williamson, Jeffrey; Murphy, Martin J.
2010-01-01
Purpose: To present a new approach to the problem of estimating errors in deformable image registration (DIR) applied to sequential phases of a 4DCT data set. Methods: A set of displacement vector fields (DVFs) are made by registering a sequence of 4DCT phases. The DVFs are assumed to display anatomical movement, with the addition of errors due to the imaging and registration processes. The positions of physical landmarks in each CT phase are measured as ground truth for the physical movement in the DVF. Principal component analysis of the DVFs and the landmarks is used to identify and separate the eigenmodes of physical movement from the error eigenmodes. By subtracting the physical modes from the principal components of the DVFs, the registration errors are exposed and reconstructed as DIR error maps. The method is demonstrated via a simple numerical model of 4DCT DVFs that combines breathing movement with simulated maps of spatially correlated DIR errors. Results: The principal components of the simulated DVFs were observed to share the basic properties of principal components for actual 4DCT data. The simulated error maps were accurately recovered by the estimation method. Conclusions: Deformable image registration errors can have complex spatial distributions. Consequently, point-by-point landmark validation can give unrepresentative results that do not accurately reflect the registration uncertainties away from the landmarks. The authors are developing a method for mapping the complete spatial distribution of DIR errors using only a small number of ground truth validation landmarks. PMID:21158288
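The separation step at the heart of the method can be sketched with a toy example: simulated displacement fields mix one "breathing" mode with spatially distributed error, and PCA (via an SVD) isolates the dominant physical mode so that subtracting it leaves an error map. All dimensions and signal levels below are invented:

```python
import numpy as np

# Toy DVF matrix: n_phases breathing phases by n_pts displacement components.
rng = np.random.default_rng(5)
n_pts, n_phases = 200, 10
breathing = np.sin(np.linspace(0, 2 * np.pi, n_phases))   # phase weights
phys_mode = rng.normal(size=n_pts)                        # spatial pattern
error = 0.1 * rng.normal(size=(n_phases, n_pts))          # registration error
dvfs = np.outer(breathing, phys_mode) + error

# PCA via SVD of the mean-centered DVF matrix; the leading component
# captures the physical movement, the remainder the error modes.
centered = dvfs - dvfs.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
phys_recon = np.outer(U[:, 0] * S[0], Vt[0])
error_map = centered - phys_recon
print(S[0] > 5 * S[1])   # the physical mode dominates the error modes
```

In the actual method, landmark measurements identify which eigenmodes are physical rather than relying on the dominance ordering used in this toy.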
Spatial Autocorrelation Approaches to Testing Residuals from Least Squares Regression
Chen, Yanguang
2016-01-01
In geo-statistics, the Durbin-Watson test is frequently employed to detect the presence of residual serial correlation from least squares regression analyses. However, the Durbin-Watson statistic is only suitable for ordered time or spatial series. If the variables comprise cross-sectional data coming from spatial random sampling, the test will be ineffectual because the value of the Durbin-Watson statistic depends on the sequence of data points. This paper develops two new statistics for testing the serial correlation of residuals from least squares regression based on spatial samples. By analogy with the new form of Moran's index, an autocorrelation coefficient is defined with a standardized residual vector and a normalized spatial weight matrix. Then, by analogy with the Durbin-Watson statistic, two types of new serial correlation indices are constructed. As a case study, the two newly presented statistics are applied to a spatial sample of 29 Chinese regions. The results show that the new spatial autocorrelation models can be used to test the serial correlation of residuals from regression analysis. In practice, the new statistics can make up for the deficiencies of the Durbin-Watson test. PMID:26800271
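The statistic family described above can be sketched with a Moran-type autocorrelation coefficient for residuals, e'We / e'e, built from a centered residual vector e and a row-normalized spatial weight matrix W. The ring neighbor structure and residual series below are invented for illustration:

```python
import numpy as np

# Moran-type autocorrelation of residuals against a spatial weight matrix.
def morans_i(residuals, W):
    e = residuals - residuals.mean()
    return (e @ W @ e) / (e @ e)

n = 200
W = np.zeros((n, n))
for i in range(n):                        # each unit has two neighbors
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5

rng = np.random.default_rng(6)
iid = rng.normal(size=n)                  # spatially uncorrelated residuals
smooth = np.cumsum(rng.normal(size=n))    # strongly correlated residuals
print(round(morans_i(iid, W), 2), round(morans_i(smooth, W), 2))
```

Uncorrelated residuals give a value near zero, while spatially smooth residuals push the statistic toward one, regardless of how the sample points are ordered.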
NASA Astrophysics Data System (ADS)
Jiang, YuXiao; Guo, PengLiang; Gao, ChengYan; Wang, HaiBo; Alzahrani, Faris; Hobiny, Aatef; Deng, FuGuo
2017-12-01
We present an original self-error-rejecting photonic qubit transmission scheme for both the polarization and spatial states of photon systems transmitted over collective noise channels. In our scheme, we use simple linear-optical elements, including half-wave plates, 50:50 beam splitters, and polarization beam splitters, to convert spatial-polarization modes into different time bins. By using postselection in different time bins, the success probability of obtaining the uncorrupted states approaches 1/4 for single-photon transmission, which is not influenced by the coefficients of noisy channels. Our self-error-rejecting transmission scheme can be generalized to hyperentangled n-photon systems and is useful in practical high-capacity quantum communications with photon systems in two degrees of freedom.
NASA Technical Reports Server (NTRS)
Mobasseri, B. G.; Mcgillem, C. D.; Anuta, P. E. (Principal Investigator)
1978-01-01
The author has identified the following significant results. The probability of correct classification of various populations in the data was defined as the primary performance index. Because the multispectral data are also multiclass in nature, a Bayes error estimation procedure that depends on a set of class statistics alone was required. The classification error was expressed in terms of an N-dimensional integral, where N was the dimensionality of the feature space. The multispectral scanner spatial model was represented by a linear shift-invariant multiple-port system in which the N spectral bands comprised the input processes. The scanner characteristic function, the relationship governing the transformation of the input spatial, and hence spectral, correlation matrices through the system, was developed.
Trans-dimensional joint inversion of seabed scattering and reflection data.
Steininger, Gavin; Dettmer, Jan; Dosso, Stan E; Holland, Charles W
2013-03-01
This paper examines joint inversion of acoustic scattering and reflection data to resolve seabed interface roughness parameters (spectral strength, exponent, and cutoff) and geoacoustic profiles. Trans-dimensional (trans-D) Bayesian sampling is applied with both the number of sediment layers and the order (zeroth or first) of auto-regressive parameters in the error model treated as unknowns. A prior distribution that allows fluid sediment layers over an elastic basement in a trans-D inversion is derived and implemented. Three cases are considered: Scattering-only inversion, joint scattering and reflection inversion, and joint inversion with the trans-D auto-regressive error model. Including reflection data improves the resolution of scattering and geoacoustic parameters. The trans-D auto-regressive model further improves scattering resolution and correctly differentiates between strongly and weakly correlated residual errors.
Satellite derived bathymetry: mapping the Irish coastline
NASA Astrophysics Data System (ADS)
Monteys, X.; Cahalane, C.; Harris, P.; Hanafin, J.
2017-12-01
Ireland has a varied coastline in excess of 3000 km in length, largely characterized by extended shallow environments. The coastal shallow water zone can be a challenging and costly environment in which to acquire bathymetry and other oceanographic data using traditional survey methods or airborne LiDAR techniques, as demonstrated in the Irish INFOMAR program. Thus, large coastal areas in Ireland, and much of the coastal zone worldwide, remain unmapped by modern techniques and are poorly understood. Earth Observation (EO) missions are currently being used to derive timely, cost-effective, and quality-controlled information for mapping and monitoring coastal environments. Different wavelengths of solar light penetrate the water column to different depths and are routinely sensed by EO satellites. A large selection of multispectral (MS) imagery from many platforms was examined, as well as imagery from small aircraft and drones. A number of bays representing very different coastal environments were explored in turn. The project's workflow builds a catalogue of satellite and field bathymetric data to assess the suitability of imagery captured at a range of spatial, spectral and temporal resolutions. Turbidity indices are derived from the multispectral information. Finally, a number of spatial regression models using water-leaving radiance parameters and field calibration data are examined. Our assessment reveals that spatial regression algorithms have the potential to significantly improve the accuracy of the predictions up to 10 m water depth (WD) and offer a better handle on the error and uncertainty budget. The four spatial models investigated show better adjustments than the basic non-spatial model. Accuracy of the predictions is better than 10% of WD at 95% confidence. Future work will focus on improving the accuracy of the predictions by incorporating an analytical model in conjunction with improved empirical methods.
The recently launched ESA Sentinel 2 will become the primary focus of study. Satellite bathymetry and coastal mapping products, and remarkably, their repeatability over time, can offer solutions to important coastal zone management issues and address key challenges in the critical line between shoreline changes and human activity, particularly in the light of future climate change scenarios.
NASA Astrophysics Data System (ADS)
Monteys, Xavier; Harris, Paul; Caloca, Silvia
2014-05-01
The coastal shallow water zone can be a challenging and expensive environment within which to acquire bathymetry and other oceanographic data using traditional survey methods. Dangers and limited swath coverage make some of these areas unfeasible to survey using ship-borne systems, and turbidity can preclude marine LiDAR. As a result, an extensive part of the coastline worldwide remains completely unmapped. Satellite EO multispectral data, after processing, allow timely, cost-efficient and quality-controlled information to be used for planning, monitoring, and regulating coastal environments. They have the potential to deliver repetitive derivation of medium resolution bathymetry, coastal water properties and seafloor characteristics in shallow waters. Over the last 30 years, satellite passive imaging methods for bathymetry extraction, implementing analytical or empirical methods, have had limited success predicting water depths. Different wavelengths of solar light penetrate the water column to varying depths. They can provide acceptable results up to 20 m but become less accurate in deeper waters. The study area is located in the inner part of Dublin Bay, on the east coast of Ireland. The region investigated is a C-shaped inlet covering an area 10 km long and 5 km wide, with water depths ranging from 0 to 10 m. The methodology employed in this research uses a ratio of reflectances from SPOT 5 satellite bands, differing from standard linear transform algorithms. High accuracy water depths were derived from multibeam data. The final empirical model uses spatially weighted geographical tools to retrieve predicted depths. The results of this paper confirm that SPOT satellite scenes are suitable for predicting depths using empirical models in very shallow embayments. Spatial regression models show better adjustments in the predictions over non-spatial models.
The spatial regression equation used provides realistic results down to 6 m below the water surface, with reliable and error controlled depths. Bathymetric extraction approaches involving satellite imagery data are regarded as a fast, successful and economically advantageous solution to automatic water depth calculation in shallow and complex environments.
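The band-ratio depth retrieval these two records describe can be sketched in the style of the widely used Stumpf log-ratio algorithm. The coefficients m1, m0 and the scaling constant n below are placeholders, not the calibration fitted against the multibeam data in the study, and the study's final model additionally applies spatially weighted regression on top of the ratio variable:

```python
import numpy as np

def ratio_depth(r_band_i, r_band_j, m1=25.0, m0=20.0, n=1000.0):
    """Depth from the log-ratio of two water-leaving reflectance bands.

    Deeper water attenuates the shorter-wavelength band less, so the
    ratio grows with depth; m1 and m0 are tuned against in-situ depths.
    """
    r_i = np.asarray(r_band_i, dtype=float)
    r_j = np.asarray(r_band_j, dtype=float)
    return m1 * np.log(n * r_i) / np.log(n * r_j) - m0
```

With equal reflectance in both bands the log-ratio is 1, so the predicted depth reduces to m1 - m0; the calibration data pin down the slope and offset.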
Holsclaw, Tracy; Hallgren, Kevin A; Steyvers, Mark; Smyth, Padhraic; Atkins, David C
2015-12-01
Behavioral coding is increasingly used for studying mechanisms of change in psychosocial treatments for substance use disorders (SUDs). However, behavioral coding data typically include features that can be problematic in regression analyses, including measurement error in independent variables, non-normal distributions of count outcome variables, and conflation of predictor and outcome variables with third variables, such as session length. Methodological research in econometrics has shown that these issues can lead to biased parameter estimates, inaccurate standard errors, and increased Type I and Type II error rates, yet these statistical issues are not widely known within SUD treatment research, or more generally, within psychotherapy coding research. Using minimally technical language intended for a broad audience of SUD treatment researchers, the present paper illustrates the nature in which these data issues are problematic. We draw on real-world data and simulation-based examples to illustrate how these data features can bias estimation of parameters and interpretation of models. A weighted negative binomial regression is introduced as an alternative to ordinary linear regression that appropriately addresses the data characteristics common to SUD treatment behavioral coding data. We conclude by demonstrating how to use and interpret these models with data from a study of motivational interviewing. SPSS and R syntax for weighted negative binomial regression models is included in online supplemental materials. (c) 2016 APA, all rights reserved.
Holsclaw, Tracy; Hallgren, Kevin A.; Steyvers, Mark; Smyth, Padhraic; Atkins, David C.
2015-01-01
Behavioral coding is increasingly used for studying mechanisms of change in psychosocial treatments for substance use disorders (SUDs). However, behavioral coding data typically include features that can be problematic in regression analyses, including measurement error in independent variables, non-normal distributions of count outcome variables, and conflation of predictor and outcome variables with third variables, such as session length. Methodological research in econometrics has shown that these issues can lead to biased parameter estimates, inaccurate standard errors, and increased type-I and type-II error rates, yet these statistical issues are not widely known within SUD treatment research, or more generally, within psychotherapy coding research. Using minimally-technical language intended for a broad audience of SUD treatment researchers, the present paper illustrates the nature in which these data issues are problematic. We draw on real-world data and simulation-based examples to illustrate how these data features can bias estimation of parameters and interpretation of models. A weighted negative binomial regression is introduced as an alternative to ordinary linear regression that appropriately addresses the data characteristics common to SUD treatment behavioral coding data. We conclude by demonstrating how to use and interpret these models with data from a study of motivational interviewing. SPSS and R syntax for weighted negative binomial regression models is included in supplementary materials. PMID:26098126
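As a rough illustration of the weighted negative binomial regression these two records advocate, the sketch below fits an NB2 model by maximum likelihood with per-observation weights. The fixed dispersion and the weighting scheme are simplifying assumptions for illustration; the paper's own implementation is the SPSS and R syntax in its supplemental materials:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

def weighted_nb2(X, y, weights=None, alpha=1.0):
    """Weighted negative binomial (NB2) regression via maximum likelihood.

    X: (n, p) predictors (intercept added internally); y: (n,) counts;
    weights: per-observation weights (e.g. adjusting for session length);
    alpha: fixed NB2 dispersion. Returns the coefficient vector.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    r = 1.0 / alpha                           # NB "size" parameter

    def nll(beta):
        eta = np.clip(Xd @ beta, -20, 20)     # guard against overflow
        mu = np.exp(eta)
        ll = (gammaln(y + r) - gammaln(r) - gammaln(y + 1)
              + r * np.log(r / (r + mu)) + y * np.log(mu / (r + mu)))
        return -np.sum(w * ll)

    return minimize(nll, np.zeros(Xd.shape[1]), method="Nelder-Mead").x
```

Unlike ordinary linear regression, the log link and NB variance keep predictions non-negative and tolerate the overdispersion typical of count outcomes.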
Jackman, Patrick; Sun, Da-Wen; Elmasry, Gamal
2012-08-01
A new algorithm for the conversion of device dependent RGB colour data into device independent L*a*b* colour data without introducing noticeable error has been developed. By combining a linear colour space transform and advanced multiple regression methodologies it was possible to predict L*a*b* colour data with less than 2.2 colour units of error (CIE 1976). By transforming the red, green and blue colour components into new variables that better reflect the structure of the L*a*b* colour space, a low colour calibration error was immediately achieved (ΔE(CAL) = 14.1). Application of a range of regression models on the data further reduced the colour calibration error substantially (multilinear regression ΔE(CAL) = 5.4; response surface ΔE(CAL) = 2.9; PLSR ΔE(CAL) = 2.6; LASSO regression ΔE(CAL) = 2.1). Only the PLSR models deteriorated substantially under cross validation. The algorithm is adaptable and can be easily recalibrated to any working computer vision system. The algorithm was tested on a typical working laboratory computer vision system and delivered only a very marginal loss of colour information ΔE(CAL) = 2.35. Colour features derived on this system were able to safely discriminate between three classes of ham with 100% correct classification whereas colour features measured on a conventional colourimeter were not. Copyright © 2012 Elsevier Ltd. All rights reserved.
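The record above combines a colour-space transform with multiple regression. A generic polynomial-regression calibration from device RGB to L*a*b* can be sketched as follows; the published algorithm uses its own derived variables and model families (multilinear, response surface, PLSR, LASSO), so this is only the general idea, not the paper's method:

```python
import numpy as np

def fit_rgb_to_lab(rgb, lab):
    """Fit a quadratic-feature least squares map from RGB to L*a*b*.

    rgb: (n, 3) device colour samples; lab: (n, 3) reference L*a*b* values
    (e.g. from a colourimeter). Returns a prediction function.
    """
    def phi(c):
        c = np.atleast_2d(np.asarray(c, dtype=float))
        r, g, b = c[:, 0], c[:, 1], c[:, 2]
        return np.column_stack([np.ones_like(r), r, g, b,
                                r * g, r * b, g * b, r * r, g * g, b * b])
    coef, *_ = np.linalg.lstsq(phi(rgb), np.asarray(lab, dtype=float), rcond=None)
    return lambda c: phi(c) @ coef
```

Calibration quality would then be judged by ΔE, the Euclidean distance between predicted and reference L*a*b* values, as in the record's ΔE(CAL) figures.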
NASA Astrophysics Data System (ADS)
Reis, D. S.; Stedinger, J. R.; Martins, E. S.
2005-10-01
This paper develops a Bayesian approach to analysis of a generalized least squares (GLS) regression model for regional analyses of hydrologic data. The new approach allows computation of the posterior distributions of the parameters and the model error variance using a quasi-analytic approach. Two regional skew estimation studies illustrate the value of the Bayesian GLS approach for regional statistical analysis of a shape parameter and demonstrate that regional skew models can be relatively precise with effective record lengths in excess of 60 years. With Bayesian GLS the marginal posterior distribution of the model error variance and the corresponding mean and variance of the parameters can be computed directly, thereby providing a simple but important extension of the regional GLS regression procedures popularized by Tasker and Stedinger (1989), which is sensitive to the likely values of the model error variance when it is small relative to the sampling error in the at-site estimator.
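The classical GLS estimator that the Bayesian approach above extends can be sketched briefly. Here Sigma stands for the sum of the model error variance and the sampling covariance of the at-site estimates; the paper's contribution is to treat the model error variance as uncertain, with a quasi-analytic posterior, rather than plugging in a point value as below:

```python
import numpy as np

def gls_fit(X, y, Sigma):
    """Generalized least squares: beta = (X' S^-1 X)^-1 X' S^-1 y.

    X: (n, p) regional explanatory variables; y: (n,) at-site estimates;
    Sigma: (n, n) total error covariance. Returns (beta, cov_beta).
    """
    Si = np.linalg.inv(Sigma)
    A = X.T @ Si @ X
    beta = np.linalg.solve(A, X.T @ Si @ y)
    return beta, np.linalg.inv(A)   # coefficient estimate and its covariance
```

With Sigma proportional to the identity this reduces to ordinary least squares, which is the sanity check below.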
Francq, Bernard G; Govaerts, Bernadette
2016-06-30
Two main methodologies for assessing equivalence in method-comparison studies are presented separately in the literature. The first one is the well-known and widely applied Bland-Altman approach with its agreement intervals, where two methods are considered interchangeable if their differences are not clinically significant. The second approach is based on errors-in-variables regression in a classical (X,Y) plot and focuses on confidence intervals, whereby two methods are considered equivalent when providing similar measures notwithstanding the random measurement errors. This paper reconciles these two methodologies and shows their similarities and differences using both real data and simulations. A new consistent correlated-errors-in-variables regression is introduced, as the errors are shown to be correlated in the Bland-Altman plot. Indeed, the coverage probabilities collapse and the biases soar when this correlation is ignored. Novel tolerance intervals are compared with agreement intervals with or without replicated data, and novel predictive intervals are introduced to predict a single measure in an (X,Y) plot or in a Bland-Altman plot with excellent coverage probabilities. We conclude that the (correlated-)errors-in-variables regressions should not be avoided in method comparison studies, although the Bland-Altman approach is usually applied to avert their complexity. We argue that tolerance or predictive intervals are better alternatives than agreement intervals, and we provide guidelines for practitioners regarding method comparison studies. Copyright © 2016 John Wiley & Sons, Ltd.
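The classical errors-in-variables estimator underlying the record above is Deming regression, sketched below with error-variance ratio lam = var(err_y)/var(err_x). This is the uncorrelated-errors version; the paper's contribution is a consistent variant that accounts for the error correlation visible in the Bland-Altman plot, which is not modeled here:

```python
import numpy as np

def deming(x, y, lam=1.0):
    """Deming (errors-in-variables) straight-line fit.

    lam is the ratio of the y- and x-error variances; lam=1 gives the
    orthogonal (major axis) fit. Returns (slope, intercept).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sxx = np.var(x)
    syy = np.var(y)
    sxy = np.cov(x, y, bias=True)[0, 1]
    slope = ((syy - lam * sxx
              + np.sqrt((syy - lam * sxx) ** 2 + 4 * lam * sxy ** 2))
             / (2 * sxy))
    return slope, y.mean() - slope * x.mean()
```

Unlike ordinary regression of Y on X, the fit is symmetric in the two measurement methods when lam reflects their true error-variance ratio.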
Hwang, Jae Joon; Kim, Kee-Deog; Park, Hyok; Park, Chang Seo; Jeong, Ho-Gul
2014-01-01
Superimposition has been used as a method to evaluate the changes of orthodontic or orthopedic treatment in the dental field. With the introduction of cone beam CT (CBCT), evaluating three-dimensional changes after treatment by superimposition became possible. Four-point plane orientation is one of the simplest ways to achieve superimposition of three-dimensional images. To find factors influencing the superimposition error of cephalometric landmarks under the four-point plane orientation method, and to evaluate the reproducibility of cephalometric landmarks for analyzing superimposition error, 20 patients were analyzed who had normal skeletal and occlusal relationships and had undergone CBCT for diagnosis of temporomandibular disorder. The nasion, sella turcica, basion and the midpoint between the left and right most posterior points of the lesser wing of the sphenoid bone were used to define a three-dimensional (3D) anatomical reference co-ordinate system. Another 15 reference cephalometric points were also determined three times in the same image. The reorientation error of each landmark could be explained substantially (23%) by a linear regression model consisting of three factors describing the position of each landmark relative to the reference axes and the locating error. The four-point plane orientation system may produce a reorientation error that varies with the perpendicular distance between the landmark and the x-axis; the reorientation error also increases as the locating error and the shift of the reference axes viewed from each landmark increase. Therefore, in order to reduce the reorientation error, the accuracy of all landmarks, including the reference points, is important. Construction of the regression model using reference points of greater precision is required for the clinical application of this model.
Deciphering factors controlling groundwater arsenic spatial variability in Bangladesh
NASA Astrophysics Data System (ADS)
Tan, Z.; Yang, Q.; Zheng, C.; Zheng, Y.
2017-12-01
Elevated concentrations of geogenic arsenic in groundwater have been found in many countries to exceed 10 μg/L, the WHO's guideline value for drinking water. A common yet unexplained characteristic of the spatial distribution of groundwater arsenic is its extensive variability at various spatial scales. This study investigates factors influencing the spatial variability of groundwater arsenic in Bangladesh to improve the accuracy of models predicting the arsenic exceedance rate spatially. A novel boosted regression tree method is used to establish a weak-learner ensemble model, which is compared to a linear model built with a conventional stepwise logistic regression method. Unlike logistic regression, boosted regression tree models can capture interactions among parameters when big datasets are analyzed. The point data set (n=3,538) of groundwater hydrochemistry with 19 parameters was obtained by the British Geological Survey in 2001. The spatial data sets of geological parameters (n=13) were from the Consortium for Spatial Information, the Technical University of Denmark, the University of East Anglia and the FAO, while the soil parameters (n=42) were from the Harmonized World Soil Database. The aforementioned parameters were regressed against categorical groundwater arsenic concentrations below or above three thresholds (5 μg/L, 10 μg/L and 50 μg/L) to identify the respective controlling factors. The boosted regression tree method outperformed the logistic regression method at all three threshold levels in terms of accuracy, specificity and sensitivity, improving the spatial map of the probability of groundwater arsenic exceeding each threshold relative to a disjunctive-kriging-interpolated spatial arsenic map based on the same groundwater arsenic dataset. The boosted regression tree models also show that the most important controlling factors of the groundwater arsenic distribution include groundwater iron content and well depth for all three thresholds.
The probability that a well with iron content higher than 5 mg/L contains more than 5 μg/L, 10 μg/L and 50 μg/L As is estimated to be more than 91%, 85% and 51%, respectively, while the probability that a well from a depth of more than 160 m contains more than 5 μg/L, 10 μg/L and 50 μg/L As is estimated to be less than 38%, 25% and 14%, respectively.
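A toy illustration of why a boosted tree ensemble can beat stepwise logistic regression when interactions drive exceedance, as the record argues. The features are synthetic stand-ins (imagine iron content, well depth, and so on), not the BGS survey data, and the comparison is only a sketch of the model classes involved:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Hypothetical predictors standing in for hydrochemical and geological factors.
X = rng.normal(size=(2000, 4))
# Exceedance driven mainly by an interaction a linear-in-inputs model misses.
logit = 3.0 * X[:, 0] * X[:, 1]
y = rng.random(2000) < 1.0 / (1.0 + np.exp(-logit))

Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
brt = GradientBoostingClassifier(random_state=0).fit(Xtr, ytr)
logreg = LogisticRegression().fit(Xtr, ytr)
print("boosted trees accuracy:", brt.score(Xte, yte))
print("logistic accuracy:", logreg.score(Xte, yte))
```

The trees partition the feature space and so pick up the interaction automatically, while the logistic model would need the interaction term specified by hand.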
[Analysis of intrusion errors in free recall].
Diesfeldt, H F A
2017-06-01
Extra-list intrusion errors during five trials of the eight-word list-learning task of the Amsterdam Dementia Screening Test (ADST) were investigated in 823 consecutive psychogeriatric patients (87.1% suffering from major neurocognitive disorder). Almost half of the participants (45.9%) produced one or more intrusion errors on the verbal recall test. Correct responses were lower when subjects made intrusion errors, but learning slopes did not differ between subjects who committed intrusion errors and those who did not. Bivariate regression analyses revealed that participants who committed intrusion errors were more deficient on measures of eight-word recognition memory, delayed visual recognition and tests of executive control (the Behavioral Dyscontrol Scale and the ADST-Graphical Sequences as measures of response inhibition). Using hierarchical multiple regression, only free recall and delayed visual recognition retained an independent effect in the association with intrusion errors, such that deficient scores on tests of episodic memory were sufficient to explain the occurrence of intrusion errors. Measures of inhibitory control did not add significantly to the explanation of intrusion errors in free recall, which makes insufficient strength of memory traces, rather than a primary deficit in inhibition, the preferred account of intrusion errors in free recall.
NASA Astrophysics Data System (ADS)
Yoo, Jin Woo
In my first essay, I explore Pennsylvania residents' willingness to pay for development of renewable energy technologies such as solar power, wind power, biomass electricity, and other renewable energy using a choice experiment method. Principal component analysis identified three independent attitude components that affect the variation of preference: a desire for renewable energy, a desire for environmental quality, and concern over cost. The results show that urban residents have a higher desire for environmental quality and are less concerned about cost than rural residents, and consequently have a higher willingness to pay to increase renewable energy production. The results of sub-sample analysis show that a representative respondent in rural (urban) Pennsylvania is willing to pay 3.8 (5.9) and 4.1 (5.7)/month for increasing the share of Pennsylvania electricity generated from wind power and other renewable energy by one percentage point, respectively. Mean WTP for solar and biomass electricity was not significantly different from zero. In my second essay, heterogeneity of individual WTP for various renewable energy technologies is investigated using several different variants of the multinomial logit model: a simple MNL with interaction terms, a latent class choice model, a random parameter mixed logit choice model, and a random parameter-latent class choice model. The results of all models consistently show that respondents' preferences for individual renewable technologies are heterogeneous, but the degree of heterogeneity differs across renewable technologies. In general, the random parameter logit model with interactions and a hybrid random parameter logit-latent class model fit better than the other models and better capture respondents' heterogeneity of preference for renewable energy.
The impact of land under agricultural conservation easement (ACE) contracts on the values of nearby residential properties is investigated using housing sales data in two Pennsylvania counties. The spatial-lag (SLM), spatial error (SEM) and spatial error component (SEC) models were compared. A geographically weighted regression (GWR) model is estimated to study the spatial heterogeneity of the marginal implicit prices of the ACE impact within each county. New hybrid spatial hedonic models, the GWR-SEC and a modified GWR-SEM, are estimated so that both spatial autocorrelation and heterogeneity are accounted for. The results show that the coefficient of land under easement contract varies spatially within one county, but not within the other county studied. Also, ACEs are found to have both positive and negative impacts on the values of nearby residential properties. Among the global spatial models, the SEM fit better than the SLM and the SEC. Statistical goodness-of-fit measures showed that the GWR-SEC model fit better than the GWR or the GWR-SEM model. Finally, the GWR-SEC showed that spatial autocorrelation is stronger in one county than in the other.
Accurate motion parameter estimation for colonoscopy tracking using a regression method
NASA Astrophysics Data System (ADS)
Liu, Jianfei; Subramanian, Kalpathi R.; Yoo, Terry S.
2010-03-01
Co-located optical and virtual colonoscopy images have the potential to provide important clinical information during routine colonoscopy procedures. In our earlier work, we presented an optical flow based algorithm to compute egomotion from live colonoscopy video, permitting navigation and visualization of the corresponding patient anatomy. In the original algorithm, motion parameters were estimated using the traditional least sum of squares (LS) procedure, which can be unstable in the context of optical flow vectors with large errors. In the improved algorithm, we use the Least Median of Squares (LMS) method, a robust regression method, for motion parameter estimation. Using the LMS method, we iteratively analyze and converge toward the main distribution of the flow vectors, while disregarding outliers. We show through three experiments the improvement in tracking results obtained using the LMS method in comparison to the LS estimator. The first experiment demonstrates better spatial accuracy in positioning the virtual camera in the sigmoid colon. The second and third experiments demonstrate the robustness of this estimator, resulting in longer tracked sequences: from 300 to 1310 in the ascending colon, and from 410 to 1316 in the transverse colon.
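A minimal sketch of Least Median of Squares fitting, here for a 1-D line rather than the paper's full egomotion parameters. The common randomized scheme below draws candidate two-point lines and keeps the one minimizing the median squared residual, which tolerates up to nearly half the flow vectors being gross outliers:

```python
import numpy as np

def lms_line(x, y, n_trials=500, seed=0):
    """Least Median of Squares line fit via random two-point candidates.

    Returns (slope, intercept) of the candidate line whose median squared
    residual over all points is smallest.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best, best_med = (0.0, float(np.median(y))), np.inf
    for _ in range(n_trials):
        i, j = rng.choice(len(x), size=2, replace=False)
        if x[i] == x[j]:
            continue                                # vertical candidate, skip
        a = (y[j] - y[i]) / (x[j] - x[i])
        b = y[i] - a * x[i]
        med = np.median((y - (a * x + b)) ** 2)     # robust loss
        if med < best_med:
            best, best_med = (a, b), med
    return best
```

Least squares, by contrast, would be dragged toward the outliers, which is the instability the record attributes to the LS egomotion estimates.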
Meng, Xia; Fu, Qingyan; Ma, Zongwei; Chen, Li; Zou, Bin; Zhang, Yan; Xue, Wenbo; Wang, Jinnan; Wang, Dongfang; Kan, Haidong; Liu, Yang
2016-01-01
Development of exposure assessment models is the key component for epidemiological studies concerning air pollution, but the evidence from China is limited. Therefore, a linear mixed effects (LME) model was established in this study in a Chinese metropolis by incorporating aerosol optical depth (AOD), meteorological information and the land use regression (LUR) model to predict ground PM10 levels at high spatiotemporal resolution. The cross validation (CV) R² and RMSE of the LME model were 0.87 and 19.2 μg/m³, respectively. The relative prediction errors (RPE) of the daily and annual mean predicted PM10 concentrations were 19.1% and 7.5%, respectively. This study was the first attempt in China to estimate both short-term and long-term variation of PM10 levels with high spatial resolution in a Chinese metropolis with the LME model. The results suggested that the LME model could provide exposure assessment for short-term and long-term epidemiological studies in China. Copyright © 2015 Elsevier Ltd. All rights reserved.
Infant mortality in Brazil, 1980-2000: A spatial panel data analysis
2012-01-01
Background Infant mortality is an important measure of human development, related to the level of welfare of a society. In order to inform public policy, various studies have tried to identify the factors that influence, at an aggregated level, infant mortality. The objective of this paper is to analyze the regional pattern of infant mortality in Brazil, evaluating the effect of infrastructure, socio-economic, and demographic variables to understand its distribution across the country. Methods Regressions including socio-economic and living conditions variables are conducted in a structure of panel data. More specifically, a spatial panel data model with fixed effects and a spatial error autocorrelation structure is used to help to solve spatial dependence problems. The use of a spatial modeling approach takes into account the potential presence of spillovers between neighboring spatial units. The spatial units considered are Minimum Comparable Areas, defined to provide a consistent definition across Census years. Data are drawn from the 1980, 1991 and 2000 Census of Brazil, and from data collected by the Ministry of Health (DATASUS). In order to identify the influence of health care infrastructure, variables related to the number of public and private hospitals are included. Results The results indicate that the panel model with spatial effects provides the best fit to the data. The analysis confirms that the provision of health care infrastructure and social policy measures (e.g. improving education attainment) are linked to reduced rates of infant mortality. An original finding concerns the role of spatial effects in the analysis of IMR. Spillover effects associated with health infrastructure and water and sanitation facilities imply that there are regional benefits beyond the unit of analysis. Conclusions A spatial modeling approach is important to produce reliable estimates in the analysis of panel IMR data. 
Substantively, this paper contributes to our understanding of the physical and social factors that influence IMR in the case of a developing country. PMID:22410079
NASA Astrophysics Data System (ADS)
Rock, N. M. S.; Duffy, T. R.
REGRES allows a range of regression equations to be calculated for paired sets of data values in which both variables are subject to error (i.e. neither is the "independent" variable). Nonparametric regressions, based on medians of all possible pairwise slopes and intercepts, are treated in detail. Estimated slopes and intercepts are output, along with confidence limits and Spearman and Kendall rank correlation coefficients. Outliers can be rejected with user-determined stringency. Parametric regressions can be calculated for any value of λ (the ratio of the variances of the random errors for y and x), including: (1) major axis ( λ = 1); (2) reduced major axis ( λ = variance of y/variance of x); (3) Y on X ( λ = infinity); or (4) X on Y ( λ = 0) solutions. Pearson linear correlation coefficients also are output. REGRES provides an alternative to conventional isochron assessment techniques where bivariate normal errors cannot be assumed, or weighting methods are inappropriate.
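Two of the estimator families REGRES offers can be sketched compactly: the nonparametric median-of-pairwise-slopes (Theil-Sen-type) fit, and the parametric reduced major axis case λ = var(y)/var(x). This is a sketch of the estimators' definitions, not a port of the program, and it omits the confidence limits and outlier rejection the program provides:

```python
import numpy as np
from itertools import combinations

def median_slope_fit(x, y):
    """Nonparametric fit: median of all pairwise slopes; intercept is the
    median of y - slope*x. Robust when neither variable is error-free."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i, j in combinations(range(len(x)), 2) if x[i] != x[j]]
    a = float(np.median(slopes))
    return a, float(np.median(y - a * x))

def reduced_major_axis(x, y):
    """Parametric lambda = var(y)/var(x) case: |slope| = sd(y)/sd(x),
    signed by the correlation; line passes through the centroid."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]
    a = np.sign(r) * np.std(y) / np.std(x)
    return float(a), float(np.mean(y) - a * np.mean(x))
```

Both treat x and y symmetrically, unlike ordinary Y-on-X regression, which is the point of the program for isochron work.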
Spatially coupled low-density parity-check error correction for holographic data storage
NASA Astrophysics Data System (ADS)
Ishii, Norihiko; Katano, Yutaro; Muroi, Tetsuhiko; Kinoshita, Nobuhiro
2017-09-01
Spatially coupled low-density parity-check (SC-LDPC) codes were considered for holographic data storage, and the superiority of SC-LDPC was studied by simulation. The simulations show that the performance of SC-LDPC depends on the lifting number: when the lifting number is over 100, SC-LDPC shows better error correctability than irregular LDPC. SC-LDPC is applied to the 5:9 modulation code, which is one of the differential codes. In simulation, the error-free point is near 2.8 dB, and error rates over 10^-1 can be corrected. From these simulation results, this error correction code can be applied to actual holographic data storage test equipment. Results showed that an error rate of 8 × 10^-2 can be corrected; furthermore, the code works effectively and shows good error correctability.
NASA Technical Reports Server (NTRS)
Kimes, D. S.; Kerber, A. G.; Sellers, P. J.
1993-01-01
Spatial averaging errors that may occur when creating hemispherical reflectance maps for different cover types using direct nadir techniques to estimate the hemispherical reflectance are assessed by comparing the results with those obtained with a knowledge-based system called VEG (Kimes et al., 1991, 1992). It was found that hemispherical reflectance errors obtained using VEG are much smaller than those from the direct nadir techniques, depending on conditions. Suggestions are made concerning sampling and averaging strategies for creating hemispherical reflectance maps for photosynthetic, carbon cycle, and climate change studies.
Vegetation spatial variability and its effect on vegetation indices
NASA Technical Reports Server (NTRS)
Ormsby, J. P.; Choudhury, B. J.; Owe, M.
1987-01-01
Landsat MSS data were used to simulate low resolution satellite data, such as NOAA AVHRR, to quantify the fractional vegetation cover within a pixel and relate the fractional cover to the normalized difference vegetation index (NDVI) and the simple ratio (SR). The MSS data were converted to radiances from which the NDVI and SR values for the simulated pixels were determined. Each simulated pixel was divided into clusters using an unsupervised classification program. Spatial and spectral analysis provided a means of combining clusters representing similar surface characteristics into vegetated and non-vegetated areas. Analysis showed an average error of 12.7 per cent in determining these areas. NDVI values less than 0.3 represented fractional vegetated areas of 5 per cent or less, while a value of 0.7 or higher represented fractional vegetated areas greater than 80 per cent. Regression analysis showed a strong linear relation between fractional vegetation area and the NDVI and SR values; correlation values were 0.89 and 0.95 respectively. The range of NDVI values calculated from the MSS data agrees well with field studies.
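The two indices the record relates to fractional vegetation cover have standard definitions from the red and near-infrared radiances, sketched below; the regression of fractional cover on these indices is specific to the study's MSS data and is not reproduced here:

```python
import numpy as np

def ndvi_sr(red, nir):
    """Normalized difference vegetation index and simple ratio.

    red, nir: red and near-infrared radiances (scalars or arrays).
    Returns (NDVI, SR) with NDVI = (NIR - R)/(NIR + R) and SR = NIR/R.
    """
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    return (nir - red) / (nir + red), nir / red
```

Per the record, NDVI below about 0.3 corresponded to 5 per cent or less fractional vegetated area, and 0.7 or higher to more than 80 per cent.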
High spatial precision nano-imaging of polarization-sensitive plasmonic particles
NASA Astrophysics Data System (ADS)
Liu, Yunbo; Wang, Yipei; Lee, Somin Eunice
2018-02-01
Precise polarimetric imaging of polarization-sensitive nanoparticles is essential for resolving their accurate spatial positions beyond the diffraction limit. However, conventional technologies currently suffer from beam deviation errors which cannot be corrected beyond the diffraction limit. To overcome this issue, we experimentally demonstrate a spatially stable nano-imaging system for polarization-sensitive nanoparticles. In this study, we show that by integrating a voltage-tunable imaging variable polarizer with optical microscopy, we are able to suppress beam deviation errors. We expect that this nano-imaging system should allow for acquisition of accurate positional and polarization information from individual nanoparticles in applications where real-time, high precision spatial information is required.
NASA Astrophysics Data System (ADS)
Madani, Nima; Kimball, John S.; Running, Steven W.
2017-11-01
In the light use efficiency (LUE) approach of estimating the gross primary productivity (GPP), plant productivity is linearly related to absorbed photosynthetically active radiation assuming that plants absorb and convert solar energy into biomass within a maximum LUE (LUEmax) rate, which is assumed to vary conservatively within a given biome type. However, it has been shown that photosynthetic efficiency can vary within biomes. In this study, we used 149 global CO2 flux towers to derive the optimum LUE (LUEopt) under prevailing climate conditions for each tower location, stratified according to model training and test sites. Unlike LUEmax, LUEopt varies according to heterogeneous landscape characteristics and species traits. The LUEopt data showed large spatial variability within and between biome types, so that a simple biome classification explained only 29% of LUEopt variability over 95 global tower training sites. The use of explanatory variables in a mixed effect regression model explained 62.2% of the spatial variability in tower LUEopt data. The resulting regression model was used for global extrapolation of the LUEopt data and GPP estimation. The GPP estimated using the new LUEopt map showed significant improvement relative to global tower data, including a 15% R2 increase and 34% root-mean-square error reduction relative to baseline GPP calculations derived from biome-specific LUEmax constants. The new global LUEopt map is expected to improve the performance of LUE-based GPP algorithms for better assessment and monitoring of global terrestrial productivity and carbon dynamics.
Deformation field correction for spatial normalization of PET images
Bilgel, Murat; Carass, Aaron; Resnick, Susan M.; Wong, Dean F.; Prince, Jerry L.
2015-01-01
Spatial normalization of positron emission tomography (PET) images is essential for population studies, yet the current state of the art in PET-to-PET registration is limited to the application of conventional deformable registration methods that were developed for structural images. A method is presented for the spatial normalization of PET images that improves their anatomical alignment over the state of the art. The approach works by correcting the deformable registration result using a model that is learned from training data having both PET and structural images. In particular, viewing the structural registration of training data as ground truth, correction factors are learned by using a generalized ridge regression at each voxel given the PET intensities and voxel locations in a population-based PET template. The trained model can then be used to obtain more accurate registration of PET images to the PET template without the use of a structural image. A cross validation evaluation on 79 subjects shows that the proposed method yields more accurate alignment of the PET images compared to deformable PET-to-PET registration as revealed by 1) a visual examination of the deformed images, 2) a smaller error in the deformation fields, and 3) a greater overlap of the deformed anatomical labels with ground truth segmentations. PMID:26142272
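The voxelwise correction model can be sketched as a plain ridge regression; this is a simplified stand-in (the paper's generalized ridge penalty and exact feature construction may differ), with synthetic data in place of PET intensities and template coordinates:

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge estimate: (X'X + lam*I)^(-1) X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Synthetic stand-in for one voxel: features are a PET intensity plus
# three voxel coordinates; the target is a deformation correction factor.
rng = np.random.default_rng(0)
X = np.c_[rng.normal(size=(200, 1)), rng.uniform(size=(200, 3))]
true_beta = np.array([0.5, -1.0, 2.0, 0.3])
y = X @ true_beta + 0.01 * rng.normal(size=200)
beta_hat = ridge_fit(X, y, lam=1e-6)  # nearly unpenalized -> close to true_beta
```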
Spatial effects, sampling errors, and task specialization in the honey bee.
Johnson, B R
2010-05-01
Task allocation patterns should depend on the spatial distribution of work within the nest, variation in task demand, and the movement patterns of workers; however, relatively little research has focused on these topics. This study uses a spatially explicit agent-based model to determine whether such factors alone can generate biases in task performance at the individual level in the honey bee, Apis mellifera. Specialization (bias in task performance) is shown to result from strong sampling error due to localized task demand, slow worker movement relative to nest size, and strong spatial variation in task demand. To date, specialization has been primarily interpreted with the response threshold concept, which is focused on intrinsic (typically genotypic) differences between workers. Response threshold variation and sampling error due to spatial effects are not mutually exclusive, however, and this study suggests that both contribute to patterns of task bias at the individual level. While spatial effects are strong enough to explain some documented cases of specialization, they are relatively short term and not explanatory for long term cases of specialization. In general, this study suggests that the spatial layout of tasks and fluctuations in their demand must be explicitly controlled for in studies focused on identifying genotypic specialists.
Hoos, Anne B.; Patel, Anant R.
1996-01-01
Model-adjustment procedures were applied to the combined data bases of storm-runoff quality for Chattanooga, Knoxville, and Nashville, Tennessee, to improve predictive accuracy for storm-runoff quality for urban watersheds in these three cities and throughout Middle and East Tennessee. Data for 45 storms at 15 different sites (five sites in each city) constitute the data base. Comparison of observed values of storm-runoff load and event-mean concentration to the predicted values from the regional regression models for 10 constituents shows prediction errors as large as 806,000 percent. Model-adjustment procedures, which combine the regional model predictions with local data, are applied to improve predictive accuracy. Standard error of estimate after model adjustment ranges from 67 to 322 percent. Calibration results may be biased due to sampling error in the Tennessee data base. The relatively large values of standard error of estimate for some of the constituent models, although representing significant reduction (at least 50 percent) in prediction error compared to estimation with unadjusted regional models, may be unacceptable for some applications. The user may wish to collect additional local data for these constituents and repeat the analysis, or calibrate an independent local regression model.
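The simplest procedure of this kind, a single-factor bias correction, can be sketched as follows; this is a generic illustration with invented numbers, and the report's actual USGS adjustment procedures include further regression-based variants:

```python
# Hedged sketch of a single-factor model-adjustment procedure: scale the
# regional regression predictions by the mean ratio of local observations
# to regional predictions. A generic stand-in, not the report's exact method.
def single_factor_adjustment(regional_preds, local_obs):
    ratio = sum(o / p for o, p in zip(local_obs, regional_preds)) / len(local_obs)
    return [p * ratio for p in regional_preds], ratio

# If the regional model systematically underpredicts local loads by half,
# the correction factor recovers the factor of two.
adjusted, bias_factor = single_factor_adjustment([10.0, 20.0], [20.0, 40.0])
```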
Re-assessing accumulated oxygen deficit in middle-distance runners.
Bickham, D; Le Rossignol, P; Gibbons, C; Russell, A P
2002-12-01
The purpose of this study was to re-assess the accumulated oxygen deficit (AOD), incorporating recent methodological improvements, i.e., 4 min submaximal tests spread above and below the lactate threshold (LT). We investigated the influence of the VO2-speed regression on the precision of the estimated total energy demand and AOD, utilising different numbers of regression points and including measurement errors. Seven trained middle-distance runners (mean +/- SD age: 25.3 +/- 5.4 y, mass: 73.7 +/- 4.3 kg, VO2max: 64.4 +/- 6.1 mL x kg(-1) x min(-1)) completed a VO2max test, an LT test, 10 x 4 min exercise tests (above and below LT), and high-intensity exhaustive tests. The VO2-speed regression was developed using 10 submaximal points and a forced y-intercept value. The average precision (measured as the width of the 95% confidence interval) for the estimated total energy demand using this regression was 7.8 mL O2 Eq x kg(-1) x min(-1). There was a two-fold decrease in precision of the estimated total energy demand with the inclusion of measurement errors from the metabolic system. The mean AOD value was 43.3 mL O2 Eq x kg(-1) (lower and upper 95% CI 32.1 and 54.5 mL O2 Eq x kg(-1), respectively). Converting the 95% CI for estimated total energy demand to AOD, or including maximum possible measurement errors, amplified the error associated with the estimated total energy demand. No significant difference in AOD variables was found using 10, 4, or 2 regression points with a forced y-intercept. For practical purposes we recommend the use of 4 submaximal values with a forced y-intercept. Using 95% CIs and calculating error highlighted possible error in estimating AOD. Without accurate data collection, increased variability could decrease the accuracy of the AOD, as shown by a 95% CI of the AOD.
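The forced-intercept regression and the AOD arithmetic can be sketched as follows; the intercept, speeds, and VO2 values below are invented for illustration, not the study's data:

```python
# Hedged sketch of the VO2-speed regression with a forced y-intercept
# and the AOD computation; all numbers below are illustrative.
def fit_forced_intercept(speeds, vo2, intercept):
    """Least-squares slope of VO2 on speed when the y-intercept is fixed
    in advance (as in the protocol above)."""
    num = sum(s * (v - intercept) for s, v in zip(speeds, vo2))
    den = sum(s * s for s in speeds)
    return num / den

def accumulated_o2_deficit(demand_rate, duration_min, measured_total_vo2):
    """AOD = estimated total O2 demand minus accumulated measured uptake."""
    return demand_rate * duration_min - measured_total_vo2

slope = fit_forced_intercept([3.0, 4.0, 5.0], [35.0, 45.0, 55.0], intercept=5.0)
demand = 5.0 + slope * 6.0       # estimated demand rate at a hypothetical race speed
aod = accumulated_o2_deficit(demand, 2.0, 100.0)
```

Because the intercept is fixed, only the slope is estimated, which is why so few submaximal points (4, even 2) can suffice.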
NASA Astrophysics Data System (ADS)
Hutchinson, G. L.; Livingston, G. P.; Healy, R. W.; Striegl, R. G.
2000-04-01
We employed a three-dimensional finite difference gas diffusion model to simulate the performance of chambers used to measure surface-atmosphere trace gas exchange. We found that systematic errors often result from conventional chamber design and deployment protocols, as well as key assumptions behind the estimation of trace gas exchange rates from observed concentration data. Specifically, our simulations showed that (1) when a chamber significantly alters atmospheric mixing processes operating near the soil surface, it also nearly instantaneously enhances or suppresses the postdeployment gas exchange rate, (2) any change resulting in greater soil gas diffusivity, or greater partitioning of the diffusing gas to solid or liquid soil fractions, increases the potential for chamber-induced measurement error, and (3) all such errors are independent of the magnitude, kinetics, and/or distribution of trace gas sources, but greater for trace gas sinks with the same initial absolute flux. Finally, and most importantly, we found that our results apply to steady state as well as non-steady-state chambers, because the slow rate of gas diffusion in soil inhibits recovery of the former from their initial non-steady-state condition. Over a range of representative conditions, the error in steady state chamber estimates of the trace gas flux varied from -30 to +32%, while estimates computed by linear regression from non-steady-state chamber concentrations were 2 to 31% too small. Although such errors are relatively small in comparison to the temporal and spatial variability characteristic of trace gas exchange, they bias the summary statistics for each experiment as well as larger scale trace gas flux estimates based on them.
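The non-steady-state flux estimate discussed above, the one the simulations show runs 2 to 31% low, is a linear regression of headspace concentration on time scaled by chamber geometry. A minimal sketch with invented values (unit conversion from ppm to mass flux omitted):

```python
# Hedged sketch of a non-steady-state chamber flux estimate: slope of a
# linear fit to concentration vs. time, scaled by volume over footprint
# area. Values are illustrative, not from the simulations above.
def chamber_flux(times_min, conc_ppm, volume_m3, area_m2):
    n = len(times_min)
    mt = sum(times_min) / n
    mc = sum(conc_ppm) / n
    slope = sum((t - mt) * (c - mc) for t, c in zip(times_min, conc_ppm)) / \
        sum((t - mt) ** 2 for t in times_min)
    return slope * volume_m3 / area_m2  # ppm * m / min

flux = chamber_flux([0.0, 1.0, 2.0, 3.0], [400.0, 410.0, 420.0, 430.0], 0.01, 0.1)
```

The bias arises because deployment itself perturbs the near-surface gradient, so the observed slope understates the undisturbed flux.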
2014-01-01
Background There have been large-scale outbreaks of hand, foot and mouth disease (HFMD) in Mainland China over the last decade. These events varied greatly across the country. It is necessary to identify the spatial risk factors and spatial distribution patterns of HFMD for public health control and prevention. Climate risk factors associated with HFMD occurrence have been recognized. However, few studies have discussed the socio-economic determinants of HFMD risk at a spatial scale. Methods HFMD records in Mainland China in May 2008 were collected. Both climate and socio-economic factors were selected as potential risk exposures of HFMD. Odds ratio (OR) was used to identify the spatial risk factors. A spatial autologistic regression model was employed to estimate the OR for each exposure and to model the spatial distribution patterns of HFMD risk. Results Results showed that both climate and socio-economic variables were spatial risk factors for HFMD transmission in Mainland China. The statistically significant risk factors are monthly average precipitation (OR = 1.4354), monthly average temperature (OR = 1.379), monthly average wind speed (OR = 1.186), the number of industrial enterprises above designated size (OR = 17.699), the population density (OR = 1.953), and the proportion of student population (OR = 1.286). The spatial autologistic regression model shows good goodness of fit (ROC = 0.817) and prediction accuracy (correct ratio = 78.45%) for HFMD occurrence. The autologistic regression model also reduces the contribution of the residual term in the ordinary logistic regression model significantly, from 17.25 to 1.25 for the odds ratio. Based on the prediction results of the spatial model, we obtained a map of the probability of HFMD occurrence that shows the spatial distribution pattern and local epidemic risk over Mainland China. Conclusions The autologistic regression model was used to identify spatial risk factors and model spatial risk patterns of HFMD.
HFMD occurrences were found to be spatially heterogeneous over the Mainland China, which is related to both the climate and socio-economic variables. The combination of socio-economic and climate exposures can explain the HFMD occurrences more comprehensively and objectively than those with only climate exposures. The modeled probability of HFMD occurrence at the county level reveals not only the spatial trends, but also the local details of epidemic risk, even in the regions where there were no HFMD case records. PMID:24731248
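The autologistic idea, ordinary logistic regression augmented with a spatial autocovariate, can be sketched as follows; the coefficients and neighborhood definition here are invented for illustration:

```python
import math

# Hedged sketch of an autologistic (spatial logistic) model: a logistic
# regression plus a spatial autocovariate, here the fraction of
# neighboring counties reporting cases. All coefficients are invented.
def autologistic_prob(x, beta, rho, infected_neighbors, n_neighbors):
    autocov = infected_neighbors / n_neighbors if n_neighbors else 0.0
    eta = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x)) + rho * autocov
    return 1.0 / (1.0 + math.exp(-eta))

# The reported odds ratios correspond to exp(coefficient) for a one-unit
# increase in the (possibly standardized) covariate.
p_isolated = autologistic_prob([0.0], [0.0, 1.0], rho=2.0,
                               infected_neighbors=0, n_neighbors=4)
p_surrounded = autologistic_prob([0.0], [0.0, 1.0], rho=2.0,
                                 infected_neighbors=4, n_neighbors=4)
```

With a positive rho, a county surrounded by affected neighbors gets a higher predicted occurrence probability, which is what lets the model absorb residual spatial dependence.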
Reducing representativeness and sampling errors in radio occultation-radiosonde comparisons
NASA Astrophysics Data System (ADS)
Gilpin, Shay; Rieckh, Therese; Anthes, Richard
2018-05-01
Radio occultation (RO) and radiosonde (RS) comparisons provide a means of analyzing errors associated with both observational systems. Since RO and RS observations are not taken at the exact same time or location, temporal and spatial sampling errors resulting from atmospheric variability can be significant and inhibit error analysis of the observational systems. In addition, the vertical resolutions of RO and RS profiles vary and vertical representativeness errors may also affect the comparison. In RO-RS comparisons, RO observations are co-located with RS profiles within a fixed time window and distance, i.e. within 3-6 h and circles of radii ranging between 100 and 500 km. In this study, we first show that vertical filtering of RO and RS profiles to a common vertical resolution reduces representativeness errors. We then test two methods of reducing horizontal sampling errors during RO-RS comparisons: restricting co-location pairs to within ellipses oriented along the direction of wind flow rather than circles and applying a spatial-temporal sampling correction based on model data. Using data from 2011 to 2014, we compare RO and RS differences at four GCOS Reference Upper-Air Network (GRUAN) RS stations in different climatic locations, in which co-location pairs were constrained to a large circle ( ˜ 666 km radius), small circle ( ˜ 300 km radius), and ellipse parallel to the wind direction ( ˜ 666 km semi-major axis, ˜ 133 km semi-minor axis). We also apply a spatial-temporal sampling correction using European Centre for Medium-Range Weather Forecasts Interim Reanalysis (ERA-Interim) gridded data. Restricting co-locations to within the ellipse reduces root mean square (RMS) refractivity, temperature, and water vapor pressure differences relative to RMS differences within the large circle and produces differences that are comparable to or less than the RMS differences within circles of similar area. 
Applying the sampling correction shows the most significant reduction in RMS differences, such that RMS differences are nearly identical to the sampling correction regardless of the geometric constraints. We conclude that implementing the spatial-temporal sampling correction using a reliable model will most effectively reduce sampling errors during RO-RS comparisons; however, if a reliable model is not available, restricting spatial comparisons to within an ellipse parallel to the wind flow will reduce sampling errors caused by horizontal atmospheric variability.
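The elliptical co-location criterion amounts to rotating the RO-RS horizontal offset into a wind-aligned frame and testing it against an ellipse. A minimal sketch, using the approximate semi-axis lengths quoted above (the study's exact geometry handling may differ):

```python
import math

# Hedged sketch of the elliptical co-location test: rotate the offset
# into a wind-aligned frame, then check the standard ellipse inequality.
# Default semi-axes follow the ~666 km / ~133 km values in the abstract.
def within_wind_ellipse(dx_km, dy_km, wind_dir_deg, a_km=666.0, b_km=133.0):
    theta = math.radians(wind_dir_deg)
    along = dx_km * math.cos(theta) + dy_km * math.sin(theta)    # along-wind
    cross = -dx_km * math.sin(theta) + dy_km * math.cos(theta)   # cross-wind
    return (along / a_km) ** 2 + (cross / b_km) ** 2 <= 1.0
```

A 600 km offset along the wind direction passes, while the same offset across the wind fails, which is exactly the asymmetry that lets the ellipse tolerate advection-dominated variability.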
Intrinsic Raman spectroscopy for quantitative biological spectroscopy Part II
Bechtel, Kate L.; Shih, Wei-Chuan; Feld, Michael S.
2009-01-01
We demonstrate the effectiveness of intrinsic Raman spectroscopy (IRS) at reducing errors caused by absorption and scattering. Physical tissue models, solutions of varying absorption and scattering coefficients with known concentrations of Raman scatterers, are studied. We show significant improvement in prediction error by implementing IRS to predict concentrations of Raman scatterers using both ordinary least squares regression (OLS) and partial least squares regression (PLS). In particular, we show that IRS provides a robust calibration model that does not increase in error when applied to samples with optical properties outside the range of calibration. PMID:18711512
Decreased Leftward ‘Aiming’ Motor-Intentional Spatial Cuing in Traumatic Brain Injury
Wagner, Daymond; Eslinger, Paul J.; Barrett, A. M.
2016-01-01
Objective To characterize the mediation of attention and action in space following traumatic brain injury (TBI). Method Two exploratory analyses were performed to determine the influence of spatial ‘Aiming’ motor versus spatial ‘Where’ bias on line bisection in TBI participants. The first experiment compared performance according to severity and location of injury in TBI. The second experiment examined bisection performance in a larger TBI sample against a matched control group. In both experiments, participants bisected lines in near and far space using an apparatus that allowed for the fractionation of spatial Aiming versus Where error components. Results In the first experiment, participants with severe injuries tended to incur rightward error when starting from the right in far space, compared with participants with mild injuries. In the second experiment, when performance was examined at the individual level, more participants with TBI tended to incur rightward motor error compared to controls. Conclusions TBI may cause frontal-subcortical cognitive dysfunction and asymmetric motor perseveration, affecting spatial Aiming bias on line bisection. Potential effects on real-world function need further investigation. PMID:27571220
Jang, Dae-Hyun; Kim, Min-Wook; Park, Kyoung Ha; Lee, Jae Woo
2015-03-01
The purpose of the present study was to investigate the relationship between Korean language-specific dysgraphia and unilateral spatial neglect in 31 right brain stroke patients. All patients were tested for writing errors in spontaneous writing, dictation, and copying tests. Dysgraphia was classified into visuospatial omission, visuospatial destruction, syllabic tilting, stroke omission, stroke addition, and stroke tilting. Twenty-three (77.4%) of the 31 patients exhibited dysgraphia and 18 (58.1%) demonstrated unilateral spatial neglect. Visuospatial omission was the most common dysgraphia, followed by stroke addition and omission errors. The highest number of errors was made in the copying test and the fewest in the spontaneous writing test. Patients with unilateral spatial neglect made significantly more dysgraphic errors in the copying test than those without. We identified specific dysgraphia features, such as right-side space omission and vertical stroke addition, in Korean right brain stroke patients. In conclusion, unilateral spatial neglect influences the copying of written Korean in patients with right brain stroke.
Trehan, Sumeet; Carlberg, Kevin T.; Durlofsky, Louis J.
2017-07-14
A machine learning–based framework for modeling the error introduced by surrogate models of parameterized dynamical systems is proposed. The framework entails the use of high-dimensional regression techniques (e.g., random forests and LASSO) to map a large set of inexpensively computed “error indicators” (i.e., features) produced by the surrogate model at a given time instance to a prediction of the surrogate-model error in a quantity of interest (QoI). This eliminates the need for the user to hand-select a small number of informative features. The methodology requires a training set of parameter instances at which the time-dependent surrogate-model error is computed by simulating both the high-fidelity and surrogate models. Using these training data, the method first determines regression-model locality (via classification or clustering) and subsequently constructs a “local” regression model to predict the time-instantaneous error within each identified region of feature space. We consider 2 uses for the resulting error model: (1) as a correction to the surrogate-model QoI prediction at each time instance and (2) as a way to statistically model arbitrary functions of the time-dependent surrogate-model error (e.g., time-integrated errors). We then apply the proposed framework to model errors in reduced-order models of nonlinear oil-water subsurface flow simulations, with time-varying well-control (bottom-hole pressure) parameters. The reduced-order models used in this work entail application of trajectory piecewise linearization in conjunction with proper orthogonal decomposition. Moreover, when the first use of the method is considered, numerical experiments demonstrate consistent improvement in accuracy in the time-instantaneous QoI prediction relative to the original surrogate model, across a large number of test cases.
When the second use is considered, results show that the proposed method provides accurate statistical predictions of the time- and well-averaged errors.
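The locality-then-local-regression structure can be sketched in a few lines; this is a deliberately simplified stand-in, with k-means for the locality step and ordinary least squares in place of the random-forest/LASSO regressors named in the abstract:

```python
import numpy as np

# Hedged sketch of the two-step error model: determine regression-model
# locality (plain k-means here), then fit one local regression per region
# of feature space. OLS stands in for the RF/LASSO regressors above.
# Assumes every cluster is non-empty after convergence.
def fit_local_error_models(features, errors, n_clusters=2, iters=20):
    rng = np.random.default_rng(0)
    centers = features[rng.choice(len(features), n_clusters, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((features[:, None] - centers) ** 2).sum(-1), axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = features[labels == k].mean(axis=0)
    models = {}
    for k in range(n_clusters):
        X = np.c_[np.ones((labels == k).sum()), features[labels == k]]
        models[k] = np.linalg.lstsq(X, errors[labels == k], rcond=None)[0]
    return centers, models

def predict_error(x, centers, models):
    k = int(np.argmin(((centers - x) ** 2).sum(-1)))
    return models[k] @ np.r_[1.0, x]
```

The predicted error can then be added back to the surrogate QoI (use 1 above) or aggregated over time (use 2).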
Rotational wind indicator enhances control of rotated displays
NASA Technical Reports Server (NTRS)
Cunningham, H. A.; Pavel, Misha
1991-01-01
Rotation by 108 deg of the spatial mapping between a visual display and a manual input device produces large spatial errors in a discrete aiming task. These errors are not easily corrected by voluntary mental effort, but the central nervous system does adapt gradually to the new mapping. Bernotat (1970) showed that adding true hand position to a 90 deg rotated display improved performance of a compensatory tracking task, but tracking error rose again upon removal of the explicit cue. This suggests that the explicit error signal did not induce changes in the neural mapping, but rather allowed the operator to reduce tracking error using a higher mental strategy. In this report, we describe an explicit visual display enhancement applied to a 108 deg rotated discrete aiming task. A 'wind indicator' corresponding to the effect of the mapping rotation is displayed on the operator-controlled cursor. The human operator is instructed to oppose the virtual force represented by the indicator, as one would do if flying an airplane in a crosswind. This enhancement reduces spatial aiming error in the first 10 minutes of practice by an average of 70 percent when compared to a no enhancement control condition. Moreover, it produces adaptation aftereffect, which is evidence of learning by neural adaptation rather than by mental strategy. Finally, aiming error does not rise upon removal of the explicit cue.
Linear regression crash prediction models : issues and proposed solutions.
DOT National Transportation Integrated Search
2010-05-01
The paper develops a linear regression model approach that can be applied to crash data to predict vehicle crashes. The proposed approach involves novel data aggregation to satisfy linear regression assumptions; namely error structure normality ...
Spatial Dynamics and Determinants of County-Level Education Expenditure in China
ERIC Educational Resources Information Center
Gu, Jiafeng
2012-01-01
In this paper, a multivariate spatial autoregressive model of local public education expenditure determination with autoregressive disturbance is developed and estimated. The existence of spatial interdependence is tested using Moran's I statistic and Lagrange multiplier test statistics for both the spatial error and spatial lag models. The full…
Exploration of walking behavior in Vermont using spatial regression.
DOT National Transportation Integrated Search
2015-06-01
This report focuses on the relationship between walking and its contributing factors by applying spatial regression methods. Using the Vermont data from the New England Transportation Survey (NETS), walking variables as well as 170 independent va...
Spatial quantile regression using INLA with applications to childhood overweight in Malawi.
Mtambo, Owen P L; Masangwi, Salule J; Kazembe, Lawrence N M
2015-04-01
Analyses of childhood overweight have mainly used mean regression. However, using quantile regression is more appropriate as it provides flexibility to analyse the determinants of overweight corresponding to quantiles of interest. The main objective of this study was to fit a Bayesian additive quantile regression model with structured spatial effects for childhood overweight in Malawi using the 2010 Malawi DHS data. Inference was fully Bayesian using R-INLA package. The significant determinants of childhood overweight ranged from socio-demographic factors such as type of residence to child and maternal factors such as child age and maternal BMI. We observed significant positive structured spatial effects on childhood overweight in some districts of Malawi. We recommended that the childhood malnutrition policy makers should consider timely interventions based on risk factors as identified in this paper including spatial targets of interventions. Copyright © 2015 Elsevier Ltd. All rights reserved.
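The objective that distinguishes quantile from mean regression is the check (pinball) loss. A minimal sketch of the intercept-only case, where the minimizer is the sample quantile (a full model would minimize the same loss over covariate and spatial-effect coefficients, as the INLA fit does):

```python
# Hedged sketch of the check (pinball) loss behind quantile regression:
# under-predictions are weighted by tau, over-predictions by (1 - tau).
def pinball_loss(residual, tau):
    return tau * residual if residual >= 0 else (tau - 1.0) * residual

def best_constant_quantile(y, tau, candidates):
    """Intercept-only quantile 'regression': the constant minimizing the
    total pinball loss is the tau-th sample quantile."""
    return min(candidates, key=lambda c: sum(pinball_loss(v - c, tau) for v in y))

median_fit = best_constant_quantile([1, 2, 3, 4, 5], 0.5, [1, 2, 3, 4, 5])  # 3
upper_fit = best_constant_quantile([1, 2, 3, 4, 5], 0.9, [1, 2, 3, 4, 5])   # 5
```

Fitting at a high tau (e.g., 0.9) is what lets the analysis target the heaviest children rather than the mean.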
Spatial serial order processing in schizophrenia.
Fraser, David; Park, Sohee; Clark, Gina; Yohanna, Daniel; Houk, James C
2004-10-01
The aim of this study was to examine serial order processing deficits in 21 schizophrenia patients and 16 age- and education-matched healthy controls. In a spatial serial order working memory task, one to four spatial targets were presented in a randomized sequence. Subjects were required to remember the locations and the order in which the targets were presented. Patients showed a marked deficit in ability to remember the sequences compared with controls. Increasing the number of targets within a sequence resulted in poorer memory performance for both control and schizophrenia subjects, but the effect was much more pronounced in the patients. Targets presented at the end of a long sequence were more vulnerable to memory error in schizophrenia patients. Performance deficits were not attributable to motor errors, but to errors in target choice. The results support the idea that the memory errors seen in schizophrenia patients may be due to saturating the working memory network at relatively low levels of memory load.
Aging and the intrusion superiority effect in visuo-spatial working memory.
Cornoldi, Cesare; Bassani, Chiara; Berto, Rita; Mammarella, Nicola
2007-01-01
This study investigated the active component of visuo-spatial working memory (VSWM) in younger and older adults testing the hypotheses that elderly individuals have a poorer performance than younger ones and that errors in active VSWM tasks depend, at least partially, on difficulties in avoiding intrusions (i.e., avoiding already activated information). In two experiments, participants were presented with sequences of matrices on which three positions were pointed out sequentially: their task was to process all the positions but indicate only the final position of each sequence. Results showed a poorer performance in the elderly compared to the younger group and a higher number of intrusion (errors due to activated but irrelevant positions) rather than invention (errors consisting of pointing out a position never indicated by the experiementer) errors. The number of errors increased when a concurrent task was introduced (Experiment 1) and it was affected by different patterns of matrices (Experiment 2). In general, results show that elderly people have an impaired VSWM and produce a large number of errors due to inhibition failures. However, both the younger and the older adults' visuo-spatial working memory was affected by the presence of activated irrelevant information, the reduction of the available resources, and task constraints.
Visualizing Uncertainty of Point Phenomena by Redesigned Error Ellipses
NASA Astrophysics Data System (ADS)
Murphy, Christian E.
2018-05-01
Visualizing uncertainty remains one of the great challenges in modern cartography. There is no overarching strategy for displaying the nature of uncertainty, as an effective and efficient visualization depends heavily on the type of uncertainty as well as on the spatial data feature type. This work presents a design strategy to visualize uncertainty connected to point features. The error ellipse, well known from mathematical statistics, is adapted to display the uncertainty of point information originating from spatial generalization. Modified designs of the error ellipse show the potential of quantitative and qualitative symbolization and simultaneous point-based uncertainty symbolization. The user can intuitively identify the centers of gravity and the major orientation of the point arrays, as well as estimate the extents and possible spatial distributions of multiple point phenomena. The error ellipse represents uncertainty in an intuitive way, particularly suitable for laymen. Furthermore, it is shown that an adapted design of the error ellipse is also applicable to displaying the uncertainty of point features originating from incomplete data. The suitability of the error ellipse for displaying the uncertainty of point information is demonstrated in two showcases: (1) the analysis of formations of association football players, and (2) uncertain positioning of events on maps for the media.
A spatial error model with continuous random effects and an application to growth convergence
NASA Astrophysics Data System (ADS)
Laurini, Márcio Poletti
2017-10-01
We propose a spatial error model with continuous random effects based on Matérn covariance functions and apply this model to the analysis of income convergence processes (β-convergence). The use of a model with continuous random effects permits a clearer visualization and interpretation of the spatial dependency patterns, avoids the problems of defining neighborhoods in spatial econometrics models, and allows projecting the spatial effects for every possible location in the continuous space, circumventing the existing aggregations in discrete lattice representations. We apply this model approach to analyze the economic growth of Brazilian municipalities between 1991 and 2010 using unconditional and conditional formulations and a spatiotemporal model of convergence. The results indicate that the estimated spatial random effects are consistent with the existence of income convergence clubs for Brazilian municipalities in this period.
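The Matérn family underpinning the continuous random effects can be evaluated directly; the following is a minimal sketch for the smoothness-3/2 case, with `sigma2` and `rho` as illustrative parameter names rather than values from the paper:

```python
import math

def matern_32(d, sigma2=1.0, rho=1.0):
    """Matérn covariance with smoothness nu = 3/2:
    C(d) = sigma2 * (1 + sqrt(3)*d/rho) * exp(-sqrt(3)*d/rho)."""
    a = math.sqrt(3.0) * d / rho
    return sigma2 * (1.0 + a) * math.exp(-a)

# Full variance at zero separation, smooth decay with distance.
print(matern_32(0.0))                    # 1.0
print(matern_32(0.5) > matern_32(2.0))   # True
```

In a continuous-space model, this function fills the covariance matrix of the random effects at arbitrary locations, which is what frees the model from discrete neighborhood definitions.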
A Study of the Groundwater Level Spatial Variability in the Messara Valley of Crete
NASA Astrophysics Data System (ADS)
Varouchakis, E. A.; Hristopulos, D. T.; Karatzas, G. P.
2009-04-01
The island of Crete (Greece) has a dry sub-humid climate and marginal groundwater resources, which are extensively used for agricultural activities and human consumption. The Messara valley is located in the south of the Heraklion prefecture; it covers an area of 398 km2 and is the largest and most productive valley of the island. Over-exploitation during the past thirty (30) years has led to a dramatic decrease of thirty five (35) meters in the groundwater level. Possible future climatic changes in the Mediterranean region, potential desertification, population increase, and extensive agricultural activity generate concern over the sustainability of the water resources of the area. The accurate estimation of the water table depth is important for an integrated groundwater resource management plan. This study focuses on the Mires basin of the Messara valley for reasons of hydro-geological data availability and geological homogeneity. The research goal is to model and map the spatial variability of the basin's groundwater level accurately. The data used in this study consist of seventy (70) piezometric head measurements for the hydrological year 2001-2002. These are unevenly distributed and mostly concentrated along a temporary river that crosses the basin. The piezometric heads range from an extreme low value of 9.4 meters above sea level (masl) to 62 masl for the wet period of the year (October to April). An initial goal of the study is to develop spatial models for the accurate generation of static maps of the groundwater level. At a second stage, these models should be extended to dynamic (space-time) situations for the prediction of future water levels. Preliminary data analysis shows that the piezometric head variations are not normally distributed. Several methods, including the Box-Cox transformation and a modified version of it, transgaussian kriging, and Gaussian anamorphosis, have been used to obtain a spatial model for the piezometric head.
A trend model was constructed that accounted for the distance of the wells from the river bed. The spatial dependence of the fluctuations was studied by fitting isotropic and anisotropic empirical variograms with classical models, the Matérn model and the Spartan variogram family (Hristopulos, 2003; Hristopulos and Elogne, 2007). The most accurate results, with a mean absolute prediction error of 4.57 masl, were obtained using the modified Box-Cox transform of the original data. The exponential and the isotropic Spartan variograms provided the best fits to the experimental variogram. Using Ordinary Kriging with either variogram function gave a mean absolute estimation error of 4.57 masl based on leave-one-out cross validation. The bias error of the predictions was calculated equal to -0.38 masl and the correlation coefficient of the predictions with respect to the original data equal to 0.8. The estimates located on the borders of the study domain presented a higher prediction error, varying from 8 to 14 masl, due to the limited number of neighboring data. The maximum estimation error, observed at the extreme low value, was 23 masl. The method of locally weighted regression (LWR) (NIST/SEMATECH 2009) was also investigated as an alternative approach for spatial modeling. The trend calculated from a second-order LWR method showed a remarkable fit to the original data, with a mean absolute estimation error of 4.4 masl. The bias prediction error was calculated equal to -0.16 masl and the correlation coefficient between predicted and original data equal to 0.88. Higher estimation errors were found at the same locations and varied within the same range. The extreme low value calculation error improved to 21 masl. Plans for future research include the incorporation of spatial anisotropy in the kriging algorithm, the investigation of kernel functions other than the tricube in LWR, as well as the use of locally adapted bandwidth values.
Furthermore, pumping rates for fifty eight (58) of the seventy (70) wells are available and display a correlation coefficient of -0.6 with the respective groundwater levels. A Digital Elevation Model (DEM) of the area will provide additional information about the unsampled locations of the basin. The pumping rates and the DEM will be used as secondary information in a co-kriging approach, leading to more accurate estimation of the basin's water table. NIST/SEMATECH e-Handbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/, 12/01/09. D.T. Hristopulos, "Spartan Gibbs random field models for geostatistical applications," SIAM J. Scient. Comput., vol. 24, no. 6, pp. 2125-2162, 2003. D.T. Hristopulos and S. Elogne, "Analytic properties and covariance functions for a new class of generalized Gibbs random fields," IEEE Transactions on Information Theory, vol. 53, no. 12, pp. 4667-4679, 2007.
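The locally weighted regression step with a tricube kernel can be sketched in one dimension. This is an illustrative local linear (first-order) fit, not the second-order implementation used in the study, and the data and bandwidth below are invented:

```python
def tricube(u):
    """Tricube weight: (1 - |u|^3)^3 for |u| < 1, else 0."""
    u = abs(u)
    return (1 - u**3)**3 if u < 1 else 0.0

def lwr_predict(x0, xs, ys, bandwidth):
    """Locally weighted linear regression at x0, weighting nearby
    observations with the tricube kernel (weighted least squares)."""
    w = [tricube((x - x0) / bandwidth) for x in xs]
    sw = sum(w)
    swx = sum(wi * x for wi, x in zip(w, xs))
    swy = sum(wi * y for wi, y in zip(w, ys))
    swxx = sum(wi * x * x for wi, x in zip(w, xs))
    swxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    b = (sw * swxy - swx * swy) / (sw * swxx - swx * swx)  # local slope
    a = (swy - b * swx) / sw                               # local intercept
    return a + b * x0

xs = [0, 1, 2, 3, 4, 5]
ys = [0.1, 1.9, 4.2, 5.8, 8.1, 9.9]   # roughly y = 2x
pred = lwr_predict(2.5, xs, ys, bandwidth=4.0)
```

A locally adapted bandwidth, mentioned as future work in the abstract, would simply make `bandwidth` a function of `x0`.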
Data mining: Potential applications in research on nutrition and health.
Batterham, Marijka; Neale, Elizabeth; Martin, Allison; Tapsell, Linda
2017-02-01
Data mining enables further insights from nutrition-related research, but caution is required. The aim of this analysis was to demonstrate and compare the utility of data mining methods in classifying a categorical outcome derived from a nutrition-related intervention. Baseline data (23 variables, 8 categorical) on participants (n = 295) in an intervention trial were used to classify participants in terms of meeting the criterion of achieving 10 000 steps per day. Results from classification and regression trees (CARTs), random forests, adaptive boosting, logistic regression, support vector machines and neural networks were compared using area under the curve (AUC) and error assessments. The CART produced the best model when considering the AUC (0.703), overall error (18%) and within-class error (28%). Logistic regression also performed reasonably well compared to the other models (AUC 0.675, overall error 23%, within-class error 36%). All the methods gave different rankings of variables' importance. CART found that body fat, quality of life using the SF-12 Physical Component Summary (PCS) and the cholesterol:HDL ratio were the most important predictors of meeting the 10 000 steps criterion, while logistic regression showed the SF-12 PCS, glucose levels and level of education to be the most significant predictors (P ≤ 0.01). Differing outcomes suggest caution is required with a single data mining method, particularly in a dataset with nonlinear relationships and outliers and when exploring relationships that were not the primary outcomes of the research. © 2017 Dietitians Association of Australia.
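The AUC used to compare the classifiers is equivalent to the Mann-Whitney U statistic (the probability that a randomly chosen positive case outscores a randomly chosen negative one), which can be computed without any library; the scores below are invented:

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via the Mann-Whitney statistic:
    count pairwise wins of positive scores over negative scores,
    with ties counted as half a win."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Perfect separation gives AUC = 1.0; complete overlap gives 0.5.
print(auc([0.9, 0.8], [0.1, 0.2]))   # 1.0
print(auc([0.5, 0.4], [0.5, 0.4]))   # 0.5
```

An AUC of 0.703 (the CART result) therefore means a randomly chosen participant who met the criterion was ranked above a randomly chosen one who did not about 70% of the time.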
NASA Technical Reports Server (NTRS)
Gentry, R. C.; Rodgers, E.; Steranka, J.; Shenk, W. E.
1978-01-01
A regression technique was developed to forecast 24-hour changes of the maximum winds for weak (maximum winds less than or equal to 65 kt) and strong (maximum winds greater than 65 kt) tropical cyclones by utilizing satellite-measured equivalent blackbody temperatures around the storm, alone and together with the changes in maximum winds during the preceding 24 hours and the current maximum winds. Independent testing of these regression equations shows that the mean errors made by the equations are lower than the errors in forecasts made by persistence techniques.
EMC: Air Quality Forecast Home page
Neuropsychology of selective attention and magnetic cortical stimulation.
Sabatino, M; Di Nuovo, S; Sardo, P; Abbate, C S; La Grutta, V
1996-01-01
Informed volunteers were asked to perform different neuropsychological tests involving selective attention under control conditions and during transcranial magnetic cortical stimulation. The tests chosen involved the recognition of a specific letter among different letters (verbal test) and the search for three different spatial orientations of an appendage to a square (visuo-spatial test). For each test the total time taken and the error rate were calculated. Results showed that cortical stimulation did not cause a worsening in performance. Moreover, magnetic stimulation of the temporal lobe neither modified completion time in both verbal and visuo-spatial tests nor changed error rate. In contrast, magnetic stimulation of the pre-frontal area induced a significant reduction in the performance time of both the verbal and visuo-spatial tests always without an increase in the number of errors. The experimental findings underline the importance of the pre-frontal area in performing tasks requiring a high level of controlled attention and suggest the need to adopt an interdisciplinary approach towards the study of neurone/mind interface mechanisms.
Evaluation of spatial filtering on the accuracy of wheat area estimate
NASA Technical Reports Server (NTRS)
Dejesusparada, N. (Principal Investigator); Moreira, M. A.; Chen, S. C.; Delima, A. M.
1982-01-01
A 3 x 3 pixel spatial filter for postclassification was used for wheat classification to evaluate the effects of this procedure on the accuracy of area estimation using LANDSAT digital data obtained from a single pass. Quantitative analyses were carried out in five test sites (approx 40 sq km each), and t tests showed that filtering with threshold values significantly decreased errors of commission and omission. In area estimation, filtering reduced the overestimate from 4.5% to 2.7%, and the root-mean-square error decreased from 126.18 ha to 107.02 ha. Extrapolating the same procedure of automatic classification using spatial filtering for postclassification to the whole study area, the overestimate was reduced from 10.9% to 9.7%. It is concluded that when single-pass LANDSAT data are used for crop identification and area estimation, the postclassification procedure using a spatial filter provides a more accurate area estimate by reducing classification errors.
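A 3 x 3 postclassification filter of the kind described can be sketched as majority relabelling with a threshold; the grid, class labels, and threshold below are invented for illustration and are not the study's actual parameters:

```python
from collections import Counter

def majority_filter(grid, threshold=5):
    """3x3 post-classification filter: relabel an interior pixel when at
    least `threshold` of the 9 pixels in its window share one class.
    Border pixels are left unchanged in this simple sketch."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            window = [grid[i + di][j + dj]
                      for di in (-1, 0, 1) for dj in (-1, 0, 1)]
            label, count = Counter(window).most_common(1)[0]
            if count >= threshold:
                out[i][j] = label
    return out

# A lone misclassified pixel inside a wheat field (class 1) is relabelled.
field = [[1, 1, 1],
         [1, 0, 1],
         [1, 1, 1]]
print(majority_filter(field)[1][1])   # 1
```

Removing such isolated "salt-and-pepper" pixels is what reduces the commission and omission errors reported in the abstract.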
NASA Astrophysics Data System (ADS)
Song, Lanlan
2017-04-01
Nitrous oxide is a much more potent greenhouse gas than carbon dioxide. However, the estimation of N2O flux is usually clouded with uncertainty, mainly due to high spatial and temporal variations. This also hampers the development of general mechanistic models for N2O emission, as most previously developed models were empirical or exhibited low predictability and relied on numerous assumptions. In this study, we tested General Regression Neural Networks (GRNN) as an alternative to classic empirical models for simulating N2O emission in riparian zones of reservoirs. GRNN and nonlinear regression (NLR) were applied to estimate the N2O flux from one year of observations in riparian zones of the Three Gorges Reservoir. NLR resulted in lower prediction power and higher residuals compared to GRNN. Although the nonlinear regression model estimated similar average values of N2O, it could not capture the fluctuation patterns accurately. In contrast, the GRNN model achieved fairly high predictability, with an R2 of 0.59 for model validation, 0.77 for model calibration (training), and a low root mean square error (RMSE), indicating a high capacity to simulate the dynamics of N2O flux. According to a sensitivity analysis of the GRNN, nonlinear relationships between input variables and N2O flux were well explained. Our results suggest that the GRNN developed in this study performs better in simulating variations in N2O flux than nonlinear regressions.
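A GRNN prediction reduces to a Gaussian-kernel weighted average of the training targets (the Nadaraya-Watson form); the following minimal sketch uses invented data and an illustrative smoothing parameter, not the authors' model:

```python
import math

def grnn_predict(x, train_x, train_y, sigma=1.0):
    """General Regression Neural Network: each training pattern
    contributes a Gaussian weight, and the prediction is the
    weight-normalized average of the training targets."""
    weights = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi))
                        / (2.0 * sigma ** 2)) for xi in train_x]
    return sum(w * y for w, y in zip(weights, train_y)) / sum(weights)

# Toy 1-D problem: flux rises sharply with a single driver variable.
train_x = [(0.0,), (1.0,), (2.0,), (3.0,)]
train_y = [0.1, 0.3, 1.2, 3.5]
pred = grnn_predict((2.0,), train_x, train_y, sigma=0.5)
```

Because the prediction is a smooth interpolation of observed fluxes, a GRNN can track fluctuation patterns that a fixed-form nonlinear regression misses, which matches the comparison reported above.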
Region of influence regression for estimating the 50-year flood at ungaged sites
Tasker, Gary D.; Hodge, S.A.; Barks, C.S.
1996-01-01
Five methods of developing regional regression models to estimate flood characteristics at ungaged sites in Arkansas are examined. The methods differ in the manner in which the State is divided into subregions. Each successive method (A to E) is computationally more complex than the previous method. Method A makes no subdivision. Methods B and C define two and four geographic subregions, respectively. Method D uses cluster/discriminant analysis to define subregions on the basis of similarities in watershed characteristics. Method E, the new region of influence method, defines a unique subregion for each ungaged site. Split-sample results indicate that, in terms of root-mean-square error, method E (38 percent error) is best. Methods C and D (42 and 41 percent error) were in a virtual tie for second, and methods B (44 percent error) and A (49 percent error) were fourth and fifth best.
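The region-of-influence idea of method E can be sketched as nearest-neighbour selection in watershed-characteristic space; the site names and characteristics below are invented, and a real application would standardize the characteristics and then fit the regional regression on the selected subset:

```python
def region_of_influence(target, gauged, k=3):
    """Region-of-influence selection: for an ungaged site with
    characteristics `target`, return the k gauged sites closest in
    watershed-characteristic space (Euclidean distance)."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5
    ranked = sorted(gauged, key=lambda site: dist(target, site["chars"]))
    return [site["name"] for site in ranked[:k]]

# chars = (drainage area index, channel slope index), both invented.
gauged = [
    {"name": "A", "chars": (1.0, 0.2)},
    {"name": "B", "chars": (0.9, 0.3)},
    {"name": "C", "chars": (5.0, 2.0)},
    {"name": "D", "chars": (1.2, 0.1)},
]
print(region_of_influence((1.0, 0.25), gauged, k=2))   # ['A', 'B']
```

Each ungaged site thus gets its own regression dataset, which is why method E defines "a unique subregion for each ungaged site."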
Void Growth and Coalescence Simulations
2013-08-01
distortion and damage, minimum time step, and appropriate material model parameters. Further, a temporal and spatial convergence study was used to...estimate errors, thus, this study helps to provide guidelines for modeling of materials with voids. Finally, we use a Gurson model with Johnson-Cook...
BaTMAn: Bayesian Technique for Multi-image Analysis
NASA Astrophysics Data System (ADS)
Casado, J.; Ascasibar, Y.; García-Benito, R.; Guidi, G.; Choudhury, O. S.; Bellocchi, E.; Sánchez, S. F.; Díaz, A. I.
2016-12-01
Bayesian Technique for Multi-image Analysis (BaTMAn) characterizes any astronomical dataset containing spatial information and performs a tessellation based on the measurements and errors provided as input. The algorithm iteratively merges spatial elements as long as they are statistically consistent with carrying the same information (i.e. identical signal within the errors). The output segmentations successfully adapt to the underlying spatial structure, regardless of its morphology and/or the statistical properties of the noise. BaTMAn identifies (and keeps) all the statistically-significant information contained in the input multi-image (e.g. an IFS datacube). The main aim of the algorithm is to characterize spatially-resolved data prior to their analysis.
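The iterative merging criterion can be sketched in one dimension: merge neighbouring elements while their signals agree within their combined errors. This is an illustrative simplification of BaTMAn's tessellation, not its actual implementation:

```python
def merge_consistent(values, errors, nsigma=1.0):
    """Iteratively merge adjacent cells whose values agree within
    nsigma times the combined error; merged cells keep an
    inverse-variance weighted mean and its propagated error."""
    cells = [(v, e) for v, e in zip(values, errors)]
    merged = True
    while merged:
        merged = False
        for i in range(len(cells) - 1):
            (v1, e1), (v2, e2) = cells[i], cells[i + 1]
            if abs(v1 - v2) <= nsigma * (e1**2 + e2**2) ** 0.5:
                w1, w2 = 1.0 / e1**2, 1.0 / e2**2
                v = (w1 * v1 + w2 * v2) / (w1 + w2)
                e = (w1 + w2) ** -0.5
                cells[i:i + 2] = [(v, e)]
                merged = True
                break
    return cells

# Two flat regions separated by a real jump: merging stops at the edge.
cells = merge_consistent([1.0, 1.1, 0.9, 5.0, 5.1], [0.2] * 5)
print(len(cells))   # 2
```

As in BaTMAn, statistically consistent elements collapse into one cell while genuine structure (the jump) survives, so the segmentation adapts to the underlying signal rather than to a fixed grid.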
Area-to-point regression kriging for pan-sharpening
NASA Astrophysics Data System (ADS)
Wang, Qunming; Shi, Wenzhong; Atkinson, Peter M.
2016-04-01
Pan-sharpening is a technique to combine the fine spatial resolution panchromatic (PAN) band with the coarse spatial resolution multispectral bands of the same satellite to create a fine spatial resolution multispectral image. In this paper, area-to-point regression kriging (ATPRK) is proposed for pan-sharpening. ATPRK considers the PAN band as the covariate. Moreover, ATPRK is extended with a local approach, called adaptive ATPRK (AATPRK), which fits a regression model using a local, non-stationary scheme such that the regression coefficients change across the image. The two geostatistical approaches, ATPRK and AATPRK, were compared to the 13 state-of-the-art pan-sharpening approaches summarized in Vivone et al. (2015) in experiments on three separate datasets. ATPRK and AATPRK produced more accurate pan-sharpened images than the 13 benchmark algorithms in all three experiments. Unlike the benchmark algorithms, the two geostatistical solutions precisely preserved the spectral properties of the original coarse data. Furthermore, ATPRK can be enhanced by the local scheme in AATPRK in cases where the residuals from a global regression model have a spatial character that varies locally.
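The two-step logic of ATPRK (regression on the PAN covariate, then a residual step that makes the fine prediction coherent with the coarse data) can be sketched in 1-D. Here the residual is simply spread evenly within each coarse pixel, whereas ATPRK proper distributes it by area-to-point kriging; all data are invented:

```python
def atprk_sketch(coarse_ms, coarse_pan, fine_pan, block=2):
    """Simplified ATPRK flavour: (1) regress the coarse MS band on the
    coarse PAN band; (2) predict at the fine scale from fine PAN;
    (3) add each coarse residual so block means reproduce the coarse
    MS data exactly (the spectral-preservation property)."""
    n = len(coarse_ms)
    mx = sum(coarse_pan) / n
    my = sum(coarse_ms) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(coarse_pan, coarse_ms))
         / sum((x - mx) ** 2 for x in coarse_pan))
    a = my - b * mx
    fine = [a + b * p for p in fine_pan]
    for i in range(n):                      # coherence with coarse data
        seg = fine[i * block:(i + 1) * block]
        resid = coarse_ms[i] - sum(seg) / block
        for j in range(block):
            fine[i * block + j] += resid
    return fine

coarse_pan = [1.0, 3.0]
coarse_ms = [2.0, 6.0]
fine_pan = [0.5, 1.5, 2.5, 3.5]          # 2 fine pixels per coarse pixel
fine = atprk_sketch(coarse_ms, coarse_pan, fine_pan)
```

The residual step is what "precisely preserved the spectral properties of the original coarse data": averaging the sharpened pixels over a coarse pixel returns the original multispectral value.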
Simulation of wave propagation in three-dimensional random media
NASA Technical Reports Server (NTRS)
Coles, William A.; Filice, J. P.; Frehlich, R. G.; Yadlowsky, M.
1993-01-01
A quantitative error analysis for the simulation of wave propagation in three-dimensional random media, assuming narrow-angle scattering, is presented for plane wave and spherical wave geometries. This includes the errors resulting from finite grid size, finite simulation dimensions, and the separation of the two-dimensional screens along the propagation direction. Simple error scalings are determined for power-law spectra of the random refractive index of the media. The effects of a finite inner scale are also considered. The spatial spectra of the intensity errors are calculated and compared to the spatial spectra of intensity. The numerical requirements for a simulation of given accuracy are determined for realizations of the field. The numerical requirements for accurate estimation of higher moments of the field are less stringent.
Haptic spatial matching in near peripersonal space.
Kaas, Amanda L; Mier, Hanneke I van
2006-04-01
Research has shown that haptic spatial matching at intermanual distances over 60 cm is prone to large systematic errors. The error pattern has been explained by the use of reference frames intermediate between egocentric and allocentric coding. This study investigated haptic performance in near peripersonal space, i.e. at intermanual distances of 60 cm and less. Twelve blindfolded participants (six males and six females) were presented with two turn bars at equal distances from the midsagittal plane, 30 or 60 cm apart. Different orientations (vertical/horizontal or oblique) of the left bar had to be matched by adjusting the right bar to either a mirror symmetric (/ \\) or parallel (/ /) position. The mirror symmetry task can in principle be performed accurately in both an egocentric and an allocentric reference frame, whereas the parallel task requires an allocentric representation. Results showed that parallel matching induced large systematic errors which increased with distance. Overall error was significantly smaller in the mirror task. The task difference also held for the vertical orientation at 60 cm distance, even though this orientation required the same response in both tasks, showing a marked effect of task instruction. In addition, men outperformed women on the parallel task. Finally, contrary to our expectations, systematic errors were found in the mirror task, predominantly at 30 cm distance. Based on these findings, we suggest that haptic performance in near peripersonal space might be dominated by different mechanisms than those which come into play at distances over 60 cm. Moreover, our results indicate that both inter-individual differences and task demands affect task performance in haptic spatial matching. Therefore, we conclude that the study of haptic spatial matching in near peripersonal space might reveal important additional constraints for the specification of adequate models of haptic spatial performance.
Morphological Awareness and Children's Writing: Accuracy, Error, and Invention
McCutchen, Deborah; Stull, Sara
2014-01-01
This study examined the relationship between children's morphological awareness and their ability to produce accurate morphological derivations in writing. Fifth-grade U.S. students (n = 175) completed two writing tasks that invited or required morphological manipulation of words. We examined both accuracy and error, specifically errors in spelling and errors of the sort we termed morphological inventions, which entailed inappropriate, novel pairings of stems and suffixes. Regressions were used to determine the relationship between morphological awareness, morphological accuracy, and spelling accuracy, as well as between morphological awareness and morphological inventions. Linear regressions revealed that morphological awareness uniquely predicted children's generation of accurate morphological derivations, regardless of whether or not accurate spelling was required. A logistic regression indicated that morphological awareness was also uniquely predictive of morphological invention, with higher morphological awareness increasing the probability of morphological invention. These findings suggest that morphological knowledge may not only assist children with spelling during writing, but may also assist with word production via generative experimentation with morphological rules during sentence generation. Implications are discussed for the development of children's morphological knowledge and relationships with writing. PMID:25663748
Gaussian Process Regression Model in Spatial Logistic Regression
NASA Astrophysics Data System (ADS)
Sofro, A.; Oktaviarina, A.
2018-01-01
Spatial analysis has developed rapidly in the last decade. One popular approach is based on the neighbourhood of the region, but it has some limitations, such as difficulty in prediction. We therefore offer Gaussian process regression (GPR) to address this issue. In this paper, we focus on spatial modeling with GPR for binomial data with a logit link function. The performance of the model is investigated, and we discuss inference: how to estimate the parameters and hyper-parameters, and how to predict. Furthermore, simulation studies are explained in the last section.
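The GPR backbone of such a model can be sketched with a squared-exponential kernel; in the spatial logistic setting, the logit link is applied on top of this latent function. Parameter names and data are illustrative, not from the paper:

```python
import numpy as np

def gp_posterior_mean(X, y, X_new, length=1.0, sigma2=1.0, noise=1e-6):
    """Gaussian process regression posterior mean with a
    squared-exponential kernel: m(x*) = k(x*, X) K^{-1} y."""
    def kern(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return sigma2 * np.exp(-0.5 * d2 / length**2)
    K = kern(X, X) + noise * np.eye(len(X))
    return kern(X_new, X) @ np.linalg.solve(K, y)

X = np.array([[0.0], [1.0], [2.0]])   # spatial locations
y = np.array([0.0, 1.0, 0.0])         # latent observations
m = gp_posterior_mean(X, y, np.array([[1.0]]))
```

Unlike neighbourhood-based models, the GP prediction is defined at any new location `X_new`, which is exactly the prediction advantage the abstract claims.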
Doubková, Marcela; Van Dijk, Albert I.J.M.; Sabel, Daniel; Wagner, Wolfgang; Blöschl, Günter
2012-01-01
The Sentinel-1 will carry onboard a C-band radar instrument that will map the European continent once every four days and the global land surface at least once every twelve days with finest 5 × 20 m spatial resolution. The high temporal sampling rate and operational configuration make Sentinel-1 of interest for operational soil moisture monitoring. Currently, updated soil moisture data are made available at 1 km spatial resolution as a demonstration service using Global Mode (GM) measurements from the Advanced Synthetic Aperture Radar (ASAR) onboard ENVISAT. The service demonstrates the potential of the C-band observations to monitor variations in soil moisture. Importantly, a retrieval error estimate is also available; these are needed to assimilate observations into models. The retrieval error is estimated by propagating sensor errors through the retrieval model. In this work, the existing ASAR GM retrieval error product is evaluated using independent top soil moisture estimates produced by the grid-based landscape hydrological model (AWRA-L) developed within the Australian Water Resources Assessment system (AWRA). The ASAR GM retrieval error estimate, an assumed prior AWRA-L error estimate and the variance in the respective datasets were used to spatially predict the root mean square error (RMSE) and the Pearson's correlation coefficient R between the two datasets. These were compared with the RMSE calculated directly from the two datasets. The predicted and computed RMSE showed a very high level of agreement in spatial patterns as well as good quantitative agreement; the RMSE was predicted within accuracy of 4% of saturated soil moisture over 89% of the Australian land mass. Predicted and calculated R maps corresponded within accuracy of 10% over 61% of the continent. The strong correspondence between the predicted and calculated RMSE and R builds confidence in the retrieval error model and derived ASAR GM error estimates. 
The ASAR GM and Sentinel-1 have the same basic physical measurement characteristics, and therefore a very similar retrieval error estimation method can be applied. Because of the expected improvements in the radiometric resolution of the Sentinel-1 backscatter measurements, soil moisture estimation errors can be expected to be an order of magnitude smaller than those for ASAR GM. This opens the possibility of operationally available medium resolution soil moisture estimates with very well-specified errors that can be assimilated into hydrological or crop yield models, with potentially large benefits for land-atmosphere fluxes, crop growth, and water balance monitoring and modelling. PMID:23483015
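The RMSE-prediction step can be illustrated with the standard independence assumption: if the satellite retrieval and the model carry unbiased, independent errors, the expected RMSE between the two datasets is the root-sum-of-squares of their error standard deviations. This is a generic sketch of that idea, not the authors' exact formulation, and the numbers are invented:

```python
def predicted_rmse(sigma_retrieval, sigma_model):
    """Expected RMSE between two estimates of the same signal whose
    errors are independent and unbiased:
    RMSE^2 = sigma_retrieval^2 + sigma_model^2."""
    return (sigma_retrieval**2 + sigma_model**2) ** 0.5

# e.g. retrieval and model errors of 3% and 2.6% of saturation
# combine to an expected RMSE of about 4% of saturation.
rmse = predicted_rmse(0.03, 0.026)
```

Comparing such predicted values against the RMSE computed directly from the two datasets is what builds confidence in the retrieval error model, as described above.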
NASA Astrophysics Data System (ADS)
Peng, F.; Wong, M. S.; Nichol, J. E.; Chan, P. W.
2016-06-01
Rapid urban development between the 1960s and 2010s has changed the urban landscape and pattern in the Kowloon Peninsula of Hong Kong. This paper aims to study the changes in urban morphological parameters between 1985 and 2010 and explore their influences on the urban heat island (UHI) effect. This study applied a mono-window algorithm to retrieve the land surface temperature (LST) using Landsat Thematic Mapper (TM) images from 1987 to 2009. In order to estimate the effects of local urban morphological parameters on LST, the global surface temperature anomaly was analysed. A historical 3D building model was developed based on an aerial photogrammetry technique using aerial photographs from 1964 to 2010, from which urban digital surface models (DSMs) including elevations of infrastructure and buildings were generated. Then, urban morphological parameters, i.e. frontal area index (FAI), sky view factor (SVF), vegetation fractional cover (VFC), global solar radiation (GSR), Normalized Difference Built-Up Index (NDBI), and wind speed, were derived. Finally, a linear regression method in the Waikato Environment for Knowledge Analysis (WEKA) was used to build a prediction model for revealing LST spatial patterns. Results show that the final apparent surface temperatures have uncertainties of less than 1 degree Celsius. The comparison between the simulated and actual spatial pattern of LST in 2009, over 22,429 pixels, showed that the correlation coefficient is 0.65, the mean absolute error (MAE) is 1.24 degrees Celsius, and the root mean square error (RMSE) is 1.51 degrees Celsius.
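The MAE and RMSE used to score the prediction model are straightforward to compute; a minimal sketch with invented temperatures:

```python
import math

def mae_rmse(observed, predicted):
    """Mean absolute error and root mean square error between
    observed and predicted values (same units as the inputs)."""
    errs = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errs) / len(errs)
    rmse = math.sqrt(sum(e * e for e in errs) / len(errs))
    return mae, rmse

obs = [30.0, 31.5, 29.0, 33.0]    # observed LST, degrees Celsius
pred = [31.0, 30.0, 29.5, 34.0]   # modelled LST, degrees Celsius
mae, rmse = mae_rmse(obs, pred)
```

RMSE is always at least as large as MAE and penalizes occasional large misses more heavily, which is why both are reported together.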
Assessing uncertainty in SRTM elevations for global flood modelling
NASA Astrophysics Data System (ADS)
Hawker, L. P.; Rougier, J.; Neal, J. C.; Bates, P. D.
2017-12-01
The SRTM DEM is widely used as the topography input to flood models in data-sparse locations. Understanding spatial error in the SRTM product is crucial in constraining uncertainty about elevations and assessing the impact of these upon flood prediction. An assessment of SRTM error was carried out by Rodriguez et al. (2006), but this did not explicitly quantify the spatial structure of vertical errors in the DEM, nor did it distinguish between errors over different types of landscape. As a result, there is a lack of information about the spatial structure of vertical errors of the SRTM in the landscape that matters most to flood models - the floodplain. Therefore, this study attempts this task by comparing SRTM, an error-corrected SRTM product (the MERIT DEM of Yamazaki et al., 2017) and near-truth LIDAR elevations for 3 deltaic floodplains (Mississippi, Po, Wax Lake) and a large lowland region (the Fens, UK). Using the error covariance function, calculated by comparing SRTM elevations to the near-truth LIDAR, perturbations of the 90m SRTM DEM were generated, producing a catalogue of plausible DEMs. This allows modellers to simulate a suite of plausible DEMs at any aggregated block size above the native SRTM resolution. Finally, the generated DEMs were input into a LISFLOOD-FP hydrodynamic model of the Mekong Delta to assess how DEM error affects the hydrodynamics and inundation extent across the domain. The end product of this is an inundation map with the probability of each pixel being flooded based on the catalogue of DEMs. In a world of increasing computer power, but a lack of detailed datasets, this powerful approach can be used throughout natural hazard modelling to understand how errors in the SRTM DEM can impact the hazard assessment.
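Generating a catalogue of plausible DEMs from an error covariance function amounts to drawing spatially correlated Gaussian fields. The sketch below assumes an exponential covariance with illustrative parameters, whereas the study estimated the covariance empirically from SRTM-LIDAR comparisons:

```python
import numpy as np

def correlated_perturbations(coords, sigma=2.0, corr_len=500.0, seed=0):
    """Draw one spatially correlated vertical-error field for DEM pixels
    at the given (x, y) coordinates, via a Cholesky factor of an
    assumed exponential covariance C(d) = sigma^2 exp(-d / corr_len)."""
    coords = np.asarray(coords, float)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    C = sigma**2 * np.exp(-d / corr_len)
    L = np.linalg.cholesky(C + 1e-9 * np.eye(len(coords)))  # jitter
    rng = np.random.default_rng(seed)
    return L @ rng.standard_normal(len(coords))

# Four pixels 90 m apart: nearby errors come out strongly correlated.
coords = [(0.0, 0.0), (90.0, 0.0), (180.0, 0.0), (270.0, 0.0)]
eps = correlated_perturbations(coords)
```

Adding a different draw `eps` to the DEM for each ensemble member, then running the flood model on each, yields the per-pixel flood probability map described above.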
Liu, Chuanjun; Xiao, Chengli
2018-01-01
The spatial updating and memory systems are employed during updating in both the immediate and retrieved environments. However, these dual systems seem to work differently, as the difference of pointing latency and absolute error between the two systems vary across environments. To verify this issue, the present study employed the bias analysis of signed errors based on the hypothesis that the transformed representation will bias toward the original one. Participants learned a spatial layout and then either stayed in the learning location or were transferred to a neighboring room directly or after being disoriented. After that, they performed spatial judgments from perspectives aligned with the learning direction, aligned with the direction they faced during the test, or a novel direction misaligned with the two above-mentioned directions. The patterns of signed error bias were consistent across environments. Responses for memory aligned perspectives were unbiased, whereas responses for sensorimotor aligned perspectives were biased away from the memory aligned perspective, and responses for misaligned perspectives were biased toward sensorimotor aligned perspectives. These findings indicate that the spatial updating system is consistently independent of the spatial memory system regardless of the environments, but the updating system becomes less accessible as the environment changes from immediate to a retrieved one.
PMID:29467698
Spatiotemporal exposure modeling of ambient erythemal ultraviolet radiation.
VoPham, Trang; Hart, Jaime E; Bertrand, Kimberly A; Sun, Zhibin; Tamimi, Rulla M; Laden, Francine
2016-11-24
Ultraviolet B (UV-B) radiation plays a multifaceted role in human health, inducing DNA damage and representing the primary source of vitamin D for most humans; however, current U.S. UV exposure models are limited in spatial, temporal, and/or spectral resolution. Area-to-point (ATP) residual kriging is a geostatistical method that can be used to create a spatiotemporal exposure model by downscaling from an area- to point-level spatial resolution using fine-scale ancillary data. A stratified ATP residual kriging approach was used to predict average July noon-time erythemal UV (UVEry, mW/m2) biennially from 1998 to 2012 by downscaling National Aeronautics and Space Administration (NASA) Total Ozone Mapping Spectrometer (TOMS) and Ozone Monitoring Instrument (OMI) gridded remote sensing images to a 1 km spatial resolution. Ancillary data were incorporated in random intercept linear mixed-effects regression models. Modeling was performed separately within nine U.S. regions to satisfy stationarity and account for locally varying associations between UVEry and predictors. Cross-validation was used to compare ATP residual kriging models and NASA grids to UV-B Monitoring and Research Program (UVMRP) measurements (gold standard). Predictors included in the final regional models included surface albedo, aerosol optical depth (AOD), cloud cover, dew point, elevation, latitude, ozone, surface incoming shortwave flux, sulfur dioxide (SO2), year, and interactions between year and surface albedo, AOD, cloud cover, dew point, elevation, latitude, and SO2. ATP residual kriging models more accurately estimated UVEry at UVMRP monitoring stations on average compared to NASA grids across the contiguous U.S. (average mean absolute error [MAE] for ATP, NASA: 15.8, 20.3; average root mean square error [RMSE]: 21.3, 25.5).
ATP residual kriging was associated with positive percent relative improvements in MAE (0.6-31.5%) and RMSE (3.6-29.4%) across all regions compared to NASA grids. ATP residual kriging incorporating fine-scale spatial predictors can provide more accurate, high-resolution UVEry estimates compared to using NASA grids and can be used in epidemiologic studies examining the health effects of ambient UV.
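The cross-validation summaries quoted above (MAE and RMSE against the UVMRP gold standard) are straightforward to compute; a minimal sketch, with hypothetical station values rather than the study's data:

```python
import math

def mae(pred, obs):
    """Mean absolute error of model estimates against gold-standard measurements."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)

def rmse(pred, obs):
    """Root mean square error; penalizes large station errors more than MAE."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

# Hypothetical UVEry values (mW/m2) at four monitoring stations
obs = [180.0, 210.0, 195.0, 240.0]   # UVMRP measurements
atp = [175.0, 214.0, 190.0, 235.0]   # model estimates
```

Because RMSE squares the station errors before averaging, it is never smaller than MAE, which is why both are usually reported together.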
Orthogonal Projection in Teaching Regression and Financial Mathematics
ERIC Educational Resources Information Center
Kachapova, Farida; Kachapov, Ilias
2010-01-01
Two improvements in teaching linear regression are suggested. The first is to include the population regression model at the beginning of the topic. The second is to use a geometric approach: to interpret the regression estimate as an orthogonal projection and the estimation error as the distance (which is minimized by the projection). Linear…
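The geometric interpretation suggested here is easy to demonstrate numerically; a minimal sketch (simulated data) showing that the fitted values are the orthogonal projection of y onto the column space of X:

```python
import numpy as np

# Geometric view of least squares: y_hat = P y, where
# P = X (X'X)^{-1} X' is the orthogonal projection (hat) matrix.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(20), rng.normal(size=20)])  # intercept + one regressor
y = 2.0 + 3.0 * X[:, 1] + rng.normal(size=20)            # simulated responses

P = X @ np.linalg.inv(X.T @ X) @ X.T   # projection matrix
y_hat = P @ y                          # regression estimate (projection of y)
resid = y - y_hat                      # estimation error (the minimized distance)

# The projection is idempotent, and the residual is orthogonal to col(X)
assert np.allclose(P @ P, P)
assert np.allclose(X.T @ resid, 0.0)
```

The two assertions are exactly the textbook facts the geometric approach teaches: projecting twice changes nothing, and the error vector is perpendicular to the regressor space.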
Cross, Paul C.; Caillaud, Damien; Heisey, Dennis M.
2013-01-01
Many ecological and epidemiological studies occur in systems with mobile individuals and heterogeneous landscapes. Using a simulation model, we show that the accuracy of inferring an underlying biological process from observational data depends on movement and spatial scale of the analysis. As an example, we focused on estimating the relationship between host density and pathogen transmission. Observational data can result in highly biased inference about the underlying process when individuals move among sampling areas. Even without sampling error, the effect of host density on disease transmission is underestimated by approximately 50 % when one in ten hosts move among sampling areas per lifetime. Aggregating data across larger regions causes minimal bias when host movement is low, and results in less biased inference when movement rates are high. However, increasing data aggregation reduces the observed spatial variation, which would lead to the misperception that a spatially targeted control effort may not be very effective. In addition, averaging over the local heterogeneity will result in underestimating the importance of spatial covariates. Minimizing the bias due to movement is not just about choosing the best spatial scale for analysis, but also about reducing the error associated with using the sampling location as a proxy for an individual’s spatial history. This error associated with the exposure covariate can be reduced by choosing sampling regions with less movement, including longitudinal information of individuals’ movements, or reducing the window of exposure by using repeated sampling or younger individuals.
Patient identification using a near-infrared laser scanner
NASA Astrophysics Data System (ADS)
Manit, Jirapong; Bremer, Christina; Schweikard, Achim; Ernst, Floris
2017-03-01
We propose a new biometric approach where the tissue thickness of a person's forehead is used as a biometric feature. Given that the spatial registration of two 3D laser scans of the same human face usually produces a low error value, the principle of point cloud registration and its error metric can be applied to human classification techniques. However, by only considering the spatial error, it is not possible to reliably verify a person's identity. We propose to use a novel near-infrared laser-based head tracking system to determine an additional feature, the tissue thickness, and include this in the error metric. Using MRI as a ground truth, data from the foreheads of 30 subjects were collected, from which a 4D reference point cloud was created for each subject. The measurements from the near-infrared system were registered with all reference point clouds using the ICP algorithm. Afterwards, the spatial and tissue thickness errors were extracted, forming a 2D feature space. For all subjects, the lowest feature distance resulted from the registration of a measurement and the reference point cloud of the same person. The combined registration error features yielded two clusters in the feature space, one from the same subject and another from the other subjects. When only the tissue thickness error was considered, these clusters were less distinct but still present. These findings could help to raise safety standards for head and neck cancer patients and lay the foundation for a future human identification technique.
Switching between Spatial Stimulus-Response Mappings: A Developmental Study of Cognitive Flexibility
ERIC Educational Resources Information Center
Crone, Eveline A.; Ridderinkhof, K. Richard; Worm, Mijkje; Somsen, Riek J. M.; van der Molen, Maurits W.
2004-01-01
Four different age groups (8-9-year-olds, 11-12-year-olds, 13-15-year-olds and young adults) performed a spatial rule-switch task in which the sorting rule had to be detected on the basis of feedback or on the basis of switch cues. Performance errors were examined on the basis of a recently introduced method of error scoring for the Wisconsin Card…
Ruuska, Salla; Hämäläinen, Wilhelmiina; Kajava, Sari; Mughal, Mikaela; Matilainen, Pekka; Mononen, Jaakko
2018-03-01
The aim of the present study was to evaluate empirically confusion matrices in device validation. We compared the confusion matrix method to linear regression and error indices in the validation of a device measuring feeding behaviour of dairy cattle. In addition, we studied how to extract additional information on classification errors with confusion probabilities. The data consisted of 12 h behaviour measurements from five dairy cows; feeding and other behaviour were detected simultaneously with a device and from video recordings. The resulting 216 000 pairs of classifications were used to construct confusion matrices and calculate performance measures. In addition, hourly durations of each behaviour were calculated and the accuracy of measurements was evaluated with linear regression and error indices. All three validation methods agreed when the behaviour was detected very accurately or inaccurately. Otherwise, in the intermediate cases, the confusion matrix method and error indices produced relatively concordant results, but the linear regression method often disagreed with them. Our study supports the use of confusion matrix analysis in validation since it is robust to any data distribution and type of relationship, it makes a stringent evaluation of validity, and it offers extra information on the type and sources of errors. Copyright © 2018 Elsevier B.V. All rights reserved.
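The confusion-matrix evaluation described above can be sketched in a few lines; the paired labels below are hypothetical, not the study's 216 000 device/video classifications:

```python
def confusion_counts(device, video):
    """Pairwise confusion counts for binary labels (1 = feeding, 0 = other);
    the video observations are treated as the reference classification."""
    tp = sum(1 for d, v in zip(device, video) if d == 1 and v == 1)
    fp = sum(1 for d, v in zip(device, video) if d == 1 and v == 0)
    fn = sum(1 for d, v in zip(device, video) if d == 0 and v == 1)
    tn = sum(1 for d, v in zip(device, video) if d == 0 and v == 0)
    return tp, fp, fn, tn

# Hypothetical second-by-second classifications from device and video
device = [1, 1, 0, 0, 1, 0, 1, 0]
video  = [1, 0, 0, 0, 1, 0, 1, 1]

tp, fp, fn, tn = confusion_counts(device, video)
sensitivity = tp / (tp + fn)        # share of true feeding detected
precision = tp / (tp + fp)          # share of detections that were feeding
accuracy = (tp + tn) / len(video)   # overall agreement
```

Unlike regression of hourly durations, these counts are computed per classification pair, which is why the method remains valid for any data distribution and exposes the type of error (false positives vs. false negatives) directly.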
A provisional effective evaluation when errors are present in independent variables
NASA Technical Reports Server (NTRS)
Gurin, L. S.
1983-01-01
Algorithms are examined for evaluating the parameters of a regression model when there are errors in the independent variables. The algorithms are fast and the estimates they yield are stable with respect to the correlation of errors and measurements of both the dependent variable and the independent variables.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Saeki, Hiroshi, E-mail: saeki@spring8.or.jp; Magome, Tamotsu, E-mail: saeki@spring8.or.jp
2014-10-06
To compensate for pressure-measurement errors caused by a synchrotron radiation environment, a precise method using a hot-cathode-ionization-gauge head with a correcting electrode was developed and tested in a simulation experiment with excess electrons in the SPring-8 storage ring. This precise method to improve the measurement accuracy can correctly reduce the pressure-measurement errors caused by electrons originating from the external environment and from the primary gauge filament influenced by spatial conditions of the installed vacuum-gauge head. As the result of the simulation experiment to confirm the performance in reducing the errors caused by the external environment, the pressure-measurement error using this method was approximately less than several percent in the pressure range from 10^-5 Pa to 10^-8 Pa. After the experiment, to confirm the performance in reducing the error caused by spatial conditions, an additional experiment was carried out using a sleeve and showed that the improved function was available.
Prediction of hourly PM2.5 using a space-time support vector regression model
NASA Astrophysics Data System (ADS)
Yang, Wentao; Deng, Min; Xu, Feng; Wang, Hang
2018-05-01
Real-time air quality prediction has been an active field of research in atmospheric environmental science. The existing methods of machine learning are widely used to predict pollutant concentrations because of their enhanced ability to handle complex non-linear relationships. However, because pollutant concentration data, as typical geospatial data, also exhibit spatial heterogeneity and spatial dependence, they may violate the assumptions of independent and identically distributed random variables in most of the machine learning methods. As a result, a space-time support vector regression model is proposed to predict hourly PM2.5 concentrations. First, to address spatial heterogeneity, spatial clustering is executed to divide the study area into several homogeneous or quasi-homogeneous subareas. To handle spatial dependence, a Gauss vector weight function is then developed to determine spatial autocorrelation variables as part of the input features. Finally, a local support vector regression model with spatial autocorrelation variables is established for each subarea. Experimental data on PM2.5 concentrations in Beijing are used to verify whether the results of the proposed model are superior to those of other methods.
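The idea of a Gaussian distance-decay weight used to build spatial-autocorrelation input features can be sketched as follows; the function names, bandwidth value, and station data are illustrative assumptions, not taken from the paper:

```python
import math

def gauss_weight(d, bandwidth):
    """Gaussian distance-decay weight: nearby stations count more."""
    return math.exp(-(d ** 2) / (2.0 * bandwidth ** 2))

def spatial_autocorr_feature(target_xy, neighbors, bandwidth=10.0):
    """Weighted average of neighboring stations' PM2.5 values, with weights
    decaying with distance from the target location; usable as one of the
    spatial-autocorrelation input features of a local regression model."""
    num = den = 0.0
    for x, y, pm25 in neighbors:
        d = math.hypot(x - target_xy[0], y - target_xy[1])
        w = gauss_weight(d, bandwidth)
        num += w * pm25
        den += w
    return num / den

# Hypothetical neighboring stations: (x km, y km, PM2.5 in ug/m3)
feature = spatial_autocorr_feature((0.0, 0.0), [(3.0, 4.0, 50.0), (6.0, 8.0, 80.0)])
```

The resulting feature lies between the neighbors' values but closer to the nearer station, which is how spatial dependence enters the local support vector regression as an extra input.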
Mathematics skills in good readers with hydrocephalus.
Barnes, Marcia A; Pengelly, Sarah; Dennis, Maureen; Wilkinson, Margaret; Rogers, Tracey; Faulkner, Heather
2002-01-01
Children with hydrocephalus have poor math skills. We investigated the nature of their arithmetic computation errors by comparing written subtraction errors in good readers with hydrocephalus, typically developing good readers of the same age, and younger children matched for math level to the children with hydrocephalus. Children with hydrocephalus made more procedural errors (although not more fact retrieval or visual-spatial errors) than age-matched controls; they made the same number of procedural errors as younger, math-level matched children. We also investigated a broad range of math abilities, and found that children with hydrocephalus performed more poorly than age-matched controls on tests of geometry and applied math skills such as estimation and problem solving. Computation deficits in children with hydrocephalus reflect delayed development of procedural knowledge. Problems in specific math domains, such as geometry and applied math, were associated with deficits in constituent cognitive skills such as visual-spatial competence, memory, and general knowledge.
Adaptive optics system performance approximations for atmospheric turbulence correction
NASA Astrophysics Data System (ADS)
Tyson, Robert K.
1990-10-01
Analysis of adaptive optics system behavior often can be reduced to a few approximations and scaling laws. For atmospheric turbulence correction, the deformable mirror (DM) fitting error is most often used to determine a priori the interactuator spacing and the total number of correction zones required. This paper examines the mirror fitting error in terms of its most commonly used exponential form. The explicit constant in the error term is dependent on deformable mirror influence function shape and actuator geometry. The method of least squares fitting of discrete influence functions to the turbulent wavefront is compared to the linear spatial filtering approximation of system performance. It is found that the spatial filtering method overestimates the correctability of the adaptive optics system by a small amount. By evaluating fitting error for a number of DM configurations, actuator geometries, and influence functions, fitting error constants verify some earlier investigations.
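The "most commonly used exponential form" of the fitting error is the scaling law sigma^2 = a_F (d/r0)^(5/3); a quick numerical sketch, where the constant 0.28 is an assumed, commonly quoted value for continuous-facesheet mirrors rather than a result of this paper:

```python
def fitting_error_variance(d, r0, a_f=0.28):
    """Residual wavefront variance (rad^2) after DM correction:
    sigma^2 = a_F * (d / r0)^(5/3), where d is the interactuator spacing,
    r0 is Fried's parameter, and a_F depends on influence-function shape
    and actuator geometry (0.28 is an assumed illustrative value here)."""
    return a_f * (d / r0) ** (5.0 / 3.0)

# Halving the actuator spacing relative to r0 cuts the variance by 2^(5/3)
var_coarse = fitting_error_variance(0.2, 0.1)
var_fine = fitting_error_variance(0.1, 0.1)
```

Inverting this law for a target residual variance is exactly how the interactuator spacing and number of correction zones are chosen a priori.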
Performance evaluation of spatial compounding in the presence of aberration and adaptive imaging
NASA Astrophysics Data System (ADS)
Dahl, Jeremy J.; Guenther, Drake; Trahey, Gregg E.
2003-05-01
Spatial compounding has been used for years to reduce speckle in ultrasonic images and to resolve anatomical features hidden behind the grainy appearance of speckle. Adaptive imaging restores image contrast and resolution by compensating for beamforming errors caused by tissue-induced phase errors. Spatial compounding represents a form of incoherent imaging, whereas adaptive imaging attempts to maintain a coherent, diffraction-limited aperture in the presence of aberration. Using a Siemens Antares scanner, we acquired single channel RF data on a commercially available 1-D probe. Individual channel RF data was acquired on a cyst phantom in the presence of a near field electronic phase screen. Simulated data was also acquired for both a 1-D and a custom built 8x96, 1.75-D probe (Tetrad Corp.). The data was compounded using a receive spatial compounding algorithm; a widely used algorithm because it takes advantage of parallel beamforming to avoid reductions in frame rate. Phase correction was also performed by using a least mean squares algorithm to estimate the arrival time errors. We present simulation and experimental data comparing the performance of spatial compounding to phase correction in contrast and resolution tasks. We evaluate spatial compounding and phase correction, and combinations of the two methods, under varying aperture sizes, aperture overlaps, and aberrator strength to examine the optimum configuration and conditions in which spatial compounding will provide a similar or better result than adaptive imaging. We find that, in general, phase correction is hindered at high aberration strengths and spatial frequencies, whereas spatial compounding is helped by these aberrators.
A priori discretization error metrics for distributed hydrologic modeling applications
NASA Astrophysics Data System (ADS)
Liu, Hongli; Tolson, Bryan A.; Craig, James R.; Shafii, Mahyar
2016-12-01
Watershed spatial discretization is an important step in developing a distributed hydrologic model. A key difficulty in the spatial discretization process is maintaining a balance between the aggregation-induced information loss and the increase in computational burden caused by the inclusion of additional computational units. Objective identification of an appropriate discretization scheme still remains a challenge, in part because of the lack of quantitative measures for assessing discretization quality, particularly prior to simulation. This study proposes a priori discretization error metrics to quantify the information loss of any candidate discretization scheme without having to run and calibrate a hydrologic model. These error metrics are applicable to multi-variable and multi-site discretization evaluation and provide directly interpretable information to the hydrologic modeler about discretization quality. The first metric, a subbasin error metric, quantifies the routing information loss from discretization, and the second, a hydrological response unit (HRU) error metric, improves upon existing a priori metrics by quantifying the information loss due to changes in land cover or soil type property aggregation. The metrics are straightforward to understand and easy to recode. Informed by the error metrics, a two-step discretization decision-making approach is proposed with the advantage of reducing extreme errors and meeting the user-specified discretization error targets. The metrics and decision-making approach are applied to the discretization of the Grand River watershed in Ontario, Canada. Results show that information loss increases as discretization gets coarser. Moreover, results help to explain the modeling difficulties associated with smaller upstream subbasins since the worst discretization errors and highest error variability appear in smaller upstream areas instead of larger downstream drainage areas. 
Hydrologic modeling experiments under candidate discretization schemes validate the strong correlation between the proposed discretization error metrics and hydrologic simulation responses. Discretization decision-making results show that the common and convenient approach of making uniform discretization decisions across the watershed performs worse than the proposed non-uniform discretization approach in terms of preserving spatial heterogeneity under the same computational cost.
Babcock, Chad; Finley, Andrew O.; Bradford, John B.; Kolka, Randall K.; Birdsey, Richard A.; Ryan, Michael G.
2015-01-01
Many studies and production inventory systems have shown the utility of coupling covariates derived from Light Detection and Ranging (LiDAR) data with forest variables measured on georeferenced inventory plots through regression models. The objective of this study was to propose and assess the use of a Bayesian hierarchical modeling framework that accommodates both residual spatial dependence and non-stationarity of model covariates through the introduction of spatial random effects. We explored this objective using four forest inventory datasets that are part of the North American Carbon Program, each comprising point-referenced measures of above-ground forest biomass and discrete LiDAR. For each dataset, we considered at least five regression model specifications of varying complexity. Models were assessed based on goodness of fit criteria and predictive performance using a 10-fold cross-validation procedure. Results showed that the addition of spatial random effects to the regression model intercept improved fit and predictive performance in the presence of substantial residual spatial dependence. Additionally, in some cases, allowing either some or all regression slope parameters to vary spatially, via the addition of spatial random effects, further improved model fit and predictive performance. In other instances, models showed improved fit but decreased predictive performance—indicating over-fitting and underscoring the need for cross-validation to assess predictive ability. The proposed Bayesian modeling framework provided access to pixel-level posterior predictive distributions that were useful for uncertainty mapping, diagnosing spatial extrapolation issues, revealing missing model covariates, and discovering locally significant parameters.
Fully automatic cervical vertebrae segmentation framework for X-ray images.
Al Arif, S M Masudur Rahman; Knapp, Karen; Slabaugh, Greg
2018-04-01
The cervical spine is a highly flexible anatomy and therefore vulnerable to injuries. Unfortunately, a large number of injuries in lateral cervical X-ray images remain undiagnosed due to human errors. Computer-aided injury detection has the potential to reduce the risk of misdiagnosis. Towards building an automatic injury detection system, in this paper, we propose a deep learning-based fully automatic framework for segmentation of cervical vertebrae in X-ray images. The framework first localizes the spinal region in the image using a deep fully convolutional neural network. Then vertebra centers are localized using a novel deep probabilistic spatial regression network. Finally, a novel shape-aware deep segmentation network is used to segment the vertebrae in the image. The framework can take an X-ray image and produce a vertebrae segmentation result without any manual intervention. Each block of the fully automatic framework has been trained on a set of 124 X-ray images and tested on another 172 images, all collected from real-life hospital emergency rooms. A Dice similarity coefficient of 0.84 and a shape error of 1.69 mm have been achieved. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Khaleghi, Mohammad Reza; Varvani, Javad
2018-02-01
The complex and variable nature of river sediment yield causes many problems in estimating the long-term sediment yield and the sediment input into reservoirs. Sediment Rating Curves (SRCs) are generally used to estimate the suspended sediment load of rivers and drainage watersheds. Since the regression equations of SRCs are obtained by logarithmic retransformation and have a single independent variable, they overestimate or underestimate the true sediment load of the rivers. To evaluate the bias correction factors in the Kalshor and Kashafroud watersheds, seven hydrometric stations of this region with suitable upstream watershed and spatial distribution were selected. Investigation of the accuracy index (ratio of estimated sediment yield to observed sediment yield) and the precision index of different bias correction factors of FAO, Quasi-Maximum Likelihood Estimator (QMLE), Smearing, and Minimum-Variance Unbiased Estimator (MVUE) with the LSD test showed that the FAO coefficient increases the estimation error in all of the stations. Application of MVUE in linear and mean load rating curves has no statistically meaningful effect. QMLE and smearing factors increased the estimation error in the mean load rating curve, but have no effect on linear rating curve estimation.
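The log-space rating curve and two of the bias-correction factors named above (smearing and QMLE) can be sketched as follows; this is a simplified illustration of the standard corrections, not the study's exact procedure:

```python
import math

def fit_log_rating_curve(q, qs):
    """Least-squares fit of log10(Qs) = a + b*log10(Q); returns (a, b, residuals)."""
    x = [math.log10(v) for v in q]
    y = [math.log10(v) for v in qs]
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
        sum((xi - xbar) ** 2 for xi in x)
    a = ybar - b * xbar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    return a, b, resid

def smearing_factor(resid):
    """Duan's smearing correction: mean of the back-transformed log10 residuals."""
    return sum(10.0 ** r for r in resid) / len(resid)

def qmle_factor(resid):
    """Quasi-maximum-likelihood correction exp(s^2 / 2), with s^2 the residual
    variance expressed in natural-log units."""
    ln10 = math.log(10.0)
    s2 = sum((r * ln10) ** 2 for r in resid) / (len(resid) - 2)
    return math.exp(s2 / 2.0)

# Corrected load estimate: Qs_hat = factor * 10 ** (a + b * log10(Q))
```

Both factors multiply the naively back-transformed estimate 10^(a + b log10 Q); when the log-space fit is perfect (zero residuals) they reduce to 1, i.e., no correction.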
Triki Fourati, Hela; Bouaziz, Moncef; Benzina, Mourad; Bouaziz, Samir
2017-04-01
Traditional surveying methods of soil properties over landscapes are dramatically costly and time-consuming. Thus, remote sensing is a proper choice for monitoring environmental problems. This research aims to study the effect of environmental factors on soil salinity and to map the spatial distribution of this salinity over the southern east part of Tunisia by means of remote sensing and geostatistical techniques. For this purpose, we used Advanced Spaceborne Thermal Emission and Reflection Radiometer data to depict geomorphological parameters: elevation, slope, plan curvature (PLC), profile curvature (PRC), and aspect. Pearson correlation between these parameters and soil electrical conductivity (ECsoil) showed that mainly slope and elevation affect the concentration of salt in soil. Moreover, spectral analysis illustrated the high potential of short-wave infrared (SWIR) bands to identify saline soils. To map soil salinity in southern Tunisia, ordinary kriging (OK), minimum distance (MD) classification, and simple regression (SR) were used. The findings showed that the ordinary kriging technique provides the most reliable performance to identify and classify saline soils over the study area, with a root mean square error of 1.83 and a mean error of 0.018.
Zhang, Ling Yu; Liu, Zhao Gang
2017-12-01
Based on the data collected from 108 permanent plots of the forest resources survey in Maoershan Experimental Forest Farm during 2004-2016, this study investigated the spatial distribution of recruitment trees in natural secondary forest by global Poisson regression and geographically weighted Poisson regression (GWPR) with four bandwidths of 2.5, 5, 10 and 15 km. The simulation effects of the 5 regressions and the factors influencing the recruitment trees in stands were analyzed, and the spatial autocorrelation of the regression residuals was described on global and local levels using Moran's I. The results showed that the spatial distribution of the number of recruitment trees in natural secondary forest was significantly influenced by stand and topographic factors, especially average DBH. The GWPR model at small scale (2.5 km) had high accuracy of model fitting, a large range of model parameter estimates was generated, and the localized spatial distribution effect of the model parameters was obtained. The GWPR model at small scale (2.5 and 5 km) produced a small range of model residuals, and the stability of the model was improved. The global spatial autocorrelation of the GWPR model residual at the small scale (2.5 km) was the lowest, and the local spatial autocorrelation was significantly reduced, in which an ideal spatial distribution pattern of small clusters with different observations was formed. The local model at small scale (2.5 km) was much better than the global model in the simulation effect on the spatial distribution of recruitment tree number.
Bonilla, M.G.; Mark, R.K.; Lienkaemper, J.J.
1984-01-01
In order to refine correlations of surface-wave magnitude, fault rupture length at the ground surface, and fault displacement at the surface by including the uncertainties in these variables, the existing data were critically reviewed and a new data base was compiled. Earthquake magnitudes were redetermined as necessary to make them as consistent as possible with the Gutenberg methods and results, which necessarily make up much of the data base. Measurement errors were estimated for the three variables for 58 moderate to large shallow-focus earthquakes. Regression analyses were then made utilizing the estimated measurement errors. The regression analysis demonstrates that the relations among the variables magnitude, length, and displacement are stochastic in nature. The stochastic variance, introduced in part by incomplete surface expression of seismogenic faulting, variation in shear modulus, and regional factors, dominates the estimated measurement errors. Thus, it is appropriate to use ordinary least squares for the regression models, rather than regression models based upon an underlying deterministic relation with the variance resulting from measurement errors. Significant differences exist in correlations of certain combinations of length, displacement, and magnitude when events are grouped by fault type or by region, including attenuation regions delineated by Evernden and others. Subdivision of the data results in too few data for some fault types and regions, and for these only regressions using all of the data as a group are reported. Estimates of the magnitude and the standard deviation of the magnitude of a prehistoric or future earthquake associated with a fault can be made by correlating M with the logarithms of rupture length, fault displacement, or the product of length and displacement. Fault rupture area could be reliably estimated for about 20 of the events in the data set.
Regression of MS on rupture area did not result in a marked improvement over regressions that did not involve rupture area. Because no subduction-zone earthquakes are included in this study, the reported results do not apply to such zones.
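The kind of ordinary-least-squares correlation described (magnitude against the logarithm of rupture length) can be sketched directly; the length-magnitude pairs below are hypothetical illustrations, not the study's data set:

```python
import numpy as np

# Hypothetical (surface rupture length in km, surface-wave magnitude) pairs
L = np.array([10.0, 30.0, 60.0, 100.0, 300.0])
M = np.array([6.0, 6.6, 7.0, 7.2, 7.8])

# Ordinary least squares: M = a + b * log10(L)
b, a = np.polyfit(np.log10(L), M, 1)
pred = a + b * np.log10(L)
sd = np.std(M - pred, ddof=2)  # standard deviation about the regression
```

The standard deviation about the regression is the quantity that would accompany a magnitude estimate for a prehistoric or future earthquake on a fault of known rupture length.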
Comparing spatial regression to random forests for large environmental data sets
Environmental data may be “large” due to number of records, number of covariates, or both. Random forests has a reputation for good predictive performance when using many covariates, whereas spatial regression, when using reduced rank methods, has a reputatio...
Buonaccorsi, John P; Dalen, Ingvild; Laake, Petter; Hjartåker, Anette; Engeset, Dagrun; Thoresen, Magne
2015-04-15
Measurement error occurs when we observe error-prone surrogates, rather than true values. It is common in observational studies and especially so in epidemiology, in nutritional epidemiology in particular. Correcting for measurement error has become common, and regression calibration is the most popular way to account for measurement error in continuous covariates. We consider its use in the context where there are validation data, which are used to calibrate the true values given the observed covariates. We allow for the case that the true value itself may not be observed in the validation data, but instead, a so-called reference measure is observed. The regression calibration method relies on certain assumptions. This paper examines possible biases in regression calibration estimators when some of these assumptions are violated. More specifically, we allow for the fact that (i) the reference measure may not necessarily be an 'alloyed gold standard' (i.e., unbiased) for the true value; (ii) there may be correlated random subject effects contributing to the surrogate and reference measures in the validation data; and (iii) the calibration model itself may not be the same in the validation study as in the main study; that is, it is not transportable. We expand on previous work to provide a general result, which characterizes potential bias in the regression calibration estimators as a result of any combination of the violations aforementioned. We then illustrate some of the general results with data from the Norwegian Women and Cancer Study. Copyright © 2015 John Wiley & Sons, Ltd.
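Regression calibration as described can be sketched in a few lines; the simulated surrogate, reference measure, and least-squares calibration below illustrate the general idea only, not the paper's estimator or its bias analysis:

```python
import numpy as np

rng = np.random.default_rng(1)

# Validation data: an error-prone surrogate w and a (near-unbiased) reference
# measure x_ref are both observed; the calibration model predicts X from W.
n_val = 200
x_true = rng.normal(10.0, 2.0, n_val)
w_val = x_true + rng.normal(0.0, 1.0, n_val)    # surrogate with classical error
x_ref = x_true + rng.normal(0.0, 0.2, n_val)    # reference measure

lam, mu = np.polyfit(w_val, x_ref, 1)           # calibration slope and intercept

# Main study: each observed surrogate is replaced by its calibrated value,
# which is then used as the covariate in the outcome regression.
w_main = rng.normal(10.0, 2.0, 500) + rng.normal(0.0, 1.0, 500)
x_calibrated = mu + lam * w_main
```

The calibration slope is below 1 because classical measurement error attenuates the surrogate; the paper's concern is what happens when the reference measure itself is biased, correlated with the surrogate, or non-transportable between studies.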
Refractive regression after laser in situ keratomileusis.
Yan, Mabel K; Chang, John Sm; Chan, Tommy Cy
2018-04-26
Uncorrected refractive errors are a leading cause of visual impairment across the world. In today's society, laser in situ keratomileusis (LASIK) has become the most commonly performed surgical procedure to correct refractive errors. However, regression of the initially achieved refractive correction has been a widely observed phenomenon following LASIK since its inception more than two decades ago. Despite technological advances in laser refractive surgery and various proposed management strategies, post-LASIK regression is still frequently observed and has significant implications for the long-term visual performance and quality of life of patients. This review explores the mechanism of refractive regression after both myopic and hyperopic LASIK, predisposing risk factors and its clinical course. In addition, current preventative strategies and therapies are also reviewed. © 2018 Royal Australian and New Zealand College of Ophthalmologists.
Estimating the Standard Error of Robust Regression Estimates.
1987-03-01
error is O(n^(4/5)). In another Monte Carlo study, McKean and Schrader (1984) found that the tests resulting from studentizing by d^(1/2) with d = O(n^(4/5))... Sheather, S. J. and McKean, J. W. (1987). A comparison of testing and... Wiley, New York. Welsch, R. E. (1980). Regression Sensitivity Analysis and Bounded-Influence Estimation, in Evaluation of Econometric Models, eds. J...
NASA Astrophysics Data System (ADS)
Kargoll, Boris; Omidalizarandi, Mohammad; Loth, Ina; Paffenholz, Jens-André; Alkhatib, Hamza
2018-03-01
In this paper, we investigate a linear regression time series model of possibly outlier-afflicted observations and autocorrelated random deviations. This colored noise is represented by a covariance-stationary autoregressive (AR) process, in which the independent error components follow a scaled (Student's) t-distribution. This error model allows for the stochastic modeling of multiple outliers and for an adaptive robust maximum likelihood (ML) estimation of the unknown regression and AR coefficients, the scale parameter, and the degree of freedom of the t-distribution. This approach is meant to be an extension of known estimators, which tend to focus only on the regression model, or on the AR error model, or on normally distributed errors. For the purpose of ML estimation, we derive an expectation conditional maximization either (ECME) algorithm, which leads to an easy-to-implement version of iteratively reweighted least squares. The estimation performance of the algorithm is evaluated via Monte Carlo simulations for a Fourier as well as a spline model in connection with AR colored noise models of different orders and with three different sampling distributions generating the white noise components. We apply the algorithm to a vibration dataset recorded by a high-accuracy, single-axis accelerometer, focusing on the evaluation of the estimated AR colored noise model.
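The IRLS core of such an estimator is compact. The sketch below is a simplified stand-in for the paper's ECME algorithm: it fixes the degrees of freedom nu, omits the AR coefficients, and reweights residuals with the standard t-weights w_i = (nu+1)/(nu + r_i^2/sigma^2); all names and simulation settings are assumptions:

```python
import numpy as np

def t_irls(X, y, nu=4.0, n_iter=50):
    """Linear regression with scaled t errors via IRLS (fixed d.o.f. nu).

    Simplified sketch of the EM idea: heavy-tailed residuals receive the
    weight w_i = (nu + 1) / (nu + r_i^2 / sigma^2), so gross outliers are
    strongly down-weighted instead of dominating the fit.
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]       # OLS start
    sigma2 = np.mean((y - X @ beta) ** 2)
    for _ in range(n_iter):
        r = y - X @ beta
        w = (nu + 1.0) / (nu + r ** 2 / sigma2)
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        sigma2 = np.sum(w * (y - X @ beta) ** 2) / len(y)
    return beta

rng = np.random.default_rng(1)
n = 400
X = np.column_stack([np.ones(n), np.linspace(0, 1, n)])
y = X @ np.array([1.0, 2.0]) + rng.standard_t(3, n) * 0.3
y[::40] += 10.0                      # a few gross outliers
beta = t_irls(X, y)                  # close to (1.0, 2.0) despite outliers
```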
A stream-gaging network analysis for the 7-day, 10-year annual low flow in New Hampshire streams
Flynn, Robert H.
2003-01-01
The 7-day, 10-year (7Q10) low-flow-frequency statistic is a widely used measure of surface-water availability in New Hampshire. Regression equations and basin-characteristic digital data sets were developed to help water-resource managers determine surface-water resources during periods of low flow in New Hampshire streams. These regression equations and data sets were developed to estimate streamflow statistics for the annual and seasonal low-flow frequencies, and for period-of-record and seasonal period-of-record flow durations. Generalized-least-squares (GLS) regression methods were used to develop the annual 7Q10 low-flow-frequency regression equation from 60 continuous-record stream-gaging stations in New Hampshire and in neighboring States. In the regression equation, the dependent variables were the annual 7Q10 flows at the 60 stream-gaging stations. The independent (or predictor) variables were objectively selected characteristics of the drainage basins that contribute flow to those stations. In contrast to ordinary-least-squares (OLS) regression analysis, GLS-developed estimating equations account for differences in length of record and spatial correlations among the flow-frequency statistics at the various stations. A total of 93 measurable drainage-basin characteristics were candidate independent variables.
On the basis of several statistical parameters that were used to evaluate which combination of basin characteristics contributes the most to the predictive power of the equations, three drainage-basin characteristics were determined to be statistically significant predictors of the annual 7Q10: (1) total drainage area, (2) mean summer stream-gaging station precipitation from 1961 to 1990, and (3) average mean annual basinwide temperature from 1961 to 1990. To evaluate the effectiveness of the stream-gaging network in providing regional streamflow data for the annual 7Q10, the computer program GLSNET (generalized-least-squares NETwork) was used to analyze the network by application of GLS regression between streamflow and the climatic and basin characteristics of the drainage basin upstream from each stream-gaging station. Improvement to the predictive ability of the regression equations developed for the network analyses is measured by the reduction in the average sampling-error variance, and can be achieved by collecting additional streamflow data at existing stations. The predictive ability of the regression equations is enhanced even further with the addition of new stations to the network. Continued data collection at unregulated stream-gaging stations with less than 14 years of record resulted in the greatest cost-weighted reduction in the average sampling-error variance of the annual 7Q10 regional regression equation. The addition of new stations in basins with underrepresented values for the independent variables of total drainage area, average mean annual basinwide temperature, or mean summer stream-gaging station precipitation in the annual 7Q10 regression equation yielded a much greater cost-weighted reduction in the average sampling-error variance than when more data were collected at existing unregulated stations.
To maximize the regional information obtained from the stream-gaging network for the annual 7Q10, ranking of the streamflow data can be used to determine whether an active station should be continued or if a new or discontinued station should be activated for streamflow data collection. Thus, this network analysis can help determine the costs and benefits of continuing the operation of a particular station or activating a new station at another location to predict the 7Q10 at ungaged stream reaches. The decision to discontinue an existing station or activate a new station, however, must also consider its contribution to other water-resource analyses such as flood management, water quality, or trends in land use or climatic change.
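At its core, GLS estimation replaces the OLS normal equations with error-covariance-weighted ones, beta = (X' W X)^-1 X' W y with W the inverse of the error covariance. A minimal sketch, with a purely diagonal covariance standing in for GLSNET's full record-length and cross-correlation structure (all numbers are illustrative, not New Hampshire gage data):

```python
import numpy as np

def gls(X, y, Omega):
    """Generalized least squares: solve (X' W X) beta = X' W y, W = Omega^-1."""
    W = np.linalg.inv(Omega)
    XtW = X.T @ W
    return np.linalg.solve(XtW @ X, XtW @ y)

rng = np.random.default_rng(2)
n = 60                                     # e.g. 60 gaging stations
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])  # e.g. log drainage area
# Unequal record lengths give unequal error variances; a full GLS network
# analysis would also put cross-correlations on the off-diagonal.
Omega = np.diag(rng.uniform(0.5, 2.0, n))
y = X @ np.array([0.5, 0.8]) + np.linalg.cholesky(Omega) @ rng.normal(size=n)
beta = gls(X, y, Omega)                    # weighted fit, close to (0.5, 0.8)
```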
Effects of Head Rotation on Space- and Word-Based Reading Errors in Spatial Neglect
ERIC Educational Resources Information Center
Reinhart, Stefan; Keller, Ingo; Kerkhoff, Georg
2010-01-01
Patients with right hemisphere lesions often omit or misread words on the left side of a text or the beginning letters of single words, which is termed neglect dyslexia (ND). Two types of reading errors are typically observed in ND: omissions and word-based reading errors. The former are considered space-based omission errors on the…
Estimating pixel variances in the scenes of staring sensors
Simonson, Katherine M. [Cedar Crest, NM]; Ma, Tian J. [Albuquerque, NM]
2012-01-24
A technique for detecting changes in a scene perceived by a staring sensor is disclosed. The technique includes acquiring a reference image frame and a current image frame of a scene with the staring sensor. A raw difference frame is generated based upon differences between the reference image frame and the current image frame. Pixel error estimates are generated for each pixel in the raw difference frame based at least in part upon spatial error estimates related to spatial intensity gradients in the scene. The pixel error estimates are used to mitigate effects of camera jitter in the scene between the current image frame and the reference image frame.
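One way to read "spatial error estimates related to spatial intensity gradients" is that a sub-pixel jitter of a fraction of a pixel perturbs each pixel's value by roughly the local intensity gradient times the shift. A hedged numpy sketch of that idea (the function name and jitter magnitude are assumptions for illustration, not the patented method):

```python
import numpy as np

def pixel_error_estimates(ref, jitter_px=0.5):
    """Per-pixel spatial error estimate: |grad(ref)| * assumed jitter (pixels).

    Illustrative sketch: a sub-pixel shift of size jitter_px changes a pixel
    by roughly the local intensity gradient times the shift, so differences
    near sharp edges should be discounted when detecting real scene changes.
    """
    gy, gx = np.gradient(ref.astype(float))   # axis-0 and axis-1 derivatives
    return np.hypot(gx, gy) * jitter_px

ref = np.zeros((8, 8))
ref[:, 4:] = 100.0                  # a sharp vertical edge in the scene
err = pixel_error_estimates(ref)
# err peaks along the edge and vanishes in flat regions of the scene
```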
NASA Astrophysics Data System (ADS)
Norajitra, Tobias; Meinzer, Hans-Peter; Maier-Hein, Klaus H.
2015-03-01
During image segmentation, 3D Statistical Shape Models (SSM) usually conduct a limited search for target landmarks within one-dimensional search profiles perpendicular to the model surface. In addition, landmark appearance is modeled only locally based on linear profiles and weak learners, altogether leading to segmentation errors from landmark ambiguities and limited search coverage. We present a new method for 3D SSM segmentation based on 3D Random Forest Regression Voting. For each surface landmark, a Random Regression Forest is trained that learns a 3D spatial displacement function between the according reference landmark and a set of surrounding sample points, based on an infinite set of non-local randomized 3D Haar-like features. Landmark search is then conducted omni-directionally within 3D search spaces, where voxelwise forest predictions on landmark position contribute to a common voting map which reflects the overall position estimate. Segmentation experiments were conducted on a set of 45 CT volumes of the human liver, of which 40 images were randomly chosen for training and 5 for testing. Without parameter optimization, using a simple candidate selection and a single resolution approach, excellent results were achieved, while faster convergence and better concavity segmentation were observed, altogether underlining the potential of our approach in terms of increased robustness from distinct landmark detection and from better search coverage.
Veazey, Lindsay M; Franklin, Erik C; Kelley, Christopher; Rooney, John; Frazer, L Neil; Toonen, Robert J
2016-01-01
Predictive habitat suitability models are powerful tools for cost-effective, statistically robust assessment of the environmental drivers of species distributions. The aim of this study was to develop predictive habitat suitability models for two genera of scleractinian corals (Leptoseris and Montipora) found within the mesophotic zone across the main Hawaiian Islands. The mesophotic zone (30-180 m) is challenging to reach, and therefore historically understudied, because it falls between the maximum limit of SCUBA divers and the minimum typical working depth of submersible vehicles. Here, we implement a logistic regression with rare events corrections to account for the scarcity of presence observations within the dataset. These corrections reduced the coefficient error and improved overall prediction success (73.6% and 74.3%) for both original regression models. The final models included depth, rugosity, slope, mean current velocity, and wave height as the best environmental covariates for predicting the occurrence of the two genera in the mesophotic zone. Using an objectively selected theta ("presence") threshold, the predicted presence probability values (average of 0.051 for Leptoseris and 0.040 for Montipora) were translated to spatially-explicit habitat suitability maps of the main Hawaiian Islands at 25 m grid cell resolution. Our maps are the first of their kind to use extant presence and absence data to examine the habitat preferences of these two dominant mesophotic coral genera across Hawai'i.
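A common rare-events correction (in the spirit of King and Zeng) fits the logistic model on a balanced subsample and then shifts the intercept back to the population prevalence. A self-contained numpy sketch on simulated presence/absence data (covariate names and all coefficients are illustrative assumptions, not the study's fitted model):

```python
import numpy as np

def fit_logit(X, y, n_iter=25):
    """Maximum-likelihood logistic regression via Newton/IRLS (numpy only)."""
    Xd = np.column_stack([np.ones(len(y)), X])
    b = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ b))
        b += np.linalg.solve((Xd.T * (p * (1 - p))) @ Xd, Xd.T @ (y - p))
    return b

rng = np.random.default_rng(3)
n = 20000
X = rng.normal(size=(n, 2))                      # e.g. depth, rugosity
y = rng.binomial(1, 1 / (1 + np.exp(-(-3.0 + X[:, 0]))))
tau = y.mean()                                   # population presence fraction

# Balanced "case-control" subsample: all presences plus as many absences
ones = np.flatnonzero(y == 1)
zeros = rng.choice(np.flatnonzero(y == 0), size=ones.size, replace=False)
idx = np.concatenate([ones, zeros])
b = fit_logit(X[idx], y[idx])

# Prior correction of the intercept back to the population prevalence tau
ybar = y[idx].mean()                             # 0.5 by construction
b0_adj = b[0] - np.log((1 - tau) / tau * ybar / (1 - ybar))
```

The slope estimates are consistent under this subsampling; only the intercept needs the correction, which is what the adjusted b0_adj recovers.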
NASA Astrophysics Data System (ADS)
Ouédraogo, Mohamar Moussa; Degré, Aurore; Debouche, Charles; Lisein, Jonathan
2014-06-01
Agricultural watersheds tend to be places of intensive farming activities that permanently modify their microtopography. The surface characteristics of the soil vary depending on the crops that are cultivated in these areas. Agricultural soil microtopography plays an important role in the quantification of runoff and sediment transport because the presence of crops, crop residues, furrows and ridges may impact the direction of water flow. To better assess such phenomena, 3-D reconstructions of high-resolution agricultural watershed topography are essential. Fine-resolution topographic data collection technologies can be used to discern highly detailed elevation variability in these areas. Knowledge of the strengths and weaknesses of existing technologies used for data collection on agricultural watersheds may be helpful in choosing an appropriate technology. This study assesses the suitability of terrestrial laser scanning (TLS) and unmanned aerial system (UAS) photogrammetry for collecting the fine-resolution topographic data required to generate accurate, high-resolution digital elevation models (DEMs) in a small watershed area (12 ha). Because of farming activity, 14 TLS scans (≈25 points m⁻²) were collected without using high-definition surveying (HDS) targets, which are generally used to mesh adjacent scans. To evaluate the accuracy of the DEMs created from the TLS scan data, 1098 ground control points (GCPs) were surveyed using a real time kinematic global positioning system (RTK-GPS). Linear regressions were then applied to each scan to remove vertical errors from the TLS point elevations, errors caused by the non-perpendicularity of the scanner's vertical axis to the local horizontal plane, and errors correlated with the distance to the scanner's position. The scans were then meshed to generate a DEM-TLS with a 1 × 1 m spatial resolution. 
The Agisoft PhotoScan and MicMac software packages were used to process the aerial photographs and generate a DEM-PSC (Agisoft PhotoScan) and a DEM-MCM (MicMac), respectively, with spatial resolutions of 1 × 1 m. Comparing the DEMs with the 1098 GCPs showed that the DEM-TLS was the most accurate data product, with a root mean square error (RMSE) of 4.5 cm, followed by the DEM-MCM and the DEM-PSC, which had RMSE values of 9.0 and 13.9 cm, respectively. The DEM-PSC had absolute errors along the border of the study area that ranged from 15.0 to 52.0 cm, indicating the presence of systematic errors. Although the derived DEM-MCM was accurate, an error analysis along a transect showed that the errors in the DEM-MCM data tended to increase in areas of lower elevation. Compared with TLS, UAS is a promising tool for data collection because of its flexibility and low operational cost. However, improvements are needed in the photogrammetric processing of the aerial photographs to remove non-linear distortions.
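The distance-correlated error removal described above amounts to regressing the TLS-minus-GCP elevation differences on range to the scanner and subtracting the fitted trend. A small illustrative sketch on synthetic data (all error magnitudes below are assumptions, not the study's measurements):

```python
import numpy as np

rng = np.random.default_rng(4)

# Illustrative GCP check: TLS elevation error grows with range to the scanner
dist = rng.uniform(1, 100, 300)                  # range to scanner (m)
true_z = rng.uniform(200, 210, 300)              # GCP elevations (m)
tls_z = true_z + 0.02 + 0.0008 * dist + rng.normal(0, 0.02, 300)

# Fit the distance-correlated bias on the GCPs, then remove it
slope, intercept = np.polyfit(dist, tls_z - true_z, 1)
tls_z_corr = tls_z - (intercept + slope * dist)

rmse_before = np.sqrt(np.mean((tls_z - true_z) ** 2))
rmse_after = np.sqrt(np.mean((tls_z_corr - true_z) ** 2))
```

After detrending, only the irreducible measurement noise remains in the residual RMSE.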
Jember, Abebaw; Hailu, Mignote; Messele, Anteneh; Demeke, Tesfaye; Hassen, Mohammed
2018-01-01
A medication error (ME) is any preventable event that may cause or lead to inappropriate medication use or patient harm. Voluntary reporting has a principal role in appreciating the extent and impact of medication errors. Thus, exploring the proportion of medication error reporting and associated factors among nurses is important to inform service providers and program implementers so as to improve the quality of healthcare services. An institution-based quantitative cross-sectional study was conducted among 397 nurses from March 6 to May 10, 2015. Stratified sampling followed by a simple random sampling technique was used to select the study participants. The data were collected using a structured self-administered questionnaire adopted from studies conducted in Australia and Jordan. A pilot study was carried out to validate the questionnaire before data collection for this study. Bivariate and multivariate logistic regression models were fitted to identify factors associated with the proportion of medication error reporting among nurses. An adjusted odds ratio with 95% confidence interval was computed to determine the level of significance. The proportion of medication error reporting among nurses was found to be 57.4%. Regression analysis showed that sex, marital status, having made a medication error, and medication error experience were significantly associated with medication error reporting. The proportion of medication error reporting among nurses in this study was found to be higher than in other studies.
Mapping the Climate of Puerto Rico, Vieques and Culebra.
Christopher Daly; E. H. Helmer; Maya Quinones
2003-01-01
Spatially explicit climate data contribute to watershed resource management, mapping vegetation type with satellite imagery, mapping present and hypothetical future ecological zones, and predicting species distributions. The regression-based Parameter-elevation Regressions on Independent Slopes Model (PRISM) uses spatial data sets, a knowledge base and expert...
NASA Astrophysics Data System (ADS)
Maloney, Chris; Lormeau, Jean Pierre; Dumas, Paul
2016-07-01
Many astronomical sensing applications operate in low-light conditions; for these applications every photon counts. Controlling mid-spatial frequencies and surface roughness on astronomical optics is critical for mitigating scattering effects such as flare and energy loss. By improving these two frequency regimes, higher contrast images can be collected with improved efficiency. Classically, Magnetorheological Finishing (MRF) has offered an optical fabrication technique to correct low order errors as well as quilting/print-through errors left over in light-weighted optics from conventional polishing techniques. MRF is a deterministic, sub-aperture polishing process that has been used to improve figure on an ever expanding assortment of optical geometries, such as planos, spheres, on and off axis aspheres, primary mirrors and freeform optics. Precision optics are routinely manufactured by this technology with sizes ranging from 5-2,000mm in diameter. MRF can be used for form corrections, turning a sphere into an asphere or free form, but more commonly for figure corrections, achieving figure errors as low as 1nm RMS while using careful metrology setups. Recent advancements in MRF technology have improved the polishing performance expected for astronomical optics in low, mid and high spatial frequency regimes. Deterministic figure correction with MRF is compatible with most materials, including some recent examples on Silicon Carbide and RSA905 Aluminum. MRF also has the ability to produce 'perfectly-bad' compensating surfaces, which may be used to compensate for measured or modeled optical deformation from sources such as gravity or mounting. In addition, recent advances in MRF technology allow for corrections of mid-spatial wavelengths as small as 1mm simultaneously with form error correction. Efficient mid-spatial frequency corrections make use of optimized process conditions including raster polishing in combination with a small tool size. 
Furthermore, a novel MRF fluid, called C30, has been developed to finish surfaces to ultra-low roughness (ULR) and has been used as the low removal rate fluid required for fine figure correction of mid-spatial frequency errors. This novel MRF fluid is able to achieve <4Å RMS on Nickel-plated Aluminum and even <1.5Å RMS roughness on Silicon, Fused Silica and other materials. C30 fluid is best utilized within a fine figure correction process to target mid-spatial frequency errors as well as smooth surface roughness 'for free' all in one step. In this paper we will discuss recent advancements in MRF technology and the ability to meet requirements for precision optics in low, mid and high spatial frequency regimes and how improved MRF performance addresses the need for achieving tight specifications required for astronomical optics.
Hughes, Kristen; Budke, Christine M.; Ward, Michael P.; Kerry, Ruth; Ingram, Ben
2017-01-01
The population density of wildlife reservoirs contributes to disease transmission risk for domestic animals. The objective of this study was to model the African buffalo distribution of the Kruger National Park. A secondary objective was to collect field data to evaluate models and determine environmental predictors of buffalo detection. Spatial distribution models were created using buffalo census information and archived data from previous research. Field data were collected during the dry (August 2012) and wet (January 2013) seasons using a random walk design. The fit of the prediction models was assessed descriptively and formally by calculating the root mean square error (rMSE) of deviations from field observations. Logistic regression was used to estimate the effects of environmental variables on the detection of buffalo herds and linear regression was used to identify predictors of larger herd sizes. A zero-inflated Poisson model produced distributions that were most consistent with expected buffalo behavior. Field data confirmed that environmental factors including season (P = 0.008), vegetation type (P = 0.002), and vegetation density (P = 0.010) were significant predictors of buffalo detection. Bachelor herds were more likely to be detected in dense vegetation (P = 0.005) and during the wet season (P = 0.022) compared to the larger mixed-sex herds. Static distribution models for African buffalo can produce biologically reasonable results, but environmental factors have significant effects and therefore could be used to improve model performance. Accurate distribution models are critical for the evaluation of disease risk and to model disease transmission. PMID:28902858
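A zero-inflated Poisson mixes structural zeros (probability pi, e.g. cells where no herd can occur) with ordinary Poisson counts. Its first two moments, mean = (1-pi)*lam and var = mean*(1 + pi*lam), give a closed-form method-of-moments recovery, sketched below on simulated herd counts (pi and lam are illustrative, not fitted values from the buffalo data):

```python
import numpy as np

rng = np.random.default_rng(5)

# Zero-inflated Poisson: with prob pi the cell yields a structural zero,
# otherwise the count is Poisson(lam).  pi and lam below are illustrative.
pi, lam = 0.4, 3.0
n = 50000
counts = rng.binomial(1, 1 - pi, n) * rng.poisson(lam, n)

# Method-of-moments recovery from mean = (1-pi)*lam, var = mean*(1 + pi*lam)
m, v = counts.mean(), counts.var()
lam_hat = m + v / m - 1.0
pi_hat = 1.0 - m / lam_hat
```

In practice one would maximize the full ZIP likelihood (with covariates on both components), but the moment identities above explain why the excess zeros are identifiable at all.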
NASA Astrophysics Data System (ADS)
Taylor, Sarah L.; Hill, Ross A.; Edwards, Colin
2013-07-01
Compared with traditional ground surveys, remote sensing has the potential to map the spatial extent of non-native invasive species rapidly and reliably. This paper assesses the potential of spectroradiometry to distinguish and characterise the status of invasive non-native rhododendron (Rhododendron ponticum). Absolute reflectance of target plant material was measured with an ASD Fieldspec Pro System under standardised laboratory conditions and in the field to characterise spectral signatures in the winter, during leaf-off conditions for woodland overstory, and in the summer when mature rhododendrons are flowering. A logistic regression model of absolute reflectance at key wavelengths (490, 550, 610, 1040 and 1490 nm) was used to determine the success of discriminating rhododendron from three other shrubby species likely to be encountered in woodlands during the winter. The logistic regression model was highly significant (p < 0.001), with 93.5% of 246 leaf sets correctly identified as rhododendron or non-rhododendron (i.e. cherry laurel (Prunus laurocerasus), holly (Ilex aquifolium), and beech (Fagus sylvatica)). Rescaling the data to emulate the spectral resolution of airborne and satellite acquired data decreased the total success rate of correctly identifying rhododendron by only 0.4%; although this error rate will likely increase for airborne or satellite data as a result of atmospheric attenuation and reduced spatial resolution. This demonstrates the potential to map bush presence using hyperspectral data and indicates the optimum spectral wavelengths required. Such information is critical to the development of successful strategic management plans to eradicate rhododendron (and the associated Phytophthora ramorum pathogen) effectively from a site.
[Winter wheat area estimation with MODIS-NDVI time series based on parcel].
Li, Le; Zhang, Jin-shui; Zhu, Wen-quan; Hu, Tan-gao; Hou, Dong
2011-05-01
Several attributes of MODIS (Moderate Resolution Imaging Spectroradiometer) data, especially the short temporal intervals and the global coverage, provide an extremely efficient way to map cropland and monitor its seasonal change. However, the reliability of the measurement results is challenged by the limited spatial resolution. Parcel data have clear geo-location and obvious boundary information for cropland. Also, the spectral differences and the complexity of mixed pixels are weak within parcels. All of these make area estimation based on parcels more advantageous than estimation based on pixels. In the present study, winter wheat area estimation based on MODIS-NDVI time series was performed with the support of cultivated land parcels in Tongzhou, Beijing. In order to extract the regional winter wheat acreage, multiple regression methods were used to simulate the stable regression relationship between MODIS-NDVI time series data and TM samples in parcels. In this way, the consistency of the extraction results from MODIS and TM can stably reach up to 96% when the samples account for 15% of the whole area. The results show that the use of parcel data can effectively reduce the recognition errors in MODIS-NDVI multi-temporal data caused by the low spatial resolution. Therefore, by combining moderate and low resolution data, winter wheat area estimation becomes feasible in large-scale regions that lack complete medium resolution images or whose images are covered with clouds. Meanwhile, this study provides preliminary experiments for the area estimation of other crops.
Bisese, James A.
1995-01-01
Methods are presented for estimating the peak discharges of rural, unregulated streams in Virginia. A Pearson Type III distribution is fitted to the logarithms of the unregulated annual peak-discharge records from 363 stream-gaging stations in Virginia to estimate the peak discharge at these stations for recurrence intervals of 2 to 500 years. Peak-discharge characteristics for 284 unregulated stations are divided into eight regions based on physiographic province, and regressed on basin characteristics, including drainage area, main channel length, main channel slope, mean basin elevation, percentage of forest cover, mean annual precipitation, and maximum rainfall intensity. Regression equations for each region are computed by use of the generalized least-squares method, which accounts for spatial and temporal correlation between nearby gaging stations. This regression technique weights the significance of each station to the regional equation based on the length of records collected at each station, the correlation between annual peak discharges among the stations, and the standard deviation of the annual peak discharge for each station. Drainage area proved to be the only significant explanatory variable in four regions, while other regions have as many as three significant variables. Standard errors of the regression equations range from 30 to 80 percent. Alternate equations using drainage area only are provided for the five regions with more than one significant explanatory variable. Methods and sample computations are provided to estimate peak discharges at gaged and ungaged sites in Virginia for recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years, and to adjust the regression estimates for sites on gaged streams where nearby gaging-station records are available.
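Fitting a Pearson Type III distribution to the log discharges (the log-Pearson III procedure) and reading off a quantile can be sketched with scipy; the synthetic record below is illustrative, not data from a Virginia gage:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# Illustrative 60-year annual peak record, log10 discharges (cfs)
logq = rng.normal(3.0, 0.25, 60) + 0.05 * rng.standard_gamma(2.0, 60)

# Method-of-moments Pearson Type III fit in log space
g = stats.skew(logq)                     # sample skew of the logs
mu, sd = logq.mean(), logq.std(ddof=1)

# 100-year event: 0.99 quantile of the fitted (zero-mean, unit-sd) Pearson III
k = stats.pearson3.ppf(0.99, g)          # frequency factor
q100 = 10 ** (mu + k * sd)               # back-transform to discharge
```

For a near-symmetric record the frequency factor k is close to the normal value 2.33; skew shifts it up or down, which is exactly what the Pearson III shape parameter captures.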
Stitching-error reduction in gratings by shot-shifted electron-beam lithography
NASA Technical Reports Server (NTRS)
Dougherty, D. J.; Muller, R. E.; Maker, P. D.; Forouhar, S.
2001-01-01
Calculations of the grating spatial-frequency spectrum and the filtering properties of multiple-pass electron-beam writing demonstrate a tradeoff between stitching-error suppression and minimum pitch separation. High-resolution measurements of optical-diffraction patterns show a 25-dB reduction in stitching-error side modes.
Shen, Chung-Wei; Chen, Yi-Hau
2015-10-01
Missing observations and covariate measurement error commonly arise in longitudinal data. However, existing methods for model selection in marginal regression analysis of longitudinal data fail to address the potential bias resulting from these issues. To tackle this problem, we propose a new model selection criterion, the Generalized Longitudinal Information Criterion, which is based on an approximately unbiased estimator for the expected quadratic error of a considered marginal model accounting for both data missingness and covariate measurement error. The simulation results reveal that the proposed method performs quite well in the presence of missing data and covariate measurement error. In contrast, naive procedures that do not account for such complexity in the data may perform quite poorly. The proposed method is applied to data from the Taiwan Longitudinal Study on Aging to assess the relationship of depression with health and social status in the elderly, accommodating measurement error in the covariate as well as missing observations. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Bergen, Silas; Sheppard, Lianne; Kaufman, Joel D.; Szpiro, Adam A.
2016-01-01
Air pollution epidemiology studies are trending towards a multi-pollutant approach. In these studies, exposures at subject locations are unobserved and must be predicted using observed exposures at misaligned monitoring locations. This induces measurement error, which can bias the estimated health effects and affect standard error estimates. We characterize this measurement error and develop an analytic bias correction when using penalized regression splines to predict exposure. Our simulations show bias from multi-pollutant measurement error can be severe, and in opposite directions or simultaneously positive or negative. Our analytic bias correction combined with a non-parametric bootstrap yields accurate coverage of 95% confidence intervals. We apply our methodology to analyze the association of systolic blood pressure with PM2.5 and NO2 in the NIEHS Sister Study. We find that NO2 confounds the association of systolic blood pressure with PM2.5 and vice versa. Elevated systolic blood pressure was significantly associated with increased PM2.5 and decreased NO2. Correcting for measurement error bias strengthened these associations and widened 95% confidence intervals. PMID:27789915
Merging gauge and satellite rainfall with specification of associated uncertainty across Australia
NASA Astrophysics Data System (ADS)
Woldemeskel, Fitsum M.; Sivakumar, Bellie; Sharma, Ashish
2013-08-01
Accurate estimation of spatial rainfall is crucial for modelling hydrological systems and planning and management of water resources. While spatial rainfall can be estimated either using rain gauge-based measurements or using satellite-based measurements, such estimates are subject to uncertainties due to various sources of errors in either case, including interpolation and retrieval errors. The purpose of the present study is twofold: (1) to investigate the benefit of merging rain gauge measurements and satellite rainfall data for Australian conditions and (2) to produce a database of retrospective rainfall along with a new uncertainty metric for each grid location at any timestep. The analysis involves four steps: First, a comparison of rain gauge measurements and the Tropical Rainfall Measuring Mission (TRMM) 3B42 data at such rain gauge locations is carried out. Second, gridded monthly rain gauge rainfall is determined using thin plate smoothing splines (TPSS) and modified inverse distance weight (MIDW) method. Third, the gridded rain gauge rainfall is merged with the monthly accumulated TRMM 3B42 using a linearised weighting procedure, the weights at each grid being calculated based on the error variances of each dataset. Finally, cross validation (CV) errors at rain gauge locations and standard errors at gridded locations for each timestep are estimated. The CV error statistics indicate that merging of the two datasets improves the estimation of spatial rainfall, and more so where the rain gauge network is sparse. The provision of spatio-temporal standard errors with the retrospective dataset is particularly useful for subsequent modelling applications where input error knowledge can help reduce the uncertainty associated with modelling outcomes.
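The linearised weighting procedure is essentially inverse-error-variance averaging: each dataset's weight at a grid cell is proportional to the reciprocal of its error variance, and the merged variance is the harmonic combination of the two. A minimal sketch (the function name and the one-cell numbers are illustrative, not values from the Australian analysis):

```python
import numpy as np

def merge_fields(gauge, sat, var_gauge, var_sat):
    """Inverse-variance weighted merge of two gridded rainfall estimates.

    Weights are proportional to 1/error-variance, so the merged estimate's
    variance is 1 / (1/var_gauge + 1/var_sat) at every grid cell.
    """
    w_g = 1.0 / var_gauge
    w_s = 1.0 / var_sat
    merged = (w_g * gauge + w_s * sat) / (w_g + w_s)
    merged_var = 1.0 / (w_g + w_s)
    return merged, merged_var

# One-cell example: a dense-gauge area trusts the gauge field more
merged, mvar = merge_fields(np.array([10.0]), np.array([14.0]),
                            var_gauge=np.array([1.0]), var_sat=np.array([4.0]))
# merged leans toward the gauge value; mvar is below both input variances
```

Where the gauge network is sparse, var_gauge grows, the satellite weight dominates, and the merged standard error reported with the dataset rises accordingly.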
Combined fabrication technique for high-precision aspheric optical windows
NASA Astrophysics Data System (ADS)
Hu, Hao; Song, Ci; Xie, Xuhui
2016-07-01
Specifications made on optical components are becoming more and more stringent with the performance improvement of modern optical systems. These strict requirements involve not only low spatial frequency surface accuracy and mid-and-high spatial frequency surface errors, but also surface smoothness and so on. This presentation mainly focuses on the fabrication process for a square aspheric window, which combines accurate grinding, magnetorheological finishing (MRF) and smoothing polishing (SP). In order to remove the low spatial frequency surface errors and subsurface defects after accurate grinding, the deterministic polishing method MRF, with its highly convergent and stable material removal rate, is applied. Then the SP technology with a pseudo-random path is adopted to eliminate the mid-and-high spatial frequency surface ripples and high slope errors that are a known weakness of MRF. Additionally, the coordinate measurement method and interferometry are combined in different phases. Acid-etched methods and ion beam figuring (IBF) are also investigated for observing and reducing the subsurface defects. Actual fabrication results indicate that the combined fabrication technique can lead to high machining efficiency in manufacturing high-precision and high-quality optical aspheric windows.
Kretzschmar, A; Durand, E; Maisonnasse, A; Vallon, J; Le Conte, Y
2015-06-01
A new stratified sampling procedure is proposed in order to establish an accurate estimate of Varroa destructor populations on the sticky bottom boards of the hive. It is based on spatial sampling theory, which recommends regular grid stratification in the case of a spatially structured process. Because the distribution of varroa mites on the sticky board is observed to be spatially structured, we designed a sampling scheme based on a regular grid with circles centered on each grid element. This new procedure is then compared with a former method using partially random sampling. Relative error improvements are reported on the basis of a large sample of simulated sticky boards (n=20,000), which provides a complete range of spatial structures, from a random structure to a highly frame-driven structure. The improvement in the estimation of varroa mite numbers is then measured by the percentage of counts with an error greater than a given level. © The Authors 2015. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Characteristics and Impact of Imperviousness From a GIS-based Hydrological Perspective
NASA Astrophysics Data System (ADS)
Moglen, G. E.; Kim, S.
2005-12-01
With the concern that imperviousness can be quantified differently depending on data sources and methods, this study assessed imperviousness estimates using two different data sources: land use and land cover. Year 2000 land use developed by the Maryland Department of Planning was utilized to estimate imperviousness by assigning imperviousness coefficients to unique land use categories. These estimates were compared with imperviousness estimates based on satellite-derived land cover from the 2001 National Land Cover Dataset. Our study developed the relationships between these two estimates in the form of regression equations to convert imperviousness derived from one data source to the other. The regression equations are considered reliable, based on goodness-of-fit measures. Furthermore, this study examined how quantitatively different imperviousness estimates affect the prediction of hydrological response both in the flow regime and in the thermal regime. We assessed the relationships between indicators of hydrological response and imperviousness descriptors. As indicators of flow variability, the coefficient of variation, lag-one autocorrelation, and mean daily flow change were calculated based on measured mean daily stream flow for water years 1997 to 2003. For thermal variability, indicators such as percent-days of surge, degree-days, and mean daily temperature difference were calculated based on measured stream temperature over several basins in Maryland. To describe imperviousness through the hydrological process, GIS-based spatially distributed hydrological models were developed based on a water-balance method and the SCS-CN method. Imperviousness estimates from land use and land cover were used as predictors in these models to examine the effect of imperviousness from different data sources on the prediction of hydrological response. Indicators of hydrological response were also regressed on aggregate imperviousness.
This allowed us to identify whether hydrological response is more sensitive to spatially distributed imperviousness or to aggregate (lumped) imperviousness. The regressions between indicators of hydrological response and imperviousness descriptors were evaluated by examining goodness-of-fit measures such as explained variance and relative standard error. The results show that imperviousness estimates using land use are better predictors of flow variability and thermal variability than imperviousness estimates using land cover. Also, this study reveals that flow variability is more sensitive to spatially distributed models than to lumped models, while thermal variability is equally responsive to both. The findings from this study can be further examined from a policy perspective with regard to policies that are based on a threshold concept for imperviousness impacts on the ecological and hydrological system.
A voxel-based investigation for MRI-only radiotherapy of the brain using ultra short echo times
NASA Astrophysics Data System (ADS)
Edmund, Jens M.; Kjer, Hans M.; Van Leemput, Koen; Hansen, Rasmus H.; Andersen, Jon AL; Andreasen, Daniel
2014-12-01
Radiotherapy (RT) based on magnetic resonance imaging (MRI) as the only modality, so-called MRI-only RT, would remove the systematic registration error between MR and computed tomography (CT), and provide co-registered MRI for assessment of treatment response and adaptive RT. Electron densities, however, need to be assigned to the MR images for dose calculation and for patient setup based on digitally reconstructed radiographs (DRRs). Here, we investigate the geometric and dosimetric performance of a number of popular voxel-based methods to generate a so-called pseudo CT (pCT). Five patients receiving cranial irradiation, each with a co-registered MRI and CT scan, were included. An ultra short echo time MRI sequence for bone visualization was used. Six methods were investigated, covering three popular types of voxel-based approaches: (1) threshold-based segmentation, (2) Bayesian segmentation and (3) statistical regression. Each approach contained two methods. Approach 1 used bulk density assignment of MRI voxels into air, soft tissue and bone based on logical masks and the transverse relaxation time T2 of the bone. Approach 2 used similar bulk density assignments with Bayesian statistics, including or excluding additional spatial information. Approach 3 used a statistical regression correlating MRI voxels with their corresponding CT voxels. A similar photon and proton treatment plan was generated for a target positioned between the nasal cavity and the brainstem for all patients. The agreement of each method's pCT with the CT was quantified and compared with the other methods, geometrically and dosimetrically, using a number of reported metrics as well as some novel metrics. The best geometrical agreement with CT was obtained with the statistical regression methods, which performed significantly better than the threshold-based and Bayesian segmentation methods (excluding spatial information).
All methods agreed significantly better with CT than a reference water MRI comparison. The mean dosimetric deviation for photons and protons compared to the CT was about 2% and highest in the gradient dose region of the brainstem. Both the threshold-based method and the statistical regression methods showed the highest dosimetric agreement. Generation of pCTs using statistical regression seems to be the most promising candidate for MRI-only RT of the brain. Further, the total amount of different tissues needs to be taken into account for dosimetric considerations regardless of their correct geometrical position.
A proposed metric for assessing the measurement quality of individual microarrays
Kim, Kyoungmi; Page, Grier P; Beasley, T Mark; Barnes, Stephen; Scheirer, Katherine E; Allison, David B
2006-01-01
Background: High-density microarray technology is increasingly applied to study gene expression levels on a large scale. Microarray experiments rely on several critical steps that may introduce error and uncertainty in analyses. These steps include mRNA sample extraction, amplification and labeling, hybridization, and scanning. In some cases this may be manifested as systematic spatial variation on the surface of the microarray, in which expression measurements within an individual array vary as a function of geographic position on the array surface. Results: We hypothesized that an index of the degree of spatiality of gene expression measurements associated with their physical geographic locations on an array could serve as a summary measure of the physical reliability of the microarray. We introduced a novel way to formulate this index using a statistical analysis tool. Our approach regressed gene expression intensity measurements on a polynomial response surface of the microarray's Cartesian coordinates. We demonstrated this method using a fixed model and presented results from real and simulated datasets. Conclusion: We demonstrated the potential of such a quantitative metric for assessing the reliability of individual arrays. Moreover, we showed that this procedure can be incorporated into laboratory practice as a means to set quality control specifications and as a tool to determine whether an array has sufficient quality to be retained in terms of spatial correlation of gene expression measurements. PMID:16430768
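The regression-on-coordinates idea can be illustrated with a quadratic response surface: fit intensity as a function of array position and use R² as the spatiality index. An illustrative sketch (the function name and the quadratic choice of surface are assumptions; the paper's polynomial order may differ):

```python
import numpy as np

def spatial_trend_index(x, y, z):
    """R^2 of a quadratic response surface z ~ f(x, y).

    High values indicate that intensities vary systematically with
    position on the array, i.e. a likely spatial artifact.
    """
    # Design matrix: intercept, linear, and quadratic terms in x and y
    X = np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])
    beta, _, _, _ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((z - z.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot
```

An array with a strong intensity gradient scores near 1; spatially unstructured noise scores near 0, so a laboratory could set a cutoff on this index as a quality-control gate.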
NASA Astrophysics Data System (ADS)
Ahmed, Oumer S.; Franklin, Steven E.; Wulder, Michael A.; White, Joanne C.
2015-03-01
Many forest management activities, including the development of forest inventories, require spatially detailed forest canopy cover and height data. Among the various remote sensing technologies, LiDAR (Light Detection and Ranging) offers the most accurate and consistent means for obtaining reliable canopy structure measurements. A potential solution to reduce the cost of LiDAR data is to integrate transects (samples) of LiDAR data with frequently acquired and spatially comprehensive optical remotely sensed data. Although multiple regression is commonly used for such modeling, it often does not fully capture the complex relationships between forest structure variables. This study investigates the potential of Random Forest (RF), a machine learning technique, to estimate LiDAR-measured canopy structure using a time series of Landsat imagery. The study is implemented over a 2600 ha area of industrially managed coastal temperate forests on Vancouver Island, British Columbia, Canada. We implemented a trajectory-based approach to time series analysis that generates time since disturbance (TSD) and disturbance intensity information for each pixel, and we used this information to stratify the forest land base into two strata: mature forests and young forests. Canopy cover and height for three forest classes (i.e., mature, young, and combined mature and young) were modeled separately using multiple regression and Random Forest (RF) techniques. For all forest classes, the RF models provided improved estimates relative to the multiple regression models. The lowest validation error was obtained for the mature forest stratum in a RF model (R2 = 0.88, RMSE = 2.39 m and bias = -0.16 for canopy height; R2 = 0.72, RMSE = 0.068% and bias = -0.0049 for canopy cover). This study demonstrates the value of using disturbance and successional history to inform estimates of canopy structure and obtain improved estimates of forest canopy cover and height using the RF algorithm.
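The advantage of Random Forest over multiple regression on nonlinear structure-predictor relationships can be sketched with scikit-learn on synthetic data (all variable names and the toy response function are invented for illustration; they are not the study's data):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
n = 1000
X = rng.uniform(0.0, 1.0, size=(n, 3))  # stand-in for Landsat-derived metrics
# Nonlinear "canopy height" response that a plane cannot capture well
y = 20.0 * np.sin(3.0 * X[:, 0]) + 5.0 * X[:, 1] ** 2 + rng.normal(0.0, 1.0, n)

X_train, X_test = X[:800], X[800:]
y_train, y_test = y[:800], y[800:]

lin = LinearRegression().fit(X_train, y_train)
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)

r2_lin = lin.score(X_test, y_test)  # low: the relationship is nonlinear
r2_rf = rf.score(X_test, y_test)    # high: trees adapt to the curvature
```

The gap between the two test-set R² values mirrors the kind of improvement the abstract reports when moving from multiple regression to RF.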
The formulation and estimation of a spatial skew-normal generalized ordered-response model.
DOT National Transportation Integrated Search
2016-06-01
This paper proposes a new spatial generalized ordered response model with skew-normal kernel error terms and an : associated estimation method. It contributes to the spatial analysis field by allowing a flexible and parametric skew-normal : distribut...
Effects of errors and gaps in spatial data sets on assessment of conservation progress.
Visconti, P; Di Marco, M; Álvarez-Romero, J G; Januchowski-Hartley, S R; Pressey, R L; Weeks, R; Rondinini, C
2013-10-01
Data on the location and extent of protected areas, ecosystems, and species' distributions are essential for determining gaps in biodiversity protection and identifying future conservation priorities. However, these data sets always come with errors in the maps and associated metadata. Errors are often overlooked in conservation studies, despite their potential negative effects on the reported extent of protection of species and ecosystems. We used 3 case studies to illustrate the implications of 3 sources of errors in reporting progress toward conservation objectives: protected areas with unknown boundaries that are replaced by buffered centroids, propagation of multiple errors in spatial data, and incomplete protected-area data sets. As of 2010, the frequency of protected areas with unknown boundaries in the World Database on Protected Areas (WDPA) caused the estimated extent of protection of 37.1% of the terrestrial Neotropical mammals to be overestimated by an average 402.8% and of 62.6% of species to be underestimated by an average 10.9%. Estimated level of protection of the world's coral reefs was 25% higher when using recent finer-resolution data on coral reefs as opposed to globally available coarse-resolution data. Accounting for additional data sets not yet incorporated into WDPA contributed up to 6.7% of additional protection to marine ecosystems in the Philippines. We suggest ways for data providers to reduce the errors in spatial and ancillary data and ways for data users to mitigate the effects of these errors on biodiversity assessments. © 2013 Society for Conservation Biology.
NASA Astrophysics Data System (ADS)
Murillo Feo, C. A.; Martínez Martinez, L. J.; Correa Muñoz, N. A.
2016-06-01
The accuracy of locating attributes on topographic surfaces when using GPS in mountainous areas is affected by obstacles to wave propagation. As part of this research on the semi-automatic detection of landslides, we evaluated the accuracy and spatial distribution of the horizontal error in GPS positioning in the tertiary road network of six municipalities located in mountainous areas in the department of Cauca, Colombia, using geo-referencing with GPS mapping equipment and static-fast and pseudo-kinematic methods. We obtained quality parameters for the GPS surveys with differential correction, using a post-processing method. The consolidated database underwent exploratory analyses to determine the statistical distribution, a multivariate analysis to establish relationships and associations between the variables, and an analysis of the spatial variability and a calculation of accuracy, considering the effect of non-Gaussian error distributions. The evaluation of the internal validity of the data provided metrics with a confidence level of 95% between 1.24 and 2.45 m in the static-fast mode and between 0.86 and 4.2 m in the pseudo-kinematic mode. The external validity had an absolute error of 4.69 m, indicating that this descriptor is more critical than precision. Based on the ASPRS standard, the scale obtained with the evaluated equipment was on the order of 1:20000, a level of detail expected in the landslide-mapping project. Modelling the spatial variability of the horizontal errors from the empirical semi-variogram analysis showed prediction errors close to the external validity of the devices.
Alados, C.L.; Pueyo, Y.; Giner, M.L.; Navarro, T.; Escos, J.; Barroso, F.; Cabezudo, B.; Emlen, J.M.
2003-01-01
We studied the effect of grazing on the degree of regression of successional vegetation dynamics in a semi-arid Mediterranean matorral. We quantified the spatial distribution patterns of the vegetation by fractal analyses, using the fractal information dimension and spatial autocorrelation measured by detrended fluctuation analysis (DFA). This is the first time that fractal analysis of plant spatial patterns has been used to characterize regressive ecological succession. Plant spatial patterns were compared over a long-term grazing gradient (low, medium and heavy grazing pressure) and on ungrazed sites for two different plant communities: a middle-dense matorral of Chamaerops and Periploca at Sabinar-Romeral, and a middle-dense matorral of Chamaerops, Rhamnus and Ulex at Requena-Montano. The two communities also differed in their microclimatic characteristics (sea-oriented at the Sabinar-Romeral site and inland-oriented at the Requena-Montano site). The information fractal dimension increased as we moved from a middle-dense matorral to discontinuous and scattered matorral and, finally, to the late regressive succession at the Stipa steppe stage. At this stage a drastic change in the fractal dimension revealed a change in the vegetation structure, accurately indicating end-successional vegetation stages. Long-term correlation analysis (DFA) revealed that an increase in grazing pressure leads to unpredictability (randomness) in species distributions, a reduction in diversity, and an increase in cover of the regressive successional species, e.g. Stipa tenacissima L. These comparisons provide a quantitative characterization of the successional dynamics of plant spatial patterns in response to a grazing perturbation gradient. © 2002 Elsevier Science B.V. All rights reserved.
A generalized model for estimating the energy density of invertebrates
James, Daniel A.; Csargo, Isak J.; Von Eschen, Aaron; Thul, Megan D.; Baker, James M.; Hayer, Cari-Ann; Howell, Jessica; Krause, Jacob; Letvin, Alex; Chipps, Steven R.
2012-01-01
Invertebrate energy density (ED) values are traditionally measured using bomb calorimetry. However, many researchers rely on a few published literature sources to obtain ED values because of time and sampling constraints on measuring ED with bomb calorimetry. Literature values often do not account for spatial or temporal variability associated with invertebrate ED. Thus, these values can be unreliable for use in models and other ecological applications. We evaluated the generality of the relationship between invertebrate ED and proportion of dry-to-wet mass (pDM). We then developed and tested a regression model to predict ED from pDM based on a taxonomically, spatially, and temporally diverse sample of invertebrates representing 28 orders in aquatic (freshwater, estuarine, and marine) and terrestrial (temperate and arid) habitats from 4 continents and 2 oceans. Samples included invertebrates collected in all seasons over the last 19 y. Evaluation of these data revealed a significant relationship between ED and pDM (r2 = 0.96, p < 0.0001), where ED (as J/g wet mass) was estimated from pDM as ED = 22,960pDM − 174.2. Model evaluation showed that nearly all (98.8%) of the variability between observed and predicted values for invertebrate ED could be attributed to residual error in the model. Regression of observed on predicted values revealed that the 97.5% joint confidence region included the intercept of 0 (−103.0 ± 707.9) and slope of 1 (1.01 ± 0.12). Use of this model requires that only dry and wet mass measurements be obtained, resulting in significant time, sample size, and cost savings compared to traditional bomb calorimetry approaches. This model should prove useful for a wide range of ecological studies because it is unaffected by taxonomic, seasonal, or spatial variability.
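The fitted model is simple enough to apply directly; a sketch of the paper's equation ED = 22,960·pDM − 174.2 (J/g wet mass), wrapped in a hypothetical helper function:

```python
def energy_density(dry_mass_g, wet_mass_g):
    """Predict invertebrate energy density (J/g wet mass) from the
    proportion of dry-to-wet mass (pDM), using the paper's regression
    ED = 22,960 * pDM - 174.2."""
    p_dm = dry_mass_g / wet_mass_g
    return 22960.0 * p_dm - 174.2
```

Only a drying-oven measurement of dry and wet mass is needed, which is the time and cost saving over bomb calorimetry that the abstract emphasizes.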
Dai, Wensheng; Wu, Jui-Yu; Lu, Chi-Jie
2014-01-01
Sales forecasting is one of the most important issues in managing information technology (IT) chain store sales, since an IT chain store has many branches. Integrating a feature extraction method with a prediction tool, such as support vector regression (SVR), is a useful way to construct an effective sales forecasting scheme. Independent component analysis (ICA) is a novel feature extraction technique that has been widely applied to various forecasting problems. Up to now, however, only the basic ICA method (i.e., the temporal ICA model) has been applied to the sales forecasting problem. In this paper, we utilize three different ICA methods, including spatial ICA (sICA), temporal ICA (tICA), and spatiotemporal ICA (stICA), to extract features from the sales data and compare their performance in sales forecasting for an IT chain store. Experimental results from real sales data show that the sales forecasting scheme integrating stICA and SVR outperforms the comparison models in terms of forecasting error. The stICA is a promising tool for extracting effective features from branch sales data, and the extracted features can improve the prediction performance of SVR for sales forecasting.
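An ICA-plus-SVR pipeline of the general kind described can be sketched with scikit-learn. This is a toy temporal-ICA setup on synthetic "branch sales" series, not the authors' stICA method; all names and the mixing model are invented for illustration:

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.svm import SVR

rng = np.random.default_rng(7)
t = np.arange(300)
# Toy branch sales: each branch is a different mixture of a trend
# source and a 12-period seasonal source, plus a little noise.
s_trend = 0.01 * t
s_season = np.sin(2 * np.pi * t / 12)
branches = np.column_stack([2 * s_trend + s_season,
                            s_trend - 0.5 * s_season,
                            s_trend + 2 * s_season])
branches += rng.normal(0.0, 0.05, branches.shape)

# Temporal ICA: treat each branch series as a mixed signal and
# recover the underlying source signals as features.
ica = FastICA(n_components=2, random_state=0)
features = ica.fit_transform(branches)

# Forecast aggregate sales one step ahead from the ICA features.
target = branches.sum(axis=1)
X_train, y_train = features[:-1], target[1:]
model = SVR(kernel="rbf", C=10.0).fit(X_train, y_train)
```

The design choice mirrors the abstract: ICA reduces correlated branch series to a few independent features, and SVR then maps those features to the quantity being forecast.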
Granato, Gregory E.
2006-01-01
The Kendall-Theil Robust Line software (KTRLine-version 1.0) is a Visual Basic program that may be used with the Microsoft Windows operating system to calculate parameters for robust, nonparametric estimates of linear-regression coefficients between two continuous variables. The KTRLine software was developed by the U.S. Geological Survey, in cooperation with the Federal Highway Administration, for use in stochastic data modeling with local, regional, and national hydrologic data sets to develop planning-level estimates of potential effects of highway runoff on the quality of receiving waters. The Kendall-Theil robust line was selected because this robust nonparametric method is resistant to the effects of outliers and nonnormality in residuals that commonly characterize hydrologic data sets. The slope of the line is calculated as the median of all possible pairwise slopes between points. The intercept is calculated so that the line will run through the median of input data. A single-line model or a multisegment model may be specified. The program was developed to provide regression equations with an error component for stochastic data generation because nonparametric multisegment regression tools are not available with the software that is commonly used to develop regression models. The Kendall-Theil robust line is a median line and, therefore, may underestimate total mass, volume, or loads unless the error component or a bias correction factor is incorporated into the estimate. Regression statistics such as the median error, the median absolute deviation, the prediction error sum of squares, the root mean square error, the confidence interval for the slope, and the bias correction factor for median estimates are calculated by use of nonparametric methods. These statistics, however, may be used to formulate estimates of mass, volume, or total loads. 
The program is used to read a two- or three-column tab-delimited input file with variable names in the first row and data in subsequent rows. The user may choose the columns that contain the independent (X) and dependent (Y) variables. A third column, if present, may contain metadata such as the sample-collection location and date. The program screens the input files and plots the data. The KTRLine software is a graphical tool that facilitates development of regression models by use of graphs of the regression line with data, the regression residuals (with X or Y), and percentile plots of the cumulative frequency of the X variable, Y variable, and the regression residuals. The user may individually transform the independent and dependent variables to reduce heteroscedasticity and to linearize the data. The program plots the data and the regression line. The program also prints model specifications and regression statistics to the screen. The user may save and print the regression results. The program can accept data sets that contain up to about 15,000 XY data points, but because the program must sort the array of all pairwise slopes, it may be perceptibly slow with data sets that contain more than about 1,000 points.
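The core Kendall-Theil computation described above — slope as the median of all pairwise slopes, intercept chosen so the line passes through the data medians — is compact enough to sketch (illustrative Python, not the KTRLine source):

```python
from itertools import combinations
from statistics import median

def kendall_theil_line(xs, ys):
    """Kendall-Theil robust line: slope = median of all pairwise slopes;
    intercept set so the line runs through (median(x), median(y))."""
    slopes = [(y2 - y1) / (x2 - x1)
              for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
              if x2 != x1]
    slope = median(slopes)
    intercept = median(ys) - slope * median(xs)
    return slope, intercept
```

Because the slope is a median of pairwise slopes, a single outlying point leaves the fitted line essentially unchanged, which is the resistance property that motivates the method for hydrologic data. The all-pairs enumeration is O(n²), which also explains the slowdown the abstract notes beyond about 1,000 points.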
NASA Technical Reports Server (NTRS)
Tolson, R. H.
1981-01-01
A technique is described for evaluating the influence of spatial sampling on the determination of global mean total columnar ozone. First- and second-order statistics are derived for each term in a spherical harmonic expansion representing the ozone field, and the statistics are used to estimate systematic and random errors in the estimates of total ozone. A finite number of coefficients in the expansion are determined, and the truncated part of the expansion is shown to contribute an error to the estimate that depends strongly on the spatial sampling and is relatively insensitive to data noise.
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-01-01
Aims: A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R2), using R2 as the primary metric of assay agreement. However, the use of R2 alone does not adequately quantify the constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. Methods: We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing (NGS) assays. NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Results: Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. Conclusions: The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of the performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. PMID:28747393
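The Bland-Altman quantities referred to — the mean difference (bias, capturing constant error) and the 95% limits of agreement — can be sketched as follows (illustrative Python; the function name is mine):

```python
from statistics import mean, stdev

def bland_altman(a, b):
    """Bland-Altman agreement statistics for paired measurements from
    two assays: mean difference (bias) and 95% limits of agreement
    (bias +/- 1.96 * SD of the differences)."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)
```

A nonzero bias flags constant error between the assays; a trend of the differences with the measurement magnitude (not computed here) would flag proportional error, which is where the Deming regression mentioned in the abstract comes in.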
[Sociodemographic context of homicide in Mexico City: a spatial analysis].
Fuentes Flores, César; Sánchez Salinas, Omar
2015-12-01
To investigate the spatial distribution pattern of the homicide rate and its relation to sociodemographic features in the Benito Juárez, Coyoacán, and Cuauhtémoc districts of Mexico City in 2010. This was an inferential cross-sectional study that used spatial analysis methods to study the spatial association between the homicide rate and demographic features. Spatial association was determined through the location quotient, multiple regression analysis, and the use of geographically weighted regression. Homicides show a heterogeneous location pattern, with high rates in areas with non-residential land use, low population density, and low marginalization. Spatial analysis tools are powerful instruments for the design of prevention- and recreation-focused public safety policies that aim to reduce mortality from external causes such as homicides.
Šiljić Tomić, Aleksandra N; Antanasijević, Davor Z; Ristić, Mirjana Đ; Perić-Grujić, Aleksandra A; Pocajt, Viktor V
2016-05-01
This paper describes the application of artificial neural network models for the prediction of biological oxygen demand (BOD) levels in the Danube River. Eighteen regularly monitored water quality parameters at 17 stations on the river stretch passing through Serbia were used as input variables. The optimization of the model was performed in three consecutive steps: firstly, the spatial influence of a monitoring station was examined; secondly, the monitoring period necessary to reach satisfactory performance was determined; and lastly, correlation analysis was applied to evaluate the relationship among water quality parameters. Root-mean-square error (RMSE) was used to evaluate model performance in the first two steps, whereas in the last step, multiple statistical indicators of performance were utilized. As a result, two optimized models were developed, a general regression neural network model (labeled GRNN-1) that covers the monitoring stations from the Danube inflow to the city of Novi Sad and a GRNN model (labeled GRNN-2) that covers the stations from the city of Novi Sad to the border with Romania. Both models demonstrated good agreement between the predicted and actually observed BOD values.
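A general regression neural network is, at its core, a Gaussian-kernel weighted average of training targets (the Nadaraya-Watson form). A one-dimensional sketch (illustrative only; the published models use 18 water quality inputs and tuned smoothing factors):

```python
import math

def grnn_predict(x_query, x_train, y_train, sigma=1.0):
    """GRNN prediction: Gaussian-kernel weighted average of training
    targets, with smoothing factor sigma controlling the kernel width."""
    weights = [math.exp(-((x_query - x) ** 2) / (2.0 * sigma ** 2))
               for x in x_train]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, y_train)) / total
```

Because the prediction is a weighted average, GRNN output always stays within the range of the training targets, a property that suits bounded water quality variables such as BOD.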
On the effect of model parameters on forecast objects
NASA Astrophysics Data System (ADS)
Marzban, Caren; Jones, Corinne; Li, Ning; Sandgathe, Scott
2018-04-01
Many physics-based numerical models produce a gridded, spatial field of forecasts, e.g., a temperature map. The field for some quantities generally consists of spatially coherent and disconnected objects. Such objects arise in many problems, including precipitation forecasts in atmospheric models, eddy currents in ocean models, and models of forest fires. Certain features of these objects (e.g., location, size, intensity, and shape) are generally of interest. Here, a methodology is developed for assessing the impact of model parameters on the features of forecast objects. The main ingredients of the methodology include the use of (1) Latin hypercube sampling for varying the values of the model parameters, (2) statistical clustering algorithms for identifying objects, (3) multivariate multiple regression for assessing the impact of multiple model parameters on the distribution (across the forecast domain) of object features, and (4) methods for reducing the number of hypothesis tests and controlling the resulting errors. The final output of the methodology is a series of box plots and confidence intervals that visually display the sensitivities. The methodology is demonstrated on precipitation forecasts from a mesoscale numerical weather prediction model.
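Ingredient (1), Latin hypercube sampling, can be sketched in a few lines: each parameter's range is split into n equal strata, each stratum is sampled exactly once, and the per-parameter samples are randomly paired (illustrative Python, unit hypercube only):

```python
import random

def latin_hypercube(n_samples, n_params, rng=None):
    """Latin hypercube design on [0, 1]^n_params: each parameter's range
    is cut into n_samples equal strata, each stratum is drawn from once,
    and strata are shuffled so pairings across parameters are random."""
    rng = rng or random.Random()
    columns = []
    for _ in range(n_params):
        col = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(col)
        columns.append(col)
    return list(zip(*columns))  # one tuple of parameter values per sample
```

Compared with plain random sampling, this guarantees that every parameter is exercised across its full range even with few model runs, which is why it is the standard choice for expensive forecast-model sensitivity studies.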
A simulation study to quantify the impacts of exposure ...
Background: Exposure measurement error in copollutant epidemiologic models has the potential to introduce bias in relative risk (RR) estimates. A simulation study was conducted using empirical data to quantify the impact of correlated measurement errors in time-series analyses of air pollution and health. Methods: ZIP-code level estimates of exposure for six pollutants (CO, NOx, EC, PM2.5, SO4, O3) from 1999 to 2002 in the Atlanta metropolitan area were used to calculate spatial, population (i.e. ambient versus personal), and total exposure measurement error. Empirically determined covariance of pollutant concentration pairs and the associated measurement errors were used to simulate true exposure (exposure without error) from observed exposure. Daily emergency department visits for respiratory diseases were simulated using a Poisson time-series model with a main pollutant RR = 1.05 per interquartile range, and a null association for the copollutant (RR = 1). Monte Carlo experiments were used to evaluate the impacts of correlated exposure errors of different copollutant pairs. Results: Substantial attenuation of RRs due to exposure error was evident in nearly all copollutant pairs studied, ranging from 10 to 40% attenuation for spatial error, 3–85% for population error, and 31–85% for total error. When CO, NOx or EC is the main pollutant, we demonstrated the possibility of false positives, specifically identifying significant, positive associations for copoll
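The attenuation mechanism can be demonstrated in miniature with classical measurement error in a linear setting (an illustrative numpy sketch, not the paper's Poisson time-series design): when independent error with variance equal to the exposure variance is added, the expected slope is halved.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
x_true = rng.normal(0.0, 1.0, n)               # true exposure
x_obs = x_true + rng.normal(0.0, 1.0, n)       # classical measurement error
y = 0.05 * x_true + rng.normal(0.0, 1.0, n)    # outcome driven by true exposure

def slope(x, y):
    """Ordinary least squares slope of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

b_true = slope(x_true, y)  # recovers ~0.05
# Attenuated toward 0.05 * var(x_true)/var(x_obs) = 0.025
b_obs = slope(x_obs, y)
```

The attenuation factor var(x_true)/var(x_obs) is the linear-model analogue of the RR attenuation the simulation study quantifies; correlated errors between copollutants are what additionally produce the false positives noted in the results.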
Carroll, Raymond J; Delaigle, Aurore; Hall, Peter
2011-03-01
In many applications we can expect that, or are interested to know if, a density function or a regression curve satisfies some specific shape constraints. For example, when the explanatory variable, X, represents the value taken by a treatment or dosage, the conditional mean of the response, Y , is often anticipated to be a monotone function of X. Indeed, if this regression mean is not monotone (in the appropriate direction) then the medical or commercial value of the treatment is likely to be significantly curtailed, at least for values of X that lie beyond the point at which monotonicity fails. In the case of a density, common shape constraints include log-concavity and unimodality. If we can correctly guess the shape of a curve, then nonparametric estimators can be improved by taking this information into account. Addressing such problems requires a method for testing the hypothesis that the curve of interest satisfies a shape constraint, and, if the conclusion of the test is positive, a technique for estimating the curve subject to the constraint. Nonparametric methodology for solving these problems already exists, but only in cases where the covariates are observed precisely. However in many problems, data can only be observed with measurement errors, and the methods employed in the error-free case typically do not carry over to this error context. In this paper we develop a novel approach to hypothesis testing and function estimation under shape constraints, which is valid in the context of measurement errors. Our method is based on tilting an estimator of the density or the regression mean until it satisfies the shape constraint, and we take as our test statistic the distance through which it is tilted. Bootstrap methods are used to calibrate the test. The constrained curve estimators that we develop are also based on tilting, and in that context our work has points of contact with methodology in the error-free case.
Hanks, Ephraim M.; Schliep, Erin M.; Hooten, Mevin B.; Hoeting, Jennifer A.
2015-01-01
In spatial generalized linear mixed models (SGLMMs), covariates that are spatially smooth are often collinear with spatially smooth random effects. This phenomenon is known as spatial confounding and has been studied primarily in the case where the spatial support of the process being studied is discrete (e.g., areal spatial data). In this case, the most common approach suggested is restricted spatial regression (RSR) in which the spatial random effects are constrained to be orthogonal to the fixed effects. We consider spatial confounding and RSR in the geostatistical (continuous spatial support) setting. We show that RSR provides computational benefits relative to the confounded SGLMM, but that Bayesian credible intervals under RSR can be inappropriately narrow under model misspecification. We propose a posterior predictive approach to alleviating this potential problem and discuss the appropriateness of RSR in a variety of situations. We illustrate RSR and SGLMM approaches through simulation studies and an analysis of malaria frequencies in The Gambia, Africa.
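The orthogonality constraint at the heart of RSR amounts to projecting the spatial random effect onto the orthogonal complement of the fixed-effect design matrix. A minimal numpy sketch with a generic design matrix (not the Gambia analysis):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # fixed-effect design
eta = rng.normal(size=n)                                # unconstrained spatial effect

# P_perp projects onto the orthogonal complement of col(X);
# RSR replaces eta with P_perp @ eta so it cannot "absorb" the fixed effects
P = X @ np.linalg.solve(X.T @ X, X.T)
P_perp = np.eye(n) - P
eta_rsr = P_perp @ eta

# the restricted effect is orthogonal to every fixed-effect column
assert np.allclose(X.T @ eta_rsr, 0.0)
```

In a full SGLMM this projection is applied to the spatial basis inside the model fit; the sketch only shows why the confounding between smooth covariates and the random effect disappears.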
Cognitive flexibility correlates with gambling severity in young adults.
Leppink, Eric W; Redden, Sarah A; Chamberlain, Samuel R; Grant, Jon E
2016-10-01
Although gambling disorder (GD) is often characterized as a problem of impulsivity, compulsivity has recently been proposed as a potentially important feature of addictive disorders. The present analysis assessed the neurocognitive and clinical relationship between compulsivity and gambling behavior. A sample of 552 non-treatment seeking gamblers aged 18-29 was recruited from the community for a study on gambling in young adults. Gambling severity levels included both casual and disordered gamblers. All participants completed the Intra/Extra-Dimensional Set Shift (IED) task, from which the total adjusted errors were correlated with gambling severity measures, and linear regression modeling was used to assess three error measures from the task. The present analysis found significant positive correlations between problems with cognitive flexibility and gambling severity (reflected by the number of DSM-5 criteria, gambling frequency, amount of money lost in the past year, and gambling urge/behavior severity). IED errors also showed a positive correlation with self-reported compulsive behavior scores. A significant correlation was also found between IED errors and non-planning impulsivity from the BIS. Linear regression models based on total IED errors, extra-dimensional (ED) shift errors, or pre-ED shift errors indicated that these factors accounted for a significant portion of the variance noted in several variables. These findings suggest that cognitive flexibility may be an important consideration in the assessment of gamblers. Results from correlational and linear regression analyses support this possibility, but the exact contributions of both impulsivity and cognitive flexibility remain entangled. Future studies will ideally be able to assess the longitudinal relationships between gambling, compulsivity, and impulsivity, helping to clarify the relative contributions of both impulsive and compulsive features. Copyright © 2016 Elsevier Ltd. All rights reserved.
Kim, Sun Mi; Kim, Yongdai; Jeong, Kuhwan; Jeong, Heeyeong; Kim, Jiyoung
2018-01-01
The aim of this study was to compare the performance of image analysis for predicting breast cancer using two distinct regression models and to evaluate the usefulness of incorporating clinical and demographic data (CDD) into the image analysis in order to improve the diagnosis of breast cancer. This study included 139 solid masses from 139 patients who underwent an ultrasonography-guided core biopsy and had available CDD between June 2009 and April 2010. Three breast radiologists retrospectively reviewed 139 breast masses and described each lesion using the Breast Imaging Reporting and Data System (BI-RADS) lexicon. We applied and compared two regression methods, stepwise logistic (SL) regression and logistic least absolute shrinkage and selection operator (LASSO) regression, in which the BI-RADS descriptors and CDD were used as covariates. We investigated the performances of these regression methods and the agreement of radiologists in terms of test misclassification error and the area under the curve (AUC) of the tests. Logistic LASSO regression was superior (P<0.05) to SL regression, regardless of whether CDD was included in the covariates, in terms of test misclassification errors (0.234 vs. 0.253, without CDD; 0.196 vs. 0.258, with CDD) and AUC (0.785 vs. 0.759, without CDD; 0.873 vs. 0.735, with CDD). However, it was inferior (P<0.05) to the agreement of three radiologists in terms of test misclassification errors (0.234 vs. 0.168, without CDD; 0.196 vs. 0.088, with CDD) and the AUC without CDD (0.785 vs. 0.844, P<0.001), but was comparable to the AUC with CDD (0.873 vs. 0.880, P=0.141). Logistic LASSO regression based on BI-RADS descriptors and CDD showed better performance than SL in predicting the presence of breast cancer. The use of CDD as a supplement to the BI-RADS descriptors significantly improved the prediction of breast cancer using logistic LASSO regression.
An open-access CMIP5 pattern library for temperature and precipitation: Description and methodology
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lynch, Cary D.; Hartin, Corinne A.; Bond-Lamberty, Benjamin
2017-05-15
Pattern scaling is used to efficiently emulate general circulation models and explore uncertainty in climate projections under multiple forcing scenarios. Pattern scaling methods assume that local climate changes scale with a global mean temperature increase, allowing for spatial patterns to be generated for multiple models for any future emission scenario. For uncertainty quantification and probabilistic statistical analysis, a library of patterns with descriptive statistics for each file would be beneficial, but such a library does not presently exist. Of the possible techniques used to generate patterns, the two most prominent are the delta and least squares regression methods. We explore the differences and statistical significance between patterns generated by each method and assess performance of the generated patterns across methods and scenarios. Differences in patterns across seasons between methods and epochs were largest in high latitudes (60-90°N/S). Bias and mean errors between modeled and pattern-predicted output from the linear regression method were smaller than patterns generated by the delta method. Across scenarios, differences in the linear regression method patterns were more statistically significant, especially at high latitudes. We found that pattern generation methodologies were able to approximate the forced signal of change to within ≤ 0.5°C, but choice of pattern generation methodology for pattern scaling purposes should be informed by user goals and criteria. As a result, this paper describes our library of least squares regression patterns from all CMIP5 models for temperature and precipitation on an annual and sub-annual basis, along with the code used to generate these patterns.
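The two pattern-generation techniques compared above can be sketched on a toy grid: the delta method divides an epoch-mean local change by the global mean temperature change, while the regression method fits a per-grid-cell slope against the global mean temperature series. The synthetic series below is illustrative only.

```python
import numpy as np

rng = np.random.default_rng(7)
years, ncell = 100, 4
t_global = np.linspace(0.0, 3.0, years) + rng.normal(0.0, 0.05, years)
true_pat = np.array([0.5, 1.0, 1.5, 2.0])      # local warming per K of global warming
local = t_global[:, None] * true_pat + rng.normal(0.0, 0.3, (years, ncell))

# delta method: epoch-mean change divided by the global-mean change
base, fut = slice(0, 20), slice(80, 100)
delta_pat = (local[fut].mean(0) - local[base].mean(0)) / \
            (t_global[fut].mean() - t_global[base].mean())

# regression method: per-cell least-squares slope on global mean temperature
A = np.column_stack([t_global, np.ones(years)])
reg_pat = np.linalg.lstsq(A, local, rcond=None)[0][0]
```

Both recover the per-cell scaling pattern here; the paper's finding is that on real CMIP5 output the regression version has smaller bias and mean error.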
NASA Astrophysics Data System (ADS)
Ishizaki, N. N.; Dairaku, K.; Ueno, G.
2016-12-01
We have developed a statistical downscaling method for estimating probabilistic climate projections using multiple CMIP5 general circulation models (GCMs). A regression model was established so that the combination of weights of GCMs reflects the characteristics of the variation of observations at each grid point. Cross validations were conducted to select GCMs and to evaluate the regression model while avoiding multicollinearity. Using a spatially high-resolution observation system, we produced statistically downscaled probabilistic climate projections with 20-km horizontal grid spacing. Root mean squared errors for monthly mean surface air temperature and precipitation estimated by the regression method were the smallest compared with the results derived from a simple ensemble mean of GCMs and a cumulative distribution function based bias correction method. Projected changes in mean temperature and precipitation were basically similar to those of the simple ensemble mean of GCMs. Mean precipitation was generally projected to increase, associated with increased temperature and consequently increased moisture content in the air. Weakening of the winter monsoon may contribute to precipitation decreases in some areas. A temperature increase in excess of 4 K was expected in most areas of Japan by the end of the 21st century under the RCP8.5 scenario. The estimated probability of monthly precipitation exceeding 300 mm would increase on the Pacific side during the summer and the Japan Sea side during the winter season. This probabilistic climate projection based on the statistical method can be expected to provide useful information for impact studies and risk assessments.
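The core idea of weighting GCMs by regressing observations on the ensemble can be sketched with a least-squares fit at a single hypothetical grid point; the paper's cross-validated model selection is omitted, and the monthly series are fabricated.

```python
import numpy as np

rng = np.random.default_rng(3)
months, n_gcm = 120, 3
# hypothetical monthly series: observations and three GCM hindcasts at one grid point
obs_signal = 10.0 + 5.0 * np.sin(2 * np.pi * np.arange(months) / 12)
gcms = np.column_stack([obs_signal + rng.normal(0, s, months) for s in (0.5, 1.0, 2.0)])
obs = obs_signal + rng.normal(0, 0.3, months)

# fit per-GCM weights (plus intercept) so the combination tracks observations
X = np.column_stack([np.ones(months), gcms])
coef, *_ = np.linalg.lstsq(X, obs, rcond=None)
pred = X @ coef

rmse_weighted = np.sqrt(np.mean((obs - pred) ** 2))
rmse_ensmean = np.sqrt(np.mean((obs - gcms.mean(1)) ** 2))
```

The fitted weights favor the lower-noise models, so the weighted combination beats the simple ensemble mean, which is the comparison the abstract reports.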
Extreme Universe Space Observatory (EUSO) Optics Module
NASA Technical Reports Server (NTRS)
Young, Roy; Christl, Mark
2008-01-01
A demonstration part will be manufactured in Japan on one of the large Toshiba machines with a diameter of 2.5 meters. This will be a flat PMMA disk that is cut between 0.5 and 1.25 meters radius. The cut should demonstrate manufacturing the most difficult parts of the 2.5 meter Fresnel pattern and the blazed grating on the diffractive surface. Optical simulations, validated with the subscale prototype, will be used to determine the limits on manufacturing errors (tolerances) that will result in optics that meet EUSO's requirements. There will be limits on surface roughness (or errors at high spatial frequency); radial and azimuthal slope errors (at lower spatial frequencies); and plunge cut depth errors in the blazed grating. The demonstration part will be measured to determine whether it was made within the allowable tolerances.
2014-01-01
Background: Exposure measurement error is a concern in long-term PM2.5 health studies using ambient concentrations as exposures. We assessed error magnitude by estimating calibration coefficients as the association between personal PM2.5 exposures from validation studies and typically available surrogate exposures. Methods: Daily personal and ambient PM2.5, and when available sulfate, measurements were compiled from nine cities, over 2 to 12 days. True exposure was defined as personal exposure to PM2.5 of ambient origin. Since PM2.5 of ambient origin could only be determined for five cities, personal exposure to total PM2.5 was also considered. Surrogate exposures were estimated as ambient PM2.5 at the nearest monitor or predicted outside subjects’ homes. We estimated calibration coefficients by regressing true on surrogate exposures in random effects models. Results: When monthly-averaged personal PM2.5 of ambient origin was used as the true exposure, calibration coefficients equaled 0.31 (95% CI: 0.14, 0.47) for nearest monitor and 0.54 (95% CI: 0.42, 0.65) for outdoor home predictions. Between-city heterogeneity was not found for outdoor home PM2.5 for either true exposure. Heterogeneity was significant for nearest monitor PM2.5, for both true exposures, but not after adjusting for city-average motor vehicle number for total personal PM2.5. Conclusions: Calibration coefficients were <1, consistent with previously reported chronic health risks using nearest monitor exposures being under-estimated when ambient concentrations are the exposure of interest. Calibration coefficients were closer to 1 for outdoor home predictions, likely reflecting less spatial error. Further research is needed to determine how our findings can be incorporated in future health studies. PMID:24410940
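A calibration coefficient of the kind reported above is, at its simplest, the slope from regressing the "true" (personal) exposure on the surrogate (ambient) exposure. A one-city sketch with made-up numbers, ignoring the random-effects structure across cities:

```python
import numpy as np

rng = np.random.default_rng(11)
n = 500
ambient = rng.gamma(shape=4.0, scale=3.0, size=n)     # monitor PM2.5 (hypothetical)
# personal exposure of ambient origin: attenuated ambient plus unshared variation
personal = 0.5 * ambient + rng.normal(0.0, 2.0, n)

# calibration coefficient: slope of true (personal) on surrogate (ambient)
slope = np.cov(personal, ambient)[0, 1] / np.var(ambient, ddof=1)
```

A slope below 1, as in the abstract's 0.31 and 0.54 estimates, indicates that health effects estimated against the surrogate are attenuated relative to the ambient-origin exposure of interest.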
ERIC Educational Resources Information Center
Culpepper, Steven Andrew
2012-01-01
Measurement error significantly biases interaction effects and distorts researchers' inferences regarding interactive hypotheses. This article focuses on the single-indicator case and shows how to accurately estimate group slope differences by disattenuating interaction effects with errors-in-variables (EIV) regression. New analytic findings were…
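For the single-indicator case, the standard disattenuation correction behind EIV regression divides the naive slope by the indicator's reliability. A minimal simulation, with the reliability assumed known (as is typical in the single-indicator setting); the interaction-specific machinery of the article is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(5)
n, beta, rel = 2000, 1.0, 0.7              # true slope; reliability of the indicator
x = rng.normal(size=n)                      # latent (error-free) predictor
w = x + rng.normal(0, np.sqrt((1 - rel) / rel), n)   # error-prone single indicator
y = beta * x + rng.normal(0, 0.5, n)

b_naive = np.cov(w, y)[0, 1] / np.var(w, ddof=1)   # attenuated toward zero
b_eiv = b_naive / rel                              # disattenuated (EIV) estimate
```

Applied separately within groups, the same correction recovers group slope differences that the naive analysis understates.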
Selective Weighted Least Squares Method for Fourier Transform Infrared Quantitative Analysis.
Wang, Xin; Li, Yan; Wei, Haoyun; Chen, Xia
2017-06-01
Classical least squares (CLS) regression is a popular multivariate statistical method used frequently for quantitative analysis using Fourier transform infrared (FT-IR) spectrometry. Classical least squares provides the best unbiased estimator for uncorrelated residual errors with zero mean and equal variance. However, the noise in FT-IR spectra, which accounts for a large portion of the residual errors, is heteroscedastic. Thus, if this noise with zero mean dominates in the residual errors, the weighted least squares (WLS) regression method described in this paper is a better estimator than CLS. However, if bias errors, such as the residual baseline error, are significant, WLS may perform worse than CLS. In this paper, we compare the effect of noise and bias error in using CLS and WLS in quantitative analysis. Results indicated that for wavenumbers with low absorbance, the bias error significantly affected the error, such that the performance of CLS is better than that of WLS. However, for wavenumbers with high absorbance, the noise significantly affected the error, and WLS proves to be better than CLS. Thus, we propose a selective weighted least squares (SWLS) regression that processes data at different wavenumbers using either CLS or WLS based on a selection criterion, i.e., lower or higher than an absorbance threshold. The effects of various factors on the optimal threshold value (OTV) for SWLS have been studied through numerical simulations. These studies reported that: (1) the concentration and the analyte type had minimal effect on OTV; and (2) the major factor that influences OTV is the ratio between the bias error and the standard deviation of the noise. The last part of this paper is dedicated to quantitative analysis of methane gas spectra and methane/toluene mixture gas spectra measured using FT-IR spectrometry with CLS, WLS, and SWLS.
The standard error of prediction (SEP), bias of prediction (bias), and the residual sum of squares of the errors (RSS) from the three quantitative analyses were compared. In methane gas analysis, SWLS yielded the lowest SEP and RSS among the three methods. In methane/toluene mixture gas analysis, a modification of the SWLS has been presented to tackle the bias error from other components. The SWLS without modification presents the lowest SEP in all cases but not bias and RSS. The modification of SWLS reduced the bias, which showed a lower RSS than CLS, especially for small components.
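One plausible reading of the SWLS selection rule can be sketched for a single-component Beer's law model: equal (CLS-style) weights below an absorbance threshold and inverse-variance (WLS-style) weights, normalized at the threshold, above it. The absorptivities, noise model, baseline bias, and threshold below are all invented for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
k = np.linspace(0.01, 1.0, 200)            # pure-component absorptivities (hypothetical)
c_true = 2.0
absorb = k * c_true                        # noise-free absorbance spectrum
noise_sd = 0.02 * (0.1 + absorb)           # heteroscedastic noise grows with absorbance
a = absorb + 0.005 + noise_sd * rng.normal(size=k.size)   # + small baseline bias

def wls_conc(weights):
    """Weighted least squares for a = k*c: c = sum(w k a) / sum(w k^2)."""
    return np.sum(weights * k * a) / np.sum(weights * k**2)

c_cls = wls_conc(np.ones_like(k))          # CLS: equal weights everywhere
c_wls = wls_conc(1.0 / noise_sd**2)        # WLS: inverse-variance weights everywhere

# SWLS idea: CLS-style weights below the absorbance threshold, inverse-variance
# weights (scaled to match at the threshold) above it
thr = 0.3
sd_thr = 0.02 * (0.1 + thr)
c_swls = wls_conc(np.where(absorb < thr, 1.0, (sd_thr / noise_sd) ** 2))
```

On this toy spectrum all three estimators land near the true concentration; the paper's point is how their errors diverge as the bias-to-noise ratio changes.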
Liu, Zun-Lei; Yuan, Xing-Wei; Yan, Li-Ping; Yang, Lin-Lin; Cheng, Jia-Hua
2013-09-01
By using the 2008-2010 investigation data about the body condition of small yellow croaker in the offshore waters of southern Yellow Sea (SYS), open waters of northern East China Sea (NECS), and offshore waters of middle East China Sea (MECS), this paper analyzed the spatial heterogeneity of body length-body mass of juvenile and adult small yellow croakers by the statistical approaches of mean regression model and quantile regression model. The results showed that the residual standard errors from the analysis of covariance (ANCOVA) and the linear mixed-effects model were similar, and those from the simple linear regression were the highest. For the juvenile small yellow croakers, their mean body mass in SYS and NECS estimated by the mixed-effects mean regression model was higher than the overall average mass across the three regions, while the mean body mass in MECS was below the overall average. For the adult small yellow croakers, their mean body mass in NECS was higher than the overall average, while the mean body mass in SYS and MECS was below the overall average. The results from quantile regression indicated the substantial differences in the allometric relationships of juvenile small yellow croakers between SYS, NECS, and MECS, with the estimated mean exponent of the allometric relationship in SYS being 2.85, and the interquartile range being from 2.63 to 2.96, which indicated the heterogeneity of body form. The results from ANCOVA showed that the allometric body length-body mass relationships were significantly different between the 25th and 75th percentile exponent values (F=6.38, df=1737, P<0.01) and the 25th percentile and median exponent values (F=2.35, df=1737, P=0.039). The relationship was marginally different between the median and 75th percentile exponent values (F=2.21, df=1737, P=0.051). The estimated body length-body mass exponent of adult small yellow croakers in SYS was 3.01 (10th and 95th percentiles = 2.77 and 3.1, respectively). 
The estimated body length-body mass relationships were significantly different from the lower and upper quantiles of the exponent (F=3.31, df=2793, P=0.01) and the median and upper quantiles (F=3.56, df=2793, P<0.01), while no significant difference was observed between the lower and median quantiles (F=0.98, df=2793, P=0.43).
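The allometric exponents reported above come from fitting the body length-body mass relationship W = aL^b; on the log scale this is ordinary linear regression. A sketch with synthetic fish (the quantile-regression refinement of the paper is omitted):

```python
import numpy as np

rng = np.random.default_rng(9)
n = 300
length = rng.uniform(5.0, 25.0, n)     # body length, cm (hypothetical)
b_true = 2.85                          # exponent of the order reported in the abstract
mass = 0.01 * length**b_true * np.exp(rng.normal(0, 0.1, n))

# log-log regression: log W = log a + b log L
b_hat, log_a = np.polyfit(np.log(length), np.log(mass), 1)
```

Quantile regression repeats this fit at several quantiles of the mass distribution, which is how the interquartile range of exponents (e.g., 2.63-2.96) is obtained.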
Monitoring interannual variation in global crop yield using long-term AVHRR and MODIS observations
NASA Astrophysics Data System (ADS)
Zhang, Xiaoyang; Zhang, Qingyuan
2016-04-01
Advanced Very High Resolution Radiometer (AVHRR) and Moderate Resolution Imaging Spectroradiometer (MODIS) data have been extensively applied for crop yield prediction because of their daily temporal resolution and global coverage. This study investigated global crop yield using the daily two-band Enhanced Vegetation Index (EVI2) derived from AVHRR (1981-1999) and MODIS (2000-2013) observations at a spatial resolution of 0.05° (∼5 km). Specifically, the EVI2 temporal trajectory of crop growth was simulated using a hybrid piecewise logistic model (HPLM) for individual pixels, which was used to detect crop phenological metrics. The derived crop phenology was then applied to calculate crop greenness, defined as EVI2 amplitude and EVI2 integration during annual crop growing seasons, which was further aggregated for the croplands in each country. The interannual variations in EVI2 amplitude and EVI2 integration were then correlated with the variation in cereal yield from 1982-2012 for individual countries using a stepwise regression model. The results show that the confidence level of the established regression models was higher than 90% (P value < 0.1) in most countries in the northern hemisphere, although it was relatively poor in the southern hemisphere (mainly in Africa). The error in the yield prediction was relatively smaller in America, Europe and East Asia than in Africa. In the 10 countries with the largest cereal production across the world, the prediction error was less than 9% during the past three decades. This suggests that crop phenology-controlled greenness from coarse-resolution satellite data has the capability of predicting national crop yield across the world, which could provide timely and reliable crop information for global agricultural trade and policymakers.
Chen, Wei; Li, Hui; Hou, Enke; Wang, Shengquan; Wang, Guirong; Panahi, Mahdi; Li, Tao; Peng, Tao; Guo, Chen; Niu, Chao; Xiao, Lele; Wang, Jiale; Xie, Xiaoshen; Ahmad, Baharin Bin
2018-09-01
The aim of the current study was to produce groundwater spring potential maps using novel ensemble weights-of-evidence (WoE) with logistic regression (LR) and functional tree (FT) models. First, a total of 66 springs were identified by field surveys, out of which 70% of the spring locations were used for training the models and 30% of the spring locations were employed for the validation process. Second, a total of 14 affecting factors including aspect, altitude, slope, plan curvature, profile curvature, stream power index (SPI), topographic wetness index (TWI), sediment transport index (STI), lithology, normalized difference vegetation index (NDVI), land use, soil, distance to roads, and distance to streams were used to analyze the spatial relationship between these affecting factors and spring occurrences. Multicollinearity analysis and feature selection of the correlation attribute evaluation (CAE) method were employed to optimize the affecting factors. Subsequently, the novel ensembles of the WoE, LR, and FT models were constructed using the training dataset. Finally, the receiver operating characteristic (ROC) curves, standard error, confidence interval (CI) at 95%, and significance level P were employed to validate and compare the performance of the three models. Overall, all three models performed well for groundwater spring potential evaluation. The prediction capability of the FT model, with the highest AUC values, the smallest standard errors, the narrowest CIs, and the smallest P values for the training and validation datasets, is better than that of the other models. The groundwater spring potential maps can be adopted for the management of water resources and land use by planners and engineers. Copyright © 2018 Elsevier B.V. All rights reserved.
Digital halftoning methods for selectively partitioning error into achromatic and chromatic channels
NASA Technical Reports Server (NTRS)
Mulligan, Jeffrey B.
1990-01-01
A method is described for reducing the visibility of artifacts arising in the display of quantized color images on CRT displays. The method is based on the differential spatial sensitivity of the human visual system to chromatic and achromatic modulations. Because the visual system has the highest spatial and temporal acuity for the luminance component of an image, a technique which will reduce luminance artifacts at the expense of introducing high-frequency chromatic errors is sought. A method based on controlling the correlations between the quantization errors in the individual phosphor images is explored. The luminance component is greatest when the phosphor errors are positively correlated, and is minimized when the phosphor errors are negatively correlated. The greatest effect of the correlation is obtained when the intensity quantization step sizes of the individual phosphors have equal luminances. For the ordered dither algorithm, a version of the method can be implemented by simply inverting the matrix of thresholds for one of the color components.
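The threshold-inversion trick described for the ordered dither algorithm can be demonstrated directly: dithering one channel with the complemented threshold matrix makes its quantization error exactly anti-correlated with the others, so the channel sum (a crude luminance proxy) stays constant on a flat field. The 4×4 Bayer matrix below is the standard one; the flat mid-gray patch and the equal-channel treatment are toy simplifications of the paper's luminance-weighted setup.

```python
import numpy as np

BAYER4 = (1.0 / 16.0) * np.array([[ 0,  8,  2, 10],
                                  [12,  4, 14,  6],
                                  [ 3, 11,  1,  9],
                                  [15,  7, 13,  5]])

def ordered_dither(img, thresholds):
    """Binary ordered dither: tile the threshold matrix over the image."""
    h, w = img.shape
    ty, tx = thresholds.shape
    tiled = np.tile(thresholds, (h // ty + 1, w // tx + 1))[:h, :w]
    return (img > tiled).astype(float)

gray = np.full((8, 8), 0.5)                  # flat mid-gray test patch
r = ordered_dither(gray, BAYER4)
g = ordered_dither(gray, BAYER4)             # same thresholds: errors correlate
b = ordered_dither(gray, 15 / 16 - BAYER4)   # inverted thresholds: anti-correlated
```

With the inverted matrix, `b` turns on exactly where `r` is off, so `r + b` is 1 everywhere: the quantization error has been pushed out of the (approximate) luminance channel and into chrominance.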
Molina, Sergio L; Stodden, David F
2018-04-01
This study examined variability in throwing speed and spatial error to test the prediction of an inverted-U function (i.e., impulse-variability [IV] theory) and the speed-accuracy trade-off. Forty-five 9- to 11-year-old children were instructed to throw at a specified percentage of maximum speed (45%, 65%, 85%, and 100%) and hit the wall target. Results indicated no statistically significant differences in variable error across the target conditions (p = .72), failing to support the inverted-U hypothesis. Spatial accuracy results indicated no statistically significant differences with mean radial error (p = .18), centroid radial error (p = .13), and bivariate variable error (p = .08) also failing to support the speed-accuracy trade-off in overarm throwing. As neither throwing performance variability nor accuracy changed across percentages of maximum speed in this sample of children as well as in a previous adult sample, current policy and practices of practitioners may need to be reevaluated.
Functional CAR models for large spatially correlated functional datasets.
Zhang, Lin; Baladandayuthapani, Veerabhadran; Zhu, Hongxiao; Baggerly, Keith A; Majewski, Tadeusz; Czerniak, Bogdan A; Morris, Jeffrey S
2016-01-01
We develop a functional conditional autoregressive (CAR) model for spatially correlated data for which functions are collected on areal units of a lattice. Our model performs functional response regression while accounting for spatial correlations with potentially nonseparable and nonstationary covariance structure, in both the space and functional domains. We show theoretically that our construction leads to a CAR model at each functional location, with spatial covariance parameters varying and borrowing strength across the functional domain. Using basis transformation strategies, the nonseparable spatial-functional model is computationally scalable to enormous functional datasets, generalizable to different basis functions, and can be used on functions defined on higher dimensional domains such as images. Through simulation studies, we demonstrate that accounting for the spatial correlation in our modeling leads to improved functional regression performance. Applied to a high-throughput spatially correlated copy number dataset, the model identifies genetic markers not identified by comparable methods that ignore spatial correlations.
Nagwani, Naresh Kumar; Deo, Shirish V
2014-01-01
Understanding the compressive strength of concrete is important for activities like construction arrangement, prestressing operations, proportioning new mixtures, and quality assurance. Regression techniques are most widely used for prediction tasks, where the relationship between the independent variables and the dependent (prediction) variable is identified. The accuracy of regression techniques for prediction can be improved if clustering is used along with regression, since clustering ensures more accurate curve fitting between the dependent and independent variables. In this work a cluster regression technique is applied for estimating the compressive strength of concrete, and a novel approach is proposed for predicting the concrete compressive strength. The objective of this work is to demonstrate that clustering along with regression yields smaller prediction errors for estimating the concrete compressive strength. The proposed technique consists of two major stages: in the first stage, clustering is used to group concrete data with similar characteristics, and in the second stage regression techniques are applied over these clusters (groups) to predict the compressive strength from individual clusters. It is found from experiments that clustering along with regression gives minimum errors for predicting the compressive strength of concrete; the fuzzy C-means clustering algorithm also performs better than the K-means algorithm.
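The two-stage cluster-then-regress idea can be sketched with a simple one-dimensional 2-means step followed by a separate linear fit inside each cluster. The water/cement data below are synthetic, not the paper's concrete mixtures, and real applications would cluster on the full mix design.

```python
import numpy as np

rng = np.random.default_rng(4)
# hypothetical mixes: water/cement ratio -> strength, with two mix regimes
wc = rng.uniform(0.3, 0.7, 200)
regime = (wc > 0.5).astype(int)
strength = np.where(regime == 0, 80 - 60 * wc, 60 - 40 * wc) + rng.normal(0, 1.5, 200)

# stage 1: cluster the predictor space (a simple 2-means on wc)
centroids = np.array([0.4, 0.6])
for _ in range(20):
    labels = np.argmin(np.abs(wc[:, None] - centroids[None, :]), axis=1)
    centroids = np.array([wc[labels == c].mean() for c in (0, 1)])
labels = np.argmin(np.abs(wc[:, None] - centroids[None, :]), axis=1)

# stage 2: fit a separate regression inside each cluster
models = {c: np.polyfit(wc[labels == c], strength[labels == c], 1) for c in (0, 1)}

def predict(x):
    c = int(np.argmin(np.abs(x - centroids)))
    return np.polyval(models[c], x)
```

Because each cluster gets its own line, the piecewise fit tracks the two regimes more closely than a single global regression would.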
Spatiotemporal predictions of soil properties and states in variably saturated landscapes
NASA Astrophysics Data System (ADS)
Franz, Trenton E.; Loecke, Terrance D.; Burgin, Amy J.; Zhou, Yuzhen; Le, Tri; Moscicki, David
2017-07-01
Understanding greenhouse gas (GHG) fluxes from landscapes with variably saturated soil conditions is challenging given the highly dynamic nature of GHG fluxes in both space and time, dubbed hot spots and hot moments. On one hand, our ability to directly monitor these processes is limited by sparse in situ and surface chamber observational networks. On the other hand, remote sensing approaches provide spatial data sets but are limited by infrequent imaging over time. We use a robust statistical framework to merge sparse sensor network observations with reconnaissance-style hydrogeophysical mapping at a well-characterized site in Ohio. We find that combining time-lapse electromagnetic induction surveys with empirical orthogonal functions provides additional environmental covariates related to soil properties and states at high spatial resolutions (approximately 5 m). A cross-validation experiment using eight different spatial interpolation methods versus 120 in situ soil cores indicated an approximately 30% reduction in root-mean-square error for soil properties (clay weight percent and total soil carbon weight percent) using hydrogeophysically derived environmental covariates with regression kriging. In addition, the hydrogeophysically derived environmental covariates were found to be good predictors of soil states (soil temperature, soil water content, and soil oxygen). The presented framework allows for temporal gap filling of individual sensor data sets and provides flexible geometric interpolation to complex areas/volumes. We anticipate that the framework, with its flexible temporal and spatial monitoring options, will be useful in designing future monitoring networks and in supporting the next generation of hyper-resolution hydrologic and biogeochemical models.
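The regression-kriging idea above is: regress the soil property on an environmental covariate, then spatially interpolate the regression residuals and add them back to the trend. In this minimal sketch all locations, covariate values, and clay percentages are hypothetical, and the residuals are interpolated with inverse-distance weighting as a simple stand-in for kriging, which the paper actually uses.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

# Sampled soil cores: (x, y) location in m, covariate value (e.g., a geophysical
# map reading), and measured clay weight percent. All values are hypothetical.
pts = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
cov = [2.0, 4.0, 3.0, 5.0]
clay = [10.5, 18.0, 14.5, 22.0]

a, b = fit_line(cov, clay)                        # trend: clay ~ covariate
resid = [c - (a + b * v) for c, v in zip(clay, cov)]

def predict(loc, cov_at_loc, power=2.0):
    """Trend prediction plus IDW-interpolated residual at a new location.

    Assumes loc does not coincide exactly with a sampled point."""
    ws = []
    for (px, py) in pts:
        d2 = (loc[0] - px) ** 2 + (loc[1] - py) ** 2
        ws.append(1.0 / d2 ** (power / 2))
    r = sum(w * e for w, e in zip(ws, resid)) / sum(ws)
    return a + b * cov_at_loc + r

clay_hat = predict((5.0, 5.0), 3.5)
```

The trend carries the covariate information; the interpolated residual carries the spatial autocorrelation that the trend alone misses.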
Bonilla, Manuel G.; Mark, Robert K.; Lienkaemper, James J.
1984-01-01
In order to refine correlations of surface-wave magnitude, fault rupture length at the ground surface, and fault displacement at the surface by including the uncertainties in these variables, the existing data were critically reviewed and a new data base was compiled. Earthquake magnitudes were redetermined as necessary to make them as consistent as possible with the Gutenberg methods and results, which make up much of the data base. Measurement errors were estimated for the three variables for 58 moderate to large shallow-focus earthquakes. Regression analyses were then made utilizing the estimated measurement errors. The regression analysis demonstrates that the relations among the variables magnitude, length, and displacement are stochastic in nature. The stochastic variance, introduced in part by incomplete surface expression of seismogenic faulting, variation in shear modulus, and regional factors, dominates the estimated measurement errors. Thus, it is appropriate to use ordinary least squares for the regression models, rather than regression models based upon an underlying deterministic relation in which the variance results primarily from measurement errors. Significant differences exist in correlations of certain combinations of length, displacement, and magnitude when events are grouped by fault type or by region, including attenuation regions delineated by Evernden and others. Estimates of the magnitude and the standard deviation of the magnitude of a prehistoric or future earthquake associated with a fault can be made by correlating Ms with the logarithms of rupture length, fault displacement, or the product of length and displacement. Fault rupture area could be reliably estimated for about 20 of the events in the data set. Regression of Ms on rupture area did not result in a marked improvement over regressions that did not involve rupture area.
Because no subduction-zone earthquakes are included in this study, the reported results do not apply to such zones.
NASA Astrophysics Data System (ADS)
Tesfagiorgis, Kibrewossen B.
Satellite Precipitation Estimates (SPEs) may be the only available source of information for operational hydrologic and flash flood prediction due to spatial limitations of radar and gauge products in mountainous regions. The present work develops an approach to seamlessly blend satellite, available radar, climatological, and gauge precipitation products to fill gaps in the ground-based radar precipitation field. To mix different precipitation products, the errors of the products relative to each other must be removed. For bias correction, the study uses a new ensemble-based method which aims to estimate spatially varying multiplicative biases in SPEs using a radar-gauge precipitation product. Bias factors were calculated for a randomly selected sample of rainy pixels in the study area. Spatial fields of estimated bias were generated taking into account spatial variation and random errors in the sampled values. In addition to biases, there is sometimes also a spatial offset between the radar and satellite precipitation estimates; one of them has to be geometrically corrected with reference to the other. A set of corresponding raining points between the SPE and radar products is selected to apply linear registration, using a regularized least-squares technique, to minimize the dislocation error in SPEs with respect to the available radar products. A weighted Successive Correction Method (SCM) is used to merge the error-corrected satellite and radar precipitation estimates. In addition to SCM, we use a combination of SCM and a Bayesian spatial method for merging the rain gauge and climatological precipitation sources with radar and SPEs. We demonstrate the method using two satellite-based products, CPC Morphing (CMORPH) and Hydro-Estimator (HE); two radar-gauge based products, Stage-II and ST-IV; a climatological product, PRISM; and a rain gauge dataset for several rain events from 2006 to 2008 over different geographical locations of the United States.
Results show that: (a) the method of ensembles helped reduce biases in SPEs significantly; and (b) the SCM method in combination with the Bayesian spatial model produced a precipitation product in good agreement with independent measurements. The study implies that, using the available radar pixels surrounding the gap area together with rain gauge, PRISM, and satellite products, a radar-like product is achievable over radar gap areas, which benefits the operational meteorology and hydrology community.
Determination of suitable drying curve model for bread moisture loss during baking
NASA Astrophysics Data System (ADS)
Soleimani Pour-Damanab, A. R.; Jafary, A.; Rafiee, S.
2013-03-01
This study presents mathematical modelling of bread moisture loss (drying) during baking in a conventional bread-baking process. In order to estimate and select the appropriate moisture loss curve equation, 11 different semi-theoretical and empirical models were applied to the experimental data and compared according to their correlation coefficients, chi-squared test, and root-mean-square error, as estimated by nonlinear regression analysis. Of all the drying models, the Page model was selected as the best one, according to its correlation coefficient, chi-squared, and root-mean-square error values and its simplicity. The mean absolute estimation error of the proposed model by linear regression analysis for the natural and forced convection modes was 2.43% and 4.74%, respectively.
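The Page model referenced above has the form MR = exp(-k * t^n), where MR is the moisture ratio and t is time. One standard way to estimate k and n is to linearize it, ln(-ln MR) = ln k + n ln t, and fit by ordinary least squares; the sketch below does exactly that on synthetic data generated from known parameters (the paper itself uses nonlinear regression on measured data).

```python
import math

# Assumed "true" Page parameters used to generate synthetic drying data.
k_true, n_true = 0.05, 1.3
t = [5.0, 10.0, 20.0, 40.0, 60.0]                       # baking time (assumed minutes)
mr = [math.exp(-k_true * ti ** n_true) for ti in t]     # synthetic moisture ratio

# Linearized regression: Y = ln(-ln MR) against X = ln t gives slope n, intercept ln k.
X = [math.log(ti) for ti in t]
Y = [math.log(-math.log(m)) for m in mr]

mx, my = sum(X) / len(X), sum(Y) / len(Y)
n_hat = sum((x - mx) * (y - my) for x, y in zip(X, Y)) / sum((x - mx) ** 2 for x in X)
k_hat = math.exp(my - n_hat * mx)
```

Because the synthetic data satisfy the model exactly, the regression recovers k and n to floating-point accuracy; with real baking data the fit quality is what the correlation coefficient, chi-squared, and RMSE criteria in the abstract measure.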
Divergent estimation error in portfolio optimization and in linear regression
NASA Astrophysics Data System (ADS)
Kondor, I.; Varga-Haszonits, I.
2008-08-01
The problem of estimation error in portfolio optimization is discussed, in the limit where the portfolio size N and the sample size T go to infinity such that their ratio is fixed. The estimation error strongly depends on the ratio N/T and diverges at a critical value of this parameter. This divergence is the manifestation of an algorithmic phase transition; it is accompanied by a number of critical phenomena and displays universality. As the structure of a large number of multidimensional regression and modelling problems is very similar to portfolio optimization, the scope of the above observations extends far beyond finance and covers a large number of problems in operations research, machine learning, bioinformatics, medical science, economics, and technology.
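A commonly cited concrete form of this divergence, for the global minimum-variance portfolio estimated from a sample covariance matrix of i.i.d. returns, comes from the related Pafka-Kondor line of work (it is not stated in the abstract itself, so treat it as context rather than the paper's result):

```latex
% Ratio of the estimated portfolio's true risk to the optimal risk:
q_0 \;=\; \sqrt{\frac{\sigma^2_{\mathrm{est}}}{\sigma^2_{\mathrm{opt}}}}
    \;=\; \frac{1}{\sqrt{\,1 - N/T\,}}, \qquad N/T < 1,
% which diverges as the ratio N/T approaches the critical value 1.
```

The same N/T scaling is the reason the abstract's conclusions carry over to high-dimensional regression, where N plays the role of the number of parameters and T the number of observations.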
NASA Astrophysics Data System (ADS)
Lin, Yingzhi; Deng, Xiangzheng; Li, Xing; Ma, Enjun
2014-12-01
Spatially explicit simulation of land use change is the basis for estimating the effects of land use and cover change on energy fluxes, ecology, and the environment. At the pixel level, logistic regression is one of the most common approaches used in spatially explicit land use allocation models to determine the relationship between land use and the causal factors driving land use change, and thereby to evaluate land use suitability. However, these models have a drawback in that they do not determine/allocate land use based on the direct relationship between land use change and its driving factors. Consequently, a multinomial logistic regression method was introduced to address this flaw and thereby judge the suitability of a type of land use in any given pixel in a case study area of Jiangxi Province, China. A comparison of the two regression methods indicated that the proportion of correctly allocated pixels using multinomial logistic regression was 92.98%, which was 8.47% higher than that obtained using logistic regression. Paired t-test results also showed that pixels were more clearly distinguished by multinomial logistic regression than by logistic regression. In conclusion, multinomial logistic regression is a more efficient and accurate method for the spatial allocation of land use changes. The application of this method in future land use change studies may improve the accuracy of predicting the effects of land use and cover change on energy fluxes, ecology, and the environment.
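The allocation step described above can be sketched as follows: multinomial logistic regression gives each land-use class its own coefficient vector, per-class linear scores are turned into probabilities with a softmax, and the pixel is allocated to the most probable class. The class names, coefficients, and driving-factor values below are hypothetical, purely for illustration.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)                        # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

# One hypothetical coefficient vector (intercept, slope, elevation) per class.
coefs = {
    "cropland": [0.5,  0.8, -1.2],
    "forest":   [0.2, -0.4,  1.5],
    "built-up": [0.1,  1.1, -0.6],
}
pixel = [1.0, 0.3, 0.9]                    # [1, slope, elevation], standardized

scores = {k: sum(b * x for b, x in zip(bs, pixel)) for k, bs in coefs.items()}
probs = dict(zip(scores, softmax(list(scores.values()))))
allocated = max(probs, key=probs.get)      # allocate pixel to most probable class
```

In contrast, the binary logistic approach fits one independent suitability model per class, which is the indirectness the abstract criticizes; the softmax couples all classes in a single model.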
PREDICTION OF SOLAR FLARE SIZE AND TIME-TO-FLARE USING SUPPORT VECTOR MACHINE REGRESSION
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boucheron, Laura E.; Al-Ghraibah, Amani; McAteer, R. T. James
We study the prediction of solar flare size and time-to-flare using 38 features describing magnetic complexity of the photospheric magnetic field. This work uses support vector regression to formulate a mapping from the 38-dimensional feature space to a continuous-valued label vector representing flare size or time-to-flare. When we consider flaring regions only, we find an average error in estimating flare size of approximately half a geostationary operational environmental satellite (GOES) class. When we additionally consider non-flaring regions, we find an increased average error of approximately three-fourths a GOES class. We also consider thresholding the regressed flare size for the experiment containing both flaring and non-flaring regions and find a true positive rate of 0.69 and a true negative rate of 0.86 for flare prediction. The results for both of these size regression experiments are consistent across a wide range of predictive time windows, indicating that the magnetic complexity features may be persistent in appearance long before flare activity. This is supported by our larger error rates of some 40 hr in the time-to-flare regression problem. The 38 magnetic complexity features considered here appear to have discriminative potential for flare size, but their persistence in time makes them less discriminative for the time-to-flare problem.
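The thresholding step above turns a continuous regressed flare size into a binary flare/no-flare prediction, from which true positive and true negative rates are computed. A minimal sketch, with hypothetical labels and regressed values rather than the paper's data:

```python
# Hypothetical ground truth (1 = flaring region) and regressed "flare size".
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_reg = [0.9, 0.7, 0.2, 0.1, 0.4, 0.05, 0.8, 0.3]
thresh = 0.5                               # assumed decision threshold

# Threshold the continuous regression output into binary predictions.
y_pred = [1 if s >= thresh else 0 for s in y_reg]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
tpr = tp / sum(y_true)                     # true positive rate
tnr = tn / (len(y_true) - sum(y_true))     # true negative rate
```

Sweeping the threshold trades TPR against TNR; the 0.69/0.86 pair the abstract reports corresponds to one operating point on that curve.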
Knopman, Debra S.; Voss, Clifford I.
1987-01-01
The spatial and temporal variability of sensitivities has a significant impact on parameter estimation and sampling design for studies of solute transport in porous media. Physical insight into the behavior of sensitivities is offered through an analysis of analytically derived sensitivities for the one-dimensional form of the advection-dispersion equation. When parameters are estimated in regression models of one-dimensional transport, the spatial and temporal variability in sensitivities influences variance and covariance of parameter estimates. Several principles account for the observed influence of sensitivities on parameter uncertainty. (1) Information about a physical parameter may be most accurately gained at points in space and time with a high sensitivity to the parameter. (2) As the distance of observation points from the upstream boundary increases, maximum sensitivity to velocity during passage of the solute front increases and the consequent estimate of velocity tends to have lower variance. (3) The frequency of sampling must be “in phase” with the S shape of the dispersion sensitivity curve to yield the most information on dispersion. (4) The sensitivity to the dispersion coefficient is usually at least an order of magnitude less than the sensitivity to velocity. (5) The assumed probability distribution of random error in observations of solute concentration determines the form of the sensitivities. (6) If variance in random error in observations is large, trends in sensitivities of observation points may be obscured by noise and thus have limited value in predicting variance in parameter estimates among designs. (7) Designs that minimize the variance of one parameter may not necessarily minimize the variance of other parameters. (8) The time and space interval over which an observation point is sensitive to a given parameter depends on the actual values of the parameters in the underlying physical system.
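For reference, the sensitivities discussed above are derivatives of the solution of the one-dimensional advection-dispersion equation with respect to its parameters. In standard notation (symbols assumed, not quoted from the paper):

```latex
% 1-D advection-dispersion equation for solute concentration C(x,t),
% with velocity v and dispersion coefficient D:
\frac{\partial C}{\partial t} \;=\; D\,\frac{\partial^2 C}{\partial x^2}
  \;-\; v\,\frac{\partial C}{\partial x},
% and the parameter sensitivities analyzed in the paper:
s_v = \frac{\partial C}{\partial v}, \qquad s_D = \frac{\partial C}{\partial D}.
```

Principle (4) in the abstract is a statement about the relative magnitudes of s_v and s_D; principles (1)-(3) concern where and when these derivatives peak in x and t.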
Radon-222 concentrations in ground water and soil gas on Indian reservations in Wisconsin
DeWild, John F.; Krohelski, James T.
1995-01-01
For sites with wells finished in the sand and gravel aquifer, the coefficient of determination (R2) of the regression of concentration of radon-222 in ground water as a function of well depth is 0.003 and the significance level is 0.32, which indicates that there is not a statistically significant relation between radon-222 concentrations in ground water and well depth. The coefficient of determination of the regression of radon-222 in ground water and soil gas is 0.19 and the root mean square error of the regression line is 271 picocuries per liter. Even though the significance level (0.036) indicates a statistical relation, the root mean square error of the regression is so large that the regression equation would not give reliable predictions. Because of an inadequate number of samples, similar statistical analyses could not be performed for sites with wells finished in the crystalline and sedimentary bedrock aquifers.
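The two summary statistics used in this abstract, the coefficient of determination (R²) and the root-mean-square error of a fitted regression, can be computed as below. The data here are hypothetical and deliberately well-correlated, unlike the weak radon relationships the abstract describes.

```python
import math

# Hypothetical predictor and response values.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# Fit simple linear regression y = a + b*x by ordinary least squares.
n = len(x)
mx, my = sum(x) / n, sum(y) / n
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
a = my - b * mx

pred = [a + b * xi for xi in x]
ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))   # residual sum of squares
ss_tot = sum((yi - my) ** 2 for yi in y)                  # total sum of squares
r2 = 1 - ss_res / ss_tot                                  # coefficient of determination
rmse = math.sqrt(ss_res / n)                              # root-mean-square error
```

An R² of 0.003, as reported for radon versus well depth, means the regression explains essentially none of the variance; and as the abstract notes, even a statistically significant relation (p = 0.036) is of little predictive use when the RMSE is large relative to the measured values.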
A note on variance estimation in random effects meta-regression.
Sidik, Kurex; Jonkman, Jeffrey N
2005-01-01
For random effects meta-regression inference, variance estimation for the parameter estimates is discussed. Because estimated weights are used for meta-regression analysis in practice, the assumed or estimated covariance matrix used in meta-regression is not strictly correct, due to possible errors in estimating the weights. Therefore, this note investigates the use of a robust variance estimation approach for obtaining variances of the parameter estimates in random effects meta-regression inference. This method treats the assumed covariance matrix of the effect measure variables as a working covariance matrix. Using an example of meta-analysis data from clinical trials of a vaccine, the robust variance estimation approach is illustrated in comparison with two other methods of variance estimation. A simulation study is presented, comparing the three methods of variance estimation in terms of bias and coverage probability. We find that, despite the seeming suitability of the robust estimator for random effects meta-regression, the improved variance estimator of Knapp and Hartung (2003) yields the best performance among the three estimators, and thus may provide the best protection against errors in the estimated weights.
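For the intercept-only case (a plain random-effects pooled estimate), the contrast between the model-based variance, which trusts the estimated weights, and the robust "sandwich" variance, which uses the observed squared residuals, can be sketched as follows. Effect estimates and weights below are hypothetical; the paper's meta-regression setting has covariates as well.

```python
# Hypothetical per-study effect estimates and estimated inverse-variance weights.
y = [0.30, 0.10, 0.25, 0.45, 0.05]
w = [10.0, 8.0, 12.0, 6.0, 9.0]

sw = sum(w)
beta = sum(wi * yi for wi, yi in zip(w, y)) / sw     # weighted mean effect

# Model-based variance: correct only if the weights are the true inverse variances.
var_model = 1.0 / sw

# Robust (sandwich) variance: treats the weights as a working choice and
# replaces the assumed variances with observed squared residuals.
var_robust = sum(wi ** 2 * (yi - beta) ** 2 for wi, yi in zip(w, y)) / sw ** 2
```

The robust form protects against misspecified weights, which is exactly the concern the note raises; its finding, however, is that the Knapp-Hartung adjusted estimator still performs best in simulation.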
Comparing spatial regression to random forests for large ...
Environmental data may be “large” due to the number of records, the number of covariates, or both. Random forests has a reputation for good predictive performance when using many covariates, whereas spatial regression, when using reduced-rank methods, has a reputation for good predictive performance when using many records. In this study, we compare these two techniques using a data set containing the macroinvertebrate multimetric index (MMI) at 1859 stream sites with over 200 landscape covariates. Our primary goal is predicting MMI at over 1.1 million perennial stream reaches across the USA. For spatial regression modeling, we develop two new methods to accommodate large data: (1) a procedure that estimates optimal Box-Cox transformations to linearize covariate relationships; and (2) a computationally efficient covariate selection routine that takes into account spatial autocorrelation. We show that our new methods lead to cross-validated performance similar to random forests, but that there is an advantage for spatial regression when quantifying the uncertainty of the predictions. Simulations are used to clarify advantages for each method. This research investigates different approaches for modeling and mapping national stream condition. We use MMI data from the EPA's National Rivers and Streams Assessment and predictors from StreamCat (Hill et al., 2015). Previous studies have focused on modeling the MMI condition classes (i.e., good, fair, and poor).
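The first new method above, choosing a Box-Cox transformation that linearizes a covariate's relationship to the response, can be sketched by scanning a small grid of lambda values and keeping the one that maximizes the absolute Pearson correlation with the response. The data, the lambda grid, and the correlation-based selection criterion are illustrative assumptions, not the paper's actual procedure.

```python
import math

def boxcox(x, lam):
    """Box-Cox transform for x > 0; lam = 0 is the log transform."""
    return math.log(x) if lam == 0 else (x ** lam - 1) / lam

def pearson(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

# Hypothetical covariate with multiplicative growth; response is roughly
# linear in log(x), so the scan should prefer lambda = 0.
x = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
y = [0.1, 1.1, 1.9, 3.1, 3.9, 5.1]

best_lam = max([0.0, 0.5, 1.0, 2.0],
               key=lambda lam: abs(pearson([boxcox(xi, lam) for xi in x], y)))
```

Linearizing covariates this way lets a linear spatial regression compete with random forests, which handles nonlinear covariate relationships natively.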
Role of Aedes aegypti (Linnaeus) and Aedes albopictus (Skuse) in local dengue epidemics in Taiwan.
Tsai, Pui-Jen; Teng, Hwa-Jen
2016-11-09
Aedes mosquitoes in Taiwan mainly comprise Aedes albopictus and Ae. aegypti. However, the species contributing to autochthonous dengue spread and the extent to which it occurs remain unclear. Thus, in this study, we spatially analyzed real data to determine spatial features related to local dengue incidence and mosquito density, particularly that of Ae. albopictus and Ae. aegypti. We used the bivariate Moran's I statistic and geographically weighted regression (GWR) spatial methods to analyze the global spatial dependence and locally regressed relationships between (1) imported dengue incidence and Breteau indices (BIs) of Ae. albopictus, (2) imported dengue incidence and the BI of Ae. aegypti, (3) autochthonous dengue incidence and the BI of Ae. albopictus, (4) autochthonous dengue incidence and the BI of Ae. aegypti, (5) all dengue incidence and the BI of Ae. albopictus, (6) all dengue incidence and the BI of Ae. aegypti, (7) the BI of Ae. albopictus and human population density, and (8) the BI of Ae. aegypti and human population density in 348 townships in Taiwan. In the GWR models, regression coefficients of the spatially regressed relationships between autochthonous dengue incidence and the vector density of Ae. aegypti were significant and positive in most townships in Taiwan. However, Ae. albopictus had significant but negative regression coefficients in clusters of dengue epidemics. In the global bivariate Moran's index, the spatial dependence between autochthonous dengue incidence and the vector density of Ae. aegypti was significant and positively correlated in Taiwan (bivariate Moran's index = 0.51). However, Ae. albopictus exhibited a positively significant but low correlation (bivariate Moran's index = 0.06). Similar results were observed with the two spatial methods for all dengue incidence and the Aedes mosquitoes (Ae. aegypti and Ae. albopictus). The regression coefficients of the spatially regressed relationships between imported dengue cases and the Aedes mosquitoes (Ae. aegypti and Ae. albopictus) were significant in 348 townships in Taiwan. The results indicated that local Aedes mosquitoes do not contribute to the dengue incidence of imported cases. The density of Ae. aegypti positively correlated with human population density. By contrast, the density of Ae. albopictus negatively correlated with human population density in areas of southern Taiwan. The results indicate that Ae. aegypti has more opportunities for human-mosquito contact in dengue-endemic areas in southern Taiwan. Ae. aegypti, but not Ae. albopictus, together with human population density in southern Taiwan, is closely associated with an increased risk of autochthonous dengue incidence.
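The global bivariate Moran's I used above measures whether high values of one variable (dengue incidence) tend to sit next to high values of another (vector density). With both variables standardized to z-scores and a row-standardized weight matrix W (so the total weight equals n), it reduces to I_xy = (1/n) * sum_ij w_ij * zx_i * zy_j. The four-township line geometry and all values below are hypothetical.

```python
def zscores(v):
    """Standardize to zero mean, unit (population) standard deviation."""
    n = len(v)
    m = sum(v) / n
    s = (sum((x - m) ** 2 for x in v) / n) ** 0.5
    return [(x - m) / s for x in v]

# Row-standardized adjacency for 4 townships in a line: 1-2-3-4.
# Each row sums to 1, so the total weight S0 equals n and I = sum / n.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
dengue = [10.0, 8.0, 3.0, 1.0]    # hypothetical incidence per township
vector = [30.0, 25.0, 10.0, 5.0]  # hypothetical Breteau index of Ae. aegypti

zx, zy = zscores(dengue), zscores(vector)
n = len(zx)
moran_xy = sum(W[i][j] * zx[i] * zy[j] for i in range(n) for j in range(n)) / n
```

Because both variables decline together along the line, neighboring townships pair high-with-high and low-with-low, giving a positive I, the same sign pattern as the 0.51 reported for Ae. aegypti.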
NASA Technical Reports Server (NTRS)
Rose, F. G.
1983-01-01
Modeled temperature data from a one-dimensional, time-dependent, initial value, planetary boundary layer model for 16 separate model runs with varying initial values of moisture availability are applied, by the use of a regression equation, to longwave infrared GOES satellite data to infer moisture availability over a regional area in the central U.S. This was done for several days during the summers of 1978 and 1980 where a large gradient in the antecedent precipitation index (API) represented the boundary between a drought area and a region of near normal precipitation. Correlations between satellite derived moisture availability and API were found to exist. Errors from the presence of clouds, water vapor and other spatial inhomogeneities made the use of the measurement for anything except the relative degree of moisture availability dubious.
Garcia, Tanya P; Ma, Yanyuan
2017-10-01
We develop consistent and efficient estimation of parameters in general regression models with mismeasured covariates. We assume the model error and covariate distributions are unspecified, and the measurement error distribution is a general parametric distribution with unknown variance-covariance. We construct root-n consistent, asymptotically normal and locally efficient estimators using the semiparametric efficient score. We do not estimate any unknown distribution or model error heteroskedasticity. Instead, we form the estimator under possibly incorrect working distribution models for the model error, error-prone covariate, or both. Empirical results demonstrate robustness to different incorrect working models in homoscedastic and heteroskedastic models with error-prone covariates.
NASA Astrophysics Data System (ADS)
Zeng, Jicai; Zha, Yuanyuan; Zhang, Yonggen; Shi, Liangsheng; Zhu, Yan; Yang, Jinzhong
2017-11-01
Multi-scale modeling of the localized groundwater flow problems in a large-scale aquifer has been extensively investigated under the context of cost-benefit controversy. An alternative is to couple the parent and child models with different spatial and temporal scales, which may result in non-trivial sub-model errors in the local areas of interest. Basically, such errors in the child models originate from the deficiency in the coupling methods, as well as from the inadequacy in the spatial and temporal discretizations of the parent and child models. In this study, we investigate the sub-model errors within a generalized one-way coupling scheme given its numerical stability and efficiency, which enables more flexibility in choosing sub-models. To couple the models at different scales, the head solution at parent scale is delivered downward onto the child boundary nodes by means of the spatial and temporal head interpolation approaches. The efficiency of the coupling model is improved either by refining the grid or time step size in the parent and child models, or by carefully locating the sub-model boundary nodes. The temporal truncation errors in the sub-models can be significantly reduced by the adaptive local time-stepping scheme. The generalized one-way coupling scheme is promising to handle the multi-scale groundwater flow problems with complex stresses and heterogeneity.
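The coupling step described above, delivering the parent-scale head solution onto child boundary nodes by spatial and temporal interpolation, can be sketched with plain bilinear interpolation. The grid coordinates, time steps, and head values are hypothetical, and real models would interpolate over 2-D or 3-D parent grids rather than this 1-D transect.

```python
def lerp(a, b, f):
    return a + (b - a) * f

def parent_head_at(x, t, xs, ts, H):
    """Interpolate parent heads H[time_index][node_index] to point (x, t).

    Assumes xs and ts are ascending and (x, t) lies inside their ranges."""
    i = max(j for j in range(len(ts) - 1) if ts[j] <= t)   # bracketing time step
    k = max(j for j in range(len(xs) - 1) if xs[j] <= x)   # bracketing grid cell
    ft = (t - ts[i]) / (ts[i + 1] - ts[i])
    fx = (x - xs[k]) / (xs[k + 1] - xs[k])
    h0 = lerp(H[i][k], H[i][k + 1], fx)        # spatial interp at parent step i
    h1 = lerp(H[i + 1][k], H[i + 1][k + 1], fx)  # spatial interp at step i+1
    return lerp(h0, h1, ft)                    # temporal interp between steps

xs = [0.0, 100.0, 200.0]           # parent grid x-coordinates (m)
ts = [0.0, 10.0]                   # parent time steps (d)
H = [[50.0, 48.0, 46.0],           # heads at t = 0
     [49.0, 47.0, 45.0]]           # heads at t = 10

# Head prescribed on a child boundary node at x = 150 m for child step t = 5 d:
h_child = parent_head_at(150.0, 5.0, xs, ts, H)
```

The sub-model errors discussed in the abstract arise precisely here: linear interpolation between coarse parent steps and nodes misrepresents the boundary condition when the true head varies faster than the parent discretization resolves, which is why refining the parent grid or time step, or relocating the boundary nodes, reduces the error.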
NASA Technical Reports Server (NTRS)
Baxa, E. G., Jr.
1974-01-01
A theoretical formulation of differential and composite OMEGA error is presented to establish hypotheses about the functional relationships between various parameters and OMEGA navigational errors. Computer software developed to provide for extensive statistical analysis of the phase data is described. Results from the regression analysis used to conduct parameter sensitivity studies on differential OMEGA error tend to validate the theoretically based hypothesis concerning the relationship between uncorrected differential OMEGA error and receiver separation range and azimuth. Limited results of measurement of receiver repeatability error and line of position measurement error are also presented.
Wang, Ching-Yun; Cullings, Harry; Song, Xiao; Kopecky, Kenneth J.
2017-01-01
Observational epidemiological studies often confront the problem of estimating exposure-disease relationships when the exposure is not measured exactly. In this paper, we investigate exposure measurement error in excess relative risk regression, which is a widely used model in radiation exposure effect research. In the study cohort, a surrogate variable is available for the true unobserved exposure variable. The surrogate variable satisfies a generalized version of the classical additive measurement error model, but it may or may not have repeated measurements. In addition, an instrumental variable is available for individuals in a subset of the whole cohort. We develop a nonparametric correction (NPC) estimator using data from the subcohort, and further propose a joint nonparametric correction (JNPC) estimator using all observed data to adjust for exposure measurement error. An optimal linear combination estimator of JNPC and NPC is further developed. The proposed estimators are nonparametric, consistent without imposing a covariate or error distribution, and robust to heteroscedastic errors. Finite sample performance is examined via a simulation study. We apply the developed methods to data from the Radiation Effects Research Foundation, in which chromosome aberration is used to adjust for the effects of radiation dose measurement error on the estimation of radiation dose responses. PMID:29354018
Spatial perception predicts laparoscopic skills on virtual reality laparoscopy simulator.
Hassan, I; Gerdes, B; Koller, M; Dick, B; Hellwig, D; Rothmund, M; Zielke, A
2007-06-01
This study evaluates the influence of visual-spatial perception on laparoscopic performance of novices with a virtual reality simulator (LapSim(R)). Twenty-four novices completed standardized tests of visual-spatial perception (Lameris Toegepaste Natuurwetenschappelijk Onderzoek [TNO] Test(R) and Stumpf-Fay Cube Perspectives Test(R)) and laparoscopic skills were assessed objectively, while performing 1-h practice sessions on the LapSim(R), comprising of coordination, cutting, and clip application tasks. Outcome variables included time to complete the tasks, economy of motion as well as total error scores, respectively. The degree of visual-spatial perception correlated significantly with laparoscopic performance on the LapSim(R) scores. Participants with a high degree of spatial perception (Group A) performed the tasks faster than those (Group B) who had a low degree of spatial perception (p = 0.001). Individuals with a high degree of spatial perception also scored better for economy of motion (p = 0.021), tissue damage (p = 0.009), and total error (p = 0.007). Among novices, visual-spatial perception is associated with manual skills performed on a virtual reality simulator. This result may be important for educators to develop adequate training programs that can be individually adapted.
What aspects of vision facilitate haptic processing?
Millar, Susanna; Al-Attar, Zainab
2005-12-01
We investigate how vision affects haptic performance when task-relevant visual cues are reduced or excluded. The task was to remember the spatial location of six landmarks that were explored by touch in a tactile map. Here, we use specially designed spectacles that simulate residual peripheral vision, tunnel vision, diffuse light perception, and total blindness. Results for target locations differed, suggesting additional effects from adjacent touch cues. These are discussed. Touch with full vision was most accurate, as expected. Peripheral and tunnel vision, which reduce visuo-spatial cues, differed in error pattern. Both were less accurate than full vision, and significantly more accurate than touch with diffuse light perception, and touch alone. The important finding was that touch with diffuse light perception, which excludes spatial cues, did not differ from touch without vision in performance accuracy, nor in location error pattern. The contrast between spatially relevant versus spatially irrelevant vision provides new, rather decisive, evidence against the hypothesis that vision affects haptic processing even if it does not add task-relevant information. The results support optimal integration theories, and suggest that spatial and non-spatial aspects of vision need explicit distinction in bimodal studies and theories of spatial integration.
Item Discrimination and Type I Error in the Detection of Differential Item Functioning
ERIC Educational Resources Information Center
Li, Yanju; Brooks, Gordon P.; Johanson, George A.
2012-01-01
In 2009, DeMars stated that when impact exists there will be Type I error inflation, especially with larger sample sizes and larger discrimination parameters for items. One purpose of this study is to present the patterns of Type I error rates using Mantel-Haenszel (MH) and logistic regression (LR) procedures when the mean ability between the…