large spatial datasets: Topics by Science.gov

Sample records for large spatial datasets

Finding Spatio-Temporal Patterns in Large Sensor Datasets

ERIC Educational Resources Information Center

McGuire, Michael Patrick

2010-01-01

Spatial or temporal data mining tasks are performed in the context of the relevant space, defined by a spatial neighborhood, and the relevant time period, defined by a specific time interval. Furthermore, when mining large spatio-temporal datasets, interesting patterns typically emerge where the dataset is most dynamic. This dissertation is…
Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets.

PubMed

Datta, Abhirup; Banerjee, Sudipto; Finley, Andrew O; Gelfand, Alan E

2016-01-01

Spatial process models for analyzing geostatistical data entail computations that become prohibitive as the number of spatial locations becomes large. This article develops a class of highly scalable nearest-neighbor Gaussian process (NNGP) models to provide fully model-based inference for large geostatistical datasets. We establish that the NNGP is a well-defined spatial process providing legitimate finite-dimensional Gaussian densities with sparse precision matrices. We embed the NNGP as a sparsity-inducing prior within a rich hierarchical modeling framework and outline how computationally efficient Markov chain Monte Carlo (MCMC) algorithms can be executed without storing or decomposing large matrices. The number of floating point operations (flops) per iteration of this algorithm is linear in the number of spatial locations, thereby rendering substantial scalability. We illustrate the computational and inferential benefits of the NNGP over competing methods using simulation studies and also analyze forest biomass from a massive U.S. Forest Inventory dataset at a scale that precludes alternative dimension-reducing methods. Supplementary materials for this article are available online.
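The sparsity and linear cost described in this abstract can be made concrete with a short derivation. The notation below is an assumption introduced for illustration and does not appear in the abstract: w is the latent process at ordered locations s_1, ..., s_n, N(s_i) is a set of at most m nearest neighbors of s_i among the earlier locations, and C is the covariance function of the parent Gaussian process.

```latex
\[
\tilde{p}(w) \;=\; \prod_{i=1}^{n} p\bigl(w(s_i) \mid w_{N(s_i)}\bigr),
\qquad N(s_i) \subseteq \{s_1, \dots, s_{i-1}\}, \quad |N(s_i)| \le m .
\]
% Each factor is a univariate Gaussian conditional of the parent process:
\[
w(s_i) \mid w_{N(s_i)} \sim \mathcal{N}\bigl(b_i^{\top} w_{N(s_i)},\; f_i\bigr),
\qquad
b_i = C_{N(s_i)}^{-1} c_i ,
\qquad
f_i = C(s_i, s_i) - c_i^{\top} C_{N(s_i)}^{-1} c_i ,
\]
% where C_{N(s_i)} is the (at most m x m) covariance matrix of w_{N(s_i)}
% and c_i = Cov(w_{N(s_i)}, w(s_i)).
```

Under these assumptions each factor needs only an m-by-m solve, so evaluating the density costs on the order of n m^3 operations, which is linear in n for fixed m. Because each location is conditioned on at most m others, the implied precision matrix is sparse.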
Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets

PubMed Central

Datta, Abhirup; Banerjee, Sudipto; Finley, Andrew O.; Gelfand, Alan E.

2018-01-01

Spatial process models for analyzing geostatistical data entail computations that become prohibitive as the number of spatial locations becomes large. This article develops a class of highly scalable nearest-neighbor Gaussian process (NNGP) models to provide fully model-based inference for large geostatistical datasets. We establish that the NNGP is a well-defined spatial process providing legitimate finite-dimensional Gaussian densities with sparse precision matrices. We embed the NNGP as a sparsity-inducing prior within a rich hierarchical modeling framework and outline how computationally efficient Markov chain Monte Carlo (MCMC) algorithms can be executed without storing or decomposing large matrices. The number of floating point operations (flops) per iteration of this algorithm is linear in the number of spatial locations, thereby rendering substantial scalability. We illustrate the computational and inferential benefits of the NNGP over competing methods using simulation studies and also analyze forest biomass from a massive U.S. Forest Inventory dataset at a scale that precludes alternative dimension-reducing methods. Supplementary materials for this article are available online. PMID:29720777
geoknife: Reproducible web-processing of large gridded datasets

USGS Publications Warehouse

Read, Jordan S.; Walker, Jordan I.; Appling, Alison P.; Blodgett, David L.; Read, Emily K.; Winslow, Luke A.

2016-01-01

Geoprocessing of large gridded data according to overlap with irregular landscape features is common to many large-scale ecological analyses. The geoknife R package was created to facilitate reproducible analyses of gridded datasets found on the U.S. Geological Survey Geo Data Portal web application or elsewhere, using a web-enabled workflow that eliminates the need to download and store large datasets that are reliably hosted on the Internet. The package provides access to several data subset and summarization algorithms that are available on remote web processing servers. Outputs from geoknife include spatial and temporal data subsets, spatially-averaged time series values filtered by user-specified areas of interest, and categorical coverage fractions for various land-use types.
A novel on-line spatial-temporal k-anonymity method for location privacy protection from sequence rules-based inference attacks.

PubMed

Zhang, Haitao; Wu, Chenxue; Chen, Zewei; Liu, Zhao; Zhu, Yunhong

2017-01-01

Analyzing large-scale spatial-temporal k-anonymity datasets recorded in location-based service (LBS) application servers can benefit some LBS applications. However, such analyses can allow adversaries to make inference attacks that cannot be handled by spatial-temporal k-anonymity methods or other methods for protecting sensitive knowledge. In response to this challenge, first we defined a destination location prediction attack model based on privacy-sensitive sequence rules mined from large scale anonymity datasets. Then we proposed a novel on-line spatial-temporal k-anonymity method that can resist such inference attacks. Our anti-attack technique generates new anonymity datasets with awareness of privacy-sensitive sequence rules. The new datasets extend the original sequence database of anonymity datasets to hide the privacy-sensitive rules progressively. The process includes two phases: off-line analysis and on-line application. In the off-line phase, sequence rules are mined from an original sequence database of anonymity datasets, and privacy-sensitive sequence rules are developed by correlating privacy-sensitive spatial regions with spatial grid cells among the sequence rules. In the on-line phase, new anonymity datasets are generated upon LBS requests by adopting specific generalization and avoidance principles to hide the privacy-sensitive sequence rules progressively from the extended sequence anonymity datasets database. We conducted extensive experiments to test the performance of the proposed method, and to explore the influence of the parameter K value. The results demonstrated that our proposed approach is faster and more effective for hiding privacy-sensitive sequence rules in terms of hiding sensitive rules ratios to eliminate inference attacks. 
Our method also had fewer side effects in terms of generating new sensitive rules ratios than the traditional spatial-temporal k-anonymity method, and had basically the same side effects in terms of non-sensitive rules variation ratios with the traditional spatial-temporal k-anonymity method. Furthermore, we also found the performance variation tendency from the parameter K value, which can help achieve the goal of hiding the maximum number of original sensitive rules while generating a minimum of new sensitive rules and affecting a minimum number of non-sensitive rules.
A novel on-line spatial-temporal k-anonymity method for location privacy protection from sequence rules-based inference attacks

PubMed Central

Wu, Chenxue; Liu, Zhao; Zhu, Yunhong

2017-01-01

Analyzing large-scale spatial-temporal k-anonymity datasets recorded in location-based service (LBS) application servers can benefit some LBS applications. However, such analyses can allow adversaries to make inference attacks that cannot be handled by spatial-temporal k-anonymity methods or other methods for protecting sensitive knowledge. In response to this challenge, first we defined a destination location prediction attack model based on privacy-sensitive sequence rules mined from large scale anonymity datasets. Then we proposed a novel on-line spatial-temporal k-anonymity method that can resist such inference attacks. Our anti-attack technique generates new anonymity datasets with awareness of privacy-sensitive sequence rules. The new datasets extend the original sequence database of anonymity datasets to hide the privacy-sensitive rules progressively. The process includes two phases: off-line analysis and on-line application. In the off-line phase, sequence rules are mined from an original sequence database of anonymity datasets, and privacy-sensitive sequence rules are developed by correlating privacy-sensitive spatial regions with spatial grid cells among the sequence rules. In the on-line phase, new anonymity datasets are generated upon LBS requests by adopting specific generalization and avoidance principles to hide the privacy-sensitive sequence rules progressively from the extended sequence anonymity datasets database. We conducted extensive experiments to test the performance of the proposed method, and to explore the influence of the parameter K value. The results demonstrated that our proposed approach is faster and more effective for hiding privacy-sensitive sequence rules in terms of hiding sensitive rules ratios to eliminate inference attacks. 
Our method also had fewer side effects in terms of generating new sensitive rules ratios than the traditional spatial-temporal k-anonymity method, and had basically the same side effects in terms of non-sensitive rules variation ratios with the traditional spatial-temporal k-anonymity method. Furthermore, we also found the performance variation tendency from the parameter K value, which can help achieve the goal of hiding the maximum number of original sensitive rules while generating a minimum of new sensitive rules and affecting a minimum number of non-sensitive rules. PMID:28767687
Functional CAR models for large spatially correlated functional datasets.

PubMed

Zhang, Lin; Baladandayuthapani, Veerabhadran; Zhu, Hongxiao; Baggerly, Keith A; Majewski, Tadeusz; Czerniak, Bogdan A; Morris, Jeffrey S

2016-01-01

We develop a functional conditional autoregressive (CAR) model for spatially correlated data for which functions are collected on areal units of a lattice. Our model performs functional response regression while accounting for spatial correlations with potentially nonseparable and nonstationary covariance structure, in both the space and functional domains. We show theoretically that our construction leads to a CAR model at each functional location, with spatial covariance parameters varying and borrowing strength across the functional domain. Using basis transformation strategies, the nonseparable spatial-functional model is computationally scalable to enormous functional datasets, generalizable to different basis functions, and can be used on functions defined on higher dimensional domains such as images. Through simulation studies, we demonstrate that accounting for the spatial correlation in our modeling leads to improved functional regression performance. Applied to a high-throughput spatially correlated copy number dataset, the model identifies genetic markers not identified by comparable methods that ignore spatial correlations.
What are we ‘tweeting’ about obesity? Mapping tweets with Topic Modeling and Geographic Information System

PubMed Central

Ghosh, Debarchana (Debs); Guha, Rajarshi

2014-01-01

Public health related tweets are difficult to identify in large conversational datasets like Twitter.com. Even more challenging is the visualization and analyses of the spatial patterns encoded in tweets. This study has the following objectives: How can topic modeling be used to identify relevant public health topics such as obesity on Twitter.com? What are the common obesity related themes? What is the spatial pattern of the themes? What are the research challenges of using large conversational datasets from social networking sites? Obesity is chosen as a test theme to demonstrate the effectiveness of topic modeling using Latent Dirichlet Allocation (LDA) and spatial analysis using Geographic Information System (GIS). The dataset is constructed from tweets (originating from the United States) extracted from Twitter.com on obesity-related queries. Examples of such queries are ‘food deserts’, ‘fast food’, and ‘childhood obesity’. The tweets are also georeferenced and time stamped. Three cohesive and meaningful themes such as ‘childhood obesity and schools’, ‘obesity prevention’, and ‘obesity and food habits’ are extracted from the LDA model. The GIS analysis of the extracted themes show distinct spatial pattern between rural and urban areas, northern and southern states, and between coasts and inland states. Further, relating the themes with ancillary datasets such as US census and locations of fast food restaurants based upon the location of the tweets in a GIS environment opened new avenues for spatial analyses and mapping. Therefore the techniques used in this study provide a possible toolset for computational social scientists in general and health researchers in specific to better understand health problems from large conversational datasets. PMID:25126022
What are we 'tweeting' about obesity? Mapping tweets with Topic Modeling and Geographic Information System.

PubMed

Ghosh, Debarchana Debs; Guha, Rajarshi

2013-01-01

Public health related tweets are difficult to identify in large conversational datasets like Twitter.com. Even more challenging is the visualization and analyses of the spatial patterns encoded in tweets. This study has the following objectives: How can topic modeling be used to identify relevant public health topics such as obesity on Twitter.com? What are the common obesity related themes? What is the spatial pattern of the themes? What are the research challenges of using large conversational datasets from social networking sites? Obesity is chosen as a test theme to demonstrate the effectiveness of topic modeling using Latent Dirichlet Allocation (LDA) and spatial analysis using Geographic Information System (GIS). The dataset is constructed from tweets (originating from the United States) extracted from Twitter.com on obesity-related queries. Examples of such queries are 'food deserts', 'fast food', and 'childhood obesity'. The tweets are also georeferenced and time stamped. Three cohesive and meaningful themes such as 'childhood obesity and schools', 'obesity prevention', and 'obesity and food habits' are extracted from the LDA model. The GIS analysis of the extracted themes show distinct spatial pattern between rural and urban areas, northern and southern states, and between coasts and inland states. Further, relating the themes with ancillary datasets such as US census and locations of fast food restaurants based upon the location of the tweets in a GIS environment opened new avenues for spatial analyses and mapping. Therefore the techniques used in this study provide a possible toolset for computational social scientists in general and health researchers in specific to better understand health problems from large conversational datasets.
Precipitation climatology over India: validation with observations and reanalysis datasets and spatial trends

NASA Astrophysics Data System (ADS)

Kishore, P.; Jyothi, S.; Basha, Ghouse; Rao, S. V. B.; Rajeevan, M.; Velicogna, Isabella; Sutterley, Tyler C.

2016-01-01

Changing rainfall patterns have a significant effect on water resources and agricultural output in many countries, especially in a country like India where the economy depends on rain-fed agriculture. Rainfall over India has large spatial as well as temporal variability. To understand the variability in rainfall, spatial-temporal analyses of rainfall have been carried out using 107 years (1901-2007) of daily gridded India Meteorological Department (IMD) rainfall datasets. Further, the IMD precipitation data are validated against different observational and reanalysis datasets for the period from 1989 to 2007. The Global Precipitation Climatology Project data show features similar to those of IMD with a high degree of agreement, whereas Asian Precipitation-Highly-Resolved Observational Data Integration Towards Evaluation data show similar features but with large differences, especially over the northwest, west coast and western Himalayas. Spatially, large deviation is observed in the interior peninsula during the monsoon season with National Aeronautics and Space Administration-Modern Era Retrospective-analysis for Research and Applications (NASA-MERRA), pre-monsoon with Japanese 25-year Reanalysis (JRA-25), and post-monsoon with Climate Forecast System Reanalysis (CFSR) datasets. Among the reanalysis datasets, European Centre for Medium-Range Weather Forecasts Interim Re-Analysis (ERA-Interim) shows good agreement, followed by CFSR, NASA-MERRA, and JRA-25. Further, for the first time, with high-resolution and long-term IMD data, the spatial distribution of trends is estimated using a robust regression analysis technique on the annual and seasonal rainfall data with respect to different regions of India. Significant positive and negative trends are noticed in the whole time series of data during the monsoon season. 
The northeast and west coast of the Indian region show significant positive trends, while negative trends are found over the western Himalayas and the north central Indian region.
Architectural Implications for Spatial Object Association Algorithms*

PubMed Central

Kumar, Vijay S.; Kurc, Tahsin; Saltz, Joel; Abdulla, Ghaleb; Kohn, Scott R.; Matarazzo, Celeste

2013-01-01

Spatial object association, also referred to as crossmatch of spatial datasets, is the problem of identifying and comparing objects in two or more datasets based on their positions in a common spatial coordinate system. In this work, we evaluate two crossmatch algorithms that are used for astronomical sky surveys, on the following database system architecture configurations: (1) Netezza Performance Server®, a parallel database system with active disk style processing capabilities, (2) MySQL Cluster, a high-throughput network database system, and (3) a hybrid configuration consisting of a collection of independent database system instances with data replication support. Our evaluation provides insights about how architectural characteristics of these systems affect the performance of the spatial crossmatch algorithms. We conducted our study using real use-case scenarios borrowed from a large-scale astronomy application known as the Large Synoptic Survey Telescope (LSST). PMID:25692244
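As a rough illustration of what a positional crossmatch computes, the following Python sketch pairs objects from two catalogs that lie within a given distance of each other. It is not one of the two algorithms evaluated in the record above. It assumes planar x/y coordinates and a simple grid index, whereas sky surveys work with angular separation on the sphere, and the catalog names and values are hypothetical.

```python
import math
from collections import defaultdict


def crossmatch(catalog_a, catalog_b, radius):
    """Return sorted (id_a, id_b) pairs whose planar distance is at most `radius`."""
    # Bucket catalog_b into square cells of side `radius`. Any match for a point
    # must then lie in that point's own cell or one of the 8 neighbouring cells.
    grid = defaultdict(list)
    for id_b, (x, y) in catalog_b.items():
        grid[(math.floor(x / radius), math.floor(y / radius))].append((id_b, x, y))

    matches = []
    for id_a, (x, y) in catalog_a.items():
        cx, cy = math.floor(x / radius), math.floor(y / radius)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for id_b, bx, by in grid.get((cx + dx, cy + dy), []):
                    if math.hypot(x - bx, y - by) <= radius:
                        matches.append((id_a, id_b))
    return sorted(matches)


# Hypothetical catalogs: object id -> (x, y) position in a shared coordinate system.
catalog_a = {"a1": (0.0, 0.0), "a2": (5.0, 5.0)}
catalog_b = {"b1": (0.3, 0.4), "b2": (5.0, 6.5), "b3": (-0.6, 0.0)}
pairs = crossmatch(catalog_a, catalog_b, radius=1.0)
```

The grid keeps the comparison count close to linear when objects are spread evenly, which is the same concern (avoiding an all-pairs comparison) that motivates evaluating crossmatch on different database architectures.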
Development of a GIService based on spatial data mining for location choice of convenience stores in Taipei City

NASA Astrophysics Data System (ADS)

Jung, Chinte; Sun, Chih-Hong

2006-10-01

Motivated by the increasing accessibility of technology, more and more spatial data are being made digitally available. Extracting valuable knowledge from these large (spatial) databases is becoming increasingly important to businesses as well. It is essential to be able to analyze and utilize these large datasets, convert them into useful knowledge, and transmit them through GIS-enabled instruments and the Internet, conveying the key information to business decision-makers effectively and benefiting business entities. In this research, we combine the techniques of GIS, spatial decision support system (SDSS), spatial data mining (SDM), and ArcGIS Server to achieve the following goals: (1) integrate databases from spatial and non-spatial datasets about the locations of businesses in Taipei, Taiwan; (2) use association rules, one of the SDM methods, to extract knowledge from the integrated databases; and (3) develop a Web-based SDSS GIService, built with ArcGIS Server, as a location-selection tool for businesses.
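Association rules of the kind mentioned in goal (2) are usually scored by support and confidence. The Python sketch below shows those two measures on a toy set of records. The attribute names and records are hypothetical and are not taken from the Taipei study, which does not describe its data at this level of detail.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent); 0.0 if the antecedent never occurs."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical records: attributes observed at four candidate locations.
sites = [
    {"near_metro", "near_school", "has_store"},
    {"near_metro", "has_store"},
    {"near_school"},
    {"near_metro", "near_school", "has_store"},
]

rule_support = support(sites, {"near_metro", "has_store"})
rule_confidence = confidence(sites, {"near_metro"}, {"has_store"})
```

Here the rule "near_metro implies has_store" holds in three of four records (support 0.75) and in every record that is near a metro station (confidence 1.0).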
ESSG-based global spatial reference frame for datasets interrelation

NASA Astrophysics Data System (ADS)

Yu, J. Q.; Wu, L. X.; Jia, Y. J.

2013-10-01

To understand the highly complex Earth system, a large volume and a large variety of datasets on the planet Earth are being obtained, distributed, and shared worldwide every day. However, few existing systems concentrate on the distribution and interrelation of different datasets in a common Global Spatial Reference Frame (GSRF), which poses an invisible obstacle to data sharing and scientific collaboration. The Group on Earth Observation (GEO) has recently established a new GSRF, named Earth System Spatial Grid (ESSG), for global dataset distribution, sharing and interrelation in its 2012-2015 Working Plan. The ESSG may bridge the gap among different spatial datasets and hence overcome the obstacles. This paper presents the implementation of the ESSG-based GSRF. A reference spheroid, a grid subdivision scheme, and a suitable encoding system are required to implement it. The radius of the ESSG reference spheroid was set to double the approximate Earth radius so that datasets from different areas of Earth system science are covered. The same positioning and orientation parameters as Earth Centred Earth Fixed (ECEF) were adopted for the ESSG reference spheroid so that any other GSRF can be freely transformed into the ESSG-based GSRF. Spheroid degenerated octree grid with radius refinement (SDOG-R) and its encoding method were taken as the grid subdivision and encoding scheme for its good performance in many aspects. A triple (C, T, A) model is introduced to represent and link different datasets based on the ESSG-based GSRF. Finally, the methods of coordinate transformation between the ESSG-based GSRF and other GSRFs are presented to make the ESSG-based GSRF operable and propagable.
Architectural Implications for Spatial Object Association Algorithms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kumar, V S; Kurc, T; Saltz, J

2009-01-29

Spatial object association, also referred to as cross-match of spatial datasets, is the problem of identifying and comparing objects in two or more datasets based on their positions in a common spatial coordinate system. In this work, we evaluate two crossmatch algorithms that are used for astronomical sky surveys, on the following database system architecture configurations: (1) Netezza Performance Server®, a parallel database system with active disk style processing capabilities, (2) MySQL Cluster, a high-throughput network database system, and (3) a hybrid configuration consisting of a collection of independent database system instances with data replication support. Our evaluation provides insights about how architectural characteristics of these systems affect the performance of the spatial crossmatch algorithms. We conducted our study using real use-case scenarios borrowed from a large-scale astronomy application known as the Large Synoptic Survey Telescope (LSST).
Hierarchical Bayesian spatial models for predicting multiple forest variables using waveform LiDAR, hyperspectral imagery, and large inventory datasets

USGS Publications Warehouse

Finley, Andrew O.; Banerjee, Sudipto; Cook, Bruce D.; Bradford, John B.

2013-01-01

In this paper we detail a multivariate spatial regression model that couples LiDAR, hyperspectral and forest inventory data to predict forest outcome variables at a high spatial resolution. The proposed model is used to analyze forest inventory data collected on the US Forest Service Penobscot Experimental Forest (PEF), ME, USA. In addition to helping meet the regression model's assumptions, results from the PEF analysis suggest that the addition of multivariate spatial random effects improves model fit and predictive ability, compared with two commonly applied modeling approaches. This improvement results from explicitly modeling the covariation among forest outcome variables and spatial dependence among observations through the random effects. Direct application of such multivariate models to even moderately large datasets is often computationally infeasible because of cubic order matrix algorithms involved in estimation. We apply a spatial dimension reduction technique to help overcome this computational hurdle without sacrificing richness in modeling.
Resolution testing and limitations of geodetic and tsunami datasets for finite fault inversions along subduction zones

NASA Astrophysics Data System (ADS)

Williamson, A.; Newman, A. V.

2017-12-01

Finite fault inversions utilizing multiple datasets have become commonplace for large earthquakes pending data availability. The mixture of geodetic datasets such as Global Navigational Satellite Systems (GNSS) and InSAR, seismic waveforms, and when applicable, tsunami waveforms from Deep-Ocean Assessment and Reporting of Tsunami (DART) gauges, provide slightly different observations that when incorporated together lead to a more robust model of fault slip distribution. The merging of different datasets is of particular importance along subduction zones where direct observations of seafloor deformation over the rupture area are extremely limited. Instead, instrumentation measures related ground motion from tens to hundreds of kilometers away. The distance from the event and dataset type can lead to a variable degree of resolution, affecting the ability to accurately model the spatial distribution of slip. This study analyzes the spatial resolution attained individually from geodetic and tsunami datasets as well as in a combined dataset. We constrain the importance of distance between estimated parameters and observed data and how that varies between land-based and open ocean datasets. Analysis focuses on accurately scaled subduction zone synthetic models as well as analysis of the relationship between slip and data in recent large subduction zone earthquakes. This study shows that seafloor deformation sensitive datasets, like open-ocean tsunami waveforms or seafloor geodetic instrumentation, can provide unique offshore resolution for understanding most large and particularly tsunamigenic megathrust earthquake activity. In most environments, we simply lack the capability to resolve static displacements using land-based geodetic observations.
Mapping and spatiotemporal analysis tool for hydrological data: Spellmap

USDA-ARS's Scientific Manuscript database

Lack of data management and analyses tools is one of the major limitations to effectively evaluate and use large datasets of high-resolution atmospheric, surface, and subsurface observations. High spatial and temporal resolution datasets better represent the spatiotemporal variability of hydrologica...
A high-resolution European dataset for hydrologic modeling

NASA Astrophysics Data System (ADS)

Ntegeka, Victor; Salamon, Peter; Gomes, Goncalo; Sint, Hadewij; Lorini, Valerio; Thielen, Jutta

2013-04-01

There is an increasing demand for large scale hydrological models not only in the field of modeling the impact of climate change on water resources but also for disaster risk assessments and flood or drought early warning systems. These large scale models need to be calibrated and verified against large amounts of observations in order to judge their capabilities to predict the future. However, the creation of large scale datasets is challenging for it requires collection, harmonization, and quality checking of large amounts of observations. For this reason, only a limited number of such datasets exist. In this work, we present a pan European, high-resolution gridded dataset of meteorological observations (EFAS-Meteo) which was designed with the aim to drive a large scale hydrological model. Similar European and global gridded datasets already exist, such as the HadGHCND (Caesar et al., 2006), the JRC MARS-STAT database (van der Goot and Orlandi, 2003) and the E-OBS gridded dataset (Haylock et al., 2008). However, none of those provide similarly high spatial resolution and/or a complete set of variables to force a hydrologic model. EFAS-Meteo contains daily maps of precipitation, surface temperature (mean, minimum and maximum), wind speed and vapour pressure at a spatial grid resolution of 5 x 5 km for the time period 1 January 1990 - 31 December 2011. It furthermore contains calculated radiation, which is calculated by using a staggered approach depending on the availability of sunshine duration, cloud cover and minimum and maximum temperature, and evapotranspiration (potential evapotranspiration, bare soil and open water evapotranspiration). The potential evapotranspiration was calculated using the Penman-Monteith equation with the above-mentioned meteorological variables. The dataset was created as part of the development of the European Flood Awareness System (EFAS) and has been continuously updated throughout the last years. 
The dataset variables are used as inputs to the hydrological calibration and validation of EFAS as well as for establishing long-term discharge "proxy" climatologies which can then in turn be used for statistical analysis to derive return periods or other time series derivatives. In addition, this dataset will be used to assess climatological trends in Europe. Unfortunately, to date no baseline dataset at the European scale exists to test the quality of the herein presented data. Hence, a comparison against other existing datasets can therefore only be an indication of data quality. Due to availability, a comparison was made for precipitation and temperature only, arguably the most important meteorological drivers for hydrologic models. A variety of analyses was undertaken at country scale against data reported to EUROSTAT and E-OBS datasets. The comparison revealed that while the datasets showed overall similar temporal and spatial patterns, there were some differences in magnitudes especially for precipitation. It is not straightforward to define the specific cause for these differences. However, in most cases the comparatively low observation station density appears to be the principal reason for the differences in magnitude.
A Big Spatial Data Processing Framework Applying to National Geographic Conditions Monitoring

NASA Astrophysics Data System (ADS)

Xiao, F.

2018-04-01

In this paper, a novel framework for spatial data processing is proposed, which applies to the National Geographic Conditions Monitoring project of China. It includes 4 layers: spatial data storage, spatial RDDs, spatial operations, and spatial query language. The spatial data storage layer uses HDFS to store large volumes of spatial vector/raster data in the distributed cluster. The spatial RDDs are the abstract logical datasets of spatial data types, and can be transferred to the Spark cluster to conduct Spark transformations and actions. The spatial operations layer is a series of processing operations on spatial RDDs, such as range query, k nearest neighbor and spatial join. The spatial query language is a user-friendly interface which gives people not familiar with Spark a comfortable way to run the spatial operations. Compared with other spatial frameworks, it stands out for drawing on a comprehensive set of technologies for big spatial data processing. Extensive experiments on real datasets show that the framework achieves better performance than traditional processing methods.
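Two of the spatial operations named in this record, range query and k nearest neighbor, can be sketched in a few lines of plain Python. This is a single-machine illustration over an in-memory list of points, not the framework's Spark-based implementation, and the function names and sample points are hypothetical.

```python
import heapq
import math


def range_query(points, xmin, ymin, xmax, ymax):
    """Return the points inside the axis-aligned rectangle (bounds inclusive)."""
    return [p for p in points if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax]


def k_nearest(points, query, k):
    """Return the k points closest to `query` by Euclidean distance, nearest first."""
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, query))


# Hypothetical point dataset.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (5.0, 5.0)]

in_window = range_query(points, 0.5, 0.5, 2.5, 2.5)
nearest_two = k_nearest(points, (0.9, 0.9), k=2)
```

Both functions scan every point. A distributed framework would typically partition the data spatially so that each query touches only the relevant partitions.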
Sensitivity and specificity considerations for fMRI encoding, decoding, and mapping of auditory cortex at ultra-high field.

PubMed

Moerel, Michelle; De Martino, Federico; Kemper, Valentin G; Schmitter, Sebastian; Vu, An T; Uğurbil, Kâmil; Formisano, Elia; Yacoub, Essa

2018-01-01

Following rapid technological advances, ultra-high field functional MRI (fMRI) enables exploring correlates of neuronal population activity at an increasing spatial resolution. However, as the fMRI blood-oxygenation-level-dependent (BOLD) contrast is a vascular signal, the spatial specificity of fMRI data is ultimately determined by the characteristics of the underlying vasculature. At 7T, fMRI measurement parameters determine the relative contribution of the macro- and microvasculature to the acquired signal. Here we investigate how these parameters affect relevant high-end fMRI analyses such as encoding, decoding, and submillimeter mapping of voxel preferences in the human auditory cortex. Specifically, we compare a T2*-weighted fMRI dataset, obtained with 2D gradient echo (GE) EPI, to a predominantly T2-weighted dataset obtained with 3D GRASE. We first investigated the decoding accuracy based on two encoding models that represented different hypotheses about auditory cortical processing. This encoding/decoding analysis profited from the large spatial coverage and sensitivity of the T2*-weighted acquisitions, as evidenced by a significantly higher prediction accuracy in the GE-EPI dataset compared to the 3D GRASE dataset for both encoding models. The main disadvantage of the T2*-weighted GE-EPI dataset for encoding/decoding analyses was that the prediction accuracy exhibited cortical depth-dependent vascular biases. However, we propose that the comparison of prediction accuracy across the different encoding models may be used as a post-processing technique to salvage the spatial interpretability of the GE-EPI cortical depth-dependent prediction accuracy. Second, we explored the mapping of voxel preferences. Large-scale maps of frequency preference (i.e., tonotopy) were similar across datasets, yet the GE-EPI dataset was preferable due to its larger spatial coverage and sensitivity. 
However, submillimeter tonotopy maps revealed biases in assigned frequency preference and selectivity for the GE-EPI dataset, but not for the 3D GRASE dataset. Thus, a T2-weighted acquisition is recommended if high specificity in tonotopic maps is required. In conclusion, different fMRI acquisitions were better suited for different analyses. It is therefore critical that any sequence parameter optimization considers the eventual intended fMRI analyses and the nature of the neuroscience questions being asked. Copyright © 2017 Elsevier Inc. All rights reserved.

Spatially-explicit estimation of geographical representation in large-scale species distribution datasets.

PubMed

Kalwij, Jesse M; Robertson, Mark P; Ronk, Argo; Zobel, Martin; Pärtel, Meelis

2014-01-01

Much ecological research relies on existing multispecies distribution datasets. Such datasets, however, can vary considerably in quality, extent, resolution or taxonomic coverage. We provide a framework for a spatially-explicit evaluation of geographical representation within large-scale species distribution datasets, using the comparison of an occurrence atlas with a range atlas dataset as a working example. Specifically, we compared occurrence maps for 3773 taxa from the widely-used Atlas Florae Europaeae (AFE) with digitised range maps for 2049 taxa of the lesser-known Atlas of North European Vascular Plants. We calculated the level of agreement at a 50-km spatial resolution using average latitudinal and longitudinal species range, and area of occupancy. Agreement in species distribution was calculated and mapped using Jaccard similarity index and a reduced major axis (RMA) regression analysis of species richness between the entire atlases (5221 taxa in total) and between co-occurring species (601 taxa). We found no difference in distribution ranges or in the area of occupancy frequency distribution, indicating that atlases were sufficiently overlapping for a valid comparison. The similarity index map showed high levels of agreement for central, western, and northern Europe. The RMA regression confirmed that geographical representation of AFE was low in areas with a sparse data recording history (e.g., Russia, Belarus and the Ukraine). For co-occurring species in south-eastern Europe, however, the Atlas of North European Vascular Plants showed remarkably higher richness estimations. Geographical representation of atlas data can be much more heterogeneous than often assumed. Level of agreement between datasets can be used to evaluate geographical representation within datasets. Merging atlases into a single dataset is worthwhile in spite of methodological differences, and helps to fill gaps in our knowledge of species distribution ranges. 
Species distribution dataset mergers, such as the one exemplified here, can serve as a baseline towards comprehensive species distribution datasets.
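The Jaccard similarity index named in the abstract above can be sketched for a single taxon as follows. This is only an illustration, not the authors' workflow: the function name `jaccard`, the cell identifiers, and the convention for two empty ranges are assumptions, and each atlas is reduced to the set of 50-km grid cells in which the taxon is recorded.

```python
def jaccard(cells_a, cells_b):
    """Jaccard similarity |A & B| / |A | B| between two sets of grid cells."""
    a, b = set(cells_a), set(cells_b)
    union = a | b
    if not union:
        return 1.0  # assumed convention: two empty ranges agree fully
    return len(a & b) / len(union)


# Hypothetical grid-cell IDs for one taxon in an occurrence atlas and a range atlas.
occurrence_atlas = {"c01", "c02", "c03", "c04"}
range_atlas = {"c03", "c04", "c05"}

score = jaccard(occurrence_atlas, range_atlas)  # 2 shared cells out of 5 in total
print(round(score, 2))
```

Computing this score per grid cell across all shared taxa, rather than per taxon across cells, would give a mappable agreement surface of the kind the abstract describes.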
An Evaluation of Database Solutions to Spatial Object Association

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kumar, V S; Kurc, T; Saltz, J

2008-06-24

Object association is a common problem encountered in many applications. Spatial object association, also referred to as crossmatch of spatial datasets, is the problem of identifying and comparing objects in two datasets based on their positions in a common spatial coordinate system. One of the datasets may correspond to a catalog of objects observed over time in a multi-dimensional domain; the other dataset may consist of objects observed in a snapshot of the domain at a time point. The use of database management systems to solve the object association problem provides portability across different platforms and also greater flexibility. Increasing dataset sizes in today's applications, however, have made object association a data/compute-intensive problem that requires targeted optimizations for efficient execution. In this work, we investigate how database-based crossmatch algorithms can be deployed on different database system architectures and evaluate the deployments to understand the impact of architectural choices on crossmatch performance and associated trade-offs. We investigate the execution of two crossmatch algorithms on (1) a parallel database system with active disk style processing capabilities, (2) a high-throughput network database (MySQL Cluster), and (3) shared-nothing databases with replication. We have conducted our study in the context of a large-scale astronomy application with real use-case scenarios.
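To make the crossmatch problem concrete, here is a minimal in-memory sketch. It is not one of the database-based algorithms evaluated in the record above: it assumes flat 2D Cartesian coordinates (an astronomy catalog would use spherical coordinates), and the function name `crossmatch` and the sample points are hypothetical. Catalog objects are bucketed into a uniform grid so that each snapshot object is compared only with nearby candidates.

```python
import math
from collections import defaultdict


def crossmatch(catalog, snapshot, radius):
    """Return sorted (snapshot_index, catalog_index) pairs within `radius`.

    Catalog points are bucketed into square cells of side `radius`, so any
    match for a snapshot point lies in the 3x3 block of cells around it.
    """
    cells = defaultdict(list)
    for j, (x, y) in enumerate(catalog):
        cells[(math.floor(x / radius), math.floor(y / radius))].append(j)

    matches = []
    for i, (x, y) in enumerate(snapshot):
        cx, cy = math.floor(x / radius), math.floor(y / radius)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in cells.get((cx + dx, cy + dy), ()):
                    qx, qy = catalog[j]
                    if math.hypot(x - qx, y - qy) <= radius:
                        matches.append((i, j))
    return sorted(matches)


# Hypothetical positions: a long-lived catalog and a single-epoch snapshot.
catalog = [(0.0, 0.0), (5.0, 5.0), (9.9, 0.1)]
snapshot = [(0.3, 0.4), (5.0, 5.9), (20.0, 20.0)]
pairs = crossmatch(catalog, snapshot, radius=1.0)
print(pairs)
```

The grid plays the role that a spatial index or partitioning scheme would play inside a database; the trade-offs studied in the record concern where that work runs, not the matching rule itself.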
Emerging Technologies for Assessing Physical Activity Behaviors in Space and Time

PubMed Central

Hurvitz, Philip M.; Moudon, Anne Vernez; Kang, Bumjoon; Saelens, Brian E.; Duncan, Glen E.

2014-01-01

Precise measurement of physical activity is important for health research, providing a better understanding of activity location, type, duration, and intensity. This article describes a novel suite of tools to measure and analyze physical activity behaviors in spatial epidemiology research. We use individual-level, high-resolution, objective data collected in a space-time framework to investigate built and social environment influences on activity. First, we collect data with accelerometers, global positioning system units, and smartphone-based digital travel and photo diaries to overcome many limitations inherent in self-reported data. Behaviors are measured continuously over the full spectrum of environmental exposures in daily life, instead of focusing exclusively on the home neighborhood. Second, data streams are integrated using common timestamps into a single data structure, the “LifeLog.” A graphic interface tool, “LifeLog View,” enables simultaneous visualization of all LifeLog data streams. Finally, we use geographic information system SmartMap rasters to measure spatially continuous environmental variables to capture exposures at the same spatial and temporal scale as in the LifeLog. These technologies enable precise measurement of behaviors in their spatial and temporal settings but also generate very large datasets; we discuss current limitations and promising methods for processing and analyzing such large datasets. Finally, we provide applications of these methods in spatially oriented research, including a natural experiment to evaluate the effects of new transportation infrastructure on activity levels, and a study of neighborhood environmental effects on activity using twins as quasi-causal controls to overcome self-selection and reverse causation problems. In summary, the integrative characteristics of large datasets contained in LifeLogs and SmartMaps hold great promise for advancing spatial epidemiologic research to promote healthy behaviors. 
PMID:24479113
GeoPAT: A toolbox for pattern-based information retrieval from large geospatial databases

NASA Astrophysics Data System (ADS)

Jasiewicz, Jarosław; Netzel, Paweł; Stepinski, Tomasz

2015-07-01

Geospatial Pattern Analysis Toolbox (GeoPAT) is a collection of GRASS GIS modules for carrying out pattern-based geospatial analysis of images and other spatial datasets. The need for pattern-based analysis arises when images/rasters contain rich spatial information either because of their very high resolution or their very large spatial extent. Elementary units of pattern-based analysis are scenes - patches of surface consisting of a complex arrangement of individual pixels (patterns). GeoPAT modules implement popular GIS algorithms, such as query, overlay, and segmentation, to operate on the grid of scenes. To achieve these capabilities GeoPAT includes a library of scene signatures - compact numerical descriptors of patterns, and a library of distance functions - providing numerical means of assessing dissimilarity between scenes. Ancillary GeoPAT modules use these functions to construct a grid of scenes or to assign signatures to individual scenes having regular or irregular geometries. Thus GeoPAT combines knowledge retrieval from patterns with mapping tasks within a single integrated GIS environment. GeoPAT is designed to identify and analyze complex, highly generalized classes in spatial datasets. Examples include distinguishing between different styles of urban settlements using VHR images, delineating different landscape types in land cover maps, and mapping physiographic units from DEM. The concept of pattern-based spatial analysis is explained and the roles of all modules and functions are described. A case study example pertaining to delineation of landscape types in a subregion of NLCD is given. Performance evaluation is included to highlight GeoPAT's applicability to very large datasets. The GeoPAT toolbox is available for download from
Unleashing spatially distributed ecohydrology modeling using Big Data tools

NASA Astrophysics Data System (ADS)

Miles, B.; Idaszak, R.

2015-12-01

Physically based spatially distributed ecohydrology models are useful for answering science and management questions related to the hydrology and biogeochemistry of prairie, savanna, forested, as well as urbanized ecosystems. However, these models can produce hundreds of gigabytes of spatial output for a single model run over decadal time scales when run at regional spatial scales and moderate spatial resolutions (~100-km2+ at 30-m spatial resolution) or when run for small watersheds at high spatial resolutions (~1-km2 at 3-m spatial resolution). Numerical data formats such as HDF5 can store arbitrarily large datasets. However, even in HPC environments, there are practical limits on the size of single files that can be stored and reliably backed up. Even when such large datasets can be stored, querying and analyzing these data can suffer from poor performance due to memory limitations and I/O bottlenecks, for example on single workstations where memory and bandwidth are limited, or in HPC environments where data are stored separately from computational nodes. The difficulty of storing and analyzing spatial data from ecohydrology models limits our ability to harness these powerful tools. Big Data tools such as distributed databases have the potential to surmount the data storage and analysis challenges inherent to large spatial datasets. Distributed databases solve these problems by storing data close to computational nodes while enabling horizontal scalability and fault tolerance. Here we present the architecture of and preliminary results from PatchDB, a distributed datastore for managing spatial output from the Regional Hydro-Ecological Simulation System (RHESSys). The initial version of PatchDB uses message queueing to asynchronously write RHESSys model output to an Apache Cassandra cluster. Once stored in the cluster, these data can be efficiently queried to quickly produce both spatial visualizations for a particular variable (e.g. maps and animations), as well as point time series of arbitrary variables at arbitrary points in space within a watershed or river basin. By treating ecohydrology modeling as a Big Data problem, we hope to provide a platform for answering transformative science and management questions related to water quantity and quality in a world of non-stationary climate.
Challenges in Extracting Information From Large Hydrogeophysical-monitoring Datasets

NASA Astrophysics Data System (ADS)

Day-Lewis, F. D.; Slater, L. D.; Johnson, T.

2012-12-01

Over the last decade, new automated geophysical data-acquisition systems have enabled collection of increasingly large and information-rich geophysical datasets. Concurrent advances in field instrumentation, web services, and high-performance computing have made real-time processing, inversion, and visualization of large three-dimensional tomographic datasets practical. Geophysical-monitoring datasets have provided high-resolution insights into diverse hydrologic processes including groundwater/surface-water exchange, infiltration, solute transport, and bioremediation. Despite the high information content of such datasets, extraction of quantitative or diagnostic hydrologic information is challenging. Visual inspection and interpretation for specific hydrologic processes is difficult for datasets that are large, complex, and (or) affected by forcings (e.g., seasonal variations) unrelated to the target hydrologic process. New strategies are needed to identify salient features in spatially distributed time-series data and to relate temporal changes in geophysical properties to hydrologic processes of interest while effectively filtering unrelated changes. Here, we review recent work using time-series and digital-signal-processing approaches in hydrogeophysics. Examples include applications of cross-correlation, spectral, and time-frequency (e.g., wavelet and Stockwell transforms) approaches to (1) identify salient features in large geophysical time series; (2) examine correlation or coherence between geophysical and hydrologic signals, even in the presence of non-stationarity; and (3) condense large datasets while preserving information of interest. Examples demonstrate analysis of large time-lapse electrical tomography and fiber-optic temperature datasets to extract information about groundwater/surface-water exchange and contaminant transport.
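As one illustration of the cross-correlation approach mentioned in the record above, the sketch below searches for the time shift at which two equally long series are most strongly correlated. The function name `best_lag`, the synthetic series, and the 24-sample period are assumptions made for the example; they do not come from the reviewed studies.

```python
import math


def best_lag(x, y, max_lag):
    """Return (lag, r) for the shift of y relative to x, in samples, with the
    highest Pearson correlation. A positive lag means y follows x."""
    def pearson(a, b):
        n = len(a)
        ma, mb = sum(a) / n, sum(b) / n
        cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
        va = sum((p - ma) ** 2 for p in a)
        vb = sum((q - mb) ** 2 for q in b)
        return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

    best = (0, -2.0)  # any real correlation exceeds -2
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:len(x) - lag], y[lag:]   # pairs x[t] with y[t + lag]
        else:
            a, b = x[-lag:], y[:len(y) + lag]  # pairs x[t] with y[t + lag], lag < 0
        r = pearson(a, b)
        if r > best[1]:
            best = (lag, r)
    return best


# Synthetic example: a "hydrologic" signal and a "geophysical" response delayed by 3 samples.
stage = [math.sin(2 * math.pi * t / 24) for t in range(96)]
response = [math.sin(2 * math.pi * (t - 3) / 24) for t in range(96)]
lag, r = best_lag(stage, response, max_lag=12)
print(lag, round(r, 3))
```

For periodic signals, `max_lag` should stay below one period; otherwise several shifts correlate equally well and the first one found is returned.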
Efficient Lane Boundary Detection with Spatial-Temporal Knowledge Filtering

PubMed Central

Nan, Zhixiong; Wei, Ping; Xu, Linhai; Zheng, Nanning

2016-01-01

Lane boundary detection technology has progressed rapidly over the past few decades. However, many challenges that often lead to lane detection unavailability remain to be solved. In this paper, we propose a spatial-temporal knowledge filtering model to detect lane boundaries in videos. To address the challenges of structure variation, large noise and complex illumination, this model incorporates prior spatial-temporal knowledge with lane appearance features to jointly identify lane boundaries. The model first extracts line segments in video frames. Two novel filters, the Crossing Point Filter (CPF) and the Structure Triangle Filter (STF), are proposed to filter out the noisy line segments. The two filters introduce spatial structure constraints and temporal location constraints into lane detection, which represent the spatial-temporal knowledge about lanes. A straight line or curve model determined by a state machine is used to fit the line segments to finally output the lane boundaries. We collected a challenging realistic traffic scene dataset. The experimental results on this dataset and other standard datasets demonstrate the strength of our method. The proposed method has been successfully applied to our autonomous experimental vehicle. PMID:27529248
Merging Station Observations with Large-Scale Gridded Data to Improve Hydrological Predictions over Chile

NASA Astrophysics Data System (ADS)

Peng, L.; Sheffield, J.; Verbist, K. M. J.

2016-12-01

Hydrological predictions at regional-to-global scales are often hampered by the lack of meteorological forcing data. Large-scale gridded meteorological data can overcome this limitation, but these data are subject to regional biases and unrealistic values at local scale. This is especially challenging in regions such as Chile, where climate exhibits high spatial heterogeneity as a result of long latitude span and dramatic elevation changes. However, regional station-based observational datasets are not fully exploited and have the potential to constrain biases and spatial patterns. This study aims at adjusting precipitation and temperature estimates from the Princeton University global meteorological forcing (PGF) gridded dataset to improve hydrological simulations over Chile, by assimilating 982 gauges from the Dirección General de Aguas (DGA). To merge station data with the gridded dataset, we use a state-space estimation method to produce optimal gridded estimates, considering both the error of the station measurements and the gridded PGF product. The PGF daily precipitation, maximum and minimum temperature at 0.25° spatial resolution are adjusted for the period of 1979-2010. Precipitation and temperature gauges with long and continuous records (>70% temporal coverage) are selected, while the remaining stations are used for validation. The leave-one-out cross validation verifies the robustness of this data assimilation approach. The merged dataset is then used to force the Variable Infiltration Capacity (VIC) hydrological model over Chile at a daily time step, and the simulations are compared to streamflow observations. Our initial results show that the station-merged PGF precipitation effectively captures drizzle and the spatial pattern of storms. Overall the merged dataset has significant improvements compared to the original PGF with reduced biases and stronger inter-annual variability. 
The invariant spatial pattern of errors between the station data and the gridded product opens up the possibility of merging real-time satellite and intermittent gauge observations to produce more accurate real-time hydrological predictions.
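The abstract above does not spell out its state-space estimator, so the following is only a sketch of the general idea of weighting two sources by their error variances. It uses the measurement-update step of a scalar Kalman filter as an assumed stand-in; the function name `merge_estimates` and all numbers are hypothetical.

```python
def merge_estimates(grid_value, grid_var, gauge_value, gauge_var):
    """Combine a gridded value and a gauge reading, weighting by error variance.

    Returns the merged value and its reduced error variance.
    """
    gain = grid_var / (grid_var + gauge_var)  # weight given to the gauge
    merged = grid_value + gain * (gauge_value - grid_value)
    merged_var = (1.0 - gain) * grid_var
    return merged, merged_var


# Hypothetical daily precipitation (mm) with error variances (mm^2).
merged, merged_var = merge_estimates(grid_value=12.0, grid_var=9.0,
                                     gauge_value=8.0, gauge_var=3.0)
print(merged, merged_var)  # prints 9.0 2.25
```

Because the gauge is assumed three times more precise than the grid cell here, the merged value moves three quarters of the way toward the gauge, and the merged variance is smaller than either input variance.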
Statistical analysis of mesoscale rainfall: Dependence of a random cascade generator on large-scale forcing

NASA Technical Reports Server (NTRS)

Over, Thomas, M.; Gupta, Vijay K.

1994-01-01

Under the theory of independent and identically distributed random cascades, the probability distribution of the cascade generator determines the spatial and the ensemble properties of spatial rainfall. Three sets of radar-derived rainfall data in space and time are analyzed to estimate the probability distribution of the generator. A detailed comparison between instantaneous scans of spatial rainfall and simulated cascades using the scaling properties of the marginal moments is carried out. This comparison highlights important similarities and differences between the data and the random cascade theory. Differences are quantified and measured for the three datasets. Evidence is presented to show that the scaling properties of the rainfall can be captured to the first order by a random cascade with a single parameter. The dependence of this parameter on forcing by the large-scale meteorological conditions, as measured by the large-scale spatial average rain rate, is investigated for these three datasets. The data show that this dependence can be captured by a one-to-one function. Since the large-scale average rain rate can be diagnosed from the large-scale dynamics, this relationship demonstrates an important linkage between the large-scale atmospheric dynamics and the statistical cascade theory of mesoscale rainfall. Potential application of this research to parameterization of runoff from the land surface and regional flood frequency analysis is briefly discussed, and open problems for further research are presented.
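A single-parameter random cascade of the kind discussed above can be simulated in a few lines. The abstract does not name its generator, so this sketch assumes a "beta-model" generator, in which each weight is either zero or a fixed positive value chosen so that the mean weight is 1. The function name `beta_cascade`, the one-dimensional domain, and the parameter values are illustrative assumptions.

```python
import random


def beta_cascade(levels, beta, branching=2, seed=0):
    """Simulate a discrete multiplicative cascade on the unit interval.

    Each cell splits into `branching` subcells. Every subcell's density is the
    parent's density times an independent weight W, where
    W = branching**beta with probability branching**(-beta), else W = 0,
    so that E[W] = 1 and the mean density is preserved on average.
    """
    rng = random.Random(seed)
    p_wet = branching ** (-beta)
    w_wet = branching ** beta
    field = [1.0]  # mean density at the coarsest scale
    for _ in range(levels):
        field = [v * (w_wet if rng.random() < p_wet else 0.0)
                 for v in field for k in range(branching)]
    return field


# Six cascade levels give 2**6 = 64 cells; larger beta gives a sparser, more intermittent field.
field = beta_cascade(levels=6, beta=0.3)
wet_fraction = sum(1 for v in field if v > 0) / len(field)
print(len(field), wet_fraction)
```

In this assumed model, relating `beta` to a large-scale average rain rate is the kind of dependence on large-scale forcing the abstract investigates.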
Multiresolution comparison of precipitation datasets for large-scale models

NASA Astrophysics Data System (ADS)

Chun, K. P.; Sapriza Azuri, G.; Davison, B.; DeBeer, C. M.; Wheater, H. S.

2014-12-01

Gridded precipitation datasets are crucial for driving large-scale models which are related to weather forecast and climate research. However, the quality of precipitation products is usually validated individually. Comparisons between gridded precipitation products along with ground observations provide another avenue for investigating how the precipitation uncertainty would affect the performance of large-scale models. In this study, using data from a set of precipitation gauges over British Columbia and Alberta, we evaluate several widely used North America gridded products including the Canadian Gridded Precipitation Anomalies (CANGRD), the National Center for Environmental Prediction (NCEP) reanalysis, the Water and Global Change (WATCH) project, the thin plate spline smoothing algorithms (ANUSPLIN) and Canadian Precipitation Analysis (CaPA). Based on verification criteria for various temporal and spatial scales, results provide an assessment of possible applications for various precipitation datasets. For long-term climate variation studies (~100 years), CANGRD, NCEP, WATCH and ANUSPLIN have different comparative advantages in terms of their resolution and accuracy. For synoptic and mesoscale precipitation patterns, CaPA provides appealing performance of spatial coherence. In addition to the products comparison, various downscaling methods are also surveyed to explore new verification and bias-reduction methods for improving gridded precipitation outputs for large-scale models.
Global patterns of current and future road infrastructure

NASA Astrophysics Data System (ADS)

Meijer, Johan R.; Huijbregts, Mark A. J.; Schotten, Kees C. G. J.; Schipper, Aafke M.

2018-06-01

Georeferenced information on road infrastructure is essential for spatial planning, socio-economic assessments and environmental impact analyses. Yet current global road maps are typically outdated or characterized by spatial bias in coverage. In the Global Roads Inventory Project we gathered, harmonized and integrated nearly 60 geospatial datasets on road infrastructure into a global roads dataset. The resulting dataset covers 222 countries and includes over 21 million km of roads, which is two to three times the total length in the currently best available country-based global roads datasets. We then related total road length per country to country area, population density, GDP and OECD membership, resulting in a regression model with adjusted R² of 0.90, and found that the highest road densities are associated with densely populated and wealthier countries. Applying our regression model to future population densities and GDP estimates from the Shared Socioeconomic Pathway (SSP) scenarios, we obtained a tentative estimate of 3.0–4.7 million km additional road length for the year 2050. Large increases in road length were projected for developing nations in some of the world’s last remaining wilderness areas, such as the Amazon, the Congo basin and New Guinea. This highlights the need for accurate spatial road datasets to underpin strategic spatial planning in order to reduce the impacts of roads in remaining pristine ecosystems.
Optimizing tertiary storage organization and access for spatio-temporal datasets

NASA Technical Reports Server (NTRS)

Chen, Ling Tony; Rotem, Doron; Shoshani, Arie; Drach, Bob; Louis, Steve; Keating, Meridith

1994-01-01

We address in this paper data management techniques for efficiently retrieving requested subsets of large datasets stored on mass storage devices. This problem represents a major bottleneck that can negate the benefits of fast networks, because the time to access a subset from a large dataset stored on a mass storage system is much greater than the time to transmit that subset over a network. This paper focuses on very large spatial and temporal datasets generated by simulation programs in the area of climate modeling, but the techniques developed can be applied to other applications that deal with large multidimensional datasets. The main requirement we have addressed is the efficient access of subsets of information contained within much larger datasets, for the purpose of analysis and interactive visualization. We have developed data partitioning techniques that partition datasets into 'clusters' based on analysis of data access patterns and storage device characteristics. The goal is to minimize the number of clusters read from mass storage systems when subsets are requested. We emphasize in this paper proposed enhancements to current storage server protocols to permit control over physical placement of data on storage devices. We also discuss in some detail the aspects of the interface between the application programs and the mass storage system, as well as a workbench to help scientists to design the best reorganization of a dataset for anticipated access patterns.
TECA: A Parallel Toolkit for Extreme Climate Analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Prabhat, Mr; Ruebel, Oliver; Byna, Surendra

2012-03-12

We present TECA, a parallel toolkit for detecting extreme events in large climate datasets. Modern climate datasets expose parallelism across a number of dimensions: spatial locations, timesteps and ensemble members. We design TECA to exploit these modes of parallelism and demonstrate a prototype implementation for detecting and tracking three classes of extreme events: tropical cyclones, extra-tropical cyclones and atmospheric rivers. We process a modern TB-sized CAM5 simulation dataset with TECA, and demonstrate good runtime performance for the three case studies.
Statistical Inference and Spatial Patterns in Correlates of IQ

ERIC Educational Resources Information Center

Hassall, Christopher; Sherratt, Thomas N.

2011-01-01

Cross-national comparisons of IQ have become common since the release of a large dataset of international IQ scores. However, these studies have consistently failed to consider the potential lack of independence of these scores based on spatial proximity. To demonstrate the importance of this omission, we present a re-evaluation of several…
OpenMSI Arrayed Analysis Tools v2.0

DOE Office of Scientific and Technical Information (OSTI.GOV)

BOWEN, BENJAMIN; RUEBEL, OLIVER; DE ROND, TRISTAN

2017-02-07

Mass spectrometry imaging (MSI) enables high-resolution spatial mapping of biomolecules in samples and is a valuable tool for the analysis of tissues from plants and animals, microbial interactions, high-throughput screening, drug metabolism, and a host of other applications. This is accomplished by desorbing molecules from the surface on spatially defined locations, using a laser or ion beam. These ions are analyzed by mass spectrometry and collected into an MSI 'image', a dataset containing unique mass spectra from the sampled spatial locations. MSI is used in a diverse and increasing number of biological applications. The OpenMSI Arrayed Analysis Tool (OMAAT) is a new software method that addresses the challenges of analyzing spatially defined samples in large MSI datasets, by providing support for automatic sample position optimization and ion selection.
Phylogenetic congruence of lichenised fungi and algae is affected by spatial scale and taxonomic diversity.

PubMed

Buckley, Hannah L; Rafat, Arash; Ridden, Johnathon D; Cruickshank, Robert H; Ridgway, Hayley J; Paterson, Adrian M

2014-01-01

The role of species' interactions in structuring biological communities remains unclear. Mutualistic symbioses, involving close positive interactions between two distinct organismal lineages, provide an excellent means to explore the roles of both evolutionary and ecological processes in determining how positive interactions affect community structure. In this study, we investigate patterns of co-diversification between fungi and algae for a range of New Zealand lichens at the community, genus, and species levels and explore explanations for possible patterns related to spatial scale and pattern, taxonomic diversity of the lichens considered, and the level of sampling replication. We assembled six independent datasets to compare patterns in phylogenetic congruence with varied spatial extent of sampling, taxonomic diversity and level of specimen replication. For each dataset, we used the DNA sequences from the ITS regions of both the fungal and algal genomes from lichen specimens to produce genetic distance matrices. Phylogenetic congruence between fungi and algae was quantified using distance-based redundancy analysis and we used geographic distance matrices in Moran's eigenvector mapping and variance partitioning to evaluate the effects of spatial variation on the quantification of phylogenetic congruence. Phylogenetic congruence was highly significant for all datasets and a large proportion of variance in both algal and fungal genetic distances was explained by partner genetic variation. Spatial variables, primarily at large and intermediate scales, were also important for explaining genetic diversity patterns in all datasets. Interestingly, spatial structuring was stronger for fungal than algal genetic variation. As the spatial extent of the samples increased, so too did the proportion of explained variation that was shared between the spatial variables and the partners' genetic variation. 
Different lichen taxa showed some variation in their phylogenetic congruence and spatial genetic patterns and where greater sample replication was used, the amount of variation explained by partner genetic variation increased. Our results suggest that the phylogenetic congruence pattern, at least at small spatial scales, is likely due to reciprocal co-adaptation or co-dispersal. However, the detection of these patterns varies among different lichen taxa, across spatial scales and with different levels of sample replication. This work provides insight into the complexities faced in determining how evolutionary and ecological processes may interact to generate diversity in symbiotic association patterns at the population and community levels. Further, it highlights the critical importance of considering sample replication, taxonomic diversity and spatial scale in designing studies of co-diversification.
A geospatial database model for the management of remote sensing datasets at multiple spectral, spatial, and temporal scales

NASA Astrophysics Data System (ADS)

Ifimov, Gabriela; Pigeau, Grace; Arroyo-Mora, J. Pablo; Soffer, Raymond; Leblanc, George

2017-10-01

In this study the development and implementation of a geospatial database model for the management of multiscale datasets encompassing airborne imagery and associated metadata is presented. To develop the multi-source geospatial database we have used a Relational Database Management System (RDBMS) on a Structured Query Language (SQL) server which was then integrated into ArcGIS and implemented as a geodatabase. The acquired datasets were compiled, standardized, and integrated into the RDBMS, where logical associations between different types of information were linked (e.g. location, date, and instrument). Airborne data, at different processing levels (digital numbers through geocorrected reflectance), were implemented in the geospatial database where the datasets are linked spatially and temporally. An example dataset consisting of airborne hyperspectral imagery, collected for inter and intra-annual vegetation characterization and detection of potential hydrocarbon seepage events over pipeline areas, is presented. Our work provides a model for the management of airborne imagery, which is a challenging aspect of data management in remote sensing, especially when large volumes of data are collected.
GPU Accelerated Clustering for Arbitrary Shapes in Geoscience Data

NASA Astrophysics Data System (ADS)

Pankratius, V.; Gowanlock, M.; Rude, C. M.; Li, J. D.

2016-12-01

Clustering algorithms have become a vital component in intelligent systems for geoscience that help scientists discover and track phenomena of various kinds. Here, we outline advances in Density-Based Spatial Clustering of Applications with Noise (DBSCAN), which detects clusters of arbitrary shape that are common in geospatial data. In particular, we propose a hybrid CPU-GPU implementation of DBSCAN and highlight new optimization approaches on the GPU that allow cluster detection in parallel while optimizing data transport during CPU-GPU interactions. We employ an efficient batching scheme between the host and GPU such that limited GPU memory is not prohibitive when processing large and/or dense datasets. To minimize data transfer overhead, we estimate the total workload size and employ an execution that generates optimized batches that will not overflow the GPU buffer. This work is demonstrated on space weather Total Electron Content (TEC) datasets containing over 5 million measurements from instruments worldwide, and allows scientists to spot spatially coherent phenomena with ease. Our approach is up to 30 times faster than a sequential implementation and therefore accelerates discoveries in large datasets. We acknowledge support from NSF ACI-1442997.
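For readers unfamiliar with DBSCAN, a plain sequential version is sketched below. It is a reference illustration only, not the hybrid CPU-GPU implementation described above: neighbor search is brute force, and the function name `dbscan`, the sample points, and the parameter values are assumptions. It requires Python 3.8 or later for `math.dist`.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    def neighbors(i):
        # Brute-force eps-neighborhood; includes the point itself.
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border
            if labels[j] is not None:
                continue             # already assigned; do not expand again
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(nbrs)
    return labels


# Two dense groups and one isolated point (hypothetical coordinates).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

The brute-force neighborhood query makes this quadratic in the number of points, which is the cost that indexing, batching, and GPU parallelism aim to reduce on datasets with millions of measurements.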
A Framework for Spatial Interaction Analysis Based on Large-Scale Mobile Phone Data

PubMed Central

Li, Weifeng; Cheng, Xiaoyun; Guo, Gaohua

2014-01-01

An overall understanding of spatial interaction and exact knowledge of its dynamic evolution are required in urban planning and transportation planning. This study aimed to analyze spatial interaction based on large-scale mobile phone data. The newly arisen mass dataset required a new methodology compatible with its peculiar characteristics. A three-stage framework was proposed in this paper, including data preprocessing, critical activity identification, and spatial interaction measurement. The proposed framework introduced frequent pattern mining and measured the spatial interaction by the obtained associations. A case study of three communities in Shanghai was carried out as verification of the proposed method and demonstration of its practical application. The spatial interaction patterns and the representative features proved the rationality of the proposed framework. PMID:25435865
Support vector machine in crash prediction at the level of traffic analysis zones: Assessing the spatial proximity effects.

PubMed

Dong, Ni; Huang, Helai; Zheng, Liang

2015-09-01

In zone-level crash prediction, accounting for spatial dependence has become an extensively studied topic. This study proposes a Support Vector Machine (SVM) model to address complex, large and multi-dimensional spatial data in crash prediction. Correlation-based Feature Selector (CFS) was applied to evaluate candidate factors possibly related to zonal crash frequency in handling high-dimension spatial data. To demonstrate the proposed approaches and to compare them with the Bayesian spatial model with conditional autoregressive prior (i.e., CAR), a dataset in Hillsborough county of Florida was employed. The results showed that SVM models accounting for spatial proximity outperform the non-spatial model in terms of model fitting and predictive performance, which indicates the reasonableness of considering cross-zonal spatial correlations. The best model predictive capability, relatively, is associated with the model considering proximity of the centroid distance by choosing the RBF kernel and setting 10% of the whole dataset as the testing data, which further exhibits SVM models' capacity for addressing comparatively complex spatial data in regional crash prediction modeling. Moreover, SVM models exhibit better goodness-of-fit compared with CAR models when utilizing the whole dataset as the samples. A sensitivity analysis of the centroid-distance-based spatial SVM models was conducted to capture the impacts of explanatory variables on the mean predicted probabilities for crash occurrence. The results conform to the coefficient estimation in the CAR models, which supports the employment of the SVM model as an alternative in regional safety modeling. Copyright © 2015 Elsevier Ltd. All rights reserved.

Effect of Variable Spatial Scales on USLE-GIS Computations

NASA Astrophysics Data System (ADS)

Patil, R. J.; Sharma, S. K.

2017-12-01

Use of an appropriate spatial scale is very important in Universal Soil Loss Equation (USLE) based spatially distributed soil erosion modelling. This study aimed at assessment of annual rates of soil erosion at different spatial scales/grid sizes and analysing how changes in spatial scales affect USLE-GIS computations using simulation and statistical variabilities. Efforts have been made in this study to recommend an optimum spatial scale for further USLE-GIS computations for management and planning in the study area. The present research study was conducted in the Shakkar River watershed, situated in Narsinghpur and Chhindwara districts of Madhya Pradesh, India. Remote Sensing and GIS techniques were integrated with the USLE to predict the spatial distribution of soil erosion in the study area at four different spatial scales, viz. 30 m, 50 m, 100 m, and 200 m. Rainfall data, a soil map, a digital elevation model (DEM), an executable C++ program, and a satellite image of the area were used for preparation of the thematic maps for various USLE factors. Annual rates of soil erosion were estimated for 15 years (1992 to 2006) at four different grid sizes. The statistical analysis of the four estimated datasets showed that the sediment loss dataset at 30 m spatial scale has the minimum standard deviation (2.16), variance (4.68), and percent deviation from observed values (2.68 - 18.91 %), and the highest coefficient of determination (R² = 0.874) among all four datasets. Thus, it is recommended to adopt this spatial scale for USLE-GIS computations in the study area due to its minimum statistical variability and better agreement with the observed sediment loss data. This study also indicates large scope for use of finer spatial scales in spatially distributed soil erosion modelling.
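For readers unfamiliar with the USLE, the per-cell computation behind the abstract above is the product A = R × K × LS × C × P (rainfall erosivity, soil erodibility, slope length and steepness, cover management, and support practice). The sketch below applies it to a list of grid cells and computes the kind of summary statistics the abstract reports. All factor values are hypothetical placeholders, not data from the study, and the function names are invented for illustration.

```python
from statistics import mean, pstdev

def usle_soil_loss(R, K, LS, C, P):
    """USLE soil loss A = R * K * LS * C * P.
    Units of A follow from the units chosen for R and K."""
    return R * K * LS * C * P

def summarize(cells):
    """Apply the USLE to each grid cell (a dict of factor values) and
    return the mean and population standard deviation of the losses."""
    losses = [usle_soil_loss(**cell) for cell in cells]
    return {"mean": mean(losses), "std": pstdev(losses)}

def percent_deviation(estimated, observed):
    """Absolute deviation of an estimate from an observation, in percent."""
    return 100.0 * abs(estimated - observed) / observed

# Hypothetical factor values for two cells that differ only in P.
cells = [
    {"R": 500.0, "K": 0.3, "LS": 1.2, "C": 0.2, "P": 1.0},
    {"R": 500.0, "K": 0.3, "LS": 1.2, "C": 0.2, "P": 0.5},
]
summary = summarize(cells)
```

Repeating the same summary on factor maps resampled to each grid size would give one set of statistics per spatial scale, which is the comparison the abstract describes.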
Spatially Explicit Models to Investigate Geographic Patterns in the Distribution of Forensic STRs: Application to the North-Eastern Mediterranean.

PubMed

Messina, Francesco; Finocchio, Andrea; Akar, Nejat; Loutradis, Aphrodite; Michalodimitrakis, Emmanuel I; Brdicka, Radim; Jodice, Carla; Novelletto, Andrea

2016-01-01

Human forensic STRs used for individual identification have been reported to have little power for inter-population analyses. Several methods have been developed which incorporate information on the spatial distribution of individuals to arrive at a description of the arrangement of diversity. We genotyped at 16 forensic STRs a large population sample obtained from many locations in Italy, Greece and Turkey, i.e. three countries crucial to the understanding of discontinuities at the European/Asian junction and the genetic legacy of ancient migrations, but seldom represented together in previous studies. Using spatial PCA on the full dataset, we detected patterns of population affinities in the area. Additionally, we devised objective criteria to reduce the overall complexity into reduced datasets. Independent spatially explicit methods applied to these latter datasets converged in showing that the extraction of information on long- to medium-range geographical trends and structuring from the overall diversity is possible. All analyses returned the picture of a background clinal variation, with regional discontinuities captured by each of the reduced datasets. Several aspects of our results are confirmed on external STR datasets and replicate those of genome-wide SNP typings. High levels of gene flow were inferred within the main continental areas by coalescent simulations. These results are promising from a microevolutionary perspective, in view of the fast pace at which forensic data are being accumulated for many locales. It is foreseeable that this will allow the exploitation of an invaluable genotypic resource, assembled for other (forensic) purposes, to clarify important aspects in the formation of local gene pools.
Spatial Structure of Above-Ground Biomass Limits Accuracy of Carbon Mapping in Rainforest but Large Scale Forest Inventories Can Help to Overcome.

PubMed

Guitet, Stéphane; Hérault, Bruno; Molto, Quentin; Brunaux, Olivier; Couteron, Pierre

2015-01-01

Precise mapping of above-ground biomass (AGB) is a major challenge for the success of REDD+ processes in tropical rainforest. The usual mapping methods are based on two hypotheses: a large and long-ranged spatial autocorrelation and a strong environment influence at the regional scale. However, there are no studies of the spatial structure of AGB at the landscape scale to support these assumptions. We studied spatial variation in AGB at various scales using two large forest inventories conducted in French Guiana. The dataset comprised 2507 plots (0.4 to 0.5 ha) of undisturbed rainforest distributed over the whole region. After checking the uncertainties of estimates obtained from these data, we used half of the dataset to develop explicit predictive models including spatial and environmental effects and tested the accuracy of the resulting maps according to their resolution using the rest of the data. Forest inventories provided accurate AGB estimates at the plot scale, for a mean of 325 Mg ha⁻¹. They revealed high local variability combined with a weak autocorrelation up to distances of no more than 10 km. Environmental variables accounted for a minor part of spatial variation. Accuracy of the best model including spatial effects was 90 Mg ha⁻¹ at plot scale but coarse graining up to 2-km resolution allowed mapping AGB with accuracy lower than 50 Mg ha⁻¹. No agreement was found with available pan-tropical reference maps at any resolution. We concluded that the combined weak autocorrelation and weak environmental effect limit AGB map accuracy in rainforest, and that a trade-off has to be found between spatial resolution and effective accuracy until adequate "wall-to-wall" remote sensing signals provide reliable AGB predictions. In the meantime, using large forest inventories with low sampling rate (<0.5%) may be an efficient way to increase the global coverage of AGB maps with acceptable accuracy at kilometric resolution.
Building a multi-scaled geospatial temporal ecology database from disparate data sources: fostering open science and data reuse.

PubMed

Soranno, Patricia A; Bissell, Edward G; Cheruvelil, Kendra S; Christel, Samuel T; Collins, Sarah M; Fergus, C Emi; Filstrup, Christopher T; Lapierre, Jean-Francois; Lottig, Noah R; Oliver, Samantha K; Scott, Caren E; Smith, Nicole J; Stopyak, Scott; Yuan, Shuai; Bremigan, Mary Tate; Downing, John A; Gries, Corinna; Henry, Emily N; Skaff, Nick K; Stanley, Emily H; Stow, Craig A; Tan, Pang-Ning; Wagner, Tyler; Webster, Katherine E

2015-01-01

Although there are considerable site-based data for individual or groups of ecosystems, these datasets are widely scattered, have different data formats and conventions, and often have limited accessibility. At the broader scale, national datasets exist for a large number of geospatial features of land, water, and air that are needed to fully understand variation among these ecosystems. However, such datasets originate from different sources and have different spatial and temporal resolutions. By taking an open-science perspective and by combining site-based ecosystem datasets and national geospatial datasets, science gains the ability to ask important research questions related to grand environmental challenges that operate at broad scales. Documentation of such complicated database integration efforts, through peer-reviewed papers, is recommended to foster reproducibility and future use of the integrated database. Here, we describe the major steps, challenges, and considerations in building an integrated database of lake ecosystems, called LAGOS (LAke multi-scaled GeOSpatial and temporal database), that was developed at the sub-continental study extent of 17 US states (1,800,000 km²). LAGOS includes two modules: LAGOSGEO, with geospatial data on every lake with surface area larger than 4 ha in the study extent (~50,000 lakes), including climate, atmospheric deposition, land use/cover, hydrology, geology, and topography measured across a range of spatial and temporal extents; and LAGOSLIMNO, with lake water quality data compiled from ~100 individual datasets for a subset of lakes in the study extent (~10,000 lakes). Procedures for the integration of datasets included: creating a flexible database design; authoring and integrating metadata; documenting data provenance; quantifying spatial measures of geographic data; quality-controlling integrated and derived data; and extensively documenting the database. Our procedures make a large, complex, and integrated database reproducible and extensible, allowing users to ask new research questions with the existing database or through the addition of new data. The largest challenge of this task was the heterogeneity of the data, formats, and metadata. Many steps of data integration need manual input from experts in diverse fields, requiring close collaboration.
Building a multi-scaled geospatial temporal ecology database from disparate data sources: Fostering open science through data reuse

USGS Publications Warehouse

Soranno, Patricia A.; Bissell, E.G.; Cheruvelil, Kendra S.; Christel, Samuel T.; Collins, Sarah M.; Fergus, C. Emi; Filstrup, Christopher T.; Lapierre, Jean-Francois; Lottig, Noah R.; Oliver, Samantha K.; Scott, Caren E.; Smith, Nicole J.; Stopyak, Scott; Yuan, Shuai; Bremigan, Mary Tate; Downing, John A.; Gries, Corinna; Henry, Emily N.; Skaff, Nick K.; Stanley, Emily H.; Stow, Craig A.; Tan, Pang-Ning; Wagner, Tyler; Webster, Katherine E.

2015-01-01

Although there are considerable site-based data for individual or groups of ecosystems, these datasets are widely scattered, have different data formats and conventions, and often have limited accessibility. At the broader scale, national datasets exist for a large number of geospatial features of land, water, and air that are needed to fully understand variation among these ecosystems. However, such datasets originate from different sources and have different spatial and temporal resolutions. By taking an open-science perspective and by combining site-based ecosystem datasets and national geospatial datasets, science gains the ability to ask important research questions related to grand environmental challenges that operate at broad scales. Documentation of such complicated database integration efforts, through peer-reviewed papers, is recommended to foster reproducibility and future use of the integrated database. Here, we describe the major steps, challenges, and considerations in building an integrated database of lake ecosystems, called LAGOS (LAke multi-scaled GeOSpatial and temporal database), that was developed at the sub-continental study extent of 17 US states (1,800,000 km²). LAGOS includes two modules: LAGOSGEO, with geospatial data on every lake with surface area larger than 4 ha in the study extent (~50,000 lakes), including climate, atmospheric deposition, land use/cover, hydrology, geology, and topography measured across a range of spatial and temporal extents; and LAGOSLIMNO, with lake water quality data compiled from ~100 individual datasets for a subset of lakes in the study extent (~10,000 lakes). Procedures for the integration of datasets included: creating a flexible database design; authoring and integrating metadata; documenting data provenance; quantifying spatial measures of geographic data; quality-controlling integrated and derived data; and extensively documenting the database. Our procedures make a large, complex, and integrated database reproducible and extensible, allowing users to ask new research questions with the existing database or through the addition of new data. The largest challenge of this task was the heterogeneity of the data, formats, and metadata. Many steps of data integration need manual input from experts in diverse fields, requiring close collaboration.
ADJUST: An automatic EEG artifact detector based on the joint use of spatial and temporal features.

PubMed

Mognon, Andrea; Jovicich, Jorge; Bruzzone, Lorenzo; Buiatti, Marco

2011-02-01

A successful method for removing artifacts from electroencephalogram (EEG) recordings is Independent Component Analysis (ICA), but its implementation remains largely user-dependent. Here, we propose a completely automatic algorithm (ADJUST) that identifies artifacted independent components by combining stereotyped artifact-specific spatial and temporal features. Features were optimized to capture blinks, eye movements, and generic discontinuities on a feature selection dataset. Validation on a totally different EEG dataset shows that (1) ADJUST's classification of independent components largely matches a manual one by experts (agreement on 95.2% of the data variance), and (2) removal of the artifacted components detected by ADJUST leads to neat reconstruction of visual and auditory event-related potentials from heavily artifacted data. These results demonstrate that ADJUST provides a fast, efficient, and automatic way to use ICA for artifact removal. Copyright © 2010 Society for Psychophysiological Research.
Spatial Covariability of Temperature and Hydroclimate as a Function of Timescale During the Common Era

NASA Astrophysics Data System (ADS)

McKay, N.

2017-12-01

As timescale increases from years to centuries, the spatial scale of covariability in the climate system is hypothesized to increase as well. Covarying spatial scales are larger for temperature than for hydroclimate; however, both aspects of the climate system show systematic changes on large spatial scales on orbital to tectonic timescales. The extent to which this phenomenon is evident in temperature and hydroclimate at centennial timescales is largely unknown. Recent syntheses of multidecadal to century-scale variability in hydroclimate during the past 2k in the Arctic, North America, and Australasia show little spatial covariability in hydroclimate during the Common Era. To determine 1) the evidence for systematic relationships between the spatial scale of climate covariability as a function of timescale, and 2) whether century-scale hydroclimate variability deviates from the relationship between spatial covariability and timescale, we quantify this phenomenon during the Common Era by calculating the e-folding distance in large instrumental and paleoclimate datasets. We calculate this metric of spatial covariability, at different timescales (1, 10 and 100-yr), for a large network of temperature and precipitation observations from the Global Historical Climatology Network (n=2447), from v2.0.0 of the PAGES2k temperature database (n=692), and from moisture-sensitive paleoclimate records from North America, the Arctic, and the Iso2k project (n = 328). Initial results support the hypothesis that the spatial scale of covariability is larger for temperature than for precipitation or paleoclimate hydroclimate indicators. Spatially, e-folding distances for temperature are largest at low latitudes and over the ocean. Both instrumental and proxy temperature data show clear evidence for increasing spatial extent as a function of timescale, but this phenomenon is very weak in the hydroclimate data analyzed here. In the proxy hydroclimate data, which are predominantly indicators of effective moisture, e-folding distance increases from annual to decadal timescales, but does not continue to increase to centennial timescales. Future work includes examining additional instrumental and proxy datasets of moisture variability, and extending the analysis to millennial timescales of variability.
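The abstract above does not say how the e-folding distance is estimated. One common approach, shown here purely as an assumed illustration, is to model the correlation between station pairs as r(d) = exp(-d / L) and fit L by least squares on ln r through the origin. The function name and the synthetic data are hypothetical.

```python
from math import exp, log

def e_folding_distance(distances, correlations):
    """Fit r(d) = exp(-d / L) and return L, the distance at which the
    correlation falls to 1/e. Uses least squares on ln r = -(1/L) * d,
    forced through the origin; pairs with r <= 0 or d <= 0 are skipped."""
    pairs = [(d, log(r)) for d, r in zip(distances, correlations)
             if d > 0 and r > 0]
    if not pairs:
        raise ValueError("no usable (distance, correlation) pairs")
    slope = sum(d * lr for d, lr in pairs) / sum(d * d for d, _ in pairs)
    if slope >= 0:
        raise ValueError("correlation does not decay with distance")
    return -1.0 / slope

# Synthetic check: correlations generated with a known L of 500 km.
distances_km = [100.0, 200.0, 400.0, 800.0]
correlations = [exp(-d / 500.0) for d in distances_km]
fitted_L = e_folding_distance(distances_km, correlations)
```

Repeating the fit on series smoothed to 1-, 10-, and 100-year timescales would give one e-folding distance per timescale, which is the comparison the abstract describes.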
Self-organizing maps: a versatile tool for the automatic analysis of untargeted imaging datasets.

PubMed

Franceschi, Pietro; Wehrens, Ron

2014-04-01

MS-based imaging approaches allow for location-specific identification of chemical components in biological samples, opening up possibilities of much more detailed understanding of biological processes and mechanisms. Data analysis, however, is challenging, mainly because of the sheer size of such datasets. This article presents a novel approach based on self-organizing maps, extending previous work in order to be able to handle the large number of variables present in high-resolution mass spectra. The key idea is to generate prototype images, representing spatial distributions of ions, rather than prototypical mass spectra. This allows for a two-stage approach, first generating typical spatial distributions and associated m/z bins, and later analyzing the interesting bins in more detail using accurate masses. The possibilities and advantages of the new approach are illustrated on an in-house dataset of apple slices. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Farmer data sourcing. The case study of the spatial soil information maps in South Tyrol.

NASA Astrophysics Data System (ADS)

Della Chiesa, Stefano; Niedrist, Georg; Thalheimer, Martin; Hafner, Hansjörg; La Cecilia, Daniele

2017-04-01

The northern Italian region of South Tyrol is Europe's largest apple-growing area, exporting ca. 15% in Europe and 2% worldwide. Vineyards represent ca. 1% of Italian production. In order to deliver high-quality food, most farmers in South Tyrol follow sustainable farming practices. One of the key practices is sustainable soil management, in which farmers regularly (every 5 years) collect soil samples and send them for analysis to improve cultivation management, yield, and ultimately profitability. However, such data generally remain inaccessible. In this regard, in South Tyrol, private interests and the public administration have established a long tradition of collaboration with the local farming industry. This has led to the collection of a large spatial and temporal database of soil analyses across all the cultivated areas. Thanks to this best practice, information on soil properties is centralized and geocoded. The large dataset consists mainly of soil information on texture, humus content, pH, and microelement availability, such as K, Mg, B, Mn, Cu, and Zn. These data were then spatialized by means of geostatistical methods, and several high-resolution digital maps were created. In this contribution, we present the best practice in which farmers act as a data source for soil information in South Tyrol. We show the capability of a large spatio-temporal geocoded soil dataset to reproduce detailed digital soil property maps and to assess long-term changes in soil properties. Finally, implications and potential applications are discussed.
Automated Topographic Change Detection via DEM Differencing at Large Scales Using the ArcticDEM Database

NASA Astrophysics Data System (ADS)

Candela, S. G.; Howat, I.; Noh, M. J.; Porter, C. C.; Morin, P. J.

2016-12-01

In the last decade, high resolution satellite imagery has become an increasingly accessible tool for geoscientists to quantify changes in the Arctic land surface due to geophysical, ecological and anthropogenic processes. However, the trade-off between spatial coverage and spatial-temporal resolution has limited detailed, process-level change detection over large (i.e. continental) scales. The ArcticDEM project utilized over 300,000 Worldview image pairs to produce a nearly 100% coverage elevation model (above 60°N) offering the first polar, high spatial - high resolution (2-8 m by region) dataset, often with multiple repeats in areas of particular interest to geoscientists. A dataset of this size (nearly 250 TB) offers endless new avenues of scientific inquiry, but quickly becomes unmanageable computationally and logistically for the computing resources available to the average scientist. Here we present TopoDiff, a framework for a generalized, automated workflow that requires minimal input from the end user about a study site, and utilizes cloud computing resources to provide a temporally sorted and differenced dataset, ready for geostatistical analysis. This hands-off approach allows the end user to focus on the science, without having to manage thousands of files, or petabytes of data. At the same time, TopoDiff provides a consistent and accurate workflow for image sorting, selection, and co-registration enabling cross-comparisons between research projects.
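The core operation named in the abstract above, producing a temporally sorted and differenced dataset, can be sketched in a few lines. This is not TopoDiff's code or interface, which the abstract does not describe. It assumes the elevation grids are already co-registered, have identical shape, and mark missing cells with `None`; the function name and sample values are hypothetical.

```python
from datetime import date

def difference_dems(dems):
    """Sort (acquisition_date, grid) pairs by date and difference each
    consecutive pair, later minus earlier. Cells missing (None) in either
    grid stay None in the result. Grids are assumed to be co-registered
    lists of rows with identical shape."""
    ordered = sorted(dems, key=lambda item: item[0])
    results = []
    for (d0, g0), (d1, g1) in zip(ordered, ordered[1:]):
        diff = [[None if a is None or b is None else b - a
                 for a, b in zip(row0, row1)]
                for row0, row1 in zip(g0, g1)]
        results.append((d0, d1, diff))
    return results

# Hypothetical 2x2 elevation grids in meters, given out of date order.
dems = [
    (date(2016, 7, 1), [[12.0, None], [11.0, 9.0]]),
    (date(2014, 7, 1), [[10.0, 10.0], [10.0, 10.0]]),
]
changes = difference_dems(dems)
```

Each result carries both acquisition dates, so elevation change can be converted to a rate before any geostatistical analysis.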
Potential distribution dataset of honeybees in Indian Ocean Islands: Case study of Zanzibar Island.

PubMed

Mwalusepo, Sizah; Muli, Eliud; Nkoba, Kiatoko; Nguku, Everlyn; Kilonzo, Joseph; Abdel-Rahman, Elfatih M; Landmann, Tobias; Fakih, Asha; Raina, Suresh

2017-10-01

Honeybees (Apis mellifera) are principal insect pollinators, whose worldwide distribution and abundance are known to depend largely on climatic conditions. However, presence records on the potential distribution of honeybees in Indian Ocean Islands remain poorly documented. Presence records in shape format and the probability of occurrence of honeybees under different temperature change scenarios are provided in this article for Zanzibar Island. The Maximum entropy (Maxent) package was used to analyse the potential distribution of honeybees. The dataset provides information on the current and future distribution of honeybees in Zanzibar Island. The dataset is of great importance for improving stakeholders' understanding of the role of temperature change in the spatial distribution of honeybees.
Sparse modeling of spatial environmental variables associated with asthma

PubMed Central

Chang, Timothy S.; Gangnon, Ronald E.; Page, C. David; Buckingham, William R.; Tandias, Aman; Cowan, Kelly J.; Tomasallo, Carrie D.; Arndt, Brian G.; Hanrahan, Lawrence P.; Guilbert, Theresa W.

2014-01-01

Geographically distributed environmental factors influence the burden of diseases such as asthma. Our objective was to identify sparse environmental variables associated with asthma diagnosis gathered from a large electronic health record (EHR) dataset while controlling for spatial variation. An EHR dataset from the University of Wisconsin’s Family Medicine, Internal Medicine and Pediatrics Departments was obtained for 199,220 patients aged 5–50 years over a three-year period. Each patient’s home address was geocoded to one of 3,456 geographic census block groups. Over one thousand block group variables were obtained from a commercial database. We developed a Sparse Spatial Environmental Analysis (SASEA). Using this method, the environmental variables were first dimensionally reduced with sparse principal component analysis. Logistic thin plate regression spline modeling was then used to identify block group variables associated with asthma from sparse principal components. The addresses of patients from the EHR dataset were distributed throughout the majority of Wisconsin’s geography. Logistic thin plate regression spline modeling captured spatial variation of asthma. Four sparse principal components identified via model selection consisted of food at home, dog ownership, household size, and disposable income variables. In rural areas, dog ownership and renter occupied housing units from significant sparse principal components were associated with asthma. Our main contribution is the incorporation of sparsity in spatial modeling. SASEA sequentially added sparse principal components to Logistic thin plate regression spline modeling. This method allowed association of geographically distributed environmental factors with asthma using EHR and environmental datasets. SASEA can be applied to other diseases with environmental risk factors. PMID:25533437
Sparse modeling of spatial environmental variables associated with asthma.

PubMed

Chang, Timothy S; Gangnon, Ronald E; Page, C David; Buckingham, William R; Tandias, Aman; Cowan, Kelly J; Tomasallo, Carrie D; Arndt, Brian G; Hanrahan, Lawrence P; Guilbert, Theresa W

2015-02-01

Geographically distributed environmental factors influence the burden of diseases such as asthma. Our objective was to identify sparse environmental variables associated with asthma diagnosis gathered from a large electronic health record (EHR) dataset while controlling for spatial variation. An EHR dataset from the University of Wisconsin's Family Medicine, Internal Medicine and Pediatrics Departments was obtained for 199,220 patients aged 5-50 years over a three-year period. Each patient's home address was geocoded to one of 3456 geographic census block groups. Over one thousand block group variables were obtained from a commercial database. We developed a Sparse Spatial Environmental Analysis (SASEA). Using this method, the environmental variables were first dimensionally reduced with sparse principal component analysis. Logistic thin plate regression spline modeling was then used to identify block group variables associated with asthma from sparse principal components. The addresses of patients from the EHR dataset were distributed throughout the majority of Wisconsin's geography. Logistic thin plate regression spline modeling captured spatial variation of asthma. Four sparse principal components identified via model selection consisted of food at home, dog ownership, household size, and disposable income variables. In rural areas, dog ownership and renter occupied housing units from significant sparse principal components were associated with asthma. Our main contribution is the incorporation of sparsity in spatial modeling. SASEA sequentially added sparse principal components to Logistic thin plate regression spline modeling. This method allowed association of geographically distributed environmental factors with asthma using EHR and environmental datasets. SASEA can be applied to other diseases with environmental risk factors. Copyright © 2014 Elsevier Inc. All rights reserved.
A Comparison of Latent Heat Fluxes over Global Oceans for Four Flux Products

NASA Technical Reports Server (NTRS)

Chou, Shu-Hsien; Nelkin, Eric; Ardizzone, Joe; Atlas, Robert M.

2003-01-01

To improve our understanding of global energy and water cycle variability, and to improve model simulations of climate variations, it is vital to have accurate latent heat fluxes (LHF) over global oceans. Monthly LHF, 10-m wind speed (U10m), 10-m specific humidity (Q10m), and sea-air humidity difference (Qs-Q10m) of GSSTF2 (version 2 Goddard Satellite-based Surface Turbulent Fluxes) over global oceans during 1992-93 are compared with those of HOAPS (Hamburg Ocean Atmosphere Parameters and Fluxes from Satellite Data), NCEP (NCEP/NCAR reanalysis), and da Silva. The mean differences, standard deviations of differences, and temporal correlation of these monthly variables over global oceans during 1992-93 between GSSTF2 and each of the three datasets are analyzed. The large-scale patterns of the 2yr-mean fields for these variables are similar among these four datasets, but significant quantitative differences are found. The temporal correlation is higher in the northern extratropics than in the south for all variables, with the contrast being especially large for da Silva as a result of more missing ship data in the south. The da Silva has extremely low temporal correlation and large differences with GSSTF2 for all variables in the southern extratropics, indicating that da Silva hardly produces a realistic variability in these variables. The NCEP has extremely low temporal correlation (0.27) and large spatial variations of differences with GSSTF2 for Qs-Q10m in the tropics, which causes the low correlation for LHF. Over the tropics, the HOAPS LHF is significantly smaller than GSSTF2 by approx. 31% (37 W/m²), whereas the other two datasets are comparable to GSSTF2. This is because the HOAPS has systematically smaller LHF than GSSTF2 in space, while the other two datasets have very large spatial variations of large positive and negative LHF differences with GSSTF2 to cancel and to produce smaller regional-mean differences. Our analyses suggest that the GSSTF2 latent heat flux, surface air humidity, and winds are likely to be more realistic than the other three flux datasets examined, although those of GSSTF2 are still subject to regional biases.
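The three comparison statistics named in the abstract above (mean difference, standard deviation of differences, and temporal correlation) are standard, and a minimal sketch for two monthly series at one location looks like this. The series values are hypothetical placeholders, not data from any of the four products, and the function name is invented for illustration.

```python
from math import sqrt

def compare_series(a, b):
    """Return (mean difference, sample standard deviation of differences,
    Pearson temporal correlation) for two equal-length monthly series."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = sum(diffs) / n
    sd_diff = sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return mean_diff, sd_diff, cov / sqrt(var_a * var_b)

# Hypothetical monthly latent heat flux values in W/m².
product_a = [100.0, 110.0, 120.0, 130.0]
product_b = [90.0, 100.0, 110.0, 120.0]
mean_diff, sd_diff, corr = compare_series(product_a, product_b)
```

The example also shows why the abstract reports all three numbers: a constant offset gives a nonzero mean difference with zero spread and perfect correlation, while large positive and negative local differences can cancel in a regional mean.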
A global wind resource atlas including high-resolution terrain effects

NASA Astrophysics Data System (ADS)

Hahmann, Andrea; Badger, Jake; Olsen, Bjarke; Davis, Neil; Larsen, Xiaoli; Badger, Merete

2015-04-01

Currently no accurate global wind resource dataset is available to fill the needs of policy makers and strategic energy planners. Evaluating wind resources directly from coarse-resolution reanalysis datasets underestimates the true wind energy resource, as the small-scale spatial variability of winds is missing. This missing variability can account for a large part of the local wind resource. Crucially, it is the windiest sites that suffer the largest wind resource errors: in simple terrain the windiest sites may be underestimated by 25%, in complex terrain the underestimate can be as large as 100%. The small-scale spatial variability of winds can be modelled using novel statistical methods and by application of established microscale models within WAsP developed at DTU Wind Energy. We present the framework for a single global methodology, which is relatively fast and economical to complete. The method employs reanalysis datasets, which are downscaled to high-resolution wind resource datasets via a so-called generalization step, and microscale modelling using WAsP. This method will create the first global wind atlas (GWA) that covers all land areas (except Antarctica) and a 30 km coastal zone over water. Verification of the GWA estimates will be done at carefully selected test regions, against verified estimates from mesoscale modelling and satellite synthetic aperture radar (SAR). This verification exercise will also help in the estimation of the uncertainty of the new wind climate dataset. Uncertainty will be assessed as a function of spatial aggregation. It is expected that the uncertainty at verification sites will be larger than that of dedicated assessments, but the uncertainty will be reduced at levels of aggregation appropriate for energy planning, and importantly much improved relative to what is used today.
In this presentation we discuss the methodology used, which includes the generalization of wind climatologies, and the differences in local and spatially aggregated wind resources that result from using different reanalyses in the various verification regions. A prototype web interface for public access to the data will also be showcased.
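The claim that coarse-resolution winds underestimate the resource can be illustrated with a small sketch. Wind power density scales with the cube of wind speed, so the power computed from a cell-average speed is lower than the average of the power over the speeds inside that cell. The wind speeds and the air density below are illustrative assumptions, not GWA data, and the sketch is not the WAsP downscaling method.

```python
# Illustrative only: the wind speeds are made-up numbers, not GWA data.
AIR_DENSITY = 1.225  # kg/m^3, standard sea-level value (assumption)


def power_density(speed_ms):
    """Wind power density in W/m^2 for a single wind speed in m/s."""
    return 0.5 * AIR_DENSITY * speed_ms ** 3


def mean(values):
    return sum(values) / len(values)


# Hypothetical fine-scale wind speeds (m/s) inside one coarse grid cell.
fine_speeds = [4.0, 6.0, 8.0, 10.0]

# Coarse view: only the cell-average speed is available.
coarse_estimate = power_density(mean(fine_speeds))

# Fine view: average the power density over the resolved speeds.
fine_estimate = mean([power_density(v) for v in fine_speeds])

# Fraction of the resource that the coarse view misses.
underestimate_fraction = 1.0 - coarse_estimate / fine_estimate
print(f"coarse: {coarse_estimate:.1f} W/m^2, fine: {fine_estimate:.1f} W/m^2")
print(f"underestimate: {underestimate_fraction:.1%}")
```

The gap grows with the spread of speeds inside the cell, which is consistent with the abstract's point that the windiest and most complex sites suffer the largest errors.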
Spatially Explicit Models to Investigate Geographic Patterns in the Distribution of Forensic STRs: Application to the North-Eastern Mediterranean

PubMed Central

Messina, Francesco; Finocchio, Andrea; Akar, Nejat; Loutradis, Aphrodite; Michalodimitrakis, Emmanuel I.; Brdicka, Radim; Jodice, Carla

2016-01-01

Human forensic STRs used for individual identification have been reported to have little power for inter-population analyses. Several methods have been developed which incorporate information on the spatial distribution of individuals to arrive at a description of the arrangement of diversity. We genotyped at 16 forensic STRs a large population sample obtained from many locations in Italy, Greece and Turkey, i.e. three countries crucial to the understanding of discontinuities at the European/Asian junction and the genetic legacy of ancient migrations, but seldom represented together in previous studies. Using spatial PCA on the full dataset, we detected patterns of population affinities in the area. Additionally, we devised objective criteria to reduce the overall complexity into reduced datasets. Independent spatially explicit methods applied to these latter datasets converged in showing that the extraction of information on long- to medium-range geographical trends and structuring from the overall diversity is possible. All analyses returned the picture of a background clinal variation, with regional discontinuities captured by each of the reduced datasets. Several aspects of our results are confirmed on external STR datasets and replicate those of genome-wide SNP typings. High levels of gene flow were inferred within the main continental areas by coalescent simulations. These results are promising from a microevolutionary perspective, in view of the fast pace at which forensic data are being accumulated for many locales. It is foreseeable that this will allow the exploitation of an invaluable genotypic resource, assembled for other (forensic) purposes, to clarify important aspects in the formation of local gene pools. PMID:27898725
Comparison of Radiative Energy Flows in Observational Datasets and Climate Modeling

NASA Technical Reports Server (NTRS)

Raschke, Ehrhard; Kinne, Stefan; Rossow, William B.; Stackhouse, Paul W. Jr.; Wild, Martin

2016-01-01

This study examines radiative flux distributions and local spread of values from three major observational datasets (CERES, ISCCP, and SRB) and compares them with results from climate modeling (CMIP3). Examinations of the spread and differences also differentiate among contributions from cloudy and clear-sky conditions. The spread among observational datasets is in large part caused by noncloud ancillary data. Average differences of at least 10 W m^-2 each for clear-sky downward solar, upward solar, and upward infrared fluxes at the surface demonstrate via spatial difference patterns major differences in assumptions for atmospheric aerosol, solar surface albedo and surface temperature, and/or emittance in observational datasets. At the top of the atmosphere (TOA), observational datasets are less influenced by the ancillary data errors than at the surface. Comparisons of spatial radiative flux distributions at the TOA between observations and climate modeling indicate large deficiencies in the strength and distribution of model-simulated cloud radiative effects. Differences are largest for lower-altitude clouds over low-latitude oceans. Global modeling simulates stronger cloud radiative effects (CRE) by +30 W m^-2 over trade wind cumulus regions, yet smaller CRE by about -30 W m^-2 over (smaller in area) stratocumulus regions. At the surface, climate modeling simulates on average about 15 W m^-2 smaller radiative net flux imbalances, as if climate modeling underestimates latent heat release (and precipitation). Relative to observational datasets, simulated surface net fluxes are particularly lower over oceanic trade wind regions (where global modeling tends to overestimate the radiative impact of clouds). Still, with the uncertainty in noncloud ancillary data, observational data do not establish a reliable reference.
Nick Grue | NREL

Science.gov Websites

geospatial data analysis using parallel processing; High performance computing; Renewable resource technical potential and supply curve analysis; Spatial database utilization; Rapid analysis of large geospatial datasets; energy and geospatial analysis products. Research Interests: Rapid, web-based renewable resource analysis
Improved Statistical Method For Hydrographic Climatic Records Quality Control

NASA Astrophysics Data System (ADS)

Gourrion, J.; Szekely, T.

2016-02-01

Climate research benefits from the continuous development of global in-situ hydrographic networks in the last decades. Apart from the increasing volume of observations available on a large range of temporal and spatial scales, a critical aspect concerns the ability to constantly improve the quality of the datasets. In the context of the Coriolis Dataset for ReAnalysis (CORA) version 4.2, a new quality control method based on a local comparison to historical extreme values ever observed is developed, implemented and validated. Temperature, salinity and potential density validity intervals are directly estimated from minimum and maximum values from an historical reference dataset, rather than from traditional mean and standard deviation estimates. Such an approach avoids strong statistical assumptions on the data distributions such as unimodality, absence of skewness and spatially homogeneous kurtosis. As a new feature, it also allows addressing simultaneously the two main objectives of a quality control strategy, i.e. maximizing the number of good detections while minimizing the number of false alarms. The reference dataset is presently built from the fusion of 1) all ARGO profiles up to early 2014, 2) 3 historical CTD datasets and 3) the Sea Mammals CTD profiles from the MEOP database. All datasets are extensively and manually quality controlled. In this communication, the latest method validation results are also presented. The method has been implemented in the latest version of the CORA dataset and will benefit the next version of the Copernicus CMEMS dataset.
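The min/max validity-interval idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the CORA implementation: the cell identifiers, the reference values and the optional `margin` parameter are all hypothetical.

```python
def validity_interval(reference_values):
    """Return the (min, max) interval observed in a historical reference set."""
    return min(reference_values), max(reference_values)


def flag_observations(observations, reference_values, margin=0.0):
    """Flag each observation that falls outside the local validity interval.

    `margin` is a hypothetical tolerance that widens the interval; it is an
    assumption of this sketch, not a parameter taken from the abstract.
    """
    low, high = validity_interval(reference_values)
    return [not (low - margin <= value <= high + margin) for value in observations]


# Hypothetical historical temperatures (deg C), keyed by a made-up local cell id.
reference_by_cell = {
    "cell_A": [12.1, 12.8, 13.5, 14.9, 18.2],
    "cell_B": [2.0, 2.4, 3.1, 3.3],
}

# New observations to check against the extremes of their own cell.
new_observations = [13.0, 18.0, 19.5, 11.0]
flags = flag_observations(new_observations, reference_by_cell["cell_A"])
print(flags)
```

Because the interval comes from observed extremes rather than a mean and standard deviation, the check makes no assumption about the shape of the local distribution.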
Impacts of spatial resolution and representation of flow connectivity on large-scale simulation of floods

NASA Astrophysics Data System (ADS)

Mateo, Cherry May R.; Yamazaki, Dai; Kim, Hyungjun; Champathong, Adisorn; Vaze, Jai; Oki, Taikan

2017-10-01

Global-scale river models (GRMs) are core tools for providing consistent estimates of global flood hazard, especially in data-scarce regions. Due to former limitations in computational power and input datasets, most GRMs have been developed to use simplified representations of flow physics and run at coarse spatial resolutions. With increasing computational power and improved datasets, the application of GRMs to finer resolutions is becoming a reality. To support development in this direction, the suitability of GRMs for application to finer resolutions needs to be assessed. This study investigates the impacts of spatial resolution and flow connectivity representation on the predictive capability of a GRM, CaMa-Flood, in simulating the 2011 extreme flood in Thailand. Analyses show that when single downstream connectivity (SDC) is assumed, simulation results deteriorate with finer spatial resolution; Nash-Sutcliffe efficiency coefficients decreased by more than 50% between simulation results at 10 km resolution and 1 km resolution. When multiple downstream connectivity (MDC) is represented, simulation results slightly improve with finer spatial resolution. The SDC simulations result in excessive backflows on very flat floodplains due to the restrictive flow directions at finer resolutions. MDC channels attenuated these effects by maintaining flow connectivity and flow capacity between floodplains in varying spatial resolutions. While a regional-scale flood was chosen as a test case, these findings should be universal and may have significant impacts on large- to global-scale simulations, especially in regions where mega deltas exist. These results demonstrate that a GRM can be used for higher resolution simulations of large-scale floods, provided that MDC in rivers and floodplains is adequately represented in the model structure.
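For readers unfamiliar with the skill score used above, the Nash-Sutcliffe efficiency compares the squared error of a simulation with the variance of the observations: NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2). A minimal sketch follows; the discharge values are invented for illustration and are not taken from the CaMa-Flood experiments.

```python
def nash_sutcliffe_efficiency(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated series must have equal length")
    mean_observed = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    observed_variance = sum((o - mean_observed) ** 2 for o in observed)
    if observed_variance == 0:
        raise ValueError("NSE is undefined for a constant observed series")
    return 1.0 - squared_error / observed_variance


# Hypothetical daily discharge (m^3/s) at one gauge.
observed_flow = [100.0, 150.0, 300.0, 250.0, 200.0]
simulated_flow = [110.0, 140.0, 280.0, 260.0, 190.0]

nse = nash_sutcliffe_efficiency(observed_flow, simulated_flow)
print(f"NSE = {nse:.3f}")
```

A simulation that only reproduces the observed mean scores 0, and negative values indicate a simulation worse than that baseline.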

Bayesian Hierarchical Modeling for Big Data Fusion in Soil Hydrology

NASA Astrophysics Data System (ADS)

Mohanty, B.; Kathuria, D.; Katzfuss, M.

2016-12-01

Soil moisture datasets from remote sensing (RS) platforms (such as SMOS and SMAP) and reanalysis products from land surface models are typically available on a coarse spatial granularity of several square km. Ground based sensors on the other hand provide observations on a finer spatial scale (meter scale or less) but are sparsely available. Soil moisture is affected by high variability due to complex interactions between geologic, topographic, vegetation and atmospheric variables. Hydrologic processes usually occur at a scale of 1 km or less and therefore spatially ubiquitous and temporally periodic soil moisture products at this scale are required to aid local decision makers in agriculture, weather prediction and reservoir operations. Past literature has largely focused on downscaling RS soil moisture for a small extent of a field or a watershed and hence the applicability of such products has been limited. The present study employs a spatial Bayesian Hierarchical Model (BHM) to derive soil moisture products at a spatial scale of 1 km for the state of Oklahoma by fusing point scale Mesonet data and coarse scale RS data for soil moisture and its auxiliary covariates such as precipitation, topography, soil texture and vegetation. It is seen that the BHM model handles change of support problems easily while performing accurate uncertainty quantification arising from measurement errors and imperfect retrieval algorithms. The computational challenge arising due to the large number of measurements is tackled by utilizing basis function approaches and likelihood approximations. The BHM model can be considered as a complex Bayesian extension of traditional geostatistical prediction methods (such as Kriging) for large datasets in the presence of uncertainties.
Spatial Downscaling of Alien Species Presences using Machine Learning

NASA Astrophysics Data System (ADS)

Daliakopoulos, Ioannis N.; Katsanevakis, Stelios; Moustakas, Aristides

2017-07-01

Large scale, high-resolution data on alien species distributions are essential for spatially explicit assessments of their environmental and socio-economic impacts, and management interventions for mitigation. However, these data are often unavailable. This paper presents a method that relies on Random Forest (RF) models to distribute alien species presence counts at a finer resolution grid, thus achieving spatial downscaling. A sufficiently large number of RF models are trained using random subsets of the dataset as predictors, in a bootstrapping approach to account for the uncertainty introduced by the subset selection. The method is tested with an approximately 8×8 km² grid containing floral alien species presence and several indices of climatic, habitat, land use covariates for the Mediterranean island of Crete, Greece. Alien species presence is aggregated at 16×16 km² and used as a predictor of presence at the original resolution, thus simulating spatial downscaling. Potential explanatory variables included habitat types, land cover richness, endemic species richness, soil type, temperature, precipitation, and freshwater availability. Uncertainty assessment of the spatial downscaling of alien species’ occurrences was also performed and true/false presences and absences were quantified. The approach is promising for downscaling alien species datasets of larger spatial scale but coarse resolution, where the underlying environmental information is available at a finer resolution than the alien species data. Furthermore, the RF architecture allows for tuning towards operationally optimal sensitivity and specificity, thus providing a decision support tool for designing a resource efficient alien species census.
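The aggregation step that sets up the simulated downscaling experiment can be sketched as a block sum over the fine grid. This covers only the coarsening of presence counts, not the Random Forest models; the grid values and the function name are hypothetical.

```python
def aggregate_counts(grid, factor=2):
    """Sum presence counts over factor x factor blocks of a fine grid.

    With factor=2 this mimics going from a nominal 8 km grid to a 16 km grid.
    """
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be divisible by the factor")
    return [
        [
            sum(
                grid[r + dr][c + dc]
                for dr in range(factor)
                for dc in range(factor)
            )
            for c in range(0, cols, factor)
        ]
        for r in range(0, rows, factor)
    ]


# Hypothetical alien species presence counts on a 4 x 4 fine grid.
fine_counts = [
    [0, 1, 2, 0],
    [3, 0, 0, 1],
    [0, 0, 4, 4],
    [1, 0, 0, 2],
]

coarse_counts = aggregate_counts(fine_counts, factor=2)
print(coarse_counts)
```

The coarse totals can then serve as one predictor, alongside the fine-resolution covariates, when a model is trained to recover the original counts.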
Insights from Modelling the Spatial Dependence Structure of Hydraulic Conductivity at the MADE Site Using Spatial Copulas

NASA Astrophysics Data System (ADS)

Haslauer, Claus; Bohling, Geoff

2013-04-01

Hydraulic conductivity (K) is a fundamental parameter that influences groundwater flow and solute transport. Measurements of K are limited and uncertain. Moreover, the spatial structure of K, which impacts the groundwater velocity field and hence directly influences the advective spreading of a solute migrating in the subsurface, is commonly described by approaches using second order moments. Spatial copulas have in the recent past been applied successfully to model the spatial dependence structure of heterogeneous subsurface datasets. At the MADE site, hydraulic conductivity (K) has been measured in exceptional detail. Two independently collected datasets were used for this study: (1) ~2000 flowmeter based K measurements, and (2) ~20,000 direct-push based K measurements. These datasets exhibit a very heterogeneous (Var[ln(K)]>2) spatially distributed K field. A copula analysis reveals that the spatial dependence structures of the flowmeter and direct-push datasets are essentially the same. A spatial copula analysis factors out the influence of the marginal distribution of the property under investigation. This independence from the marginal distributions allows the copula analysis to reveal the underlying similarity between the spatial dependence structures of the flowmeter and direct-push datasets despite two complicating factors: 1) an overall offset between the datasets, with direct-push K values being, on average, roughly a factor of five lower than flowmeter K values, due at least in part to opposite biases between the two measurement techniques, and 2) the presence of some anomalously high K values in the direct-push dataset due to a lower limit on accurately measurable pressure responses in high-K zones. In addition, the vertical resolution of the direct-push dataset is ten times finer than that of the flowmeter dataset. Upscaling the direct-push data to compensate for this difference resulted in little change to the spatial structure.
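Why a copula analysis is unaffected by the marginal distribution can be seen from the rank transform it starts from. The sketch below converts values to pseudo-observations on (0, 1) and shows that a constant offset factor or a log transform leaves them unchanged. The K values are invented, ties are assumed absent, and this is only the first step of an empirical copula analysis, not the authors' full spatial copula model.

```python
import math


def pseudo_observations(values):
    """Map values to rank-based pseudo-observations rank / (n + 1).

    Assumes no ties, which keeps the sketch short.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return [rank / (n + 1) for rank in ranks]


# Hypothetical hydraulic conductivities (m/s) at four locations.
k_values = [1e-4, 5e-3, 2e-5, 8e-2]

# A constant factor-of-five offset and a log transform are both monotone
# increasing, so they change the marginal distribution but not the ranks.
k_offset = [k / 5.0 for k in k_values]
k_log = [math.log(k) for k in k_values]

u_original = pseudo_observations(k_values)
u_offset = pseudo_observations(k_offset)
u_log = pseudo_observations(k_log)
print(u_original)
```

Any dependence measure computed from the pseudo-observations therefore compares the two datasets on equal footing, regardless of the systematic offset between them.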
The objective of the presented work is to use multidimensional spatial copulas to describe and model the spatial dependence of the spatial structure of K at the heterogeneous MADE site, and evaluate the effects of this multidimensional description on solute transport.
RE-Europe, a large-scale dataset for modeling a highly renewable European electricity system

PubMed Central

Jensen, Tue V.; Pinson, Pierre

2017-01-01

Future highly renewable energy systems will couple to complex weather and climate dynamics. This coupling is generally not captured in detail by the open models developed in the power and energy system communities, where such open models exist. To enable modeling such a future energy system, we describe a dedicated large-scale dataset for a renewable electric power system. The dataset combines a transmission network model, as well as information for generation and demand. Generation includes conventional generators with their technical and economic characteristics, as well as weather-driven forecasts and corresponding realizations for renewable energy generation for a period of 3 years. These may be scaled according to the envisioned degrees of renewable penetration in a future European energy system. The spatial coverage, completeness and resolution of this dataset open the door to the evaluation, scaling analysis and replicability check of a wealth of proposals in, e.g., market design, network actor coordination and forecasting of renewable power generation. PMID:29182600
RE-Europe, a large-scale dataset for modeling a highly renewable European electricity system.

PubMed

Jensen, Tue V; Pinson, Pierre

2017-11-28

Future highly renewable energy systems will couple to complex weather and climate dynamics. This coupling is generally not captured in detail by the open models developed in the power and energy system communities, where such open models exist. To enable modeling such a future energy system, we describe a dedicated large-scale dataset for a renewable electric power system. The dataset combines a transmission network model, as well as information for generation and demand. Generation includes conventional generators with their technical and economic characteristics, as well as weather-driven forecasts and corresponding realizations for renewable energy generation for a period of 3 years. These may be scaled according to the envisioned degrees of renewable penetration in a future European energy system. The spatial coverage, completeness and resolution of this dataset open the door to the evaluation, scaling analysis and replicability check of a wealth of proposals in, e.g., market design, network actor coordination and forecasting of renewable power generation.
RE-Europe, a large-scale dataset for modeling a highly renewable European electricity system

NASA Astrophysics Data System (ADS)

Jensen, Tue V.; Pinson, Pierre

2017-11-01

Future highly renewable energy systems will couple to complex weather and climate dynamics. This coupling is generally not captured in detail by the open models developed in the power and energy system communities, where such open models exist. To enable modeling such a future energy system, we describe a dedicated large-scale dataset for a renewable electric power system. The dataset combines a transmission network model, as well as information for generation and demand. Generation includes conventional generators with their technical and economic characteristics, as well as weather-driven forecasts and corresponding realizations for renewable energy generation for a period of 3 years. These may be scaled according to the envisioned degrees of renewable penetration in a future European energy system. The spatial coverage, completeness and resolution of this dataset open the door to the evaluation, scaling analysis and replicability check of a wealth of proposals in, e.g., market design, network actor coordination and forecasting of renewable power generation.
Global Data Spatially Interrelate System for Scientific Big Data Spatial-Seamless Sharing

NASA Astrophysics Data System (ADS)

Yu, J.; Wu, L.; Yang, Y.; Lei, X.; He, W.

2014-04-01

A good data sharing system with spatially seamless services spares scientists the tedious and time-consuming work of spatial transformation, and hence encourages the use of scientific data and promotes scientific innovation. Having been adopted as the framework of Earth datasets by the Group on Earth Observation (GEO), the Earth System Spatial Grid (ESSG) has the potential to be the spatial reference of Earth datasets. Based on SDOG-ESSG, an implementation of ESSG, a data sharing system named the global data spatially interrelate system (GASE) was designed to make data sharing spatially seamless. The architecture of GASE is introduced. The implementation of the two key components, V-Pools and the interrelating engine, and the prototype are presented. Any dataset is first resampled into SDOG-ESSG and divided into small blocks, which are then mapped into the hierarchical system of the distributed file system in V-Pools; together, these steps serve the data at a uniform spatial reference and at high efficiency. In addition, the datasets from different data centres are interrelated by the interrelating engine at the uniform spatial reference of SDOG-ESSG, which enables the system to share open datasets on the internet in a spatially seamless manner.
Ontology for Transforming Geo-Spatial Data for Discovery and Integration of Scientific Data

NASA Astrophysics Data System (ADS)

Nguyen, L.; Chee, T.; Minnis, P.

2013-12-01

Discovery of and access to geo-spatial scientific data across heterogeneous repositories and multi-discipline datasets can present challenges for scientists. We propose to build a workflow for transforming geo-spatial datasets into a semantic environment by using relationships to describe the resource using OWL Web Ontology, RDF, and a proposed geo-spatial vocabulary. We will present methods for transforming traditional scientific datasets, use of a semantic repository, and querying using SPARQL to integrate and access datasets. This unique repository will enable discovery of scientific data by geospatial bound or other criteria.
The MIND PALACE: A Multi-Spectral Imaging and Spectroscopy Database for Planetary Science

NASA Astrophysics Data System (ADS)

Eshelman, E.; Doloboff, I.; Hara, E. K.; Uckert, K.; Sapers, H. M.; Abbey, W.; Beegle, L. W.; Bhartia, R.

2017-12-01

The Multi-Instrument Database (MIND) is the web-based home to a well-characterized set of analytical data collected by a suite of deep-UV fluorescence/Raman instruments built at the Jet Propulsion Laboratory (JPL). Samples derive from a growing body of planetary surface analogs, mineral and microbial standards, meteorites, spacecraft materials, and other astrobiologically relevant materials. In addition to deep-UV spectroscopy, datasets stored in MIND are obtained from a variety of analytical techniques applied over multiple spatial and spectral scales, including electron microscopy, optical microscopy, infrared spectroscopy, X-ray fluorescence, and direct fluorescence imaging. Multivariate statistical analysis techniques, primarily Principal Component Analysis (PCA), are used to guide interpretation of these large multi-analytical spectral datasets. Spatial co-referencing of integrated spectral/visual maps is performed using QGIS (geographic information system software). Georeferencing techniques transform individual instrument data maps into a layered co-registered data cube for analysis across spectral and spatial scales. The body of data in MIND is intended to serve as a permanent, reliable, and expanding database of deep-UV spectroscopy datasets generated by this unique suite of JPL-based instruments on samples of broad planetary science interest.
[Spatial domain display for interference image dataset].

PubMed

Wang, Cai-Ling; Li, Yu-Shan; Liu, Xue-Bin; Hu, Bing-Liang; Jing, Juan-Juan; Wen, Jia

2011-11-01

Visualization of imaging interferometer data is urgently needed by users for image interpretation and information extraction. However, conventional research on visualization focuses only on spectral image datasets in the spectral domain. Hence, quick display of interference spectral image datasets is one of the key steps in interference image processing. Conventional visualization of an interference dataset applies a classical spectral image display method after Fourier transformation. In the present paper, the problem of quick viewing of interferometer imagery in the image domain is addressed, and an algorithm is proposed that simplifies the matter. The Fourier transformation is an obstacle because its computation time is very large, and the situation deteriorates further as the dataset size increases. The proposed algorithm, named interference weighted envelopes, frees the dataset from the transformation. The authors choose three interference weighted envelopes, based respectively on the Fourier transformation, features of interference data, and the human visual system. A comparison of the proposed method with the conventional methods shows a large difference in display time.
Tree-based approach for exploring marine spatial patterns with raster datasets.

PubMed

Liao, Xiaohan; Xue, Cunjin; Su, Fenzhen

2017-01-01

From multiple raster datasets to spatial association patterns, the data-mining technique is divided into three subtasks, i.e., raster dataset pretreatment, mining algorithm design, and spatial pattern exploration from the mining results. Comparison with the former two subtasks reveals that the latter remains unresolved. Confronted with the interrelated marine environmental parameters, we propose a Tree-based Approach for eXploring Marine Spatial Patterns with multiple raster datasets called TAXMarSP, which includes two models. One is the Tree-based Cascading Organization Model (TCOM), and the other is the Spatial Neighborhood-based CAlculation Model (SNCAM). TCOM designs the "Spatial node→Pattern node" from top to bottom layers to store the table-formatted frequent patterns. Together with TCOM, SNCAM considers the spatial neighborhood contributions to calculate the pattern-matching degree between the specified marine parameters and the table-formatted frequent patterns and then explores the marine spatial patterns. Using the prevalent quantification Apriori algorithm and a real remote sensing dataset from January 1998 to December 2014, a successful application of TAXMarSP to marine spatial patterns in the Pacific Ocean is described, and the obtained marine spatial patterns present not only the well-known but also new patterns to Earth scientists.
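The frequent patterns that TAXMarSP stores and matches come from an Apriori-style mining step. The sketch below is a plain, textbook Apriori over discretized raster cells, offered only to show what a "frequent pattern" means here; it is not the quantitative Apriori variant or the TCOM/SNCAM models from the abstract, and the parameter names and levels (such as `SST:high`) are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
    frequent = {}
    size = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        size += 1
        # Join step: merge frequent itemsets into candidates of the next size.
        candidates = {a | b for a in current for b in current if len(a | b) == size}
        # Prune step: every (size - 1)-subset of a candidate must be frequent.
        current = [
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, size - 1))
            and support(c) >= min_support
        ]
    return frequent


# Each transaction is one raster cell with hypothetical discretized parameters.
cells = [
    frozenset({"SST:high", "CHL:low", "SLA:high"}),
    frozenset({"SST:high", "CHL:low"}),
    frozenset({"SST:low", "CHL:high"}),
    frozenset({"SST:high", "CHL:low", "SLA:low"}),
]

patterns = apriori(cells, min_support=0.5)
for itemset, value in sorted(patterns.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), value)
```

In this toy input the only multi-parameter pattern that survives is the co-occurrence of high `SST` with low `CHL`, which is the kind of table-formatted result a later spatial exploration step would take as input.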
Improved statistical method for temperature and salinity quality control

NASA Astrophysics Data System (ADS)

Gourrion, Jérôme; Szekely, Tanguy

2017-04-01

Climate research and Ocean monitoring benefit from the continuous development of global in-situ hydrographic networks in the last decades. Apart from the increasing volume of observations available on a large range of temporal and spatial scales, a critical aspect concerns the ability to constantly improve the quality of the datasets. In the context of the Coriolis Dataset for ReAnalysis (CORA) version 4.2, a new quality control method based on a local comparison to historical extreme values ever observed is developed, implemented and validated. Temperature, salinity and potential density validity intervals are directly estimated from minimum and maximum values from an historical reference dataset, rather than from traditional mean and standard deviation estimates. Such an approach avoids strong statistical assumptions on the data distributions such as unimodality, absence of skewness and spatially homogeneous kurtosis. As a new feature, it also allows addressing simultaneously the two main objectives of an automatic quality control strategy, i.e. maximizing the number of good detections while minimizing the number of false alarms. The reference dataset is presently built from the fusion of 1) all ARGO profiles up to late 2015, 2) 3 historical CTD datasets and 3) the Sea Mammals CTD profiles from the MEOP database. All datasets are extensively and manually quality controlled. In this communication, the latest method validation results are also presented. The method has already been implemented in the latest version of the delayed-time CMEMS in-situ dataset and will be deployed soon in the equivalent near-real time products.
Spatial Structure of Above-Ground Biomass Limits Accuracy of Carbon Mapping in Rainforest but Large Scale Forest Inventories Can Help to Overcome

PubMed Central

Guitet, Stéphane; Hérault, Bruno; Molto, Quentin; Brunaux, Olivier; Couteron, Pierre

2015-01-01

Precise mapping of above-ground biomass (AGB) is a major challenge for the success of REDD+ processes in tropical rainforest. The usual mapping methods are based on two hypotheses: a large and long-ranged spatial autocorrelation and a strong environment influence at the regional scale. However, there are no studies of the spatial structure of AGB at the landscape scale to support these assumptions. We studied spatial variation in AGB at various scales using two large forest inventories conducted in French Guiana. The dataset comprised 2507 plots (0.4 to 0.5 ha) of undisturbed rainforest distributed over the whole region. After checking the uncertainties of estimates obtained from these data, we used half of the dataset to develop explicit predictive models including spatial and environmental effects and tested the accuracy of the resulting maps according to their resolution using the rest of the data. Forest inventories provided accurate AGB estimates at the plot scale, for a mean of 325 Mg ha^-1. They revealed high local variability combined with a weak autocorrelation up to distances of no more than 10 km. Environmental variables accounted for a minor part of spatial variation. Accuracy of the best model including spatial effects was 90 Mg ha^-1 at plot scale but coarse graining up to 2-km resolution allowed mapping AGB with accuracy lower than 50 Mg ha^-1. No agreement was found with available pan-tropical reference maps at any resolution. We concluded that the combined weak autocorrelation and weak environmental effect limit AGB map accuracy in rainforest, and that a trade-off has to be found between spatial resolution and effective accuracy until adequate “wall-to-wall” remote sensing signals provide reliable AGB predictions. In the meantime, using large forest inventories with low sampling rate (<0.5%) may be an efficient way to increase the global coverage of AGB maps with acceptable accuracy at kilometric resolution. PMID:26402522
Scaling identity connects human mobility and social interactions.

PubMed

Deville, Pierre; Song, Chaoming; Eagle, Nathan; Blondel, Vincent D; Barabási, Albert-László; Wang, Dashun

2016-06-28

Massive datasets that capture human movements and social interactions have catalyzed rapid advances in our quantitative understanding of human behavior during the past years. One important aspect affecting both areas is the critical role space plays. Indeed, growing evidence suggests both our movements and communication patterns are associated with spatial costs that follow reproducible scaling laws, each characterized by its specific critical exponents. Although human mobility and social networks develop concomitantly as two prolific yet largely separated fields, we lack any known relationships between the critical exponents explored by them, despite the fact that they often study the same datasets. Here, by exploiting three different mobile phone datasets that capture simultaneously these two aspects, we discovered a new scaling relationship, mediated by a universal flux distribution, which links the critical exponents characterizing the spatial dependencies in human mobility and social networks. Therefore, the widely studied scaling laws uncovered in these two areas are not independent but connected through a deeper underlying reality.
Scaling identity connects human mobility and social interactions

PubMed Central

Deville, Pierre; Song, Chaoming; Eagle, Nathan; Blondel, Vincent D.; Barabási, Albert-László; Wang, Dashun

2016-01-01

Massive datasets that capture human movements and social interactions have catalyzed rapid advances in our quantitative understanding of human behavior during the past years. One important aspect affecting both areas is the critical role space plays. Indeed, growing evidence suggests both our movements and communication patterns are associated with spatial costs that follow reproducible scaling laws, each characterized by its specific critical exponents. Although human mobility and social networks develop concomitantly as two prolific yet largely separated fields, we lack any known relationships between the critical exponents explored by them, despite the fact that they often study the same datasets. Here, by exploiting three different mobile phone datasets that capture simultaneously these two aspects, we discovered a new scaling relationship, mediated by a universal flux distribution, which links the critical exponents characterizing the spatial dependencies in human mobility and social networks. Therefore, the widely studied scaling laws uncovered in these two areas are not independent but connected through a deeper underlying reality. PMID:27274050
A SOA-based approach to geographical data sharing

NASA Astrophysics Data System (ADS)

Li, Zonghua; Peng, Mingjun; Fan, Wei

2009-10-01

In the last few years, large volumes of spatial data have become available in different government departments in China, but these data are mainly used within those departments. With the initiation of the e-government project, spatial data sharing has become more and more necessary. Currently, the Web is used not only for document searching but also for the provision and use of services, known as Web services, which are published in a directory and may be automatically discovered by software agents. Particularly in the spatial domain, the possibility of accessing these large spatial datasets via Web services has motivated research into the new field of Spatial Data Infrastructure (SDI) implemented using service-oriented architecture. In this paper, a Service-Oriented Architecture (SOA) based Geographical Information System (GIS) is proposed, and a prototype system based on Open Geospatial Consortium (OGC) standards is deployed in Wuhan, China, so that all authorized departments can access the spatial data within the government intranet and the spatial data can be easily integrated into various kinds of applications.
A multiscale Bayesian data integration approach for mapping air dose rates around the Fukushima Daiichi Nuclear Power Plant.

PubMed

Wainwright, Haruko M; Seki, Akiyuki; Chen, Jinsong; Saito, Kimiaki

2017-02-01

This paper presents a multiscale data integration method to estimate the spatial distribution of air dose rates in the regional scale around the Fukushima Daiichi Nuclear Power Plant. We integrate various types of datasets, such as ground-based walk and car surveys, and airborne surveys, all of which have different scales, resolutions, spatial coverage, and accuracy. This method is based on geostatistics to represent spatial heterogeneous structures, and also on Bayesian hierarchical models to integrate multiscale, multi-type datasets in a consistent manner. The Bayesian method allows us to quantify the uncertainty in the estimates, and to provide the confidence intervals that are critical for robust decision-making. Although this approach is primarily data-driven, it has great flexibility to include mechanistic models for representing radiation transport or other complex correlations. We demonstrate our approach using three types of datasets collected at the same time over Fukushima City in Japan: (1) coarse-resolution airborne surveys covering the entire area, (2) car surveys along major roads, and (3) walk surveys in multiple neighborhoods. Results show that the method can successfully integrate three types of datasets and create an integrated map (including the confidence intervals) of air dose rates over the domain in high resolution. Moreover, this study provides us with various insights into the characteristics of each dataset, as well as radiocaesium distribution. In particular, the urban areas show high heterogeneity in the contaminant distribution due to human activities as well as large discrepancy among different surveys due to such heterogeneity. Copyright © 2016 Elsevier Ltd. All rights reserved.
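As a rough illustration of why a Bayesian treatment yields both an estimate and its uncertainty, the sketch below fuses several measurements of the same quantity by precision weighting. It is not the multiscale hierarchical model of the paper: it assumes independent Gaussian errors, a flat prior, and measurements that refer to the same location and support. The function name `fuse_gaussian` and the example values are hypothetical.

```python
import math

def fuse_gaussian(observations):
    """Combine (value, variance) pairs under independent Gaussian errors.

    Assumption: flat prior and a common measurement support, so the
    posterior is Gaussian with precision equal to the sum of precisions.
    """
    precision = sum(1.0 / var for _, var in observations)
    mean = sum(value / var for value, var in observations) / precision
    variance = 1.0 / precision
    half_width = 1.96 * math.sqrt(variance)  # approximate 95% interval
    return mean, variance, (mean - half_width, mean + half_width)

# Hypothetical dose-rate readings (value, error variance) for one location:
# a noisy airborne estimate and a more precise walk-survey estimate.
mean, variance, interval = fuse_gaussian([(0.60, 0.04), (0.45, 0.01)])
print(mean, variance, interval)
```

The more precise measurement dominates the fused mean, and the fused variance is smaller than either input variance, which is the behaviour one expects when consistent datasets of differing accuracy are combined.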
Location Prediction Based on Transition Probability Matrices Constructing from Sequential Rules for Spatial-Temporal K-Anonymity Dataset

PubMed Central

Liu, Zhao; Zhu, Yunhong; Wu, Chenxue

2016-01-01

Spatial-temporal k-anonymity has become a mainstream approach among techniques for protecting users’ privacy in location-based services (LBS) applications, and has been applied to several variants such as LBS snapshot queries and continuous queries. Analyzing large-scale spatial-temporal anonymity sets may benefit several LBS applications. In this paper, we propose two location prediction methods based on transition probability matrices constructed from sequential rules for spatial-temporal k-anonymity datasets. First, we define single-step sequential rules mined from sequential spatial-temporal k-anonymity datasets generated from continuous LBS queries for multiple users. We then construct transition probability matrices from the mined single-step sequential rules, and normalize the transition probabilities in the transition matrices. Next, we regard a mobility model for an LBS requester as a stationary stochastic process and compute the n-step transition probability matrices by raising the normalized transition probability matrices to the power n. Furthermore, we propose two location prediction methods: rough prediction and accurate prediction. The former obtains the probabilities of arriving at target locations along simple paths that include only current locations, target locations and transition steps. By iteratively combining the probabilities for simple paths with n steps and the probabilities for detailed paths with n-1 steps, the latter method calculates transition probabilities for detailed paths with n steps from current locations to target locations. Finally, we conduct extensive experiments, which verify the correctness and flexibility of our proposed algorithm. PMID:27508502
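The matrix steps described above (count single-step transitions, normalize each row, raise the matrix to the power n) can be sketched as follows. This is a minimal illustration, not the authors' implementation: plain location labels stand in for anonymity regions, the function names are hypothetical, and the self-loop for states with no observed outgoing transition is an assumption made here.

```python
def transition_matrix(sequences, states):
    """Row-normalized single-step transition probabilities."""
    index = {s: i for i, s in enumerate(states)}
    n = len(states)
    counts = [[0.0] * n for _ in range(n)]
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[index[a]][index[b]] += 1.0
    matrix = []
    for i, row in enumerate(counts):
        total = sum(row)
        if total == 0.0:
            # Assumption: a state with no observed exits stays where it is.
            matrix.append([1.0 if j == i else 0.0 for j in range(n)])
        else:
            matrix.append([c / total for c in row])
    return matrix

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def n_step(matrix, n):
    """n-step transition probabilities, i.e. the matrix raised to the power n."""
    size = len(matrix)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, matrix)
    return result

states = ["A", "B", "C"]
sequences = [["A", "B", "C", "A"], ["A", "B", "A"], ["B", "C", "A"]]
P = transition_matrix(sequences, states)
P2 = n_step(P, 2)
print(P2[0])  # probabilities of being in A, B, C two steps after leaving A
```

Entry `P2[i][j]` is the probability of reaching state `j` from state `i` in exactly two steps, summed over every intermediate state, which corresponds to the "rough prediction" idea of ignoring the detailed path.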
A Review of Land-Cover Mapping Activities in Coastal Alabama and Mississippi

USGS Publications Warehouse

Smith, Kathryn E.L.; Nayegandhi, Amar; Brock, John C.

2010-01-01

INTRODUCTION Land-use and land-cover (LULC) data provide important information for environmental management. Data pertaining to land-cover and land-management activities are a common requirement for spatial analyses, such as watershed modeling, climate change, and hazard assessment. In coastal areas, land development, storms, and shoreline modification amplify the need for frequent and detailed land-cover datasets. The northern Gulf of Mexico coastal area is no exception. The impact of severe storms, increases in urban area, dramatic changes in land cover, and loss of coastal-wetland habitat all indicate a vital need for reliable and comparable land-cover data. Four main attributes define a land-cover dataset: the date/time of data collection, the spatial resolution, the type of classification, and the source data. The source data are the foundation dataset used to generate LULC classification and are typically remotely sensed data, such as aerial photography or satellite imagery. These source data have a large influence on the final LULC data product, so much so that one can classify LULC datasets into two general groups: LULC data derived from aerial photography and LULC data derived from satellite imagery. The final LULC data can be converted from one format to another (for instance, vector LULC data can be converted into raster data for analysis purposes, and vice versa), but each subsequent dataset maintains the imprint of the source medium within its spatial accuracy and data features. The source data will also influence the spatial and temporal resolution, as well as the type of classification. The intended application of the LULC data typically defines the type of source data and methodology, with satellite imagery being selected for large landscapes (state-wide, national data products) and repeatability (environmental monitoring and change analysis). 
The coarse spatial scale and lack of refined land-use categories are typical drawbacks to satellite-based land-use classifications. Aerial photography is typically selected for smaller landscapes (watershed-basin scale), for greater definition of the land-use categories, and for increased spatial resolution. Disadvantages of using photography include time-consuming digitization, high costs for imagery collection, and lack of seasonal data. Recently, the availability of high-resolution satellite imagery has generated a new category of LULC data product. These new datasets have similar strengths to the aerial-photo-based LULC in that they possess the potential for refined definition of land-use categories and increased spatial resolution but also have the benefit of satellite-based classifications, such as repeatability for change analysis. LULC classification based on high-resolution satellite imagery is still in the early stages of development but merits greater attention because environmental-monitoring and landscape-modeling programs rely heavily on LULC data. This publication summarizes land-use and land-cover mapping activities for Alabama and Mississippi coastal areas within the U.S. Geological Survey (USGS) Northern Gulf of Mexico (NGOM) Ecosystem Change and Hazard Susceptibility Project boundaries. Existing LULC datasets will be described, as well as imagery data sources and ancillary data that may provide ground-truth or satellite training data for a forthcoming land-cover classification. Finally, potential areas for a high-resolution land-cover classification in the Alabama-Mississippi region will be identified.
Spatio-temporal Eigenvector Filtering: Application on Bioenergy Crop Impacts

NASA Astrophysics Data System (ADS)

Wang, M.; Kamarianakis, Y.; Georgescu, M.

2017-12-01

A suite of 10-year ensemble-based simulations was conducted to investigate the hydroclimatic impacts due to large-scale deployment of perennial bioenergy crops across the continental United States. Given the large size of the simulated dataset (about 60Tb), traditional hierarchical spatio-temporal statistical modelling cannot be implemented for the evaluation of physics parameterizations and biofuel impacts. In this work, we propose a filtering algorithm that takes into account the spatio-temporal autocorrelation structure of the data while avoiding spatial confounding. This method is used to quantify the robustness of simulated hydroclimatic impacts associated with bioenergy crops to alternative physics parameterizations and observational datasets. Results are evaluated against those obtained from three alternative Bayesian spatio-temporal specifications.

Finding Intervals of Abrupt Change in Earth Science Data

NASA Astrophysics Data System (ADS)

Zhou, X.; Shekhar, S.; Liess, S.

2011-12-01

In earth science data (e.g., climate data), it is often observed that a persistently abrupt change in value occurs in a certain time-period or spatial interval. For example, abrupt climate change is defined as an unusually large shift of precipitation, temperature, etc., that occurs during a relatively short time period. A similar change pattern can also be found in geographical space, representing a sharp transition of the environment (e.g., vegetation between different ecological zones). Identifying such intervals of change from earth science datasets is a crucial step for understanding and attributing the underlying phenomenon. However, inconsistencies in these noisy datasets can obscure the major change trend and, more importantly, can complicate the search for the beginning and end points of the interval of change. Also, the large volume of data makes it challenging to process the dataset reasonably fast. In this work, we analyze earth science data using a novel, automated data mining approach to identify spatial/temporal intervals of persistent, abrupt change. 
We first propose a statistical model to quantitatively evaluate the change abruptness and persistence in an interval. Then we design an algorithm to exhaustively examine all the intervals using this model. Intervals passing a threshold test are kept as final results. We evaluate the proposed method with the Climate Research Unit (CRU) precipitation data, focusing on the Sahel rainfall index. Results show that this method can find periods of persistent and abrupt value changes with different temporal scales. We also further optimize the algorithm using a smart strategy, which always examines longer intervals before their subsets. By doing this, we reduce the computational cost to only one third of that of the original algorithm for the above test case. More significantly, the optimized algorithm is also proven to scale up well with data volume and number of changes. Particularly, it achieves better performance when dealing with longer change intervals.
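The search strategy described above can be sketched in a few lines. The abstract does not give the statistical model, so the scores below are stand-ins chosen for illustration only: abruptness is taken as the mean absolute change per step, and persistence as the fraction of steps that move in the direction of the net change. The function name, thresholds, and the rule of skipping sub-intervals of an already accepted interval are likewise assumptions.

```python
def abrupt_intervals(values, min_slope, min_persistence, min_steps=2):
    """Return (start, end) index pairs that pass both threshold tests.

    Longer intervals are examined first; an interval lying inside an
    already accepted one is skipped (an assumed pruning rule).
    """
    n = len(values)
    found = []
    for steps in range(n - 1, min_steps - 1, -1):
        for start in range(0, n - steps):
            end = start + steps
            if any(s <= start and end <= e for s, e in found):
                continue  # covered by a longer accepted interval
            net = values[end] - values[start]
            if net == 0:
                continue
            slope = abs(net) / steps  # stand-in abruptness score
            same_sign = sum(
                1 for k in range(start, end)
                if (values[k + 1] - values[k]) * net > 0
            )
            persistence = same_sign / steps  # stand-in persistence score
            if slope >= min_slope and persistence >= min_persistence:
                found.append((start, end))
    return found

series = [0, 0, 0, 1, 2, 3, 3, 3]
print(abrupt_intervals(series, min_slope=1.0, min_persistence=1.0))
```

Examining longer intervals first means that every sub-interval of an accepted interval can be discarded without scoring it, which is one plausible reading of how such an ordering saves work.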
Architecture of a spatial data service system for statistical analysis and visualization of regional climate changes

NASA Astrophysics Data System (ADS)

Titov, A. G.; Okladnikov, I. G.; Gordov, E. P.

2017-11-01

The use of large geospatial datasets in climate change studies requires the development of a set of Spatial Data Infrastructure (SDI) elements, including geoprocessing and cartographical visualization web services. This paper presents the architecture of a geospatial OGC web service system as an integral part of a virtual research environment (VRE) general architecture for statistical processing and visualization of meteorological and climatic data. The architecture is a set of interconnected standalone SDI nodes with corresponding data storage systems. Each node runs a specialized software, such as a geoportal, cartographical web services (WMS/WFS), a metadata catalog, and a MySQL database of technical metadata describing geospatial datasets available for the node. It also contains geospatial data processing services (WPS) based on a modular computing backend realizing statistical processing functionality and, thus, providing analysis of large datasets with the results of visualization and export into files of standard formats (XML, binary, etc.). Some cartographical web services have been developed in a system’s prototype to provide capabilities to work with raster and vector geospatial data based on OGC web services. The distributed architecture presented allows easy addition of new nodes, computing and data storage systems, and provides a solid computational infrastructure for regional climate change studies based on modern Web and GIS technologies.
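For readers unfamiliar with the cartographical services mentioned above, the sketch below assembles an OGC WMS 1.3.0 `GetMap` request using the standard request parameters. The endpoint URL and layer name are hypothetical placeholders, not addresses of the system described in the abstract.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def wms_getmap_url(endpoint, layer, bbox, width, height, image_format="image/png"):
    """Build a WMS 1.3.0 GetMap URL.

    bbox is (min_lon, min_lat, max_lon, max_lat); CRS:84 is used so that
    coordinates are given in longitude, latitude order.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "CRS": "CRS:84",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return endpoint + "?" + urlencode(params)

# Hypothetical endpoint and layer name, for illustration only.
url = wms_getmap_url("https://example.org/wms", "precip_trend",
                     (60.0, 50.0, 120.0, 80.0), 800, 400)
print(url)
```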
Access to Emissions Distributions and Related Ancillary Data through the ECCAD database

NASA Astrophysics Data System (ADS)

Darras, Sabine; Granier, Claire; Liousse, Catherine; De Graaf, Erica; Enriquez, Edgar; Boulanger, Damien; Brissebrat, Guillaume

2017-04-01

The ECCAD database (Emissions of atmospheric Compounds and Compilation of Ancillary Data) provides user-friendly access to global and regional surface emissions for a large set of chemical compounds and ancillary data (land use, active fires, burned areas, population, etc.). The emissions inventories are time series gridded data at spatial resolution from 1x1 to 0.1x0.1 degrees. ECCAD is the emissions database of the GEIA (Global Emissions InitiAtive) project and a sub-project of the French Atmospheric Data Center AERIS (http://www.aeris-data.fr). ECCAD currently has more than 2200 users originating from more than 80 countries. The project benefits from this large international community of users to expand the number of emission datasets made available. ECCAD provides detailed metadata for each of the datasets and various tools for data visualization, for computing global and regional totals and for interactive spatial and temporal analysis. The data can be downloaded as interoperable NetCDF CF-compliant files, i.e. the data are compatible with many other client interfaces. The presentation will provide information on the datasets available within ECCAD, as well as examples of the analysis work that can be done online through the website: http://eccad.aeris-data.fr.
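Computing a global total from a gridded inventory requires weighting each cell by its area, because cells on a regular latitude-longitude grid shrink towards the poles. The sketch below shows that calculation in plain Python. It is not ECCAD's own tool: the grid is assumed to hold a flux per unit area, rows are assumed to run from the South Pole northwards, the Earth is treated as a sphere, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6.371e6  # mean Earth radius, spherical approximation

def cell_area(lat_south_deg, res_deg):
    """Area in m^2 of one res x res degree cell whose southern edge is given."""
    lat_s = math.radians(lat_south_deg)
    lat_n = math.radians(lat_south_deg + res_deg)
    return (EARTH_RADIUS_M ** 2 * math.radians(res_deg)
            * (math.sin(lat_n) - math.sin(lat_s)))

def global_total(flux, res_deg):
    """Sum flux * area over the grid.

    flux[i][j] is a flux per m^2; row i covers latitudes from
    -90 + i * res_deg to -90 + (i + 1) * res_deg (an assumed layout).
    """
    total = 0.0
    for i, row in enumerate(flux):
        # All cells in a latitude band share the same area.
        total += cell_area(-90.0 + i * res_deg, res_deg) * sum(row)
    return total

# Sanity check: a uniform unit flux integrates to the area of the sphere.
uniform = [[1.0] * 360 for _ in range(180)]
print(global_total(uniform, 1.0), 4 * math.pi * EARTH_RADIUS_M ** 2)
```

A regional total follows the same pattern with the sum restricted to the cells inside the region of interest.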
Access to Emissions Distributions and Related Ancillary Data through the ECCAD database

NASA Astrophysics Data System (ADS)

Darras, Sabine; Enriquez, Edgar; Granier, Claire; Liousse, Catherine; Boulanger, Damien; Fontaine, Alain

2016-04-01

The ECCAD database (Emissions of atmospheric Compounds and Compilation of Ancillary Data) provides user-friendly access to global and regional surface emissions for a large set of chemical compounds and ancillary data (land use, active fires, burned areas, population, etc.). The emissions inventories are time series gridded data at spatial resolution from 1x1 to 0.1x0.1 degrees. ECCAD is the emissions database of the GEIA (Global Emissions InitiAtive) project and a sub-project of the French Atmospheric Data Center AERIS (http://www.aeris-data.fr). ECCAD currently has more than 2200 users originating from more than 80 countries. The project benefits from this large international community of users to expand the number of emission datasets made available. ECCAD provides detailed metadata for each of the datasets and various tools for data visualization, for computing global and regional totals and for interactive spatial and temporal analysis. The data can be downloaded as interoperable NetCDF CF-compliant files, i.e. the data are compatible with many other client interfaces. The presentation will provide information on the datasets available within ECCAD, as well as examples of the analysis work that can be done online through the website: http://eccad.aeris-data.fr.
Ricardo Oliveira | NREL

Science.gov Websites

the System Modeling & Geospatial Data Science Group in the Strategic Energy Analysis Center. Areas Publications Oliveira, R and Moreno, R. 2016. Harvesting, Integrating and Distributing Large Open Geospatial Datasets Using Free and Open-Source Software. Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., XLI-B7
Linking Automated Data Analysis and Visualization with Applications in Developmental Biology and High-Energy Physics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ruebel, Oliver

2009-11-20

Knowledge discovery from large and complex collections of today's scientific datasets is a challenging task. With the ability to measure and simulate more processes at increasingly finer spatial and temporal scales, the increasing number of data dimensions and data objects is presenting tremendous challenges for data analysis and effective data exploration methods and tools. Researchers are overwhelmed with data and standard tools are often insufficient to enable effective data analysis and knowledge discovery. The main objective of this thesis is to provide important new capabilities to accelerate scientific knowledge discovery from large, complex, and multivariate scientific data. The research covered in this thesis addresses these scientific challenges using a combination of scientific visualization, information visualization, automated data analysis, and other enabling technologies, such as efficient data management. The effectiveness of the proposed analysis methods is demonstrated via applications in two distinct scientific research fields, namely developmental biology and high-energy physics. Advances in microscopy, image analysis, and embryo registration enable, for the first time, measurement of gene expression at cellular resolution for entire organisms. Analysis of high-dimensional spatial gene expression datasets is a challenging task. By integrating data clustering and visualization, analysis of complex, time-varying, spatial gene expression patterns and their formation becomes possible. The analysis framework MATLAB and the visualization have been integrated, making advanced analysis tools accessible to biologists and enabling bioinformatic researchers to directly integrate their analysis with the visualization. Laser wakefield particle accelerators (LWFAs) promise to be a new compact source of high-energy particles and radiation, with wide applications ranging from medicine to physics. 
To gain insight into the complex physical processes of particle acceleration, physicists model LWFAs computationally. The datasets produced by LWFA simulations are (i) extremely large, (ii) of varying spatial and temporal resolution, (iii) heterogeneous, and (iv) high-dimensional, making analysis and knowledge discovery from complex LWFA simulation data a challenging task. To address these challenges this thesis describes the integration of the visualization system VisIt and the state-of-the-art index/query system FastBit, enabling interactive visual exploration of extremely large three-dimensional particle datasets. Researchers are especially interested in beams of high-energy particles formed during the course of a simulation. This thesis describes novel methods for automatic detection and analysis of particle beams enabling a more accurate and efficient data analysis process. By integrating these automated analysis methods with visualization, this research enables more accurate, efficient, and effective analysis of LWFA simulation data than previously possible.
Towards improved parameterization of a macroscale hydrologic model in a discontinuous permafrost boreal forest ecosystem

DOE PAGES

Endalamaw, Abraham; Bolton, W. Robert; Young-Robertson, Jessica M.; ...

2017-09-14

Modeling hydrological processes in the Alaskan sub-arctic is challenging because of the extreme spatial heterogeneity in soil properties and vegetation communities. Nevertheless, modeling and predicting hydrological processes is critical in this region due to its vulnerability to the effects of climate change. Coarse-spatial-resolution datasets used in land surface modeling pose a new challenge in simulating the spatially distributed and basin-integrated processes since these datasets do not adequately represent the small-scale hydrological, thermal, and ecological heterogeneity. The goal of this study is to improve the prediction capacity of mesoscale to large-scale hydrological models by introducing a small-scale parameterization scheme, which better represents the spatial heterogeneity of soil properties and vegetation cover in the Alaskan sub-arctic. The small-scale parameterization schemes are derived from observations and a sub-grid parameterization method in the two contrasting sub-basins of the Caribou Poker Creek Research Watershed (CPCRW) in Interior Alaska: one nearly permafrost-free (LowP) sub-basin and one permafrost-dominated (HighP) sub-basin. The sub-grid parameterization method used in the small-scale parameterization scheme is derived from the watershed topography. We found that observed soil thermal and hydraulic properties – including the distribution of permafrost and vegetation cover heterogeneity – are better represented in the sub-grid parameterization method than the coarse-resolution datasets. Parameters derived from the coarse-resolution datasets and from the sub-grid parameterization method are implemented into the variable infiltration capacity (VIC) mesoscale hydrological model to simulate runoff, evapotranspiration (ET), and soil moisture in the two sub-basins of the CPCRW. 
Simulated hydrographs based on the small-scale parameterization capture most of the peak and low flows, with similar accuracy in both sub-basins, compared to simulated hydrographs based on the coarse-resolution datasets. On average, the small-scale parameterization scheme improves the total runoff simulation by up to 50 % in the LowP sub-basin and by up to 10 % in the HighP sub-basin from the large-scale parameterization. This study shows that the proposed sub-grid parameterization method can be used to improve the performance of mesoscale hydrological models in the Alaskan sub-arctic watersheds.
Towards improved parameterization of a macroscale hydrologic model in a discontinuous permafrost boreal forest ecosystem

DOE Office of Scientific and Technical Information (OSTI.GOV)

Endalamaw, Abraham; Bolton, W. Robert; Young-Robertson, Jessica M.

Modeling hydrological processes in the Alaskan sub-arctic is challenging because of the extreme spatial heterogeneity in soil properties and vegetation communities. Nevertheless, modeling and predicting hydrological processes is critical in this region due to its vulnerability to the effects of climate change. Coarse-spatial-resolution datasets used in land surface modeling pose a new challenge in simulating the spatially distributed and basin-integrated processes since these datasets do not adequately represent the small-scale hydrological, thermal, and ecological heterogeneity. The goal of this study is to improve the prediction capacity of mesoscale to large-scale hydrological models by introducing a small-scale parameterization scheme, which better represents the spatial heterogeneity of soil properties and vegetation cover in the Alaskan sub-arctic. The small-scale parameterization schemes are derived from observations and a sub-grid parameterization method in the two contrasting sub-basins of the Caribou Poker Creek Research Watershed (CPCRW) in Interior Alaska: one nearly permafrost-free (LowP) sub-basin and one permafrost-dominated (HighP) sub-basin. The sub-grid parameterization method used in the small-scale parameterization scheme is derived from the watershed topography. We found that observed soil thermal and hydraulic properties – including the distribution of permafrost and vegetation cover heterogeneity – are better represented in the sub-grid parameterization method than the coarse-resolution datasets. Parameters derived from the coarse-resolution datasets and from the sub-grid parameterization method are implemented into the variable infiltration capacity (VIC) mesoscale hydrological model to simulate runoff, evapotranspiration (ET), and soil moisture in the two sub-basins of the CPCRW. 
Simulated hydrographs based on the small-scale parameterization capture most of the peak and low flows, with similar accuracy in both sub-basins, compared to simulated hydrographs based on the coarse-resolution datasets. On average, the small-scale parameterization scheme improves the total runoff simulation by up to 50 % in the LowP sub-basin and by up to 10 % in the HighP sub-basin from the large-scale parameterization. This study shows that the proposed sub-grid parameterization method can be used to improve the performance of mesoscale hydrological models in the Alaskan sub-arctic watersheds.
Optimizing spatial patterns with sparse filter bands for motor-imagery based brain-computer interface.

PubMed

Zhang, Yu; Zhou, Guoxu; Jin, Jing; Wang, Xingyu; Cichocki, Andrzej

2015-11-30

Common spatial pattern (CSP) is among the most widely applied methods for motor-imagery (MI) feature extraction for classification in brain-computer interface (BCI) applications. Successful application of CSP depends to a large degree on the filter band selection. However, the most suitable band is typically subject-specific and can hardly be determined manually. This study proposes a sparse filter band common spatial pattern (SFBCSP) for optimizing the spatial patterns. SFBCSP estimates CSP features on multiple signals that are filtered from raw EEG data at a set of overlapping bands. The filter bands that result in significant CSP features are then selected in a supervised way by exploiting sparse regression. A support vector machine (SVM) is implemented on the selected features for MI classification. Two public EEG datasets (BCI Competition III dataset IVa and BCI Competition IV IIb) are used to validate the proposed SFBCSP method. Experimental results demonstrate that SFBCSP helps improve the classification performance of MI. The spatial patterns optimized by SFBCSP give overall better MI classification accuracy in comparison with several competing methods. The proposed SFBCSP is a potential method for improving the performance of MI-based BCI. Copyright © 2015 Elsevier B.V. All rights reserved.
Quantifying Urban Watershed Stressor Gradients and Evaluating How Different Land Cover Datasets Affect Stream Management

EPA Science Inventory

We used a gradient (divided into impervious cover categories), spatially-balanced, random design (1) to sample streams along an impervious cover gradient in a large coastal watershed, (2) to characterize relationships between water chemistry and land cover, and (3) to document di...
Chemical elements in the environment: multi-element geochemical datasets from continental to national scale surveys on four continents

USGS Publications Warehouse

Caritat, Patrice de; Reimann, Clemens; Smith, David; Wang, Xueqiu

2017-01-01

During the last 10-20 years, Geological Surveys around the world have undertaken a major effort towards delivering fully harmonized and tightly quality-controlled low-density multi-element soil geochemical maps and datasets of vast regions including up to whole continents. Concentrations of between 45 and 60 elements commonly have been determined in a variety of different regolith types (e.g., sediment, soil). The multi-element datasets are published as complete geochemical atlases and made available to the general public. Several other geochemical datasets covering smaller areas but generally at a higher spatial density are also available. These datasets may, however, not be found by superficial internet-based searches because the elements are not mentioned individually either in the title or in the keyword lists of the original references. This publication attempts to increase the visibility and discoverability of these fundamental background datasets covering large areas up to whole continents.
Global daily reference evapotranspiration modeling and evaluation

USGS Publications Warehouse

Senay, G.B.; Verdin, J.P.; Lietzow, R.; Melesse, Assefa M.

2008-01-01

Accurate and reliable evapotranspiration (ET) datasets are crucial in regional water and energy balance studies. Due to the complex instrumentation requirements, actual ET values are generally estimated from reference ET values by adjustment factors using coefficients for water stress and vegetation conditions, commonly referred to as crop coefficients. Until recently, the modeling of reference ET has been solely based on important weather variables collected from weather stations that are generally located in selected agro-climatic locations. Since 2001, the National Oceanic and Atmospheric Administration’s Global Data Assimilation System (GDAS) has been producing six-hourly climate parameter datasets that are used to calculate daily reference ET for the whole globe at 1-degree spatial resolution. The U.S. Geological Survey Center for Earth Resources Observation and Science has been producing daily reference ET (ETo) since 2001, and it has been used on a variety of operational hydrological models for drought and streamflow monitoring all over the world. With the increasing availability of local station-based reference ET estimates, we evaluated the GDAS-based reference ET estimates using data from the California Irrigation Management Information System (CIMIS). Daily CIMIS reference ET estimates from 85 stations were compared with GDAS-based reference ET at different spatial and temporal scales using five-year daily data from 2002 through 2006. Despite the large difference in spatial scale (point vs. ∼100 km grid cell) between the two datasets, the correlations between station-based ET and GDAS-ET were very high, exceeding 0.97 on a daily basis to more than 0.99 on time scales of more than 10 days. Both the temporal and spatial correspondences in trend/pattern and magnitudes between the two datasets were satisfactory, suggesting the reliability of using GDAS parameter-based reference ET for regional water and energy balance studies in many parts of the world. 
While the study revealed the potential of GDAS ETo for large-scale hydrological applications, site-specific use of GDAS ETo in complex hydro-climatic regions such as coastal areas and rugged terrain may require the application of bias correction and/or disaggregation of the GDAS ETo using downscaling techniques.
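The comparison across time scales described in this abstract can be illustrated with a minimal Python sketch. It assumes two aligned daily series (one station-based, one grid-based) and compares the Pearson correlation of the daily values with that of non-overlapping 10-day means. The series below are synthetic and purely illustrative; they are not CIMIS or GDAS data, and the function names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def block_means(series, window):
    """Means over consecutive non-overlapping windows; an incomplete tail is dropped."""
    usable = len(series) - len(series) % window
    return [sum(series[i:i + window]) / window for i in range(0, usable, window)]


# Synthetic example: a shared seasonal cycle plus independent daily noise.
rng = random.Random(0)
seasonal = [4.0 + 3.0 * math.sin(2 * math.pi * (d - 80) / 365) for d in range(365)]
station_et = [s + rng.gauss(0, 0.6) for s in seasonal]  # stand-in for station ET
grid_et = [s + rng.gauss(0, 0.6) for s in seasonal]     # stand-in for gridded ET

r_daily = pearson(station_et, grid_et)
r_10day = pearson(block_means(station_et, 10), block_means(grid_et, 10))
print(f"daily r = {r_daily:.3f}, 10-day r = {r_10day:.3f}")
```

Averaging over longer windows suppresses uncorrelated day-to-day noise, which is one plausible reason correlations rise with aggregation length.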
Observational Evidence for Desert Amplification Using Multiple Satellite Datasets.

PubMed

Wei, Nan; Zhou, Liming; Dai, Yongjiu; Xia, Geng; Hua, Wenjian

2017-05-17

Desert amplification identified in recent studies has large uncertainties due to data paucity over remote deserts. Here we present observational evidence using multiple satellite-derived datasets that desert amplification is a real large-scale pattern of warming mode in near surface and low-tropospheric temperatures. Trend analyses of three long-term temperature products consistently confirm that near-surface warming is generally strongest over the driest climate regions and this spatial pattern of warming maximizes near the surface, gradually decays with height, and disappears in the upper troposphere. Short-term anomaly analyses show a strong spatial and temporal coupling of changes in temperatures, water vapor and downward longwave radiation (DLR), indicating that the large increase in DLR drives primarily near surface warming and is tightly associated with increasing water vapor over deserts. Atmospheric soundings of temperature and water vapor anomalies support the results of the long-term temperature trend analysis and suggest that desert amplification is due to comparable warming and moistening effects of the troposphere. Likely, desert amplification results from the strongest water vapor feedbacks near the surface over the driest deserts, where the air is very sensitive to changes in water vapor and thus efficient in enhancing the longwave greenhouse effect in a warming climate.
Local multiplicity adjustment for the spatial scan statistic using the Gumbel distribution.

PubMed

Gangnon, Ronald E

2012-03-01

The spatial scan statistic is an important and widely used tool for cluster detection. It is based on the simultaneous evaluation of the statistical significance of the maximum likelihood ratio test statistic over a large collection of potential clusters. In most cluster detection problems, there is variation in the extent of local multiplicity across the study region. For example, using a fixed maximum geographic radius for clusters, urban areas typically have many overlapping potential clusters, whereas rural areas have relatively few. The spatial scan statistic does not account for local multiplicity variation. We describe a previously proposed local multiplicity adjustment based on a nested Bonferroni correction and propose a novel adjustment based on a Gumbel distribution approximation to the distribution of a local scan statistic. We compare the performance of all three statistics in terms of power and a novel unbiased cluster detection criterion. These methods are then applied to the well-known New York leukemia dataset and a Wisconsin breast cancer incidence dataset. © 2011, The International Biometric Society.
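The general idea of a Gumbel approximation can be sketched as follows: fit a Gumbel distribution to Monte Carlo replicates of a maximum statistic, then read a p-value from its upper tail. This is a simplified illustration using a method-of-moments fit, not the local adjustment proposed in the article; the replicates are synthetic and the function names are hypothetical.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def fit_gumbel_moments(samples):
    """Method-of-moments estimates (loc, scale) of a Gumbel (maximum) distribution."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    scale = math.sqrt(6.0 * var) / math.pi   # variance = pi^2 * scale^2 / 6
    loc = mean - EULER_GAMMA * scale         # mean = loc + gamma * scale
    return loc, scale


def gumbel_p_value(observed, loc, scale):
    """Upper-tail probability P(X >= observed) under Gumbel(loc, scale)."""
    z = (observed - loc) / scale
    return -math.expm1(-math.exp(-z))        # 1 - exp(-exp(-z)), computed stably


# Synthetic null replicates: each is the maximum of 50 squared standard normals,
# standing in for the maximum test statistic over a set of candidate clusters.
rng = random.Random(42)
replicates = [max(rng.gauss(0, 1) ** 2 for _ in range(50)) for _ in range(999)]
loc, scale = fit_gumbel_moments(replicates)
p_extreme = gumbel_p_value(max(replicates) + 5.0, loc, scale)
```

A fitted tail of this kind can return p-values smaller than 1/(number of replicates + 1), which a purely rank-based Monte Carlo p-value cannot.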
Local multiplicity adjustment for the spatial scan statistic using the Gumbel distribution

PubMed Central

Gangnon, Ronald E.

2011-01-01

Summary: The spatial scan statistic is an important and widely used tool for cluster detection. It is based on the simultaneous evaluation of the statistical significance of the maximum likelihood ratio test statistic over a large collection of potential clusters. In most cluster detection problems, there is variation in the extent of local multiplicity across the study region. For example, using a fixed maximum geographic radius for clusters, urban areas typically have many overlapping potential clusters, while rural areas have relatively few. The spatial scan statistic does not account for local multiplicity variation. We describe a previously proposed local multiplicity adjustment based on a nested Bonferroni correction and propose a novel adjustment based on a Gumbel distribution approximation to the distribution of a local scan statistic. We compare the performance of all three statistics in terms of power and a novel unbiased cluster detection criterion. These methods are then applied to the well-known New York leukemia dataset and a Wisconsin breast cancer incidence dataset. PMID:21762118
The worth of data to reduce predictive uncertainty of an integrated catchment model by multi-constraint calibration

NASA Astrophysics Data System (ADS)

Koch, J.; Jensen, K. H.; Stisen, S.

2017-12-01

Hydrological models that integrate numerical process descriptions across compartments of the water cycle are typically required to undergo thorough model calibration in order to estimate suitable effective model parameters. In this study, we apply a spatially distributed hydrological model code which couples the saturated zone with the unsaturated zone and the energy partitioning at the land surface. We conduct a comprehensive multi-constraint model calibration against nine independent observational datasets which reflect both the temporal and the spatial behavior of the hydrological response of a 1000 km² catchment in Denmark. The datasets are obtained from satellite remote sensing and in-situ measurements and cover five keystone hydrological variables: discharge, evapotranspiration, groundwater head, soil moisture and land surface temperature. Results indicate that a balanced optimization can be achieved where errors on objective functions for all nine observational datasets can be reduced simultaneously. The applied calibration framework was tailored with a focus on improving the spatial pattern performance; however, results suggest that the optimization is still more prone to improve the temporal dimension of model performance. This study features a post-calibration linear uncertainty analysis. This allows quantifying parameter identifiability, which is the worth of a specific observational dataset to infer values of model parameters through calibration. Furthermore, the ability of an observation to reduce predictive uncertainty is assessed. Such findings have concrete implications for the design of model calibration frameworks and, in more general terms, the acquisition of data in hydrological observatories.
Daily precipitation grids for Austria since 1961—development and evaluation of a spatial dataset for hydroclimatic monitoring and modelling

NASA Astrophysics Data System (ADS)

Hiebl, Johann; Frei, Christoph

2018-04-01

Spatial precipitation datasets that are long-term consistent, highly resolved and extend over several decades are an increasingly popular basis for modelling and monitoring environmental processes and planning tasks in hydrology, agriculture, energy resources management, etc. Here, we present a grid dataset of daily precipitation for Austria meant to promote such applications. It has a grid spacing of 1 km, extends back till 1961 and is continuously updated. It is constructed with the classical two-tier analysis, involving separate interpolations for mean monthly precipitation and daily relative anomalies. The former was accomplished by kriging with topographic predictors as external drift utilising 1249 stations. The latter is based on angular distance weighting and uses 523 stations. The input station network was kept largely stationary over time to avoid artefacts on long-term consistency. Example cases suggest that the new analysis is at least as plausible as previously existing datasets. Cross-validation and comparison against experimental high-resolution observations (WegenerNet) suggest that the accuracy of the dataset depends on interpretation. Users interpreting grid point values as point estimates must expect systematic overestimates for light and underestimates for heavy precipitation as well as substantial random errors. Grid point estimates are typically within a factor of 1.5 from in situ observations. Interpreting grid point values as area mean values, conditional biases are reduced and the magnitude of random errors is considerably smaller. Together with a similar dataset of temperature, the new dataset (SPARTACUS) is an interesting basis for modelling environmental processes, studying climate change impacts and monitoring the climate of Austria.
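The two-tier idea (a background field multiplied by an interpolated field of daily relative anomalies) can be sketched in a few lines of Python. Two simplifications are assumed here and are not from the abstract: plain inverse-distance weighting stands in for angular distance weighting, and the relative anomaly is taken as the ratio of a station's daily total to its background value. All names and numbers are hypothetical.

```python
import math


def idw(target, points, power=2.0):
    """Inverse-distance-weighted average of (x, y, value) points at target (x, y)."""
    num = 0.0
    den = 0.0
    for x, y, value in points:
        dist = math.hypot(target[0] - x, target[1] - y)
        if dist == 0.0:
            return value  # target coincides with a station
        weight = dist ** -power
        num += weight * value
        den += weight
    return num / den


def daily_precip(target, background_at_target, station_obs):
    """Tier 1 supplies the background value; tier 2 interpolates relative anomalies.

    station_obs: list of (x, y, daily_mm, background_mm) tuples.
    """
    anomalies = [(x, y, daily / background) for x, y, daily, background in station_obs]
    return background_at_target * idw(target, anomalies)


# Two stations equidistant from the target grid point (coordinates in km).
stations = [(1.0, 0.0, 6.0, 3.0),    # daily total is 2.0 times its background
            (-1.0, 0.0, 2.0, 2.0)]   # daily total is 1.0 times its background
estimate = daily_precip((0.0, 0.0), background_at_target=4.0, station_obs=stations)
```

Interpolating ratios rather than raw totals lets the topographic detail come from the background field, where it can be estimated from a denser station network.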
GIEMS-D3: A new long-term, dynamical, high-spatial resolution inundation extent dataset at global scale

NASA Astrophysics Data System (ADS)

Aires, Filipe; Miolane, Léo; Prigent, Catherine; Pham Duc, Binh; Papa, Fabrice; Fluet-Chouinard, Etienne; Lehner, Bernhard

2017-04-01

The Global Inundation Extent from Multi-Satellites (GIEMS) provides multi-year monthly variations of the global surface water extent at 25 km × 25 km resolution. It is derived from multiple satellite observations. Its spatial resolution is usually compatible with climate model outputs and with global land surface model grids but is clearly not adequate for local applications that require the characterization of small individual water bodies. There is today a strong demand for high-resolution inundation extent datasets, for a large variety of applications such as water management, regional hydrological modeling, or the analysis of mosquito-related diseases. A new procedure is introduced to downscale the GIEMS low spatial resolution inundations to a 3 arc second (90 m) dataset. The methodology is based on topography and hydrography information from the HydroSHEDS database. A new floodability index is adopted and an innovative smoothing procedure is developed to ensure the smooth transition, in the high-resolution maps, between the low-resolution boxes from GIEMS. Topography information is relevant for natural hydrology environments controlled by elevation, but is more limited in human-modified basins. However, the proposed downscaling approach is compatible with forthcoming fusion with other more pertinent satellite information in these difficult regions. The resulting GIEMS-D3 database is the only high spatial resolution inundation database available globally at the monthly time scale over the 1993-2007 period. GIEMS-D3 is assessed by analyzing its spatial and temporal variability, and evaluated by comparisons to other independent satellite observations from visible (Google Earth and Landsat), infrared (MODIS) and active microwave (SAR).
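One way to picture index-based downscaling is sketched below: within a single low-resolution box, the high-resolution pixels with the highest floodability index are marked as inundated until the box's inundated fraction is reached. This is an assumed simplification for illustration only; it omits the smoothing between boxes and is not the published GIEMS-D3 procedure. The function name and values are hypothetical.

```python
def downscale_box(floodability, inundated_fraction):
    """Return a boolean mask of flooded high-resolution pixels for one coarse box.

    floodability: 2D list of index values (higher means more easily flooded).
    inundated_fraction: fraction of the coarse box reported as inundated (0 to 1).
    """
    rows = len(floodability)
    cols = len(floodability[0])
    n_flooded = round(inundated_fraction * rows * cols)
    # Rank all pixels from most to least floodable.
    ranked = sorted(
        ((floodability[r][c], r, c) for r in range(rows) for c in range(cols)),
        reverse=True,
    )
    mask = [[False] * cols for _ in range(rows)]
    for _, r, c in ranked[:n_flooded]:
        mask[r][c] = True
    return mask


index = [[0.9, 0.1],
         [0.4, 0.7]]
mask = downscale_box(index, inundated_fraction=0.5)
```

Because each box is filled independently in this sketch, neighbouring boxes can show abrupt edges, which is the kind of artefact a smoothing step between boxes would address.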
TerraClimate, a high-resolution global dataset of monthly climate and climatic water balance from 1958-2015.

PubMed

Abatzoglou, John T; Dobrowski, Solomon Z; Parks, Sean A; Hegewisch, Katherine C

2018-01-09

We present TerraClimate, a dataset of high-spatial resolution (1/24°, ~4-km) monthly climate and climatic water balance for global terrestrial surfaces from 1958-2015. TerraClimate uses climatically aided interpolation, combining high-spatial resolution climatological normals from the WorldClim dataset, with coarser resolution time varying (i.e., monthly) data from other sources to produce a monthly dataset of precipitation, maximum and minimum temperature, wind speed, vapor pressure, and solar radiation. TerraClimate additionally produces monthly surface water balance datasets using a water balance model that incorporates reference evapotranspiration, precipitation, temperature, and interpolated plant extractable soil water capacity. These data provide important inputs for ecological and hydrological studies at global scales that require high spatial resolution and time varying climate and climatic water balance data. We validated spatiotemporal aspects of TerraClimate using annual temperature, precipitation, and calculated reference evapotranspiration from station data, as well as annual runoff from streamflow gauges. TerraClimate datasets showed noted improvement in overall mean absolute error and increased spatial realism relative to coarser resolution gridded datasets.
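Climatically aided interpolation can be sketched in one dimension: anomalies are computed on the coarse grid relative to the coarse climatology, interpolated to the fine grid, and added to the high-resolution normals. The additive form below is an assumption suited to temperature-like variables (ratio anomalies are commonly used for precipitation), and the code is an illustrative sketch rather than the TerraClimate implementation. All names and values are hypothetical.

```python
def linear_interp(x, xs, ys):
    """Piecewise-linear interpolation of (xs, ys) at x, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + t * (ys[i] - ys[i - 1])


def cai_additive(fine_x, fine_normals, coarse_x, coarse_values, coarse_normals):
    """Add interpolated coarse anomalies to high-resolution climatological normals."""
    anomalies = [v - n for v, n in zip(coarse_values, coarse_normals)]
    return [
        normal + linear_interp(x, coarse_x, anomalies)
        for x, normal in zip(fine_x, fine_normals)
    ]


# Coarse grid: two cells whose monthly values are 2 and 4 degrees above normal.
fine = cai_additive(
    fine_x=[0, 5, 10],
    fine_normals=[9, 11, 13],
    coarse_x=[0, 10],
    coarse_values=[12, 16],
    coarse_normals=[10, 12],
)
```

The fine-scale spatial structure comes entirely from the normals; the coarse product only contributes the smooth, time-varying departure from them.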
TerraClimate, a high-resolution global dataset of monthly climate and climatic water balance from 1958-2015

NASA Astrophysics Data System (ADS)

Abatzoglou, John T.; Dobrowski, Solomon Z.; Parks, Sean A.; Hegewisch, Katherine C.

2018-01-01

We present TerraClimate, a dataset of high-spatial resolution (1/24°, ~4-km) monthly climate and climatic water balance for global terrestrial surfaces from 1958-2015. TerraClimate uses climatically aided interpolation, combining high-spatial resolution climatological normals from the WorldClim dataset, with coarser resolution time varying (i.e., monthly) data from other sources to produce a monthly dataset of precipitation, maximum and minimum temperature, wind speed, vapor pressure, and solar radiation. TerraClimate additionally produces monthly surface water balance datasets using a water balance model that incorporates reference evapotranspiration, precipitation, temperature, and interpolated plant extractable soil water capacity. These data provide important inputs for ecological and hydrological studies at global scales that require high spatial resolution and time varying climate and climatic water balance data. We validated spatiotemporal aspects of TerraClimate using annual temperature, precipitation, and calculated reference evapotranspiration from station data, as well as annual runoff from streamflow gauges. TerraClimate datasets showed noted improvement in overall mean absolute error and increased spatial realism relative to coarser resolution gridded datasets.

A multimodal MRI dataset of professional chess players.

PubMed

Li, Kaiming; Jiang, Jing; Qiu, Lihua; Yang, Xun; Huang, Xiaoqi; Lui, Su; Gong, Qiyong

2015-01-01

Chess is a good model to study high-level human brain functions such as spatial cognition, memory, planning, learning and problem solving. Recent studies have demonstrated that non-invasive MRI techniques are valuable for researchers to investigate the underlying neural mechanism of playing chess. For professional chess players (e.g., chess grand masters and masters, or GM/Ms), the structural and functional alterations that result from long-term professional practice, and how these alterations relate to behavior, remain largely unknown. Here, we report a multimodal MRI dataset from 29 professional Chinese chess players (most of whom are GM/Ms), and 29 age-matched novices. We hope that this dataset will provide researchers with new materials to further explore high-level human brain functions.
Regional hydro-climatic impacts of contemporary Amazonian deforestation

NASA Astrophysics Data System (ADS)

Khanna, Jaya

More than 17% of the Amazon rainforest has been cleared in the past three decades triggering important climatological and societal impacts. This thesis is devoted to identifying and explaining the regional hydroclimatic impacts of this change employing multidecadal satellite observations and numerical simulations providing an integrated perspective on this topic. The climatological nature of this study motivated the implementation and application of a cloud detection technique to a new geostationary satellite dataset. The resulting sub daily, high spatial resolution, multidecadal time series facilitated the detection of trends and variability in deforestation triggered cloud cover changes. The analysis was complemented by satellite precipitation, reanalysis and ground based datasets and attribution with the variable resolution Ocean-Land-Atmosphere-Model. Contemporary Amazonian deforestation affects spatial scales of hundreds of kilometers. But, unlike the well-studied impacts of a few kilometers scale deforestation, the climatic response to contemporary, large scale deforestation is neither well observed nor well understood. Employing satellite datasets, this thesis shows a transition in the regional hydroclimate accompanying increasing scales of deforestation, with downwind deforested regions receiving 25% more and upwind deforested regions receiving 25% less precipitation from the deforested area mean. Simulations robustly reproduce these shifts when forced with increasing deforestation alone, suggesting a negligible role of large-scale decadal climate variability in causing the shifts. Furthermore, deforestation-induced surface roughness variations are found necessary to reproduce the observed spatial patterns in recent times illustrating the strong scale-sensitivity of the climatic response to Amazonian deforestation. 
This phenomenon, inconsequential during the wet season, is found to substantially affect the regional hydroclimate in the local dry and parts of transition seasons, hence occurring in atmospheric conditions otherwise less conducive to thermal convection. Evidence of this phenomenon is found at two large scale deforested areas considered in this thesis. Hence, the 'dynamical' mechanism, which affects the seasons most important for regional ecology, emerges as an impactful convective triggering mechanism. The phenomenon studied in this thesis provides context for thinking about the climate of a future, more patchily forested Amazonia, by articulating relationships between climate and spatial scales of deforestation.
Evaluating the Consistency of the 1982–1999 NDVI Trends in the Iberian Peninsula across Four Time-series Derived from the AVHRR Sensor: LTDR, GIMMS, FASIR, and PAL-II

PubMed Central

Alcaraz-Segura, Domingo; Liras, Elisa; Tabik, Siham; Paruelo, José; Cabello, Javier

2010-01-01

Successive efforts have processed the Advanced Very High Resolution Radiometer (AVHRR) sensor archive to produce Normalized Difference Vegetation Index (NDVI) datasets (i.e., PAL, FASIR, GIMMS, and LTDR) under different corrections and processing schemes. Since NDVI datasets are used to evaluate carbon gains, differences among them may affect nations’ carbon budgets in meeting international targets (such as the Kyoto Protocol). This study addresses the consistency across AVHRR NDVI datasets in the Iberian Peninsula (Spain and Portugal) by evaluating whether their 1982–1999 NDVI trends show similar spatial patterns. Significant trends were calculated with the seasonal Mann-Kendall trend test and their spatial consistency with partial Mantel tests. Over 23% of the Peninsula (N, E, and central mountain ranges) showed positive and significant NDVI trends across the four datasets and an additional 18% across three datasets. In 20% of Iberia (SW quadrant), the four datasets exhibited an absence of significant trends and an additional 22% across three datasets. Significant NDVI decreases were scarce (croplands in the Guadalquivir and Segura basins, La Mancha plains, and Valencia). Spatial consistency of significant trends across at least three datasets was observed in 83% of the Peninsula, but it decreased to 47% when comparing across the four datasets. FASIR, PAL, and LTDR were the most spatially similar datasets, while GIMMS was the most different. The different performance of each AVHRR dataset in detecting significant NDVI trends (e.g., LTDR detected greater significant trends (both positive and negative) and in 32% more pixels than GIMMS) has major implications for evaluating carbon budgets. The lack of spatial consistency across NDVI datasets derived from the same AVHRR sensor archive makes it advisable to evaluate carbon gain trends using several satellite datasets and, where possible, independent/additional data sources for comparison. PMID:22205868
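The seasonal Mann-Kendall test named in this abstract can be sketched as follows: the Mann-Kendall S statistic is computed separately for each season (for example, each calendar month) and the seasonal statistics and variances are summed. The sketch assumes no tied values and ignores covariance between seasons and serial correlation, so it is an illustration of the basic form rather than the exact procedure used in the study.

```python
import math


def mann_kendall_s(values):
    """Mann-Kendall S: number of increasing pairs minus number of decreasing pairs."""
    s = 0
    n = len(values)
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    return s


def seasonal_mann_kendall(series, seasons):
    """Return (S, Z) for a series in time order with `seasons` values per year.

    Assumes no ties; the variance n(n-1)(2n+5)/18 is summed over seasons.
    """
    s_total = 0
    var_total = 0.0
    for season in range(seasons):
        sub = series[season::seasons]  # same season across successive years
        n = len(sub)
        s_total += mann_kendall_s(sub)
        var_total += n * (n - 1) * (2 * n + 5) / 18.0
    if s_total > 0:
        z = (s_total - 1) / math.sqrt(var_total)  # continuity correction
    elif s_total < 0:
        z = (s_total + 1) / math.sqrt(var_total)
    else:
        z = 0.0
    return s_total, z


# Four "years" with two seasons each; both seasons increase every year.
s_stat, z_stat = seasonal_mann_kendall([1, 10, 2, 11, 3, 12, 4, 13], seasons=2)
```

Comparing values only within the same season keeps the seasonal cycle itself from being mistaken for a trend.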
Evaluating the consistency of the 1982-1999 NDVI trends in the Iberian Peninsula across four time-series derived from the AVHRR sensor: LTDR, GIMMS, FASIR, and PAL-II.

PubMed

Alcaraz-Segura, Domingo; Liras, Elisa; Tabik, Siham; Paruelo, José; Cabello, Javier

2010-01-01

Successive efforts have processed the Advanced Very High Resolution Radiometer (AVHRR) sensor archive to produce Normalized Difference Vegetation Index (NDVI) datasets (i.e., PAL, FASIR, GIMMS, and LTDR) under different corrections and processing schemes. Since NDVI datasets are used to evaluate carbon gains, differences among them may affect nations' carbon budgets in meeting international targets (such as the Kyoto Protocol). This study addresses the consistency across AVHRR NDVI datasets in the Iberian Peninsula (Spain and Portugal) by evaluating whether their 1982-1999 NDVI trends show similar spatial patterns. Significant trends were calculated with the seasonal Mann-Kendall trend test and their spatial consistency with partial Mantel tests. Over 23% of the Peninsula (N, E, and central mountain ranges) showed positive and significant NDVI trends across the four datasets and an additional 18% across three datasets. In 20% of Iberia (SW quadrant), the four datasets exhibited an absence of significant trends and an additional 22% across three datasets. Significant NDVI decreases were scarce (croplands in the Guadalquivir and Segura basins, La Mancha plains, and Valencia). Spatial consistency of significant trends across at least three datasets was observed in 83% of the Peninsula, but it decreased to 47% when comparing across the four datasets. FASIR, PAL, and LTDR were the most spatially similar datasets, while GIMMS was the most different. The different performance of each AVHRR dataset in detecting significant NDVI trends (e.g., LTDR detected greater significant trends (both positive and negative) and in 32% more pixels than GIMMS) has major implications for evaluating carbon budgets. The lack of spatial consistency across NDVI datasets derived from the same AVHRR sensor archive makes it advisable to evaluate carbon gain trends using several satellite datasets and, where possible, independent/additional data sources for comparison.
Defining surfaces for skewed, highly variable data

USGS Publications Warehouse

Helsel, D.R.; Ryker, S.J.

2002-01-01

Skewness of environmental data is often caused by more than simply a handful of outliers in an otherwise normal distribution. Statistical procedures for such datasets must be sufficiently robust to deal with distributions that are strongly non-normal, containing both a large proportion of outliers and a skewed main body of data. In the field of water quality, skewness is commonly associated with large variation over short distances. Spatial analysis of such data generally requires either considerable effort at modeling or the use of robust procedures not strongly affected by skewness and local variability. Using a skewed dataset of 675 nitrate measurements in ground water, commonly used methods for defining a surface (least-squares regression and kriging) are compared to a more robust method (loess). Three choices are critical in defining a surface: (i) is the surface to be a central mean or median surface? (ii) is either a well-fitting transformation or a robust and scale-independent measure of center used? (iii) does local spatial autocorrelation assist in or detract from addressing objectives? Published in 2002 by John Wiley & Sons, Ltd.
GTN-G, WGI, RGI, DCW, GLIMS, WGMS, GCOS - What's all this about? (Invited)

NASA Astrophysics Data System (ADS)

Paul, F.; Raup, B. H.; Zemp, M.

2013-12-01

In a large collaborative effort, the glaciological community has compiled a new and spatially complete global dataset of glacier outlines, the so-called Randolph Glacier Inventory or RGI. Despite its regional shortcomings in quality (e.g. in regard to geolocation, generalization, and interpretation), this dataset was heavily used for global-scale modelling applications (e.g. determination of total glacier volume and glacier contribution to sea-level rise) in support of the forthcoming 5th Assessment Report (AR5) of Working Group I of the IPCC. The RGI is a merged dataset that is largely based on the GLIMS database and several new datasets provided by the community (both are mostly derived from satellite data), as well as the Digital Chart of the World (DCW) and glacier attribute information (location, size) from the World Glacier Inventory (WGI). There are now two key tasks to be performed: (1) improving the quality of the RGI in all regions where the outlines do not meet the quality required for local scale applications, and (2) integrating the RGI in the GLIMS glacier database to improve its spatial completeness. While (1) again requires a huge effort but is already ongoing, (2) is mainly a technical issue that is nearly solved. Apart from this technical dimension, there is also a more political or structural one. While GLIMS is responsible for the remote sensing and glacier inventory part (Tier 5) of the Global Terrestrial Network for Glaciers (GTN-G) within the Global Climate Observing System (GCOS), the World Glacier Monitoring Service (WGMS) is collecting and disseminating the field observations. Given the new global products derived from satellite data (e.g. elevation changes and velocity fields) and the community's wish to keep a snapshot dataset such as the RGI available, it must now be discussed how to make all these datasets available to the community without duplicating efforts while making best use of the very limited financial resources available. This overview presentation describes the currently available datasets, clarifying the terminology and the international framework, and suggesting a way forward to best serve the community.
Dynamic analysis, transformation, dissemination and applications of scientific multidimensional data in ArcGIS Platform

NASA Astrophysics Data System (ADS)

Shrestha, S. R.; Collow, T. W.; Rose, B.

2016-12-01

Scientific datasets are generated from various sources and platforms but they are typically produced either by earth observation systems or by modelling systems. These are widely used for monitoring, simulating, or analyzing measurements that are associated with physical, chemical, and biological phenomena over the ocean, atmosphere, or land. A significant subset of scientific datasets stores values directly as rasters or in a form that can be rasterized, that is, a value exists at every cell in a regular grid spanning the spatial extent of the dataset. Government agencies like NOAA, NASA, EPA, and USGS produce large volumes of near real-time, forecast, and historical data that drive climatological and meteorological studies and underpin operations ranging from weather prediction to sea ice loss. Modern science is computationally intensive because of the availability of an enormous amount of scientific data, the adoption of data-driven analysis, and the need to share these datasets and research results with the public. ArcGIS as a platform is sophisticated and capable of handling such a complex domain. We'll discuss constructs and capabilities applicable to multidimensional gridded data that can be conceptualized as a multivariate space-time cube. Building on the concept of a two-dimensional raster, a typical multidimensional raster dataset could contain several "slices" within the same spatial extent. We will share a case from the NOAA Climate Forecast Systems Reanalysis (CFSR) multidimensional data as an example of how large collections of rasters can be efficiently organized and managed through a data model within a geodatabase called "Mosaic dataset" and dynamically transformed and analyzed using raster functions. A raster function is a lightweight, raster-valued transformation defined over a mixed set of raster and scalar inputs. That means, just like any tool, you can provide a raster function with input parameters. It enables dynamic processing of only the data that's being displayed on the screen or requested by an application. We will present the dynamic processing and analysis of CFSR data using chains of raster functions and share the result as a dynamic multidimensional image service. This workflow and these capabilities can be easily applied to any scientific data format that is supported in a mosaic dataset.
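The notion of chained raster functions that process only the requested window can be sketched in plain Python. This is a conceptual illustration with hypothetical names; it does not use or reproduce the ArcGIS raster function API. Each function wraps a source and returns a new reader, so no cell is touched until a window is requested.

```python
def make_source(grid, stats):
    """Wrap an in-memory grid as a windowed reader and count the cells actually read."""
    def read(row0, row1, col0, col1):
        window = [row[col0:col1] for row in grid[row0:row1]]
        stats["cells_read"] += sum(len(row) for row in window)
        return window
    return read


def scale_offset(source, scale, offset):
    """Raster function: value * scale + offset, applied lazily per requested window."""
    def read(row0, row1, col0, col1):
        return [[v * scale + offset for v in row] for row in source(row0, row1, col0, col1)]
    return read


def threshold(source, cutoff):
    """Raster function: boolean raster marking cells strictly above a cutoff."""
    def read(row0, row1, col0, col1):
        return [[v > cutoff for v in row] for row in source(row0, row1, col0, col1)]
    return read


stats = {"cells_read": 0}
kelvin = make_source([[273, 283, 293],
                      [263, 303, 313]], stats)
# Chain: Kelvin to (approximately) Celsius, then flag cells warmer than 15 degrees.
warm = threshold(scale_offset(kelvin, 1.0, -273.0), 15.0)

# Nothing has been computed yet; only this 2 x 2 window is read and transformed.
result = warm(0, 2, 1, 3)
```

Building the chain costs nothing; the work scales with the size of the requested window rather than with the size of the full dataset.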
A Compilation of Spatial Datasets to Support a Preliminary Assessment of Pesticides and Pesticide Use on Tribal Lands in Oklahoma

USGS Publications Warehouse

Mashburn, Shana L.; Winton, Kimberly T.

2010-01-01

This CD-ROM contains spatial datasets that describe natural and anthropogenic features and county-level estimates of agricultural pesticide use and pesticide data for surface-water, groundwater, and biological specimens in the state of Oklahoma. County-level estimates of pesticide use were compiled from the Pesticide National Synthesis Project of the U.S. Geological Survey, National Water-Quality Assessment Program. Pesticide data for surface water, groundwater, and biological specimens were compiled from the U.S. Geological Survey National Water Information System database. The spatial datasets that describe natural and manmade features were compiled from several agencies and contain information collected by the U.S. Geological Survey. The U.S. Geological Survey datasets were not collected specifically for this compilation, but were previously collected for projects with various objectives. The spatial datasets were created by different agencies from sources with varied quality. As a result, features common to multiple layers may not overlay exactly. Users should check the metadata to determine proper use of these spatial datasets. These data were not checked for accuracy or completeness. If a question of accuracy or completeness arises, the user should contact the originator cited in the metadata.
Statistical and Spatial Analysis of Bathymetric Data for the St. Clair River, 1971-2007

USGS Publications Warehouse

Bennion, David

2009-01-01

To address questions concerning ongoing geomorphic processes in the St. Clair River, selected bathymetric datasets spanning 36 years were analyzed. Comparisons of recent high-resolution datasets covering the upper river indicate a highly variable, active environment. Although statistical and spatial comparisons of the datasets show that some changes to the channel size and shape have taken place during the study period, uncertainty associated with various survey methods and interpolation processes limits the results that can be stated with statistical certainty. The methods used to spatially compare the datasets are sensitive to small variations in position and depth that are within the range of uncertainty associated with the datasets. Characteristics of the data, such as the density of measured points and the range of values surveyed, can also influence the results of spatial comparison. With due consideration of these limitations, apparently active and ongoing areas of elevation change in the river are mapped and discussed.
Reconstruction of global gridded monthly sectoral water withdrawals for 1971-2010 and analysis of their spatiotemporal patterns

NASA Astrophysics Data System (ADS)

Huang, Zhongwei; Hejazi, Mohamad; Li, Xinya; Tang, Qiuhong; Vernon, Chris; Leng, Guoyong; Liu, Yaling; Döll, Petra; Eisner, Stephanie; Gerten, Dieter; Hanasaki, Naota; Wada, Yoshihide

2018-04-01

Human water withdrawal has increasingly altered the global water cycle in past decades, yet our understanding of its driving forces and patterns is limited. Reported historical estimates of sectoral water withdrawals are often sparse and incomplete, mainly restricted to water withdrawal estimates available at annual and country scales, due to a lack of observations at seasonal and local scales. In this study, through collecting and consolidating various sources of reported data and developing spatial and temporal statistical downscaling algorithms, we reconstruct a global monthly gridded (0.5°) sectoral water withdrawal dataset for the period 1971-2010, which distinguishes six water use sectors, i.e., irrigation, domestic, electricity generation (cooling of thermal power plants), livestock, mining, and manufacturing. Based on the reconstructed dataset, the spatial and temporal patterns of historical water withdrawal are analyzed. Results show that total global water withdrawal has increased significantly during 1971-2010, mainly driven by the increase in irrigation water withdrawal. Regions with high water withdrawal are those densely populated or with large irrigated cropland production, e.g., the United States (US), eastern China, India, and Europe. Seasonally, irrigation water withdrawal in summer for the major crops contributes a large percentage of total annual irrigation water withdrawal in mid- and high-latitude regions, and the dominant season of irrigation water withdrawal is also different across regions. Domestic water withdrawal is mostly characterized by a summer peak, while water withdrawal for electricity generation has a winter peak in high-latitude regions and a summer peak in low-latitude regions. Despite the overall increasing trend, irrigation in the western US and domestic water withdrawal in western Europe exhibit a decreasing trend. 
Our results highlight the distinct spatial pattern of human water use by sectors at the seasonal and annual timescales. The reconstructed gridded water withdrawal dataset is open access, and can be used for examining issues related to water withdrawals at fine spatial, temporal, and sectoral scales.
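The abstract does not spell out the temporal downscaling algorithm. As a minimal sketch only, one common way to split an annual total into monthly values is to weight each month by a proxy series (for irrigation, for example, a monthly crop water demand estimate). The function name and the proxy are assumptions made for illustration and are not taken from the paper.

```python
def downscale_annual_to_monthly(annual_total, monthly_proxy):
    """Split an annual total across months in proportion to a proxy series.

    Hypothetical helper: `monthly_proxy` holds one non-negative weight per
    month. If every weight is zero, the total is spread evenly.
    """
    n = len(monthly_proxy)
    weight_sum = sum(monthly_proxy)
    if weight_sum == 0:
        return [annual_total / n] * n
    return [annual_total * w / weight_sum for w in monthly_proxy]


# Example: a summer-peaked proxy concentrates withdrawal in mid-year months.
proxy = [0, 0, 1, 2, 4, 6, 6, 4, 2, 1, 0, 0]
monthly = downscale_annual_to_monthly(260.0, proxy)
```

By construction the monthly values sum back to the annual total, so the reported annual figure is preserved.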
Scalable persistent identifier systems for dynamic datasets

NASA Astrophysics Data System (ADS)

Golodoniuc, P.; Cox, S. J. D.; Klump, J. F.

2016-12-01

Reliable and persistent identification of objects, whether tangible or not, is essential in information management. Many Internet-based systems have been developed to identify digital data objects, e.g., PURL, LSID, Handle, ARK. These were largely designed for identification of static digital objects. The amount of data made available online has grown exponentially over the last two decades and fine-grained identification of dynamically generated data objects within large datasets using conventional systems (e.g., PURL) has become impractical. We have compared capabilities of various technological solutions to enable resolvability of data objects in dynamic datasets, and developed a dataset-centric approach to resolution of identifiers. This is particularly important in Semantic Linked Data environments where dynamic frequently changing data is delivered live via web services, so registration of individual data objects to obtain identifiers is impractical. We use identifier patterns and pattern hierarchies for identification of data objects, which allows relationships between identifiers to be expressed, and also provides means for resolving a single identifier into multiple forms (i.e. views or representations of an object). The latter can be implemented through (a) HTTP content negotiation, or (b) use of URI querystring parameters. The pattern and hierarchy approach has been implemented in the Linked Data API supporting the United Nations Spatial Data Infrastructure (UNSDI) initiative and later in the implementation of geoscientific data delivery for the Capricorn Distal Footprints project using International Geo Sample Numbers (IGSN). This enables flexible resolution of multi-view persistent identifiers and provides a scalable solution for large heterogeneous datasets.
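The idea of resolving identifiers by pattern, rather than registering each object individually, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the patterns, the `example.org` URLs, and the `resolve` function are all hypothetical, and the `view` argument stands in for either HTTP content negotiation or a URI query string parameter.

```python
import re

# Hypothetical pattern registry: each identifier pattern maps to URL templates
# for the views (representations) it can be resolved into.
PATTERNS = [
    (re.compile(r"^/sample/(?P<id>[A-Z0-9]+)$"),
     {"html": "https://example.org/samples/{id}.html",
      "json": "https://example.org/api/samples/{id}"}),
    (re.compile(r"^/dataset/(?P<ds>[a-z]+)/feature/(?P<id>\d+)$"),
     {"html": "https://example.org/{ds}/features/{id}",
      "json": "https://example.org/api/{ds}/features/{id}"}),
]


def resolve(identifier, view="html"):
    """Return the target URL for an identifier, or None if nothing matches."""
    for pattern, templates in PATTERNS:
        match = pattern.match(identifier)
        if match and view in templates:
            return templates[view].format(**match.groupdict())
    return None
```

A single rule therefore covers every object in a dataset, including objects created after the rule was written, which is what makes the approach attractive for dynamic data.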
Octree-based indexing for 3D pointclouds within an Oracle Spatial DBMS

NASA Astrophysics Data System (ADS)

Schön, Bianca; Mosa, Abu Saleh Mohammad; Laefer, Debra F.; Bertolotto, Michela

2013-02-01

A large proportion of today's digital datasets have a spatial component, the effective storage and management of which poses particular challenges, especially with light detection and ranging (LiDAR), where datasets of even small geographic areas may contain several hundred million points. While in the last decade 2.5-dimensional data were prevalent, true 3-dimensional data are increasingly commonplace via LiDAR. They have gained particular popularity for urban applications including generation of city-scale maps, baseline data disaster management, and utility planning. Additionally, LiDAR is commonly used for floodplain identification, coastal-erosion tracking, and forest biomass mapping. Despite growing data availability, current spatial information systems do not fully support the data's true 3D nature. Consequently, one system is needed to store the data and another for its processing, thereby necessitating format transformations. The work presented herein aims at a more cost-effective way of managing 3D LiDAR data that allows for storage and manipulation within a single system by enabling a new index within existing spatial database management technology. Implementation of an octree index for 3D LiDAR data atop Oracle Spatial 11g is presented, along with an evaluation showing up to an eight-fold improvement compared to the native Oracle R-tree index.
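To make the octree idea concrete, the sketch below assigns each 3D point to a cell by repeatedly halving a bounding box, and groups points by the resulting cell key. It is a generic in-memory illustration, not the Oracle Spatial implementation described in the paper; the function names and the fixed-depth design are assumptions.

```python
def octree_key(point, bounds_min, bounds_max, depth):
    """Return the octant path (one digit 0-7 per level) of a 3D point."""
    lo = list(bounds_min)
    hi = list(bounds_max)
    path = []
    for _ in range(depth):
        octant = 0
        for axis in range(3):
            mid = (lo[axis] + hi[axis]) / 2.0
            if point[axis] >= mid:
                octant |= 1 << axis  # upper half along this axis
                lo[axis] = mid
            else:
                hi[axis] = mid
        path.append(octant)
    return tuple(path)


def build_index(points, bounds_min, bounds_max, depth):
    """Group points by octree cell so a query only scans matching cells."""
    index = {}
    for p in points:
        key = octree_key(p, bounds_min, bounds_max, depth)
        index.setdefault(key, []).append(p)
    return index
```

Points that share a key prefix lie in the same parent cell, so a range query can skip whole subtrees whose cells fall outside the query box.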
Evaluation of the global MODIS 30 arc-second spatially and temporally complete snow-free land surface albedo and reflectance anisotropy dataset

NASA Astrophysics Data System (ADS)

Sun, Qingsong; Wang, Zhuosen; Li, Zhan; Erb, Angela; Schaaf, Crystal B.

2017-06-01

Land surface albedo is an essential variable for surface energy and climate modeling as it describes the proportion of incident solar radiant flux that is reflected from the Earth's surface. To capture the temporal variability and spatial heterogeneity of the land surface, satellite remote sensing must be used to monitor albedo accurately at a global scale. However, large data gaps caused by cloud or ephemeral snow have slowed the adoption of satellite albedo products by the climate modeling community. To address the needs of this community, we used a number of temporal and spatial gap-filling strategies to improve the spatial and temporal coverage of the global land surface MODIS BRDF, albedo and NBAR products. A rigorous evaluation of the gap-filled values shows good agreement with original high quality data (RMSE = 0.027 for the NIR band albedo, 0.020 for the red band albedo). This global snow-free and cloud-free MODIS BRDF and albedo dataset (established from 2001 to 2015) offers unique opportunities to monitor and assess the impact of the changes on the Earth's land surface.
CELL5M: A geospatial database of agricultural indicators for Africa South of the Sahara.

PubMed

Koo, Jawoo; Cox, Cindy M; Bacou, Melanie; Azzarri, Carlo; Guo, Zhe; Wood-Sichra, Ulrike; Gong, Queenie; You, Liangzhi

2016-01-01

Recent progress in large-scale georeferenced data collection is widening opportunities for combining multi-disciplinary datasets from biophysical to socioeconomic domains, advancing our analytical and modeling capacity. Granular spatial datasets provide critical information necessary for decision makers to identify target areas, assess baseline conditions, prioritize investment options, set goals and targets and monitor impacts. However, key challenges in reconciling data across themes, scales and borders restrict our capacity to produce global and regional maps and time series. This paper provides overview, structure and coverage of CELL5M-an open-access database of geospatial indicators at 5 arc-minute grid resolution-and introduces a range of analytical applications and case-uses. CELL5M covers a wide set of agriculture-relevant domains for all countries in Africa South of the Sahara and supports our understanding of multi-dimensional spatial variability inherent in farming landscapes throughout the region.
Compression of head-related transfer function using autoregressive-moving-average models and Legendre polynomials.

PubMed

Shekarchi, Sayedali; Hallam, John; Christensen-Dalsgaard, Jakob

2013-11-01

Head-related transfer functions (HRTFs) are generally large datasets, which can be an important constraint for embedded real-time applications. A method is proposed here to reduce redundancy and compress the datasets. In this method, HRTFs are first compressed by conversion into autoregressive-moving-average (ARMA) filters whose coefficients are calculated using Prony's method. Such filters are specified by a few coefficients which can generate the full head-related impulse responses (HRIRs). Next, Legendre polynomials (LPs) are used to compress the ARMA filter coefficients. LPs are derived on the sphere and form an orthonormal basis set for spherical functions. Higher-order LPs capture increasingly fine spatial details. The number of LPs needed to represent an HRTF, therefore, is indicative of its spatial complexity. The results indicate that compression ratios can exceed 98% while maintaining a spectral error of less than 4 dB in the recovered HRTFs.
Application of an imputation method for geospatial inventory of forest structural attributes across multiple spatial scales in the Lake States, U.S.A

NASA Astrophysics Data System (ADS)

Deo, Ram K.

Credible spatial information characterizing the structure and site quality of forests is critical to sustainable forest management and planning, especially given the increasing demands and threats to forest products and services. Forest managers and planners are required to evaluate forest conditions over a broad range of scales, contingent on operational or reporting requirements. Traditionally, forest inventory estimates are generated via a design-based approach that involves generalizing sample plot measurements to characterize an unknown population across a larger area of interest. However, field plot measurements are costly and as a consequence spatial coverage is limited. Remote sensing technologies have shown remarkable success in augmenting limited sample plot data to generate stand- and landscape-level spatial predictions of forest inventory attributes. Further enhancement of forest inventory approaches that couple field measurements with cutting edge remotely sensed and geospatial datasets are essential to sustainable forest management. We evaluated a novel Random Forest based k Nearest Neighbors (RF-kNN) imputation approach to couple remote sensing and geospatial data with field inventory collected by different sampling methods to generate forest inventory information across large spatial extents. The forest inventory data collected by the FIA program of US Forest Service was integrated with optical remote sensing and other geospatial datasets to produce biomass distribution maps for a part of the Lake States and species-specific site index maps for the entire Lake State. Targeting small-area application of the state-of-art remote sensing, LiDAR (light detection and ranging) data was integrated with the field data collected by an inexpensive method, called variable plot sampling, in the Ford Forest of Michigan Tech to derive standing volume map in a cost-effective way. 
The outputs of the RF-kNN imputation were compared with independent validation datasets and extant map products based on different sampling and modeling strategies. The RF-kNN modeling approach was found to be very effective, especially for large-area estimation, and produced results statistically equivalent to the field observations or the estimates derived from secondary data sources. The models are useful to resource managers for operational and strategic purposes.
DCL System Using Deep Learning Approaches for Land-based or Ship-based Real-Time Recognition and Localization of Marine Mammals

DTIC Science & Technology

2014-09-30

repeating pulse-like signals were investigated. Software prototypes were developed and integrated into distinct streams of research; projects...to study complex sound archives spanning large spatial and temporal scales. A new post processing method for detection and classification was also...false positive rates. HK-ANN was successfully tested for a large minke whale dataset, but could easily be used on other signal types. Various
PANTHER. Pattern ANalytics To support High-performance Exploitation and Reasoning.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Czuchlewski, Kristina Rodriguez; Hart, William E.

Sandia has approached the analysis of big datasets with an integrated methodology that uses computer science, image processing, and human factors to exploit critical patterns and relationships in large datasets despite the variety and rapidity of information. The work is part of a three-year LDRD Grand Challenge called PANTHER (Pattern ANalytics To support High-performance Exploitation and Reasoning). To maximize data analysis capability, Sandia pursued scientific advances across three key technical domains: (1) geospatial-temporal feature extraction via image segmentation and classification; (2) geospatial-temporal analysis capabilities tailored to identify and process new signatures more efficiently; and (3) domain-relevant models of human perception and cognition informing the design of analytic systems. Our integrated results include advances in geographical information systems (GIS) in which we discover activity patterns in noisy, spatial-temporal datasets using geospatial-temporal semantic graphs. We employed computational geometry and machine learning to allow us to extract and predict spatial-temporal patterns and outliers from large aircraft and maritime trajectory datasets. We automatically extracted static and ephemeral features from real, noisy synthetic aperture radar imagery for ingestion into a geospatial-temporal semantic graph. We worked with analysts and investigated analytic workflows to (1) determine how experiential knowledge evolves and is deployed in high-demand, high-throughput visual search workflows, and (2) better understand visual search performance and attention. Through PANTHER, Sandia's fundamental rethinking of key aspects of geospatial data analysis permits the extraction of much richer information from large amounts of data.
The project results enable analysts to examine mountains of historical and current data that would otherwise go untouched, while also gaining meaningful, measurable, and defensible insights into overlooked relationships and patterns. The capability is directly relevant to the nation's nonproliferation remote-sensing activities and has broad national security applications for military and intelligence-gathering organizations.
The uncertainties and causes of the recent changes in global evapotranspiration from 1982 to 2010

NASA Astrophysics Data System (ADS)

Dong, Bo; Dai, Aiguo

2017-07-01

Recent studies have shown considerable changes in terrestrial evapotranspiration (ET) since the early 1980s, but the causes of these changes remain unclear. In this study, the relative contributions of external climate forcing and internal climate variability to the recent ET changes are examined. Three datasets of global terrestrial ET and the CMIP5 multi-model ensemble mean ET are analyzed, respectively, to quantify the apparent and externally-forced ET changes, while the unforced ET variations are estimated as the apparent ET minus the forced component. Large discrepancies of the ET estimates, in terms of their trend, variability, and temperature- and precipitation-dependence, are found among the three datasets. Results show that the forced global-mean ET exhibits an upward trend of 0.08 mm day⁻¹ century⁻¹ from 1982 to 2010. The forced ET also contains considerable multi-year to decadal variations during the latter half of the 20th century that are caused by volcanic aerosols. The spatial patterns and interannual variations of the forced ET are more closely linked to precipitation than temperature. After removing the forced component, the global-mean ET shows a trend ranging from -0.07 to 0.06 mm day⁻¹ century⁻¹ during 1982-2010 with varying spatial patterns among the three datasets. Furthermore, linkages between the unforced ET and internal climate modes are examined. Variations in Pacific sea surface temperatures (SSTs) are found to be consistently correlated with ET over many land areas among the ET datasets. The results suggest that there are large uncertainties in our current estimates of global terrestrial ET for the recent decades, and the greenhouse gas (GHG) and aerosol external forcings account for a large part of the apparent trend in global-mean terrestrial ET since 1982, but Pacific SST and other internal climate variability dominate recent ET variations and changes over most regions.
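The decomposition stated in the abstract (unforced ET = apparent ET minus the forced component) and the trend calculation can be sketched in a few lines. The least-squares slope is an assumption about how a trend might be estimated, and the numbers below are invented purely to exercise the functions; they are not values from the study.

```python
def linear_trend(years, values):
    """Ordinary least-squares slope of values against years (units per year)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx


def unforced_component(apparent, forced):
    """Unforced variation = apparent (observed) series minus forced series."""
    return [a - f for a, f in zip(apparent, forced)]


# Made-up illustrative series, in mm/day.
years = [1982, 1983, 1984, 1985]
forced = [1.0, 1.1, 1.2, 1.3]      # stands in for a multi-model ensemble mean
apparent = [1.0, 1.3, 1.2, 1.5]    # stands in for an observation-based dataset
unforced = unforced_component(apparent, forced)
forced_trend_per_century = 100 * linear_trend(years, forced)
```

Multiplying the per-year slope by 100 expresses the trend per century, matching the units used in the abstract.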
Task Dependence, Tissue Specificity, and Spatial Distribution of Widespread Activations in Large Single-Subject Functional MRI Datasets at 7T

PubMed Central

Gonzalez-Castillo, Javier; Hoy, Colin W.; Handwerker, Daniel A.; Roopchansingh, Vinai; Inati, Souheil J.; Saad, Ziad S.; Cox, Robert W.; Bandettini, Peter A.

2015-01-01

It was recently shown that when large amounts of task-based blood oxygen level–dependent (BOLD) data are combined to increase contrast- and temporal signal-to-noise ratios, the majority of the brain shows significant hemodynamic responses time-locked with the experimental paradigm. Here, we investigate the biological significance of such widespread activations. First, the relationship between activation extent and task demands was investigated by varying cognitive load across participants. Second, the tissue specificity of responses was probed using the better BOLD signal localization capabilities of a 7T scanner. Finally, the spatial distribution of 3 primary response types—namely positively sustained (pSUS), negatively sustained (nSUS), and transient—was evaluated using a newly defined voxel-wise waveshape index that permits separation of responses based on their temporal signature. About 86% of gray matter (GM) became significantly active when all data entered the analysis for the most complex task. Activation extent scaled with task load and largely followed the GM contour. The most common response type was nSUS BOLD, irrespective of the task. Our results suggest that widespread activations associated with extremely large single-subject functional magnetic resonance imaging datasets can provide valuable information about the functional organization of the brain that goes undetected in smaller sample sizes. PMID:25405938

Within-population spatial synchrony in mast seeding of North American oaks.

Treesearch

A.V. Liebhold; M. Sork; O.N. Peltonen; Westfall R. Bjørnstad; J. Elkinton; M. H. J. Knops

2004-01-01

Mast seeding, the synchronous production of large crops of seeds, has been frequently documented in oak species. In this study we used several North American oak data-sets to quantify within-stand (10 km) synchrony in mast dynamics. Results indicated that intraspecific synchrony in seed production always exceeded interspecific synchrony and was essentially constant...
Unsupervised data mining in nanoscale x-ray spectro-microscopic study of NdFeB magnet

DOE Office of Scientific and Technical Information (OSTI.GOV)

Duan, Xiaoyue; Yang, Feifei; Antono, Erin

Novel developments in X-ray based spectro-microscopic characterization techniques have increased the rate of acquisition of spatially resolved spectroscopic data by several orders of magnitude over what was possible a few years ago. This accelerated data acquisition, with high spatial resolution at nanoscale and sensitivity to subtle differences in chemistry and atomic structure, provides a unique opportunity to investigate hierarchically complex and structurally heterogeneous systems found in functional devices and materials systems. However, handling and analyzing the large volume of data generated poses significant challenges. Here we apply an unsupervised data-mining algorithm known as DBSCAN to study a rare-earth element based permanent magnet material, Nd2Fe14B. We are able to reduce a large spectro-microscopic dataset of over 300,000 spectra to 3, preserving much of the underlying information. Scientists can easily and quickly analyze in detail three characteristic spectra. Our approach can rapidly provide a concise representation of a large and complex dataset to materials scientists and chemists. For instance, it shows that the surface of a common Nd2Fe14B magnet is chemically and structurally very different from the bulk, suggesting a surface alteration effect, possibly due to corrosion, which could affect the material's overall properties.
Unsupervised data mining in nanoscale x-ray spectro-microscopic study of NdFeB magnet

DOE PAGES

Duan, Xiaoyue; Yang, Feifei; Antono, Erin; ...

2016-09-29

Novel developments in X-ray based spectro-microscopic characterization techniques have increased the rate of acquisition of spatially resolved spectroscopic data by several orders of magnitude over what was possible a few years ago. This accelerated data acquisition, with high spatial resolution at nanoscale and sensitivity to subtle differences in chemistry and atomic structure, provides a unique opportunity to investigate hierarchically complex and structurally heterogeneous systems found in functional devices and materials systems. However, handling and analyzing the large volume of data generated poses significant challenges. Here we apply an unsupervised data-mining algorithm known as DBSCAN to study a rare-earth element based permanent magnet material, Nd2Fe14B. We are able to reduce a large spectro-microscopic dataset of over 300,000 spectra to 3, preserving much of the underlying information. Scientists can easily and quickly analyze in detail three characteristic spectra. Our approach can rapidly provide a concise representation of a large and complex dataset to materials scientists and chemists. For instance, it shows that the surface of a common Nd2Fe14B magnet is chemically and structurally very different from the bulk, suggesting a surface alteration effect, possibly due to corrosion, which could affect the material's overall properties.
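For readers unfamiliar with DBSCAN, the following is a small pure-Python sketch of the standard algorithm, followed by a helper that averages each cluster into one representative vector. Averaging is an assumption about how a cluster might be summarized as a "characteristic spectrum"; the abstract does not say how the authors did it. The quadratic neighbor search is for clarity only and would not scale to hundreds of thousands of spectra.

```python
NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points)
            if sum((a - b) ** 2 for a, b in zip(points[i], q)) <= eps * eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1)."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep expanding
    return labels


def cluster_means(points, labels):
    """Average the members of each cluster into one representative vector."""
    groups = {}
    for p, label in zip(points, labels):
        if label != NOISE:
            groups.setdefault(label, []).append(p)
    return {label: tuple(sum(c) / len(members) for c in zip(*members))
            for label, members in groups.items()}
```

Unlike k-means, DBSCAN does not need the number of clusters in advance and leaves isolated points unlabeled as noise, which suits exploratory analysis of heterogeneous samples.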
Identifying spatially similar gene expression patterns in early stage fruit fly embryo images: binary feature versus invariant moment digital representations

PubMed Central

Gurunathan, Rajalakshmi; Van Emden, Bernard; Panchanathan, Sethuraman; Kumar, Sudhir

2004-01-01

Background Modern developmental biology relies heavily on the analysis of embryonic gene expression patterns. Investigators manually inspect hundreds or thousands of expression patterns to identify those that are spatially similar and to ultimately infer potential gene interactions. However, the rapid accumulation of gene expression pattern data over the last two decades, facilitated by high-throughput techniques, has produced a need for the development of efficient approaches for direct comparison of images, rather than their textual descriptions, to identify spatially similar expression patterns. Results The effectiveness of the Binary Feature Vector (BFV) and Invariant Moment Vector (IMV) based digital representations of the gene expression patterns in finding biologically meaningful patterns was compared for a small (226 images) and a large (1819 images) dataset. For each dataset, an ordered list of images, with respect to a query image, was generated to identify overlapping and similar gene expression patterns, in a manner comparable to what a developmental biologist might do. The results showed that the BFV representation consistently outperforms the IMV representation in finding biologically meaningful matches when spatial overlap of the gene expression pattern and the genes involved are considered. Furthermore, we explored the value of conducting image-content based searches in a dataset where individual expression components (or domains) of multi-domain expression patterns were also included separately. We found that this technique improves performance of both IMV and BFV based searches. Conclusions We conclude that the BFV representation consistently produces a more extensive and better list of biologically useful patterns than the IMV representation. The high quality of results obtained scales well as the search database becomes larger, which encourages efforts to build automated image query and retrieval systems for spatial gene expression patterns. 
PMID:15603586
A Large-Scale, High-Resolution Hydrological Model Parameter Data Set for Climate Change Impact Assessment for the Conterminous US

DOE Office of Scientific and Technical Information (OSTI.GOV)

Oubeidillah, Abdoul A; Kao, Shih-Chieh; Ashfaq, Moetasim

2014-01-01

To extend geographical coverage, refine spatial resolution, and improve modeling efficiency, a computation- and data-intensive effort was conducted to organize a comprehensive hydrologic dataset with post-calibrated model parameters for hydro-climate impact assessment. Several key inputs for hydrologic simulation including meteorologic forcings, soil, land class, vegetation, and elevation were collected from multiple best-available data sources and organized for 2107 hydrologic subbasins (8-digit hydrologic units, HUC8s) in the conterminous United States at refined 1/24° (~4 km) spatial resolution. Using high-performance computing for intensive model calibration, a high-resolution parameter dataset was prepared for the macro-scale Variable Infiltration Capacity (VIC) hydrologic model. The VIC simulation was driven by DAYMET daily meteorological forcing and was calibrated against USGS WaterWatch monthly runoff observations for each HUC8. The results showed that this new parameter dataset may help reasonably simulate runoff at most US HUC8 subbasins. Based on this exhaustive calibration effort, it is now possible to accurately estimate the resources required for further model improvement across the entire conterminous United States. We anticipate that through this hydrologic parameter dataset, the repeated effort of fundamental data processing can be lessened, so that research efforts can emphasize the more challenging task of assessing climate change impacts. The pre-organized model parameter dataset will be provided to interested parties to support further hydro-climate impact assessment.
High-resolution spatial databases of monthly climate variables (1961-2010) over a complex terrain region in southwestern China

NASA Astrophysics Data System (ADS)

Wu, Wei; Xu, An-Ding; Liu, Hong-Bin

2015-01-01

Climate data in gridded format are critical for understanding climate change and its impact on eco-environment. The aim of the current study is to develop spatial databases for three climate variables (maximum, minimum temperatures, and relative humidity) over a large region with complex topography in southwestern China. Five widely used approaches including inverse distance weighting, ordinary kriging, universal kriging, co-kriging, and thin-plate smoothing spline were tested. Root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) showed that thin-plate smoothing spline with latitude, longitude, and elevation outperformed other models. Average RMSE, MAE, and MAPE of the best models were 1.16 °C, 0.74 °C, and 7.38% for maximum temperature; 0.826 °C, 0.58 °C, and 6.41% for minimum temperature; and 3.44, 2.28, and 3.21% for relative humidity, respectively. Spatial datasets of annual and monthly climate variables with 1-km resolution covering the period 1961-2010 were then obtained using the best performance methods. Comparative study showed that the current outcomes were in good agreement with public datasets. Based on the gridded datasets, changes in temperature variables were investigated across the study area. Future study might be needed to capture the uncertainty induced by environmental conditions through remote sensing and knowledge-based methods.
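As an illustration of the simplest of the five interpolators named above, and of the three error measures used to rank them, here is a minimal sketch. The function names, the default power of 2, and the station tuples are assumptions for the example; the study's best method was the thin-plate smoothing spline, which is not shown here.

```python
import math


def idw(stations, x, y, power=2.0):
    """Inverse distance weighting: stations is a list of (x, y, value)."""
    num = den = 0.0
    for sx, sy, value in stations:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value               # exactly on a station
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def mape(obs, pred):
    """Mean absolute percentage error; observations must be non-zero."""
    return 100.0 * sum(abs((p - o) / o) for o, p in zip(obs, pred)) / len(obs)
```

In a cross-validation setting each station is withheld in turn, predicted from the others, and the three scores are computed over the withheld values.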
Evaluating the utility of the medium-spatial resolution Landsat 8 multispectral sensor in quantifying aboveground biomass in uMgeni catchment, South Africa

NASA Astrophysics Data System (ADS)

Dube, Timothy; Mutanga, Onisimo

2015-03-01

Aboveground biomass estimation is critical in understanding forest contribution to regional carbon cycles. Despite the successful application of high spatial and spectral resolution sensors in aboveground biomass (AGB) estimation, there are challenges related to high acquisition costs, small area coverage, multicollinearity and limited availability. These challenges hamper successful regional-scale AGB quantification. The aim of this study was to assess the utility of the newly-launched medium-resolution multispectral Landsat 8 Operational Land Imager (OLI) dataset, with a large swath width, in quantifying AGB in a forest plantation. We applied different sets of spectral analysis (test I: spectral bands; test II: spectral vegetation indices and test III: spectral bands + spectral vegetation indices) in testing the utility of Landsat 8 OLI using two non-parametric algorithms: stochastic gradient boosting and the random forest ensembles. The results of the study show that the medium-resolution multispectral Landsat 8 OLI dataset provides better AGB estimates for Eucalyptus dunii, Eucalyptus grandis and Pinus taeda especially when using the extracted spectral information together with the derived spectral vegetation indices. We also noted that incorporating the optimal subset of the most important selected medium-resolution multispectral Landsat 8 OLI bands improved AGB accuracies. We compared medium-resolution multispectral Landsat 8 OLI AGB estimates with Landsat 7 ETM+ estimates and the latter yielded lower estimation accuracies. Overall, this study demonstrates the invaluable potential and strength of applying the relatively affordable and readily available newly-launched medium-resolution Landsat 8 OLI dataset, with a large swath width (185 km), in precisely estimating AGB. This strength of the Landsat OLI dataset is crucial especially in sub-Saharan Africa where high-resolution remote sensing data availability remains a challenge.
Steps toward a CONUS-wide reanalysis with archived NEXRAD data using National Mosaic and Multisensor Quantitative Precipitation Estimation (NMQ/Q2) algorithms

NASA Astrophysics Data System (ADS)

Stevens, S. E.; Nelson, B. R.; Langston, C.; Qi, Y.

2012-12-01

The National Mosaic and Multisensor QPE (NMQ/Q2) software suite, developed at NOAA's National Severe Storms Laboratory (NSSL) in Norman, OK, addresses a large deficiency in the resolution of currently archived precipitation datasets. Current standards, both radar- and satellite-based, provide for nationwide precipitation data with a spatial resolution of up to 4-5 km, with a temporal resolution as fine as one hour. Efforts are ongoing to process archived NEXRAD data for the period of record (1996 - present), producing a continuous dataset providing precipitation data at a spatial resolution of 1 km, on a timescale of only five minutes. In addition, radar-derived precipitation data are adjusted hourly using a wide variety of automated gauge networks spanning the United States. Applications for such a product range widely, from emergency management and flash flood guidance, to hydrological studies and drought monitoring. Results are presented from a subset of the NEXRAD dataset, providing basic statistics on the distribution of rainrates, relative frequency of precipitation types, and several other variables which demonstrate the variety of output provided by the software. Precipitation data from select case studies are also presented to highlight the increased resolution provided by this reanalysis and the possibilities that arise from the availability of data on such fine scales. A previously completed pilot project and steps toward a nationwide implementation are presented along with proposed strategies for managing and processing such a large dataset. Reprocessing efforts span several institutions in both North Carolina and Oklahoma, and data/software coordination are key in producing a homogeneous record of precipitation to be archived alongside NOAA's other Climate Data Records. Methods are presented for utilizing supercomputing capability in expediting processing, to allow for the iterative nature of a reanalysis effort.
EnzyNet: enzyme classification using 3D convolutional neural networks on spatial representation

PubMed Central

Amidi, Afshine; Megalooikonomou, Vasileios; Paragios, Nikos

2018-01-01

During the past decade, with the significant progress of computational power as well as ever-rising data availability, deep learning techniques became increasingly popular due to their excellent performance on computer vision problems. The size of the Protein Data Bank (PDB) has increased more than 15-fold since 1999, which enabled the expansion of models that aim at predicting enzymatic function via their amino acid composition. Amino acid sequence, however, is less conserved in nature than protein structure and therefore considered a less reliable predictor of protein function. This paper presents EnzyNet, a novel 3D convolutional neural networks classifier that predicts the Enzyme Commission number of enzymes based only on their voxel-based spatial structure. The spatial distribution of biochemical properties was also examined as complementary information. The two-layer architecture was investigated on a large dataset of 63,558 enzymes from the PDB and achieved an accuracy of 78.4% by exploiting only the binary representation of the protein shape. Code and datasets are available at https://github.com/shervinea/enzynet. PMID:29740518
EnzyNet: enzyme classification using 3D convolutional neural networks on spatial representation.

PubMed

Amidi, Afshine; Amidi, Shervine; Vlachakis, Dimitrios; Megalooikonomou, Vasileios; Paragios, Nikos; Zacharaki, Evangelia I

2018-01-01

During the past decade, with the significant progress of computational power as well as ever-rising data availability, deep learning techniques became increasingly popular due to their excellent performance on computer vision problems. The size of the Protein Data Bank (PDB) has increased more than 15-fold since 1999, which enabled the expansion of models that aim at predicting enzymatic function via their amino acid composition. Amino acid sequence, however, is less conserved in nature than protein structure and therefore considered a less reliable predictor of protein function. This paper presents EnzyNet, a novel 3D convolutional neural networks classifier that predicts the Enzyme Commission number of enzymes based only on their voxel-based spatial structure. The spatial distribution of biochemical properties was also examined as complementary information. The two-layer architecture was investigated on a large dataset of 63,558 enzymes from the PDB and achieved an accuracy of 78.4% by exploiting only the binary representation of the protein shape. Code and datasets are available at https://github.com/shervinea/enzynet.
Does using different modern climate datasets impact pollen-based paleoclimate reconstructions in North America during the past 2,000 years

NASA Astrophysics Data System (ADS)

Ladd, Matthew; Viau, Andre

2013-04-01

Paleoclimate reconstructions rely on the accuracy of modern climate datasets for calibration of fossil records under the assumption of climate normality through time, which means that the modern climate operates in a similar manner as over the past 2,000 years. In this study, we show how using different modern climate datasets has an impact on a pollen-based reconstruction of mean temperature of the warmest month (MTWA) during the past 2,000 years for North America. The modern climate datasets used to explore this research question include: the Whitmore et al. (2005) modern climate dataset; North American Regional Reanalysis (NARR); National Center For Environmental Prediction (NCEP); European Center for Medium Range Weather Forecasting (ECMWF) ERA-40 reanalysis; WorldClim; Global Historical Climate Network (GHCN); and New et al., which is derived from the CRU dataset. Results show that some caution is advised in using the reanalysis data on large-scale reconstructions. Station data appears to dampen out the variability of the reconstruction produced using station-based datasets. The reanalysis or model-based datasets are not recommended for paleoclimate large-scale North American reconstructions as they appear to lack some of the dynamics observed in station datasets (CRU), which resulted in warm-biased reconstructions as compared to the station-based reconstructions. The Whitmore et al. (2005) modern climate dataset appears to be a compromise between CRU-based datasets and model-based datasets except for the ERA-40. In addition, an ultra-high resolution gridded climate dataset such as WorldClim may only be useful if the pollen calibration sites in North America have at least the same spatial precision. We reconstruct the MTWA to within ±0.01°C by using an average of all curves derived from the different modern climate datasets, demonstrating the robustness of the procedure used. Using an average of different modern datasets may reduce the impact of uncertainty on paleoclimate reconstructions; however, this is yet to be determined with certainty. Future evaluation using, for example, the newly developed Berkeley Earth surface temperature datasets should be tested against the paleoclimate record.
Development of large scale riverine terrain-bathymetry dataset by integrating NHDPlus HR with NED, CoNED and HAND data

NASA Astrophysics Data System (ADS)

Li, Z.; Clark, E. P.

2017-12-01

Large scale and fine resolution riverine bathymetry data is critical for flood inundation modeling but not available over the continental United States (CONUS). Previously we implemented bankfull hydraulic geometry based approaches to simulate bathymetry for individual rivers using NHDPlus v2.1 data and 10 m National Elevation Dataset (NED). USGS has recently developed High Resolution NHD data (NHDPlus HR Beta) (USGS, 2017), and this enhanced dataset has a significant improvement on its spatial correspondence with 10 m DEM. In this study, we used this high resolution data, specifically NHDFlowline and NHDArea, to create bathymetry/terrain for CONUS river channels and floodplains. A software package, NHDPlus Inundation Modeler v5.0 Beta, was developed for this project as an Esri ArcGIS hydrological analysis extension. With the updated tools, raw 10 m DEM was first hydrologically treated to remove artificial blockages (e.g., overpasses, bridges and even roadways, etc.) using low pass moving window filters. Cross sections were then automatically constructed along each flowline to extract elevation from the hydrologically treated DEM. In this study, river channel shapes were approximated using quadratic curves to reduce uncertainties from commonly used trapezoids. We calculated underneath water channel elevation at each cross section sampling point using bankfull channel dimensions that were estimated from physiographic province/division based regression equations (Bieger et al. 2015). These elevation points were then interpolated to generate bathymetry raster. The simulated bathymetry raster was integrated with USGS NED and Coastal National Elevation Database (CoNED) (wherever available) to make seamless terrain-bathymetry dataset. Channel bathymetry was also integrated to the HAND (Height above Nearest Drainage) dataset to improve large scale inundation modeling. The generated terrain-bathymetry was processed at Watershed Boundary Dataset Hydrologic Unit 4 (WBDHU4) level.
Spatial patterns of fish standing biomass across Brazilian reefs.

PubMed

Morais, R A; Ferreira, C E L; Floeter, S R

2017-12-01

A large fish-count dataset from the Brazilian province was used to describe spatial patterns in standing biomass and test if total biomass, taxonomic and functional trophic structure vary across nested spatial scales. Taxonomic and functional structure varied more among localities and sites than among regions. Total biomass was generally higher at oceanic islands and remote or protected localities along the coast. Lower level carnivores comprised a large part of the biomass at almost all localities (mean of 44%), zooplanktivores never attained more than 14% and omnivores were more representative of subtropical reefs and oceanic islands (up to 66% of total biomass). Small and large herbivores and detritivores varied greatly in their contribution to total biomass, with no clear geographical patterns. Macrocarnivores comprised less than 12% of the biomass anywhere, except for two remote localities. Top predators, such as sharks and very large groupers, were rare and restricted to a few reefs, suggesting that their ecological function might have already been lost in many Brazilian reefs. © 2017 The Fisheries Society of the British Isles.
Comparing apples and oranges: the Community Intercomparison Suite

NASA Astrophysics Data System (ADS)

Schutgens, Nick; Stier, Philip; Pascoe, Stephen

2014-05-01

Visual representation and comparison of geoscientific datasets presents a huge challenge due to the large variety of file formats and spatio-temporal sampling of data (be they observations or simulations). The Community Intercomparison Suite attempts to greatly simplify these tasks for users by offering an intelligent but simple command line tool for visualisation and colocation of diverse datasets. In addition, CIS can subset and aggregate large datasets into smaller, more manageable datasets. Our philosophy is to remove as much as possible the need for specialist knowledge by the user of the structure of a dataset. The colocation of observations with model data is as simple as: "cis col ::" which will resample the simulation data to the spatio-temporal sampling of the observations, contingent on a few user-defined options that specify a resampling kernel. CIS can deal with both gridded and ungridded datasets of 2, 3 or 4 spatio-temporal dimensions. It can handle different spatial coordinates (e.g. longitude or distance, altitude or pressure level). CIS supports HDF, netCDF and ASCII file formats. The suite is written in Python with entirely publicly available open source dependencies. Plug-ins allow a high degree of user-moddability. A web-based developer hub includes a manual and simple examples. CIS is developed as open source code by a specialist IT company under supervision of scientists from the University of Oxford as part of investment in the JASMIN superdatacluster facility at the Centre of Environmental Data Archival.
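The colocation step described in this abstract (resampling simulation data to the sampling of the observations) can be illustrated with a minimal sketch. This is not CIS code and does not use the CIS API; it assumes a nearest-neighbour resampling kernel on a regular latitude/longitude grid, and the function names and toy values are hypothetical.

```python
from bisect import bisect_left

def nearest_index(axis, value):
    """Index of the grid coordinate closest to value (axis sorted ascending)."""
    i = bisect_left(axis, value)
    if i == 0:
        return 0
    if i == len(axis):
        return len(axis) - 1
    # Pick the closer of the two neighbouring grid coordinates.
    return i if axis[i] - value < value - axis[i - 1] else i - 1

def colocate_nearest(lats, lons, field, obs_points):
    """Sample the gridded field at each (lat, lon) observation location."""
    return [field[nearest_index(lats, la)][nearest_index(lons, lo)]
            for la, lo in obs_points]

# Hypothetical 3 x 4 model grid and three observation locations.
lats = [0.0, 10.0, 20.0]
lons = [0.0, 10.0, 20.0, 30.0]
field = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
obs = [(1.0, 29.0), (14.0, 11.0), (25.0, -5.0)]
sampled = colocate_nearest(lats, lons, field, obs)
print(sampled)
```

Observations outside the grid are clamped to the nearest edge cell here; a real tool would more likely mask them or let the user choose.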
A Comparative Analysis of Five Cropland Datasets in Africa

NASA Astrophysics Data System (ADS)

Wei, Y.; Lu, M.; Wu, W.

2018-04-01

Food security, particularly in Africa, is a challenge to be resolved. Cropland area and spatial distribution obtained from remote sensing imagery are vital information. In this paper, we compare five global cropland datasets for Africa circa 2010 (CCI Land Cover, GlobCover, MODIS Collection 5, GlobeLand30 and Unified Cropland) in terms of cropland area and spatial location. The accuracy of the cropland area calculated from the five datasets was analyzed by comparison with statistical data. Based on validation samples, the spatial location accuracy of the five cropland products was assessed using an error matrix. The results show that GlobeLand30 fits the statistics best, followed by MODIS Collection 5 and Unified Cropland; GlobCover and CCI Land Cover have lower accuracies. For the spatial location of cropland, GlobeLand30 reaches the highest accuracy, followed by Unified Cropland, MODIS Collection 5 and GlobCover; CCI Land Cover has the lowest accuracy. The spatial location accuracy of the five datasets is generally higher in the Csa zone, with suitable farming conditions, than in the Bsk zone.
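An error-matrix assessment of this kind can be sketched as follows. The labels and sample values below are invented for illustration and are not taken from the study; the sketch assumes a two-class (cropland / other) validation set and that every class occurs at least once in both the reference and the mapped labels.

```python
def error_matrix(reference, predicted, classes):
    """Count samples: rows are reference labels, columns are mapped labels."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for ref, pred in zip(reference, predicted):
        matrix[index[ref]][index[pred]] += 1
    return matrix

def accuracies(matrix):
    """Overall, producer's (per reference class) and user's (per mapped class) accuracy."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    overall = sum(matrix[i][i] for i in range(n)) / total
    producers = [matrix[i][i] / sum(matrix[i]) for i in range(n)]
    users = [matrix[i][i] / sum(matrix[r][i] for r in range(n)) for i in range(n)]
    return overall, producers, users

# Hypothetical validation samples (reference label vs. label in the cropland product).
reference = ["crop"] * 4 + ["other"] * 6
predicted = ["crop", "crop", "crop", "other",
             "other", "other", "other", "other", "crop", "other"]
m = error_matrix(reference, predicted, ["crop", "other"])
overall, producers, users = accuracies(m)
print(m, overall, producers, users)
```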
Radiosonde Atmospheric Temperature Products for Assessing Climate (RATPAC): Towards a New Adjusted Radiosonde Dataset

NASA Astrophysics Data System (ADS)

Free, M. P.; Angell, J. K.; Durre, I.; Klein, S.; Lanzante, J.; Lawrimore, J.; Peterson, T.; Seidel, D.

2002-05-01

The objective of NOAA's RATPAC project is to develop climate-quality global, hemispheric and zonal upper-air temperature time series from the NCDC radiosonde database. Lanzante, Klein and Seidel (LKS) have produced an 87-station adjusted radiosonde dataset using a multifactor expert decision approach. Our goal is to extend this dataset spatially and temporally and to provide a method to update it routinely at NCDC. Since the LKS adjustment method is too labor-intensive for these purposes, we are investigating a first-difference method (Peterson et al., 1998) and an automated version of the LKS method. The first difference method (FD) can be used to combine large numbers of time series into spatial means, but also introduces a random error in the resulting large-scale averages. If the portions of the time series with suspect continuity are withheld from the calculations, it has the potential to reconstruct the real variability without the effects of the discontinuities. However, tests of FD on unadjusted radiosonde data and on reanalysis temperature data suggest that it must be used with caution when the number of stations is low and the number of data gaps is high. Because of these problems with the first difference approach, we are also considering an automated version of the LKS adjustment method using statistical change points, day-night temperature difference series, relationships between changes in adjacent atmospheric levels, and station histories to identify inhomogeneities in the temperature data.
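The first-difference idea mentioned above can be sketched in a few lines: difference each station series in time, average the differences across stations at each step, and cumulatively sum the averages. This is an illustrative simplification, not the RATPAC implementation; the station values are invented, gaps are marked with None, and carrying the previous value forward when no station has a valid pair is an assumption of this sketch.

```python
def first_difference_mean(series_by_station):
    """Combine station series (lists with None for gaps) into one large-scale series."""
    n_times = len(series_by_station[0])
    combined = [0.0]  # the reconstructed series is anchored at zero at the first time step
    for t in range(1, n_times):
        # Use only stations with valid data at both t-1 and t.
        diffs = [s[t] - s[t - 1] for s in series_by_station
                 if s[t] is not None and s[t - 1] is not None]
        # Assumption: with no valid pair, carry the previous value forward.
        step = sum(diffs) / len(diffs) if diffs else 0.0
        combined.append(combined[-1] + step)
    return combined

# Two hypothetical stations with different baselines; the first has a gap.
stations = [
    [10.0, 10.5, None, 11.0, 11.5],
    [20.0, 20.5, 21.0, 21.0, 21.5],
]
result = first_difference_mean(stations)
print(result)
```

Because only differences enter the average, the different station baselines (10 versus 20 here) do not affect the combined series, but each averaged step carries its own sampling error, which accumulates in the sum. That is consistent with the random error in large-scale averages that the abstract notes.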
Investigation and Evaluation of the open source ETL tools GeoKettle and Talend Open Studio in terms of their ability to process spatial data

NASA Astrophysics Data System (ADS)

Kuhnert, Kristin; Quedenau, Jörn

2016-04-01

Integration and harmonization of large spatial data sets has been a major issue, and not only since the introduction of the spatial data infrastructure INSPIRE. Extracting and combining spatial data from heterogeneous source formats, transforming that data to obtain the required quality for particular purposes, and loading it into a data store are common tasks. This procedure of Extraction, Transformation and Loading of data is called an ETL process. Geographic Information Systems (GIS) can take over many of these tasks, but often they are not suitable for processing large datasets. ETL tools can make the implementation and execution of ETL processes convenient and efficient. One reason for choosing ETL tools for data integration is that they ease maintenance because of a clear (graphical) presentation of the transformation steps. Developers and administrators are provided with tools for identifying errors, analyzing processing performance and managing the execution of ETL processes. Another benefit of ETL tools is that most tasks require little or no scripting skill, so researchers without a programming background can also work with them easily. Investigations of ETL tools for business approaches have been available for a long time. However, little work has been published on the capabilities of those tools to handle spatial data. In this work, we review and compare the open source ETL tools GeoKettle and Talend Open Studio in terms of processing spatial data sets of different formats. For evaluation, ETL processes are performed with both software packages based on air quality data measured during the BÄRLIN2014 Campaign initiated by the Institute for Advanced Sustainability Studies (IASS). The aim of the BÄRLIN2014 Campaign is to better understand the sources and distribution of particulate matter in Berlin. The air quality data are available in heterogeneous formats because they were measured with different instruments. For further data analysis, the instrument data has been complemented by other georeferenced data provided by the local environmental authorities. This includes both vector and raster data on e.g. land use categories or building heights, extracted from flat files and OGC-compliant web services. The requirements on the ETL tools include, for instance, the extraction of different input datasets such as Web Feature Services or vector datasets and the loading of those into databases. The tools also have to manage transformations on spatial datasets, such as applying spatial functions (e.g. intersection, union) or changing spatial reference systems. Preliminary results suggest that many complex transformation tasks could be accomplished with the existing set of components from both software tools, while there are still many gaps in the range of available features. The two ETL tools differ in functionality and in the way various steps are implemented. For some tasks no predefined components are available at all, which could partly be compensated for by the use of the respective API (freely configurable components in Java or JavaScript).
Intensity-Duration-Frequency curves from remote sensing datasets: direct comparison of weather radar and CMORPH over the Eastern Mediterranean

NASA Astrophysics Data System (ADS)

Morin, Efrat; Marra, Francesco; Peleg, Nadav; Mei, Yiwen; Anagnostou, Emmanouil N.

2017-04-01

Rainfall frequency analysis is used to quantify the probability of occurrence of extreme rainfall and is traditionally based on rain gauge records. The limited spatial coverage of rain gauges is insufficient to sample the spatiotemporal variability of extreme rainfall and to provide the areal information required by management and design applications. Conversely, remote sensing instruments, even if quantitatively uncertain, offer coverage and spatiotemporal detail that allow overcoming these issues. In recent years, remote sensing datasets began to be used for frequency analyses, taking advantage of increased record lengths and quantitative adjustments of the data. However, the studies so far made use of concepts and techniques developed for rain gauge (i.e. point or multiple-point) data and have been validated by comparison with gauge-derived analyses. These procedures add further sources of uncertainty, prevent data uncertainties from being separated from methodological ones, and prevent the available information from being fully exploited. In this study, we step out of the gauge-centered concept, presenting a direct comparison between at-site Intensity-Duration-Frequency (IDF) curves derived from different remote sensing datasets on corresponding spatial scales, temporal resolutions and records. We analyzed 16 years of homogeneously corrected and gauge-adjusted C-Band weather radar estimates, high-resolution CMORPH and gauge-adjusted high-resolution CMORPH over the Eastern Mediterranean. Results of this study include: (a) good spatial correlation between radar and satellite IDFs ( 0.7 for 2-5 years return period); (b) consistent correlation and dispersion in the raw and gauge-adjusted CMORPH; (c) bias is almost uniform with return period for 12-24 h durations; (d) radar identifies thicker tail distributions than CMORPH, and the tail of the distributions depends on the spatial and temporal scales. These results demonstrate the potential of remote sensing datasets for rainfall frequency analysis for management (e.g. warning and early-warning systems) and design (e.g. sewer design, large scale drainage planning).
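As a rough illustration of what one point on an at-site IDF curve represents, the sketch below assigns empirical return periods to a series of annual maximum intensities for a single duration. The abstract does not state how the study fitted its curves; the Weibull plotting position used here is a common textbook choice and is an assumption, as are the sample values.

```python
def empirical_return_levels(annual_maxima):
    """Pair each annual maximum with its empirical return period in years."""
    n = len(annual_maxima)
    ranked = sorted(annual_maxima, reverse=True)
    # Weibull plotting position: exceedance probability p = rank / (n + 1), T = 1 / p.
    return [((n + 1) / rank, value) for rank, value in enumerate(ranked, start=1)]

# Hypothetical annual maxima of 1-hour rainfall intensity (mm/h) at one pixel.
maxima_1h = [22.0, 35.0, 18.0, 41.0, 27.0, 30.0, 25.0]
levels = empirical_return_levels(maxima_1h)
for period, intensity in levels:
    print(f"T = {period:.2f} yr, intensity = {intensity} mm/h")
```

Repeating this for several durations (for example 1, 12 and 24 h) at the same pixel gives the intensity, duration and frequency triplets that an IDF curve summarizes.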
Modeling the Hydrological Regime of Turkana Lake (Kenya, Ethiopia) by Combining Spatially Distributed Hydrological Modeling and Remote Sensing Datasets

NASA Astrophysics Data System (ADS)

Anghileri, D.; Kaelin, A.; Peleg, N.; Fatichi, S.; Molnar, P.; Roques, C.; Longuevergne, L.; Burlando, P.

2017-12-01

Hydrological modeling in poorly gauged basins can benefit from the use of remote sensing datasets, although there are challenges associated with the mismatch in spatial and temporal scales between catchment scale hydrological models and remote sensing products. We model the hydrological processes and long-term water budget of the Lake Turkana catchment, a transboundary basin between Kenya and Ethiopia, by integrating several remote sensing products into a spatially distributed and physically explicit model, Topkapi-ETH. Lake Turkana is the world's largest desert lake, draining a catchment of 145,500 km². It has three main contributing rivers: the Omo river, which contributes most of the annual lake inflow, and the Turkwel and Kerio rivers, which contribute the remaining part. The lake levels have shown great variations in the last decades due to long-term climate fluctuations and the regulation of three reservoirs, Gibe I, II, and III, which significantly alter the hydrological seasonality. Another large reservoir is planned and may be built in the next decade, generating concerns about the fate of Lake Turkana in the long run because of this additional anthropogenic pressure and increasing evaporation driven by climate change. We consider different remote sensing datasets, i.e., TRMM-V7 for precipitation and MERRA-2 for temperature, as inputs to the spatially distributed hydrological model. We validate the simulation results with other remote sensing datasets, i.e., GRACE for total water storage anomalies, GLDAS-NOAH for soil moisture, ERA-Interim/Land for surface runoff, and TOPEX/Poseidon for satellite altimetry data. Results highlight how different remote sensing products can be integrated into a hydrological modeling framework accounting for their relative uncertainties. We also carried out simulations with the artificial reservoirs planned in the northern part of the catchment and without any reservoirs, to assess their impacts on the catchment hydrological regime and the Lake Turkana level variability.
Thermal Structure and Dynamics of Saturn's Northern Springtime Disturbance

NASA Technical Reports Server (NTRS)

Fletcher, Leigh N.; Hesman, Brigette E.; Irwin, Patrick G.; Baines, Kevin H.; Momary, Thomas W.; SanchezLavega, Agustin; Flasar, F. Michael; Read, Peter L.; Orton, Glenn S.; SimonMiller, Amy;

2011-01-01

This article combined several infrared datasets to study the vertical properties of Saturn's northern springtime storm. Spectroscopic observations of Saturn's northern hemisphere at 0.5 and 2.5 cm⁻¹ spectral resolution were provided by the Cassini Composite Infrared Spectrometer (CIRS, 17). These were supplemented with narrow-band filtered imaging from the ESO Very Large Telescope VISIR instrument (16) to provide a global spatial context for the Cassini spectroscopy. Finally, nightside imaging from the Cassini Visual and Infrared Mapping Spectrometer (VIMS, 22) provided a glimpse of the undulating cloud activity in the eastern branch of the disturbance. Each of these datasets, and the methods used to reduce and analyse them, will be described in detail below. Spatial maps of atmospheric temperatures, aerosol opacity and gaseous distributions are derived from infrared spectroscopy using a suite of radiative transfer and optimal estimation retrieval tools developed at the University of Oxford, known collectively as Nemesis (23). Synthetic spectra created from a reference atmospheric model for Saturn and appropriate sources of spectroscopic line data (6, 24) are convolved with the instrument function for each dataset. Atmospheric properties are then iteratively adjusted until the measurements are accurately reproduced with physically-realistic temperatures, compositions and cloud opacities.

Tree migration detection through comparisons of historic and current forest inventories

Treesearch

Christopher W. Woodall; Christopher M. Oswalt; James A. Westfall; Charles H. Perry; Mark N. Nelson

2009-01-01

Changes in tree species distributions are a potential impact of climate change on forest ecosystems. The examination of tree species shifts in forests of the eastern United States largely has been limited to modeling activities with little empirical analysis of long-term forest inventory datasets. The goal of this study was to compare historic and current spatial...
Wildlife tracking data management: a new vision.

PubMed

Urbano, Ferdinando; Cagnacci, Francesca; Calenge, Clément; Dettki, Holger; Cameron, Alison; Neteler, Markus

2010-07-27

To date, the processing of wildlife location data has relied on a diversity of software and file formats. Data management and the following spatial and statistical analyses were undertaken in multiple steps, involving many time-consuming importing/exporting phases. Recent technological advancements in tracking systems have made large, continuous, high-frequency datasets of wildlife behavioural data available, such as those derived from the global positioning system (GPS) and other animal-attached sensor devices. These data can be further complemented by a wide range of other information about the animals' environment. Management of these large and diverse datasets for modelling animal behaviour and ecology can prove challenging, slowing down analysis and increasing the probability of mistakes in data handling. We address these issues by critically evaluating the requirements for good management of GPS data for wildlife biology. We highlight that dedicated data management tools and expertise are needed. We explore current research in wildlife data management. We suggest a general direction of development, based on a modular software architecture with a spatial database at its core, where interoperability, data model design and integration with remote-sensing data sources play an important role in successful GPS data handling.
Wildlife tracking data management: a new vision

PubMed Central

Urbano, Ferdinando; Cagnacci, Francesca; Calenge, Clément; Dettki, Holger; Cameron, Alison; Neteler, Markus

2010-01-01

To date, the processing of wildlife location data has relied on a diversity of software and file formats. Data management and the following spatial and statistical analyses were undertaken in multiple steps, involving many time-consuming importing/exporting phases. Recent technological advancements in tracking systems have made large, continuous, high-frequency datasets of wildlife behavioural data available, such as those derived from the global positioning system (GPS) and other animal-attached sensor devices. These data can be further complemented by a wide range of other information about the animals' environment. Management of these large and diverse datasets for modelling animal behaviour and ecology can prove challenging, slowing down analysis and increasing the probability of mistakes in data handling. We address these issues by critically evaluating the requirements for good management of GPS data for wildlife biology. We highlight that dedicated data management tools and expertise are needed. We explore current research in wildlife data management. We suggest a general direction of development, based on a modular software architecture with a spatial database at its core, where interoperability, data model design and integration with remote-sensing data sources play an important role in successful GPS data handling. PMID:20566495
A Hybrid Neuro-Fuzzy Model For Integrating Large Earth-Science Datasets

NASA Astrophysics Data System (ADS)

Porwal, A.; Carranza, J.; Hale, M.

2004-12-01

A GIS-based hybrid neuro-fuzzy approach to integration of large earth-science datasets for mineral prospectivity mapping is described. It implements a Takagi-Sugeno type fuzzy inference system in the framework of a four-layered feed-forward adaptive neural network. Each unique combination of the datasets is considered a feature vector whose components are derived by knowledge-based ordinal encoding of the constituent datasets. A subset of feature vectors with a known output target vector (i.e., unique conditions known to be associated with either a mineralized or a barren location) is used for the training of an adaptive neuro-fuzzy inference system. Training involves iterative adjustment of parameters of the adaptive neuro-fuzzy inference system using a hybrid learning procedure for mapping each training vector to its output target vector with minimum sum of squared error. The trained adaptive neuro-fuzzy inference system is used to process all feature vectors. The output for each feature vector is a value that indicates the extent to which a feature vector belongs to the mineralized class or the barren class. These values are used to generate a prospectivity map. The procedure is demonstrated by an application to regional-scale base metal prospectivity mapping in a study area located in the Aravalli metallogenic province (western India). A comparison of the hybrid neuro-fuzzy approach with pure knowledge-driven fuzzy and pure data-driven neural network approaches indicates that the former offers a superior method for integrating large earth-science datasets for predictive spatial mathematical modelling.
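The inference step of a Takagi-Sugeno type fuzzy system can be sketched as a firing-strength-weighted average of rule outputs. The example below is a toy forward pass only: it does not reproduce the four-layered network or the hybrid learning procedure of the abstract, and the Gaussian membership functions, the 0 to 4 ordinal encoding and the two rules are hypothetical choices made for illustration.

```python
import math

def gaussian(x, centre, width):
    """Gaussian membership value of x in a fuzzy set."""
    return math.exp(-0.5 * ((x - centre) / width) ** 2)

def takagi_sugeno(inputs, rules):
    """First-order Takagi-Sugeno inference over a list of rules.

    Each rule is (memberships, coeffs, bias): one (centre, width) pair per
    input, plus a linear consequent coeffs . inputs + bias.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for memberships, coeffs, bias in rules:
        # Firing strength: product of the input memberships for this rule.
        w = 1.0
        for x, (centre, width) in zip(inputs, memberships):
            w *= gaussian(x, centre, width)
        output = sum(c * x for c, x in zip(coeffs, inputs)) + bias
        weighted_sum += w * output
        weight_total += w
    return weighted_sum / weight_total

# Two hypothetical rules over two ordinally encoded evidence layers (0 = low, 4 = high).
rules = [
    ([(0.0, 1.0), (0.0, 1.0)], [0.0, 0.0], 0.0),  # both layers unfavourable -> barren (0)
    ([(4.0, 1.0), (4.0, 1.0)], [0.0, 0.0], 1.0),  # both layers favourable -> mineralized (1)
]
score_mid = takagi_sugeno([2.0, 2.0], rules)
score_high = takagi_sugeno([4.0, 4.0], rules)
print(score_mid, score_high)
```

In an adaptive neuro-fuzzy system the membership parameters and consequent coefficients would be adjusted during training rather than fixed by hand as they are here.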
Open and scalable analytics of large Earth observation datasets: From scenes to multidimensional arrays using SciDB and GDAL

NASA Astrophysics Data System (ADS)

Appel, Marius; Lahn, Florian; Buytaert, Wouter; Pebesma, Edzer

2018-04-01

Earth observation (EO) datasets are commonly provided as collections of scenes, where individual scenes represent a temporal snapshot and cover a particular region on the Earth's surface. Using these data in complex spatiotemporal modeling becomes difficult as soon as data volumes exceed a certain capacity or analyses include many scenes, which may spatially overlap and may have been recorded at different dates. In order to facilitate analytics on large EO datasets, we combine and extend the geospatial data abstraction library (GDAL) and the array-based data management and analytics system SciDB. We present an approach to automatically convert collections of scenes to multidimensional arrays and use SciDB to scale computationally intensive analytics. We evaluate the approach in three study cases on national scale land use change monitoring with Landsat imagery, global empirical orthogonal function analysis of daily precipitation, and combining historical climate model projections with satellite-based observations. Results indicate that the approach can be used to represent various EO datasets and that analyses in SciDB scale well with available computational resources. To simplify analyses of higher-dimensional datasets such as climate model output, however, a generalization of the GDAL data model might be needed. All parts of this work have been implemented as open-source software and we discuss how this may facilitate open and reproducible EO analyses.
Spatial Mutual Information Based Hyperspectral Band Selection for Classification

PubMed Central

2015-01-01

The amount of information involved in hyperspectral imaging is large. Hyperspectral band selection is a popular method for reducing dimensionality. Several information based measures such as mutual information have been proposed to reduce information redundancy among spectral bands. Unfortunately, mutual information does not take into account the spatial dependency between adjacent pixels in images thus reducing its robustness as a similarity measure. In this paper, we propose a new band selection method based on spatial mutual information. As validation criteria, a supervised classification method using support vector machine (SVM) is used. Experimental results of the classification of hyperspectral datasets show that the proposed method can achieve more accurate results. PMID:25918742
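For reference, the ordinary (non-spatial) mutual information between two bands, which the abstract treats as the baseline measure, can be computed from joint histograms of pixel values. The sketch below shows only that baseline, not the spatial variant proposed in the paper, and it assumes the bands have already been quantized to discrete values; the toy bands are invented.

```python
import math
from collections import Counter

def mutual_information(band_a, band_b):
    """Mutual information in bits between two equally long sequences of discrete pixel values."""
    n = len(band_a)
    count_a = Counter(band_a)
    count_b = Counter(band_b)
    count_ab = Counter(zip(band_a, band_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts.
        mi += (c / n) * math.log2(c * n / (count_a[a] * count_b[b]))
    return mi

# Hypothetical flattened bands with four grey levels.
band_1 = [0, 0, 1, 1, 2, 2, 3, 3]
band_2 = [0, 0, 1, 1, 2, 2, 3, 3]  # identical to band_1: fully redundant
band_3 = [0, 1, 0, 1, 0, 1, 0, 1]  # statistically independent of band_1 in this sample

mi_redundant = mutual_information(band_1, band_2)
mi_independent = mutual_information(band_1, band_3)
print(mi_redundant, mi_independent)
```

Because the pixels are treated as an unordered bag of values, shuffling both bands with the same permutation leaves the result unchanged, which illustrates the insensitivity to spatial dependency that the abstract points out.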
Calibrating a numerical model's morphology using high-resolution spatial and temporal datasets from multithread channel flume experiments.

NASA Astrophysics Data System (ADS)

Javernick, L.; Bertoldi, W.; Redolfi, M.

2017-12-01

Accessing or acquiring high quality, low-cost topographic data has never been easier due to recent developments of the photogrammetric techniques of Structure-from-Motion (SfM). Researchers can acquire the necessary SfM imagery with various platforms, with the ability to capture millimetre resolution and accuracy, or large-scale areas with the help of unmanned platforms. Such datasets in combination with numerical modelling have opened up new opportunities to study river environments' physical and ecological relationships. While a numerical model's overall predictive accuracy is most influenced by topography, proper model calibration requires hydraulic data and morphological data; however, rich hydraulic and morphological datasets remain scarce. This lack of field and laboratory data has limited model advancement through the inability to properly calibrate, assess sensitivity, and validate the model's performance. However, new time-lapse imagery techniques have shown success in identifying instantaneous sediment transport in flume experiments and in improving hydraulic model calibration. With new capabilities to capture high resolution spatial and temporal datasets of flume experiments, there is a need to further assess model performance. To address this demand, this research used braided river flume experiments and captured time-lapse observed sediment transport and repeat SfM elevation surveys to provide unprecedented spatial and temporal datasets. Through newly created metrics that quantified observed and modeled activation, deactivation, and bank erosion rates, the numerical model Delft3D was calibrated. This increased temporal data of both high-resolution time series and long-term temporal coverage provided significantly improved calibration routines that refined calibration parameterization. Model results show that there is a trade-off between achieving quantitative statistical and qualitative morphological representations. Specifically, simulations with good statistical agreement struggled to represent braiding planforms (evolving toward meandering), and parameterization that ensured braiding produced exaggerated activation and bank erosion rates. Marie Sklodowska-Curie Individual Fellowship: River-HMV, 656917
Implementing DOIs for Oceanographic Satellite Data at PO.DAAC

NASA Astrophysics Data System (ADS)

Hausman, J.; Tauer, E.; Chung, N.; Chen, C.; Moroni, D. F.

2013-12-01

The Physical Oceanographic Distributed Active Archive Center (PO.DAAC) is NASA's archive for physical oceanographic satellite data. It distributes over 500 datasets from gravity, ocean wind, sea surface topography, sea ice, ocean currents, salinity, and sea surface temperature satellite missions. A dataset is a collection of granules/files that share the same mission/project, versioning, processing level, spatial, and temporal characteristics. The large number of datasets is partially due to the number of satellite missions, but mostly because a single satellite mission typically has multiple versions or even temporal and spatial resolutions of data. As a result, a user might mistake one dataset for a different dataset from the same satellite mission. Due to the PO.DAAC's vast variety and volume of data and growing requirements to report dataset usage, it has begun implementing DOIs for the datasets it archives and distributes. However, this was not as simple as registering a name for a DOI and providing a URL. Before implementing DOIs, multiple questions needed to be answered. What are the sponsor and end-user expectations regarding DOIs? At what level does a DOI get assigned (dataset, file/granule)? Do all data get a DOI, or only selected data? How do we create a DOI? How do we create landing pages and manage them? What changes need to be made to the data archive, life cycle policy and web portal to accommodate DOIs? What if the data also exist at another archive and a DOI already exists? How is a DOI included if the data were obtained via a subsetting tool? How does a researcher or author provide a unique, definitive reference (standard citation) for a given dataset? This presentation will discuss how these questions were answered through changes in policy, process, and system design. Implementing DOIs is not a trivial undertaking, but as DOIs are rapidly becoming the de facto approach, it is worth the effort. 
Researchers have historically referenced the source satellite and data center (or archive), but scientific writings do not typically provide enough detail to point to a singular, uniquely identifiable dataset. DOIs provide the means to help researchers be precise in their data citations and provide needed clarity, standardization and permanence.
The effects of spatial population dataset choice on estimates of population at risk of disease

PubMed Central

2011-01-01

Background The spatial modeling of infectious disease distributions and dynamics is increasingly being undertaken for health services planning and disease control monitoring, implementation, and evaluation. Where risks are heterogeneous in space or dependent on person-to-person transmission, spatial data on human population distributions are required to estimate infectious disease risks, burdens, and dynamics. Several different modeled human population distribution datasets are available and widely used, but the disparities among them and the implications for enumerating disease burdens and populations at risk have not been considered systematically. Here, we quantify some of these effects using global estimates of populations at risk (PAR) of P. falciparum malaria as an example. Methods The recent construction of a global map of P. falciparum malaria endemicity enabled the testing of different gridded population datasets for providing estimates of PAR by endemicity class. The estimated population numbers within each class were calculated for each country using four different global gridded human population datasets: GRUMP (~1 km spatial resolution), LandScan (~1 km), UNEP Global Population Databases (~5 km), and GPW3 (~5 km). More detailed assessments of PAR variation and accuracy were conducted for three African countries where census data were available at a higher administrative-unit level than used by any of the four gridded population datasets. Results The estimates of PAR based on the datasets varied by more than 10 million people for some countries, even accounting for the fact that estimates of population totals made by different agencies are used to correct national totals in these datasets and can vary by more than 5% for many low-income countries. In many cases, these variations in PAR estimates comprised more than 10% of the total national population. 
The detailed country-level assessments suggested that none of the datasets was consistently more accurate than the others in estimating PAR. The sizes of such differences among modeled human populations were related to variations in the methods, input resolution, and date of the census data underlying each dataset. Data quality varied from country to country within the spatial population datasets. Conclusions Detailed, highly spatially resolved human population data are an essential resource for planning health service delivery for disease control, for the spatial modeling of epidemics, and for decision-making processes related to public health. However, our results highlight that for the low-income regions of the world where disease burden is greatest, existing datasets display substantial variations in estimated population distributions, resulting in uncertainty in disease assessments that utilize them. Increased efforts are required to gather contemporary and spatially detailed demographic data to reduce this uncertainty, particularly in Africa, and to develop population distribution modeling methods that match the rigor, sophistication, and ability to handle uncertainty of contemporary disease mapping and spread modeling. In the meantime, studies that utilize a particular spatial population dataset need to acknowledge the uncertainties inherent within them and consider how the methods and data that comprise each will affect conclusions. PMID:21299885
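The comparison described above comes down to overlaying each gridded population dataset on the endemicity map and summing cell populations per endemicity class. A minimal sketch of that overlay, assuming the grids are already aligned to the same cells; the function name, class labels, and toy numbers are hypothetical and are not taken from the study:

```python
from collections import defaultdict


def population_at_risk(population, endemicity):
    """Sum population counts per endemicity class over aligned grid cells."""
    if len(population) != len(endemicity):
        raise ValueError("grids must have the same number of rows")
    totals = defaultdict(float)
    for pop_row, class_row in zip(population, endemicity):
        if len(pop_row) != len(class_row):
            raise ValueError("grids must have the same number of columns")
        for pop, cls in zip(pop_row, class_row):
            totals[cls] += pop
    return dict(totals)


# Two hypothetical population grids for the same 2 x 2 area (illustrative values).
grid_a = [[100, 250], [0, 400]]
grid_b = [[120, 200], [30, 400]]
# Hypothetical endemicity class of each cell.
classes = [["stable", "stable"], ["unstable", "none"]]

par_a = population_at_risk(grid_a, classes)
par_b = population_at_risk(grid_b, classes)

# Per-class disagreement between the two population datasets.
difference = {cls: par_a[cls] - par_b[cls] for cls in par_a}
```

Real grids at different resolutions (~1 km versus ~5 km) would first need resampling to a common grid, which this sketch leaves out.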
High performance computing environment for multidimensional image analysis

PubMed Central

Rao, A Ravishankar; Cecchi, Guillermo A; Magnasco, Marcelo

2007-01-01

Background The processing of images acquired through microscopy is a challenging task due to the large size of datasets (several gigabytes) and the fast turnaround time required. If the throughput of the image processing stage is significantly increased, it can have a major impact in microscopy applications. Results We present a high performance computing (HPC) solution to this problem. This involves decomposing the spatial 3D image into segments that are assigned to unique processors, and matched to the 3D torus architecture of the IBM Blue Gene/L machine. Communication between segments is restricted to the nearest neighbors. When running on a 2 GHz Intel CPU, the task of 3D median filtering on a typical 256 megabyte dataset takes two and a half hours, whereas by using 1024 nodes of Blue Gene, this task can be performed in 18.8 seconds, a 478× speedup. Conclusion Our parallel solution dramatically improves the performance of image processing, feature extraction and 3D reconstruction tasks. This increased throughput permits biologists to conduct unprecedented large scale experiments with massive datasets. PMID:17634099
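Two points in this abstract can be checked with a small sketch: the reported speedup, and why nearest-neighbour communication is enough for a median filter. The code below is a one-dimensional analogue under stated assumptions, not the authors' Blue Gene implementation. Each segment is padded with a halo of `radius` samples from its neighbours, filtered independently, and trimmed. The function names are hypothetical.

```python
from statistics import median


def median_filter(signal, radius):
    """Reference filter: median over a window of +/- radius, clipped at the ends."""
    n = len(signal)
    return [median(signal[max(0, i - radius):min(n, i + radius + 1)])
            for i in range(n)]


def chunked_median_filter(signal, radius, n_chunks):
    """Filter each segment separately, borrowing a halo from adjacent segments."""
    n = len(signal)
    if n == 0:
        return []
    size = -(-n // n_chunks)  # ceiling division: samples per segment
    out = []
    for start in range(0, n, size):
        stop = min(n, start + size)
        lo = max(0, start - radius)   # halo borrowed from the left neighbour
        hi = min(n, stop + radius)    # halo borrowed from the right neighbour
        filtered = median_filter(signal[lo:hi], radius)
        out.extend(filtered[start - lo:stop - lo])  # drop the halo again
    return out


signal = [3, 9, 1, 7, 7, 2, 8, 4, 6, 5, 0, 9, 2]
reference = median_filter(signal, radius=2)
decomposed = chunked_median_filter(signal, radius=2, n_chunks=4)

# Speedup reported in the abstract: 2.5 hours versus 18.8 seconds.
speedup = 2.5 * 3600 / 18.8
```

Because the halo is as wide as the filter radius, every output sample sees exactly the same window as in the undecomposed filter, so `decomposed` matches `reference`; the speedup evaluates to roughly 478.7, consistent with the quoted 478×.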
High performance computing environment for multidimensional image analysis.

PubMed

Rao, A Ravishankar; Cecchi, Guillermo A; Magnasco, Marcelo

2007-07-10

The processing of images acquired through microscopy is a challenging task due to the large size of datasets (several gigabytes) and the fast turnaround time required. If the throughput of the image processing stage is significantly increased, it can have a major impact in microscopy applications. We present a high performance computing (HPC) solution to this problem. This involves decomposing the spatial 3D image into segments that are assigned to unique processors, and matched to the 3D torus architecture of the IBM Blue Gene/L machine. Communication between segments is restricted to the nearest neighbors. When running on a 2 GHz Intel CPU, the task of 3D median filtering on a typical 256 megabyte dataset takes two and a half hours, whereas by using 1024 nodes of Blue Gene, this task can be performed in 18.8 seconds, a 478x speedup. Our parallel solution dramatically improves the performance of image processing, feature extraction and 3D reconstruction tasks. This increased throughput permits biologists to conduct unprecedented large scale experiments with massive datasets.
Analysing and correcting the differences between multi-source and multi-scale spatial remote sensing observations.

PubMed

Dong, Yingying; Luo, Ruisen; Feng, Haikuan; Wang, Jihua; Zhao, Jinling; Zhu, Yining; Yang, Guijun

2014-01-01

Differences exist among analysis results of agriculture monitoring and crop production based on remote sensing observations, which are obtained at different spatial scales from multiple remote sensors in the same time period, and processed by the same algorithms, models or methods. These differences can be mainly quantitatively described from three aspects, i.e. multiple remote sensing observations, crop parameters estimation models, and spatial scale effects of surface parameters. Our research proposed a new method to analyse and correct the differences between multi-source and multi-scale spatial remote sensing surface reflectance datasets, aiming to provide references for further studies in agricultural application with multiple remotely sensed observations from different sources. The new method was constructed on the basis of physical and mathematical properties of multi-source and multi-scale reflectance datasets. Theories of statistics were involved to extract statistical characteristics of multiple surface reflectance datasets, and further quantitatively analyse spatial variations of these characteristics at multiple spatial scales. Then, taking the surface reflectance at small spatial scale as the baseline data, theories of Gaussian distribution were selected for multiple surface reflectance datasets correction based on the above obtained physical characteristics and mathematical distribution properties, and their spatial variations. This proposed method was verified by two sets of multiple satellite images, which were obtained in two experimental fields located in Inner Mongolia and Beijing, China with different degrees of homogeneity of underlying surfaces. 
Experimental results indicate that differences of surface reflectance datasets at multiple spatial scales could be effectively corrected over non-homogeneous underlying surfaces, which provides a data basis for further multi-source and multi-scale crop growth monitoring and yield prediction, and their corresponding consistency analysis evaluation.
Analysing and Correcting the Differences between Multi-Source and Multi-Scale Spatial Remote Sensing Observations

PubMed Central

Dong, Yingying; Luo, Ruisen; Feng, Haikuan; Wang, Jihua; Zhao, Jinling; Zhu, Yining; Yang, Guijun

2014-01-01

Differences exist among analysis results of agriculture monitoring and crop production based on remote sensing observations, which are obtained at different spatial scales from multiple remote sensors in the same time period, and processed by the same algorithms, models or methods. These differences can be mainly quantitatively described from three aspects, i.e. multiple remote sensing observations, crop parameters estimation models, and spatial scale effects of surface parameters. Our research proposed a new method to analyse and correct the differences between multi-source and multi-scale spatial remote sensing surface reflectance datasets, aiming to provide references for further studies in agricultural application with multiple remotely sensed observations from different sources. The new method was constructed on the basis of physical and mathematical properties of multi-source and multi-scale reflectance datasets. Theories of statistics were involved to extract statistical characteristics of multiple surface reflectance datasets, and further quantitatively analyse spatial variations of these characteristics at multiple spatial scales. Then, taking the surface reflectance at small spatial scale as the baseline data, theories of Gaussian distribution were selected for multiple surface reflectance datasets correction based on the above obtained physical characteristics and mathematical distribution properties, and their spatial variations. This proposed method was verified by two sets of multiple satellite images, which were obtained in two experimental fields located in Inner Mongolia and Beijing, China with different degrees of homogeneity of underlying surfaces. 
Experimental results indicate that differences of surface reflectance datasets at multiple spatial scales could be effectively corrected over non-homogeneous underlying surfaces, which provides a data basis for further multi-source and multi-scale crop growth monitoring and yield prediction, and their corresponding consistency analysis evaluation. PMID:25405760
On the visualization of water-related big data: extracting insights from drought proxies' datasets

NASA Astrophysics Data System (ADS)

Diaz, Vitali; Corzo, Gerald; van Lanen, Henny A. J.; Solomatine, Dimitri

2017-04-01

Big data is a growing area of science from which hydroinformatics can benefit greatly. There have been a number of important developments in the area of data science aimed at analysis of large datasets. Such datasets related to water include measurements, simulations, reanalysis, scenario analyses and proxies. By convention, information contained in these databases is referenced to a specific time and location in space (i.e., longitude/latitude). This work is motivated by the need to extract insights from large water-related datasets, i.e., transforming large amounts of data into useful information that helps to better understand water-related phenomena, particularly drought. In this context, data visualization, part of data science, involves techniques to create and to communicate data by encoding it as visual graphical objects. They may help to better understand data and detect trends. Based on existing methods of data analysis and visualization, this work aims to develop tools for visualizing large water-related datasets. These tools were developed taking advantage of existing libraries for data visualization, and produce a group of graphs that includes both polar area diagrams (PADs) and radar charts (RDs). In both graphs, time steps are represented by the polar angles and the percentages of area in drought by the radii. For illustration, three large datasets of drought proxies are chosen to identify trends, prone areas and spatio-temporal variability of drought in a set of case studies. The datasets are (1) SPI-TS2p1 (1901-2002, 11.7 GB), (2) SPI-PRECL0p5 (1948-2016, 7.91 GB) and (3) SPEI-baseV2.3 (1901-2013, 15.3 GB). All of them are on a monthly basis and with a spatial resolution of 0.5 degrees. The first two were retrieved from the repository of the International Research Institute for Climate and Society (IRI). They are included in the Analyses Standardized Precipitation Index (SPI) project (iridl.ldeo.columbia.edu/SOURCES/.IRI/.Analyses/.SPI/). 
The third dataset was recovered from the Standardized Precipitation Evaporation Index (SPEI) Monitor (digital.csic.es/handle/10261/128892). PADs were found suitable to identify the spatio-temporal variability and prone areas of drought. Drought trends were visually detected by using both PADs and RDs. A similar approach can be followed to include other types of graphs to deal with the analysis of water-related big data. Key words: Big data, data visualization, drought, SPI, SPEI
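The encoding used for the PADs and RDs (time step as polar angle, percentage of area in drought as radial distance) can be sketched as a simple coordinate mapping. This is an illustration only: the function name and the monthly values are hypothetical, and the plotting itself, which the authors did with existing visualization libraries, is left out.

```python
import math


def polar_points(percent_area_in_drought):
    """Map each time step to (angle in radians, radius = % of area in drought)."""
    n = len(percent_area_in_drought)
    points = []
    for step, pct in enumerate(percent_area_in_drought):
        if not 0.0 <= pct <= 100.0:
            raise ValueError("percentage must lie between 0 and 100")
        theta = 2.0 * math.pi * step / n  # time steps spread evenly around the circle
        points.append((theta, pct))
    return points


# Hypothetical monthly percentages of area in drought for one year.
monthly_pct = [5, 8, 12, 20, 35, 50, 42, 30, 18, 10, 6, 4]
points = polar_points(monthly_pct)

# The time step with the largest drought extent stands out as the longest radius.
peak_theta, peak_pct = max(points, key=lambda p: p[1])
```

Joining consecutive points with straight lines gives a radar chart, while drawing one wedge per time step gives a polar area diagram.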
Efficient and Flexible Climate Analysis with Python in a Cloud-Based Distributed Computing Framework

NASA Astrophysics Data System (ADS)

Gannon, C.

2017-12-01

As climate models become progressively more advanced, and spatial resolution further improved through various downscaling projects, climate projections at a local level are increasingly insightful and valuable. However, the raw size of climate datasets presents numerous hurdles for analysts wishing to develop customized climate risk metrics or perform site-specific statistical analysis. Four Twenty Seven, a climate risk consultancy, has implemented a Python-based distributed framework to analyze large climate datasets in the cloud. With the freedom afforded by efficiently processing these datasets, we are able to customize and continually develop new climate risk metrics using the most up-to-date data. Here we outline our process for using Python packages such as XArray and Dask to evaluate netCDF files in a distributed framework, StarCluster to operate in a cluster-computing environment, cloud computing services to access publicly hosted datasets, and how this setup is particularly valuable for generating climate change indicators and performing localized statistical analysis.
Individual Brain Charting, a high-resolution fMRI dataset for cognitive mapping.

PubMed

Pinho, Ana Luísa; Amadon, Alexis; Ruest, Torsten; Fabre, Murielle; Dohmatob, Elvis; Denghien, Isabelle; Ginisty, Chantal; Becuwe-Desmidt, Séverine; Roger, Séverine; Laurier, Laurence; Joly-Testault, Véronique; Médiouni-Cloarec, Gaëlle; Doublé, Christine; Martins, Bernadette; Pinel, Philippe; Eger, Evelyn; Varoquaux, Gaël; Pallier, Christophe; Dehaene, Stanislas; Hertz-Pannier, Lucie; Thirion, Bertrand

2018-06-12

Functional Magnetic Resonance Imaging (fMRI) has furthered brain mapping on perceptual, motor, as well as higher-level cognitive functions. However, to date, no data collection has systematically addressed the functional mapping of cognitive mechanisms at a fine spatial scale. The Individual Brain Charting (IBC) project stands for a high-resolution multi-task fMRI dataset that intends to provide the objective basis toward a comprehensive functional atlas of the human brain. The data refer to a cohort of 12 participants performing many different tasks. The large amount of task-fMRI data on the same subjects yields a precise mapping of the underlying functions, free from both inter-subject and inter-site variability. The present article gives a detailed description of the first release of the IBC dataset. It comprises a dozen tasks, addressing both low- and high-level cognitive functions. This openly available dataset is thus intended to become a reference for cognitive brain mapping.
Estimating and interpreting migration of Amazonian forests using spatially implicit and semi-explicit neutral models.

PubMed

Pos, Edwin; Guevara Andino, Juan Ernesto; Sabatier, Daniel; Molino, Jean-François; Pitman, Nigel; Mogollón, Hugo; Neill, David; Cerón, Carlos; Rivas-Torres, Gonzalo; Di Fiore, Anthony; Thomas, Raquel; Tirado, Milton; Young, Kenneth R; Wang, Ophelia; Sierra, Rodrigo; García-Villacorta, Roosevelt; Zagt, Roderick; Palacios Cuenca, Walter; Aulestia, Milton; Ter Steege, Hans

2017-06-01

With many sophisticated methods available for estimating migration, ecologists face the difficult decision of choosing among them for their specific line of work. Here we test and compare several methods, performing sanity and robustness tests, applying them to large-scale data and discussing the results and interpretation. Five methods were selected to compare for their ability to estimate migration from spatially implicit and semi-explicit simulations based on three large-scale field datasets from South America (Guyana, Suriname, French Guiana and Ecuador). Space was incorporated semi-explicitly by a discrete probability mass function for local recruitment, migration from adjacent plots or from a metacommunity. Most methods were able to accurately estimate migration from spatially implicit simulations. For spatially semi-explicit simulations, estimation was shown to be the additive effect of migration from adjacent plots and the metacommunity. It was only accurate when migration from the metacommunity outweighed that of adjacent plots; discrimination between the two, however, proved to be impossible. We show that migration should be considered more an approximation of the resemblance between communities and the summed regional species pool. Application of migration estimates to simulate field datasets did show reasonably good fits and indicated consistent differences between sets in comparison with earlier studies. We conclude that estimates of migration using these methods are more an approximation of the homogenization among local communities over time rather than a direct measurement of migration and hence have a direct relationship with beta diversity. As beta diversity is the result of many (non)-neutral processes, we have to admit that migration as estimated in a spatially explicit world encompasses not only direct migration but is an ecological aggregate of these processes. 
The parameter m of neutral models then appears more as an emerging property revealed by neutral theory instead of being an effective mechanistic parameter and spatially implicit models should be rejected as an approximation of forest dynamics.
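The semi-explicit treatment of space (a discrete probability mass function over local recruitment, migration from adjacent plots, and migration from a metacommunity) can be illustrated with a single death-and-replacement step. This is a generic neutral-model sketch under stated assumptions, not the simulation code used in the study; the function name, species labels, and probabilities are hypothetical.

```python
import math
import random


def recruitment_step(local, adjacent, metacommunity, pmf, rng):
    """Replace one random individual with a recruit whose source follows pmf.

    pmf = (P(local recruitment), P(adjacent plots), P(metacommunity)).
    """
    if len(pmf) != 3 or min(pmf) < 0 or not math.isclose(sum(pmf), 1.0):
        raise ValueError("pmf must hold three non-negative values summing to 1")
    # Pick the source pool first, then draw the recruit uniformly from that pool.
    source = rng.choices([local, adjacent, metacommunity], weights=pmf)[0]
    recruit = rng.choice(source)
    updated = list(local)
    updated[rng.randrange(len(updated))] = recruit  # one death, one replacement
    return updated


rng = random.Random(42)            # fixed seed so the run is repeatable
local = ["A"] * 6 + ["B"] * 4      # hypothetical local plot of 10 individuals
adjacent = ["B", "C", "C"]         # individuals in neighbouring plots
metacommunity = ["A", "B", "C", "D"]

for _ in range(200):
    local = recruitment_step(local, adjacent, metacommunity, (0.7, 0.2, 0.1), rng)
```

In this sketch the two migration probabilities only affect the local community through the species they bring in, which gives some intuition for why the abstract reports that the two sources are hard to tell apart from composition alone.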
A reference dataset for deformable image registration spatial accuracy evaluation using the COPDgene study archive

NASA Astrophysics Data System (ADS)

Castillo, Richard; Castillo, Edward; Fuentes, David; Ahmad, Moiz; Wood, Abbie M.; Ludwig, Michelle S.; Guerrero, Thomas

2013-05-01

Landmark point-pairs provide a strategy to assess deformable image registration (DIR) accuracy in terms of the spatial registration of the underlying anatomy depicted in medical images. In this study, we propose to augment a publicly available database (www.dir-lab.com) of medical images with large sets of manually identified anatomic feature pairs between breath-hold computed tomography (BH-CT) images for DIR spatial accuracy evaluation. Ten BH-CT image pairs were randomly selected from the COPDgene study cases. Each patient had received CT imaging of the entire thorax in the supine position at one-fourth dose normal expiration and maximum effort full dose inspiration. Using dedicated in-house software, an imaging expert manually identified large sets of anatomic feature pairs between images. Estimates of inter- and intra-observer spatial variation in feature localization were determined by repeat measurements of multiple observers over subsets of randomly selected features. 7298 anatomic landmark features were manually paired between the 10 sets of images. Quantity of feature pairs per case ranged from 447 to 1172. Average 3D Euclidean landmark displacements varied substantially among cases, ranging from 12.29 (SD: 6.39) to 30.90 (SD: 14.05) mm. Repeat registration of uniformly sampled subsets of 150 landmarks for each case yielded estimates of observer localization error, which ranged in average from 0.58 (SD: 0.87) to 1.06 (SD: 2.38) mm for each case. The additions to the online web database (www.dir-lab.com) described in this work will broaden the applicability of the reference data, providing a freely available common dataset for targeted critical evaluation of DIR spatial accuracy performance in multiple clinical settings. Estimates of observer variance in feature localization suggest consistent spatial accuracy for all observers across both four-dimensional CT and COPDgene patient cohorts.
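The per-case summary statistics quoted above (average 3D Euclidean displacement with its standard deviation) follow directly from the paired landmark coordinates. A minimal sketch, assuming coordinates are already in millimetres and that the sample standard deviation is intended (the abstract does not say which form was used); the landmark values are made up for illustration:

```python
import math
from statistics import mean, stdev


def displacement_stats(pairs):
    """Return (mean, sample SD) of 3D Euclidean distances between landmark pairs."""
    distances = [math.dist(a, b) for a, b in pairs]
    return mean(distances), stdev(distances)


# Hypothetical (expiration, inspiration) landmark positions in mm.
pairs = [
    ((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)),   # displacement of 5 mm
    ((1.0, 1.0, 1.0), (1.0, 1.0, 4.0)),   # displacement of 3 mm
    ((2.0, 0.0, 0.0), (2.0, 0.0, 4.0)),   # displacement of 4 mm
]

avg_mm, sd_mm = displacement_stats(pairs)
```

The same function applied to repeat localizations of the same landmark, rather than to inspiration/expiration pairs, would give the kind of observer localization error reported in the abstract.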
A global database of ant species abundances

USGS Publications Warehouse

Gibb, Heloise; Dunn, Rob R.; Sanders, Nathan J.; Grossman, Blair F.; Photakis, Manoli; Abril, Silvia; Agosti, Donat; Andersen, Alan N.; Angulo, Elena; Armbrecht, Ingre; Arnan, Xavier; Baccaro, Fabricio B.; Bishop, Tom R.; Boulay, Raphael; Bruhl, Carsten; Castracani, Cristina; Cerda, Xim; Del Toro, Israel; Delsinne, Thibaut; Diaz, Mireia; Donoso, David A.; Ellison, Aaron M.; Enriquez, Martha L.; Fayle, Tom M.; Feener Jr., Donald H.; Fisher, Brian L.; Fisher, Robert N.; Fitpatrick, Matthew C.; Gomez, Cristanto; Gotelli, Nicholas J.; Gove, Aaron; Grasso, Donato A.; Groc, Sarah; Guenard, Benoit; Gunawardene, Nihara; Heterick, Brian; Hoffmann, Benjamin; Janda, Milan; Jenkins, Clinton; Kaspari, Michael; Klimes, Petr; Lach, Lori; Laeger, Thomas; Lattke, John; Leponce, Maurice; Lessard, Jean-Philippe; Longino, John; Lucky, Andrea; Luke, Sarah H.; Majer, Jonathan; McGlynn, Terrence P.; Menke, Sean; Mezger, Dirk; Mori, Alessandra; Moses, Jimmy; Munyai, Thinandavha Caswell; Pacheco, Renata; Paknia, Omid; Pearce-Duvet, Jessica; Pfeiffer, Martin; Philpott, Stacy M.; Resasco, Julian; Retana, Javier; Silva, Rogerio R.; Sorger, Magdalena D.; Souza, Jorge; Suarez, Andrew V.; Tista, Melanie; Vasconcelos, Heraldo L.; Vonshak, Merav; Weiser, Michael D.; Yates, Michelle; Parr, Catherine L.

2017-01-01

What forces structure ecological assemblages? A key limitation to general insights about assemblage structure is the availability of data that are collected at a small spatial grain (local assemblages) and a large spatial extent (global coverage). Here, we present published and unpublished data from 51,388 ant abundance and occurrence records of more than 2693 species and 7953 morphospecies from local assemblages collected at 4212 locations around the world. Ants were selected because they are diverse and abundant globally, comprise a large fraction of animal biomass in most terrestrial communities, and are key contributors to a range of ecosystem functions. Data were collected between 1949 and 2014, and include, for each geo-referenced sampling site, both the identity of the ants collected and details of sampling design, habitat type and degree of disturbance. The aim of compiling this dataset was to provide comprehensive species abundance data in order to test relationships between assemblage structure and environmental and biogeographic factors. Data were collected using a variety of standardised methods, such as pitfall and Winkler traps, and will be valuable for studies investigating large-scale forces structuring local assemblages. Understanding such relationships is particularly critical under current rates of global change. We encourage authors holding additional data on systematically collected ant assemblages, especially those in dry and cold, and remote areas, to contact us and contribute their data to this growing dataset.
Maximizing Accessibility to Spatially Referenced Digital Data.

ERIC Educational Resources Information Center

Hunt, Li; Joselyn, Mark

1995-01-01

Discusses some widely available spatially referenced datasets, including raster and vector datasets. Strategies for improving accessibility include: acquisition of data in a software-dependent format; reorganization of data into logical geographic units; acquisition of intelligent retrieval software; improving computer hardware; and intelligent…

Development and assessment of 30-meter pine density maps for landscape-level modeling of mountain pine beetle dynamics

Treesearch

Benjamin A. Crabb; James A. Powell; Barbara J. Bentz

2012-01-01

Forecasting spatial patterns of mountain pine beetle (MPB) population success requires spatially explicit information on host pine distribution. We developed a means of producing spatially explicit datasets of pine density at 30-m resolution using existing geospatial datasets of vegetation composition and structure. Because our ultimate goal is to model MPB population...
Data Descriptor: TerraClimate, a high-resolution global dataset of monthly climate and climatic water balance from 1958-2015

Treesearch

John T. Abatzoglou; Solomon Z. Dobrowski; Sean A. Parks; Katherine C. Hegewisch

2018-01-01

We present TerraClimate, a dataset of high-spatial resolution (1/24°, ~4-km) monthly climate and climatic water balance for global terrestrial surfaces from 1958–2015. TerraClimate uses climatically aided interpolation, combining high-spatial resolution climatological normals from the WorldClim dataset, with coarser resolution time varying (i.e., monthly) data from...
High resolution population distribution maps for Southeast Asia in 2010 and 2015.

PubMed

Gaughan, Andrea E; Stevens, Forrest R; Linard, Catherine; Jia, Peng; Tatem, Andrew J

2013-01-01

Spatially accurate, contemporary data on human population distributions are vitally important to many applied and theoretical researchers. The Southeast Asia region has undergone rapid urbanization and population growth over the past decade, yet existing spatial population distribution datasets covering the region are based principally on population count data from censuses circa 2000, with often insufficient spatial resolution or input data to map settlements precisely. Here we outline approaches to construct a database of GIS-linked circa 2010 census data and methods used to construct fine-scale (∼100 meters spatial resolution) population distribution datasets for each country in the Southeast Asia region. Landsat-derived settlement maps and land cover information were combined with ancillary datasets on infrastructure to model population distributions for 2010 and 2015. These products were compared with those from two other methods used to construct commonly used global population datasets. Results indicate mapping accuracies are consistently higher when incorporating land cover and settlement information into the AsiaPop modelling process. Using existing data, it is possible to produce detailed, contemporary and easily updatable population distribution datasets for Southeast Asia. The 2010 and 2015 datasets produced are freely available as a product of the AsiaPop Project and can be downloaded from: www.asiapop.org.
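The general idea of turning census counts into a fine-scale grid can be sketched as weighted disaggregation: each administrative unit's count is spread over its grid cells in proportion to a weight derived from settlement and land cover layers, so the unit total is preserved. This is a generic illustration and not the AsiaPop model itself; the function name and weights are hypothetical.

```python
def disaggregate(census_total, weights):
    """Distribute a census count over grid cells in proportion to cell weights."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("at least one cell needs a positive weight")
    return [census_total * w / total_weight for w in weights]


# Hypothetical unit with 400 people and three cells:
# water (weight 0), sparse cover (weight 1), mapped settlement (weight 3).
cells = disaggregate(400, [0, 1, 3])
```

Because the weights are normalised within each unit, the cell values always add back up to the census count, whichever layers are used to derive the weights.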
High Resolution Population Distribution Maps for Southeast Asia in 2010 and 2015

PubMed Central

Gaughan, Andrea E.; Stevens, Forrest R.; Linard, Catherine; Jia, Peng; Tatem, Andrew J.

2013-01-01

Spatially accurate, contemporary data on human population distributions are vitally important to many applied and theoretical researchers. The Southeast Asia region has undergone rapid urbanization and population growth over the past decade, yet existing spatial population distribution datasets covering the region are based principally on population count data from censuses circa 2000, with often insufficient spatial resolution or input data to map settlements precisely. Here we outline approaches to construct a database of GIS-linked circa 2010 census data and methods used to construct fine-scale (∼100 meters spatial resolution) population distribution datasets for each country in the Southeast Asia region. Landsat-derived settlement maps and land cover information were combined with ancillary datasets on infrastructure to model population distributions for 2010 and 2015. These products were compared with those from two other methods used to construct commonly used global population datasets. Results indicate mapping accuracies are consistently higher when incorporating land cover and settlement information into the AsiaPop modelling process. Using existing data, it is possible to produce detailed, contemporary and easily updatable population distribution datasets for Southeast Asia. The 2010 and 2015 datasets produced are freely available as a product of the AsiaPop Project and can be downloaded from: www.asiapop.org. PMID:23418469
Comparing apples and oranges: the Community Intercomparison Suite

NASA Astrophysics Data System (ADS)

Schutgens, Nick; Stier, Philip; Kershaw, Philip; Pascoe, Stephen

2015-04-01

Visual representation and comparison of geoscientific datasets presents a huge challenge due to the large variety of file formats and spatio-temporal sampling of data (be they observations or simulations). The Community Intercomparison Suite attempts to greatly simplify these tasks for users by offering an intelligent but simple command line tool for visualisation and colocation of diverse datasets. In addition, CIS can subset and aggregate large datasets into smaller more manageable datasets. Our philosophy is to remove as much as possible the need for specialist knowledge by the user of the structure of a dataset. The colocation of observations with model data is as simple as: "cis col ::" which will resample the simulation data to the spatio-temporal sampling of the observations, contingent on a few user-defined options that specify a resampling kernel. As an example, we apply CIS to a case study of biomass burning aerosol from the Congo. Remote sensing observations, in-situ observations and model data are shown in various plots, with the purpose of either comparing different datasets or integrating them into a single comprehensive picture. CIS can deal with both gridded and ungridded datasets of 2, 3 or 4 spatio-temporal dimensions. It can handle different spatial coordinates (e.g. longitude or distance, altitude or pressure level). CIS supports HDF, netCDF and ASCII file formats. The suite is written in Python with entirely publicly available open source dependencies. Plug-ins allow a high degree of user-moddability. A web-based developer hub includes a manual and simple examples. CIS is developed as open source code by a specialist IT company under supervision of scientists from the University of Oxford and the Centre of Environmental Data Archival as part of investment in the JASMIN superdatacluster facility.
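Colocation, as described above, means resampling gridded model output at the locations of the observations. A minimal sketch of the simplest possible kernel (nearest grid node) is given below. It does not use the CIS code or its API; the function names and sample values are hypothetical, and longitude wrap-around and the time dimension are ignored.

```python
def nearest_index(axis, value):
    """Index of the coordinate on a 1D axis closest to the requested value."""
    return min(range(len(axis)), key=lambda i: abs(axis[i] - value))


def colocate_nearest(model, lats, lons, obs_points):
    """Sample a gridded field model[lat_index][lon_index] at each (lat, lon)."""
    return [model[nearest_index(lats, lat)][nearest_index(lons, lon)]
            for lat, lon in obs_points]


# Hypothetical model grid (for example, aerosol optical depth) on a 4 x 4 mesh.
lats = [-10.0, -5.0, 0.0, 5.0]
lons = [10.0, 15.0, 20.0, 25.0]
model_aod = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
    [1.3, 1.4, 1.5, 1.6],
]

# Hypothetical ungridded observation locations (lat, lon).
obs_points = [(-4.2, 15.3), (1.0, 24.0), (-9.9, 10.1)]
sampled = colocate_nearest(model_aod, lats, lons, obs_points)
```

The result has one model value per observation, so the two datasets can be compared point by point; other kernels (for example averaging or interpolation) would replace `nearest_index` with a weighting over several neighbouring nodes.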
Diviner lunar radiometer gridded brightness temperatures from geodesic binning of modeled fields of view

NASA Astrophysics Data System (ADS)

Sefton-Nash, E.; Williams, J.-P.; Greenhagen, B. T.; Aye, K.-M.; Paige, D. A.

2017-12-01

An approach is presented to efficiently produce high quality gridded data records from the large, global point-based dataset returned by the Diviner Lunar Radiometer Experiment aboard NASA's Lunar Reconnaissance Orbiter. The need to minimize data volume and processing time in production of science-ready map products is increasingly important with the growth in data volume of planetary datasets. Diviner makes on average >1400 observations per second of radiance that is reflected and emitted from the lunar surface, using 189 detectors divided into 9 spectral channels. Data management and processing bottlenecks are amplified by modeling every observation as a probability distribution function over the field of view, which can increase the required processing time by 2-3 orders of magnitude. Geometric corrections, such as projection of data points onto a digital elevation model, are numerically intensive and therefore it is desirable to perform them only once. Our approach reduces bottlenecks through parallel binning and efficient storage of a pre-processed database of observations. Database construction is via subdivision of a geodesic icosahedral grid, with a spatial resolution that can be tailored to suit the field of view of the observing instrument. Global geodesic grids with high spatial resolution are normally impractically memory intensive. We therefore demonstrate a minimum storage and highly parallel method to bin very large numbers of data points onto such a grid. A database of the pre-processed and binned points is then used for production of mapped data products that is significantly faster than if unprocessed points were used. We explore quality controls in the production of gridded data records by conditional interpolation, allowed only where data density is sufficient. The resultant effects on the spatial continuity and uncertainty in maps of lunar brightness temperatures are illustrated. We identify four binning regimes based on trades between the spatial resolution of the grid, the size of the FOV and the on-target spacing of observations. Our approach may be applicable and beneficial for many existing and future point-based planetary datasets.
Reconstruction of global gridded monthly sectoral water withdrawals for 1971–2010 and analysis of their spatiotemporal patterns

DOE PAGES

Huang, Zhongwei; Hejazi, Mohamad; Li, Xinya; ...

2018-04-06

Human water withdrawal has increasingly altered the global water cycle in past decades, yet our understanding of its driving forces and patterns is limited. Reported historical estimates of sectoral water withdrawals are often sparse and incomplete, mainly restricted to water withdrawal estimates available at annual and country scales, due to a lack of observations at seasonal and local scales. In this study, through collecting and consolidating various sources of reported data and developing spatial and temporal statistical downscaling algorithms, we reconstruct a global monthly gridded (0.5°) sectoral water withdrawal dataset for the period 1971–2010, which distinguishes six water use sectors, i.e., irrigation, domestic, electricity generation (cooling of thermal power plants), livestock, mining, and manufacturing. Based on the reconstructed dataset, the spatial and temporal patterns of historical water withdrawal are analyzed. Results show that total global water withdrawal has increased significantly during 1971–2010, mainly driven by the increase in irrigation water withdrawal. Regions with high water withdrawal are those densely populated or with large irrigated cropland production, e.g., the United States (US), eastern China, India, and Europe. Seasonally, irrigation water withdrawal in summer for the major crops contributes a large percentage of total annual irrigation water withdrawal in mid- and high-latitude regions, and the dominant season of irrigation water withdrawal is also different across regions. Domestic water withdrawal is mostly characterized by a summer peak, while water withdrawal for electricity generation has a winter peak in high-latitude regions and a summer peak in low-latitude regions. Despite the overall increasing trend, irrigation in the western US and domestic water withdrawal in western Europe exhibit a decreasing trend. Our results highlight the distinct spatial pattern of human water use by sectors at the seasonal and annual timescales. Here, the reconstructed gridded water withdrawal dataset is open access, and can be used for examining issues related to water withdrawals at fine spatial, temporal, and sectoral scales.
Comparison and validation of gridded precipitation datasets for Spain

NASA Astrophysics Data System (ADS)

Quintana-Seguí, Pere; Turco, Marco; Míguez-Macho, Gonzalo

2016-04-01

In this study, two gridded precipitation datasets are compared and validated in Spain: the recently developed SAFRAN dataset and the Spain02 dataset. These are validated using rain gauges and they are also compared to the low resolution ERA-Interim reanalysis. The SAFRAN precipitation dataset has been recently produced, using the SAFRAN meteorological analysis, which is extensively used in France (Durand et al. 1993, 1999; Quintana-Seguí et al. 2008; Vidal et al., 2010) and which has recently been applied to Spain (Quintana-Seguí et al., 2015). SAFRAN uses an optimal interpolation (OI) algorithm and uses all available rain gauges from the Spanish State Meteorological Agency (Agencia Estatal de Meteorología, AEMET). The product has a spatial resolution of 5 km and it spans from September 1979 to August 2014. This dataset has been produced mainly to be used in large scale hydrological applications. Spain02 (Herrera et al. 2012, 2015) is another high quality precipitation dataset for Spain based on a dense network of quality-controlled stations and it has different versions at different resolutions. In this study we used the version with a resolution of 0.11°. The product spans from 1971 to 2010. Spain02 is well tested and widely used, mainly, but not exclusively, for RCM model validation and statistical downscaling. ERA-Interim is a well known global reanalysis with a spatial resolution of ~79 km. It has been included in the comparison because it is a widely used product for continental and global scale studies and also in smaller scale studies in data poor countries. Thus, its comparison with higher resolution products of a data rich country, such as Spain, allows us to quantify the errors made when using such datasets for national scale studies, in line with some of the objectives of the EU-FP7 eartH2Observe project. The comparison shows that SAFRAN and Spain02 perform similarly, even though their underlying principles are different. Both products are largely better than ERA-Interim, which has a much coarser representation of the relief, which is crucial for precipitation. These results are a contribution to the Spanish Case Study of the eartH2Observe project, which is focused on the simulation of drought processes in Spain using Land-Surface Models (LSM). This study will also be helpful in the Spanish MARCO project, which aims at improving the ability of RCMs to simulate hydrometeorological extremes.
Investigating Bacterial-Animal Symbioses with Light Sheet Microscopy

PubMed Central

Taormina, Michael J.; Jemielita, Matthew; Stephens, W. Zac; Burns, Adam R.; Troll, Joshua V.; Parthasarathy, Raghuveer; Guillemin, Karen

2014-01-01

SUMMARY Microbial colonization of the digestive tract is a crucial event in vertebrate development, required for maturation of host immunity and establishment of normal digestive physiology. Advances in genomic, proteomic, and metabolomic technologies are providing a more detailed picture of the constituents of the intestinal habitat, but these approaches lack the spatial and temporal resolution needed to characterize the assembly and dynamics of microbial communities in this complex environment. We report the use of light sheet microscopy to provide high resolution imaging of bacterial colonization of the zebrafish intestine. The methodology allows us to characterize bacterial population dynamics across the entire organ and the behaviors of individual bacterial and host cells throughout the colonization process. The large four-dimensional datasets generated by these imaging approaches require new strategies for image analysis. When integrated with other “omics” datasets, information about the spatial and temporal dynamics of microbial cells within the vertebrate intestine will provide new mechanistic insights into how microbial communities assemble and function within hosts. PMID:22983029
Insights and Challenges to Integrating Data from Diverse Ecological Networks

NASA Astrophysics Data System (ADS)

Peters, D. P. C.

2014-12-01

Many of the most dramatic and surprising effects of global change occur across large spatial extents, from regions to continents, that impact multiple ecosystem types across a range of interacting spatial and temporal scales. The ability of ecologists and inter-disciplinary scientists to understand and predict these dynamics depends, in large part, on existing site-based research infrastructures that developed in response to historic events. Integrating these diverse sources of data is critical to addressing these broad-scale questions. A conceptual approach is presented to synthesize and integrate diverse sources and types of data from different networks of research sites. This approach focuses on developing derived data products through spatial and temporal aggregation that allow datasets collected with different methods to be compared. The approach is illustrated through the integration, analysis, and comparison of hundreds of long-term datasets from 50 ecological sites in the US that represent ecosystem types commonly found globally. New insights were found by comparing multiple sites using common derived data. In addition to "bringing to light" many dark data in a standardized, open access, easy-to-use format, a suite of lessons was learned that can be applied to up and coming research networks in the US and internationally. These lessons will be described along with the challenges, including cyber-infrastructure, cultural, and behavioral constraints associated with the use of big and little data, that may keep ecologists and inter-disciplinary scientists from taking full advantage of the vast amounts of existing and yet-to-be exposed data.
A comparative analysis reveals weak relationships between ecological factors and beta diversity of stream insect metacommunities at two spatial levels.

PubMed

Heino, Jani; Melo, Adriano S; Bini, Luis Mauricio; Altermatt, Florian; Al-Shami, Salman A; Angeler, David G; Bonada, Núria; Brand, Cecilia; Callisto, Marcos; Cottenie, Karl; Dangles, Olivier; Dudgeon, David; Encalada, Andrea; Göthe, Emma; Grönroos, Mira; Hamada, Neusa; Jacobsen, Dean; Landeiro, Victor L; Ligeiro, Raphael; Martins, Renato T; Miserendino, María Laura; Md Rawi, Che Salmah; Rodrigues, Marciel E; Roque, Fabio de Oliveira; Sandin, Leonard; Schmera, Denes; Sgarbi, Luciano F; Simaika, John P; Siqueira, Tadeu; Thompson, Ross M; Townsend, Colin R

2015-03-01

The hypotheses that beta diversity should increase with decreasing latitude and increase with spatial extent of a region have rarely been tested based on a comparative analysis of multiple datasets, and no such study has focused on stream insects. We first assessed how well variability in beta diversity of stream insect metacommunities is predicted by insect group, latitude, spatial extent, altitudinal range, and dataset properties across multiple drainage basins throughout the world. Second, we assessed the relative roles of environmental and spatial factors in driving variation in assemblage composition within each drainage basin. Our analyses were based on a dataset of 95 stream insect metacommunities from 31 drainage basins distributed around the world. We used dissimilarity-based indices to quantify beta diversity for each metacommunity and, subsequently, regressed beta diversity on insect group, latitude, spatial extent, altitudinal range, and dataset properties (e.g., number of sites and percentage of presences). Within each metacommunity, we used a combination of spatial eigenfunction analyses and partial redundancy analysis to partition variation in assemblage structure into environmental, shared, spatial, and unexplained fractions. We found that dataset properties were more important predictors of beta diversity than ecological and geographical factors across multiple drainage basins. In the within-basin analyses, environmental and spatial variables were generally poor predictors of variation in assemblage composition. Our results revealed deviation from general biodiversity patterns because beta diversity did not show the expected decreasing trend with latitude. Our results also call for reconsideration of just how predictable stream assemblages are along ecological gradients, with implications for environmental assessment and conservation decisions. Our findings may also be applicable to other dynamic systems where predictability is low.
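As an illustration of what a dissimilarity-based beta diversity index looks like, the sketch below computes the mean pairwise Sørensen dissimilarity for presence-absence data. The abstract does not say which indices were used, so the choice of Sørensen is an assumption, and the site lists and taxon labels are hypothetical.

```python
from itertools import combinations


def sorensen_dissimilarity(site_a, site_b):
    """1 - 2*shared / (richness_a + richness_b); inputs must be non-empty."""
    a, b = set(site_a), set(site_b)
    shared = len(a & b)
    return 1.0 - 2.0 * shared / (len(a) + len(b))


def mean_pairwise_beta(sites):
    """Average Sørensen dissimilarity over all pairs of sites."""
    pairs = list(combinations(sites, 2))
    return sum(sorensen_dissimilarity(a, b) for a, b in pairs) / len(pairs)


# Hypothetical metacommunity: three stream sites with taxa labelled A-E.
sites = [{"A", "B", "C"}, {"B", "C", "D"}, {"C", "D", "E"}]
beta = mean_pairwise_beta(sites)
print(round(beta, 3))
```

A value of 0 would mean every site holds the same taxa; a value of 1 would mean no pair of sites shares any taxon.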
Mining spatiotemporal patterns of urban dwellers from taxi trajectory data

NASA Astrophysics Data System (ADS)

Mao, Feng; Ji, Minhe; Liu, Ting

2016-06-01

With the widespread adoption of location-aware technology, obtaining long-sequence, massive and high-accuracy spatiotemporal trajectory data of individuals has become increasingly popular in various geographic studies. Trajectory data of taxis, one of the most widely used inner-city travel modes, contain rich information about both road network traffic and travel behavior of passengers. Such data can be used to study the microscopic activity patterns of individuals as well as the macro system of urban spatial structures. This paper focuses on trajectories obtained from GPS-enabled taxis and their applications for mining urban commuting patterns. A novel approach is proposed to discover spatiotemporal patterns of household travel from the taxi trajectory dataset with a large number of point locations. The approach involves three critical steps: spatial clustering of taxi origin-destination (OD) based on urban traffic grids to discover potentially meaningful places, identifying threshold values from statistics of the OD clusters to extract urban jobs-housing structures, and visualization of analytic results to understand the spatial distribution and temporal trends of the revealed urban structures and implied household commuting behavior. A case study with a taxi trajectory dataset in Shanghai, China is presented to demonstrate and evaluate the proposed method.
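The first two steps listed in this abstract (grid-based clustering of OD points, then thresholding) can be sketched roughly as follows. This is a simplified stand-in rather than the authors' method: it assumes a regular square grid, a fixed count threshold, and hypothetical coordinates and function names.

```python
from collections import Counter


def grid_cell(x, y, origin, cell_size):
    """Map a point to the (column, row) index of the grid cell containing it."""
    return (int((x - origin[0]) // cell_size),
            int((y - origin[1]) // cell_size))


def hot_cells(points, origin, cell_size, min_count):
    """Count points per cell and keep cells with at least min_count points."""
    counts = Counter(grid_cell(x, y, origin, cell_size) for x, y in points)
    return {cell: n for cell, n in counts.items() if n >= min_count}


# Hypothetical drop-off locations in arbitrary projected units.
dropoffs = [(0.2, 0.3), (0.8, 0.1), (0.5, 0.9),
            (3.4, 2.2), (3.6, 2.7), (7.1, 7.1)]

clusters = hot_cells(dropoffs, origin=(0.0, 0.0), cell_size=1.0, min_count=2)
print(clusters)
```

In practice the threshold would be derived from the statistics of the cell counts, as the abstract indicates, rather than fixed by hand.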
Multi-source Geospatial Data Analysis with Google Earth Engine

NASA Astrophysics Data System (ADS)

Erickson, T.

2014-12-01

The Google Earth Engine platform is a cloud computing environment for data analysis that combines a public data catalog with a large-scale computational facility optimized for parallel processing of geospatial data. The data catalog is a multi-petabyte archive of georeferenced datasets that include images from Earth observing satellite and airborne sensors (examples: USGS Landsat, NASA MODIS, USDA NAIP), weather and climate datasets, and digital elevation models. Earth Engine supports both a just-in-time computation model that enables real-time preview and debugging during algorithm development for open-ended data exploration, and a batch computation mode for applying algorithms over large spatial and temporal extents. The platform automatically handles many traditionally-onerous data management tasks, such as data format conversion, reprojection, and resampling, which facilitates writing algorithms that combine data from multiple sensors and/or models. Although the primary use of Earth Engine, to date, has been the analysis of large Earth observing satellite datasets, the computational platform is generally applicable to a wide variety of use cases that require large-scale geospatial data analyses. This presentation will focus on how Earth Engine facilitates the analysis of geospatial data streams that originate from multiple separate sources (and often communities) and how it enables collaboration during algorithm development and data exploration. The talk will highlight current projects/analyses that are enabled by this functionality. https://earthengine.google.org
On the Discrepancy in Simultaneous Observations of the Structure Parameter of Temperature Using Scintillometers and Unmanned Aircraft

NASA Astrophysics Data System (ADS)

Braam, Miranda; Beyrich, Frank; Bange, Jens; Platis, Andreas; Martin, Sabrina; Maronga, Björn; Moene, Arnold F.

2016-02-01

We elaborate on the preliminary results presented in Beyrich et al. (in Boundary-Layer Meteorol 144:83-112, 2012), who compared the structure parameter of temperature ($C_T^2$) obtained with the unmanned meteorological mini aerial vehicle (M²AV) versus $C_T^2$ obtained with two large-aperture scintillometers (LASs) for a limited dataset from one single experiment (LITFASS-2009). They found that $C_T^2$ obtained from the M²AV data is significantly larger than that obtained from the LAS data. We investigate if similar differences can be found for the flights on the other six days during LITFASS-2009 and LITFASS-2010, and whether these differences can be reduced or explained through a more elaborate processing of both the LAS data and the M²AV data. This processing includes different corrections and measures to reduce the differences between the spatial and temporal averaging of the datasets. We conclude that the differences reported in Beyrich et al. can be found for other days as well. For the LAS-derived values the additional processing steps that have the largest effect are the saturation correction and the humidity correction. For the M²AV-derived values the most important step is the application of the scintillometer path-weighting function. Using the true air speed of the M²AV to convert from a temporal to a spatial structure function rather than the ground speed (as in Beyrich et al.) does not change the mean discrepancy, but it does affect $C_T^2$ values for individual flights. To investigate whether $C_T^2$ derived from the M²AV data depends on the fact that the underlying temperature dataset combines spatial and temporal sampling, we used large-eddy simulation data to analyze $C_T^2$ from virtual flights with different mean ground speeds. This analysis shows that $C_T^2$ depends only slightly on the true air speed when averaged over many flights.
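For orientation, the structure parameter of temperature is conventionally defined from the second-order structure function as D_T(r) / r^(2/3) for separations r in the inertial subrange. The sketch below estimates it from an aircraft temperature time series, converting the time lag to a spatial separation with the true air speed as the abstract describes. The function name, sampling values and single-lag estimate are illustrative assumptions, not the authors' processing chain.

```python
def structure_parameter(temps, dt, airspeed, lag):
    """Estimate the temperature structure parameter at one lag.

    temps    : temperature samples (K) at a fixed sampling interval
    dt       : sampling interval (s)
    airspeed : true air speed (m/s), used to turn a time lag into a distance
    lag      : separation in samples
    """
    # Frozen-turbulence assumption: a time lag maps to a distance r = U * dt * lag.
    r = airspeed * dt * lag
    squared_diffs = [(temps[i + lag] - temps[i]) ** 2
                     for i in range(len(temps) - lag)]
    # Second-order structure function D_T(r): mean squared temperature increment.
    d_t = sum(squared_diffs) / len(squared_diffs)
    return d_t / r ** (2.0 / 3.0)


# Hypothetical series: increments of 1 K at a separation of 16 * 0.5 * 1 = 8 m.
temps = [0.0, 1.0, 0.0, 1.0, 0.0]
ct2 = structure_parameter(temps, dt=0.5, airspeed=16.0, lag=1)
print(ct2)
```

Replacing `airspeed` with the ground speed changes `r`, and therefore the estimate, which is the sensitivity the abstract examines.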
Impact of missing data on the efficiency of homogenisation: experiments with ACMANTv3

NASA Astrophysics Data System (ADS)

Domonkos, Peter; Coll, John

2018-04-01

The impact of missing data on the efficiency of homogenisation with ACMANTv3 is examined with simulated monthly surface air temperature test datasets. The homogeneous database is derived from an earlier benchmarking of daily temperature data in the USA, and then outliers and inhomogeneities (IHs) are randomly inserted into the time series. Three inhomogeneous datasets are generated and used, one with relatively few and small IHs, another one with IHs of medium frequency and size, and a third one with large and frequent IHs. All of the inserted IHs are changes to the means. Most of the IHs are single sudden shifts or pairs of shifts resulting in platform-shaped biases. Each test dataset consists of 158 time series of 100 years in length, and their mean spatial correlation is 0.68-0.88. For examining the impacts of missing data, seven experiments are performed, in which 18 series are left complete, while variable quantities (10-70%) of the data of the other 140 series are removed. The results show that data gaps have a greater impact on the monthly root mean squared error (RMSE) than the annual RMSE and trend bias. When data with a large ratio of gaps is homogenised, the reduction of the upper 5% of the monthly RMSE is the least successful, but even there, the efficiency remains positive. In terms of reducing the annual RMSE and trend bias, the efficiency is 54-91%. The inclusion of short and incomplete series with sufficient spatial correlation in all cases improves the efficiency of homogenisation with ACMANTv3.
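The abstract reports efficiency as a percentage reduction in error but does not give the formula. One common reading, assumed here, is the relative reduction of RMSE against the known homogeneous truth: (RMSE_raw - RMSE_homogenised) / RMSE_raw. The sketch below uses that assumed definition with hypothetical toy series.

```python
from math import sqrt


def rmse(series, truth):
    """Root mean squared error of a series against the homogeneous truth."""
    return sqrt(sum((s - t) ** 2 for s, t in zip(series, truth)) / len(truth))


def efficiency(raw, homogenised, truth):
    """Relative RMSE reduction achieved by homogenisation (assumed definition)."""
    rmse_raw = rmse(raw, truth)
    return (rmse_raw - rmse(homogenised, truth)) / rmse_raw


# Hypothetical anomalies: the raw series carries a 2 K bias, and
# homogenisation removes half of it.
truth = [0.0, 0.0, 0.0, 0.0]
raw = [2.0, 2.0, 2.0, 2.0]
homogenised = [1.0, 1.0, 1.0, 1.0]

eff = efficiency(raw, homogenised, truth)
print(f"{eff:.0%}")
```

Under this definition a positive value means the homogenised series is closer to the truth than the raw series, and a negative value means homogenisation made it worse.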
Database Objects vs Files: Evaluation of alternative strategies for managing large remote sensing data

NASA Astrophysics Data System (ADS)

Baru, Chaitan; Nandigam, Viswanath; Krishnan, Sriram

2010-05-01

Increasingly, the geoscience user community expects modern IT capabilities to be available in service of their research and education activities, including the ability to easily access and process large remote sensing datasets via online portals such as GEON (www.geongrid.org) and OpenTopography (opentopography.org). However, serving such datasets via online data portals presents a number of challenges. In this talk, we will evaluate the pros and cons of alternative storage strategies for management and processing of such datasets using binary large object implementations (BLOBs) in database systems versus implementation in Hadoop files using the Hadoop Distributed File System (HDFS). The storage and I/O requirements for providing online access to large datasets dictate the need for declustering data across multiple disks, for capacity as well as bandwidth and response time performance. This requires partitioning larger files into a set of smaller files, and is accompanied by the concomitant requirement for managing large numbers of files. Storing these sub-files as blobs in a shared-nothing database implemented across a cluster provides the advantage that all the distributed storage management is done by the DBMS. Furthermore, subsetting and processing routines can be implemented as user-defined functions (UDFs) on these blobs and would run in parallel across the set of nodes in the cluster. On the other hand, there are both storage overheads and constraints, and software licensing dependencies created by such an implementation. Another approach is to store the files in an external filesystem with pointers to them from within database tables. The filesystem may be a regular UNIX filesystem, a parallel filesystem, or HDFS. In the HDFS case, HDFS would provide the file management capability, while the subsetting and processing routines would be implemented as Hadoop programs using the MapReduce model. Hadoop and its related software libraries are freely available. Another consideration is the strategy used for partitioning large data collections, and large datasets within collections, using round-robin vs hash partitioning vs range partitioning methods. Each has different characteristics in terms of spatial locality of data and resultant degree of declustering of the computations on the data. Furthermore, we have observed that, in practice, there can be large variations in the frequency of access to different parts of a large data collection and/or dataset, thereby creating "hotspots" in the data. We will evaluate the ability of different approaches for dealing effectively with such hotspots and alternative strategies for dealing with hotspots.
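The three partitioning methods named in this abstract differ in how they trade spatial locality against declustering, which a small sketch can make concrete. The tile keys, node count and range boundaries below are hypothetical, and CRC32 is used only as a convenient deterministic hash, not as anything the authors specify.

```python
import zlib
from bisect import bisect_right


def round_robin(keys, n_nodes):
    """Assign items to nodes in arrival order: even load, no spatial locality."""
    return [i % n_nodes for i, _ in enumerate(keys)]


def hash_partition(keys, n_nodes):
    """Assign by a hash of the key: lookup by key is direct, neighbours scatter."""
    return [zlib.crc32(str(k).encode()) % n_nodes for k in keys]


def range_partition(keys, boundaries):
    """Assign by key range: neighbours stay together, hotspots hit one node."""
    return [bisect_right(boundaries, k) for k in keys]


# Hypothetical tile row indices of one large raster, spread over 4 nodes.
tiles = list(range(8))

rr = round_robin(tiles, 4)
hp = hash_partition(tiles, 4)
rp = range_partition(tiles, [2, 4, 6])
print(rr, hp, rp)
```

With range partitioning a request for adjacent tiles touches a single node, which preserves locality but concentrates load when that region is a hotspot; round-robin and hash partitioning spread the same request over several nodes.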
Unsupervised classification of multivariate geostatistical data: Two algorithms

NASA Astrophysics Data System (ADS)

Romary, Thomas; Ors, Fabien; Rivoirard, Jacques; Deraisme, Jacques

2015-12-01

With the increasing development of remote sensing platforms and the evolution of sampling facilities in the mining and oil industries, spatial datasets are becoming increasingly large, informing a growing number of variables and covering wider and wider areas. Therefore, it is often necessary to split the domain of study to account for radically different behaviors of the natural phenomenon over the domain and to simplify the subsequent modeling step. The definition of these areas can be seen as a problem of unsupervised classification, or clustering, where we try to divide the domain into homogeneous domains with respect to the values taken by the variables at hand. The application of classical clustering methods, designed for independent observations, does not ensure the spatial coherence of the resulting classes. Image segmentation methods, based on e.g. Markov random fields, are not adapted to irregularly sampled data. Other existing approaches, based on mixtures of Gaussian random functions estimated via the expectation-maximization algorithm, are limited to reasonable sample sizes and a small number of variables. In this work, we propose two algorithms based on adaptations of classical algorithms to multivariate geostatistical data. Both algorithms are model free and can handle large volumes of multivariate, irregularly spaced data. The first one proceeds by agglomerative hierarchical clustering. The spatial coherence is ensured by a proximity condition imposed for two clusters to merge. This proximity condition relies on a graph organizing the data in the coordinates space. The hierarchical algorithm can then be seen as a graph-partitioning algorithm. Following this interpretation, a spatial version of the spectral clustering algorithm is also proposed. The performances of both algorithms are assessed on toy examples and a mining dataset.
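The first algorithm in this abstract, agglomerative clustering in which two clusters may merge only if they are neighbours in a graph built on the coordinates, can be sketched as below. Several choices are assumptions made for illustration and are not taken from the paper: the graph links points closer than a fixed radius, a single variable is used, and clusters are compared by the difference of their mean values.

```python
from math import dist  # available from Python 3.8


def constrained_clustering(coords, values, radius, n_clusters):
    """Merge graph-adjacent clusters with the closest means until n_clusters remain."""
    n = len(coords)
    # Proximity graph: an edge joins two points closer than `radius`.
    edges = {(i, j) for i in range(n) for j in range(i + 1, n)
             if dist(coords[i], coords[j]) <= radius}
    clusters = {i: [i] for i in range(n)}
    label = list(range(n))

    def mean(c):
        return sum(values[i] for i in clusters[c]) / len(clusters[c])

    while len(clusters) > n_clusters:
        # Only clusters linked by at least one graph edge are merge candidates.
        pairs = {tuple(sorted((label[i], label[j])))
                 for i, j in edges if label[i] != label[j]}
        if not pairs:
            break  # the graph is disconnected; no further merge is allowed
        a, b = min(pairs, key=lambda p: (abs(mean(p[0]) - mean(p[1])), p))
        for i in clusters[b]:
            label[i] = a
        clusters[a].extend(clusters.pop(b))
    return label


# Hypothetical transect: the last point resembles the first group in value
# but is not adjacent to it, so it stays in its own cluster.
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
values = [1.0, 1.1, 0.9, 5.0, 5.2, 1.0]

labels = constrained_clustering(coords, values, radius=1.0, n_clusters=3)
print(labels)
```

A classical clustering on the values alone would group the last point with the first three; the proximity condition is what keeps the resulting classes spatially coherent.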
An Evaluation of Data Fusion Products for the Analysis of Dryland Forest Phenology

NASA Astrophysics Data System (ADS)

Walker, J. J.; de Beurs, K.; Wynne, R. H.; Gao, F.

2010-12-01

Semi-arid forest areas cover a significant proportion of the world’s land surface; in the interior western U.S. alone, dryland forests extend across more than 56 million hectares. The scarcity of water in these systems makes them acutely sensitive to sustained weather fluctuations, such as the higher temperatures and altered water regimes predicted under most climate change scenarios. To understand, monitor, and predict the anticipated spatial and temporal changes in these areas, it is vital to characterize current phenological patterns. Phenological analysis of western U.S. drylands is complicated by patchy land cover and mosaics of plant phenology states at a variety of spatial scales. Our aim is to use complementary satellite sensors to mitigate these difficulties and gain greater insight into phenological patterns in dryland forests. In this study we applied the spatial and temporal adaptive reflectance model (STARFM; Gao et al. 2006) to fuse Landsat and MODIS imagery to create synthetic images at Landsat spatial resolution and MODIS temporal resolution. To determine which MODIS dataset is most appropriate for the creation of synthetic images intended for the analysis of dryland forest phenology, we examined the effect of temporal compositing and BRDF function adjustment on the accuracy of STARFM imagery. We assembled seven Landsat 5 scenes (path/row 37/36) and temporally-coincident 500m MODIS datasets (seven daily (MOD09GA), seven 8-day composite (MOD09A1), and fourteen 16-day nadir BRDF-adjusted composite (MCD43A4) images) spanning the 2006 April - October growing season in northern Arizona, which is characterized by large tracts of dryland forest. The STARFM algorithm was applied to each MODIS data series to produce four synthetic images (one daily; one 8-day composite; and two 16-day composites) corresponding to each Landsat image. 
Validation of the accuracy of the synthetic images was achieved by comparing the reflectance values of a random sample of the identified dryland forest pixels in both images. Preliminary data analysis of the effect of the temporal resolution and dataset parameters indicates that the MODIS 8-day composite image may be a suitable and sufficient dataset for phenological analysis in this dryland forest ecosystem. Overall, this work demonstrates the feasibility of using data fusion products to assemble an imagery dataset at sufficiently high temporal and spatial scales to permit a more detailed examination of the underlying phenological processes and trends in dryland forest areas.

Zebra Crossing Spotter: Automatic Population of Spatial Databases for Increased Safety of Blind Travelers

PubMed Central

Ahmetovic, Dragan; Manduchi, Roberto; Coughlan, James M.; Mascetti, Sergio

2016-01-01

In this paper we propose a computer vision-based technique that mines existing spatial image databases for discovery of zebra crosswalks in urban settings. Knowing the location of crosswalks is critical for a blind person planning a trip that includes street crossing. By augmenting existing spatial databases (such as Google Maps or OpenStreetMap) with this information, a blind traveler may make more informed routing decisions, resulting in greater safety during independent travel. Our algorithm first searches for zebra crosswalks in satellite images; all candidates thus found are validated against spatially registered Google Street View images. This cascaded approach enables fast and reliable discovery and localization of zebra crosswalks in large image datasets. While fully automatic, our algorithm could also be complemented by a final crowdsourcing validation stage for increased accuracy. PMID:26824080
A computationally efficient Bayesian sequential simulation approach for the assimilation of vast and diverse hydrogeophysical datasets

NASA Astrophysics Data System (ADS)

Nussbaumer, Raphaël; Gloaguen, Erwan; Mariéthoz, Grégoire; Holliger, Klaus

2016-04-01

Bayesian sequential simulation (BSS) is a powerful geostatistical technique, which notably has shown significant potential for the assimilation of datasets that are diverse with regard to the spatial resolution and their relationship. However, these types of applications of BSS require a large number of realizations to adequately explore the solution space and to assess the corresponding uncertainties. Moreover, such simulations generally need to be performed on very fine grids in order to adequately exploit the technique's potential for characterizing heterogeneous environments. Correspondingly, the computational cost of BSS algorithms in their classical form is very high, which so far has limited an effective application of this method to large models and/or vast datasets. In this context, it is also important to note that the inherent assumption regarding the independence of the considered datasets is generally regarded as being too strong in the context of sequential simulation. To alleviate these problems, we have revisited the classical implementation of BSS and incorporated two key features to increase the computational efficiency. The first feature is a combined quadrant spiral - superblock search, which targets run-time savings on large grids and adds flexibility with regard to the selection of neighboring points using equal directional sampling and treating hard data and previously simulated points separately. The second feature is a constant path of simulation, which enhances the efficiency for multiple realizations. We have also modified the aggregation operator to be more flexible with regard to the assumption of independence of the considered datasets. This is achieved through log-linear pooling, which essentially allows for attributing weights to the various data components. 
Finally, a multi-grid simulating path was created to enforce large-scale variance and to allow for adapting parameters, such as, for example, the log-linear weights or the type of simulation path at various scales. The newly implemented search method for kriging reduces the computational cost from an exponential dependence with regard to the grid size in the original algorithm to a linear relationship, as each neighboring search becomes independent from the grid size. For the considered examples, our results show a sevenfold reduction in run time for each additional realization when a constant simulation path is used. The traditional criticism that constant path techniques introduce a bias to the simulations was explored and our findings do indeed reveal a minor reduction in the diversity of the simulations. This bias can, however, be largely eliminated by changing the path type at different scales through the use of the multi-grid approach. Finally, we show that adapting the aggregation weight at each scale considered in our multi-grid approach allows for reproducing both the variogram and histogram, and the spatial trend of the underlying data.
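The log-linear pooling mentioned in this abstract can be illustrated with a minimal sketch. It assumes discrete probability vectors over the same set of classes and uses the common form in which the pooled probability is proportional to prior^(1 - sum of weights) times the product of each conditional raised to its weight. The function name, the `floor` guard and the example numbers are hypothetical and are not taken from the authors' implementation.

```python
import math


def log_linear_pool(prior, conditionals, weights, floor=1e-300):
    """Combine conditional distributions by weighted log-linear pooling.

    prior        : list of prior probabilities, one per class
    conditionals : list of distributions, one per data source
    weights      : one weight per data source (assumed, user-chosen)
    floor        : tiny value that guards against log(0) (assumption)
    """
    if len(conditionals) != len(weights):
        raise ValueError("need exactly one weight per conditional")
    prior_exponent = 1.0 - sum(weights)
    log_scores = []
    for k, p0 in enumerate(prior):
        score = prior_exponent * math.log(max(p0, floor))
        for cond, w in zip(conditionals, weights):
            score += w * math.log(max(cond[k], floor))
        log_scores.append(score)
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores)
    unnormalised = [math.exp(s - top) for s in log_scores]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


if __name__ == "__main__":
    prior = [0.5, 0.5]
    sources = [[0.8, 0.2], [0.6, 0.4]]
    # Weights of 1 treat the sources as conditionally independent;
    # smaller weights down-weight a source.
    print(log_linear_pool(prior, sources, [1.0, 1.0]))
    print(log_linear_pool(prior, sources, [1.0, 0.5]))
```

With all weights equal to 1 the rule reduces to the usual conditional-independence product, which is the assumption the abstract describes as too strong; lowering a weight relaxes it for that data source.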
The Human and Physical Determinants of Wildfires and Burnt Areas in Israel

NASA Astrophysics Data System (ADS)

Levin, Noam; Tessler, Naama; Smith, Andrew; McAlpine, Clive

2016-09-01

Wildfires are expected to increase in Mediterranean landscapes as a result of climate change and changes in land-use practices. In order to advance our understanding of human and physical factors shaping spatial patterns of wildfires in the region, we compared two independently generated datasets of wildfires for Israel that cover approximately the same study period. We generated a site-based dataset containing the location of 10,879 wildfires (1991-2011), and compared it to a dataset of burnt areas derived from MODIS imagery (2000-2011). We hypothesized that the physical and human factors explaining the spatial distribution of burnt areas derived from remote sensing (mostly large fires, >100 ha) will differ from those explaining site-based wildfires recorded by national agencies (mostly small fires, <10 ha). Small wildfires recorded by forestry agencies were concentrated within planted forests and near built-up areas, whereas the largest wildfires were located in more remote regions, often associated with military training areas and herbaceous vegetation. We conclude that to better understand wildfire dynamics, consolidation of wildfire databases should be achieved, combining field reports and remote sensing. As nearly all wildfires in Mediterranean landscapes are caused by human activities, improving the management of forest areas and raising public awareness to fire risk are key considerations in reducing fire danger.
The Human and Physical Determinants of Wildfires and Burnt Areas in Israel.

PubMed

Levin, Noam; Tessler, Naama; Smith, Andrew; McAlpine, Clive

2016-09-01

Wildfires are expected to increase in Mediterranean landscapes as a result of climate change and changes in land-use practices. In order to advance our understanding of human and physical factors shaping spatial patterns of wildfires in the region, we compared two independently generated datasets of wildfires for Israel that cover approximately the same study period. We generated a site-based dataset containing the location of 10,879 wildfires (1991-2011), and compared it to a dataset of burnt areas derived from MODIS imagery (2000-2011). We hypothesized that the physical and human factors explaining the spatial distribution of burnt areas derived from remote sensing (mostly large fires, >100 ha) will differ from those explaining site-based wildfires recorded by national agencies (mostly small fires, <10 ha). Small wildfires recorded by forestry agencies were concentrated within planted forests and near built-up areas, whereas the largest wildfires were located in more remote regions, often associated with military training areas and herbaceous vegetation. We conclude that to better understand wildfire dynamics, consolidation of wildfire databases should be achieved, combining field reports and remote sensing. As nearly all wildfires in Mediterranean landscapes are caused by human activities, improving the management of forest areas and raising public awareness to fire risk are key considerations in reducing fire danger.
A reanalysis dataset of the South China Sea.

PubMed

Zeng, Xuezhi; Peng, Shiqiu; Li, Zhijin; Qi, Yiquan; Chen, Rongyu

2014-01-01

Ocean reanalysis provides a temporally continuous and spatially gridded four-dimensional estimate of the ocean state for a better understanding of the ocean dynamics and its spatial/temporal variability. Here we present a 19-year (1992-2010) high-resolution ocean reanalysis dataset of the upper ocean in the South China Sea (SCS) produced from an ocean data assimilation system. A wide variety of observations, including in-situ temperature/salinity profiles, ship-measured and satellite-derived sea surface temperatures, and sea surface height anomalies from satellite altimetry, are assimilated into the outputs of an ocean general circulation model using a multi-scale incremental three-dimensional variational data assimilation scheme, yielding a daily high-resolution reanalysis dataset of the SCS. Comparisons between the reanalysis and independent observations support the reliability of the dataset. The presented dataset provides the research community of the SCS an important data source for studying the thermodynamic processes of the ocean circulation and meso-scale features in the SCS, including their spatial and temporal variability.
A reanalysis dataset of the South China Sea

PubMed Central

Zeng, Xuezhi; Peng, Shiqiu; Li, Zhijin; Qi, Yiquan; Chen, Rongyu

2014-01-01

Ocean reanalysis provides a temporally continuous and spatially gridded four-dimensional estimate of the ocean state for a better understanding of the ocean dynamics and its spatial/temporal variability. Here we present a 19-year (1992–2010) high-resolution ocean reanalysis dataset of the upper ocean in the South China Sea (SCS) produced from an ocean data assimilation system. A wide variety of observations, including in-situ temperature/salinity profiles, ship-measured and satellite-derived sea surface temperatures, and sea surface height anomalies from satellite altimetry, are assimilated into the outputs of an ocean general circulation model using a multi-scale incremental three-dimensional variational data assimilation scheme, yielding a daily high-resolution reanalysis dataset of the SCS. Comparisons between the reanalysis and independent observations support the reliability of the dataset. The presented dataset provides the research community of the SCS an important data source for studying the thermodynamic processes of the ocean circulation and meso-scale features in the SCS, including their spatial and temporal variability. PMID:25977803
A downscaled 1 km dataset of daily Greenland ice sheet surface mass balance components (1958-2014)

NASA Astrophysics Data System (ADS)

Noel, B.; Van De Berg, W. J.; Fettweis, X.; Machguth, H.; Howat, I. M.; van den Broeke, M. R.

2015-12-01

The current spatial resolution in regional climate models (RCMs), typically around 5 to 20 km, remains too coarse to accurately reproduce the spatial variability in surface mass balance (SMB) components over the narrow ablation zones, marginal outlet glaciers and neighbouring ice caps of the Greenland ice sheet (GrIS). In these topographically rough terrains, the SMB components are highly dependent on local variations in topography. However, the relatively low-resolution elevation and ice mask prescribed in RCMs contribute to a significant underestimation of melt and runoff in these regions due to unresolved valley glaciers and fjords. Therefore, near-km resolution topography is essential to better capture SMB variability in these spatially restricted regions. We present a 1 km resolution dataset of daily GrIS SMB covering the period 1958-2014, which is statistically downscaled from data of the polar regional climate model RACMO2.3 at 11 km, using an elevation dependence. The dataset includes all individual SMB components projected on the elevation and ice mask from the GIMP DEM, down-sampled to 1 km. Daily runoff and sublimation are interpolated to the 1 km topography using a local regression to elevation valid for each day specifically; daily precipitation is bi-linearly downscaled without elevation corrections. The daily SMB dataset is then reconstructed by summing downscaled precipitation, sublimation and runoff. High-resolution elevation and ice mask allow for properly resolving the narrow ablation zones and valley glaciers at the GrIS margins, leading to a significant increase in the runoff estimate. In these regions, and especially over narrow glacier tongues, the downscaled products improve on the original RACMO2.3 outputs by better representing local SMB patterns through a gradual ablation increase towards the GrIS margins.
We discuss the impact of downscaling on the SMB components in a case study for a spatially restricted region, where large elevation discrepancies are observed between both resolutions. Owing to generally enhanced runoff in the GrIS ablation zone, the evaluation of daily downscaled SMB against ablation measurements, collected at in-situ measuring sites derived from a newly compiled ablation dataset, shows a better agreement with observations relative to native RACMO2.3 SMB at 11 km.
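The elevation-dependent downscaling described in this abstract can be sketched, under simplifying assumptions, as a linear regression of a daily SMB component against elevation over neighbouring coarse cells, followed by a correction for the elevation difference between the coarse and the fine grid. The function names, the choice of neighbourhood and the numbers below are hypothetical; the abstract does not specify the exact regression procedure.

```python
def fit_elevation_gradient(elevations, values):
    """Least-squares slope of `values` against `elevations`.

    Both inputs describe neighbouring coarse-grid cells for one day.
    Returns 0.0 if all elevations are identical (no gradient can be fitted).
    """
    n = len(elevations)
    mean_z = sum(elevations) / n
    mean_v = sum(values) / n
    var_z = sum((z - mean_z) ** 2 for z in elevations)
    if var_z == 0:
        return 0.0
    cov = sum((z - mean_z) * (v - mean_v) for z, v in zip(elevations, values))
    return cov / var_z


def downscale_value(coarse_value, coarse_elev, fine_elev, gradient):
    """Shift a coarse-grid value to the elevation of a fine-grid cell."""
    return coarse_value + gradient * (fine_elev - coarse_elev)


if __name__ == "__main__":
    # Hypothetical neighbourhood: runoff decreases with elevation.
    coarse_elevations = [1000.0, 1200.0, 1400.0]   # metres
    coarse_runoff = [6.0, 4.0, 2.0]                # arbitrary units per day
    slope = fit_elevation_gradient(coarse_elevations, coarse_runoff)
    # A fine-grid cell lying 300 m below the coarse cell gets more runoff.
    print(downscale_value(4.0, 1200.0, 900.0, slope))
```

In this toy case the fine-grid cell sits in a valley that the coarse grid does not resolve, so the corrected runoff is higher than the coarse value, which mirrors the enhanced runoff the abstract reports at the ice sheet margins.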
ASSESSING THE ACCURACY OF NATIONAL LAND COVER DATASET AREA ESTIMATES AT MULTIPLE SPATIAL EXTENTS

EPA Science Inventory

Site specific accuracy assessments provide fine-scale evaluation of the thematic accuracy of land use/land cover (LULC) datasets; however, they provide little insight into LULC accuracy across varying spatial extents. Additionally, LULC data are typically used to describe lands...
How does spatial extent of fMRI datasets affect independent component analysis decomposition?

PubMed

Aragri, Adriana; Scarabino, Tommaso; Seifritz, Erich; Comani, Silvia; Cirillo, Sossio; Tedeschi, Gioacchino; Esposito, Fabrizio; Di Salle, Francesco

2006-09-01

Spatial independent component analysis (sICA) of functional magnetic resonance imaging (fMRI) time series can generate meaningful activation maps and associated descriptive signals, which are useful to evaluate datasets of the entire brain or selected portions of it. Besides computational implications, variations in the input dataset combined with the multivariate nature of ICA may lead to different spatial or temporal readouts of brain activation phenomena. By reducing and increasing a volume of interest (VOI), we applied sICA to different datasets from real activation experiments with multislice acquisition and single or multiple sensory-motor task-induced blood oxygenation level-dependent (BOLD) signal sources with different spatial and temporal structure. Using receiver operating characteristics (ROC) methodology for accuracy evaluation and multiple regression analysis as benchmark, we compared sICA decompositions of reduced and increased VOI fMRI time-series containing auditory, motor and hemifield visual activation occurring separately or simultaneously in time. Both approaches yielded valid results; however, the results of the increased VOI approach were spatially more accurate compared to the results of the decreased VOI approach. This is consistent with the capability of sICA to take advantage of extended samples of statistical observations and suggests that sICA is more powerful with extended rather than reduced VOI datasets to delineate brain activity. (c) 2006 Wiley-Liss, Inc.
Spatial and temporal air quality pattern recognition using environmetric techniques: a case study in Malaysia.

PubMed

Syed Abdul Mutalib, Sharifah Norsukhairin; Juahir, Hafizan; Azid, Azman; Mohd Sharif, Sharifah; Latif, Mohd Talib; Aris, Ahmad Zaharin; Zain, Sharifuddin M; Dominick, Doreena

2013-09-01

The objective of this study is to identify spatial and temporal patterns in the air quality at three selected Malaysian air monitoring stations based on an eleven-year database (January 2000-December 2010). Four statistical methods, Discriminant Analysis (DA), Hierarchical Agglomerative Cluster Analysis (HACA), Principal Component Analysis (PCA) and Artificial Neural Networks (ANNs), were selected to analyze the datasets of five air quality parameters, namely: SO2, NO2, O3, CO and particulate matter with a diameter size of below 10 μm (PM10). The three selected air monitoring stations share the characteristic of being located in highly urbanized areas and are surrounded by a number of industries. The DA results show that spatial characterizations allow successful discrimination between the three stations, while HACA shows the temporal pattern from the monthly and yearly factor analysis which correlates with severe haze episodes that have happened in this country at certain periods of time. The PCA results show that the major source of air pollution is mostly due to the combustion of fossil fuel in motor vehicles and industrial activities. The spatial pattern recognition (S-ANN) results show a better prediction performance in discriminating between the regions, with an excellent percentage of correct classification compared to DA. This study presents the necessity and usefulness of environmetric techniques for the interpretation of large datasets aiming to obtain better information about air quality patterns based on spatial and temporal characterizations at the selected air monitoring stations.
Cadastral Database Positional Accuracy Improvement

NASA Astrophysics Data System (ADS)

Hashim, N. M.; Omar, A. H.; Ramli, S. N. M.; Omar, K. M.; Din, N.

2017-10-01

Positional Accuracy Improvement (PAI) is the process of refining the geometry of features in a geospatial dataset to improve their actual position. This actual position relates to the absolute position in a specific coordinate system and the relation to neighboring features. With the growth of spatial technologies, especially Geographical Information Systems (GIS) and Global Navigation Satellite Systems (GNSS), a PAI campaign is inevitable, especially for legacy cadastral databases. Integration of a legacy dataset with a higher-accuracy dataset, such as GNSS observations, is a potential solution for improving the legacy dataset. However, merely integrating both datasets will distort the relative geometry. The improved dataset should be further treated to minimize inherent errors and to fit the new, accurate dataset. The main focus of this study is to describe a method of angular-based Least Square Adjustment (LSA) for the PAI process of a legacy dataset. The existing high-accuracy dataset known as the National Digital Cadastral Database (NDCDB) is then used as a benchmark to validate the results. It was found that the proposed technique is highly feasible for positional accuracy improvement of legacy spatial datasets.
Comparative analysis of 2D and 3D distance measurements to study spatial genome organization.

PubMed

Finn, Elizabeth H; Pegoraro, Gianluca; Shachar, Sigal; Misteli, Tom

2017-07-01

The spatial organization of genomes is non-random, cell-type specific, and has been linked to cellular function. The investigation of spatial organization has traditionally relied extensively on fluorescence microscopy. The validity of the imaging methods used to probe spatial genome organization often depends on the accuracy and precision of distance measurements. Imaging-based measurements may use either 2D datasets or 3D datasets, which include the z-axis information in image stacks. Here we compare the suitability of 2D vs 3D distance measurements in the analysis of various features of spatial genome organization. We find in general good agreement between 2D and 3D analysis, with higher convergence of measurements as the interrogated distance increases, especially in flat cells. Overall, 3D distance measurements are more accurate than 2D distances, but are also more susceptible to noise. In particular, z-stacks are prone to error due to imaging properties such as limited resolution along the z-axis and optical aberrations, and we also find significant deviations from unimodal distance distributions caused by low sampling frequency in z. These deviations are ameliorated by significantly higher sampling frequency in the z-direction. We conclude that 2D distances are preferred for comparative analyses between cells, but 3D distances are preferred when comparing to theoretical models in large samples of cells. In general and for practical purposes, 2D distance measurements are preferable for many applications of analysis of spatial genome organization. Published by Elsevier Inc.
Influence of spatial and temporal scales in identifying temperature extremes

NASA Astrophysics Data System (ADS)

van Eck, Christel M.; Friedlingstein, Pierre; Mulder, Vera L.; Regnier, Pierre A. G.

2016-04-01

Extreme heat events are becoming more frequent. Notable are severe heatwaves such as the European heatwave of 2003, the Russian heatwave of 2010 and the Australian heatwave of 2013. Surface temperature is attaining new maxima not only during the summer but also during the winter. The year 2015 is reported to be a record-breaking year for temperature in both summer and winter. These extreme temperatures are taking their human and environmental toll, emphasizing the need for an accurate method to define a heat extreme in order to fully understand the spatial and temporal spread of an extreme and its impact. This research aims to explore how the use of different spatial and temporal scales influences the identification of a heat extreme. For this purpose, two near-surface temperature datasets of different temporal and spatial scales are used. The first is the daily ERA-Interim dataset at 0.25 degree resolution with a time span of 32 years (1979-2010); the second is the daily Princeton Meteorological Forcing Dataset at 0.5 degree resolution with a time span of 63 years (1948-2010). A temperature is considered extremely anomalous when it surpasses the 90th, 95th, or 99th percentile threshold based on the aforementioned pre-processed datasets. The analysis is conducted on a global scale, dividing the world into the IPCC's so-called SREX regions developed for the analysis of extreme climate events. Pre-processing is done by detrending and/or subtracting the monthly climatology based on 32 years of data for both datasets and on 63 years of data for only the Princeton Meteorological Forcing Dataset. This results in 6 datasets of temperature anomalies, from which the location in time and space of the anomalous warm days is identified. Comparison of the differences between these 6 datasets in terms of absolute threshold temperatures for extremes and the temporal and spatial spread of the extreme anomalous warm days shows a dependence of the results on the datasets and methodology used.
This stresses the need for a careful selection of data and methodology when identifying heat extremes.
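The thresholding step in this abstract (subtract a monthly climatology, then flag values above a percentile of the anomalies) can be sketched as follows. This is an illustration under stated assumptions: it omits the detrending step, uses a linearly interpolated percentile, and all function names and sample values are hypothetical rather than taken from the study.

```python
from collections import defaultdict


def monthly_anomalies(months, temps):
    """Subtract each calendar month's mean temperature from its values."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for m, t in zip(months, temps):
        sums[m] += t
        counts[m] += 1
    climatology = {m: sums[m] / counts[m] for m in sums}
    return [t - climatology[m] for m, t in zip(months, temps)]


def percentile(values, q):
    """Percentile with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def extreme_days(months, temps, q=90.0):
    """Indices of days whose anomaly exceeds the q-th percentile."""
    anomalies = monthly_anomalies(months, temps)
    threshold = percentile(anomalies, q)
    return [i for i, a in enumerate(anomalies) if a > threshold]


if __name__ == "__main__":
    months = [1] * 5 + [7] * 5
    temps = [0.0, 1.0, 2.0, 3.0, 14.0, 20.0, 21.0, 22.0, 23.0, 24.0]
    # The warm January day is flagged even though every July day is hotter.
    print(extreme_days(months, temps, q=90.0))
```

Changing `q` from 90 to 95 or 99, or computing the climatology over a different record length, changes which days are flagged, which is the sensitivity the abstract emphasizes.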
VEMAP Phase 2 bioclimatic database. I. Gridded historical (20th century) climate for modeling ecosystem dynamics across the conterminous USA

USGS Publications Warehouse

Kittel, T.G.F.; Rosenbloom, N.A.; Royle, J. Andrew; Daly, Christopher; Gibson, W.P.; Fisher, H.H.; Thornton, P.; Yates, D.N.; Aulenbach, S.; Kaufman, C.; McKeown, R.; Bachelet, D.; Schimel, D.S.; Neilson, R.; Lenihan, J.; Drapek, R.; Ojima, D.S.; Parton, W.J.; Melillo, J.M.; Kicklighter, D.W.; Tian, H.; McGuire, A.D.; Sykes, M.T.; Smith, B.; Cowling, S.; Hickler, T.; Prentice, I.C.; Running, S.; Hibbard, K.A.; Post, W.M.; King, A.W.; Smith, T.; Rizzo, B.; Woodward, F.I.

2004-01-01

Analysis and simulation of biospheric responses to historical forcing require surface climate data that capture those aspects of climate that control ecological processes, including key spatial gradients and modes of temporal variability. We developed a multivariate, gridded historical climate dataset for the conterminous USA as a common input database for the Vegetation/Ecosystem Modeling and Analysis Project (VEMAP), a biogeochemical and dynamic vegetation model intercomparison. The dataset covers the period 1895-1993 on a 0.5° latitude/longitude grid. Climate is represented at both monthly and daily timesteps. Variables are: precipitation, minimum and maximum temperature, total incident solar radiation, daylight-period irradiance, vapor pressure, and daylight-period relative humidity. The dataset was derived from US Historical Climate Network (HCN), cooperative network, and snowpack telemetry (SNOTEL) monthly precipitation and mean minimum and maximum temperature station data. We employed techniques that rely on geostatistical and physical relationships to create the temporally and spatially complete dataset. We developed a local kriging prediction model to infill discontinuous and limited-length station records based on spatial autocorrelation structure of climate anomalies. A spatial interpolation model (PRISM) that accounts for physiographic controls was used to grid the infilled monthly station data. We implemented a stochastic weather generator (modified WGEN) to disaggregate the gridded monthly series to dailies. Radiation and humidity variables were estimated from the dailies using a physically-based empirical surface climate model (MTCLIM3). Derived datasets include a 100 yr model spin-up climate and a historical Palmer Drought Severity Index (PDSI) dataset. The VEMAP dataset exhibits statistically significant trends in temperature, precipitation, solar radiation, vapor pressure, and PDSI for US National Assessment regions.
The historical climate and companion datasets are available online at data archive centers. © Inter-Research 2004.
Determining Scale-dependent Patterns in Spatial and Temporal Datasets

NASA Astrophysics Data System (ADS)

Roy, A.; Perfect, E.; Mukerji, T.; Sylvester, L.

2016-12-01

Spatial and temporal datasets of interest to Earth scientists often contain plots of one variable against another, e.g., rainfall magnitude vs. time or fracture aperture vs. spacing. Such data, comprised of distributions of events along a transect / timeline along with their magnitudes, can display persistent or antipersistent trends, as well as random behavior, that may contain signatures of underlying physical processes. Lacunarity is a technique that was originally developed for multiscale analysis of data. In a recent study we showed that lacunarity can be used for revealing changes in scale-dependent patterns in fracture spacing data. Here we present a further improvement in our technique, with lacunarity applied to various non-binary datasets comprised of event spacings and magnitudes. We test our technique on a set of four synthetic datasets, three of which are based on an autoregressive model and have magnitudes at every point along the "timeline" thus representing antipersistent, persistent, and random trends. The fourth dataset is made up of five clusters of events, each containing a set of random magnitudes. The concept of lacunarity ratio, LR, is introduced; this is the lacunarity of a given dataset normalized to the lacunarity of its random counterpart. It is demonstrated that LR can successfully delineate scale-dependent changes in terms of antipersistence and persistence in the synthetic datasets. This technique is then applied to three different types of data: a hundred-year rainfall record from Knoxville, TN, USA, a set of varved sediments from Marca Shale, and a set of fracture aperture and spacing data from NE Mexico. While the rainfall data and varved sediments both appear to be persistent at small scales, at larger scales they both become random. On the other hand, the fracture data shows antipersistence at small scale (within cluster) and random behavior at large scales. 
Such differences in behavior with respect to scale-dependent changes in antipersistence to random, persistence to random, or otherwise, may be related to differences in the physicochemical properties and processes contributing to multiscale datasets.
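The lacunarity ratio (LR) defined in this abstract can be illustrated with a short sketch. It assumes the standard one-dimensional gliding-box estimator (second moment of the box masses divided by the squared first moment) and takes the "random counterpart" to be the average over random shuffles of the same values; both choices, along with the function names and the toy series, are assumptions and not necessarily the authors' procedure.

```python
import random


def lacunarity(series, box):
    """Gliding-box lacunarity of a non-negative 1D series at one box size."""
    n = len(series)
    if not 1 <= box <= n:
        raise ValueError("box size must lie between 1 and len(series)")
    masses = [sum(series[i:i + box]) for i in range(n - box + 1)]
    mean = sum(masses) / len(masses)
    if mean == 0:
        raise ValueError("series must contain at least one non-zero value")
    mean_sq = sum(m * m for m in masses) / len(masses)
    return mean_sq / (mean * mean)


def lacunarity_ratio(series, box, n_shuffles=200, seed=0):
    """Lacunarity of the series divided by that of shuffled copies.

    The shuffled copies stand in for the random counterpart (assumption).
    """
    rng = random.Random(seed)
    shuffled = list(series)
    total = 0.0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        total += lacunarity(shuffled, box)
    return lacunarity(series, box) / (total / n_shuffles)


if __name__ == "__main__":
    clustered = [1] * 10 + [0] * 30   # all events packed into one cluster
    for box in (2, 5, 10):
        print(box, round(lacunarity_ratio(clustered, box), 3))
```

Evaluating the ratio over a range of box sizes gives a curve in which departures from 1 show the scales at which the series is more (or less) clustered than its shuffled counterpart.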
A spline-based regression parameter set for creating customized DARTEL MRI brain templates from infancy to old age.

PubMed

Wilke, Marko

2018-02-01

This dataset contains the regression parameters derived by analyzing segmented brain MRI images (gray matter and white matter) from a large population of healthy subjects, using a multivariate adaptive regression splines approach. A total of 1919 MRI datasets ranging in age from 1-75 years from four publicly available datasets (NIH, C-MIND, fCONN, and IXI) were segmented using the CAT12 segmentation framework, writing out gray matter and white matter images normalized using an affine-only spatial normalization approach. These images were then subjected to a six-step DARTEL procedure, employing an iterative non-linear registration approach and yielding increasingly crisp intermediate images. The resulting six datasets per tissue class were then analyzed using multivariate adaptive regression splines, using the CerebroMatic toolbox. This approach allows for flexibly modelling smoothly varying trajectories while taking into account demographic (age, gender) as well as technical (field strength, data quality) predictors. The resulting regression parameters described here can be used to generate matched DARTEL or SHOOT templates for a given population under study, from infancy to old age. The dataset and the algorithm used to generate it are publicly available at https://irc.cchmc.org/software/cerebromatic.php.
Spatial aspects of building and population exposure data and their implications for global earthquake exposure modeling

USGS Publications Warehouse

Dell’Acqua, F.; Gamba, P.; Jaiswal, K.

2012-01-01

This paper discusses spatial aspects of the global exposure dataset and mapping needs for earthquake risk assessment. We discuss this in the context of development of a Global Exposure Database for the Global Earthquake Model (GED4GEM), which requires compilation of a multi-scale inventory of assets at risk, for example, buildings, populations, and economic exposure. After defining the relevant spatial and geographic scales of interest, different procedures are proposed to disaggregate coarse-resolution data, to map them, and if necessary to infer missing data by using proxies. We discuss the advantages and limitations of these methodologies and detail the potentials of utilizing remote-sensing data. The latter is used especially to homogenize an existing coarser dataset and, where possible, replace it with detailed information extracted from remote sensing using the built-up indicators for different environments. Present research shows that the spatial aspects of earthquake risk computation are tightly connected with the availability of datasets of the resolution necessary for producing sufficiently detailed exposure. The global exposure database designed by the GED4GEM project is able to manage datasets and queries of multiple spatial scales.
Natural image sequences constrain dynamic receptive fields and imply a sparse code.

PubMed

Häusler, Chris; Susemihl, Alex; Nawrot, Martin P

2013-11-06

In their natural environment, animals experience a complex and dynamic visual scenery. Under such natural stimulus conditions, neurons in the visual cortex employ a spatially and temporally sparse code. For the input scenario of natural still images, previous work demonstrated that unsupervised feature learning combined with the constraint of sparse coding can predict physiologically measured receptive fields of simple cells in the primary visual cortex. This convincingly indicated that the mammalian visual system is adapted to the natural spatial input statistics. Here, we extend this approach to the time domain in order to predict dynamic receptive fields that can account for both spatial and temporal sparse activation in biological neurons. We rely on temporal restricted Boltzmann machines and suggest a novel temporal autoencoding training procedure. When tested on a dynamic multi-variate benchmark dataset this method outperformed existing models of this class. Learning features on a large dataset of natural movies allowed us to model spatio-temporal receptive fields for single neurons. They resemble temporally smooth transformations of previously obtained static receptive fields and are thus consistent with existing theories. A neuronal spike response model demonstrates how the dynamic receptive field facilitates temporal and population sparseness. We discuss the potential mechanisms and benefits of a spatially and temporally sparse representation of natural visual input. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.
Multivariate Non-Symmetric Stochastic Models for Spatial Dependence Models

NASA Astrophysics Data System (ADS)

Haslauer, C. P.; Bárdossy, A.

2017-12-01

A copula-based multivariate framework allows more flexibility to describe different kinds of dependence than is possible with models that rely on the confining assumption of symmetric Gaussian models: different quantiles can be modelled with different degrees of dependence, and it will be demonstrated how this can be expected given process understanding. Maximum-likelihood-based multivariate quantitative parameter estimation yields stable and reliable results. Not only are improved results obtained in cross-validation-based measures of uncertainty, but also a more realistic spatial structure of uncertainty compared to second-order models of dependence. As much information as is available is included in the parameter estimation: incorporating censored measurements (e.g., below the detection limit, or above the sensitive range of the measurement device) yields more realistic spatial models; the proportion of true zeros can be jointly estimated with and distinguished from censored measurements, which allows estimates of the age of a contaminant in the system; and secondary information (categorical and on the rational scale) has been used to improve the estimation of the primary variable. These copula-based multivariate statistical techniques are demonstrated based on hydraulic conductivity observations at the Borden (Canada) site, the MADE site (USA), and a large regional groundwater quality dataset in south-west Germany. Fields of spatially distributed K were simulated with identical marginal simulation, identical second-order spatial moments, yet substantially differing solute transport characteristics when numerical tracer tests were performed. A statistical methodology is shown that allows the delineation of a boundary layer separating homogeneous parts of a spatial dataset. The effects of this boundary layer (macro structure) and the spatial dependence of K (micro structure) on solute transport behaviour are shown.
Heuristics for Relevancy Ranking of Earth Dataset Search Results

NASA Astrophysics Data System (ADS)

Lynnes, C.; Quinn, P.; Norton, J.

2016-12-01

As the Variety of Earth science datasets increases, science researchers find it more challenging to discover and select the datasets that best fit their needs. The most common way for search providers to address this problem is to rank the datasets returned for a query by their likely relevance to the user. Large web page search engines typically use text matching supplemented with reverse link counts, semantic annotations and user intent modeling. However, this produces uneven results when applied to dataset metadata records simply externalized as a web page. Fortunately, data and search providers have decades of experience in serving data user communities, allowing them to form heuristics that leverage the structure in the metadata together with knowledge about the user community. Some of these heuristics include specific ways of matching the user input to the essential measurements in the dataset and determining overlaps of time range and spatial areas. Heuristics based on the novelty of the datasets can prioritize later, better versions of data over similar predecessors. And knowledge of how different user types and communities use data can be brought to bear in cases where characteristics of the user (discipline, expertise) or their intent (applications, research) can be divined. The Earth Observing System Data and Information System has begun implementing some of these heuristics in the relevancy algorithm of its Common Metadata Repository search engine.
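The overlap heuristics mentioned in this abstract can be illustrated with a minimal sketch. It is not the Common Metadata Repository algorithm, which the abstract does not specify; the function names, the dictionary layout of a dataset record, and the equal 0.5/0.5 weighting are all assumptions made for illustration.

```python
def interval_overlap(q_start, q_end, d_start, d_end):
    """Fraction of the query interval [q_start, q_end] covered by [d_start, d_end]."""
    length = q_end - q_start
    if length <= 0:
        return 0.0
    covered = min(q_end, d_end) - max(q_start, d_start)
    return max(0.0, covered) / length


def bbox_overlap(query_box, data_box):
    """Fraction of the query box covered by the dataset box.

    Boxes are (west, south, east, north) in degrees; boxes crossing the
    antimeridian are not handled in this sketch.
    """
    width = interval_overlap(query_box[0], query_box[2], data_box[0], data_box[2])
    height = interval_overlap(query_box[1], query_box[3], data_box[1], data_box[3])
    return width * height


def relevance(query_years, query_box, dataset):
    """Hypothetical score: equal-weight mix of temporal and spatial overlap."""
    t = interval_overlap(*query_years, *dataset["years"])
    s = bbox_overlap(query_box, dataset["box"])
    return 0.5 * t + 0.5 * s


# Made-up records: one global dataset with partial time coverage,
# one regional dataset with full time coverage.
global_ds = {"years": (1995, 2005), "box": (-180, -90, 180, 90)}
regional_ds = {"years": (2000, 2020), "box": (0, 50, 10, 60)}
query_years, query_box = (2000, 2010), (-10, 40, 10, 60)

score_global = relevance(query_years, query_box, global_ds)      # 0.5*0.5 + 0.5*1.0
score_regional = relevance(query_years, query_box, regional_ds)  # 0.5*1.0 + 0.5*0.25
```

Normalizing by the query extent rather than the dataset extent rewards datasets that cover what the user asked for; a production ranker would presumably combine this with the measurement-matching and novelty heuristics described above.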

Heuristics for Relevancy Ranking of Earth Dataset Search Results

NASA Technical Reports Server (NTRS)

Lynnes, Christopher; Quinn, Patrick; Norton, James

2016-01-01

As the Variety of Earth science datasets increases, science researchers find it more challenging to discover and select the datasets that best fit their needs. The most common way for search providers to address this problem is to rank the datasets returned for a query by their likely relevance to the user. Large web page search engines typically use text matching supplemented with reverse link counts, semantic annotations and user intent modeling. However, this produces uneven results when applied to dataset metadata records simply externalized as a web page. Fortunately, data and search providers have decades of experience in serving data user communities, allowing them to form heuristics that leverage the structure in the metadata together with knowledge about the user community. Some of these heuristics include specific ways of matching the user input to the essential measurements in the dataset and determining overlaps of time range and spatial areas. Heuristics based on the novelty of the datasets can prioritize later, better versions of data over similar predecessors. And knowledge of how different user types and communities use data can be brought to bear in cases where characteristics of the user (discipline, expertise) or their intent (applications, research) can be divined. The Earth Observing System Data and Information System has begun implementing some of these heuristics in the relevancy algorithm of its Common Metadata Repository search engine.
Relevancy Ranking of Satellite Dataset Search Results

NASA Technical Reports Server (NTRS)

Lynnes, Christopher; Quinn, Patrick; Norton, James

2017-01-01

As the Variety of Earth science datasets increases, science researchers find it more challenging to discover and select the datasets that best fit their needs. The most common way for search providers to address this problem is to rank the datasets returned for a query by their likely relevance to the user. Large web page search engines typically use text matching supplemented with reverse link counts, semantic annotations and user intent modeling. However, this produces uneven results when applied to dataset metadata records simply externalized as a web page. Fortunately, data and search providers have decades of experience in serving data user communities, allowing them to form heuristics that leverage the structure in the metadata together with knowledge about the user community. Some of these heuristics include specific ways of matching the user input to the essential measurements in the dataset and determining overlaps of time range and spatial areas. Heuristics based on the novelty of the datasets can prioritize later, better versions of data over similar predecessors. And knowledge of how different user types and communities use data can be brought to bear in cases where characteristics of the user (discipline, expertise) or their intent (applications, research) can be divined. The Earth Observing System Data and Information System has begun implementing some of these heuristics in the relevancy algorithm of its Common Metadata Repository search engine.
A time series of urban extent in China using DSMP/OLS nighttime light data

PubMed Central

Chen, Dongsheng; Chen, Le; Wang, Huan; Guan, Qingfeng

2018-01-01

Urban extent data play an important role in urban management and urban studies, such as monitoring the process of urbanization and changes in the spatial configuration of urban areas. Traditional methods of extracting urban-extent information are primarily based on manual investigations and classifications using remote sensing images, and these methods suffer from high labor and time costs and low precision. This study proposes an improved, simplified and flexible method for extracting urban extents over multiple scales and the construction of spatiotemporal models using DMSP/OLS nighttime light (NTL) for practical situations. This method eliminates the regional temporal and spatial inconsistency of thresholding NTL in large-scale and multi-temporal scenes. Using this method, we have extracted the urban extents and calculated the corresponding areas on the county, municipal and provincial scales in China from 2000 to 2012. In addition, validation against reference data shows that the overall accuracy (OA), Kappa and F1 scores were 0.996, 0.793, and 0.782, respectively. We increased the spatial resolution of the urban extent to 500 m (approximately four times finer than the results of previous studies). Based on the urban extent dataset proposed above, we analyzed changes in urban extents over time and observed that urban sprawl has grown in all of the counties of China. We also identified three patterns of urban sprawl: Early Urban Growth, Constant Urban Growth and Recent Urban Growth. In addition, these trends of urban sprawl are consistent with the western, eastern and central cities of China, respectively, in terms of their spatial distribution, socioeconomic characteristics and historical background. Additionally, the urban extents display the spatial configurations of urban areas intuitively. 
The proposed urban extent dataset is available for download and can provide reference data and support for future studies of urbanization and urban planning. PMID:29795685
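The three validation scores reported above (OA, Kappa, F1) can all be derived from a binary confusion matrix. The sketch below uses the standard definitions with made-up counts, not the study's validation data, and it assumes a non-degenerate matrix (no zero denominators). It shows why overall accuracy can be very high while Kappa and F1 are noticeably lower when the urban class is rare.

```python
def binary_scores(tp, fp, fn, tn):
    """Overall accuracy, Cohen's kappa and F1 for a binary (urban / non-urban) map."""
    n = tp + fp + fn + tn
    oa = (tp + tn) / n
    # Agreement expected by chance, from the row and column marginals.
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (oa - pe) / (1 - pe)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return oa, kappa, f1


# Hypothetical counts: urban pixels are only 5 % of the validation sample.
oa, kappa, f1 = binary_scores(tp=40, fp=10, fn=10, tn=940)
```

With these counts the overall accuracy is 0.98, while Kappa is about 0.79 and F1 is 0.80, because the large number of correctly classified non-urban pixels dominates OA but not the other two scores.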
A high resolution spatial population database of Somalia for disease risk mapping.

PubMed

Linard, Catherine; Alegana, Victor A; Noor, Abdisalan M; Snow, Robert W; Tatem, Andrew J

2010-09-14

Millions of Somali have been deprived of basic health services due to the unstable political situation of their country. Attempts are being made to reconstruct the health sector, in particular to estimate the extent of infectious disease burden. However, any approach that requires the use of modelled disease rates requires reasonable information on population distribution. In a low-income country such as Somalia, population data are lacking, are of poor quality, or become outdated rapidly. Modelling methods are therefore needed for the production of contemporary and spatially detailed population data. Here land cover information derived from satellite imagery and existing settlement point datasets were used for the spatial reallocation of populations within census units. We used simple and semi-automated methods that can be implemented with free image processing software to produce an easily updatable gridded population dataset at 100 × 100 meters spatial resolution. The 2010 population dataset was matched to administrative population totals projected by the UN. Comparison tests between the new dataset and existing population datasets revealed important differences in population size distributions, and in population at risk of malaria estimates. These differences are particularly important in more densely populated areas and strongly depend on the settlement data used in the modelling approach. The results show that it is possible to produce detailed, contemporary and easily updatable settlement and population distribution datasets of Somalia using existing data. The 2010 population dataset produced is freely available as a product of the AfriPop Project and can be downloaded from: http://www.afripop.org.
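The spatial reallocation step described in this abstract can be sketched as a weighted redistribution of each census unit's total across its grid cells. This is a minimal illustration, not the AfriPop workflow: the per-cell weights (which in practice would come from land cover classes and settlement points) and the uniform fallback are assumptions.

```python
def reallocate(census_total, cell_weights):
    """Distribute one census unit's population across its grid cells.

    cell_weights holds one non-negative weight per cell, for example a
    hypothetical land-cover-based population density weight.
    """
    total_weight = sum(cell_weights)
    if total_weight == 0:
        # Assumption: with no information, spread the population uniformly.
        return [census_total / len(cell_weights)] * len(cell_weights)
    return [census_total * w / total_weight for w in cell_weights]


# Made-up unit of 1000 people with four cells: bare land, cropland,
# rural settlement, urban (weights are illustrative only).
cells = reallocate(1000, [0, 1, 4, 5])
```

Because the weights are normalized within each unit, the gridded values always sum back to the census total, which is what allows the dataset to be matched to administrative population totals.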
A high resolution spatial population database of Somalia for disease risk mapping

PubMed Central

2010-01-01

Background: Millions of Somali have been deprived of basic health services due to the unstable political situation of their country. Attempts are being made to reconstruct the health sector, in particular to estimate the extent of infectious disease burden. However, any approach that requires the use of modelled disease rates requires reasonable information on population distribution. In a low-income country such as Somalia, population data are lacking, are of poor quality, or become outdated rapidly. Modelling methods are therefore needed for the production of contemporary and spatially detailed population data. Results: Here land cover information derived from satellite imagery and existing settlement point datasets were used for the spatial reallocation of populations within census units. We used simple and semi-automated methods that can be implemented with free image processing software to produce an easily updatable gridded population dataset at 100 × 100 meters spatial resolution. The 2010 population dataset was matched to administrative population totals projected by the UN. Comparison tests between the new dataset and existing population datasets revealed important differences in population size distributions, and in population at risk of malaria estimates. These differences are particularly important in more densely populated areas and strongly depend on the settlement data used in the modelling approach. Conclusions: The results show that it is possible to produce detailed, contemporary and easily updatable settlement and population distribution datasets of Somalia using existing data. The 2010 population dataset produced is freely available as a product of the AfriPop Project and can be downloaded from: http://www.afripop.org. PMID:20840751
Secure access control and large scale robust representation for online multimedia event detection.

PubMed

Liu, Changyu; Lu, Bin; Li, Huiling

2014-01-01

We developed an online multimedia event detection (MED) system. However, integrating traditional event detection algorithms into the online environment raises two issues: secure access control and large-scale robust representation. For the first issue, we proposed a tree proxy-based and service-oriented access control (TPSAC) model based on the traditional role-based access control model. Verification experiments were conducted on the CloudSim simulation platform, and the results showed that the TPSAC model is suitable for the access control of dynamic online environments. For the second issue, inspired by the object-bank scene descriptor, we proposed a 1000-object-bank (1000OBK) event descriptor. Feature vectors of the 1000OBK were extracted from response pyramids of 1000 generic object detectors which were trained on standard annotated image datasets, such as the ImageNet dataset. A spatial bag-of-words tiling approach was then adopted to encode these feature vectors for bridging the gap between the objects and events. Furthermore, we performed experiments in the context of event classification on the challenging TRECVID MED 2012 dataset, and the results showed that the robust 1000OBK event descriptor outperforms the state-of-the-art approaches.
Precipitation intercomparison of a set of satellite- and raingauge-derived datasets, ERA Interim reanalysis, and a single WRF regional climate simulation over Europe and the North Atlantic

NASA Astrophysics Data System (ADS)

Skok, Gregor; Žagar, Nedjeljka; Honzak, Luka; Žabkar, Rahela; Rakovec, Jože; Ceglar, Andrej

2016-01-01

The study presents a precipitation intercomparison based on two satellite-derived datasets (TRMM 3B42, CMORPH), four raingauge-based datasets (GPCC, E-OBS, Willmott & Matsuura, CRU), ERA Interim reanalysis (ERAInt), and a single climate simulation using the WRF model. The comparison was performed for a domain encompassing parts of Europe and the North Atlantic over the 11-year period of 2000-2010. The four raingauge-based datasets are similar to the TRMM dataset with biases over Europe ranging from -7 % to +4 %. The spread among the raingauge-based datasets is relatively small over most of Europe, although areas with greater uncertainty (more than 30 %) exist, especially near the Alps and other mountainous regions. There are distinct differences between the datasets over the European land area and the Atlantic Ocean in comparison to the TRMM dataset. ERAInt has a small dry bias over the land; the WRF simulation has a large wet bias (+30 %), whereas CMORPH is characterized by a large and spatially consistent dry bias (-21 %). Over the ocean, both ERAInt and CMORPH have a small wet bias (+8 %) while the wet bias in WRF is significantly larger (+47 %). ERAInt has the highest frequency of low-intensity precipitation while the frequency of high-intensity precipitation is the lowest due to its lower native resolution. Both satellite-derived datasets have more low-intensity precipitation over the ocean than over the land, while the frequency of higher-intensity precipitation is similar or larger over the land. This result is likely related to orography, which triggers more intense convective precipitation, while the Atlantic Ocean is characterized by more homogenous large-scale precipitation systems which are associated with larger areas of lower intensity precipitation. However, this is not observed in ERAInt and WRF, indicating the insufficient representation of convective processes in the models. 
Finally, the Fraction Skill Score confirmed that both models perform better over the Atlantic Ocean with ERAInt outperforming the WRF at low thresholds and WRF outperforming ERAInt at higher thresholds. The diurnal cycle is simulated better in the WRF simulation than in ERAInt, although WRF could not reproduce well the amplitude of the diurnal cycle. While the evaluation of the WRF model confirms earlier findings related to the model's wet bias over European land, the applied satellite-derived precipitation datasets revealed differences between the land and ocean areas along with uncertainties in the observation datasets.
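The Fraction Skill Score used in this comparison can be sketched as follows. The code follows the usual definition (one minus the mean squared difference of neighbourhood event fractions, normalized by its worst-case value); the threshold, the neighbourhood half-width and the choice to truncate windows at the grid edge are assumptions, since the abstract does not state the settings used.

```python
def fractions(field, threshold, half_width):
    """Fraction of cells >= threshold in a square window around each grid cell."""
    rows, cols = len(field), len(field[0])
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            hits = cells = 0
            # Windows are truncated at the grid edge (an assumption;
            # zero-padding is another common choice).
            for a in range(max(0, i - half_width), min(rows, i + half_width + 1)):
                for b in range(max(0, j - half_width), min(cols, j + half_width + 1)):
                    cells += 1
                    hits += field[a][b] >= threshold
            row.append(hits / cells)
        out.append(row)
    return out


def fss(forecast, observed, threshold, half_width):
    """Fraction Skill Score: 1 is a perfect match, 0 is no skill."""
    pf = fractions(forecast, threshold, half_width)
    po = fractions(observed, threshold, half_width)
    num = sum((f - o) ** 2 for rf, ro in zip(pf, po) for f, o in zip(rf, ro))
    den = sum(f * f + o * o for rf, ro in zip(pf, po) for f, o in zip(rf, ro))
    return 1.0 if den == 0 else 1.0 - num / den


# Made-up 3 x 3 fields (mm/day): the model places the rain two cells too far east.
observed = [[0, 0, 0], [12, 0, 0], [0, 0, 0]]
forecast = [[0, 0, 0], [0, 0, 12], [0, 0, 0]]

fss_point = fss(forecast, observed, threshold=10, half_width=0)  # grid-point match only
fss_neigh = fss(forecast, observed, threshold=10, half_width=1)  # 3 x 3 neighbourhood
```

At grid-point scale the displaced event scores zero, while the neighbourhood version gives partial credit, which is why the score is typically reported across several thresholds and spatial scales.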
Nanocubes for real-time exploration of spatiotemporal datasets.

PubMed

Lins, Lauro; Klosowski, James T; Scheidegger, Carlos

2013-12-01

Consider real-time exploration of large multidimensional spatiotemporal datasets with billions of entries, each defined by a location, a time, and other attributes. Are certain attributes correlated spatially or temporally? Are there trends or outliers in the data? Answering these questions requires aggregation over arbitrary regions of the domain and attributes of the data. Many relational databases implement the well-known data cube aggregation operation, which in a sense precomputes every possible aggregate query over the database. Data cubes are sometimes assumed to take a prohibitively large amount of space, and to consequently require disk storage. In contrast, we show how to construct a data cube that fits in a modern laptop's main memory, even for billions of entries; we call this data structure a nanocube. We present algorithms to compute and query a nanocube, and show how it can be used to generate well-known visual encodings such as heatmaps, histograms, and parallel coordinate plots. When compared to exact visualizations created by scanning an entire dataset, nanocube plots have bounded screen error across a variety of scales, thanks to a hierarchical structure in space and time. We demonstrate the effectiveness of our technique on a variety of real-world datasets, and present memory, timing, and network bandwidth measurements. We find that the timings for the queries in our examples are dominated by network and user-interaction latencies.
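The idea of answering aggregate queries from precomputed counts over a spatial hierarchy and time bins can be illustrated with a toy example. This is not the nanocube data structure, whose compact shared-node layout is the paper's contribution; it is a plain dictionary of counts per quadtree cell and time bin, with names and the unit-square coordinate convention chosen for illustration.

```python
from collections import Counter


def build_cube(events, max_level):
    """Precompute event counts per (quadtree level, cell x, cell y, time bin).

    events is an iterable of (x, y, t_bin) with x and y in [0, 1).
    """
    cube = Counter()
    for x, y, t in events:
        for level in range(max_level + 1):
            n = 1 << level  # number of cells per side at this level
            cube[(level, int(x * n), int(y * n), t)] += 1
    return cube


def count(cube, level, ix, iy, t_bins):
    """Number of events in one spatial cell, summed over the given time bins."""
    return sum(cube[(level, ix, iy, t)] for t in t_bins)


# Four made-up events in the unit square, in time bins 0 and 1.
events = [(0.1, 0.1, 0), (0.2, 0.3, 0), (0.8, 0.9, 1), (0.6, 0.1, 1)]
cube = build_cube(events, max_level=2)

total = count(cube, 0, 0, 0, [0, 1])      # whole domain, both time bins
south_west = count(cube, 1, 0, 0, [0])    # level-1 cell (0, 0), time bin 0
```

A query at any zoom level becomes a handful of dictionary lookups instead of a scan of the raw events; the memory cost of storing every level naively is exactly what a more compact structure has to address.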
Spatiotemporal dataset on Chinese population distribution and its driving factors from 1949 to 2013.

PubMed

Wang, Lizhe; Chen, Lajiao

2016-07-05

Spatio-temporal data on human population and its driving factors is critical to understanding and responding to population problems. Unfortunately, such spatio-temporal data on a large scale and over the long term are often difficult to obtain. Here, we present a dataset on Chinese population distribution and its driving factors over a remarkably long period, from 1949 to 2013. Driving factors of population distribution were selected according to the push-pull migration laws, which were summarized into four categories: natural environment, natural resources, economic factors and social factors. Natural environment and natural resources indicators were calculated using Geographic Information System (GIS) and Remote Sensing (RS) techniques, whereas economic and social factors from 1949 to 2013 were collected from the China Statistical Yearbook and China Compendium of Statistics from 1949 to 2008. All of the data were quality controlled and unified into an identical dataset with the same spatial scope and time period. The dataset is expected to be useful for understanding how population responds to and impacts environmental change.
Spatiotemporal dataset on Chinese population distribution and its driving factors from 1949 to 2013

NASA Astrophysics Data System (ADS)

Wang, Lizhe; Chen, Lajiao

2016-07-01

Spatio-temporal data on human population and its driving factors is critical to understanding and responding to population problems. Unfortunately, such spatio-temporal data on a large scale and over the long term are often difficult to obtain. Here, we present a dataset on Chinese population distribution and its driving factors over a remarkably long period, from 1949 to 2013. Driving factors of population distribution were selected according to the push-pull migration laws, which were summarized into four categories: natural environment, natural resources, economic factors and social factors. Natural environment and natural resources indicators were calculated using Geographic Information System (GIS) and Remote Sensing (RS) techniques, whereas economic and social factors from 1949 to 2013 were collected from the China Statistical Yearbook and China Compendium of Statistics from 1949 to 2008. All of the data were quality controlled and unified into an identical dataset with the same spatial scope and time period. The dataset is expected to be useful for understanding how population responds to and impacts environmental change.
Locally Downscaled and Spatially Customizable Climate Data for Historical and Future Periods for North America

PubMed Central

Wang, Tongli; Hamann, Andreas; Spittlehouse, Dave; Carroll, Carlos

2016-01-01

Large volumes of gridded climate data have become available in recent years including interpolated historical data from weather stations and future predictions from general circulation models. These datasets, however, are at various spatial resolutions that need to be converted to scales meaningful for applications such as climate change risk and impact assessments or sample-based ecological research. Extracting climate data for specific locations from large datasets is not a trivial task and typically requires advanced GIS and data management skills. In this study, we developed a software package, ClimateNA, that facilitates this task and provides a user-friendly interface suitable for resource managers and decision makers as well as scientists. The software locally downscales historical and future monthly climate data layers into scale-free point estimates of climate values for the entire North American continent. The software also calculates a large number of biologically relevant climate variables that are usually derived from daily weather data. ClimateNA covers 1) 104 years of historical data (1901–2014) in monthly, annual, decadal and 30-year time steps; 2) three paleoclimatic periods (Last Glacial Maximum, Mid Holocene and Last Millennium); 3) three future periods (2020s, 2050s and 2080s); and 4) annual time-series of model projections for 2011–2100. Multiple general circulation models (GCMs) were included for both paleo and future periods, and two representative concentration pathways (RCP4.5 and 8.5) were chosen for future climate data. PMID:27275583
Locally Downscaled and Spatially Customizable Climate Data for Historical and Future Periods for North America.

PubMed

Wang, Tongli; Hamann, Andreas; Spittlehouse, Dave; Carroll, Carlos

2016-01-01

Large volumes of gridded climate data have become available in recent years including interpolated historical data from weather stations and future predictions from general circulation models. These datasets, however, are at various spatial resolutions that need to be converted to scales meaningful for applications such as climate change risk and impact assessments or sample-based ecological research. Extracting climate data for specific locations from large datasets is not a trivial task and typically requires advanced GIS and data management skills. In this study, we developed a software package, ClimateNA, that facilitates this task and provides a user-friendly interface suitable for resource managers and decision makers as well as scientists. The software locally downscales historical and future monthly climate data layers into scale-free point estimates of climate values for the entire North American continent. The software also calculates a large number of biologically relevant climate variables that are usually derived from daily weather data. ClimateNA covers 1) 104 years of historical data (1901-2014) in monthly, annual, decadal and 30-year time steps; 2) three paleoclimatic periods (Last Glacial Maximum, Mid Holocene and Last Millennium); 3) three future periods (2020s, 2050s and 2080s); and 4) annual time-series of model projections for 2011-2100. Multiple general circulation models (GCMs) were included for both paleo and future periods, and two representative concentration pathways (RCP4.5 and 8.5) were chosen for future climate data.
Australian snowpack in the NARCliM ensemble: evaluation, bias correction and future projections

NASA Astrophysics Data System (ADS)

Luca, Alejandro Di; Evans, Jason P.; Ji, Fei

2017-10-01

In this study we evaluate the ability of an ensemble of high-resolution Regional Climate Model simulations to represent snow cover characteristics over the Australian Alps and go on to assess future projections of snowpack characteristics. Our results show that the ensemble presents a cold temperature bias and overestimates total precipitation, leading to a general overestimation of the snow cover as compared with MODIS satellite data. We then produce a new set of snowpack characteristics by running a temperature-based snow melt/accumulation model forced by bias-corrected temperature and precipitation fields. While some positive snow cover biases remain, the bias-corrected (BC) dataset shows large improvements regarding the simulation of total amounts, seasonality and spatial distribution of the snow cover compared with MODIS products. Both the raw and BC datasets are then used to assess future changes in the snowpack characteristics. Both datasets show robust increases in near-surface temperatures and decreases in snowfall that lead to a substantial reduction of the snowpack over the Australian Alps. The snowpack decreases by about 15 and 60% by 2030 and 2070 respectively. While the BC data introduce large differences in the simulation of the present climate snowpack, in relative terms future changes appear to be similar to those obtained using the raw data. Future temperature projections show a clear dependence on elevation through the snow-albedo feedback effect that affects snowpack projections. Uncertainties in future projections of the snowpack are large in both datasets and are mainly dominated by the choice of the lateral boundary conditions.
A Spatiotemporal Aggregation Query Method Using Multi-Thread Parallel Technique Based on Regional Division

NASA Astrophysics Data System (ADS)

Liao, S.; Chen, L.; Li, J.; Xiong, W.; Wu, Q.

2015-07-01

Existing spatiotemporal databases support spatiotemporal aggregation queries over massive moving-object datasets. Because of the large data volumes and single-threaded processing, query speed cannot meet application requirements. Moreover, query efficiency is more sensitive to spatial variation than to temporal variation. In this paper, we propose a spatiotemporal aggregation query method that uses multi-thread parallel processing based on regional division, and we implement it on the server. Concretely, we divide the spatiotemporal domain into several spatiotemporal cubes, compute the spatiotemporal aggregation on all cubes using multi-thread parallel processing, and then integrate the query results. Tests and analysis on real datasets show that this method improves query speed significantly.
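The divide, aggregate and integrate steps described in this abstract can be sketched with a thread pool. This is an illustration of the structure only, not the authors' server implementation: the record layout, the regular grid partitioning and the count/sum aggregate are assumptions. Note also that pure-Python threads in CPython do not run CPU-bound work in parallel, so the speed-up reported in the paper should not be expected from this sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(points, cells_per_side, t_bins, t_max):
    """Group (x, y, t, value) records by spatiotemporal cube.

    Assumes x and y in [0, 1) and t in [0, t_max).
    """
    cubes = {}
    for x, y, t, value in points:
        key = (int(x * cells_per_side), int(y * cells_per_side), int(t / t_max * t_bins))
        cubes.setdefault(key, []).append(value)
    return cubes


def aggregate(values):
    """Per-cube partial aggregate: (count, sum)."""
    return (len(values), sum(values))


def parallel_aggregate(points, cells_per_side=2, t_bins=2, t_max=1.0, workers=4):
    cubes = partition(points, cells_per_side, t_bins, t_max)
    keys = list(cubes)
    # One task per cube; the pool schedules the tasks across worker threads.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(aggregate, (cubes[k] for k in keys)))
    per_cube = dict(zip(keys, partial))
    # Integration step: combine the partial results into a global answer.
    total = (sum(c for c, _ in partial), sum(s for _, s in partial))
    return per_cube, total


# Three made-up moving-object samples (x, y, t, value).
points = [(0.1, 0.1, 0.1, 1.0), (0.2, 0.2, 0.2, 2.0), (0.9, 0.9, 0.9, 3.0)]
per_cube, total = parallel_aggregate(points)
```

Count and sum are used because they are decomposable: partial results from independent cubes can be merged without revisiting the raw records, which is what makes the regional division safe to parallelize.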
Crystallographic Orientation Relationships (CORs) between rutile inclusions and garnet hosts: towards using COR frequencies as a petrogenetic indicator

NASA Astrophysics Data System (ADS)

Griffiths, Thomas; Habler, Gerlinde; Schantl, Philip; Abart, Rainer

2017-04-01

Crystallographic orientation relationships (CORs) between crystalline inclusions and their hosts are commonly used to support particular inclusion origins, but often interpretations are based on a small fraction of all inclusions in a system. The electron backscatter diffraction (EBSD) method allows collection of large COR datasets more quickly than other methods while maintaining high spatial resolution. Large datasets allow analysis of the relative frequencies of different CORs, and identification of 'statistical CORs', where certain limited degrees of freedom exist in the orientation relationship between two neighbour crystals (Griffiths et al. 2016). Statistical CORs exist in addition to completely fixed 'specific' CORs (previously the only type of COR considered). We present a comparison of three EBSD single point datasets (all N > 200 inclusions) of rutile inclusions in garnet hosts, covering three rock systems, each with a different geological history: 1) magmatic garnet in pegmatite from the Koralpe complex, Eastern Alps, formed at temperatures > 600°C and low pressures; 2) granulite facies garnet rims on ultra-high-pressure garnets from the Kimi complex, Rhodope Massif; and 3) a Moldanubian granulite from the southeastern Bohemian Massif, equilibrated at peak conditions of 1050°C and 1.6 GPa. The present study is unique because all datasets have been analysed using the same catalogue of potential CORs, therefore relative frequencies and other COR properties can be meaningfully compared. In every dataset > 94% of the inclusions analysed exhibit one of the CORs tested for. Certain CORs are consistently among the most common in all datasets. However, the relative abundances of these common CORs show large variations between datasets (varying from 8 to 42 % relative abundance in one case). Other CORs are consistently uncommon but nonetheless present in every dataset. Lastly, there are some CORs that are common in one of the datasets and rare in the remainder. 
These patterns suggest competing influences on relative COR frequencies. Certain CORs seem consistently favourable, perhaps pointing to very stable low energy configurations, whereas some CORs are favoured in only one system, perhaps due to particulars of the formation mechanism, kinetics or conditions. Variations in COR frequencies between datasets seem to correlate with the conditions of host-inclusion system evolution. The two datasets from granulite-facies metamorphic samples show more similarities to each other than to the pegmatite dataset, and the sample inferred to have experienced the highest temperatures (Moldanubian granulite) shows the lowest diversity of CORs, low frequencies of statistical CORs and the highest frequency of specific CORs. These results provide evidence that petrological information is being encoded in COR distributions. They make a strong case for further studies of the factors influencing COR development and for measurements of COR distributions in other systems and between different phases. Griffiths, T.A., Habler, G., Abart, R. (2016): Crystallographic orientation relationships in host-inclusion systems: New insights from large EBSD data sets. Amer. Miner., 101, 690-705.
Spatiotemporal multistage consensus clustering in molecular dynamics studies of large proteins.

PubMed

Kenn, Michael; Ribarics, Reiner; Ilieva, Nevena; Cibena, Michael; Karch, Rudolf; Schreiner, Wolfgang

2016-04-26

The aim of this work is to find semi-rigid domains within large proteins as reference structures for fitting molecular dynamics trajectories. We propose an algorithm, multistage consensus clustering (MCC), based on minimum variation of distances between pairs of Cα-atoms as target function. The whole dataset (trajectory) is split into sub-segments. For a given sub-segment, spatial clustering is repeatedly started from different random seeds, and we adopt the specific spatial clustering with minimum target function: the process described so far is stage 1 of MCC. Then, in stage 2, the results of spatial clustering are consolidated, to arrive at domains stable over the whole dataset. We found that MCC is robust regarding the choice of parameters and yields relevant information on functional domains of the major histocompatibility complex (MHC) studied in this paper: the α-helices and β-floor of the protein (MHC) proved to be most flexible and did not contribute to clusters of significant size. Three alleles of the MHC, each in complex with ABCD3 peptide and LC13 T-cell receptor (TCR), yielded different patterns of motion. Those alleles causing immunological allo-reactions showed distinct correlations of motion between parts of the peptide, the binding cleft and the complementarity determining region (CDR) loops of the TCR. Multistage consensus clustering reflected functional differences between MHC alleles and yields a methodological basis to increase sensitivity of functional analyses of bio-molecules. Owing to the generality of the approach, MCC may also serve as a potent tool for the analysis of other kinds of big data.
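The target function named in this abstract, the variation of distances between pairs of atoms, can be illustrated with a small sketch. The abstract does not give the exact formula, so the choice here (mean over atom pairs of the standard deviation of their distance across frames) is an assumption, as are the function name and the toy coordinates. The clustering stages themselves are not reproduced.

```python
import math
from itertools import combinations
from statistics import pstdev


def distance_variation(frames, atoms):
    """Mean over atom pairs of the standard deviation of their distance across frames.

    frames is a list of snapshots; each snapshot is a list of (x, y, z)
    coordinates indexed by atom. A value near zero means the selected
    atoms move together as a rigid group.
    """
    pairs = list(combinations(atoms, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for i, j in pairs:
        distances = [math.dist(frame[i], frame[j]) for frame in frames]
        total += pstdev(distances)
    return total / len(pairs)


# Toy trajectory: atoms 0 and 1 translate together, atom 2 moves independently.
frames = [
    [(0, 0, 0), (1, 0, 0), (5, 0, 0)],
    [(1, 1, 0), (2, 1, 0), (5, 0, 0)],
    [(2, 0, 1), (3, 0, 1), (9, 0, 0)],
]

rigid = distance_variation(frames, [0, 1])        # candidate semi-rigid domain
mixed = distance_variation(frames, [0, 1, 2])     # includes the flexible atom
```

A clustering that minimizes such a quantity groups atoms whose mutual distances stay nearly constant, which is the property a reference structure for trajectory fitting needs.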
Velocities of gas in star-forming regions

NASA Astrophysics Data System (ADS)

Nissen, H. D.; Gustafsson, M.; Field, D.; Lemaire, J. L.; Clénet, Y.; Rouan, D.

2007-12-01

We present high spatial (0.18") and velocity (<2 km/s) resolution observations of the central 1'x1' of OMC1. We identify a large number of shock features and determine radial velocity, position angle and emission brightness for each of these features. Using this dataset we analyze the kinematic properties of the inner square arcminute of OMC1, identifying among other things the IR signature of a massive outflow originating from source I.
Large uncertainties in observed daily precipitation extremes over land

NASA Astrophysics Data System (ADS)

Herold, Nicholas; Behrangi, Ali; Alexander, Lisa V.

2017-01-01

We explore uncertainties in observed daily precipitation extremes over the terrestrial tropics and subtropics (50°S-50°N) based on five commonly used products: the Climate Hazards Group InfraRed Precipitation with Stations (CHIRPS) dataset, the Global Precipitation Climatology Centre-Full Data Daily (GPCC-FDD) dataset, the Tropical Rainfall Measuring Mission (TRMM) multi-satellite research product (T3B42 v7), the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks-Climate Data Record (PERSIANN-CDR), and the Global Precipitation Climatology Project's One-Degree Daily (GPCP-1DD) dataset. We use the precipitation indices R10mm and Rx1day, developed by the Expert Team on Climate Change Detection and Indices, to explore the behavior of "moderate" and "extreme" extremes, respectively. In order to assess the sensitivity of extreme precipitation to different grid sizes we perform our calculations on four common spatial resolutions (0.25° × 0.25°, 1° × 1°, 2.5° × 2.5°, and 3.75° × 2.5°). The impact of the chosen "order of operation" in calculating these indices is also determined. Our results show that moderate extremes are relatively insensitive to product and resolution choice, while extreme extremes can be very sensitive. For example, at 0.25° × 0.25° quasi-global mean Rx1day values vary from 37 mm in PERSIANN-CDR to 62 mm in T3B42. We find that the interproduct spread becomes prominent at resolutions of 1° × 1° and finer, thus establishing a minimum effective resolution at which observational products agree. Without improvements in interproduct spread, these exceedingly large observational uncertainties at high spatial resolution may limit the usefulness of model evaluations. As has been found previously, resolution sensitivity can be largely eliminated by applying an order of operation where indices are calculated prior to regridding. However, this approach is not appropriate when true area averages are desired (e.g., for model evaluations).
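The "order of operation" effect described in the abstract above can be sketched with a toy example. It uses the standard index definitions (Rx1day as the maximum 1-day total, R10mm as the count of days with at least 10 mm). The two grid cells, their values, and the equal-weight averaging used as "regridding" are illustrative assumptions, not data or methods from the study.

```python
def rx1day(daily_mm):
    """Maximum 1-day precipitation total (mm) in the series."""
    return max(daily_mm)


def r10mm(daily_mm):
    """Number of days with precipitation of at least 10 mm."""
    return sum(1 for p in daily_mm if p >= 10.0)


def area_mean(cells):
    """Average equally weighted fine cells into one coarse daily series."""
    n_days = len(cells[0])
    return [sum(cell[d] for cell in cells) / len(cells) for d in range(n_days)]


# Two hypothetical fine-grid cells, five days each (mm/day).
cell_a = [0.0, 40.0, 2.0, 12.0, 0.0]
cell_b = [30.0, 0.0, 4.0, 12.0, 0.0]
cells = [cell_a, cell_b]

# Order 1: compute the index on each fine cell, then average the index values.
index_then_regrid = sum(rx1day(c) for c in cells) / len(cells)

# Order 2: average the daily fields first, then compute the index.
regrid_then_index = rx1day(area_mean(cells))

# The day-count index also changes once the daily fields are averaged.
wet_days_after_regrid = r10mm(area_mean(cells))
```

In this toy case the heaviest days fall on different dates in the two cells, so averaging the daily fields first smooths the peak. This is one way to see why the two orders give different Rx1day values.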
Alpine Grassland Soil Organic Carbon Stock and Its Uncertainty in the Three Rivers Source Region of the Tibetan Plateau

PubMed Central

Chang, Xiaofeng; Wang, Shiping; Cui, Shujuan; Zhu, Xiaoxue; Luo, Caiyun; Zhang, Zhenhua; Wilkes, Andreas

2014-01-01

Alpine grassland of the Tibetan Plateau is an important component of global soil organic carbon (SOC) stocks, but insufficient field observations and large spatial heterogeneity lead to great uncertainty in their estimation. In the Three Rivers Source Region (TRSR), alpine grasslands account for more than 75% of the total area. However, the regional carbon (C) stock estimate and its uncertainty have seldom been tested. Here we quantified the regional SOC stock and its uncertainty using 298 soil profiles surveyed from 35 sites across the TRSR during 2006–2008. We showed that the upper soil (0–30 cm depth) in alpine grasslands of the TRSR stores 2.03 Pg C, with a 95% confidence interval ranging from 1.25 to 2.81 Pg C. Alpine meadow soils comprised 73% (i.e. 1.48 Pg C) of the regional SOC estimate, but had the greatest uncertainty at 51%. The statistical power to detect a deviation of 10% uncertainty in grassland C stock was less than 0.50. The required sample size to detect this deviation at a power of 90% was about 6–7 times more than the number of sample sites surveyed. Comparison of our observed SOC density with the corresponding values from the dataset of Yang et al. indicates that these two datasets are comparable. The combined dataset did not reduce the uncertainty in the estimate of the regional grassland soil C stock. This result could be mainly explained by the underrepresentation of sampling sites in large areas with poor accessibility. Further research to improve the regional SOC stock estimate should optimize sampling strategy by considering the number of samples and their spatial distribution. PMID:24819054
A comparison of multi-spectral, multi-angular, and multi-temporal remote sensing datasets for fractional shrub canopy mapping in Arctic Alaska

USGS Publications Warehouse

Selkowitz, D.J.

2010-01-01

Shrub cover appears to be increasing across many areas of the Arctic tundra biome, and increasing shrub cover in the Arctic has the potential to significantly impact global carbon budgets and the global climate system. For most of the Arctic, however, there is no existing baseline inventory of shrub canopy cover, as existing maps of Arctic vegetation provide little information about the density of shrub cover at a moderate spatial resolution across the region. Remotely-sensed fractional shrub canopy maps can provide this necessary baseline inventory of shrub cover. In this study, we compare the accuracy of fractional shrub canopy (> 0.5 m tall) maps derived from multi-spectral, multi-angular, and multi-temporal datasets from Landsat imagery at 30 m spatial resolution, Moderate Resolution Imaging SpectroRadiometer (MODIS) imagery at 250 m and 500 m spatial resolution, and MultiAngle Imaging Spectroradiometer (MISR) imagery at 275 m spatial resolution for a 1067 km² study area in Arctic Alaska. The study area is centered at 69°N, ranges in elevation from 130 to 770 m, is composed primarily of rolling topography with gentle slopes less than 10°, and is free of glaciers and perennial snow cover. Shrubs > 0.5 m in height cover 2.9% of the study area and are primarily confined to patches associated with specific landscape features. Reference fractional shrub canopy is determined from in situ shrub canopy measurements and a high spatial resolution IKONOS image swath. Regression tree models are constructed to estimate fractional canopy cover at 250 m using different combinations of input data from Landsat, MODIS, and MISR. Results indicate that multi-spectral data provide substantially more accurate estimates of fractional shrub canopy cover than multi-angular or multi-temporal data. Higher spatial resolution datasets also provide more accurate estimates of fractional shrub canopy cover (aggregated to moderate spatial resolutions) than lower spatial resolution datasets, an expected result for a study area where most shrub cover is concentrated in narrow patches associated with rivers, drainages, and slopes. Including the middle infrared bands available from Landsat and MODIS in the regression tree models (in addition to the four standard visible and near-infrared spectral bands) typically results in a slight boost in accuracy. Including the multi-angular red band data available from MISR in the regression tree models, however, typically boosts accuracy more substantially, resulting in moderate resolution fractional shrub canopy estimates approaching the accuracy of estimates derived from the much higher spatial resolution Landsat sensor. Given the poor availability of snow and cloud-free Landsat scenes in many areas of the Arctic and the promising results demonstrated here by the MISR sensor, MISR may be the best choice for large area fractional shrub canopy mapping in the Alaskan Arctic for the period 2000-2009.

Spatial contexts for temporal variability in alpine vegetation under ongoing climate change

USGS Publications Warehouse

Fagre, Daniel B.; Malanson, George P.

2013-01-01

A framework to monitor mountain summit vegetation (The Global Observation Research Initiative in Alpine Environments, GLORIA) was initiated in 1997. GLORIA results should be taken within a regional context of the spatial variability of alpine tundra. Changes observed at GLORIA sites in Glacier National Park, Montana, USA are quantified within the context of the range of variability observed in alpine tundra across much of western North America. Dissimilarity is calculated and used in nonmetric multidimensional scaling for repeated measures of vascular species cover at 14 GLORIA sites with 525 nearby sites and with 436 sites in western North America. The lengths of the trajectories of the GLORIA sites in ordination space are compared to the dimensions of the space created by the larger datasets. The absolute amount of change on the GLORIA summits over 5 years is high, but the degree of change is small relative to the geographical context. The GLORIA sites are on the margin of the ordination volumes with the large datasets. The GLORIA summit vegetation appears to be specialized, arguing for the intrinsic value of early observed change in limited niche space.
Tundra landform and vegetation productivity trend maps for the Arctic Coastal Plain of northern Alaska

NASA Astrophysics Data System (ADS)

Lara, Mark J.; Nitze, Ingmar; Grosse, Guido; McGuire, A. David

2018-04-01

Arctic tundra landscapes are composed of a complex mosaic of patterned ground features, varying in soil moisture, vegetation composition, and surface hydrology over small spatial scales (10-100 m). The importance of microtopography and associated geomorphic landforms in influencing ecosystem structure and function is well founded; however, spatial data products describing local to regional scale distribution of patterned ground or polygonal tundra geomorphology are largely unavailable. Thus, our understanding of local impacts on regional scale processes (e.g., carbon dynamics) may be limited. We produced two key spatiotemporal datasets spanning the Arctic Coastal Plain of northern Alaska (~60,000 km²) to evaluate climate-geomorphological controls on arctic tundra productivity change, using (1) a novel 30 m classification of polygonal tundra geomorphology and (2) decadal-trends in surface greenness using the Landsat archive (1999-2014). These datasets can be easily integrated and adapted in an array of local to regional applications such as (1) upscaling plot-level measurements (e.g., carbon/energy fluxes), (2) mapping of soils, vegetation, or permafrost, and/or (3) initializing ecosystem biogeochemistry, hydrology, and/or habitat modeling.
Bat trait, genetic and pathogen data from large-scale investigations of African fruit bats, Eidolon helvum.

PubMed

Peel, Alison J; Baker, Kate S; Hayman, David T S; Suu-Ire, Richard; Breed, Andrew C; Gembu, Guy-Crispin; Lembo, Tiziana; Fernández-Loras, Andrés; Sargan, David R; Fooks, Anthony R; Cunningham, Andrew A; Wood, James L N

2016-08-01

Bats, including African straw-coloured fruit bats (Eidolon helvum), have been highlighted as reservoirs of many recently emerged zoonotic viruses. This common, widespread and ecologically important species was the focus of longitudinal and continent-wide studies of the epidemiology and ecology of Lagos bat virus, henipaviruses and Achimota viruses. Here we present a spatial, morphological, demographic, genetic and serological dataset encompassing 2827 bats from nine countries over an 8-year period. Genetic data comprise cytochrome b mitochondrial sequences (n=608) and microsatellite genotypes from 18 loci (n=544). Tooth-cementum analyses (n=316) allowed derivation of rare age-specific serologic data for a lyssavirus, a henipavirus and two rubulaviruses. This dataset contributes a substantial volume of data on the ecology of E. helvum and its viruses and will be valuable for a wide range of studies, including viral transmission dynamic modelling in age-structured populations, investigation of seasonal reproductive asynchrony in wide-ranging species, ecological niche modelling, inference of island colonisation history, exploration of relationships between island and body size, and various spatial analyses of demographic, morphometric or serological data.
Tundra landform and vegetation productivity trend maps for the Arctic Coastal Plain of northern Alaska.

PubMed

Lara, Mark J; Nitze, Ingmar; Grosse, Guido; McGuire, A David

2018-04-10

Arctic tundra landscapes are composed of a complex mosaic of patterned ground features, varying in soil moisture, vegetation composition, and surface hydrology over small spatial scales (10-100 m). The importance of microtopography and associated geomorphic landforms in influencing ecosystem structure and function is well founded; however, spatial data products describing local to regional scale distribution of patterned ground or polygonal tundra geomorphology are largely unavailable. Thus, our understanding of local impacts on regional scale processes (e.g., carbon dynamics) may be limited. We produced two key spatiotemporal datasets spanning the Arctic Coastal Plain of northern Alaska (~60,000 km²) to evaluate climate-geomorphological controls on arctic tundra productivity change, using (1) a novel 30 m classification of polygonal tundra geomorphology and (2) decadal-trends in surface greenness using the Landsat archive (1999-2014). These datasets can be easily integrated and adapted in an array of local to regional applications such as (1) upscaling plot-level measurements (e.g., carbon/energy fluxes), (2) mapping of soils, vegetation, or permafrost, and/or (3) initializing ecosystem biogeochemistry, hydrology, and/or habitat modeling.
AMRZone: A Runtime AMR Data Sharing Framework For Scientific Applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang, Wenzhao; Tang, Houjun; Harenberg, Steven

Frameworks that facilitate runtime data sharing across multiple applications are of great importance for scientific data analytics. Although existing frameworks work well over uniform mesh data, they cannot effectively handle adaptive mesh refinement (AMR) data. The challenges in constructing an AMR-capable framework include: (1) designing an architecture that facilitates online AMR data management; (2) achieving a load-balanced AMR data distribution for the data staging space at runtime; and (3) building an effective online index to support the unique spatial data retrieval requirements for AMR data. To address these challenges and support runtime AMR data sharing across scientific applications, we present the AMRZone framework. Experiments over real-world AMR datasets demonstrate AMRZone's effectiveness at achieving a balanced workload distribution, reading/writing large-scale datasets with thousands of parallel processes, and satisfying queries with spatial constraints. Moreover, AMRZone's performance and scalability are even comparable with existing state-of-the-art work when tested over uniform mesh data with up to 16384 cores; in the best case, our framework achieves a 46% performance improvement.
Inter-comparison of multiple statistically downscaled climate datasets for the Pacific Northwest, USA

PubMed Central

Jiang, Yueyang; Kim, John B.; Still, Christopher J.; Kerns, Becky K.; Kline, Jeffrey D.; Cunningham, Patrick G.

2018-01-01

Statistically downscaled climate data have been widely used to explore possible impacts of climate change in various fields of study. Although many studies have focused on characterizing differences in the downscaling methods, few studies have evaluated actual downscaled datasets being distributed publicly. Spatially focusing on the Pacific Northwest, we compare five statistically downscaled climate datasets distributed publicly in the US: ClimateNA, NASA NEX-DCP30, MACAv2-METDATA, MACAv2-LIVNEH and WorldClim. We compare the downscaled projections of climate change, and the associated observational data used as training data for downscaling. We map and quantify the variability among the datasets and characterize the spatio-temporal patterns of agreement and disagreement among the datasets. Pair-wise comparisons of datasets identify the coast and high-elevation areas as areas of disagreement for temperature. For precipitation, high-elevation areas, rainshadows and the dry, eastern portion of the study area have high dissimilarity among the datasets. By spatially aggregating the variability measures into watersheds, we develop guidance for selecting datasets within the Pacific Northwest climate change impact studies. PMID:29461513
Inter-comparison of multiple statistically downscaled climate datasets for the Pacific Northwest, USA.

PubMed

Jiang, Yueyang; Kim, John B; Still, Christopher J; Kerns, Becky K; Kline, Jeffrey D; Cunningham, Patrick G

2018-02-20

Statistically downscaled climate data have been widely used to explore possible impacts of climate change in various fields of study. Although many studies have focused on characterizing differences in the downscaling methods, few studies have evaluated actual downscaled datasets being distributed publicly. Spatially focusing on the Pacific Northwest, we compare five statistically downscaled climate datasets distributed publicly in the US: ClimateNA, NASA NEX-DCP30, MACAv2-METDATA, MACAv2-LIVNEH and WorldClim. We compare the downscaled projections of climate change, and the associated observational data used as training data for downscaling. We map and quantify the variability among the datasets and characterize the spatio-temporal patterns of agreement and disagreement among the datasets. Pair-wise comparisons of datasets identify the coast and high-elevation areas as areas of disagreement for temperature. For precipitation, high-elevation areas, rainshadows and the dry, eastern portion of the study area have high dissimilarity among the datasets. By spatially aggregating the variability measures into watersheds, we develop guidance for selecting datasets within the Pacific Northwest climate change impact studies.
Search for contact interactions and large extra dimensions in the dilepton channel using proton–proton collisions at √s = 8 TeV with the ATLAS detector

DOE PAGES

Aad, G.

2014-12-11

A search is conducted for non-resonant new phenomena in dielectron and dimuon final states, originating from either contact interactions or large extra spatial dimensions. The LHC 2012 proton–proton collision dataset recorded by the ATLAS detector is used, corresponding to 20 fb⁻¹ at √s = 8 TeV. The dilepton invariant mass spectrum is a discriminating variable in both searches, with the contact interaction search additionally utilizing the dilepton forward-backward asymmetry. No significant deviations from the Standard Model expectation are observed. Lower limits are set on the ℓℓqq contact interaction scale Λ between 15.4 TeV and 26.3 TeV, at the 95% credibility level. For large extra spatial dimensions, lower limits are set on the string scale MS between 3.2 TeV and 5.0 TeV.
SamuROI, a Python-Based Software Tool for Visualization and Analysis of Dynamic Time Series Imaging at Multiple Spatial Scales.

PubMed

Rueckl, Martin; Lenzi, Stephen C; Moreno-Velasquez, Laura; Parthier, Daniel; Schmitz, Dietmar; Ruediger, Sten; Johenning, Friedrich W

2017-01-01

The measurement of activity in vivo and in vitro has shifted from electrical to optical methods. While the indicators for imaging activity have improved significantly over the last decade, tools for analysing optical data have not kept pace. Most available analysis tools are limited in their flexibility and applicability to datasets obtained at different spatial scales. Here, we present SamuROI (Structured analysis of multiple user-defined ROIs), an open source Python-based analysis environment for imaging data. SamuROI simplifies exploratory analysis and visualization of image series of fluorescence changes in complex structures over time and is readily applicable at different spatial scales. In this paper, we show the utility of SamuROI in Ca2+-imaging based applications at three spatial scales: the micro-scale (i.e., sub-cellular compartments including cell bodies, dendrites and spines); the meso-scale, (i.e., whole cell and population imaging with single-cell resolution); and the macro-scale (i.e., imaging of changes in bulk fluorescence in large brain areas, without cellular resolution). The software described here provides a graphical user interface for intuitive data exploration and region of interest (ROI) management that can be used interactively within Jupyter Notebook: a publicly available interactive Python platform that allows simple integration of our software with existing tools for automated ROI generation and post-processing, as well as custom analysis pipelines. SamuROI software, source code and installation instructions are publicly available on GitHub and documentation is available online. SamuROI reduces the energy barrier for manual exploration and semi-automated analysis of spatially complex Ca2+ imaging datasets, particularly when these have been acquired at different spatial scales.
Partial Least Square Discriminant Analysis Based on Normalized Two-Stage Vegetation Indices for Mapping Damage from Rice Diseases Using PlanetScope Datasets.

PubMed

Shi, Yue; Huang, Wenjiang; Ye, Huichun; Ruan, Chao; Xing, Naichen; Geng, Yun; Dong, Yingying; Peng, Dailiang

2018-06-11

In recent decades, rice disease co-epidemics have caused tremendous damage to crop production in both China and Southeast Asia. A variety of remote sensing based approaches have been developed and applied to map disease distribution using coarse- to moderate-resolution imagery. However, the detection and discrimination of various disease species infecting rice were seldom assessed using high spatial resolution data. The aims of this study were (1) to develop a set of normalized two-stage vegetation indices (VIs) for characterizing the progressive development of different diseases in rice; (2) to explore the performance of combined normalized two-stage VIs in partial least square discriminant analysis (PLS-DA); and (3) to map and evaluate the damage caused by rice diseases at fine spatial scales, for the first time using bi-temporal, high spatial resolution imagery from PlanetScope datasets at a 3 m spatial resolution. Our findings suggest that the primary biophysical responses caused by different diseases (e.g., changes in leaf area, pigment contents, or canopy morphology) can be captured using combined normalized two-stage VIs. PLS-DA was able to classify rice diseases at a sub-field scale, with an overall accuracy of 75.62% and a Kappa value of 0.47. The approach was successfully applied during a typical co-epidemic outbreak of rice dwarf (Rice dwarf virus, RDV), rice blast (Magnaporthe oryzae), and glume blight (Phyllosticta glumarum) in Guangxi Province, China. Furthermore, our results highlighted the feasibility of the method in capturing heterogeneous disease patterns at fine spatial scales over large spatial extents.
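The two accuracy measures reported in the abstract above, overall accuracy and Kappa, are both derived from a classification confusion matrix. The sketch below shows the standard Cohen's kappa calculation. The 3 × 3 matrix is a made-up example, not the study's data, and the function names are illustrative.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def cohens_kappa(cm):
    """Cohen's kappa: agreement beyond what the class totals predict by chance."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    p_observed = sum(cm[i][i] for i in range(k)) / total
    row_totals = [sum(cm[i]) for i in range(k)]
    col_totals = [sum(cm[r][c] for r in range(k)) for c in range(k)]
    p_chance = sum(row_totals[i] * col_totals[i] for i in range(k)) / total ** 2
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical confusion matrix for three classes
# (rows: reference class, columns: predicted class).
cm = [
    [30, 5, 5],
    [5, 30, 5],
    [5, 5, 30],
]

accuracy = overall_accuracy(cm)
kappa = cohens_kappa(cm)
```

Kappa is lower than overall accuracy because it discounts the agreement expected from the class totals alone, which is why the two are often reported together.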
SamuROI, a Python-Based Software Tool for Visualization and Analysis of Dynamic Time Series Imaging at Multiple Spatial Scales

PubMed Central

Rueckl, Martin; Lenzi, Stephen C.; Moreno-Velasquez, Laura; Parthier, Daniel; Schmitz, Dietmar; Ruediger, Sten; Johenning, Friedrich W.

2017-01-01

The measurement of activity in vivo and in vitro has shifted from electrical to optical methods. While the indicators for imaging activity have improved significantly over the last decade, tools for analysing optical data have not kept pace. Most available analysis tools are limited in their flexibility and applicability to datasets obtained at different spatial scales. Here, we present SamuROI (Structured analysis of multiple user-defined ROIs), an open source Python-based analysis environment for imaging data. SamuROI simplifies exploratory analysis and visualization of image series of fluorescence changes in complex structures over time and is readily applicable at different spatial scales. In this paper, we show the utility of SamuROI in Ca2+-imaging based applications at three spatial scales: the micro-scale (i.e., sub-cellular compartments including cell bodies, dendrites and spines); the meso-scale, (i.e., whole cell and population imaging with single-cell resolution); and the macro-scale (i.e., imaging of changes in bulk fluorescence in large brain areas, without cellular resolution). The software described here provides a graphical user interface for intuitive data exploration and region of interest (ROI) management that can be used interactively within Jupyter Notebook: a publicly available interactive Python platform that allows simple integration of our software with existing tools for automated ROI generation and post-processing, as well as custom analysis pipelines. SamuROI software, source code and installation instructions are publicly available on GitHub and documentation is available online. SamuROI reduces the energy barrier for manual exploration and semi-automated analysis of spatially complex Ca2+ imaging datasets, particularly when these have been acquired at different spatial scales. PMID:28706482
A geo-spatial data management system for potentially active volcanoes—GEOWARN project

NASA Astrophysics Data System (ADS)

Gogu, Radu C.; Dietrich, Volker J.; Jenny, Bernhard; Schwandner, Florian M.; Hurni, Lorenz

2006-02-01

Integrated studies of active volcanic systems for the purpose of long-term monitoring and forecast and short-term eruption prediction require large numbers of data-sets from various disciplines. A modern database concept has been developed for managing and analyzing multi-disciplinary volcanological data-sets. The GEOWARN project (choosing the "Kos-Yali-Nisyros-Tilos volcanic field, Greece" and the "Campi Flegrei, Italy" as test sites) is oriented toward potentially active volcanoes situated in regions of high geodynamic unrest. This article describes the volcanological database of the spatial and temporal data acquired within the GEOWARN project. As a first step, a spatial database embedded in a Geographic Information System (GIS) environment was created. Digital data of different spatial resolution, and time-series data collected at different intervals or periods, were unified in a common, four-dimensional representation of space and time. The database scheme comprises various information layers containing geographic data (e.g. seafloor and land digital elevation model, satellite imagery, anthropogenic structures, land-use), geophysical data (e.g. from active and passive seismicity, gravity, tomography, SAR interferometry, thermal imagery, differential GPS), geological data (e.g. lithology, structural geology, oceanography), and geochemical data (e.g. from hydrothermal fluid chemistry and diffuse degassing features). As a second step based on the presented database, spatial data analysis has been performed using custom-programmed interfaces that execute query scripts resulting in a graphical visualization of data. These query tools were designed and compiled following scenarios of known "behavior" patterns of dormant volcanoes and first candidate signs of potential unrest. The spatial database and query approach is intended to facilitate scientific research on volcanic processes and phenomena, and volcanic surveillance.
How spatial and temporal rainfall variability affect runoff across basin scales: insights from field observations in the (semi-)urbanised Charlotte watershed

NASA Astrophysics Data System (ADS)

Ten Veldhuis, M. C.; Smith, J. A.; Zhou, Z.

2017-12-01

Impacts of rainfall variability on runoff response are highly scale-dependent. Sensitivity analyses based on hydrological model simulations have shown that impacts are likely to depend on combinations of storm type, basin versus storm scale, temporal versus spatial rainfall variability. So far, few of these conclusions have been confirmed on observational grounds, since high quality datasets of spatially variable rainfall and runoff over prolonged periods are rare. Here we investigate relationships between rainfall variability and runoff response based on 30 years of radar-rainfall datasets and flow measurements for 16 hydrological basins ranging from 7 to 111 km². Basins vary not only in scale, but also in their degree of urbanisation. We investigated temporal and spatial variability characteristics of rainfall fields across a range of spatial and temporal scales to identify main drivers for variability in runoff response. We identified 3 ranges of basin size with different temporal versus spatial rainfall variability characteristics. Total rainfall volume proved to be the dominant agent determining runoff response at all basin scales, independent of their degree of urbanisation. Peak rainfall intensity and storm core volume are of secondary importance. This applies to all runoff parameters, including runoff volume, runoff peak, volume-to-peak and lag time. Position and movement of the storm with respect to the basin have a negligible influence on runoff response, with the exception of lag times in some of the larger basins. This highlights the importance of accuracy in rainfall estimation: getting the position right but the volume wrong will inevitably lead to large errors in runoff prediction. Our study helps to identify conditions where rainfall variability matters for correct estimation of the rainfall volume as well as the associated runoff response.
Soil Moisture fusion across scales using a multiscale nonstationary Spatial Hierarchical Model

NASA Astrophysics Data System (ADS)

Kathuria, D.; Mohanty, B.; Katzfuss, M.

2017-12-01

Soil moisture (SM) datasets from remote sensing (RS) platforms (such as SMOS and SMAP) and reanalysis products from land surface models are typically available on a coarse spatial granularity of several square km. Ground based sensors, on the other hand, provide observations on a finer spatial scale (meter scale or less) but are sparsely available. SM is affected by high variability due to complex interactions between geologic, topographic, vegetation and atmospheric variables and these interactions change dynamically with footprint scales. Past literature has largely focused on the scale specific effect of these covariates on soil moisture. The present study proposes a robust Multiscale-Nonstationary Spatial Hierarchical Model (MN-SHM) which can assimilate SM from point to RS footprints. The spatial structure of SM across footprints is modeled by a class of scalable covariance functions whose nonstationarity depends on atmospheric forcings (such as precipitation) and surface physical controls (such as topography, soil-texture and vegetation). The proposed model is applied to fuse point and airborne (1.5 km) SM data obtained during the SMAPVEX12 campaign in the Red River watershed in Southern Manitoba, Canada with SMOS (30 km) data. It is observed that precipitation, soil-texture and vegetation are the dominant factors which affect the SM distribution across various footprint scales (750 m, 1.5 km, 3 km, 9 km, 15 km and 30 km). We conclude that MN-SHM handles change-of-support problems easily while retaining reasonable predictive accuracy across multiple spatial resolutions in the presence of surface heterogeneity. The MN-SHM can be considered as a complex non-stationary extension of traditional geostatistical prediction methods (such as Kriging) for fusing multi-platform multi-scale datasets.
Modeling spatial-temporal dynamics of global wetlands: Comprehensive evaluation of a new sub-grid TOPMODEL parameterization and uncertainties

NASA Astrophysics Data System (ADS)

Zhang, Z.; Zimmermann, N. E.; Poulter, B.

2015-12-01

Simulating the spatial-temporal dynamics of wetlands is key to understanding the role of wetland biogeochemistry under past and future climate variability. Hydrologic inundation models, such as TOPMODEL, are based on a fundamental parameter known as the compound topographic index (CTI) and provide a computationally cost-efficient approach to simulate global wetland dynamics. However, there remain large discrepancies in the implementations of TOPMODEL in land-surface models (LSMs) and thus in their performance against observations. This study describes new improvements to TOPMODEL implementation and estimates of global wetland dynamics using the LPJ-wsl DGVM, and quantifies uncertainties by comparing the effects of three digital elevation model products (HYDRO1k, GMTED, and HydroSHEDS), which differ in spatial resolution and accuracy, on simulated inundation dynamics. We found that calibrating TOPMODEL with a benchmark dataset can help to successfully predict the seasonal and interannual variations of wetlands, as well as improve the spatial distribution of wetlands to be consistent with inventories. The HydroSHEDS DEM, using a river-basin scheme for aggregating the CTI, shows the best accuracy for capturing the spatio-temporal dynamics of wetlands among the three DEM products. This study demonstrates the feasibility of capturing spatial heterogeneity of inundation and of estimating seasonal and interannual variations in wetlands by coupling a hydrological module in LSMs with appropriate benchmark datasets. It additionally highlights the importance of an adequate understanding of topographic indices for simulating global wetlands and shows the opportunity to converge wetland estimations in LSMs by identifying the uncertainty associated with existing wetland products.
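The abstract above does not state how the compound topographic index is computed. A minimal sketch of the commonly used definition, CTI = ln(a / tan β), is given here, where a is the upslope contributing area per unit contour length and β is the local slope. The function name, the minimum-slope clamp for flat cells, and the example values are assumptions for illustration, not details of the LPJ-wsl implementation.

```python
import math


def compound_topographic_index(specific_catchment_area_m, slope_rad,
                               min_slope_rad=1e-4):
    """CTI = ln(a / tan(beta)) for one grid cell.

    specific_catchment_area_m: upslope area per unit contour length (m).
    slope_rad: local slope angle in radians.
    min_slope_rad: assumed floor to avoid division by zero on flat cells.
    """
    slope = max(slope_rad, min_slope_rad)
    return math.log(specific_catchment_area_m / math.tan(slope))


# Hypothetical cells with the same contributing area but different slopes.
steep_cell = compound_topographic_index(100.0, math.atan(0.1))   # 10% slope
flat_cell = compound_topographic_index(100.0, math.atan(0.01))   # 1% slope
```

Cells that are flatter or that drain larger upslope areas receive higher CTI values, which is why the index serves as a proxy for where inundation is likely.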
Assessment of imputation methods using varying ecological information to fill the gaps in a tree functional trait database

NASA Astrophysics Data System (ADS)

Poyatos, Rafael; Sus, Oliver; Vilà-Cabrera, Albert; Vayreda, Jordi; Badiella, Llorenç; Mencuccini, Maurizio; Martínez-Vilalta, Jordi

2016-04-01

Plant functional traits are increasingly being used in ecosystem ecology thanks to the growing availability of large ecological databases. However, these databases usually contain a large fraction of missing data because measuring plant functional traits systematically is labour-intensive and because most databases are compilations of datasets with different sampling designs. As a result, within a given database, there is an inevitable variability in the number of traits available for each data entry and/or the species coverage in a given geographical area. The presence of missing data may severely bias trait-based analyses, such as the quantification of trait covariation or trait-environment relationships and may hamper efforts towards trait-based modelling of ecosystem biogeochemical cycles. Several data imputation (i.e. gap-filling) methods have been recently tested on compiled functional trait databases, but the performance of imputation methods applied to a functional trait database with a regular spatial sampling has not been thoroughly studied. Here, we assess the effects of data imputation on five tree functional traits (leaf biomass to sapwood area ratio, foliar nitrogen, maximum height, specific leaf area and wood density) in the Ecological and Forest Inventory of Catalonia, an extensive spatial database (covering 31,900 km²). We tested the performance of species mean imputation, single imputation by the k-nearest neighbors algorithm (kNN) and a multiple imputation method, Multivariate Imputation with Chained Equations (MICE) at different levels of missing data (10%, 30%, 50%, and 80%). We also assessed the changes in imputation performance when additional predictors (species identity, climate, forest structure, spatial structure) were added in kNN and MICE imputations.
We evaluated the imputed datasets using a battery of indexes describing departure from the complete dataset in trait distribution, in the mean prediction error, in the correlation matrix and in selected bivariate trait relationships. MICE yielded imputations which better preserved the variability and covariance structure of the data and provided an estimate of between-imputation uncertainty. We found that adding species identity as a predictor in MICE and kNN improved imputation for all traits, but adding climate did not lead to any appreciable improvement. However, forest structure and spatial structure did reduce imputation errors in maximum height and in leaf biomass to sapwood area ratios, respectively. Although species mean imputations showed the lowest error for 3 out of the 5 studied traits, dataset-averaged errors were lowest for MICE imputations with all additional predictors, when missing data levels were 50% or lower. Species mean imputations always resulted in larger errors in the correlation matrix and appreciably altered the studied bivariate trait relationships. In conclusion, MICE imputations using species identity, climate, forest structure and spatial structure as predictors emerged as the most suitable method of the ones tested here, but it was also evident that imputation performance deteriorates at high levels of missing data (80%).
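Two of the gap-filling strategies compared in the abstract above, species mean imputation and single kNN imputation, can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' code: the record layout, the trait values and the choice of unscaled Euclidean distance are all made up for the example, and MICE is not shown.

```python
from statistics import mean


def species_mean_impute(records, trait):
    """Fill missing values (None) of `trait` with the mean observed for the same species."""
    observed = {}
    for r in records:
        if r[trait] is not None:
            observed.setdefault(r["species"], []).append(r[trait])
    filled = []
    for r in records:
        value = r[trait]
        if value is None and r["species"] in observed:
            value = mean(observed[r["species"]])
        filled.append({**r, trait: value})
    return filled


def knn_impute(records, trait, predictors, k=2):
    """Fill missing values of `trait` with the mean over the k nearest complete records.

    Distance is plain Euclidean distance on the predictor columns; in practice
    the predictors would usually be standardized first.
    """
    donors = [r for r in records if r[trait] is not None]
    filled = []
    for r in records:
        value = r[trait]
        if value is None and donors:
            nearest = sorted(
                donors,
                key=lambda d: sum((d[p] - r[p]) ** 2 for p in predictors),
            )[:k]
            value = mean(d[trait] for d in nearest)
        filled.append({**r, trait: value})
    return filled


# Hypothetical plots: the last record lacks a wood density measurement.
plots = [
    {"species": "Pinus", "height": 20.0, "wood_density": 0.50},
    {"species": "Pinus", "height": 22.0, "wood_density": 0.54},
    {"species": "Quercus", "height": 12.0, "wood_density": 0.80},
    {"species": "Pinus", "height": 21.0, "wood_density": None},
]

by_species = species_mean_impute(plots, "wood_density")
by_knn = knn_impute(plots, "wood_density", predictors=["height"], k=2)
```

In this toy dataset both methods fill the gap from the two Pinus donors. They diverge once the nearest neighbours in predictor space belong to a different species.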
Classification of Large-Scale Remote Sensing Images for Automatic Identification of Health Hazards: Smoke Detection Using an Autologistic Regression Classifier.

PubMed

Wolters, Mark A; Dean, C B

2017-01-01

Remote sensing images from Earth-orbiting satellites are a potentially rich data source for monitoring and cataloguing atmospheric health hazards that cover large geographic regions. A method is proposed for classifying such images into hazard and nonhazard regions using the autologistic regression model, which may be viewed as a spatial extension of logistic regression. The method includes a novel and simple approach to parameter estimation that makes it well suited to handling the large and high-dimensional datasets arising from satellite-borne instruments. The methodology is demonstrated on both simulated images and a real application to the identification of forest fire smoke.
Internal Consistency of the NVAP Water Vapor Dataset

NASA Technical Reports Server (NTRS)

Suggs, Ronnie J.; Jedlovec, Gary J.; Arnold, James E. (Technical Monitor)

2001-01-01

The NVAP (NASA Water Vapor Project) dataset is a global dataset at 1 x 1 degree spatial resolution consisting of daily, pentad, and monthly atmospheric precipitable water (PW) products. The analysis blends measurements from the Television and Infrared Operational Satellite (TIROS) Operational Vertical Sounder (TOVS), the Special Sensor Microwave/Imager (SSM/I), and radiosonde observations into a daily collage of PW. The original dataset consisted of five years of data from 1988 to 1992. Recent updates have added three additional years (1993-1995) and incorporated procedural and algorithm changes from the original methodology. Since none of the PW sources (TOVS, SSM/I, and radiosonde) provides global coverage, the sources complement one another by providing spatial coverage over regions and during times where the others are not available. For this type of spatial and temporal blending to be successful, each of the source components should have similar or compatible accuracies. If this is not the case, regional and time varying biases may be manifested in the NVAP dataset. This study examines the consistency of the NVAP source data by comparing daily collocated TOVS and SSM/I PW retrievals with collocated radiosonde PW observations. The daily PW intercomparisons are performed over the time period of the dataset and for various regions.
Changeable camouflage: how well can flounder resemble the colour and spatial scale of substrates in their natural habitats?

PubMed Central

Akkaynak, Derya; Siemann, Liese A.; Barbosa, Alexandra

2017-01-01

Flounder change colour and pattern for camouflage. We used a spectrometer to measure reflectance spectra and a digital camera to capture body patterns of two flounder species camouflaged on four natural backgrounds of different spatial scale (sand, small gravel, large gravel and rocks). We quantified the degree of spectral match between flounder and background relative to the situation of perfect camouflage in which flounder and background were assumed to have identical spectral distribution. Computations were carried out for three biologically relevant observers: monochromatic squid, dichromatic crab and trichromatic guitarfish. Our computations present a new approach to analysing datasets with multiple spectra that have large variance. Furthermore, to investigate the spatial match between flounder and background, images of flounder patterns were analysed using a custom program originally developed to study cuttlefish camouflage. Our results show that all flounder and background spectra fall within the same colour gamut and that, in terms of different observer visual systems, flounder matched most substrates in luminance and colour contrast. Flounder matched the spatial scales of all substrates except for rocks. We discuss findings in terms of flounder biology; furthermore, we discuss our methodology in light of hyperspectral technologies that combine high-resolution spectral and spatial imaging. PMID:28405370
Changeable camouflage: how well can flounder resemble the colour and spatial scale of substrates in their natural habitats?

PubMed

Akkaynak, Derya; Siemann, Liese A; Barbosa, Alexandra; Mäthger, Lydia M

2017-03-01

Flounder change colour and pattern for camouflage. We used a spectrometer to measure reflectance spectra and a digital camera to capture body patterns of two flounder species camouflaged on four natural backgrounds of different spatial scale (sand, small gravel, large gravel and rocks). We quantified the degree of spectral match between flounder and background relative to the situation of perfect camouflage in which flounder and background were assumed to have identical spectral distribution. Computations were carried out for three biologically relevant observers: monochromatic squid, dichromatic crab and trichromatic guitarfish. Our computations present a new approach to analysing datasets with multiple spectra that have large variance. Furthermore, to investigate the spatial match between flounder and background, images of flounder patterns were analysed using a custom program originally developed to study cuttlefish camouflage. Our results show that all flounder and background spectra fall within the same colour gamut and that, in terms of different observer visual systems, flounder matched most substrates in luminance and colour contrast. Flounder matched the spatial scales of all substrates except for rocks. We discuss findings in terms of flounder biology; furthermore, we discuss our methodology in light of hyperspectral technologies that combine high-resolution spectral and spatial imaging.

Rule-based topology system for spatial databases to validate complex geographic datasets

NASA Astrophysics Data System (ADS)

Martinez-Llario, J.; Coll, E.; Núñez-Andrés, M.; Femenia-Ribera, C.

2017-06-01

A rule-based topology software system providing a highly flexible and fast procedure to enforce integrity in spatial relationships among datasets is presented. This improved topology rule system is built over the spatial extension Jaspa. Both projects are open source, freely available software developed by the corresponding author of this paper. Currently, there is no spatial DBMS that implements a rule-based topology engine (considering that the topology rules are designed and performed in the spatial backend). If the topology rules are applied in the frontend (as in many GIS desktop programs), ArcGIS is the most advanced solution. The system presented in this paper has several major advantages over the ArcGIS approach: it can be extended with new topology rules, it has a much wider set of rules, and it can mix feature attributes with topology rules as filters. In addition, the topology rule system can work with various DBMSs, including PostgreSQL, H2 or Oracle, and the logic is performed in the spatial backend. The proposed topology system allows users to check the complex spatial relationships among features (from one or several spatial layers) that require some complex cartographic datasets, such as the data specifications proposed by INSPIRE in Europe and the Land Administration Domain Model (LADM) for Cadastral data.
Using Neural Networks to Improve the Performance of Radiative Transfer Modeling Used for Geometry Dependent Surface Lambertian-Equivalent Reflectivity Calculations

NASA Technical Reports Server (NTRS)

Fasnacht, Zachary; Qin, Wenhan; Haffner, David P.; Loyola, Diego; Joiner, Joanna; Krotkov, Nickolay; Vasilkov, Alexander; Spurr, Robert

2017-01-01

Surface Lambertian-equivalent reflectivity (LER) is important for trace gas retrievals in the direct calculation of cloud fractions and indirect calculation of the air mass factor. Current trace gas retrievals use climatological surface LERs. Surface properties that impact the bidirectional reflectance distribution function (BRDF) as well as varying satellite viewing geometry can be important for retrieval of trace gases. Geometry Dependent LER (GLER) captures these effects with its calculation of sun normalized radiances (I/F) and can be used in current LER algorithms (Vasilkov et al. 2016). Pixel by pixel radiative transfer calculations are computationally expensive for large datasets. Modern satellite missions such as the Tropospheric Monitoring Instrument (TROPOMI) produce very large datasets as they take measurements at much higher spatial and spectral resolutions. Look-up table (LUT) interpolation improves the speed of radiative transfer calculations but complexity increases for non-linear functions. Neural networks perform fast calculations and can accurately predict both non-linear and linear functions with little effort.
Rapid underway profiling of water quality in Queensland estuaries.

PubMed

Hodge, Jonathan; Longstaff, Ben; Steven, Andy; Thornton, Phillip; Ellis, Peter; McKelvie, Ian

2005-01-01

We present an overview of a portable underway water quality monitoring system (RUM-Rapid Underway Monitoring), developed by integrating several off-the-shelf water quality instruments to provide rapid, comprehensive, and spatially referenced 'snapshots' of water quality conditions. We demonstrate the utility of the system from studies in the Northern Great Barrier Reef (Daintree River) and the Moreton Bay region. The Brisbane dataset highlights RUM's utility in characterising plumes as well as its ability to identify the smaller scale structure of large areas. RUM is shown to be particularly useful when measuring indicators with large small-scale variability such as turbidity and chlorophyll-a. Additionally, the Daintree dataset shows the ability to integrate other technologies, resulting in a more comprehensive analysis, whilst sampling offshore highlights some of the analytical issues required for sampling low concentration data. RUM is a low cost, highly flexible solution that can be modified for use in any water type, on most vessels and is only limited by the available monitoring technologies.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Smith, Donald F.; Schulz, Carl; Konijnenburg, Marco

High-resolution Fourier transform ion cyclotron resonance (FT-ICR) mass spectrometry imaging enables the spatial mapping and identification of biomolecules from complex surfaces. The need for long time-domain transients, and thus large raw file sizes, results in a large amount of raw data (“big data”) that must be processed efficiently and rapidly. This can be compounded by large-area imaging and/or high spatial resolution imaging. For FT-ICR, data processing and data reduction must not compromise the high mass resolution afforded by the mass spectrometer. The continuous mode “Mosaic Datacube” approach allows high mass resolution visualization (0.001 Da) of mass spectrometry imaging data, but requires additional processing as compared to feature-based processing. We describe the use of distributed computing for processing of FT-ICR MS imaging datasets with generation of continuous mode Mosaic Datacubes for high mass resolution visualization. An eight-fold improvement in processing time is demonstrated using a Dutch nationally available cloud service.
Scalable parallel distance field construction for large-scale applications

DOE PAGES

Yu, Hongfeng; Xie, Jinrong; Ma, Kwan-Liu; ...

2015-10-01

Computing distance fields is fundamental to many scientific and engineering applications. Distance fields can be used to direct analysis and reduce data. In this paper, we present a highly scalable method for computing 3D distance fields on massively parallel distributed-memory machines. A new distributed spatial data structure, named parallel distance tree, is introduced to manage the level sets of data and facilitate surface tracking over time, resulting in significantly reduced computation and communication costs for calculating the distance to the surface of interest from any spatial locations. Our method supports several data types and distance metrics from real-world applications. We demonstrate its efficiency and scalability on state-of-the-art supercomputers using both large-scale volume datasets and surface models. We also demonstrate in-situ distance field computation on dynamic turbulent flame surfaces for a petascale combustion simulation. In conclusion, our work greatly extends the usability of distance fields for demanding applications.
Scalable Parallel Distance Field Construction for Large-Scale Applications.

PubMed

Yu, Hongfeng; Xie, Jinrong; Ma, Kwan-Liu; Kolla, Hemanth; Chen, Jacqueline H

2015-10-01

Computing distance fields is fundamental to many scientific and engineering applications. Distance fields can be used to direct analysis and reduce data. In this paper, we present a highly scalable method for computing 3D distance fields on massively parallel distributed-memory machines. A new distributed spatial data structure, named parallel distance tree, is introduced to manage the level sets of data and facilitate surface tracking over time, resulting in significantly reduced computation and communication costs for calculating the distance to the surface of interest from any spatial locations. Our method supports several data types and distance metrics from real-world applications. We demonstrate its efficiency and scalability on state-of-the-art supercomputers using both large-scale volume datasets and surface models. We also demonstrate in-situ distance field computation on dynamic turbulent flame surfaces for a petascale combustion simulation. Our work greatly extends the usability of distance fields for demanding applications.
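For readers unfamiliar with the term, a distance field stores, for every grid location, the distance to the nearest point of a surface of interest. The sketch below is a serial brute-force reference on a tiny 3D grid, useful only for showing what is being computed. It is not the parallel distance tree described in the abstract, and the grid size and surface points are assumptions made for the example.

```python
import math
from itertools import product


def brute_force_distance_field(shape, surface_points):
    """Map every voxel index in a grid of the given shape to its Euclidean
    distance from the nearest surface point.

    Cost is O(number of voxels * number of surface points), which is exactly
    what scalable methods are designed to avoid.
    """
    field = {}
    for voxel in product(*(range(n) for n in shape)):
        field[voxel] = min(math.dist(voxel, s) for s in surface_points)
    return field


# Hypothetical 3 x 3 x 3 grid with a "surface" consisting of two points.
field = brute_force_distance_field((3, 3, 3), [(0, 0, 0), (2, 2, 0)])
```

Other distance metrics can be substituted by replacing `math.dist` with a different function of two points.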
Sonification Prototype for Space Physics

NASA Astrophysics Data System (ADS)

Candey, R. M.; Schertenleib, A. M.; Diaz Merced, W. L.

2005-12-01

As an alternative and adjunct to visual displays, auditory exploration of data via sonification (data-controlled sound) and audification (audible playback of data samples) is promising for complex or rapidly/temporally changing visualizations, for data exploration of large datasets (particularly multi-dimensional datasets), and for exploring datasets in frequency rather than spatial dimensions (see also the International Conferences on Auditory Display). Besides improving data exploration and analysis for most researchers, the use of sound is especially valuable as an assistive technology for visually-impaired people and can make science and math more exciting for high school and college students. Only recently have the hardware and software come together to make a cross-platform open-source sonification tool feasible. We have developed a prototype sonification data analysis tool using the JavaSound API and NASA GSFC's ViSBARD software. Wanda Diaz Merced, a blind astrophysicist from Puerto Rico, is instrumental in advising on and testing the tool.
Who, What, When, Where? Determining the Health Implications of Wildfire Smoke Exposure

NASA Astrophysics Data System (ADS)

Ford, B.; Lassman, W.; Gan, R.; Burke, M.; Pfister, G.; Magzamen, S.; Fischer, E. V.; Volckens, J.; Pierce, J. R.

2016-12-01

Exposure to poor air quality is associated with negative impacts on human health. A large natural source of PM in the western U.S. is from wildland fires. Accurately attributing health endpoints to wildland-fire smoke requires a determination of the exposed population. This is a difficult endeavor because most current methods for monitoring air quality are not at high temporal and spatial resolutions. Therefore, there is a growing effort to include multiple datasets and create blended products of smoke exposure that can exploit the strengths of each dataset. In this work, we combine model (WRF-Chem) simulations, NASA satellite (MODIS) observations, and in-situ surface monitors to improve exposure estimates. We will also introduce a social-media dataset of self-reported smoke/haze/pollution to improve population-level exposure estimates for the summer of 2015. Finally, we use these detailed exposure estimates in different epidemiologic study designs to provide an in-depth understanding of the role wildfire exposure plays on health outcomes.
Promoter classifier: software package for promoter database analysis.

PubMed

Gershenzon, Naum I; Ioshikhes, Ilya P

2005-01-01

Promoter Classifier is a package of seven stand-alone Windows-based C++ programs allowing the following basic manipulations with a set of promoter sequences: (i) calculation of positional distributions of nucleotides averaged over all promoters of the dataset; (ii) calculation of the averaged occurrence frequencies of the transcription factor binding sites and their combinations; (iii) division of the dataset into subsets of sequences containing or lacking certain promoter elements or combinations; (iv) extraction of the promoter subsets containing or lacking CpG islands around the transcription start site; and (v) calculation of spatial distributions of the promoter DNA stacking energy and bending stiffness. All programs have a user-friendly interface and provide the results in a convenient graphical form. The Promoter Classifier package is an effective tool for various basic manipulations with eukaryotic promoter sequences that usually are necessary for analysis of large promoter datasets. The program Promoter Divider is described in more detail as a representative component of the package.
The coming paradigm shift: A transition from manual to automated microscopy.

PubMed

Farahani, Navid; Monteith, Corey E

2016-01-01

The field of pathology has used light microscopy (LM) extensively since the mid-19th century for examination of histological tissue preparations. This technology has remained the foremost tool in use by pathologists even as other fields have undergone a great change in recent years through new technologies. However, as new microscopy techniques are perfected and made available, this reliance on the standard LM will likely begin to change. Advanced imaging methods involving both diffraction-limited and subdiffraction techniques are bringing nondestructive, high-resolution, molecular-level imaging to pathology. Some of these technologies can produce three-dimensional (3D) datasets from sampled tissues. In addition, block-face/tissue-sectioning techniques are already providing automated, large-scale 3D datasets of whole specimens. These datasets allow pathologists to see an entire sample with all of its spatial information intact, and furthermore allow image analysis such as detection, segmentation, and classification, which are impossible in standard LM. It is likely that these technologies herald a major paradigm shift in the field of pathology.
Online Visualization and Value Added Services of MERRA-2 Data at GES DISC

NASA Technical Reports Server (NTRS)

Shen, Suhung; Ostrenga, Dana M.; Vollmer, Bruce E.; Hegde, Mahabaleshwa S.; Wei, Jennifer C.; Bosilovich, Michael G.

2017-01-01

NASA climate reanalysis datasets from MERRA-2, distributed at the Goddard Earth Sciences Data and Information Services Center (GES DISC), have been used in broad research areas, such as climate variations, extreme weather, agriculture, renewable energy, and air quality. The datasets contain numerous variables for atmosphere, land, and ocean, grouped into 95 products. The total archived volume is approximately 337 TB (approximately 562K files) at the end of October 2017. Due to the large number of products and files, and the large data volumes, it may be a challenge for a user to find and download the data of interest. The support team at GES DISC, working closely with the MERRA-2 science team, has created and is continuing to work on value-added data services to best meet the needs of a broad user community. This presentation, using aerosol over Asia Monsoon as an example, provides an overview of the MERRA-2 data services at GES DISC, including: How to find the data? How many data access methods are provided? What are the best data access methods for me? How to download the subsetted (parameter, spatial, temporal) data and save it in the preferred spatial resolution and data format? How to visualize and explore the data online? In addition, we introduce a future online analytic tool designed for supporting application research, focusing on long-term hourly time-series data access and analysis.
A Columnar Storage Strategy with Spatiotemporal Index for Big Climate Data

NASA Astrophysics Data System (ADS)

Hu, F.; Bowen, M. K.; Li, Z.; Schnase, J. L.; Duffy, D.; Lee, T. J.; Yang, C. P.

2015-12-01

Large collections of observational, reanalysis, and climate model output data may grow to as large as 100 PB in the coming years, which places climate datasets in the Big Data domain, and various distributed computing frameworks have been utilized to address the challenges of big climate data analysis. However, due to the binary data formats (NetCDF, HDF) with high spatial and temporal dimensions, the computing frameworks in the Apache Hadoop ecosystem are not originally suited for big climate data. In order to make the computing frameworks in the Hadoop ecosystem directly support big climate data, we propose a columnar storage format with a spatiotemporal index to store climate data, which will support any project in the Apache Hadoop ecosystem (e.g. MapReduce, Spark, Hive, Impala). With this approach, the climate data are converted into the binary Parquet data format, a columnar storage format, and a spatial and temporal index is built and attached to the end of the Parquet files to enable real-time data queries. Such climate data in the Parquet data format are then available to any computing framework in the Hadoop ecosystem. The proposed approach is evaluated using the NASA Modern-Era Retrospective Analysis for Research and Applications (MERRA) climate reanalysis dataset. Experimental results show that this approach could efficiently bridge the gap between big climate data and the distributed computing frameworks, and that the spatiotemporal index could significantly accelerate data querying and processing.
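The abstract above does not describe the layout of its spatiotemporal index, so the following is only a sketch of the general idea: each stored chunk is summarized by its time range and bounding box, and a query reads only the chunks whose summaries intersect the requested window. The `ChunkEntry` record, the field names and the byte offsets are hypothetical, and no Parquet library is involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkEntry:
    """Hypothetical index record describing one stored chunk of a climate variable."""
    t_start: int      # first time step covered (inclusive)
    t_end: int        # last time step covered (inclusive)
    lat_min: float
    lat_max: float
    lon_min: float
    lon_max: float
    offset: int       # byte offset of the chunk in the data file (made-up values)


def ranges_overlap(lo_a, hi_a, lo_b, hi_b):
    """True when the closed intervals [lo_a, hi_a] and [lo_b, hi_b] intersect."""
    return lo_a <= hi_b and lo_b <= hi_a


def find_chunks(index, t_range, lat_range, lon_range):
    """Return the offsets of chunks that intersect the query in time, latitude and longitude."""
    return [
        e.offset
        for e in index
        if ranges_overlap(e.t_start, e.t_end, *t_range)
        and ranges_overlap(e.lat_min, e.lat_max, *lat_range)
        and ranges_overlap(e.lon_min, e.lon_max, *lon_range)
    ]


index = [
    ChunkEntry(0, 23, -90.0, 0.0, -180.0, 0.0, offset=0),
    ChunkEntry(0, 23, 0.0, 90.0, 0.0, 180.0, offset=4096),
    ChunkEntry(24, 47, 0.0, 90.0, 0.0, 180.0, offset=8192),
]

# Time steps 30-40 over a box in the north-eastern quadrant touch only the third chunk.
hits = find_chunks(index, t_range=(30, 40), lat_range=(10.0, 20.0), lon_range=(100.0, 120.0))
```

A linear scan over the index is enough for a sketch. A production index would typically use a tree or grid structure so that lookups do not touch every entry.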
L1 norm based common spatial patterns decomposition for scalp EEG BCI.

PubMed

Li, Peiyang; Xu, Peng; Zhang, Rui; Guo, Lanjin; Yao, Dezhong

2013-08-06

Brain-computer interfaces (BCIs) are one of the most popular branches of biomedical engineering. They aim to establish communication between disabled persons and auxiliary equipment in order to improve patients' lives. In motor imagery (MI) based BCI, one of the popular feature extraction strategies is Common Spatial Patterns (CSP). In practical BCI situations, scalp EEG inevitably contains outliers and artifacts introduced by ocular activity, head motion or loose electrode contact. Because outliers and artifacts usually have large amplitudes, when CSP is solved under the L2 norm their effect is exaggerated by the squaring of outlier values, which ultimately degrades MI-based BCI performance. The L1 norm, by contrast, lowers the effect of outliers, as shown in other application fields such as the EEG inverse problem and face recognition. In this paper, we present a new CSP implementation that uses the L1 norm instead of the L2 norm to solve the eigenproblem for spatial filter estimation, with the aim of improving the robustness of CSP to outliers. To evaluate its performance, we applied our method, the standard CSP and the regularized CSP with Tikhonov regularization (TR-CSP) to both the peer BCI dataset with simulated outliers and the dataset from the MI BCI system developed in our group. The McNemar test is used to investigate whether the differences among the three CSPs are statistically significant. The results on both the simulated and the real BCI datasets consistently reveal that the proposed method achieves much higher classification accuracies than the conventional CSP and the TR-CSP. By incorporating L1-norm-based eigendecomposition into Common Spatial Patterns, the proposed approach can effectively improve the robustness of BCI systems to EEG outliers and thus has potential for actual MI BCI applications, where outliers are inevitably introduced into EEG recordings.
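The claim that squaring exaggerates large-amplitude outliers can be illustrated without any CSP machinery. The mean minimizes the sum of squared deviations (an L2 criterion), while the median minimizes the sum of absolute deviations (an L1 criterion). The sketch below compares how far each estimate moves when one artifact-like sample is added; the amplitude values are invented, and this is an analogy for the L1 versus L2 behaviour, not the authors' L1-norm CSP.

```python
from statistics import mean, median

# Hypothetical clean amplitudes and the same data with one large artifact.
clean = [1.0, 1.2, 0.9, 1.1, 1.0]
contaminated = clean + [50.0]

# How far does each location estimate move once the outlier is present?
l2_shift = abs(mean(contaminated) - mean(clean))      # mean: L2 minimizer
l1_shift = abs(median(contaminated) - median(clean))  # median: L1 minimizer
```

The L2 estimate is dragged far towards the artifact, while the L1 estimate barely moves.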
Characterizing the Spatial Contiguity of Extreme Precipitation over the US in the Recent Past

NASA Astrophysics Data System (ADS)

Touma, D. E.; Swain, D. L.; Diffenbaugh, N. S.

2016-12-01

The spatial characteristics of extreme precipitation over an area can define the hydrologic response in a basin, subsequently affecting the flood risk in the region. Here, we examine the spatial extent of extreme precipitation in the US by defining its "footprint": a contiguous area of rainfall exceeding a certain threshold (e.g., 90th percentile) on a given day. We first characterize the climatology of extreme rainfall footprint sizes across the US from 1980 to 2015 using Daymet, a high-resolution observational gridded rainfall dataset. We find that there are distinct regional and seasonal differences in average footprint sizes of extreme daily rainfall. In the winter, the Midwest shows footprints exceeding 500,000 sq. km while the Front Range exhibits footprints of 10,000 sq. km. Alternatively, the summer average footprint size is generally smaller and more uniform across the US, ranging from 10,000 sq. km in the Southwest to 100,000 sq. km in Montana and North Dakota. Moreover, we find that there are some significant increasing trends of average footprint size between 1980 and 2015, specifically in the Southwest in the winter and the Northeast in the spring. While gridded daily rainfall datasets allow for a practical framework in calculating footprint size, this calculation heavily depends on the interpolation methods that have been used in creating the dataset. Therefore, we assess footprint size using the GHCN-Daily station network and use geostatistical methods to define footprints of extreme rainfall directly from station data. Compared to the findings from Daymet, preliminary results using this method show fewer small daily footprint sizes over the US while large footprints are of similar number and magnitude to Daymet.
Overall, defining the spatial characteristics of extreme rainfall as well as observed and expected changes in these characteristics allows us to better understand the hydrologic response to extreme rainfall and how to better characterize flood risks.
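On a gridded product, the footprint definition used in this abstract (a contiguous area exceeding a threshold on a given day) amounts to connected-component labelling. The sketch below is one possible reading of that definition, with assumptions the abstract does not confirm: 4-neighbour connectivity, a single fixed threshold rather than a per-cell percentile, and footprint size counted in grid cells rather than sq. km. The rainfall values are invented.

```python
from collections import deque


def footprint_sizes(grid, threshold):
    """Return the size (in cells) of each contiguous region where grid > threshold.

    Cells are connected through their four edge neighbours.
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or grid[r][c] <= threshold:
                continue
            # Breadth-first flood fill over one footprint.
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (
                        0 <= ny < rows
                        and 0 <= nx < cols
                        and not seen[ny][nx]
                        and grid[ny][nx] > threshold
                    ):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)
    return sizes


# Hypothetical daily rainfall (mm) on a 4 x 4 grid.
rain = [
    [30, 35, 0, 0],
    [40, 0, 0, 28],
    [0, 0, 0, 0],
    [0, 26, 27, 0],
]

sizes = footprint_sizes(rain, threshold=25)
```

Multiplying each count by the cell area would convert the sizes to an area, provided the grid cells are of equal area.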
The Optimization of Trained and Untrained Image Classification Algorithms for Use on Large Spatial Datasets

NASA Technical Reports Server (NTRS)

Kocurek, Michael J.

2005-01-01

The HARVIST project seeks to automatically provide an accurate, interactive interface to predict crop yield over the entire United States. In order to accomplish this goal, large images must be quickly and automatically classified by crop type. Current trained and untrained classification algorithms, while accurate, are highly inefficient when operating on large datasets. This project sought to develop new variants of two standard trained and untrained classification algorithms that are optimized to take advantage of the spatial nature of image data. The first algorithm, harvist-cluster, utilizes divide-and-conquer techniques to precluster an image in the hopes of increasing overall clustering speed. The second algorithm, harvistSVM, utilizes support vector machines (SVMs), a type of trained classifier. It seeks to increase classification speed by applying a "meta-SVM" to a quick (but inaccurate) SVM to approximate a slower, yet more accurate, SVM. Speedups were achieved by tuning the algorithm to quickly identify when the quick SVM was incorrect, and then reclassifying low-confidence pixels as necessary. Comparing the classification speeds of both algorithms to known baselines showed a slight speedup for large values of k (the number of clusters) for harvist-cluster, and a significant speedup for harvistSVM. Future work aims to automate the parameter tuning process required for harvistSVM, and further improve classification accuracy and speed. Additionally, this research will move documents created in Canvas into ArcGIS. The launch of the Mars Reconnaissance Orbiter (MRO) will provide a wealth of image data such as global maps of Martian weather and high resolution global images of Mars. The ability to store this new data in a georeferenced format will support future Mars missions by providing data for landing site selection and the search for water on Mars.
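The speedup strategy described for harvistSVM, accepting a quick classifier's answer unless it looks unreliable and otherwise falling back to a slower, more accurate one, is a classifier cascade. The sketch below shows only that control flow. The two toy classifiers, the confidence measure and the threshold are assumptions, and no SVM or meta-SVM is trained here.

```python
def cascade_classify(pixels, fast, slow, min_confidence):
    """Label each pixel with `fast`, re-running `slow` when confidence is low.

    fast(pixel) -> (label, confidence in [0, 1]); slow(pixel) -> label.
    Returns the labels and the number of pixels sent to the slow classifier.
    """
    labels = []
    slow_calls = 0
    for p in pixels:
        label, confidence = fast(p)
        if confidence < min_confidence:
            label = slow(p)
            slow_calls += 1
        labels.append(label)
    return labels, slow_calls


def fast_threshold(p):
    """Toy quick classifier: a hard cut at 0.5, confident far from the cut."""
    return ("crop" if p > 0.5 else "other", abs(p - 0.5) * 2)


def slow_rule(p):
    """Toy stand-in for the slower, more accurate classifier."""
    return "crop" if p >= 0.45 else "other"


# Hypothetical single-band pixel values in [0, 1].
labels, slow_calls = cascade_classify(
    [0.9, 0.48, 0.1, 0.55], fast_threshold, slow_rule, min_confidence=0.2
)
```

Only the two pixels near the decision boundary reach the slow classifier, which is where the time saving comes from when most pixels are easy.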
SOURCE EXPLORER: Towards Web Browser Based Tools for Astronomical Source Visualization and Analysis

NASA Astrophysics Data System (ADS)

Young, M. D.; Hayashi, S.; Gopu, A.

2014-05-01

As a new generation of large format, high-resolution imagers comes online (ODI, DECAM, LSST, etc.) we are faced with the daunting prospect of astronomical images containing upwards of hundreds of thousands of identifiable sources. Visualizing and interacting with such large datasets using traditional astronomical tools appears to be unfeasible, and a new approach is required. We present here a method for the display and analysis of arbitrarily large source datasets using dynamically scaling levels of detail, enabling scientists to rapidly move from large-scale spatial overviews down to the level of individual sources and everything in-between. Based on the recognized standards of HTML5+JavaScript, we enable observers and archival users to interact with their images and sources from any modern computer without having to install specialized software. We demonstrate the ability to produce large-scale source lists from the images themselves, as well as overlaying data from publicly available source lists (2MASS, GALEX, SDSS, etc.) or user-provided source lists. A high-availability cluster of computational nodes allows us to produce these source maps on demand and customized based on user input. User-generated source lists and maps are persistent across sessions and are available for further plotting, analysis, refinement, and culling.
Application of Climate Assessment Tool (CAT) to estimate climate variability impacts on nutrient loading from local watersheds

Treesearch

Ying Ouyang; Prem B. Parajuli; Gary Feng; Theodor D. Leininger; Yongshan Wan; Padmanava Dash

2018-01-01

A vast amount of future climate scenario datasets, created by climate models such as general circulation models (GCMs), have been used in conjunction with watershed models to project future climate variability impact on hydrological processes and water quality. However, these low spatial-temporal resolution datasets are often difficult to downscale spatially and...
A new global 1-km dataset of percentage tree cover derived from remote sensing

USGS Publications Warehouse

DeFries, R.S.; Hansen, M.C.; Townshend, J.R.G.; Janetos, A.C.; Loveland, Thomas R.

2000-01-01

Accurate assessment of the spatial extent of forest cover is a crucial requirement for quantifying the sources and sinks of carbon from the terrestrial biosphere. In the more immediate context of the United Nations Framework Convention on Climate Change, implementation of the Kyoto Protocol calls for estimates of carbon stocks for a baseline year as well as for subsequent years. Data sources from country level statistics and other ground-based information are based on varying definitions of 'forest' and are consequently problematic for obtaining spatially and temporally consistent carbon stock estimates. By combining two datasets previously derived from the Advanced Very High Resolution Radiometer (AVHRR) at 1 km spatial resolution, we have generated a prototype global map depicting percentage tree cover and associated proportions of trees with different leaf longevity (evergreen and deciduous) and leaf type (broadleaf and needleleaf). The product is intended for use in terrestrial carbon cycle models, in conjunction with other spatial datasets such as climate and soil type, to obtain more consistent and reliable estimates of carbon stocks. The percentage tree cover dataset is available through the Global Land Cover Facility at the University of Maryland at http://glcf.umiacs.umd.edu.
Enhancing GIS Capabilities for High Resolution Earth Science Grids

NASA Astrophysics Data System (ADS)

Koziol, B. W.; Oehmke, R.; Li, P.; O'Kuinghttons, R.; Theurich, G.; DeLuca, C.

2017-12-01

Applications for high performance GIS will continue to increase as Earth system models pursue more realistic representations of Earth system processes. Finer spatial resolution model input and output, unstructured or irregular modeling grids, data assimilation, and regional coordinate systems present novel challenges for GIS frameworks operating in the Earth system modeling domain. This presentation provides an overview of two GIS-driven applications that combine high performance software with big geospatial datasets to produce value-added tools for the modeling and geoscientific community. First, a large-scale interpolation experiment using National Hydrography Dataset (NHD) catchments, a high resolution rectilinear CONUS grid, and the Earth System Modeling Framework's (ESMF) conservative interpolation capability will be described. ESMF is a parallel, high-performance software toolkit that provides capabilities (e.g. interpolation) for building and coupling Earth science applications. ESMF is developed primarily by the NOAA Environmental Software Infrastructure and Interoperability (NESII) group. The purpose of this experiment was to test and demonstrate the utility of high performance scientific software in traditional GIS domains. Special attention will be paid to the nuanced requirements for dealing with high resolution, unstructured grids in scientific data formats. Second, a chunked interpolation application using ESMF and OpenClimateGIS (OCGIS) will demonstrate how spatial subsetting can virtually remove computing resource ceilings for very high spatial resolution interpolation operations. OCGIS is a NESII-developed Python software package designed for the geospatial manipulation of high-dimensional scientific datasets. An overview of the data processing workflow, why a chunked approach is required, and how the application could be adapted to meet operational requirements will be discussed here. 
In addition, we'll provide a general overview of OCGIS's parallel subsetting capabilities including challenges in the design and implementation of a scientific data subsetter.
Ensemble reconstruction of spatio-temporal extreme low-flow events in France since 1871

NASA Astrophysics Data System (ADS)

Caillouet, Laurie; Vidal, Jean-Philippe; Sauquet, Eric; Devers, Alexandre; Graff, Benjamin

2017-06-01

The length of streamflow observations is generally limited to the last 50 years even in data-rich countries like France. It therefore offers too small a sample of extreme low-flow events to properly explore the long-term evolution of their characteristics and associated impacts. To overcome this limit, this work first presents a daily 140-year ensemble reconstructed streamflow dataset for a reference network of near-natural catchments in France. This dataset, called SCOPE Hydro (Spatially COherent Probabilistic Extended Hydrological dataset), is based on (1) a probabilistic precipitation, temperature, and reference evapotranspiration downscaling of the Twentieth Century Reanalysis over France, called SCOPE Climate, and (2) continuous hydrological modelling using SCOPE Climate as forcings over the whole period. This work then introduces tools for defining spatio-temporal extreme low-flow events. Extreme low-flow events are first locally defined through the sequent peak algorithm using a novel combination of a fixed threshold and a daily variable threshold. A dedicated spatial matching procedure is then established to identify spatio-temporal events across France. This procedure is furthermore adapted to the SCOPE Hydro 25-member ensemble to characterize in a probabilistic way unrecorded historical events at the national scale. Extreme low-flow events are described and compared in a spatially and temporally homogeneous way over 140 years on a large set of catchments. Results highlight well-known recent events like 1976 or 1989-1990, but also older and relatively forgotten ones like the 1878 and 1893 events. These results contribute to improving our knowledge of historical events and provide a selection of benchmark events for climate change adaptation purposes. Moreover, this study allows for further detailed analyses of the effect of climate variability and anthropogenic climate change on low-flow hydrology at the scale of France.
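As a rough illustration of the sequent peak algorithm mentioned in the abstract, the sketch below accumulates the deficit of flow below a threshold and reports one event per deficit spell. It assumes the commonly used recursion D[t] = max(0, D[t-1] + threshold[t] - flow[t]); the study's particular combination of a fixed and a daily variable threshold is not reproduced, so the threshold is simply passed in as one value per day. The function name and the sample numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def sequent_peak_events(
    flow: Sequence[float], threshold: Sequence[float]
) -> List[Tuple[int, int, float]]:
    """Return (start_index, peak_index, max_deficit) for each deficit event.

    The cumulative deficit follows D[t] = max(0, D[t-1] + threshold[t] - flow[t]).
    An event runs from the first step with D > 0 until D returns to 0.
    """
    events = []
    deficit = 0.0
    start = peak = None
    max_deficit = 0.0
    for t, (q, q0) in enumerate(zip(flow, threshold)):
        deficit = max(0.0, deficit + q0 - q)
        if deficit > 0.0:
            if start is None:
                start, max_deficit = t, 0.0
            if deficit > max_deficit:
                max_deficit, peak = deficit, t
        elif start is not None:
            # Deficit has been fully replenished: close the event.
            events.append((start, peak, max_deficit))
            start = peak = None
    if start is not None:  # event still open at the end of the record
        events.append((start, peak, max_deficit))
    return events


# Made-up daily flows against a constant threshold of 4.0.
flow = [5.0, 3.0, 2.0, 6.0, 9.0, 3.0, 5.0]
threshold = [4.0] * len(flow)
events = sequent_peak_events(flow, threshold)
```

With these inputs the function reports two events: one starting at index 1 with a maximum deficit of 3.0 reached at index 2, and a one-day event at index 5 with a deficit of 1.0.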

Spatially explicit modeling of particulate nutrient flux in large global rivers

NASA Astrophysics Data System (ADS)

Cohen, S.; Kettner, A.; Mayorga, E.; Harrison, J. A.

2017-12-01

Water, sediment, nutrient and carbon fluxes along river networks have undergone considerable alterations in response to anthropogenic and climatic changes, with significant consequences to infrastructure, agriculture, water security, ecology and geomorphology worldwide. However, in a global setting, these changes in fluvial fluxes and their spatial and temporal characteristics are poorly constrained, due to the limited availability of continuous and long-term observations. We present results from a new global-scale particulate modeling framework (WBMsedNEWS) that combines the Global NEWS watershed nutrient export model with the spatially distributed WBMsed water and sediment model. We compare the model predictions against multiple observational datasets. The results indicate that the model is able to accurately predict particulate nutrient (Nitrogen, Phosphorus and Organic Carbon) fluxes on an annual time scale. Analysis of intra-basin nutrient dynamics and fluxes to global oceans is presented.
Improving Discoverability of Geophysical Data using Location Based Services

NASA Astrophysics Data System (ADS)

Morrison, D.; Barnes, R. J.; Potter, M.; Nylund, S. R.; Patrone, D.; Weiss, M.; Talaat, E. R.; Sarris, T. E.; Smith, D.

2014-12-01

The great promise of Virtual Observatories is the ability to perform complex search operations across the metadata of a large variety of different data sets. This allows the researcher to isolate and select the relevant measurements for their topic of study. The Virtual ITM Observatory (VITMO) has many diverse geophysical datasets that cover a large temporal and spatial range that present a unique search problem. VITMO provides many methods by which the user can search for and select data of interest including restricting selections based on geophysical conditions (solar wind speed, Kp, etc) as well as finding those datasets that overlap in time. One of the key challenges in improving discoverability is the ability to identify portions of datasets that overlap in time and in location. The difficulty is that location data is not contained in the metadata for datasets produced by satellites and would be extremely large in volume if it were available, making searching for overlapping data very time consuming. To solve this problem we have developed a series of light-weight web services that can provide a new data search capability for VITMO and others. The services consist of a database of spacecraft ephemerides and instrument fields of view; an overlap calculator to find times when the fields of view of different instruments intersect; and a magnetic field line tracing service that maps in situ and ground based measurements to the equatorial plane in magnetic coordinates for a number of field models and geophysical conditions. These services run in real-time when the user queries for data. They will allow the non-specialist user to select data that they were previously unable to locate, opening up analysis opportunities beyond the instrument teams and specialists, making it easier for future students who come into the field.
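The temporal half of the overlap problem described above can be sketched as an intersection of two interval lists. This is only an illustration under simplifying assumptions: each instrument's coverage is given as a sorted list of non-overlapping (start, end) windows, and the spatial test on instrument fields of view that the VITMO overlap calculator performs is not modeled. The function and variable names are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end), e.g. seconds since an epoch


def overlapping_windows(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Intersect two lists of time intervals, each sorted and non-overlapping."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            result.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


instrument_a = [(0, 10), (20, 30)]
instrument_b = [(5, 25), (28, 40)]
common = overlapping_windows(instrument_a, instrument_b)
```

The two-pointer sweep visits each window once, so the cost grows linearly with the number of windows rather than with the volume of the underlying measurements.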
Discovering Cortical Folding Patterns in Neonatal Cortical Surfaces Using Large-Scale Dataset

PubMed Central

Meng, Yu; Li, Gang; Wang, Li; Lin, Weili; Gilmore, John H.

2017-01-01

The cortical folding of the human brain is highly complex and variable across individuals. Mining the major patterns of cortical folding from modern large-scale neuroimaging datasets is of great importance in advancing techniques for neuroimaging analysis and understanding the inter-individual variations of cortical folding and its relationship with cognitive function and disorders. As the primary cortical folding is genetically influenced and has been established at term birth, neonates, with minimal exposure to complicated postnatal environmental influences, are the ideal candidates for understanding the major patterns of cortical folding. In this paper, for the first time, we propose a novel method for discovering the major patterns of cortical folding in a large-scale dataset of neonatal brain MR images (N = 677). In our method, first, cortical folding is characterized by the distribution of sulcal pits, which are the locally deepest points in cortical sulci. Because deep sulcal pits are genetically related, relatively consistent across individuals, and also stable during brain development, they are well suited for representing and characterizing cortical folding. Then, the similarities between sulcal pit distributions of any two subjects are measured from spatial, geometrical, and topological points of view. Next, these different measurements are adaptively fused together using a similarity network fusion technique, to preserve their common information and also capture their complementary information. Finally, leveraging the fused similarity measurements, a hierarchical affinity propagation algorithm is used to group similar sulcal folding patterns together. The proposed method has been applied to 677 neonatal brains (the largest neonatal dataset to our knowledge) in the central sulcus, superior temporal sulcus, and cingulate sulcus, and revealed multiple distinct and meaningful folding patterns in each region. PMID:28229131
EverVIEW: a visualization platform for hydrologic and Earth science gridded data

USGS Publications Warehouse

Romañach, Stephanie S.; McKelvy, James M.; Suir, Kevin J.; Conzelmann, Craig

2015-01-01

The EverVIEW Data Viewer is a cross-platform desktop application that combines and builds upon multiple open source libraries to help users to explore spatially-explicit gridded data stored in Network Common Data Form (NetCDF). Datasets are displayed across multiple side-by-side geographic or tabular displays, showing colorized overlays on an Earth globe or grid cell values, respectively. Time-series datasets can be animated to see how water surface elevation changes through time or how habitat suitability for a particular species might change over time under a given scenario. Initially targeted toward Florida's Everglades restoration planning, EverVIEW has been flexible enough to address the varied needs of large-scale planning beyond Florida, and is currently being used in biological planning efforts nationally and internationally.
Bayesian geostatistics in health cartography: the perspective of malaria.

PubMed

Patil, Anand P; Gething, Peter W; Piel, Frédéric B; Hay, Simon I

2011-06-01

Maps of parasite prevalences and other aspects of infectious diseases that vary in space are widely used in parasitology. However, spatial parasitological datasets rarely, if ever, have sufficient coverage to allow exact determination of such maps. Bayesian geostatistics (BG) is a method for finding a large sample of maps that can explain a dataset, in which maps that do a better job of explaining the data are more likely to be represented. This sample represents the knowledge that the analyst has gained from the data about the unknown true map. BG provides a conceptually simple way to convert these samples to predictions of features of the unknown map, for example regional averages. These predictions account for each map in the sample, yielding an appropriate level of predictive precision.
Bayesian geostatistics in health cartography: the perspective of malaria

PubMed Central

Patil, Anand P.; Gething, Peter W.; Piel, Frédéric B.; Hay, Simon I.

2011-01-01

Maps of parasite prevalences and other aspects of infectious diseases that vary in space are widely used in parasitology. However, spatial parasitological datasets rarely, if ever, have sufficient coverage to allow exact determination of such maps. Bayesian geostatistics (BG) is a method for finding a large sample of maps that can explain a dataset, in which maps that do a better job of explaining the data are more likely to be represented. This sample represents the knowledge that the analyst has gained from the data about the unknown true map. BG provides a conceptually simple way to convert these samples to predictions of features of the unknown map, for example regional averages. These predictions account for each map in the sample, yielding an appropriate level of predictive precision. PMID:21420361
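The step from a sample of maps to a prediction of a regional average can be sketched as follows. The sketch assumes the sample of maps has already been produced by some Bayesian geostatistical model; drawing that sample is the hard part and is not shown. Each map is represented as a flat list of cell values, and the function name and the numbers are hypothetical.

```python
import statistics
from typing import Dict, Sequence


def regional_average_summary(
    sampled_maps: Sequence[Sequence[float]], region_cells: Sequence[int]
) -> Dict[str, float]:
    """Summarise a regional average across a sample of candidate maps.

    Each map is a flat list of cell values (e.g. prevalence per grid cell);
    region_cells holds the indices of the cells that make up the region.
    """
    # One regional average per sampled map.
    per_map = [
        sum(m[c] for c in region_cells) / len(region_cells) for m in sampled_maps
    ]
    return {
        "mean": statistics.mean(per_map),
        "stdev": statistics.stdev(per_map),
        "min": min(per_map),
        "max": max(per_map),
    }


# Three made-up maps of four cells each; the region covers cells 0 and 1.
maps = [
    [0.10, 0.20, 0.30, 0.40],
    [0.20, 0.20, 0.40, 0.40],
    [0.30, 0.30, 0.20, 0.20],
]
summary = regional_average_summary(maps, region_cells=[0, 1])
```

Because the average is computed separately for every map before summarising, the spread of the result reflects how much the candidate maps disagree, which is the sense in which the predictions "account for each map in the sample".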
Open Earth Observation Data for Measuring Anthropogenic Development in Coastal Zones at Continental Scales

NASA Astrophysics Data System (ADS)

Du, X.; Leinenkugel, P.; Guo, H.; Kuenzer, C.

2017-12-01

In recent decades, global coasts have been undergoing tremendous change due to accelerating socio-economic growth, which has severe effects on the functioning of global coastal systems. In view of this, accurate, timely, and area-wide global information on natural as well as anthropogenic processes in the coastal zone is of paramount importance for sustainable coastal development. A broad range of freely available satellite-derived products and open geo-datasets, as well as statistics with global coverage, exist that have not yet been fully exploited to evaluate human development patterns in coastal areas. In this study, we demonstrate the potential of freely and openly available EO and GEO data sets for characterizing and evaluating human development in coastal zones on large scales. To this end, different geo-spatial datasets such as Global Urban Footprint (GUF), Open Street Map (OSM), time series of Global Human Settlement Layer (GHSL) and Climate Change Initiative (CCI) Land cover were acquired for the entire continental coast of Asia, defined as the terrestrial area within 100 km of the coastline. In order to extract indices for the coastline, a reference structure was developed allowing the integration of a 2D spatial pattern of a given parameter to a certain location along the coast line. Based on this reference structure, statistics for the coast were calculated every 5 km parallel to the coast line as well as for four different distance intervals from the coast. The results demonstrate the highly unequal distribution of coastal development with respect to urban and agricultural usage in Asia, with large differences between and within different countries. China's coasts show the highest overall levels of urban development, while countries such as Pakistan and Myanmar show comparably low levels, with nearly no development evident apart from coastal metropolitan areas. 
Furthermore, a clear trend of decreasing urban development is evident with increasing distance from the coast. This study highlights the potential of global geo-spatial data products for deriving anthropogenic development indicators that can support the evaluation and monitoring for sustainable development of coastal zones, while also discussing the shortcomings of these datasets for such purposes.
Effective and efficient analysis of spatio-temporal data

NASA Astrophysics Data System (ADS)

Zhang, Zhongnan

Spatio-temporal data mining, i.e., mining knowledge from large amounts of spatio-temporal data, is a highly demanding field because huge amounts of spatio-temporal data have been collected in various applications, ranging from remote sensing to geographical information systems (GIS), computer cartography, environmental assessment and planning, etc. The collected data far exceed the human ability to analyze them, which makes it crucial to develop analysis tools. Recent studies on data mining have extended the scope of data mining from relational and transactional datasets to spatial and temporal datasets. Among the various forms of spatio-temporal data, remote sensing images play an important role, due to the growing number of satellites in outer space. In this dissertation, we proposed two approaches to analyze remote sensing data. The first applies association rule mining to image processing. Each image was divided into a number of image blocks. We built a spatial relationship for these blocks during the dividing process. This turned a large number of images into a spatio-temporal dataset, since the images were taken as a time series. The second implemented co-occurrence pattern discovery from these images. The generated patterns represent subsets of spatial features that are located together in space and time. A weather analysis is composed of individual analyses of several meteorological variables. These variables include temperature, pressure, dew point, wind, clouds, visibility and so on. Local-scale models provide detailed analysis and forecasts of meteorological phenomena ranging from a few kilometers to about 100 kilometers in size. When some of the above meteorological variables show a particular trend, some kind of severe weather will follow in most cases. 
Using the discovery of association rules, we found that changes in certain meteorological variables are closely related to severe weather that will occur very soon afterwards. This dissertation is composed of three parts: an introduction, background knowledge and related work, and my own three contributions to the development of approaches for spatio-temporal data mining: the DYSTAL algorithm, the STARSI algorithm, and the COSTCOP+ algorithm.
Using the Gravity Model to Estimate the Spatial Spread of Vector-Borne Diseases

PubMed Central

Barrios, José Miguel; Verstraeten, Willem W.; Maes, Piet; Aerts, Jean-Marie; Farifteh, Jamshid; Coppin, Pol

2012-01-01

The gravity models are commonly used spatial interaction models. They have been widely applied in a large set of domains dealing with interactions amongst spatial entities. The spread of vector-borne diseases is also related to the intensity of interaction between spatial entities, namely, the physical habitat of pathogens’ vectors and/or hosts, and urban areas, thus humans. This study implements the concept behind gravity models in the spatial spread of two vector-borne diseases, nephropathia epidemica and Lyme borreliosis, based on current knowledge on the transmission mechanism of these diseases. Two sources of information on vegetated systems were tested: the CORINE land cover map and MODIS NDVI. The size of vegetated areas near urban centers and a local indicator of occupation-related exposure were found significant predictors of disease risk. Both the land cover map and the space-borne dataset were suited yet not equivalent input sources to locate and measure vegetated areas of importance for disease spread. The overall results point at the compatibility of the gravity model concept and the spatial spread of vector-borne diseases. PMID:23202882
Using the gravity model to estimate the spatial spread of vector-borne diseases.

PubMed

Barrios, José Miguel; Verstraeten, Willem W; Maes, Piet; Aerts, Jean-Marie; Farifteh, Jamshid; Coppin, Pol

2012-11-30

The gravity models are commonly used spatial interaction models. They have been widely applied in a large set of domains dealing with interactions amongst spatial entities. The spread of vector-borne diseases is also related to the intensity of interaction between spatial entities, namely, the physical habitat of pathogens’ vectors and/or hosts, and urban areas, thus humans. This study implements the concept behind gravity models in the spatial spread of two vector-borne diseases, nephropathia epidemica and Lyme borreliosis, based on current knowledge on the transmission mechanism of these diseases. Two sources of information on vegetated systems were tested: the CORINE land cover map and MODIS NDVI. The size of vegetated areas near urban centers and a local indicator of occupation-related exposure were found significant predictors of disease risk. Both the land cover map and the space-borne dataset were suited yet not equivalent input sources to locate and measure vegetated areas of importance for disease spread. The overall results point at the compatibility of the gravity model concept and the spatial spread of vector-borne diseases.
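For readers unfamiliar with gravity models, the sketch below shows the generic form, in which the interaction between two spatial entities grows with the product of their "masses" and decays with distance. It is not the formulation fitted in the study: treating urban population and vegetated area as the two masses, the constant `k`, the exponent `beta = 2`, and all numbers are assumptions chosen for illustration.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # projected coordinates in km


def gravity_interaction(
    mass_a: float,
    mass_b: float,
    pos_a: Point,
    pos_b: Point,
    k: float = 1.0,     # assumed scaling constant
    beta: float = 2.0,  # assumed distance-decay exponent
) -> float:
    """Generic gravity model: k * mass_a * mass_b / distance**beta."""
    distance = math.dist(pos_a, pos_b)
    if distance == 0:
        raise ValueError("entities must be at distinct locations")
    return k * mass_a * mass_b / distance ** beta


# Hypothetical case: a town of 10,000 people and a 50 km^2 forest patch 5 km away.
score = gravity_interaction(10_000, 50.0, (0.0, 0.0), (3.0, 4.0))
```

In practice `k` and `beta` would be estimated from data rather than fixed, and the score would enter a statistical model of disease risk alongside other predictors.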
On the potential for the Partial Triadic Analysis to grasp the spatio-temporal variability of groundwater hydrochemistry

NASA Astrophysics Data System (ADS)

Gourdol, L.; Hissler, C.; Pfister, L.

2012-04-01

The Luxembourg sandstone aquifer is of major relevance for the national supply of drinking water in Luxembourg. The city of Luxembourg (20% of the country's population) gets almost 2/3 of its drinking water from this aquifer. As a consequence, the study of the groundwater hydrochemistry, as well as its spatial and temporal variations, is considered of the highest priority. Since 2005, a monitoring network has been implemented by the Water Department of Luxembourg City, with a view to a more sustainable management of this strategic water resource. The data collected to date form a large and complex dataset, describing spatial and temporal variations of many hydrochemical parameters. The data treatment issue is tightly connected to this kind of water monitoring program and complex database. Standard multivariate statistical techniques, such as principal components analysis and hierarchical cluster analysis, have been widely used as unbiased methods for extracting meaningful information from groundwater quality data and are now classically used in many hydrogeological studies, in particular to characterize temporal or spatial hydrochemical variations induced by natural and anthropogenic factors. But these classical multivariate methods deal with two-way matrices, usually parameters/sites or parameters/time, while often the dataset resulting from qualitative water monitoring programs should be seen as a datacube parameters/sites/time. Three-way matrices, such as the one we propose here, are difficult to handle and to analyse by classical multivariate statistical tools and thus should be treated with approaches dealing with three-way data structures. One possible analysis approach consists in the use of partial triadic analysis (PTA). The PTA was previously used with success in many ecological studies but never to date in the domain of hydrogeology. 
Applied to the dataset of the Luxembourg Sandstone aquifer, the PTA appears as a new promising statistical instrument for hydrogeologists, in particular to characterize temporal and spatial hydrochemical variations induced by natural and anthropogenic factors. This new approach for groundwater management offers potential for 1) identifying a common multivariate spatial structure, 2) untapping the different hydrochemical patterns and explaining their controlling factors and 3) analysing the temporal variability of this structure and grasping hydrochemical changes.
Volcanic forcing for climate modeling: a new microphysics-based dataset covering years 1600-present

NASA Astrophysics Data System (ADS)

Arfeuille, F.; Weisenstein, D.; Mack, H.; Rozanov, E.; Peter, T.; Brönnimann, S.

2013-02-01

As the understanding and representation of the impacts of volcanic eruptions on climate have improved in the last decades, uncertainties in the stratospheric aerosol forcing from large eruptions are now not only linked to visible optical depth estimates on a global scale but also to details on the size, latitude and altitude distributions of the stratospheric aerosols. Based on our understanding of these uncertainties, we propose a new model-based approach to generating a volcanic forcing for General-Circulation-Model (GCM) and Chemistry-Climate-Model (CCM) simulations. This new volcanic forcing, covering the 1600-present period, uses an aerosol microphysical model to provide a realistic, physically consistent treatment of the stratospheric sulfate aerosols. Twenty-six eruptions were modeled individually using the latest available ice cores aerosol mass estimates and historical data on the latitude and date of eruptions. The evolution of aerosol spatial and size distribution after the sulfur dioxide discharge are hence characterized for each volcanic eruption. Large variations are seen in hemispheric partitioning and size distributions in relation to location/date of eruptions and injected SO2 masses. Results for recent eruptions are in good agreement with observations. By providing accurate amplitude and spatial distributions of shortwave and longwave radiative perturbations by volcanic sulfate aerosols, we argue that this volcanic forcing may help refine the climate model responses to the large volcanic eruptions since 1600. The final dataset consists of 3-D values (with constant longitude) of spectrally resolved extinction coefficients, single scattering albedos and asymmetry factors calculated for different wavelength bands upon request. Surface area densities for heterogeneous chemistry are also provided.
Detecting and Quantifying Forest Change: The Potential of Existing C- and X-Band Radar Datasets.

PubMed

Tanase, Mihai A; Ismail, Ismail; Lowell, Kim; Karyanto, Oka; Santoro, Maurizio

2015-01-01

This paper evaluates the opportunity provided by global interferometric radar datasets for monitoring deforestation, degradation and forest regrowth in tropical and semi-arid environments. The paper describes an easy-to-implement method for detecting forest spatial changes and estimating their magnitude. The datasets were acquired within space-borne high spatial resolution radar missions at near-global scales, thus being significant for monitoring systems developed under the United Nations Framework Convention on Climate Change (UNFCCC). The approach presented in this paper was tested in two areas located in Indonesia and Australia. Forest change estimation was based on differences between a reference dataset acquired in February 2000 by the Shuttle Radar Topography Mission (SRTM) and TanDEM-X mission (TDM) datasets acquired in 2011 and 2013. The synergy between SRTM and TDM datasets allowed not only identifying changes in forest extent but also estimating their magnitude with respect to the reference through variations in forest height.
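The differencing idea in this abstract can be illustrated with a toy example that compares two co-registered height grids cell by cell. It is a simplified sketch, not the paper's method: the 5 m change threshold, the function name and the sample heights are assumptions, and real SRTM and TanDEM-X data would first need co-registration and a correction for the different radar bands.

```python
from typing import List


def height_change_map(
    reference: List[List[float]],
    later: List[List[float]],
    min_change_m: float = 5.0,  # assumed threshold, in metres
) -> List[List[str]]:
    """Label each cell 'loss', 'gain', or 'stable' from the height difference."""
    labels = []
    for ref_row, new_row in zip(reference, later):
        row = []
        for h_ref, h_new in zip(ref_row, new_row):
            diff = h_new - h_ref
            if diff <= -min_change_m:
                row.append("loss")
            elif diff >= min_change_m:
                row.append("gain")
            else:
                row.append("stable")
        labels.append(row)
    return labels


# Made-up surface heights (m) on a 2 x 2 grid for the two acquisition dates.
srtm_2000 = [[120.0, 118.0], [95.0, 97.0]]
tdm_2013 = [[102.0, 119.0], [104.0, 96.0]]
change = height_change_map(srtm_2000, tdm_2013)
```

The sign of the difference separates height loss from height gain, and its size gives a rough magnitude relative to the reference acquisition.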
Secure Access Control and Large Scale Robust Representation for Online Multimedia Event Detection

PubMed Central

Liu, Changyu; Li, Huiling

2014-01-01

We developed an online multimedia event detection (MED) system. However, integrating traditional event detection algorithms into the online environment raises two issues: secure access control and large-scale robust representation. For the first issue, we proposed a tree proxy-based and service-oriented access control (TPSAC) model based on the traditional role based access control model. Verification experiments were conducted on the CloudSim simulation platform, and the results showed that the TPSAC model is suitable for the access control of dynamic online environments. For the second issue, inspired by the object-bank scene descriptor, we proposed a 1000-object-bank (1000OBK) event descriptor. Feature vectors of the 1000OBK were extracted from response pyramids of 1000 generic object detectors which were trained on standard annotated image datasets, such as the ImageNet dataset. A spatial bag of words tiling approach was then adopted to encode these feature vectors for bridging the gap between the objects and events. Furthermore, we performed experiments in the context of event classification on the challenging TRECVID MED 2012 dataset, and the results showed that the robust 1000OBK event descriptor outperforms the state-of-the-art approaches. PMID:25147840
Ground and satellite based assessment of meteorological droughts: The Coello river basin case study

NASA Astrophysics Data System (ADS)

Cruz-Roa, A. F.; Olaya-Marín, E. J.; Barrios, M. I.

2017-10-01

The spatial distribution of droughts is a key factor for designing water management policies at basin scale in arid and semi-arid regions. Ground hydro-meteorological data in neo-tropical areas are scarce; therefore, the merging of ground and satellite datasets is a promising approach for improving our understanding of water distribution. This paper compares three monthly rainfall interpolation methods for drought evaluation. The ordinary kriging technique based on ground data, and cokriging with elevation as auxiliary variable were compared against cokriging using the Tropical Rainfall Measuring Mission (TRMM) Multi-Satellite Precipitation Analysis (TMPA). Twenty rain gauge stations and the 3B42V7 version of the TMPA research dataset were considered. Comparisons were made over the Coello river basin (Colombia) at 3″ spatial resolution covering a period of eight years (1998-2005). The best spatial rainfall estimation was found for cokriging using ground data and elevation. The spatial support of the TMPA dataset is very coarse for a merged interpolation with ground data; this discrepancy in spatial scales highlights the need to consider scaling rules in the interpolation process.
Multiple Imputation of Groundwater Data to Evaluate Spatial and Temporal Anthropogenic Influences on Subsurface Water Fluxes in Los Angeles, CA

NASA Astrophysics Data System (ADS)

Manago, K. F.; Hogue, T. S.; Hering, A. S.

2014-12-01

In the City of Los Angeles, groundwater accounts for 11% of the total water supply on average, and 30% during drought years. Due to ongoing drought in California, increased reliance on local water supply highlights the need for better understanding of regional groundwater dynamics and estimating sustainable groundwater supply. However, in an urban setting, such as Los Angeles, understanding or modeling groundwater levels is extremely complicated due to various anthropogenic influences such as groundwater pumping, artificial recharge, landscape irrigation, leaking infrastructure, seawater intrusion, and extensive impervious surfaces. This study analyzes anthropogenic effects on groundwater levels using groundwater monitoring well data from the County of Los Angeles Department of Public Works. The groundwater data is irregularly sampled with large gaps between samples, resulting in a sparsely populated dataset. A multiple imputation method is used to fill the missing data, allowing for multiple ensembles and improved error estimates. The filled data is interpolated to create spatial groundwater maps utilizing information from all wells. The groundwater data is evaluated at a monthly time step over the last several decades to analyze the effect of land cover and identify other influencing factors on groundwater levels spatially and temporally. Preliminary results show irrigated parks have the largest influence on groundwater fluctuations, resulting in large seasonal changes, exceeding changes in spreading grounds. It is assumed that these fluctuations are caused by watering practices required to sustain non-native vegetation. Conversely, high intensity urbanized areas resulted in muted groundwater fluctuations and behavior decoupling from climate patterns. The results provide an improved understanding of anthropogenic effects on groundwater levels, in addition to high-quality datasets for validation of regional groundwater models.
Geolokit: An interactive tool for visualising and exploring geoscientific data in Google Earth

NASA Astrophysics Data System (ADS)

Triantafyllou, Antoine; Watlet, Arnaud; Bastin, Christophe

2017-10-01

Virtual globes have been developed to showcase different types of data combining a digital elevation model and basemaps of high resolution satellite imagery. Hence, they became a standard to share spatial data and information, although they suffer from a lack of toolboxes dedicated to the formatting of large geoscientific datasets. From this perspective, we developed Geolokit: a free and lightweight software package that allows geoscientists - and every scientist working with spatial data - to import their data (e.g., sample collections, structural geology, cross-sections, field pictures, georeferenced maps), to handle and to transcribe them to Keyhole Markup Language (KML) files. KML files are then automatically opened in the Google Earth virtual globe and the spatial data accessed and shared. Geolokit comes with a large number of dedicated tools that can process and display: (i) multi-points data, (ii) scattered data interpolations, (iii) structural geology features in 2D and 3D, (iv) rose diagrams, stereonets and dip-plunge polar histograms, (v) cross-sections and oriented rasters, (vi) georeferenced field pictures, (vii) georeferenced maps and projected gridding. Therefore, together with Geolokit, Google Earth becomes not only a powerful georeferenced data viewer but also a stand-alone work platform. The toolbox (available online at http://www.geolokit.org) is written in Python, a high-level, cross-platform programming language, and is accessible through a graphical user interface, designed to run in parallel with Google Earth, through a workflow that requires no additional third party software. Geolokit features are demonstrated in this paper using typical datasets gathered from two case studies illustrating its applicability at multiple scales of investigation: a petro-structural investigation of the Ile d'Yeu orthogneissic unit (Western France) and data collection of the Mariana oceanic subduction zone (Western Pacific).
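To make the "transcribe to KML" step concrete, here is a minimal sketch of writing point samples to a KML string with Python's standard library. It is not Geolokit's code: the function name `points_to_kml` and the sample names and coordinates are made up for illustration, and only the simplest case (named point placemarks) is covered.

```python
import xml.etree.ElementTree as ET

# Official KML 2.2 namespace (OGC standard).
KML_NS = "http://www.opengis.net/kml/2.2"


def points_to_kml(samples):
    """Build a KML document string from (name, lon, lat) tuples.

    Hypothetical helper for illustration; not part of Geolokit.
    """
    ET.register_namespace("", KML_NS)  # serialize without an "ns0:" prefix
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, lon, lat in samples:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML orders coordinates as longitude,latitude[,altitude].
        coords = ET.SubElement(point, f"{{{KML_NS}}}coordinates")
        coords.text = f"{lon},{lat},0"
    return ET.tostring(kml, encoding="unicode")


# Made-up sample locations, for illustration only.
samples = [("sample_A", -2.35, 46.71), ("sample_B", -2.31, 46.69)]
kml_text = points_to_kml(samples)
```

Saving `kml_text` to a file with a `.kml` extension should give something a virtual globe can open; styling, folders, and raster overlays would need additional KML elements.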
Large scale snow water status monitoring: comparison of different snow water products in the upper Colorado basins

USGS Publications Warehouse

Artan, G.A.; Verdin, J.P.; Lietzow, R.

2013-01-01

We illustrate the ability to monitor the status of snowpack over large areas by using a spatially distributed snow accumulation and ablation model in the Upper Colorado Basin. The model was forced with precipitation fields from the National Weather Service (NWS) Multi-sensor Precipitation Estimator (MPE) and the Tropical Rainfall Measuring Mission (TRMM) datasets; remaining meteorological model input data was from NOAA's Global Forecast System (GFS) model output fields. The simulated snow water equivalent (SWE) was compared to SWEs from the Snow Data Assimilation System (SNODAS) and SNOwpack TELemetry system (SNOTEL) over a region of the Western United States that covers parts of the Upper Colorado Basin. We also compared the SWE product estimated from the Special Sensor Microwave Imager (SSM/I) and Scanning Multichannel Microwave Radiometer (SMMR) to the SNODAS and SNOTEL SWE datasets. Agreement between the spatial distribution of the simulated SWE with both SNODAS and SNOTEL was high for the two model runs for the entire snow accumulation period. Model-simulated SWEs, both with MPE and TRMM, were significantly correlated spatially on average with the SNODAS (r = 0.81 and r = 0.54; d.f. = 543) and the SNOTEL SWE (r = 0.85 and r = 0.55; d.f. = 543); when monthly basinwide simulated average SWE was compared, the correlation was also highly significant (r = 0.95 and r = 0.73; d.f. = 12). The SWE estimated from the passive microwave imagery was not correlated with either the SNODAS SWE (r = 0.14, d.f. = 7) or the SNOTEL-reported SWE values (r = 0.08, d.f. = 7). The agreement between modeled SWE and the SWE recorded by SNODAS and SNOTEL weakened during the snowmelt period due to an underestimation bias of the air temperature that was used as model input forcing.
Regional modeling of large wildfires under current and potential future climates in Colorado and Wyoming, USA

USGS Publications Warehouse

West, Amanda; Kumar, Sunil; Jarnevich, Catherine S.

2016-01-01

Regional analysis of large wildfire potential given climate change scenarios is crucial to understanding areas most at risk in the future, yet wildfire models are not often developed and tested at this spatial scale. We fit three historical climate suitability models for large wildfires (i.e. ≥ 400 ha) in Colorado and Wyoming using topography and decadal climate averages corresponding to wildfire occurrence at the same temporal scale. The historical models classified points of known large wildfire occurrence with high accuracies. Using a novel approach in wildfire modeling, we applied the historical models to independent climate and wildfire datasets, and the resulting sensitivities were 0.75, 0.81, and 0.83 for Maxent, Generalized Linear, and Multivariate Adaptive Regression Splines, respectively. We projected the historic models into future climate space using data from 15 global circulation models and two representative concentration pathway scenarios. Maps from these geospatial analyses can be used to evaluate the changing spatial distribution of climate suitability of large wildfires in these states. April relative humidity was the most important covariate in all models, providing insight to the climate space of large wildfires in this region. These methods incorporate monthly and seasonal climate averages at a spatial resolution relevant to land management (i.e. 1 km²) and provide a tool that can be modified for other regions of North America, or adapted for other parts of the world.
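The sensitivities quoted above are true-positive rates: the share of known large-wildfire points that a model classifies as suitable. The sketch below computes that quantity on made-up labels; the function name and the data are illustrative assumptions, not values from the study.

```python
def sensitivity(observed, predicted):
    """True-positive rate: TP / (TP + FN).

    `observed` and `predicted` are equal-length sequences of 0/1 flags
    (1 = large wildfire present / predicted suitable).
    """
    tp = sum(1 for o, p in zip(observed, predicted) if o and p)
    fn = sum(1 for o, p in zip(observed, predicted) if o and not p)
    if tp + fn == 0:
        raise ValueError("no observed presences; sensitivity is undefined")
    return tp / (tp + fn)


# Made-up example: 8 independent wildfire points, 6 classified as suitable.
observed = [1, 1, 1, 1, 1, 1, 1, 1]
predicted = [1, 1, 1, 0, 1, 1, 0, 1]
score = sensitivity(observed, predicted)  # 6 / 8
```

Sensitivity alone says nothing about false positives, so it is usually read alongside specificity or a threshold-independent measure.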
Uncertainty in Random Forests: What does it mean in a spatial context?

NASA Astrophysics Data System (ADS)

Klump, Jens; Fouedjio, Francky

2017-04-01

Geochemical surveys are an important part of exploration for mineral resources and in environmental studies. The samples and chemical analyses are often laborious and difficult to obtain and therefore come at a high cost. As a consequence, these surveys are characterised by datasets with large numbers of variables but relatively few data points when compared to conventional big data problems. With more remote sensing platforms and sensor networks being deployed, large volumes of auxiliary data of the surveyed areas are becoming available. The use of these auxiliary data has the potential to improve the prediction of chemical element concentrations over the whole study area. Kriging is a well established geostatistical method for the prediction of spatial data but requires significant pre-processing and makes some basic assumptions about the underlying distribution of the data. Some machine learning algorithms, on the other hand, may require less data pre-processing and are non-parametric. In this study we used a dataset provided by Kirkwood et al. [1] to explore the potential use of Random Forest in geochemical mapping. We chose Random Forest because it is a well understood machine learning method and has the advantage that it provides us with a measure of uncertainty. By comparing Random Forest to Kriging we found that both methods produced comparable maps of estimated values for our variables of interest. Kriging outperformed Random Forest for variables of interest with relatively strong spatial correlation. The measure of uncertainty provided by Random Forest seems to be quite different to the measure of uncertainty provided by Kriging. In particular, the lack of spatial context can give misleading results in areas without ground truth data. 
In conclusion, our preliminary results show that the model driven approach in geostatistics gives us more reliable estimates for our target variables than Random Forest for variables with relatively strong spatial correlation. However, in cases of weak spatial correlation Random Forest, as a nonparametric method, may give the better results once we have a better understanding of the meaning of its uncertainty measures in a spatial context. References [1] Kirkwood, C., M. Cave, D. Beamish, S. Grebby, and A. Ferreira (2016), A machine learning approach to geochemical mapping, Journal of Geochemical Exploration, 163, 28-40, doi:10.1016/j.gexplo.2016.05.003.

Inter-fraction variations in respiratory motion models

NASA Astrophysics Data System (ADS)

McClelland, J. R.; Hughes, S.; Modat, M.; Qureshi, A.; Ahmad, S.; Landau, D. B.; Ourselin, S.; Hawkes, D. J.

2011-01-01

Respiratory motion can vary dramatically between the planning stage and the different fractions of radiotherapy treatment. Motion predictions used when constructing the radiotherapy plan may be unsuitable for later fractions of treatment. This paper presents a methodology for constructing patient-specific respiratory motion models and uses these models to evaluate and analyse the inter-fraction variations in the respiratory motion. The internal respiratory motion is determined from the deformable registration of Cine CT data and related to a respiratory surrogate signal derived from 3D skin surface data. Three different models for relating the internal motion to the surrogate signal have been investigated in this work. Data were acquired from six lung cancer patients. Two full datasets were acquired for each patient, one before the course of radiotherapy treatment and one at the end (approximately 6 weeks later). Separate models were built for each dataset. All models could accurately predict the respiratory motion in the same dataset, but had large errors when predicting the motion in the other dataset. Analysis of the inter-fraction variations revealed that most variations were spatially varying base-line shifts, but changes to the anatomy and the motion trajectories were also observed.
An assessment of differences in gridded precipitation datasets in complex terrain

NASA Astrophysics Data System (ADS)

Henn, Brian; Newman, Andrew J.; Livneh, Ben; Daly, Christopher; Lundquist, Jessica D.

2018-01-01

Hydrologic modeling and other geophysical applications are sensitive to precipitation forcing data quality, and there are known challenges in spatially distributing gauge-based precipitation over complex terrain. We conduct a comparison of six high-resolution, daily and monthly gridded precipitation datasets over the Western United States. We compare the long-term average spatial patterns, and interannual variability of water-year total precipitation, as well as multi-year trends in precipitation across the datasets. We find that the greatest absolute differences among datasets occur in high-elevation areas and in the maritime mountain ranges of the Western United States, while the greatest percent differences among datasets relative to annual total precipitation occur in arid and rain-shadowed areas. Differences between datasets in some high-elevation areas exceed 200 mm yr-1 on average, and relative differences range from 5 to 60% across the Western United States. In areas of high topographic relief, true uncertainties and biases are likely higher than the differences among the datasets; we present evidence of this based on streamflow observations. Precipitation trends in the datasets differ in magnitude and sign at smaller scales, and are sensitive to how temporal inhomogeneities in the underlying precipitation gauge data are handled.
Creating a seamless 1 km resolution daily land surface temperature dataset for urban and surrounding areas in the conterminous United States

DOE Office of Scientific and Technical Information (OSTI.GOV)

Li, Xiaoma; Zhou, Yuyu; Asrar, Ghassem R.

High spatiotemporal land surface temperature (LST) datasets are increasingly needed in a variety of fields such as ecology, hydrology, meteorology, epidemiology, and energy systems. Moderate Resolution Imaging Spectroradiometer (MODIS) LST is one such high spatiotemporal dataset that is widely used. However, it has a large number of missing values, primarily because of clouds. Gapfilling the missing values is an important approach to creating high spatiotemporal LST datasets. However, current gapfilling methods have limitations in terms of accuracy and time required to assemble the data over large areas (e.g., national and continental levels). In this study, we developed a 3-step hybrid method by integrating a combination of daily merging, spatiotemporal gapfilling, and temporal interpolation methods, to create a high spatiotemporal LST dataset using the four daily LST observations from the two MODIS instruments on Terra and Aqua satellites. We applied this method in urban and surrounding areas for the conterminous U.S. in 2010. The evaluation of the gapfilled LST product indicates a root mean squared error (RMSE) of 3.3 K for mid-daytime (1:30 pm) and 2.7 K for mid-nighttime (1:30 am) observations. The method can be easily extended to other years and regions and is also applicable to other satellite products. This seamless daily (mid-daytime and mid-nighttime) LST product with 1 km spatial resolution is of great value for studying effects of urbanization (e.g., urban heat island) and the related impacts on people, ecosystems, energy systems and other infrastructure for cities.
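As a rough illustration of the last of the three steps (temporal interpolation) and of how an RMSE can be obtained by withholding known values, consider the sketch below. It assumes simple linear interpolation along one pixel's time series; the abstract does not specify the interpolation scheme, and the temperatures are made up.

```python
import math


def fill_gaps_linear(series):
    """Fill None gaps by linear interpolation between the nearest valid
    neighbours in time. Leading and trailing gaps are left as None."""
    filled = list(series)
    valid = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(valid, valid[1:]):
        for i in range(left + 1, right):
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


def rmse(estimates, truth):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(
        sum((e - t) ** 2 for e, t in zip(estimates, truth)) / len(truth)
    )


# Made-up daily LST values (K) for one pixel; two days are withheld as "cloudy".
truth = [290.0, 292.0, 295.0, 293.0, 291.0]
observed = [290.0, None, None, 293.0, 291.0]

filled = fill_gaps_linear(observed)
gap_idx = [i for i, v in enumerate(observed) if v is None]
error = rmse([filled[i] for i in gap_idx], [truth[i] for i in gap_idx])
```

Evaluating only at the withheld positions mirrors the usual validation design: mask cloud-free observations, fill them, and compare against the masked values.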
Comparison and Evaluation of Annual NDVI Time Series in China Derived from the NOAA AVHRR LTDR and Terra MODIS MOD13C1 Products

PubMed Central

Guo, Xiaoyi; Zhang, Hongyan; Wu, Zhengfang; Zhao, Jianjun; Zhang, Zhengxiang

2017-01-01

Time series of Normalized Difference Vegetation Index (NDVI) derived from multiple satellite sensors are crucial data to study vegetation dynamics. The Land Long Term Data Record Version 4 (LTDR V4) NDVI dataset was recently released at a 0.05 × 0.05° spatial resolution and daily temporal resolution. In this study, annual NDVI time series that are composited by the LTDR V4 and Moderate Resolution Imaging Spectroradiometer (MODIS) NDVI datasets (MOD13C1) are compared and evaluated for the period from 2001 to 2014 in China. The spatial patterns of the NDVI generally match between the LTDR V4 and MOD13C1 datasets. The transitional zone between high and low NDVI values generally matches the boundary of semi-arid and sub-humid regions. A significant and high coefficient of determination is found between the two datasets according to a pixel-based correlation analysis. The spatially averaged NDVI of LTDR V4 is characterized by a much weaker positive regression slope relative to that of the spatially averaged NDVI of the MOD13C1 dataset because of changes in NOAA AVHRR sensors between 2005 and 2006. The measured NDVI values of LTDR V4 were always higher than those of MOD13C1 in western China due to the relatively lower atmospheric water vapor content in western China, whereas the opposite was observed in eastern China. In total, 18.54% of the LTDR V4 NDVI pixels exhibit significant trends, whereas 35.79% of the MOD13C1 NDVI pixels show significant trends. Good agreement is observed between the significant trends of the two datasets in the Northeast Plain, Bohai Economic Rim, Loess Plateau, and Yangtze River Delta. By contrast, the two datasets disagreed in northwestern desert regions and southern China. A trend analysis of the regression slope values according to the vegetation type shows good agreement between the LTDR V4 and MOD13C1 datasets.
This study demonstrates the spatial and temporal consistencies and discrepancies between the AVHRR LTDR and MODIS MOD13C1 NDVI products in China, which could provide useful information for the choice of NDVI products in subsequent studies of vegetation dynamics. PMID:28587266
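The trend comparison described above rests on fitting a regression slope to each pixel's annual NDVI series. A minimal sketch of that per-pixel calculation follows, assuming ordinary least squares against the year; the two series are synthetic, the variable names are hypothetical, and no significance test is included.

```python
def ols_slope(years, values):
    """Ordinary least-squares slope of `values` regressed on `years`."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx


years = list(range(2001, 2015))  # 2001-2014, the study period

# Synthetic annual NDVI for one pixel from two products (illustrative only).
ndvi_product_a = [0.30 + 0.001 * (y - 2001) for y in years]
ndvi_product_b = [0.28 + 0.003 * (y - 2001) for y in years]

slope_a = ols_slope(years, ndvi_product_a)  # NDVI units per year
slope_b = ols_slope(years, ndvi_product_b)
```

Repeating this for every pixel yields a slope map per product; the share of pixels with significant trends would additionally require a test on each slope, such as a t-test on the regression coefficient.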
Comparison and Evaluation of Annual NDVI Time Series in China Derived from the NOAA AVHRR LTDR and Terra MODIS MOD13C1 Products.

PubMed

Guo, Xiaoyi; Zhang, Hongyan; Wu, Zhengfang; Zhao, Jianjun; Zhang, Zhengxiang

2017-06-06

Time series of Normalized Difference Vegetation Index (NDVI) derived from multiple satellite sensors are crucial data to study vegetation dynamics. The Land Long Term Data Record Version 4 (LTDR V4) NDVI dataset was recently released at a 0.05 × 0.05° spatial resolution and daily temporal resolution. In this study, annual NDVI time series that are composited by the LTDR V4 and Moderate Resolution Imaging Spectroradiometer (MODIS) NDVI datasets (MOD13C1) are compared and evaluated for the period from 2001 to 2014 in China. The spatial patterns of the NDVI generally match between the LTDR V4 and MOD13C1 datasets. The transitional zone between high and low NDVI values generally matches the boundary of semi-arid and sub-humid regions. A significant and high coefficient of determination is found between the two datasets according to a pixel-based correlation analysis. The spatially averaged NDVI of LTDR V4 is characterized by a much weaker positive regression slope relative to that of the spatially averaged NDVI of the MOD13C1 dataset because of changes in NOAA AVHRR sensors between 2005 and 2006. The measured NDVI values of LTDR V4 were always higher than those of MOD13C1 in western China due to the relatively lower atmospheric water vapor content in western China, whereas the opposite was observed in eastern China. In total, 18.54% of the LTDR V4 NDVI pixels exhibit significant trends, whereas 35.79% of the MOD13C1 NDVI pixels show significant trends. Good agreement is observed between the significant trends of the two datasets in the Northeast Plain, Bohai Economic Rim, Loess Plateau, and Yangtze River Delta. By contrast, the two datasets disagreed in northwestern desert regions and southern China. A trend analysis of the regression slope values according to the vegetation type shows good agreement between the LTDR V4 and MOD13C1 datasets.
This study demonstrates the spatial and temporal consistencies and discrepancies between the AVHRR LTDR and MODIS MOD13C1 NDVI products in China, which could provide useful information for the choice of NDVI products in subsequent studies of vegetation dynamics.
U.S. Geological Survey spatial data access

USGS Publications Warehouse

Faundeen, John L.; Kanengieter, Ronald L.; Buswell, Michael D.

2002-01-01

The U.S. Geological Survey (USGS) has done a progress review on improving access to its spatial data holdings over the Web. The USGS EROS Data Center has created three major Web-based interfaces to deliver spatial data to the general public; they are Earth Explorer, the Seamless Data Distribution System (SDDS), and the USGS Web Mapping Portal. Lessons were learned in developing these systems, and various resources were needed for their implementation. The USGS serves as a fact-finding agency in the U.S. Government that collects, monitors, analyzes, and provides scientific information about natural resource conditions and issues. To carry out its mission, the USGS has created and managed spatial data since its inception. Originally relying on paper maps, the USGS now uses advanced technology to produce digital representations of the Earth’s features. The spatial products of the USGS include both source and derivative data. Derivative datasets include Digital Orthophoto Quadrangles (DOQ), Digital Elevation Models, Digital Line Graphs, land-cover Digital Raster Graphics, and the seamless National Elevation Dataset. These products, created with automated processes, use aerial photographs, satellite images, or other cartographic information such as scanned paper maps as source data. With Earth Explorer, users can search multiple inventories through metadata queries and can browse satellite and DOQ imagery. They can place orders and make payment through secure credit card transactions. Some USGS spatial data can be accessed with SDDS. The SDDS uses an ArcIMS map service interface to identify the user’s areas of interest and determine the output format; it allows the user to either download the actual spatial data directly for small areas or place orders for larger areas to be delivered on media. The USGS Web Mapping Portal provides views of national and international datasets through an ArcIMS map service interface. 
In addition, the map portal posts news about new map services available from the USGS, many simultaneously published on the Environmental Systems Research Institute Geography Network. These three information systems use new software tools and expanded hardware to meet the requirements of the users. The systems are designed to handle the required workload and are relatively easy to enhance and maintain. The software tools give users a high level of functionality and help the system conform to industry standards. The hardware and software architecture is designed to handle the large amounts of spatial data and Internet traffic required by the information systems. Last, customer support was needed to answer questions, monitor e-mail, and report customer problems.
Tracking changes in land-use and drainage status of organic soils using heterogeneous spatial datasets

NASA Astrophysics Data System (ADS)

Untenecker, Johanna; Tiemeyer, Bärbel; Freibauer, Annette; Laggner, Andreas; Luterbacher, Jürg

2016-04-01

Tracking land-use since 1990 is one of the major challenges in greenhouse gas (GHG) reporting under the United Nations Framework Convention on Climate Change (UNFCCC) and the Kyoto Protocol, as the data availability, especially for the base year 1990, is often poor. Even if data is available, spatial and thematic resolution will often change over time or differ even within one country. Such inconsistencies will cause a strong overestimation of land use change (LUC) if not adequately accounted for. Using different spatial datasets, we present a method that allows tracking changes in land-use and drainage status of organic soils. The drainage status is relevant for the Kyoto activities grazing land management (GM) and wetland drainage and rewetting (WDR) as the GHG emissions of organic soils strongly depend on the groundwater level. We used datasets that are already used for the German national inventory report (Digital Landscape Model of official cadastre data) and high resolution spatial datasets (CIR aerial photography) derived for biodiversity monitoring of six federal states in North and East Germany. This data is combined with the legal protection status such as nature conservation areas. To create a consistent time series, we developed a translation key which allows quantifying gross and net LUC in a spatially explicit manner. The developed method fills the data gap for 1990 and allows GHG accounting on higher Tier levels as soon as detailed emission factors are ready to be implemented. LUC can be stratified by the protection status. Areas without a protection status show a trend towards both intensification of land-use and drier conditions. Highly protected areas show an opposite trend, while a moderate protection level (e.g. by nature parks) had only very weak effects. Furthermore, there are major differences between federal states.
In Schleswig-Holstein, known as a federal state of high agricultural production, organic soils tend to become drier and even highly protected areas only show a slight decrease of land-use intensity. Organic soils in Mecklenburg-Western Pomerania, on the other hand, tend to become wetter and less intensively used even in not protected areas. This can be interpreted as a result of an extensive peatland protection programme. Thus, our method does not only allow tracking drainage status and land-use in a suitable way for higher Tier levels in GHG-inventories and for Kyoto-accounting, but offers additional information on the success of large scale rewetting practises.
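The distinction between gross and net land-use change, and the role of a translation key in harmonizing class labels from different sources, can be sketched as follows. The class names, the key, and the five example cells are hypothetical; the study's actual translation key and datasets are not reproduced here.

```python
from collections import Counter

# Hypothetical translation key: source-specific labels -> common classes.
TRANSLATION_KEY = {
    "arable_land": "cropland", "field": "cropland",
    "pasture": "grassland", "meadow": "grassland",
    "fen": "wetland", "bog": "wetland",
}


def harmonize(labels):
    """Map source-specific labels to the common classification."""
    return [TRANSLATION_KEY[label] for label in labels]


def land_use_change(before, after):
    """Return (gross change in cells, net change per class) for two
    co-registered lists of harmonized class labels."""
    transitions = Counter(zip(before, after))
    gross = sum(n for (a, b), n in transitions.items() if a != b)
    net = Counter()
    for (a, b), n in transitions.items():
        if a != b:
            net[a] -= n  # class a loses n cells
            net[b] += n  # class b gains n cells
    return gross, dict(net)


# Five made-up cells observed in two different source datasets.
cells_1990 = harmonize(["arable_land", "arable_land", "pasture", "pasture", "fen"])
cells_2010 = harmonize(["meadow", "field", "field", "meadow", "bog"])
gross, net = land_use_change(cells_1990, cells_2010)
```

In this example two cells change class (gross change of 2), yet cropland and grassland each end up with the same area as before (net change of 0), which is why spatially explicit tracking matters. Without the harmonization step, every cell would appear to change simply because the two sources use different labels.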
Plant species classification using flower images—A comparative study of local feature representations

PubMed Central

Seeland, Marco; Rzanny, Michael; Alaqraa, Nedal; Wäldchen, Jana; Mäder, Patrick

2017-01-01

Steady improvements of image description methods induced a growing interest in image-based plant species classification, a task vital to the study of biodiversity and ecological sensitivity. Various techniques have been proposed for general object classification over the past years and several of them have already been studied for plant species classification. However, results of these studies are selective in the evaluated steps of a classification pipeline, in the utilized datasets for evaluation, and in the compared baseline methods. No study is available that evaluates the main competing methods for building an image representation on the same datasets allowing for generalized findings regarding flower-based plant species classification. The aim of this paper is to comparatively evaluate methods, method combinations, and their parameters towards classification accuracy. The investigated methods span from detection, extraction, fusion, pooling, to encoding of local features for quantifying shape and color information of flower images. We selected the flower image datasets Oxford Flower 17 and Oxford Flower 102 as well as our own Jena Flower 30 dataset for our experiments. Findings show large differences among the various studied techniques and that their wisely chosen orchestration allows for high accuracies in species classification. We further found that true local feature detectors in combination with advanced encoding methods yield higher classification results at lower computational costs compared to commonly used dense sampling and spatial pooling methods. Color was found to be an indispensable feature for high classification results, especially while preserving spatial correspondence to gray-level features. As a result, our study provides a comprehensive overview of competing techniques and the implications of their main parameters for flower-based plant species classification. PMID:28234999
Large-scale Labeled Datasets to Fuel Earth Science Deep Learning Applications

NASA Astrophysics Data System (ADS)

Maskey, M.; Ramachandran, R.; Miller, J.

2017-12-01

Deep learning has revolutionized computer vision and natural language processing with various algorithms scaled using high-performance computing. However, generic large-scale labeled datasets such as ImageNet are the fuel that drives the impressive accuracy of deep learning results. Large-scale labeled datasets already exist in domains such as medical science, but creating them in the Earth science domain is a challenge. While there are ways to apply deep learning using limited labeled datasets, there is a need in the Earth sciences for creating large-scale labeled datasets for benchmarking and scaling deep learning applications. At the NASA Marshall Space Flight Center, we are using deep learning for a variety of Earth science applications where we have encountered the need for large-scale labeled datasets. We will discuss our approaches for creating such datasets and why these datasets are just as valuable as deep learning algorithms. We will also describe successful usage of these large-scale labeled datasets with our deep learning based applications.
Digital version of "Open-File Report 92-179: Geologic map of the Cow Cove Quadrangle, San Bernardino County, California"

USGS Publications Warehouse

Wilshire, Howard G.; Bedford, David R.; Coleman, Teresa

2002-01-01

3. Plottable map representations of the database at 1:24,000 scale in PostScript and Adobe PDF formats. The plottable files consist of a color geologic map derived from the spatial database, composited with a topographic base map in the form of the USGS Digital Raster Graphic for the map area. Color symbology from each of these datasets is maintained, which can cause plot file sizes to be large.
An integrated pan-tropical biomass map using multiple reference datasets.

PubMed

Avitabile, Valerio; Herold, Martin; Heuvelink, Gerard B M; Lewis, Simon L; Phillips, Oliver L; Asner, Gregory P; Armston, John; Ashton, Peter S; Banin, Lindsay; Bayol, Nicolas; Berry, Nicholas J; Boeckx, Pascal; de Jong, Bernardus H J; DeVries, Ben; Girardin, Cecile A J; Kearsley, Elizabeth; Lindsell, Jeremy A; Lopez-Gonzalez, Gabriela; Lucas, Richard; Malhi, Yadvinder; Morel, Alexandra; Mitchard, Edward T A; Nagy, Laszlo; Qie, Lan; Quinones, Marcela J; Ryan, Casey M; Ferry, Slik J W; Sunderland, Terry; Laurin, Gaia Vaglio; Gatti, Roberto Cazzolla; Valentini, Riccardo; Verbeeck, Hans; Wijaya, Arief; Willcock, Simon

2016-04-01

We combined two existing datasets of vegetation aboveground biomass (AGB) (Proceedings of the National Academy of Sciences of the United States of America, 108, 2011, 9899; Nature Climate Change, 2, 2012, 182) into a pan-tropical AGB map at 1-km resolution using an independent reference dataset of field observations and locally calibrated high-resolution biomass maps, harmonized and upscaled to 14 477 1-km AGB estimates. Our data fusion approach uses bias removal and weighted linear averaging that incorporates and spatializes the biomass patterns indicated by the reference data. The method was applied independently in areas (strata) with homogeneous error patterns of the input (Saatchi and Baccini) maps, which were estimated from the reference data and additional covariates. Based on the fused map, we estimated AGB stock for the tropics (23.4°N-23.4°S) of 375 Pg dry mass, 9-18% lower than the Saatchi and Baccini estimates. The fused map also showed differing spatial patterns of AGB over large areas, with higher AGB density in the dense forest areas in the Congo basin, Eastern Amazon and South-East Asia, and lower values in Central America and in most dry vegetation areas of Africa than either of the input maps. The validation exercise, based on 2118 estimates from the reference dataset not used in the fusion process, showed that the fused map had an RMSE 15-21% lower than that of the input maps and, most importantly, nearly unbiased estimates (mean bias 5 Mg dry mass ha⁻¹ vs. 21 and 28 Mg ha⁻¹ for the input maps). The fusion method can be applied at any scale including the policy-relevant national level, where it can provide improved biomass estimates by integrating existing regional biomass maps as input maps and additional, country-specific reference datasets. © 2015 John Wiley & Sons Ltd.
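A toy version of "bias removal and weighted linear averaging" within a single stratum might look like the sketch below. It assumes that each input map's bias is its mean error against the reference data and that the weights are inversely proportional to the residual error variance after bias removal; the abstract does not state the weighting rule, so that choice, the function name, and all numbers are illustrative assumptions.

```python
def mean(values):
    return sum(values) / len(values)


def fuse_stratum(map_a, map_b, ref_a, ref_b, ref_obs):
    """Fuse two maps within one stratum.

    map_a, map_b : values of the two input maps at the cells to fuse
    ref_a, ref_b : values of the two input maps at reference locations
    ref_obs      : reference observations at those locations
    Returns (fused values, (weight_a, weight_b)).
    Weights are inverse residual variances (an assumption of this sketch).
    """
    # Bias removal: mean error of each map against the reference data.
    bias_a = mean([a - r for a, r in zip(ref_a, ref_obs)])
    bias_b = mean([b - r for b, r in zip(ref_b, ref_obs)])
    # Residual error variance after removing the bias.
    var_a = mean([(a - bias_a - r) ** 2 for a, r in zip(ref_a, ref_obs)])
    var_b = mean([(b - bias_b - r) ** 2 for b, r in zip(ref_b, ref_obs)])
    if var_a == 0 or var_b == 0:
        raise ValueError("zero residual variance; weights are undefined")
    w_a = (1 / var_a) / (1 / var_a + 1 / var_b)
    w_b = 1 - w_a
    fused = [w_a * (a - bias_a) + w_b * (b - bias_b)
             for a, b in zip(map_a, map_b)]
    return fused, (w_a, w_b)


# Made-up AGB values (Mg/ha) at four reference locations in one stratum.
ref_obs = [100.0, 200.0, 300.0, 400.0]
ref_a = [130.0, 210.0, 330.0, 410.0]  # map A: bias +20, residual variance 100
ref_b = [90.0, 170.0, 310.0, 390.0]   # map B: bias -10, residual variance 200

fused, (w_a, w_b) = fuse_stratum([250.0], [220.0], ref_a, ref_b, ref_obs)
```

Here map A receives twice the weight of map B because its residual variance is half as large. Applying the function separately per stratum corresponds to the abstract's point that error patterns are treated as homogeneous only within strata.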
Evaluating the role of evapotranspiration remote sensing data in improving hydrological modeling predictability

NASA Astrophysics Data System (ADS)

Herman, Matthew R.; Nejadhashemi, A. Pouyan; Abouali, Mohammad; Hernandez-Suarez, Juan Sebastian; Daneshvar, Fariborz; Zhang, Zhen; Anderson, Martha C.; Sadeghi, Ali M.; Hain, Christopher R.; Sharifi, Amirreza

2018-01-01

As global demand for freshwater resources continues to rise, it has become increasingly important to ensure the sustainability of these resources. This is accomplished through management strategies that often rely on monitoring and on hydrological models. However, monitoring at large scales is not feasible and therefore model applications are becoming challenging, especially when spatially distributed datasets, such as evapotranspiration, are needed to understand model performance. Due to these limitations, most hydrological models are calibrated only against data obtained from site/point observations, such as streamflow. Therefore, the main focus of this paper is to examine whether the incorporation of remotely sensed and spatially distributed datasets can improve the overall performance of the model. In this study, actual evapotranspiration (ETa) data were obtained from two different sets of satellite-based remote sensing data. One dataset estimates ETa based on the Simplified Surface Energy Balance (SSEBop) model while the other estimates ETa based on the Atmosphere-Land Exchange Inverse (ALEXI) model. The hydrological model used in this study is the Soil and Water Assessment Tool (SWAT), which was calibrated against spatially distributed ETa and single-point streamflow records for the Honeyoey Creek-Pine Creek Watershed, located in Michigan, USA. Two different techniques, multi-variable and genetic algorithm, were used to calibrate the SWAT model. Using the aforementioned datasets, the performance of the hydrological model in estimating ETa was improved with both calibration techniques, achieving Nash-Sutcliffe efficiency (NSE) values >0.5 (0.73-0.85), percent bias (PBIAS) values within ±25% (±21.73%), and root mean squared error - observations standard deviation ratio (RSR) values <0.7 (0.39-0.52).
However, the genetic algorithm technique was more effective for the ETa calibration while significantly reducing the model performance for estimating streamflow (NSE: 0.32-0.52, PBIAS: ±32.73%, and RSR: 0.63-0.82). Meanwhile, using the multi-variable technique, the model performance for estimating streamflow was maintained with a high level of accuracy (NSE: 0.59-0.61, PBIAS: ±13.70%, and RSR: 0.63-0.64) while the evapotranspiration estimates were improved. Results from this assessment show that incorporating remotely sensed and spatially distributed data can improve hydrological model performance if it is coupled with the right calibration technique.
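The three statistics reported in this abstract have widely used textbook definitions, sketched below in Python. This is an illustrative sketch, not the authors' code: the function names and the `is_satisfactory` helper are hypothetical, and only the acceptance thresholds (NSE > 0.5, PBIAS within ±25%, RSR < 0.7) are taken from the abstract.

```python
import math

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus error variance over observed variance."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst

def pbias(obs, sim):
    """Percent bias: positive values mean the model underestimates on average."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)

def rsr(obs, sim):
    """RMSE divided by the standard deviation of the observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return math.sqrt(sse) / math.sqrt(sst)

def is_satisfactory(obs, sim):
    """Hypothetical helper applying the thresholds quoted in the abstract."""
    return nse(obs, sim) > 0.5 and abs(pbias(obs, sim)) <= 25.0 and rsr(obs, sim) < 0.7
```

With these definitions RSR equals the square root of (1 - NSE), so the two statistics carry the same information about squared error; PBIAS adds the sign and size of the systematic offset.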
Geostatistical Characterization of Cereal Leaf Beetle (Coleoptera: Chrysomelidae) Distributions in Wheat.

PubMed

Reay-Jones, Francis P F

2017-08-01

A 3-yr study was conducted in wheat, Triticum aestivum L., in South Carolina to characterize the spatial distribution of Oulema melanopus (L.) adults, eggs, and larvae using semivariograms, which provide a measure of spatial dependence among sampling data. Moran's I coefficients for peak densities of each life stage indicated significant positive autocorrelation for seven (two for eggs, one for larvae, and four for adults) of the 16 datasets. Aggregation was detected in 13 of these 16 datasets when analyzed by semivariogram modeling, with spherical, Gaussian, and exponential models best fitting for eight, four, and one dataset, respectively, and with models for two datasets having only one parameter (nugget) significantly different from zero. The nugget-to-sill ratios ranged from 0.043 to 0.774, and indicated strong spatial dependence in six models (three for adults, two for eggs, and one for larvae), moderate spatial dependence in six models (three for adults and six for eggs), and weak spatial dependence in one model (adults). Range values varied from 39.1 m to 234.1 m, with an average of 120.1 ± 14.0 m. Average range values were 104.9, 135.2, and 161.2 m for adults, eggs, and larvae, respectively. Because the majority of semivariogram models in our study indicated aggregated distributions, spatial sampling will provide more information than nonspatial random sampling. Developing our understanding of spatial dependence of crop pests is needed to optimize sampling plans and can provide a basis for exploring site-specific management tactics. © The Authors 2017. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved.
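As a rough illustration of the quantities named in this abstract, the sketch below computes an empirical semivariogram from point samples and classifies spatial dependence from a nugget-to-sill ratio. The function names are hypothetical, and the 0.25 and 0.75 cut-offs are a commonly used convention assumed here; the abstract does not state which thresholds the author applied.

```python
import math
from collections import defaultdict

def empirical_semivariogram(points, values, bin_width):
    """Return {lag_bin: gamma}, where gamma = sum((z_i - z_j)^2) / (2 * N_pairs)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(points[i], points[j])   # separation distance of the pair
            b = int(h // bin_width)               # index of the lag bin it falls in
            sums[b] += (values[i] - values[j]) ** 2
            counts[b] += 1
    return {b: sums[b] / (2 * counts[b]) for b in sorted(counts)}

def dependence_class(nugget, sill, strong=0.25, weak=0.75):
    """Classify spatial dependence from the nugget-to-sill ratio (assumed cut-offs)."""
    ratio = nugget / sill
    if ratio < strong:
        return "strong"
    if ratio <= weak:
        return "moderate"
    return "weak"
```

A low ratio means most of the variance is spatially structured, which is the situation in which spatially explicit sampling is most informative.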
Uncertainty Assessment of the NASA Earth Exchange Global Daily Downscaled Climate Projections (NEX-GDDP) Dataset

NASA Technical Reports Server (NTRS)

Wang, Weile; Nemani, Ramakrishna R.; Michaelis, Andrew; Hashimoto, Hirofumi; Dungan, Jennifer L.; Thrasher, Bridget L.; Dixon, Keith W.

2016-01-01

The NASA Earth Exchange Global Daily Downscaled Projections (NEX-GDDP) dataset comprises downscaled climate projections derived from 21 General Circulation Model (GCM) runs conducted under the Coupled Model Intercomparison Project Phase 5 (CMIP5) and across two of the four greenhouse gas emissions scenarios (RCP4.5 and RCP8.5). Each climate projection includes daily maximum temperature, minimum temperature, and precipitation for the period from 1950 through 2100, at a spatial resolution of 0.25 degrees (approximately 25 km x 25 km). The GDDP dataset has received a warm welcome from the science community for studies of climate change impacts at local to regional scales, but a comprehensive evaluation of its uncertainties is still missing. In this study, we apply the Perfect Model Experiment framework (Dixon et al. 2016) to quantify the key sources of uncertainty from the observational baseline dataset, the downscaling algorithm, and some intrinsic assumptions (e.g., the stationarity assumption) inherent to statistical downscaling techniques. We developed a set of metrics to evaluate downscaling errors resulting from bias correction ("quantile mapping"), spatial disaggregation, and the temporal-spatial non-stationarity of climate variability. Our results highlight the spatial disaggregation (or interpolation) errors, which dominate the overall uncertainties of the GDDP dataset, especially over heterogeneous and complex terrain (e.g., mountains and coastal areas). In comparison, the temporal errors in the GDDP dataset tend to be more constrained. Our results also indicate that downscaled daily precipitation has relatively larger uncertainties than the temperature fields, reflecting the rather stochastic nature of precipitation in space. These results provide insights for improving statistical downscaling algorithms and products in the future.
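The bias-correction step mentioned in this abstract ("quantile mapping") can be illustrated with a minimal empirical version. This sketch is a generic nearest-rank variant under simplifying assumptions, not the NEX-GDDP production algorithm; `quantile_map` and its arguments are hypothetical names.

```python
import math
from bisect import bisect_right

def quantile_map(x, model_hist, obs_hist):
    """Map a model value onto the observed distribution at the same quantile."""
    m = sorted(model_hist)
    o = sorted(obs_hist)
    # Rank of x in the model's historical sample (its non-exceedance count).
    rank = bisect_right(m, x)
    # Nearest-rank index of the same quantile in the observed sample,
    # clamped so values outside the historical range map to the extremes.
    k = math.ceil(rank * len(o) / len(m)) - 1
    k = min(len(o) - 1, max(0, k))
    return o[k]
```

The clamping at the extremes is one reason the stationarity assumption matters: a future value beyond anything in the historical model sample can only be mapped to the largest observed value unless some extrapolation rule is added.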
Datasets, Technologies and Products from the NASA/NOAA Electronic Theater 2002

NASA Technical Reports Server (NTRS)

Hasler, A. Fritz; Starr, David (Technical Monitor)

2001-01-01

An in-depth look at the Earth Science datasets used in the Etheater Visualizations will be presented. This will include the satellite orbits, platforms, scan patterns, the size, temporal and spatial resolution, and compositing techniques used to obtain the datasets as well as the spectral bands utilized.
Digital spatial data for observed, predicted, and misclassification errors for observations in the training dataset for nitrate and arsenic concentrations in basin-fill aquifers in the Southwest Principal Aquifers study area

USGS Publications Warehouse

McKinney, Tim S.; Anning, David W.

2012-01-01

This product "Digital spatial data for observed, predicted, and misclassification errors for observations in the training dataset for nitrate and arsenic concentrations in basin-fill aquifers in the Southwest Principal Aquifers study area" is a 1:250,000-scale point spatial dataset developed as part of a regional Southwest Principal Aquifers (SWPA) study (Anning and others, 2012). The study examined the vulnerability of basin-fill aquifers in the southwestern United States to nitrate contamination and arsenic enrichment. Statistical models were developed by using the random forest classifier algorithm to predict concentrations of nitrate and arsenic across a model grid that represents local- and basin-scale measures of source, aquifer susceptibility, and geochemical conditions.
Patient-specific quantification of image quality: An automated method for measuring spatial resolution in clinical CT images

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sanders, Jeremiah, E-mail: jeremiah.sanders@duke.e

Purpose: To develop and validate an automated technique for evaluating the spatial resolution characteristics of clinical computed tomography (CT) images. Methods: Twenty-one chest and abdominopelvic clinical CT datasets were examined in this study. An algorithm was developed to extract a CT resolution index (RI) analogous to the modulation transfer function from clinical CT images by measuring the edge-spread function (ESF) across the patient’s skin. A polygon mesh of the air-skin boundary was created. The faces of the mesh were then used to measure the ESF across the air-skin interface. The ESF was differentiated to obtain the line-spread function (LSF), and the LSF was Fourier transformed to obtain the RI. The algorithm’s ability to detect the radial dependence of the RI was investigated. RIs measured with the proposed method were compared with a conventional phantom-based method across two reconstruction algorithms (FBP and iterative) using the spatial frequency at 50% RI, f_50, as the metric for comparison. Three reconstruction kernels were investigated for each reconstruction algorithm. Finally, an observer study was conducted to determine if observers could visually perceive the differences in the measured blurriness of images reconstructed with a given reconstruction method. Results: RI measurements performed with the proposed technique exhibited the expected dependencies on the image reconstruction. The measured f_50 values increased with harder kernels for both FBP and iterative reconstruction. Furthermore, the proposed algorithm was able to detect the radial dependence of the RI. Patient-specific measurements of the RI were comparable to the phantom-based technique, but the patient data exhibited a large spread in the measured f_50, indicating that some datasets were blurrier than others even when the projection data were reconstructed with the same reconstruction algorithm and kernel.
Results from the observer study substantiated this finding. Conclusions: Clinically informed, patient-specific spatial resolution can be measured from clinical datasets. The method is sufficiently sensitive to reflect changes in spatial resolution due to different reconstruction parameters. The method can be applied to automatically assess the spatial resolution of patient images and quantify dependencies that may not be captured in phantom data.
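The signal-processing chain described in this abstract (differentiate the ESF to get the LSF, Fourier transform the LSF, read off the frequency at 50%) can be sketched for a single one-dimensional edge profile. This is an illustration under simplifying assumptions, not the authors' implementation: it takes an already-sampled ESF rather than measuring one across a skin-surface mesh, and the function names are hypothetical.

```python
import cmath
import math

def resolution_index(esf, spacing_mm):
    """Return (frequencies in 1/mm, normalised RI) from a sampled edge-spread function."""
    # LSF: finite-difference derivative of the ESF.
    lsf = [(esf[i + 1] - esf[i]) / spacing_mm for i in range(len(esf) - 1)]
    n = len(lsf)
    mags = []
    for k in range(n // 2 + 1):
        # Direct DFT term for frequency bin k (O(n^2), adequate for short profiles).
        s = sum(lsf[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
        mags.append(abs(s))
    freqs = [k / (n * spacing_mm) for k in range(n // 2 + 1)]
    ri = [m / mags[0] for m in mags]   # normalise so that RI(0) = 1
    return freqs, ri

def f50(freqs, ri):
    """Frequency at which the RI first falls to 0.5, by linear interpolation."""
    for i in range(1, len(ri)):
        if ri[i] <= 0.5:
            f0, f1, r0, r1 = freqs[i - 1], freqs[i], ri[i - 1], ri[i]
            return f0 + (r0 - 0.5) * (f1 - f0) / (r0 - r1)
    return None   # the curve never drops to 50% within the sampled band
```

For an edge blurred by a Gaussian of standard deviation sigma, the analytic 50% frequency is sqrt(ln 2 / 2) / (pi * sigma), which gives a convenient sanity check for a sketch like this.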
An open source multivariate framework for n-tissue segmentation with evaluation on public data.

PubMed

Avants, Brian B; Tustison, Nicholas J; Wu, Jue; Cook, Philip A; Gee, James C

2011-12-01

We introduce Atropos, an ITK-based multivariate n-class open source segmentation algorithm distributed with ANTs (http://www.picsl.upenn.edu/ANTs). The Bayesian formulation of the segmentation problem is solved using the Expectation Maximization (EM) algorithm with the modeling of the class intensities based on either parametric or non-parametric finite mixtures. Atropos is capable of incorporating spatial prior probability maps (sparse), prior label maps and/or Markov Random Field (MRF) modeling. Atropos has also been efficiently implemented to handle large quantities of possible labelings (in the experimental section, we use up to 69 classes) with a minimal memory footprint. This work describes the technical and implementation aspects of Atropos and evaluates its performance on two different ground-truth datasets. First, we use the BrainWeb dataset from Montreal Neurological Institute to evaluate three-tissue segmentation performance via (1) K-means segmentation without use of template data; (2) MRF segmentation with initialization by prior probability maps derived from a group template; (3) Prior-based segmentation with use of spatial prior probability maps derived from a group template. We also evaluate Atropos performance by using spatial priors to drive a 69-class EM segmentation problem derived from the Hammers atlas from University College London. These evaluation studies, combined with illustrative examples that exercise Atropos options, demonstrate both performance and wide applicability of this new platform-independent open source segmentation tool.
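To make the EM-with-finite-mixtures idea in this abstract concrete, the sketch below fits a parametric (Gaussian) mixture to scalar intensities and assigns each sample to its most probable class. It is not Atropos and does not use ANTs or ITK: it omits the spatial priors and MRF term entirely, and the function name, initialisation, and variance floor are assumptions made for illustration.

```python
import math

def em_gmm_1d(x, k=2, iters=50, var_floor=1e-6):
    """Fit a k-class 1D Gaussian mixture by EM; return (weights, means, variances, labels)."""
    lo, hi = min(x), max(x)
    # Assumed initialisation: means spread evenly over the intensity range.
    means = [lo + (hi - lo) * (c + 0.5) / k for c in range(k)]
    variances = [max(((hi - lo) / k) ** 2, var_floor)] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iters):
        # E-step: posterior probability of each class for each sample.
        resp = []
        for xi in x:
            p = [w * math.exp(-(xi - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p)
            resp.append([pc / total for pc in p] if total > 0 else [1.0 / k] * k)
        # M-step: re-estimate mixture parameters from the soft assignments.
        for c in range(k):
            nc = sum(r[c] for r in resp)
            if nc == 0:
                continue  # empty class: keep its previous parameters
            weights[c] = nc / len(x)
            means[c] = sum(r[c] * xi for r, xi in zip(resp, x)) / nc
            spread = sum(r[c] * (xi - means[c]) ** 2 for r, xi in zip(resp, x)) / nc
            variances[c] = max(spread, var_floor)
    # Hard labels: the class with the highest posterior for each sample.
    labels = [max(range(k), key=lambda c: r[c]) for r in resp]
    return weights, means, variances, labels
```

In a segmentation setting such as the one described, the class posteriors in the E-step would additionally be weighted by spatial prior probabilities or neighbourhood terms, which is where an intensity-only mixture like this one stops.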
An Open Source Multivariate Framework for n-Tissue Segmentation with Evaluation on Public Data

PubMed Central

Tustison, Nicholas J.; Wu, Jue; Cook, Philip A.; Gee, James C.

2012-01-01

We introduce Atropos, an ITK-based multivariate n-class open source segmentation algorithm distributed with ANTs (http://www.picsl.upenn.edu/ANTs). The Bayesian formulation of the segmentation problem is solved using the Expectation Maximization (EM) algorithm with the modeling of the class intensities based on either parametric or non-parametric finite mixtures. Atropos is capable of incorporating spatial prior probability maps (sparse), prior label maps and/or Markov Random Field (MRF) modeling. Atropos has also been efficiently implemented to handle large quantities of possible labelings (in the experimental section, we use up to 69 classes) with a minimal memory footprint. This work describes the technical and implementation aspects of Atropos and evaluates its performance on two different ground-truth datasets. First, we use the BrainWeb dataset from Montreal Neurological Institute to evaluate three-tissue segmentation performance via (1) K-means segmentation without use of template data; (2) MRF segmentation with initialization by prior probability maps derived from a group template; (3) Prior-based segmentation with use of spatial prior probability maps derived from a group template. We also evaluate Atropos performance by using spatial priors to drive a 69-class EM segmentation problem derived from the Hammers atlas from University College London. These evaluation studies, combined with illustrative examples that exercise Atropos options, demonstrate both performance and wide applicability of this new platform-independent open source segmentation tool. PMID:21373993
Isotopic constraints on global atmospheric methane sources and sinks: a critical assessment of recent findings and new data

NASA Astrophysics Data System (ADS)

Schwietzke, S.; Sherwood, O.; Michel, S. E.; Bruhwiler, L.; Dlugokencky, E. J.; Tans, P. P.

2017-12-01

Methane isotopic data have increasingly been used in recent studies to help constrain global atmospheric methane sources and sinks. The added scientific contributions to this field include (i) careful comparisons and merging of atmospheric isotope measurement datasets to increase spatial coverage, (ii) in-depth analyses of observed isotopic spatial gradients and seasonal patterns, and (iii) improved datasets of isotopic source signatures. Different interpretations have been made regarding the utility of the isotopic data on the diagnosis of methane sources and sinks. Some studies have found isotopic evidence of a largely microbial source causing the renewed growth in global atmospheric methane since 2007, and underestimated global fossil fuel methane emissions compared to most previous studies. However, other studies have challenged these conclusions by pointing out substantial spatial variability in isotopic source signatures as well as open questions in atmospheric sinks and biomass burning trends. This presentation will review and contrast the main arguments and evidence for the different conclusions. The analysis will distinguish among the different research objectives including (i) global methane budget source attribution in steady-state, (ii) source attribution of recent global methane trends, and (iii) identifying specific methane sources in individual plumes during field campaigns. Additional comparisons of model experiments with atmospheric measurements and updates on isotopic source signature data will complement the analysis.

Identifying environmental drivers of insect phenology across space and time: Culicoides in Scotland as a case study.

PubMed

Searle, K R; Blackwell, A; Falconer, D; Sullivan, M; Butler, A; Purse, B V

2013-04-01

Interpreting spatial patterns in the abundance of species over time is a fundamental cornerstone of ecological research. For many species, this type of analysis is hampered by datasets that contain a large proportion of zeros, and data that are overdispersed and spatially autocorrelated. This is particularly true for insects, for which abundance data can fluctuate from zero to many thousands in the space of weeks. Increasingly, an understanding of the ways in which environmental variation drives spatial and temporal patterns in the distribution, abundance and phenology of insects is required for management of pests and vector-borne diseases. In this study, we combine the use of smoothing techniques and generalised linear mixed models to relate environmental drivers to key phenological patterns of two species of biting midges, Culicoides pulicaris and C. impunctatus, of which C. pulicaris has been implicated in transmission of bluetongue in Europe. In so doing, we demonstrate analytical tools for linking the phenology of species with key environmental drivers, despite using a relatively small dataset containing overdispersed and zero-inflated data. We demonstrate the importance of landcover and climatic variables in determining the seasonal abundance of these two vector species, and highlight the need for more empirical data on the effects of temperature and precipitation on the life history traits of palearctic Culicoides spp. in Europe.
Modeling spatial-temporal dynamics of global wetlands: comprehensive evaluation of a new sub-grid TOPMODEL parameterization and uncertainties

NASA Astrophysics Data System (ADS)

Zhang, Z.; Zimmermann, N. E.; Poulter, B.

2015-11-01

Simulations of the spatial-temporal dynamics of wetlands are key to understanding the role of wetland biogeochemistry under past and future climate variability. Hydrologic inundation models, such as TOPMODEL, are based on a fundamental parameter known as the compound topographic index (CTI) and provide a computationally cost-efficient approach to simulate wetland dynamics at global scales. However, there remain large discrepancies in the implementations of TOPMODEL in land-surface models (LSMs) and thus in their performance against observations. This study describes new improvements to the TOPMODEL implementation and estimates of global wetland dynamics using the LPJ-wsl dynamic global vegetation model (DGVM), and quantifies uncertainties by comparing the effects of three digital elevation model products (HYDRO1k, GMTED, and HydroSHEDS) of different spatial resolution and accuracy on simulated inundation dynamics. In addition, we found that calibrating TOPMODEL with a benchmark wetland dataset can help to successfully delineate the seasonal and interannual variations of wetlands, as well as improve the spatial distribution of wetlands to be consistent with inventories. The HydroSHEDS DEM, using a river-basin scheme for aggregating the CTI, shows the best accuracy in capturing the spatio-temporal dynamics of wetlands among the three DEM products. The estimate of global wetland potential/maximum is ∼10.3 Mkm² (10⁶ km²), with a mean annual maximum of ∼5.17 Mkm² for 1980-2010. This study demonstrates the feasibility of capturing the spatial heterogeneity of inundation and of estimating seasonal and interannual variations in wetlands by coupling a hydrological module in LSMs with appropriate benchmark datasets. It additionally highlights the importance of an adequate investigation of topographic indices for simulating global wetlands and shows the opportunity to converge wetland estimates across LSMs by identifying the uncertainty associated with existing wetland products.
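The compound topographic index on which TOPMODEL rests is conventionally defined as ln(a / tan β), where a is the upslope contributing area per unit contour length and β is the local slope. The sketch below computes it and derives a sub-grid inundated fraction by thresholding; the threshold approach, the minimum-slope guard, and the function names are simplifying assumptions for illustration, not the parameterization used in the study.

```python
import math

def cti(specific_area_m, slope_rad, min_slope=1e-4):
    """Compound topographic index ln(a / tan(beta)); flat cells use a minimum slope."""
    return math.log(specific_area_m / math.tan(max(slope_rad, min_slope)))

def inundated_fraction(cti_values, threshold):
    """Share of sub-grid cells whose CTI is at or above an assumed wetness threshold."""
    return sum(1 for c in cti_values if c >= threshold) / len(cti_values)
```

Because a and β are both derived from the elevation model, the choice of DEM product and the way CTI values are aggregated within a grid cell feed directly into the simulated inundation extent.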
Improving data management and dissemination in web based information systems by semantic enrichment of descriptive data aspects

NASA Astrophysics Data System (ADS)

Gebhardt, Steffen; Wehrmann, Thilo; Klinger, Verena; Schettler, Ingo; Huth, Juliane; Künzer, Claudia; Dech, Stefan

2010-10-01

The German-Vietnamese water-related information system for the Mekong Delta (WISDOM) project supports business processes in Integrated Water Resources Management in Vietnam. Multiple disciplines bring together earth and ground based observation themes, such as environmental monitoring, water management, demographics, economy, information technology, and infrastructural systems. This paper introduces the components of the web-based WISDOM system including data, logic and presentation tier. It focuses on the data models upon which the database management system is built, including techniques for tagging or linking metadata with the stored information. The model also uses ordered groupings of spatial, thematic and temporal reference objects to semantically tag datasets to enable fast data retrieval, such as finding all data in a specific administrative unit belonging to a specific theme. A spatial database extension is employed by the PostgreSQL database. This object-oriented database was chosen over a relational database to tag spatial objects to tabular data, improving the retrieval of census and observational data at regional, provincial, and local areas. While the spatial database hinders processing raster data, a "work-around" was built into WISDOM to permit efficient management of both raster and vector data. The data model also incorporates styling aspects of the spatial datasets through styled layer descriptions (SLD) and web mapping service (WMS) layer specifications, allowing retrieval of rendered maps. Metadata elements of the spatial data are based on the ISO19115 standard. XML structured information of the SLD and metadata are stored in an XML database. The data models and the data management system are robust for managing the large quantity of spatial objects, sensor observations, census and document data. 
The operational WISDOM information system prototype contains modules for data management, automatic data integration, and web services for data retrieval, analysis, and distribution. The graphical user interfaces facilitate metadata cataloguing, data warehousing, web sensor data analysis and thematic mapping.
Uncertainty in the spatial distribution of tropical forest biomass: a comparison of pan-tropical maps.

PubMed

Mitchard, Edward T A; Saatchi, Sassan S; Baccini, Alessandro; Asner, Gregory P; Goetz, Scott J; Harris, Nancy L; Brown, Sandra

2013-10-26

Mapping the aboveground biomass of tropical forests is essential both for implementing conservation policy and reducing uncertainties in the global carbon cycle. Two medium resolution (500 m - 1000 m) pantropical maps of vegetation biomass have been recently published, and have been widely used by sub-national and national-level activities in relation to Reducing Emissions from Deforestation and forest Degradation (REDD+). Both maps use similar input data layers, and are driven by the same spaceborne LiDAR dataset providing systematic forest height and canopy structure estimates, but use different ground datasets for calibration and different spatial modelling methodologies. Here, we compare these two maps to each other, to the FAO's Forest Resource Assessment (FRA) 2010 country-level data, and to a high resolution (100 m) biomass map generated for a portion of the Colombian Amazon. We find substantial differences between the two maps, in particular in central Amazonia, the Congo basin, the south of Papua New Guinea, the Miombo woodlands of Africa, and the dry forests and savannas of South America. There is little consistency in the direction of the difference. However, when the maps are aggregated to the country or biome scale there is greater agreement, with differences cancelling out to a certain extent. When comparing country level biomass stocks, the two maps agree with each other to a much greater extent than to the FRA 2010 estimates. In the Colombian Amazon, both pantropical maps estimate higher biomass than the independent high resolution map, but show a similar spatial distribution of this biomass. Biomass mapping has progressed enormously over the past decade, to the stage where we can produce globally consistent maps of aboveground biomass. We show that there are still large uncertainties in these maps, in particular in areas with little field data. 
However, when used at a regional scale, different maps appear to converge, suggesting we can provide reasonable stock estimates when aggregated over large regions. Therefore we believe the largest uncertainties for REDD+ activities relate to the spatial distribution of biomass and to the spatial pattern of forest cover change, rather than to total globally or nationally summed carbon density.
Exhumation of the Ladakh batholith revealed through the combined analysis of bedrock and detrital zircon (U-Th)/He data

NASA Astrophysics Data System (ADS)

Tripathy-Lang, A.; Fox, M.; Bohon, W.; Van Soest, M. C.; Hodges, K. V.; Dortch, J.

2013-12-01

Recent studies of the Ladakh batholith, in the northwestern Indian Himalaya, have yielded various hypotheses for its exhumation history and relationship with the evolution of the southwestern margin of the Tibetan Plateau, which is today bounded by the Karakoram fault. Different hypotheses are supported by various datasets with differing spatial and temporal resolution. First, low-temperature thermochronologic and thermobarometric data provide constraints on long term exhumation (10^6 - 10^7 yr) and suggest that the Ladakh batholith experienced multiple tilting events since ~40 Ma (Kirstein, Tectonophysics, 2011). Second, cosmogenic nuclide concentrations (CNCs), which provide evidence for erosion rates averaged over millennial timescales (10^2-10^4 yr), suggest that erosion rates increase toward the Karakoram fault (Dortch et al., Geomorphology, 2011). A third dataset comprises detrital zircon (U-Th)/He data obtained from the mouth of the Basgo catchment, on the southern flank of the Ladakh batholith (Tripathy-Lang et al., JGR-ES, 2013). This exceptionally large detrital dataset provides information about both the bedrock age distribution and recent erosion rates that sample different parts of the catchment. Interpreting this dataset requires an understanding of the erosion history at multiple timescales. To these already existing datasets, we add new bedrock zircon (U-Th)/He data from an age-elevation transect collected from the base to range crest of the Basgo catchment, which we use to verify models of bedrock age distribution. Through the combined analysis of the datasets, the resolution of both the long term exhumation rate and the spatial distribution of modern erosion rates can be greatly improved, thus advancing our understanding of this part of the Tibetan margin. With this aim, we use thermo-kinematic models to predict bedrock ages that we compare to our new bedrock data. 
We test different modern erosion rate distributions to generate synthetic detrital thermochronometric and CNC data. Through the comparison of predicted and measured data (both detrital thermochronometric data and CNC data) we infer long term exhumation histories and also modern erosion rate distribution.
South Asian Summer Monsoon Rainfall Variability and Trend: Its Links to Indo-Pacific SST Anomalies and Moist Processes

NASA Astrophysics Data System (ADS)

Prasanna, V.

2016-06-01

The impact of warm (El Niño) and cold (La Niña) events on all-India summer monsoon rainfall (AISMR) is explored for the past 100 years. The 103-year (1901-2003) data from the Twentieth Century Reanalysis (20CR) dataset and other major reanalysis datasets for the southwest monsoon season (JJAS) are used to examine the simultaneous El Niño Southern Oscillation (ENSO)-AISMR relationship. Two cases, wet and dry monsoon years, associated with ENSO(+) (El Niño), ENSO(-) (La Niña) and Non-ENSO (neutral) events are discussed in detail using observed rainfall and the three-dimensional 20CR dataset. The dry and wet years associated with ENSO and Non-ENSO periods show significant differences in the spatial pattern of rainfall and in the associated three-dimensional atmospheric composites, and the 20CR dataset captures the anomalies quite well. During wet (dry) years, the rainfall is high (low), i.e. 10% above (below) the long-term mean, and these wet or dry conditions occur during both ENSO and Non-ENSO phases. The Non-ENSO dry and wet composites are also examined in detail to understand where the anomalous winds originate, in contrast to the ENSO case. The moisture transport is coherent with the changes in the spatial pattern of AISMR and with large-scale features in the 20CR dataset. The recent 50-year trend (1951-2000) is also analyzed from various available observational and reanalysis datasets to assess the influence of Indo-Pacific sea surface temperature (SST) and moist processes on the South Asian summer monsoon rainfall trend. Apart from Indo-Pacific SST, moisture convergence and moisture transport among India (IND), the Equatorial Indian Ocean (IOC) and the tropical western Pacific (WNP) are also important in modifying the wet or dry cycles over India. The mutual interaction among IOC, WNP and IND on seasonal timescales is significant in modifying wet and dry cycles over the Indian region and the seasonal anomalies.
Using the Spatial Distribution of Installers to Define Solar Photovoltaic Markets

DOE Office of Scientific and Technical Information (OSTI.GOV)

O'Shaughnessy, Eric; Nemet, Gregory F.; Darghouth, Naim

2016-09-01

Solar PV market research to date has largely relied on arbitrary jurisdictional boundaries, such as counties, to study solar PV market dynamics. This paper seeks to improve solar PV market research by developing a methodology to define solar PV markets. The methodology is based on the spatial distribution of solar PV installers. An algorithm is developed and applied to a rich dataset of solar PV installations to study the outcomes of the installer-based market definitions. The installer-based approach exhibits several desirable properties. Specifically, the higher market granularity of the installer-based approach will allow future PV market research to study the relationship between market dynamics and pricing with more precision.
Spatial Indexing for Data Searching in Mobile Sensing Environments.

PubMed

Zhou, Yuchao; De, Suparna; Wang, Wei; Moessner, Klaus; Palaniswami, Marimuthu S

2017-06-18

Data searching and retrieval is one of the fundamental functionalities in many Web of Things applications, which need to collect, process and analyze huge amounts of sensor stream data. The problem in fact has been well studied for data generated by sensors that are installed at fixed locations; however, challenges emerge along with the popularity of opportunistic sensing applications in which mobile sensors keep reporting observation and measurement data at variable intervals and changing geographical locations. To address these challenges, we develop the Geohash-Grid Tree, a spatial indexing technique specially designed for searching data integrated from heterogeneous sources in a mobile sensing environment. Results of the experiments on a real-world dataset collected from the SmartSantander smart city testbed show that the index structure allows efficient search based on spatial distance, range and time windows in a large time series database.
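The Geohash-Grid Tree described in this abstract builds on geohash cells. As background, the sketch below implements the standard geohash encoding, which interleaves longitude and latitude bisection bits and writes them in a base-32 alphabet. It illustrates only the encoding, not the authors' tree structure, and the function name is hypothetical.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat, lon, precision=8):
    """Encode a latitude/longitude pair as a geohash string of the given length."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, nbits = 0, 0
    use_lon = True   # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        nbits += 1
        if nbits == 5:            # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits, nbits = 0, 0
    return "".join(chars)
```

Because each extra character subdivides the previous cell, a shorter geohash is always a prefix of the longer one for the same point, which is what makes prefix-based grouping of nearby observations possible in an index.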
Spatial Indexing for Data Searching in Mobile Sensing Environments

PubMed Central

Zhou, Yuchao; De, Suparna; Wang, Wei; Moessner, Klaus; Palaniswami, Marimuthu S.

2017-01-01

Data searching and retrieval is one of the fundamental functionalities in many Web of Things applications, which need to collect, process and analyze huge amounts of sensor stream data. The problem in fact has been well studied for data generated by sensors that are installed at fixed locations; however, challenges emerge along with the popularity of opportunistic sensing applications in which mobile sensors keep reporting observation and measurement data at variable intervals and changing geographical locations. To address these challenges, we develop the Geohash-Grid Tree, a spatial indexing technique specially designed for searching data integrated from heterogeneous sources in a mobile sensing environment. Results of the experiments on a real-world dataset collected from the SmartSantander smart city testbed show that the index structure allows efficient search based on spatial distance, range and time windows in a large time series database. PMID:28629156
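The abstract does not describe the internals of the Geohash-Grid Tree, so the following is only a minimal sketch of the standard geohash encoding that such an index presumably builds on, not the authors' data structure. The function names `geohash_encode` and `prefix_range_query` and the flat list used as an index are illustrative assumptions.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=7):
    """Encode a latitude/longitude pair as a standard geohash string."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    nbits = 0
    use_lon = True  # bits are interleaved, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2.0
        if value >= mid:
            bits = (bits << 1) | 1  # upper half of the interval
            rng[0] = mid
        else:
            bits = bits << 1        # lower half of the interval
            rng[1] = mid
        use_lon = not use_lon
        nbits += 1
        if nbits == 5:              # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, nbits = 0, 0
    return "".join(chars)


def prefix_range_query(index, prefix):
    """Hypothetical helper: return records whose geohash lies in the cell `prefix`.

    `index` is a plain list of (geohash, record) pairs, used here only for
    illustration; a real index would use a tree or sorted structure.
    """
    return [record for gh, record in index if gh.startswith(prefix)]
```

Because a shorter geohash names a larger cell that contains all of its extensions, a spatial range search can be reduced to prefix matching. Points close to a cell boundary can have different prefixes, so a practical search also has to inspect the neighbouring cells.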
Highlights of the Version 8 SBUV and TOMS Datasets Released at this Symposium

NASA Technical Reports Server (NTRS)

Bhartia, Pawan K.; McPeters, Richard D.; Flynn, Lawrence E.; Wellemeyer, Charles G.

2004-01-01

Last October was the 25th anniversary of the launch of the SBUV and TOMS instruments on NASA's Nimbus-7 satellite. Total Ozone and ozone profile datasets produced by these and following instruments have produced a quarter century long record. Over time we have released several versions of these datasets to incorporate advances in UV radiative transfer, inverse modeling, and instrument characterization. In this meeting we are releasing datasets produced from the version 8 algorithms. They replace the previous versions (V6 SBUV, and V7 TOMS) released about a decade ago. About a dozen companion papers in this meeting provide details of the new algorithms and intercomparison of the new data with external data. In this paper we present key features of the new algorithm, and discuss how the new results differ from those released previously. We show that the new datasets have better internal consistency and also agree better with external datasets. A key feature of the V8 SBUV algorithm is that the climatology has no influence on inter-annual variability and trends; it only affects the mean values and, to a limited extent, the seasonal dependence. By contrast, climatology does have some influence on TOMS total O3 trends, particularly at large solar zenith angles. For this reason, and also because TOMS record has gaps, and EP/TOMS is suffering from data quality problems, we recommend using SBUV total ozone data for applications where the high spatial resolution of TOMS is not essential.
Single-Image Super Resolution for Multispectral Remote Sensing Data Using Convolutional Neural Networks

NASA Astrophysics Data System (ADS)

Liebel, L.; Körner, M.

2016-06-01

In optical remote sensing, spatial resolution of images is crucial for numerous applications. Space-borne systems are most likely to be affected by a lack of spatial resolution, due to their natural disadvantage of a large distance between the sensor and the sensed object. Thus, methods for single-image super resolution are desirable to exceed the limits of the sensor. Apart from assisting visual inspection of datasets, post-processing operations—e.g., segmentation or feature extraction—can benefit from detailed and distinguishable structures. In this paper, we show that recently introduced state-of-the-art approaches for single-image super resolution of conventional photographs, making use of deep learning techniques, such as convolutional neural networks (CNN), can successfully be applied to remote sensing data. With a huge amount of training data available, end-to-end learning is reasonably easy to apply and can achieve results unattainable using conventional handcrafted algorithms. We trained our CNN on a specifically designed, domain-specific dataset, in order to take into account the special characteristics of multispectral remote sensing data. This dataset consists of publicly available SENTINEL-2 images featuring 13 spectral bands, a ground resolution of up to 10m, and a high radiometric resolution and thus satisfying our requirements in terms of quality and quantity. In experiments, we obtained results superior compared to competing approaches trained on generic image sets, which failed to reasonably scale satellite images with a high radiometric resolution, as well as conventional interpolation methods.
Global patterns and climate drivers of water-use efficiency in terrestrial ecosystems deduced from satellite-based datasets and carbon cycle models

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sun, Yan; Piao, Shilong; Huang, Mengtian

Our aim is to investigate how ecosystem water-use efficiency (WUE) varies spatially under different climate conditions, and how spatial variations in WUE differ from those of transpiration-based water-use efficiency (WUEt) and transpiration-based inherent water-use efficiency (IWUEt). Location: Global terrestrial ecosystems. We investigated spatial patterns of WUE using two datasets of gross primary productivity (GPP) and evapotranspiration (ET) and four biosphere model estimates of GPP and ET. Spatial relationships between WUE and climate variables were further explored through regression analyses. Global WUE estimated by two satellite-based datasets is 1.9 ± 0.1 and 1.8 ± 0.6 g C m⁻² mm⁻¹, lower than the simulations from four process-based models (2.0 ± 0.3 g C m⁻² mm⁻¹) but comparable within the uncertainty of both approaches. In both satellite-based datasets and process models, precipitation is more strongly associated with spatial gradients of WUE for temperate and tropical regions, but temperature dominates north of 50 degrees N. WUE also increases with increasing solar radiation at high latitudes. The values of WUE from datasets and process-based models are systematically higher in wet regions (with higher GPP) than in dry regions. WUEt shows a lower precipitation sensitivity than WUE, which is contrary to leaf- and plant-level observations. IWUEt, the product of WUEt and water vapour deficit, is found to be rather conservative with spatially increasing precipitation, in agreement with leaf- and plant-level measurements. In conclusion, WUE, WUEt and IWUEt produce different spatial relationships with climate variables. In dry ecosystems, water losses from evaporation from bare soil, uncorrelated with productivity, tend to make WUE lower than in wetter regions. Yet canopy conductance is intrinsically efficient in those ecosystems and maintains a higher IWUEt. This suggests that the responses of each component flux of evapotranspiration should be analysed separately when investigating regional gradients in WUE, its temporal variability and its trends.
Global patterns and climate drivers of water-use efficiency in terrestrial ecosystems deduced from satellite-based datasets and carbon cycle models

DOE PAGES

Sun, Yan; Piao, Shilong; Huang, Mengtian; ...

2015-12-23

Our aim is to investigate how ecosystem water-use efficiency (WUE) varies spatially under different climate conditions, and how spatial variations in WUE differ from those of transpiration-based water-use efficiency (WUEt) and transpiration-based inherent water-use efficiency (IWUEt). Location: Global terrestrial ecosystems. We investigated spatial patterns of WUE using two datasets of gross primary productivity (GPP) and evapotranspiration (ET) and four biosphere model estimates of GPP and ET. Spatial relationships between WUE and climate variables were further explored through regression analyses. Global WUE estimated by two satellite-based datasets is 1.9 ± 0.1 and 1.8 ± 0.6 g C m⁻² mm⁻¹, lower than the simulations from four process-based models (2.0 ± 0.3 g C m⁻² mm⁻¹) but comparable within the uncertainty of both approaches. In both satellite-based datasets and process models, precipitation is more strongly associated with spatial gradients of WUE for temperate and tropical regions, but temperature dominates north of 50 degrees N. WUE also increases with increasing solar radiation at high latitudes. The values of WUE from datasets and process-based models are systematically higher in wet regions (with higher GPP) than in dry regions. WUEt shows a lower precipitation sensitivity than WUE, which is contrary to leaf- and plant-level observations. IWUEt, the product of WUEt and water vapour deficit, is found to be rather conservative with spatially increasing precipitation, in agreement with leaf- and plant-level measurements. In conclusion, WUE, WUEt and IWUEt produce different spatial relationships with climate variables. In dry ecosystems, water losses from evaporation from bare soil, uncorrelated with productivity, tend to make WUE lower than in wetter regions. Yet canopy conductance is intrinsically efficient in those ecosystems and maintains a higher IWUEt. This suggests that the responses of each component flux of evapotranspiration should be analysed separately when investigating regional gradients in WUE, its temporal variability and its trends.
Integrating High-Resolution Datasets to Target Mitigation Efforts for Improving Air Quality and Public Health in Urban Neighborhoods

PubMed Central

Shandas, Vivek; Voelkel, Jackson; Rao, Meenakshi; George, Linda

2016-01-01

Reducing exposure to degraded air quality is essential for building healthy cities. Although air quality and population vary at fine spatial scales, current regulatory and public health frameworks assess human exposures using county- or city-scales. We build on a spatial analysis technique, dasymetric mapping, for allocating urban populations that, together with emerging fine-scale measurements of air pollution, addresses three objectives: (1) evaluate the role of spatial scale in estimating exposure; (2) identify urban communities that are disproportionately burdened by poor air quality; and (3) estimate reduction in mobile sources of pollutants due to local tree-planting efforts using nitrogen dioxide. Our results show a maximum value of 197% difference between cadastrally-informed dasymetric system (CIDS) and standard estimations of population exposure to degraded air quality for small spatial extent analyses, and a lack of substantial difference for large spatial extent analyses. These results provide the foundation for improving policies for managing air quality, and targeting mitigation efforts to address challenges of environmental justice. PMID:27527205
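The effect of spatial scale on exposure estimates can be illustrated with a generic dasymetric allocation: a zone's population count is redistributed to smaller units in proportion to an ancillary weight, and exposure is then averaged with those populations as weights. This is a minimal sketch under assumed inputs, not the cadastrally-informed dasymetric system (CIDS) itself; the function names and the choice of weight (for example, residential floor area per parcel) are hypothetical.

```python
def dasymetric_allocate(zone_population, parcel_weights):
    """Split a zone's population across parcels in proportion to their weights."""
    total_weight = sum(parcel_weights)
    if total_weight <= 0:
        raise ValueError("at least one parcel must have a positive weight")
    return [zone_population * w / total_weight for w in parcel_weights]


def population_weighted_exposure(populations, concentrations):
    """Average pollutant concentration experienced per person."""
    total_population = sum(populations)
    if total_population <= 0:
        raise ValueError("total population must be positive")
    weighted_sum = sum(p * c for p, c in zip(populations, concentrations))
    return weighted_sum / total_population
```

With 1000 people in a zone, parcel weights of 1, 3 and 0, and parcel concentrations of 10, 20 and 40 (arbitrary units), the allocation places nobody in the most polluted parcel, so the population-weighted exposure (17.5) is lower than the simple areal mean of the three concentrations (about 23.3).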
In situ Raman mapping of art objects

PubMed Central

Brondeel, Ph.; Moens, L.; Vandenabeele, P.

2016-01-01

Raman spectroscopy has grown to be one of the techniques of interest for the investigation of art objects. The approach has several advantageous properties, and the non-destructive character of the technique allowed it to be used for in situ investigations. However, compared with laboratory approaches, it would be useful to take advantage of the small spectral footprint of the technique, and use Raman spectroscopy to study the spatial distribution of different compounds. In this work, an in situ Raman mapping system is developed to be able to relate chemical information with its spatial distribution. Challenges for the development are discussed, including the need for stable positioning and proper data treatment. To avoid focusing problems, nineteenth century porcelain cards are used to test the system. This work focuses mainly on the post-processing of the large dataset which consists of four steps: (i) importing the data into the software; (ii) visualization of the dataset; (iii) extraction of the variables; and (iv) creation of a Raman image. It is shown that despite the challenging task of the development of the full in situ Raman mapping system, the first steps are very promising. This article is part of the themed issue ‘Raman spectroscopy in art and archaeology’. PMID:27799424
Tundra landform and vegetation productivity trend maps for the Arctic Coastal Plain of northern Alaska

PubMed Central

Lara, Mark J.; Nitze, Ingmar; Grosse, Guido; McGuire, A. David

2018-01-01

Arctic tundra landscapes are composed of a complex mosaic of patterned ground features, varying in soil moisture, vegetation composition, and surface hydrology over small spatial scales (10–100 m). The importance of microtopography and associated geomorphic landforms in influencing ecosystem structure and function is well founded; however, spatial data products describing local to regional scale distribution of patterned ground or polygonal tundra geomorphology are largely unavailable. Thus, our understanding of local impacts on regional scale processes (e.g., carbon dynamics) may be limited. We produced two key spatiotemporal datasets spanning the Arctic Coastal Plain of northern Alaska (~60,000 km²) to evaluate climate-geomorphological controls on arctic tundra productivity change, using (1) a novel 30 m classification of polygonal tundra geomorphology and (2) decadal-trends in surface greenness using the Landsat archive (1999–2014). These datasets can be easily integrated and adapted in an array of local to regional applications such as (1) upscaling plot-level measurements (e.g., carbon/energy fluxes), (2) mapping of soils, vegetation, or permafrost, and/or (3) initializing ecosystem biogeochemistry, hydrology, and/or habitat modeling. PMID:29633984
Tundra landform and vegetation productivity trend maps for the Arctic Coastal Plain of northern Alaska

USGS Publications Warehouse

Lara, Mark J.; Nitze, Ingmar; Grosse, Guido; McGuire, A. David

2018-01-01

Arctic tundra landscapes are composed of a complex mosaic of patterned ground features, varying in soil moisture, vegetation composition, and surface hydrology over small spatial scales (10–100 m). The importance of microtopography and associated geomorphic landforms in influencing ecosystem structure and function is well founded; however, spatial data products describing local to regional scale distribution of patterned ground or polygonal tundra geomorphology are largely unavailable. Thus, our understanding of local impacts on regional scale processes (e.g., carbon dynamics) may be limited. We produced two key spatiotemporal datasets spanning the Arctic Coastal Plain of northern Alaska (~60,000 km²) to evaluate climate-geomorphological controls on arctic tundra productivity change, using (1) a novel 30 m classification of polygonal tundra geomorphology and (2) decadal-trends in surface greenness using the Landsat archive (1999–2014). These datasets can be easily integrated and adapted in an array of local to regional applications such as (1) upscaling plot-level measurements (e.g., carbon/energy fluxes), (2) mapping of soils, vegetation, or permafrost, and/or (3) initializing ecosystem biogeochemistry, hydrology, and/or habitat modeling.
Multi-year mapping of irrigated croplands over the US High Plains Aquifer using satellite data

NASA Astrophysics Data System (ADS)

Deines, J.; Kendall, A. D.; Hyndman, D. W.

2016-12-01

Irrigated agriculture is the largest consumer of freshwater globally. Effective water management is crucial to support ongoing agricultural intensification to meet increasing demand for food, fuel, and fiber production. Knowledge of where and when irrigation occurs is critical for effective management and hydrological modeling, yet data on patterns of irrigation through time are surprisingly rare. Existing regional datasets in the United States tend to be either aspatial county-level estimates or static, single-year remotely sensed products with relatively low spatial resolution (250 m or coarser). Spatially explicit, dynamic maps are needed to understand water use trends, create accurate hydrological models, and inform forecasts of future water availability under projected climate change. In the High Plains Aquifer (HPA), repeat mapping efforts in 2002 and 2007 indicated only 60% of irrigated lands were static between these periods. To better understand annual irrigation dynamics, we used remote sensing to produce annual maps of irrigated cropland across the HPA region from a data fusion of Landsat satellites, annual time series of vegetation indices, and ancillary data such as precipitation, soil properties, and terrain slope. We performed machine learning classification using Google Earth Engine, allowing efficient image processing over a large region for multiple years. We then analyzed maps for water use trends and found that although total irrigated area has increased only slightly, there was substantial variability in the spatial pattern of irrigated lands over time. This dataset will support efforts towards groundwater sustainability by providing consistent, spatially explicit tracking of irrigation dynamics over time.
High-Dimensional Bayesian Geostatistics

PubMed Central

Banerjee, Sudipto

2017-01-01

With the growing capabilities of Geographic Information Systems (GIS) and user-friendly software, statisticians today routinely encounter geographically referenced data containing observations from a large number of spatial locations and time points. Over the last decade, hierarchical spatiotemporal process models have become widely deployed statistical tools for researchers to better understand the complex nature of spatial and temporal variability. However, fitting hierarchical spatiotemporal models often involves expensive matrix computations with complexity increasing in cubic order for the number of spatial locations and temporal points. This renders such models unfeasible for large data sets. This article offers a focused review of two methods for constructing well-defined highly scalable spatiotemporal stochastic processes. Both these processes can be used as “priors” for spatiotemporal random fields. The first approach constructs a low-rank process operating on a lower-dimensional subspace. The second approach constructs a Nearest-Neighbor Gaussian Process (NNGP) that ensures sparse precision matrices for its finite realizations. Both processes can be exploited as a scalable prior embedded within a rich hierarchical modeling framework to deliver full Bayesian inference. These approaches can be described as model-based solutions for big spatiotemporal datasets. The models ensure that the algorithmic complexity is ~n floating point operations (flops) per iteration, where n is the number of spatial locations. We compare these methods and provide some insight into their methodological underpinnings. PMID:29391920
High-Dimensional Bayesian Geostatistics.

PubMed

Banerjee, Sudipto

2017-06-01

With the growing capabilities of Geographic Information Systems (GIS) and user-friendly software, statisticians today routinely encounter geographically referenced data containing observations from a large number of spatial locations and time points. Over the last decade, hierarchical spatiotemporal process models have become widely deployed statistical tools for researchers to better understand the complex nature of spatial and temporal variability. However, fitting hierarchical spatiotemporal models often involves expensive matrix computations with complexity increasing in cubic order for the number of spatial locations and temporal points. This renders such models unfeasible for large data sets. This article offers a focused review of two methods for constructing well-defined highly scalable spatiotemporal stochastic processes. Both these processes can be used as "priors" for spatiotemporal random fields. The first approach constructs a low-rank process operating on a lower-dimensional subspace. The second approach constructs a Nearest-Neighbor Gaussian Process (NNGP) that ensures sparse precision matrices for its finite realizations. Both processes can be exploited as a scalable prior embedded within a rich hierarchical modeling framework to deliver full Bayesian inference. These approaches can be described as model-based solutions for big spatiotemporal datasets. The models ensure that the algorithmic complexity is ~n floating point operations (flops) per iteration, where n is the number of spatial locations. We compare these methods and provide some insight into their methodological underpinnings.
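The sparse-precision idea behind the NNGP can be sketched as follows: each location is conditioned only on at most m nearest neighbours among the locations that precede it in some ordering, which gives a precision matrix of the form Q = (I - A)^T D^{-1} (I - A) with at most m nonzero entries per row of A. This is a minimal illustration under assumptions of our own (an exponential covariance, the parameter values, the function names, and dense NumPy arrays with a brute-force neighbour search), not the implementation reviewed in the article.

```python
import numpy as np


def exp_cov(coords_a, coords_b, sigma2=1.0, phi=3.0):
    """Exponential covariance sigma2 * exp(-phi * distance) (assumed model)."""
    diff = coords_a[:, None, :] - coords_b[None, :, :]
    return sigma2 * np.exp(-phi * np.linalg.norm(diff, axis=-1))


def nngp_precision(coords, m, sigma2=1.0, phi=3.0):
    """Build Q = (I - A)^T D^{-1} (I - A) from at most m earlier neighbours.

    Arrays are dense and the neighbour search is brute force for clarity;
    a scalable implementation would store A sparsely and use a fast search.
    """
    n = coords.shape[0]
    A = np.zeros((n, n))   # kriging weights on earlier neighbours
    d = np.empty(n)        # conditional variances
    d[0] = sigma2          # the first location has nothing to condition on
    for i in range(1, n):
        dist = np.linalg.norm(coords[:i] - coords[i], axis=1)
        nbrs = np.argsort(dist)[:m]                     # nearest earlier locations
        c_nn = exp_cov(coords[nbrs], coords[nbrs], sigma2, phi)
        c_in = exp_cov(coords[i:i + 1], coords[nbrs], sigma2, phi).ravel()
        w = np.linalg.solve(c_nn, c_in)                 # small m-by-m solve
        A[i, nbrs] = w
        d[i] = sigma2 - c_in @ w
    i_minus_a = np.eye(n) - A
    return i_minus_a.T @ (i_minus_a / d[:, None])
```

Each location requires only one m-by-m linear solve, so for fixed m the cost of building the factors grows linearly in n once the neighbour sets are known. If m is at least n - 1, every location conditions on all earlier ones and Q equals the exact inverse of the full covariance matrix, which gives a simple correctness check.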

Dynamic Moss Observed with Hi-C

NASA Technical Reports Server (NTRS)

Alexander, Caroline; Winebarger, Amy; Morton, Richard; Savage, Sabrina

2014-01-01

The High-resolution Coronal Imager (Hi-C), flown on 11 July 2012, has revealed an unprecedented level of detail and substructure within the solar corona. Hi-C imaged a large active region (AR11520) with 0.2-0.3'' spatial resolution and 5.5s cadence over a 5 minute period. An additional dataset with a smaller FOV, the same resolution, but with a higher temporal cadence (1s) was also taken during the rocket flight. This dataset was centered on a large patch of 'moss' emission that initially seemed to show very little variability. Image processing revealed this region to be much more dynamic than first thought with numerous bright and dark features observed to appear, move and disappear over the 5 minute observation. Moss is thought to be emission from the upper transition region component of hot loops so studying its dynamics and the relation between the bright/dark features and underlying magnetic features is important to tie the interaction of the different atmospheric layers together. Hi-C allows us to study the coronal emission of the moss at the smallest scales while data from SDO/AIA and HMI is used to give information on these structures at different heights/temperatures. Using the high temporal and spatial resolution of Hi-C the observed moss features were tracked and the distribution of displacements, speeds, and sizes were measured. This allows us to comment on both the physical processes occurring within the dynamic moss and the scales at which these changes are occurring.
Visualizing Time-Varying Distribution Data in EOS Application

NASA Technical Reports Server (NTRS)

Shen, Han-Wei

2004-01-01

In this research, we have developed several novel visualization methods for spatial probability density function data. Our focus has been on 2D spatial datasets, where each pixel is a random variable, and has multiple samples which are the results of experiments on that random variable. We developed novel clustering algorithms as a means to reduce the information contained in these datasets; and investigated different ways of interpreting and clustering the data.
Architecture of the local spatial data infrastructure for regional climate change research

NASA Astrophysics Data System (ADS)

Titov, Alexander; Gordov, Evgeny

2013-04-01

Georeferenced datasets (meteorological databases, modeling and reanalysis results, etc.) are actively used in modeling and analysis of climate change at various spatial and temporal scales. Because environmental datasets are inherently heterogeneous and large (up to tens of terabytes for a single dataset), studies of climate and environmental change require dedicated software support based on the spatial data infrastructure (SDI) approach. A dedicated architecture of a local spatial data infrastructure for regional climate change analysis using modern web mapping technologies is presented. The geoportal is a key element of any SDI: it allows searching of geoinformation resources (datasets and services) through metadata catalogs, producing geospatial data selections by their parameters (data access functionality), and managing services and applications for cartographic visualization. Owing to the large dataset volumes, the complexity of the data models used, and the syntactic and semantic differences among datasets, developing services for environmental geodata access, processing and visualization is a complex task. These circumstances were taken into account in developing the architecture of the local spatial data infrastructure as a universal framework providing geodata services. The architecture presented therefore includes: (1) a model for storing large sets of regional georeferenced data that is efficient in terms of search, access, retrieval and subsequent statistical processing and that allows, in particular, frequently used values (such as monthly and annual climate change indices) to be stored, thus providing different temporal views of the datasets; (2) a general architecture of the corresponding software components handling geospatial datasets within the storage model; (3) a metadata catalog, a basic element of the spatial data infrastructure, that describes in detail the datasets used in climate research using the ISO 19115 and CF-convention standards and is published according to the OGC CSW (Catalog Service Web) specification; (4) computational and mapping web services for working with geospatial datasets based on OWS (OGC Web Services) standards: WMS, WFS, WPS; and (5) a geoportal as a key element of the thematic regional spatial data infrastructure, which also provides a software framework for developing dedicated web applications. Geoserver software is used to implement the web mapping services since it provides a natural WPS implementation as a separate software module. To provide geospatial metadata services, the GeoNetwork Opensource (http://geonetwork-opensource.org) product is planned to be used, since it supports the ISO 19115/ISO 19119/ISO 19139 metadata standards as well as the ISO CSW 2.0 profile for both client and server. To implement thematic applications based on geospatial web services within the framework of the local SDI geoportal, the following open source software has been selected: (1) the OpenLayers JavaScript library, providing basic web mapping functionality for a thin client such as a web browser; and (2) the GeoExt/ExtJS JavaScript libraries for building client-side web applications working with geodata services. The web interface developed will be similar to the interface of popular desktop GIS applications such as uDIG and QuantumGIS. The work is partially supported by RF Ministry of Education and Science grant 8345, SB RAS Program VIII.80.2.1 and IP 131.
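A WMS mapping service of the kind listed above is queried with an ordinary HTTP GET request carrying standard key-value parameters. The sketch below assembles a WMS 1.1.1 GetMap URL using only the Python standard library. The endpoint URL and the layer name in the usage note are hypothetical placeholders, not resources named in the abstract, and `wms_getmap_url` is an illustrative helper rather than part of any library.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layer, bbox, width=800, height=400,
                   image_format="image/png"):
    """Build a WMS 1.1.1 GetMap request URL.

    bbox is (min_lon, min_lat, max_lon, max_lat) in EPSG:4326.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",                 # empty value requests the default style
        "SRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)
```

For example, `wms_getmap_url("https://example.org/geoserver/wms", "climate:annual_mean_temp", (60, 50, 120, 70))` yields a URL that a thin client such as OpenLayers, or any HTTP client, could request to obtain a rendered map image. WMS 1.3.0 uses `CRS` instead of `SRS` and a different axis order for EPSG:4326, so the version parameter matters.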
An empirical understanding of triple collocation evaluation measure

NASA Astrophysics Data System (ADS)

Scipal, Klaus; Doubkova, Marcela; Hegyova, Alena; Dorigo, Wouter; Wagner, Wolfgang

2013-04-01

The triple collocation method is an advanced evaluation method that has been used in the soil moisture field for only about half a decade. The method requires three datasets with independent error structures that represent an identical phenomenon. The main advantages of the method are that it a) does not require a reference dataset that has to be considered to represent the truth, b) limits the effect of random and systematic errors of the other two datasets, and c) simultaneously assesses the error of all three datasets. The objective of this presentation is to assess the triple collocation error (Tc) of the ASAR Global Mode Surface Soil Moisture (GM SSM) 1 km dataset and to highlight problems of the method related to its ability to cancel the effect of errors in ancillary datasets. In particular, the goal is to a) investigate trends in Tc related to the change in spatial resolution from 5 to 25 km, b) investigate trends in Tc related to the choice of a hydrological model, and c) study the relationship between Tc and other absolute evaluation methods (namely RMSE and Error Propagation, EP). The triple collocation method is implemented using ASAR GM, AMSR-E, and a model (either AWRA-L, GLDAS-NOAH, or ERA-Interim). First, the significance of the relationship between the three soil moisture datasets, which is a prerequisite for the triple collocation method, was tested. Second, the trends in Tc related to the choice of the third reference dataset and scale were assessed. For this purpose the triple collocation is repeated, replacing AWRA-L with two different globally available model reanalysis datasets operating at different spatial resolutions (ERA-Interim and GLDAS-NOAH). Finally, the retrieved results were compared to the results of the RMSE and EP evaluation measures. Our results demonstrate that the Tc method does not eliminate the random and time-variant systematic errors of the second and the third dataset used in the Tc. The possible reasons include the fact that a) the Tc method could not fully function with datasets acting at very different spatial resolutions, or b) the errors were not fully independent as initially assumed.
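The requirement of three datasets with independent errors can be made concrete with the commonly used covariance formulation of triple collocation. If each dataset is a linear function of the same true signal plus zero-mean noise, and the three noise terms are uncorrelated with each other and with the signal, then the error variance of x is Cov(x,x) - Cov(x,y) Cov(x,z) / Cov(y,z), and analogously for y and z. The sketch below assumes that formulation; the abstract does not state which variant the authors used, and the function name is hypothetical.

```python
import numpy as np


def triple_collocation_error_variances(x, y, z):
    """Estimate the error variances of three collocated series.

    Assumes each series is linearly related to the same truth and that the
    three error terms are mutually uncorrelated and uncorrelated with the truth.
    """
    q = np.cov(np.vstack([x, y, z]))  # 3 x 3 sample covariance matrix
    var_x = q[0, 0] - q[0, 1] * q[0, 2] / q[1, 2]
    var_y = q[1, 1] - q[0, 1] * q[1, 2] / q[0, 2]
    var_z = q[2, 2] - q[0, 2] * q[1, 2] / q[0, 1]
    return var_x, var_y, var_z
```

If the errors of two datasets are correlated, for instance because both share a forcing or a retrieval assumption, the cross-covariance terms no longer isolate the common signal and the estimates become biased, which is consistent with reason b) given above.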
Assembling Large, Multi-Sensor Climate Datasets Using the SciFlo Grid Workflow System

NASA Astrophysics Data System (ADS)

Wilson, B. D.; Manipon, G.; Xing, Z.; Fetzer, E.

2008-12-01

NASA's Earth Observing System (EOS) is the world's most ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the A-Train platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over periods of years to decades. However, moving from predominantly single-instrument studies to a multi-sensor, measurement-based model for long-duration analysis of important climate variables presents serious challenges for large-scale data mining and data fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another instrument (MODIS), and to a model (ECMWF), stratify the comparisons using a classification of the cloud scenes from CloudSat, and repeat the entire analysis over years of AIRS data. To perform such an analysis, one must discover & access multiple datasets from remote sites, find the space/time matchups between instrument swaths and model grids, understand the quality flags and uncertainties for retrieved physical variables, and assemble merged datasets for further scientific and statistical analysis. To meet these large-scale challenges, we are utilizing a Grid computing and dataflow framework, named SciFlo, in which we are deploying a set of versatile and reusable operators for data query, access, subsetting, co-registration, mining, fusion, and advanced statistical analysis. SciFlo is a semantically-enabled ("smart") Grid Workflow system that ties together a peer-to-peer network of computers into an efficient engine for distributed computation. The SciFlo workflow engine enables scientists to do multi-instrument Earth Science by assembling remotely-invokable Web Services (SOAP or http GET URLs), native executables, command-line scripts, and Python codes into a distributed computing flow.
A scientist visually authors the graph of operation in the VizFlow GUI, or uses a text editor to modify the simple XML workflow documents. The SciFlo client & server engines optimize the execution of such distributed workflows and allow the user to transparently find and use datasets and operators without worrying about the actual location of the Grid resources. The engine transparently moves data to the operators, and moves operators to the data (on the dozen trusted SciFlo nodes). SciFlo also deploys a variety of Data Grid services to: query datasets in space and time, locate & retrieve on-line data granules, provide on-the-fly variable and spatial subsetting, and perform pairwise instrument matchups for A-Train datasets. These services are combined into efficient workflows to assemble the desired large-scale, merged climate datasets. SciFlo is currently being applied in several large climate studies: comparisons of aerosol optical depth between MODIS, MISR, AERONET ground network, and U. Michigan's IMPACT aerosol transport model; characterization of long-term biases in microwave and infrared instruments (AIRS, MLS) by comparisons to GPS temperature retrievals accurate to 0.1 degrees Kelvin; and construction of a decade-long, multi-sensor water vapor climatology stratified by classified cloud scene by bringing together datasets from AIRS/AMSU, AMSR-E, MLS, MODIS, and CloudSat (NASA MEASUREs grant, Fetzer PI). The presentation will discuss the SciFlo technologies, their application in these distributed workflows, and the many challenges encountered in assembling and analyzing these massive datasets.
Assembling Large, Multi-Sensor Climate Datasets Using the SciFlo Grid Workflow System

NASA Astrophysics Data System (ADS)

Wilson, B.; Manipon, G.; Xing, Z.; Fetzer, E.

2009-04-01

NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the "A-Train" platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over periods of years to decades. However, moving from predominantly single-instrument studies to a multi-sensor, measurement-based model for long-duration analysis of important climate variables presents serious challenges for large-scale data mining and data fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another instrument (MODIS), and to a model (ECMWF), stratify the comparisons using a classification of the "cloud scenes" from CloudSat, and repeat the entire analysis over years of AIRS data. To perform such an analysis, one must discover & access multiple datasets from remote sites, find the space/time "matchups" between instrument swaths and model grids, understand the quality flags and uncertainties for retrieved physical variables, assemble merged datasets, and compute fused products for further scientific and statistical analysis. To meet these large-scale challenges, we are utilizing a Grid computing and dataflow framework, named SciFlo, in which we are deploying a set of versatile and reusable operators for data query, access, subsetting, co-registration, mining, fusion, and advanced statistical analysis. SciFlo is a semantically-enabled ("smart") Grid Workflow system that ties together a peer-to-peer network of computers into an efficient engine for distributed computation. The SciFlo workflow engine enables scientists to do multi-instrument Earth Science by assembling remotely-invokable Web Services (SOAP or http GET URLs), native executables, command-line scripts, and Python codes into a distributed computing flow.
A scientist visually authors the graph of operation in the VizFlow GUI, or uses a text editor to modify the simple XML workflow documents. The SciFlo client & server engines optimize the execution of such distributed workflows and allow the user to transparently find and use datasets and operators without worrying about the actual location of the Grid resources. The engine transparently moves data to the operators, and moves operators to the data (on the dozen trusted SciFlo nodes). SciFlo also deploys a variety of Data Grid services to: query datasets in space and time, locate & retrieve on-line data granules, provide on-the-fly variable and spatial subsetting, perform pairwise instrument matchups for A-Train datasets, and compute fused products. These services are combined into efficient workflows to assemble the desired large-scale, merged climate datasets. SciFlo is currently being applied in several large climate studies: comparisons of aerosol optical depth between MODIS, MISR, AERONET ground network, and U. Michigan's IMPACT aerosol transport model; characterization of long-term biases in microwave and infrared instruments (AIRS, MLS) by comparisons to GPS temperature retrievals accurate to 0.1 degrees Kelvin; and construction of a decade-long, multi-sensor water vapor climatology stratified by classified cloud scene by bringing together datasets from AIRS/AMSU, AMSR-E, MLS, MODIS, and CloudSat (NASA MEASUREs grant, Fetzer PI). The presentation will discuss the SciFlo technologies, their application in these distributed workflows, and the many challenges encountered in assembling and analyzing these massive datasets.
Integrative Spatial Data Analytics for Public Health Studies of New York State

PubMed Central

Chen, Xin; Wang, Fusheng

2016-01-01

Increased accessibility of health data made available by the government provides a unique opportunity for spatial analytics at much higher resolution to discover patterns of diseases and their correlation with spatial impact indicators. This paper demonstrates our vision of integrative spatial analytics for public health by linking the New York Cancer Mapping Dataset with datasets containing potential spatial impact indicators. We performed spatially based discovery of disease patterns and variations across New York State, and identified potential correlations between diseases and demographic, socio-economic and environmental indicators. Our methods were validated by three correlation studies: the correlation between stomach cancer and Asian race, the correlation between breast cancer and highly educated populations, and the correlation between lung cancer and air toxics. Our work will allow public health researchers, government officials or other practitioners to adequately identify, analyze, and monitor health problems at the community or neighborhood level for New York State. PMID:28269834
A Self-Organizing Spatial Clustering Approach to Support Large-Scale Network RTK Systems.

PubMed

Shen, Lili; Guo, Jiming; Wang, Lei

2018-06-06

The network real-time kinematic (RTK) technique can provide centimeter-level real time positioning solutions and play a key role in geo-spatial infrastructure. With ever-increasing popularity, network RTK systems will face issues in the support of large numbers of concurrent users. In the past, high-precision positioning services were oriented towards professionals and only supported a few concurrent users. Currently, precise positioning provides a spatial foundation for artificial intelligence (AI), and countless smart devices (autonomous cars, unmanned aerial-vehicles (UAVs), robotic equipment, etc.) require precise positioning services. Therefore, the development of approaches to support large-scale network RTK systems is urgent. In this study, we proposed a self-organizing spatial clustering (SOSC) approach which automatically clusters online users to reduce the computational load on the network RTK system server side. The experimental results indicate that both the SOSC algorithm and the grid algorithm can reduce the computational load efficiently, while the SOSC algorithm gives a more elastic and adaptive clustering solution with different datasets. The SOSC algorithm determines the cluster number and the mean distance to cluster center (MDTCC) according to the data set, while the grid approaches are all predefined. The side-effects of clustering algorithms on the user side are analyzed with real global navigation satellite system (GNSS) data sets. The experimental results indicate that 10 km can be safely used as the cluster radius threshold for the SOSC algorithm without significantly reducing the positioning precision and reliability on the user side.
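As a rough illustration of the fixed-grid baseline mentioned in the abstract above (not the SOSC algorithm itself, whose details the abstract does not give), the sketch below bins users into predefined square cells and computes the mean distance to cluster center (MDTCC) for one cell. The planar kilometre coordinates, the 10 km cell size, and the function names `grid_clusters` and `mdtcc` are assumptions made for illustration only.

```python
from collections import defaultdict
from math import hypot


def grid_clusters(users, cell_km=10.0):
    """Group (x, y) user positions, given in km, into fixed square grid cells."""
    cells = defaultdict(list)
    for x, y in users:
        # The cell index is fixed in advance by the grid spacing,
        # independent of how the users are actually distributed.
        key = (int(x // cell_km), int(y // cell_km))
        cells[key].append((x, y))
    return dict(cells)


def mdtcc(members):
    """Mean distance (km) from each member to the centroid of its cluster."""
    cx = sum(x for x, _ in members) / len(members)
    cy = sum(y for _, y in members) / len(members)
    return sum(hypot(x - cx, y - cy) for x, y in members) / len(members)


# Hypothetical user positions in a local planar frame (km).
users = [(1.0, 1.0), (3.0, 1.0), (25.0, 25.0)]
clusters = grid_clusters(users)
```

Because the cell boundaries are predefined, two nearby users on opposite sides of a boundary land in different clusters, which is the kind of rigidity an adaptive method is meant to avoid.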
CheS-Mapper - Chemical Space Mapping and Visualization in 3D.

PubMed

Gütlein, Martin; Karwath, Andreas; Kramer, Stefan

2012-03-17

Analyzing chemical datasets is a challenging task for scientific researchers in the field of chemoinformatics. It is important, yet difficult to understand the relationship between the structure of chemical compounds, their physico-chemical properties, and biological or toxic effects. In that respect, visualization tools can help to better comprehend the underlying correlations. Our recently developed 3D molecular viewer CheS-Mapper (Chemical Space Mapper) divides large datasets into clusters of similar compounds and consequently arranges them in 3D space, such that their spatial proximity reflects their similarity. The user can indirectly determine similarity by selecting which features to employ in the process. The tool can use and calculate different kinds of features, such as structural fragments and quantitative chemical descriptors. These features can be highlighted within CheS-Mapper, which helps the chemist to better understand patterns and regularities and relate the observations to established scientific knowledge. As a final function, the tool can also be used to select and export specific subsets of a given dataset for further analysis.
CheS-Mapper - Chemical Space Mapping and Visualization in 3D

PubMed Central

2012-01-01

Analyzing chemical datasets is a challenging task for scientific researchers in the field of chemoinformatics. It is important, yet difficult to understand the relationship between the structure of chemical compounds, their physico-chemical properties, and biological or toxic effects. In that respect, visualization tools can help to better comprehend the underlying correlations. Our recently developed 3D molecular viewer CheS-Mapper (Chemical Space Mapper) divides large datasets into clusters of similar compounds and consequently arranges them in 3D space, such that their spatial proximity reflects their similarity. The user can indirectly determine similarity by selecting which features to employ in the process. The tool can use and calculate different kinds of features, such as structural fragments and quantitative chemical descriptors. These features can be highlighted within CheS-Mapper, which helps the chemist to better understand patterns and regularities and relate the observations to established scientific knowledge. As a final function, the tool can also be used to select and export specific subsets of a given dataset for further analysis. PMID:22424447
MEMHDX: an interactive tool to expedite the statistical validation and visualization of large HDX-MS datasets.

PubMed

Hourdel, Véronique; Volant, Stevenn; O'Brien, Darragh P; Chenal, Alexandre; Chamot-Rooke, Julia; Dillies, Marie-Agnès; Brier, Sébastien

2016-11-15

With the continued improvement of requisite mass spectrometers and UHPLC systems, Hydrogen/Deuterium eXchange Mass Spectrometry (HDX-MS) workflows are rapidly evolving towards the investigation of more challenging biological systems, including large protein complexes and membrane proteins. The analysis of such extensive systems results in very large HDX-MS datasets for which specific analysis tools are required to speed up data validation and interpretation. We introduce a web application and a new R-package named 'MEMHDX' to help users analyze, validate and visualize large HDX-MS datasets. MEMHDX is composed of two elements. A statistical tool aids in the validation of the results by applying a mixed-effects model for each peptide, in each experimental condition, and at each time point, taking into account the time dependency of the HDX reaction and number of independent replicates. Two adjusted P-values are generated per peptide, one for the 'Change in dynamics' and one for the 'Magnitude of ΔD', and are used to classify the data by means of a 'Logit' representation. A user-friendly interface developed with Shiny by RStudio facilitates the use of the package. This interactive tool allows the user to easily and rapidly validate, visualize and compare the relative deuterium incorporation on the amino acid sequence and 3D structure, providing both spatial and temporal information. MEMHDX is freely available as a web tool at the project home page http://memhdx.c3bi.pasteur.fr Contact: marie-agnes.dillies@pasteur.fr or sebastien.brier@pasteur.fr. Supplementary information: Supplementary data is available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
How do the methodological choices of your climate change study affect your results? A hydrologic case study across the Pacific Northwest

NASA Astrophysics Data System (ADS)

Chegwidden, O.; Nijssen, B.; Rupp, D. E.; Kao, S. C.; Clark, M. P.

2017-12-01

We describe results from a large hydrologic climate change dataset developed across the Pacific Northwestern United States and discuss how the analysis of those results can be seen as a framework for other large hydrologic ensemble investigations. This investigation will better inform future modeling efforts and large ensemble analyses across domains within and beyond the Pacific Northwest. Using outputs from the Coupled Model Intercomparison Project Phase 5 (CMIP5), we provide projections of hydrologic change for the domain through the end of the 21st century. The dataset is based upon permutations of four methodological choices: (1) ten global climate models, (2) two representative concentration pathways, (3) three meteorological downscaling methods, and (4) four unique hydrologic model set-ups (three of which entail the same hydrologic model using independently calibrated parameter sets). All simulations were conducted across the Columbia River Basin and Pacific coastal drainages at a 1/16th degree (~6 km) resolution and at a daily timestep. In total, the 172 distinct simulations offer an updated, comprehensive view of climate change projections through the end of the 21st century. The results consist of routed streamflow at 400 sites throughout the domain as well as distributed spatial fields of relevant hydrologic variables like snow water equivalent and soil moisture. In this presentation, we discuss the level of agreement with previous hydrologic projections for the study area and how these projections differ with specific methodological choices. By controlling for some methodological choices we can show how each choice affects key climatic change metrics. We discuss how the spread in results varies across hydroclimatic regimes. We will use this large dataset as a case study for distilling a wide range of hydroclimatological projections into useful climate change assessments.
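To make the ensemble design in the abstract above concrete, the following minimal sketch enumerates the full factorial of the four methodological choices. Only the counts (ten, two, three, four) come from the abstract; every label below is a placeholder, not a name of an actual model, pathway, or method.

```python
from itertools import product

# Counts are taken from the abstract; the labels are hypothetical placeholders.
gcms = [f"gcm_{i}" for i in range(1, 11)]            # ten global climate models
rcps = ["rcp_a", "rcp_b"]                            # two concentration pathways
downscaling = ["ds_1", "ds_2", "ds_3"]               # three downscaling methods
hydro_setups = ["hyd_1", "hyd_2", "hyd_3", "hyd_4"]  # four hydrologic set-ups

# Every combination of one option per methodological choice.
full_factorial = list(product(gcms, rcps, downscaling, hydro_setups))
n_full = len(full_factorial)  # 10 * 2 * 3 * 4
```

The full factorial has 240 members, whereas the abstract reports 172 distinct simulations, which suggests that not every combination was run. The abstract does not say which combinations were omitted.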
High spatial resolution mapping of folds and fractures using Unmanned Aerial Vehicle (UAV) photogrammetry

NASA Astrophysics Data System (ADS)

Cruden, A. R.; Vollgger, S.

2016-12-01

The emerging capability of UAV photogrammetry combines a simple and cost-effective method to acquire digital aerial images with advanced computer vision algorithms that compute spatial datasets from a sequence of overlapping digital photographs from various viewpoints. Depending on flight altitude and camera setup, sub-centimeter spatial resolution orthophotographs and textured dense point clouds can be achieved. Orientation data can be collected for detailed structural analysis by digitally mapping such high-resolution spatial datasets in a fraction of the time and with higher fidelity compared to traditional mapping techniques. Here we describe a photogrammetric workflow applied to a structural study of folds and fractures within alternating layers of sandstone and mudstone at a coastal outcrop in SE Australia. We surveyed this location using a downward-looking digital camera mounted on a commercially available multi-rotor UAV that autonomously followed waypoints at a set altitude and speed to ensure sufficient image overlap, minimum motion blur and an appropriate resolution. The use of surveyed ground control points allowed us to produce a geo-referenced 3D point cloud and an orthophotograph from hundreds of digital images at a spatial resolution < 10 mm per pixel, and cm-scale location accuracy. Orientation data of brittle and ductile structures were semi-automatically extracted from these high-resolution datasets using open-source software. This resulted in an extensive and statistically relevant orientation dataset that was used to 1) interpret the progressive development of folds and faults in the region, and 2) generate a 3D structural model that underlines the complex internal structure of the outcrop and quantifies spatial variations in fold geometries. Overall, our work highlights how UAV photogrammetry can contribute to new insights in structural analysis.
Providing Geographic Datasets as Linked Data in SDI

NASA Astrophysics Data System (ADS)

Hietanen, E.; Lehto, L.; Latvala, P.

2016-06-01

In this study, a prototype service to provide data from a Web Feature Service (WFS) as linked data is implemented. First, persistent and unique Uniform Resource Identifiers (URIs) are created for all spatial objects in the dataset. The objects are available from those URIs in the Resource Description Framework (RDF) data format. Next, a Web Ontology Language (OWL) ontology is created to describe the dataset information content using the Open Geospatial Consortium's (OGC) GeoSPARQL vocabulary. The existing data model is modified in order to take into account the linked data principles. The implemented service produces an HTTP response dynamically. The data for the response is first fetched from the existing WFS. Then the Geography Markup Language (GML) format output of the WFS is transformed on-the-fly to the RDF format. Content negotiation is used to serve the data in different RDF serialization formats. This solution facilitates the use of a dataset in different applications without replicating the whole dataset. In addition, individual spatial objects in the dataset can be referred to by URIs. Furthermore, the needed information content of the objects can be easily extracted from the RDF serializations available from those URIs. A solution for linking data objects to the dataset URI is also introduced by using the Vocabulary of Interlinked Datasets (VoID). The dataset is divided into subsets, and each subset is given its own persistent and unique URI. This enables the whole dataset to be explored with a web browser and all individual objects to be indexed by search engines.
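The content negotiation step described in the abstract above can be sketched as a small function that picks an RDF serialization from an HTTP `Accept` header. This is a simplified illustration, not the prototype's implementation: the abstract does not say which serializations the service offers, so the `SUPPORTED` set, the default, and the function name `negotiate` are assumptions, and wildcard media ranges are ignored for brevity.

```python
# Media types assumed to be offered; the abstract does not list the actual ones.
SUPPORTED = ("text/turtle", "application/rdf+xml", "application/ld+json")


def negotiate(accept_header, default="text/turtle"):
    """Return the supported media type with the highest q-value, else the default."""
    best, best_q = None, 0.0
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media = fields[0].lower()
        q = 1.0  # a media range without an explicit q parameter has weight 1
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        if media in SUPPORTED and q > best_q:
            best, best_q = media, q
    return best or default


chosen = negotiate("application/rdf+xml;q=0.8, text/turtle")
```

A service built this way would then serialize the RDF converted from the WFS output in the chosen format and report it in the `Content-Type` response header.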
Modeling Spatial Dependencies and Semantic Concepts in Data Mining

DOE Office of Scientific and Technical Information (OSTI.GOV)

Vatsavai, Raju

Data mining is the process of discovering new patterns and relationships in large datasets. However, several studies have shown that general data mining techniques often fail to extract meaningful patterns and relationships from spatial data owing to the violation of fundamental geospatial principles. In this tutorial, we introduce basic principles behind explicit modeling of spatial and semantic concepts in data mining. In particular, we focus on modeling these concepts in the widely used classification, clustering, and prediction algorithms. Classification is the process of learning a structure or model (from user-given inputs) and applying the known model to the new data. Clustering is the process of discovering groups and structures in the data that are "similar," without applying any known structures in the data. Prediction is the process of finding a function that models (explains) the data with least error. One common assumption among all these methods is that the data is independent and identically distributed. Such assumptions do not hold well in spatial data, where spatial dependency and spatial heterogeneity are the norm. In addition, spatial semantics are often ignored by the data mining algorithms. In this tutorial we cover recent advances in explicitly modeling spatial dependencies and semantic concepts in data mining.
Climate Change and Hydrological Extreme Events - Risks and Perspectives for Water Management in Bavaria and Québec

NASA Astrophysics Data System (ADS)

Ludwig, R.

2017-12-01

There is as yet no confirmed knowledge whether and how climate change contributes to the magnitude and frequency of hydrological extreme events and how regional water management could adapt to the corresponding risks. The ClimEx project (2015-2019) investigates the effects of climate change on meteorological and hydrological extreme events and their implications for water management in Bavaria and Québec. High Performance Computing is employed to enable the complex simulations in a hydro-climatological model processing chain, resulting in a unique high-resolution and transient (1950-2100) dataset of climatological and meteorological forcing and hydrological response: (1) The climate module has developed a large ensemble of high-resolution data (12 km) of the CRCM5 RCM for Central Europe and North-Eastern North America, downscaled from 50 members of the CanESM2 GCM. The dataset is complemented by all available data from the Euro-CORDEX project to account for the assessment of both natural climate variability and climate change. The large ensemble with several thousand model years provides the potential to catch rare extreme events and thus improves the process understanding of extreme events with return periods of 1000+ years. (2) The hydrology module comprises process-based and spatially explicit model setups (e.g. WaSiM) for all major catchments in Bavaria and Southern Québec in high temporal (3 h) and spatial (500 m) resolution. The simulations form the basis for in-depth analysis of hydrological extreme events based on the inputs from the large climate model dataset. The specific data situation makes it possible to establish a new method for 'virtual perfect prediction', which assesses climate change impacts on flood risk and water resources management by identifying patterns in the data which reveal preferential triggers of hydrological extreme events.
The presentation will highlight first results from the analysis of the large scale ClimEx model ensemble, showing the current and future ratio of natural variability and climate change impacts on meteorological extreme events. Selected data from the ensemble is used to drive a hydrological model experiment to illustrate the capacity to better determine the recurrence periods of hydrological extreme events under conditions of climate change.
Spatializing 6,000 years of global urbanization from 3700 BC to AD 2000

NASA Astrophysics Data System (ADS)

Reba, Meredith; Reitsma, Femke; Seto, Karen C.

2016-06-01

How were cities distributed globally in the past? How many people lived in these cities? How did cities influence their local and regional environments? In order to understand the current era of urbanization, we must understand long-term historical urbanization trends and patterns. However, to date there is no comprehensive record of spatially explicit, historic, city-level population data at the global scale. Here, we developed the first spatially explicit dataset of urban settlements from 3700 BC to AD 2000, by digitizing, transcribing, and geocoding historical, archaeological, and census-based urban population data previously published in tabular form by Chandler and Modelski. The dataset creation process also required data cleaning and harmonization procedures to make the data internally consistent. Additionally, we created a reliability ranking for each geocoded location to assess the geographic uncertainty of each data point. The dataset provides the first spatially explicit archive of the location and size of urban populations over the last 6,000 years and can contribute to an improved understanding of contemporary and historical urbanization trends.
Spatializing 6,000 years of global urbanization from 3700 BC to AD 2000

PubMed Central

Reba, Meredith; Reitsma, Femke; Seto, Karen C.

2016-01-01

How were cities distributed globally in the past? How many people lived in these cities? How did cities influence their local and regional environments? In order to understand the current era of urbanization, we must understand long-term historical urbanization trends and patterns. However, to date there is no comprehensive record of spatially explicit, historic, city-level population data at the global scale. Here, we developed the first spatially explicit dataset of urban settlements from 3700 BC to AD 2000, by digitizing, transcribing, and geocoding historical, archaeological, and census-based urban population data previously published in tabular form by Chandler and Modelski. The dataset creation process also required data cleaning and harmonization procedures to make the data internally consistent. Additionally, we created a reliability ranking for each geocoded location to assess the geographic uncertainty of each data point. The dataset provides the first spatially explicit archive of the location and size of urban populations over the last 6,000 years and can contribute to an improved understanding of contemporary and historical urbanization trends. PMID:27271481
Opportunities for multivariate analysis of open spatial datasets to characterize urban flooding risks

NASA Astrophysics Data System (ADS)

Gaitan, S.; ten Veldhuis, J. A. E.

2015-06-01

Cities worldwide are challenged by increasing urban flood risks. Precise and realistic measures are required to reduce flooding impacts. However, currently implemented sewer and topographic models do not provide realistic predictions of local flooding occurrence during heavy rain events. Assessing other factors such as spatially distributed rainfall, socioeconomic characteristics, and social sensing may help to explain the probability and impacts of urban flooding. Several spatial datasets have recently been made available in the Netherlands, including rainfall-related incident reports made by citizens, spatially distributed rain depths, semidistributed socioeconomic information, and building age. The potential of these data to explain the occurrence of rainfall-related incidents has not yet been examined. Multivariate analysis tools for describing communities and environmental patterns have been previously developed and used in the field of ecology. The objective of this paper is to outline opportunities for these tools to explore urban flood risk patterns in the mentioned datasets. To that end, a cluster analysis is performed. Results indicate that incidence of rainfall-related impacts is higher in areas characterized by older infrastructure and higher population density.

Assessing the accuracy and stability of variable selection ...

EPA Pesticide Factsheets

Random forest (RF) modeling has emerged as an important statistical learning method in ecology due to its exceptional predictive performance. However, for large and complex ecological datasets there is limited guidance on variable selection methods for RF modeling. Typically, either a preselected set of predictor variables are used, or stepwise procedures are employed which iteratively add/remove variables according to their importance measures. This paper investigates the application of variable selection methods to RF models for predicting probable biological stream condition. Our motivating dataset consists of the good/poor condition of n=1365 stream survey sites from the 2008/2009 National Rivers and Stream Assessment, and a large set (p=212) of landscape features from the StreamCat dataset. Two types of RF models are compared: a full variable set model with all 212 predictors, and a reduced variable set model selected using a backwards elimination approach. We assess model accuracy using RF's internal out-of-bag estimate, and a cross-validation procedure with validation folds external to the variable selection process. We also assess the stability of the spatial predictions generated by the RF models to changes in the number of predictors, and argue that model selection needs to consider both accuracy and stability. The results suggest that RF modeling is robust to the inclusion of many variables of moderate to low importance. We found no substanti
Influence of reanalysis datasets on dynamically downscaling the recent past

NASA Astrophysics Data System (ADS)

Moalafhi, Ditiro B.; Evans, Jason P.; Sharma, Ashish

2017-08-01

Multiple reanalysis datasets currently exist that can provide boundary conditions for dynamic downscaling and simulating local hydro-climatic processes at finer spatial and temporal resolutions. Previous work has suggested that there are two reanalysis alternatives that provide the best lateral boundary conditions for downscaling over southern Africa. This study dynamically downscales these reanalyses (ERA-I and MERRA) over southern Africa to a high resolution (10 km) grid using the WRF model. Simulations cover the period 1981-2010. Multiple observation datasets were used for both surface temperature and precipitation to account for observational uncertainty when assessing results. Generally, temperature is simulated quite well, except over the Namibian coastal plain where the simulations show anomalous warm temperature related to the failure to propagate the influence of the cold Benguela current inland. Precipitation tends to be overestimated in high altitude areas, and most of southern Mozambique. This could be attributed to challenges in handling complex topography and capturing large-scale circulation patterns. While MERRA-driven WRF exhibits slightly less bias in temperature, especially for La Niña years, ERA-I-driven simulations are on average superior in terms of RMSE. When considering multiple variables and metrics, ERA-I is found to produce the best simulation of the climate over the domain. The influence of the regional model appears to be large enough to overcome the small difference in relative errors present in the lateral boundary conditions derived from these two reanalyses.
An approach for mapping large-area impervious surfaces: Synergistic use of Landsat-7 ETM+ and high spatial resolution imagery

USGS Publications Warehouse

Yang, Limin; Huang, Chengquan; Homer, Collin G.; Wylie, Bruce K.; Coan, Michael

2003-01-01

A wide range of urban ecosystem studies, including urban hydrology, urban climate, land use planning, and resource management, require current and accurate geospatial data of urban impervious surfaces. We developed an approach to quantify urban impervious surfaces as a continuous variable by using multisensor and multisource datasets. Subpixel percent impervious surfaces at 30-m resolution were mapped using a regression tree model. The utility, practicality, and affordability of the proposed method for large-area imperviousness mapping were tested over three spatial scales (Sioux Falls, South Dakota, Richmond, Virginia, and the Chesapeake Bay areas of the United States). Average error of predicted versus actual percent impervious surface ranged from 8.8 to 11.4%, with correlation coefficients from 0.82 to 0.91. The approach is being implemented to map impervious surfaces for the entire United States as one of the major components of the circa 2000 national land cover database.
Mapping populations at risk: improving spatial demographic data for infectious disease modeling and metric derivation

PubMed Central

2012-01-01

The use of Global Positioning Systems (GPS) and Geographical Information Systems (GIS) in disease surveys and reporting is becoming increasingly routine, enabling a better understanding of spatial epidemiology and the improvement of surveillance and control strategies. In turn, the greater availability of spatially referenced epidemiological data is driving the rapid expansion of disease mapping and spatial modeling methods, which are becoming increasingly detailed and sophisticated, with rigorous handling of uncertainties. This expansion has, however, not been matched by advancements in the development of spatial datasets of human population distribution that accompany disease maps or spatial models. Where risks are heterogeneous across population groups or space or dependent on transmission between individuals, spatial data on human population distributions and demographic structures are required to estimate infectious disease risks, burdens, and dynamics. The disease impact in terms of morbidity, mortality, and speed of spread varies substantially with demographic profiles, so that identifying the most exposed or affected populations becomes a key aspect of planning and targeting interventions. Subnational breakdowns of population counts by age and sex are routinely collected during national censuses and maintained in finer detail within microcensus data. Moreover, demographic and health surveys continue to collect representative and contemporary samples from clusters of communities in low-income countries where census data may be less detailed and not collected regularly. Together, these freely available datasets form a rich resource for quantifying and understanding the spatial variations in the sizes and distributions of those most at risk of disease in low income regions, yet at present, they remain unconnected data scattered across national statistical offices and websites. 
In this paper we discuss the deficiencies of existing spatial population datasets and their limitations on epidemiological analyses. We review sources of detailed, contemporary, freely available and relevant spatial demographic data focusing on low income regions where such data are often sparse and highlight the value of incorporating these through a set of examples of their application in disease studies. Moreover, the importance of acknowledging, measuring, and accounting for uncertainty in spatial demographic datasets is outlined. Finally, a strategy for building an open-access database of spatial demographic data that is tailored to epidemiological applications is put forward. PMID:22591595
NASA Cold Land Processes Experiment (CLPX 2002/03): Atmospheric analyses datasets

Treesearch

Glen E. Liston; Daniel L. Birkenheuer; Christopher A. Hiemstra; Donald W. Cline; Kelly Elder

2008-01-01

This paper describes the Local Analysis and Prediction System (LAPS) and the 20-km horizontal grid version of the Rapid Update Cycle (RUC20) atmospheric analyses datasets, which are available as part of the Cold Land Processes Field Experiment (CLPX) data archive. The LAPS dataset contains spatially and temporally continuous atmospheric and surface variables over...
Evaluation of sliding baseline methods for spatial estimation for cluster detection in the biosurveillance system

PubMed Central

Xing, Jian; Burkom, Howard; Moniz, Linda; Edgerton, James; Leuze, Michael; Tokars, Jerome

2009-01-01

Background: The Centers for Disease Control and Prevention's (CDC's) BioSense system provides near-real-time situational awareness for public health monitoring through analysis of electronic health data. Determination of anomalous spatial and temporal disease clusters is a crucial part of the daily disease monitoring task. Our study focused on finding useful anomalies at manageable alert rates according to available BioSense data history. Methods: The study dataset included more than 3 years of daily counts of military outpatient clinic visits for respiratory and rash syndrome groupings. We applied four spatial estimation methods in implementations of space-time scan statistics cross-checked in Matlab and C. We compared the utility of these methods according to the resultant background cluster rate (a false alarm surrogate) and sensitivity to injected cluster signals. The comparison runs used a spatial resolution based on the facility zip code in the patient record and a finer resolution based on the residence zip code. Results: Simple estimation methods that account for day-of-week (DOW) data patterns yielded a clear advantage both in background cluster rate and in signal sensitivity. A 28-day baseline gave the most robust results for this estimation; the preferred baseline is long enough to remove daily fluctuations but short enough to reflect recent disease trends and data representation. Background cluster rates were lower for the rash syndrome counts than for the respiratory counts, likely because of seasonality and the large scale of the respiratory counts. Conclusion: The spatial estimation method should be chosen according to characteristics of the selected data streams. In this dataset with strong day-of-week effects, the overall best detection performance was achieved using subregion averages over a 28-day baseline stratified by weekday or weekend/holiday behavior.
Changing the estimation method for particular scenarios involving different spatial resolutions or other syndromes can yield further improvement. PMID:19615075
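The stratified baseline named in the conclusion above can be illustrated with a minimal sketch. It is not the BioSense implementation: the function names, the dictionary-of-daily-counts input, and the optional `holidays` set are assumptions made for illustration. The expected count for a target day is taken as the mean of the counts from the preceding 28 days that share the target's day type (weekday versus weekend/holiday).

```python
from datetime import date, timedelta
from statistics import mean


def is_off_day(d: date, holidays=frozenset()) -> bool:
    """True for Saturdays, Sundays, and any date listed in `holidays` (hypothetical input)."""
    return d.weekday() >= 5 or d in holidays


def stratified_baseline(counts, target: date, window: int = 28, holidays=frozenset()):
    """Mean count over the `window` days before `target`, using only days of the
    same type (weekday vs. weekend/holiday) as `target`.

    `counts` maps a date to the daily visit count for one subregion.
    Days missing from `counts` are skipped.
    """
    target_type = is_off_day(target, holidays)
    same_type = [
        counts[d]
        for d in (target - timedelta(days=k) for k in range(1, window + 1))
        if d in counts and is_off_day(d, holidays) == target_type
    ]
    if not same_type:
        raise ValueError("no baseline days of the same type inside the window")
    return mean(same_type)
```

A detection method would then compare the observed count for the target day with this expected value; that comparison step is outside the scope of the sketch.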
The impact of the resolution of meteorological datasets on catchment-scale drought studies

NASA Astrophysics Data System (ADS)

Hellwig, Jost; Stahl, Kerstin

2017-04-01

Gridded meteorological datasets provide the basis to study drought at a range of scales, including catchment-scale drought studies in hydrology. They are readily available to study past weather conditions and often serve real-time monitoring as well. As these datasets differ in spatial/temporal coverage and spatial/temporal resolution, for most studies there is a tradeoff between these features. Our investigation examines whether biases occur when studying drought at catchment scale with low-resolution input data. For that, a comparison among the datasets HYRAS (covering Central Europe, 1 × 1 km grid, daily data, 1951-2005), E-OBS (Europe, 0.25° grid, daily data, 1950-2015) and GPCC (whole world, 0.5° grid, monthly data, 1901-2013) is carried out. Generally, biases in precipitation increase with decreasing resolution. The most important variations are found during summer. In the low mountain ranges of Central Europe, the coarse-resolution datasets (E-OBS, GPCC) overestimate dry days and underestimate total precipitation since they are not able to describe high spatial variability. However, relative measures like the correlation coefficient reveal good consistency of dry and wet periods, both for absolute precipitation values and standardized indices like the Standardized Precipitation Index (SPI) or Standardized Precipitation Evapotranspiration Index (SPEI). In particular, the most severe droughts derived from the different datasets match very well. These results indicate that absolute values of coarse-resolution datasets may be problematic to use for an assessment of hydrological drought at catchment scale, whereas relative measures for determining periods of drought are more trustworthy. Therefore, studies on drought that downscale meteorological data should carefully consider their data needs and focus on relative measures for dry periods if sufficient for the task.
Small-Area Estimation of Spatial Access to Care and Its Implications for Policy.

PubMed

Gentili, Monica; Isett, Kim; Serban, Nicoleta; Swann, Julie

2015-10-01

Local or small-area estimates to capture emerging trends across large geographic regions are critical in identifying and addressing community-level health interventions. However, they are often unavailable due to lack of analytic capabilities in compiling and integrating extensive datasets and complementing them with the knowledge about variations in state-level health policies. This study introduces a modeling approach for small-area estimation of spatial access to pediatric primary care that is data "rich" and mathematically rigorous, integrating data and health policy in a systematic way. We illustrate the sensitivity of the model to policy decision making across large geographic regions by performing a systematic comparison of the estimates at the census tract and county levels for Georgia and California. Our results show the proposed approach is able to overcome limitations of other existing models by capturing patient and provider preferences and by incorporating possible changes in health policies. The primary finding is systematic underestimation of spatial access, and inaccurate estimates of disparities across population and across geography at the county level with respect to those at the census tract level with implications on where to focus and which type of interventions to consider.
Flow over bedforms in a large sand-bed river: A field investigation

USGS Publications Warehouse

Holmes, Robert R.; Garcia, Marcelo H.

2008-01-01

An experimental field study of flows over bedforms was conducted on the Missouri River near St. Charles, Missouri. Detailed velocity data were collected under two different flow conditions along bedforms in this sand-bed river. The large river-scale data reflect flow characteristics similar to those of laboratory-scale flows, with flow separation occurring downstream of the bedform crest and flow reattachment on the stoss side of the next downstream bedform. Wave-like responses of the flow to the bedforms were detected, with the velocity decreasing throughout the flow depth over bedform troughs, and the velocity increasing over bedform crests. Local and spatially averaged velocity distributions were logarithmic for both datasets. The reach-wise spatially averaged vertical-velocity profile from the standard velocity-defect model was evaluated. The vertically averaged mean flow velocities for the velocity-defect model were within 5% of the measured values and estimated spatially averaged point velocities were within 10% for the upper 90% of the flow depth. The velocity-defect model, neglecting the wake function, was evaluated and found to estimate the vertically averaged mean velocity within 1% of the measured values.
Delineation of marsh types and marsh-type change in coastal Louisiana for 2007 and 2013

USGS Publications Warehouse

Hartley, Stephen B.; Couvillion, Brady R.; Enwright, Nicholas M.

2017-05-30

The Bureau of Ocean Energy Management researchers often require detailed information regarding emergent marsh vegetation types (such as fresh, intermediate, brackish, and saline) for modeling habitat capacities and mitigation. In response, the U.S. Geological Survey in cooperation with the Bureau of Ocean Energy Management produced a detailed change classification of emergent marsh vegetation types in coastal Louisiana from 2007 and 2013. This study incorporates two existing vegetation surveys and independent variables such as Landsat Thematic Mapper multispectral satellite imagery, high-resolution airborne imagery from 2007 and 2013, bare-earth digital elevation models based on airborne light detection and ranging, alternative contemporary land-cover classifications, and other spatially explicit variables. An image classification based on image objects was created from 2007 and 2013 National Agriculture Imagery Program color-infrared aerial photography. The final products consisted of two 10-meter raster datasets. Each image object from the 2007 and 2013 spatial datasets was assigned a vegetation classification by using a simple majority filter. In addition to those spatial datasets, we also conducted a change analysis between the datasets to produce a 10-meter change raster product. This analysis identified how much change has taken place and where change has occurred. The spatial data products show dynamic areas where marsh loss is occurring or where marsh type is changing. This information can be used to assist and advance conservation efforts for priority natural resources.
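The "simple majority filter" step mentioned above can be sketched as follows. This is an illustrative reading of the abstract, not the USGS workflow: the function name, the `(object_id, class_label)` input format, and the tie-breaking rule (the class seen first wins) are all assumptions.

```python
from collections import Counter, defaultdict


def label_objects_by_majority(pixels):
    """Assign each image object the most frequent class among its pixels.

    `pixels` is an iterable of (object_id, class_label) pairs, one per pixel.
    Ties are resolved in favor of the class encountered first for that object
    (an arbitrary choice made for this sketch).
    """
    votes = defaultdict(Counter)
    for object_id, class_label in pixels:
        votes[object_id][class_label] += 1
    # most_common(1) returns the single (label, count) pair with the highest count
    return {obj: counter.most_common(1)[0][0] for obj, counter in votes.items()}
```

For example, an object whose pixels are classified twice as fresh and once as saline would be labeled fresh.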
Topic modeling for cluster analysis of large biological and medical datasets

PubMed Central

2014-01-01

Background: The big data moniker is nowhere better deserved than to describe the ever-increasing prodigiousness and complexity of biological and medical datasets. New methods are needed to generate and test hypotheses, foster biological interpretation, and build validated predictors. Although multivariate techniques such as cluster analysis may allow researchers to identify groups, or clusters, of related variables, the accuracy and effectiveness of traditional clustering methods diminish for large and hyperdimensional datasets. Topic modeling is an active research field in machine learning and has been mainly used as an analytical tool to structure large textual corpora for data mining. Its ability to reduce high dimensionality to a small number of latent variables makes it suitable as a means for clustering or overcoming clustering difficulties in large biological and medical datasets. Results: In this study, three topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, are proposed and tested on the cluster analysis of three large datasets: Salmonella pulsed-field gel electrophoresis (PFGE) dataset, lung cancer dataset, and breast cancer dataset, which represent various types of large biological or medical datasets. All three methods are shown to improve the efficacy/effectiveness of clustering results on the three datasets in comparison to traditional methods. A preferable cluster analysis method emerged for each of the three datasets on the basis of replicating known biological truths. Conclusion: Topic modeling could be advantageously applied to the large datasets of biological or medical research. The three proposed topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, yield clustering improvements for the three different data types.
Clusters more efficaciously represent truthful groupings and subgroupings in the data than traditional methods, suggesting that topic model-based methods could provide an analytic advancement in the analysis of large biological or medical datasets. PMID:25350106
Topic modeling for cluster analysis of large biological and medical datasets.

PubMed

Zhao, Weizhong; Zou, Wen; Chen, James J

2014-01-01

The big data moniker is nowhere better deserved than to describe the ever-increasing prodigiousness and complexity of biological and medical datasets. New methods are needed to generate and test hypotheses, foster biological interpretation, and build validated predictors. Although multivariate techniques such as cluster analysis may allow researchers to identify groups, or clusters, of related variables, the accuracy and effectiveness of traditional clustering methods diminish for large and hyperdimensional datasets. Topic modeling is an active research field in machine learning and has been mainly used as an analytical tool to structure large textual corpora for data mining. Its ability to reduce high dimensionality to a small number of latent variables makes it suitable as a means for clustering or overcoming clustering difficulties in large biological and medical datasets. In this study, three topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, are proposed and tested on the cluster analysis of three large datasets: Salmonella pulsed-field gel electrophoresis (PFGE) dataset, lung cancer dataset, and breast cancer dataset, which represent various types of large biological or medical datasets. All three methods are shown to improve the efficacy/effectiveness of clustering results on the three datasets in comparison to traditional methods. A preferable cluster analysis method emerged for each of the three datasets on the basis of replicating known biological truths. Topic modeling could be advantageously applied to the large datasets of biological or medical research. The three proposed topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, yield clustering improvements for the three different data types.
Clusters more efficaciously represent truthful groupings and subgroupings in the data than traditional methods, suggesting that topic model-based methods could provide an analytic advancement in the analysis of large biological or medical datasets.
Satellite-derived pan-Arctic melt onset dataset, 2000-2009

NASA Astrophysics Data System (ADS)

Wang, L.; Derksen, C.; Howell, S.; Wolken, G. J.; Sharp, M. J.; Markus, T.

2009-12-01

The SeaWinds Scatterometer on QuikSCAT (QS) has been in orbit for over a decade since its launch in June 1999. Due to its high sensitivity to the appearance of liquid water in snow and day/night all weather capability, QS data have been successfully used to detect melt onset and melt duration for various elements of the cryosphere. These melt datasets are especially useful in the polar regions where the application of imagery from optical sensors is hindered by polar nights and frequent cloud cover. In this study, we generate a pan-Arctic, pan-cryosphere melt onset dataset by combining estimates from previously published algorithms optimized for individual cryospheric elements and applied to QS and Special Sensor Microwave Imager (SSM/I) data for the northern high latitude land surface, ice caps, large lakes, and sea ice. Comparisons of melt onset along the boundaries between different components of the cryosphere show that in general the integrated dataset provides consistent and spatially coherent melt onset estimates across the pan-Arctic. We present the climatology and the anomaly patterns in melt onset during 2000-2009, and identify synoptic-scale linkages between atmospheric conditions and the observed patterns. We also investigate the possible trends in melt onset in the pan-Arctic during the 10-year period.
A benchmark for vehicle detection on wide area motion imagery

NASA Astrophysics Data System (ADS)

Catrambone, Joseph; Amzovski, Ismail; Liang, Pengpeng; Blasch, Erik; Sheaff, Carolyn; Wang, Zhonghai; Chen, Genshe; Ling, Haibin

2015-05-01

Wide area motion imagery (WAMI) has been attracting an increased amount of research attention due to its large spatial and temporal coverage. An important application includes moving target analysis, where vehicle detection is often one of the first steps before advanced activity analysis. While there exist many vehicle detection algorithms, a thorough evaluation of them on WAMI data still remains a challenge mainly due to the lack of an appropriate benchmark data set. In this paper, we address this research need by presenting a new benchmark for wide area motion imagery vehicle detection data. The WAMI benchmark is based on the recently available Wright-Patterson Air Force Base (WPAFB09) dataset and the Temple Resolved Uncertainty Target History (TRUTH) associated target annotation. Trajectory annotations were provided in the original release of the WPAFB09 dataset, but detailed vehicle annotations were not available with the dataset. In addition, static vehicles, e.g., in parking lots, were not annotated in the original release. Addressing these issues, we re-annotated the whole dataset with detailed information for each vehicle, including not only a target's location, but also its pose and size. The annotated WAMI data set should be useful to the community as a common benchmark to compare WAMI detection, tracking, and identification methods.
A spatio-temporal index for aerial full waveform laser scanning data

NASA Astrophysics Data System (ADS)

Laefer, Debra F.; Vo, Anh-Vu; Bertolotto, Michela

2018-04-01

Aerial laser scanning is increasingly available in the full waveform version of the raw signal, which can provide greater insight into and control over the data and, thus, richer information about the scanned scenes. However, when compared to conventional discrete point storage, preserving raw waveforms leads to vastly larger and more complex data volumes. To begin addressing these challenges, this paper introduces a novel bi-level approach for storing and indexing full waveform (FWF) laser scanning data in a relational database environment, while considering both the spatial and the temporal dimensions of that data. In the storage scheme's upper level, the full waveform datasets are partitioned into spatially and temporally coherent groups that are indexed by a two-dimensional R*-tree. To further accelerate intra-block data retrieval, at the lower level a three-dimensional local octree is created for each pulse block. The local octrees are implemented in-memory and can be efficiently written to a database for reuse. The indexing solution enables scalable and efficient three-dimensional (3D) spatial and spatio-temporal queries on the actual pulse data, functionalities not available in other systems. The proposed FWF laser scanning data solution is capable of managing multiple FWF datasets derived from large flight missions. The flight structure is embedded into the data storage model and can be used for querying predicates. Such functionality is important to FWF data exploration since aircraft locations and orientations are frequently required for FWF data analyses. Empirical tests on real datasets of up to 1 billion pulses from Dublin, Ireland prove the almost perfect scalability of the system. The use of the local 3D octree in the indexing structure accelerated pulse clipping by 1.2-3.5 times for non-axis-aligned (NAA) polyhedron-shaped clipping windows, while axis-aligned (AA) polyhedron clipping was better served using only the top indexing layer.
The distinct behaviours of the hybrid indexing for AA and NAA clipping windows are attributable to the different proportions of the local-index-related overheads with respect to the total querying costs. When temporal constraints were added, the number of costly spatial checks was generally reduced, thereby shortening the querying times.
Cloud and fog interactions with coastal forests in the California Channel Islands

NASA Astrophysics Data System (ADS)

Still, C. J.; Baguskas, S. A.; Williams, P.; Fischer, D. T.; Carbone, M. S.; Rastogi, B.

2015-12-01

Coastal forests in California are frequently covered by clouds or immersed in fog in the rain-free summer. Scientists have long surmised that fog might provide critical water inputs to these forests. However, until recently, there has been little ecophysiological research to support how or why plants should prefer foggy regions; similarly, there is very little work quantifying water delivered to ecosystems by fog drip except for a few notable sites along the California coast. However, without spatial datasets of summer cloudcover and fog inundation, combined with detailed process studies, questions regarding the roles of cloud shading and fog drip in dictating plant distributions and ecosystem physiology cannot be addressed effectively. The overall objective of this project is to better understand how cloudcover and fog influence forest metabolism, growth, and distribution. Across a range of sites in California's Channel Islands National Park we measured a wide variety of ecosystem processes and properties. We then related these to cloudcover and fog immersion maps created using satellite datasets and airport and radiosonde observations. We compiled a spatially continuous dataset of summertime cloudcover frequency of the Southern California Bight using satellite imagery from the NOAA geostationary GOES-11 Imager. We also created a map of summertime cloudcover frequency of this area using MODIS imagery. To assess the ability of our mapping approach to predict spatial and temporal fog inundation patterns, we compared our monthly average daytime fog maps for GOES pixels corresponding to stations where fog inputs were measured with fog collectors in a Bishop pine forest. We also compared our cloudcover maps to measurements of irradiance. Our results demonstrate that cloudcover and fog strongly modulate radiation, water, and carbon budgets, as well as forest distributions, in this semi-arid environment.
Measurements of summertime fog drip, pine sapflow and growth, and soil respiration are strongly related to variations in cloudcover and fog drip. Importantly, spatial variations in cloud cover and fog immersion drive large changes in modeled water budgets and correspond closely to patterns of tree growth and mortality.
Patterns, biases and prospects in the distribution and diversity of Neotropical snakes

PubMed Central

Sawaya, Ricardo J.; Zizka, Alexander; Laffan, Shawn; Faurby, Søren; Pyron, R. Alexander; Bérnils, Renato S.; Jansen, Martin; Passos, Paulo; Prudente, Ana L. C.; Cisneros‐Heredia, Diego F.; Braz, Henrique B.; Nogueira, Cristiano de C.; Antonelli, Alexandre; Meiri, Shai

2017-01-01

Motivation: We generated a novel database of Neotropical snakes (one of the world's richest herpetofauna) combining the most comprehensive, manually compiled distribution dataset with publicly available data. We assess, for the first time, the diversity patterns for all Neotropical snakes as well as sampling density and sampling biases. Main types of variables contained: We compiled three databases of species occurrences: a dataset downloaded from the Global Biodiversity Information Facility (GBIF), a verified dataset built through taxonomic work and specialized literature, and a combined dataset comprising a cleaned version of the GBIF dataset merged with the verified dataset. Spatial location and grain: Neotropics, Behrmann projection equivalent to 1° × 1°. Time period: Specimens housed in museums during the last 150 years. Major taxa studied: Squamata: Serpentes. Software format: Geographical information system (GIS). Results: The combined dataset provides the most comprehensive distribution database for Neotropical snakes to date. It contains 147,515 records for 886 species across 12 families, representing 74% of all species of snakes, spanning 27 countries in the Americas. Species richness and phylogenetic diversity show overall similar patterns. Amazonia is the least sampled Neotropical region, whereas most well-sampled sites are located near large universities and scientific collections. We provide a list and updated maps of geographical distribution of all snake species surveyed. Main conclusions: The biodiversity metrics of Neotropical snakes reflect patterns previously documented for other vertebrates, suggesting that similar factors may determine the diversity of both ectothermic and endothermic animals. We suggest conservation strategies for high-diversity areas and sampling efforts be directed towards Amazonia and poorly known species. PMID:29398972
Microclimate Data Improve Predictions of Insect Abundance Models Based on Calibrated Spatiotemporal Temperatures

PubMed Central

Rebaudo, François; Faye, Emile; Dangles, Olivier

2016-01-01

A large body of literature has recently recognized the role of microclimates in controlling the physiology and ecology of species, yet the relevance of fine-scale climatic data for modeling species performance and distribution remains a matter of debate. Using a 6-year monitoring of three potato moth species, major crop pests in the tropical Andes, we asked whether the spatiotemporal resolution of temperature data affects the predictions of models of moth performance and distribution. For this, we used three different climatic data sets: (i) the WorldClim dataset (global dataset), (ii) air temperature recorded using data loggers (weather station dataset), and (iii) air crop canopy temperature (microclimate dataset). We developed a statistical procedure to calibrate all datasets to monthly and yearly variation in temperatures, while keeping both spatial and temporal variances (air monthly temperature at 1 km² for the WorldClim dataset, air hourly temperature for the weather station, and air minute temperature over 250 m radius disks for the microclimate dataset). Then, we computed pest performances based on these three datasets. Results for temperature ranging from 9 to 11°C revealed discrepancies in the simulation outputs in both survival and development rates depending on the spatiotemporal resolution of the temperature dataset. Temperature and simulated pest performances were then combined into multiple linear regression models to compare predicted vs. field data. We used an additional set of study sites to test the ability of the results of our model to be extrapolated over larger scales. Results showed that the model implemented with microclimatic data best predicted observed pest abundances for our study sites, but was less accurate than the global dataset model when performed at larger scales.
Our simulations therefore stress the importance of considering different temperature datasets depending on the issue to be solved in order to accurately predict species abundances. In conclusion, keeping in mind that the mismatch between the size of organisms and the scale at which climate data are collected and modeled remains a key issue, temperature dataset selection should be balanced by the desired output spatiotemporal scale for better predicting pest dynamics and developing efficient pest management strategies. PMID:27148077
A Spatially Distinct History of the Development of California Groundfish Fisheries

PubMed Central

Miller, Rebecca R.; Field, John C.; Santora, Jarrod A.; Schroeder, Isaac D.; Huff, David D.; Key, Meisha; Pearson, Don E.; MacCall, Alec D.

2014-01-01

During the past century, commercial fisheries have expanded from small vessels fishing in shallow, coastal habitats to a broad suite of vessels and gears that fish virtually every marine habitat on the globe. Understanding how fisheries have developed in space and time is critical for interpreting and managing the response of ecosystems to the effects of fishing, however time series of spatially explicit data are typically rare. Recently, the 1933–1968 portion of the commercial catch dataset from the California Department of Fish and Wildlife was recovered and digitized, completing the full historical series for both commercial and recreational datasets from 1933–2010. These unique datasets include landing estimates at a coarse 10 by 10 minute “grid-block” spatial resolution and extends the entire length of coastal California up to 180 kilometers from shore. In this study, we focus on the catch history of groundfish which were mapped for each grid-block using the year at 50% cumulative catch and total historical catch per habitat area. We then constructed generalized linear models to quantify the relationship between spatiotemporal trends in groundfish catches, distance from ports, depth, percentage of days with wind speed over 15 knots, SST and ocean productivity. Our results indicate that over the history of these fisheries, catches have taken place in increasingly deeper habitat, at a greater distance from ports, and in increasingly inclement weather conditions. Understanding spatial development of groundfish fisheries and catches in California are critical for improving population models and for evaluating whether implicit stock assessment model assumptions of relative homogeneity of fisheries removals over time and space are reasonable. 
This newly reconstructed catch dataset and analysis provide a comprehensive appreciation for the development of groundfish fisheries with respect to commonly assumed trends of global fisheries patterns that are typically constrained by a lack of long-term spatial datasets. PMID:24967973
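
The "year at 50% cumulative catch" metric used to map each grid-block can be illustrated with a minimal sketch. The function name, the dictionary layout, and the catch values below are hypothetical and only show the arithmetic; they are not taken from the study.

```python
def year_at_half_cumulative_catch(catch_by_year):
    """Return the first year in which cumulative catch reaches 50% of the total.

    catch_by_year maps year -> catch for one grid-block (hypothetical layout).
    Returns None when the block has no recorded catch.
    """
    total = sum(catch_by_year.values())
    if total <= 0:
        return None
    running = 0.0
    for year in sorted(catch_by_year):
        running += catch_by_year[year]
        if running >= 0.5 * total:
            return year
    return None


# Made-up catches for a single grid-block.
block_catch = {1933: 10.0, 1950: 20.0, 1970: 30.0, 1990: 40.0}
print(year_at_half_cumulative_catch(block_catch))
```

A later value for a grid-block means that most of its historical catch was taken more recently, which is the kind of signal the abstract relates to depth and distance from port.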

Observational uncertainty and regional climate model evaluation: A pan-European perspective

NASA Astrophysics Data System (ADS)

Kotlarski, Sven; Szabó, Péter; Herrera, Sixto; Räty, Olle; Keuler, Klaus; Soares, Pedro M.; Cardoso, Rita M.; Bosshard, Thomas; Pagé, Christian; Boberg, Fredrik; Gutiérrez, José M.; Jaczewski, Adam; Kreienkamp, Frank; Liniger, Mark A.; Lussana, Cristian; Szepszo, Gabriella

2017-04-01

Local and regional climate change assessments based on downscaling methods crucially depend on the existence of accurate and reliable observational reference data. In dynamical downscaling via regional climate models (RCMs), observational data can influence model development itself and, later on, model evaluation, parameter calibration and added value assessment. In empirical-statistical downscaling, observations serve as predictand data and directly influence model calibration, with corresponding effects on downscaled climate change projections. Focusing on the evaluation of RCMs, we here analyze the influence of uncertainties in observational reference data on evaluation results in a well-defined performance assessment framework and on a European scale. For this purpose we employ three different gridded observational reference datasets, namely (1) the well-established EOBS dataset, (2) the recently developed EURO4M-MESAN regional re-analysis, and (3) several national high-resolution and quality-controlled gridded datasets that recently became available. In terms of climate models, five reanalysis-driven experiments carried out by five different RCMs within the EURO-CORDEX framework are used. Two variables (temperature and precipitation) and a range of evaluation metrics that reflect different aspects of RCM performance are considered. We furthermore include an illustrative model ranking exercise and relate observational spread to RCM spread. The results obtained indicate a varying influence of observational uncertainty on model evaluation depending on the variable, the season, the region and the specific performance metric considered. Over most parts of the continent, the influence of the choice of the reference dataset for temperature is rather small for seasonal mean values and inter-annual variability. Here, model uncertainty (as measured by the spread between the five RCM simulations considered) is typically much larger than reference data uncertainty. 
For parameters of the daily temperature distribution and for the spatial pattern correlation, however, important dependencies on the reference dataset can arise. The related evaluation uncertainties can be as large as or even larger than model uncertainty. For precipitation, the influence of observational uncertainty is, in general, larger than for temperature. It often dominates model uncertainty, especially for the evaluation of the wet day frequency, the spatial correlation and the shape and location of the distribution of daily values. But even the evaluation of large-scale seasonal mean values can be considerably affected by the choice of the reference. When a simple and illustrative model ranking scheme is applied to these results, the RCM ranking is found in many cases to depend on the reference dataset employed.
Spatiotemporal patterns of precipitation inferred from streamflow observations across the Sierra Nevada mountain range

NASA Astrophysics Data System (ADS)

Henn, Brian; Clark, Martyn P.; Kavetski, Dmitri; Newman, Andrew J.; Hughes, Mimi; McGurk, Bruce; Lundquist, Jessica D.

2018-01-01

Given uncertainty in precipitation gauge-based gridded datasets over complex terrain, we use multiple streamflow observations as an additional source of information about precipitation, in order to identify spatial and temporal differences between a gridded precipitation dataset and precipitation inferred from streamflow. We test whether gridded datasets capture across-crest and regional spatial patterns of variability, as well as year-to-year variability and trends in precipitation, in comparison to precipitation inferred from streamflow. We use a Bayesian model calibration routine with multiple lumped hydrologic model structures to infer the most likely basin-mean, water-year total precipitation for 56 basins with long-term (>30 year) streamflow records in the Sierra Nevada mountain range of California. We compare basin-mean precipitation derived from this approach with basin-mean precipitation from a precipitation gauge-based, 1/16° gridded dataset that has been used to simulate and evaluate trends in Western United States streamflow and snowpack over the 20th century. We find that the long-term average spatial patterns differ: in particular, there is less precipitation in the gridded dataset in higher-elevation basins whose aspect faces prevailing cool-season winds, as compared to precipitation inferred from streamflow. In a few years and basins, there is less gridded precipitation than there is observed streamflow. Lower-elevation, southern, and east-of-crest basins show better agreement between gridded and inferred precipitation. Implied actual evapotranspiration (calculated as precipitation minus streamflow) then also varies between the streamflow-based estimates and the gridded dataset. Absolute uncertainty in precipitation inferred from streamflow is substantial, but the signal of basin-to-basin and year-to-year differences is likely more robust. 
The findings suggest that considering streamflow when spatially distributing precipitation in complex terrain may improve its representation, particularly for basins whose orientations (e.g., windward-facing) are favored for orographic precipitation enhancement.
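
The water-balance check described above (implied actual evapotranspiration as precipitation minus streamflow, and years in which gridded precipitation falls below observed streamflow) can be sketched as follows. This is only an illustration of the arithmetic; the function names and the numbers are hypothetical and do not come from the study.

```python
def implied_et(precip_mm, streamflow_mm):
    """Implied actual evapotranspiration per water year: precipitation minus streamflow."""
    return [p - q for p, q in zip(precip_mm, streamflow_mm)]


def years_with_deficit(years, precip_mm, streamflow_mm):
    """Water years in which precipitation is smaller than observed streamflow.

    Such years are physically implausible for a closed basin and hint at a
    low bias in the precipitation estimate.
    """
    return [y for y, p, q in zip(years, precip_mm, streamflow_mm) if p < q]


# Made-up basin-mean, water-year totals in millimetres.
years = [2000, 2001]
gridded_precip = [1000.0, 800.0]
streamflow = [600.0, 900.0]
print(implied_et(gridded_precip, streamflow))
print(years_with_deficit(years, gridded_precip, streamflow))
```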
VIEWCACHE: An incremental pointer-based access method for autonomous interoperable databases

NASA Technical Reports Server (NTRS)

Roussopoulos, N.; Sellis, Timos

1993-01-01

One of the biggest problems facing NASA today is to provide scientists efficient access to a large number of distributed databases. Our pointer-based incremental database access method, VIEWCACHE, provides such an interface for accessing distributed datasets and directories. VIEWCACHE allows database browsing and search, performing inter-database cross-referencing with no actual data movement between database sites. This organization and processing is especially suitable for managing Astrophysics databases, which are physically distributed all over the world. Once the search is complete, the set of collected pointers to the desired data is cached. VIEWCACHE includes spatial access methods for accessing image datasets, which provide much easier query formulation by referring directly to the image and very efficient search for objects contained within a two-dimensional window. We will develop and optimize a VIEWCACHE External Gateway Access to database management systems to facilitate database search.
Users' Manual and Installation Guide for the EverVIEW Slice and Dice Tool (Version 1.0 Beta)

USGS Publications Warehouse

Roszell, Dustin; Conzelmann, Craig; Chimmula, Sumani; Chandrasekaran, Anuradha; Hunnicut, Christina

2009-01-01

Network Common Data Form (NetCDF) is a self-describing, machine-independent file format for storing array-oriented scientific data. Over the past few years, there has been a growing movement within the community of natural resource managers in The Everglades, Fla., to use NetCDF as the standard data container for datasets based on multidimensional arrays. As a consequence, a need arose for additional tools to view and manipulate NetCDF datasets, specifically to create subsets of large NetCDF files. To address this need, we created the EverVIEW Slice and Dice Tool to allow users to create subsets of grid-based NetCDF files. The major functions of this tool are (1) to subset NetCDF files both spatially and temporally; (2) to view the NetCDF data in table form; and (3) to export filtered data to a comma-separated value file format.
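
The three functions listed above (spatial and temporal subsetting, a table view, and CSV export) can be illustrated with a minimal sketch. It does not read NetCDF files and is not the EverVIEW implementation: the grid is assumed to be already loaded as nested lists indexed `data[t][i][j]`, and all names are hypothetical.

```python
import csv
import io


def subset_grid(data, times, lats, lons, t_range, lat_range, lon_range):
    """Return (time, lat, lon, value) rows that fall inside the inclusive ranges.

    data[t][i][j] is aligned with times[t], lats[i], lons[j].
    """
    rows = []
    for t, time in enumerate(times):
        if not (t_range[0] <= time <= t_range[1]):
            continue
        for i, lat in enumerate(lats):
            if not (lat_range[0] <= lat <= lat_range[1]):
                continue
            for j, lon in enumerate(lons):
                if lon_range[0] <= lon <= lon_range[1]:
                    rows.append((time, lat, lon, data[t][i][j]))
    return rows


def rows_to_csv(rows):
    """Serialize the table rows as comma-separated text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["time", "lat", "lon", "value"])
    writer.writerows(rows)
    return buf.getvalue()


# Tiny made-up grid: 2 time steps x 2 latitudes x 2 longitudes.
times, lats, lons = [0, 1], [25.0, 26.0], [-81.0, -80.0]
data = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
table = subset_grid(data, times, lats, lons, (1, 1), (26.0, 26.0), (-81.0, -80.0))
print(rows_to_csv(table))
```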
Imaging samples larger than the field of view: the SLS experience

NASA Astrophysics Data System (ADS)

Vogiatzis Oikonomidis, Ioannis; Lovric, Goran; Cremona, Tiziana P.; Arcadu, Filippo; Patera, Alessandra; Schittny, Johannes C.; Stampanoni, Marco

2017-06-01

Volumetric datasets with micrometer spatial and sub-second temporal resolutions are nowadays routinely acquired using synchrotron X-ray tomographic microscopy (SRXTM). Although SRXTM technology allows the examination of multiple samples with short scan times, many specimens are larger than the field-of-view (FOV) provided by the detector. The extension of the FOV in the direction perpendicular to the rotation axis remains non-trivial. We present a method that can efficiently increase the FOV by merging volumetric datasets obtained by region-of-interest tomographies in different 3D positions of the sample, with a minimal amount of artefacts and with the ability to handle large amounts of data. The method has been successfully applied for the three-dimensional imaging of a small number of mouse lung acini of intact animals, where pixel sizes down to the micrometer range and short exposure times are required.
Analysis of long term trends of precipitation estimates acquired using radar network in Turkey

NASA Astrophysics Data System (ADS)

Tugrul Yilmaz, M.; Yucel, Ismail; Kamil Yilmaz, Koray

2016-04-01

Precipitation estimates, a vital input in many hydrological and agricultural studies, can be obtained using many different platforms (ground station-, radar-, model-, satellite-based). Satellite- and model-based estimates are spatially continuous datasets; however, they lack the high-resolution information many applications require. Station-based values are actual precipitation observations; however, they are limited by being point data. These datasets may be interpolated, but such end-products may have large errors over remote locations whose climate, topography, etc. differ from those of the areas where stations are installed. Radars have the particular advantage of providing high spatial resolution information over land, even though the accuracy of radar-based precipitation estimates depends on the Z-R relationship, mountain blockage, target distance from the radar, spurious echoes resulting from anomalous propagation of the radar beam, bright band contamination and ground clutter. A viable method to obtain consistent precipitation information at high spatial and temporal resolution is to merge radar and station data to take advantage of each retrieval platform. An optimally merged product is particularly important in Turkey, where complex topography exerts strong controls on the precipitation regime and in turn hampers observation efforts. There are currently 10 weather radars over Turkey (an additional 7 are planned), which have been obtaining precipitation information since 2007. This study aims to optimally merge radar precipitation data with station-based observations to introduce a station-radar blended precipitation product. This study was supported by TUBITAK fund # 114Y676.
Estimation of Global 1km-grid Terrestrial Carbon Exchange Part I: Developing Inputs and Modelling

NASA Astrophysics Data System (ADS)

Sasai, T.; Murakami, K.; Kato, S.; Matsunaga, T.; Saigusa, N.; Hiraki, K.

2015-12-01

The global terrestrial carbon cycle largely depends on the spatial pattern of land cover type, which is heterogeneously distributed over regional and global scales. However, most studies that aimed to estimate carbon exchanges between ecosystem and atmosphere remained at a grid spatial resolution of several tens of kilometers, and the results have not been sufficient to understand the detailed pattern of carbon exchanges at the level of ecological communities. Improving the spatial resolution is clearly necessary to enhance the accuracy of carbon exchange estimates. Moreover, the improvement may contribute to global warming awareness, policy making and other social activities. In this study, we show global terrestrial carbon exchanges (net ecosystem production, net primary production, and gross primary production) at 1km-grid resolution. As the methodology for computing the exchanges, we 1) developed a global 1km-grid climate and satellite dataset based on the approach in Setoyama and Sasai (2013); 2) used the satellite-driven biosphere model (Biosphere model integrating Eco-physiological And Mechanistic approaches using Satellite data: BEAMS) (Sasai et al., 2005, 2007, 2011); 3) simulated the carbon exchanges using the new dataset and BEAMS on a supercomputer that includes 1280 CPU and 320 GPGPU cores (GOSAT RCF of NIES). As a result, we could develop a globally uniform system for realistically estimating terrestrial carbon exchange and evaluate net ecosystem production at the level of each community, leading to a highly detailed understanding of terrestrial carbon exchanges.
Spatial variations in Titan's atmospheric temperature: ALMA and Cassini comparisons from 2012 to 2015

NASA Astrophysics Data System (ADS)

Thelen, Alexander E.; Nixon, C. A.; Chanover, N. J.; Molter, E. M.; Cordiner, M. A.; Achterberg, R. K.; Serigano, J.; Irwin, P. G. J.; Teanby, N.; Charnley, S. B.

2018-06-01

Submillimeter emission lines of carbon monoxide (CO) in Titan's atmosphere provide excellent probes of atmospheric temperature due to the molecule's long chemical lifetime and stable, well constrained volume mixing ratio. Here we present the analysis of 4 datasets obtained with the Atacama Large Millimeter/Submillimeter Array (ALMA) in 2012, 2013, 2014, and 2015 that contain strong CO rotational transitions. Utilizing ALMA's high spatial resolution in the 2012, 2014, and 2015 observations, we extract spectra from 3 separate regions on Titan's disk using datasets with beam sizes ranging from 0.35 × 0.28″ to 0.39 × 0.34″. Temperature profiles retrieved by the NEMESIS radiative transfer code are compared to Cassini Composite Infrared Spectrometer (CIRS) and radio occultation science results from similar latitude regions. Disk-averaged temperature profiles stay relatively constant from year to year, while small seasonal variations in atmospheric temperature are present from 2012 to 2015 in the stratosphere and mesosphere ( ∼ 100-500 km) of spatially resolved regions. We measure the stratopause (320 km) to increase in temperature by 5 K in northern latitudes from 2012 to 2015, while temperatures rise throughout the stratosphere at lower latitudes. We observe generally cooler temperatures in the lower stratosphere ( ∼ 100 km) than those obtained through Cassini radio occultation measurements, with the notable exception of warming in the northern latitudes and the absence of previous instabilities; both of these results are indicators that Titan's lower atmosphere responds to seasonal effects, particularly at higher latitudes. While retrieved temperature profiles cover a range of latitudes in these observations, deviations from CIRS nadir maps and radio occultation measurements convolved with the ALMA beam-footprint are not found to be statistically significant, and discrepancies are often found to be less than 5 K throughout the atmosphere. 
ALMA's excellent sensitivity in the lower stratosphere (60-300 km) provides a highly complementary dataset to contemporary CIRS and radio science observations, including altitude regions where both of those measurement sets contain large uncertainties. The demonstrated utility of CO emission lines in the submillimeter as a tracer of Titan's atmospheric temperature lays the groundwork for future studies of other molecular species - particularly those that exhibit strong polar abundance enhancements or are pressure-broadened in the lower atmosphere, as temperature profiles are found to consistently vary with latitude in all three years by up to 15 K.
Monitoring Urbanization Processes from Space: Using Landsat Imagery to Detect Built-Up Areas at Scale

NASA Astrophysics Data System (ADS)

Goldblatt, R.; You, W.; Hanson, G.; Khandelwal, A. K.

2016-12-01

Urbanization is one of the most fundamental trends of the past two centuries and a key force shaping almost all dimensions of modern society. Monitoring the spatial extent of cities and their dynamics by means of remote sensing methods is crucial for many research domains, as well as to city and regional planning and to policy making. Yet the majority of urban research is being done at small scales, due, in part, to computational limitations. With the increasing availability of parallel computing platforms with large storage capacities, such as Google Earth Engine (GEE), researchers can scale up the spatial and the temporal units of analysis and investigate urbanization processes over larger areas and over longer periods of time. In this study we present a methodology that is designed to capture temporal changes in the spatial extent of urban areas at the national level. We utilize a large scale ground-truth dataset containing examples of "built-up" and "not built-up" areas from across India. This dataset, which was collected based on 2016 high-resolution imagery, is used for supervised pixel-based image classification in GEE. We assess different types of classifiers and inputs and demonstrate that with Landsat 8 as the classifier's input, Random Forest achieves a high accuracy rate of around 87%. Although performance with Landsat 8 as the input exceeds that of Landsat 7, with the addition of several per-pixel computed indices to Landsat 7 - NDVI, NDBI, MNDWI and SAVI - the classifier's sensitivity improves by around 10%. We use Landsat 7 to detect temporal changes in the extent of urban areas. The classifier is trained with 2016 imagery as the input - for which ground truth data is available - and is then used to detect urban areas in the historical imagery. We demonstrate that this classification produces high quality maps of urban extent over time. We compare the classification result with numerous datasets of urban areas (e.g. 
MODIS, DMSP-OLS and WorldPop) and show that our classification captures the fine boundaries between built-up areas and various types of land cover thus providing an accurate estimation of the extent of urban areas. The study demonstrates the potential of cloud-based platforms, such as GEE, for monitoring long-term and continuous urbanization processes at scale.
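
The per-pixel indices named above (NDVI, NDBI, MNDWI and SAVI) follow standard band-ratio definitions, sketched below in plain Python for a single pixel. The band argument names and the reflectance values are assumptions for illustration, and the soil adjustment factor of 0.5 is the commonly used default rather than a value stated in the abstract. The study itself computed its inputs in Google Earth Engine, which this sketch does not reproduce.

```python
def normalized_difference(a, b):
    """(a - b) / (a + b), returning 0.0 when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else 0.0


def spectral_indices(green, red, nir, swir1, soil_factor=0.5):
    """Compute four common indices from surface reflectances of one pixel.

    soil_factor is the SAVI soil adjustment factor L (0.5 is an assumed default).
    """
    return {
        "NDVI": normalized_difference(nir, red),      # vegetation
        "NDBI": normalized_difference(swir1, nir),    # built-up
        "MNDWI": normalized_difference(green, swir1), # open water
        "SAVI": (1 + soil_factor) * (nir - red) / (nir + red + soil_factor),
    }


# Made-up reflectances for one pixel.
print(spectral_indices(green=0.1, red=0.1, nir=0.5, swir1=0.3))
```

Indices of this kind can be appended to the raw band values as extra features for a pixel-based classifier.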
Benchmark of Machine Learning Methods for Classification of a SENTINEL-2 Image

NASA Astrophysics Data System (ADS)

Pirotti, F.; Sunar, F.; Piragnolo, M.

2016-06-01

Thanks mainly to ESA and USGS, a large bulk of free images of the Earth is readily available nowadays. One of the main goals of remote sensing is to label images according to a set of semantic categories, i.e. image classification. This is a very challenging issue since land cover of a specific class may present a large spatial and spectral variability and objects may appear at different scales and orientations. In this study, we report the results of benchmarking 9 machine learning algorithms tested for accuracy and speed in training and classification of land-cover classes in a Sentinel-2 dataset. The following machine learning methods (MLM) have been tested: linear discriminant analysis, k-nearest neighbour, random forests, support vector machines, multi layered perceptron, multi layered perceptron ensemble, ctree, boosting, logarithmic regression. The validation is carried out using a control dataset which consists of an independent classification in 11 land-cover classes of an area of about 60 km², obtained by manual visual interpretation of high resolution images (20 cm ground sampling distance) by experts. In this study five out of the eleven classes are used since the others have too few samples (pixels) for testing and validating subsets. The classes used are the following: (i) urban (ii) sowable areas (iii) water (iv) tree plantations (v) grasslands. Validation is carried out using three different approaches: (i) using pixels from the training dataset (train), (ii) using pixels from the training dataset and applying cross-validation with the k-fold method (kfold) and (iii) using all pixels from the control dataset. Five accuracy indices are calculated for the comparison between the values predicted with each model and control values over three sets of data: the training dataset (train), the whole control dataset (full) and with k-fold cross-validation (kfold) with ten folds. 
Results from validation of predictions of the whole dataset (full) show the random forests method with the highest values, with a kappa index ranging from 0.55 to 0.42 for the largest and smallest numbers of training pixels, respectively. The two neural networks (multi layered perceptron and its ensemble) and the support vector machines - with default radial basis function kernel - methods follow closely with comparable performance.
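
The kappa index reported above is commonly computed as Cohen's kappa from a confusion matrix of predicted versus control classes. A minimal sketch follows, assuming that definition; the abstract does not state which kappa variant was used, and the example matrix is made up.

```python
def cohen_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: control, columns: predicted)."""
    n = sum(sum(row) for row in confusion)
    k = len(confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    # Chance agreement: product of row and column marginals, summed over classes.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(k)
    ) / (n * n)
    return (observed - expected) / (1 - expected)


# Made-up two-class confusion matrix with 50 control pixels.
print(cohen_kappa([[20, 5], [10, 15]]))
```

Kappa discounts the agreement expected by chance, so it is lower than overall accuracy whenever class marginals alone would produce some matches.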
An effective assessment protocol for continuous geospatial datasets of forest characteristics using USFS Forest Inventory and Analysis (FIA) data

Treesearch

Rachel Riemann; Barry Tyler Wilson; Andrew Lister; Sarah Parks

2010-01-01

Geospatial datasets of forest characteristics are modeled representations of real populations on the ground. The continuous spatial character of such datasets provides an incredible source of information at the landscape level for ecosystem research, policy analysis, and planning applications, all of which are critical for addressing current challenges related to...
An Integrative Platform for Three-dimensional Quantitative Analysis of Spatially Heterogeneous Metastasis Landscapes

NASA Astrophysics Data System (ADS)

Guldner, Ian H.; Yang, Lin; Cowdrick, Kyle R.; Wang, Qingfei; Alvarez Barrios, Wendy V.; Zellmer, Victoria R.; Zhang, Yizhe; Host, Misha; Liu, Fang; Chen, Danny Z.; Zhang, Siyuan

2016-04-01

Metastatic microenvironments are spatially and compositionally heterogeneous. This seemingly stochastic heterogeneity provides researchers great challenges in elucidating factors that determine metastatic outgrowth. Herein, we develop and implement an integrative platform that will enable researchers to obtain novel insights from intricate metastatic landscapes. Our two-segment platform begins with whole tissue clearing, staining, and imaging to globally delineate metastatic landscape heterogeneity with spatial and molecular resolution. The second segment of our platform applies our custom-developed SMART 3D (Spatial filtering-based background removal and Multi-chAnnel forest classifiers-based 3D ReconsTruction), a multi-faceted image analysis pipeline, permitting quantitative interrogation of functional implications of heterogeneous metastatic landscape constituents, from subcellular features to multicellular structures, within our large three-dimensional (3D) image datasets. Coupling whole tissue imaging of brain metastasis animal models with SMART 3D, we demonstrate the capability of our integrative pipeline to reveal and quantify volumetric and spatial aspects of brain metastasis landscapes, including diverse tumor morphology, heterogeneous proliferative indices, metastasis-associated astrogliosis, and vasculature spatial distribution. Collectively, our study demonstrates the utility of our novel integrative platform to reveal and quantify the global spatial and volumetric characteristics of the 3D metastatic landscape with unparalleled accuracy, opening new opportunities for unbiased investigation of novel biological phenomena in situ.
Harnessing Big Data to Represent 30-meter Spatial Heterogeneity in Earth System Models

NASA Astrophysics Data System (ADS)

Chaney, N.; Shevliakova, E.; Malyshev, S.; Van Huijgevoort, M.; Milly, C.; Sulman, B. N.

2016-12-01

Terrestrial land surface processes play a critical role in the Earth system; they have a profound impact on the global climate, food and energy production, freshwater resources, and biodiversity. One of the most fascinating yet challenging aspects of characterizing terrestrial ecosystems is their field-scale (~30 m) spatial heterogeneity. It has been observed repeatedly that the water, energy, and biogeochemical cycles at multiple temporal and spatial scales have deep ties to an ecosystem's spatial structure. Current Earth system models largely disregard this important relationship, leading to an inadequate representation of ecosystem dynamics. In this presentation, we will show how existing global environmental datasets can be harnessed to explicitly represent field-scale spatial heterogeneity in Earth system models. For each macroscale grid cell, these environmental data are clustered according to their field-scale soil and topographic attributes to define unique sub-grid tiles. The state-of-the-art Geophysical Fluid Dynamics Laboratory (GFDL) land model is then used to simulate these tiles and their spatial interactions via the exchange of water, energy, and nutrients along explicit topographic gradients. Using historical simulations over the contiguous United States, we will show how a robust representation of field-scale spatial heterogeneity impacts modeled ecosystem dynamics including the water, energy, and biogeochemical cycles as well as vegetation composition and distribution.
Assessing the quality of open spatial data for mobile location-based services research and applications

NASA Astrophysics Data System (ADS)

Ciepłuch, C.; Mooney, P.; Jacob, R.; Zheng, J.; Winstanely, A. C.

2011-12-01

New trends in GIS such as Volunteered Geographical Information (VGI), Citizen Science, and Urban Sensing have changed the shape of the geoinformatics landscape. The OpenStreetMap (OSM) project provided us with an exciting, evolving, free and open solution as a base dataset for our geoserver and spatial data provider for our research. OSM is probably the best known and best supported example of VGI and user generated spatial content on the Internet. In this paper we describe current results from the development of quality indicators for OSM data. Initially we have analysed the Ireland OSM data in grid cells (5 km) to gather statistical data about the completeness, accuracy, and fitness for purpose of the underlying spatial data. This analysis included: density of user contributions, spatial density of points and polygons, types of tags and metadata used, dominant contributors in a particular area or for a particular geographic feature type, etc. The greatest OSM activity and spatial data density are highly correlated with centres of large population. The ability to quantify and assess whether VGI, such as OSM, is of sufficient quality for mobile mapping applications and location-based services is critical to the future success of VGI as a spatial data source for these technologies.
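
The grid-cell analysis described above (for example, the spatial density of points per 5 km cell) can be sketched as a simple binning step. The sketch assumes point coordinates already projected to metres; the function name and the sample points are hypothetical and are not taken from the paper.

```python
from collections import Counter


def grid_cell_counts(points_m, cell_size_m=5000):
    """Count points per square grid cell.

    points_m holds (x, y) coordinates in metres in a projected coordinate
    system (an assumption); cells are identified by integer (column, row).
    """
    counts = Counter()
    for x, y in points_m:
        counts[(int(x // cell_size_m), int(y // cell_size_m))] += 1
    return counts


# Made-up projected point coordinates in metres.
points = [(100, 200), (4999, 4999), (5000, 0), (12000, 7000)]
print(grid_cell_counts(points))
```

Dividing each count by the cell area (25 km² for 5 km cells) gives a point density that can be compared across cells or against population figures.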
Spatial prediction of near surface soil water retention functions using hydrogeophysics

NASA Astrophysics Data System (ADS)

Gibson, J. P.; Franz, T. E.

2017-12-01

The hydrological community often turns to widely available spatial datasets such as SSURGO to characterize the spatial variability of soil across a landscape of interest. This has served as a reasonable first approximation when localized soil data are lacking. However, previous work has shown that information loss within land surface models primarily stems from parameterization. Localized soil sampling is both expensive and time-intensive, and thus a need exists to connect spatial datasets with ground observations. Given that hydrogeophysics is data-dense, rapid, and relatively easy to adopt, it is a promising technique to help dovetail localized soil sampling with larger spatial datasets. In this work, we utilize two geophysical techniques, cosmic-ray neutron probe and electromagnetic induction, to identify temporally stable soil moisture patterns. This is achieved by measuring numerous times over a range of wet to dry field conditions in order to apply an empirical orthogonal function analysis. We then present measured water retention functions of shallow cores extracted within each temporally stable zone. Lastly, we use soil moisture patterns as a covariate to predict soil hydraulic properties in areas without measurement and validate using a leave-one-out cross-validation analysis. Using these approaches to better constrain soil hydraulic property variability, we speculate that further research can better estimate hydrologic fluxes in areas of interest.
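
The leave-one-out cross-validation step mentioned above can be sketched as follows. The sketch assumes a simple least-squares line relating a covariate (such as a soil moisture pattern score) to a soil hydraulic property; the abstract does not specify the prediction model, so the model choice, function names and data are illustrative only.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loocv_rmse(xs, ys):
    """Leave-one-out cross-validation: refit without each sample, then predict it."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        errors.append(ys[i] - (slope * xs[i] + intercept))
    return (sum(e * e for e in errors) / len(errors)) ** 0.5


# Made-up covariate values and measured property values at sampled cores.
covariate = [0.0, 1.0, 2.0, 3.0, 4.0]
measured = [1.1, 2.9, 5.2, 6.8, 9.1]
print(loocv_rmse(covariate, measured))
```

Because every sample is predicted by a model that never saw it, the resulting error is a less optimistic estimate than the fit residuals, which matters when only a few cores are available.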
Dataset on outdoor behavior-system and spatial-pattern in the third place in cold area-based on the perspective of new energy structure.

PubMed

Ren, Kai; Wang, Yuan; Liu, Tingxi; Wang, Guanli

2017-02-01

The data presented in this paper are related to the research article entitled "Exploration of Outdoor Behavior System and Spatial Pattern in the Third Place in Cold Area- based on the perspective of new energy structure" (Ren, 2016) [1]. The dataset comes from a sub-time extended field investigation of residents of Power Home Community in Inner Mongolia, China, which belongs to the cold region of the ID area according to the Chinese design code for buildings. These field data provide descriptive statistics about the environment-behavior symbiosis system, environment loading, behavior system, spatial demand and spatial pattern for all kinds of residents (older, younger, children). The field dataset is made publicly available to enable critical or extended analyses.
High-Level Location Based Search Services That Improve Discoverability of Geophysical Data in the Virtual ITM Observatory

NASA Astrophysics Data System (ADS)

Schaefer, R. K.; Morrison, D.; Potter, M.; Barnes, R. J.; Nylund, S. R.; Patrone, D.; Aiello, J.; Talaat, E. R.; Sarris, T.

2015-12-01

The great promise of Virtual Observatories is the ability to perform complex search operations across the metadata of a large variety of different data sets. This allows the researcher to isolate and select the relevant measurements for their topic of study. The Virtual ITM Observatory (VITMO) has many diverse geophysical datasets that cover a large temporal and spatial range that present a unique search problem. VITMO provides many methods by which the user can search for and select data of interest including restricting selections based on geophysical conditions (solar wind speed, Kp, etc) as well as finding those datasets that overlap in time. One of the key challenges in improving discoverability is the ability to identify portions of datasets that overlap in time and in location. The difficulty is that location data is not contained in the metadata for datasets produced by satellites and would be extremely large in volume if it were available, making searching for overlapping data very time consuming. To solve this problem we have developed a series of light-weight web services that can provide a new data search capability for VITMO and others. The services consist of a database of spacecraft ephemerides and instrument fields of view; an overlap calculator to find times when the fields of view of different instruments intersect; and a magnetic field line tracing service that maps in situ and ground based measurements to the equatorial plane in magnetic coordinates for a number of field models and geophysical conditions. These services run in real-time when the user queries for data. These services will allow the non-specialist user to select data that they were previously unable to locate, opening up analysis opportunities beyond the instrument teams and specialists, making it easier for future students who come into the field.
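
The time-overlap search described above can be illustrated with a small interval-intersection routine. This is a generic sketch, not the VITMO overlap calculator: it assumes each dataset's coverage is given as a sorted list of non-overlapping (start, end) intervals, and the names and numbers are hypothetical.

```python
def overlap_intervals(a, b):
    """Intersect two sorted lists of non-overlapping (start, end) intervals."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            result.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


# Made-up coverage windows (for example, hours since some epoch).
instrument_a = [(0, 10), (20, 30)]
instrument_b = [(5, 25)]
print(overlap_intervals(instrument_a, instrument_b))
```

The same sweep applies to any one-dimensional coverage description; finding overlaps in location additionally requires the ephemeris and field-of-view information that the services above provide.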
Extracting climate signals from large hydrological data cubes using multivariate statistics - an example for the Mediterranean basin

NASA Astrophysics Data System (ADS)

Kauer, Agnes; Dorigo, Wouter; Bauer-Marschallinger, Bernhard

2017-04-01

Global warming is expected to change ocean-atmosphere oscillation patterns, e.g. the El Nino Southern Oscillation, and may thus have a substantial impact on water resources over land. Yet, the link between climate oscillations and terrestrial hydrology has large uncertainties. In particular, the climate in the Mediterranean basin is expected to be sensitive to global warming as it may increase insufficient and irregular water supply and lead to more frequent and intense droughts and heavy precipitation events. The ever increasing need for water in tourism and agriculture reinforces the problem. Therefore, the monitoring and better understanding of the hydrological cycle are crucial for this area. This study seeks to quantify the effect of regional climate modes, e.g. the North Atlantic Oscillation (NAO), on the hydrological cycle in the Mediterranean. We apply Empirical Orthogonal Functions (EOF) to a wide range of hydrological datasets to extract the major modes of variation over the study period. We use more than ten datasets describing precipitation, soil moisture, evapotranspiration, and changes in water mass with study periods ranging from one to three decades depending on the dataset. The resulting EOFs are then examined for correlations with regional climate modes using Spearman rank correlation analysis. This is done for the entire time span of the EOFs and for monthly and seasonally sampled data. We find relationships between the hydrological datasets and the climate modes NAO, Arctic Oscillation (AO), Eastern Atlantic (EA), and Tropical Northern Atlantic (TNA). Analyses of monthly and seasonally sampled data reveal high correlations especially in the winter months. However, the spatial extent of the data cube considered for the analyses has a large impact on the results.
Our statistical analyses suggest an impact of regional climate modes on the hydrological cycle in the Mediterranean area and may provide valuable input for evaluating process-oriented climate models. The study is supported by WACMOS-MED project of the European Space Agency.
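The correlation step described above, comparing an EOF time series with a climate index by Spearman rank correlation, can be sketched in plain Python. The series `pc1` and `nao_index` below are made-up illustrative values, not data from the study, and the EOF decomposition itself is not shown.

```python
def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical winter values of a leading EOF time series and a climate index.
pc1 = [0.8, -0.3, 1.2, -1.1, 0.1]
nao_index = [1.5, -0.2, 0.9, -1.4, 0.3]
rho = spearman(pc1, nao_index)
```

Because it works on ranks, the statistic measures monotonic rather than strictly linear association, which suits indices with skewed distributions.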
NATIONAL HYDROGRAPHY DATASET

EPA Science Inventory

Resource Purpose: The National Hydrography Dataset (NHD) is a comprehensive set of digital spatial data that contains information about surface water features such as lakes, ponds, streams, rivers, springs and wells. Within the NHD, surface water features are combined to fo...
Exploratory spatial data analysis of global MODIS active fire data

NASA Astrophysics Data System (ADS)

Oom, D.; Pereira, J. M. C.

2013-04-01

We performed an exploratory spatial data analysis (ESDA) of autocorrelation patterns in the NASA MODIS MCD14ML Collection 5 active fire dataset, for the period 2001-2009, at the global scale. The dataset was screened, resulting in an annual rate of false alarms and non-vegetation fires ranging from a minimum of 3.1% in 2003 to a maximum of 4.4% in 2001. Hot bare soils and gas flares were the major sources of false alarms and non-vegetation fires. The data were aggregated at 0.5° resolution for the global and local spatial autocorrelation analyses. Fire counts were found to be positively correlated up to distances of around 200 km, and negatively for larger distances. A value of 0.80 (p = 0.001, α = 0.05) for Moran's I indicates strong spatial autocorrelation between fires at global scale, with 60% of all cells displaying significant positive or negative spatial correlation. Different types of spatial autocorrelation were mapped and regression diagnostics allowed for the identification of spatial outlier cells, with fire counts much higher or lower than expected, considering their spatial context.
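For readers unfamiliar with the statistic, global Moran's I on a gridded count field can be sketched as follows. The binary rook-contiguity weights (the four edge-sharing neighbours) are an assumption chosen for simplicity; the abstract does not state which spatial weights the authors used, and the grids are toy data.

```python
def morans_i(grid):
    """Global Moran's I for a 2-D grid using binary rook-contiguity weights.

    I = (n / W) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2
    A constant grid has zero variance and would raise ZeroDivisionError.
    """
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    cross = 0.0
    weight_sum = 0
    for r in range(rows):
        for c in range(cols):
            # Visit the up/down/left/right neighbours that exist.
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross += (grid[r][c] - mean) * (grid[rr][cc] - mean)
                    weight_sum += 1
    return (n / weight_sum) * (cross / denom)

# Toy grids: a clustered pattern and a checkerboard pattern.
clustered = [[1, 1, 0, 0] for _ in range(4)]
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]
i_clustered = morans_i(clustered)        # positive: like values sit together
i_checkerboard = morans_i(checkerboard)  # negative: neighbours always differ
```

Values near +1 indicate clustering of similar values, values near -1 indicate a dispersed pattern, and values near 0 indicate no spatial structure.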

The emergence of spatial cyberinfrastructure.

PubMed

Wright, Dawn J; Wang, Shaowen

2011-04-05

Cyberinfrastructure integrates advanced computer, information, and communication technologies to empower computation-based and data-driven scientific practice and improve the synthesis and analysis of scientific data in a collaborative and shared fashion. As such, it now represents a paradigm shift in scientific research that has facilitated easy access to computational utilities and streamlined collaboration across distance and disciplines, thereby enabling scientific breakthroughs to be reached more quickly and efficiently. Spatial cyberinfrastructure seeks to resolve longstanding complex problems of handling and analyzing massive and heterogeneous spatial datasets as well as the necessity and benefits of sharing spatial data flexibly and securely. This article provides an overview and potential future directions of spatial cyberinfrastructure. The remaining four articles of the special feature are introduced and situated in the context of providing empirical examples of how spatial cyberinfrastructure is extending and enhancing scientific practice for improved synthesis and analysis of both physical and social science data. The primary focus of the articles is spatial analyses using distributed and high-performance computing, sensor networks, and other advanced information technology capabilities to transform massive spatial datasets into insights and knowledge.
The emergence of spatial cyberinfrastructure

PubMed Central

Wright, Dawn J.; Wang, Shaowen

2011-01-01

Cyberinfrastructure integrates advanced computer, information, and communication technologies to empower computation-based and data-driven scientific practice and improve the synthesis and analysis of scientific data in a collaborative and shared fashion. As such, it now represents a paradigm shift in scientific research that has facilitated easy access to computational utilities and streamlined collaboration across distance and disciplines, thereby enabling scientific breakthroughs to be reached more quickly and efficiently. Spatial cyberinfrastructure seeks to resolve longstanding complex problems of handling and analyzing massive and heterogeneous spatial datasets as well as the necessity and benefits of sharing spatial data flexibly and securely. This article provides an overview and potential future directions of spatial cyberinfrastructure. The remaining four articles of the special feature are introduced and situated in the context of providing empirical examples of how spatial cyberinfrastructure is extending and enhancing scientific practice for improved synthesis and analysis of both physical and social science data. The primary focus of the articles is spatial analyses using distributed and high-performance computing, sensor networks, and other advanced information technology capabilities to transform massive spatial datasets into insights and knowledge. PMID:21467227
Applications of spatial statistical network models to stream data

USGS Publications Warehouse

Isaak, Daniel J.; Peterson, Erin E.; Ver Hoef, Jay M.; Wenger, Seth J.; Falke, Jeffrey A.; Torgersen, Christian E.; Sowder, Colin; Steel, E. Ashley; Fortin, Marie-Josée; Jordan, Chris E.; Ruesch, Aaron S.; Som, Nicholas; Monestiez, Pascal

2014-01-01

Streams and rivers host a significant portion of Earth's biodiversity and provide important ecosystem services for human populations. Accurate information regarding the status and trends of stream resources is vital for their effective conservation and management. Most statistical techniques applied to data measured on stream networks were developed for terrestrial applications and are not optimized for streams. A new class of spatial statistical model, based on valid covariance structures for stream networks, can be used with many common types of stream data (e.g., water quality attributes, habitat conditions, biological surveys) through application of appropriate distributions (e.g., Gaussian, binomial, Poisson). The spatial statistical network models account for spatial autocorrelation (i.e., nonindependence) among measurements, which allows their application to databases with clustered measurement locations. Large amounts of stream data exist in many areas where spatial statistical analyses could be used to develop novel insights, improve predictions at unsampled sites, and aid in the design of efficient monitoring strategies at relatively low cost. We review the topic of spatial autocorrelation and its effects on statistical inference, demonstrate the use of spatial statistics with stream datasets relevant to common research and management questions, and discuss additional applications and development potential for spatial statistics on stream networks. Free software for implementing the spatial statistical network models has been developed that enables custom applications with many stream databases.
Fast multidimensional ensemble empirical mode decomposition for the analysis of big spatio-temporal datasets.

PubMed

Wu, Zhaohua; Feng, Jiaxin; Qiao, Fangli; Tan, Zhe-Min

2016-04-13

In this big data era, it is more urgent than ever to solve two major issues: (i) fast data transmission methods that can facilitate access to data from non-local sources and (ii) fast and efficient data analysis methods that can reveal the key information from the available data for particular purposes. Although approaches in different fields to address these two questions may differ significantly, the common part must involve data compression techniques and a fast algorithm. This paper introduces the recently developed adaptive and spatio-temporally local analysis method, namely the fast multidimensional ensemble empirical mode decomposition (MEEMD), for the analysis of a large spatio-temporal dataset. The original MEEMD uses ensemble empirical mode decomposition to decompose time series at each spatial grid and then pieces together the temporal-spatial evolution of climate variability and change on naturally separated timescales, which is computationally expensive. By taking advantage of the high efficiency of the expression using principal component analysis/empirical orthogonal function analysis for spatio-temporally coherent data, we design a lossy compression method for climate data to facilitate its non-local transmission. We also explain the basic principles behind the fast MEEMD through decomposing principal components instead of original grid-wise time series to speed up computation of MEEMD. Using a typical climate dataset as an example, we demonstrate that our newly designed methods can (i) compress data with a compression rate of one to two orders of magnitude; and (ii) speed up the MEEMD algorithm by one to two orders of magnitude. © 2016 The Authors.
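The compression idea can be written out as a short worked relation. The notation below (S grid points, T time steps, K retained components, spatial patterns e_k and principal components a_k) is introduced here for illustration and is not taken from the paper; the ratio is a rough storage count, not the authors' reported figure.

```latex
% Truncated PCA/EOF expansion of a space-time field X(s,t):
\[
  X(s,t) \;\approx\; \sum_{k=1}^{K} a_k(t)\, e_k(s), \qquad K \ll \min(S, T).
\]
% Storing K spatial patterns (S values each) and K principal components
% (T values each) instead of the full S-by-T field gives a storage ratio of
\[
  \frac{S\,T}{K\,(S + T)} .
\]
```

Under the same assumptions, decomposing the K series a_k(t) instead of all S grid-point series reduces the number of one-dimensional decompositions from S to K, which is consistent with the speed-up described above.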
Scale-dependent habitat use by a large free-ranging predator, the Mediterranean fin whale

NASA Astrophysics Data System (ADS)

Cotté, Cédric; Guinet, Christophe; Taupier-Letage, Isabelle; Mate, Bruce; Petiau, Estelle

2009-05-01

Since the heterogeneity of oceanographic conditions drives abundance, distribution, and availability of prey, it is essential to understand how foraging predators interact with their dynamic environment at various spatial and temporal scales. We examined the spatio-temporal relationships between oceanographic features and abundance of fin whales (Balaenoptera physalus), the largest free-ranging predator in the Western Mediterranean Sea (WM), through two independent approaches. First, spatial modeling was used to estimate whale density, using waiting distance (the distance between detections) for fin whales along ferry routes across the WM, in relation to remotely sensed oceanographic parameters. At a large scale (basin and year), fin whales exhibited fidelity to the northern WM with a summer-aggregated and winter-dispersed pattern. At mesoscale (20-100 km), whales were found in colder, saltier (from an on-board system) and dynamic areas defined by steep altimetric and temperature gradients. Second, using an independent fin whale satellite tracking dataset, we showed that tracked whales were effectively preferentially located in favorable habitats, i.e. in areas of high predicted densities as identified by our previous model using oceanographic data contemporaneous to the tracking period. We suggest that the large-scale fidelity corresponds to temporally and spatially predictable habitat of the whales' favorite prey, the northern krill (Meganyctiphanes norvegica), while mesoscale relationships are likely to identify areas of high prey concentration and availability.
Innovating Big Data Computing Geoprocessing for Analysis of Engineered-Natural Systems

NASA Astrophysics Data System (ADS)

Rose, K.; Baker, V.; Bauer, J. R.; Vasylkivska, V.

2016-12-01

Big data computing and analytical techniques offer opportunities to improve predictions about subsurface systems while quantifying and characterizing associated uncertainties from these analyses. Spatial analysis, big data and otherwise, of subsurface natural and engineered systems is based on variable resolution, discontinuous, and often point-driven data to represent continuous phenomena. We will present examples from two spatio-temporal methods that have been adapted for use with big datasets and big data geo-processing capabilities. The first approach uses regional earthquake data to evaluate spatio-temporal trends associated with natural and induced seismicity. The second algorithm, the Variable Grid Method (VGM), is a flexible approach that presents spatial trends and patterns, such as those resulting from interpolation methods, while simultaneously visualizing and quantifying uncertainty in the underlying spatial datasets. In this presentation we will show how we are utilizing Hadoop to store and perform spatial analyses to efficiently consume and utilize large geospatial data in these custom analytical algorithms through the development of custom Spark and MapReduce applications that incorporate ESRI Hadoop libraries. The team will present custom 'Big Data' geospatial applications that run on the Hadoop cluster and integrate with ESRI ArcMap with the team's probabilistic VGM approach. The VGM-Hadoop tool has been specially built as a multi-step MapReduce application running on the Hadoop cluster for the purpose of data reduction. This reduction is accomplished by generating multi-resolution, non-overlapping, attributed topology that is then further processed using ESRI's geostatistical analyst to convey a probabilistic model of a chosen study region.
Finally, we will share our approach for implementation of data reduction and topology generation via custom multi-step Hadoop applications, performance benchmarking comparisons, and Hadoop-centric opportunities for greater parallelization of geospatial operations.
Semantic attributes for people's appearance description: an appearance modality for video surveillance applications

NASA Astrophysics Data System (ADS)

Frikha, Mayssa; Fendri, Emna; Hammami, Mohamed

2017-09-01

Using semantic attributes such as gender, clothes, and accessories to describe people's appearance is an appealing modeling method for video surveillance applications. We proposed a midlevel appearance signature based on extracting a list of nameable semantic attributes describing the body in uncontrolled acquisition conditions. Conventional approaches extract the same set of low-level features to learn the semantic classifiers uniformly. Their critical limitation is the inability to capture the dominant visual characteristics for each trait separately. The proposed approach consists of extracting low-level features in an attribute-adaptive way by automatically selecting the most relevant features for each attribute separately. Furthermore, relying on a small training-dataset would easily lead to poor performance due to the large intraclass and interclass variations. We annotated large scale people images collected from different person reidentification benchmarks covering a large attribute sample and reflecting the challenges of uncontrolled acquisition conditions. These annotations were gathered into an appearance semantic attribute dataset that contains 3590 images annotated with 14 attributes. Various experiments prove that carefully designed features for learning the visual characteristics for an attribute provide an improvement of the correct classification accuracy and a reduction of both spatial and temporal complexities against state-of-the-art approaches.
Spatial and temporal synchrony in reptile population dynamics in variable environments.

PubMed

Greenville, Aaron C; Wardle, Glenda M; Nguyen, Vuong; Dickman, Chris R

2016-10-01

Resources are seldom distributed equally across space, but many species exhibit spatially synchronous population dynamics. Such synchrony suggests the operation of large-scale external drivers, such as rainfall or wildfire, or the influence of oasis sites that provide water, shelter, or other resources. However, testing the generality of these factors is not easy, especially in variable environments. Using a long-term dataset (13-22 years) from a large (8000 km²) study region in arid Central Australia, we tested firstly for regional synchrony in annual rainfall and the dynamics of six reptile species across nine widely separated sites. For species that showed synchronous spatial dynamics, we then used multivariate auto-regressive state-space (MARSS) models to test the prediction that regional rainfall would be positively associated with their populations. For asynchronous species, we used MARSS models to explore four other possible population structures: (1) populations were asynchronous, (2) differed between oasis and non-oasis sites, (3) differed between burnt and unburnt sites, or (4) differed between three sub-regions with different rainfall gradients. Only one species showed evidence of spatial population synchrony and our results provide little evidence that rainfall synchronizes reptile populations. The oasis or the wildfire hypotheses were the best-fitting models for the other five species. Thus, our six study species appear generally to be structured in space into one or two populations across the study region. Our findings suggest that for arid-dwelling reptile populations, spatial and temporal dynamics are structured by abiotic events, but individual responses to covariates at smaller spatial scales are complex and poorly understood.
Combining global land cover datasets to quantify agricultural expansion into forests in Latin America: Limitations and challenges

PubMed Central

Pendrill, Florence; Persson, U. Martin

2017-01-01

While we know that deforestation in the tropics is increasingly driven by commercial agriculture, most tropical countries still lack recent and spatially-explicit assessments of the relative importance of pasture and cropland expansion in causing forest loss. Here we present a spatially explicit quantification of the extent to which cultivated land and grassland expanded at the expense of forests across Latin America in 2001–2011, by combining two “state-of-the-art” global datasets (Global Forest Change forest loss and GlobeLand30-2010 land cover). We further evaluate some of the limitations and challenges in doing this. We find that this approach does capture some of the major patterns of land cover following deforestation, with GlobeLand30-2010’s Grassland class (which we interpret as pasture) being the most common land cover replacing forests across Latin America. However, our analysis also reveals some major limitations to combining these land cover datasets for quantifying pasture and cropland expansion into forest. First, a simple one-to-one translation between GlobeLand30-2010’s Cultivated land and Grassland classes into cropland and pasture respectively, should not be made without caution, as GlobeLand30-2010 defines its Cultivated land to include some pastures. Comparisons with the TerraClass dataset over the Brazilian Amazon and with previous literature indicate that Cultivated land in GlobeLand30-2010 includes notable amounts of pasture and other vegetation (e.g. in Paraguay and the Brazilian Amazon). This further suggests that the approach taken here generally leads to an underestimation (of up to ~60%) of the role of pasture in replacing forest.
Second, a large share (~33%) of the Global Forest Change forest loss is found to still be forest according to GlobeLand30-2010 and our analysis suggests that the accuracy of the combined datasets, especially for areas with heterogeneous land cover and/or small-scale forest loss, is still too poor for deriving accurate quantifications of land cover following forest loss. PMID:28704510
Combining global land cover datasets to quantify agricultural expansion into forests in Latin America: Limitations and challenges.

PubMed

Pendrill, Florence; Persson, U Martin

2017-01-01

While we know that deforestation in the tropics is increasingly driven by commercial agriculture, most tropical countries still lack recent and spatially-explicit assessments of the relative importance of pasture and cropland expansion in causing forest loss. Here we present a spatially explicit quantification of the extent to which cultivated land and grassland expanded at the expense of forests across Latin America in 2001-2011, by combining two "state-of-the-art" global datasets (Global Forest Change forest loss and GlobeLand30-2010 land cover). We further evaluate some of the limitations and challenges in doing this. We find that this approach does capture some of the major patterns of land cover following deforestation, with GlobeLand30-2010's Grassland class (which we interpret as pasture) being the most common land cover replacing forests across Latin America. However, our analysis also reveals some major limitations to combining these land cover datasets for quantifying pasture and cropland expansion into forest. First, a simple one-to-one translation between GlobeLand30-2010's Cultivated land and Grassland classes into cropland and pasture respectively, should not be made without caution, as GlobeLand30-2010 defines its Cultivated land to include some pastures. Comparisons with the TerraClass dataset over the Brazilian Amazon and with previous literature indicate that Cultivated land in GlobeLand30-2010 includes notable amounts of pasture and other vegetation (e.g. in Paraguay and the Brazilian Amazon). This further suggests that the approach taken here generally leads to an underestimation (of up to ~60%) of the role of pasture in replacing forest.
Second, a large share (~33%) of the Global Forest Change forest loss is found to still be forest according to GlobeLand30-2010 and our analysis suggests that the accuracy of the combined datasets, especially for areas with heterogeneous land cover and/or small-scale forest loss, is still too poor for deriving accurate quantifications of land cover following forest loss.
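The core overlay behind this kind of analysis, tabulating which land-cover class occupies each forest-loss cell, can be sketched on toy grids. The sketch assumes both layers have already been brought onto the same grid; the class labels are simplified stand-ins, and the function name is hypothetical rather than taken from the study.

```python
from collections import Counter

def land_cover_after_loss(forest_loss, land_cover):
    """Return the share of forest-loss cells falling in each land-cover class.

    forest_loss: 2-D grid of 0/1 flags (1 = forest loss detected).
    land_cover:  2-D grid of class labels on the same grid (an assumption;
                 real products need reprojection and resampling first).
    """
    counts = Counter(
        cover
        for loss_row, cover_row in zip(forest_loss, land_cover)
        for lost, cover in zip(loss_row, cover_row)
        if lost
    )
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}

# Toy example: four loss cells, one of which is still mapped as forest.
loss = [[1, 0, 1],
        [1, 1, 0]]
cover = [["grassland", "forest", "cultivated"],
         ["grassland", "forest", "forest"]]
shares = land_cover_after_loss(loss, cover)
```

In the toy output, the "forest" share among loss cells plays the role of the disagreement between the two products that the abstract reports.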
Evaluation of Different Phenological Information to Map Crop Rotation in Complex Irrigated Indus Basin

NASA Astrophysics Data System (ADS)

Ismaeel, A.; Zhou, Q.

2018-04-01

Accurate information on crop rotation in large basins is essential for policy decisions on land, water and nutrient resources around the world. Crop area estimation using low spatial resolution remote sensing data is challenging in a large heterogeneous basin having more than one cropping season. This study aims to evaluate the accuracy of two phenological datasets, individually and in combined form, to map crop rotations in the complex irrigated Indus basin without image segmentation. Phenology information derived from the Normalized Difference Vegetation Index (NDVI) and Leaf Area Index (LAI) of the Moderate Resolution Imaging Spectroradiometer (MODIS) sensor, having 8-day temporal and 1000 m spatial resolution, was used in the analysis. An unsupervised (temporal space clustering) to supervised (area knowledge and phenology behavior) classification approach was adopted to identify 13 crop rotations. Estimated crop area was compared with reported area collected by field census. Results reveal that the combined dataset (NDVI*LAI) performs better in mapping wheat-rice, wheat-cotton and wheat-fodder rotations by attaining root mean square error (RMSE) of 34.55, 16.84, 20.58 and mean absolute percentage error (MAPE) of 24.56 %, 36.82 %, 30.21 % for wheat, rice and cotton crops respectively. For sugarcane crop mapping, LAI produces good results by achieving RMSE of 8.60 and MAPE of 34.58 %, as compared to NDVI (10.08, 40.53 %) and NDVI*LAI (10.83, 39.45 %). The availability of major crop rotation statistics provides insight to develop better strategies for land, water and nutrient accounting frameworks to improve agriculture productivity.
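The two accuracy measures quoted above follow their standard definitions, sketched here for reference. The area values are invented for illustration; the abstract does not give the units of its RMSE figures or the underlying district-level numbers.

```python
import math

def rmse(estimated, reported):
    """Root mean square error between estimated and reported values."""
    squared = [(e - r) ** 2 for e, r in zip(estimated, reported)]
    return math.sqrt(sum(squared) / len(squared))

def mape(estimated, reported):
    """Mean absolute percentage error, in percent of the reported values.

    Reported values must be non-zero.
    """
    ratios = [abs((e - r) / r) for e, r in zip(estimated, reported)]
    return 100.0 * sum(ratios) / len(ratios)

# Hypothetical crop areas for two districts (units arbitrary).
estimated_area = [110.0, 90.0]
census_area = [100.0, 100.0]
error_rmse = rmse(estimated_area, census_area)
error_mape = mape(estimated_area, census_area)
```

RMSE is expressed in the units of the area data and is dominated by large districts, whereas MAPE is relative, which is why a crop can rank differently on the two measures.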
Hydrogeophysical Cyberinfrastructure For Real-Time Interactive Browser Controlled Monitoring Of Near Surface Hydrology: Results Of A 13 Month Monitoring Effort At The Hanford 300 Area

NASA Astrophysics Data System (ADS)

Versteeg, R. J.; Johnson, T.; Henrie, A.; Johnson, D.

2013-12-01

The Hanford 300 Area, located adjacent to the Columbia River in south-central Washington, USA, is the site of former research and uranium fuel rod fabrication facilities. Waste disposal practices at the site included discharging between 33 and 59 metric tons of uranium over a 40 year period into shallow infiltration galleries, resulting in persistent uranium contamination within the vadose and saturated zones. Uranium transport from the vadose zone to the saturated zone is intimately linked with water table fluctuations and river water driven by upstream dam operations. Different remedial efforts have occurred at the site to address uranium contamination. Numerous investigations are occurring at the site, both to investigate remedial performance and to increase the understanding of uranium dynamics. Several of these studies include acquisition of large hydrological and time lapse electrical geophysical data sets. Such datasets contain large amounts of information on hydrological processes. There are substantial challenges in how to effectively deal with the data volumes of such datasets, how to process such datasets and how to provide users with the ability to effectively access and synergize the hydrological information contained in raw and processed data. These challenges motivated the development of a cloud based cyberinfrastructure for dealing with large electrical hydrogeophysical datasets. This cyberinfrastructure is modular and extensible and includes data management, data processing, visualization and result mining capabilities. Specifically, it provides for data transmission to a central server, data parsing in a relational database and processing of the data using a PNNL-developed parallel inversion code on either dedicated or commodity compute clusters.
Access to results is done through a browser with interactive tools allowing for generation of on demand visualization of the inversion results as well as interactive data mining and statistical calculation. This infrastructure was used for the acquisition and processing of an electrical geophysical timelapse survey which was collected over a highly instrumented field site in the Hanford 300 Area. Over a 13 month period between November 2011 and December 2012 1823 timelapse datasets were collected (roughly 5 datasets a day for a total of 23 million individual measurements) on three parallel resistivity lines of 30 m each with 0.5 meter electrode spacing. In addition, hydrological and environmental data were collected from dedicated and general purpose sensors. This dataset contains rich information on near surface processes on a range of different spatial and temporal scales (ranging from hourly to seasonal). We will show how this cyberinfrastructure was used to manage and process this dataset and how the cyberinfrastructure can be used to access, mine and visualize the resulting data and information.
The Stream-Catchment (StreamCat) and Lake-Catchment ...

EPA Pesticide Factsheets

Background/Question/Methods: Lake and stream conditions respond to both natural and human-related landscape features. Characterizing these features within contributing areas (i.e., delineated watersheds) of streams and lakes could improve our understanding of how biological conditions vary spatially and improve the use, management, and restoration of these aquatic resources. However, the specialized geospatial techniques required to define and characterize stream and lake watersheds have limited their widespread use in both scientific and management efforts at large spatial scales. We developed the StreamCat and LakeCat Datasets to model, predict, and map the probable biological conditions of streams and lakes across the conterminous US (CONUS). Both StreamCat and LakeCat contain watershed-level characterizations of several hundred natural (e.g., soils, geology, climate, and land cover) and anthropogenic (e.g., urbanization, agriculture, mining, and forest management) landscape features for ca. 2.6 million stream segments and 376,000 lakes across the CONUS, respectively. These datasets can be paired with field samples to provide independent variables for modeling and other analyses. We paired 1,380 stream and 1,073 lake samples from the USEPA's National Aquatic Resource Surveys with StreamCat and LakeCat and used random forest (RF) to model and then map an invertebrate condition index and chlorophyll a concentration, respectively. Results/Conclusions: The invertebrate
Uncertainty of future projections of species distributions in mountainous regions.

PubMed

Tang, Ying; Winkler, Julie A; Viña, Andrés; Liu, Jianguo; Zhang, Yuanbin; Zhang, Xiaofeng; Li, Xiaohong; Wang, Fang; Zhang, Jindong; Zhao, Zhiqiang

2018-01-01

Multiple factors introduce uncertainty into projections of species distributions under climate change. The uncertainty introduced by the choice of baseline climate information used to calibrate a species distribution model and to downscale global climate model (GCM) simulations to a finer spatial resolution is a particular concern for mountainous regions, as the spatial resolution of climate observing networks is often insufficient to detect the steep climatic gradients in these areas. Using the maximum entropy (MaxEnt) modeling framework together with occurrence data on 21 understory bamboo species distributed across the mountainous geographic range of the Giant Panda, we examined the differences in projected species distributions obtained from two contrasting sources of baseline climate information, one derived from spatial interpolation of coarse-scale station observations and the other derived from fine-spatial resolution satellite measurements. For each bamboo species, the MaxEnt model was calibrated separately for the two datasets and applied to 17 GCM simulations downscaled using the delta method. Greater differences in the projected spatial distributions of the bamboo species were observed for the models calibrated using the different baseline datasets than between the different downscaled GCM simulations for the same calibration. In terms of the projected future climatically-suitable area by species, quantification using a multi-factor analysis of variance suggested that the sum of the variance explained by the baseline climate dataset used for model calibration and the interaction between the baseline climate data and the GCM simulation via downscaling accounted for, on average, 40% of the total variation among the future projections. 
Our analyses illustrate that the combined use of gridded datasets developed from station observations and satellite measurements can help estimate the uncertainty introduced by the choice of baseline climate information to the projected changes in species distribution.
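The delta (change-factor) method named above can be sketched in a few lines. This is the generic textbook form, applied cell by cell to values assumed to be already on a common fine grid; the study's exact variant, including how coarse GCM anomalies were interpolated, is not described in the abstract, and the function name and numbers are illustrative.

```python
def delta_downscale(baseline, gcm_hist, gcm_future, multiplicative=False):
    """Apply the GCM-simulated change to an observed fine-scale baseline.

    Additive form (typical for temperature): obs + (future - historical).
    Multiplicative form (typical for precipitation): obs * (future / historical).
    All three inputs are equal-length sequences of co-located cell values.
    """
    out = []
    for obs, hist, fut in zip(baseline, gcm_hist, gcm_future):
        if multiplicative:
            out.append(obs * (fut / hist))
        else:
            out.append(obs + (fut - hist))
    return out

# Two fine-scale cells sharing one coarse GCM cell that warms by 2 degrees.
baseline_temp = [5.0, 7.5]
future_temp = delta_downscale(baseline_temp, [10.0, 10.0], [12.0, 12.0])
```

Because the fine-scale pattern comes entirely from the baseline, two different baselines produce different future fields under the same GCM change, which is the sensitivity the abstract quantifies.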
Uncertainty of future projections of species distributions in mountainous regions

PubMed Central

Tang, Ying; Winkler, Julie A.; Viña, Andrés; Liu, Jianguo; Zhang, Yuanbin; Zhang, Xiaofeng; Li, Xiaohong; Wang, Fang; Zhang, Jindong; Zhao, Zhiqiang

2018-01-01

Multiple factors introduce uncertainty into projections of species distributions under climate change. The uncertainty introduced by the choice of baseline climate information used to calibrate a species distribution model and to downscale global climate model (GCM) simulations to a finer spatial resolution is a particular concern for mountainous regions, as the spatial resolution of climate observing networks is often insufficient to detect the steep climatic gradients in these areas. Using the maximum entropy (MaxEnt) modeling framework together with occurrence data on 21 understory bamboo species distributed across the mountainous geographic range of the Giant Panda, we examined the differences in projected species distributions obtained from two contrasting sources of baseline climate information, one derived from spatial interpolation of coarse-scale station observations and the other derived from fine-spatial resolution satellite measurements. For each bamboo species, the MaxEnt model was calibrated separately for the two datasets and applied to 17 GCM simulations downscaled using the delta method. Greater differences in the projected spatial distributions of the bamboo species were observed for the models calibrated using the different baseline datasets than between the different downscaled GCM simulations for the same calibration. In terms of the projected future climatically-suitable area by species, quantification using a multi-factor analysis of variance suggested that the sum of the variance explained by the baseline climate dataset used for model calibration and the interaction between the baseline climate data and the GCM simulation via downscaling accounted for, on average, 40% of the total variation among the future projections. 
Our analyses illustrate that the combined use of gridded datasets developed from station observations and satellite measurements can help estimate the uncertainty introduced by the choice of baseline climate information to the projected changes in species distribution. PMID:29320501
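The delta method named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name, the additive/multiplicative switch, and every number are hypothetical. It assumes the common convention of adding the GCM change signal for temperature and scaling by the GCM ratio for precipitation.

```python
def delta_downscale(baseline, gcm_hist, gcm_future, multiplicative=False):
    """Apply a coarse GCM change signal to a fine-resolution baseline value.

    baseline   : fine-resolution baseline climate value for one grid cell
    gcm_hist   : GCM value for the historical period at that location
    gcm_future : GCM value for the future period at that location
    """
    if multiplicative:
        if gcm_hist == 0:
            # Hypothetical fallback: leave the baseline unchanged rather than divide by zero.
            return baseline
        return baseline * (gcm_future / gcm_hist)
    return baseline + (gcm_future - gcm_hist)


# Hypothetical values for a single mountain grid cell (degrees C).
station_baseline = 8.0     # interpolated from coarse station observations
satellite_baseline = 6.5   # derived from fine-resolution satellite data
gcm_hist, gcm_future = 10.0, 12.5

future_from_station = delta_downscale(station_baseline, gcm_hist, gcm_future)
future_from_satellite = delta_downscale(satellite_baseline, gcm_hist, gcm_future)

# The GCM contributes the same +2.5 to both, so the gap between baselines carries through.
print(future_from_station, future_from_satellite)
```

Because the change signal is shared, any disagreement between the two baselines is passed unchanged into the downscaled future climate, which is one way the choice of baseline can matter as much as the choice of GCM.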
Comparison of two spatially-resolved fossil fuel CO2 emissions inventories at the urban scale in four US cities

NASA Astrophysics Data System (ADS)

Liang, J.; Gurney, K. R.; O'Keeffe, D.; Patarasuk, R.; Hutchins, M.; Rao, P.

2017-12-01

Spatially-resolved fossil fuel CO2 (FFCO2) emissions are used not only in complex atmospheric modeling systems as prior scenarios to simulate concentrations of CO2 in the atmosphere, but also to improve understanding of relationships with socioeconomic factors in support of sustainability policymaking. We present a comparison of ODIAC, a top-down global gridded FFCO2 emissions dataset, and Hestia, a bottom-up FFCO2 emissions dataset, in four US cities: Los Angeles, Indianapolis, Salt Lake City and Baltimore City. ODIAC was developed by downscaling national total emissions to 1km-by-1km grid cells using satellite nightlight imagery as a proxy. Hestia was built from the ground up by allocating sector-specific county-level emissions to urban-level spatial surrogates including facility locations, road maps, building footprints/parcels, railroad maps and shipping lanes. The differences in methodology and data sources could lead to large discrepancies in FFCO2 estimates at the urban scale, and these discrepancies need to be taken into account in conducting atmospheric modeling or socioeconomic analysis. This comparison work is aimed at quantifying the statistical and spatial differences between the two FFCO2 inventories. An analysis of the differences in total emissions, spatial distribution and statistical distribution resulted in the following findings: (1) ODIAC agrees well with Hestia in total FFCO2 emissions estimates across the four cities, with differences ranging from 3% to 20%; (2) Small-scale areal and linear spatial features such as roads and buildings are either entirely missing or not very well represented in ODIAC, since nightlight imagery might not be able to capture this information. This might further lead to underestimated on-road FFCO2 emissions in ODIAC; (3) The statistical distribution of ODIAC is more concentrated around the mean, with far fewer samples in the lower range. 
These phenomena could result from the nightlight halo and saturation effects; (4) The grid-cell cumulative emissions of ODIAC appear to be in good agreement with those of Hestia, implying the two inventories have similar overall spatial structures at the city scale.
Evaluating a Local Ensemble Transform Kalman Filter snow cover data assimilation method to estimate SWE within a high-resolution hydrologic modeling framework across Western US mountainous regions

NASA Astrophysics Data System (ADS)

Oaida, C. M.; Andreadis, K.; Reager, J. T., II; Famiglietti, J. S.; Levoe, S.

2017-12-01

Accurately estimating how much snow water equivalent (SWE) is stored in mountainous regions characterized by complex terrain and snowmelt-driven hydrologic cycles is not only greatly desirable, but also a big challenge. Mountain snowpack exhibits high spatial variability across a broad range of spatial and temporal scales due to a multitude of physical and climatic factors, making it difficult to observe or estimate in its entirety. Combining remotely sensed data and high resolution hydrologic modeling through data assimilation (DA) has the potential to provide a spatially and temporally continuous SWE dataset at horizontal scales that capture sub-grid snow spatial variability and are also relevant to stakeholders such as water resource managers. Here, we present the evaluation of a new snow DA approach that uses a Local Ensemble Transform Kalman Filter (LETKF) in tandem with the Variable Infiltration Capacity macro-scale hydrologic model across the Western United States, at a daily temporal resolution, and a horizontal resolution of 1.75 km x 1.75 km. The LETKF is chosen for its relative simplicity, ease of implementation, and computational efficiency and scalability. The modeling/DA system assimilates daily MODIS Snow Covered Area and Grain Size (MODSCAG) fractional snow cover, and has been developed to efficiently calculate SWE estimates over extended periods of time and covering large regional-scale areas at relatively high spatial resolution, ultimately producing a snow reanalysis-type dataset. Here we focus on the assessment of SWE produced by the DA scheme over several basins in California's Sierra Nevada Mountain range where Airborne Snow Observatory data is available, during the last five water years (2013-2017), which include both one of the driest and one of the wettest years. 
Comparison against such a spatially distributed SWE observational product provides a greater understanding of the model's ability to estimate SWE and SWE spatial variability, and highlights under which conditions snow cover DA can add value in estimating SWE.
Evaluating Temporal Consistency in Marine Biodiversity Hotspots.

PubMed

Piacenza, Susan E; Thurman, Lindsey L; Barner, Allison K; Benkwitt, Cassandra E; Boersma, Kate S; Cerny-Chipman, Elizabeth B; Ingeman, Kurt E; Kindinger, Tye L; Lindsley, Amy J; Nelson, Jake; Reimer, Jessica N; Rowe, Jennifer C; Shen, Chenchen; Thompson, Kevin A; Heppell, Selina S

2015-01-01

With the ongoing crisis of biodiversity loss and limited resources for conservation, the concept of biodiversity hotspots has been useful in determining conservation priority areas. However, there has been limited research into how temporal variability in biodiversity may influence conservation area prioritization. To address this information gap, we present an approach to evaluate the temporal consistency of biodiversity hotspots in large marine ecosystems. Using a large scale, public monitoring dataset collected over an eight year period off the US Pacific Coast, we developed a methodological approach for avoiding biases associated with hotspot delineation. We aggregated benthic fish species data from research trawls and calculated mean hotspot thresholds for fish species richness and Shannon's diversity indices over the eight year dataset. We used a spatial frequency distribution method to assign hotspot designations to the grid cells annually. We found no areas containing consistently high biodiversity through the entire study period based on the mean thresholds, and no grid cell was designated as a hotspot for greater than 50% of the time-series. To test if our approach was sensitive to sampling effort and the geographic extent of the survey, we followed a similar routine for the northern region of the survey area. Our finding of low consistency in benthic fish biodiversity hotspots over time was upheld, regardless of biodiversity metric used, whether thresholds were calculated per year or across all years, or the spatial extent for which we calculated thresholds and identified hotspots. Our results suggest that static measures of benthic fish biodiversity off the US West Coast are insufficient for identification of hotspots and that long-term data are required to appropriately identify patterns of high temporal variability in biodiversity for these highly mobile taxa. 
Given that ecological communities are responding to a changing climate and other environmental perturbations, our work highlights the need for scientists and conservation managers to consider both spatial and temporal dynamics when designating biodiversity hotspots.
Evaluating Temporal Consistency in Marine Biodiversity Hotspots

PubMed Central

Barner, Allison K.; Benkwitt, Cassandra E.; Boersma, Kate S.; Cerny-Chipman, Elizabeth B.; Ingeman, Kurt E.; Kindinger, Tye L.; Lindsley, Amy J.; Nelson, Jake; Reimer, Jessica N.; Rowe, Jennifer C.; Shen, Chenchen; Thompson, Kevin A.; Heppell, Selina S.

2015-01-01

With the ongoing crisis of biodiversity loss and limited resources for conservation, the concept of biodiversity hotspots has been useful in determining conservation priority areas. However, there has been limited research into how temporal variability in biodiversity may influence conservation area prioritization. To address this information gap, we present an approach to evaluate the temporal consistency of biodiversity hotspots in large marine ecosystems. Using a large scale, public monitoring dataset collected over an eight year period off the US Pacific Coast, we developed a methodological approach for avoiding biases associated with hotspot delineation. We aggregated benthic fish species data from research trawls and calculated mean hotspot thresholds for fish species richness and Shannon’s diversity indices over the eight year dataset. We used a spatial frequency distribution method to assign hotspot designations to the grid cells annually. We found no areas containing consistently high biodiversity through the entire study period based on the mean thresholds, and no grid cell was designated as a hotspot for greater than 50% of the time-series. To test if our approach was sensitive to sampling effort and the geographic extent of the survey, we followed a similar routine for the northern region of the survey area. Our finding of low consistency in benthic fish biodiversity hotspots over time was upheld, regardless of biodiversity metric used, whether thresholds were calculated per year or across all years, or the spatial extent for which we calculated thresholds and identified hotspots. Our results suggest that static measures of benthic fish biodiversity off the US West Coast are insufficient for identification of hotspots and that long-term data are required to appropriately identify patterns of high temporal variability in biodiversity for these highly mobile taxa. 
Given that ecological communities are responding to a changing climate and other environmental perturbations, our work highlights the need for scientists and conservation managers to consider both spatial and temporal dynamics when designating biodiversity hotspots. PMID:26200354
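The hotspot-consistency idea described in the abstract can be sketched as below. This is an illustrative simplification and not the authors' workflow: the function names, the grid-cell labels, the richness values and the fixed threshold of 10 are all hypothetical, and the abstract does not state how its thresholds were defined.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over species with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def hotspot_frequency(yearly_values, threshold):
    """Fraction of years in which each grid cell meets or exceeds the threshold.

    yearly_values : list of dicts, one per year, mapping cell id -> biodiversity metric
    """
    n_years = len(yearly_values)
    cells = yearly_values[0].keys()
    return {
        cell: sum(1 for year in yearly_values if year[cell] >= threshold) / n_years
        for cell in cells
    }


# Hypothetical species richness for three grid cells over four survey years.
richness_by_year = [
    {"A": 12, "B": 7, "C": 9},
    {"A": 6, "B": 11, "C": 9},
    {"A": 8, "B": 7, "C": 13},
    {"A": 11, "B": 6, "C": 8},
]

frequency = hotspot_frequency(richness_by_year, threshold=10)
# A cell counts as a consistent hotspot here only if it qualifies in more than half the years.
consistent = [cell for cell, f in frequency.items() if f > 0.5]
print(frequency, consistent)
```

In this toy input every cell is a hotspot in at least one year, yet none qualifies in more than half of the years, which mirrors the kind of low temporal consistency the abstract reports.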
Mapping permafrost change hot-spots with Landsat time-series

NASA Astrophysics Data System (ADS)

Grosse, G.; Nitze, I.

2016-12-01

Recent and projected future climate warming strongly affects permafrost stability over large parts of the terrestrial Arctic with local, regional and global scale consequences. The monitoring and quantification of permafrost and associated land surface changes in these areas is crucial for the analysis of hydrological and biogeochemical cycles as well as vegetation and ecosystem dynamics. However, detailed knowledge of the spatial distribution and the temporal dynamics of these processes is scarce and likely key locations of permafrost landscape dynamics may remain unnoticed. As part of the ERC funded PETA-CARB and ESA GlobPermafrost projects, we developed an automated processing chain based on data from the entire Landsat archive (excluding MSS) for the detection of permafrost change related processes and hotspots. The automated method enables us to analyze thousands of Landsat scenes, which allows for a multi-scaled spatio-temporal analysis at 30 meter spatial resolution. All necessary processing steps are carried out automatically with minimal user interaction, including data extraction, masking, reprojection, subsetting, data stacking, and calculation of multi-spectral indices. These indices, e.g. Landsat Tasseled Cap and NDVI among others, are used as proxies for land surface conditions, such as vegetation status, moisture or albedo. Finally, a robust trend analysis is applied to each multi-spectral index and each pixel over the entire observation period of up to 30 years from 1985 to 2015, depending on data availability. Large transects of around 2 million km² across different permafrost types in Siberia and North America have been processed. Permafrost related or influencing landscape dynamics were detected within the trend analysis, including thermokarst lake dynamics, fires, thaw slumps, and coastal dynamics. The produced datasets will be distributed to the community as part of the ERC PETA-CARB and ESA GlobPermafrost projects. 
Users are encouraged to provide feedback and ground truth data for a continuous improvement of our methodology and datasets, which will lead to a better understanding of the spatial and temporal distribution of changes within the vulnerable permafrost zone.

Detecting vegetation cover change on the summit of Cadillac Mountain using multi-temporal remote sensing datasets: 1979, 2001, and 2007.

PubMed

Kim, Min-Kook; Daigle, John J

2011-09-01

This study examines the efficacy of management strategies implemented in 2000 to reduce visitor-induced vegetation impact and enhance vegetation recovery at the summit loop trail on Cadillac Mountain at Acadia National Park, Maine. Using single-spectral high-resolution remote sensing datasets captured in 1979, 2001, and 2007, pre-classification change detection analysis techniques were applied to measure fractional vegetation cover changes between the time periods. This popular sub-alpine summit with low-lying vegetation and attractive granite outcroppings experiences dispersed visitor use away from the designated trail, so three pre-defined spatial scales (small, 0-30 m; medium, 0-60 m; and large, 0-90 m) were examined in the vicinity of the summit loop trail with visitor use (experimental site) and a site chosen nearby in a relatively pristine undisturbed area (control site) with similar spatial scales. Results reveal significant changes in terms of rates of vegetation impact between 1979 and 2001 extending out to 90 m from the summit loop trail with no management at the site. No significant differences were detected among three spatial zones (inner, 0-30 m; middle, 30-60 m; and outer, 60-90 m) at the experimental site, but all were significantly higher rates of impact compared to similar spatial scales at the control site (all p < 0.001). In contrast, significant changes in rates of recovery between 2001 and 2007 were observed in the medium and large spatial scales at the experimental site under management as compared to the control site (all p < 0.05). Also during this later period a higher rate of recovery was observed in the outer zone as compared to the inner zone at the experimental site (p < 0.05). The overall study results suggest a trend in the desired direction for the site and visitor management strategies designed to reduce vegetation impact and enhance vegetation recovery at the summit loop trail of Cadillac Mountain since 2000. 
However, the vegetation recovery has been rather minimal and did not reach the level of cover observed during the 1979 time period. In addition, the advantages and some limitations of using remote sensing technologies are discussed in detecting vegetation change in this setting and potential application to other recreation settings.
Relationships Between Fire and Land Use Change in the Brazilian Amazon Based on Satellite Data

NASA Astrophysics Data System (ADS)

Fanin, T.; van der Werf, G.

2014-12-01

Fires are used as a tool in the process of deforestation. The relationship between fire and deforestation varies temporally and spatially according to the type of deforestation and climatic conditions. This study evaluates spatiotemporal variability between fire and deforestation over the 2002-2012 period in the Brazilian Legal Amazon (BLA). We based our study on four datasets: deforestation estimates from PRODES (Amazon Deforestation Monitoring Project) and forest cover loss from the Global Forest Change (GFC) project based on Landsat data, and burned area and land cover based on Moderate Resolution Imaging Spectroradiometer (MODIS) data. While GFC and PRODES supported similar findings on spatial and temporal dynamics, the Landsat-scale comparison also highlighted a number of differences. Both datasets show a decrease after 2004 in forest loss or deforestation extent, mainly from decreasing clearing rates in evergreen broadleaf forest, mostly in the states of Mato Grosso and Rondonia. However, the drop is larger and more gradual in PRODES than in GFC, with the former having less than half the forest loss of the latter. GFC indicates anomalously high forest loss in the years 2007 and 2010 not seen in PRODES. Rescaling these forest dynamics datasets to 500-meter resolution allowed for a comparison against the MODIS datasets. The burned area data indicate that the mismatch between PRODES and GFC is largely related to increased fire occurrence during these dry years, mainly in Para. In addition, they indicate that the time interval between deforestation and fire differs according to land cover, which is important when estimating the atmospheric impact of forest loss. We found that evergreen broadleaf forests are burned shortly after deforestation due to slash and burn techniques, while croplands have longer intervals depending on the crop variety. As a final step, we used these insights to better quantify carbon emissions from this region.
Influence of sub-kilometer precipitation datasets on simulated snowpack and glacier winter balance in alpine terrain.

NASA Astrophysics Data System (ADS)

Vionnet, Vincent; Six, Delphine; Auger, Ludovic; Lafaysse, Matthieu; Quéno, Louis; Réveillet, Marion; Dombrowski-Etchevers, Ingrid; Thibert, Emmanuel; Dumont, Marie

2017-04-01

Capturing spatial and temporal variabilities of meteorological conditions at fine scale is necessary for modelling snowpack and glacier winter mass balance in alpine terrain. In particular, precipitation amount and phase are strongly influenced by the complex topography. In this study, we assess the impact of three sub-kilometer precipitation datasets (rainfall and snowfall) on distributed simulations of snowpack and glacier winter mass balance with the detailed snowpack model Crocus for winter 2011-2012. The different precipitation datasets at 500-m grid spacing over part of the French Alps (200 × 200 km² area) come from (i) the SAFRAN precipitation analysis specially developed for alpine terrain, (ii) operational outputs of the atmospheric model AROME at 2.5-km grid spacing downscaled to 500 m with a fixed lapse rate, or (iii) a version of the atmospheric model AROME at 500-m grid spacing. Other atmospheric forcings (air temperature and humidity, incoming longwave and shortwave radiation, wind speed) are taken from the AROME simulations at 500-m grid spacing. These atmospheric forcings are first compared against a network of automatic weather stations. Results are analysed with respect to station location (valley, mid- and high-altitude). The spatial pattern of seasonal snowfall and its dependence on elevation are then analysed for the different precipitation datasets. Large differences between SAFRAN and the two versions of AROME are found at high altitude. Finally, results of Crocus snowpack simulations are evaluated against (i) point in-situ measurements of snow depth and snow water equivalent, and (ii) maps of snow covered areas retrieved from optical satellite data (MODIS). Measurements of winter accumulation of six glaciers of the French Alps are also used and provide very valuable information on precipitation at high altitude, where the conventional observation network is sparse. 
This study illustrates the potential and limitations of high-resolution atmospheric models to drive simulations of snowpack and glacier winter mass balance in alpine terrain.
Landscape epidemiology in urban environments: The example of rodent-borne Trypanosoma in Niamey, Niger.

PubMed

Rossi, Jean-Pierre; Kadaouré, Ibrahima; Godefroid, Martin; Dobigny, Gauthier

2017-10-05

Trypanosomes are protozoan parasites found worldwide, infecting humans and animals. In the past decade, the number of reports on atypical human cases due to Trypanosoma lewisi or T. lewisi-like parasites has increased, making it urgent to investigate the multiple factors driving the disease dynamics, particularly in cities where rodents and humans co-exist at high densities. In the present survey, we used a species distribution model, Maxent, to assess the spatial pattern of Trypanosoma-positive rodents in the city of Niamey. The explanatory variables were landscape metrics describing urban landscape composition and physiognomy computed from 8 land-cover classes. We computed the metrics around each data location using a set of circular buffers of increasing radii (20 m, 40 m, 60 m, 80 m and 100 m). For each spatial resolution, we determined the optimal combination of feature class and regularization multipliers by fitting Maxent with the full dataset. Since our dataset was small (114 occurrences), we expected substantial uncertainty associated with partitioning the data into calibration and evaluation datasets. We thus performed 350 independent model runs with a training dataset representing a random subset of 80% of the occurrences and the optimal Maxent parameters. Each model yielded a map of habitat suitability over Niamey, which was transformed into a binary map implementing a threshold maximizing the sensitivity and the specificity. The resulting binary maps were combined to display the proportion of models that indicated a good environmental suitability for Trypanosoma-positive rodents. Maxent performed better with landscape metrics derived from buffers of 80 m. Habitat suitability for Trypanosoma-positive rodents exhibited large patches linked to urban features such as patch richness and the proportion of landscape covered by concrete or tarred areas. 
Such inferences could be helpful in assessing areas at risk, setting of monitoring programs, public and medical staff awareness or even vaccination campaigns. Copyright © 2017 Elsevier B.V. All rights reserved.
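The thresholding and map-combination steps described in the abstract can be sketched as follows. This is a hedged illustration that does not call Maxent itself: the function names and all scores are hypothetical, and it assumes that specificity is computed against background (pseudo-absence) points, which the abstract does not specify.

```python
def best_threshold(presence_scores, background_scores):
    """Return the suitability threshold that maximizes sensitivity + specificity.

    Candidate thresholds are the observed scores; ties keep the lowest threshold.
    """
    candidates = sorted(set(presence_scores) | set(background_scores))
    best, best_sum = None, -1.0
    for thr in candidates:
        sensitivity = sum(s >= thr for s in presence_scores) / len(presence_scores)
        specificity = sum(s < thr for s in background_scores) / len(background_scores)
        if sensitivity + specificity > best_sum:
            best, best_sum = thr, sensitivity + specificity
    return best


def to_binary(suitability_map, threshold):
    """Turn a list of continuous suitability values into 0/1 values."""
    return [1 if value >= threshold else 0 for value in suitability_map]


def suitability_proportion(binary_maps):
    """Per-cell proportion of model runs that classify the cell as suitable."""
    n_runs = len(binary_maps)
    n_cells = len(binary_maps[0])
    return [sum(m[i] for m in binary_maps) / n_runs for i in range(n_cells)]


# Hypothetical scores from one model run.
threshold = best_threshold([0.8, 0.7, 0.9, 0.4], [0.1, 0.3, 0.5, 0.2])

# Hypothetical binary maps (three cells each) from four independent runs.
runs = [[1, 0, 1], [1, 1, 0], [1, 0, 0], [1, 0, 1]]
print(threshold, suitability_proportion(runs))
```

Combining the binary maps as a proportion, rather than averaging the raw suitability scores, gives each run one vote per cell, so the result reads directly as the share of runs that agree a location is suitable.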
Social dataset analysis and mapping tools for Risk Perception: resilience, people preparation and communication tools

NASA Astrophysics Data System (ADS)

Peters-Guarin, Graciela; Garcia, Carolina; Frigerio, Simone

2010-05-01

Perception has been identified as a resource and part of the resilience of a community to disasters. Risk perception, if present, may determine the potential damage a household or community experiences. Different levels of risk perception and preparedness can directly influence people's susceptibility and the way they might react in case of an emergency caused by natural hazards. In spite of the profuse literature about risk perception, works that spatially portray this feature are scarce. The spatial relationship to danger or hazard is being recognised as an important factor of the risk equation; it can be used as a powerful tool either for better knowledge or for operational reasons (e.g. management of preventive information). Risk perception and people's awareness, when displayed in a spatial format, can be useful for several actors in the risk management arena. Local authorities and civil protection can better address educational activities to increase the preparation of particularly vulnerable groups or clusters of households within a community. It can also be useful for emergency personnel in order to optimally direct actions in case of an emergency. In the framework of the Marie Curie Research Project, a Community Based Early Warning System (CBEWS) is being developed in the Mountain Community Valtellina of Tirano, northern Italy. This community has been continuously exposed to different mass movements and floods, in particular a large event in 1987 which affected a large portion of the valley and left 58 dead. The current emergency plan for the study area consists of a real-time, highly detailed decision support system. This emergency plan contains detailed instructions for the rapid deployment of civil protection and other emergency personnel in case of emergency, for previously defined risk scenarios. 
Especially in case of a large event, where timely reaction is crucial for reducing casualties, it is important for those in charge of emergency management to know in advance the different levels of risk perception and preparedness existing among several sectors of the population. Knowing where the most vulnerable population is located may optimize the use of resources, better direct the initial efforts and organize the evacuation and attention procedures. As part of the CBEWS, a comprehensive survey was applied in the study area to measure, among other features, the levels of risk perception, preparation and information received about natural hazards. After a statistical and direct analysis of the complete social dataset recorded, a spatial information distribution is currently in progress. Based on boundary features (municipalities and sub-districts) from the Italian Institute of Statistics (ISTAT), a local scale background has been established (private address-level data are not accessible under privacy rules, so the local district ID inside each municipality is the level of detail used) and a spatial location of the surveyed population has been completed. The geometric component has been defined and it is now possible to create a local distribution of social parameters derived from the perception questionnaire results. A lot of raw information and social-statistical analysis offer different mirrors and "visual concepts" of risk perception. For this reason a complete GeoDB is under development for the full organization of the dataset. From a technical point of view, the environment for data sharing is based on a complete open source web-service environment, to offer a manually built and user-friendly interface to this kind of information. The final aim is to offer different switches of the dataset, using the same scale prototype and hierarchical data structure, to provide and compare the social location of risk perception at the most detailed level.
Object classification and outliers analysis in the forthcoming Gaia mission

NASA Astrophysics Data System (ADS)

Ordóñez-Blanco, D.; Arcay, B.; Dafonte, C.; Manteiga, M.; Ulla, A.

2010-12-01

Astrophysics is evolving towards the rational optimization of costly observational material by the intelligent exploitation of large astronomical databases from both terrestrial telescopes and space mission archives. However, there has been relatively little advance in the development of highly scalable data exploitation and analysis tools needed to generate the scientific returns from these large and expensively obtained datasets. Among the upcoming projects of astronomical instrumentation, Gaia is the next cornerstone ESA mission. The Gaia survey foresees the creation of a data archive and its future exploitation with automated or semi-automated analysis tools. This paper reviews some of the work that is being developed by the Gaia Data Processing and Analysis Consortium for the object classification and analysis of outliers in the forthcoming mission.
Evaluation of Uncertainty in Precipitation Datasets for New Mexico, USA

NASA Astrophysics Data System (ADS)

Besha, A. A.; Steele, C. M.; Fernald, A.

2014-12-01

Climate change, population growth and other factors are endangering water availability and sustainability in semiarid/arid areas, particularly in the southwestern United States. Wide spatial and temporal coverage of precipitation measurements is key for regional water budget analysis and hydrological operations, which themselves are valuable tools for water resource planning and management. Rain gauge measurements are usually reliable and accurate at a point. They measure rainfall continuously, but spatial sampling is limited. Ground based radar and satellite remotely sensed precipitation have wide spatial and temporal coverage. However, these measurements are indirect and subject to errors because of equipment, meteorological variability, the heterogeneity of the land surface itself and lack of regular recording. This study seeks to understand precipitation uncertainty and, in doing so, lessen uncertainty propagation into hydrological applications and operations. We reviewed, compared and evaluated the TRMM (Tropical Rainfall Measuring Mission) precipitation products, NOAA's (National Oceanic and Atmospheric Administration) Global Precipitation Climatology Centre (GPCC) monthly precipitation dataset, PRISM (Parameter elevation Regression on Independent Slopes Model) data and data from individual climate stations including Cooperative Observer Program (COOP), Remote Automated Weather Stations (RAWS), Soil Climate Analysis Network (SCAN) and Snowpack Telemetry (SNOTEL) stations. Though not yet finalized, this study finds that the uncertainty within precipitation estimate datasets is influenced by regional topography, season, climate and precipitation rate. Ongoing work aims to further evaluate precipitation datasets based on the relative influence of these phenomena so that we can identify the optimum datasets for input to statewide water budget analysis.
Long-term vegetation activity trends in the Iberian Peninsula and The Balearic Islands using high spatial resolution NOAA-AVHRR data (1981 - 2015).

NASA Astrophysics Data System (ADS)

Martin-Hernandez, Natalia; Vicente-Serrano, Sergio; Azorin-Molina, Cesar; Begueria-Portugues, Santiago; Reig-Gracia, Fergus; Zabalza-Martínez, Javier

2017-04-01

We have analysed trends in the Normalized Difference Vegetation Index (NDVI) in the Iberian Peninsula and The Balearic Islands over the period 1981 - 2015 using a new high resolution data set built from all available NOAA - AVHRR images (IBERIAN NDVI dataset). After complete processing including geocoding, calibration, cloud removal, topographic correction and temporal filtering, we obtained bi-weekly time series. To assess the accuracy of the new IBERIAN NDVI time-series, we have compared the temporal variability and trends of the NDVI series with results reported by the GIMMS 3g and MODIS (MOD13A3) NDVI datasets. In general, the IBERIAN NDVI showed high agreement with these two products, while offering higher spatial resolution than the GIMMS dataset and covering two more decades than the MODIS dataset. Using the IBERIAN NDVI dataset, we analysed NDVI trends by means of the non-parametric Mann-Kendall test and Theil-Sen slope estimator. On average, vegetation trends in the study area show an increase over the last decades. However, there are local spatial differences: the main increase has been recorded in humid regions of the north of the Iberian Peninsula. The statistical techniques allow abrupt and gradual changes to be found in different land cover types during the analysed period. These changes are related to human activity through land transformations (from dry to irrigated land), land abandonment and forest recovery.
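The two trend statistics named in the abstract can be sketched in a few lines. This is a minimal standard-library illustration, not the authors' processing chain: the function names and the NDVI series are hypothetical, and the Mann-Kendall variance below omits the correction for tied values and for serial correlation.

```python
import math
from itertools import combinations
from statistics import median


def theil_sen_slope(t, y):
    """Median of the slopes between all pairs of observations."""
    slopes = [
        (y[j] - y[i]) / (t[j] - t[i])
        for i, j in combinations(range(len(y)), 2)
        if t[j] != t[i]
    ]
    return median(slopes)


def mann_kendall(y):
    """Return (S, Z, two-sided p) for the Mann-Kendall trend test (no tie correction)."""
    n = len(y)
    # S counts concordant minus discordant pairs.
    s = sum((y[j] > y[i]) - (y[j] < y[i]) for i, j in combinations(range(n), 2))
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p


# Hypothetical annual NDVI values for a single pixel.
years = list(range(2000, 2010))
ndvi = [0.30, 0.32, 0.31, 0.35, 0.36, 0.34, 0.38, 0.40, 0.39, 0.42]

slope = theil_sen_slope(years, ndvi)
s_stat, z_stat, p_value = mann_kendall(ndvi)
print(slope, s_stat, z_stat, p_value)
```

Both statistics depend only on the ordering of, or pairwise differences between, observations, which is why they are often preferred over ordinary least squares for noisy satellite time series with outliers.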
Hyperspectral Image Classification With Markov Random Fields and a Convolutional Neural Network

NASA Astrophysics Data System (ADS)

Cao, Xiangyong; Zhou, Feng; Xu, Lin; Meng, Deyu; Xu, Zongben; Paisley, John

2018-05-01

This paper presents a new supervised classification algorithm for remotely sensed hyperspectral images (HSI) which integrates spectral and spatial information in a unified Bayesian framework. First, we formulate the HSI classification problem from a Bayesian perspective. Then, we adopt a convolutional neural network (CNN) to learn the posterior class distributions using a patch-wise training strategy to better use the spatial information. Next, spatial information is further considered by placing a spatial smoothness prior on the labels. Finally, we iteratively update the CNN parameters using stochastic gradient descent (SGD) and update the class labels of all pixel vectors using an alpha-expansion min-cut-based algorithm. Compared with other state-of-the-art methods, the proposed classification method achieves better performance on one synthetic dataset and two benchmark HSI datasets in a number of experimental settings.
Novel probabilistic models of spatial genetic ancestry with applications to stratification correction in genome-wide association studies.

PubMed

Bhaskar, Anand; Javanmard, Adel; Courtade, Thomas A; Tse, David

2017-03-15

Genetic variation in human populations is influenced by geographic ancestry due to spatial locality in historical mating and migration patterns. Spatial population structure in genetic datasets has been traditionally analyzed using either model-free algorithms, such as principal components analysis (PCA) and multidimensional scaling, or using explicit spatial probabilistic models of allele frequency evolution. We develop a general probabilistic model and an associated inference algorithm that unify the model-based and data-driven approaches to visualizing and inferring population structure. Our spatial inference algorithm can also be effectively applied to the problem of population stratification in genome-wide association studies (GWAS), where hidden population structure can create fictitious associations when population ancestry is correlated with both the genotype and the trait. Our algorithm Geographic Ancestry Positioning (GAP) relates local genetic distances between samples to their spatial distances, and can be used for visually discerning population structure as well as accurately inferring the spatial origin of individuals on a two-dimensional continuum. On both simulated and several real datasets from diverse human populations, GAP exhibits substantially lower error in reconstructing spatial ancestry coordinates compared to PCA. We also develop an association test that uses the ancestry coordinates inferred by GAP to accurately account for ancestry-induced correlations in GWAS. Based on simulations and analysis of a dataset of 10 metabolic traits measured in a Northern Finland cohort, which is known to exhibit significant population structure, we find that our method has superior power to current approaches. Our software is available at https://github.com/anand-bhaskar/gap . abhaskar@stanford.edu or ajavanma@usc.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. 
Automatic labeling of MR brain images through extensible learning and atlas forests.

PubMed

Xu, Lijun; Liu, Hong; Song, Enmin; Yan, Meng; Jin, Renchao; Hung, Chih-Cheng

2017-12-01

The multiatlas-based method is extensively used in MR brain image segmentation because of its simplicity and robustness. This method provides excellent accuracy, although it is time consuming and limited in its ability to incorporate information from new atlases. In this study, automatic labeling of MR brain images through extensible learning and atlas forests is presented to address these limitations. We propose an extensible learning model that enables the multiatlas-based framework to manage datasets with numerous atlases or dynamic atlas datasets while ensuring the accuracy of automatic labeling. Two new strategies are used to reduce the time and space complexity and improve the efficiency of the automatic labeling of brain MR images. First, atlases are encoded into atlas forests through random forest technology to reduce the time consumed by cross-registration between the atlases and the target image, and a scatter spatial vector is designed to eliminate errors caused by inaccurate registration. Second, an atlas selection method based on the extensible learning model is used to select atlases for the target image without traversing the entire dataset and then obtain an accurate labeling. The labeling results of the proposed method were evaluated on three public datasets, namely, IBSR, LONI LPBA40, and ADNI. With the proposed method, the Dice coefficient values on the three datasets were 84.17 ± 4.61%, 83.25 ± 4.29%, and 81.88 ± 4.53%, respectively, which were 5% higher than those of the conventional method. The efficiency of the extensible learning model was evaluated against state-of-the-art methods for labeling of MR brain images. Experimental results showed that the proposed method could achieve accurate labeling for MR brain images without traversing the entire datasets. 
In the proposed multiatlas-based method, extensible learning and atlas forests were applied to control the automatic labeling of brain anatomies on large atlas datasets or dynamic atlas datasets and obtain accurate results. © 2017 American Association of Physicists in Medicine.
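The Dice coefficient used to score the labelings above measures the overlap between a predicted and a reference segmentation. A minimal sketch follows, assuming each image is flattened to a list of integer labels; the function name and the toy label vectors are illustrative and are not taken from the paper.

```python
def dice_coefficient(predicted, reference, label):
    """Dice overlap for one label: 2 * |A ∩ B| / (|A| + |B|).

    `predicted` and `reference` are equal-length sequences of voxel labels.
    """
    pred = {i for i, value in enumerate(predicted) if value == label}
    ref = {i for i, value in enumerate(reference) if value == label}
    if not pred and not ref:
        # Label absent from both images: treat as perfect agreement.
        return 1.0
    return 2 * len(pred & ref) / (len(pred) + len(ref))


# Toy example: six voxels, label 1 marks the structure of interest.
predicted = [0, 1, 1, 1, 0, 0]
reference = [0, 1, 1, 0, 1, 0]
score = dice_coefficient(predicted, reference, label=1)
```

A value of 1 means the two label maps coincide for that structure and 0 means they do not overlap at all; the reported percentages are this quantity multiplied by 100.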
Assessment of Observational Uncertainty in Extreme Precipitation Events over the Continental United States

NASA Astrophysics Data System (ADS)

Slinskey, E. A.; Loikith, P. C.; Waliser, D. E.; Goodman, A.

2017-12-01

Extreme precipitation events are associated with numerous societal and environmental impacts. Furthermore, anthropogenic climate change is projected to alter precipitation intensity across portions of the Continental United States (CONUS). Therefore, a spatial understanding and intuitive means of monitoring extreme precipitation over time are critical. Towards this end, we apply an event-based indicator, developed as a part of NASA's support of the ongoing efforts of the US National Climate Assessment, which assigns categories to extreme precipitation events based on 3-day storm totals as a basis for dataset intercomparison. To assess observational uncertainty across a wide range of historical precipitation measurement approaches, we intercompare in situ station data from the Global Historical Climatology Network (GHCN), satellite-derived precipitation data from NASA's Tropical Rainfall Measuring Mission (TRMM), gridded in situ station data from the Parameter-elevation Regressions on Independent Slopes Model (PRISM), global reanalysis from NASA's Modern Era Retrospective-Analysis version 2 (MERRA 2), and regional reanalysis with gauge data assimilation from NCEP's North American Regional Reanalysis (NARR). Results suggest considerable variability across the five-dataset suite in the frequency, spatial extent, and magnitude of extreme precipitation events. Consistent with expectations, higher resolution datasets were found to resemble station data best and capture a greater frequency of high-end extreme events relative to lower spatial resolution datasets. The degree of dataset agreement varies regionally; however, all datasets successfully capture the seasonal cycle of precipitation extremes across the CONUS. These intercomparison results provide additional insight about observational uncertainty and the ability of a range of precipitation measurement and analysis products to capture extreme precipitation event climatology. 
While the event category threshold is fixed in this analysis, preliminary results from the development of a flexible categorization scheme that scales with grid resolution are presented.
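The general idea of categorizing events by 3-day storm totals can be sketched as a rolling sum followed by a threshold lookup. The abstract does not give the indicator's category boundaries, so the thresholds below are placeholders, and the function names and sample values are likewise hypothetical.

```python
from bisect import bisect_right


def three_day_totals(daily_precip):
    """Rolling 3-day precipitation totals for a daily series (same units as input)."""
    return [sum(daily_precip[i:i + 3]) for i in range(len(daily_precip) - 2)]


def categorize(total, thresholds):
    """Return the category index: the number of thresholds that `total` meets or exceeds."""
    return bisect_right(thresholds, total)


# Placeholder thresholds in mm; NOT the values used by the indicator in the abstract.
HYPOTHETICAL_THRESHOLDS_MM = [50, 100, 150]

daily = [0, 10, 40, 60, 5, 0]          # made-up daily totals at one grid cell
totals = three_day_totals(daily)       # [50, 110, 105, 65]
categories = [categorize(t, HYPOTHETICAL_THRESHOLDS_MM) for t in totals]
```

Applying the same fixed thresholds to every dataset is what makes the resulting category counts comparable across products with different resolutions.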
Analysis of spatial and temporal rainfall trends in Sicily during the 1921-2012 period

NASA Astrophysics Data System (ADS)

Liuzzo, Lorena; Bono, Enrico; Sammartano, Vincenzo; Freni, Gabriele

2016-10-01

Precipitation patterns worldwide are changing under the effects of global warming. The impacts of these changes could dramatically affect the hydrological cycle and, consequently, the availability of water resources. In order to improve the quality and reliability of forecasting models, it is important to analyse historical precipitation data to account for possible future changes. For these reasons, a large number of studies have recently been carried out with the aim of investigating the existence of statistically significant trends in precipitation at different spatial and temporal scales. In this paper, the existence of statistically significant trends in rainfall from observational datasets, which were measured by 245 rain gauges over Sicily (Italy) during the 1921-2012 period, was investigated. Annual, seasonal and monthly time series were examined using the Mann-Kendall non-parametric statistical test to detect statistically significant trends at local and regional scales, and their significance levels were assessed. Prior to the application of the Mann-Kendall test, the historical dataset was completed using a geostatistical spatial interpolation technique, the residual ordinary kriging, and then processed to remove the influence of serial correlation on the test results, applying the procedure of trend-free pre-whitening. Once the trends at each site were identified, the spatial patterns of the detected trends were examined using spatial interpolation techniques. Furthermore, focusing on the 30 years from 1981 to 2012, the trend analysis was repeated with the aim of detecting short-term trends or possible changes in the direction of the trends. Finally, the effect of climate change on the seasonal distribution of rainfall during the year was investigated by analysing the trend in the precipitation concentration index. 
The application of the Mann-Kendall test to the rainfall data provided evidence of a general decrease in precipitation in Sicily during the 1921-2012 period. Downward trends frequently occurred during the autumn and winter months. However, an increase in total annual precipitation was detected during the period from 1981 to 2012.
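The precipitation concentration index mentioned above is commonly computed from twelve monthly totals as PCI = 100 · Σp² / (Σp)². The abstract does not state which formulation was used, so the sketch below assumes this common annual definition; the function name and sample values are illustrative.

```python
def precipitation_concentration_index(monthly_totals):
    """Annual PCI = 100 * sum(p_i^2) / (sum(p_i))^2 over 12 monthly totals.

    Assumes the widely used annual definition; the study may use a variant.
    """
    if len(monthly_totals) != 12:
        raise ValueError("expected 12 monthly totals")
    total = sum(monthly_totals)
    if total == 0:
        raise ValueError("PCI is undefined for a year with no precipitation")
    return 100.0 * sum(p * p for p in monthly_totals) / (total * total)


# Perfectly uniform rainfall gives the minimum value, 100 / 12.
uniform = precipitation_concentration_index([50.0] * 12)
# All rainfall in a single month gives the maximum value, 100.
single_month = precipitation_concentration_index([600.0] + [0.0] * 11)
```

Computing the index for every year at a station yields a time series to which a trend test can be applied, which is how a shift in the seasonal distribution of rainfall would show up.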
Application of Alignment Methodologies to Spatial Ontologies in the Hydro Domain

NASA Astrophysics Data System (ADS)

Lieberman, J. E.; Cheatham, M.; Varanka, D.

2015-12-01

Ontologies are playing an increasing role in facilitating mediation and translation between datasets representing diverse schemas, vocabularies, or knowledge communities. This role is relatively straightforward when there is one ontology comprising all relevant common concepts that can be mapped to entities in each dataset. Frequently, one common ontology has not been agreed to. Either each dataset is represented by a distinct ontology, or there are multiple candidates for commonality. Either the one most appropriate (expressive, relevant, correct) ontology must be chosen, or else concepts and relationships must be matched across multiple ontologies through an alignment process so that they may be used in concert to carry out mediation or other semantic operations. A resulting alignment can be effective to the extent that entities in the ontologies represent differing terminology for comparable conceptual knowledge. In cases such as spatial ontologies, though, ontological entities may also represent disparate conceptualizations of space according to the discernment methods and application domains on which they are based. One ontology's wetland concept may overlap in space with another ontology's recharge zone or wildlife range or water feature. In order to evaluate alignment with respect to spatial ontologies, alignment has been applied to a series of ontologies pertaining to surface water that are used variously in hydrography (characterization of water features), hydrology (study of water cycling), and water quality (nutrient and contaminant transport) application domains. There is frequently a need to mediate between datasets in each domain in order to develop broader understanding of surface water systems, so there is practical as well as theoretical value in the alignment. 
From a domain expertise standpoint, the ontologies under consideration clearly contain some concepts that are spatially as well as conceptually identical and then others with less clear similarities in either sense. Our study serves both to determine the limits of standard methods for aligning spatial ontologies and to suggest new methods of calculating similarity axioms that take into account semantic, spatial, and cognitive criteria relevant to fitness for relevant usage scenarios.
Spatially continuous interpolation of water stage and water depths using the Everglades depth estimation network (EDEN)

USGS Publications Warehouse

Pearlstine, Leonard; Higer, Aaron; Palaseanu, Monica; Fujisaki, Ikuko; Mazzotti, Frank

2007-01-01

The Everglades Depth Estimation Network (EDEN) is an integrated network of real-time water-level monitoring, ground-elevation modeling, and water-surface modeling that provides scientists and managers with current (2000-present), online water-stage and water-depth information for the entire freshwater portion of the Greater Everglades. Continuous daily spatial interpolations of the EDEN network stage data are presented on a 400-square-meter grid spacing. EDEN offers a consistent and documented dataset that can be used by scientists and managers to (1) guide large-scale field operations, (2) integrate hydrologic and ecological responses, and (3) support biological and ecological assessments that measure ecosystem responses to the implementation of the Comprehensive Everglades Restoration Plan (CERP). The target users are biologists and ecologists examining trophic level responses to hydrodynamic changes in the Everglades.
Investigation of aquifer-estuary interaction using wavelet analysis of fiber-optic temperature data

USGS Publications Warehouse

Henderson, R.D.; Day-Lewis, Frederick D.; Harvey, Charles F.

2009-01-01

Fiber-optic distributed temperature sensing (FODTS) provides sub-minute temporal and meter-scale spatial resolution over kilometer-long cables. Compared to conventional thermistor or thermocouple-based technologies, which measure temperature at discrete (and commonly sparse) locations, FODTS offers nearly continuous spatial coverage, thus providing hydrologic information at spatiotemporal scales previously impossible. Large and information-rich FODTS datasets, however, pose challenges for data exploration and analysis. To date, FODTS analyses have focused on time-series variance as the means to discriminate between hydrologic phenomena. Here, we demonstrate the continuous wavelet transform (CWT) and cross-wavelet transform (XWT) to analyze FODTS in the context of related hydrologic time series. We apply the CWT and XWT to data from Waquoit Bay, Massachusetts to identify the location and timing of tidal pumping of submarine groundwater.
Long-term ice phenology records from eastern-central Europe

NASA Astrophysics Data System (ADS)

Takács, Katalin; Kern, Zoltán; Pásztor, László

2018-03-01

A dataset of annual freshwater ice phenology was compiled for the largest river (Danube) and the largest lake (Lake Balaton) in eastern-central Europe, extending regular river and lake ice monitoring data through the use of historical observations and documentary records dating back to AD 1774 and AD 1885, respectively. What becomes clear is that the dates of the first appearance of ice and freeze-up have shifted, arriving 12-30 and 4-13 days later, respectively, per 100 years. Break-up and ice-off have shifted to earlier dates by 7-13 and 9-27 days/100 years, except on Lake Balaton, where the date of break-up has not changed significantly. The datasets represent a resource for (paleo)climatological research thanks to the strong, physically determined link between water and air temperature and the occurrence of freshwater ice phenomena. The derived centennial records of freshwater cryophenology for the Danube and Balaton are readily available for detailed analysis of the temporal trends, large-scale spatial comparison, or other climatological purposes. The derived dataset is publicly available via PANGAEA at https://doi.org/10.1594/PANGAEA.881056.
A PDS Archive for Observations of Mercury's Na Exosphere

NASA Astrophysics Data System (ADS)

Backes, C.; Cassidy, T.; Merkel, A. W.; Killen, R. M.; Potter, A. E.

2016-12-01

We present a data product consisting of ground-based observations of Mercury's sodium exosphere. We have amassed a sizeable dataset of several thousand spectral observations of Mercury's exosphere from the McMath-Pierce solar telescope. Over the last year, a data reduction pipeline has been developed and refined to process and reconstruct these spectral images into low resolution images of sodium D2 emission. This dataset, which extends over two decades, will provide an unprecedented opportunity to analyze the dynamics of Mercury's mid to high-latitude exospheric emissions, which have long been attributed to solar wind ion bombardment. This large archive of observations will be of great use to the Mercury science community in studying the effects of space weather on Mercury's tenuous exosphere. When completely processed, images in this dataset will show the observed spatial distribution of Na D2 in the Mercurian exosphere, have measurements of this sodium emission per pixel in units of kilorayleighs, and be available through NASA's Planetary Data System. The overall goal of the presentation will be to provide the Planetary Science community with a clear picture of what information and data this archival product will make available.
Shelf-Scale Mapping of Fish Distribution Using Active and Passive Acoustics

NASA Astrophysics Data System (ADS)

Wall, Carrie C.

Fish sound production has been associated with courtship and spawning behavior. Acoustic recordings of fish sounds can be used to identify distribution and behavior. Passive acoustic monitoring (PAM) can record large amounts of acoustic data in a specific area for days to years. These data can be collected in remote locations under potentially unsafe seas throughout a 24-hour period providing datasets unattainable using observer-based methods. However, the instruments must withstand the caustic ocean environment and be retrieved to obtain the recorded data. This can prove difficult due to the risk of PAMs being lost, stolen or damaged, especially in highly active areas. In addition, point-source sound recordings are only one aspect of fish biogeography. Passive acoustic platforms that produce low self-generated noise, have high retrieval rates, and are equipped with a suite of environmental sensors are needed to relate patterns in fish sound production to concurrently collected oceanographic conditions on large, synoptic scales. The association of sound with reproduction further invokes the need for such non-invasive, near-real time datasets that can be used to enhance current management methods limited by survey bias, inaccurate fisher reports, and extensive delays between fisheries data collection and population assessment. Red grouper (Epinephelus morio) exhibit the distinctive behavior of digging holes and producing a unique sound during courtship. These behaviors can be used to identify red grouper distribution and potential spawning habitat over large spatial scales. The goal of this research was to provide a greater understanding of the temporal and spatial distribution of red grouper sound production and holes on the central West Florida Shelf (WFS) using active sonar and passive acoustic recorders. The technology demonstrated here establishes the necessary methods to map shelf-scale fish sound production. 
The results of this work could aid resource managers in determining critical spawning times and areas. Over 403,000 acoustic recordings were made across an approximately 39,000 km² area on the WFS during periods throughout 2008 to 2011 using stationary passive acoustic recorders and hydrophone-integrated gliders. A custom MySQL database with a portal to MATLAB was developed to catalogue and process the large acoustic dataset stored on a server. Analyses of these data determined the daily, seasonal and spatial patterns of red grouper as well as toadfish and several unconfirmed fish species termed: 100 Hz Pulsing, 6 kHz Sound, 300 Hz FM Harmonic, and 365 Hz Harmonic. Red grouper sound production was correlated to sunrise and sunset, and was primarily recorded in water 15 to 93 m deep, with increased calling within known hard bottom areas and in Steamboat Lumps Marine Reserve. Analyses of high-resolution multibeam bathymetry collected in a portion of the reserve in 2006 and 2009 allowed detailed documentation and characterization of holes excavated by red grouper. Comparisons of the spatially overlapping datasets suggested holes are constructed and maintained over time, and provided evidence towards an increase in spawning habitat usage. High rates of sound production recorded from stationary recorders and a glider deployment were correlated to high hole density in Steamboat Lumps. This research demonstrates the utility of coupling passive acoustic data with high-resolution bathymetric data to verify the occupation of suspected male territory (holes) and to provide a more complete understanding of effective spawning habitat. Annual peaks in calling (July and August, and November and December) did not correspond to spawning peaks (March to May); however, passive acoustic monitoring was established as an effective tool to identify areas of potential spawning activity by recording the presence of red grouper. 
Sounds produced by other species of fish were recorded in the passive acoustic dataset. The distribution of toadfish calls suggests two species (Opsanus beta and O. pardus) were recorded; the latter had not been previously described. The call characteristics and spatial distribution of the four unknown fish-related sounds can be used to help confirm the sources. Long-term PAM studies that provide systematic monitoring can be a valuable assessment tool for all soniferous species. Glider technology, due to a high rate of successful retrieval and low self-generated noise, was proven to be a reliable and relatively inexpensive method to collect fisheries acoustic data in the field. The implementation of regular deployments of hydrophone-integrated gliders and fixed location passive acoustic monitoring stations is suggested to enhance fisheries management.
Tularosa Basin Play Fairway Analysis Data and Models

DOE Data Explorer

Nash, Greg

2017-07-11

This submission includes raster datasets for each layer of evidence used for weights of evidence analysis as well as the deterministic play fairway analysis (PFA). Data representative of heat, permeability and groundwater comprises some of the raster datasets. Additionally, the final deterministic PFA model is provided along with a certainty model. All of these datasets are best used with an ArcGIS software package, specifically Spatial Data Modeler.

Mapping the spatial distribution of global anthropogenic mercury atmospheric emission inventories

NASA Astrophysics Data System (ADS)

Wilson, Simon J.; Steenhuisen, Frits; Pacyna, Jozef M.; Pacyna, Elisabeth G.

This paper describes the procedures employed to spatially distribute global inventories of anthropogenic emissions of mercury to the atmosphere, prepared by Pacyna, E.G., Pacyna, J.M., Steenhuisen, F., Wilson, S. [2006. Global anthropogenic mercury emission inventory for 2000. Atmospheric Environment, this issue, doi:10.1016/j.atmosenv.2006.03.041], and briefly discusses the results of this work. A new spatially distributed global emission inventory for the (nominal) year 2000, and a revised version of the 1995 inventory are presented. Emissions estimates for total mercury and major species groups are distributed within latitude/longitude-based grids with resolutions of 1°×1° and 0.5°×0.5°. A key component in the spatial distribution procedure is the use of population distribution as a surrogate parameter to distribute emissions from sources that cannot be accurately geographically located. In this connection, new gridded population datasets were prepared, based on the CIESIN GPW3 datasets (CIESIN, 2004. Gridded Population of the World (GPW), Version 3. Center for International Earth Science Information Network (CIESIN), Columbia University and Centro Internacional de Agricultura Tropical (CIAT). GPW3 data are available at http://beta.sedac.ciesin.columbia.edu/gpw/index.jsp). The spatially distributed emissions inventories and population datasets prepared in the course of this work are available on the Internet at www.amap.no/Resources/HgEmissions/
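Using population as a surrogate amounts to allocating an emission total to grid cells in proportion to each cell's share of the population. The sketch below shows only that proportional step under simple assumptions; the function name, the tiny 2×2 grid, and the emission figure are hypothetical and do not reproduce the inventory procedure in detail.

```python
def distribute_by_population(total_emission, population_grid):
    """Allocate `total_emission` to grid cells in proportion to cell population.

    `population_grid` is a list of rows of non-negative population counts.
    """
    total_population = sum(sum(row) for row in population_grid)
    if total_population <= 0:
        raise ValueError("population grid must contain at least one populated cell")
    return [[total_emission * cell / total_population for cell in row]
            for row in population_grid]


# Hypothetical example: 50 units of emissions that cannot be geolocated,
# spread over a 2x2 block of cells covering one country.
population = [[10, 30],
              [0, 60]]
emissions = distribute_by_population(50.0, population)   # [[5.0, 15.0], [0.0, 30.0]]
```

The allocation conserves the total, so summing the gridded values recovers the original inventory figure; unpopulated cells receive nothing.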
Parallel hyperspectral compressive sensing method on GPU

NASA Astrophysics Data System (ADS)

Bernabé, Sergio; Martín, Gabriel; Nascimento, José M. P.

2015-10-01

Remote hyperspectral sensors collect large amounts of data per flight, usually with low spatial resolution. The bandwidth of the connection between the satellite/airborne platform and the ground station is known to be limited, so an onboard compression method is desirable to reduce the amount of data to be transmitted. This paper presents a parallel implementation of a compressive sensing method, called parallel hyperspectral coded aperture (P-HYCA), for graphics processing units (GPU) using the compute unified device architecture (CUDA). This method takes into account two main properties of hyperspectral datasets, namely the high correlation existing among the spectral bands and the generally low number of endmembers needed to explain the data, which largely reduces the number of measurements necessary to correctly reconstruct the original data. Experimental results conducted using synthetic and real hyperspectral datasets on two different GPU architectures by NVIDIA, GeForce GTX 590 and GeForce GTX TITAN, reveal that the use of GPUs can provide real-time compressive sensing performance. The achieved speedup is up to 20 times when compared with the processing time of HYCA running on one core of the Intel i7-2600 CPU (3.4 GHz), with 16 Gbyte memory.
Mediterranean sea water budget long-term trend inferred from salinity observations

NASA Astrophysics Data System (ADS)

Skliris, N.; Zika, J. D.; Herold, L.; Josey, S. A.; Marsh, R.

2018-01-01

Changes in the Mediterranean water cycle since 1950 are investigated using salinity and reanalysis based air-sea freshwater flux datasets. Salinity observations indicate a strong basin-scale multi-decadal salinification, particularly in the intermediate and deep layers. Evaporation, precipitation and river runoff variations are all shown to contribute to a very strong increase in net evaporation of order 20-30%. While large temporal uncertainties and discrepancies are found between E-P multi-decadal trend patterns in the reanalysis datasets, a more robust and spatially coherent structure of multi-decadal change is obtained for the salinity field. Salinity change implies an increase in net evaporation of 8 to 12% over 1950-2010, which is considerably lower than that suggested by air-sea freshwater flux products, but still largely exceeding estimates of global water cycle amplification. A new method based on water mass transformation theory is used to link changes in net evaporation over the Mediterranean Sea with changes in the volumetric distribution of salinity. The water mass transformation distribution in salinity coordinates suggests that the Mediterranean basin salinification is driven by changes in the regional water cycle rather than changes in salt transports at the straits.
The Pattern Across the Continental United States of Evapotranspiration Variability Associated with Water Availability

NASA Technical Reports Server (NTRS)

Koster, Randal D.; Salvucci, Guido D.; Rigden, Angela J.; Jung, Martin; Collatz, G. James; Schubert, Siegfried D.

2015-01-01

The spatial pattern across the continental United States of the interannual variance of warm season water-dependent evapotranspiration, a pattern of relevance to land-atmosphere feedback, cannot be measured directly. Alternative and indirect approaches to estimating the pattern, however, do exist, and given the uncertainty of each, we use several such approaches here. We first quantify the water-dependent evapotranspiration variance pattern inherent in two derived evapotranspiration datasets available from the literature. We then search for the pattern in proxy geophysical variables (air temperature, stream flow, and NDVI) known to have strong ties to evapotranspiration. The variances inherent in all of the different (and mostly independent) data sources show some differences but are generally strongly consistent: they all show a large variance signal down the center of the U.S., with lower variances toward the east and (for the most part) toward the west. The robustness of the pattern across the datasets suggests that it indeed represents the pattern operating in nature. Using Budyko's hydroclimatic framework, we show that the pattern can largely be explained by the relative strength of water and energy controls on evapotranspiration across the continent.
A Novel Framework for Remote Sensing Image Scene Classification

NASA Astrophysics Data System (ADS)

Jiang, S.; Zhao, H.; Wu, W.; Tan, Q.

2018-04-01

High-resolution remote sensing (HRRS) image scene classification aims to label an image with a specific semantic category. HRRS images contain more details of the ground objects and their spatial distribution patterns than low spatial resolution images. Scene classification can bridge the gap between low-level features and high-level semantics. It can be applied in urban planning, target detection and other fields. This paper proposes a novel framework for HRRS image scene classification. The framework combines a convolutional neural network (CNN) and XGBoost, using the CNN as a feature extractor and XGBoost as a classifier. The framework is then evaluated on two HRRS image datasets: the UC-Merced dataset and the NWPU-RESISC45 dataset. It achieved accuracies of 95.57% and 83.35% on the two datasets, respectively. The experimental results indicate that the framework is effective for remote sensing image classification. Furthermore, we believe this framework will be more practical for further HRRS scene classification, since it requires less training time.
High resolution global gridded data for use in population studies

NASA Astrophysics Data System (ADS)

Lloyd, Christopher T.; Sorichetta, Alessandro; Tatem, Andrew J.

2017-01-01

Recent years have seen substantial growth in openly available satellite and other geospatial data layers, which represent a range of metrics relevant to global human population mapping at fine spatial scales. The specifications of such data differ widely and therefore the harmonisation of data layers is a prerequisite to constructing detailed and contemporary spatial datasets which accurately describe population distributions. Such datasets are vital to measure impacts of population growth, monitor change, and plan interventions. To this end the WorldPop Project has produced an open access archive of 3 and 30 arc-second resolution gridded data. Four tiled raster datasets form the basis of the archive: (i) Viewfinder Panoramas topography clipped to Global ADMinistrative area (GADM) coastlines; (ii) a matching ISO 3166 country identification grid; (iii) country area; (iv) and slope layer. Further layers include transport networks, landcover, nightlights, precipitation, travel time to major cities, and waterways. Datasets and production methodology are here described. The archive can be downloaded both from the WorldPop Dataverse Repository and the WorldPop Project website.
High resolution global gridded data for use in population studies.

PubMed

Lloyd, Christopher T; Sorichetta, Alessandro; Tatem, Andrew J

2017-01-31

Recent years have seen substantial growth in openly available satellite and other geospatial data layers, which represent a range of metrics relevant to global human population mapping at fine spatial scales. The specifications of such data differ widely and therefore the harmonisation of data layers is a prerequisite to constructing detailed and contemporary spatial datasets which accurately describe population distributions. Such datasets are vital to measure impacts of population growth, monitor change, and plan interventions. To this end the WorldPop Project has produced an open access archive of 3 and 30 arc-second resolution gridded data. Four tiled raster datasets form the basis of the archive: (i) Viewfinder Panoramas topography clipped to Global ADMinistrative area (GADM) coastlines; (ii) a matching ISO 3166 country identification grid; (iii) country area; (iv) and slope layer. Further layers include transport networks, landcover, nightlights, precipitation, travel time to major cities, and waterways. Datasets and production methodology are here described. The archive can be downloaded both from the WorldPop Dataverse Repository and the WorldPop Project website.
Scalable Earth-observation Analytics for Geoscientists: Spacetime Extensions to the Array Database SciDB

NASA Astrophysics Data System (ADS)

Appel, Marius; Lahn, Florian; Pebesma, Edzer; Buytaert, Wouter; Moulds, Simon

2016-04-01

Today's amount of freely available data requires scientists to spend large parts of their work on data management. This is especially true in environmental sciences when working with large remote sensing datasets, such as obtained from earth-observation satellites like the Sentinel fleet. Many frameworks like SpatialHadoop or Apache Spark address the scalability but target programmers rather than data analysts, and are not dedicated to imagery or array data. In this work, we use the open-source data management and analytics system SciDB to bring large earth-observation datasets closer to analysts. Its underlying data representation as multidimensional arrays fits naturally to earth-observation datasets, distributes storage and computational load over multiple instances by multidimensional chunking, and also enables efficient time-series based analyses, which is usually difficult using file- or tile-based approaches. Existing interfaces to R and Python furthermore allow for scalable analytics with relatively little learning effort. However, interfacing SciDB and file-based earth-observation datasets that come as tiled temporal snapshots requires a lot of manual bookkeeping during ingestion, and SciDB natively only supports loading data from CSV-like and custom binary formatted files, which currently limits its practical use in earth-observation analytics. To make it easier to work with large multi-temporal datasets in SciDB, we developed software tools that enrich SciDB with earth observation metadata and allow working with commonly used file formats: (i) the SciDB extension library scidb4geo simplifies working with spatiotemporal arrays by adding relevant metadata to the database and (ii) the Geospatial Data Abstraction Library (GDAL) driver implementation scidb4gdal allows to ingest and export remote sensing imagery from and to a large number of file formats. 
Using added metadata on temporal resolution and coverage, the GDAL driver supports time-based ingestion of imagery to existing multi-temporal SciDB arrays. While our SciDB plugin works directly in the database, the GDAL driver has been specifically developed using a minimum amount of external dependencies (i.e. CURL). Source code for both tools is available from GitHub [1]. We present these tools in a case-study that demonstrates the ingestion of multi-temporal tiled earth-observation data to SciDB, followed by a time-series analysis using R and SciDBR. Through the exclusive use of open-source software, our approach supports reproducibility in scalable large-scale earth-observation analytics. In the future, these tools can be used in an automated way to let scientists only work on ready-to-use SciDB arrays to significantly reduce the data management workload for domain scientists. [1] https://github.com/mappl/scidb4geo and https://github.com/mappl/scidb4gdal
How many sightings to model rare marine species distributions

PubMed Central

Authier, Matthieu; Monestiez, Pascal; Ridoux, Vincent

2018-01-01

Despite large efforts, datasets with few sightings are often available for rare species of marine megafauna that typically live at low densities. This paucity of data makes modelling the habitat of these taxa particularly challenging. We tested the predictive performance of different types of species distribution models fitted to decreasing numbers of sightings. Generalised additive models (GAMs) with three different residual distributions and the presence only model MaxEnt were tested on two megafauna case studies differing in both the number of sightings and ecological niches. From a dolphin (277 sightings) and an auk (1,455 sightings) datasets, we simulated rarity with a sighting thinning protocol by random sampling (without replacement) of a decreasing fraction of sightings. Better prediction of the distribution of a rarely sighted species occupying a narrow habitat (auk dataset) was expected compared to the distribution of a rarely sighted species occupying a broad habitat (dolphin dataset). We used the original datasets to set up a baseline model and fitted additional models on fewer sightings but keeping effort constant. Model predictive performance was assessed with mean squared error and area under the curve. Predictions provided by the models fitted to the thinned-out datasets were better than a homogeneous spatial distribution down to a threshold of approximately 30 sightings for a GAM with a Tweedie distribution and approximately 130 sightings for the other models. Thinning the sighting data for the taxon with narrower habitats seemed to be less detrimental to model predictive performance than for the broader habitat taxon. To generate reliable habitat modelling predictions for rarely sighted marine predators, our results suggest (1) using GAMs with a Tweedie distribution with presence-absence data and (2) implementing, as a conservative empirical measure, at least 50 sightings in the models. PMID:29529097
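The sighting-thinning protocol described above (random sampling without replacement of a decreasing fraction of sightings) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `thin_sightings`, the placeholder sighting records, and the chosen fractions are assumptions, and the sketch does not reproduce how survey effort was held constant.

```python
import random

def thin_sightings(sightings, fraction, seed=None):
    """Return a random subset (without replacement) holding `fraction` of the sightings."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)
    n_keep = max(1, round(len(sightings) * fraction))
    # random.sample draws without replacement, so no sighting is kept twice
    return rng.sample(sightings, n_keep)

# Hypothetical placeholder records (id, lon, lat); 277 matches the dolphin dataset size.
sightings = [(i, 0.0, 0.0) for i in range(277)]

# Assumed thinning fractions, chosen only for illustration.
thinned = {f: thin_sightings(sightings, f, seed=42) for f in (0.75, 0.5, 0.25, 0.1)}
for fraction, subset in thinned.items():
    print(fraction, len(subset))
```

Each thinned subset would then be passed to the same model-fitting routine as the full dataset, so that predictive performance can be compared across sample sizes.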
Developing a 1 km resolution daily air temperature dataset for urban and surrounding areas in the conterminous United States

DOE Office of Scientific and Technical Information (OSTI.GOV)

Li, Xiaoma; Zhou, Yuyu; Asrar, Ghassem R.

High spatiotemporal resolution air temperature (Ta) datasets are increasingly needed for assessing the impact of temperature change on people, ecosystems, and energy systems, especially in the urban domains. However, such datasets are not widely available because of the large spatiotemporal heterogeneity of Ta caused by complex biophysical and socioeconomic factors such as built infrastructure and human activities. In this study, we developed a 1-km gridded dataset of daily minimum Ta (Tmin) and maximum Ta (Tmax), and the associated uncertainties, in urban and surrounding areas in the conterminous U.S. for the 2003–2016 period. Daily geographically weighted regression (GWR) models were developed and used to interpolate Ta using 1 km daily land surface temperature and elevation as explanatory variables. The leave-one-out cross-validation approach indicates that our method performs reasonably well, with root mean square errors of 2.1 °C and 1.9 °C, mean absolute errors of 1.5 °C and 1.3 °C, and R² of 0.95 and 0.97, for Tmin and Tmax, respectively. The resulting dataset reasonably captures the spatial heterogeneity of Ta in the urban areas, and also effectively captures the urban heat island (UHI) phenomenon that Ta rises with the increase of urban development (i.e., impervious surface area). The new dataset is valuable for studying environmental impacts of urbanization such as UHI and other related effects (e.g., on building energy consumption and human health). The proposed methodology also shows a potential to build a long-term record of Ta worldwide, to fill the data gap that currently exists for studies of urban systems.
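The leave-one-out cross-validation statistics quoted above (RMSE, MAE, R²) can be illustrated with a small sketch. It is not the authors' GWR: an ordinary least-squares line on a single explanatory variable stands in for the regression model, R² is taken here to be the coefficient of determination (an assumption), and the station values are invented purely for illustration.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept (stand-in for the GWR model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def loocv_predictions(xs, ys):
    """Predict each observation from a model fitted to all the other observations."""
    preds = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(slope * xs[i] + intercept)
    return preds

def error_metrics(obs, pred):
    """Return (RMSE, MAE, R^2), with R^2 as the coefficient of determination."""
    n = len(obs)
    resid = [p - o for o, p in zip(obs, pred)]
    ss_res = sum(r * r for r in resid)
    mean_obs = sum(obs) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return math.sqrt(ss_res / n), sum(abs(r) for r in resid) / n, 1.0 - ss_res / ss_tot

# Invented station values: land surface temperature (x) and air temperature (y), in °C.
lst = [10.0, 12.5, 15.0, 18.0, 21.0, 24.5]
ta = [8.9, 11.2, 13.1, 16.4, 18.2, 21.9]
rmse, mae, r2 = error_metrics(ta, loocv_predictions(lst, ta))
print(f"RMSE={rmse:.2f} °C  MAE={mae:.2f} °C  R2={r2:.2f}")
```

Because every prediction comes from a model that never saw the held-out station, the three metrics estimate interpolation error at unsampled locations rather than goodness of fit.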
Management and assimilation of diverse, distributed watershed datasets

NASA Astrophysics Data System (ADS)

Varadharajan, C.; Faybishenko, B.; Versteeg, R.; Agarwal, D.; Hubbard, S. S.; Hendrix, V.

2016-12-01

The U.S. Department of Energy's (DOE) Watershed Function Scientific Focus Area (SFA) seeks to determine how perturbations to mountainous watersheds (e.g., floods, drought, early snowmelt) impact the downstream delivery of water, nutrients, carbon, and metals over seasonal to decadal timescales. We are building a software platform that enables integration of diverse and disparate field, laboratory, and simulation datasets, of various types including hydrological, geological, meteorological, geophysical, geochemical, ecological and genomic datasets across a range of spatial and temporal scales within the Rifle floodplain and the East River watershed, Colorado. We are using agile data management and assimilation approaches to enable web-based integration of heterogeneous, multi-scale data. Sensor-based observations of water-level, vadose zone and groundwater temperature, water quality, meteorology as well as biogeochemical analyses of soil and groundwater samples have been curated and archived in federated databases. Quality Assurance and Quality Control (QA/QC) are performed on priority datasets needed for on-going scientific analyses, and hydrological and geochemical modeling. Automated QA/QC methods are used to identify and flag issues in the datasets. Data integration is achieved via a brokering service that dynamically integrates data from distributed databases via web services, based on user queries. The integrated results are presented to users in a portal that enables intuitive search, interactive visualization and download of integrated datasets. The concepts, approaches and codes being used are shared across various data science components of various large DOE-funded projects such as the Watershed Function SFA, Next Generation Ecosystem Experiment (NGEE) Tropics, Ameriflux/FLUXNET, and Advanced Simulation Capability for Environmental Management (ASCEM), and together contribute towards DOE's cyberinfrastructure for data management and model-data integration.
A scoping review of spatial cluster analysis techniques for point-event data.

PubMed

Fritz, Charles E; Schuurman, Nadine; Robertson, Colin; Lear, Scott

2013-05-01

Spatial cluster analysis is a uniquely interdisciplinary endeavour, and so it is important to communicate and disseminate ideas, innovations, best practices and challenges across practitioners, applied epidemiology researchers and spatial statisticians. In this research we conducted a scoping review to systematically search peer-reviewed journal databases for research that has employed spatial cluster analysis methods on individual-level, address location, or x and y coordinate derived data. To illustrate the thematic issues raised by our results, methods were tested using a dataset where known clusters existed. Point pattern methods, spatial clustering and cluster detection tests, and a locally weighted spatial regression model were most commonly used for individual-level, address location data (n = 29). The spatial scan statistic was the most popular method for address location data (n = 19). Six themes were identified relating to the application of spatial cluster analysis methods and subsequent analyses, which we recommend researchers to consider; exploratory analysis, visualization, spatial resolution, aetiology, scale and spatial weights. It is our intention that researchers seeking direction for using spatial cluster analysis methods, consider the caveats and strengths of each approach, but also explore the numerous other methods available for this type of analysis. Applied spatial epidemiology researchers and practitioners should give special consideration to applying multiple tests to a dataset. Future research should focus on developing frameworks for selecting appropriate methods and the corresponding spatial weighting schemes.
Dissecting the space-time structure of tree-ring datasets using the partial triadic analysis.

PubMed

Rossi, Jean-Pierre; Nardin, Maxime; Godefroid, Martin; Ruiz-Diaz, Manuela; Sergent, Anne-Sophie; Martinez-Meier, Alejandro; Pâques, Luc; Rozenberg, Philippe

2014-01-01

Tree-ring datasets are used in a variety of circumstances, including archeology, climatology, forest ecology, and wood technology. These data are based on microdensity profiles and consist of a set of tree-ring descriptors, such as ring width or early/latewood density, measured for a set of individual trees. Because successive rings correspond to successive years, the resulting dataset is a ring variables × trees × time datacube. Multivariate statistical analyses, such as principal component analysis, have been widely used for extracting worthwhile information from ring datasets, but they typically address two-way matrices, such as ring variables × trees or ring variables × time. Here, we explore the potential of the partial triadic analysis (PTA), a multivariate method dedicated to the analysis of three-way datasets, to apprehend the space-time structure of tree-ring datasets. We analyzed a set of 11 tree-ring descriptors measured in 149 georeferenced individuals of European larch (Larix decidua Miller) during the period of 1967-2007. The processing of densitometry profiles led to a set of ring descriptors for each tree and for each year from 1967-2007. The resulting three-way data table was subjected to two distinct analyses in order to explore i) the temporal evolution of spatial structures and ii) the spatial structure of temporal dynamics. We report the presence of a spatial structure common to the different years, highlighting the inter-individual variability of the ring descriptors at the stand scale. We found a temporal trajectory common to the trees that could be separated into a high and low frequency signal, corresponding to inter-annual variations possibly related to defoliation events and a long-term trend possibly related to climate change. We conclude that PTA is a powerful tool to unravel and hierarchize the different sources of variation within tree-ring datasets.
Improvements in the spatial representation of lakes and reservoirs in the contiguous United States for the National Water Model

NASA Astrophysics Data System (ADS)

Khan, S.; Salas, F.; Sampson, K. M.; Read, L. K.; Cosgrove, B.; Li, Z.; Gochis, D. J.

2017-12-01

The representation of inland surface water bodies in distributed hydrologic models at the continental scale is a challenge. The National Water Model (NWM) utilizes the National Hydrography Dataset Plus Version 2 (NHDPlusV2) "waterbody" dataset to represent lakes and reservoirs. The "waterbody" layer is a comprehensive dataset that represents surface water bodies using common features like lakes, ponds, reservoirs, estuaries, playas and swamps/marshes. However, a major issue that remains unresolved even in the latest revision of NHDPlus Version 2 is the inconsistency in waterbody digitization and delineation errors. Manually correcting the water body polygons becomes tedious and quickly impossible for continental-scale hydrologic models such as the NWM. In this study, we improved spatial representation of 6,802 lakes and reservoirs by analyzing 379,110 waterbodies in the contiguous United States (excluding the Laurentian Great Lakes). We performed a step-by-step process that integrates a set of geospatial analyses to identify, track, and correct the extent of lake and reservoir features that are larger than 0.75 km². The following assumptions were applied while developing the new dataset: a) lakes and reservoirs cannot directly feed into each other; b) each waterbody must have one outlet; and c) a single lake or reservoir feature cannot have multiple parts. The majority of the NHDPlusV2 waterbody features in the original dataset are delineated correctly. However, approximately 3% of the lake and reservoir polygons were found to contain topological errors and were corrected accordingly. It is important to fix these digitizing errors because the waterbody features are closely linked to the river topology. This new waterbody dataset will ensure that model-simulated water is directed into and through the lakes and reservoirs in a manner that supports the NWM code base and assumptions. 
The improved dataset will facilitate more effective integration of lakes and reservoirs with correct spatial features into the updated NWM.
On sample size and different interpretations of snow stability datasets

NASA Astrophysics Data System (ADS)

Schirmer, M.; Mitterer, C.; Schweizer, J.

2009-04-01

Interpretations of snow stability variations need an assessment of the stability itself, independent of the scale investigated in the study. Studies on stability variations at a regional scale have often chosen stability tests such as the Rutschblock test or combinations of various tests in order to detect differences in aspect and elevation. The question arose: 'how capable are such stability interpretations in drawing conclusions?' There are at least three possible error sources: (i) the variance of the stability test itself; (ii) the stability variance at an underlying slope scale, and (iii) that the stability interpretation might not be directly related to the probability of skier triggering. Various stability interpretations have been proposed in the past that provide partly different results. We compared a subjective one based on expert knowledge with a more objective one based on a measure derived from comparing skier-triggered slopes vs. slopes that have been skied but not triggered. In this study, the uncertainties are discussed and their effects on regional scale stability variations will be quantified in a pragmatic way. An existing dataset with very large sample sizes was revisited. This dataset contained the variance of stability at a regional scale for several situations. The stability in this dataset was determined using the subjective interpretation scheme based on expert knowledge. The question to be answered was how many measurements were needed to obtain similar results (mainly stability differences in aspect or elevation) as with the complete dataset. The optimal sample size was obtained in several ways: (i) assuming a nominal data scale the sample size was determined with a given test, significance level and power, and by calculating the mean and standard deviation of the complete dataset. With this method it can also be determined if the complete dataset consists of an appropriate sample size. 
(ii) Smaller subsets were created with similar aspect distributions to the large dataset. We used 100 different subsets for each sample size. Statistical variations obtained in the complete dataset were also tested on the smaller subsets using the Mann-Whitney or the Kruskal-Wallis test. For each subset size, the number of subsets in which the significance level was reached was counted. For these tests no nominal data scale was assumed. (iii) For the same subsets described above, the distribution of the aspect median was determined. A count of how often this distribution was substantially different from the distribution obtained with the complete dataset was made. Since two valid stability interpretations were available (an objective and a subjective interpretation as described above), the effect of the arbitrary choice of the interpretation on spatial variability results was tested. In over one third of the cases the two interpretations came to different results. The effect of these differences was studied with a method similar to that described in (iii): the distribution of the aspect median was determined for subsets of the complete dataset using both interpretations, compared against each other as well as to the results of the complete dataset. For the complete dataset the two interpretations showed mainly identical results. Therefore the subset size was determined from the point at which the results of the two interpretations converged. A universal result for the optimal subset size cannot be presented since results differed between different situations contained in the dataset. The optimal subset size is thus dependent on stability variation in a given situation, which is unknown initially. There are indications that for some situations even the complete dataset might not be large enough. At a subset size of approximately 25, the significant differences between aspect groups (as determined using the whole dataset) were only obtained in one out of five situations. 
In some situations, up to 20% of the subsets showed a substantially different distribution of the aspect median. Thus, in most cases, 25 measurements (which can be achieved by six two-person teams in one day) did not allow reliable conclusions to be drawn.
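Approach (ii) above, drawing 100 subsets per sample size and counting how often the significance level is reached, can be sketched as follows. This is a simplified illustration under stated assumptions: only two aspect groups are compared, the stability scores are synthetic, subsets preserve the group proportions as a stand-in for "similar aspect distributions", and the Mann-Whitney p-value uses a normal approximation with tie correction and no continuity correction. The names `mann_whitney_p` and `share_significant` are hypothetical.

```python
import math
import random

def mann_whitney_p(a, b):
    """Two-sided Mann-Whitney p-value (normal approximation, tie-corrected)."""
    n1, n2 = len(a), len(b)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    rank_sum_a, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the ranks i+1 .. j shared by tied values
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_a = rank_sum_a - n1 * (n1 + 1) / 2.0
    variance = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:
        return 1.0  # all values tied: no evidence of a difference
    z = (u_a - n1 * n2 / 2.0) / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2.0))

def share_significant(groups, subset_size, n_subsets=100, alpha=0.05, seed=0):
    """Fraction of random subsets in which the two groups differ at level alpha."""
    rng = random.Random(seed)
    a, b = groups.values()
    n_a = max(2, round(subset_size * len(a) / (len(a) + len(b))))
    n_b = max(2, subset_size - n_a)
    hits = sum(
        mann_whitney_p(rng.sample(a, n_a), rng.sample(b, n_b)) < alpha
        for _ in range(n_subsets)
    )
    return hits / n_subsets

# Synthetic stability scores for two hypothetical aspect groups.
data_rng = random.Random(1)
complete = {
    "north": [data_rng.gauss(2.0, 1.0) for _ in range(200)],
    "south": [data_rng.gauss(3.0, 1.0) for _ in range(200)],
}
for size in (25, 100):
    print(size, share_significant(complete, size))
```

The returned fraction plays the role of an empirical power estimate: the subset size at which it approaches 1 indicates how many measurements are needed to reproduce the difference seen in the complete dataset.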
Effects of spatial resolution and landscape structure on land cover characterization

NASA Astrophysics Data System (ADS)

Yang, Wenli

This dissertation addressed problems in scaling, problems that are among the main challenges in remote sensing. The principal objective of the research was to investigate the effects of changing spatial scale on the representation of land cover. A second objective was to determine the relationship between such effects, characteristics of landscape structure and scaling procedures. Four research issues related to spatial scaling were examined. They included: (1) the upscaling of Normalized Difference Vegetation Index (NDVI); (2) the effects of spatial scale on indices of landscape structure; (3) the representation of land cover databases at different spatial scales; and (4) the relationships between landscape indices and land cover area estimations. The overall bias resulting from non-linearity of NDVI in relation to spatial resolution is generally insignificant as compared to other factors such as influences of aerosols and water vapor. The bias is, however, related to land surface characteristics. Significant errors may be introduced in heterogeneous areas where different land cover types exhibit strong spectral contrast. Spatially upscaled SPOT and TM NDVIs have information content comparable with the AVHRR-derived NDVI. Indices of landscape structure and spatial resolution are generally related, but the exact forms of the relationships are subject to changes in other factors including the basic patch unit constituting a landscape and the proportional area of foreground land cover under consideration. The extent of agreement between spatially aggregated coarse resolution land cover datasets and full resolution datasets changes with the properties of the original datasets, including the pixel size and class definition. There are close relationships between landscape structure and class areas estimated from spatially aggregated land cover databases. The relationships, however, do not permit extension from one area to another. 
Inversion calibration across different geographic/ecological areas is, therefore, not feasible. Different rules govern the land cover area changes across resolutions when different upscaling methods are used. Special attention should be given to comparison between land cover maps derived using different methods.
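The non-linearity of NDVI under spatial upscaling mentioned above can be made concrete with a small numeric sketch. It assumes that upscaling is done by averaging red and near-infrared reflectances over a block of fine pixels (one possible upscaling procedure, not necessarily the one used in the dissertation); the reflectance values are invented.

```python
def ndvi(red, nir):
    """Normalized Difference Vegetation Index from red and near-infrared reflectance."""
    return (nir - red) / (nir + red)

def upscaling_bias(fine_pixels):
    """NDVI of block-averaged reflectances minus the mean of the fine-pixel NDVIs."""
    n = len(fine_pixels)
    mean_red = sum(red for red, _ in fine_pixels) / n
    mean_nir = sum(nir for _, nir in fine_pixels) / n
    mean_of_ndvi = sum(ndvi(red, nir) for red, nir in fine_pixels) / n
    return ndvi(mean_red, mean_nir) - mean_of_ndvi

# Invented (red, nir) reflectances for a 2 x 2 block of fine pixels.
homogeneous = [(0.05, 0.50)] * 4                               # dense vegetation only
mixed = [(0.05, 0.50), (0.05, 0.50), (0.30, 0.35), (0.30, 0.35)]  # vegetation next to bare soil

print("homogeneous block bias:", upscaling_bias(homogeneous))
print("mixed block bias:      ", upscaling_bias(mixed))
```

The bias vanishes for the homogeneous block and becomes non-zero once cover types with strong spectral contrast share a coarse pixel, which is consistent with the abstract's statement that significant errors arise mainly in heterogeneous areas.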
A Model-Based Approach for Microvasculature Structure Distortion Correction in Two-Photon Fluorescence Microscopy Images

PubMed Central

Dao, Lam; Glancy, Brian; Lucotte, Bertrand; Chang, Lin-Ching; Balaban, Robert S; Hsu, Li-Yueh

2015-01-01

SUMMARY This paper investigates a post-processing approach to correct spatial distortion in two-photon fluorescence microscopy images for vascular network reconstruction. It is aimed at in vivo imaging of large field-of-view, deep-tissue studies of vascular structures. Based on simple geometric modeling of the object-of-interest, a distortion function is directly estimated from the image volume by deconvolution analysis. This distortion function is then applied to sub volumes of the image stack to adaptively adjust for spatially varying distortion and reduce the image blurring through blind deconvolution. The proposed technique was first evaluated in phantom imaging of fluorescent microspheres that are comparable in size to the underlying capillary vascular structures. The effectiveness of restoring three-dimensional spherical geometry of the microspheres using the estimated distortion function was compared with empirically measured point-spread function. Next, the proposed approach was applied to in vivo vascular imaging of mouse skeletal muscle to reduce the image distortion of the capillary structures. We show that the proposed method effectively improves the image quality and reduces the spatially varying distortion that occurs in large field-of-view deep-tissue vascular datasets. The proposed method will help in qualitative interpretation and quantitative analysis of vascular structures from fluorescence microscopy images. PMID:26224257
Mapping regional soil water erosion risk in the Brittany-Loire basin for water management agency

NASA Astrophysics Data System (ADS)

Degan, Francesca; Cerdan, Olivier; Salvador-Blanes, Sébastien; Gautier, Jean-Noël

2014-05-01

Soil water erosion is one of the main degradation processes that affect soils through the removal of soil particles from the surface. The impacts for environment and agricultural areas are diverse, such as water pollution, crop yield depression, organic matter loss and reduction in water storage capacity. There is therefore a strong need to produce maps at the regional scale to help environmental policy makers and soil and water management bodies to mitigate the effect of water and soil pollution. Our approach aims to model and map soil erosion risk at regional scale (155 000 km²) and high spatial resolution (50 m) in the Brittany - Loire basin. The factors responsible for soil erosion are different according to the spatial and time scales considered. The regional scale entails challenges about homogeneous data sets availability, spatial resolution of results, various erosion processes and agricultural practices. We chose to improve the MESALES model (Le Bissonnais et al., 2002) to map soil erosion risk, because it was developed specifically for water erosion in agricultural fields in temperate areas. The MESALES model consists in a decision tree which gives for each combination of factors the corresponding class of soil erosion risk. Four factors that determine soil erosion risk are considered: soils, land cover, climate and topography. The first main improvement of the model consists in using newly available datasets that are more accurate than the initial ones. The datasets used cover all the study area homogeneously. Soil dataset has a 1/1 000 000 scale and attributes such as texture, soil type, rock fragment and parent material are used. The climate dataset has a spatial resolution of 8 km and a temporal resolution of mm/day for 12 years. Elevation dataset has a spatial resolution of 50 m. Three different land cover datasets are used where the finest spatial resolution is 50 m over three years. 
Using these datasets, four erosion factors are characterized and quantified: the soil factors (soil sealing, erodibility and runoff), the rate of land cover over three years for each season and for 77 land use classes, the topographic factor (slope and drainage area) and the climate hazard (seasonal amount and rainfall erosivity). These modifications of the original MESALES model allow a better representation of erosion risk for arable and bare land. We validated model results through stakeholder consultations and meetings across the whole study area. The model was then modified to take the validation results into account. Results are provided with a spatial resolution of 1 km, and then integrated into 2121 catchments. An erosion risk map for each season and an annual erosion risk map are produced. These new maps allow the 2121 catchments to be ranked into three erosion risk classes. In the annual erosion risk map, 347 catchments have the highest erosion risk, which corresponds to 16 % of total Brittany-Loire basin area. The water management agency now uses these maps to identify priority areas and to plan specific preservation practices.
Developing a regional retrospective ensemble precipitation dataset for watershed hydrology modeling, Idaho, USA

NASA Astrophysics Data System (ADS)

Flores, A. N.; Smith, K.; LaPorte, P.

2011-12-01

Applications like flood forecasting, military trafficability assessment, and slope stability analysis necessitate the use of models capable of resolving hydrologic states and fluxes at spatial scales of hillslopes (e.g., 10s to 100s m). These models typically require precipitation forcings at spatial scales of kilometers or better and time intervals of hours. Yet in especially rugged terrain that typifies much of the Western US and throughout much of the developing world, precipitation data at these spatiotemporal resolutions is difficult to come by. Ground-based weather radars have significant problems in high-relief settings and are sparsely located, leaving significant gaps in coverage and high uncertainties. Precipitation gages provide accurate data at points but are very sparsely located and their placement is often not representative, yielding significant coverage gaps in a spatial and physiographic sense. Numerical weather prediction efforts have made precipitation data, including critically important information on precipitation phase, available globally and in near real-time. However, these datasets present watershed modelers with two problems: (1) spatial scales of many of these datasets are tens of kilometers or coarser, (2) numerical weather models used to generate these datasets include a land surface parameterization that in some circumstances can significantly affect precipitation predictions. We report on the development of a regional precipitation dataset for Idaho that leverages: (1) a dataset derived from a numerical weather prediction model, (2) gages within Idaho that report hourly precipitation data, and (3) a long-term precipitation climatology dataset. Hourly precipitation estimates from the Modern Era Retrospective-analysis for Research and Applications (MERRA) are stochastically downscaled using a hybrid orographic and statistical model from their native resolution (1/2 x 2/3 degrees) to a resolution of approximately 1 km. 
Downscaled precipitation realizations are conditioned on hourly observations from reporting gages and then conditioned again on the Parameter-elevation Regressions on Independent Slopes Model (PRISM) at the monthly timescale to reflect orographic precipitation trends common to watersheds of the Western US. While this methodology potentially introduces cross-pollination of errors due to the re-use of precipitation gage data, it nevertheless achieves an ensemble-based precipitation estimate and appropriate measures of uncertainty at a spatiotemporal resolution appropriate for watershed modeling.
Accuracy assessment of seven global land cover datasets over China

NASA Astrophysics Data System (ADS)

Yang, Yongke; Xiao, Pengfeng; Feng, Xuezhi; Li, Haixing

2017-03-01

Land cover (LC) is a vital foundation of Earth science. To date, several global LC datasets have been produced through the efforts of many scientific communities. To provide guidelines for data usage over China, nine LC maps from seven global LC datasets (IGBP DISCover, UMD, GLC, MCD12Q1, GLCNMO, CCI-LC, and GlobeLand30) were evaluated in this study. First, we compared their similarities and discrepancies in both area and spatial patterns, and analysed their inherent relations to data sources and classification schemes and methods. Next, five sets of validation sample units (VSUs) were collected to calculate their accuracy quantitatively. Further, we built a spatial analysis model and depicted their spatial variation in accuracy based on the five sets of VSUs. The results show that there are evident discrepancies among these LC maps in both area and spatial patterns. For LC maps produced by different institutes, GLC 2000 and CCI-LC 2000 have the highest overall spatial agreement (53.8%). For LC maps produced by the same institute, the overall spatial agreement of CCI-LC 2000 and 2010, and of MCD12Q1 2001 and 2010, reaches 99.8% and 73.2%, respectively. More effort is still needed, however, before these LC maps can be used as time series data for model input, since both CCI-LC and MCD12Q1 fail to represent the rapidly changing trend of several key LC classes in the early 21st century, in particular urban and built-up, snow and ice, water bodies, and permanent wetlands. With the highest spatial resolution, GlobeLand30 2010 has an overall accuracy of 82.39%. Among the other six LC datasets with coarse resolution, CCI-LC 2010/2000 has the highest overall accuracy, followed in turn by MCD12Q1 2010/2001, GLC 2000, GLCNMO 2008, IGBP DISCover, and UMD. Although all maps exhibit high accuracy in homogeneous regions, local accuracies in other regions differ considerably, particularly in the Farming-Pastoral Zone of North China, the mountains of Northeast China, and the Southeast Hills.
Data users interested in these regions should take particular care.
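As a rough illustration of the "overall spatial agreement" figures quoted in this abstract, the sketch below counts the share of co-located pixels that carry the same class in two maps. It assumes both maps have already been resampled to a common grid and translated to a common legend; the abstract does not say how the authors handled either step, and the function name and the tiny example grids are hypothetical.

```python
def overall_agreement(map_a, map_b):
    """Fraction of co-located pixels with the same class label in both maps.

    Assumes both maps share one grid and one harmonized legend
    (an assumption of this sketch, not a detail taken from the abstract).
    """
    if len(map_a) != len(map_b):
        raise ValueError("maps must have the same number of rows")
    total = 0
    same = 0
    for row_a, row_b in zip(map_a, map_b):
        if len(row_a) != len(row_b):
            raise ValueError("maps must have the same number of columns")
        for label_a, label_b in zip(row_a, row_b):
            total += 1
            if label_a == label_b:
                same += 1
    return same / total


# Hypothetical 2 x 3 maps for two epochs; one pixel changes from crop to urban.
map_2000 = [["forest", "crop", "water"],
            ["crop", "crop", "urban"]]
map_2010 = [["forest", "crop", "water"],
            ["crop", "urban", "urban"]]

agreement = overall_agreement(map_2000, map_2010)  # 5 of 6 pixels agree
```

The same ratio, computed between a map and a set of reference labels at validation sample units, gives an overall accuracy.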

Evaluating Climate Causation of Conflict in Darfur Using Multi-temporal, Multi-resolution Satellite Image Datasets With Novel Analyses

NASA Astrophysics Data System (ADS)

Brown, I.; Wennbom, M.

2013-12-01

Climate change, population growth and changes in traditional lifestyles have led to instabilities in traditional demarcations between neighboring ethnic and religious groups in the Sahel region. This has resulted in a number of conflicts as groups resort to arms to settle disputes. Such disputes often centre on or are justified by competition for resources. The conflict in Darfur has been controversially explained by resource scarcity resulting from climate change. Here we analyse established methods of using satellite imagery to assess vegetation health in Darfur. Multi-decadal time series of observations are available using low spatial resolution visible-near infrared imagery. Typically, normalized difference vegetation index (NDVI) analyses are produced to describe changes in vegetation 'greenness' or 'health'. Such approaches have been widely used to evaluate the long term development of vegetation in relation to climate variations across a wide range of environments from the Arctic to the Sahel. These datasets typically measure peak NDVI observed over a given interval and may introduce bias. It is furthermore unclear how the spatial organization of sparse vegetation may affect low resolution NDVI products. We develop and assess alternative measures of vegetation including descriptors of the growing season, wetness and resource availability. Expanding the range of parameters used in the analysis reduces our dependence on peak NDVI. Furthermore, these descriptors provide a better characterization of the growing season than the single NDVI measure. Using multi-sensor data we combine high temporal/moderate spatial resolution data with low temporal/high spatial resolution data to improve the spatial representativity of the observations and to provide improved spatial analysis of vegetation patterns. The approach places the high resolution observations in the NDVI context space using a longer time series of lower resolution imagery. 
The vegetation descriptors derived are evaluated using independent high spatial resolution datasets that reveal the pattern and health of vegetation at metre scales. We also use climate variables to support the interpretation of these data. We conclude that the spatio-temporal patterns in Darfur vegetation and climate datasets suggest that labelling the conflict a climate-change conflict is inaccurate and premature.
Reconstruction of a Three Hourly 1-km Land Surface Air Temperature Dataset in the Qinghai-Tibet Plateau

NASA Astrophysics Data System (ADS)

Zhou, J.; Ding, L.

2017-12-01

Land surface air temperature (SAT) is an important parameter in the modeling of radiation balance and energy budget of the earth surface. Generally, SAT is measured at ground meteorological stations; SAT mapping is then possible through a spatial interpolation process. The interpolated SAT map relies on the spatial distribution of ground stations, the terrain, and many other factors; thus, it has great uncertainties in regions with complicated terrain. Alternatively, a SAT map can be obtained through physical modeling of interactions between the land surface and the atmosphere. Such a dataset generally has coarse spatial resolution (e.g. coarser than 0.1°) and cannot satisfy applications at fine scales, e.g. 1 km. This presentation reports the reconstruction of a three hourly 1-km SAT dataset from 2001 to 2015 over the Qinghai-Tibet Plateau. The terrain in the Qinghai-Tibet Plateau, especially in the eastern part, is extremely complicated. Two SAT datasets of good quality are used in this study. The first one is from the 3h China Meteorological Forcing Dataset with a 0.1° resolution released by the Institute of Tibetan Plateau Research, Chinese Academy of Sciences (Yang et al., 2010); the second one is from the ERA-Interim product with the same temporal resolution and a 0.125° resolution. A statistical approach is developed to downscale the spatial resolution of the derived SAT to 1-km. The elevation and the normalized difference vegetation index (NDVI) are selected as two scaling factors in the downscaling approach. Results demonstrate that there is a significant negative correlation between the SAT and elevation in all seasons; there is also a significant negative correlation between the SAT and NDVI in the vegetation growth seasons, while the correlation decreases in the other seasons. Therefore, a temporally dynamic downscaling approach is feasible to enhance the spatial resolution of the SAT. 
Compared with the SAT at the 0.1° or 0.125°, the reconstructed 1-km SAT can provide much more spatial details in areas with complicated terrain. Additionally, the 1-km SAT agrees well with the ground measured air temperatures as well as the SAT before downscaling. The reconstructed SAT will be beneficial for the modeling of surface radiation balance and energy budget over the Qinghai-Tibet Plateau.
The sensitivity of ecosystem service models to choices of input data and spatial resolution

USGS Publications Warehouse

Bagstad, Kenneth J.; Cohen, Erika; Ancona, Zachary H.; McNulty, Steven; Sun, Ge

2018-01-01

Although ecosystem service (ES) modeling has progressed rapidly in the last 10–15 years, comparative studies on data and model selection effects have become more common only recently. Such studies have drawn mixed conclusions about whether different data and model choices yield divergent results. In this study, we compared the results of different models to address these questions at national, provincial, and subwatershed scales in Rwanda. We compared results for carbon, water, and sediment as modeled using InVEST and WaSSI using (1) land cover data at 30 and 300 m resolution and (2) three different input land cover datasets. WaSSI and simpler InVEST models (carbon storage and annual water yield) were relatively insensitive to the choice of spatial resolution, but more complex InVEST models (seasonal water yield and sediment regulation) produced large differences when applied at differing resolution. Six out of nine ES metrics (InVEST annual and seasonal water yield and WaSSI) gave similar predictions for at least two different input land cover datasets. Despite differences in mean values when using different data sources and resolution, we found significant and highly correlated results when using Spearman's rank correlation, indicating consistent spatial patterns of high and low values. Our results confirm and extend conclusions of past studies, showing that in certain cases (e.g., simpler models and national-scale analyses), results can be robust to data and modeling choices. For more complex models, those with different output metrics, and subnational to site-based analyses in heterogeneous environments, data and model choices may strongly influence study findings.
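The abstract's point that mean values can differ while spatial patterns stay consistent is what a rank correlation captures. The sketch below computes Spearman's rank correlation as the Pearson correlation of average ranks, a standard formulation that handles ties. The per-subwatershed numbers are invented for illustration and do not come from the study.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical water-yield values per subwatershed from two land cover inputs.
# The magnitudes differ, but the ordering of subwatersheds is identical.
yield_input_a = [10.0, 20.0, 30.0, 40.0]
yield_input_b = [12.0, 25.0, 31.0, 90.0]
rho = spearman(yield_input_a, yield_input_b)  # identical ordering gives rho = 1
```

A rho close to 1 with different means would match the situation described above: the inputs disagree on magnitudes but agree on where the high and low values are.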
Accurate population genetic measurements require cryptic species identification in corals

NASA Astrophysics Data System (ADS)

Sheets, Elizabeth A.; Warner, Patricia A.; Palumbi, Stephen R.

2018-06-01

Correct identification of closely related species is important for reliable measures of gene flow. Incorrectly lumping individuals of different species together has been shown to over- or underestimate population differentiation, but examples highlighting when these different results are observed in empirical datasets are rare. Using 199 single nucleotide polymorphisms, we assigned 768 individuals in the Acropora hyacinthus and A. cytherea morphospecies complexes to each of eight previously identified cryptic genetic species and measured intraspecific genetic differentiation across three geographic scales (within reefs, among reefs within an archipelago, and among Pacific archipelagos). We then compared these calculations to estimated genetic differentiation at each scale with all cryptic genetic species mixed as if we could not tell them apart. At the reef scale, correct genetic species identification yielded lower FST estimates and fewer significant comparisons than when species were mixed, raising estimates of short-scale gene flow. In contrast, correct genetic species identification at large spatial scales yielded higher FST measurements than mixed-species comparisons, lowering estimates of long-term gene flow among archipelagos. A meta-analysis of published population genetic studies in corals found similar results: FST estimates at small spatial scales were lower and significance was found less often in studies that controlled for cryptic species. Our results and these prior datasets controlling for cryptic species suggest that genetic differentiation among local reefs may be lower than what has generally been reported in the literature. Not properly controlling for cryptic species structure can bias population genetic analyses in different directions across spatial scales, and this has important implications for conservation strategies that rely on these estimates.
Optimising Habitat-Based Models for Wide-Ranging Marine Predators: Scale Matters

NASA Astrophysics Data System (ADS)

Scales, K. L.; Hazen, E. L.; Jacox, M.; Edwards, C. A.; Bograd, S. J.

2016-12-01

Predicting the responses of marine top predators to dynamic oceanographic conditions requires habitat-based models that sufficiently capture environmental preferences. Spatial resolution and temporal averaging of environmental data layers is a key aspect of model construction. The utility of surfaces contemporaneous to animal movement (e.g. daily, weekly), versus synoptic products (monthly, seasonal, climatological) is currently under debate, as is the optimal spatial resolution for predictive products. Using movement simulations with built-in environmental preferences (correlated random walks, multi-state hidden Markov-type models) together with modeled (Regional Oceanographic Modeling System, ROMS) and remotely-sensed (MODIS-Aqua) datasets, we explored the effects of degrading environmental surfaces (3km - 1 degree, daily - climatological) on model inference. We simulated the movements of a hypothetical wide-ranging marine predator through the California Current system over a three month period (May-June-July), based on metrics derived from previously published blue whale Balaenoptera musculus tracking studies. Results indicate that models using seasonal or climatological data fields can overfit true environmental preferences, in both presence-absence and behaviour-based model formulations. Moreover, the effects of a degradation in spatial resolution are more pronounced when using temporally averaged fields than when using daily, weekly or monthly datasets. In addition, we observed a notable divergence between the `best' models selected using common methods (e.g. AUC, AICc) and those that most accurately reproduced built-in environmental preferences. These findings have important implications for conservation and management of marine mammals, seabirds, sharks, sea turtles and large teleost fish, particularly in implementing dynamic ocean management initiatives and in forecasting responses to future climate-mediated ecosystem change.
Optimising Habitat-Based Models for Wide-Ranging Marine Predators: Scale Matters

NASA Astrophysics Data System (ADS)

Scales, K. L.; Hazen, E. L.; Jacox, M.; Edwards, C. A.; Bograd, S. J.

2016-02-01

Predicting the responses of marine top predators to dynamic oceanographic conditions requires habitat-based models that sufficiently capture environmental preferences. Spatial resolution and temporal averaging of environmental data layers is a key aspect of model construction. The utility of surfaces contemporaneous to animal movement (e.g. daily, weekly), versus synoptic products (monthly, seasonal, climatological) is currently under debate, as is the optimal spatial resolution for predictive products. Using movement simulations with built-in environmental preferences (correlated random walks, multi-state hidden Markov-type models) together with modeled (Regional Oceanographic Modeling System, ROMS) and remotely-sensed (MODIS-Aqua) datasets, we explored the effects of degrading environmental surfaces (3km - 1 degree, daily - climatological) on model inference. We simulated the movements of a hypothetical wide-ranging marine predator through the California Current system over a three month period (May-June-July), based on metrics derived from previously published blue whale Balaenoptera musculus tracking studies. Results indicate that models using seasonal or climatological data fields can overfit true environmental preferences, in both presence-absence and behaviour-based model formulations. Moreover, the effects of a degradation in spatial resolution are more pronounced when using temporally averaged fields than when using daily, weekly or monthly datasets. In addition, we observed a notable divergence between the `best' models selected using common methods (e.g. AUC, AICc) and those that most accurately reproduced built-in environmental preferences. These findings have important implications for conservation and management of marine mammals, seabirds, sharks, sea turtles and large teleost fish, particularly in implementing dynamic ocean management initiatives and in forecasting responses to future climate-mediated ecosystem change.
Gridded precipitation fields at high temporal and spatial resolution for operational flood forecasting in the Rhine basin

NASA Astrophysics Data System (ADS)

van Osnabrugge, Bart; Weerts, Albrecht; Uijlenhoet, Remko

2017-04-01

Gridded areal precipitation, as one of the most important hydrometeorological input variables for initial state estimation in operational hydrological forecasting, is available in the form of raster data sets (e.g. HYRAS and EOBS) for the River Rhine basin. These datasets are compiled off-line on a daily time step using station data with the highest possible spatial density. However, such a product is not available operationally and at an hourly discretisation. Therefore, we constructed an hourly gridded precipitation dataset at 1.44 km2 resolution for the Rhine basin for the period from 1998 to present using a REGNIE-like interpolation procedure (Weerts et al., 2008) using a low and a high density rain gauge network. The datasets were validated against daily HYRAS (Rauthe, 2013) and EOBS (Haylock, 2008) data. The main goal of the operational procedure is to emulate the HYRAS dataset as closely as possible, as the daily HYRAS dataset is used in the off-line calibration of the hydrological model. Our main findings are that even with low station density, the spatial patterns found in the HYRAS data set are well reproduced. With low station density (years 1999-2006) our dataset underestimates precipitation compared to HYRAS and EOBS, notably during the winter. However, interpolation based on the same set of stations overestimates precipitation compared to EOBS for the years 2006-2014. This discrepancy disappears when switching to the high station density. We also analyze the robustness of the hourly precipitation fields by comparing with stations not used during interpolation. Specific issues regarding the data when creating the gridded precipitation fields will be highlighted. Finally, the datasets are used to drive an hourly and daily gridded WFLOW_HBV model of the Rhine at the same spatial resolution.
Haylock, M.R., N. Hofstra, A.M.G. Klein Tank, E.J. Klok, P.D. Jones and M. New, 2008: A European daily high-resolution gridded dataset of surface temperature and precipitation. J. Geophys. Res. (Atmospheres), 113, D20119, doi:10.1029/2008JD10201. Rauthe, M., Steiner, H., Riediger, U., Mazurkiewicz, A., Gratzki, A., 2013: A Central European precipitation climatology - Part 1: Generation and validation of a high-resolution gridded daily data set (HYRAS). Meteorologische Zeitschrift, 22(3), 235-256. Weerts, A.H., D. Meißner, and S. Rademacher, 2008: Input data rainfall-runoff model operational system FEWS-NL & FEWS-DE. Technical report, Deltares.
High Resolution Stratigraphic Mapping in Complex Terrain: A Comparison of Traditional Remote Sensing Techniques with Unmanned Aerial Vehicle - Structure from Motion Photogrammetry

NASA Astrophysics Data System (ADS)

Nesbit, P. R.; Hugenholtz, C.; Durkin, P.; Hubbard, S. M.; Kucharczyk, M.; Barchyn, T.

2016-12-01

Remote sensing and digital mapping have started to revolutionize geologic mapping in recent years as a result of their realized potential to provide high resolution 3D models of outcrops to assist with interpretation, visualization, and obtaining accurate measurements of inaccessible areas. However, in stratigraphic mapping applications in complex terrain, it is difficult to acquire information with sufficient detail at a wide spatial coverage with conventional techniques. We demonstrate the potential of a UAV and Structure from Motion (SfM) photogrammetric approach for improving 3D stratigraphic mapping applications within a complex badland topography. Our case study is performed in Dinosaur Provincial Park (Alberta, Canada), mapping late Cretaceous fluvial meander belt deposits of the Dinosaur Park formation amidst a succession of steeply sloping hills and abundant drainages - creating a challenge for stratigraphic mapping. The UAV-SfM dataset (2 cm spatial resolution) is compared directly with a combined satellite and aerial LiDAR dataset (30 cm spatial resolution) to reveal advantages and limitations of each dataset before presenting a unique workflow that utilizes the dense point cloud from the UAV-SfM dataset for analysis. The UAV-SfM dense point cloud minimizes distortion, preserves 3D structure, and records an RGB attribute - adding potential value in future studies. The proposed UAV-SfM workflow allows for high spatial resolution remote sensing of stratigraphy in complex topographic environments. This extended capability can add value to field observations and has the potential to be integrated with subsurface petroleum models.
Operational use of open satellite data for marine water quality monitoring

NASA Astrophysics Data System (ADS)

Symeonidis, Panagiotis; Vakkas, Theodoros

2017-09-01

The purpose of this study was to develop an operational platform for marine water quality monitoring using near real time satellite data. The developed platform utilizes free and open satellite data available from different data sources like COPERNICUS, the European Earth Observation Initiative, or NASA, from different satellites and instruments. The quality of the marine environment is operationally evaluated using parameters like chlorophyll-a concentration, water color and Sea Surface Temperature (SST). For each parameter, more than one dataset is available, from different data sources or satellites, to allow users to select the most appropriate dataset for their area or time of interest. These datasets are automatically downloaded from the data providers' services and ingested into the central spatial engine. The spatial data platform uses the PostgreSQL database with the PostGIS extension for spatial data storage and GeoServer for the provision of the spatial data services. The system provides daily, 10-day and monthly maps and time series of the above parameters. The information is provided using a web client which is based on the GET SDI PORTAL, an easy to use and feature rich geospatial visualization and analysis platform. The users can examine the temporal variation of the parameters using a simple time animation tool. In addition, with just one click on the map, the system provides an interactive time series chart for any of the parameters of the available datasets. The platform can be offered as Software as a Service (SaaS) to any area in the Mediterranean region.
High-Resolution Digital Terrain Models of the Sacramento/San Joaquin Delta Region, California

USGS Publications Warehouse

Coons, Tom; Soulard, Christopher E.; Knowles, Noah

2008-01-01

The U.S. Geological Survey (USGS) Western Region Geographic Science Center, in conjunction with the USGS Water Resources Western Branch of Regional Research, has developed a high-resolution elevation dataset covering the Sacramento/San Joaquin Delta region of California. The elevation data were compiled photogrammetrically from aerial photography (May 2002) with a scale of 1:15,000. The resulting dataset has a 10-meter horizontal resolution grid of elevation values. The vertical accuracy was determined to be 1 meter. Two versions of the elevation data are available: the first dataset has all water coded as zero, whereas the second dataset has bathymetry data merged with the elevation data. The projection of both datasets is set to UTM Zone 10, NAD 1983. The elevation data are clipped into files that spatially approximate 7.5-minute USGS quadrangles, with about 100 meters of overlap to facilitate combining the files into larger regions without data gaps. The files are named after the 7.5-minute USGS quadrangles that cover the same general spatial extent. File names that include a suffix (_b) indicate that the bathymetry data are included (for example, sac_east versus sac_east_b). These files are provided in ESRI Grid format.
SPARQL Query Re-writing Using Partonomy Based Transformation Rules

NASA Astrophysics Data System (ADS)

Jain, Prateek; Yeh, Peter Z.; Verma, Kunal; Henson, Cory A.; Sheth, Amit P.

Often the information present in a spatial knowledge base is represented at a different level of granularity and abstraction than the query constraints. For querying ontologies containing spatial information, the precise relationships between spatial entities have to be specified in the basic graph pattern of the SPARQL query, which can result in long and complex queries. We present a novel approach to help users intuitively write SPARQL queries to query spatial data, rather than relying on knowledge of the ontology structure. Our framework re-writes queries, using transformation rules to exploit part-whole relations between geographical entities to address the mismatches between query constraints and knowledge base. Our experiments were performed on completely third party datasets and queries. Evaluations were performed on the Geonames dataset using questions from the National Geographic Bee serialized into SPARQL and on the British Administrative Geography Ontology using questions from a popular trivia website. These experiments demonstrate high precision in retrieval of results and ease in writing queries.
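To make the part-whole idea concrete, the sketch below expands a region into everything that is directly or transitively part of it, which is the kind of expansion a partonomy-based rewrite could apply when a query names a coarse region but the knowledge base stores facts at a finer level. It is plain Python over a list of (part, whole) pairs, not the authors' rule engine and not SPARQL; the function name and the place names are illustrative assumptions.

```python
def contained_regions(part_of, region):
    """Return all entities that are directly or transitively part of `region`.

    `part_of` is a list of (part, whole) pairs, a hypothetical stand-in for
    part-whole triples in a spatial knowledge base.
    """
    children = {}
    for part, whole in part_of:
        children.setdefault(whole, []).append(part)

    found = []
    seen = {region}
    stack = [region]
    while stack:
        current = stack.pop()
        for child in children.get(current, []):
            if child not in seen:  # guards against cycles in the input
                seen.add(child)
                found.append(child)
                stack.append(child)
    return sorted(found)


# Illustrative partonomy: cities within a state within a country.
part_of = [
    ("Dayton", "Ohio"),
    ("Columbus", "Ohio"),
    ("Ohio", "United States"),
]

# A constraint on "United States" can be widened to these finer entities.
expanded = contained_regions(part_of, "United States")
```

A rewriter built on this idea would substitute the expanded set (or an equivalent graph pattern) for the original constraint, so the user does not need to know at which level the knowledge base records each fact.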
Spatiotemporal Permutation Entropy as a Measure for Complexity of Cardiac Arrhythmia

NASA Astrophysics Data System (ADS)

Schlemmer, Alexander; Berg, Sebastian; Lilienkamp, Thomas; Luther, Stefan; Parlitz, Ulrich

2018-05-01

Permutation entropy (PE) is a robust quantity for measuring the complexity of time series. In the cardiac community it is predominantly used in the context of electrocardiogram (ECG) signal analysis for diagnoses and predictions, with a major application found in heart rate variability parameters. In this article we combine spatial and temporal PE to form a spatiotemporal PE that captures both the complexity of spatial structures and temporal complexity at the same time. We demonstrate that the spatiotemporal PE (STPE) quantifies complexity using two datasets from simulated cardiac arrhythmia and compare it to phase singularity analysis and spatial PE (SPE). These datasets simulate ventricular fibrillation (VF) on a two-dimensional and a three-dimensional medium using the Fenton-Karma model. We show that SPE and STPE are robust against noise and demonstrate their usefulness for extracting complexity features at different spatial scales.
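For readers unfamiliar with permutation entropy, the sketch below shows the ordinary temporal form for a single time series: each window of `order` samples is reduced to the permutation that sorts it, and the Shannon entropy of the resulting pattern frequencies is computed. The abstract does not give the construction of the spatial or spatiotemporal variants, so this block covers only the temporal building block; the parameter names `order` and `delay` are this sketch's choices.

```python
import math
from collections import Counter


def permutation_entropy(series, order=3, delay=1, normalize=True):
    """Permutation entropy of a 1-D sequence (temporal form only).

    order: number of samples per ordinal pattern.
    delay: spacing between the samples inside one pattern.
    """
    n_windows = len(series) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("series too short for the chosen order and delay")

    patterns = Counter()
    for start in range(n_windows):
        window = [series[start + k * delay] for k in range(order)]
        # The ordinal pattern is the index order that sorts the window.
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        patterns[pattern] += 1

    entropy = -sum(
        (count / n_windows) * math.log(count / n_windows)
        for count in patterns.values()
    )
    if normalize:
        # Divide by the maximum possible entropy, log(order!).
        entropy /= math.log(math.factorial(order))
    return entropy


# A strictly increasing series has a single ordinal pattern: zero entropy.
monotone_pe = permutation_entropy(list(range(20)))

# A short irregular series mixes several patterns: entropy between 0 and 1.
irregular_pe = permutation_entropy([4, 7, 9, 10, 6, 11, 3])
```

With normalization, values near 0 indicate a highly regular signal and values near 1 indicate that all ordinal patterns occur about equally often.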
Demonstration of Airborne Wide Area Assessment Technologies at Pueblo Precision Bombing Ranges, Colorado. Hyperspectral Imaging, Version 2.0

DTIC Science & Technology

2007-09-27

the spatial and spectral resolution ...variety of geological and vegetation mapping efforts, the Hymap sensor offered the best available combination of spectral and spatial resolution, signal... The limitations of the technology currently relate to spatial and spectral resolution and geo-correction accuracy. Secondly, HSI datasets
Developing a new global network of river reaches from merged satellite-derived datasets

NASA Astrophysics Data System (ADS)

Lion, C.; Allen, G. H.; Beighley, E.; Pavelsky, T.

2015-12-01

In 2020, the Surface Water and Ocean Topography satellite (SWOT), a joint mission of NASA/CNES/CSA/UK, will be launched. One of its major products will be the measurements of continental water extent, including the width, height, and slope of rivers and the surface area and elevations of lakes. The mission will improve the monitoring of continental water and also our understanding of the interactions between different hydrologic reservoirs. For rivers, SWOT measurements of slope must be carried out over predefined river reaches. As such, an a priori dataset for rivers is needed in order to facilitate analysis of the raw SWOT data. The information required to produce this dataset includes measurements of river width, elevation, slope, planform, river network topology, and flow accumulation. To produce this product, we have linked two existing global datasets: the Global River Widths from Landsat (GRWL) database, which contains river centerline locations, widths, and a braiding index derived from Landsat imagery, and a modified version of the HydroSHEDS hydrologically corrected digital elevation product, which contains heights and flow accumulation measurements for streams at 3 arcsecond spatial resolution. Merging these two datasets requires considerable care. The difficulties lie, among others, in the difference in resolution (30 m versus 3 arcseconds) and in the age of the datasets (2000 versus ~2010; some rivers have moved, and the braided sections are different). As such, we have developed custom software to merge the two datasets, taking into account the spatial proximity of river channels in the two datasets and ensuring that flow accumulation in the final dataset always increases downstream. Here, we present our preliminary results for a portion of South America and demonstrate the strengths and weaknesses of the method.
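The abstract states that the merged dataset must have flow accumulation that increases downstream but does not describe how the custom software enforces this. One simple way to obtain that property, shown here purely as an assumption, is to walk a centerline from upstream to downstream and replace each value by the running maximum. Note that this yields a non-decreasing sequence rather than a strictly increasing one; the function name and the sample values are hypothetical.

```python
def enforce_downstream_increase(flow_accumulation):
    """Make flow accumulation non-decreasing along one centerline.

    `flow_accumulation` must be ordered from upstream to downstream.
    Each value is replaced by the largest value seen so far, so drops
    caused by mismatched channel positions are flattened out.
    """
    corrected = []
    running_max = float("-inf")
    for value in flow_accumulation:
        running_max = max(running_max, value)
        corrected.append(running_max)
    return corrected


# Hypothetical values sampled along a reach; two points dip because the
# centerline briefly falls on a neighbouring, smaller channel.
sampled = [100, 120, 115, 130, 128, 200]
corrected = enforce_downstream_increase(sampled)
```

A running maximum is only a minimal repair: it removes dips but cannot tell whether the dip or the preceding value is the erroneous one, which is where the spatial-proximity matching mentioned in the abstract would matter.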
The creation of future daily gridded datasets of precipitation and temperature with a spatial weather generator, Cyprus 2020-2050

NASA Astrophysics Data System (ADS)

Camera, Corrado; Bruggeman, Adriana; Hadjinicolaou, Panos; Pashiardis, Stelios; Lange, Manfred

2014-05-01

High-resolution gridded daily datasets are essential for natural resource management and the analysis of climate changes and their effects. This study aimed to create gridded datasets of daily precipitation and daily minimum and maximum temperature for the future (2020-2050). The horizontal resolution of the developed datasets is 1 x 1 km2, covering the area under control of the Republic of Cyprus (5,760 km2). The study is divided into two parts. The first consists of the evaluation of the performance of different interpolation techniques for daily rainfall and temperature data (1980-2010) for the creation of the gridded datasets. Rainfall data recorded at 145 stations and temperature data from 34 stations were used. For precipitation, inverse distance weighting (IDW) performs best for local events, while a combination of step-wise geographically weighted regression and IDW proves to be the best method for large scale events. For minimum and maximum temperature, a combination of step-wise linear multiple regression and thin plate splines is recognized as the best method. Six Regional Climate Models (RCMs) for the A1B SRES emission scenario from the EU ENSEMBLE project database were selected as sources for future climate projections. The RCMs were evaluated for their capacity to simulate Cyprus climatology for the period 1980-2010. Data for the period 2020-2050 from the three best performing RCMs were downscaled, using the change factors approach, at the location of observational stations. Daily time series were created with a stochastic rainfall and temperature generator. The RainSim V3 software (Burton et al., 2008) was used to generate spatial-temporal coherent rainfall fields. The temperature generator was developed in R and modeled temperature as a weakly stationary process with the daily mean and standard deviation conditioned on the wet and dry state of the day (Richardson, 1981). 
Finally gridded datasets depicting projected future climate conditions were created with the identified best interpolation methods. The difference between the input and simulated mean daily rainfall, averaged over all the stations, was 0.03 mm (2.2%), while the error related to the number of dry days was 2 (0.6%). For mean daily minimum temperature the error was 0.005 ºC (0.04%), while for maximum temperature it was 0.01 ºC (0.04%). Overall, the weather generators were found to be reliable instruments for the downscaling of precipitation and temperature. The resulting datasets indicate a decrease of the mean annual rainfall over the study area between 5 and 70 mm (1-15%) for 2020-2050, relative to 1980-2010. Average annual minimum and maximum temperature over the Republic of Cyprus are projected to increase between 1.2 and 1.5 ºC. The dataset is currently used to compute agricultural production and water use indicators, as part of the AGWATER project (AEIFORIA/GEORGO/0311(BIE)/06), co-financed by the European Regional Development Fund and the Republic of Cyprus through the Research Promotion Foundation. Burton, A., Kilsby, C.G., Fowler, H.J., Cowpertwait, P.S.P., and O'Connell, P.E.: RainSim: A spatial-temporal stochastic rainfall modelling system. Environ. Model. Software 23, 1356-1369, 2008 Richardson, C.W.: Stochastic simulation of daily precipitation, temperature, and solar radiation. Water Resour. Res. 17, 182-190, 1981.
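Inverse distance weighting, which the abstract above reports as the best interpolator for local rainfall events, can be sketched in a few lines: the estimate at a grid cell is a weighted mean of station values with weights proportional to 1/d^p. The power p = 2 and the station coordinates below are assumptions of this sketch; the abstract does not state which power or search neighbourhood the authors used.

```python
import math


def idw(stations, target, power=2.0):
    """Inverse distance weighted estimate at `target`.

    stations: iterable of (x, y, value) tuples in projected coordinates.
    target:   (x, y) location of the grid cell centre.
    power:    distance exponent (2.0 is a common default, assumed here).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The cell coincides with a station: return its value directly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical daily rainfall (mm) at two stations 2 km apart.
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]

midpoint = idw(stations, (1.0, 0.0))    # equidistant, so the plain mean
at_station = idw(stations, (0.0, 0.0))  # exact station value
```

Filling a 1 km grid amounts to calling `idw` once per cell centre; in practice a neighbourhood limit is usually added so that distant stations are ignored.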
Spectral-spatial classification of hyperspectral data with mutual information based segmented stacked autoencoder approach

NASA Astrophysics Data System (ADS)

Paul, Subir; Nagesh Kumar, D.

2018-04-01

Hyperspectral (HS) data comprise continuous spectral responses of hundreds of narrow spectral bands with very fine spectral resolution or bandwidth, which allow feature identification and classification with high accuracy. In the present study, a Mutual Information (MI) based Segmented Stacked Autoencoder (S-SAE) approach for spectral-spatial classification of the HS data is proposed to reduce the complexity and computational time compared to Stacked Autoencoder (SAE) based feature extraction. A non-parametric dependency measure (MI) based spectral segmentation is proposed instead of a linear and parametric dependency measure to account for both linear and nonlinear inter-band dependency in the spectral segmentation of the HS bands. Morphological profiles are then created corresponding to the segmented spectral features to assimilate the spatial information in the spectral-spatial classification approach. Two non-parametric classifiers, Support Vector Machine (SVM) with Gaussian kernel and Random Forest (RF), are used for classification of the three most popularly used HS datasets. Results of the numerical experiments carried out in this study show that SVM with a Gaussian kernel provides better results for the Pavia University and Botswana datasets, whereas RF performs better for the Indian Pines dataset. The experiments performed with the proposed methodology provide encouraging results compared to numerous existing approaches.
Percentage of Protected Area Amounts within each Watershed Boundary for the Conterminous US

EPA Science Inventory

Abstract: This dataset uses spatial information from the Watershed Boundary Dataset (WBD, March 2011) and the Protected Areas Database of the United States (PAD-US Version 1.0). The resulting data layer, with percentages of protected areas by category, was created using the ATtI...
Leveraging freely available remote sensing and ancillary datasets for semi-automated identification of potential wetland areas using a Geographic Information System (GIS).

DOT National Transportation Integrated Search

2016-06-01

The purpose of this study was to develop a wetland identification tool that makes use of freely available geospatial datasets to identify potential wetland locations at a spatial scale relevant for transportation corridor assessments. The tool was ...
ACCURACY OF THE 1992 NATIONAL LAND COVER DATASET AREA ESTIMATES: AN ANALYSIS AT MULTIPLE SPATIAL EXTENTS

EPA Science Inventory

Abstract for poster presentation:

Site-specific accuracy assessments evaluate fine-scale accuracy of land-use/land-cover (LULC) datasets but provide little insight into accuracy of area estimates of LULC classes derived from sampling units of varying size. Additiona...
The Influence of Spatial Resolutions on the Retrieval Accuracy of Sea Surface Wind Speed with Cross-polarized C-band SAR images

NASA Astrophysics Data System (ADS)

Zhang, K.; Han, B.; Mansaray, L. R.; Xu, X.; Guo, Q.; Jingfeng, H.

2017-12-01

Synthetic aperture radar (SAR) instruments on board satellites are valuable for high-resolution wind field mapping, especially for coastal studies. Since the launch of Sentinel-1A on April 3, 2014, followed by Sentinel-1B on April 25, 2016, a large amount of C-band SAR data has been added to a growing accumulation of SAR datasets (ERS-1/2, RADARSAT-1/2, ENVISAT). These new developments are of great significance for a wide range of applications in coastal sea areas, especially for high spatial resolution wind resource assessment, in which the accuracy of retrieved wind fields is crucial. It has recently been reported that wind speeds can also be retrieved from C-band cross-polarized SAR images, which is an important complement to wind speed retrieval from co-polarization. However, there is no consensus on the optimal resolution for wind speed retrieval from cross-polarized SAR images. This paper presents a comparison strategy for investigating the influence of spatial resolution on sea surface wind speed retrieval accuracy with cross-polarized SAR images. Firstly, for wind speeds retrieved from VV-polarized images, the optimal geophysical C-band model (CMOD) function was selected among four CMOD functions. Secondly, the most suitable C-band cross-polarized ocean (C-2PO) model was selected between two C-2POs for the VH-polarized image dataset. Then, the VH-wind speeds retrieved by the selected C-2PO were compared with the VV-polarized sea surface wind speeds retrieved using the optimal CMOD, which served as reference, at different spatial resolutions. Results show that the VH-polarized wind speed retrieval accuracy increases rapidly as the spatial resolution coarsens from 100 m to 1000 m, with a drop in RMSE of 42%. However, the improvement in wind speed retrieval accuracy levels off as the spatial resolution coarsens from 1000 m to 5000 m. This demonstrates that a pixel spacing of 1 km may be a compromise choice for the tradeoff between spatial resolution and wind speed retrieval accuracy with cross-polarized images obtained from the RADARSAT-2 fine quad polarization mode. Fig. 1 illustrates the variation of the following statistical parameters as a function of spatial resolution: Bias, Corr, R2, RMSE and STD.
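The resolution comparison in this abstract rests on two simple operations: coarsening a gridded field and scoring it against a reference with RMSE. The sketch below shows both, under the assumption that coarsening is done by plain block averaging; the abstract does not say how the authors resampled their images, and the function names are illustrative only.

```python
import math


def block_mean(grid, factor):
    """Coarsen a 2-D grid by averaging non-overlapping factor x factor blocks.

    Rows and columns that do not fill a complete block are dropped.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        row = []
        for c in range(0, cols - cols % factor, factor):
            block = [grid[r + i][c + j] for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def rmse(a, b):
    """Root-mean-square error between two grids of identical shape."""
    pairs = [(x, y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(pairs))


# Toy example: a 2 x 2 field coarsened by a factor of 2 becomes one pixel.
coarse = block_mean([[1.0, 3.0], [5.0, 7.0]], factor=2)
error = rmse([[1.0, 2.0]], [[1.0, 4.0]])
```

Averaging over larger blocks suppresses pixel-level noise, which is one plausible reason why retrieval error can fall as the pixel spacing grows before levelling off.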

Drought Variability in Eastern Part of Romania and its Connection with Large-Scale Air Circulation

NASA Astrophysics Data System (ADS)

Barbu, Nicu; Stefan, Sabina; Georgescu, Florinela

2014-05-01

Drought is a phenomenon that arises from a precipitation deficit and is intensified by strong winds, high temperatures, low relative humidity and high insolation; all these factors increase evapotranspiration, which contributes to soil water deficit. The Standardized Precipitation Evapotranspiration Index (SPEI) takes into account all the factors listed above. The temporal variability of drought in the eastern part of Romania over the 50 years from 1961 to 2010 is investigated. This study focuses on drought variability related to large-scale air circulation. The gridded SPEI dataset with a spatial resolution of 0.5° lat/lon (https://digital.csic.es/handle/10261/72264) was used to analyze drought periods in connection with large-scale air circulation determined from two catalogues (GWT - GrossWetter-Typen and WLK - WetterLagenKlassifikation) defined in the COST733 Action. The GWT catalogue uses sea level pressure as its input dataset, and the WLK catalogue uses the geopotential field at 925 hPa and 500 hPa, wind at 700 hPa and total water content of the entire atmospheric column. In this study we use the GWT catalogue with 18 circulation types and the WLK catalogue with 40 circulation types. The analysis for the Barlad Hydrological Basin indicated that negative values of SPEI (water deficit, i.e. drought periods) are associated with a prevailing anticyclonic regime and positive values of SPEI (water excess, i.e. rainy periods) are associated with a prevailing cyclonic regime, as expected. In the last decade, an increase in dry periods associated with an increase in anticyclonic activity over Romania was observed. According to the GWT18 catalogue, drought is associated with the north-eastern anticyclonic circulation type (NE-A). According to the WLK40 catalogue, the dominant circulation type associated with drought is the north-west-anticyclonic-dry anticyclonic (NW-AAD) type.
keywords: drought, SPEI, large-scale atmospheric circulation
Large Survey Database: A Distributed Framework for Storage and Analysis of Large Datasets

NASA Astrophysics Data System (ADS)

Juric, Mario

2011-01-01

The Large Survey Database (LSD) is a Python framework and DBMS for distributed storage, cross-matching and querying of large survey catalogs (>10^9 rows, >1 TB). The primary driver behind its development is the analysis of Pan-STARRS PS1 data. It is specifically optimized for fast queries and parallel sweeps of positionally and temporally indexed datasets. It transparently scales to more than 10^2 nodes, and can be made to function in "shared nothing" architectures. An LSD database consists of a set of vertically and horizontally partitioned tables, physically stored as compressed HDF5 files. Vertically, we partition the tables into groups of related columns ("column groups"), storing together logically related data (e.g., astrometry, photometry). Horizontally, the tables are partitioned into partially overlapping "cells" by position in space (lon, lat) and time (t). This organization allows for fast lookups based on spatial and temporal coordinates, as well as data and task distribution. The design was inspired by the success of Google BigTable (Chang et al., 2006). Our programming model is a pipelined extension of MapReduce (Dean and Ghemawat, 2004). An SQL-like query language is used to access data. For complex tasks, map-reduce "kernels" that operate on query results on a per-cell basis can be written, with the framework taking care of scheduling and execution. The combination leverages users' familiarity with SQL, while offering a fully distributed computing environment. LSD adds little overhead compared to direct Python file I/O. In tests, we swept through 1.1 Grows of PanSTARRS+SDSS data (220 GB) in less than 15 minutes on a dual-CPU machine. In a cluster environment, we achieved bandwidths of 17 Gbit/s (I/O limited). Based on current experience, we believe LSD should scale to be useful for analysis and storage of LSST-scale datasets. It can be downloaded from http://mwscience.net/lsd.
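The horizontal partitioning and per-cell kernels described in this abstract can be pictured with a small standalone sketch. This is not the LSD API: the function names, the row layout (dicts with `lon`, `lat`, `t` keys), and the cell sizes are assumptions made for illustration, and the sketch uses disjoint cells, whereas the abstract states that LSD cells partially overlap.

```python
from collections import defaultdict


def cell_id(lon, lat, t, cell_deg=1.0, cell_days=30.0):
    """Map a (lon, lat, t) position to a hypothetical space-time cell index."""
    return (int(lon // cell_deg), int(lat // cell_deg), int(t // cell_days))


def partition(rows, cell_deg=1.0, cell_days=30.0):
    """Horizontally partition rows into cells keyed by cell_id."""
    cells = defaultdict(list)
    for row in rows:
        key = cell_id(row["lon"], row["lat"], row["t"], cell_deg, cell_days)
        cells[key].append(row)
    return cells


def map_reduce(cells, kernel, reduce_fn):
    """Run `kernel` on each cell independently, then combine the results."""
    return reduce_fn(kernel(cid, rows) for cid, rows in cells.items())


rows = [
    {"lon": 10.2, "lat": -5.5, "t": 3.0},
    {"lon": 10.9, "lat": -5.1, "t": 12.0},
    {"lon": 200.0, "lat": 45.0, "t": 40.0},
]
cells = partition(rows)
# Kernel: count rows per cell; reduce: sum the per-cell counts.
total = map_reduce(cells, kernel=lambda cid, cell_rows: len(cell_rows), reduce_fn=sum)
```

Since each kernel call touches only one cell, the calls are independent and could be scheduled on different nodes, which is the property that makes this layout suitable for "shared nothing" execution.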
Dataset on spatial distribution and location of universities in Nigeria.

PubMed

Adeyemi, G A; Edeki, S O

2018-06-01

Access to a quality educational system and the location of educational institutions are of great importance for the future prospects of youth in any nation. These, in turn, have great effects on the economic growth and development of any country. Thus, the dataset contained in this article examines and explains the spatial distribution of universities in the Nigerian system of education. Data from the university commission, Nigeria, as of December 2017 are used. These include all 40 federal universities, 44 state universities, and 69 private universities, making a total of 153 universities in the Nigerian system of education. The data were analyzed using Geographic Information System (GIS) software. The dataset contained in this article will be of immense assistance to national educational policy makers, parents, and potential students in making informed and reliable academic decisions.
The importance of accurate road data for spatial applications in public health: customizing a road network

PubMed Central

Frizzelle, Brian G; Evenson, Kelly R; Rodriguez, Daniel A; Laraia, Barbara A

2009-01-01

Background: Health researchers have increasingly adopted the use of geographic information systems (GIS) for analyzing environments in which people live and how those environments affect health. One aspect of this research that is often overlooked is the quality and detail of the road data and whether or not it is appropriate for the scale of analysis. Many readily available road datasets, both public domain and commercial, contain positional errors or generalizations that may not be compatible with highly accurate geospatial locations. This study examined the accuracy, completeness, and currency of four readily available public and commercial sources for road data (North Carolina Department of Transportation, StreetMap Pro, TIGER/Line 2000, TIGER/Line 2007) relative to a custom road dataset which we developed and used for comparison. Methods and Results: A custom road network dataset was developed to examine associations between health behaviors and the environment among pregnant and postpartum women living in central North Carolina in the United States. Three analytical measures were developed to assess the comparative accuracy and utility of four publicly and commercially available road datasets and the custom dataset in relation to participants' residential locations over three time periods. The exclusion of road segments and positional errors in the four comparison road datasets resulted in between 5.9% and 64.4% of respondents lying farther than 15.24 meters from their nearest road, the distance threshold set by the project to facilitate spatial analysis. Agreement, using a Pearson's correlation coefficient, between the customized road dataset and the four comparison road datasets ranged from 0.01 to 0.82. Conclusion: This study demonstrates the importance of examining available road datasets and assessing their completeness, accuracy, and currency for their particular study area. 
This paper serves as an example for assessing the feasibility of readily available commercial or public road datasets, and outlines the steps by which an improved custom dataset for a study area can be developed. PMID:19409088
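One of the measures reported in this abstract, the share of respondents lying farther than 15.24 m from their nearest road, can be sketched as follows. The sketch assumes planar coordinates in meters and roads given as straight segments; the function names and the toy data are hypothetical and are not taken from the study.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the line, clamped to the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def percent_beyond(points, segments, threshold_m=15.24):
    """Percentage of points whose nearest road segment is farther than threshold_m."""
    far = sum(
        1
        for p in points
        if min(point_segment_distance(p, a, b) for a, b in segments) > threshold_m
    )
    return 100.0 * far / len(points)


roads = [((0.0, 0.0), (100.0, 0.0))]
homes = [(50.0, 10.0), (50.0, 20.0), (120.0, 0.0), (110.0, 0.0)]
share = percent_beyond(homes, roads)
```

A missing road segment raises this percentage in the same way a positional error does, so the measure captures both completeness and accuracy problems in a road dataset.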
Analyses of Sporocarps, Morphotyped Ectomycorrhizae, Environmental ITS and LSU Sequences Identify Common Genera that Occur at a Periglacial Site

PubMed Central

Jumpponen, Ari; Brown, Shawn P.; Trappe, James M.; Cázares, Efrén; Strömmer, Rauni

2015-01-01

Periglacial substrates exposed by retreating glaciers represent extreme and sensitive environments defined by a variety of abiotic stressors that challenge organismal establishment and survival. The simple communities often residing at these sites enable their analyses in depth. We utilized existing data and mined published sporocarp, morphotyped ectomycorrhizae (ECM), as well as environmental sequence data of internal transcribed spacer (ITS) and large subunit (LSU) regions of the ribosomal RNA gene to identify taxa that occur at a glacier forefront in the North Cascades Mountains in Washington State in the USA. The discrete data types consistently identified several common and widely distributed genera, perhaps best exemplified by Inocybe and Laccaria. Although we expected low diversity and richness, our environmental sequence data included 37 ITS and 26 LSU operational taxonomic units (OTUs) that likely form ECM. While environmental surveys of metabarcode markers detected large numbers of targeted ECM taxa, both the fruiting body and the morphotype datasets included genera that were undetected in either of the metabarcode datasets. These included hypogeous (Hymenogaster) and epigeous (Lactarius) taxa, some of which may produce large sporocarps but may possess small and/or spatially patchy genets. We highlight the importance of combining various data types to provide a comprehensive view of a fungal community, even in an environment assumed to host communities of low species richness and diversity. PMID:29376900
Forest biomass variation in Southernmost Brazil: the impact of Araucaria trees.

PubMed

Rosenfield, Milena Fermina; Souza, Alexandre F

2014-03-01

A variety of environmental and biotic factors determine vegetation growth and affect plant biomass accumulation. From temperature to species composition, aboveground biomass storage in forest ecosystems is influenced by a number of variables and usually presents a high spatial variability. With this focus, the aim of the study was to evaluate the variables affecting live aboveground forest biomass (AGB) in Subtropical Moist Forests of Southern Brazil, and to analyze the spatial distribution of biomass estimates. Data from a forest inventory performed in the State of Rio Grande do Sul, Southern Brazil, were used in the present study. Thirty-eight 1-ha plots were sampled and all trees with DBH ≥ 9.5 cm were included for biomass estimation. Values for aboveground biomass were obtained using published allometric equations. Environmental and biotic variables (elevation, rainfall, temperature, soils, stem density and species diversity) were obtained from the literature or calculated from the dataset. For the total dataset, mean AGB was 195.2 Mg/ha. Estimates differed between Broadleaf and Mixed Coniferous-Broadleaf forests: mean AGB was lower in Broadleaf Forests (AGB(BF)=118.9 Mg/ha) when compared to Mixed Forests (AGB(MF)=250.3 Mg/ha). There was a high spatial and local variability in our dataset, even within forest types. This condition is normal in tropical forests and is usually attributed to the presence of large trees. The explanatory multiple regressions were influenced mainly by elevation and explained 50.7% of the variation in AGB. Stem density, diversity and organic matter also influenced biomass variation. The results from our study showed a positive relationship between aboveground biomass and elevation. Therefore, higher values of AGB are located at higher elevations and subjected to cooler temperatures and wetter climate. 
There seems to be an important contribution of the coniferous species Araucaria angustifolia in Mixed Forest plots, as it presented significantly higher biomass than angiosperm species. In Brazil, this endangered species is part of a high diversity forest (Araucaria Forest) and has the potential for biomass storage. The results of the present study show the spatial and local variability in aboveground biomass in subtropical forests and highlight the importance of these ecosystems in global carbon stock, stimulating the improvement of future biomass estimates.
An Intercomparison of Large-Extent Tree Canopy Cover Geospatial Datasets

NASA Astrophysics Data System (ADS)

Bender, S.; Liknes, G.; Ruefenacht, B.; Reynolds, J.; Miller, W. P.

2017-12-01

As a member of the Multi-Resolution Land Characteristics Consortium (MRLC), the U.S. Forest Service (USFS) is responsible for producing and maintaining the tree canopy cover (TCC) component of the National Land Cover Database (NLCD). The NLCD-TCC data are available for the conterminous United States (CONUS), coastal Alaska, Hawai'i, Puerto Rico, and the U.S. Virgin Islands. The most recent official version of the NLCD-TCC data is based primarily on reference data from 2010-2011 and is part of the multi-component 2011 version of the NLCD. NLCD data are updated on a five-year cycle. The USFS is currently producing the next official version (2016) of the NLCD-TCC data for the United States, and it will be made publicly-available in early 2018. In this presentation, we describe the model inputs, modeling methods, and tools used to produce the 30-m NLCD-TCC data. Several tree cover datasets at 30-m, as well as datasets at finer resolution, have become available in recent years due to advancements in earth observation data and their availability, computing, and sensors. We compare multiple tree cover datasets that have similar resolution to the NLCD-TCC data. We also aggregate the tree class from fine-resolution land cover datasets to a percent canopy value on a 30-m pixel, in order to compare the fine-resolution datasets to the datasets created directly from 30-m Landsat data. The extent of the tree canopy cover datasets included in the study ranges from global and national to the state level. Preliminary investigation of multiple tree cover datasets over the CONUS indicates a high amount of spatial variability. For example, in a comparison of the NLCD-TCC and the Global Land Cover Facility's Landsat Tree Cover Continuous Fields (2010) data by MRLC mapping zones, the zone-level root mean-square deviation ranges from 2% to 39% (mean=17%, median=15%). The analysis outcomes are expected to inform USFS decisions with regard to the next cycle (2021) of NLCD-TCC production.
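Two steps mentioned in this abstract lend themselves to a short sketch: aggregating a fine-resolution tree class to a percent canopy value on a coarser pixel, and summarizing the disagreement between two canopy datasets by zone with a root mean-square deviation. The function names, the binary-mask input, and the toy values below are assumptions for illustration and do not reproduce the USFS workflow.

```python
import math
from collections import defaultdict


def percent_canopy(tree_mask, factor):
    """Aggregate a binary tree mask (1 = tree) to percent canopy per coarse pixel.

    Each coarse pixel covers a non-overlapping factor x factor block of fine pixels.
    """
    rows, cols = len(tree_mask), len(tree_mask[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        row = []
        for c in range(0, cols - cols % factor, factor):
            count = sum(tree_mask[r + i][c + j] for i in range(factor) for j in range(factor))
            row.append(100.0 * count / (factor * factor))
        out.append(row)
    return out


def rmsd_by_zone(a, b, zones):
    """Root mean-square deviation between paired values, grouped by zone label."""
    squared = defaultdict(list)
    for x, y, z in zip(a, b, zones):
        squared[z].append((x - y) ** 2)
    return {z: math.sqrt(sum(v) / len(v)) for z, v in squared.items()}


canopy = percent_canopy([[1, 0], [1, 1]], factor=2)
deviation = rmsd_by_zone(
    [10.0, 20.0, 30.0, 40.0],
    [12.0, 18.0, 30.0, 50.0],
    ["A", "A", "B", "B"],
)
```

With 1-m land cover and 30-m target pixels, `factor` would be 30, so each percent value summarizes 900 fine pixels.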
Assessing Human Modifications to Floodplains using Large-Scale Hydrogeomorphic Floodplain Modeling

NASA Astrophysics Data System (ADS)

Morrison, R. R.; Scheel, K.; Nardi, F.; Annis, A.

2017-12-01

Human modifications to floodplains for water resource and flood management purposes have significantly transformed river-floodplain connectivity dynamics in many watersheds. Bridges, levees, reservoirs, shifts in land use, and other hydraulic engineering works have altered flow patterns and caused changes in the timing and extent of floodplain inundation processes. These hydrogeomorphic changes have likely resulted in negative impacts to aquatic habitat and ecological processes. The availability of large-scale topographic datasets at high resolution provides an opportunity for detecting anthropogenic impacts by means of geomorphic mapping. We have developed and are implementing a methodology for comparing a hydrogeomorphic floodplain mapping technique to hydraulically-modeled floodplain boundaries to estimate floodplain loss due to human activities. Our hydrogeomorphic mapping methodology assumes that river valley morphology intrinsically includes information on flood-driven erosion and depositional phenomena. We use a digital elevation model-based algorithm to identify the floodplain as the area of the fluvial corridor lying below water reference levels, which are estimated using a simplified hydrologic model. Results from our hydrogeomorphic method are compared to hydraulically-derived flood zone maps and spatial datasets of levee-protected areas to explore where water management features, such as levees, have changed floodplain dynamics and landscape features. Parameters associated with commonly used F-index functions are quantified and analyzed to better understand how floodplain areas have been reduced within a basin. Preliminary results indicate that the hydrogeomorphic floodplain model is useful for quickly delineating floodplains at large watershed scales, but further analyses are needed to understand the caveats for using the model in determining floodplain loss due to levees. 
We plan to continue this work by exploring the spatial dependencies of the F-index function. Results from this work have implications for loss of aquatic habitat and ecological functions, and can inform management and restoration activities by highlighting regions with significant floodplain loss.
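The core idea of the DEM-based delineation in this abstract, marking cells that lie below a water reference level and comparing the result with a hydraulic flood map, can be sketched in a few lines. This is a strong simplification: it applies a per-cell threshold without checking hydrologic connectivity to the channel, it does not implement the F-index, and all names and values are hypothetical.

```python
def floodplain_mask(dem, water_level):
    """Mark cells whose elevation is at or below the local water reference level."""
    return [
        [z <= w for z, w in zip(dem_row, level_row)]
        for dem_row, level_row in zip(dem, water_level)
    ]


def fraction_outside(geomorphic, hydraulic):
    """Share of geomorphic floodplain cells absent from the hydraulic flood map."""
    total = missing = 0
    for geo_row, hyd_row in zip(geomorphic, hydraulic):
        for g, h in zip(geo_row, hyd_row):
            if g:
                total += 1
                if not h:
                    missing += 1
    return missing / total if total else 0.0


dem = [[1.0, 2.0, 5.0], [1.5, 3.0, 6.0]]
water_level = [[3.0, 3.0, 3.0], [3.0, 3.0, 3.0]]
geomorphic = floodplain_mask(dem, water_level)
hydraulic = [[True, False, False], [True, True, False]]
lost = fraction_outside(geomorphic, hydraulic)
```

Cells that the terrain alone would place in the floodplain but that a hydraulic model keeps dry are candidates for areas disconnected by levees or other works, which is the comparison the abstract describes.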
Markedly divergent estimates of Amazon forest carbon density from ground plots and satellites.

PubMed

Mitchard, Edward T A; Feldpausch, Ted R; Brienen, Roel J W; Lopez-Gonzalez, Gabriela; Monteagudo, Abel; Baker, Timothy R; Lewis, Simon L; Lloyd, Jon; Quesada, Carlos A; Gloor, Manuel; Ter Steege, Hans; Meir, Patrick; Alvarez, Esteban; Araujo-Murakami, Alejandro; Aragão, Luiz E O C; Arroyo, Luzmila; Aymard, Gerardo; Banki, Olaf; Bonal, Damien; Brown, Sandra; Brown, Foster I; Cerón, Carlos E; Chama Moscoso, Victor; Chave, Jerome; Comiskey, James A; Cornejo, Fernando; Corrales Medina, Massiel; Da Costa, Lola; Costa, Flavia R C; Di Fiore, Anthony; Domingues, Tomas F; Erwin, Terry L; Frederickson, Todd; Higuchi, Niro; Honorio Coronado, Euridice N; Killeen, Tim J; Laurance, William F; Levis, Carolina; Magnusson, William E; Marimon, Beatriz S; Marimon Junior, Ben Hur; Mendoza Polo, Irina; Mishra, Piyush; Nascimento, Marcelo T; Neill, David; Núñez Vargas, Mario P; Palacios, Walter A; Parada, Alexander; Pardo Molina, Guido; Peña-Claros, Marielos; Pitman, Nigel; Peres, Carlos A; Poorter, Lourens; Prieto, Adriana; Ramirez-Angulo, Hirma; Restrepo Correa, Zorayda; Roopsind, Anand; Roucoux, Katherine H; Rudas, Agustin; Salomão, Rafael P; Schietti, Juliana; Silveira, Marcos; de Souza, Priscila F; Steininger, Marc K; Stropp, Juliana; Terborgh, John; Thomas, Raquel; Toledo, Marisol; Torres-Lezama, Armando; van Andel, Tinde R; van der Heijden, Geertje M F; Vieira, Ima C G; Vieira, Simone; Vilanova-Torre, Emilio; Vos, Vincent A; Wang, Ophelia; Zartman, Charles E; Malhi, Yadvinder; Phillips, Oliver L

2014-08-01

The accurate mapping of forest carbon stocks is essential for understanding the global carbon cycle, for assessing emissions from deforestation, and for rational land-use planning. Remote sensing (RS) is currently the key tool for this purpose, but RS does not estimate vegetation biomass directly, and thus may miss significant spatial variations in forest structure. We test the stated accuracy of pantropical carbon maps using a large independent field dataset. Tropical forests of the Amazon basin. The permanent archive of the field plot data can be accessed at: http://dx.doi.org/10.5521/FORESTPLOTS.NET/2014_1. Two recent pantropical RS maps of vegetation carbon are compared to a unique ground-plot dataset, involving tree measurements in 413 large inventory plots located in nine countries. The RS maps were compared directly to field plots, and kriging of the field data was used to allow area-based comparisons. The two RS carbon maps fail to capture the main gradient in Amazon forest carbon detected using 413 ground plots, from the densely wooded tall forests of the north-east, to the light-wooded, shorter forests of the south-west. The differences between plots and RS maps far exceed the uncertainties given in these studies, with whole regions over- or under-estimated by > 25%, whereas regional uncertainties for the maps were reported to be < 5%. Pantropical biomass maps are widely used by governments and by projects aiming to reduce deforestation using carbon offsets, but may have significant regional biases. Carbon-mapping techniques must be revised to account for the known ecological variation in tree wood density and allometry to create maps suitable for carbon accounting. The use of single relationships between tree canopy height and above-ground biomass inevitably yields large, spatially correlated errors. 
This presents a significant challenge to both the forest conservation and remote sensing communities, because neither wood density nor species assemblages can be reliably mapped from space.
Markedly divergent estimates of Amazon forest carbon density from ground plots and satellites

PubMed Central

Mitchard, Edward T A; Feldpausch, Ted R; Brienen, Roel J W; Lopez-Gonzalez, Gabriela; Monteagudo, Abel; Baker, Timothy R; Lewis, Simon L; Lloyd, Jon; Quesada, Carlos A; Gloor, Manuel; ter Steege, Hans; Meir, Patrick; Alvarez, Esteban; Araujo-Murakami, Alejandro; Aragão, Luiz E O C; Arroyo, Luzmila; Aymard, Gerardo; Banki, Olaf; Bonal, Damien; Brown, Sandra; Brown, Foster I; Cerón, Carlos E; Chama Moscoso, Victor; Chave, Jerome; Comiskey, James A; Cornejo, Fernando; Corrales Medina, Massiel; Da Costa, Lola; Costa, Flavia R C; Di Fiore, Anthony; Domingues, Tomas F; Erwin, Terry L; Frederickson, Todd; Higuchi, Niro; Honorio Coronado, Euridice N; Killeen, Tim J; Laurance, William F; Levis, Carolina; Magnusson, William E; Marimon, Beatriz S; Marimon Junior, Ben Hur; Mendoza Polo, Irina; Mishra, Piyush; Nascimento, Marcelo T; Neill, David; Núñez Vargas, Mario P; Palacios, Walter A; Parada, Alexander; Pardo Molina, Guido; Peña-Claros, Marielos; Pitman, Nigel; Peres, Carlos A; Poorter, Lourens; Prieto, Adriana; Ramirez-Angulo, Hirma; Restrepo Correa, Zorayda; Roopsind, Anand; Roucoux, Katherine H; Rudas, Agustin; Salomão, Rafael P; Schietti, Juliana; Silveira, Marcos; de Souza, Priscila F; Steininger, Marc K; Stropp, Juliana; Terborgh, John; Thomas, Raquel; Toledo, Marisol; Torres-Lezama, Armando; van Andel, Tinde R; van der Heijden, Geertje M F; Vieira, Ima C G; Vieira, Simone; Vilanova-Torre, Emilio; Vos, Vincent A; Wang, Ophelia; Zartman, Charles E; Malhi, Yadvinder; Phillips, Oliver L

2014-01-01

Aim: The accurate mapping of forest carbon stocks is essential for understanding the global carbon cycle, for assessing emissions from deforestation, and for rational land-use planning. Remote sensing (RS) is currently the key tool for this purpose, but RS does not estimate vegetation biomass directly, and thus may miss significant spatial variations in forest structure. We test the stated accuracy of pantropical carbon maps using a large independent field dataset. Location: Tropical forests of the Amazon basin. The permanent archive of the field plot data can be accessed at: http://dx.doi.org/10.5521/FORESTPLOTS.NET/2014_1 Methods: Two recent pantropical RS maps of vegetation carbon are compared to a unique ground-plot dataset, involving tree measurements in 413 large inventory plots located in nine countries. The RS maps were compared directly to field plots, and kriging of the field data was used to allow area-based comparisons. Results: The two RS carbon maps fail to capture the main gradient in Amazon forest carbon detected using 413 ground plots, from the densely wooded tall forests of the north-east, to the light-wooded, shorter forests of the south-west. The differences between plots and RS maps far exceed the uncertainties given in these studies, with whole regions over- or under-estimated by > 25%, whereas regional uncertainties for the maps were reported to be < 5%. Main conclusions: Pantropical biomass maps are widely used by governments and by projects aiming to reduce deforestation using carbon offsets, but may have significant regional biases. Carbon-mapping techniques must be revised to account for the known ecological variation in tree wood density and allometry to create maps suitable for carbon accounting. The use of single relationships between tree canopy height and above-ground biomass inevitably yields large, spatially correlated errors. 
This presents a significant challenge to both the forest conservation and remote sensing communities, because neither wood density nor species assemblages can be reliably mapped from space. PMID:26430387
NLCD - MODIS albedo data

EPA Pesticide Factsheets

The NLCD-MODIS land cover-albedo database integrates high-quality MODIS albedo observations with areas of homogeneous land cover from NLCD. The spatial resolution (pixel size) of the database is 480 m × 480 m, aligned to the standardized USGS Albers Equal-Area projection. The spatial extent of the database is the continental United States. This dataset is associated with the following publication: Wickham, J., C.A. Barnes, and T. Wade. Combining NLCD and MODIS to Create a Land Cover-Albedo Dataset for the Continental United States. REMOTE SENSING OF ENVIRONMENT. Elsevier Science Ltd, New York, NY, USA, 170(0): 143-153, (2015).
Dimensions of biodiversity in the Earth mycobiome.

PubMed

Peay, Kabir G; Kennedy, Peter G; Talbot, Jennifer M

2016-07-01

Fungi represent a large proportion of the genetic diversity on Earth and fungal activity influences the structure of plant and animal communities, as well as rates of ecosystem processes. Large-scale DNA-sequencing datasets are beginning to reveal the dimensions of fungal biodiversity, which seem to be fundamentally different to bacteria, plants and animals. In this Review, we describe the patterns of fungal biodiversity that have been revealed by molecular-based studies. Furthermore, we consider the evidence that supports the roles of different candidate drivers of fungal diversity at a range of spatial scales, as well as the role of dispersal limitation in maintaining regional endemism and influencing local community assembly. Finally, we discuss the ecological mechanisms that are likely to be responsible for the high heterogeneity that is observed in fungal communities at local scales.
Compression of Born ratio for fluorescence molecular tomography/x-ray computed tomography hybrid imaging: methodology and in vivo validation.

PubMed

Mohajerani, Pouyan; Ntziachristos, Vasilis

2013-07-01

The 360° rotation geometry of the hybrid fluorescence molecular tomography/x-ray computed tomography modality allows for acquisition of very large datasets, which pose numerical limitations on the reconstruction. We propose a compression method that takes advantage of the correlation of the Born-normalized signal among sources in spatially formed clusters to reduce the size of the system model. The proposed method has been validated using an ex vivo study and an in vivo study of a nude mouse with a subcutaneous 4T1 tumor, with and without inclusion of a priori anatomical information. Compression rates of up to two orders of magnitude with minimal distortion of the reconstruction have been demonstrated, resulting in a large reduction in weight matrix size and reconstruction time.
Advances in Multi-Sensor Scanning and Visualization of Complex Plants: the Utmost Case of a Reactor Building

NASA Astrophysics Data System (ADS)

Hullo, J.-F.; Thibault, G.; Boucheny, C.

2015-02-01

In a context of increased maintenance operations and generational renewal of the workforce, a nuclear owner and operator like Électricité de France (EDF) is interested in scaling up the tools and methods of "as-built virtual reality" for larger buildings and wider audiences. However, acquiring and sharing as-built data on a large scale (large and complex multi-floored buildings) challenges current scientific and technical capacities. In this paper, we first present the state of the art of scanning tools and methods for industrial plants with very complex architecture. Then, we introduce the specific characteristics of the multi-sensor scanning and visualization of the interior of the most complex building of a power plant: a nuclear reactor building. We introduce several developments that made possible a first complete survey of such a large building, from acquisition to processing and fusion of multiple data sources (3D laser scans, total-station survey, RGB panoramic, 2D floor plans, 3D CAD as-built models). In addition, we present the concepts of a smart application developed for the painless exploration of the whole dataset. The goal of this application is to help professionals unfamiliar with the manipulation of such datasets to take into account the spatial constraints induced by the building complexity while preparing maintenance operations. Finally, we discuss the main feedback from this large experiment, the remaining issues for the generalization of such large-scale surveys, and the future technical and scientific challenges in the field of industrial "virtual reality".
P-Hint-Hunt: a deep parallelized whole genome DNA methylation detection tool.

PubMed

Peng, Shaoliang; Yang, Shunyun; Gao, Ming; Liao, Xiangke; Liu, Jie; Yang, Canqun; Wu, Chengkun; Yu, Wenqiang

2017-03-14

A growing number of studies use whole-genome DNA methylation detection, one of the most important parts of epigenetics research, to find significant relationships between DNA methylation and several common diseases, such as cancers and diabetes. In many of those studies, mapping bisulfite-treated sequences to the whole genome has been the main method for studying DNA cytosine methylation. However, most of today's related tools suffer from inaccuracy and long run times. In our study, we designed a new DNA methylation prediction tool ("Hint-Hunt") to solve this problem. Using an optimal complex alignment computation and Smith-Waterman matrix dynamic programming, Hint-Hunt can analyze and predict DNA methylation status. However, when Hint-Hunt is applied to large-scale datasets, it still suffers from slow speed and low temporal-spatial efficiency. To address the cost of Smith-Waterman dynamic programming and the low temporal-spatial efficiency, we further designed a deeply parallelized whole-genome DNA methylation detection tool ("P-Hint-Hunt") on the Tianhe-2 (TH-2) supercomputer. To the best of our knowledge, P-Hint-Hunt is the first parallel DNA methylation detection tool with a high speed-up for processing large-scale datasets, and it can run on both CPUs and Intel Xeon Phi coprocessors. Moreover, we deploy and evaluate Hint-Hunt and P-Hint-Hunt on the TH-2 supercomputer at different scales. The experimental results show that our tools eliminate the deviation caused by bisulfite treatment in the mapping procedure and that the multi-level parallel program yields a 48-fold speed-up with 64 threads. P-Hint-Hunt achieves substantial acceleration on the CPU and Intel Xeon Phi heterogeneous platform, taking full advantage of both multi-core (CPU) and many-core (Phi) processors.
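For readers unfamiliar with the Smith-Waterman dynamic programming mentioned in this abstract, the following is a minimal sketch of the textbook local-alignment recurrence. It is not Hint-Hunt's implementation: the scoring values (`match`, `mismatch`, `gap`) and the linear gap penalty are illustrative assumptions, and no bisulfite-specific handling is included.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    Scoring parameters are hypothetical defaults chosen for illustration.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap      # gap in b
            left = H[i][j - 1] + gap    # gap in a
            # The floor at 0 is what makes the alignment local rather than global.
            H[i][j] = max(0, diag, up, left)
            best = max(best, H[i][j])
    return best
```

The table has (len(a) + 1) x (len(b) + 1) cells, so time and memory grow with the product of the sequence lengths, which is consistent with the speed and efficiency problems the abstract reports on large-scale datasets.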
Evaluation of CLM4 Solar Radiation Partitioning Scheme Using Remote Sensing and Site Level FPAR Datasets

DOE PAGES

Wang, Kai; Mao, Jiafu; Dickinson, Robert; ...

2013-06-05

This paper examines a land surface solar radiation partitioning scheme, i.e., that of the Community Land Model version 4 (CLM4) with coupled carbon and nitrogen cycles. Taking advantage of a unique 30-year fraction of absorbed photosynthetically active radiation (FPAR) dataset derived from the Global Inventory Modeling and Mapping Studies (GIMMS) normalized difference vegetation index (NDVI) data set, multiple other remote sensing datasets, and site level observations, we evaluated the CLM4 FPAR's seasonal cycle, diurnal cycle, long-term trends and spatial patterns. These findings show that the model generally agrees with observations in the seasonal cycle, long-term trends, and spatial patterns, but does not reproduce the diurnal cycle. Discrepancies also exist in seasonality magnitudes, peak value months, and spatial heterogeneity. Here, we identify the discrepancy in the diurnal cycle as due to the absence of dependence on sun angle in the model. Implementation of sun angle dependence in a one-dimensional (1-D) model is proposed. The need for better relating of vegetation to climate in the model, indicated by long-term trends, is also noted. Evaluation of the CLM4 land surface solar radiation partitioning scheme using remote sensing and site level FPAR datasets provides targets for future development in its representation of this naturally complicated process.
The 3D Reference Earth Model: Status and Preliminary Results

NASA Astrophysics Data System (ADS)

Moulik, P.; Lekic, V.; Romanowicz, B. A.

2017-12-01

In the 20th century, seismologists constructed models of how average physical properties (e.g. density, rigidity, compressibility, anisotropy) vary with depth in the Earth's interior. These one-dimensional (1D) reference Earth models (e.g. PREM) have proven indispensable in earthquake location, imaging of interior structure, understanding material properties under extreme conditions, and as a reference in other fields, such as particle physics and astronomy. Over the past three decades, new datasets motivated more sophisticated efforts that yielded models of how properties vary both laterally and with depth in the Earth's interior. Though these three-dimensional (3D) models exhibit compelling similarities at large scales, differences in the methodology, representation of structure, and dataset upon which they are based have prevented the creation of 3D community reference models. As part of the REM-3D project, we are compiling and reconciling reference seismic datasets of body wave travel-time measurements, fundamental mode and overtone surface wave dispersion measurements, and normal mode frequencies and splitting functions. These reference datasets are being inverted for a long-wavelength, 3D reference Earth model that describes the robust long-wavelength features of mantle heterogeneity. As a community reference model with fully quantified uncertainties and tradeoffs and an associated publicly available dataset, REM-3D will facilitate Earth imaging studies, earthquake characterization, inferences on temperature and composition in the deep interior, and be of improved utility to emerging scientific endeavors, such as neutrino geoscience. Here, we summarize progress made in the construction of the reference long period dataset and present a preliminary version of REM-3D in the upper mantle.
In order to determine the level of detail warranted for inclusion in REM-3D, we analyze the spectrum of discrepancies between models inverted with different subsets of the reference dataset. This procedure allows us to evaluate the extent of consistency in imaging heterogeneity at various depths and between spatial scales.
Comparing soil moisture anomalies from multiple independent sources over different regions across the globe

NASA Astrophysics Data System (ADS)

Cammalleri, Carmelo; Vogt, Jürgen V.; Bisselink, Bernard; de Roo, Ad

2017-12-01

Agricultural drought events can affect large regions across the world, implying the need for a suitable global tool for accurate monitoring of this phenomenon. Soil moisture anomalies are considered a good metric to capture the occurrence of agricultural drought events, and they have become an important component of several operational drought monitoring systems. In the framework of the JRC Global Drought Observatory (GDO, http://edo.jrc.ec.europa.eu/gdo/), the suitability of three datasets as possible representations of root zone soil moisture anomalies has been evaluated: (1) the soil moisture from the Lisflood distributed hydrological model (namely LIS), (2) the remotely sensed Land Surface Temperature data from the MODIS satellite (namely LST), and (3) the ESA Climate Change Initiative combined passive/active microwave skin soil moisture dataset (namely CCI). Because these three datasets are independent, the triple collocation (TC) technique has been applied, aiming at quantifying the likely error associated with each dataset in comparison to the unknown true status of the system. TC analysis was performed on five macro-regions (namely North America, Europe, India, southern Africa and Australia) identified as suitable for the experiment, providing insight into the mutual relationship between these datasets as well as an assessment of the accuracy of each method. Even if no definitive statement on the spatial distribution of errors can be provided, a clear outcome of the TC analysis is the good performance of the remote sensing datasets, especially CCI, over dry regions such as Australia and southern Africa, whereas the outputs of LIS seem to be more reliable over areas that are well monitored through meteorological ground station networks, such as North America and Europe. In a global drought monitoring system, the results of the error analysis are used to design a weighted-average ensemble system that exploits the advantages of each dataset.
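To illustrate how triple collocation can estimate the error of each dataset without knowing the truth, the sketch below uses the commonly cited covariance formulation. It assumes each dataset is linearly related to the unknown truth and that the three error series are zero-mean, mutually uncorrelated, and uncorrelated with the truth. The function names are hypothetical, and the abstract does not state which TC variant the authors applied.

```python
def _cov(a, b):
    """Population covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / n


def triple_collocation_error_variances(x, y, z):
    """Estimate the random-error variance of x, y and z (covariance notation).

    Assumes independent errors; returns a tuple (var_ex, var_ey, var_ez),
    each expressed in the units of its own dataset.
    """
    cxx, cyy, czz = _cov(x, x), _cov(y, y), _cov(z, z)
    cxy, cxz, cyz = _cov(x, y), _cov(x, z), _cov(y, z)
    # Each product/ratio of cross-covariances isolates the signal variance
    # seen by one dataset; the remainder of its total variance is error.
    var_ex = cxx - cxy * cxz / cyz
    var_ey = cyy - cxy * cyz / cxz
    var_ez = czz - cxz * cyz / cxy
    return var_ex, var_ey, var_ez
```

With finite samples the estimates can come out slightly negative when the true error is small, so in practice they are usually computed over long time series and inspected before being used, for example as inverse-variance weights in an ensemble.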
A comparison of U.S. geological survey seamless elevation models with shuttle radar topography mission data

USGS Publications Warehouse

Gesch, D.; Williams, J.; Miller, W.

2001-01-01

Elevation models produced from Shuttle Radar Topography Mission (SRTM) data will be the most comprehensive, consistently processed, highest resolution topographic dataset ever produced for the Earth's land surface. Many applications that currently use elevation data will benefit from the increased availability of data with higher accuracy, quality, and resolution, especially in poorly mapped areas of the globe. SRTM data will be produced as seamless data, thereby avoiding many of the problems inherent in existing multi-source topographic databases. Serving as precursors to SRTM datasets, the U.S. Geological Survey (USGS) has produced and is distributing seamless elevation datasets that facilitate scientific use of elevation data over large areas. GTOPO30 is a global elevation model with a 30 arc-second resolution (approximately 1-kilometer). The National Elevation Dataset (NED) covers the United States at a resolution of 1 arc-second (approximately 30-meters). Due to their seamless format and broad area coverage, both GTOPO30 and NED represent an advance in the usability of elevation data, but each still includes artifacts from the highly variable source data used to produce them. The consistent source data and processing approach for SRTM data will result in elevation products that will be a significant addition to the current availability of seamless datasets, specifically for many areas outside the U.S. One application that demonstrates some advantages that may be realized with SRTM data is delineation of land surface drainage features (watersheds and stream channels). Seamless distribution of elevation data in which a user interactively specifies the area of interest and order parameters via a map server is already being successfully demonstrated with existing USGS datasets. Such an approach for distributing SRTM data is ideal for a dataset that undoubtedly will be of very high interest to the spatial data user community.
Spectral-spatial hyperspectral image classification using super-pixel-based spatial pyramid representation

NASA Astrophysics Data System (ADS)

Fan, Jiayuan; Tan, Hui Li; Toomik, Maria; Lu, Shijian

2016-10-01

Spatial pyramid matching has demonstrated its power for image recognition tasks by pooling features from spatially increasingly fine sub-regions. Motivated by the concept of feature pooling at multiple pyramid levels, we propose a novel spectral-spatial hyperspectral image classification approach using superpixel-based spatial pyramid representation. This technique first generates multiple superpixel maps by decreasing the superpixel number gradually along with the increased spatial regions for labelled samples. By using every superpixel map, sparse representation of pixels within every spatial region is then computed through local max pooling. Finally, features learned from training samples are aggregated and trained by a support vector machine (SVM) classifier. The proposed spectral-spatial hyperspectral image classification technique has been evaluated on two public hyperspectral datasets, including the Indian Pines image containing 16 different agricultural scene categories with a 20 m resolution acquired by AVIRIS and the University of Pavia image containing 9 land-use categories with a 1.3 m spatial resolution acquired by the ROSIS-03 sensor. Experimental results show significantly improved performance compared with state-of-the-art works. The major contributions of this proposed technique include (1) a new spectral-spatial classification approach to generate feature representations for hyperspectral images, (2) a complementary yet effective feature pooling approach, i.e. the superpixel-based spatial pyramid representation that is used for the spatial correlation study, and (3) evaluation on two public hyperspectral image datasets with superior image classification performance.

Constructing a Teleseismic Tomographic Image of Taiwan using BATS Recordings

NASA Astrophysics Data System (ADS)

Krajewski, J.; Roecker, S.

2005-12-01

Taiwan is an evolving arc-continent collision located at a complicated part of the plate boundary between the Eurasian and Philippine Sea plates. To better understand the role of the upper mantle in the dynamics of this collision, we reviewed 4 years of data from the Broadband Array in Taiwan for Seismology (BATS) to construct a teleseismic dataset for tomographic imaging of the subsurface of the island. From an initial selection of approximately 300 events, we used waveform correlation to generate a dataset of 4500 relative arrival times. To calculate accurate travel times in three-dimensional wavespeed models over the large lateral distances in our model (~800 km), we solve the eikonal equation directly in a spherical coordinate system. To reduce the influence of smearing of crustal heterogeneity into the deeper mantle, we fix the upper 30 km to a previously determined P wavespeed model for the region. Initial resolution tests suggest a spatial limit on the order of 40 km.
Global assessment of human losses due to earthquakes

USGS Publications Warehouse

Silva, Vitor; Jaiswal, Kishor; Weatherill, Graeme; Crowley, Helen

2014-01-01

Current studies have demonstrated a sharp increase in human losses due to earthquakes. These alarming levels of casualties suggest the need for large-scale investment in seismic risk mitigation, which, in turn, requires an adequate understanding of the extent of the losses, and location of the most affected regions. Recent developments in global and uniform datasets such as instrumental and historical earthquake catalogues, population spatial distribution and country-based vulnerability functions, have opened an unprecedented possibility for a reliable assessment of earthquake consequences at a global scale. In this study, a uniform probabilistic seismic hazard assessment (PSHA) model was employed to derive a set of global seismic hazard curves, using the open-source software OpenQuake for seismic hazard and risk analysis. These results were combined with a collection of empirical fatality vulnerability functions and a population dataset to calculate average annual human losses at the country level. The results from this study highlight the regions/countries in the world with a higher seismic risk, and thus where risk reduction measures should be prioritized.
Color imaging of Mars by the High Resolution Imaging Science Experiment (HiRISE)

USGS Publications Warehouse

Delamere, W.A.; Tornabene, L.L.; McEwen, A.S.; Becker, K.; Bergstrom, J.W.; Bridges, N.T.; Eliason, E.M.; Gallagher, D.; Herkenhoff, K. E.; Keszthelyi, L.; Mattson, S.; McArthur, G.K.; Mellon, M.T.; Milazzo, M.; Russell, P.S.; Thomas, N.

2010-01-01

HiRISE has been producing a large number of scientifically useful color products of Mars and other planetary objects. The three broad spectral bands, coupled with the highly sensitive 14 bit detectors and time delay integration, enable detection of subtle color differences. The very high spatial resolution of HiRISE can augment the mineralogic interpretations based on multispectral (THEMIS) and hyperspectral datasets (TES, OMEGA and CRISM) and thereby enable detailed geologic and stratigraphic interpretations at meter scales. In addition to providing some examples of color images and their interpretation, we describe the processing techniques used to produce them and note some of the minor artifacts in the output. We also provide an example of how HiRISE color products can be effectively used to expand mineral and lithologic mapping provided by CRISM data products that are backed by other spectral datasets. The utility of high quality color data for understanding geologic processes on Mars has been one of the major successes of HiRISE. © 2009 Elsevier Inc.
Marine debris accumulation in the Northwestern Hawaiian Islands: an examination of rates and processes.

PubMed

Dameron, Oliver J; Parke, Michael; Albins, Mark A; Brainard, Russell

2007-04-01

Large amounts of derelict fishing gear accumulate and cause damage to shallow coral reefs of the Northwestern Hawaiian Islands (NWHI). To facilitate maintenance of reefs cleaned during 1996-2005 removal efforts, we identify likely high-density debris areas by assessing reef characteristics (depth, benthic habitat type, and energy regime) that influence sub-regional debris accumulation. Previously cleaned backreef and lagoonal reefs at two NWHI locations were resurveyed for accumulated debris using two survey methods. Accumulated debris densities and weights were found to be greater in lagoonal reef areas. Sample weight-based debris densities are extrapolated to similar habitats throughout the NWHI using a spatial 'net habitat' dataset created by generalizing IKONOS satellite derivatives for depth and habitat classification. Prediction accuracy for this dataset is tested using historical debris point data. Annual NWHI debris accumulation is estimated to be 52.0 metric tonnes. For planning purposes, individual NWHI atolls/reefs are allotted a proportion of this total.
Spatial heterogeneity of leaf area index across scales from simulation and remote sensing

NASA Astrophysics Data System (ADS)

Reichenau, Tim G.; Korres, Wolfgang; Montzka, Carsten; Schneider, Karl

2016-04-01

Leaf area index (LAI, single-sided leaf area per ground area) influences mass and energy exchange of vegetated surfaces. Therefore LAI is an input variable for many land surface schemes of coupled large-scale models, which do not simulate LAI. Since these models typically run on rather coarse resolution grids, LAI is often inferred from coarse resolution remote sensing. However, especially in agriculturally used areas, a grid cell of these products often covers more than a single land-use. In that case, the given LAI does not apply to any single land-use. Therefore, the overall spatial heterogeneity in these datasets differs from that at resolutions high enough to distinguish areas with differing land-use. Detailed process-based plant growth models simulate LAI for separate plant functional types or specific species. However, limited availability of observations causes reduced spatial heterogeneity of model input data (soil, weather, land-use). Since LAI is strongly heterogeneous in space and time and since processes depend on LAI in a nonlinear way, a correct representation of LAI spatial heterogeneity is also desirable at coarse resolutions. The current study assesses this issue by comparing the spatial heterogeneity of LAI from remote sensing (RapidEye) and process-based simulations (DANUBIA simulation system) across scales. Spatial heterogeneity is assessed by analyzing LAI frequency distributions (spatial variability) and semivariograms (spatial structure). The test case is the arable land in the fertile loess plain of the Rur catchment near the Germany-Netherlands border.
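As a rough illustration of the semivariogram analysis this abstract refers to, the sketch below computes the classical empirical semivariogram, i.e. half the mean squared difference between value pairs grouped by separation distance. The function name, the fixed-width distance bins, and the brute-force pair loop are assumptions made for illustration; the abstract does not describe the authors' estimator or binning.

```python
import math
from collections import defaultdict


def empirical_semivariogram(coords, values, bin_width):
    """Return {bin_index: semivariance} for point data.

    Bin k collects all pairs whose separation d satisfies
    k * bin_width <= d < (k + 1) * bin_width.
    """
    sq_diff_sums = defaultdict(float)
    pair_counts = defaultdict(int)
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])  # Euclidean distance
            k = int(d // bin_width)
            sq_diff_sums[k] += (values[i] - values[j]) ** 2
            pair_counts[k] += 1
    # gamma(h) = sum of squared differences / (2 * number of pairs)
    return {k: sq_diff_sums[k] / (2 * pair_counts[k]) for k in sorted(pair_counts)}
```

The pair loop is quadratic in the number of points, so for full raster scenes one would normally subsample pixels or restrict the maximum lag before applying it.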
Comparison of Different Machine Learning Algorithms for Lithological Mapping Using Remote Sensing Data and Morphological Features: A Case Study in Kurdistan Region, NE Iraq

NASA Astrophysics Data System (ADS)

Othman, Arsalan; Gloaguen, Richard

2015-04-01

Topographic effects and complex vegetation cover hinder lithology classification in mountain regions, both in the field and in reflectance remote sensing data. The area of interest, "Bardi-Zard", is located in NE Iraq. It is part of the Zagros orogenic belt, where seven lithological units crop out, and is known for its chromite deposit. The aim of this study is to compare three machine learning algorithms (MLAs), Maximum Likelihood (ML), Support Vector Machines (SVM), and Random Forest (RF), in a supervised lithology classification task using Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) satellite data and its derivatives, spatial information (spatial coordinates), and geomorphic data. We emphasize the enhancement in remote sensing lithological mapping accuracy that arises from the integration of geomorphic features and spatial information (spatial coordinates) in classifications. This study finds that RF outperforms the ML and SVM algorithms in almost all of the sixteen dataset combinations tested. For the best dataset combination, the RF map for all seven classes reaches an overall accuracy of ~80%, with producer's and user's accuracies of ~73.91% and 76.09%, respectively, and a kappa coefficient of ~0.76. TPI is more effective with the SVM algorithm than with the RF algorithm. This paper demonstrates that adding geomorphic indices such as TPI and spatial information to the dataset increases the lithological classification accuracy.
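The overall accuracy and kappa coefficient quoted in this abstract are standard summaries of a classification confusion matrix. The sketch below shows how they are usually computed, assuming rows are reference classes and columns are predicted classes. The function name is hypothetical, and the authors' confusion matrices are not given in the abstract, so no values from the study are reproduced here.

```python
def accuracy_and_kappa(confusion):
    """Return (overall_accuracy, cohen_kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples of reference class i
    that were assigned to class j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa
```

Kappa discounts the agreement that would occur by chance, which is why it is typically lower than the overall accuracy reported alongside it.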
A Comparison of Traditional, Step-Path, and Geostatistical Techniques in the Stability Analysis of a Large Open Pit

NASA Astrophysics Data System (ADS)

Mayer, J. M.; Stead, D.

2017-04-01

With the increased drive towards deeper and more complex mine designs, geotechnical engineers are often forced to reconsider traditional deterministic design techniques in favour of probabilistic methods. These alternative techniques allow for the direct quantification of uncertainties within a risk and/or decision analysis framework. However, conventional probabilistic practices typically discretize geological materials into discrete, homogeneous domains, with attributes defined by spatially constant random variables, despite the fact that geological media display inherent heterogeneous spatial characteristics. This research directly simulates this phenomenon using a geostatistical approach, known as sequential Gaussian simulation. The method utilizes the variogram which imposes a degree of controlled spatial heterogeneity on the system. Simulations are constrained using data from the Ok Tedi mine site in Papua New Guinea and designed to randomly vary the geological strength index and uniaxial compressive strength using Monte Carlo techniques. Results suggest that conventional probabilistic techniques have a fundamental limitation compared to geostatistical approaches, as they fail to account for the spatial dependencies inherent to geotechnical datasets. This can result in erroneous model predictions, which are overly conservative when compared to the geostatistical results.
Scale dependence of the diversity-stability relationship in a temperate grassland.

PubMed

Zhang, Yunhai; He, Nianpeng; Loreau, Michel; Pan, Qingmin; Han, Xingguo

2018-05-01

A positive relationship between biodiversity and ecosystem stability has been reported in many ecosystems; however, it has yet to be determined whether and how spatial scale affects this relationship. Here, for the first time, we assessed the effects of alpha, beta and gamma diversity on ecosystem stability and the scale dependence of the slope of the diversity-stability relationship. By employing a long-term (33 years) dataset from a temperate grassland in northern China, we calculated all possible spatial scales from the complete combinations of the basic 1-m² plots. Species richness was positively associated with ecosystem stability through species asynchrony and overyielding at all spatial scales (1, 2, 3, 4 and 5 m²). Both alpha and beta diversity were positively associated with gamma stability. Moreover, the slope of the diversity-area relationship was significantly higher than that of the stability-area relationship, resulting in a decline of the slope of the diversity-stability relationship with increasing area. Synthesis. With the positive species diversity effect on ecosystem stability from small to large spatial scales, our findings demonstrate the need to maintain a high biodiversity and biotic heterogeneity as insurance against the risks incurred by ecosystems in the face of global environmental changes.
Implementations of geographically weighted lasso in spatial data with multicollinearity (Case study: Poverty modeling of Java Island)

NASA Astrophysics Data System (ADS)

Setiyorini, Anis; Suprijadi, Jadi; Handoko, Budhi

2017-03-01

Geographically Weighted Regression (GWR) is a regression model that takes into account the effect of spatial heterogeneity. In applications of GWR, inference on regression coefficients is often of interest, as are estimation and prediction of the response variable. Empirical studies have demonstrated that local correlation between explanatory variables can lead to estimated regression coefficients in GWR that are strongly correlated, a condition named multicollinearity. This in turn results in large standard errors on the estimated regression coefficients and hence is problematic for inference on relationships between variables. Geographically Weighted Lasso (GWL) is a method capable of dealing with spatial heterogeneity and local multicollinearity in spatial data sets. GWL is a further development of the GWR method that adds a LASSO (Least Absolute Shrinkage and Selection Operator) constraint in parameter estimation. In this study, GWL is applied using a fixed exponential kernel weights matrix to build a poverty model of Java Island, Indonesia. The results of applying GWL to poverty datasets show that this method stabilizes regression coefficients in the presence of multicollinearity and produces lower prediction and estimation error of the response variable than GWR does.
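The "fixed exponential kernel weights matrix" mentioned here can be sketched as follows. One form of the exponential kernel commonly used in the GWR literature is w_ij = exp(-d_ij / b), where d_ij is the distance between the regression point i and observation j and b is a bandwidth that is the same at every location (hence "fixed"). The abstract gives neither the exact kernel formula nor the bandwidth, so both the formula and the function name below are assumptions.

```python
import math


def exponential_kernel_weights(distances, bandwidth):
    """Return GWR-style weights w = exp(-d / bandwidth) for a list of distances.

    `bandwidth` must be positive; the same value is used at every
    regression point, which is what makes the kernel "fixed".
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    return [math.exp(-d / bandwidth) for d in distances]
```

Each regression point gets its own weight vector built from its distances to all observations; observations at distance 0 receive weight 1, and the weights decay smoothly toward 0 as distance grows.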
HTM Spatial Pooler With Memristor Crossbar Circuits for Sparse Biometric Recognition.

PubMed

James, Alex Pappachen; Fedorova, Irina; Ibrayev, Timur; Kudithipudi, Dhireesha

2017-06-01

Hierarchical Temporal Memory (HTM) is an online machine learning algorithm that emulates the neo-cortex. The development of a scalable on-chip HTM architecture is an open research area. The two core substructures of HTM are the spatial pooler and temporal memory. In this work, we propose a new Spatial Pooler circuit design with parallel memristive crossbar arrays for the 2D columns. The proposed design was validated on two different benchmark tasks, face recognition and speech recognition. The circuits are simulated and analyzed using a practical memristor device model and a 0.18 μm IBM CMOS technology model. The AR, YALE, ORL, and UFI databases are used to test the performance of the design in face recognition. The TIMIT dataset is used for speech recognition.
An Effective Methodology for Processing and Analyzing Large, Complex Spacecraft Data Streams

ERIC Educational Resources Information Center

Teymourlouei, Haydar

2013-01-01

The emerging large datasets have made efficient data processing a much more difficult task for the traditional methodologies. Invariably, datasets continue to increase rapidly in size with time. The purpose of this research is to give an overview of some of the tools and techniques that can be utilized to manage and analyze large datasets. We…
Quantifying measurement uncertainty and spatial variability in the context of model evaluation

NASA Astrophysics Data System (ADS)

Choukulkar, A.; Brewer, A.; Pichugina, Y. L.; Bonin, T.; Banta, R. M.; Sandberg, S.; Weickmann, A. M.; Djalalova, I.; McCaffrey, K.; Bianco, L.; Wilczak, J. M.; Newman, J. F.; Draxl, C.; Lundquist, J. K.; Wharton, S.; Olson, J.; Kenyon, J.; Marquis, M.

2017-12-01

In an effort to improve wind forecasts for the wind energy sector, the Department of Energy and NOAA funded the second Wind Forecast Improvement Project (WFIP2). As part of the WFIP2 field campaign, a large suite of in-situ and remote sensing instrumentation was deployed to the Columbia River Gorge in Oregon and Washington from October 2015 to March 2017. The array of instrumentation deployed included 915-MHz wind profiling radars, sodars, wind-profiling lidars, and scanning lidars. The role of these instruments was to provide wind measurements at high spatial and temporal resolution for model evaluation and improvement of model physics. To properly determine model errors, the uncertainties in instrument-model comparisons need to be quantified accurately. These uncertainties arise from several factors such as measurement uncertainty, spatial variability, and interpolation of model output to instrument locations, to name a few. In this presentation, we will introduce a formalism to quantify measurement uncertainty and spatial variability. The accuracy of this formalism will be tested using existing datasets such as the eXperimental Planetary boundary layer Instrumentation Assessment (XPIA) campaign. Finally, the uncertainties in wind measurement and the spatial variability estimates from the WFIP2 field campaign will be discussed to understand the challenges involved in model evaluation.
[Comparison of GIMMS and MODIS normalized vegetation index composite data for Qing-Hai-Tibet Plateau].

PubMed

Du, Jia-Qiang; Shu, Jian-Min; Wang, Yue-Hui; Li, Ying-Chang; Zhang, Lin-Bo; Guo, Yang

2014-02-01

Consistent NDVI time series are a basic prerequisite for long-term monitoring of land surface properties. Advanced very high resolution radiometer (AVHRR) measurements provide the longest records of continuous global satellite measurements sensitive to live green vegetation, and the moderate resolution imaging spectroradiometer (MODIS) is a more recent, typical sensor with high spatial and temporal resolution. Understanding the relationship between the AVHRR-derived NDVI and MODIS NDVI is critical to continued long-term monitoring of ecological resources. NDVI time series acquired by the global inventory modeling and mapping studies (GIMMS) and Terra MODIS were compared over the same time periods from 2000 to 2006 at four scales of the Qinghai-Tibet Plateau (whole region, sub-region, biome and pixel) to assess the level of agreement in terms of absolute values and dynamic change by independently assessing the performance of GIMMS and MODIS NDVI and using 495 Landsat samples of 20 km x 20 km covering major land cover types. High correlations existed between the two datasets at the four scales, indicating their mostly equal capability of capturing seasonal and monthly phenological variations (mostly at the 0.001 significance level). Similarities of the two datasets differed significantly among different vegetation types. Relatively low correlation coefficients and large differences in NDVI value between the two datasets were found among dense vegetation types including broadleaf forest and needleleaf forest, yet the correlations were strong and the deviations were small in more homogeneous vegetation types, such as meadow, steppe and crop. 82% of the study area was characterized by strong consistency between GIMMS and MODIS NDVI at pixel scale. In the Landsat NDVI vs. GIMMS and MODIS NDVI comparison of absolute values, the MODIS NDVI performed slightly better than GIMMS NDVI, whereas in the comparison of temporal change values, the GIMMS data set performed best.
Similar to the comparison results for GIMMS and MODIS NDVI, the consistency across the three datasets was clearly different among various vegetation types. In dynamic changes, differences between Landsat and MODIS NDVI were smaller than those between Landsat NDVI and GIMMS NDVI for forest, but Landsat and GIMMS NDVI agreed better for grass and crop. The results suggested that spatial patterns and dynamic trends of GIMMS NDVI were in overall acceptable agreement with MODIS NDVI. It might be feasible to successfully integrate historical GIMMS and more recent MODIS NDVI to provide continuity of NDVI products. The accuracy of merging historical AVHRR data records with more modern MODIS NDVI data strongly depends on vegetation type, season and phenological period, and spatial scale. The integration of the two datasets for needleleaf forest, broadleaf forest, and for all vegetation types in the phenological transition periods in spring and autumn should be treated with caution.
GSHR-Tree: a spatial index tree based on dynamic spatial slot and hash table in grid environments

NASA Astrophysics Data System (ADS)

Chen, Zhanlong; Wu, Xin-cai; Wu, Liang

2008-12-01

Computational Grids enable the coordinated sharing of large-scale, distributed, heterogeneous computing resources that can be used to solve computationally intensive problems in science, engineering, and commerce. Grid spatial applications are made possible by high-speed networks and a new generation of Grid middleware that resides between networks and traditional GIS applications. The key problems to resolve in the development of Grid GIS are the integration of multi-source, heterogeneous spatial information, the management of distributed spatial resources, and the sharing of and cooperation among spatial data and Grid services. The spatial index mechanism is a key technology of Grid GIS and spatial databases, and its performance affects the overall performance of GIS in Grid environments. To improve the efficiency of parallel processing of massive spatial data in a distributed parallel computing grid environment, this paper presents a new grid slot hash parallel spatial index, the GSHR-Tree, built on a parallel spatial indexing mechanism. Based on a hash table and dynamic spatial slots, the paper improves the structure of the classical parallel R-tree index. The GSHR-Tree index makes full use of the good qualities of the R-tree and hash data structures. The result is a new parallel spatial index that can meet the needs of parallel grid computing on massive spatial data in a distributed network. The algorithm splits space into multiple slots by multiplying and reverting and maps these slots to sites in the distributed and parallel system. Each site builds the spatial objects in its spatial slot into an R-tree. On the basis of this tree structure, the index data are distributed among multiple nodes in the grid network using a large-node R-tree method. Imbalance arising during processing can be quickly corrected by a dynamic adjustment algorithm.
This tree structure takes into account the distributed operation, duplication operation, and transfer operation of the spatial index in the grid environment. The design of the GSHR-Tree ensures load balance in parallel computation, and the structure is suited to parallel processing of spatial information in distributed network environments. Instead of the recursive comparison of spatial objects used in the original R-tree, the algorithm builds the spatial index with binary code operations, which computers execute more efficiently, and with an extended dynamic hash code for bit comparison. In the GSHR-Tree, a new server is assigned to the network whenever a full node must be split. We describe a more flexible allocation protocol that copes with a temporary shortage of storage resources. It uses a distributed balanced binary spatial tree that scales with insertions to potentially any number of storage servers through splits of the overloaded ones. The application manipulates the GSHR-Tree structure from a node in the grid environment. The node addresses the tree through its image, which splits can make outdated. This may generate addressing errors, which are resolved by forwarding among the servers. The paper proposes a spatial index data distribution algorithm that limits the number of servers, improving storage utilization at the cost of additional messages. This grid spatial index scheme should fit the needs of new applications that use ever larger sets of spatial data. Our proposal constitutes a flexible storage allocation method for a distributed spatial index. The insertion policy can be tuned dynamically to cope with periods of storage shortage. In such cases storage balancing should be favored for better space utilization, at the price of extra message exchanges between servers.
This structure strikes a compromise between updating the duplicated index and transforming the spatial index data. To meet the needs of grid computing, the GSHR-Tree has a flexible structure that can accommodate new requirements in the future. The GSHR-Tree provides R-tree capabilities for large spatial datasets stored over interconnected servers. The analysis, including the experiments, confirmed the efficiency of our design choices. The scheme should fit the needs of new applications of spatial data that use ever larger datasets. Using the system response time of a parallel spatial range query algorithm as the performance measure, simulation experiments indicate that the design of the GSHR-Tree is reasonable and that the indexing structure performs well.
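The abstract gives no pseudocode, so the following is only a loose sketch of the general idea of splitting space into slots with a binary code and hashing slots to sites. The bit-interleaved (Morton) code, the modulo hash, and the names `interleave_bits`, `slot_of`, and `site_of` are all assumptions made for illustration; they are not the GSHR-Tree's actual slot or hash functions, and the per-site R-tree construction is not shown.

```python
def interleave_bits(ix, iy, bits=16):
    """Interleave the bits of two cell indices into one Morton (Z-order) code."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (2 * b)        # x bits go to even positions
        code |= ((iy >> b) & 1) << (2 * b + 1)    # y bits go to odd positions
    return code


def slot_of(x, y, extent, cells_per_side):
    """Return the binary code of the grid slot containing point (x, y)."""
    xmin, ymin, xmax, ymax = extent
    ix = min(int((x - xmin) / (xmax - xmin) * cells_per_side), cells_per_side - 1)
    iy = min(int((y - ymin) / (ymax - ymin) * cells_per_side), cells_per_side - 1)
    return interleave_bits(ix, iy)


def site_of(slot, n_sites):
    """Map a slot code to a site with a plain modulo hash (an assumption)."""
    return slot % n_sites


# Hypothetical extent, points, and site count.
extent = (0.0, 0.0, 100.0, 100.0)
points = [(5.0, 5.0), (55.0, 5.0), (5.0, 55.0), (99.9, 99.9)]

assignment = {}
for p in points:
    slot = slot_of(p[0], p[1], extent, cells_per_side=4)
    assignment.setdefault(site_of(slot, n_sites=3), []).append(p)

# In a scheme like the one described, each site would then index its own
# points with a local R-tree; that step is omitted here.
print(assignment)
```

Computing a slot code needs only shifts and masks rather than recursive rectangle comparisons, which is the efficiency argument the abstract makes for binary code operations.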
Using mixture tuned match filtering to measure changes in subpixel vegetation area in Las Vegas, Nevada

NASA Astrophysics Data System (ADS)

Brelsford, Christa; Shepherd, Doug

2013-09-01

In desert cities, securing sufficient water supply to meet the needs of both existing population and future growth is a complex problem with few easy solutions. Grass lawns are a major driver of water consumption and accurate measurements of vegetation area are necessary to understand drivers of changes in household water consumption. Measuring vegetation change in a heterogeneous urban environment requires sub-pixel estimation of vegetation area. Mixture Tuned Match Filtering has been successfully applied to target detection for materials that only cover small portions of a satellite image pixel. There have been few successful applications of MTMF to fractional area estimation, despite theory that suggests feasibility. We use a ground truth dataset over ten times larger than that available for any previous MTMF application to estimate the bias between ground truth data and matched filter results. We find that the MTMF algorithm underestimates the fractional area of vegetation by 5-10%, and calculate that averaging over 20 to 30 pixels is necessary to correct this bias. We conclude that with a large ground truth dataset, using MTMF for fractional area estimation is possible when results can be estimated at a lower spatial resolution than the base image. When this method is applied to estimating vegetation area in Las Vegas, NV spatial and temporal trends are consistent with expectations from known population growth and policy goals.
Mantle P wave travel time tomography of Eastern and Southern Africa: New images of mantle upwellings

NASA Astrophysics Data System (ADS)

Benoit, M. H.; Li, C.; van der Hilst, R.

2006-12-01

Much of Eastern Africa, including Ethiopia, Kenya, and Tanzania, has undergone extensive tectonism, including rifting, uplift, and volcanism during the Cenozoic. The cause of this tectonism is often attributed to the presence of one or more mantle upwellings, including starting thermal plumes and superplumes. Previous regional seismic studies and global tomographic models show conflicting results regarding the spatial and thermal characteristics of these upwellings. Additionally, there are questions concerning the extent to which the Archean and Proterozoic lithosphere has been altered by possible thermal upwellings in the mantle. To further constrain the mantle structure beneath Southern and Eastern Africa and to investigate the origin of the tectonism in Eastern Africa, we present preliminary results of a large-scale P wave travel time tomographic study of the region. We invert travel time measurements from the EHB database with travel time measurements taken from regional PASSCAL datasets including the Ethiopia Broadband Seismic Experiment (2000-2002); Kenya Broadband Seismic Experiment (2000-2002); Southern Africa Seismic Experiment (1997-1999); Tanzania Broadband Seismic Experiment (1995-1997); and the Saudi Arabia PASSCAL Experiment (1995-1997). The tomographic inversion uses 3-D sensitivity kernels to combine different datasets and is parameterized with an irregular grid so that high spatial resolution can be obtained in areas of dense data coverage. It follows an adaptive least-squares approach using the LSQR method with norm and gradient damping.
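To make the role of norm damping concrete, the sketch below solves a toy two-cell travel time problem as a damped least-squares system. It is an assumption-laden illustration: it uses the normal equations on a tiny dense system rather than LSQR (an iterative solver intended for large sparse systems), it omits gradient damping, and the path lengths, residuals, and the name `damped_least_squares` are hypothetical.

```python
def damped_least_squares(A, b, damp):
    """Solve min ||A x - b||^2 + damp^2 ||x||^2 for a system with 2 unknowns.

    Uses the normal equations (A^T A + damp^2 I) x = A^T b and Cramer's rule.
    """
    n00 = sum(row[0] * row[0] for row in A) + damp ** 2
    n01 = sum(row[0] * row[1] for row in A)
    n11 = sum(row[1] * row[1] for row in A) + damp ** 2
    r0 = sum(row[0] * t for row, t in zip(A, b))
    r1 = sum(row[1] * t for row, t in zip(A, b))
    det = n00 * n11 - n01 * n01
    return ((r0 * n11 - n01 * r1) / det, (n00 * r1 - n01 * r0) / det)


# Each row holds the path length (km) of one ray in two model cells.
A = [[10.0, 0.0],
     [0.0, 10.0],
     [5.0, 5.0]]
# Travel time residuals (s) for each ray; values are made up.
b = [0.1, -0.2, -0.05]

undamped = damped_least_squares(A, b, damp=0.0)   # slowness perturbations (s/km)
damped = damped_least_squares(A, b, damp=5.0)     # norm damping shrinks the model

print("undamped:", undamped)
print("damped:  ", damped)
```

Increasing `damp` pulls the solution toward zero perturbation, which stabilizes cells that few rays sample at the cost of underestimating anomaly amplitudes.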
A Spiral-Based Downscaling Method for Generating 30 m Time Series Image Data

NASA Astrophysics Data System (ADS)

Liu, B.; Chen, J.; Xing, H.; Wu, H.; Zhang, J.

2017-09-01

The spatial detail and updating frequency of land cover data are important factors influencing land surface dynamic monitoring applications at high spatial resolution. However, the fragmented patches and seasonal variability of some land cover types (e.g. small crop fields, wetlands) make the generation of land cover data difficult and labor-intensive. Using high spatial resolution multi-temporal image data is a possible solution. Unfortunately, the spatial and temporal resolution of available remote sensing data such as Landsat or MODIS can hardly satisfy both the minimum mapping unit and the frequency of current land cover mapping and updating at the same time. Generating high resolution time series may be a compromise that covers this shortage in the land cover updating process. One popular approach is to downscale multi-temporal MODIS data with high spatial resolution auxiliary data such as Landsat. But the usual practice of downscaling a pixel based on a window may lead to an underdetermined problem in heterogeneous areas, resulting in uncertainty for some high spatial resolution pixels. Therefore, the downscaled multi-temporal data can hardly reach the spatial resolution of Landsat data. A spiral-based method was introduced to downscale image data of low spatial and high temporal resolution to image data of high spatial and high temporal resolution. By searching for similar pixels in the adjacent region along a spiral, the pixel set was built up pixel by pixel. Using the pixel set constructed in this way largely prevents the underdetermined problem when solving the linear system. With ordinary least squares, the method inverted the endmember values of the linear system. The high spatial resolution image was reconstructed band by band from the high spatial resolution class map and the endmember values.
Then, the high spatial resolution time series was formed from these high spatial resolution images, image by image. A simulated experiment and a remote sensing image downscaling experiment were conducted. In the simulated experiment, the 30 m class map dataset Globeland30 was used to investigate how well the method avoids the underdetermined problem in the downscaling procedure, and the spiral and window approaches were compared. Further, MODIS NDVI and Landsat image data were used to generate a 30 m NDVI time series in the remote sensing image downscaling experiment. The simulated experiment results showed that the proposed method performed robustly when downscaling pixels in heterogeneous regions and indicated that it was superior to traditional window-based methods. The high resolution time series generated may benefit the mapping and updating of land cover data.
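The spiral search step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it walks a square spiral outward from a center pixel and collects the first few pixels that pass a similarity test. The names `spiral_offsets` and `nearest_similar`, the similarity threshold, and the grid values are hypothetical, and the subsequent least-squares unmixing step is not shown.

```python
def spiral_offsets(max_radius):
    """Yield (d_row, d_col) offsets along a square spiral around (0, 0).

    The walk stops as soon as it would step outside max_radius, so every
    offset within that radius is visited, nearer rings before farther ones.
    """
    yield (0, 0)
    row, col = 0, 0
    d_row, d_col = 0, 1                      # start by moving one column right
    step = 1
    while True:
        for _ in range(2):                   # two legs share each step length
            for _ in range(step):
                row, col = row + d_row, col + d_col
                if max(abs(row), abs(col)) > max_radius:
                    return
                yield (row, col)
            d_row, d_col = d_col, -d_row     # turn 90 degrees
        step += 1


def nearest_similar(image, center, is_similar, count, max_radius):
    """Collect up to `count` similar pixels, searching outward along the spiral."""
    rows, cols = len(image), len(image[0])
    found = []
    for d_row, d_col in spiral_offsets(max_radius):
        r, c = center[0] + d_row, center[1] + d_col
        if 0 <= r < rows and 0 <= c < cols and is_similar(image[r][c]):
            found.append((r, c))
            if len(found) == count:
                break
    return found


# Hypothetical coarse-resolution values; "similar" means within 0.05 of the center.
grid = [
    [0.20, 0.80, 0.20],
    [0.80, 0.20, 0.80],
    [0.20, 0.80, 0.22],
]
center = (1, 1)
target = grid[center[0]][center[1]]
pixel_set = nearest_similar(grid, center, lambda v: abs(v - target) <= 0.05,
                            count=3, max_radius=1)
print(pixel_set)
```

Unlike a fixed window, the search keeps expanding until enough similar pixels are found, which is how a pixel set large enough to keep the linear system from being underdetermined could be assembled.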
Oregon Cascades Play Fairway Analysis: Raster Datasets and Models

DOE Data Explorer

Adam Brandt

2015-11-15

This submission includes maps of the spatial distribution of basaltic and felsic rocks in the Oregon Cascades. It also includes a final Play Fairway Analysis (PFA) model, with the heat and permeability composite risk segments (CRS) supplied separately. Metadata for each raster dataset can be found within the zip files, in the TIF images.
PRISM Climate Group, Oregon State U

Science.gov Websites

The PRISM Climate Group gathers climate observations from a wide range of monitoring networks, applies sophisticated quality control measures, and develops spatial climate datasets to reveal short- and long-term climate patterns. The resulting datasets incorporate a variety of modeling
High resolution global gridded data for use in population studies

PubMed Central

Lloyd, Christopher T.; Sorichetta, Alessandro; Tatem, Andrew J.

2017-01-01

Recent years have seen substantial growth in openly available satellite and other geospatial data layers, which represent a range of metrics relevant to global human population mapping at fine spatial scales. The specifications of such data differ widely and therefore the harmonisation of data layers is a prerequisite to constructing detailed and contemporary spatial datasets which accurately describe population distributions. Such datasets are vital to measure impacts of population growth, monitor change, and plan interventions. To this end the WorldPop Project has produced an open access archive of 3 and 30 arc-second resolution gridded data. Four tiled raster datasets form the basis of the archive: (i) Viewfinder Panoramas topography clipped to Global ADMinistrative area (GADM) coastlines; (ii) a matching ISO 3166 country identification grid; (iii) country area; (iv) and slope layer. Further layers include transport networks, landcover, nightlights, precipitation, travel time to major cities, and waterways. Datasets and production methodology are here described. The archive can be downloaded both from the WorldPop Dataverse Repository and the WorldPop Project website. PMID:28140386
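For readers unfamiliar with arc-second grids, the sketch below converts the 3 and 30 arc-second resolutions mentioned above into approximate east-west cell widths in metres. It assumes a spherical Earth with the WGS84 equatorial radius, which is a simplification; the function name `cell_width_m` is hypothetical and not part of the WorldPop archive.

```python
from math import cos, radians

WGS84_EQUATORIAL_RADIUS_M = 6378137.0


def cell_width_m(arc_seconds, latitude_deg=0.0):
    """Approximate east-west width of a grid cell in metres (spherical Earth)."""
    angle = radians(arc_seconds / 3600.0)          # cell size as an angle in radians
    return WGS84_EQUATORIAL_RADIUS_M * angle * cos(radians(latitude_deg))


w3 = cell_width_m(3)                # roughly 93 m at the equator
w30 = cell_width_m(30)              # roughly 928 m at the equator
w30_at_60 = cell_width_m(30, 60)    # cells narrow toward the poles

print(f"3 arc-sec: {w3:.1f} m, 30 arc-sec: {w30:.1f} m, 30 arc-sec at 60 deg: {w30_at_60:.1f} m")
```

Because cell width shrinks with the cosine of latitude, a per-country area layer such as item (iii) is useful when converting gridded counts into densities.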
