NASA Astrophysics Data System (ADS)
Hassell, David; Gregory, Jonathan; Blower, Jon; Lawrence, Bryan N.; Taylor, Karl E.
2017-12-01
The CF (Climate and Forecast) metadata conventions are designed to promote the creation, processing, and sharing of climate and forecasting data using Network Common Data Form (netCDF) files and libraries. The CF conventions provide a description of the physical meaning of data and of their spatial and temporal properties, but they depend on the netCDF file encoding which can currently only be fully understood and interpreted by someone familiar with the rules and relationships specified in the conventions documentation. To aid in development of CF-compliant software and to capture with a minimal set of elements all of the information contained in the CF conventions, we propose a formal data model for CF which is independent of netCDF and describes all possible CF-compliant data. Because such data will often be analysed and visualised using software based on other data models, we compare our CF data model with the ISO 19123 coverage model, the Open Geospatial Consortium CF netCDF standard, and the Unidata Common Data Model. To demonstrate that this CF data model can in fact be implemented, we present cf-python, a Python software library that conforms to the model and can manipulate any CF-compliant dataset.
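A minimal sketch of what working with a CF data model implementation such as cf-python looks like; the file name "analysis.nc" and the collapse operation are illustrative assumptions, not examples from the paper.

```python
# Minimal sketch of reading and inspecting a CF-compliant dataset with
# cf-python. "analysis.nc" is a hypothetical file name.
import cf

fields = cf.read("analysis.nc")   # returns a list of CF field constructs
for field in fields:
    # Each field construct bundles the data array with its CF metadata:
    # coordinates, cell methods, units, standard name, etc.
    print(field)

# Manipulations act on the field construct rather than on raw netCDF
# variables, e.g. collapsing over named axes and writing the result back out.
tas = fields[0]
mean_tas = tas.collapse("area: mean")
cf.write(mean_tas, "mean_tas.nc")
```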
CF Metadata Conventions: Founding Principles, Governance, and Future Directions
NASA Astrophysics Data System (ADS)
Taylor, K. E.
2016-12-01
The CF Metadata Conventions define attributes that promote sharing of climate and forecasting data and facilitate automated processing by computers. The development, maintenance, and evolution of the conventions have mainly been provided by voluntary community contributions. Nevertheless, an organizational framework has been established, which relies on established rules and web-based discussion to ensure smooth (but relatively efficient) evolution of the standard to accommodate new types of data. The CF standard has been essential to the success of high-profile internationally-coordinated modeling activities (e.g., the Coupled Model Intercomparison Project). A summary of CF's founding principles and the prospects for its future evolution will be discussed.
NASA Astrophysics Data System (ADS)
Ward-Garrison, C.; May, R.; Davis, E.; Arms, S. C.
2016-12-01
NetCDF is a set of software libraries and self-describing, machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. The Climate and Forecast (CF) metadata conventions for netCDF foster the ability to work with netCDF files in general and useful ways. These conventions include metadata attributes for physical units, standard names, and spatial coordinate systems. While these conventions have been successful in easing work with netCDF-formatted output from climate and forecast models, their use for point-based observation data has been less successful. Unidata has prototyped using the CF discrete sampling geometry (DSG) conventions, together with the THREDDS Data Server, to serve the real-time point observation data flowing across the Internet Data Distribution (IDD). These data originate as text-format reports for individual stations (e.g. METAR surface data or TEMP upper air data) and are converted and stored in netCDF files in real time. This work discusses the experiences and challenges of using the current CF DSG conventions for storing such real-time data. We also test how parts of netCDF's extended data model can address these challenges, in order to inform decisions for a future version of CF (CF 2.0) that would take advantage of features of the netCDF enhanced data model.
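For orientation, the sketch below shows the general shape of a CF-1.6 discrete sampling geometry (timeSeries featureType) file for station reports, written with netCDF4-python. Station coordinates and values are invented for illustration and are not taken from the IDD feed.

```python
# Sketch of a CF-1.6 DSG "timeSeries" file for station observations.
# All station IDs, coordinates and values are invented.
from netCDF4 import Dataset

with Dataset("surface_obs.nc", "w") as nc:
    nc.Conventions = "CF-1.6"
    nc.featureType = "timeSeries"

    nc.createDimension("station", 2)
    nc.createDimension("obs", None)          # unlimited time axis
    nc.createDimension("name_strlen", 8)

    station = nc.createVariable("station_name", "S1", ("station", "name_strlen"))
    station.cf_role = "timeseries_id"        # identifies each time series

    time = nc.createVariable("time", "f8", ("station", "obs"))
    time.units = "seconds since 1970-01-01 00:00:00"
    time.standard_name = "time"

    lat = nc.createVariable("lat", "f4", ("station",))
    lat.standard_name = "latitude"
    lat.units = "degrees_north"
    lon = nc.createVariable("lon", "f4", ("station",))
    lon.standard_name = "longitude"
    lon.units = "degrees_east"

    temp = nc.createVariable("air_temperature", "f4", ("station", "obs"))
    temp.standard_name = "air_temperature"
    temp.units = "K"
    temp.coordinates = "time lat lon"

    lat[:] = [40.0, 52.5]
    lon[:] = [-105.0, 13.4]
    time[:, 0] = [1.7e9, 1.7e9]
    temp[:, 0] = [280.2, 275.9]
```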
Improving Metadata Compliance for Earth Science Data Records
NASA Astrophysics Data System (ADS)
Armstrong, E. M.; Chang, O.; Foster, D.
2014-12-01
One of the recurring challenges of creating earth science data records is to ensure a consistent level of metadata compliance at the granule level, where important details of contents, provenance, producer, and data references are necessary to obtain a sufficient level of understanding. These details are important not just for individual data consumers but also for autonomous software systems. Two of the most popular metadata standards at the granule level are the Climate and Forecast (CF) Metadata Conventions and the Attribute Conventions for Dataset Discovery (ACDD). Many data producers have implemented one or both of these models, including the Group for High Resolution Sea Surface Temperature (GHRSST) for their global SST products and the Ocean Biology Processing Group for NASA ocean color and SST products. While both the CF and ACDD models contain various levels of metadata richness, the actual "required" attributes are quite small in number. Metadata at the granule level becomes much more useful when recommended or optional attributes are implemented that document spatial and temporal ranges, lineage and provenance, sources, keywords, references, etc. In this presentation we report on a new open source tool to check the compliance of netCDF and HDF5 granules against the CF and ACDD metadata models. The tool, written in Python, was originally implemented to support metadata compliance for netCDF records as part of NOAA's Integrated Ocean Observing System. It outputs standardized scoring for metadata compliance for both CF and ACDD, produces an objective summary weight, and can be applied to remote records via OPeNDAP calls. Originally a command-line tool, it has been extended to provide a user-friendly web interface. Reports on metadata testing are grouped in hierarchies that make it easier to track flaws and inconsistencies in the record. We have also extended it to support explicit metadata structures and semantic syntax for the GHRSST project that can be easily adapted to other satellite missions as well. Overall, we hope this tool will provide the community with a useful mechanism to improve metadata quality and consistency at the granule level by providing objective scoring and assessment, as well as encourage data producers to improve metadata quality and quantity.
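A deliberately simplified illustration of granule-level scoring in the spirit of the checker described above; this is not the tool's actual API, and the attribute list is abbreviated and hypothetical.

```python
# Toy attribute-presence scorer for ACDD-style discovery metadata.
# Real CF/ACDD checks are far richer; this only illustrates the idea of
# producing an objective summary score for a granule.
from netCDF4 import Dataset

ACDD_RECOMMENDED = ["title", "summary", "keywords",
                    "geospatial_lat_min", "geospatial_lat_max",
                    "time_coverage_start", "time_coverage_end"]

def score_acdd(path):
    """Return (found, total) for a toy ACDD attribute-presence check."""
    with Dataset(path) as nc:
        attrs = set(nc.ncattrs())
    present = [a for a in ACDD_RECOMMENDED if a in attrs]
    return len(present), len(ACDD_RECOMMENDED)

found, total = score_acdd("granule.nc")   # hypothetical file name
print(f"ACDD recommended attributes present: {found}/{total}")
```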
NASA Astrophysics Data System (ADS)
Hardman, M.; Brodzik, M. J.; Long, D. G.
2017-12-01
Beginning in 1978, the satellite passive microwave data record has been a mainstay of remote sensing of the cryosphere, providing twice-daily, near-global spatial coverage for monitoring changes in hydrologic and cryospheric parameters that include precipitation, soil moisture, surface water, vegetation, snow water equivalent, sea ice concentration and sea ice motion. Historical versions of the gridded passive microwave data sets were produced as flat binary files described in human-readable documentation. This format is error-prone and makes it difficult to reliably include all processing and provenance information. Funded by NASA MEaSUREs, we have completely reprocessed the gridded data record that includes SMMR, SSM/I-SSMIS and AMSR-E. The new Calibrated Enhanced-Resolution Brightness Temperature (CETB) Earth System Data Record (ESDR) files are self-describing. Our approach to the new data set was to create netCDF4 files that use standard metadata conventions and best practices to incorporate file-level, machine- and human-readable contents, geolocation, processing and provenance metadata. We followed the flexible and adaptable Climate and Forecast (CF-1.6) Conventions with respect to their coordinate conventions and map projection parameters. Additionally, we made use of the Attribute Conventions for Dataset Discovery (ACDD-1.3), which provide file-level conventions with spatio-temporal bounds that enable indexing software to search for coverage. Our CETB files also include temporal coverage and spatial resolution in the file-level metadata for human readability. We made use of the JPL CF/ACDD Compliance Checker to guide this work. We tested our file format with real software, for example the netCDF Operators (NCO), power tools that give flexible control over spatio-temporal subsetting and concatenation of files. The GDAL tools understand the CF metadata and produce fully compliant GeoTIFF files from our data. ArcMap can then reproject the GeoTIFF files on the fly and work with other geolocated data such as coastlines, with no special work required. We expect this combination of standards and well-tested interoperability to significantly improve the usability of this important ESDR for the Earth Science community.
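The sketch below shows, in hedged form, how CF-1.6 map-projection metadata and ACDD-1.3 discovery attributes can coexist in one netCDF4 file. The projection parameters, dimensions and attribute values are placeholders, not the actual CETB product contents.

```python
# Hedged sketch of combining a CF grid mapping with ACDD file-level
# discovery metadata. All values are placeholders.
from netCDF4 import Dataset

with Dataset("tb_example.nc", "w") as nc:
    # File-level conventions and ACDD discovery metadata
    nc.Conventions = "CF-1.6, ACDD-1.3"
    nc.title = "Example gridded brightness temperatures"
    nc.geospatial_lat_min = 0.0
    nc.geospatial_lat_max = 90.0
    nc.time_coverage_start = "2003-01-01T00:00:00Z"
    nc.time_coverage_end = "2003-01-01T23:59:59Z"

    nc.createDimension("y", 720)
    nc.createDimension("x", 720)

    # CF grid mapping variable describing the map projection
    crs = nc.createVariable("crs", "i4")
    crs.grid_mapping_name = "lambert_azimuthal_equal_area"
    crs.latitude_of_projection_origin = 90.0
    crs.longitude_of_projection_origin = 0.0

    tb = nc.createVariable("brightness_temperature", "f4", ("y", "x"), zlib=True)
    tb.standard_name = "brightness_temperature"
    tb.units = "K"
    tb.grid_mapping = "crs"   # links the data variable to the projection
```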
Weather forecasting with open source software
NASA Astrophysics Data System (ADS)
Rautenhaus, Marc; Dörnbrack, Andreas
2013-04-01
To forecast the weather situation during aircraft-based atmospheric field campaigns, we employ a tool chain of existing and self-developed open source software tools and open standards. Of particular value are the Python programming language with its extension libraries NumPy, SciPy, PyQt4, Matplotlib and the basemap toolkit, the NetCDF standard with the Climate and Forecast (CF) Metadata conventions, and the Open Geospatial Consortium Web Map Service standard. These open source libraries and open standards helped to implement the "Mission Support System", a Web Map Service based tool to support weather forecasting and flight planning during field campaigns. The tool has been implemented in Python and has also been released as open source (Rautenhaus et al., Geosci. Model Dev., 5, 55-71, 2012). In this presentation we discuss the usage of free and open source software for weather forecasting in the context of research flight planning, and highlight how the field campaign work benefits from using open source tools and open standards.
Collaborative Sharing of Multidimensional Space-time Data Using HydroShare
NASA Astrophysics Data System (ADS)
Gan, T.; Tarboton, D. G.; Horsburgh, J. S.; Dash, P. K.; Idaszak, R.; Yi, H.; Blanton, B.
2015-12-01
HydroShare is a collaborative environment being developed for sharing hydrological data and models. It includes capability to upload data in many formats as resources that can be shared. The HydroShare data model for resources uses a specific format for the representation of each type of data and specifies metadata common to all resource types as well as metadata unique to specific resource types. The Network Common Data Form (NetCDF) was chosen as the format for multidimensional space-time data in HydroShare. NetCDF is widely used in hydrological and other geoscience modeling because it contains self-describing metadata and supports the creation of array-oriented datasets that may include three spatial dimensions, a time dimension and other user defined dimensions. For example, NetCDF may be used to represent precipitation or surface air temperature fields that have two dimensions in space and one dimension in time. This presentation will illustrate how NetCDF files are used in HydroShare. When a NetCDF file is loaded into HydroShare, header information is extracted using the "ncdump" utility. Python functions, developed within the Django web framework on which HydroShare is based, extract science metadata present in the NetCDF file, saving the user from having to enter it. Where the file follows the Climate and Forecast (CF) convention and Attribute Convention for Dataset Discovery (ACDD) standards, metadata is thus automatically populated. Users also have the ability to add metadata to the resource that may not have been present in the original NetCDF file. HydroShare's metadata editing functionality then writes this science metadata back into the NetCDF file to maintain consistency between the science metadata in HydroShare and the metadata in the NetCDF file. This further helps researchers easily add metadata information following the CF and ACDD conventions. Additional data inspection and subsetting functions were developed, taking advantage of Python and command line libraries for working with NetCDF files. We describe the design and implementation of these features and illustrate how NetCDF files from a modeling application may be curated in HydroShare and thus enhance reproducibility of the associated research. We also discuss future development planned for multidimensional space-time data in HydroShare.
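An illustrative sketch (not HydroShare's actual code) of the kind of automatic metadata harvesting described above: global CF/ACDD attributes and a few per-variable CF attributes are collected into a dictionary that could pre-populate a resource's science metadata. The file name is hypothetical.

```python
# Illustrative harvest of CF/ACDD attributes from an uploaded NetCDF file.
from netCDF4 import Dataset

def harvest_metadata(path):
    """Collect global attributes and selected per-variable CF attributes."""
    with Dataset(path) as nc:
        record = {"global": {a: getattr(nc, a) for a in nc.ncattrs()},
                  "variables": {}}
        for name, var in nc.variables.items():
            record["variables"][name] = {
                a: getattr(var, a)
                for a in var.ncattrs()
                if a in ("standard_name", "long_name", "units")
            }
    return record

meta = harvest_metadata("model_output.nc")   # hypothetical file
print(meta["global"].get("title", "no title attribute"))
```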
Advances in a distributed approach for ocean model data interoperability
Signell, Richard P.; Snowden, Derrick P.
2014-01-01
An infrastructure for earth science data is emerging across the globe based on common data models and web services. As we evolve from custom file formats and web sites to standards-based web services and tools, data is becoming easier to distribute, find and retrieve, leaving more time for science. We describe recent advances that make it easier for ocean model providers to share their data, and for users to search, access, analyze and visualize ocean data using MATLAB® and Python®. These include a technique for modelers to create aggregated, Climate and Forecast (CF) metadata convention datasets from collections of non-standard Network Common Data Form (NetCDF) output files, the capability to remotely access data from CF-1.6-compliant NetCDF files using the Open Geospatial Consortium (OGC) Sensor Observation Service (SOS), a metadata standard for unstructured grid model output (UGRID), and tools that utilize both CF and UGRID standards to allow interoperable data search, browse and access. We use examples from the U.S. Integrated Ocean Observing System (IOOS®) Coastal and Ocean Modeling Testbed, a project in which modelers using both structured and unstructured grid model output needed to share their results, to compare their results with other models, and to compare models with observed data. The same techniques used here for ocean modeling output can be applied to atmospheric and climate model output, remote sensing data, digital terrain and bathymetric data.
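A minimal sketch of the kind of remote, subset-only access described above, using OPeNDAP with netCDF4-python; the dataset URL and variable name are hypothetical.

```python
# Remote subsetting of a CF-compliant aggregation over OPeNDAP.
# Only the requested slab is transferred over the network.
from netCDF4 import Dataset

url = "https://example.org/thredds/dodsC/model/aggregated_output"  # hypothetical
with Dataset(url) as nc:
    temp = nc.variables["sea_water_temperature"]   # hypothetical variable name
    surface_field = temp[0, 0, :, :]               # first time step, top level
    print(temp.standard_name, temp.units, surface_field.shape)
```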
Verification of a New NOAA/NSIDC Passive Microwave Sea-Ice Concentration Climate Record
NASA Technical Reports Server (NTRS)
Meier, Walter N.; Peng, Ge; Scott, Donna J.; Savoie, Matt H.
2014-01-01
A new satellite-based passive microwave sea-ice concentration product developed for the National Oceanic and Atmospheric Administration (NOAA) Climate Data Record (CDR) programme is evaluated via comparison with other passive microwave-derived estimates. The new product leverages two well-established concentration algorithms, known as the NASA Team and Bootstrap, both developed at and produced by the National Aeronautics and Space Administration (NASA) Goddard Space Flight Center (GSFC). The sea ice estimates compare well with similar GSFC products while also fulfilling all NOAA CDR initial operation capability (IOC) requirements, including (1) self-describing file format, (2) ISO 19115-2 compliant collection-level metadata, (3) Climate and Forecast (CF) compliant file-level metadata, (4) grid-cell level metadata (data quality fields), (5) fully automated and reproducible processing and (6) open online access to full documentation with version control, including source code and an algorithm theoretical basis document. The primary limitations of the GSFC products are lack of metadata and use of untracked manual corrections to the output fields. Smaller differences occur from minor variations in processing methods by the National Snow and Ice Data Center (for the CDR fields) and NASA (for the GSFC fields). The CDR concentrations do have some differences from the constituent GSFC concentrations, but trends and variability are not substantially different.
Representing Simple Geometry Types in NetCDF-CF
NASA Astrophysics Data System (ADS)
Blodgett, D. L.; Koziol, B. W.; Whiteaker, T. L.; Simons, R.
2016-12-01
The Climate and Forecast (CF) metadata convention is well-suited for representing gridded and point-based observational datasets. However, CF currently has no accepted mechanism for representing simple geometry types such as lines and polygons. Lack of support for simple geometries within CF has unintentionally excluded a broad set of geoscientific data types from NetCDF-CF data encodings. For example, hydrologic datasets often contain polygon watershed catchments and polyline stream reaches in addition to point sampling stations and water management infrastructure. The latter has an associated CF specification. In the interest of supporting all simple geometry types within CF, a working group was formed following an EarthCube workshop on Advancing NetCDF-CF [1] to draft a CF specification for simple geometries: points, lines, polygons, and their associated multi-geometry representations [2]. The draft also includes parametric geometry types such as circles and ellipses. This presentation will provide an overview of the scope and content of the proposed specification, focusing on mechanisms for representing coordinate arrays using variable-length or contiguous ragged arrays, capturing multi-geometries, and accounting for type-specific geometry artifacts such as polygon holes/interiors, node ordering, etc. The concepts contained in the specification proposal will be described with a use case representing streamflow in rivers and evapotranspiration from HUC12 watersheds. We will also introduce Python and R reference implementations developed alongside the technical specification. These in-development, open source Python and R libraries convert between commonly used GIS software objects (i.e. GEOS-based primitives) and their associated simple geometry CF representation. [1] http://www.unidata.ucar.edu/events/2016CFWorkshop/ [2] https://github.com/bekozi/netCDF-CF-simple-geometry
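A hedged sketch of a polygon "simple geometry" encoding, following the general shape of the proposal that later became the CF geometry rules (a geometry container variable, node count and node coordinate variables); the attribute names reflect that later convention and the watershed coordinates are invented.

```python
# Hedged sketch of one watershed polygon with an associated data value,
# encoded with a geometry container variable. Coordinates are invented.
from netCDF4 import Dataset

with Dataset("watersheds.nc", "w") as nc:
    nc.Conventions = "CF-1.8"
    nc.createDimension("instance", 1)   # one watershed polygon
    nc.createDimension("node", 4)       # its boundary nodes

    geom = nc.createVariable("geometry_container", "i4")
    geom.geometry_type = "polygon"
    geom.node_count = "node_count"
    geom.node_coordinates = "x y"

    node_count = nc.createVariable("node_count", "i4", ("instance",))
    node_count[:] = [4]

    x = nc.createVariable("x", "f8", ("node",))
    x.axis = "X"
    y = nc.createVariable("y", "f8", ("node",))
    y.axis = "Y"
    x[:] = [-104.0, -104.0, -103.0, -103.0]
    y[:] = [40.0, 41.0, 41.0, 40.0]

    et = nc.createVariable("evapotranspiration", "f8", ("instance",))
    et.units = "kg m-2 s-1"
    et.geometry = "geometry_container"   # links the data to its polygon
```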
NASA Astrophysics Data System (ADS)
Druken, K. A.; Trenham, C. E.; Wang, J.; Bastrakova, I.; Evans, B. J. K.; Wyborn, L. A.; Ip, A. I.; Poudjom Djomani, Y.
2016-12-01
The National Computational Infrastructure (NCI) hosts one of Australia's largest repositories (10+ PBytes) of research data, colocated with a petascale High Performance Computer and a highly integrated research cloud. Key to maximizing the benefit of NCI's collections and computational capabilities is ensuring seamless interoperable access to these datasets. This presents considerable data management challenges across the diverse range of geoscience data, spanning disciplines where netCDF-CF is commonly utilized (e.g., climate, weather, remote sensing), through to the geophysics and seismology fields that employ more traditional domain- and study-specific data formats. These data are stored in a variety of gridded, irregularly spaced (i.e., trajectories, point clouds, profiles), and raster image structures. They often have diverse coordinate projections and resolutions, thus complicating the task of comparison and inter-discipline analysis. Nevertheless, much can be learned from the netCDF-CF model that has long served the climate community, providing a common data structure for the atmospheric, ocean and cryospheric sciences. We are extending the application of the existing Climate and Forecast (CF) metadata conventions to NCI's broader geoscience data collections. We present simple implementations that can significantly improve interoperability of the research collections, particularly in the case of line survey data. NCI has developed a compliance checker to assist with data quality across all hosted netCDF-CF collections. The tool is an extension of one of the main existing CF Convention checkers, which we have modified to incorporate the Attribute Convention for Data Discovery (ACDD) and ISO 19115 standards, and to perform parallelised checks over collections of files, ensuring compliance and consistency across the NCI data collections as a whole. It is complemented by a checker that also verifies functionality against a range of scientific analysis, programming, and data visualisation tools. By design, these tests are not necessarily domain-specific, and demonstrate that verified data is accessible to end-users, thus allowing for seamless interoperability with other datasets across a wide range of fields.
ERDDAP: Reducing Data Friction with an Open Source Data Platform
NASA Astrophysics Data System (ADS)
O'Brien, K.
2017-12-01
Data friction is not just an issue facing interdisciplinary research. Oftentimes, even within disciplines, significant data friction can exist. Differing formats, limited metadata and non-existent machine-to-machine data access are all issues that exist within disciplines and make successful interdisciplinary cooperation that much harder. Therefore, reducing data friction within disciplines is a crucial first step toward better overall collaboration. ERDDAP, an open source data platform developed at NOAA's Southwest Fisheries Science Center, is well poised to improve data usability and understanding and to reduce data friction, in both single- and multi-disciplinary research. By virtue of its ability to integrate data of varying formats and provide RESTful-based user access to data and metadata, use of ERDDAP has grown substantially throughout the ocean data community. ERDDAP also supports standards such as the DAP data protocol, the Climate and Forecast (CF) metadata conventions and the BagIt document standard for data archival. In this presentation, we will discuss the advantages of using ERDDAP as a data platform. We will also show specific use cases where utilizing ERDDAP has reduced friction within a single discipline (physical oceanography) and improved interdisciplinary collaboration as well.
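A sketch of ERDDAP's RESTful access pattern: data are requested by composing a URL from the dataset ID, the desired variables, constraints and an output format. The server address, dataset ID and variable names below are hypothetical.

```python
# Build a tabledap request and read the returned CSV with pandas.
# ERDDAP's CSV has column names in the first row and units in the second,
# so skiprows=[1] drops the units row.
import pandas as pd

server = "https://example.org/erddap"      # hypothetical server
dataset_id = "sst_daily"                   # hypothetical dataset ID
query = (
    "time,latitude,longitude,sea_surface_temperature"
    "&time>=2017-01-01&time<=2017-01-07"
)
url = f"{server}/tabledap/{dataset_id}.csv?{query}"

df = pd.read_csv(url, skiprows=[1])
print(df.head())
```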
NASA Astrophysics Data System (ADS)
Hardy, D.; Janée, G.; Gallagher, J.; Frew, J.; Cornillon, P.
2006-12-01
The OPeNDAP Data Access Protocol (DAP) is a community standard for sharing scientific data across the Internet. Data providers using DAP have adopted a variety of metadata conventions to improve data utility, such as COARDS (1995) and CF (2003). Our results show, however, that metadata do not follow these conventions in practice. We collected metadata from over a hundred DAP servers, tens of thousands of data objects, and hundreds of collections. We found that a minority claim to adhere to a metadata convention, and a small percentage accurately adhere to their stated convention. We present descriptive statistics of our survey and highlight common traits such as well-populated attributes. Our empirical results indicate that unified search services cannot rely solely on metadata conventions. Although we encourage all providers to adopt a small subset of the CF convention for discovery purposes, we have no evidence to suggest that improved conventions would simplify the fundamental problem of heterogeneity. Large-scale discovery services must find methods for integrating incompatible metadata.
NASA Technical Reports Server (NTRS)
Smit, Christine; Hegde, Mahabaleshwara; Strub, Richard; Bryant, Keith; Li, Angela; Petrenko, Maksym
2017-01-01
Giovanni is a data exploration and visualization tool at the NASA Goddard Earth Sciences Data Information Services Center (GES DISC). It has been around in one form or another for more than 15 years. Giovanni calculates simple statistics and produces 22 different visualizations for more than 1600 geophysical parameters from more than 90 satellite and model products. Giovanni relies on external data format standards to ensure interoperability, including the NetCDF CF Metadata Conventions. Unfortunately, these standards were insufficient to make Giovanni's internal data representation truly simple to use. Finding and working with dimensions can be convoluted with the CF Conventions. Furthermore, the CF Conventions are silent on machine-friendly descriptive metadata such as the parameter's source product and product version. In order to simplify analyzing disparate earth science data parameters in a unified way, we developed Giovanni's internal standard. First, the format standardizes parameter dimensions and variables so they can be easily found. Second, the format adds all the machine-friendly metadata Giovanni needs to present our parameters to users in a consistent and clear manner. At a glance, users can grasp all the pertinent information about parameters both during parameter selection and after visualization.
The Value of Data and Metadata Standardization for Interoperability in Giovanni
NASA Astrophysics Data System (ADS)
Smit, C.; Hegde, M.; Strub, R. F.; Bryant, K.; Li, A.; Petrenko, M.
2017-12-01
Giovanni (https://giovanni.gsfc.nasa.gov/giovanni/) is a data exploration and visualization tool at the NASA Goddard Earth Sciences Data Information Services Center (GES DISC). It has been around in one form or another for more than 15 years. Giovanni calculates simple statistics and produces 22 different visualizations for more than 1600 geophysical parameters from more than 90 satellite and model products. Giovanni relies on external data format standards to ensure interoperability, including the NetCDF CF Metadata Conventions. Unfortunately, these standards were insufficient to make Giovanni's internal data representation truly simple to use. Finding and working with dimensions can be convoluted with the CF Conventions. Furthermore, the CF Conventions are silent on machine-friendly descriptive metadata such as the parameter's source product and product version. In order to simplify analyzing disparate earth science data parameters in a unified way, we developed Giovanni's internal standard. First, the format standardizes parameter dimensions and variables so they can be easily found. Second, the format adds all the machine-friendly metadata Giovanni needs to present our parameters to users in a consistent and clear manner. At a glance, users can grasp all the pertinent information about parameters both during parameter selection and after visualization. This poster gives examples of how our metadata and data standards, both external and internal, have both simplified our code base and improved our users' experiences.
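A hypothetical illustration of the kind of machine-friendly, parameter-level metadata the abstract describes (source product, product version, standardized dimension names). The attribute and variable names below are invented for illustration and are not Giovanni's actual internal schema.

```python
# Attach descriptive, machine-friendly attributes to a parameter variable.
# File, variable and attribute names are hypothetical.
from netCDF4 import Dataset

with Dataset("parameter.nc", "a") as nc:
    var = nc.variables["AOD_550"]                     # hypothetical parameter
    var.setncattr("product_short_name", "MYD08_D3")   # source product
    var.setncattr("product_version", "6.1")           # product version
    var.setncattr("quantity_type", "Aerosol Optical Depth")
    # Standardized dimension names make generic code possible: every
    # parameter can be indexed as (time, lat, lon) without per-product logic.
    print(var.dimensions)   # e.g. ('time', 'lat', 'lon')
```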
Web Services as Building Blocks for an Open Coastal Observing System
NASA Astrophysics Data System (ADS)
Breitbach, G.; Krasemann, H.
2012-04-01
Coastal observing systems need to integrate different observing methods such as remote sensing, in-situ measurements, and models into a synoptic view of the state of the observed region. This integration can be based solely on web services combining data and metadata. Such an approach is pursued for COSYNA (Coastal Observing System for Northern and Arctic Seas). Data from satellite and radar remote sensing and measurements from buoys, stations and FerryBoxes form the observation part of COSYNA. These data are assimilated into models to create pre-operational forecasts. For discovering data, an OGC Web Feature Service (WFS) is used by the COSYNA data portal. This Web Feature Service holds the metadata necessary not only for finding data, but also the URLs of web services to view and download the data. To make the data from different resources comparable, a common vocabulary is needed. For COSYNA, the standard names from the CF conventions are stored within the metadata whenever possible. For the metadata, an INSPIRE- and ISO 19115-compatible data format is used. The WFS is fed from the metadata system using database views. Actual data are stored in two different formats: in NetCDF files for gridded data and in an RDBMS for time-series-like data. The web services are largely standards-based, the standards being mainly OGC standards. Maps are created from NetCDF files with the help of the ncWMS tool, whereas a self-developed Java servlet is used for maps of moving measurement platforms; in this case, data download is offered via OGC SOS. For NetCDF files, OPeNDAP is used for data download. The OGC CSW is used for accessing extended metadata. The concept of data management in COSYNA, which is independent of the specific services used in COSYNA, will be presented. This concept is parameter- and data-centric and might be useful for other observing systems.
CfRadial - CF NetCDF for Radar and Lidar Data in Polar Coordinates.
NASA Astrophysics Data System (ADS)
Dixon, M. J.; Lee, W. C.; Michelson, D.; Curtis, M.
2016-12-01
Since 1990, NCAR has supported over 20 different data formats for radar and lidar data in polar coordinates. Researchers, students and operational users spend unnecessary time handling a multitude of unique formats. CfRadial grew out of the need to simplify the use of these data and thereby to improve efficiency in research and operations. CfRadial adopts the well-known NetCDF framework, along with the Climate and Forecasting (CF) conventions such that data and metadata are accurately represented. Mobile platforms are also supported. The first major release, CfRadial version 1.1, occurred in February 2011, followed by minor updates. CfRadial has been adopted by NCAR as well as other agencies in the US and the UK. CfRadial development was boosted in 2015 through a two-year NSF EarthCube grant to improve CF in general. Version 1.4 was agreed upon in May 2016, adding explicit support for quality control fields and spectra. In Europe and Australia, EUMETNET OPERA's HDF5-based ODIM_H5 standard has been rapidly embraced as the modern standard for exchanging weather radar data for operations. ODIM_H5 exploits data groups, hierarchies, and built-in compression, characteristics that have been added to NetCDF4. A meeting of the WMO Task Team on Weather Radar Data Exchange (TT-WRDE) was held at NCAR in Boulder in July 2016, with a goal of identifying a single global standard for radar and lidar data in polar coordinates. CfRadial and ODIM_H5 were considered alongside the older and more rigid table-driven WMO BUFR and GRIB2 formats. TT-WRDE recommended that CfRadial 1.4 be merged with the sweep-oriented structure of ODIM_H5, making use of NetCDF groups, to produce a single format that will encompass the best ideas of both formats. That has led to the emergence of the CfRadial 2.0 standard. This format should meet the objectives of both the NSF EarthCube CF 2.0 initiative and the WMO TT-WRDE. It has the added benefit of improving data exchange between operational and research users, making operational data more readily available to researchers, and research algorithms more accessible to operational agencies.
Control vocabulary software designed for CMIP6
NASA Astrophysics Data System (ADS)
Nadeau, D.; Taylor, K. E.; Williams, D. N.; Ames, S.
2016-12-01
The Coupled Model Intercomparison Project Phase 6 (CMIP6) coordinates a number of intercomparison activities and includes many more experiments than its predecessor, CMIP5. In order to organize and facilitate use of the complex collection of expected CMIP6 model output, a standard set of descriptive information has been defined, which must be stored along with the data. This standard information enables automated machine interpretation of the contents of all model output files. The standard metadata is stored in compliance with the Climate and Forecast (CF) standard, which ensures that it can be interpreted and visualized by many standard software packages. Additional attributes (not standardized by CF) are required by CMIP6 to enhance identification of models and experiments, and to provide additional information critical for interpreting the model results. To ensure that CMIP6 data complies with the standards, a Python program called "PrePARE" (Pre-Publication Attribute Reviewer for the ESGF) has been developed to check the model output prior to its publication and release for analysis. If, for example, a required attribute is missing or incorrect (e.g., not included in the reference CMIP6 controlled vocabularies), then PrePARE will prevent publication. In some circumstances, missing attributes can be created or incorrect attributes can be replaced automatically by PrePARE, and the program will warn users about the changes that have been made. PrePARE provides a final check on model output, assuring a baseline conformity across the output from all CMIP6 models, which will facilitate analysis by climate scientists. PrePARE is flexible and can be easily modified for use by similar projects that have a well-defined set of metadata and controlled vocabularies.
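A toy illustration, not PrePARE itself, of validating required global attributes against a controlled vocabulary before publication; the vocabulary entries and file name are abbreviated placeholders.

```python
# Toy controlled-vocabulary check of netCDF global attributes.
from netCDF4 import Dataset

CONTROLLED_VOCAB = {
    "activity_id": {"CMIP", "ScenarioMIP"},   # abbreviated example CV
    "frequency": {"mon", "day", "6hr"},
}

def check_file(path):
    errors = []
    with Dataset(path) as nc:
        attrs = {a: getattr(nc, a) for a in nc.ncattrs()}
    for name, allowed in CONTROLLED_VOCAB.items():
        if name not in attrs:
            errors.append(f"missing required attribute '{name}'")
        elif attrs[name] not in allowed:
            errors.append(f"'{name}' = '{attrs[name]}' not in controlled vocabulary")
    return errors

problems = check_file("tas_Amon_model_historical.nc")   # hypothetical file
if problems:
    print("publication blocked:", *problems, sep="\n  ")
```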
NASA Astrophysics Data System (ADS)
Vines, Aleksander; Hansen, Morten W.; Korosov, Anton
2017-04-01
Existing international and Norwegian infrastructure projects, e.g., NorDataNet, NMDC and NORMAP, provide open data access through the OPeNDAP protocol following the CF (Climate and Forecast) metadata conventions, designed to promote the processing and sharing of files created with the NetCDF application programming interface (API). This approach is now also being implemented in the Norwegian Sentinel Data Hub (satellittdata.no) to provide satellite EO data to the user community. In addition to providing simplified and unified data access, these projects also seek to use and establish common standards for use and discovery metadata. This then allows development of standardized tools for data search and (subset) streaming over the internet to perform actual scientific analysis. A combination of software tools, which we call a Scientific Platform as a Service (SPaaS), will take advantage of these opportunities to harmonize and streamline the search, retrieval and analysis of integrated satellite and auxiliary observations of the oceans in a seamless system. The SPaaS is a cloud solution for integration of analysis tools with scientific datasets via an API. The core part of the SPaaS is a distributed metadata catalog to store granular metadata describing the structure, location and content of available satellite, model, and in situ datasets. The analysis tools include software for visualization (also online), interactive in-depth analysis, and server-based processing chains. The API conveys search requests between system nodes (i.e., interactive and server tools) and provides easy access to the metadata catalog, data repositories, and the tools. The SPaaS components are integrated in virtual machines, whose provisioning and deployment are automated using existing state-of-the-art open-source tools (e.g., Vagrant, Ansible, Docker). The open-source code for scientific tools and virtual machine configurations is under version control at https://github.com/nansencenter/, and is coupled to an online continuous integration system (e.g., Travis CI).
NetCDF4/HDF5 and Linked Data in the Real World - Enriching Geoscientific Metadata without Bloat
NASA Astrophysics Data System (ADS)
Ip, Alex; Car, Nicholas; Druken, Kelsey; Poudjom-Djomani, Yvette; Butcher, Stirling; Evans, Ben; Wyborn, Lesley
2017-04-01
NetCDF4 has become the dominant generic format for many forms of geoscientific data, leveraging (and constraining) the versatile HDF5 container format, while providing metadata conventions for interoperability. However, the encapsulation of detailed metadata within each file can lead to metadata "bloat", and difficulty in maintaining consistency where metadata is replicated to multiple locations. Complex conceptual relationships are also difficult to represent in simple key-value netCDF metadata. Linked Data provides a practical mechanism to address these issues by associating the netCDF files and their internal variables with complex metadata stored in Semantic Web vocabularies and ontologies, while complying with and complementing existing metadata conventions. One of the stated objectives of the netCDF4/HDF5 formats is that they should be self-describing: containing metadata sufficient for cataloguing and using the data. However, this objective can be regarded as only partially-met where details of conventions and definitions are maintained externally to the data files. For example, one of the most widely used netCDF community standards, the Climate and Forecasting (CF) Metadata Convention, maintains standard vocabularies for a broad range of disciplines across the geosciences, but this metadata is currently neither readily discoverable nor machine-readable. We have previously implemented useful Linked Data and netCDF tooling (ncskos) that associates netCDF files, and individual variables within those files, with concepts in vocabularies formulated using the Simple Knowledge Organization System (SKOS) ontology. NetCDF files contain Uniform Resource Identifier (URI) links to terms represented as SKOS Concepts, rather than plain-text representations of those terms, so we can use simple, standardised web queries to collect and use rich metadata for the terms from any Linked Data-presented SKOS vocabulary. Geoscience Australia (GA) manages a large volume of diverse geoscientific data, much of which is being translated from proprietary formats to netCDF at NCI Australia. This data is made available through the NCI National Environmental Research Data Interoperability Platform (NERDIP) for programmatic access and interdisciplinary analysis. The netCDF files contain both scientific data variables (e.g. gravity, magnetic or radiometric values), but also domain-specific operational values (e.g. specific instrument parameters) best described fully in formal vocabularies. Our ncskos codebase provides access to multiple stores of detailed external metadata in a standardised fashion. Geophysical datasets are generated from a "survey" event, and GA maintains corporate databases of all surveys and their associated metadata. It is impractical to replicate the full source survey metadata into each netCDF dataset so, instead, we link the netCDF files to survey metadata using public Linked Data URIs. These URIs link to Survey class objects which we model as a subclass of Activity objects as defined by the PROV Ontology, and we provide URI resolution for them via a custom Linked Data API which draws current survey metadata from GA's in-house databases. We have demonstrated that Linked Data is a practical way to associate netCDF data with detailed, external metadata. This allows us to ensure that catalogued metadata is kept consistent with metadata points-of-truth, and we can infer complex conceptual relationships not possible with netCDF key-value attributes alone.
NASA Astrophysics Data System (ADS)
Stein, Olaf; Schultz, Martin G.; Rambadt, Michael; Saini, Rajveer; Hoffmann, Lars; Mallmann, Daniel
2017-04-01
Global model data of atmospheric composition produced by the Copernicus Atmosphere Monitoring Service (CAMS) has been collected since 2010 at FZ Jülich and serves as boundary conditions for use by Regional Air Quality (RAQ) modellers world-wide. RAQ models need time-resolved meteorological as well as chemical lateral boundary conditions for their individual model domains. While the meteorological data usually come from well-established global forecast systems, the chemical boundary conditions are not always well defined. In the past, many models used 'climatic' boundary conditions for the tracer concentrations, which can lead to significant concentration biases, particularly for tracers with longer lifetimes which can be transported over long distances (e.g. over the whole northern hemisphere) with the mean wind. The Copernicus approach utilizes extensive near-realtime data assimilation of atmospheric composition data observed from space, which gives additional reliability to the global modelling data and is well received by the RAQ communities. An existing Web Coverage Service (WCS) for sharing these individually tailored model results is currently being re-engineered to make use of a modern, scalable database technology in order to improve performance, enhance flexibility, and allow the operation of catalogue services. The new Jülich Atmospheric Data Distributions Server (JADDS) adheres to the Web Coverage Service WCS 2.0 standard as defined by the Open Geospatial Consortium (OGC). This enables the user groups to flexibly define the datasets they need by selecting a subset of chemical species or restricting geographical boundaries or the length of the time series. The data is made available in the form of different catalogues stored locally on our server. In addition, the Jülich OWS Interface (JOIN) provides interoperable web services allowing for easy download and visualization of datasets delivered from WCS servers via the internet. We will present the prototype JADDS server and address the major issues identified when relocating large four-dimensional datasets into a RASDAMAN raster array database. So far, RASDAMAN support for data in netCDF format is limited with respect to metadata related to variables and axes. For community-wide accepted solutions, selected data coverages shall result in downloadable netCDF files including metadata complying with the netCDF CF Metadata Conventions standard (http://cfconventions.org/). This can be achieved by adding custom metadata elements for RASDAMAN bands (model levels) on data ingestion. Furthermore, an optimization strategy for ingestion of several TB of 4D model output data will be outlined.
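For context, the sketch below shows a generic WCS 2.0 KVP GetCoverage request with spatial and temporal subsetting and netCDF output, the kind of request a service such as JADDS is meant to answer. The endpoint URL and coverage ID are hypothetical.

```python
# Generic WCS 2.0 GetCoverage request with trimming along Lat, Long and time.
import requests

params = {
    "service": "WCS",
    "version": "2.0.1",
    "request": "GetCoverage",
    "coverageId": "CAMS_ozone",                       # hypothetical coverage
    "subset": ['Lat(30,60)', 'Long(-10,30)',
               'ansi("2017-01-01T00:00:00Z","2017-01-02T00:00:00Z")'],
    "format": "application/netcdf",
}
resp = requests.get("https://example.org/rasdaman/ows", params=params)  # hypothetical endpoint
resp.raise_for_status()
with open("boundary_conditions.nc", "wb") as f:
    f.write(resp.content)
```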
Earthquake and failure forecasting in real-time: A Forecasting Model Testing Centre
NASA Astrophysics Data System (ADS)
Filgueira, Rosa; Atkinson, Malcolm; Bell, Andrew; Main, Ian; Boon, Steven; Meredith, Philip
2013-04-01
Across Europe there are a large number of rock deformation laboratories, each of which runs many experiments. Similarly, there are a large number of theoretical rock physicists who develop constitutive and computational models both for rock deformation and changes in geophysical properties. Here we consider how to open up opportunities for sharing experimental data in a way that is integrated with multiple hypothesis testing. We present a prototype for a new forecasting model testing centre based on e-infrastructures for capturing and sharing data and models to accelerate Rock Physicist (RP) research. This proposal is triggered by our work on data assimilation in the NERC EFFORT (Earthquake and Failure Forecasting in Real Time) project, using data provided by the NERC CREEP 2 experimental project as a test case. EFFORT is a multi-disciplinary collaboration between Geoscientists, Rock Physicists and Computer Scientists. Brittle failure of the crust is likely to play a key role in controlling the timing of a range of geophysical hazards, such as volcanic eruptions, yet the predictability of brittle failure is unknown. Our aim is to provide a facility for developing and testing models to forecast brittle failure in experimental and natural data. Model testing is performed in real-time, verifiably prospective mode, in order to avoid selection biases that are possible in retrospective analyses. The project will ultimately quantify the predictability of brittle failure, and how this predictability scales from simple, controlled laboratory conditions to the complex, uncontrolled real world. Experimental data are collected from controlled laboratory experiments, including data from the UCL laboratory and from the CREEP 2 project, which will undertake experiments in a deep-sea laboratory. We illustrate the properties of the prototype testing centre by streaming and analysing realistically noisy synthetic data, as an aid to generating and improving testing methodologies in imperfect conditions. The forecasting model testing centre uses a repository to hold all the data and models and a catalogue to hold all the corresponding metadata. It allows users to: Data transfer: upload experimental data: we have developed the FAST (Flexible Automated Streaming Transfer) tool to upload data from RP laboratories to the repository. FAST sets up data transfer requirements and selects the transfer protocol automatically. Metadata are automatically created and stored. Web data access: create synthetic data: users can choose a generator and supply parameters. Synthetic data are automatically stored with corresponding metadata. Select data and models: search the metadata using criteria designed for RP. The metadata of each dataset (synthetic or from a laboratory) and of each model are well described through their respective catalogues, accessible via the web portal. Upload models: upload and store a model with associated metadata. This provides an opportunity to share models. The web portal solicits and creates metadata describing each model. Run models and visualise results: selected data and a model can be submitted to a High Performance Computing resource, hiding technical details. Results are displayed in accelerated time and stored, allowing retrieval, inspection and aggregation. The forecasting model testing centre proposed could be integrated into EPOS. Its expected benefits are: improved understanding of brittle failure prediction and its scalability to natural phenomena; accelerated and extensive testing and rapid sharing of insights; increased impact and visibility of RP and Geoscience research; and resources for education and training. A key challenge is to agree the framework for sharing RP data and models. Our work is a provocative first step.
NASA Astrophysics Data System (ADS)
Brisc, Felicia; Vater, Stefan; Behrens, Joern
2016-04-01
We present the UGRID Reader, a visualization software component that implements the UGRID Conventions in ParaView. It currently supports the reading and visualization of 2D unstructured triangular, quadrilateral and mixed triangle/quadrilateral meshes, while the data can be defined per cell or per vertex. The Climate and Forecast Metadata Conventions (CF Conventions) have for many years served as the standard framework for climate data written in NetCDF format. While they allow storing unstructured data simply as data defined at a series of points, they do not currently address the topology of the underlying unstructured mesh. However, it is often necessary to have additional mesh topology information, i.e. whether it is a one-dimensional network, a 2D triangular mesh or a flexible mixed triangle/quadrilateral mesh, a 2D mesh with vertical layers, or a fully unstructured 3D mesh. The UGRID Conventions proposed by the UGRID Interoperability group attempt to fill this void by extending the CF Conventions with topology specifications. As the UGRID Conventions are increasingly popular with an important subset of the CF community, they warrant the development of a customized tool for the visualization and exploration of UGRID-conforming data. The implementation of the UGRID Reader has been designed according to the ParaView plugin architecture. This approach allowed us to tap into the powerful reading and rendering capabilities of ParaView, while the reader remains easy to install. We aim to support parallelism in order to process large data sets. Furthermore, our current application of the reader is the visualization of higher-order simulation output, which demands a special representation of the data within a cell.
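A minimal sketch of the UGRID mesh topology description such a reader consumes: a dummy "mesh_topology" variable points to node coordinates and face-node connectivity, and data variables reference the mesh and their location on it. Node coordinates and values are invented.

```python
# Minimal UGRID-1.0 style 2D triangular mesh with one face-based variable.
import numpy as np
from netCDF4 import Dataset

with Dataset("mesh_example.nc", "w") as nc:
    nc.Conventions = "UGRID-1.0"
    nc.createDimension("nMesh2_node", 4)
    nc.createDimension("nMesh2_face", 2)
    nc.createDimension("nMaxMesh2_face_nodes", 3)

    mesh = nc.createVariable("Mesh2", "i4")
    mesh.cf_role = "mesh_topology"
    mesh.topology_dimension = 2
    mesh.node_coordinates = "Mesh2_node_x Mesh2_node_y"
    mesh.face_node_connectivity = "Mesh2_face_nodes"

    x = nc.createVariable("Mesh2_node_x", "f8", ("nMesh2_node",))
    y = nc.createVariable("Mesh2_node_y", "f8", ("nMesh2_node",))
    x[:] = [0.0, 1.0, 1.0, 0.0]
    y[:] = [0.0, 0.0, 1.0, 1.0]

    faces = nc.createVariable("Mesh2_face_nodes", "i4",
                              ("nMesh2_face", "nMaxMesh2_face_nodes"))
    faces.start_index = 0
    faces[:] = np.array([[0, 1, 2], [0, 2, 3]])

    # Data defined per face (cell) references the mesh and its location.
    depth = nc.createVariable("depth", "f8", ("nMesh2_face",))
    depth.mesh = "Mesh2"
    depth.location = "face"
    depth[:] = [10.0, 12.5]
```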
Rosetta: Ensuring the Preservation and Usability of ASCII-based Data into the Future
NASA Astrophysics Data System (ADS)
Ramamurthy, M. K.; Arms, S. C.
2015-12-01
Field data obtained from dataloggers often take the form of comma separated value (CSV) ASCII text files. While ASCII based data formats have positive aspects, such as the ease of accessing the data from disk and the wide variety of tools available for data analysis, there are some drawbacks, especially when viewing the situation through the lens of data interoperability and stewardship. The Unidata data translation tool, Rosetta, is a web-based service that provides an easy, wizard-based interface for data collectors to transform their datalogger generated ASCII output into Climate and Forecast (CF) compliant netCDF files following the CF-1.6 discrete sampling geometries. These files are complete with metadata describing what data are contained in the file, the instruments used to collect the data, and other critical information that otherwise may be lost in one of many README files. The choice of the machine readable netCDF data format and data model, coupled with the CF conventions, ensures long-term preservation and interoperability, and that future users will have enough information to responsibly use the data. However, with the understanding that the observational community appreciates the ease of use of ASCII files, methods for transforming the netCDF back into a CSV or spreadsheet format are also built-in. One benefit of translating ASCII data into a machine readable format that follows open community-driven standards is that they are instantly able to take advantage of data services provided by the many open-source data server tools, such as the THREDDS Data Server (TDS). While Rosetta is currently a stand-alone service, this talk will also highlight efforts to couple Rosetta with the TDS, thus allowing self-publishing of thoroughly documented datasets by the data producers themselves.
NASA Astrophysics Data System (ADS)
Haran, T. M.; Brodzik, M. J.; Nordgren, B.; Estilow, T.; Scott, D. J.
2015-12-01
An increasing number of new Earth science datasets are being produced by data providers in self-describing, machine-independent file formats including Hierarchical Data Format version 5 (HDF5) and Network Common Data Form version 4 (netCDF-4). Furthermore, data providers may be producing netCDF-4 files that follow the conventions for Climate and Forecast metadata version 1.6 (CF 1.6) which, for datasets mapped to a projected raster grid covering all or a portion of the earth, include the Coordinate Reference System (CRS) used to define how latitude and longitude are mapped to grid coordinates, i.e. columns and rows, and vice versa. One problem that users may encounter is that their preferred visualization and analysis tool may not yet include support for one of these newer formats. Moreover, data distributors such as NASA's NSIDC DAAC may not yet include support for on-the-fly conversion of data files for all data sets produced in a new format to a preferred older distributed format. There do exist open source solutions to this dilemma in the form of software packages that can translate files in one of the new formats to one of the preferred formats. However, these software packages require that the file to be translated conform to the specifications of its respective format. Although an online CF-Convention compliance checker is available from cfconventions.org, a recent NSIDC user services incident described here in detail involved an NSIDC-supported data set that passed the (then current) CF Checker Version 2.0.6, but was in fact lacking two variables necessary for conformance. This problem was not detected until GDAL, a software package which relied on the missing variables, was employed by a user in an attempt to translate the data into a different file format, namely GeoTIFF. Based on this incident, testing a candidate data product with one or more software products written to accept the advertised conventions is proposed as a practice that improves interoperability. Differences between data file contents and software package expectations are exposed, affording an opportunity to improve conformance of software, data or both. The incident can also serve as a demonstration that data providers, distributors, and users can work together to improve data product quality and interoperability.
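An illustrative check, not the CF checker itself, of the kind of linkage a translator such as GDAL relies on: the data variable's grid_mapping attribute, the grid mapping variable it names, and coordinate variables for each dimension. File and variable names are hypothetical.

```python
# Verify that projection-related variables referenced by a data variable
# are actually present in the file before attempting format translation.
from netCDF4 import Dataset

def check_projection_linkage(path, data_var):
    with Dataset(path) as nc:
        var = nc.variables[data_var]
        problems = []
        gm_name = getattr(var, "grid_mapping", None)
        if gm_name is None:
            problems.append(f"{data_var} has no grid_mapping attribute")
        elif gm_name not in nc.variables:
            problems.append(f"grid_mapping variable '{gm_name}' is missing")
        for dim in var.dimensions:
            if dim not in nc.variables:
                problems.append(f"no coordinate variable for dimension '{dim}'")
    return problems

print(check_projection_linkage("grid.nc", "snow_cover"))   # hypothetical names
```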
Forecasting Chronic Diseases Using Data Fusion.
Acar, Evrim; Gürdeniz, Gözde; Savorani, Francesco; Hansen, Louise; Olsen, Anja; Tjønneland, Anne; Dragsted, Lars Ove; Bro, Rasmus
2017-07-07
Data fusion, that is, extracting information through the fusion of complementary data sets, is a topic of great interest in metabolomics because analytical platforms such as liquid chromatography-mass spectrometry (LC-MS) and nuclear magnetic resonance (NMR) spectroscopy commonly used for chemical profiling of biofluids provide complementary information. In this study, with a goal of forecasting acute coronary syndrome (ACS), breast cancer, and colon cancer, we jointly analyzed LC-MS, NMR measurements of plasma samples, and the metadata corresponding to the lifestyle of participants. We used supervised data fusion based on multiple kernel learning and exploited the linearity of the models to identify significant metabolites/features for the separation of healthy referents and the cases developing a disease. We demonstrated that (i) fusing LC-MS, NMR, and metadata provided better separation of ACS cases and referents compared with individual data sets, (ii) NMR data performed the best in terms of forecasting breast cancer, while fusion degraded the performance, and (iii) neither the individual data sets nor their fusion performed well for colon cancer. Furthermore, we showed the strengths and limitations of the fusion models by discussing their performance in terms of capturing known biomarkers for smoking and coffee. While fusion may improve performance in terms of separating certain conditions by jointly analyzing metabolomics and metadata sets, it is not necessarily always the best approach as in the case of breast cancer.
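A hedged sketch of the general multiple-kernel idea: per-platform kernels are combined with weights and passed to a classifier with a precomputed kernel. The data are random stand-ins and the weights are illustrative; this is not the authors' exact model or preprocessing.

```python
# Weighted combination of per-platform linear kernels fed to an SVM.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 60
X_lcms = rng.normal(size=(n, 200))    # stand-in for LC-MS features
X_nmr = rng.normal(size=(n, 120))     # stand-in for NMR features
X_meta = rng.normal(size=(n, 10))     # stand-in for lifestyle metadata
y = rng.integers(0, 2, size=n)        # case/referent labels

def linear_kernel(X):
    K = X @ X.T
    return K / np.trace(K)            # simple scale normalisation

weights = [0.4, 0.4, 0.2]             # illustrative kernel weights
K = sum(w * linear_kernel(X) for w, X in
        zip(weights, (X_lcms, X_nmr, X_meta)))

clf = SVC(kernel="precomputed").fit(K, y)
print("training accuracy:", clf.score(K, y))
```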
Unleashing Geophysics Data with Modern Formats and Services
NASA Astrophysics Data System (ADS)
Ip, Alex; Brodie, Ross C.; Druken, Kelsey; Bastrakova, Irina; Evans, Ben; Kemp, Carina; Richardson, Murray; Trenham, Claire; Wang, Jingbo; Wyborn, Lesley
2016-04-01
Geoscience Australia (GA) is the national steward of large volumes of geophysical data extending over the entire Australasian region and spanning many decades. The volume and variety of data which must be managed, coupled with the increasing need to support machine-to-machine data access, mean that the old "click-and-ship" model delivering data as downloadable files for local analysis is rapidly becoming unviable - a "big data" problem not unique to geophysics. The Australian Government, through the Research Data Services (RDS) Project, recently funded the Australian National Computational Infrastructure (NCI) to organize a wide range of Earth Systems data from diverse collections including geoscience, geophysics, environment, climate, weather, and water resources onto a single High Performance Data (HPD) Node. This platform, which now contains over 10 petabytes of data, is called the National Environmental Research Data Interoperability Platform (NERDIP), and is designed to facilitate broad user access, maximise reuse, and enable integration. GA has contributed several hundred terabytes of geophysical data to the NERDIP. Historically, geophysical datasets have been stored in a range of formats, with metadata of varying quality and accessibility, and without standardised vocabularies. This has made it extremely difficult to aggregate original data from multiple surveys (particularly un-gridded geophysics point/line data) into standard formats suited to High Performance Computing (HPC) environments. To address this, it was decided to use the NERDIP-preferred Hierarchical Data Format (HDF) 5, which is a proven, standard, open, self-describing and high-performance format supported by extensive software tools, libraries and data services. The Network Common Data Form (NetCDF) 4 API facilitates the use of data in HDF5, whilst the NetCDF Climate & Forecasting conventions (NetCDF-CF) further constrain NetCDF4/HDF5 data so as to provide greater inherent interoperability. The first geophysical data collection selected for transformation by GA was Airborne ElectroMagnetics (AEM) data which was held in proprietary-format files, with associated ISO 19115 metadata held in a separate relational database. Existing NetCDF-CF metadata profiles were enhanced to cover AEM and other geophysical data types, and work is underway to formalise the new geophysics vocabulary as a proposed extension to the Climate & Forecasting conventions. The richness and flexibility of HDF5's internal indexing mechanisms has allowed lossless restructuring of the AEM data for efficient storage, subsetting and access via either the NetCDF4/HDF5 APIs or Open-source Project for a Network Data Access Protocol (OPeNDAP) data services. This approach not only supports large-scale HPC processing, but also interactive access to a wide range of geophysical data in user-friendly environments such as iPython notebooks and more sophisticated cloud-enabled portals such as the Virtual Geophysics Laboratory (VGL). As multidimensional AEM datasets are relatively complex compared to other geophysical data types, the general approach employed in this project for modernizing AEM data is likely to be applicable to other geophysics data types. When combined with the use of standards-based data services and APIs, a coordinated, systematic modernisation will result in vastly improved accessibility to, and usability of, geophysical data in a wide range of computational environments both within and beyond the geophysics community.
Hankin, Steven C.; Blower, Jon D.; Carval, Thierry; Casey, Kenneth S.; Donlon, Craig; Lauret, Olivier; Loubrieu, Thomas; Srinivasan, Ashwanth; Trinanes, Joaquin; Godøy, Øystein; Mendelssohn, Roy; Signell, Richard P.; de La Beaujardiere, Jeff; Cornillon, Peter; Blanc, Frederique; Rew, Russ; Harlan, Jack; Hall, Julie; Harrison, D.E.; Stammer, Detlef
2010-01-01
It is generally recognized that meeting society's emerging environmental science and management needs will require the marine data community to provide simpler, more effective and more interoperable access to its data. There is broad agreement, as well, that data standards are the bedrock upon which interoperability will be built. The path that would bring the marine data community to agree upon and utilize such standards, however, is often elusive. In this paper we examine the trio of standards 1) netCDF files; 2) the Climate and Forecast (CF) metadata convention; and 3) the OPeNDAP data access protocol. These standards taken together have brought our community a high level of interoperability for "gridded" data such as model outputs, satellite products and climatological analyses, and they are gaining rapid acceptance for ocean observations. We will provide an overview of the scope of the contribution that has been made. We then step back from the information technology considerations to examine the community or "social" process by which the successes were achieved. We contrast this path with the one by which the World Meteorological Organization (WMO) has advanced the Global Telecommunication System (GTS): netCDF/CF/OPeNDAP exemplifies a "bottom-up" standards process, whereas the GTS is "top-down". Both of these standards are tales of success at achieving specific purposes, yet each is hampered by technical limitations. These limitations sometimes lead to controversy over whether alternative technological directions should be pursued. Finally, we draw general conclusions regarding the factors that affect the success of a standards development effort - the likelihood that an IT standard will meet its design goals and will achieve community-wide acceptance. We believe that a higher level of thoughtful awareness by the scientists, program managers and technology experts of the vital role of standards and the merits of alternative standards processes can help us as a community to reach our interoperability goals faster.
NASA Astrophysics Data System (ADS)
Darroch, Louise; Buck, Justin
2017-04-01
Atlantic Ocean observation is currently undertaken through loosely-coordinated, in-situ observing networks, satellite observations and data management arrangements at regional, national and international scales. The EU Horizon 2020 AtlantOS project aims to deliver an advanced framework for the development of an Integrated Atlantic Ocean Observing System that strengthens the Global Ocean Observing System (GOOS) and contributes to the aims of the Galway Statement on Atlantic Ocean Cooperation. One goal is to ensure that data from different and diverse in-situ observing networks are readily accessible and useable to a wider community, including the international ocean science community and other stakeholders in this field. To help achieve this goal, the British Oceanographic Data Centre (BODC) produced a parameter matrix to harmonise data exchange, data flow and data integration for the key variables acquired by multiple in-situ AtlantOS observing networks such as Argo, Seafloor Mapping and OceanSITES. Our solution used semantic linking of controlled vocabularies and metadata for parameters that were "mappable" to existing EU and international standard vocabularies. An AtlantOS Essential Variables list of terms (aggregated level) based on Global Climate Observing System (GCOS) Essential Climate Variables (ECV), GOOS Essential Ocean Variables (EOV) and other key network variables was defined and published on the Natural Environment Research Council (NERC) Vocabulary Server (version 2.0) as collection A05 (http://vocab.nerc.ac.uk/collection/A05/current/). This new vocabulary was semantically linked to standardised metadata for observed properties and units that had been validated by the AtlantOS community: SeaDataNet parameters (P01), Climate and Forecast (CF) Standard Names (P07) and SeaDataNet units (P06). Observed properties were mapped to biological entities via the internationally assured AphiaIDs from the World Register of Marine Species (WoRMS), http://www.marinespecies.org/aphia.php?p=webservice. The AtlantOS parameter matrix offers a way to harmonise the globally important variables (such as ECVs and EOVs) from in-situ observing networks that use different flavours of exchange formats based on SeaDataNet and CF parameter metadata. It also offers a way to standardise data in the wider Integrated Ocean Observing System. It uses sustainable and trusted standardised vocabularies that are governed by internationally renowned and long-standing organisations and is interoperable through the use of persistent resource identifiers, such as URNs and PURLs. It is the first step towards integrating and serving data in a variety of international exchange formats using application programming interfaces (APIs), improving both data discoverability and utility for users.
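A minimal sketch of how the published A05 collection might be retrieved and inspected as linked data. Only the collection URL comes from the abstract; the assumption that the NERC Vocabulary Server returns RDF/XML under content negotiation, and the use of requests and rdflib, are the author's.

```python
import requests
from rdflib import Graph, Namespace

SKOS = Namespace("http://www.w3.org/2004/02/skos/core#")

# Collection URL taken from the abstract; the Accept-header behaviour is assumed.
url = "http://vocab.nerc.ac.uk/collection/A05/current/"
resp = requests.get(url, headers={"Accept": "application/rdf+xml"})
resp.raise_for_status()

g = Graph()
g.parse(data=resp.text, format="xml")

# List the preferred labels of the concepts found in the returned graph.
for concept, label in g.subject_objects(SKOS.prefLabel):
    print(concept, "->", label)
```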
Evaluation and Quality Control for the Copernicus Seasonal Forecast Systems
NASA Astrophysics Data System (ADS)
Manubens, N.; Hunter, A.; Bedia, J.; Bretonnière, P. A.; Bhend, J.; Doblas-Reyes, F. J.
2017-12-01
The EU-funded Copernicus Climate Change Service (C3S) will provide authoritative information about past, current and future climate for a wide range of users, from climate scientists to stakeholders in sectors such as insurance, energy and transport. It has been recognized that providing information about the products' quality and provenance is paramount to establishing trust in the service and allowing users to make the best use of the available information. This presentation outlines the work being conducted within the Quality Assurance for Multi-model Seasonal Forecast Products project (QA4Seas). The aim of QA4Seas is to develop a strategy for the evaluation and quality control (EQC) of the multi-model seasonal forecasts provided by C3S. First, we present the set of guidelines the data providers must comply with, ensuring the data are fully traceable and harmonized across data sets. Second, we discuss the ongoing work on defining a provenance and metadata model that is able to encode such information and that can be extended to describe the steps followed to obtain the final verification products, such as maps and time series of forecast quality measures. The metadata model is based on the Resource Description Framework (RDF) W3C standard, and is thus extensible and reusable. It benefits from widely adopted vocabularies for describing data provenance and workflows, as well as from expert consensus and community support for the development of the verification- and downscaling-specific ontologies. Third, we describe the open source software being developed to generate fully reproducible and certifiable seasonal forecast products, which also attaches provenance and metadata information to the verification measures and enables the user to visually inspect the quality of the C3S products. QA4Seas is seeking collaboration with similar initiatives, as well as extending the discussion to interested parties outside the C3S community, to share experiences and establish global common guidelines or best practices regarding data provenance.
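To make the RDF-based approach concrete, here is a hedged sketch of one provenance statement expressed with the W3C PROV-O vocabulary using rdflib. The QA4Seas ontology itself is not described in the abstract, so the namespace and resource names below are hypothetical.

```python
from rdflib import Graph, Namespace
from rdflib.namespace import RDF

PROV = Namespace("http://www.w3.org/ns/prov#")
EX = Namespace("http://example.org/qa4seas/")  # hypothetical namespace

g = Graph()
g.bind("prov", PROV)

product = EX["skill_map_tas_2017"]       # hypothetical verification product
activity = EX["verification_run_42"]     # hypothetical processing step
source = EX["seasonal_hindcast_dataset"] # hypothetical input dataset

# Minimal provenance chain: the product was generated by an activity
# that used the source dataset.
g.add((product, RDF.type, PROV.Entity))
g.add((activity, RDF.type, PROV.Activity))
g.add((source, RDF.type, PROV.Entity))
g.add((product, PROV.wasGeneratedBy, activity))
g.add((activity, PROV.used, source))

print(g.serialize(format="turtle"))
```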
Dilokthornsakul, Piyameth; Patidar, Mausam; Campbell, Jonathan D
2017-12-01
To forecast lifetime outcomes and cost of lumacaftor/ivacaftor combination therapy in patients with cystic fibrosis (CF) with the homozygous phe508del mutation from the US payer perspective. A lifetime Markov model was developed from a US payer perspective. The model included five health states: 1) mild lung disease (percent predicted forced expiratory volume in 1 second [FEV1] >70%), 2) moderate lung disease (40% ≤ FEV1 ≤ 70%), 3) severe lung disease (FEV1 < 40%), 4) lung transplantation, and 5) death. All inputs were derived from published literature. We estimated lumacaftor/ivacaftor's improvement in outcomes compared with a non-CF referent population as well as CF-specific mortality estimates. Lumacaftor/ivacaftor was associated with an additional 2.91 life-years (95% credible interval 2.55-3.56) and an additional 2.42 quality-adjusted life-years (QALYs) (95% credible interval 2.10-2.98). Lumacaftor/ivacaftor was associated with improvements in survival and QALYs equivalent to 27.6% and 20.7%, respectively, of the survival and QALY gaps between CF usual care and their non-CF peers. The incremental lifetime cost was $2,632,249. Lumacaftor/ivacaftor increased life-years and QALYs in CF patients with the homozygous phe508del mutation and moved morbidity and mortality closer to that of their non-CF peers, but it came at higher cost. Copyright © 2017 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
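A hedged sketch of the general Markov cohort mechanics described above. The transition probabilities, utility weights and cycle length below are purely illustrative placeholders, not the study's inputs.

```python
import numpy as np

# Hypothetical 5-state annual transition matrix (mild, moderate, severe,
# transplant, death); all values are illustrative only.
P = np.array([
    [0.90, 0.08, 0.01, 0.00, 0.01],
    [0.05, 0.85, 0.07, 0.01, 0.02],
    [0.00, 0.05, 0.83, 0.05, 0.07],
    [0.00, 0.00, 0.00, 0.90, 0.10],
    [0.00, 0.00, 0.00, 0.00, 1.00],
])
utilities = np.array([0.85, 0.70, 0.55, 0.60, 0.0])  # illustrative QALY weights

cohort = np.array([1.0, 0.0, 0.0, 0.0, 0.0])  # whole cohort starts in mild disease
life_years, qalys = 0.0, 0.0
for year in range(100):                  # lifetime horizon, annual cycles
    life_years += cohort[:4].sum()       # the four alive states contribute a year
    qalys += cohort @ utilities
    cohort = cohort @ P                  # advance the cohort one cycle

print(f"Undiscounted life-years: {life_years:.2f}, QALYs: {qalys:.2f}")
```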
NASA Astrophysics Data System (ADS)
Berni, Nicola; Brocca, Luca; Barbetta, Silvia; Pandolfo, Claudia; Stelluti, Marco; Moramarco, Tommaso
2014-05-01
The Italian national hydro-meteorological early warning system is composed of 21 regional offices (Functional Centres, CF). The Umbria Region (central Italy) CF provides early warning for floods and landslides, real-time monitoring and decision support systems (DSS) for the Civil Defence Authorities when significant events occur. The alert system is based on hydrometric and rainfall thresholds, with detailed procedures for the management of critical events in which the different roles of the authorities and institutions involved are defined. The real-time flood forecasting system is also based on different hydrological and hydraulic forecasting models. Among these, the MISDc rainfall-runoff model ("Modello Idrologico SemiDistribuito in continuo"; Brocca et al., 2011) and the flood routing model STAFOM-RCM (STAge Forecasting Model-Rating Curve Model; Barbetta et al., 2014) are continuously operational in real time, providing discharge and stage forecasts, respectively, with lead times up to 24 hours (when quantitative precipitation forecasts are used) at several gauged river sections in the Upper-Middle Tiber River basin. Model results are published in real time on the open source CF web platform: www.cfumbria.it. MISDc provides discharge and soil moisture forecasts for different sub-basins, while STAFOM-RCM provides stage forecasts at hydrometric sections. Moreover, through STAFOM-RCM the uncertainty of the forecast stage hydrograph is provided as a 95% Confidence Interval (CI) assessed by analysing the statistical properties of the model output. In the period 10-12 November 2013, a severe flood event occurred in Umbria, mainly affecting the north-eastern area and causing significant economic damage, but fortunately no casualties. The territory was affected by intense and persistent rainfall; the hydro-meteorological monitoring network locally recorded rainfall depths over 400 mm in 72 hours. In the most affected area, the recorded rainfall depths correspond approximately to a return period of 200 years. Most rivers in Umbria were involved, exceeding hydrometric thresholds and causing flooding (e.g. the Chiascio river). The flood event was continuously monitored at the Umbria Region CF and its possible evolution predicted and assessed on the basis of the model forecasts. The predictions provided by MISDc and STAFOM-RCM were found useful in supporting real-time decision-making for flood risk management. Moreover, the quantification of the uncertainty affecting the deterministic forecast stages was found consistent with the selected confidence level and had practical utility, corroborating the need to couple deterministic forecasts with their uncertainty when the model output is used to support decisions about flood management. REFERENCES: Barbetta, S., Moramarco, T., Brocca, L., Franchini, M., Melone, F. (2014). Confidence interval of real-time forecast stages provided by the STAFOM-RCM model: the case study of the Tiber River (Italy). Hydrological Processes, 28(3), 729-743. Brocca, L., Melone, F., Moramarco, T. (2011). Distributed rainfall-runoff modelling for flood frequency estimation and flood forecasting. Hydrological Processes, 25(18), 2801-2813.
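As a generic illustration of attaching an uncertainty band to a deterministic stage forecast (not the actual STAFOM-RCM procedure), the sketch below derives an empirical 95% interval from hypothetical past forecast errors and applies it to a hypothetical forecast hydrograph.

```python
import numpy as np

# Illustrative arrays: past forecast errors (observed minus forecast stage, m)
# and a new deterministic stage forecast; all values are hypothetical.
past_errors = np.array([-0.12, 0.05, 0.20, -0.08, 0.15, -0.03, 0.10, -0.18, 0.07, 0.02])
forecast_stage = np.array([2.10, 2.35, 2.60, 2.80, 2.95])  # next 5 hourly stages

lower, upper = np.percentile(past_errors, [2.5, 97.5])
ci_lower = forecast_stage + lower
ci_upper = forecast_stage + upper

for t, (f, lo, hi) in enumerate(zip(forecast_stage, ci_lower, ci_upper), 1):
    print(f"+{t} h: forecast {f:.2f} m, 95% CI [{lo:.2f}, {hi:.2f}] m")
```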
Forecasting US ivacaftor outcomes and cost in cystic fibrosis patients with the G551D mutation.
Dilokthornsakul, Piyameth; Hansen, Ryan N; Campbell, Jonathan D
2016-06-01
Ivacaftor, a breakthrough treatment for cystic fibrosis (CF) patients with the G551D genetic mutation, lacks long-term clinical and cost projections. This study forecasted outcomes and cost by comparing ivacaftor plus usual care versus usual care alone. A lifetime Markov model was developed from a US payer perspective. The model consisted of five health states: 1) forced expiratory volume in 1 s (FEV1) % pred ≥70%, 2) 40% ≤ FEV1 % pred <70%, 3) FEV1 % pred <40%, 4) lung transplantation and 5) death. All inputs were extracted from published literature. Budget impact was also estimated. We estimated ivacaftor's improvement in outcomes compared with a non-CF referent population. Ivacaftor was associated with 18.25 (95% credible interval (CrI) 13.71-22.20) additional life-years and 15.03 (95% CrI 11.13-18.73) additional quality-adjusted life-years (QALYs). Ivacaftor was associated with improvements in survival and QALYs equivalent to 68% and 56%, respectively, of the survival and QALY gaps between CF usual care and their non-CF peers. The incremental lifetime cost was $3 374 584. The budget impact was $0.087 per member per month. Ivacaftor increased life-years and QALYs in CF patients with the G551D mutation, and moved morbidity and mortality closer to that of their non-CF peers. Ivacaftor costs much more than usual care, but comes at a relatively limited budget impact. Copyright ©ERS 2016.
EnviroAtlas One Meter Resolution Urban Land Cover Data (2008-2012) Web Service
This EnviroAtlas web service supports research and online mapping activities related to EnviroAtlas (https://www.epa.gov/enviroatlas ). The EnviroAtlas One Meter-scale Urban Land Cover (MULC) Data were generated individually for each EnviroAtlas community. Source imagery varies by community. Land cover classes mapped also vary by community and include the following: water, impervious surfaces, soil and barren land, trees, shrub, grass and herbaceous, agriculture, orchards, woody wetlands, and emergent wetlands. Accuracy assessments were completed for each community's classification. For specific information about methods and accuracy of each community's land cover classification, consult their individual metadata records: Austin, TX (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7B91A32A9D-96F5-4FA0-BC97-73BAD5D1F158%7D); Cleveland, OH (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7B82ab1edf-8fc8-4667-9c52-5a5acffffa34%7D); Des Moines, IA (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7BA4152198-978D-4C0B-959F-42EABA9C4E1B%7D); Durham, NC (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7B2FF66877-A037-4693-9718-D1870AA3F084%7D); Fresno, CA (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7B87041CF3-05BC-43C3-82DA-F066267C9871%7D); Green Bay, WI (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7BD602E7C9-7F53-4C24
A Free and Open Source Web-based Data Catalog Evaluation Tool
NASA Astrophysics Data System (ADS)
O'Brien, K.; Schweitzer, R.; Burger, E. F.
2015-12-01
For many years, the Unified Access Framework (UAF) project has worked to provide improved access to scientific data by leveraging widely used data standards and conventions. These standards include the Climate and Forecast (CF) metadata conventions, the Data Access Protocol (DAP) and various Open Geospatial Consortium (OGC) standards such as WMS and WCS. The UAF has also worked to create a unified access point for scientific data access through THREDDS and ERDDAP catalogs. A significant effort was made by the UAF project to build a catalog-crawling tool that was designed to crawl remote catalogs, analyze their content and then build a clean catalog that 1) represented only CF-compliant data; 2) provided a uniform set of access services and 3) where possible, aggregated data in time. That catalog is available at http://ferret.pmel.noaa.gov/geoide/geoIDECleanCatalog.html. Although this tool has proved immensely valuable in allowing the UAF project to create a high quality data catalog, the need for a catalog evaluation service or tool to operate on a more local level also exists. Many programs that generate data of interest to the public are recognizing the utility and power of using the THREDDS data server (TDS) to serve that data. However, for some groups that lack the resources to maintain dedicated IT personnel, it can be difficult to set up a properly configured TDS. The TDS catalog evaluating service that is under development and will be discussed in this presentation is an effort, through the UAF project, to bridge that gap. Based upon the power of the original UAF catalog cleaner, the web evaluator will have the ability to scan and crawl a local TDS catalog, evaluate the contents for compliance with CF standards, analyze the services offered, and identify datasets where possible temporal aggregation would benefit data access. The results of the catalog evaluator will guide the configuration of the dataset in TDS to ensure that it meets the standards as promoted by the UAF framework.
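A hedged sketch of the kind of crawl-and-check such an evaluator performs, assuming Unidata's siphon package, a hypothetical TDS catalog URL, and a simple test of the declared Conventions attribute over OPeNDAP.

```python
from siphon.catalog import TDSCatalog
import netCDF4

# Hypothetical TDS catalog URL.
cat = TDSCatalog("http://example.org/thredds/catalog/ocean/catalog.xml")

for name, ds in cat.datasets.items():
    opendap_url = ds.access_urls.get("OPENDAP")
    if not opendap_url:
        print(f"{name}: no OPeNDAP access offered")
        continue
    with netCDF4.Dataset(opendap_url) as nc:
        conventions = getattr(nc, "Conventions", "")
        status = "CF declared" if "CF" in conventions else "no CF declaration"
        print(f"{name}: {status} ({conventions or 'missing Conventions attribute'})")

# Nested catalogs can be followed the same way.
for ref in cat.catalog_refs.values():
    child = ref.follow()
    print("child catalog:", child.catalog_url)
```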
Cloud-Enabled Climate Analytics-as-a-Service using Reanalysis data: A case study.
NASA Astrophysics Data System (ADS)
Nadeau, D.; Duffy, D.; Schnase, J. L.; McInerney, M.; Tamkin, G.; Potter, G. L.; Thompson, J. H.
2014-12-01
The NASA Center for Climate Simulation (NCCS) maintains advanced data capabilities and facilities that allow researchers to access the enormous volume of data generated by weather and climate models. The NASA Climate Model Data Service (CDS) and the NCCS are merging their efforts to provide Climate Analytics-as-a-Service for the comparative study of the major reanalysis projects: ECMWF ERA-Interim, NASA/GMAO MERRA, NOAA/NCEP CFSR, NOAA/ESRL 20CR, JMA JRA25, and JRA55. These reanalyses have been repackaged to the netCDF4 file format following the CMIP5 Climate and Forecast (CF) metadata convention prior to being sequenced into the Hadoop Distributed File System (HDFS). A small set of operations that represent a common starting point in many analysis workflows was then created: min, max, sum, count, variance and average. In this example, reanalysis data exploration was performed with Hadoop MapReduce, and access was provided through the Climate Data Service (CDS) application programming interface (API) created at NCCS. This API provides a uniform treatment of large amounts of data. In this case study, we have limited our exploration to two variables, temperature and precipitation, using three operations (min, max and avg) and 30 years of reanalysis data for three regions of the world: global, polar, and subtropical.
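The min/max/avg operations over regions can be illustrated, outside the Hadoop/CDS stack described in the abstract, with a short xarray sketch on a hypothetical CF-compliant reanalysis file. The file, variable and coordinate names are assumptions, and the latitude slices assume an ascending latitude coordinate.

```python
import xarray as xr

# Hypothetical CF-compliant reanalysis file with 'tas' on (time, lat, lon).
ds = xr.open_dataset("reanalysis_tas.nc")

global_mean = ds["tas"].mean(dim=("lat", "lon"))        # area-unweighted, for brevity
polar = ds["tas"].sel(lat=slice(60, 90))                # assumes ascending latitudes
subtropics = ds["tas"].sel(lat=slice(-30, 30))

print("global avg:", float(global_mean.mean()))
print("polar max:", float(polar.max()))
print("subtropical min:", float(subtropics.min()))
```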
Examining the Role of Metadata in Testing IED Detection Systems
2009-09-01
energy management, crop assessment, weather forecasting, disaster alerts, and endangered species assessment [Biagioni and Bridges 2002; Mainwaring et...accessed June 10, 2008). Biagioni, E. and K. Bridges. 2002. The application of remote sensor technology to assist the recovery of rare endangered species
EnviroAtlas Estimated Percent Tree Cover Along Walkable Roads Web Service
This EnviroAtlas dataset estimates tree cover along walkable roads. The road width is estimated for each road and percent tree cover is calculated in a 8.5 meter strip beginning at the estimated road edge. Percent tree cover is calculated for each block between road intersections. Tree cover provides valuable benefits to neighborhood residents and walkers by providing shade, improved aesthetics, and outdoor gathering spaces. For specific information about each community's Estimated Percent Tree Cover Along Walkable Roads layer, consult their individual metadata records: Austin, TX (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7B4876FD99-C14A-464A-9E31-5CB5F2225687%7D); Cleveland, OH (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7B28e3f937-6f22-45c5-98cf-1707b0fc92df%7D); Des Moines, IA (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7B09FE7D60-B636-405C-BB07-68147DFE8CAF%7D); Durham, NC (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7BF341A26B-4972-4C6B-B675-9B5E02F4F25F%7D); Fresno, CA (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7BB71334B9-C53A-4674-A739-1031969E5163%7D); Green Bay, WI (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7BB9AFEBED-9C29-4DB0-8B54-0CAF58BE5A2D%7D); Memphis, TN (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7BBE552E7A-A789-4AA9-ADF9-234109C6517E%7D); Mi
NASA Astrophysics Data System (ADS)
Moroni, D. F.; Armstrong, E. M.; Tauer, E.; Hausman, J.; Huang, T.; Thompson, C. K.; Chung, N.
2013-12-01
The Physical Oceanographic Distributed Active Archive Center (PO.DAAC) is one of 12 data centers sponsored by NASA's Earth Science Data and Information System (ESDIS) project. The PO.DAAC is tasked with the archival and distribution of data from NASA Earth science missions specific to physical oceanography, many of which have interdisciplinary applications for weather forecasting/monitoring, ocean biology, ocean modeling, and climate studies. PO.DAAC has a 20-year history of cross-project and international collaborations with partners in Europe, Japan, Australia, and the UK. Domestically, the PO.DAAC has successfully established lasting partnerships with non-NASA institutions and projects including the National Oceanic and Atmospheric Administration (NOAA), the United States Navy, Remote Sensing Systems, and Unidata. A key component of these partnerships is PO.DAAC's direct involvement with international working groups and science teams, such as the Group for High Resolution Sea Surface Temperature (GHRSST), International Ocean Vector Winds Science Team (IOVWST), Ocean Surface Topography Science Team (OSTST), and the Committee on Earth Observing Satellites (CEOS). To help bolster new and existing collaborations, the PO.DAAC has established a standardized approach to its internal Data Management and Archiving System (DMAS), utilizing a Data Dictionary to provide the baseline standard for entry and capture of dataset and granule metadata. Furthermore, the PO.DAAC has established an end-to-end Dataset Lifecycle Policy, built upon both internal and external recommendations of best practices toward data stewardship. Together, DMAS, the Data Dictionary, and the Dataset Lifecycle Policy provide the infrastructure to enable standardized data and metadata to be fully ingested and harvested to facilitate interoperability and compatibility across data access protocols, tools, and services. The Dataset Lifecycle Policy provides the checks and balances to help ensure all incoming HDF and netCDF-based datasets meet minimum compliance requirements with the Lawrence Livermore National Laboratory's actively maintained Climate and Forecast (CF) conventions, with additional goals toward the metadata standards provided by the Attribute Convention for Dataset Discovery (ACDD), the International Organization for Standardization (ISO) 19100-series, and the Federal Geographic Data Committee (FGDC). By default, DMAS ensures all datasets are compliant with NASA's Global Change Master Directory (GCMD) and NASA's Reverb data discovery clearinghouse (also known as ECHO). For data access, PO.DAAC offers several widely-used technologies, including File Transfer Protocol (FTP), Open-source Project for a Network Data Access Protocol (OPeNDAP), and Thematic Realtime Environmental Distributed Data Services (THREDDS). These access technologies are available directly to users or through PO.DAAC's web interfaces, specifically the High-level Tool for Interactive Data Extraction (HiTIDE), Live Access Server (LAS), and PO.DAAC's set of search, image, and Consolidated Web Services (CWS). Lastly, PO.DAAC's newly introduced, standards-based CWS provide singular endpoints for search, imaging, and extraction capabilities, respectively, across L2/L3/L4 datasets. Altogether, these tools, services and policies provide flexible, interoperable functionality for both users and data providers.
Towards the Goal of Modular Climate Data Services: An Overview of NCPP Applications and Software
NASA Astrophysics Data System (ADS)
Koziol, B. W.; Cinquini, L.; Treshansky, A.; Murphy, S.; DeLuca, C.
2013-12-01
In August 2013, the National Climate Predictions and Projections Platform (NCPP) organized a workshop focusing on the quantitative evaluation of downscaled climate data products (QED-2013). The QED-2013 workshop focused on real-world application problems drawn from several sectors (e.g. hydrology, ecology, environmental health, agriculture), and required that downscaled data products be dynamically accessed, generated, manipulated, annotated, and evaluated. The cyberinfrastructure elements that were integrated to support the workshop included (1) a wiki-based project hosting environment (Earth System CoG) with an interface to data services provided by an Earth System Grid Federation (ESGF) data node; (2) metadata tools provided by the Earth System Documentation (ES-DOC) collaboration; and (3) a Python-based library, OpenClimateGIS (OCGIS), for subsetting and converting NetCDF-based climate data to GIS and tabular formats. Collectively, this toolset represents a first deployment of a 'ClimateTranslator' that enables users to access, interpret, and apply climate information at local and regional scales. This presentation will provide an overview of the components above, how they were used in the workshop, and a discussion of current and potential integration. The long-term strategy for this software stack is to offer the suite of services described on a customizable, per-project basis. Additional detail on the three components is below. (1) Earth System CoG is a web-based collaboration environment that integrates data discovery and access services with tools for supporting governance and the organization of information. QED-2013 utilized these capabilities to share with workshop participants a suite of downscaled datasets, associated images derived from those datasets, and metadata files describing the downscaling techniques involved. The collaboration side of CoG was used for workshop organization, discussion, and results. (2) The ES-DOC Questionnaire, Viewer, and Comparator are web-based tools for the creation and use of model and experiment documentation. Workshop participants used the Questionnaire to generate metadata on regional downscaling models and statistical downscaling methods, and the Viewer to display the results. A prototype Comparator was available to compare properties across dynamically downscaled models. (3) OCGIS is a Python (v2.7) package designed for geospatial manipulation, subsetting, computation, and translation of Climate and Forecast (CF)-compliant climate datasets, either stored in local NetCDF files or served through THREDDS data servers.
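A hedged sketch of the CF-aware subsetting and format translation that OCGIS provides, written from memory against the classic RequestDataset/OcgOperations interface. The file name, variable, bounding box and output format are assumptions, and the call signatures may differ between OCGIS versions.

```python
import ocgis

# Assumed file and variable names for a downscaled product.
rd = ocgis.RequestDataset(uri="tas_downscaled.nc", variable="tas")

ops = ocgis.OcgOperations(
    dataset=rd,
    geom=[-105.0, 37.0, -103.0, 39.0],   # lon/lat bounding box (assumed extent)
    output_format="csv",                 # tabular output for GIS-style analysis
)
path = ops.execute()                     # returns the path of the written output
print("wrote", path)
```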
Dynamic reusable workflows for ocean science
Signell, Richard; Fernandez, Filipe; Wilcox, Kyle
2016-01-01
Digital catalogs of ocean data have been available for decades, but advances in standardized services and software for catalog search and data access make it now possible to create catalog-driven workflows that automate, end to end, the search, analysis and visualization of data from multiple distributed sources. Further, these workflows may be shared, reused and adapted with ease. Here we describe a workflow developed within the US Integrated Ocean Observing System (IOOS) which automates the skill-assessment of water temperature forecasts from multiple ocean forecast models, allowing improved forecast products to be delivered for an open water swim event. A series of Jupyter Notebooks are used to capture and document the end-to-end workflow using a collection of Python tools that facilitate working with standardized catalog and data services. The workflow first searches a catalog of metadata using the Open Geospatial Consortium (OGC) Catalog Service for the Web (CSW), then accesses data service endpoints found in the metadata records using the OGC Sensor Observation Service (SOS) for in situ sensor data and OPeNDAP services for remotely-sensed and model data. Skill metrics are computed and time series comparisons of forecast model and observed data are displayed interactively, leveraging the capabilities of modern web browsers. The resulting workflow not only solves a challenging specific problem, but highlights the benefits of dynamic, reusable workflows in general. These workflows adapt as new data enters the data system, facilitate reproducible science, provide templates from which new scientific workflows can be developed, and encourage data providers to use standardized services. As applied to the ocean swim event, the workflow exposed problems with two of the ocean forecast products which led to improved regional forecasts once errors were corrected. While the example is specific, the approach is general, and we hope to see increased use of dynamic notebooks across the geoscience domains.
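The first step of such a workflow, the CSW catalog search, might look like the following sketch using OWSLib; the endpoint URL and query string are placeholders rather than the actual IOOS catalog configuration.

```python
from owslib.csw import CatalogueServiceWeb
from owslib import fes

# Hypothetical CSW endpoint; the IOOS catalog exposes a comparable service.
csw = CatalogueServiceWeb("https://example.org/csw")

# Find records mentioning sea water temperature.
query = fes.PropertyIsLike(
    propertyname="apiso:AnyText",
    literal="%sea_water_temperature%",
    wildCard="%",
)
csw.getrecords2(constraints=[query], maxrecords=10, esn="full")

for rec_id, rec in csw.records.items():
    print(rec.title)
    for ref in rec.references:           # service endpoints (OPeNDAP, SOS, ...)
        print("   ", ref.get("scheme"), ref.get("url"))
```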
Synthesis of national risk profile
NASA Technical Reports Server (NTRS)
1979-01-01
The methodology used and results obtained in computing the national risk profile for carbon fibers (CF) released after an aircraft accident (fire or explosion) are presented. The computation was performed by use of twenty-six individual conditional risk profiles, together with the extrapolation of these profiles to other U.S. airports. The risk profile was obtained using 1993 CF utilization forecasts, but numbers of facilities were taken from 1972 and 1975 census data, while losses were expressed in 1977 dollars.
Challenges to Standardization: A Case Study Using Coastal and Deep-Ocean Water Level Data
NASA Astrophysics Data System (ADS)
Sweeney, A. D.; Stroker, K. J.; Mungov, G.; McLean, S. J.
2015-12-01
Sea levels recorded at coastal stations and inferred from deep-ocean pressure observations at the seafloor are submitted for archive in multiple data and metadata formats. These formats include two forms of schema-less XML and a custom binary format accompanied by metadata in a spreadsheet. The authors report on efforts to use existing standards to make these data more discoverable and more useful beyond their initial use in detecting tsunamis. An initial review of data formats for sea level data around the globe revealed heterogeneity in presentation and content. In the absence of a widely-used domain-specific format, we adopted the general model for structuring data and metadata expressed by the Network Common Data Form (netCDF). netCDF has been endorsed by the Open Geospatial Consortium, is compact compared to an equivalent plain-text representation, and provides a standard way of embedding metadata in the same file. We followed the orthogonal time-series profile of the Climate and Forecast discrete sampling geometries as the convention for structuring the data and describing metadata relevant for use. We adhered to the Attribute Convention for Data Discovery for capturing metadata to support user search. Beyond making it possible to structure data and metadata in a standard way, netCDF is supported by multiple software tools in providing programmatic cataloging, access, subsetting, and transformation to other formats. We will describe our successes and failures in adhering to existing standards and provide requirements for either augmenting existing conventions or developing new ones. Some of these enhancements are specific to sea level data, while others are applicable to time-series data in general.
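To show what the adopted conventions imply in practice, here is a hedged sketch of a minimal single-station file using the CF discrete-sampling-geometry timeSeries feature type plus a couple of ACDD discovery attributes. The station values, attribute choices and the standard name are illustrative assumptions, not the archive's actual template.

```python
import numpy as np
import netCDF4

nc = netCDF4.Dataset("water_level_station.nc", "w")   # NETCDF4 format by default
nc.Conventions = "CF-1.6, ACDD-1.3"
nc.featureType = "timeSeries"
nc.title = "Example coastal water level record"       # ACDD discovery attribute

nc.createDimension("station", 1)
nc.createDimension("time", None)

sid = nc.createVariable("station_id", str, ("station",))
sid.cf_role = "timeseries_id"
lat = nc.createVariable("lat", "f4", ("station",))
lat.units = "degrees_north"
lon = nc.createVariable("lon", "f4", ("station",))
lon.units = "degrees_east"
time = nc.createVariable("time", "f8", ("time",))
time.units = "seconds since 1970-01-01T00:00:00Z"
time.standard_name = "time"

wl = nc.createVariable("water_level", "f4", ("station", "time"))
wl.standard_name = "sea_surface_height_above_geoid"   # assumed choice of name
wl.units = "m"
wl.coordinates = "time lat lon"

sid[0] = "EXAMPLE01"                     # hypothetical station identifier
lat[0], lon[0] = 21.3, -157.9            # illustrative position
time[:] = np.arange(0, 3600, 60)         # one hour at 1-minute intervals
wl[0, :] = 0.5 + 0.1 * np.sin(np.linspace(0.0, 6.28, 60))

nc.close()
```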
NASA Astrophysics Data System (ADS)
Hausman, J.; Sanchez, A.; Armstrong, E. M.
2014-12-01
Seasat-A was NASA's first ocean observing satellite mission. It launched in June 1978 and operated continuously until it suffered a power failure 106 days later. It carried an altimeter (ALT), scatterometer (SASS), SAR, microwave radiometer (SMMR), and a visible/infrared radiometer (VIRR). These instruments allowed Seasat to measure sea surface height, ocean winds, and both brightness and sea surface temperatures. The data, except for the SAR, are archived at PO.DAAC. Since these are the only oceanographic satellite data available for this early period of remote sensing, their importance for climate studies has grown. Even though the datasets were digitized from the original tapes, the Seasat data have since been maintained in the same flat binary format technology of 1980, when the data were first distributed. In 2013 PO.DAAC began a project to reformat the original data into a user-friendly, modern and maintainable format consistent with the netCDF data model and the Climate and Forecast (CF) and Attribute Convention for Data Discovery (ACDD) metadata standards. A significant benefit of this data format is improved interoperability with tools and web services such as OPeNDAP, THREDDS, and various subsetting software, such as PO.DAAC's HiTIDE. Additionally, application of such metadata standards provides an opportunity to correctly document the data at the granule level. The first step in the conversion process involved going through the original documentation to understand the source binary data format. Documentation was found for processing levels 1 and 2 for ALT, SASS and SMMR. Software readers were then written for each of the datasets using Matlab, followed by regression tests performed on the newly output data in order to demonstrate that the readers were correctly interpreting the source data. Next, writers were created to convert the data into the updated format. The reformatted data were also regression tested and science validated to ensure that the data were not corrupted during the reformatting process. The resulting modernized Seasat datasets will be made available iteratively by instrument and processing level on PO.DAAC's web portal http://podaac.jpl.nasa.gov, the anonymous ftp site ftp://podaac.jpl.nasa.gov/allData/seasat, and other web services.
The Comparison of Point Data Models for the Output of WRF Hydro Model in the IDV
NASA Astrophysics Data System (ADS)
Ho, Y.; Weber, J.
2017-12-01
WRF Hydro netCDF output files contain streamflow, flow depth, longitude, latitude, altitude and stream order values for each forecast point. However, the data are not CF compliant. The total number of forecast points for the US CONUS is approximately 2.7 million, which is a big challenge for any visualization and analysis tool. The IDV point cloud display shows point data as a set of points colored by parameter. This display is very efficient compared to a standard point type display for rendering a large number of points. One remaining problem is that data I/O can be a bottleneck when dealing with a large collection of point input files. In this presentation, we will experiment with different point data models and their APIs to access the same WRF Hydro model output. The results will help us construct a CF-compliant netCDF point data format for the community.
DPADL: An Action Language for Data Processing Domains
NASA Technical Reports Server (NTRS)
Golden, Keith; Clancy, Daniel (Technical Monitor)
2002-01-01
This paper presents DPADL (Data Processing Action Description Language), a language for describing planning domains that involve data processing. DPADL is a declarative object-oriented language that supports constraints and embedded Java code, object creation and copying, explicit inputs and outputs for actions, and metadata descriptions of existing and desired data. DPADL is supported by the IMAGEbot system, which will provide automation for an ecosystem forecasting system called TOPS.
NASA Astrophysics Data System (ADS)
Anderson, D. M.; Snowden, D. P.; Bochenek, R.; Bickel, A.
2015-12-01
In U.S. coastal waters, a network of eleven regional coastal ocean observing systems supports real-time coastal and ocean observing. The platforms supported and variables acquired are diverse, ranging from current-sensing high frequency (HF) radar to autonomous gliders. The system incorporates data produced by other networks and experimental systems, further increasing the breadth of the collection. Strategies promoted by the U.S. Integrated Ocean Observing System (IOOS) ensure these data are not lost at sea. Every data set deserves a description. ISO and FGDC compliant metadata enables catalog interoperability and record-sharing. Extensive use of netCDF with the Climate and Forecast convention (identifying both metadata and a structured format) is shown to be a powerful strategy to promote discovery, interoperability, and re-use of the data. To integrate specialized data, which are often obscure, quality control protocols are being developed to homogenize the QC and make these data easier to integrate. Data Assembly Centers have been established to integrate some specialized streams including gliders, animal telemetry, and HF radar. Subsets of data that are ingested into the National Data Buoy Center are also routed to the Global Telecommunication System (GTS) of the World Meteorological Organization to assure wide international distribution. From the GTS, data are assimilated into now-cast and forecast models, fed to other observing systems, and used to support observation-based decision making such as forecasts, warnings, and alerts. For a few years apps were a popular way to deliver these real-time data streams to phones and tablets. Responsive and adaptive web sites are an emerging flexible strategy to provide access to the regional coastal ocean observations.
Air Quality uFIND: User-oriented Tool Set for Air Quality Data Discovery and Access
NASA Astrophysics Data System (ADS)
Hoijarvi, K.; Robinson, E. M.; Husar, R. B.; Falke, S. R.; Schultz, M. G.; Keating, T. J.
2012-12-01
Historically, there have been major impediments to seamless and effective data usage encountered by both data providers and users. Over the last five years, the international Air Quality (AQ) Community has worked through forums such as the Group on Earth Observations AQ Community of Practice, the ESIP AQ Working Group, and the Task Force on Hemispheric Transport of Air Pollution to converge on data format standards (e.g., netCDF), data access standards (e.g., Open Geospatial Consortium Web Coverage Services), metadata standards (e.g., ISO 19115), as well as other conventions (e.g., the CF Naming Convention) in order to build an Air Quality Data Network. The centerpiece of the AQ Data Network is the web service-based tool set: user-oriented Filtering and Identification of Networked Data (uFIND). The purpose of uFIND is to provide rich and powerful facilities for the user to: a) discover and choose a desired dataset by navigating the multi-dimensional metadata space using faceted search, b) seamlessly access and browse datasets, and c) use uFIND's facilities as a web service for mashups with other AQ applications and portals. In a user-centric information system such as uFIND, the user experience is improved by metadata that includes the general fields for discovery as well as community-specific metadata to narrow the search beyond space, time and generic keyword searches. However, even with the community-specific additions, the ISO 19115 records were formed in compliance with the standard, so that other standards-based search interfaces could leverage this additional information. To identify the fields necessary for metadata discovery we started with the ISO 19115 Core Metadata fields and the fields needed for a Catalog Service for the Web (CSW) Record. This fulfilled two goals - one to create valid ISO 19115 records and the other to be able to retrieve the records through a Catalog Service for the Web query. Beyond the required set of fields, the AQ Community added additional fields using a combination of keywords and ISO 19115 fields. These extensions allow discovery by measurement platform or observed phenomena. Beyond discovery metadata, the AQ records include service identification objects that allow standards-based clients, such as some brokers, to access the data found via OGC WCS or WMS data access protocols. uFIND is one such smart client; this combination of discovery and access metadata allows the user to preview each registered dataset through spatial and temporal views, observe the data access and usage pattern, and also find links to dataset-specific metadata directly in uFIND. The AQ data providers also benefit from this architecture since their data products are easier to find and re-use, enhancing the relevance and importance of their products. Finally, the earth science community at large benefits from the Service Oriented Architecture of uFIND, since it is a service itself and allows service-based interfacing with providers and users of the metadata, allowing uFIND facets to be further refined for a particular AQ application or completely repurposed for other Earth Science domains that use the same set of data access and metadata standards.
NASA Astrophysics Data System (ADS)
Popov, V. N.; Botygin, I. A.; Kolochev, A. S.
2017-01-01
The approach represents data from the international codes for the exchange of meteorological information using a metadescription, a formalism associated with certain categories of resources. The metadata components were developed from an analysis of surface meteorological observations, vertical atmospheric soundings, upper-air wind soundings, weather radar observations, satellite observations and other data. A common set of metadata components, comprising classes, divisions and groups, was formed for a generalized description of the meteorological data. The structure and content of the main components of the generalized metadescription are presented in detail using the example of meteorological observations from land and sea stations. The functional structure of a distributed computing system is also described; it organizes the storage of large volumes of meteorological data for further processing in the analysis and forecasting of climatic processes.
Skill assessment of Korea operational oceanographic system (KOOS)
NASA Astrophysics Data System (ADS)
Kim, J.; Park, K.
2016-02-01
For the ocean forecast system in Korea, the Korea Operational Oceanographic System (KOOS) has been developed and pre-operated since 2009 by the Korea Institute of Ocean Science and Technology (KIOST), funded by the Korean government. KOOS provides real-time information and forecasts of marine environmental conditions in order to support all kinds of activities at sea. A further, more significant purpose of the KOOS information is to respond to and support maritime problems and accidents such as oil spills, red tides, shipwrecks, extraordinary waves, coastal inundation and so on. Accordingly, it is essential to evaluate prediction accuracy and to work to improve it. The forecast accuracy should meet or exceed target benchmarks before its products are approved for release to the public. In this paper, we quantify forecast errors using skill assessment techniques to judge KOOS performance. Skill assessment statistics include measures of errors and correlations such as the root-mean-square error (RMSE), mean bias (MB), correlation coefficient (R) and index of agreement (IOA), and the frequency with which errors lie within specified limits, termed the central frequency (CF). KOOS provides 72-hour daily forecast data such as air pressure, wind, water elevation, currents, waves, water temperature, and salinity, produced by the meteorological and hydrodynamic numerical models WRF, ROMS, MOM5, WAM, WW3, and MOHID. The skill assessment has been performed by comparing model results with in-situ observation data (Figure 1) for the period from 1 July 2010 to 31 March 2015 (Table 1), and model errors have been quantified with skill scores and CF determined by acceptable criteria depending on the predicted variables (Table 2). Moreover, we conducted a quantitative evaluation of the spatio-temporal pattern correlation between numerical models and observational data such as sea surface temperature (SST) and sea surface currents produced by satellite ocean sensors and high frequency (HF) radar, respectively. These quantified errors allow an objective assessment of KOOS performance and can reveal different aspects of model deficiency. Based on these results, various model components are tested and developed in order to improve forecast accuracy.
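The skill statistics named above are standard and can be sketched in a few lines. The observation and model values and the acceptable-error limit below are illustrative, and the central frequency is computed simply as the percentage of errors falling within that limit.

```python
import numpy as np

def skill_scores(model, obs, limit):
    """Return RMSE, mean bias, correlation, Willmott's index of agreement
    and the central frequency (percentage of errors within +/- limit)."""
    err = model - obs
    rmse = np.sqrt(np.mean(err ** 2))
    bias = np.mean(err)
    r = np.corrcoef(model, obs)[0, 1]
    denom = np.sum((np.abs(model - obs.mean()) + np.abs(obs - obs.mean())) ** 2)
    ioa = 1.0 - np.sum(err ** 2) / denom
    cf = np.mean(np.abs(err) <= limit) * 100.0
    return rmse, bias, r, ioa, cf

# Illustrative values only: hourly modelled vs observed SST (deg C),
# with an acceptable-error criterion of 0.5 deg C.
model = np.array([15.2, 15.4, 15.9, 16.3, 16.1, 15.8])
obs   = np.array([15.0, 15.5, 16.2, 16.4, 15.9, 15.6])
print(skill_scores(model, obs, limit=0.5))
```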
Enhanced Management of and Access to Hurricane Sandy Ocean and Coastal Mapping Data
NASA Astrophysics Data System (ADS)
Eakins, B.; Neufeld, D.; Varner, J. D.; McLean, S. J.
2014-12-01
NOAA's National Geophysical Data Center (NGDC) has significantly improved the discovery and delivery of its geophysical data holdings, initially targeting ocean and coastal mapping (OCM) data in the U.S. coastal region impacted by Hurricane Sandy in 2012. We have developed a browser-based, interactive interface that permits users to refine their initial map-driven data-type choices prior to bulk download (e.g., by selecting individual surveys), including the ability to choose ancillary files, such as reports or derived products. Initial OCM data types now available in a U.S. East Coast map viewer, as well as underlying web services, include: NOS hydrographic soundings and multibeam sonar bathymetry. Future releases will include trackline geophysics, airborne topographic and bathymetric-topographic lidar, bottom sample descriptions, and digital elevation models. This effort also includes working collaboratively with other NOAA offices and partners to develop automated methods to receive and verify data, stage data for archive, and notify data providers when ingest and archive are completed. We have also developed improved metadata tools to parse XML and auto-populate OCM data catalogs, support the web-based creation and editing of ISO-compliant metadata records, and register metadata in appropriate data portals. This effort supports a variety of NOAA mission requirements, from safe navigation to coastal flood forecasting and habitat characterization.
Value of the GENS Forecast Ensemble as a Tool for Adaptation of Economic Activity to Climate Change
NASA Astrophysics Data System (ADS)
Hancock, L. O.; Alpert, J. C.; Kordzakhia, M.
2009-12-01
In an atmosphere of uncertainty as to the magnitude and direction of climate change in upcoming decades, one adaptation mechanism has emerged with consensus support: the upgrade and dissemination of spatially-resolved, accurate forecasts tailored to the needs of users. Forecasting can facilitate the changeover from dependence on climatology that is increasingly out of date. The best forecasters are local, but local forecasters face great constraints in some countries. Indeed, it is no coincidence that some areas subject to great weather variability and strong processes of climate change are economically vulnerable: mountainous regions, for example, where heavy and erratic flooding can destroy the value built up by households over years. It follows that those best placed to benefit from forecasting upgrades may not be those who have invested in the greatest capacity to date. More-flexible use of the global forecasts may contribute to adaptation. NOAA anticipated several years ago that their forecasts could be used in new ways in the future, and accordingly prepared sockets for easy access to their archives. These could be used to empower various national and regional capacities. Verification to identify practical lead times for the economically important variables is a needed first step. This presentation presents the verification that our team has undertaken, a pilot effort in which we considered variables of interest to economic actors in several lower income countries, cf. shepherds in a remote area of Central Asia, and verified the ensemble forecasts of those variables.
NASA Astrophysics Data System (ADS)
Lengyel, F.; Yang, P.; Rosenzweig, B.; Vorosmarty, C. J.
2012-12-01
The Northeast Regional Earth System Model (NE-RESM, NSF Award #1049181) integrates weather research and forecasting models, terrestrial and aquatic ecosystem models, a water balance/transport model, and mesoscale and energy systems input-output economic models developed by an interdisciplinary research team from academia and government with expertise in physics, biogeochemistry, engineering, energy, economics, and policy. NE-RESM is intended to forecast the implications of planning decisions on the region's environment, ecosystem services, energy systems and economy through the 21st century. Integration of model components and the development of cyberinfrastructure for interacting with the system are facilitated by the integrated Rule-Oriented Data System (iRODS), a distributed data grid that provides archival storage with metadata facilities and a rule-based workflow engine for automating and auditing scientific workflows.
NASA Astrophysics Data System (ADS)
Sandri, Laura; Selva, Jacopo; Costa, Antonio; Macedonio, Giovanni; Marzocchi, Warner
2014-05-01
Probabilistic Volcanic Hazard Assessment (PVHA) represents the most complete scientific contribution for planning rational strategies aimed at mitigating the risk posed by volcanic activity at different time scales. The definition of the space-time window for PVHA is related to the kind of risk mitigation actions that are under consideration. Short intervals (days to weeks) are important for short-term risk mitigation actions like the evacuation of a volcanic area. During volcanic unrest episodes or eruptions, it is of primary importance to produce short-term tephra fallout forecasts and to update them frequently to account for the rapidly evolving situation. This information is obviously crucial for crisis management, since tephra may heavily affect building stability, public health, transportation and evacuation routes (airports, trains, road traffic) and lifelines (electric power supply). In this study, we propose a methodology for short-term PVHA and its operational implementation, based on the model BET_EF, in which measures from the monitoring system are used to routinely update the forecast of some parameters related to the eruption dynamics, that is, the probabilities of eruption, of every possible vent position and of every possible eruption size. Then, considering all possible vent positions and eruptive sizes, tephra dispersal models are coupled with frequently updated meteorological forecasts. Finally, these results are merged through a Bayesian procedure, accounting for epistemic uncertainties at all the considered steps. As a case study, we retrospectively examine some stages of the volcanic unrest that took place in Campi Flegrei (CF) in 1982-1984. In particular, we aim at presenting a practical example of possible operational tephra fall PVHA on a daily basis, in the surroundings of CF, at different stages of the 1982-84 unrest. Tephra dispersal is simulated using the analytical HAZMAP code. We consider three possible eruptive sizes (a low, a medium and a high eruption "scenario", respectively) and 700 possible vent positions within the CF Neapolitan Yellow Tuff caldera. The probabilities related to eruption dynamics, estimated by BET_EF, are based on the set-up of the code obtained specifically for CF during a six-year-long elicitation project, and on the actual monitoring parameters measured during the unrest and published in the literature. We take advantage here of two novel improvements: (i) a time function to describe how the probability of eruption evolves within the time window defined for the forecast, and (ii) the production of hazard curves and their confidence levels, a tool that allows a complete description of PVHA and its uncertainties. The general goal of this study is to show what pieces of scientific knowledge can be operationally transferred to decision makers, and how, and specifically how this could have been translated into practice during the 1982-84 Campi Flegrei crisis if scientists had known then what we know today about this volcano.
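A much-simplified sketch of the probabilistic combination described above: eruption, size and conditional exceedance probabilities are merged into a tephra-load hazard curve for a single site. All numbers are hypothetical, the vent position is collapsed into a single term for brevity, and epistemic uncertainty is not represented.

```python
import numpy as np

p_eruption = 0.05                       # probability of eruption in the window (illustrative)
p_size = np.array([0.6, 0.3, 0.1])      # low / medium / high eruption scenario (illustrative)
thresholds = np.array([1.0, 10.0, 100.0, 300.0])   # tephra load thresholds, kg/m2

# P(load > threshold | eruption of a given size) at the site, e.g. from
# dispersal runs over many wind fields; rows are sizes, columns thresholds.
p_exceed_given_size = np.array([
    [0.20, 0.02, 0.00, 0.00],
    [0.60, 0.25, 0.03, 0.00],
    [0.90, 0.70, 0.30, 0.10],
])

# Marginalise over eruption occurrence and eruption size.
hazard_curve = p_eruption * (p_size @ p_exceed_given_size)

for thr, p in zip(thresholds, hazard_curve):
    print(f"P(load > {thr:6.1f} kg/m2) = {p:.4f}")
```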
Latest developments for the IAGOS database: Interoperability and metadata
NASA Astrophysics Data System (ADS)
Boulanger, Damien; Gautron, Benoit; Thouret, Valérie; Schultz, Martin; van Velthoven, Peter; Broetz, Bjoern; Rauthe-Schöch, Armin; Brissebrat, Guillaume
2014-05-01
In-service Aircraft for a Global Observing System (IAGOS, http://www.iagos.org) aims at the provision of long-term, frequent, regular, accurate, and spatially resolved in situ observations of the atmospheric composition. IAGOS observation systems are deployed on a fleet of commercial aircraft. The IAGOS database is an essential part of the global atmospheric monitoring network. Data access is handled by an open access policy based on the submission of research requests, which are reviewed by the PIs. Users can access the data through the following web sites: http://www.iagos.fr or http://www.pole-ether.fr, as the IAGOS database is part of the French atmospheric chemistry data centre ETHER (CNES and CNRS). The database is in continuous development and improvement. In the framework of the IGAS project (IAGOS for GMES/COPERNICUS Atmospheric Service), major achievements will be reached, such as metadata and format standardisation in order to interoperate with international portals and other databases, QA/QC procedures and traceability, CARIBIC (Civil Aircraft for the Regular Investigation of the Atmosphere Based on an Instrument Container) data integration within the central database, and real-time data transmission. IGAS work package 2 aims at providing the IAGOS data to users in a standardized format including the necessary metadata and information on data processing, data quality and uncertainties. We are currently redefining and standardizing the IAGOS metadata for interoperable use within GMES/Copernicus. The metadata are compliant with the ISO 19115, INSPIRE and NetCDF-CF conventions. IAGOS data will be provided to users in NetCDF or NASA Ames format. We are also implementing interoperability between all the involved IAGOS data services, including the central IAGOS database, the former MOZAIC and CARIBIC databases, the Aircraft Research DLR database and the Jülich WCS web application JOIN (Jülich OWS Interface), which combines model outputs with in situ data for intercomparison. The optimal data transfer protocol is being investigated to ensure interoperability. To facilitate satellite and model validation, tools will be made available for co-location and comparison with IAGOS. We will enhance the JOIN application in order to properly display aircraft data as vertical profiles and along individual flight tracks, and to allow for graphical comparison with model results that are accessible through interoperable web services, such as the daily products from the GMES/Copernicus atmospheric service.
NASA Astrophysics Data System (ADS)
Boichard, Jean-Luc; Brissebrat, Guillaume; Cloche, Sophie; Eymard, Laurence; Fleury, Laurence; Mastrorillo, Laurence; Moulaye, Oumarou; Ramage, Karim
2010-05-01
The AMMA project includes aircraft, ground-based and ocean measurements, an intensive use of satellite data and diverse modelling studies. Therefore, the AMMA database aims at storing a great amount and a large variety of data, and at providing the data as rapidly and safely as possible to the AMMA research community. In order to stimulate the exchange of information and collaboration between researchers from different disciplines or using different tools, the database provides a detailed description of the products and uses standardized formats. The AMMA database contains: - AMMA field campaign datasets; - historical data in West Africa from 1850 (operational networks and previous scientific programs); - satellite products from past and future satellites, (re-)mapped on a regular latitude/longitude grid and stored in NetCDF format (CF Convention); - model outputs from atmosphere or ocean operational (re-)analysis and forecasts, and from research simulations. The outputs are processed as the satellite products are. Before accessing the data, any user has to sign the AMMA data and publication policy. This charter only covers the use of data in the framework of scientific objectives and categorically excludes the redistribution of data to third parties and their use for commercial applications. Some collaboration between data producers and users, and the mention of the AMMA project in any publication, are also required. The AMMA database and the associated on-line tools have been fully developed and are managed by two teams in France (IPSL Database Centre, Paris and OMP, Toulouse). Users can access data of both data centres using a single web portal. This website is composed of different modules: - Registration: forms to register, read and sign the data use charter when a user visits for the first time; - Data access interface: a user-friendly tool for building a data extraction request by selecting various criteria like location, time, parameters... The request can concern local, satellite and model data. - Documentation: catalogue of all the available data and their metadata. These tools have been developed using standard and free languages and software: - Linux system with an Apache web server and a Tomcat application server; - J2EE tools: JSF and Struts frameworks, Hibernate; - relational database management systems: PostgreSQL and MySQL; - OpenLDAP directory. In order to facilitate access to the data by African scientists, the complete system has been mirrored at the AGRHYMET Regional Centre in Niamey and has been operational there since January 2009. Users can now access metadata and request data through one or the other of two equivalent portals: http://database.amma-international.org or http://amma.agrhymet.ne/amma-data.
EIA publications directory 1994
NASA Astrophysics Data System (ADS)
1995-07-01
Enacted in 1977, the Department of Energy (DOE) Organization Act established the Energy Information Administration (EIA) as the Department's independent statistical and analytical agency, with a mandate to collect and publish data and prepare analyses on energy production, consumption, prices, resources, and projections of energy supply and demand. This edition of the EIA Publications Directory contains titles and abstracts of periodicals and one-time reports produced by EIA from January through December 1994. The body of the Directory contains citations and abstracts arranged by broad subject categories: metadata, coal, oil and gas, nuclear, electricity, renewable energy/alternative fuels, multifuel, end-use consumption, models, and forecasts.
NASA Astrophysics Data System (ADS)
Manzano Muñoz, Fernando; Pouliquen, Sylvie; Petit de la Villeon, Loic; Carval, Thierry; Loubrieu, Thomas; Wedhe, Henning; Sjur Ringheim, Lid; Hammarklint, Thomas; Tamm, Susanne; De Alfonso, Marta; Perivoliotis, Leonidas; Chalkiopoulos, Antonis; Marinova, Veselka; Tintore, Joaquin; Troupin, Charles
2016-04-01
Copernicus, previously known as GMES (Global Monitoring for Environment and Security), is the European Programme for the establishment of a European capacity for Earth Observation and Monitoring. Copernicus aims to provide a sustainable service for Ocean Monitoring and Forecasting validated and commissioned by users. Since May 2015, the Copernicus Marine Environment Monitoring Service (CMEMS) has been working in operational mode through a contract with a service commitment (regular data provision). Within CMEMS, the In Situ Thematic Assembly Centre (INSTAC) distributed service integrates in situ data from different sources for operational oceanography needs. CMEMS INSTAC collects data from providers outside Copernicus (national and international networks) and carries out quality control in a homogeneous manner, to fit the needs of internal and external users. CMEMS INSTAC has been organized into 7 regional Dissemination Units (DUs), building on the EuroGOOS ROOSes. Each DU aggregates data and metadata provided by a series of Production Units (PUs) acting as an interface for providers. Homogeneity and standardization are key features to ensure a coherent and efficient service. All DUs provide data in the OceanSITES NetCDF format 1.2 (based on NetCDF 3.6), which is CF compliant, relies on SeaDataNet vocabularies and is able to handle profile and time-series measurements. All the products, both near real-time (NRT) and multi-year (REP), are available online for every CMEMS registered user through an FTP service. On top of the FTP service, INSTAC products are available through Oceanotron, an open-source data server dedicated to the dissemination of marine observations. It provides services such as aggregation on spatio-temporal coordinates and observed parameters, and subsetting on observed parameters and metadata. The accuracy of the data is checked at various levels: quality control procedures are applied to assess the validity of the data, and correctness tests are applied to the metadata of each NetCDF file. The quality control procedures for the data include different routines for NRT and REP products. Key Performance Indicators (KPIs) are also used in Copernicus for monitoring purposes; they allow periodic monitoring of the availability, quantity and quality of the INSTAC data integrated in the NRT products. Statistical reports are generated on a quarterly and yearly basis to provide more visibility on the coverage in space and time of the INSTAC NRT and REP products, as well as information on their quality. These reports are generated using Java and Python procedures developed within the INSTAC group. One of the most critical tasks for the DUs is to generate NetCDF files compliant with the agreed format. Many tools and programming libraries have been developed for that purpose, for instance the Unidata netCDF-Java library. These tools provide NetCDF data management capabilities including creation, reading and modification. Some DUs have also developed regional data portals which offer useful information for users, including data charts, platform availability through interactive maps, KPIs and statistical figures, and direct access to the FTP service. The proposed presentation will detail the Copernicus in situ data service and the monitoring tools that have been developed by the INSTAC group.
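For illustration, a minimal sketch of how such a CF-compliant, OceanSITES-style NetCDF time series could be read with the netCDF4 Python library; the file name and variable codes are hypothetical placeholders rather than an actual INSTAC product.

    # Read a time-series variable and its quality-control flags from a
    # CF/OceanSITES-style NetCDF file downloaded from the FTP service.
    from netCDF4 import Dataset, num2date

    with Dataset("GL_TS_MO_example.nc") as nc:          # hypothetical product file
        time_var = nc.variables["TIME"]
        times = num2date(time_var[:], units=time_var.units)
        temp = nc.variables["TEMP"][:]                  # sea temperature values
        qc = nc.variables["TEMP_QC"][:]                 # per-value QC flags

    good = temp[qc == 1]                                # keep only "good data" values
    print(f"{good.size} good temperature values between {times[0]} and {times[-1]}")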
Changing knowledge perspective in a changing world: The Adriatic multidisciplinary TDS approach
NASA Astrophysics Data System (ADS)
Bergamasco, Andrea; Carniel, Sandro; Nativi, Stefano; Signell, Richard P.; Benetazzo, Alvise; Falcieri, Francesco M.; Bonaldo, Davide; Minuzzo, Tiziano; Sclavo, Mauro
2013-04-01
The use and exploitation of the marine environment has increased markedly in recent years, calling for better description, monitoring and understanding of its behavior. However, marine scientists and managers often spend too much time accessing and reformatting data instead of focusing on discovering new knowledge from the processes observed and data acquired. There is therefore a need to make our approach to data mining more efficient, especially in a world where rapid climate change imposes quick choices. In this context, it is mandatory to explore ways and possibilities to make large amounts of distributed data usable in an efficient and easy way, an effort that requires standardized data protocols, web services and standards-based tools. Following the US-IOOS approach, which has been adopted in many oceanographic and meteorological sectors, we present a CNR experience in the direction of setting up a national Italian IOOS framework (at the moment confined to the Adriatic Sea environment), using the THREDDS (THematic Real-time Environmental Distributed Data Services) Data Server (TDS). The TDS is middleware designed to fill the gap between data providers and data users; it provides services allowing data users to find the datasets pertaining to their scientific needs and to access, visualize and use them in an easy way, without the need to download files to the local workspace. In order to achieve these results, it is necessary that the data providers make their data available in a standard form that the TDS understands, and with sufficient metadata so that the data can be read and searched for in a standard way. The TDS core is the NetCDF-Java library implementing the Common Data Model (CDM) developed by Unidata (http://www.unidata.ucar.edu), allowing access to "array-based" scientific data. Climate and Forecast (CF) compliant NetCDF files can be read directly with no modification, while non-compliant files can be modified to meet appropriate metadata requirements. Once standardized in the CDM, the TDS makes datasets available through a series of web services such as OPeNDAP or the Open Geospatial Consortium Web Coverage Service (WCS), allowing data users to easily obtain small subsets from large datasets, and to quickly visualize their content by using tools such as GODIVA2 or the Integrated Data Viewer (IDV). In addition, an ISO metadata service is available through the TDS that can be harvested by catalogue broker services (e.g. GI-cat) to enable distributed search across federated data servers. Examples of TDS datasets describing oceanographic fields (currents, waves, sediments, ...) will be described and discussed; some examples can be accessed directly at the Venice site http://tds.ve.ismar.cnr.it:8080/thredds/catalog.html (Bergamasco et al., 2012), also within the framework of the RITMARE Project. References Bergamasco A., Benetazzo A., Carniel S., Falcieri F., Minuzzo T., Signell R.P. and M. Sclavo, 2012. From interoperability to knowledge discovery using large model datasets in the marine environment: the THREDDS Data Server example. Advances in Oceanography and Limnology, 3(1), 41-50. DOI:10.1080/19475721.2012.669637
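As a concrete illustration of this access pattern, the sketch below opens a CF-compliant dataset served by a TDS through its OPeNDAP service and reads a small subset without downloading the whole file; the dataset path and variable name are hypothetical, and the netCDF4 library must be built with DAP support.

    # Subset a remote TDS dataset over OPeNDAP; only metadata is read at open time.
    from netCDF4 import Dataset

    url = "http://tds.ve.ismar.cnr.it:8080/thredds/dodsC/example/adriatic_currents.nc"  # hypothetical path
    with Dataset(url) as nc:
        u = nc.variables["eastward_sea_water_velocity"]   # hypothetical variable name
        surface_u = u[0, 0, :, :]                         # one time step, surface layer only
        print(surface_u.shape, u.units)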
Parameter estimation and forecasting for multiplicative log-normal cascades.
Leövey, Andrés E; Lux, Thomas
2012-04-01
We study the well-known multiplicative log-normal cascade process in which the multiplication of Gaussian and log normally distributed random variables yields time series with intermittent bursts of activity. Due to the nonstationarity of this process and the combinatorial nature of such a formalism, its parameters have been estimated mostly by fitting the numerical approximation of the associated non-Gaussian probability density function to empirical data, cf. Castaing et al. [Physica D 46, 177 (1990)]. More recently, alternative estimators based upon various moments have been proposed by Beck [Physica D 193, 195 (2004)] and Kiyono et al. [Phys. Rev. E 76, 041113 (2007)]. In this paper, we pursue this moment-based approach further and develop a more rigorous generalized method of moments (GMM) estimation procedure to cope with the documented difficulties of previous methodologies. We show that even under uncertainty about the actual number of cascade steps, our methodology yields very reliable results for the estimated intermittency parameter. Employing the Levinson-Durbin algorithm for best linear forecasts, we also show that estimated parameters can be used for forecasting the evolution of the turbulent flow. We compare forecasting results from the GMM and Kiyono et al.'s procedure via Monte Carlo simulations. We finally test the applicability of our approach by estimating the intermittency parameter and forecasting of volatility for a sample of financial data from stock and foreign exchange markets.
A polarimetric scattering database for non-spherical ice particles at microwave wavelengths
NASA Astrophysics Data System (ADS)
Lu, Yinghui; Jiang, Zhiyuan; Aydin, Kultegin; Verlinde, Johannes; Clothiaux, Eugene E.; Botta, Giovanni
2016-10-01
The atmospheric science community has entered a period in which electromagnetic scattering properties at microwave frequencies of realistically constructed ice particles are necessary for making progress on a number of fronts. One front includes retrieval of ice-particle properties and signatures from ground-based, airborne, and satellite-based radar and radiometer observations. Another front is evaluation of model microphysics by application of forward operators to their outputs and comparison to observations during case study periods. Yet a third front is data assimilation, where again forward operators are applied to databases of ice-particle scattering properties and the results compared to observations, with their differences leading to corrections of the model state. Over the past decade investigators have developed databases of ice-particle scattering properties at microwave frequencies and made them openly available. Motivated by and complementing these earlier efforts, a database containing polarimetric single-scattering properties of various types of ice particles at millimeter to centimeter wavelengths is presented. While the database presented here contains only single-scattering properties of ice particles in a fixed orientation, ice-particle scattering properties are computed for many different directions of the radiation incident on them. These results are useful for understanding the dependence of ice-particle scattering properties on ice-particle orientation with respect to the incident radiation. For ice particles that are small compared to the wavelength, the number of incident directions of the radiation is sufficient to compute reasonable estimates of their (randomly) orientation-averaged scattering properties. This database is complementary to earlier ones in that it contains complete (polarimetric) scattering property information for each ice particle - 44 plates, 30 columns, 405 branched planar crystals, 660 aggregates, and 640 conical graupel - and direction of incident radiation but is limited to four frequencies (X-, Ku-, Ka-, and W-bands), does not include temperature dependencies of the single-scattering properties, and does not include scattering properties averaged over randomly oriented ice particles. Rules for constructing the morphologies of ice particles from one database to the next often differ; consequently, analyses that incorporate all of the different databases will contain the most variability, while illuminating important differences between them. Publication of this database is in support of future analyses of this nature and comes with the hope that doing so helps contribute to the development of a database standard for ice-particle scattering properties, like the NetCDF (Network Common Data Form) CF (Climate and Forecast) or NetCDF CF/Radial metadata conventions.
Transportation Workload Forecasting (TWF) Study.
1984-01-01
inflation factors, changes in troop strengths; opening, closing, or consolidation of facilities; and changes in mode of transportation. (4) The Military... Initial attempts to read the data with a univariate statistics program failed, due to the existence of nonnumeric entries in data fields. Records...
The Ophidia framework: toward cloud-based data analytics for climate change
NASA Astrophysics Data System (ADS)
Fiore, Sandro; D'Anca, Alessandro; Elia, Donatello; Mancini, Marco; Mariello, Andrea; Mirto, Maria; Palazzo, Cosimo; Aloisio, Giovanni
2015-04-01
The Ophidia project is a research effort on big data analytics facing scientific data analysis challenges in the climate change domain. It provides parallel (server-side) data analysis, an internal storage model and a hierarchical data organization to manage large amounts of multidimensional scientific data. The Ophidia analytics platform provides several MPI-based parallel operators to manipulate large datasets (data cubes) and array-based primitives to perform data analysis on large arrays of scientific data. The most relevant data analytics use cases implemented in national and international projects target fire danger prevention (OFIDIA), interactions between climate change and biodiversity (EUBrazilCC), climate indicators and remote data analysis (CLIP-C), sea situational awareness (TESSA), and large-scale data analytics on Climate and Forecast (CF) convention compliant CMIP5 data in NetCDF format (ExArch). Two use cases regarding the EU FP7 EUBrazil Cloud Connect and the INTERREG OFIDIA projects will be presented during the talk. In the former case (EUBrazilCC) the Ophidia framework is being extended to integrate scalable VM-based solutions for the management of large volumes of scientific data (both climate and satellite data) in a cloud-based environment to study how climate change affects biodiversity. In the latter (OFIDIA), the data analytics framework is being exploited to provide operational support for processing chains devoted to fire danger prevention. To tackle the project challenges, data analytics workflows consisting of about 130 operators perform, among other tasks, parallel data analysis, metadata management, virtual file system operations, map generation, rolling of datasets, and import/export of datasets in NetCDF format. Finally, the entire Ophidia software stack has been deployed at CMCC on 24 nodes (16 cores/node) of the Athena HPC cluster. Moreover, a cloud-based release tested with OpenNebula is also available and running in the private cloud infrastructure of the CMCC Supercomputing Centre.
LANCE in ECHO - Merging Science and Near Real-Time Data Search and Order
NASA Astrophysics Data System (ADS)
Kreisler, S.; Murphy, K. J.; Vollmer, B.; Lighty, L.; Mitchell, A. E.; Devine, N.
2012-12-01
NASA's Earth Observing System (EOS) Data and Information System (EOSDIS) Land Atmosphere Near real-time Capability for EOS (LANCE) project provides expedited data products from the Terra, Aqua, and Aura satellites within three hours of observation. In order to satisfy latency requirements, LANCE data are produced with relaxed ancillary data resulting in a product that may have minor differences from its science quality counterpart. LANCE products are used by a number of different groups to support research and applications that require near real-time earth observations, such as disaster relief, hazard and air quality monitoring, and weather forecasting. LANCE elements process raw rate-buffered and/or session-based production datasets into higher-level products, which are freely available to registered users via LANCE FTP sites. The LANCE project also generates near real-time full resolution browse imagery from these products, which can be accessed through the Global Imagery Browse Services (GIBS). In an effort to support applications and services that require timely access to these near real-time products, the project is currently implementing the publication of LANCE product metadata to the EOS ClearingHouse (ECHO), a centralized EOSDIS registry of EOS data. Metadata within ECHO is made available through an Application Program Interface (API), and applications can utilize the API to allow users to efficiently search and order LANCE data. Publishing near real-time data to ECHO will permit applications to access near real-time product metadata prior to the release of its science quality counterpart and to associate imagery from GIBS with its underlying data product.
NASA Astrophysics Data System (ADS)
Perraud, Jean-Michel; Bennett, James C.; Bridgart, Robert; Robertson, David E.
2016-04-01
Research undertaken through the Water Information Research and Development Alliance (WIRADA) has laid the foundations for continuous deterministic and ensemble short-term forecasting services. One output of this research is the software Short-term Water Information Forecasting Tools version 2 (SWIFT2). SWIFT2 is developed for use in research on short-term streamflow forecasting techniques as well as in operational forecasting services at the Australian Bureau of Meteorology. The variety of uses in research and operations requires a modular software system whose components can be arranged in applications that are fit for each particular purpose, without unnecessary software duplication. SWIFT2 modelling structures consist of sub-areas of hydrologic models, nodes and links with in-stream routing and reservoirs. While this modelling structure is customary, SWIFT2 is built from the ground up for computational and data-intensive applications such as the ensemble forecasts necessary for estimating forecast uncertainty. Support for parallel computation on multiple processors or on a compute cluster is a primary use case. A convention is defined to store large multi-dimensional forecasting data and its metadata using the netCDF library. SWIFT2 is written in modern C++ with state-of-the-art software engineering techniques and practices. A salient technical feature is a well-defined application programming interface (API) to facilitate access from different applications and technologies. SWIFT2 is already seamlessly accessible on Windows and Linux via packages in R, Python, Matlab and .NET languages such as C# and F#. Command line or graphical front-end applications are also feasible. This poster gives an overview of the technology stack and illustrates the resulting features of SWIFT2 for users. Research and operational uses share the same core C++ modelling shell for consistency, augmented by different software modules suitable for each context. The accessibility via interactive modelling languages makes SWIFT2 particularly amenable to exploratory research, with a dynamic and versatile experimental modelling workflow. This does not come at the expense of the stability and reliability required for use in operations, where only mature and stable components are used.
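To make the netCDF storage idea concrete, the sketch below writes an ensemble forecast array with its metadata using the netCDF4 Python library; the dimension, variable and attribute names are hypothetical illustrations, not the actual SWIFT2 convention.

    # Store ensemble streamflow forecasts with metadata in a netCDF file.
    import numpy as np
    from netCDF4 import Dataset

    with Dataset("ensemble_forecast.nc", "w", format="NETCDF4") as nc:
        nc.createDimension("station", 3)
        nc.createDimension("ens_member", 100)
        nc.createDimension("lead_time", 48)

        lead = nc.createVariable("lead_time", "i4", ("lead_time",))
        lead.units = "hours since 2016-04-01 00:00:00"   # hypothetical forecast issue time
        lead[:] = np.arange(48)

        q = nc.createVariable("streamflow", "f4",
                              ("station", "ens_member", "lead_time"), zlib=True)
        q.units = "m3 s-1"
        q.long_name = "ensemble streamflow forecast"
        q[:] = np.random.gamma(2.0, 5.0, size=(3, 100, 48))   # placeholder data

        nc.title = "Ensemble streamflow forecast (illustrative layout only)"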
Recovering Nimbus era Observations at the NASA GES DISC
NASA Astrophysics Data System (ADS)
Meyer, D. J.; Johnson, J. E.; Esfandiari, A. E.; Zamkoff, E. B.; Al-Jazrawi, A. F.; Gerasimov, I. V.; Alcott, G. T.
2017-12-01
Between 1964 and 1978, NASA launched a series of seven Nimbus meteorological satellites which provided Earth observations for 30 years. These satellites, carrying a total of 33 instruments to observe the Earth at visible, infrared, ultraviolet, and microwave wavelengths, revolutionized weather forecasting, provided early observations of ocean color and atmospheric ozone, and prototyped location-based search and rescue capabilities. The Nimbus series paved the way for a number of currently operational systems such as the EOS Terra, Aqua and Aura platforms. The original data archive included both magnetic tapes and film media. These media are well past their expected end of life, placing at risk valuable data that are critical to extending the history of Earth observations back in time. GES DISC has been incorporating these data into a modern online archive by recovering the digital data files from the tapes, and scanning images of the data from film strips. The original data products were written on obsolete hardware systems in outdated file formats, and in the absence of metadata standards at that time, were often written in proprietary file structures. Through a tedious and laborious process, oft-corrupted data are recovered, and incomplete metadata and documentation are reconstructed.
Linked Metadata - lightweight semantics for data integration (Invited)
NASA Astrophysics Data System (ADS)
Hendler, J. A.
2013-12-01
The "Linked Open Data" cloud (http://linkeddata.org) is currently used to show how the linking of datasets, supported by SPARQL endpoints, is creating a growing set of linked data assets. This linked data space has been growing rapidly, and the last version collected is estimated to have had over 35 billion 'triples.' As impressive as this may sound, there is an inherent flaw in the way the linked data story is conceived. The idea is that all of the data is represented in a linked format (generally RDF) and applications will essentially query this cloud and provide mashup capabilities between the various kinds of data that are found. The view of linking in the cloud is fairly simple: links are provided by either shared URIs or by URIs that are asserted to be owl:sameAs. This view of the linking, which primarily focuses on shared objects and subjects in RDF's subject-predicate-object representation, misses a critical aspect of Semantic Web technology. Given triples such as A:person1 foaf:knows A:person2, B:person3 foaf:knows B:person4, and C:person5 foaf:name 'John Doe', this view would not consider them linked (barring other assertions) even though they share a common vocabulary. In fact, we get significant clues that there are commonalities in these data items from the shared namespaces and predicates, even if the traditional 'graph' view of RDF doesn't appear to join on these. Thus, it is the linking of the data descriptions, whether as metadata or other vocabularies, that provides the linking in these cases. This observation is crucial to scientific data integration, where the size of the datasets, or even the individual relationships within them, can be quite large. (Note that this is not restricted to scientific data - search engines, social networks, and massive multiuser games also create huge amounts of data.) Converting all of the data into RDF triples and providing individual links is often unnecessary, and is both time and space intensive. Those looking to do on-the-fly integration may prefer to do more traditional data queries and then convert and link the 'views' returned at retrieval time, providing another means of using the linked data infrastructure without having to convert whole datasets to triples to provide linking. Web companies have been taking advantage of 'lightweight' semantic metadata for search quality and optimization (cf. schema.org), for linking networks within and beyond web sites (cf. Facebook's Open Graph Protocol), and for doing various kinds of advertisement and user modeling across datasets. Scientific metadata, on the other hand, has traditionally been geared at being large-scale and highly descriptive, and scientific ontologies have been aimed at high expressivity, essentially providing complex reasoning services rather than the less expressive vocabularies needed for data discovery and the simple mappings that can assist humans (or more complex systems) when full-scale integration is needed. Although this work is just the beginning for providing integration, as the community creates more and more datasets, discovery of these data resources on the Web becomes a crucial starting place. Simple descriptors, which can be combined with textual fields and/or common community vocabularies, can be a great starting point for bringing scientific data into the Web of Data that is growing in other communities. References: [1] Pouchard, Line C., et al. "A Linked Science investigation: enhancing climate change data discovery with semantic technologies." Earth Science Informatics 6.3 (2013): 175-185.
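To illustrate the point about shared vocabularies providing the linkage, here is a minimal rdflib sketch; the A, B and C namespaces are hypothetical stand-ins for independent datasets. A query on the common foaf:knows predicate joins data items that share no URIs at all.

    # Three "datasets" with no shared URIs, linked only by a common vocabulary.
    from rdflib import Graph, Namespace, Literal

    FOAF = Namespace("http://xmlns.com/foaf/0.1/")
    A = Namespace("http://example.org/datasetA#")   # hypothetical namespaces
    B = Namespace("http://example.org/datasetB#")
    C = Namespace("http://example.org/datasetC#")

    g = Graph()
    g.add((A.person1, FOAF.knows, A.person2))
    g.add((B.person3, FOAF.knows, B.person4))
    g.add((C.person5, FOAF.name, Literal("John Doe")))

    # The shared predicate is what makes the data queryable together.
    query = """
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?s ?o WHERE { ?s foaf:knows ?o }
    """
    for row in g.query(query):
        print(row.s, "knows", row.o)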
eWaterCycle visualisation. combining the strength of NetCDF and Web Map Service: ncWMS
NASA Astrophysics Data System (ADS)
Hut, R.; van Meersbergen, M.; Drost, N.; Van De Giesen, N.
2016-12-01
As a result of the eWaterCycle global hydrological forecast we have created Cesium-ncWMS, a web application based on ncWMS and Cesium. ncWMS is a server-side application capable of reading any NetCDF file written using the Climate and Forecast (CF) conventions and making the data available as a Web Map Service (WMS). ncWMS automatically determines the available variables in a file and creates maps colored according to the data and a user-selected color scale. Cesium is a JavaScript 3D virtual globe library. It uses WebGL for rendering, which makes it very fast, and it is capable of displaying a wide variety of data types such as vectors, 3D models, and 2D maps. The forecast results are automatically uploaded to our web server running ncWMS. In turn, the web application can be used to change the settings for color maps and displayed data. The server uses the settings provided by the web application, together with the data in NetCDF, to provide WMS image tiles, time series data and legend graphics to the Cesium-ncWMS web application. The user can simultaneously zoom in to the very high resolution forecast results anywhere in the world and get time series data for any point on the globe. The Cesium-ncWMS visualisation combines a global overview with locally relevant information in any browser. See the visualisation live at forecast.ewatercycle.org
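For illustration, a minimal OWSLib sketch of requesting a map image from an ncWMS endpoint; the server URL, layer name, bounding box and time value are hypothetical placeholders, not the eWaterCycle service itself.

    # Request a single WMS map tile from an ncWMS server and save it to disk.
    from owslib.wms import WebMapService

    wms = WebMapService("https://forecast.example.org/ncWMS2/wms", version="1.1.1")  # hypothetical URL

    img = wms.getmap(
        layers=["hydrology/discharge"],          # hypothetical CF variable exposed by ncWMS
        srs="EPSG:4326",
        bbox=(-10.0, 35.0, 30.0, 60.0),          # lon/lat box over Europe
        size=(512, 512),
        format="image/png",
        transparent=True,
        time="2016-12-01T00:00:00Z",             # ncWMS exposes the CF time axis as TIME
    )
    with open("discharge.png", "wb") as f:
        f.write(img.read())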
Parameter estimation and forecasting for multiplicative log-normal cascades
NASA Astrophysics Data System (ADS)
Leövey, Andrés E.; Lux, Thomas
2012-04-01
We study the well-known multiplicative log-normal cascade process in which the multiplication of Gaussian and log normally distributed random variables yields time series with intermittent bursts of activity. Due to the nonstationarity of this process and the combinatorial nature of such a formalism, its parameters have been estimated mostly by fitting the numerical approximation of the associated non-Gaussian probability density function to empirical data, cf. Castaing et al. [Physica D 46, 177 (1990)]. More recently, alternative estimators based upon various moments have been proposed by Beck [Physica D 193, 195 (2004)] and Kiyono et al. [Phys. Rev. E 76, 041113 (2007)]. In this paper, we pursue this moment-based approach further and develop a more rigorous generalized method of moments (GMM) estimation procedure to cope with the documented difficulties of previous methodologies. We show that even under uncertainty about the actual number of cascade steps, our methodology yields very reliable results for the estimated intermittency parameter. Employing the Levinson-Durbin algorithm for best linear forecasts, we also show that estimated parameters can be used for forecasting the evolution of the turbulent flow. We compare forecasting results from the GMM and Kiyono et al.'s procedure via Monte Carlo simulations. We finally test the applicability of our approach by estimating the intermittency parameter and forecasting of volatility for a sample of financial data from stock and foreign exchange markets.
Extra-tropical Cyclones and Windstorms in Seasonal Forecasts
NASA Astrophysics Data System (ADS)
Leckebusch, Gregor C.; Befort, Daniel J.; Weisheimer, Antje; Knight, Jeff; Thornton, Hazel; Roberts, Julia; Hermanson, Leon
2015-04-01
Severe damages and large insured losses over Europe related to natural phenomena are mostly caused by extra-tropical cyclones and their related windstorm fields. Thus, an adequate representation of these events in seasonal prediction systems and reliable forecasts up to a season in advance would be of high value for society and the economy. In this study, state-of-the-art seasonal forecast systems (ECMWF, UK Met Office) are analysed regarding their general climatological representation and seasonal prediction of extra-tropical cyclones and windstorms during the core winter season (DJF), with lead times of up to four months. Two different algorithms are used to identify cyclones and windstorm events in these datasets. Firstly, we apply a cyclone identification and tracking algorithm based on the Laplacian of MSLP and, secondly, we use an objective wind field tracking algorithm to identify and track continuous areas of extremely high wind speeds (cf. Leckebusch et al., 2008), which can be related to extra-tropical winter cyclones. Thus, for the first time, we can analyse the forecast of severe near-surface wind events caused by extra-tropical cyclones. First results suggest, in general, a successful validation of the spatial climatological distributions of windstorm and cyclone occurrence in the seasonal forecast systems in comparison with reanalysis data (ECMWF ERA-40 & ERA-Interim). However, large biases are found for some areas. The skill of the seasonal forecast systems in simulating the year-to-year variability of the frequency of severe windstorm events and cyclones is investigated using the ranked probability skill score. Positive skill is found over large parts of the Northern Hemisphere, as well as for the most intense extra-tropical cyclones and their related wind fields.
NCAR's Research Data Archive: OPeNDAP Access for Complex Datasets
NASA Astrophysics Data System (ADS)
Dattore, R.; Worley, S. J.
2014-12-01
Many datasets have complex structures including hundreds of parameters and numerous vertical levels, grid resolutions, and temporal products. Making these data accessible is a challenge for a data provider. OPeNDAP is a powerful protocol for delivering multi-file datasets in real time so that they can be ingested by many analysis and visualization tools, but for such datasets there are too many choices about how to aggregate. Simple aggregation schemes can fail to support, or at least make very challenging, many potential studies based on complex datasets. We address this issue by using a rich file content metadata collection to create a real-time customized OPeNDAP service to match the full suite of access possibilities for complex datasets. The Climate Forecast System Reanalysis (CFSR) and its extension, the Climate Forecast System Version 2 (CFSv2), datasets produced by the National Centers for Environmental Prediction (NCEP) and hosted by the Research Data Archive (RDA) at the Computational and Information Systems Laboratory (CISL) at NCAR are examples of complex datasets that are difficult to aggregate with existing data server software. CFSR and CFSv2 contain 141 distinct parameters on 152 vertical levels, six grid resolutions and 36 products (analyses, n-hour forecasts, multi-hour averages, etc.), where not all parameter/level combinations are available at all grid resolution/product combinations. These data are archived in the RDA with the data structure provided by the producer; no additional re-organization or aggregation has been applied. Since 2011, users have been able to request customized subsets (e.g. temporal, parameter, spatial) from the CFSR/CFSv2, which are processed in delayed mode and then downloaded to a user's system. Until now, the complexity has made it difficult to provide real-time OPeNDAP access to the data. We have developed a service that leverages the already-existing subsetting interface and allows users to create a virtual dataset with its own structure (das, dds). The user receives a URL to the customized dataset that can be used by existing tools to ingest, analyze, and visualize the data. This presentation will detail the metadata system and OPeNDAP server that enable user-customized real-time access and show an example of how a visualization tool can access the data.
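As an illustration of how such a virtual dataset could be consumed, a minimal xarray sketch over OPeNDAP follows; the URL and variable name are hypothetical placeholders for what the subsetting interface would return.

    # Open a customized virtual dataset over OPeNDAP and compute a monthly mean.
    import xarray as xr

    url = "https://rda.example.edu/opendap/cfsr/virtual-dataset-12345"   # hypothetical URL
    ds = xr.open_dataset(url)           # only the DDS/DAS metadata are fetched here

    # Lazily select a month of data; values are transferred only when computed.
    jan_mean = ds["air_temperature"].sel(time=slice("2010-01-01", "2010-01-31")).mean("time")
    print(jan_mean.load())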
Forecasting Solar Flares Using Magnetogram-based Predictors and Machine Learning
NASA Astrophysics Data System (ADS)
Florios, Kostas; Kontogiannis, Ioannis; Park, Sung-Hong; Guerra, Jordan A.; Benvenuto, Federico; Bloomfield, D. Shaun; Georgoulis, Manolis K.
2018-02-01
We propose a forecasting approach for solar flares based on data from Solar Cycle 24, taken by the Helioseismic and Magnetic Imager (HMI) on board the Solar Dynamics Observatory (SDO) mission. In particular, we use the Space-weather HMI Active Region Patches (SHARP) product that provides cut-out magnetograms of solar active regions (ARs) in near-real time (NRT), taken over a five-year interval (2012-2016). Our approach utilizes a set of thirteen predictors, which are not included in the SHARP metadata, extracted from line-of-sight and vector photospheric magnetograms. We exploit several machine learning (ML) and conventional statistics techniques to predict flares of peak magnitude > M1 and > C1 within a 24 h forecast window. The ML methods used are multi-layer perceptrons (MLP), support vector machines (SVM), and random forests (RF). We conclude that random forests could be the prediction technique of choice for our sample, with the second-best method being multi-layer perceptrons, subject to an entropy objective function. A Monte Carlo simulation showed that the best-performing method gives accuracy ACC=0.93(0.00), true skill statistic TSS=0.74(0.02), and Heidke skill score HSS=0.49(0.01) for > M1 flare prediction with probability threshold 15% and ACC=0.84(0.00), TSS=0.60(0.01), and HSS=0.59(0.01) for > C1 flare prediction with probability threshold 35%.
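A minimal sketch of the random-forest, probability-threshold forecasting step together with the verification scores quoted above, using scikit-learn; the feature matrix here is synthetic, standing in for the thirteen magnetogram-derived predictors per active-region patch.

    # Random-forest flare forecast with a 15% probability threshold and ACC/TSS/HSS scores.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(1)
    X = rng.normal(size=(2000, 13))                      # placeholder predictors
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=2000) > 1.5).astype(int)  # placeholder labels

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)
    rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)

    prob = rf.predict_proba(X_te)[:, 1]
    pred = prob >= 0.15                                  # probability threshold, as for > M1

    tp = np.sum(pred & (y_te == 1)); fn = np.sum(~pred & (y_te == 1))
    fp = np.sum(pred & (y_te == 0)); tn = np.sum(~pred & (y_te == 0))
    acc = (tp + tn) / len(y_te)
    tss = tp / (tp + fn) - fp / (fp + tn)                # true skill statistic
    e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / len(y_te)
    hss = (tp + tn - e) / (len(y_te) - e)                # Heidke skill score
    print(f"ACC={acc:.2f}  TSS={tss:.2f}  HSS={hss:.2f}")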
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bent, John M.; Faibish, Sorin; Pedone, Jr., James M.
A cluster file system is provided having a plurality of distributed metadata servers with shared access to one or more shared low latency persistent key-value metadata stores. A metadata server comprises an abstract storage interface comprising a software interface module that communicates with at least one shared persistent key-value metadata store providing a key-value interface for persistent storage of key-value metadata. The software interface module provides the key-value metadata to the at least one shared persistent key-value metadata store in a key-value format. The shared persistent key-value metadata store is accessed by a plurality of metadata servers. A metadata request can be processed by a given metadata server independently of other metadata servers in the cluster file system. A distributed metadata storage environment is also disclosed that comprises a plurality of metadata servers having an abstract storage interface to at least one shared persistent key-value metadata store.
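To illustrate the idea of an abstract storage interface exposing a key-value view of metadata to multiple servers, here is a minimal Python sketch; the class and method names are purely illustrative and are not taken from the cited system.

    # Abstract key-value metadata store and metadata servers sharing it.
    from abc import ABC, abstractmethod

    class KeyValueMetadataStore(ABC):
        """Interface a metadata server uses to persist metadata as key-value pairs."""
        @abstractmethod
        def put(self, key: str, value: bytes) -> None: ...
        @abstractmethod
        def get(self, key: str) -> bytes: ...

    class InMemoryStore(KeyValueMetadataStore):
        """Stand-in for a shared low-latency persistent store, for illustration."""
        def __init__(self) -> None:
            self._data: dict[str, bytes] = {}
        def put(self, key: str, value: bytes) -> None:
            self._data[key] = value
        def get(self, key: str) -> bytes:
            return self._data[key]

    class MetadataServer:
        """Each server talks to the shared store through the abstract interface,
        so a request can be handled independently of other servers."""
        def __init__(self, store: KeyValueMetadataStore) -> None:
            self.store = store
        def set_attr(self, path: str, attrs: bytes) -> None:
            self.store.put(f"attr:{path}", attrs)
        def get_attr(self, path: str) -> bytes:
            return self.store.get(f"attr:{path}")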
Recovering Nimbus Era Observations at the NASA GES DISC
NASA Technical Reports Server (NTRS)
Meyer, D.; Johnson, J.; Esfandiari, A.; Zamkoff, E.; Al-Jazrawi, A.; Gerasimov, I.; Alcott, G.
2017-01-01
Between 1964 and 1978, NASA launched a series of seven Nimbus meteorological satellites which provided Earth observations for 30 years. These satellites, carrying a total of 33 instruments to observe the Earth at visible, infrared, ultraviolet, and microwave wavelengths, revolutionized weather forecasting, provided early observations of ocean color and atmospheric ozone, and prototyped location-based search and rescue capabilities. The Nimbus series paved the way for a number of currently operational systems such as the EOS (Earth Observation System) Terra, Aqua, and Aura platforms. The original data archive includes both magnetic tapes and film media. These media are well past their expected end of life, placing at risk valuable data that are critical to extending the history of Earth observations back in time. GES DISC (Goddard Earth Sciences Data and Information Services Center) has been incorporating these data into a modern online archive by recovering the digital data files from the tapes, and scanning images of the data from film strips. The digital data products were written on obsolete hardware systems in outdated file formats, and in the absence of metadata standards at that time, were often written in proprietary file structures. Through a tedious and laborious process, oft-corrupted data are recovered, and incomplete metadata and documentation are reconstructed.
Log-less metadata management on metadata server for parallel file systems.
Liao, Jianwei; Xiao, Guoqiang; Peng, Xiaoning
2014-01-01
This paper presents a novel metadata management mechanism on the metadata server (MDS) for parallel and distributed file systems. In this technique, the client file system backs up the metadata requests it has sent and that have been handled by the metadata server, so that the MDS does not need to log metadata changes to nonvolatile storage in order to achieve a highly available metadata service, and metadata processing performance is also improved. Because the client file system backs up certain sent metadata requests in its memory, the overhead of handling these backup requests is much smaller than the overhead incurred by a metadata server that adopts logging or journaling to provide a highly available metadata service. The experimental results show that this newly proposed mechanism can significantly improve the speed of metadata processing and deliver better I/O data throughput than conventional metadata management schemes, that is, logging or journaling on the MDS. Besides, complete metadata recovery can be achieved by replaying the backup logs cached by all involved clients when the metadata server has crashed or has unexpectedly entered a non-operational state.
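A minimal sketch of the client-side backup idea: the client caches the metadata requests it has sent so that, if the metadata server loses its in-memory state, the cached requests can be replayed instead of relying on a server-side log. All names are illustrative; this is not the paper's implementation.

    # Client-side request backlog that substitutes for server-side logging.
    from dataclasses import dataclass, field

    @dataclass
    class MetadataRequest:
        seq: int
        op: str           # e.g. "create", "rename", "chmod"
        args: tuple

    class LoglessMDS:
        def __init__(self) -> None:
            self.namespace: dict = {}
        def handle(self, req: MetadataRequest) -> None:
            if req.op == "create":
                (path,) = req.args
                self.namespace[path] = {"seq": req.seq}
            # ... other operations elided for brevity

    @dataclass
    class ClientFS:
        backlog: list = field(default_factory=list)
        next_seq: int = 0
        def send(self, server: LoglessMDS, op: str, *args) -> None:
            req = MetadataRequest(self.next_seq, op, args)
            self.next_seq += 1
            server.handle(req)          # server applies the change in memory only
            self.backlog.append(req)    # client keeps the request as the "log"
        def replay(self, server: LoglessMDS) -> None:
            for req in self.backlog:    # recovery: re-send cached requests in order
                server.handle(req)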
Log-Less Metadata Management on Metadata Server for Parallel File Systems
Xiao, Guoqiang; Peng, Xiaoning
2014-01-01
This paper presents a novel metadata management mechanism on the metadata server (MDS) for parallel and distributed file systems. In this technique, the client file system backs up the metadata requests it has sent and that have been handled by the metadata server, so that the MDS does not need to log metadata changes to nonvolatile storage in order to achieve a highly available metadata service, and metadata processing performance is also improved. Because the client file system backs up certain sent metadata requests in its memory, the overhead of handling these backup requests is much smaller than the overhead incurred by a metadata server that adopts logging or journaling to provide a highly available metadata service. The experimental results show that this newly proposed mechanism can significantly improve the speed of metadata processing and deliver better I/O data throughput than conventional metadata management schemes, that is, logging or journaling on the MDS. Besides, complete metadata recovery can be achieved by replaying the backup logs cached by all involved clients when the metadata server has crashed or has unexpectedly entered a non-operational state. PMID:24892093
NASA Astrophysics Data System (ADS)
Mazzetti, Paolo; Valentin, Bernard; Koubarakis, Manolis; Nativi, Stefano
2013-04-01
Access to Earth Observation products remains far from straightforward for end users in most domains. Semantically-enabled search engines, generally accessible through Web portals, have been developed. They allow searching for products by selecting application-specific terms and specifying basic geographical and temporal filtering criteria. Although this mostly suits the needs of the general public, the scientific communities require more advanced and controlled means to find products. Ranges of validity, traceability (e.g. origin, applied algorithms), accuracy and uncertainty are concepts that are typically taken into account in research activities. The Prod-Trees (Enriching Earth Observation Ontology Services using Product Trees) project will enhance the CF-netCDF product format and vocabulary to allow storing metadata that better describe the products, and in particular EO products. The project will bring a standardized solution that permits annotating EO products in such a manner that official and third-party software libraries and tools will be able to search for products using advanced tags and controlled parameter names. Annotated EO products will be automatically supported by all compatible software. Because all of the product information will come from the annotations and the standards, there will be no need to integrate extra components and data structures that have not been standardized. In the course of the project, the most important and popular open-source software libraries and tools will be extended to support the proposed extensions of CF-netCDF. The results will be provided back to the respective owners and maintainers to ensure the best dissemination and adoption of the extended format. The project, funded by ESA, started in December 2012 and will end in May 2014. It is coordinated by Space Applications Services, and the Consortium includes CNR-IIA and the National and Kapodistrian University of Athens. The first activities included the elicitation of user requirements in order to identify gaps in the current CF and netCDF specifications regarding extended support for the discovery of EO data. To this aim, a Validation Group has been established, including members from organizations actively using the netCDF and CF standards. A questionnaire has been prepared and submitted to the Validation Group; it was designed to be filled in online, but also to guide interviews. The presentation will focus on the project objectives, the first achievements (with particular reference to the results of the requirements analysis) and future plans.
The European Drought Observatory (EDO): Current State and Future Directions
NASA Astrophysics Data System (ADS)
Vogt, J.; Singleton, A.; Sepulcre, G.; Micale, F.; Barbosa, P.
2012-12-01
Europe has repeatedly been affected by droughts, resulting in considerable ecological and economic damage, and climate change studies indicate a trend towards increasing climate variability that will most likely result in more frequent drought occurrences in Europe. Against this background, the European Commission's Joint Research Centre (JRC) is developing methods and tools for assessing, monitoring and forecasting droughts in Europe, and is building a European Drought Observatory (EDO) to complement and integrate national activities with a European view. At the core of the European Drought Observatory (EDO) is a portal, including a map server, a metadata catalogue, a media monitor and analysis tools. The map server presents Europe-wide up-to-date information on the occurrence and severity of droughts, which is complemented by more detailed information provided by regional, national and local observatories through OGC-compliant web mapping and web coverage services. In addition, time series of historical maps as well as graphs of the temporal evolution of drought indices for individual grid cells and administrative regions in Europe can be retrieved and analysed. Current work is focusing on validating the available products, improving the functionalities, extending the linkage to additional national and regional drought information systems and improving medium- to long-range probabilistic drought forecasting products. Probabilistic forecasts are attractive in that they provide an estimate of the range of uncertainty in a particular forecast. Longer-term goals include the development of long-range drought forecasting products, the analysis of drought hazard and risk, the monitoring of drought impact and the integration of EDO in a global drought information system. The talk will provide an overview of the development and state of EDO, the different products, and the ways to include a wide range of stakeholders (i.e. European, national, river basin and local authorities) in the development of the system, as well as an outlook on future developments.
Impact of Sea Surface Salinity on Coupled Dynamics for the Tropical Indo Pacific
NASA Astrophysics Data System (ADS)
Busalacchi, A. J.; Hackert, E. C.
2014-12-01
In this presentation we assess the impact of in situ and satellite sea surface salinity (SSS) observations on seasonal to interannual variability of tropical Indo-Pacific Ocean dynamics as well as on dynamical ENSO forecasts using a Hybrid Coupled Model (HCM) for 1993-2007 (cf., Hackert et al., 2011) and August 2011 until February 2014 (cf., Hackert et al., 2014). The HCM is composed of a primitive equation ocean model coupled with a SVD-based statistical atmospheric model for the tropical Indo-Pacific region. An Ensemble Reduced Order Kalman Filter (EROKF) is used to assimilate observations to constrain dynamics and thermodynamics for initialization of the HCM. Including SSS generally improves NINO3 sea surface temperature anomaly validation. Assimilating SSS gives significant improvement versus just subsurface temperature for all forecast lead times after 5 months. We find that the positive impact of SSS assimilation is brought about by surface freshening in the western Pacific warm pool that leads to increased barrier layer thickness (BLT) and shallower mixed layer depths. Thus, in the west the net effect of assimilating SSS is to increase stability and reduce mixing, which concentrates the wind impact of ENSO coupling. Specifically, the main benefit of SSS assimilation for 1993-2007 comes from improvement to the Spring Predictability Barrier (SPB) period. In the east, the impact of Aquarius satellite SSS is to induce more cooling in the NINO3 region as a result of being relatively more salty than in situ SSS in the eastern Pacific leading to increased mixing and entrainment. This, in turn, sets up an enhanced west to east SST gradient and intensified Bjerknes coupling. For the 2011-2014 period, consensus coupled model forecasts compiled by the IRI tend to erroneously predict NINO3 warming; SSS assimilation corrects this defect. Finally, we plan to update our analysis and report on the dynamical impact of including Aquarius SSS for the most-recent, ongoing 2014 El Niño in the tropical Pacific and compare these results with other available coupled models that do not yet include satellite SSS.
Metadata for Web Resources: How Metadata Works on the Web.
ERIC Educational Resources Information Center
Dillon, Martin
This paper discusses bibliographic control of knowledge resources on the World Wide Web. The first section sets the context of the inquiry. The second section covers the following topics related to metadata: (1) definitions of metadata, including metadata as tags and as descriptors; (2) metadata on the Web, including general metadata systems,…
Metadata Dictionary Database: A Proposed Tool for Academic Library Metadata Management
ERIC Educational Resources Information Center
Southwick, Silvia B.; Lampert, Cory
2011-01-01
This article proposes a metadata dictionary (MDD) be used as a tool for metadata management. The MDD is a repository of critical data necessary for managing metadata to create "shareable" digital collections. An operational definition of metadata management is provided. The authors explore activities involved in metadata management in…
NASA Astrophysics Data System (ADS)
Palanisamy, Giriprakash; Wilson, Bruce E.; Cook, Robert B.; Lenhardt, Chris W.; Santhana Vannan, Suresh; Pan, Jerry; McMurry, Ben F.; Devarakonda, Ranjeet
2010-12-01
The Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC) is one of the science-oriented data centers in EOSDIS, aligned primarily with terrestrial ecology. The ORNL DAAC archives and serves data from NASA-funded field campaigns (such as BOREAS, FIFE, and LBA), regional and global data sets relevant to biogeochemical cycles, land validation studies for remote sensing, and source code for some terrestrial ecology models. Users of the ORNL DAAC include field ecologists, remote sensing scientists, modelers at various scales, scientific synthesis groups, a range of educational users (particularly baccalaureate and graduate instruction), and decision support analysts. It is clear that the wide range of users served by the ORNL DAAC have differing needs and differing capabilities for accessing and using data. It is also not possible for the ORNL DAAC, or the other data centers in EOSDIS, to develop all of the tools and interfaces needed to directly support even most of the potential uses of data. As is typical of information technology supporting a research enterprise, user needs will continue to evolve rapidly over time, and users themselves cannot predict future needs, as those needs depend on the results of current investigations. The ORNL DAAC is addressing these needs by targeted implementation of web services and tools which can be consumed by other applications, so that a modeler can retrieve data in netCDF format following the Climate and Forecast (CF) convention and a field ecologist can retrieve subsets of that same data in comma-separated value format, suitable for use in Excel or R. Tools such as our MODIS Subsetting capability, the Spatial Data Access Tool (SDAT; based on OGC web services), and OPeNDAP-compliant servers such as THREDDS particularly enable such diverse means of access. We also seek interoperability of metadata, recognizing that terrestrial ecology is a field where there are a very large number of relevant data repositories. ORNL DAAC metadata is published to several metadata repositories using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), to increase the chances that users can find data holdings relevant to their particular scientific problem. ORNL also seeks to leverage technology across these various data projects and encourage standardization of processes and technical architecture. This standardization is behind current efforts involving the use of Drupal and Fedora Commons. This poster describes the current and planned approaches that the ORNL DAAC is taking to enable cost-effective interoperability among data centers, both across the NASA EOSDIS data centers and across the international spectrum of terrestrial ecology-related data centers. The poster will highlight the standards that we are currently using across data formats, metadata formats, and data protocols. References: [1] Devarakonda R., et al. Mercury: reusable metadata management, data discovery and access system. Earth Science Informatics (2010), 3(1): 87-94. [2] Devarakonda R., et al. Data sharing and retrieval using OAI-PMH. Earth Science Informatics (2011), 4(1): 1-5.
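As an illustration of the OAI-PMH publishing path, a minimal harvesting sketch with the requests library follows; the endpoint URL is a placeholder, not the DAAC's actual OAI-PMH provider address, while the verb and metadataPrefix parameters are part of the standard protocol.

    # Harvest Dublin Core records from an OAI-PMH provider and list their identifiers.
    import requests
    import xml.etree.ElementTree as ET

    BASE = "https://daac.example.ornl.gov/oai/provider"   # hypothetical endpoint
    OAI = "{http://www.openarchives.org/OAI/2.0/}"

    resp = requests.get(BASE, params={"verb": "ListRecords", "metadataPrefix": "oai_dc"}, timeout=60)
    resp.raise_for_status()

    root = ET.fromstring(resp.content)
    for record in root.iter(f"{OAI}record"):
        ident = record.find(f"{OAI}header/{OAI}identifier")
        print(ident.text if ident is not None else "<no identifier>")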
Fast and Accurate Metadata Authoring Using Ontology-Based Recommendations.
Martínez-Romero, Marcos; O'Connor, Martin J; Shankar, Ravi D; Panahiazar, Maryam; Willrett, Debra; Egyedi, Attila L; Gevaert, Olivier; Graybeal, John; Musen, Mark A
2017-01-01
In biomedicine, high-quality metadata are crucial for finding experimental datasets, for understanding how experiments were performed, and for reproducing those experiments. Despite the recent focus on metadata, the quality of metadata available in public repositories continues to be extremely poor. A key difficulty is that the typical metadata acquisition process is time-consuming and error prone, with weak or nonexistent support for linking metadata to ontologies. There is a pressing need for methods and tools to speed up the metadata acquisition process and to increase the quality of metadata that are entered. In this paper, we describe a methodology and set of associated tools that we developed to address this challenge. A core component of this approach is a value recommendation framework that uses analysis of previously entered metadata and ontology-based metadata specifications to help users rapidly and accurately enter their metadata. We performed an initial evaluation of this approach using metadata from a public metadata repository.
Fast and Accurate Metadata Authoring Using Ontology-Based Recommendations
Martínez-Romero, Marcos; O’Connor, Martin J.; Shankar, Ravi D.; Panahiazar, Maryam; Willrett, Debra; Egyedi, Attila L.; Gevaert, Olivier; Graybeal, John; Musen, Mark A.
2017-01-01
In biomedicine, high-quality metadata are crucial for finding experimental datasets, for understanding how experiments were performed, and for reproducing those experiments. Despite the recent focus on metadata, the quality of metadata available in public repositories continues to be extremely poor. A key difficulty is that the typical metadata acquisition process is time-consuming and error prone, with weak or nonexistent support for linking metadata to ontologies. There is a pressing need for methods and tools to speed up the metadata acquisition process and to increase the quality of metadata that are entered. In this paper, we describe a methodology and set of associated tools that we developed to address this challenge. A core component of this approach is a value recommendation framework that uses analysis of previously entered metadata and ontology-based metadata specifications to help users rapidly and accurately enter their metadata. We performed an initial evaluation of this approach using metadata from a public metadata repository. PMID:29854196
Schmidt, Wiebke; Evers-King, Hayley L.; Campos, Carlos J. A.; Jones, Darren B.; Miller, Peter I.; Davidson, Keith; Shutler, Jamie D.
2018-01-01
Microbiological contamination or elevated marine biotoxin concentrations within shellfish can result in temporary closure of shellfish aquaculture harvesting, leading to financial loss for the aquaculture business and a potential reduction in consumer confidence in shellfish products. We present a method for predicting short-term variations in shellfish concentrations of Escherichia coli and biotoxin (okadaic acid and its derivatives dinophysistoxins and pectenotoxins). The approach was evaluated for 2 contrasting shellfish harvesting areas. Through a meta-data analysis and using environmental data (in situ, satellite observations and meteorological nowcasts and forecasts), key environmental drivers were identified and used to develop models to predict E. coli and biotoxin concentrations within shellfish. Models were trained and evaluated using independent datasets, and the best models were identified based on the model exhibiting the lowest root mean square error. The best biotoxin model was able to provide 1 wk forecasts with an accuracy of 86%, a 0% false positive rate and a 0% false discovery rate (n = 78 observations) when used to predict the closure of shellfish beds due to biotoxin. The best E. coli models were used to predict the European hygiene classification of the shellfish beds to an accuracy of 99% (n = 107 observations) and 98% (n = 63 observations) for a bay (St Austell Bay) and an estuary (Turnaware Bar), respectively. This generic approach enables high-accuracy, short-term, farm-specific forecasts, based on readily accessible environmental data and observations. PMID:29805719
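As an illustration of the "lowest RMSE on an independent dataset" selection step described above, a minimal scikit-learn sketch follows; the features and targets are synthetic stand-ins for the environmental drivers and shellfish concentrations, and the candidate models are generic examples.

    # Fit candidate regressors and keep the one with the lowest evaluation RMSE.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import mean_squared_error

    rng = np.random.default_rng(0)
    X_train, X_eval = rng.normal(size=(200, 5)), rng.normal(size=(80, 5))
    y_train = X_train[:, 0] * 2 + rng.normal(size=200)      # placeholder concentrations
    y_eval = X_eval[:, 0] * 2 + rng.normal(size=80)

    candidates = {
        "linear": LinearRegression(),
        "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
    }

    scores = {}
    for name, model in candidates.items():
        model.fit(X_train, y_train)
        rmse = np.sqrt(mean_squared_error(y_eval, model.predict(X_eval)))
        scores[name] = rmse

    best = min(scores, key=scores.get)
    print(f"best model: {best} (RMSE = {scores[best]:.2f})")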
Harvesting NASA's Common Metadata Repository (CMR)
NASA Technical Reports Server (NTRS)
Shum, Dana; Durbin, Chris; Norton, James; Mitchell, Andrew
2017-01-01
As part of NASA's Earth Observing System Data and Information System (EOSDIS), the Common Metadata Repository (CMR) stores metadata for over 30,000 datasets from both NASA and international providers along with over 300M granules. This metadata enables sub-second discovery and facilitates data access. While the CMR offers a robust temporal, spatial and keyword search functionality to the general public and international community, it is sometimes more desirable for international partners to harvest the CMR metadata and merge the CMR metadata into a partner's existing metadata repository. This poster will focus on best practices to follow when harvesting CMR metadata to ensure that any changes made to the CMR can also be updated in a partner's own repository. Additionally, since each partner has distinct metadata formats they are able to consume, the best practices will also include guidance on retrieving the metadata in the desired metadata format using CMR's Unified Metadata Model translation software.
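For illustration, a minimal sketch of pulling recently updated collection metadata from the CMR search API with the requests library; the provider identifier and query values are placeholders, and the full harvesting workflow (paging, granules, format translation) is described in the CMR documentation.

    # Query the CMR search API for collections updated since a given date.
    import requests

    CMR_SEARCH = "https://cmr.earthdata.nasa.gov/search/collections.json"

    params = {
        "provider": "EXAMPLE_DAAC",                # hypothetical provider id
        "updated_since": "2017-01-01T00:00:00Z",   # only harvest changed records
        "page_size": 100,
    }

    resp = requests.get(CMR_SEARCH, params=params, timeout=60)
    resp.raise_for_status()

    for entry in resp.json()["feed"]["entry"]:
        print(entry["id"], entry.get("title", ""))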
Harvesting NASA's Common Metadata Repository
NASA Astrophysics Data System (ADS)
Shum, D.; Mitchell, A. E.; Durbin, C.; Norton, J.
2017-12-01
As part of NASA's Earth Observing System Data and Information System (EOSDIS), the Common Metadata Repository (CMR) stores metadata for over 30,000 datasets from both NASA and international providers, along with over 300M granules. This metadata enables sub-second discovery and facilitates data access. While the CMR offers robust temporal, spatial, and keyword search functionality to the general public and international community, it is sometimes more desirable for international partners to harvest the CMR metadata and merge it into their own existing metadata repositories. This poster will focus on best practices to follow when harvesting CMR metadata to ensure that any changes made to the CMR can also be updated in a partner's own repository. Additionally, since each partner has distinct metadata formats they are able to consume, the best practices will also include guidance on retrieving the metadata in the desired format using CMR's Unified Metadata Model translation software.
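As a rough sketch of such a harvesting loop, the Python below polls the public CMR search endpoint for collections updated since the last harvest and requests them in a chosen metadata format. The endpoint path, parameters (updated_since, page_size, page_num), the CMR-Hits header and the format extensions reflect the CMR search API as commonly documented, but should be verified against current CMR documentation; the provider name is a placeholder.

    import requests

    CMR_SEARCH = "https://cmr.earthdata.nasa.gov/search/collections"

    def harvest_updated_collections(provider, updated_since, fmt="umm_json", page_size=100):
        """Page through collections for one provider that changed after `updated_since`
        (ISO 8601 timestamp), asking CMR to return them in the requested format."""
        page = 1
        while True:
            resp = requests.get(
                f"{CMR_SEARCH}.{fmt}",               # e.g. .umm_json, .dif10, .iso19115 (assumed extensions)
                params={"provider": provider,
                        "updated_since": updated_since,
                        "page_size": page_size,
                        "page_num": page},
                timeout=60)
            resp.raise_for_status()
            yield resp                                # caller merges the payload into its own repository
            if int(resp.headers.get("CMR-Hits", 0)) <= page * page_size:
                break
            page += 1

    for response in harvest_updated_collections("EXAMPLE_PROVIDER", "2017-01-01T00:00:00Z"):
        print(len(response.content), "bytes of metadata harvested")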
MyOcean Internal Information System (Dial-P)
NASA Astrophysics Data System (ADS)
Blanc, Frederique; Jolibois, Tony; Loubrieu, Thomas; Manzella, Giuseppe; Mazzetti, Paolo; Nativi, Stefano
2010-05-01
MyOcean is a three-year project (2008-2011) whose goal is the development and pre-operational validation of the GMES Marine Core Service for ocean monitoring and forecasting. It is a transition project that will lead the European "operational oceanography" community towards the operational phase of a GMES European service, which demands more European integration, more operational capability, and more service. Observations, model-based data, and added-value products will be generated - and enhanced thanks to dedicated expertise - by the following production units: • Five Thematic Assembly Centers, each dealing with a specific set of observation data: Sea Level, Ocean Colour, Sea Surface Temperature, Sea Ice & Wind, and In Situ data; • Seven Monitoring and Forecasting Centers serving the Global Ocean, the Arctic area, the Baltic Sea, the Atlantic North-West shelves area, the Atlantic Iberian-Biscay-Ireland area, the Mediterranean Sea and the Black Sea. Intermediate and final users will discover, view and obtain the products by means of a central web desk, a central reactive manned service desk and thematic experts distributed across Europe. The MyOcean Information System (MIS) considers the various aspects of an interoperable, federated information system. Data models support data and computer systems by providing the definition and format of data. Whether such information can be included in the data file depends on the data model adopted. In general, little effort is devoted in the current project to developing a 'generic' data model; a strong push to develop a common model is provided by the EU INSPIRE Directive. At present, there is no single de facto data format for storing observational data. Data formats are still evolving, with their underlying data models moving towards the concept of Feature Types based on ISO/TC 211 standards. For example, Unidata is developing the Common Data Model, which can represent scientific data types such as point, trajectory, station and grid, and which will be implemented in the netCDF format. SeaDataNet recommends the ODV and netCDF formats. Another issue related to data curation and interoperability is the use of common vocabularies. Common vocabularies are developed in many international initiatives, such as GEMET (promoted by INSPIRE as a multilingual thesaurus), Unidata, SeaDataNet, and the Marine Metadata Initiative (MMI). MIS is considering the SeaDataNet vocabulary as a basis for interoperability. Four layers of interoperability, at different levels of abstraction, can be defined: - Technical/basic: implemented at each TAC or MFC through an internet connection and basic services for data transfer and browsing (e.g., FTP, HTTP). - Syntactic: allowing the interchange of metadata and protocol elements. This layer corresponds to the definition of a Core Metadata Set, the format of exchange/delivery for the data and associated metadata, and possible software. It is implemented by the DIAL-P logical interface (e.g., adoption of an INSPIRE-compliant metadata set and common data formats). - Functional/pragmatic: based on a common set of functional primitives or a common set of service definitions. This layer refers to the definition of services based on Web services standards and is implemented by the DIAL-P logical interface (e.g., adoption of INSPIRE-compliant network services). - Semantic: allowing access to similar classes of objects and services across multiple sites, with multilinguality of content as one specific aspect.
This layer corresponds to the MIS interface, terminology and thesaurus. Given the above requirements, the proposed solution is a federation of systems, where the individual participants are self-contained autonomous systems that together form a consistent wider picture. A mid-tier integration layer mediates between existing systems, adapting their data and service model schemas to the MIS. The developed MIS is a read-only system, i.e. it does not allow updating (or inserting) data in the participant resource systems. The main advantages of the proposed approach are: • to enable information sources to join the MIS and publish their data and metadata in a secure way, without any modification to their existing resources and procedures and without any restriction on their autonomy; • to enable users to browse and query the MIS, receiving an aggregated result incorporating relevant data and metadata from across different sources; • to accommodate the growth of such an MIS, in terms of both its clients and its information resources, as well as the evolution of the underlying data model.
Simplified Metadata Curation via the Metadata Management Tool
NASA Astrophysics Data System (ADS)
Shum, D.; Pilone, D.
2015-12-01
The Metadata Management Tool (MMT) is the newest capability developed as part of NASA Earth Observing System Data and Information System's (EOSDIS) efforts to simplify metadata creation and improve metadata quality. The MMT was developed via an agile methodology, taking into account inputs from GCMD's science coordinators and other end-users. In its initial release, the MMT uses the Unified Metadata Model for Collections (UMM-C) to allow metadata providers to easily create and update collection records in the ISO-19115 format. Through a simplified UI experience, metadata curators can create and edit collections without full knowledge of the NASA Best Practices implementation of ISO-19115 format, while still generating compliant metadata. More experienced users are also able to access raw metadata to build more complex records as needed. In future releases, the MMT will build upon recent work done in the community to assess metadata quality and compliance with a variety of standards through application of metadata rubrics. The tool will provide users with clear guidance as to how to easily change their metadata in order to improve their quality and compliance. Through these features, the MMT allows data providers to create and maintain compliant and high quality metadata in a short amount of time.
Enriched Video Semantic Metadata: Authorization, Integration, and Presentation.
ERIC Educational Resources Information Center
Mu, Xiangming; Marchionini, Gary
2003-01-01
Presents an enriched video metadata framework including video authorization using the Video Annotation and Summarization Tool (VAST), a video metadata authorization system that integrates both semantic and visual metadata; metadata integration; and user-level applications. Results demonstrated that the enriched metadata were seamlessly…
A data delivery system for IMOS, the Australian Integrated Marine Observing System
NASA Astrophysics Data System (ADS)
Proctor, R.; Roberts, K.; Ward, B. J.
2010-09-01
The Integrated Marine Observing System (IMOS, www.imos.org.au), an AUD 150 m 7-year project (2007-2013), is a distributed set of equipment and data-information services which, among many applications, collectively contribute to meeting the needs of marine climate research in Australia. The observing system provides data in the open oceans around Australia out to a few thousand kilometres as well as the coastal oceans through 11 facilities which effectively observe and measure the 4-dimensional ocean variability, and the physical and biological response of coastal and shelf seas around Australia. Through a national science rationale, IMOS is organized as five regional nodes (Western Australia - WAIMOS, South Australia - SAIMOS, Tasmania - TASIMOS, New South Wales - NSWIMOS and Queensland - QIMOS) surrounded by an oceanic node (Blue Water and Climate). Operationally, IMOS is organized as 11 facilities (Argo Australia, Ships of Opportunity, Southern Ocean Automated Time Series Observations, Australian National Facility for Ocean Gliders, Autonomous Underwater Vehicle Facility, Australian National Mooring Network, Australian Coastal Ocean Radar Network, Australian Acoustic Tagging and Monitoring System, Facility for Automated Intelligent Monitoring of Marine Systems, eMarine Information Infrastructure and Satellite Remote Sensing) delivering data. IMOS data are freely available to the public. The data, a combination of near real-time and delayed mode, are made available to researchers through the electronic Marine Information Infrastructure (eMII). eMII utilises the Australian Academic Research Network (AARNET) to support a distributed database on OPeNDAP/THREDDS servers hosted by regional computing centres. IMOS instruments are described using the OGC SensorML specification and, wherever possible, data are provided in CF-compliant netCDF format. Metadata conforming to the ISO 19115 standard are automatically harvested from the netCDF files, and the metadata records are catalogued in the OGC GeoNetwork Metadata Entry and Search Tool (MEST). Data discovery, access and download occur via web services through the IMOS Ocean Portal (http://imos.aodn.org.au), and tools for the display and integration of near real-time data are in development.
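Harvesting discovery metadata from netCDF global attributes can be sketched as below with the netCDF4 Python package; the ACDD-style attribute names and the file name are assumptions for illustration, not necessarily what eMII uses.

    from netCDF4 import Dataset   # pip install netCDF4

    # Attribute names follow the ACDD convention often used alongside CF; a given
    # file may use different names, so treat this list as an assumption.
    DISCOVERY_ATTRS = ["title", "summary", "institution",
                       "geospatial_lat_min", "geospatial_lat_max",
                       "geospatial_lon_min", "geospatial_lon_max",
                       "time_coverage_start", "time_coverage_end"]

    def extract_discovery_metadata(path):
        """Return a flat dict of global attributes useful for an ISO 19115-style record."""
        record = {}
        with Dataset(path) as nc:
            present = set(nc.ncattrs())
            for name in DISCOVERY_ATTRS:
                if name in present:
                    record[name] = getattr(nc, name)
            # Variable-level CF metadata (standard_name, units) helps populate keywords.
            record["variables"] = {
                var: {"standard_name": getattr(v, "standard_name", None),
                      "units": getattr(v, "units", None)}
                for var, v in nc.variables.items()}
        return record

    print(extract_discovery_metadata("example_mooring_timeseries.nc"))  # hypothetical file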
Assessing Metadata Quality of a Federally Sponsored Health Data Repository.
Marc, David T; Beattie, James; Herasevich, Vitaly; Gatewood, Laël; Zhang, Rui
2016-01-01
The U.S. Federal Government developed HealthData.gov to disseminate healthcare datasets to the public. Metadata are provided for each dataset and are the sole source of information for finding and retrieving the data. This study employed automated quality assessments of the HealthData.gov metadata published from 2012 to 2014 to measure completeness, accuracy, and consistency of applying standards. The results demonstrated that metadata published in earlier years had lower completeness, accuracy, and consistency. Also, metadata that underwent modifications following their original creation were of higher quality. HealthData.gov did not uniformly apply the Dublin Core Metadata Initiative standard, a widely accepted metadata standard, to its metadata. These findings suggested that the HealthData.gov metadata suffered from quality issues, particularly related to information that wasn't frequently updated. The results supported the need for policies to standardize metadata and contributed to the development of automated measures of metadata quality.
Assessing Metadata Quality of a Federally Sponsored Health Data Repository
Marc, David T.; Beattie, James; Herasevich, Vitaly; Gatewood, Laël; Zhang, Rui
2016-01-01
The U.S. Federal Government developed HealthData.gov to disseminate healthcare datasets to the public. Metadata are provided for each dataset and are the sole source of information for finding and retrieving the data. This study employed automated quality assessments of the HealthData.gov metadata published from 2012 to 2014 to measure completeness, accuracy, and consistency of applying standards. The results demonstrated that metadata published in earlier years had lower completeness, accuracy, and consistency. Also, metadata that underwent modifications following their original creation were of higher quality. HealthData.gov did not uniformly apply the Dublin Core Metadata Initiative standard, a widely accepted metadata standard, to its metadata. These findings suggested that the HealthData.gov metadata suffered from quality issues, particularly related to information that wasn't frequently updated. The results supported the need for policies to standardize metadata and contributed to the development of automated measures of metadata quality. PMID:28269883
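An automated completeness check of the kind described can be sketched in a few lines of Python; the required-field list below is illustrative, not the study's actual criteria.

    # A record is "complete" to the degree that required fields are present and non-empty.
    REQUIRED_FIELDS = ["title", "description", "publisher", "modified",
                       "license", "keyword", "contactPoint"]   # assumed required set

    def completeness(record, required=REQUIRED_FIELDS):
        """Fraction of required fields that are populated in one metadata record."""
        filled = sum(1 for f in required if str(record.get(f, "")).strip())
        return filled / len(required)

    records = [
        {"title": "Hospital charges", "description": "FY2012 charges", "publisher": "HHS",
         "modified": "2013-05-01", "keyword": "billing"},
        {"title": "Immunization rates"},
    ]

    for i, rec in enumerate(records, 1):
        print(f"record {i}: completeness = {completeness(rec):.2f}")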
Partnerships To Mine Unexploited Sources of Metadata.
ERIC Educational Resources Information Center
Reynolds, Regina Romano
This paper discusses the metadata created for other purposes as a potential source of bibliographic data. The first section addresses collecting metadata by means of templates, including the Nordic Metadata Project's Dublin Core Metadata Template. The second section considers potential partnerships for re-purposing metadata for bibliographic use,…
Enhanced Soundings for Local Coupling Studies: 2015 ARM Climate Research Facility Field Campaign
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ferguson, CR; Santanello, JA; Gentine, P
2015-11-01
Matching observed diurnal cycles is a fundamental yet extremely complex test for models. High temporal resolution measurements of surface turbulent heat fluxes and boundary layer properties are required to evaluate the daytime evolution of the boundary layer and its sensitivity to land-atmosphere coupling. To address this need, 12 one-day intensive observing periods (IOPs) with enhanced radiosonding will be carried out at the ARM Southern Great Plains (SGP) Central Facility (CF) during summer 2015. Each IOP will comprise a single launch to correspond with the nighttime overpass of the A-Train of satellites (~0830 UTC) and hourly launches during daytime beginning from 1130 UTC and ending at 2130 UTC. At 3-hourly intervals (i.e., 1140 UTC, 1440 UTC, 1740 UTC, and 2040 UTC) a duplicate second radiosonde will be launched 10 minutes after the on-hour radiosonde to assess horizontal atmospheric variability. In summary, each IOP will add a 14-sounding supplement to the 6-hourly operational sounding schedule at the ARM-SGP CF. The IOP days will be decided before sunset on the preceding day, according to the judgment of the PIs and taking into consideration daily weather forecasts and the operability of complementary ARM-SGP CF instrumentation. An overarching goal of the project is to address how ARM could better observe land-atmosphere coupling to support the evaluation and refinement of coupled weather and climate models.
A Window to the World: Lessons Learned from NASA's Collaborative Metadata Curation Effort
NASA Astrophysics Data System (ADS)
Bugbee, K.; Dixon, V.; Baynes, K.; Shum, D.; le Roux, J.; Ramachandran, R.
2017-12-01
Well-written descriptive metadata adds value to data by making them easier to discover and increases their use by providing the context or appropriateness of use. While many data centers acknowledge the importance of correct, consistent and complete metadata, allocating resources to curate existing metadata is often difficult. To lower resource costs, many data centers seek guidance on best practices for curating metadata but struggle to identify such recommendations. To assist data centers in curating metadata and also to develop best practices for creating and maintaining metadata, NASA has formed a collaborative effort to improve the Earth Observing System Data and Information System (EOSDIS) metadata in the Common Metadata Repository (CMR). This effort has taken significant steps in building consensus around metadata curation best practices. However, it has also revealed gaps in EOSDIS enterprise policies and procedures within the core metadata curation task. This presentation will explore the mechanisms used for building consensus on metadata curation, the gaps identified in policies and procedures, the lessons learned from collaborating with both the data centers and the metadata curation teams, and the proposed next steps for the future.
A Domain Description Language for Data Processing
NASA Technical Reports Server (NTRS)
Golden, Keith
2003-01-01
We discuss an application of planning to data processing, a planning problem which poses unique challenges for domain description languages. We discuss these challenges and why the current PDDL standard does not meet them. We discuss DPADL (Data Processing Action Description Language), a language for describing planning domains that involve data processing. DPADL is a declarative, object-oriented language that supports constraints and embedded Java code, object creation and copying, explicit inputs and outputs for actions, and metadata descriptions of existing and desired data. DPADL is supported by the IMAGEbot system, which we are using to provide automation for an ecological forecasting application. We compare DPADL to PDDL and discuss changes that could be made to PDDL to make it more suitable for representing planning domains that involve data processing actions.
Evaluating and Evolving Metadata in Multiple Dialects
NASA Astrophysics Data System (ADS)
Kozimor, J.; Habermann, T.; Powers, L. A.; Gordon, S.
2016-12-01
Despite many long-term homogenization efforts, communities continue to develop focused metadata standards along with related recommendations and (typically) XML representations (aka dialects) for sharing metadata content. Different representations easily become obstacles to sharing information because each representation generally requires a set of tools and skills that are designed, built, and maintained specifically for that representation. In contrast, community recommendations are generally described, at least initially, at a more conceptual level and are more easily shared. For example, most communities agree that dataset titles should be included in metadata records although they write the titles in different ways. This situation has led to the development of metadata repositories that can ingest and output metadata in multiple dialects. As an operational example, the NASA Common Metadata Repository (CMR) includes three different metadata dialects (DIF, ECHO, and ISO 19115-2). These systems raise a new question for metadata providers: if I have a choice of metadata dialects, which should I use and how do I make that decision? We have developed a collection of metadata evaluation tools that can be used to evaluate metadata records in many dialects for completeness with respect to recommendations from many organizations and communities. We have applied these tools to over 8000 collection and granule metadata records in four different dialects. This large collection of identical content in multiple dialects enables us to address questions about metadata and dialect evolution and to answer those questions quantitatively. We will describe those tools and results from evaluating the NASA CMR metadata collection.
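One way such cross-dialect evaluation can work is sketched below: each conceptual recommendation (for example, "a record should have a title") is mapped to a dialect-specific location and checked per record. The XPath expressions are illustrative only; real DIF, ECHO and ISO 19115-2 paths differ and require namespace handling.

    import xml.etree.ElementTree as ET

    # Illustrative concept-to-XPath maps; the real DIF/ECHO/ISO 19115-2 paths differ
    # and usually need namespace handling.
    DIALECT_PATHS = {
        "dif":  {"title": "./Entry_Title", "abstract": "./Summary/Abstract"},
        "echo": {"title": "./DataSetId",   "abstract": "./Description"},
    }

    def evaluate(xml_text, dialect, concepts=("title", "abstract")):
        """Report which recommended concepts are present in one record of a given dialect."""
        root = ET.fromstring(xml_text)
        paths = DIALECT_PATHS[dialect]
        found = {c: root.find(paths[c]) is not None for c in concepts}
        score = sum(found.values()) / len(concepts)
        return found, score

    dif_record = "<DIF><Entry_Title>Sea surface temperature</Entry_Title></DIF>"
    print(evaluate(dif_record, "dif"))   # title present, abstract missing -> score 0.5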
The TDR: A Repository for Long Term Storage of Geophysical Data and Metadata
NASA Astrophysics Data System (ADS)
Wilson, A.; Baltzer, T.; Caron, J.
2006-12-01
For many years Unidata has provided easy, low cost data access to universities and research labs. Historically Unidata technology provided access to data in near real time. In recent years Unidata has additionally turned to providing middleware to serve longer term data and associated metadata via its THREDDS technology, the most recent offering being the THREDDS Data Server (TDS). The TDS provides middleware for metadata access and management, OPeNDAP data access, and integration with the Unidata Integrated Data Viewer (IDV), among other benefits. The TDS was designed to support rolling archives of data, that is, data that exist only for a relatively short, predefined time window. Now we are creating an addition to the TDS, called the THREDDS Data Repository (TDR), which allows users to store and retrieve data and other objects for an arbitrarily long time period. Data in the TDR can also be served by the TDS. The TDR performs important functions of locating storage for the data, moving the data to and from the repository, assigning unique identifiers, and generating metadata. The TDR framework supports pluggable components that allow tailoring an implementation for a particular application. The Linked Environments for Atmospheric Discovery (LEAD) project provides an excellent use case for the TDR. LEAD is a multi-institutional Large Information Technology Research project funded by the National Science Foundation (NSF). The goal of LEAD is to create a framework based on Grid and Web Services to support mesoscale meteorology research and education. This includes capabilities such as launching forecast models, mining data for meteorological phenomena, and dynamic workflows that are automatically reconfigurable in response to changing weather. LEAD presents unique challenges in managing and storing large data volumes from real-time observational systems as well as data that are dynamically created during the execution of adaptive workflows. For example, in order to support storage of many large data products, the LEAD implementation of the TDR will provide a variety of data movement options, including gridftp. It will have a web service interface and will be callable programmatically as well as via interactive user requests. Future plans include the use of a mass storage device to provide robust long term storage. This talk will present the current state of the TDR effort.
EOS ODL Metadata On-line Viewer
NASA Astrophysics Data System (ADS)
Yang, J.; Rabi, M.; Bane, B.; Ullman, R.
2002-12-01
We have recently developed and deployed an EOS ODL metadata on-line viewer. The EOS ODL metadata viewer is a web server that takes 1) an EOS metadata file in Object Description Language (ODL) and 2) parameters, such as which metadata to view and what style of display to use, and returns an HTML or XML document displaying the requested metadata in the requested style. This tool was developed to address widespread complaints by the science community that the EOS Data and Information System (EOSDIS) metadata files in ODL are difficult to read; it allows users to upload and view an ODL metadata file in different styles using a web browser. Users can choose to view all of the metadata or only part of it, such as Collection metadata, Granule metadata, or Unsupported Metadata. Choices of display styles include 1) Web: a mouseable display with tabs and turn-down menus; 2) Outline: formatted and colored text, suitable for printing; 3) Generic: simple indented text, a direct representation of the underlying ODL metadata; and 4) None: no stylesheet is applied and the XML generated by the converter is returned directly. Not all display styles are implemented for all the metadata choices. For example, Web style is only implemented for Collection and Granule metadata groups with known attribute fields, but not for Unsupported, Other, and All metadata. The overall strategy of the ODL viewer is to transform an ODL metadata file into viewable HTML in two steps. The first step converts the ODL metadata file to XML using a Java-based parser/translator called ODL2XML. The second step transforms the XML to HTML using stylesheets. Both operations are done on the server side. This allows considerable flexibility in the final result and is highly portable across platforms. Perl CGI behind the Apache web server is used to run the Java ODL2XML and then pass the results through an XSLT processor. The EOS ODL viewer can be accessed from either a PC or a Mac using Internet Explorer 5.0+ or Netscape 4.7+.
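The first step of the pipeline (ODL to XML) can be illustrated with a heavily simplified, line-based converter in Python; the real ODL2XML translator is a Java parser and handles far more of the ODL language than this sketch.

    import xml.etree.ElementTree as ET

    def odl_to_xml(odl_text):
        """Convert a (very) simplified subset of ODL into an XML element tree:
        GROUP/OBJECT blocks become nested elements, KEY = VALUE pairs become leaves."""
        root = ET.Element("ODL")
        stack = [root]
        for raw in odl_text.splitlines():
            line = raw.strip()
            if not line or line == "END":
                continue
            key, _, value = line.partition("=")
            key, value = key.strip(), value.strip().strip('"')
            if key in ("GROUP", "OBJECT"):
                stack.append(ET.SubElement(stack[-1], value))
            elif key in ("END_GROUP", "END_OBJECT"):
                stack.pop()
            else:
                ET.SubElement(stack[-1], key).text = value
        return root

    sample = '''GROUP = INVENTORYMETADATA
      OBJECT = RANGEDATETIME
        RANGEBEGINNINGDATE = "2002-07-04"
      END_OBJECT = RANGEDATETIME
    END_GROUP = INVENTORYMETADATA
    END'''
    print(ET.tostring(odl_to_xml(sample), encoding="unicode"))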
ISO 19115 Experiences in NASA's Earth Observing System (EOS) ClearingHOuse (ECHO)
NASA Astrophysics Data System (ADS)
Cechini, M. F.; Mitchell, A.
2011-12-01
Metadata is an important entity in the process of cataloging, discovering, and describing earth science data. As science research and the gathered data increase in complexity, so do the complexity and importance of descriptive metadata. To meet these growing needs, the required metadata models utilize richer and more mature metadata attributes. Categorizing, standardizing, and promulgating these metadata models to a politically, geographically, and scientifically diverse community is a difficult process. An integral component of metadata management within NASA's Earth Observing System Data and Information System (EOSDIS) is the Earth Observing System (EOS) ClearingHOuse (ECHO). ECHO is the core metadata repository for the EOSDIS data centers, providing a centralized mechanism for metadata and data discovery and retrieval. ECHO has undertaken an internal restructuring to meet the changing needs of scientists, the consistent advancement in technology, and the advent of new standards such as ISO 19115. These improvements were based on the following tenets for data discovery and retrieval: + There exists a set of 'core' metadata fields recommended for data discovery. + There exists a set of users who will require the entire metadata record for advanced analysis. + There exists a set of users who will require a 'core' set of metadata fields for discovery only. + There will never be a cessation of new formats or a total retirement of all old formats. + Users should be presented metadata in a consistent format of their choosing. In order to address the previously listed items, ECHO's new metadata processing paradigm utilizes the following approach: + Identify a cross-format set of 'core' metadata fields necessary for discovery. + Implement format-specific indexers to extract the 'core' metadata fields into an optimized query capability. + Archive the original metadata in its entirety for presentation to users requiring the full record. + Provide on-demand translation of 'core' metadata to any supported result format. Lessons learned by the ECHO team while implementing its new metadata approach to support usage of the ISO 19115 standard will be presented. These lessons learned highlight some discovered strengths and weaknesses in the ISO 19115 standard as it is introduced to an existing metadata processing system.
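The listed approach (core fields, format-specific indexers, archived originals, on-demand translation) can be sketched roughly as below; the function names, field mappings and records are illustrative, not ECHO's actual code.

    # Each supported dialect registers a small indexer that pulls out the cross-format
    # 'core' discovery fields; the original record is archived untouched.
    def index_echo10(rec):   # rec: dict parsed from an ECHO10 record (illustrative)
        return {"title": rec.get("DataSetId"), "start": rec.get("Temporal", {}).get("Begin")}

    def index_iso19115(rec):
        return {"title": rec.get("citation_title"), "start": rec.get("temporal_begin")}

    INDEXERS = {"echo10": index_echo10, "iso19115": index_iso19115}
    ARCHIVE, INDEX = {}, {}                      # full records vs. optimized core-field index

    def ingest(record_id, dialect, record):
        ARCHIVE[record_id] = (dialect, record)         # keep the full record for advanced users
        INDEX[record_id] = INDEXERS[dialect](record)   # core fields only, for discovery

    def search(keyword):
        return [rid for rid, core in INDEX.items()
                if keyword.lower() in (core.get("title") or "").lower()]

    ingest("C1", "echo10", {"DataSetId": "MODIS Snow Cover", "Temporal": {"Begin": "2000-02-24"}})
    ingest("C2", "iso19115", {"citation_title": "AIRS Level 2", "temporal_begin": "2002-08-30"})
    print(search("modis"), ARCHIVE["C1"][0])     # discovery via core index; full record still available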
Willoughby, Cerys; Bird, Colin L; Coles, Simon J; Frey, Jeremy G
2014-12-22
The drive toward more transparency in research, the growing willingness to make data openly available, and the reuse of data to maximize the return on research investment all increase the importance of being able to find information and make links to the underlying data. The use of metadata in Electronic Laboratory Notebooks (ELNs) to curate experiment data is an essential ingredient for facilitating discovery. The University of Southampton has developed a Web browser-based ELN that enables users to add their own metadata to notebook entries. A survey of these notebooks was completed to assess user behavior and patterns of metadata usage within ELNs, while user perceptions and expectations were gathered through interviews and user-testing activities within the community. The findings, together with feedback from the user community, indicate that while a few groups are comfortable with metadata and are able to design a metadata structure that works effectively, many users make little attempt to use it, adopting a "minimum required" approach to metadata and thereby endangering their ability to recover data in the future. To investigate whether the patterns of metadata use in LabTrove were unusual, a series of surveys was undertaken to investigate metadata usage in a variety of platforms supporting user-defined metadata. These surveys also provided the opportunity to investigate whether interface designs in these other environments might inform strategies for encouraging metadata creation and more effective use of metadata in LabTrove.
ASDC Collaborations and Processes to Ensure Quality Metadata and Consistent Data Availability
NASA Astrophysics Data System (ADS)
Trapasso, T. J.
2017-12-01
With the introduction of new tools, faster computing, and less expensive storage, increased volumes of data are expected to be managed with existing or fewer resources. Metadata management is becoming a heightened challenge as data volumes increase, resulting in more metadata records that need to be curated for each product. To address metadata availability and completeness, NASA ESDIS has taken significant strides with the creation of the Unified Metadata Model (UMM) and the Common Metadata Repository (CMR). The UMM helps address hurdles posed by the increasing number of metadata dialects, and the CMR provides a primary repository for metadata so that required metadata fields can be served through a growing number of tools and services. However, metadata quality remains an issue, as metadata is not always inherent to the end user. In response to these challenges, the NASA Atmospheric Science Data Center (ASDC) created the Collaboratory for quAlity Metadata Preservation (CAMP) and defined the Product Lifecycle Process (PLP) to work congruently. CAMP is unique in that it provides science team members with a UI to directly supply metadata that is complete, compliant, and accurate for their data products. This replaces back-and-forth communication that often results in misinterpreted metadata. Upon review by ASDC staff, metadata is submitted to CMR for broader distribution through Earthdata. Further, approval of science team metadata in CAMP automatically triggers the ASDC PLP workflow to ensure appropriate services are applied throughout the product lifecycle. This presentation will review the design elements of CAMP and PLP as well as demonstrate interfaces to each. It will show the benefits that CAMP and PLP provide to the ASDC that could potentially benefit additional NASA Earth Science Data and Information System (ESDIS) Distributed Active Archive Centers (DAACs).
Metadata squared: enhancing its usability for volunteered geographic information and the GeoWeb
Poore, Barbara S.; Wolf, Eric B.; Sui, Daniel Z.; Elwood, Sarah; Goodchild, Michael F.
2013-01-01
The Internet has brought many changes to the way geographic information is created and shared. One aspect that has not changed is metadata. Static spatial data quality descriptions were standardized in the mid-1990s and cannot accommodate the current climate of data creation where nonexperts are using mobile phones and other location-based devices on a continuous basis to contribute data to Internet mapping platforms. The usability of standard geospatial metadata is being questioned by academics and neogeographers alike. This chapter analyzes current discussions of metadata to demonstrate how the media shift that is occurring has affected requirements for metadata. Two case studies of metadata use are presented—online sharing of environmental information through a regional spatial data infrastructure in the early 2000s, and new types of metadata that are being used today in OpenStreetMap, a map of the world created entirely by volunteers. Changes in metadata requirements are examined for usability, the ease with which metadata supports coproduction of data by communities of users, how metadata enhances findability, and how the relationship between metadata and data has changed. We argue that traditional metadata associated with spatial data infrastructures is inadequate and suggest several research avenues to make this type of metadata more interactive and effective in the GeoWeb.
Evolutions in Metadata Quality
NASA Astrophysics Data System (ADS)
Gilman, J.
2016-12-01
Metadata Quality is one of the chief drivers of discovery and use of NASA EOSDIS (Earth Observing System Data and Information System) data. Issues with metadata such as lack of completeness, inconsistency, and use of legacy terms directly hinder data use. As the central metadata repository for NASA Earth Science data, the Common Metadata Repository (CMR) has a responsibility to its users to ensure the quality of CMR search results. This talk will cover how we encourage metadata authors to improve the metadata through the use of integrated rubrics of metadata quality and outreach efforts. In addition we'll demonstrate Humanizers, a technique for dealing with the symptoms of metadata issues. Humanizers allow CMR administrators to identify specific metadata issues that are fixed at runtime when the data is indexed. An example Humanizer is the aliasing of processing level "Level 1" to "1" to improve consistency across collections. The CMR currently indexes 35K collections and 300M granules.
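A humanizer can be sketched as a list of match-and-replace rules applied when a record is indexed, leaving the provider's stored record untouched; the rule format below is hypothetical, though the "Level 1" to "1" alias comes from the example above.

    # Each humanizer targets one field and rewrites known problem values at index time.
    HUMANIZERS = [
        {"field": "processing_level", "match": "Level 1", "replace_with": "1"},
        {"field": "processing_level", "match": "Level 1A", "replace_with": "1A"},
        {"field": "platform", "match": "AQUA", "replace_with": "Aqua"},
    ]

    def humanize(record):
        """Return the record as it should be indexed; the stored original is not modified."""
        indexed = dict(record)
        for rule in HUMANIZERS:
            if indexed.get(rule["field"]) == rule["match"]:
                indexed[rule["field"]] = rule["replace_with"]
        return indexed

    original = {"short_name": "MYD021KM", "platform": "AQUA", "processing_level": "Level 1"}
    print(humanize(original))   # {'short_name': 'MYD021KM', 'platform': 'Aqua', 'processing_level': '1'}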
Collaboration tools and techniques for large model datasets
Signell, R.P.; Carniel, S.; Chiggiato, J.; Janekovic, I.; Pullen, J.; Sherwood, C.R.
2008-01-01
In MREA and many other marine applications, it is common to have multiple models running with different grids, run by different institutions. Techniques and tools are described for low-bandwidth delivery of data from large multidimensional datasets, such as those from meteorological and oceanographic models, directly into generic analysis and visualization tools. Output is stored using the NetCDF CF Metadata Conventions, and then delivered to collaborators over the web via OPeNDAP. OPeNDAP datasets served by different institutions are then organized via THREDDS catalogs. Tools and procedures then enable scientists to explore data on the original model grids using tools they are familiar with. The approach is also low-bandwidth, enabling users to extract just the data they require, an important feature for access from ships or remote areas. The entire implementation is simple enough to be handled by modelers working with their webmasters; no advanced programming support is necessary. © 2007 Elsevier B.V. All rights reserved.
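The low-bandwidth behaviour comes from OPeNDAP transferring only the slice a client requests. A sketch with xarray against a hypothetical THREDDS/OPeNDAP endpoint is shown below; the URL, variable and dimension names are assumptions, and a netCDF4/DAP-enabled backend is required.

    import xarray as xr

    # Hypothetical OPeNDAP endpoint served by a THREDDS catalog; variable and
    # dimension names below are assumptions about that dataset.
    URL = "https://example.org/thredds/dodsC/ocean_model/forecast.nc"

    ds = xr.open_dataset(URL)            # lazy open: only metadata/coordinates are read
    temp = ds["temperature"]
    print(temp.attrs.get("units"), temp.dims)

    # Only this slice (last time step, surface level, a small index window) is
    # transferred over the network when .load() is called.
    subset = temp.isel(time=-1, depth=0,
                       lat=slice(100, 150), lon=slice(200, 260)).load()
    print(subset.shape)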
Metadata Means Communication: The Challenges of Producing Useful Metadata
NASA Astrophysics Data System (ADS)
Edwards, P. N.; Batcheller, A. L.
2010-12-01
Metadata are increasingly perceived as an important component of data sharing systems. For instance, metadata accompanying atmospheric model output may indicate the grid size, grid type, and parameter settings used in the model configuration. We conducted a case study of a data portal in the atmospheric sciences using in-depth interviews, document review, and observation. Our analysis revealed a number of challenges in producing useful metadata. First, creating and managing metadata required considerable effort and expertise, yet responsibility for these tasks was ill-defined and diffused among many individuals, leading to errors, failure to capture metadata, and uncertainty about the quality of the primary data. Second, metadata ended up stored in many different forms and software tools, making it hard to manage versions and transfer between formats. Third, the exact meanings of metadata categories remained unsettled and misunderstood even among a small community of domain experts -- an effect we expect to be exacerbated when scientists from other disciplines wish to use these data. In practice, we found that metadata problems due to these obstacles are often overcome through informal, personal communication, such as conversations or email. We conclude that metadata serve to communicate the context of data production from the people who produce data to those who wish to use it. Thus while formal metadata systems are often public, critical elements of metadata (those embodied in informal communication) may never be recorded. Therefore, efforts to increase data sharing should include ways to facilitate inter-investigator communication. Instead of tackling metadata challenges only on the formal level, we can improve data usability for broader communities by better supporting metadata communication.
Inheritance rules for Hierarchical Metadata Based on ISO 19115
NASA Astrophysics Data System (ADS)
Zabala, A.; Masó, J.; Pons, X.
2012-04-01
Mainly, ISO 19115 has been used to describe metadata for datasets and services. Furthermore, the ISO 19115 standard (as well as the new draft ISO 19115-1) includes a conceptual model that allows metadata to be described at different levels of granularity, structured in hierarchical levels, both for aggregated resources such as series and datasets, and for more disaggregated resources such as types of entities (feature type), types of attributes (attribute type), entities (feature instances) and attributes (attribute instances). In theory, it is possible to apply a complete metadata structure to all hierarchical levels of metadata, from a whole series down to an individual feature attribute, but storing all metadata at all levels is completely impractical. An inheritance mechanism is needed to store each metadata and quality element at the optimum hierarchical level and to allow easy and efficient documentation of metadata, both in an Earth observation scenario such as multi-satellite mission multiband imagery, and in a complex vector topographical map that includes several feature types separated in layers (e.g. administrative limits, contour lines, edification polygons, road lines, etc.). Moreover, owing to the traditional split of maps into tiles for map handling at detailed scales, or owing to satellite characteristics, each of the previous thematic layers (e.g. 1:5000 roads for a country) or bands (the Landsat-5 TM cover of the Earth) is tiled into several parts (sheets or scenes, respectively). According to the hierarchy in ISO 19115, the definition of general metadata can be supplemented by spatially specific metadata that, when required, either inherits or overrides the general case (G.1.3). Annex H of the standard states that only metadata exceptions are defined at lower levels, so it is not necessary to generate the full registry of metadata for each level but only to link particular values to the general values that they inherit. Conceptually the metadata registry is complete for each hierarchical level, but at the implementation level most of the metadata elements are not stored at both levels but only at the more generic one. This communication defines a metadata system that covers four levels, describes which metadata have to support series-layer inheritance and in which way, and how hierarchical levels are defined and stored. Metadata elements are classified according to the type of inheritance between products, series, tiles and datasets. It explains the metadata element classification and exemplifies it using core metadata elements. The communication also presents a metadata viewing and editing tool that uses the described model to propagate metadata elements and to show the user a complete set of metadata for each level in a transparent way. This tool is integrated in the MiraMon GIS software.
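The inheritance rule, only exceptions stored at lower levels and everything else inherited from the level above, can be expressed as a lookup that walks from the most specific level upwards; the sketch below is minimal and is not the MiraMon implementation.

    from collections import ChainMap

    # Only exceptions are stored at lower levels; everything else is inherited.
    series_md  = {"spatial_ref": "EPSG:25831", "lineage": "Landsat-5 TM, processed 2010",
                  "cloud_cover_pct": None}
    dataset_md = {"acquisition_date": "2009-07-14"}            # per-scene additions
    tile_md    = {"cloud_cover_pct": 12}                        # overrides the series default

    def resolve(*levels_most_specific_first):
        """Return the effective metadata for the most specific level, applying inheritance."""
        return dict(ChainMap(*levels_most_specific_first))

    effective = resolve(tile_md, dataset_md, series_md)
    print(effective["spatial_ref"], effective["cloud_cover_pct"])   # inherited vs. overridden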
The role of metadata in managing large environmental science datasets. Proceedings
DOE Office of Scientific and Technical Information (OSTI.GOV)
Melton, R.B.; DeVaney, D.M.; French, J. C.
1995-06-01
The purpose of this workshop was to bring together computer science researchers and environmental sciences data management practitioners to consider the role of metadata in managing large environmental sciences datasets. The objectives included: establishing a common definition of metadata; identifying categories of metadata; defining problems in managing metadata; and defining problems related to linking metadata with primary data.
Building Format-Agnostic Metadata Repositories
NASA Astrophysics Data System (ADS)
Cechini, M.; Pilone, D.
2010-12-01
This presentation will discuss the problems that surround persisting and discovering metadata in multiple formats; a set of tenets that must be addressed in a solution; and NASA’s Earth Observing System (EOS) ClearingHOuse’s (ECHO) proposed approach. In order to facilitate cross-discipline data analysis, Earth Scientists will potentially interact with more than one data source. The most common data discovery paradigm relies on services and/or applications facilitating the discovery and presentation of metadata. What may not be common are the formats in which the metadata are formatted. As the number of sources and datasets utilized for research increases, it becomes more likely that a researcher will encounter conflicting metadata formats. Metadata repositories, such as the EOS ClearingHOuse (ECHO), along with data centers, must identify ways to address this issue. In order to define the solution to this problem, the following tenets are identified: - There exists a set of ‘core’ metadata fields recommended for data discovery. - There exists a set of users who will require the entire metadata record for advanced analysis. - There exists a set of users who will require a ‘core’ set of metadata fields for discovery only. - There will never be a cessation of new formats or a total retirement of all old formats. - Users should be presented metadata in a consistent format. ECHO has undertaken an effort to transform its metadata ingest and discovery services in order to support the growing set of metadata formats. In order to address the previously listed items, ECHO’s new metadata processing paradigm utilizes the following approach: - Identify a cross-format set of ‘core’ metadata fields necessary for discovery. - Implement format-specific indexers to extract the ‘core’ metadata fields into an optimized query capability. - Archive the original metadata in its entirety for presentation to users requiring the full record. - Provide on-demand translation of ‘core’ metadata to any supported result format. With this identified approach, the Earth Scientist is provided with a consistent data representation as they interact with a variety of datasets that utilize multiple metadata formats. They are then able to focus their efforts on the more critical research activities which they are undertaking.
Making Metadata Better with CMR and MMT
NASA Technical Reports Server (NTRS)
Gilman, Jason Arthur; Shum, Dana
2016-01-01
Ensuring complete, consistent and high quality metadata is a challenge for metadata providers and curators. The CMR and MMT systems provide providers and curators options to build in metadata quality from the start and also assess and improve the quality of already existing metadata.
Evolution in Metadata Quality: Common Metadata Repository's Role in NASA Curation Efforts
NASA Technical Reports Server (NTRS)
Gilman, Jason; Shum, Dana; Baynes, Katie
2016-01-01
Metadata Quality is one of the chief drivers of discovery and use of NASA EOSDIS (Earth Observing System Data and Information System) data. Issues with metadata such as lack of completeness, inconsistency, and use of legacy terms directly hinder data use. As the central metadata repository for NASA Earth Science data, the Common Metadata Repository (CMR) has a responsibility to its users to ensure the quality of CMR search results. This poster covers how we use humanizers, a technique for dealing with the symptoms of metadata issues, as well as our plans for future metadata validation enhancements. The CMR currently indexes 35K collections and 300M granules.
Patridge, Jeff; Namulanda, Gonza
2008-01-01
The Environmental Public Health Tracking (EPHT) Network provides an opportunity to bring together diverse environmental and health effects data by integrating local, state, and national databases of environmental hazards, environmental exposures, and health effects. To help users locate data on the EPHT Network, the network will utilize descriptive metadata that provide critical information as to the purpose, location, content, and source of these data. Since 2003, the Centers for Disease Control and Prevention's EPHT Metadata Subgroup has been working to initiate the creation and use of descriptive metadata. Efforts undertaken by the group include the adoption of a metadata standard, creation of an EPHT-specific metadata profile, development of an open-source metadata creation tool, and promotion of the creation of descriptive metadata by changing the perception of metadata in the public health culture.
Metadata: Standards for Retrieving WWW Documents (and Other Digitized and Non-Digitized Resources)
NASA Astrophysics Data System (ADS)
Rusch-Feja, Diann
The use of metadata for indexing digitized and non-digitized resources for resource discovery in a networked environment is being increasingly implemented all over the world. Greater precision is achieved using metadata than by relying on universal search engines and, furthermore, metadata can be used as a filtering mechanism for search results. An overview of various metadata sets is given, followed by a more focussed presentation of Dublin Core Metadata, including examples of sub-elements and qualifiers. In particular, the use of the Dublin Core Relation element provides connections between the metadata of various related electronic resources, as well as the metadata for physical, non-digitized resources. This facilitates more comprehensive search results without losing precision and brings together different genres of information which would otherwise be searchable only in separate databases. Furthermore, the advantages of Dublin Core Metadata in comparison with library cataloging and the use of universal search engines are discussed briefly, followed by a listing of types of implementation of Dublin Core Metadata.
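A minimal Dublin Core record showing the Relation element linking an electronic dataset to related resources can be built as below; the namespace URI is the standard DCMES 1.1 one, while the record content and URLs are invented.

    import xml.etree.ElementTree as ET

    DC = "http://purl.org/dc/elements/1.1/"
    ET.register_namespace("dc", DC)

    record = ET.Element("metadata")
    for element, value in [
            ("title",    "Sea ice extent maps, 1979-2000"),      # invented example content
            ("creator",  "Example Research Group"),
            ("type",     "Dataset"),
            ("relation", "https://example.org/reports/sea-ice-atlas"),   # related, non-digitized atlas
            ("relation", "https://example.org/data/sea-ice-gridded.nc")]:
        ET.SubElement(record, f"{{{DC}}}{element}").text = value

    print(ET.tostring(record, encoding="unicode"))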
Obuch, Raymond C.; Carlino, Jennifer; Zhang, Lin; Blythe, Jonathan; Dietrich, Christopher; Hawkinson, Christine
2018-04-12
The Department of the Interior (DOI) is a Federal agency with over 90,000 employees across 10 bureaus and 8 agency offices. Its primary mission is to protect and manage the Nation’s natural resources and cultural heritage; provide scientific and other information about those resources; and honor its trust responsibilities or special commitments to American Indians, Alaska Natives, and affiliated island communities. Data and information are critical in day-to-day operational decision making and scientific research. DOI is committed to creating, documenting, managing, and sharing high-quality data and metadata in and across its various programs that support its mission. Documenting data through metadata is essential in realizing the value of data as an enterprise asset. The completeness, consistency, and timeliness of metadata affect users’ ability to search for and discover the most relevant data for the intended purpose; and facilitates the interoperability and usability of these data among DOI bureaus and offices. Fully documented metadata describe data usability, quality, accuracy, provenance, and meaning.Across DOI, there are different maturity levels and phases of information and metadata management implementations. The Department has organized a committee consisting of bureau-level points-of-contacts to collaborate on the development of more consistent, standardized, and more effective metadata management practices and guidance to support this shared mission and the information needs of the Department. DOI’s metadata implementation plans establish key roles and responsibilities associated with metadata management processes, procedures, and a series of actions defined in three major metadata implementation phases including: (1) Getting started—Planning Phase, (2) Implementing and Maintaining Operational Metadata Management Phase, and (3) the Next Steps towards Improving Metadata Management Phase. DOI’s phased approach for metadata management addresses some of the major data and metadata management challenges that exist across the diverse missions of the bureaus and offices. All employees who create, modify, or use data are involved with data and metadata management. Identifying, establishing, and formalizing the roles and responsibilities associated with metadata management are key to institutionalizing a framework of best practices, methodologies, processes, and common approaches throughout all levels of the organization; these are the foundation for effective data resource management. For executives and managers, metadata management strengthens their overarching views of data assets, holdings, and data interoperability; and clarifies how metadata management can help accelerate the compliance of multiple policy mandates. For employees, data stewards, and data professionals, formalized metadata management will help with the consistency of definitions, and approaches addressing data discoverability, data quality, and data lineage. In addition to data professionals and others associated with information technology; data stewards and program subject matter experts take on important metadata management roles and responsibilities as data flow through their respective business and science-related workflows. The responsibilities of establishing, practicing, and governing the actions associated with their specific metadata management roles are critical to successful metadata implementation.
Making Interoperability Easier with the NASA Metadata Management Tool
NASA Astrophysics Data System (ADS)
Shum, D.; Reese, M.; Pilone, D.; Mitchell, A. E.
2016-12-01
ISO 19115 has enabled interoperability amongst tools, yet many users find it hard to build ISO metadata for their collections because it can be large and overly flexible for their needs. The Metadata Management Tool (MMT), part of NASA's Earth Observing System Data and Information System (EOSDIS), offers users a modern, easy to use browser based tool to develop ISO compliant metadata. Through a simplified UI experience, metadata curators can create and edit collections without any understanding of the complex ISO-19115 format, while still generating compliant metadata. The MMT is also able to assess the completeness of collection level metadata by evaluating it against a variety of metadata standards. The tool provides users with clear guidance as to how to change their metadata in order to improve their quality and compliance. It is based on NASA's Unified Metadata Model for Collections (UMM-C) which is a simpler metadata model which can be cleanly mapped to ISO 19115. This allows metadata authors and curators to meet ISO compliance requirements faster and more accurately. The MMT and UMM-C have been developed in an agile fashion, with recurring end user tests and reviews to continually refine the tool, the model and the ISO mappings. This process is allowing for continual improvement and evolution to meet the community's needs.
GraphMeta: Managing HPC Rich Metadata in Graphs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dai, Dong; Chen, Yong; Carns, Philip
High-performance computing (HPC) systems face increasingly critical metadata management challenges, especially in the approaching exascale era. These challenges arise not only from exploding metadata volumes, but also from increasingly diverse metadata, which contains data provenance and arbitrary user-defined attributes in addition to traditional POSIX metadata. This 'rich' metadata is becoming critical to supporting advanced data management functionality such as data auditing and validation. In our prior work, we identified a graph-based model as a promising solution to uniformly manage HPC rich metadata due to its flexibility and generality. However, at the same time, graph-based HPC rich metadata management also introduces significant challenges for the underlying infrastructure. In this study, we first identify the challenges posed to the underlying infrastructure in supporting scalable, high-performance rich metadata management. Based on that, we introduce GraphMeta, a graph-based engine designed for this use case. It achieves performance scalability by introducing a new graph partitioning algorithm and a write-optimal storage engine. We evaluate GraphMeta under both synthetic and real HPC metadata workloads, compare it with other approaches, and demonstrate its advantages in terms of efficiency and usability for rich metadata management in HPC systems.
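The graph model for rich metadata, with files, jobs and users as vertices and provenance relationships as edges, can be sketched with a tiny adjacency-list property graph and a provenance traversal; this is only an illustration of the data model, not the GraphMeta engine or its partitioning algorithm.

    from collections import defaultdict

    class MetadataGraph:
        """Tiny property graph: vertices carry attribute dicts, edges carry a label."""
        def __init__(self):
            self.vertices = {}                       # id -> attributes (POSIX + user-defined)
            self.edges = defaultdict(list)           # src id -> [(label, dst id), ...]

        def add_vertex(self, vid, **attrs):
            self.vertices[vid] = attrs

        def add_edge(self, src, label, dst):
            self.edges[src].append((label, dst))

        def derived_from(self, vid, seen=None):
            """Walk 'wasDerivedFrom' edges to collect the full provenance of one file."""
            if seen is None:
                seen = set()
            for label, dst in self.edges.get(vid, []):
                if label == "wasDerivedFrom" and dst not in seen:
                    seen.add(dst)
                    self.derived_from(dst, seen)
            return seen

    g = MetadataGraph()
    g.add_vertex("raw.dat", owner="alice", size=2**30)
    g.add_vertex("clean.dat", owner="alice", campaign="run-42")   # user-defined attribute
    g.add_vertex("plot.png", owner="bob")
    g.add_edge("clean.dat", "wasDerivedFrom", "raw.dat")
    g.add_edge("plot.png", "wasDerivedFrom", "clean.dat")
    print(g.derived_from("plot.png"))    # {'clean.dat', 'raw.dat'}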
Metabolonote: A Wiki-Based Database for Managing Hierarchical Metadata of Metabolome Analyses
Ara, Takeshi; Enomoto, Mitsuo; Arita, Masanori; Ikeda, Chiaki; Kera, Kota; Yamada, Manabu; Nishioka, Takaaki; Ikeda, Tasuku; Nihei, Yoshito; Shibata, Daisuke; Kanaya, Shigehiko; Sakurai, Nozomu
2015-01-01
Metabolomics – technology for comprehensive detection of small molecules in an organism – lags behind the other “omics” in terms of publication and dissemination of experimental data. Among the reasons for this are difficulty precisely recording information about complicated analytical experiments (metadata), existence of various databases with their own metadata descriptions, and low reusability of the published data, resulting in submitters (the researchers who generate the data) being insufficiently motivated. To tackle these issues, we developed Metabolonote, a Semantic MediaWiki-based database designed specifically for managing metabolomic metadata. We also defined a metadata and data description format, called “Togo Metabolome Data” (TogoMD), with an ID system that is required for unique access to each level of the tree-structured metadata such as study purpose, sample, analytical method, and data analysis. Separation of the management of metadata from that of data and permission to attach related information to the metadata provide advantages for submitters, readers, and database developers. The metadata are enriched with information such as links to comparable data, thereby functioning as a hub of related data resources. They also enhance not only readers’ understanding and use of data but also submitters’ motivation to publish the data. The metadata are computationally shared among other systems via APIs, which facilitate the construction of novel databases by database developers. A permission system that allows publication of immature metadata and feedback from readers also helps submitters to improve their metadata. Hence, this aspect of Metabolonote, as a metadata preparation tool, is complementary to high-quality and persistent data repositories such as MetaboLights. A total of 808 metadata for analyzed data obtained from 35 biological species are published currently. Metabolonote and related tools are available free of cost at http://metabolonote.kazusa.or.jp/. PMID:25905099
Metabolonote: a wiki-based database for managing hierarchical metadata of metabolome analyses.
Ara, Takeshi; Enomoto, Mitsuo; Arita, Masanori; Ikeda, Chiaki; Kera, Kota; Yamada, Manabu; Nishioka, Takaaki; Ikeda, Tasuku; Nihei, Yoshito; Shibata, Daisuke; Kanaya, Shigehiko; Sakurai, Nozomu
2015-01-01
Metabolomics - technology for comprehensive detection of small molecules in an organism - lags behind the other "omics" in terms of publication and dissemination of experimental data. Among the reasons for this are difficulty precisely recording information about complicated analytical experiments (metadata), existence of various databases with their own metadata descriptions, and low reusability of the published data, resulting in submitters (the researchers who generate the data) being insufficiently motivated. To tackle these issues, we developed Metabolonote, a Semantic MediaWiki-based database designed specifically for managing metabolomic metadata. We also defined a metadata and data description format, called "Togo Metabolome Data" (TogoMD), with an ID system that is required for unique access to each level of the tree-structured metadata such as study purpose, sample, analytical method, and data analysis. Separation of the management of metadata from that of data and permission to attach related information to the metadata provide advantages for submitters, readers, and database developers. The metadata are enriched with information such as links to comparable data, thereby functioning as a hub of related data resources. They also enhance not only readers' understanding and use of data but also submitters' motivation to publish the data. The metadata are computationally shared among other systems via APIs, which facilitate the construction of novel databases by database developers. A permission system that allows publication of immature metadata and feedback from readers also helps submitters to improve their metadata. Hence, this aspect of Metabolonote, as a metadata preparation tool, is complementary to high-quality and persistent data repositories such as MetaboLights. A total of 808 metadata for analyzed data obtained from 35 biological species are published currently. Metabolonote and related tools are available free of cost at http://metabolonote.kazusa.or.jp/.
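The tree-structured metadata with per-level identifiers can be sketched as nested records addressed by a dotted path; the ID scheme and field names below are hypothetical and are not the actual TogoMD identifiers.

    # Hypothetical tree: study -> sample -> analytical method -> data analysis.
    metadata_tree = {
        "EX1": {                                   # study-level metadata (ID scheme is invented)
            "purpose": "Leaf metabolite survey",
            "children": {
                "EX1.S1": {
                    "sample": "Arabidopsis thaliana leaf, 3 weeks",
                    "children": {
                        "EX1.S1.M1": {
                            "method": "LC-MS, positive mode",
                            "children": {
                                "EX1.S1.M1.D1": {"analysis": "Peak alignment, v2 pipeline",
                                                 "children": {}}}}}}}}}

    def lookup(tree, dotted_id):
        """Resolve an ID like 'EX1.S1.M1' by walking the tree one level at a time."""
        parts = dotted_id.split(".")
        node = tree[parts[0]]
        for depth in range(2, len(parts) + 1):
            node = node["children"][".".join(parts[:depth])]
        return node

    print(lookup(metadata_tree, "EX1.S1.M1")["method"])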
Robert E. Keane
2006-01-01
The Metadata (MD) table in the FIREMON database is used to record any information about the sampling strategy or data collected using the FIREMON sampling procedures. The MD method records metadata pertaining to a group of FIREMON plots, such as all plots in a specific FIREMON project. FIREMON plots are linked to metadata using a unique metadata identifier that is...
Enhanced Soundings for Local Coupling Studies Field Campaign Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ferguson, Craig R; Santanello, Joseph A; Gentine, Pierre
This document presents initial analyses of the enhanced radiosonde observations obtained during the U.S. Department of Energy (DOE) Atmospheric Radiation Measurement (ARM) Climate Research Facility Enhanced Soundings for Local Coupling Studies Field Campaign (ESLCS), which took place at the ARM Southern Great Plains (SGP) Central Facility (CF) from June 15 to August 31, 2015. During ESLCS, routine 4-times-daily radiosonde measurements at the ARM-SGP CF were augmented on 12 days (June 18 and 29; July 11, 14, 19, and 26; August 15, 16, 21, 25, 26, and 27) with daytime 1-hourly radiosondes and 10-minute ‘trailer’ radiosondes every 3 hours. These 12 intensive operational period (IOP) days were selected on the basis of prior-day qualitative forecasts of potential land-atmosphere coupling strength. The campaign captured 2 dry soil convection advantage days (June 29 and July 14) and 10 atmospherically controlled days. Other noteworthy IOP events include: 2 soil dry-down sequences (July 11-14-19 and August 21-25-26), a 2-day clear-sky case (August 15-16), and the passing of Tropical Storm Bill (June 18). To date, the ESLCS data set constitutes the highest-temporal-resolution sampling of the evolution of the daytime planetary boundary layer (PBL) using radiosondes at the ARM-SGP. The data set is expected to contribute to: (1) improved understanding and modeling of the diurnal evolution of the PBL, particularly with regard to the role of local soil wetness, and (2) new insights into the appropriateness of current ARM-SGP CF thermodynamic sampling strategies.
NASA Astrophysics Data System (ADS)
Wilson, B. D.; Manipon, G.; Hua, H.; Fetzer, E.
2011-12-01
Under several NASA grants, we are generating multi-sensor merged atmospheric datasets to enable the detection of instrument biases and studies of climate trends over decades of data. For example, under a NASA MEASURES grant we are producing a water vapor climatology from the A-Train instruments, stratified by the Cloudsat cloud classification for each geophysical scene. The generation and proper use of such multi-sensor climate data records (CDR's) requires a high level of openness, transparency, and traceability. To make the datasets self-documenting and provide access to full metadata and traceability, we have implemented a set of capabilities and services using known, interoperable protocols. These protocols include OpenSearch, OPeNDAP, Open Provenance Model, service & data casting technologies using Atom feeds, and REST-callable analysis workflows implemented as SciFlo (XML) documents. We advocate that our approach can serve as a blueprint for how to openly "document and serve" complex, multi-sensor CDR's with full traceability. The capabilities and services provided include: - Discovery of the collections by keyword search, exposed using OpenSearch protocol; - Space/time query across the CDR's granules and all of the input datasets via OpenSearch; - User-level configuration of the production workflows so that scientists can select additional physical variables from the A-Train to add to the next iteration of the merged datasets; - Efficient data merging using on-the-fly OPeNDAP variable slicing & spatial subsetting of data out of input netCDF and HDF files (without moving the entire files); - Self-documenting CDR's published in a highly usable netCDF4 format with groups used to organize the variables, CF-style attributes for each variable, numeric array compression, & links to OPM provenance; - Recording of processing provenance and data lineage into a query-able provenance trail in Open Provenance Model (OPM) format, auto-captured by the workflow engine; - Open Publishing of all of the workflows used to generate products as machine-callable REST web services, using the capabilities of the SciFlo workflow engine; - Advertising of the metadata (e.g. physical variables provided, space/time bounding box, etc.) for our prepared datasets as "datacasts" using the Atom feed format; - Publishing of all datasets via our "DataDrop" service, which exploits the WebDAV protocol to enable scientists to access remote data directories as local files on their laptops; - Rich "web browse" of the CDR's with full metadata and the provenance trail one click away; - Advertising of all services as Google-discoverable "service casts" using the Atom format. The presentation will describe our use of the interoperable protocols and demonstrate the capabilities and service GUI's.
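As an illustration of the on-the-fly OPeNDAP variable slicing and spatial subsetting mentioned above, the following sketch uses the netCDF4 library (assuming it is built with DAP support) to read only a small window of one variable from a remote granule. The endpoint URL and variable name are hypothetical placeholders, not the actual MEaSUREs products.

    # Minimal sketch of server-side subsetting over OPeNDAP with the netCDF4 library.
    from netCDF4 import Dataset

    url = "https://example.org/opendap/merged_water_vapor.nc"   # hypothetical endpoint
    ds = Dataset(url)                      # opens the remote dataset without downloading it

    wv = ds.variables["wv"]                # hypothetical variable name
    print(wv.dimensions, getattr(wv, "units", "unknown"))

    # Only the requested slice (first time step, a small lat/lon window) crosses the
    # network, which is what makes on-the-fly merging of large granules practical.
    subset = wv[0, 100:120, 200:240]
    print(subset.shape)

    ds.close()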
Documentation Resources on the ESIP Wiki
NASA Technical Reports Server (NTRS)
Habermann, Ted; Kozimor, John; Gordon, Sean
2017-01-01
The ESIP community includes data providers and users that communicate with one another through datasets and metadata that describe them. Improving this communication depends on consistent high-quality metadata. The ESIP Documentation Cluster and the wiki play an important central role in facilitating this communication. We will describe and demonstrate sections of the wiki that provide information about metadata concept definitions, metadata recommendation, metadata dialects, and guidance pages. We will also describe and demonstrate the ISO Explorer, a tool that the community is developing to help metadata creators.
Ismail, Mahmoud; Philbin, James
2015-04-01
The digital imaging and communications in medicine (DICOM) information model combines pixel data and its metadata in a single object. There are user scenarios that only need metadata manipulation, such as deidentification and study migration. Most picture archiving and communication systems use a database to store and update the metadata rather than updating the raw DICOM files themselves. The multiseries DICOM (MSD) format separates metadata from pixel data and eliminates duplicate attributes. This work promotes storing DICOM studies in MSD format to reduce the metadata processing time. A set of experiments are performed that update the metadata of a set of DICOM studies for deidentification and migration. The studies are stored in both the traditional single frame DICOM (SFD) format and the MSD format. The results show that it is faster to update studies' metadata in MSD format than in SFD format because the bulk data is separated in MSD and is not retrieved from the storage system. In addition, it is space-efficient to store the deidentified studies in MSD format as it shares the same bulk data object with the original study. In summary, separation of metadata from pixel data using the MSD format provides fast metadata access and speeds up applications that process only the metadata.
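The MSD format itself is not supported by common open-source toolkits, but the core idea (touching metadata without pulling bulk pixel data) can be approximated on ordinary single-frame files with pydicom, as in this rough sketch; the file path is a placeholder.

    # Sketch of metadata-only handling with pydicom on ordinary (SFD) files; the MSD
    # format is not implemented here.
    import pydicom

    path = "study/slice_0001.dcm"                       # placeholder path

    # stop_before_pixels avoids reading the (large) pixel data element at all,
    # which is the same effect the MSD layout achieves by storing bulk data separately.
    ds = pydicom.dcmread(path, stop_before_pixels=True)

    print(ds.get("PatientName"), ds.get("StudyInstanceUID"))

    # A toy deidentification step that touches only metadata fields.
    for keyword in ("PatientName", "PatientBirthDate"):
        if keyword in ds:
            setattr(ds, keyword, "")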
Transforming Dermatologic Imaging for the Digital Era: Metadata and Standards.
Caffery, Liam J; Clunie, David; Curiel-Lewandrowski, Clara; Malvehy, Josep; Soyer, H Peter; Halpern, Allan C
2018-01-17
Imaging is increasingly being used in dermatology for documentation, diagnosis, and management of cutaneous disease. The lack of standards for dermatologic imaging is an impediment to clinical uptake. Standardization can occur in image acquisition, terminology, interoperability, and metadata. This paper presents the International Skin Imaging Collaboration position on standardization of metadata for dermatologic imaging. Metadata is essential to ensure that dermatologic images are properly managed and interpreted. There are two standards-based approaches to recording and storing metadata in dermatologic imaging. The first uses standard consumer image file formats, and the second is the file format and metadata model developed for the Digital Imaging and Communication in Medicine (DICOM) standard. DICOM would appear to provide an advantage over using consumer image file formats for metadata as it includes all the patient, study, and technical metadata necessary to use images clinically. In contrast, consumer image file formats only include technical metadata and need to be used in conjunction with another actor (for example, an electronic medical record) to supply the patient and study metadata. The use of DICOM may have some ancillary benefits in dermatologic imaging, including leveraging DICOM network and workflow services, interoperability of images and metadata, leveraging existing enterprise imaging infrastructure, greater patient safety, and better compliance with legislative requirements for image retention.
Ismail, Mahmoud; Philbin, James
2015-01-01
Abstract. The digital imaging and communications in medicine (DICOM) information model combines pixel data and its metadata in a single object. There are user scenarios that only need metadata manipulation, such as deidentification and study migration. Most picture archiving and communication systems use a database to store and update the metadata rather than updating the raw DICOM files themselves. The multiseries DICOM (MSD) format separates metadata from pixel data and eliminates duplicate attributes. This work promotes storing DICOM studies in MSD format to reduce the metadata processing time. A set of experiments are performed that update the metadata of a set of DICOM studies for deidentification and migration. The studies are stored in both the traditional single frame DICOM (SFD) format and the MSD format. The results show that it is faster to update studies’ metadata in MSD format than in SFD format because the bulk data is separated in MSD and is not retrieved from the storage system. In addition, it is space-efficient to store the deidentified studies in MSD format as it shares the same bulk data object with the original study. In summary, separation of metadata from pixel data using the MSD format provides fast metadata access and speeds up applications that process only the metadata. PMID:26158117
ISO, FGDC, DIF and Dublin Core - Making Sense of Metadata Standards for Earth Science Data
NASA Astrophysics Data System (ADS)
Jones, P. R.; Ritchey, N. A.; Peng, G.; Toner, V. A.; Brown, H.
2014-12-01
Metadata standards provide common definitions of metadata fields for information exchange across user communities. Despite the broad adoption of metadata standards for Earth science data, there are still heterogeneous and incompatible representations of information due to differences between the many standards in use and how each standard is applied. Federal agencies are required to manage and publish metadata in different metadata standards and formats for various data catalogs. In 2014, the NOAA National Climatic Data Center (NCDC) managed metadata for its scientific datasets in ISO 19115-2 in XML, GCMD Directory Interchange Format (DIF) in XML, DataCite Schema in XML, Dublin Core in XML, and Data Catalog Vocabulary (DCAT) in JSON, with more standards and profiles of standards planned. Of these standards, the ISO 19115-series metadata is the most complete and feature-rich, and for this reason it is used by NCDC as the source for the other metadata standards. We will discuss the capabilities of metadata standards and how these standards are being implemented to document datasets. Successful implementations include developing translations and displays using XSLTs, creating links to related data and resources, documenting dataset lineage, and establishing best practices. Benefits, gaps, and challenges will be highlighted with suggestions for improved approaches to metadata storage and maintenance.
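A crosswalk of the kind described (ISO 19115-series records as the source feeding the other standards) is typically implemented as an XSLT transformation. The sketch below shows the mechanics with lxml; the input record and stylesheet file names are placeholders, not NCDC's actual crosswalks.

    # Sketch of an XSLT-driven crosswalk, e.g. ISO 19115-2 -> Dublin Core.
    from lxml import etree

    iso_record = etree.parse("dataset_iso19115-2.xml")        # placeholder input record
    stylesheet = etree.parse("iso2dublincore.xsl")            # placeholder XSLT stylesheet

    transform = etree.XSLT(stylesheet)
    dc_record = transform(iso_record)

    with open("dataset_dublincore.xml", "wb") as out:
        out.write(etree.tostring(dc_record, pretty_print=True,
                                 xml_declaration=True, encoding="UTF-8"))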
NASA Astrophysics Data System (ADS)
Hernández, B. E.; Bugbee, K.; le Roux, J.; Beaty, T.; Hansen, M.; Staton, P.; Sisco, A. W.
2017-12-01
Earth observation (EO) data collected as part of NASA's Earth Observing System Data and Information System (EOSDIS) is now searchable via the Common Metadata Repository (CMR). The Analysis and Review of CMR (ARC) Team at Marshall Space Flight Center has been tasked with reviewing all NASA metadata records in the CMR (approximately 7,000 records). Each collection-level record and its constituent granule-level metadata are reviewed for both completeness and compliance with the CMR's set of metadata standards, as specified in the Unified Metadata Model (UMM). NASA's Distributed Active Archive Centers (DAACs) have been harmonizing priority metadata records within the context of the inter-agency federal Big Earth Data Initiative (BEDI), which seeks to improve the discoverability, accessibility, and usability of EO data. Thus, the first phase of this project constitutes reviewing BEDI metadata records, while the second phase will constitute reviewing the remaining non-BEDI records in CMR. This presentation will discuss the ARC team's findings in terms of the overall quality of BEDI records across all DAACs as well as compliance with UMM standards. For instance, only a fifth of the collection-level metadata fields needed correction, compared to a quarter of the granule-level fields. It should be noted that the degree to which DAACs' metadata did not comply with the UMM standards may reflect multiple factors, such as recent changes in the UMM standards, and the utilization of different metadata formats (e.g. DIF 10, ECHO 10, ISO 19115-1) across the DAACs. Insights, constructive criticism, and lessons learned from this metadata review process will be contributed from both ORNL and SEDAC. Further inquiry along such lines may lead to insights which may improve the metadata curation process moving forward. In terms of the broader implications for metadata compliance with the UMM standards, this research has shown that a large proportion of the prioritized collections have already been made compliant, although the process of improving metadata quality is ongoing and iterative. Further research is also warranted into whether or not the gains in metadata quality are also driving gains in data use.
Forum Guide to Metadata: The Meaning behind Education Data. NFES 2009-805
ERIC Educational Resources Information Center
National Forum on Education Statistics, 2009
2009-01-01
The purpose of this guide is to empower people to more effectively use data as information. To accomplish this, the publication explains what metadata are; why metadata are critical to the development of sound education data systems; what components comprise a metadata system; what value metadata bring to data management and use; and how to…
ERIC Educational Resources Information Center
Yang, Le
2016-01-01
This study analyzed digital item metadata and keywords from Internet search engines to learn what metadata elements actually facilitate discovery of digital collections through Internet keyword searching and how significantly each metadata element affects the discovery of items in a digital repository. The study found that keywords from Internet…
McMahon, Christiana; Denaxas, Spiros
2016-01-01
Metadata are critical in epidemiological and public health research. However, a lack of biomedical metadata quality frameworks and limited awareness of the implications of poor-quality metadata render data analyses problematic. In this study, we created and evaluated a novel framework to assess metadata quality of epidemiological and public health research datasets. We performed a literature review and surveyed stakeholders to enhance our understanding of biomedical metadata quality assessment. The review identified 11 studies and nine quality dimensions, none of which were specifically aimed at biomedical metadata. In total, 96 individuals completed the survey; of those who submitted data, most only assessed metadata quality sometimes, and eight did not at all. Our framework has four sections: a) general information; b) tools and technologies; c) usability; and d) management and curation. We evaluated the framework using three test cases and sought expert feedback. The framework can assess biomedical metadata quality systematically and robustly. PMID:27570670
McMahon, Christiana; Denaxas, Spiros
2016-01-01
Metadata are critical in epidemiological and public health research. However, a lack of biomedical metadata quality frameworks and limited awareness of the implications of poor-quality metadata render data analyses problematic. In this study, we created and evaluated a novel framework to assess metadata quality of epidemiological and public health research datasets. We performed a literature review and surveyed stakeholders to enhance our understanding of biomedical metadata quality assessment. The review identified 11 studies and nine quality dimensions, none of which were specifically aimed at biomedical metadata. In total, 96 individuals completed the survey; of those who submitted data, most only assessed metadata quality sometimes, and eight did not at all. Our framework has four sections: a) general information; b) tools and technologies; c) usability; and d) management and curation. We evaluated the framework using three test cases and sought expert feedback. The framework can assess biomedical metadata quality systematically and robustly.
CMO: Cruise Metadata Organizer for JAMSTEC Research Cruises
NASA Astrophysics Data System (ADS)
Fukuda, K.; Saito, H.; Hanafusa, Y.; Vanroosebeke, A.; Kitayama, T.
2011-12-01
JAMSTEC's Data Research Center for Marine-Earth Sciences manages and distributes a wide variety of observational data and samples obtained from JAMSTEC research vessels and deep sea submersibles. Generally, metadata are essential to identify how data and samples were obtained. In JAMSTEC, cruise metadata include cruise information such as cruise ID, name of vessel, and research theme, and diving information such as dive number, name of submersible, and position of diving point. They are submitted by chief scientists of research cruises in the Microsoft Excel® spreadsheet format, and registered into a data management database to confirm receipt of observational data files, cruise summaries, and cruise reports. The cruise metadata are also published via "JAMSTEC Data Site for Research Cruises" within two months after the end of a cruise. Furthermore, these metadata are distributed with observational data, images and samples via several data and sample distribution websites after a publication moratorium period. However, there are two operational issues in the metadata publishing process. One is duplication of effort and asynchronous metadata across multiple distribution websites due to manual metadata entry into individual websites by administrators. The other is differences in data types or representation of metadata in each website. To solve those problems, we have developed a cruise metadata organizer (CMO) which allows cruise metadata to be connected from the data management database to several distribution websites. CMO is comprised of three components: an Extensible Markup Language (XML) database, Enterprise Application Integration (EAI) software, and a web-based interface. The XML database is used because of its flexibility for any change of metadata. Daily differential uptake of metadata from the data management database to the XML database is automatically processed via the EAI software. Some metadata are entered into the XML database using the web-based interface by a metadata editor in CMO as needed. Then daily differential uptake of metadata from the XML database to databases in several distribution websites is automatically processed using a convertor defined by the EAI software. Currently, CMO is available for three distribution websites: "Deep Sea Floor Rock Sample Database GANSEKI", "Marine Biological Sample Database", and "JAMSTEC E-library of Deep-sea Images". CMO is planned to provide "JAMSTEC Data Site for Research Cruises" with metadata in the future.
Towards Data Value-Level Metadata for Clinical Studies.
Zozus, Meredith Nahm; Bonner, Joseph
2017-01-01
While several standards for metadata describing clinical studies exist, comprehensive metadata to support traceability of data from clinical studies has not been articulated. We examine uses of metadata in clinical studies. We examine and enumerate seven sources of data value-level metadata in clinical studies inclusive of research designs across the spectrum of the National Institutes of Health definition of clinical research. The sources of metadata inform categorization in terms of metadata describing the origin of a data value, the definition of a data value, and operations to which the data value was subjected. The latter is further categorized into information about changes to a data value, movement of a data value, retrieval of a data value, and data quality checks, constraints or assessments to which the data value was subjected. The implications of tracking and managing data value-level metadata are explored.
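As a rough illustration of what data value-level metadata might look like as a data structure, the sketch below groups the categories enumerated above (origin, definition, and the operations applied to a value). The field names are illustrative assumptions, not a published schema.

    # Illustrative data structure only; field names mirror the categories in the abstract.
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Operation:
        kind: str        # e.g. "change", "movement", "retrieval", "quality_check"
        detail: str
        timestamp: str

    @dataclass
    class DataValueMetadata:
        origin: str                      # who or what produced the value
        definition: str                  # what the value means (units, code list, ...)
        operations: List[Operation] = field(default_factory=list)

    bp = DataValueMetadata(
        origin="site 12, CRF page 3, systolic blood pressure",
        definition="mmHg, seated, after 5 minutes rest",
    )
    bp.operations.append(Operation("quality_check", "range check 70-250 passed",
                                   "2017-01-01T10:00:00"))
    print(len(bp.operations), bp.definition)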
Managing Complex Change in Clinical Study Metadata
Brandt, Cynthia A.; Gadagkar, Rohit; Rodriguez, Cesar; Nadkarni, Prakash M.
2004-01-01
In highly functional metadata-driven software, the interrelationships within the metadata become complex, and maintenance becomes challenging. We describe an approach to metadata management that uses a knowledge-base subschema to store centralized information about metadata dependencies and use cases involving specific types of metadata modification. Our system borrows ideas from production-rule systems in that some of this information is a high-level specification that is interpreted and executed dynamically by a middleware engine. Our approach is implemented in TrialDB, a generic clinical study data management system. We review approaches that have been used for metadata management in other contexts and describe the features, capabilities, and limitations of our system. PMID:15187070
NASA Astrophysics Data System (ADS)
Lugmayr, Artur R.; Mailaparampil, Anurag; Tico, Florina; Kalli, Seppo; Creutzburg, Reiner
2003-01-01
Digital television (digiTV) is an additional multimedia environment, where metadata is one key element for the description of arbitrary content. This implies adequate structures for content description, which is provided by XML metadata schemes (e.g. MPEG-7, MPEG-21). Content and metadata management is the task of a multimedia repository, from which digiTV clients - equipped with an Internet connection - can access rich additional multimedia types over an "All-HTTP" protocol layer. Within this research work, we focus on conceptual design issues of a metadata repository for the storage of metadata, accessible from the feedback channel of a local set-top box. Our concept describes the whole heterogeneous life-cycle chain of XML metadata from the service provider to the digiTV equipment, device independent representation of content, accessing and querying the metadata repository, management of metadata related to digiTV, and interconnection of basic system components (http front-end, relational database system, and servlet container). We present our conceptual test configuration of a metadata repository that is aimed at a real-world deployment, done within the scope of the future interaction (fiTV) project at the Digital Media Institute (DMI) Tampere (www.futureinteraction.tv).
Metazen – metadata capture for metagenomes
2014-01-01
Background: As the impact and prevalence of large-scale metagenomic surveys grow, so does the acute need for more complete and standards compliant metadata. Metadata (data describing data) provides an essential complement to experimental data, helping to answer questions about its source, mode of collection, and reliability. Metadata collection and interpretation have become vital to the genomics and metagenomics communities, but considerable challenges remain, including exchange, curation, and distribution. Currently, tools are available for capturing basic field metadata during sampling, and for storing, updating and viewing it. Unfortunately, these tools are not specifically designed for metagenomic surveys; in particular, they lack the appropriate metadata collection templates, a centralized storage repository, and a unique ID linking system that can be used to easily port complete and compatible metagenomic metadata into widely used assembly and sequence analysis tools. Results: Metazen was developed as a comprehensive framework designed to enable metadata capture for metagenomic sequencing projects. Specifically, Metazen provides a rapid, easy-to-use portal to encourage early deposition of project and sample metadata. Conclusions: Metazen is an interactive tool that aids users in recording their metadata in a complete and valid format. A defined set of mandatory fields captures vital information, while the option to add fields provides flexibility. PMID:25780508
Metazen - metadata capture for metagenomes.
Bischof, Jared; Harrison, Travis; Paczian, Tobias; Glass, Elizabeth; Wilke, Andreas; Meyer, Folker
2014-01-01
As the impact and prevalence of large-scale metagenomic surveys grow, so does the acute need for more complete and standards compliant metadata. Metadata (data describing data) provides an essential complement to experimental data, helping to answer questions about its source, mode of collection, and reliability. Metadata collection and interpretation have become vital to the genomics and metagenomics communities, but considerable challenges remain, including exchange, curation, and distribution. Currently, tools are available for capturing basic field metadata during sampling, and for storing, updating and viewing it. Unfortunately, these tools are not specifically designed for metagenomic surveys; in particular, they lack the appropriate metadata collection templates, a centralized storage repository, and a unique ID linking system that can be used to easily port complete and compatible metagenomic metadata into widely used assembly and sequence analysis tools. Metazen was developed as a comprehensive framework designed to enable metadata capture for metagenomic sequencing projects. Specifically, Metazen provides a rapid, easy-to-use portal to encourage early deposition of project and sample metadata. Metazen is an interactive tool that aids users in recording their metadata in a complete and valid format. A defined set of mandatory fields captures vital information, while the option to add fields provides flexibility.
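The mandatory/optional split that Metazen enforces can be illustrated with a few lines of Python; the field list below is a made-up example rather than Metazen's actual templates.

    # Toy validation in the spirit of a mandatory/optional metadata template.
    MANDATORY = {"project_name", "sample_id", "collection_date", "latitude", "longitude"}

    def validate(record: dict) -> list:
        """Return a list of problems; an empty list means the record is acceptable."""
        problems = [f"missing mandatory field: {f}" for f in sorted(MANDATORY - record.keys())]
        # Optional, user-added fields are simply carried through unchanged.
        return problems

    record = {"project_name": "soil survey", "sample_id": "S-001",
              "collection_date": "2014-06-01"}
    print(validate(record))
    # -> ['missing mandatory field: latitude', 'missing mandatory field: longitude']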
Improving Access to NASA Earth Science Data through Collaborative Metadata Curation
NASA Astrophysics Data System (ADS)
Sisco, A. W.; Bugbee, K.; Shum, D.; Baynes, K.; Dixon, V.; Ramachandran, R.
2017-12-01
The NASA-developed Common Metadata Repository (CMR) is a high-performance metadata system that currently catalogs over 375 million Earth science metadata records. It serves as the authoritative metadata management system of NASA's Earth Observing System Data and Information System (EOSDIS), enabling NASA Earth science data to be discovered and accessed by a worldwide user community. The size of the EOSDIS data archive is steadily increasing, and the ability to manage and query this archive depends on the input of high quality metadata to the CMR. Metadata that does not provide adequate descriptive information diminishes the CMR's ability to effectively find and serve data to users. To address this issue, an innovative and collaborative review process is underway to systematically improve the completeness, consistency, and accuracy of metadata for approximately 7,000 data sets archived by NASA's twelve EOSDIS data centers, or Distributed Active Archive Centers (DAACs). The process involves automated and manual metadata assessment of both collection and granule records by a team of Earth science data specialists at NASA Marshall Space Flight Center. The team communicates results to DAAC personnel, who then make revisions and reingest improved metadata into the CMR. Implementation of this process relies on a network of interdisciplinary collaborators leveraging a variety of communication platforms and long-range planning strategies. Curating metadata at this scale and resolving metadata issues through community consensus improves the CMR's ability to serve current and future users and also introduces best practices for stewarding the next generation of Earth Observing System data. This presentation will detail the metadata curation process, its outcomes thus far, and also share the status of ongoing curation activities.
NASA Technical Reports Server (NTRS)
Shum, Dana; Bugbee, Kaylin
2017-01-01
This talk explains the ongoing metadata curation activities in the Common Metadata Repository. It explores tools that exist today which are useful for building quality metadata and also opens up the floor for discussions on other potentially useful tools.
NASA Astrophysics Data System (ADS)
Troyan, D.
2016-12-01
The Atmospheric Radiation Measurement (ARM) program has been collecting data from instruments in diverse climate regions for nearly twenty-five years. These data are made available to all interested parties at no cost via specially designed tools found on the ARM website (www.arm.gov). Metadata is created and applied to the various datastreams to facilitate information retrieval using the ARM website, the ARM Data Discovery Tool, and data quality reporting tools. Over the last year, the Metadata Manager - a relatively new position within the ARM program - created two documents that summarize the state of ARM metadata processes: ARM Metadata Workflow, and ARM Metadata Standards. These documents serve as guides to the creation and management of ARM metadata. With many of ARM's data functions spread around the Department of Energy national laboratory complex and with many of the original architects of the metadata structure no longer working for ARM, there is increased importance on using these documents to resolve issues from data flow bottlenecks and inaccurate metadata to improving data discovery and organizing web pages. This presentation will provide some examples from the workflow and standards documents. The examples will illustrate the complexity of the ARM metadata processes and the efficiency by which the metadata team works towards achieving the goal of providing access to data collected under the auspices of the ARM program.
Efficient processing of MPEG-21 metadata in the binary domain
NASA Astrophysics Data System (ADS)
Timmerer, Christian; Frank, Thomas; Hellwagner, Hermann; Heuer, Jörg; Hutter, Andreas
2005-10-01
XML-based metadata is widely adopted across the different communities, and plenty of commercial and open source tools for processing and transforming it are available on the market. However, all of these tools have one thing in common: they operate on plain-text-encoded metadata, which may become a burden in constrained and streaming environments, i.e., when metadata needs to be processed together with multimedia content on the fly. In this paper we present an efficient approach for transforming such metadata encoded using MPEG's Binary Format for Metadata (BiM) without additional en-/decoding overheads, i.e., within the binary domain. Therefore, we have developed an event-based push parser for BiM-encoded metadata which transforms the metadata by a limited set of processing instructions - based on traditional XML transformation techniques - operating on bit patterns instead of cost-intensive string comparisons.
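BiM itself is a binary encoding and the parser described above operates directly on bit patterns, which is not reproduced here; the sketch below only illustrates the event-based (push) parsing pattern on ordinary textual XML using the standard library.

    # Event-based (push) transformation: elements are rewritten as parse events arrive,
    # without building a full document tree first.
    import io
    import xml.sax

    class RenameHandler(xml.sax.ContentHandler):
        """A tiny 'transformation': rename <Title> elements while streaming events."""
        def __init__(self):
            super().__init__()
            self.out = []
        def startElement(self, name, attrs):
            self.out.append("<%s>" % ("Label" if name == "Title" else name))
        def endElement(self, name):
            self.out.append("</%s>" % ("Label" if name == "Title" else name))
        def characters(self, content):
            self.out.append(content)

    handler = RenameHandler()
    xml.sax.parse(io.BytesIO(b"<Item><Title>Example</Title></Item>"), handler)
    print("".join(handler.out))   # -> <Item><Label>Example</Label></Item>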
A model for enhancing Internet medical document retrieval with "medical core metadata".
Malet, G; Munoz, F; Appleyard, R; Hersh, W
1999-01-01
Finding documents on the World Wide Web relevant to a specific medical information need can be difficult. The goal of this work is to define a set of document content description tags, or metadata encodings, that can be used to promote disciplined search access to Internet medical documents. The authors based their approach on a proposed metadata standard, the Dublin Core Metadata Element Set, which has recently been submitted to the Internet Engineering Task Force. Their model also incorporates the National Library of Medicine's Medical Subject Headings (MeSH) vocabulary and MEDLINE-type content descriptions. The model defines a medical core metadata set that can be used to describe the metadata for a wide variety of Internet documents. The authors propose that their medical core metadata set be used to assign metadata to medical documents to facilitate document retrieval by Internet search engines.
Developing Cyberinfrastructure Tools and Services for Metadata Quality Evaluation
NASA Astrophysics Data System (ADS)
Mecum, B.; Gordon, S.; Habermann, T.; Jones, M. B.; Leinfelder, B.; Powers, L. A.; Slaughter, P.
2016-12-01
Metadata and data quality are at the core of reusable and reproducible science. While great progress has been made over the years, much of the metadata collected only addresses data discovery, covering concepts such as titles and keywords. Improving metadata beyond the discoverability plateau means documenting detailed concepts within the data such as sampling protocols, instrumentation used, and variables measured. Given that metadata commonly do not describe their data at this level, how might we improve the state of things? Giving scientists and data managers easy to use tools to evaluate metadata quality that utilize community-driven recommendations is the key to producing high-quality metadata. To achieve this goal, we created a set of cyberinfrastructure tools and services that integrate with existing metadata and data curation workflows which can be used to improve metadata and data quality across the sciences. These tools work across metadata dialects (e.g., ISO19115, FGDC, EML, etc.) and can be used to assess aspects of quality beyond what is internal to the metadata such as the congruence between the metadata and the data it describes. The system makes use of a user-friendly mechanism for expressing a suite of checks as code in popular data science programming languages such as Python and R. This reduces the burden on scientists and data managers to learn yet another language. We demonstrated these services and tools in three ways. First, we evaluated a large corpus of datasets in the DataONE federation of data repositories against a metadata recommendation modeled after existing recommendations such as the LTER best practices and the Attribute Convention for Dataset Discovery (ACDD). Second, we showed how this service can be used to display metadata and data quality information to data producers during the data submission and metadata creation process, and to data consumers through data catalog search and access tools. Third, we showed how the centrally deployed DataONE quality service can achieve major efficiency gains by allowing member repositories to customize and use recommendations that fit their specific needs without having to create de novo infrastructure at their site.
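The idea of expressing checks as small functions in a familiar language can be sketched as follows; the two checks are illustrative only and do not reproduce the recommendation suites mentioned above (LTER, ACDD) or the DataONE service itself.

    # Illustrative metadata quality checks expressed as plain Python functions.
    def check_title(meta):
        ok = bool(meta.get("title")) and len(meta["title"]) >= 20
        return ("title is present and descriptive", ok)

    def check_variables_documented(meta):
        ok = all(v.get("definition") and v.get("units") for v in meta.get("variables", []))
        return ("every variable has a definition and units", ok)

    def run_checks(meta, checks):
        return [(name, "PASS" if ok else "FAIL") for name, ok in (c(meta) for c in checks)]

    metadata = {
        "title": "Soil moisture at 10 cm depth, Konza Prairie, 2015-2016",
        "variables": [{"name": "sm10", "definition": "volumetric soil moisture",
                       "units": "m3 m-3"}],
    }
    for name, status in run_checks(metadata, [check_title, check_variables_documented]):
        print(f"{status}: {name}")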
The New Online Metadata Editor for Generating Structured Metadata
NASA Astrophysics Data System (ADS)
Devarakonda, R.; Shrestha, B.; Palanisamy, G.; Hook, L.; Killeffer, T.; Boden, T.; Cook, R. B.; Zolly, L.; Hutchison, V.; Frame, M. T.; Cialella, A. T.; Lazer, K.
2014-12-01
Nobody is better suited to "describe" data than the scientist who created it. This "description" of the data is called metadata. In general terms, metadata represent the who, what, when, where, why and how of the dataset. eXtensible Markup Language (XML) is the preferred output format for metadata, as it makes it portable and, more importantly, suitable for system discoverability. The newly developed ORNL Metadata Editor (OME) is a Web-based tool that allows users to create and maintain XML files containing key information, or metadata, about the research. Metadata include information about the specific projects, parameters, time periods, and locations associated with the data. Such information helps put the research findings in context. In addition, the metadata produced using OME will allow other researchers to find these data via metadata clearinghouses like Mercury [1] [2]. Researchers simply use the ORNL Metadata Editor to enter relevant metadata into a Web-based form. How is OME helping big data centers like the ORNL DAAC? The ORNL DAAC is one of NASA's Earth Observing System Data and Information System (EOSDIS) data centers managed by the ESDIS Project. The ORNL DAAC archives data produced by NASA's Terrestrial Ecology Program. The DAAC provides data and information relevant to biogeochemical dynamics, ecological data, and environmental processes, critical for understanding the dynamics relating to the biological components of the Earth's environment. Typically, the data produced, archived, and analyzed are at a scale of multiple petabytes, which makes the discoverability of the data very challenging. Without proper metadata associated with the data, it is difficult to find the data you are looking for and equally difficult to use and understand the data. OME will allow data centers like the ORNL DAAC to produce meaningful, high-quality, standards-based, descriptive information about their data products, in turn helping with data discoverability and interoperability. References: [1] Devarakonda, Ranjeet, et al. "Mercury: reusable metadata management, data discovery and access system." Earth Science Informatics 3.1-2 (2010): 87-94. [2] Wilson, Bruce E., et al. "Mercury Toolset for Spatiotemporal Metadata." NASA Technical Reports Server (NTRS) (2010).
Ignizio, Drew A.; O'Donnell, Michael S.; Talbert, Colin B.
2014-01-01
Creating compliant metadata for scientific data products is mandated for all federal Geographic Information Systems professionals and is a best practice for members of the geospatial data community. However, the complexity of the Federal Geographic Data Committee's Content Standards for Digital Geospatial Metadata, the limited availability of easy-to-use tools, and recent changes in the ESRI software environment continue to make metadata creation a challenge. Staff at the U.S. Geological Survey Fort Collins Science Center have developed a Python toolbox for ESRI ArcDesktop to facilitate a semi-automated workflow to create and update metadata records in ESRI's 10.x software. The U.S. Geological Survey Metadata Wizard tool automatically populates several metadata elements: the spatial reference, spatial extent, geospatial presentation format, vector feature count or raster column/row count, native system/processing environment, and the metadata creation date. Once the software auto-populates these elements, users can easily add attribute definitions and other relevant information in a simple Graphical User Interface. The tool, which offers a simple design free of esoteric metadata language, has the potential to save many government and non-government organizations a significant amount of time and costs by facilitating the development of metadata compliant with the Federal Geographic Data Committee's Content Standards for Digital Geospatial Metadata for ESRI software users. A working version of the tool is now available for ESRI ArcDesktop, versions 10.0, 10.1, and 10.2 (downloadable at http://www.sciencebase.gov/metadatawizard).
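The Metadata Wizard runs inside ESRI ArcDesktop and uses ESRI's own libraries; as a rough stand-in, the sketch below auto-populates comparable elements (raster size, spatial reference, extent) from a raster with GDAL. The file name is a placeholder and the element names are illustrative.

    # Sketch of auto-populating spatial metadata elements from a data product.
    from osgeo import gdal

    ds = gdal.Open("elevation.tif")                     # placeholder raster
    gt = ds.GetGeoTransform()                           # (x_min, dx, 0, y_max, 0, -dy)

    auto_populated = {
        "columns": ds.RasterXSize,
        "rows": ds.RasterYSize,
        "spatial_reference": ds.GetProjection(),
        "west": gt[0],
        "north": gt[3],
        "east": gt[0] + gt[1] * ds.RasterXSize,
        "south": gt[3] + gt[5] * ds.RasterYSize,
    }
    print(auto_populated)
    # Attribute definitions and other descriptive elements would still be entered by hand.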
NASA Astrophysics Data System (ADS)
Richard, S. M.
2011-12-01
The USGIN project has drafted and is using a specification for the use of ISO 19115/19119/19139 metadata, recommendations for simple metadata content, and a proposal for a URI scheme to identify resources using resolvable http URIs (see http://lab.usgin.org/usgin-profiles). The principal target use case is a catalog in which resources can be registered and described by data providers for discovery by users. We are currently using the ESRI Geoportal (Open Source), with configuration files for the USGIN profile. The metadata offered by the catalog must provide sufficient content to guide search engines to locate requested resources, to describe the resource content, provenance, and quality so users can determine if the resource will serve for intended usage, and finally to enable human users and software clients to obtain or access the resource. In order to achieve an operational federated catalog system, provisions in the ISO specification must be restricted and usage clarified to reduce the heterogeneity of 'standard' metadata and service implementations such that a single client can search against different catalogs, and the metadata returned by catalogs can be parsed reliably to locate required information. Usage of the complex ISO 19139 XML schema allows for a great deal of structured metadata content, but the heterogeneity in approaches to content encoding has hampered development of sophisticated client software that can take advantage of the rich metadata; the lack of such clients in turn reduces motivation for metadata producers to produce content-rich metadata. If the only significant use of the detailed, structured metadata is to format it into text for people to read, then the detailed information could be put in free-text elements and be just as useful. In order for complex metadata encoding and content to be useful, there must be clear and unambiguous conventions on the encoding that are utilized by the community that wishes to take advantage of advanced metadata content. The use cases for the detailed content must be well understood, and the degree of metadata complexity should be determined by the requirements for those use cases. The ISO standard provides sufficient flexibility that relatively simple metadata records can be created that will serve for text-indexed search/discovery, resource evaluation by a user reading text content from the metadata, and access to the resource via http, ftp, or well-known service protocols (e.g. Thredds; OGC WMS, WFS, WCS).
Improving Scientific Metadata Interoperability And Data Discoverability using OAI-PMH
NASA Astrophysics Data System (ADS)
Devarakonda, Ranjeet; Palanisamy, Giri; Green, James M.; Wilson, Bruce E.
2010-12-01
While general-purpose search engines (such as Google or Bing) are useful for finding many things on the Internet, they are often of limited usefulness for locating Earth Science data relevant (for example) to a specific spatiotemporal extent. By contrast, tools that search repositories of structured metadata can locate relevant datasets with fairly high precision, but the search is limited to that particular repository. Federated searches (such as Z39.50) have been used, but can be slow and the comprehensiveness can be limited by downtime in any search partner. An alternative approach to improve comprehensiveness is for a repository to harvest metadata from other repositories, possibly with limits based on subject matter or access permissions. Searches through harvested metadata can be extremely responsive, and the search tool can be customized with semantic augmentation appropriate to the community of practice being served. However, there are a number of different protocols for harvesting metadata, with some challenges for ensuring that updates are propagated and for collaborations with repositories using differing metadata standards. The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a standard that is seeing increased use as a means for exchanging structured metadata. OAI-PMH implementations must support Dublin Core as a metadata standard, with other metadata formats as optional. We have developed tools which enable our structured search tool (Mercury; http://mercury.ornl.gov) to consume metadata from OAI-PMH services in any of the metadata formats we support (Dublin Core, Darwin Core, FGDC CSDGM, GCMD DIF, EML, and ISO 19115/19137). We are also making ORNL DAAC metadata available through OAI-PMH for other metadata tools to utilize, such as the NASA Global Change Master Directory (GCMD). This paper describes Mercury's capabilities with multiple metadata formats, in general, and, more specifically, the results of our OAI-PMH implementations and the lessons learned. References: [1] R. Devarakonda, G. Palanisamy, B.E. Wilson, and J.M. Green, "Mercury: reusable metadata management data discovery and access system", Earth Science Informatics, vol. 3, no. 1, pp. 87-94, May 2010. [2] R. Devarakonda, G. Palanisamy, J.M. Green, B.E. Wilson, "Data sharing and retrieval using OAI-PMH", Earth Science Informatics DOI: 10.1007/s12145-010-0073-0, (2010). [3] Devarakonda, R.; Palanisamy, G.; Green, J.; Wilson, B. E. "Mercury: An Example of Effective Software Reuse for Metadata Management Data Discovery and Access", Eos Trans. AGU, 89(53), Fall Meet. Suppl., IN11A-1019 (2008).
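A minimal OAI-PMH harvesting loop, for illustration only, looks roughly like this; the endpoint URL is a placeholder, and error handling and politeness delays are omitted.

    # Harvest Dublin Core records page by page, following resumptionToken.
    import urllib.parse
    import urllib.request
    import xml.etree.ElementTree as ET

    OAI = "{http://www.openarchives.org/OAI/2.0/}"
    base = "https://example.org/oai"                      # placeholder OAI-PMH endpoint
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}

    while True:
        url = base + "?" + urllib.parse.urlencode(params)
        with urllib.request.urlopen(url) as resp:
            root = ET.fromstring(resp.read())
        for rec in root.iter(OAI + "record"):
            ident = rec.find(OAI + "header/" + OAI + "identifier")
            print(ident.text if ident is not None else "(no identifier)")
        token = root.find(".//" + OAI + "resumptionToken")
        if token is None or not (token.text or "").strip():
            break
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}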
An Approach to Information Management for AIR7000 with Metadata and Ontologies
2009-10-01
metadata. We then propose an approach based on Semantic Technologies including the Resource Description Framework (RDF) and Upper Ontologies, for the... mandating specific metadata schemas can result in interoperability problems. For example, many standards within the ADO mandate the use of XML for metadata... such problems, we propose an architecture in which different metadata schemes can interoperate. By using RDF (Resource Description Framework) as a
Making Interoperability Easier with NASA's Metadata Management Tool (MMT)
NASA Technical Reports Server (NTRS)
Shum, Dana; Reese, Mark; Pilone, Dan; Baynes, Katie
2016-01-01
While the ISO-19115 collection level metadata format meets many users' needs for interoperable metadata, it can be cumbersome to create it correctly. Through the MMT's simple UI experience, metadata curators can create and edit collections which are compliant with ISO-19115 without full knowledge of the NASA Best Practices implementation of ISO-19115 format. Users are guided through the metadata creation process through a forms-based editor, complete with field information, validation hints and picklists. Once a record is completed, users can download the metadata in any of the supported formats with just 2 clicks.
Predicting structured metadata from unstructured metadata.
Posch, Lisa; Panahiazar, Maryam; Dumontier, Michel; Gevaert, Olivier
2016-01-01
Enormous amounts of biomedical data have been and are being produced by investigators all over the world. However, one crucial and limiting factor in data reuse is accurate, structured and complete description of the data or data about the data, defined as metadata. We propose a framework to predict structured metadata terms from unstructured metadata for improving quality and quantity of metadata, using the Gene Expression Omnibus (GEO) microarray database. Our framework consists of classifiers trained using term frequency-inverse document frequency (TF-IDF) features and a second approach based on topics modeled using a Latent Dirichlet Allocation model (LDA) to reduce the dimensionality of the unstructured data. Our results on the GEO database show that structured metadata terms can be predicted most accurately using the TF-IDF approach, followed by LDA, with both outperforming the majority vote baseline. While some accuracy is lost by the dimensionality reduction of LDA, the difference is small for elements with few possible values, and there is a large improvement over the majority classifier baseline. Overall this is a promising approach for metadata prediction that is likely to be applicable to other datasets and has implications for researchers interested in biomedical metadata curation and metadata prediction. © The Author(s) 2016. Published by Oxford University Press.
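A stripped-down version of the TF-IDF branch of such a pipeline can be sketched with scikit-learn; the example records and labels below are invented, not GEO data, and the authors' actual features and classifiers may differ.

    # Predict a structured metadata field (organism) from free-text descriptions.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    free_text = [
        "total RNA from mouse liver, illumina hiseq",
        "human blood samples profiled on affymetrix array",
        "liver tissue rna-seq, mus musculus",
        "peripheral blood, homo sapiens, expression array",
    ]
    organism = ["Mus musculus", "Homo sapiens", "Mus musculus", "Homo sapiens"]

    model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
    model.fit(free_text, organism)

    print(model.predict(["rna extracted from mouse liver lobes"]))   # -> ['Mus musculus']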
Metazen – metadata capture for metagenomes
Bischof, Jared; Harrison, Travis; Paczian, Tobias; ...
2014-12-08
Background: As the impact and prevalence of large-scale metagenomic surveys grow, so does the acute need for more complete and standards compliant metadata. Metadata (data describing data) provides an essential complement to experimental data, helping to answer questions about its source, mode of collection, and reliability. Metadata collection and interpretation have become vital to the genomics and metagenomics communities, but considerable challenges remain, including exchange, curation, and distribution. Currently, tools are available for capturing basic field metadata during sampling, and for storing, updating and viewing it. These tools are not specifically designed for metagenomic surveys; in particular, they lack the appropriate metadata collection templates, a centralized storage repository, and a unique ID linking system that can be used to easily port complete and compatible metagenomic metadata into widely used assembly and sequence analysis tools. Results: Metazen was developed as a comprehensive framework designed to enable metadata capture for metagenomic sequencing projects. Specifically, Metazen provides a rapid, easy-to-use portal to encourage early deposition of project and sample metadata. Conclusion: Metazen is an interactive tool that aids users in recording their metadata in a complete and valid format. A defined set of mandatory fields captures vital information, while the option to add fields provides flexibility.
Metazen – metadata capture for metagenomes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bischof, Jared; Harrison, Travis; Paczian, Tobias
Background: As the impact and prevalence of large-scale metagenomic surveys grow, so does the acute need for more complete and standards compliant metadata. Metadata (data describing data) provides an essential complement to experimental data, helping to answer questions about its source, mode of collection, and reliability. Metadata collection and interpretation have become vital to the genomics and metagenomics communities, but considerable challenges remain, including exchange, curation, and distribution. Currently, tools are available for capturing basic field metadata during sampling, and for storing, updating and viewing it. These tools are not specifically designed for metagenomic surveys; in particular, they lack the appropriate metadata collection templates, a centralized storage repository, and a unique ID linking system that can be used to easily port complete and compatible metagenomic metadata into widely used assembly and sequence analysis tools. Results: Metazen was developed as a comprehensive framework designed to enable metadata capture for metagenomic sequencing projects. Specifically, Metazen provides a rapid, easy-to-use portal to encourage early deposition of project and sample metadata. Conclusion: Metazen is an interactive tool that aids users in recording their metadata in a complete and valid format. A defined set of mandatory fields captures vital information, while the option to add fields provides flexibility.
Predicting structured metadata from unstructured metadata
Posch, Lisa; Panahiazar, Maryam; Dumontier, Michel; Gevaert, Olivier
2016-01-01
Enormous amounts of biomedical data have been and are being produced by investigators all over the world. However, one crucial and limiting factor in data reuse is accurate, structured and complete description of the data or data about the data, defined as metadata. We propose a framework to predict structured metadata terms from unstructured metadata for improving quality and quantity of metadata, using the Gene Expression Omnibus (GEO) microarray database. Our framework consists of classifiers trained using term frequency-inverse document frequency (TF-IDF) features and a second approach based on topics modeled using a Latent Dirichlet Allocation model (LDA) to reduce the dimensionality of the unstructured data. Our results on the GEO database show that structured metadata terms can be predicted most accurately using the TF-IDF approach, followed by LDA, with both outperforming the majority vote baseline. While some accuracy is lost by the dimensionality reduction of LDA, the difference is small for elements with few possible values, and there is a large improvement over the majority classifier baseline. Overall this is a promising approach for metadata prediction that is likely to be applicable to other datasets and has implications for researchers interested in biomedical metadata curation and metadata prediction. Database URL: http://www.yeastgenome.org/ PMID:28637268
NASA Astrophysics Data System (ADS)
Benedict, K. K.; Scott, S.
2013-12-01
While there has been a convergence towards a limited number of standards for representing knowledge (metadata) about geospatial (and other) data objects and collections, there exist a variety of community conventions around the specific use of those standards and within specific data discovery and access systems. This combination of limited (but multiple) standards and conventions creates a challenge for system developers that aspire to participate in multiple data infrastructures, each of which may use a different combination of standards and conventions. While Extensible Markup Language (XML) is a shared standard for encoding most metadata, traditional direct XML transformations (XSLT) from one standard to another often result in an imperfect transfer of information due to incomplete mapping from one standard's content model to another. This paper presents the work at the University of New Mexico's Earth Data Analysis Center (EDAC) in which a unified data and metadata management system has been developed in support of the storage, discovery and access of heterogeneous data products. This system, the Geographic Storage, Transformation and Retrieval Engine (GSTORE) platform, has adopted a polyglot database model in which a combination of relational and document-based databases are used to store both data and metadata, with some metadata stored in a custom XML schema designed as a superset of the requirements for multiple target metadata standards: ISO 19115-2/19139/19110/19119, FGDC CSDGM (both with and without remote sensing extensions) and Dublin Core. Metadata stored within this schema is complemented by additional service, format and publisher information that is dynamically "injected" into produced metadata documents when they are requested from the system. While mapping from the underlying common metadata schema is relatively straightforward, the generation of valid metadata within each target standard is necessary but not sufficient for integration into multiple data infrastructures, as has been demonstrated through EDAC's testing and deployment of metadata into multiple external systems: Data.Gov, the GEOSS Registry, the DataONE network, the DSpace-based institutional repository at UNM and semantic mediation systems developed as part of the NASA ACCESS ELSeWEB project. Each of these systems requires valid metadata as a first step, but to make most effective use of the delivered metadata each also has a set of conventions that are specific to the system. This presentation will provide an overview of the underlying metadata management model, the processes and web services that have been developed to automatically generate metadata in a variety of standard formats and highlight some of the specific modifications made to the output metadata content to support the different conventions used by the multiple metadata integration endpoints.
Why can't I manage my digital images like MP3s? The evolution and intent of multimedia metadata
NASA Astrophysics Data System (ADS)
Goodrum, Abby; Howison, James
2005-01-01
This paper considers the deceptively simple question: Why can't digital images be managed in the simple and effective manner in which digital music files are managed? We make the case that the answer is different treatments of metadata in different domains with different goals. A central difference between the two formats stems from the fact that digital music metadata lookup services are collaborative and automate the movement from a digital file to the appropriate metadata, while image metadata services do not. To understand why this difference exists we examine the divergent evolution of metadata standards for digital music and digital images and observed that the processes differ in interesting ways according to their intent. Specifically music metadata was developed primarily for personal file management and community resource sharing, while the focus of image metadata has largely been on information retrieval. We argue that lessons from MP3 metadata can assist individuals facing their growing personal image management challenges. Our focus therefore is not on metadata for cultural heritage institutions or the publishing industry, it is limited to the personal libraries growing on our hard-drives. This bottom-up approach to file management combined with p2p distribution radically altered the music landscape. Might such an approach have a similar impact on image publishing? This paper outlines plans for improving the personal management of digital images-doing image metadata and file management the MP3 way-and considers the likelihood of success.
The European Drought Observatory (EDO): Current State and Future Directions
NASA Astrophysics Data System (ADS)
Vogt, Jürgen; Sepulcre, Guadalupe; Magni, Diego; Valentini, Luana; Singleton, Andrew; Micale, Fabio; Barbosa, Paulo
2013-04-01
Europe has repeatedly been affected by droughts, resulting in considerable ecological and economic damage, and climate change studies indicate a trend towards increasing climate variability that will most likely result in more frequent drought occurrences in Europe. Against this background, the European Commission's Joint Research Centre (JRC) is developing methods and tools for assessing, monitoring and forecasting droughts in Europe and is building a European Drought Observatory (EDO) to complement and integrate national activities with a European view. At the core of the European Drought Observatory (EDO) is a portal, including a map server, a metadata catalogue, a media monitor and analysis tools. The map server presents Europe-wide, up-to-date information on the occurrence and severity of droughts, which is complemented by more detailed information provided by regional, national and local observatories through OGC-compliant web mapping and web coverage services. In addition, time series of historical maps as well as graphs of the temporal evolution of drought indices for individual grid cells and administrative regions in Europe can be retrieved and analysed. Current work is focusing on validating the available products, developing combined indicators, improving the functionalities, extending the linkage to additional national and regional drought information systems and testing options for medium-range probabilistic drought forecasting across Europe. Longer-term goals include the development of long-range drought forecasting products, the analysis of drought hazard and risk, the monitoring of drought impact and the integration of EDO in a global drought information system. The talk will provide an overview of the development and state of EDO, the different products, and the ways to include a wide range of stakeholders (i.e. European, national river basin, and local authorities) in the development of the system, as well as an outlook on future developments.
The Role of Metadata Standards in EOSDIS Search and Retrieval Applications
NASA Technical Reports Server (NTRS)
Pfister, Robin
1999-01-01
Metadata standards play a critical role in data search and retrieval systems. Metadata tie software to data so the data can be processed, stored, searched, retrieved and distributed. Without metadata these actions are not possible. The process of populating metadata to describe science data is an important service to the end-user community, so that a user who is unfamiliar with the data can easily find and learn about a particular dataset before an order decision is made. Once a good set of standards is in place, the accuracy with which data search can be performed depends on the degree to which the metadata standards are adhered to during product definition. NASA's Earth Observing System Data and Information System (EOSDIS) provides examples of how metadata standards are used in data search and retrieval.
openPDS: protecting the privacy of metadata through SafeAnswers.
de Montjoye, Yves-Alexandre; Shmueli, Erez; Wang, Samuel S; Pentland, Alex Sandy
2014-01-01
The rise of smartphones and web services made possible the large-scale collection of personal metadata. Information about individuals' location, phone call logs, or web searches is collected and used intensively by organizations and big data researchers. Metadata has however yet to realize its full potential. Privacy and legal concerns, as well as the lack of technical solutions for personal metadata management, are preventing metadata from being shared and reconciled under the control of the individual. This lack of access and control is furthermore fueling growing concerns, as it prevents individuals from understanding and managing the risks associated with the collection and use of their data. Our contribution is two-fold: (1) we describe openPDS, a personal metadata management framework that allows individuals to collect, store, and give fine-grained access to their metadata to third parties. It has been implemented in two field studies; (2) we introduce and analyze SafeAnswers, a new and practical way of protecting the privacy of metadata at an individual level. SafeAnswers turns a hard anonymization problem into a more tractable security one. It allows services to ask questions whose answers are calculated against the metadata instead of trying to anonymize individuals' metadata. The dimensionality of the data shared with the services is reduced from high-dimensional metadata to low-dimensional answers that are less likely to be re-identifiable and to contain sensitive information. These answers can then be directly shared individually or in aggregate. openPDS and SafeAnswers provide a new way of dynamically protecting personal metadata, thereby supporting the creation of smart data-driven services and data science research.
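The SafeAnswers pattern, running the question inside the personal data store and returning only a low-dimensional answer, can be sketched in a few lines. The record fields and the question below are invented for illustration and are not the openPDS API.

```python
# Illustrative sketch of the SafeAnswers pattern: a service submits a question,
# the personal data store evaluates it locally against the raw metadata, and
# only the coarse answer leaves the store. Field names are invented.
from collections import Counter

personal_metadata = [
    {"type": "call", "hour": 9},
    {"type": "call", "hour": 20},
    {"type": "web",  "hour": 21},
    {"type": "web",  "hour": 22},
]

def answer_most_active_period(records):
    """Question code executed inside the store; returns an aggregate only."""
    buckets = Counter("day" if 7 <= r["hour"] < 19 else "night" for r in records)
    return buckets.most_common(1)[0][0]   # e.g. "night", never the raw records

print(answer_most_active_period(personal_metadata))
```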
Progress in defining a standard for file-level metadata
NASA Technical Reports Server (NTRS)
Williams, Joel; Kobler, Ben
1996-01-01
In the following narrative, metadata required to locate a file on tape or a collection of tapes will be referred to as file-level metadata. This paper describes the rationale for and the history of the effort to define a standard for this metadata.
Achieving interoperability for metadata registries using comparative object modeling.
Park, Yu Rang; Kim, Ju Han
2010-01-01
Achieving data interoperability between organizations relies upon agreed meaning and representation (metadata) of data. For managing and registering metadata, many organizations have built metadata registries (MDRs) in various domains based on the international standard MDR framework, ISO/IEC 11179. Following this trend, two public MDRs in the biomedical domain have been created, the United States Health Information Knowledgebase (USHIK) and the cancer Data Standards Registry and Repository (caDSR), from the U.S. Department of Health & Human Services and the National Cancer Institute (NCI), respectively. Most MDRs are implemented with indiscriminate extensions to satisfy organization-specific needs and to work around the semantic and structural limitations of ISO/IEC 11179. As a result, it is difficult to achieve interoperability among multiple MDRs. In this paper, we propose an integrated metadata object model for achieving interoperability among multiple MDRs. To evaluate this model, we developed an XML Schema Definition (XSD)-based metadata exchange format. We created an XSD-based metadata exporter, supporting both the integrated metadata object model and organization-specific MDR formats.
Request queues for interactive clients in a shared file system of a parallel computing system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bent, John M.; Faibish, Sorin
Interactive requests are processed from users of log-in nodes. A metadata server node is provided for use in a file system shared by one or more interactive nodes and one or more batch nodes. The interactive nodes comprise interactive clients to execute interactive tasks and the batch nodes execute batch jobs for one or more batch clients. The metadata server node comprises a virtual machine monitor; an interactive client proxy to store metadata requests from the interactive clients in an interactive client queue; a batch client proxy to store metadata requests from the batch clients in a batch client queue; and a metadata server to store the metadata requests from the interactive client queue and the batch client queue in a metadata queue based on an allocation of resources by the virtual machine monitor. The metadata requests can be prioritized, for example, based on one or more of a predefined policy and predefined rules.
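The merging of the interactive and batch queues into one metadata queue under a policy can be sketched as a simple priority merge. The policy used below (interactive requests ahead of batch, FIFO within a class) and the request structure are illustrative assumptions, not the patented design.

```python
# Sketch of merging interactive and batch metadata requests into a single
# metadata queue under a simple policy. The policy and request format are
# illustrative only.
import heapq
import itertools

class MetadataQueue:
    PRIORITY = {"interactive": 0, "batch": 1}

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()   # preserves FIFO order within a class

    def submit(self, client_kind, request):
        heapq.heappush(self._heap,
                       (self.PRIORITY[client_kind], next(self._seq), request))

    def serve(self):
        _, _, request = heapq.heappop(self._heap)
        return request

q = MetadataQueue()
q.submit("batch", "stat /archive/run42")
q.submit("interactive", "ls /home/alice")
print(q.serve())   # the interactive request is served first under this policy
```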
Making metadata usable in a multi-national research setting.
Ellul, Claire; Foord, Joanna; Mooney, John
2013-11-01
SECOA (Solutions for Environmental Contrasts in Coastal Areas) is a multi-national research project examining the effects of human mobility on urban settlements in fragile coastal environments. This paper describes the setting up of a SECOA metadata repository for non-specialist researchers such as environmental scientists and tourism experts. Conflicting usability requirements of two groups - metadata creators and metadata users - are identified along with associated limitations of current metadata standards. A description is given of a configurable metadata system designed to grow as the project evolves. This work is of relevance for similar projects such as INSPIRE. Copyright © 2012 Elsevier Ltd and The Ergonomics Society. All rights reserved.
NASA Astrophysics Data System (ADS)
Yatagai, A. I.; Iyemori, T.; Ritschel, B.; Koyama, Y.; Hori, T.; Abe, S.; Tanaka, Y.; Shinbori, A.; Umemura, N.; Sato, Y.; Yagi, M.; Ueno, S.; Hashiguchi, N. O.; Kaneda, N.; Belehaki, A.; Hapgood, M. A.
2013-12-01
The IUGONET is a Japanese program to build a metadata database for ground-based observations of the upper atmosphere [1]. The project began in 2009 with five Japanese institutions which archive data observed by radars, magnetometers, photometers, radio telescopes and helioscopes, and so on, at various altitudes from the Earth's surface to the Sun. Systems have been developed to allow searching of the above-described metadata. We have been updating the system and adding new and updated metadata. The IUGONET development team adopted the SPASE metadata model [2] to describe the upper atmosphere data. This model is used as the common metadata format by the virtual observatories for solar-terrestrial physics. It includes metadata referring to each data file (called a 'Granule'), which enables a search for data files as well as data sets. Further details are described in [2] and [3]. Currently, three additional Japanese institutions are being incorporated in IUGONET. Furthermore, metadata of observations of the troposphere, taken at the observatories of the middle and upper atmosphere radar at Shigaraki and the meteor radar in Indonesia, have been incorporated. These additions will contribute to efficient interdisciplinary scientific research. In the beginning of 2013, the registration of the 'Observatory' and 'Instrument' metadata was completed, which makes it easy to get an overview of the metadata database. The number of registered metadata records as of the end of July totalled 8.8 million, including 793 observatories and 878 instruments. It is important to promote interoperability and/or metadata exchange between the database development groups. A memorandum of agreement, providing a framework for formal collaboration, has been signed with the European Near-Earth Space Data Infrastructure for e-Science (ESPAS) project, which has objectives similar to those of IUGONET. Furthermore, observations by satellites and the International Space Station are being incorporated with a view to making and linking metadata databases. The development of effective data systems will contribute to the progress of scientific research on solar-terrestrial physics, climate and the geophysical environment. Any kind of cooperation, metadata input and feedback, especially for linkage of the databases, is welcomed. References 1. Hayashi, H. et al., Inter-university Upper Atmosphere Global Observation Network (IUGONET), Data Sci. J., 12, WDS179-184, 2013. 2. King, T. et al., SPASE 2.0: A standard data model for space physics. Earth Sci. Inform. 3, 67-73, 2010, doi:10.1007/s12145-010-0053-4. 3. Hori, T., et al., Development of IUGONET metadata format and metadata management system. J. Space Sci. Info. Jpn., 105-111, 2012. (in Japanese)
Towards Precise Metadata-set for Discovering 3D Geospatial Models in Geo-portals
NASA Astrophysics Data System (ADS)
Zamyadi, A.; Pouliot, J.; Bédard, Y.
2013-09-01
Accessing 3D geospatial models, eventually at no cost and for unrestricted use, is certainly an important issue as they become popular among participatory communities, consultants, and officials. Various geo-portals, mainly established for 2D resources, have tried to provide access to existing 3D resources such as digital elevation models, LIDAR or classic topographic data. Describing the content of data, metadata is a key component of data discovery in geo-portals. An inventory of seven online geo-portals and commercial catalogues shows that the metadata referring to 3D information is very different from one geo-portal to another, as well as for similar 3D resources in the same geo-portal. The inventory considered 971 data resources affiliated with elevation. 51% of them were from three geo-portals running at Canadian federal and municipal levels whose metadata resources did not consider 3D models under any definition. Regarding the remaining 49%, which refer to 3D models, different definitions of terms and metadata were found, resulting in confusion and misinterpretation. The overall assessment of these geo-portals clearly shows that the provided metadata do not integrate specific and common information about 3D geospatial models. Accordingly, the main objective of this research is to improve 3D geospatial model discovery in geo-portals by adding a specific metadata-set. Based on the knowledge and current practices of 3D modeling, and of 3D data acquisition and management, a set of metadata is proposed to increase the suitability of geo-portal metadata for 3D geospatial models. This metadata-set enables the definition of genuine classes, fields, and code-lists for a 3D metadata profile. The main structure of the proposal contains 21 metadata classes. These classes are grouped into three packages: General and Complementary, on contextual and structural information, and Availability, on the transition from storage to delivery format. The proposed metadata-set is compared with Canadian Geospatial Data Infrastructure (CGDI) metadata, which is an implementation of the North American Profile of ISO 19115. The comparison analyzes the two metadata sets against three simulated scenarios for discovering needed 3D geospatial datasets. Considering specific metadata about 3D geospatial models, the proposed metadata-set has six additional classes on geometric dimension, level of detail, geometric modeling, topology, and appearance information. In addition, classes on data acquisition, preparation, and modeling, and on physical availability, have been specialized for 3D geospatial models.
2011-05-01
…iTunes illustrate the difference between the centralized approach of digital library systems and the distributed approach of container file formats… metadata in a container file format. Apple's iTunes uses a centralized metadata approach and allows users to maintain song metadata in a single… one iTunes library to another the metadata must be copied separately or reentered in the new library. This demonstrates the utility of storing metadata…
Collaborative Metadata Curation in Support of NASA Earth Science Data Stewardship
NASA Technical Reports Server (NTRS)
Sisco, Adam W.; Bugbee, Kaylin; le Roux, Jeanne; Staton, Patrick; Freitag, Brian; Dixon, Valerie
2018-01-01
A growing collection of NASA Earth science data is archived and distributed by EOSDIS's 12 Distributed Active Archive Centers (DAACs). Each collection and granule is described by a metadata record housed in the Common Metadata Repository (CMR). Multiple metadata standards are in use, and core elements of each are mapped to and from a common model, the Unified Metadata Model (UMM). This work is carried out by the Analysis and Review of CMR (ARC) team.
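The central idea, mapping core elements of each native standard to and from a common model, can be sketched as a small field-mapping step. The field names below are illustrative placeholders rather than the actual UMM element definitions.

```python
# Sketch of mapping a native metadata record into a common model. The mapping
# dictionary is illustrative; the real UMM/DIF/ECHO element names are defined
# in NASA's published schemas.
DIF_TO_COMMON = {"Entry_Title": "title",
                 "Summary": "abstract",
                 "Temporal_Coverage": "temporal_extent"}

def to_common_model(native_record, mapping):
    common = {}
    for native_field, common_field in mapping.items():
        if native_field in native_record:
            common[common_field] = native_record[native_field]
    return common

dif_record = {"Entry_Title": "MODIS Snow Cover", "Summary": "Daily snow cover grids."}
print(to_common_model(dif_record, DIF_TO_COMMON))
```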
Mitogenome metadata: current trends and proposed standards.
Strohm, Jeff H T; Gwiazdowski, Rodger A; Hanner, Robert
2016-09-01
Mitogenome metadata are descriptive terms about the sequence and its specimen description that allow both to be digitally discoverable and interoperable. Here, we review a sampling of mitogenome metadata published in the journal Mitochondrial DNA between 2005 and 2014. Specifically, we have focused on a subset of metadata fields that are available for GenBank records and specified by the Genomics Standards Consortium (GSC) and other biodiversity metadata standards, and we assessed their presence across three main categories: collection, biological and taxonomic information. To do this we reviewed 146 mitogenome manuscripts, and their associated GenBank records, and scored them for 13 metadata fields. We also explored the potential for mitogenome misidentification using their sequence diversity and taxonomic metadata on the Barcode of Life Data Systems (BOLD). For this, we focused on all Lepidoptera and Perciformes mitogenomes included in the review, along with additional mitogenome sequence data mined from GenBank. Overall, we found that none of the 146 mitogenome projects provided all the metadata we looked for, and only 17 projects provided at least one category of metadata across the three main categories. Comparisons using mtDNA sequences from BOLD suggest that some mitogenomes may be misidentified. Lastly, we appreciate the research potential of mitogenomes announced through this journal, and we conclude with a suggestion of 13 metadata fields, available on GenBank, that, if provided in a mitogenome's GenBank record, would increase its research value.
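Scoring records for the presence of a fixed set of metadata fields, grouped by category as in this review, reduces to a simple completeness check. The field names below are examples only, not the authors' exact list of 13 fields.

```python
# Sketch of scoring a record for metadata completeness across categories.
# The field list is illustrative; the study used 13 specific GenBank/GSC fields.
CATEGORIES = {
    "collection": ["country", "collection_date", "collected_by"],
    "biological": ["tissue_type", "sex"],
    "taxonomic":  ["identified_by", "specimen_voucher"],
}

def completeness(record):
    report = {}
    for category, fields in CATEGORIES.items():
        present = [f for f in fields if record.get(f)]
        report[category] = f"{len(present)}/{len(fields)}"
    return report

record = {"country": "Canada", "collection_date": "2012-06-01", "sex": "F"}
print(completeness(record))  # {'collection': '2/3', 'biological': '1/2', 'taxonomic': '0/2'}
```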
Design and implementation of a fault-tolerant and dynamic metadata database for clinical trials
NASA Astrophysics Data System (ADS)
Lee, J.; Zhou, Z.; Talini, E.; Documet, J.; Liu, B.
2007-03-01
In recent imaging-based clinical trials, quantitative image analysis (QIA) and computer-aided diagnosis (CAD) methods are increasing in productivity due to higher-resolution imaging capabilities. A radiology core doing clinical trials has been analyzing more treatment methods, and there is a growing quantity of metadata that needs to be stored and managed. These radiology centers are also collaborating with many off-site imaging field sites and need a way to communicate metadata between one another in a secure infrastructure. Our solution is to implement a data storage grid with a fault-tolerant and dynamic metadata database design to unify metadata from different clinical trial experiments and field sites. Although metadata from images follow the DICOM standard, clinical trials also produce metadata specific to regions-of-interest and quantitative image analysis. We have implemented a data access and integration (DAI) server layer where multiple field sites can access multiple metadata databases in the data grid through a single web-based grid service. The centralization of metadata database management simplifies the task of adding new databases into the grid and also decreases the risk of configuration errors seen in peer-to-peer grids. In this paper, we address the design and implementation of a data grid metadata storage that has fault-tolerance and dynamic integration for imaging-based clinical trials.
Metadata and Service at the GFZ ISDC Portal
NASA Astrophysics Data System (ADS)
Ritschel, B.
2008-05-01
The online service portal of the GFZ Potsdam Information System and Data Center (ISDC) is an access point for all manner of geoscientific geodata, its corresponding metadata, scientific documentation and software tools. At present almost 2000 national and international users and user groups have the opportunity to request Earth science data from a portfolio of 275 different product types and more than 20 million single data files with an added volume of approximately 12 TByte. The majority of the data and information that the portal currently offers to the public are global geomonitoring products such as satellite orbit and Earth gravity field data as well as geomagnetic and atmospheric data for exploration. These products for Earth's changing system are provided via state-of-the-art retrieval techniques. The data product catalog system behind these techniques is based on the extensive usage of standardized metadata, which describe the different geoscientific product types and data products in a uniform way. Whereas all ISDC product types are specified by NASA's Directory Interchange Format (DIF), Version 9.0 parent XML DIF metadata files, the individual data files are described by extended DIF metadata documents. Depending on the beginning of the scientific project, one part of the data files is described by extended DIF, Version 6 metadata documents and the other part is specified by child XML DIF metadata documents. Both the product-type-dependent parent DIF metadata documents and the data-file-dependent child DIF metadata documents are derived from a base-DIF.xsd XML schema file. The ISDC metadata philosophy defines a geoscientific product as a package consisting of mostly one, or sometimes more than one, data file plus one extended DIF metadata file. Because NASA's DIF metadata standard has been developed in order to specify a collection of data only, the extension of the DIF standard consists of new and specific attributes, which are necessary for an explicit identification of single data files and the set-up of a comprehensive Earth science data catalog. The huge ISDC data catalog is realized by product-type-dependent tables filled with data-file-related metadata, which have relations to corresponding metadata tables. The product-type-describing parent DIF XML metadata documents are stored and managed in ORACLE's XML storage structures. In order to improve the interoperability of the ISDC service portal, the existing proprietary catalog system will be extended by an ISO 19115 based web catalog service. In addition to this development, there is work concerning an ISDC-related semantic network of different kinds of metadata resources, such as standardized and non-standardized metadata documents and literature, as well as Web 2.0 user-generated information derived from tagging activities and social navigation data.
Predicting biomedical metadata in CEDAR: A study of Gene Expression Omnibus (GEO).
Panahiazar, Maryam; Dumontier, Michel; Gevaert, Olivier
2017-08-01
A crucial and limiting factor in data reuse is the lack of accurate, structured, and complete descriptions of data, known as metadata. Towards improving the quantity and quality of metadata, we propose a novel metadata prediction framework to learn associations from existing metadata that can be used to predict metadata values. We evaluate our framework in the context of experimental metadata from the Gene Expression Omnibus (GEO). We applied four rule mining algorithms to the most common structured metadata elements (sample type, molecular type, platform, label type and organism) from over 1.3 million GEO records. We examined the quality of well-supported rules from each algorithm and visualized the dependencies among metadata elements. Finally, we evaluated the performance of the algorithms in terms of accuracy, precision, recall, and F-measure. We found that PART is the best algorithm, outperforming Apriori, Predictive Apriori, and Decision Table. All algorithms perform significantly better in predicting class values than the majority vote classifier. We found that the performance of the algorithms is related to the dimensionality of the GEO elements. The average performance of all algorithms increases as the dimensionality of the unique values of these elements decreases (2697 platforms, 537 organisms, 454 labels, 9 molecules, and 5 types). Our work suggests that experimental metadata such as present in GEO can be accurately predicted using rule mining algorithms. Our work has implications for both prospective and retrospective augmentation of metadata quality, which are geared towards making data easier to find and reuse. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
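The evaluation idea, predicting one metadata element from others and comparing against a majority-vote baseline, can be sketched compactly. The records and the single hand-written rule below are invented for illustration; the study mined its rules automatically with PART and Apriori-family algorithms.

```python
# Sketch of comparing a rule-based prediction of a metadata element against
# a majority-vote baseline. Data and the toy rule are invented.
from collections import Counter

records = [
    {"organism": "Homo sapiens",  "platform": "GPL570"},
    {"organism": "Homo sapiens",  "platform": "GPL570"},
    {"organism": "Mus musculus",  "platform": "GPL1261"},
    {"organism": "Mus musculus",  "platform": "GPL1261"},
]

def rule_predict(r):
    # Toy rule: organism determines the most typical platform.
    return "GPL570" if r["organism"] == "Homo sapiens" else "GPL1261"

majority = Counter(r["platform"] for r in records).most_common(1)[0][0]

def accuracy(predict):
    return sum(predict(r) == r["platform"] for r in records) / len(records)

print("rule accuracy:    ", accuracy(rule_predict))      # 1.0 on this toy data
print("majority accuracy:", accuracy(lambda r: majority))  # 0.5 on this toy data
```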
Mercury Toolset for Spatiotemporal Metadata
NASA Technical Reports Server (NTRS)
Wilson, Bruce E.; Palanisamy, Giri; Devarakonda, Ranjeet; Rhyne, B. Timothy; Lindsley, Chris; Green, James
2010-01-01
Mercury (http://mercury.ornl.gov) is a set of tools for federated harvesting, searching, and retrieving metadata, particularly spatiotemporal metadata. Version 3.0 of the Mercury toolset provides orders of magnitude improvements in search speed, support for additional metadata formats, integration with Google Maps for spatial queries, facetted type search, support for RSS (Really Simple Syndication) delivery of search results, and enhanced customization to meet the needs of the multiple projects that use Mercury. It provides a single portal to very quickly search for data and information contained in disparate data management systems, each of which may use different metadata formats. Mercury harvests metadata and key data from contributing project servers distributed around the world and builds a centralized index. The search interfaces then allow the users to perform a variety of fielded, spatial, and temporal searches across these metadata sources. This centralized repository of metadata with distributed data sources provides extremely fast search results to the user, while allowing data providers to advertise the availability of their data and maintain complete control and ownership of that data. Mercury periodically (typically daily) harvests metadata sources through a collection of interfaces and re-indexes these metadata to provide extremely rapid search capabilities, even over collections with tens of millions of metadata records. A number of both graphical and application interfaces have been constructed within Mercury, to enable both human users and other computer programs to perform queries. Mercury was also designed to support multiple different projects, so that the particular fields that can be queried and used with search filters are easy to configure for each different project.
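The harvest-then-index workflow described above, pulling metadata from distributed providers into one centralized, searchable index, might look like the sketch below. The provider names, record fields, and in-memory "index" are placeholders, not Mercury's actual interfaces.

```python
# Sketch of the harvest-and-index pattern: records from several providers are
# normalized and added to one searchable index that supports fielded keyword
# and temporal queries. All names are placeholders, not Mercury's API.
index = []   # a real system would use a search engine index, not a list

def harvest(provider_records, provider_name):
    for rec in provider_records:
        index.append({"provider": provider_name,
                      "title": rec["title"],
                      "start": rec["start"], "end": rec["end"]})

def search(keyword=None, start=None, end=None):
    hits = index
    if keyword:
        hits = [r for r in hits if keyword.lower() in r["title"].lower()]
    if start and end:   # keep records whose time range overlaps the query range
        hits = [r for r in hits if not (r["end"] < start or r["start"] > end)]
    return hits

harvest([{"title": "Soil moisture 2008", "start": "2008-01-01", "end": "2008-12-31"}],
        "Provider A")
harvest([{"title": "Snow depth 2009", "start": "2009-01-01", "end": "2009-12-31"}],
        "Provider B")
print(search(keyword="snow", start="2009-06-01", end="2009-07-01"))
```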
Metadata Realities for Cyberinfrastructure: Data Authors as Metadata Creators
ERIC Educational Resources Information Center
Mayernik, Matthew Stephen
2011-01-01
As digital data creation technologies become more prevalent, data and metadata management are necessary to make data available, usable, sharable, and storable. Researchers in many scientific settings, however, have little experience or expertise in data and metadata management. In this dissertation, I explore the everyday data and metadata…
Climate Data Provenance Tracking for Just-In-Time Computation
NASA Astrophysics Data System (ADS)
Fries, S.; Nadeau, D.; Doutriaux, C.; Williams, D. N.
2016-12-01
The "Climate Data Management System" (CDMS) was created in 1996 as part of the Climate Data Analysis Tools suite of software. It provides a simple interface into a wide variety of climate data formats, and creates NetCDF CF-Compliant files. It leverages the NumPy framework for high performance computation, and is an all-in-one IO and computation package. CDMS has been extended to track manipulations of data, and trace that data all the way to the original raw data. This extension tracks provenance about data, and enables just-in-time (JIT) computation. The provenance for each variable is packaged as part of the variable's metadata, and can be used to validate data processing and computations (by repeating the analysis on the original data). It also allows for an alternate solution for sharing analyzed data; if the bandwidth for a transfer is prohibitively expensive, the provenance serialization can be passed in a much more compact format and the analysis rerun on the input data. Data provenance tracking in CDMS enables far-reaching and impactful functionalities, permitting implementation of many analytical paradigms.
McIDAS-V: Advanced Visualization for 3D Remote Sensing Data
NASA Astrophysics Data System (ADS)
Rink, T.; Achtor, T. H.
2010-12-01
McIDAS-V is a Java-based, open-source, freely available software package for analysis and visualization of geophysical data. Its advanced capabilities provide very interactive 4-D displays, including 3D volumetric rendering and fast sub-manifold slicing, linked to an abstract mathematical data model with built-in metadata for units, coordinate system transforms and sampling topology. A Jython interface provides user-defined analysis and computation in terms of the internal data model. These powerful capabilities to integrate data, analysis and visualization are being applied to hyperspectral sounding retrievals (e.g., AIRS and IASI) of moisture and cloud density, to interrogate and analyze their 3D structure as well as to validate them against instruments such as CALIPSO, CloudSat and MODIS. The object-oriented framework design allows for specialized extensions for novel displays and new sources of data. Community-defined CF conventions for gridded data are understood by the software, and such data can be immediately imported into the application. This presentation will show examples of how McIDAS-V is used in 3-dimensional data analysis, display and evaluation.
Content Metadata Standards for Marine Science: A Case Study
Riall, Rebecca L.; Marincioni, Fausto; Lightsom, Frances L.
2004-01-01
The U.S. Geological Survey developed a content metadata standard to meet the demands of organizing electronic resources in the marine sciences for a broad, heterogeneous audience. These metadata standards are used by the Marine Realms Information Bank project, a Web-based public distributed library of marine science from academic institutions and government agencies. The development and deployment of this metadata standard serve as a model, complete with lessons about mistakes, for the creation of similarly specialized metadata standards for digital libraries.
Heidelberger, Philip; Steinmacher-Burow, Burkhard
2015-01-06
According to one embodiment, a method for implementing an array-based queue in memory of a memory system that includes a controller includes configuring, in the memory, metadata of the array-based queue. The configuring comprises defining, in metadata, an array start location in the memory for the array-based queue, defining, in the metadata, an array size for the array-based queue, defining, in the metadata, a queue top for the array-based queue and defining, in the metadata, a queue bottom for the array-based queue. The method also includes the controller serving a request for an operation on the queue, the request providing the location in the memory of the metadata of the queue.
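The queue metadata named above (array start location, array size, queue top, queue bottom) corresponds to a standard circular buffer. A sketch under that reading is below; the memory-controller details of the patented method are not modeled.

```python
# Sketch of an array-based (circular) queue whose state lives in explicit
# metadata fields mirroring the description above: start/storage, array size,
# queue top, and queue bottom.
class ArrayQueue:
    def __init__(self, size):
        self.array = [None] * size        # "array start location" plus storage
        self.size = size                  # "array size"
        self.top = 0                      # index of the next element to serve
        self.bottom = 0                   # index of the next free slot
        self.count = 0

    def enqueue(self, item):
        if self.count == self.size:
            raise OverflowError("queue full")
        self.array[self.bottom] = item
        self.bottom = (self.bottom + 1) % self.size   # wrap around the array
        self.count += 1

    def dequeue(self):
        if self.count == 0:
            raise IndexError("queue empty")
        item = self.array[self.top]
        self.top = (self.top + 1) % self.size
        self.count -= 1
        return item

q = ArrayQueue(4)
q.enqueue("req-1"); q.enqueue("req-2")
print(q.dequeue(), q.dequeue())   # req-1 req-2
```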
WGISS-45 International Directory Network (IDN) Report
NASA Technical Reports Server (NTRS)
Morahan, Michael
2018-01-01
The objective of this presentation is to provide IDN (International Directory Network) updates on features and activities to the Committee on Earth Observation Satellites (CEOS) Working Group on Information Systems and Services (WGISS) and the provider community. The following topics will be discussed during the presentation: Transition of Providers' DIF-9 (Directory Interchange Format-9) to DIF-10 Metadata Records in the Common Metadata Repository (CMR); GCMD (Global Change Master Directory) Keyword Update; DIF-10 and UMM-C (Unified Metadata Model-Collections) Schema Changes; Metadata Validation of Provider Metadata; docBUILDER for Submitting IDN Metadata to the CMR (i.e. Registration); and Mapping the WGClimate Essential Climate Variable (ECV) Inventory to IDN Records.
Lessons in weather data interoperability: the National Mesonet Program
NASA Astrophysics Data System (ADS)
Evans, J. D.; Werner, B.; Cogar, C.; Heppner, P.
2015-12-01
The National Mesonet Program (NMP) links local, state, and regional surface weather observation networks (a.k.a. mesonets) to enhance the prediction of high-impact, local-scale weather events. A consortium of 23 (and counting) private firms, state agencies, and universities provides near-real-time observations from over 7,000 fixed weather stations, and over 1,000 vehicle-mounted sensors, every 15 minutes or less, together with the detailed sensor and station metadata required for effective forecasts and decision-making. In order to integrate these weather observations across the United States, and to provide full details about sensors, stations, and observations, the NMP has defined a set of conventions for observational data and sensor metadata. These conventions address the needs of users with limited bandwidth and computing resources, while also anticipating a growing variety of sensors and observations. For disseminating weather observation data, the NMP currently employs a simple ASCII format derived from the Integrated Ocean Observing System. This simplifies data ingest into common desktop software and parsing by simple scripts, and it directly supports basic readings of temperature, pressure, etc. By extending the format to vector-valued observations, it can also convey readings taken at different altitudes (e.g., wind speed) or depths (e.g., soil moisture). Extending beyond these observations to fit a greater variety of sensors (solar irradiation, sodar, radar, lidar) may require further extensions, or a move to more complex formats (e.g., based on XML or JSON). We will discuss the tradeoffs of various conventions for different users and use cases. To convey sensor and station metadata, the NMP uses a convention known as Starfish Fungus Language (*FL), derived from the Open Geospatial Consortium's SensorML standard. *FL separates static and dynamic elements of a sensor description, allowing for relatively compact expressions that reference a library of shared definitions (e.g., sensor manufacturer's specifications) alongside time-varying and site-specific details (slope/aspect, calibration, etc.). We will discuss the tradeoffs of *FL, SensorML, and alternatives for conveying sensor details to various users and uses.
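A simple delimited observation record extended with one vector-valued field (readings at several heights) might look like the sketch below. The column layout, delimiters, and field names are invented for illustration; the actual NMP/IOOS-derived format is not reproduced in the abstract.

```python
# Illustrative parser for a simple ASCII observation line containing one
# scalar field and one vector-valued field (wind speed at several heights).
# The format shown is an invented stand-in, not the actual NMP convention.
LINE = "KDEN,2015-08-01T12:15:00Z,temp_c=31.2,wspd_ms@10;50;80=4.1;6.0;7.3"

def parse(line):
    station, timestamp, *fields = line.split(",")
    obs = {"station": station, "time": timestamp}
    for field in fields:
        name, value = field.split("=")
        if "@" in name:                          # vector-valued observation
            name, levels = name.split("@")
            heights = [float(h) for h in levels.split(";")]
            values = [float(v) for v in value.split(";")]
            obs[name] = dict(zip(heights, values))   # height -> reading
        else:
            obs[name] = float(value)
    return obs

print(parse(LINE))
```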
Explorative Analyses of Nursing Research Data.
Kim, Hyeoneui; Jang, Imho; Quach, Jimmy; Richardson, Alex; Kim, Jaemin; Choi, Jeeyae
2016-10-26
As a first step of pursuing the vision of "big data science in nursing," we described the characteristics of nursing research data reported in 194 published nursing studies. We also explored how completely the Version 1 metadata specification of biomedical and healthCAre Data Discovery Index Ecosystem (bioCADDIE) represents these metadata. The metadata items of the nursing studies were all related to one or more of the bioCADDIE metadata entities. However, values of many metadata items of the nursing studies were not sufficiently represented through the bioCADDIE metadata. This was partly due to the differences in the scope of the content that the bioCADDIE metadata are designed to represent. The 194 nursing studies reported a total of 1,181 unique data items, the majority of which take non-numeric values. This indicates the importance of data standardization to enable the integrative analyses of these data to support big data science in nursing. © The Author(s) 2016.
A Model for Enhancing Internet Medical Document Retrieval with “Medical Core Metadata”
Malet, Gary; Munoz, Felix; Appleyard, Richard; Hersh, William
1999-01-01
Objective: Finding documents on the World Wide Web relevant to a specific medical information need can be difficult. The goal of this work is to define a set of document content description tags, or metadata encodings, that can be used to promote disciplined search access to Internet medical documents. Design: The authors based their approach on a proposed metadata standard, the Dublin Core Metadata Element Set, which has recently been submitted to the Internet Engineering Task Force. Their model also incorporates the National Library of Medicine's Medical Subject Headings (MeSH) vocabulary and Medline-type content descriptions. Results: The model defines a medical core metadata set that can be used to describe the metadata for a wide variety of Internet documents. Conclusions: The authors propose that their medical core metadata set be used to assign metadata to medical documents to facilitate document retrieval by Internet search engines. PMID:10094069
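A document description combining Dublin Core elements with MeSH subject terms, in the spirit of the proposed medical core metadata set, could be sketched as below. The qualifier names are illustrative only and do not reproduce the authors' element set.

```python
# Sketch of a medical document record mixing Dublin Core elements with MeSH
# subject headings, plus a simple filter on the controlled subject field.
# Qualifier names are illustrative, not the proposed standard's exact set.
document_metadata = {
    "dc:title": "Management of community-acquired pneumonia",
    "dc:creator": "Example Clinic Guidelines Committee",
    "dc:date": "1998-11-02",
    "dc:type": "practice guideline",
    "dc:subject.mesh": ["Pneumonia", "Community-Acquired Infections",
                        "Anti-Bacterial Agents"],
}

def matches(record, mesh_term):
    """Disciplined retrieval: filter on the controlled MeSH subject field."""
    return mesh_term in record.get("dc:subject.mesh", [])

print(matches(document_metadata, "Pneumonia"))   # True
```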
MPEG-7: standard metadata for multimedia content
NASA Astrophysics Data System (ADS)
Chang, Wo
2005-08-01
The eXtensible Markup Language (XML) metadata technology of describing media contents has emerged as a dominant mode of making media searchable for both human and machine consumption. To realize this promise, many online Web applications are pushing this concept to its fullest potential. However, a good metadata model does require a robust standardization effort so that the metadata content and its structure can reach maximum usage between various applications. An effective media content description technology should also use standard metadata structures, especially when dealing with various multimedia contents. A new metadata technology called MPEG-7 content description has emerged from the ISO MPEG standards body with the charter of defining standard metadata to describe audiovisual content. This paper will give an overview of MPEG-7 technology and what impact it can bring to the next generation of multimedia indexing and retrieval applications.
Quality Assurance for Digital Learning Object Repositories: Issues for the Metadata Creation Process
ERIC Educational Resources Information Center
Currier, Sarah; Barton, Jane; O'Beirne, Ronan; Ryan, Ben
2004-01-01
Metadata enables users to find the resources they require, therefore it is an important component of any digital learning object repository. Much work has already been done within the learning technology community to assure metadata quality, focused on the development of metadata standards, specifications and vocabularies and their implementation…
A Model for the Creation of Human-Generated Metadata within Communities
ERIC Educational Resources Information Center
Brasher, Andrew; McAndrew, Patrick
2005-01-01
This paper considers situations for which detailed metadata descriptions of learning resources are necessary, and focuses on human generation of such metadata. It describes a model which facilitates human production of good quality metadata by the development and use of structured vocabularies. Using examples, this model is applied to single and…
Enhancing SCORM Metadata for Assessment Authoring in E-Learning
ERIC Educational Resources Information Center
Chang, Wen-Chih; Hsu, Hui-Huang; Smith, Timothy K.; Wang, Chun-Chia
2004-01-01
With the rapid development of distance learning and the XML technology, metadata play an important role in e-Learning. Nowadays, many distance learning standards, such as SCORM, AICC CMI, IEEE LTSC LOM and IMS, use metadata to tag learning materials. However, most metadata models are used to define learning materials and test problems. Few…
Development of Health Information Search Engine Based on Metadata and Ontology
Song, Tae-Min; Jin, Dal-Lae
2014-01-01
Objectives The aim of the study was to develop a metadata and ontology-based health information search engine ensuring semantic interoperability to collect and provide health information using different application programs. Methods Health information metadata ontology was developed using a distributed semantic Web content publishing model based on vocabularies used to index the contents generated by the information producers as well as those used to search the contents by the users. Vocabulary for health information ontology was mapped to the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), and a list of about 1,500 terms was proposed. The metadata schema used in this study was developed by adding an element describing the target audience to the Dublin Core Metadata Element Set. Results A metadata schema and an ontology ensuring interoperability of health information available on the internet were developed. The metadata and ontology-based health information search engine developed in this study produced a better search result compared to existing search engines. Conclusions Health information search engine based on metadata and ontology will provide reliable health information to both information producer and information consumers. PMID:24872907
Development of health information search engine based on metadata and ontology.
Song, Tae-Min; Park, Hyeoun-Ae; Jin, Dal-Lae
2014-04-01
The aim of the study was to develop a metadata and ontology-based health information search engine ensuring semantic interoperability to collect and provide health information using different application programs. Health information metadata ontology was developed using a distributed semantic Web content publishing model based on vocabularies used to index the contents generated by the information producers as well as those used to search the contents by the users. Vocabulary for health information ontology was mapped to the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), and a list of about 1,500 terms was proposed. The metadata schema used in this study was developed by adding an element describing the target audience to the Dublin Core Metadata Element Set. A metadata schema and an ontology ensuring interoperability of health information available on the internet were developed. The metadata and ontology-based health information search engine developed in this study produced a better search result compared to existing search engines. Health information search engine based on metadata and ontology will provide reliable health information to both information producer and information consumers.
A metadata template for ocean acidification data
NASA Astrophysics Data System (ADS)
Jiang, L.
2014-12-01
Metadata is structured information that describes, explains, and locates an information resource (e.g., data). It is often coarsely described as data about data, and it documents information such as what was measured, by whom, when, where, and how it was sampled and analyzed, and with what instruments. Metadata is essential to ensure the survivability and accessibility of the data into the future. With the rapid expansion of biological-response ocean acidification (OA) studies, the lack of a common metadata template to document this type of data has become a significant gap for ocean acidification data management efforts. In this paper, we present a metadata template that can be applied to a broad spectrum of OA studies, including those studying the biological responses of organisms to ocean acidification. The "variable metadata section", which includes the variable name, observation type, whether the variable is a manipulation condition or response variable, and the biological subject on which the variable is studied, forms the core of this metadata template. Additional metadata elements, such as principal investigators, temporal and spatial coverage, platforms for the sampling, and data citation, are essential components to complete the template. We explain the structure of the template, and define many metadata elements that may be unfamiliar to researchers. For that reason, this paper can serve as a user's manual for the template.
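The described structure, a dataset-level section plus a per-variable "variable metadata section", can be sketched as a template record. The key names below paraphrase the elements listed in the abstract and are not the official template fields.

```python
# Sketch of the described template: dataset-level elements plus a "variable
# metadata section" per variable. Key names paraphrase the abstract and are
# not the official template.
oa_metadata = {
    "principal_investigators": ["A. Researcher"],
    "temporal_coverage": {"start": "2013-05-01", "end": "2013-08-31"},
    "spatial_coverage": {"lat": [24.5, 25.0], "lon": [-81.0, -80.5]},
    "platform": "laboratory culture system",
    "variables": [
        {"name": "pCO2", "observation_type": "measured",
         "role": "manipulation condition", "biological_subject": None},
        {"name": "calcification rate", "observation_type": "calculated",
         "role": "response variable", "biological_subject": "Acropora cervicornis"},
    ],
}

# Simple completeness check on the core per-variable elements.
missing = [v["name"] for v in oa_metadata["variables"] if not v.get("role")]
print("variables missing a role:", missing or "none")
```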
A Shared Infrastructure for Federated Search Across Distributed Scientific Metadata Catalogs
NASA Astrophysics Data System (ADS)
Reed, S. A.; Truslove, I.; Billingsley, B. W.; Grauch, A.; Harper, D.; Kovarik, J.; Lopez, L.; Liu, M.; Brandt, M.
2013-12-01
The vast amount of science metadata can be overwhelming and highly complex. Comprehensive analysis and sharing of metadata is difficult since institutions often publish to their own repositories. There are many disjoint standards used for publishing scientific data, making it difficult to discover and share information from different sources. Services that publish metadata catalogs often have different protocols, formats, and semantics. The research community is limited by the exclusivity of separate metadata catalogs, and thus it is desirable to have federated search interfaces capable of unified search queries across multiple sources. Aggregation of metadata catalogs also enables users to critique metadata more rigorously. With these motivations in mind, the National Snow and Ice Data Center (NSIDC) and the Advanced Cooperative Arctic Data and Information Service (ACADIS) implemented two search interfaces for the community. Both the NSIDC Search and the ACADIS Arctic Data Explorer (ADE) use a common infrastructure, which keeps maintenance costs low. The search clients are designed to make OpenSearch requests against Solr, an open-source search platform. Solr applies indexes to specific fields of the metadata, which in this instance optimizes queries containing keywords, spatial bounds and temporal ranges. NSIDC metadata is reused by both search interfaces, but the ADE also brokers additional sources. Users can quickly find relevant metadata with minimal effort, which ultimately lowers costs for research. This presentation will highlight the reuse of data and code between NSIDC and ACADIS, discuss challenges and milestones for each project, and identify the creation and use of open-source libraries.
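A Solr "select" query combining keyword, temporal, and bounding-box filters, similar in spirit to the fielded searches described above, could be assembled as in the sketch below. The core URL and field names ("keywords", "temporal_start", "location") are placeholders, not the NSIDC/ADE schema.

```python
# Sketch of assembling a Solr select query with keyword, temporal-overlap and
# bounding-box filters. Field names and the core URL are placeholders.
from urllib.parse import urlencode

def solr_query_url(core_url, keyword, bbox, start, end):
    # bbox = (min_lon, min_lat, max_lon, max_lat); Solr's rectangle range
    # syntax on a spatial field is "lat,lon TO lat,lon".
    params = [
        ("q", f"keywords:({keyword})"),
        ("fq", f"temporal_start:[* TO {end}] AND temporal_end:[{start} TO *]"),
        ("fq", f"location:[{bbox[1]},{bbox[0]} TO {bbox[3]},{bbox[2]}]"),
        ("wt", "json"),
    ]
    return core_url + "/select?" + urlencode(params)

print(solr_query_url("http://localhost:8983/solr/metadata", "sea ice",
                     (-180, 60, 180, 90),
                     "2012-01-01T00:00:00Z", "2012-12-31T23:59:59Z"))
```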
NASA Astrophysics Data System (ADS)
Kuznetsova, M. M.; Heynderickz, D.; Grande, M.; Opgenoorth, H. J.
2017-12-01
The COSPAR/ILWS roadmap on space weather published in 2015 (Advances in Space Research, 2015, DOI: 10.1016/j.asr.2015.03.023) prioritizes steps to be taken to advance understanding of space environment phenomena and to improve space weather forecasting capabilities. General recommendations include development of a comprehensive space environment specification, assessment of the state of the field on a 5-year basis, and standardization of metadata and product metrics. To facilitate progress towards the roadmap goals there is a need for a global hub for collaborative space weather capabilities assessment and development that brings together research, engineering, operational, educational, and end-user communities. The COSPAR Panel on Space Weather is aiming to build upon past progress and to facilitate coordination of established and new international space weather research and development initiatives. Keys to success include creating a flexible, collaborative, inclusive environment and engaging motivated groups and individuals committed to active participation in international multi-disciplinary teams focused on topics addressing emerging needs and challenges in the rapidly growing field of space weather. The near-term focus includes a comprehensive assessment of the state of the field, establishing an internationally recognized process to quantify and track progress over time, and development of a global network of distributed web-based resources and interconnected interactive services required for space weather research, analysis, forecasting and education.
A standard for measuring metadata quality in spectral libraries
NASA Astrophysics Data System (ADS)
Rasaiah, B.; Jones, S. D.; Bellman, C.
2013-12-01
There is an urgent need within the international remote sensing community to establish a metadata standard for field spectroscopy that ensures high quality, interoperable metadata sets that can be archived and shared efficiently within Earth observation data sharing systems. Metadata are an important component in the cataloguing and analysis of in situ spectroscopy datasets because of their central role in identifying and quantifying the quality and reliability of spectral data and the products derived from them. This paper presents approaches to measuring metadata completeness and quality in spectral libraries to determine reliability, interoperability, and re-useability of a dataset. Explored are quality parameters that meet the unique requirements of in situ spectroscopy datasets, across many campaigns. Examined are the challenges presented by ensuring that data creators, owners, and data users ensure a high level of data integrity throughout the lifecycle of a dataset. Issues such as field measurement methods, instrument calibration, and data representativeness are investigated. The proposed metadata standard incorporates expert recommendations that include metadata protocols critical to all campaigns, and those that are restricted to campaigns for specific target measurements. The implication of semantics and syntax for a robust and flexible metadata standard are also considered. Approaches towards an operational and logistically viable implementation of a quality standard are discussed. This paper also proposes a way forward for adapting and enhancing current geospatial metadata standards to the unique requirements of field spectroscopy metadata quality.
Metadata Design in the New PDS4 Standards - Something for Everybody
NASA Astrophysics Data System (ADS)
Raugh, Anne C.; Hughes, John S.
2015-11-01
The Planetary Data System (PDS) archives, supports, and distributes data of diverse targets, from diverse sources, to diverse users. One of the core problems addressed by the PDS4 data standard redesign was that of metadata: how to accommodate the increasingly sophisticated demands of search interfaces, analytical software, and observational documentation into label standards without imposing limits and constraints that would impinge on the quality or quantity of metadata that any particular observer or team could supply. And yet, as an archive, PDS must have detailed documentation for the metadata in the labels it supports, or the institutional knowledge encoded into those attributes will be lost, putting the data at risk. The PDS4 metadata solution is based on a three-step approach. First, it is built on two key ISO standards: ISO 11179 "Information Technology - Metadata Registries", which provides a common framework and vocabulary for defining metadata attributes; and ISO 14721 "Space Data and Information Transfer Systems - Open Archival Information System (OAIS) Reference Model", which provides the framework for the information architecture that enforces the object-oriented paradigm for metadata modeling. Second, PDS has defined a hierarchical system that allows it to divide its metadata universe into namespaces ("data dictionaries", conceptually), and more importantly to delegate stewardship for a single namespace to a local authority. This means that a mission can develop its own data model with a high degree of autonomy and effectively extend the PDS model to accommodate its own metadata needs within the common ISO 11179 framework. Finally, within a single namespace, even the core PDS namespace, existing metadata structures can be extended and new structures added to the model as new needs are identified. This poster illustrates the PDS4 approach to metadata management and highlights the expected return on the development investment for PDS, users and data preparers.
NASA Astrophysics Data System (ADS)
Delory, E.; Jirka, S.
2016-02-01
Discovering sensors and observation data is important when enabling the exchange of oceanographic data between observatories and scientists that need the data sets for their work. To better support this discovery process, one task of the European project FixO3 (Fixed-point Open Ocean Observatories) is addressing the question of which elements are needed to develop a better registry for sensors. This has resulted in four items which are addressed by the FixO3 project in cooperation with further European projects such as NeXOS (http://www.nexosproject.eu/). 1.) Metadata description format: To store and retrieve information about sensors and platforms it is necessary to have a common approach to providing and encoding the metadata. For this purpose, the OGC Sensor Model Language (SensorML) 2.0 standard was selected. In particular, the ability to distinguish between sensor types and instances offers new opportunities for more efficient provision and maintenance of sensor metadata. 2.) Conversion of existing metadata into a SensorML 2.0 representation: In order to ensure a sustainable re-use of already provided metadata content (e.g. from ESONET-FixO3 yellow pages), it is important to provide a mechanism which is capable of transforming these already available metadata sets into the new SensorML 2.0 structure. 3.) Metadata editor: To create descriptions of sensors and platforms, users cannot be expected to manually edit XML-based description files. Thus, a visual interface is necessary to assist with metadata creation. We will outline a prototype of this editor, building upon the development of the ESONET sensor registry interface. 4.) Sensor Metadata Store: A server is needed for storing and querying the created sensor descriptions. For this purpose different options exist, which will be discussed. In summary, we will present a set of different elements enabling sensor discovery, ranging from metadata formats, metadata conversion and editing to metadata storage. Furthermore, the current development status will be demonstrated.
NASA Astrophysics Data System (ADS)
Prasad, U.; Rahabi, A.
2001-05-01
The following utilities, developed for dumping HDF-EOS format data, are of special use for Earth science data from NASA's Earth Observing System (EOS). This poster demonstrates their use and application. The first four tools take HDF-EOS data files as input. HDF-EOS Metadata Dumper - metadmp: The metadata dumper extracts metadata from EOS data granules. It operates by simply copying blocks of metadata from the file to the standard output. It does not process the metadata in any way. Since all metadata in EOS granules is encoded in the Object Description Language (ODL), the output of metadmp will be in the form of complete ODL statements. EOS data granules may contain up to three different sets of metadata (Core, Archive, and Structural Metadata). HDF-EOS Contents Dumper - heosls: The heosls dumper displays the contents of HDF-EOS files. This utility provides detailed information on the POINT, SWATH, and GRID data sets in the files. For example, it will list the geolocation fields, data fields and objects. HDF-EOS ASCII Dumper - asciidmp: The ASCII dump utility extracts fields from EOS data granules into plain ASCII text. The output from asciidmp should be easily human readable. With minor editing, asciidmp's output can be made ingestible by any application with ASCII import capabilities. HDF-EOS Binary Dumper - bindmp: The binary dumper utility dumps HDF-EOS objects in binary format. This is useful for feeding its output into existing programs that do not understand HDF, for example custom software and COTS products. HDF-EOS User Friendly Metadata - UFM: The UFM utility is useful for viewing ECS metadata. UFM takes an EOSDIS ODL metadata file and produces an HTML report of the metadata for display using a web browser. HDF-EOS METCHECK - METCHECK: METCHECK can be invoked from either a Unix or DOS environment with a set of command-line options that direct the tool's inputs and output. METCHECK validates the inventory metadata in a (.met) file using the descriptor (.desc) file as the reference. The tool takes a (.desc) file and a (.met) ODL file as inputs, and generates a simple output file containing the results of the checking process.
The XML Metadata Editor of GFZ Data Services
NASA Astrophysics Data System (ADS)
Ulbricht, Damian; Elger, Kirsten; Tesei, Telemaco; Trippanera, Daniele
2017-04-01
Following the FAIR data principles, research data should be Findable, Accessible, Interoperable and Reusable. Publishing data under these principles requires assigning persistent identifiers to the data and generating rich machine-actionable metadata. To increase interoperability, metadata should include shared vocabularies and crosslink the newly published (meta)data and related material. However, structured metadata formats tend to be complex and are not intended to be generated by individual scientists. Software solutions are needed that support scientists in providing metadata describing their data. To facilitate data publication activities of 'GFZ Data Services', we programmed an XML metadata editor that assists scientists in creating metadata in different schemata popular in the earth sciences (ISO19115, DIF, DataCite), while being at the same time usable by and understandable for scientists. Emphasis is placed on removing barriers; in particular, the editor is publicly available on the internet without registration [1] and scientists are not asked to provide information that can be generated automatically (e.g. the URL of a specific licence or the contact information of the metadata distributor). Metadata are stored in browser cookies and a copy can be saved to the local hard disk. To improve usability, form fields are translated into the scientific language, e.g. 'creators' of the DataCite schema are called 'authors'. To assist in filling in the form, we make use of drop-down menus for small vocabulary lists and offer a search facility for large thesauri. Explanations of form fields and definitions of vocabulary terms are provided in pop-up windows and a full documentation is available for download via the help menu. In addition, multiple geospatial references can be entered via an interactive mapping tool, which helps to minimize problems with different conventions for providing latitudes and longitudes. Currently, we are extending the metadata editor to be reused to generate metadata for data discovery and contextual metadata developed by the 'Multi-scale Laboratories' Thematic Core Service of the European Plate Observing System (EPOS-IP). The editor will be used to build a common repository of a large variety of geological and geophysical datasets produced by multidisciplinary laboratories throughout Europe, thus contributing to a significant step toward the integration and accessibility of earth science data. This presentation will introduce the metadata editor and show the adjustments made for EPOS-IP. [1] http://dataservices.gfz-potsdam.de/panmetaworks/metaedit
Evolving Metadata in NASA Earth Science Data Systems
NASA Astrophysics Data System (ADS)
Mitchell, A.; Cechini, M. F.; Walter, J.
2011-12-01
NASA's Earth Observing System (EOS) is a coordinated series of satellites for long term global observations. NASA's Earth Observing System Data and Information System (EOSDIS) is a petabyte-scale archive of environmental data that supports global climate change research by providing end-to-end services from EOS instrument data collection to science data processing to full access to EOS and other earth science data. On a daily basis, the EOSDIS ingests, processes, archives and distributes over 3 terabytes of data from NASA's Earth Science missions, representing over 3500 data products spanning a range of science disciplines. EOSDIS is currently composed of 12 discipline-specific data centers that are collocated with centers of science discipline expertise. Metadata is used in all aspects of NASA's Earth Science data lifecycle, from the initial measurement gathering to the accessing of data products. Missions use metadata in their science data products when describing information such as the instrument/sensor, operational plan, and geographic region. Acting as the curator of the data products, data centers employ metadata for preservation, access and manipulation of data. EOSDIS provides a centralized metadata repository called the Earth Observing System (EOS) ClearingHouse (ECHO) for data discovery and access via a service-oriented architecture (SOA) between data centers and science data users. ECHO receives inventory metadata from data centers, which generate metadata files that comply with the ECHO Metadata Model. NASA's Earth Science Data and Information System (ESDIS) Project established a Tiger Team to study and make recommendations regarding the adoption of the international metadata standard ISO 19115 in EOSDIS. The result was a technical report recommending an evolution of NASA data systems towards a consistent application of ISO 19115 and related standards, including the creation of a NASA-specific convention for core ISO 19115 elements. Part of NASA's effort to continually evolve its data systems led ECHO to enhance the method by which it receives inventory metadata from the data centers to allow for multiple metadata formats, including ISO 19115. ECHO's metadata model will also be mapped to the NASA-specific convention for ingesting science metadata into the ECHO system. As NASA's new Earth Science missions and data centers migrate to the ISO 19115 standards, EOSDIS is developing metadata management resources to assist in reading, writing and parsing ISO 19115-compliant metadata. To foster interoperability with other agencies and international partners, NASA is working to ensure that a common ISO 19115 convention is developed, enhancing data sharing capabilities and other data analysis initiatives. NASA is also investigating the use of ISO 19115 standards to encode data quality, lineage and provenance with stored values. A common metadata standard across NASA's Earth Science data systems promotes interoperability, enhances data utilization and removes levels of uncertainty found in data products.
NASA Astrophysics Data System (ADS)
Okaya, D.; Deelman, E.; Maechling, P.; Wong-Barnum, M.; Jordan, T. H.; Meyers, D.
2007-12-01
Large scientific collaborations, such as the SCEC Petascale Cyberfacility for Physics-based Seismic Hazard Analysis (PetaSHA) Project, involve interactions between many scientists who exchange ideas and research results. These groups must organize, manage, and make accessible their community materials of observational data, derivative (research) results, computational products, and community software. The integration of scientific workflows as a paradigm to solve complex computations provides advantages of efficiency, reliability, repeatability, choices, and ease of use. The underlying resource needed for a scientific workflow to function and create discoverable and exchangeable products is the construction, tracking, and preservation of metadata. In the scientific workflow environment there is a two-tier structure of metadata. Workflow-level metadata and provenance describe operational steps, identity of resources, execution status, and product locations and names. Domain-level metadata essentially define the scientific meaning of data, codes and products. To a large degree the metadata at these two levels are separate. However, between these two levels is a subset of metadata produced at one level but needed by the other. This crossover metadata suggests that some commonality in metadata handling is needed. SCEC researchers are collaborating with computer scientists at SDSC, the USC Information Sciences Institute, and Carnegie Mellon Univ. in order to perform earthquake science using high-performance computational resources. A primary objective of the "PetaSHA" collaboration is to perform physics-based estimations of strong ground motion associated with real and hypothetical earthquakes located within Southern California. Construction of 3D earth models, earthquake representations, and numerical simulation of seismic waves are key components of these estimations. Scientific workflows are used to orchestrate the sequences of scientific tasks and to access distributed computational facilities such as the NSF TeraGrid. Different types of metadata are produced and captured within the scientific workflows. One workflow within PetaSHA ("Earthworks") performs a linear sequence of tasks with workflow and seismological metadata preserved. Downstream scientific codes ingest these metadata produced by upstream codes. The seismological metadata uses attribute-value pairing in plain text; an identified need is to use more advanced handling methods. Another workflow system within PetaSHA ("Cybershake") involves several complex workflows in order to perform statistical analysis of ground shaking due to thousands of hypothetical but plausible earthquakes. Metadata management has been challenging due to its construction around a number of legacy scientific codes. We describe difficulties arising in the scientific workflow due to the lack of this metadata and suggest corrective steps, which in some cases include the cultural shift of domain science programmers coding for metadata.
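The abstract above notes that the Earthworks workflow preserves seismological metadata as attribute-value pairs in plain text. The sketch below shows how such pairs could be parsed into a dictionary for downstream codes; the "key = value" layout and the field names are illustrative assumptions, not the actual PetaSHA formats.

```python
# Minimal sketch of parsing plain-text attribute-value metadata of the kind
# described above. The layout and field names are assumptions for illustration.

def parse_attribute_value(text):
    """Return a dict of attribute-value pairs from simple 'key = value' lines."""
    metadata = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):   # skip blanks and comments
            continue
        if "=" not in line:
            continue                           # ignore malformed lines
        key, _, value = line.partition("=")
        metadata[key.strip()] = value.strip()
    return metadata

if __name__ == "__main__":
    sample = """
    # hypothetical seismological metadata record
    source_model = example_fault_v1
    grid_spacing_m = 200
    simulation_code = example_wave_solver
    """
    print(parse_attribute_value(sample))
```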
Hydrodynamics and Water Quality forecasting over a Cloud Computing environment: INDIGO-DataCloud
NASA Astrophysics Data System (ADS)
Aguilar Gómez, Fernando; de Lucas, Jesús Marco; García, Daniel; Monteoliva, Agustín
2017-04-01
Algal bloom due to eutrophication is a widespread problem for water reservoirs and lakes that directly impacts water quality. It can create a dead zone that lacks enough oxygen to support life, and it can also be harmful to humans, so it must be controlled in water bodies used for supply, bathing or other purposes. Hydrodynamic and water quality modelling can help forecast the status of the water system in order to alert authorities before an algal bloom occurs. It can be used to predict scenarios and find solutions to reduce the harmful impact of the blooms. High-resolution models need to process large amounts of data on a sufficiently robust computing infrastructure. INDIGO-DataCloud (https://www.indigo-datacloud.eu/) is a European Commission-funded project that aims to develop a data and computing platform targeting scientific communities, deployable on multiple hardware platforms and provisioned over hybrid (private or public) e-infrastructures. The project addresses the development of solutions for different case studies using different Cloud-based alternatives. In the first INDIGO software release, a set of components is ready to manage the deployment of services to perform many Delft3D simulations (for calibration or scenario definition) over a cloud computing environment, using Docker technology: TOSCA requirement descriptions, a Docker repository, an orchestrator, AAI (Authorization, Authentication) and OneData (a distributed storage system). Moreover, the Future Gateway portal, based on Liferay, provides a user-friendly interface where the user can configure the simulations. Due to the data approach of INDIGO, the developed solutions can contribute to managing the full data life cycle of a project, thanks to different tools for managing datasets and metadata. Furthermore, the cloud environment helps provide a dynamic, scalable and easy-to-use framework for users who are not IT experts. This framework is potentially capable of automating forecast processing through periodic tasks. For instance, a user can forecast the hydrodynamics and water quality status of a reservoir every month, starting from a base model and supplying new data gathered from instrumentation or observations. This interactive presentation aims to show the use of INDIGO solutions in a particular forecasting use case and to inspire others in the use of a Cloud framework for their applications.
Evaluating the privacy properties of telephone metadata.
Mayer, Jonathan; Mutchler, Patrick; Mitchell, John C
2016-05-17
Since 2013, a stream of disclosures has prompted reconsideration of surveillance law and policy. One of the most controversial principles, both in the United States and abroad, is that communications metadata receives substantially less protection than communications content. Several nations currently collect telephone metadata in bulk, including on their own citizens. In this paper, we attempt to shed light on the privacy properties of telephone metadata. Using a crowdsourcing methodology, we demonstrate that telephone metadata is densely interconnected, can trivially be reidentified, and can be used to draw sensitive inferences.
Studies of Big Data metadata segmentation between relational and non-relational databases
NASA Astrophysics Data System (ADS)
Golosova, M. V.; Grigorieva, M. A.; Klimentov, A. A.; Ryabinkin, E. A.; Dimitrov, G.; Potekhin, M.
2015-12-01
In recent years the concepts of Big Data have become well established in IT. Systems managing large data volumes produce metadata that describe the data and workflows. These metadata are used to obtain information about the current system state and for statistical and trend analysis of the processes these systems drive. Over time, the amount of stored metadata can grow dramatically. In this article we present our studies demonstrating how metadata storage scalability and performance can be improved by using a hybrid RDBMS/NoSQL architecture.
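To illustrate the segmentation idea in the abstract above, here is a minimal standard-library sketch: stable, frequently queried fields live in a relational table, while variable per-task attributes are kept as schema-less JSON documents. The table layout and field names are invented for illustration, and an SQLite JSON column stands in for the dedicated NoSQL store used in the actual study.

```python
# Sketch of RDBMS/NoSQL metadata segmentation using only the standard library.
# Fixed, frequently queried fields go into relational columns; variable,
# schema-less attributes are kept as JSON documents (here a TEXT column).
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE task_metadata (
        task_id     INTEGER PRIMARY KEY,
        status      TEXT NOT NULL,
        created_utc TEXT NOT NULL,
        extra_json  TEXT              -- schema-less part (document side)
    )
""")

# Insert a record whose flexible attributes differ from task to task.
extra = {"input_dataset": "example.data", "attempts": 3, "site": "example-site"}
conn.execute(
    "INSERT INTO task_metadata (task_id, status, created_utc, extra_json) VALUES (?, ?, ?, ?)",
    (1, "finished", "2015-06-01T12:00:00Z", json.dumps(extra)),
)

# Relational query on the fixed columns, document-style access on the rest.
row = conn.execute(
    "SELECT task_id, status, extra_json FROM task_metadata WHERE status = ?",
    ("finished",),
).fetchone()
task_id, status, extra_json = row
print(task_id, status, json.loads(extra_json)["attempts"])
```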
Incorporating ISO Metadata Using HDF Product Designer
NASA Technical Reports Server (NTRS)
Jelenak, Aleksandar; Kozimor, John; Habermann, Ted
2016-01-01
The need to store increasing amounts of metadata of varying complexity in HDF5 files is rapidly outgrowing the capabilities of the Earth science metadata conventions currently in use. Until now, data producers have had little choice but to come up with ad hoc solutions to this challenge. Such solutions, in turn, pose a wide range of issues for data managers, distributors, and, ultimately, data users. The HDF Group is experimenting with a novel approach of using ISO 19115 metadata objects as a catch-all container for all the metadata that cannot be fitted into the current Earth science data conventions. This presentation will showcase how the HDF Product Designer software can be utilized to help data producers include various ISO metadata objects in their products.
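As a rough illustration of the idea of carrying ISO metadata objects inside an HDF5 product, the sketch below stores an ISO 19115-style XML fragment as a string dataset using h5py. The group and attribute names are assumptions for illustration, not the conventions actually produced by HDF Product Designer.

```python
# Sketch of storing an ISO 19115-style XML fragment inside an HDF5 file.
# Requires h5py. Group/attribute names are illustrative assumptions only.
import h5py

iso_xml = """<gmi:MI_Metadata xmlns:gmi="http://www.isotc211.org/2005/gmi">
  <!-- truncated example ISO metadata object -->
</gmi:MI_Metadata>"""

with h5py.File("example_product.h5", "w") as f:
    grp = f.create_group("metadata")
    # Store the XML as a variable-length UTF-8 string dataset.
    dt = h5py.string_dtype(encoding="utf-8")
    grp.create_dataset("iso19115", data=iso_xml, dtype=dt)
    grp.attrs["metadata_standard"] = "ISO 19115-2"

with h5py.File("example_product.h5", "r") as f:
    raw = f["metadata/iso19115"][()]
    text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
    print(text[:60])
```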
NASA Astrophysics Data System (ADS)
O'Brien, K.; Hankin, S.; Mendelssohn, R.; Simons, R.; Smith, B.; Kern, K. J.
2013-12-01
The Observing System Monitoring Center (OSMC), a project funded by the National Oceanic and Atmospheric Administration's Climate Observations Division (COD), exists to join the discrete 'networks' of in situ ocean observing platforms -- ships, surface floats, profiling floats, tide gauges, etc. -- into a single, integrated system. The OSMC is addressing this goal through capabilities in three areas focusing on the needs of specific user groups: 1) it provides real-time monitoring of the integrated observing system assets to assist management in optimizing the cost-effectiveness of the system for the assessment of climate variables; 2) it makes the stream of real-time data coming from the observing system available to scientific end users in an easy-to-use form; and 3) in the future, it will unify the delayed-mode data from platform-focused data assembly centers into a standards-based distributed system that is readily accessible to interested users from the science and education communities. In this presentation, we will be focusing on the efforts of the OSMC to provide interoperable access to the near real-time data stream that is available via the Global Telecommunications System (GTS). This is a very rich data source, and includes data from nearly all of the oceanographic platforms that are actively observing. We will discuss how the data are being served out using a number of widely used 'web services' (including OPeNDAP and SOS) and downloadable file formats (KML, csv, xls, netCDF), so that they can be accessed in web browsers and popular desktop analysis tools. We will also be discussing our use of the Environmental Research Division's Data Access Program (ERDDAP), available from NOAA/NMFS, which has allowed us to achieve our goal of serving the near real-time data. From an interoperability perspective, it's important to note that access to this stream of data is not just for humans, but also for machine-to-machine requests. We'll also delve into how we configured access to the near real-time ocean observations in accordance with the Climate and Forecast (CF) metadata conventions, describing the various 'feature types' associated with particular in situ observation types, or discrete sampling geometries (DSG). Wrapping up, we'll discuss some of the ways this data source is already being used.
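For the machine-to-machine access described above, a tabledap request to an ERDDAP server is one concrete path. The sketch below is a hedged example: the host, dataset ID, and variable names are placeholders, while the query layout follows ERDDAP's documented tabledap request syntax (dataset.csv?variables&constraints, with constraints percent-encoded).

```python
# Sketch of requesting near-real-time in situ observations from an ERDDAP
# tabledap service as CSV. Server URL, dataset ID and variable names are
# placeholders; only the general request pattern is intended to be accurate.
import csv
import io
import urllib.request

BASE = "https://example-erddap-server.gov/erddap/tabledap"   # placeholder host
DATASET = "exampleSurfaceObs"                                 # placeholder dataset ID

query = (
    f"{BASE}/{DATASET}.csv"
    "?time,latitude,longitude,sea_surface_temperature"
    "&time%3E=2013-10-01T00:00:00Z"          # 'time>=...' with '>' percent-encoded
)

with urllib.request.urlopen(query) as resp:   # network call; fails on the placeholder host
    text = resp.read().decode("utf-8")

reader = csv.reader(io.StringIO(text))
header = next(reader)   # first CSV row: column names
units = next(reader)    # second CSV row: units (ERDDAP .csv convention)
for row in list(reader)[:5]:
    print(dict(zip(header, row)))
```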
Viewing and Editing Earth Science Metadata MOBE: Metadata Object Browser and Editor in Java
NASA Astrophysics Data System (ADS)
Chase, A.; Helly, J.
2002-12-01
Metadata is an important, yet often neglected aspect of successful archival efforts. However, generating robust, useful metadata is often a time-consuming and tedious task. We have been approaching this problem from two directions: first, by automating metadata creation, pulling from known sources of data, and second, as detailed here, by developing friendly software for human interaction with the metadata. MOBE and COBE (Metadata Object Browser and Editor, and Canonical Object Browser and Editor, respectively) are Java applications for editing and viewing metadata and digital objects. MOBE has already been designed and deployed, and is currently being integrated into other areas of the SIOExplorer project. COBE is in the design and development stage, being created with the same considerations in mind as those for MOBE. Metadata creation, viewing, data object creation, and data object viewing, when taken on a small scale, are all relatively simple tasks. Computer science, however, has an infamous reputation for transforming the simple into the complex. As a system scales upwards to become more robust, new features arise and additional functionality is added to the software being written to manage the system. The software that emerges from such an evolution, though powerful, is often complex and difficult to use. With MOBE the focus is on a tool that does a small number of tasks very well. The result has been an application that enables users to manipulate metadata in an intuitive and effective way. This allows for a tool that serves its purpose without introducing additional cognitive load onto the user, an end goal we continue to pursue.
Managing biomedical image metadata for search and retrieval of similar images.
Korenblum, Daniel; Rubin, Daniel; Napel, Sandy; Rodriguez, Cesar; Beaulieu, Chris
2011-08-01
Radiology images are generally disconnected from the metadata describing their contents, such as imaging observations ("semantic" metadata), which are usually described in text reports that are not directly linked to the images. We developed a system, the Biomedical Image Metadata Manager (BIMM), to (1) address the problem of managing biomedical image metadata and (2) facilitate the retrieval of similar images using semantic feature metadata. Our approach allows radiologists, researchers, and students to take advantage of the vast and growing repositories of medical image data by explicitly linking images to their associated metadata in a relational database that is globally accessible through a Web application. BIMM receives input in the form of standards-based metadata files via a Web service, and parses and stores the metadata in a relational database, allowing efficient data query and maintenance capabilities. Upon querying BIMM for images, 2D regions of interest (ROIs) stored as metadata are automatically rendered onto preview images included in search results. The system's "match observations" function retrieves images with similar ROIs based on specific semantic features describing imaging observation characteristics (IOCs). We demonstrate that the system, using IOCs alone, can accurately retrieve images with diagnoses matching the query images, and we evaluate its performance on a set of annotated liver lesion images. BIMM has several potential applications, e.g., computer-aided detection and diagnosis, content-based image retrieval, automating medical analysis protocols, and gathering population statistics like disease prevalences. The system provides a framework for decision support systems, potentially improving their diagnostic accuracy and selection of appropriate therapies.
Misra, Dharitri; Chen, Siyuan; Thoma, George R
2009-01-01
One of the most expensive aspects of archiving digital documents is the manual acquisition of context-sensitive metadata useful for the subsequent discovery of, and access to, the archived items. For certain types of textual documents, such as journal articles, pamphlets, official government records, etc., where the metadata is contained within the body of the documents, a cost effective method is to identify and extract the metadata in an automated way, applying machine learning and string pattern search techniques. At the U.S. National Library of Medicine (NLM) we have developed an automated metadata extraction (AME) system that employs layout classification and recognition models with a metadata pattern search model for a text corpus with structured or semi-structured information. A combination of Support Vector Machine and Hidden Markov Model is used to create the layout recognition models from a training set of the corpus, following which a rule-based metadata search model is used to extract the embedded metadata by analyzing the string patterns within and surrounding each field in the recognized layouts. In this paper, we describe the design of our AME system, with focus on the metadata search model. We present the extraction results for a historic collection from the Food and Drug Administration, and outline how the system may be adapted for similar collections. Finally, we discuss some ongoing enhancements to our AME system.
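To make the rule-based pattern-search step concrete, here is a minimal sketch in the same spirit: regular expressions keyed to field labels are applied to the text of a recognized layout zone. The patterns and field names are illustrative assumptions, not the actual NLM AME rules.

```python
# Sketch of a rule-based metadata search over recognized text. The regular
# expressions and field names are illustrative assumptions only.
import re

FIELD_RULES = {
    # field name -> pattern applied to the text of a recognized layout zone
    "title":   re.compile(r"^Title[:\s]+(.+)$", re.IGNORECASE | re.MULTILINE),
    "authors": re.compile(r"^Authors?[:\s]+(.+)$", re.IGNORECASE | re.MULTILINE),
    "date":    re.compile(r"\b(?:19|20)\d{2}-\d{2}-\d{2}\b"),
}

def extract_metadata(zone_text):
    """Apply each field rule to the zone text and collect the first matches."""
    results = {}
    for field, pattern in FIELD_RULES.items():
        match = pattern.search(zone_text)
        if match:
            # Use the captured group when present, otherwise the whole match.
            value = match.group(1) if match.groups() else match.group(0)
            results[field] = value.strip()
    return results

if __name__ == "__main__":
    sample_zone = "Title: Example historic report\nAuthors: A. Smith; B. Jones\nIssued 1952-03-14"
    print(extract_metadata(sample_zone))
```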
The Metadata Cloud: The Last Piece of a Distributed Data System Model
NASA Astrophysics Data System (ADS)
King, T. A.; Cecconi, B.; Hughes, J. S.; Walker, R. J.; Roberts, D.; Thieman, J. R.; Joy, S. P.; Mafi, J. N.; Gangloff, M.
2012-12-01
Distributed data systems have existed ever since systems were networked together. Over the years the model for distributed data systems has evolved from basic file transfer to client-server to multi-tiered to grid and finally to cloud-based systems. Initially metadata was tightly coupled to the data, either by embedding the metadata in the same file containing the data or by co-locating the metadata in commonly named files. As the sources of data have multiplied, data volumes have increased and services have specialized to improve efficiency, and a cloud system model has emerged. In a cloud system, computing and storage are provided as services, with accessibility emphasized over physical location. Computation and data clouds are common implementations. Effectively using the data and computation capabilities requires metadata. When metadata is stored separately from the data, a metadata cloud is formed. With a metadata cloud, information and knowledge about data resources can migrate efficiently from system to system, enabling services and allowing the data to remain efficiently stored until used. This is especially important with "Big Data", where movement of the data is limited by bandwidth. We examine how the metadata cloud completes a general distributed data system model, how standards play a role, and relate this to the existing types of cloud computing. We also look at the major science data systems in existence and compare each to the generalized cloud system model.
ERIC Educational Resources Information Center
Mulrooney, Timothy J.
2009-01-01
A Geographic Information System (GIS) serves as the tangible and intangible means by which spatially related phenomena can be created, analyzed and rendered. GIS metadata serves as the formal framework to catalog information about a GIS data set. Metadata is independent of the encoded spatial and attribute information. GIS metadata is a subset of…
Integrating XQuery-Enabled SCORM XML Metadata Repositories into an RDF-Based E-Learning P2P Network
ERIC Educational Resources Information Center
Qu, Changtao; Nejdl, Wolfgang
2004-01-01
Edutella is an RDF-based E-Learning P2P network that is aimed to accommodate heterogeneous learning resource metadata repositories in a P2P manner and further facilitate the exchange of metadata between these repositories based on RDF. Whereas Edutella provides RDF metadata repositories with a quite natural integration approach, XML metadata…
Raising orphans from a metadata morass: A researcher's guide to re-use of public 'omics data.
Bhandary, Priyanka; Seetharam, Arun S; Arendsee, Zebulun W; Hur, Manhoi; Wurtele, Eve Syrkin
2018-02-01
More than 15 petabases of raw RNAseq data are now accessible through public repositories. Acquisition of other 'omics data types is expanding, though most lack a centralized archival repository. Data reuse provides a tremendous opportunity to extract new knowledge from existing experiments, and offers a unique opportunity for robust, multi-'omics analyses by merging metadata (information about experimental design, biological samples, protocols) and data from multiple experiments. We illustrate how predictive research can be accelerated by meta-analysis with a study of orphan (species-specific) genes. Computational predictions are critical for inferring orphan gene function because their coding sequences provide very few clues. The metadata in public databases is often confusing; a test case with Zea mays mRNA-seq data reveals a high proportion of missing, misleading or incomplete metadata. This metadata morass significantly diminishes the insight that can be extracted from these data. We provide tips for data submitters and users, including specific recommendations to improve metadata quality through greater use of controlled vocabularies and through metadata reviews. Finally, we advocate for a unified, straightforward metadata submission and retrieval system.
Science friction: data, metadata, and collaboration.
Edwards, Paul N; Mayernik, Matthew S; Batcheller, Archer L; Bowker, Geoffrey C; Borgman, Christine L
2011-10-01
When scientists from two or more disciplines work together on related problems, they often face what we call 'science friction'. As science becomes more data-driven, collaborative, and interdisciplinary, demand increases for interoperability among data, tools, and services. Metadata--usually viewed simply as 'data about data', describing objects such as books, journal articles, or datasets--serve key roles in interoperability. Yet we find that metadata may be a source of friction between scientific collaborators, impeding data sharing. We propose an alternative view of metadata, focusing on its role in an ephemeral process of scientific communication, rather than as an enduring outcome or product. We report examples of highly useful, yet ad hoc, incomplete, loosely structured, and mutable, descriptions of data found in our ethnographic studies of several large projects in the environmental sciences. Based on this evidence, we argue that while metadata products can be powerful resources, usually they must be supplemented with metadata processes. Metadata-as-process suggests the very large role of the ad hoc, the incomplete, and the unfinished in everyday scientific work.
Recipes for Semantic Web Dog Food — The ESWC and ISWC Metadata Projects
NASA Astrophysics Data System (ADS)
Möller, Knud; Heath, Tom; Handschuh, Siegfried; Domingue, John
Semantic Web conferences such as ESWC and ISWC offer prime opportunities to test and showcase semantic technologies. Conference metadata about people, papers and talks is diverse in nature and neither too small to be uninteresting nor too big to be unmanageable. Many metadata-related challenges that may arise in the Semantic Web at large are also present here. Metadata must be generated from sources which are often unstructured and hard to process, and may originate from many different players; therefore, suitable workflows must be established. Moreover, the generated metadata must use appropriate formats and vocabularies, and be served in a way that is consistent with the principles of linked data. This paper reports on the metadata efforts from ESWC and ISWC, identifies specific issues and barriers encountered during the projects, and discusses how these were approached. Recommendations are made as to how these may be addressed in the future, and we discuss how these solutions may generalize to metadata production for the Semantic Web at large.
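A small sketch of the kind of output such an effort produces: conference metadata expressed as RDF with shared vocabularies (Dublin Core terms and FOAF) using rdflib. The URIs are invented placeholders, not actual Semantic Web Dog Food identifiers.

```python
# Sketch of producing conference metadata as RDF with rdflib, reusing widely
# shared vocabularies. All URIs below are invented placeholders.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import DCTERMS, FOAF, RDF

EX = Namespace("http://example.org/conference/2007/")   # placeholder namespace

g = Graph()
g.bind("foaf", FOAF)
g.bind("dcterms", DCTERMS)

paper = EX["paper/42"]
author = EX["person/jane-doe"]

g.add((paper, RDF.type, DCTERMS.BibliographicResource))
g.add((paper, DCTERMS.title, Literal("An Example Paper on Conference Metadata")))
g.add((paper, DCTERMS.creator, author))
g.add((author, RDF.type, FOAF.Person))
g.add((author, FOAF.name, Literal("Jane Doe")))

print(g.serialize(format="turtle"))
```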
Metadata, or data about data, describe the content, quality, condition, and other characteristics of data. Geospatial metadata are critical to data discovery and serve as the fuel for the Geospatial One-Stop data portal.
Visualization of JPEG Metadata
NASA Astrophysics Data System (ADS)
Malik Mohamad, Kamaruddin; Deris, Mustafa Mat
There is much more information embedded in a JPEG image than just the graphics. Visualization of its metadata would benefit digital forensic investigators by letting them view embedded data, including in corrupted images where no graphics can be displayed, in order to assist in evidence collection for cases such as child pornography or steganography. Tools such as metadata readers, editors and extraction tools are already available, but they mostly focus on visualizing the attribute information of JPEG Exif. However, none visualizes metadata by consolidating the marker summary, header structure, Huffman table and quantization table in a single program. In this paper, metadata visualization is done by developing a program that is able to summarize all existing markers, the header structure, the Huffman table and the quantization table in a JPEG file. The result shows that visualization of metadata helps in viewing the hidden information within a JPEG file more easily.
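To show what summarizing JPEG markers involves, here is a minimal sketch that walks the segment structure of a JPEG file (0xFF marker bytes followed by a code and, for most segments, a 2-byte big-endian length). Only a few common marker names are labelled, and the walk stops at the start of scan; it is a simplified illustration, not the program described in the abstract.

```python
# Sketch of summarizing the markers in a JPEG file by walking its segments.
# Stops at SOS (start of scan), after which entropy-coded data follows.
import struct
import sys

MARKER_NAMES = {
    0xD8: "SOI",  0xE0: "APP0", 0xE1: "APP1 (Exif)", 0xDB: "DQT",
    0xC0: "SOF0", 0xC4: "DHT",  0xDA: "SOS",         0xD9: "EOI",
}

def summarize_markers(path):
    with open(path, "rb") as f:
        data = f.read()
    pos = 0
    while pos + 1 < len(data):
        if data[pos] != 0xFF:
            break                      # not a marker; structure not understood
        code = data[pos + 1]
        name = MARKER_NAMES.get(code, f"0x{code:02X}")
        if code in (0xD8, 0xD9):       # SOI/EOI carry no length field
            print(f"{name} at offset {pos}")
            pos += 2
            continue
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        print(f"{name} at offset {pos}, segment length {length}")
        if code == 0xDA:               # entropy-coded image data follows SOS
            break
        pos += 2 + length

if __name__ == "__main__":
    summarize_markers(sys.argv[1])
```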
Use of a metadata documentation and search tool for large data volumes: The NGEE arctic example
DOE Office of Scientific and Technical Information (OSTI.GOV)
Devarakonda, Ranjeet; Hook, Leslie A; Killeffer, Terri S
The Online Metadata Editor (OME) is a web-based tool to help document scientific data in a well-structured, popular scientific metadata format. In this paper, we will discuss the newest tool that Oak Ridge National Laboratory (ORNL) has developed to generate, edit, and manage metadata and how it is helping data-intensive science centers and projects, such as the U.S. Department of Energy's Next Generation Ecosystem Experiments (NGEE) in the Arctic, to prepare metadata and make their big data produce big science and lead to new discoveries.
Creating preservation metadata from XML-metadata profiles
NASA Astrophysics Data System (ADS)
Ulbricht, Damian; Bertelmann, Roland; Gebauer, Petra; Hasler, Tim; Klump, Jens; Kirchner, Ingo; Peters-Kottig, Wolfgang; Mettig, Nora; Rusch, Beate
2014-05-01
Registration of dataset DOIs at DataCite makes research data citable and comes with the obligation to keep the data accessible in the future. In addition, many universities and research institutions measure data that are unique and not repeatable, like the data produced by an observational network, and they want to keep these data for future generations. In consequence, such data should be ingested into preservation systems that automatically care for file format changes. Open source preservation software that is developed along the definitions of the ISO OAIS reference model is available, but during ingest of data and metadata there are still problems to be solved. File format validation is difficult because format validators are not only remarkably slow; due to the variety of file formats, different validators also return conflicting identification profiles for identical data. These conflicts are hard to resolve. Preservation systems have a deficit in the support of custom metadata. Furthermore, data producers are sometimes not aware that quality metadata is a key issue for the re-use of data. In the project EWIG, a university institute and a research institute work together with Zuse-Institute Berlin, which acts as an infrastructure facility, to generate exemplary workflows for research data into OAIS-compliant archives, with emphasis on the geosciences. The Institute for Meteorology provides time series data from an urban monitoring network, whereas GFZ Potsdam delivers file-based data from research projects. To identify problems in existing preservation workflows, the technical work is complemented by interviews with data practitioners. Policies for handling data and metadata are developed. Furthermore, university teaching material is created to raise future scientists' awareness of research data management. As a testbed for ingest workflows the digital preservation system Archivematica [1] is used. During the ingest process, metadata is generated that is compliant with the Metadata Encoding and Transmission Standard (METS). To find datasets in future portals and to make use of these data in one's own scientific work, proper selection of discovery metadata and application metadata is very important. Some XML metadata profiles are not suitable for preservation, because version changes are very fast and make it nearly impossible to automate the migration. For other XML metadata profiles, schema definitions are changed after publication of the profile or the schema definitions become inaccessible, which might cause problems during validation of the metadata inside the preservation system [2]. Some metadata profiles are not used widely enough and might not even exist in the future. Eventually, discovery and application metadata have to be embedded into the mdWrap subtree of the METS XML. [1] http://www.archivematica.org [2] http://dx.doi.org/10.2218/ijdc.v7i1.215
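As a rough illustration of embedding discovery metadata into the mdWrap subtree of a METS document, the sketch below builds a minimal dmdSec/mdWrap/xmlData structure with the standard library. The embedded record and the OTHERMDTYPE value are placeholders, and a real ingest package would also carry the structural and administrative METS sections.

```python
# Sketch of wrapping a discovery-metadata record inside METS mdWrap/xmlData.
# The embedded record is a placeholder; in practice it would be the chosen
# XML metadata profile (e.g. a DataCite or ISO record).
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)

mets = ET.Element(f"{{{METS_NS}}}mets")
dmd_sec = ET.SubElement(mets, f"{{{METS_NS}}}dmdSec", attrib={"ID": "dmd-001"})
md_wrap = ET.SubElement(
    dmd_sec,
    f"{{{METS_NS}}}mdWrap",
    attrib={"MDTYPE": "OTHER", "OTHERMDTYPE": "exampleDiscoveryProfile"},
)
xml_data = ET.SubElement(md_wrap, f"{{{METS_NS}}}xmlData")

record = ET.SubElement(xml_data, "record")
ET.SubElement(record, "title").text = "Example urban monitoring time series"
ET.SubElement(record, "creator").text = "Example Institute"

print(ET.tostring(mets, encoding="unicode"))
```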
NASA Astrophysics Data System (ADS)
Oggioni, Alessandro; Tagliolato, Paolo; Fugazza, Cristiano; Bastianini, Mauro; Pavesi, Fabio; Pepe, Monica; Menegon, Stefano; Basoni, Anna; Carrara, Paola
2015-04-01
Sensor observation systems for environmental data have become increasingly important in recent years. The EGU's Informatics in Oceanography and Ocean Science track stressed the importance of management tools and solutions for marine infrastructures. We think that full interoperability among sensor systems is still an open issue and that the solution to this involves providing appropriate metadata. Several open source applications implement the SWE specification and, particularly, the Sensor Observation Service (SOS) standard. These applications allow for the exchange of data and metadata in XML format between computer systems. However, there is a lack of metadata editing tools supporting end users in this activity. Generally speaking, it is hard for users to provide sensor metadata in the SensorML format without dedicated tools. In particular, such a tool should ease metadata editing by providing, for standard sensors, all the invariant information to be included in sensor metadata, thus allowing the user to concentrate on the metadata items that are related to the specific deployment. RITMARE, the Italian flagship project on marine research, envisages a subproject, SP7, for the set-up of the project's spatial data infrastructure. SP7 developed EDI, a general-purpose, template-driven metadata editor that is composed of a backend web service and an HTML5/javascript client. EDI can be customized for managing the creation of generic metadata encoded as XML. Once tailored to a specific metadata format, EDI presents users with a web form with advanced auto-completion and validation capabilities. In the case of sensor metadata (SensorML versions 1.0.1 and 2.0), the EDI client is instructed to send an "insert sensor" request to an SOS endpoint in order to save the metadata in an SOS server. In the first phase of project RITMARE, EDI has been used to simplify the creation from scratch of SensorML metadata by the involved researchers and data managers. An interesting by-product of this ongoing work is an emerging archive of predefined sensor descriptions. This information is being collected in order to further ease metadata creation in the next phase of the project. Users will be able to choose among a number of sensor and sensor platform prototypes: these will be specific instances on which it will be possible to define, in a bottom-up approach, "sensor profiles". We report on the outcome of this activity.
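For orientation only, here is a hedged sketch of posting a SensorML description to a Sensor Observation Service via an InsertSensor transaction, which is the kind of request the EDI client issues after the form is filled in. The endpoint URL and the heavily truncated request body are placeholders; a real request needs a complete, schema-valid SWES 2.0 / SensorML 2.0 payload and whatever authentication the server requires.

```python
# Sketch of sending an InsertSensor transaction to an SOS endpoint.
# The endpoint and the truncated XML body are placeholders only.
import urllib.request

SOS_ENDPOINT = "https://example-observatory.org/sos"   # placeholder endpoint

insert_sensor_xml = """<?xml version="1.0" encoding="UTF-8"?>
<swes:InsertSensor xmlns:swes="http://www.opengis.net/swes/2.0"
                   service="SOS" version="2.0.0">
  <swes:procedureDescriptionFormat>http://www.opengis.net/sensorml/2.0</swes:procedureDescriptionFormat>
  <swes:procedureDescription>
    <!-- SensorML 2.0 description produced by the metadata editor goes here -->
  </swes:procedureDescription>
  <!-- observable properties and insertion metadata omitted in this sketch -->
</swes:InsertSensor>"""

request = urllib.request.Request(
    SOS_ENDPOINT,
    data=insert_sensor_xml.encode("utf-8"),
    headers={"Content-Type": "application/xml"},
    method="POST",
)
with urllib.request.urlopen(request) as response:   # fails on the placeholder URL
    print(response.status, response.read()[:200])
```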
WOVOdat as a worldwide resource to improve eruption forecasts
NASA Astrophysics Data System (ADS)
Widiwijayanti, Christina; Costa, Fidel; Zar Win Nang, Thin; Tan, Karine; Newhall, Chris; Ratdomopurbo, Antonius
2015-04-01
During periods of volcanic unrest, volcanologists need to interpret signs of unrest to be able to forecast whether an eruption is likely to occur. Some volcanic eruptions display signs of impending eruption such as seismic activity, surface deformation, or gas emissions; but not all volcanoes give signs, and not all signs are necessarily followed by an eruption. Volcanoes behave differently. Precursory signs of an eruption are sometimes very short, less than an hour, but can also span weeks, months, or even years. Some volcanoes are regularly active and closely monitored, while others aren't. Often, the record of precursors to historical eruptions of a volcano isn't enough to allow a forecast of its future activity. Therefore, volcanologists must refer to monitoring data of unrest and eruptions at similar volcanoes. WOVOdat is the World Organization of Volcano Observatories' Database of volcanic unrest - an international effort to develop common standards for compiling and storing data on volcanic unrest in a centralized database that is freely web-accessible for reference during volcanic crises, comparative studies, and basic research on pre-eruption processes. WOVOdat will be to volcanology as an epidemiological database is to medicine. We have up to now incorporated about 15% of worldwide unrest data into WOVOdat, covering more than 100 eruption episodes, which includes: volcanic background data, eruptive histories, monitoring data (seismic, deformation, gas, hydrology, thermal, fields, and meteorology), monitoring metadata, and supporting data such as reports, images, maps and videos. Nearly all data in WOVOdat are time-stamped and geo-referenced. Along with creating a database on volcanic unrest, WOVOdat is also developing web tools to help users query, visualize, and compare data, which can further be used for probabilistic eruption forecasting. Reference to WOVOdat will be especially helpful at volcanoes that have not erupted in historical or 'instrumental' time and thus for which no previous data exist. The more data in WOVOdat, the more useful it will be. We actively solicit relevant data contributions from volcano observatories, other institutions, and individual researchers. Detailed information and documentation about the database and how to use it can be found at www.wovodat.org.
Publishing NASA Metadata as Linked Open Data for Semantic Mashups
NASA Astrophysics Data System (ADS)
Wilson, Brian; Manipon, Gerald; Hua, Hook
2014-05-01
Data providers are now publishing more metadata in more interoperable forms, e.g. Atom or RSS 'casts', as Linked Open Data (LOD), or as ISO Metadata records. A major effort on the part of NASA's Earth Science Data and Information System (ESDIS) project is the aggregation of metadata that enables greater data interoperability among scientific data sets regardless of source or application. Both the Earth Observing System (EOS) ClearingHOuse (ECHO) and the Global Change Master Directory (GCMD) repositories contain metadata records for NASA (and other) datasets and provided services. These records contain typical fields for each dataset (or software service) such as the source, creation date, cognizant institution, related access URLs, and domain and variable keywords to enable discovery. Under a NASA ACCESS grant, we demonstrated how to publish the ECHO and GCMD dataset and services metadata as LOD in the RDF format. Both sets of metadata are now queryable at SPARQL endpoints and available for integration into "semantic mashups" in the browser. It is straightforward to reformat sets of XML metadata, including ISO, into simple RDF and then later refine and improve the RDF predicates by reusing known namespaces such as Dublin Core, GeoRSS, etc. All scientific metadata should be part of the LOD world. In addition, we developed an "instant" drill-down and browse interface that provides faceted navigation so that the user can discover and explore the 25,000 datasets and 3000 services. The available facets and the free-text search box appear in the left panel, and the instantly updated results for the dataset search appear in the right panel. The user can constrain the value of a metadata facet simply by clicking on a word (or phrase) in the "word cloud" of values for each facet. The display section for each dataset includes the important metadata fields, a full description of the dataset, potentially some related URLs, and a "search" button that points to an OpenSearch GUI that is pre-configured to search for granules within the dataset. We will present our experiences with converting NASA metadata into LOD, discuss the challenges, illustrate some of the enabled mashups, and demonstrate the latest version of the "instant browse" interface for navigating multiple metadata collections.
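A hedged sketch of the consumer side of such an effort: querying a SPARQL endpoint for dataset titles with the SPARQLWrapper package. The endpoint URL is a placeholder, and the use of Dublin Core titles is only one plausible predicate choice; the original endpoints and vocabularies are not reproduced here.

```python
# Sketch of querying dataset metadata published as Linked Open Data through a
# SPARQL endpoint. Requires the SPARQLWrapper package; endpoint is a placeholder.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://example.nasa.gov/sparql"   # placeholder endpoint URL

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery("""
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    SELECT ?dataset ?title
    WHERE { ?dataset dc:title ?title . }
    LIMIT 10
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["dataset"]["value"], "-", binding["title"]["value"])
```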
The PDS4 Metadata Management System
NASA Astrophysics Data System (ADS)
Raugh, A. C.; Hughes, J. S.
2018-04-01
We present the key features of the Planetary Data System (PDS) PDS4 Information Model as an extendable metadata management system for planetary metadata related to data structure, analysis/interpretation, and provenance.
NASA Astrophysics Data System (ADS)
Zaslavsky, I.; Richard, S. M.; Valentine, D. W., Jr.; Grethe, J. S.; Hsu, L.; Malik, T.; Bermudez, L. E.; Gupta, A.; Lehnert, K. A.; Whitenack, T.; Ozyurt, I. B.; Condit, C.; Calderon, R.; Musil, L.
2014-12-01
EarthCube is envisioned as a cyberinfrastructure that fosters new, transformational geoscience by enabling sharing, understanding and scientifically sound and efficient re-use of formerly unconnected data resources, software, models, repositories, and computational power. Its purpose is to enable science enterprise and workforce development via an extensible and adaptable collaboration and resource integration framework. A key component of this vision is the development of comprehensive inventories supporting resource discovery and re-use across geoscience domains. The goal of the EarthCube CINERGI (Community Inventory of EarthCube Resources for Geoscience Interoperability) project is to create a methodology and assemble a large inventory of high-quality information resources with standard metadata descriptions and traceable provenance. The inventory is compiled from metadata catalogs maintained by geoscience data facilities, as well as from user contributions. The latter mechanism relies on community resource viewers: online applications that support update and curation of metadata records. Once harvested into CINERGI, metadata records from domain catalogs and community resource viewers are loaded into a staging database implemented in MongoDB and validated for compliance with the ISO 19139 metadata schema. Several types of metadata defects detected by the validation engine are automatically corrected with the help of several information extractors, or flagged for manual curation. The metadata harvesting, validation and processing components generate provenance statements using the W3C PROV notation, which are stored in a Neo4j database. The curated metadata, along with the provenance information, are re-published and can be accessed programmatically and via a CINERGI online application. This presentation focuses on the role of resource inventories in a scalable and adaptable information infrastructure, and on the CINERGI metadata pipeline and its implementation challenges. Key project components are described at the project's website (http://workspace.earthcube.org/cinergi), which also provides access to the initial resource inventory, the inventory metadata model, metadata entry forms and a collection of the community resource viewers.
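To make the provenance step tangible, here is a hedged sketch of emitting a W3C PROV statement for a harvested-and-curated metadata record with the `prov` package. The namespace and identifiers are placeholders, and CINERGI itself stored such statements in Neo4j rather than serializing them as shown here.

```python
# Sketch of recording provenance for a curated metadata record as W3C PROV,
# using the `prov` package. Identifiers are placeholders only.
from prov.model import ProvDocument

doc = ProvDocument()
doc.add_namespace("ex", "http://example.org/cinergi/")

harvested = doc.entity("ex:record-123-harvested")      # record as harvested
curated = doc.entity("ex:record-123-curated")          # record after curation
curation = doc.activity("ex:metadata-curation-run-7")  # one pipeline run
curator = doc.agent("ex:keyword-extractor")            # automated extractor

doc.wasGeneratedBy(curated, curation)
doc.used(curation, harvested)
doc.wasDerivedFrom(curated, harvested)
doc.wasAssociatedWith(curation, curator)

print(doc.get_provn())    # human-readable PROV-N serialization
```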
Brasz, Joost J.; Jonsson, Ulf J.
2006-09-05
A method of operating an organic Rankine cycle system wherein a liquid refrigerant is circulated to an evaporator where heat is introduced to the refrigerant to convert it to vapor. The vapor is then passed through a turbine, with the resulting cooled vapor then passing through a condenser for condensing the vapor to a liquid. The refrigerant is one of CF3CF2C(O)CF(CF3)2, (CF3)2CFC(O)CF(CF3)2, CF3(CF2)2C(O)CF(CF3)2, CF3(CF2)3C(O)CF(CF3)2, CF3(CF2)5C(O)CF3, CF3CF2C(O)CF2CF2CF3, and CF3C(O)CF(CF3)2.
Metadata improvements driving new tools and services at a NASA data center
NASA Astrophysics Data System (ADS)
Moroni, D. F.; Hausman, J.; Foti, G.; Armstrong, E. M.
2011-12-01
The NASA Physical Oceanography DAAC (PO.DAAC) is responsible for distributing and maintaining satellite-derived oceanographic data from a number of NASA and non-NASA missions for the physical disciplines of ocean winds, sea surface temperature, ocean topography and gravity. Currently its holdings consist of over 600 datasets with a data archive in excess of 200 terabytes. The PO.DAAC has recently embarked on a metadata quality and completeness project to migrate, update and improve metadata records for over 300 public datasets. An interactive database management tool has been developed to allow data scientists to enter, update and maintain metadata records. This tool communicates directly with PO.DAAC's Data Management and Archiving System (DMAS), which serves as the new archival and distribution backbone as well as a permanent repository of dataset- and granule-level metadata. Although we will briefly discuss the tool, the more important ramifications are the ability to now expose, propagate and leverage the metadata in a number of ways. First, the metadata are exposed through a faceted and free-text search interface directly from Drupal-based PO.DAAC web pages, allowing for quick browsing and data discovery, especially by "drilling" through the various facet levels that organize datasets by time/space resolution, processing level, sensor, measurement type etc. Furthermore, the metadata can now be exposed through web services to produce metadata records in a number of different formats such as FGDC and ISO 19115, or potentially propagated to visualization and subsetting tools, and other discovery interfaces. The fundamental concept is that the metadata form the essential bridge between the user and the tool or discovery mechanism for a broad range of ocean earth science data records.
NASA Astrophysics Data System (ADS)
Argent, R.; Sheahan, P.; Plummer, N.
2010-12-01
Under the Commonwealth Water Act 2007 the Bureau of Meteorology was given a new national role in water information, encompassing standards, water accounts and assessments, hydrological forecasting, and collecting, enhancing and making freely available Australia’s water information. The Australian Water Resources Information System (AWRIS) is being developed to fulfil part of this role, by providing foundational data, information and model structures and services. Over 250 organisations across Australia are required to provide water data and metadata to the Bureau, including federal, state and local governments, water storage management and hydroelectricity companies, rural and urban water utilities, and catchment management bodies. The data coverage includes the categories needed to assess and account for water resources at a range of scales. These categories are surface, groundwater and meteorological observations, water in storages, water restrictions, urban and irrigation water use and flows, information on rights, allocations and trades, and a limited suite of water quality parameters. These data are currently supplied to the Bureau via a file-based delivery system at various frequencies from annual to daily or finer, and contain observations taken at periods from minutes to monthly or coarser. One of the primary keys to better data access and utilisation is better data organisation, including content and markup standards. As a significant step on the path to standards for water data description, the Bureau has developed a Water Data Transfer Format (WDTF) for transmission of a variety of water data categories, including site metadata. WDTF is adapted from the OGC’s observation and sampling-features standard. The WDTF XML schema is compatible with the OGC's Web Feature Service (WFS) interchange standard, and conforms to GML Simple Features profile (GML-SF) level 1, emphasising the importance of standards in data exchange. In the longer term we are also working with the OGC’s Hydrology Domain Working Group on the development of WaterML 2, which will provide an international standard applicable to a sub-set of the information handled by WDTF. Making water data accessible for multiple uses, such as for predictive models and external products, has required the development of consistent data models for describing the relationships between the various data elements. Early development of the AWRIS data model has utilised a model-driven architecture approach, the benefits of which are likely to accrue in the long term, as more products and services are developed from the common core. Moving on from our initial focus on data organisation and management, the Bureau is in the early stages of developing an integrated modelling suite (the Bureau Hydrological Modelling System - BHMS) which will encompass the variety of hydrological modelling needs of the Bureau, ranging from water balances, assessments and accounts, to streamflow and hydrological forecasting over scales from hours and days to years and decades. It is envisaged that this modelling suite will also be developed, as far as possible, using standardised, discoverable services to enhance data-model and model-model integration.
Cyberinfrastructure for Atmospheric Discovery
NASA Astrophysics Data System (ADS)
Wilhelmson, R.; Moore, C. W.
2004-12-01
Each year across the United States, floods, tornadoes, hail, strong winds, lightning, hurricanes, and winter storms cause hundreds of deaths, routinely disrupt transportation and commerce, and result in billions of dollars in annual economic losses. MEAD and LEAD are two recent efforts aimed at developing the cyberinfrastructure for studying and forecasting these events through collection, integration, and analysis of observational data coupled with numerical simulation, data mining, and visualization. MEAD (Modeling Environment for Atmospheric Discovery) has been funded for two years as an NCSA (National Center for Supercomputing Applications) Alliance Expedition. The goal of this expedition has been the development and adaptation of cyberinfrastructure that will enable research simulations, data mining, machine learning and visualization of hurricanes and storms utilizing high-performance computing environments including the TeraGrid. Portal, grid and web infrastructure are being tested that will enable launching of hundreds of individual WRF (Weather Research and Forecasting) simulations. In a similar way, multiple Regional Ocean Modeling System (ROMS) or WRF/ROMS simulations can be carried out. Metadata and the resulting large volumes of data will then be made available for further study and for educational purposes using analysis, mining, and visualization services. Initial coupling of the ROMS and WRF codes has been completed and parallel I/O is being implemented for these models. Management of these activities (services) is being enabled through Grid workflow technologies (e.g. OGCE). LEAD (Linked Environments for Atmospheric Discovery) is a recently funded 5-year, large NSF ITR grant that involves 9 institutions who are developing a comprehensive national cyberinfrastructure in mesoscale meteorology, particularly one that can interoperate with others being developed. LEAD is addressing the fundamental information technology (IT) research challenges needed to create an integrated, scalable framework for identifying, accessing, preparing, assimilating, predicting, managing, analyzing, mining, and visualizing a broad array of meteorological data and model output, independent of format and physical location. A transforming element of LEAD is Workflow Orchestration for On-Demand, Real-Time, Dynamically-Adaptive Systems (WOORDS), which allows the use of analysis tools, forecast models, and data repositories as dynamically adaptive, on-demand, Grid-enabled systems that can a) change configuration rapidly and automatically in response to weather; b) continually be steered by new data; c) respond to decision-driven inputs from users; d) initiate other processes automatically; and e) steer remote observing technologies to optimize data collection for the problem at hand. Although LEAD efforts are primarily directed at mesoscale meteorology, the IT services being developed have general applicability to other geoscience and environmental sciences. Integration of traditional and new data sources is a crucial component in LEAD for data analysis and assimilation, for integration (ensemble mining) of data from sets of simulations, and for comparing results to observational data. As part of the integration effort, LEAD is creating a myLEAD metadata catalog service: a personal metacatalog that extends the Globus MCS system and is built on top of the OGSA-DAI system developed at the National e-Science Centre in Edinburgh, Scotland.
New measurements of the thermophysical properties of CF3OCF2CF2CF3 and c-CF2CF2CF2CF2O are reported from T ≈ 235 K to the critical region. Liquid-phase volumetric results for CF3OCF2OCF3 and CF3OCF2CF2H (235 < T/K < 303) are reported to supplement the information already availab...
EPA Metadata Style Guide Keywords and EPA Organization Names
The keywords and EPA organization names listed below, along with EPA's Metadata Style Guide, are intended to provide suggestions and guidance to assist with the standardization of metadata records.
Interpreting the ASTM 'content standard for digital geospatial metadata'
Nebert, Douglas D.
1996-01-01
ASTM and the Federal Geographic Data Committee have developed a content standard for spatial metadata to facilitate documentation, discovery, and retrieval of digital spatial data using vendor-independent terminology. Spatial metadata elements are identifiable quality and content characteristics of a data set that can be tied to a geographic location or area. Several Office of Management and Budget Circulars and initiatives have been issued that specify improved cataloguing of and accessibility to federal data holdings. An Executive Order further requires the use of the metadata content standard to document digital spatial data sets. Collection and reporting of spatial metadata for field investigations performed for the federal government is an anticipated requirement. This paper provides an overview of the draft spatial metadata content standard and a description of how the standard could be applied to investigations collecting spatially-referenced field data.
Misra, Dharitri; Chen, Siyuan; Thoma, George R.
2010-01-01
One of the most expensive aspects of archiving digital documents is the manual acquisition of context-sensitive metadata useful for the subsequent discovery of, and access to, the archived items. For certain types of textual documents, such as journal articles, pamphlets, official government records, etc., where the metadata is contained within the body of the documents, a cost effective method is to identify and extract the metadata in an automated way, applying machine learning and string pattern search techniques. At the U. S. National Library of Medicine (NLM) we have developed an automated metadata extraction (AME) system that employs layout classification and recognition models with a metadata pattern search model for a text corpus with structured or semi-structured information. A combination of Support Vector Machine and Hidden Markov Model is used to create the layout recognition models from a training set of the corpus, following which a rule-based metadata search model is used to extract the embedded metadata by analyzing the string patterns within and surrounding each field in the recognized layouts. In this paper, we describe the design of our AME system, with focus on the metadata search model. We present the extraction results for a historic collection from the Food and Drug Administration, and outline how the system may be adapted for similar collections. Finally, we discuss some ongoing enhancements to our AME system. PMID:21179386
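To make the two-stage approach above concrete, the sketch below shows only the second, rule-based stage: applying string patterns to text from an already-recognized layout zone. The field names, regular expressions, and sample text are invented for illustration and are not the patterns used by the NLM AME system.

```python
import re

# Illustrative field patterns; the real AME rules are richer and are applied
# within zones identified by the SVM/HMM layout models.
FIELD_PATTERNS = {
    "report_number": re.compile(r"Report\s+No\.?\s*[:\-]?\s*(\S+)", re.IGNORECASE),
    "date": re.compile(r"\b(\d{1,2}\s+\w+\s+\d{4}|\w+\s+\d{1,2},\s+\d{4})\b"),
    "doi": re.compile(r"\b10\.\d{4,9}/\S+\b"),
}

def extract_metadata(zone_text: str) -> dict:
    """Apply rule-based string-pattern search to text from a recognized layout zone."""
    found = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(zone_text)
        if match:
            found[field] = match.group(1) if match.groups() else match.group(0)
    return found

print(extract_metadata("FDA Memorandum, Report No: 61-42, issued March 3, 1961"))
```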
The Importance of Metadata in System Development and IKM
2003-02-01
Defence R&D Canada – Atlantic, Technical Memorandum DRDC Atlantic TM 2003-011, Anthony W. Isenor, February 2003. ... it is important for searches and for providing relevant information to the client. A comparison of metadata standards was conducted with emphasis on
NASA Astrophysics Data System (ADS)
Do, Hong Xuan; Gudmundsson, Lukas; Leonard, Michael; Westra, Seth
2018-04-01
This is the first part of a two-paper series presenting the Global Streamflow Indices and Metadata archive (GSIM), a worldwide collection of metadata and indices derived from more than 35 000 daily streamflow time series. This paper focuses on the compilation of the daily streamflow time series based on 12 free-to-access streamflow databases (seven national databases and five international collections). It also describes the development of three metadata products (freely available at https://doi.pangaea.de/10.1594/PANGAEA.887477): (1) a GSIM catalogue collating basic metadata associated with each time series, (2) catchment boundaries for the contributing area of each gauge, and (3) catchment metadata extracted from 12 gridded global data products representing essential properties such as land cover type, soil type, and climate and topographic characteristics. The quality of the delineated catchment boundary is also made available and should be consulted in GSIM application. The second paper in the series then explores production and analysis of streamflow indices. Having collated an unprecedented number of stations and associated metadata, GSIM can be used to advance large-scale hydrological research and improve understanding of the global water cycle.
Design and Implementation of a Metadata-rich File System
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ames, S; Gokhale, M B; Maltzahn, C
2010-01-19
Despite continual improvements in the performance and reliability of large scale file systems, the management of user-defined file system metadata has changed little in the past decade. The mismatch between the size and complexity of large scale data stores and their ability to organize and query their metadata has led to a de facto standard in which raw data is stored in traditional file systems, while related, application-specific metadata is stored in relational databases. This separation of data and semantic metadata requires considerable effort to maintain consistency and can result in complex, slow, and inflexible system operation. To address these problems, we have developed the Quasar File System (QFS), a metadata-rich file system in which files, user-defined attributes, and file relationships are all first class objects. In contrast to hierarchical file systems and relational databases, QFS defines a graph data model composed of files and their relationships. QFS incorporates Quasar, an XPATH-extended query language for searching the file system. Results from our QFS prototype show the effectiveness of this approach. Compared to the de facto standard, the QFS prototype shows superior ingest performance and comparable query performance on user metadata-intensive operations and superior performance on normal file metadata operations.
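As a rough illustration of the graph data model described above, files, user-defined attributes, and relationships can all be treated as first-class objects and queried together. This is a toy sketch using networkx, not the actual QFS implementation or the Quasar query language; the file names and attributes are invented.

```python
import networkx as nx

# Files and their relationships as a directed multigraph; attributes live on nodes and edges.
g = nx.MultiDiGraph()
g.add_node("raw/scan_001.dat", kind="file", instrument="CT-7", quality="good")
g.add_node("derived/scan_001_seg.dat", kind="file", algorithm="threshold-v2")
g.add_edge("derived/scan_001_seg.dat", "raw/scan_001.dat", rel="derived_from")

# A query in the spirit of an XPath-like traversal: files derived from 'good' quality scans.
hits = [
    src
    for src, dst, data in g.edges(data=True)
    if data.get("rel") == "derived_from" and g.nodes[dst].get("quality") == "good"
]
print(hits)  # ['derived/scan_001_seg.dat']
```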
Achieving Sub-Second Search in the CMR
NASA Astrophysics Data System (ADS)
Gilman, J.; Baynes, K.; Pilone, D.; Mitchell, A. E.; Murphy, K. J.
2014-12-01
The Common Metadata Repository (CMR) is the next-generation Earth science metadata catalog for NASA's Earth observing data. It joins together the holdings from the EOS Clearing House (ECHO) and the Global Change Master Directory (GCMD), creating a unified, authoritative source for EOSDIS metadata. The CMR allows ingest in many different formats while providing consistent search behavior and retrieval in any supported format. Performance is a critical component of the CMR, ensuring improved data discovery and client interactivity. The CMR delivers sub-second search performance for any of the common query conditions (including spatial) across hundreds of millions of metadata granules. It also allows the addition of new metadata concepts such as visualizations, parameter metadata, and documentation. The CMR's goals presented many challenges. This talk will describe the CMR architecture, design, and innovations that were made to achieve its goals, including: architectural features like immutability and backpressure; data management techniques such as caching and parallel loading that give big performance gains; open-source and COTS tools like the Elasticsearch search engine; adoption of Clojure, a functional programming language for the Java Virtual Machine; development of a custom spatial search plugin for Elasticsearch and why it was necessary; and introduction of a unified model for metadata that maps every supported metadata format to a consistent domain model.
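For readers unfamiliar with the kind of query an Elasticsearch-backed metadata index must answer, the sketch below combines a temporal filter with a geographic bounding box. The endpoint, index name, and field names are placeholders; CMR's real API and custom spatial plugin behave differently and are not shown here.

```python
import requests

# Hypothetical Elasticsearch endpoint and index; illustrates a combined
# temporal + bounding-box filter of the kind a granule search must answer quickly.
query = {
    "query": {
        "bool": {
            "filter": [
                {"range": {"start_date": {"gte": "2014-01-01"}}},
                {"geo_bounding_box": {
                    "location": {
                        "top_left": {"lat": 50.0, "lon": -130.0},
                        "bottom_right": {"lat": 30.0, "lon": -100.0},
                    }
                }},
            ]
        }
    },
    "size": 10,
}
resp = requests.post("http://localhost:9200/granules/_search", json=query, timeout=10)
print(resp.json()["hits"]["total"])
```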
Syntactic and Semantic Validation without a Metadata Management System
NASA Technical Reports Server (NTRS)
Pollack, Janine; Gokey, Christopher D.; Kendig, David; Olsen, Lola; Wharton, Stephen W. (Technical Monitor)
2001-01-01
The ability to maintain quality information is essential to securing confidence in any system for which the information serves as a data source. NASA's Global Change Master Directory (GCMD), an online Earth science data locator, holds over 9000 data set descriptions and is in a constant state of flux as metadata are created and updated on a daily basis. In such a system, maintaining the consistency and integrity of these metadata is crucial. The GCMD has developed a metadata management system utilizing XML, controlled vocabulary, and Java technologies to ensure the metadata not only adhere to valid syntax, but also exhibit proper semantics.
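A minimal sketch of the two layers of checking described above: syntactic validation against an XML schema and a semantic check against a controlled vocabulary. The schema file, element name, and keyword list are placeholders, not the GCMD's actual DIF schema or science keyword vocabulary.

```python
from lxml import etree

# Placeholder schema and controlled vocabulary; the real DIF schema and
# keyword lists are far larger.
schema = etree.XMLSchema(etree.parse("dif_record.xsd"))
VALID_TOPICS = {"ATMOSPHERE", "OCEANS", "LAND SURFACE"}

def validate_record(path: str) -> list[str]:
    """Return a list of problems: schema violations plus vocabulary misses."""
    problems = []
    doc = etree.parse(path)
    if not schema.validate(doc):
        problems.extend(str(e) for e in schema.error_log)
    for topic in doc.findall(".//Topic"):
        if topic.text not in VALID_TOPICS:
            problems.append(f"Unrecognized topic keyword: {topic.text!r}")
    return problems

print(validate_record("dataset_description.xml"))
```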
Evaluating non-relational storage technology for HEP metadata and meta-data catalog
NASA Astrophysics Data System (ADS)
Grigorieva, M. A.; Golosova, M. V.; Gubin, M. Y.; Klimentov, A. A.; Osipova, V. V.; Ryabinkin, E. A.
2016-10-01
Large-scale scientific experiments produce vast volumes of data. These data are stored, processed, and analyzed in a distributed computing environment. The life cycle of an experiment is managed by specialized software such as Distributed Data Management and Workload Management Systems. In order to be interpreted and mined, experimental data must be accompanied by auxiliary metadata, which are recorded at each data processing step. Metadata describe the scientific data and represent scientific objects or results of scientific experiments, allowing them to be shared by various applications, recorded in databases, or published via the Web. Processing and analysis of the constantly growing volume of auxiliary metadata is a challenging task, no simpler than the management and processing of the experimental data itself. Furthermore, metadata sources are often loosely coupled and may present end users with inconsistencies in combined information queries. To aggregate and synthesize a range of primary metadata sources, and enhance them with flexible, schema-less addition of aggregated data, we are developing the Data Knowledge Base architecture serving as the intelligence behind GUIs and APIs.
Survey data and metadata modelling using document-oriented NoSQL
NASA Astrophysics Data System (ADS)
Rahmatuti Maghfiroh, Lutfi; Gusti Bagus Baskara Nugraha, I.
2018-03-01
Survey data collected from year to year are subject to metadata change, yet they need to be stored in an integrated way so that statistical data can be obtained faster and more easily. A data warehouse (DW) can address this need, but the variables change in every period in ways that a traditional DW cannot accommodate through Slowly Changing Dimensions (SCD). Previous research handled variable change in a DW by managing metadata with a multiversion DW (MVDW) designed on the relational model. Other research has found that non-relational models in NoSQL databases offer faster read times than relational models. We therefore propose managing metadata change using NoSQL. This study proposes a DW model to manage change and algorithms to retrieve data whose metadata have changed. Evaluation of the proposed model and algorithms shows that a database with the proposed design can correctly retrieve data across metadata changes. The contribution of this paper is comprehensive analysis of data with metadata changes (especially survey data) in integrated storage.
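The document-oriented idea can be illustrated without a NoSQL server: each survey wave carries its own metadata, and retrieval follows an alias map across metadata changes. This is a toy sketch with invented field names, not the schema or algorithms proposed in the paper.

```python
import json

# Each survey wave stores its metadata (variable definitions) alongside the data
# as a self-describing document, so variable renames between waves can be
# resolved at query time. Field names here are invented for illustration.
documents = [
    {"wave": 2016, "metadata": {"inc": "monthly income (USD)"}, "records": [{"inc": 2100}]},
    {"wave": 2017, "metadata": {"income": "monthly income (USD)"}, "records": [{"income": 2300}]},
]

# A simple alias map reconciles metadata change across waves.
ALIASES = {"inc": "income"}

def retrieve(variable: str):
    """Return (wave, value) pairs for a variable across all waves, following aliases."""
    out = []
    for doc in documents:
        for rec in doc["records"]:
            for name, value in rec.items():
                if ALIASES.get(name, name) == variable:
                    out.append((doc["wave"], value))
    return out

print(retrieve("income"))               # [(2016, 2100), (2017, 2300)]
print(json.dumps(documents[0]["metadata"]))
```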
NASA Astrophysics Data System (ADS)
Riddick, Andrew; Hughes, Andrew; Harpham, Quillon; Royse, Katherine; Singh, Anubha
2014-05-01
There has been increasing interest from both academic and commercial organisations over recent years in developing hydrologic and other environmental models in response to some of the major challenges facing the environment, for example environmental change and its effects, and ensuring water resource security. This has resulted in a significant investment in modelling by many organisations, both in terms of financial resources and intellectual capital. To capitalise on the effort spent producing models, it is necessary for the models to be both discoverable and appropriately described; if this is not undertaken, the effort in producing the models will be wasted. However, whilst there are some recognised metadata standards relating to datasets, these may not completely address the needs of modellers, for example regarding input data. There also appears to be a lack of metadata schemes configured to encourage the discovery and re-use of the models themselves. The lack of an established standard for model metadata is considered to be a factor inhibiting the more widespread use of environmental models, particularly the use of linked model compositions which fuse together hydrologic models with models from other environmental disciplines. This poster presents the results of a Natural Environment Research Council (NERC) funded scoping study to understand the requirements of modellers and other end users for metadata about data and models. A user consultation exercise using an on-line questionnaire was undertaken to capture the views of a wide spectrum of stakeholders on how they are currently managing metadata for modelling. This provided strong confirmation of our original supposition that there is a lack of systems and facilities to capture metadata about models. A number of specific gaps in current provision for data and model metadata were also identified, including a need for a standard means to record detailed information about the modelling environment and the model code used, to assist the selection of models for linked compositions. Existing best practice, including the use of current metadata standards (e.g. ISO 19110, ISO 19115 and ISO 19119) and the metadata components of WaterML, was also evaluated. In addition to commonly used metadata attributes (e.g. spatial reference information), there was significant interest in recording a variety of additional metadata attributes, including more detailed information about temporal data and estimates of data accuracy and uncertainty within metadata. This poster describes the key results of the study, including a number of gaps in the provision of metadata for modelling, and outlines how these might be addressed. Overall the scoping study has highlighted significant interest in addressing this issue within the environmental modelling community. There is therefore an impetus for on-going research, and we are seeking to take this forward through collaboration with other interested organisations. Progress towards an internationally recognised model metadata standard is suggested.
Descriptive Metadata: Emerging Standards.
ERIC Educational Resources Information Center
Ahronheim, Judith R.
1998-01-01
Discusses metadata, digital resources, cross-disciplinary activity, and standards. Highlights include Standard Generalized Markup Language (SGML); Extensible Markup Language (XML); Dublin Core; Resource Description Framework (RDF); Text Encoding Initiative (TEI); Encoded Archival Description (EAD); art and cultural-heritage metadata initiatives;…
2008-06-01
provides a means for file owners to add metadata which can then be used by iTunes for cataloging and searching [4]. Metadata can be stored in different ... based and contain AAC data formats [3]. Specifically, Apple uses Protected AAC to encode copy-protected music titles purchased from the iTunes Music Store [4]. The files purchased from the iTunes Music Store include the following metadata: Name, Email address of purchaser, Year, Album.
A Solution to Metadata: Using XML Transformations to Automate Metadata
2010-06-01
developed their own metadata standards—Directory Interchange Format (DIF), Ecological Metadata Language (EML), and International Organization for ...mented all their data using the EML standard. However, when later attempting to publish to a data clearinghouse—such as the Geospatial One-Stop (GOS... construct calls to its transform(s) method by providing the type of the incoming content (e.g., eml), the type of the resulting content (e.g., fgdc) and
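The core mechanism described in this report, transforming one XML metadata standard into another with XSLT, can be sketched in a few lines. The stylesheet and file names below are hypothetical; writing the EML-to-FGDC stylesheet itself is the substantial part of such work.

```python
from lxml import etree

# Hypothetical stylesheet mapping EML elements to FGDC ones; applying it is a
# one-liner with lxml once the stylesheet exists.
transform = etree.XSLT(etree.parse("eml_to_fgdc.xsl"))

eml_doc = etree.parse("dataset_eml.xml")
fgdc_doc = transform(eml_doc)

with open("dataset_fgdc.xml", "wb") as out:
    out.write(etree.tostring(fgdc_doc, pretty_print=True, xml_declaration=True, encoding="UTF-8"))
```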
2007-05-17
metadata formats, metadata repositories, enterprise portals and federated search engines that make data visible, available, and usable to users ... develop an enterprise-wide data sharing plan, establishment of mission area governance processes for CIOs, DISA development of federated search specifications
FRAMES Metadata Reporting Templates for Ecohydrological Observations, version 1.1
DOE Office of Scientific and Technical Information (OSTI.GOV)
Christianson, Danielle; Varadharajan, Charuleka; Christoffersen, Brad
FRAMES is a set of Excel metadata files and package-level descriptive metadata designed to facilitate and improve capture of desired metadata for ecohydrological observations. The metadata are bundled with data files into a data package and submitted to a data repository (e.g. the NGEE Tropics Data Repository) via a web form. FRAMES standardizes reporting of diverse ecohydrological and biogeochemical data for synthesis across a range of spatiotemporal scales and incorporates many best data science practices. This version of FRAMES supports observations from primarily automated measurements collected by permanently located sensors, including sap flow (tree water use), leaf surface temperature, soil water content, dendrometry (stem diameter growth increment), and solar radiation. Version 1.1 extends the controlled vocabulary and incorporates functionality to facilitate programmatic use of data and FRAMES metadata (R code available at the NGEE Tropics Data Repository).
Assessing Public Metabolomics Metadata, Towards Improving Quality.
Ferreira, João D; Inácio, Bruno; Salek, Reza M; Couto, Francisco M
2017-12-13
Public resources need to be appropriately annotated with metadata in order to make them discoverable, reproducible and traceable, further enabling them to be interoperable or integrated with other datasets. While data-sharing policies exist to promote the annotation process by data owners, these guidelines are still largely ignored. In this manuscript, we analyse automatic measures of metadata quality, and suggest their application as a means to encourage data owners to increase the metadata quality of their resources and submissions, thereby contributing to higher quality data, improved data sharing, and the overall accountability of scientific publications. We analyse these metadata quality measures in the context of a real-world repository of metabolomics data (i.e. MetaboLights), including a manual validation of the measures and an analysis of their evolution over time. Our findings suggest that the proposed measures can be used to mimic a manual assessment of metadata quality.
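As a simplified illustration of an automatic metadata quality measure (far cruder than those analysed in the paper), a completeness score can be computed over a set of expected fields; the field names and example record are invented.

```python
# Toy completeness score over a few metadata fields; real measures also cover
# ontology-based annotation checks and field-specific validity rules.
REQUIRED = ["title", "organism", "instrument", "contact_email", "protocol"]

def completeness(record: dict) -> float:
    """Fraction of required fields that are present and non-empty."""
    filled = sum(1 for field in REQUIRED if str(record.get(field, "")).strip())
    return filled / len(REQUIRED)

submission = {"title": "Urine NMR study", "organism": "Homo sapiens", "instrument": ""}
print(f"metadata completeness: {completeness(submission):.0%}")  # 40%
```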
EXIF Custom: Automatic image metadata extraction for Scratchpads and Drupal.
Baker, Ed
2013-01-01
Many institutions and individuals use embedded metadata to aid in the management of their image collections. Many desktop image management solutions such as Adobe Bridge and online tools such as Flickr also make use of embedded metadata to describe, categorise and license images. Until now Scratchpads (a data management system and virtual research environment for biodiversity) have not made use of these metadata, and users have had to manually re-enter this information if they have wanted to display it on their Scratchpad site. The Drupal module described here allows users to map metadata embedded in their images to the associated field in the Scratchpads image form using one or more customised mappings. The module works seamlessly with the bulk image uploader used on Scratchpads and it is therefore possible to upload hundreds of images easily with automatic metadata (EXIF, XMP and IPTC) extraction and mapping.
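A rough Python sketch of the same idea, mapping embedded EXIF tags to form fields, is shown below. It uses Pillow and only reads EXIF (the actual Drupal module also handles XMP and IPTC); the form-field names and file name are invented.

```python
from PIL import Image, ExifTags

# Map a few EXIF tags to hypothetical Scratchpads-style form fields.
FIELD_MAP = {"Artist": "creator", "DateTime": "date_captured", "Copyright": "licence"}

def embedded_metadata(path: str) -> dict:
    exif = Image.open(path).getexif()
    named = {ExifTags.TAGS.get(tag_id, tag_id): value for tag_id, value in exif.items()}
    return {form_field: named[tag] for tag, form_field in FIELD_MAP.items() if tag in named}

print(embedded_metadata("specimen_0001.jpg"))
```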
Collection Metadata Solutions for Digital Library Applications
NASA Technical Reports Server (NTRS)
Hill, Linda L.; Janee, Greg; Dolin, Ron; Frew, James; Larsgaard, Mary
1999-01-01
Within a digital library, collections may range from an ad hoc set of objects that serve a temporary purpose to established library collections intended to persist through time. The objects in these collections vary widely, from library and data center holdings to pointers to real-world objects, such as geographic places, and the various metadata schemas that describe them. The key to integrated use of such a variety of collections in a digital library is collection metadata that represents the inherent and contextual characteristics of a collection. The Alexandria Digital Library (ADL) Project has designed and implemented collection metadata for several purposes: in XML form, the collection metadata "registers" the collection with the user interface client; in HTML form, it is used for user documentation; eventually, it will be used to describe the collection to network search agents; and it is used for internal collection management, including mapping the object metadata attributes to the common search parameters of the system.
METADATA REGISTRY, ISO/IEC 11179
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pon, R K; Buttler, D J
2008-01-03
ISO/IEC 11179 is an international standard that documents the standardization and registration of metadata to make data understandable and shareable. This standardization and registration allows for easier locating, retrieving, and transmitting of data from disparate databases. The standard defines how metadata are conceptually modeled and how they are shared among parties, but does not define how data are physically represented as bits and bytes. The standard consists of six parts. Part 1 provides a high-level overview of the standard and defines the basic element of a metadata registry: a data element. Part 2 defines the procedures for registering classification schemes and classifying administered items in a metadata registry (MDR). Part 3 specifies the structure of an MDR. Part 4 specifies requirements and recommendations for constructing definitions for data and metadata. Part 5 defines how administered items are named and identified. Part 6 defines how administered items are registered and assigned an identifier.
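As a loose illustration only, a registered data element of the kind Part 1 describes might be modelled as follows; this drastically simplifies the ISO/IEC 11179 metamodel, and the names and values are invented.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class DataElement:
    """A drastically simplified ISO/IEC 11179-style administered item."""
    identifier: str            # registry-assigned identifier (Parts 5 and 6)
    name: str                  # naming per Part 5
    definition: str            # definition rules per Part 4
    value_domain: list[str]    # permissible values
    registration_status: str = "Recorded"
    registered_on: date = field(default_factory=date.today)

sex_code = DataElement(
    identifier="EX-0001",
    name="Person Sex Code",
    definition="Code describing the sex of a person.",
    value_domain=["F", "M", "U"],
)
print(sex_code.name, sex_code.value_domain)
```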
NASA Technical Reports Server (NTRS)
Ullman, Richard; Bane, Bob; Yang, Jingli
2008-01-01
A shell script has been written as a means of automatically making HDF-EOS-formatted data sets available via the World Wide Web. ("HDF-EOS" and variants thereof are defined in the first of the two immediately preceding articles.) The shell script chains together some software tools developed by the Data Usability Group at Goddard Space Flight Center to perform the following actions: Extract metadata in Object Definition Language (ODL) from an HDF-EOS file, Convert the metadata from ODL to Extensible Markup Language (XML), Reformat the XML metadata into human-readable Hypertext Markup Language (HTML), Publish the HTML metadata and the original HDF-EOS file to a Web server and an Open-source Project for a Network Data Access Protocol (OPeNDAP) server computer, and Reformat the XML metadata and submit the resulting file to the EOS Clearinghouse, which is a Web-based metadata clearinghouse that facilitates searching for, and exchange of, Earth-Science data.
NASA Astrophysics Data System (ADS)
Hills, S. J.; Richard, S. M.; Doniger, A.; Danko, D. M.; Derenthal, L.; Energistics Metadata Work Group
2011-12-01
A diverse group of organizations representative of the international community involved in disciplines relevant to the upstream petroleum industry (energy companies, suppliers and publishers of information to the energy industry, vendors of software applications used by the industry, and partner government and academic organizations) has engaged in the Energy Industry Metadata Standards Initiative. This Initiative envisions the use of standard metadata within the community to enable significant improvements in the efficiency with which users discover, evaluate, and access distributed information resources. The metadata standard needed to realize this vision is the initiative's primary deliverable. In addition to developing the metadata standard, the initiative is promoting its adoption to accelerate realization of the vision, and publishing metadata exemplars conformant with the standard. Implementation of the standard by community members, in the form of published metadata which document the information resources each organization manages, will allow use of tools requiring consistent metadata for efficient discovery and evaluation of, and access to, information resources. While metadata are expected to be widely accessible, access to associated information resources may be more constrained. The initiative is being conducted by Energistics' Metadata Work Group, in collaboration with the USGIN Project. Energistics is a global standards group in the oil and natural gas industry. The Work Group determined early in the initiative, based on input solicited from 40+ organizations and on an assessment of existing metadata standards, to develop the target metadata standard as a profile of a revised version of ISO 19115, formally the "Energy Industry Profile of ISO/DIS 19115-1 v1.0" (EIP). The Work Group is participating on the ISO/TC 211 project team responsible for the revision of ISO 19115, now ready for "Draft International Standard" (DIS) status. With ISO 19115 an established, capability-rich, open standard for geographic metadata, EIP v1 is expected to be widely acceptable within the community and readily sustainable over the long term. The EIP design, also per community requirements, will enable discovery, evaluation, and access to types of information resources considered important to the community, including structured and unstructured digital resources, and physical assets such as hardcopy documents and material samples. This presentation will briefly review the development of this initiative as well as the current and planned Work Group activities. More time will be spent providing an overview of the EIP v1, including the requirements it prescribes, design efforts made to enable automated metadata capture and processing, and the structure and content of its documentation, which was written to minimize ambiguity and facilitate implementation. The Work Group considers EIP v1 a solid initial design for interoperable metadata, and a first step toward the vision of the Initiative.
Rescue, Archival and Discovery of Tsunami Events on Marigrams
NASA Astrophysics Data System (ADS)
Eble, M. C.; Wright, L. M.; Stroker, K. J.; Sweeney, A.; Lancaster, M.
2017-12-01
The Big Earth Data Initiative made possible the reformatting of paper marigram records on which were recorded measurements of the 1946, 1952, 1960, and 1964 tsunamis generated in the Pacific Ocean. Data contained within each record were determined to be invaluable for tsunami researchers and operational agencies with a responsibility for issuing warnings during a tsunami event. All marigrams were carefully digitized and metadata were generated to form numerical datasets in order to provide the tsunami and other research and application-driven communities with quality data. Data were then packaged as CF-compliant netCDF data files and submitted to the NOAA Centers for Environmental Information for long-term stewardship, archival, and public discovery of both the original scanned images and the data in digital netCDF and CSV formats. PNG plots of each time series were generated and included with the data packages to provide a visual representation of the numerical data sets. ISO-compliant metadata were compiled for the collection at the event level and individual DOIs were minted for each of the four events included in this project. The procedure followed to reformat each record in this four-event subset of the larger NCEI scanned marigram inventory is presented and discussed. The practical use of these data is presented to highlight that even infrequent measurements of tsunamis hold information that may potentially help constrain earthquake rupture area, provide estimates of earthquake co-seismic slip distribution, identify subsidence or uplift, and significantly increase the holdings of in situ data available for tsunami model validation. These same data may also prove valuable to the broader global tide community for validation and further development of tide models and for investigation into the stability of tidal harmonic constants. Data reformatted as part of this project are PARR compliant and meet the requirements for Data Management, Discoverability, Accessibility, Documentation, Readability, and Data Preservation and Stewardship as per the Big Earth Data Initiative.
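For readers unfamiliar with what CF-compliant netCDF packaging of a digitized time series involves, the sketch below writes a minimal water-level series with CF-style attributes using the netCDF4 Python library. Variable names, values, and the chosen standard_name are illustrative only and should be checked against the CF standard name table; this is not the NCEI packaging workflow.

```python
from netCDF4 import Dataset
import numpy as np

# Minimal CF-style sea level time series, loosely in the spirit of the digitized
# marigram packages described above; values are synthetic.
with Dataset("marigram_1964_example.nc", "w", format="NETCDF4") as nc:
    nc.Conventions = "CF-1.6"
    nc.title = "Digitized marigram water level, example station"

    nc.createDimension("time", None)
    time = nc.createVariable("time", "f8", ("time",))
    time.units = "seconds since 1964-03-28 00:00:00"
    time.standard_name = "time"

    level = nc.createVariable("sea_surface_height", "f4", ("time",))
    level.units = "m"
    # standard_name chosen for illustration; confirm against the CF standard name table.
    level.standard_name = "sea_surface_height_above_geoid"

    time[:] = np.arange(0, 3600, 60)                 # one hour at 1-minute spacing
    level[:] = 0.5 * np.sin(np.linspace(0, 6, 60))   # synthetic oscillation
```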
Standardised online data access and publishing for Earth Systems and Climate data in Australia
NASA Astrophysics Data System (ADS)
Evans, B. J. K.; Druken, K. A.; Trenham, C.; Wang, J.; Wyborn, L. A.; Smillie, J.; Allen, C.; Porter, D.
2015-12-01
The National Computational Infrastructure (NCI) hosts Australia's largest repository (10+ PB) of research data collections spanning a wide range of fields from climate, coasts, oceans, and geophysics through to astronomy, bioinformatics, and the social sciences. Spatial scales range from global to local ultra-high resolution, requiring storage volumes from MB to PB. The data have been organised to be highly connected to both the NCI HPC and cloud resources (e.g., interactive visualisation and analysis environments). Researchers can login to utilise the high performance infrastructure for these data collections, or access the data via standards-based web services. Our aim is to provide a trusted platform to support interdisciplinary research across all the collections as well as services for use of the data within individual communities. We thus cater to a wide range of researcher needs, whilst needing to maintain a consistent approach to data management and publishing. All research data collections hosted at NCI are governed by a data management plan, prior to being published through a variety of platforms and web services such as OPeNDAP, HTTP, and WMS. The data management plan ensures the use of standard formats (when available) that comply with relevant data conventions (e.g., CF-Convention) and metadata standards (e.g., ISO19115). Digital Object Identifiers (DOIs) can be minted at NCI and assigned to datasets and collections. Large scale data growth and use in a variety of research fields has led to a rise in, and acceptance of, open spatial data formats such as NetCDF4/HDF5, prompting a need to extend these data conventions to fields such as geophysics and satellite Earth observations. The fusion of DOI-minted data that is discoverable and accessible via metadata and web services, creates a complete picture of data hosting, discovery, use, and citation. This enables standardised and reproducible data analysis.
Vapor pressures, compressibilities, expansivities, and molar volumes of the liquid phase have been measured between room temperature and the critical temperature for a series of fluorinated ethers: CF3OCF2OCF3, CF3OCF2CF2H, c-CF2CF2CF2O, CF3OCF2H, and CF3OCH3. Vapor-phase non-ide...
NASA Astrophysics Data System (ADS)
Wood, Chris
2016-04-01
Under the Marine Strategy Framework Directive (MSFD), EU Member States are mandated to achieve or maintain 'Good Environmental Status' (GES) in their marine areas by 2020, through a series of Programme of Measures (PoMs). The Celtic Seas Partnership (CSP), an EU LIFE+ project, aims to support policy makers, special-interest groups, users of the marine environment, and other interested stakeholders on MSFD implementation in the Celtic Seas geographical area. As part of this support, a metadata portal has been built to provide a signposting service to datasets that are relevant to MSFD within the Celtic Seas. To ensure that the metadata has the widest possible reach, a linked data approach was employed to construct the database. Although the metadata are stored in a traditional RDBS, the metadata are exposed as linked data via the D2RQ platform, allowing virtual RDF graphs to be generated. SPARQL queries can be executed against the end-point allowing any user to manipulate the metadata. D2RQ's mapping language, based on turtle, was used to map a wide range of relevant ontologies to the metadata (e.g. The Provenance Ontology (prov-o), Ocean Data Ontology (odo), Dublin Core Elements and Terms (dc & dcterms), Friend of a Friend (foaf), and Geospatial ontologies (geo)) allowing users to browse the metadata, either via SPARQL queries or by using D2RQ's HTML interface. The metadata were further enhanced by mapping relevant parameters to the NERC Vocabulary Server, itself built on a SPARQL endpoint. Additionally, a custom web front-end was built to enable users to browse the metadata and express queries through an intuitive graphical user interface that requires no prior knowledge of SPARQL. As well as providing means to browse the data via MSFD-related parameters (Descriptor, Criteria, and Indicator), the metadata records include the dataset's country of origin, the list of organisations involved in the management of the data, and links to any relevant INSPIRE-compliant services relating to the dataset. The web front-end therefore enables users to effectively filter, sort, or search the metadata. As the MSFD timeline requires Member States to review their progress on achieving or maintaining GES every six years, the timely development of this metadata portal will not only aid interested stakeholders in understanding how member states are meeting their targets, but also shows how linked data can be used effectively to support policy makers and associated legislative bodies.
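A sketch of how such an endpoint might be queried from Python with SPARQLWrapper follows. The endpoint URL, predicates, and filter are invented for illustration and do not reflect the portal's actual vocabulary mappings.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Hypothetical SPARQL endpoint exposed by a D2RQ-style platform; graph URIs and
# predicates below are illustrative only.
sparql = SPARQLWrapper("http://example.org/celtic-seas/sparql")
sparql.setQuery("""
    PREFIX dcterms: <http://purl.org/dc/terms/>
    SELECT ?dataset ?title WHERE {
        ?dataset dcterms:title ?title ;
                 dcterms:spatial ?area .
        FILTER CONTAINS(LCASE(STR(?title)), "nutrient")
    } LIMIT 10
""")
sparql.setReturnFormat(JSON)

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["dataset"]["value"], "-", row["title"]["value"])
```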
THE NEW ONLINE METADATA EDITOR FOR GENERATING STRUCTURED METADATA
DOE Office of Scientific and Technical Information (OSTI.GOV)
Devarakonda, Ranjeet; Shrestha, Biva; Palanisamy, Giri
Nobody is better suited to describe data than the scientist who created it. This description of the data is called metadata. In general terms, metadata represent the who, what, when, where, why and how of the dataset [1]. eXtensible Markup Language (XML) is the preferred output format for metadata, as it makes the metadata portable and, more importantly, suitable for system discoverability. The newly developed ORNL Metadata Editor (OME) is a Web-based tool that allows users to create and maintain XML files containing key information, or metadata, about their research. Metadata include information about the specific projects, parameters, time periods, and locations associated with the data. Such information helps put the research findings in context. In addition, the metadata produced using OME will allow other researchers to find these data via metadata clearinghouses like Mercury [2][4]. OME is part of ORNL's Mercury software fleet [2][3]. It was jointly developed to support projects funded by the United States Geological Survey (USGS), U.S. Department of Energy (DOE), National Aeronautics and Space Administration (NASA) and National Oceanic and Atmospheric Administration (NOAA). OME's architecture provides a customizable interface to support project-specific requirements. Using this new architecture, the ORNL team developed OME instances for USGS's Core Science Analytics, Synthesis, and Libraries (CSAS&L), DOE's Next Generation Ecosystem Experiments (NGEE) and Atmospheric Radiation Measurement (ARM) Program, and the international Surface Ocean Carbon Dioxide ATlas (SOCAT). Researchers simply use the ORNL Metadata Editor to enter relevant metadata into a Web-based form. From the information on the form, the Metadata Editor can create an XML file on the server where the editor is installed or on the user's personal computer. Researchers can also use the ORNL Metadata Editor to modify existing XML metadata files. As an example, an NGEE Arctic scientist uses OME to register their datasets with the NGEE data archive, which allows the NGEE archive to publish these datasets via a data search portal (http://ngee.ornl.gov/data). These highly descriptive metadata created using OME allow the archive to enable advanced data search options using keyword, geo-spatial, temporal and ontology filters. Similarly, ARM OME allows scientists or principal investigators (PIs) to submit their data products to the ARM data archive. How would OME help big data centers like the Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC)? The ORNL DAAC is one of NASA's Earth Observing System Data and Information System (EOSDIS) data centers managed by the Earth Science Data and Information System (ESDIS) Project. The ORNL DAAC archives data produced by NASA's Terrestrial Ecology Program. The DAAC provides data and information relevant to biogeochemical dynamics, ecological data, and environmental processes, critical for understanding the dynamics relating to the biological, geological, and chemical components of the Earth's environment. Typically the data produced, archived and analyzed are at a scale of multiple petabytes, which makes discoverability of the data very challenging. Without proper metadata associated with the data, it is difficult to find the data you are looking for and equally difficult to use and understand the data.
OME will allow data centers like the NGEE and ORNL DAAC to produce meaningful, high-quality, standards-based, descriptive information about their data products, in turn helping with data discoverability and interoperability. Useful links: USGS OME: http://mercury.ornl.gov/OME/ NGEE OME: http://ngee-arctic.ornl.gov/ngeemetadata/ ARM OME: http://archive2.ornl.gov/armome/ Contact: Ranjeet Devarakonda (devarakondar@ornl.gov) References: [1] Federal Geographic Data Committee. Content standard for digital geospatial metadata. Federal Geographic Data Committee, 1998. [2] Devarakonda, Ranjeet, et al. "Mercury: reusable metadata management, data discovery and access system." Earth Science Informatics 3.1-2 (2010): 87-94. [3] Wilson, B. E., Palanisamy, G., Devarakonda, R., Rhyne, B. T., Lindsley, C., & Green, J. (2010). Mercury Toolset for Spatiotemporal Metadata. [4] Pouchard, L. C., Branstetter, M. L., Cook, R. B., Devarakonda, R., Green, J., Palanisamy, G., ... & Noy, N. F. (2013). A Linked Science investigation: enhancing climate change data discovery with semantic technologies. Earth Science Informatics, 6(3), 175-185.
A document centric metadata registration tool constructing earth environmental data infrastructure
NASA Astrophysics Data System (ADS)
Ichino, M.; Kinutani, H.; Ono, M.; Shimizu, T.; Yoshikawa, M.; Masuda, K.; Fukuda, K.; Kawamoto, H.
2009-12-01
DIAS (Data Integration and Analysis System) is one of the GEOSS activities in Japan. It is also a leading part of the GEOSS task with the same name defined in the GEOSS Ten Year Implementation Plan. The main mission of DIAS is to construct a data infrastructure that can effectively integrate earth environmental data such as observation data, numerical model outputs, and socio-economic data provided from the fields of climate, water cycle, ecosystem, ocean, biodiversity and agriculture. Some of DIAS's data products are available at http://www.jamstec.go.jp/e/medid/dias. Most earth environmental data share common spatial and temporal attributes, such as the geographic scope covered or the date created. The metadata standards covering these common attributes are published by the geographic information technical committee (TC 211) of ISO (the International Organization for Standardization) as ISO 19115:2003 and ISO 19139:2007. Accordingly, DIAS metadata are developed based on the ISO/TC 211 metadata standards. From the viewpoint of data users, metadata are useful not only for data retrieval and analysis but also for interoperability and information sharing among experts, beginners and nonprofessionals. From the viewpoint of data providers, however, two problems were pointed out in discussions. One is that data providers prefer to minimize the extra tasks and time spent creating metadata. The other is that data providers want to manage and publish documents that explain their data sets more comprehensively. To solve these problems, we have been developing a document-centric metadata registration tool. The features of our tool are that the generated documents are available instantly and that there is no extra cost for data providers to generate metadata. The tool is developed as a web application, so data providers need no additional software beyond a web browser. The interface of the tool provides the section titles of the documents and, by filling out the content of each section, the documents for the data sets are automatically published in PDF and HTML format. Furthermore, a metadata XML file compliant with ISO 19115 and ISO 19139 is created at the same time. The generated metadata are managed in the metadata database of the DIAS project, and will be used in various ISO 19139-compliant metadata management tools, such as GeoNetwork.
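A toy sketch of the document-centric idea, turning filled-in form sections into a small metadata XML file, is given below. The element names are simplified placeholders rather than the real ISO 19139 schema, and the form content is invented.

```python
import xml.etree.ElementTree as ET

# Toy generator: turn form sections into a small metadata document. The element
# names below are simplified placeholders, not the real ISO 19139 schema.
form = {
    "title": "Daily precipitation, Kanto region",
    "abstract": "Gridded daily precipitation derived from rain gauge observations.",
    "temporal_extent": "2000-01-01/2009-12-31",
}

root = ET.Element("MD_Metadata")
ident = ET.SubElement(root, "identificationInfo")
for section, text in form.items():
    ET.SubElement(ident, section).text = text

ET.ElementTree(root).write("dias_example_metadata.xml", encoding="utf-8", xml_declaration=True)
print(ET.tostring(root, encoding="unicode"))
```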
A metadata-driven approach to data repository design.
Harvey, Matthew J; McLean, Andrew; Rzepa, Henry S
2017-01-01
The design and use of a metadata-driven data repository for research data management is described. Metadata is collected automatically during the submission process whenever possible and is registered with DataCite in accordance with their current metadata schema, in exchange for a persistent digital object identifier. Two examples of data preview are illustrated, including the demonstration of a method for integration with commercial software that confers rich domain-specific data analytics without introducing customisation into the repository itself.
NASA Astrophysics Data System (ADS)
le Roux, J.; Baker, A.; Caltagirone, S.; Bugbee, K.
2017-12-01
The Common Metadata Repository (CMR) is a high-performance, high-quality repository for Earth science metadata records, and serves as the primary way to search NASA's growing 17.5 petabytes of Earth science data holdings. Released in 2015, CMR has the capability to support several different metadata standards already being utilized by NASA's combined network of Earth science data providers, or Distributed Active Archive Centers (DAACs). The Analysis and Review of CMR (ARC) Team located at Marshall Space Flight Center is working to improve the quality of records already in CMR with the goal of making records optimal for search and discovery. This effort entails a combination of automated and manual review, where each NASA record in CMR is checked for completeness, accuracy, and consistency. This effort is highly collaborative in nature, requiring communication and transparency of findings amongst NASA personnel, DAACs, the CMR team and other metadata curation teams. Through the evolution of this project it has become apparent that there is a need to document and report findings, as well as track metadata improvements in a more efficient manner. The ARC team has collaborated with Element 84 in order to develop a metadata curation tool to meet these needs. In this presentation, we will provide an overview of this metadata curation tool and its current capabilities. Challenges and future plans for the tool will also be discussed.
Social tagging in the life sciences: characterizing a new metadata resource for bioinformatics.
Good, Benjamin M; Tennis, Joseph T; Wilkinson, Mark D
2009-09-25
Academic social tagging systems, such as Connotea and CiteULike, provide researchers with a means to organize personal collections of online references with keywords (tags) and to share these collections with others. One of the side-effects of the operation of these systems is the generation of large, publicly accessible metadata repositories describing the resources in the collections. In light of the well-known expansion of information in the life sciences and the need for metadata to enhance its value, these repositories present a potentially valuable new resource for application developers. Here we characterize the current contents of two scientifically relevant metadata repositories created through social tagging. This investigation helps to establish how such socially constructed metadata might be used as it stands currently and to suggest ways that new social tagging systems might be designed that would yield better aggregate products. We assessed the metadata that users of CiteULike and Connotea associated with citations in PubMed with the following metrics: coverage of the document space, density of metadata (tags) per document, rates of inter-annotator agreement, and rates of agreement with MeSH indexing. CiteULike and Connotea were very similar on all of the measurements. In comparison to PubMed, document coverage and per-document metadata density were much lower for the social tagging systems. Inter-annotator agreement within the social tagging systems and the agreement between the aggregated social tagging metadata and MeSH indexing was low though the latter could be increased through voting. The most promising uses of metadata from current academic social tagging repositories will be those that find ways to utilize the novel relationships between users, tags, and documents exposed through these systems. For more traditional kinds of indexing-based applications (such as keyword-based search) to benefit substantially from socially generated metadata in the life sciences, more documents need to be tagged and more tags are needed for each document. These issues may be addressed both by finding ways to attract more users to current systems and by creating new user interfaces that encourage more collectively useful individual tagging behaviour.
A Metadata Element Set for Project Documentation
NASA Technical Reports Server (NTRS)
Hodge, Gail; Templeton, Clay; Allen, Robert B.
2003-01-01
NASA Goddard Space Flight Center is a large engineering enterprise with many projects. We describe our efforts to develop standard metadata sets across project documentation, which we term the "Goddard Core". We also address broader issues for project management metadata.
Metadata for WIS and WIGOS: GAW Profile of ISO19115 and Draft WIGOS Core Metadata Standard
NASA Astrophysics Data System (ADS)
Klausen, Jörg; Howe, Brian
2014-05-01
The World Meteorological Organization (WMO) Integrated Global Observing System (WIGOS) is a key WMO priority to underpin all WMO Programs and new initiatives such as the Global Framework for Climate Services (GFCS). The development of the WIGOS Operational Information Resource (WIR) is central to the WIGOS Framework Implementation Plan (WIGOS-IP). The WIR shall provide information on WIGOS and its observing components, as well as requirements of WMO application areas. An important aspect is the description of observational capabilities by way of structured metadata. The Global Atmosphere Watch (GAW) is the WMO program addressing the chemical composition and selected physical properties of the atmosphere. Observational data are collected and archived by GAW World Data Centres (WDCs) and related data centres. The Task Team on GAW WDCs (ET-WDC) has developed a profile of the ISO 19115 metadata standard that is compliant with the WMO Information System (WIS) specification for the WMO Core Metadata Profile v1.3. This profile is intended to harmonize certain aspects of the documentation of observations as well as the interoperability of the WDCs. The Inter-Commission Group on WIGOS (ICG-WIGOS) has established the Task Team on WIGOS Metadata (TT-WMD) with representation from all WMO Technical Commissions and the objective of defining the WIGOS Core Metadata. The result of this effort is a draft semantic standard comprising a set of metadata classes that are considered to be of critical importance for the interpretation of observations relevant to WIGOS. The purpose of the presentation is to acquaint the audience with the standard and to solicit informal feedback from experts in the various disciplines of meteorology and climatology. This feedback will help ET-WDC and TT-WMD to refine the GAW metadata profile and the draft WIGOS metadata standard, thereby increasing their utility and acceptance.
SIPSMetGen: It's Not Just For Aircraft Data and ECS Anymore.
NASA Astrophysics Data System (ADS)
Schwab, M.
2015-12-01
The SIPSMetGen utility, developed for the NASA EOSDIS project under the EED contract, simplified the creation of file-level metadata for the ECS system. The utility has been enhanced for ease of use, efficiency, speed, and increased flexibility. The SIPSMetGen utility was originally created as a means of generating file-level spatial metadata for Operation IceBridge. The first version created only ODL metadata, specific for ingest into ECS. The core strength of the utility was, and continues to be, its ability to take complex shapes and patterns of data-collection point clouds from aircraft flights and simplify them to a relatively simple concave hull geo-polygon. It has been found to be a useful and easy-to-use tool for creating file-level metadata for many other missions, both aircraft and satellite. While the original version was useful, it had its limitations. In 2014 Raytheon was tasked to make enhancements to SIPSMetGen, which resulted in a new version that can create ISO-compliant XML metadata; provides optimization and streamlining of the algorithm for creating the spatial metadata; runs more quickly with more consistent results; and can be configured to run multi-threaded on systems with multiple processors. The utility comes with a Java-based graphical user interface to aid in configuration and running of the utility. The enhanced SIPSMetGen allows more diverse data sets to be archived with file-level metadata. The advantage of archiving data with file-level metadata is that it makes it easier for data users and scientists to find relevant data. File-level metadata unlocks the power of existing archives and metadata repositories such as ECS and CMR and search and discovery utilities like Reverb and Earthdata Search. Current missions now using SIPSMetGen include Aquarius, MEaSUREs, ARISE, and Nimbus.
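To illustrate the point-cloud-to-polygon step, the sketch below reduces a synthetic flight track to a bounding polygon. It uses a convex hull from SciPy as a simpler stand-in for the concave hull that SIPSMetGen actually computes; the track itself is randomly generated.

```python
import numpy as np
from scipy.spatial import ConvexHull

# Simplify a dense flight-track point cloud to a bounding polygon. SIPSMetGen
# computes a concave hull; a convex hull is shown here as a simpler stand-in.
rng = np.random.default_rng(0)
track = np.column_stack([
    np.linspace(-60.0, -58.0, 500) + 0.05 * rng.standard_normal(500),   # lon
    np.linspace(72.0, 74.5, 500) + 0.05 * rng.standard_normal(500),     # lat
])

hull = ConvexHull(track)
polygon = track[hull.vertices]          # ordered boundary points (lon, lat)
print(f"{len(track)} points reduced to a {len(polygon)}-vertex polygon")
```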
NASA Astrophysics Data System (ADS)
Thomas, R.; Connell, D.; Spears, T.; Leadbetter, A.; Burger, E. F.
2016-12-01
The scientific literature heavily features small-scale studies with the impact of the results extrapolated to regional/global importance. There are on-going initiatives (e.g. OA-ICC, GOA-ON, GEOTRACES, EMODNet Chemistry) aiming to assemble regional to global-scale datasets that are available for trend or meta-analyses. Assessing the quality and comparability of these data requires information about the processing chain from "sampling to spreadsheet". This provenance information needs to be captured and readily available to assess data fitness for purpose. The NOAA Ocean Acidification metadata template was designed in consultation with domain experts for this reason; the core carbonate chemistry variables have 23-37 metadata fields each and for scientists generating these datasets there could appear to be an ever increasing amount of metadata expected to accompany a dataset. While this provenance metadata should be considered essential by those generating or using the data, for those discovering data there is a sliding scale between what is considered discovery metadata (title, abstract, contacts, etc.) versus usage metadata (methodology, environmental setup, lineage, etc.), the split depending on the intended use of data. As part of the OA-ICC's activities, the metadata fields from the NOAA template relevant to the sample processing chain and QA criteria have been factored to develop profiles for, and extensions to, the OM-JSON encoding supported by the PROV ontology. While this work started focused on carbonate chemistry variable specific metadata, the factorization could be applied within the O&M model across other disciplines such as trace metals or contaminants. In a linked data world with a suitable high level model for sample processing and QA available, tools and support can be provided to link reproducible units of metadata (e.g. the standard protocol for a variable as adopted by a community) and simplify the provision of metadata and subsequent discovery.
Metadata Creation, Management and Search System for your Scientific Data
NASA Astrophysics Data System (ADS)
Devarakonda, R.; Palanisamy, G.
2012-12-01
Mercury Search Systems is a set of tools for creating, searching, and retrieving biogeochemical metadata. The Mercury toolset provides orders-of-magnitude improvements in search speed, support for any metadata format, integration with Google Maps for spatial queries, multi-faceted search, search suggestions, support for RSS (Really Simple Syndication) delivery of search results, and enhanced customization to meet the needs of the multiple projects that use Mercury. Mercury's metadata editor provides an easy way to create metadata, and Mercury's search interface provides a single portal to search for data and information contained in disparate data management systems, each of which may use any metadata format including FGDC, ISO-19115, Dublin-Core, Darwin-Core, DIF, ECHO, and EML. Mercury harvests metadata and key data from contributing project servers distributed around the world and builds a centralized index. The search interfaces then allow users to perform a variety of fielded, spatial, and temporal searches across these metadata sources. This centralized repository of metadata with distributed data sources provides extremely fast search results to the user, while allowing data providers to advertise the availability of their data and maintain complete control and ownership of that data. Mercury is being used by more than 14 different projects across 4 federal agencies. It was originally developed for NASA, with continuing development funded by NASA, USGS, and DOE for a consortium of projects. Mercury search won NASA's Earth Science Data Systems Software Reuse Award in 2008. References: R. Devarakonda, G. Palanisamy, B.E. Wilson, and J.M. Green, "Mercury: reusable metadata management data discovery and access system", Earth Science Informatics, vol. 3, no. 1, pp. 87-94, May 2010. R. Devarakonda, G. Palanisamy, J.M. Green, B.E. Wilson, "Data sharing and retrieval using OAI-PMH", Earth Science Informatics, DOI: 10.1007/s12145-010-0073-0, 2010.
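The harvest-and-centralize pattern can be sketched with a tiny in-memory inverted index; the harvested records and field names below are invented, and this is not Mercury's actual indexing code.

```python
from collections import defaultdict

# Toy version of the harvest-and-index idea: metadata records gathered from
# distributed providers are merged into one searchable index.
harvested = [
    {"id": "prov-a/001", "title": "Soil respiration flux, boreal forest", "format": "FGDC"},
    {"id": "prov-b/042", "title": "Boreal forest leaf area index", "format": "EML"},
]

index = defaultdict(set)                      # term -> record ids
for record in harvested:
    for term in record["title"].lower().replace(",", " ").split():
        index[term].add(record["id"])

def search(term: str) -> set:
    return index.get(term.lower(), set())

print(search("boreal"))   # {'prov-a/001', 'prov-b/042'}
```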
Master Metadata Repository and Metadata-Management System
NASA Technical Reports Server (NTRS)
Armstrong, Edward; Reed, Nate; Zhang, Wen
2007-01-01
A master metadata repository (MMR) software system manages the storage and searching of metadata pertaining to data from national and international satellite sources of the Global Ocean Data Assimilation Experiment (GODAE) High Resolution Sea Surface Temperature Pilot Project [GHRSSTPP]. These sources produce a total of hundreds of data files daily, each file classified as one of more than ten data products representing global sea-surface temperatures. The MMR is a relational database wherein the metadata are divided into granule-level records [denoted file records (FRs)] for individual satellite files and collection-level records [denoted data set descriptions (DSDs)] that describe metadata common to all the files from a specific data product. FRs and DSDs adhere to the NASA Directory Interchange Format (DIF). The FRs and DSDs are contained in separate subdatabases linked by a common field. The MMR is configured in MySQL database software with custom Practical Extraction and Reporting Language (PERL) programs to validate and ingest the metadata records. The database contents are converted into the Federal Geographic Data Committee (FGDC) standard format by use of the Extensible Markup Language (XML). A Web interface enables users to search for availability of data from all sources.
NASA Astrophysics Data System (ADS)
Zhou, Yangliu
The most commonly used proton conductive membrane in polymer electrolyte membrane fuel cell (PEMFC) and direct methanol fuel cell (DMFC) studies to date is DuPont's Nafion®, which is a perfluorinated copolymer of tetrafluoroethylene (TFE) and a perfluorovinyl ether with a pendant sulfonic acid group. A focus of this work is to find ways to improve the performance of Nafion® membranes. Crosslinking the TFE chains of fluorinated ionomeric copolymers to improve their thermal and mechanical stability is a proven route to this goal. A straightforward synthetic route to perfluorinated divinyl ethers of the formula CF2=CFO(CF2)3[OCF(CF3)CF2]mOCF=CF2 (m = 0-1) has been demonstrated. The compounds CF2=CFO(CF2)3OCF=CF2 and CF2=CFO(CF2)3OCF(CF3)CF2OCF=CF2 were prepared and characterized by GC-MS, 13C and 19F NMR, and gas-phase IR spectroscopy. Synthetic routes to fluorosulfato-tetrafluoropropionyl fluoride [FSO3CF2CF2C(O)F] and difluoromalonyl difluoride [F(O)CCF2C(O)F] with improved yields were found. The second focus of the dissertation was the development of fluorous triarylphosphines for use as new doping materials for the modification of Nafion® membranes and for use as ligands in catalysts for biphasic catalysis. The synthesis and characterization of a series of new polyhexafluoropropylene oxide derivatives for the preparation of fluorous triarylphosphines and phosphonium salts was studied, such as F[CF(CF3)CF2O]4CF(CF3)CH2CH2I, F[CF(CF3)CF2O]4CF(CF3)CH=CH2, F[CF(CF3)CF2O]4CF(CF3)CH2CH2C6H5, and F[CF(CF3)CF2O]4CF(CF3)CH2CH2C6H4Br. In a separate study, the photochlorination of 2,2,3,3-tetrafluoro-1-propanol (HCF2CF2CH2OH) and 2,2,3,3-tetrafluoropropyl 2,2,3,3-tetrafluoropropionate [HCF2CF2C(O)OCH2CF2CF2H] with super diazo blue light (lambda max = 420 nm) was investigated. The photochemical products are different from those obtained under mercury light (lambda = 253.7 nm). A new compound, ClCF2CF2C(O)OC(H)ClCF2CF2Cl, was prepared and characterized by GC-MS, elemental analysis, 1H, 13C and 19F NMR, and gas-phase IR spectroscopy.
Brady's Geothermal Field Nodal Seismometers Metadata
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lesley Parker
Metadata for the nodal seismometer array deployed at the POROTOMO's Natural Laboratory in Brady Hot Spring, Nevada during the March 2016 testing. Metadata includes location and timing for each instrument as well as file lists of data to be uploaded in a separate submission.
Extraction of CT dose information from DICOM metadata: automated Matlab-based approach.
Dave, Jaydev K; Gingold, Eric L
2013-01-01
The purpose of this study was to extract exposure parameters and dose-relevant indexes of CT examinations from information embedded in DICOM metadata. DICOM dose report files were identified and retrieved from a PACS. An automated software program was used to extract exposure-relevant information from the structured elements of the DICOM metadata in these files. Extracting information from DICOM metadata eliminated potential errors inherent in techniques based on optical character recognition, yielding 100% accuracy.
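A hedged sketch of this kind of extraction (not the authors' Matlab program) is shown below: pydicom reads a radiation dose structured report and walks its content tree for coded numeric items such as CTDIvol or DLP. The file path is a placeholder.

```python
# Hedged sketch (not the authors' program): read a DICOM radiation dose
# structured report with pydicom and walk its content tree for numeric
# items such as CTDIvol or DLP. The file path is a placeholder.
import pydicom

def walk_sr(items, depth=0):
    """Recursively print coded concept names and numeric values from an SR content tree."""
    for item in items:
        name = ""
        if "ConceptNameCodeSequence" in item:
            name = item.ConceptNameCodeSequence[0].CodeMeaning
        if "MeasuredValueSequence" in item:
            mv = item.MeasuredValueSequence[0]
            print("  " * depth + f"{name}: {mv.NumericValue}")
        if "ContentSequence" in item:
            walk_sr(item.ContentSequence, depth + 1)

ds = pydicom.dcmread("ct_dose_report.dcm")   # hypothetical dose report file
if "ContentSequence" in ds:
    walk_sr(ds.ContentSequence)
```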
ERIC Educational Resources Information Center
O'Neill, Edward T.; Lavoie, Brian F.; Bennett, Rick; Staples, Thornton; Wayland, Ross; Payette, Sandra; Dekkers, Makx; Weibel, Stuart; Searle, Sam; Thompson, Dave; Rudner, Lawrence M.
2003-01-01
Includes five articles that examine key trends in the development of the public Web: size and growth, internationalization, and metadata usage; Flexible Extensible Digital Object and Repository Architecture (Fedora) for use in digital libraries; developments in the Dublin Core Metadata Initiative (DCMI); the National Library of New Zealand Te Puna…
NASA Astrophysics Data System (ADS)
Andre, Francois; Fleury, Laurence; Gaillardet, Jerome; Nord, Guillaume
2015-04-01
RBV (Réseau des Bassins Versants) is a French initiative to consolidate the national efforts made by more than 15 elementary observatories funded by various research institutions (CNRS, INRA, IRD, IRSTEA, universities) that study river and drainage basins. The RBV Metadata Catalogue aims at giving a unified vision of the work produced by every observatory to both the members of the RBV network and any external person interested in this domain of research. Another goal is to share this information with other existing metadata portals. Metadata management is heterogeneous among observatories, ranging from absence to mature harvestable catalogues. Here, we explain the strategy used to design a state-of-the-art catalogue facing this situation. Its main features are as follows: - Multiple input methods: metadata records in the catalogue can either be entered with the graphical user interface, harvested from an existing catalogue, or imported from an information system through simplified web services. - Hierarchical levels: metadata records may describe an observatory, one of its experimental sites, or a single dataset produced by one instrument. - Multilingualism: metadata can easily be entered in several configurable languages. - Compliance with standards: the back-office part of the catalogue is based on a CSW metadata server (GeoSource), which ensures ISO 19115 compatibility and the ability to be harvested (globally or partially). Ongoing tasks focus on the use of SKOS thesauri and SensorML descriptions of the sensors. - Ergonomics: the user interface is built with the GWT framework to offer a rich client application with fully Ajax-based navigation. - Source code sharing: the work has led to the development of reusable components which can be used to quickly create new metadata forms in other GWT applications. You can visit the catalogue (http://portailrbv.sedoo.fr/) or contact us by email at rbv@sedoo.fr.
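Catalogues of this kind are typically queried over CSW, for which OWSLib provides a client. The sketch below assumes a hypothetical CSW endpoint; the portal address quoted in the abstract is a web front end, not necessarily the CSW service itself.

```python
# Sketch of searching an ISO 19115 catalogue over CSW with OWSLib.
# The endpoint URL is illustrative, not the actual RBV CSW service.
from owslib.csw import CatalogueServiceWeb
from owslib.fes import PropertyIsLike

csw = CatalogueServiceWeb("https://example.org/geonetwork/srv/eng/csw")  # assumed endpoint
query = PropertyIsLike("csw:AnyText", "%drainage basin%")
csw.getrecords2(constraints=[query], esn="summary", maxrecords=10)

for ident, record in csw.records.items():
    print(ident, "-", record.title)
```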
Uciteli, Alexandr; Herre, Heinrich
2015-01-01
The specification of metadata in clinical and epidemiological study projects entails significant expense. The validity and quality of the collected data depend heavily on the precise and semantically correct representation of their metadata. In various research organizations that plan and coordinate studies, the required metadata are specified differently, depending on many conditions, e.g., on the study management software used. The latter does not always meet the needs of a particular research organization, e.g., with respect to the relevant metadata attributes and structuring possibilities. The objective of the research set forth in this paper is the development of a new approach for ontology-based representation and management of metadata. The basic features of this approach are demonstrated by the software tool OntoStudyEdit (OSE). The OSE is designed and developed according to the three-ontology method. This method for developing software is based on the interactions of three different kinds of ontologies: a task ontology, a domain ontology and a top-level ontology. The OSE can be easily adapted to different requirements, and it supports an ontologically founded representation and efficient management of metadata. The metadata specifications can be imported from various sources; they can be edited with the OSE, and they can be exported to several formats, which are used, e.g., by different study management software. Advantages of this approach are the adaptability of the OSE by integrating suitable domain ontologies, the ontological specification of mappings between the import/export formats and the domain ontology, the specification of the study metadata in a uniform manner and its reuse in different research projects, and an intuitive data entry for non-expert users.
NASA Astrophysics Data System (ADS)
Zaslavsky, I.; Valentine, D.; Richard, S. M.; Gupta, A.; Meier, O.; Peucker-Ehrenbrink, B.; Hudman, G.; Stocks, K. I.; Hsu, L.; Whitenack, T.; Grethe, J. S.; Ozyurt, I. B.
2017-12-01
EarthCube Data Discovery Hub (DDH) is an EarthCube Building Block project using technologies developed in CINERGI (Community Inventory of EarthCube Resources for Geoscience Interoperability) to enable geoscience users to explore a growing portfolio of EarthCube-created and other geoscience-related resources. Over 1 million metadata records are available for discovery through the project portal (cinergi.sdsc.edu). These records are retrieved from data facilities, including federal, state and academic sources, or contributed by geoscientists through workshops, surveys, or other channels. CINERGI metadata augmentation pipeline components 1) provide semantic enhancement based on a large ontology of geoscience terms, using text analytics to generate keywords with references to ontology classes, 2) add spatial extents based on place names found in the metadata record, and 3) add organization identifiers to the metadata. The records are indexed and can be searched via a web portal and standard search APIs. The added metadata content improves discoverability and interoperability of the registered resources. Specifically, the addition of ontology-anchored keywords enables faceted browsing and lets users navigate to datasets related by variables measured, equipment used, science domain, processes described, geospatial features studied, and other dataset characteristics that are generated by the pipeline. DDH also lets data curators access and edit the automatically generated metadata records using the CINERGI metadata editor, accept or reject the enhanced metadata content, and consider it in updating their metadata descriptions. We consider several complex data discovery workflows, in environmental seismology (quantifying sediment and water fluxes using seismic data), marine biology (determining available temperature, location, weather and bleaching characteristics of coral reefs related to measurements in a given coral reef survey), and river geochemistry (discovering observations relevant to geochemical measurements outside the tidal zone, given specific discharge conditions).
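The keyword-enhancement step of such a pipeline can be illustrated with a toy example: scan a record's text for terms from an ontology-derived vocabulary and attach matching keywords with class identifiers. The vocabulary and URIs below are invented for illustration; the actual CINERGI pipeline uses a large geoscience ontology and text analytics.

```python
# Toy illustration of the keyword-enhancement step: scan a metadata record's
# text for terms from an ontology-derived vocabulary and attach matching
# keywords with class identifiers. The vocabulary and URIs are invented here.
VOCAB = {
    "sediment flux": "http://example.org/onto#SedimentFlux",
    "coral reef": "http://example.org/onto#CoralReef",
    "seismometer": "http://example.org/onto#Seismometer",
}

def enhance(record):
    text = (record.get("title", "") + " " + record.get("abstract", "")).lower()
    matches = [{"keyword": term, "class": uri} for term, uri in VOCAB.items() if term in text]
    enhanced = dict(record)
    enhanced["enhanced_keywords"] = matches
    return enhanced

record = {"title": "Quantifying sediment flux with seismic data",
          "abstract": "Nodal seismometer observations near a river reach."}
print(enhance(record)["enhanced_keywords"])
```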
The data paper: a mechanism to incentivize data publishing in biodiversity science.
Chavan, Vishwas; Penev, Lyubomir
2011-01-01
Free and open access to primary biodiversity data is essential for informed decision-making to achieve conservation of biodiversity and sustainable development. However, primary biodiversity data are neither easily accessible nor discoverable. Among several impediments, one is a lack of incentives for data publishers to publish their data resources. One such mechanism currently lacking is recognition through conventional scholarly publication of enriched metadata, which should ensure rapid discovery of 'fit-for-use' biodiversity data resources. We review the state of the art of data discovery options and the mechanisms in place for incentivizing data publishers' efforts towards easy, efficient and enhanced publishing, dissemination, sharing and re-use of biodiversity data. We propose the establishment of the 'biodiversity data paper' as one possible mechanism to offer scholarly recognition for efforts and investment by data publishers in authoring rich metadata and publishing them as citable academic papers. While detailing the benefits to data publishers, we describe the objectives, workflow and outcomes of the pilot project commissioned by the Global Biodiversity Information Facility in collaboration with scholarly publishers and pioneered by Pensoft Publishers through its journals ZooKeys, PhytoKeys, MycoKeys, BioRisk, NeoBiota, Nature Conservation and the forthcoming Biodiversity Data Journal. We then debate further enhancements of the data paper beyond the pilot project and attempt to forecast the future uptake of data papers as an incentivization mechanism by the stakeholder communities. We believe that in addition to recognition for those involved in the data publishing enterprise, data papers will also expedite publishing of fit-for-use biodiversity data resources. However, uptake and establishment of the data paper as a potential mechanism of scholarly recognition requires a high degree of commitment and investment by the cross-sectional stakeholder communities.
Hill, Andrew; Guralnick, Robert; Smith, Arfon; Sallans, Andrew; Rosemary Gillespie; Denslow, Michael; Gross, Joyce; Murrell, Zack; Tim Conyers; Oboyski, Peter; Ball, Joan; Thomer, Andrea; Prys-Jones, Robert; de Torre, Javier; Kociolek, Patrick; Fortson, Lucy
2012-01-01
Legacy data from natural history collections contain invaluable and irreplaceable information about biodiversity in the recent past, providing a baseline for detecting change and forecasting the future of biodiversity on a human-dominated planet. However, these data are often not available in formats that facilitate use and synthesis. New approaches are needed to enhance the rates of digitization and data quality improvement. Notes from Nature provides one such novel approach by asking citizen scientists to help with transcription tasks. The initial web-based prototype of Notes from Nature will soon be widely available and was developed collaboratively by biodiversity scientists, natural history collections staff, and experts in citizen science project development, programming and visualization. This project brings together digital images representing different types of biodiversity records, including ledgers, herbarium sheets and pinned insects from multiple projects and natural history collections. Experts in developing web-based citizen science applications then designed and built a platform for transcribing textual data and metadata from these images. The end product is a fully open source web transcription tool built using the latest web technologies. The platform keeps volunteers engaged by initially explaining the scientific importance of the work via a short orientation, and then providing transcription “missions” of well-defined scope, along with dynamic feedback, interactivity and rewards. Transcribed records, along with record-level and process metadata, are provided back to the institutions. While the tool is being developed with new users in mind, it can serve a broad range of needs from novice to trained museum specialist. Notes from Nature has the potential to speed the rate of biodiversity data being made available to a broad community of users. PMID:22859890
A 24/7 High Resolution Storm Surge, Inundation and Circulation Forecasting System for the Florida Coast
NASA Astrophysics Data System (ADS)
Paramygin, V.; Davis, J. R.; Sheng, Y.
2012-12-01
A 24/7 forecasting system for Florida is needed because of the high risk of tropical storm surge-induced coastal inundation and damage, and the need to support operational management of water resources, utility infrastructure, and fishery resources. With anticipated climate change impacts, including sea level rise, coastal areas face the challenges of increasing inundation risk and increasing population. Accurate 24/7 forecasting of water level, inundation, and circulation will significantly enhance the sustainability of coastal communities and environments. Supported by the Southeast Coastal Ocean Observing Regional Association (SECOORA) through NOAA IOOS, a 24/7 high-resolution forecasting system for storm surge, coastal inundation, and baroclinic circulation is being developed for Florida using the CH3D Storm Surge Modeling System (CH3D-SSMS). CH3D-SSMS is based on the CH3D hydrodynamic model coupled to the coastal wave model SWAN and to basin-scale surge and wave models. CH3D-SSMS has been verified with surge, wave, and circulation data from several recent hurricanes in the U.S.: Isabel (2003); Charley, Dennis and Ivan (2004); Katrina and Wilma (2005); Ike and Fay (2008); and Irene (2011), as well as typhoons in the Pacific: Fanapi (2010) and Nanmadol (2011). The effects of tropical cyclones on flow and salinity distribution in estuarine and coastal waters have been simulated for Apalachicola Bay as well as the Guana-Tolomato-Matanzas Estuary using CH3D-SSMS. The system successfully reproduced different physical phenomena, including the large waves during Ivan that damaged the I-10 bridges, large alongshore waves and coastal flooding during Wilma, the salinity drop during Fay, and flooding in Taiwan resulting from combined surge and rain effects during Fanapi. The system uses four domains that cover the entire Florida coastline: West, which covers the Florida panhandle and Tampa Bay; Southwest, which spans from the Florida Keys to Charlotte Harbor; Southeast, covering Biscayne Bay and Miami; and East, which continues north to the Florida/Georgia border. The system has a data acquisition and processing module that is used to collect data for model runs (e.g., wind, river flow, precipitation). Depending on the domain, forecast runs can take ~1-18 hours to complete on a single CPU (8-core) system (1-2 hours for a 2D setup and up to 18 hours for a 3D setup), with 4 forecasts generated per day. All data are archived and catalogued, and model forecast skill is continuously evaluated. In addition to the baseline forecasts, additional forecasts are being performed using various options for wind forcing (GFS, GFDL, WRF, and parametric hurricane models), model configurations (2D/3D), and open boundary conditions obtained by coupling with large-scale models (ROMS, NCOM, HYCOM), as well as by incorporating real-time and forecast river flow and precipitation data, to better understand how to improve model skill. In addition, new forecast products (e.g., more informative inundation maps) are being developed for targeted stakeholders. To support modern data standards, CH3D-SSMS results are available online via a THREDDS server in CF-compliant NetCDF format as well as in other stakeholder-friendly (e.g., GIS) formats. The SECOORA website provides visualization of the model via the GODIVA-THREDDS interface.
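Serving the results as CF-compliant NetCDF over THREDDS means standard clients can read them directly over OPeNDAP. The sketch below shows the general pattern with the netCDF4 library; the dataset URL and the variable name "time" are placeholders, not the actual SECOORA endpoint.

```python
# Sketch of consuming a CF-compliant forecast product over OPeNDAP with the
# netCDF4 library. The dataset URL is a placeholder, not the actual SECOORA
# THREDDS endpoint.
from netCDF4 import Dataset, num2date

url = "https://example.org/thredds/dodsC/ch3d_ssms/forecast.nc"  # hypothetical
with Dataset(url) as nc:
    # CF attributes describe what each variable means and how it is georeferenced.
    for name, var in nc.variables.items():
        print(name, getattr(var, "standard_name", "-"), getattr(var, "units", "-"))

    # Typical CF time handling: decode the time coordinate using its units attribute.
    time = nc.variables["time"]          # assumed coordinate variable name
    print("first forecast step:", num2date(time[0], time.units))
```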
Using Open and Interoperable Ways to Publish and Access LANCE AIRS Near-Real Time Data
NASA Technical Reports Server (NTRS)
Zhao, Peisheng; Lynnes, Christopher; Vollmer, Bruce; Savtchenko, Andrey; Theobald, Michael; Yang, Wenli
2011-01-01
The Atmospheric Infrared Sounder (AIRS) Near-Real Time (NRT) data from the Land Atmosphere Near real-time Capability for EOS (LANCE) element at the Goddard Earth Sciences Data and Information Services Center (GES DISC) provides information on the global and regional atmospheric state, with very low temporal latency, to support climate research and improve weather forecasting. An open and interoperable platform is useful to facilitate access to, and integration of, LANCE AIRS NRT data. As Web services technology has matured in recent years, a new scalable Service-Oriented Architecture (SOA) is emerging as the basic platform for distributed computing and large networks of interoperable applications. Following the provide-register-discover-consume SOA paradigm, this presentation discusses how to use open-source geospatial software components to build Web services for publishing and accessing AIRS NRT data, explore the metadata relevant to registering and discovering data and services in the catalogue systems, and implement a Web portal to facilitate users' consumption of the data and services.
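The "consume" step of the provide-register-discover-consume pattern can be sketched with OWSLib against an OGC Web Map Service. The service URL and layer name below are placeholders rather than the actual GES DISC endpoints.

```python
# Illustrative "consume" step of the provide-register-discover-consume pattern:
# request a map of an AIRS NRT parameter from an OGC WMS endpoint with OWSLib.
# The service URL and layer name are placeholders.
from owslib.wms import WebMapService

wms = WebMapService("https://example.org/wms", version="1.1.1")  # hypothetical endpoint
print(list(wms.contents))  # discover advertised layers

img = wms.getmap(layers=["AIRS_Surface_Air_Temperature"],    # assumed layer name
                 srs="EPSG:4326",
                 bbox=(-180, -90, 180, 90),
                 size=(720, 360),
                 format="image/png",
                 transparent=True)
with open("airs_nrt.png", "wb") as f:
    f.write(img.read())
```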
75 FR 4689 - Electronic Tariff Filings
Federal Register 2010, 2011, 2012, 2013, 2014
2010-01-29
... collaborative process relies upon the use of metadata (or information) about the tariff filing, including such... code.\\5\\ Because the Commission is using the electronic metadata to establish statutory action dates... code, as well as accurately providing any other metadata. 6. Similarly, the Commission will be using...
The center for expanded data annotation and retrieval
Bean, Carol A; Cheung, Kei-Hoi; Dumontier, Michel; Durante, Kim A; Gevaert, Olivier; Gonzalez-Beltran, Alejandra; Khatri, Purvesh; Kleinstein, Steven H; O’Connor, Martin J; Pouliot, Yannick; Rocca-Serra, Philippe; Sansone, Susanna-Assunta; Wiser, Jeffrey A
2015-01-01
The Center for Expanded Data Annotation and Retrieval is studying the creation of comprehensive and expressive metadata for biomedical datasets to facilitate data discovery, data interpretation, and data reuse. We take advantage of emerging community-based standard templates for describing different kinds of biomedical datasets, and we investigate the use of computational techniques to help investigators to assemble templates and to fill in their values. We are creating a repository of metadata from which we plan to identify metadata patterns that will drive predictive data entry when filling in metadata templates. The metadata repository not only will capture annotations specified when experimental datasets are initially created, but also will incorporate links to the published literature, including secondary analyses and possible refinements or retractions of experimental interpretations. By working initially with the Human Immunology Project Consortium and the developers of the ImmPort data repository, we are developing and evaluating an end-to-end solution to the problems of metadata authoring and management that will generalize to other data-management environments. PMID:26112029
Park, Yu Rang; Yoon, Young Jo; Kim, Hye Hyeon; Kim, Ju Han
2013-01-01
Achieving semantic interoperability is critical for biomedical data sharing between individuals, organizations and systems. The ISO/IEC 11179 MetaData Registry (MDR) standard has been recognized as one of the solutions for this purpose. The standard model, however, is limited: concepts that consist of two or more values, such as blood pressure with its systolic and diastolic components, cannot be represented. We addressed the structural limitations of ISO/IEC 11179 with an integrated metadata object model in our previous research. In the present study, we introduce semantic extensions for the model by defining three new types of semantic relationships: dependency, composite and variable relationships. To evaluate our extensions in a real-world setting, we measured the efficiency of metadata reduction obtained by mapping new metadata to existing ones. We extracted metadata from the College of American Pathologists Cancer Protocols and then evaluated our extensions. With no semantic loss, one third of the extracted metadata could be successfully eliminated, suggesting a better strategy for implementing clinical MDRs with improved efficiency and utility.
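A toy illustration of the composite relationship (not the authors' actual model) is a parent data element whose meaning is carried jointly by two component elements, e.g. blood pressure composed of systolic and diastolic values.

```python
# Toy illustration (not the authors' model) of a "composite" semantic relationship:
# a parent data element whose meaning is carried jointly by component elements,
# e.g. blood pressure composed of systolic and diastolic values.
from dataclasses import dataclass, field
from typing import List

@dataclass
class DataElement:
    name: str
    unit: str = ""
    components: List["DataElement"] = field(default_factory=list)

    def is_composite(self) -> bool:
        return bool(self.components)

systolic = DataElement("systolic blood pressure", unit="mmHg")
diastolic = DataElement("diastolic blood pressure", unit="mmHg")
blood_pressure = DataElement("blood pressure", components=[systolic, diastolic])

print(blood_pressure.is_composite())                     # True
print([c.name for c in blood_pressure.components])
```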
Bruland, Philipp; Doods, Justin; Storck, Michael; Dugas, Martin
2017-01-01
Data dictionaries provide structural meta-information about data definitions in health information technology (HIT) systems. In this regard, reusing healthcare data for secondary purposes offers several advantages (e.g., reduced documentation time or increased data quality). Prerequisites for data reuse are data quality, availability, and identical meaning. In diverse projects, research data warehouses serve as core components between heterogeneous clinical databases and various research applications. Given the complexity (high number of data elements) and dynamics (regular updates) of electronic health record (EHR) data structures, we propose a clinical metadata warehouse (CMDW) based on a metadata registry standard. Metadata of two large hospitals were automatically inserted into two CMDWs containing 16,230 forms and 310,519 data elements. Automatic updates of metadata are possible, as are semantic annotations. A CMDW allows metadata discovery, data quality assessment and similarity analyses. Common data models for distributed research networks can be established based on similarity analyses.
Separation of metadata and pixel data to speed DICOM tag morphing.
Ismail, Mahmoud; Philbin, James
2013-01-01
The DICOM information model combines pixel data and metadata in a single DICOM object. It is not possible to access the metadata separately from the pixel data. There are use cases where only the metadata is accessed. The current DICOM object format increases the running time of those use cases. Tag morphing is one such use case. Tag morphing includes deletion, insertion or manipulation of one or more of the metadata attributes. It is typically used for order reconciliation on study acquisition, or to localize the issuer of patient ID (IPID) and the patient ID attributes when data from one domain are transferred to a different domain. In this work, we propose using Multi-Series DICOM (MSD) objects, which separate metadata from pixel data and remove duplicate attributes, to reduce the time required for tag morphing. The time required to update a set of study attributes in each format is compared. The results show that the MSD format significantly reduces the time required for tag morphing.
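The MSD format itself is not shown here, but the motivation for separating metadata from pixel data can be illustrated with pydicom, which can stop parsing before the (large) pixel data; this is effectively what metadata-only use cases such as tag morphing need. The file path is a placeholder.

```python
# Not the MSD format itself, but a quick way to see why separating metadata from
# pixel data pays off: pydicom can stop parsing before the (large) pixel data,
# which is what metadata-only use cases such as tag morphing effectively need.
import time
import pydicom

path = "study_image.dcm"  # placeholder path

t0 = time.perf_counter()
full = pydicom.dcmread(path)                           # metadata + pixel data
t1 = time.perf_counter()
meta = pydicom.dcmread(path, stop_before_pixels=True)  # metadata only
t2 = time.perf_counter()

print(f"full read: {t1 - t0:.4f} s, metadata-only read: {t2 - t1:.4f} s")
meta.PatientID = "ANON-0001"   # a simple tag-morphing style edit on metadata alone
```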
Do Community Recommendations Improve Metadata?
NASA Astrophysics Data System (ADS)
Gordon, S.; Habermann, T.; Jones, M. B.; Leinfelder, B.; Mecum, B.; Powers, L. A.; Slaughter, P.
2016-12-01
Complete documentation of scientific data is the surest way to facilitate discovery and reuse. What is complete metadata? There are many metadata recommendations from communities like the OGC, FGDC, NASA, and LTER that can provide data documentation guidance for discovery, access, use and understanding. Often, the recommendations that communities develop are for a particular metadata dialect. Two examples of this are the LTER Completeness recommendation for EML and the FGDC Data Discovery recommendation for CSDGM. Can community adoption of a recommendation ensure that what is included in the metadata is understandable to the scientific community and beyond? By applying quantitative analysis to different LTER and USGS metadata collections in DataONE and ScienceBase, we show that community recommendations can improve the completeness of collections over time. Additionally, by comparing communities in DataONE that use the EML and CSDGM dialects but have not adopted the recommendations with the communities that have, the positive effects of recommendation adoption on documentation completeness can be measured.
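A completeness analysis of this kind can be reduced to a simple score per record against a recommendation's field list. The field names below are invented for illustration; the actual LTER and FGDC recommendations specify their own elements.

```python
# Simplified sketch of a completeness check: score each metadata record against
# the fields a community recommendation asks for. The field list is invented
# for illustration, not the actual LTER or FGDC recommendation.
RECOMMENDED_FIELDS = ["title", "abstract", "keywords", "bounding_box",
                      "temporal_extent", "contact", "license"]

def completeness(record: dict) -> float:
    present = sum(1 for f in RECOMMENDED_FIELDS if record.get(f))
    return present / len(RECOMMENDED_FIELDS)

collection = [
    {"title": "Stream chemistry 2001-2010", "abstract": "...", "keywords": ["nitrate"]},
    {"title": "Soil cores", "abstract": "...", "keywords": ["carbon"],
     "bounding_box": [-112.0, 40.0, -111.0, 41.0], "contact": "data manager"},
]
scores = [completeness(r) for r in collection]
print(f"mean completeness: {sum(scores) / len(scores):.2f}")
```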
Metadata Sets for e-Government Resources: The Extended e-Government Metadata Schema (eGMS+)
NASA Astrophysics Data System (ADS)
Charalabidis, Yannis; Lampathaki, Fenareti; Askounis, Dimitris
In the dawn of the Semantic Web era, metadata appear as a key enabler that assists management of the e-Government resources related to the provision of personalized, efficient and proactive services oriented towards the real citizens’ needs. As different authorities typically use different terms to describe their resources and publish them in various e-Government Registries that may enhance the access to and delivery of governmental knowledge, but also need to communicate seamlessly at a national and pan-European level, the need for a unified e-Government metadata standard emerges. This paper presents the creation of an ontology-based extended metadata set for e-Government Resources that embraces services, documents, XML Schemas, code lists, public bodies and information systems. Such a metadata set formalizes the exchange of information between portals and registries and assists the service transformation and simplification efforts, while it can be further taken into consideration when applying Web 2.0 techniques in e-Government.
A Generic Metadata Editor Supporting System Using Drupal CMS
NASA Astrophysics Data System (ADS)
Pan, J.; Banks, N. G.; Leggott, M.
2011-12-01
Metadata handling is a key factor in preserving and reusing scientific data. In recent years, standardized structural metadata has become widely used in geoscience communities. However, there exist many different standards in the geosciences, such as the current version of the Federal Geographic Data Committee's Content Standard for Digital Geospatial Metadata (FGDC CSDGM), the Ecological Metadata Language (EML), the Geography Markup Language (GML), and the emerging ISO 19115 and related standards. In addition, there are many different subsets within the geoscience subdomain, such as the Biological Data Profile of the FGDC CSDGM, or profiles for geopolitical regions, such as the European Profile or the North American Profile of the ISO standards. It is therefore desirable to have a software foundation to support metadata creation and editing for multiple standards and profiles without reinventing the wheel. We have developed a generic, flexible software system to do just that: to facilitate support for multiple metadata standards and profiles. The software consists of a set of modules for the Drupal Content Management System (CMS), with minimal dependencies on other Drupal modules. There are two steps in using the system's metadata functions. First, an administrator uses the system to design a user form, based on an XML schema and its instances. The form definition is named and stored in the Drupal database as XML blob content. Second, users in an editor role can then use the persisted XML definition to render an actual metadata entry form for creating or editing a metadata record. Behind the scenes, the form definition XML is transformed into a PHP array, which is then rendered via the Drupal Form API. When the form is submitted, the posted values are used to modify a metadata record. Drupal hooks can be used to perform custom processing on the metadata record before and after submission. It is trivial to store the metadata record as an actual XML file or in a storage/archive system. We are working on adding many features to help editor users, such as auto-completion, pre-population of forms, partial saving, and automatic schema validation. In this presentation we will demonstrate a few sample editors, including an FGDC editor and a bare-bones editor for ISO 19115/19139. We will also demonstrate the use of templates during the definition phase, with support for export and import functions. Form pre-population and input validation will also be covered. These modules are available as open-source software from the Islandora software foundation, as a component of a larger Drupal-based data archive system. They can easily be installed as a stand-alone system or plugged into other existing metadata platforms.
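The modules described are PHP/Drupal components, so the sketch below is only a language-neutral illustration of the general idea: a persisted XML form definition is parsed into a simple field list that a form API could render. The element and attribute names are invented.

```python
# Language-neutral sketch of the schema-driven form idea: a persisted XML form
# definition is parsed into a field list (a stand-in for the PHP array handed to
# Drupal's Form API). Element and attribute names are invented for illustration.
import xml.etree.ElementTree as ET

FORM_XML = """
<form name="fgdc-basic">
  <field name="title"    label="Title"    type="textfield" required="true"/>
  <field name="abstract" label="Abstract" type="textarea"  required="true"/>
  <field name="purpose"  label="Purpose"  type="textarea"  required="false"/>
</form>
"""

def form_definition_to_fields(xml_text):
    """Turn the stored XML definition into a list of renderable field specs."""
    root = ET.fromstring(xml_text)
    return [{"name": f.get("name"),
             "label": f.get("label"),
             "type": f.get("type"),
             "required": f.get("required") == "true"} for f in root.findall("field")]

for spec in form_definition_to_fields(FORM_XML):
    print(spec)
```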
A metadata reporting framework (FRAMES) for synthesis of ecohydrological observations
Christianson, Danielle S.; Varadharajan, Charuleka; Christoffersen, Bradley; ...
2017-06-20
Metadata describe the ancillary information needed for data interpretation, comparison across heterogeneous datasets, and quality control and quality assessment (QA/QC). Metadata enable the synthesis of diverse ecohydrological and biogeochemical observations, an essential step in advancing a predictive understanding of earth systems. Environmental observations can be taken across a wide range of spatiotemporal scales in a variety of measurement settings and approaches, and saved in multiple formats. Thus, well-organized, consistent metadata are required to produce usable data products from diverse observations collected in disparate field sites. However, existing metadata reporting protocols do not support the complex data synthesis needs of interdisciplinary earth system research. We developed a metadata reporting framework (FRAMES) to enable predictive understanding of carbon cycling in tropical forests under global change. FRAMES adheres to best practices for data and metadata organization, enabling consistent data reporting and thus compatibility with a variety of standardized data protocols. We used an iterative scientist-centered design process to develop FRAMES. The resulting modular organization streamlines metadata reporting and can be expanded to incorporate additional data types. The flexible data reporting format incorporates existing field practices to maximize data-entry efficiency. With FRAMES's multi-scale measurement position hierarchy, data can be reported at observed spatial resolutions and then easily aggregated and linked across measurement types to support model-data integration. FRAMES is in early use by both data providers and users. Here we describe FRAMES, identify lessons learned, and discuss areas of future development.
Handling Metadata in a Neurophysiology Laboratory
Zehl, Lyuba; Jaillet, Florent; Stoewer, Adrian; Grewe, Jan; Sobolev, Andrey; Wachtler, Thomas; Brochier, Thomas G.; Riehle, Alexa; Denker, Michael; Grün, Sonja
2016-01-01
To date, non-reproducibility of neurophysiological research is a matter of intense discussion in the scientific community. A crucial component for enhancing reproducibility is to comprehensively collect and store metadata, that is, all information about the experiment, the data, and the applied preprocessing steps on the data, such that they can be accessed and shared in a consistent and simple manner. However, the complexity of experiments, the highly specialized analysis workflows and a lack of knowledge on how to make use of supporting software tools often make it too burdensome for researchers to perform such detailed documentation. For this reason, the collected metadata are often incomplete, incomprehensible for outsiders or ambiguous. Based on our research experience in dealing with diverse datasets, we here provide conceptual and technical guidance to overcome the challenges associated with the collection, organization, and storage of metadata in a neurophysiology laboratory. Through the concrete example of managing the metadata of a complex experiment that yields multi-channel recordings from monkeys performing a behavioral motor task, we practically demonstrate the implementation of these approaches and solutions with the intention that they may be generalized to other projects. Moreover, we detail five use cases that demonstrate the resulting benefits of constructing a well-organized metadata collection when processing or analyzing the recorded data, in particular when these are shared between laboratories in a modern scientific collaboration. Finally, we suggest an adaptable workflow to accumulate, structure and store metadata from different sources using, by way of example, the odML metadata framework. PMID:27486397
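The organizing principle behind odML-style metadata is a hierarchy of sections containing key-value properties and sub-sections. The sketch below is a minimal stand-in for that structure, not the odML library API.

```python
# A minimal stand-in for the kind of hierarchical metadata structure the odML
# framework provides (sections containing properties and sub-sections). This is
# not the odML API, only an illustration of the organizing principle.
from dataclasses import dataclass, field
from typing import Any, Dict, List

@dataclass
class Section:
    name: str
    properties: Dict[str, Any] = field(default_factory=dict)
    subsections: List["Section"] = field(default_factory=list)

recording = Section("Recording", {"date": "2016-01-01", "setup": "128-channel array"})
subject = Section("Subject", {"species": "Macaca mulatta", "task": "reach-to-grasp"})
recording.subsections.append(subject)

def dump(section: Section, depth: int = 0) -> None:
    """Print the section tree with indentation reflecting the hierarchy."""
    print("  " * depth + section.name, section.properties)
    for sub in section.subsections:
        dump(sub, depth + 1)

dump(recording)
```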
Streamlining geospatial metadata in the Semantic Web
NASA Astrophysics Data System (ADS)
Fugazza, Cristiano; Pepe, Monica; Oggioni, Alessandro; Tagliolato, Paolo; Carrara, Paola
2016-04-01
In the geospatial realm, data annotation and discovery rely on a number of ad hoc formats and protocols. These have been created to enable domain-specific use cases for which generalized search is not feasible. Metadata are at the heart of the discovery process; nevertheless, they are often neglected or encoded in formats that either are not aimed at efficient retrieval of resources or are plainly outdated. In particular, the quantum leap represented by the Linked Open Data (LOD) movement has not so far induced a consistent, interlinked baseline in the geospatial domain. In a nutshell, datasets, the scientific literature related to them, and ultimately the researchers behind these products are only loosely connected; the corresponding metadata are intelligible only to humans and duplicated across different systems, seldom consistently. Instead, our workflow for metadata management envisages i) editing via customizable web-based forms, ii) encoding of records in any XML application profile, iii) translation into RDF (involving the semantic lift of metadata records), and finally iv) storage of the metadata as RDF and back-translation into the original XML format with added semantics-aware features. Phase iii) hinges on relating resource metadata to RDF data structures that represent keywords from code lists and controlled vocabularies, toponyms, researchers, institutes, and virtually any description one can retrieve (or directly publish) in the LOD Cloud. In the context of a distributed Spatial Data Infrastructure (SDI) built on free and open-source software, we detail phases iii) and iv) of our workflow for the semantics-aware management of geospatial metadata.
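The "semantic lift" step (phase iii) can be sketched with rdflib: a record's free-text keyword and creator become links to LOD resources. The vocabulary choices and URIs below are illustrative, not the authors' actual mappings.

```python
# Sketch of the "semantic lift" step with rdflib: a record's free-text keyword and
# creator become links to LOD resources. Vocabulary choices and URIs are illustrative.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS, RDF

DCAT = Namespace("http://www.w3.org/ns/dcat#")

g = Graph()
dataset = URIRef("https://example.org/dataset/lake-monitoring-2015")  # hypothetical record
g.add((dataset, RDF.type, DCAT.Dataset))
g.add((dataset, DCTERMS.title, Literal("Lake monitoring buoy data 2015")))
# Keyword linked to a controlled-vocabulary concept instead of a bare string:
g.add((dataset, DCAT.keyword, Literal("limnology")))
g.add((dataset, DCTERMS.subject, URIRef("http://vocabs.example.org/concept/limnology")))
# Creator linked to a researcher identifier (e.g. an ORCID-style URI):
g.add((dataset, DCTERMS.creator, URIRef("https://orcid.org/0000-0000-0000-0000")))

print(g.serialize(format="turtle"))
```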
Multi-facetted Metadata - Describing datasets with different metadata schemas at the same time
NASA Astrophysics Data System (ADS)
Ulbricht, Damian; Klump, Jens; Bertelmann, Roland
2013-04-01
Inspired by the wish to re-use research data, much work is being done to bring data systems in the earth sciences together. Discovery metadata is disseminated to data portals to allow the building of customized indexes of catalogued dataset items. Data that were once acquired in the context of a scientific project are open for reappraisal and can now be used by scientists who were not part of the original research team. To make data re-use easier, measurement methods and measurement parameters must be documented in an application metadata schema and described in a written publication. Linking datasets to publications - as DataCite [1] does - again requires a specific metadata schema, and every new use context of the measured data may require yet another schema that shares only a subset of information with the metadata already present. To cope with the problem of metadata schema diversity in our common data repository at GFZ Potsdam, we established a solution to store file-based research data and describe these with an arbitrary number of metadata schemas. The core component of the data repository is an eSciDoc infrastructure that provides versioned container objects, called eSciDoc [2] "items". The eSciDoc content model allows assigning files to "items" and adding any number of metadata records to these "items". The eSciDoc items can be submitted, revised, and finally published, which makes the data and metadata available through the internet worldwide. GFZ Potsdam uses eSciDoc to support its scientific publishing workflow, including mechanisms for data review in peer review processes by providing temporary web links for external reviewers who do not have credentials to access the data. Based on the eSciDoc API, panMetaDocs [3] provides a web portal for data management in research projects. PanMetaDocs, which is based on panMetaWorks [4], is a PHP-based web application that allows data to be described with any XML-based schema. It uses the eSciDoc infrastructure's REST interface to store versioned dataset files and metadata in an XML format. The software is able to administer more than one eSciDoc metadata record per item and thus allows the description of a dataset according to its context. The metadata fields can be filled with static or dynamic content to reduce the number of fields that require manual entries to a minimum and, at the same time, make use of contextual information available in a project setting. Access rights can be adjusted to set the visibility of datasets to the required degree of openness. Metadata from separate instances of panMetaDocs can be syndicated to portals through RSS and OAI-PMH interfaces. The application architecture presented here allows file-based datasets to be stored and described with any number of metadata schemas, depending on the intended use case. Data and metadata are stored in the same entity (eSciDoc items) and are managed by a software tool through the eSciDoc REST interface - in this case the application is panMetaDocs. Other software may re-use the produced items and modify the appropriate metadata records by accessing the web API of the eSciDoc data infrastructure. For presentation of the datasets in a web browser we are not bound to panMetaDocs; this is done by stylesheet transformation of the eSciDoc item. [1] http://www.datacite.org [2] http://www.escidoc.org , eSciDoc, FIZ Karlsruhe, Germany [3] http://panmetadocs.sf.net , panMetaDocs, GFZ Potsdam, Germany [4] http://metaworks.pangaea.de , panMetaWorks, Dr. R. Huber, MARUM, Univ. Bremen, Germany
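The "one item, many metadata records" idea can be sketched as a pair of REST calls: create a dataset item, then attach differently-schemed metadata records to it. The endpoints and payloads below are invented for illustration and do not reflect the actual eSciDoc or panMetaDocs APIs.

```python
# Conceptual sketch of the "one item, many metadata records" idea: a dataset item
# is created and two differently-schemed metadata records are attached to it over
# a REST interface. Endpoints and payloads are invented and do not reflect the
# actual eSciDoc or panMetaDocs APIs.
import requests

BASE = "https://repository.example.org/api"   # hypothetical service

item = requests.post(f"{BASE}/items", json={"files": ["survey_2012.csv"]}).json()
item_id = item["id"]

datacite_record = "<resource><titles><title>Survey 2012</title></titles></resource>"
app_schema_record = "<experiment><instrument>magnetometer</instrument></experiment>"

# Attach one metadata record per schema to the same item.
for schema, xml in [("datacite", datacite_record), ("project-app-schema", app_schema_record)]:
    requests.post(f"{BASE}/items/{item_id}/metadata",
                  params={"schema": schema},
                  data=xml,
                  headers={"Content-Type": "application/xml"})
```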
Architecture of the local spatial data infrastructure for regional climate change research
NASA Astrophysics Data System (ADS)
Titov, Alexander; Gordov, Evgeny
2013-04-01
Georeferenced datasets (meteorological databases, modeling and reanalysis results, etc.) are actively used in the modeling and analysis of climate change at various spatial and temporal scales. Due to the inherent heterogeneity of environmental datasets, as well as their size, which may reach tens of terabytes for a single dataset, studies of climate and environmental change require special software support based on the SDI approach. A dedicated architecture of a local spatial data infrastructure aimed at regional climate change analysis using modern web mapping technologies is presented. A geoportal is a key element of any SDI, allowing the searching of geoinformation resources (datasets and services) using metadata catalogs, the production of geospatial data selections by their parameters (data access functionality), and the management of services and applications for cartographic visualization. It should be noted that, for objective reasons such as large dataset volumes, the complexity of the data models used, and syntactic and semantic differences between datasets, the development of environmental geodata access, processing and visualization services is quite a complex task. These circumstances were taken into account while developing the architecture of the local spatial data infrastructure as a universal framework providing geodata services. The architecture presented includes: 1. A model for storing big sets of regional georeferenced data that is effective in terms of search, access, retrieval and subsequent statistical processing, allowing in particular the storage of frequently used values (such as monthly and annual climate change indices), thus providing different temporal views of the datasets; 2. The general architecture of the corresponding software components handling geospatial datasets within the storage model; 3. A metadata catalog describing in detail, using the ISO 19115 and CF-convention standards, the datasets used in climate research, as a basic element of the spatial data infrastructure, published according to the OGC CSW (Catalogue Service for the Web) specification; 4. Computational and mapping web services to work with geospatial datasets based on OWS (OGC Web Services) standards: WMS, WFS, WPS; 5. A geoportal as a key element of the thematic regional spatial data infrastructure, also providing a software framework for the development of dedicated web applications. To implement the web mapping services, GeoServer is used, since it provides a WPS implementation as a separate software module. To provide geospatial metadata services, the GeoNetwork opensource (http://geonetwork-opensource.org) product is planned to be used, as it supports the ISO 19115/ISO 19119/ISO 19139 metadata standards as well as the ISO CSW 2.0 profile for both client and server. To implement thematic applications based on geospatial web services within the framework of the local SDI geoportal, the following open-source software has been selected: 1. The OpenLayers JavaScript library, providing basic web mapping functionality for a thin client such as a web browser; 2. The GeoExt/ExtJS JavaScript libraries for building client-side web applications working with geodata services. The web interface developed will be similar to the interfaces of popular desktop GIS applications such as uDig and QuantumGIS. The work is partially supported by RF Ministry of Education and Science grant 8345, SB RAS Program VIII.80.2.1 and IP 131.
78 FR 67352 - Combined Notice of Filings #1
Federal Register 2010, 2011, 2012, 2013, 2014
2013-11-12
...-75-001. Applicants: Entergy Arkansas, Inc. Description: Metadata Correction--Sec. 1.01 Amendment to.... Description: Metadata Correction--Section 1.01 Amendment to be effective 12/31/9998. Filed Date: 10/25/13...: Entergy Louisiana, LLC. Description: Metadata Correction--Section 1.01 Amendment to be effective 12/31...
ERIC Educational Resources Information Center
White, Hollie C.
2012-01-01
Background: According to Salo (2010), the metadata entered into repositories are "disorganized" and the metadata schemes underlying repositories are "arcane". This creates a challenging repository environment with regard to personal information management (PIM) and knowledge organization systems (KOSs). This dissertation research is…
Semantic Networks and Social Networks
ERIC Educational Resources Information Center
Downes, Stephen
2005-01-01
Purpose: To illustrate the need for social network metadata within semantic metadata. Design/methodology/approach: Surveys properties of social networks and the semantic web, suggests that social network analysis applies to semantic content, argues that semantic content is more searchable if social network metadata is merged with semantic web…
International Metadata Initiatives: Lessons in Bibliographic Control.
ERIC Educational Resources Information Center
Caplan, Priscilla
This paper looks at a subset of metadata schemes, including the Text Encoding Initiative (TEI) header, the Encoded Archival Description (EAD), the Dublin Core Metadata Element Set (DCMES), and the Visual Resources Association (VRA) Core Categories for visual resources. It examines why they developed as they did, their major points of difference from…
36 CFR 1235.48 - What documentation must agencies transfer with electronic records?
Code of Federal Regulations, 2010 CFR
2010-07-01
... digital geospatial data files can include metadata that conforms to the Federal Geographic Data Committee's Content Standards for Digital Geospatial Metadata, as specified in Executive Order 12906 of April... number (301) 837-2903 for digital photographs and metadata, or the National Archives and Records...
36 CFR 1235.48 - What documentation must agencies transfer with electronic records?
Code of Federal Regulations, 2012 CFR
2012-07-01
... digital geospatial data files can include metadata that conforms to the Federal Geographic Data Committee's Content Standards for Digital Geospatial Metadata, as specified in Executive Order 12906 of April... number (301) 837-2903 for digital photographs and metadata, or the National Archives and Records...
Leveraging Metadata to Create Better Web Services
ERIC Educational Resources Information Center
Mitchell, Erik
2012-01-01
Libraries have been increasingly concerned with data creation, management, and publication. This increase is partly driven by shifting metadata standards in libraries and partly by the growth of data and metadata repositories being managed by libraries. In order to manage these data sets, libraries are looking for new preservation and discovery…
36 CFR § 1235.48 - What documentation must agencies transfer with electronic records?
Code of Federal Regulations, 2013 CFR
2013-07-01
... digital geospatial data files can include metadata that conforms to the Federal Geographic Data Committee's Content Standards for Digital Geospatial Metadata, as specified in Executive Order 12906 of April... number (301) 837-2903 for digital photographs and metadata, or the National Archives and Records...
36 CFR 1235.48 - What documentation must agencies transfer with electronic records?
Code of Federal Regulations, 2011 CFR
2011-07-01
... digital geospatial data files can include metadata that conforms to the Federal Geographic Data Committee's Content Standards for Digital Geospatial Metadata, as specified in Executive Order 12906 of April... number (301) 837-2903 for digital photographs and metadata, or the National Archives and Records...
36 CFR 1235.48 - What documentation must agencies transfer with electronic records?
Code of Federal Regulations, 2014 CFR
2014-07-01
... digital geospatial data files can include metadata that conforms to the Federal Geographic Data Committee's Content Standards for Digital Geospatial Metadata, as specified in Executive Order 12906 of April... number (301) 837-2903 for digital photographs and metadata, or the National Archives and Records...
Shared Geospatial Metadata Repository for Ontario University Libraries: Collaborative Approaches
ERIC Educational Resources Information Center
Forward, Erin; Leahey, Amber; Trimble, Leanne
2015-01-01
Successfully providing access to special collections of digital geospatial data in academic libraries relies upon complete and accurate metadata. Creating and maintaining metadata using specialized standards is a formidable challenge for libraries. The Ontario Council of University Libraries' Scholars GeoPortal project, which created a shared…
106-17 Telemetry Standards Metadata Configuration Chapter 23
2017-07-01
Chapter 23 of the 106-17 Telemetry Standards covers metadata configuration and the Metadata Description Language (MDL). Acronyms: HTML, Hypertext Markup Language; MDL, Metadata Description Language; PCM, pulse code modulation; TMATS, Telemetry Attributes Transfer Standard; W3C, World Wide Web Consortium; XML, eXtensible Markup Language; XSD, XML schema document.
Digital Initiatives and Metadata Use in Thailand
ERIC Educational Resources Information Center
SuKantarat, Wichada
2008-01-01
Purpose: This paper aims to provide information about various digital initiatives in libraries in Thailand and especially use of Dublin Core metadata in cataloguing digitized objects in academic and government digital databases. Design/methodology/approach: The author began researching metadata use in Thailand in 2003 and 2004 while on sabbatical…
Normalized Metadata Generation for Human Retrieval Using Multiple Video Surveillance Cameras.
Jung, Jaehoon; Yoon, Inhye; Lee, Seungwon; Paik, Joonki
2016-06-24
Since it is impossible for surveillance personnel to keep monitoring videos from a multiple camera-based surveillance system, an efficient technique is needed to help recognize important situations by retrieving the metadata of an object-of-interest. In a multiple camera-based surveillance system, an object detected in a camera has a different shape in another camera, which is a critical issue of wide-range, real-time surveillance systems. In order to address the problem, this paper presents an object retrieval method by extracting the normalized metadata of an object-of-interest from multiple, heterogeneous cameras. The proposed metadata generation algorithm consists of three steps: (i) generation of a three-dimensional (3D) human model; (ii) human object-based automatic scene calibration; and (iii) metadata generation. More specifically, an appropriately-generated 3D human model provides the foot-to-head direction information that is used as the input of the automatic calibration of each camera. The normalized object information is used to retrieve an object-of-interest in a wide-range, multiple-camera surveillance system in the form of metadata. Experimental results show that the 3D human model matches the ground truth, and automatic calibration-based normalization of metadata enables a successful retrieval and tracking of a human object in the multiple-camera video surveillance system.
McMahon, Christiana; Denaxas, Spiros
2017-11-06
Informed consent is an important feature of longitudinal research studies as it enables the linking of the baseline participant information with administrative data. The lack of standardized models to capture consent elements can lead to substantial challenges. A structured approach to capturing consent-related metadata can address these. The objectives of this work were to: a) explore the state of the art for recording consent; b) identify key elements of consent required for record linkage; and c) create and evaluate a novel metadata management model to capture consent-related metadata. The main methodological components of our work were: a) a systematic literature review and qualitative analysis of consent forms; b) the development and evaluation of a novel metadata model. We qualitatively analyzed 61 manuscripts and 30 consent forms. We extracted data elements related to obtaining consent for linkage. We created a novel metadata management model for consent and evaluated it by comparison with existing standards and by iteratively applying it to case studies. The developed model can facilitate the standardized recording of consent for linkage in longitudinal research studies and enable the linkage of external participant data. Furthermore, it can provide a structured way of recording consent-related metadata and facilitate the harmonization and streamlining of processes.
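The abstract above does not reproduce the model's actual elements, so the following is only a toy sketch of what machine-readable consent metadata for record linkage might look like; every field name is a hypothetical placeholder, not part of the published model.

from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

@dataclass
class ConsentRecord:
    participant_id: str                      # study-internal identifier
    consent_date: date                       # when consent was obtained
    consent_form_version: str                # version of the signed consent form
    linkage_permitted: bool                  # is linkage to administrative data allowed?
    linked_sources: List[str] = field(default_factory=list)  # e.g. hospital episode data
    withdrawal_date: Optional[date] = None   # set only if consent is later withdrawn

# Example record for a hypothetical participant
example = ConsentRecord("P-0001", date(2016, 3, 14), "v2.1", True, ["hospital_episodes"])
print(example)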
Normalized Metadata Generation for Human Retrieval Using Multiple Video Surveillance Cameras
Jung, Jaehoon; Yoon, Inhye; Lee, Seungwon; Paik, Joonki
2016-01-01
Since it is impossible for surveillance personnel to keep monitoring videos from a multiple camera-based surveillance system, an efficient technique is needed to help recognize important situations by retrieving the metadata of an object-of-interest. In a multiple camera-based surveillance system, an object detected in a camera has a different shape in another camera, which is a critical issue of wide-range, real-time surveillance systems. In order to address the problem, this paper presents an object retrieval method by extracting the normalized metadata of an object-of-interest from multiple, heterogeneous cameras. The proposed metadata generation algorithm consists of three steps: (i) generation of a three-dimensional (3D) human model; (ii) human object-based automatic scene calibration; and (iii) metadata generation. More specifically, an appropriately-generated 3D human model provides the foot-to-head direction information that is used as the input of the automatic calibration of each camera. The normalized object information is used to retrieve an object-of-interest in a wide-range, multiple-camera surveillance system in the form of metadata. Experimental results show that the 3D human model matches the ground truth, and automatic calibration-based normalization of metadata enables a successful retrieval and tracking of a human object in the multiple-camera video surveillance system. PMID:27347961
Evaluating and Improving Metadata for Data Use and Understanding
NASA Astrophysics Data System (ADS)
Habermann, T.
2013-12-01
The last several decades have seen an extraordinary increase in the number and breadth of environmental data available to the scientific community and the general public. These increases have focused the environmental data community on creating metadata for discovering data and on the creation and population of catalogs and portals for facilitating discovery. This focus is reflected in the fields required by commonly used metadata standards and has resulted in collections populated with metadata that meet, but don't go far beyond, minimal discovery requirements. Discovery is the first step towards addressing scientific questions using data. As more data are discovered and accessed, users need metadata that 1) automate use and integration of these data in tools and 2) facilitate understanding of the data when they are compared to similar datasets or as internal variations are observed. When data discovery is the primary goal, it is important to create records for as many datasets as possible. The content of these records is controlled by minimum requirements, and evaluation is generally limited to testing for required fields and counting records. As the use and understanding needs become more important, more comprehensive evaluation tools are needed. An approach is described for evaluating existing metadata in the light of these new requirements and for improving the metadata to meet them.
EUDAT B2FIND : A Cross-Discipline Metadata Service and Discovery Portal
NASA Astrophysics Data System (ADS)
Widmann, Heinrich; Thiemann, Hannes
2016-04-01
The European Data Infrastructure (EUDAT) project aims at a pan-European environment that supports a variety of research communities and individuals in managing the rising tide of scientific data with advanced data management technologies. This led to the establishment of the community-driven Collaborative Data Infrastructure that implements common data services and storage resources to tackle the basic requirements and the specific challenges of international and interdisciplinary research data management. The metadata service B2FIND plays a central role in this context by providing a simple and user-friendly discovery portal to find research data collections stored in EUDAT data centers or in other repositories. For this we store the diverse metadata collected from heterogeneous sources in a comprehensive joint metadata catalogue and make them searchable in an open data portal. The implemented metadata ingestion workflow consists of three steps. First the metadata records - provided either by various research communities or via other EUDAT services - are harvested. Afterwards the raw metadata records are converted and mapped to unified key-value dictionaries as specified by the B2FIND schema. The semantic mapping of the non-uniform, community-specific metadata to homogeneously structured datasets is the most subtle and challenging task. To assure and improve the quality of the metadata, this mapping process is accompanied by (i) iterative and intense exchange with the community representatives, (ii) usage of controlled vocabularies and community-specific ontologies, and (iii) formal and semantic validation. Finally the mapped and checked records are uploaded as datasets to the catalogue, which is based on the open source data portal software CKAN. CKAN provides a rich RESTful JSON API and uses SOLR for dataset indexing, which enables users to query and search the catalogue. The homogenization of the community-specific data models and vocabularies enables not only the uniform presentation of these datasets as tables of field-value pairs but also faceted, spatial and temporal search in the B2FIND metadata portal. Furthermore, the service provides transparent access to the scientific data objects through the references and identifiers given in the metadata. B2FIND offers support for new communities interested in publishing their data within EUDAT. We present here the functionality and the features of the B2FIND service and give an outlook on further developments, such as interfaces to external libraries and the use of Linked Data.
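The following is an illustrative sketch, not B2FIND production code, of the "map to key-value dictionaries, then upload to CKAN" step described above. The source and target field names, the catalogue URL, and the API key are placeholders; only the general shape of CKAN's JSON action API (package_create) is assumed.

import requests

def map_to_b2find(raw):
    # Map a community-specific record to a simplified, unified dictionary.
    return {
        "name": raw["identifier"].lower().replace("/", "-"),
        "title": raw.get("title", "untitled"),
        "notes": raw.get("description", ""),
        "tags": [{"name": kw} for kw in raw.get("keywords", [])],
        "extras": [{"key": "Discipline", "value": raw.get("discipline", "Unknown")}],
    }

def upload(dataset, ckan_url="https://b2find.example.org", api_key="PLACEHOLDER"):
    # CKAN exposes catalogue actions through a JSON API; package_create adds a dataset.
    response = requests.post(f"{ckan_url}/api/3/action/package_create",
                             json=dataset, headers={"Authorization": api_key})
    response.raise_for_status()
    return response.json()

record = {"identifier": "doi:10.1234/example", "title": "Example dataset",
          "description": "Harvested record", "keywords": ["ocean"], "discipline": "Oceanography"}
print(map_to_b2find(record))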
NASA Astrophysics Data System (ADS)
Klump, J. F.; Ulbricht, D.; Conze, R.
2014-12-01
The Continental Deep Drilling Programme (KTB) was a scientific drilling project from 1987 to 1995 near Windischeschenbach, Bavaria. The main super-deep borehole reached a depth of 9,101 meters into the Earth's continental crust. The project used the most current equipment for data capture and processing. After the end of the project, key data were disseminated through the web portal of the International Continental Scientific Drilling Program (ICDP). The scientific reports were published as printed volumes. As similar projects have also experienced, it becomes increasingly difficult to maintain a data portal over a long time. Changes in software and underlying hardware make a migration of the entire system inevitable. Around 2009 the data presented on the ICDP web portal were migrated to the Scientific Drilling Database (SDDB) and published through DataCite using Digital Object Identifiers (DOI) as persistent identifiers. The SDDB portal used a relational database with a complex data model to store data and metadata. A PHP-based Content Management System with custom modifications made it possible to navigate and browse datasets using the metadata and then download datasets. The data repository software eSciDoc allows storing self-contained packages consistent with the OAIS reference model. Each package consists of binary data files and XML metadata. Using a REST API, the packages can be stored in the eSciDoc repository and can be searched using the XML metadata. During the last maintenance cycle of the SDDB the data and metadata were migrated into the eSciDoc repository. Discovery metadata were generated following the GCMD-DIF, ISO 19115 and DataCite schemas. The eSciDoc repository allows an arbitrary number of XML metadata records to be stored with each data object. In addition to descriptive metadata each data object may contain pointers to related materials, such as IGSN metadata to link datasets to physical specimens, or identifiers of literature interpreting the data. Datasets are presented by XSLT-stylesheet transformation using the stored metadata. The presentation shows several migration cycles of data and metadata, which were driven by aging software systems. Currently the datasets reside as self-contained entities in a repository system that is ready for digital preservation.
Effective use of metadata in the integration and analysis of multi-dimensional optical data
NASA Astrophysics Data System (ADS)
Pastorello, G. Z.; Gamon, J. A.
2012-12-01
Data discovery and integration relies on adequate metadata. However, creating and maintaining metadata is time consuming and often poorly addressed or avoided altogether, leading to problems in later data analysis and exchange. This is particularly true for research fields in which metadata standards do not yet exist or are under development, or within smaller research groups without enough resources. Vegetation monitoring using in-situ and remote optical sensing is an example of such a domain. In this area, data are inherently multi-dimensional, with spatial, temporal and spectral dimensions usually being well characterized. Other equally important aspects, however, might be inadequately translated into metadata. Examples include equipment specifications and calibrations, field/lab notes and field/lab protocols (e.g., sampling regimen, spectral calibration, atmospheric correction, sensor view angle, illumination angle), data processing choices (e.g., methods for gap filling, filtering and aggregation of data), quality assurance, and documentation of data sources, ownership and licensing. Each of these aspects can be important as metadata for search and discovery, but they can also be used as key data fields in their own right. If each of these aspects is also understood as an "extra dimension," it is possible to take advantage of them to simplify the data acquisition, integration, analysis, visualization and exchange cycle. Simple examples include selecting data sets of interest early in the integration process (e.g., only data collected according to a specific field sampling protocol) or applying appropriate data processing operations to different parts of a data set (e.g., adaptive processing for data collected under different sky conditions). More interesting scenarios involve guided navigation and visualization of data sets based on these extra dimensions, as well as partitioning data sets to highlight relevant subsets to be made available for exchange. The DAX (Data Acquisition to eXchange) Web-based tool uses a flexible metadata representation model and takes advantage of multi-dimensional data structures to translate metadata types into data dimensions, effectively reshaping data sets according to available metadata. With that, metadata is tightly integrated into the acquisition-to-exchange cycle, allowing for more focused exploration of data sets while also increasing the value of, and incentives for, keeping good metadata. The tool is being developed and tested with optical data collected in different settings, including laboratory, field, airborne, and satellite platforms.
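As a minimal sketch of the "metadata as an extra dimension" idea described above: once attributes such as the field protocol and sky conditions are attached to each observation as columns, selecting subsets early and applying condition-dependent processing become ordinary data operations. The column names and values below are hypothetical, not taken from the DAX tool.

import pandas as pd

spectra = pd.DataFrame({
    "site": ["A", "A", "B", "B"],
    "protocol": ["transect-v1", "transect-v2", "transect-v1", "transect-v1"],
    "sky": ["clear", "overcast", "clear", "overcast"],
    "ndvi": [0.61, 0.58, 0.72, 0.69],
})

# Select only data collected according to a specific field sampling protocol ...
subset = spectra[spectra["protocol"] == "transect-v1"]

# ... and apply different (here, trivially different) processing per sky condition.
summary = subset.groupby("sky")["ndvi"].mean()
print(summary)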
NASA Astrophysics Data System (ADS)
Ten Hoeve, J. E.; Jacobson, M. Z.
2010-12-01
Satellite observational studies have found an increase in cloud fraction (CF) and cloud optical depth (COD) with increasing aerosol optical depth (AOD), followed by a decreasing CF/COD with increasing AOD at higher AODs, over the Amazon Basin. The shape of this curve is similar to that of a boomerang, and thus the effect has been dubbed the "boomerang effect". The increase in CF/COD with increasing AOD at low AODs is ascribed to the first and second indirect effects and is referred to as a microphysical effect of aerosols on clouds. The decrease in CF/COD at higher AODs is ascribed to enhanced warming of clouds due to absorbing aerosols, either as inclusions in drops or interstitially between drops. This is referred to as a radiative effect. To date, the interaction of the microphysical and radiative effects has not been simulated with a regional or global computer model. Here, we simulate the boomerang effect with the nested global-through-urban climate, air pollution, and weather forecast model GATOR-GCMOM for the Amazon biomass burning season of 2006. We also compare the model with an extensive set of data, including satellite data from MODIS, TRMM, and CALIPSO, in situ surface observations, upper-air data, and AERONET data. Biomass burning emissions are obtained from the Global Fire Emissions Database (GFEDv2), and are combined with MODIS land cover data along with biomass burning emission factors. A high-resolution domain, nested within three increasingly coarser domains, is employed over the heaviest biomass burning region within the arc of deforestation. Modeled trends in cloud properties with aerosol loading compare well with MODIS observed trends, allowing causation of these observed correlations, including of the boomerang effect, to be determined by model results. The impact of aerosols on various cloud parameters, such as cloud optical thickness, cloud fraction, cloud liquid water/ice content, and precipitation, is shown through differences between simulations that include and exclude biomass burning emissions. This study suggests by cause and effect through numerical modeling that aerosol radiative effects counteract microphysical effects at high AODs, a result previously shown by correlation alone. As such, computer models that exclude treatment of cloud radiative effects are likely to overpredict the indirect effects of aerosols on clouds and underestimate the warming due to aerosols containing black carbon.
OceanSITES format and Ocean Observatory Output harmonisation: past, present and future
NASA Astrophysics Data System (ADS)
Pagnani, Maureen; Galbraith, Nan; Diggs, Stephen; Lankhorst, Matthias; Hidas, Marton; Lampitt, Richard
2015-04-01
The Global Ocean Observing System (GOOS) initiative was launched in 1991, and was the first step in creating a global view of ocean observations. In 1999 oceanographers at the OceanObs conference envisioned a 'global system of eulerian observatories' which evolved into the OceanSITES project. OceanSITES has been generously supported by individual oceanographic institutes and agencies across the globe, as well as by the WMO-IOC Joint Technical Commission for Oceanography and Marine Meteorology (under JCOMMOPS). The project is directed by the needs of research scientists, but has a strong data management component, with an international team developing content standards, metadata specifications, and NetCDF templates for many types of in situ oceanographic data. The OceanSITES NetCDF format specification is intended as a robust data exchange and archive format specifically for time-series observatory data from the deep ocean. First released in February 2006, it has evolved to build on and extend internationally recognised standards such as the Climate and Forecast (CF) standard, BODC vocabularies, ISO formats and vocabularies, and in version 1.3, released in 2014, ACDD (Attribute Convention for Dataset Discovery). The success of the OceanSITES format has inspired other observational groups, such as autonomous vehicles and ships of opportunity, to also use the format and today it is fulfilling the original concept of providing a coherent set of data from eulerian observatories. Data in the OceanSITES format are served by two Global Data Assembly Centres (GDACs), one at Coriolis, in France, at ftp://ftp.ifremer.fr/ifremer/oceansites/ and one at the US NDBC, at ftp://data.ndbc.noaa.gov/data/oceansites/. These two centres serve over 26,800 OceanSITES format data files from 93 moorings. The use of standardised and controlled features enables the files held at the OceanSITES GDACs to be electronically discoverable and ensures the widest access to the data. The OceanSITES initiative has always been truly international, and in Europe the first project to include OceanSITES as part of its outputs was ANIMATE (2002-2005), where 3 moorings and 5 partners shared equipment, methods and analysis effort and produced their final outputs in OceanSITES format. Subsequent European projects, MERSEA (2004-2008) and EuroSITES (2008-2011), built on that early success, and the current European project FixO3 encompasses 23 moorings and 29 partners, all of whom are committed to producing data in OceanSITES format. The global OceanSITES partnership continues to grow; in 2014 the Australian Integrated Marine Observing System (IMOS) started delivering data files to the OceanSITES FTP, and India, South Korea and Japan are also active members of the OceanSITES community. As illustrated in figure 1, the OceanSITES sites cover the entire globe, and the format has now matured enough to be taken up by other user groups. GO-SHIP, a global, ship-based hydrographic program, shares technical management with OceanSITES through JCOMMOPS, and has its roots in WOCE Hydrography. This program complements OceanSITES and directly contributes to the mooring data holdings by providing repeated CTD and bottle profiles at specific locations. GO-SHIP hydrographic data add a source of time-series profiles and are provided in the OceanSITES file structure to facilitate full data interoperability.
GO-SHIP has worked closely with the OceanSITES program, and this interaction has produced an unexpected side benefit - all data in the GO-SHIP database will be offered in the robust and CF-compliant OceanSITES format beginning in 2015. The MyOcean European ocean monitoring and forecasting project has been in existence since 2009, and has successfully used the OceanSITES format as a unifying paradigm. MyOcean receives hundreds of data files daily from across Europe, and distributes the data from drifter buoys, moorings and tide gauges in OceanSITES format. These in-situ data are essential both as model verification points and for assimilation into the models. The use of the OceanSITES format now exceeds the hopes and expectations of the original OceanObs vision in 1999, and the stewardship of the format's development, extension and documentation is in the expert care of the international OceanSITES Data Management Team.
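The following is an illustrative sketch of a minimal OceanSITES-style time-series file written with the netCDF4 library. The attribute values are placeholders, and the full OceanSITES 1.3 specification requires many more global and variable attributes than are shown here.

import numpy as np
from netCDF4 import Dataset

with Dataset("OS_EXAMPLE_TS.nc", "w") as ds:
    # Global attributes declaring the conventions the file claims to follow
    ds.Conventions = "CF-1.6, OceanSITES-1.3, ACDD-1.3"
    ds.data_type = "OceanSITES time-series data"
    ds.site_code = "EXAMPLE"

    ds.createDimension("TIME", None)                 # unlimited time dimension
    time = ds.createVariable("TIME", "f8", ("TIME",))
    time.standard_name = "time"
    time.units = "days since 1950-01-01T00:00:00Z"

    temp = ds.createVariable("TEMP", "f4", ("TIME",), fill_value=np.float32(99999.0))
    temp.standard_name = "sea_water_temperature"
    temp.units = "degree_Celsius"

    time[:] = np.arange(3, dtype="f8")
    temp[:] = np.array([12.1, 12.0, 11.8], dtype="f4")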
Seeking the Path to Metadata Nirvana
NASA Astrophysics Data System (ADS)
Graybeal, J.
2008-12-01
Scientists have always found reusing other scientists' data challenging. Computers did not fundamentally change the problem, but enabled more and larger instances of it. In fact, by removing human mediation and time delays from the data sharing process, computers emphasize the contextual information that must be exchanged in order to exchange and reuse data. This requirement for contextual information has two faces: "interoperability" when talking about systems, and "the metadata problem" when talking about data. As much as any single organization, the Marine Metadata Interoperability (MMI) project has been tagged with the mission "Solve the metadata problem." Of course, if that goal is achieved, then sustained, interoperable data systems for interdisciplinary observing networks can be easily built -- pesky metadata differences, like which protocol to use for data exchange, or what the data actually measures, will be a thing of the past. Alas, as you might imagine, there will always be complexities and incompatibilities that are not addressed, and data systems that are not interoperable, even within a science discipline. So should we throw up our hands and surrender to the inevitable? Not at all. Rather, we try to minimize metadata problems as much as we can. In this we are making steady progress, despite natural forces that pull in the other direction. Computer systems let us work with more complexity, build community knowledge and collaborations, and preserve and publish our progress and (dis-)agreements. Funding organizations, science communities, and technologists see the importance of interoperable systems and metadata, and direct resources toward them. With the new approaches and resources, projects like IPY and MMI can simultaneously define, display, and promote effective strategies for sustainable, interoperable data systems. This presentation will outline the role metadata plays in durable interoperable data systems, for better or worse. It will describe times when "just choosing a standard" can work, and when it probably won't work. And it will point out signs that suggest a metadata storm is coming to your community project, and how you might avoid it. From these lessons we will seek a path to producing interoperable, interdisciplinary, metadata-enlightened environmental observing systems.
Bai, Feng-Yang; Ma, Yuan; Lv, Shuang; Pan, Xiu-Mei; Jia, Xiu-Juan
2017-01-01
In this study, mechanistic and kinetic analyses of the reactions of CF3OCH(CF3)2 and CF3OCF2CF2H with OH radicals and Cl atoms have been performed at the CCSD(T)//B3LYP/6-311++G(d,p) level. Kinetic isotope effects for the reactions CF3OCH(CF3)2/CF3OCD(CF3)2 and CF3OCF2CF2H/CF3OCF2CF2D with OH and Cl were estimated to provide theoretical estimates for future laboratory investigations. All rate constants, computed by canonical variational transition state theory (CVT) with the small-curvature tunneling correction (SCT), are in reasonable agreement with the limited experimental data. Standard enthalpies of formation for the species were also calculated. Atmospheric lifetimes and global warming potentials (GWPs) of the reaction species were estimated; the large lifetimes and GWPs show that their environmental impact cannot be ignored. Organic nitrates can be produced by the further oxidation of CF3OC(•)(CF3)2 and CF3OCF2CF2• in the presence of O2 and NO. The subsequent decomposition pathways of CF3OC(O•)(CF3)2 and CF3OCF2CF2O• radicals were studied in detail. The derived Arrhenius expressions for the rate coefficients over 230–350 K are: k_T(1) = 5.00 × 10^−24 T^3.57 exp(−849.73/T), k_T(2) = 1.79 × 10^−24 T^4.84 exp(−4262.65/T), k_T(3) = 1.94 × 10^−24 T^4.18 exp(−884.26/T), and k_T(4) = 9.44 × 10^−28 T^5.25 exp(−913.45/T) cm^3 molecule^−1 s^−1. PMID:28067283
NASA Astrophysics Data System (ADS)
Bai, Feng-Yang; Ma, Yuan; Lv, Shuang; Pan, Xiu-Mei; Jia, Xiu-Juan
2017-01-01
In this study, mechanistic and kinetic analyses of the reactions of CF3OCH(CF3)2 and CF3OCF2CF2H with OH radicals and Cl atoms have been performed at the CCSD(T)//B3LYP/6-311++G(d,p) level. Kinetic isotope effects for the reactions CF3OCH(CF3)2/CF3OCD(CF3)2 and CF3OCF2CF2H/CF3OCF2CF2D with OH and Cl were estimated to provide theoretical estimates for future laboratory investigations. All rate constants, computed by canonical variational transition state theory (CVT) with the small-curvature tunneling correction (SCT), are in reasonable agreement with the limited experimental data. Standard enthalpies of formation for the species were also calculated. Atmospheric lifetimes and global warming potentials (GWPs) of the reaction species were estimated; the large lifetimes and GWPs show that their environmental impact cannot be ignored. Organic nitrates can be produced by the further oxidation of CF3OC(•)(CF3)2 and CF3OCF2CF2• in the presence of O2 and NO. The subsequent decomposition pathways of CF3OC(O•)(CF3)2 and CF3OCF2CF2O• radicals were studied in detail. The derived Arrhenius expressions for the rate coefficients over 230-350 K are: k_T(1) = 5.00 × 10^-24 T^3.57 exp(-849.73/T), k_T(2) = 1.79 × 10^-24 T^4.84 exp(-4262.65/T), k_T(3) = 1.94 × 10^-24 T^4.18 exp(-884.26/T), and k_T(4) = 9.44 × 10^-28 T^5.25 exp(-913.45/T) cm^3 molecule^-1 s^-1.
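As a worked example, the fitted modified-Arrhenius expressions quoted in the abstract can be evaluated at a temperature inside the reported 230-350 K range. The sketch below simply evaluates the four published fits; units are cm^3 molecule^-1 s^-1 as in the source, and no mapping of k_T(1)..k_T(4) to specific reactions is assumed here.

import math

def k(A, n, B, T):
    # Modified Arrhenius form k(T) = A * T**n * exp(-B/T)
    return A * T**n * math.exp(-B / T)

T = 298.0
rates = {
    "k_T(1)": k(5.00e-24, 3.57, 849.73, T),
    "k_T(2)": k(1.79e-24, 4.84, 4262.65, T),
    "k_T(3)": k(1.94e-24, 4.18, 884.26, T),
    "k_T(4)": k(9.44e-28, 5.25, 913.45, T),
}
for name, value in rates.items():
    print(f"{name} at {T} K = {value:.3e} cm^3 molecule^-1 s^-1")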
Atmospheric Chemistry of (CF3)2CHOCH3, (CF3)2CHOCHO, and CF3C(O)OCH3.
Østerstrøm, Freja From; Wallington, Timothy J; Sulbaek Andersen, Mads P; Nielsen, Ole John
2015-10-22
Smog chambers with in situ FTIR detection were used to measure rate coefficients in 700 Torr of air at 296 ± 2 K: k(Cl+(CF3)2CHOCH3) = (5.41 ± 1.63) × 10^(-12), k(Cl+(CF3)2CHOCHO) = (9.44 ± 1.81) × 10^(-15), k(Cl+CF3C(O)OCH3) = (6.28 ± 0.98) × 10^(-14), k(OH+(CF3)2CHOCH3) = (1.86 ± 0.41) × 10^(-13), and k(OH+(CF3)2CHOCHO) = (2.08 ± 0.63) × 10^(-14) cm^3 molecule^(-1) s^(-1). The Cl atom initiated oxidation of (CF3)2CHOCH3 gives (CF3)2CHOCHO in a yield indistinguishable from 100%. The OH radical initiated oxidation of (CF3)2CHOCH3 gives the following products (molar yields): (CF3)2CHOCHO (76 ± 8)%, CF3C(O)OCH3 (16 ± 2)%, CF3C(O)CF3 (4 ± 1)%, and C(O)F2 (45 ± 5)%. The primary oxidation product (CF3)2CHOCHO reacts with Cl atoms to give secondary products (molar yields): CF3C(O)CF3 (67 ± 7)%, CF3C(O)OCHO (28 ± 3)%, and C(O)F2 (118 ± 12)%. CF3C(O)OCH3 reacts with Cl atoms to give: CF3C(O)OCHO (80 ± 8)% and C(O)F2 (6 ± 1)%. Atmospheric lifetimes of (CF3)2CHOCH3, (CF3)2CHOCHO, and CF3C(O)OCH3 were estimated to be 62 days, 1.5 years, and 220 days, respectively. The 100-year global warming potentials (GWPs) for (CF3)2CHOCH3, (CF3)2CHOCHO, and CF3C(O)OCH3 are estimated to be 6, 121, and 46, respectively. A comprehensive description of the atmospheric fate of (CF3)2CHOCH3 is presented.
Sharma, Deepak K; Solbrig, Harold R; Tao, Cui; Weng, Chunhua; Chute, Christopher G; Jiang, Guoqian
2017-06-05
Detailed Clinical Models (DCMs) have been regarded as the basis for retaining computable meaning when data are exchanged between heterogeneous computer systems. To better support clinical cancer data capturing and reporting, there is an emerging need to develop informatics solutions for standards-based clinical models in cancer study domains. The objective of the study is to develop and evaluate a cancer genome study metadata management system that serves as a key infrastructure in supporting clinical information modeling in cancer genome study domains. We leveraged a Semantic Web-based metadata repository enhanced with both the ISO 11179 metadata standard and the Clinical Information Modeling Initiative (CIMI) Reference Model. We used the common data elements (CDEs) defined in The Cancer Genome Atlas (TCGA) data dictionary, and extracted the metadata of the CDEs using the NCI Cancer Data Standards Repository (caDSR) CDE dataset rendered in the Resource Description Framework (RDF). The ITEM/ITEM_GROUP pattern defined in the latest CIMI Reference Model is used to represent reusable model elements (mini-Archetypes). We produced a metadata repository with 38 clinical cancer genome study domains, comprising a rich collection of mini-Archetype pattern instances. We performed a case study of the domain "clinical pharmaceutical" in the TCGA data dictionary and demonstrated that the enriched data elements in the metadata repository are very useful in support of building detailed clinical models. Our informatics approach leveraging Semantic Web technologies provides an effective way to build a CIMI-compliant metadata repository that would facilitate the detailed clinical modeling to support use cases beyond TCGA in clinical cancer study domains.
Liu, Z; Sun, J; Smith, M; Smith, L; Warr, R
2013-11-01
Computer-assisted diagnosis (CAD) of malignant melanoma (MM) has been advocated to help clinicians to achieve a more objective and reliable assessment. However, conventional CAD systems examine only the features extracted from digital photographs of lesions. Failure to incorporate patients' personal information constrains the applicability in clinical settings. The objective was to develop a new CAD system to improve the performance of automatic diagnosis of melanoma, which, for the first time, incorporates digital features of lesions with important patient metadata into a learning process. Thirty-two features were extracted from digital photographs to characterize skin lesions. Patients' personal information, such as age, gender, and lesion site, and their combinations, was quantified as metadata. The integration of digital features and metadata was realized through an extended Laplacian eigenmap, a dimensionality-reduction method grouping lesions with similar digital features and metadata into the same classes. The diagnosis reached 82.1% sensitivity and 86.1% specificity when only multidimensional digital features were used, but improved to 95.2% sensitivity and 91.0% specificity after metadata were incorporated appropriately. The proposed system achieves a level of sensitivity comparable with experienced dermatologists aided by conventional dermoscopes. This demonstrates the potential of our method for assisting clinicians in diagnosing melanoma, and the benefit it could provide to patients and hospitals by greatly reducing unnecessary excisions of benign naevi. This paper proposes an enhanced CAD system incorporating clinical metadata into the learning process for automatic classification of melanoma. Results demonstrate that the additional metadata and the mechanism to incorporate them are useful for improving CAD of melanoma. © 2013 British Association of Dermatologists.
NASA Astrophysics Data System (ADS)
Chan, S.; Lehnert, K. A.; Coleman, R. J.
2011-12-01
SESAR, the System for Earth Sample Registration, is an online registry for physical samples collected for Earth and environmental studies. SESAR generates and administers the International Geo Sample Number (IGSN), a unique identifier for samples that is dramatically advancing interoperability amongst information systems for sample-based data. SESAR was developed to provide the complete range of registry services, including definition of IGSN syntax and metadata profiles, registration and validation of name spaces requested by users, tools for users to submit and manage sample metadata, validation of submitted metadata, generation and validation of the unique identifiers, archiving of sample metadata, and public or private access to the sample metadata catalog. With the development of SESAR v3, we placed particular emphasis on creating enhanced tools that make metadata submission easier and more efficient for users, and that provide superior functionality for users to manage metadata of their samples in their private workspace MySESAR. For example, SESAR v3 includes a module where users can generate custom spreadsheet templates to enter metadata for their samples, then upload these templates online for sample registration. Once the content of the template is uploaded, it is displayed online in an editable grid format. Validation rules are executed in real-time on the grid data to ensure data integrity. Other new features of SESAR v3 include the capability to transfer ownership of samples to other SESAR users, the ability to upload and store images and other files in a sample metadata profile, and the tracking of changes to sample metadata profiles. In the next version of SESAR (v3.5), we will further improve the discovery, sharing, and registration of samples. For example, we are developing a more comprehensive suite of web services that will allow discovery and registration access to SESAR from external systems. Both batch and individual registrations will be possible through web services. Based on valuable feedback from the user community, we will introduce enhancements that add greater flexibility to the system to accommodate the vast diversity of metadata that users want to store. Users will be able to create custom metadata fields and use these for the samples they register. Users will also be able to group samples into 'collections' to make retrieval for research projects or publications easier. An improved interface design will allow for better workflow transition and navigation throughout the application. In keeping up with the demands of a growing community, SESAR has also made process changes to ensure efficiency in system development. For example, we have implemented a release cycle to better track enhancements and fixes to the system, and an API library that facilitates reusability of code. Usage tracking, metrics and surveys capture information to guide the direction of future developments. A new set of administrative tools allows greater control of system management.
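The following is a minimal sketch of the kind of row-level validation described above for uploaded sample-metadata templates. The required field names and the rules themselves are hypothetical placeholders, not SESAR's actual validation logic.

REQUIRED = ["sample_name", "material", "collection_method", "latitude", "longitude"]

def validate_row(row):
    # Report missing required fields and basic coordinate-range problems.
    errors = [f"missing required field: {f}" for f in REQUIRED if not str(row.get(f, "")).strip()]
    for name, lo, hi in [("latitude", -90, 90), ("longitude", -180, 180)]:
        value = row.get(name)
        if value not in (None, ""):
            try:
                if not lo <= float(value) <= hi:
                    errors.append(f"{name} out of range")
            except ValueError:
                errors.append(f"{name} is not numeric")
    return errors

print(validate_row({"sample_name": "CORE-001", "material": "Rock",
                    "collection_method": "Coring", "latitude": "64.1", "longitude": "-21.9"}))
print(validate_row({"sample_name": "", "latitude": "95"}))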
Federal Register 2010, 2011, 2012, 2013, 2014
2012-06-07
... Information Technology. SUMMARY: As part of the HHS Open Government Plan, the HealthData.gov Platform (HDP) is... application of existing voluntary consensus standards for metadata common to all open government data, and... vocabulary recommendations for Linked Data publishers, defining cross domain semantic metadata of open...
Inferring Metadata for a Semantic Web Peer-to-Peer Environment
ERIC Educational Resources Information Center
Brase, Jan; Painter, Mark
2004-01-01
Learning Objects Metadata (LOM) aims at describing educational resources in order to allow better reusability and retrieval. In this article we show how additional inference rules allows us to derive additional metadata from existing ones. Additionally, using these rules as integrity constraints helps us to define the constraints on LOM elements,…
ERIC Educational Resources Information Center
Solomou, Georgia; Pierrakeas, Christos; Kameas, Achilles
2015-01-01
The ability to effectively administrate educational resources in terms of accessibility, reusability and interoperability lies in the adoption of an appropriate metadata schema, able of adequately describing them. A considerable number of different educational metadata schemas can be found in literature, with the IEEE LOM being the most widely…
Manifestations of Metadata: From Alexandria to the Web--Old is New Again
ERIC Educational Resources Information Center
Kennedy, Patricia
2008-01-01
This paper is a discussion of the use of metadata, in its various manifestations, to access information. Information management standards are discussed. The connection between the ancient world and the modern world is highlighted. Individual perspectives are paramount in fulfilling information seeking. Metadata is interpreted and reflected upon in…
To Teach or Not to Teach: The Ethics of Metadata
ERIC Educational Resources Information Center
Barnes, Cynthia; Cavaliere, Frank
2009-01-01
Metadata is information about computer-generated documents that is often inadvertently transmitted to others. The problems associated with metadata have become more acute over time as word processing and other popular programs have become more receptive to the concept of collaboration. As more people become involved in the preparation of…
Document Classification in Support of Automated Metadata Extraction from Heterogeneous Collections
ERIC Educational Resources Information Center
Flynn, Paul K.
2014-01-01
A number of federal agencies, universities, laboratories, and companies are placing their documents online and making them searchable via metadata fields such as author, title, and publishing organization. To enable this, every document in the collection must be catalogued using the metadata fields. Though time consuming, the task of identifying…
Creating FGDC and NBII metadata with Metavist 2005.
David J. Rugg
2004-01-01
This report documents a computer program for creating metadata compliant with the Federal Geographic Data Committee (FGDC) 1998 metadata standard or the National Biological Information Infrastructure (NBII) 1999 Biological Data Profile for the FGDC standard. The software runs under the Microsoft Windows 2000 and XP operating systems, and requires the presence of...
An Assistant for Loading Learning Object Metadata: An Ontology Based Approach
ERIC Educational Resources Information Center
Casali, Ana; Deco, Claudia; Romano, Agustín; Tomé, Guillermo
2013-01-01
In the last years, the development of different Repositories of Learning Objects has been increased. Users can retrieve these resources for reuse and personalization through searches in web repositories. The importance of high quality metadata is key for a successful retrieval. Learning Objects are described with metadata usually in the standard…
iLOG: A Framework for Automatic Annotation of Learning Objects with Empirical Usage Metadata
ERIC Educational Resources Information Center
Miller, L. D.; Soh, Leen-Kiat; Samal, Ashok; Nugent, Gwen
2012-01-01
Learning objects (LOs) are digital or non-digital entities used for learning, education or training commonly stored in repositories searchable by their associated metadata. Unfortunately, based on the current standards, such metadata is often missing or incorrectly entered making search difficult or impossible. In this paper, we investigate…
Prediction of Solar Eruptions Using Filament Metadata
NASA Astrophysics Data System (ADS)
Aggarwal, Ashna; Schanche, Nicole; Reeves, Katharine K.; Kempton, Dustin; Angryk, Rafal
2018-05-01
We perform a statistical analysis of erupting and non-erupting solar filaments to determine the properties related to the eruption potential. In order to perform this study, we correlate filament eruptions documented in the Heliophysics Event Knowledgebase (HEK) with HEK filaments that have been grouped together using a spatiotemporal tracking algorithm. The HEK provides metadata about each filament instance, including values for length, area, tilt, and chirality. We add additional metadata properties such as the distance from the nearest active region and the magnetic field decay index. We compare trends in the metadata from erupting and non-erupting filament tracks to discover which properties present signs of an eruption. We find that a change in filament length over time is the most important factor in discriminating between erupting and non-erupting filament tracks, with erupting tracks being more likely to have decreasing length. We attempt to find an ensemble of predictive filament metadata using a Random Forest Classifier approach, but find the probability of correctly predicting an eruption with the current metadata is only slightly better than chance.
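The classification step described above can be illustrated with a minimal scikit-learn sketch on synthetic, hypothetical per-track metadata features (for example, the change in filament length over time); this is not the authors' code or data.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Hypothetical feature columns: length change, mean area, mean tilt, distance to nearest AR
X = rng.normal(size=(200, 4))
# Toy labels: tracks with decreasing length are more likely to be labelled "erupting" (1)
y = (X[:, 0] + 0.2 * rng.normal(size=200) < 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
print("feature importances:", clf.feature_importances_)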
Representing Hydrologic Models as HydroShare Resources to Facilitate Model Sharing and Collaboration
NASA Astrophysics Data System (ADS)
Castronova, A. M.; Goodall, J. L.; Mbewe, P.
2013-12-01
The CUAHSI HydroShare project is a collaborative effort that aims to provide software for sharing data and models within the hydrologic science community. One of the early focuses of this work has been establishing metadata standards for describing models and model-related data as HydroShare resources. By leveraging this metadata definition, a prototype extension has been developed to create model resources that can be shared within the community using the HydroShare system. The extension uses a general model metadata definition to create resource objects, and was designed so that model-specific parsing routines can extract and populate metadata fields from model input and output files. The long term goal is to establish a library of supported models where, for each model, the system has the ability to extract key metadata fields automatically, thereby establishing standardized model metadata that will serve as the foundation for model sharing and collaboration within HydroShare. The Soil and Water Assessment Tool (SWAT) is used to demonstrate this concept through a case study application.
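As an illustrative sketch of the design described above, a registry of model-specific parsing routines can populate a shared metadata definition from model files. The function names, metadata fields, and the stand-in SWAT parser below are placeholders, not the HydroShare implementation.

PARSERS = {}

def register(model_type):
    # Decorator that adds a model-specific parser to the registry.
    def wrap(func):
        PARSERS[model_type] = func
        return func
    return wrap

@register("SWAT")
def parse_swat(input_dir):
    # A real routine would read SWAT input/output files; here we return fixed placeholder fields.
    return {"model_program": "SWAT", "simulation_period": "2000-2010", "watershed": "example basin"}

def build_model_resource(model_type, input_dir, title):
    # Combine general resource metadata with fields extracted by the model-specific parser.
    metadata = {"title": title, "resource_type": "ModelInstanceResource"}
    metadata.update(PARSERS[model_type](input_dir))
    return metadata

print(build_model_resource("SWAT", "./swat_run", "Example SWAT case study"))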
XAFS Data Interchange: A single spectrum XAFS data file format.
Ravel, B; Newville, M
We propose a standard data format for the interchange of XAFS data. The XAFS Data Interchange (XDI) standard is meant to encapsulate a single spectrum of XAFS along with relevant metadata. XDI is a text-based format with a simple syntax which clearly delineates metadata from the data table in a way that is easily interpreted both by a computer and by a human. The metadata header is inspired by the format of an electronic mail header, representing metadata names and values as an associative array. The data table is represented as columns of numbers. This format can be imported as is into most existing XAFS data analysis, spreadsheet, or data visualization programs. Along with a specification and a dictionary of metadata types, we provide an application-programming interface written in C and bindings for programming dynamic languages.
XAFS Data Interchange: A single spectrum XAFS data file format
NASA Astrophysics Data System (ADS)
Ravel, B.; Newville, M.
2016-05-01
We propose a standard data format for the interchange of XAFS data. The XAFS Data Interchange (XDI) standard is meant to encapsulate a single spectrum of XAFS along with relevant metadata. XDI is a text-based format with a simple syntax which clearly delineates metadata from the data table in a way that is easily interpreted both by a computer and by a human. The metadata header is inspired by the format of an electronic mail header, representing metadata names and values as an associative array. The data table is represented as columns of numbers. This format can be imported as is into most existing XAFS data analysis, spreadsheet, or data visualization programs. Along with a specification and a dictionary of metadata types, we provide an application-programming interface written in C and bindings for programming dynamic languages.
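A minimal sketch of reading an XDI-like file as described above: header lines of the form "Name: value" are collected into an associative array (a Python dict), and the remaining lines are read as columns of numbers. This is a simplified, assumed reader for illustration, not the official XDI application-programming interface.

def read_xdi_like(path):
    metadata, columns = {}, []
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            if line.startswith("#"):
                # Header lines: "# Name: value" pairs go into the metadata dict
                body = line.lstrip("#").strip()
                if ":" in body:
                    name, value = body.split(":", 1)
                    metadata[name.strip()] = value.strip()
            else:
                # Data table: whitespace-separated columns of numbers
                columns.append([float(x) for x in line.split()])
    return metadata, columns

# Usage (assuming "spectrum.xdi" exists): header, table = read_xdi_like("spectrum.xdi")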
NASA Astrophysics Data System (ADS)
Wong, John-Michael; Stojadinovic, Bozidar
2005-05-01
A framework has been defined for storing and retrieving civil infrastructure monitoring data over a network. The framework consists of two primary components: metadata and network communications. The metadata component provides the descriptions and data definitions necessary for cataloging and searching monitoring data. The communications component provides Java classes for remotely accessing the data. Packages of Enterprise JavaBeans and data handling utility classes are written to use the underlying metadata information to build real-time monitoring applications. The utility of the framework was evaluated using wireless accelerometers on a shaking table earthquake simulation test of a reinforced concrete bridge column. The NEESgrid data and metadata repository services were used as a backend storage implementation. A web interface was created to demonstrate the utility of the data model and provides an example health monitoring application.
Error quantification of abnormal extreme high waves in Operational Oceanographic System in Korea
NASA Astrophysics Data System (ADS)
Jeong, Sang-Hun; Kim, Jinah; Heo, Ki-Young; Park, Kwang-Soon
2017-04-01
In the winter season, large swell-like waves have occurred on the east coast of Korea, causing property damage and loss of human life. These waves are known to be generated by strong local winds produced by a temperate cyclone moving eastward in the East Sea of the Korean peninsula. Because the waves often occur in clear weather, the damage can be particularly severe. It is therefore necessary to predict and forecast large swell-like waves in order to prevent and respond to coastal damage. In Korea, an operational oceanographic system (KOOS) has been developed by the Korea Institute of Ocean Science and Technology (KIOST), and KOOS provides daily 72-hour ocean forecasts of wind, water elevation, sea currents, water temperature, salinity, and waves, computed not only from meteorological and hydrodynamic models (WRF, ROMS, MOM, and MOHID) but also from wave models (WW-III and SWAN). In order to evaluate model performance and guarantee a certain level of accuracy of the ocean forecasts, a Skill Assessment (SA) system was established as one of the modules in KOOS. It has been performed through comparison of model results with in-situ observation data, and model errors have been quantified with skill scores. The statistics used in the skill assessment include measures of both error and correlation, such as root-mean-square error (RMSE), root-mean-square error percentage (RMSE%), mean bias (MB), correlation coefficient (R), scatter index (SI), circular correlation (CC), and central frequency (CF), the frequency with which errors lie within acceptable error criteria. These measures should be used not only to quantify errors but also to improve forecast accuracy by providing interactive feedback. However, abnormal phenomena such as large swell-like waves on the east coast of Korea require a more advanced and optimized error quantification method, one that allows the abnormal waves to be predicted well and the accuracy of the forecasts to be improved by supporting modifications of model physics and numerics through sensitivity tests. In this study, we propose an appropriate method of error quantification focused on abnormal high waves caused by local weather conditions. Furthermore, we show how the quantified errors contribute to improving wind-wave modelling by applying data assimilation and utilizing reanalysis data.
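The skill scores named above can be computed with a short numpy sketch. The exact operational definitions used in KOOS (for example, the normalisation used for RMSE% and SI, and the error criterion behind CF) are assumptions here, not taken from the abstract.

import numpy as np

def skill_scores(model, obs, criterion):
    model, obs = np.asarray(model, float), np.asarray(obs, float)
    err = model - obs
    rmse = np.sqrt(np.mean(err**2))
    return {
        "RMSE": rmse,
        "RMSE%": 100.0 * rmse / np.mean(np.abs(obs)),   # assumed normalisation by mean |obs|
        "MB": np.mean(err),                              # mean bias
        "R": np.corrcoef(model, obs)[0, 1],              # Pearson correlation coefficient
        "SI": rmse / np.mean(np.abs(obs)),               # scatter index (assumed definition)
        "CF": np.mean(np.abs(err) <= criterion),         # fraction of errors within criterion
    }

print(skill_scores([1.2, 2.4, 3.1], [1.0, 2.0, 3.5], criterion=0.5))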
EnviroAtlas Tree Cover Configuration and Connectivity, Water Background Web Service
This EnviroAtlas web service supports research and online mapping activities related to EnviroAtlas (https://www.epa.gov/enviroatlas). The 1-meter resolution tree cover configuration and connectivity map categorizes tree cover into structural elements (e.g. core, edge, connector, etc.). Source imagery varies by community. For specific information about methods and accuracy of each community's tree cover configuration and connectivity classification, consult their individual metadata records: Austin, TX (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7B29D2B039-905C-4825-B0B4-9315122D6A9F%7D); Cleveland, OH (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7B03cd54e1-4328-402e-ba75-e198ea9fbdc7%7D); Des Moines, IA (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7B350A83E6-10A2-4D5D-97E6-F7F368D268BB%7D); Durham, NC (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7BC337BA5F-8275-4BA8-9647-F63C443F317D%7D); Fresno, CA (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7B84B98749-9C1C-4679-AE24-9B9C0998EBA5%7D); Green Bay, WI (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7B69E48A44-3D30-4E84-A764-38FBDCCAC3D0%7D); Memphis, TN (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7BB7313ADA-04F7-4D80-ABBA-77E753AAD002%7D); Milwaukee, WI (https://edg.epa.gov/metadata/catalog/search/resource/details.page?u
OlyMPUS - The Ontology-based Metadata Portal for Unified Semantics
NASA Astrophysics Data System (ADS)
Huffer, E.; Gleason, J. L.
2015-12-01
The Ontology-based Metadata Portal for Unified Semantics (OlyMPUS), funded by the NASA Earth Science Technology Office Advanced Information Systems Technology program, is an end-to-end system designed to support data consumers and data providers, enabling the latter to register their data sets and provision them with the semantically rich metadata that drives the Ontology-Driven Interactive Search Environment for Earth Sciences (ODISEES). OlyMPUS leverages the semantics and reasoning capabilities of ODISEES to provide data producers with a semi-automated interface for producing the semantically rich metadata needed to support ODISEES' data discovery and access services. It integrates the ODISEES metadata search system with multiple NASA data delivery tools to enable data consumers to create customized data sets for download to their computers, or for NASA Advanced Supercomputing (NAS) facility registered users, directly to NAS storage resources for access by applications running on NAS supercomputers. A core function of NASA's Earth Science Division is research and analysis that uses the full spectrum of data products available in NASA archives. Scientists need to perform complex analyses that identify correlations and non-obvious relationships across all types of Earth System phenomena. Comprehensive analytics are hindered, however, by the fact that many Earth science data products are disparate and hard to synthesize. Variations in how data are collected, processed, gridded, and stored, create challenges for data interoperability and synthesis, which are exacerbated by the sheer volume of available data. Robust, semantically rich metadata can support tools for data discovery and facilitate machine-to-machine transactions with services such as data subsetting, regridding, and reformatting. Such capabilities are critical to enabling the research activities integral to NASA's strategic plans. However, as metadata requirements increase and competing standards emerge, metadata provisioning becomes increasingly burdensome to data producers. The OlyMPUS system helps data providers produce semantically rich metadata, making their data more accessible to data consumers, and helps data consumers quickly discover and download the right data for their research.
GeneLab Analysis Working Group Kick-Off Meeting
NASA Technical Reports Server (NTRS)
Costes, Sylvain V.
2018-01-01
Goals for the GeneLab AWG: GL vision; review of the GeneLab AWG charter; timeline and milestones for 2018; logistics (monthly meetings, workshop, internship, ASGSR); introduction of team leads and the goals of each group; introduction of all members; Q/A; three-tier client strategy to democratize data; physiological changes, pathway enrichment, differential expression, normalization, processing metadata, reproducibility; data federation/integration with heterogeneous external bioinformatics databases. The GLDS currently serves over 100 omics investigations to the biomedical community via open access. In order to expand the scope of metadata record searches via the GLDS, we designed a metadata warehouse that collects and updates metadata records from external systems housing similar data. To demonstrate the capabilities of federated search and retrieval of these data, we imported metadata records from three open-access data systems into the GLDS metadata warehouse: NCBI's Gene Expression Omnibus (GEO), EBI's PRoteomics IDEntifications (PRIDE) repository, and the Metagenomics Analysis server (MG-RAST). Each of these systems defines metadata for omics data sets differently. One solution to bridge such differences is to employ a common object model (COM) to which each system's representation of metadata can be mapped. Warehoused metadata records are then transformed during extract-transform-load (ETL) to this single, common representation. Queries generated via the GLDS are then executed against the warehouse, and matching records are shown in the COM representation (Fig. 1). While this approach is relatively straightforward to implement, the volume of the data in the omics domain presents challenges in dealing with latency and currency of records. Furthermore, the goal has been federated data search for and retrieval of these kinds of data across other open-access systems, so that users are able to conduct biological meta-investigations using data from a variety of sources. Such meta-investigations are key to corroborating findings from many kinds of assays and translating them into systems biology knowledge and, eventually, therapeutics.
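The common-object-model mapping described above can be sketched as a pair of per-source transforms applied at load time. The source and target field names below are hypothetical, not the actual GEO, PRIDE, or GLDS schemas.

def to_common(record, source):
    # Map a source-specific metadata record to a single common representation.
    if source == "GEO":
        return {"accession": record["geo_accession"], "title": record["title"],
                "assay_type": "transcription profiling", "organism": record["organism"]}
    if source == "PRIDE":
        return {"accession": record["projectAccession"], "title": record["projectTitle"],
                "assay_type": "proteomics", "organism": record["species"]}
    raise ValueError(f"unknown source: {source}")

warehouse = [
    to_common({"geo_accession": "GSE00001", "title": "Spaceflight mouse liver",
               "organism": "Mus musculus"}, "GEO"),
    to_common({"projectAccession": "PXD000001", "projectTitle": "Rodent muscle proteome",
               "species": "Mus musculus"}, "PRIDE"),
]
# A simple federated-style query against the common representation
matches = [r for r in warehouse if r["organism"] == "Mus musculus"]
print(matches)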
New Tools to Document and Manage Data/Metadata: Example NGEE Arctic and ARM
NASA Astrophysics Data System (ADS)
Crow, M. C.; Devarakonda, R.; Killeffer, T.; Hook, L.; Boden, T.; Wullschleger, S.
2017-12-01
Tools used for documenting, archiving, cataloging, and searching data are critical pieces of informatics. This poster describes tools being used in several projects at Oak Ridge National Laboratory (ORNL), with a focus on the U.S. Department of Energy's Next Generation Ecosystem Experiment in the Arctic (NGEE Arctic) and Atmospheric Radiation Measurements (ARM) project, and their usage at different stages of the data lifecycle. The Online Metadata Editor (OME) is used for the documentation and archival stages while a Data Search tool supports indexing, cataloging, and searching. The NGEE Arctic OME Tool [1] provides a method by which researchers can upload their data and provide original metadata with each upload while adhering to standard metadata formats. The tool is built upon a Java SPRING framework to parse user input into, and from, XML output. Many aspects of the tool require use of a relational database including encrypted user-login, auto-fill functionality for predefined sites and plots, and file reference storage and sorting. The Data Search Tool conveniently displays each data record in a thumbnail containing the title, source, and date range, and features a quick view of the metadata associated with that record, as well as a direct link to the data. The search box incorporates autocomplete capabilities for search terms and sorted keyword filters are available on the side of the page, including a map for geo-searching. These tools are supported by the Mercury [2] consortium (funded by DOE, NASA, USGS, and ARM) and developed and managed at Oak Ridge National Laboratory. Mercury is a set of tools for collecting, searching, and retrieving metadata and data. Mercury collects metadata from contributing project servers, then indexes the metadata to make it searchable using Apache Solr, and provides access to retrieve it from the web page. Metadata standards that Mercury supports include: XML, Z39.50, FGDC, Dublin-Core, Darwin-Core, EML, and ISO-19115.
NASA Astrophysics Data System (ADS)
Zaslavsky, I.; Richard, S. M.; Malik, T.; Hsu, L.; Gupta, A.; Grethe, J. S.; Valentine, D. W., Jr.; Lehnert, K. A.; Bermudez, L. E.; Ozyurt, I. B.; Whitenack, T.; Schachne, A.; Giliarini, A.
2015-12-01
While many geoscience-related repositories and data discovery portals exist, finding information about available resources remains a pervasive problem, especially when searching across multiple domains and catalogs. Inconsistent and incomplete metadata descriptions, disparate access protocols and semantic differences across domains, and troves of unstructured or poorly structured information which is hard to discover and use are major hindrances toward discovery, while metadata compilation and curation remain manual and time-consuming. We report on methodology, main results and lessons learned from an ongoing effort to develop a geoscience-wide catalog of information resources, with consistent metadata descriptions, traceable provenance, and automated metadata enhancement. Developing such a catalog is the central goal of CINERGI (Community Inventory of EarthCube Resources for Geoscience Interoperability), an EarthCube building block project (earthcube.org/group/cinergi). The key novel technical contributions of the projects include: a) development of a metadata enhancement pipeline and a set of document enhancers to automatically improve various aspects of metadata descriptions, including keyword assignment and definition of spatial extents; b) Community Resource Viewers: online applications for crowdsourcing community resource registry development, curation and search, and channeling metadata to the unified CINERGI inventory, c) metadata provenance, validation and annotation services, d) user interfaces for advanced resource discovery; and e) geoscience-wide ontology and machine learning to support automated semantic tagging and faceted search across domains. We demonstrate these CINERGI components in three types of user scenarios: (1) improving existing metadata descriptions maintained by government and academic data facilities, (2) supporting work of several EarthCube Research Coordination Network projects in assembling information resources for their domains, and (3) enhancing the inventory and the underlying ontology to address several complicated data discovery use cases in hydrology, geochemistry, sedimentology, and critical zone science. Support from the US National Science Foundation under award ICER-1343816 is gratefully acknowledged.
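As a minimal sketch in the spirit of the enhancement pipeline described above, each enhancer can be written as a function that takes a metadata record and returns an improved copy. The enhancers shown (keyword tagging from a tiny vocabulary, a default spatial extent) are illustrative stand-ins, not the CINERGI components.

def keyword_enhancer(record):
    # Assign keywords by matching terms from a small controlled vocabulary (placeholder).
    vocabulary = {"river": "hydrology", "basalt": "geochemistry", "sediment": "sedimentology"}
    text = (record.get("title", "") + " " + record.get("abstract", "")).lower()
    found = sorted({tag for term, tag in vocabulary.items() if term in text})
    record.setdefault("keywords", []).extend(found)
    return record

def spatial_enhancer(record):
    # Fill in a default spatial extent when none is provided (placeholder behaviour).
    record.setdefault("bounding_box", {"west": -180, "east": 180, "south": -90, "north": 90})
    return record

def enhance(record, pipeline=(keyword_enhancer, spatial_enhancer)):
    for step in pipeline:
        record = step(record)
    return record

print(enhance({"title": "Sediment flux in a river basin", "abstract": "..."}))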
Evaluating and Evolving Metadata in Multiple Dialects
NASA Technical Reports Server (NTRS)
Kozimor, John; Habermann, Ted; Gordon, Sean; Powers, Lindsay
2016-01-01
Despite many long-term homogenization efforts, communities continue to develop focused metadata standards along with related recommendations and (typically) XML representations (aka dialects) for sharing metadata content. Different representations easily become obstacles to sharing information because each representation generally requires a set of tools and skills that are designed, built, and maintained specifically for that representation. In contrast, community recommendations are generally described, at least initially, at a more conceptual level and are more easily shared. For example, most communities agree that dataset titles should be included in metadata records although they write the titles in different ways.
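To make the point about divergent representations concrete, the sketch below lists indicative (not normative) locations where the "dataset title" concept typically lives in a few common dialects; the paths are recalled for illustration and should be checked against each specification.

# Indicative (not normative) locations of the "dataset title" concept in several
# common metadata dialects. A dialect-aware evaluator can map one conceptual
# check ("does this record have a title?") onto each concrete representation.
TITLE_PATHS = {
    "ISO 19115/19139": ("/gmd:MD_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification"
                        "/gmd:citation/gmd:CI_Citation/gmd:title/gco:CharacterString"),
    "FGDC CSDGM":      "/metadata/idinfo/citation/citeinfo/title",
    "GCMD DIF":        "/DIF/Entry_Title",
    "Dublin Core":     "/oai_dc:dc/dc:title",
}

for dialect, path in TITLE_PATHS.items():
    print(f"{dialect:18s} -> {path}")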
Metadata Evaluation and Improvement: Evolving Analysis and Reporting
NASA Technical Reports Server (NTRS)
Habermann, Ted; Kozimor, John; Gordon, Sean
2017-01-01
ESIP Community members create and manage a large collection of environmental datasets that span multiple decades, the entire globe, and many parts of the solar system. Metadata are critical for discovering, accessing, using and understanding these data effectively and ESIP community members have successfully created large collections of metadata describing these data. As part of the White House Big Earth Data Initiative (BEDI), ESDIS has developed a suite of tools for evaluating these metadata in native dialects with respect to recommendations from many organizations. We will describe those tools and demonstrate evolving techniques for sharing results with data providers.
Park, Yu Rang; Kim*, Ju Han
2006-01-01
Standardized management of data elements (DEs) for Case Report Forms (CRFs) is crucial in Clinical Trials Information Systems (CTISs). Traditional CTISs utilize organization-specific definitions and storage methods for DEs and CRFs. We developed a metadata-based DE management system for clinical trials, the Clinical and Histopathological Metadata Registry (CHMR), using the international standard for metadata registries (ISO 11179) for the management of cancer clinical trials information. CHMR was evaluated in cancer clinical trials with 1625 DEs extracted from the College of American Pathologists Cancer Protocols for 20 major cancers. PMID:17238675
Metadata to Support Data Warehouse Evolution
NASA Astrophysics Data System (ADS)
Solodovnikova, Darja
The focus of this chapter is the metadata necessary to support data warehouse evolution. We present a data warehouse framework that is able to track the evolution process and adapt data warehouse schemata and data extraction, transformation, and loading (ETL) processes. We discuss a significant part of the framework, the metadata repository that stores information about the data warehouse, its logical and physical schemata, and their versions. We propose a physical implementation of a multiversion data warehouse in a relational DBMS. For each modification of a data warehouse schema, we outline the changes that need to be made to the repository metadata and to the database.
NASA Astrophysics Data System (ADS)
Schweitzer, R. H.
2001-05-01
The Climate Diagnostics Center maintains a collection of gridded climate data primarily for use by local researchers. Because these data are available on fast digital storage and because they have been converted to netCDF using a standard metadata convention (called COARDS), we recognize that this data collection is also useful to the community at large. At CDC we try to use technology and metadata standards to reduce our costs associated with making these data available to the public. The World Wide Web has been an excellent technology platform for meeting that goal. Specifically, we have developed Web-based user interfaces that allow users to search, plot, and download subsets from the data collection. We have also been exploring use of the Pacific Marine Environmental Laboratory's Live Access Server (LAS) as an engine for this task. This would result in further savings by allowing us to concentrate on customizing the LAS where needed, rather than developing and maintaining our own system. One such customization currently under development is the use of Java Servlets and JavaServer Pages in conjunction with a metadata database to produce a hierarchical user interface to LAS. In addition to these Web-based user interfaces, all of our data are available via the Distributed Oceanographic Data System (DODS). This allows other sites using LAS and individuals using DODS-enabled clients to use our data as if they were a local file. All of these technology systems are driven by metadata. When we began to create netCDF files, we collaborated with several other agencies to develop a netCDF convention (COARDS) for metadata. At CDC we have extended that convention to incorporate additional metadata elements to make the netCDF files as self-describing as possible. Part of the local metadata is a set of controlled names for the variable, level in the atmosphere and ocean, statistic, and data set for each netCDF file. To allow searching and easy reorganization of these metadata, we loaded the metadata from the netCDF files into a MySQL database. The combination of the MySQL database and the controlled names makes it possible to automate the construction of user interfaces and standard-format metadata descriptions, like Federal Geographic Data Committee (FGDC) and Directory Interchange Format (DIF). These standard descriptions also include an association between our controlled names and standard keywords such as those developed by the Global Change Master Directory (GCMD). This talk will give an overview of each of these technology and metadata standards as it applies to work at the Climate Diagnostics Center. The talk will also discuss the pros and cons of each approach and discuss areas for future development.
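A minimal sketch, assuming the netCDF4 Python library and SQLite, of the kind of workflow described above: harvesting variable attributes from a netCDF file into a relational table so controlled names and units can be searched. The file name and table layout are illustrative, not CDC's actual schema.

# Sketch: copy netCDF attribute metadata into a relational table for searching.
import sqlite3
from netCDF4 import Dataset

conn = sqlite3.connect("metadata.db")
conn.execute("""CREATE TABLE IF NOT EXISTS var_metadata (
                  filename TEXT, variable TEXT, attribute TEXT, value TEXT)""")

def harvest(path: str) -> None:
    """Copy every variable attribute in a netCDF file into the database."""
    with Dataset(path) as nc:
        for name, var in nc.variables.items():
            for attr in var.ncattrs():
                conn.execute("INSERT INTO var_metadata VALUES (?, ?, ?, ?)",
                             (path, name, attr, str(getattr(var, attr))))
    conn.commit()

harvest("air.mon.mean.nc")  # example file name only
# e.g. find every file/variable pair that declares a COARDS/CF units attribute:
rows = conn.execute("SELECT filename, variable, value FROM var_metadata "
                    "WHERE attribute = 'units'").fetchall()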
Huang, Min; Liu, Zhaoqing; Qiao, Liyan
2014-10-10
While NAND flash memory is widely used as the storage medium in modern sensor systems, the aggressive shrinking of process geometry and an increase in the number of bits stored in each memory cell will inevitably degrade the reliability of NAND flash memory. In particular, it is critical to enhance the reliability of metadata, which occupies only a small portion of the storage space but maintains the critical information of the file system and the address translations of the storage system. Metadata damage will cause the system to crash or a large amount of data to be lost. This paper presents Asymmetric Programming, a highly reliable metadata allocation strategy for MLC NAND flash memory storage systems. Our technique exploits for the first time the property of the multi-page architecture of MLC NAND flash memory to improve the reliability of metadata. The basic idea is to keep metadata in most significant bit (MSB) pages, which are more reliable than least significant bit (LSB) pages. Thus, we can achieve relatively low bit error rates for metadata. Based on this idea, we propose two strategies to optimize address mapping and garbage collection. We have implemented Asymmetric Programming on a real hardware platform. The experimental results show that Asymmetric Programming can achieve a reduction in the number of page errors of up to 99.05% with the baseline error correction scheme.
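A toy sketch of the core allocation idea (not the authors' implementation): metadata writes are steered to MSB pages, ordinary data to LSB pages. The even/odd page pairing below is a simplifying assumption; real MLC devices pair pages according to the vendor's page map.

# Toy model of MSB-first metadata allocation in an MLC flash block.
from collections import deque

class ToyBlock:
    def __init__(self, pages_per_block: int = 128):
        # assume even page indices are MSB pages, odd indices are LSB pages
        self.free_msb = deque(p for p in range(pages_per_block) if p % 2 == 0)
        self.free_lsb = deque(p for p in range(pages_per_block) if p % 2 == 1)

    def allocate(self, is_metadata: bool) -> int:
        """Return a free page number, preferring MSB pages for metadata."""
        if is_metadata and self.free_msb:
            return self.free_msb.popleft()
        if self.free_lsb:
            return self.free_lsb.popleft()
        return self.free_msb.popleft()  # fall back when LSB pages are exhausted

blk = ToyBlock()
meta_page = blk.allocate(is_metadata=True)   # lands on an MSB page
data_page = blk.allocate(is_metadata=False)  # lands on an LSB page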
Federal Register 2010, 2011, 2012, 2013, 2014
2010-04-30
..., Aerospace Engineer, Engine Certification Office, FAA, Engine & Propeller Directorate, 12 New England... Directives; General Electric Company (GE) CF34-1A, CF34-3A, and CF34-3B Series Turbofan Engines; Correction... to GE CF34-1A, CF34-3A, and CF34-3B series turbofan engines. The docket number is incorrect in all...
Applications For Real Time NOMADS At NCEP To Disseminate NOAA's Operational Model Data Base
NASA Astrophysics Data System (ADS)
Alpert, J. C.; Wang, J.; Rutledge, G.
2007-05-01
A wide range of environmental information, in digital form, with metadata descriptions and supporting infrastructure is contained in the NOAA Operational Modeling Archive Distribution System (NOMADS) and its Real Time (RT) project prototype at the National Centers for Environmental Prediction (NCEP). NOMADS is now delivering on its goal of a seamless framework, from archival to real-time data dissemination, for NOAA's operational model data holdings. A process is under way to make NOMADS part of NCEP's operational production of products. A goal is to foster collaborations among the research and education communities, value-added retailers, and public access for science and development. In the National Research Council's "Completing the Forecast", Recommendation 3.4 states: "NOMADS should be maintained and extended to include (a) long-term archives of the global and regional ensemble forecasting systems at their native resolution, and (b) re-forecast datasets to facilitate post-processing." As one of many participants of NOMADS, NCEP serves the operational model database using the Data Access Protocol (OPeNDAP) and other services for participants to serve their data sets and users to obtain them. Using the NCEP global ensemble data as an example, we show an OPeNDAP (also known as DODS) client application that provides a request-and-fulfill mechanism for access to the complex ensemble matrix of holdings. As an example of the DAP service, we show a client application which accesses the Global or Regional Ensemble data set to produce user-selected weather element event probabilities. The event probabilities are easily extended over model forecast time to show probability histograms defining the future trend of user-selected events. This approach ensures an efficient use of computer resources because users transmit only the data necessary for their tasks. Data sets are served via OPeNDAP, allowing commercial clients such as MATLAB or IDL as well as freeware clients such as GrADS to access the NCEP real-time database. We will demonstrate how users can use NOMADS services to repackage area subsets and select levels and variables that are sent to a user-selected FTP site. NOMADS can also display plots on demand for area subsets, selected levels, time series, and selected variables.
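A hedged sketch of the request-and-fulfill access pattern described above: a DAP-enabled netCDF client opens a remote dataset and transfers only the requested slice. The URL and variable name are placeholders, not a guaranteed NOMADS endpoint.

# Sketch of OPeNDAP access: only the requested subset crosses the network.
from netCDF4 import Dataset  # must be built with OPeNDAP (DAP) support

url = "https://nomads.example.gov/dods/gens/gep_all_00z"  # illustrative URL only
with Dataset(url) as ds:
    # Slicing the remote variable transmits just this piece, not the whole archive.
    subset = ds.variables["tmp2m"][0, ...]   # first time step only (variable name assumed)
    print(subset.shape)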
ERIC Educational Resources Information Center
Alemneh, Daniel Gelaw
2009-01-01
Digital preservation is a significant challenge for cultural heritage institutions and other repositories of digital information resources. Recognizing the critical role of metadata in any successful digital preservation strategy, the Preservation Metadata Implementation Strategies (PREMIS) has been extremely influential on providing a "core" set…
Inconsistencies between Academic E-Book Platforms: A Comparison of Metadata and Search Results
ERIC Educational Resources Information Center
Wiersma, Gabrielle; Tovstiadi, Esta
2017-01-01
This article presents the results of a study of academic e-books that compared the metadata and search results from major academic e-book platforms. The authors collected data and performed a series of test searches designed to produce the same result regardless of platform. Testing, however, revealed metadata-related errors and significant…
ERIC Educational Resources Information Center
Fast, Karl V.; Campbell, D. Grant
2001-01-01
Compares the implied ontological frameworks of the Open Archives Initiative Protocol for Metadata Harvesting and the World Wide Web Consortium's Semantic Web. Discusses current search engine technology, semantic markup, indexing principles of special libraries and online databases, and componentization and the distinction between data and…
The Arctic Observing Network (AON)Cooperative Arctic Data and Information Service (CADIS)
NASA Astrophysics Data System (ADS)
Moore, J.; Fetterer, F.; Middleton, D.; Ramamurthy, M.; Barry, R.
2007-12-01
The Arctic Observing Network (AON) is intended to be a federation of 34 land, atmosphere, and ocean observation sites, some already operating and some newly funded by the U.S. National Science Foundation. This International Polar Year (IPY) initiative will acquire a major portion of the data coming from the interagency Study of Environmental Arctic Change (SEARCH). AON will succeed in supporting the science envisioned by its planners only if it functions as a system and not as a collection of independent observation programs. Development and implementation of a comprehensive data management strategy will be key to the success of this effort. AON planners envision an ideal data management system that includes a portal through which scientists can submit metadata and datasets at a single location; search the complete archive and find all data relevant to a location or process; all data have browse imagery and complete documentation; time series or fields can be plotted online; and all data are in a relational database so that multiple data sets and sources can be queried and retrieved. The Cooperative Arctic Data and Information Service (CADIS) will provide near-real-time data delivery, a long-term repository for data, a portal for data discovery, and tools to manipulate data by building on existing tools like the Unidata Integrated Data Viewer (IDV). Our approach to the data integration challenge is to start by asking investigators to provide metadata via a general-purpose user interface. An entry tool assists PIs in writing metadata and submitting data. Data can be submitted to the archive in NetCDF with Climate and Forecast conventions or in one of several other standard formats where possible. CADIS is a joint effort of the University Corporation for Atmospheric Research (UCAR), the National Snow and Ice Data Center (NSIDC), and the National Center for Atmospheric Research (NCAR). In the first year, we are concentrating on establishing metadata protocols that are compatible with international standards, and on demonstrating data submission, search, and visualization tools with a subset of AON data. These capabilities will be expanded in years 2 and 3. By working with AON investigators and by using evolving conventions for in situ data formats as they mature, we hope to bring CADIS to the full level of data integration imagined by AON planners. The CADIS development will be described in terms of challenges, implementation strategies, and progress to date. The developers are making a conscious effort to integrate this system and its data holdings with the complementary efforts in the SEARCH and IPY programs. The interdisciplinary content of the data, the variations in format and documentation, as well as its geographic coverage across the Arctic Basin all impact the form and effectiveness of the CADIS system architecture. Solutions to the complexity of implementing a comprehensive data management strategy implied by this diversity will be a focus of the presentation.
Shirazi, Fazal; Ferreira, Jose A. G.; Stevens, David A.; Clemons, Karl V.; Kontoyiannis, Dimitrios P.
2016-01-01
Pseudomonas aeruginosa (Pa) and Aspergillus fumigatus (Af) colonize cystic fibrosis (CF) patient airways. Pa culture filtrates inhibit Af biofilms, and Pa non-CF, mucoid (Muc-CF) and nonmucoid CF (NMuc-CF) isolates form an ascending inhibitory hierarchy. We hypothesized this activity is mediated through apoptosis induction. One Af and three Pa (non-CF, Muc-CF, NMuc-CF) reference isolates were studied. Af biofilm was formed in 96 well plates for 16 h ± Pa biofilm filtrates. After 24 h, apoptosis was characterized by viability dye DiBAc, reactive oxygen species (ROS) generation, mitochondrial membrane depolarization, DNA fragmentation and metacaspase activity. Muc-CF and NMuc-CF filtrates inhibited and damaged Af biofilm (p<0.0001). Intracellular ROS levels were elevated (p<0.001) in NMuc-CF-treated Af biofilms (3.7- fold) compared to treatment with filtrates from Muc-CF- (2.5- fold) or non-CF Pa (1.7- fold). Depolarization of mitochondrial potential was greater upon exposure to NMuc-CF (2.4-fold) compared to Muc-CF (1.8-fold) or non-CF (1.25-fold) (p<0.0001) filtrates. Exposure to filtrates resulted in more DNA fragmentation in Af biofilm, compared to control, mediated by metacaspase activation. In conclusion, filtrates from CF-Pa isolates were more inhibitory against Af biofilms than from non-CF. The apoptotic effect involves mitochondrial membrane damage associated with metacaspase activation. PMID:26930399
Improvements to the Ontology-based Metadata Portal for Unified Semantics (OlyMPUS)
NASA Astrophysics Data System (ADS)
Linsinbigler, M. A.; Gleason, J. L.; Huffer, E.
2016-12-01
The Ontology-based Metadata Portal for Unified Semantics (OlyMPUS), funded by the NASA Earth Science Technology Office Advanced Information Systems Technology program, is an end-to-end system designed to support Earth Science data consumers and data providers, enabling the latter to register data sets and provision them with the semantically rich metadata that drives the Ontology-Driven Interactive Search Environment for Earth Sciences (ODISEES). OlyMPUS complements the ODISEES data discovery system with an intelligent tool to enable data producers to auto-generate semantically enhanced metadata and upload it to the metadata repository that drives ODISEES. Like ODISEES, the OlyMPUS metadata provisioning tool leverages robust semantics, a NoSQL database and query engine, and an automated reasoning engine that performs first- and second-order deductive inferencing, and it uses a controlled vocabulary to support data interoperability and automated analytics. The ODISEES data discovery portal leverages this metadata to provide a seamless data discovery and access experience for data consumers who are interested in comparing and contrasting the multiple Earth science data products available across NASA data centers. OlyMPUS will support scientists' services and tools for performing complex analyses and identifying correlations and non-obvious relationships across all types of Earth System phenomena using the full spectrum of NASA Earth Science data available. By providing an intelligent discovery portal that supplies users - both human users and machines - with detailed information about data products, their contents, and their structure, ODISEES will reduce the level of effort required to identify and prepare large volumes of data for analysis. This poster will explain how OlyMPUS leverages deductive reasoning and other technologies to create an integrated environment for generating and exploiting semantically rich metadata.
Chapter 35: Describing Data and Data Collections in the VO
NASA Astrophysics Data System (ADS)
Kent, B. R.; Hanisch, R. J.; Williams, R. D.
The list of numbers: 19.22, 17.23, 18.11, 16.98, and 15.11, is of little intrinsic interest without information about the context in which they appear. For instance, are these daily closing stock prices for your favorite investment, or are they hourly photometric measurements of an increasingly bright quasar? The information needed to define this context is called metadata. Metadata are data about data. Astronomers are familiar with metadata through the headers of FITS files and the names and units associated with columns in a table or database. In the VO, metadata describe the contents of tables, images, and spectra, as well as aggregate collections of data (archives, surveys) and computational services. Moreover, VO metadata are constructed according to rules that avoid ambiguity and make it clear whether, in the example above, the stock prices are in dollars or euros, or the photometry is Johnson V or Sloan g. Organization of data is important in any scientific discipline. Equally crucial are the descriptions of that data: the organization publishing the data, its creator or the person making it available, what instruments were used, units assigned to measurement, calibration status, and data quality assessment. The Virtual Observatory metadata scheme not only applies to datasets, but to resources as well, including data archive facilities, searchable web forms, and online analysis and display tools. Since the scientific output flowing from large datasets depends greatly on how well the data are described, it is important for users to understand the basics of the metadata scheme in order to locate the data that they want and use it correctly. Metadata are the key to data discovery and data and service interoperability in the Virtual Observatory.
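A small sketch, assuming the Astropy library, of the familiar FITS-header metadata mentioned above; the file name and the particular keywords are assumptions made for illustration.

# Sketch: read header metadata (what the numbers are, their units, the filter) from a FITS file.
from astropy.io import fits

with fits.open("quasar_photometry.fits") as hdul:   # placeholder file name
    hdr = hdul[0].header
    print(hdr.get("OBJECT"), hdr.get("BUNIT"), hdr.get("FILTER"))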
Sensor metadata blueprints and computer-aided editing for disciplined SensorML
NASA Astrophysics Data System (ADS)
Tagliolato, Paolo; Oggioni, Alessandro; Fugazza, Cristiano; Pepe, Monica; Carrara, Paola
2016-04-01
The need for continuous, accurate, and comprehensive environmental knowledge has led to an increase in sensor observation systems and networks. The Sensor Web Enablement (SWE) initiative has been promoted by the Open Geospatial Consortium (OGC) to foster interoperability among sensor systems. The provision of metadata according to the prescribed SensorML schema is a key component for achieving this; nevertheless, the availability of correct and exhaustive metadata cannot be taken for granted. On the one hand, it is awkward for users to provide sensor metadata because of the lack of user-oriented, dedicated tools. On the other hand, the specification of invariant information for a given sensor category or model (e.g., observed properties and units of measurement, manufacturer information, etc.) can be labor- and time-consuming. Moreover, the provision of these details is error-prone and subjective, i.e., it may differ greatly across distinct descriptions of the same system. We provide a user-friendly, template-driven metadata authoring tool composed of a backend web service and an HTML5/JavaScript client. This results in a form-based user interface that conceals the high complexity of the underlying format. This tool also allows for plugging in external data sources providing authoritative definitions for the aforementioned invariant information. Leveraging these functionalities, we compiled a set of SensorML profiles, that is, sensor metadata blueprints allowing end users to focus only on the metadata items that are related to their specific deployment. The natural extension of this scenario is the involvement of end users and sensor manufacturers in the crowd-sourced evolution of this collection of prototypes. We describe the components and workflow of our framework for computer-aided management of sensor metadata.
Boulos, Maged N; Roudsari, Abdul V; Carson, Ewart R
2002-07-01
HealthCyberMap (http://healthcybermap.semanticweb.org/) aims at mapping Internet health information resources in novel ways for enhanced retrieval and navigation. This is achieved by collecting appropriate resource metadata in an unambiguous form that preserves semantics. We modelled a qualified Dublin Core (DC) metadata set ontology with extra elements for resource quality and geographical provenance in Protégé-2000. A metadata collection form helps acquire resource instance data within Protégé. The DC subject field is populated with UMLS terms directly imported from the UMLS Knowledge Source Server using UMLS tab, a Protégé-2000 plug-in. The project is saved in RDFS/RDF. The ontology and associated form serve as a free tool for building and maintaining an RDF medical resource metadata base. The UMLS tab enables browsing and searching for concepts that best describe a resource, and importing them to DC subject fields. The resultant metadata base can be used with a search and inference engine, and have textual and/or visual navigation interface(s) applied to it, to ultimately build a medical Semantic Web portal. Different ways of exploiting Protégé-2000 RDF output are discussed. By making the context and semantics of resources, not merely their raw text and formatting, amenable to computer 'understanding,' we can build a Semantic Web that is more useful to humans than the current Web. This requires proper use of metadata and ontologies. Clinical codes can reliably describe the subjects of medical resources, establish the semantic relationships (as defined by the underlying coding scheme) between related resources, and automate their topical categorisation.
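A minimal sketch, assuming the rdflib Python library, of the kind of qualified-Dublin-Core RDF record described above; the namespace, subject term, and the quality/provenance properties are placeholders rather than HealthCyberMap's actual vocabulary.

# Sketch: a Dublin Core based RDF description of one health resource.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DC, RDF

HCM = Namespace("http://healthcybermap.example.org/terms#")  # hypothetical namespace

g = Graph()
res = URIRef("http://example.org/resource/asthma-guideline")
g.add((res, RDF.type, HCM.HealthResource))
g.add((res, DC.title, Literal("Asthma management guideline")))
g.add((res, DC.subject, Literal("Asthma (UMLS C0004096)")))   # clinical code as subject
g.add((res, HCM.qualityRating, Literal("peer-reviewed")))      # illustrative quality element
g.add((res, HCM.provenanceCountry, Literal("UK")))             # illustrative provenance element

print(g.serialize(format="xml"))   # RDF/XML output, analogous to the RDFS/RDF project file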
NASA Astrophysics Data System (ADS)
San Gil, Inigo; White, Marshall; Melendez, Eda; Vanderbilt, Kristin
The thirty-year-old United States Long Term Ecological Research Network has developed extensive metadata to document its scientific data. Standard and interoperable metadata is a core component of the data-driven analytical solutions developed by this research network. Content management systems offer an affordable solution for rapid deployment of metadata-centered information management systems. We developed a customized integrative metadata management system based on the Drupal content management system technology. Building on knowledge and experience with the Sevilleta and Luquillo Long Term Ecological Research sites, we successfully deployed the first two medium-scale customized prototypes. In this paper, we describe the vision behind our Drupal-based information management instances and list the features offered through these Drupal-based systems. We also outline the plans to expand the information services offered through these metadata-centered management systems. We will conclude with the growing list of participants deploying similar instances.
NASA Astrophysics Data System (ADS)
Baumann, Peter
2013-04-01
There is a traditional saying that metadata are understandable, semantic-rich, and searchable. Data, on the other hand, are big, with no accessible semantics, and just downloadable. Not only has this led to an imbalance of search support from a user perspective, but also, underneath, to a deep technology divide, with relational databases often used for metadata and bespoke archive solutions for data. Our vision is that this barrier will be overcome and that data and metadata will become equally searchable, leveraging the potential of semantic technologies in combination with scalability technologies. Ultimately, in this vision, ad-hoc processing and filtering will no longer be distinguished, forming a uniformly accessible data universe. In the European EarthServer initiative, we work towards this vision by federating database-style raster query languages with metadata search and geo broker technology. We present the approach taken, how it leverages OGC standards, the benefits envisaged, and first results.
Distributed metadata in a high performance computing environment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bent, John M.; Faibish, Sorin; Zhang, Zhenhua
A computer-executable method, system, and computer program product for managing metadata in a distributed storage system, wherein the distributed storage system includes one or more burst buffers enabled to operate with a distributed key-value store, the computer-executable method, system, and computer program product comprising receiving a request for metadata associated with a block of data stored in a first burst buffer of the one or more burst buffers in the distributed storage system, wherein the metadata is associated with a key-value, determining which of the one or more burst buffers stores the requested metadata, and upon determination that a first burst buffer of the one or more burst buffers stores the requested metadata, locating the key-value in a portion of the distributed key-value store accessible from the first burst buffer.
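A toy sketch (not the patented implementation) of the lookup step described above: hash the block identifier to decide which burst buffer's key-value store should hold the metadata, then fetch it there. The hashing scheme and data layout are assumptions for illustration.

# Toy model: deterministic placement and lookup of block metadata across burst buffers.
import hashlib

class BurstBuffer:
    def __init__(self):
        self.kv = {}            # stand-in for the buffer's local key-value store

buffers = [BurstBuffer() for _ in range(4)]

def owner(block_id: str) -> BurstBuffer:
    """Map a data block's metadata key to one burst buffer."""
    h = int(hashlib.md5(block_id.encode()).hexdigest(), 16)
    return buffers[h % len(buffers)]

def put_metadata(block_id: str, meta: dict) -> None:
    owner(block_id).kv[block_id] = meta

def get_metadata(block_id: str) -> dict:
    return owner(block_id).kv.get(block_id, {})

put_metadata("block-0042", {"size": 1048576, "owner": "sim_run_17"})
print(get_metadata("block-0042"))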
Metadata-Driven SOA-Based Application for Facilitation of Real-Time Data Warehousing
NASA Astrophysics Data System (ADS)
Pintar, Damir; Vranić, Mihaela; Skočir, Zoran
Service-oriented architecture (SOA) has already been widely recognized as an effective paradigm for achieving integration of diverse information systems. SOA-based applications can cross boundaries of platforms, operating systems, and proprietary data standards, commonly through the usage of Web Services technology. On the other hand, metadata is also commonly referred to as a potential integration tool, given that standardized metadata objects can provide useful information about the specifics of unknown information systems with which one has an interest in communicating, using an approach commonly called "model-based integration". This paper presents the results of research regarding possible synergy between those two integration facilitators. This is accomplished with a vertical example of a metadata-driven SOA-based business process that provides ETL (Extraction, Transformation and Loading) and metadata services to a data warehousing system in need of real-time ETL support.
Life+ EnvEurope DEIMS - improving access to long-term ecosystem monitoring data in Europe
NASA Astrophysics Data System (ADS)
Kliment, Tomas; Peterseil, Johannes; Oggioni, Alessandro; Pugnetti, Alessandra; Blankman, David
2013-04-01
Long-term ecological research (LTER) studies aim at detecting environmental changes and analysing their drivers. In this respect LTER Europe provides a network of about 450 sites and platforms. However, data on various types of ecosystems and at a broad geographical scale are still not easily available. Managing data resulting from long-term observations is therefore one of the important tasks not only for an LTER site itself but also on the network level. Exchanging and sharing the information within a wider community is a crucial objective in the upcoming years. Due to the fragmented nature of long-term ecological research and monitoring (LTER) in Europe - and also on the global scale - information management has to face several challenges: distributed data sources, heterogeneous data models, heterogeneous data management solutions, and the complex domain of ecosystem monitoring with regard to the resulting data. The Life+ EnvEurope project (2010-2013) provides a case study for a workflow using data from the distributed network of LTER-Europe sites. In order to enhance discovery, evaluation, and access to data, the EnvEurope Drupal Ecological Information Management System (DEIMS) has been developed. This is based on the first official release of the Drupal metadata editor developed by US LTER. EnvEurope DEIMS consists of three main components: 1) Metadata editor: a web-based client interface to manage metadata of three information resource types - datasets, persons, and research sites. A metadata model describing datasets based on the Ecological Metadata Language (EML) was developed within the initial phase of the project. A crosswalk to the INSPIRE metadata model was implemented to align with ongoing European activities. Person and research site metadata models defined within LTER Europe were adapted to the project's needs. The three metadata models are interconnected within the system in order to provide an easy way for users to navigate among related resources. 2) Discovery client: provides several search profiles for datasets, persons, research sites, and external resources commonly used in the domain, e.g. the Catalogue of Life, based on several search patterns ranging from simple full-text search and glossary browsing to categorized faceted search. 3) Geo-Viewer: a map client that portrays boundaries and centroids of the research sites as Web Map Service (WMS) layers. Each layer provides a link to both the Metadata editor and the Discovery client in order to create or discover metadata describing the data collected at the individual research site. Sharing of the dataset metadata with DEIMS is ensured in two ways: XML export of individual metadata records according to the EML schema for inclusion in the international DataONE network, and periodic harvesting of metadata into a GeoNetwork catalogue, thus providing a Catalogue Service for the Web (CSW) that can be invoked by remote clients. The final version of DEIMS will be a pilot implementation for the information system of LTER-Europe, which should establish a common information management framework within the European ecosystem research domain and provide valuable environmental information to other European information infrastructures such as SEIS, Copernicus, and INSPIRE.
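A hedged example of the catalogue interface mentioned above: a standard OGC CSW 2.0.2 GetRecords request sent over plain HTTP with the requests library. The endpoint URL is a placeholder, not the project's real catalogue address, and parameter spellings should be checked against the target server.

# Sketch: full-text discovery against a CSW catalogue using standard KVP parameters.
import requests

CSW_URL = "https://deims.example.org/geonetwork/srv/eng/csw"  # illustrative endpoint

params = {
    "service": "CSW",
    "version": "2.0.2",
    "request": "GetRecords",
    "typeNames": "csw:Record",
    "resultType": "results",
    "elementSetName": "summary",
    "constraintLanguage": "CQL_TEXT",
    "constraint_language_version": "1.1.0",
    "constraint": "AnyText LIKE '%LTER%'",   # full-text filter on the metadata
    "maxRecords": "10",
}

resp = requests.get(CSW_URL, params=params, timeout=30)
print(resp.status_code, resp.headers.get("Content-Type"))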
The STP (Solar-Terrestrial Physics) Semantic Web based on the RSS1.0 and the RDF
NASA Astrophysics Data System (ADS)
Kubo, T.; Murata, K. T.; Kimura, E.; Ishikura, S.; Shinohara, I.; Kasaba, Y.; Watari, S.; Matsuoka, D.
2006-12-01
In the Solar-Terrestrial Physics (STP) community, it is pointed out that circulation and utilization of observation data among researchers are insufficient. To achieve interdisciplinary research, we need to overcome these circulation and utilization problems. Against this background, the authors' group has developed a world-wide database that manages metadata of satellite and ground-based observation data files. It is noted that retrieving metadata from the observation data and registering it to the database have so far been carried out by hand. Our goal is to establish the STP Semantic Web. The Semantic Web provides a common framework that allows a variety of data to be shared and reused across applications, enterprises, and communities. We also expect that secondary information related to observations, such as event information and associated news, will be shared over the network. The most fundamental issue in establishing it is who generates, manages, and provides metadata in the Semantic Web. We developed an automatic metadata collection system for the observation data using RSS (RDF Site Summary) 1.0. RSS1.0 is an XML-based markup language built on the RDF (Resource Description Framework) and designed for syndicating news and the contents of news-like sites. RSS1.0 is used to describe STP metadata, such as the data file name, file server address, and observation date. To describe STP metadata beyond the RSS1.0 vocabulary, we defined original vocabularies for STP resources using RDF Schema. The RDF describes technical terms of the STP along with the Dublin Core Metadata Element Set, a standard for cross-domain information resource descriptions. Researchers' information on the STP is described with FOAF, an RDF/XML vocabulary for creating machine-readable metadata about people. Using RSS1.0 as a metadata distribution method, the workflow from retrieving metadata to registering it in the database is automated. This technique is applied to several database systems, such as the DARTS database system and the NICT Space Weather Report Service. DARTS is a science database managed by ISAS/JAXA in Japan. We succeeded in generating and collecting the metadata automatically for CDF (Common Data Format) data, such as Reimei satellite data, provided by DARTS. We also created an RDF service for space weather reports and real-time global MHD simulation 3D data provided by NICT. Our Semantic Web system works as follows: the RSS1.0 documents generated on the data sites (ISAS and NICT) are automatically collected by a metadata collection agent. The RDF documents are registered, and the agent extracts metadata to store in Sesame, an open-source RDF database with support for RDF Schema inferencing and querying. The RDF database provides advanced retrieval processing that takes properties and relations into account. Finally, the STP Semantic Web provides automatic processing and high-level search not only for observation data but also for space weather news, physical events, technical terms, and researcher information related to the STP.
Report from the International Conference on Dublin Core and Metadata Applications, 2001.
ERIC Educational Resources Information Center
Sugimoto, Shigeo; Adachi, Jun; Baker, Thomas; Weibel, Stuart
This paper describes the International Conference on Dublin Core and Metadata Applications 2001 (DC-2001), the ninth major workshop of the Dublin Core Metadata Initiative (DCMI), which was held in Tokyo in October 2001. DC-2001 was a week-long event that included both a workshop and a conference. In the tradition of previous events, the workshop…
Using Open Educational Resources in Course Syllabi
ERIC Educational Resources Information Center
Andreatos, Antonios; Katsoulis, Stavros
2012-01-01
The purpose of this article is (1) to review the advantages of using learning objects (LOs) and open educational resources (OER), (2) to propose the enrichment of course syllabi with LOs/OER, (3) to propose new fields to be included in metadata and ways for embedding metadata in LOs/OER, (4) to review the problem of lack of metadata in Web 2.0…
Metadata Repository for Improved Data Sharing and Reuse Based on HL7 FHIR.
Ulrich, Hannes; Kock, Ann-Kristin; Duhm-Harbeck, Petra; Habermann, Jens K; Ingenerf, Josef
2016-01-01
Unreconciled data structures and formats are a common obstacle to the urgently required sharing and reuse of data within healthcare and medical research. Within the North German Tumor Bank of Colorectal Cancer, clinical and sample data, based on a harmonized data set, are collected and can be pooled by using a hospital-integrated Research Data Management System supporting biobank and study management. Adding further partners who are not using the core data set requires manual adaptations and mapping of data elements. To reduce this manual intervention and to support the reuse of heterogeneous healthcare instance data (value level) and data elements (metadata level), a metadata repository has been developed. The metadata repository is an ISO 11179-3 conformant server application built for annotating and mediating data elements. The implemented architecture includes the translation of metadata information about data elements into the FHIR standard using the FHIR DataElement resource with the ISO 11179 Data Element Extensions. The FHIR-based processing allows exchange of data elements with clinical and research IT systems as well as with other metadata systems. As more data elements are annotated and harmonized, data quality and integration can be improved, enabling data analytics and decision support.
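A rough, heavily hedged sketch of what a single annotated data element might look like when rendered along the lines of the (STU3-era) FHIR DataElement resource; the field names and content are recollections and assumptions made for illustration, not a verified rendering from the repository described above.

# Illustrative only: a minimal DataElement-like structure for one clinical data element.
import json

data_element = {
    "resourceType": "DataElement",          # STU3-era resource; assumed structure
    "status": "active",
    "name": "TumorGradeColorectal",          # hypothetical element name
    "element": [{
        "path": "TumorGradeColorectal",
        "label": "Histopathological tumor grade (colorectal)",
        "definition": "Grade of tumor differentiation as assessed by pathology.",
        "type": [{"code": "CodeableConcept"}],
    }],
}

print(json.dumps(data_element, indent=2))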
Desaules, André
2012-11-01
It is crucial for environmental monitoring to fully control temporal bias, which is the distortion of real data evolution by varying bias through time. Temporal bias cannot be fully controlled by statistics alone but requires appropriate and sufficient metadata, which should be under rigorous and continuous quality assurance and control (QA/QC) to reliably document the degree of consistency of the monitoring system. All presented strategies to detect and control temporal data bias (QA/QC, harmonisation/homogenisation/standardisation, mass balance approach, use of tracers and analogues and control of changing boundary conditions) rely on metadata. The Will Rogers phenomenon, due to subsequent reclassification, is a particular source of temporal data bias introduced to environmental monitoring here. Sources and effects of temporal data bias are illustrated by examples from the Swiss soil monitoring network. The attempt to make a comprehensive compilation and assessment of required metadata for soil contamination monitoring reveals that most metadata are still far from being reliable. This leads to the conclusion that progress in environmental monitoring means further development of the concept of environmental metadata for the sake of temporal data bias control as a prerequisite for reliable interpretations and decisions.
Transformation of HDF-EOS metadata from the ECS model to ISO 19115-based XML
NASA Astrophysics Data System (ADS)
Wei, Yaxing; Di, Liping; Zhao, Baohua; Liao, Guangxuan; Chen, Aijun
2007-02-01
Nowadays, geographic data, such as NASA's Earth Observing System (EOS) data, are playing an increasing role in many areas, including academic research, government decisions, and even people's everyday lives. As the quantity of geographic data becomes increasingly large, a major problem is how to fully make use of such data in a distributed, heterogeneous network environment. In order for a user to effectively discover and retrieve the specific information that is useful, the geographic metadata should be described and managed properly. Fortunately, the emergence of XML and Web Services technologies greatly promotes information distribution across the Internet. The research effort discussed in this paper presents a method and its implementation for transforming Hierarchical Data Format (HDF)-EOS metadata from the NASA ECS model to ISO 19115-based XML, which will be managed by the Open Geospatial Consortium (OGC) Catalogue Service for the Web (CSW). Using XML and international standards rather than domain-specific models to describe the metadata of those HDF-EOS data, and further using CSW to manage the metadata, can allow metadata information to be searched and interchanged more widely and easily, thus promoting the sharing of HDF-EOS data.
Development of an open metadata schema for prospective clinical research (openPCR) in China.
Xu, W; Guan, Z; Sun, J; Wang, Z; Geng, Y
2014-01-01
In China, deployment of electronic data capture (EDC) and clinical data management systems (CDMS) for clinical research (CR) is in its very early stage, and about 90% of clinical studies collected and submitted clinical data manually. This work aims to build an open metadata schema for Prospective Clinical Research (openPCR) in China based on openEHR archetypes, in order to help Chinese researchers easily create specific data entry templates for registration, study design, and clinical data collection. The Singapore Framework for Dublin Core Application Profiles (DCAP) is used to develop openPCR, and four steps are followed: defining the core functional requirements and deducing the core metadata items; developing archetype models; defining metadata terms and creating archetype records; and finally developing the implementation syntax. The core functional requirements are divided into three categories: requirements for research registration, requirements for trial design, and requirements for case report forms (CRFs). 74 metadata items are identified and their Chinese authority names are created. The minimum metadata set of openPCR includes 3 documents, 6 sections, 26 top-level data groups, 32 lower data groups, and 74 data elements. The top-level container in openPCR is composed of public document, internal document, and clinical document archetypes. A hierarchical structure of openPCR is established according to the Data Structure of Electronic Health Record Architecture and Data Standard of China (Chinese EHR Standard). Metadata attributes are grouped into six parts: identification, definition, representation, relation, usage guides, and administration. OpenPCR is an open metadata schema based on research registration standards, standards of the Clinical Data Interchange Standards Consortium (CDISC), and Chinese healthcare-related standards, and is to be publicly available throughout China. It considers future integration of EHRs and CR by adopting the data structure and data terms of the Chinese EHR Standard. Archetypes in openPCR are modular models and can be separated, recombined, and reused. The authors recommend that the method used to develop openPCR can be referenced by other countries when designing metadata schemas for clinical research. In the next steps, openPCR should be used in a number of CR projects to test its applicability and to continuously improve its coverage. In addition, a metadata schema for research protocols can be developed to structure and standardize protocols, and the syntactic interoperability of openPCR with other related standards can be considered.
The PEcAn Project: Accessible Tools for On-demand Ecosystem Modeling
NASA Astrophysics Data System (ADS)
Cowdery, E.; Kooper, R.; LeBauer, D.; Desai, A. R.; Mantooth, J.; Dietze, M.
2014-12-01
Ecosystem models play a critical role in understanding the terrestrial biosphere and forecasting changes in the carbon cycle; however, current forecasts have considerable uncertainty. The amount of data being collected and produced is increasing on a daily basis as we enter the "big data" era, but only a fraction of these data is being used to constrain models. Until we address the problems of model accessibility and model-data communication, none of these resources can be used to their full potential. The Predictive Ecosystem Analyzer (PEcAn) is an ecoinformatics toolbox and a set of workflows that wrap around an ecosystem model and manage the flow of information in and out of regional-scale terrestrial biosphere models (TBMs). Here we present new modules developed in PEcAn to manage the processing of meteorological data, one of the primary driver dependencies for ecosystem models. The module downloads, reads, extracts, and converts meteorological observations to the Unidata Climate and Forecast (CF) NetCDF community standard, a convention used for most climate forecast and weather models. The module also automates the conversion from NetCDF to model-specific formats, including basic merging, gap-filling, and downscaling procedures. PEcAn currently supports tower-based micrometeorological observations at AmeriFlux and FLUXNET sites, site-level CSV-formatted data, and regional and global reanalysis products such as the North American Regional Reanalysis and CRU-NCEP. The workflow is easily extensible to additional products and processing algorithms. These meteorological workflows have been coupled with the PEcAn web interface and now allow anyone to run multiple ecosystem models for any location on the Earth by simply clicking on an intuitive Google Maps-based interface. This will allow users to more readily compare models to observations at those sites, leading to better calibration and validation. Current work is extending these workflows to also process field, remotely sensed, and historical observations of vegetation composition and structure. The processing of heterogeneous met and veg data within PEcAn is made possible using the Brown Dog cyberinfrastructure tools for unstructured data.
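A minimal sketch, assuming the netCDF4 and NumPy Python libraries, of what writing a meteorological driver variable with CF-style metadata looks like (standard_name, canonical units, and a CF time coordinate); the file name, values, and reference date are made up, and this is not PEcAn's actual code.

# Sketch: write one CF-style meteorological variable to netCDF.
import numpy as np
from netCDF4 import Dataset

with Dataset("met_driver.nc", "w") as nc:
    nc.Conventions = "CF-1.6"
    nc.createDimension("time", None)

    time = nc.createVariable("time", "f8", ("time",))
    time.units = "days since 2004-01-01 00:00:00"   # CF time coordinate
    time.calendar = "standard"
    time[:] = np.arange(3)

    tair = nc.createVariable("air_temperature", "f4", ("time",))
    tair.standard_name = "air_temperature"   # CF standard name
    tair.units = "K"                          # CF canonical units
    tair[:] = [271.3, 272.8, 270.1]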
Attitudes and Decision Making Related to Pregnancy Among Young Women with Cystic Fibrosis.
Kazmerski, Traci M; Gmelin, Theresa; Slocum, Breonna; Borrero, Sonya; Miller, Elizabeth
2017-04-01
Introduction: The number of female patients with CF able to consider pregnancy has increased with improved therapies. This study explored attitudes and decision making regarding pregnancy among young women with CF. Methods: Twenty-two women with CF ages 18-30 years completed semi-structured, in-person interviews exploring experiences with preconception counseling and reproductive care in the CF setting. Interviews were audio-recorded, transcribed, and coded using a thematic analysis approach. Results: Participants indicated CF is a major factor in pregnancy decision making. Although women acknowledged that CF influences attitudes toward pregnancy, many expressed confusion about how CF can affect fertility/pregnancy. Many perceived disapproval from CF providers regarding pregnancy and were dissatisfied with reproductive care in the CF setting. Discussion: Young female patients with CF reported poor understanding of the effect of CF on fertility and pregnancy and limited preconception counseling in CF care. Improvements in female sexual and reproductive health care in CF are warranted.
Assani, Kaivon; Tazi, Mia F.; Amer, Amal O.; Kopp, Benjamin T.
2014-01-01
Burkholderia cenocepacia is a virulent pathogen that causes significant morbidity and mortality in patients with cystic fibrosis (CF), survives intracellularly in macrophages, and uniquely causes systemic infections in CF. Autophagy is a physiologic process that involves engulfing non-functional organelles and proteins and delivering them for lysosomal degradation, but also plays a role in eliminating intracellular pathogens, including B. cenocepacia. Autophagy is defective in CF but can be stimulated in murine CF models leading to increased clearance of B. cenocepacia, but little is known about autophagy stimulation in human CF macrophages. IFN-γ activates macrophages and increases antigen presentation while also inducing autophagy in macrophages. We therefore, hypothesized that treatment with IFN-γ would increase autophagy and macrophage activation in patients with CF. Peripheral blood monocyte derived macrophages (MDMs) were obtained from CF and non-CF donors and subsequently infected with B. cenocepacia. Basal serum levels of IFN-γ were similar between CF and non-CF patients, however after B. cenocepacia infection there is deficient IFN-γ production in CF MDMs. IFN-γ treated CF MDMs demonstrate increased co-localization with the autophagy molecule p62, increased autophagosome formation, and increased trafficking to lysosomes compared to untreated CF MDMs. Electron microscopy confirmed IFN-γ promotes double membrane vacuole formation around bacteria in CF MDMs, while only single membrane vacuoles form in untreated CF cells. Bacterial burden is significantly reduced in autophagy stimulated CF MDMs, comparable to non-CF levels. IL-1β production is decreased in CF MDMs after IFN-γ treatment. Together, these results demonstrate that IFN-γ promotes autophagy-mediated clearance of B. cenocepacia in human CF macrophages. PMID:24798083
NASA Astrophysics Data System (ADS)
Hudspeth, W. B.; Budge, A.
2013-12-01
There is widespread recognition within the public health community that ongoing changes in climate are expected to increasingly pose threats to human health. Environmentally induced health risks to populations with respiratory illnesses are a growing concern globally. Of particular concern are dust and smoke events carrying PM2.5 and PM10 particle sizes, ozone, and pollen. There is considerable interest in documenting the precise linkages between changing patterns in the climate and how these shifts impact the prevalence of respiratory illnesses. The establishment of these linkages can drive the development of early warning and forecasting systems to alert health care professionals of impending air-quality events. As a component of a larger NASA-funded project on Integration of Airborne Dust Prediction Systems and Vegetation Phenology to Track Pollen for Asthma Alerts in Public Health Decision Support Systems, the Earth Data Analysis Center (EDAC) at the University of New Mexico is developing web-based visualization and analysis services for forecasting pollen concentration data. This decision-support system, New Mexico's Environmental Public Health Tracking System (NMEPHTS), funded by the Centers for Disease Control and Prevention (CDC) Environmental Public Health Tracking Network (EPHTN), aims to improve health awareness and services by linking health effects data with levels and frequency of environmental exposure. The forecast of atmospheric events with high pollen concentrations has employed a modified version of DREAM (Dust Regional Atmospheric Model), a verified model for atmospheric dust transport modeling. In this application, PREAM (Pollen Regional Atmospheric Model) models pollen emission using a MODIS-derived phenology of Juniperus spp. communities. Model outputs are verified and validated with ground-based records of pollen release timing and quantities. Outputs of the PREAM model are post-processed and archived in EDAC's Geographic Storage, Transformation, and Retrieval Engine (GStore) database. The GStore geospatial services platform provides general-purpose web services based upon the REST service model, and is capable of data discovery, access, and publication functions, metadata delivery functions, data transformation, and auto-generated OGC services for those data products that can support those services. These services are in turn ingested by New Mexico's EPHTN, where end users in the public health community can then assess environmental-public health data associations. Advances in web mapping and related technologies open new doors for data providers and users by delivering data and information in near-real time. In the public health community these technologies are being used to enhance disease and syndromic surveillance systems, visualize environmentally related events such as pollen and dust events, and provide focused mapping and analysis capabilities on the desktop. Here we present the current results of the project, focusing on the challenges encountered in providing reliable and accurate forecasts of pollen concentrations, as well as the experience of integrating output results and services into end-user applications that can provide timely and meaningful alerts and forecasts.
Chakrabarti, Apratim; Panter, Stephen N; Harrison, Kate; Jones, Jonathan D G; Jones, David A
2009-10-01
The tomato Cf-9 and Cf-9B genes both confer resistance to the leaf mold fungus Cladosporium fulvum but only Cf-9 confers seedling resistance and recognizes the avirulence (Avr) protein Avr9 produced by C. fulvum. Using domain swaps, leucine-rich repeats (LRR) 5 to 15 of Cf-9 were shown to be required for Cf-9-specific resistance to C. fulvum in tomato, and the entire N-terminus up to LRR15 of Cf-9B was shown to be required for Cf-9B-specific resistance. Finer domain swaps showed that nine amino-acid differences in LRR 13 to 15 provided sufficient Cf-9-specific residues in a Cf-9B context for recognition of Avr9 in Nicotiana tabacum or sufficient Cf-9B residues in a Cf-9 context for a novel necrotic response caused by the expression of Cf-9B in N. benthamiana. The responses conferred by LRR 13 to 15 were enhanced by addition of LRR 10 to 12, and either region of Cf-9B was found to cause necrosis in N. benthamiana when the other was replaced by Cf-9 sequence in a Cf-9B context. As a consequence, the domain swap with LRR 13 to 15 of Cf-9 in a Cf-9B context gained the dual ability to recognize Avr9 and cause necrosis in N. benthamiana. Intriguingly, two Cf-9B-specific domain swaps gave differing results for necrosis assays in N. benthamiana compared with disease resistance assays in transgenic tomato. The different domain requirements in these two cases suggest that the two assays detect unrelated ligands or detect related ligands in slightly different ways. A heat-sensitive necrosis-inducing factor present in N. benthamiana intercellular washing fluids was found to cause a necrotic response in N. tabacum plants carrying Hcr9-9A, Cf-9B, and Cf-9 but not in plants carrying only Cf-9. We postulate that this necrosis-inducing factor is recognized by Cf-9B either directly as a ligand or indirectly as a regulator of Cf-9B autoactivity.
Calculation and characteristic analysis on synergistic effect of CF3I gas mixtures
NASA Astrophysics Data System (ADS)
Su, ZHAO; Yunkun, DENG; Yuhao, GAO; Dengming, XIAO
2018-06-01
CF3I is a potential SF6 alternative gas. To study the insulation properties and synergistic effects of CF3I/N2 and CF3I/CO2 gas mixtures, the two-term approximation of the Boltzmann equation was used to obtain the ionization coefficient α, the attachment coefficient η and the critical reduced field strength (E/N)cr. The results show that the (E/N)cr of CF3I gas at 300 K is 1.2 times that of SF6, and that both CF3I/N2 and CF3I/CO2 gas mixtures exhibit a synergistic effect. The synergistic effect coefficient of the CF3I/CO2 mixture is higher than that of the CF3I/N2 mixture, but the (E/N)cr of CF3I/N2 is higher than that of CF3I/CO2 under the same conditions. When the CF3I content exceeds 20%, the (E/N)cr of both mixtures increases linearly with increasing CF3I content. The breakdown voltage of the CF3I/N2 mixture is also higher than that of the CF3I/CO2 mixture in a slightly non-uniform electric field under power-frequency voltage, although the synergistic effect coefficients of the two mixtures are basically the same.
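As a rough illustration of the quantity involved, the sketch below estimates (E/N)cr for a CF3I/CO2 mixture by mole-fraction weighting of pure-gas ionization and attachment curves and locating the zero of the effective ionization coefficient. The tabulated curves are invented placeholders, not Boltzmann-solver output; the deviation of measured mixture values from such a linear-mixing baseline is what a synergistic effect coefficient quantifies.

```python
# Minimal sketch: estimate (E/N)cr of a CF3I/CO2 mixture assuming simple linear
# mixing of the pure-gas reduced ionization (alpha/N) and attachment (eta/N)
# coefficients. The curves below are illustrative placeholders only.
import numpy as np

E_N = np.linspace(200.0, 600.0, 81)            # reduced field grid, Td

def toy_coefficients(a0, e0, slope_a, slope_e):
    """Toy monotone curves standing in for alpha/N and eta/N."""
    alpha = a0 * np.maximum(E_N - 150.0, 0.0) * slope_a
    eta = e0 / (1.0 + slope_e * (E_N - 200.0) / 400.0)
    return alpha, eta

alpha_cf3i, eta_cf3i = toy_coefficients(1.0, 8.0, 0.02, 0.5)   # strongly attaching gas
alpha_co2, eta_co2 = toy_coefficients(1.5, 1.0, 0.03, 1.0)     # weakly attaching buffer

def en_critical(x_cf3i):
    """(E/N)cr under linear mixing, i.e. assuming no synergistic effect."""
    alpha_mix = x_cf3i * alpha_cf3i + (1.0 - x_cf3i) * alpha_co2
    eta_mix = x_cf3i * eta_cf3i + (1.0 - x_cf3i) * eta_co2
    net = alpha_mix - eta_mix                   # effective ionization coefficient
    idx = np.where(np.diff(np.sign(net)))[0]    # sign change brackets (E/N)cr
    if idx.size == 0:
        return float("nan")
    i = idx[0]                                  # linear interpolation inside the bracket
    return E_N[i] - net[i] * (E_N[i + 1] - E_N[i]) / (net[i + 1] - net[i])

for x in (0.2, 0.5, 0.8):
    print(f"x(CF3I) = {x:.1f}  ->  (E/N)cr ~ {en_critical(x):.1f} Td (linear-mixing estimate)")
```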
Finel, Moshe; Pick, Uri; Selman-Reimer, Susanne; Selman, Bruce R.
1984-01-01
The isolation of the chloroplast ATP synthase complex (CF0-CF1) and of CF1 from Dunaliella bardawil is described. The subunit structure of the D. bardawil ATPase differs from that of spinach in that the D. bardawil α subunit migrates ahead of the β subunit and the ε subunit migrates ahead of subunit II of CF0 when separated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. The CF1 isolated from D. bardawil resembles the CF1 isolated from Chlamydomonas reinhardi in that a reversible, Mg2+-dependent ATPase is induced by selected organic solvents. Glycerol stimulates cyclic photophosphorylation catalyzed by D. bardawil thylakoid membranes but inhibits photophosphorylation catalyzed by spinach thylakoid membranes. Glycerol (20%) also stimulates the rate of ATP-Pi exchange catalyzed by D. bardawil CF0-CF1 proteoliposomes but inhibits the activity with the spinach enzyme. The ethanol-activated, Mg2+-ATPase of the D. bardawil CF1 is more resistant to glycerol inhibition than the octylglucoside-activated, Mg2+-ATPase of spinach CF1 or the ethanol-activated, Mg2+-dependent ATPase of the C. reinhardi CF1. Both cyclic photophosphorylation and ATP-Pi exchange catalyzed by D. bardawil CF0-CF1 are more sensitive to high concentrations of NaCl than is the spinach complex. PMID:16663507
Automated Test Methods for XML Metadata
2017-12-28
Group under RCC Task TG-147. This document (Volume VI of the RCC Document 118 series) describes procedures used for evaluating XML metadata documents...including TMATS, MDL, IHAL, and DDML documents. These documents contain specifications or descriptions of artifacts and systems of importance to...the collection and management of telemetry data. The methods defined in this report provide a means of evaluating the suitability of such a metadata
DOE Office of Scientific and Technical Information (OSTI.GOV)
2006-10-25
The purpose of the eXtended MetaData Registry (XMDR) prototype is to demonstrate the feasibility and utility of constructing an extended metadata registry, i.e., one which encompasses richer classification support, facilities for including terminologies, and better support for formal specification of semantics. The prototype registry will also serve as a reference implementation for the revised versions of ISO 11179, Parts 2 and 3 to help guide production implementations.
Inigo San Gil; Wade Sheldon; Tom Schmidt; Mark Servilla; Raul Aguilar; Corinna Gries; Tanya Gray; Dawn Field; James Cole; Jerry Yun Pan; Giri Palanisamy; Donald Henshaw; Margaret O'Brien; Linda Kinkel; Kathrine McMahon; Renzo Kottmann; Linda Amaral-Zettler; John Hobbie; Philip Goldstein; Robert P. Guralnick; James Brunt; William K. Michener
2008-01-01
The Genomic Standards Consortium (GSC) invited a representative of the Long-Term Ecological Research (LTER) to its fifth workshop to present the Ecological Metadata Language (EML) metadata standard and its relationship to the Minimum Information about a Genome/Metagenome Sequence (MIGS/MIMS) and its implementation, the Genomic Contextual Data Markup Language (GCDML)....
NASA Technical Reports Server (NTRS)
DeMore, W.B.
1996-01-01
Relative rate experiments are used to measure rate constants and temperature dependencies of the reactions of OH with CH3F (41), CH2FCl (31), CH2BrCl (30B1), CH2Br2 (30B2), CHBr3 (20B3), CF2BrCHFCl (123aB1(alpha)), and CF2ClCHCl2 (122). Rate constants for additional compounds of these types are estimated using an empirical rate constant estimation method which is based on measured rate constants for a wide range of halocarbons. The experimental data are combined with the estimated and previously reported rate constants to illustrate the effects of F, Cl, and Br substitution on OH rate constants for a series of 19 halomethanes and 25 haloethanes. Application of the estimation technique is further illustrated for some higher hydrofluorocarbons (HFCs), including CHF2CF2CF2CF2H (338pcc), CF3CHFCHFCF2CF3 (43-10mee), CF3CH2CH2CF3 (356ffa), CF3CH2CF2CH2CF3 (458mfcf), CF3CH2CHF2 (245fa), and CF3CH2CF2CH3 (365mfc). The predictions are compared with literature data for these compounds.
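To make the relative-rate idea concrete, the sketch below fits the standard relation ln([sample]0/[sample]t) = (k_sample/k_ref) × ln([ref]0/[ref]t) to synthetic depletion data and scales the slope by a known reference rate constant. All numbers are invented for illustration, not values from the paper.

```python
# Minimal sketch of a relative-rate analysis: the slope of one log-depletion
# against the other, times the known reference rate constant, gives k_sample.
import numpy as np

k_ref = 1.57e-14          # hypothetical reference OH rate constant, cm^3 molecule^-1 s^-1
ratio_true = 0.45         # hypothetical k_sample / k_ref used to generate the fake data

ln_ref = np.linspace(0.0, 1.2, 10)                    # ln([ref]_0/[ref]_t)
rng = np.random.default_rng(0)
ln_sample = ratio_true * ln_ref + rng.normal(0, 0.005, ln_ref.size)

slope, intercept = np.polyfit(ln_ref, ln_sample, 1)   # least-squares fit
k_sample = slope * k_ref
print(f"fitted k_sample/k_ref = {slope:.3f}")
print(f"estimated k_sample   = {k_sample:.2e} cm^3 molecule^-1 s^-1")
```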
Tang, Jia; Fu, Zi-Ying; Wei, Chen-Xue; Chen, Qi-Cai
2015-08-01
In constant frequency-frequency modulation (CF-FM) bats, the CF-FM echolocation signals include both CF and FM components, yet the role of such complex acoustic signals in frequency resolution by bats remains unknown. Using CF and CF-FM echolocation signals as acoustic stimuli, the responses of inferior collicular (IC) neurons of Hipposideros armiger were obtained by extracellular recording. We tested the effect of preceding CF or CF-FM sounds on the shape of the frequency tuning curves (FTCs) of IC neurons. Results showed that both CF-FM and CF sounds reduced the number of IC neurons with FTCs that had a tailed lower-frequency side. However, more IC neurons underwent this conversion after adding CF-FM sound than after adding CF sound. We also found that the Q20 value of the FTCs of IC neurons showed the largest increase with the addition of CF-FM sound. Moreover, only CF-FM sound caused an increase in the slope of the neurons' FTCs, and this increase occurred mainly at the lower-frequency edge. These results suggest that, more than CF sound, CF-FM sound can increase the accuracy of frequency analysis of the echo and cut off low-frequency elements from the bats' habitat.
Brandão, S F; Campos, T P R
2015-07-01
This article proposes a combination of californium-252 ((252)Cf) brachytherapy, boron neutron capture therapy (BNCT) and an intracavitary moderator balloon catheter applied to brain tumour and infiltrations. Dosimetric evaluations were performed on three protocol set-ups: (252)Cf brachytherapy combined with BNCT (Cf-BNCT); Cf-BNCT with a balloon catheter filled with light water (LWB) and the same set-up with heavy water (HWB). Cf-BNCT-HWB has presented dosimetric advantages to Cf-BNCT-LWB and Cf-BNCT in infiltrations at 2.0-5.0 cm from the balloon surface. However, Cf-BNCT-LWB has shown superior dosimetry up to 2.0 cm from the balloon surface. Cf-BNCT-HWB and Cf-BNCT-LWB protocols provide a selective dose distribution for brain tumour and infiltrations, mainly further from the (252)Cf source, sparing the normal brain tissue. Malignant brain tumours grow rapidly and often spread to adjacent brain tissues, leading to death. Improvements in brain radiation protocols have been continuously achieved; however, brain tumour recurrence is observed in most cases. Cf-BNCT-LWB and Cf-BNCT-HWB represent new modalities for selectively combating brain tumour infiltrations and metastasis.
Automated Transformation of CDISC ODM to OpenClinica.
Gessner, Sophia; Storck, Michael; Hegselmann, Stefan; Dugas, Martin; Soto-Rey, Iñaki
2017-01-01
Due to the increasing use of electronic data capture systems for clinical research, interest in saving resources by automatically generating and reusing case report forms in clinical studies is growing. OpenClinica, an open-source electronic data capture system, enables the reuse of metadata via its own Excel import template, which hampers the reuse of metadata defined in other standard formats. One of these standard formats is the Operational Data Model, which covers metadata, administrative and clinical data in clinical studies. This work suggests a mapping from the Operational Data Model to OpenClinica and describes the implementation of a converter that automatically generates OpenClinica-conformant case report forms based upon metadata in the Operational Data Model.
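A minimal sketch of the direction of such a conversion is shown below: ItemDef elements are read from a CDISC ODM file and emitted as rows resembling an OpenClinica items worksheet. The column mapping is deliberately simplified and the file names are hypothetical; a real converter also handles item groups, code lists and validation rules.

```python
# Minimal sketch: extract ODM ItemDef metadata and write a simplified "items" sheet.
import csv
import xml.etree.ElementTree as ET

ODM_NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}   # standard ODM 1.3 namespace

def odm_items_to_rows(odm_path):
    tree = ET.parse(odm_path)
    rows = []
    for item in tree.getroot().iter("{http://www.cdisc.org/ns/odm/v1.3}ItemDef"):
        question = item.find(".//odm:Question/odm:TranslatedText", ODM_NS)
        rows.append({
            "ITEM_NAME": item.get("Name"),
            "DESCRIPTION_LABEL": question.text.strip() if question is not None and question.text else "",
            "DATA_TYPE": (item.get("DataType") or "").upper(),   # e.g. TEXT, INTEGER, DATE
        })
    return rows

def write_items_sheet(rows, csv_path):
    # OpenClinica expects an Excel workbook; CSV stands in here for brevity.
    with open(csv_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=["ITEM_NAME", "DESCRIPTION_LABEL", "DATA_TYPE"])
        writer.writeheader()
        writer.writerows(rows)

# write_items_sheet(odm_items_to_rows("study_metadata.xml"), "items.csv")   # hypothetical files
```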
NASA Astrophysics Data System (ADS)
Ulbricht, Damian; Elger, Kirsten; Bertelmann, Roland; Klump, Jens
2016-04-01
With the foundation of DataCite in 2009 and the technical infrastructure installed in the last six years it has become very easy to create citable dataset DOIs. Nowadays, dataset DOIs are increasingly accepted and required by journals in reference lists of manuscripts. In addition, DataCite provides usage statistics [1] of assigned DOIs and offers a public search API to make research data count. By linking related information to the data, they become more useful for future generations of scientists. For this purpose, several identifier systems, such as ISBN for books, ISSN for journals, DOI for articles or related data, ORCID for authors, and IGSN for physical samples, can be attached to DOIs using the DataCite metadata schema [2]. While these are good preconditions to publish data, free and open solutions that help with the curation of data, the publication of research data, and the assignment of DOIs in one software package seem to be rare. At GFZ Potsdam we built a modular software stack that is made of several free and open software solutions and we established 'GFZ Data Services'. 'GFZ Data Services' provides storage, a metadata editor for publication and a facility to moderate minted DOIs. All software solutions are connected through web APIs, which makes it possible to reuse and integrate established software. The core component of 'GFZ Data Services' is an eSciDoc [3] middleware that is used as central storage, and has been designed along the OAIS reference model for digital preservation. Thus, data are stored in self-contained packages that are made of binary file-based data and XML-based metadata. The eSciDoc infrastructure provides access control to data and it is able to handle half-open datasets, which is useful in embargo situations when a subset of the research data are released after an adequate period. The data exchange platform panMetaDocs [4] makes use of eSciDoc's REST API to upload file-based data into eSciDoc and uses a metadata editor [5] to annotate the files with metadata. The metadata editor has a user-friendly interface with nominal lists, extensive explanations, and an interactive mapping tool to provide assistance to scientists describing the data. It is possible to deposit metadata templates to fill certain fields with default values. The metadata editor generates metadata in the schemas ISO19139, NASA GCMD DIF, and DataCite and could be extended for other schemas. panMetaDocs is able to mint dataset DOIs through DOIDB, which is our component to moderate dataset DOIs issued through 'GFZ Data Services'. DOIDB accepts metadata in the schemas ISO19139, DIF, and DataCite. In addition, DOIDB provides an OAI-PMH interface to disseminate all deposited metadata to data portals. The presentation of datasets on DOI landing pages is done through XSLT stylesheet transformation of the XML-based metadata. The landing pages have been designed to meet the needs of scientists. We are able to render the metadata to different layouts. Furthermore, additional information about datasets and publications is assembled into the webpage by querying public databases on the internet. The work presented here will focus on technical details of the software stack. [1] http://stats.datacite.org [2] http://www.dlib.org/dlib/january11/starr/01starr.html [3] http://www.escidoc.org [4] http://panmetadocs.sf.net [5] http://github.com/ulbricht
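A minimal sketch of harvesting such disseminated metadata over OAI-PMH is given below. The endpoint URL is hypothetical; oai_dc is used because every OAI-PMH repository is required to support it, whereas a real portal would typically request a richer schema such as DataCite or ISO19139.

```python
# Minimal sketch: page through an OAI-PMH ListRecords response and print titles.
import requests
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

def harvest_titles(base_url, metadata_prefix="oai_dc"):
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(requests.get(base_url, params=params, timeout=30).content)
        for record in root.iter(f"{OAI}record"):
            title = record.find(f".//{DC}title")
            if title is not None and title.text:
                yield title.text.strip()
        token = root.find(f".//{OAI}resumptionToken")   # present when more pages exist
        if token is None or not (token.text or "").strip():
            break
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}

# for t in harvest_titles("https://example.org/oai"):   # hypothetical endpoint
#     print(t)
```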
Data Discovery of Big and Diverse Climate Change Datasets - Options, Practices and Challenges
NASA Astrophysics Data System (ADS)
Palanisamy, G.; Boden, T.; McCord, R. A.; Frame, M. T.
2013-12-01
Developing data search tools is a very common, but often confusing, task for most data-intensive scientific projects. These search interfaces need to be continually improved to handle the ever increasing diversity and volume of data collections. There are many aspects which determine the type of search tool a project needs to provide to its user community. These include: number of datasets, amount and consistency of discovery metadata, ancillary information such as availability of quality information and provenance, and availability of similar datasets from other distributed sources. The Environmental Data Science and Systems (EDSS) group within the Environmental Science Division at Oak Ridge National Laboratory has a long history of successfully managing diverse and big observational datasets for various scientific programs via various data centers such as DOE's Atmospheric Radiation Measurement Program (ARM), DOE's Carbon Dioxide Information and Analysis Center (CDIAC), USGS's Core Science Analytics and Synthesis (CSAS) metadata Clearinghouse and NASA's Distributed Active Archive Center (ORNL DAAC). This talk will showcase some of the recent developments for improving data discovery within these centers. The DOE ARM program recently developed a data discovery tool which allows users to search and discover over 4000 observational datasets. These datasets are key to the research efforts related to global climate change. The ARM discovery tool features many new functions such as filtered and faceted search logic, multi-pass data selection, filtering data based on data quality, graphical views of data quality and availability, direct access to data quality reports, and data plots. The ARM Archive also provides discovery metadata to other broader metadata clearinghouses such as ESGF, IASOA, and GOS. In addition to the new interface, ARM is also currently working on providing DOI metadata records to publishers such as Thomson Reuters and Elsevier. The ARM program also provides a standards-based online metadata editor (OME) for PIs to submit their data to the ARM Data Archive. The USGS CSAS metadata Clearinghouse aggregates metadata records from several USGS projects and other partner organizations. The Clearinghouse allows users to search and discover over 100,000 biological and ecological datasets from a single web portal. The Clearinghouse has also enabled some new data discovery functions such as enhanced geo-spatial searches based on land and ocean classifications, metadata completeness rankings, data linkage via digital object identifiers (DOIs), and semantically enhanced keyword searches. The Clearinghouse is also currently working on a dashboard which allows data providers to view various statistics, such as the number of their records accessed via the Clearinghouse, the most popular keywords, metadata quality reports and the DOI creation service. The Clearinghouse also publishes metadata records to broader portals such as NSF DataONE and Data.gov. The author will also present how these capabilities are currently reused by recent and upcoming data centers such as DOE's NGEE-Arctic project. References: [1] Devarakonda, R., Palanisamy, G., Wilson, B. E., & Green, J. M. (2010). Mercury: reusable metadata management, data discovery and access system. Earth Science Informatics, 3(1-2), 87-94. [2] Devarakonda, R., Shrestha, B., Palanisamy, G., Hook, L., Killeffer, T., Krassovski, M., ... & Frame, M. (2014, October). OME: Tool for generating and managing metadata to handle BigData. In BigData Conference (pp. 8-10).
panMetaDocs and DataSync - providing a convenient way to share and publish research data
NASA Astrophysics Data System (ADS)
Ulbricht, D.; Klump, J. F.
2013-12-01
In recent years, research institutions, geological surveys and funding organizations have started to build infrastructures to facilitate the re-use of research data from previous work. At present, several intermeshed activities are coordinated to make data systems of the earth sciences interoperable and recorded data discoverable. Driven by governmental authorities, ISO19115/19139 emerged as metadata standards for discovery of data and services. Established metadata transport protocols like OAI-PMH and OGC-CSW are used to disseminate metadata to data portals. With persistent identifiers like DOI and IGSN, research data and corresponding physical samples can be given unambiguous names and thus become citable. In summary, these activities focus primarily on 'ready to give away' data, already stored in an institutional repository and described with appropriate metadata. Many datasets are not 'born' in this state but are produced in small and federated research projects. To make access and reuse of these 'small data' easier, these data should be centrally stored and version controlled from the very beginning of activities. We developed DataSync [1], a supplemental application to the panMetaDocs [2] data exchange platform, as a data management tool for small science projects. DataSync is a Java application that runs on a local computer and synchronizes directory trees into an eSciDoc-repository [3] by creating eSciDoc-objects via eSciDoc's REST API. DataSync can be installed on multiple computers and is in this way able to synchronize files of a research team over the internet. XML metadata can be added as separate files that are managed together with data files as versioned eSciDoc-objects. A project-customized instance of panMetaDocs is provided to show a web-based overview of the previously uploaded file collection and to allow further annotation with metadata inside the eSciDoc-repository. panMetaDocs is a PHP-based web application to assist the creation of metadata in any XML-based metadata schema. To reduce manual entries of metadata to a minimum and make use of contextual information in a project setting, metadata fields can be populated with static or dynamic content. Access rights can be defined to control visibility and access to stored objects. Notifications about recently updated datasets are available by RSS and e-mail and the entire inventory can be harvested via OAI-PMH. panMetaDocs is optimized to be harvested by panFMP [4]. panMetaDocs is able to mint dataset DOIs through DataCite and uses eSciDoc's REST API to transfer eSciDoc-objects from a non-public 'pending' status to the published status 'released', which makes data and metadata of the published object available worldwide through the internet. The application scenario presented here shows the adoption of open-source applications for data sharing and publication of data. An eSciDoc-repository is used as storage for data and metadata. DataSync serves as a file ingester and distributor, whereas panMetaDocs' main function is to annotate the dataset files with metadata to make them ready for publication and sharing with one's own team, or with the scientific community.
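The core of the DataSync idea, detecting which files in a directory tree are new or changed since the last run, can be sketched with a simple checksum manifest; the actual repository upload (eSciDoc REST calls in DataSync) is deliberately left out, and the manifest file name is an assumption.

```python
# Minimal sketch: checksum a directory tree and report files to (re)upload.
import hashlib
import json
from pathlib import Path

MANIFEST = Path(".sync_manifest.json")   # hypothetical local state file

def checksum(path, chunk=1 << 20):
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        while block := fh.read(chunk):
            digest.update(block)
    return digest.hexdigest()

def changed_files(root):
    previous = json.loads(MANIFEST.read_text()) if MANIFEST.exists() else {}
    current, to_upload = {}, []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest = checksum(path)
            current[str(path)] = digest
            if previous.get(str(path)) != digest:
                to_upload.append(path)          # new or modified since the last sync
    MANIFEST.write_text(json.dumps(current, indent=2))
    return to_upload

# for path in changed_files("project_data"):
#     print("would upload:", path)              # here DataSync would call the repository API
```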
NASA Astrophysics Data System (ADS)
Lily, Makroni; Baidya, Bidisha; Chandra, Asit K.
2017-02-01
Theoretical studies have been performed on the kinetics, mechanism and thermochemistry of the hydrogen abstraction reactions of CF3CF2OCH3 (HFE-245mc) and CF3CF2OCHO with the OH radical using the DFT-based M06-2X method. IRC calculations show that both hydrogen abstraction reactions proceed via a weakly bound hydrogen-bonded complex preceding the formation of the transition state. The rate coefficients calculated by canonical transition state theory with Eckart's tunnelling correction at 298 K, k1(CF3CF2OCH3 + OH) = 1.09 × 10^-14 and k2(CF3CF2OCHO + OH) = 1.03 × 10^-14 cm^3 molecule^-1 s^-1, are in very good agreement with the experimental values. The atmospheric implications of CF3CF2OCH3 and CF3CF2OCHO are also discussed.
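The structure of such a rate calculation can be sketched as below. The paper uses Eckart tunnelling with M06-2X energies and partition functions; here the cruder Wigner correction and purely illustrative parameters stand in, just to show how the pieces combine.

```python
# Minimal sketch of a canonical TST rate with a simple tunnelling correction.
import math

KB = 1.380649e-23      # J/K
H = 6.62607015e-34     # J s
C = 2.99792458e10      # cm/s
R = 8.314462618        # J/(mol K)

def wigner_kappa(imag_freq_cm, temperature):
    """Wigner tunnelling factor from the imaginary frequency of the TS."""
    x = H * C * imag_freq_cm / (KB * temperature)
    return 1.0 + x * x / 24.0

def tst_rate(prefactor, barrier_kj_mol, imag_freq_cm, temperature=298.0):
    """k(T) = kappa(T) * A * exp(-E0 / RT); A lumps the partition-function ratio."""
    kappa = wigner_kappa(imag_freq_cm, temperature)
    return kappa * prefactor * math.exp(-barrier_kj_mol * 1e3 / (R * temperature))

# Illustrative (hypothetical) parameters for an H-abstraction by OH:
k = tst_rate(prefactor=1.1e-12, barrier_kj_mol=13.5, imag_freq_cm=1300.0)
print(f"k(298 K) ~ {k:.2e} cm^3 molecule^-1 s^-1")
```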
Deck, John; Gaither, Michelle R; Ewing, Rodney; Bird, Christopher E; Davies, Neil; Meyer, Christopher; Riginos, Cynthia; Toonen, Robert J; Crandall, Eric D
2017-08-01
The Genomic Observatories Metadatabase (GeOMe, http://www.geome-db.org/) is an open access repository for geographic and ecological metadata associated with biosamples and genetic data. Whereas public databases have served as vital repositories for nucleotide sequences, they do not accession all the metadata required for ecological or evolutionary analyses. GeOMe fills this need, providing a user-friendly, web-based interface for both data contributors and data recipients. The interface allows data contributors to create a customized yet standard-compliant spreadsheet that captures the temporal and geospatial context of each biosample. These metadata are then validated and permanently linked to archived genetic data stored in the National Center for Biotechnology Information's (NCBI's) Sequence Read Archive (SRA) via unique persistent identifiers. By linking ecologically and evolutionarily relevant metadata with publicly archived sequence data in a structured manner, GeOMe sets a gold standard for data management in biodiversity science.
Metadata management and semantics in microarray repositories.
Kocabaş, F; Can, T; Baykal, N
2011-12-01
The number of microarray and other high-throughput experiments in primary repositories keeps increasing, as do the size and complexity of the results in response to biomedical investigations. Initiatives have been started on standardization of content, object model, exchange format and ontology. However, there are backlogs and an inability to exchange data between microarray repositories, which indicate a great need for a standard format and data management. We have introduced a metadata framework that includes a metadata card and semantic nets that make experimental results visible, understandable and usable. These are encoded in syntax encoding schemes and represented in RDF (Resource Description Framework); they can be integrated with other metadata cards and semantic nets, and can be exchanged, shared and queried. We demonstrated the performance and potential benefits through a case study on a selected microarray repository. We concluded that the backlogs can be reduced, and that exchange of information and knowledge discovery queries become possible, with the use of this metadata framework.
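A minimal sketch of encoding such a "metadata card" as RDF is shown below. The EX vocabulary, identifiers and values are hypothetical; only the Dublin Core terms are standard.

```python
# Minimal sketch: a microarray metadata card as an RDF graph, serialized as Turtle.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS, RDF

EX = Namespace("http://example.org/metadata-card#")     # hypothetical vocabulary

g = Graph()
experiment = URIRef("http://example.org/experiment/GSE-0001")   # hypothetical identifier
g.add((experiment, RDF.type, EX.MicroarrayExperiment))
g.add((experiment, DCTERMS.title, Literal("Example expression profiling study")))
g.add((experiment, EX.organism, Literal("Homo sapiens")))
g.add((experiment, EX.platform, Literal("Example array platform")))
g.add((experiment, DCTERMS.created, Literal("2011-06-01")))

print(g.serialize(format="turtle"))   # exchangeable, queryable representation
```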
Deck, John; Gaither, Michelle R.; Ewing, Rodney; Bird, Christopher E.; Davies, Neil; Meyer, Christopher; Riginos, Cynthia; Toonen, Robert J.; Crandall, Eric D.
2017-01-01
The Genomic Observatories Metadatabase (GeOMe, http://www.geome-db.org/) is an open access repository for geographic and ecological metadata associated with biosamples and genetic data. Whereas public databases have served as vital repositories for nucleotide sequences, they do not accession all the metadata required for ecological or evolutionary analyses. GeOMe fills this need, providing a user-friendly, web-based interface for both data contributors and data recipients. The interface allows data contributors to create a customized yet standard-compliant spreadsheet that captures the temporal and geospatial context of each biosample. These metadata are then validated and permanently linked to archived genetic data stored in the National Center for Biotechnology Information’s (NCBI’s) Sequence Read Archive (SRA) via unique persistent identifiers. By linking ecologically and evolutionarily relevant metadata with publicly archived sequence data in a structured manner, GeOMe sets a gold standard for data management in biodiversity science. PMID:28771471
Using RDF and Git to Realize a Collaborative Metadata Repository.
Stöhr, Mark R; Majeed, Raphael W; Günther, Andreas
2018-01-01
The German Center for Lung Research (DZL) is a research network with the aim of researching respiratory diseases. The participating study sites' register data differs in terms of software and coding system as well as data field coverage. To perform meaningful consortium-wide queries through one single interface, a uniform conceptual structure is required covering the DZL common data elements. No single existing terminology includes all our concepts. Potential candidates such as LOINC and SNOMED only cover specific subject areas or are not granular enough for our needs. To achieve a broadly accepted and complete ontology, we developed a platform for collaborative metadata management. The DZL data management group formulated detailed requirements regarding the metadata repository and the user interfaces for metadata editing. Our solution builds upon existing standard technologies allowing us to meet those requirements. Its key parts are RDF and the distributed version control system Git. We developed a software system to publish updated metadata automatically and immediately after performing validation tests for completeness and consistency.
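A minimal sketch of the kind of completeness check that can run before metadata updates are published from the Git repository is shown below: every concept must carry at least one human-readable label. The file name and the choice of SKOS as the concept type are assumptions.

```python
# Minimal sketch: fail a pre-commit/CI step when RDF concepts lack labels.
import sys
from rdflib import Graph
from rdflib.namespace import RDF, RDFS, SKOS

def missing_labels(turtle_path):
    g = Graph()
    g.parse(turtle_path, format="turtle")
    problems = []
    for concept in g.subjects(RDF.type, SKOS.Concept):
        has_label = any(g.objects(concept, SKOS.prefLabel)) or any(g.objects(concept, RDFS.label))
        if not has_label:
            problems.append(str(concept))
    return problems

if __name__ == "__main__":
    issues = missing_labels(sys.argv[1] if len(sys.argv) > 1 else "metadata.ttl")
    for iri in issues:
        print("missing label:", iri)
    sys.exit(1 if issues else 0)      # non-zero exit fails the hook
```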
A Prototype Publishing Registry for the Virtual Observatory
NASA Astrophysics Data System (ADS)
Williamson, R.; Plante, R.
2004-07-01
In the Virtual Observatory (VO), a registry helps users locate resources, such as data and services, in a distributed environment. A general framework for VO registries is now under development within the International Virtual Observatory Alliance (IVOA) Registry Working Group. We present a prototype of one component of this framework: the publishing registry. The publishing registry allows data providers to expose metadata descriptions of their resources to the VO environment. Searchable registries can harvest the metadata from many publishing registries and make them searchable by users. We have developed a prototype publishing registry that data providers can install at their sites to publish their resources. The descriptions are exposed using the Open Archive Initiative (OAI) Protocol for Metadata Harvesting. Automating the input of metadata into registries is critical when a provider wishes to describe many resources. We illustrate various strategies for such automation, both currently in use and planned for the future. We also describe how future versions of the registry can adapt automatically to evolving metadata schemas for describing resources.
HTTP-based Search and Ordering Using ECHO's REST-based and OpenSearch APIs
NASA Astrophysics Data System (ADS)
Baynes, K.; Newman, D. J.; Pilone, D.
2012-12-01
Metadata is an important entity in the process of cataloging, discovering, and describing Earth science data. NASA's Earth Observing System (EOS) ClearingHOuse (ECHO) acts as the core metadata repository for EOSDIS data centers, providing a centralized mechanism for metadata and data discovery and retrieval. By supporting both the ESIP Federated Search API and its own search and ordering interfaces, ECHO provides multiple capabilities that facilitate ease of discovery and access to its ever-increasing holdings. Users are able to search and export metadata in a variety of formats including ISO 19115, JSON, and ECHO10. This presentation aims to inform technically savvy clients interested in automating search and ordering of ECHO's metadata catalog. The audience will be introduced to practical and applicable examples of end-to-end workflows that demonstrate finding, subsetting and ordering data bound by keyword, temporal and spatial constraints. Interaction with the ESIP OpenSearch Interface will be highlighted, as will ECHO's own REST-based API.
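A minimal sketch of such a constrained search is shown below. The endpoint URL is hypothetical and the parameter names follow the generic OpenSearch Geo and Time extensions; the actual ECHO/ESIP service may spell them differently in its description document.

```python
# Minimal sketch: keyword + bounding-box + temporal OpenSearch query, listing Atom titles.
import requests
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def search(endpoint, keyword, bbox, start, end, count=10):
    params = {
        "searchTerms": keyword,
        "geo:box": ",".join(str(v) for v in bbox),   # west,south,east,north
        "time:start": start,
        "time:end": end,
        "count": count,
    }
    feed = ET.fromstring(requests.get(endpoint, params=params, timeout=30).content)
    return [entry.findtext(f"{ATOM}title") for entry in feed.iter(f"{ATOM}entry")]

# for title in search("https://example.gov/opensearch",      # hypothetical endpoint
#                     "sea surface temperature",
#                     (-180, -90, 180, 90), "2012-01-01", "2012-12-31"):
#     print(title)
```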
NASA Astrophysics Data System (ADS)
Zschocke, Thomas; Beniest, Jan
The Consultative Group on International Agricultural Research (CGIAR) has established a digital repository to share its teaching and learning resources along with descriptive educational information based on the IEEE Learning Object Metadata (LOM) standard. As a critical component of any digital repository, quality metadata are essential not only to enable users to find the resources they require more easily, but also for the operation and interoperability of the repository itself. Studies show that repositories have difficulties in obtaining good quality metadata from their contributors, especially when this process involves many different stakeholders, as is the case with the CGIAR as an international organization. To address this issue the CGIAR began investigating the Open ECBCheck as well as the ISO/IEC 19796-1 standard to establish quality protocols for its training. The paper highlights the implications and challenges posed by strengthening the metadata creation workflow for disseminating learning objects of the CGIAR.
Metadata and annotations for multi-scale electrophysiological data.
Bower, Mark R; Stead, Matt; Brinkmann, Benjamin H; Dufendach, Kevin; Worrell, Gregory A
2009-01-01
The increasing use of high-frequency (kHz), long-duration (days) intracranial monitoring from multiple electrodes during pre-surgical evaluation for epilepsy produces large amounts of data that are challenging to store and maintain. Descriptive metadata and clinical annotations of these large data sets also pose challenges to simple, often manual, methods of data analysis. The problems of reliable communication of metadata and annotations between programs, the maintenance of the meanings within that information over long time periods, and the flexibility to re-sort data for analysis place differing demands on data structures and algorithms. Solutions to these individual problem domains (communication, storage and analysis) can be configured to provide easy translation and clarity across the domains. The Multi-scale Annotation Format (MAF) provides an integrated metadata and annotation environment that maximizes code reuse, minimizes error probability and encourages future changes by reducing the tendency to over-fit information technology solutions to current problems. An example of a graphical utility for generating and evaluating metadata and annotations for "big data" files is presented.
Wang, Cuicui; Ge, Heyi; Ma, Xiaolong; Liu, Zhifang; Wang, Ting; Zhang, Jingyi
2018-04-01
In this study, a water-soluble epoxy resin was prepared via the ring-opening reaction between diethanolamine and epoxy resin. The modified resin mixed with graphene oxide (GO) as a sizing agent was coated onto carbon fiber (CF), and GO-CF reinforced acrylonitrile-butadiene-styrene (ABS) composites were then prepared. The influences of different GO contents on the CF and the CF/ABS composite were explored. The combination of epoxy, GO sheets and maleic anhydride grafted ABS (ABSMA) showed a synergistic effect in improving the properties of the GO-CF and the GO-CF/ABS composite. The GO-CF had higher single-fiber tensile strength than the commercial CF. The maximum interlaminar shear strength (ILSS) of the GO-CF/ABS composite was 19.2% higher than that of the commercial CF/ABS composite. Such a multiscale enhancement method and the synergistically reinforced GO-CF/ABS composite show promising applications in many industrial areas.
Li, Rui; Tai, Rui; Wang, Dan; Chu, Gui-Xin
2017-10-01
A four-year field study was conducted to determine how soil biological properties and soil aggregate stability changed when organic fertilizer and biofertilizer were used to reduce chemical fertilizer application to a drip-irrigated cotton field. The study consisted of six fertilization treatments: unfertilized (CK); chemical fertilizer (CF, 300 kg N·hm^-2; 90 kg P2O5·hm^-2, 60 kg K2O·hm^-2); 80% CF plus 3000 kg·hm^-2 organic fertilizer (80%CF+OF); 60% CF plus 6000 kg·hm^-2 organic fertilizer (60%CF+OF); 80% CF plus 3000 kg·hm^-2 biofertilizer (80%CF+BF); and 60% CF plus 6000 kg·hm^-2 biofertilizer (60%CF+BF). The relationships among soil organic C (SOC), soil biological properties, and soil aggregate size distribution were determined. The results showed that organic fertilizer and biofertilizer both significantly increased soil enzyme activities. Compared with CF, the biofertilizer treatments increased urease activity by 55.6%-84.0%, alkaline phosphatase activity by 53.1%-74.0%, invertase activity by 15.1%-38.0%, β-glucosidase activity by 38.2%-68.0%, polyphenoloxidase activity by 29.6%-52.0%, and arylsulfatase activity by 35.4%-58.9%. Soil enzyme activity increased as the amount of organic fertilizer and biofertilizer increased (i.e., 60%CF+OF > 80%CF+OF, 60%CF+BF > 80%CF+BF). Soil basal respiration decreased significantly in the order BF > OF > CF > CK. Soil microbial biomass C and N were 22.3% and 43.5% greater, respectively, in 60%CF+BF than in CF. The microbial biomass C:N was significantly lower in 60%CF+BF than in CF. The organic fertilizer and the biofertilizer both improved soil aggregate structure. Soil mass in the >0.25 mm fraction was 7.1% greater in 80%CF+OF and 8.0% greater in 60%CF+OF than in CF. The geometric mean diameter was 9.2% greater in 80%CF+BF than in 80%CF+OF. Redundancy analysis and cluster analysis both demonstrated that soil aggregate structure and biological activities increased when organic fertilizer and biofertilizer were used to reduce chemical fertilizer application. In conclusion, the organic fertilizer and the biofertilizer significantly increased SOC, soil enzyme activity, and soil microbial biomass C and N. The organic fertilizers also improved soil aggregation. Therefore, soil quality could be improved by using these fertilizers to reduce chemical fertilizer application, especially under drip irrigation.
Twiddlenet: Metadata Tagging and Data Dissemination in Mobile Device Networks
2007-09-01
hosting a distributed data dissemination application. Stated simply, there are a multitude of handheld devices on the market that can communicate in...content (UGC) across a network of distributed devices. This sharing is accomplished through the use of descriptive metadata tags that are assigned to a...file once it has been shared. These metadata files are uploaded to a centralized portal and arranged for efficient UGC location and searching.
Streamlining Metadata and Data Management for Evolving Digital Libraries
NASA Astrophysics Data System (ADS)
Clark, D.; Miller, S. P.; Peckman, U.; Smith, J.; Aerni, S.; Helly, J.; Sutton, D.; Chase, A.
2003-12-01
What began two years ago as an effort to stabilize the Scripps Institution of Oceanography (SIO) data archives from more than 700 cruises going back 50 years, has now become the operational fully-searchable "SIOExplorer" digital library, complete with thousands of historic photographs, images, maps, full text documents, binary data files, and 3D visualization experiences, totaling nearly 2 terabytes of digital content. Coping with data diversity and complexity has proven to be more challenging than dealing with large volumes of digital data. SIOExplorer has been built with scalability in mind, so that the addition of new data types and entire new collections may be accomplished with ease. It is a federated system, currently interoperating with three independent data-publishing authorities, each responsible for their own quality control, metadata specifications, and content selection. The IT architecture implemented at the San Diego Supercomputer Center (SDSC) streamlines the integration of additional projects in other disciplines with a suite of metadata management and collection building tools for "arbitrary digital objects." Metadata are automatically harvested from data files into domain-specific metadata blocks, and mapped into various specification standards as needed. Metadata can be browsed and objects can be viewed onscreen or downloaded for further analysis, with automatic proprietary-hold request management.
Gastrointestinal Pathology in Juvenile and Adult CFTR-Knockout Ferrets
Sun, Xingshen; Olivier, Alicia K.; Yi, Yaling; Pope, Christopher E.; Hayden, Hillary S.; Liang, Bo; Sui, Hongshu; Zhou, Weihong; Hager, Kyle R.; Zhang, Yulong; Liu, Xiaoming; Yan, Ziying; Fisher, John T.; Keiser, Nicholas W.; Song, Yi; Tyler, Scott R.; Goeken, J. Adam; Kinyon, Joann M.; Radey, Matthew C.; Fligg, Danielle; Wang, Xiaoyan; Xie, Weiliang; Lynch, Thomas J.; Kaminsky, Paul M.; Brittnacher, Mitchell J.; Miller, Samuel I.; Parekh, Kalpaj; Meyerholz, David K.; Hoffman, Lucas R.; Frana, Timothy; Stewart, Zoe A.; Engelhardt, John F.
2015-01-01
Cystic fibrosis (CF) is a multiorgan disease caused by loss of a functional cystic fibrosis transmembrane conductance regulator (CFTR) chloride channel in many epithelia of the body. Here we report the pathology observed in the gastrointestinal organs of juvenile to adult CFTR-knockout ferrets. CF gastrointestinal manifestations included gastric ulceration, intestinal bacterial overgrowth with villous atrophy, and rectal prolapse. Metagenomic phylogenetic analysis of fecal microbiota by deep sequencing revealed considerable genotype-independent microbial diversity between animals, with the majority of taxa overlapping between CF and non-CF pairs. CF hepatic manifestations were variable, but included steatosis, necrosis, biliary hyperplasia, and biliary fibrosis. Gallbladder cystic mucosal hyperplasia was commonly found in 67% of CF animals. The majority of CF animals (85%) had pancreatic abnormalities, including extensive fibrosis, loss of exocrine pancreas, and islet disorganization. Interestingly, 2 of 13 CF animals retained predominantly normal pancreatic histology (84% to 94%) at time of death. Fecal elastase-1 levels from these CF animals were similar to non-CF controls, whereas all other CF animals evaluated were pancreatic insufficient (<2 μg elastase-1 per gram of feces). These findings suggest that genetic factors likely influence the extent of exocrine pancreas disease in CF ferrets and have implications for the etiology of pancreatic sufficiency in CF patients. In summary, these studies demonstrate that the CF ferret model develops gastrointestinal pathology similar to CF patients. PMID:24637292
Coffey, Michael J; Whitaker, Viola; Gentin, Natalie; Junek, Rosie; Shalhoub, Carolyn; Nightingale, Scott; Hilton, Jodi; Wiley, Veronica; Wilcken, Bridget; Gaskin, Kevin J; Ooi, Chee Y
2017-02-01
To evaluate children with cystic fibrosis (CF) who had a late diagnosis of CF (LD-CF) despite newborn screening (NBS) and compare their clinical outcomes with children diagnosed after a positive NBS (NBS-CF). A retrospective review of patients with LD-CF in New South Wales, Australia, from 1988 to 2010 was performed. LD-CF was defined as NBS-negative (negative immunoreactive trypsinogen or no F508del) or NBS-positive but discharged following sweat chloride < 60 mmol/L. Cases of LD-CF were each matched 1:2 with patients with NBS-CF for age, sex, hospital, and exocrine pancreatic status. A total of 45 LD-CF cases were identified (39 NBS-negative and 6 NBS-positive) with 90 NBS-CF matched controls. Median age (IQR) of diagnosis for LD-CF and NBS-CF was 1.35 (0.4-2.8) and 0.12 (0.03-0.2) years, respectively (P <.0001). Estimated incidence of LD-CF was 1 in 45 000 live births. Compared with NBS-CF, LD-CF had more respiratory manifestations at time of diagnosis (66% vs 4%; P <.0001), a higher rate of hospital admission per year for respiratory illness (0.49 vs 0.2; P = .0004), worse lung function (forced expiratory volume in 1 second percentage of predicted, 0.88 vs 0.97; P = .007), and higher rates of chronic colonization with Pseudomonas aeruginosa (47% vs 24%; P = .01). The LD-CF cohort also appeared to be shorter than NBS-CF controls (mean height z-score -0.65 vs -0.03; P = .02). LD-CF, despite NBS, seems to be associated with worse health before diagnosis and worse later growth and respiratory outcomes, thus providing further support for NBS programs for CF. Copyright © 2016 Elsevier Inc. All rights reserved.
Lindsköld, Lars; Wintell, Mikael; Edgren, Lars; Aspelin, Peter; Lundberg, Nina
2013-07-01
Challenges related to the cross-organizational access of accurate and timely information about a patient's condition have become a critical issue in healthcare. Interoperability of different local sources is necessary. To identify and present missing and semantically incorrect data elements of metadata in the radiology enterprise service that supports cross-organizational sharing of dynamic information about patients' visits in the Region Västra Götaland, Sweden. Quantitative data elements of metadata were collected yearly on the first Wednesday in March from 2006 to 2011 from the 24 in-house radiology departments in Region Västra Götaland. These radiology departments were organized into four hospital groups and three stand-alone hospitals. The included data elements of metadata were the patient name, patient ID, institutional department name, referring physician's name, and examination description. The majority of missing data elements of metadata were related to the institutional department name for Hospital 2, ranging from 87% in 2007 down to 25% in 2011. All data elements of metadata except the patient ID contained semantic errors. For example, for the data element "patient name", only three names out of 3537 were semantically correct. This study shows that the semantics of metadata elements are poorly structured and inconsistently used. Although a cross-organizational solution may technically be fully functional, semantic errors may prevent it from serving as an information infrastructure for collaboration between all departments and hospitals in the region. For interoperability, it is important that the agreed semantic models are implemented in vendor systems using the information infrastructure.
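A minimal sketch of this kind of audit is shown below: count missing values and flag semantically malformed entries for a handful of required metadata elements. The record layout and the assumed "Family, Given" name convention are illustrative assumptions, not the study's actual rules.

```python
# Minimal sketch: audit metadata records for missing and malformed required elements.
import re

REQUIRED = ["patient_name", "patient_id", "department_name",
            "referring_physician", "exam_description"]
NAME_PATTERN = re.compile(r"^[^,]+,\s*\S+")   # e.g. "Family, Given" (assumed convention)

def audit(records):
    missing = {field: 0 for field in REQUIRED}
    bad_names = 0
    for rec in records:
        for field in REQUIRED:
            if not (rec.get(field) or "").strip():
                missing[field] += 1
        name = (rec.get("patient_name") or "").strip()
        if name and not NAME_PATTERN.match(name):
            bad_names += 1                      # present but semantically incorrect
    return missing, bad_names

sample = [
    {"patient_name": "Doe, Jane", "patient_id": "19121212-1234",
     "department_name": "", "referring_physician": "Roe, R", "exam_description": "CT thorax"},
    {"patient_name": "unknown", "patient_id": "19010101-0000",
     "department_name": "Radiology 2", "referring_physician": "", "exam_description": ""},
]
print(audit(sample))
```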
Pathogen metadata platform: software for accessing and analyzing pathogen strain information.
Chang, Wenling E; Peterson, Matthew W; Garay, Christopher D; Korves, Tonia
2016-09-15
Pathogen metadata includes information about where and when a pathogen was collected and the type of environment it came from. Along with genomic nucleotide sequence data, this metadata is growing rapidly and becoming a valuable resource not only for research but for biosurveillance and public health. However, current freely available tools for analyzing this data are geared towards bioinformaticians and/or do not provide summaries and visualizations needed to readily interpret results. We designed a platform to easily access and summarize data about pathogen samples. The software includes a PostgreSQL database that captures metadata useful for disease outbreak investigations, and scripts for downloading and parsing data from NCBI BioSample and BioProject into the database. The software provides a user interface to query metadata and obtain standardized results in an exportable, tab-delimited format. To visually summarize results, the user interface provides a 2D histogram for user-selected metadata types and mapping of geolocated entries. The software is built on the LabKey data platform, an open-source data management platform, which enables developers to add functionalities. We demonstrate the use of the software in querying for a pathogen serovar and for genome sequence identifiers. This software enables users to create a local database for pathogen metadata, populate it with data from NCBI, easily query the data, and obtain visual summaries. Some of the components, such as the database, are modular and can be incorporated into other data platforms. The source code is freely available for download at https://github.com/wchangmitre/bioattribution .
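A minimal sketch of the ingest step is shown below: selected attributes are pulled out of an NCBI BioSample XML export into flat rows ready for a relational table. The element and attribute names follow the BioSample XML as commonly exported and should be checked against a real file; the attribute list is illustrative.

```python
# Minimal sketch: flatten BioSample XML attributes into rows for database loading.
import xml.etree.ElementTree as ET

WANTED = {"collection_date", "geo_loc_name", "isolation_source", "serovar"}

def biosample_rows(xml_path):
    rows = []
    for sample in ET.parse(xml_path).getroot().iter("BioSample"):
        row = {"accession": sample.get("accession")}
        for attr in sample.iter("Attribute"):
            name = attr.get("harmonized_name") or attr.get("attribute_name")
            if name in WANTED:
                row[name] = (attr.text or "").strip()
        rows.append(row)
    return rows

# for row in biosample_rows("biosample_result.xml"):   # hypothetical export file
#     print(row)   # e.g. feed these into INSERTs against the metadata database
```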
An Interactive, Web-Based Approach to Metadata Authoring
NASA Technical Reports Server (NTRS)
Pollack, Janine; Wharton, Stephen W. (Technical Monitor)
2001-01-01
NASA's Global Change Master Directory (GCMD) serves a growing number of users by assisting the scientific community in the discovery of and linkage to Earth science data sets and related services. The GCMD holds over 8000 data set descriptions in Directory Interchange Format (DIF) and 200 data service descriptions in Service Entry Resource Format (SERF), encompassing the disciplines of geology, hydrology, oceanography, meteorology, and ecology. Data descriptions also contain geographic coverage information, thus allowing researchers to discover data pertaining to a particular geographic location, as well as subject of interest. The GCMD strives to be the preeminent data locator for world-wide directory level metadata. In this vein, scientists and data providers must have access to intuitive and efficient metadata authoring tools. Existing GCMD tools are not currently attracting widespread usage. With usage being the prime indicator of utility, it has become apparent that current tools must be improved. As a result, the GCMD has released a new suite of web-based authoring tools that enable a user to create new data and service entries, as well as modify existing data entries. With these tools, a more interactive approach to metadata authoring is taken, as they feature a visual "checklist" of data/service fields that automatically updates when a field is completed. In this way, the user can quickly gauge which of the required and optional fields have not been populated. With the release of these tools, the Earth science community will be further assisted in efficiently creating quality data and services metadata. Keywords: metadata, Earth science, metadata authoring tools
DOE Office of Scientific and Technical Information (OSTI.GOV)
Inigo, Gil San; Servilla, Mark; Brunt, James
2008-06-01
The Genomic Standards Consortium (GSC) invited a representative of the Long-Term Ecological Research (LTER) to its fifth workshop to present the Ecological Metadata Language (EML) metadata standard and its relationship to the Minimum Information about a Genome/Metagenome Sequence (MIGS/MIMS) and its implementation, the Genomic Contextual Data Markup Language (GCDML). The LTER is one of the top National Science Foundation (NSF) programs in biology since 1980, representing diverse ecosystems and creating long-term, interdisciplinary research, synthesis of information, and theory. The adoption of EML as the LTER network standard has been key to build network synthesis architectures based on high-quality standardized metadata. EML is the NSF-recognized metadata standard for LTER, and EML is a criteria used to review the LTER program progress. At the workshop, a potential crosswalk between the GCDML and EML was explored. Also, collaboration between the LTER and GSC developers was proposed to join efforts toward a common metadata cataloging designer's tool. The community adoption success of a metadata standard depends, among other factors, on the tools and trainings developed to use the standard. LTER's experience in embracing EML may help GSC to achieve similar success. A possible collaboration between LTER and GSC to provide training opportunities for GCDML and the associated tools is being explored. Finally, LTER is investigating EML enhancements to better accommodate genomics data, possibly integrating the GCDML schema into EML. All these action items have been accepted by the LTER contingent, and further collaboration between the GSC and LTER is expected.
Gil, Inigo San; Sheldon, Wade; Schmidt, Tom; Servilla, Mark; Aguilar, Raul; Gries, Corinna; Gray, Tanya; Field, Dawn; Cole, James; Pan, Jerry Yun; Palanisamy, Giri; Henshaw, Donald; O'Brien, Margaret; Kinkel, Linda; McMahon, Katherine; Kottmann, Renzo; Amaral-Zettler, Linda; Hobbie, John; Goldstein, Philip; Guralnick, Robert P; Brunt, James; Michener, William K
2008-06-01
The Genomic Standards Consortium (GSC) invited a representative of the Long-Term Ecological Research (LTER) to its fifth workshop to present the Ecological Metadata Language (EML) metadata standard and its relationship to the Minimum Information about a Genome/Metagenome Sequence (MIGS/MIMS) and its implementation, the Genomic Contextual Data Markup Language (GCDML). The LTER is one of the top National Science Foundation (NSF) programs in biology since 1980, representing diverse ecosystems and creating long-term, interdisciplinary research, synthesis of information, and theory. The adoption of EML as the LTER network standard has been key to build network synthesis architectures based on high-quality standardized metadata. EML is the NSF-recognized metadata standard for LTER, and EML is a criteria used to review the LTER program progress. At the workshop, a potential crosswalk between the GCDML and EML was explored. Also, collaboration between the LTER and GSC developers was proposed to join efforts toward a common metadata cataloging designer's tool. The community adoption success of a metadata standard depends, among other factors, on the tools and trainings developed to use the standard. LTER's experience in embracing EML may help GSC to achieve similar success. A possible collaboration between LTER and GSC to provide training opportunities for GCDML and the associated tools is being explored. Finally, LTER is investigating EML enhancements to better accommodate genomics data, possibly integrating the GCDML schema into EML. All these action items have been accepted by the LTER contingent, and further collaboration between the GSC and LTER is expected.
NCPP's Use of Standard Metadata to Promote Open and Transparent Climate Modeling
NASA Astrophysics Data System (ADS)
Treshansky, A.; Barsugli, J. J.; Guentchev, G.; Rood, R. B.; DeLuca, C.
2012-12-01
The National Climate Predictions and Projections (NCPP) Platform is developing comprehensive regional and local information about the evolving climate to inform decision making and adaptation planning. This includes both creating metadata about the models and processes used to generate its derived data products, and providing tools for creating such metadata. NCPP is using the Common Information Model (CIM), an ontology developed by a broad set of international partners in climate research, as its metadata language. This use of a standard ensures interoperability within the climate community as well as permitting access to the ecosystem of tools and services emerging alongside the CIM. The CIM itself is divided into a general-purpose (UML & XML) schema which structures metadata documents, and a project- or community-specific (XML) Controlled Vocabulary (CV) which constrains the content of metadata documents. NCPP has already modified the CIM Schema to accommodate downscaling models, simulations, and experiments. NCPP is currently developing a CV for use by the downscaling community. Incorporating downscaling into the CIM will lead to several benefits: easy access to the existing CIM Documents describing CMIP5 models and simulations that are being downscaled, access to software tools that have been developed in order to search, manipulate, and visualize CIM metadata, and coordination with national and international efforts such as ES-DOC that are working to make climate model descriptions and datasets interoperable. Providing detailed metadata descriptions which include the full provenance of derived data products will contribute to making that data (and the models and processes which generated that data) more open and transparent to the user community.
Brandão, S F
2015-01-01
Objective: This article proposes a combination of californium-252 (252Cf) brachytherapy, boron neutron capture therapy (BNCT) and an intracavitary moderator balloon catheter applied to brain tumour and infiltrations. Methods: Dosimetric evaluations were performed on three protocol set-ups: 252Cf brachytherapy combined with BNCT (Cf-BNCT); Cf-BNCT with a balloon catheter filled with light water (LWB) and the same set-up with heavy water (HWB). Results: Cf-BNCT-HWB has presented dosimetric advantages to Cf-BNCT-LWB and Cf-BNCT in infiltrations at 2.0–5.0 cm from the balloon surface. However, Cf-BNCT-LWB has shown superior dosimetry up to 2.0 cm from the balloon surface. Conclusion: Cf-BNCT-HWB and Cf-BNCT-LWB protocols provide a selective dose distribution for brain tumour and infiltrations, mainly further from the 252Cf source, sparing the normal brain tissue. Advances in knowledge: Malignant brain tumours grow rapidly and often spread to adjacent brain tissues, leading to death. Improvements in brain radiation protocols have been continuously achieved; however, brain tumour recurrence is observed in most cases. Cf-BNCT-LWB and Cf-BNCT-HWB represent new modalities for selectively combating brain tumour infiltrations and metastasis. PMID:25927876
Building a Virtual Solar Observatory: I Look Around and There's a Petabyte Following Me
NASA Technical Reports Server (NTRS)
Gurman, J. B.; Bogart, R.; Hill, F.; Martens, P.; Oergerle, William (Technical Monitor)
2002-01-01
The 2001 July NASA Senior Review of Sun-Earth Connections missions and data centers directed the Solar Data Analysis Center (SDAC) to proceed in studying and implementing a Virtual Solar Observatory (VSO) to ease the identification of and access to distributed archives of solar data. Any such design (cf. the National Virtual Observatory and NASA's Planetary Data System) consists of three elements: the distributed archives, a "broker" facility that translates metadata from all partner archives into a single standard for searches, and a user interface to allow searching, browsing, and download of data. Three groups are now engaged in a six-month study that will produce a candidate design and implementation roadmap for the VSO. We hope to proceed with the construction of a prototype VSO in US fiscal year 2003, with fuller deployment dependent on community reaction to and use of the capability. We therefore invite as broad as possible public comment and involvement, and invite interested parties to a "birds of a feather" session at this meeting. VSO is partnered with the European Grid of Solar Observations (EGSO), and if successful, we hope to be able to offer the VSO as the basis for the solar component of a Living With a Star data system.
Access to Emissions Distributions and Related Ancillary Data through the ECCAD database
NASA Astrophysics Data System (ADS)
Darras, Sabine; Granier, Claire; Liousse, Catherine; De Graaf, Erica; Enriquez, Edgar; Boulanger, Damien; Brissebrat, Guillaume
2017-04-01
The ECCAD database (Emissions of atmospheric Compounds and Compilation of Ancillary Data) provides user-friendly access to global and regional surface emissions for a large set of chemical compounds and ancillary data (land use, active fires, burned areas, population, etc.). The emissions inventories are time series of gridded data at spatial resolutions from 1x1 to 0.1x0.1 degrees. ECCAD is the emissions database of the GEIA (Global Emissions InitiAtive) project and a sub-project of the French Atmospheric Data Center AERIS (http://www.aeris-data.fr). ECCAD currently has more than 2200 users originating from more than 80 countries. The project benefits from this large international community of users to expand the number of emission datasets made available. ECCAD provides detailed metadata for each of the datasets and various tools for data visualization, for computing global and regional totals and for interactive spatial and temporal analysis. The data can be downloaded as interoperable NetCDF CF-compliant files, i.e. the data are compatible with many other client interfaces. The presentation will provide information on the datasets available within ECCAD, as well as examples of the analysis work that can be done online through the website: http://eccad.aeris-data.fr.
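A minimal sketch of working with such a CF-compliant NetCDF file is shown below: open it, inspect the CF attributes, and sum one month of a gridded flux. The file and variable names are hypothetical, and a rigorous global total would also weight each cell by its area.

```python
# Minimal sketch: read CF metadata and a time slice from a gridded emissions file.
import xarray as xr

ds = xr.open_dataset("emissions_co_anthropogenic_0.5x0.5.nc")   # hypothetical file
flux = ds["emiss_co"]                                            # hypothetical variable name

print(flux.attrs.get("standard_name"), flux.attrs.get("units"))  # CF attributes
month = flux.sel(time="2010-01")        # CF time coordinate enables label-based selection
print("unweighted grid sum for 2010-01:", float(month.sum()))
```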
Access to Emissions Distributions and Related Ancillary Data through the ECCAD database
NASA Astrophysics Data System (ADS)
Darras, Sabine; Enriquez, Edgar; Granier, Claire; Liousse, Catherine; Boulanger, Damien; Fontaine, Alain
2016-04-01
The ECCAD database (Emissions of atmospheric Compounds and Compilation of Ancillary Data) provides user-friendly access to global and regional surface emissions for a large set of chemical compounds and ancillary data (land use, active fires, burned areas, population, etc.). The emissions inventories are time series of gridded data at spatial resolutions from 1x1 to 0.1x0.1 degrees. ECCAD is the emissions database of the GEIA (Global Emissions InitiAtive) project and a sub-project of the French Atmospheric Data Center AERIS (http://www.aeris-data.fr). ECCAD currently has more than 2200 users originating from more than 80 countries. The project benefits from this large international community of users to expand the number of emission datasets made available. ECCAD provides detailed metadata for each of the datasets and various tools for data visualization, for computing global and regional totals and for interactive spatial and temporal analysis. The data can be downloaded as interoperable NetCDF CF-compliant files, i.e. the data are compatible with many other client interfaces. The presentation will provide information on the datasets available within ECCAD, as well as examples of the analysis work that can be done online through the website: http://eccad.aeris-data.fr.
NetCDF-U - Uncertainty conventions for netCDF datasets
NASA Astrophysics Data System (ADS)
Bigagli, Lorenzo; Nativi, Stefano; Domenico, Ben
2013-04-01
To facilitate the automated processing of uncertain data (e.g. uncertainty propagation in modeling applications), we have proposed a set of conventions for expressing uncertainty information within the netCDF data model and format: the NetCDF Uncertainty Conventions (NetCDF-U). From a theoretical perspective, it can be said that no dataset is a perfect representation of the reality it purports to represent. Inevitably, errors arise from the observation process, including the sensor system and subsequent processing, differences in scales of phenomena and the spatial support of the observation mechanism, and lack of knowledge about the detailed conversion between the measured quantity and the target variable. This means that, in principle, all data should be treated as uncertain. The most natural representation of an uncertain quantity is in terms of random variables, with a probabilistic approach. However, it must be acknowledged that almost all existing data resources are not treated in this way. Most datasets come simply as a series of values, often without any uncertainty information. If uncertainty information is present, then it is typically within the metadata, as a data quality element. This is typically a global (dataset-wide) representation of uncertainty, often derived through some form of validation process. Typically, it is a statistical measure of spread, for example the standard deviation of the residuals. The introduction of a mechanism by which such descriptions of uncertainty can be integrated into existing geospatial applications is considered a practical step towards a more accurate modeling of our uncertain understanding of any natural process. Given the generality and flexibility of the netCDF data model, conventions on naming, syntax, and semantics have been adopted by several communities of practice as a means of improving data interoperability. Some of the existing conventions include provisions on uncertain elements and concepts, but, to our knowledge, no general convention on the encoding of uncertainty has been proposed to date. In particular, the netCDF Climate and Forecast Conventions (NetCDF-CF), a de-facto standard for a large amount of data in Fluid Earth Sciences, mention the issue and provide limited support for uncertainty representation. NetCDF-U is designed to be fully compatible with NetCDF-CF, where possible adopting the same mechanisms (e.g. using the same attribute names with compatible semantics). The rationale for this is that a probabilistic description of scientific quantities is a crosscutting aspect, which may be modularized (note that a netCDF dataset may be compliant with more than one convention). The scope of NetCDF-U is to extend and qualify the netCDF classic data model (also known as netCDF3), to capture the uncertainty related to geospatial information encoded in that format. In the future, a netCDF4 approach for uncertainty encoding will be investigated. The NetCDF-U Conventions have the following rationale: • Compatibility with netCDF-CF Conventions 1.5. • Human-readability of the structure of conforming datasets. • Minimal difference between certain/agnostic and uncertain representations of data (e.g. with respect to dataset structure). NetCDF-U is based on a generic mechanism for annotating netCDF data variables with probability theory semantics. The Uncertainty Markup Language (UncertML) 2.0 is used as a controlled conceptual model and vocabulary for NetCDF-U annotations.
The proposed mechanism anticipates generalized support for semantic annotations in netCDF. NetCDF-U defines syntactical conventions for encoding samples, summary statistics, and distributions, along with mechanisms for expressing dependency relationships among variables. The conventions were accepted as an Open Geospatial Consortium (OGC) Discussion Paper (OGC 11-163); related discussions are conducted on a public forum hosted by the OGC. NetCDF-U may have implications for future work directed at communicating geospatial data provenance and uncertainty in contexts other than netCDF. The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under Grant Agreement n° 248488.
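A minimal sketch of the kind of variable-level annotation NetCDF-U describes is given below, written with the netCDF4-python library. The attribute names (ancillary_variables, ref), the convention tag and the UncertML URI are illustrative assumptions in the spirit of the conventions, not a verified excerpt of OGC 11-163.

```python
# Sketch: annotate a data variable with an uncertainty variable using
# UncertML-style semantics. Attribute names and URIs are illustrative only.
import numpy as np
from netCDF4 import Dataset

with Dataset("uncertain_temperature.nc", "w") as nc:   # hypothetical output file
    nc.Conventions = "CF-1.5 NetCDF-U"                  # assumed convention tag
    nc.createDimension("x", 10)

    temp = nc.createVariable("temperature", "f4", ("x",))
    temp.standard_name = "air_temperature"
    temp.units = "K"
    temp.ancillary_variables = "temperature_sd"         # CF-style link to uncertainty

    sd = nc.createVariable("temperature_sd", "f4", ("x",))
    sd.units = "K"
    # Hypothetical UncertML-style annotation naming the statistic encoded here.
    sd.ref = "http://www.uncertml.org/statistics/standard-deviation"

    temp[:] = 280.0 + np.random.rand(10)
    sd[:] = 0.5
```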
Increasing the international visibility of research data by a joint metadata schema
NASA Astrophysics Data System (ADS)
Svoboda, Nikolai; Zoarder, Muquit; Gärtner, Philipp; Hoffmann, Carsten; Heinrich, Uwe
2017-04-01
The BonaRes Project ("Soil as a sustainable resource for the bioeconomy") was launched in 2015 to promote sustainable soil management and to avoid fragmentation of efforts (Wollschläger et al., 2016). For this purpose, an IT infrastructure is being developed to upload, manage, store, and provide research data and its associated metadata. The research data provided by the BonaRes data centre are, in principle, not subject to any restrictions on reuse. For all research data, comprehensive standardized metadata are the key enabler for the effective use of these data. Providing proper metadata is often viewed as an extra burden that consumes additional work and resources. In our lecture we underline the benefits of structured and interoperable metadata, such as accessibility, discovery, interpretation, and linking of data, and we weigh these advantages against the costs in time, personnel, and other resources. Building on this, we describe the metadata framework of BonaRes, which combines the OGC standards for the description, visualization, exchange, and discovery of geodata with the DataCite schema for the publication and citation of research data. This enables the generation of a DOI, a unique identifier that provides a permanent link to the citable research data. By using OGC standards, data and metadata become interoperable with the numerous research data provided via INSPIRE. This also enables further services such as CSW for harvesting, WMS for visualization, and WFS for downloading. We explain the mandatory fields that result from our approach and give a general overview of our metadata architecture implementation. Literature: Wollschläger, U.; Helming, K.; Heinrich, U.; Bartke, S.; Kögel-Knabner, I.; Russell, D.; Eberhardt, E. & Vogel, H.-J.: The BonaRes Centre - A virtual institute for soil research in the context of a sustainable bio-economy. Geophysical Research Abstracts, Vol. 18, EGU2016-9087, 2016.
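To make the discovery chain concrete, the sketch below issues a CSW 2.0.2 GetRecords request and lists the Dublin Core titles returned. The catalogue endpoint is a hypothetical placeholder; the request parameters follow the OGC CSW specification.

```python
# Sketch: discover records via a CSW GetRecords request.
# The endpoint URL is a hypothetical placeholder; the KVP parameters follow
# the OGC CSW 2.0.2 specification.
import requests
import xml.etree.ElementTree as ET

CSW_ENDPOINT = "https://example.bonares.de/csw"   # assumed endpoint

params = {
    "service": "CSW",
    "version": "2.0.2",
    "request": "GetRecords",
    "typeNames": "csw:Record",
    "resultType": "results",
    "elementSetName": "summary",
    "constraintLanguage": "CQL_TEXT",
    "constraint_language_version": "1.1.0",
    "constraint": "AnyText like '%soil%'",
}
resp = requests.get(CSW_ENDPOINT, params=params, timeout=30)
resp.raise_for_status()

# Pull the Dublin Core titles out of the response.
ns = {"csw": "http://www.opengis.net/cat/csw/2.0.2",
      "dc": "http://purl.org/dc/elements/1.1/"}
root = ET.fromstring(resp.content)
for record in root.findall(".//csw:SummaryRecord", ns):
    title = record.find("dc:title", ns)
    print(title.text if title is not None else "(no title)")
```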
Predicting age groups of Twitter users based on language and metadata features.
Morgan-Lopez, Antonio A; Kim, Annice E; Chew, Robert F; Ruddle, Paul
2017-01-01
Health organizations are increasingly using social media, such as Twitter, to disseminate health messages to target audiences. Determining the extent to which the target audience (e.g., age groups) was reached is critical to evaluating the impact of social media education campaigns. The main objective of this study was to examine the separate and joint predictive validity of linguistic and metadata features in predicting the age of Twitter users. We created a labeled dataset of Twitter users across different age groups (youth, young adults, adults) by collecting publicly available birthday announcement tweets using the Twitter Search application programming interface. We manually reviewed results and, for each age-labeled handle, collected the 200 most recent publicly available tweets and user handles' metadata. The labeled data were split into training and test datasets. We created separate models to examine the predictive validity of language features only, metadata features only, language and metadata features, and words/phrases from another age-validated dataset. We estimated accuracy, precision, recall, and F1 metrics for each model. An L1-regularized logistic regression model was fit for each age group, and predicted probabilities between the training and test sets were compared for each age group. Cohen's d effect sizes were calculated to examine the relative importance of significant features. The model containing both tweet language features and metadata features performed best (74% precision, 74% recall, 74% F1), while the model containing only Twitter metadata features was the least accurate (58% precision, 60% recall, and 57% F1 score). Top predictive features included use of terms such as "school" for youth and "college" for young adults. Overall, it was more challenging to predict older adults accurately. These results suggest that examining linguistic and Twitter metadata features to predict youth and young adult Twitter users may be helpful for informing public health surveillance and evaluation research.
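The modelling step described above can be sketched with scikit-learn: TF-IDF features from tweet text are combined with numeric account metadata and fed to an L1-regularized logistic regression. The toy data frame and feature names below are placeholders and not the study's dataset.

```python
# Sketch of the modelling approach: combine tweet-text features with account
# metadata in an L1-regularized logistic regression. Toy data only.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.pipeline import Pipeline

# Hypothetical training frame: one row per Twitter handle.
df = pd.DataFrame({
    "tweets":          ["homework and school tonight", "college midterms week",
                        "meeting the grandkids later", "prom was amazing",
                        "campus library all day", "retirement party photos"],
    "followers_count": [120, 340, 80, 150, 410, 60],
    "statuses_count":  [900, 5000, 300, 1200, 7000, 250],
    "age_group":       ["youth", "young_adult", "adult",
                        "youth", "young_adult", "adult"],
})

features = ColumnTransformer([
    ("text", TfidfVectorizer(), "tweets"),
    ("meta", "passthrough", ["followers_count", "statuses_count"]),
])

model = Pipeline([
    ("features", features),
    ("clf", LogisticRegression(penalty="l1", solver="liblinear", C=1.0)),
])

X = df[["tweets", "followers_count", "statuses_count"]]
model.fit(X, df["age_group"])
# In-sample report on toy data, purely to show the workflow end to end.
print(classification_report(df["age_group"], model.predict(X)))
```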
NOAA's Data Catalog and the Federal Open Data Policy
NASA Astrophysics Data System (ADS)
Wengren, M. J.; de la Beaujardiere, J.
2014-12-01
The 2013 Open Data Policy Presidential Directive requires Federal agencies to create and maintain a 'public data listing' that includes all agency data that is currently or will be made publicly-available in the future. The directive requires the use of machine-readable and open formats that make use of 'common core' and extensible metadata formats according to the best practices published in an online repository called 'Project Open Data', to use open licenses where possible, and to adhere to existing metadata and other technology standards to promote interoperability. In order to meet the requirements of the Open Data Policy, the National Oceanic and Atmospheric Administration (NOAA) has implemented an online data catalog that combines metadata from all subsidiary NOAA metadata catalogs into a single master inventory. The NOAA Data Catalog is available to the public for search and discovery, providing access to the NOAA master data inventory through multiple means, including web-based text search, OGC CS-W endpoint, as well as a native Application Programming Interface (API) for programmatic query. It generates on a daily basis the Project Open Data JavaScript Object Notation (JSON) file required for compliance with the Presidential directive. The Data Catalog is based on the open source Comprehensive Knowledge Archive Network (CKAN) software and runs on the Amazon Federal GeoCloud. This presentation will cover topics including mappings of existing metadata in standard formats (FGDC-CSDGM and ISO 19115 XML ) to the Project Open Data JSON metadata schema, representation of metadata elements within the catalog, and compatible metadata sources used to feed the catalog to include Web Accessible Folder (WAF), Catalog Services for the Web (CS-W), and Esri ArcGIS.com. It will also discuss related open source technologies that can be used together to build a spatial data infrastructure compliant with the Open Data Policy.
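Because the catalog is CKAN-based, its holdings can be queried through CKAN's standard Action API. The sketch below uses the package_search action; the catalog host is an assumption and should be replaced with the actual NOAA Data Catalog URL.

```python
# Sketch: query a CKAN-based catalog with the standard Action API.
# The base URL is an assumption; point it at the actual NOAA Data Catalog host.
import requests

CKAN_BASE = "https://data.noaa.gov"          # assumed CKAN host
resp = requests.get(f"{CKAN_BASE}/api/3/action/package_search",
                    params={"q": "sea surface temperature", "rows": 5},
                    timeout=30)
resp.raise_for_status()
payload = resp.json()

# CKAN wraps results in {"result": {"results": [...]}}.
for dataset in payload["result"]["results"]:
    print(dataset["title"])
```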
NASA Astrophysics Data System (ADS)
Devaraju, Anusuriya; Klump, Jens; Tey, Victor; Fraser, Ryan
2016-04-01
Physical samples such as minerals, soil, rocks, water, air and plants are important observational units for understanding the complexity of our environment and its resources. They are usually collected and curated by different entities, e.g., individual researchers, laboratories, state agencies, or museums. Persistent identifiers may facilitate access to physical samples that are scattered across various repositories. They are essential to locate samples unambiguously and to share their associated metadata and data systematically across the Web. The International Geo Sample Number (IGSN) is a persistent, globally unique label for identifying physical samples. The IGSNs of physical samples are registered by end-users (e.g., individual researchers, data centers and projects) through allocating agents. Allocating agents are the institutions acting on behalf of the implementing organization (IGSN e.V.). The Commonwealth Scientific and Industrial Research Organisation (CSIRO) is one of the allocating agents in Australia. To implement IGSN in our organisation, we developed a RESTful service and a metadata model. The web service enables a client to register sub-namespaces and multiple samples, and to retrieve samples' metadata programmatically. The metadata model provides a framework in which different types of samples may be represented. It is generic and extensible, and therefore may be applied in the context of multi-disciplinary projects. The metadata model has been implemented as an XML schema and a PostgreSQL database. The schema is used to handle sample registration requests and to disseminate their metadata, whereas the relational database is used to preserve the metadata records. The metadata schema leverages existing controlled vocabularies to minimize the scope for error and incorporates some simplifications to reduce the complexity of the schema implementation. The solutions developed have been applied and tested in the context of two sample repositories in CSIRO, the Capricorn Distal Footprints project and the Rock Store.
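The abstract describes a RESTful registration service without publishing its routes, so the sketch below only shows the general shape of such a client. The base URL, payload fields and authentication scheme are hypothetical, not the actual CSIRO interface.

```python
# Hypothetical client sketch for an IGSN sample-registration web service.
# Endpoint path, payload fields, and auth scheme are assumptions for
# illustration; they are not the actual CSIRO service interface.
import requests

SERVICE = "https://example.csiro.au/igsn/api"     # hypothetical base URL

sample_metadata = {                               # illustrative fields only
    "igsn": "CSXXX0001",
    "name": "Drill core section 12A",
    "sampleType": "core",
    "collector": "Jane Researcher",
    "location": {"lat": -23.7, "lon": 133.9},
}

resp = requests.post(f"{SERVICE}/samples",
                     json=sample_metadata,
                     auth=("username", "password"),   # placeholder credentials
                     timeout=30)
resp.raise_for_status()
print("Registered:", resp.json())
```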
Solutions for extracting file level spatial metadata from airborne mission data
NASA Astrophysics Data System (ADS)
Schwab, M. J.; Stanley, M.; Pals, J.; Brodzik, M.; Fowler, C.; Icebridge Engineering/Spatial Metadata
2011-12-01
Data sets acquired from satellites and aircraft may differ in many ways. We will focus on the differences in spatial coverage between the two platforms. Satellite data sets over a given period typically cover large geographic regions. These data are collected in a consistent, predictable and well understood manner due to the uniformity of satellite orbits. Since satellite data collection paths are typically smooth and uniform, the data from satellite instruments can usually be described with simple spatial metadata. Subsequently, these spatial metadata can be stored and searched easily and efficiently. Conversely, aircraft have significantly more freedom to change paths, circle, overlap, and vary altitude, all of which add complexity to the spatial metadata. Aircraft are also subject to wind and other elements that result in even more complicated and unpredictable spatial coverage areas. This unpredictability and complexity makes it more difficult to extract usable spatial metadata from data sets collected on aircraft missions. It is not feasible to use all of the location data from aircraft mission data sets as spatial metadata. The number of data points in typical data sets poses serious performance problems for spatial searching. In order to provide efficient spatial searching of the large number of files cataloged in our systems, we need to extract approximate spatial descriptions as geo-polygons from a small number of vertices (fewer than two hundred). We present some of the challenges and solutions for creating airborne mission-derived spatial metadata. We are implementing these methods to create the spatial metadata for insertion of IceBridge mission data into ECS for public access through NSIDC and ECHO, but they are potentially extensible to any aircraft mission data.
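One common way to derive a compact footprint from a dense flight track is to buffer the track and then simplify the resulting polygon until it fits the vertex budget. The sketch below illustrates that general idea with shapely on synthetic coordinates; it is not the operational IceBridge workflow.

```python
# Sketch: reduce a dense aircraft track to an approximate footprint polygon
# with fewer than 200 vertices. Synthetic data; not the production workflow.
import numpy as np
from shapely.geometry import LineString

# Synthetic flight track (lon, lat) with thousands of points.
t = np.linspace(0, 4 * np.pi, 5000)
track = LineString(np.column_stack([-50 + 0.5 * t, 68 + 0.2 * np.sin(t)]))

# Buffer the track (degrees here, for simplicity) to get an area, then simplify.
footprint = track.buffer(0.05)
tolerance = 0.001
while len(footprint.exterior.coords) > 200:
    footprint = footprint.simplify(tolerance, preserve_topology=True)
    tolerance *= 2          # coarsen until the vertex budget is met

print("vertices:", len(footprint.exterior.coords))
```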
Özdemir, Vural; Kolker, Eugene; Hotez, Peter J; Mohin, Sophie; Prainsack, Barbara; Wynne, Brian; Vayena, Effy; Coşkun, Yavuz; Dereli, Türkay; Huzair, Farah; Borda-Rodriguez, Alexander; Bragazzi, Nicola Luigi; Faris, Jack; Ramesar, Raj; Wonkam, Ambroise; Dandara, Collet; Nair, Bipin; Llerena, Adrián; Kılıç, Koray; Jain, Rekha; Reddy, Panga Jaipal; Gollapalli, Kishore; Srivastava, Sanjeeva; Kickbusch, Ilona
2014-01-01
Metadata refer to descriptions about data or as some put it, "data about data." Metadata capture what happens on the backstage of science, on the trajectory from study conception, design, funding, implementation, and analysis to reporting. Definitions of metadata vary, but they can include the context information surrounding the practice of science, or data generated as one uses a technology, including transactional information about the user. As the pursuit of knowledge broadens in the 21(st) century from traditional "science of whats" (data) to include "science of hows" (metadata), we analyze the ways in which metadata serve as a catalyst for responsible and open innovation, and by extension, science diplomacy. In 2015, the United Nations Millennium Development Goals (MDGs) will formally come to an end. Therefore, we propose that metadata, as an ingredient of responsible innovation, can help achieve the Sustainable Development Goals (SDGs) on the post-2015 agenda. Such responsible innovation, as a collective learning process, has become a key component, for example, of the European Union's 80 billion Euro Horizon 2020 R&D Program from 2014-2020. Looking ahead, OMICS: A Journal of Integrative Biology, is launching an initiative for a multi-omics metadata checklist that is flexible yet comprehensive, and will enable more complete utilization of single and multi-omics data sets through data harmonization and greater visibility and accessibility. The generation of metadata that shed light on how omics research is carried out, by whom and under what circumstances, will create an "intervention space" for integration of science with its socio-technical context. This will go a long way to addressing responsible innovation for a fairer and more transparent society. If we believe in science, then such reflexive qualities and commitments attained by availability of omics metadata are preconditions for a robust and socially attuned science, which can then remain broadly respected, independent, and responsibly innovative. "In Sierra Leone, we have not too much electricity. The lights will come on once in a week, and the rest of the month, dark[ness]. So I made my own battery to power light in people's houses." Kelvin Doe (Global Minimum, 2012) MIT Visiting Young Innovator Cambridge, USA, and Sierra Leone "An important function of the (Global) R&D Observatory will be to provide support and training to build capacity in the collection and analysis of R&D flows, and how to link them to the product pipeline." World Health Organization (2013) Draft Working Paper on a Global Health R&D Observatory.
NASA Astrophysics Data System (ADS)
Koppe, Roland; Scientific MaNIDA-Team
2013-04-01
The Marine Network for Integrated Data Access (MaNIDA) aims to build a sustainable e-infrastructure to support discovery and re-use of marine data from distinct data providers in Germany (see related abstracts in session ESSI 1.2). In order to provide users integrated access and retrieval of expedition or cruise metadata, data, services and publications as well as relationships among the various objects, we are developing (web) applications based on state of the art technologies: the Data Portal of German Marine Research. Since the distributed content providers in the German network have distinct objectives and mandates for storing digital objects (e.g. long-term data preservation, near real time data, publication repositories), we have to cope with metadata that are heterogeneous in syntax and semantics, data types and formats, as well as access solutions. We have defined a set of core metadata elements which are common to our content providers and therefore useful for discovery and for building relationships among objects. Existing catalogues for various types of vocabularies are being used to assure the mapping to community-wide used terms. We distinguish between expedition metadata and continuously harvestable metadata objects from distinct data providers. • Existing expedition metadata from distinct sources is integrated and validated in order to create an expedition metadata catalogue which is used as the authoritative source for expedition-related content. The web application allows browsing by e.g. research vessel and date, exploring expeditions and research gaps by tracklines, and viewing expedition details (begin/end, ports, platforms, chief scientists, events, etc.). Expedition-related objects from harvesting are also dynamically associated with expedition information and presented to the user. Hence we will provide web services for detailed expedition information. • Other harvestable content is separated into four categories: archived data and data products, near real time data, publications and reports. Reports are a special case of publication, describing cruise planning, cruise reports or popular reports on expeditions, and are orthogonal to e.g. peer-reviewed articles. Each object's metadata contains at least: identifier(s) e.g. doi/hdl, title, author(s), date, expedition(s), platform(s) e.g. research vessel Polarstern. Furthermore, project(s), parameter(s), device(s) and e.g. geographic coverage are of interest. An international gazetteer resolves geographic coverage to region names and annotates the object metadata accordingly. Information is homogeneously presented to the user, independent of the underlying format, but adaptable to specific disciplines, e.g. bathymetry. Data access and dissemination information is also available to the user as data download links or web services (e.g. WFS, WMS). Based on relationship metadata we dynamically build graphs of objects to support the user in finding possibly relevant associated objects. Technically, metadata is based on ISO/OGC standards or provider specifications. Metadata is harvested via OAI-PMH or OGC CSW and indexed with Apache Lucene. This enables powerful full-text search, geographic and temporal search as well as faceting. In this presentation we will illustrate the architecture and the current implementation of our integrated approach.
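Metadata harvesting via OAI-PMH, as mentioned above, can be sketched with a simple client that pages through ListRecords responses using resumption tokens. The repository URL below is a hypothetical placeholder; the verbs and namespaces follow the OAI-PMH 2.0 specification.

```python
# Sketch: harvest Dublin Core records from an OAI-PMH endpoint.
# The repository URL is a hypothetical placeholder.
import requests
import xml.etree.ElementTree as ET

ENDPOINT = "https://example-provider.de/oai"      # assumed OAI-PMH base URL
NS = {"oai": "http://www.openarchives.org/OAI/2.0/",
      "dc": "http://purl.org/dc/elements/1.1/"}

params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
while True:
    resp = requests.get(ENDPOINT, params=params, timeout=60)
    resp.raise_for_status()
    root = ET.fromstring(resp.content)

    for record in root.findall(".//oai:record", NS):
        title = record.find(".//dc:title", NS)
        if title is not None:
            print(title.text)

    # Follow the resumption token until the repository reports completion.
    token = root.find(".//oai:resumptionToken", NS)
    if token is None or not (token.text or "").strip():
        break
    params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}
```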
Rijs, Nicole J; O'Hair, Richard A J
2012-03-28
A combination of gas-phase 3D quadrupole ion trap mass spectrometry experiments and density functional theory (DFT) calculations has been used to examine the mechanism of thermal decomposition of fluorinated coinage metal carboxylates. The precursor anions, [CF3CO2MO2CCF3]- (M = Cu, Ag and Au), were introduced into the gas phase via electrospray ionization. Multistage mass spectrometry (MSn) experiments were conducted utilizing collision-induced dissociation, yielding a series of trifluoromethylated organometallic species and fluorides via the loss of CO2, CF2 or "CF2CO2". Carboxylate ligand loss was insignificant or absent in all cases. DFT calculations were carried out on a range of potentially competing fragmentation pathways for [CF3CO2MO2CCF3]-, [CF3CO2MCF3]- and [CF3CO2MF]-. These shed light on possible products and mechanisms for loss of "CF2CO2", namely, concerted or step-wise loss of CO2 and CF2 and a CF2CO2 lactone pathway. The lactone pathway was found to be higher in energy in all cases. In addition, the possibility of forming [CF3MCF3]- and [CF3MF]- via decarboxylation is discussed. For the first time, the novel fluoride complexes [FMF]- (M = Cu, Ag and Au) have been experimentally observed. Finally, the decomposition reactions of [CF3CO2ML]- (where L = CF3 and CF3CO2) and [CH3CO2ML]- (where L = CH3 and CH3CO2) are compared.
Security policy speculation of user uploaded images on content sharing sites
NASA Astrophysics Data System (ADS)
Iyapparaja, M.; Tiwari, Maneesh
2017-11-01
Technology is advancing rapidly, and there are numerous social sites on which users share content such as images, audio, and video with one another. Considering all of the activity on these sites, shared images in particular need privacy protection. For this reason, an adaptive privacy policy prediction mechanism is used to provide security for images. Image privacy is a significant problem on social sites such as Facebook and Twitter, where social context, image content, and metadata all play a role. To address this problem, we developed a solution consisting of two frameworks that, based on the history of an uploaded image, provide a distributed way of assigning a policy to it. Images are classified by type, and privacy policies are applied to uploaded images according to the algorithm we employed. Following this policy prediction, subsequently uploaded images of the same type follow the same policy, providing effective protection for them.
The physical characteristics of the sediments on and surrounding Dauphin Island, Alabama
Ellis, Alisha M.; Marot, Marci E.; Smith, Christopher G.; Wheaton, Cathryn J.
2017-06-20
Scientists from the U.S. Geological Survey, St. Petersburg Coastal and Marine Science Center collected 303 surface sediment samples from Dauphin Island, Alabama, and the surrounding water bodies in August 2015. These sediments were processed to determine physical characteristics such as organic content, bulk density, and grain-size. The environments where the sediments were collected include high and low salt marshes, washover deposits, dunes, beaches, sheltered bays, and open water. Sampling by the USGS was part of a larger study to assess the feasibility and sustainability of proposed restoration efforts for Dauphin Island, Alabama, and assess the island’s resilience to rising sea level and storm events. The data presented in this publication can be used by modelers to attempt validation of hindcast models and create predictive forecast models for both baseline conditions and storms. This study was funded by the National Fish and Wildlife Foundation, via the Gulf Environmental Benefit Fund.This report serves as an archive for sedimentological data derived from surface sediments. Downloadable data are available as Excel spreadsheets, JPEG files, and formal Federal Geographic Data Committee metadata.
NASA Astrophysics Data System (ADS)
Tereshchenko, D. S.; Morozov, I. V.; Boltalin, A. I.; Karpova, E. V.; Glazunova, T. Yu.; Troyanov, S. I.
2013-01-01
A series of fluoro(trifluoroacetato)metallates were synthesized by crystallization from solutions in trifluoroacetic acid containing nickel(II) or cobalt(II) nitrate hydrates and alkali metal or ammonium fluorides: Li[Ni3(μ3-F)(CF3COO)6(CF3COOH)3](CF3COOH)3 (I), M'[Ni3(μ3-F)(CF3COO)6(CF3COOH)3] (M' = Na (II), NH4 (IV), Rb (V), and Cs (VI)), NH4[Co3(μ3-F)(CF3COO)6(CF3COOH)3] (III), and Cs[Ni3(μ3-F)(CF3COO)6(CF3COOH)3](CF3COOH)0.5 (VII). The crystal structures of these compounds were determined by single-crystal X-ray diffraction. All structures contain triangular trinuclear complex anions [M″3(μ3-F)(CF3COO)6(CF3COOH)3]- (M″ = Ni, Co) structurally similar to trinuclear 3d metal oxo carboxylate complexes. The three-coordinated F atom is located at the center of the triangle formed by Ni(II) or Co(II) atoms. The metal atoms are linked in pairs by six bridging trifluoroacetate groups located above and below the plane of the [M″3F] triangle. The oxygen atoms of the axial CF3COOH molecules complete the coordination environment of the M″ atoms to an octahedron.
Gastrointestinal pathology in juvenile and adult CFTR-knockout ferrets.
Sun, Xingshen; Olivier, Alicia K; Yi, Yaling; Pope, Christopher E; Hayden, Hillary S; Liang, Bo; Sui, Hongshu; Zhou, Weihong; Hager, Kyle R; Zhang, Yulong; Liu, Xiaoming; Yan, Ziying; Fisher, John T; Keiser, Nicholas W; Song, Yi; Tyler, Scott R; Goeken, J Adam; Kinyon, Joann M; Radey, Matthew C; Fligg, Danielle; Wang, Xiaoyan; Xie, Weiliang; Lynch, Thomas J; Kaminsky, Paul M; Brittnacher, Mitchell J; Miller, Samuel I; Parekh, Kalpaj; Meyerholz, David K; Hoffman, Lucas R; Frana, Timothy; Stewart, Zoe A; Engelhardt, John F
2014-05-01
Cystic fibrosis (CF) is a multiorgan disease caused by loss of a functional cystic fibrosis transmembrane conductance regulator (CFTR) chloride channel in many epithelia of the body. Here we report the pathology observed in the gastrointestinal organs of juvenile to adult CFTR-knockout ferrets. CF gastrointestinal manifestations included gastric ulceration, intestinal bacterial overgrowth with villous atrophy, and rectal prolapse. Metagenomic phylogenetic analysis of fecal microbiota by deep sequencing revealed considerable genotype-independent microbial diversity between animals, with the majority of taxa overlapping between CF and non-CF pairs. CF hepatic manifestations were variable, but included steatosis, necrosis, biliary hyperplasia, and biliary fibrosis. Gallbladder cystic mucosal hyperplasia was commonly found in 67% of CF animals. The majority of CF animals (85%) had pancreatic abnormalities, including extensive fibrosis, loss of exocrine pancreas, and islet disorganization. Interestingly, 2 of 13 CF animals retained predominantly normal pancreatic histology (84% to 94%) at time of death. Fecal elastase-1 levels from these CF animals were similar to non-CF controls, whereas all other CF animals evaluated were pancreatic insufficient (<2 μg elastase-1 per gram of feces). These findings suggest that genetic factors likely influence the extent of exocrine pancreas disease in CF ferrets and have implications for the etiology of pancreatic sufficiency in CF patients. In summary, these studies demonstrate that the CF ferret model develops gastrointestinal pathology similar to CF patients. Copyright © 2014 American Society for Investigative Pathology. Published by Elsevier Inc. All rights reserved.
Saint-Criq, Vinciane; Kim, Sung Hoon; Katzenellenbogen, John A.; Harvey, Brian J.
2013-01-01
Male cystic fibrosis (CF) patients survive longer than females and lung exacerbations in CF females vary during the estrous cycle. Estrogen has been reported to reduce the height of the airway surface liquid (ASL) in female CF bronchial epithelium. Here we investigated the effect of 17β-estradiol on the airway surface liquid height and ion transport in normal (NuLi-1) and CF (CuFi-1) bronchial epithelial monolayers. Live cell imaging using confocal microscopy revealed that airway surface liquid height was significantly higher in the non-CF cells compared to the CF cells. 17β-estradiol (0.1–10 nM) reduced the airway surface liquid height in non-CF and CF cells after 30 min treatment. Treatment with the nuclear-impeded Estrogen Dendrimer Conjugate mimicked the effect of free estrogen by reducing significantly the airway surface liquid height in CF and non-CF cells. Inhibition of chloride transport or basolateral potassium recycling decreased the airway surface liquid height and 17β-estradiol had no additive effect in the presence of these ion transporter inhibitors. 17β-estradiol decreased bumetanide-sensitive transepithelial short-circuit current in non-CF cells and prevented the forskolin-induced increase in ASL height. 17β-estradiol stimulated an amiloride-sensitive transepithelial current and increased ouabain-sensitive basolateral short-circuit current in CF cells. 17β-estradiol increased PKCδ activity in CF and non-CF cells. These results demonstrate that estrogen dehydrates CF and non-CF ASL, and these responses to 17β-estradiol are non-genomic rather than involving the classical nuclear estrogen receptor pathway. 17β-estradiol acts on the airway surface liquid by inhibiting cAMP-mediated chloride secretion in non-CF cells and increasing sodium absorption via the stimulation of PKCδ, ENaC and the Na+/K+ATPase in CF cells. PMID:24223826
The Use of Metadata Visualisation to Assist Information Retrieval
2007-10-01
...album title, the track length and the genre of music. Again, any of these pieces of information can be used to quickly search and locate specific... that person. Music files also have metadata tags, in a format called ID3. This usually contains information such as the artist, the song title, the... tracks, to provide more information about the entire music collection, or to find similar or diverse tracks within the collection. Metadata is...
Comprehensive Optimal Manpower and Personnel Analytic Simulation System (COMPASS)
2009-10-01
The EDB consists of 4 major components (some of which are re-usable): 1. Metadata Editor (MDE): Also considered a leaf node, the metadata... end-user queries via the QB. The EDB supports multiple instances of the MDE, although currently, only a single instance is recommended. 2. Query... the MSB is a central collection of web services, responsible for the authentication and authorization of users, maintenance of the EDB metadata...
Omics Metadata Management Software v. 1 (OMMS)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Our application, the Omics Metadata Management Software (OMMS), answers both needs, empowering experimentalists to generate intuitive, consistent metadata, and to perform bioinformatics analyses and information management tasks via a simple and intuitive web-based interface. Several use cases with short-read sequence datasets are provided to showcase the full functionality of the OMMS, from metadata curation tasks, to bioinformatics analyses and results management and downloading. The OMMS can be implemented as a stand-alone package for individual laboratories, or can be configured for web-based deployment supporting geographically dispersed research teams. Our software was developed with open-source bundles, is flexible, extensible, and easily installed and run by operators with general system administration and scripting language literacy.
A statistical model to predict one-year risk of death in patients with cystic fibrosis.
Aaron, Shawn D; Stephenson, Anne L; Cameron, Donald W; Whitmore, George A
2015-11-01
We constructed a statistical model to assess the risk of death for cystic fibrosis (CF) patients between scheduled annual clinical visits. Our model includes a CF health index that shows the influence of risk factors on CF chronic health and on the severity and frequency of CF exacerbations. Our study used Canadian CF registry data for 3,794 CF patients born after 1970. Data up to 2010 were analyzed, yielding 44,390 annual visit records. Our stochastic process model postulates that CF health between annual clinical visits is a superposition of chronic disease progression and an exacerbation shock stream. Death occurs when an exacerbation carries CF health across a critical threshold. The data constitute censored survival data, and hence, threshold regression was used to connect CF death to study covariates. Maximum likelihood estimates were used to determine which clinical covariates were included within the regression functions for both CF chronic health and CF exacerbations. Lung function, Pseudomonas aeruginosa infection, CF-related diabetes, weight deficiency, pancreatic insufficiency, and the deltaF508 homozygous mutation were significantly associated with CF chronic health status. Lung function, age, gender, age at CF diagnosis, P aeruginosa infection, body mass index <18.5, number of previous hospitalizations for CF exacerbations in the preceding year, and decline in forced expiratory volume in 1 second in the preceding year were significantly associated with CF exacerbations. When combined in one summative model, the regression functions for CF chronic health and CF exacerbation risk provided a simple clinical scoring tool for assessing 1-year risk of death for an individual CF patient. Goodness-of-fit tests of the model showed very encouraging results. We confirmed predictive validity of the model by comparing actual and estimated deaths in repeated hold-out samples from the data set and showed excellent agreement between estimated and actual mortality. Our threshold regression model incorporates a composite CF chronic health status index and an exacerbation risk index to produce an accurate clinical scoring tool for prediction of 1-year survival of CF patients. Our tool can be used by clinicians to decide on optimal timing for lung transplant referral. Copyright © 2015 Elsevier Inc. All rights reserved.
Sulbaek Andersen, Mads P; Nielsen, Ole J; Karpichev, Boris; Wallington, Timothy J; Sander, Stanley P
2012-06-21
The smog chamber/Fourier-transform infrared spectroscopy (FTIR) technique was used to measure the rate coefficients k(Cl + CF3CHClOCHF2, isoflurane) = (4.5 ± 0.8) × 10^-15, k(Cl + CF3CHFOCHF2, desflurane) = (1.0 ± 0.3) × 10^-15, k(Cl + (CF3)2CHOCH2F, sevoflurane) = (1.1 ± 0.1) × 10^-13, and k(OH + (CF3)2CHOCH2F) = (3.5 ± 0.7) × 10^-14 cm^3 molecule^-1 in 700 Torr of N2/air diluent at 295 ± 2 K. An upper limit of 6 × 10^-17 cm^3 molecule^-1 was established for k(Cl + (CF3)2CHOC(O)F). The laser photolysis/laser-induced fluorescence (LP/LIF) technique was employed to determine hydroxyl radical rate coefficients as a function of temperature (241-298 K): k(OH + CF3CHFOCHF2) = (7.05 ± 1.80) × 10^-13 exp[-(1551 ± 72)/T] cm^3 molecule^-1; k(296 ± 1 K) = (3.73 ± 0.08) × 10^-15 cm^3 molecule^-1, and k(OH + (CF3)2CHOCH2F) = (9.98 ± 3.24) × 10^-13 exp[-(969 ± 82)/T] cm^3 molecule^-1; k(298 ± 1 K) = (3.94 ± 0.30) × 10^-14 cm^3 molecule^-1. The rate coefficient k(OH + CF3CHClOCHF2, 296 ± 1 K) = (1.45 ± 0.16) × 10^-14 cm^3 molecule^-1 was also determined. Chlorine atoms react with CF3CHFOCHF2 via H-abstraction to give CF3CFOCHF2 and CF3CHFOCF2 radicals in yields of approximately 83% and 17%. The major atmospheric fate of the CF3C(O)FOCHF2 alkoxy radical is decomposition via elimination of CF3 to give FC(O)OCHF2 and is unaffected by the method used to generate the CF3C(O)FOCHF2 radicals. CF3CHFOCF2 radicals add O2 and are converted by subsequent reactions into CF3CHFOCF2O alkoxy radicals, which decompose to give COF2 and CF3CHFO radicals. In 700 Torr of air, 82% of CF3CHFO radicals undergo C-C scission to yield HC(O)F and CF3 radicals, with the remaining 18% reacting with O2 to give CF3C(O)F. Atmospheric oxidation of (CF3)2CHOCH2F gives (CF3)2CHOC(O)F in a molar yield of 93 ± 6% with CF3C(O)CF3 and HCOF as minor products. The IR spectra of (CF3)2CHOC(O)F and FC(O)OCHF2 are reported for the first time. The atmospheric lifetimes of CF3CHClOCHF2, CF3CHFOCHF2, and (CF3)2CHOCH2F (sevoflurane) are estimated at 3.2, 14, and 1.1 years, respectively. The 100 year time horizon global warming potentials of isoflurane, desflurane, and sevoflurane are 510, 2540, and 130, respectively. The atmospheric degradation products of these anesthetics are not of environmental concern.
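The temperature-dependent OH rate coefficients above follow the Arrhenius form k(T) = A exp(-B/T). As a quick consistency check, the snippet below recomputes the quoted room-temperature values from the fitted parameters given in the abstract.

```python
# Reproduce the quoted room-temperature rate coefficients from the fitted
# Arrhenius parameters k(T) = A * exp(-B / T) given in the abstract.
import math

def k(A, B, T):
    return A * math.exp(-B / T)

# OH + CF3CHFOCHF2 (desflurane): A = 7.05e-13, B = 1551 K
print(f"{k(7.05e-13, 1551.0, 296.0):.2e}")   # ~3.7e-15, compare (3.73 ± 0.08) × 10^-15
# OH + (CF3)2CHOCH2F (sevoflurane): A = 9.98e-13, B = 969 K
print(f"{k(9.98e-13, 969.0, 298.0):.2e}")    # ~3.9e-14, compare (3.94 ± 0.30) × 10^-14
```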
The Role Of Contact Force In Atrial Fibrillation Ablation.
Nakagawa, Hiroshi; Jackman, Warren M
2014-01-01
During radiofrequency (RF) ablation, low electrode-tissue contact force (CF) is associated with ineffective RF lesion formation, whereas excessive CF may increase the risk of steam pop and perforation. Recently, ablation catheters using two technologies have been developed to measure real-time catheter-tissue CF. One catheter uses three optical fibers to measure microdeformation of a deformable body in the catheter tip. The other catheter uses a small spring connecting the ablation tip electrode to the catheter shaft with a magnetic transmitter and sensors to measure microdeflection of the spring. Pre-clinical experimental studies have shown that 1) at constant RF power and application time, RF lesion size significantly increases with increasing CF; 2) the incidence of steam pop and thrombus also increase with increasing CF; 3) modulating RF power based on CF (i.e, high RF power at low CF and lower RF power at high CF) results in a similar and predictable RF lesion size. In clinical studies in patients undergoing pulmonary vein (PV) isolation, CF during mapping in the left atrium and PVs showed a wide range of CF and transient high CF. The most common high CF site was located at the anterior/rightward left atrial roof, directly beneath the ascending aorta. There was a poor relationship between CF and previously used surrogate parameters for CF (unipolar or bipolar atrial potential amplitude and impedance). Patients who underwent PV isolation with an average CF of <10 g experienced higher AF recurrence, whereas patients with ablation using an average CF of > 20g had lower AF recurrence. AF recurred within 12 months in 6 of 8 patients (75%) who had a mean Force-Time Integral (FTI, area under the curve for contact force vs. time) < 500 gs. In contrast, AF recurred in only 4 of 13 patients (21%) with ablation using a mean FTI >1000 gs. In another study, controlling RF power based on CF prevented steam pop and impedance rise without loss of lesion effectiveness. These studies confirm that CF is a major determinant of RF lesion size and future systems combining CF, RF power and application time may provide real-time assessment of lesion formation.
NASA Astrophysics Data System (ADS)
Boldrini, Enrico; Schaap, Dick M. A.; Nativi, Stefano
2013-04-01
SeaDataNet implements a distributed pan-European infrastructure for Ocean and Marine Data Management whose nodes are maintained by 40 national oceanographic and marine data centers from 35 countries riparian to all European seas. A unique portal makes possible distributed discovery, visualization and access of the available sea data across all the member nodes. Geographic metadata play an important role in such an infrastructure, enabling an efficient documentation and discovery of the resources of interest. In particular: - Common Data Index (CDI) metadata describe the sea datasets, including identification information (e.g. product title, interested area), evaluation information (e.g. data resolution, constraints) and distribution information (e.g. download endpoint, download protocol); - Cruise Summary Reports (CSR) metadata describe cruises and field experiments at sea, including identification information (e.g. cruise title, name of the ship), acquisition information (e.g. utilized instruments, number of samples taken) In the context of the second phase of SeaDataNet (SeaDataNet 2 EU FP7 project, grant agreement 283607, started on October 1st, 2011 for a duration of 4 years) a major target is the setting, adoption and promotion of common international standards, to the benefit of outreach and interoperability with the international initiatives and communities (e.g. OGC, INSPIRE, GEOSS, …). A standardization effort conducted by CNR with the support of MARIS, IFREMER, STFC, BODC and ENEA has led to the creation of a ISO 19115 metadata profile of CDI and its XML encoding based on ISO 19139. The CDI profile is now in its stable version and it's being implemented and adopted by the SeaDataNet community tools and software. The effort has then continued to produce an ISO based metadata model and its XML encoding also for CSR. The metadata elements included in the CSR profile belong to different models: - ISO 19115: E.g. cruise identification information, including title and area of interest; metadata responsible party information - ISO 19115-2: E.g. acquisition information, including date of sampling, instruments used - SeaDataNet: E.g. SeaDataNet community specific, including EDMO and EDMERP code lists Two main guidelines have been followed in the metadata model drafting: - All the obligations and constraints required by both the ISO standards and INSPIRE directive had to be satisfied. These include the presence of specific elements with given cardinality (e.g. mandatory metadata date stamp, mandatory lineage information) - All the content information of legacy CSR format had to be supported by the new metadata model. An XML encoding of the CSR profile has been defined as well. Based on the ISO 19139 XML schema and constraints, it adds the new elements specific of the SeaDataNet community. The associated Schematron rules are used to enforce constraints not enforceable just with the Schema and to validate elements content against the SeaDataNet code lists vocabularies.
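Validation of such profile-conformant records typically combines XML Schema checks with Schematron rules, as described above. The sketch below shows one way to run both with lxml; the file names and schema locations are placeholders.

```python
# Sketch: validate a CSR/CDI-style ISO 19139 record against an XML Schema and
# a set of Schematron rules. File names and schema paths are placeholders.
from lxml import etree
from lxml.isoschematron import Schematron

record = etree.parse("csr_record.xml")                 # hypothetical metadata record

# XML Schema validation (structure, cardinalities).
schema = etree.XMLSchema(etree.parse("gmd/gmd.xsd"))   # assumed ISO 19139 schema location
print("schema valid:", schema.validate(record))

# Schematron validation (community constraints, code-list checks).
rules = Schematron(etree.parse("seadatanet_csr.sch"))  # hypothetical rule file
print("schematron valid:", rules.validate(record))
```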
Mercury: Reusable software application for Metadata Management, Data Discovery and Access
NASA Astrophysics Data System (ADS)
Devarakonda, Ranjeet; Palanisamy, Giri; Green, James; Wilson, Bruce E.
2009-12-01
Mercury is a federated metadata harvesting, data discovery and access tool based on both open source packages and custom developed software. It was originally developed for NASA, and the Mercury development consortium now includes funding from NASA, USGS, and DOE. Mercury is itself a reusable toolset for metadata, with current use in 12 different projects. Mercury also supports the reuse of metadata by enabling searching across a range of metadata specifications and standards including XML, Z39.50, FGDC, Dublin-Core, Darwin-Core, EML, and ISO-19115. Mercury provides a single portal to information contained in distributed data management systems. It collects metadata and key data from contributing project servers distributed around the world and builds a centralized index. The Mercury search interfaces then allow the users to perform simple, fielded, spatial and temporal searches across these metadata sources. One of the major goals of the recent redesign of Mercury was to improve the software reusability across the projects which currently fund the continuing development of Mercury. These projects span a range of land, atmosphere, and ocean ecological communities and have a number of common needs for metadata searches, but they also have a number of needs specific to one or a few projects. To balance these common and project-specific needs, Mercury's architecture includes three major reusable components: a harvester engine, an indexing system and a user interface component. The harvester engine is responsible for harvesting metadata records from various distributed servers around the USA and around the world. The harvester software was packaged in such a way that all the Mercury projects use the same harvester scripts, but each project is driven by a set of configuration files. The harvested files are then passed to the indexing system, where each of the fields in these structured metadata records is indexed properly, so that the query engine can perform simple, keyword, spatial and temporal searches across these metadata sources. The search user interface software has two API categories: a common core API which is used by all the Mercury user interfaces for querying the index, and a customized API for project-specific user interfaces. For our work in producing a reusable, portable, robust, feature-rich application, Mercury received a 2008 NASA Earth Science Data Systems Software Reuse Working Group Peer-Recognition Software Reuse Award. The new Mercury system is based on a Service Oriented Architecture and effectively reuses components for various services such as Thesaurus Service, Gazetteer Web Service and UDDI Directory Services. The software also provides various search services including RSS, Geo-RSS, OpenSearch, Web Services and Portlets, an integrated shopping cart to order datasets from various data centers (ORNL DAAC, NSIDC), and integrated visualization tools. Other features include filtering and dynamic sorting of search results, book-markable search results, and the ability to save, retrieve, and modify search criteria.
Carbon nanotubes/carbon fiber hybrid material: a super support material for sludge biofilms.
Liu, Qijie; Dai, Guangze; Bao, Yanling
2017-07-16
Carbon fiber (CF) is widely used as a sludge biofilm support material for wastewater treatment. A carbon nanotubes/carbon fiber (CNTs/CF) hybrid material was prepared by ultrasonically assisted electrophoretic deposition (EPD). CF supports (CF without handling, CF oxidized by nitric acid, and the CNTs/CF hybrid material) were evaluated by sludge immobilization tests, bacterial cell adsorption tests and Derjaguin-Landau-Verwey-Overbeek (DLVO) theory. We found that the CNTs/CF hybrid material has a high capacity for adsorbing activated sludge, nitrifying bacterial sludge and pure strains (Escherichia coli and Staphylococcus aureus). CNTs deposited on the CF surface easily wound around the curved surfaces of bacterial cells, which resulted in capturing more bacterial cells. DLVO theory indicated the lowest total interaction energy for the CNTs/CF hybrid material, which resulted in the highest bacterial cell adsorption velocity. Experiments and DLVO theory results proved that the CNTs/CF hybrid material is a super support material for sludge biofilms.
Cleaning by clustering: methodology for addressing data quality issues in biomedical metadata.
Hu, Wei; Zaveri, Amrapali; Qiu, Honglei; Dumontier, Michel
2017-09-18
The ability to efficiently search and filter datasets depends on access to high quality metadata. While most biomedical repositories require data submitters to provide a minimal set of metadata, some such as the Gene Expression Omnibus (GEO) allows users to specify additional metadata in the form of textual key-value pairs (e.g. sex: female). However, since there is no structured vocabulary to guide the submitter regarding the metadata terms to use, consequently, the 44,000,000+ key-value pairs in GEO suffer from numerous quality issues including redundancy, heterogeneity, inconsistency, and incompleteness. Such issues hinder the ability of scientists to hone in on datasets that meet their requirements and point to a need for accurate, structured and complete description of the data. In this study, we propose a clustering-based approach to address data quality issues in biomedical, specifically gene expression, metadata. First, we present three different kinds of similarity measures to compare metadata keys. Second, we design a scalable agglomerative clustering algorithm to cluster similar keys together. Our agglomerative cluster algorithm identified metadata keys that were similar, based on (i) name, (ii) core concept and (iii) value similarities, to each other and grouped them together. We evaluated our method using a manually created gold standard in which 359 keys were grouped into 27 clusters based on six types of characteristics: (i) age, (ii) cell line, (iii) disease, (iv) strain, (v) tissue and (vi) treatment. As a result, the algorithm generated 18 clusters containing 355 keys (four clusters with only one key were excluded). In the 18 clusters, there were keys that were identified correctly to be related to that cluster, but there were 13 keys which were not related to that cluster. We compared our approach with four other published methods. Our approach significantly outperformed them for most metadata keys and achieved the best average F-Score (0.63). Our algorithm identified keys that were similar to each other and grouped them together. Our intuition that underpins cleaning by clustering is that, dividing keys into different clusters resolves the scalability issues for data observation and cleaning, and keys in the same cluster with duplicates and errors can easily be found. Our algorithm can also be applied to other biomedical data types.
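The clustering idea can be sketched as follows: compute a pairwise name-similarity matrix over metadata keys and apply average-linkage agglomerative clustering. The toy keys, the single string-similarity measure and the distance threshold are simplifications of the paper's approach, which combines several similarity measures.

```python
# Sketch of cleaning by clustering: group similar metadata keys using
# average-linkage agglomerative clustering over a name-similarity matrix.
# Toy keys only; only string similarity is used here.
from difflib import SequenceMatcher
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

keys = ["age", "age (years)", "patient age", "cell line", "cellline",
        "tissue", "tissue type", "treatment", "treated with"]

n = len(keys)
dist = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        sim = SequenceMatcher(None, keys[i], keys[j]).ratio()
        dist[i, j] = dist[j, i] = 1.0 - sim           # distance = 1 - similarity

# Condense the matrix, build the dendrogram, cut it at an assumed threshold.
clusters = fcluster(linkage(squareform(dist), method="average"),
                    t=0.6, criterion="distance")
for label in sorted(set(clusters)):
    print(label, [k for k, c in zip(keys, clusters) if c == label])
```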
Applications of the LBA-ECO Metadata Warehouse
NASA Astrophysics Data System (ADS)
Wilcox, L.; Morrell, A.; Griffith, P. C.
2006-05-01
The LBA-ECO Project Office has developed a system to harvest and warehouse metadata resulting from the Large-Scale Biosphere Atmosphere Experiment in Amazonia. The harvested metadata is used to create dynamically generated reports, available at www.lbaeco.org, which facilitate access to LBA-ECO datasets. The reports are generated for specific controlled vocabulary terms (such as an investigation team or a geospatial region), and are cross-linked with one another via these terms. This approach creates a rich contextual framework enabling researchers to find datasets relevant to their research. It maximizes data discovery by association and provides a greater understanding of the scientific and social context of each dataset. For example, our website provides a profile (e.g. participants, abstract(s), study sites, and publications) for each LBA-ECO investigation. Linked from each profile is a list of associated registered dataset titles, each of which link to a dataset profile that describes the metadata in a user-friendly way. The dataset profiles are generated from the harvested metadata, and are cross-linked with associated reports via controlled vocabulary terms such as geospatial region. The region name appears on the dataset profile as a hyperlinked term. When researchers click on this link, they find a list of reports relevant to that region, including a list of dataset titles associated with that region. Each dataset title in this list is hyperlinked to its corresponding dataset profile. Moreover, each dataset profile contains hyperlinks to each associated data file at its home data repository and to publications that have used the dataset. We also use the harvested metadata in administrative applications to assist quality assurance efforts. These include processes to check for broken hyperlinks to data files, automated emails that inform our administrators when critical metadata fields are updated, dynamically generated reports of metadata records that link to datasets with questionable file formats, and dynamically generated region/site coordinate quality assurance reports. These applications are as important as those that facilitate access to information because they help ensure a high standard of quality for the information. This presentation will discuss reports currently in use, provide a technical overview of the system, and discuss plans to extend this system to harvest metadata resulting from the North American Carbon Program by drawing on datasets in many different formats, residing in many thematic data centers and also distributed among hundreds of investigators.
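An administrative check such as the broken-hyperlink report mentioned above can be sketched in a few lines; the URL list below stands in for links harvested from the metadata records.

```python
# Sketch: report broken hyperlinks found in harvested metadata records.
# The URL list is a stand-in for links extracted from the metadata warehouse.
import requests

data_links = [
    "https://daac.ornl.gov/",             # example links; replace with harvested URLs
    "https://example.org/missing-file.nc",
]

for url in data_links:
    try:
        resp = requests.head(url, allow_redirects=True, timeout=15)
        status = resp.status_code
    except requests.RequestException as exc:
        status = f"error: {exc.__class__.__name__}"
    if status != 200:
        print(f"BROKEN? {url} -> {status}")
```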
Introducing a Web API for Dataset Submission into a NASA Earth Science Data Center
NASA Astrophysics Data System (ADS)
Moroni, D. F.; Quach, N.; Francis-Curley, W.
2016-12-01
As the landscape of data becomes increasingly more diverse in the domain of Earth Science, the challenges of managing and preserving data become more onerous and complex, particularly for data centers on fixed budgets and limited staff. Many solutions already exist to ease the cost burden for the downstream component of the data lifecycle, yet most archive centers are still racing to keep up with the influx of new data that still needs to find a quasi-permanent resting place. For instance, having well-defined metadata that is consistent across the entire data landscape provides for well-managed and preserved datasets throughout the latter end of the data lifecycle. Translators between different metadata dialects are already in operational use, and facilitate keeping older datasets relevant in today's world of rapidly evolving metadata standards. However, very little is done to address the first phase of the lifecycle, which deals with the entry of both data and the corresponding metadata into a system that is traditionally opaque and closed off to external data producers, thus resulting in a significant bottleneck to the dataset submission process. The ATRAC system was the NOAA NCEI's answer to this previously obfuscated barrier to scientists wishing to find a home for their climate data records, providing a web-based entry point to submit timely and accurate metadata and information about a very specific dataset. A couple of NASA's Distributed Active Archive Centers (DAACs) have implemented their own versions of a web-based dataset and metadata submission form including the ASDC and the ORNL DAAC. The Physical Oceanography DAAC is the most recent in the list of NASA-operated DAACs who have begun to offer their own web-based dataset and metadata submission services to data producers. What makes the PO.DAAC dataset and metadata submission service stand out from these pre-existing services is the option of utilizing both a web browser GUI and a RESTful API to facilitate rapid and efficient updating of dataset metadata records by external data producers. Here we present this new service and demonstrate the variety of ways in which a multitude of Earth Science datasets may be submitted in a manner that significantly reduces the time in ensuring that new, vital data reaches the public domain.
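The abstract announces a RESTful submission API without spelling out its routes, so the sketch below only illustrates the general pattern of such a client. The endpoint, token header and payload fields are assumptions, not the actual PO.DAAC interface.

```python
# Hypothetical client sketch for a web-based dataset/metadata submission API.
# The endpoint, token header, and payload fields are illustrative assumptions.
import requests

API = "https://example.podaac.earthdata.nasa.gov/submission/api"   # assumed base URL
headers = {"Authorization": "Bearer <token>"}                       # placeholder token

dataset_record = {
    "short_name": "EXAMPLE_SST_L4_V1",
    "title": "Example Level-4 Sea Surface Temperature Analysis",
    "processing_level": "4",
    "spatial_coverage": {"north": 90, "south": -90, "east": 180, "west": -180},
    "temporal_coverage": {"start": "2015-01-01", "end": "2016-12-31"},
}

resp = requests.post(f"{API}/datasets", json=dataset_record,
                     headers=headers, timeout=30)
resp.raise_for_status()
print("submission id:", resp.json().get("id"))
```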
Next-Generation Search Engines for Information Retrieval
DOE Office of Scientific and Technical Information (OSTI.GOV)
Devarakonda, Ranjeet; Hook, Leslie A; Palanisamy, Giri
In recent years, there have been significant advancements in the areas of scientific data management and retrieval techniques, particularly in terms of standards and protocols for archiving data and metadata. Scientific data are rich and spread across many different places. In order to integrate these pieces, a data archive and associated metadata should be generated. Data should be stored in a retrievable format and, more importantly, in a format that will continue to be accessible as technology changes, such as XML. While general-purpose search engines (such as Google or Bing) are useful for finding many things on the Internet, they are often of limited usefulness for locating Earth Science data relevant (for example) to a specific spatiotemporal extent. By contrast, tools that search repositories of structured metadata can locate relevant datasets with fairly high precision, but the search is limited to that particular repository. Federated searches (such as Z39.50) have been used, but can be slow, and their comprehensiveness can be limited by downtime in any search partner. An alternative approach to improve comprehensiveness is for a repository to harvest metadata from other repositories, possibly with limits based on subject matter or access permissions. Searches through harvested metadata can be extremely responsive, and the search tool can be customized with semantic augmentation appropriate to the community of practice being served. One such system is Mercury, a metadata harvesting, data discovery, and access system built for researchers to search for, share, and obtain spatiotemporal data used across a range of climate and ecological sciences. Mercury is an open-source toolset with a backend built on Java and search capability supported by popular open-source search libraries such as SOLR and LUCENE. Mercury harvests structured metadata and key data from several data-providing servers around the world and builds a centralized index. The harvested files are indexed consistently through the SOLR search API, so that Mercury can offer search capabilities such as simple, fielded, spatial, and temporal searches across a span of projects ranging over land, atmosphere, and ocean ecology. Mercury also provides data sharing capabilities using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). In this paper we discuss best practices for archiving data and metadata, new searching techniques, efficient ways of data retrieval, and information display.
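The OAI-PMH data-sharing mechanism mentioned above is a simple HTTP protocol, so a minimal harvester needs little more than an HTTP client and an XML parser. The sketch below lists Dublin Core titles from a repository's ListRecords response; the verb and metadataPrefix parameters are standard OAI-PMH, while the base URL is a placeholder.

    import requests
    import xml.etree.ElementTree as ET

    OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/",
              "dc": "http://purl.org/dc/elements/1.1/"}

    def list_titles(base_url):
        """Harvest one page of records and return their Dublin Core titles."""
        resp = requests.get(base_url,
                            params={"verb": "ListRecords", "metadataPrefix": "oai_dc"},
                            timeout=60)
        resp.raise_for_status()
        root = ET.fromstring(resp.content)
        return [t.text for t in root.findall(".//dc:title", OAI_NS)]

    # Example call against a placeholder endpoint:
    # print(list_titles("https://repository.example.org/oai"))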
OSCAR/Surface: Metadata for the WMO Integrated Observing System WIGOS
NASA Astrophysics Data System (ADS)
Klausen, Jörg; Pröscholdt, Timo; Mannes, Jürg; Cappelletti, Lucia; Grüter, Estelle; Calpini, Bertrand; Zhang, Wenjian
2016-04-01
The World Meteorological Organization (WMO) Integrated Global Observing System (WIGOS) is a key WMO priority underpinning all WMO Programs and new initiatives such as the Global Framework for Climate Services (GFCS). It does this by better integrating WMO and co-sponsored observing systems, as well as partner networks. An important aspect of this integration is the description of observational capabilities by way of structured metadata. The 17th Congress of the World Meteorological Organization (Cg-17) has endorsed the semantic WIGOS metadata standard (WMDS) developed by the Task Team on WIGOS Metadata (TT-WMD). The standard comprises a set of metadata classes that are considered to be of critical importance for the interpretation of observations and the evolution of observing systems relevant to WIGOS. The WMDS serves all recognized WMO Application Areas, and its use for all internationally exchanged observational data generated by WMO Members is mandatory. The standard will be introduced in three phases between 2016 and 2020. The Observing Systems Capability Analysis and Review (OSCAR) platform operated by MeteoSwiss on behalf of WMO is the official repository of WIGOS metadata and an implementation of the WMDS. OSCAR/Surface deals with all surface-based observations from land, air and oceans, combining metadata managed by a number of complementary, more domain-specific systems (e.g., GAWSIS for the Global Atmosphere Watch, JCOMMOPS for the marine domain, the WMO Radar database). It is a modern, web-based client-server application with extended information search, filtering and mapping capabilities, including a fully developed management console to add and edit observational metadata. In addition, a powerful application programming interface (API) is being developed to allow machine-to-machine metadata exchange. The API is based on an ISO/OGC-compliant XML schema for the WMDS using the Observations and Measurements (ISO 19156) conceptual model. The purpose of the presentation is to acquaint the audience with OSCAR, the WMDS and the current XML schema, and to explore the relationship to the INSPIRE XML schema. Feedback from experts in the various disciplines of meteorology, climatology, atmospheric chemistry and hydrology on the utility of the new standard and the XML schema will be solicited and will guide WMO in further evolving the WMDS.
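For readers unfamiliar with machine-to-machine metadata exchange, the fragment below sketches what a client-side request for one station's WIGOS metadata might look like. The base URL and resource path are assumptions for illustration only and do not reflect the documented OSCAR/Surface API; the response is simply parsed as XML, in line with the ISO/OGC-compliant schema mentioned above.

    import requests
    import xml.etree.ElementTree as ET

    BASE = "https://oscar.example.int/api"   # placeholder, not the real service URL
    wigos_id = "0-20000-0-06610"             # example of the WIGOS station identifier format

    resp = requests.get(f"{BASE}/stations/{wigos_id}", timeout=30)
    resp.raise_for_status()
    root = ET.fromstring(resp.content)
    print(root.tag)                          # root element of the returned metadata document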
ERIC Educational Resources Information Center
Armstrong, C. J.
1997-01-01
Discusses PICS (Platform for Internet Content Selection), the Centre for Information Quality Management (CIQM), and metadata. Highlights include filtering networked information; the quality of information; and standardizing search engines. (LRW)
Photoionisation study of Xe.CF4 and Kr.CF4 van-der-Waals molecules
DOE Office of Scientific and Technical Information (OSTI.GOV)
Alekseev, V. A., E-mail: alekseev@va3474.spb.edu; Kevorkyants, R.; Garcia, G. A.
2016-05-14
We report on photoionization studies of Xe.CF4 and Kr.CF4 van-der-Waals complexes produced in a supersonic expansion and detected using synchrotron radiation and photoelectron-photoion coincidence techniques. The ionization potential of CF4 is larger than those of the Xe and Kr atoms, and the ground state of the Rg.CF4+ ion correlates with Rg+ (2P3/2) + CF4. The onset of the Rg.CF4+ signals was found to be only ∼0.2 eV below the Rg ionization potential. In agreement with experiment, complementary ab initio calculations show that vertical transitions originating from the potential minimum of the ground state of Rg.CF4 terminate at a part of the potential energy surfaces of Rg.CF4+ which are approximately 0.05 eV below the Rg+ (2P3/2) + CF4 dissociation limit. In contrast to the neutral complexes, which are most stable in the face geometry, for the Rg.CF4+ ions the calculations show that the minimum of the potential energy surface is in the vertex geometry. Experiments, which have been performed only with Xe.CF4, revealed no Xe.CF4+ signal above the first ionization threshold of Xe, suggesting that the Rg.CF4+ ions are not stable above the first dissociation limit.
Serving Fisheries and Ocean Metadata to Communities Around the World
NASA Technical Reports Server (NTRS)
Meaux, Melanie
2006-01-01
NASA's Global Change Master Directory (GCMD) assists the oceanographic community in the discovery, access, and sharing of scientific data by serving on-line fisheries and ocean metadata to users around the globe. As of January 2006, the directory holds more than 16,300 Earth Science data descriptions and over 1,300 services descriptions. Of these, nearly 4,000 unique ocean-related metadata records are available to the public, with many having direct links to the data. In 2005, the GCMD averaged over 5 million hits a month, with nearly a half million unique hosts for the year. Through the GCMD portal (http://gcmd.nasa.gov/), users can search vast and growing quantities of data and services using controlled keywords, free-text searches, or a combination of both. Users may now refine a search based on topic, location, instrument, platform, project, data center, and spatial and temporal coverage. The directory also offers data holders a means to post and search their data through customized portals, i.e. online customized subset metadata directories. The discovery metadata standard used is the Directory Interchange Format (DIF), adopted in 1994. This format has evolved to accommodate other national and international standards such as FGDC and ISO 19115. Users can submit metadata through easy-to-use online and offline authoring tools. The directory, which also serves as a coordinating node of the International Directory Network (IDN), has been active at the international, regional and national level for many years through its involvement with the Committee on Earth Observation Satellites (CEOS), federal agencies (such as NASA, NOAA, and USGS), international agencies (such as IOC/IODE, UN, and JAXA) and partnerships (such as ESIP, IOOS/DMAC, GOSIC, GLOBEC, OBIS, and GoMODP), sharing experience and knowledge related to metadata and/or data management and interoperability.
The Metadata Coverage Index (MCI): A standardized metric for quantifying database metadata richness.
Liolios, Konstantinos; Schriml, Lynn; Hirschman, Lynette; Pagani, Ioanna; Nosrat, Bahador; Sterk, Peter; White, Owen; Rocca-Serra, Philippe; Sansone, Susanna-Assunta; Taylor, Chris; Kyrpides, Nikos C; Field, Dawn
2012-07-30
Variability in the extent of the descriptions of data ('metadata') held in public repositories forces users to assess the quality of records individually, which rapidly becomes impractical. The scoring of records on the richness of their description provides a simple, objective proxy measure for quality that enables filtering that supports downstream analysis. Pivotally, such descriptions should spur on improvements. Here, we introduce such a measure - the 'Metadata Coverage Index' (MCI): the percentage of available fields actually filled in a record or description. MCI scores can be calculated across a database, for individual records or for their component parts (e.g., fields of interest). There are many potential uses for this simple metric: for example, to filter, rank or search for records; to assess the metadata availability of an ad hoc collection; to determine the frequency with which fields in a particular record type are filled, especially with respect to standards compliance; to assess the utility of specific tools and resources, and of data capture practice more generally; to prioritize records for further curation; to serve as performance metrics of funded projects; or to quantify the value added by curation. Here we demonstrate the utility of MCI scores using metadata from the Genomes Online Database (GOLD), including records compliant with the 'Minimum Information about a Genome Sequence' (MIGS) standard developed by the Genomic Standards Consortium. We discuss challenges and address the further application of MCI scores: to show improvements in annotation quality over time, to inform the work of standards bodies and repository providers on the usability and popularity of their products, and to assess and credit the work of curators. Such an index provides a step towards putting metadata capture practices and, in the future, standards compliance, into a quantitative and objective framework.
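Because the MCI is simply the percentage of available fields that are filled, it can be computed in a few lines; the sketch below uses invented field names for illustration.

    def mci(record, available_fields):
        """Metadata Coverage Index: percent of available fields actually filled."""
        filled = sum(1 for f in available_fields
                     if record.get(f) not in (None, "", [], {}))
        return 100.0 * filled / len(available_fields)

    fields = ["organism", "habitat", "geographic_location", "collection_date", "sequencer"]
    record = {"organism": "E. coli", "habitat": "soil", "collection_date": "2011-06-01"}
    print(f"MCI = {mci(record, fields):.0f}%")   # MCI = 60%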
Data Access Based on a Guide Map of the Underwater Wireless Sensor Network
Wei, Zhengxian; Song, Min; Yin, Guisheng; Wang, Hongbin; Cheng, Albert M. K.
2017-01-01
Underwater wireless sensor networks (UWSNs) represent an area of increasing research interest, as data storage, discovery, and query of UWSNs are always challenging issues. In this paper, a data access based on a guide map (DAGM) method is proposed for UWSNs. In DAGM, the metadata describes the abstracts of data content and the storage location. The center ring is composed of nodes according to the shortest average data query path in the network in order to store the metadata, and the data guide map organizes, diffuses and synchronizes the metadata in the center ring, providing the most time-saving and energy-efficient data query service for the user. For this method, firstly the data is stored in the UWSN. The storage node is determined, the data is transmitted from the sensor node (data generation source) to the storage node, and the metadata is generated for it. Then, the metadata is sent to the center ring node that is the nearest to the storage node and the data guide map organizes the metadata, diffusing and synchronizing it to the other center ring nodes. Finally, when there is query data in any user node, the data guide map will select a center ring node nearest to the user to process the query sentence, and based on the shortest transmission delay and lowest energy consumption, data transmission routing is generated according to the storage location abstract in the metadata. Hence, specific application data transmission from the storage node to the user is completed. The simulation results demonstrate that DAGM has advantages with respect to data access time and network energy consumption. PMID:29039757
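The query-routing idea in DAGM (handle a query at the center-ring node closest to the user, then resolve the storage node from the metadata abstract) can be sketched as below. Node positions, distances, and the metadata layout are simplified placeholders, not the paper's actual algorithm or cost model.

    import math

    center_ring = {                  # center-ring node id -> (x, y, depth) position
        "C1": (0.0, 0.0, -100.0),
        "C2": (500.0, 0.0, -120.0),
        "C3": (250.0, 400.0, -90.0),
    }
    metadata = {                     # data id -> storage location abstract
        "temp-2017-06": {"storage_node": "S17", "summary": "temperature profile"},
    }

    def nearest_center_node(user_pos):
        """Pick the center-ring node with the shortest straight-line distance."""
        return min(center_ring, key=lambda n: math.dist(user_pos, center_ring[n]))

    def resolve_query(user_pos, data_id):
        ring_node = nearest_center_node(user_pos)
        return ring_node, metadata[data_id]["storage_node"]

    print(resolve_query((100.0, 50.0, -110.0), "temp-2017-06"))   # ('C1', 'S17')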
Public Participation in Earth Science from the ISS
NASA Technical Reports Server (NTRS)
Willis, Kimberly J.; Runco, Susan K.; Stefanov, William L.
2010-01-01
The Gateway to Astronaut Photography of Earth (GAPE) is an online database (http://eol.jsc.nasa.gov) of terrestrial astronaut photography that enables the public to experience the astronaut's view from orbit. This database of imagery includes all NASA human-directed missions from the Mercury program of the early 1960s to the current International Space Station (ISS). To date, the total number of images taken by astronauts is 1,025,333. Of the total, 621,316 images have been "cataloged" (image geographic center points determined and descriptive metadata added). The remaining imagery provides an opportunity for the citizen-scientist to become directly involved with NASA through cataloging of astronaut photography, while simultaneously experiencing the wonder and majesty of our home planet as seen by astronauts on board the ISS every day. We are currently developing a public cataloging interface for the GAPE website. When complete, the citizen-scientist will be able to access a selected subset of astronaut imagery. Each candidate will be required to pass a training tutorial in order to receive certification as a cataloger. The cataloger can then choose from a selection of images with basic metadata that is sorted by difficulty level. Some guidance will be provided (template/pull-down menus) for generation of the geographic metadata required from the cataloger for each photograph. Each cataloger will also be able to view other contributions and further edit that metadata if they so choose. After the public input their metadata, the images will be posted to an internal screening site. Images with similar geographic metadata and centerpoint coordinates from multiple catalogers will be reviewed by NASA JSC Crew Earth Observations (CEO) staff. Once reviewed and verified, the metadata will be entered into the GAPE database with the contributors identified by their chosen usernames as having cataloged the frame.
Data Access Based on a Guide Map of the Underwater Wireless Sensor Network.
Wei, Zhengxian; Song, Min; Yin, Guisheng; Song, Houbing; Wang, Hongbin; Ma, Xuefei; Cheng, Albert M K
2017-10-17
Underwater wireless sensor networks (UWSNs) represent an area of increasing research interest, as data storage, discovery, and query of UWSNs are always challenging issues. In this paper, a data access based on a guide map (DAGM) method is proposed for UWSNs. In DAGM, the metadata describes the abstracts of data content and the storage location. The center ring is composed of nodes according to the shortest average data query path in the network in order to store the metadata, and the data guide map organizes, diffuses and synchronizes the metadata in the center ring, providing the most time-saving and energy-efficient data query service for the user. For this method, firstly the data is stored in the UWSN. The storage node is determined, the data is transmitted from the sensor node (data generation source) to the storage node, and the metadata is generated for it. Then, the metadata is sent to the center ring node that is the nearest to the storage node and the data guide map organizes the metadata, diffusing and synchronizing it to the other center ring nodes. Finally, when there is query data in any user node, the data guide map will select a center ring node nearest to the user to process the query sentence, and based on the shortest transmission delay and lowest energy consumption, data transmission routing is generated according to the storage location abstract in the metadata. Hence, specific application data transmission from the storage node to the user is completed. The simulation results demonstrate that DAGM has advantages with respect to data access time and network energy consumption.
Sanders, Don B.; Fink, Aliza
2016-01-01
Cystic fibrosis (CF) is the most common autosomal recessive disease in Caucasians. Significant advances in therapies and outcomes have occurred for people with CF over the past 30 years. Many of these improvements have come about through the concerted efforts of the US CF Foundation and international CF societies, networks of CF care centers, and the worldwide community of care providers, researchers, and patients and families. Despite these gains, there are still hurdles to overcome to continue to improve the quality of life, reduce CF complications, prolong survival, and ultimately cure CF. This article reviews the epidemiology of CF, including trends in incidence and prevalence, clinical characteristics, common complications, and survival. PMID:27469176
NASA Astrophysics Data System (ADS)
Guo, Qin; Zhang, Ni; Uchimaru, Tadafumi; Chen, Liang; Quan, Hengdao; Mizukado, Junji
2018-04-01
The rate constants for the gas-phase reactions of cyc-CF2CF2CF2CH=CH- with OH radicals were determined by a relative rate method between 253 and 328 K. The rate constant k1 at 298 K was measured to be (1.08 ± 0.04) × 10^-13 cm^3 molecule^-1 s^-1, and the Arrhenius expression was k1 = (3.72 ± 0.14) × 10^-13 exp[(-370 ± 12)/T]. The atmospheric lifetime of cyc-CF2CF2CF2CH=CH- was calculated to be 107 d. The products and mechanism for the reaction of cyc-CF2CF2CF2CH=CH- with OH radicals were also investigated. CO, CO2, and COF2 were identified as the main carbon-containing products following the OH-initiated reaction. Moreover, the radiative efficiency (RE) was determined to be 0.143 W m^-2 ppb^-1, and the global warming potentials (GWPs) for 20, 100, and 500 yr were 54, 15, and 4, respectively. The photochemical ozone creation potential of the title compound was estimated to be 1.3.
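The reported numbers are internally consistent, as the short check below shows: evaluating the Arrhenius expression at 298 K reproduces the measured rate constant, and dividing by an assumed global mean OH concentration of about 1 × 10^6 molecule cm^-3 (a typical literature value, not taken from this abstract) gives a lifetime close to the quoted 107 days.

    import math

    A, Ea_over_R = 3.72e-13, 370.0           # cm^3 molecule^-1 s^-1, K
    k_298 = A * math.exp(-Ea_over_R / 298.0)
    print(f"k(298 K) = {k_298:.2e}")         # ~1.07e-13, matching the measured 1.08e-13

    OH = 1.0e6                               # molecule cm^-3 (assumed typical value)
    lifetime_days = 1.0 / (k_298 * OH) / 86400.0
    print(f"lifetime = {lifetime_days:.0f} days")   # ~108 days, close to the reported 107 d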
González, Sergio; Jiménez, Elena; Albaladejo, José
2016-05-01
Hydrofluoroolefins (HFOs) of the type CF3(CF2)x≥0CHCH2 are currently being suggested as substitutes for some hydrofluorocarbons (HFCs). In this work, an assessment of the atmospheric removal of CF3(CF2)x=1,3,5CHCH2, initiated by reaction with hydroxyl (OH) radicals and by UV solar radiation, is addressed. For that purpose, the rate coefficients for the OH + CF3(CF2)x=1,3,5CHCH2 reaction, kOH(T = 263-358 K), were determined by the pulsed laser photolysis-laser induced fluorescence technique. A slightly negative temperature dependence of kOH was observed, with Ea/R (in K) values of -124 ± 15, -128 ± 6 and -160 ± 10 for CF3CF2CHCH2, CF3(CF2)3CHCH2 and CF3(CF2)5CHCH2, respectively. The estimated atmospheric lifetimes are around 8 days, considering that HFOs are well mixed in the troposphere. Furthermore, an evaluation of the long-wave and short-wave absorption of these HFOs has been carried out by determining the UV (191-367 nm) and IR (4000-500 cm(-1)) absorption cross sections at 298 K. Based on the obtained UV absorption cross sections, no photolysis of CF3(CF2)x=1,3,5CHCH2 is expected in the troposphere (λ > 290 nm). These species strongly absorb IR radiation in the atmospheric IR window. Despite the strong absorption in the IR region, the lifetime-corrected radiative efficiencies are low (0.033 W m(-2) ppb(-1) for CF3(CF2)3CHCH2 and 0.039 W m(-2) ppb(-1) for CF3(CF2)5CHCH2). GWPs for these species have been calculated as a function of the time horizon, providing values higher than unity for short time horizons and decreasing dramatically for longer ones. Therefore, it is concluded that emissions of these species do not affect the radiative forcing of climate, making them suitable replacements for large-GWP HFCs. Copyright © 2016. Published by Elsevier Ltd.
USGS Imagery Applications During Disaster Response After Recent Earthquakes
NASA Astrophysics Data System (ADS)
Hudnut, K. W.; Brooks, B. A.; Glennie, C. L.; Finnegan, D. C.
2015-12-01
It is not only important to rapidly characterize surface fault rupture and related ground deformation after an earthquake, but also to repeatedly make observations following an event to forecast fault afterslip. These data may also be used by other agencies to monitor progress on damage repairs and restoration efforts by emergency responders and the public. Related requirements include repeatedly obtaining reference or baseline imagery before a major disaster occurs, as well as maintaining careful geodetic control on all imagery in a time series so that absolute georeferencing may be applied to the image stack through time. In addition, repeated post-event imagery acquisition is required, generally at a higher repetition rate soon after the event, then scaled back to less frequent acquisitions with time, to capture phenomena (such as fault afterslip) that are known to have rates that decrease rapidly with time. For example, lidar observations acquired before and after the South Napa earthquake of 2014, used in our extensive post-processing work that was funded primarily by FEMA, aided in the accurate forecasting of fault afterslip. Lidar was used to independently validate and verify the official USGS afterslip forecast. In order to keep pace with rapidly evolving technology, a development pipeline must be established and maintained to continually test and incorporate new sensors, while adapting these new components to the existing platform and linking them to the existing base software system, and then sequentially testing the system as it evolves. Improvements in system performance by incremental upgrades of system components and software are essential. Improving calibration parameters and thereby progressively eliminating artifacts requires ongoing testing, research and development. To improve the system, we have formed an interdisciplinary team with common interests and diverse sources of support. We share expertise and leverage funding while effectively and rapidly improving our system, which includes the sensor package and software for all steps in acquiring, processing and differencing repeat-pass lidar and electro-optical imagery, and the GRiD metadata and point cloud database standard, already used during disaster response surge events by other agencies (e.g., during Hurricane Sandy in 2012).
Making Information Visible, Accessible, and Understandable: Meta-Data and Registries
2007-07-01
...the data created, the length of play time, album name, and the genre. Without resource metadata, portable digital music players would not be so... ...notion of a catalog card in a library. An example of metadata is the description of a music file specifying the creator, the artist that performed the song... ...describe structure and formatting, which are critical to interoperability and the management of databases. Going back to the portable music player example...
Wang, Mengqiang; Wang, Lingling; Xin, Lusheng; Wang, Xiudan; Wang, Lin; Xu, Jianchao; Jia, Zhihao; Yue, Feng; Wang, Hao; Song, Linsheng
2016-06-01
Leucine-rich repeat (LRR)-only proteins could mediate protein-ligand and protein-protein interactions and be involved in the immune response. In the present study, two novel LRR-only proteins, CfLRRop-2 and CfLRRop-3, were identified and characterized from scallop Chlamys farreri. They both contained nine LRR motifs with the consensus signature sequence LxxLxLxxNxL and formed typical horseshoe structure. The CfLRRop-2 and CfLRRop-3 mRNA transcripts were constitutively expressed in haemocytes, muscle, mantle, gill, haepatopancreas and gonad, with the highest expression level in haepatopancreas and gill, respectively. During the ontogenesis of scallop, the mRNA transcripts of CfLRRop-2 were kept at a high level in oocytes and embryos, while those of CfLRRop-3 were expressed at a rather low level from oocytes to blastula. Their mRNA transcripts were significantly increased after the stimulation of lipopolysaccharide (LPS), peptidoglycan (PGN), glucan (GLU) and polyinosinic-polycytidylic acid (poly I:C), and the mRNA expression of CfLRRop-2 rose more intensely than that of CfLRRop-3. After the suppression of CfTLR (previously identified Toll-like receptor in C. farreri) via RNA interference (RNAi), CfLRRop-3 mRNA transcripts increased more intensely and lastingly than those of CfLRRop-2. The rCfLRRop-3 protein could bind LPS, PGN, GLU and poly I:C, while rCfLRRop-2 exhibited no significant binding activity to them. Additionally, rCfLRRop-2 could significantly induce the release of TNF-α from the mixed primary cultured scallop haemocytes, but rCfLRRop-3 failed. These results collectively indicated that CfLRRop-2 might act as an immune effector or pro-inflammatory factor, while CfLRRop-3 would function as a pattern recognition receptor (PRR), suggesting the function of LRR-only protein family has differentiated in scallop. Copyright © 2016 Elsevier Ltd. All rights reserved.
Gjerset, Gunhild M; Loge, Jon H; Kiserud, Cecilie E; Fosså, Sophie D; Gudbergsson, Sævar B; Oldervoll, Line M; Wisløff, Torbjørn; Thorsen, Lene
2017-02-01
Knowledge about the users' needs is important to develop targeted rehabilitation for cancer patients with chronic fatigue (CF). The aims of the study were to examine the prevalence of CF in cancer survivors attending a one-week inpatient educational program (IEP) and to identify characteristics of those with CF, and further to examine the perceived needs for different components in a rehabilitation program, the need for complex rehabilitation (at least two components), and aspects of health-related quality of life (HRQoL) among survivors with CF versus those without CF. Cancer survivors ≥18 years, diagnosed with different types of cancer within the last 10 years and attending a one-week IEP, were invited to this cross-sectional study. CF was assessed by the Fatigue Questionnaire, perceived needs by a question about needs for different components in a rehabilitation program, and HRQoL by the Medical Outcomes Study Short Form 36. Of 564 participants, 45% reported CF. Breast cancer, mixed cancer types (including small groups with different cancer types) and comorbidities increased the risk of having CF. Compared to participants without CF, the participants with CF more frequently reported a need for physical training (86% vs. 65%, p < 0.001), physiotherapy (71% vs. 55%, p < 0.001) and nutrition counseling (68% vs. 53%, p = 0.001). Among participants with CF, 75% reported a need for three or more components, whereas 54% of those without CF reported a need for the same number of components (p < 0.001). Almost half of the cancer survivors attending the IEP had CF. Physical training, physiotherapy and nutrition counseling were the most frequently reported needs and were significantly more often observed in participants with CF than in those without CF. A higher percentage of those with CF reported a need for complex rehabilitation compared to those without CF. More research is necessary to develop targeted programs that better match cancer survivors' needs.
Mercury - Distributed Metadata Management, Data Discovery and Access System
NASA Astrophysics Data System (ADS)
Palanisamy, Giri; Wilson, Bruce E.; Devarakonda, Ranjeet; Green, James M.
2007-12-01
Mercury is a federated metadata harvesting, search and retrieval tool based on both open source and ORNL-developed software. It was originally developed for NASA, and the Mercury development consortium now includes funding from NASA, USGS, and DOE. Mercury supports various metadata standards including XML, Z39.50, FGDC, Dublin-Core, Darwin-Core, EML, and ISO-19115 (under development). Mercury provides a single portal to information contained in disparate data management systems. It collects metadata and key data from contributing project servers distributed around the world and builds a centralized index. The Mercury search interfaces then allow users to perform simple, fielded, spatial and temporal searches across these metadata sources. This centralized repository of metadata with distributed data sources provides extremely fast search results to the user, while allowing data providers to advertise the availability of their data and maintain complete control and ownership of that data. Mercury supports various projects including ORNL DAAC, NBII, DADDI, LBA, NARSTO, CDIAC, OCEAN, I3N, IAI, ESIP and ARM. The new Mercury system is based on a Service Oriented Architecture and supports various services such as Thesaurus Service, Gazetteer Web Service and UDDI Directory Services. This system also provides various search services including RSS, Geo-RSS, OpenSearch, Web Services and Portlets. Other features include filtering and dynamic sorting of search results, book-markable search results, and the ability to save, retrieve, and modify search criteria.
Park, Yu Rang; Yoon, Young Jo; Jang, Tae Hun; Seo, Hwa Jeong; Kim, Ju Han
2014-01-01
Extension of the standard model while retaining compliance with it is a challenging issue because there is currently no method for semantically or syntactically verifying an extended data model. A metadata-based extended model, named CCR+, was designed and implemented to achieve interoperability between standard and extended models. Furthermore, a multilayered validation method was devised to validate the standard and extended models. The American Society for Testing and Materials (ASTM) Continuity of Care Record (CCR) standard was selected to evaluate the CCR+ model; two CCR and one CCR+ XML files were evaluated. In total, 188 metadata were extracted from the ASTM CCR standard; these metadata are semantically interconnected and registered in the metadata registry. An extended-data-model-specific validation file was generated from these metadata. This file can be used in a smartphone application (Health Avatar CCR+) as part of a multilayered validation. The new CCR+ model was successfully evaluated via a patient-centric exchange scenario involving multiple hospitals, with the results supporting both syntactic and semantic interoperability between the standard CCR and the extended CCR+ model. A feasible method for delivering an extended model that complies with the standard model is presented herein. There is a great need to extend static standard models such as the ASTM CCR in various domains: the methods presented here represent an important reference for achieving interoperability between standard and extended models.
SnoVault and encodeD: A novel object-based storage system and applications to ENCODE metadata.
Hitz, Benjamin C; Rowe, Laurence D; Podduturi, Nikhil R; Glick, David I; Baymuradov, Ulugbek K; Malladi, Venkat S; Chan, Esther T; Davidson, Jean M; Gabdank, Idan; Narayana, Aditi K; Onate, Kathrina C; Hilton, Jason; Ho, Marcus C; Lee, Brian T; Miyasato, Stuart R; Dreszer, Timothy R; Sloan, Cricket A; Strattan, J Seth; Tanaka, Forrest Y; Hong, Eurie L; Cherry, J Michael
2017-01-01
The Encyclopedia of DNA elements (ENCODE) project is an ongoing collaborative effort to create a comprehensive catalog of functional elements initiated shortly after the completion of the Human Genome Project. The current database exceeds 6500 experiments across more than 450 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the H. sapiens and M. musculus genomes. All ENCODE experimental data, metadata, and associated computational analyses are submitted to the ENCODE Data Coordination Center (DCC) for validation, tracking, storage, unified processing, and distribution to community resources and the scientific community. As the volume of data increases, the identification and organization of experimental details becomes increasingly intricate and demands careful curation. The ENCODE DCC has created a general purpose software system, known as SnoVault, that supports metadata and file submission, a database used for metadata storage, web pages for displaying the metadata and a robust API for querying the metadata. The software is fully open-source, code and installation instructions can be found at: http://github.com/ENCODE-DCC/snovault/ (for the generic database) and http://github.com/ENCODE-DCC/encoded/ to store genomic data in the manner of ENCODE. The core database engine, SnoVault (which is completely independent of ENCODE, genomic data, or bioinformatic data) has been released as a separate Python package.
SnoVault and encodeD: A novel object-based storage system and applications to ENCODE metadata
Podduturi, Nikhil R.; Glick, David I.; Baymuradov, Ulugbek K.; Malladi, Venkat S.; Chan, Esther T.; Davidson, Jean M.; Gabdank, Idan; Narayana, Aditi K.; Onate, Kathrina C.; Hilton, Jason; Ho, Marcus C.; Lee, Brian T.; Miyasato, Stuart R.; Dreszer, Timothy R.; Sloan, Cricket A.; Strattan, J. Seth; Tanaka, Forrest Y.; Hong, Eurie L.; Cherry, J. Michael
2017-01-01
The Encyclopedia of DNA elements (ENCODE) project is an ongoing collaborative effort to create a comprehensive catalog of functional elements initiated shortly after the completion of the Human Genome Project. The current database exceeds 6500 experiments across more than 450 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the H. sapiens and M. musculus genomes. All ENCODE experimental data, metadata, and associated computational analyses are submitted to the ENCODE Data Coordination Center (DCC) for validation, tracking, storage, unified processing, and distribution to community resources and the scientific community. As the volume of data increases, the identification and organization of experimental details becomes increasingly intricate and demands careful curation. The ENCODE DCC has created a general purpose software system, known as SnoVault, that supports metadata and file submission, a database used for metadata storage, web pages for displaying the metadata and a robust API for querying the metadata. The software is fully open-source, code and installation instructions can be found at: http://github.com/ENCODE-DCC/snovault/ (for the generic database) and http://github.com/ENCODE-DCC/encoded/ to store genomic data in the manner of ENCODE. The core database engine, SnoVault (which is completely independent of ENCODE, genomic data, or bioinformatic data) has been released as a separate Python package. PMID:28403240
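As a hedged illustration of the kind of metadata query the API supports, the snippet below requests a small page of experiment records from the ENCODE portal in JSON form; the specific query parameters and response fields shown are assumptions based on the portal's general search interface rather than a definitive usage guide.

    import requests

    resp = requests.get(
        "https://www.encodeproject.org/search/",
        params={"type": "Experiment", "limit": 5, "format": "json"},
        headers={"Accept": "application/json"},
        timeout=60,
    )
    resp.raise_for_status()
    for hit in resp.json().get("@graph", []):           # "@graph" holds the search hits
        print(hit.get("accession"), hit.get("assay_title"))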
Napolitano, Rebecca; Blyth, Anna; Glisic, Branko
2018-01-16
Visualization of sensor networks, data, and metadata is becoming one of the most pivotal aspects of the structural health monitoring (SHM) process. Without the ability to communicate efficiently and effectively between disparate groups working on a project, an SHM system can be underused, misunderstood, or even abandoned. For this reason, this work seeks to evaluate visualization techniques in the field, identify flaws in current practices, and devise a new method for visualizing and accessing SHM data and metadata in 3D. More precisely, the work presented here reflects a method and digital workflow for integrating SHM sensor networks, data, and metadata into a virtual reality environment by combining spherical imaging and informational modeling. Both intuitive and interactive, this method fosters communication on a project, enabling diverse practitioners of SHM to efficiently consult and use the sensor networks, data, and metadata. The method is presented through its implementation on a case study, Streicker Bridge on the Princeton University campus. To illustrate the efficiency of the new method, the time and data file size were compared to those of other potential methods for visualizing and accessing SHM sensor networks, data, and metadata in 3D. Additionally, feedback from civil engineering students familiar with SHM is used for validation. Recommendations on how different groups working together on an SHM project can create an SHM virtual environment and convey data to the proper audiences are also included.
Napolitano, Rebecca; Blyth, Anna; Glisic, Branko
2018-01-01
Visualization of sensor networks, data, and metadata is becoming one of the most pivotal aspects of the structural health monitoring (SHM) process. Without the ability to communicate efficiently and effectively between disparate groups working on a project, an SHM system can be underused, misunderstood, or even abandoned. For this reason, this work seeks to evaluate visualization techniques in the field, identify flaws in current practices, and devise a new method for visualizing and accessing SHM data and metadata in 3D. More precisely, the work presented here reflects a method and digital workflow for integrating SHM sensor networks, data, and metadata into a virtual reality environment by combining spherical imaging and informational modeling. Both intuitive and interactive, this method fosters communication on a project, enabling diverse practitioners of SHM to efficiently consult and use the sensor networks, data, and metadata. The method is presented through its implementation on a case study, Streicker Bridge on the Princeton University campus. To illustrate the efficiency of the new method, the time and data file size were compared to those of other potential methods for visualizing and accessing SHM sensor networks, data, and metadata in 3D. Additionally, feedback from civil engineering students familiar with SHM is used for validation. Recommendations on how different groups working together on an SHM project can create an SHM virtual environment and convey data to the proper audiences are also included. PMID:29337877
Postnatal airway growth in cystic fibrosis piglets.
Adam, Ryan J; Abou Alaiwa, Mahmoud H; Bouzek, Drake C; Cook, Daniel P; Gansemer, Nicholas D; Taft, Peter J; Powers, Linda S; Stroik, Mallory R; Hoegger, Mark J; McMenimen, James D; Hoffman, Eric A; Zabner, Joseph; Welsh, Michael J; Meyerholz, David K; Stoltz, David A
2017-09-01
Mutations in the gene encoding the cystic fibrosis (CF) transmembrane conductance regulator (CFTR) anion channel cause CF. The leading cause of death in the CF population is lung disease. Increasing evidence suggests that in utero airway development is CFTR-dependent and that developmental abnormalities may contribute to CF lung disease. However, relatively little is known about postnatal CF airway growth, largely because such studies are limited in humans. Therefore, we examined airway growth and lung volume in a porcine model of CF. We hypothesized that CF pigs would have abnormal postnatal airway growth. To test this hypothesis, we performed CT-based airway and lung volume measurements in 3-wk-old non-CF and CF pigs. We found that 3-wk-old CF pigs had tracheas of reduced caliber and irregular shape. Their bronchial lumens were reduced in size proximally but not distally, were irregularly shaped, and had reduced distensibility. Our data suggest that lack of CFTR results in aberrant postnatal airway growth and development, which could contribute to CF lung disease pathogenesis. NEW & NOTEWORTHY This CT scan-based study of airway morphometry in the cystic fibrosis (CF) postnatal period is unique, as analogous studies in humans are greatly limited for ethical and technical reasons. Findings such as reduced airway lumen area and irregular caliber suggest that airway growth and development are CF transmembrane conductance regulator-dependent and that airway growth defects may contribute to CF lung disease pathogenesis. Copyright © 2017 the American Physiological Society.
Avni, A; Avital, S; Gromet-Elhanan, Z
1991-04-25
Incubation of tobacco and lettuce thylakoids with 2 M LiCl in the presence of MgATP removes the beta subunit from their CF1-ATPase (CF1 beta) together with varying amounts of the CF1 alpha subunit (CF1 alpha). These 2 M LiCl extracts, as with the one obtained from spinach thylakoids (Avital, S., and Gromet-Elhanan, Z. (1991) J. Biol. Chem. 266, 7067-7072), could form active hybrid ATPases when reconstituted into inactive beta-less Rhodospirillum rubrum chromatophores. Pure CF1 beta fractions that have been isolated from these extracts could not form such active hybrids by themselves, but could do so when supplemented with trace amounts (less than 5%) of CF1 alpha. A mitochondrial F1-ATPase alpha subunit was recently reported to be a heat-shock protein, having two amino acid sequences that show a highly conserved identity with sequences found in molecular chaperones (Luis, A. M., Alconada, A., and Cuezva, J. M. (1990) J. Biol. Chem. 265, 7713-7716). These sequences are also conserved in CF1 alpha isolated from various plants, but not in F1 beta subunits. The above described reactivation of CF1 beta by trace amounts of CF1 alpha could thus be due to a chaperonin-like function of CF1 alpha, which involves the correct, active folding of isolated pure CF1 beta.
Li, Zhao; Li, Jin; Yu, Peng
2018-01-01
Metadata curation has become increasingly important for biological discovery and biomedical research because a large amount of heterogeneous biological data is currently freely available. To facilitate efficient metadata curation, we developed an easy-to-use web-based curation application, GEOMetaCuration, for curating the metadata of Gene Expression Omnibus datasets. It can eliminate mechanical operations that consume precious curation time and can help coordinate curation efforts among multiple curators. It improves the curation process by introducing various features that are critical to metadata curation, such as a back-end curation management system and a curator-friendly front-end. The application is based on a commonly used web development framework of Python/Django and is open-sourced under the GNU General Public License V3. GEOMetaCuration is expected to benefit the biocuration community and to contribute to computational generation of biological insights using large-scale biological data. An example use case can be found at the demo website: http://geometacuration.yubiolab.org. Database URL: https://bitbucket.com/yubiolab/GEOMetaCuration PMID:29688376
MetaRNA-Seq: An Interactive Tool to Browse and Annotate Metadata from RNA-Seq Studies.
Kumar, Pankaj; Halama, Anna; Hayat, Shahina; Billing, Anja M; Gupta, Manish; Yousri, Noha A; Smith, Gregory M; Suhre, Karsten
2015-01-01
The number of RNA-Seq studies has grown in recent years. The design of RNA-Seq studies varies from very simple (e.g., two-condition case-control) to very complicated (e.g., time series involving multiple samples at each time point with separate drug treatments). Most of these publicly available RNA-Seq studies are deposited in NCBI databases, but their metadata are scattered throughout four different databases: Sequence Read Archive (SRA), Biosample, Bioprojects, and Gene Expression Omnibus (GEO). Although the NCBI web interface is able to provide all of the metadata information, it often requires significant effort to retrieve study- or project-level information by traversing through multiple hyperlinks and going to another page. Moreover, project- and study-level metadata lack manual or automatic curation by categories, such as disease type, time series, case-control, or replicate type, which are vital to comprehending any RNA-Seq study. Here we describe "MetaRNA-Seq," a new tool for interactively browsing, searching, and annotating RNA-Seq metadata with the capability of semiautomatic curation at the study level.
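One way to assemble the scattered NCBI metadata programmatically is through the public E-utilities; the sketch below finds a few SRA records and pulls their summaries. The search term and the exact response structure assumed here are illustrative, not a prescription from the MetaRNA-Seq paper.

    import requests

    EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

    # 1) Search SRA for an example RNA-Seq query.
    ids = requests.get(f"{EUTILS}/esearch.fcgi",
                       params={"db": "sra", "term": "RNA-Seq[Strategy] AND human[Organism]",
                               "retmax": 5, "retmode": "json"},
                       timeout=60).json()["esearchresult"]["idlist"]

    # 2) Fetch summary metadata for the returned record IDs.
    summaries = requests.get(f"{EUTILS}/esummary.fcgi",
                             params={"db": "sra", "id": ",".join(ids), "retmode": "json"},
                             timeout=60).json()
    print(list(summaries.get("result", {}).keys()))     # record UIDs plus a 'uids' key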
OpenFlow arbitrated programmable network channels for managing quantum metadata
Dasari, Venkat R.; Humble, Travis S.
2016-10-10
Quantum networks must classically exchange complex metadata between devices in order to carry out protocols such as teleportation, super-dense coding, and quantum key distribution. Demonstrating the integration of these new communication methods with existing network protocols, channels, and data forwarding mechanisms remains an open challenge. Software-defined networking (SDN) offers robust and flexible strategies for managing diverse network devices and uses. We adapt the principles of SDN to the deployment of quantum networks, which are composed of unique devices that operate according to the laws of quantum mechanics. We show how quantum metadata can be managed within a software-defined network using the OpenFlow protocol, and we describe how OpenFlow management of classical optical channels is compatible with emerging quantum communication protocols. We next give an example specification of the metadata needed to manage and control quantum physical layer (QPHY) behavior and we extend the OpenFlow interface to accommodate this quantum metadata. We conclude by discussing near-term experimental efforts that can realize SDN's principles for quantum communication.
OpenFlow arbitrated programmable network channels for managing quantum metadata
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dasari, Venkat R.; Humble, Travis S.
Quantum networks must classically exchange complex metadata between devices in order to carry out protocols such as teleportation, super-dense coding, and quantum key distribution. Demonstrating the integration of these new communication methods with existing network protocols, channels, and data forwarding mechanisms remains an open challenge. Software-defined networking (SDN) offers robust and flexible strategies for managing diverse network devices and uses. We adapt the principles of SDN to the deployment of quantum networks, which are composed of unique devices that operate according to the laws of quantum mechanics. We show how quantum metadata can be managed within a software-defined network using the OpenFlow protocol, and we describe how OpenFlow management of classical optical channels is compatible with emerging quantum communication protocols. We next give an example specification of the metadata needed to manage and control quantum physical layer (QPHY) behavior and we extend the OpenFlow interface to accommodate this quantum metadata. We conclude by discussing near-term experimental efforts that can realize SDN's principles for quantum communication.
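Purely as a speculative sketch of the idea, the fragment below defines a toy QPHY metadata record and packs it behind a 32-bit experimenter identifier, the general pattern OpenFlow uses for vendor extensions. The field names, the JSON encoding, and the identifier value are invented for illustration and are not the authors' specification.

    import json
    import struct
    from dataclasses import dataclass, asdict

    @dataclass
    class QPhyMetadata:
        channel_id: int        # classical label of the optical channel
        wavelength_nm: float   # carrier wavelength of the quantum signal
        protocol: str          # e.g. "BB84" or "teleportation"
        basis_schedule: str    # reference to an agreed measurement-basis schedule

    def to_experimenter_payload(meta: QPhyMetadata, experimenter_id: int = 0x00ABCDEF) -> bytes:
        """Prefix a JSON-encoded metadata blob with a 32-bit experimenter ID."""
        body = json.dumps(asdict(meta)).encode("utf-8")
        return struct.pack("!I", experimenter_id) + body

    payload = to_experimenter_payload(QPhyMetadata(7, 1550.12, "BB84", "sched-42"))
    print(len(payload), "bytes")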
NASA Astrophysics Data System (ADS)
Tanner, S.; Schwab, M.; Beam, K.; Skaug, M.
2017-12-01
Operation IceBridge has been flying campaigns in the Arctic and Antarctic for nearly 10 years and will soon be a decadal mission. During that time, the generation and use of file level metadata has evolved from nearly non-existent to robust spatio-temporal support. This evolution has been difficult at times, but the results speak for themselves in the form of production tools for search, discovery, access and analysis. The lessons learned from this experience are now being incorporated into SnowEx, a new mission to measure snow cover using airborne and ground-based measurements. This presentation will focus on techniques for generating metadata for such a diverse set of measurements as well as the resulting tools that utilize this information. This includes the development and deployment of MetGen, a semi-automated metadata generation capability that relies on collaboration between data producers and data archivers, the newly deployed IceBridge data portal which incorporates data browse capabilities and limited in-line analysis, and programmatic access to metadata and data for incorporation into larger automated workflows.
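File-level spatio-temporal metadata of the kind MetGen produces boils down to a bounding box and a time range per granule; the toy function below derives both from a list of points and timestamps. The input layout and output field names are illustrative only, not the NSIDC or SnowEx format.

    from datetime import datetime

    def granule_metadata(points, times):
        """points: iterable of (lat, lon); times: iterable of ISO 8601 strings."""
        lats = [p[0] for p in points]
        lons = [p[1] for p in points]
        stamps = sorted(datetime.fromisoformat(t) for t in times)
        return {
            "bbox": {"north": max(lats), "south": min(lats),
                     "east": max(lons), "west": min(lons)},
            "time_start": stamps[0].isoformat(),
            "time_end": stamps[-1].isoformat(),
        }

    print(granule_metadata([(69.2, -49.5), (69.8, -48.9)],
                           ["2017-04-10T14:02:00", "2017-04-10T14:47:30"]))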
Metadata for data rescue and data at risk
Anderson, William L.; Faundeen, John L.; Greenberg, Jane; Taylor, Fraser
2011-01-01
Scientific data age, become stale, fall into disuse and run tremendous risks of being forgotten and lost. These problems can be addressed by archiving and managing scientific data over time, and establishing practices that facilitate data discovery and reuse. Metadata documentation is integral to this work and essential for measuring and assessing high priority data preservation cases. The International Council for Science: Committee on Data for Science and Technology (CODATA) has a newly appointed Data-at-Risk Task Group (DARTG), participating in the general arena of rescuing data. The DARTG primary objective is building an inventory of scientific data that are at risk of being lost forever. As part of this effort, the DARTG is testing an approach for documenting endangered datasets. The DARTG is developing a minimal and easy to use set of metadata properties for sufficiently describing endangered data, which will aid global data rescue missions. The DARTG metadata framework supports rapid capture, and easy documentation, across an array of scientific domains. This paper reports on the goals and principles supporting the DARTG metadata schema, and provides a description of the preliminary implementation.
Operational Support for Instrument Stability through ODI-PPA Metadata Visualization and Analysis
NASA Astrophysics Data System (ADS)
Young, M. D.; Hayashi, S.; Gopu, A.; Kotulla, R.; Harbeck, D.; Liu, W.
2015-09-01
Over long time scales, quality assurance metrics taken from calibration and calibrated data products can aid observatory operations in quantifying the performance and stability of the instrument, and identify potential areas of concern or guide troubleshooting and engineering efforts. Such methods traditionally require manual SQL entries, assuming the requisite metadata has even been ingested into a database. With the ODI-PPA system, QA metadata has been harvested and indexed for all data products produced over the life of the instrument. In this paper we will describe how, utilizing the industry standard Highcharts Javascript charting package with a customized AngularJS-driven user interface, we have made the process of visualizing the long-term behavior of these QA metadata simple and easily replicated. Operators can easily craft a custom query using the powerful and flexible ODI-PPA search interface and visualize the associated metadata in a variety of ways. These customized visualizations can be bookmarked, shared, or embedded externally, and will be dynamically updated as new data products enter the system, enabling operators to monitor the long-term health of their instrument with ease.
NASA Astrophysics Data System (ADS)
Kumar, Kundan; Jariwala, C.; Pillai, R.; Chauhan, N.; Raole, P. M.
2015-08-01
Carbon fibres (Cf) are among the most important reinforcement materials for ceramic matrix composites such as Cf-SiC composites, and they are generally sought for high-temperature applications in areas such as space, nuclear reactors, and the automobile industry. A major problem arises, however, when Cf-reinforced composites are exposed to high temperatures in an oxidizing environment: the Cf react with oxygen and burn away. In the present work, we have studied the effect of a silica (SiO2) coating as a protective coating on Cf for Cf/SiC composites. The silica solution was prepared by the sol-gel process, and the coating on Cf was applied by a dip coating technique with varying withdrawal speeds (2, 5, and 8 mm/s) and a fixed number of dipping cycles (3). Scanning Electron Microscope (SEM) analysis shows a uniform silica coating on the Cf. The tensile test shows an increase in tensile strength with increasing withdrawal speed. Isothermal oxidation analysis confirmed the enhanced oxidation resistance of silica-coated Cf compared with uncoated Cf.
Federal Register 2010, 2011, 2012, 2013, 2014
2011-03-18
... Airworthiness Directives; General Electric Company CF6-45 and CF6-50 Series Turbofan Engines AGENCY: Federal... airworthiness directive (AD) for General Electric Company (GE) CF6-45 and CF6-50 series turbofan engines. That..., and MD-10-30F. The commenter stated that the proposed AD only listed these airplanes as a series. We...
Principles of metadata organization at the ENCODE data coordination center
Hong, Eurie L.; Sloan, Cricket A.; Chan, Esther T.; Davidson, Jean M.; Malladi, Venkat S.; Strattan, J. Seth; Hitz, Benjamin C.; Gabdank, Idan; Narayanan, Aditi K.; Ho, Marcus; Lee, Brian T.; Rowe, Laurence D.; Dreszer, Timothy R.; Roe, Greg R.; Podduturi, Nikhil R.; Tanaka, Forrest; Hilton, Jason A.; Cherry, J. Michael
2016-01-01
The Encyclopedia of DNA Elements (ENCODE) Data Coordinating Center (DCC) is responsible for organizing, describing and providing access to the diverse data generated by the ENCODE project. The description of these data, known as metadata, includes the biological sample used as input, the protocols and assays performed on these samples, the data files generated from the results and the computational methods used to analyze the data. Here, we outline the principles and philosophy used to define the ENCODE metadata in order to create a metadata standard that can be applied to diverse assays and multiple genomic projects. In addition, we present how the data are validated and used by the ENCODE DCC in creating the ENCODE Portal (https://www.encodeproject.org/). Database URL: www.encodeproject.org PMID:26980513
A New Look at Data Usage by Using Metadata Attributes as Indicators of Data Quality
NASA Technical Reports Server (NTRS)
Won, Young-In; Wanchoo, Lalit; Behnke, Jeanne
2016-01-01
This study reviews the key metrics (users, distributed volume, and files) in multiple ways to gain an understanding of the significance of the metadata. Characterizing the usability of data by key metadata elements, such as discipline and study area, will assist in understanding how user needs have evolved over time. The data usage pattern based on product level provides insight into the level of data quality. In addition, the data metrics by various services, such as the Open-source Project for a Network Data Access Protocol (OPeNDAP) and subsets, address how these services have extended the usage of data. Overall, this study presents the usage of data and metadata through metrics analyses, which may assist data centers in better supporting the needs of their users.
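The metrics roll-up described above amounts to grouping distribution counts by a metadata attribute; a minimal pandas sketch with made-up numbers is shown below.

    import pandas as pd

    usage = pd.DataFrame({
        "discipline": ["Cryosphere", "Ocean", "Cryosphere", "Atmosphere"],
        "users":      [120, 85, 40, 200],
        "files":      [1.2e6, 4.0e5, 9.0e5, 2.5e6],
        "volume_tb":  [30.5, 12.1, 22.0, 75.3],
    })

    summary = usage.groupby("discipline")[["users", "files", "volume_tb"]].sum()
    print(summary.sort_values("volume_tb", ascending=False))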
PanMetaDocs - A tool for collecting and managing the long tail of "small science data"
NASA Astrophysics Data System (ADS)
Klump, J.; Ulbricht, D.
2011-12-01
In the early days of thinking about cyberinfrastructure the focus was on "big science data". Today, the challenge is no longer to store several terabytes of data, but to manage data objects in a way that facilitates their re-use. Key to re-use by a data consumer is proper documentation of the data. Data consumers need discovery metadata to find the data they need and descriptive metadata to be able to use the data they retrieve. Data documentation therefore faces the challenge of describing these objects extensively and completely while keeping the items easily accessible at a sustainable cost. However, data curation and documentation do not rank high in the everyday work of a scientist as a data producer. Data producers are often frustrated by being asked to provide metadata on their data over and over again, information that seems obvious from the context of their work. A further challenge to data archives is the wide variety of metadata schemata in use, which creates a number of maintenance and design challenges of its own. PanMetaDocs addresses these issues by allowing an uploaded file to be described by more than one metadata object. PanMetaDocs, which was developed from PanMetaWorks, is a PHP-based web application that allows data to be described with any XML-based metadata schema. Its user interface is browser based and was developed to collect metadata and data in collaborative scientific projects situated at one or more institutions. The metadata fields can be filled with static or dynamic content to reduce the number of fields that require manual entries to a minimum and to make use of contextual information in a project setting. In the development of PanMetaDocs the business logic of PanMetaWorks is reused, except for the authentication and data management functions, which are delegated to the eSciDoc framework. The eSciDoc repository framework is designed as a service-oriented architecture that can be controlled through a REST interface to create version-controlled items with metadata records in XML format. PanMetaDocs utilizes the eSciDoc item model to add multiple metadata records that describe uploaded files in different metadata schemata. While datasets are collected and described, shared to collaborate with other scientists, and finally published, data objects are transferred from a shared data curation domain into a persistent data curation domain. Through an RSS interface for recent datasets, PanMetaWorks allows project members to be informed about data uploaded by other project members. The implementation of the OAI-PMH interface can be used to syndicate data catalogs to research data portals, such as the panFMP data portal framework. Once data objects are uploaded to the eSciDoc infrastructure, it is possible to drop the software instance that was used for collecting the data, while the compiled data and metadata remain accessible to other authorized applications through the institution's eSciDoc middleware. This approach of "expendable data curation tools" allows for a significant reduction in software maintenance costs, as expensive data capture applications do not need to be maintained indefinitely to ensure long-term access to the stored data.
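The core data model (one uploaded file carried as a version-controlled item with several metadata records in different schemata) can be pictured with the toy classes below; the class and field names are illustrative and are not the PanMetaDocs or eSciDoc API.

    from dataclasses import dataclass, field
    from typing import Dict, List

    @dataclass
    class MetadataRecord:
        schema: str            # e.g. "dif", "iso19115", "datacite"
        xml: str               # the serialized metadata record

    @dataclass
    class DataItem:
        filename: str
        version: int = 1
        records: List[MetadataRecord] = field(default_factory=list)

        def add_record(self, schema: str, xml: str) -> None:
            self.records.append(MetadataRecord(schema, xml))

        def as_schemas(self) -> Dict[str, str]:
            return {r.schema: r.xml for r in self.records}

    item = DataItem("borehole_temps.csv")
    item.add_record("dif", "<DIF>...</DIF>")
    item.add_record("iso19115", "<MD_Metadata>...</MD_Metadata>")
    print(sorted(item.as_schemas()))   # ['dif', 'iso19115']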
Koch, Ann-Kristin; Brömme, Sabine; Wollschläger, Bettina; Horneff, Gerd; Keyszer, Gernot
2008-09-01
To determine the character and frequency of musculoskeletal manifestations and rheumatic symptoms in patients with cystic fibrosis (CF). Rheumatic symptoms and signs of 70 patients with CF (age 6 to 61 yrs) were determined by interview and clinical assessment. Age and sex-matched healthy volunteers served as a control group. In CF patients, laboratory measures and bone mineral density (BMD) were investigated. The data were correlated with the CF phenotype [Shwachman Score (ShS), Chrispin-Norman Score (ChNS), and pulmonary function tests (PFT)]. The prevalence of joint pain in the CF patients was 12.9%, with a mean duration of 7 days. Swollen joints were found in 4 patients. None fulfilled the criteria for rheumatoid arthritis or connective tissue disease. Adult CF patients complained more often about noninflammatory back pain and myalgia, and demonstrated reduced spine mobility and impaired everyday life functions compared with the controls. Symptomatic CF patients had elevated erythrocyte sedimentation rate and C-reactive protein levels and performed worse on the ShS, ChNS, and PFT than asymptomatic patients. Antibodies against exotoxin A of Pseudomonas aeruginosa and recombinant Aspergillus fumigatus allergen f4 were found more frequently in CF patients with arthralgia. BMD was decreased in adult patients with more severe CF. In CF patients, the prevalence of rheumatic symptoms increases with age and CF severity. Our data suggest an association of infections with P. aeruginosa and A. fumigatus with the occurrence of rheumatic symptoms. However, no association of CF with definite inflammatory joint or connective tissue diseases was observed, and no CF-specific pattern of musculoskeletal symptoms was seen.
NASA Technical Reports Server (NTRS)
Buckley, Donald H.; Johnson, Robert L.
1960-01-01
The gases CF2Cl-CF2Cl, CF2Cl2, and CF2Br-CF2Br were used to lubricate metals, cermets, and ceramics in this study. One of the criteria for determining the effectiveness of a reactive-gas-lubricated system is the stability of the halogen-containing gas molecule. The carbon-to-halogen bond in the ethane molecule has extremely good thermal stability, superior to that of the methane analogs (CF2Cl2 and CF2Br2) used in earlier research. For this reason, the ethane compounds CF2Cl-CF2Cl and CF2Br-CF2Br were considered as high-temperature lubricants. Friction and wear studies were made with a hemisphere (3/16-in. rad.) rider sliding in a circumferential path on the flat surface of a rotating disk (2 1/2-in. diam.). The specimens of metal alloys, cermets, and ceramics were run in an atmosphere of the various gases with a load of 1200 grams, sliding velocities from 75 to 8000 feet per minute, and temperatures from 75 to 1400 F. The gas CF2Cl-CF2Cl was found to be an effective lubricant for the cermet LT-LB (59.0 Cr, 19.0 Al2O3, 20.0 Mo, 2.0 Ti) and the ceramic Al2O3 sliding on Stellite Star J (cobalt-base alloy) at temperatures to 1400 F. The bromine-containing gas CF2Br-CF2Br was found to give friction and wear values that can be considered to be in a region of effective boundary lubrication for the cermet K175D (nickel-bonded metal carbide) sliding on the metal Hastelloy R-235 (nickel-base alloy) at temperatures to 1200 F.
Company, Anna; Jee, Joo-Eun; Ribas, Xavi; Lopez-Valbuena, Josep Maria; Gómez, Laura; Corbella, Montserrat; Llobet, Antoni; Mahía, José; Benet-Buchholz, Jordi; Costas, Miquel; van Eldik, Rudi
2007-10-29
A study of the reversible CO2 fixation by a series of macrocyclic dicopper complexes is described. The dicopper macrocyclic complexes [Cu2(OH)2(Me2p)](CF3SO3)2, 1(CF3SO3)2, and [Cu2(mu-OH)2(Me2m)](CF3SO3)2, 2(CF3SO3)2, (Scheme 1) containing terminally bound and bridging hydroxide ligands, respectively, promote reversible inter- and intramolecular CO2 fixation that results in the formation of the carbonate complexes [{Cu2(Me2p)}2(mu-CO3)2](CF3SO3)4, 4(CF3SO3)4, and [Cu2(mu-CO3)(Me2m)](CF3SO3)2, 5(CF3SO3)2. Under a N2 atmosphere the complexes evolve CO2 and revert to the starting hydroxo complexes 1(CF3SO3)2 and 2(CF3SO3)2, a reaction the rate of which linearly depends on [H2O]. In the presence of water, attempts to crystallize 5(CF3SO3)2 afford [{Cu2(Me2m)(H2O)}2(mu-CO3)2](CF3SO3)4, 6(CF3SO3)4, which appears to rapidly convert to 5(CF3SO3)2 in acetonitrile solution. [Cu2(OH)2(H3m)]2+, 7, which contains a larger macrocyclic ligand, irreversibly reacts with atmospheric CO2 to generate cagelike [{Cu2(H3m)}2(mu-CO3)2](ClO4)4, 8(ClO4)4. However, addition of 1 equiv of HClO4 per Cu generates [Cu2(H3m)(CH3CN)4]4+ (3), and subsequent addition of Et3N under air reassembles 8. The carbonate complexes 4(CF3SO3)4, 5(CF3SO3)2, 6(CF3SO3)4, and 8(ClO4)4 have been characterized in the solid state by X-ray crystallography. This analysis reveals that 4(CF3SO3)4, 6(CF3SO3)4, and 8(ClO4)4 consist of self-assembled molecular boxes containing two macrocyclic dicopper complexes, bridged by CO32- ligands. The bridging mode of the carbonate ligand is anti-anti-mu-eta1:eta1 in 4(CF3SO3)4, anti-anti-mu-eta2:eta1 in 6(CF3SO3)4 and anti-anti-mu-eta2:eta2 in 5(CF3SO3)2 and 8(ClO4)4. Magnetic susceptibility measurements on 4(CF3SO3)4, 6(CF3SO3)4, and 8(ClO4)4 indicate that the carbonate ligands mediate antiferromagnetic coupling between each pair of bridged CuII ions (J = -23.1, -108.3, and -163.4 cm-1, respectively, H = -JS1S2). Detailed kinetic analyses of the reaction between carbon dioxide and the macrocyclic complexes 1(CF3SO3)2 and 2(CF3SO3)2 suggest that it is actually hydrogen carbonate formed in aqueous solution on dissolving CO2 that is responsible for the observed formation of the different carbonate complexes controlled by the binding mode of the hydroxy ligands. This study shows that CO2 fixation can be used as an on/off switch for the reversible self-assembly of supramolecular structures based on macrocyclic dicopper complexes.
Unified Science Information Model for SoilSCAPE using the Mercury Metadata Search System
NASA Astrophysics Data System (ADS)
Devarakonda, Ranjeet; Lu, Kefa; Palanisamy, Giri; Cook, Robert; Santhana Vannan, Suresh; Moghaddam, Mahta; Clewley, Dan; Silva, Agnelo; Akbar, Ruzbeh
2013-12-01
SoilSCAPE (Soil moisture Sensing Controller And oPtimal Estimator) introduces a new concept for a smart wireless sensor web technology for optimal measurements of surface-to-depth profiles of soil moisture using in-situ sensors. The objective is to enable a guided and adaptive sampling strategy for the in-situ sensor network to meet the measurement validation objectives of spaceborne soil moisture sensors such as the Soil Moisture Active Passive (SMAP) mission. This work is being carried out at the University of Michigan, the Massachusetts Institute of Technology, the University of Southern California, and Oak Ridge National Laboratory. At Oak Ridge National Laboratory we are using the Mercury metadata search system [1] to build a Unified Information System for the SoilSCAPE project. This unified portal primarily comprises three key pieces: Distributed Search/Discovery; Data Collections and Integration; and Data Dissemination. Mercury, a federally funded software system for metadata harvesting, indexing, and searching, is used for this module. Soil moisture data sources identified as part of this activity, such as SoilSCAPE and FLUXNET (in-situ sensors), AirMOSS (airborne retrieval), and SMAP (spaceborne retrieval), are being indexed and maintained by Mercury. Mercury will be the central repository of data sources for soil moisture cal/val studies and will provide a mechanism to identify additional data sources. Relevant metadata from existing inventories such as ORNL DAAC, USGS Clearinghouse, ARM, NASA ECHO, and GCMD will be brought into this soil-moisture data search/discovery module. The SoilSCAPE [2] metadata records will also be published in broader metadata repositories such as GCMD and data.gov. Mercury can be configured to provide a single portal to soil moisture information contained in disparate data management systems located anywhere on the Internet. Mercury is able to extract metadata systematically from HTML pages or XML files using a variety of methods, including OAI-PMH [3]. The Mercury search interface then allows users to perform simple, fielded, spatial, and temporal searches across a central harmonized index of metadata. Mercury supports various metadata standards including FGDC, ISO-19115, DIF, Dublin-Core, Darwin-Core, and EML. This poster describes in detail how Mercury implements the Unified Science Information Model for soil moisture data. References: [1] Devarakonda R., et al. Mercury: reusable metadata management, data discovery and access system. Earth Science Informatics (2010), 3(1): 87-94. [2] Devarakonda R., et al. Daymet: Single Pixel Data Extraction Tool. http://daymet.ornl.gov/singlepixel.html (2012). Last Accessed 10-01-2013. [3] Devarakonda R., et al. Data sharing and retrieval using OAI-PMH. Earth Science Informatics (2011), 4(1): 1-5.
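The harmonized index behind a fielded, spatial, and temporal search like Mercury's can be pictured as a flattening step from standard metadata records into a small set of common fields. The sketch below is only an illustration under that assumption, not Mercury code: it maps an inlined sample Dublin Core record onto title, time-range, and bounding-box fields that a search index could consume; the field names and sample values are invented.

```python
# Minimal sketch of flattening a Dublin Core record into harmonized index
# fields (illustrative only; the index field names and sample record are invented).
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"

SAMPLE = """
<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>SoilSCAPE in-situ soil moisture, example site</dc:title>
  <dc:date>2013-06-01/2013-08-31</dc:date>
  <dc:coverage>northlimit=38.45; southlimit=38.40; eastlimit=-120.95; westlimit=-121.00</dc:coverage>
</record>
"""

def flatten(record_xml: str) -> dict:
    """Map a Dublin Core record onto common index fields."""
    root = ET.fromstring(record_xml)
    start, _, end = (root.findtext(DC + "date") or "").partition("/")
    bbox = dict(
        part.strip().split("=")
        for part in (root.findtext(DC + "coverage") or "").split(";")
        if "=" in part
    )
    return {
        "title": root.findtext(DC + "title"),
        "time_start": start or None,
        "time_end": end or None,
        "bbox": {k: float(v) for k, v in bbox.items()},
    }

if __name__ == "__main__":
    print(flatten(SAMPLE))
```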
ODISEES: A New Paradigm in Data Access
NASA Astrophysics Data System (ADS)
Huffer, E.; Little, M. M.; Kusterer, J.
2013-12-01
As part of its ongoing efforts to improve access to data, the Atmospheric Science Data Center has developed a high-precision Earth Science domain ontology (the 'ES Ontology') implemented in a graph database ('the Semantic Metadata Repository') that is used to store detailed, semantically enhanced, parameter-level metadata for ASDC data products. The ES Ontology provides the semantic infrastructure needed to drive the ASDC's Ontology-Driven Interactive Search Environment for Earth Science ('ODISEES'), a data discovery and access tool, and will support additional data services such as analytics and visualization. The ES ontology is designed on the premise that naming conventions alone are not adequate to provide the information needed by prospective data consumers to assess the suitability of a given dataset for their research requirements; nor are current metadata conventions adequate to support seamless machine-to-machine interactions between file servers and end-user applications. Data consumers need information not only about what two data elements have in common, but also about how they are different. End-user applications need consistent, detailed metadata to support real-time data interoperability. The ES ontology is a highly precise, bottom-up, queryable model of the Earth Science domain that focuses on critical details about the measurable phenomena, instrument techniques, data processing methods, and data file structures. Earth Science parameters are described in detail in the ES Ontology and mapped to the corresponding variables that occur in ASDC datasets. Variables are in turn mapped to well-annotated representations of the datasets they occur in, the instrument(s) used to create them, the instrument platforms, the processing methods, etc., creating a linked-data structure that allows both human and machine users to access a wealth of information critical to understanding and manipulating the data. The mappings are recorded in the Semantic Metadata Repository as RDF triples. An off-the-shelf Ontology Development Environment and a custom Metadata Conversion Tool together form a human-machine/machine-machine hybrid tool that partially automates the creation of metadata as RDF triples by interfacing with existing metadata repositories and providing a user interface that solicits input from a human user when needed. RDF triples are pushed to the Ontology Development Environment, where a reasoning engine executes a series of inference rules whose antecedent conditions can be satisfied by the initial set of RDF triples, thereby generating the additional detailed metadata that is missing from existing repositories. A SPARQL Endpoint, a web-based query service, and a Graphical User Interface allow prospective data consumers - even those with no familiarity with NASA data products - to search the metadata repository to find and order data products that meet their exact specifications. A web-based API will provide an interface for machine-to-machine transactions.
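The machine-to-machine access described above would look roughly like the sketch below: a SPARQL query against the Semantic Metadata Repository that walks from a parameter to variables, datasets, and instruments. The endpoint URL and the class and property IRIs are placeholders invented for the example, since the actual ES Ontology vocabulary is not published in the abstract; the client uses the SPARQLWrapper library.

```python
# Sketch of querying an ODISEES-style SPARQL endpoint for variables that
# measure a given parameter. Endpoint URL and es: terms are placeholders.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://example.nasa.gov/odisees/sparql"  # placeholder

QUERY = """
PREFIX es: <http://example.org/es-ontology#>
SELECT ?variable ?dataset ?instrument WHERE {
  ?variable  es:measures   es:AerosolOpticalDepth ;
             es:occursIn   ?dataset .
  ?dataset   es:producedBy ?instrument .
}
LIMIT 20
"""

def find_variables():
    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setQuery(QUERY)
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()
    # Print one row per matching variable/dataset/instrument binding.
    for row in results["results"]["bindings"]:
        print(row["variable"]["value"], row["dataset"]["value"], row["instrument"]["value"])

if __name__ == "__main__":
    find_variables()
```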
Combined use of semantics and metadata to manage Research Data Life Cycle in Environmental Sciences
NASA Astrophysics Data System (ADS)
Aguilar Gómez, Fernando; de Lucas, Jesús Marco; Pertinez, Esther; Palacio, Aida
2017-04-01
The use of metadata to contextualize datasets is quite widespread in Earth System Sciences. There are initiatives and tools to help data managers choose the metadata standard that best fits their use case, such as the DCC Metadata Directory (http://www.dcc.ac.uk/resources/metadata-standards). In our use case, we have been gathering physical, chemical and biological data from a water reservoir since 2010. A sound metadata definition is crucial not only to contextualize our own data but also to integrate datasets from other sources like satellites or meteorological agencies. That is why we have chosen EML (Ecological Metadata Language), which integrates many different elements to define a dataset, including the project context, the definition of instruments and parameters, the software used for processing and quality control, and the publication details. These metadata elements help both humans and machines to understand and process the dataset. However, metadata alone are not enough to fully support the data life cycle, from the definition of the Data Management Plan (DMP) to publication and re-use. To do so, we need to define not only metadata and attributes but also the relationships between them, so semantics are needed. Ontologies, as a form of knowledge representation, can define the elements of a research data life cycle, including the DMP, datasets, software, etc. They can also define how the different elements are related to each other and how they interact. The first advantage of developing an ontology for a knowledge domain is that it provides a common vocabulary hierarchy (i.e. a conceptual schema) that can be used and standardized by all the agents interested in the domain, whether humans or machines. This way of using ontologies is one of the foundations of the Semantic Web, where ontologies are set to play a key role in establishing a common terminology between agents. To develop the ontology we are using Protégé, a graphical, open-source, freely available ontology-development tool that supports a rich knowledge model. To process and manage the ontology we are using Semantic MediaWiki, an extension of MediaWiki that can process queries, perform semantic search, and export data in RDF. Our final goal is to integrate our data repository portal and the semantic processing engine into a complete system that manages the data life cycle stages and their relationships, including a machine-actionable DMP solution, dataset and software management, computing resources for processing and analysis, and publication features (DOI minting). In this way we will be able to reproduce the full data life cycle chain while guaranteeing the FAIR+R principles.
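To make the idea of relating life-cycle elements concrete, the sketch below encodes a DMP, a dataset, and a processing tool as RDF triples with rdflib and serializes them as Turtle. The namespace and all terms are invented for illustration; they are not the authors' actual ontology.

```python
# Illustrative encoding of data life cycle relationships as RDF triples.
# The dlc: vocabulary below is a made-up example, not the project ontology.
from rdflib import Graph, Literal, Namespace, RDF

DLC = Namespace("http://example.org/datalifecycle#")

g = Graph()
g.bind("dlc", DLC)

dmp = DLC.DMP_reservoir2010          # placeholder Data Management Plan
dataset = DLC.WaterReservoirSeries   # placeholder dataset
software = DLC.QCToolkit             # placeholder processing software

g.add((dmp, RDF.type, DLC.DataManagementPlan))
g.add((dataset, RDF.type, DLC.Dataset))
g.add((software, RDF.type, DLC.Software))

# Relationships between life cycle elements
g.add((dataset, DLC.coveredBy, dmp))
g.add((dataset, DLC.processedWith, software))
g.add((dataset, DLC.describedBy, Literal("EML metadata record")))

print(g.serialize(format="turtle"))
```

A graph like this is also what a Semantic MediaWiki query or RDF export would operate on, which is the link between the metadata records and the semantic layer described above.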
Dhaval, Rakesh; Borlawsky, Tara; Ostrander, Michael; Santangelo, Jennifer; Kamal, Jyoti; Payne, Philip R O
2008-11-06
In order to enhance interoperability between enterprise systems, and improve data validity and reliability throughout The Ohio State University Medical Center (OSUMC), we have initiated the development of an ontology-anchored metadata architecture and knowledge collection for our enterprise data warehouse. The metadata and corresponding semantic relationships stored in the OSUMC knowledge collection are intended to promote consistency and interoperability across the heterogeneous clinical, research, business and education information managed within the data warehouse.
Deploying the ATLAS Metadata Interface (AMI) on the cloud with Jenkins
NASA Astrophysics Data System (ADS)
Lambert, F.; Odier, J.; Fulachier, J.; ATLAS Collaboration
2017-10-01
The ATLAS Metadata Interface (AMI) is a mature application that has been in existence for more than 15 years. Mainly used by the ATLAS experiment at CERN, it consists of a very generic tool ecosystem for metadata aggregation and cataloguing. AMI is used by the ATLAS production system; therefore, the service must guarantee a high level of availability. We describe our monitoring and administration systems, and the Jenkins-based strategy used to dynamically test and deploy cloud OpenStack nodes on demand.
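On-demand provisioning of an OpenStack node of the kind a Jenkins job could trigger might look like the following openstacksdk sketch. The cloud, image, flavor, and network names are placeholders, and this is not the actual AMI deployment code; it only illustrates the "create, wait, hand over to testing" pattern.

```python
# Sketch of on-demand OpenStack provisioning that a Jenkins build step could
# invoke. Cloud, image, flavor, and network names are placeholders.
import openstack

def provision_test_node(name="ami-test-node"):
    conn = openstack.connect(cloud="my-openstack")            # clouds.yaml entry (assumed)
    image = conn.compute.find_image("centos-7-base")           # placeholder image name
    flavor = conn.compute.find_flavor("m1.medium")             # placeholder flavor
    network = conn.network.find_network("ami-private-net")     # placeholder network

    server = conn.compute.create_server(
        name=name,
        image_id=image.id,
        flavor_id=flavor.id,
        networks=[{"uuid": network.id}],
    )
    # Block until the node is ACTIVE, then hand it over to the test stage.
    server = conn.compute.wait_for_server(server)
    return server

if __name__ == "__main__":
    node = provision_test_node()
    print("Provisioned", node.name, node.status)
```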
Federal Register 2010, 2011, 2012, 2013, 2014
2011-01-04
[email protected]ge.com . You may review copies of the referenced service information at the FAA, Engine..., February 26, 2009), for GE CF6-45 and CF6-50 series turbofan engines. That AD requires replacing LFCEN... that this proposed AD would affect 383 GE CF6-45 and CF6-50 series turbofan engines installed on...
NASA Astrophysics Data System (ADS)
Gerst, K.; Enquist, C.; Rosemartin, A.; Denny, E. G.; Marsh, L.; Moore, D. J.; Weltzin, J. F.
2014-12-01
The USA National Phenology Network (USA-NPN; www.usanpn.org) serves science and society by promoting a broad understanding of plant and animal phenology and the relationships among phenological patterns and environmental change. The National Phenology Database maintained by USA-NPN now has over 3.7 million records for plants and animals for the period 1954-2014, with the majority of these observations collected since 2008 as part of a broad, national contributory science strategy. These data have been used in a number of science, conservation and resource management applications, including national assessments of historical and potential future trends in phenology, regional assessments of spatio-temporal variation in organismal activity, and local monitoring for invasive species detection. Customizable data downloads are freely available, and data are accompanied by FGDC-compliant metadata, data-use and data-attribution policies, vetted and documented methodologies and protocols, and version control. While users are free to develop custom algorithms for data cleaning, winnowing and summarization prior to analysis, the National Coordinating Office of USA-NPN is developing a suite of standard data products to facilitate use and application by a diverse set of data users. This presentation provides a progress report on data product development, including: (1) Quality controlled raw phenophase status data; (2) Derived phenometrics (e.g. onset, duration) at multiple scales; (3) Data visualization tools; (4) Tools to support assessment of species interactions and overlap; (5) Species responsiveness to environmental drivers; (6) Spatially gridded phenoclimatological products; and (7) Algorithms for modeling and forecasting future phenological responses. The prioritization of these data products is a direct response to stakeholder needs related to informing management and policy decisions. We anticipate that these products will contribute to broad understanding of plant and animal phenology across scientific disciplines.
Chitinase activation in patients with fungus-associated cystic fibrosis lung disease.
Hector, Andreas; Chotirmall, Sanjay H; Lavelle, Gillian M; Mirković, Bojana; Horan, Deirdre; Eichler, Laura; Mezger, Markus; Singh, Anurag; Ralhan, Anjai; Berenbrinker, Sina; Mack, Ines; Ensenauer, Regina; Riethmüller, Joachim; Graepler-Mainka, Ute; Murray, Michelle A; Griese, Matthias; McElvaney, N Gerry; Hartl, Dominik
2016-10-01
Chitinases have recently gained attention in the field of pulmonary diseases, particularly in asthma and chronic obstructive pulmonary disease, but their potential role in patients with cystic fibrosis (CF)-associated lung disease remains unclear. The aim of this study was to assess chitinase activity systemically and in the airways of patients with CF and asthma compared with healthy subjects. Additionally, we assessed factors that regulate chitinase activity within the lungs of patients with CF. Chitinase activities were quantified in serum and bronchoalveolar lavage fluid from patients with CF, asthmatic patients, and healthy control subjects. Mechanistically, the role of CF airway proteases and genetic chitinase deficiency was assessed. Chitinase activity was systemically increased in patients with CF compared with that in healthy control subjects and asthmatic patients. Further stratification showed that chitinase activity was enhanced in patients with CF colonized with Candida albicans compared with that in noncolonized patients. CF proteases degraded chitinases in the airway microenvironment of patients with CF. Genetic chitinase deficiency was associated with C albicans colonization in patients with CF. Patients with CF have enhanced chitinase activation associated with C albicans colonization. Therefore chitinases might represent a novel biomarker and therapeutic target for CF-associated fungal disease. Copyright © 2016 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.
Cystic fibrosis airway secretions exhibit mucin hyperconcentration and increased osmotic pressure
Henderson, Ashley G.; Ehre, Camille; Button, Brian; Abdullah, Lubna H.; Cai, Li-Heng; Leigh, Margaret W.; DeMaria, Genevieve C.; Matsui, Hiro; Donaldson, Scott H.; Davis, C. William; Sheehan, John K.; Boucher, Richard C.; Kesimer, Mehmet
2014-01-01
The pathogenesis of mucoinfective lung disease in cystic fibrosis (CF) patients likely involves poor mucus clearance. A recent model of mucus clearance predicts that mucus flow depends on the relative mucin concentration of the mucus layer compared with that of the periciliary layer; however, mucin concentrations have been difficult to measure in CF secretions. Here, we have shown that the concentration of mucin in CF sputum is low when measured by immunologically based techniques, and mass spectrometric analyses of CF mucins revealed mucin cleavage at antibody recognition sites. Using physical size exclusion chromatography/differential refractometry (SEC/dRI) techniques, we determined that mucin concentrations in CF secretions were higher than those in normal secretions. Measurements of partial osmotic pressures revealed that the partial osmotic pressure of CF sputum and the retained mucus in excised CF lungs were substantially greater than the partial osmotic pressure of normal secretions. Our data reveal that mucin concentration cannot be accurately measured immunologically in proteolytically active CF secretions; mucins are hyperconcentrated in CF secretions; and CF secretion osmotic pressures predict mucus layer–dependent osmotic compression of the periciliary liquid layer in CF lungs. Consequently, mucin hypersecretion likely produces mucus stasis, which contributes to key infectious and inflammatory components of CF lung disease. PMID:24892808
Enhancement of discharge performance of Li/CFx cell by thermal treatment of CFx cathode material
NASA Astrophysics Data System (ADS)
Zhang, Sheng S.; Foster, Donald; Read, Jeffrey
In this work we demonstrate that thermal treatment of CFx cathode material just below the decomposition temperature can enhance the discharge performance of Li/CFx cells. The performance enhancement becomes more effective when heating a mixture of CFx and citric acid (CA), since CA serves as an extra carbon source. Discharge experiments show that the thermal treatment not only reduces the initial voltage delay, but also raises the discharge voltage. However, powder impedance measurements indicate that the thermal treatment does not increase the electronic conductivity of the CFx material. Based on these facts, we propose that the thermal treatment results in a limited decomposition of CFx, which yields a subfluorinated carbon (CFx-δ) instead of a highly conductive carbon. In the case of the CFx/CA mixture, the CA provides extra carbon that reacts with F2 and fluorocarbon radicals generated by the thermal decomposition of CFx to form subfluorinated carbon. The process of thermal treatment is studied by thermogravimetric analysis and X-ray diffraction, and the effect of treatment conditions such as heating temperature, heating time, and CFx/CA ratio on the discharge performance of the CFx cathode is discussed. As an example, a Li/CFx cell using CFx treated with CA at 500 °C under nitrogen for 2 h achieved its theoretical specific capacity when discharged at C/5. Impedance analysis indicates that the enhanced performance is attributed to a significant reduction in the cell reaction resistance.
Qu, Cheng; Wang, Lin-Yan; Jin, Wen-Tao; Tang, Yu-Ping; Jin, Yi; Shi, Xu-Qin; Shang, Li-Li; Shang, Er-Xin; Duan, Jin-Ao
2016-11-06
The flower of Carthamus tinctorius L. (Carthami Flos, safflower), important in traditional Chinese medicine (TCM), is known for treating blood stasis, coronary heart disease, hypertension, and cerebrovascular disease in clinical and experimental studies. It is widely accepted that hydroxysafflor yellow A (HSYA) and anhydrosafflor yellow B (ASYB) are the major bioactive components of many formulae containing safflower. In this study, selective knock-out of target components such as HSYA and ASYB using preparative high-performance liquid chromatography (prep-HPLC), followed by evaluation of antiplatelet and anticoagulation activities, was used to investigate the roles of bioactive ingredients in the safflower series of herb pairs. The results showed that both HSYA and ASYB not only played a direct role in activating blood circulation, but also contributed indirectly to the total bioactivity of the safflower series of herb pairs. The degree of contribution of HSYA in safflower and its series of herb pairs was as follows: Carthami Flos-Ginseng Radix et Rhizoma Rubra (CF-GR) > Carthami Flos-Sappan Lignum (CF-SL) > Carthami Flos-Angelicae Sinensis Radix (CF-AS) > Carthami Flos-Astragali Radix (CF-AR) > Carthami Flos-Angelicae Sinensis Radix (CF-AS) > Carthami Flos-Glycyrrhizae Radix et Rhizoma (CF-GL) > Carthami Flos-Salviae Miltiorrhizae Radix et Rhizoma (CF-SM) > Carthami Flos (CF), and the degree of contribution of ASYB in safflower and its series of herb pairs was: CF-GL > CF-PS > CF-AS > CF-SL > CF-SM > CF-AR > CF-GR > CF. Thus, this study provides an effective approach to elucidating the contribution of different herbal components to the bioactivity of a herb pair and to clarifying the variation of herb-pair compatibilities. In addition, this study provides guidance for investigating the relationship between herbal compounds and the bioactivities of herb pairs. It also provides a scientific basis for rational clinical applications and new drug development on the basis of the safflower series of herb pairs.
47 CFR 1.1206 - Permit-but-disclose proceedings.
Code of Federal Regulations, 2012 CFR
2012-10-01
... document to be filed electronically contains metadata that is confidential or protected from disclosure by... metadata from the document before filing it electronically. (iii) Filing dates outside the Sunshine period...
Federal Register 2010, 2011, 2012, 2013, 2014
2011-11-21
... development of standardized metadata in hundreds of organizations, and funded numerous implementations of OGC... of emphasis include: Metadata documentation, clearinghouse establishment, framework development...
47 CFR 1.1206 - Permit-but-disclose proceedings.
Code of Federal Regulations, 2011 CFR
2011-10-01
... technically possible. Where the document to be filed electronically contains metadata that is confidential or... filer may remove such metadata from the document before filing it electronically. (iii) Filing dates...
Dataworks for GNSS: Software for Supporting Data Sharing and Federation of Geodetic Networks
NASA Astrophysics Data System (ADS)
Boler, F. M.; Meertens, C. M.; Miller, M. M.; Wier, S.; Rost, M.; Matykiewicz, J.
2015-12-01
Continuously-operating Global Navigation Satellite System (GNSS) networks are increasingly being installed globally for a wide variety of science and societal applications. GNSS enables Earth science research in areas including tectonic plate interactions, crustal deformation in response to loading by tectonics, magmatism, water and ice, and the dynamics of water - and thereby energy transfer - in the atmosphere at regional scale. The many individual scientists and organizations that set up GNSS stations globally are often open to sharing data, but lack the resources or expertise to deploy systems and software to manage and curate data and metadata and provide user tools that would support data sharing. UNAVCO previously gained experience in facilitating data sharing through the NASA-supported development of the Geodesy Seamless Archive Centers (GSAC) open source software. GSAC provides web interfaces and simple web services for data and metadata discovery and access, supports federation of multiple data centers, and simplifies transfer of data and metadata to long-term archives. The NSF supported the dissemination of GSAC to multiple European data centers forming the European Plate Observing System. To expand upon GSAC to provide end-to-end, instrument-to-distribution capability, UNAVCO developed Dataworks for GNSS with NSF funding to the COCONet project, and deployed this software on systems that are now operating as Regional GNSS Data Centers as part of the NSF-funded TLALOCNet and COCONet projects. Dataworks consists of software modules written in Python and Java for data acquisition, management and sharing. There are modules for GNSS receiver control and data download, a database schema for metadata, tools for metadata handling, ingest software to manage file metadata, data file management scripts, GSAC, scripts for mirroring station data and metadata from partner GSACs, and extensive software and operator documentation. UNAVCO plans to provide a cloud VM image of Dataworks that would allow standing up a Dataworks-enabled GNSS data center without requiring upfront investment in server hardware. By enabling data creators to organize their data and metadata for sharing, Dataworks helps scientists expand their data curation awareness and responsibility, and enhances data access for all.
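A federated query against a GSAC-style web service could look like the sketch below, which asks a repository for sites inside a bounding box and parses the CSV response. The base URL, path, and parameter names are placeholders modeled loosely on the kind of simple search services GSAC exposes; the real endpoints and parameters should be taken from the deployed repository's GSAC documentation.

```python
# Illustrative query against a GSAC-style federated web service.
# URL, path, and parameter names are placeholders, not confirmed endpoints.
import csv
import io
import urllib.parse
import urllib.request

BASE = "https://data.example-gnss.org/gsacws/gsacapi"  # placeholder repository

def search_sites(bbox, output="site.csv"):
    """Return site records inside a (west, south, east, north) bounding box."""
    west, south, east, north = bbox
    params = {
        "output": output,
        "site.spatial.bbox": f"{west},{south},{east},{north}",  # assumed parameter name
        "limit": 50,
    }
    url = f"{BASE}/site/search?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        text = resp.read().decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))

if __name__ == "__main__":
    for site in search_sites((-105.0, 17.0, -60.0, 25.0)):  # a Caribbean-scale region
        print(site)
```

Because each Dataworks deployment answers the same style of request, a downstream archive can mirror data and metadata from many partner GSACs with one small client like this.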
DOIDB: Reusing DataCite's search software as metadata portal for GFZ Data Services
NASA Astrophysics Data System (ADS)
Elger, K.; Ulbricht, D.; Bertelmann, R.
2016-12-01
GFZ Data Services is the central service point for the publication of research data at the Helmholtz Centre Potsdam GFZ German Research Centre for Geosciences (GFZ). It provides data publishing services to scientists of GFZ, associated projects, and associated institutions. The publishing services aim to make research data and physical samples visible and citable by assigning persistent identifiers (DOI, IGSN) and by complementing existing IT infrastructure. To integrate several research domains, a modular software stack made of free software components has been created to manage data and metadata as well as register persistent identifiers [1]. The pivotal component for the registration of DOIs is the DOIDB. It has been derived from three software components provided by DataCite [2] that moderate the registration of DOIs and the deposition of metadata, allow the dissemination of metadata, and provide a user interface to navigate and discover datasets. The DOIDB acts as a proxy to the DataCite infrastructure and, in addition to the DataCite metadata schema, it allows metadata to be deposited and disseminated following the ISO 19139 and NASA GCMD DIF schemas. The search component has been modified to meet the requirements of a geosciences metadata portal. In particular, the search component has been altered to make use of Apache Solr's capability to index and query spatial coordinates. Furthermore, the user interface has been adjusted to provide a first impression of the data by showing a map, summary information, and subjects. DOIDB and its components are available on GitHub [3]. We present a software solution for the registration of DOIs that integrates existing data systems, keeps track of registered DOIs, and provides a metadata portal to discover datasets [4]. [1] Ulbricht, D.; Elger, K.; Bertelmann, R.; Klump, J. panMetaDocs, eSciDoc, and DOIDB—An Infrastructure for the Curation and Publication of File-Based Datasets for GFZ Data Services. ISPRS Int. J. Geo-Inf. 2016, 5, 25. http://doi.org/10.3390/ijgi5030025 [2] https://github.com/datacite [3] https://github.com/ulbricht/search/tree/doidb , https://github.com/ulbricht/mds/tree/doidb , https://github.com/ulbricht/oaip/tree/doidb [4] http://doidb.wdc-terra.org
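A data system talking to a DataCite-MDS-style proxy such as the DOIDB might register a DOI roughly as in the sketch below: first deposit the metadata record, then register the DOI with its landing-page URL. The host, credentials, DOI, and metadata record are placeholders, and the /metadata and /doi/{doi} paths follow the DataCite MDS convention, which is assumed here to be what the proxy exposes.

```python
# Sketch of DOI registration through a DataCite-MDS-style proxy (e.g. DOIDB).
# Host, credentials, DOI, and metadata are placeholders; paths are assumed.
import requests

BASE = "https://doidb.example.org/mds"    # placeholder proxy host
AUTH = ("datacentre.user", "secret")       # placeholder credentials
DOI = "10.5880/GFZ.EXAMPLE.2016.001"       # placeholder DOI

METADATA_XML = """<?xml version="1.0" encoding="UTF-8"?>
<resource xmlns="http://datacite.org/schema/kernel-4">
  <identifier identifierType="DOI">10.5880/GFZ.EXAMPLE.2016.001</identifier>
  <creators><creator><creatorName>Doe, Jane</creatorName></creator></creators>
  <titles><title>Example dataset</title></titles>
  <publisher>GFZ Data Services</publisher>
  <publicationYear>2016</publicationYear>
  <resourceType resourceTypeGeneral="Dataset">Dataset</resourceType>
</resource>
"""

# 1) Deposit the metadata record for the DOI.
r = requests.post(
    f"{BASE}/metadata",
    data=METADATA_XML.encode("utf-8"),
    headers={"Content-Type": "application/xml;charset=UTF-8"},
    auth=AUTH,
)
r.raise_for_status()

# 2) Register the DOI and point it at the dataset landing page.
r = requests.put(
    f"{BASE}/doi/{DOI}",
    data=f"doi={DOI}\nurl=https://dataservices.example.org/landing/{DOI}",
    headers={"Content-Type": "text/plain;charset=UTF-8"},
    auth=AUTH,
)
r.raise_for_status()
print("registered", DOI)
```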
Advancements in Large-Scale Data/Metadata Management for Scientific Data.
NASA Astrophysics Data System (ADS)
Guntupally, K.; Devarakonda, R.; Palanisamy, G.; Frame, M. T.
2017-12-01
Scientific data often come with complex and diverse metadata that are critical for data discovery and for users. The Online Metadata Editor (OME) tool, which was developed by an Oak Ridge National Laboratory team, effectively manages diverse scientific datasets across several federal data centers, such as DOE's Atmospheric Radiation Measurement (ARM) Data Center and USGS's Core Science Analytics, Synthesis, and Libraries (CSAS&L) project. This presentation will focus mainly on recent developments and future strategies for refining the OME tool within these centers. The ARM OME is a standards-based tool (https://www.archive.arm.gov/armome) that allows scientists to create and maintain metadata about their data products. The tool has been improved with new workflows that help metadata coordinators and submitting investigators to submit and review their data more efficiently. The ARM Data Center's newly upgraded Data Discovery Tool (http://www.archive.arm.gov/discovery) uses rich metadata generated by the OME to enable search and discovery of thousands of datasets, while also providing a citation generator and modern order-delivery techniques like Globus (using GridFTP), Dropbox, and THREDDS. The Data Discovery Tool also supports incremental indexing, which allows users to find new data as and when they are added. The USGS CSAS&L search catalog employs a custom version of the OME (https://www1.usgs.gov/csas/ome), which has been upgraded with high-level Federal Geographic Data Committee (FGDC) validations and the ability to reserve and mint Digital Object Identifiers (DOIs). The USGS's Science Data Catalog (SDC) (https://data.usgs.gov/datacatalog) allows users to discover a myriad of science data holdings through a web portal. Recent major upgrades to the SDC and ARM Data Discovery Tool include improved harvesting performance and a migration to new search software, such as Apache Solr 6.0, for serving up data/metadata to scientific communities. Our presentation will highlight future enhancements of these tools that enable users to retrieve search results quickly and parallelize retrieval from online and High Performance Storage Systems. In addition, these improvements to the tools will support additional metadata formats like the Large-Eddy Simulation (LES) ARM Symbiotic Simulation and Observation (LASSO) bundle data.
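The incremental-discovery pattern described above, finding datasets as they are added to a Solr-backed index, can be sketched with a few lines of pysolr. The Solr core URL and the field names used below are illustrative placeholders, not the ARM or USGS schemas.

```python
# Minimal sketch of a Solr query a data-discovery portal might run to pick
# up newly indexed datasets. Core URL and field names are placeholders.
import pysolr

solr = pysolr.Solr("http://localhost:8983/solr/datasets", timeout=10)

# Find datasets indexed in the last 7 days that mention "soil moisture".
results = solr.search(
    'title:"soil moisture" OR abstract:"soil moisture"',
    **{
        "fq": "indexed_date:[NOW-7DAYS TO NOW]",  # incremental window (assumed field)
        "fl": "id,title,indexed_date",
        "rows": 25,
        "sort": "indexed_date desc",
    },
)

for doc in results:
    print(doc["id"], doc["title"])
```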