Sample records for marine metadata interoperability

  1. A Common Metadata System for Marine Data Portals

    NASA Astrophysics Data System (ADS)

    Wosniok, C.; Breitbach, G.; Lehfeldt, R.

    2012-04-01

    Processing and allocation of marine datasets depend on the nature of the data resulting from field campaigns, continuous monitoring and numerical modeling. Two research and development projects in northern Germany manage different types of marine data. Due to different data characteristics and institutional frameworks, separate data portals are required. This paper describes the integration of distributed marine data in Germany. The Marine Data Infrastructure of Germany (MDI-DE) supports public authorities in the German coastal zone with the implementation of European directives like INSPIRE or the Marine Strategy Framework Directive. This is carried out by setting up standardized web services within a network of participating coastal agencies and by installing a common data portal (http://www.mdi-de.org), which integrates distributed marine data concerning coastal engineering, coastal water protection and nature conservation in an interoperable and harmonized manner for administrative and scientific purposes as well as for informing the general public. The Coastal Observation System for Northern and Arctic Seas (COSYNA) aims at developing and testing analysis systems for the operational synoptic description of the environmental status of the North Sea and of Arctic coastal waters. This is done by establishing a network of monitoring facilities and providing its data in near real time. In situ measurements from poles, ferry boxes and buoys, together with remote sensing measurements and the assimilation of these data into simulations, enable COSYNA to provide pre-operational 'products' that go beyond the routinely applied techniques in observation and modelling. Providing data in near real time requires thorough data validation, which is performed on the fly before data are passed on to the COSYNA portal (http://kofserver2.hzg.de/codm/). Both projects apply OGC standards such as the Web Map Service (WMS), Web Feature Service (WFS) and Sensor Observation Service (SOS), which ensures interoperability and extensibility. In addition, metadata, a crucial component for searching and finding information in large data infrastructures, are provided via the Catalogue Service for the Web (CSW). MDI-DE and COSYNA rely on NOKIS, a metadata information system for marine metadata that implements a metadata profile tailored for marine data according to the specifications of German coastal authorities. In spite of this common software base, interoperability between the two data collections requires constant alignment of the diverse data processed by the two portals. While monitoring data in the MDI-DE is currently rather campaign-based, COSYNA has to fit constantly evolving time series into metadata sets. With all data following the same metadata profile, we now reach full interoperability between the different data collections. The distributed marine information system provides options to search, find and visualise the harmonised results from continuous monitoring, field campaigns, numerical modeling and other data in one web client.
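
    As a rough illustration of the catalogue-based discovery used by both portals, the sketch below queries a CSW endpoint for marine metadata records with the OWSLib library; the endpoint URL and search phrase are placeholders, not the actual MDI-DE or COSYNA service addresses.

      # Minimal CSW discovery sketch (hypothetical endpoint; assumes OWSLib is installed).
      from owslib.csw import CatalogueServiceWeb
      from owslib.fes import PropertyIsLike

      csw = CatalogueServiceWeb("https://example.org/csw")  # placeholder catalogue URL

      # Full-text constraint against the csw:AnyText queryable.
      query = PropertyIsLike(propertyname="csw:AnyText", literal="%coastal water%")
      csw.getrecords2(constraints=[query], maxrecords=10)

      for identifier, record in csw.records.items():
          print(identifier, "-", record.title)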

  2. MMI's Metadata and Vocabulary Solutions: 10 Years and Growing

    NASA Astrophysics Data System (ADS)

    Graybeal, J.; Gayanilo, F.; Rueda-Velasquez, C. A.

    2014-12-01

    The Marine Metadata Interoperability project (http://marinemetadata.org) held its public opening at AGU's 2004 Fall Meeting. In the 10 years since that debut, the MMI guidance and vocabulary sites have served over 100,000 visitors, with 525 community members and continuous Steering Committee leadership. The project was originally funded by the National Science Foundation, and over the years multiple organizations have supported the MMI mission: "Our goal is to support collaborative research in the marine science domain, by simplifying the incredibly complex world of metadata into specific, straightforward guidance. MMI encourages scientists and data managers at all levels to apply good metadata practices from the start of a project, by providing the best guidance and resources for data management, and developing advanced metadata tools and services needed by the community." Now hosted by the Harte Research Institute at Texas A&M University at Corpus Christi, MMI continues to provide guidance and services to the community, and is planning for marine science and technology needs for the next 10 years. In this presentation we will highlight our major accomplishments, describe our recent achievements and imminent goals, and propose a vision for improving marine data interoperability for the next 10 years, including Ontology Registry and Repository (http://mmisw.org/orr) advancements and applications (http://mmisw.org/cfsn).

  3. Seeking the Path to Metadata Nirvana

    NASA Astrophysics Data System (ADS)

    Graybeal, J.

    2008-12-01

    Scientists have always found reusing other scientists' data challenging. Computers did not fundamentally change the problem, but enabled more and larger instances of it. In fact, by removing human mediation and time delays from the data sharing process, computers emphasize the contextual information that must be exchanged in order to exchange and reuse data. This requirement for contextual information has two faces: "interoperability" when talking about systems, and "the metadata problem" when talking about data. As much as any single organization, the Marine Metadata Interoperability (MMI) project has been tagged with the mission "Solve the metadata problem." Of course, if that goal is achieved, then sustained, interoperable data systems for interdisciplinary observing networks can be easily built -- pesky metadata differences, like which protocol to use for data exchange, or what the data actually measures, will be a thing of the past. Alas, as you might imagine, there will always be complexities and incompatibilities that are not addressed, and data systems that are not interoperable, even within a science discipline. So should we throw up our hands and surrender to the inevitable? Not at all. Rather, we try to minimize metadata problems as much as we can. In this we are increasingly making progress, despite natural forces that pull in the other direction. Computer systems let us work with more complexity, build community knowledge and collaborations, and preserve and publish our progress and (dis-)agreements. Funding organizations, science communities, and technologists see the importance of interoperable systems and metadata, and direct resources toward them. With the new approaches and resources, projects like IPY and MMI can simultaneously define, display, and promote effective strategies for sustainable, interoperable data systems. This presentation will outline the role metadata plays in durable interoperable data systems, for better or worse. It will describe times when "just choosing a standard" can work, and when it probably won't work. And it will point out signs that suggest a metadata storm is coming to your community project, and how you might avoid it. From these lessons we will seek a path to producing interoperable, interdisciplinary, metadata-enlightened environment-observing systems.

  4. Engaging a community towards marine cyberinfrastructure: Lessons Learned from The Marine Metadata Interoperability initiative

    NASA Astrophysics Data System (ADS)

    Galbraith, N. R.; Graybeal, J.; Bermudez, L. E.; Wright, D.

    2005-12-01

    The Marine Metadata Interoperability (MMI) initiative promotes the exchange, integration and use of marine data through enhanced data publishing, discovery, documentation and accessibility. The project, operating since late 2004, presents several cultural and organizational challenges because of the diversity of participants: scientists, technical experts, and data managers from around the world, all working in organizations with different corporate cultures, funding structures, and systems of decision-making. MMI provides educational resources at several levels. For instance, short introductions to metadata concepts are available, as well as guides and "cookbooks" for the quick and efficient preparation of marine metadata. For those who are building major marine data systems, including ocean-observing capabilities, there are training materials, marine metadata content examples, and resources for mapping elements between different metadata standards. The MMI also provides examples of good metadata practices in existing data systems, including the EU's Marine XML project, and functioning ocean/coastal clearinghouses and atlases developed by MMI team members. Several communication tools help build community: 1) A website, used to introduce the initiative to new visitors and to provide in-depth guidance and resources to members and visitors. The site is built using Plone, an open source web content management system. Plone allows the site to serve as a wiki, to which every user can contribute material. This keeps the membership engaged and spreads the responsibility for the tasks of updating and expanding the site. 2) Email lists, to engage the broad ocean sciences community. The discussion forums "news," "ask," and "site-help" are available for receiving regular updates on MMI activities, seeking advice or support on projects and standards, or for assistance with using the MMI site. Internal email lists are provided for the Technical Team, the Steering Committee and Executive Committee, and for several content-centered teams. These lists help keep committee members connected, and have been very successful in building consensus and momentum. 3) Regularly scheduled telecons, to provide the chance for interaction between members without the need to physically attend meetings. Both the steering committee and the technical team convene via phone every month. Discussions are guided by agendas published in advance, and minutes are kept on-line for reference. These telecons have been an important tool in moving the MMI project forward; they give members an opportunity for informal discussion and provide a timeframe for accomplishing tasks. 4) Workshops, to make progress towards community agreement, such as the technical workshop "Advancing Domain Vocabularies" held August 9-11, 2005, in Boulder, Colorado, where featured domain and metadata experts developed mappings between existing marine metadata vocabularies. Most of the work of the meeting was performed in six small, carefully organized breakout teams, oriented around specific domains. 5) A calendar of events, which keeps users updated and where any event related to marine metadata and interoperability can be posted. 6) Specific tools to reach agreements among distributed communities; for example, we developed a tool called the Vocabulary Integration Environment (VINE), which allows formalized agreement on mappings across different vocabularies.

  5. Semantic Integration for Marine Science Interoperability Using Web Technologies

    NASA Astrophysics Data System (ADS)

    Rueda, C.; Bermudez, L.; Graybeal, J.; Isenor, A. W.

    2008-12-01

    The Marine Metadata Interoperability Project, MMI (http://marinemetadata.org) promotes the exchange, integration, and use of marine data through enhanced data publishing, discovery, documentation, and accessibility. A key effort is the definition of an Architectural Framework and Operational Concept for Semantic Interoperability (http://marinemetadata.org/sfc), which is complemented with the development of tools that realize critical use cases in semantic interoperability. In this presentation, we describe a set of such Semantic Web tools that allow users to perform important interoperability tasks, ranging from the creation of controlled vocabularies and the mapping of terms across multiple ontologies, to the online registration, storage, and search services needed to work with the ontologies (http://mmisw.org). This set of services uses Web standards and technologies, including the Resource Description Framework (RDF), Web Ontology Language (OWL), Web services, and toolkits for Rich Internet Application development. We will describe the following components: MMI Ontology Registry: The MMI Ontology Registry and Repository provides registry and storage services for ontologies. Entries in the registry are associated with projects defined by the registered users. Also, sophisticated search functions, for example according to metadata items and vocabulary terms, are provided. Client applications can submit search requests using the W3C SPARQL Query Language for RDF. Voc2RDF: This component converts an ASCII comma-delimited set of terms and definitions into an RDF file. Voc2RDF facilitates the creation of controlled vocabularies by using a simple form-based user interface. Created vocabularies and their descriptive metadata can be submitted to the MMI Ontology Registry for versioning and community access. VINE: The Vocabulary Integration Environment component allows the user to map vocabulary terms across multiple ontologies. Various relationships can be established, for example exactMatch, narrowerThan, and subClassOf. VINE can compute inferred mappings based on the given associations. Attributes about each mapping, like comments and a confidence level, can also be included. VINE also supports registering and storing resulting mapping files in the Ontology Registry. The presentation will describe the application of semantic technologies in general, and our planned applications in particular, to solve data management problems in the marine and environmental sciences.
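
    A minimal sketch of the two workflows described above, a Voc2RDF-style conversion of a comma-delimited term list into RDF and a VINE-style mapping between concepts from two vocabularies, is given below using rdflib and SKOS; the namespace URIs and terms are invented for illustration and are not MMI-registered vocabularies.

      # Voc2RDF-style conversion plus a VINE-style cross-vocabulary mapping (illustrative namespaces).
      import csv, io
      from rdflib import Graph, Literal, Namespace, RDF
      from rdflib.namespace import SKOS

      VOCAB = Namespace("http://example.org/vocab/")        # hypothetical vocabulary namespace
      OTHER = Namespace("http://example.org/other-vocab/")  # hypothetical second vocabulary

      terms = "term,definition\nsea_water_temperature,Temperature of the water column\n"

      g = Graph()
      g.bind("skos", SKOS)
      for row in csv.DictReader(io.StringIO(terms)):
          concept = VOCAB[row["term"]]
          g.add((concept, RDF.type, SKOS.Concept))
          g.add((concept, SKOS.prefLabel, Literal(row["term"])))
          g.add((concept, SKOS.definition, Literal(row["definition"])))

      # VINE-style statement: two terms from different vocabularies denote the same concept.
      g.add((VOCAB["sea_water_temperature"], SKOS.exactMatch, OTHER["waterTemp"]))

      print(g.serialize(format="turtle"))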

  6. Ocean Data Interoperability Platform (ODIP): developing a common framework for global marine data management

    NASA Astrophysics Data System (ADS)

    Glaves, H. M.

    2015-12-01

    In recent years marine research has become increasingly multidisciplinary in its approach, with a corresponding rise in the demand for large quantities of high-quality interoperable data. This requirement for easily discoverable and readily available marine data is currently being addressed by a number of regional initiatives, with projects such as SeaDataNet in Europe, Rolling Deck to Repository (R2R) in the USA and the Integrated Marine Observing System (IMOS) in Australia having implemented local infrastructures to facilitate the exchange of standardised marine datasets. However, each of these systems has been developed to address local requirements and was created in isolation from those in other regions. Multidisciplinary marine research on a global scale necessitates a common framework for marine data management which is based on existing data systems. The Ocean Data Interoperability Platform project is seeking to address this requirement by bringing together selected regional marine e-infrastructures for the purposes of developing interoperability across them. By identifying the areas of commonality and incompatibility between these data infrastructures, and leveraging the development activities and expertise of these individual systems, three prototype interoperability solutions are being created which demonstrate the effective sharing of marine data and associated metadata across the participating regional data infrastructures as well as with other target international systems such as GEO, COPERNICUS etc. These interoperability solutions, combined with agreed best practice and approved standards, form the basis of a common global approach to marine data management which can be adopted by the wider marine research community. To encourage implementation of these interoperability solutions by other regional marine data infrastructures, an impact assessment is being conducted to determine both the technical and financial implications of deploying them alongside existing services. The associated best practice and common standards are also being disseminated to the user community through relevant accreditation processes and related initiatives such as the Research Data Alliance and the Belmont Forum.

  7. MMI: Increasing Community Collaboration

    NASA Astrophysics Data System (ADS)

    Galbraith, N. R.; Stocks, K.; Neiswender, C.; Maffei, A.; Bermudez, L.

    2007-12-01

    Building community requires a collaborative environment and guidance to help move members towards a common goal. An effective environment for community collaboration is a workspace that fosters participation and cooperation; effective guidance furthers common understanding and promotes best practices. The Marine Metadata Interoperability (MMI) project has developed a community web site to provide a collaborative environment for scientists, technologists, and data managers from around the world to learn about metadata and exchange ideas. Workshops, demonstration projects, and presentations also provide community-building opportunities for MMI. MMI has developed comprehensive online guides to help users understand and work with metadata standards, ontologies, and other controlled vocabularies. Documents such as "The Importance of Metadata Standards", "Usage vs. Discovery Vocabularies" and "Developing Controlled Vocabularies" guide scientists and data managers through a variety of metadata-related concepts. Members from eight organizations involved in marine science and informatics collaborated on this effort. The MMI web site has moved from Plone to Drupal, two content management systems which provide different opportunities for community-based work. Drupal's "organic groups" feature will be used to provide workspace for future teams tasked with content development, outreach, and other MMI mission-critical work. The new site is designed to enable members to easily create working areas, to build communities dedicated to developing consensus on metadata and other interoperability issues. Controlled-vocabulary-driven menus, integrated mailing-lists, member-based content creation and review tools are facets of the new web site architecture. This move provided the challenge of developing a hierarchical vocabulary to describe the resources presented on the site; consistent and logical tagging of web pages is the basis of Drupal site navigation. The new MMI web site presents enhanced opportunities for electronic discussions, focused collaborative work, and even greater community participation. The MMI project is beginning a new initiative to comprehensively catalog and document tools for marine metadata. The new MMI community-based web site will be used to support this work and to support the work of other ad-hoc teams in the future. We are seeking broad input from the community on this effort.

  8. Achieving interoperability for metadata registries using comparative object modeling.

    PubMed

    Park, Yu Rang; Kim, Ju Han

    2010-01-01

    Achieving data interoperability between organizations relies upon agreed meaning and representation (metadata) of data. For managing and registering metadata, many organizations have built metadata registries (MDRs) in various domains based on the international standard for the MDR framework, ISO/IEC 11179. Following this trend, two public MDRs in the biomedical domain have been created, the United States Health Information Knowledgebase (USHIK) and the cancer Data Standards Registry and Repository (caDSR), by the U.S. Department of Health & Human Services and the National Cancer Institute (NCI), respectively. Most MDRs are implemented with indiscriminate extensions to satisfy organization-specific needs and to work around the semantic and structural limitations of ISO/IEC 11179. As a result, it is difficult to achieve interoperability among multiple MDRs. In this paper, we propose an integrated metadata object model for achieving interoperability among multiple MDRs. To evaluate this model, we developed an XML Schema Definition (XSD)-based metadata exchange format. We created an XSD-based metadata exporter, supporting both the integrated metadata object model and organization-specific MDR formats.
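
    The XSD-based exchange format described above implies a validation step before a record exported from one registry is imported into another; a minimal sketch using lxml is shown below, with both file names being placeholders rather than artefacts of the cited work.

      # Validate an exported metadata record against a (hypothetical) exchange-format XSD.
      from lxml import etree

      schema = etree.XMLSchema(etree.parse("mdr_exchange_format.xsd"))   # placeholder schema file
      record = etree.parse("exported_data_element.xml")                  # placeholder exported record

      if schema.validate(record):
          print("record conforms to the exchange schema")
      else:
          for error in schema.error_log:
              print(error.line, error.message)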

  9. EarthCube Data Discovery Hub: Enhancing, Curating and Finding Data across Multiple Geoscience Data Sources.

    NASA Astrophysics Data System (ADS)

    Zaslavsky, I.; Valentine, D.; Richard, S. M.; Gupta, A.; Meier, O.; Peucker-Ehrenbrink, B.; Hudman, G.; Stocks, K. I.; Hsu, L.; Whitenack, T.; Grethe, J. S.; Ozyurt, I. B.

    2017-12-01

    EarthCube Data Discovery Hub (DDH) is an EarthCube Building Block project using technologies developed in CINERGI (Community Inventory of EarthCube Resources for Geoscience Interoperability) to enable geoscience users to explore a growing portfolio of EarthCube-created and other geoscience-related resources. Over 1 million metadata records are available for discovery through the project portal (cinergi.sdsc.edu). These records are retrieved from data facilities, including federal, state and academic sources, or contributed by geoscientists through workshops, surveys, or other channels. CINERGI metadata augmentation pipeline components 1) provide semantic enhancement based on a large ontology of geoscience terms, using text analytics to generate keywords with references to ontology classes, 2) add spatial extents based on place names found in the metadata record, and 3) add organization identifiers to the metadata. The records are indexed and can be searched via a web portal and standard search APIs. The added metadata content improves discoverability and interoperability of the registered resources. Specifically, the addition of ontology-anchored keywords enables faceted browsing and lets users navigate to datasets related by variables measured, equipment used, science domain, processes described, geospatial features studied, and other dataset characteristics that are generated by the pipeline. DDH also lets data curators access and edit the automatically generated metadata records using the CINERGI metadata editor, accept or reject the enhanced metadata content, and consider it in updating their metadata descriptions. We consider several complex data discovery workflows, in environmental seismology (quantifying sediment and water fluxes using seismic data), marine biology (determining available temperature, location, weather and bleaching characteristics of coral reefs related to measurements in a given coral reef survey), and river geochemistry (discovering observations relevant to geochemical measurements outside the tidal zone, given specific discharge conditions).
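
    The semantic-enhancement step described above can be pictured as matching free text against ontology class labels; the sketch below is a drastic simplification of that idea, with a hand-made label-to-URI table standing in for the large geoscience ontology used by the actual pipeline.

      # Toy keyword enhancer: tag a metadata abstract with ontology classes whose labels occur in it.
      ONTOLOGY_LABELS = {  # hypothetical label -> class URI table
          "sediment flux": "http://example.org/ontology/SedimentFlux",
          "coral reef": "http://example.org/ontology/CoralReef",
          "seismic data": "http://example.org/ontology/SeismicData",
      }

      def enhance(abstract):
          """Return ontology-anchored keywords found in the given text."""
          text = abstract.lower()
          return [{"keyword": label, "classUri": uri}
                  for label, uri in ONTOLOGY_LABELS.items() if label in text]

      print(enhance("Quantifying sediment flux near a coral reef survey site using seismic data."))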

  10. Implementation of a metadata architecture and knowledge collection to support semantic interoperability in an enterprise data warehouse.

    PubMed

    Dhaval, Rakesh; Borlawsky, Tara; Ostrander, Michael; Santangelo, Jennifer; Kamal, Jyoti; Payne, Philip R O

    2008-11-06

    In order to enhance interoperability between enterprise systems, and improve data validity and reliability throughout The Ohio State University Medical Center (OSUMC), we have initiated the development of an ontology-anchored metadata architecture and knowledge collection for our enterprise data warehouse. The metadata and corresponding semantic relationships stored in the OSUMC knowledge collection are intended to promote consistency and interoperability across the heterogeneous clinical, research, business and education information managed within the data warehouse.

  11. Interoperability Between Coastal Web Atlases Using Semantic Mediation: A Case Study of the International Coastal Atlas Network (ICAN)

    NASA Astrophysics Data System (ADS)

    Wright, D. J.; Lassoued, Y.; Dwyer, N.; Haddad, T.; Bermudez, L. E.; Dunne, D.

    2009-12-01

    Coastal mapping plays an important role in informing marine spatial planning, resource management, maritime safety, hazard assessment and even national sovereignty. As such, there is now a plethora of data/metadata catalogs, pre-made maps, tabular and text information on resource availability and exploitation, and decision-making tools. A recent trend has been to encapsulate these in a special class of web-enabled geographic information systems called a coastal web atlas (CWA). While multiple benefits are derived from tailor-made atlases, there is great value added from the integration of disparate CWAs. CWAs linked to one another can be queried more successfully to optimize planning and decision-making. If a dataset is missing in one atlas, it may be immediately located in another. Similar datasets in two atlases may be combined to enhance study in either region. But how best to achieve semantic interoperability to mitigate vague data queries, concepts or natural language semantics when retrieving and integrating data and information? We report on the development of a new prototype seeking to interoperate between two initial CWAs: the Marine Irish Digital Atlas (MIDA) and the Oregon Coastal Atlas (OCA). These two mature atlases are used as a testbed for more regional connections, with the intent for the OCA to use lessons learned to develop a regional network of CWAs along the west coast, and for MIDA to do the same in building and strengthening atlas networks with the UK, Belgium, and other parts of Europe. Our prototype uses semantic interoperability via services harmonization and ontology mediation, allowing local atlases to use their own data structures and vocabularies (ontologies). We use standard technologies such as OGC Web Map Services (WMS) for delivering maps, and the OGC Catalogue Service for the Web (CSW) for delivering and querying ISO-19139 metadata. The metadata records of a given CWA use a given ontology of terms, called the local ontology. Human or machine users formulate their requests using a common ontology of metadata terms, called the global ontology. A CSW mediator rewrites the user's request into CSW requests over local CSWs using their own (local) ontologies, collects the results and sends them back to the user. To extend the system, we have recently added global maritime boundaries and are also considering nearshore ocean observing system data. Ongoing work includes adding WFS, error management, and exception handling, enabling Smart Searches, and writing full documentation. This prototype is a central research project of the new International Coastal Atlas Network (ICAN), a group of 30+ organizations from 14 nations (and growing) dedicated to seeking interoperability approaches to CWAs in support of coastal zone management and the translation of coastal science to coastal decision-making.
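
    The mediation step described above, rewriting a request phrased in the global ontology into each atlas's local terms before forwarding it to the local CSW, can be sketched as follows; the term tables and endpoint URLs are invented for illustration and are not the actual MIDA or OCA services.

      # Global-to-local term rewriting in front of two local CSW catalogues (assumes OWSLib).
      from owslib.csw import CatalogueServiceWeb
      from owslib.fes import PropertyIsLike

      LOCAL_TERMS = {  # hypothetical endpoints and global-term -> local-term mappings
          "https://example.org/mida/csw": {"shoreline": "coastline"},
          "https://example.org/oca/csw": {"shoreline": "shoreline"},
      }

      def mediated_search(global_term, maxrecords=5):
          results = {}
          for endpoint, mapping in LOCAL_TERMS.items():
              local_term = mapping.get(global_term, global_term)
              csw = CatalogueServiceWeb(endpoint)
              csw.getrecords2(constraints=[PropertyIsLike("csw:AnyText", f"%{local_term}%")],
                              maxrecords=maxrecords)
              results[endpoint] = [record.title for record in csw.records.values()]
          return results

      print(mediated_search("shoreline"))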

  12. Making Interoperability Easier with NASA's Metadata Management Tool (MMT)

    NASA Technical Reports Server (NTRS)

    Shum, Dana; Reese, Mark; Pilone, Dan; Baynes, Katie

    2016-01-01

    While the ISO-19115 collection level metadata format meets many users' needs for interoperable metadata, it can be cumbersome to create it correctly. Through the MMT's simple UI experience, metadata curators can create and edit collections which are compliant with ISO-19115 without full knowledge of the NASA Best Practices implementation of ISO-19115 format. Users are guided through the metadata creation process through a forms-based editor, complete with field information, validation hints and picklists. Once a record is completed, users can download the metadata in any of the supported formats with just 2 clicks.

  13. Operational Interoperability Challenges on the Example of GEOSS and WIS

    NASA Astrophysics Data System (ADS)

    Heene, M.; Buesselberg, T.; Schroeder, D.; Brotzer, A.; Nativi, S.

    2015-12-01

    This poster highlights the operational interoperability challenges using the example of the Global Earth Observation System of Systems (GEOSS) and the World Meteorological Organization Information System (WIS). At the heart of both systems is a catalogue of earth observation data, products and services, but with different metadata management concepts. While WIS has strong governance, with its own metadata profile for its hundreds of thousands of metadata records, GEOSS adopted a more open approach for its ten million records. Furthermore, the development of WIS - as an operational system - follows a roadmap with committed backward compatibility, while the GEOSS development process is more agile. The poster discusses how interoperability can be achieved despite the different metadata management concepts and how a proxy concept helps to couple two systems that follow different development methodologies. Furthermore, the poster highlights the importance of monitoring and backup concepts as a verification method for operational interoperability.

  14. Moving Beyond the 10,000 Ways That Don't Work

    NASA Astrophysics Data System (ADS)

    Bermudez, L. E.; Arctur, D. K.; Rueda, C.

    2009-12-01

    From his research in developing light bulb filaments, Thomas Edison provided us with a good lesson for advancing any venture. He said, "I have not failed, I've just found 10,000 ways that won't work." Advancing data and access interoperability is one of those ventures that is difficult to achieve because of the differences among the participating communities. Even within the marine domain, different communities exist, and with them different technologies (formats and protocols) to publish data and its descriptions, and different vocabularies to name things (e.g. parameters, sensor types). Simplifying the heterogeneity of technologies is accomplished not only by adopting standards, but also by creating profiles and advancing tools that use those standards. In some cases, standards are advanced by building from existing tools. But what is the best strategy? Edison could provide us a hint. Prototypes and test beds are essential to achieve interoperability among geospatial communities. The Open Geospatial Consortium (OGC) calls them interoperability experiments. The World Wide Web Consortium (W3C) calls them incubator projects. Prototypes help test and refine specifications. The Marine Metadata Interoperability (MMI) Initiative, which is advancing marine data integration and re-use by promoting community solutions, understood this strategy and started an interoperability demonstration with the SURA Coastal Ocean Observing and Prediction (SCOOP) program. This interoperability demonstration transformed into the OGC Ocean Science Interoperability Experiment (Oceans IE). The Oceans IE brings together the ocean-observing community to advance interoperability of ocean observing systems by using OGC standards. The Oceans IE Phase I investigated the use of the OGC Web Feature Service (WFS) and OGC Sensor Observation Service (SOS) standards for representing and exchanging point data records from fixed in-situ marine platforms. The Oceans IE Phase I produced an engineering best practices report, advanced reference implementations, and submitted various change requests that are now being considered by the OGC SOS working group. Building on Phase I, and with a focus on semantically-enabled services, Oceans IE Phase II will continue the use and improvement of OGC specifications in the marine community. We will present the lessons learned and in particular the strategy of experimenting with technologies to advance standards to publish data in marine communities, which could also help advance interoperability in other geospatial communities. We will also discuss the growing collaborations among ocean-observing standards organizations that will bring about the institutional acceptance needed for these technologies and practices to gain traction globally.
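
    A minimal sketch of pulling point observations from an SOS endpoint, the kind of exchange exercised in the Oceans IE, is shown below using the SOS 2.0 key-value-pair binding and the requests library; the service URL, offering and observed property identifiers are placeholders.

      # Minimal SOS 2.0 GetObservation request via the KVP binding (hypothetical endpoint and identifiers).
      import requests

      params = {
          "service": "SOS",
          "version": "2.0.0",
          "request": "GetObservation",
          "offering": "urn:example:offering:station-42",                             # placeholder
          "observedProperty": "http://example.org/phenomena/sea_water_temperature",  # placeholder
      }
      response = requests.get("https://example.org/sos", params=params, timeout=30)
      response.raise_for_status()
      print(response.text[:500])  # O&M-encoded observations (XML) for further parsing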

  15. A Metadata Standard for Hydroinformatic Data Conforming to International Standards

    NASA Astrophysics Data System (ADS)

    Notay, Vikram; Carstens, Georg; Lehfeldt, Rainer

    2017-04-01

    The affordable availability of computing power and digital storage has been a boon for the scientific community. The hydroinformatics community has also benefitted from the so-called digital revolution, which has enabled the tackling of more and more complex physical phenomena using hydroinformatic models, instruments, sensors, etc. With models getting more and more complex, computational domains getting larger and the resolution of computational grids and measurement data getting finer, a large amount of data is generated and consumed in any hydroinformatics-related project. The ubiquitous availability of the internet also contributes to this phenomenon, with data being collected through sensor networks connected to telecommunications networks and the internet long before the term Internet of Things existed. Although generally good, this exponential increase in the number of available datasets gives rise to the need to describe the data in a standardised way, not only to get a quick overview of the data but also to facilitate interoperability of data from different sources. The Federal Waterways Engineering and Research Institute (BAW) is a federal authority of the German Federal Ministry of Transport and Digital Infrastructure. BAW acts as a consultant for the safe and efficient operation of the German waterways. As part of its consultation role, BAW operates a number of physical and numerical models for sections of inland and marine waterways. In order to uniformly describe the data produced and consumed by these models throughout BAW, and to ensure interoperability with other federal and state institutes on the one hand and with EU countries on the other, a metadata profile for hydroinformatic data has been developed at BAW. The metadata profile is composed in its entirety using the ISO 19115 international standard for metadata related to geographic information. Due to the widespread use of the ISO 19115 standard in the existing geodata infrastructure worldwide, the profile provides a means to describe hydroinformatic data that conforms to existing metadata standards. Additionally, the EU and German national standards INSPIRE and GDI-DE have been considered to ensure interoperability on an international and national level. Finally, elements of the GovData profile of the Federal Government of Germany have been integrated to be able to participate in its Open Data initiative. All these factors make the metadata profile developed at BAW highly suitable for describing hydroinformatic data in particular and physical state variables in general. Further details about this metadata profile will be presented at the conference. Acknowledgements: The authors would like to thank Christoph Wosniok and Peter Schade for their contributions towards the development of this metadata standard.
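
    Records written against an ISO 19115 profile such as the one described above can be read with standard XML tooling; the sketch below pulls the title and abstract out of a record, assuming the usual gmd/gco namespaces of the ISO 19139 XML encoding, with the file name being a placeholder.

      # Extract basic fields from an ISO 19115/19139 metadata record (placeholder file name).
      import xml.etree.ElementTree as ET

      NS = {
          "gmd": "http://www.isotc211.org/2005/gmd",
          "gco": "http://www.isotc211.org/2005/gco",
      }

      tree = ET.parse("hydroinformatic_record.xml")
      title = tree.find(".//gmd:CI_Citation/gmd:title/gco:CharacterString", NS)
      abstract = tree.find(".//gmd:MD_DataIdentification/gmd:abstract/gco:CharacterString", NS)
      print("Title:   ", title.text if title is not None else "-")
      print("Abstract:", abstract.text if abstract is not None else "-")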

  16. CCR+: Metadata Based Extended Personal Health Record Data Model Interoperable with the ASTM CCR Standard.

    PubMed

    Park, Yu Rang; Yoon, Young Jo; Jang, Tae Hun; Seo, Hwa Jeong; Kim, Ju Han

    2014-01-01

    Extension of the standard model while retaining compliance with it is a challenging issue because there is currently no method for semantically or syntactically verifying an extended data model. A metadata-based extended model, named CCR+, was designed and implemented to achieve interoperability between standard and extended models. Furthermore, a multilayered validation method was devised to validate the standard and extended models. The American Society for Testing and Materials (ASTM) Continuity of Care Record (CCR) standard was selected to evaluate the CCR+ model; two CCR XML files and one CCR+ XML file were evaluated. In total, 188 metadata items were extracted from the ASTM CCR standard; these metadata items are semantically interconnected and registered in the metadata registry. An extended-data-model-specific validation file was generated from these metadata. This file can be used in a smartphone application (Health Avatar CCR+) as a part of a multilayered validation. The new CCR+ model was successfully evaluated via a patient-centric exchange scenario involving multiple hospitals, with the results supporting both syntactic and semantic interoperability between the standard CCR and the extended CCR+ model. A feasible method for delivering an extended model that complies with the standard model is presented herein. There is a great need to extend static standard models such as the ASTM CCR in various domains: the methods presented here represent an important reference for achieving interoperability between standard and extended models.

  17. CCR+: Metadata Based Extended Personal Health Record Data Model Interoperable with the ASTM CCR Standard

    PubMed Central

    Park, Yu Rang; Yoon, Young Jo; Jang, Tae Hun; Seo, Hwa Jeong

    2014-01-01

    Objectives Extension of the standard model while retaining compliance with it is a challenging issue because there is currently no method for semantically or syntactically verifying an extended data model. A metadata-based extended model, named CCR+, was designed and implemented to achieve interoperability between standard and extended models. Methods Furthermore, a multilayered validation method was devised to validate the standard and extended models. The American Society for Testing and Materials (ASTM) Continuity of Care Record (CCR) standard was selected to evaluate the CCR+ model; two CCR XML files and one CCR+ XML file were evaluated. Results In total, 188 metadata items were extracted from the ASTM CCR standard; these metadata items are semantically interconnected and registered in the metadata registry. An extended-data-model-specific validation file was generated from these metadata. This file can be used in a smartphone application (Health Avatar CCR+) as a part of a multilayered validation. The new CCR+ model was successfully evaluated via a patient-centric exchange scenario involving multiple hospitals, with the results supporting both syntactic and semantic interoperability between the standard CCR and the extended CCR+ model. Conclusions A feasible method for delivering an extended model that complies with the standard model is presented herein. There is a great need to extend static standard models such as the ASTM CCR in various domains: the methods presented here represent an important reference for achieving interoperability between standard and extended models. PMID:24627817

  18. Marine Profiles for OGC Sensor Web Enablement Standards

    NASA Astrophysics Data System (ADS)

    Jirka, Simon

    2016-04-01

    The use of OGC Sensor Web Enablement (SWE) standards in oceanology is increasing. Several projects are developing SWE-based infrastructures to ease the sharing of marine sensor data. This work ranges from developments at the sensor level to efforts addressing the interoperability of data flows between observatories and organisations. The broad range of activities using SWE standards leads to a risk of diverging approaches to how the SWE specifications are applied. Because the SWE standards are designed in a domain-independent manner, they intentionally offer a high degree of flexibility, enabling implementation across different domains and usage scenarios. At the same time this flexibility allows one to achieve similar goals in different ways. To avoid interoperability issues, an agreement is needed on how to apply SWE concepts and how to use vocabularies in a common way that will be shared by different projects, implementations, and users. To address this need, partners from several projects and initiatives (AODN, BRIDGES, envri+, EUROFLEETS/EUROFLEETS2, FixO3, FRAM, IOOS, Jerico/Jerico-Next, NeXOS, ODIP/ODIP II, RITMARE, SeaDataNet, SenseOcean, X-DOMES) have teamed up to develop marine profiles of the OGC SWE standards that can serve as a common basis for developments in multiple projects and organisations. The following aspects will be especially considered: 1.) Provision of metadata: For discovering sensors/instruments as well as observation data, for facilitating the interpretation of observations, and for integrating instruments into sensor platforms, the provision of metadata is crucial. Thus, a marine profile of the OGC Sensor Model Language 2.0 (SensorML 2.0) will be developed, allowing metadata to be provided for different levels (e.g. observatory, instrument, and detector) and sensor types. The latter will enable metadata of a specific type to be automatically inherited by all devices/sensors of the same type. The application of further standards such as OGC PUCK will also benefit from this encoding by facilitating communication with instruments. 2.) Encoding and modelling of observation data: For delivering observation data, the ISO/OGC Observations and Measurements 2.0 (O&M 2.0) standard serves as a good basis. Within an O&M profile, recommendations will be given on the observation types needed to cover different aspects of marine sensing (trajectory, stationary, or profile measurements, etc.). Besides XML, further O&M encodings (e.g. JSON-based) will be considered. 3.) Data access: A profile of the OGC Sensor Observation Service 2.0 (SOS 2.0) standard will be specified to offer a common way of using this web service interface for requesting marine observations and metadata. At the same time this will offer a common interface to cross-domain applications based upon tools such as the GEOSS DAB. Lightweight approaches such as REST will be considered as further bindings for the SOS interface. 4.) Backward compatibility: The profile will consider existing observation systems so that migration paths towards the specified profiles can be offered. We will present the current state of the profile development. In particular, a comparative analysis of SWE usage in different projects, an outline of the requirements, and fundamental aspects of the profiles of SWE standards will be shown.
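
    To give a flavour of the JSON-based O&M encodings mentioned above, the sketch below assembles a single observation as a plain JSON structure; the field names follow the spirit of O&M (procedure, observedProperty, featureOfInterest, phenomenonTime, result) but are illustrative only and do not reproduce any adopted encoding.

      # Illustrative O&M-flavoured JSON for one observation (not an official SWE/O&M encoding).
      import json

      observation = {
          "type": "Measurement",
          "procedure": "http://example.org/sensors/ctd-001",                    # placeholder instrument
          "observedProperty": "http://example.org/phenomena/salinity",          # placeholder phenomenon
          "featureOfInterest": "http://example.org/features/north-sea-buoy-1",  # placeholder station
          "phenomenonTime": "2016-04-01T12:00:00Z",
          "result": {"value": 34.7, "uom": "PSU"},
      }
      print(json.dumps(observation, indent=2))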

  19. Science friction: data, metadata, and collaboration.

    PubMed

    Edwards, Paul N; Mayernik, Matthew S; Batcheller, Archer L; Bowker, Geoffrey C; Borgman, Christine L

    2011-10-01

    When scientists from two or more disciplines work together on related problems, they often face what we call 'science friction'. As science becomes more data-driven, collaborative, and interdisciplinary, demand increases for interoperability among data, tools, and services. Metadata--usually viewed simply as 'data about data', describing objects such as books, journal articles, or datasets--serve key roles in interoperability. Yet we find that metadata may be a source of friction between scientific collaborators, impeding data sharing. We propose an alternative view of metadata, focusing on its role in an ephemeral process of scientific communication, rather than as an enduring outcome or product. We report examples of highly useful, yet ad hoc, incomplete, loosely structured, and mutable, descriptions of data found in our ethnographic studies of several large projects in the environmental sciences. Based on this evidence, we argue that while metadata products can be powerful resources, usually they must be supplemented with metadata processes. Metadata-as-process suggests the very large role of the ad hoc, the incomplete, and the unfinished in everyday scientific work.

  20. SeaDataNet Pan-European infrastructure for Ocean & Marine Data Management

    NASA Astrophysics Data System (ADS)

    Manzella, G. M.; Maillard, C.; Maudire, G.; Schaap, D.; Rickards, L.; Nast, F.; Balopoulos, E.; Mikhailov, N.; Vladymyrov, V.; Pissierssens, P.; Schlitzer, R.; Beckers, J. M.; Barale, V.

    2007-12-01

    SeaDataNet is developing a pan-European data management infrastructure to ensure access to a large number of marine environmental datasets (i.e. temperature, salinity, currents, sea level, and chemical, physical and biological properties), as well as their safeguarding and long-term archiving. Data are derived from many different sensors installed on board research vessels, on satellites and on the various platforms of the marine observing system. SeaDataNet provides information on real-time and archived marine environmental data collected at a pan-European level, through directories on marine environmental data and projects. SeaDataNet provides access to the most comprehensive multidisciplinary sets of marine in-situ and remote sensing data, from about 40 laboratories, through user-friendly tools. Data selection and access are operated through the Common Data Index (CDI), XML files compliant with ISO standards and unified dictionaries. Technical developments carried out by SeaDataNet include: a library of standards - metadata standards, compliant with ISO 19115, for communication and interoperability between the data platforms; software for an interoperable online system - interconnection of distributed data centres by interfacing adapted communication technology tools; off-line data management software - the Ocean Data View (ODV) software developed by AWI, representing the minimum equipment of all the data centres; and training, education and capacity building - training 'on the job' carried out by IOC-UNESCO in Ostende, with the SeaDataNet Virtual Educational Centre internet portal providing basic tools for informal education.

  21. An open repositories network development for medical teaching resources.

    PubMed

    Soula, Gérard; Darmoni, Stefan; Le Beux, Pierre; Renard, Jean-Marie; Dahamna, Badisse; Fieschi, Marius

    2010-01-01

    The lack of interoperability between repositories of heterogeneous and geographically widespread data is an obstacle to the diffusion, sharing and reutilization of those data. We present the development of an open repositories network taking into account both the syntactic and semantic interoperability of the different repositories and based on international standards in this field. The network is used by the medical community in France for the diffusion and sharing of digital teaching resources. The syntactic interoperability of the repositories is managed using the OAI-PMH protocol for the exchange of metadata describing the resources. Semantic interoperability is based, on the one hand, on the LOM standard for the description of resources and on MeSH for indexing them and, on the other hand, on semantic interoperability management designed to optimize compliance with standards and the quality of the metadata.
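
    The OAI-PMH exchange mentioned above boils down to issuing verb requests against a repository and parsing the returned XML; a minimal harvesting sketch with the requests library is shown below, the base URL being a placeholder rather than the network's actual endpoint.

      # Minimal OAI-PMH harvest: list Dublin Core records from a (placeholder) repository endpoint.
      import requests
      import xml.etree.ElementTree as ET

      BASE_URL = "https://example.org/oai"   # placeholder OAI-PMH base URL
      NS = {
          "oai": "http://www.openarchives.org/OAI/2.0/",
          "dc": "http://purl.org/dc/elements/1.1/",
      }

      response = requests.get(BASE_URL, params={"verb": "ListRecords", "metadataPrefix": "oai_dc"},
                              timeout=30)
      root = ET.fromstring(response.content)
      for record in root.findall(".//oai:record", NS):
          identifier = record.find(".//oai:identifier", NS)
          title = record.find(".//dc:title", NS)
          print(identifier.text if identifier is not None else "-", "|",
                title.text if title is not None else "-")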

  22. Development of Health Information Search Engine Based on Metadata and Ontology

    PubMed Central

    Song, Tae-Min; Jin, Dal-Lae

    2014-01-01

    Objectives The aim of the study was to develop a metadata and ontology-based health information search engine ensuring semantic interoperability to collect and provide health information using different application programs. Methods Health information metadata ontology was developed using a distributed semantic Web content publishing model based on vocabularies used to index the contents generated by the information producers as well as those used to search the contents by the users. Vocabulary for health information ontology was mapped to the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), and a list of about 1,500 terms was proposed. The metadata schema used in this study was developed by adding an element describing the target audience to the Dublin Core Metadata Element Set. Results A metadata schema and an ontology ensuring interoperability of health information available on the internet were developed. The metadata and ontology-based health information search engine developed in this study produced a better search result compared to existing search engines. Conclusions Health information search engine based on metadata and ontology will provide reliable health information to both information producer and information consumers. PMID:24872907

  23. Development of health information search engine based on metadata and ontology.

    PubMed

    Song, Tae-Min; Park, Hyeoun-Ae; Jin, Dal-Lae

    2014-04-01

    The aim of the study was to develop a metadata and ontology-based health information search engine ensuring semantic interoperability to collect and provide health information using different application programs. Health information metadata ontology was developed using a distributed semantic Web content publishing model based on vocabularies used to index the contents generated by the information producers as well as those used to search the contents by the users. Vocabulary for health information ontology was mapped to the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), and a list of about 1,500 terms was proposed. The metadata schema used in this study was developed by adding an element describing the target audience to the Dublin Core Metadata Element Set. A metadata schema and an ontology ensuring interoperability of health information available on the internet were developed. The metadata and ontology-based health information search engine developed in this study produced a better search result compared to existing search engines. Health information search engine based on metadata and ontology will provide reliable health information to both information producer and information consumers.
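
    The extension described above, adding a target-audience element to a Dublin Core description, can be sketched in RDF as follows; dcterms:audience is an existing DCMI term, while the resource URI and values below are invented for illustration.

      # Dublin Core description of a health-information resource with a target-audience statement (rdflib).
      from rdflib import Graph, Literal, Namespace, URIRef

      DC = Namespace("http://purl.org/dc/elements/1.1/")
      DCTERMS = Namespace("http://purl.org/dc/terms/")

      g = Graph()
      g.bind("dc", DC)
      g.bind("dcterms", DCTERMS)

      resource = URIRef("http://example.org/health-info/diabetes-diet")   # placeholder resource
      g.add((resource, DC.title, Literal("Dietary guidance for type 2 diabetes")))
      g.add((resource, DC.subject, Literal("diabetes mellitus")))
      g.add((resource, DCTERMS.audience, Literal("patients and caregivers")))  # the added element

      print(g.serialize(format="turtle"))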

  24. Making Interoperability Easier with the NASA Metadata Management Tool

    NASA Astrophysics Data System (ADS)

    Shum, D.; Reese, M.; Pilone, D.; Mitchell, A. E.

    2016-12-01

    ISO 19115 has enabled interoperability amongst tools, yet many users find it hard to build ISO metadata for their collections because the standard can be large and overly flexible for their needs. The Metadata Management Tool (MMT), part of NASA's Earth Observing System Data and Information System (EOSDIS), offers users a modern, easy-to-use, browser-based tool to develop ISO-compliant metadata. Through a simplified UI experience, metadata curators can create and edit collections without any understanding of the complex ISO-19115 format, while still generating compliant metadata. The MMT is also able to assess the completeness of collection-level metadata by evaluating it against a variety of metadata standards. The tool provides users with clear guidance as to how to change their metadata in order to improve its quality and compliance. It is based on NASA's Unified Metadata Model for Collections (UMM-C), a simpler metadata model that can be cleanly mapped to ISO 19115. This allows metadata authors and curators to meet ISO compliance requirements faster and more accurately. The MMT and UMM-C have been developed in an agile fashion, with recurring end-user tests and reviews to continually refine the tool, the model and the ISO mappings. This process is allowing for continual improvement and evolution to meet the community's needs.
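
    The mapping from a simpler collection model to ISO 19115 can be pictured as filling a fixed XML skeleton from a handful of fields; the sketch below is a drastic simplification with invented field names and does not reproduce the actual UMM-C model or the NASA Best Practices mapping.

      # Toy mapping from a simplified collection dictionary to a minimal ISO 19115-style fragment.
      import xml.etree.ElementTree as ET

      GMD = "http://www.isotc211.org/2005/gmd"
      GCO = "http://www.isotc211.org/2005/gco"
      ET.register_namespace("gmd", GMD)
      ET.register_namespace("gco", GCO)

      collection = {"short_name": "EXAMPLE_L2_SST",                  # hypothetical collection fields
                    "abstract": "Sample sea surface temperature collection."}

      root = ET.Element(f"{{{GMD}}}MD_Metadata")
      info = ET.SubElement(root, f"{{{GMD}}}identificationInfo")
      ident = ET.SubElement(info, f"{{{GMD}}}MD_DataIdentification")
      citation = ET.SubElement(ET.SubElement(ident, f"{{{GMD}}}citation"), f"{{{GMD}}}CI_Citation")
      title = ET.SubElement(ET.SubElement(citation, f"{{{GMD}}}title"), f"{{{GCO}}}CharacterString")
      title.text = collection["short_name"]
      abstract = ET.SubElement(ET.SubElement(ident, f"{{{GMD}}}abstract"), f"{{{GCO}}}CharacterString")
      abstract.text = collection["abstract"]

      print(ET.tostring(root, encoding="unicode"))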

  25. A federated semantic metadata registry framework for enabling interoperability across clinical research and care domains.

    PubMed

    Sinaci, A Anil; Laleci Erturkmen, Gokce B

    2013-10-01

    In order to enable secondary use of Electronic Health Records (EHRs) by bridging the interoperability gap between clinical care and research domains, in this paper a unified methodology and a supporting framework are introduced that bring together the power of metadata registries (MDR) and semantic web technologies. We introduce a federated semantic metadata registry framework by extending the ISO/IEC 11179 standard, and enable integration of data element registries through Linked Open Data (LOD) principles, where each Common Data Element (CDE) can be uniquely referenced, queried and processed to enable syntactic and semantic interoperability. Each CDE and its components are maintained as LOD resources, enabling semantic links with other CDEs, terminology systems and implementation-dependent content models, hence facilitating semantic search, more effective reuse and semantic interoperability across different application domains. There are several important efforts addressing semantic interoperability in the healthcare domain, such as the IHE DEX profile proposal, CDISC SHARE and CDISC2RDF. Our architecture complements these by providing a framework to interlink existing data element registries and repositories, multiplying their potential for semantic interoperability to a greater extent. The open source implementation of the federated semantic MDR framework presented in this paper is the core of the semantic interoperability layer of the SALUS project, which enables the execution of post-marketing safety analysis studies on top of existing EHR systems.
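
    Because each Common Data Element is exposed as a Linked Open Data resource, it can be queried with SPARQL; the sketch below uses SPARQLWrapper against a hypothetical endpoint, and the concept IRI and use of skos:exactMatch are illustrative rather than taken from the actual SALUS implementation.

      # Query a (hypothetical) federated MDR SPARQL endpoint for data elements mapped to a terminology concept.
      from SPARQLWrapper import SPARQLWrapper, JSON

      sparql = SPARQLWrapper("https://example.org/mdr/sparql")   # placeholder endpoint
      sparql.setQuery("""
          PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
          SELECT ?cde ?label WHERE {
              ?cde skos:prefLabel ?label ;
                   skos:exactMatch <http://example.org/terminology/concept-123> .
          } LIMIT 10
      """)
      sparql.setReturnFormat(JSON)
      for binding in sparql.query().convert()["results"]["bindings"]:
          print(binding["cde"]["value"], "-", binding["label"]["value"])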

  26. SeaDataNet - Pan-European infrastructure for marine and ocean data management: Unified access to distributed data sets

    NASA Astrophysics Data System (ADS)

    Schaap, D. M. A.; Maudire, G.

    2009-04-01

    SeaDataNet is an Integrated research Infrastructure Initiative (I3) in EU FP6 (2006 - 2011) to provide a data management system adapted both to the fragmented observation system and to the users' need for integrated access to data, meta-data, products and services. SeaDataNet therefore ensures the long-term archiving of the large volume of multidisciplinary data (i.e. temperature, salinity, currents, sea level, and chemical, physical and biological properties) collected by many different sensors installed on board research vessels, on satellites and on the various platforms of the marine observing system. The SeaDataNet project started in 2006, but builds upon earlier data management infrastructure projects, undertaken over a period of 20 years by an expanding network of oceanographic data centres from the countries around all European seas. Its predecessor project Sea-Search had a strict focus on metadata. SeaDataNet maintains significant interest in the further development of the metadata infrastructure, but its primary objective is the provision of easy data access and generic data products. SeaDataNet is a distributed infrastructure that provides transnational access to marine data, meta-data, products and services through 40 interconnected Trans National Data Access Platforms (TAP) from 35 countries around the Black Sea, Mediterranean, North East Atlantic, North Sea, Baltic and Arctic regions. These include National Oceanographic Data Centres (NODCs) and Satellite Data Centres. Furthermore, the SeaDataNet consortium comprises a number of expert modelling centres, SMEs with expertise in IT, and three international bodies (ICES, IOC and JRC). Planning: The SeaDataNet project is delivering and operating the infrastructure in three versions. Version 0: maintenance and further development of the metadata systems developed by the Sea-Search project, plus the development of a new metadata system for indexing and accessing individual data objects managed by the SeaDataNet data centres; this is known as the Common Data Index (CDI) V0 system. Version 1: harmonisation and upgrading of the metadatabases through adoption of the ISO 19115 metadata standard, and provision of transparent data access and download services from all partner data centres through upgrading of the Common Data Index and deployment of a data object delivery service. Version 2: addition of data product services and OGC-compliant viewing services and further virtualisation of data access. SeaDataNet Version 0: The SeaDataNet portal has been set up at http://www.seadatanet.org and provides a platform for all SeaDataNet services and standards as well as background information about the project and its partners. It includes discovery services via the following catalogues: CSR - Cruise Summary Reports of research vessels; EDIOS - locations and details of monitoring stations and networks / programmes; EDMED - high-level inventory of marine environmental data sets collected and managed by research institutes and organisations; EDMERP - marine environmental research projects; EDMO - marine organisations. These catalogues are interrelated, where possible, to facilitate cross searching and context searching, and they connect to the Common Data Index (CDI). The CDI gives detailed insight into the datasets available at partner databases and paves the way to direct online data access or direct online requests for data access / data delivery.
The CDI V0 metadatabase contains more than 340,000 individual data entries from 36 CDI partners in 29 countries across Europe, covering a broad scope and range of data held by these organisations. For purposes of standardisation and international exchange, the ISO 19115 metadata standard has been adopted; the CDI format is defined as a dedicated subset of this standard. A CDI XML format supports the exchange between CDI partners and the central CDI manager, and ensures interoperability with other systems and networks. CDI XML entries are generated by participating data centres directly from their databases, and CDI partners can make use of dedicated SeaDataNet tools to generate CDI XML files automatically. Approach for SeaDataNet V1 and V2: The approach for SeaDataNet V1 and V2, which is in line with the INSPIRE Directive, comprises the following services: discovery services (metadata directories); security services (authentication, authorisation and accounting, AAA); delivery services (data access and downloading of datasets); viewing services (visualisation of metadata, data and data products); product services (generic and standard products); monitoring services (statistics on usage and performance of the system); and maintenance services (updating of metadata by SeaDataNet partners). The services will be operated over a distributed network of interconnected data centres accessed through a central portal. In addition to service access, the portal will provide information on data management standards, tools and protocols. The architecture has been designed to provide a coherent system based on V1 services, whilst leaving the pathway open for later extension with V2 services. For the implementation, a range of technical components has been defined. Some are already operational, with the remainder in the final stages of development and testing. These make use of recent web technologies, and also comprise Java components, to provide multi-platform support and syntactic interoperability. To facilitate sharing of resources and interoperability, SeaDataNet has adopted SOAP Web Service technology. The SeaDataNet architecture and components have been designed to handle all kinds of oceanographic and marine environmental data, including both in-situ measurements and remote sensing observations. The V1 technical development is complete, and the V1 system is now being implemented and adopted by all participating data centres in SeaDataNet. Interoperability: Interoperability is the key to the success of a distributed data management system, and it is achieved in SeaDataNet V1 by: using common quality control protocols and a common flag scale; using controlled vocabularies from a single source, developed under international content governance; adopting the ISO 19115 metadata standard for all metadata directories; providing XML validation services to quality-control metadata maintenance, including field content verification based on Schematron; providing standard metadata entry tools; using harmonised data transport formats (NetCDF, ODV ASCII and MedAtlas ASCII) for data set delivery; adopting OGC standards for mapping and viewing services; and using SOAP Web Services in the SeaDataNet architecture. SeaDataNet V1 Delivery Services: An important objective of the V1 system is to provide transparent access to the distributed data sets via a unique user interface at the SeaDataNet portal and download service. In the SeaDataNet V1 architecture, the Common Data Index (CDI) V1 provides the link between discovery and delivery.
The CDI user interface enables users to gain detailed insight into the availability and geographical distribution of marine data archived at the connected data centres, and it provides the means for downloading data sets in common formats via a transaction mechanism. The SeaDataNet portal provides registered users with access to these distributed data sets via the CDI V1 directory and a shopping-basket mechanism. This allows registered users to locate data of interest and submit their data requests. The requests are forwarded automatically from the portal to the relevant SeaDataNet data centres. This process is controlled via the Request Status Manager (RSM) Web Service at the portal and a Download Manager (DM) Java software module implemented at each of the data centres. The RSM also enables registered users to check the status of their requests regularly and to download data sets after access has been granted. Data centres can follow all transactions for their data sets online and can handle requests which require their consent. The actual delivery of data sets takes place between the user and the selected data centre. The CDI V1 system is now being populated by all participating data centres in SeaDataNet, thereby phasing out CDI V0.
SeaDataNet Partners: IFREMER (France), MARIS (Netherlands), HCMR/HNODC (Greece), ULg (Belgium), OGS (Italy), NERC/BODC (UK), BSH/DOD (Germany), SMHI (Sweden), IEO (Spain), RIHMI/WDC (Russia), IOC (International), ENEA (Italy), INGV (Italy), METU (Turkey), CLS (France), AWI (Germany), IMR (Norway), NERI (Denmark), ICES (International), EC-DG JRC (International), MI (Ireland), IHPT (Portugal), RIKZ (Netherlands), RBINS/MUMM (Belgium), VLIZ (Belgium), MRI (Iceland), FIMR (Finland), IMGW (Poland), MSI (Estonia), IAE/UL (Latvia), CMR (Lithuania), SIO/RAS (Russia), MHI/DMIST (Ukraine), IO/BAS (Bulgaria), NIMRD (Romania), TSU (Georgia), INRH (Morocco), IOF (Croatia), PUT (Albania), NIB (Slovenia), UoM (Malta), OC/UCY (Cyprus), IOLR (Israel), NCSR/NCMS (Lebanon), CNR-ISAC (Italy), ISMAL (Algeria), INSTM (Tunisia)
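
    To make the CDI XML exchange described above more concrete, here is a minimal sketch of how a data centre might assemble an ISO 19115-flavoured XML fragment for one entry using only the Python standard library. The element subset and the identifier and organisation values are illustrative only; they are not the official SeaDataNet CDI schema.

```python
# Minimal sketch: an ISO 19115-style XML fragment for a single data entry,
# in the spirit of the CDI XML exchange between partners and the central
# CDI manager. Element subset and values are illustrative, not the official
# SeaDataNet CDI schema.
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)

record = ET.Element(f"{{{GMD}}}MD_Metadata")
ident = ET.SubElement(record, f"{{{GMD}}}fileIdentifier")
ET.SubElement(ident, f"{{{GCO}}}CharacterString").text = "urn:example:cdi:12345"

contact = ET.SubElement(record, f"{{{GMD}}}contact")
party = ET.SubElement(contact, f"{{{GMD}}}CI_ResponsibleParty")
org = ET.SubElement(party, f"{{{GMD}}}organisationName")
ET.SubElement(org, f"{{{GCO}}}CharacterString").text = "Example NODC"

print(ET.tostring(record, encoding="unicode"))
```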

  7. Data interoperability between European Environmental Research Infrastructures and their contribution to global data networks

    NASA Astrophysics Data System (ADS)

    Kutsch, W. L.; Zhao, Z.; Hardisty, A.; Hellström, M.; Chin, Y.; Magagna, B.; Asmi, A.; Papale, D.; Pfeil, B.; Atkinson, M.

    2017-12-01

    Environmental Research Infrastructures (ENVRIs) are expected to become important pillars not only for supporting their own scientific communities, but also a) for inter-disciplinary research and b) for the European Earth Observation Programme Copernicus as a contribution to the Global Earth Observation System of Systems (GEOSS) or to global thematic data networks. As such, it is very important that the data-related activities of the ENVRIs are well integrated. This requires common policies, models and e-infrastructure to optimise technological implementation, define workflows, and ensure coordination, harmonisation, integration and interoperability of data, applications and other services. The key is interoperating common metadata systems (utilising a richer metadata model as the 'switchboard' for interoperation, with formal syntax and declared semantics). The metadata characterises data, services, users and ICT resources (including sensors and detectors). The European Cluster Project ENVRIplus has developed a reference model (ENVRI RM) for a common data infrastructure architecture to promote interoperability among ENVRIs. The presentation will provide an overview of recent progress and give examples of the integration of ENVRI data in global integration networks.

  8. Transforming Dermatologic Imaging for the Digital Era: Metadata and Standards.

    PubMed

    Caffery, Liam J; Clunie, David; Curiel-Lewandrowski, Clara; Malvehy, Josep; Soyer, H Peter; Halpern, Allan C

    2018-01-17

    Imaging is increasingly being used in dermatology for documentation, diagnosis, and management of cutaneous disease. The lack of standards for dermatologic imaging is an impediment to clinical uptake. Standardization can occur in image acquisition, terminology, interoperability, and metadata. This paper presents the International Skin Imaging Collaboration position on standardization of metadata for dermatologic imaging. Metadata is essential to ensure that dermatologic images are properly managed and interpreted. There are two standards-based approaches to recording and storing metadata in dermatologic imaging. The first uses standard consumer image file formats, and the second is the file format and metadata model developed for the Digital Imaging and Communications in Medicine (DICOM) standard. DICOM would appear to provide an advantage over consumer image file formats for metadata, as it includes all the patient, study, and technical metadata necessary to use images clinically. In contrast, consumer image file formats only include technical metadata and need to be used in conjunction with another actor, for example an electronic medical record, to supply the patient and study metadata. The use of DICOM may have some ancillary benefits in dermatologic imaging, including leveraging DICOM network and workflow services, interoperability of images and metadata, leveraging existing enterprise imaging infrastructure, greater patient safety, and better compliance with legislative requirements for image retention.
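
    The contrast between DICOM and consumer image formats can be illustrated with a small sketch using the pydicom library. The file name is hypothetical, and which attributes are present depends on the individual file; the point is that patient, study and technical metadata all travel inside the DICOM object itself.

```python
# Sketch: reading patient/study/technical metadata from a DICOM file with
# pydicom. The file path is hypothetical; the attribute names are standard
# DICOM keywords and are only present if the file carries them.
import pydicom

ds = pydicom.dcmread("dermoscopy_example.dcm")  # hypothetical file

# Patient and study metadata are carried inside the image object itself...
print(ds.PatientID, ds.StudyDate, ds.StudyDescription)
print(ds.Modality, ds.Rows, ds.Columns)

# ...whereas a JPEG/EXIF file carries only technical metadata and needs an
# external actor (e.g. an EMR) to supply patient and study context.
```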

  9. Establishing semantic interoperability of biomedical metadata registries using extended semantic relationships.

    PubMed

    Park, Yu Rang; Yoon, Young Jo; Kim, Hye Hyeon; Kim, Ju Han

    2013-01-01

    Achieving semantic interoperability is critical for biomedical data sharing between individuals, organizations and systems. The ISO/IEC 11179 MetaData Registry (MDR) standard has been recognized as one of the solutions for this purpose. The standard model, however, is limited: representing concepts that consist of two or more values, for instance blood pressure with its systolic and diastolic values, is not allowed. We addressed the structural limitations of ISO/IEC 11179 with an integrated metadata object model in our previous research. In the present study, we introduce semantic extensions for the model by defining three new types of semantic relationships: dependency, composite and variable relationships. To evaluate our extensions in a real-world setting, we measured the efficiency of metadata reduction by means of mapping to existing metadata. We extracted metadata from the College of American Pathologists Cancer Protocols and then evaluated our extensions. With no semantic loss, one third of the extracted metadata could be successfully eliminated, suggesting a better strategy for implementing clinical MDRs with improved efficiency and utility.
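
    A purely illustrative sketch (not the authors' implementation) of what a "composite" relationship buys: a parent concept such as blood pressure groups its systolic and diastolic component elements, which plain ISO/IEC 11179 data elements cannot express on their own. Class and field names are invented for illustration.

```python
# Illustrative sketch of a composite relationship between data elements,
# so that blood pressure can group systolic and diastolic components.
from dataclasses import dataclass, field
from typing import List

@dataclass
class DataElement:
    name: str
    datatype: str
    unit: str = ""

@dataclass
class CompositeRelationship:
    parent: DataElement
    components: List[DataElement] = field(default_factory=list)

systolic = DataElement("systolic blood pressure", "integer", "mmHg")
diastolic = DataElement("diastolic blood pressure", "integer", "mmHg")
bp = CompositeRelationship(
    parent=DataElement("blood pressure", "composite"),
    components=[systolic, diastolic],
)
print([c.name for c in bp.components])
```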

  10. The Benefits and Future of Standards: Metadata and Beyond

    NASA Astrophysics Data System (ADS)

    Stracke, Christian M.

    This article discusses the benefits and future of standards and presents the generic multi-dimensional Reference Model. First, the importance and tasks of interoperability as well as quality development, and their relationship, are analyzed. Especially in e-Learning, their connection and interdependence are evident: interoperability is one basic requirement for quality development. In this paper, it is shown how standards and specifications support these crucial issues. The upcoming ISO metadata standard MLR (Metadata for Learning Resources) will be introduced and used as an example for identifying the requirements and needs for future standardization. In conclusion, a vision of the challenges and potentials for e-Learning standardization is outlined.

  11. EMODnet Physics: open and free marine physical data for science and for society

    NASA Astrophysics Data System (ADS)

    Nolan, G.; Novellino, A.; Gorringe, P.; Manzella, G. M. R., Sr.; Schaap, D.; Pouliquen, S.; Richards, L.

    2016-02-01

    Europe is sustaining a long-term strategy on Blue Growth, looking at seas and oceans as drivers for innovation and growth. A number of weaknesses have been identified, among which are gaps in knowledge and data about the state of our oceans, seabed resources, marine life and risks to habitats and ecosystems. The European Marine Observation and Data Network (EMODnet) has been created to improve the usefulness, for scientific, regulatory and commercial purposes, of the observations and resulting marine data collected and held by European public and private bodies. EMODnet Physics provides access to a catalogue of archived and real-time data on the physical conditions in Europe's seas and oceans. The overall objectives are to provide access to archived and near real-time data on physical conditions in Europe's seas and oceans by means of a dedicated portal, and to determine how well the data meet the needs of users from industry, public authorities and science. EMODnet Physics contributes to the broader initiative 'Marine Knowledge 2020', and in particular to the implementation of the European Copernicus programme, an EU-wide programme that aims to support policymakers, business and citizens with improved environmental information. In the global context, Copernicus is an integral part of the Global Earth Observation System of Systems. Near real-time data and metadata are populated by data owners, organized at EuroGOOS level according to its regional operational systems (ROOSs) infrastructure and conventions, and made available through the EMODnet Physics user interface. The latest 60 days are freely viewable and downloadable, while access to older data (monthly archives) requires credentials. Archived data series and metadata are organized in collaboration with the NODC network (SeaDataNet). Data and metadata cover measurements of winds at the sea surface, waves, temperature and salinity, water velocities, light attenuation, sea level and ice coverage. EMODnet Physics has the specific objective of processing physical data into interoperable formats, which includes agreed standards, common baselines or reference conditions, and assessments of their accuracy and precision. The data and metadata are accessible through an ISO, OGC and INSPIRE compliant portal that is operational 24/7.
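
    Since the portal exposes OGC-compliant services, a client can discover its layers programmatically. Below is a minimal sketch using OWSLib against a WMS endpoint; the URL is a placeholder, not the actual EMODnet Physics service address.

```python
# Sketch: listing layers advertised by an OGC WMS endpoint with OWSLib.
# The endpoint URL is hypothetical.
from owslib.wms import WebMapService

wms = WebMapService("https://example.org/emodnet-physics/wms", version="1.3.0")

for name, layer in wms.contents.items():
    print(name, "|", layer.title, "|", layer.boundingBoxWGS84)
```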

  12. An Approach to Information Management for AIR7000 with Metadata and Ontologies

    DTIC Science & Technology

    2009-10-01

    metadata. We then propose an approach based on Semantic Technologies including the Resource Description Framework (RDF) and Upper Ontologies, for the...mandating specific metadata schemas can result in interoperability problems. For example, many standards within the ADO mandate the use of XML for metadata...such problems, we propose an architecture in which different metadata schemes can interoperate. By using RDF (Resource Description Framework) as a
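
    The abstract is truncated, but its core idea, using RDF as a common layer so records from different metadata schemes can be bridged and queried together, can be sketched with rdflib. The namespaces, terms and asset URI below are invented for illustration; this is not the report's architecture.

```python
# Sketch of the general idea: the same asset described under two different
# metadata schemes, bridged onto Dublin Core terms in one RDF graph.
from rdflib import Graph, Namespace, Literal, URIRef

SCHEME_A = Namespace("http://example.org/schemeA#")
SCHEME_B = Namespace("http://example.org/schemeB#")
DC = Namespace("http://purl.org/dc/elements/1.1/")
OWL = Namespace("http://www.w3.org/2002/07/owl#")

g = Graph()
doc = URIRef("http://example.org/asset/1")
g.add((doc, SCHEME_A.title, Literal("Mission plan")))
g.add((doc, SCHEME_B.docTitle, Literal("Mission plan")))
# Declare that both scheme-specific properties mean the same as dc:title.
g.add((SCHEME_A.title, OWL.equivalentProperty, DC.title))
g.add((SCHEME_B.docTitle, OWL.equivalentProperty, DC.title))

print(g.serialize(format="turtle"))
```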

  13. Distributed Interoperable Metadata Registry; How Do Physicists Use an E-Print Archive? Implications for Institutional E-Print Services; A Framework for Building Open Digital Libraries; Implementing Digital Sanborn Maps for Ohio: OhioLINK and OPLIN Collaborative Project.

    ERIC Educational Resources Information Center

    Blanchi, Christophe; Petrone, Jason; Pinfield, Stephen; Suleman, Hussein; Fox, Edward A.; Bauer, Charly; Roddy, Carol Lynn

    2001-01-01

    Includes four articles that discuss a distributed architecture for managing metadata that promotes interoperability between digital libraries; the use of electronic print (e-print) by physicists; the development of digital libraries; and a collaborative project between two library consortia in Ohio to provide digital versions of Sanborn Fire…

  14. Assisted editing of SensorML with EDI. A bottom-up scenario towards the definition of sensor profiles.

    NASA Astrophysics Data System (ADS)

    Oggioni, Alessandro; Tagliolato, Paolo; Fugazza, Cristiano; Bastianini, Mauro; Pavesi, Fabio; Pepe, Monica; Menegon, Stefano; Basoni, Anna; Carrara, Paola

    2015-04-01

    Sensor observation systems for environmental data have become increasingly important in recent years. The EGU's Informatics in Oceanography and Ocean Science track stressed the importance of management tools and solutions for marine infrastructures. We think that full interoperability among sensor systems is still an open issue and that the solution involves providing appropriate metadata. Several open source applications implement the SWE specification and, particularly, the Sensor Observation Service (SOS) standard. These applications allow for the exchange of data and metadata in XML format between computer systems. However, there is a lack of metadata editing tools supporting end users in this activity. Generally speaking, it is hard for users to provide sensor metadata in the SensorML format without dedicated tools. In particular, such a tool should ease metadata editing by providing, for standard sensors, all the invariant information to be included in sensor metadata, thus allowing the user to concentrate on the metadata items that are related to the specific deployment. RITMARE, the Italian flagship project on marine research, envisages a subproject, SP7, for the set-up of the project's spatial data infrastructure. SP7 developed EDI, a general purpose, template-driven metadata editor that is composed of a backend web service and an HTML5/JavaScript client. EDI can be customized for managing the creation of generic metadata encoded as XML. Once tailored to a specific metadata format, EDI presents users with a web form with advanced auto-completion and validation capabilities. In the case of sensor metadata (SensorML versions 1.0.1 and 2.0), the EDI client is instructed to send an "insert sensor" request to an SOS endpoint in order to save the metadata in an SOS server. In the first phase of project RITMARE, EDI has been used to simplify the creation from scratch of SensorML metadata by the involved researchers and data managers. An interesting by-product of this ongoing work is a growing archive of predefined sensor descriptions. This information is being collected in order to further ease metadata creation in the next phase of the project. Users will be able to choose among a number of sensor and sensor platform prototypes: these will be specific instances on which it will be possible to define, in a bottom-up approach, "sensor profiles". We report on the outcome of this activity.
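
    Once a SensorML description has been stored in an SOS server, any client can retrieve it with a DescribeSensor request. A minimal key-value-pair sketch follows; the endpoint URL and procedure identifier are placeholders, and the description format shown assumes SensorML 2.0.

```python
# Sketch: a KVP DescribeSensor request against an SOS 2.0 endpoint, retrieving
# the SensorML document for one procedure. URL and identifier are hypothetical.
import requests

SOS_URL = "https://example.org/observations/sos"
params = {
    "service": "SOS",
    "version": "2.0.0",
    "request": "DescribeSensor",
    "procedure": "urn:example:sensor:ctd-01",
    "procedureDescriptionFormat": "http://www.opengis.net/sensorml/2.0",
}

response = requests.get(SOS_URL, params=params, timeout=30)
response.raise_for_status()
print(response.text[:500])  # beginning of the SensorML document
```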

  15. FITS and PDS4: Planetary Surface Data Interoperability Made Easier

    NASA Astrophysics Data System (ADS)

    Marmo, C.; Hare, T. M.; Erard, S.; Cecconi, B.; Minin, M.; Rossi, A. P.; Costard, F.; Schmidt, F.

    2018-04-01

    This abstract describes how Flexible Image Transport System (FITS) can be used in planetary surface investigations, and how its metadata can easily be inserted in the PDS4 metadata distribution model.
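
    As a simple illustration of the kind of FITS metadata that could be mapped into PDS4 label attributes, here is a sketch using astropy to read header keywords; the file name is hypothetical and the keywords shown are just the common structural ones.

```python
# Sketch: reading FITS header keyword/value pairs with astropy, the raw
# material for a mapping into PDS4 metadata. File name is hypothetical.
from astropy.io import fits

with fits.open("surface_image.fits") as hdul:
    header = hdul[0].header
    for key in ("BITPIX", "NAXIS", "NAXIS1", "NAXIS2"):
        print(key, "=", header.get(key))
```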

  16. OOSTethys - Open Source Software for the Global Earth Observing Systems of Systems

    NASA Astrophysics Data System (ADS)

    Bridger, E.; Bermudez, L. E.; Maskey, M.; Rueda, C.; Babin, B. L.; Blair, R.

    2009-12-01

    An open source software project is much more than just picking the right license, hosting modular code and providing effective documentation. Success in advancing in an open, collaborative way requires that the process match the expected code functionality to the developers' personal expertise and organizational needs, as well as having an enthusiastic and responsive core lead group. We will present the lessons learned from OOSTethys, which is a community of software developers and marine scientists who develop open source tools, in multiple languages, to integrate ocean observing systems into an Integrated Ocean Observing System (IOOS). OOSTethys' goal is to dramatically reduce the time it takes to install, adopt and update standards-compliant web services. OOSTethys has developed servers, clients and a registry. Open source Perl, Python, Java and ASP toolkits and reference implementations are helping the marine community publish near real-time observation data in interoperable standard formats. In some cases, publishing an Open Geospatial Consortium (OGC) Sensor Observation Service (SOS) from NetCDF files, a database, or even CSV text files could take only minutes, depending on the skills of the developer. OOSTethys is also developing an OGC-standard registry, a Catalog Service for the Web (CSW). This open source CSW registry was implemented to easily register and discover SOSs using ISO 19139 service metadata. A web interface layer over the CSW registry simplifies the registration process by harvesting metadata describing the observations and sensors from the “GetCapabilities” response of the SOS. OPENIOOS is the web client, developed in Perl, to visualize the sensors in the SOS services. While the number of OOSTethys software developers is small, currently about 10 around the world, the number of OOSTethys toolkit implementers is larger and growing, and ease of use has played a large role in spreading interoperable, standards-compliant web services widely in the marine community.
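
    Discovery against a CSW registry of the kind described here can be scripted with OWSLib. The sketch below queries a catalogue for records mentioning Sensor Observation Services; the catalogue URL is a placeholder, not the OOSTethys registry.

```python
# Sketch: free-text discovery of SOS-related records in an OGC CSW catalogue
# using OWSLib. The catalogue URL is hypothetical.
from owslib.csw import CatalogueServiceWeb
from owslib.fes import PropertyIsLike

csw = CatalogueServiceWeb("https://example.org/csw")
query = PropertyIsLike("csw:AnyText", "%Sensor Observation Service%")
csw.getrecords2(constraints=[query], maxrecords=10)

for ident, record in csw.records.items():
    print(ident, "-", record.title)
```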

  17. A semantically rich and standardised approach enhancing discovery of sensor data and metadata

    NASA Astrophysics Data System (ADS)

    Kokkinaki, Alexandra; Buck, Justin; Darroch, Louise

    2016-04-01

    The marine environment plays an essential role in the earth's climate. To enhance the ability to monitor the health of this important system, innovative sensors are being produced and combined with state-of-the-art sensor technology. As the number of sensors deployed is continually increasing, it is a challenge for data users to find the data that meet their specific needs. Furthermore, users need to integrate diverse ocean datasets originating from the same or even different systems. Standards provide a solution to the above-mentioned challenges. The Open Geospatial Consortium (OGC) has created Sensor Web Enablement (SWE) standards that enable different sensor networks to establish syntactic interoperability. When combined with widely accepted controlled vocabularies, they become semantically rich and semantic interoperability is achievable. In addition, Linked Data is the recommended best practice for exposing, sharing and connecting information on the Semantic Web using Uniform Resource Identifiers (URIs), the Resource Description Framework (RDF) and the RDF Query Language (SPARQL). As part of the EU-funded SenseOCEAN project, the British Oceanographic Data Centre (BODC) is working on the standardisation of sensor metadata enabling 'plug and play' sensor integration. Our approach combines standards, controlled vocabularies and persistent URIs to publish sensor descriptions, their data and associated metadata as 5-star Linked Data and as OGC SWE (SensorML, Observations & Measurements) standards. Thus sensors become readily discoverable, accessible and usable via the web. Content- and context-based searching is also enabled, since sensor descriptions are understood by machines. Additionally, sensor data can be combined with other sensor or Linked Data datasets to form knowledge. This presentation will describe the work done at BODC to achieve syntactic and semantic interoperability in the sensor domain. It will illustrate the reuse and extension of the Semantic Sensor Network (SSN) ontology to the Linked Sensor Ontology (LSO) and the steps taken to combine OGC SWE with the Linked Data approach through alignment and embodiment of other ontologies. It will then explain how data and models were annotated with controlled vocabularies to establish unambiguous semantics and interconnect them with data from different sources. Finally, it will introduce the RDF triple store where the sensor descriptions and metadata are stored and can be queried through the standard query language SPARQL. Providing different flavours of machine-readable interpretations of sensors, sensor data and metadata enhances discoverability but, most importantly, allows seamless aggregation of information from different networks that will finally produce knowledge.
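
    A toy sketch of the storage-and-query pattern mentioned at the end of the abstract: sensor descriptions expressed as RDF triples and retrieved with SPARQL. The modelling is deliberately simplified (a single W3C SOSA term and a literal observed property) and the sensor URI is invented; the LSO/SSN modelling described above is considerably richer.

```python
# Sketch: a tiny in-memory RDF graph of sensor descriptions queried via SPARQL.
from rdflib import Graph, Namespace, Literal, URIRef, RDF

SOSA = Namespace("http://www.w3.org/ns/sosa/")
g = Graph()
ctd = URIRef("http://example.org/sensor/ctd-01")
g.add((ctd, RDF.type, SOSA.Sensor))
g.add((ctd, SOSA.observes, Literal("sea_water_temperature")))  # simplified

results = g.query("""
    PREFIX sosa: <http://www.w3.org/ns/sosa/>
    SELECT ?sensor ?property WHERE {
        ?sensor a sosa:Sensor ;
                sosa:observes ?property .
    }
""")
for sensor, prop in results:
    print(sensor, prop)
```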

  18. Characterization of Educational Resources in e-Learning Systems Using an Educational Metadata Profile

    ERIC Educational Resources Information Center

    Solomou, Georgia; Pierrakeas, Christos; Kameas, Achilles

    2015-01-01

    The ability to effectively administrate educational resources in terms of accessibility, reusability and interoperability lies in the adoption of an appropriate metadata schema, able of adequately describing them. A considerable number of different educational metadata schemas can be found in literature, with the IEEE LOM being the most widely…

  19. Metadata behind the Interoperability of Wireless Sensor Networks

    PubMed Central

    Ballari, Daniela; Wachowicz, Monica; Callejo, Miguel Angel Manso

    2009-01-01

    Wireless Sensor Networks (WSNs) produce changes of status that are frequent, dynamic and unpredictable, and cannot be represented using a linear cause-effect approach. Consequently, a new approach is needed to handle these changes in order to support dynamic interoperability. Our approach is to introduce the notion of context as an explicit representation of changes of WSN status inferred from metadata elements, which, in turn, leads towards a decision-making process about how to maintain dynamic interoperability. This paper describes the developed context model to represent and reason over different WSN statuses based on four types of contexts, which have been identified as sensing, node, network and organisational contexts. The reasoning has been addressed by developing contextualising and bridging rules. As a result, we were able to demonstrate how contextualising rules have been used to reason over changes of WSN status as a first step towards maintaining dynamic interoperability. PMID:22412330
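
    A purely illustrative sketch of what a contextualising rule might look like in practice: inferring a node or network context from a few metadata elements. The field names and thresholds are invented, not taken from the paper.

```python
# Illustrative contextualising rule: classify a WSN node's context from
# metadata elements. Field names and thresholds are invented.
def node_context(metadata: dict) -> str:
    battery = metadata.get("battery_level", 1.0)      # fraction 0..1
    link_quality = metadata.get("link_quality", 1.0)  # fraction 0..1
    if battery < 0.1:
        return "node context: low power, reduce sampling rate"
    if link_quality < 0.3:
        return "network context: degraded link, buffer observations locally"
    return "nominal"

print(node_context({"battery_level": 0.05, "link_quality": 0.9}))
```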

  20. Metadata behind the Interoperability of Wireless Sensor Networks.

    PubMed

    Ballari, Daniela; Wachowicz, Monica; Callejo, Miguel Angel Manso

    2009-01-01

    Wireless Sensor Networks (WSNs) produce changes of status that are frequent, dynamic and unpredictable, and cannot be represented using a linear cause-effect approach. Consequently, a new approach is needed to handle these changes in order to support dynamic interoperability. Our approach is to introduce the notion of context as an explicit representation of changes of WSN status inferred from metadata elements, which, in turn, leads towards a decision-making process about how to maintain dynamic interoperability. This paper describes the developed context model to represent and reason over different WSN statuses based on four types of contexts, which have been identified as sensing, node, network and organisational contexts. The reasoning has been addressed by developing contextualising and bridging rules. As a result, we were able to demonstrate how contextualising rules have been used to reason over changes of WSN status as a first step towards maintaining dynamic interoperability.

  1. Metadata to Describe Genomic Information.

    PubMed

    Delgado, Jaime; Naro, Daniel; Llorente, Silvia; Gelpí, Josep Lluís; Royo, Romina

    2018-01-01

    Interoperable metadata is key for the management of genomic information. We propose a flexible approach that contributes to ISO/IEC standardization of a new format for efficient and secure compressed storage and transmission of genomic information.

  2. Latest developments for the IAGOS database: Interoperability and metadata

    NASA Astrophysics Data System (ADS)

    Boulanger, Damien; Gautron, Benoit; Thouret, Valérie; Schultz, Martin; van Velthoven, Peter; Broetz, Bjoern; Rauthe-Schöch, Armin; Brissebrat, Guillaume

    2014-05-01

    In-service Aircraft for a Global Observing System (IAGOS, http://www.iagos.org) aims at the provision of long-term, frequent, regular, accurate, and spatially resolved in situ observations of the atmospheric composition. IAGOS observation systems are deployed on a fleet of commercial aircraft. The IAGOS database is an essential part of the global atmospheric monitoring network. Data access is handled under an open access policy based on the submission of research requests, which are reviewed by the PIs. Users can access the data through the following web sites: http://www.iagos.fr or http://www.pole-ether.fr, as the IAGOS database is part of the French atmospheric chemistry data centre ETHER (CNES and CNRS). The database is in continuous development and improvement. In the framework of the IGAS project (IAGOS for GMES/COPERNICUS Atmospheric Service), major achievements will be reached, such as metadata and format standardisation in order to interoperate with international portals and other databases, QA/QC procedures and traceability, CARIBIC (Civil Aircraft for the Regular Investigation of the Atmosphere Based on an Instrument Container) data integration within the central database, and real-time data transmission. IGAS work package 2 aims at providing the IAGOS data to users in a standardized format, including the necessary metadata and information on data processing, data quality and uncertainties. We are currently redefining and standardizing the IAGOS metadata for interoperable use within GMES/Copernicus. The metadata are compliant with the ISO 19115, INSPIRE and NetCDF-CF conventions. IAGOS data will be provided to users in NetCDF or NASA Ames format. We are also implementing interoperability between all the involved IAGOS data services, including the central IAGOS database, the former MOZAIC and CARIBIC databases, the Aircraft Research DLR database, and the Jülich WCS web application JOIN (Jülich OWS Interface), which combines model outputs with in situ data for intercomparison. The optimal data transfer protocol is being investigated to ensure interoperability. To facilitate satellite and model validation, tools will be made available for co-location and comparison with IAGOS. We will enhance the JOIN application in order to properly display aircraft data as vertical profiles and along individual flight tracks, and to allow for graphical comparison to model results that are accessible through interoperable web services, such as the daily products from the GMES/Copernicus atmospheric service.
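
    The NetCDF-CF delivery format mentioned above can be illustrated with a minimal write of a time series carrying CF-style attributes. Variable names, attribute values and the file name are illustrative, not the IAGOS product specification.

```python
# Sketch: writing a small CF-style NetCDF time series with netCDF4.
# Names and values are illustrative only.
from netCDF4 import Dataset
import numpy as np

with Dataset("iagos_example.nc", "w") as nc:
    nc.Conventions = "CF-1.6"
    nc.title = "Example aircraft ozone time series"

    nc.createDimension("time", 3)
    time = nc.createVariable("time", "f8", ("time",))
    time.standard_name = "time"
    time.units = "seconds since 2014-01-01 00:00:00"
    time[:] = [0.0, 4.0, 8.0]

    o3 = nc.createVariable("ozone", "f4", ("time",))
    o3.standard_name = "mole_fraction_of_ozone_in_air"
    o3.units = "1e-9"  # mole fraction expressed in ppb
    o3[:] = np.array([35.2, 36.0, 34.8], dtype="f4")
```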

  3. NetCDF-CF-OPeNDAP: Standards for ocean data interoperability and object lessons for community data standards processes

    USGS Publications Warehouse

    Hankin, Steven C.; Blower, Jon D.; Carval, Thierry; Casey, Kenneth S.; Donlon, Craig; Lauret, Olivier; Loubrieu, Thomas; Srinivasan, Ashwanth; Trinanes, Joaquin; Godøy, Øystein; Mendelssohn, Roy; Signell, Richard P.; de La Beaujardiere, Jeff; Cornillon, Peter; Blanc, Frederique; Rew, Russ; Harlan, Jack; Hall, Julie; Harrison, D.E.; Stammer, Detlef

    2010-01-01

    It is generally recognized that meeting society's emerging environmental science and management needs will require the marine data community to provide simpler, more effective and more interoperable access to its data. There is broad agreement, as well, that data standards are the bedrock upon which interoperability will be built. The path that would bring the marine data community to agree upon and utilize such standards, however, is often elusive. In this paper we examine the trio of standards: 1) netCDF files; 2) the Climate and Forecast (CF) metadata convention; and 3) the OPeNDAP data access protocol. These standards taken together have brought our community a high level of interoperability for "gridded" data such as model outputs, satellite products and climatological analyses, and they are gaining rapid acceptance for ocean observations. We will provide an overview of the scope of the contribution that has been made. We then step back from the information technology considerations to examine the community or "social" process by which the successes were achieved. We contrast this with the path by which the World Meteorological Organization (WMO) has advanced the Global Telecommunications System (GTS): netCDF/CF/OPeNDAP exemplifies a "bottom up" standards process, whereas GTS is "top down". Both of these standards are tales of success at achieving specific purposes, yet each is hampered by technical limitations. These limitations sometimes lead to controversy over whether alternative technological directions should be pursued. Finally, we draw general conclusions regarding the factors that affect the success of a standards development effort: the likelihood that an IT standard will meet its design goals and will achieve community-wide acceptance. We believe that a higher level of thoughtful awareness by the scientists, program managers and technology experts of the vital role of standards and the merits of alternative standards processes can help us as a community to reach our interoperability goals faster.
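
    The practical pay-off of the netCDF/CF/OPeNDAP trio is that a remote gridded dataset can be opened like a local file and its CF metadata inspected before any data are pulled. A hedged sketch follows: the URL and variable name are placeholders, and the netCDF4 library must be built with DAP support for remote access to work.

```python
# Sketch: opening a remote dataset through OPeNDAP with netCDF4 and reading
# its CF attributes. URL and variable name are hypothetical.
from netCDF4 import Dataset

url = "https://example.org/thredds/dodsC/sst/analysis.nc"
with Dataset(url) as ds:
    sst = ds.variables["analysed_sst"]        # assumed variable name
    print(sst.standard_name, sst.units)       # CF attributes, if present
    print(sst.shape)                          # data are only read on slicing
```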

  4. A Linked Dataset of Medical Educational Resources

    ERIC Educational Resources Information Center

    Dietze, Stefan; Taibi, Davide; Yu, Hong Qing; Dovrolis, Nikolas

    2015-01-01

    Reusable educational resources have become increasingly important for enhancing learning and teaching experiences, particularly in the medical domain, where resources are particularly expensive to produce. While interoperability across educational resource metadata repositories is as yet limited due to the heterogeneity of metadata standards and interface…

  5. Electronic Health Records Data and Metadata: Challenges for Big Data in the United States.

    PubMed

    Sweet, Lauren E; Moulaison, Heather Lea

    2013-12-01

    This article, written by researchers studying metadata and standards, represents a fresh perspective on the challenges of electronic health records (EHRs) and serves as a primer for big data researchers new to health-related issues. Primarily, we argue for the importance of the systematic adoption of standards in EHR data and metadata as a way of promoting big data research and benefiting patients. EHRs have the potential to include a vast amount of longitudinal health data, and metadata provides the formal structures to govern that data. In the United States, electronic medical records (EMRs) are part of the larger EHR. EHR data is submitted by a variety of clinical data providers and potentially by the patients themselves. Because data input practices are not necessarily standardized, and because of the multiplicity of current standards, basic interoperability in EHRs is hindered. Some of the issues with EHR interoperability stem from the complexities of the data they include, which can be both structured and unstructured. A number of controlled vocabularies are available to data providers. The continuity of care document standard will provide interoperability in the United States between the EMR and the larger EHR, potentially making data input by providers directly available to other providers. The data involved is nonetheless messy. In particular, the use of competing vocabularies such as the Systematized Nomenclature of Medicine-Clinical Terms, MEDCIN, and locally created vocabularies inhibits large-scale interoperability for structured portions of the records, and unstructured portions, although potentially not machine readable, remain essential. Once EMRs for patients are brought together as EHRs, the EHRs must be managed and stored. Adequate documentation should be created and maintained to assure the secure and accurate use of EHR data. There are currently a few notable international standards initiatives for EHRs. Organizations such as Health Level Seven International and Clinical Data Interchange Standards Consortium are developing and overseeing implementation of interoperability standards. Denmark and Singapore are two countries that have successfully implemented national EHR systems. Future work in electronic health information initiatives should underscore the importance of standards and reinforce interoperability of EHRs for big data research and for the sake of patients.

  6. A multi-service data management platform for scientific oceanographic products

    NASA Astrophysics Data System (ADS)

    D'Anca, Alessandro; Conte, Laura; Nassisi, Paola; Palazzo, Cosimo; Lecci, Rita; Cretì, Sergio; Mancini, Marco; Nuzzo, Alessandra; Mirto, Maria; Mannarini, Gianandrea; Coppini, Giovanni; Fiore, Sandro; Aloisio, Giovanni

    2017-02-01

    An efficient, secure and interoperable data platform solution has been developed in the TESSA project to provide fast navigation and access to the data stored in the data archive, as well as standards-based metadata management support. The platform mainly targets scientific users and situational sea awareness high-level services such as decision support systems (DSS). These datasets are accessible through the following three main components: the Data Access Service (DAS), the Metadata Service and the Complex Data Analysis Module (CDAM). The DAS allows access to data stored in the archive by providing interfaces for different protocols and services for downloading, variable selection, data subsetting or map generation. The Metadata Service is the heart of the information system of the TESSA products and completes the overall infrastructure for data and metadata management. This component enables data search and discovery and addresses interoperability by exploiting widely adopted standards for geospatial data. Finally, the CDAM represents the back-end of the TESSA DSS by performing on-demand complex data analysis tasks.

  7. The Next Stage: Moving from Isolated Digital Collections to Interoperable Digital Libraries.

    ERIC Educational Resources Information Center

    Besser, Howard

    2002-01-01

    Presents a conceptual framework for digital library development and discusses how to move from isolated digital collections to interoperable digital libraries. Topics include a history of digital libraries; user-centered architecture; stages of technological development; standards, including metadata; and best practices. (Author/LRW)

  8. Bottom-up capacity building for data providers in RITMARE

    NASA Astrophysics Data System (ADS)

    Pepe, Monica; Basoni, Anna; Bastianini, Mauro; Fugazza, Cristiano; Menegon, Stefano; Oggioni, Alessandro; Pavesi, Fabio; Sarretta, Alessandro; Carrara, Paola

    2014-05-01

    RITMARE is a Flagship Project of the Italian Ministry of Research, coordinated by the National Research Council (CNR). It aims at the interdisciplinary integration of Italian marine research. Sub-project 7 (SP7) is charged with creating an interoperable infrastructure for the project, capable of interconnecting the whole community of researchers involved. It will allow coordination and sharing of the data, processes, and information produced by the other sub-projects [1]. Spatial Data Infrastructures (SDIs) allow interoperable sharing among heterogeneous, distributed spatial content providers. The INSPIRE Directive [2] regulates the development of a pan-European SDI despite the great variety of national approaches to managing spatial data. However, six years after its adoption, its growth is still hampered by technological, cultural, and methodological gaps. In particular, in the research sector, actors may not be inclined to comply with INSPIRE (or may not feel compelled to) because they are too concentrated on domain-specific activities or hindered by technological issues. Indeed, the available technologies and tools for enabling standard-based discovery and access services are far from user-friendly and require time-consuming activities, such as metadata creation. Moreover, the INSPIRE implementation guidelines do not accommodate an essential component of environmental research, that is, in situ observations. In order to overcome most of the aforementioned issues and to enable researchers to actively contribute to the creation of the project infrastructure, a bottom-up approach has been adopted: a software suite, called the Starter Kit, has been developed and is offered to research data production units so that they can become autonomous, independent nodes of data provision. The Starter Kit enables the provision of geospatial resources, either geodata (e.g., maps and layers) or observations pulled from sensors, which are made accessible according to the OGC standards defined for the specific category of data (WMS, WFS, WCS, and SOS). Resources are annotated with fine-grained metadata that is compliant with standards (e.g., INSPIRE, SensorML) and also semantically enriched by leveraging controlled vocabularies and RDF-based data structures (e.g., the FOAF description of the project's organisation). The Starter Kit is packaged as an off-the-shelf virtual machine and is made available under an open license (GPL v.3) and with extensive support tools. Among the most innovative features of the architecture is the user-friendly, extensible approach to metadata creation. On the one hand, the number of metadata items that need to be provided by the user is reduced to the minimum by recourse to controlled vocabularies and context information. The semantic underpinning of these data structures enables advanced discovery functionalities. On the other hand, the templating mechanism adopted in metadata editing allows further schemata to be easily plugged in. The Starter Kit provides a consistent framework for capacity building that brings the heterogeneous actors in the project under the same umbrella, while preserving individual practices, formats, and workflows. At the same time, users are empowered with standards-compliant web services that can be discovered and accessed both locally and remotely, for example by the RITMARE infrastructure itself. [1] Carrara, P., Sarretta, A., Giorgetti, A., Ribera D'Alcalà, M., Oggioni, A., & Partescano, E. (2013). An interoperable infrastructure for the Italian Marine Research. IMDIS 2013. [2] European Commission, "Establishing an Infrastructure for Spatial Information in the European Community (INSPIRE)", Directive 2007/2/EC, Official J. European Union, vol. 50, no. L 108, 2007, pp. 1-14.

  9. Content Metadata Standards for Marine Science: A Case Study

    USGS Publications Warehouse

    Riall, Rebecca L.; Marincioni, Fausto; Lightsom, Frances L.

    2004-01-01

    The U.S. Geological Survey developed a content metadata standard to meet the demands of organizing electronic resources in the marine sciences for a broad, heterogeneous audience. These metadata standards are used by the Marine Realms Information Bank project, a Web-based public distributed library of marine science from academic institutions and government agencies. The development and deployment of this metadata standard serve as a model, complete with lessons about mistakes, for the creation of similarly specialized metadata standards for digital libraries.

  10. Intelligent Discovery for Learning Objects Using Semantic Web Technologies

    ERIC Educational Resources Information Center

    Hsu, I-Ching

    2012-01-01

    The concept of learning objects has been applied in the e-learning field to promote the accessibility, reusability, and interoperability of learning content. Learning Object Metadata (LOM) was developed to achieve these goals by describing learning objects in order to provide meaningful metadata. Unfortunately, the conventional LOM lacks the…

  11. The Value of Data and Metadata Standardization for Interoperability in Giovanni Or: Why Your Product's Metadata Causes Us Headaches!

    NASA Technical Reports Server (NTRS)

    Smit, Christine; Hegde, Mahabaleshwara; Strub, Richard; Bryant, Keith; Li, Angela; Petrenko, Maksym

    2017-01-01

    Giovanni is a data exploration and visualization tool at the NASA Goddard Earth Sciences Data Information Services Center (GES DISC). It has been around in one form or another for more than 15 years. Giovanni calculates simple statistics and produces 22 different visualizations for more than 1600 geophysical parameters from more than 90 satellite and model products. Giovanni relies on external data format standards to ensure interoperability, including the NetCDF CF Metadata Conventions. Unfortunately, these standards were insufficient to make Giovanni's internal data representation truly simple to use. Finding and working with dimensions can be convoluted with the CF Conventions. Furthermore, the CF Conventions are silent on machine-friendly descriptive metadata such as the parameter's source product and product version. In order to simplify analyzing disparate earth science data parameters in a unified way, we developed Giovanni's internal standard. First, the format standardizes parameter dimensions and variables so they can be easily found. Second, the format adds all the machine-friendly metadata Giovanni needs to present our parameters to users in a consistent and clear manner. At a glance, users can grasp all the pertinent information about parameters both during parameter selection and after visualization.
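
    As an illustration of the dimension bookkeeping the abstract alludes to (and not Giovanni's internal code), here is a sketch of locating latitude and longitude coordinate variables in a CF-style file by their units attributes, since variable names alone are not reliable across products. The file name is hypothetical.

```python
# Sketch: find latitude/longitude coordinate variables via CF units attributes.
from netCDF4 import Dataset

def find_coordinate(ds, units_prefix):
    # Return the first variable whose 'units' attribute starts with the prefix.
    for name, var in ds.variables.items():
        if getattr(var, "units", "").startswith(units_prefix):
            return name
    return None

with Dataset("some_level3_product.nc") as ds:  # hypothetical file
    lat_name = find_coordinate(ds, "degrees_north")
    lon_name = find_coordinate(ds, "degrees_east")
    print("latitude:", lat_name, "| longitude:", lon_name)
```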

  12. Planetary Sciences Interoperability at VO Paris Data Centre

    NASA Astrophysics Data System (ADS)

    Le Sidaner, P.; Aboudarham, J.; Birlan, M.; Briot, D.; Bonnin, X.; Cecconi, B.; Chauvin, C.; Erard, S.; Henry, F.; Lamy, L.; Mancini, M.; Normand, J.; Popescu, F.; Roques, F.; Savalle, R.; Schneider, J.; Shih, A.; Thuillot, W.; Vinatier, S.

    2015-10-01

    The Astronomy community has been developing interoperability for more than 10 years by standardizing data access, data formats, and metadata. This international action is led by the International Virtual Observatory Alliance (IVOA). Observatoire de Paris is an active participant in this project. All actions on interoperability, data and service provision are centralized in and managed by the VOParis Data Centre (VOPDC). VOPDC is a coordinated project involving all scientific departments of Observatoire de Paris.

  13. Leveraging Available Technologies for Improved Interoperability and Visualization of Remote Sensing and In-situ Oceanographic data at the PO.DAAC

    NASA Astrophysics Data System (ADS)

    Tsontos, V. M.; Arms, S. C.; Thompson, C. K.; Quach, N.; Lam, T.

    2016-12-01

    Earth science applications increasingly rely on the integration of multivariate data from diverse observational platforms. Whether for satellite mission cal/val, science or decision support, the coupling of remote sensing and in-situ field data is also integral to oceanographic workflows. This has prompted archives such as the PO.DAAC, NASA's physical oceanography data archive, which historically has had a remote sensing focus, to adapt to better accommodate complex field campaign datasets. However, the inherent heterogeneity of in-situ datasets and their variable adherence to meta/data standards pose a significant impediment to interoperability, a problem originating early in the data lifecycle and significantly impacting the stewardship and usability of these data long-term. Here we introduce a new initiative underway at PO.DAAC that seeks to catalyze efforts to address these challenges. It involves the enhancement and integration of available high-TRL (Technology Readiness Level) components for improved interoperability and support of in-situ data, with a focus on a novel yet representative class of oceanographic field data: data from electronic tags deployed on a variety of marine species as biological sampling platforms in support of fisheries management and ocean observation efforts. This project seeks to demonstrate, deliver and ultimately sustain operationally a reusable and accessible set of tools to: 1) mediate reconciliation of heterogeneous source data into a tractable number of standardized formats consistent with earth science data standards; 2) harmonize existing metadata models for satellite and field datasets; 3) demonstrate the value added of integrated data access via a range of available tools and services hosted at the PO.DAAC, including a web-based visualization tool for comprehensive mapping of satellite and in-situ data. An innovative part of our project plan involves partnering with the leading electronic tag manufacturer to promote the adoption of appropriate data standards in their processing software. The proposed project thus adopts a model lifecycle approach complemented by broadly applicable technologies to address key data management and interoperability issues for in-situ data.

  14. Developing Interoperable Air Quality Community Portals

    NASA Astrophysics Data System (ADS)

    Falke, S. R.; Husar, R. B.; Yang, C. P.; Robinson, E. M.; Fialkowski, W. E.

    2009-04-01

    Web portals are intended to provide consolidated discovery, filtering and aggregation of content from multiple, distributed web sources targeted at particular user communities. This paper presents a standards-based information architectural approach to developing portals aimed at air quality community collaboration in data access and analysis. An important characteristic of the approach is to advance beyond the present stand-alone design of most portals to achieve interoperability with other portals and information sources. We show how using metadata standards, web services, RSS feeds and other Web 2.0 technologies, such as Yahoo! Pipes and del.icio.us, helps increase interoperability among portals. The approach is illustrated within the context of the GEOSS Architecture Implementation Pilot where an air quality community portal is being developed to provide a user interface between the portals and clearinghouse of the GEOSS Common Infrastructure and the air quality community catalog of metadata and data services.

  15. NASA's Earth Observing Data and Information System - Supporting Interoperability through a Scalable Architecture (Invited)

    NASA Astrophysics Data System (ADS)

    Mitchell, A. E.; Lowe, D. R.; Murphy, K. J.; Ramapriyan, H. K.

    2011-12-01

    Initiated in 1990, NASA's Earth Observing System Data and Information System (EOSDIS) is currently a petabyte-scale archive of data designed to receive, process, distribute and archive several terabytes of science data per day from NASA's Earth science missions. Comprised of 12 discipline specific data centers collocated with centers of science discipline expertise, EOSDIS manages over 6800 data products from many science disciplines and sources. NASA supports global climate change research by providing scalable open application layers to the EOSDIS distributed information framework. This allows many other value-added services to access NASA's vast Earth Science Collection and allows EOSDIS to interoperate with data archives from other domestic and international organizations. EOSDIS is committed to NASA's Data Policy of full and open sharing of Earth science data. As metadata is used in all aspects of NASA's Earth science data lifecycle, EOSDIS provides a spatial and temporal metadata registry and order broker called the EOS Clearing House (ECHO) that allows efficient search and access of cross domain data and services through the Reverb Client and Application Programmer Interfaces (APIs). Another core metadata component of EOSDIS is NASA's Global Change Master Directory (GCMD) which represents more than 25,000 Earth science data set and service descriptions from all over the world, covering subject areas within the Earth and environmental sciences. With inputs from the ECHO, GCMD and Soil Moisture Active Passive (SMAP) mission metadata models, EOSDIS is developing a NASA ISO 19115 Best Practices Convention. Adoption of an international metadata standard enables a far greater level of interoperability among national and international data products. NASA recently concluded a 'Metadata Harmony Study' of EOSDIS metadata capabilities/processes of ECHO and NASA's Global Change Master Directory (GCMD), to evaluate opportunities for improved data access and use, reduce efforts by data providers and improve metadata integrity. The result was a recommendation for EOSDIS to develop a 'Common Metadata Repository (CMR)' to manage the evolution of NASA Earth Science metadata in a unified and consistent way by providing a central storage and access capability that streamlines current workflows while increasing overall data quality and anticipating future capabilities. For applications users interested in monitoring and analyzing a wide variety of natural and man-made phenomena, EOSDIS provides access to near real-time products from the MODIS, OMI, AIRS, and MLS instruments in less than 3 hours from observation. To enable interactive exploration of NASA's Earth imagery, EOSDIS is developing a set of standard services to deliver global, full-resolution satellite imagery in a highly responsive manner. EOSDIS is also playing a lead role in the development of the CEOS WGISS Integrated Catalog (CWIC), which provides search and access to holdings of participating international data providers. EOSDIS provides a platform to expose and share information on NASA Earth science tools and data via Earthdata.nasa.gov while offering a coherent and interoperable system for the NASA Earth Science Data System (ESDS) Program.
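
    The Common Metadata Repository recommended here was subsequently deployed with a public search API. A hedged sketch of a keyword search against it follows; only a couple of the available parameters are shown, and the response fields assumed (feed/entry, short_name, title) reflect the API as deployed today rather than anything in this abstract.

```python
# Sketch: keyword search against the CMR collections search API.
import requests

resp = requests.get(
    "https://cmr.earthdata.nasa.gov/search/collections.json",
    params={"keyword": "sea surface temperature", "page_size": 5},
    timeout=30,
)
resp.raise_for_status()
for entry in resp.json()["feed"]["entry"]:
    print(entry["short_name"], "-", entry["title"])
```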

  16. NASA's Earth Observing Data and Information System - Supporting Interoperability through a Scalable Architecture (Invited)

    NASA Astrophysics Data System (ADS)

    Mitchell, A. E.; Lowe, D. R.; Murphy, K. J.; Ramapriyan, H. K.

    2013-12-01

    Initiated in 1990, NASA's Earth Observing System Data and Information System (EOSDIS) is currently a petabyte-scale archive of data designed to receive, process, distribute and archive several terabytes of science data per day from NASA's Earth science missions. Comprised of 12 discipline specific data centers collocated with centers of science discipline expertise, EOSDIS manages over 6800 data products from many science disciplines and sources. NASA supports global climate change research by providing scalable open application layers to the EOSDIS distributed information framework. This allows many other value-added services to access NASA's vast Earth Science Collection and allows EOSDIS to interoperate with data archives from other domestic and international organizations. EOSDIS is committed to NASA's Data Policy of full and open sharing of Earth science data. As metadata is used in all aspects of NASA's Earth science data lifecycle, EOSDIS provides a spatial and temporal metadata registry and order broker called the EOS Clearing House (ECHO) that allows efficient search and access of cross domain data and services through the Reverb Client and Application Programmer Interfaces (APIs). Another core metadata component of EOSDIS is NASA's Global Change Master Directory (GCMD) which represents more than 25,000 Earth science data set and service descriptions from all over the world, covering subject areas within the Earth and environmental sciences. With inputs from the ECHO, GCMD and Soil Moisture Active Passive (SMAP) mission metadata models, EOSDIS is developing a NASA ISO 19115 Best Practices Convention. Adoption of an international metadata standard enables a far greater level of interoperability among national and international data products. NASA recently concluded a 'Metadata Harmony Study' of EOSDIS metadata capabilities/processes of ECHO and NASA's Global Change Master Directory (GCMD), to evaluate opportunities for improved data access and use, reduce efforts by data providers and improve metadata integrity. The result was a recommendation for EOSDIS to develop a 'Common Metadata Repository (CMR)' to manage the evolution of NASA Earth Science metadata in a unified and consistent way by providing a central storage and access capability that streamlines current workflows while increasing overall data quality and anticipating future capabilities. For applications users interested in monitoring and analyzing a wide variety of natural and man-made phenomena, EOSDIS provides access to near real-time products from the MODIS, OMI, AIRS, and MLS instruments in less than 3 hours from observation. To enable interactive exploration of NASA's Earth imagery, EOSDIS is developing a set of standard services to deliver global, full-resolution satellite imagery in a highly responsive manner. EOSDIS is also playing a lead role in the development of the CEOS WGISS Integrated Catalog (CWIC), which provides search and access to holdings of participating international data providers. EOSDIS provides a platform to expose and share information on NASA Earth science tools and data via Earthdata.nasa.gov while offering a coherent and interoperable system for the NASA Earth Science Data System (ESDS) Program.

  17. OAI and NASA's Scientific and Technical Information

    NASA Technical Reports Server (NTRS)

    Nelson, Michael L.; Rocker, JoAnne; Harrison, Terry L.

    2002-01-01

    The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is an evolving protocol and philosophy regarding interoperability for digital libraries (DLs). Previously, "distributed searching" models were popular for DL interoperability. However, experience has shown distributed searching systems across large numbers of DLs to be difficult to maintain in an Internet environment. The OAI-PMH is a move away from distributed searching, focusing on the arguably simpler model of "metadata harvesting". We detail NASA's involvement in defining and testing the OAI-PMH and our experience to date with adapting existing NASA distributed searching DLs (such as the NASA Technical Report Server) to use the OAI-PMH and metadata harvesting. We discuss some of the entirely new DL projects that the OAI-PMH has made possible, such as the Technical Report Interchange project. We explain the strategic importance of the OAI-PMH to the mission of NASA's Scientific and Technical Information Program.
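
    The harvesting model is simple enough to sketch with the standard ListRecords verb and Dublin Core metadata. The base URL below is a placeholder for any OAI-PMH repository; the namespaces are the standard OAI-PMH and Dublin Core ones.

```python
# Sketch: harvesting Dublin Core titles from an OAI-PMH endpoint with the
# ListRecords verb. The base URL is hypothetical.
import requests
import xml.etree.ElementTree as ET

BASE_URL = "https://example.org/oai"
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

resp = requests.get(
    BASE_URL,
    params={"verb": "ListRecords", "metadataPrefix": "oai_dc"},
    timeout=30,
)
resp.raise_for_status()
root = ET.fromstring(resp.content)
for record in root.findall(".//oai:record", NS):
    title = record.find(".//dc:title", NS)
    print(title.text if title is not None else "(no title)")
```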

  18. Interoperability Gap Challenges for Learning Object Repositories & Learning Management Systems

    ERIC Educational Resources Information Center

    Mason, Robert T.

    2011-01-01

    An interoperability gap exists between Learning Management Systems (LMSs) and Learning Object Repositories (LORs). Learning Objects (LOs) and the associated Learning Object Metadata (LOM) that is stored within LORs adhere to a variety of LOM standards. A common LOM standard found in LORs is the Sharable Content Object Reference Model (SCORM)…

  19. Development of ITSASGIS-5D: seeking interoperability between Marine GIS layers and scientific multidimensional data using open source tools and OGC services for multidisciplinary research.

    NASA Astrophysics Data System (ADS)

    Sagarminaga, Y.; Galparsoro, I.; Reig, R.; Sánchez, J. A.

    2012-04-01

    Since 2000, an intense effort has been conducted in AZTI's Marine Research Division to set up a data management system that could gather all the marine datasets produced by different in-house research projects. For that, a corporate GIS was designed that included a data and metadata repository, a database, a layer catalog & search application and an internet map viewer. Several layers, mostly dealing with physical, chemical and biological in-situ sampling, and basic and thematic cartography including bathymetry, geomorphology, different species habitat maps, and human pressure and activities maps, were successfully gathered in this system. Very soon, it was realised that new marine technologies yielding continuous multidimensional data, sometimes called FES (Fluid Earth System) data, were difficult to handle in this structure. The data affected mainly included numerical oceanographic and meteorological models, remote sensing data, coastal RADAR data, and some in-situ observational systems such as CTD casts, moored or Lagrangian buoys, etc. A management system for gridded multidimensional data was developed using standardized formats (netCDF with the CF conventions) and tools such as the THREDDS catalog (UNIDATA/UCAR), providing web services such as OPeNDAP, NCSS and WCS, as well as the ncWMS service developed by the Reading e-Science Centre. At present, a system (ITSASGIS-5D) is being developed, based on OGC standards and open-source tools, to allow interoperability between all the data types mentioned before. This system includes, on the server side, PostgreSQL/PostGIS databases and GeoServer for GIS layers, and THREDDS/OPeNDAP and ncWMS services for FES gridded data. Moreover, an on-line client is being developed to allow joint access, user configuration, data visualisation & query, and data distribution. This client uses the MapFish, ExtJS/GeoExt and OpenLayers libraries. Through this presentation, the elements of the first released version of this system will be described and shown, together with the new topics to be developed in future versions, which include, among others, the integration of GeoNetwork libraries and tools for both FES and GIS metadata management, and the use of the new OGC Sensor Observation Service (SOS) to integrate non-gridded multidimensional data such as time series, depth profiles or trajectories provided by different observational systems. The final aim of this approach is to contribute to the multidisciplinary access and use of marine data for management and research activities, and to facilitate the implementation of integrated ecosystem-based approaches in the fields of fisheries advice and management, marine spatial planning, or the implementation of European policies such as the Water Framework Directive, the Marine Strategy Framework Directive or the Habitat Framework Directive.

  20. Enabling Interoperability in Heliophysical Domains

    NASA Astrophysics Data System (ADS)

    Bentley, Robert

    2013-04-01

    There are many aspects of science in the Solar System that overlap - phenomena observed in one domain can have effects in other domains. However, there are many problems related to exploiting the data in cross-disciplinary studies because of a lack of interoperability between the data and services. The CASSIS project is a Coordination Action funded under FP7 with the objective of improving the interoperability of data and services related to Solar System science. CASSIS has been investigating how the data could be made more accessible with some relatively minor changes to the observational metadata. The project has been looking at the services used within the domain and determining whether they are interoperable with each other and, if not, what would be required to make them so. It has also been examining all types of metadata used when identifying and using observations, and trying to make them more compliant with techniques and standards developed by bodies such as the International Virtual Observatory Alliance (IVOA). Many of the lessons being learnt in the study are applicable to domains beyond those directly involved in heliophysics. Adopting some simple standards for the design of service interfaces and the metadata they use would make it much easier to investigate interdisciplinary science topics. We will report on our findings and describe a roadmap for the future. For more information about CASSIS, please visit the project web site at cassis-vo.eu

  1. OBIS-USA: Enhancing Ocean Science Outcomes through Data Interoperability and Usability

    NASA Astrophysics Data System (ADS)

    Goldstein, P.; Fornwall, M.

    2014-12-01

    Commercial and industrial information systems have long built and relied upon standard data formats and transactions. Business processes, analytics, applications, and social networks emerge on top of these standards to create value. Examples of value delivered include operational productivity, analytics that enable growth and profit, and enhanced human communication and creativity for innovation. In science informatics, some research and operational activities operate with only scattered adoption of standards and few of the emergent benefits of interoperability. In-situ biological data management in the marine domain is an exemplar. From the origination of biological occurrence records in surveys, observer programs, monitoring and experimentation, through distribution techniques, to applications, decisions, and management response, marine biological data can be difficult, limited, and costly to integrate because of non-standard and undocumented conditions in the data. While this presentation identifies deficits in marine biological data practices, it also identifies this as a field of opportunity. Standards for biological data and metadata do exist, with growing global adoption and extensibility features. Scientific, economic, and social-value motivations provide incentives to maximize marine science investments. Diverse science communities of national and international scale are beginning to see the benefits of collaborative technologies. OBIS-USA (http://USGS.gov/obis-usa) is a program of the United States Geological Survey. This presentation shows how OBIS-USA directly addresses the opportunity to enhance ocean science outcomes through data infrastructure, including: (1) achieving rapid, economical, and high-quality data capture and data flow, (2) offering technology for data storage and methods for data discovery and quality/suitability evaluation, (3) making data understandable and consistent for application purposes, (4) distributing and integrating data in various formats, (5) addressing a range of subject matter within data contents, and (6) preserving data for long-term access.

  2. Data Publication and Interoperability for Long Tail Researchers via the Open Data Repository's (ODR) Data Publisher.

    NASA Astrophysics Data System (ADS)

    Stone, N.; Lafuente, B.; Bristow, T.; Keller, R.; Downs, R. T.; Blake, D. F.; Fonda, M.; Pires, A.

    2016-12-01

    Working primarily with astrobiology researchers at NASA Ames, the Open Data Repository (ODR) has been conducting a software pilot to meet the varying needs of this multidisciplinary community. Astrobiology researchers often have small communities or operate individually with unique data sets that don't easily fit into existing database structures. The ODR constructed its Data Publisher software to allow researchers to create databases with common metadata structures and subsequently extend them to meet their individual needs and data requirements. The software accomplishes these tasks through a web-based interface that allows collaborative creation and revision of common metadata templates and individual extensions to these templates for custom data sets. This allows researchers to search disparate datasets based on common metadata established through the metadata tools, while still facilitating distinct analyses and data that may be stored alongside the required common metadata. The software produces web pages that can be made publicly available at the researcher's discretion, so that users may search and browse the data, making interoperability and data discovery a human-friendly task while also providing semantic data for machine-based discovery. Once relevant data have been identified, researchers can use the built-in application programming interface (API) that exposes the data for machine-based consumption and integration with existing data analysis tools (e.g. R, MATLAB, Project Jupyter - http://jupyter.org). The current evolution of the project has created the Astrobiology Habitable Environments Database (AHED)[1], which provides an interface to databases connected through a common metadata core. In the next project phase, the goal is for small research teams and groups to be self-sufficient in publishing their research data to meet funding mandates and academic requirements, as well as to foster increased data discovery and interoperability through human-readable and machine-readable interfaces. This project is supported by the Science-Enabling Research Activity (SERA) and NASA NNX11AP82A, MSL. [1] B. Lafuente et al. (2016) AGU, submitted.

  3. Making Information Visible, Accessible, and Understandable: Meta-Data and Registries

    DTIC Science & Technology

    2007-07-01

    the data created, the length of play time, album name, and the genre. Without resource metadata, portable digital music players would not be so...notion of a catalog card in a library. An example of metadata is the description of a music file specifying the creator, the artist that performed the song...describe struc- ture and formatting which are critical to interoperability and the management of databases. Going back to the portable music player example

  4. A standard for measuring metadata quality in spectral libraries

    NASA Astrophysics Data System (ADS)

    Rasaiah, B.; Jones, S. D.; Bellman, C.

    2013-12-01

    There is an urgent need within the international remote sensing community to establish a metadata standard for field spectroscopy that ensures high-quality, interoperable metadata sets that can be archived and shared efficiently within Earth observation data sharing systems. Metadata are an important component in the cataloguing and analysis of in situ spectroscopy datasets because of their central role in identifying and quantifying the quality and reliability of spectral data and the products derived from them. This paper presents approaches to measuring metadata completeness and quality in spectral libraries to determine the reliability, interoperability, and re-usability of a dataset. Explored are quality parameters that meet the unique requirements of in situ spectroscopy datasets across many campaigns. Examined are the challenges of ensuring that data creators, owners, and users maintain a high level of data integrity throughout the lifecycle of a dataset. Issues such as field measurement methods, instrument calibration, and data representativeness are investigated. The proposed metadata standard incorporates expert recommendations that include metadata protocols critical to all campaigns, and those that are restricted to campaigns for specific target measurements. The implications of semantics and syntax for a robust and flexible metadata standard are also considered. Approaches towards an operational and logistically viable implementation of a quality standard are discussed. This paper also proposes a way forward for adapting and enhancing current geospatial metadata standards to the unique requirements of field spectroscopy metadata quality. Index terms: [0430] BIOGEOSCIENCES / Computational methods and data processing; [0480] BIOGEOSCIENCES / Remote sensing; [1904] INFORMATICS / Community standards; [1912] INFORMATICS / Data management, preservation, rescue; [1926] INFORMATICS / Geospatial; [1930] INFORMATICS / Data and information governance; [1946] INFORMATICS / Metadata; [1952] INFORMATICS / Modeling; [1976] INFORMATICS / Software tools and services; [9810] GENERAL OR MISCELLANEOUS / New fields

  5. MyOcean Internal Information System (Dial-P)

    NASA Astrophysics Data System (ADS)

    Blanc, Frederique; Jolibois, Tony; Loubrieu, Thomas; Manzella, Giuseppe; Mazzetti, Paolo; Nativi, Stefano

    2010-05-01

    MyOcean is a three-year project (2008-2011) whose goal is the development and pre-operational validation of the GMES Marine Core Service for ocean monitoring and forecasting. It is a transition project intended to lead the European "operational oceanography" community towards the operational phase of a GMES European service, which demands more European integration, more operational capability, and more service. Observations, model-based data, and added-value products will be generated - and enhanced thanks to dedicated expertise - by the following production units: • Five Thematic Assembly Centers, each of them dealing with a specific set of observation data: Sea Level, Ocean Colour, Sea Surface Temperature, Sea Ice & Wind, and In Situ data, • Seven Monitoring and Forecasting Centers to serve the Global Ocean, the Arctic area, the Baltic Sea, the Atlantic North-West shelves area, the Atlantic Iberian-Biscay-Ireland area, the Mediterranean Sea and the Black Sea. Intermediate and final users will discover, view and get the products by means of a central web desk, a central reactive manned service desk and thematic experts distributed across Europe. The MyOcean Information System (MIS) addresses the various aspects of an interoperable, federated information system. Data models support data and computer systems by providing the definition and format of data. The possibility of including the information in the data file depends on the data model adopted. In general there is little effort in the current project to develop a 'generic' data model. A strong push to develop a common model is provided by the EU Directive INSPIRE. At present, there is no single de-facto data format for storing observational data. Data formats are still evolving, with their underlying data models moving towards the concept of Feature Types based on ISO/TC211 standards. For example, Unidata is developing the Common Data Model, which can represent scientific data types such as point, trajectory, station, grid, etc., and which will be implemented in netCDF format. SeaDataNet is recommending the ODV and NetCDF formats. Another problem related to data curation and interoperability is the possibility of using common vocabularies. Common vocabularies are developed in many international initiatives, such as GEMET (promoted by INSPIRE as a multilingual thesaurus), UNIDATA, SeaDataNet, and Marine Metadata Interoperability (MMI). MIS is considering the SeaDataNet vocabulary as a base for interoperability. Four layers of interoperability, at different abstraction levels, can be defined: - Technical/basic: this layer is implemented at each TAC or MFC through internet connection and basic services for data transfer and browsing (e.g. FTP, HTTP). - Syntactic: allowing the interchange of metadata and protocol elements. This layer corresponds to the definition of a Core Metadata Set, the exchange/delivery format for data and associated metadata, and possible software. It is implemented by the DIAL-P logical interface (e.g. adoption of an INSPIRE-compliant metadata set and common data formats). - Functional/pragmatic: based on a common set of functional primitives or on a common set of service definitions. This layer refers to the definition of services based on Web services standards. It is implemented by the DIAL-P logical interface (e.g. adoption of INSPIRE-compliant network services). - Semantic: allowing access to similar classes of objects and services across multiple sites, with multilinguality of content as one specific aspect. 
This layer corresponds to the MIS interface, terminology and thesaurus. Given the above requirements, the proposed solution is a federation of systems, where the individual participants are self-contained autonomous systems but together form a consistent wider picture. A mid-tier integration layer mediates between existing systems, adapting their data and service model schema to the MIS. The developed MIS is a read-only system, i.e. it does not allow updating (or inserting) data into the participant resource systems. The main advantages of the proposed approach are: • to enable information sources to join the MIS and publish their data and metadata in a secure way, without any modification to their existing resources and procedures and without any restriction of their autonomy; • to enable users to browse and query the MIS, receiving an aggregated result incorporating relevant data and metadata from across different sources; • to accommodate the growth of such a MIS, either in terms of its clients or of its information resources, as well as the evolution of the underlying data model.
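    The Feature Type concept mentioned above is what the CF conventions encode, for discrete sampling geometries, in a single global attribute. Below is a minimal sketch, assuming placeholder file names, of how a client of such a common data model might branch on that attribute.

```python
# Minimal sketch (placeholder file names): branching on the CF featureType
# attribute, which records the discrete-sampling feature type of a dataset.
import xarray as xr

def classify(path: str) -> str:
    """Return the CF featureType of a netCDF file, or 'grid' if none is set."""
    ds = xr.open_dataset(path)
    # CF stores the feature type as a global attribute, e.g. "point",
    # "timeSeries", "trajectory", "profile"; gridded fields simply omit it.
    return ds.attrs.get("featureType", "grid")

for f in ["mooring_timeseries.nc", "drifter_trajectory.nc", "model_grid.nc"]:  # placeholders
    print(f, "->", classify(f))
```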

  6. Oceanotron, Scalable Server for Marine Observations

    NASA Astrophysics Data System (ADS)

    Loubrieu, T.; Bregent, S.; Blower, J. D.; Griffiths, G.

    2013-12-01

    Ifremer, the French marine institute, is deeply involved in data management for different ocean in-situ observation programs (ARGO, OceanSITES, GOSUD, ...) and other European programs aiming at networking ocean in-situ observation data repositories (MyOcean, SeaDataNet, EMODnet). To capitalize on the effort of implementing advanced data dissemination services (visualization, download with subsetting) for these programs and, more generally, for water-column observation repositories, Ifremer decided in 2010 to develop the Oceanotron server. Given the diversity of data repository formats (RDBMS, netCDF, ODV, ...) and the temperamental nature of the standard interoperability interface profiles (OGC/WMS, OGC/WFS, OGC/SOS, OPeNDAP, ...), the server is designed to manage plugins: - StorageUnits, which read specific data repository formats (netCDF/OceanSITES, RDBMS schema, ODV binary format); - FrontDesks, which receive external requests and return results over interoperable protocols (OGC/WMS, OGC/SOS, OPeNDAP). In between, a third type of plugin may be inserted: - TransformationUnits, which apply ocean-business transformations to the features (for example, conversion of vertical coordinates from pressure in dB to meters below the sea surface). The server is released under an open-source license so that partners can develop their own plugins. Within the MyOcean project, the University of Reading has plugged in a WMS implementation as an Oceanotron FrontDesk. The modules are connected together by sharing the same information model for marine observations (or sampling features: vertical profiles, point series and trajectories), dataset metadata and queries. The shared information model is based on the OGC Observations & Measurements and Unidata Common Data Model initiatives. The model is implemented in Java (http://www.ifremer.fr/isi/oceanotron/javadoc/). This inner interoperability layer makes it possible to capitalize on ocean-business expertise in software development without being tied to specific data formats or protocols. Oceanotron is deployed at seven European data centres for marine in-situ observations within MyOcean. While additional extensions are still being developed, work is now under way, to promote new collaborative initiatives, on continuous and distributed integration (Jenkins, Maven), shared reference documentation (Alfresco), and code and release dissemination (SourceForge, GitHub).
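    The plugin layering described above (StorageUnits, TransformationUnits and FrontDesks sharing one observation model) can be caricatured in a few lines. This is a hypothetical Python sketch of the idea only; Oceanotron itself is written in Java and its actual interfaces differ.

```python
# Hypothetical sketch of the three plugin roles described above; names and
# signatures are illustrative, not Oceanotron's actual (Java) API.
from dataclasses import dataclass
from typing import Iterable, List, Protocol

@dataclass
class Profile:                      # shared information model: one sampling feature
    depths_dbar: List[float]        # vertical coordinate as pressure
    temperature: List[float]

class StorageUnit(Protocol):        # reads a specific repository format
    def read(self) -> Iterable[Profile]: ...

class TransformationUnit(Protocol): # ocean-business transformation of features
    def apply(self, p: Profile) -> Profile: ...

class DepthFromPressure:
    """Example transformation: pressure to an approximate depth in meters."""
    def apply(self, p: Profile) -> Profile:
        return Profile([d * 0.993 for d in p.depths_dbar], p.temperature)

def front_desk(storage: StorageUnit, transforms: List[TransformationUnit]):
    """FrontDesk role: serve features after applying the configured transformations."""
    for prof in storage.read():
        for t in transforms:
            prof = t.apply(prof)
        yield prof
```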

  7. The MMI Device Ontology: Enabling Sensor Integration

    NASA Astrophysics Data System (ADS)

    Rueda, C.; Galbraith, N.; Morris, R. A.; Bermudez, L. E.; Graybeal, J.; Arko, R. A.; Mmi Device Ontology Working Group

    2010-12-01

    The Marine Metadata Interoperability (MMI) project has developed an ontology for devices to describe sensors and sensor networks. This ontology is implemented in the W3C Web Ontology Language (OWL) and provides an extensible conceptual model and controlled vocabularies for describing heterogeneous instrument types, with different data characteristics, and their attributes. It can help users populate metadata records for sensors; associate devices with their platforms, deployments, measurement capabilities and restrictions; aid in discovery of sensor data, both historic and real-time; and improve the interoperability of observational oceanographic data sets. We developed the MMI Device Ontology following a community-based approach. By building on and integrating other models and ontologies from related disciplines, we sought to facilitate semantic interoperability while avoiding duplication. Key concepts and insights from various communities, including the Open Geospatial Consortium (e.g., the SensorML and Observations and Measurements specifications), Semantic Web for Earth and Environmental Terminology (SWEET), and the W3C Semantic Sensor Network Incubator Group, have significantly enriched the development of the ontology. Individuals ranging from instrument designers, science data producers and consumers to ontology specialists and other technologists contributed to the work. Applications of the MMI Device Ontology are underway for several community use cases. These include vessel-mounted multibeam mapping sonars for the Rolling Deck to Repository (R2R) program and description of diverse instruments on deepwater Ocean Reference Stations for the OceanSITES program. These trials involve creation of records completely describing instruments, either by individual instances or by manufacturer and model. Individual terms in the MMI Device Ontology can be referenced with their corresponding Uniform Resource Identifiers (URIs) in sensor-related metadata specifications (e.g., SensorML, NetCDF). These identifiers can be resolved through a web browser or other client applications via HTTP against the MMI Ontology Registry and Repository (ORR), where the ontology is maintained. SPARQL-based query capabilities, which are enhanced with reasoning, along with several supported output formats, allow the effective interaction of diverse client applications with the semantic information associated with the device ontology. In this presentation we describe the process for the development of the MMI Device Ontology and illustrate extensions and applications that demonstrate the benefits of adopting this semantic approach, including example queries involving inference. We also highlight the issues encountered and future work.
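    Because every term in the ontology has a resolvable URI and the ORR exposes SPARQL query capabilities, client code can ask structural questions directly. The sketch below, using the SPARQLWrapper library, lists subclasses of a device class; the endpoint URL and class URI are assumptions for illustration and may not match the live ORR service or the ontology's actual namespace.

```python
# Hedged sketch: listing device subclasses via SPARQL. The endpoint and class
# URI are placeholders, not guaranteed to match the ORR or the MMI namespace.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://mmisw.org/sparql"                      # assumed endpoint
DEVICE_CLASS = "<http://example.org/mmi/device#Device>"    # placeholder class URI

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery(f"""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?device ?label WHERE {{
        ?device rdfs:subClassOf* {DEVICE_CLASS} ;
                rdfs:label ?label .
    }}
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["device"]["value"], "-", row["label"]["value"])
```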

  8. Exploring NASA GES DISC Data with Interoperable Services

    NASA Technical Reports Server (NTRS)

    Zhao, Peisheng; Yang, Wenli; Hegde, Mahabal; Wei, Jennifer C.; Kempler, Steven; Pham, Long; Teng, William; Savtchenko, Andrey

    2015-01-01

    Overview of NASA GES DISC (NASA Goddard Earth Science Data and Information Services Center) data with interoperable services. Open-standard and interoperable services: • improve data discoverability, accessibility, and usability with metadata, catalogue and portal standards; • achieve data, information and knowledge sharing across applications with standardized interfaces and protocols. Open Geospatial Consortium (OGC) data services and specifications: • Web Coverage Service (WCS) - data; • Web Map Service (WMS) - pictures of data; • Web Map Tile Service (WMTS) - pictures of data tiles; • Styled Layer Descriptors (SLD) - rendered styles.
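    To make the "pictures of data" services concrete, the sketch below issues a WMS GetMap request with the OWSLib library; the service URL and layer name are placeholders rather than a specific GES DISC endpoint.

```python
# Hedged sketch: requesting a "picture of data" from an OGC WMS with OWSLib.
# The service URL and layer name are placeholders, not a specific GES DISC endpoint.
from owslib.wms import WebMapService

wms = WebMapService("https://gesdisc.example.nasa.gov/wms", version="1.1.1")  # placeholder
print(list(wms.contents))                     # discover the advertised layers

img = wms.getmap(
    layers=["Example_Surface_Temperature"],   # placeholder layer name
    srs="EPSG:4326",
    bbox=(-180, -90, 180, 90),
    size=(1024, 512),
    format="image/png",
    transparent=True,
)
with open("map.png", "wb") as out:
    out.write(img.read())
```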

  9. Supporting Interoperability and Context-Awareness in E-Learning through Situation-Driven Learning Processes

    ERIC Educational Resources Information Center

    Dietze, Stefan; Gugliotta, Alessio; Domingue, John

    2009-01-01

    Current E-Learning technologies primarily follow a data and metadata-centric paradigm by providing the learner with composite content containing the learning resources and the learning process description, usually based on specific metadata standards such as ADL SCORM or IMS Learning Design. Due to the design-time binding of learning resources,…

  10. Definition of an ISO 19115 metadata profile for SeaDataNet II Cruise Summary Reports and its XML encoding

    NASA Astrophysics Data System (ADS)

    Boldrini, Enrico; Schaap, Dick M. A.; Nativi, Stefano

    2013-04-01

    SeaDataNet implements a distributed pan-European infrastructure for Ocean and Marine Data Management whose nodes are maintained by 40 national oceanographic and marine data centers from 35 countries riparian to all European seas. A single portal enables distributed discovery, visualization and access of the available sea data across all the member nodes. Geographic metadata play an important role in such an infrastructure, enabling efficient documentation and discovery of the resources of interest. In particular: - Common Data Index (CDI) metadata describe the sea datasets, including identification information (e.g. product title, area of interest), evaluation information (e.g. data resolution, constraints) and distribution information (e.g. download endpoint, download protocol); - Cruise Summary Reports (CSR) metadata describe cruises and field experiments at sea, including identification information (e.g. cruise title, name of the ship) and acquisition information (e.g. instruments used, number of samples taken). In the context of the second phase of SeaDataNet (the SeaDataNet 2 EU FP7 project, grant agreement 283607, started on October 1st, 2011 for a duration of 4 years) a major target is the setting, adoption and promotion of common international standards, to the benefit of outreach and interoperability with international initiatives and communities (e.g. OGC, INSPIRE, GEOSS, …). A standardization effort conducted by CNR with the support of MARIS, IFREMER, STFC, BODC and ENEA has led to the creation of an ISO 19115 metadata profile of CDI and its XML encoding based on ISO 19139. The CDI profile is now in its stable version and is being implemented and adopted by the SeaDataNet community tools and software. The effort has since continued to produce an ISO-based metadata model and XML encoding for CSR as well. The metadata elements included in the CSR profile belong to different models: - ISO 19115: e.g. cruise identification information, including title and area of interest; metadata responsible party information; - ISO 19115-2: e.g. acquisition information, including date of sampling and instruments used; - SeaDataNet: e.g. SeaDataNet community-specific elements, including EDMO and EDMERP code lists. Two main guidelines have been followed in drafting the metadata model: - All the obligations and constraints required by both the ISO standards and the INSPIRE directive had to be satisfied. These include the presence of specific elements with given cardinality (e.g. mandatory metadata date stamp, mandatory lineage information). - All the content information of the legacy CSR format had to be supported by the new metadata model. An XML encoding of the CSR profile has been defined as well. Based on the ISO 19139 XML schema and constraints, it adds the new elements specific to the SeaDataNet community. The associated Schematron rules are used to enforce constraints that cannot be expressed in the schema alone and to validate element content against the SeaDataNet code-list vocabularies.
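    As an illustration of the validation step described above, the sketch below runs a Schematron rule set against a CSR record with the lxml library. The file names are placeholders; the actual SeaDataNet rule sets and records are distributed by the project.

```python
# Hedged sketch: validating an XML metadata record against Schematron rules.
# File names are placeholders for the project's actual rule set and records.
from lxml import etree
from lxml.isoschematron import Schematron

rules = Schematron(etree.parse("csr_profile_rules.sch"))   # placeholder rule file
record = etree.parse("csr_record.xml")                     # placeholder CSR record

if rules.validate(record):
    print("Record satisfies the profile constraints")
else:
    # The SVRL report lists the failed asserts, e.g. an element whose content
    # is not found in the referenced code-list vocabulary.
    print(rules.validation_report)
```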

  11. Cross-organizational workflow in radiology: an empirical study of the quality of shared metadata elements in Region Västra Götaland, Sweden.

    PubMed

    Lindsköld, Lars; Wintell, Mikael; Edgren, Lars; Aspelin, Peter; Lundberg, Nina

    2013-07-01

    Challenges related to the cross-organizational access of accurate and timely information about a patient's condition have become a critical issue in healthcare, and interoperability of different local sources is necessary. The aim of this study was to identify and present missing and semantically incorrect data elements of metadata in the radiology enterprise service that supports cross-organizational sharing of dynamic information about patients' visits in Region Västra Götaland, Sweden. Quantitative data elements of metadata were collected yearly, from the first Wednesday in March, from 2006 to 2011 from the 24 in-house radiology departments in Region Västra Götaland. These radiology departments were organized into four hospital groups and three stand-alone hospitals. The included data elements of metadata were the patient name, patient ID, institutional department name, referring physician's name, and examination description. The majority of missing data elements of metadata were related to the institutional department name for Hospital 2, ranging from 87% in 2007 to 25% in 2011. All data elements of metadata except the patient ID contained semantic errors. For example, for the data element "patient name", only three names out of 3537 were semantically correct. This study shows that the semantics of metadata elements are poorly structured and inconsistently used. Although a cross-organizational solution may technically be fully functional, semantic errors may prevent it from serving as an information infrastructure for collaboration between all departments and hospitals in the region. For interoperability, it is important that the agreed semantic models are implemented in vendor systems using the information infrastructure.

  12. NCPP's Use of Standard Metadata to Promote Open and Transparent Climate Modeling

    NASA Astrophysics Data System (ADS)

    Treshansky, A.; Barsugli, J. J.; Guentchev, G.; Rood, R. B.; DeLuca, C.

    2012-12-01

    The National Climate Predictions and Projections (NCPP) Platform is developing comprehensive regional and local information about the evolving climate to inform decision making and adaptation planning. This includes both creating metadata and providing tools to create metadata about the models and processes used to generate its derived data products. NCPP is using the Common Information Model (CIM), an ontology developed by a broad set of international partners in climate research, as its metadata language. This use of a standard ensures interoperability within the climate community as well as permitting access to the ecosystem of tools and services emerging alongside the CIM. The CIM itself is divided into a general-purpose (UML & XML) schema, which structures metadata documents, and a project- or community-specific (XML) Controlled Vocabulary (CV), which constrains the content of metadata documents. NCPP has already modified the CIM Schema to accommodate downscaling models, simulations, and experiments. NCPP is currently developing a CV for use by the downscaling community. Incorporating downscaling into the CIM will lead to several benefits: easy access to the existing CIM documents describing CMIP5 models and simulations that are being downscaled; access to software tools that have been developed in order to search, manipulate, and visualize CIM metadata; and coordination with national and international efforts such as ES-DOC that are working to make climate model descriptions and datasets interoperable. Providing detailed metadata descriptions that include the full provenance of derived data products will contribute to making those data (and the models and processes which generated them) more open and transparent to the user community.

  13. Metadata mapping and reuse in caBIG.

    PubMed

    Kunz, Isaac; Lin, Ming-Chin; Frey, Lewis

    2009-02-05

    This paper proposes that interoperability across biomedical databases can be improved by utilizing a repository of Common Data Elements (CDEs), UML model class-attributes and simple lexical algorithms to facilitate the building of domain models. This is examined in the context of an existing system, the National Cancer Institute (NCI)'s cancer Biomedical Informatics Grid (caBIG). The goal is to demonstrate the deployment of open source tools that can be used to effectively map models and enable the reuse of existing information objects and CDEs in the development of new models for translational research applications. This effort is intended to help developers reuse appropriate CDEs to enable interoperability of their systems when developing within the caBIG framework or other frameworks that use metadata repositories. The Dice (di-grams) and Dynamic algorithms are compared, and both algorithms have similar performance matching UML model class-attributes to CDE class object-property pairs. With the algorithms used, the baselines for automatically finding the matches are reasonable for the data models examined. This suggests that automatic mapping of UML models and CDEs is feasible within the caBIG framework and potentially any framework that uses a metadata repository. This work opens up the possibility of using mapping algorithms to reduce the cost and time required to map local data models to a reference data model such as those used within caBIG. This effort contributes to facilitating the development of interoperable systems within caBIG as well as other metadata frameworks. Such efforts are critical to addressing the need to develop systems to handle enormous amounts of diverse data that can be leveraged from new biomedical methodologies.
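    For readers unfamiliar with the lexical matching referred to above, the sketch below shows Dice similarity computed over character di-grams, the general technique applied to UML class-attributes and CDE object-property pairs; the tokenization and thresholds used by the caBIG tooling may differ.

```python
# Minimal sketch of Dice similarity over character di-grams; illustrative only,
# the actual caBIG tooling may tokenize and threshold differently.
def bigrams(s: str) -> set:
    s = s.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}

def dice(a: str, b: str) -> float:
    ba, bb = bigrams(a), bigrams(b)
    if not ba or not bb:
        return 0.0
    return 2 * len(ba & bb) / (len(ba) + len(bb))

# Example: score a UML attribute name against two candidate CDE names.
print(dice("patientBirthDate", "Patient Birth Date"))      # high similarity
print(dice("patientBirthDate", "specimenCollectionSite"))  # low similarity
```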

  14. Interoperable Solar Data and Metadata via LISIRD 3

    NASA Astrophysics Data System (ADS)

    Wilson, A.; Lindholm, D. M.; Pankratz, C. K.; Snow, M. A.; Woods, T. N.

    2015-12-01

    LISIRD 3 is a major upgrade of the LASP Interactive Solar Irradiance Data Center (LISIRD), which serves several dozen space-based solar irradiance and related data products to the public. Through interactive plots, LISIRD 3 provides data browsing supported by data subsetting and aggregation. By incorporating a semantically enabled metadata repository, LISIRD 3 gives users current, vetted, consistent information about the datasets offered. Users can now also search for datasets based on metadata fields such as dataset type and/or spectral or temporal range. This semantic database enables metadata browsing, so users can discover the relationships between datasets, instruments, spacecraft, missions and PIs. The database also enables creation and publication of metadata records in a variety of formats, such as SPASE or ISO, making these datasets more discoverable. It further opens the possibility of a public SPARQL endpoint, making the metadata browsable in an automated fashion. LISIRD 3's data access middleware, LaTiS, provides dynamic, on-demand reformatting of data and timestamps, subsetting and aggregation, and other server-side functionality via a RESTful, OPeNDAP-compliant API, enabling interoperability between LASP datasets and many common tools. LISIRD 3's templated front-end design, coupled with the uniform data interface offered by LaTiS, allows easy integration of new datasets. Consequently the number and variety of datasets offered by LISIRD has grown to encompass several dozen, with many more to come. This poster will discuss the design and implementation of LISIRD 3, including tools used, capabilities enabled, and issues encountered.
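    A client of a LaTiS-style RESTful interface typically selects a dataset, an output format and a time constraint directly in the URL. The sketch below is an assumption-laden illustration: the base URL, dataset name and query syntax are placeholders and may not match the live LISIRD 3 service.

```python
# Hedged sketch: fetching a time subset from a LaTiS-style REST API as CSV.
# The base URL, dataset name and query syntax are assumptions for illustration.
import io
import pandas as pd
import requests

BASE = "https://lasp.colorado.edu/lisird/latis/dap"                    # assumed base URL
url = f"{BASE}/example_tsi_24hr.csv?time>=2010-01-01&time<2010-02-01"  # placeholder dataset

resp = requests.get(url, timeout=30)
resp.raise_for_status()
df = pd.read_csv(io.StringIO(resp.text))
print(df.head())
```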

  15. Interoperable web applications for sharing data and products of the International DORIS Service

    NASA Astrophysics Data System (ADS)

    Soudarin, L.; Ferrage, P.

    2017-12-01

    The International DORIS Service (IDS) was created in 2003 under the umbrella of the International Association of Geodesy (IAG) to foster scientific research related to the French satellite tracking system DORIS and to deliver scientific products, mostly related to the International Earth Rotation and Reference Systems Service (IERS). Since its start, the organization has continuously evolved, leading to additional and improved operational products from an expanded set of DORIS Analysis Centers. In addition, IDS has developed services for sharing data and products with the users. Metadata and interoperable web applications are proposed to explore, visualize and download the key products, such as the position time series of the geodetic points materialized at the ground tracking stations. The Global Geodetic Observing System (GGOS) encourages the IAG Services to develop such interoperable facilities on their websites. The objective for GGOS is to set up an interoperable portal through which the data and products produced by the IAG Services can be served to the user community. We present the web applications proposed by IDS to visualize time series of geodetic observables and to get information about the tracking ground stations and the tracked satellites. We discuss the future plans for IDS to meet the recommendations of GGOS. The presentation also addresses the need for the IAG Services to adopt a common metadata thesaurus to describe data and products, and interoperability standards to share them.

  16. The Value of Data and Metadata Standardization for Interoperability in Giovanni

    NASA Astrophysics Data System (ADS)

    Smit, C.; Hegde, M.; Strub, R. F.; Bryant, K.; Li, A.; Petrenko, M.

    2017-12-01

    Giovanni (https://giovanni.gsfc.nasa.gov/giovanni/) is a data exploration and visualization tool at the NASA Goddard Earth Sciences Data Information Services Center (GES DISC). It has been around in one form or another for more than 15 years. Giovanni calculates simple statistics and produces 22 different visualizations for more than 1600 geophysical parameters from more than 90 satellite and model products. Giovanni relies on external data format standards to ensure interoperability, including the NetCDF CF Metadata Conventions. Unfortunately, these standards were insufficient to make Giovanni's internal data representation truly simple to use. Finding and working with dimensions can be convoluted with the CF Conventions. Furthermore, the CF Conventions are silent on machine-friendly descriptive metadata such as the parameter's source product and product version. In order to simplify analyzing disparate earth science data parameters in a unified way, we developed Giovanni's internal standard. First, the format standardizes parameter dimensions and variables so they can be easily found. Second, the format adds all the machine-friendly metadata Giovanni needs to present our parameters to users in a consistent and clear manner. At a glance, users can grasp all the pertinent information about parameters both during parameter selection and after visualization. This poster gives examples of how our metadata and data standards, both external and internal, have both simplified our code base and improved our users' experiences.
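    To illustrate the kind of bookkeeping the internal convention removes, the sketch below shows how a tool relying on plain CF metadata has to hunt for coordinates by attribute value, with naive fallbacks when the attributes are absent; the file and variable names are illustrative only.

```python
# Minimal sketch: CF-style coordinate discovery by standard_name attribute,
# with naive fallbacks. File and variable names are illustrative only.
import xarray as xr

def find_coord(ds: xr.Dataset, standard_name: str):
    """Return the first variable whose CF standard_name matches, else None."""
    for name, var in ds.variables.items():
        if var.attrs.get("standard_name") == standard_name:
            return name
    return None

ds = xr.open_dataset("some_l3_product.nc")     # placeholder file
lat = find_coord(ds, "latitude") or "lat"      # fall back to a conventional name
lon = find_coord(ds, "longitude") or "lon"
time = find_coord(ds, "time") or "time"
print(lat, lon, time)
```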

  17. SeaDataNet - Pan-European infrastructure for marine and ocean data management: Unified access to distributed data sets (www.seadatanet.org)

    NASA Astrophysics Data System (ADS)

    Schaap, Dick M. A.; Maudire, Gilbert

    2010-05-01

    SeaDataNet is a leading infrastructure in Europe for marine and ocean data management. It is actively operating and further developing a pan-European infrastructure for managing, indexing and providing access to ocean and marine data sets and data products, acquired via research cruises and other observational activities, in situ and by remote sensing. The basis of SeaDataNet is interconnecting 40 National Oceanographic Data Centres and Marine Data Centers from 35 countries around the European seas into a distributed network of data resources with common standards for metadata, vocabularies, data transport formats, quality control methods and flags, and access. Most of the NODCs operate and/or are developing national networks with other institutes in their countries to ensure national coverage and long-term stewardship of available data sets. The majority of data managed by SeaDataNet partners concern physical oceanography, marine chemistry and hydrography, together with a substantial volume of marine biology, geology and geophysics. These are partly owned by the partner institutes themselves and for a major part also owned by other organizations from their countries. The SeaDataNet infrastructure is implemented with support of the EU via the EU FP6 SeaDataNet project to provide a pan-European data management system adapted both to the fragmented observation system and to the users' need for integrated access to data, metadata, products and services. The SeaDataNet project has a duration of 5 years and started in 2006, but builds upon earlier data management infrastructure projects undertaken over a period of 20 years by an expanding network of oceanographic data centres from the countries around all European seas. Its predecessor project Sea-Search had a strict focus on metadata. SeaDataNet maintains significant interest in the further development of the metadata infrastructure, extending its services with the provision of easy data access and generic data products. Version 1 of its infrastructure upgrade was launched in April 2008, and work is now well under way to include all 40 data centres at the V1 level. The infrastructure comprises the network of 40 interconnected data centres (NODCs) and a central SeaDataNet portal. V1 provides users with a unified and transparent overview of the metadata and controlled access to the large collections of data sets that are managed at these data centres. 
The SeaDataNet V1 infrastructure comprises the following middleware services: • Discovery services = Metadata directories and User interfaces • Vocabulary services = Common vocabularies and Governance • Security services = Authentication, Authorization & Accounting • Delivery services = Requesting and Downloading of data sets • Viewing services = Mapping of metadata • Monitoring services = Statistics on system usage and performance and Registration of data requests and transactions • Maintenance services = Entry and updating of metadata by data centres. Good progress is also being made with extending the SeaDataNet infrastructure with V2 services: • Viewing services = Quick views and Visualisation of data and data products • Product services = Generic and standard products • Exchange services = Transformation of SeaDataNet portal CDI output to INSPIRE compliance. As a basis for the V1 services, common standards have been defined for metadata and data formats, common vocabularies, quality flags, and quality control methods, based on international standards such as ISO 19115, OGC, NetCDF (CF) and ODV, on best practices from IOC and ICES, and following INSPIRE developments. An important objective of the SeaDataNet V1 infrastructure is to provide transparent access to the distributed data sets via a single user interface and download service. In the SeaDataNet V1 architecture the Common Data Index (CDI) V1 metadata service provides the link between discovery and delivery of data sets. The CDI user interface enables users to gain a detailed insight into the availability and geographical distribution of marine data archived at the connected data centres. It provides sufficient information to allow the user to assess the data relevance. Moreover, the CDI user interface provides the means for downloading data sets in common formats via a transaction mechanism. The SeaDataNet portal provides registered users access to these distributed data sets via the CDI V1 Directory and a shopping basket mechanism. This allows registered users to locate data of interest and submit their data requests. The requests are forwarded automatically from the portal to the relevant SeaDataNet data centres. This process is controlled via the Request Status Manager (RSM) Web Service at the portal and a Download Manager (DM) Java software module implemented at each of the data centres. The RSM also enables registered users to regularly check the status of their requests and to download data sets after access has been granted. Data centres can follow all transactions for their data sets online and can handle requests which require their consent. The actual delivery of data sets is done between the user and the selected data centre. Very good progress is being made with connecting all SeaDataNet data centres and their data sets to the CDI V1 system. At present the CDI V1 system provides users the functionality to discover and download more than 500,000 data sets, a number which is steadily increasing. The SeaDataNet architecture provides a coherent system of the various V1 services and inclusion of the V2 services. For the implementation, a range of technical components have been defined and developed. These make use of recent web technologies, and also comprise Java components, to provide multi-platform support and syntactic interoperability. To facilitate sharing of resources and interoperability, SeaDataNet has adopted the technology of SOAP Web services for various communication tasks. 
The SeaDataNet architecture has been designed as a multi-disciplinary system from the beginning. It is able to support a wide variety of data types and to serve several sector communities. SeaDataNet is willing to share its technologies and expertise, to spread and expand its approach, and to build bridges to other well established infrastructures in the marine domain. Therefore SeaDataNet has developed a strategy of seeking active cooperation on a national scale with other data holding organisations via its NODC networks and on an international scale with other European and international data management initiatives and networks. This is done with the objective to achieve a wider coverage of data sources and an overall interoperability between data infrastructures in the marine and ocean domains. Recent examples are e.g. the EU FP7 projects Geo-Seas for geology and geophysical data sets, UpgradeBlackSeaScene for a Black Sea data management infrastructure, CaspInfo for a Caspian Sea data management infrastructure, the EU EMODNET pilot projects, for hydrographic, chemical, and biological data sets. All projects are adopting the SeaDataNet standards and extending its services. Also active cooperation takes place with EuroGOOS and MyOcean in the domain of real-time and delayed mode metocean monitoring data. SeaDataNet Partners: IFREMER (France), MARIS (Netherlands), HCMR/HNODC (Greece), ULg (Belgium), OGS (Italy), NERC/BODC (UK), BSH/DOD (Germany), SMHI (Sweden), IEO (Spain), RIHMI/WDC (Russia), IOC (International), ENEA (Italy), INGV (Italy), METU (Turkey), CLS (France), AWI (Germany), IMR (Norway), NERI (Denmark), ICES (International), EC-DG JRC (International), MI (Ireland), IHPT (Portugal), RIKZ (Netherlands), RBINS/MUMM (Belgium), VLIZ (Belgium), MRI (Iceland), FIMR (Finland ), IMGW (Poland), MSI (Estonia), IAE/UL (Latvia), CMR (Lithuania), SIO/RAS (Russia), MHI/DMIST (Ukraine), IO/BAS (Bulgaria), NIMRD (Romania), TSU (Georgia), INRH (Morocco), IOF (Croatia), PUT (Albania), NIB (Slovenia), UoM (Malta), OC/UCY (Cyprus), IOLR (Israel), NCSR/NCMS (Lebanon), CNR-ISAC (Italy), ISMAL (Algeria), INSTM (Tunisia)

  18. Metadata mapping and reuse in caBIG™

    PubMed Central

    Kunz, Isaac; Lin, Ming-Chin; Frey, Lewis

    2009-01-01

    Background This paper proposes that interoperability across biomedical databases can be improved by utilizing a repository of Common Data Elements (CDEs), UML model class-attributes and simple lexical algorithms to facilitate the building of domain models. This is examined in the context of an existing system, the National Cancer Institute (NCI)'s cancer Biomedical Informatics Grid (caBIG™). The goal is to demonstrate the deployment of open source tools that can be used to effectively map models and enable the reuse of existing information objects and CDEs in the development of new models for translational research applications. This effort is intended to help developers reuse appropriate CDEs to enable interoperability of their systems when developing within the caBIG™ framework or other frameworks that use metadata repositories. Results The Dice (di-grams) and Dynamic algorithms are compared, and both algorithms have similar performance matching UML model class-attributes to CDE class object-property pairs. With the algorithms used, the baselines for automatically finding the matches are reasonable for the data models examined. This suggests that automatic mapping of UML models and CDEs is feasible within the caBIG™ framework and potentially any framework that uses a metadata repository. Conclusion This work opens up the possibility of using mapping algorithms to reduce the cost and time required to map local data models to a reference data model such as those used within caBIG™. This effort contributes to facilitating the development of interoperable systems within caBIG™ as well as other metadata frameworks. Such efforts are critical to addressing the need to develop systems to handle enormous amounts of diverse data that can be leveraged from new biomedical methodologies. PMID:19208192

  19. Design and implementation of a health data interoperability mediator.

    PubMed

    Kuo, Mu-Hsing; Kushniruk, Andre William; Borycki, Elizabeth Marie

    2010-01-01

    The objective of this study is to design and implement a common-gateway oriented mediator to solve the health data interoperability problems that exist among heterogeneous health information systems. The proposed mediator has three main components: (1) a Synonym Dictionary (SD) that stores a set of global metadata and terminologies to serve as the mapping intermediary, (2) a Semantic Mapping Engine (SME) that can be used to map metadata and instance semantics, and (3) a DB-to-XML module that translates source health data stored in a database into XML format and back. A routine admission notification data exchange scenario is used to test the efficiency and feasibility of the proposed mediator. The study results show that the proposed mediator can make health information exchange more efficient.
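    The three-part architecture described above can be sketched in a few lines of Python: a synonym dictionary that maps local field names to global metadata terms, a small mapping engine, and a row-to-XML translation step. All names and fields are illustrative, not the study's actual implementation.

```python
# Hypothetical sketch of the mediator's three parts; element and field names
# are illustrative only, not the study's actual implementation.
import xml.etree.ElementTree as ET

SYNONYM_DICTIONARY = {          # local term -> global metadata term
    "pt_name": "PatientName",
    "adm_date": "AdmissionDate",
    "ward": "CareUnit",
}

def map_semantics(row: dict) -> dict:
    """Semantic Mapping Engine (simplified): rename local fields to global terms."""
    return {SYNONYM_DICTIONARY.get(k, k): v for k, v in row.items()}

def row_to_xml(row: dict, root_tag: str = "AdmissionNotification") -> str:
    """DB-to-XML module (simplified): serialize one mapped record as XML."""
    root = ET.Element(root_tag)
    for field, value in map_semantics(row).items():
        ET.SubElement(root, field).text = str(value)
    return ET.tostring(root, encoding="unicode")

# Example: a routine admission notification built from one local database row.
print(row_to_xml({"pt_name": "Doe, Jane", "adm_date": "2010-03-02", "ward": "4B"}))
```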

  20. The XML Metadata Editor of GFZ Data Services

    NASA Astrophysics Data System (ADS)

    Ulbricht, Damian; Elger, Kirsten; Tesei, Telemaco; Trippanera, Daniele

    2017-04-01

    Following the FAIR data principles, research data should be Findable, Accessible, Interoperable and Reusable. Publishing data under these principles requires assigning persistent identifiers to the data and generating rich machine-actionable metadata. To increase interoperability, metadata should use shared vocabularies and crosslink the newly published (meta)data with related material. However, structured metadata formats tend to be complex and are not intended to be generated by individual scientists. Software solutions are needed that support scientists in providing metadata describing their data. To facilitate the data publication activities of 'GFZ Data Services', we programmed an XML metadata editor that assists scientists in creating metadata in different schemata popular in the earth sciences (ISO 19115, DIF, DataCite), while remaining usable by and understandable for scientists. Emphasis is placed on removing barriers: in particular, the editor is publicly available on the internet without registration [1], and scientists are not asked to provide information that can be generated automatically (e.g. the URL of a specific licence or the contact information of the metadata distributor). Metadata are stored in browser cookies and a copy can be saved to the local hard disk. To improve usability, form fields are translated into the scientific language, e.g. 'creators' of the DataCite schema are called 'authors'. To assist in filling in the form, we make use of drop-down menus for small vocabulary lists and offer a search facility for large thesauri. Explanations of form fields and definitions of vocabulary terms are provided in pop-up windows, and full documentation is available for download via the help menu. In addition, multiple geospatial references can be entered via an interactive mapping tool, which helps to minimize problems with different conventions for providing latitudes and longitudes. Currently, we are extending the metadata editor so that it can also generate the data discovery and contextual metadata developed by the 'Multi-scale Laboratories' Thematic Core Service of the European Plate Observing System (EPOS-IP). The editor will be used to build a common repository of a large variety of geological and geophysical datasets produced by multidisciplinary laboratories throughout Europe, thus contributing to a significant step toward the integration and accessibility of earth science data. This presentation will introduce the metadata editor and show the adjustments made for EPOS-IP. [1] http://dataservices.gfz-potsdam.de/panmetaworks/metaedit
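    For a sense of what such an editor emits, the sketch below assembles a minimal DataCite-style XML record from a few form values. Only a handful of elements are shown, the helper name is invented, and the full DataCite kernel imposes many more elements and rules than this sketch honours.

```python
# Hedged sketch: a minimal DataCite-style record built from form input.
# The helper and field selection are illustrative; the real kernel requires more.
import xml.etree.ElementTree as ET

NS = "http://datacite.org/schema/kernel-4"
ET.register_namespace("", NS)

def datacite_record(doi: str, title: str, creators: list, year: str) -> str:
    resource = ET.Element(f"{{{NS}}}resource")
    ident = ET.SubElement(resource, f"{{{NS}}}identifier", identifierType="DOI")
    ident.text = doi
    creators_el = ET.SubElement(resource, f"{{{NS}}}creators")
    for name in creators:                      # presented to scientists as "authors"
        c = ET.SubElement(creators_el, f"{{{NS}}}creator")
        ET.SubElement(c, f"{{{NS}}}creatorName").text = name
    titles = ET.SubElement(resource, f"{{{NS}}}titles")
    ET.SubElement(titles, f"{{{NS}}}title").text = title
    ET.SubElement(resource, f"{{{NS}}}publicationYear").text = year
    return ET.tostring(resource, encoding="unicode")

print(datacite_record("10.5880/EXAMPLE.1", "Sample dataset", ["Doe, Jane"], "2017"))
```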

  1. The Index to Marine and Lacustrine Geological Samples: Improving Sample Accessibility and Enabling Current and Future Research

    NASA Astrophysics Data System (ADS)

    Moore, C.

    2011-12-01

    The Index to Marine and Lacustrine Geological Samples is a community designed and maintained resource enabling researchers to locate and request sea floor and lakebed geologic samples archived by partner institutions. Conceived in the dawn of the digital age by representatives from U.S. academic and government marine core repositories and the NOAA National Geophysical Data Center (NGDC) at a 1977 meeting convened by the National Science Foundation (NSF), the Index is based on core concepts of community oversight, common vocabularies, consistent metadata and a shared interface. Form and content of underlying vocabularies and metadata continue to evolve according to the needs of the community, as do supporting technologies and access methodologies. The Curators Consortium, now international in scope, meets at partner institutions biennially to share ideas and discuss best practices. NGDC serves the group by providing database access and maintenance, a list server, digitizing support and long-term archival of sample metadata, data and imagery. Over three decades, participating curators have performed the herculean task of creating and contributing metadata for over 195,000 sea floor and lakebed cores, grabs, and dredges archived in their collections. Some partners use the Index for primary web access to their collections while others use it to increase exposure of more in-depth institutional systems. The Index is currently a geospatially-enabled relational database, publicly accessible via Web Feature and Web Map Services, and text- and ArcGIS map-based web interfaces. To provide as much knowledge as possible about each sample, the Index includes curatorial contact information and links to related data, information and images; 1) at participating institutions, 2) in the NGDC archive, and 3) at sites such as the Rolling Deck to Repository (R2R) and the System for Earth Sample Registration (SESAR). Over 34,000 International GeoSample Numbers (IGSNs) linking to SESAR are included in anticipation of opportunities for interconnectivity with Integrated Earth Data Applications (IEDA) systems. To promote interoperability and broaden exposure via the semantic web, NGDC is publishing lithologic classification schemes and terminology used in the Index as Simple Knowledge Organization System (SKOS) vocabularies, coordinating with R2R and the Consortium for Ocean Leadership for consistency. Availability in SKOS form will also facilitate use of the vocabularies in International Standards Organization (ISO) 19115-2 compliant metadata records. NGDC provides stewardship for the Index on behalf of U.S. repositories as the NSF designated "appropriate National Data Center" for data and metadata pertaining to sea floor samples as specified in the 2011 Division of Ocean Sciences Sample and Data Policy, and on behalf of international partners via a collocated World Data Center. NGDC operates on the Open Archival Information System (OAIS) reference model. Active Partners: Antarctic Marine Geology Research Facility, Florida State University; British Ocean Sediment Core Research Facility; Geological Survey of Canada; Integrated Ocean Drilling Program; Lamont-Doherty Earth Observatory; National Lacustrine Core Repository, University of Minnesota; Oregon State University; Scripps Institution of Oceanography; University of Rhode Island; U.S. Geological Survey; Woods Hole Oceanographic Institution.
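    To show what publishing a classification term as SKOS looks like in practice, the sketch below encodes one lithology concept with rdflib; the namespace URI and the term itself are placeholders rather than NGDC's published vocabularies.

```python
# Hedged sketch: one lithology term as a SKOS concept. The namespace and term
# are placeholders, not NGDC's published vocabularies.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, SKOS

LITH = Namespace("http://example.org/vocab/lithology/")    # placeholder namespace

g = Graph()
g.bind("skos", SKOS)
scheme = LITH["scheme"]
term = LITH["terrigenous_sand"]

g.add((scheme, RDF.type, SKOS.ConceptScheme))
g.add((scheme, SKOS.prefLabel, Literal("Lithologic classification", lang="en")))
g.add((term, RDF.type, SKOS.Concept))
g.add((term, SKOS.inScheme, scheme))
g.add((term, SKOS.prefLabel, Literal("terrigenous sand", lang="en")))
g.add((term, SKOS.definition, Literal("Sand-sized sediment of land-derived origin.", lang="en")))

print(g.serialize(format="turtle"))
```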

  2. A Spectrum of Interoperability: The Site for Science Prototype for the NSDL; Re-Inventing the Wheel? Standards, Interoperability and Digital Cultural Content; Preservation Risk Management for Web Resources: Virtual Remote Control in Cornell's Project Prism; Safekeeping: A Cooperative Approach to Building a Digital Preservation Resource; Object Persistence and Availability in Digital Libraries; Illinois Digital Cultural Heritage Community-Collaborative Interactions among Libraries, Museums and Elementary Schools.

    ERIC Educational Resources Information Center

    Arms, William Y.; Hillmann, Diane; Lagoze, Carl; Krafft, Dean; Marisa, Richard; Saylor, John; Terizzi, Carol; Van de Sompel, Herbert; Gill, Tony; Miller, Paul; Kenney, Anne R.; McGovern, Nancy Y.; Botticelli, Peter; Entlich, Richard; Payette, Sandra; Berthon, Hilary; Thomas, Susan; Webb, Colin; Nelson, Michael L.; Allen, B. Danette; Bennett, Nuala A.; Sandore, Beth; Pianfetti, Evangeline S.

    2002-01-01

    Discusses digital libraries, including interoperability, metadata, and international standards; Web resource preservation efforts at Cornell University; digital preservation at the National Library of Australia; object persistence and availability; collaboration among libraries, museums and elementary schools; Asian digital libraries; and a Web…

  3. Lessons Learned From 104 Years of Mobile Observatories

    NASA Astrophysics Data System (ADS)

    Miller, S. P.; Clark, P. D.; Neiswender, C.; Raymond, L.; Rioux, M.; Norton, C.; Detrick, R.; Helly, J.; Sutton, D.; Weatherford, J.

    2007-12-01

    As the oceanographic community ventures into a new era of integrated observatories, it may be helpful to look back on the era of "mobile observatories" to see what Cyberinfrastructure lessons might be learned. For example, SIO has been operating research vessels for 104 years, supporting a wide range of disciplines: marine geology and geophysics, physical oceanography, geochemistry, biology, seismology, ecology, fisheries, and acoustics. In the last 6 years progress has been made with diverse data types, formats and media, resulting in a fully-searchable online SIOExplorer Digital Library of more than 800 cruises (http://SIOExplorer.ucsd.edu). Public access to SIOExplorer is considerable, with 795,351 files (206 GB) downloaded last year. During the last 3 years the efforts have been extended to WHOI, with a "Multi-Institution Testbed for Scalable Digital Archiving" funded by the Library of Congress and NSF (IIS 0455998). The project has created a prototype digital library of data from both institutions, including cruises, Alvin submersible dives, and ROVs. In the process, the team encountered technical and cultural issues that will be facing the observatory community in the near future. Technological Lessons Learned: Shipboard data from multiple institutions are extraordinarily diverse, and provide a good training ground for observatories. Data are gathered from a wide range of authorities, laboratories, servers and media, with little documentation. Conflicting versions exist, generated by alternative processes. Domain- and institution-specific issues were addressed during initial staging. Data files were categorized and metadata harvested with automated procedures. With our second-generation approach to staging, we achieve higher levels of automation with greater use of controlled vocabularies. Database and XML- based procedures deal with the diversity of raw metadata values and map them to agreed-upon standard values, in collaboration with the Marine Metadata Interoperability (MMI) community. All objects are tagged with an expert level, thus serving an educational audience, as well as research users. After staging, publication into the digital library is completely automated. The technical challenges have been largely overcome, thanks to a scalable, federated digital library architecture from the San Diego Supercomputer Center, implemented at SIO, WHOI and other sites. The metadata design is flexible, supporting modular blocks of metadata tailored to the needs of instruments, samples, documents, derived products, cruises or dives, as appropriate. Controlled metadata vocabularies, with content and definitions negotiated by all parties, are critical. Metadata may be mapped to required external standards and formats, as needed. Cultural Lessons Learned: The cultural challenges have been more formidable than expected. They became most apparent during attempts to categorize and stage digital data objects across two institutions, each with their own naming conventions and practices, generally undocumented, and evolving across decades. Whether the questions concerned data ownership, collection techniques, data diversity or institutional practices, the solution involved a joint discussion with scientists, data managers, technicians and archivists, working together. Because metadata discussions go on endlessly, significant benefit comes from dictionaries with definitions of all community-authorized metadata values.

  4. Publishing NASA Metadata as Linked Open Data for Semantic Mashups

    NASA Astrophysics Data System (ADS)

    Wilson, Brian; Manipon, Gerald; Hua, Hook

    2014-05-01

    Data providers are now publishing more metadata in more interoperable forms, e.g. Atom or RSS 'casts', as Linked Open Data (LOD), or as ISO Metadata records. A major effort on the part of NASA's Earth Science Data and Information System (ESDIS) project is the aggregation of metadata that enables greater data interoperability among scientific data sets regardless of source or application. Both the Earth Observing System (EOS) ClearingHOuse (ECHO) and the Global Change Master Directory (GCMD) repositories contain metadata records for NASA (and other) datasets and provided services. These records contain typical fields for each dataset (or software service) such as the source, creation date, cognizant institution, related access URLs, and domain and variable keywords to enable discovery. Under a NASA ACCESS grant, we demonstrated how to publish the ECHO and GCMD dataset and services metadata as LOD in the RDF format. Both sets of metadata are now queryable at SPARQL endpoints and available for integration into "semantic mashups" in the browser. It is straightforward to reformat sets of XML metadata, including ISO, into simple RDF and then later refine and improve the RDF predicates by reusing known namespaces such as Dublin Core, GeoRSS, etc. All scientific metadata should be part of the LOD world. In addition, we developed an "instant" drill-down and browse interface that provides faceted navigation so that the user can discover and explore the 25,000 datasets and 3000 services. The available facets and the free-text search box appear in the left panel, and the instantly updated results for the dataset search appear in the right panel. The user can constrain the value of a metadata facet simply by clicking on a word (or phrase) in the "word cloud" of values for each facet. The display section for each dataset includes the important metadata fields, a full description of the dataset, potentially some related URLs, and a "search" button that points to an OpenSearch GUI that is pre-configured to search for granules within the dataset. We will present our experiences with converting NASA metadata into LOD, discuss the challenges, illustrate some of the enabled mashups, and demonstrate the latest version of the "instant browse" interface for navigating multiple metadata collections.
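
    Because the converted metadata are exposed at SPARQL endpoints, a client can retrieve dataset descriptions with a few lines of code. The sketch below assumes the SPARQLWrapper package and a Dublin Core title predicate (plausible given the namespaces mentioned above); the endpoint URL is a placeholder, not the actual ECHO or GCMD service:

```python
# Minimal sketch of querying dataset metadata published as Linked Open Data.
# The endpoint URL is a placeholder; the dct:title predicate is assumed here
# because the abstract mentions reuse of Dublin Core terms.
from SPARQLWrapper import SPARQLWrapper, JSON  # pip install sparqlwrapper

ENDPOINT = "https://example.org/metadata/sparql"  # hypothetical SPARQL endpoint

query = """
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?dataset ?title
WHERE { ?dataset dct:title ?title . }
LIMIT 10
"""

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery(query)
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
for row in results["results"]["bindings"]:
    print(row["dataset"]["value"], "-", row["title"]["value"])
```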

  5. Ocean Data Interoperability Platform (ODIP): developing a common global framework for marine data management through international collaboration

    NASA Astrophysics Data System (ADS)

    Glaves, Helen

    2015-04-01

    Marine research is rapidly moving away from traditional discipline specific science to a wider ecosystem level approach. This more multidisciplinary approach to ocean science requires large amounts of good quality, interoperable data to be readily available for use in an increasing range of new and complex applications. Significant amounts of marine data and information are already available throughout the world as a result of e-infrastructures being established at a regional level to manage and deliver marine data to the end user. However, each of these initiatives has been developed to address specific regional requirements and independently of those in other regions. Establishing a common framework for marine data management on a global scale necessitates that there is interoperability across these existing data infrastructures and active collaboration between the organisations responsible for their management. The Ocean Data Interoperability Platform (ODIP) project is promoting co-ordination between a number of these existing regional e-infrastructures including SeaDataNet and Geo-Seas in Europe, the Integrated Marine Observing System (IMOS) in Australia, the Rolling Deck to Repository (R2R) in the USA and the international IODE initiative. To demonstrate this co-ordinated approach the ODIP project partners are currently working together to develop several prototypes to test and evaluate potential interoperability solutions for solving the incompatibilities between the individual regional marine data infrastructures. However, many of the issues being addressed by the Ocean Data Interoperability Platform are not specific to marine science. For this reason many of the outcomes of this international collaborative effort are equally relevant and transferable to other domains.

  6. Inter-University Upper Atmosphere Global Observation Network (IUGONET) Metadata Database and Its Interoperability

    NASA Astrophysics Data System (ADS)

    Yatagai, A. I.; Iyemori, T.; Ritschel, B.; Koyama, Y.; Hori, T.; Abe, S.; Tanaka, Y.; Shinbori, A.; Umemura, N.; Sato, Y.; Yagi, M.; Ueno, S.; Hashiguchi, N. O.; Kaneda, N.; Belehaki, A.; Hapgood, M. A.

    2013-12-01

    IUGONET is a Japanese program to build a metadata database for ground-based observations of the upper atmosphere [1]. The project began in 2009 with five Japanese institutions which archive data observed by radars, magnetometers, photometers, radio telescopes and helioscopes, and so on, at various altitudes from the Earth's surface to the Sun. Systems have been developed to allow searching of the above-described metadata. We have been updating the system and adding new and updated metadata. The IUGONET development team adopted the SPASE metadata model [2] to describe the upper atmosphere data. This model is used as the common metadata format by the virtual observatories for solar-terrestrial physics. It includes metadata referring to each data file (called a 'Granule'), which enable a search for data files as well as data sets. Further details are described in [2] and [3]. Currently, three additional Japanese institutions are being incorporated into IUGONET. Furthermore, metadata of observations of the troposphere, taken at the observatories of the middle and upper atmosphere radar at Shigaraki and the meteor radar in Indonesia, have been incorporated. These additions will contribute to efficient interdisciplinary scientific research. At the beginning of 2013, the registration of the 'Observatory' and 'Instrument' metadata was completed, which makes it easy to get an overview of the metadata database. The number of registered metadata records as of the end of July totalled 8.8 million, including 793 observatories and 878 instruments. It is important to promote interoperability and/or metadata exchange between the database development groups. A memorandum of agreement, providing a framework for formal collaboration, has been signed with the European Near-Earth Space Data Infrastructure for e-Science (ESPAS) project, which has objectives similar to those of IUGONET. Furthermore, observations by satellites and the International Space Station are being incorporated with a view to making and linking metadata databases. The development of effective data systems will contribute to the progress of scientific research on solar-terrestrial physics, climate and the geophysical environment. Any kind of cooperation, metadata input and feedback, especially for linkage of the databases, is welcomed. References 1. Hayashi, H. et al., Inter-university Upper Atmosphere Global Observation Network (IUGONET), Data Sci. J., 12, WDS179-184, 2013. 2. King, T. et al., SPASE 2.0: A standard data model for space physics. Earth Sci. Inform. 3, 67-73, 2010, doi:10.1007/s12145-010-0053-4. 3. Hori, T., et al., Development of IUGONET metadata format and metadata management system. J. Space Sci. Info. Jpn., 105-111, 2012. (in Japanese)

  7. Assessing Public Metabolomics Metadata, Towards Improving Quality.

    PubMed

    Ferreira, João D; Inácio, Bruno; Salek, Reza M; Couto, Francisco M

    2017-12-13

    Public resources need to be appropriately annotated with metadata in order to make them discoverable, reproducible and traceable, further enabling them to be interoperable or integrated with other datasets. While data-sharing policies exist to promote the annotation process by data owners, these guidelines are still largely ignored. In this manuscript, we analyse automatic measures of metadata quality, and suggest their application as a means to encourage data owners to increase the metadata quality of their resources and submissions, thereby contributing to higher-quality data, improved data sharing, and the overall accountability of scientific publications. We analyse these metadata quality measures in the context of a real-world repository of metabolomics data (i.e. MetaboLights), including a manual validation of the measures, and an analysis of their evolution over time. Our findings suggest that the proposed measures can be used to mimic a manual assessment of metadata quality.
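
    One simple automatic measure of the kind discussed above is field completeness: the fraction of expected metadata fields that are actually filled in. The sketch below is an illustrative stand-in rather than the measures analysed in the paper, and the required fields are hypothetical:

```python
# Illustrative sketch of one simple automatic metadata-quality measure
# (field completeness). This is not the measure set from the paper; the
# required fields below are hypothetical examples.

REQUIRED_FIELDS = ["title", "description", "organism", "instrument", "contact_email"]

def completeness(record: dict) -> float:
    """Fraction of required metadata fields that are present and non-empty."""
    filled = sum(1 for f in REQUIRED_FIELDS if str(record.get(f, "")).strip())
    return filled / len(REQUIRED_FIELDS)

if __name__ == "__main__":
    submission = {
        "title": "Untargeted LC-MS study",
        "description": "",
        "organism": "Homo sapiens",
        "instrument": "Q Exactive",
    }
    score = completeness(submission)
    print(f"metadata completeness: {score:.2f}")  # 0.60 for this example
```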

  8. Community-Driven Initiatives to Achieve Interoperability for Ecological and Environmental Data

    NASA Astrophysics Data System (ADS)

    Madin, J.; Bowers, S.; Jones, M.; Schildhauer, M.

    2007-12-01

    Advances in ecology and environmental science increasingly depend on information from multiple disciplines to tackle broader and more complex questions about the natural world. Such advances, however, are hindered by data heterogeneity, which impedes the ability of researchers to discover, interpret, and integrate relevant data that have been collected by others. Here, we outline two community-building initiatives for improving data interoperability in the ecological and environmental sciences, one that is well-established (the Ecological Metadata Language [EML]), and another that is actively underway (a unified model for observations and measurements). EML is a metadata specification developed for the ecology discipline, and is based on prior work done by the Ecological Society of America and associated efforts to ensure a modular and extensible framework to document ecological data. EML "modules" are designed to describe one logical part of the total metadata that should be included with any ecological dataset. EML was developed through a series of working meetings, ongoing discussion forums and email lists, with participation from a broad range of ecological and environmental scientists, as well as computer scientists and software developers. Where possible, EML adopted syntax from metadata standards in other disciplines (e.g., Dublin Core, the Content Standard for Digital Geospatial Metadata, and others). Although EML has not yet been ratified through a standards body, it has become the de facto metadata standard for a large range of ecological data management projects, including the Long Term Ecological Research Network, the National Center for Ecological Analysis and Synthesis, and the Ecological Society of America. The second community-building initiative is based on work through the Scientific Environment for Ecological Knowledge (SEEK) as well as a recent workshop on multi-disciplinary data management. This initiative aims at improving interoperability by describing the semantics of data at the level of observation and measurement (rather than the traditional focus at the level of the data set) and will define the necessary specifications and technologies to facilitate semantic interpretation and integration of observational data for the environmental sciences. As such, this initiative will focus on unifying the various existing approaches for representing and describing observation data (e.g., SEEK's Observation Ontology, CUAHSI's Observation Data Model, NatureServe's Observation Data Standard, to name a few). Products of this initiative will be compatible with existing standards and build upon recent advances in knowledge representation (e.g., W3C's recommended Web Ontology Language, OWL) that have demonstrated practical utility in enhancing scientific communication and data interoperability in other communities (e.g., the genomics community). A community-sanctioned, extensible, and unified model for observational data will support metadata standards such as EML while reducing the "babel" of scientific dialects that currently impede effective data integration, which will in turn provide a strong foundation for enabling cross-disciplinary synthetic research in the ecological and environmental sciences.

  9. Increasing the international visibility of research data by a joint metadata schema

    NASA Astrophysics Data System (ADS)

    Svoboda, Nikolai; Zoarder, Muquit; Gärtner, Philipp; Hoffmann, Carsten; Heinrich, Uwe

    2017-04-01

    The BonaRes Project ("Soil as a sustainable resource for the bioeconomy") was launched in 2015 to promote sustainable soil management and to avoid fragmentation of efforts (Wollschläger et al., 2016). For this purpose, an IT infrastructure is being developed to upload, manage, store, and provide research data and its associated metadata. The research data provided by the BonaRes data centre are, in principle, not subject to any restrictions on reuse. For all research data, comprehensive standardized metadata are the key enabler of effective use of these data. Providing proper metadata is often viewed as an extra burden that consumes additional work and resources. In our presentation we underline the benefits of structured and interoperable metadata, such as accessibility of data, discovery of data, interpretation of data, linking of data and several more, and we weigh these advantages against the costs in time, personnel and other resources. Building on this, we describe the framework of metadata in BonaRes, combining the OGC standards for description, visualization, exchange and discovery of geodata with the DataCite schema for the publication and citation of this research data. This enables the generation of a DOI, a unique identifier that provides a permanent link to the citable research data. By using OGC standards, data and metadata become interoperable with numerous research data provided via INSPIRE. It enables further services such as CSW for harvesting, WMS for visualization and WFS for downloading. We explain the mandatory fields that result from our approach and we give a general overview of our metadata architecture implementation. Literature: Wollschläger, U; Helming, K.; Heinrich, U.; Bartke, S.; Kögel-Knabner, I.; Russell, D.; Eberhardt, E. & Vogel, H.-J.: The BonaRes Centre - A virtual institute for soil research in the context of a sustainable bio-economy. Geophysical Research Abstracts, Vol. 18, EGU2016-9087, 2016.
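
    The CSW harvesting service mentioned above follows the standard OGC CSW 2.0.2 interface, so records can be retrieved with an ordinary GetRecords request. The sketch below uses the generic KVP binding; the catalogue URL is a placeholder, not the BonaRes endpoint:

```python
# Minimal sketch of harvesting metadata records from an OGC CSW 2.0.2 catalogue
# via the standard KVP GetRecords operation. The endpoint URL is a placeholder,
# not the actual BonaRes service.
import requests
import xml.etree.ElementTree as ET

CSW_ENDPOINT = "https://example.org/csw"  # hypothetical catalogue endpoint

params = {
    "service": "CSW",
    "version": "2.0.2",
    "request": "GetRecords",
    "typeNames": "csw:Record",
    "elementSetName": "summary",
    "resultType": "results",
    "maxRecords": "10",
}

response = requests.get(CSW_ENDPOINT, params=params, timeout=30)
response.raise_for_status()

# Dublin Core title elements appear in the returned csw:SummaryRecord entries.
root = ET.fromstring(response.content)
DC = "{http://purl.org/dc/elements/1.1/}"
for title in root.iter(f"{DC}title"):
    print(title.text)
```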

  10. ECHO Services: Foundational Middleware for a Science Cyberinfrastructure

    NASA Technical Reports Server (NTRS)

    Burnett, Michael

    2005-01-01

    This viewgraph presentation describes ECHO, an interoperability middleware solution. It uses open, XML-based APIs, and supports net-centric architectures and solutions. ECHO has a set of interoperable registries for both data (metadata) and services, and provides user accounts and a common infrastructure for the registries. It is built upon a layered architecture with an extensible infrastructure for supporting community-unique protocols. It has been operational since November 2002 and is available as open source.

  11. Ocean Data Interoperability Platform (ODIP): developing a common framework for marine data management on a global scale

    NASA Astrophysics Data System (ADS)

    Glaves, Helen; Schaap, Dick

    2016-04-01

    The increasingly ocean basin level approach to marine research has led to a corresponding rise in the demand for large quantities of high quality interoperable data. This requirement for easily discoverable and readily available marine data is currently being addressed by initiatives such as SeaDataNet in Europe, Rolling Deck to Repository (R2R) in the USA and the Australian Ocean Data Network (AODN) with each having implemented an e-infrastructure to facilitate the discovery and re-use of standardised multidisciplinary marine datasets available from a network of distributed repositories, data centres etc. within their own region. However, these regional data systems have been developed in response to the specific requirements of their users and in line with the priorities of the funding agency. They have also been created independently of the marine data infrastructures in other regions often using different standards, data formats, technologies etc. that make integration of marine data from these regional systems for the purposes of basin level research difficult. Marine research at the ocean basin level requires a common global framework for marine data management which is based on existing regional marine data systems but provides an integrated solution for delivering interoperable marine data to the user. The Ocean Data Interoperability Platform (ODIP/ODIP II) project brings together those responsible for the management of the selected marine data systems and other relevant technical experts with the objective of developing interoperability across the regional e-infrastructures. The commonalities and incompatibilities between the individual data infrastructures are identified and then used as the foundation for the specification of prototype interoperability solutions which demonstrate the feasibility of sharing marine data across the regional systems and also with relevant larger global data services such as GEO, COPERNICUS, IODE, POGO etc. The potential impact for the individual regional data infrastructures of implementing these prototype interoperability solutions is also being evaluated to determine both the technical and financial implications of their integration within existing systems. These impact assessments form part of the strategy to encourage wider adoption of the ODIP solutions and approach beyond the current scope of the project which is focussed on regional marine data systems in Europe, Australia, the USA and, more recently, Canada.

  12. Department of the Interior metadata implementation guide—Framework for developing the metadata component for data resource management

    USGS Publications Warehouse

    Obuch, Raymond C.; Carlino, Jennifer; Zhang, Lin; Blythe, Jonathan; Dietrich, Christopher; Hawkinson, Christine

    2018-04-12

    The Department of the Interior (DOI) is a Federal agency with over 90,000 employees across 10 bureaus and 8 agency offices. Its primary mission is to protect and manage the Nation’s natural resources and cultural heritage; provide scientific and other information about those resources; and honor its trust responsibilities or special commitments to American Indians, Alaska Natives, and affiliated island communities. Data and information are critical in day-to-day operational decision making and scientific research. DOI is committed to creating, documenting, managing, and sharing high-quality data and metadata in and across its various programs that support its mission. Documenting data through metadata is essential in realizing the value of data as an enterprise asset. The completeness, consistency, and timeliness of metadata affect users’ ability to search for and discover the most relevant data for the intended purpose, and facilitate the interoperability and usability of these data among DOI bureaus and offices. Fully documented metadata describe data usability, quality, accuracy, provenance, and meaning. Across DOI, there are different maturity levels and phases of information and metadata management implementations. The Department has organized a committee consisting of bureau-level points of contact to collaborate on the development of more consistent, standardized, and more effective metadata management practices and guidance to support this shared mission and the information needs of the Department. DOI’s metadata implementation plans establish key roles and responsibilities associated with metadata management processes, procedures, and a series of actions defined in three major metadata implementation phases: (1) the Getting Started (Planning) Phase, (2) the Implementing and Maintaining Operational Metadata Management Phase, and (3) the Next Steps towards Improving Metadata Management Phase. DOI’s phased approach for metadata management addresses some of the major data and metadata management challenges that exist across the diverse missions of the bureaus and offices. All employees who create, modify, or use data are involved with data and metadata management. Identifying, establishing, and formalizing the roles and responsibilities associated with metadata management are key to institutionalizing a framework of best practices, methodologies, processes, and common approaches throughout all levels of the organization; these are the foundation for effective data resource management. For executives and managers, metadata management strengthens their overarching views of data assets, holdings, and data interoperability, and clarifies how metadata management can help accelerate compliance with multiple policy mandates. For employees, data stewards, and data professionals, formalized metadata management will help with the consistency of definitions and approaches addressing data discoverability, data quality, and data lineage. In addition to data professionals and others associated with information technology, data stewards and program subject matter experts take on important metadata management roles and responsibilities as data flow through their respective business and science-related workflows. The responsibilities of establishing, practicing, and governing the actions associated with their specific metadata management roles are critical to successful metadata implementation.

  13. The eXtensible ontology development (XOD) principles and tool implementation to support ontology interoperability.

    PubMed

    He, Yongqun; Xiang, Zuoshuang; Zheng, Jie; Lin, Yu; Overton, James A; Ong, Edison

    2018-01-12

    Ontologies are critical to data/metadata and knowledge standardization, sharing, and analysis. With hundreds of biological and biomedical ontologies developed, it has become critical to ensure ontology interoperability and the usage of interoperable ontologies for standardized data representation and integration. The suite of web-based Ontoanimal tools (e.g., Ontofox, Ontorat, and Ontobee) supports different aspects of extensible ontology development. By summarizing the common features of Ontoanimal and other similar tools, we identified and proposed an "eXtensible Ontology Development" (XOD) strategy and its associated four principles. These XOD principles call for reusing existing terms and semantic relations from reliable ontologies, developing and applying well-established ontology design patterns (ODPs), and involving community efforts to support new ontology development, thereby promoting standardized and interoperable data and knowledge representation and integration. The adoption of the XOD strategy, together with robust XOD tool development, will greatly support ontology interoperability and robust ontology applications to support data to be Findable, Accessible, Interoperable and Reusable (i.e., FAIR).
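
    The first XOD principle, reusing existing terms rather than redefining them, can be sketched with rdflib: a new class simply references an existing term by its URI. The ontology namespaces and term URIs below are placeholders, not terms from a specific published ontology:

```python
# Sketch of the term-reuse idea behind the XOD principles: a new ontology class
# is declared by reference to an existing term's URI instead of redefining it.
# The URIs below are placeholders, not terms from a specific published ontology.
from rdflib import Graph, Namespace, URIRef, Literal
from rdflib.namespace import RDF, RDFS, OWL

EX = Namespace("https://example.org/myonto/")              # hypothetical new ontology
REUSED_TERM = URIRef("https://example.org/refonto/0001")   # existing term, reused by URI

g = Graph()
g.bind("ex", EX)

new_class = EX["MarineSensorAssay"]
g.add((new_class, RDF.type, OWL.Class))
g.add((new_class, RDFS.subClassOf, REUSED_TERM))   # reuse, do not duplicate
g.add((new_class, RDFS.label, Literal("marine sensor assay", lang="en")))

print(g.serialize(format="turtle"))
```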

  14. Applying Sensor Web Technology to Marine Sensor Data

    NASA Astrophysics Data System (ADS)

    Jirka, Simon; del Rio, Joaquin; Mihai Toma, Daniel; Nüst, Daniel; Stasch, Christoph; Delory, Eric

    2015-04-01

    In this contribution we present two activities illustrating how Sensor Web technology helps to enable flexible and interoperable sharing of marine observation data based on standards. An important foundation is the Sensor Web Architecture developed by the European FP7 project NeXOS (Next generation Low-Cost Multifunctional Web Enabled Ocean Sensor Systems Empowering Marine, Maritime and Fisheries Management). This architecture relies on the Open Geospatial Consortium's (OGC) Sensor Web Enablement (SWE) framework. It is an exemplary solution for facilitating the interoperable exchange of marine observation data within and between (research) organisations. The architecture addresses a series of functional and non-functional requirements which are fulfilled through different types of OGC SWE components. The diverse functionalities offered by the NeXOS Sensor Web architecture are shown in the following overview:
    - Pull-based observation data download: This is achieved through the OGC Sensor Observation Service (SOS) 2.0 interface standard.
    - Push-based delivery of observation data, allowing users to subscribe to new measurements that are relevant for them: For this purpose there are currently several specification activities under evaluation (e.g. OGC Sensor Event Service, OGC Publish/Subscribe Standards Working Group).
    - (Web-based) visualisation of marine observation data: Implemented through SOS client applications.
    - Configuration and control of sensor devices: This is ensured through the OGC Sensor Planning Service 2.0 interface.
    - Bridging between sensors/data loggers and Sensor Web components: For this purpose several components such as the "Smart Electronic Interface for Sensor Interoperability" (SEISI) concept are developed; this is complemented by a more lightweight SOS extension (e.g. based on the W3C Efficient XML Interchange (EXI) format).
    To further advance this architecture, there is on-going work to develop dedicated profiles of selected OGC SWE specifications that provide stricter guidance on how these standards shall be applied to marine data (e.g. SensorML 2.0 profiles stating which metadata elements are mandatory, building upon the ESONET Sensor Registry developments, etc.). Within the NeXOS project the presented architecture is implemented as a set of open source components. These implementations can be re-used by all interested scientists and data providers needing tools for publishing or consuming oceanographic sensor data. In further projects such as the European project FixO3 (Fixed-point Open Ocean Observatories), these software development activities are complemented with additional efforts to provide guidance on how Sensor Web technology can be applied in an efficient manner. This way, not only software components are made available but also documentation and information resources that help to understand which types of Sensor Web deployments are best suited to fulfil different types of user requirements.
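
    The pull-based download path listed above relies on the OGC SOS 2.0 interface, whose KVP binding can be exercised with a plain HTTP request. In the sketch below the service URL, offering and observed property are placeholders rather than identifiers from the NeXOS deployments:

```python
# Minimal sketch of a pull-based data download from an OGC SOS 2.0 service using
# the standard KVP binding. The service URL, offering and observed property are
# placeholders, not identifiers from the NeXOS deployments.
import requests

SOS_ENDPOINT = "https://example.org/sos"  # hypothetical SOS 2.0 endpoint

params = {
    "service": "SOS",
    "version": "2.0.0",
    "request": "GetObservation",
    "offering": "example-ferrybox-offering",
    "observedProperty": "sea_water_temperature",
    "temporalFilter": "om:phenomenonTime,2015-01-01T00:00:00Z/2015-01-02T00:00:00Z",
}

response = requests.get(SOS_ENDPOINT, params=params, timeout=30)
response.raise_for_status()
print(response.text[:500])  # O&M-encoded observations (XML) for further parsing
```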

  15. Metadata, Identifiers, and Physical Samples

    NASA Astrophysics Data System (ADS)

    Arctur, D. K.; Lenhardt, W. C.; Hills, D. J.; Jenkyns, R.; Stroker, K. J.; Todd, N. S.; Dassie, E. P.; Bowring, J. F.

    2016-12-01

    Physical samples are integral to much of the research conducted by geoscientists. The samples used in this research are often obtained at significant cost and represent an important investment for future research. However, making information about samples - whether considered data or metadata - available for researchers to enable discovery is difficult: a number of key elements related to samples are difficult to characterize in common ways, such as classification, location, sample type, sampling method, repository information, subsample distribution, and instrumentation, because these differ from one domain to the next. Unifying these elements or developing metadata crosswalks is needed. The iSamples (Internet of Samples) NSF-funded Research Coordination Network (RCN) is investigating ways to develop these types of interoperability and crosswalks. Within the iSamples RCN, one of its working groups, WG1, has focused on the metadata related to physical samples. This includes identifying existing metadata standards and systems, and how they might interoperate with the International Geo Sample Number (IGSN) schema (schema.igsn.org) in order to help inform leading practices for metadata. For example, we are examining lifecycle metadata beyond the IGSN 'birth certificate'. As a first step, this working group is developing a list of relevant standards and comparing their various attributes. In addition, the working group is looking toward technical solutions to facilitate developing a linked set of registries to build the web of samples. Finally, the group is also developing a comparison of sample identifiers and locators. This paper will provide an overview and comparison of the standards identified thus far, as well as an update on the technical solutions examined for integration. We will discuss how various sample identifiers might work in complementary fashion with the IGSN to more completely describe samples, facilitate retrieval of contextual information, and access research work on related samples. Finally, we welcome suggestions and community input to move physical sample unique identifiers forward.

  16. NCI's national environmental research data collection: metadata management built on standards and preparing for the semantic web

    NASA Astrophysics Data System (ADS)

    Wang, Jingbo; Bastrakova, Irina; Evans, Ben; Gohar, Kashif; Santana, Fabiana; Wyborn, Lesley

    2015-04-01

    National Computational Infrastructure (NCI) manages national environmental research data collections (10+ PB) as part of its specialized high performance data node of the Research Data Storage Infrastructure (RDSI) program. We manage 40+ data collections using NCI's Data Management Plan (DMP), which is compatible with the ISO 19100 metadata standards. We utilize ISO standards to make sure our metadata is transferable and interoperable for sharing and harvesting. The DMP is used, along with metadata from the data itself, to create a hierarchy of data collection, dataset and time-series catalogues that is then exposed through GeoNetwork for standard discoverability. These hierarchical catalogues are linked using parent-child relationships. The hierarchical infrastructure of our GeoNetwork catalogue system aims to address both discoverability and in-house administrative use-cases. At NCI, we are currently improving the metadata interoperability in our catalogue by linking with standardized community vocabulary services. These emerging vocabulary services are being established to help harmonise data from different national and international scientific communities. One such vocabulary service is currently being established by the Australian National Data Service (ANDS). Data citation is another important aspect of the NCI data infrastructure, which allows tracking of data usage and infrastructure investment, encourages data sharing, and increases trust in research that relies on these data collections. We incorporate the standard vocabularies into the data citation metadata so that the data citations become machine readable and semantically friendly for web-search purposes as well. By standardizing our metadata structure across our entire data corpus, we are laying the foundation to enable the application of appropriate semantic mechanisms to enhance discovery and analysis of NCI's national environmental research data information. We expect that this will further increase data discoverability and encourage data sharing and reuse within the community, increasing the value of the data well beyond its current use.

  17. Enabling the Integrated Assessment of Large Marine Ecosystems: Informatics to the Forefront of Science-Based Decision Support

    NASA Astrophysics Data System (ADS)

    Di Stefano, M.; Fox, P. A.; Beaulieu, S. E.; Maffei, A. R.; West, P.; Hare, J. A.

    2012-12-01

    Integrated assessments of large marine ecosystems require the understanding of interactions between environmental, ecological, and socio-economic factors that affect production and utilization of marine natural resources. Assessing the functioning of complex coupled natural-human systems calls for collaboration between natural and social scientists across disciplinary and national boundaries. We are developing a platform to implement and sustain informatics solutions for these applications, providing interoperability among very diverse and heterogeneous data and information sources, as well as multi-disciplinary organizations and people. We have partnered with NOAA NMFS scientists to facilitate the deployment of an integrated ecosystem approach to management in the Northeast U.S. (NES) and California Current Large Marine Ecosystems (LMEs). Our platform will facilitate the collaboration and knowledge sharing among NMFS natural and social scientists, promoting community participation in integrating data, models, and knowledge. Here, we present collaborative software tools developed to aid the production of the Ecosystem Status Report (ESR) for the NES LME. The ESR addresses the D-P-S portion of the DPSIR (Driver-Pressure-State-Impact-Response) management framework: reporting data, indicators, and information products for climate drivers, physical and human (fisheries) pressures, and ecosystem state (primary and secondary production and higher trophic levels). We are developing our tools in open-source software, with the main tool based on a web application capable of working with multiple data types from a variety of sources, providing an effective way to share the source code used to generate data products and associated metadata, as well as to track workflow provenance to support the reproducibility of a data product. Our platform retrieves data, conducts standard analyses, reports data quality and other standardized metadata, provides iterative and interactive visualization, and enables the download of data plotted in the ESR. Data, indicators, and information products include time series, geographic maps, and uni-variate and multi-variate analyses. Also central to the success of this initiative is the commitment to accommodate and train scientists of multiple disciplines who will learn to interact effectively with this new integrated and interoperable ecosystem assessment capability. Traceability, repeatability, explanation, verification, and validation of data, indicators, and information products are important for cross-disciplinary understanding and sharing with managers, policymakers, and the public. We are also developing an ontology to support the implementation of the DPSIR framework. These new capabilities will serve as the essential foundation for the formal synthesis and quantitative analysis of information on relevant natural and socio-economic factors in relation to specified ecosystem management goals, which can be applied in other LMEs.

  18. Standardized Representation of Clinical Study Data Dictionaries with CIMI Archetypes

    PubMed Central

    Sharma, Deepak K.; Solbrig, Harold R.; Prud’hommeaux, Eric; Pathak, Jyotishman; Jiang, Guoqian

    2016-01-01

    Researchers commonly use a tabular format to describe and represent clinical study data. The lack of standardization of data dictionaries’ metadata elements presents challenges for their harmonization for similar studies and impedes interoperability outside the local context. We propose that representing data dictionaries in the form of standardized archetypes can help to overcome this problem. The Archetype Modeling Language (AML) as developed by the Clinical Information Modeling Initiative (CIMI) can serve as a common format for the representation of data dictionary models. We mapped three different data dictionaries (identified from dbGAP, PheKB and TCGA) onto AML archetypes by aligning dictionary variable definitions with the AML archetype elements. The near-complete alignment of data dictionaries helped map them into valid AML models that captured all data dictionary model metadata. The outcome of the work would help subject matter experts harmonize data models for quality, semantic interoperability and better downstream data integration. PMID:28269909

  19. Standardized Representation of Clinical Study Data Dictionaries with CIMI Archetypes.

    PubMed

    Sharma, Deepak K; Solbrig, Harold R; Prud'hommeaux, Eric; Pathak, Jyotishman; Jiang, Guoqian

    2016-01-01

    Researchers commonly use a tabular format to describe and represent clinical study data. The lack of standardization of data dictionaries' metadata elements presents challenges for their harmonization for similar studies and impedes interoperability outside the local context. We propose that representing data dictionaries in the form of standardized archetypes can help to overcome this problem. The Archetype Modeling Language (AML) as developed by the Clinical Information Modeling Initiative (CIMI) can serve as a common format for the representation of data dictionary models. We mapped three different data dictionaries (identified from dbGAP, PheKB and TCGA) onto AML archetypes by aligning dictionary variable definitions with the AML archetype elements. The near-complete alignment of data dictionaries helped map them into valid AML models that captured all data dictionary model metadata. The outcome of the work would help subject matter experts harmonize data models for quality, semantic interoperability and better downstream data integration.
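
    The mapping from a tabular data dictionary to archetype elements can be sketched as a simple row-by-row transformation. The snippet below is a simplified illustration rather than the CIMI Archetype Modeling Language itself; the column names and element type labels are chosen for the example:

```python
# Sketch of mapping rows of a tabular data dictionary into archetype-style
# element structures. This is a simplified illustration, not the CIMI Archetype
# Modeling Language itself; column names and type labels are hypothetical.
import csv
import io
import json

DATA_DICTIONARY_CSV = """variable,label,data_type,allowed_values,description
sex,Sex,coded,"M|F|U",Administrative sex of the participant
age_at_enrollment,Age at enrollment,integer,,Age in years at study enrollment
"""

def row_to_element(row: dict) -> dict:
    """Convert one data-dictionary row into an archetype-like element record."""
    return {
        "id": row["variable"],
        "name": row["label"],
        "element_type": {"coded": "CODED_TEXT", "integer": "COUNT"}.get(
            row["data_type"], "TEXT"
        ),
        "constraints": [v for v in row["allowed_values"].split("|") if v],
        "description": row["description"],
    }

elements = [row_to_element(r) for r in csv.DictReader(io.StringIO(DATA_DICTIONARY_CSV))]
print(json.dumps(elements, indent=2))
```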

  20. Evolving Metadata in NASA Earth Science Data Systems

    NASA Astrophysics Data System (ADS)

    Mitchell, A.; Cechini, M. F.; Walter, J.

    2011-12-01

    NASA's Earth Observing System (EOS) is a coordinated series of satellites for long term global observations. NASA's Earth Observing System Data and Information System (EOSDIS) is a petabyte-scale archive of environmental data that supports global climate change research by providing end-to-end services from EOS instrument data collection to science data processing to full access to EOS and other earth science data. On a daily basis, the EOSDIS ingests, processes, archives and distributes over 3 terabytes of data from NASA's Earth Science missions, representing over 3500 data products across a range of science disciplines. EOSDIS is currently comprised of 12 discipline-specific data centers that are collocated with centers of science discipline expertise. Metadata is used in all aspects of NASA's Earth Science data lifecycle from the initial measurement gathering to the accessing of data products. Missions use metadata in their science data products when describing information such as the instrument/sensor, operational plan, and geographic region. Acting as the curator of the data products, data centers employ metadata for preservation, access and manipulation of data. EOSDIS provides a centralized metadata repository called the Earth Observing System (EOS) ClearingHouse (ECHO) for data discovery and access via a service-oriented architecture (SOA) between data centers and science data users. ECHO receives inventory metadata from data centers, which generate metadata files that comply with the ECHO Metadata Model. NASA's Earth Science Data and Information System (ESDIS) Project established a Tiger Team to study and make recommendations regarding the adoption of the international metadata standard ISO 19115 in EOSDIS. The result was a technical report recommending an evolution of NASA data systems towards a consistent application of ISO 19115 and related standards, including the creation of a NASA-specific convention for core ISO 19115 elements. Part of NASA's effort to continually evolve its data systems led ECHO to enhance the method by which it receives inventory metadata from the data centers to allow for multiple metadata formats including ISO 19115. ECHO's metadata model will also be mapped to the NASA-specific convention for ingesting science metadata into the ECHO system. As NASA's new Earth Science missions and data centers are migrating to the ISO 19115 standards, EOSDIS is developing metadata management resources to assist in reading, writing and parsing ISO 19115-compliant metadata. To foster interoperability with other agencies and international partners, NASA is working to ensure that a common ISO 19115 convention is developed, enhancing data sharing capabilities and other data analysis initiatives. NASA is also investigating the use of ISO 19115 standards to encode data quality, lineage and provenance with stored values. A common metadata standard across NASA's Earth Science data systems promotes interoperability, enhances data utilization and removes levels of uncertainty found in data products.

  1. Software Application Profile: Opal and Mica: open-source software solutions for epidemiological data management, harmonization and dissemination

    PubMed Central

    Doiron, Dany; Marcon, Yannick; Fortier, Isabel; Burton, Paul; Ferretti, Vincent

    2017-01-01

    Motivation: Improving the dissemination of information on existing epidemiological studies and facilitating the interoperability of study databases are essential to maximizing the use of resources and accelerating improvements in health. To address this, Maelstrom Research proposes Opal and Mica, two inter-operable open-source software packages providing out-of-the-box solutions for epidemiological data management, harmonization and dissemination. Implementation: Opal and Mica are two standalone but inter-operable web applications written in Java, JavaScript and PHP. They provide web services and modern user interfaces to access them. General features: Opal allows users to import, manage, annotate and harmonize study data. Mica is used to build searchable web portals disseminating study and variable metadata. When used conjointly, Mica users can securely query and retrieve summary statistics on geographically dispersed Opal servers in real-time. Integration with the DataSHIELD approach allows conducting more complex federated analyses involving statistical models. Availability: Opal and Mica are open-source and freely available at [www.obiba.org] under a General Public License (GPL) version 3, and the metadata models and taxonomies that accompany them are available under a Creative Commons licence. PMID:29025122

  2. Case Studies of Ecological Integrative Information Systems: The Luquillo and Sevilleta Information Management Systems

    NASA Astrophysics Data System (ADS)

    San Gil, Inigo; White, Marshall; Melendez, Eda; Vanderbilt, Kristin

    The thirty-year-old United States Long Term Ecological Research Network has developed extensive metadata to document their scientific data. Standard and interoperable metadata is a core component of the data-driven analytical solutions developed by this research network. Content management systems offer an affordable solution for rapid deployment of metadata-centered information management systems. We developed a customized integrative metadata management system based on the Drupal content management system technology. Building on knowledge and experience with the Sevilleta and Luquillo Long Term Ecological Research sites, we successfully deployed the first two medium-scale customized prototypes. In this paper, we describe the vision behind our Drupal-based information management instances, and list the features offered through these Drupal-based systems. We also outline the plans to expand the information services offered through these metadata-centered management systems. We will conclude with the growing list of participants deploying similar instances.

  3. Academic Research Library as Broker in Addressing Interoperability Challenges for the Geosciences

    NASA Astrophysics Data System (ADS)

    Smith, P., II

    2015-12-01

    Data capture is an important process in the research lifecycle. Complete descriptive and representative information about the data or database is necessary during data collection, whether in the field or in the research lab. The National Science Foundation's (NSF) Public Access Plan (2015) mandates that federally funded projects make their research data more openly available. Developing, implementing, and integrating metadata workflows into the research process of the data lifecycle facilitates improved data access while also addressing interoperability challenges for the geosciences such as data description and representation. Lack of metadata or data curation can contribute to (1) semantic, (2) ontology, and (3) data integration issues within and across disciplinary domains and projects. Some researchers of EarthCube-funded projects have identified these issues as gaps. These gaps can contribute to interoperability issues in data access, discovery, and integration between domain-specific and general data repositories. Academic research libraries have expertise in providing long-term discovery and access through the use of metadata standards and the provision of access to research data, datasets, and publications via institutional repositories. Metadata crosswalks, open archival information systems (OAIS), trusted repositories, the data seal of approval, persistent URLs, and the linking of data, objects, resources, and publications in institutional repositories and digital content management systems are common components in the library discipline. These components contribute to a library perspective on data access and discovery that can benefit the geosciences. The USGS Community for Data Integration (CDI) has developed the Science Support Framework (SSF) for data management and integration within its community of practice for contribution to improved understanding of the Earth's physical and biological systems. The USGS CDI SSF can be used as a reference model to map to EarthCube-funded projects, with academic research libraries facilitating the data and information assets components of the USGS CDI SSF via institutional repositories and/or digital content management. This session will explore the USGS CDI SSF for cross-discipline collaboration considerations from a library perspective.

  4. Eutrophication and contaminant data management for EU marine policies: the EMODnet Chemistry infrastructure.

    NASA Astrophysics Data System (ADS)

    Vinci, Matteo; Lipizer, Marina; Giorgetti, Alessandra

    2016-04-01

    The European Marine Observation and Data Network (EMODnet) initiative has the following purposes: to assemble marine metadata, data and products, to make these fragmented resources more easily available to public and private users, and to provide quality-assured, standardised and harmonised marine data. EMODnet Chemistry was launched by DG MARE in 2009 to support the Marine Strategy Framework Directive (MSFD) requirements for the assessment of eutrophication and contaminants, following INSPIRE Directive rules. The aim is twofold: the first task is to make available and reusable the large amount of fragmented and inaccessible data hosted in European research institutes and environmental agencies. The second objective is to develop visualization services useful for the tasks of the MSFD. The technical set-up is based on the principle of adopting and adapting the SeaDataNet infrastructure for ocean and marine data, which are managed by National Oceanographic Data Centers; the infrastructure relies on a distributed network of data centers. Data centers contribute to data harvesting and enrichment with the relevant metadata. Data are processed into interoperable formats (using agreed standards such as ISO XML and ODV) with the use of common vocabularies and standardized quality control procedures. Data quality control is a key issue when merging heterogeneous data coming from different sources, and a data validation loop has been agreed within the EMODnet Chemistry community and is routinely performed. After data quality control done by the regional coordinators of the EU marine basins (Atlantic, Baltic, North, Mediterranean and Black Sea), validated regional datasets are used to develop data products useful for the requirements of the MSFD. EMODnet Chemistry provides interpolated seasonal maps of nutrients and services for the visualization of time series and profiles of several chemical parameters. All visualization services are developed following OGC standards such as WMS and WPS. In order to test new strategies for data storage and reanalysis, and to upgrade infrastructure performance, EMODnet Chemistry has chosen the Cloud environment offered by Cineca (the Consortium of Italian Universities and research institutes), where both regional aggregated datasets and analysis and visualization services are hosted. Finally, besides the delivery of data and the visualization products, the results of the data harvesting provide a useful tool to identify data gaps on which future monitoring efforts should be focused.

  5. Arc-An OAI Service Provider for Digital Library Federation; Kepler-An OAI Data/Service Provider for the Individual; Information Objects and Rights Management: A Mediation-Based Approach to DRM Interoperability; Automated Name Authority Control and Enhanced Searching in the Levy Collection; Renardus Project Developments and the Wider Digital Library Context.

    ERIC Educational Resources Information Center

    Liu, Xiaoming; Maly, Kurt; Zubair, Mohammad; Nelson, Michael L.; Erickson, John S.; DiLauro, Tim; Choudhury, G. Sayeed; Patton, Mark; Warner, James W.; Brown, Elizabeth W.; Heery, Rachel; Carpenter, Leona; Day, Michael

    2001-01-01

    Includes five articles that discuss the OAI (Open Archive Initiative), an interface between data providers and service providers; information objects and digital rights management interoperability; digitizing library collections, including automated name authority control, metadata, and text searching engines; and building digital library services…

  6. Keeping Dublin Core Simple: Cross-Domain Discovery or Resource Description?; First Steps in an Information Commerce Economy: Digital Rights Management in the Emerging E-Book Environment; Interoperability: Digital Rights Management and the Emerging EBook Environment; Searching the Deep Web: Direct Query Engine Applications at the Department of Energy.

    ERIC Educational Resources Information Center

    Lagoze, Carl; Neylon, Eamonn; Mooney, Stephen; Warnick, Walter L.; Scott, R. L.; Spence, Karen J.; Johnson, Lorrie A.; Allen, Valerie S.; Lederman, Abe

    2001-01-01

    Includes four articles that discuss Dublin Core metadata, digital rights management and electronic books, including interoperability; and directed query engines, a type of search engine designed to access resources on the deep Web that is being used at the Department of Energy. (LRW)

  7. Semantically supporting data discovery, markup and aggregation in the European Marine Observation and Data Network (EMODnet)

    NASA Astrophysics Data System (ADS)

    Lowry, Roy; Leadbetter, Adam

    2014-05-01

    The semantic content of the NERC Vocabulary Server (NVS) has been developed over thirty years. It has been used to mark up metadata and data in a wide range of international projects, including the European Commission (EC) Framework Programme 7 projects SeaDataNet and The Open Service Network for Marine Environmental Data (NETMAR). Within the United States, the National Science Foundation projects Rolling Deck to Repository and Biological & Chemical Data Management Office (BCO-DMO) use concepts from NVS for markup. Further, typed relationships link NVS concepts to terms served by the Marine Metadata Interoperability Ontology Registry and Repository. Roughly a third of the concepts publicly served from the NVS (35% of ~82,000) form the British Oceanographic Data Centre (BODC) Parameter Usage Vocabulary (PUV). The PUV is instantiated on the NVS as a SKOS concept collection. These terms are used to describe the individual channels in data and metadata served by, for example, BODC, SeaDataNet and BCO-DMO. The PUV terms are designed to be very precise and may contain a high level of detail. Some users have reported that the PUV is difficult to navigate due to its size and complexity (a problem CSIRO have begun to address by deploying a SISSVoc interface to the NVS), and it has been difficult to aggregate data as multiple PUV terms can - with full validity - be used to describe the same data channels. Better approaches to data aggregation are required; this is a use case for the PUV arising from the EC European Marine Observation and Data Network (EMODnet) Chemistry project. One solution, proposed and demonstrated during the course of the NETMAR project, is to build new SKOS concept collections which formalise the desired aggregations for given applications, and use typed relationships to state which PUV concepts contribute to a specific aggregation. Development of these new collections requires input from a group of experts in the application domain who can decide which PUV concepts it is acceptable to aggregate for a given application. Another approach, which has been developed as a use case for concept and data discovery and will be implemented as part of the EC/United States/Australian collaboration, the Ocean Data Interoperability Platform, is to expose the well-defined, but little-publicised, semantic model which underpins each and every concept within the PUV. This will be done in a machine-readable form, so that tools can be built to aggregate data and concepts by, for example, the measured parameter; the environmental sphere or compartment of the sampling; and the methodology of the analysis of the parameter. There is interesting work being developed by CSIRO which may be used in this approach. The importance of these data aggregations is growing as more data providers use terms from semantic resources to describe their data, which allows for aggregating data from numerous sources. This importance will grow as data become "born semantic", i.e. when semantics are embedded with data from the point of collection. In this presentation we introduce a brief history of the development of the PUV; the use cases for data aggregation and discovery outlined above; the semantic model from which the PUV is built; and ideas for embedding semantics in data from the point of collection.
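
    An application-specific aggregation of the kind proposed above can be expressed as a SKOS collection whose members are the contributing vocabulary concepts. The sketch below uses rdflib; the collection URI and member concept URIs are illustrative placeholders, not actual NVS identifiers:

```python
# Sketch of an application-specific SKOS collection that aggregates several
# Parameter Usage Vocabulary concepts under one heading via skos:member.
# The collection URI and member concept URIs are illustrative placeholders,
# not actual NVS identifiers.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, SKOS

AGG = Namespace("https://example.org/aggregations/")

g = Graph()
g.bind("skos", SKOS)
g.bind("agg", AGG)

collection = AGG["dissolved_oxygen"]
g.add((collection, RDF.type, SKOS.Collection))
g.add((collection, SKOS.prefLabel, Literal("Dissolved oxygen concentration", lang="en")))

# Hypothetical PUV-style concept URIs rolled up into this aggregation.
members = [
    URIRef("https://example.org/puv/DOXY_METHOD_A"),
    URIRef("https://example.org/puv/DOXY_METHOD_B"),
]
for concept in members:
    g.add((collection, SKOS.member, concept))

print(g.serialize(format="turtle"))
```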

  8. Geo-Seas - a pan-European infrastructure for the management of marine geological and geophysical data.

    NASA Astrophysics Data System (ADS)

    Glaves, Helen; Graham, Colin

    2010-05-01

    Geo-Seas - a pan-European infrastructure for the management of marine geological and geophysical data. Helen Glaves1 and Colin Graham2 on behalf of the Geo-Seas consortium. The Geo-Seas project will create a network of twenty-six European marine geoscience data centres from seventeen coastal countries including six from the Baltic Sea area. This will be achieved through the development of a pan-European infrastructure for the exchange of marine geoscientific data. Researchers will be able to locate and access harmonised and federated marine geological and geophysical datasets and data products held by the data centres through the Geo-Seas data portal, using a common data catalogue. The new infrastructure, an expansion of the existing SeaDataNet, will create an infrastructure covering oceanographic and marine geoscientific data. New data products and services will be developed following consultations with users on their current and future research requirements. Common data standards will be implemented across all of the data centres and other geological and geophysical organisations will be encouraged to adopt the protocols, standards and tools which are developed as part of the Geo-Seas project. Oceanographic and marine data include a wide range of variables, an important category of which are the geological and geophysical data sets. This data includes raw observational and analytical data as well as derived data products from seabed sediment samples, boreholes, geophysical surveys (seismic, gravity, etc.) and sidescan sonar surveys, all of which are essential in order to produce a complete interpretation of seabed geology. Despite there being a large volume of geological and geophysical data available for the marine environment, it is currently very difficult to use these datasets in an integrated way between organisations due to different nomenclatures, formats, scales and coordinate systems being used within different organisations and also within different countries. This makes the direct use of primary data in an integrated way very difficult and also hampers use of the data sets in a harmonised way to produce multidisciplinary data products and services. To ensure interoperability with other marine environmental data types, Geo-Seas will use ISO19115 metadata, OGC and GeoSciML standards as the basis for the metadata profiles for the geological and geophysical data. This will be largely achieved by modifying the SeaDataNet metadata standard profile (Common Data Index or CDI), which is itself based upon the ISO19115 standard, to accommodate the requirements of the Geo-Seas project. The overall objective of the Geo-Seas project is to build and deploy a unified marine geoscientific data infrastructure within Europe which will in effect provide a data grid for the sharing of marine geological and geophysical data. This will result in a major improvement in the locating, accessing and delivery of federated marine geological and geophysical data and data products from national geological surveys and research institutes across Europe. There is an emphasis on interoperability both with other disciplines as well as with other key framework projects including the European Marine Observation and Data Network (EMODNet) and One Geology - Europe. 
In addition, a key objective of the Geo-Seas project is to underpin European directives such as INSPIRE as well as recent framework programmes on both the global and European scale, for example Global Earth Observation System of Systems (GEOSS) and Global Monitoring for Environment and Security (GMES), all of which are intended to encourage the exchange of data and information. Geo-Seas consortium partners: NERC-BGS (United Kingdom), NERC-BODC (United Kingdom), NERC-NOCS (United Kingdom), MARIS (Netherlands), IFREMER (France), BRGM (France), TNO (Netherlands), BSH (Germany), IGME (Spain), INETI (Portugal), IGME (Greece), GSI (Ireland), BGR (Germany), OGS (Italy), GEUS (Denmark), NGU (Norway), PGI (Poland), EGK (Estonia), LIGG (Lithuania), IO-BAS (Bulgaria), NOA (Greece), CIRIA (United Kingdom), MUMM (Belgium), UB (Spain), UCC (Ireland), EU-Consult (Netherlands), CNRS (France), SHOM (France), CEFAS (United Kingdom), and LU (Latvia). The project is coordinated by British Geological Survey (BGS), while the technical coordination is performed by Marine Information Service (MARIS). The Geo-Seas project is an Integrated Infrastructure Initiative (I3) of the Research Infrastructures programme within EU FP7, contract number RI-238952. It has a duration of 42 months from 1st May 2009 till 31st October 2012. 1 British Geological Survey, Keyworth, Nottingham, NG12 5GG, UK. e-mail: hmg@bgs.ac.uk 2 British Geological Survey, Murchison House, West Mains Road, Edinburgh, EH9 3LA, UK. e-mail: ccg@bgs.ac.uk

  9. Interoperability format translation and transformation between IFC architectural design file and simulation file formats

    DOEpatents

    Chao, Tian-Jy; Kim, Younghun

    2015-02-03

    Automatically translating a building architecture file format (Industry Foundation Class) to a simulation file, in one aspect, may extract data and metadata used by a target simulation tool from a building architecture file. Interoperability data objects may be created and the extracted data is stored in the interoperability data objects. A model translation procedure may be prepared to identify a mapping from a Model View Definition to a translation and transformation function. The extracted data may be transformed using the data stored in the interoperability data objects, an input Model View Definition template, and the translation and transformation function to convert the extracted data to correct geometric values needed for a target simulation file format used by the target simulation tool. The simulation file in the target simulation file format may be generated.
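
    A schematic sketch, not the patented implementation, of the pipeline the abstract outlines: data extracted from the IFC file is held in interoperability data objects, a mapping keyed by the Model View Definition selects a translation and transformation function, and the result is emitted for the target simulation format. All class, function and property names below are invented for illustration.

```python
# Schematic, illustrative pipeline only; not the patented method.
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class InteropObject:
    """Holds data/metadata extracted from the building architecture (IFC) file."""
    element_type: str
    properties: dict = field(default_factory=dict)


def extract(ifc_path: str) -> List[InteropObject]:
    # Placeholder: a real extractor would parse the IFC file here.
    return [InteropObject("IfcWall", {"length_mm": 5000, "height_mm": 2700})]


def wall_to_surface(obj: InteropObject) -> dict:
    # Convert extracted values into the geometry the target simulator expects.
    return {"type": "surface",
            "area_m2": obj.properties["length_mm"] / 1000 * obj.properties["height_mm"] / 1000}


# The Model View Definition template selects which transformation applies.
MVD_MAP: Dict[str, Callable[[InteropObject], dict]] = {"IfcWall": wall_to_surface}


def translate(ifc_path: str) -> List[dict]:
    return [MVD_MAP[o.element_type](o) for o in extract(ifc_path)
            if o.element_type in MVD_MAP]


print(translate("building.ifc"))  # -> [{'type': 'surface', 'area_m2': 13.5}]
```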

  10. Facilitating Stewardship of scientific data through standards based workflows

    NASA Astrophysics Data System (ADS)

    Bastrakova, I.; Kemp, C.; Potter, A. K.

    2013-12-01

    There are three main suites of standards that can be used to define the fundamental scientific methodology of data, methods and results. These are, firstly, metadata standards to enable discovery of the data (ISO 19115); secondly, the Sensor Web Enablement (SWE) suite of standards, which includes the O&M and SensorML standards; and thirdly, ontologies that provide vocabularies to define scientific concepts and the relationships between these concepts. All three types of standards have to be utilised by the practising scientist so that those who ultimately steward the data can ensure it is preserved, curated, reused and repurposed. Additional benefits of this approach include transparency of scientific processes from data acquisition to the creation of scientific concepts and models, and provision of context to inform data use. Collecting and recording metadata is the first step in the scientific data flow. The primary role of metadata is to provide details of geographic extent, availability and a high-level description of data suitable for its initial discovery through common search engines. The SWE suite provides standardised patterns to describe observations and measurements taken for these data, capture detailed information about observation or analytical methods and the instruments used, and define quality determinations. This information standardises browsing capability over discrete data types. The standardised patterns of the SWE standards simplify aggregation of observation and measurement data, enabling scientists to relate disparate data to scientific concepts. The first two steps provide a necessary basis for reasoning about concepts of 'pure' science, building relationships between concepts of different domains (linked data), and identifying domain classifications and vocabularies. Geoscience Australia is re-examining its marine data flows, including metadata requirements and business processes, to achieve a clearer link between scientific data acquisition and analysis requirements and effective interoperable data management and delivery. This includes participating in national and international dialogue on the development of standards, embedding data management activities in business processes, and developing scientific staff as effective data stewards. A similar approach is applied to geophysical data. By ensuring the geophysical datasets at GA strictly follow metadata and industry standards, we are able to implement a provenance-based workflow where the data is easily discoverable, geophysical processing can be applied to it and results can be stored. The provenance-based workflow enables metadata records for the results to be produced automatically from the input dataset metadata.
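
    A minimal sketch of the provenance-based workflow mentioned above, in which the metadata record for a processing result is derived automatically from the input dataset's metadata; the dictionary keys are ISO 19115-flavoured but hypothetical, as is the example survey record.

```python
# Minimal, assumed-field-name sketch of deriving a result metadata record
# from an input dataset's metadata while preserving provenance (lineage).
from copy import deepcopy
from datetime import date


def derive_result_metadata(input_md: dict, process_name: str) -> dict:
    result = deepcopy(input_md)
    result["title"] = f"{input_md['title']} - {process_name}"
    result["dateStamp"] = date.today().isoformat()
    # Lineage ties the result back to its source, so provenance is preserved.
    result["lineage"] = {
        "statement": f"Derived by applying '{process_name}'",
        "source": input_md["fileIdentifier"],
    }
    return result


survey_md = {                      # hypothetical input record
    "fileIdentifier": "ga-survey-1234",
    "title": "Airborne magnetic survey, Yilgarn Craton",
    "extent": {"west": 116.0, "east": 121.0, "south": -32.0, "north": -27.0},
    "dateStamp": "2012-06-01",
}
print(derive_result_metadata(survey_md, "reduction to the pole"))
```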

  11. Enabling interoperability in planetary sciences and heliophysics: The case for an information model

    NASA Astrophysics Data System (ADS)

    Hughes, J. Steven; Crichton, Daniel J.; Raugh, Anne C.; Cecconi, Baptiste; Guinness, Edward A.; Isbell, Christopher E.; Mafi, Joseph N.; Gordon, Mitchell K.; Hardman, Sean H.; Joyner, Ronald S.

    2018-01-01

    The Planetary Data System has developed the PDS4 Information Model to enable interoperability across diverse science disciplines. The Information Model is based on an integration of International Organization for Standardization (ISO) level standards for trusted digital archives, information model development, and metadata registries. Whereas controlled vocabularies provide a basic level of interoperability by supplying a common set of terms for communication between both machines and humans, the Information Model improves interoperability by means of an ontology that provides semantic information, or additional related context, for the terms. The Information Model was defined by a team of computer scientists and science experts from each of the diverse disciplines in the Planetary Science community, including Atmospheres, Geosciences, Cartography and Imaging Sciences, Navigational and Ancillary Information, Planetary Plasma Interactions, Ring-Moon Systems, and Small Bodies. The model was designed to be extensible beyond the Planetary Science community; for example, there are overlaps between certain PDS disciplines and the Heliophysics and Astrophysics disciplines. "Interoperability" can apply to many aspects of both the developer and the end-user experience, for example agency-to-agency, semantic level, and application level interoperability. We define these types of interoperability and focus on semantic level interoperability, the type of interoperability most directly enabled by an information model.

  12. Software Application Profile: Opal and Mica: open-source software solutions for epidemiological data management, harmonization and dissemination.

    PubMed

    Doiron, Dany; Marcon, Yannick; Fortier, Isabel; Burton, Paul; Ferretti, Vincent

    2017-10-01

    Improving the dissemination of information on existing epidemiological studies and facilitating the interoperability of study databases are essential to maximizing the use of resources and accelerating improvements in health. To address this, Maelstrom Research proposes Opal and Mica, two inter-operable open-source software packages providing out-of-the-box solutions for epidemiological data management, harmonization and dissemination. Opal and Mica are two standalone but inter-operable web applications written in Java, JavaScript and PHP. They provide web services and modern user interfaces to access them. Opal allows users to import, manage, annotate and harmonize study data. Mica is used to build searchable web portals disseminating study and variable metadata. When used conjointly, Mica users can securely query and retrieve summary statistics on geographically dispersed Opal servers in real-time. Integration with the DataSHIELD approach allows conducting more complex federated analyses involving statistical models. Opal and Mica are open-source and freely available at [www.obiba.org] under a General Public License (GPL) version 3, and the metadata models and taxonomies that accompany them are available under a Creative Commons licence. © The Author 2017; all rights reserved. Published by Oxford University Press on behalf of the International Epidemiological Association

  13. Converting ODM Metadata to FHIR Questionnaire Resources.

    PubMed

    Doods, Justin; Neuhaus, Philipp; Dugas, Martin

    2016-01-01

    Interoperability between systems and data sharing between domains is becoming more and more important. The portal medical-data-models.org offers more than 5,300 UMLS-annotated forms in CDISC ODM format in order to support interoperability, and several additional export formats are available. CDISC's ODM and the Questionnaire resource of HL7's FHIR framework were analyzed, a mapping between their elements was created, and a converter was implemented. The developed converter was integrated into the portal with FHIR Questionnaire XML or JSON download options. New FHIR applications can now use this large library of forms.
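
    A simplified sketch of the kind of element mapping such a converter performs, turning ODM ItemDefs into items of a FHIR Questionnaire resource; it is not the medical-data-models.org implementation, only a few attributes are handled, and the data-type mapping shown is an assumption.

```python
# Simplified, illustrative mapping from CDISC ODM ItemDefs to a FHIR
# Questionnaire resource; the type mapping and ItemDef dicts are assumptions.
import json

ODM_TO_FHIR_TYPE = {"text": "string", "integer": "integer", "float": "decimal",
                    "date": "date", "boolean": "boolean"}


def odm_items_to_questionnaire(item_defs: list, title: str) -> dict:
    return {
        "resourceType": "Questionnaire",
        "status": "draft",
        "title": title,
        "item": [
            {
                "linkId": item["OID"],                       # ODM item OID becomes the linkId
                "text": item["Question"],                    # question text carried over
                "type": ODM_TO_FHIR_TYPE.get(item["DataType"], "string"),
            }
            for item in item_defs
        ],
    }


odm_items = [  # hypothetical ItemDefs already parsed from the ODM XML
    {"OID": "I.1", "Question": "Body weight (kg)", "DataType": "float"},
    {"OID": "I.2", "Question": "Smoker?", "DataType": "boolean"},
]
print(json.dumps(odm_items_to_questionnaire(odm_items, "Baseline visit"), indent=2))
```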

  14. caCORE version 3: Implementation of a model driven, service-oriented architecture for semantic interoperability.

    PubMed

    Komatsoulis, George A; Warzel, Denise B; Hartel, Francis W; Shanbhag, Krishnakant; Chilukuri, Ram; Fragoso, Gilberto; Coronado, Sherri de; Reeves, Dianne M; Hadfield, Jillaine B; Ludet, Christophe; Covitz, Peter A

    2008-02-01

    One of the requirements for a federated information system is interoperability, the ability of one computer system to access and use the resources of another system. This feature is particularly important in biomedical research systems, which need to coordinate a variety of disparate types of data. In order to meet this need, the National Cancer Institute Center for Bioinformatics (NCICB) has created the cancer Common Ontologic Representation Environment (caCORE), an interoperability infrastructure based on Model Driven Architecture. The caCORE infrastructure provides a mechanism to create interoperable biomedical information systems. Systems built using the caCORE paradigm address both aspects of interoperability: the ability to access data (syntactic interoperability) and understand the data once retrieved (semantic interoperability). This infrastructure consists of an integrated set of three major components: a controlled terminology service (Enterprise Vocabulary Services), a standards-based metadata repository (the cancer Data Standards Repository) and an information system with an Application Programming Interface (API) based on Domain Model Driven Architecture. This infrastructure is being leveraged to create a Semantic Service-Oriented Architecture (SSOA) for cancer research by the National Cancer Institute's cancer Biomedical Informatics Grid (caBIG).

  15. caCORE version 3: Implementation of a model driven, service-oriented architecture for semantic interoperability

    PubMed Central

    Komatsoulis, George A.; Warzel, Denise B.; Hartel, Frank W.; Shanbhag, Krishnakant; Chilukuri, Ram; Fragoso, Gilberto; de Coronado, Sherri; Reeves, Dianne M.; Hadfield, Jillaine B.; Ludet, Christophe; Covitz, Peter A.

    2008-01-01

    One of the requirements for a federated information system is interoperability, the ability of one computer system to access and use the resources of another system. This feature is particularly important in biomedical research systems, which need to coordinate a variety of disparate types of data. In order to meet this need, the National Cancer Institute Center for Bioinformatics (NCICB) has created the cancer Common Ontologic Representation Environment (caCORE), an interoperability infrastructure based on Model Driven Architecture. The caCORE infrastructure provides a mechanism to create interoperable biomedical information systems. Systems built using the caCORE paradigm address both aspects of interoperability: the ability to access data (syntactic interoperability) and understand the data once retrieved (semantic interoperability). This infrastructure consists of an integrated set of three major components: a controlled terminology service (Enterprise Vocabulary Services), a standards-based metadata repository (the cancer Data Standards Repository) and an information system with an Application Programming Interface (API) based on Domain Model Driven Architecture. This infrastructure is being leveraged to create a Semantic Service Oriented Architecture (SSOA) for cancer research by the National Cancer Institute’s cancer Biomedical Informatics Grid (caBIG™). PMID:17512259

  16. The Arctic Observing Viewer (AOV): Visualization, Data Discovery, Strategic Assessment, and Decision Support for Arctic Observing

    NASA Astrophysics Data System (ADS)

    Cody, R. P.; Manley, W. F.; Gaylord, A. G.; Kassin, A.; Villarreal, S.; Barba, M.; Dover, M.; Escarzaga, S. M.; Habermann, T.; Kozimor, J.; Score, R.; Tweedie, C. E.

    2016-12-01

    To better assess progress in Arctic Observing made by U.S. SEARCH, NSF AON, SAON, and related initiatives, an updated version of the Arctic Observing Viewer (AOV; http://ArcticObservingViewer.org) has been released. This web mapping application and information system conveys the who, what, where, and when of "data collection sites" - the precise locations of monitoring assets, observing platforms, and wherever repeat marine or terrestrial measurements have been taken. Over 8000 sites across the circum-arctic are documented including a range of boreholes, ship tracks, buoys, towers, sampling stations, sensor networks, vegetation plots, stream gauges, ice cores, observatories, and more. Contributing partners are the U.S. NSF, ACADIS, ADIwg, AOOS, a2dc, AON, CAFF, GINA, IASOA, INTERACT, NASA ABoVE, and USGS, among others. Users can visualize, navigate, select, search, draw, print, view details, and follow links to obtain a comprehensive perspective of environmental monitoring efforts. We continue to develop, populate, and enhance AOV. Recent improvements include: a more intuitive and functional search tool, a modern cross-platform interface using javascript and HTML5, and hierarchical ISO metadata coupled with RESTful web services & metadata XLinks to span the data life cycle (from project planning to establishment of data collection sites to release of scientific datasets). Additionally, through collaborations with the Barrow Area Information Database (BAID, www.barrowmapped.org) we are exploring linkages with datacenters and have developed a prototype dashboard application that allows users to explore data services in the AOV application. AOV is founded on principles of interoperability, such that agencies and organizations can use the AOV Viewer and web services for their own purposes. In this way, AOV complements other distributed yet interoperable cyber resources and helps science planners, funding agencies, investigators, data specialists, and others to: assess status, identify overlap, fill gaps, optimize sampling design, refine network performance, clarify directions, access data, coordinate logistics, and collaborate to meet Arctic Observing goals.

  17. Research on key technologies for data-interoperability-based metadata, data compression and encryption, and their application

    NASA Astrophysics Data System (ADS)

    Yu, Xu; Shao, Quanqin; Zhu, Yunhai; Deng, Yuejin; Yang, Haijun

    2006-10-01

    With the development of informatization and the separation between data management departments and application departments, spatial data sharing becomes one of the most important objectives for spatial information infrastructure construction, and spatial metadata management systems, data transmission security and data compression are the key technologies to realize spatial data sharing. This paper discusses the key technologies for metadata based on data interoperability, researches in depth data compression algorithms such as the adaptive Huffman algorithm and the LZ77 and LZ78 algorithms, and studies the application of digital signature techniques to spatial data, which can not only identify the transmitter of spatial data but also detect in a timely manner whether the spatial data have been tampered with during network transmission. Based on an analysis of the symmetric encryption algorithms 3DES and AES and the asymmetric encryption algorithm RSA, combined with a hash algorithm, it presents an improved hybrid encryption method for spatial data. Digital signature technology and digital watermarking technology are also discussed. Then, a new solution for spatial data network distribution is put forward, which adopts a three-layer architecture. Based on this framework, we present a spatial data network distribution system which is efficient and secure, and prove the feasibility and validity of the proposed solution.
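
    A minimal sketch, using the Python cryptography package and zlib, of a hybrid scheme of the kind the paper describes: lossless compression, symmetric encryption of the data, asymmetric protection of the session key, and a digital signature so that tampering during transmission can be detected. Key management is simplified for illustration, and the algorithm choices (AES-GCM, RSA-OAEP, RSA-PSS) stand in for the paper's 3DES/AES/RSA combination.

```python
# Illustrative hybrid compress/encrypt/sign round trip; not the paper's method.
import os
import zlib
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

sender_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
receiver_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

payload = zlib.compress(b"<gml:Point>...</gml:Point>")   # lossless compression step

# Symmetric encryption of the (compressed) spatial data with a session key.
aes_key, nonce = AESGCM.generate_key(bit_length=256), os.urandom(12)
ciphertext = AESGCM(aes_key).encrypt(nonce, payload, None)

# Asymmetric protection of the session key using the receiver's public key.
wrapped_key = receiver_key.public_key().encrypt(
    aes_key, padding.OAEP(mgf=padding.MGF1(hashes.SHA256()),
                          algorithm=hashes.SHA256(), label=None))

# Signature over the ciphertext identifies the transmitter and exposes tampering.
pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH)
signature = sender_key.sign(ciphertext, pss, hashes.SHA256())

# Receiver side: verify, unwrap the session key, decrypt, decompress.
sender_key.public_key().verify(signature, ciphertext, pss, hashes.SHA256())
session_key = receiver_key.decrypt(
    wrapped_key, padding.OAEP(mgf=padding.MGF1(hashes.SHA256()),
                              algorithm=hashes.SHA256(), label=None))
plain = zlib.decompress(AESGCM(session_key).decrypt(nonce, ciphertext, None))
assert plain == b"<gml:Point>...</gml:Point>"
```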

  18. Assuring the Quality of Agricultural Learning Repositories: Issues for the Learning Object Metadata Creation Process of the CGIAR

    NASA Astrophysics Data System (ADS)

    Zschocke, Thomas; Beniest, Jan

    The Consultative Group on International Agricultural Research (CGIAR) has established a digital repository to share its teaching and learning resources along with descriptive educational information based on the IEEE Learning Object Metadata (LOM) standard. As a critical component of any digital repository, quality metadata are essential not only to enable users to find the resources they require more easily, but also for the operation and interoperability of the repository itself. Studies show that repositories have difficulties in obtaining good quality metadata from their contributors, especially when this process involves many different stakeholders, as is the case with the CGIAR as an international organization. To address this issue the CGIAR began investigating the Open ECBCheck as well as the ISO/IEC 19796-1 standard to establish quality protocols for its training. The paper highlights the implications and challenges posed by strengthening the metadata creation workflow for disseminating learning objects of the CGIAR.

  19. Digital Curation of Marine Physical Samples at Ocean Networks Canada

    NASA Astrophysics Data System (ADS)

    Jenkyns, R.; Tomlin, M. C.; Timmerman, R.

    2015-12-01

    Ocean Networks Canada (ONC) has collected hundreds of geological, biological and fluid samples from the water column and seafloor during its maintenance expeditions. These samples have been collected by Remotely Operated Vehicles (ROVs), divers, networked and autonomously deployed instruments, and rosettes. Subsequent measurements are used for scientific experiments, calibration of in-situ and remote sensors, monitoring of Marine Protected Areas, and environment characterization. Tracking the life cycles of these samples from collection to dissemination of results with all the pertinent documents (e.g., protocols, imagery, reports), metadata (e.g., location, identifiers, purpose, method) and data (e.g., measurements, taxonomic classification) is a challenge. The initial collection of samples is normally documented in SeaScribe (an ROV dive logging tool within ONC's Oceans 2.0 software) for which ONC has defined semantics and syntax. Next, samples are often sent to individual scientists and institutions (e.g., Royal BC Museum) for processing and storage, making acquisition of results and life cycle metadata difficult. Finally, this information needs to be retrieved and collated such that multiple user scenarios can be addressed. ONC aims to improve and extend its digital infrastructure for physical samples to support this complex array of samples, workflows and applications. However, in order to promote effective data discovery and exchange, interoperability and community standards must be an integral part of the design. Thus, integrating recommendations and outcomes of initiatives like the EarthCube iSamples working groups are essential. Use cases, existing tools, schemas and identifiers are reviewed, while remaining gaps and challenges are identified. The current status, selected approaches and possible future directions to enhance ONC's digital infrastructure for each sample type are presented.

  20. Interoperability format translation and transformation between IFC architectural design file and simulation file formats

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chao, Tian-Jy; Kim, Younghun

    Automatically translating a building architecture file format (Industry Foundation Class) to a simulation file, in one aspect, may extract data and metadata used by a target simulation tool from a building architecture file. Interoperability data objects may be created and the extracted data is stored in the interoperability data objects. A model translation procedure may be prepared to identify a mapping from a Model View Definition to a translation and transformation function. The extracted data may be transformed using the data stored in the interoperability data objects, an input Model View Definition template, and the translation and transformation function to convert the extracted data to correct geometric values needed for a target simulation file format used by the target simulation tool. The simulation file in the target simulation file format may be generated.

  1. A Bridge to the Future: Observations on Building a Digital Library.

    ERIC Educational Resources Information Center

    Gaunt, Marianne I.

    2002-01-01

    The experience of Rutgers University Libraries illustrates the extensive planning, work effort, possibilities, and investment required to develop the digital library. Examines these key areas: organizational structure; staff development needs; facilities and the new digital infrastructure; metadata standards/interoperability; digital collection…

  2. The NCAR Digital Asset Services Hub (DASH): Implementing Unified Data Discovery and Access

    NASA Astrophysics Data System (ADS)

    Stott, D.; Worley, S. J.; Hou, C. Y.; Nienhouse, E.

    2017-12-01

    The National Center for Atmospheric Research (NCAR) Directorate created the Data Stewardship Engineering Team (DSET) to plan and implement an integrated single entry point for uniform digital asset discovery and access across the organization in order to improve the efficiency of access, reduce the costs, and establish the foundation for interoperability with other federated systems. This effort supports new policies included in federal funding mandates, NSF data management requirements, and journal citation recommendations. An inventory during the early planning stage identified diverse asset types across the organization that included publications, datasets, metadata, models, images, and software tools and code. The NCAR Digital Asset Services Hub (DASH) is being developed and phased in this year to improve the quality of users' experiences in finding and using these assets. DASH serves to provide engagement, training, search, and support through the following four nodes. DASH Metadata: DASH provides resources for creating and cataloging metadata to the NCAR Dialect, a subset of ISO 19115. NMDEdit, an editor based on a European open source application, has been configured for manual entry of NCAR metadata. CKAN, an open source data portal platform, harvests these XML records (along with records output directly from databases) from a Web Accessible Folder (WAF) on GitHub for validation. DASH Search: The NCAR Dialect metadata drives cross-organization search and discovery through CKAN, which provides the display interface of search results. DASH search will establish interoperability by facilitating metadata sharing with other federated systems. DASH Consulting: The DASH Data Curation & Stewardship Coordinator assists with Data Management (DM) Plan preparation and advises on Digital Object Identifiers. The coordinator arranges training sessions on the DASH metadata tools and DM planning, and provides one-on-one assistance as requested. DASH Repository: A repository is under development for NCAR datasets currently not in existing lab-managed archives. The DASH repository will be under NCAR governance and meet Trustworthy Repositories Audit & Certification (TRAC) requirements. This poster will highlight the processes, lessons learned, and current status of the DASH effort at NCAR.
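
    Because DASH search is driven by CKAN, its records can in principle be queried through CKAN's standard action API; a sketch is shown below with a placeholder base URL rather than the real DASH endpoint.

```python
# Sketch of discovery against a CKAN-backed catalogue via CKAN's standard
# action API; the portal base URL below is a hypothetical placeholder.
import requests

CKAN_URL = "https://data.example.ucar.edu"   # hypothetical portal base URL

resp = requests.get(f"{CKAN_URL}/api/3/action/package_search",
                    params={"q": "radar", "rows": 5}, timeout=30)
resp.raise_for_status()
result = resp.json()["result"]
print(f"{result['count']} matching datasets")
for ds in result["results"]:
    print(ds["name"], "-", ds.get("title", ""))
```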

  3. Mitogenome metadata: current trends and proposed standards.

    PubMed

    Strohm, Jeff H T; Gwiazdowski, Rodger A; Hanner, Robert

    2016-09-01

    Mitogenome metadata are descriptive terms about the sequence and its specimen description that allow both to be digitally discoverable and interoperable. Here, we review a sampling of mitogenome metadata published in the journal Mitochondrial DNA between 2005 and 2014. Specifically, we have focused on a subset of metadata fields that are available for GenBank records, and specified by the Genomics Standards Consortium (GSC) and other biodiversity metadata standards; and we assessed their presence across three main categories: collection, biological and taxonomic information. To do this we reviewed 146 mitogenome manuscripts, and their associated GenBank records, and scored them for 13 metadata fields. We also explored the potential for mitogenome misidentification using their sequence diversity, and taxonomic metadata on the Barcode of Life Datasystems (BOLD). For this, we focused on all Lepidoptera and Perciformes mitogenomes included in the review, along with additional mitogenome sequence data mined from GenBank. Overall, we found that none of the 146 mitogenome projects provided all the metadata we looked for, and only 17 projects provided at least one category of metadata across the three main categories. Comparisons using mtDNA sequences from BOLD suggest that some mitogenomes may be misidentified. Lastly, we appreciate the research potential of mitogenomes announced through this journal, and we conclude with a suggestion of 13 metadata fields, available on GenBank, that, if provided in a mitogenome's GenBank record, would increase their research value.
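
    A small sketch, assuming Biopython and a locally downloaded GenBank file, of scoring a mitogenome record for a handful of collection-related metadata fields; the field list is illustrative rather than the paper's full set of 13.

```python
# Illustrative scoring of a GenBank record's source-feature qualifiers;
# the file name and field list below are assumptions.
from Bio import SeqIO

FIELDS = ["country", "collection_date", "collected_by", "specimen_voucher",
          "lat_lon", "identified_by", "tissue_type"]


def score_record(path: str) -> dict:
    record = SeqIO.read(path, "genbank")
    source = next(f for f in record.features if f.type == "source")
    return {field: field in source.qualifiers for field in FIELDS}


present = score_record("mitogenome.gb")          # hypothetical local file
print(f"{sum(present.values())}/{len(FIELDS)} fields present:", present)
```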

  4. Linked Ocean Data

    NASA Astrophysics Data System (ADS)

    Leadbetter, Adam; Arko, Robert; Chandler, Cynthia; Shepherd, Adam

    2014-05-01

    "Linked Data" is a term used in Computer Science to encapsulate a methodology for publishing data and metadata in a structured format so that links may be created and exploited between objects. Berners-Lee (2006) outlines the following four design principles of a Linked Data system: Use Uniform Resource Identifiers (URIs) as names for things. Use HyperText Transfer Protocol (HTTP) URIs so that people can look up those names. When someone looks up a URI, provide useful information, using the standards (Resource Description Framework [RDF] and the RDF query language [SPARQL]). Include links to other URIs so that they can discover more things. In 2010, Berners-Lee revisited his original design plan for Linked Data to encourage data owners along a path to "good Linked Data". This revision involved the creation of a five star rating system for Linked Data outlined below. One star: Available on the web (in any format). Two stars: Available as machine-readable structured data (e.g. An Excel spreadsheet instead of an image scan of a table). Three stars: As two stars plus the use of a non-proprietary format (e.g. Comma Separated Values instead of Excel). Four stars: As three stars plus the use of open standards from the World Wide Web Commission (W3C) (i.e. RDF and SPARQL) to identify things, so that people can point to your data and metadata. Five stars: All the above plus link your data to other people's data to provide context Here we present work building on the SeaDataNet common vocabularies served by the NERC Vocabulary Server, connecting projects such as the Rolling Deck to Repository (R2R) and the Biological and Chemical Oceanography Data Management Office (BCO-DMO) and other vocabularies such as the Marine Metadata Interoperability Ontology Register and Repository and the NASA Global Change Master Directory to create a Linked Ocean Data cloud. Publishing the vocabularies and metadata in standard RDF XML and exposing SPARQL endpoints renders them five-star Linked Data repositories. The benefits of this approach include: increased interoperability between the metadata created by projects; improved data discovery as users of SeaDataNet, R2R and BCO-DMO terms can find data using labels with which they are familiar both standard tools and newly developed custom tools may be used to explore the data; and using standards means the custom tools are easier to develop Linked Data is a concept which has been in existence for nearly a decade, and has a simple set of formal best practices associated with it. Linked Data is increasingly being seen as a driver of the next generation of "community science" activities. While many data providers in the oceanographic domain may be unaware of Linked Data, they may also be providing it at one of its lower levels. Here we have shown that it is possible to deliver the highest standard of Linked Oceanographic Data, and some of the benefits of the approach.

  5. Ocean Data Interoperability Platform (ODIP): Developing a Common Framework for Marine Data Management on a Global Scale

    NASA Astrophysics Data System (ADS)

    Glaves, H. M.; Schaap, D.

    2014-12-01

    As marine research becomes increasingly multidisciplinary in its approach there has been a corresponding rise in the demand for large quantities of high quality interoperable data. A number of regional initiatives are already addressing this requirement through the establishment of e-infrastructures to improve the discovery and access of marine data. Projects such as Geo-Seas and SeaDataNet in Europe, Rolling Deck to Repository (R2R) in the USA and IMOS in Australia have implemented local infrastructures to facilitate the exchange of standardised marine datasets. However, each of these regional initiatives has been developed to address its own requirements and independently of other regions. To establish a common framework for marine data management on a global scale there is a need to develop interoperability solutions that can be implemented across these initiatives. Through a series of workshops attended by the relevant domain specialists, the Ocean Data Interoperability Platform (ODIP) project has identified areas of commonality between the regional infrastructures and used these as the foundation for the development of three prototype interoperability solutions addressing: the use of brokering services to provide access to the data available in the regional data discovery and access services, including via the GEOSS portal; the development of interoperability between cruise summary reporting systems in Europe, the USA and Australia for routine harvesting of cruise data for delivery via the Partnership for Observation of Global Oceans (POGO) portal; and the establishment of a Sensor Observation Service (SOS) for selected sensors installed on vessels and in real-time monitoring systems using sensor web enablement (SWE). These prototypes will be used to underpin the development of a common global approach to the management of marine data which can be promoted to the wider marine research community. ODIP is a community-led project that is currently focussed on regional initiatives in Europe, the USA and Australia but which is seeking to expand this framework to include other regional marine data infrastructures.
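
    For the third prototype, a Sensor Observation Service can be exercised from OWSLib once an endpoint is known; the sketch below assumes OWSLib and uses a placeholder service URL.

```python
# Illustrative only: connect to a SOS endpoint (placeholder URL) and list
# its offerings as a first step before issuing GetObservation requests.
from owslib.sos import SensorObservationService

SOS_URL = "https://sos.example.org/service"      # hypothetical endpoint

sos = SensorObservationService(SOS_URL, version="2.0.0")
print(sos.identification.title)

# Each offering typically corresponds to a sensor or platform.
for offering_id in sos.contents:
    print(offering_id)
```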

  6. NetCDF4/HDF5 and Linked Data in the Real World - Enriching Geoscientific Metadata without Bloat

    NASA Astrophysics Data System (ADS)

    Ip, Alex; Car, Nicholas; Druken, Kelsey; Poudjom-Djomani, Yvette; Butcher, Stirling; Evans, Ben; Wyborn, Lesley

    2017-04-01

    NetCDF4 has become the dominant generic format for many forms of geoscientific data, leveraging (and constraining) the versatile HDF5 container format, while providing metadata conventions for interoperability. However, the encapsulation of detailed metadata within each file can lead to metadata "bloat", and difficulty in maintaining consistency where metadata is replicated to multiple locations. Complex conceptual relationships are also difficult to represent in simple key-value netCDF metadata. Linked Data provides a practical mechanism to address these issues by associating the netCDF files and their internal variables with complex metadata stored in Semantic Web vocabularies and ontologies, while complying with and complementing existing metadata conventions. One of the stated objectives of the netCDF4/HDF5 formats is that they should be self-describing: containing metadata sufficient for cataloguing and using the data. However, this objective can be regarded as only partially-met where details of conventions and definitions are maintained externally to the data files. For example, one of the most widely used netCDF community standards, the Climate and Forecasting (CF) Metadata Convention, maintains standard vocabularies for a broad range of disciplines across the geosciences, but this metadata is currently neither readily discoverable nor machine-readable. We have previously implemented useful Linked Data and netCDF tooling (ncskos) that associates netCDF files, and individual variables within those files, with concepts in vocabularies formulated using the Simple Knowledge Organization System (SKOS) ontology. NetCDF files contain Uniform Resource Identifier (URI) links to terms represented as SKOS Concepts, rather than plain-text representations of those terms, so we can use simple, standardised web queries to collect and use rich metadata for the terms from any Linked Data-presented SKOS vocabulary. Geoscience Australia (GA) manages a large volume of diverse geoscientific data, much of which is being translated from proprietary formats to netCDF at NCI Australia. This data is made available through the NCI National Environmental Research Data Interoperability Platform (NERDIP) for programmatic access and interdisciplinary analysis. The netCDF files contain not only scientific data variables (e.g. gravity, magnetic or radiometric values) but also domain-specific operational values (e.g. specific instrument parameters) best described fully in formal vocabularies. Our ncskos codebase provides access to multiple stores of detailed external metadata in a standardised fashion. Geophysical datasets are generated from a "survey" event, and GA maintains corporate databases of all surveys and their associated metadata. It is impractical to replicate the full source survey metadata into each netCDF dataset so, instead, we link the netCDF files to survey metadata using public Linked Data URIs. These URIs link to Survey class objects which we model as a subclass of Activity objects as defined by the PROV Ontology, and we provide URI resolution for them via a custom Linked Data API which draws current survey metadata from GA's in-house databases. We have demonstrated that Linked Data is a practical way to associate netCDF data with detailed, external metadata. This allows us to ensure that catalogued metadata is kept consistent with metadata points-of-truth, and we can infer complex conceptual relationships not possible with netCDF key-value attributes alone.
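
    A sketch of the pattern described above: writing a concept URI onto a netCDF variable instead of a plain-text term and later resolving it over the web. The attribute name, file name and vocabulary URI are assumptions for illustration, not the ncskos convention itself.

```python
# Illustrative only: link a netCDF variable to an external vocabulary concept
# via a URI attribute, then resolve that URI with a standard web request.
import netCDF4
import requests

with netCDF4.Dataset("gravity.nc", "w") as ds:          # hypothetical file
    ds.createDimension("point", 3)
    var = ds.createVariable("bouguer_anomaly", "f4", ("point",))
    var[:] = [12.1, 13.4, 11.9]
    var.units = "mGal"
    # Link to the concept rather than duplicating its definition in the file.
    var.skos_concept_uri = "http://vocab.example.org/geophysics/BouguerAnomaly"

with netCDF4.Dataset("gravity.nc") as ds:
    uri = ds["bouguer_anomaly"].skos_concept_uri
    # A simple, standardised web query can now fetch the rich, current
    # metadata for the term from the vocabulary service.
    print(requests.get(uri, headers={"Accept": "text/turtle"}, timeout=30).status_code)
```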

  7. Distributed Learning Metadata Standards

    ERIC Educational Resources Information Center

    McClelland, Marilyn

    2004-01-01

    Significant economies can be achieved in distributed learning systems architected with a focus on interoperability and reuse. The key building blocks of an efficient distributed learning architecture are the use of standards and XML technologies. The goal of plug and play capability among various components of a distributed learning system…

  8. Extended Relation Metadata for SCORM-Based Learning Content Management Systems

    ERIC Educational Resources Information Center

    Lu, Eric Jui-Lin; Horng, Gwoboa; Yu, Chia-Ssu; Chou, Ling-Ying

    2010-01-01

    To increase the interoperability and reusability of learning objects, Advanced Distributed Learning Initiative developed a model called Content Aggregation Model (CAM) to describe learning objects and express relationships between learning objects. However, the suggested relations defined in the CAM can only describe structure-oriented…

  9. Self-Assembling Texts & Courses of Study.

    ERIC Educational Resources Information Center

    Gibson, David

    This paper describes the development of an interoperable meta-database system--a system of applications using metadata--that is intended to facilitate learner-centered collaboration, access to learning resources, and the fitness of channels of information to the emerging needs of learners at both individual and group levels. Highlights include:…

  10. Why Digital Data Collections Are Important

    ERIC Educational Resources Information Center

    Mitchell, Erik T.

    2012-01-01

    The silo is a well-worn metaphor in information systems used to illustrate separateness, isolation, and lack of connectivity. Through the many iterations of system development, libraries, archives, and museums (LAMs) have sought to avoid silos and find the sweet spot between interface design and metadata interoperability. This effort is being…

  11. The HDF Product Designer - Interoperability in the First Mile

    NASA Astrophysics Data System (ADS)

    Lee, H.; Jelenak, A.; Habermann, T.

    2014-12-01

    Interoperable data have been a long-time goal in many scientific communities. The recent growth in analysis, visualization and mash-up applications that expect data stored in a standardized manner has brought the interoperability issue to the fore. On the other hand, producing interoperable data is often regarded as a sideline task in a typical research team for which resources are not readily available. The HDF Group is developing a software tool aimed at lessening the burden of creating data in standards-compliant, interoperable HDF5 files. The tool, named HDF Product Designer, lowers the threshold needed to design such files by providing a user interface that combines the rich HDF5 feature set with applicable metadata conventions. Users can quickly devise new HDF5 files while at the same time seamlessly incorporating the latest best practices and conventions from their community. That is what the term interoperability in the first mile means: enabling generation of interoperable data in HDF5 files from the onset of their production. The tool also incorporates collaborative features, allowing team approach in the file design, as well as easy transfer of best practices as they are being developed. The current state of the tool and the plans for future development will be presented. Constructive input from interested parties is always welcome.
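
    As an illustration of what "interoperability in the first mile" amounts to in practice, the sketch below writes convention metadata into an HDF5 file at the moment it is produced, using h5py directly rather than the HDF Product Designer; the file name and attribute values are placeholders.

```python
# Illustrative only: embed convention metadata at file-creation time so the
# product is standards-compliant from the onset of its production.
import h5py
import numpy as np

with h5py.File("sst_granule.h5", "w") as f:              # hypothetical product
    f.attrs["Conventions"] = "CF-1.8"
    f.attrs["institution"] = "Example Research Team"

    sst = f.create_dataset("sea_surface_temperature", data=np.random.rand(180, 360))
    sst.attrs["standard_name"] = "sea_surface_temperature"
    sst.attrs["units"] = "K"
    sst.attrs["_FillValue"] = np.float64(-999.0)
```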

  12. Development of RESTful services and map-based user interface tools for access and delivery of data and metadata from the Marine-Geo Digital Library

    NASA Astrophysics Data System (ADS)

    Morton, J. J.; Ferrini, V. L.

    2015-12-01

    The Marine Geoscience Data System (MGDS, www.marine-geo.org) operates an interactive digital data repository and metadata catalog that provides access to a variety of marine geology and geophysical data from throughout the global oceans. Its Marine-Geo Digital Library includes common marine geophysical data types and supporting data and metadata, as well as complementary long-tail data. The Digital Library also includes community data collections and custom data portals for the GeoPRISMS, MARGINS and Ridge2000 programs, for active source reflection data (Academic Seismic Portal), and for marine data acquired by the US Antarctic Program (Antarctic and Southern Ocean Data Portal). Ensuring that these data are discoverable not only through our own interfaces but also through standards-compliant web services is critical for enabling investigators to find data of interest. Over the past two years, MGDS has developed several new RESTful web services that enable programmatic access to metadata and data holdings. These web services are compliant with the EarthCube GeoWS Building Blocks specifications and are currently used to drive our own user interfaces. New web applications have also been deployed to provide a more intuitive user experience for searching, accessing and browsing metadata and data. Our new map-based search interface combines components of the Google Maps API with our web services for dynamic searching and exploration of geospatially constrained data sets. Direct introspection of nearly all data formats for hundreds of thousands of data files curated in the Marine-Geo Digital Library has allowed for precise geographic bounds, which allow geographic searches to an extent not previously possible. All MGDS map interfaces utilize the web services of the Global Multi-Resolution Topography (GMRT) synthesis for displaying global basemap imagery and for dynamically providing depth values at the cursor location.
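
    Programmatic access to such RESTful services typically reduces to a parameterised HTTP request; the sketch below is illustrative only, with a hypothetical endpoint path and parameter names rather than the documented MGDS API.

```python
# Illustrative only: a bounding-box metadata search against a hypothetical
# RESTful endpoint; path and parameter names are assumptions, not the MGDS API.
import requests

BASE = "https://www.marine-geo.org/services"             # assumed base URL
params = {"data_type": "bathymetry",                     # hypothetical parameters
          "west": -130.0, "east": -125.0, "south": 44.0, "north": 48.0,
          "format": "json"}

resp = requests.get(f"{BASE}/search", params=params, timeout=30)
resp.raise_for_status()
for entry in resp.json().get("results", []):
    print(entry.get("title"), entry.get("url"))
```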

  13. Using a linked data approach to aid development of a metadata portal to support Marine Strategy Framework Directive (MSFD) implementation

    NASA Astrophysics Data System (ADS)

    Wood, Chris

    2016-04-01

    Under the Marine Strategy Framework Directive (MSFD), EU Member States are mandated to achieve or maintain 'Good Environmental Status' (GES) in their marine areas by 2020, through a series of Programme of Measures (PoMs). The Celtic Seas Partnership (CSP), an EU LIFE+ project, aims to support policy makers, special-interest groups, users of the marine environment, and other interested stakeholders on MSFD implementation in the Celtic Seas geographical area. As part of this support, a metadata portal has been built to provide a signposting service to datasets that are relevant to MSFD within the Celtic Seas. To ensure that the metadata has the widest possible reach, a linked data approach was employed to construct the database. Although the metadata are stored in a traditional RDBS, the metadata are exposed as linked data via the D2RQ platform, allowing virtual RDF graphs to be generated. SPARQL queries can be executed against the end-point allowing any user to manipulate the metadata. D2RQ's mapping language, based on turtle, was used to map a wide range of relevant ontologies to the metadata (e.g. The Provenance Ontology (prov-o), Ocean Data Ontology (odo), Dublin Core Elements and Terms (dc & dcterms), Friend of a Friend (foaf), and Geospatial ontologies (geo)) allowing users to browse the metadata, either via SPARQL queries or by using D2RQ's HTML interface. The metadata were further enhanced by mapping relevant parameters to the NERC Vocabulary Server, itself built on a SPARQL endpoint. Additionally, a custom web front-end was built to enable users to browse the metadata and express queries through an intuitive graphical user interface that requires no prior knowledge of SPARQL. As well as providing means to browse the data via MSFD-related parameters (Descriptor, Criteria, and Indicator), the metadata records include the dataset's country of origin, the list of organisations involved in the management of the data, and links to any relevant INSPIRE-compliant services relating to the dataset. The web front-end therefore enables users to effectively filter, sort, or search the metadata. As the MSFD timeline requires Member States to review their progress on achieving or maintaining GES every six years, the timely development of this metadata portal will not only aid interested stakeholders in understanding how member states are meeting their targets, but also shows how linked data can be used effectively to support policy makers and associated legislative bodies.
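
    Because the metadata are exposed through a SPARQL endpoint, they can be queried programmatically as well as through the web front-end; the sketch below assumes the SPARQLWrapper package and a placeholder endpoint URL, querying the Dublin Core and FOAF terms the metadata are mapped to.

```python
# Illustrative only: list dataset titles and responsible organisations from
# a SPARQL endpoint; the endpoint URL is a hypothetical placeholder.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://msfd.example.org/sparql")   # hypothetical endpoint
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    PREFIX dcterms: <http://purl.org/dc/terms/>
    PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
    SELECT ?dataset ?title ?orgName WHERE {
        ?dataset dcterms:title ?title .
        OPTIONAL { ?dataset dcterms:publisher ?org .
                   ?org foaf:name ?orgName . }
    } LIMIT 10
""")

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["title"]["value"], "-", row.get("orgName", {}).get("value", "n/a"))
```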

  14. The NSF Arctic Data Center: Leveraging the DataONE Federation to Build a Sustainable Archive for the NSF Arctic Research Community

    NASA Astrophysics Data System (ADS)

    Budden, A. E.; Arzayus, K. M.; Baker-Yeboah, S.; Casey, K. S.; Dozier, J.; Jones, C. S.; Jones, M. B.; Schildhauer, M.; Walker, L.

    2016-12-01

    The newly established NSF Arctic Data Center plays a critical support role in archiving and curating the data and software generated by Arctic researchers from diverse disciplines. The Arctic community, comprising Earth science, archaeology, geography, anthropology, and other social science researchers, are supported through data curation services and domain agnostic tools and infrastructure, ensuring data are accessible in the most transparent and usable way possible. This interoperability across diverse disciplines within the Arctic community facilitates collaborative research and is mirrored by interoperability between the Arctic Data Center infrastructure and other large scale cyberinfrastructure initiatives. The Arctic Data Center leverages the DataONE federation to standardize access to and replication of data and metadata to other repositories, specifically the NOAA's National Centers for Environmental Information (NCEI). This approach promotes long-term preservation of the data and metadata, as well as opening the door for other data repositories to leverage this replication infrastructure with NCEI and other DataONE member repositories. The Arctic Data Center uses rich, detailed metadata following widely recognized standards. Particularly, measurement-level and provenance metadata provide scientists the details necessary to integrate datasets across studies and across repositories while enabling a full understanding of the provenance of data used in the system. The Arctic Data Center gains this deep metadata and provenance support by simply adopting DataONE services, which results in significant efficiency gains by eliminating the need to develop systems de novo. Similarly, the advanced search tool developed by the Knowledge Network for Biocomplexity and extended for data submission by the Arctic Data Center, can be used by other DataONE-compliant repositories without further development. By standardizing interfaces and leveraging the DataONE federation, the Arctic Data Center has advanced rapidly and can itself contribute to raising the capabilities of all members of the federation.

  15. Ocean Data Interoperability Platform (ODIP): using regional data systems for global ocean research

    NASA Astrophysics Data System (ADS)

    Schaap, D.; Thijsse, P.; Glaves, H.

    2017-12-01

    Ocean acidification, loss of coral reefs, and sustainable exploitation of the marine environment are just a few of the challenges researchers around the world are currently attempting to understand and address. However, studies of these ecosystem level challenges are impossible unless researchers can discover and re-use the large volumes of interoperable multidisciplinary data that are currently only accessible through regional and global data systems that serve discrete, and often discipline specific, user communities. The plethora of marine data systems currently in existence are also using different standards, technologies and best practices, making re-use of the data problematic for those engaged in interdisciplinary marine research. The Ocean Data Interoperability Platform (ODIP) is responding to this growing demand for discoverable, accessible and reusable data by establishing the foundations for a common global framework for marine data management. But creation of such an infrastructure is a major undertaking, and one that needs to be achieved in part by establishing different levels of interoperability across existing regional and global marine e-infrastructures. Workshops organised by ODIP II facilitate dialogue between selected regional and global marine data systems in an effort to identify potential solutions that integrate these marine e-infrastructures. The outcomes of these discussions have formed the basis for a number of prototype development tasks that aim to demonstrate effective sharing of data across multiple data systems, and allow users to access data from more than one system through a single access point. The ODIP II project is currently developing four prototype solutions that are establishing interoperability between selected regional marine data management infrastructures in Europe, the USA, Canada and Australia, and with the global POGO, IODE Ocean Data Portal (ODP) and GEOSS systems. The potential impact of implementing these solutions for the individual marine data infrastructures is also being evaluated to determine both the technical and financial implications of their integration within existing systems. These impact assessments form part of the strategy to encourage wider adoption of the ODIP solutions and approach beyond the current scope of the project.

  16. Developing data aggregation applications from a community standard semantic resource (Invited)

    NASA Astrophysics Data System (ADS)

    Leadbetter, A.; Lowry, R. K.

    2013-12-01

    The semantic content of the NERC Vocabulary Server (NVS) has been developed over thirty years. It has been used to mark up metadata and data in a wide range of international projects, including the European Commission (EC) Framework Programme 7 projects SeaDataNet and The Open Service Network for Marine Environmental Data (NETMAR). Within the United States, the National Science Foundation projects Rolling Deck to Repository and Biological & Chemical Data Management Office (BCO-DMO) use concepts from NVS for markup. Further, typed relationships link NVS concepts to terms served by the Marine Metadata Interoperability Ontology Registry and Repository. A substantial proportion of the concepts publicly served from the NVS (35% of ~82,000) form the British Oceanographic Data Centre (BODC) Parameter Usage Vocabulary (PUV). The PUV is instantiated on the NVS as a SKOS concept collection. These terms are used to describe the individual channels in data and metadata served by, for example, BODC, SeaDataNet and BCO-DMO. The PUV terms are designed to be very precise and may contain a high level of detail. Some users have reported that the PUV is difficult to navigate due to its size and complexity (a problem CSIRO have begun to address by deploying a SISSVoc interface to the NVS), and it has been difficult to aggregate data as multiple PUV terms can - with full validity - be used to describe the same data channels. Better approaches to data aggregation have been identified as a use case for the PUV by the EC European Marine Observation and Data Network (EMODnet) Chemistry project. One solution, proposed and demonstrated during the course of the NETMAR project, is to build new SKOS concept collections which formalise the desired aggregations for given applications and use typed relationships to state which PUV concepts contribute to a specific aggregation. Development of these new collections requires input from a group of experts in the application domain who can decide which PUV concepts it is acceptable to aggregate for a given application. Another approach, which has been developed as a use case for concept and data discovery and will be implemented as part of the EC/United States/Australian collaboration the Ocean Data Interoperability Platform, is to expose the well defined, but little publicised, semantic model which underpins each and every concept within the PUV. This will be done in a machine readable form, so that tools can be built to aggregate data and concepts by, for example, the measured parameter; the environmental sphere or compartment of the sampling; and the methodology of the analysis of the parameter. Related work under development by CSIRO may also contribute to this approach. The importance of these data aggregations is growing as more data providers use terms from semantic resources to describe their data, allowing data from numerous sources to be aggregated. This importance will grow as data become 'born semantic', i.e. when semantics are embedded with data from the point of collection. In this presentation we introduce a brief history of the development of the PUV; the use cases for data aggregation and discovery outlined above; the semantic model from which the PUV is built; and ideas for embedding semantics in data from the point of collection.

  17. Interoperable and accessible census and survey data from IPUMS.

    PubMed

    Kugler, Tracy A; Fitch, Catherine A

    2018-02-27

    The first version of the Integrated Public Use Microdata Series (IPUMS) was released to users in 1993, and since that time IPUMS has come to stand for interoperable and accessible census and survey data. Initially created to harmonize U.S. census microdata over time, IPUMS now includes microdata from the U.S. and international censuses and from surveys on health, employment, and other topics. IPUMS also provides geo-spatial data, aggregate population data, and environmental data. IPUMS supports ten data products, each disseminating an integrated data collection with a set of tools that make complex data easy to find, access, and use. Key features are record-level integration to create interoperable datasets, user-friendly interfaces, and comprehensive metadata and documentation. The IPUMS philosophy aligns closely with the FAIR principles of findability, accessibility, interoperability, and re-usability. IPUMS data have catalyzed knowledge generation across a wide range of social science and other disciplines, as evidenced by the large volume of publications and other products created by the vast IPUMS user community.

  18. Combining the CIDOC CRM and MPEG-7 to Describe Multimedia in Museums.

    ERIC Educational Resources Information Center

    Hunter, Jane

    This paper describes a proposal for an interoperable metadata model, based on international standards, that has been designed to enable the description, exchange and sharing of multimedia resources both within and between cultural institutions. Domain-specific ontologies have been developed by two different ISO Working Groups to standardize the…

  19. The Development of the Learning Object Standard Using a Pedagogic Approach: A Comparative Study.

    ERIC Educational Resources Information Center

    Yahya, Yazrina; Jenkins, John; Yusoff, Mohammed

    Education is moving towards revenue generation from such channels as electronic learning, distance learning and virtual education. Hence learning technology standards are critical to the sector's success. Existing learning technology standards have focused on various topics such as metadata, question and test interoperability and others. However,…

  20. A "Simple Query Interface" Adapter for the Discovery and Exchange of Learning Resources

    ERIC Educational Resources Information Center

    Massart, David

    2006-01-01

    Developed as part of CEN/ISSS Workshop on Learning Technology efforts to improve interoperability between learning resource repositories, the Simple Query Interface (SQI) is an Application Program Interface (API) for querying heterogeneous repositories of learning resource metadata. In the context of the ProLearn Network of Excellence, SQI is used…

  1. Ridge 2000 Data Management System

    NASA Astrophysics Data System (ADS)

    Goodwillie, A. M.; Carbotte, S. M.; Arko, R. A.; Haxby, W. F.; Ryan, W. B.; Chayes, D. N.; Lehnert, K. A.; Shank, T. M.

    2005-12-01

    Hosted at Lamont by the marine geoscience Data Management group, mgDMS, the NSF-funded Ridge 2000 electronic database, http://www.marine-geo.org/ridge2000/, is a key component of the Ridge 2000 multi-disciplinary program. The database covers each of the three Ridge 2000 Integrated Study Sites: Endeavour Segment, Lau Basin, and 8-11N Segment. It promotes the sharing of information to the broader community, facilitates integration of the suite of information collected at each study site, and enables comparisons between sites. The Ridge 2000 data system provides easy web access to a relational database that is built around a catalogue of cruise metadata. Any web browser can be used to perform a versatile text-based search which returns basic cruise and submersible dive information, sample and data inventories, navigation, and other relevant metadata such as shipboard personnel and links to NSF program awards. In addition, non-proprietary data files, images, and derived products which are hosted locally or in national repositories, as well as science and technical reports, can be freely downloaded. On the Ridge 2000 database page, our Data Link allows users to search the database using a broad range of parameters including data type, cruise ID, chief scientist, geographical location. The first Ridge 2000 field programs sailed in 2004 and, in addition to numerous data sets collected prior to the Ridge 2000 program, the database currently contains information on fifteen Ridge 2000-funded cruises and almost sixty Alvin dives. Track lines can be viewed using a recently- implemented Web Map Service button labelled Map View. The Ridge 2000 database is fully integrated with databases hosted by the mgDMS group for MARGINS and the Antarctic multibeam and seismic reflection data initiatives. Links are provided to partner databases including PetDB, SIOExplorer, and the ODP Janus system. Improved inter-operability with existing and new partner repositories continues to be strengthened. One major effort involves the gradual unification of the metadata across these partner databases. Standardised electronic metadata forms that can be filled in at sea are available from our web site. Interactive map-based exploration and visualisation of the Ridge 2000 database is provided by GeoMapApp, a freely-available Java(tm) application being developed within the mgDMS group. GeoMapApp includes high-resolution bathymetric grids for the 8-11N EPR segment and allows customised maps and grids for any of the Ridge 2000 ISS to be created. Vent and instrument locations can be plotted and saved as images, and Alvin dive photos are also available.

  2. SeaDataNet: Pan-European infrastructure for ocean and marine data management

    NASA Astrophysics Data System (ADS)

    Fichaut, M.; Schaap, D.; Maudire, G.; Manzella, G. M. R.

    2012-04-01

    The overall objective of the SeaDataNet project is to upgrade the present SeaDataNet infrastructure into an operationally robust and state-of-the-art Pan-European infrastructure for providing up-to-date and high quality access to ocean and marine metadata, data and data products originating from data acquisition activities by all engaged coastal states. This is done by setting, adopting and promoting common data management standards and by realising technical and semantic interoperability with other relevant data management systems and initiatives, on behalf of science, environmental management, policy making, and the economy. SeaDataNet is undertaken by the National Oceanographic Data Centres (NODCs) and marine information services of major research institutes from 31 coastal states bordering the European seas, and also includes Satellite Data Centres, expert modelling centres and the international organisations IOC, ICES and EU-JRC in its network. Its 40 data centres are highly skilled, have been actively engaged in data management for many years, and have the essential capabilities and facilities for data quality control, long term stewardship, retrieval and distribution. SeaDataNet undertakes activities to achieve data access and data products services that meet requirements of end-users and intermediate user communities, such as GMES Marine Core Services (e.g. MyOcean), establishing SeaDataNet as the core data management component of the EMODNet infrastructure and contributing on behalf of Europe to global portal initiatives, such as the IOC/IODE Ocean Data Portal (ODP) and GEOSS. Moreover, it aims to achieve INSPIRE compliance and to contribute to the INSPIRE process for developing implementing rules for oceanography.
    • As part of the SeaDataNet upgrading and capacity building, training courses will be organised aiming at data managers and technicians at the data centres. For the data managers it is important that they learn to work with the upgraded common SeaDataNet formats, procedures and software tools for preparing and updating metadata, processing and quality control of data, presentation of data in viewing services, and production of data products.
    • SeaDataNet maintains and operates several discovery services with overviews of marine organisations in Europe and their engagement in marine research projects, managing large datasets, and data acquisition by research vessels and monitoring programmes for the European seas and global oceans:
      o European Directory of Marine Environmental Data (EDMED) (at present > 4300 entries from more than 600 data holding centres in Europe) is a comprehensive reference to the marine data and sample collections held within Europe, providing marine scientists, engineers and policy makers with a simple discovery mechanism. It covers all marine environmental disciplines and needs regular maintenance.
      o European Directory of Marine Environmental Research Projects (EDMERP) (at present > 2200 entries from more than 300 organisations in Europe) gives an overview of research projects relating to the marine environment that are relevant in the context of data sets and data acquisition activities (cruises, in situ monitoring networks, ...) covered in SeaDataNet. This needs regular updating, following activities by data holding institutes for preparing metadata references for EDMED, EDIOS, CSR and CDI.
      o Cruise Summary Reports (CSR) directory (at present > 43000 entries) provides a coarse-grained inventory for tracking oceanographic data collected by research vessels.
      o European Directory of Oceanographic Observing Systems (EDIOS) (at present > 10000 entries) is an initiative of EuroGOOS and gives an overview of the ocean measuring and monitoring systems operated by European countries.
    • European Directory of Marine Organisations (EDMO) (at present > 2000 entries) contains the contact information and activity profiles for the organisations whose data and activities are described by the discovery services.
    • Common Vocabularies (at present > 120000 terms in > 100 lists), covering a broad spectrum of ocean and marine disciplines. The common terms are used to mark up metadata, data and data products in a consistent and coherent way. Governance is regulated by an international board.
    • Common Data Index (CDI) data discovery and access service: SeaDataNet provides online unified access via its portal website to the vast resources of marine and ocean datasets managed by all the connected distributed data centres. The Common Data Index (CDI) service is the key Discovery and Delivery service. It enables users to gain a detailed insight into the availability and geographical distribution of marine data archived at the connected data centres, and it provides the means for downloading datasets in common formats via a transaction mechanism.
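
    As a small illustration of how such common vocabulary lists can be consumed programmatically, the sketch below loads a SKOS-encoded vocabulary with rdflib and prints the preferred label of each concept. The collection URL is a hypothetical placeholder, not an actual SeaDataNet vocabulary endpoint.

        # Minimal sketch: load a SKOS-encoded vocabulary list with rdflib and print
        # the preferred labels of its concepts. The URL is a placeholder; the exact
        # vocabulary collection used here is assumed for illustration only.
        from rdflib import Graph
        from rdflib.namespace import SKOS

        g = Graph()
        g.parse("https://example.org/vocab/P02/current/", format="xml")  # hypothetical collection URL

        for concept, label in g.subject_objects(SKOS.prefLabel):
            print(concept, "->", label)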

  3. SIOExplorer: Advances Across Disciplinary and Institutional Boundaries

    NASA Astrophysics Data System (ADS)

    Miller, S. P.; Clark, D.; Helly, J.; Sutton, D.; Houghton, T.

    2004-12-01

    Strategies for interoperability have been an underlying theme in the development of the SIOExplorer Digital Library. The project was launched three years ago to stabilize data from 700 cruises by the Scripps Institution of Oceanography (SIO), scattered across distributed laboratories and on various media, mostly off-line, including paper and at-risk magnetic tapes. The need for a comprehensive scalable approach to harvesting data from 40 years of evolving instrumentation, media and formats has resulted in the implementation of a digital library architecture that is ready for interoperability. Key metadata template files maintain the integrity of the metadata and data structures, allowing forward and backward compatibility throughout the project as metadata blocks evolve or data types are added. The overall growth of the library is managed by federating new collections in disciplines as needed, each with its own independent data publishing authority. We now have a total of four collections: SIO Cruises, SIO Photo Archives, the Seamount Catalog, and the new Educators' Collection for learning resources. The data types include high resolution meteorological observations, water profiles, biological and geological samples, gravity, magnetics, seafloor swath mapping sonar files, maps and visualization files. The library transactions across the Internet amount to approximately 50,000 hits and 6 GB of downloads each month. We are currently building a new Geological Collection with thousands of dredged rocks and cores, a Seismic Collection with 30 years of reflection data, and a Physical Oceanography Collection with 50 cruises of Hydrographic Doppler Sonar System (HDSS) deep acoustic current profiling data. For the user, a Java CruiseViewer provides an interactive portal to all the federated collections. With CruiseViewer, contents can be discovered by keyword or geographic searches over a global map, metadata can be browsed, and objects can be displayed or scheduled for download. For computer applications, REST and SOAP web services are being implemented to allow computer-to-computer interoperability for applications to search and receive data across the Internet. Discussions are underway to extend this approach and establish a digital library at the Woods Hole Oceanographic Institution for cruise data as well as extensive submersible and ROV digital video and mapping data. These efforts have been supported by NSF NSDL, ITR and OCE awards.

  4. Building Interoperable Learning Objects Using Reduced Learning Object Metadata

    ERIC Educational Resources Information Center

    Saleh, Mostafa S.

    2005-01-01

    The new e-learning generation depends on Semantic Web technology to produce learning objects. As the production of these components is very costly, they should be produced and registered once, and reused and adapted in the same context or in other contexts as often as possible. To produce those components, developers should use learning standards…

  5. Streamlining Metadata and Data Management for Evolving Digital Libraries

    NASA Astrophysics Data System (ADS)

    Clark, D.; Miller, S. P.; Peckman, U.; Smith, J.; Aerni, S.; Helly, J.; Sutton, D.; Chase, A.

    2003-12-01

    What began two years ago as an effort to stabilize the Scripps Institution of Oceanography (SIO) data archives from more than 700 cruises going back 50 years has now become the operational fully-searchable "SIOExplorer" digital library, complete with thousands of historic photographs, images, maps, full text documents, binary data files, and 3D visualization experiences, totaling nearly 2 terabytes of digital content. Coping with data diversity and complexity has proven to be more challenging than dealing with large volumes of digital data. SIOExplorer has been built with scalability in mind, so that the addition of new data types and entire new collections may be accomplished with ease. It is a federated system, currently interoperating with three independent data-publishing authorities, each responsible for its own quality control, metadata specifications, and content selection. The IT architecture implemented at the San Diego Supercomputer Center (SDSC) streamlines the integration of additional projects in other disciplines with a suite of metadata management and collection building tools for "arbitrary digital objects." Metadata are automatically harvested from data files into domain-specific metadata blocks, and mapped into various specification standards as needed. Metadata can be browsed and objects can be viewed onscreen or downloaded for further analysis, with automatic proprietary-hold request management.
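
    As a toy illustration of the harvesting-and-mapping idea described above, the sketch below maps a domain-specific block of harvested key-value pairs into a Dublin Core XML record. The harvested field names, values and the mapping are illustrative assumptions, not the actual SIOExplorer metadata templates.

        # Minimal sketch of mapping a harvested, domain-specific metadata block into a
        # Dublin Core XML record. Field names, values and the mapping are assumptions.
        import xml.etree.ElementTree as ET

        DC_NS = "http://purl.org/dc/elements/1.1/"
        ET.register_namespace("dc", DC_NS)

        harvested = {                       # example block harvested from a data file
            "cruise_id": "NOVA01MV",        # hypothetical values
            "title": "Underway geophysics, NOVA expedition leg 1",
            "start_date": "1967-06-12",
            "instrument": "proton precession magnetometer",
        }

        mapping = {                         # domain field -> Dublin Core element
            "cruise_id": "identifier",
            "title": "title",
            "start_date": "date",
            "instrument": "description",
        }

        record = ET.Element("metadata")
        for field, value in harvested.items():
            elem = ET.SubElement(record, f"{{{DC_NS}}}{mapping[field]}")
            elem.text = value

        print(ET.tostring(record, encoding="unicode"))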

  6. Enabling Interoperability and Servicing Multiple User Segments Through Web Services, Standards, and Data Tools

    NASA Astrophysics Data System (ADS)

    Palanisamy, Giriprakash; Wilson, Bruce E.; Cook, Robert B.; Lenhardt, Chris W.; Santhana Vannan, Suresh; Pan, Jerry; McMurry, Ben F.; Devarakonda, Ranjeet

    2010-12-01

    The Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC) is one of the science-oriented data centers in EOSDIS, aligned primarily with terrestrial ecology. The ORNL DAAC archives and serves data from NASA-funded field campaigns (such as BOREAS, FIFE, and LBA), regional and global data sets relevant to biogeochemical cycles, land validation studies for remote sensing, and source code for some terrestrial ecology models. Users of the ORNL DAAC include field ecologists, remote sensing scientists, modelers at various scales, synthesis scientific groups, a range of educational users (particularly baccalaureate and graduate instruction), and decision support analysts. It is clear that the wide range of users served by the ORNL DAAC have differing needs and differing capabilities for accessing and using data. It is also not possible for the ORNL DAAC, or the other data centers in EOSDIS, to develop all of the tools and interfaces to support even most of the potential uses of data directly. As is typical of Information Technology to support a research enterprise, the user needs will continue to evolve rapidly over time and users themselves cannot predict future needs, as those needs depend on the results of current investigation. The ORNL DAAC is addressing these needs by targeted implementation of web services and tools which can be consumed by other applications, so that a modeler can retrieve data in netCDF format following the Climate and Forecast (CF) convention and a field ecologist can retrieve subsets of that same data in a comma separated value format, suitable for use in Excel or R. Tools such as our MODIS Subsetting capability, the Spatial Data Access Tool (SDAT; based on OGC web services), and OPeNDAP-compliant servers such as THREDDS particularly enable such diverse means of access. We also seek interoperability of metadata, recognizing that terrestrial ecology is a field where there are a very large number of relevant data repositories. ORNL DAAC metadata is published to several metadata repositories using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), to increase the chances that users can find data holdings relevant to their particular scientific problem. ORNL also seeks to leverage technology across these various data projects and encourage standardization of processes and technical architecture. This standardization is behind current efforts involving the use of Drupal and Fedora Commons. This poster describes the current and planned approaches that the ORNL DAAC is taking to enable cost-effective interoperability among data centers, both across the NASA EOSDIS data centers and across the international spectrum of terrestrial ecology-related data centers. The poster will highlight the standards that we are currently using across data formats, metadata formats, and data protocols. References: [1] Devarakonda R., et al. Mercury: reusable metadata management, data discovery and access system. Earth Science Informatics (2010), 3(1): 87-94. [2] Devarakonda R., et al. Data sharing and retrieval using OAI-PMH. Earth Science Informatics (2011), 4(1): 1-5.
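
    As a small illustration of OAI-PMH harvesting of the kind mentioned above, the sketch below pulls Dublin Core records with the Sickle client library. The endpoint URL is a hypothetical placeholder, not an actual ORNL DAAC or partner repository address.

        # Minimal sketch: harvest Dublin Core metadata records over OAI-PMH using the
        # Sickle client library. The endpoint URL is a placeholder assumption.
        from sickle import Sickle

        harvester = Sickle("https://example.org/oai/provider")  # hypothetical OAI-PMH endpoint

        # ListRecords streams <record> elements; oai_dc is the mandatory Dublin Core format.
        for record in harvester.ListRecords(metadataPrefix="oai_dc"):
            meta = record.metadata            # dict of lists, e.g. {"title": [...], ...}
            print(record.header.identifier, "-", meta.get("title", ["(no title)"])[0])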

  7. Speeding up ontology creation of scientific terms

    NASA Astrophysics Data System (ADS)

    Bermudez, L. E.; Graybeal, J.

    2005-12-01

    An ontology is a formal specification of a controlled vocabulary. Ontologies are composed of classes (similar to categories), individuals (members of classes) and properties (attributes of the individuals). Having vocabularies expressed in a formal specification like the Web Ontology Language (OWL) enables interoperability, because OWL can be interpreted and processed by software programs. Two main strategies, which are not mutually exclusive, exist when constructing an ontology: a top-down approach and a bottom-up approach. The former creates the top classes (main concepts) first and then finds the required subclasses and individuals. The latter starts from the individuals and then finds similar properties, promoting the creation of classes. At the Marine Metadata Interoperability (MMI) Initiative we used a bottom-up approach to create ontologies from simple vocabularies (those that are not expressed in a conceptual way). We found that the vocabularies were available in different formats (relational databases, plain files, HTML, XML, PDF) and sometimes were composed of thousands of terms, making the ontology creation process a very time-consuming activity. To expedite the conversion process we created a tool, VOC2OWL, that takes a vocabulary in a table-like structure (CSV or TAB format) and a conversion-property file, and automatically creates an ontology. We identified two basic structures of simple vocabularies: flat vocabularies (e.g., phone directory) and hierarchical vocabularies (e.g., taxonomies). The property file defines a list of attributes for the conversion process for each structure type. The attributes included metadata information (title, description, subject, contributor, urlForMoreInformation), conversion flags (treatAsHierarchy, generateAutoIds) and other conversion information needed to create the ontology (columnForPrimaryClass, columnsToCreateClassesFrom, fileIn, fileOut, namespace, format). We created more than 50 ontologies and generated more than 250,000 statements (or triples). These ontologies allowed domain experts to create 800 relations, from which 2200 more relations among different vocabularies could be inferred, at the MMI workshop "Advancing Domain Vocabularies" held in Boulder in August 2005.
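
    The sketch below illustrates the bottom-up, table-to-ontology idea in a few lines of rdflib; it is not the VOC2OWL tool itself, and the column names, namespace and example rows are assumptions made for illustration.

        # Simplified sketch of a bottom-up CSV-to-ontology conversion (not the actual
        # VOC2OWL tool): each row becomes an individual, the value in an assumed
        # "category" column becomes its OWL class, and remaining columns become
        # label/comment annotations. Column names and the namespace are assumptions.
        import csv
        import io
        from rdflib import Graph, Literal, Namespace, RDF, RDFS
        from rdflib.namespace import OWL

        NS = Namespace("http://example.org/vocab#")   # hypothetical ontology namespace

        csv_text = """term,category,definition
        CTD,instrument,Conductivity-temperature-depth profiler
        thermosalinograph,instrument,Underway sea-surface temperature and salinity sensor
        """

        g = Graph()
        g.bind("ex", NS)

        for row in csv.DictReader(io.StringIO(csv_text.strip())):
            cls = NS[row["category"].strip()]
            individual = NS[row["term"].strip().replace(" ", "_")]
            g.add((cls, RDF.type, OWL.Class))
            g.add((individual, RDF.type, cls))
            g.add((individual, RDFS.label, Literal(row["term"].strip())))
            g.add((individual, RDFS.comment, Literal(row["definition"].strip())))

        print(g.serialize(format="turtle"))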

  8. The e-MapScholar project—an example of interoperability in GIScience education

    NASA Astrophysics Data System (ADS)

    Purves, R. S.; Medyckyj-Scott, D. J.; Mackaness, W. A.

    2005-03-01

    The proliferation of the use of digital spatial data in learning and teaching provides a set of opportunities and challenges for the development of e-learning materials suitable for use by a broad spectrum of disciplines in Higher Education. Effective e-learning materials must both provide engaging materials with which the learner can interact and be relevant to the learners' disciplinary and background knowledge. Interoperability aims to allow sharing of data and materials through the use of common agreements and specifications. Shared learning materials can take advantage of interoperable components to provide customisable components, and must consider issues in sharing data across institutional borders. The e-MapScholar project delivers teaching materials related to spatial data, which are customisable with respect to both context and location. Issues in the provision of such interoperable materials are discussed, including suitable levels of granularity of materials, the provision of tools to facilitate customisation, mechanisms to deliver multiple data sets, and the metadata issues related to such materials. The examples shown make extensive use of the OpenGIS Consortium specifications in the delivery of spatial data.

  9. NASA Reverb: Standards-Driven Earth Science Data and Service Discovery

    NASA Astrophysics Data System (ADS)

    Cechini, M. F.; Mitchell, A.; Pilone, D.

    2011-12-01

    NASA's Earth Observing System Data and Information System (EOSDIS) is a core capability in NASA's Earth Science Data Systems Program. NASA's EOS ClearingHOuse (ECHO) is a metadata catalog for the EOSDIS, providing a centralized catalog of data products and registry of related data services. Working closely with the EOSDIS community, the ECHO team identified a need to develop the next generation EOS data and service discovery tool. This development effort relied on the following principles:
    + Metadata Driven User Interface - Users should be presented with data and service discovery capabilities based on dynamic processing of metadata describing the targeted data.
    + Integrated Data & Service Discovery - Users should be able to discover data and associated data services that facilitate their research objectives.
    + Leverage Common Standards - Users should be able to discover and invoke services that utilize common interface standards.
    Metadata plays a vital role in facilitating data discovery and access. As data providers enhance their metadata, more advanced search capabilities become available, enriching a user's search experience. Maturing metadata formats such as ISO 19115 provide the necessary depth of metadata that facilitates advanced data discovery capabilities. Data discovery and access is not limited simply to the retrieval of data granules, but is growing into the more complex discovery of data services. These services include, but are not limited to, services facilitating additional data discovery, subsetting, reformatting, and re-projecting. The discovery and invocation of these data services is made significantly simpler through the use of consistent and interoperable standards. Once a standard is adopted, standard-specific adapters can be developed to communicate with multiple services implementing that protocol. The emergence of metadata standards such as ISO 19119 plays a similarly important role in service discovery as ISO 19115 does for data discovery. After a yearlong design, development, and testing process, the ECHO team successfully released "Reverb - The Next Generation Earth Science Discovery Tool." Reverb relies heavily on the information contained in dataset and granule metadata, such as ISO 19115, to provide a dynamic experience to users based on identified search facet values extracted from science metadata. Such an approach allows users to perform cross-dataset correlation and searches, discovering additional data that they may not previously have been aware of. In addition to data discovery, Reverb users may discover services associated with their data of interest. When services utilize supported standards and/or protocols, Reverb can facilitate the invocation of both synchronous and asynchronous data processing services. This greatly enhances a user's ability to discover data of interest and accomplish their research goals. Extrapolating from the current movement towards interoperable standards and an increase in available services, data service invocation and chaining will become a natural part of data discovery. Reverb is one example of a discovery tool that provides a mechanism for transforming the earth science data discovery paradigm.
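
    To illustrate how facet values can be extracted from ISO 19115/19139 metadata in the general way described above, the sketch below pulls keywords out of a tiny, made-up gmd record with the standard library. The XML fragment is an illustrative stand-in, not real EOSDIS metadata.

        # Minimal sketch: extract keyword facets from an ISO 19115/19139 (gmd) record
        # so they could drive a faceted search interface. The XML below is a stand-in.
        import xml.etree.ElementTree as ET

        NS = {
            "gmd": "http://www.isotc211.org/2005/gmd",
            "gco": "http://www.isotc211.org/2005/gco",
        }

        xml_doc = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                                      xmlns:gco="http://www.isotc211.org/2005/gco">
          <gmd:identificationInfo><gmd:MD_DataIdentification>
            <gmd:descriptiveKeywords><gmd:MD_Keywords>
              <gmd:keyword><gco:CharacterString>SEA SURFACE TEMPERATURE</gco:CharacterString></gmd:keyword>
              <gmd:keyword><gco:CharacterString>OCEANS</gco:CharacterString></gmd:keyword>
            </gmd:MD_Keywords></gmd:descriptiveKeywords>
          </gmd:MD_DataIdentification></gmd:identificationInfo>
        </gmd:MD_Metadata>"""

        root = ET.fromstring(xml_doc)
        facets = [kw.text for kw in root.findall(".//gmd:keyword/gco:CharacterString", NS)]
        print(facets)   # -> ['SEA SURFACE TEMPERATURE', 'OCEANS']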

  10. MaNIDA: Integration of marine expedition information, data and publications: Data Portal of German Marine Research

    NASA Astrophysics Data System (ADS)

    Koppe, Roland; Scientific MaNIDA-Team

    2013-04-01

    The Marine Network for Integrated Data Access (MaNIDA) aims to build a sustainable e-infrastructure to support discovery and re-use of marine data from distinct data providers in Germany (see related abstracts in session ESSI 1.2). In order to provide users integrated access to and retrieval of expedition or cruise metadata, data, services and publications, as well as relationships among the various objects, we are developing (web) applications based on state-of-the-art technologies: the Data Portal of German Marine Research. Since the distributed content providers in the German network have distinct objectives and mandates for storing digital objects (e.g. long-term data preservation, near real time data, publication repositories), we have to cope with heterogeneous metadata in terms of syntax and semantics, data types and formats, as well as access solutions. We have defined a set of core metadata elements which are common to our content providers and therefore useful for discovery and for building relationships among objects. Existing catalogues for various types of vocabularies are being used to assure the mapping to terms used community-wide. We distinguish between expedition metadata and continuously harvestable metadata objects from distinct data providers.
    • Existing expedition metadata from distinct sources is integrated and validated in order to create an expedition metadata catalogue which is used as the authoritative source for expedition-related content. The web application allows browsing by e.g. research vessel and date, exploring expeditions and research gaps by tracklines, and viewing expedition details (begin/end, ports, platforms, chief scientists, events, etc.). Expedition-related objects from harvesting are also dynamically associated with expedition information and presented to the user. Hence we will provide web services for accessing detailed expedition information.
    • Other harvestable content is separated into four categories: archived data and data products, near real time data, publications and reports. Reports are a special case of publication, describing cruise planning, cruise reports or popular reports on expeditions, and are orthogonal to e.g. peer-reviewed articles. Each object's metadata contains at least: identifier(s) (e.g. DOI/handle), title, author(s), date, expedition(s) and platform(s) (e.g. research vessel Polarstern). Furthermore, project(s), parameter(s), device(s) and e.g. geographic coverage are of interest. An international gazetteer resolves geographic coverage to region names and annotates the object metadata. Information is homogeneously presented to the user, independent of the underlying format, but adaptable to specific disciplines, e.g. bathymetry. Data access and dissemination information is also available to the user as a data download link or as web services (e.g. WFS, WMS).
    Based on relationship metadata we are dynamically building graphs of objects to support the user in finding possibly relevant associated objects. Technically, metadata is based on ISO/OGC standards or provider specifications. Metadata is harvested via OAI-PMH or OGC CSW and indexed with Apache Lucene. This enables powerful full-text search, geographic and temporal search, as well as faceting. In this presentation we will illustrate the architecture and the current implementation of our integrated approach.
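
    As a toy illustration of the object-relationship graphs described above, the sketch below links an expedition, a dataset and a publication with networkx and collects the related objects reachable from the expedition. The identifiers and relation labels are hypothetical.

        # Minimal sketch of building a graph of related objects (expeditions, datasets,
        # publications) from harvested core metadata, so that associated objects can be
        # suggested to the user. Identifiers and relationships below are hypothetical.
        import networkx as nx

        g = nx.Graph()

        # Nodes carry a "kind" attribute taken from the harvested category.
        g.add_node("expedition:PS82", kind="expedition")            # hypothetical IDs
        g.add_node("doi:10.1594/EXAMPLE.1", kind="archived_data")
        g.add_node("hdl:10013/EXAMPLE.2", kind="publication")

        # Edges record how the objects were linked (e.g. via a shared expedition label).
        g.add_edge("expedition:PS82", "doi:10.1594/EXAMPLE.1", relation="collected_during")
        g.add_edge("doi:10.1594/EXAMPLE.1", "hdl:10013/EXAMPLE.2", relation="cited_by")

        # Everything reachable from an expedition is a candidate "related object".
        related = nx.node_connected_component(g, "expedition:PS82") - {"expedition:PS82"}
        print(sorted(related))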

  11. The MAR databases: development and implementation of databases specific for marine metagenomics

    PubMed Central

    Klemetsen, Terje; Raknes, Inge A; Fu, Juan; Agafonov, Alexander; Balasundaram, Sudhagar V; Tartari, Giacomo; Robertsen, Espen

    2018-01-01

    We introduce the marine databases: MarRef, MarDB and MarCat (https://mmp.sfb.uit.no/databases/), which are publicly available resources that promote marine research and innovation. These data resources, which have been implemented in the Marine Metagenomics Portal (MMP) (https://mmp.sfb.uit.no/), are collections of richly annotated and manually curated contextual (metadata) and sequence databases representing three tiers of accuracy. While MarRef is a database of completely sequenced marine prokaryotic genomes, representing a marine prokaryote reference genome database, MarDB includes all incompletely sequenced prokaryotic genomes regardless of the level of completeness. The last database, MarCat, represents a gene (protein) catalog of uncultivable (and cultivable) marine genes and proteins derived from marine metagenomics samples. The first versions of MarRef and MarDB contain 612 and 3726 records, respectively. Each record is built up of 106 metadata fields, including attributes for sampling, sequencing, assembly and annotation in addition to the organism and taxonomic information. Currently, MarCat contains 1227 records with 55 metadata fields. Ontologies and controlled vocabularies are used in the contextual databases to enhance consistency. The user-friendly web interface lets visitors browse, filter and search the contextual databases and perform BLAST searches against the corresponding sequence databases. All contextual and sequence databases are freely accessible and downloadable from https://s1.sfb.uit.no/public/mar/. PMID:29106641
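
    As a small illustration of working with such richly annotated contextual (metadata) records once downloaded, the sketch below filters a tiny, made-up table with pandas. The column names and values are assumptions for illustration; the real MarRef/MarDB records carry 106 metadata fields.

        # Minimal sketch: filter contextual (metadata) records from a MAR-style table
        # with pandas. Column names and values are illustrative assumptions.
        import io
        import pandas as pd

        tsv = "record_id\tisolation_source\tgenome_completeness\n" \
              "MMP0001\tseawater\tcomplete\n" \
              "MMP0002\tmarine sediment\tdraft\n" \
              "MMP0003\tseawater\tcomplete\n"

        df = pd.read_csv(io.StringIO(tsv), sep="\t")
        seawater_complete = df[(df["isolation_source"] == "seawater")
                               & (df["genome_completeness"] == "complete")]
        print(seawater_complete["record_id"].tolist())   # -> ['MMP0001', 'MMP0003']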

  12. A Research on E - learning Resources Construction Based on Semantic Web

    NASA Astrophysics Data System (ADS)

    Rui, Liu; Maode, Deng

    Traditional e-learning platforms have the flaw that it is usually difficult to query and locate resources and to achieve cross-platform sharing and interoperability. In this paper, the semantic web and metadata standards are discussed, and an e-learning system framework based on the semantic web is put forward to address the flaws of traditional e-learning platforms.

  13. Application Profiling for Rural Communities: eGov Services and Training Resources in Rural Inclusion

    NASA Astrophysics Data System (ADS)

    Karamolegkos, Pantelis; Maroudas, Axel; Manouselis, Nikos

    Metadata plays a critical role in the design and development of online repositories. The efficiency and ease of use of the repositories are directly associated with the metadata structure, since end-user functionalities such as search, retrieval and access are highly dependent on how the metadata schema and application profile have been conceptualized and implemented. The need for efficient and interoperable application profiles is even more substantial when it comes to services related to the e-government (eGov) paradigm, given a) the close association between eGov-related services and metadata usage and b) the fact that the eGov concept is associated with time- and cost-critical processes, i.e. the interaction of citizens and services with public authorities. In this paper, we outline an effort related to application profiling for eGov services and training resources, used in the platform of RuralObservatory2.0, which will underpin a major objective of the ICT PSP Rural Inclusion project, i.e. the uptake of the eGov paradigm by rural communities.

  14. Compatibility Between Metadata Standards: Import Pipeline of CDISC ODM to the Samply.MDR.

    PubMed

    Kock-Schoppenhauer, Ann-Kristin; Ulrich, Hannes; Wagen-Zink, Stefanie; Duhm-Harbeck, Petra; Ingenerf, Josef; Neuhaus, Philipp; Dugas, Martin; Bruland, Philipp

    2018-01-01

    The establishment of a digital healthcare system is a national and community task. The Federal Ministry of Education and Research in Germany is providing funding for consortia consisting of, among others, university hospitals participating in the "Medical Informatics Initiative". Exchange of medical data between research institutions necessitates a place where meta-information for these data is made accessible. Within these consortia different metadata registry solutions were chosen. To promote interoperability between these solutions, we have examined whether the portal of Medical Data Models is eligible for managing and communicating metadata and relevant information across different data integration centres of the Medical Informatics Initiative and beyond. Apart from the MDM-portal, some ISO 11179-based systems such as Samply.MDR as well as openEHR-based solutions are going to be applied. In this paper, we have focused on the creation of a mapping model between the CDISC ODM standard and the Samply.MDR import format. In summary, it can be stated that the mapping model is feasible and promotes exchangeability between different metadata registry approaches.
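
    As an illustration of the first step in such a pipeline, the sketch below reads ItemDef data element definitions from a tiny CDISC ODM fragment and reshapes them into a simplified record that a metadata registry import could consume. The ODM fragment, the assumed v1.3 namespace and the target record layout are illustrative assumptions, not the published mapping model.

        # Minimal sketch: parse ItemDef definitions from a CDISC ODM document and
        # reshape them for a registry import. Fragment and target layout are assumed.
        import xml.etree.ElementTree as ET

        ODM_NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}   # assumed ODM v1.3 namespace

        odm_xml = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
          <Study OID="S.1"><MetaDataVersion OID="MDV.1" Name="Demo">
            <ItemDef OID="I.SYSBP" Name="Systolic blood pressure" DataType="integer"/>
            <ItemDef OID="I.SEX" Name="Sex" DataType="text"/>
          </MetaDataVersion></Study>
        </ODM>"""

        root = ET.fromstring(odm_xml)
        mdr_items = [
            {
                "designation": item.get("Name"),
                "datatype": item.get("DataType"),
                "source_oid": item.get("OID"),
            }
            for item in root.findall(".//odm:ItemDef", ODM_NS)
        ]
        print(mdr_items)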

  15. linkedISA: semantic representation of ISA-Tab experimental metadata.

    PubMed

    González-Beltrán, Alejandra; Maguire, Eamonn; Sansone, Susanna-Assunta; Rocca-Serra, Philippe

    2014-01-01

    Reporting and sharing experimental metadata - such as the experimental design, characteristics of the samples, and procedures applied - along with the analysis results, in a standardised manner ensures that datasets are comprehensible and, in principle, reproducible, comparable and reusable. Furthermore, sharing datasets in formats designed for consumption by humans and machines will also maximize their use. The Investigation/Study/Assay (ISA) open source metadata tracking framework facilitates standards-compliant collection, curation, visualization, storage and sharing of datasets, leveraging other platforms to enable analysis and publication. The ISA software suite includes several components used in an increasingly diverse set of life science and biomedical domains; it is underpinned by a general-purpose format, ISA-Tab, and conversions exist into formats required by public repositories. While ISA-Tab works well mainly as a human-readable format, we have also implemented a linked data approach to semantically define the ISA-Tab syntax. We present a semantic web representation of the ISA-Tab syntax that complements ISA-Tab's syntactic interoperability with semantic interoperability. We introduce the linkedISA conversion tool from ISA-Tab to the Resource Description Framework (RDF), supporting mappings from the ISA syntax to multiple community-defined, open ontologies and capitalising on user-provided ontology annotations in the experimental metadata. We describe insights of the implementation and how annotations can be expanded driven by the metadata. We applied the conversion tool as part of Bio-GraphIIn, a web-based application supporting integration of the semantically-rich experimental descriptions. Designed in a user-friendly manner, the Bio-GraphIIn interface hides most of the complexities from the users, exposing a familiar tabular view of the experimental description to allow seamless interaction with the RDF representation, and visualising descriptors to drive the query over the semantic representation of the experimental design. In addition, we defined queries over the linkedISA RDF representation and demonstrated its use over the linkedISA conversion of datasets from Nature's Scientific Data online publication. Our linked data approach has allowed us to: 1) make the ISA-Tab semantics explicit and machine-processable, 2) exploit the existing ontology-based annotations in the ISA-Tab experimental descriptions, 3) augment the ISA-Tab syntax with new descriptive elements, 4) visualise and query elements related to the experimental design. Reasoning over ISA-Tab metadata and associated data will facilitate data integration and knowledge discovery.
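
    A toy illustration of the tabular-to-RDF idea follows: one ISA-Tab-style study row (a sample name plus a characteristic) becomes a handful of triples with rdflib. The namespace and predicates are hypothetical placeholders, not the community ontologies or mappings actually used by linkedISA.

        # Toy illustration of converting one ISA-Tab-style study row into RDF triples.
        # The namespace and predicates are hypothetical placeholders.
        from rdflib import Graph, Literal, Namespace, RDF

        EX = Namespace("http://example.org/isa#")   # hypothetical namespace

        row = {"Sample Name": "sample-1", "Characteristics[organism]": "Homo sapiens"}

        g = Graph()
        g.bind("ex", EX)

        sample = EX[row["Sample Name"]]
        g.add((sample, RDF.type, EX.Sample))
        g.add((sample, EX.characteristic_organism, Literal(row["Characteristics[organism]"])))

        print(g.serialize(format="turtle"))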

  16. Catalog Federation and Interoperability for Geoinformatics

    NASA Astrophysics Data System (ADS)

    Memon, A.; Lin, K.; Baru, C.

    2008-12-01

    With the increasing proliferation of online resources in the geosciences, including data, tools, and software services, there is also a proliferation of catalogs containing metadata that describe these resources. To realize the vision articulated in the NSF Workshop on Building a National Geoinformatics System, March 2007-where a user can sit at a terminal and easily search, discover, integrate and use distributed geoscience resources-it will be essential that a search request be able to traverse these multiple metadata catalogs. In this paper, we describe our effort at prototyping catalog interoperability across multiple metadata catalogs. An example of a metadata catalog is the one employed in the GEON Project (www.geongrid.org). The central GEON catalog can be searched using spatial, temporal, and other metadata-based search criteria. The search can be invoked as a Web service and, therefore, can be embedded in any software application. There has been a requirement from some of the GEON collaborators (for example, at the University of Hyderabad, India and the Navajo Technical College, New Mexico) to deploy their own catalogs, to store information about their resources locally, while they publish some of this information for broader access and use. Thus, a search must now be able to span multiple, independent GEON catalogs. Next, some of our collaborators-e.g. GEO Grid (Global Earth Observations Grid) in Japan-are implementing the Catalog Services for the Web (CS-W) standard for their catalog, thereby requiring the search to span across catalogs implemented using the CS-W standard as well. Finally, we have recently deployed a search service to access all EarthScope data products, which are distributed across organizations in Seattle, WA (IRIS), Boulder, CO (UNAVCO), and Potsdam, Germany (ICDP/GFZ). This service essentially implements a virtual catalog (the actual catalogs and data are stored at the remote locations). So, there is the need to incorporate such 3rd party searches within a broader search function, such as GEONsearch in the GEON Portal. We will discuss technical issues involved in designing and deploying such a multi-catalog search service in GEON.
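
    As a minimal sketch of a federated search across several CS-W catalogs of the kind discussed above, the code below queries two endpoints with OWSLib and merges results by record identifier. The catalog URLs are placeholders, not the actual GEON or partner endpoints.

        # Minimal sketch: federated keyword search across several OGC CSW metadata
        # catalogs with OWSLib, merged and de-duplicated by record identifier.
        from owslib.csw import CatalogueServiceWeb
        from owslib.fes import PropertyIsLike

        catalog_urls = [
            "https://example.org/geon/csw",      # hypothetical endpoints
            "https://example.net/partner/csw",
        ]

        query = PropertyIsLike("csw:AnyText", "%gravity%")   # full-text-style constraint

        merged = {}
        for url in catalog_urls:
            csw = CatalogueServiceWeb(url)
            csw.getrecords2(constraints=[query], maxrecords=20)
            for identifier, record in csw.records.items():
                merged.setdefault(identifier, record)        # de-duplicate across catalogs

        for identifier, record in merged.items():
            print(identifier, "-", record.title)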

  17. EMODNet Bathymetry - building and providing a high resolution digital bathymetry for European seas

    NASA Astrophysics Data System (ADS)

    Schaap, Dick M. A.

    2015-04-01

    Access to marine data is a key issue for the implementation of the EU Marine Strategy Framework Directive (MSFD). The EU communication 'Marine Knowledge 2020' underpins the importance of data availability and harmonising access to marine data from different sources. The European Marine Observation and Data Network (EMODnet) is a long term marine data initiative from the European Commission Directorate-General for Maritime Affairs and Fisheries (DG MARE) underpinning the Marine Knowledge 2020 strategy. EMODnet is a consortium of organisations assembling European marine data, data products and metadata from diverse sources in a uniform way. The main purpose of EMODnet is to unlock fragmented and hidden marine data resources and to make these available to individuals and organisations (public and private), and to facilitate investment in sustainable coastal and offshore activities through improved access to quality-assured, standardised and harmonised marine data which are interoperable and free of restrictions on use. The EMODnet data infrastructure is developed through a stepwise approach in three major phases. Currently EMODnet is in the 2nd phase of development with seven sub-portals in operation that provide access to marine data from the following themes: bathymetry, geology, physics, chemistry, biology, seabed habitats and human activities. EMODnet development is a dynamic process, so new data, products and functionality are added regularly, while portals are continuously improved to make the service more fit for purpose and user friendly with the help of users and stakeholders. The EMODnet Bathymetry project develops and publishes Digital Terrain Models (DTM) for the European seas. These are produced from survey and aggregated data sets that are indexed with metadata by adopting the SeaDataNet Common Data Index (CDI) data discovery and access service and the SeaDataNet Sextant data products catalogue service. The new EMODnet DTM will have a resolution of 1/8 arcminute * 1/8 arcminute and will cover all European sea regions. Use is made of available and gathered surveys; already more than 10,000 surveys, originating from more than 120 organisations, have been indexed by 24 European data providers. Use is also made of composite DTMs as generated and maintained by several data providers for their areas of interest. Already 44 composite DTMs are included in the Sextant data products catalogue. For areas without coverage, use is made of the latest global DTM of GEBCO, which is a partner in the EMODnet Bathymetry project. In return, GEBCO integrates the EMODnet DTM to achieve an enriched and better result. The catalogue services and the generated EMODnet DTM can be queried and browsed at the dedicated EMODnet Bathymetry portal, which also provides a versatile DTM viewing service with many relevant map layers and functions for data retrieval. Activities are underway for further refinement following user feedback. The EMODnet DTM is publicly available for downloading in various formats. The presentation will highlight key details of the EMODnet Bathymetry project, its portal, and the new EMODnet Digital Bathymetry for European seas, to be released in early 2015.
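
    As an illustration of working with a downloaded DTM tile, the sketch below opens a NetCDF file and reports basic statistics for a sub-region. The file name and the variable names ("lon", "lat", "elevation") are assumptions for illustration; the actual EMODnet DTM product layout may differ.

        # Minimal sketch: read a DTM tile in NetCDF format and summarise elevation in a
        # small bounding box. File name and variable names are assumed for illustration.
        import numpy as np
        from netCDF4 import Dataset

        with Dataset("emodnet_dtm_tile.nc") as ds:       # hypothetical local file
            lon = ds.variables["lon"][:]
            lat = ds.variables["lat"][:]
            z = ds.variables["elevation"][:]             # negative values = depth below sea level

            # Select a small bounding box (part of the North Sea, purely illustrative).
            i = np.where((lat >= 54.0) & (lat <= 55.0))[0]
            j = np.where((lon >= 2.0) & (lon <= 3.0))[0]
            subset = z[np.ix_(i, j)]

            print("min/mean/max elevation (m):",
                  float(subset.min()), float(subset.mean()), float(subset.max()))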

  18. Modernized Techniques for Dealing with Quality Data and Derived Products

    NASA Astrophysics Data System (ADS)

    Neiswender, C.; Miller, S. P.; Clark, D.

    2008-12-01

    "I just want a picture of the ocean floor in this area" is expressed all too often by researchers, educators, and students in the marine geosciences. As more sophisticated systems are developed to handle data collection and processing, the demand for quality data, and standardized products continues to grow. Data management is an invisible bridge between science and researchers/educators. The SIOExplorer digital library presents more than 50 years of ocean-going research. Prior to publication, all data is checked for quality using standardized criterion developed for each data stream. Despite the evolution of data formats and processing systems, SIOExplorer continues to present derived products in well- established formats. Standardized products are published for each cruise, and include a cruise report, MGD77 merged data, multi-beam flipbook, and underway profiles. Creation of these products is made possible by processing scripts, which continue to change with ever-evolving data formats. We continue to explore the potential of database-enabled creation of standardized products, such as the metadata-rich MGD77 header file. Database-enabled, automated processing produces standards-compliant metadata for each data and derived product. Metadata facilitates discovery and interpretation of published products. This descriptive information is stored both in an ASCII file, and a searchable digital library database. SIOExplorer's underlying technology allows focused search and retrieval of data and products. For example, users can initiate a search of only multi-beam data, which includes data-specific parameters. This customization is made possible with a synthesis of database, XML, and PHP technology. The combination of standardized products and digital library technology puts quality data and derived products in the hands of scientists. Interoperable systems enable distribution these published resources using technology such as web services. By developing modernized strategies to deal with data, Scripps Institution of Oceanography is able to produce and distribute well-formed, and quality-tested derived products, which aid research, understanding, and education.

  19. The New Online Metadata Editor for Generating Structured Metadata

    NASA Astrophysics Data System (ADS)

    Devarakonda, R.; Shrestha, B.; Palanisamy, G.; Hook, L.; Killeffer, T.; Boden, T.; Cook, R. B.; Zolly, L.; Hutchison, V.; Frame, M. T.; Cialella, A. T.; Lazer, K.

    2014-12-01

    Nobody is better suited to "describe" data than the scientist who created it. This "description" of a dataset is called metadata. In general terms, metadata represents the who, what, when, where, why and how of the dataset. eXtensible Markup Language (XML) is the preferred output format for metadata, as it makes the metadata portable and, more importantly, suitable for system discoverability. The newly developed ORNL Metadata Editor (OME) is a Web-based tool that allows users to create and maintain XML files containing key information, or metadata, about the research. Metadata include information about the specific projects, parameters, time periods, and locations associated with the data. Such information helps put the research findings in context. In addition, the metadata produced using OME will allow other researchers to find these data via metadata clearinghouses like Mercury [1] [2]. Researchers simply use the ORNL Metadata Editor to enter relevant metadata into a Web-based form. How is OME helping Big Data Centers like the ORNL DAAC? The ORNL DAAC is one of NASA's Earth Observing System Data and Information System (EOSDIS) data centers managed by the ESDIS Project. The ORNL DAAC archives data produced by NASA's Terrestrial Ecology Program. The DAAC provides data and information relevant to biogeochemical dynamics, ecological data, and environmental processes, critical for understanding the dynamics relating to the biological components of the Earth's environment. Typically, the data produced, archived and analyzed are at a scale of multiple petabytes, which makes data discoverability very challenging. Without proper metadata associated with the data, it is difficult to find the data you are looking for and equally difficult to use and understand the data. OME will allow data centers like the ORNL DAAC to produce meaningful, high-quality, standards-based, descriptive information about their data products, in turn helping with data discoverability and interoperability. References: [1] Devarakonda, Ranjeet, et al. "Mercury: reusable metadata management, data discovery and access system." Earth Science Informatics 3.1-2 (2010): 87-94. [2] Wilson, Bruce E., et al. "Mercury Toolset for Spatiotemporal Metadata." NASA Technical Reports Server (NTRS) (2010).

  20. Energize New Mexico - Integration of Diverse Energy-Related Research Data into an Interoperable Geospatial Infrastructure and National Data Repositories

    NASA Astrophysics Data System (ADS)

    Hudspeth, W. B.; Barrett, H.; Diller, S.; Valentin, G.

    2016-12-01

    Energize is New Mexico's Experimental Program to Stimulate Competitive Research (NM EPSCoR), funded by the NSF with a focus on building capacity to conduct scientific research. Energize New Mexico leverages the work of faculty and students from NM universities and colleges to provide the tools necessary for a quantitative, science-driven discussion of the state's water policy options and to realize New Mexico's potential for sustainable energy development. This presentation discusses the architectural details of NM EPSCoR's collaborative data management system, GSToRE, and how New Mexico researchers use it to share and analyze diverse research data, with the goal of attaining sustainable energy development in the state. The Earth Data Analysis Center (EDAC) at The University of New Mexico leads the development of computational interoperability capacity that allows the wide use and sharing of energy-related data among NM EPSCoR researchers. Data from a variety of research disciplines are stored and maintained in EDAC's Geographic Storage, Transformation and Retrieval Engine (GSToRE), a distributed platform for large-scale vector and raster data discovery, subsetting, and delivery via Web services that are based on Open Geospatial Consortium (OGC) and REST Web-service standards. Researchers upload and register scientific datasets using a front-end client that collects the critical metadata. In addition, researchers have the option to register their datasets with DataONE, a national, community-driven project that provides access to data across multiple member repositories. The GSToRE platform maintains a searchable, core collection of metadata elements that can be used to deliver metadata in multiple formats, including ISO 19115-2/19139 and FGDC CSDGM. Stored metadata elements also permit the platform to automate the registration of Energize datasets into DataONE, once the datasets are approved for release to the public.

  1. Applying an Archetype-Based Approach to Electroencephalography/Event-Related Potential Experiments in the EEGBase Resource.

    PubMed

    Papež, Václav; Mouček, Roman

    2017-01-01

    The purpose of this study is to investigate the feasibility of applying openEHR (an archetype-based approach for electronic health records representation) to modeling data stored in EEGBase, a portal for experimental electroencephalography/event-related potential (EEG/ERP) data management. The study evaluates re-usage of existing openEHR archetypes and proposes a set of new archetypes together with the openEHR templates covering the domain. The main goals of the study are to (i) link existing EEGBase data/metadata and openEHR archetype structures and (ii) propose a new openEHR archetype set describing the EEG/ERP domain, since this set of archetypes currently does not exist in public repositories. The main methodology is based on the determination of the concepts obtained from EEGBase experimental data and metadata that are expressible structurally by the openEHR reference model and semantically by openEHR archetypes. In addition, templates as the third openEHR resource allow us to define constraints over archetypes. Clinical Knowledge Manager (CKM), a public openEHR archetype repository, was searched for the archetypes matching the determined concepts. According to the search results, the archetypes already existing in CKM were applied and the archetypes not existing in CKM were newly developed. openEHR archetypes support linkage to external terminologies. To increase semantic interoperability of the new archetypes, binding with the existing odML electrophysiological terminology was assured. Further, to increase structural interoperability, other current solutions besides EEGBase were also considered during the development phase. Finally, a set of templates using the selected archetypes was created to meet EEGBase requirements. A set of eleven archetypes that encompassed the domain of experimental EEG/ERP measurements was identified. Of these, six were reused without changes, one was extended, and four were newly created. All archetypes were arranged in the templates reflecting the EEGBase metadata structure. A mechanism of odML terminology referencing was proposed to assure semantic interoperability of the archetypes. The openEHR approach was found to be useful not only for clinical purposes but also for experimental data modeling.

  2. The MAR databases: development and implementation of databases specific for marine metagenomics.

    PubMed

    Klemetsen, Terje; Raknes, Inge A; Fu, Juan; Agafonov, Alexander; Balasundaram, Sudhagar V; Tartari, Giacomo; Robertsen, Espen; Willassen, Nils P

    2018-01-04

    We introduce the marine databases: MarRef, MarDB and MarCat (https://mmp.sfb.uit.no/databases/), which are publicly available resources that promote marine research and innovation. These data resources, which have been implemented in the Marine Metagenomics Portal (MMP) (https://mmp.sfb.uit.no/), are collections of richly annotated and manually curated contextual (metadata) and sequence databases representing three tiers of accuracy. While MarRef is a database of completely sequenced marine prokaryotic genomes, representing a marine prokaryote reference genome database, MarDB includes all incompletely sequenced prokaryotic genomes regardless of the level of completeness. The last database, MarCat, represents a gene (protein) catalog of uncultivable (and cultivable) marine genes and proteins derived from marine metagenomics samples. The first versions of MarRef and MarDB contain 612 and 3726 records, respectively. Each record is built up of 106 metadata fields, including attributes for sampling, sequencing, assembly and annotation in addition to the organism and taxonomic information. Currently, MarCat contains 1227 records with 55 metadata fields. Ontologies and controlled vocabularies are used in the contextual databases to enhance consistency. The user-friendly web interface lets visitors browse, filter and search the contextual databases and perform BLAST searches against the corresponding sequence databases. All contextual and sequence databases are freely accessible and downloadable from https://s1.sfb.uit.no/public/mar/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  3. Design of Community Resource Inventories as a Component of Scalable Earth Science Infrastructure: Experience of the Earthcube CINERGI Project

    NASA Astrophysics Data System (ADS)

    Zaslavsky, I.; Richard, S. M.; Valentine, D. W., Jr.; Grethe, J. S.; Hsu, L.; Malik, T.; Bermudez, L. E.; Gupta, A.; Lehnert, K. A.; Whitenack, T.; Ozyurt, I. B.; Condit, C.; Calderon, R.; Musil, L.

    2014-12-01

    EarthCube is envisioned as a cyberinfrastructure that fosters new, transformational geoscience by enabling sharing, understanding and scientifically-sound and efficient re-use of formerly unconnected data resources, software, models, repositories, and computational power. Its purpose is to enable science enterprise and workforce development via an extensible and adaptable collaboration and resource integration framework. A key component of this vision is development of comprehensive inventories supporting resource discovery and re-use across geoscience domains. The goal of the EarthCube CINERGI (Community Inventory of EarthCube Resources for Geoscience Interoperability) project is to create a methodology and assemble a large inventory of high-quality information resources with standard metadata descriptions and traceable provenance. The inventory is compiled from metadata catalogs maintained by geoscience data facilities, as well as from user contributions. The latter mechanism relies on community resource viewers: online applications that support update and curation of metadata records. Once harvested into CINERGI, metadata records from domain catalogs and community resource viewers are loaded into a staging database implemented in MongoDB and validated for compliance with the ISO 19139 metadata schema. Several types of metadata defects detected by the validation engine are automatically corrected with the help of information extractors or flagged for manual curation. The metadata harvesting, validation and processing components generate provenance statements using W3C PROV notation, which are stored in a Neo4J database. The curated metadata, along with the provenance information, is then re-published and can be accessed programmatically and via a CINERGI online application. This presentation focuses on the role of resource inventories in a scalable and adaptable information infrastructure, and on the CINERGI metadata pipeline and its implementation challenges. Key project components are described at the project's website (http://workspace.earthcube.org/cinergi), which also provides access to the initial resource inventory, the inventory metadata model, metadata entry forms and a collection of the community resource viewers.
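
    As a small illustration of W3C PROV statements of the kind mentioned above, the sketch below records provenance for one harvested-and-curated metadata record using the Python "prov" package. The identifiers and namespace are hypothetical, and the Neo4J storage step is not shown.

        # Minimal sketch: express provenance for one curated metadata record with
        # W3C PROV. Identifiers and the namespace are hypothetical placeholders.
        from prov.model import ProvDocument

        doc = ProvDocument()
        doc.add_namespace("ex", "http://example.org/cinergi/")   # hypothetical namespace

        source = doc.entity("ex:record-42-as-harvested")
        curated = doc.entity("ex:record-42-curated")
        pipeline = doc.activity("ex:validation-run-2014-10-01")
        extractor = doc.agent("ex:keyword-extractor")

        doc.used(pipeline, source)                  # the pipeline read the harvested record
        doc.wasGeneratedBy(curated, pipeline)       # and produced the curated version
        doc.wasDerivedFrom(curated, source)
        doc.wasAssociatedWith(pipeline, extractor)

        print(doc.get_provn())     # human-readable PROV-N serialization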

  4. Re-use of standard ontologies in a water quality vocabulary (Invited)

    NASA Astrophysics Data System (ADS)

    Cox, S. J.; Simons, B.; Yu, J.

    2013-12-01

    Observations provide the key constraints on environmental and earth science investigations. Where an investigation uses data sourced from multiple providers, data fusion depends on the observation classifications being comparable. Standard models for observation metadata are available (ISO 19156) which provide slots for key classifiers, in particular, the observed property and observation procedure. While universal use of common vocabularies might be desirable in achieving interoperability, this is unlikely in practice. However, semantic web vocabularies provide the means for asserting proximity and other relationships between items in different vocabularies, thus enabling mediation as an interoperability solution. Here we report on the development of a vocabulary for water quality observations in which recording relationships with existing vocabularies was a core strategy. The vocabulary is required to enable combination of a number of groundwater, surface water and marine water quality datasets on an ongoing basis. Our vocabulary model is based on the principle that observations generally report values of specific parameters which are defined by combining a number of facets. We start from Quantities, Units, Dimensions and Data Types (QUDT), which is an OWL ontology developed by NASA and TopQuadrant. We extend this with two additional classes, for Observed Property and Identified Object, and two linking properties, which enable us to create an observed property vocabulary for water quality applications. This ontology is comparable with models for observed properties developed as part of OGC's Observations and Measurements v1.0 standard, the INSPIRE Generic Conceptual Model, and may also be compared with the W3C SSN Ontology, which is based on the DOLCE Ultralite upper-ontology. Water quality observations commonly report concentrations of chemicals, both natural and contaminant, so we tie many of the Identified Objects to items from Chemical Entities of Biological Interest (ChEBI). ChEBI is an OWL-based dictionary of over 70 000 molecular entities, based on existing scientific work and linked through to International Union of Pure and Applied Chemistry (IUPAC) Nomenclature. Within the model the relevant classes, including those from QUDT, are declared to be subclasses of SKOS Concept, so the resulting vocabularies may be directly mapped to other SKOS-based vocabularies, such as from the NERC Vocabulary Service or the Marine Metadata Initiative, using SKOS predicates. Where the external vocabularies are not published with persistent URIs, such as CUAHSI, the mapping may be recorded more informally using annotations, or use proxy URIs for the external vocabulary. The resulting SKOS vocabularies demonstrate a separation of governance of key definitions such as units and quantities and chemical entities, ensuring reuse where possible and extending and adding detail where necessary.
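
    As a small illustration of recording cross-vocabulary relationships with SKOS mapping predicates, the sketch below asserts matches between a local observed-property concept, an external vocabulary term and a chemical entity using rdflib. The concept URIs are placeholders, not actual entries in the water-quality vocabulary, the NERC Vocabulary Service or ChEBI.

        # Minimal sketch: assert SKOS mapping relations between vocabulary concepts.
        # All URIs below are placeholders chosen for illustration.
        from rdflib import Graph, URIRef
        from rdflib.namespace import SKOS

        g = Graph()
        g.bind("skos", SKOS)

        local = URIRef("http://example.org/wq/NitrateConcentration")      # hypothetical
        external = URIRef("http://example.net/vocab/nitrate_in_water")    # hypothetical
        chemical = URIRef("http://example.org/chebi/nitrate")             # ChEBI-style ID, assumed

        g.add((local, SKOS.exactMatch, external))    # same meaning in another vocabulary
        g.add((local, SKOS.relatedMatch, chemical))  # links the property to the chemical entity

        print(g.serialize(format="turtle"))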

  5. Quality Metadata Management for Geospatial Scientific Workflows: from Retrieving to Assessing with Online Tools

    NASA Astrophysics Data System (ADS)

    Leibovici, D. G.; Pourabdollah, A.; Jackson, M.

    2011-12-01

    Experts and decision-makers use or develop models to monitor global and local changes of the environment. Their activities require the combination of data and processing services in a flow of operations and spatial data computations: a geospatial scientific workflow. The seamless ability to generate, re-use and modify a geospatial scientific workflow is an important requirement, but the quality of the outcomes is equally important [1]. Metadata information attached to the data and processes, and particularly their quality, is essential to assess the reliability of the scientific model that represents a workflow [2]. Tools for managing qualitative and quantitative metadata measures of the quality associated with a workflow are therefore required by modellers. To ensure interoperability, ISO and OGC standards [3] are to be adopted, allowing, for example, one to define metadata profiles and to retrieve them via web service interfaces. However, these standards need a few extensions when applied to workflows, particularly in the context of geoprocess metadata. We propose to fill this gap (i) through the provision of a metadata profile for the quality of processes, and (ii) through a framework, based on XPDL [4], for managing the quality information. Web Processing Services are used to implement a range of metadata analyses on the workflow in order to evaluate and present quality information at different levels of the workflow. This generates the metadata quality, stored in the XPDL file. The focus is (a) on the visual representations of the quality, summarizing the retrieved quality information either from the standardized metadata profiles of the components or from non-standard quality information, e.g., Web 2.0 information, and (b) on the estimated qualities of the outputs derived from meta-propagation of uncertainties (a principle that we have introduced [5]). An a priori validation of the decision-making that the workflow outputs will support is then provided using the meta-propagated qualities, obtained without running the workflow [6], together with a visualization on the workflow graph itself that points out where the workflow needs better data or better processes. [1] Leibovici, DG, Hobona, G, Stock, K, Jackson, M (2009) Qualifying geospatial workflow models for adaptive controlled validity and accuracy. In: IEEE 17th GeoInformatics, 1-5 [2] Leibovici, DG, Pourabdollah, A (2010a) Workflow Uncertainty using a Metamodel Framework and Metadata for Data and Processes. OGC TC/PC Meetings, September 2010, Toulouse, France [3] OGC (2011) www.opengeospatial.org [4] XPDL (2008) Workflow Process Definition Interface - XML Process Definition Language. Workflow Management Coalition, Document WfMC-TC-1025, 2008 [5] Leibovici, DG, Pourabdollah, A, Jackson, M (2011) Meta-propagation of Uncertainties for Scientific Workflow Management in Interoperable Spatial Data Infrastructures. In: Proceedings of the European Geosciences Union (EGU2011), April 2011, Austria [6] Pourabdollah, A, Leibovici, DG, Jackson, M (2011) MetaPunT: an Open Source tool for Meta-Propagation of uncerTainties in Geospatial Processing. In: Proceedings of OSGIS2011, June 2011, Nottingham, UK

  6. The Arctic Cooperative Data and Information System: Data Management Support for the NSF Arctic Research Program (Invited)

    NASA Astrophysics Data System (ADS)

    Moore, J.; Serreze, M. C.; Middleton, D.; Ramamurthy, M. K.; Yarmey, L.

    2013-12-01

    The NSF funds the Advanced Cooperative Arctic Data and Information System (ACADIS, http://www.aoncadis.org/). It serves the growing and increasingly diverse data management needs of NSF's arctic research community. The ACADIS investigator team combines experienced data managers, curators and software engineers from NSIDC, UCAR and NCAR. ACADIS fosters scientific synthesis and discovery by providing a secure long-term data archive to NSF investigators. The system provides discovery and access to arctic-related data from this and other archives. This paper updates the technical components of ACADIS, the implementation of best practices, the value of ACADIS to the community and the major challenges facing this archive in handling the diverse data coming from NSF Arctic investigators. ACADIS provides sustainable data management, data stewardship services and leadership for the NSF Arctic research community through open data sharing, adherence to best practices and standards, capitalizing on appropriate evolving technologies, and community support and engagement. ACADIS leverages other pertinent projects, capitalizing on appropriate emerging technologies and participating in emerging cyberinfrastructure initiatives. The key elements of ACADIS user services to the NSF Arctic community include: data and metadata upload; support for datasets with special requirements; metadata and documentation generation; interoperability and initiatives with other archives; and science support to investigators and the community. Providing a self-service data publishing platform that requires minimal curation oversight while maintaining rich metadata for discovery, access and preservation is challenging. Implementing metadata standards is a first step towards consistent content. The ACADIS Gateway and ADE offer users choices for data discovery and access with the clear objective of increasing discovery and use of all Arctic data, especially for analysis activities. Metadata is at the core of ACADIS activities, from capturing metadata at the point of data submission to ensuring interoperability, providing data citations, and supporting data discovery. ACADIS metadata efforts include: 1) evolution of the ACADIS metadata profile to increase flexibility in search; 2) documentation guidelines; and 3) metadata standardization efforts. A major activity is now underway to ensure consistency in the metadata profile across all archived datasets. ACADIS is also embarking on a critical activity to create Digital Object Identifiers (DOIs) for all its holdings. The data services offered by ACADIS focus on meeting the needs of data providers, providing dynamic search capabilities to peruse the ACADIS and related cryospheric data repositories, efficient data download and some special services including dataset reformatting and visualization. The service is built around the following key technical elements: the ACADIS Gateway, housed at NCAR, has been developed to support NSF Arctic data coming from AON and now more broadly across PLR/ARC and related archives; the Arctic Data Explorer (ADE), developed at NSIDC, is an integral service of ACADIS bringing the rich archive from NSIDC together with catalogs from ACADIS and international partners in Arctic research; and Rosetta and the Digital Object Identifier (DOI) generation scheme are tools available to the community to help publish and utilize datasets for integration, synthesis and publication.

  7. Using Open and Interoperable Ways to Publish and Access LANCE AIRS Near-Real Time Data

    NASA Technical Reports Server (NTRS)

    Zhao, Peisheng; Lynnes, Christopher; Vollmer, Bruce; Savtchenko, Andrey; Theobald, Michael; Yang, Wenli

    2011-01-01

    The Atmospheric Infrared Sounder (AIRS) Near-Real Time (NRT) data from the Land Atmosphere Near real-time Capability for EOS (LANCE) element at the Goddard Earth Sciences Data and Information Services Center (GES DISC) provides information on the global and regional atmospheric state, with very low temporal latency, to support climate research and improve weather forecasting. An open and interoperable platform is useful to facilitate access to, and integration of, LANCE AIRS NRT data. As Web services technology has matured in recent years, a new scalable Service-Oriented Architecture (SOA) is emerging as the basic platform for distributed computing and large networks of interoperable applications. Following the provide-register-discover-consume SOA paradigm, this presentation discusses how to use open-source geospatial software components to build Web services for publishing and accessing AIRS NRT data, explore the metadata relevant to registering and discovering data and services in the catalogue systems, and implement a Web portal to facilitate users' consumption of the data and services.
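
    As a hedged illustration of the "discover" step in the provide-register-discover-consume pattern, the sketch below queries a CSW catalogue with OWSLib for records mentioning AIRS near-real-time data; the endpoint URL and search phrase are placeholders, not the actual LANCE or GES DISC catalogue interface.

        # Hedged sketch of catalogue discovery via OGC CSW using OWSLib.
        # The catalogue URL below is a placeholder, not a real service endpoint.
        from owslib.csw import CatalogueServiceWeb
        from owslib.fes import PropertyIsLike

        csw = CatalogueServiceWeb("https://example.gov/csw")               # placeholder endpoint
        query = PropertyIsLike("csw:AnyText", "%AIRS near-real-time%")     # free-text constraint
        csw.getrecords2(constraints=[query], maxrecords=10)

        for rec_id, rec in csw.records.items():                            # print discovered records
            print(rec_id, "-", rec.title)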

  8. The Planetary Data System Distributed Inventory System

    NASA Technical Reports Server (NTRS)

    Hughes, J. Steven; McMahon, Susan K.

    1996-01-01

    The advent of the World Wide Web (Web) and the ability to easily put data repositories on-line has resulted in a proliferation of digital libraries. The heterogeneity of the underlying systems, the autonomy of the individual sites, and distributed nature of the technology has made both interoperability across the sites and the search for resources within a site major research topics. This article will describe a system that addresses both issues using standard Web protocols and meta-data labels to implement an inventory of on-line resources across a group of sites. The success of this system is strongly dependent on the existence of and adherence to a standards architecture that guides the management of meta-data within participating sites.

  9. Geoscience Information Network (USGIN) Solutions for Interoperable Open Data Access Requirements

    NASA Astrophysics Data System (ADS)

    Allison, M. L.; Richard, S. M.; Patten, K.

    2014-12-01

    The geosciences are leading the development of free, interoperable, open access to data. The US Geoscience Information Network (USGIN) is a freely available data integration framework, jointly developed by the USGS and the Association of American State Geologists (AASG), in compliance with international standards and protocols to provide easy discovery, access, and interoperability for geoscience data. USGIN standards include the geologic exchange language GeoSciML (v3.2), which enables interoperability of geologic formation data and is also the base standard used by the 117-nation OneGeology consortium. The USGIN deployment of the National Geothermal Data System (NGDS) serves as a continent-scale operational demonstration of the expanded OneGeology vision to provide access to all geoscience data worldwide. USGIN is developed to accommodate a variety of applications; for example, the International Renewable Energy Agency streams data live to the Global Atlas of Renewable Energy. Alternatively, users without robust data sharing systems can download and implement a free software package, "GINstack", to easily deploy web services for exposing data online for discovery and access. The White House Open Data Access Initiative requires all federally funded research projects and federal agencies to make their data publicly accessible in an open, interoperable format, with metadata. USGIN currently incorporates all aspects of the Initiative, as it emphasizes interoperability. The system is successfully deployed as NGDS, officially launched at the White House Energy Datapalooza in May 2014. The USGIN Foundation has been established to ensure this technology continues to be accessible and available.

  10. Improvements to the Ontology-based Metadata Portal for Unified Semantics (OlyMPUS)

    NASA Astrophysics Data System (ADS)

    Linsinbigler, M. A.; Gleason, J. L.; Huffer, E.

    2016-12-01

    The Ontology-based Metadata Portal for Unified Semantics (OlyMPUS), funded by the NASA Earth Science Technology Office Advanced Information Systems Technology program, is an end-to-end system designed to support Earth Science data consumers and data providers, enabling the latter to register data sets and provision them with the semantically rich metadata that drives the Ontology-Driven Interactive Search Environment for Earth Sciences (ODISEES). OlyMPUS complements the ODISEES data discovery system with an intelligent tool that enables data producers to auto-generate semantically enhanced metadata and upload it to the metadata repository that drives ODISEES. Like ODISEES, the OlyMPUS metadata provisioning tool leverages robust semantics, a NoSQL database and query engine, an automated reasoning engine that performs first- and second-order deductive inferencing, and a controlled vocabulary to support data interoperability and automated analytics. The ODISEES data discovery portal leverages this metadata to provide a seamless data discovery and access experience for data consumers who are interested in comparing and contrasting the multiple Earth science data products available across NASA data centers. OlyMPUS will support scientists with services and tools for performing complex analyses and identifying correlations and non-obvious relationships across all types of Earth System phenomena, using the full spectrum of available NASA Earth Science data. By providing an intelligent discovery portal that supplies users - both human users and machines - with detailed information about data products, their contents and their structure, ODISEES will reduce the level of effort required to identify and prepare large volumes of data for analysis. This poster will explain how OlyMPUS leverages deductive reasoning and other technologies to create an integrated environment for generating and exploiting semantically rich metadata.

  11. Chapter 35: Describing Data and Data Collections in the VO

    NASA Astrophysics Data System (ADS)

    Kent, B. R.; Hanisch, R. J.; Williams, R. D.

    The list of numbers: 19.22, 17.23, 18.11, 16.98, and 15.11, is of little intrinsic interest without information about the context in which they appear. For instance, are these daily closing stock prices for your favorite investment, or are they hourly photometric measurements of an increasingly bright quasar? The information needed to define this context is called metadata. Metadata are data about data. Astronomers are familiar with metadata through the headers of FITS files and the names and units associated with columns in a table or database. In the VO, metadata describe the contents of tables, images, and spectra, as well as aggregate collections of data (archives, surveys) and computational services. Moreover, VO metadata are constructed according to rules that avoid ambiguity and make it clear whether, in the example above, the stock prices are in dollars or euros, or the photometry is Johnson V or Sloan g. Organization of data is important in any scientific discipline. Equally crucial are the descriptions of that data: the organization publishing the data, its creator or the person making it available, what instruments were used, units assigned to measurement, calibration status, and data quality assessment. The Virtual Observatory metadata scheme not only applies to datasets, but to resources as well, including data archive facilities, searchable web forms, and online analysis and display tools. Since the scientific output flowing from large datasets depends greatly on how well the data are described, it is important for users to understand the basics of the metadata scheme in order to locate the data that they want and use it correctly. Metadata are the key to data discovery and data and service interoperability in the Virtual Observatory.
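
    To make the point concrete, the snippet below (a generic illustration with a placeholder file name) reads a few header keywords from a FITS file with Astropy; without such metadata, the values in the data array would be as ambiguous as the bare numbers above.

        # Generic illustration: the data array only becomes meaningful through header metadata
        # such as the telescope, instrument, filter/band and physical units.
        from astropy.io import fits

        with fits.open("observation.fits") as hdul:              # placeholder file name
            header = hdul[0].header
            for key in ("TELESCOP", "INSTRUME", "FILTER", "BUNIT"):
                print(key, "=", header.get(key, "not recorded"))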

  12. Sensor metadata blueprints and computer-aided editing for disciplined SensorML

    NASA Astrophysics Data System (ADS)

    Tagliolato, Paolo; Oggioni, Alessandro; Fugazza, Cristiano; Pepe, Monica; Carrara, Paola

    2016-04-01

    The need for continuous, accurate, and comprehensive environmental knowledge has led to an increase in sensor observation systems and networks. The Sensor Web Enablement (SWE) initiative has been promoted by the Open Geospatial Consortium (OGC) to foster interoperability among sensor systems. The provision of metadata according to the prescribed SensorML schema is a key component for achieving this; nevertheless, the availability of correct and exhaustive metadata cannot be taken for granted. On the one hand, it is awkward for users to provide sensor metadata because of the lack of user-oriented, dedicated tools. On the other, the specification of invariant information for a given sensor category or model (e.g., observed properties and units of measurement, manufacturer information, etc.) can be labor- and time-consuming. Moreover, the provision of these details is error-prone and subjective, i.e., it may differ greatly across distinct descriptions of the same system. We provide a user-friendly, template-driven metadata authoring tool composed of a backend web service and an HTML5/javascript client. This results in a form-based user interface that conceals the high complexity of the underlying format. This tool also allows for plugging in external data sources providing authoritative definitions for the aforementioned invariant information. Leveraging these functionalities, we compiled a set of SensorML profiles, that is, sensor metadata blueprints allowing end users to focus only on the metadata items that are related to their specific deployment. The natural extension of this scenario is the involvement of end users and sensor manufacturers in the crowd-sourced evolution of this collection of prototypes. We describe the components and workflow of our framework for computer-aided management of sensor metadata.
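
    The blueprint idea can be sketched informally: invariant, per-model information is predefined and the end user supplies only deployment-specific fields. The element and field names below are simplified placeholders and do not constitute a valid SensorML document.

        # A rough sketch of the "blueprint" idea: merge invariant per-model metadata with
        # deployment-specific input, then serialize. Names are placeholders, not SensorML.
        import xml.etree.ElementTree as ET

        BLUEPRINT_CTD = {                          # invariant information for a sensor model (illustrative)
            "manufacturer": "ExampleCorp",
            "observedProperty": "sea_water_temperature",
            "uom": "Cel",
        }

        def build_description(blueprint, deployment):
            root = ET.Element("SensorDescription")
            for key, value in {**blueprint, **deployment}.items():
                ET.SubElement(root, key).text = str(value)
            return ET.tostring(root, encoding="unicode")

        print(build_description(BLUEPRINT_CTD, {"uniqueID": "urn:example:ctd:001",
                                                "deploymentDepth_m": 15}))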

  13. STAR Online Framework: from Metadata Collection to Event Analysis and System Control

    NASA Astrophysics Data System (ADS)

    Arkhipkin, D.; Lauret, J.

    2015-05-01

    In preparation for the new era of RHIC running (the RHIC-II upgrades and, possibly, the eRHIC era), the STAR experiment is expanding its modular Message Interface and Reliable Architecture framework (MIRA). MIRA allowed STAR to integrate meta-data collection, monitoring, and online QA components in a very agile and efficient manner using a messaging infrastructure approach. In this paper, we briefly summarize our past achievements, provide an overview of the recent development activities focused on messaging patterns, and describe our experience with the complex event processor (CEP) recently integrated into the MIRA framework. The CEP was used in the recent RHIC Run 14, which provided practical use cases. Finally, we present our requirements and expectations for the planned expansion of our systems, which will allow our framework to acquire features typically associated with Detector Control Systems. Special attention is given to aspects related to latency, scalability and interoperability within the heterogeneous set of services and the various data and meta-data acquisition components coexisting in the STAR online domain.

  14. Metadata for WIS and WIGOS: GAW Profile of ISO19115 and Draft WIGOS Core Metadata Standard

    NASA Astrophysics Data System (ADS)

    Klausen, Jörg; Howe, Brian

    2014-05-01

    The World Meteorological Organization (WMO) Integrated Global Observing System (WIGOS) is a key WMO priority that underpins all WMO Programs and new initiatives such as the Global Framework for Climate Services (GFCS). The development of the WIGOS Operational Information Resource (WIR) is central to the WIGOS Framework Implementation Plan (WIGOS-IP). The WIR shall provide information on WIGOS and its observing components, as well as on the requirements of WMO application areas. An important aspect is the description of observational capabilities by way of structured metadata. The Global Atmosphere Watch (GAW) is the WMO program addressing the chemical composition and selected physical properties of the atmosphere. Observational data are collected and archived by GAW World Data Centres (WDCs) and related data centres. The Task Team on GAW WDCs (ET-WDC) has developed a profile of the ISO 19115 metadata standard that is compliant with the WMO Information System (WIS) specification for the WMO Core Metadata Profile v1.3. This profile is intended to harmonize certain aspects of the documentation of observations as well as the interoperability of the WDCs. The Inter-Commission Group on WIGOS (ICG-WIGOS) has established the Task Team on WIGOS Metadata (TT-WMD), with representation from all WMO Technical Commissions and with the objective of defining the WIGOS Core Metadata. The result of this effort is a draft semantic standard comprising a set of metadata classes that are considered to be of critical importance for the interpretation of observations relevant to WIGOS. The purpose of the presentation is to acquaint the audience with the standard and to solicit informal feedback from experts in the various disciplines of meteorology and climatology. This feedback will help ET-WDC and TT-WMD to refine the GAW metadata profile and the draft WIGOS metadata standard, thereby increasing their utility and acceptance.

  15. CMO: Cruise Metadata Organizer for JAMSTEC Research Cruises

    NASA Astrophysics Data System (ADS)

    Fukuda, K.; Saito, H.; Hanafusa, Y.; Vanroosebeke, A.; Kitayama, T.

    2011-12-01

    JAMSTEC's Data Research Center for Marine-Earth Sciences manages and distributes a wide variety of observational data and samples obtained from JAMSTEC research vessels and deep sea submersibles. Generally, metadata are essential to identify how and where data and samples were obtained. In JAMSTEC, cruise metadata include cruise information such as cruise ID, name of vessel and research theme, and diving information such as dive number, name of submersible and position of the diving point. They are submitted by the chief scientists of research cruises in Microsoft Excel spreadsheet format, and registered into a data management database to confirm receipt of observational data files, cruise summaries, and cruise reports. The cruise metadata are also published via the "JAMSTEC Data Site for Research Cruises" within two months after the end of a cruise. Furthermore, these metadata are distributed with observational data, images and samples via several data and sample distribution websites after a publication moratorium period. However, there are two operational issues in the metadata publishing process. One is the duplication of effort and the asynchronous state of metadata across multiple distribution websites, caused by manual metadata entry into individual websites by administrators. The other is that data types and representations of metadata differ between websites. To solve those problems, we have developed a cruise metadata organizer (CMO) which allows cruise metadata to be propagated from the data management database to several distribution websites. CMO comprises three components: an Extensible Markup Language (XML) database, Enterprise Application Integration (EAI) software, and a web-based interface. The XML database is used because of its flexibility towards any change in the metadata. Daily differential uptake of metadata from the data management database to the XML database is processed automatically via the EAI software. Some metadata are entered into the XML database using the web-based interface by a metadata editor in CMO as needed. Then daily differential uptake of metadata from the XML database to the databases of several distribution websites is processed automatically using a convertor defined by the EAI software. Currently, CMO is available for three distribution websites: "Deep Sea Floor Rock Sample Database GANSEKI", "Marine Biological Sample Database", and "JAMSTEC E-library of Deep-sea Images". CMO is planned to provide "JAMSTEC Data Site for Research Cruises" with metadata in the future.
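
    The daily differential uptake step can be pictured with a simplified sketch: only records modified since the previous run are pushed downstream. The table layout, field names and example records below are hypothetical, and the real CMO uses an XML database and EAI software rather than SQL.

        # Simplified sketch of "differential uptake": copy only records updated since the last run
        # from a management database to a distribution database. All names and records are made up.
        import sqlite3

        SCHEMA = ("CREATE TABLE cruise_metadata ("
                  "cruise_id TEXT PRIMARY KEY, vessel TEXT, theme TEXT, updated_at TEXT)")

        src = sqlite3.connect(":memory:"); src.execute(SCHEMA)    # stands in for the management DB
        dst = sqlite3.connect(":memory:"); dst.execute(SCHEMA)    # stands in for a distribution DB
        src.executemany("INSERT INTO cruise_metadata VALUES (?, ?, ?, ?)",
                        [("XX11-01", "Example Maru", "Hydrothermal survey", "2011-06-02"),
                         ("XX11-03", "Example Maru", "Plate boundary dives", "2011-07-15")])

        def differential_uptake(last_run):
            """Copy records updated after last_run; return how many were propagated."""
            rows = src.execute("SELECT * FROM cruise_metadata WHERE updated_at > ?",
                               (last_run,)).fetchall()
            dst.executemany("INSERT OR REPLACE INTO cruise_metadata VALUES (?, ?, ?, ?)", rows)
            dst.commit()
            return len(rows)

        print(differential_uptake("2011-07-01"))   # -> 1: only the record changed since the last run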

  16. Ocean Data Interoperability Platform (ODIP): developing a common framework for marine data management on a global scale

    NASA Astrophysics Data System (ADS)

    Schaap, D.

    2015-12-01

    Europe, the USA, and Australia are making significant progress in facilitating the discovery, access and long term stewardship of ocean and marine data through the development, implementation, population and operation of national, regional or international distributed ocean and marine observing and data management infrastructures such as SeaDataNet, EMODnet, IOOS, R2R, and IMOS. All of these developments are resulting in the development of standards and services implemented and used by their regional communities. The Ocean Data Interoperability Platform (ODIP) project is supported by the EU FP7 Research Infrastructures programme, National Science Foundation (USA) and Australian government and has been initiated 1st October 2012. Recently the project has been continued as ODIP 2 for another 3 years with EU HORIZON 2020 funding. ODIP includes all the major organisations engaged in ocean data management in EU, US, and Australia. ODIP is also supported by the IOC-IODE, closely linking this activity with its Ocean Data Portal (ODP) and Ocean Data Standards Best Practices (ODSBP) projects. The ODIP platform aims to ease interoperability between the regional marine data management infrastructures. Therefore it facilitates an organised dialogue between the key infrastructure representatives by means of publishing best practice, organising a series of international workshops and fostering the development of common standards and interoperability solutions. These are evaluated and tested by means of prototype projects. The presentation will give further background on the ODIP projects and the latest information on the progress of three prototype projects addressing: establishing interoperability between the regional EU, USA and Australia data discovery and access services (SeaDataNet CDI, US NODC, and IMOS MCP) and contributing to the global GEOSS and IODE-ODP portals; establishing interoperability between cruise summary reporting systems in Europe, the USA and Australia for routine harvesting of cruise data for delivery via the Partnership for Observation of Global Oceans (POGO) global portal; establishing common standards for a Sensor Observation Service (SOS) for selected sensors installed on vessels and in real-time monitoring systems using sensor web enablement (SWE)

  17. Ocean Data Interoperability Platform (ODIP): developing a common framework for marine data management on a global scale

    NASA Astrophysics Data System (ADS)

    Schaap, Dick M. A.; Glaves, Helen

    2016-04-01

    Europe, the USA, and Australia are making significant progress in facilitating the discovery, access and long term stewardship of ocean and marine data through the development, implementation, population and operation of national, regional or international distributed ocean and marine observing and data management infrastructures such as SeaDataNet, EMODnet, IOOS, R2R, and IMOS. All of these developments are resulting in the development of standards and services implemented and used by their regional communities. The Ocean Data Interoperability Platform (ODIP) project is supported by the EU FP7 Research Infrastructures programme, National Science Foundation (USA) and Australian government and has been initiated 1st October 2012. Recently the project has been continued as ODIP II for another 3 years with EU HORIZON 2020 funding. ODIP includes all the major organisations engaged in ocean data management in EU, US, and Australia. ODIP is also supported by the IOC-IODE, closely linking this activity with its Ocean Data Portal (ODP) and Ocean Data Standards Best Practices (ODSBP) projects. The ODIP platform aims to ease interoperability between the regional marine data management infrastructures. Therefore it facilitates an organised dialogue between the key infrastructure representatives by means of publishing best practice, organising a series of international workshops and fostering the development of common standards and interoperability solutions. These are evaluated and tested by means of prototype projects. The presentation will give further background on the ODIP projects and the latest information on the progress of three prototype projects addressing: 1. establishing interoperability between the regional EU, USA and Australia data discovery and access services (SeaDataNet CDI, US NODC, and IMOS MCP) and contributing to the global GEOSS and IODE-ODP portals; 2. establishing interoperability between cruise summary reporting systems in Europe, the USA and Australia for routine harvesting of cruise data for delivery via the Partnership for Observation of Global Oceans (POGO) global portal; 3. the establishment of common standards for a Sensor Observation Service (SOS) for selected sensors installed on vessels and in real-time monitoring systems using sensor web enablement (SWE)

  18. SURA Coastal Ocean Observing and Prediction (SCOOP) Program: Integrating Marine Science and Information Technology

    DTIC Science & Technology

    2006-09-30

    coastal phenomena. OBJECTIVES SURA is creating a SCOOP “Grid” that extends the interoperability enabled by the World Wide Web. The coastal ... community faces special challenges with respect to achieving a level of interoperability that can leverage emerging Grid technologies. With that in mind

  19. SeaDataNet II - EMODNet - building a pan-European infrastructure for marine and ocean data management

    NASA Astrophysics Data System (ADS)

    Schaap, Dick M. A.; Fichaut, Michele

    2014-05-01

    The second phase of the SeaDataNet project has been well underway since October 2011 and is making good progress. The main objective is to improve operations and to progress towards an efficient data management infrastructure able to handle the diversity and large volume of data collected via research cruises and monitoring activities in European marine waters and global oceans. The SeaDataNet infrastructure comprises a network of interconnected data centres and a central SeaDataNet portal. The portal provides users with a unified and transparent overview of the metadata and controlled access to the large collections of data sets managed by the interconnected data centres, as well as the various SeaDataNet standards and tools. Recently the first Innovation Cycle has been completed, including upgrading of the CDI Data Discovery and Access service to ISO 19139 and making it fully INSPIRE compliant. The extensive SeaDataNet vocabularies have been upgraded too and implemented for all SeaDataNet European metadata directories. SeaDataNet is setting and governing marine data standards, and exploring and establishing interoperability solutions to connect to other e-infrastructures on the basis of standards of ISO (19115, 19139), OGC (WMS, WFS, CS-W and SWE), and OpenSearch. The population of the directories has also increased considerably through cooperation with, and involvement in, associated EU projects and initiatives. SeaDataNet now gives an overview of, and access to, more than 1.4 million data sets for physical oceanography, chemistry, geology, geophysics, bathymetry and biology from more than 90 connected data centres in 30 countries riparian to European seas. Access to marine data is also a key issue for the implementation of the EU Marine Strategy Framework Directive (MSFD). The EU communication 'Marine Knowledge 2020' underlines the importance of data availability and of harmonising access to marine data from different sources. SeaDataNet qualified itself for leading the data management component of EMODNet (European Marine Observation and Data Network), which is promoted in the EU Communication. In the past 4 years EMODNet portals have been initiated for the marine data themes digital bathymetry, chemistry, physical oceanography, geology, biology, and seabed habitat mapping. These portals are now being expanded to all European seas in successor projects, which started in mid-2013 under EU DG MARE. EMODNet encourages more data providers to come forward for data sharing and to participate in the process of making complete overviews and homogeneous data products. The EMODNet Bathymetry project illustrates well the synergy with SeaDataNet and the added value of generating public data products. The project develops and publishes Digital Terrain Models (DTM) for the European seas. These are produced from survey and aggregated data sets. The portal provides a versatile DTM viewing service with many relevant map layers and functions for data retrieval. A further refinement is taking place in the new phase. The presentation will give information on the present services of the SeaDataNet infrastructure, highlight key achievements in SeaDataNet II so far, and give further insight into the EMODNet Bathymetry progress.

  20. The RICORDO approach to semantic interoperability for biomedical data and models: strategy, standards and solutions

    PubMed Central

    2011-01-01

    Background The practice and research of medicine generates considerable quantities of data and model resources (DMRs). Although in principle biomedical resources are re-usable, in practice few can currently be shared. In particular, the clinical communities in physiology and pharmacology research, as well as medical education, (i.e. PPME communities) are facing considerable operational and technical obstacles in sharing data and models. Findings We outline the efforts of the PPME communities to achieve automated semantic interoperability for clinical resource documentation in collaboration with the RICORDO project. Current community practices in resource documentation and knowledge management are overviewed. Furthermore, requirements and improvements sought by the PPME communities to current documentation practices are discussed. The RICORDO plan and effort in creating a representational framework and associated open software toolkit for the automated management of PPME metadata resources is also described. Conclusions RICORDO is providing the PPME community with tools to effect, share and reason over clinical resource annotations. This work is contributing to the semantic interoperability of DMRs through ontology-based annotation by (i) supporting more effective navigation and re-use of clinical DMRs, as well as (ii) sustaining interoperability operations based on the criterion of biological similarity. Operations facilitated by RICORDO will range from automated dataset matching to model merging and managing complex simulation workflows. In effect, RICORDO is contributing to community standards for resource sharing and interoperability. PMID:21878109

  1. A Pragmatic Approach to Sustainable Interoperability for the Web 2.0 World

    NASA Astrophysics Data System (ADS)

    Wright, D. J.; Sankaran, S.

    2015-12-01

    In the geosciences, interoperability is a fundamental requirement. Members of various standards organizations such as the OGC and ISO TC 211 have done yeoman service in promoting a standards-centric approach to managing the interoperability challenges that organizations face today. The specific challenges that organizations face when adopting interoperability patterns are numerous. One approach, that of mandating the use of specific standards, has been reasonably successful. But scientific communities, as with all others, ultimately want their solutions to be widely accepted and used, and to this end there is a pressing need to explore all possible interoperability patterns without restricting the choices to mandated standards. Standards are created by a slow and deliberative process that can take a long time to come to fruition, and the results therefore sometimes fall short of user expectations. It seems, therefore, that organizations are left with a series of seemingly orthogonal requirements when they want to pursue interoperability: they want a robust but agile solution, a mature approach that also satisfies the latest technology trends, and so on. Sustainable interoperability patterns need to be forward-looking and should adopt the patterns and paradigms of the Web 2.0 generation. To this end, the key is to choose platform technologies that embrace multiple interoperability mechanisms, that are built on fundamental "open" principles, and that align with popular mainstream patterns. We seek to explore data-, metadata- and web service-related interoperability patterns through the prism of building solutions that encourage strong implementer and end-user engagement, improved usability and scalability, and appealing developer frameworks that can grow the audience. The path to tread is not new, and the geocommunity only needs to observe and align its end goals with current Web 2.0 patterns to realize all the benefits that we take for granted today in our everyday use of technology.

  2. The IEO Data Center Management System: Tools for quality control, analysis and access marine data

    NASA Astrophysics Data System (ADS)

    Casas, Antonia; Garcia, Maria Jesus; Nikouline, Andrei

    2010-05-01

    Since 1994 the Data Centre of the Spanish Oceanographic Institute (IEO) has been developing systems for archiving and quality control of oceanographic data. The work started in the frame of the European Marine Science & Technology Programme (MAST), when a consortium of several Mediterranean Data Centres began to work on the MEDATLAS project. Over the years, old software modules for MS-DOS were rewritten, improved and migrated to the Windows environment. Oceanographic data quality control now includes not only vertical profiles (mainly CTD and bottle observations) but also time series of current and sea level observations. New powerful routines for analysis and for graphic visualization were added. Data originally presented in ASCII format were recently organized in an open-source MySQL database. Nowadays, the IEO, as part of the SeaDataNet infrastructure, has designed and developed a new information system, consistent with the ISO 19115 and SeaDataNet standards, in order to manage the large and diverse marine data and information originating in Spain from different sources, and to interoperate with SeaDataNet. The system works with data stored in ASCII files (MEDATLAS, ODV) as well as data stored within the relational database. The components of the system are: 1. MEDATLAS Format and Quality Control. QCDAMAR (Quality Control of Marine Data) is the main set of tools for working with data presented as text files; it includes extended quality control (searching for duplicated cruises and profiles; checking dates, positions, ship velocity, constant profiles, spikes, density inversions, soundings, acceptable data and impossible regional values) and input/output filters. QCMareas is a set of procedures for the quality control of tide gauge data according to international sea level observing standards; these procedures include checking for unexpected anomalies in the time series, interpolation, filtering, and computation of basic statistics and residuals. 2. DAMAR: a relational database (MySQL) designed to manage the wide variety of marine information, such as common vocabularies, catalogues (CSR & EDIOS), data and metadata. 3. Other tools for analysis and data management: Import_DB, a script to import data and metadata from the MEDATLAS ASCII files into the database; SelDamar/Selavi, an interface to the database for local and web access, which allows selective retrievals applying criteria introduced by the user (geographical bounds, responsible party, cruises, platform, time periods, etc.) and also provides calculation of statistical reference values and plotting of original and mean profiles together with vertical interpolation; and ExtractDAMAR, a script that extracts data archived in ASCII files meeting the criteria of a user request made through the SelDamar interface and exports them in ODV format, also performing unit conversion.
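
    For illustration, the sketch below implements two of the profile checks mentioned (a spike test and a density-inversion test) in a simplified form; the thresholds and logic are placeholders rather than the MEDATLAS QC algorithms.

        # Simplified illustrations of two profile checks; not the MEDATLAS implementation.
        def spike_flags(values, threshold=0.5):
            """Flag points that deviate sharply from their neighbours (common spike-test form)."""
            flags = [False] * len(values)
            for i in range(1, len(values) - 1):
                test = abs(values[i] - (values[i - 1] + values[i + 1]) / 2) \
                       - abs((values[i + 1] - values[i - 1]) / 2)
                if test > threshold:
                    flags[i] = True
            return flags

        def density_inversion_flags(density_by_depth):
            """Flag levels where density decreases with increasing depth."""
            return [i > 0 and density_by_depth[i] < density_by_depth[i - 1]
                    for i in range(len(density_by_depth))]

        print(spike_flags([12.1, 12.0, 14.9, 12.2, 12.3]))         # spike flagged at index 2
        print(density_inversion_flags([1025.0, 1025.4, 1025.1]))   # inversion flagged at index 2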

  3. Simplifying the Reuse and Interoperability of Geoscience Data Sets and Models with Semantic Metadata that is Human-Readable and Machine-actionable

    NASA Astrophysics Data System (ADS)

    Peckham, S. D.

    2017-12-01

    Standardized, deep descriptions of digital resources (e.g. data sets, computational models, software tools and publications) make it possible to develop user-friendly software systems that assist scientists with the discovery and appropriate use of these resources. Semantic metadata makes it possible for machines to take actions on behalf of humans, such as automatically identifying the resources needed to solve a given problem, retrieving them and then automatically connecting them (despite their heterogeneity) into a functioning workflow. Standardized model metadata also helps model users to understand the important details that underpin computational models and to compare the capabilities of different models. These details include simplifying assumptions about the physics, the governing equations and the numerical methods used to solve them, the discretization of space (the grid) and time (the time-stepping scheme), the state variables (input or output), and the model configuration parameters. This kind of metadata provides a "deep description" of a computational model that goes well beyond other types of metadata (e.g. author, purpose, scientific domain, programming language, digital rights, provenance, execution) and captures the science that underpins a model. A carefully constructed, unambiguous, rules-based schema that addresses this problem, called the Geoscience Standard Names ontology, will be presented; it utilizes Semantic Web best practices and technologies. It has also been designed to work across science domains and to be readable by both humans and machines.
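
    A hedged sketch of the matching such standard names enable: if two models declare their variables against the same names, a framework can pair one model's outputs with another's inputs automatically. The variable and standard names below are illustrative, loosely patterned on the standard-name style, and are not taken from the ontology itself.

        # Illustrative only: map each model's internal variable names to shared standard names,
        # then match one model's outputs to another model's inputs by standard name.
        MODEL_A_OUTPUTS = {"Q": "channel_water__volume_flow_rate"}            # internal -> standard name
        MODEL_B_INPUTS  = {"inflow": "channel_water__volume_flow_rate",
                           "temp": "channel_water__temperature"}

        def match_variables(outputs, inputs):
            """Return (consumer_var, provider_var) pairs whose standard names agree."""
            by_standard_name = {std: var for var, std in outputs.items()}
            return [(v, by_standard_name[std]) for v, std in inputs.items() if std in by_standard_name]

        print(match_variables(MODEL_A_OUTPUTS, MODEL_B_INPUTS))   # -> [('inflow', 'Q')]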

  4. OlyMPUS - The Ontology-based Metadata Portal for Unified Semantics

    NASA Astrophysics Data System (ADS)

    Huffer, E.; Gleason, J. L.

    2015-12-01

    The Ontology-based Metadata Portal for Unified Semantics (OlyMPUS), funded by the NASA Earth Science Technology Office Advanced Information Systems Technology program, is an end-to-end system designed to support data consumers and data providers, enabling the latter to register their data sets and provision them with the semantically rich metadata that drives the Ontology-Driven Interactive Search Environment for Earth Sciences (ODISEES). OlyMPUS leverages the semantics and reasoning capabilities of ODISEES to provide data producers with a semi-automated interface for producing the semantically rich metadata needed to support ODISEES' data discovery and access services. It integrates the ODISEES metadata search system with multiple NASA data delivery tools to enable data consumers to create customized data sets for download to their computers, or for NASA Advanced Supercomputing (NAS) facility registered users, directly to NAS storage resources for access by applications running on NAS supercomputers. A core function of NASA's Earth Science Division is research and analysis that uses the full spectrum of data products available in NASA archives. Scientists need to perform complex analyses that identify correlations and non-obvious relationships across all types of Earth System phenomena. Comprehensive analytics are hindered, however, by the fact that many Earth science data products are disparate and hard to synthesize. Variations in how data are collected, processed, gridded, and stored, create challenges for data interoperability and synthesis, which are exacerbated by the sheer volume of available data. Robust, semantically rich metadata can support tools for data discovery and facilitate machine-to-machine transactions with services such as data subsetting, regridding, and reformatting. Such capabilities are critical to enabling the research activities integral to NASA's strategic plans. However, as metadata requirements increase and competing standards emerge, metadata provisioning becomes increasingly burdensome to data producers. The OlyMPUS system helps data providers produce semantically rich metadata, making their data more accessible to data consumers, and helps data consumers quickly discover and download the right data for their research.

  5. Constructing a Cross-Domain Resource Inventory: Key Components and Results of the EarthCube CINERGI Project.

    NASA Astrophysics Data System (ADS)

    Zaslavsky, I.; Richard, S. M.; Malik, T.; Hsu, L.; Gupta, A.; Grethe, J. S.; Valentine, D. W., Jr.; Lehnert, K. A.; Bermudez, L. E.; Ozyurt, I. B.; Whitenack, T.; Schachne, A.; Giliarini, A.

    2015-12-01

    While many geoscience-related repositories and data discovery portals exist, finding information about available resources remains a pervasive problem, especially when searching across multiple domains and catalogs. Inconsistent and incomplete metadata descriptions, disparate access protocols and semantic differences across domains, and troves of unstructured or poorly structured information which is hard to discover and use are major hindrances toward discovery, while metadata compilation and curation remain manual and time-consuming. We report on methodology, main results and lessons learned from an ongoing effort to develop a geoscience-wide catalog of information resources, with consistent metadata descriptions, traceable provenance, and automated metadata enhancement. Developing such a catalog is the central goal of CINERGI (Community Inventory of EarthCube Resources for Geoscience Interoperability), an EarthCube building block project (earthcube.org/group/cinergi). The key novel technical contributions of the projects include: a) development of a metadata enhancement pipeline and a set of document enhancers to automatically improve various aspects of metadata descriptions, including keyword assignment and definition of spatial extents; b) Community Resource Viewers: online applications for crowdsourcing community resource registry development, curation and search, and channeling metadata to the unified CINERGI inventory, c) metadata provenance, validation and annotation services, d) user interfaces for advanced resource discovery; and e) geoscience-wide ontology and machine learning to support automated semantic tagging and faceted search across domains. We demonstrate these CINERGI components in three types of user scenarios: (1) improving existing metadata descriptions maintained by government and academic data facilities, (2) supporting work of several EarthCube Research Coordination Network projects in assembling information resources for their domains, and (3) enhancing the inventory and the underlying ontology to address several complicated data discovery use cases in hydrology, geochemistry, sedimentology, and critical zone science. Support from the US National Science Foundation under award ICER-1343816 is gratefully acknowledged.
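
    As a rough illustration of the enhancement-pipeline idea described above, the sketch below chains two toy enhancers (keyword assignment and spatial extent); the vocabulary, gazetteer and record fields are made-up placeholders, not the CINERGI implementation.

        # Toy metadata-enhancement pipeline: each enhancer takes a record (a dict here)
        # and returns an improved copy. All lookup tables are illustrative placeholders.
        def keyword_enhancer(record):
            vocab = {"river": "Hydrology", "basalt": "Geochemistry"}     # illustrative ontology fragment
            text = (record.get("title", "") + " " + record.get("abstract", "")).lower()
            extra = {kw for term, kw in vocab.items() if term in text}
            return {**record, "keywords": sorted(set(record.get("keywords", [])) | extra)}

        def spatial_extent_enhancer(record):
            gazetteer = {"Amazon": (-10.0, -80.0, 5.0, -45.0)}            # (S, W, N, E), illustrative
            for name, bbox in gazetteer.items():
                if name.lower() in record.get("title", "").lower():
                    return {**record, "bbox": bbox}
            return record

        def enhance(record, pipeline=(keyword_enhancer, spatial_extent_enhancer)):
            for step in pipeline:
                record = step(record)
            return record

        print(enhance({"title": "Amazon river discharge dataset", "keywords": []}))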

  6. Ocean Data Interoperability Platform: developing a common global framework for marine data management

    NASA Astrophysics Data System (ADS)

    Glaves, Helen; Schaap, Dick

    2017-04-01

    In recent years there has been a paradigm shift in marine research moving from the traditional discipline based methodology employed at the national level by one or more organizations, to a multidisciplinary, ecosystem level approach conducted on an international scale. This increasingly holistic approach to marine research is in part being driven by policy and legislation. For example, the European Commission's Blue Growth strategy promotes sustainable growth in the marine environment including the development of sea-basin strategies (European Commission 2014). As well as this policy driven shift to ecosystem level marine research there are also scientific and economic drivers for a basin level approach. Marine monitoring is essential for assessing the health of an ecosystem and determining the impacts of specific factors and activities on it. The availability of large volumes of good quality data is fundamental to this increasingly holistic approach to ocean research but there are significant barriers to its re-use. These are due to the heterogeneity of the data resulting from having been collected by many organizations around the globe using a variety of sensors mounted on a range of different platforms. The data is then delivered and archived in a range of formats, using various spatial coordinate systems and aligned with different standards. This heterogeneity coupled with organizational and national policies on data sharing make access and re-use of marine data problematic. In response to the need for greater sharing of marine data a number of e-infrastructures have been developed but these have different levels of granularity with the majority having been developed at the regional level to address specific requirements for data e.g. SeaDataNet in Europe, the Australian Ocean Data Network (AODN). These data infrastructures are also frequently aligned with the priorities of the local funding agencies and have been created in isolation from those developed elsewhere. To add a further layer of complexity there are also global initiatives providing marine data infrastructures e.g. IOC-IODE, POGO as well as those with a wider remit which includes environmental data e.g. GEOSS, COPERNICUS etc. Ecosystem level marine research requires a common framework for marine data management that supports the sharing of data across these regional and global data systems, and provides the user with access to the data available from these services via a single point of access. This framework must be based on existing data systems and established by developing interoperability between them. The Ocean Data and Interoperability Platform (ODIP/ODIP II) project brings together those organisations responsible for maintaining selected regional data infrastructures along with other relevant experts in order to identify the common standards and best practice necessary to underpin this framework, and to evaluate the differences and commonalties between the regional data infrastructures in order to establish interoperability between them for the purposes of data sharing. This coordinated approach is being demonstrated and validated through the development of a series of prototype interoperability solutions that demonstrate the mechanisms and standards necessary to facilitate the sharing of marine data across these existing data infrastructures.

  7. Serving Fisheries and Ocean Metadata to Communities Around the World

    NASA Technical Reports Server (NTRS)

    Meaux, Melanie

    2006-01-01

    NASA's Global Change Master Directory (GCMD) assists the oceanographic community in the discovery, access, and sharing of scientific data by serving on-line fisheries and ocean metadata to users around the globe. As of January 2006, the directory holds more than 16,300 Earth Science data descriptions and over 1,300 services descriptions. Of these, nearly 4,000 unique ocean-related metadata records are available to the public, with many having direct links to the data. In 2005, the GCMD averaged over 5 million hits a month, with nearly a half million unique hosts for the year. Through the GCMD portal (http://gcmd.nasa.gov/), users can search vast and growing quantities of data and services using controlled keywords, free-text searches or a combination of both. Users may now refine a search based on topic, location, instrument, platform, project, data center, spatial and temporal coverage. The directory also offers data holders a means to post and search their data through customized portals, i.e. online customized subset metadata directories. The discovery metadata standard used is the Directory Interchange Format (DIF), adopted in 1994. This format has evolved to accommodate other national and international standards such as FGDC and ISO 19115. Users can submit metadata through easy-to-use online and offline authoring tools. The directory, which also serves as a coordinating node of the International Directory Network (IDN), has been active at the international, regional and national level for many years through its involvement with the Committee on Earth Observation Satellites (CEOS), federal agencies (such as NASA, NOAA, and USGS), international agencies (such as IOC/IODE, UN, and JAXA) and partnerships (such as ESIP, IOOS/DMAC, GOSIC, GLOBEC, OBIS, and GoMODP), sharing experience and knowledge related to metadata and/or data management and interoperability.

  8. EMODNet Bathymetry - building and providing a high resolution digital bathymetry for European seas

    NASA Astrophysics Data System (ADS)

    Schaap, Dick M. A.

    2016-04-01

    Access to marine data is a key issue for the EU Marine Strategy Framework Directive and the EU Marine Knowledge 2020 agenda, which includes the European Marine Observation and Data Network (EMODNet) initiative. EMODNet aims at assembling European marine data, data products and metadata from diverse sources in a uniform way. The EMODNet data infrastructure is developed through a stepwise approach in three major phases. Currently EMODNet is entering its 3rd phase with operational portals providing access to marine data for bathymetry, geology, physics, chemistry, biology, seabed habitats and human activities, complemented by checkpoint projects analyzing the fitness for purpose of data provision. The EMODNet Bathymetry project develops and publishes Digital Terrain Models (DTM) for the European seas. These are produced from survey and aggregated data sets that are indexed with metadata by adopting the SeaDataNet Common Data Index (CDI) data discovery and access service and the Sextant data products catalogue service. SeaDataNet is a network of major oceanographic data centers around the European seas that manage, operate and further develop a pan-European infrastructure for marine and ocean data management. SeaDataNet is also setting and governing marine data standards, and exploring and establishing interoperability solutions to connect to other e-infrastructures on the basis of standards such as ISO and OGC. The SeaDataNet portal provides users with a number of interrelated meta directories, an extensive range of controlled vocabularies, and the various SeaDataNet standards and tools. SeaDataNet at present gives an overview of, and access to, more than 1.8 million data sets for physical oceanography, chemistry, geology, geophysics, bathymetry and biology from more than 100 connected data centers in 34 countries riparian to European seas. The latest EMODNet Bathymetry DTM has a resolution of 1/8 arc minute * 1/8 arc minute and covers all European sea regions. Use is made of available and gathered surveys; already more than 13,000 surveys, originating from more than 120 organizations, have been indexed by 27 European data providers from 15 countries. Use is also made of composite DTMs generated and maintained by several data providers for their areas of interest; 44 composite DTMs are already included in the Sextant data products catalogue. For areas without coverage, use is made of the latest global DTM from GEBCO, which is a partner in the EMODNet Bathymetry project; in return, GEBCO integrates the EMODNet DTM to achieve an enriched and better result. The catalogue services and the generated EMODNet DTM can be queried and browsed at the dedicated EMODNet Bathymetry portal, which also provides a versatile DTM viewing service with many relevant map layers and functions for data retrieval. Activities are underway for further refinement following user feedback. The EMODNet DTM is publicly available for downloading in various formats. The presentation will highlight key details of the EMODNet Bathymetry project, the recently released EMODNet Digital Bathymetry for all European seas, its portal and its versatile viewer.

  9. Improving Scientific Metadata Interoperability And Data Discoverability using OAI-PMH

    NASA Astrophysics Data System (ADS)

    Devarakonda, Ranjeet; Palanisamy, Giri; Green, James M.; Wilson, Bruce E.

    2010-12-01

    While general-purpose search engines (such as Google or Bing) are useful for finding many things on the Internet, they are often of limited usefulness for locating Earth Science data relevant (for example) to a specific spatiotemporal extent. By contrast, tools that search repositories of structured metadata can locate relevant datasets with fairly high precision, but the search is limited to that particular repository. Federated searches (such as Z39.50) have been used, but can be slow and their comprehensiveness can be limited by downtime in any search partner. An alternative approach to improve comprehensiveness is for a repository to harvest metadata from other repositories, possibly with limits based on subject matter or access permissions. Searches through harvested metadata can be extremely responsive, and the search tool can be customized with semantic augmentation appropriate to the community of practice being served. However, there are a number of different protocols for harvesting metadata, with some challenges for ensuring that updates are propagated and for collaborating with repositories that use differing metadata standards. The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a standard that is seeing increased use as a means for exchanging structured metadata. OAI-PMH implementations must support Dublin Core as a metadata standard, with other metadata formats as optional. We have developed tools which enable our structured search tool (Mercury; http://mercury.ornl.gov) to consume metadata from OAI-PMH services in any of the metadata formats we support (Dublin Core, Darwin Core, FGDC CSDGM, GCMD DIF, EML, and ISO 19115/19139). We are also making ORNL DAAC metadata available through OAI-PMH for other metadata tools to utilize, such as the NASA Global Change Master Directory (GCMD). This paper describes Mercury's capabilities with multiple metadata formats in general and, more specifically, the results of our OAI-PMH implementations and the lessons learned. References: [1] R. Devarakonda, G. Palanisamy, B.E. Wilson, and J.M. Green, "Mercury: reusable metadata management data discovery and access system", Earth Science Informatics, vol. 3, no. 1, pp. 87-94, May 2010. [2] R. Devarakonda, G. Palanisamy, J.M. Green, B.E. Wilson, "Data sharing and retrieval using OAI-PMH", Earth Science Informatics, DOI: 10.1007/s12145-010-0073-0, 2010. [3] R. Devarakonda, G. Palanisamy, J. Green, B.E. Wilson, "Mercury: An Example of Effective Software Reuse for Metadata Management Data Discovery and Access", Eos Trans. AGU, 89(53), Fall Meet. Suppl., IN11A-1019, 2008.
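
    For illustration, a minimal OAI-PMH harvesting loop is sketched below; the provider endpoint is a placeholder, while the verb, metadataPrefix and XML namespaces follow the OAI-PMH specification, including resumptionToken paging.

        # Hedged sketch of harvesting Dublin Core titles over OAI-PMH with plain HTTP.
        import requests
        import xml.etree.ElementTree as ET

        OAI = "{http://www.openarchives.org/OAI/2.0/}"
        DC  = "{http://purl.org/dc/elements/1.1/}"

        def harvest_titles(base_url):
            params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
            while True:
                root = ET.fromstring(requests.get(base_url, params=params, timeout=60).content)
                for record in root.iter(OAI + "record"):
                    title = record.find(".//" + DC + "title")
                    if title is not None:
                        yield title.text
                token = root.find(".//" + OAI + "resumptionToken")   # page through large result sets
                if token is None or not (token.text or "").strip():
                    break
                params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}

        for t in harvest_titles("https://example.org/oai"):          # placeholder provider endpoint
            print(t)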

  10. The EPOS e-Infrastructure

    NASA Astrophysics Data System (ADS)

    Jeffery, Keith; Bailo, Daniele

    2014-05-01

    The European Plate Observing System (EPOS) is integrating geoscientific information concerning earth movements in Europe. We are approaching the end of the PP (Preparatory Project) phase and in October 2014 expect to continue with the full project within ESFRI (European Strategic Framework for Research Infrastructures). The key aspects of EPOS concern providing services to allow homogeneous access by end-users over heterogeneous data, software, facilities, equipment and services. The e-infrastructure of EPOS is the heart of the project since it integrates the work on organisational, legal, economic and scientific aspects. Following the creation of an inventory of relevant organisations, persons, facilities, equipment, services, datasets and software (RIDE) the scale of integration required became apparent. The EPOS e-infrastructure architecture has been developed systematically based on recorded primary (user) requirements and secondary (interoperation with other systems) requirements through Strawman, Woodman and Ironman phases with the specification - and developed confirmatory prototypes - becoming more precise and progressively moving from paper to implemented system. The EPOS architecture is based on global core services (Integrated Core Services - ICS) which access thematic nodes (domain-specific European-wide collections, called thematic Core Services - TCS), national nodes and specific institutional nodes. The key aspect is the metadata catalog. In one dimension this is described in 3 levels: (1) discovery metadata using well-known and commonly used standards such as DC (Dublin Core) to enable users (via an intelligent user interface) to search for objects within the EPOS environment relevant to their needs; (2) contextual metadata providing the context of the object described in the catalog to enable a user or the system to determine the relevance of the discovered object(s) to their requirement - the context includes projects, funding, organisations involved, persons involved, related publications, facilities, equipment and others, and utilises CERIF (Common European Research Information Format) standard (see www.eurocris.org); (3) detailed metadata which is specific to a domain or to a particular object and includes the schema describing the object to processing software. The other dimension of the metadata concerns the objects described. These are classified into users, services (including software), data and resources (computing, data storage, instruments and scientific equipment). An alternative architecture has been considered: using brokering. This technique has been used especially in North America geoscience projects to interoperate datasets. The technique involves writing software to interconvert between any two node datasets. Given n nodes this implies writing n*(n-1) convertors. EPOS Working Group 7 (e-infrastructures and virtual community) which deals with the design and implementation of a prototype of the EPOS services, chose to use an approach which endows the system with an extreme flexibility and sustainability. It is called the Metadata Catalogue approach. With the use of the catalogue the EPOS system can: 1. interoperate with software, services, users, organisations, facilities, equipment etc. as well as datasets; 2. avoid to write n*(n-1) software convertors and generate as much as possible, through the information contained in the catalogue only n convertors. 
    This is a huge saving - especially in maintenance as the datasets (or other node resources) evolve. We are working on (semi-)automation of convertor generation by metadata mapping - this is leading-edge computer science research (a minimal sketch of the underlying hub-and-spoke idea is given below); 3. make extensive use of contextual metadata, which enable a user or a machine to: (i) improve discovery of resources at nodes; (ii) improve precision and recall in search; (iii) drive the systems for identification, authentication, authorisation, security and privacy by recording the relevant attributes of the node resources and of the user; (iv) manage provenance and long-term digital preservation. The linkage between the Integrated Core Services, which provide the integration of data and services, and the diverse Thematic Core Services nodes is provided by means of a compatibility layer, which includes the aforementioned metadata catalogue. This layer provides 'connectors' to make local data, software and services available through the EPOS Integrated Core Services layer. In conclusion, we believe the EPOS e-infrastructure architecture is fit for purpose, including long-term sustainability and pan-European access to data and services.
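
    The convertor-count argument above can be made concrete with a small illustration. The following Python sketch (not EPOS code; all class and function names are hypothetical) shows the hub-and-spoke idea behind the Metadata Catalogue approach: each node supplies a single convertor pair to and from a common, catalogue-described representation, so the number of convertors grows linearly with the number of nodes instead of quadratically as in pairwise brokering.

    ```python
    # Illustrative sketch of the linear-vs-quadratic convertor argument.
    # Names (Node, to_common, from_common) are hypothetical, not EPOS APIs.
    from typing import Any, Callable

    class Node:
        """A data node that converts between its local format and a common,
        catalogue-described representation (one convertor pair per node)."""
        def __init__(self, name: str,
                     to_common: Callable[[Any], dict],
                     from_common: Callable[[dict], Any]):
            self.name = name
            self.to_common = to_common
            self.from_common = from_common

    def exchange(record: Any, source: Node, target: Node) -> Any:
        """Move a record between any two nodes via the common representation,
        so n nodes need n convertor pairs instead of n*(n-1) pairwise ones."""
        return target.from_common(source.to_common(record))

    # Two toy nodes with different local field names.
    seismic = Node("seismic",
                   to_common=lambda r: {"id": r["event_id"], "time": r["t"]},
                   from_common=lambda c: {"event_id": c["id"], "t": c["time"]})
    geodesy = Node("geodesy",
                   to_common=lambda r: {"id": r["station"], "time": r["epoch"]},
                   from_common=lambda c: {"station": c["id"], "epoch": c["time"]})

    print(exchange({"event_id": "EQ1", "t": "2014-05-01T00:00:00Z"}, seismic, geodesy))
    ```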

  11. An integrative solution for managing, tracing and citing sensor-related information

    NASA Astrophysics Data System (ADS)

    Koppe, Roland; Gerchow, Peter; Macario, Ana; Schewe, Ingo; Rehmcke, Steven; Düde, Tobias

    2017-04-01

    In a data-driven scientific world, the need to capture information on sensors used in the data acquisition process has become increasingly important. Following the recommendations of the Open Geospatial Consortium (OGC), we started by adopting the SensorML standard for describing platforms, devices and sensors. However, it soon became obvious to us that understanding, implementing and filling such standards costs significant effort and cannot be expected from every scientist individually. We therefore developed a web-based sensor management solution (https://sensor.awi.de) for describing platforms, devices and sensors as a hierarchy of systems, which supports tracing changes to a system while hiding complexity. Each platform contains devices, and each device can have sensors associated with specific identifiers, contacts, events, related online resources (e.g. manufacturer factsheets, calibration documentation, data processing documentation), sensor output parameters and geo-location. In order to better understand and address real-world requirements, we have closely interacted with field-going scientists in the context of the key national infrastructure project "FRontiers in Arctic marine Monitoring ocean observatory" (FRAM) during the software development. We learned that not only the lineage of observations is crucial for scientists, but also alert services using value ranges, flexible output formats and information on data providers (e.g. FTP sources). Most importantly, persistent and citable versions of sensor descriptions are required for traceability and reproducibility, allowing seamless integration with existing information systems, e.g. PANGAEA. Within the context of the EU-funded Ocean Data Interoperability Platform project (ODIP II) and in cooperation with 52north, we are providing near real-time data via Sensor Observation Services (SOS) along with sensor descriptions based on our sensor management solution. ODIP II also aims to develop a harmonized SensorML profile for the marine community, which we will adopt in our solution as soon as it is available. In this presentation we will show our sensor management solution, which is embedded in our data flow framework to offer out-of-the-box interoperability with existing information systems and standards. In addition, we will present real-world examples and challenges related to the description and traceability of sensor metadata.
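
    As a rough illustration of the SOS-based access described above, the following Python sketch issues a standard OGC SOS 2.0 DescribeSensor request to retrieve a SensorML document. The endpoint URL and procedure identifier are placeholders, not the actual AWI services.

    ```python
    # A minimal sketch of retrieving a SensorML description from an OGC SOS
    # endpoint via a standard DescribeSensor KVP request. The endpoint URL and
    # procedure identifier below are placeholders.
    import requests

    SOS_ENDPOINT = "https://sos.example.org/service"  # hypothetical endpoint

    params = {
        "service": "SOS",
        "version": "2.0.0",
        "request": "DescribeSensor",
        "procedure": "vessel:polarstern:ctd:temperature",  # hypothetical identifier
        "procedureDescriptionFormat": "http://www.opengis.net/sensorml/2.0",
    }

    response = requests.get(SOS_ENDPOINT, params=params, timeout=30)
    response.raise_for_status()
    print(response.text[:500])  # SensorML (XML) describing the platform/device/sensor
    ```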

  12. Watershed and Economic Data InterOperability (WEDO) ...

    EPA Pesticide Factsheets

    Watershed and Economic Data InterOperability (WEDO) is a system of information technologies designed to publish watershed modeling studies for reuse. WEDO facilitates three aspects of interoperability: discovery, evaluation and integration of data. This increased level of interoperability goes beyond the current practice of publishing modeling studies as reports or journal articles. Rather than summarized results, modeling studies can be published with their full complement of input data, calibration parameters and output with associated metadata for easy duplication by others. Reproducible science is possible only if researchers can find, evaluate and use complete modeling studies performed by other modelers. WEDO greatly increases transparency by making detailed data available to the scientific community. WEDO is a next-generation technology, a Web Service linked to the EPA's EnviroAtlas for discovery of modeling studies nationwide. Streams and rivers are identified using the National Hydrography Dataset network and stream IDs. Streams with modeling studies available are color coded in the EnviroAtlas. One can select streams within a watershed of interest to readily find data available via WEDO. The WEDO website is linked from the EnviroAtlas to provide a thorough review of each modeling study. WEDO currently provides modeled flow and water quality time series, designed for a broad range of watershed and economic models for nutrient trading market analysis.

  13. Bridging Hydroinformatics Services Between HydroShare and SWATShare

    NASA Astrophysics Data System (ADS)

    Merwade, V.; Zhao, L.; Song, C. X.; Tarboton, D. G.; Goodall, J. L.; Stealey, M.; Rajib, A.; Morsy, M. M.; Dash, P. K.; Miles, B.; Kim, I. L.

    2016-12-01

    Many cyberinfrastructure systems in the hydrologic and related domains emerged in the past decade, with more being developed to address various data management and modeling needs. Although clearly beneficial to the broad user community, it is a challenging task to build interoperability across these systems due to various obstacles including technological, organizational, semantic, and social issues. This work presents our experience in developing interoperability between two hydrologic cyberinfrastructure systems - SWATShare and HydroShare. HydroShare is a large-scale online system aimed at enabling the hydrologic user community to share their data, models, and analyses online for solving complex hydrologic research questions. SWATShare, on the other hand, is a focused effort to allow SWAT (Soil and Water Assessment Tool) modelers to share, execute and analyze SWAT models using high performance computing resources. Making these two systems interoperable required common sign-in through OAuth, sharing of models through common metadata standards, and use of standard web services for implementing key import/export functionalities. As a result, users from either community can leverage the resources and services across these systems without having to manually import, export, or process their models. Overall, this use case is an example that can serve as a model for interoperability among other systems, as no single system can provide all the functionality needed to address large interdisciplinary problems.
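
    The common sign-in and web-service pattern described above can be sketched as follows. This is a hypothetical illustration, not the documented HydroShare/SWATShare API: the URLs, credentials and query parameter are placeholders, and the real systems may use a different OAuth2 grant flow.

    ```python
    # A minimal sketch of the common-sign-in pattern: obtain an OAuth2 bearer
    # token and use it to call a resource-sharing REST API. All URLs, client
    # credentials and the resource filter are placeholders.
    import requests

    TOKEN_URL = "https://hydro.example.org/o/token/"        # hypothetical
    API_URL = "https://hydro.example.org/hsapi/resource/"   # hypothetical

    token_resp = requests.post(TOKEN_URL, data={
        "grant_type": "password",           # a web client would typically use "authorization_code"
        "username": "alice",
        "password": "secret",
        "client_id": "my-client-id",
        "client_secret": "my-client-secret",
    }, timeout=30)
    token_resp.raise_for_status()
    access_token = token_resp.json()["access_token"]

    # List model resources that the user could import into the other system.
    resp = requests.get(API_URL,
                        headers={"Authorization": f"Bearer {access_token}"},
                        params={"type": "SWATModelInstanceResource"},  # hypothetical filter
                        timeout=30)
    resp.raise_for_status()
    print(resp.json())
    ```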

  14. Interoperability Using Lightweight Metadata Standards: Service & Data Casting, OpenSearch, OPM Provenance, and Shared SciFlo Workflows

    NASA Astrophysics Data System (ADS)

    Wilson, B. D.; Manipon, G.; Hua, H.; Fetzer, E.

    2011-12-01

    Under several NASA grants, we are generating multi-sensor merged atmospheric datasets to enable the detection of instrument biases and studies of climate trends over decades of data. For example, under a NASA MEASURES grant we are producing a water vapor climatology from the A-Train instruments, stratified by the Cloudsat cloud classification for each geophysical scene. The generation and proper use of such multi-sensor climate data records (CDRs) requires a high level of openness, transparency, and traceability. To make the datasets self-documenting and provide access to full metadata and traceability, we have implemented a set of capabilities and services using known, interoperable protocols. These protocols include OpenSearch, OPeNDAP, the Open Provenance Model, service & data casting technologies using Atom feeds, and REST-callable analysis workflows implemented as SciFlo (XML) documents. We advocate that our approach can serve as a blueprint for how to openly "document and serve" complex, multi-sensor CDRs with full traceability. The capabilities and services provided include: - Discovery of the collections by keyword search, exposed using the OpenSearch protocol; - Space/time query across the CDRs' granules and all of the input datasets via OpenSearch; - User-level configuration of the production workflows so that scientists can select additional physical variables from the A-Train to add to the next iteration of the merged datasets; - Efficient data merging using on-the-fly OPeNDAP variable slicing & spatial subsetting of data out of input netCDF and HDF files (without moving the entire files); - Self-documenting CDRs published in a highly usable netCDF4 format with groups used to organize the variables, CF-style attributes for each variable, numeric array compression, & links to OPM provenance; - Recording of processing provenance and data lineage into a query-able provenance trail in Open Provenance Model (OPM) format, auto-captured by the workflow engine; - Open publishing of all of the workflows used to generate products as machine-callable REST web services, using the capabilities of the SciFlo workflow engine; - Advertising of the metadata (e.g. physical variables provided, space/time bounding box, etc.) for our prepared datasets as "datacasts" using the Atom feed format; - Publishing of all datasets via our "DataDrop" service, which exploits the WebDAV protocol to enable scientists to access remote data directories as local files on their laptops; - Rich "web browse" of the CDRs with full metadata and the provenance trail one click away; - Advertising of all services as Google-discoverable "service casts" using the Atom format. The presentation will describe our use of the interoperable protocols and demonstrate the capabilities and service GUIs.
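
    As an illustration of the space/time granule query capability listed above, the sketch below issues an OpenSearch request carrying keyword, bounding-box and temporal parameters. The endpoint URL and dataset keyword are placeholders, and in a real deployment the exact parameter names are declared in the service's OpenSearch description document.

    ```python
    # A minimal sketch of a space/time granule query using the OpenSearch Geo
    # and Time extensions. Endpoint and parameter spellings are illustrative.
    import requests

    OPENSEARCH_URL = "https://cdr.example.org/opensearch/granules"  # hypothetical

    params = {
        "searchTerms": "water vapor AIRS",
        "geo:box": "-130,20,-60,50",              # west,south,east,north
        "time:start": "2006-01-01T00:00:00Z",
        "time:end": "2006-01-31T23:59:59Z",
        "count": 20,
    }

    resp = requests.get(OPENSEARCH_URL, params=params, timeout=30)
    resp.raise_for_status()
    print(resp.text[:500])  # Atom feed listing matching granules with data links
    ```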

  15. In-field Access to Geoscientific Metadata through GPS-enabled Mobile Phones

    NASA Astrophysics Data System (ADS)

    Hobona, Gobe; Jackson, Mike; Jordan, Colm; Butchart, Ben

    2010-05-01

    Fieldwork is an integral part of much geosciences research. But whilst geoscientists have physical or online access to data collections in the laboratory or at base stations, equivalent in-field access is not standard or straightforward. The increasing availability of mobile internet and GPS-supported mobile phones, however, now provides the basis for addressing this issue. The SPACER project was commissioned by the Rapid Innovation initiative of the UK Joint Information Systems Committee (JISC) to explore the potential for GPS-enabled mobile phones to access geoscientific metadata collections. Metadata collections within the geosciences and the wider geospatial domain can be disseminated through web services based on the Catalogue Service for the Web (CSW) standard of the Open Geospatial Consortium (OGC) - a global grouping of over 380 private, public and academic organisations aiming to improve interoperability between geospatial technologies. CSW offers an XML-over-HTTP interface for querying and retrieval of geospatial metadata. By default, the metadata returned by CSW is based on the ISO 19115 standard and encoded in XML conformant to ISO 19139. The SPACER project has created a prototype application that enables mobile phones to send queries to CSW containing user-defined keywords and coordinates acquired from the GPS devices built into the phones. The prototype has been developed using the free and open source Google Android platform. The mobile application offers views for listing titles, presenting multiple metadata elements and a Google Map with an overlay of the bounding coordinates of datasets. The presentation will describe the architecture and approach applied in the development of the prototype.
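
    The kind of catalogue query the prototype issues can be sketched as a CSW GetRecords KVP request that combines a user keyword with a GPS-derived bounding box. The catalogue URL and the CQL constraint below are illustrative; parameter names follow OGC CSW 2.0.2.

    ```python
    # A minimal sketch of a CSW GetRecords KVP request mixing a keyword with a
    # bounding box taken from the phone's GPS fix. URL and constraint are
    # placeholders.
    import requests

    CSW_URL = "https://catalogue.example.org/csw"  # hypothetical CSW endpoint

    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "resultType": "results",
        "elementSetName": "summary",
        "outputSchema": "http://www.isotc211.org/2005/gmd",  # ISO 19139 records
        "constraintLanguage": "CQL_TEXT",
        "constraint_language_version": "1.1.0",
        "constraint": "AnyText LIKE '%borehole%' AND BBOX(ows:BoundingBox, -1.7,52.9,-1.5,53.1)",
        "maxRecords": "10",
    }

    resp = requests.get(CSW_URL, params=params, timeout=30)
    resp.raise_for_status()
    print(resp.text[:500])  # ISO 19139 XML metadata records
    ```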

  16. Usability and Interoperability Improvements for an EASE-Grid 2.0 Passive Microwave Data Product Using CF Conventions

    NASA Astrophysics Data System (ADS)

    Hardman, M.; Brodzik, M. J.; Long, D. G.

    2017-12-01

    Beginning in 1978, the satellite passive microwave data record has been a mainstay of remote sensing of the cryosphere, providing twice-daily, near-global spatial coverage for monitoring changes in hydrologic and cryospheric parameters that include precipitation, soil moisture, surface water, vegetation, snow water equivalent, sea ice concentration and sea ice motion. Historical versions of the gridded passive microwave data sets were produced as flat binary files described in human-readable documentation. This format is error-prone and makes it difficult to reliably include all processing and provenance information. Funded by NASA MEaSUREs, we have completely reprocessed the gridded data record that includes SMMR, SSM/I-SSMIS and AMSR-E. The new Calibrated Enhanced-Resolution Brightness Temperature (CETB) Earth System Data Record (ESDR) files are self-describing. Our approach to the new data set was to create netCDF4 files that use standard metadata conventions and best practices to incorporate file-level, machine- and human-readable contents, geolocation, processing and provenance metadata. We followed the flexible and adaptable Climate and Forecast (CF-1.6) Conventions with respect to their coordinate conventions and map projection parameters. Additionally, we made use of the Attribute Conventions for Dataset Discovery (ACDD-1.3), which provide file-level conventions with spatio-temporal bounds that enable indexing software to search for coverage. Our CETB files also include temporal coverage and spatial resolution in the file-level metadata for human readability. We made use of the JPL CF/ACDD Compliance Checker to guide this work. We tested our file format with real software, for example the netCDF Operators (NCO) command-line tools, which give fine-grained control over spatio-temporal subsetting and concatenation of files. The GDAL tools understand the CF metadata and produce fully compliant GeoTIFF files from our data. ArcMap can then reproject the GeoTIFF files on-the-fly and work with other geolocated data such as coastlines, with no special work required. We expect this combination of standards and well-tested interoperability to significantly improve the usability of this important ESDR for the Earth Science community.
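
    The following Python sketch illustrates the kind of CF-1.6 variable attributes and ACDD-1.3 file-level attributes described above, written with the netCDF4 library. The grid, variable and attribute values are illustrative and not the actual CETB file contents.

    ```python
    # A minimal sketch of embedding CF-1.6 variable attributes and ACDD-1.3
    # file-level attributes in a netCDF4 file. Values are illustrative.
    import numpy as np
    from netCDF4 import Dataset

    with Dataset("cetb_example.nc", "w", format="NETCDF4") as nc:
        # ACDD-1.3 file-level (global) attributes for discovery
        nc.title = "Example calibrated brightness temperature grid"
        nc.summary = "Illustrative CETB-style file with CF/ACDD metadata."
        nc.time_coverage_start = "2003-01-01T00:00:00Z"
        nc.time_coverage_end = "2003-01-01T12:00:00Z"
        nc.geospatial_lat_min, nc.geospatial_lat_max = -90.0, 90.0
        nc.geospatial_lon_min, nc.geospatial_lon_max = -180.0, 180.0
        nc.Conventions = "CF-1.6, ACDD-1.3"

        nc.createDimension("y", 180)
        nc.createDimension("x", 360)
        tb = nc.createVariable("TB", "f4", ("y", "x"), zlib=True)  # compressed array
        tb.standard_name = "brightness_temperature"                # CF attributes
        tb.units = "K"
        tb.grid_mapping = "crs"  # would reference a CF grid-mapping variable (omitted here)
        tb[:] = np.full((180, 360), 250.0, dtype="f4")
    ```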

  17. A New Data Management System for Biological and Chemical Oceanography

    NASA Astrophysics Data System (ADS)

    Groman, R. C.; Chandler, C.; Allison, D.; Glover, D. M.; Wiebe, P. H.

    2007-12-01

    The Biological and Chemical Oceanography Data Management Office (BCO-DMO) was created to serve PIs principally funded by NSF to conduct marine chemical and ecological research. The new office is dedicated to providing open access to data and information developed in the course of scientific research on short and intermediate time-frames. The data management system developed in support of U.S. JGOFS and U.S. GLOBEC programs is being modified to support the larger scope of the BCO-DMO effort, which includes ultimately providing a way to exchange data with other data systems. The open access system is based on a philosophy of data stewardship, support for existing and evolving data standards, and use of public domain software. The DMO staff work closely with originating PIs to manage data gathered as part of their individual programs. In the new BCO-DMO data system, project and data set metadata records designed to support re-use of the data are stored in a relational database (MySQL) and the data are stored in or made accessible by the JGOFS/GLOBEC object-oriented, relational data management system. Data access will be provided via any standard Web browser client user interface through a GIS application (Open Source, OGC-compliant MapServer), a directory listing from the data holdings catalog, or a custom search engine that facilitates data discovery. In an effort to maximize data system interoperability, data will also be available via Web Services, and data set descriptions will be generated to comply with a variety of metadata content standards. The office is located at the Woods Hole Oceanographic Institution and web access is via http://www.bco-dmo.org.

  18. Nanopublications for exposing experimental data in the life-sciences: a Huntington's Disease case study.

    PubMed

    Mina, Eleni; Thompson, Mark; Kaliyaperumal, Rajaram; Zhao, Jun; van der Horst, Eelke; Tatum, Zuotian; Hettne, Kristina M; Schultes, Erik A; Mons, Barend; Roos, Marco

    2015-01-01

    Data from high throughput experiments often produce far more results than can ever appear in the main text or tables of a single research article. In these cases, the majority of new associations are often archived either as supplemental information in an arbitrary format or in publisher-independent databases that can be difficult to find. These data are not only lost from scientific discourse, but are also elusive to automated search, retrieval and processing. Here, we use the nanopublication model to make scientific assertions that were concluded from a workflow analysis of Huntington's Disease data machine-readable, interoperable, and citable. We followed the nanopublication guidelines to semantically model our assertions as well as their provenance metadata and authorship. We demonstrate interoperability by linking nanopublication provenance to the Research Object model. These results indicate that nanopublications can provide an incentive for researchers to expose data that are interoperable and machine-readable for future use and preservation, and for which they can get credit for their effort. Nanopublications can play a leading role in hypothesis generation, offering opportunities to produce large-scale data integration.
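
    The nanopublication structure described above - an assertion, its provenance and its publication information kept in separate named graphs - can be sketched in Python with rdflib. The URIs and the example assertion below are illustrative placeholders rather than the study's actual content.

    ```python
    # A minimal sketch of the nanopublication pattern (head, assertion,
    # provenance, publication info as named graphs) built with rdflib.
    from rdflib import Dataset, Namespace, Literal
    from rdflib.namespace import RDF, XSD

    NP = Namespace("http://www.nanopub.org/nschema#")
    PROV = Namespace("http://www.w3.org/ns/prov#")
    EX = Namespace("http://example.org/hd-study/")  # placeholder URIs

    ds = Dataset()
    head = ds.graph(EX.head)
    assertion = ds.graph(EX.assertion)
    provenance = ds.graph(EX.provenance)
    pubinfo = ds.graph(EX.pubinfo)

    # Head graph ties the three parts together as one nanopublication.
    head.add((EX.nanopub1, RDF.type, NP.Nanopublication))
    head.add((EX.nanopub1, NP.hasAssertion, EX.assertion))
    head.add((EX.nanopub1, NP.hasProvenance, EX.provenance))
    head.add((EX.nanopub1, NP.hasPublicationInfo, EX.pubinfo))

    # The scientific assertion (hypothetical gene-disease association).
    assertion.add((EX.gene_XYZ, EX.associatedWith, EX.HuntingtonsDisease))

    # Provenance: the assertion was derived from a workflow run.
    provenance.add((EX.assertion, PROV.wasDerivedFrom, EX.workflowRun42))

    # Publication info: authorship and timestamp for citability.
    pubinfo.add((EX.nanopub1, PROV.wasAttributedTo, EX.researcherA))
    pubinfo.add((EX.nanopub1, EX.created,
                 Literal("2015-01-01T00:00:00Z", datatype=XSD.dateTime)))

    print(ds.serialize(format="trig"))
    ```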

  19. MPEG-7-based description infrastructure for an audiovisual content analysis and retrieval system

    NASA Astrophysics Data System (ADS)

    Bailer, Werner; Schallauer, Peter; Hausenblas, Michael; Thallinger, Georg

    2005-01-01

    We present a case study of establishing a description infrastructure for an audiovisual content analysis and retrieval system. The description infrastructure consists of an internal metadata model and access tools for using it. Based on an analysis of requirements, we selected, out of a set of candidates, MPEG-7 as the basis of our metadata model. The openness and generality of MPEG-7 allow it to be used in a broad range of applications, but increase complexity and hinder interoperability. Profiling has been proposed as a solution, with the focus on selecting and constraining description tools. Semantic constraints are currently only described in textual form. Conformance in terms of semantics can thus not be evaluated automatically, and mappings between different profiles can only be defined manually. As a solution, we propose an approach to formalize the semantic constraints of an MPEG-7 profile using a formal vocabulary expressed in OWL, which allows automated processing of semantic constraints. We have defined the Detailed Audiovisual Profile as the profile to be used in our metadata model and we show how some of the semantic constraints of this profile can be formulated using ontologies. To work practically with the metadata model, we have implemented an MPEG-7 library and a client/server document access infrastructure.

  20. Distributed Multi-interface Catalogue for Geospatial Data

    NASA Astrophysics Data System (ADS)

    Nativi, S.; Bigagli, L.; Mazzetti, P.; Mattia, U.; Boldrini, E.

    2007-12-01

    Several geosciences communities (e.g. atmospheric science, oceanography, hydrology) have developed tailored data and metadata models and service protocol specifications for enabling online data discovery, inventory, evaluation, access and download. These specifications are conceived either by profiling geospatial information standards or by extending well-accepted geosciences data models and protocols in order to capture more semantics. These artifacts have generated a set of related catalog and inventory services characterizing different communities, initiatives and projects. In fact, these geospatial data catalogs are discovery and access systems that use metadata as the target for queries on geospatial information. The indexed and searchable metadata provide a disciplined vocabulary against which intelligent geospatial search can be performed within or among communities. There exists a clear need to conceive and achieve solutions to implement interoperability among geosciences communities, in the context of the more general geospatial information interoperability framework. Such solutions should provide search and access capabilities across catalogs, inventory lists and their registered resources. Thus, the development of catalog clearinghouse solutions is a near-term challenge in support of fully functional and useful infrastructures for spatial data (e.g. INSPIRE, GMES, NSDI, GEOSS). This implies the implementation of components for query distribution and virtual resource aggregation. These solutions must implement distributed discovery functionalities in a heterogeneous environment, requiring metadata profile harmonization as well as protocol adaptation and mediation. We present a catalog clearinghouse solution for the interoperability of several well-known cataloguing systems (e.g. OGC CSW, THREDDS catalog and data services). The solution implements consistent resource discovery and evaluation over a dynamic federation of several well-known cataloguing and inventory systems. Prominent features include: 1) Support for distributed queries over a hierarchical data model, supporting incremental queries (i.e. queries over collections, to be subsequently refined) and opaque/translucent chaining; 2) Support for several client protocols, through a compound front-end interface module. This makes it possible to accommodate a (growing) number of cataloguing standards, or profiles thereof, including the OGC CSW interface, the ebRIM Application Profile (for Core ISO Metadata and other data models), and the ISO Application Profile. The presented catalog clearinghouse supports both the opaque and translucent patterns for service chaining. In fact, the clearinghouse catalog may be configured either to completely hide the underlying federated services or to provide clients with service information. In both cases, the clearinghouse solution presents a higher-level interface (i.e. OGC CSW) which harmonizes multiple lower-level services (e.g. OGC CSW, WMS and WCS, THREDDS, etc.), and handles all control and interaction with them. In the translucent case, the client has the option to directly access the lower-level services (e.g. to improve performance). In the GEOSS context, the solution has been experimented with both as a stand-alone user application and as a service framework. The first scenario allows a user to download a multi-platform client software and query a federation of cataloguing systems, which can be customized at will. The second scenario supports server-side deployment and can be flexibly adapted to several use cases, such as intranet proxy, catalog broker, etc.
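
    The query-distribution component at the core of the clearinghouse can be illustrated with a simple fan-out sketch: the same discovery query is sent to several federated endpoints in parallel and the responses are merged behind a single interface. The endpoints and the naive keyword parameter below are placeholders; the real mediation layer translates each query into the target service's protocol.

    ```python
    # An illustrative sketch of query distribution over a federation of
    # catalogue endpoints. URLs and the query translation are placeholders.
    import concurrent.futures
    import requests

    FEDERATED_ENDPOINTS = {               # hypothetical lower-level services
        "csw_node": "https://csw.example.org/csw",
        "thredds_node": "https://tds.example.org/thredds/catalog.xml",
    }

    def query_endpoint(name: str, url: str, keyword: str) -> tuple[str, int]:
        """Send a simple keyword discovery request; real adapters would translate
        the query into each service's protocol (CSW GetRecords, THREDDS, ...)."""
        resp = requests.get(url, params={"q": keyword}, timeout=30)
        return name, resp.status_code

    def distributed_search(keyword: str):
        with concurrent.futures.ThreadPoolExecutor() as pool:
            futures = [pool.submit(query_endpoint, n, u, keyword)
                       for n, u in FEDERATED_ENDPOINTS.items()]
            return [f.result() for f in concurrent.futures.as_completed(futures)]

    print(distributed_search("sea surface temperature"))
    ```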

  1. Integrating TRENCADIS components in gLite to share DICOM medical images and structured reports.

    PubMed

    Blanquer, Ignacio; Hernández, Vicente; Salavert, José; Segrelles, Damià

    2010-01-01

    The problem of sharing medical information among different centres has been tackled by many projects. Several of them target the specific problem of sharing DICOM images and structured reports (DICOM-SR), such as the TRENCADIS project. In this paper we propose sharing and organizing DICOM data and DICOM-SR metadata by benefiting from existing deployed Grid infrastructures compliant with gLite, such as EGEE or the Spanish NGI. These infrastructures contribute a large amount of storage resources for creating knowledge databases and also provide metadata storage resources (such as AMGA) to semantically organize reports in a tree structure. First, in this paper, we present the extension of the TRENCADIS architecture to use gLite components (LFC, AMGA, SE) for the sake of increasing interoperability. Using the metadata from DICOM-SR, and maintaining its tree structure, enables federating different but compatible diagnostic structures and simplifies the definition of complex queries. This article describes how to do this in AMGA and it shows an approach to efficiently code radiology reports to enable the multi-centre federation of data resources.

  2. CSW Best Practices

    NASA Technical Reports Server (NTRS)

    Newman, Doug; Mitchell, Andrew

    2016-01-01

    During the development of the Common Metadata Repository (CMR) Catalog Service for the Web (CSW) for the Earth Observing System Data and Information System (EOSDIS), a number of best practices came to light. Given that the ESIP (Earth Science Information Partners) Discovery Cluster is committed to interoperability and standards in earth data discovery, this seemed like a convenient moment to provide best practices for this widely-used standard to the organization, in the same way we did for OpenSearch.

  3. Advances in a distributed approach for ocean model data interoperability

    USGS Publications Warehouse

    Signell, Richard P.; Snowden, Derrick P.

    2014-01-01

    An infrastructure for earth science data is emerging across the globe based on common data models and web services. As we evolve from custom file formats and web sites to standards-based web services and tools, data is becoming easier to distribute, find and retrieve, leaving more time for science. We describe recent advances that make it easier for ocean model providers to share their data, and for users to search, access, analyze and visualize ocean data using MATLAB® and Python. These include a technique for modelers to create aggregated, Climate and Forecast (CF) metadata convention datasets from collections of non-standard Network Common Data Form (NetCDF) output files, the capability to remotely access data from CF-1.6-compliant NetCDF files using the Open Geospatial Consortium (OGC) Sensor Observation Service (SOS), a metadata standard for unstructured grid model output (UGRID), and tools that utilize both CF and UGRID standards to allow interoperable data search, browse and access. We use examples from the U.S. Integrated Ocean Observing System (IOOS®) Coastal and Ocean Modeling Testbed, a project in which modelers using both structured and unstructured grid model output needed to share their results, to compare their results with other models, and to compare models with observed data. The same techniques used here for ocean modeling output can be applied to atmospheric and climate model output, remote sensing data, digital terrain and bathymetric data.
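
    The remote-access pattern described above can be illustrated with a short Python sketch that opens a CF-compliant aggregation through OPeNDAP and slices a variable so that only the requested subset is transferred. The URL and variable name are placeholders for a THREDDS/OPeNDAP endpoint serving aggregated model output.

    ```python
    # A minimal sketch of server-side subsetting via OPeNDAP: only the sliced
    # portion of the variable crosses the network. URL and names are placeholders.
    from netCDF4 import Dataset

    OPENDAP_URL = "https://tds.example.org/thredds/dodsC/ocean/aggregated.nc"  # hypothetical

    with Dataset(OPENDAP_URL) as nc:
        temp = nc.variables["temp"]                    # hypothetical CF variable
        print(temp.units, temp.shape)
        surface_slice = temp[0, 0, 100:200, 100:200]   # time, depth, y, x subset
        print(surface_slice.mean())
    ```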

  4. Towards Improved Satellite-In Situ Oceanographic Data Interoperability and Associated Value Added Services at the Podaac

    NASA Astrophysics Data System (ADS)

    Tsontos, V. M.; Huang, T.; Holt, B.

    2015-12-01

    The earth science enterprise increasingly relies on the integration and synthesis of multivariate datasets from diverse observational platforms. NASA's ocean salinity missions, which include Aquarius/SAC-D and the SPURS (Salinity Processes in the Upper Ocean Regional Study) field campaign, illustrate the value of integrated observations in support of studies on ocean circulation, the water cycle, and climate. However, the inherent heterogeneity of the resulting data and the disparate, distributed systems that serve them complicate their effective utilization for both earth science research and applications. Key technical interoperability challenges include adherence to metadata and data format standards, which is particularly acute for in-situ data, and the lack of a unified metadata model facilitating archival and integration of both satellite and oceanographic field datasets. Here we report on efforts at the PO.DAAC, NASA's physical oceanography data center, to extend our data management and distribution support capabilities for field campaign datasets such as those from SPURS. We also discuss value-added services, based on the integration of satellite and in-situ datasets, which are under development, with a particular focus on the Distributed Oceanographic Matchup Service (DOMS). DOMS implements a portable technical infrastructure and associated web services that will be broadly accessible via the PO.DAAC for the dynamic collocation of satellite and in-situ data, hosted by distributed data providers, in support of mission cal/val, science and operational applications.

  5. NOAA's Data Catalog and the Federal Open Data Policy

    NASA Astrophysics Data System (ADS)

    Wengren, M. J.; de la Beaujardiere, J.

    2014-12-01

    The 2013 Open Data Policy Presidential Directive requires Federal agencies to create and maintain a 'public data listing' that includes all agency data that are publicly available now or will be made publicly available in the future. The directive requires the use of machine-readable and open formats that make use of 'common core' and extensible metadata formats according to the best practices published in an online repository called 'Project Open Data', the use of open licenses where possible, and adherence to existing metadata and other technology standards to promote interoperability. In order to meet the requirements of the Open Data Policy, the National Oceanic and Atmospheric Administration (NOAA) has implemented an online data catalog that combines metadata from all subsidiary NOAA metadata catalogs into a single master inventory. The NOAA Data Catalog is available to the public for search and discovery, providing access to the NOAA master data inventory through multiple means, including web-based text search, an OGC CS-W endpoint, and a native Application Programming Interface (API) for programmatic query. It generates on a daily basis the Project Open Data JavaScript Object Notation (JSON) file required for compliance with the Presidential directive. The Data Catalog is based on the open source Comprehensive Knowledge Archive Network (CKAN) software and runs on the Amazon Federal GeoCloud. This presentation will cover topics including mappings of existing metadata in standard formats (FGDC-CSDGM and ISO 19115 XML) to the Project Open Data JSON metadata schema, representation of metadata elements within the catalog, and compatible metadata sources used to feed the catalog, including Web Accessible Folder (WAF), Catalog Services for the Web (CS-W), and Esri ArcGIS.com. It will also discuss related open source technologies that can be used together to build a spatial data infrastructure compliant with the Open Data Policy.
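
    Because the catalog is CKAN-based, its records can be queried programmatically through the standard CKAN action API. The sketch below uses the package_search action; the catalog URL is a placeholder rather than the operational NOAA endpoint.

    ```python
    # A minimal sketch of programmatic discovery against a CKAN-based catalog
    # via its action API (package_search). The URL is a placeholder.
    import requests

    CKAN_URL = "https://catalog.example.gov/api/3/action/package_search"  # hypothetical

    resp = requests.get(CKAN_URL,
                        params={"q": "sea surface temperature", "rows": 5},
                        timeout=30)
    resp.raise_for_status()
    result = resp.json()["result"]
    print(result["count"], "matching datasets")
    for dataset in result["results"]:
        print("-", dataset["title"])
    ```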

  6. An EarthCube Roadmap for Cross-Domain Interoperability in the Geosciences: Governance Aspects

    NASA Astrophysics Data System (ADS)

    Zaslavsky, I.; Couch, A.; Richard, S. M.; Valentine, D. W.; Stocks, K.; Murphy, P.; Lehnert, K. A.

    2012-12-01

    The goal of cross-domain interoperability is to enable reuse of data and models outside the original context in which these data and models are collected and used, and to facilitate analysis and modeling of physical processes that are not confined to disciplinary or jurisdictional boundaries. A new research initiative of the U.S. National Science Foundation, called EarthCube, is developing a roadmap to address challenges of interoperability in the earth sciences and create a blueprint for community-guided cyberinfrastructure accessible to a broad range of geoscience researchers and students. Infrastructure readiness for cross-domain interoperability encompasses the capabilities that need to be in place for such secondary or derivative use of information to be both scientifically sound and technically feasible. In this initial assessment we consider the following four basic infrastructure components that need to be present to enable cross-domain interoperability in the geosciences: metadata catalogs (at the appropriate community-defined granularity) that provide standard discovery services over datasets, data access services, models and other resources of the domain; vocabularies that support unambiguous interpretation of domain resources and metadata; services used to access data repositories and other resources including models, visualizations and workflows; and formal information models that define the structure and semantics of the information returned on service requests. General standards for these components have been proposed; they form the backbone of large-scale integration activities in the geosciences. By utilizing these standards, EarthCube research designs can take advantage of data discovery across disciplines using the commonality in key data characteristics related to shared models of spatial features, time measurements, and observations. Data can be discovered via federated catalogs and linked nomenclatures from neighboring domains, while standard data services can be used to transparently compile composite data products. Key questions addressed in this presentation are: (1) How to define and assess readiness of existing domain information systems for cross-domain re-use? (2) How to determine EarthCube development priorities given a multitude of use cases that involve cross-domain data flows? and (3) How to involve a wider community of geoscientists in the development and curation of cross-domain resources and incorporate community feedback in the CI design? Answering them involves consideration of governance mechanisms for cross-domain interoperability: while individual domain information systems and projects have developed governance mechanisms, managing cross-domain CI resources and supporting cross-domain information re-use has not been a development focus at the scale of the geosciences. We present a cross-domain readiness model as a means of enabling effective communication among scientists, governance bodies, and information providers. We also present an initial readiness assessment and a cross-domain connectivity map for the geosciences, and outline processes for eliciting user requirements, setting priorities, and obtaining community consensus.

  7. A document centric metadata registration tool constructing earth environmental data infrastructure

    NASA Astrophysics Data System (ADS)

    Ichino, M.; Kinutani, H.; Ono, M.; Shimizu, T.; Yoshikawa, M.; Masuda, K.; Fukuda, K.; Kawamoto, H.

    2009-12-01

    DIAS (Data Integration and Analysis System) is one of the GEOSS activities in Japan. It is also a leading part of the GEOSS task with the same name defined in the GEOSS Ten Year Implementation Plan. The main mission of DIAS is to construct a data infrastructure that can effectively integrate earth environmental data such as observation data, numerical model outputs, and socio-economic data provided from the fields of climate, water cycle, ecosystem, ocean, biodiversity and agriculture. Some of DIAS's data products are available at http://www.jamstec.go.jp/e/medid/dias. Most earth environmental data commonly have spatial and temporal attributes such as the covering geographic scope or the creation date. The metadata standards covering these common attributes are published by the ISO geographic information technical committee (ISO/TC 211) as the ISO 19115:2003 and ISO 19139:2007 specifications. Accordingly, DIAS metadata is developed based on the ISO/TC 211 metadata standards. From the viewpoint of data users, metadata is useful not only for data retrieval and analysis but also for interoperability and information sharing among experts, beginners and nonprofessionals. On the other hand, from the viewpoint of data providers, two problems were pointed out after discussions. One is that data providers prefer to minimize the additional tasks and time spent creating metadata. The other is that data providers want to manage and publish documents that explain their data sets more comprehensively. To solve these problems, we have been developing a document-centric metadata registration tool. The features of our tool are that the generated documents are available instantly and that there is no extra cost for data providers to generate metadata. The tool is developed as a Web application, so data providers need no additional software beyond a web browser. The interface of the tool provides the section titles of the documents, and by filling out the content of each section, the documents for the data sets are automatically published in PDF and HTML format. Furthermore, a metadata XML file compliant with ISO 19115 and ISO 19139 is created at the same time. The generated metadata are managed in the metadata database of the DIAS project, and will be used in various ISO 19139 compliant metadata management tools, such as GeoNetwork.
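
    A minimal sketch of the kind of ISO 19139 XML the tool generates automatically alongside the PDF/HTML documents is shown below, built with lxml. Only a couple of elements are included, and the identifier value is a placeholder; the real schema and the DIAS profile are far richer.

    ```python
    # A minimal sketch of emitting an ISO 19139-style metadata fragment with
    # lxml. Element coverage and values are illustrative only.
    from lxml import etree

    GMD = "http://www.isotc211.org/2005/gmd"
    GCO = "http://www.isotc211.org/2005/gco"
    NSMAP = {"gmd": GMD, "gco": GCO}

    def char_string(parent, tag, text):
        """Add a gmd:<tag>/gco:CharacterString element holding the given text."""
        el = etree.SubElement(parent, f"{{{GMD}}}{tag}")
        cs = etree.SubElement(el, f"{{{GCO}}}CharacterString")
        cs.text = text
        return el

    record = etree.Element(f"{{{GMD}}}MD_Metadata", nsmap=NSMAP)
    char_string(record, "fileIdentifier", "dias-example-dataset-001")  # placeholder ID
    char_string(record, "language", "eng")

    print(etree.tostring(record, pretty_print=True).decode())
    ```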

  8. Oceanography Information System of Spanish Institute of Oceanography (IEO)

    NASA Astrophysics Data System (ADS)

    Tello, Olvido; Gómez, María; González, Sonsoles

    2016-04-01

    Since 1914, the Spanish Institute of Oceanography (IEO) has performed multidisciplinary studies of the marine environment. In some cases these are systematic studies; in others they are specific studies for special requirements (the El Hierro submarine volcanic episode, the Prestige spill, and others). Different methodologies and data acquisition techniques are used depending on the aims of each study. The acquired data are stored and presented in different formats. The information is organized into different databases according to the subject and the variables represented (geology, fisheries, aquaculture, pollution, habitats, etc.). For physical and chemical oceanography data, the IEO Data Centre (CEDO) was created in 1964 in order to organize the data on physical and chemical variables, to standardize this information and to serve the international data network SeaDataNet (www.seadatanet.org). This database integrates temperature, salinity, nutrient and tidal data, and CEDO allows users to consult and download the data (http://indamar.ieo.es). On the other hand, for data about marine species, the SIRENO database was developed in 1999. All data about species collected in oceanographic surveys carried out by IEO researchers, and data from observers on fishing vessels, are incorporated in the SIRENO database. This database stores catch data, biomass, abundance, etc., and is based on an ORACLE architecture. Due to the large amount of information collected over the 100 years of IEO history, there is a clear need to organize, standardize, integrate and relate the different databases and information, and to provide interoperability and access to the information. Consequently, in 2000 the first initiative emerged to organize the IEO spatial information in an Oceanography Information System, based on a Geographical Information System (GIS). The GIS was consolidated as the IEO institutional GIS, and the Spatial Data Infrastructure of IEO (IDEO) was created following the INSPIRE trend. All data included in the GIS have corresponding metadata according to ISO 19115 and INSPIRE. IDEO is based on Web services, quality of service, open standards, ISO (OGC) and INSPIRE standards, and provides access to the geographical marine information of IEO. The GIS allows the information to be organized, visualized, consulted and analyzed. The data from the different IEO databases are integrated into a corporate GIS geodatabase (Esri format). This tool is essential in decision making on aspects such as: - Protection of the marine environment - Sustainable management of resources - Natural hazards - Marine spatial planning. Examples of the use of GIS as a spatial analysis tool are: - Mud volcanoes explored in the LIFE-INDEMARES project. - Cartographic series about the Spanish continental shelf, developed from data integrated in the IEO marine GIS, acquired from oceanographic surveys in the ESPACE project. - Cartography developed from the information gathered for the Initial Assessment of the Marine Strategy Framework Directive. - Studies of natural hazards related to submarine canyons in the southeast Spanish marine region. Currently the IEO is participating in many European initiatives, especially in several lots of EMODnet. The IEO also works in line with INSPIRE, Blue Growth, Horizon 2020, etc., to contribute to the knowledge of the marine environment; its protection and its spatial planning are extremely relevant issues. In order to facilitate access to the Spatial Data Infrastructure of IEO, the IEO Geoportal was developed in 2012. It mainly comprises a metadata catalog, access to the data viewers and the Web Services of IDEO: http://www.geo-ideo.ieo.es/geoportalideo/catalog/main/home.page

  9. TR32DB - Management of Research Data in a Collaborative, Interdisciplinary Research Project

    NASA Astrophysics Data System (ADS)

    Curdt, Constanze; Hoffmeister, Dirk; Waldhoff, Guido; Lang, Ulrich; Bareth, Georg

    2015-04-01

    The management of research data in a well-structured and documented manner is essential in the context of collaborative, interdisciplinary research environments (e.g. across various institutions). Consequently, the set-up and use of a research data management (RDM) system like a data repository or project database is necessary. These systems should accompany and support scientists during the entire research life cycle (e.g. data collection, documentation, storage, archiving, sharing, publishing) and operate cross-disciplinarily in interdisciplinary research projects. The challenges and problems of RDM are well-known. Consequently, the set-up of a user-friendly, well-documented, sustainable RDM system is essential, as well as user support and further assistance. In the framework of the Transregio Collaborative Research Centre 32 'Patterns in Soil-Vegetation-Atmosphere Systems: Monitoring, Modelling, and Data Assimilation' (CRC/TR32), funded by the German Research Foundation (DFG), an RDM system was self-designed and implemented. The CRC/TR32 project database (TR32DB, www.tr32db.de) has been operating online since early 2008. The TR32DB handles all data created by the involved project participants from several institutions (e.g. the Universities of Cologne, Bonn, Aachen, and the Research Centre Jülich) and research fields (e.g. soil and plant sciences, hydrology, geography, geophysics, meteorology, remote sensing). Very heterogeneous research data are considered, resulting from field measurement campaigns, meteorological monitoring, remote sensing, laboratory studies and modelling approaches. Furthermore, outcomes like publications, conference contributions, PhD reports and corresponding images are also covered. The TR32DB project database is set up in cooperation with the Regional Computing Centre of the University of Cologne (RRZK) and is also hosted in this hardware environment. The TR32DB system architecture is composed of three main components: (i) file-based data storage including backup, (ii) database-based storage for administrative data and metadata, and (iii) a web interface for user access. The TR32DB offers the common features of RDM systems. These include data storage, entry of corresponding metadata by a user-friendly input wizard, search and download of data depending on user permissions, as well as secure internal exchange of data. In addition, a Digital Object Identifier (DOI) can be allocated for specific datasets and several web mapping components are supported (e.g. Web-GIS and map search). The centrepiece of the TR32DB is the self-designed and implemented CRC/TR32-specific metadata schema. This enables the documentation of all involved, heterogeneous data with accurate, interoperable metadata. The TR32DB Metadata Schema is set up in a multi-level approach and supports several metadata standards and schemes (e.g. Dublin Core, ISO 19115, INSPIRE, DataCite). Furthermore, metadata properties focusing on the CRC/TR32 background (e.g. CRC/TR32-specific keywords) and the supported data types are added. Mandatory, optional and automatic metadata properties are specified. Overall, the TR32DB is designed and implemented according to the needs of the CRC/TR32 (e.g. a huge amount of heterogeneous data) and the demands of the DFG (e.g. cooperation with a computing centre). The application of a self-designed, project-specific, interoperable metadata schema enables the accurate documentation of all CRC/TR32 data.
The implementation of the TR32DB in the hardware environment of the RRZK ensures the access to the data after the end of the CRC/TR32 funding in 2018.

  10. Interoperable Communications Systems: Governance and Risk

    DTIC Science & Technology

    2009-12-01


  11. Metadata and Service at the GFZ ISDC Portal

    NASA Astrophysics Data System (ADS)

    Ritschel, B.

    2008-05-01

    The online service portal of the GFZ Potsdam Information System and Data Center (ISDC) is an access point for all manner of geoscientific geodata, its corresponding metadata, scientific documentation and software tools. At present almost 2000 national and international users and user groups have the opportunity to request Earth science data from a portfolio of 275 different product types and more than 20 million single data files with an aggregate volume of approximately 12 TByte. The majority of the data and information the portal currently offers to the public are global geomonitoring products such as satellite orbit and Earth gravity field data as well as geomagnetic and atmospheric data for exploration. These products for the Earth's changing system are provided via state-of-the-art retrieval techniques. The data product catalog system behind these techniques is based on the extensive usage of standardized metadata, which describe the different geoscientific product types and data products in a uniform way. Whereas all ISDC product types are specified by NASA's Directory Interchange Format (DIF), Version 9.0, parent XML DIF metadata files, the individual data files are described by extended DIF metadata documents. Depending on the start of the scientific project, one part of the data files is described by extended DIF Version 6 metadata documents and the other part is specified by child XML DIF metadata documents. Both the product-type-dependent parent DIF metadata documents and the data-file-dependent child DIF metadata documents are derived from a base-DIF.xsd XML schema file. The ISDC metadata philosophy defines a geoscientific product as a package consisting of mostly one, or sometimes more than one, data file plus one extended DIF metadata file. Because NASA's DIF metadata standard has been developed in order to specify a collection of data only, the extension of the DIF standard consists of new and specific attributes which are necessary for an explicit identification of single data files and the set-up of a comprehensive Earth science data catalog. The huge ISDC data catalog is realized by product-type-dependent tables filled with data-file-related metadata, which have relations to corresponding metadata tables. The product-type-describing parent DIF XML metadata documents are stored and managed in ORACLE's XML storage structures. In order to improve the interoperability of the ISDC service portal, the existing proprietary catalog system will be extended by an ISO 19115 based web catalog service. In addition to this development, there is work on an ISDC-related semantic network of different kinds of metadata resources, such as standardized and non-standardized metadata documents and literature, as well as Web 2.0 user-generated information derived from tagging activities and social navigation data.

  12. Data management in Oceanography at SOCIB

    NASA Astrophysics Data System (ADS)

    Joaquin, Tintoré; March, David; Lora, Sebastian; Sebastian, Kristian; Frontera, Biel; Gómara, Sonia; Pau Beltran, Joan

    2014-05-01

    SOCIB, the Balearic Islands Coastal Ocean Observing and Forecasting System (http://www.socib.es), is a Marine Research Infrastructure: a multiplatform, distributed and integrated system, a facility of facilities that extends from the nearshore to the open sea and provides free, open and quality-controlled data. SOCIB has three major infrastructure components: (1) a distributed multiplatform observing system, (2) a numerical forecasting system, and (3) a data management and visualization system. We present the spatial data infrastructure and applications developed at SOCIB. One of the major goals of the SOCIB Data Centre is to provide users with a system to locate and download the data of interest (near real-time and delayed mode) and to visualize and manage the information. Following SOCIB principles, data need to be (1) discoverable and accessible, (2) freely available, and (3) interoperable and standardized. In consequence, the SOCIB Data Centre Facility is implementing a general data management system to guarantee international standards, quality assurance and interoperability. The combination of different sources and types of information requires appropriate methods to ingest, catalogue, display, and distribute this information. The SOCIB Data Centre is responsible for directing the different stages of data management, ranging from data acquisition to its distribution and visualization through web applications. The system implemented relies on open source solutions, and the data life cycle comprises the following stages: • Acquisition: The data managed by SOCIB mostly come from its own observation platforms, numerical models or information generated from the activities in the SIAS Division. • Processing: Applications developed at SOCIB deal with all collected platform data, performing data calibration, derivation, quality control and standardization. • Archival: Storage in netCDF and spatial databases. • Distribution: Data web services using THREDDS, GeoServer and SOCIB's own RESTful services. • Catalogue: Metadata is provided through the ncISO plugin in THREDDS and GeoNetwork. • Visualization: Web and mobile applications to present SOCIB data to different user profiles. SOCIB data services and applications have been developed to respond to science and society needs (e.g. European initiatives such as EMODnet or Copernicus), by targeting different user profiles (e.g. researchers, technicians, policy and decision makers, educators, students, and society in general). For example, SOCIB has developed applications to: 1) allow researchers and technicians to access oceanographic information; 2) provide decision support for oil spill response; 3) disseminate information about the coastal state for tourists and recreational users; 4) present coastal research in educational programs; and 5) offer easy and fast access to marine information through mobile devices. In conclusion, the organizational and conceptual structure of SOCIB's Data Centre and the components developed provide an example of marine information systems within the framework of new ocean observatories and/or marine research infrastructures.

  13. Integrating Semantic Information in Metadata Descriptions for a Geoscience-wide Resource Inventory.

    NASA Astrophysics Data System (ADS)

    Zaslavsky, I.; Richard, S. M.; Gupta, A.; Valentine, D.; Whitenack, T.; Ozyurt, I. B.; Grethe, J. S.; Schachne, A.

    2016-12-01

    Integrating semantic information into legacy metadata catalogs is a challenging issue and so far has been mostly done on a limited scale. We present the experience of CINERGI (Community Inventory of Earthcube Resources for Geoscience Interoperability), an NSF EarthCube Building Block project, in creating a large cross-disciplinary catalog of geoscience information resources to enable cross-domain discovery. The project developed a pipeline for automatically augmenting resource metadata, in particular generating keywords that describe metadata documents harvested from multiple geoscience information repositories or contributed by geoscientists through various channels including surveys and domain resource inventories. The pipeline examines available metadata descriptions using the text parsing, vocabulary management, semantic annotation and graph navigation services of GeoSciGraph. GeoSciGraph, in turn, relies on a large cross-domain ontology of geoscience terms, which bridges several independently developed ontologies or taxonomies including SWEET, ENVO, YAGO, GeoSciML, GCMD, SWO, and CHEBI. The ontology content enables automatic extraction of keywords reflecting science domains, equipment used, geospatial features, measured properties, methods, processes, etc. We specifically focus on issues of cross-domain geoscience ontology creation, resolving several types of semantic conflicts among component ontologies or vocabularies, and constructing and managing facets for improved data discovery and navigation. The ontology and keyword generation rules are iteratively improved as pipeline results are presented to data managers for selective manual curation via a CINERGI Annotator user interface. We present lessons learned from applying the CINERGI metadata augmentation pipeline to a number of federal agency and academic data registries, in the context of several use cases that require data discovery and integration across multiple earth science data catalogs of varying quality and completeness. The inventory is accessible at http://cinergi.sdsc.edu, and the CINERGI project web page is http://earthcube.org/group/cinergi

  14. Distributed data discovery, access and visualization services to Improve Data Interoperability across different data holdings

    NASA Astrophysics Data System (ADS)

    Palanisamy, G.; Krassovski, M.; Devarakonda, R.; Santhana Vannan, S.

    2012-12-01

    The current climate debate is highlighting the importance of free, open, and authoritative sources of high quality climate data that are available for peer review and for collaborative purposes. It is increasingly important to allow various organizations around the world to share climate data in an open manner, and to enable them to perform dynamic processing of climate data. This advanced access to data can be enabled via Web-based services, using common "community agreed" standards, without having to change the internal structure used to describe the data. The modern scientific community has become diverse and increasingly complex in nature. To meet the demands of such a diverse user community, the modern data supplier has to provide data and other related information through searchable, data- and process-oriented tools. This can be accomplished by setting up an online, Web-based system with a relational database as a back end. The following common features of web data access/search systems will be outlined in the proposed presentation: - Flexible data discovery - Data in commonly used formats (e.g., CSV, NetCDF) - Preparation of metadata in standard formats (FGDC, ISO 19115, EML, DIF, etc.) - Data subsetting capabilities and the ability to narrow down to individual data elements - Standards-based data access protocols and mechanisms (SOAP, REST, OPeNDAP, OGC, etc.) - Integration of services across different data systems (discovery to access, visualization and subsetting) This presentation will also include specific examples of integration of various data systems developed by Oak Ridge National Laboratory's Climate Change Science Institute, and their ability to communicate with each other to enable better data interoperability and data integration. References: [1] Devarakonda, Ranjeet, and Harold Shanafield. "Drupal: Collaborative framework for science research." Collaboration Technologies and Systems (CTS), 2011 International Conference on. IEEE, 2011. [2] Devarakonda, R., Shrestha, B., Palanisamy, G., Hook, L. A., Killeffer, T. S., Boden, T. A., ... & Lazer, K. (2014). The new online metadata editor for generating structured metadata. Oak Ridge National Laboratory (ORNL).

  15. An integrated content and metadata based retrieval system for art.

    PubMed

    Lewis, Paul H; Martinez, Kirk; Abas, Fazly Salleh; Fauzi, Mohammad Faizal Ahmad; Chan, Stephen C Y; Addis, Matthew J; Boniface, Mike J; Grimwood, Paul; Stevenson, Alison; Lahanier, Christian; Stevenson, James

    2004-03-01

    A new approach to image retrieval is presented in the domain of museum and gallery image collections. Specialist algorithms, developed to address specific retrieval tasks, are combined with more conventional content and metadata retrieval approaches, and implemented within a distributed architecture to provide cross-collection searching and navigation in a seamless way. External systems can access the different collections using interoperability protocols and open standards, which were extended to accommodate content based as well as text based retrieval paradigms. After a brief overview of the complete system, we describe the novel design and evaluation of some of the specialist image analysis algorithms including a method for image retrieval based on sub-image queries, retrievals based on very low quality images and retrieval using canvas crack patterns. We show how effective retrieval results can be achieved by real end-users consisting of major museums and galleries, accessing the distributed but integrated digital collections.

  16. How Safe and Persistent is your Research? It's all about relationships

    NASA Astrophysics Data System (ADS)

    Lin, J.

    2017-12-01

    The relationships between scholarly resources over the course of the research lifecycle, and with the people and places involved, are at the core not only of scholarly communications but of the research enterprise as a whole. This session will consider the current state of persistent identifiers and their associated metadata in scholarly communications as a critical part of making sure research can be validated. Metadata ensure that scholarly assets are knitted into the webbing of the research information network. I will discuss this in the context of the growing set of formats and types of research-related materials that are accessible, discoverable, trackable, and reusable. I will highlight developments in, and the importance of, expanded standardization and interoperability of scholarly information pertaining to authors, researchers, funders, and others involved in the creation and dissemination of content, especially in light of the growing attention towards research integrity and reproducibility.

  17. EarthCube GeoLink: Semantics and Linked Data for the Geosciences

    NASA Astrophysics Data System (ADS)

    Arko, R. A.; Carbotte, S. M.; Chandler, C. L.; Cheatham, M.; Fils, D.; Hitzler, P.; Janowicz, K.; Ji, P.; Jones, M. B.; Krisnadhi, A.; Lehnert, K. A.; Mickle, A.; Narock, T.; O'Brien, M.; Raymond, L. M.; Schildhauer, M.; Shepherd, A.; Wiebe, P. H.

    2015-12-01

    The NSF EarthCube initiative is building next-generation cyberinfrastructure to aid geoscientists in collecting, accessing, analyzing, sharing, and visualizing their data and knowledge. The EarthCube GeoLink Building Block project focuses on a specific set of software protocols and vocabularies, often characterized as the Semantic Web and "Linked Data", to publish data online in a way that is easily discoverable, accessible, and interoperable. GeoLink brings together specialists from the computer science, geoscience, and library science domains, and includes data from a network of NSF-funded repositories that support scientific studies in marine geology, marine ecosystems, biogeochemistry, and paleoclimatology. We are working collaboratively with closely-related Building Block projects including EarthCollab and CINERGI, and solicit feedback from RCN projects including Cyberinfrastructure for Paleogeosciences (C4P) and iSamples. GeoLink has developed a modular ontology that describes essential geoscience research concepts; published data from seven collections (to date) on the Web as geospatially-enabled Linked Data using this ontology; matched and mapped data between collections using shared identifiers for investigators, repositories, datasets, funding awards, platforms, research cruises, physical specimens, and gazetteer features; and aggregated the results in a shared knowledgebase that can be queried via a standard SPARQL endpoint. Client applications have been built around the knowledgebase, including a Web/map-based data browser using the Leaflet JavaScript library and a simple query service using the OpenSearch format. Future development will include extending and refining the GeoLink ontology, adding content from additional repositories, developing semi-automated algorithms to enhance metadata, and further work on client applications.
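
    As an illustration of how such a knowledgebase is typically consumed, the sketch below issues a SPARQL query over HTTP; the endpoint URL and the class and property IRIs are placeholders, not the actual GeoLink ontology terms.

    ```python
    # Hedged sketch of querying a Linked Data knowledgebase through a SPARQL
    # endpoint, as described for GeoLink. The endpoint URL and the ontology
    # class/property IRIs are placeholders.
    import requests

    ENDPOINT = "https://example.org/sparql"   # hypothetical SPARQL endpoint

    query = """
    PREFIX ex: <http://example.org/geolink-like#>
    SELECT ?cruise ?label WHERE {
      ?cruise a ex:Cruise ;          # research cruises shared across repositories
              ex:label ?label .
    } LIMIT 10
    """

    resp = requests.get(ENDPOINT,
                        params={"query": query},
                        headers={"Accept": "application/sparql-results+json"},
                        timeout=30)
    resp.raise_for_status()
    for row in resp.json()["results"]["bindings"]:
        print(row["cruise"]["value"], row["label"]["value"])
    ```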

  18. Middleware for Plug and Play Integration of Heterogeneous Sensor Resources into the Sensor Web

    PubMed Central

    Toma, Daniel M.; Jirka, Simon; Del Río, Joaquín

    2017-01-01

    The study of global phenomena requires the combination of a considerable amount of data coming from different sources, acquired by different observation platforms and managed by institutions working in different scientific fields. Merging this data to provide extensive and complete data sets to monitor the long-term, global changes of our oceans is a major challenge. The data acquisition and data archival procedures usually vary significantly depending on the acquisition platform. This lack of standardization ultimately leads to information silos, preventing the data from being effectively shared across different scientific communities. In the past years, important steps have been taken in order to improve both standardization and interoperability, such as the Open Geospatial Consortium's Sensor Web Enablement (SWE) framework. Within this framework, standardized models and interfaces to archive, access and visualize the data from heterogeneous sensor resources have been proposed. However, due to the wide variety of software and hardware architectures presented by marine sensors and marine observation platforms, there is still a lack of uniform procedures to integrate sensors into existing SWE-based data infrastructures. In this work, a framework aimed at enabling plug-and-play sensor integration into existing SWE-based data infrastructures is presented. First, the operations required to automatically identify, configure and operate a sensor are analysed. Then, the metadata required for these operations is structured in a standard way. Afterwards, a modular, plug-and-play, SWE-based acquisition chain is proposed. Finally, different use cases for this framework are presented. PMID:29244732
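
    The SWE-based acquisition chain ultimately exposes observations through interfaces such as the OGC Sensor Observation Service (SOS). The sketch below shows the shape of such requests using standard SOS 2.0 key-value parameters; the service URL and the procedure and property identifiers are hypothetical.

    ```python
    # Hedged sketch of talking to an OGC Sensor Observation Service (SOS), the kind
    # of SWE interface the abstract builds on. The service URL and the offering
    # identifiers are placeholders; the KVP parameter names follow SOS 2.0.
    import requests

    SOS_URL = "https://sensors.example.org/sos"   # hypothetical SOS endpoint

    # 1) Ask the service what it offers.
    caps = requests.get(SOS_URL, params={
        "service": "SOS",
        "request": "GetCapabilities",
    }, timeout=30)
    caps.raise_for_status()

    # 2) Retrieve observations for one (hypothetical) sensor and observed property.
    obs = requests.get(SOS_URL, params={
        "service": "SOS",
        "version": "2.0.0",
        "request": "GetObservation",
        "procedure": "http://example.org/sensors/ctd-001",
        "observedProperty": "http://example.org/properties/sea_water_temperature",
        "responseFormat": "http://www.opengis.net/om/2.0",
    }, timeout=60)
    obs.raise_for_status()
    print(obs.text[:500])   # O&M XML document containing the observations
    ```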

  19. Using URIs to effectively transmit sensor data and metadata

    NASA Astrophysics Data System (ADS)

    Kokkinaki, Alexandra; Buck, Justin; Darroch, Louise; Gardner, Thomas

    2017-04-01

    Autonomous ocean observation is massively increasing the number of sensors in the ocean. Accordingly, the continuing increase in the datasets produced makes selecting sensors that are fit for purpose a growing challenge. Decision making on selecting quality sensor data is based on the sensor's metadata, i.e. manufacturer specifications, calibration history, etc. The Open Geospatial Consortium (OGC) has developed the Sensor Web Enablement (SWE) standards to facilitate integration and interoperability of sensor data and metadata. The World Wide Web Consortium (W3C) Semantic Web technologies enable machine comprehensibility, promoting sophisticated linking and processing of data published on the web. Linking the sensor's data and metadata according to the above-mentioned standards can yield practical difficulties, because of internal hardware bandwidth restrictions and a requirement to constrain data transmission costs. Our approach addresses these practical difficulties by uniquely identifying sensor and platform models and instances through URIs, which resolve via content negotiation to either OGC's Sensor Model Language (SensorML) or W3C Linked Data. Data transmitted by a sensor incorporate the sensor's unique URI to refer to its metadata. Sensor and platform model URIs and descriptions are created and hosted by the British Oceanographic Data Centre (BODC) linked systems service. The sensor owner creates the sensor and platform instance URIs prior to and during sensor deployment, through an updatable web form, the Sensor Instance Form (SIF). SIF enables model and instance URI association as well as platform and sensor linking. The use of URIs, which are dynamically generated through the SIF, offers both practical and economic benefits to the implementation of SWE and Linked Data standards in near-real-time systems. Data can be linked to metadata dynamically in-situ, while saving on the costs associated with the transmission of long metadata descriptions. The transmission of short URIs also enables the implementation of standards on systems where it would otherwise be impractical, such as legacy hardware.
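
    The content-negotiation mechanism described can be exercised with plain HTTP: the same sensor URI is requested twice with different Accept headers. The URI below is a placeholder rather than a real BODC linked systems identifier.

    ```python
    # Hedged sketch of resolving a sensor instance URI via HTTP content negotiation:
    # the same URI can return SensorML XML or Linked Data (e.g. JSON-LD) depending
    # on the Accept header. The URI below is a hypothetical placeholder.
    import requests

    sensor_uri = "https://linkedsystems.example.org/sensor/instance/ctd-001"  # hypothetical

    # Ask for SensorML (XML) ...
    xml_doc = requests.get(sensor_uri, headers={"Accept": "application/xml"}, timeout=30)

    # ... or ask the same URI for Linked Data as JSON-LD.
    jsonld_doc = requests.get(sensor_uri, headers={"Accept": "application/ld+json"}, timeout=30)

    print(xml_doc.headers.get("Content-Type"), jsonld_doc.headers.get("Content-Type"))
    ```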

  20. Interoperability Across the Stewardship Spectrum in the DataONE Repository Federation

    NASA Astrophysics Data System (ADS)

    Jones, M. B.; Vieglais, D.; Wilson, B. E.

    2016-12-01

    Thousands of earth and environmental science repositories serve many researchers and communities, each with their own community and legal mandates, sustainability models, and historical infrastructure. These repositories span the stewardship spectrum from highly curated collections that employ large numbers of staff members to review and improve data, to small, minimal budget repositories that accept data caveat emptor and where all responsibility for quality lies with the submitter. Each repository fills a niche, providing services that meet the stewardship tradeoffs of one or more communities. We have reviewed these stewardship tradeoffs for several DataONE member repositories, ranging from minimally curated (KNB) to highly curated (Arctic Data Center), and from general-purpose (Dryad) to highly discipline- or project-specific (NEON). The rationale behind different levels of stewardship reflects how these tradeoffs are resolved. Some repositories aim to encourage extensive uptake by keeping processes simple and minimizing the amount of information collected, but this limits the long-term utility of the data and the search, discovery, and integration systems that are possible. Other repositories require extensive metadata input, review, and assessment, allowing for excellent preservation, discovery, and integration but at the cost of significant time for submitters and expense for curatorial staff. DataONE recognizes these different levels of curation, and attempts to embrace them to create a federation that is useful across the stewardship spectrum. DataONE provides a tiered model for repositories with growing utility of DataONE services at higher tiers of curation. The lowest tier supports read-only access to data and requires little more than title and contact metadata. Repositories can gradually phase in support for higher levels of metadata and services as needed. These tiered capabilities are possible through flexible support for multiple metadata standards and services, where repositories can incrementally increase their requirements as they seek to satisfy more use cases. Within DataONE, metadata search services support minimal metadata models, but significantly expanded precision and recall become possible when repositories provide more extensively curated metadata.

  1. An ontology for component-based models of water resource systems

    NASA Astrophysics Data System (ADS)

    Elag, Mostafa; Goodall, Jonathan L.

    2013-08-01

    Component-based modeling is an approach for simulating water resource systems where a model is composed of a set of components, each with a defined modeling objective, interlinked through data exchanges. Component-based modeling frameworks are used within the hydrologic, atmospheric, and earth surface dynamics modeling communities. While these efforts have been advancing, it has become clear that the water resources modeling community in particular, and arguably the larger earth science modeling community as well, faces a challenge of fully and precisely defining the metadata for model components. The lack of a unified framework for model component metadata limits interoperability between modeling communities and the reuse of models across modeling frameworks due to ambiguity about the model and its capabilities. To address this need, we propose an ontology for water resources model components that describes core concepts and relationships using the Web Ontology Language (OWL). The ontology that we present, which is termed the Water Resources Component (WRC) ontology, is meant to serve as a starting point that can be refined over time through engagement by the larger community until a robust knowledge framework for water resource model components is achieved. This paper presents the methodology used to arrive at the WRC ontology, the WRC ontology itself, and examples of how the ontology can aid in component-based water resources modeling by (i) assisting in identifying relevant models, (ii) encouraging proper model coupling, and (iii) facilitating interoperability across earth science modeling frameworks.
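
    As a rough sketch of how such component metadata might look when expressed in OWL, the snippet below uses rdflib to describe one hypothetical model component; the namespace, class, and property names are invented for illustration and are not the published WRC terms.

    ```python
    # Hedged sketch: describing a water-resources model component in OWL with rdflib.
    # The wrc: namespace, class names, and property names are illustrative
    # placeholders, not the actual WRC ontology vocabulary.
    from rdflib import Graph, Namespace, Literal, RDF, RDFS
    from rdflib.namespace import OWL

    WRC = Namespace("http://example.org/wrc#")   # hypothetical namespace
    g = Graph()
    g.bind("wrc", WRC)
    g.bind("owl", OWL)

    # Declare a (hypothetical) component class and two properties.
    g.add((WRC.ModelComponent, RDF.type, OWL.Class))
    g.add((WRC.simulatesProcess, RDF.type, OWL.ObjectProperty))
    g.add((WRC.exchangesVariable, RDF.type, OWL.ObjectProperty))

    # Describe one component instance.
    g.add((WRC.SnowmeltComponent, RDF.type, WRC.ModelComponent))
    g.add((WRC.SnowmeltComponent, RDFS.label, Literal("Snowmelt runoff component")))
    g.add((WRC.SnowmeltComponent, WRC.simulatesProcess, WRC.Snowmelt))
    g.add((WRC.SnowmeltComponent, WRC.exchangesVariable, WRC.Streamflow))

    print(g.serialize(format="turtle"))
    ```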

  2. Earth Sciences data access and preservation with gLibrary

    NASA Astrophysics Data System (ADS)

    Guidetti, Veronica; Calanducci, Antonio

    2010-05-01

    ESA-ESRIN, the European Space Agency Centre for Earth Observation (EO), is the largest European EO data provider and operates as the reference European centre for EO payload data exploitation. EO data acquired from space have become powerful scientific tools to enable better understanding and management of the Earth and its resources. Large international initiatives such as GMES and GEO, supported by the European Commission, focus on coordinating international efforts in environmental monitoring, i.e. on providing political and technical solutions to global issues such as climate change, global environment monitoring, management of natural resources and humanitarian response. Since the time-span of EO data archives extends from a few years to decades, their value as scientific time-series increases considerably, especially for the topic of global change. It will soon be necessary to re-analyse on a global scale the information currently locked inside large thematic archives. Future research in the field of Earth Sciences is of invaluable importance: to carry it out, researchers worldwide must be enabled to find and access data of interest in a quick and easy way. At present, several thousands of scientists, principal investigators and operators access EO missions' metadata, data and derived information on a daily basis. Main objectives include studying global climate change, checking the status of on-board instruments, and verifying the quality of EO data. A huge worldwide scientific community is calling for EO data to be kept accessible without time constraints, easily and quickly. In collaboration with ESA-ESRIN, INFN, the National Institute for Nuclear Physics, is implementing a demonstrative use case where satellite remote sensing data, in-situ data and other kinds of digital assets are made available to the scientific community via gLibrary (https://glibrary.ct.infn.it), the INFN digital library platform. gLibrary can be used to store, organise, browse, retrieve, annotate and replicate any kind of digital asset on data grids or distributed storage environments. It provides digital asset preservation capabilities, making use of distributed replication of assets, decoupling from the underlying storage technology, and adoption of standard interfaces and metadata descriptions. In its future development gLibrary will investigate and possibly provide integration with grid and HPC processing services, including the ESA G-POD facility (http://eogrid.esrin.esa.int). Currently, gLibrary features encompass fast data access, quick retrieval of digital assets, metadata handling and sharing (including text annotation), high availability and scalability (due to its distributed architecture), (meta)data replication and, last but not least, authentication and authorisation. Much experimentation is ongoing at the EC and international levels to provide coordinated and interoperable access to EO data and satellite imagery, including all kinds of related digital assets (metadata, documents, product guidelines, auxiliary data, mission/sensor specifications, environmental reports). The work with gLibrary is a best-effort initiative that targets full interoperability with ESA EO data dissemination, recovery and processing services, and intends to demonstrate the benefit the scientific community can gain from this kind of integrated data access. It contributes to meeting Earth Sciences data users' needs, advancing technology development to facilitate interactive EO information sharing, analysis and interoperability on the Web.

  3. Interoperability Outlook in the Big Data Future

    NASA Astrophysics Data System (ADS)

    Kuo, K. S.; Ramachandran, R.

    2015-12-01

    The establishment of distributed active archive centers (DAACs) as data warehouses and the standardization of file formats by NASA's Earth Observing System Data Information System (EOSDIS) doubtlessly propelled interoperability of NASA Earth science data to unprecedented heights in the 1990s. However, two decades later interoperability still leaves much to be desired. We believe the inadequate interoperability we experience is a result of the current practice in which data are first packaged into files before distribution, and only the metadata of these files are cataloged into databases and become searchable. Data therefore cannot be efficiently filtered. Any extensive study thus requires downloading large volumes of data files to a local system for processing and analysis. The need to download data not only creates duplication and inefficiency but also further impedes interoperability, because the analysis has to be performed locally by individual researchers in individual institutions. Each institution or researcher often has its own preference in the choice of data management practice as well as programming languages. Analysis results (derived data) so produced are thus subject to the differences of these practices, which later form formidable barriers to interoperability. A number of Big Data technologies are currently being examined and tested to address Big Earth Data issues. These technologies share one common characteristic: exploiting compute and storage affinity to more efficiently analyze large volumes and great varieties of data. Distributed active "archive" centers are likely to evolve into distributed active "analysis" centers, which not only archive data but also provide analysis services right where the data reside. "Analysis" will become the more visible function of these centers. It is thus reasonable to expect interoperability to improve because analysis, in addition to data, becomes more centralized. Within a "distributed active analysis center" interoperability is almost guaranteed because data, analysis, and results can all be readily shared and reused. Effectively, with the establishment of "distributed active analysis centers", interoperation turns from a many-to-many problem into a less complicated few-to-few problem and becomes easier to solve.

  4. Principles of data integration and interoperability in the GEO Biodiversity Observation Network

    NASA Astrophysics Data System (ADS)

    Saarenmaa, Hannu; Ó Tuama, Éamonn

    2010-05-01

    The goal of the Global Earth Observation System of Systems (GEOSS) is to link existing information systems into a global and flexible network to address nine areas of critical importance to society. One of these "societal benefit areas" is biodiversity, and it will be supported by a GEOSS sub-system known as the GEO Biodiversity Observation Network (GEO BON). In planning the GEO BON, it was soon recognised that there are already a multitude of existing networks and initiatives in place worldwide. What has been lacking is a coordinated framework that allows for information sharing and exchange between the networks. Traversing the various scales of biodiversity, in particular from the individual and species levels to the ecosystems level, has long been a challenge. Furthermore, some of the major regions of the world have already taken steps to coordinate their efforts, but links between the regions have not been a priority until now. Linking biodiversity data to that of the other GEO societal benefit areas, in particular ecosystems, climate, and agriculture, to produce useful information for the UN Conventions and other policy-making bodies is another need that calls for integration of information. Integration and interoperability are therefore a major theme of GEO BON, and a "system of systems" is very much needed. There are several approaches to integration that need to be considered. Data integration requires harmonising concepts, agreeing on vocabularies, and building ontologies. Semantic mediation of data using these building blocks is still not easy to achieve. Agreements on, or mappings between, the metadata standards that will be used across the networks are a major requirement that will need to be addressed early on. With interoperable metadata, service integration will be possible through registry-of-registries systems such as GBIF's forthcoming GBDRS and the GEO Clearinghouse. Chaining various services that build intermediate products using workflow systems will also help expedite the delivery of products and reports that are required for integrated assessment of data from many disciplines. Going beyond the Service Oriented Architectures which are now mainstream, these challenges have lately been addressed in the business world by adopting what is called a Semantic Enterprise Architecture. Semantic portals have been built, in particular, to address interoperability across domains, where users may not be familiar with concepts of all networks. We will discuss the applicability of these approaches for building the global GEO BON.

  5. Automated Database Mediation Using Ontological Metadata Mappings

    PubMed Central

    Marenco, Luis; Wang, Rixin; Nadkarni, Prakash

    2009-01-01

    Objective To devise an automated approach for integrating federated database information using database ontologies constructed from their extended metadata. Background One challenge of database federation is that the granularity of representation of equivalent data varies across systems. Dealing effectively with this problem is analogous to dealing with precoordinated vs. postcoordinated concepts in biomedical ontologies. Model Description The authors describe an approach based on ontological metadata mapping rules defined with elements of a global vocabulary, which allows a query specified at one granularity level to fetch data, where possible, from databases within the federation that use different granularities. This is implemented in OntoMediator, a newly developed production component of our previously described Query Integrator System. OntoMediator's operation is illustrated with a query that accesses three geographically separate, interoperating databases. An example based on SNOMED also illustrates the applicability of high-level rules to support the enforcement of constraints that can prevent inappropriate curator or power-user actions. Summary A rule-based framework simplifies the design and maintenance of systems where categories of data must be mapped to each other, for the purpose of either cross-database query or for curation of the contents of compositional controlled vocabularies. PMID:19567801

  6. Development of an open metadata schema for prospective clinical research (openPCR) in China.

    PubMed

    Xu, W; Guan, Z; Sun, J; Wang, Z; Geng, Y

    2014-01-01

    In China, deployment of electronic data capture (EDC) and clinical data management systems (CDMS) for clinical research (CR) is in its very early stage, and about 90% of clinical studies collect and submit clinical data manually. This work aims to build an open metadata schema for Prospective Clinical Research (openPCR) in China based on openEHR archetypes, in order to help Chinese researchers easily create specific data entry templates for registration, study design and clinical data collection. The Singapore Framework for Dublin Core Application Profiles (DCAP) is used to develop openPCR, following four steps: defining the core functional requirements and deducing the core metadata items; developing archetype models; defining metadata terms and creating archetype records; and developing the implementation syntax. The core functional requirements are divided into three categories: requirements for research registration, requirements for trial design, and requirements for case report forms (CRF). 74 metadata items are identified and their Chinese authority names are created. The minimum metadata set of openPCR includes 3 documents, 6 sections, 26 top-level data groups, 32 lower data groups and 74 data elements. The top-level container in openPCR is composed of public document, internal document and clinical document archetypes. A hierarchical structure of openPCR is established according to the Data Structure of Electronic Health Record Architecture and Data Standard of China (Chinese EHR Standard). Metadata attributes are grouped into six parts: identification, definition, representation, relation, usage guides, and administration. OpenPCR is an open metadata schema based on research registration standards, standards of the Clinical Data Interchange Standards Consortium (CDISC) and Chinese healthcare-related standards, and is to be publicly available throughout China. It considers future integration of EHR and CR by adopting data structures and data terms from the Chinese EHR Standard. Archetypes in openPCR are modular models and can be separated, recombined, and reused. The authors recommend that the method used to develop openPCR be considered by other countries when designing metadata schemas for clinical research. In the next steps, openPCR should be used in a number of CR projects to test its applicability and to continuously improve its coverage. In addition, a metadata schema for research protocols can be developed to structure and standardize protocols, and syntactic interoperability of openPCR with other related standards can be considered.

  7. CINERGI: Community Inventory of EarthCube Resources for Geoscience Interoperability

    NASA Astrophysics Data System (ADS)

    Zaslavsky, Ilya; Bermudez, Luis; Grethe, Jeffrey; Gupta, Amarnath; Hsu, Leslie; Lehnert, Kerstin; Malik, Tanu; Richard, Stephen; Valentine, David; Whitenack, Thomas

    2014-05-01

    Organizing geoscience data resources to support cross-disciplinary data discovery, interpretation, analysis and integration is challenging because of different information models, semantic frameworks, metadata profiles, catalogs, and services used in different geoscience domains, not to mention different research paradigms and methodologies. The central goal of CINERGI, a new project supported by the US National Science Foundation through its EarthCube Building Blocks program, is to create a methodology and assemble a large inventory of high-quality information resources capable of supporting data discovery needs of researchers in a wide range of geoscience domains. The key characteristics of the inventory are: 1) collaboration with and integration of metadata resources from a number of large data facilities; 2) reliance on international metadata and catalog service standards; 3) assessment of resource "interoperability-readiness"; 4) ability to cross-link and navigate data resources, projects, models, researcher directories, publications, usage information, etc.; 5) efficient inclusion of "long-tail" data, which are not appearing in existing domain repositories; 6) data registration at feature level where appropriate, in addition to common dataset-level registration, and 7) integration with parallel EarthCube efforts, in particular focused on EarthCube governance, information brokering, service-oriented architecture design and management of semantic information. We discuss challenges associated with accomplishing CINERGI goals, including defining the inventory scope; managing different granularity levels of resource registration; interaction with search systems of domain repositories; explicating domain semantics; metadata brokering, harvesting and pruning; managing provenance of the harvested metadata; and cross-linking resources based on the linked open data (LOD) approaches. At the higher level of the inventory, we register domain-wide resources such as domain catalogs, vocabularies, information models, data service specifications, identifier systems, and assess their conformance with international standards (such as those adopted by ISO and OGC, and used by INSPIRE) or de facto community standards using, in part, automatic validation techniques. The main level in CINERGI leverages a metadata aggregation platform (currently Geoportal Server) to organize harvested resources from multiple collections and contributed by community members during EarthCube end-user domain workshops or suggested online. The latter mechanism uses the SciCrunch toolkit originally developed within the Neuroscience Information Framework (NIF) project and now being extended to other communities. The inventory is designed to support requests such as "Find resources with theme X in geographic area S", "Find datasets with subject Y using query concept expansion", "Find geographic regions having data of type Z", "Find datasets that contain property P". With the added LOD support, additional types of requests, such as "Find example implementations of specification X", "Find researchers who have worked in Domain X, dataset Y, location L", "Find resources annotated by person X", will be supported. Project's website (http://workspace.earthcube.org/cinergi) provides access to the initial resource inventory, a gallery of EarthCube researchers, collections of geoscience models, metadata entry forms, and other software modules and inventories being integrated into the CINERGI system. 
Support from the US National Science Foundation under award NSF ICER-1343816 is gratefully acknowledged.

  8. ISAIA: Interoperable Systems for Archival Information Access

    NASA Technical Reports Server (NTRS)

    Hanisch, Robert J.

    2002-01-01

    The ISAIA project was originally proposed in 1999 as a successor to the informal AstroBrowse project. AstroBrowse, which provided a data location service for astronomical archives and catalogs, was a first step toward data system integration and interoperability. The goals of ISAIA were ambitious: '...To develop an interdisciplinary data location and integration service for space science. Building upon existing data services and communications protocols, this service will allow users to transparently query hundreds or thousands of WWW-based resources (catalogs, data, computational resources, bibliographic references, etc.) from a single interface. The service will collect responses from various resources and integrate them in a seamless fashion for display and manipulation by the user.' Funding was approved only for a one-year pilot study, a decision that in retrospect was wise given the rapid changes in information technology in the past few years and the emergence of the Virtual Observatory initiatives in the US and worldwide. Indeed, the ISAIA pilot study was influential in shaping the science goals, system design, metadata standards, and technology choices for the virtual observatory. The ISAIA pilot project also helped to cement working relationships among the NASA data centers, US ground-based observatories, and international data centers. The ISAIA project was formed as a collaborative effort between thirteen institutions that provided data to astronomers, space physicists, and planetary scientists. Among the fruits we ultimately hoped would come from this project was a central site on the Web that any space scientist could use to efficiently locate existing data relevant to a particular scientific question. Furthermore, we hoped that the needed technology would be general enough that smaller, more focused communities within space science could use the same technologies and standards to provide more specialized services. A major challenge to searching for data across a broad community is that the information that describes some data products is either not relevant to other data or not applicable in the same way. Some previous metadata standard development efforts (e.g., in the earth science and library communities) have produced standards that are very large and difficult to support. To address this problem, we studied how a standard may be divided into separable pieces. Data providers that wish to participate in interoperable searches can support only those parts of the standard that are relevant to them. We prototyped a top-level metadata standard that was small and applicable to all space science data.

  9. The Energy Industry Profile of ISO/DIS 19115-1: Facilitating Discovery and Evaluation of, and Access to Distributed Information Resources

    NASA Astrophysics Data System (ADS)

    Hills, S. J.; Richard, S. M.; Doniger, A.; Danko, D. M.; Derenthal, L.; Energistics Metadata Work Group

    2011-12-01

    A diverse group of organizations representative of the international community involved in disciplines relevant to the upstream petroleum industry - energy companies, suppliers and publishers of information to the energy industry, vendors of software applications used by the industry, and partner government and academic organizations - has engaged in the Energy Industry Metadata Standards Initiative. This Initiative envisions the use of standard metadata within the community to enable significant improvements in the efficiency with which users discover, evaluate, and access distributed information resources. The metadata standard needed to realize this vision is the initiative's primary deliverable. In addition to developing the metadata standard, the initiative is promoting its adoption to accelerate realization of the vision, and publishing metadata exemplars conformant with the standard. Implementation of the standard by community members, in the form of published metadata which document the information resources each organization manages, will allow use of tools requiring consistent metadata for efficient discovery and evaluation of, and access to, information resources. While metadata are expected to be widely accessible, access to associated information resources may be more constrained. The initiative is being conducted by Energistics' Metadata Work Group, in collaboration with the USGIN Project. Energistics is a global standards group in the oil and natural gas industry. The Work Group determined early in the initiative, based on input solicited from 40+ organizations and on an assessment of existing metadata standards, to develop the target metadata standard as a profile of a revised version of ISO 19115, formally the "Energy Industry Profile of ISO/DIS 19115-1 v1.0" (EIP). The Work Group is participating on the ISO/TC 211 project team responsible for the revision of ISO 19115, now ready for "Draft International Standard" (DIS) status. With ISO 19115 an established, capability-rich, open standard for geographic metadata, EIP v1 is expected to be widely acceptable within the community and readily sustainable over the long term. The EIP design, also per community requirements, will enable discovery and evaluation of, and access to, the types of information resources considered important to the community, including structured and unstructured digital resources, and physical assets such as hardcopy documents and material samples. This presentation will briefly review the development of this initiative as well as the current and planned Work Group activities. More time will be spent providing an overview of EIP v1, including the requirements it prescribes, design efforts made to enable automated metadata capture and processing, and the structure and content of its documentation, which was written to minimize ambiguity and facilitate implementation. The Work Group considers EIP v1 a solid initial design for interoperable metadata, and a first step toward the vision of the Initiative.

  10. Strategic Assessment for Arctic Observing, and the New Arctic Observing Viewer

    NASA Astrophysics Data System (ADS)

    Kassin, A.; Cody, R. P.; Manley, W. F.; Gaylord, A. G.; Dover, M.; Score, R.; Lin, D. H.; Villarreal, S.; Quezada, A.; Tweedie, C. E.

    2013-12-01

    Although a great deal of progress has been made with various Arctic Observing efforts, it can be difficult to assess that progress. What data collection efforts are established or under way? Where? By whom? To help meet the strategic needs of SEARCH-AON, SAON, and related initiatives, a new resource has been released: the Arctic Observing Viewer (AOV; http://ArcticObservingViewer.org). This web mapping application covers the 'who', 'what', 'where', and 'when' of data collection sites - wherever marine or terrestrial data are collected. Hundreds of sites are displayed, providing an overview as well as details. Users can visualize, navigate, select, search, draw, print, and more. This application currently showcases a subset of observational activities and will become more comprehensive with time. The AOV is founded on principles of interoperability, with an emerging metadata standard and compatible web service formats, such that participating agencies and organizations can use the AOV tools and services for their own purposes. In this way, the AOV will complement other cyber-resources, and will help science planners, funding agencies, PI's, and others to: assess status, identify overlap, fill gaps, assure sampling design, refine network performance, clarify directions, access data, coordinate logistics, collaborate, and more to meet Arctic Observing goals.

  11. Changing knowledge perspective in a changing world: The Adriatic multidisciplinary TDS approach

    NASA Astrophysics Data System (ADS)

    Bergamasco, Andrea; Carniel, Sandro; Nativi, Stefano; Signell, Richard P.; Benetazzo, Alvise; Falcieri, Francesco M.; Bonaldo, Davide; Minuzzo, Tiziano; Sclavo, Mauro

    2013-04-01

    The use and exploitation of the marine environment has increased markedly in recent years, calling for better description, monitoring and understanding of its behavior. However, marine scientists and managers often spend too much time in accessing and reformatting data instead of focusing on discovering new knowledge from the processes observed and data acquired. There is therefore a need to make our approach to data mining more efficient, especially in a world where rapid climate change imposes quick choices. In this context, it is mandatory to explore ways and possibilities to make large amounts of distributed data usable in an efficient and easy way, an effort that requires standardized data protocols, web services and standards-based tools. Following the US-IOOS approach, which has been adopted in many oceanographic and meteorological sectors, we present a CNR experience in the direction of setting up a national Italian IOOS framework (at the moment confined to the Adriatic Sea environment), using the THREDDS (THematic Real-time Environmental Distributed Data Services) Data Server (TDS). A TDS is middleware designed to fill the gap between data providers and data users, and provides services allowing data users to find the data sets pertaining to their scientific needs, and to access, visualize and use them in an easy way, without the need to download files to the local workspace. In order to achieve these results, it is necessary that the data providers make their data available in a standard form that the TDS understands, and with sufficient metadata so that the data can be read and searched for in a standard way. The TDS core is the NetCDF-Java library implementing a Common Data Model (CDM), as developed by Unidata (http://www.unidata.ucar.edu), allowing access to "array-based" scientific data. Climate and Forecast (CF) compliant NetCDF files can be read directly with no modification, while non-compliant files can be modified to meet appropriate metadata requirements. Once standardized in the CDM, the TDS makes datasets available through a series of web services such as OPeNDAP or the Open Geospatial Consortium Web Coverage Service (WCS), allowing the data users to easily obtain small subsets from large datasets, and to quickly visualize their content by using tools such as GODIVA2 or the Integrated Data Viewer (IDV). In addition, an ISO metadata service is available through the TDS that can be harvested by catalogue broker services (e.g. GI-cat) to enable distributed search across federated data servers. Examples of TDS datasets describing oceanographic evolution (currents, waves, sediments...) will be described and discussed; some examples can be accessed directly at the Venice site http://tds.ve.ismar.cnr.it:8080/thredds/catalog.html (Bergamasco et al., 2012), also within the framework of the RITMARE Project. References Bergamasco A., Benetazzo A., Carniel S., Falcieri F., Minuzzo T., Signell R.P. and M. Sclavo, 2012. From interoperability to knowledge discovery using large model datasets in the marine environment: the THREDDS Data Server example. Advances in Oceanography and Limnology, 3(1), 41-50. DOI:10.1080/19475721.2012.669637
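
    To make the access pattern concrete, the sketch below opens a remote CF/NetCDF dataset from a THREDDS server over OPeNDAP and pulls a small subset without downloading whole files; the dataset URL and variable name are hypothetical placeholders, not actual holdings of the Venice TDS.

    ```python
    # Hedged sketch: subsetting a remote CF/NetCDF dataset served by a THREDDS Data
    # Server over OPeNDAP, using xarray. The dataset URL and variable name below
    # are hypothetical; substitute a real OPeNDAP endpoint from a TDS catalog.
    import xarray as xr

    OPENDAP_URL = "http://tds.example.org/thredds/dodsC/adriatic/currents.nc"  # hypothetical

    ds = xr.open_dataset(OPENDAP_URL)   # lazy: only metadata is read here
    print(ds)                           # CF metadata: variables, coordinates, attributes

    # Pull just a small space/time subset; only these values cross the network.
    subset = ds["sea_water_temperature"].sel(
        time=slice("2012-01-01", "2012-01-07"),
        lat=slice(44.0, 46.0),
        lon=slice(12.0, 14.0),
    )
    print(subset.mean().values)
    ```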

  12. An ODIP Effort to Map R2R Ocean Data Terms to International Vocabularies

    NASA Astrophysics Data System (ADS)

    Ferreira, R.; Stocks, K. I.; Arko, R. A.

    2014-12-01

    The diversity of terminology used to describe ocean data creates a barrier to efficient discovery and re-use of data, particularly across institutional, programmatic, and disciplinary boundaries. Here we explore the outcomes of a student project to crosswalk terms between the Rolling Deck to Repository (R2R) program and other international systems, as part of the Ocean Data Interoperability Platform (ODIP). R2R is a U.S. program developing and implementing an information management system to preserve and provide access to routine underway data collected by U.S. academic research vessels. R2R participates in ODIP, an international forum for improving interoperability and effective sharing of marine data resources through technical workshops and joint prototypes. The vocabulary mapping effort lays a foundation for future ocean data portals through which users search and access ocean data using familiar terms. R2R describes its data with a suite of controlled vocabularies (http://www.rvdata.us/voc), some of which were developed within R2R or are specific to the U.S. The goal of this student project is to crosswalk local/national vocabularies to authoritative international ones, where they exist, or to vocabularies widely used by ODIP partners. Specifically, R2R developed the following crosswalks: UNOLS ports to the SeaDataNet Ports Gazetteer, R2R Device Models to the NVS SeaVoX Device Catalog, R2R Organizations to the European Directory of Marine Organizations (EDMO), and R2R chief scientist names to well-known professional identifiers such as ORCID, ResearchGate, LinkedIn, etc. Mappings were done in simple spreadsheets using synonymy relationships, and will be published as part of the R2R Linked Data resources. The level of success in crosswalking was variable. All ports are successfully mapped. Both organizations and device models have initial mappings, and R2R has added new terms to the EDMO and SeaVoX Device Catalog vocabularies, allowing for nearly complete coverage of terms. An initial search for R2R scientists' identifiers on ORCID returned few potential matches, and most potential matches lacked sufficient metadata to confirm the match. R2R is now adopting an alternative approach of asking chief scientists to self-report the professional identifiers used to expose their work.
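
    A crosswalk of this kind is commonly published as SKOS mapping triples; the sketch below records one synonymy relationship with rdflib, with both concept URIs invented purely for illustration rather than taken from the R2R or SeaDataNet vocabularies.

    ```python
    # Hedged sketch: publishing a vocabulary crosswalk as SKOS mappings with rdflib.
    # Both concept URIs below are illustrative placeholders, not real R2R or
    # SeaDataNet identifiers.
    from rdflib import Graph, URIRef
    from rdflib.namespace import SKOS

    g = Graph()
    g.bind("skos", SKOS)

    local_term = URIRef("http://example.org/r2r-like/port/woods_hole")         # hypothetical
    international_term = URIRef("http://example.org/seadatanet-like/C38/012")  # hypothetical

    # The spreadsheet's synonymy relationship becomes a skos:exactMatch triple.
    g.add((local_term, SKOS.exactMatch, international_term))

    print(g.serialize(format="turtle"))
    ```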

  13. EMODNet Bathymetry - building and providing a high resolution digital bathymetry for European seas

    NASA Astrophysics Data System (ADS)

    Schaap, D.

    2016-12-01

    Access to marine data is a key issue for the EU Marine Strategy Framework Directive and the EU Marine Knowledge 2020 agenda and includes the European Marine Observation and Data Network (EMODnet) initiative. The EMODnet Bathymetry project develops and publishes Digital Terrain Models (DTM) for the European seas. These are produced from survey and aggregated data sets that are indexed with metadata, adopting from SeaDataNet the Common Data Index (CDI) data discovery and access service and the Sextant data products catalogue service. SeaDataNet is a network of major oceanographic data centres around the European seas that manage, operate and further develop a pan-European infrastructure for marine and ocean data management. SeaDataNet is also setting and governing marine data standards, and exploring and establishing interoperability solutions to connect to other e-infrastructures on the basis of standards such as ISO and OGC. The SeaDataNet portal provides users a number of interrelated meta directories, an extensive range of controlled vocabularies, and the various SeaDataNet standards and tools. SeaDataNet at present gives an overview of and access to more than 1.8 million data sets for physical oceanography, chemistry, geology, geophysics, bathymetry and biology from more than 100 connected data centres from 34 countries riparian to European seas. The latest EMODnet Bathymetry DTM has a resolution of 1/8 arcminute * 1/8 arcminute and covers all European sea regions. Use is made of available and gathered surveys, and already more than 13,000 surveys have been indexed by 27 European data providers from 15 countries. Use is also made of composite DTMs as generated and maintained by several data providers for their areas of interest. Already 44 composite DTMs are included in the Sextant data products catalogue. For areas without coverage, use is made of the latest global DTM of GEBCO, which is a partner in the EMODnet Bathymetry project. In return, GEBCO integrates the EMODnet DTM to achieve an enriched and better result. The catalogue services and the generated EMODnet DTM can be queried and browsed at the dedicated EMODnet Bathymetry portal, which also provides a versatile DTM viewing service with many relevant map layers and functions for data retrieval. The EMODnet DTM is publicly available for downloading in various formats.

  14. Acoustic Metadata Management and Transparent Access to Networked Oceanographic Data Sets

    DTIC Science & Technology

    2014-09-30

    [Record provides investigator contact details only, no abstract: Catherine Berchok (Fisheries Science Center, National Marine Fisheries Service, Seattle, WA), Erin M. Oleson (Pacific Islands Fisheries Science Center, National Marine Fisheries Service), and Sofie M. Van Parijs (Northeast Fisheries Science Center, National Marine Fisheries Service, Woods Hole, MA).]

  15. EMODnet Physics in the EMODnet program phase 3

    NASA Astrophysics Data System (ADS)

    Novellino, Antonio; Gorringe, Patrick; Schaap, Dick; Pouliquen, Sylvie; Rickards, Lesley; Thijsse, Peter; Manzella, Giuseppe

    2017-04-01

    Access to marine data is of vital importance for marine research and a key issue for various studies, from climate change prediction to offshore engineering. Giving access to and harmonising marine data from different sources will help industry, public authorities and researchers find the data and make more effective use of them to develop new products and services and to improve our understanding of how the seas behave. The aim of EMODnet Physics is the provision of a combined array of services and functionalities (facility for viewing and downloading, dashboard reporting and machine-to-machine communication services) to obtain, free of charge, data, metadata and data products on the physical conditions of European sea basins and oceans from many different distributed databases. Moreover, the system provides full interoperability with third-party software through WMS services, Web Services and Web catalogues in order to exchange data and products according to the most recent standards. This assures the user access to data of consistent quality and format. The portal provides access to data and products covering wave height and period; temperature and salinity of the water column; wind speed and direction; horizontal velocity of the water column; light attenuation; sea ice coverage and sea level trends. EMODnet Physics is continuously enhancing the number and type of platforms in the system by unlocking and providing high quality data from a growing network. The system currently integrates information from more than 12,000 stations and includes two ready-to-use data products: Ice Map and Sea Level Trends. The final aim of EMODnet Physics is to confederate different portals and act as a portal of portals, further extending the number and types of data (e.g. water noise, river data, etc.) and platforms (e.g. animal-borne instruments, etc.) feeding the system, and improving the capacity of the system to produce data and products that match the needs of current and potential new end and intermediate users.
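
    As an example of the machine-to-machine route mentioned above, the sketch below fetches a map image of one physical parameter through a standard OGC WMS GetMap request; the service URL and layer name are hypothetical placeholders, not the actual EMODnet Physics services.

    ```python
    # Hedged sketch of fetching one parameter as a map image via an OGC WMS
    # GetMap request. The service URL and layer name are hypothetical; the
    # parameter names follow the WMS 1.3.0 specification.
    import requests

    WMS_URL = "https://physics.example.org/wms"   # hypothetical WMS endpoint

    params = {
        "service": "WMS",
        "version": "1.3.0",
        "request": "GetMap",
        "layers": "sea_water_temperature",        # hypothetical layer name
        "styles": "",
        "crs": "EPSG:4326",
        "bbox": "30,-10,46,37",                   # lat/lon axis order for EPSG:4326 in WMS 1.3.0
        "width": "800",
        "height": "600",
        "format": "image/png",
    }

    resp = requests.get(WMS_URL, params=params, timeout=60)
    resp.raise_for_status()
    with open("temperature_map.png", "wb") as fh:
        fh.write(resp.content)
    ```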

  16. SIOExplorer: Modern IT Methods and Tools for Digital Library Management

    NASA Astrophysics Data System (ADS)

    Sutton, D. W.; Helly, J.; Miller, S.; Chase, A.; Clarck, D.

    2003-12-01

    With more geoscience disciplines becoming data-driven, it is increasingly important to utilize modern techniques for data, information and knowledge management. SIOExplorer is a new digital library project with 2 terabytes of oceanographic data collected over the last 50 years on 700 cruises by the Scripps Institution of Oceanography. It is built using a suite of information technology tools and methods that allow for an efficient and effective digital library management system. The library consists of a number of independent collections, each with corresponding metadata formats. The system architecture allows each collection to be built and uploaded based on a collection-dependent metadata template file (MTF). This file is used to create the hierarchical structure of the collection, to create metadata tables in a relational database, and to populate object metadata files and the collection as a whole. Collections are comprised of arbitrary digital objects stored at the San Diego Supercomputer Center (SDSC) High Performance Storage System (HPSS) and managed using the Storage Resource Broker (SRB), data-handling middleware developed at SDSC. SIOExplorer interoperates with other collections as a data provider through the Open Archives Initiative (OAI) protocol. The user services for SIOExplorer are accessed from CruiseViewer, a Java application served using Java Web Start from the SIOExplorer home page. CruiseViewer is an advanced tool for data discovery and access. It implements general keyword and interactive geospatial search methods for the collections. It georeferences search results on user-selected basemaps such as global topography or crustal age. User services include metadata viewing, opening of selected MIME-type digital objects (such as images, documents and grid files), and downloading of objects (including the brokering of proprietary hold restrictions).

  17. Building a High Performance Metadata Broker using Clojure, NoSQL and Message Queues

    NASA Astrophysics Data System (ADS)

    Truslove, I.; Reed, S.

    2013-12-01

    In practice, Earth and Space Science Informatics often relies on getting more done with less: fewer hardware resources, less IT staff, fewer lines of code. As a capacity-building exercise focused on rapid development of high-performance geoinformatics software, the National Snow and Ice Data Center (NSIDC) built a prototype metadata brokering system using a new JVM language, modern database engines and virtualized or cloud computing resources. The metadata brokering system was developed with the overarching goals of (i) demonstrating a technically viable product with as little development effort as possible, (ii) using very new yet very popular tools and technologies in order to get the most value from the least legacy-encumbered code bases, and (iii) being a high-performance system by using scalable subcomponents, and implementation patterns typically used in web architectures. We implemented the system using the Clojure programming language (an interactive, dynamic, Lisp-like JVM language), Redis (a fast in-memory key-value store) as both the data store for original XML metadata content and as the provider for the message queueing service, and ElasticSearch for its search and indexing capabilities to generate search results. On evaluating the results of the prototyping process, we believe that the technical choices did in fact allow us to do more for less, due to the expressive nature of the Clojure programming language and its easy interoperability with Java libraries, and the successful reuse or re-application of high performance products or designs. This presentation will describe the architecture of the metadata brokering system, cover the tools and techniques used, and describe lessons learned, conclusions, and potential next steps.
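
    The broker architecture described (raw XML and a work queue in Redis, search documents in ElasticSearch) can be sketched in a few lines. The sketch below uses Python rather than Clojure, and the key names, index name, and field choices are illustrative assumptions, not NSIDC's actual design.

    ```python
    # Hedged sketch of a metadata-broker flow like the one described: original XML
    # is stored in Redis, a queue of record ids drives indexing, and search
    # documents go to Elasticsearch over its REST API. All names are illustrative.
    import json
    import redis
    import requests

    r = redis.Redis(host="localhost", port=6379)
    ES = "http://localhost:9200"

    def enqueue(record_id: str, xml: str) -> None:
        """Store the raw XML and put the record id on the work queue."""
        r.set(f"metadata:xml:{record_id}", xml)
        r.lpush("metadata:queue", record_id)

    def index_one() -> None:
        """Pop one record id, derive a search document, and index it."""
        _, record_id = r.brpop("metadata:queue")
        record_id = record_id.decode()
        xml = r.get(f"metadata:xml:{record_id}").decode()
        doc = {"id": record_id, "raw_xml": xml, "title": "unparsed"}  # toy extraction
        resp = requests.put(f"{ES}/metadata/_doc/{record_id}",
                            data=json.dumps(doc),
                            headers={"Content-Type": "application/json"})
        resp.raise_for_status()

    if __name__ == "__main__":
        enqueue("rec-001", "<metadata><title>Sea ice extent</title></metadata>")
        index_one()
    ```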

  18. Organization of marine phenology data in support of planning and conservation in ocean and coastal ecosystems

    USGS Publications Warehouse

    Thomas, Kathryn A.; Fornwall, Mark D.; Weltzin, Jake F.; Griffis, R.B.

    2014-01-01

    Among the many effects of climate change is its influence on the phenology of biota. In marine and coastal ecosystems, phenological shifts have been documented for multiple life forms; however, biological data related to marine species' phenology remain difficult to access and are under-used. We conducted an assessment of potential sources of biological data for marine species and their availability for use in phenological analyses and assessments. Our evaluations showed that data potentially related to understanding marine species' phenology are available through online resources of governmental, academic, and non-governmental organizations, but appropriate datasets are often difficult to discover and access, presenting opportunities for scientific infrastructure improvement. The developing Federal Marine Data Architecture, when fully implemented, will improve data flow and standardization for marine data within major federal repositories and provide an archival repository for collaborating academic and public data contributors. Another opportunity, largely untapped, is the engagement of citizen scientists in standardized collection of marine phenology data and contribution of these data to established data flows. Use of metadata with marine-phenology-related keywords could improve discovery of and access to appropriate datasets. When data originators choose to self-publish, publication of research datasets with a digital object identifier, linked to metadata, will also improve subsequent discovery and access. Phenological changes in the marine environment will affect human economics, food systems, and recreation. No one source of data will be sufficient to understand these changes. The collective attention of marine data collectors, whether with an agency, an educational institution, or a citizen scientist group, is needed toward adopting the data management processes and standards required to ensure the availability of sufficient and useable marine data to understand marine phenology.

  19. Grid Enabled Geospatial Catalogue Web Service

    NASA Technical Reports Server (NTRS)

    Chen, Ai-Jun; Di, Li-Ping; Wei, Ya-Xing; Liu, Yang; Bui, Yu-Qi; Hu, Chau-Min; Mehrotra, Piyush

    2004-01-01

    The Geospatial Catalogue Web Service is a vital service for sharing and interoperating volumes of distributed heterogeneous geospatial resources, such as data, services, applications, and their replicas over the web. Based on Grid technology and the Open Geospatial Consortium (OGC) Catalogue Service - Web information model, this paper proposes a new information model for the Geospatial Catalogue Web Service, named GCWS, which can securely provide Grid-based publishing, managing and querying of geospatial data and services, and transparent access to replica data and related services under the Grid environment. This information model integrates the information model of the Grid Replica Location Service (RLS)/Monitoring & Discovery Service (MDS) with the information model of the OGC Catalogue Service (CSW), and refers to the geospatial data metadata standards from ISO 19115, FGDC and NASA EOS Core System, and the service metadata standards from ISO 19119, to extend itself for expressing geospatial resources. Using GCWS, any valid geospatial user who belongs to an authorized Virtual Organization (VO) can securely publish and manage geospatial resources, especially query on-demand data in the virtual community and retrieve it through data-related services which provide functions such as subsetting, reformatting, reprojection, etc. This work facilitates geospatial resource sharing and interoperation under the Grid environment, making geospatial resources Grid-enabled and Grid technologies geospatially enabled. It also allows researchers to focus on science, and not on issues with computing capability, data location, processing and management. GCWS is also a key component for workflow-based virtual geospatial data production.

  20. RPPAML/RIMS: A metadata format and an information management system for reverse phase protein arrays

    PubMed Central

    Stanislaus, Romesh; Carey, Mark; Deus, Helena F; Coombes, Kevin; Hennessy, Bryan T; Mills, Gordon B; Almeida, Jonas S

    2008-01-01

    Background: Reverse Phase Protein Arrays (RPPA) are convenient assay platforms to investigate the presence of biomarkers in tissue lysates. As with other high-throughput technologies, substantial amounts of analytical data are generated. Over 1000 samples may be printed on a single nitrocellulose slide. Up to 100 different proteins may be assessed using immunoperoxidase or immunofluorescence techniques in order to determine relative amounts of protein expression in the samples of interest. Results: In this report an RPPA Information Management System (RIMS) is described and made available with open source software. In order to implement the proposed system, we propose a metadata format known as reverse phase protein array markup language (RPPAML). RPPAML would enable researchers to describe, document and disseminate RPPA data. The complexity of the data structure needed to describe the results and the graphic tools necessary to visualize them require a software deployment distributed between a client and a server application. This was achieved without sacrificing interoperability between individual deployments through the use of an open source semantic database, S3DB. This data service backbone is available to multiple client side applications that can also access other server side deployments. The RIMS platform was designed to interoperate with other data analysis and data visualization tools such as Cytoscape. Conclusion: The proposed RPPAML data format aims to standardize RPPA data. Standardization of data would result in diverse client applications being able to operate on the same set of data. Additionally, having data in a standard format would enable data dissemination and data analysis. PMID:19102773

  1. Measures for interoperability of phenotypic data: minimum information requirements and formatting.

    PubMed

    Ćwiek-Kupczyńska, Hanna; Altmann, Thomas; Arend, Daniel; Arnaud, Elizabeth; Chen, Dijun; Cornut, Guillaume; Fiorani, Fabio; Frohmberg, Wojciech; Junker, Astrid; Klukas, Christian; Lange, Matthias; Mazurek, Cezary; Nafissi, Anahita; Neveu, Pascal; van Oeveren, Jan; Pommier, Cyril; Poorter, Hendrik; Rocca-Serra, Philippe; Sansone, Susanna-Assunta; Scholz, Uwe; van Schriek, Marco; Seren, Ümit; Usadel, Björn; Weise, Stephan; Kersey, Paul; Krajewski, Paweł

    2016-01-01

    Plant phenotypic data shrouds a wealth of information which, when accurately analysed and linked to other data types, brings to light the knowledge about the mechanisms of life. As phenotyping is a field of research comprising manifold, diverse and time-consuming experiments, the findings can be fostered by reusing and combining existing datasets. Their correct interpretation, and thus replicability, comparability and interoperability, is possible provided that the collected observations are equipped with an adequate set of metadata. So far there have been no common standards governing phenotypic data description, which has hampered data exchange and reuse. In this paper we propose guidelines for proper handling of the information about plant phenotyping experiments, in terms of both the recommended content of the description and its formatting. We provide a document called "Minimum Information About a Plant Phenotyping Experiment", which specifies what information about each experiment should be given, and a Phenotyping Configuration for the ISA-Tab format, which allows this information to be organised practically within a dataset. We provide examples of ISA-Tab-formatted phenotypic data, and a general description of a few systems where the recommendations have been implemented. Acceptance of the rules described in this paper by the plant phenotyping community will help to achieve findable, accessible, interoperable and reusable data.
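
    To give a rough feel for the tabular organisation that ISA-Tab imposes, the sketch below writes a minimal study-level table using Python's csv module. The file name and column headings follow general ISA-Tab conventions (Source Name, Characteristics[...], Sample Name) and are placeholder assumptions; the full field list required by the Phenotyping Configuration is defined in the paper itself and is not reproduced here.

      # Hypothetical, minimal ISA-Tab-style study table written as tab-separated values.
      # Column names follow general ISA-Tab conventions; the actual Phenotyping
      # Configuration requires a richer set of fields than shown here.
      import csv

      rows = [
          ["Source Name", "Characteristics[organism]", "Characteristics[genotype]", "Sample Name"],
          ["plant_001", "Arabidopsis thaliana", "Col-0", "leaf_sample_001"],
          ["plant_002", "Arabidopsis thaliana", "Col-0", "leaf_sample_002"],
      ]

      with open("s_phenotyping_study.txt", "w", newline="") as handle:
          writer = csv.writer(handle, delimiter="\t")
          writer.writerows(rows)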

  2. Designing for Change: Interoperability in a scaling and adapting environment

    NASA Astrophysics Data System (ADS)

    Yarmey, L.

    2015-12-01

    The Earth Science cyberinfrastructure landscape is constantly changing. Technologies advance and technical implementations are refined or replaced. Data types, volumes, packaging, and use cases evolve. Scientific requirements emerge and mature. Standards shift while systems scale and adapt. In this complex and dynamic environment, interoperability remains a critical component of successful cyberinfrastructure. Through the resource- and priority-driven iterations on systems, interfaces, and content, questions fundamental to stable and useful Earth Science cyberinfrastructure arise. For instance, how are sociotechnical changes planned, tracked, and communicated? How should operational stability balance against 'new and shiny'? How can ongoing maintenance and mitigation of technical debt be managed in an often short-term resource environment? The Arctic Data Explorer is a metadata brokering application developed to enable discovery of international, interdisciplinary Arctic data across distributed repositories. Completely dependent on interoperable third party systems, the Arctic Data Explorer publicly launched in 2013 with an original 3000+ data records from four Arctic repositories. Since then the search has scaled to 25,000+ data records from thirteen repositories at the time of writing. In the final months of original project funding, priorities shift to lean operations with a strategic eye on the future. Here we present lessons learned from four years of Arctic Data Explorer design, development, communication, and maintenance work along with remaining questions and potential directions.

  3. Italian Polar Metadata System

    NASA Astrophysics Data System (ADS)

    Longo, S.; Nativi, S.; Leone, C.; Migliorini, S.; Mazari Villanova, L.

    2012-04-01

    The Italian Antarctic Research Programme (PNRA) is a government initiative funding and coordinating scientific research activities in polar regions. PNRA manages two scientific stations in Antarctica - Concordia (Dome C), jointly operated with the French Polar Institute "Paul Emile Victor", and Mario Zucchelli (Terra Nova Bay, Southern Victoria Land). In addition, the National Research Council of Italy (CNR) manages one scientific station within the Arctic Circle (Ny-Ålesund, Svalbard Islands), named Dirigibile Italia. PNRA started in 1985 with the first Italian expedition in Antarctica. Since then each research group has collected data on biology and medicine, geodetic observatories, geophysics, geology, glaciology, physics and atmospheric chemistry, earth-sun relationships and astrophysics, oceanography and the marine environment, chemical contamination, law and geographic science, technology, and multi- and interdisciplinary research, autonomously and in different formats. In 2010 the Italian Ministry of Research assigned the scientific coordination of the Programme to CNR, which is in charge of the management and sharing of the scientific results carried out in the framework of the PNRA. Therefore, CNR is establishing a new distributed cyber(e)-infrastructure to collect, manage, publish and share polar research results. This is a service-based infrastructure building on Web technologies to implement resource (i.e. data, service and document) discovery, access and visualization; in addition, semantic-enabled functionalities will be provided. The architecture applies the "System of Systems" principles to build incrementally on the existing systems by supplementing but not supplanting their mandates and governance arrangements. This allows the existing capacities to be kept as autonomous as possible. The cyber(e)-infrastructure implements multi-disciplinary interoperability following a Brokering approach and supporting the relevant European and international standards, including GEO/GEOSS, INSPIRE and SCAR. The Brokering approach is empowered by a technology developed by CNR, advanced by the FP7 EuroGEOSS project, and recently adopted by the GEOSS Common Infrastructure (GCI).

  4. Assessing Quality of Data Standards: Framework and Illustration Using XBRL GAAP Taxonomy

    NASA Astrophysics Data System (ADS)

    Zhu, Hongwei; Wu, Harris

    The primary purpose of data standards or metadata schemas is to improve the interoperability of data created by multiple standard users. Given the high cost of developing data standards, it is desirable to assess the quality of data standards. We develop a set of metrics and a framework for assessing data standard quality. The metrics include completeness and relevancy. Standard quality can also be indirectly measured by assessing interoperability of data instances. We evaluate the framework using data from the financial sector: the XBRL (eXtensible Business Reporting Language) GAAP (Generally Accepted Accounting Principles) taxonomy and US Securities and Exchange Commission (SEC) filings produced using the taxonomy by approximately 500 companies. The results show that the framework is useful and effective. Our analysis also reveals quality issues of the GAAP taxonomy and provides useful feedback to taxonomy users. The SEC has mandated that all publicly listed companies must submit their filings using XBRL. Our findings are timely and have practical implications that will ultimately help improve the quality of financial data.
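
    The metrics are defined over the set of elements a standard provides and the set of elements actually used in data instances. The sketch below shows one plausible set-based reading of completeness (how much of the taxonomy the filings exercise) and relevancy (how much of the filings the taxonomy covers); the exact formulas used by the authors may differ, so treat these definitions and the example element names as assumptions.

      # Illustrative (assumed) set-based versions of completeness and relevancy for
      # a data standard, given the elements it defines and the elements actually
      # used across a collection of data instances (e.g., XBRL filings).
      def completeness(standard_elements, used_elements):
          # Fraction of the standard that is exercised by the instances.
          return len(standard_elements & used_elements) / len(standard_elements)

      def relevancy(standard_elements, used_elements):
          # Fraction of instance elements that the standard covers.
          return len(standard_elements & used_elements) / len(used_elements)

      taxonomy = {"Assets", "Liabilities", "Revenues", "NetIncomeLoss"}
      filings = {"Assets", "Revenues", "NetIncomeLoss", "CustomAdjustmentXYZ"}

      print(completeness(taxonomy, filings))  # 3 shared elements / 4 defined = 0.75
      print(relevancy(taxonomy, filings))     # 3 shared elements / 4 used = 0.75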

  5. Emergence of a Common Modeling Architecture for Earth System Science (Invited)

    NASA Astrophysics Data System (ADS)

    Deluca, C.

    2010-12-01

    Common modeling architecture can be viewed as a natural outcome of common modeling infrastructure. The development of model utility and coupling packages (ESMF, MCT, OpenMI, etc.) over the last decade represents the realization of a community vision for common model infrastructure. The adoption of these packages has led to increased technical communication among modeling centers and newly coupled modeling systems. However, adoption has also exposed aspects of interoperability that must be addressed before easy exchange of model components among different groups can be achieved. These aspects include common physical architecture (how a model is divided into components) and model metadata and usage conventions. The National Unified Operational Prediction Capability (NUOPC), an operational weather prediction consortium, is collaborating with weather and climate researchers to define a common model architecture that encompasses these advanced aspects of interoperability and looks to future needs. The nature and structure of the emergent common modeling architecture will be discussed along with its implications for future model development.

  6. BioHackathon series in 2011 and 2012: penetration of ontology and linked data in life science domains

    PubMed Central

    2014-01-01

    The application of semantic technologies to the integration of biological data and the interoperability of bioinformatics analysis and visualization tools has been the common theme of a series of annual BioHackathons hosted in Japan for the past five years. Here we provide a review of the activities and outcomes from the BioHackathons held in 2011 in Kyoto and 2012 in Toyama. In order to efficiently implement semantic technologies in the life sciences, participants formed various sub-groups and worked on the following topics: Resource Description Framework (RDF) models for specific domains, text mining of the literature, ontology development, essential metadata for biological databases, platforms to enable efficient Semantic Web technology development and interoperability, and the development of applications for Semantic Web data. In this review, we briefly introduce the themes covered by these sub-groups. The observations made, conclusions drawn, and software development projects that emerged from these activities are discussed. PMID:24495517

  7. The Gulf of Mexico Coastal Ocean Observing System: A Decade of Data Aggregation and Services.

    NASA Astrophysics Data System (ADS)

    Howard, M.; Gayanilo, F.; Kobara, S.; Baum, S. K.; Currier, R. D.; Stoessel, M. M.

    2016-02-01

    The Gulf of Mexico Coastal Ocean Observing System Regional Association (GCOOS-RA) celebrated its 10-year anniversary in 2015. GCOOS-RA is one of 11 RAs organized under the NOAA-led U.S. Integrated Ocean Observing System (IOOS) Program Office to aggregate regional data and make these data publicly available in preferred forms and formats via standards-based web services. Initial development of GCOOS focused on building elements of the IOOS Data Management and Communications Plan, which is a framework for end-to-end interoperability. These elements included: data discovery, catalog, metadata, online-browse, data access and transport. Initial data types aggregated included near real-time physical oceanographic, marine meteorological and satellite data. Our focus in the middle of the past decade was on the production of basic products such as maps of current oceanographic conditions and quasi-static datasets such as bathymetry and climatologies. In the latter part of the decade we incorporated historical physical oceanographic datasets and historical coastal and offshore water quality data into our holdings and added our first biological dataset. We also developed web environments and products to support citizen scientists and stakeholder groups such as recreational boaters. Current efforts are directed towards applying data quality assurance (testing and flagging) to non-federal data, data archiving at national repositories, serving and visualizing numerical model output, providing data services for glider operators, and supporting marine biodiversity observing networks. GCOOS Data Management works closely with the Gulf of Mexico Research Initiative Information and Data Cooperative and various groups involved with Gulf restoration. GCOOS-RA has influenced attitudes and behaviors associated with good data stewardship and data management practices across the Gulf and will continue to do so into the next decade.

  8. SIMOcean: Maritime Open Data and Services Platform for Portuguese Institutions

    NASA Astrophysics Data System (ADS)

    Almeida, Nuno; Grosso, Nuno; Catarino, Nuno; Gutierrez, Antonio; Lamas, Luísa; Alves, Margarida; Almeida, Sara; Deus, Ricardo; Oliveira, Paulo

    2016-04-01

    Portugal is the country with the largest EEZ in the EU and the 10th largest EEZ in the world, at 3,877,408 km2, making integrated management of the Portuguese marine system crucial for monitoring a wide range of interdependent domains. A system like this assimilates data and information from different thematic areas, ranging from ocean and atmosphere state variables to higher level datasets describing human activities and related environmental, social and economic impacts. Currently, these datasets are collected by a wide number of public and private institutions with very diverse purposes (e.g., monitoring, research, recreation, vigilance), leading to dataset duplication, a lack of common data and metadata standards across organizations, and the propagation of closed information systems with different implementation solutions. This lack of coordination and visibility hinders marine management, monitoring and vigilance capabilities, not only by making it more difficult to access, or even be aware of, the existence of certain datasets, but also by minimizing the ability to create added value products or services through dataset integration from different sources. Adopting an Open Data approach will bring significant benefits by reducing the cost of information exchange and data integration, promoting the extensive use of this data. SIMOcean (System for Integrated Monitoring of the Ocean), co-funded by the EEA Grants Programme, is integrated in the initiative of the Portuguese Government to develop a set of coordinated systems providing access to national marine data. These systems aim to improve the Portuguese marine management, monitoring and vigilance capabilities, aggregating different data, including specific human activities datasets (vessel traffic, fishing records, oil spills), and environment variables (waves, currents, wind). Those datasets, currently scattered among different departments of the Portuguese Meteorological (IPMA) and the Navy's Hydrographic (IH) Institutes, will be brought together in the SIMOcean Open Data system. The SIMOcean system will also exploit this data in the following three flagship value added services: 1) Characterisation of Fishing Areas; 2) Wave Alerts for Sea Ports; and 3) Support to Search and Rescue Missions. These services will be driven by end users such as Civil Protection Authorities, Port Authorities and Fishing Associations, for whom these new products will have a significant positive impact on operations. SIMOcean will be based on open source web based GIS interoperable solutions, compliant with OGC and INSPIRE directive standards, to support the evolution of a set of open interfaces and protocols in the development of a common European spatial data infrastructure. The Catalogue solution (based on ckan) will consider the Portuguese Metadata Profile for the Sea developed by the SNIM@R project, the guidelines provided by directive 2013/37/EU and the Goldbook provided by the European Data portal. The system will be based on the SenSyF approach of a scalable Cloud Computing system, providing authorised entities with a single access point for data catalogue, visualisation, processing and value added service deployment. It will be used by two of the main Portuguese sea data providers with operational responsibilities in marine management, monitoring and vigilance.

  9. Final Report for the Development of the NASA Technical Report Server (NTRS)

    NASA Technical Reports Server (NTRS)

    Nelson, Michael L.

    2005-01-01

    The author performed a variety of research, development and consulting tasks for NASA Langley Research Center in the area of digital libraries (DLs) and supporting technologies, such as the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). In particular, the development focused on the NASA Technical Report Server (NTRS) and its transition from a distributed searching model to one that uses the OAI-PMH. The Open Archives Initiative (OAI) is an international consortium focused on furthering the interoperability of DLs through the use of "metadata harvesting". The OAI-PMH version of NTRS went into public production on April 28, 2003. Since that time, it has been extremely well received. In addition to providing the NTRS user community with a higher level of service than the previous, distributed searching version of NTRS, it has provided more insight into how the user community uses NTRS in a variety of deployment scenarios. This report details the design, implementation and maintenance of the NTRS. Source code is included in the appendices.
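
    OAI-PMH is an ordinary HTTP protocol, so a minimal harvest of Dublin Core records can be sketched with the Python standard library alone. The base URL below is a placeholder rather than the actual NTRS endpoint; the verb, the metadataPrefix parameter, and the namespaces are part of the OAI-PMH 2.0 and Dublin Core specifications.

      # Minimal OAI-PMH ListRecords harvest using only the standard library.
      # The repository base URL is a hypothetical placeholder.
      from urllib.request import urlopen
      import xml.etree.ElementTree as ET

      BASE = "https://example.org/oai"  # hypothetical OAI-PMH endpoint
      url = BASE + "?verb=ListRecords&metadataPrefix=oai_dc"

      ns = {
          "oai": "http://www.openarchives.org/OAI/2.0/",
          "dc": "http://purl.org/dc/elements/1.1/",
      }

      tree = ET.parse(urlopen(url))
      for record in tree.iterfind(".//oai:record", ns):
          title = record.find(".//dc:title", ns)
          if title is not None:
              print(title.text)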

  10. R classes and methods for SNP array data.

    PubMed

    Scharpf, Robert B; Ruczinski, Ingo

    2010-01-01

    The Bioconductor project is an "open source and open development software project for the analysis and comprehension of genomic data" (1), primarily based on the R programming language. Infrastructure packages, such as Biobase, are maintained by Bioconductor core developers and serve several key roles to the broader community of Bioconductor software developers and users. In particular, Biobase introduces an S4 class, the eSet, for high-dimensional assay data. Encapsulating the assay data as well as meta-data on the samples, features, and experiment in the eSet class definition ensures propagation of the relevant sample and feature meta-data throughout an analysis. Extending the eSet class promotes code reuse through inheritance as well as interoperability with other R packages and is less error-prone. Recently proposed class definitions for high-throughput SNP arrays extend the eSet class. This chapter highlights the advantages of adopting and extending Biobase class definitions through a working example of one implementation of classes for the analysis of high-throughput SNP arrays.

  11. CruiseViewer: SIOExplorer Graphical Interface to Metadata and Archives.

    NASA Astrophysics Data System (ADS)

    Sutton, D. W.; Helly, J. J.; Miller, S. P.; Chase, A.; Clark, D.

    2002-12-01

    We are introducing "CruiseViewer" as a prototype graphical interface for the SIOExplorer digital library project, part of the overall NSF National Science Digital Library (NSDL) effort. When complete, CruiseViewer will provide access to nearly 800 cruises, as well as 100 years of documents and images from the archives of the Scripps Institution of Oceanography (SIO). The project emphasizes data object accessibility, a rich metadata format, efficient uploading methods and interoperability with other digital libraries. The primary function of CruiseViewer is to provide a human interface to the metadata database and to storage systems filled with archival data. The system schema is based on the concept of an "arbitrary digital object" (ADO). Arbitrary in that if the object can be stored on a computer system then SIOExplorer can manage it. Common examples are a multibeam swath bathymetry file, a .pdf cruise report, or a tar file containing all the processing scripts used on a cruise. We require a metadata file for every ADO in an ascii "metadata interchange format" (MIF), which has proven to be highly useful for operability and extensibility. Bulk ADO storage is managed using the Storage Resource Broker (SRB), data handling middleware developed at the San Diego Supercomputer Center that centralizes management and access to distributed storage devices. MIF metadata are harvested from several sources and housed in a relational (Oracle) database. For CruiseViewer, cgi scripts resident on an Apache server are the primary communication and service request handling tools. Along with the CruiseViewer Java application, users can query, access and download objects via a separate method that operates through standard web browsers, http://sioexplorer.ucsd.edu. Both provide the functionality to query and view object metadata, and to select and download ADOs. For the CruiseViewer application, Java 2D is used to add a geo-referencing feature that allows users to select basemap images and have vector shapes representing query results mapped over the basemap in the image panel. The two methods together address a wide range of user access needs and will allow for widespread use of SIOExplorer.

  12. Geo-Seas - building a unified e-infrastructure for marine geoscientific data management in Europe

    NASA Astrophysics Data System (ADS)

    Glaves, H.; Schaap, D.

    2012-04-01

    A significant barrier to marine geoscientific research in Europe is the lack of standardised marine geological and geophysical data and data products which could potentially facilitate multidisciplinary marine research extending across national and international boundaries. Although there are large volumes of geological and geophysical data available for the marine environment, it is currently very difficult to use these datasets in an integrated way due to different nomenclatures, formats, scales and coordinate systems being used within different organisations as well as between countries. This makes the direct use of primary data very difficult and also hampers use of the data to produce integrated multidisciplinary data products and services. The Geo-Seas project, an EU Framework 7 funded initiative, is developing a unified e-infrastructure to facilitate the sharing of marine geoscientific data within Europe. This e-infrastructure is providing on-line access to both discovery metadata and the associated federated data sets from 26 European data centres via a dedicated portal. The implementation of the Geo-Seas portal is allowing a range of end users to locate, assess and access standardised geoscientific data from multiple sources which is interoperable with other marine data types. Geo-Seas is building on the work already done by the existing SeaDataNet project, which currently provides a data management e-infrastructure for oceanographic data allowing users to locate and access federated oceanographic data sets. By adopting and adapting the SeaDataNet methodologies and technologies, the Geo-Seas project has not only avoided unnecessary duplication of effort by reusing existing and proven technologies but has also contributed to the development of a multidisciplinary approach to ocean science across Europe through the creation of a joint infrastructure for both marine geoscientific and oceanographic data. This approach is also leading to the development of collaborative links with other European projects including EMODNET, Eurofleets, Genesi-DEC and iMarine, as well as extending to the wider marine geoscientific and oceanographic community, including projects in the USA such as the Rolling Deck Repository (R2R) initiative and also organisations in both the USA and Australia. On behalf of the Geo-Seas consortium partners: NERC-BGS (United Kingdom), NERC-BODC (United Kingdom), NERC-NOCS (United Kingdom), MARIS (Netherlands), IFREMER (France), BRGM (France), TNO (Netherlands), BSH (Germany), IGME (Spain), LNEG (Portugal), GSI (Ireland), BGR (Germany), OGS (Italy), GEUS (Denmark), NGU (Norway), PGI (Poland), EGK (Estonia), NRC-IGG (Lithuania), IO-BAS (Bulgaria), NOA (Greece), CIRIA (United Kingdom), MUMM (Belgium), UB (Spain), UCC (Ireland), EU-Consult (Netherlands), CNRS (France), SHOM (France), CEFAS (United Kingdom), and LU (Latvia).

  13. Linking User Identities Across the DataONE Federation of Data Repositories

    NASA Astrophysics Data System (ADS)

    Jones, M. B.; Mecum, B.; Leinfelder, B.; Jones, C. S.; Walker, L.

    2016-12-01

    DataONE provides services for identifying, authenticating, and authorizing researchers to access and contribute data to repositories within the DataONE federation. In the earth sciences, thousands of institutional and disciplinary repositories have created their own user identity and authentication systems with their own user directory based on a database or web content management systems. Thus, researchers have many identities that are neither linked nor interoperable, making it difficult to reference the identity of these users across systems. Key user information is hidden, and only a non-disambiguated name is often available. From a sample of 160,000 data sets within DataONE, a super-majority of references to the data creators lack even an email address. In an attempt to disambiguate these people via the GeoLink project, we conservatively estimate they represent at least 57,000 unique identities, but without a clear user identifier, there could be as many as 223,000. Interoperability among repositories is critical to improving the scope of scientific synthesis and capabilities for research collaboration. While many have focused on the convenience of Single Sign-On (SSO), we have found that sharing user identifiers is far more useful for interoperability. With an unambiguous user identity in incoming metadata, DataONE has built user-profiles that present that user's data across repositories, that link users and their organizational affiliations, and that allow users to work collaboratively in private groups that span repository systems. DataONE's user identity solution leverages existing systems such as InCommon, CILogon, Google, and ORCID to not further proliferate user identities. DataONE provides a core service allowing users to link their multiple identities so that authenticating with one identity (e.g., ORCID) can authorize access to data protected via another identity (e.g., InCommon). Currently, DataONE is using ORCID identities to link and identify users, but challenges must still be overcome to support historical records for which ORCIDs can not be used because the associated people are unavailable to confirm their identity. DataONE's identity systems facilitate crosslinking between user identities and scientific metadata to accelerate collaboration and synthesis.

  14. Realising the Benefits of Adopting and Adapting Existing CF Metadata Conventions to a Broader Range of Geoscience Data

    NASA Astrophysics Data System (ADS)

    Druken, K. A.; Trenham, C. E.; Wang, J.; Bastrakova, I.; Evans, B. J. K.; Wyborn, L. A.; Ip, A. I.; Poudjom Djomani, Y.

    2016-12-01

    The National Computational Infrastructure (NCI) hosts one of Australia's largest repositories (10+ PBytes) of research data, colocated with a petascale High Performance Computer and a highly integrated research cloud. Key to maximizing benefit of NCI's collections and computational capabilities is ensuring seamless interoperable access to these datasets. This presents considerable data management challenges across the diverse range of geoscience data; spanning disciplines where netCDF-CF is commonly utilized (e.g., climate, weather, remote-sensing), through to the geophysics and seismology fields that employ more traditional domain- and study-specific data formats. These data are stored in a variety of gridded, irregularly spaced (i.e., trajectories, point clouds, profiles), and raster image structures. They often have diverse coordinate projections and resolutions, thus complicating the task of comparison and inter-discipline analysis. Nevertheless, much can be learned from the netCDF-CF model that has long served the climate community, providing a common data structure for the atmospheric, ocean and cryospheric sciences. We are extending the application of the existing Climate and Forecast (CF) metadata conventions to NCI's broader geoscience data collections. We present simple implementations that can significantly improve interoperability of the research collections, particularly in the case of line survey data. NCI has developed a compliance checker to assist with the data quality across all hosted netCDF-CF collections. The tool is an extension to one of the main existing CF Convention checkers, that we have modified to incorporate the Attribute Convention for Data Discovery (ACDD) and ISO19115 standards, and to perform parallelised checks over collections of files, ensuring compliance and consistency across the NCI data collections as a whole. It is complemented by a checker that also verifies functionality against a range of scientific analysis, programming, and data visualisation tools. By design, these tests are not necessarily domain-specific, and demonstrate that verified data is accessible to end-users, thus allowing for seamless interoperability with other datasets across a wide range of fields.
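
    To make the convention work concrete, the sketch below writes a small netCDF file whose variable attributes follow CF and whose global attributes follow the Attribute Convention for Data Discovery (ACDD), i.e. the kinds of fields a compliance checker inspects. The file name, dimension sizes, and attribute values are illustrative only; NCI's collection-level conventions are considerably more extensive.

      # Sketch of a netCDF file carrying CF variable attributes and ACDD global
      # attributes of the kind a compliance checker inspects. Values are illustrative.
      import numpy as np
      from netCDF4 import Dataset

      with Dataset("example_cf_acdd.nc", "w") as nc:
          nc.createDimension("time", 3)
          time = nc.createVariable("time", "f8", ("time",))
          time.standard_name = "time"                      # CF coordinate metadata
          time.units = "days since 2000-01-01 00:00:00"
          time[:] = np.arange(3)

          temp = nc.createVariable("sea_surface_temperature", "f4", ("time",))
          temp.standard_name = "sea_surface_temperature"   # CF standard name
          temp.units = "kelvin"
          temp[:] = [288.1, 288.4, 287.9]

          # Global attributes: CF convention declaration plus ACDD discovery fields.
          nc.Conventions = "CF-1.6, ACDD-1.3"
          nc.title = "Illustrative sea surface temperature series"
          nc.summary = "Synthetic example demonstrating CF and ACDD attributes."
          nc.keywords = "oceans, sea surface temperature"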

  15. Terminology supported archiving and publication of environmental science data in PANGAEA.

    PubMed

    Diepenbroek, Michael; Schindler, Uwe; Huber, Robert; Pesant, Stéphane; Stocker, Markus; Felden, Janine; Buss, Melanie; Weinrebe, Matthias

    2017-11-10

    Exemplified on the information system PANGAEA, we describe the application of terminologies for archiving and publishing environmental science data. A terminology catalogue (TC) was embedded into the system, with interfaces that allow terminologies to be replicated and manually curated. For data ingest and archiving, we show how the TC can improve the structuring and harmonization of lineage and content descriptions of data sets. Key is the conceptualization of measurement and observation types (parameters) and methods, for which we have implemented a basic syntax and rule set. For data access and dissemination, we have improved findability of data through enrichment of metadata with TC terms. Semantic annotations, e.g. adding term concepts (including synonyms and hierarchies) or mapped terms of different terminologies, facilitate comprehensive data retrieval. The PANGAEA thesaurus of classifying terms, which is part of the TC, is used as an umbrella vocabulary that links the various domains and allows drill downs and side drills with various facets. Furthermore, we describe how TC terms can be linked to nominal data values. This improves data harmonization and facilitates structural transformation of heterogeneous data sets to a common schema. Technical developments are complemented by work on the metadata content. Over the last 20 years, more than 100 new parameters have been defined on average per week. Recently, PANGAEA has increasingly been submitting new terms to various terminology services. Matching terms from terminology services with our parameter or method strings is supported programmatically. However, the process ultimately needs manual input by domain experts. The quality of terminology services is an additional limiting factor, and varies with respect to content, editorial practice, interoperability, and sustainability. Good quality terminology services are the building blocks for the conceptualization of parameters and methods. In our view, they are essential for data interoperability and arguably the most difficult hurdle for data integration. In summary, the application of terminologies has a mutually positive effect for terminology services and information systems such as PANGAEA. On both sides, the application of terminologies improves content, reliability and interoperability. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
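
    The programmatic matching of local parameter strings against terminology terms mentioned above can be illustrated with a simple string-similarity pass; PANGAEA's actual pipeline is richer (synonyms, hierarchies, expert review), so the snippet below is only a sketch of the idea using the Python standard library, with made-up vocabulary and parameter strings.

      # Illustrative first-pass matching of local parameter strings against a
      # controlled vocabulary using string similarity; real terminology matching
      # also relies on synonyms, hierarchies, and expert curation.
      from difflib import get_close_matches

      vocabulary = ["Temperature, water", "Salinity", "Chlorophyll a", "Oxygen"]
      local_parameters = ["water temperature", "chlorophyll-a", "dissolved oxygen"]

      for parameter in local_parameters:
          candidates = get_close_matches(parameter, vocabulary, n=1, cutoff=0.4)
          print(parameter, "->", candidates[0] if candidates else "no match")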

  16. Arctic Observing Network Data Management: Current Capabilities and Their Promise for the Future

    NASA Astrophysics Data System (ADS)

    Collins, J.; Fetterer, F.; Moore, J. A.

    2008-12-01

    CADIS (the Cooperative Arctic Data and Information Service) serves as the data management, discovery and delivery component of the Arctic Observing Network (AON). As an International Polar Year (IPY) initiative, AON comprises 34 land, atmosphere and ocean observation sites, and will acquire much of the data coming from the interagency Study of Environmental Arctic Change (SEARCH). CADIS is tasked with ensuring that these observational data are managed for long term use by members of the entire Earth System Science community. Portions of CADIS are either in use by the community or available for testing. We now have an opportunity to evaluate the feedback received from our users, to identify any design shortcomings, and to identify those elements which serve their purpose well and will support future development. This presentation will focus on the nuts-and-bolts of the CADIS development to date, with an eye towards presenting lessons learned and best practices based on our experiences so far. The topics include: - How did we assess our users' needs, and how are those contributions reflected in the end product and its capabilities? - Why did we develop a CADIS metadata profile, and how does it allow CADIS to support preservation and scientific interoperability? - How can we shield the user from metadata complexities (especially those associated with various standards) while still obtaining the metadata needed to support an effective data management system? - How can we bridge the gap between the data storage formats considered convenient by researchers in the field, and those which are necessary to provide data interoperability? - What challenges have been encountered in our efforts to provide access to federated data (data stored outside of the CADIS system)? - What are the data browsing and visualization needs of the AON community, and which tools and technologies are most promising in terms of supporting those needs? A live demonstration of the current capabilities of the CADIS system will be included as time and logistics allow. CADIS is a joint effort of the University Corporation for Atmospheric Research (UCAR), the National Snow and Ice Data Center (NSIDC), and the National Center for Atmospheric Research (NCAR).

  17. Multimedia content description framework

    NASA Technical Reports Server (NTRS)

    Bergman, Lawrence David (Inventor); Mohan, Rakesh (Inventor); Li, Chung-Sheng (Inventor); Smith, John Richard (Inventor); Kim, Michelle Yoonk Yung (Inventor)

    2003-01-01

    A framework is provided for describing multimedia content and a system in which a plurality of multimedia storage devices employing the content description methods of the present invention can interoperate. In accordance with one form of the present invention, the content description framework is a description scheme (DS) for describing streams or aggregations of multimedia objects, which may comprise audio, images, video, text, time series, and various other modalities. This description scheme can accommodate an essentially limitless number of descriptors in terms of features, semantics or metadata, and facilitate content-based search, index, and retrieval, among other capabilities, for both streamed and aggregated multimedia objects.

  18. System for Earth Sample Registration SESAR: Services for IGSN Registration and Sample Metadata Management

    NASA Astrophysics Data System (ADS)

    Chan, S.; Lehnert, K. A.; Coleman, R. J.

    2011-12-01

    SESAR, the System for Earth Sample Registration, is an online registry for physical samples collected for Earth and environmental studies. SESAR generates and administers the International Geo Sample Number IGSN, a unique identifier for samples that is dramatically advancing interoperability amongst information systems for sample-based data. SESAR was developed to provide the complete range of registry services, including definition of IGSN syntax and metadata profiles, registration and validation of name spaces requested by users, tools for users to submit and manage sample metadata, validation of submitted metadata, generation and validation of the unique identifiers, archiving of sample metadata, and public or private access to the sample metadata catalog. With the development of SESAR v3, we placed particular emphasis on creating enhanced tools that make metadata submission easier and more efficient for users, and that provide superior functionality for users to manage metadata of their samples in their private workspace MySESAR. For example, SESAR v3 includes a module where users can generate custom spreadsheet templates to enter metadata for their samples, then upload these templates online for sample registration. Once the content of the template is uploaded, it is displayed online in an editable grid format. Validation rules are executed in real-time on the grid data to ensure data integrity. Other new features of SESAR v3 include the capability to transfer ownership of samples to other SESAR users, the ability to upload and store images and other files in a sample metadata profile, and the tracking of changes to sample metadata profiles. In the next version of SESAR (v3.5), we will further improve the discovery, sharing, registration of samples. For example, we are developing a more comprehensive suite of web services that will allow discovery and registration access to SESAR from external systems. Both batch and individual registrations will be possible through web services. Based on valuable feedback from the user community, we will introduce enhancements that add greater flexibility to the system to accommodate the vast diversity of metadata that users want to store. Users will be able to create custom metadata fields and use these for the samples they register. Users will also be able to group samples into 'collections' to make retrieval for research projects or publications easier. An improved interface design will allow for better workflow transition and navigation throughout the application. In keeping up with the demands of a growing community, SESAR has also made process changes to ensure efficiency in system development. For example, we have implemented a release cycle to better track enhancements and fixes to the system, and an API library that facilitates reusability of code. Usage tracking, metrics and surveys capture information to guide the direction of future developments. A new set of administrative tools allows greater control of system management.

  19. A Multi-Purpose Data Dissemination Infrastructure for the Marine-Earth Observations

    NASA Astrophysics Data System (ADS)

    Hanafusa, Y.; Saito, H.; Kayo, M.; Suzuki, H.

    2015-12-01

    To open the data from a variety of observations, the Japan Agency for Marine-Earth Science and Technology (JAMSTEC) has developed a multi-purpose data dissemination infrastructure. Although many observations have been made in the earth sciences, not all the data are fully open. We think data centers may provide researchers with a universal data dissemination service which can handle various kinds of observation data with little effort. For this purpose the JAMSTEC Data Management Office has developed the "Information Catalog Infrastructure System (Catalog System)". This is a catalog management system which can create, renew and delete catalogs (= databases) and has the following features: the Catalog System does not depend on data types or granularity of data records; by registering a new metadata schema to the system, a new database can be created on the same system without system modification; as web pages are defined by cascading style sheets, each database can have its own look and feel, and operability; the Catalog System provides databases with basic search tools (search by text, selection from a category tree, and selection from a time line chart); for domestic users it creates Japanese and English pages at the same time and has a dictionary to control terminology and proper nouns. As of August 2015 JAMSTEC operates 7 databases on the Catalog System. We expect to transfer existing databases to this system, or create new databases on it. In comparison with a dedicated database developed for a specific dataset, the Catalog System is suitable for the dissemination of small datasets, with minimum cost. Metadata held in the catalogs may be transferred to other metadata schemas for exchange with global databases or portals. Examples: JAMSTEC Data Catalog: http://www.godac.jamstec.go.jp/catalog/data_catalog/metadataList?lang=en ; JAMSTEC Document Catalog: http://www.godac.jamstec.go.jp/catalog/doc_catalog/metadataList?lang=en&tab=category ; Research Information and Data Access Site of TEAMS: http://www.i-teams.jp/catalog/rias/metadataList?lang=en&tab=list

  20. Cytometry metadata in XML

    NASA Astrophysics Data System (ADS)

    Leif, Robert C.; Leif, Stephanie H.

    2016-04-01

    Introduction: The International Society for Advancement of Cytometry (ISAC) has created a standard for the Minimum Information about a Flow Cytometry Experiment (MIFlowCyt 1.0). CytometryML will serve as a common metadata standard for flow and image cytometry (digital microscopy). Methods: The MIFlowCyt data-types were created, as is the rest of CytometryML, in the XML Schema Definition Language (XSD1.1). The datatypes are primarily based on the Flow Cytometry and the Digital Imaging and Communication (DICOM) standards. A small section of the code was formatted with standard HTML formatting elements (p, h1, h2, etc.). Results: 1) The part of MIFlowCyt that describes the Experimental Overview, including the specimen and substantial parts of several other major elements, has been implemented as CytometryML XML schemas (www.cytometryml.org). 2) The feasibility of using MIFlowCyt to provide the combination of an overview, table of contents, and/or an index of a scientific paper or a report has been demonstrated. Previously, a sample electronic publication, EPUB, was created that could contain both MIFlowCyt metadata as well as the binary data. Conclusions: The use of CytometryML technology together with XHTML5 and CSS permits the metadata to be directly formatted and, together with the binary data, to be stored in an EPUB container. This will facilitate formatting, data-mining, presentation, data verification, and inclusion in structured research, clinical, and regulatory documents, as well as demonstrate a publication's adherence to the MIFlowCyt standard, promote interoperability, and should also result in the textual and numeric data being published using web technology without any change in composition.
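
    Because CytometryML is expressed in XSD 1.1, instance documents can be checked against the published schemas with any schema-aware toolkit. The sketch below uses the third-party xmlschema Python package; the schema and instance file names are hypothetical placeholders standing in for files obtained from www.cytometryml.org.

      # Sketch: validate a cytometry metadata document against an XSD 1.1 schema
      # using the third-party 'xmlschema' package. File names are placeholders.
      from xmlschema import XMLSchema11

      schema = XMLSchema11("cytometryml_experiment.xsd")   # hypothetical local schema copy
      if schema.is_valid("experiment_metadata.xml"):       # hypothetical instance document
          print("instance conforms to the schema")
      else:
          schema.validate("experiment_metadata.xml")       # raises an error with details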

  1. Making United States Integrated Ocean Observing System (U.S. IOOS) inclusive of marine biological resources

    USGS Publications Warehouse

    Moustahfid, H.; Potemra, J.; Goldstein, P.; Mendelssohn, R.; Desrochers, A.

    2011-01-01

    An important Data Management and Communication (DMAC) goal is to enable a multi-disciplinary view of the ocean environment by facilitating discovery and integration of data from various sources, projects and scientific domains. United States Integrated Ocean Observing System (U.S. IOOS) DMAC functional requirements are based upon guidelines for standardized data access services, data formats, metadata, controlled vocabularies, and other conventions. So far, the data integration effort has focused on geophysical U.S. IOOS core variables such as temperature, salinity, ocean currents, etc. The IOOS Biological Observations Project is addressing the DMAC requirements that pertain to biological observations standards and interoperability applicable to U.S. IOOS and to various observing systems. Biological observations are highly heterogeneous and the variety of formats, logical structures, and sampling methods create significant challenges. Here we describe an informatics framework for biological observing data (e.g. species presence/absence and abundance data) that will expand information content and reconcile standards for the representation and integration of these biological observations for users to maximize the value of these observing data. We further propose that the approach described can be applied to other datasets generated in scientific observing surveys and will provide a vehicle for wider dissemination of biological observing data. We propose to employ data definition conventions that are well understood in U.S. IOOS and to combine these with ratified terminologies, policies and guidelines. © 2011 MTS.

  2. Promoting discovery and access to real time observations produced by regional coastal ocean observing systems

    NASA Astrophysics Data System (ADS)

    Anderson, D. M.; Snowden, D. P.; Bochenek, R.; Bickel, A.

    2015-12-01

    In the U.S. coastal waters, a network of eleven regional coastal ocean observing systems supports real-time coastal and ocean observing. The platforms supported and variables acquired are diverse, ranging from current-sensing high frequency (HF) radar to autonomous gliders. The system incorporates data produced by other networks and experimental systems, further increasing the breadth of the collection. Strategies promoted by the U.S. Integrated Ocean Observing System (IOOS) ensure these data are not lost at sea. Every data set deserves a description. ISO and FGDC compliant metadata enables catalog interoperability and record-sharing. Extensive use of netCDF with the Climate and Forecast convention (identifying both metadata and a structured format) is shown to be a powerful strategy to promote discovery, interoperability, and re-use of the data. To integrate specialized data, which are often obscure, quality control protocols are being developed to homogenize the QC and make these data easier to integrate. Data Assembly Centers have been established to integrate some specialized streams including gliders, animal telemetry, and HF radar. Subsets of data that are ingested into the National Data Buoy Center are also routed to the Global Telecommunications System (GTS) of the World Meteorological Organization to assure wide international distribution. From the GTS, data are assimilated into now-cast and forecast models, fed to other observing systems, and used to support observation-based decision making such as forecasts, warnings, and alerts. For a few years apps were a popular way to deliver these real-time data streams to phones and tablets. Responsive and adaptive web sites are an emerging flexible strategy to provide access to the regional coastal ocean observations.
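
    Because the regional aggregations are exposed as CF-compliant netCDF behind services such as THREDDS/OPeNDAP and ERDDAP, a common re-use pattern is to open a remote dataset directly. The sketch below uses xarray; the dataset URL and variable name are hypothetical placeholders.

      # Sketch of remote access to a CF-compliant aggregation via OPeNDAP with xarray.
      # The endpoint URL and variable name are hypothetical placeholders.
      import xarray as xr

      url = "https://example.org/thredds/dodsC/regional/sst_aggregation"
      ds = xr.open_dataset(url)

      # CF metadata (standard_name, units, coordinates) travels with the variables,
      # so downstream code can subset without bespoke parsing.
      print(ds.attrs.get("title"))
      sst = ds["sea_surface_temperature"].sel(time="2015-06-01", method="nearest")
      print(float(sst.mean()))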

  3. Photon-HDF5: Open Data Format and Computational Tools for Timestamp-based Single-Molecule Experiments

    PubMed Central

    Ingargiola, Antonino; Laurence, Ted; Boutelle, Robert; Weiss, Shimon; Michalet, Xavier

    2017-01-01

    Archival of experimental data in public databases has increasingly become a requirement for most funding agencies and journals. These data-sharing policies have the potential to maximize data reuse, and to enable confirmatory as well as novel studies. However, the lack of standard data formats can severely hinder data reuse. In photon-counting-based single-molecule fluorescence experiments, data is stored in a variety of vendor-specific or even setup-specific (custom) file formats, making data interchange prohibitively laborious, unless the same hardware-software combination is used. Moreover, the number of available techniques and setup configurations makes it difficult to find a common standard. To address this problem, we developed Photon-HDF5 (www.photon-hdf5.org), an open data format for timestamp-based single-molecule fluorescence experiments. Building on the solid foundation of HDF5, Photon-HDF5 provides a platform- and language-independent, easy-to-use file format that is self-describing and supports rich metadata. Photon-HDF5 supports different types of measurements by separating raw data (e.g. photon-timestamps, detectors, etc.) from measurement metadata. This approach allows representing several measurement types and setup configurations within the same core structure and makes it possible to extend the format in a backward-compatible way. Complementing the format specifications, we provide open source software to create and convert Photon-HDF5 files, together with code examples in multiple languages showing how to read Photon-HDF5 files. Photon-HDF5 allows sharing data in a format suitable for long term archival, avoiding the effort to document custom binary formats and increasing interoperability with different analysis software. We encourage participation of the single-molecule community to extend interoperability and to help define future versions of Photon-HDF5. PMID:28649160
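
    Since Photon-HDF5 is plain HDF5, files can be read with generic tools such as h5py. The group and dataset paths in the sketch below follow the published layout as best recalled (photon_data/timestamps with its timestamps_specs/timestamps_unit), so they should be checked against the specification at www.photon-hdf5.org before use; the file name is a placeholder.

      # Sketch: read photon timestamps from a Photon-HDF5 file with generic HDF5 tools.
      # Dataset paths follow the published layout as best recalled; verify against
      # the specification at www.photon-hdf5.org.
      import h5py

      with h5py.File("measurement.hdf5", "r") as f:   # hypothetical file name
          timestamps = f["photon_data/timestamps"][:]                       # integer clock ticks
          unit = f["photon_data/timestamps_specs/timestamps_unit"][()]      # seconds per tick
          print("first photon at", timestamps[0] * unit, "s")
          print("total photons:", len(timestamps))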

  4. Photon-HDF5: Open Data Format and Computational Tools for Timestamp-based Single-Molecule Experiments.

    PubMed

    Ingargiola, Antonino; Laurence, Ted; Boutelle, Robert; Weiss, Shimon; Michalet, Xavier

    2016-02-13

    Archival of experimental data in public databases has increasingly become a requirement for most funding agencies and journals. These data-sharing policies have the potential to maximize data reuse, and to enable confirmatory as well as novel studies. However, the lack of standard data formats can severely hinder data reuse. In photon-counting-based single-molecule fluorescence experiments, data is stored in a variety of vendor-specific or even setup-specific (custom) file formats, making data interchange prohibitively laborious, unless the same hardware-software combination is used. Moreover, the number of available techniques and setup configurations makes it difficult to find a common standard. To address this problem, we developed Photon-HDF5 (www.photon-hdf5.org), an open data format for timestamp-based single-molecule fluorescence experiments. Building on the solid foundation of HDF5, Photon-HDF5 provides a platform- and language-independent, easy-to-use file format that is self-describing and supports rich metadata. Photon-HDF5 supports different types of measurements by separating raw data (e.g. photon-timestamps, detectors, etc.) from measurement metadata. This approach allows representing several measurement types and setup configurations within the same core structure and makes it possible to extend the format in a backward-compatible way. Complementing the format specifications, we provide open source software to create and convert Photon-HDF5 files, together with code examples in multiple languages showing how to read Photon-HDF5 files. Photon-HDF5 allows sharing data in a format suitable for long term archival, avoiding the effort to document custom binary formats and increasing interoperability with different analysis software. We encourage participation of the single-molecule community to extend interoperability and to help define future versions of Photon-HDF5.

  5. Photon-HDF5: open data format and computational tools for timestamp-based single-molecule experiments

    NASA Astrophysics Data System (ADS)

    Ingargiola, Antonino; Laurence, Ted; Boutelle, Robert; Weiss, Shimon; Michalet, Xavier

    2016-02-01

    Archival of experimental data in public databases has increasingly become a requirement for most funding agencies and journals. These data-sharing policies have the potential to maximize data reuse, and to enable confirmatory as well as novel studies. However, the lack of standard data formats can severely hinder data reuse. In photon-counting-based single-molecule fluorescence experiments, data is stored in a variety of vendor-specific or even setup-specific (custom) file formats, making data interchange prohibitively laborious, unless the same hardware-software combination is used. Moreover, the number of available techniques and setup configurations makes it difficult to find a common standard. To address this problem, we developed Photon-HDF5 (www.photon-hdf5.org), an open data format for timestamp-based single-molecule fluorescence experiments. Building on the solid foundation of HDF5, Photon-HDF5 provides a platform- and language-independent, easy-to-use file format that is self-describing and supports rich metadata. Photon-HDF5 supports different types of measurements by separating raw data (e.g. photon-timestamps, detectors, etc.) from measurement metadata. This approach allows representing several measurement types and setup configurations within the same core structure and makes it possible to extend the format in a backward-compatible way. Complementing the format specifications, we provide open source software to create and convert Photon-HDF5 files, together with code examples in multiple languages showing how to read Photon-HDF5 files. Photon-HDF5 allows sharing data in a format suitable for long term archival, avoiding the effort to document custom binary formats and increasing interoperability with different analysis software. We encourage participation of the single-molecule community to extend interoperability and to help define future versions of Photon-HDF5.

  6. LiPD and CSciBox: A Case Study in Why Data Standards are Important for Paleoscience

    NASA Astrophysics Data System (ADS)

    Weiss, I.; Bradley, E.; McKay, N.; Emile-Geay, J.; de Vesine, L. R.; Anderson, K. A.; White, J. W. C.; Marchitto, T. M., Jr.

    2016-12-01

    CSciBox [1] is an integrated software system that helps geoscientists build and evaluate age models. Its user chooses from a number of built-in analysis tools, composing them into an analysis workflow and applying it to paleoclimate proxy datasets. CSciBox employs modern database technology to store both the data and the analysis results in an easily accessible and searchable form, and offers the user access to the computational toolbox, the data, and the results via a graphical user interface and a sophisticated plotter. Standards are a staple of modern life, and underlie any form of automation. Without data standards, it is difficult, if not impossible, to construct effective computer tools for paleoscience analysis. The LiPD (Linked Paleo Data) framework [2] enables the storage of both data and metadata in systematic, meaningful, machine-readable ways. LiPD has been a primary enabler of CSciBox's goals of usability, interoperability, and reproducibility. Building LiPD capabilities into CSciBox's importer, for instance, eliminated the need to ask the user about file formats, variable names, relationships between columns in the input file, etc. Building LiPD capabilities into the exporter facilitated the storage of complete details about the input data (provenance, preprocessing steps, etc.) as well as full descriptions of any analyses that were performed using the CSciBox tool, along with citations to appropriate references. This comprehensive collection of data and metadata, which is all linked together in a semantically meaningful, machine-readable way, not only completely documents the analyses and makes them reproducible; it also enables interoperability with any other software system that employs the LiPD standard. [1] www.cs.colorado.edu/ lizb/cscience.html [2] McKay & Emile-Geay, Climate of the Past 12:1093 (2016)
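
    A LiPD file is essentially a zip archive bundling JSON-LD metadata with the measurement tables, so its machine-readable metadata can be inspected with the Python standard library alone. The member-name pattern and file name below are assumptions (archives typically name the JSON file after the dataset), and the dedicated LiPD utilities are the better route for real analyses.

      # Sketch: peek at the JSON-LD metadata inside a LiPD (.lpd) archive using only
      # the standard library. Member names vary between archives, so this just takes
      # the first JSON member it finds.
      import json
      import zipfile

      with zipfile.ZipFile("example_record.lpd") as archive:   # hypothetical file name
          json_members = [m for m in archive.namelist() if m.endswith((".json", ".jsonld"))]
          with archive.open(json_members[0]) as handle:
              metadata = json.load(handle)

      # Top-level keys typically describe the dataset, archive type, and linked tables.
      print(sorted(metadata.keys()))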

  7. Grid computing enhances standards-compatible geospatial catalogue service

    NASA Astrophysics Data System (ADS)

    Chen, Aijun; Di, Liping; Bai, Yuqi; Wei, Yaxing; Liu, Yang

    2010-04-01

    A catalogue service facilitates sharing, discovery, retrieval, management of, and access to large volumes of distributed geospatial resources, for example data, services, applications, and their replicas on the Internet. Grid computing provides an infrastructure for effective use of computing, storage, and other resources available online. The Open Geospatial Consortium has proposed a catalogue service specification and a series of profiles for promoting the interoperability of geospatial resources. By referring to the profile of the Catalogue Service for the Web, an innovative information model of a catalogue service is proposed to offer Grid-enabled registry, management, retrieval of, and access to geospatial resources and their replicas. This information model extends the e-business registry information model by adopting several geospatial data and service metadata standards: the International Organization for Standardization (ISO) 19115/19119 standards and the US Federal Geographic Data Committee (FGDC) and US National Aeronautics and Space Administration (NASA) metadata standards for describing and indexing geospatial resources. In order to select the optimal geospatial resources and their replicas managed by the Grid, the Grid data management service and information service from the Globus Toolkit are closely integrated with the extended catalogue information model. Based on this new model, a catalogue service is implemented first as a Web service. Then, the catalogue service is further developed as a Grid service conforming to Grid service specifications. The catalogue service can be deployed in both the Web and Grid environments and accessed by standard Web services or authorized Grid services, respectively. The catalogue service has been implemented at the George Mason University/Center for Spatial Information Science and Systems (GMU/CSISS), managing more than 17 TB of geospatial data and geospatial Grid services. This service makes it easy to share and interoperate geospatial resources by using Grid technology and extends Grid technology into the geoscience communities.
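
    As an illustration of the kind of standards-based catalogue access described above, the sketch below queries an OGC CSW 2.0.2 endpoint with the OWSLib Python package. The endpoint URL and search term are placeholders, not the GMU/CSISS service.

        # Minimal sketch: full-text search against an OGC CSW 2.0.2 catalogue with OWSLib.
        from owslib.csw import CatalogueServiceWeb
        from owslib.fes import PropertyIsLike

        csw = CatalogueServiceWeb("https://example.org/csw")        # hypothetical endpoint
        query = PropertyIsLike("csw:AnyText", "%geospatial%")       # full-text constraint
        csw.getrecords2(constraints=[query], maxrecords=10)

        for identifier, record in csw.records.items():
            print(identifier, record.title)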

  8. Development of an oil spill information system combining remote sensing data and surveillance metadata

    NASA Astrophysics Data System (ADS)

    Tufte, Lars; Trieschmann, Olaf; Carreau, Philippe; Hunsaenger, Thomas; Clayton, Peter J. S.; Barjenbruch, Ulrich

    2004-02-01

    The detection of accidental or illegal marine oil discharges in the German territorial waters of the North Sea and Baltic Sea is of great importance for combating oil spills and protecting the marine ecosystem. Therefore the German Federal Ministry of Transport set up an airborne surveillance system consisting of two Dornier DO 228-212 aircraft equipped with a Side-Looking Airborne Radar (SLAR), an IR/UV sensor, a Microwave Radiometer (MWR) for quantification and a Laser Fluorosensor (LFS) for classification of the oil spills. The flight parameters and the remote sensing data are stored in a database during the flight. A Pollution Observation Log is completed by the operator, consisting of information about the detected oil spill (e.g. position, length, width) and other information about the flight (e.g. name of navigator, name of observer). The objective was to develop an oil spill information system which integrates the described data and metadata and includes visualization and spatial analysis capabilities. The metadata are essential for further statistical analysis in the spatial and temporal domains of oil spill occurrences and of the surveillance itself. The system should facilitate the communication and distribution of metadata between the administrative bodies and partners of the German oil spill surveillance system. A connection between a GIS and the database makes it possible to use the powerful visualization and spatial analysis functionality of the GIS in conjunction with the oil spill database.

  9. Integrating Data and Networks: Human Factors

    NASA Astrophysics Data System (ADS)

    Chen, R. S.

    2012-12-01

    The development of technical linkages and interoperability between scientific networks is a necessary but not sufficient step towards integrated use and application of networked data and information for scientific and societal benefit. A range of "human factors" must also be addressed to ensure the long-term integration, sustainability, and utility of both the interoperable networks themselves and the scientific data and information to which they provide access. These human factors encompass the behavior of both individual humans and human institutions, and include system governance, a common framework for intellectual property rights and data sharing, consensus on terminology, metadata, and quality control processes, agreement on key system metrics and milestones, the compatibility of "business models" in the short and long term, harmonization of incentives for cooperation, and minimization of disincentives. Experience with several national and international initiatives and research programs such as the International Polar Year, the Group on Earth Observations, the NASA Earth Observing Data and Information System, the U.S. National Spatial Data Infrastructure, the Global Earthquake Model, and the United Nations Spatial Data Infrastructure provides a range of lessons regarding these human factors. Ongoing changes in science, technology, institutions, relationships, and even culture are creating both opportunities and challenges for expanded interoperability of scientific networks and significant improvement in data integration to advance science and the use of scientific data and information to achieve benefits for society as a whole.

  10. The PSML format and library for norm-conserving pseudopotential data curation and interoperability

    NASA Astrophysics Data System (ADS)

    García, Alberto; Verstraete, Matthieu J.; Pouillon, Yann; Junquera, Javier

    2018-06-01

    Norm-conserving pseudopotentials are used by a significant number of electronic-structure packages, but the practical differences among codes in the handling of the associated data hinder their interoperability and make it difficult to compare their results. At the same time, existing formats lack provenance data, which makes it difficult to track and document computational workflows. To address these problems, we first propose a file format (PSML) that maps the basic concepts of the norm-conserving pseudopotential domain in a flexible form and supports the inclusion of provenance information and other important metadata. Second, we provide a software library (libPSML) that can be used by electronic structure codes to transparently extract the information in the file and adapt it to their own data structures, or to create converters for other formats. Support for the new file format has already been implemented in several pseudopotential generator programs (including ATOM and ONCVPSP), and the library has been linked with SIESTA and ABINIT, allowing them to work with the same pseudopotential operator (with the same local part and fully non-local projectors), thus easing the comparison of their results for structural and electronic properties, as shown for several example systems. This methodology can be easily transferred to any other package that uses norm-conserving pseudopotentials, and offers a proof-of-concept for a general approach to interoperability.
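
    Because PSML is an XML-based format, a generic inspection of a file is straightforward; the sketch below uses Python's ElementTree. The file name and the search for a provenance-like element are illustrative assumptions; the PSML specification and libPSML define the authoritative schema and access API.

        # Minimal sketch: inspecting a PSML pseudopotential file as generic XML.
        import xml.etree.ElementTree as ET

        root = ET.parse("pseudo.psml").getroot()
        print("root element:", root.tag, root.attrib)

        # List top-level sections and look for provenance-like metadata (element name is assumed).
        for child in root:
            print(" section:", child.tag)
            if "provenance" in child.tag.lower():
                print("  provenance attributes:", child.attrib)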

  11. Interoperability in planetary research for geospatial data analysis

    NASA Astrophysics Data System (ADS)

    Hare, Trent M.; Rossi, Angelo P.; Frigeri, Alessandro; Marmo, Chiara

    2018-01-01

    For more than a decade there has been a push in the planetary science community to support interoperable methods for accessing and working with geospatial data. Common geospatial data products for planetary research include image mosaics, digital elevation or terrain models, geologic maps, geographic location databases (e.g., craters, volcanoes) or any data that can be tied to the surface of a planetary body (including moons, comets or asteroids). Several U.S. and international cartographic research institutions have converged on mapping standards that embrace standardized geospatial image formats, geologic mapping conventions, U.S. Federal Geographic Data Committee (FGDC) cartographic and metadata standards, and notably on-line mapping services as defined by the Open Geospatial Consortium (OGC). The latter includes defined standards such as the OGC Web Mapping Services (simple image maps), Web Map Tile Services (cached image tiles), Web Feature Services (feature streaming), Web Coverage Services (rich scientific data streaming), and Catalog Services for the Web (data searching and discoverability). While these standards were developed for application to Earth-based data, they can be just as valuable for the planetary domain. Another initiative, called VESPA (Virtual European Solar and Planetary Access), will marry several of the above geoscience standards with astronomy-based standards as defined by the International Virtual Observatory Alliance (IVOA). This work outlines the current state of interoperability initiatives in use or in the process of being researched within the planetary geospatial community.
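
    To make the OGC service types above concrete, the sketch below requests a map image from a Web Map Service using OWSLib. The service URL is a placeholder rather than a real planetary endpoint, and the first advertised layer is used simply for illustration.

        # Minimal sketch: OGC WMS GetMap request with OWSLib (placeholder endpoint).
        from owslib.wms import WebMapService

        wms = WebMapService("https://example.org/planetary/wms", version="1.1.1")
        layer = list(wms.contents)[0]                    # first advertised layer

        img = wms.getmap(layers=[layer],
                         srs="EPSG:4326",                # simple lat/lon grid
                         bbox=(-180, -90, 180, 90),
                         size=(1024, 512),
                         format="image/png")

        with open("mosaic.png", "wb") as out:
            out.write(img.read())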

  12. Fundamental Data Standards for Science Data System Interoperability and Data Correlation

    NASA Astrophysics Data System (ADS)

    Hughes, J. Steven; Gopala Krishna, Barla; Rye, Elizabeth; Crichton, Daniel

    The advent of the Web and languages such as XML has brought an explosion of online science data repositories and the promise of correlated data and interoperable systems. However, there have been relatively few successes in meeting the expectations of science users in the internet age. For example, a Google-like search for images of Mars will return many highly-derived and appropriately tagged images but largely ignore the majority of images in most online image repositories. Once retrieved, users are further frustrated by poor data descriptions, arcane formats, and badly organized ancillary information. A wealth of research indicates that shared information models are needed to enable system interoperability and data correlation. However, at a more fundamental level, data correlation and system interoperability are dependent on a relatively few shared data standards. A common data dictionary standard, for example, allows the controlled vocabulary used in a science repository to be shared with potential collaborators. Common data registry and product identification standards enable systems to efficiently find, locate, and retrieve data products and their metadata from remote repositories. Information content standards define categories of descriptive data that help make the data products scientifically useful to users who were not part of the original team that produced the data. The Planetary Data System (PDS) has a plan to move the PDS to a fully online, federated system. This plan addresses new demands on the system including increasing data volume, numbers of missions, and complexity of missions. A key component of this plan is the upgrade of the PDS Data Standards. The adoption of the core PDS data standards by the International Planetary Data Alliance (IPDA) adds the element of international cooperation to the plan. This presentation will provide an overview of the fundamental data standards being adopted by the PDS that transcend science domains and that will help to meet the PDS's and IPDA's system interoperability and data correlation requirements.

  13. A data delivery system for IMOS, the Australian Integrated Marine Observing System

    NASA Astrophysics Data System (ADS)

    Proctor, R.; Roberts, K.; Ward, B. J.

    2010-09-01

    The Integrated Marine Observing System (IMOS, www.imos.org.au), an AUD 150 m 7-year project (2007-2013), is a distributed set of equipment and data-information services which, among many applications, collectively contribute to meeting the needs of marine climate research in Australia. The observing system provides data in the open oceans around Australia out to a few thousand kilometres as well as the coastal oceans through 11 facilities which effectively observe and measure the 4-dimensional ocean variability, and the physical and biological response of coastal and shelf seas around Australia. Through a national science rationale IMOS is organized as five regional nodes (Western Australia - WAIMOS, South Australia - SAIMOS, Tasmania - TASIMOS, New South Wales - NSWIMOS and Queensland - QIMOS) surrounded by an oceanic node (Blue Water and Climate). Operationally IMOS is organized as 11 facilities (Argo Australia, Ships of Opportunity, Southern Ocean Automated Time Series Observations, Australian National Facility for Ocean Gliders, Autonomous Underwater Vehicle Facility, Australian National Mooring Network, Australian Coastal Ocean Radar Network, Australian Acoustic Tagging and Monitoring System, Facility for Automated Intelligent Monitoring of Marine Systems, eMarine Information Infrastructure and Satellite Remote Sensing) delivering data. IMOS data is freely available to the public. The data, a combination of near real-time and delayed mode, are made available to researchers through the electronic Marine Information Infrastructure (eMII). eMII utilises the Australian Academic Research Network (AARNET) to support a distributed database on OPeNDAP/THREDDS servers hosted by regional computing centres. IMOS instruments are described using the OGC SensorML specification and wherever possible data is provided in CF-compliant netCDF format. Metadata, conforming to the ISO 19115 standard, is automatically harvested from the netCDF files and the metadata records catalogued in the OGC GeoNetwork Metadata Entry and Search Tool (MEST). Data discovery, access and download occur via web services through the IMOS Ocean Portal (http://imos.aodn.org.au), and tools for the display and integration of near real-time data are in development.
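
    Because IMOS serves CF-compliant netCDF through OPeNDAP/THREDDS, such data can be opened remotely with standard Python tooling; a minimal sketch follows. The URL and variable name are placeholders, not an actual IMOS endpoint.

        # Minimal sketch: opening a CF-compliant dataset over OPeNDAP with xarray.
        import xarray as xr

        url = "https://example.org/thredds/dodsC/some_dataset.nc"    # hypothetical OPeNDAP URL
        ds = xr.open_dataset(url)

        print(ds.attrs.get("title"), ds.attrs.get("Conventions"))    # CF/ACDD-style global attributes
        sst = ds["sea_surface_temperature"].isel(time=0)             # assumed variable name
        print(float(sst.mean()))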

  14. ODISEES: A New Paradigm in Data Access

    NASA Astrophysics Data System (ADS)

    Huffer, E.; Little, M. M.; Kusterer, J.

    2013-12-01

    As part of its ongoing efforts to improve access to data, the Atmospheric Science Data Center has developed a high-precision Earth Science domain ontology (the 'ES Ontology') implemented in a graph database ('the Semantic Metadata Repository') that is used to store detailed, semantically-enhanced, parameter-level metadata for ASDC data products. The ES Ontology provides the semantic infrastructure needed to drive the ASDC's Ontology-Driven Interactive Search Environment for Earth Science ('ODISEES'), a data discovery and access tool, and will support additional data services such as analytics and visualization. The ES ontology is designed on the premise that naming conventions alone are not adequate to provide the information needed by prospective data consumers to assess the suitability of a given dataset for their research requirements; nor are current metadata conventions adequate to support seamless machine-to-machine interactions between file servers and end-user applications. Data consumers need information not only about what two data elements have in common, but also about how they are different. End-user applications need consistent, detailed metadata to support real-time data interoperability. The ES ontology is a highly precise, bottom-up, queriable model of the Earth Science domain that focuses on critical details about the measurable phenomena, instrument techniques, data processing methods, and data file structures. Earth Science parameters are described in detail in the ES Ontology and mapped to the corresponding variables that occur in ASDC datasets. Variables are in turn mapped to well-annotated representations of the datasets that they occur in, the instrument(s) used to create them, the instrument platforms, the processing methods, etc., creating a linked-data structure that allows both human and machine users to access a wealth of information critical to understanding and manipulating the data. The mappings are recorded in the Semantic Metadata Repository as RDF-triples. An off-the-shelf Ontology Development Environment and a custom Metadata Conversion Tool comprise a human-machine/machine-machine hybrid tool that partially automates the creation of metadata as RDF-triples by interfacing with existing metadata repositories and providing a user interface that solicits input from a human user, when needed. RDF-triples are pushed to the Ontology Development Environment, where a reasoning engine executes a series of inference rules whose antecedent conditions can be satisfied by the initial set of RDF-triples, thereby generating the additional detailed metadata that is missing in existing repositories. A SPARQL Endpoint, a web-based query service and a Graphical User Interface allow prospective data consumers - even those with no familiarity with NASA data products - to search the metadata repository to find and order data products that meet their exact specifications. A web-based API will provide an interface for machine-to-machine transactions.
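
    Since ODISEES stores its parameter-level metadata as RDF triples behind a SPARQL endpoint, a query along the following lines would retrieve dataset descriptions. The endpoint URL and the use of Dublin Core terms are illustrative assumptions, not the actual ODISEES vocabulary.

        # Minimal sketch: querying a SPARQL endpoint for dataset titles.
        from SPARQLWrapper import SPARQLWrapper, JSON

        sparql = SPARQLWrapper("https://example.org/sparql")   # hypothetical endpoint
        sparql.setQuery("""
            PREFIX dct: <http://purl.org/dc/terms/>
            SELECT ?dataset ?title WHERE {
                ?dataset dct:title ?title .
            } LIMIT 10
        """)
        sparql.setReturnFormat(JSON)

        for row in sparql.query().convert()["results"]["bindings"]:
            print(row["dataset"]["value"], row["title"]["value"])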

  15. Integrated platform and API for electrophysiological data

    PubMed Central

    Sobolev, Andrey; Stoewer, Adrian; Leonhardt, Aljoscha; Rautenberg, Philipp L.; Kellner, Christian J.; Garbers, Christian; Wachtler, Thomas

    2014-01-01

    Recent advancements in technology and methodology have led to growing amounts of increasingly complex neuroscience data recorded from various species, modalities, and levels of study. The rapid data growth has made efficient data access and flexible, machine-readable data annotation a crucial requisite for neuroscientists. Clear and consistent annotation and organization of data is not only an important ingredient for reproducibility of results and re-use of data, but also essential for collaborative research and data sharing. In particular, efficient data management and interoperability requires a unified approach that integrates data and metadata and provides a common way of accessing this information. In this paper we describe GNData, a data management platform for neurophysiological data. GNData provides a storage system based on a data representation that is suitable to organize data and metadata from any electrophysiological experiment, with a functionality exposed via a common application programming interface (API). Data representation and API structure are compatible with existing approaches for data and metadata representation in neurophysiology. The API implementation is based on the Representational State Transfer (REST) pattern, which enables data access integration in software applications and facilitates the development of tools that communicate with the service. Client libraries that interact with the API provide direct data access from computing environments like Matlab or Python, enabling integration of data management into the scientist's experimental or analysis routines. PMID:24795616
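
    A REST API of the kind described lends itself to thin clients in any language; the Python sketch below shows the general pattern. The base URL, resource path, and JSON field names are hypothetical and are not taken from the GNData documentation.

        # Minimal sketch: client for a REST-style electrophysiology metadata API.
        import requests

        BASE = "https://example.org/gndata/api"        # hypothetical deployment

        def list_blocks(session, limit=10):
            """Fetch metadata for recording blocks via the REST interface."""
            resp = session.get(f"{BASE}/blocks", params={"limit": limit}, timeout=30)
            resp.raise_for_status()
            return resp.json()

        with requests.Session() as s:
            s.auth = ("user", "password")              # placeholder credentials
            for block in list_blocks(s):
                print(block.get("id"), block.get("name"))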

  16. Integrated platform and API for electrophysiological data.

    PubMed

    Sobolev, Andrey; Stoewer, Adrian; Leonhardt, Aljoscha; Rautenberg, Philipp L; Kellner, Christian J; Garbers, Christian; Wachtler, Thomas

    2014-01-01

    Recent advancements in technology and methodology have led to growing amounts of increasingly complex neuroscience data recorded from various species, modalities, and levels of study. The rapid data growth has made efficient data access and flexible, machine-readable data annotation a crucial requisite for neuroscientists. Clear and consistent annotation and organization of data is not only an important ingredient for reproducibility of results and re-use of data, but also essential for collaborative research and data sharing. In particular, efficient data management and interoperability requires a unified approach that integrates data and metadata and provides a common way of accessing this information. In this paper we describe GNData, a data management platform for neurophysiological data. GNData provides a storage system based on a data representation that is suitable to organize data and metadata from any electrophysiological experiment, with a functionality exposed via a common application programming interface (API). Data representation and API structure are compatible with existing approaches for data and metadata representation in neurophysiology. The API implementation is based on the Representational State Transfer (REST) pattern, which enables data access integration in software applications and facilitates the development of tools that communicate with the service. Client libraries that interact with the API provide direct data access from computing environments like Matlab or Python, enabling integration of data management into the scientist's experimental or analysis routines.

  17. A Grid Metadata Service for Earth and Environmental Sciences

    NASA Astrophysics Data System (ADS)

    Fiore, Sandro; Negro, Alessandro; Aloisio, Giovanni

    2010-05-01

    Critical challenges for climate modeling researchers are strongly connected with the increasingly complex simulation models and the huge quantities of produced datasets. Future trends in climate modeling will only increase computational and storage requirements. For this reason the ability to transparently access both computational and data resources for large-scale complex climate simulations must be considered a key requirement for Earth Science and Environmental distributed systems. From the data management perspective, (i) the quantity of data will continuously increase, (ii) data will become more and more distributed and widespread, (iii) data sharing/federation will represent a key challenging issue among different sites distributed worldwide, and (iv) the potential community of users (large and heterogeneous) will be interested in discovering experimental results, searching metadata, browsing collections of files, comparing different results, displaying output, etc. A key element for carrying out data search and discovery, and for managing and accessing huge and distributed amounts of data, is the metadata handling framework. What we propose for the management of distributed datasets is the GRelC service (a data grid solution focusing on metadata management). Unlike classical approaches, the proposed data-grid solution is able to address scalability, transparency, security, efficiency and interoperability. The GRelC service we propose is able to provide access to metadata stored in different and widespread data sources (relational databases running on top of MySQL, Oracle, DB2, etc., leveraging SQL as query language, as well as XML databases - XIndice, eXist, and libxml2-based documents, adopting either XPath or XQuery), providing a strong data virtualization layer in a grid environment. Such a technological solution for distributed metadata management (i) leverages well-known, widely adopted standards (W3C, OASIS, etc.); (ii) supports role-based management (based on VOMS), which increases flexibility and scalability; (iii) provides full support for the Grid Security Infrastructure (authorization, mutual authentication, data integrity, data confidentiality and delegation); (iv) is compatible with existing grid middleware such as gLite and Globus; and finally (v) is currently adopted at the Euro-Mediterranean Centre for Climate Change (CMCC - Italy) to manage the entire CMCC data production activity as well as in the international Climate-G testbed.

  18. Finding Atmospheric Composition (AC) Metadata

    NASA Technical Reports Server (NTRS)

    Strub, Richard F.; Falke, Stefan; Fialkowski, Ed; Kempler, Steve; Lynnes, Chris; Goussev, Oleg

    2015-01-01

    The Atmospheric Composition Portal (ACP) is an aggregator and curator of information related to remotely sensed atmospheric composition data and analysis. It uses existing tools and technologies and, where needed, enhances those capabilities to provide interoperable access, tools, and contextual guidance for scientists and value-adding organizations using remotely sensed atmospheric composition data. The initial focus is on Essential Climate Variables identified by the Global Climate Observing System: CH4, CO, CO2, NO2, O3, SO2 and aerosols. This poster addresses our efforts in building the ACP Data Table, an interface to help discover and understand remotely sensed data that are related to atmospheric composition science and applications. We harvested the GCMD, CWIC and GEOSS metadata catalogs using machine-to-machine technologies (OpenSearch, Web Services). We also manually investigated the plethora of CEOS data provider portals and other catalogs where that data might be aggregated. This poster describes our experience of the excellence, variety, and challenges we encountered. Conclusions: (1) The significant benefits that the major catalogs provide are their machine-to-machine tools like OpenSearch and Web Services rather than any GUI usability improvements, due to the large amount of data in their catalogs. (2) There is a trend at the large catalogs towards simulating small data provider portals through advanced services. (3) Populating metadata catalogs using ISO 19115 is too complex for users to do in a consistent way, difficult to parse visually or with XML libraries, and too complex for Java XML binders like CASTOR. (4) Searching for IDs first and then for data (GCMD and ECHO) works better for machine-to-machine operations than returning the entire metadata entry at once, which led to timeouts. (5) Metadata harvest and export activities between the major catalogs have led to a significant amount of duplication (this is currently being addressed). (6) Most (if not all) Earth science atmospheric composition data providers store a reference to their data at GCMD.
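
    The machine-to-machine harvesting mentioned above typically means issuing an OpenSearch query and parsing the Atom response; a minimal sketch is shown below. The endpoint URL and query parameters are placeholders, not a GCMD or CWIC address.

        # Minimal sketch: OpenSearch query plus Atom parsing with the standard library.
        import requests
        import xml.etree.ElementTree as ET

        ATOM = "{http://www.w3.org/2005/Atom}"
        resp = requests.get("https://example.org/opensearch",            # hypothetical endpoint
                            params={"q": "atmospheric composition", "count": 10},
                            timeout=30)
        resp.raise_for_status()

        for entry in ET.fromstring(resp.content).iter(ATOM + "entry"):
            print(entry.findtext(ATOM + "title"), entry.findtext(ATOM + "updated"))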

  19. Progress Report on the Airborne Metadata and Time Series Working Groups of the 2016 ESDSWG

    NASA Astrophysics Data System (ADS)

    Evans, K. D.; Northup, E. A.; Chen, G.; Conover, H.; Ames, D. P.; Teng, W. L.; Olding, S. W.; Krotkov, N. A.

    2016-12-01

    NASA's Earth Science Data Systems Working Groups (ESDSWG) were created over 10 years ago. The role of the ESDSWG is to make recommendations relevant to NASA's Earth science data systems from users' experiences. Each group works independently, focusing on a unique topic. Participation in ESDSWG groups comes from a variety of NASA-funded science and technology projects, including MEaSUREs and ROSS. Participants include NASA information technology experts, affiliated contractor staff and other interested community members from academia and industry. Recommendations from the ESDSWG groups will enhance NASA's efforts to develop long-term data products. The Airborne Metadata Working Group is evaluating the suitability of the current Common Metadata Repository (CMR) and Unified Metadata Model (UMM) for airborne data sets and developing new recommendations as necessary. The overarching goal is to enhance the usability, interoperability, discovery and distribution of airborne observational data sets. This will be done by assessing the suitability (gaps) of the current UMM model for airborne data using lessons learned from current and past field campaigns, listening to user needs and community recommendations, and assessing the suitability of ISO metadata and other standards to fill the gaps. The Time Series Working Group (TSWG) is a continuation of the 2015 Time Series/WaterML2 Working Group. The TSWG is using a case study-driven approach to test the new Open Geospatial Consortium (OGC) TimeseriesML standard to determine any deficiencies with respect to its ability to fully describe and encode NASA Earth observation-derived time series data. To do this, the Time Series Working Group is engaging with the OGC TimeseriesML Standards Working Group (SWG) regarding unsatisfied needs and possible solutions. The effort will end with the drafting of an OGC Engineering Report based on the use cases and interactions with the OGC TimeseriesML SWG. Progress towards finalizing recommendations will be presented at the meeting.

  20. Automated Atmospheric Composition Dataset Level Metadata Discovery. Difficulties and Surprises

    NASA Astrophysics Data System (ADS)

    Strub, R. F.; Falke, S. R.; Kempler, S.; Fialkowski, E.; Goussev, O.; Lynnes, C.

    2015-12-01

    The Atmospheric Composition Portal (ACP) is an aggregator and curator of information related to remotely sensed atmospheric composition data and analysis. It uses existing tools and technologies and, where needed, enhances those capabilities to provide interoperable access, tools, and contextual guidance for scientists and value-adding organizations using remotely sensed atmospheric composition data. The initial focus is on Essential Climate Variables identified by the Global Climate Observing System: CH4, CO, CO2, NO2, O3, SO2 and aerosols. This poster addresses our efforts in building the ACP Data Table, an interface to help discover and understand remotely sensed data that are related to atmospheric composition science and applications. We harvested the GCMD, CWIC and GEOSS metadata catalogs using machine-to-machine technologies (OpenSearch, Web Services). We also manually investigated the plethora of CEOS data provider portals and other catalogs where that data might be aggregated. This poster describes our experience of the excellence, variety, and challenges we encountered. Conclusions: (1) The significant benefits that the major catalogs provide are their machine-to-machine tools like OpenSearch and Web Services rather than any GUI usability improvements, due to the large amount of data in their catalogs. (2) There is a trend at the large catalogs towards simulating small data provider portals through advanced services. (3) Populating metadata catalogs using ISO 19115 is too complex for users to do in a consistent way, difficult to parse visually or with XML libraries, and too complex for Java XML binders like CASTOR. (4) Searching for IDs first and then for data (GCMD and ECHO) works better for machine-to-machine operations than returning the entire metadata entry at once, which led to timeouts. (5) Metadata harvest and export activities between the major catalogs have led to a significant amount of duplication (this is currently being addressed). (6) Most (if not all) Earth science atmospheric composition data providers store a reference to their data at GCMD.

  1. An ODIP effort to map R2R ocean data terms to international vocabularies

    NASA Astrophysics Data System (ADS)

    Ferreira, Renata; Stocks, Karen; Arko, Robert

    2014-05-01

    The heterogeneity of terminology used in describing data creates a barrier to the efficient discovery and re-use of data, particularly across institutional, programmatic, and disciplinary boundaries. Here we explore the outcomes of a student project to crosswalk terms between the Rolling Deck to Repository (R2R) program and other international systems, as part of the Ocean Data Interoperability Platform (ODIP). R2R is a US program developing and implementing an information management system to preserve and provide access to routine underway data collected by U.S. academic research vessels. R2R participates in ODIP, an international forum for improving the interoperability and effective sharing of marine data resources through technical workshops and joint prototypes. The vocabulary mapping effort lays a foundation for future ocean data portals through which users search and access international ocean data using familiar terms. R2R describes its data with a suite of controlled vocabularies (http://www.rvdata.us/voc), some of which were developed locally or are specific to the US. The goal of this student project is to crosswalk local/national vocabularies to authoritative international vocabularies, where they exist, or to vocabularies widely used by ODIP partners. Specifically, R2R developed the following crosswalks: R2R science party names to ORCID person identifiers, UNOLS ports to the SeaDataNet Ports Gazetteer, R2R Device Models to the NVS SeaVoX Device Catalog, and R2R Organizations to the European Directory of Marine Organizations (EDMO). Mappings were done in simple spreadsheets using synonymy relationships only, and will be published as part of the R2R Linked Data resources. The level of success in crosswalking was variable. The majority of ports were successfully mapped. Differences in the character sets (i.e. whether diacritic marks were used) caused automated matching to fail occasionally, but the number of ports was small enough that these could be manually reviewed. Both organizations and device models have initial mappings, and R2R will propose new terms to the EDMO and SeaVoX Device vocabularies to complete coverage. Mapping to ORCID identifiers was abandoned (though R2R will still hold and expose them when supplied by the data provider). Most ORCID entries do not contain sufficient metadata to confirm potential mappings: the match of a family and given name was considered inconclusive without further support. ORCID also does not assign identifiers posthumously, which is occasionally necessary for historical data in R2R.
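
    The diacritic problem noted above is easy to mitigate when crosswalking term lists; the sketch below folds names to an ASCII, case-insensitive form before matching. The sample names and mapping structure are illustrative, not R2R or SeaDataNet content.

        # Minimal sketch: diacritic-insensitive matching of local terms against a gazetteer.
        import unicodedata

        def fold(name):
            """Lower-case and strip diacritics so 'Maloy' and a diacritic variant compare equal."""
            stripped = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode()
            return stripped.casefold().strip()

        gazetteer = {fold(n): n for n in ["Måløy", "Cádiz", "Honolulu"]}   # authoritative terms (example)
        local_ports = ["Maloy", "Cadiz", "Pago Pago"]

        for port in local_ports:
            match = gazetteer.get(fold(port))
            print(port, "->", match if match else "UNMATCHED (manual review)")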

  2. Documenting Models for Interoperability and Reusability ...

    EPA Pesticide Factsheets

    Many modeling frameworks compartmentalize science via individual models that link sets of small components to create larger modeling workflows. Developing integrated watershed models increasingly requires coupling multidisciplinary, independent models, as well as collaboration between scientific communities, since component-based modeling can integrate models from different disciplines. Integrated Environmental Modeling (IEM) systems focus on transferring information between components by capturing a conceptual site model; establishing local metadata standards for input/output of models and databases; managing data flow between models and throughout the system; facilitating quality control of data exchanges (e.g., checking units, unit conversions, transfers between software languages); warning and error handling; and coordinating sensitivity/uncertainty analyses. Although many computational software systems facilitate communication between, and execution of, components, there are no common approaches, protocols, or standards for turn-key linkages between software systems and models, especially if modifying components is not the intent. Using a standard ontology, this paper reviews how models can be described for discovery, understanding, evaluation, access, and implementation to facilitate interoperability and reusability. In the proceedings of the International Environmental Modelling and Software Society (iEMSs), 8th International Congress on Environmental Modelling and Software.

  3. Auto-Generated Semantic Processing Services

    NASA Technical Reports Server (NTRS)

    Davis, Rodney; Hupf, Greg

    2009-01-01

    Auto-Generated Semantic Processing (AGSP) Services is a suite of software tools for automated generation of other computer programs, denoted cross-platform semantic adapters, that support interoperability of computer-based communication systems that utilize a variety of both new and legacy communication software running in a variety of operating-system/computer-hardware combinations. AGSP has numerous potential uses in military, space-exploration, and other government applications as well as in commercial telecommunications. The cross-platform semantic adapters take advantage of common features of computer-based communication systems to enforce semantics, messaging protocols, and standards of processing of streams of binary data to ensure integrity of data and consistency of meaning among interoperating systems. The auto-generation aspect of AGSP Services reduces development time and effort by emphasizing specification and minimizing implementation: in effect, the design, building, and debugging of software for effecting conversions among complex communication protocols, custom device mappings, and unique data-manipulation algorithms are replaced with metadata specifications that map to an abstract platform-independent communications model. AGSP Services is modular and has been shown to be easily integrable into new and legacy NASA flight and ground communication systems.

  4. Building a VO-compliant Radio Astronomical DAta Model for Single-dish radio telescopes (RADAMS)

    NASA Astrophysics Data System (ADS)

    Santander-Vela, Juan de Dios; García, Emilio; Leon, Stephane; Espigares, Victor; Ruiz, José Enrique; Verdes-Montenegro, Lourdes; Solano, Enrique

    2012-11-01

    The Virtual Observatory (VO) is becoming the de facto standard for astronomical data publication. However, the number of radio astronomical archives is still low in general, and even less radio astronomical data is available through the VO. In order to facilitate the building of new radio astronomical archives, while easing their interoperability with the VO framework, we have developed a VO-compliant data model which provides interoperable data semantics for radio data. That model, which we call the Radio Astronomical DAta Model for Single-dish (RADAMS), has been built using standards of (and recommendations from) the International Virtual Observatory Alliance (IVOA). This article describes the RADAMS and its components, including archived entities and their relationships to VO metadata. We show that by using IVOA principles and concepts, the effort needed for both the development of the archives and their VO compatibility has been lowered, and the joint development of two radio astronomical archives has been possible. We plan to adapt RADAMS to deal with interferometry data in the future.

  5. DbMap: improving database interoperability issues in medical software using a simple, Java-Xml based solution.

    PubMed Central

    Karadimas, H.; Hemery, F.; Roland, P.; Lepage, E.

    2000-01-01

    In medical software development, the use of databases plays a central role. However, most of the databases have heterogeneous encoding and data models. To deal with these variations in the application code directly is error-prone and reduces the potential reuse of the produced software. Several approaches to overcome these limitations have been proposed in the medical database literature, which will be presented. We present a simple solution, based on a Java library, and a central Metadata description file in XML. This development approach presents several benefits in software design and development cycles, the main one being the simplicity in maintenance. PMID:11079915

  6. A Python library for FAIRer access and deposition to the Metabolomics Workbench Data Repository.

    PubMed

    Smelter, Andrey; Moseley, Hunter N B

    2018-01-01

    The Metabolomics Workbench Data Repository is a public repository of mass spectrometry and nuclear magnetic resonance data and metadata derived from a wide variety of metabolomics studies. The data and metadata for each study are deposited, stored, and accessed via files in the domain-specific 'mwTab' flat file format. In order to improve the accessibility, reusability, and interoperability of the data and metadata stored in 'mwTab' formatted files, we implemented a Python library and package. This Python package, named 'mwtab', is a parser for the domain-specific 'mwTab' flat file format, which provides facilities for reading, accessing, and writing 'mwTab' formatted files. Furthermore, the package provides facilities to validate both the format and required metadata elements of a given 'mwTab' formatted file. In order to develop the 'mwtab' package we used the official 'mwTab' format specification. We used Git version control along with the Python unit-testing framework as well as a continuous integration service to run those tests on multiple versions of Python. Package documentation was developed using the Sphinx documentation generator. The 'mwtab' package provides both Python programmatic library interfaces and command-line interfaces for reading, writing, and validating 'mwTab' formatted files. Data and associated metadata are stored within Python dictionary- and list-based data structures, enabling straightforward, 'pythonic' access and manipulation of data and metadata. Also, the package provides facilities to convert 'mwTab' files into a JSON formatted equivalent, enabling easy reusability of the data by all modern programming languages that implement JSON parsers. The 'mwtab' package implements its metadata validation functionality based on a pre-defined JSON schema that can be easily specialized for specific types of metabolomics studies. The library also provides a command-line interface for interconversion between 'mwTab' and JSONized formats in raw text and a variety of compressed binary file formats. The 'mwtab' package is an easy-to-use Python package that provides FAIRer utilization of the Metabolomics Workbench Data Repository. The source code is freely available on GitHub and via the Python Package Index. Documentation includes a 'User Guide', 'Tutorial', and 'API Reference'. The GitHub repository also provides 'mwtab' package unit-tests via a continuous integration service.
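
    A short usage sketch of the package is given below. It assumes the read_files() generator interface and the study/analysis identifier attributes described in the package documentation; the input file name is a placeholder, and the dict()-based JSON dump simply relies on the parser's dictionary-like internal representation mentioned above.

        # Minimal sketch: reading a mwTab file and writing a JSONized copy (assumed API).
        import json
        import mwtab

        for mwfile in mwtab.read_files("ST000001.txt"):       # placeholder mwTab file
            print(mwfile.study_id, mwfile.analysis_id)        # assumed identifier attributes
            with open("ST000001.json", "w") as out:
                json.dump(dict(mwfile), out, indent=2)        # dictionary-based internal representation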

  7. Sustainable data and metadata management at the BD2K-LINCS Data Coordination and Integration Center

    PubMed Central

    Stathias, Vasileios; Koleti, Amar; Vidović, Dušica; Cooper, Daniel J.; Jagodnik, Kathleen M.; Terryn, Raymond; Forlin, Michele; Chung, Caty; Torre, Denis; Ayad, Nagi; Medvedovic, Mario; Ma'ayan, Avi; Pillai, Ajay; Schürer, Stephan C.

    2018-01-01

    The NIH-funded LINCS Consortium is creating an extensive reference library of cell-based perturbation response signatures and sophisticated informatics tools incorporating a large number of perturbagens, model systems, and assays. To date, more than 350 datasets have been generated including transcriptomics, proteomics, epigenomics, cell phenotype and competitive binding profiling assays. The large volume and variety of data necessitate rigorous data standards and effective data management including modular data processing pipelines and end-user interfaces to facilitate accurate and reliable data exchange, curation, validation, standardization, aggregation, integration, and end user access. Deep metadata annotations and the use of qualified data standards enable integration with many external resources. Here we describe the end-to-end data processing and management at the DCIC to generate a high-quality and persistent product. Our data management and stewardship solutions enable a functioning Consortium and make LINCS a valuable scientific resource that aligns with big data initiatives such as the BD2K NIH Program and concords with emerging data science best practices including the findable, accessible, interoperable, and reusable (FAIR) principles. PMID:29917015

  8. US Geoscience Information Network, Web Services for Geoscience Information Discovery and Access

    NASA Astrophysics Data System (ADS)

    Richard, S.; Allison, L.; Clark, R.; Coleman, C.; Chen, G.

    2012-04-01

    The US Geoscience Information Network has developed metadata profiles for interoperable catalog services based on ISO 19139 and the OGC CSW 2.0.2 specification. Currently, data services are being deployed for the US Department of Energy-funded National Geothermal Data System. These services utilize OGC Web Map Services, Web Feature Services, and THREDDS-served NetCDF for gridded datasets. Services and underlying datasets (along with a wide variety of other information and non-information resources) are registered in the catalog system. Metadata for registration is produced by various workflows, including harvest from OGC capabilities documents, Drupal-based web applications, and transformation from tabular compilations. Catalog search is implemented using the ESRI Geoportal open-source server. We are pursuing various client applications to demonstrate discovery and utilization of the data services. Currently operational applications include an ESRI ArcMap extension that allows catalog search and data acquisition from map services, and a catalog browse and search application built on OpenLayers and Django. We are developing use cases and requirements for other applications to utilize geothermal data services for resource exploration and evaluation.

  9. Legacy2Drupal - Conversion of an existing oceanographic relational database to a semantically enabled Drupal content management system

    NASA Astrophysics Data System (ADS)

    Maffei, A. R.; Chandler, C. L.; Work, T.; Allen, J.; Groman, R. C.; Fox, P. A.

    2009-12-01

    Content Management Systems (CMSs) provide powerful features that can be of use to oceanographic (and other geoscience) data managers. However, in many instances, geoscience data management offices have previously designed customized schemas for their metadata. The WHOI Ocean Informatics initiative and the NSF-funded Biological and Chemical Oceanography Data Management Office (BCO-DMO) have jointly sponsored a project to port an existing relational database containing oceanographic metadata, along with an existing interface coded in ColdFusion middleware, to a Drupal 6 Content Management System. The goal was to translate all the existing database tables, input forms, website reports, and other features present in the existing system to employ Drupal CMS features. The replacement features include Drupal content types, CCK node-reference fields, themes, RDB, SPARQL, workflow, and a number of other supporting modules. Strategic use of some Drupal 6 CMS features enables three separate but complementary interfaces that provide access to oceanographic research metadata via the MySQL database: 1) a Drupal 6-powered front-end; 2) a standard SQL port (used to provide a MapServer interface to the metadata and data); and 3) a SPARQL port (feeding a new faceted search capability being developed). Future plans include the creation of science ontologies, by scientist/technologist teams, that will drive semantically-enabled faceted search capabilities planned for the site. Incorporation of semantic technologies included in the future Drupal 7 core release is also anticipated. Using a public domain CMS as opposed to proprietary middleware, and taking advantage of the many features of Drupal 6 that are designed to support semantically-enabled interfaces, will help prepare the BCO-DMO database for interoperability with other ecosystem databases.

  10. Evolution of Web Services in EOSDIS: Search and Order Metadata Registry (ECHO)

    NASA Technical Reports Server (NTRS)

    Mitchell, Andrew; Ramapriyan, Hampapuram; Lowe, Dawn

    2009-01-01

    During 2005 through 2008, NASA defined and implemented a major evolutionary change in its Earth Observing System Data and Information System (EOSDIS) to modernize its capabilities. This implementation was based on a vision for 2015 developed during 2005. The EOSDIS 2015 Vision emphasizes increased end-to-end data system efficiency and operability; increased data usability; improved support for end users; and decreased operations costs. One key feature of the Evolution plan was achieving higher operational maturity (ingest, reconciliation, search and order, performance, error handling) for NASA's Earth Observing System Clearinghouse (ECHO). The ECHO system is an operational metadata registry through which the scientific community can easily discover and exchange NASA's Earth science data and services. ECHO contains metadata for 2,726 data collections comprising over 87 million individual data granules and 34 million browse images, consisting of the holdings of NASA's EOSDIS Data Centers and the United States Geological Survey's Landsat Project. ECHO is a middleware component based on a Service Oriented Architecture (SOA). The system is comprised of a set of infrastructure services that enable the fundamental SOA functions: publish, discover, and access Earth science resources. It also provides additional services such as user management, data access control, and order management. The ECHO system has a data registry and a services registry. The data registry enables organizations to publish EOS and other Earth-science related data holdings to a common metadata model. These holdings are described through metadata in terms of datasets (types of data) and granules (specific data items of those types). ECHO also supports browse images, which provide a visual representation of the data. The published metadata can be mapped to and from existing standards (e.g., FGDC, ISO 19115). With ECHO, users can find the metadata stored in the data registry and then access the data either directly online or through a brokered order to the data archive organization. ECHO stores metadata from a variety of science disciplines and domains, including Climate Variability and Change, Carbon Cycle and Ecosystems, Earth Surface and Interior, Atmospheric Composition, Weather, and Water and Energy Cycle. ECHO also has a services registry for community-developed search services and data services. ECHO provides a platform for the publication, discovery, understanding and access to NASA's Earth Observation resources (data, services and clients). In their native state, these data, service and client resources are not necessarily targeted for use beyond their original mission. However, with the proper interoperability mechanisms, users of these resources can expand their value by accessing, combining and applying them in unforeseen ways.

  11. EPOS Data and Service Provision

    NASA Astrophysics Data System (ADS)

    Bailo, Daniele; Jeffery, Keith G.; Atakan, Kuvvet; Harrison, Matt

    2017-04-01

    EPOS is now in its implementation phase (IP) after a successful preparatory phase (PP). EPOS consists of essentially two components: the ICS (Integrated Core Services), representing the integrating ICT (Information and Communication Technology), and many TCS (Thematic Core Services), representing the scientific domains. The architecture developed, demonstrated and agreed within the project during the PP is now being developed utilising co-design with the TCS teams and agile, spiral methods within the ICS team. The 'heart' of EPOS is the metadata catalog. This provides for the ICS a digital representation of the TCS assets (services, data, software, equipment, expertise…), thus facilitating access, interoperation and (re-)use. A major part of the work has been interactions with the TCS. The original intention to harvest information from the TCS required (and still requires) discussions to understand fully the TCS organisational structures linked with rights, security and privacy; their (meta)data syntax (structure) and semantics (meaning); their workflows and methods of working; and the services offered. To complicate matters further, the TCS are each at varying stages of development and the ICS design has to accommodate pre-existing, developing and expected future standards for metadata, data, software and processes. Through information documents, questionnaires and interviews/meetings the EPOS ICS team has collected DDSS (Data, Data Products, Software and Services) information from the TCS. The ICS team developed a simplified metadata model for presentation to the TCS, and the ICS team will perform the mapping and conversion from this model to the internal detailed technical metadata model using CERIF (an EU recommendation to Member States, maintained, developed and promoted by euroCRIS, www.eurocris.org). At the time of writing the final modifications of the EPOS metadata model are being made, and the mappings to CERIF designed, prior to the main phase of (meta)data collection into the EPOS metadata catalog. In parallel, work proceeds on the user interface software, the APIs (Application Programming Interfaces) to the TCS services, the harvesting method and software, the AAAI (Authentication, Authorisation, Accounting Infrastructure) and the system manager. The next steps will involve interfaces to ICS-D (Distributed ICS, i.e. facilities and services for computing, data storage, detectors and instruments for data collection etc.) to which requests, software and data will be deployed and from which data will be generated. Associated with this will be the development of the workflow system which will assist the end-user in building a workflow to achieve the scientific objectives.

  12. Best Practices for International Collaboration and Applications of Interoperability within a NASA Data Center

    NASA Astrophysics Data System (ADS)

    Moroni, D. F.; Armstrong, E. M.; Tauer, E.; Hausman, J.; Huang, T.; Thompson, C. K.; Chung, N.

    2013-12-01

    The Physical Oceanographic Distributed Active Archive Center (PO.DAAC) is one of 12 data centers sponsored by NASA's Earth Science Data and Information System (ESDIS) project. The PO.DAAC is tasked with archival and distribution of NASA Earth science missions specific to physical oceanography, many of which have interdisciplinary applications for weather forecasting/monitoring, ocean biology, ocean modeling, and climate studies. PO.DAAC has a 20-year history of cross-project and international collaborations with partners in Europe, Japan, Australia, and the UK. Domestically, the PO.DAAC has successfully established lasting partners with non-NASA institutions and projects including the National Oceanic and Atmospheric Administration (NOAA), United States Navy, Remote Sensing Systems, and Unidata. A key component of these partnerships is PO.DAAC's direct involvement with international working groups and science teams, such as the Group for High Resolution Sea Surface Temperature (GHRSST), International Ocean Vector Winds Science Team (IOVWST), Ocean Surface Topography Science Team (OSTST), and the Committee on Earth Observing Satellites (CEOS). To help bolster new and existing collaborations, the PO.DAAC has established a standardized approach to its internal Data Management and Archiving System (DMAS), utilizing a Data Dictionary to provide the baseline standard for entry and capture of dataset and granule metadata. Furthermore, the PO.DAAC has established an end-to-end Dataset Lifecycle Policy, built upon both internal and external recommendations of best practices toward data stewardship. Together, DMAS, the Data Dictionary, and the Dataset Lifecycle Policy provide the infrastructure to enable standardized data and metadata to be fully ingested and harvested to facilitate interoperability and compatibility across data access protocols, tools, and services. The Dataset Lifecycle Policy provides the checks and balances to help ensure all incoming HDF and netCDF-based datasets meet minimum compliance requirements with the Lawrence Livermore National Laboratory's actively maintained Climate and Forecast (CF) conventions with additional goals toward metadata standards provided by the Attribute Convention for Dataset Discovery (ACDD), the International Organization for Standardization (ISO) 19100-series, and the Federal Geographic Data Committee (FGDC). By default, DMAS ensures all datasets are compliant with NASA's Global Change Master Directory (GCMD) and NASA's Reverb data discovery clearinghouse (also known as ECHO). For data access, PO.DAAC offers several widely-used technologies, including File Transfer Protocol (FTP), Open-source Project for a Network Data Access Protocol (OPeNDAP), and Thematic Realtime Environmental Distributed Data Services (THREDDS). These access technologies are available directly to users or through PO.DAAC's web interfaces, specifically the High-level Tool for Interactive Data Extraction (HiTIDE), Live Access Server (LAS), and PO.DAAC's set of search, image, and Consolidated Web Services (CWS). Lastly, PO.DAAC's newly introduced, standards-based CWS provide singular endpoints for search, imaging, and extraction capabilities, respectively, across L2/L3/L4 datasets. Altogether, these tools, services and policies serve to provide flexible, interoperable functionality for both users and data providers.

  13. Provenance in Data Interoperability for Multi-Sensor Intercomparison

    NASA Technical Reports Server (NTRS)

    Lynnes, Chris; Leptoukh, Greg; Berrick, Steve; Shen, Suhung; Prados, Ana; Fox, Peter; Yang, Wenli; Min, Min; Holloway, Dan; Enloe, Yonsook

    2008-01-01

    As our inventory of Earth science data sets grows, the ability to compare, merge and fuse multiple datasets grows in importance. This requires a deeper data interoperability than we have now. Efforts such as the Open Geospatial Consortium and OPeNDAP (Open-source Project for a Network Data Access Protocol) have broken down format barriers to interoperability; the next challenge is the semantic aspects of the data. Consider the issues when satellite data are merged, cross-calibrated, validated, inter-compared and fused. We must match up data sets that are related, yet different in significant ways: the phenomenon being measured, measurement technique, location in space-time or quality of the measurements. If subtle distinctions between similar measurements are not clear to the user, results can be meaningless or lead to an incorrect interpretation of the data. Most of these distinctions trace to how the data came to be: sensors, processing and quality assessment. For example, monthly averages of satellite-based aerosol measurements often show significant discrepancies, which might be due to differences in spatio-temporal aggregation, sampling issues, sensor biases, algorithm differences or calibration issues. Provenance information must be captured in a semantic framework that allows data inter-use tools to incorporate it and aid in the interpretation of comparison or merged products. Semantic web technology allows us to encode our knowledge of measurement characteristics, phenomena measured, space-time representation, and data quality attributes in a well-structured, machine-readable ontology and rulesets. An analysis tool can use this knowledge to show users the provenance-related distinctions between two variables, advising on options for further data processing and analysis. An additional problem for workflows distributed across heterogeneous systems is retrieval and transport of provenance. Provenance may be either embedded within the data payload, or transmitted from server to client in an out-of-band mechanism. The out-of-band mechanism is more flexible in the richness of provenance information that can be accommodated, but it relies on a persistent framework and can be difficult for legacy clients to use. We are prototyping the embedded model, incorporating provenance within metadata objects in the data payload. Thus, it always remains with the data. The downside is a limit to the size of provenance metadata that we can include, an issue that will eventually need resolution to encompass the richness of provenance information required for data intercomparison and merging.
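
    A minimal sketch of the embedded-provenance approach is shown below: provenance is written directly into a file's global metadata so it always travels with the data. It uses netCDF as a convenient carrier; the file name, attribute names, and values are illustrative assumptions, not a NASA convention.

        # Minimal sketch: embedding provenance in a netCDF file's global attributes.
        from netCDF4 import Dataset

        with Dataset("merged_aerosol.nc", "a") as nc:          # existing file, opened for append
            nc.setncattr("provenance_source_datasets", "sensorA_L3_monthly, sensorB_L3_monthly")
            nc.setncattr("provenance_processing", "regridded to 1x1 deg; daytime-only sampling")
            nc.setncattr("provenance_software", "merge_tool v0.3 (hypothetical)")
            prior = nc.getncattr("history") if "history" in nc.ncattrs() else ""
            nc.setncattr("history", (prior + "; " if prior else "") + "merged for intercomparison")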

  14. The IAGOS Information System

    NASA Astrophysics Data System (ADS)

    Boulanger, Damien; Thouret, Valérie; Brissebrat, Guillaume

    2017-04-01

    IAGOS (In-service Aircraft for a Global Observing System) is a European Research Infrastructure which aims at the provision of long-term, regular and spatially resolved in situ observations of the atmospheric composition. IAGOS observation systems are deployed on a fleet of commercial aircraft and measure aerosols, cloud particles, greenhouse gases, ozone, water vapor and nitrogen oxides from the surface to the lower stratosphere. The IAGOS database is an essential part of the global atmospheric monitoring network. It contains IAGOS-core data and IAGOS-CARIBIC (Civil Aircraft for the Regular Investigation of the Atmosphere Based on an Instrument Container) data. The IAGOS Data Portal (http://www.iagos.org; contact: damien.boulanger@obs-mip.fr) is part of the French atmospheric chemistry data center AERIS (http://www.aeris-data.fr). In 2016 the new IAGOS Data Portal was released. In addition to data download, the portal provides improved and new services such as download in NetCDF or NASA Ames formats and plotting tools (maps, time series, vertical profiles, etc.). New added-value products are or will soon be available through the portal: back trajectories, origin of air masses, co-location with satellite data, etc. Web services allow users to download IAGOS metadata such as flight and airport information. Administration tools have been implemented for user management and instrument monitoring. A major improvement is interoperability with international portals and other databases in order to improve IAGOS data discovery. In the framework of the IGAS project (IAGOS for the Copernicus Atmospheric Service), a data network has been set up. It is composed of three data centers: the IAGOS database in Toulouse, the HALO research aircraft database at DLR (https://halo-db.pa.op.dlr.de) and the CAMS (Copernicus Atmosphere Monitoring Service) data center in Jülich (http://join.iek.fz-juelich.de). The link with the CAMS data center, through the JOIN interface, allows model outputs to be combined with IAGOS data for inter-comparison. The CAMS project is a prominent user of the IGAS data network. During the year IAGOS will improve metadata standardization and dissemination through collaborations with the AERIS data center, GAW (for which IAGOS is a contributing network) and the ENVRI+ European project. Metadata about measurement traceability and quality will be made available, DOIs will be implemented and interoperability with other European infrastructures will be set up through standardized web services.
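
    The flight and airport metadata web services mentioned above can be pictured with a minimal request sketch. The endpoint URL, query parameters and response fields below are hypothetical placeholders, not the documented IAGOS interface.

        # Illustrative call to a flight-metadata web service; all names are
        # placeholders assumed for the example.
        import requests

        BASE = "https://services.iagos.example/metadata"   # hypothetical endpoint

        resp = requests.get(f"{BASE}/flights",
                            params={"airport": "FRA", "year": 2016},
                            timeout=30)
        resp.raise_for_status()

        for flight in resp.json():                          # assumed JSON list
            print(flight.get("flight_id"),
                  flight.get("departure_airport"),
                  flight.get("arrival_airport"))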

  15. Reference architecture and interoperability model for data mining and fusion in scientific cross-domain infrastructures

    NASA Astrophysics Data System (ADS)

    Haener, Rainer; Waechter, Joachim; Grellet, Sylvain; Robida, Francois

    2017-04-01

    Interoperability is the key factor in establishing scientific research environments and infrastructures, as well as in bringing together heterogeneous, geographically distributed risk management, monitoring, and early warning systems. Based on developments within the European Plate Observing System (EPOS), a reference architecture has been devised that comprises architectural blueprints and interoperability models regarding the specification of business processes and logic as well as the encoding of data, metadata, and semantics. The architectural blueprint is developed on the basis of the so-called service-oriented architecture (SOA) 2.0 paradigm, which combines the intelligence and proactiveness of event-driven architectures with service-oriented architectures. SOA 2.0 supports analysing (Data Mining) both static and real-time data in order to find correlations among disparate information that do not at first appear intuitively obvious: analysed data (e.g., seismological monitoring) can be enhanced with relationships discovered by associating them (Data Fusion) with other data (e.g., creepmeter monitoring), with digital models of geological structures, or with the simulation of geological processes. The interoperability model describes the information, the communication (conversations) and the interactions (choreographies) of all participants involved, as well as the processes for registering, providing, and retrieving information. It is based on the principles of functional integration, implemented via dedicated services that communicate over service-oriented and message-driven infrastructures. The services provide their functionality via standardised interfaces: instead of requesting data directly, users share data via services that are built upon specific adapters. This approach replaces tight coupling at the data level with a flexible dependency on loosely coupled services. The main component of the interoperability model is the comprehensive semantic description of the information, business logic and processes on the basis of a minimal set of well-known, established standards. It implements the representation of knowledge by applying domain-controlled vocabularies to statements about resources, information, facts, and complex matters (ontologies). Seismic experts, for example, would be interested in geological models or borehole measurements at a certain depth, on the basis of which it is possible to correlate and verify seismic profiles. The entire model is built upon standards from the Open Geospatial Consortium (dictionaries, service layer), the International Organisation for Standardisation (registries, metadata), and the World Wide Web Consortium (Resource Description Framework, Spatial Data on the Web Best Practices). It has to be emphasised that this approach is scalable to the greatest possible extent: all information necessary in the context of cross-domain infrastructures is referenced via vocabularies and knowledge bases containing statements that provide either the information itself or the resources (service endpoints) from which the information can be retrieved. The entire infrastructure communication is subject to a broker-based business logic integration platform, in which the information exchanged between the participants involved is managed on the basis of standardised dictionaries, repositories, and registries. This approach also enables the development of Systems-of-Systems (SoS), which allow the collaboration of autonomous, large-scale, concurrent, and distributed systems that cooperatively interact as a collective in a common environment.
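
    The role of domain-controlled vocabularies and ontologies in the interoperability model can be illustrated with a minimal sketch using the rdflib package. The vocabulary namespace, class names and resource URIs below are invented for the example and are not part of the EPOS specification.

        # A data resource is described with terms from a (hypothetical) controlled
        # vocabulary and retrieved again with SPARQL.
        from rdflib import Graph, Literal, Namespace, URIRef
        from rdflib.namespace import DCTERMS, RDF

        EPOSV = Namespace("http://vocab.example.org/epos/")   # hypothetical vocabulary

        g = Graph()
        g.bind("eposv", EPOSV)
        dataset = URIRef("http://data.example.org/seismic/profile-42")

        g.add((dataset, RDF.type, EPOSV.SeismicProfile))
        g.add((dataset, DCTERMS.title, Literal("Seismic profile 42")))
        g.add((dataset, EPOSV.serviceEndpoint,
               URIRef("https://services.example.org/wfs")))

        # Find all resources of a given vocabulary type together with the
        # service endpoints from which they can be retrieved.
        query = """
        SELECT ?res ?endpoint WHERE {
            ?res a eposv:SeismicProfile ;
                 eposv:serviceEndpoint ?endpoint .
        }
        """
        for row in g.query(query):
            print(row.res, "->", row.endpoint)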

  16. Hydrographic processing considerations in the “Big Data” age: An overview of technology trends in ocean and coastal surveys

    NASA Astrophysics Data System (ADS)

    Holland, M.; Hoggarth, A.; Nicholson, J.

    2016-04-01

    The quantity of information generated by survey sensors for ocean and coastal zone mapping has reached the “Big Data” age. This is influenced by the number of survey sensors available to conduct a survey, high data resolution, commercial availability, as well as an increased use of autonomous platforms. The number of users of sophisticated survey information is also growing with the increase in data volume. This is leading to a greater demand and broader use of the processed results, which includes marine archeology, disaster response, and many other applications. Data processing and exchange techniques are evolving to ensure this increased accuracy in acquired data meets the user demand, and leads to an improved understanding of the ocean environment. This includes the use of automated processing, models that maintain the best possible representation of varying resolution data to reduce duplication, as well as data plug-ins and interoperability standards. Through the adoption of interoperable standards, data can be exchanged between stakeholders and used many times in any GIS to support an even wider range of activities. The growing importance of Marine Spatial Data Infrastructure (MSDI) is also contributing to the increased access of marine information to support sustainable use of ocean and coastal environments. This paper offers an industry perspective on trends in hydrographic surveying and processing, and the increased use of marine spatial data.

  17. SeaDataNet II - EMODNet Bathymetry - building a pan-European infrastructure for marine and ocean data management and a digital high resolution bathymetry for European seas

    NASA Astrophysics Data System (ADS)

    Schaap, Dick M. A.; Fichaut, Michele

    2015-04-01

    The second phase of the SeaDataNet project has been well underway since October 2011. The main objective is to improve operations and to progress towards an efficient data management infrastructure able to handle the diversity and large volume of data collected via research cruises and monitoring activities in European marine waters and global oceans. The SeaDataNet infrastructure comprises a network of interconnected data centres and a central SeaDataNet portal. The portal provides users a unified and transparent overview of the metadata and controlled access to the large collections of data sets managed by the interconnected data centres, as well as to the various SeaDataNet standards and tools. SeaDataNet is also setting and governing marine data standards, and exploring and establishing interoperability solutions to connect to other e-infrastructures on the basis of standards from ISO (19115, 19139), OGC (WMS, WFS, CS-W and SWE), and OpenSearch. The population of the directories has increased considerably through cooperation with, and involvement in, associated EU projects and initiatives. SeaDataNet now gives an overview of, and access to, more than 1.6 million data sets for physical oceanography, chemistry, geology, geophysics, bathymetry and biology from more than 100 connected data centres in 34 countries bordering the European seas. Access to marine data is also a key issue for the implementation of the EU Marine Strategy Framework Directive (MSFD). The EU communication 'Marine Knowledge 2020' underpins the importance of data availability and of harmonising access to marine data from different sources. SeaDataNet qualified itself for an active role in the data management component of EMODnet (European Marine Observation and Data network), which is promoted in the EU Communication. Starting in 2009, EMODnet pilot portals have been initiated for the marine data themes digital bathymetry, chemistry, physical oceanography, geology, biology, and seabed habitat mapping. These portals are being expanded to all European sea regions as part of EMODnet Phase 2, which started in mid 2013. EMODnet encourages more data providers to come forward for data sharing and to participate in the process of making complete overviews and homogeneous data products. The EMODnet Bathymetry project is very illustrative of the synergy between SeaDataNet and EMODnet and of the added value of generating public data products. The project develops and publishes Digital Terrain Models (DTM) for the European seas, produced from survey and aggregated data sets. The portal provides a versatile DTM viewing service with many relevant map layers and functions for data retrieval. A further refinement is taking place as part of phase 2. The presentation will highlight key achievements in SeaDataNet II and give further details and views on the new EMODnet Digital Bathymetry for European seas, to be released in early 2015.
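
    The CS-W interface mentioned above can be exercised with a few lines of Python using the OWSLib package; the catalogue URL below is a placeholder, not an official SeaDataNet or EMODnet endpoint.

        # Keyword search against an OGC Catalogue Service for the Web (CS-W).
        from owslib.csw import CatalogueServiceWeb
        from owslib.fes import PropertyIsLike

        csw = CatalogueServiceWeb("https://catalogue.example.org/csw")  # placeholder

        query = PropertyIsLike("csw:AnyText", "%bathymetry%")
        csw.getrecords2(constraints=[query], maxrecords=10)

        for identifier, record in csw.records.items():
            print(identifier, "-", record.title)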

  18. Common Data Model for Neuroscience Data and Data Model Exchange

    PubMed Central

    Gardner, Daniel; Knuth, Kevin H.; Abato, Michael; Erde, Steven M.; White, Thomas; DeBellis, Robert; Gardner, Esther P.

    2001-01-01

    Objective: Generalizing the data models underlying two prototype neurophysiology databases, the authors describe and propose the Common Data Model (CDM) as a framework for federating a broad spectrum of disparate neuroscience information resources. Design: Each component of the CDM derives from one of five superclasses—data, site, method, model, and reference—or from relations defined between them. A hierarchic attribute-value scheme for metadata enables interoperability with variable tree depth to serve specific intra- or broad inter-domain queries. To mediate data exchange between disparate systems, the authors propose a set of XML-derived schema for describing not only data sets but data models. These include biophysical description markup language (BDML), which mediates interoperability between data resources by providing a meta-description for the CDM. Results: The set of superclasses potentially spans data needs of contemporary neuroscience. Data elements abstracted from neurophysiology time series and histogram data represent data sets that differ in dimension and concordance. Site elements transcend neurons to describe subcellular compartments, circuits, regions, or slices; non-neuroanatomic sites include sequences to patients. Methods and models are highly domain-dependent. Conclusions: True federation of data resources requires explicit public description, in a metalanguage, of the contents, query methods, data formats, and data models of each data resource. Any data model that can be derived from the defined superclasses is potentially conformant and interoperability can be enabled by recognition of BDML-described compatibilities. Such metadescriptions can buffer technologic changes. PMID:11141510

  19. panMetaDocs and DataSync - providing a convenient way to share and publish research data

    NASA Astrophysics Data System (ADS)

    Ulbricht, D.; Klump, J. F.

    2013-12-01

    In recent years research institutions, geological surveys and funding organizations have started to build infrastructures to facilitate the re-use of research data from previous work. At present, several intermeshed activities are coordinated to make data systems of the earth sciences interoperable and recorded data discoverable. Driven by governmental authorities, ISO 19115/19139 emerged as the metadata standards for discovery of data and services. Established metadata transport protocols like OAI-PMH and OGC CSW are used to disseminate metadata to data portals. With persistent identifiers like DOI and IGSN, research data and corresponding physical samples can be given unambiguous names and thus become citable. In summary, these activities focus primarily on 'ready to give away' data, already stored in an institutional repository and described with appropriate metadata. Many datasets are not 'born' in this state but are produced in small and federated research projects. To make access to and reuse of these 'small data' easier, the data should be centrally stored and version controlled from the very beginning of activities. We developed DataSync [1] as a supplemental application to the panMetaDocs [2] data exchange platform, as a data management tool for small science projects. DataSync is a Java application that runs on a local computer and synchronizes directory trees into an eSciDoc repository [3] by creating eSciDoc objects via eSciDoc's REST API. DataSync can be installed on multiple computers and is in this way able to synchronize the files of a research team over the internet. XML metadata can be added as separate files that are managed together with the data files as versioned eSciDoc objects. A project-customized instance of panMetaDocs is provided to show a web-based overview of the previously uploaded file collection and to allow further annotation with metadata inside the eSciDoc repository. PanMetaDocs is a PHP-based web application to assist the creation of metadata in any XML-based metadata schema. To reduce manual entry of metadata to a minimum and make use of contextual information in a project setting, metadata fields can be populated with static or dynamic content. Access rights can be defined to control visibility of and access to stored objects. Notifications about recently updated datasets are available by RSS and e-mail, and the entire inventory can be harvested via OAI-PMH. panMetaDocs is optimized to be harvested by panFMP [4]. panMetaDocs is able to mint dataset DOIs through DataCite and uses eSciDoc's REST API to transfer eSciDoc objects from a non-public 'pending' status to the published status 'released', which makes data and metadata of the published object available worldwide through the internet. The application scenario presented here shows the adoption of open source applications for data sharing and publication of data. An eSciDoc repository is used as storage for data and metadata. DataSync serves as a file ingester and distributor, whereas panMetaDocs' main function is to annotate the dataset files with metadata to make them ready for publication and sharing with one's own team or with the scientific community.
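
    The DataSync idea of synchronizing a local directory tree into a repository can be pictured with a conceptual sketch. The HTTP endpoint and payload layout below are placeholders and do not reproduce the eSciDoc REST API.

        # Walk a directory tree and push each data file, plus an optional sidecar
        # XML metadata file, to a repository over HTTP. Endpoint is hypothetical.
        import pathlib
        import requests

        REPO = "https://repository.example.org/api/items"   # placeholder endpoint

        def sync_tree(root):
            root = pathlib.Path(root)
            for path in root.rglob("*"):
                if not path.is_file() or path.suffix == ".xml":
                    continue                 # sidecar metadata is sent with its data file
                files = {"data": (path.name, path.read_bytes())}
                sidecar = path.with_suffix(path.suffix + ".xml")
                if sidecar.exists():
                    files["metadata"] = (sidecar.name, sidecar.read_bytes())
                r = requests.post(REPO, files=files,
                                  data={"relative_path": str(path.relative_to(root))})
                r.raise_for_status()

        sync_tree("./project_data")          # hypothetical project directory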

  20. Oceanids command and control (C2) data system - Marine autonomous systems data for vehicle piloting, scientific data users, operational data assimilation, and big data

    NASA Astrophysics Data System (ADS)

    Buck, J. J. H.; Phillips, A.; Lorenzo, A.; Kokkinaki, A.; Hearn, M.; Gardner, T.; Thorne, K.

    2017-12-01

    The National Oceanography Centre (NOC) operates a fleet of approximately 36 autonomous marine platforms including submarine gliders, autonomous underwater vehicles, and autonomous surface vehicles. Each platform effectively has the capability to observe the ocean and collect data akin to a small research vessel. This is creating a growth in data volumes and complexity while the amount of resource available to manage data remains static. The Oceanids Command and Control (C2) project aims to solve these issues by fully automating data archival, processing and dissemination. The data architecture being implemented jointly by NOC and the Scottish Association for Marine Science (SAMS) includes a single Application Programming Interface (API) gateway to handle authentication, forwarding and delivery of both metadata and data. Technicians and principal investigators will enter expedition data prior to deployment of vehicles, enabling automated data processing when vehicles are deployed. The system will support automated metadata acquisition from platforms as this technology moves towards operational implementation. The metadata exposure to the web builds on a prototype developed by the European Commission-supported SenseOCEAN project and uses open standards including World Wide Web Consortium (W3C) RDF/XML, the Semantic Sensor Network ontology and the Open Geospatial Consortium (OGC) SensorML standard. Data will be delivered in the marine domain Everyone's Glider Observatory (EGO) format and as OGC Observations and Measurements. Additional formats will be served by implementation of endpoints such as the NOAA ERDDAP tool. This standardised data delivery via the API gateway enables timely near-real-time data to be served to Oceanids users, BODC users, operational users and big data systems. The use of open standards will also enable web interfaces to be built rapidly on the API gateway, and delivery to European research infrastructures that include aligned reference models for data infrastructure.
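
    Access through an ERDDAP endpoint, one of the delivery options named above, can be sketched in a few lines; the server address, dataset identifier and variable names are placeholders assumed for the example.

        # Read near-real-time platform data from an ERDDAP "tabledap" endpoint as CSV.
        import pandas as pd

        server = "https://erddap.example.org/erddap"      # placeholder server
        dataset_id = "glider_mission_001"                 # placeholder dataset

        url = (f"{server}/tabledap/{dataset_id}.csv"
               "?time,latitude,longitude,temperature"
               "&time>=2017-06-01T00:00:00Z")

        df = pd.read_csv(url, skiprows=[1])   # second row of ERDDAP CSV holds units
        print(df.head())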

  1. The interoperability skill of the Geographic Portal of the ISPRA - Geological Survey of Italy

    NASA Astrophysics Data System (ADS)

    Pia Congi, Maria; Campo, Valentina; Cipolloni, Carlo; Delogu, Daniela; Ventura, Renato; Battaglini, Loredana

    2010-05-01

    The Geographic Portal of the Geological Survey of Italy (ISPRA), available at http://serviziogeologico.apat.it/Portal, was planned according to the standard criteria of the INSPIRE directive. ArcIMS services and, at the same time, WMS and WFS services have been set up to satisfy different clients. For each database and web service, the metadata have been written in agreement with ISO 19115. The management architecture of the portal allows it to encode client input and output requests both in ArcXML and in GML. Web applications and web services have been realized for each database owned by the Land Protection and Georesources Department, concerning the geological maps at the scales 1:50,000 (CARG Project) and 1:100,000, the IFFI landslide inventory, the boreholes recorded under Law 464/84, the large-scale geological map and all the raster format maps. The portal published so far is at an experimental stage; the development of a new graphical interface will bring it to its final version, and the WMS and WFS services, including metadata, will be re-designed. The validity of the methodology and the applied standards allows us to look ahead to further developments. In addition, it must be borne in mind that the new geological standard language (GeoSciML), which is already incorporated in the deployed web services, will allow better display and querying of geological data in an interoperable way. The characteristics of geological data demand specific symbol libraries for cartographic mapping that are not yet available in a WMS service; this is another open aspect regarding standards for geological information. Therefore, the following have been carried out so far: a library of geological symbols to be used for printing, with a scheme of system colors, and a library for displaying data on screen, which almost completely solves the problems of point and area coverage data (also oriented) but still poses problems for linear data (possible solutions: ArcIMS services from ArcMap projects or a specific SLD implementation for WMS services); an update of the "Guidelines for the supply of geological data", which will be published shortly; and the official involvement of the Geological Survey of Italy in the IUGS-CGI working group for the processing of, and experimentation with, the new GeoSciML language in WMS/WFS services. The availability of geographic information occurs through metadata that can be distributed online so that search engines can find them through specialized searches. The metadata collected in catalogs are structured according to a standard (ISO 19135). The catalogs are a 'common' interface to locate, view and query data and metadata services, web services and other resources. Working in a growing sector of environmental knowledge, the focus is to attract the participation of other subjects that can contribute to enriching the available informative content, so as to arrive at a true portal of national interest, especially for disaster management.
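
    The WMS services published by the portal can be consumed from scripts as well as from desktop clients. The sketch below uses the OWSLib package; the service URL, layer name and bounding box are placeholders, not the portal's actual configuration.

        # Request a map image from a WMS endpoint.
        from owslib.wms import WebMapService

        wms = WebMapService("https://geoportal.example.org/wms", version="1.3.0")
        print(list(wms.contents))                         # advertised layers

        img = wms.getmap(layers=["geological_map_50k"],   # placeholder layer
                         srs="EPSG:4326",
                         bbox=(6.0, 36.0, 19.0, 48.0),    # roughly Italy
                         size=(800, 600),
                         format="image/png",
                         transparent=True)

        with open("geology.png", "wb") as f:
            f.write(img.read())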

  2. SAMOS - A Decade of High-Quality, Underway Meteorological and Oceanographic Data from Research Vessels

    NASA Astrophysics Data System (ADS)

    Smith, S. R.; Rolph, J.; Briggs, K.; Elya, J. L.; Bourassa, M. A.

    2016-02-01

    The authors will describe the successes and lessons learned from the Shipboard Automated Meteorological and Oceanographic System (SAMOS) initiative. Over the past decade, SAMOS has acquired, quality controlled, and distributed underway surface meteorological and oceanographic observations from nearly 40 oceanographic research vessels. Research vessels provide underway observations at high temporal frequency (1-minute sampling interval) that include navigational (position, course, heading, and speed), meteorological (air temperature, humidity, wind, surface pressure, radiation, rainfall), and oceanographic (sea surface temperature and salinity) samples. Vessels recruited to the SAMOS initiative collect a high concentration of data within the U.S. continental shelf, around Hawaii and the islands of the tropical Pacific, and frequently operate well outside routine shipping lanes, capturing observations in extreme ocean environments (Southern, Arctic, South Atlantic, and South Pacific oceans) desired by the air-sea exchange, modeling, and satellite remote sensing communities. The presentation will highlight the data stewardship practices of the SAMOS initiative. Activities include routine automated and visual data quality evaluation, feedback to vessel technicians and operators regarding instrumentation errors, best practices for instrument siting and exposure on research vessels, and professional development activities for research vessel technicians. Best practices for data, metadata, and quality evaluation will be presented. We will discuss ongoing efforts to expand data services to enhance interoperability between marine data centers. Data access and archival protocols will also be presented, including how these data may be referenced and accessed via NCEI.
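
    The automated part of the quality evaluation can be illustrated with a simple gross range check over 1-minute samples. The thresholds and flag values below are examples assumed for the sketch, not the SAMOS quality-control specification.

        # Flag 1-minute underway observations that fall outside plausible ranges.
        import pandas as pd

        RANGES = {                                        # illustrative limits
            "air_temperature_c": (-40.0, 50.0),
            "sea_surface_temperature_c": (-2.0, 40.0),
            "wind_speed_ms": (0.0, 75.0),
        }

        def gross_range_flags(df):
            """Return per-variable flags: 1 = pass, 4 = fail (outside range)."""
            flags = pd.DataFrame(index=df.index)
            for var, (lo, hi) in RANGES.items():
                ok = df[var].between(lo, hi)
                flags[var + "_flag"] = ok.map({True: 1, False: 4})
            return flags

        obs = pd.DataFrame({
            "air_temperature_c": [12.3, 13.1, 99.0],
            "sea_surface_temperature_c": [15.2, 15.3, 15.4],
            "wind_speed_ms": [5.5, 80.2, 6.1],
        })
        print(gross_range_flags(obs))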

  3. The New Online Metadata Editor for Generating Structured Metadata

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Devarakonda, Ranjeet; Shrestha, Biva; Palanisamy, Giri

    Nobody is better suited to describe data than the scientist who created it. This description of the data is called metadata. In general terms, metadata represent the who, what, when, where, why and how of a dataset [1]. eXtensible Markup Language (XML) is the preferred output format for metadata, as it makes the metadata portable and, more importantly, suitable for system discoverability. The newly developed ORNL Metadata Editor (OME) is a web-based tool that allows users to create and maintain XML files containing key information, or metadata, about their research. Metadata include information about the specific projects, parameters, time periods, and locations associated with the data. Such information helps put the research findings in context. In addition, the metadata produced using OME will allow other researchers to find these data via metadata clearinghouses like Mercury [2][4]. OME is part of ORNL's Mercury software fleet [2][3]. It was jointly developed to support projects funded by the United States Geological Survey (USGS), U.S. Department of Energy (DOE), National Aeronautics and Space Administration (NASA) and National Oceanic and Atmospheric Administration (NOAA). OME's architecture provides a customizable interface to support project-specific requirements. Using this new architecture, the ORNL team developed OME instances for USGS's Core Science Analytics, Synthesis, and Libraries (CSAS&L), DOE's Next Generation Ecosystem Experiments (NGEE) and Atmospheric Radiation Measurement (ARM) Program, and the international Surface Ocean Carbon Dioxide ATlas (SOCAT). Researchers simply use the ORNL Metadata Editor to enter relevant metadata into a web-based form. From the information on the form, the Metadata Editor can create an XML file on the server where the editor is installed or on the user's personal computer. Researchers can also use the ORNL Metadata Editor to modify existing XML metadata files. As an example, an NGEE Arctic scientist uses OME to register their datasets with the NGEE data archive, allowing the NGEE archive to publish these datasets via a data search portal (http://ngee.ornl.gov/data). The highly descriptive metadata created using OME allow the archive to enable advanced data search options using keyword, geo-spatial, temporal and ontology filters. Similarly, the ARM OME allows scientists or principal investigators (PIs) to submit their data products to the ARM data archive. How would OME help big data centers like the Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC)? The ORNL DAAC is one of NASA's Earth Observing System Data and Information System (EOSDIS) data centers managed by the Earth Science Data and Information System (ESDIS) Project. The ORNL DAAC archives data produced by NASA's Terrestrial Ecology Program. The DAAC provides data and information relevant to biogeochemical dynamics, ecological data, and environmental processes, critical for understanding the dynamics relating to the biological, geological, and chemical components of the Earth's environment. Typically, the data produced, archived and analyzed are at a scale of multiple petabytes, which makes discoverability of the data very challenging. Without proper metadata associated with the data, it is difficult to find the data you are looking for and equally difficult to use and understand the data.
OME will allow data centers like NGEE and the ORNL DAAC to produce meaningful, high-quality, standards-based, descriptive information about their data products, in turn helping with data discoverability and interoperability. Useful links: USGS OME: http://mercury.ornl.gov/OME/ NGEE OME: http://ngee-arctic.ornl.gov/ngeemetadata/ ARM OME: http://archive2.ornl.gov/armome/ Contact: Ranjeet Devarakonda (devarakondar@ornl.gov) References: [1] Federal Geographic Data Committee. Content standard for digital geospatial metadata. Federal Geographic Data Committee, 1998. [2] Devarakonda, Ranjeet, et al. "Mercury: reusable metadata management, data discovery and access system." Earth Science Informatics 3.1-2 (2010): 87-94. [3] Wilson, B. E., Palanisamy, G., Devarakonda, R., Rhyne, B. T., Lindsley, C., & Green, J. (2010). Mercury Toolset for Spatiotemporal Metadata. [4] Pouchard, L. C., Branstetter, M. L., Cook, R. B., Devarakonda, R., Green, J., Palanisamy, G., ... & Noy, N. F. (2013). A Linked Science investigation: enhancing climate change data discovery with semantic technologies. Earth Science Informatics, 6(3), 175-185.
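
    The form-to-XML step that OME performs can be pictured with a small sketch: field values collected from a web form are written out as a structured XML record. The element names below are illustrative and do not follow a specific OME schema.

        # Turn a dictionary of form values into a simple XML metadata record.
        import xml.etree.ElementTree as ET

        form_values = {
            "title": "Soil temperature at NGEE Arctic site A",
            "investigator": "Jane Researcher",
            "start_date": "2014-06-01",
            "end_date": "2014-09-30",
            "west_bounding": "-165.5", "east_bounding": "-165.3",
            "south_bounding": "65.1", "north_bounding": "65.2",
        }

        record = ET.Element("metadata")
        for field, value in form_values.items():
            ET.SubElement(record, field).text = value

        ET.ElementTree(record).write("dataset_metadata.xml",
                                     encoding="utf-8", xml_declaration=True)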

  4. Building the Synergy between Public Sector and Research Data Infrastructures

    NASA Astrophysics Data System (ADS)

    Craglia, Massimo; Friis-Christensen, Anders; Ostländer, Nicole; Perego, Andrea

    2014-05-01

    INSPIRE is a European directive aiming to establish an EU-wide spatial data infrastructure to give cross-border access to information that can be used to support EU environmental policies, as well as other policies and activities having an impact on the environment. In order to ensure cross-border interoperability of the data infrastructures operated by EU Member States, INSPIRE sets out a framework based on common specifications for metadata, data, network services, data and service sharing, and monitoring and reporting. The implementation of INSPIRE has reached important milestones: the INSPIRE Geoportal was launched in 2011, providing a single access point for the discovery of INSPIRE data and services across EU Member States (currently, about 300K), while all the technical specifications for the interoperability of data across the 34 INSPIRE themes were adopted at the end of 2013. During this period a number of EU and international initiatives have been launched concerning cross-domain interoperability and (Linked) Open Data. In particular, the EU Open Data Portal, launched in December 2012, made provisions to access government and scientific data from EU institutions and bodies, and the EU ISA Programme (Interoperability Solutions for European Public Administrations) promotes cross-sector interoperability by sharing and re-using EU-wide and national standards and components. Moreover, the Research Data Alliance (RDA), an initiative jointly funded by the European Commission, the US National Science Foundation and the Australian Research Council, was launched in March 2013 to promote scientific data sharing and interoperability. The Joint Research Centre of the European Commission (JRC), besides being the technical coordinator of the implementation of INSPIRE, is also actively involved in the initiatives promoting cross-sector re-use in INSPIRE, and in sustainable approaches to address the evolution of technologies - in particular, how to support Linked Data in INSPIRE and the use of global persistent identifiers. It is evident that government and scientific data infrastructures are currently facing a number of issues that have already been addressed in INSPIRE. Sharing experiences and competencies will avoid re-inventing the wheel, and will help promote the cross-domain adoption of consistent solutions. Indeed, one of the lessons learnt from INSPIRE and the initiatives in which JRC is involved is that government and research data are not two separate worlds. Government data are commonly used as a basis to create scientific data, and vice versa. Consequently, it is fundamental to adopt a consistent approach to address the interoperability and data management issues shared by both government and scientific data. The presentation illustrates some of the lessons learnt during the implementation of INSPIRE and in work on data and service interoperability coordinated with European and international initiatives. We describe a number of critical interoperability issues and barriers affecting both scientific and government data, concerning, e.g., data terminologies, quality and licensing, and propose how these problems could be effectively addressed by a closer collaboration between the government and scientific communities and the sharing of experiences and practices.

  5. Data Management Rubric for Video Data in Organismal Biology.

    PubMed

    Brainerd, Elizabeth L; Blob, Richard W; Hedrick, Tyson L; Creamer, Andrew T; Müller, Ulrike K

    2017-07-01

    Standards-based data management facilitates data preservation, discoverability, and access for effective data reuse within research groups and across communities of researchers. Data sharing requires community consensus on standards for data management, such as storage and formats for digital data preservation, metadata (i.e., contextual data about the data) that should be recorded and stored, and data access. Video imaging is a valuable tool for measuring time-varying phenotypes in organismal biology, with particular application for research in functional morphology, comparative biomechanics, and animal behavior. The raw data are the videos, but videos alone are not sufficient for scientific analysis. Nearly endless videos of animals can be found on YouTube and elsewhere on the web, but these videos have little value for scientific analysis because essential metadata such as true frame rate, spatial calibration, genus and species, weight, age, etc. of organisms, are generally unknown. We have embarked on a project to build community consensus on video data management and metadata standards for organismal biology research. We collected input from colleagues at early stages, organized an open workshop, "Establishing Standards for Video Data Management," at the Society for Integrative and Comparative Biology meeting in January 2017, and then collected two more rounds of input on revised versions of the standards. The result we present here is a rubric consisting of nine standards for video data management, with three levels within each standard: good, better, and best practices. The nine standards are: (1) data storage; (2) video file formats; (3) metadata linkage; (4) video data and metadata access; (5) contact information and acceptable use; (6) camera settings; (7) organism(s); (8) recording conditions; and (9) subject matter/topic. The first four standards address data preservation and interoperability for sharing, whereas standards 5-9 establish minimum metadata standards for organismal biology video, and suggest additional metadata that may be useful for some studies. This rubric was developed with substantial input from researchers and students, but still should be viewed as a living document that should be further refined and updated as technology and research practices change. The audience for these standards includes researchers, journals, and granting agencies, and also the developers and curators of databases that may contribute to video data sharing efforts. We offer this project as an example of building community consensus for data management, preservation, and sharing standards, which may be useful for future efforts by the organismal biology research community. © The Author 2017. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology.
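
    One way to satisfy the metadata-linkage standard is a machine-readable sidecar record stored next to the video file, holding the metadata the rubric identifies as essential (true frame rate, spatial calibration, organism, recording conditions). The field names below are illustrative; the rubric itself does not prescribe a file format.

        # Write a sidecar metadata record alongside a video file.
        import json

        video_metadata = {
            "video_file": "trial_017.avi",
            "true_frame_rate_hz": 500,
            "spatial_calibration_mm_per_pixel": 0.12,
            "organism": {"genus": "Danio", "species": "rerio",
                         "mass_g": 0.4, "age_dpf": 90},
            "recording_conditions": {"water_temperature_c": 26.0,
                                     "arena": "30 cm flow tank"},
            "contact": "lab@example.edu",
            "acceptable_use": "contact before reuse",
        }

        with open("trial_017.metadata.json", "w") as f:
            json.dump(video_metadata, f, indent=2)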

  6. Data Management Rubric for Video Data in Organismal Biology

    PubMed Central

    Brainerd, Elizabeth L.; Blob, Richard W.; Hedrick, Tyson L.; Creamer, Andrew T.; Müller, Ulrike K.

    2017-01-01

    Synopsis Standards-based data management facilitates data preservation, discoverability, and access for effective data reuse within research groups and across communities of researchers. Data sharing requires community consensus on standards for data management, such as storage and formats for digital data preservation, metadata (i.e., contextual data about the data) that should be recorded and stored, and data access. Video imaging is a valuable tool for measuring time-varying phenotypes in organismal biology, with particular application for research in functional morphology, comparative biomechanics, and animal behavior. The raw data are the videos, but videos alone are not sufficient for scientific analysis. Nearly endless videos of animals can be found on YouTube and elsewhere on the web, but these videos have little value for scientific analysis because essential metadata such as true frame rate, spatial calibration, genus and species, weight, age, etc. of organisms, are generally unknown. We have embarked on a project to build community consensus on video data management and metadata standards for organismal biology research. We collected input from colleagues at early stages, organized an open workshop, “Establishing Standards for Video Data Management,” at the Society for Integrative and Comparative Biology meeting in January 2017, and then collected two more rounds of input on revised versions of the standards. The result we present here is a rubric consisting of nine standards for video data management, with three levels within each standard: good, better, and best practices. The nine standards are: (1) data storage; (2) video file formats; (3) metadata linkage; (4) video data and metadata access; (5) contact information and acceptable use; (6) camera settings; (7) organism(s); (8) recording conditions; and (9) subject matter/topic. The first four standards address data preservation and interoperability for sharing, whereas standards 5–9 establish minimum metadata standards for organismal biology video, and suggest additional metadata that may be useful for some studies. This rubric was developed with substantial input from researchers and students, but still should be viewed as a living document that should be further refined and updated as technology and research practices change. The audience for these standards includes researchers, journals, and granting agencies, and also the developers and curators of databases that may contribute to video data sharing efforts. We offer this project as an example of building community consensus for data management, preservation, and sharing standards, which may be useful for future efforts by the organismal biology research community. PMID:28881939

  7. Integrating Ideas for International Data Collaborations Through The Committee on Earth Observation Satellites (CEOS) International Directory Network (IDN)

    NASA Technical Reports Server (NTRS)

    Olsen, Lola M.

    2006-01-01

    The capabilities of the International Directory Network's (IDN) version MD9.5, along with a new version of the metadata authoring tool, "docBUILDER", will be presented during the Technology and Services Subgroup session of the Working Group on Information Systems and Services (WGISS). Feedback provided by the international community has proven instrumental in positively influencing the direction of the IDN's development. The international community was instrumental in encouraging support for the ISO international character set that is now available through the directory. Supporting metadata descriptions in additional languages encourages extended use of the IDN. Temporal and spatial attributes often prove pivotal in the search for data. Prior to the new software release, the IDN's geospatial and temporal searches suffered from browser incompatibilities and often resulted in unreliable performance for users attempting to initiate a spatial search using a map based on aging Java applet technology. The IDN now offers an integrated Google map and date search that replaces that technology. In addition, one of the most defining characteristics in the search for data relates to the temporal and spatial resolution of the data. The ability to refine the search to data sets meeting defined resolution requirements is now possible. Data set authors are encouraged to indicate the precise resolution values for their data sets and subsequently bin these into one of the pre-selected resolution ranges. New metadata authoring tools have been well received. In response to requests for a standalone metadata authoring tool, a new shareable software package called "docBUILDER solo" will soon be released to the public. This tool permits researchers to document their data during experiments and observational periods in the field. Interoperability has been enhanced through the use of the Open Archives Initiative's (OAI) Protocol for Metadata Harvesting (PMH). Harvesting of XML content through OAI-PMH has been successfully tested with several organizations. The protocol appears to be a prime candidate for sharing metadata throughout the international community. Data services for visualizing and analyzing data have become valuable assets in facilitating the use of data. Data providers are offering many of their data-related services through the directory. The IDN plans to develop a service-based architecture to further promote the use of web services. During the IDN Task Team session, ideas for further enhancements will be discussed.
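
    OAI-PMH harvesting of the kind tested with the IDN can be sketched with a plain HTTP request for Dublin Core records; the repository URL below is a placeholder, not an operational IDN endpoint.

        # Harvest Dublin Core records via OAI-PMH (verb=ListRecords).
        import requests
        import xml.etree.ElementTree as ET

        ENDPOINT = "https://repository.example.org/oai"   # placeholder repository
        NS = {
            "oai": "http://www.openarchives.org/OAI/2.0/",
            "dc": "http://purl.org/dc/elements/1.1/",
        }

        resp = requests.get(ENDPOINT,
                            params={"verb": "ListRecords",
                                    "metadataPrefix": "oai_dc"},
                            timeout=60)
        resp.raise_for_status()
        root = ET.fromstring(resp.content)

        for record in root.findall(".//oai:record", NS):
            title = record.find(".//dc:title", NS)
            identifier = record.find(".//dc:identifier", NS)
            print(identifier.text if identifier is not None else "(no id)",
                  "-", title.text if title is not None else "(no title)")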

  8. Legacy2Drupal: Conversion of an existing relational oceanographic database to a Drupal 7 CMS

    NASA Astrophysics Data System (ADS)

    Work, T. T.; Maffei, A. R.; Chandler, C. L.; Groman, R. C.

    2011-12-01

    Content Management Systems (CMSs) such as Drupal provide powerful features that can be of use to oceanographic (and other geo-science) data managers. However, in many instances, geo-science data management offices have already designed and implemented customized schemas for their metadata. The NSF-funded Biological and Chemical Oceanography Data Management Office (BCO-DMO) has ported an existing relational database containing oceanographic metadata, along with an existing interface coded in Cold Fusion middleware, to a Drupal 7 Content Management System. This is an update on an effort described as a proof-of-concept in poster IN21B-1051, presented at AGU 2009. The BCO-DMO project has translated all the existing database tables, input forms, website reports, and other features present in the existing system into Drupal CMS features. The replacement features are made possible by the use of Drupal content types, CCK node-reference fields, a custom theme, and a number of other supporting modules. This presentation describes the process used to migrate content in the original BCO-DMO metadata database to Drupal 7, some problems encountered during migration, and the modules used to migrate the content successfully. Strategic use of Drupal 7 CMS features that enable three separate but complementary interfaces to oceanographic research metadata will also be covered: 1) a Drupal 7-powered user front-end; 2) RESTful JSON web services providing a MapServer interface to the metadata and data; and 3) a SPARQL interface to a semantic representation of the repository metadata (this feeding a new faceted search capability currently under development). The existing BCO-DMO ontology, developed in collaboration with Rensselaer Polytechnic Institute's Tetherless World Constellation, makes strategic use of pre-existing ontologies and will be used to drive the semantically-enabled faceted search capabilities planned for the site. At this point, the use of the semantic technologies included in the Drupal 7 core is anticipated. Using a public domain CMS as opposed to proprietary middleware, and taking advantage of the many features of Drupal 7 that are designed to support semantically-enabled interfaces, will help prepare the BCO-DMO and other science data repositories for interoperability between systems that serve ecosystem research data.

  9. European Marine Observation Data Network - EMODnet Physics

    NASA Astrophysics Data System (ADS)

    Manzella, Giuseppe M. R.; Novellino, Antonio; D'Angelo, Paolo; Gorringe, Patrick; Schaap, Dick; Pouliquen, Sylvie; Loubrieu, Thomas; Rickards, Lesley

    2015-04-01

    The EMODnet-Physics portal (www.emodnet-physics.eu) makes layers of physical data and their metadata available for use and contributes towards the definition of an operational European Marine Observation and Data Network (EMODnet). It is based on a strong collaboration between EuroGOOS associates and its regional operational systems (ROOSs), and it brings together two very different marine communities: the "real time" ocean observing institutes and centers and the National Oceanographic Data Centres (NODCs) that are in charge of ocean data validation, quality checking and updating for marine environmental monitoring. EMODnet-Physics is a Marine Observation and Data Information System that provides a single point of access to near-real-time and historical archived data (www.emodnet-physics.eu/map). It is built on existing infrastructure, adding value while avoiding unnecessary complexity; it provides data access to users and aims at attracting new data holders and better and more data. With a long-term vision of a sustainable pan-European Ocean Observation System, EMODnet-Physics supports the coordination of the EuroGOOS regional components and the empowerment and improvement of their data management infrastructure. In turn, EMODnet-Physics has already implemented high-level interoperability features (WMS, web catalogue, web services, etc.) to facilitate connection and data exchange with the ROOSs and the institutes within the ROOSs (www.emodnet-physics.eu/services). The ongoing EMODnet-Physics structure delivers environmental marine physical data from the whole of Europe (wave height and period, temperature of the water column, wind speed and direction, salinity of the water column, horizontal velocity of the water column, light attenuation, and sea level) as monitored by fixed stations, Argo floats, drifting buoys, gliders, and ferryboxes. It provides discovery of data sets (both NRT - near real time - and historical data sets), visualization and free download of data from more than 1500 platforms. The portal is composed mainly of three sections: the Map, the Selection List and the Station Info Panel. The Map is the core of the EMODnet-Physics system: here the user can access all available data, customize the map visualization and set different display layers. It is also possible to interact with all the information on the map using the filters provided by the service, which can be used to select the stations of interest depending on the type, the physical parameters measured, the time period of the observations in the database of the system, the country of origin, and the water basin of reference. It is also possible to browse the data in time by means of the slider in the lower part of the page, which allows the user to view the stations that recorded data in a particular time period. Finally, it is possible to change the standard map view with different layers that provide additional visual information on the status of the waters. The Station Info panel, available from the main map by clicking on a single platform, provides information on the measurements carried out by the station. Moreover, the system provides full interoperability with third-party software through WMS services, web services and a web catalogue in order to exchange data and products according to the most recent interoperability standards.
Further developments will ensure compatibility with the OGC SWE (Sensor Web Enablement) standards for the description of sensors and related observations using OpenGIS specifications (SensorML, O&M, SOS). The full list of services is available at www.emodnet-physics.eu/services. The result is an excellent example of innovative technologies providing open and free access to geo-referenced data for the creation of new advanced (operational) oceanography services.
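
    The planned SWE compatibility can be illustrated with a key-value-pair Sensor Observation Service request; the service URL, offering and observed property below are placeholders assumed for the example.

        # OGC SOS 2.0 GetObservation request using plain HTTP parameters.
        import requests

        SOS_URL = "https://sos.example.org/sos"           # placeholder service

        params = {
            "service": "SOS",
            "version": "2.0.0",
            "request": "GetObservation",
            "offering": "mooring_61277",                  # placeholder offering
            "observedProperty": "sea_water_temperature",
            "responseFormat": "http://www.opengis.net/om/2.0",
        }

        resp = requests.get(SOS_URL, params=params, timeout=60)
        resp.raise_for_status()
        print(resp.text[:500])   # O&M XML document with the requested observations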

  10. Interoperable Data Sharing for Diverse Scientific Disciplines

    NASA Astrophysics Data System (ADS)

    Hughes, John S.; Crichton, Daniel; Martinez, Santa; Law, Emily; Hardman, Sean

    2016-04-01

    For diverse scientific disciplines to interoperate they must be able to exchange information based on a shared understanding. To capture this shared understanding, we have developed a knowledge representation framework using ontologies and ISO-level archive and metadata registry reference models. This framework provides multi-level governance, evolves independent of implementation technologies, and promotes agile development, namely adaptive planning, evolutionary development, early delivery, continuous improvement, and rapid and flexible response to change. The knowledge representation framework is populated through knowledge acquisition from discipline experts. It is also extended to meet specific discipline requirements. The result is a formalized and rigorous knowledge base that addresses data representation, integrity, provenance, context, quantity, and their relationships within the community. The contents of the knowledge base are translated and written to files in appropriate formats to configure system software and services, provide user documentation, validate ingested data, and support data analytics. This presentation will provide an overview of the framework, present the Planetary Data System's PDS4 as a use case that has been adopted by the international planetary science community, describe how the framework is being applied to other disciplines, and share some important lessons learned.

  11. A Working Framework for Enabling International Science Data System Interoperability

    NASA Astrophysics Data System (ADS)

    Hughes, J. Steven; Hardman, Sean; Crichton, Daniel J.; Martinez, Santa; Law, Emily; Gordon, Mitchell K.

    2016-07-01

    For diverse scientific disciplines to interoperate they must be able to exchange information based on a shared understanding. To capture this shared understanding, we have developed a knowledge representation framework that leverages ISO level reference models for metadata registries and digital archives. This framework provides multi-level governance, evolves independent of the implementation technologies, and promotes agile development, namely adaptive planning, evolutionary development, early delivery, continuous improvement, and rapid and flexible response to change. The knowledge representation is captured in an ontology through a process of knowledge acquisition. Discipline experts in the role of stewards at the common, discipline, and project levels work to design and populate the ontology model. The result is a formal and consistent knowledge base that provides requirements for data representation, integrity, provenance, context, identification, and relationship. The contents of the knowledge base are translated and written to files in suitable formats to configure system software and services, provide user documentation, validate input, and support data analytics. This presentation will provide an overview of the framework, present a use case that has been adopted by an entire science discipline at the international level, and share some important lessons learned.

  12. Evolving Frameworks for Different Communities of Scientists and End Users

    NASA Astrophysics Data System (ADS)

    Graves, S. J.; Keiser, K.

    2016-12-01

    Two evolving frameworks for interdisciplinary science will be described in the context of the Common Data Framework for Earth-Observation Data and the importance of standards and protocols. The Event Data Driven Delivery (ED3) Framework, funded by NASA Applied Sciences, provides the delivery of data based on predetermined subscriptions and associated workflows to various communities of end users. ED3's capabilities are used by scientists, as well as policy and resource managers, when event alerts are triggered to respond to their needs. The EarthCube Integration and Testing Environment (ECITE) Assessment Framework for Technology Interoperability and Integration is being developed to facilitate the EarthCube community's assessment of NSF-funded technologies addressing Earth science problems. ECITE is addressing the translation of geoscience researchers' use cases into technology use cases that apply EarthCube-funded building-block technologies (and other existing technologies) to solving science problems. EarthCube criteria for technology assessment include the use of data, metadata and service standards to improve interoperability and integration across program components. The long-range benefit will be the growth of a cyberinfrastructure with technology components that have been shown to work together to solve known science objectives.

  13. A Framework for Integration of Heterogeneous Medical Imaging Networks

    PubMed Central

    Viana-Ferreira, Carlos; Ribeiro, Luís S; Costa, Carlos

    2014-01-01

    Medical imaging is increasing in importance in matters of medical diagnosis and treatment support. Much is due to computers, which have revolutionized medical imaging not only in the acquisition process but also in the way it is visualized, stored, exchanged and managed. Picture Archiving and Communication Systems (PACS) are an example of how medical imaging takes advantage of computers. To solve problems of interoperability of PACS and medical imaging equipment, the Digital Imaging and Communications in Medicine (DICOM) standard was defined and widely implemented in current solutions. More recently, the need to exchange medical data between distinct institutions resulted in the Integrating the Healthcare Enterprise (IHE) initiative, which contains a content profile especially conceived for medical imaging exchange: Cross-Enterprise Document Sharing for Imaging (XDS-i). Moreover, due to application requirements, many solutions developed private networks to support their services. For instance, some applications support enhanced query and retrieve over DICOM object metadata. This paper proposes an integration framework for medical imaging networks that provides protocol interoperability and data federation services. It is an extensible plugin system that supports standard approaches (DICOM and XDS-I), but is also capable of supporting private protocols. The framework is being used in the Dicoogle Open Source PACS. PMID:25279021
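
    The extensible plugin idea can be pictured with a short sketch (written here in Python for brevity; the actual framework is part of the Java-based Dicoogle PACS and is not reproduced here). Standard and private protocols implement one query interface so the framework can federate results across them; all class and field names below are illustrative.

        # Common plugin interface plus a data-federation step over all plugins.
        from abc import ABC, abstractmethod

        class ImagingProtocolPlugin(ABC):
            """Interface every protocol plugin (DICOM, XDS-I, private) exposes."""

            @abstractmethod
            def query(self, criteria: dict) -> list:
                """Return study descriptors matching the criteria."""

        class DicomQueryPlugin(ImagingProtocolPlugin):
            def query(self, criteria):
                # ... would issue a DICOM query against a configured archive ...
                return [{"protocol": "DICOM", "study_uid": "1.2.3", **criteria}]

        class PrivateMetadataPlugin(ImagingProtocolPlugin):
            def query(self, criteria):
                # ... would query an institution-specific metadata index ...
                return [{"protocol": "private", "study_uid": "9.8.7", **criteria}]

        def federated_query(plugins, criteria):
            """Merge results from every registered plugin."""
            results = []
            for plugin in plugins:
                results.extend(plugin.query(criteria))
            return results

        print(federated_query([DicomQueryPlugin(), PrivateMetadataPlugin()],
                              {"modality": "CT"}))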

  14. A framework for integration of heterogeneous medical imaging networks.

    PubMed

    Viana-Ferreira, Carlos; Ribeiro, Luís S; Costa, Carlos

    2014-01-01

    Medical imaging is increasing in importance in matters of medical diagnosis and treatment support. Much is due to computers, which have revolutionized medical imaging not only in the acquisition process but also in the way it is visualized, stored, exchanged and managed. Picture Archiving and Communication Systems (PACS) are an example of how medical imaging takes advantage of computers. To solve problems of interoperability of PACS and medical imaging equipment, the Digital Imaging and Communications in Medicine (DICOM) standard was defined and widely implemented in current solutions. More recently, the need to exchange medical data between distinct institutions resulted in the Integrating the Healthcare Enterprise (IHE) initiative, which contains a content profile especially conceived for medical imaging exchange: Cross-Enterprise Document Sharing for Imaging (XDS-i). Moreover, due to application requirements, many solutions developed private networks to support their services. For instance, some applications support enhanced query and retrieve over DICOM object metadata. This paper proposes an integration framework for medical imaging networks that provides protocol interoperability and data federation services. It is an extensible plugin system that supports standard approaches (DICOM and XDS-I), but is also capable of supporting private protocols. The framework is being used in the Dicoogle Open Source PACS.

  15. A Lifecycle Approach to Brokered Data Management for Hydrologic Modeling Data Using Open Standards.

    NASA Astrophysics Data System (ADS)

    Blodgett, D. L.; Booth, N.; Kunicki, T.; Walker, J.

    2012-12-01

    The U.S. Geological Survey Center for Integrated Data Analytics has formalized an information management-architecture to facilitate hydrologic modeling and subsequent decision support throughout a project's lifecycle. The architecture is based on open standards and open source software to decrease the adoption barrier and to build on existing, community supported software. The components of this system have been developed and evaluated to support data management activities of the interagency Great Lakes Restoration Initiative, Department of Interior's Climate Science Centers and WaterSmart National Water Census. Much of the research and development of this system has been in cooperation with international interoperability experiments conducted within the Open Geospatial Consortium. Community-developed standards and software, implemented to meet the unique requirements of specific disciplines, are used as a system of interoperable, discipline specific, data types and interfaces. This approach has allowed adoption of existing software that satisfies the majority of system requirements. Four major features of the system include: 1) assistance in model parameter and forcing creation from large enterprise data sources; 2) conversion of model results and calibrated parameters to standard formats, making them available via standard web services; 3) tracking a model's processes, inputs, and outputs as a cohesive metadata record, allowing provenance tracking via reference to web services; and 4) generalized decision support tools which rely on a suite of standard data types and interfaces, rather than particular manually curated model-derived datasets. Recent progress made in data and web service standards related to sensor and/or model derived station time series, dynamic web processing, and metadata management are central to this system's function and will be presented briefly along with a functional overview of the applications that make up the system. As the separate pieces of this system progress, they will be combined and generalized to form a sort of social network for nationally consistent hydrologic modeling.
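
    The third feature listed above, tracking a model's processes, inputs, and outputs as a cohesive metadata record with provenance expressed as references to web services, can be sketched as a simple structured record. All identifiers and URLs below are illustrative, not part of the described system.

        # A cohesive, service-referencing metadata record for one model run.
        import json
        from datetime import datetime, timezone

        model_run_record = {
            "model": "example-hydrologic-model",
            "run_id": "run-2012-10-15-001",
            "executed_at": datetime(2012, 10, 15, tzinfo=timezone.utc).isoformat(),
            "forcings": [
                {"name": "daily precipitation",
                 "service": "https://thredds.example.org/dodsC/precip_daily"},
            ],
            "parameters": [
                {"name": "soil hydraulic conductivity",
                 "service": "https://data.example.org/wfs?typeName=soil_params"},
            ],
            "outputs": [
                {"name": "simulated streamflow",
                 "service": "https://waterservices.example.org/sos?offering=streamflow"},
            ],
        }

        with open("model_run_provenance.json", "w") as f:
            json.dump(model_run_record, f, indent=2)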

  16. Developing a data life cycle for carbon and greenhouse gas measurements: challenges, experiences and visions

    NASA Astrophysics Data System (ADS)

    Kutsch, W. L.

    2015-12-01

    Environmental research infrastructures and big data integration networks require common data policies, standardized workflows and sophisticated e-infrastructure to optimise the data life cycle. This presentation summarizes the experiences gained in developing the data life cycle for the Integrated Carbon Observation System (ICOS), a European Research Infrastructure. It will also outline challenges that still exist and visions for future development. Like many other environmental research infrastructures, the ICOS RI is built on a large number of distributed observational or experimental sites. Data from these sites are transferred to Thematic Centres, where they are quality checked, processed and integrated. Dissemination will be managed by the ICOS Carbon Portal. This complex data life cycle has been defined in detail by developing protocols and assigning responsibilities. Since data will be shared under an open access policy, there is a strong need for common data citation tracking systems that allow data providers to identify downstream usage of their data, so as to prove their importance and demonstrate their impact to stakeholders and the public. More challenges arise from interoperating with other infrastructures or providing data for global integration projects, as done e.g. in the framework of GEOSS or in global integration approaches such as FLUXNET or SOCAT. Here, common metadata systems are the key solutions for data detection and harvesting. The metadata characterises data, services, users and ICT resources (including sensors and detectors). Risks may arise when data of high and low quality are mixed during this process, or when inexperienced data scientists without detailed knowledge of the data acquisition derive scientific theories through statistical analyses. The vision of fully open data availability is expressed in a recent GEO flagship initiative that will address important issues needed to build a connected and interoperable global network for carbon cycle and greenhouse gas observations, and aims to meet the most urgent needs for integration between different information sources and methodologies, between different regional networks, and from data providers to users.

  17. Data and Models as Social Objects in the HydroShare System for Collaboration in the Hydrology Community and Beyond

    NASA Astrophysics Data System (ADS)

    Tarboton, D. G.; Idaszak, R.; Horsburgh, J. S.; Ames, D. P.; Goodall, J. L.; Band, L. E.; Merwade, V.; Couch, A.; Hooper, R. P.; Maidment, D. R.; Dash, P. K.; Stealey, M.; Yi, H.; Gan, T.; Castronova, A. M.; Miles, B.; Li, Z.; Morsy, M. M.; Crawley, S.; Ramirez, M.; Sadler, J.; Xue, Z.; Bandaragoda, C.

    2016-12-01

    How do you share and publish hydrologic data and models for a large collaborative project? HydroShare is a new, web-based system for sharing hydrologic data and models with specific functionality aimed at making collaboration easier. HydroShare has been developed with U.S. National Science Foundation support under the auspices of the Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) to support the collaboration and community cyberinfrastructure needs of the hydrology research community. Within HydroShare, we have developed new functionality for creating datasets, describing them with metadata, and sharing them with collaborators. We cast hydrologic datasets and models as "social objects" that can be shared, collaborated around, annotated, published and discovered. In addition to data and model sharing, HydroShare supports web application programs (apps) that can act on data stored in HydroShare, just as software programs on your PC act on your data locally. This can free you from some of the limitations of local computing capacity and challenges in installing and maintaining software on your own PC. HydroShare's web-based cyberinfrastructure can take work off your desk or laptop computer and onto infrastructure or "cloud" based data and processing servers. This presentation will describe HydroShare's collaboration functionality that enables both public and private sharing with individual users and collaborative user groups, and makes it easier for collaborators to iterate on shared datasets and models, creating multiple versions along the way, and publishing them with a permanent landing page, metadata description, and citable Digital Object Identifier (DOI) when the work is complete. This presentation will also describe the web app architecture that supports interoperability with third party servers functioning as application engines for analysis and processing of big hydrologic datasets. While developed to support the cyberinfrastructure needs of the hydrology community, the informatics infrastructure for programmatic interoperability of web resources has a generality beyond the solution of hydrology problems that will be discussed.
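
    As a hedged illustration of the web-service access described above, the sketch below queries HydroShare's public REST API with the requests library; the /hsapi/resource/ path, query parameter and JSON field names are assumptions based on the public API documentation, not details given in the abstract.

```python
# Hedged sketch: list public HydroShare resources matching a keyword.
import requests

resp = requests.get("https://www.hydroshare.org/hsapi/resource/",   # assumed endpoint path
                    params={"subject": "streamflow"}, timeout=30)
resp.raise_for_status()
for res in resp.json().get("results", []):                          # assumed JSON layout
    print(res.get("resource_id"), "-", res.get("resource_title"))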

  18. A unified framework for managing provenance information in translational research

    PubMed Central

    2011-01-01

    Background A critical aspect of the NIH Translational Research roadmap, which seeks to accelerate the delivery of "bench-side" discoveries to patient's "bedside," is the management of the provenance metadata that keeps track of the origin and history of data resources as they traverse the path from the bench to the bedside and back. A comprehensive provenance framework is essential for researchers to verify the quality of data, reproduce scientific results published in peer-reviewed literature, validate scientific process, and associate trust value with data and results. Traditional approaches to provenance management have focused on only partial sections of the translational research life cycle and they do not incorporate "domain semantics", which is essential to support domain-specific querying and analysis by scientists. Results We identify a common set of challenges in managing provenance information across the pre-publication and post-publication phases of data in the translational research lifecycle. We define the semantic provenance framework (SPF), underpinned by the Provenir upper-level provenance ontology, to address these challenges in the four stages of provenance metadata: (a) Provenance collection - during data generation (b) Provenance representation - to support interoperability, reasoning, and incorporate domain semantics (c) Provenance storage and propagation - to allow efficient storage and seamless propagation of provenance as the data is transferred across applications (d) Provenance query - to support queries with increasing complexity over large data size and also support knowledge discovery applications We apply the SPF to two exemplar translational research projects, namely the Semantic Problem Solving Environment for Trypanosoma cruzi (T.cruzi SPSE) and the Biomedical Knowledge Repository (BKR) project, to demonstrate its effectiveness. Conclusions The SPF provides a unified framework to effectively manage provenance of translational research data during pre and post-publication phases. This framework is underpinned by an upper-level provenance ontology called Provenir that is extended to create domain-specific provenance ontologies to facilitate provenance interoperability, seamless propagation of provenance, automated querying, and analysis. PMID:22126369
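
    The provenance-query stage above can be illustrated with a small RDF example; the sketch uses rdflib and W3C PROV-O terms as a stand-in for the Provenir ontology, which is not reproduced in this abstract, and all resource URIs are invented.

```python
# Sketch of a provenance query over an RDF graph with rdflib.
from rdflib import Graph, Namespace, URIRef

PROV = Namespace("http://www.w3.org/ns/prov#")
g = Graph()
g.add((URIRef("urn:ex:result1"), PROV.wasDerivedFrom, URIRef("urn:ex:sample42")))
g.add((URIRef("urn:ex:sample42"), PROV.wasAttributedTo, URIRef("urn:ex:labA")))

q = """
PREFIX prov: <http://www.w3.org/ns/prov#>
SELECT ?src ?agent WHERE {
  <urn:ex:result1> prov:wasDerivedFrom ?src .
  ?src prov:wasAttributedTo ?agent .
}"""
for row in g.query(q):
    print(row.src, row.agent)   # which source the result came from, and who produced it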

  19. An Annotated and Federated Digital Library of Marine Animal Sounds

    DTIC Science & Technology

    2005-01-01

    of the annotations and the relevant segment delimitation points and linkages to other relevant metadata fields; e) search engines that support the...annotators to add information to the same recording, and search engines that permit either all-annotator or specific-annotator searches. To our knowledge

  20. Progress of Interoperability in Planetary Research for Geospatial Data Analysis

    NASA Astrophysics Data System (ADS)

    Hare, T. M.; Gaddis, L. R.

    2015-12-01

    For nearly a decade there has been a push in the planetary science community to support interoperable methods of accessing and working with geospatial data. Common geospatial data products for planetary research include image mosaics, digital elevation or terrain models, geologic maps, geographic location databases (e.g., craters, volcanoes) or any data that can be tied to the surface of a planetary body (including moons, comets or asteroids). Several U.S. and international cartographic research institutions have converged on mapping standards that embrace standardized image formats that retain geographic information (e.g., GeoTiff, GeoJpeg2000), digital geologic mapping conventions, planetary extensions for symbols that comply with U.S. Federal Geographic Data Committee cartographic and geospatial metadata standards, and notably on-line mapping services as defined by the Open Geospatial Consortium (OGC). The latter includes defined standards such as the OGC Web Mapping Services (simple image maps), Web Feature Services (feature streaming), Web Coverage Services (rich scientific data streaming), and Catalog Services for the Web (data searching and discoverability). While these standards were developed for application to Earth-based data, they have been modified to support the planetary domain. The motivation to support common, interoperable data format and delivery standards is not only to improve access for higher-level products but also to address the increasingly distributed nature of the rapidly growing volumes of data. The strength of using an OGC approach is that it provides consistent access to data that are distributed across many facilities. While data-streaming standards are well supported by the more sophisticated tools used in the Geographic Information System (GIS) and remote sensing industries, they are also supported by many lightweight browsers, which facilitates both large and small focused science applications as well as public use. Here we provide an overview of the interoperability initiatives that are currently ongoing in the planetary research community, examples of their successful application, and challenges that remain.
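
    As an illustration of the georeferenced image formats mentioned above, the following sketch reads the georeferencing carried by a GeoTIFF using the rasterio library (one common tool, not one named in the abstract); the file name is a placeholder.

```python
# Minimal sketch: inspect the georeferencing of a GeoTIFF with rasterio.
import rasterio

with rasterio.open("mosaic.tif") as src:        # hypothetical planetary mosaic
    print(src.crs)                              # coordinate reference system
    print(src.transform)                        # affine pixel-to-map transform
    print(src.width, src.height, src.count)     # raster dimensions and band count
    band1 = src.read(1)                         # first band as a NumPy array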

  1. The Development of Clinical Document Standards for Semantic Interoperability in China

    PubMed Central

    Yang, Peng; Pan, Feng; Wan, Yi; Tu, Haibo; Tang, Xuejun; Hu, Jianping

    2011-01-01

    Objectives This study is aimed at developing a set of data groups (DGs) to be employed as reusable building blocks for the construction of the eight most common clinical documents used in China's general hospitals in order to achieve their structural and semantic standardization. Methods The Diagnostics knowledge framework, the related approaches taken from the Health Level Seven (HL7), the Integrating the Healthcare Enterprise (IHE), and the Healthcare Information Technology Standards Panel (HITSP) and 1,487 original clinical records were considered together to form the DG architecture and data sets. The internal structure, content, and semantics of each DG were then defined by mapping each DG data set to a corresponding Clinical Document Architecture data element and matching each DG data set to the metadata in the Chinese National Health Data Dictionary. By using the DGs as reusable building blocks, standardized structures and semantics regarding the clinical documents for semantic interoperability were able to be constructed. Results Altogether, 5 header DGs, 48 section DGs, and 17 entry DGs were developed. Several issues regarding the DGs, including their internal structure, identifiers, data set names, definitions, length and format, data types, and value sets, were further defined. Standardized structures and semantics regarding the eight clinical documents were structured by the DGs. Conclusions This approach of constructing clinical document standards using DGs is a feasible standard-driven solution useful in preparing documents possessing semantic interoperability among the disparate information systems in China. These standards need to be validated and refined through further study. PMID:22259722

  2. Supporting NEESPI with Data Services - The SIB-ESS-C e-Infrastructure

    NASA Astrophysics Data System (ADS)

    Gerlach, R.; Schmullius, C.; Frotscher, K.

    2009-04-01

    Data discovery and retrieval is commonly among the first steps performed for any Earth science study. The way scientific data is searched and accessed has changed significantly over the past two decades. In particular, the development of the World Wide Web and the technologies that evolved along with it have shortened the data discovery and data exchange process. On the other hand, the amount of data collected and distributed by Earth scientists has increased exponentially, requiring new concepts for data management and sharing. One such concept to meet this demand is to build Spatial Data Infrastructures (SDI) or e-Infrastructures. These infrastructures usually contain components for data discovery allowing users (or other systems) to query a catalogue or registry and retrieve metadata on available data holdings and services. Data access is typically granted using FTP/HTTP protocols or, more advanced, through Web Services. A Service Oriented Architecture (SOA) approach based on standardized services enables users to benefit from interoperability among different systems and to integrate distributed services into their applications. The Siberian Earth System Science Cluster (SIB-ESS-C) being established at the University of Jena (Germany) is such a spatial data infrastructure, following these principles and implementing standards published by the Open Geospatial Consortium (OGC) and the International Organization for Standardization (ISO). The prime objective is to provide researchers focusing on Siberia with the technical means for data discovery, data access, data publication and data analysis. The region of interest covers the entire Asian part of the Russian Federation from the Urals to the Pacific Ocean, including the Ob, Lena and Yenissey river catchments. The aim of SIB-ESS-C is to provide a comprehensive set of data products for Earth system science in this region. Although SIB-ESS-C will be equipped with processing capabilities for in-house data generation (mainly from Earth observation), the current data holdings of SIB-ESS-C have been created in collaboration with a number of partners in previous and ongoing research projects (e.g. SIBERIA-II, SibFORD, IRIS). At the current development stage the SIB-ESS-C system comprises a federated metadata catalogue accessible through the SIB-ESS-C Web Portal or from any OGC CSW-compliant client. Due to full interoperability with other metadata catalogues, users of the SIB-ESS-C Web Portal are able to search external metadata repositories. The Web Portal also contains a simple visualization component, which will be extended to a comprehensive visualization and analysis tool in the near future. All data products are already accessible as a Web Mapping Service and will soon be made available as Web Feature and Web Coverage Services, allowing users to directly incorporate the data into their applications. The SIB-ESS-C infrastructure will be further developed as one node in a network of similar systems (e.g. NASA GIOVANNI) in the NEESPI region.
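
    A hedged sketch of querying an OGC CSW catalogue, such as the federated SIB-ESS-C metadata catalogue described above, using the OWSLib Python client; the endpoint URL and search term are placeholders.

```python
# Sketch of a CSW metadata search with OWSLib.
from owslib.csw import CatalogueServiceWeb
from owslib.fes import PropertyIsLike

csw = CatalogueServiceWeb("https://example.org/csw")      # hypothetical CSW endpoint
query = PropertyIsLike("csw:AnyText", "%Siberia%")        # free-text constraint
csw.getrecords2(constraints=[query], maxrecords=10)
for rec_id, rec in csw.records.items():
    print(rec_id, "-", rec.title)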

  3. iTools: a framework for classification, categorization and integration of computational biology resources.

    PubMed

    Dinov, Ivo D; Rubin, Daniel; Lorensen, William; Dugan, Jonathan; Ma, Jeff; Murphy, Shawn; Kirschner, Beth; Bug, William; Sherman, Michael; Floratos, Aris; Kennedy, David; Jagadish, H V; Schmidt, Jeanette; Athey, Brian; Califano, Andrea; Musen, Mark; Altman, Russ; Kikinis, Ron; Kohane, Isaac; Delp, Scott; Parker, D Stott; Toga, Arthur W

    2008-05-28

    The advancement of the computational biology field hinges on progress in three fundamental directions--the development of new computational algorithms, the availability of informatics resource management infrastructures and the capability of tools to interoperate and synergize. There is an explosion in algorithms and tools for computational biology, which makes it difficult for biologists to find, compare and integrate such resources. We describe a new infrastructure, iTools, for managing the query, traversal and comparison of diverse computational biology resources. Specifically, iTools stores information about three types of resources--data, software tools and web-services. The iTools design, implementation and resource meta-data content reflect the broad research, computational, applied and scientific expertise available at the seven National Centers for Biomedical Computing. iTools provides a system for classification, categorization and integration of different computational biology resources across space-and-time scales, biomedical problems, computational infrastructures and mathematical foundations. A large number of resources are already iTools-accessible to the community and this infrastructure is rapidly growing. iTools includes human and machine interfaces to its resource meta-data repository. Investigators or computer programs may utilize these interfaces to search, compare, expand, revise and mine meta-data descriptions of existent computational biology resources. We propose two ways to browse and display the iTools dynamic collection of resources. The first one is based on an ontology of computational biology resources, and the second one is derived from hyperbolic projections of manifolds or complex structures onto planar discs. iTools is an open source project both in terms of the source code development as well as its meta-data content. iTools employs a decentralized, portable, scalable and lightweight framework for long-term resource management. We demonstrate several applications of iTools as a framework for integrated bioinformatics. iTools and the complete details about its specifications, usage and interfaces are available at the iTools web page http://iTools.ccb.ucla.edu.

  4. CytometryML binary data standards

    NASA Astrophysics Data System (ADS)

    Leif, Robert C.

    2005-03-01

    CytometryML is a proposed new Analytical Cytology (Cytomics) data standard, which is based on a common set of XML schemas for encoding flow cytometry and digital microscopy text-based data types (metadata). CytometryML schemas reference both DICOM (Digital Imaging and Communications in Medicine) codes and FCS keywords. Flow Cytometry Standard (FCS) list-mode has been mapped to the DICOM Waveform Information Object. The separation of the large binary data objects (list-mode and image data) from the XML description of the metadata permits the metadata to be directly displayed, analyzed, and reported with standard commercial software packages; the direct use of XML languages; and direct interfacing with clinical information systems. The separation of the binary data into its own files simplifies parsing because all extraneous header data has been eliminated. The storage of images as two-dimensional arrays without any extraneous data, such as in the Adobe Photoshop RAW format, facilitates the development by scientists of their own analysis and visualization software. Adobe Photoshop provided the display infrastructure and the translation facility to interconvert between image data in commercial formats and the RAW format. Similarly, the storage and parsing of list-mode binary data with a group of parameters that are specified at compilation time is straightforward. However, when the user is permitted at run-time to select a subset of the parameters and/or to specify the results of mathematical manipulations, the development of special software was required. The use of CytometryML will permit investigators to create their own interoperable data analysis software and to employ commercially available software to disseminate their data.
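
    The "raw two-dimensional array plus separate metadata" idea described above can be sketched in a few lines of Python with NumPy; the array shape, dtype and file names are illustrative only.

```python
# Sketch: store image pixels as a headerless binary file, keep the metadata
# needed to reread them (shape, dtype) in a separate text/XML description.
import numpy as np

image = np.random.randint(0, 4096, size=(512, 512), dtype=np.uint16)  # e.g. a 12-bit image
image.tofile("image.raw")                       # pure pixel data, no header

meta = {"rows": 512, "cols": 512, "dtype": "uint16"}   # would travel in the XML metadata

restored = np.fromfile("image.raw", dtype=meta["dtype"]).reshape(meta["rows"], meta["cols"])
assert np.array_equal(image, restored)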

  5. DialysisNet: Application for Integrating and Management Data Sources of Hemodialysis Information by Continuity of Care Record.

    PubMed

    Ku, Ho Suk; Kim, Sungho; Kim, HyeHyeon; Chung, Hee-Joon; Park, Yu Rang; Kim, Ju Han

    2014-04-01

    Health Avatar Beans was developed for the management of chronic kidney disease and end-stage renal disease (ESRD). This article is about the DialysisNet system in Health Avatar Beans for the seamless management of ESRD based on the personal health record. For hemodialysis data modeling, we identified common data elements for hemodialysis information (CDEHI). We used the ASTM Continuity of Care Record (CCR) and ISO/IEC 11179 as the method for compliance with a standard model for the CDEHI. According to the contents of the ASTM CCR, we mapped the CDEHI to the contents and created the metadata from that. It was transformed and parsed into the database and verified against the ASTM CCR/XML schema definition (XSD). DialysisNet was created as an iPad application. The contents of the CDEHI were categorized for effective management. For the evaluation of information transfer, we used CarePlatform, which was developed for data access. The metadata of the CDEHI in DialysisNet was exchanged by the CarePlatform with semantic interoperability. The CDEHI was separated into a content list for individual patient data, a content list for hemodialysis center data, a consultation and transfer form, and clinical decision support data. After matching to the CCR, the CDEHI was transformed to metadata, which was in turn transformed to XML and validated against the ASTM CCR/XSD. DialysisNet gives specific consideration to visualization, graphics, images, statistics, and the database. We created the DialysisNet application, which can integrate and manage data sources for hemodialysis information based on CCR standards.
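
    A minimal sketch of the XSD-validation step described above, using lxml; the schema and instance file names are placeholders, not artefacts distributed with the paper.

```python
# Sketch: validate an exported XML record against an XSD with lxml.
from lxml import etree

schema = etree.XMLSchema(etree.parse("CCR.xsd"))     # hypothetical ASTM CCR schema file
doc = etree.parse("dialysis_record.xml")             # hypothetical exported record

if schema.validate(doc):
    print("document conforms to the schema")
else:
    for error in schema.error_log:                   # line numbers and messages of violations
        print(error.line, error.message)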

  6. SeaDataNet II - Second phase of developments for the pan-European infrastructure for marine and ocean data management

    NASA Astrophysics Data System (ADS)

    Schaap, Dick M. A.; Fichaut, Michele

    2013-04-01

    The second phase of the SeaDataNet project started in October 2011 for another 4 years, with the aim to upgrade the SeaDataNet infrastructure built during previous years. The numbers of the project are quite impressive: 59 institutions from 35 different countries are involved. In particular, 45 data centres are sharing human and financial resources in a common effort to sustain an operationally robust and state-of-the-art pan-European infrastructure for providing up-to-date and high-quality access to ocean and marine metadata, data and data products. The main objective of SeaDataNet II is to improve operations and to progress towards an efficient data management infrastructure able to handle the diversity and large volume of data collected via the pan-European oceanographic fleet and the new observation systems, both in real time and in delayed mode. The infrastructure is based on a semi-distributed system that incorporates and enhances the existing network of NODCs. SeaDataNet aims at serving users from science, environmental management, policy making, and economic sectors. Better integrated data systems are vital for these users to achieve improved scientific research and results, to support marine environmental and integrated coastal zone management, to establish indicators of Good Environmental Status for sea basins, and to support offshore industry developments, shipping, fisheries, and other economic activities. The recent EU communication "MARINE KNOWLEDGE 2020 - marine data and observation for smart and sustainable growth" states that the creation of marine knowledge begins with observation of the seas and oceans. In addition, directives, policies and science programmes require reporting of the state of the seas and oceans in an integrated pan-European manner: of particular note are INSPIRE, MSFD, WISE-Marine and the GMES Marine Core Service. These underpin the importance of a well-functioning marine and ocean data management infrastructure. SeaDataNet is now one of the major players in informatics in oceanography, and collaborative relationships have been created with other EU and non-EU projects. In particular, SeaDataNet has recognised roles in the continuous serving of common vocabularies, the provision of tools for data management, and the provision of access to metadata, data sets and data products of importance for society. The SeaDataNet infrastructure comprises a network of interconnected data centres and a central SeaDataNet portal. The portal provides users not only with background information about SeaDataNet and the various SeaDataNet standards and tools, but also with a unified and transparent overview of the metadata and controlled access to the large collections of data sets managed by the interconnected data centres. The presentation will give information on the present services of the SeaDataNet infrastructure and highlight a number of key achievements in SeaDataNet II so far.

  7. SeaDataCloud - further developing the pan-European SeaDataNet infrastructure for marine and ocean data management

    NASA Astrophysics Data System (ADS)

    Schaap, Dick M. A.; Fichaut, Michele

    2017-04-01

    SeaDataCloud marks the third phase of developing the pan-European SeaDataNet infrastructure for marine and ocean data management. The SeaDataCloud project is funded by EU and runs for 4 years from 1st November 2016. It succeeds the successful SeaDataNet II (2011 - 2015) and SeaDataNet (2006 - 2011) projects. SeaDataNet has set up and operates a pan-European infrastructure for managing marine and ocean data and is undertaken by National Oceanographic Data Centres (NODC's) and oceanographic data focal points from 34 coastal states in Europe. The infrastructure comprises a network of interconnected data centres and central SeaDataNet portal. The portal provides users a harmonised set of metadata directories and controlled access to the large collections of datasets, managed by the interconnected data centres. The population of directories has increased considerably in cooperation with and involvement in many associated EU projects and initiatives such as EMODnet. SeaDataNet at present gives overview and access to more than 1.9 million data sets for physical oceanography, chemistry, geology, geophysics, bathymetry and biology from more than 100 connected data centres from 34 countries riparian to European seas. SeaDataNet is also active in setting and governing marine data standards, and exploring and establishing interoperability solutions to connect to other e-infrastructures on the basis of standards of ISO (19115, 19139), and OGC (WMS, WFS, CS-W and SWE). Standards and associated SeaDataNet tools are made available at the SeaDataNet portal for wide uptake by data handling and managing organisations. SeaDataCloud aims at further developing standards, innovating services & products, adopting new technologies, and giving more attention to users. Moreover, it is about implementing a cooperation between the SeaDataNet consortium of marine data centres and the EUDAT consortium of e-infrastructure service providers. SeaDataCloud aims at considerably advancing services and increasing their usage by adopting cloud and High Performance Computing technology. SeaDataCloud will empower researchers with a packaged collection of services and tools, tailored to their specific needs, supporting research and enabling generation of added-value products from marine and ocean data. Substantial activities will be focused on developing added-value services, such as data subsetting, analysis, visualisation, and publishing workflows for users, both regular and advanced users, as part of a Virtual Research Environment (VRE). SeaDataCloud aims at a number of leading user communities that have new challenges for upgrading and expanding the SeaDataNet standards and services: Science, EMODnet, Copernicus Marine Environmental Monitoring Service (CMEMS) and EuroGOOS, and International scientific programmes. The presentation will give information on present services of the SeaDataNet infrastructure and services, and the new challenges in SeaDataCloud, and will highlight a number of key achievements in SeaDataCloud so far.

  8. Harmonising and semantically linking key variables from in-situ observing networks of an Integrated Atlantic Ocean Observing System, AtlantOS

    NASA Astrophysics Data System (ADS)

    Darroch, Louise; Buck, Justin

    2017-04-01

    Atlantic Ocean observation is currently undertaken through loosely coordinated in-situ observing networks, satellite observations and data management arrangements at regional, national and international scales. The EU Horizon 2020 AtlantOS project aims to deliver an advanced framework for the development of an Integrated Atlantic Ocean Observing System that strengthens the Global Ocean Observing System (GOOS) and contributes to the aims of the Galway Statement on Atlantic Ocean Cooperation. One goal is to ensure that data from different and diverse in-situ observing networks are readily accessible and useable to a wider community, including the international ocean science community and other stakeholders in this field. To help achieve this goal, the British Oceanographic Data Centre (BODC) produced a parameter matrix to harmonise data exchange, data flow and data integration for the key variables acquired by multiple in-situ AtlantOS observing networks such as ARGO, Seafloor Mapping and OceanSITES. Our solution used semantic linking of controlled vocabularies and metadata for parameters that were "mappable" to existing EU and international standard vocabularies. An AtlantOS Essential Variables list of terms (aggregated level), based on Global Climate Observing System (GCOS) Essential Climate Variables (ECVs), GOOS Essential Ocean Variables (EOVs) and other key network variables, was defined and published on the Natural Environment Research Council (NERC) Vocabulary Server (version 2.0) as collection A05 (http://vocab.nerc.ac.uk/collection/A05/current/). This new vocabulary was semantically linked to standardised metadata for observed properties and units that had been validated by the AtlantOS community: SeaDataNet parameters (P01), Climate and Forecast (CF) Standard Names (P07) and SeaDataNet units (P06). Observed properties were mapped to biological entities using the internationally assured AphiaID identifiers from the World Register of Marine Species (WoRMS), http://www.marinespecies.org/aphia.php?p=webservice. The AtlantOS parameter matrix offers a way to harmonise the globally important variables (such as ECVs and EOVs) from in-situ observing networks that use different flavours of exchange formats based on SeaDataNet and CF parameter metadata. It also offers a way to standardise data in the wider Integrated Ocean Observing System. It uses sustainable and trusted standardised vocabularies that are governed by internationally renowned and long-standing organisations and is interoperable through the use of persistent resource identifiers, such as URNs and PURLs. It is the first step towards integrating and serving data in a variety of international exchange formats using application programming interfaces (APIs), improving both data discoverability and utility for users.
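
    The A05 collection cited above can be retrieved programmatically; the sketch below uses rdflib and assumes the NERC Vocabulary Server answers RDF content negotiation for that URL.

```python
# Sketch: fetch the AtlantOS Essential Variables collection (A05) from the
# NERC Vocabulary Server and list the SKOS preferred labels of its concepts.
from rdflib import Graph
from rdflib.namespace import SKOS

g = Graph()
g.parse("http://vocab.nerc.ac.uk/collection/A05/current/")  # assumes RDF content negotiation
for concept, label in g.subject_objects(SKOS.prefLabel):
    print(concept, "-", label)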

  9. The Index to Marine and Lacustrine Geological Samples (IMLGS): Linking Digital Data to Physical Samples for the Marine Community

    NASA Astrophysics Data System (ADS)

    Stroker, K. J.; Jencks, J. H.; Eakins, B.

    2016-12-01

    The Index to Marine and Lacustrine Geological Samples (IMLGS) is a community designed and maintained resource enabling researchers to locate and request seafloor and lakebed geologic samples curated by partner institutions. The Index was conceived at the dawn of the digital age by representatives from U.S. academic and government marine core repositories and the NOAA National Geophysical Data Center, now the National Centers for Environmental Information (NCEI), at a 1977 meeting convened by the National Science Foundation (NSF). The Index is based on core concepts of community oversight, common vocabularies, consistent metadata and a shared interface. The Curators Consortium, international in scope, meets biennially to share ideas and discuss best practices. NCEI serves the group by providing database access and maintenance, a list server, digitizing support and long-term archival of sample metadata, data and imagery. Over three decades, participating curators have performed the laborious task of creating and contributing metadata for over 205,000 seafloor and lakebed cores, grabs, and dredges archived in their collections. Some partners use the Index for primary web access to their collections while others use it to increase exposure of more in-depth institutional systems. The IMLGS has a persistent URL/Digital Object Identifier (DOI), as well as DOIs assigned to partner collections for citation and to provide a persistent link to curator collections. The Index is currently a geospatially-enabled relational database, publicly accessible via Web Feature and Web Map Services, and via text- and ArcGIS map-based web interfaces. To provide as much knowledge as possible about each sample, the Index includes curatorial contact information and links to related data, information and images: 1) at participating institutions, 2) in the NCEI archive, and 3) through a Linked Data interface maintained by the Rolling Deck to Repository (R2R). Over 43,000 International GeoSample Numbers (IGSNs) linking to the System for Earth Sample Registration (SESAR) are included in anticipation of opportunities for interconnectivity with Integrated Earth Data Applications (IEDA) systems. The paper will discuss the database with the goal of increasing the connections and links to related data at partner institutions.

  10. Best Practices for Preparing Interoperable Geospatial Data

    NASA Astrophysics Data System (ADS)

    Wei, Y.; Santhana Vannan, S.; Cook, R. B.; Wilson, B. E.; Beaty, T. W.

    2010-12-01

    Geospatial data is critically important for a wide scope of research and applications: the carbon cycle and ecosystems, climate change, land use and urban planning, environmental protection, etc. Geospatial data is created by different organizations using different methods, including remote sensing observations, field surveys and model simulations, and is stored in various formats. Geospatial data is therefore diverse and heterogeneous, which creates a major barrier to sharing and using it, especially when targeting a broad user community. Many efforts have been made to address different aspects of using geospatial data by improving its interoperability. For example, the specification for Open Geospatial Consortium (OGC) catalog services defines a standard way for geospatial information discovery, while OGC Web Coverage Services (WCS) and OPeNDAP define interoperable protocols for geospatial data access. But the reality is that having standard mechanisms for data discovery and access is not enough: the geospatial data content itself has to be organized in standard, easily understandable, and readily usable formats. The Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC) archives data and information relevant to biogeochemical dynamics, ecological data, and environmental processes. The Modeling and Synthesis Thematic Data Center (MAST-DC) prepares and distributes both input data and output data of carbon cycle models and provides data support for synthesis and terrestrial model inter-comparison at multiple scales. Both of these NASA-funded data centers compile and distribute a large amount of diverse geospatial data and have broad user communities, including GIS users, Earth science researchers, and ecosystem modeling teams. The ORNL DAAC and MAST-DC address this geospatial data interoperability issue by standardizing the data content and feeding it into a well-designed Spatial Data Infrastructure (SDI), which provides interoperable mechanisms to advertise, visualize, and distribute the standardized geospatial data. In this presentation, we summarize the lessons learned and the best practices for geospatial data standardization. The presentation will describe how diverse and historical data archived in the ORNL DAAC were converted into standard and non-proprietary formats; what tools were used to make the conversion; how the spatial and temporal information are properly captured in a consistent manner; how to name a data file or a variable to make it both human-friendly and semantically interoperable; how the NetCDF file format and CF conventions can promote data usage in the ecosystem modeling user community; how standardized geospatial data can be fed into OGC Web Services to support on-demand data visualization and access; and how metadata should be collected and organized so that they can be discovered through standard catalog services.
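
    As a concrete illustration of the NetCDF/CF practice discussed above, the sketch below writes a tiny CF-style file with the netCDF4 Python library; the variable names, units and values are illustrative only.

```python
# Sketch: write a minimal CF-style NetCDF file (time axis plus one variable).
import numpy as np
from netCDF4 import Dataset

with Dataset("example_cf.nc", "w", format="NETCDF4") as nc:
    nc.Conventions = "CF-1.6"                      # global attribute
    nc.createDimension("time", 3)

    t = nc.createVariable("time", "f8", ("time",))
    t.units = "days since 2000-01-01 00:00:00"
    t.standard_name = "time"
    t[:] = [0.0, 1.0, 2.0]

    sst = nc.createVariable("sst", "f4", ("time",), fill_value=-999.0)
    sst.units = "K"
    sst.standard_name = "sea_surface_temperature"  # CF standard name enables interoperability
    sst[:] = np.array([290.1, 290.4, 289.8], dtype="f4")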

  11. A case for user-generated sensor metadata

    NASA Astrophysics Data System (ADS)

    Nüst, Daniel

    2015-04-01

    Cheap and easy-to-use sensing technology and new developments in ICT towards a global network of sensors and actuators promise previously unthought-of changes for our understanding of the environment. Large professional as well as amateur sensor networks exist, and they are used for specific yet diverse applications across domains such as hydrology, meteorology or early warning systems. However, the impact this "abundance of sensors" has had so far is somewhat disappointing. There is a gap between (community-driven) sensor networks that could provide very useful data and the users of the data. In our presentation, we argue that this is due to a lack of metadata which allows determining the fitness for use of a dataset. Approaches to syntactic and semantic interoperability for sensor webs have made great progress and continue to be an active field of research, yet they are often quite complex, which is of course due to the complexity of the problem at hand. Still, we see the most generic information for determining fitness for use as a dataset's provenance, because it allows users to make up their own minds independently of existing classification schemes for data quality. In this work we make the case that curated user-contributed metadata has the potential to improve this situation. This especially applies to scenarios in which an observed property is applicable in different domains, and to set-ups where the understanding of metadata concepts and (meta-)data quality differs between data provider and user. On the one hand, a citizen does not understand ISO provenance metadata; on the other hand, a researcher might find issues in publicly accessible time series published by citizens, which the latter might not be aware of or care about. Because users will have to determine fitness for use for each application on their own anyway, we suggest an online collaboration platform for user-generated metadata based on an extremely simplified data model. In its most basic fashion, metadata generated by users can be boiled down to a basic property of the World Wide Web: many information items, such as news or blog posts, allow users to create comments and rate the content. Therefore we argue for focusing a core data model on one text field for a textual comment, one optional numerical field for a rating, and a resolvable identifier for the dataset that is commented on. We present a conceptual framework that integrates user comments into existing standards and relevant applications of online sensor networks, and we discuss possible approaches such as linked data, brokering, or standalone metadata portals. We relate this framework to existing work on user-generated content, such as proprietary rating systems on commercial websites, microformats, the GeoViQua User Quality Model, the CHARMe annotations, and the W3C Open Annotation model. These systems are also explored for commonalities, and based on their very useful concepts and ideas we present an outline for future extensions of the minimal model. Building on this framework, we present a concept of how a simple comment-and-rating system can be extended to capture provenance information for spatio-temporal observations in the sensor web, and how this framework can be evaluated.
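
    The minimal data model argued for above (one free-text comment, one optional rating, one resolvable dataset identifier) can be written down directly; the sketch below is one possible Python rendering, with a placeholder identifier.

```python
# One possible rendering of the minimal user-generated metadata model.
from dataclasses import dataclass
from typing import Optional

@dataclass
class DatasetComment:
    dataset_id: str               # resolvable identifier (e.g. a URI) of the dataset
    comment: str                  # free-text user comment
    rating: Optional[int] = None  # optional numerical rating

note = DatasetComment(
    dataset_id="https://example.org/timeseries/123",   # placeholder identifier
    comment="Gauge was relocated in 2012; earlier values are not comparable.",
    rating=3,
)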

  12. Integrating sea floor observatory data: the EMSO data infrastructure

    NASA Astrophysics Data System (ADS)

    Huber, Robert; Azzarone, Adriano; Carval, Thierry; Doumaz, Fawzi; Giovanetti, Gabriele; Marinaro, Giuditta; Rolin, Jean-Francois; Beranzoli, Laura; Waldmann, Christoph

    2013-04-01

    The European research infrastructure EMSO is a European network of fixed-point, deep-seafloor and water-column observatories deployed at key sites of the European continental margin and the Arctic. It aims to provide the technological and scientific framework for the investigation of the environmental processes related to the interaction between the geosphere, biosphere, and hydrosphere, and for their sustainable management through long-term monitoring, including real-time data transmission. EMSO has been on the ESFRI (European Strategy Forum on Research Infrastructures) roadmap since 2006 and entered its construction phase in 2012. Within this framework, EMSO is contributing to large infrastructure integration projects such as ENVRI and COOPEUS. The EMSO infrastructure is geographically distributed across key sites in European waters, spanning from the Arctic, through the Atlantic and the Mediterranean Sea, to the Black Sea. It presently consists of thirteen sites which have been identified by the scientific community according to their importance with respect to marine ecosystems, climate change and marine geohazards. The data infrastructure for EMSO is being designed as a distributed system. Presently, EMSO data collected during experiments at each EMSO site are locally stored and organized in catalogues or relational databases run by the responsible regional EMSO nodes. Three major institutions and their data centres are currently offering access to EMSO data: PANGAEA, INGV and IFREMER. In continuation of the IT activities performed during EMSO's twin project ESONET, EMSO is now implementing the ESONET data architecture within an operational EMSO data infrastructure. EMSO aims to be compliant with relevant marine initiatives such as MyOceans, EUROSITES, EuroARGO, SEADATANET and EMODNET, as well as to meet the requirements of international and interdisciplinary projects such as COOPEUS and ENVRI, EUDAT and iCORDI. A major focus is therefore set on standardization and interoperability of the EMSO data infrastructure. In addition to common standards for metadata exchange such as OpenSearch or OAI-PMH, EMSO has chosen to implement core standards of the Open Geospatial Consortium (OGC) Sensor Web Enablement (SWE) suite, such as Catalogue Service for the Web (CS-W), Sensor Observation Service (SOS) and Observations and Measurements (O&M). Further, strong integration efforts are currently under way to harmonize data formats (e.g. NetCDF) as well as the ontologies and terminologies used. The presentation will also give users information about the discovery and visualization procedures for the EMSO data presently available.

  13. The ESPAS e-infrastructure: Access to data from near-Earth space

    NASA Astrophysics Data System (ADS)

    Belehaki, Anna; James, Sarah; Hapgood, Mike; Ventouras, Spiros; Galkin, Ivan; Lembesis, Antonis; Tsagouri, Ioanna; Charisi, Anna; Spogli, Luca; Berdermann, Jens; Häggström, Ingemar; ESPAS Consortium

    2016-10-01

    ESPAS, the "near-Earth space data infrastructure for e-science", is a data e-infrastructure facilitating discovery of and access to observations, both ground-based and space-borne, and to model predictions of the near-Earth space environment, a region extending from the Earth's atmosphere up to the outer radiation belts. ESPAS provides access to metadata and/or data from an extended network of data providers distributed globally. The interoperability of the heterogeneous data collections is achieved with the adoption and adaptation of the ESPAS data model, which is built entirely on ISO 19100 series geographic information standards. The ESPAS data portal manages a vocabulary of space physics keywords that can be used to narrow down data searches to observations of specific physical content. Such content-targeted search is an ESPAS innovation provided in addition to the commonly practiced data selection by time, location, and instrument. The article presents an overview of the architectural design of the ESPAS system, of its data model and ontology, and of the interoperable services that allow the discovery, access and download of registered data. Emphasis is given to the standardization and expandability concepts, which also represent the main elements supporting the building of long-term sustainability activities of the ESPAS e-infrastructure.

  14. Articulation Management for Intelligent Integration of Information

    NASA Technical Reports Server (NTRS)

    Maluf, David A.; Tran, Peter B.; Clancy, Daniel (Technical Monitor)

    2001-01-01

    When combining data from distinct sources, there is a need to share meta-data and other knowledge about the various source domains. Due to semantic inconsistencies and heterogeneity of representations, problems arise when multiple domains are merged. Knowledge that is irrelevant to the task of interoperation will be included, making the result unnecessarily complex. This heterogeneity problem can be eliminated by mediating the conflicts and managing the intersections of the domains. For interoperation and intelligent access to heterogeneous information, the focus is on the intersection of the knowledge, since the intersection defines the required articulation rules. An algebra over domains has been proposed that uses articulation rules to support disciplined manipulation of domain knowledge resources. The objective of a domain algebra is to provide the capability for interrogating many domain knowledge resources, which are largely semantically disjoint. The algebra formally supports the tasks of selecting, combining, extending, specializing, and modifying components from a diverse set of domains. This paper presents a domain algebra and demonstrates the use of articulation rules to link declarative interfaces for Internet and enterprise applications. In particular, it discusses the articulation implementation as part of a production system capable of operating over the domain described by the IDL (interface description language) of objects registered in multiple CORBA servers.
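
    As a toy illustration of the articulation idea described above (not the paper's algebra or its CORBA/IDL implementation), the sketch below represents articulation rules as explicit term pairs linking two small domain vocabularies and keeps only the articulated intersection.

```python
# Toy sketch: articulation rules as declared term equivalences between two domains.
domain_a = {"vessel": "a ship used for observation", "cast": "one lowering of an instrument"}
domain_b = {"platform": "carrier of a sensor", "deployment": "one placement of a sensor"}

# Articulation rules: pairs of terms declared equivalent across the two domains.
articulation_rules = {("vessel", "platform"), ("cast", "deployment")}

def intersect(rules, a, b):
    """Keep only the knowledge reachable through the articulation rules."""
    return {(ta, tb): (a[ta], b[tb]) for ta, tb in rules if ta in a and tb in b}

print(intersect(articulation_rules, domain_a, domain_b))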

  15. Providing interoperability of eHealth communities through peer-to-peer networks.

    PubMed

    Kilic, Ozgur; Dogac, Asuman; Eichelberg, Marco

    2010-05-01

    Providing an interoperability infrastructure for Electronic Healthcare Records (EHRs) is on the agenda of many national and regional eHealth initiatives. Two important integration profiles have been specified for this purpose, namely, the "Integrating the Healthcare Enterprise (IHE) Cross-enterprise Document Sharing (XDS)" and the "IHE Cross Community Access (XCA)." IHE XDS describes how to share EHRs in a community of healthcare enterprises and IHE XCA describes how EHRs are shared across communities. However, the current version of the IHE XCA integration profile does not address some of the important challenges of cross-community exchange environments. The first challenge is scalability. If every community that joins the network needs to connect to every other community, i.e., a pure peer-to-peer network, this solution will not scale. Furthermore, each community may use a different coding vocabulary for the same metadata attribute, in which case, the target community cannot interpret the query involving such an attribute. Yet another important challenge is that each community may (and typically will) have a different patient identifier domain. Querying for the patient identifiers in the target community using patient demographic data may create patient privacy concerns. In this paper, we address each of these challenges and show how they can be handled effectively in a superpeer-based peer-to-peer architecture.

  16. Postmarketing Safety Study Tool: A Web Based, Dynamic, and Interoperable System for Postmarketing Drug Surveillance Studies

    PubMed Central

    Sinaci, A. Anil; Laleci Erturkmen, Gokce B.; Gonul, Suat; Yuksel, Mustafa; Invernizzi, Paolo; Thakrar, Bharat; Pacaci, Anil; Cinar, H. Alper; Cicekli, Nihan Kesim

    2015-01-01

    Postmarketing drug surveillance is a crucial aspect of the clinical research activities in pharmacovigilance and pharmacoepidemiology. Successful utilization of available Electronic Health Record (EHR) data can complement and strengthen postmarketing safety studies. In terms of the secondary use of EHRs, access and analysis of patient data across different domains are a critical factor; we address this data interoperability problem between EHR systems and clinical research systems in this paper. We demonstrate that this problem can be solved in an upper level with the use of common data elements in a standardized fashion so that clinical researchers can work with different EHR systems independently of the underlying information model. Postmarketing Safety Study Tool lets the clinical researchers extract data from different EHR systems by designing data collection set schemas through common data elements. The tool interacts with a semantic metadata registry through IHE data element exchange profile. Postmarketing Safety Study Tool and its supporting components have been implemented and deployed on the central data warehouse of the Lombardy region, Italy, which contains anonymized records of about 16 million patients with over 10-year longitudinal data on average. Clinical researchers in Roche validate the tool with real life use cases. PMID:26543873

  17. Content Model Use and Development to Redeem Thin Section Records

    NASA Astrophysics Data System (ADS)

    Hills, D. J.

    2014-12-01

    The National Geothermal Data System (NGDS) is a catalog of documents and datasets that provide information about geothermal resources located primarily within the United States. The goal of NGDS is to make large quantities of geothermal-relevant geoscience data available to the public by creating a national, sustainable, distributed, and interoperable network of data providers. The Geological Survey of Alabama (GSA) has been a data provider in the initial phase of NGDS. One method by which NGDS facilitates interoperability is through the use of content models. Content models provide a schema (structure) for submitted data. Schemas dictate where and how data should be entered. Content models use templates that simplify data formatting to expedite use by data providers. These methodologies implemented by NGDS can extend beyond geothermal data to all geoscience data. The GSA, using the NGDS physical samples content model, has tested and refined a content model for thin sections and thin section photos. Countless thin sections have been taken from oil and gas well cores housed at the GSA, and many of those thin sections have related photomicrographs. Record keeping for these thin sections has been scattered at best, and it is critical to capture their metadata while the content creators are still available. A next step will be to register the GSA's thin sections with SESAR (System for Earth Sample Registration) and assign an IGSN (International Geo Sample Number) to each thin section. Additionally, the thin section records will be linked to the GSA's online record database. When complete, the GSA's thin sections will be more readily discoverable and have greater interoperability. Moving forward, the GSA is implementing use of NGDS-like content models and registration with SESAR and IGSN to improve collection maintenance and management of additional physical samples.

  18. Seven recommendations to make your invasive alien species data more useful

    USGS Publications Warehouse

    Groom, Quentin J.; Adriaens, Tim; Desmet, Peter; Simpson, Annie; De Wever, Aaike; Bazos, Ioannis; Cardoso, Ana Cristina; Charles, Lucinda; Christopoulou, Anastasia; Gazda, Anna; Helmisaari, Harry; Hobern, Donald; Josefsson, Melanie; Lucy, Frances; Marisavljevic, Dragana; Oszako, Tomasz; Pergl, Jan; Petrovic-Obradovic, Olivera; Prévot, Céline; Ravn, Hans Peter; Richards, Gareth; Roques, Alain; Roy, Helen; Rozenberg, Marie-Anne A.; Scalera, Riccardo; Tricarico, Elena; Trichkova, Teodora; Vercayie, Diemer; Zenetos, Argyro; Vanderhoeven, Sonia

    2017-01-01

    Science-based strategies to tackle biological invasions depend on recent, accurate, well-documented, standardized and openly accessible information on alien species. Currently and historically, biodiversity data are scattered in numerous disconnected data silos that lack interoperability. The situation is no different for alien species data, and this obstructs efficient retrieval, combination, and use of these kinds of information for research and policy-making. Standardization and interoperability are particularly important as many alien species related research and policy activities require pooling data. We describe seven ways that data on alien species can be made more accessible and useful, based on the results of a European Cooperation in Science and Technology (COST) workshop: (1) Create data management plans; (2) Increase interoperability of information sources; (3) Document data through metadata; (4) Format data using existing standards; (5) Adopt controlled vocabularies; (6) Increase data availability; and (7) Ensure long-term data preservation. We identify four properties specific and integral to alien species data (species status, introduction pathway, degree of establishment, and impact mechanism) that are either missing from existing data standards or lack a recommended controlled vocabulary. Improved access to accurate, real-time and historical data will repay the long-term investment in data management infrastructure, by providing more accurate, timely and realistic assessments and analyses. If we improve core biodiversity data standards by developing their relevance to alien species, it will allow the automation of common activities regarding data processing in support of environmental policy. Furthermore, we call for considerable effort to maintain, update, standardize, archive, and aggregate datasets, to ensure proper valorization of alien species data and information before they become obsolete or lost.

  19. Distributed Earth observation data integration and on-demand services based on a collaborative framework of geospatial data service gateway

    NASA Astrophysics Data System (ADS)

    Xie, Jibo; Li, Guoqing

    2015-04-01

    Earth observation (EO) data obtained by airborne or space-borne sensors are heterogeneous and geographically distributed in storage. These data sources belong to different organizations or agencies whose data management and storage methods are quite different and geographically distributed. Different data sources provide different data publication platforms or portals. As more remote sensing sensors are used for Earth observation missions, different space agencies have archived massive volumes of EO data in distributed systems. The distribution of EO data archives and the heterogeneity of the systems make it difficult to efficiently use geospatial data for many EO applications, such as hazard mitigation. To solve the interoperability problems of different EO data systems, this paper introduces an advanced architecture for a distributed geospatial data infrastructure to address the complexity of integrating and processing distributed, heterogeneous EO data on demand. The concept and architecture of a geospatial data service gateway (GDSG) is proposed to connect heterogeneous EO data sources so that EO data can be retrieved and accessed through unified interfaces. The GDSG consists of a set of tools and services to encapsulate heterogeneous geospatial data sources into homogeneous service modules. The GDSG modules include EO metadata harvesters and translators, adaptors for different types of data systems, unified data query and access interfaces, EO data cache management, and a gateway GUI. The GDSG framework is used to implement interoperability and synchronization between distributed EO data sources with heterogeneous architectures. An on-demand distributed EO data platform was developed to validate the GDSG architecture and implementation techniques. Several distributed EO data archives are used for testing. Flood and earthquake response serve as two scenarios for the use cases of distributed EO data integration and interoperability.

  20. Big Data Discovery and Access Services through NOAA OneStop

    NASA Astrophysics Data System (ADS)

    Casey, K. S.; Neufeld, D.; Ritchey, N. A.; Relph, J.; Fischman, D.; Baldwin, R.

    2017-12-01

    The NOAA OneStop Project was created as a pathfinder effort to improve the discovery of, access to, and usability of NOAA's vast and diverse collection of big data. OneStop is led by the NOAA/NESDIS National Centers for Environmental Information (NCEI) and is seen as a key NESDIS contribution to NOAA's open data and data stewardship efforts. OneStop consists of an entire framework of services, from storage and interoperable access services at the base, through metadata and catalog services in the middle, to a modern user interface experience at the top. Importantly, it is an open framework where external tools and services can connect at whichever level is most appropriate. Since the beta release of the OneStop user interface at the 2016 Fall AGU meeting, significant progress has been made improving and modernizing many NOAA data collections to optimize their use within the framework. In addition, OneStop has made progress implementing robust metadata management and catalog systems at the collection and granule level and improving the user experience with the web interface. This progress will be summarized, and the results of extensive user testing, including professional usability studies, will be reviewed. Key big data technologies supporting the framework will be presented, and community input will be sought on the future directions of the OneStop Project.

  1. Co-occurrence correlations of heavy metals in sediments revealed using network analysis.

    PubMed

    Liu, Lili; Wang, Zhiping; Ju, Feng; Zhang, Tong

    2015-01-01

    In this study, correlation-based network analysis was used to identify co-occurrence correlations among metals in the marine sediments of Hong Kong, based on long-term (1991 to 2011) temporal and spatial monitoring data. Fourteen of the 45 marine sediment monitoring stations were selected from three representative areas: Deep Bay, Victoria Harbour and Mirs Bay. Spearman's rank correlation-based network analysis was first conducted to identify the co-occurrence correlations of metals from the raw metadata, and then repeated using the normalized metadata. The correlation patterns obtained from the network were consistent with those obtained by other statistical normalization methods, including annual ratios, the R-squared coefficient and the Pearson correlation coefficient. Both Deep Bay and Victoria Harbour have been polluted by heavy metals, especially Pb and Cu, which showed strong co-occurrence with other heavy metals (e.g. Cr, Ni and Zn) and little correlation with the reference parameters (Fe or Al). For Mirs Bay, which has better marine sediment quality than Deep Bay and Victoria Harbour, the co-occurrence patterns revealed by network analysis indicated that the metals in the sediment predominantly followed natural geographic processes. Beyond its wide applications in biology, sociology and informatics, this is the first application of network analysis to environmental pollution research. This study demonstrated its power for revealing the co-occurrence correlations among heavy metals in marine sediments, and the approach could be further applied to other pollutants in various environmental systems. Copyright © 2014 Elsevier Ltd. All rights reserved.
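
    A minimal sketch of the kind of correlation network described above, assuming synthetic concentration data and an arbitrary correlation threshold of 0.7; the study's actual data, normalization steps and thresholds are not reproduced here.

    ```python
    import numpy as np
    import networkx as nx
    from scipy.stats import spearmanr

    rng = np.random.default_rng(0)

    # Synthetic stand-in for a sample-by-metal concentration table (rows = samples).
    metals = ["Pb", "Cu", "Cr", "Ni", "Zn", "Fe", "Al"]
    base = rng.normal(size=(200, 1))
    data = np.hstack([base + 0.3 * rng.normal(size=(200, 1)) for _ in metals[:5]] +
                     [rng.normal(size=(200, 2))])          # Fe, Al left uncorrelated here

    # Pairwise Spearman rank correlations among all metals.
    rho, _ = spearmanr(data)

    # Build a co-occurrence network: an edge links metals whose |rho| exceeds a cut-off.
    G = nx.Graph()
    G.add_nodes_from(metals)
    threshold = 0.7   # illustrative choice, not the study's value
    for i in range(len(metals)):
        for j in range(i + 1, len(metals)):
            if abs(rho[i, j]) >= threshold:
                G.add_edge(metals[i], metals[j], weight=round(float(rho[i, j]), 2))

    print(sorted(G.edges(data="weight")))
    ```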

  2. Seabed photographs, sediment texture analyses, and sun-illuminated sea floor topography in the Stellwagen Bank National Marine Sanctuary region off Boston, Massachusetts

    USGS Publications Warehouse

    Valentine, Page C.; Gallea, Leslie B.; Blackwood, Dann S.; Twomey, Erin R.

    2010-01-01

    The U.S. Geological Survey, in collaboration with the National Oceanic and Atmospheric Administration's National Marine Sanctuary Program, conducted seabed mapping and related research in the Stellwagen Bank National Marine Sanctuary region from 1993 to 2004. The mapped area is approximately 3,700 km² (1,100 nmi²) in size and was subdivided into 18 quadrangles. An extensive series of sea-floor maps of the region based on multibeam sonar surveys has been published as paper maps and online in digital format (PDF, EPS, PS). In addition, 2,628 seabed-sediment samples were collected and analyzed and are in the usSEABED: Atlantic Coast Offshore Surficial Sediment Data Release. This report presents for viewing and downloading the more than 10,600 still seabed photographs that were acquired during the project. The digital images are provided in thumbnail, medium (1536 x 1024 pixels), and high (3071 x 2048 pixels) resolution. The images can be viewed by quadrangle on the U.S. Geological Survey Woods Hole Coastal and Marine Science Center's photograph database. Photograph metadata are embedded in each image in Exchangeable Image File Format (EXIF) and are also provided in spreadsheet format. Published digital topographic maps and descriptive text for seabed features are included here for downloading and serve as context for the photographs. An interactive topographic map for each quadrangle shows the locations of photograph stations, and each location is linked to the photograph database. This map also shows stations where seabed sediment was collected for texture analysis; the results of grain-size analysis and associated metadata are presented in spreadsheet format.
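
    Since the report notes that photograph metadata are embedded in each image as EXIF, a reader could extract them with a few lines of Python. This sketch uses the Pillow library and a hypothetical file name; no tags beyond standard EXIF are assumed for the USGS images.

    ```python
    from PIL import Image
    from PIL.ExifTags import TAGS

    def read_exif(path):
        """Return a dict of human-readable EXIF tag names to values for one image."""
        with Image.open(path) as img:
            raw = img.getexif()              # empty mapping if the file has no EXIF block
        return {TAGS.get(tag_id, tag_id): value for tag_id, value in raw.items()}

    if __name__ == "__main__":
        # Hypothetical file name; substitute a photograph downloaded from the database.
        for name, value in read_exif("quadrangle_photo.jpg").items():
            print(f"{name}: {value}")
    ```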

  3. The Gulf of Mexico Coastal Ocean Observing System: A Gulf Science Portal

    NASA Astrophysics Data System (ADS)

    Howard, M.; Gayanilo, F.; Kobara, S.; Jochens, A. E.

    2013-12-01

    The Gulf of Mexico Coastal Ocean Observing System's (GCOOS) regional science portal (gcoos.org) was designed to aggregate data and model output from distributed providers and to offer these, and derived products, through a single access point in standardized ways to a diverse set of users. The portal evolved under the NOAA-led U.S. Integrated Ocean Observing System (IOOS) program, where automated, largely unattended machine-to-machine interoperability has always been a guiding tenet of system design. The web portal has a business unit where membership lists, news items, and reference materials are kept, a data portal where near real-time and historical data are held and served, and a products portal where data are fused into products tailored for specific or general stakeholder groups. The staff includes a system architect who built and maintains the data portal, a GIS expert who built and maintains the current product portal, the executive director who marshals resources to keep news items fresh, and a data manager who manages most of this. The business portal is built using WordPress, which was selected because it appeared to be the easiest content management system for non-web programmers to add content to, maintain, and enhance. The data portal is custom built and uses a database, PHP, and web services based on the Open Geospatial Consortium's standards-based Sensor Observation Service (SOS) with Observations and Measurements (O&M) encodings. We employ a standards-based vocabulary, which we helped develop and which is registered at the Marine Metadata Interoperability Ontology Registry and Repository (http://mmisw.org). The registry is currently maintained by one of the authors. Products appearing in the products portal are primarily constructed using ESRI software by a Ph.D.-level geographer. Some products were built with other software, generally by graduate students over the years. We have been sensitive to the private sector when deciding which products to produce. While science users want numbers, users of all types mainly want maps. We have tried to develop flexible capabilities to present products for a variety of output devices, from desktop screens to smart phones. Software maintenance is a continuing issue, and new initiatives from NOAA add to the workload but improve the system. We will discuss how our data management system has evolved against the backdrop of rapidly changing technologies and diverse community requirements.
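
    For readers unfamiliar with the OGC SOS mentioned above, the sketch below issues a key-value-pair GetObservation request with Python's requests library. The endpoint URL, offering identifier, and observed property are hypothetical placeholders, not GCOOS's actual identifiers.

    ```python
    import requests

    # Hypothetical SOS endpoint; replace with a real service URL.
    SOS_URL = "https://example.org/sos"

    params = {
        "service": "SOS",
        "version": "1.0.0",
        "request": "GetObservation",
        "offering": "urn:example:offering:station-42",        # placeholder offering
        "observedProperty": "sea_water_temperature",           # placeholder property
        "responseFormat": "text/xml;subtype=\"om/1.0.0\"",     # O&M encoding
    }

    response = requests.get(SOS_URL, params=params, timeout=30)
    response.raise_for_status()
    print(response.text[:500])   # the O&M XML document, truncated for display
    ```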

  4. The OceanLink Project

    NASA Astrophysics Data System (ADS)

    Narock, T.; Arko, R. A.; Carbotte, S. M.; Chandler, C. L.; Cheatham, M.; Finin, T.; Hitzler, P.; Krisnadhi, A.; Raymond, L. M.; Shepherd, A.; Wiebe, P. H.

    2014-12-01

    A wide spectrum of maturing methods and tools, collectively characterized as the Semantic Web, is helping to vastly improve the dissemination of scientific research. Creating semantic integration requires input from both domain and cyberinfrastructure scientists. OceanLink, an NSF EarthCube Building Block, is demonstrating semantic technologies through the integration of geoscience data repositories, library holdings, conference abstracts, and funded research awards. Meeting project objectives involves applying semantic technologies to support data representation, discovery, sharing and integration. Our semantic cyberinfrastructure components include ontology design patterns, Linked Data collections, semantic provenance, and associated services to enhance data and knowledge discovery, interoperation, and integration. We discuss how these components are integrated, the continued automated and semi-automated creation of semantic metadata, and techniques we have developed to integrate ontologies, link resources, and preserve provenance and attribution.

  5. A data and information system for processing, archival, and distribution of data for global change research

    NASA Technical Reports Server (NTRS)

    Graves, Sara J.

    1994-01-01

    Work on this project focused on information management techniques for Marshall Space Flight Center's EOSDIS Version 0 Distributed Active Archive Center (DAAC). The centerpiece of this effort has been participation in EOSDIS catalog interoperability research, the result of which is a distributed Information Management System (IMS) allowing the user to query the inventories of all the DAACs from a single user interface. UAH has provided the MSFC DAAC database server for the distributed IMS and has contributed to the definition and development of the browse image display capabilities in the system's user interface. Another important area of research has been generating value-based metadata through data mining. In addition, information management applications for local inventory and archive management, and for tracking data orders, were provided.

  6. The GMT/MATLAB Toolbox

    NASA Astrophysics Data System (ADS)

    Wessel, Paul; Luis, Joaquim F.

    2017-02-01

    The GMT/MATLAB toolbox is a basic interface between MATLAB® (or Octave) and GMT, the Generic Mapping Tools, which allows MATLAB users full access to all GMT modules. Data may be passed between the two programs using intermediate MATLAB structures that organize the metadata needed; these are produced when GMT modules are run. In addition, standard MATLAB matrix data can be used directly as input to GMT modules. The toolbox improves interoperability between two widely used tools in the geosciences and extends the capability of both tools: GMT gains access to the powerful computational capabilities of MATLAB while the latter gains the ability to access specialized gridding algorithms and can produce publication-quality PostScript-based illustrations. The toolbox is available on all platforms and may be downloaded from the GMT website.

  7. Bridging the gap between Hydrologic and Atmospheric communities through a standard based framework

    NASA Astrophysics Data System (ADS)

    Boldrini, E.; Salas, F.; Maidment, D. R.; Mazzetti, P.; Santoro, M.; Nativi, S.; Domenico, B.

    2012-04-01

    Data interoperability in the study of Earth sciences is essential for performing interdisciplinary, multi-scale, multi-dimensional analyses (e.g. hydrologic impacts of global warming, regional urbanization, global population growth, etc.). This research aims to bridge the existing gap between the hydrologic and atmospheric communities at both the semantic and technological levels. Within the context of hydrology, scientists are usually concerned with data organized as time series: a time series can be seen as a variable measured at a particular point in space over a period of time (e.g. stream flow values periodically measured by a buoy sensor in a river). Atmospheric scientists instead usually organize their data as coverages: a coverage can be seen as a multidimensional data array (e.g. satellite images acquired through time). These differences make the set-up of a common framework for data discovery and access non-trivial. A set of web service specifications and implementations is already in place in both scientific communities to allow data discovery and access in the different domains. The CUAHSI Hydrologic Information System (HIS) service stack lists several service types and implementations: a metacatalog (implemented as a CSW) used to discover metadata services by distributing the query to a set of catalogs; time series catalogs (implemented as CSW) used to discover datasets published by the feature services; feature services (implemented as WFS) containing features with data access links; and sensor observation services (implemented as SOS) enabling access to the stream of acquisitions. Within the Unidata framework there lies a similar service stack for atmospheric data: the broker service (implemented as a CSW) distributes a user query to a set of heterogeneous services (i.e. catalog services, but also inventory and access services); the catalog service (implemented as a CSW) harvests the available metadata offered by THREDDS services and executes complex queries against them; and the inventory service (implemented as THREDDS) hierarchically organizes and publishes a local collection of multi-dimensional arrays (e.g. NetCDF, GRIB files), as well as publishing auxiliary standard services to realize the actual data access and visualization (e.g. WCS, OPeNDAP, WMS). The approach followed in this research is to build on top of the existing standards and implementations by setting up a standards-aware interoperable framework able to deal with the existing heterogeneity in an organic way. As a methodology, interoperability tests against real services were performed; existing problems were thus highlighted and, where possible, solved. The use of flexible tools able to deal intelligently with heterogeneity has proven successful; in particular, experiments were carried out with both the GI-cat broker and the ESRI GeoPortal frameworks. The GI-cat discovery broker proved successful at implementing the CSW interface, as well as federating heterogeneous resources such as the THREDDS and WCS services published by Unidata and the HydroServer, WFS and SOS services published by CUAHSI. Experiments with the ESRI GeoPortal were also successful: the GeoPortal was used to deploy a web interface able to distribute searches among catalog implementations from both the hydrologic and atmospheric communities, including HydroServers and GI-cat, combining results from both domains in a seamless way.
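
    As a concrete illustration of a CSW query like those brokered in the experiments above, the following Python sketch uses the OWSLib library against a hypothetical catalogue endpoint; the URL and search term are assumptions for demonstration only.

    ```python
    from owslib.csw import CatalogueServiceWeb
    from owslib.fes import PropertyIsLike

    # Hypothetical CSW endpoint; substitute a real catalogue service URL.
    csw = CatalogueServiceWeb("https://example.org/csw")

    # Full-text-style filter on the catalogue's generic text queryable.
    query = PropertyIsLike("csw:AnyText", "%streamflow%")
    csw.getrecords2(constraints=[query], maxrecords=10)

    # Each matched record carries Dublin Core metadata such as a title.
    for rec_id, rec in csw.records.items():
        print(rec_id, "-", rec.title)
    ```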

  8. Building essential biodiversity variables (EBVs) of species distribution and abundance at a global scale.

    PubMed

    Kissling, W Daniel; Ahumada, Jorge A; Bowser, Anne; Fernandez, Miguel; Fernández, Néstor; García, Enrique Alonso; Guralnick, Robert P; Isaac, Nick J B; Kelling, Steve; Los, Wouter; McRae, Louise; Mihoub, Jean-Baptiste; Obst, Matthias; Santamaria, Monica; Skidmore, Andrew K; Williams, Kristen J; Agosti, Donat; Amariles, Daniel; Arvanitidis, Christos; Bastin, Lucy; De Leo, Francesca; Egloff, Willi; Elith, Jane; Hobern, Donald; Martin, David; Pereira, Henrique M; Pesole, Graziano; Peterseil, Johannes; Saarenmaa, Hannu; Schigel, Dmitry; Schmeller, Dirk S; Segata, Nicola; Turak, Eren; Uhlir, Paul F; Wee, Brian; Hardisty, Alex R

    2018-02-01

    Much biodiversity data is collected worldwide, but it remains challenging to assemble the scattered knowledge for assessing biodiversity status and trends. The concept of Essential Biodiversity Variables (EBVs) was introduced to structure biodiversity monitoring globally, and to harmonize and standardize biodiversity data from disparate sources to capture a minimum set of critical variables required to study, report and manage biodiversity change. Here, we assess the challenges of a 'Big Data' approach to building global EBV data products across taxa and spatiotemporal scales, focusing on species distribution and abundance. The majority of currently available data on species distributions derives from incidentally reported observations or from surveys where presence-only or presence-absence data are sampled repeatedly with standardized protocols. Most abundance data come from opportunistic population counts or from population time series using standardized protocols (e.g. repeated surveys of the same population from single or multiple sites). Enormous complexity exists in integrating these heterogeneous, multi-source data sets across space, time, taxa and different sampling methods. Integration of such data into global EBV data products requires correcting biases introduced by imperfect detection and varying sampling effort, dealing with different spatial resolution and extents, harmonizing measurement units from different data sources or sampling methods, applying statistical tools and models for spatial inter- or extrapolation, and quantifying sources of uncertainty and errors in data and models. To support the development of EBVs by the Group on Earth Observations Biodiversity Observation Network (GEO BON), we identify 11 key workflow steps that will operationalize the process of building EBV data products within and across research infrastructures worldwide. These workflow steps take multiple sequential activities into account, including identification and aggregation of various raw data sources, data quality control, taxonomic name matching and statistical modelling of integrated data. We illustrate these steps with concrete examples from existing citizen science and professional monitoring projects, including eBird, the Tropical Ecology Assessment and Monitoring network, the Living Planet Index and the Baltic Sea zooplankton monitoring. The identified workflow steps are applicable to both terrestrial and aquatic systems and a broad range of spatial, temporal and taxonomic scales. They depend on clear, findable and accessible metadata, and we provide an overview of current data and metadata standards. Several challenges remain to be solved for building global EBV data products: (i) developing tools and models for combining heterogeneous, multi-source data sets and filling data gaps in geographic, temporal and taxonomic coverage, (ii) integrating emerging methods and technologies for data collection such as citizen science, sensor networks, DNA-based techniques and satellite remote sensing, (iii) solving major technical issues related to data product structure, data storage, execution of workflows and the production process/cycle as well as approaching technical interoperability among research infrastructures, (iv) allowing semantic interoperability by developing and adopting standards and tools for capturing consistent data and metadata, and (v) ensuring legal interoperability by endorsing open data or data that are free from restrictions on use, modification and sharing. 
Addressing these challenges is critical for biodiversity research and for assessing progress towards conservation policy targets and sustainable development goals. © 2017 The Authors. Biological Reviews published by John Wiley & Sons Ltd on behalf of Cambridge Philosophical Society.

  9. Development of an Integrated Biospecimen Database among the Regional Biobanks in Korea.

    PubMed

    Park, Hyun Sang; Cho, Hune; Kim, Hwa Sun

    2016-04-01

    This study developed an integrated database for 15 regional biobanks that provides researchers with large quantities of high-quality bio-data to be used for disease prevention, the development of personalized medicines, and genetics studies. We collected the raw data, managed independently by the 15 regional biobanks, for database modeling, and analyzed and defined the metadata of the items. We also built a three-level (high, middle, and low) classification system for classifying the item concepts based on the metadata. To give the items clear meanings, clinical items were defined using the Systematized Nomenclature of Medicine Clinical Terms, and specimen items were defined using the Logical Observation Identifiers Names and Codes. To optimize database performance, we set up a multi-column index based on the classification system and the international standard codes. As a result of subdividing the 7,197,252 raw data items collected, we refined the metadata into 1,796 clinical items and 1,792 specimen items. The classification system consists of 15 high-level, 163 middle-level, and 3,588 low-level class items. International standard codes were linked to 69.9% of the clinical items and 71.7% of the specimen items. The database consists of 18 tables implemented in MySQL Server 5.6. In the performance evaluation, the multi-column index reduced query times by up to a factor of nine. The database was built on an international standard terminology system, providing an infrastructure that can integrate the 7,197,252 raw data items managed by the 15 regional biobanks. In particular, it resolved the inevitable interoperability issues in the exchange of information among the biobanks and provided a solution to the synonym problem, which arises when the same concept is expressed in a variety of ways.
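
    To illustrate the effect a multi-column (composite) index has on the kind of query described above, here is a small self-contained sketch. It uses SQLite for portability rather than the MySQL 5.6 deployment in the paper, and the table and column names are invented for the example.

    ```python
    import sqlite3, random, time

    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("""CREATE TABLE item (
                       class_high TEXT, class_mid TEXT, class_low TEXT,
                       standard_code TEXT, value REAL)""")

    # Populate with synthetic rows standing in for biobank items.
    rows = [(f"H{random.randrange(15)}", f"M{random.randrange(163)}",
             f"L{random.randrange(3588)}", f"C{random.randrange(10000)}",
             random.random()) for _ in range(200_000)]
    cur.executemany("INSERT INTO item VALUES (?,?,?,?,?)", rows)

    def timed_query():
        t0 = time.perf_counter()
        cur.execute("SELECT COUNT(*) FROM item "
                    "WHERE class_high=? AND class_mid=? AND standard_code=?",
                    ("H3", "M17", "C42")).fetchone()
        return time.perf_counter() - t0

    before = timed_query()

    # Composite index over the classification columns plus the standard code,
    # mirroring the multi-column index strategy described in the abstract.
    cur.execute("CREATE INDEX idx_class_code ON item (class_high, class_mid, standard_code)")

    after = timed_query()
    print(f"full scan: {before*1000:.2f} ms, with composite index: {after*1000:.2f} ms")
    ```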

  10. Design and Application of an Ontology for Component-Based Modeling of Water Systems

    NASA Astrophysics Data System (ADS)

    Elag, M.; Goodall, J. L.

    2012-12-01

    Many Earth system modeling frameworks have adopted an approach of componentizing models so that a large model can be assembled by linking a set of smaller model components. These model components can then be more easily reused, extended, and maintained by a large group of model developers and end users. While there has been a notable increase in component-based model frameworks in the Earth sciences in recent years, there has been less work on creating framework-agnostic metadata and ontologies for model components. Well defined model component metadata is needed, however, to facilitate sharing, reuse, and interoperability both within and across Earth system modeling frameworks. To address this need, we have designed an ontology for the water resources community named the Water Resources Component (WRC) ontology in order to advance the application of component-based modeling frameworks across water related disciplines. Here we present the design of the WRC ontology and demonstrate its application for integration of model components used in watershed management. First we show how the watershed modeling system Soil and Water Assessment Tool (SWAT) can be decomposed into a set of hydrological and ecological components that adopt the Open Modeling Interface (OpenMI) standard. Then we show how the components can be used to estimate nitrogen losses from land to surface water for the Baltimore Ecosystem study area. Results of this work are (i) a demonstration of how the WRC ontology advances the conceptual integration between components of water related disciplines by handling the semantic and syntactic heterogeneity present when describing components from different disciplines and (ii) an investigation of a methodology by which large models can be decomposed into a set of model components that can be well described by populating metadata according to the WRC ontology.
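
    A minimal sketch of how component metadata might be expressed against an ontology of this kind, using the rdflib library. The namespace URI, class names, and property names below are illustrative assumptions and are not taken from the published WRC ontology.

    ```python
    from rdflib import Graph, Namespace, Literal, RDF, RDFS

    # Hypothetical namespace standing in for the WRC ontology.
    WRC = Namespace("http://example.org/wrc#")

    g = Graph()
    g.bind("wrc", WRC)

    # Illustrative class and property declarations.
    g.add((WRC.ModelComponent, RDF.type, RDFS.Class))
    g.add((WRC.hasInputVariable, RDF.type, RDF.Property))
    g.add((WRC.hasOutputVariable, RDF.type, RDF.Property))

    # Describe one hydrological component decomposed from a larger watershed model.
    comp = WRC.SurfaceRunoffComponent
    g.add((comp, RDF.type, WRC.ModelComponent))
    g.add((comp, RDFS.label, Literal("Surface runoff component")))
    g.add((comp, WRC.hasInputVariable, Literal("precipitation")))
    g.add((comp, WRC.hasOutputVariable, Literal("streamflow")))

    print(g.serialize(format="turtle"))
    ```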

  11. No Pixel Left Behind - Peeling Away NASA's Satellite Swaths

    NASA Astrophysics Data System (ADS)

    Cechini, M. F.; Boller, R. A.; Schmaltz, J. E.; Roberts, J. T.; Alarcon, C.; Huang, T.; McGann, M.; Murphy, K. J.

    2014-12-01

    Discovery and identification of Earth science products should not consume the majority of the effort in scientific research. Search aids based on text metadata go to great lengths to simplify this process. However, the process is still cumbersome and requires too much data download and analysis to down-select to valid products. The EOSDIS Global Imagery Browse Services (GIBS) is attempting to improve this process by providing "visual metadata" in the form of full-resolution visualizations representing geophysical parameters taken directly from the data. Through the use of accompanying interpretive information such as color legends, and the natural visual processing of the human eye, researchers are able to search and filter through data products in a more natural and efficient way. The GIBS "visual metadata" products are generated as representations of Level 3 data or as temporal composites of the Level 2 granule- or swath-based data products projected across a geographic or polar region. Such an approach allows for low-latency tiled access to pre-generated imagery products. For many GIBS users, the resulting image suffices as a basic representation of the underlying data. However, composite imagery presents an inherent problem: for areas of spatial overlap within the composite, only one observation is visually represented. This is especially problematic in the polar regions, where a significant portion of sensed data is "lost." In response to its user community, the GIBS team coordinated with its stakeholders to begin developing an approach to ensure that there is "no pixel left behind." In this presentation we will discuss the use cases and requirements guiding our efforts, considerations regarding standards compliance and interoperability, and near-term goals. We will also discuss opportunities to actively engage with the GIBS team on this topic to continually improve our services.

  12. PIDs for digital content: Are they used as they should be? The example of DOI and ORCID, told from a research library perspective

    NASA Astrophysics Data System (ADS)

    Kraft, Angelina; Dreyer, Britta; Löwe, Peter

    2017-04-01

    For finding, linking and citing research content, persistent digital identifiers are key, as a persistent identifier is a long-lasting reference to a resource. But are PIDs really used as they should be? With respect to the obstacles facing PID systems, we face a diverse landscape of stakeholders, legacy systems, competing interests and often incomprehensible messaging filled with technical jargon around PIDs. Insufficient metadata quality is another major challenge for these systems. While the principal task for service providers lies in collaborating to provide a shared and easy-to-use PID infrastructure, it is the key responsibility of data centers to provide rich metadata and structured access to research content. Metadata and structured access in particular are imperative for the most basic services such as search, citation tracking and reuse. And of course everything needs to be human- and machine-interoperable, as we want our machines to be able to interpret PIDs depending on the specific use case. Since 2004, the German National Library of Science and Technology (TIB) has been providing DOI services to data centers in Germany. Recent developments make clear that requirements for PIDs have changed. Science has developed a need for PIDs at multiple content levels: in addition to DOIs for journal articles and research data, PIDs for people, physical objects, collections, software, funders, organizations, expeditions, resources, instruments and even data management plans are required to enable different platforms to exchange information consistently and unambiguously. In this work we emphasize the distinct increase in total DOI registrations for research data and other research outputs, such as images, videos or software, in Germany within the past decade, and how research institutes and universities differ in their DOI registration workflows. We present use cases which illustrate the deployment of DOIs, e.g. for dynamic data, and demonstrate the need for rich metadata for the successful performance of user services. Along with a broader acceptance of DOIs for research content beyond articles, institutes are faced with the challenge of providing appropriate recognition to their researchers for their published work. To promote this, the ORCID Germany Consortium was launched in October 2016. The consortium, administered by TIB, is an essential building block of the ORCID DE project, whose goal is to facilitate the distribution of researcher IDs in Germany. An ORCID (Open Researcher and Contributor ID) provides scientists with an unambiguous identifier, enabling them to distinguish themselves from others while simplifying the management of their research activity records (e.g. publications of papers, dissertations, research data, software, and attended conferences). ORCID Germany Consortium member institutions are able to link their academic records to the ORCID identifiers of their researchers and benefit from up-to-date and complete publication lists. DOIs and ORCIDs, if used correctly, are both machine-interoperable PIDs which not only make research more accessible, but also increase the visibility of all outputs of research, the researcher and the affiliated institution.
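
    As one example of the machine interoperability the abstract calls for, DOIs can be resolved to structured metadata via HTTP content negotiation. This sketch requests citeproc JSON for a placeholder DOI; the DOI shown is only an illustration and should be replaced with one registered at DataCite or Crossref.

    ```python
    import requests

    def doi_metadata(doi: str) -> dict:
        """Resolve a DOI to structured citation metadata via content negotiation."""
        resp = requests.get(
            f"https://doi.org/{doi}",
            headers={"Accept": "application/vnd.citationstyles.csl+json"},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()

    if __name__ == "__main__":
        # Placeholder DOI; substitute a real one.
        meta = doi_metadata("10.1000/example-doi")
        print(meta.get("title"), "-", meta.get("publisher"))
    ```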

  13. Playing the Metadata Game: Technologies and Strategies Used by Climate Diagnostics Center for Cataloging and Distributing Climate Data.

    NASA Astrophysics Data System (ADS)

    Schweitzer, R. H.

    2001-05-01

    The Climate Diagnostics Center maintains a collection of gridded climate data, primarily for use by local researchers. Because this data is available on fast digital storage and has been converted to netCDF using a standard metadata convention (called COARDS), we recognize that the collection is also useful to the community at large. At CDC we try to use technology and metadata standards to reduce the costs associated with making these data available to the public. The World Wide Web has been an excellent technology platform for meeting that goal. Specifically, we have developed web-based user interfaces that allow users to search, plot and download subsets from the data collection. We have also been exploring use of the Pacific Marine Environmental Laboratory's Live Access Server (LAS) as an engine for this task. This would result in further savings by allowing us to concentrate on customizing the LAS where needed, rather than developing and maintaining our own system. One such customization currently under development is the use of Java Servlets and JavaServer Pages in conjunction with a metadata database to produce a hierarchical user interface to LAS. In addition to these web-based user interfaces, all of our data are available via the Distributed Oceanographic Data System (DODS). This allows other sites using LAS, and individuals using DODS-enabled clients, to use our data as if they were local files. All of these technology systems are driven by metadata. When we began to create netCDF files, we collaborated with several other agencies to develop a netCDF convention (COARDS) for metadata. At CDC we have extended that convention to incorporate additional metadata elements to make the netCDF files as self-describing as possible. Part of the local metadata is a set of controlled names for the variable, level in the atmosphere and ocean, statistic and data set for each netCDF file. To allow searching and easy reorganization of these metadata, we loaded the metadata from the netCDF files into a MySQL database. The combination of the MySQL database and the controlled names makes it possible to automate the construction of user interfaces and standard-format metadata descriptions, like the Federal Geographic Data Committee (FGDC) and Directory Interchange Format (DIF) standards. These standard descriptions also include an association between our controlled names and standard keywords such as those developed by the Global Change Master Directory (GCMD). This talk will give an overview of each of these technologies and metadata standards as they apply to work at the Climate Diagnostics Center. The talk will also discuss the pros and cons of each approach and areas for future development.
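
    A small sketch of the netCDF-metadata-to-database step described above, using the netCDF4 Python library: it writes a tiny file with a few COARDS-style global attributes and then reads them back into a dict that a cataloguing script could load into a metadata database. The attribute names are illustrative, not CDC's controlled vocabulary.

    ```python
    from netCDF4 import Dataset

    # Create a tiny netCDF file with a few illustrative global attributes.
    with Dataset("demo.nc", "w") as ds:
        ds.title = "Example gridded climate field"
        ds.Conventions = "COARDS"
        ds.variable_name = "air_temperature"     # illustrative controlled name
        ds.level = "surface"
        ds.statistic = "monthly mean"

    def global_attributes(path):
        """Read every global attribute of a netCDF file into a plain dict."""
        with Dataset(path) as ds:
            return {name: getattr(ds, name) for name in ds.ncattrs()}

    # These key/value pairs are what a cataloguing script would insert into a
    # metadata database (e.g. one row per attribute, keyed by file name).
    for key, value in global_attributes("demo.nc").items():
        print(f"{key} = {value}")
    ```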

  14. Oceans of Data : the Australian Ocean Data Network

    NASA Astrophysics Data System (ADS)

    Proctor, R.; Blain, P.; Mancini, S.

    2012-04-01

    The Australian Integrated Marine Observing System (IMOS, www.imos.org.au) is a research infrastructure project to establish an enduring marine observing system for Australian oceanic waters and shelf seas (in total, 4% of the world's oceans). Marine data and information are the main products and data management is therefore a central element to the project's success. A single integrative framework for data and information management has been developed which allows discovery and access of the data by scientists, managers and the public, based on standards and interoperability. All data is freely available. This information infrastructure has been further developed to form the Australian Ocean Data Network (AODN, www.aodn.org.au) which is rapidly becoming the 'one-stop-shop' for marine data in Australia. In response to requests from users, new features have recently been added to data discovery, visualization, and data access which move the AODN closer towards providing full integration of multi-disciplinary data.

  15. A Tale of Two Observing Systems: Interoperability in the World of Microsoft Windows

    NASA Astrophysics Data System (ADS)

    Babin, B. L.; Hu, L.

    2008-12-01

    Louisiana Universities Marine Consortium's (LUMCON) and Dauphin Island Sea Lab's (DISL) environmental monitoring systems provide a unified coastal ocean observing system. The two systems are mirrored to maintain autonomy while offering an integrated data-sharing environment. Both systems collect data via Campbell Scientific data loggers, store the data in Microsoft SQL servers, and disseminate the data in real time on the World Wide Web via Microsoft Internet Information Servers and Active Server Pages (ASP). The use of Microsoft Windows technologies has presented many challenges to these observing systems as open-source tools for interoperability grow. The current open-source tools often require the installation of additional software. In order to make data available in common standard formats, "home grown" software has been developed. One example is the development of software to generate XML files for transmission to the National Data Buoy Center (NDBC). OOSTethys partners develop, test and implement easy-to-use, open-source, OGC-compliant software, and have created a working prototype of networked, semantically interoperable, real-time data systems. Partnering with OOSTethys, we are developing a cookbook to implement OGC web services. The implementation will be written in ASP, will run in a Microsoft operating system environment, and will serve data via the Sensor Observation Service (SOS). This cookbook will give observing systems running Microsoft Windows the tools to easily participate in the Open Geospatial Consortium (OGC) Oceans Interoperability Experiment (OCEANS IE).

  16. An interoperability experiment for sharing hydrological rating tables

    NASA Astrophysics Data System (ADS)

    Lemon, D.; Taylor, P.; Sheahan, P.

    2013-12-01

    The increasing demand on freshwater resources requires authorities to produce more accurate and timely estimates of their available water. Calculation of continuous time series of river discharge and storage volumes generally requires rating tables. These approximate the relationship between two phenomena, such as river level and discharge, and allow us to produce continuous estimates of a phenomenon that may be impractical or impossible to measure directly. Standardised information models and access mechanisms for rating tables are required to support sharing and exchange of water flow data. An Interoperability Experiment (IE) is underway to test an information model that describes rating tables, the observations made to build these ratings, and river cross-section data. The IE is an initiative of the joint World Meteorological Organisation/Open Geospatial Consortium Hydrology Domain Working Group (HydroDWG), and the model will be published as WaterML2.0 part 2. Interoperability Experiments are low-overhead, multi-member projects run under the OGC's interoperability program to test existing and emerging standards. The HydroDWG has previously run IEs to test early versions of OGC WaterML2.0 part 1 (time series). This IE focuses on two key exchange scenarios. The first is sharing rating tables and gauging observations between water agencies: through the use of standard OGC web services, rating tables and associated data will be made available from water agencies, and the (Australian) Bureau of Meteorology will retrieve rating tables on demand from water authorities, allowing the Bureau to run conversions of data within its own systems. The second is exposing rating tables and gaugings for online analysis and educational purposes: a web client will be developed to enable exploration and visualization of rating tables, gaugings and related metadata for monitoring points. The client gives a quick view into available rating tables, their periods of applicability and the standard deviation of observations against the relationship. An example of this client running can be seen at the link provided. The result of the IE will form the basis for the standardisation of WaterML2.0 part 2. The use of the standard will lead to increased transparency and accessibility of rating tables, while also improving general understanding of this important hydrological concept.
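
    To make the notion of a rating table concrete, here is a minimal sketch that converts a stage (river level) time series to discharge by interpolating a small, invented rating table. Real ratings also carry periods of applicability and uncertainty information, which the WaterML2.0 part 2 model covers and this sketch ignores.

    ```python
    import numpy as np

    # Invented rating table: river stage (m) versus discharge (m^3/s).
    stage_points     = np.array([0.5, 1.0, 1.5, 2.0, 3.0])
    discharge_points = np.array([2.0, 10.0, 30.0, 70.0, 200.0])

    def stage_to_discharge(stage_series):
        """Convert stage observations to discharge by piecewise-linear interpolation.
        (Operational ratings are usually log-log or segmented power curves.)"""
        return np.interp(stage_series, stage_points, discharge_points)

    observed_stage = np.array([0.8, 1.2, 1.9, 2.4])   # example gauge readings
    print(stage_to_discharge(observed_stage))
    ```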

  17. A Standard for Sharing and Accessing Time Series Data: The Heliophysics Application Programmers Interface (HAPI) Specification

    NASA Astrophysics Data System (ADS)

    Vandegriff, J. D.; King, T. A.; Weigel, R. S.; Faden, J.; Roberts, D. A.; Harris, B. T.; Lal, N.; Boardsen, S. A.; Candey, R. M.; Lindholm, D. M.

    2017-12-01

    We present the Heliophysics Application Programmers Interface (HAPI), a new interface specification that both large and small data centers can use to expose time series data holdings in a standard way. HAPI was inspired by the similarity of existing services at many Heliophysics data centers, and these data centers have collaborated to define a single interface that captures best practices and represents what everyone considers the essential, lowest common denominator for basic data access. This low level access can serve as infrastructure to support greatly enhanced interoperability among analysis tools, with the goal being simplified analysis and comparison of data from any instrument, model, mission or data center. The three main services a HAPI server must perform are 1. list a catalog of datasets (one unique ID per dataset), 2. describe the content of one dataset (JSON metadata), and 3. retrieve numerical content for one dataset (stream the actual data). HAPI defines both the format of the query to the server, and the response from the server. The metadata is lightweight, focusing on use rather than discovery, and the data format is a streaming one, with Comma Separated Values (CSV) being required and binary or JSON streaming being optional. The HAPI specification is available at GitHub, where projects are also underway to develop reference implementation servers that data providers can adapt and use at their own sites. Also in the works are data analysis clients in multiple languages (IDL, Python, Matlab, and Java). Institutions which have agreed to adopt HAPI include Goddard (CDAWeb for data and CCMC for models), LASP at the University of Colorado Boulder, the Particles and Plasma Interactions node of the Planetary Data System (PPI/PDS) at UCLA, the Plasma Wave Group at the University of Iowa, the Space Sector at the Johns Hopkins Applied Physics Lab (APL), and the tsds.org site maintained at George Mason University. Over the next year, the adoption of a uniform way to access time series data is expected to significantly enhance interoperability within the Heliophysics data environment. https://github.com/hapi-server/data-specification
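
    A rough sketch of the three HAPI calls described above (catalog, info, data), issued with Python's requests library against a placeholder server URL. The dataset ID and time range are invented, and the query parameter names reflect early versions of the specification; consult the specification at the GitHub link for the current names.

    ```python
    import requests

    # Placeholder HAPI server root; real servers expose these endpoints under /hapi.
    SERVER = "https://example.org/hapi"

    # 1. List the catalog of datasets (one unique ID per dataset).
    catalog = requests.get(f"{SERVER}/catalog", timeout=30).json()
    print([entry["id"] for entry in catalog.get("catalog", [])][:5])

    # 2. Describe one dataset: lightweight JSON metadata about its parameters.
    dataset_id = "example_dataset"          # invented ID
    info = requests.get(f"{SERVER}/info", params={"id": dataset_id}, timeout=30).json()
    print([p["name"] for p in info.get("parameters", [])])

    # 3. Stream the numerical content for a time range as CSV (the required format).
    data = requests.get(
        f"{SERVER}/data",
        params={"id": dataset_id,
                "time.min": "2017-01-01T00:00:00Z",   # names per early spec versions
                "time.max": "2017-01-02T00:00:00Z"},
        timeout=60,
    )
    print(data.text.splitlines()[:3])
    ```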

  18. Enabling Interoperability - Supporting a Diversity of Search Paradigms Using Shared Ontologies and Federated Registries

    NASA Astrophysics Data System (ADS)

    Hughes, J. S.; Crichton, D. J.; Hardman, S. H.; Mattman, C. A.; Ramirez, P. M.

    2009-12-01

    Experience suggests that no single search paradigm will meet all of a community's search requirements. Traditional forms-based search is still considered critical by a significant percentage of most science communities. However, text-based and facet-based search are improving the community's perception that search can be easy and that the data are available and can be located. Finally, semantic search promises ways to find data that were not conceived of when the metadata was first captured and organized. This situation suggests that successful science information systems must be able to deploy new search applications quickly, efficiently, and often for ad hoc purposes. Federated registries allow data to be packaged or associated with their metadata and managed as simple registry objects. Standard reference models for federated registries now exist that ensure registry objects are uniquely identified at registration and that versioning, classification, and cataloging are addressed automatically. Distributed but locally governed, federated registries also provide notification of registry events and federated query, linking, and replication of registry objects. Key principles for shared ontology development in the space sciences are that the ontology remains independent of its implementation and is extensible, flexible and scalable. The dichotomy between digital things and physical/conceptual things in the domain needs to be unified under a standard model, such as the Open Archival Information System (OAIS) Information Object. Finally, it must be accepted that ontology development is a difficult task that requires time, patience and experts in both the science domain and information modeling. The Planetary Data System (PDS) has adopted this architecture for its next-generation information system, PDS 2010. The authors will report on progress, briefly describe key elements, and illustrate how the new system will be phased into operations to handle both legacy and new science data. In particular, the shared ontology is being used to drive system implementation through the generation of standards documents and software configuration files. The resulting information system will help meet the expectations of modern scientists by providing more of the information interconnectedness, correlative science, and system interoperability that they desire. (Fig. 1: Data-driven architecture.)

  19. OSCAR/Surface: Metadata for the WMO Integrated Observing System WIGOS

    NASA Astrophysics Data System (ADS)

    Klausen, Jörg; Pröscholdt, Timo; Mannes, Jürg; Cappelletti, Lucia; Grüter, Estelle; Calpini, Bertrand; Zhang, Wenjian

    2016-04-01

    The World Meteorological Organization (WMO) Integrated Global Observing System (WIGOS) is a key WMO priority underpinning all WMO Programs and new initiatives such as the Global Framework for Climate Services (GFCS). It does this by better integrating WMO and co-sponsored observing systems, as well as partner networks. An important aspect of this is the description of observational capabilities by way of structured metadata. The 17th Congress of the World Meteorological Organization (Cg-17) endorsed the semantic WIGOS metadata standard (WMDS) developed by the Task Team on WIGOS Metadata (TT-WMD). The standard comprises a set of metadata classes that are considered to be of critical importance for the interpretation of observations and the evolution of observing systems relevant to WIGOS. The WMDS serves all recognized WMO Application Areas, and its use for all internationally exchanged observational data generated by WMO Members is mandatory. The standard will be introduced in three phases between 2016 and 2020. The Observing Systems Capability Analysis and Review (OSCAR) platform, operated by MeteoSwiss on behalf of WMO, is the official repository of WIGOS metadata and an implementation of the WMDS. OSCAR/Surface deals with all surface-based observations from land, air and oceans, combining metadata managed by a number of complementary, more domain-specific systems (e.g., GAWSIS for the Global Atmosphere Watch, JCOMMOPS for the marine domain, the WMO Radar database). It is a modern, web-based client-server application with extended information search, filtering and mapping capabilities, including a fully developed management console to add and edit observational metadata. In addition, a powerful application programming interface (API) is being developed to allow machine-to-machine metadata exchange. The API is based on an ISO/OGC-compliant XML schema for the WMDS using the Observations and Measurements (ISO 19156) conceptual model. The purpose of the presentation is to acquaint the audience with OSCAR, the WMDS and the current XML schema, and to explore the relationship to the INSPIRE XML schema. Feedback from experts in the various disciplines of meteorology, climatology, atmospheric chemistry and hydrology on the utility of the new standard and the XML schema will be solicited and will guide WMO in further evolving the WMDS.

  20. Serving Satellite Remote Sensing Data to User Community through the OGC Interoperability Protocols

    NASA Astrophysics Data System (ADS)

    di, L.; Yang, W.; Bai, Y.

    2005-12-01

    Remote sensing is one of the major methods for collecting geospatial data. A huge amount of remote sensing data has been collected by space agencies and private companies around the world. For example, NASA's Earth Observing System (EOS) is generating more than 3 TB of remote sensing data per day. The data collected by EOS are processed, distributed, archived, and managed by the EOS Data and Information System (EOSDIS). Currently, EOSDIS manages several petabytes of data. All of those data are not only valuable for global change research but also useful for local and regional applications and decision-making. How to make the data easily accessible to and usable by the user community is one of the key issues for realizing the full potential of these valuable datasets. In the past several years, the Open Geospatial Consortium (OGC) has developed several interoperability protocols aimed at making geospatial data easily accessible to and usable by the user community through the Internet. The protocols particularly relevant to the discovery, access, and integration of multi-source satellite remote sensing data are the Catalog Service for the Web (CS/W) and Web Coverage Service (WCS) specifications. The OGC CS/W specifies the interfaces, HTTP protocol bindings, and a framework for defining application profiles required to publish and access digital catalogues of metadata for geographic data, services, and related resource information. The OGC WCS specification defines the interfaces between web-based clients and servers for accessing on-line multi-dimensional, multi-temporal geospatial coverages in an interoperable way. Based on definitions by OGC and ISO 19123, coverage data include all remote sensing images as well as gridded model outputs. The Laboratory for Advanced Information Technology and Standards (LAITS) at George Mason University has been working for many years on developing and implementing OGC specifications to better serve NASA Earth science data to the user community. We have developed the NWGISS software package, which implements multiple OGC specifications, including OGC WMS, WCS, CS/W, and WFS. As part of the NASA REASON GeoBrain project, the NWGISS WCS and CS/W servers have been extended to provide operational access to NASA EOS data at the data pools through OGC protocols and to make both services chainable in web-service chaining. The extensions in the WCS server include the implementation of WCS 1.0.0 and WCS 1.0.2 and the development of a WSDL description of the WCS services. In order to find the on-line EOS data resources, the CS/W server was extended at the backend to search metadata in NASA ECHO. This presentation reports these extensions and discusses lessons learned from the implementation. It also discusses the advantages, disadvantages, and possible future improvements of the OGC specifications, particularly WCS.
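
    To show what an interoperable coverage request looks like, the sketch below issues a WCS 1.0.0 GetCoverage request using key-value-pair parameters. The endpoint, coverage identifier, and bounding box are placeholder assumptions rather than NWGISS specifics.

    ```python
    import requests

    # Placeholder WCS endpoint; substitute a live server.
    WCS_URL = "https://example.org/wcs"

    params = {
        "SERVICE": "WCS",
        "VERSION": "1.0.0",
        "REQUEST": "GetCoverage",
        "COVERAGE": "example_coverage",        # placeholder coverage identifier
        "CRS": "EPSG:4326",
        "BBOX": "-100,30,-90,40",              # min_lon,min_lat,max_lon,max_lat
        "TIME": "2005-07-01",
        "WIDTH": "256",
        "HEIGHT": "256",
        "FORMAT": "GeoTIFF",
    }

    resp = requests.get(WCS_URL, params=params, timeout=60)
    resp.raise_for_status()
    with open("subset.tif", "wb") as fh:       # the returned coverage subset
        fh.write(resp.content)
    ```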

  1. Automated sea floor extraction from underwater video

    NASA Astrophysics Data System (ADS)

    Kelly, Lauren; Rahmes, Mark; Stiver, James; McCluskey, Mike

    2016-05-01

    Ocean floor mapping using video is a method to simply and cost-effectively record large areas of the seafloor. Obtaining visual and elevation models has noteworthy applications in search and recovery missions. Hazards to navigation are abundant and pose a significant threat to the safety, effectiveness, and speed of naval operations and commercial vessels. This project's objective was to develop a workflow to automatically extract metadata from marine video and create optical image and elevation surface mosaics. Three developments made this possible. First, optical character recognition (OCR) by means of two-dimensional correlation, using a known character set, allowed for the capture of metadata from image files. Second, exploiting the image metadata (i.e., latitude, longitude, heading, camera angle, and depth readings) allowed the location and orientation of each image frame in the mosaic to be determined; image registration improved the accuracy of mosaicking. Finally, overlapping data allowed us to determine height information. A disparity map was created using the parallax from overlapping viewpoints of a given area, and the relative height data were used to create a three-dimensional, textured elevation map.
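
    The OCR-by-correlation step described above can be illustrated with a few lines of NumPy/SciPy: a known character template is slid across an image strip with cross-correlation and the best-matching position selected. The tiny synthetic arrays here demonstrate only the mechanics, not the project's actual video overlay fonts.

    ```python
    import numpy as np
    from scipy.signal import correlate2d

    def best_match(image, template):
        """Locate a character template in an image strip via 2-D cross-correlation."""
        t = template - template.mean()
        scores = correlate2d(image - image.mean(), t, mode="valid")
        # Crude global normalization; a real OCR step would normalize per window.
        denom = np.sqrt((t ** 2).sum() * (image ** 2).sum()) + 1e-9
        scores = scores / denom
        idx = np.unravel_index(np.argmax(scores), scores.shape)
        return idx, scores[idx]

    # Tiny synthetic example: a 3x3 "glyph" embedded in a larger strip.
    glyph = np.array([[0, 1, 0],
                      [1, 1, 1],
                      [0, 1, 0]], dtype=float)
    strip = np.zeros((8, 20))
    strip[2:5, 11:14] = glyph                      # the glyph hidden at row 2, column 11

    position, score = best_match(strip, glyph)
    print("best match at (row, col):", position, "score:", round(float(score), 3))
    ```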

  2. Ceos Wgiss Common Framework for Wgiss Connected Data Assets

    NASA Astrophysics Data System (ADS)

    Enloe, Y.; Mitchell, A. E.; Albani, M.; Yapur, M.

    2016-12-01

    The Committee on Earth Observation Satellites (CEOS), established in 1984 to coordinate civil space-borne observations of the Earth, has been building, through its Working Group on Information Systems and Services (WGISS), a common data framework to identify and connect data assets at member agencies. Some of these data assets are federated systems such as the CEOS WGISS Integrated Catalog (CWIC), the European Space Agency's FedEO (Federated Earth Observations Missions Access) system, and the International Directory Network (IDN), an international effort developed by NASA to assist researchers in locating information on available data sets. A system-level team provides coordination and oversight to make this loosely coupled federated system function and evolve. WGISS has identified two search standards, the Open Geospatial Consortium (OGC) Catalog Services for the Web (CSW) and the CEOS OpenSearch Best Practices (which reference the OGC OpenSearch Geo and Time Extensions and the OGC OpenSearch Extension for Earth Observation), as well as an interoperable metadata standard (ISO 19115) for use within the WGISS Connected Assets. Data partners must register their data collections in the IDN using the Global Change Master Directory (GCMD) Keywords. Data partners need to support one of the two search standards and be able to map their internal metadata to the ISO 19115 metadata elements. All searchable data must have a data access path. Clients can offer search and access to all or a subset of the satellite data available through the WGISS Connected Data Assets. Clients can offer support for a two-step search: (1) discovery through collection search using platform, instrument, science keywords, etc. at the IDN, and (2) searching granule metadata at data partners through CWIC or FedEO; a sketch of this pattern follows below. More than a dozen international agencies offer their data through the WGISS Federation or are working on developing their connections. The list includes the European Space Agency, NASA, NOAA, USGS, the National Institute for Space Research (Brazil), the Canadian Center for Mapping and Earth Observations (CCMEO), the Academy for Opto-Electronics (China), the Indian Space Research Organization (ISRO), EUMETSAT, the Russian Federal Space Agency (ROSCOSMOS) and several agencies within Australia.
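
    A rough sketch of the two-step search pattern referenced above, using the feedparser library to read Atom responses from two placeholder OpenSearch endpoints. The URLs and query parameter names are illustrative assumptions only; actual CEOS endpoints advertise their parameters in OpenSearch description documents.

    ```python
    import feedparser

    # Placeholder OpenSearch endpoints for the two-step search pattern
    # (step 1: collection search at the IDN; step 2: granule search at a data partner).
    COLLECTION_SEARCH = "https://example.org/collections/opensearch"
    GRANULE_SEARCH = "https://example.org/granules/opensearch"

    # Step 1: discover collections by platform/instrument/science keyword.
    collections = feedparser.parse(f"{COLLECTION_SEARCH}?searchTerms=sea+surface+temperature")
    collection_id = collections.entries[0].id if collections.entries else "EXAMPLE-COLLECTION"

    # Step 2: search granules of the chosen collection by area and time
    # (parameter names here are illustrative, not from the Best Practices document).
    granules = feedparser.parse(
        f"{GRANULE_SEARCH}?parentIdentifier={collection_id}"
        "&box=-98,18,-80,31&startDate=2016-06-01&endDate=2016-06-30"
    )
    for entry in granules.entries[:5]:
        print(entry.title, entry.link)        # each granule carries a data access link
    ```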

  3. Command and Control Common Semantic Core Required to Enable Net-centric Operations

    DTIC Science & Technology

    2008-05-20

    automated processing capability. A former US Marine Corps component C4 director during Operation Iraqi Freedom identified the problems of 1) uncertainty...interoperability improvements to warfighter community processes, thanks to ubiquitous automated processing, are likely high and somewhat easier to quantify. A...synchronized with the actions of other partners / warfare communities. This requires high-quality information, rapid sharing and automated processing – which

  4. The Gulf of Mexico Research Initiative: Building a Big Data System

    NASA Astrophysics Data System (ADS)

    Howard, M. K.; Gayanilo, F. C.; Gibeaut, J. C.

    2012-12-01

    On April 20, 2010, the Deepwater Horizon drilling unit, located in the northeastern Gulf of Mexico, experienced a catastrophic wellhead blowout. Millions of barrels of oil and roughly 1 million U.S. gallons of dispersant were released near the wellhead over the subsequent three months. On May 24, 2010, BP announced the Gulf of Mexico Research Initiative (GoMRI) and pledged $500M over 10 years toward independent scientific research on the spill's impact on the ecosystem. Data collection began immediately. By summer 2012 nearly $200M will have been committed to this research. Five hundred and seventy researchers from 114 institutions in 30 states and 4 countries are involved. Research activities include substantial numerical modeling, field and laboratory investigations of the environment and biota, and chemical studies of oil and dispersants. An additional $300M will be competed in subsequent years. The administrative and data management elements of the enterprise began to build in earnest in mid-2011. The last position in the GoMRI Information and Data Cooperative (GRIIDC) team was filled in July 2012. Due to the rapid evolution of the program in the first year, few data management requirements were imposed on the Year-One researchers. Proposal guidance for the Year 2-4 Research Consortia (RC) programs asked proposers to address data management questions but expressed few mandates. GRIIDC is charged with providing a portal to GoMRI data and metadata. Researchers are required to provide their data to GRIIDC and to national digital repositories with a minimum of delay. Almost everything else was left to evolve through human networks. The GRIIDC team is composed of a system architect, a database administrator, software engineers, a GIS specialist, a technical coordinator and several subject matter experts. The team faces the usual choices related to building a new cyberinfrastructure (e.g., metadata, ontologies, file formats, web services, etc.). However, the human element is the more important challenge and provides the solution. Human networking (word of mouth) during the time the RCs were preparing their proposals led most RCs to designate and provision for a project-level data manager. A regular line of communication between GRIIDC and the RC data managers was established early through face-to-face workshops and regular teleconferences. This greatly enhanced community acceptance of community standards and practices. GRIIDC underwent a multi-day "Planning, Scoping, Visioning and Team-building" activity designed to bring team members together and quickly establish roles and a shared understanding of terminology. GRIIDC networks with previous programs such as the Marine Metadata Interoperability Program, NOAA and NSF data-centric groups, and established regional entities such as the Gulf of Mexico Coastal Ocean Observing System (GCOOS). Several GRIIDC staff also work for GCOOS or its observing system partners. Networking brings expertise to bear on difficult issues, reaching solutions sooner than detailed independent study.

  5. Using architectures for semantic interoperability to create journal clubs for emergency response

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Powell, James E; Collins, Linn M; Martinez, Mark L B

    2009-01-01

    In certain types of 'slow burn' emergencies, careful accumulation and evaluation of information can offer a crucial advantage. The SARS outbreak in the first decade of the 21st century was such an event, and ad hoc journal clubs played a critical role in assisting scientific and technical responders in identifying and developing various strategies for halting what could have become a dangerous pandemic. This research-in-progress paper describes a process for leveraging emerging semantic web and digital library architectures and standards to (1) create a focused collection of bibliographic metadata, (2) extract semantic information, (3) convert it to the Resource Description Framework/Extensible Markup Language (RDF/XML), and (4) integrate it so that scientific and technical responders can share and explore critical information in the collections.
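
    As a concrete (hypothetical) illustration of steps (2)-(4), the Python sketch below turns one bibliographic record into RDF/XML with rdflib; the record fields, the identifier, and the bibo vocabulary choice are assumptions for illustration, not the authors' actual pipeline.

      from rdflib import Graph, Literal, Namespace, URIRef
      from rdflib.namespace import DC, RDF

      BIBO = Namespace("http://purl.org/ontology/bibo/")   # assumed article vocabulary

      # Hypothetical bibliographic record harvested in step (1).
      record = {
          "id": "http://example.org/articles/sars-outbreak-0001",
          "title": "Clinical features of an emerging coronavirus outbreak",
          "creator": "Example, A.",
          "date": "2003-05-01",
      }

      g = Graph()
      subject = URIRef(record["id"])
      g.add((subject, RDF.type, BIBO.Article))
      g.add((subject, DC.title, Literal(record["title"])))
      g.add((subject, DC.creator, Literal(record["creator"])))
      g.add((subject, DC.date, Literal(record["date"])))

      print(g.serialize(format="xml"))   # RDF/XML, ready for step (4) integration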

  6. Distributed heterogeneous inspecting system and its middleware-based solution.

    PubMed

    Huang, Li-can; Wu, Zhao-hui; Pan, Yun-he

    2003-01-01

    There are many cases when an organization needs to monitor the data and operations of its supervised departments, especially those departments which are not owned by this organization and are managed by their own information systems. A Distributed Heterogeneous Inspecting System (DHIS) is the system an organization uses to monitor its supervised departments by inspecting their information systems. In DHIS, the inspected systems are generally distributed, heterogeneous, and constructed by different companies. DHIS has three key processes: abstracting core data sets and core operation sets, collecting these sets, and inspecting the collected sets. In this paper, we present the concept and mathematical definition of DHIS, a metadata method for solving the interoperability problem, a security strategy for data transfer, and a middleware-based solution for DHIS. We also describe an example of the inspecting system at Wenzhou Customs.

  7. National Scale Marine Geophysical Data Portal for the Israel EEZ with Public Access Web-GIS Platform

    NASA Astrophysics Data System (ADS)

    Ketter, T.; Kanari, M.; Tibor, G.

    2017-12-01

    Recent offshore discoveries and regulation in the Israel Exclusive Economic Zone (EEZ) are the driving forces behind increasing marine research and development initiatives such as infrastructure development, environmental protection and decision making, among many others. All marine operations rely on existing seabed information, while some also generate new data. We aim to create a single-platform knowledge base to enable access to existing information through a comprehensive, publicly accessible web-based interface. The Israel EEZ covers approximately 26,000 square kilometers and has been surveyed continuously with various geophysical instruments over the past decades, including 10,000 km of multibeam survey lines, 8,000 km of sub-bottom seismic lines, and hundreds of sediment sampling stations. Our database consists of vector and raster datasets from multiple sources compiled into a repository of geophysical data and metadata, acquired nation-wide by several research institutes and universities. The repository will enable public access via a web portal based on a GIS platform, including datasets from multibeam, sub-bottom profiling, single- and multi-channel seismic surveys and sediment sampling analysis. Derived data products will also be available, e.g., bathymetry, substrate type, granulometry, and geological structure. Operating a web-GIS-based repository allows potential users to retrieve pre-existing data and facilitates the planning of future activities, e.g., conducting marine surveys, constructing marine infrastructure, and other private or public projects. The user interface is based on map-oriented spatial selection, which will reveal any relevant data for designated areas of interest. Querying the database will allow the user to obtain information about the data owner and to contact them for data retrieval as required. Wide and free public access to existing data and metadata can save time and funds for academia, government and commercial sectors, while aiding cooperation and data sharing among the various stakeholders.
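
    The map-oriented spatial selection described above amounts to a bounding-box query against the survey catalogue. Below is a minimal sketch, assuming the metadata are held in a GeoPackage layer; the file name and column names are hypothetical.

      import geopandas as gpd
      from shapely.geometry import box

      # Area of interest drawn by the user (lon/lat bounding box, WGS84).
      aoi = box(34.2, 32.0, 34.8, 32.6)

      # Hypothetical catalogue layer holding multibeam survey-line metadata.
      lines = gpd.read_file("survey_catalogue.gpkg", layer="multibeam_lines")
      hits = lines[lines.intersects(aoi)]

      # Report the owner and contact for each survey line intersecting the selection.
      print(hits[["survey_id", "data_owner", "contact_email"]])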

  8. Biogeography of photosynthetic light-harvesting genes in marine phytoplankton.

    PubMed

    Bibby, Thomas S; Zhang, Yinan; Chen, Min

    2009-01-01

    Photosynthetic light-harvesting proteins are the mechanism by which energy enters the marine ecosystem. The dominant prokaryotic photoautotrophs are the cyanobacterial genera Prochlorococcus and Synechococcus, which are defined by two distinct light-harvesting systems: chlorophyll-bound protein complexes and phycobilin-bound protein complexes, respectively. Here, we use the Global Ocean Sampling (GOS) Project as a unique and powerful tool to analyze the environmental diversity of photosynthetic light-harvesting genes in relation to available metadata, including geographical location and physical and chemical environmental parameters. All light-harvesting gene fragments and their metadata were obtained from the GOS database, aligned using ClustalX and classified phylogenetically. Each sequence has a name indicative of its geographic location; subsequent biogeographical analysis was performed by correlating light-harvesting gene budgets for each GOS station with surface chlorophyll concentration. Using the GOS data, we have mapped the biogeography of light-harvesting genes in marine cyanobacteria on ocean-basin scales and show that an environmental gradient exists in which chlorophyll concentration is correlated with the diversity of light-harvesting systems. Three functionally distinct types of light-harvesting genes are defined: (1) the phycobilisome (PBS) genes of Synechococcus; (2) the pcb genes of Prochlorococcus; and (3) the iron-stress-induced (isiA) genes present in some marine Synechococcus. At low chlorophyll concentrations, where nutrients are limited, the Pcb-type light-harvesting system shows greater genetic diversity, whereas at high chlorophyll concentrations, where nutrients are abundant, the PBS-type light-harvesting system shows higher genetic diversity. We interpret this as an environmental selection of specific photosynthetic strategy. Importantly, the unique light-harvesting system isiA is found in the iron-limited, high-nutrient low-chlorophyll region of the equatorial Pacific. This observation demonstrates the ecological importance of isiA genes in enabling marine Synechococcus to acclimate to iron limitation and suggests that the presence of this gene can be a natural biomarker for iron limitation in oceanic environments.
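
    The station-level correlation step can be outlined in a few lines of Python; the counts and chlorophyll values below are hypothetical placeholders, not GOS data.

      import pandas as pd
      from scipy.stats import spearmanr

      # Hypothetical per-station gene counts and surface chlorophyll (not GOS data).
      stations = pd.DataFrame({
          "station": ["GS001", "GS002", "GS003", "GS004", "GS005"],
          "chlorophyll_mg_m3": [0.05, 0.12, 0.45, 0.90, 1.30],
          "pcb_genes": [42, 35, 18, 12, 9],   # Prochlorococcus pcb-type
          "pbs_genes": [3, 7, 21, 30, 38],    # Synechococcus phycobilisome-type
      })

      for gene_type in ("pcb_genes", "pbs_genes"):
          rho, p = spearmanr(stations["chlorophyll_mg_m3"], stations[gene_type])
          print(f"{gene_type}: Spearman rho = {rho:.2f}, p = {p:.3f}")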

  9. The GBIF Integrated Publishing Toolkit: Facilitating the Efficient Publishing of Biodiversity Data on the Internet

    PubMed Central

    Robertson, Tim; Döring, Markus; Guralnick, Robert; Bloom, David; Wieczorek, John; Braak, Kyle; Otegui, Javier; Russell, Laura; Desmet, Peter

    2014-01-01

    The planet is experiencing an ongoing global biodiversity crisis. Measuring the magnitude and rate of change more effectively requires access to organized, easily discoverable, and digitally-formatted biodiversity data, both legacy and new, from across the globe. Assembling this coherent digital representation of biodiversity requires the integration of data that have historically been analog, dispersed, and heterogeneous. The Integrated Publishing Toolkit (IPT) is a software package developed to support biodiversity dataset publication in a common format. The IPT’s two primary functions are to 1) encode existing species occurrence datasets and checklists, such as records from natural history collections or observations, in the Darwin Core standard to enhance interoperability of data, and 2) publish and archive data and metadata for broad use in a Darwin Core Archive, a set of files following a standard format. Here we discuss the key need for the IPT, how it has developed in response to community input, and how it continues to evolve to streamline and enhance the interoperability, discoverability, and mobilization of new data types beyond basic Darwin Core records. We close with a discussion of how the IPT has impacted the biodiversity research community, how it enhances data publishing in more traditional journal venues, the new features implemented in the latest version of the IPT, and plans for future enhancements. PMID:25099149
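
    A Darwin Core Archive, the IPT's publication format, is a zip file whose meta.xml points at the core data file. The standard-library sketch below reads one; the archive name is hypothetical.

      import csv
      import io
      import xml.etree.ElementTree as ET
      import zipfile

      DWC_TEXT = {"dwc": "http://rs.tdwg.org/dwc/text/"}

      # "occurrence_dwca.zip" is a hypothetical archive produced by an IPT instance.
      with zipfile.ZipFile("occurrence_dwca.zip") as archive:
          meta = ET.fromstring(archive.read("meta.xml"))
          core = meta.find("dwc:core", DWC_TEXT)
          core_file = core.find("dwc:files/dwc:location", DWC_TEXT).text
          delim = core.get("fieldsTerminatedBy", "\t").encode().decode("unicode_escape")

          with archive.open(core_file) as f:
              reader = csv.reader(io.TextIOWrapper(f, encoding="utf-8"), delimiter=delim)
              header = next(reader)            # Darwin Core term names
              first = next(reader)             # first occurrence record
              print(dict(zip(header, first)))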

  10. THE EARTH SYSTEM PREDICTION SUITE: Toward a Coordinated U.S. Modeling Capability

    PubMed Central

    Theurich, Gerhard; DeLuca, C.; Campbell, T.; Liu, F.; Saint, K.; Vertenstein, M.; Chen, J.; Oehmke, R.; Doyle, J.; Whitcomb, T.; Wallcraft, A.; Iredell, M.; Black, T.; da Silva, AM; Clune, T.; Ferraro, R.; Li, P.; Kelley, M.; Aleinov, I.; Balaji, V.; Zadeh, N.; Jacob, R.; Kirtman, B.; Giraldo, F.; McCarren, D.; Sandgathe, S.; Peckham, S.; Dunlap, R.

    2017-01-01

    The Earth System Prediction Suite (ESPS) is a collection of flagship U.S. weather and climate models and model components that are being instrumented to conform to interoperability conventions, documented to follow metadata standards, and made available either under open source terms or to credentialed users. The ESPS represents a culmination of efforts to create a common Earth system model architecture, and the advent of increasingly coordinated model development activities in the U.S. ESPS component interfaces are based on the Earth System Modeling Framework (ESMF), community-developed software for building and coupling models, and the National Unified Operational Prediction Capability (NUOPC) Layer, a set of ESMF-based component templates and interoperability conventions. This shared infrastructure simplifies the process of model coupling by guaranteeing that components conform to a set of technical and semantic behaviors. The ESPS encourages distributed, multi-agency development of coupled modeling systems, controlled experimentation and testing, and exploration of novel model configurations, such as those motivated by research involving managed and interactive ensembles. ESPS codes include the Navy Global Environmental Model (NavGEM), HYbrid Coordinate Ocean Model (HYCOM), and Coupled Ocean Atmosphere Mesoscale Prediction System (COAMPS®); the NOAA Environmental Modeling System (NEMS) and the Modular Ocean Model (MOM); the Community Earth System Model (CESM); and the NASA ModelE climate model and GEOS-5 atmospheric general circulation model. PMID:29568125
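
    The coupling conventions can be illustrated conceptually: components agree on lifecycle phases and on standard field names, so a generic driver can wire them together. The Python below is an illustration only, not the actual ESMF/NUOPC API (which is Fortran- and C-based).

      from abc import ABC, abstractmethod


      class ModelComponent(ABC):
          """Interface every coupled component agrees to implement."""

          imports: tuple = ()   # standard field names this component consumes
          exports: tuple = ()   # standard field names this component produces

          @abstractmethod
          def initialize(self, clock): ...

          @abstractmethod
          def run(self, clock, import_fields): ...

          @abstractmethod
          def finalize(self): ...


      class ToyOcean(ModelComponent):
          imports = ("surface_wind_stress",)
          exports = ("sea_surface_temperature",)

          def initialize(self, clock):
              self.sst = 15.0

          def run(self, clock, import_fields):
              self.sst += 0.01 * import_fields.get("surface_wind_stress", 0.0)
              return {"sea_surface_temperature": self.sst}

          def finalize(self):
              pass


      # A generic driver can check the semantic contract before coupling components.
      ocean = ToyOcean()
      ocean.initialize(clock=None)
      fields = ocean.run(clock=None, import_fields={"surface_wind_stress": 0.1})
      assert "sea_surface_temperature" in fields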

  11. THE EARTH SYSTEM PREDICTION SUITE: Toward a Coordinated U.S. Modeling Capability.

    PubMed

    Theurich, Gerhard; DeLuca, C; Campbell, T; Liu, F; Saint, K; Vertenstein, M; Chen, J; Oehmke, R; Doyle, J; Whitcomb, T; Wallcraft, A; Iredell, M; Black, T; da Silva, A M; Clune, T; Ferraro, R; Li, P; Kelley, M; Aleinov, I; Balaji, V; Zadeh, N; Jacob, R; Kirtman, B; Giraldo, F; McCarren, D; Sandgathe, S; Peckham, S; Dunlap, R

    2016-07-01

    The Earth System Prediction Suite (ESPS) is a collection of flagship U.S. weather and climate models and model components that are being instrumented to conform to interoperability conventions, documented to follow metadata standards, and made available either under open source terms or to credentialed users. The ESPS represents a culmination of efforts to create a common Earth system model architecture, and the advent of increasingly coordinated model development activities in the U.S. ESPS component interfaces are based on the Earth System Modeling Framework (ESMF), community-developed software for building and coupling models, and the National Unified Operational Prediction Capability (NUOPC) Layer, a set of ESMF-based component templates and interoperability conventions. This shared infrastructure simplifies the process of model coupling by guaranteeing that components conform to a set of technical and semantic behaviors. The ESPS encourages distributed, multi-agency development of coupled modeling systems, controlled experimentation and testing, and exploration of novel model configurations, such as those motivated by research involving managed and interactive ensembles. ESPS codes include the Navy Global Environmental Model (NavGEM), HYbrid Coordinate Ocean Model (HYCOM), and Coupled Ocean Atmosphere Mesoscale Prediction System (COAMPS®); the NOAA Environmental Modeling System (NEMS) and the Modular Ocean Model (MOM); the Community Earth System Model (CESM); and the NASA ModelE climate model and GEOS-5 atmospheric general circulation model.

  12. The Earth System Prediction Suite: Toward a Coordinated U.S. Modeling Capability

    NASA Technical Reports Server (NTRS)

    Theurich, Gerhard; DeLuca, C.; Campbell, T.; Liu, F.; Saint, K.; Vertenstein, M.; Chen, J.; Oehmke, R.; Doyle, J.; Whitcomb, T.; hide

    2016-01-01

    The Earth System Prediction Suite (ESPS) is a collection of flagship U.S. weather and climate models and model components that are being instrumented to conform to interoperability conventions, documented to follow metadata standards, and made available either under open source terms or to credentialed users. The ESPS represents a culmination of efforts to create a common Earth system model architecture, and the advent of increasingly coordinated model development activities in the U.S. ESPS component interfaces are based on the Earth System Modeling Framework (ESMF), community-developed software for building and coupling models, and the National Unified Operational Prediction Capability (NUOPC) Layer, a set of ESMF-based component templates and interoperability conventions. This shared infrastructure simplifies the process of model coupling by guaranteeing that components conform to a set of technical and semantic behaviors. The ESPS encourages distributed, multi-agency development of coupled modeling systems, controlled experimentation and testing, and exploration of novel model configurations, such as those motivated by research involving managed and interactive ensembles. ESPS codes include the Navy Global Environmental Model (NavGEM), HYbrid Coordinate Ocean Model (HYCOM), and Coupled Ocean Atmosphere Mesoscale Prediction System (COAMPS); the NOAA Environmental Modeling System (NEMS) and the Modular Ocean Model (MOM); the Community Earth System Model (CESM); and the NASA ModelE climate model and GEOS-5 atmospheric general circulation model.

  13. World Water Online (WWO) Status and Prospects

    NASA Astrophysics Data System (ADS)

    Arctur, David; Maidment, David

    2013-04-01

    Water resources, weather, and natural disasters are not constrained by local, regional or national boundaries. Effective research, planning, and response to major events call for improved coordination and data sharing among many organizations, which requires improved interoperability among the organizations' diverse information systems. Just for the historical time series records of surface freshwater resources data compiled by U.S. national agencies, there are over 23 million distributed datasets available today. Cataloguing and searching efficiently for specific content from this many datasets presents a challenge to current standards and practices for digital geospatial catalogues. This presentation summarizes a new global platform for water resource information discovery and sharing that provides coordinated, interactive access to water resource metadata for the complete holdings of the Global Runoff Data Centre, the U.S. Geological Survey, and other primary sources. In cases where the data holdings are not restricted by national policy, this interface enables direct access to the water resource data, hydrographs, and other derived products. This capability represents a framework in which any number of other services can be integrated in user-accessible workflows, such as to perform watershed delineation from any point on the stream network. World Water Online web services for mapping and metadata have been registered with GEOSS. In addition to summarizing the architecture and capabilities of World Water Online, future plans for integration with GEOSS and EarthCube will be presented.

  14. An ontologically founded architecture for information systems in clinical and epidemiological research.

    PubMed

    Uciteli, Alexandr; Groß, Silvia; Kireyev, Sergej; Herre, Heinrich

    2011-08-09

    This paper presents an ontologically founded basic architecture for information systems, which are intended to capture, represent, and maintain metadata for various domains of clinical and epidemiological research. Clinical trials constitute an important basis for clinical research, and the accurate specification of metadata and their documentation and application in clinical and epidemiological study projects represent a significant expense in project preparation and have a considerable impact on the value and quality of these studies. An ontological foundation of an information system provides a semantic framework for the precise specification of those entities which are represented in this system. This semantic framework should be grounded, according to our approach, on a suitable top-level ontology. Such an ontological foundation leads to a deeper understanding of the entities of the domain under consideration, and provides a common unifying semantic basis, which supports the integration of data and the interoperability between different information systems. The intended information systems will be applied to the field of clinical and epidemiological research and will provide, depending on the application context, a variety of functionalities. In the present paper, we focus on a basic architecture which might be common to all such information systems. The research set forth in this paper is included in a broader framework of clinical research and continues the work of the IMISE on these topics.

  15. EMODnet High Resolution Seabed Mapping - further developing a high resolution digital bathymetry for European seas

    NASA Astrophysics Data System (ADS)

    Schaap, D.; Schmitt, T.

    2017-12-01

    Access to marine data is a key issue for the EU Marine Strategy Framework Directive and the EU Marine Knowledge 2020 agenda, which includes the European Marine Observation and Data Network (EMODnet) initiative. EMODnet aims at assembling European marine data, data products and metadata from diverse sources in a uniform way. The EMODnet Bathymetry project has developed Digital Terrain Models (DTM) for the European seas. These have been produced from survey and aggregated data sets that are indexed with metadata by adopting the SeaDataNet Catalogue services. SeaDataNet is a network of major oceanographic data centres around the European seas that manage, operate and further develop a pan-European infrastructure for marine and ocean data management. The latest EMODnet Bathymetry DTM release has a grid resolution of 1/8 arc minute and covers all European sea regions. Use has been made of circa 7,800 gathered survey datasets and composite DTMs. Catalogues and the EMODnet DTM are published at the dedicated EMODnet Bathymetry portal, including a versatile DTM viewing and downloading service. At the end of December 2016 the Bathymetry project was succeeded by EMODnet High Resolution Seabed Mapping (HRSM). This continues the gathering of bathymetric in situ data sets, with extra effort for near-coastal waters and coastal zones. In addition, Satellite Derived Bathymetry data are included to fill gaps in coverage of the coastal zones. The extra data and composite DTMs will increase the coverage of the European seas and their coastlines, and provide input for producing an EMODnet DTM with a common resolution of 1/16 arc minute. The Bathymetry Viewing and Download service will be upgraded to provide a multi-resolution map and to include 3D viewing. The higher resolution DTMs will also be used to determine best estimates of the European coastline for a range of tidal levels (HAT, MHW, MSL, Chart Datum, LAT), making use of a tidal model for Europe. Extra challenges will be 'moving to the cloud' and setting up an EMODnet Collaborative Virtual Environment (CVE) for producing the EMODnet DTMs. The presentation will highlight key details of the EMODnet Bathymetry results and how the challenges of the new HRSM project are being approached.
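
    Halving the cell size from 1/8 to 1/16 arc minute roughly quadruples the number of grid nodes. A back-of-the-envelope sketch follows; the bounding box loosely spanning the European sea regions is an assumption.

      def grid_nodes(lon_min, lon_max, lat_min, lat_max, res_arcmin):
          """Return the (nx, ny) grid dimensions for a regular lon/lat grid."""
          nodes_per_degree = 60.0 / res_arcmin
          nx = int(round((lon_max - lon_min) * nodes_per_degree))
          ny = int(round((lat_max - lat_min) * nodes_per_degree))
          return nx, ny

      # Assumed bounding box, roughly spanning the European sea regions.
      for res in (1 / 8, 1 / 16):
          nx, ny = grid_nodes(-36.0, 43.0, 25.0, 85.0, res)
          print(f"1/{int(1 / res)} arc minute: {nx} x {ny} = {nx * ny / 1e9:.1f} billion nodes")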

  16. Federating Metadata Catalogs

    NASA Astrophysics Data System (ADS)

    Baru, C.; Lin, K.

    2009-04-01

    The Geosciences Network project (www.geongrid.org) has been developing cyberinfrastructure for data sharing in the Earth Science community based on a service-oriented architecture. The project defines a standard "software stack", which includes a standardized set of software modules and corresponding service interfaces. The system employs Grid certificates for distributed user authentication. The GEON Portal provides online access to these services via a set of portlets. This service-oriented approach has enabled the GEON network to easily expand to new sites and deploy the same infrastructure in new projects. To facilitate interoperation with other distributed geoinformatics environments, service standards are being defined and implemented for catalog services and federated search across distributed catalogs. The need arises because there may be multiple metadata catalogs in a distributed system, for example, for each institution, agency, geographic region, and/or country. Ideally, a geoinformatics user should be able to search across all such catalogs by making a single search request. In this paper, we describe our implementation for such a search capability across federated metadata catalogs in the GEON service-oriented architecture. The GEON catalog can be searched using spatial, temporal, and other metadata-based search criteria. The search can be invoked as a Web service and, thus, can be embedded in any software application. The need for federated catalogs in GEON arises because (i) GEON collaborators at the University of Hyderabad, India, have deployed their own catalog, as part of the iGEON-India effort, to register information about local resources for broader access across the network, (ii) GEON collaborators in the GEO Grid (Global Earth Observations Grid) project at AIST, Japan, have implemented a catalog for their ASTER data products, and (iii) we have recently deployed a search service to access all data products from the EarthScope project in the US (http://es-portal.geongrid.org), which are distributed across data archives at IRIS in Seattle, Washington, UNAVCO in Boulder, Colorado, and at the ICDP archives in GFZ, Potsdam, Germany. This service implements a "virtual" catalog: the actual "physical" catalogs and data are stored at each of the remote locations. A federated search across all these catalogs would enable GEON users to discover data across all of these environments with a single search request. Our objective is to implement this search service via the OGC Catalog Services for the Web (CS-W) standard by providing appropriate CSW "wrappers" for each metadata catalog, as necessary. This paper will discuss technical issues in designing and deploying such a multi-catalog search service in GEON and describe an initial prototype of the federated search capability.
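
    A minimal sketch of fanning one query out across several CS-W endpoints with OWSLib follows; the endpoint URLs are placeholders, not the actual GEON or GEO Grid catalogues.

      from owslib.csw import CatalogueServiceWeb
      from owslib.fes import PropertyIsLike

      # Placeholder endpoints; the real GEON / iGEON-India / GEO Grid catalogue
      # URLs are not reproduced here.
      ENDPOINTS = [
          "https://catalog.example-geon.org/csw",
          "https://catalog.example-geogrid.jp/csw",
      ]

      query = PropertyIsLike("csw:AnyText", "%ASTER%")

      for url in ENDPOINTS:
          try:
              csw = CatalogueServiceWeb(url, timeout=30)
              csw.getrecords2(constraints=[query], maxrecords=5)
              for rec in csw.records.values():
                  print(f"{url} -> {rec.title}")
          except Exception as exc:  # a federated search should tolerate unreachable catalogs
              print(f"{url} unavailable: {exc}")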

  17. GeoSearch: A lightweight broking middleware for geospatial resources discovery

    NASA Astrophysics Data System (ADS)

    Gui, Z.; Yang, C.; Liu, K.; Xia, J.

    2012-12-01

    With petabytes of geodata and thousands of geospatial web services available over the Internet, it is critical to support geoscience research and applications by finding the best-fit geospatial resources among these massive and heterogeneous holdings. Developments over the past decades have produced many service components to facilitate geospatial resource management and discovery. However, efficient and accurate geospatial resource discovery is still a major challenge for the following reasons: 1) Entry barriers ("learning curves") hinder the usability of discovery services for end users. Different portals and catalogues adopt different access protocols, metadata formats and GUI styles to organize, present and publish metadata, and it is hard for end users to learn all these technical details and differences. 2) The cost of federating heterogeneous services is high. To provide sufficient resources and facilitate data discovery, many registries adopt periodic harvesting mechanisms to retrieve metadata from other federated catalogues. These time-consuming processes lead to network and storage burdens, data redundancy, and the overhead of maintaining data consistency. 3) Heterogeneous semantics complicate data discovery. Since keyword matching is still the primary search method in many operational discovery services, search accuracy (precision and recall) is hard to guarantee. Semantic technologies (such as semantic reasoning and similarity evaluation) offer a solution, but integrating them with existing services is challenging due to extensibility limitations of the service frameworks and metadata templates. 4) The capabilities to help users make a final selection are inadequate. Most existing search portals lack intuitive and diverse information visualization methods and functions (sort, filter) to present, explore and analyze search results. Furthermore, the presentation of value-added information (such as service quality and user feedback), which conveys important decision-support information, is missing. To address these issues, we prototyped a distributed search engine, GeoSearch, based on a brokering middleware framework to search, integrate and visualize heterogeneous geospatial resources. Specifically, 1) a lightweight discovery broker is developed to conduct distributed search; the broker retrieves metadata records for geospatial resources and additional information from dispersed services (portals and catalogues) and other systems on the fly. 2) A quality monitoring and evaluation broker (the QoS Checker) is developed and integrated to provide quality information for geospatial web services. 3) Semantic-assisted search and relevance evaluation functions are implemented by loosely interoperating with an ESIP Testbed component. 4) Sophisticated information and data visualization functionalities and tools are assembled to improve the user experience and assist resource selection.

  18. A portal for the ocean biogeographic information system

    USGS Publications Warehouse

    Zhang, Yunqing; Grassle, J. F.

    2002-01-01

    Since its inception in 1999, the Ocean Biogeographic Information System (OBIS) has developed into an international science program as well as a globally distributed network of biogeographic databases. An OBIS portal at Rutgers University provides the links and functional interoperability among member database systems. Protocols and standards have been established to support effective communication between the portal and these functional units. The portal provides distributed data searching, a taxonomy name service, a GIS with access to relevant environmental data, biological modeling, and education modules for mariners, students, environmental managers, and scientists. The portal will integrate Census of Marine Life field projects, national data archives, and other functional modules, and will provide for network-wide analyses and modeling tools.
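
    For illustration, a distributed search against OBIS can be issued today through its public web API (https://api.obis.org), which postdates the 2002 portal described above; the species name is an arbitrary example.

      import requests

      resp = requests.get(
          "https://api.obis.org/v3/occurrence",
          params={"scientificname": "Gadus morhua", "size": 5},  # arbitrary example species
          timeout=30,
      )
      resp.raise_for_status()

      for occ in resp.json().get("results", []):
          print(occ.get("scientificName"), occ.get("decimalLongitude"), occ.get("decimalLatitude"))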

  19. NOAA Marine and Arctic Monitoring Using UASs

    NASA Astrophysics Data System (ADS)

    Jacobs, T.; Coffey, J. J.; Hood, R. E.; Hall, P.; Adler, J.

    2014-12-01

    Unmanned systems have the potential to efficiently, effectively, economically and safely bridge critical observation requirements in an environmentally friendly manner. As the United States' Marine and Arctic areas of interest expand to include hard-to-reach regions of the Earth (such as the Arctic and remote oceanic areas), optimizing unmanned capabilities will be needed to advance the United States' science, technology and security efforts. Through increased multi-mission and multi-agency operations using improved interoperable and autonomous unmanned systems, the research and operations communities will better collect environmental intelligence and better protect the country against hazardous weather and environmental, marine and polar hazards. This presentation will examine NOAA's Marine and Arctic Monitoring UAS strategies, which include developing a coordinated effort to maximize the efficiency and capabilities of unmanned systems across the federal government and research partners. Numerous intra- and inter-agency operational demonstrations and assessments have been made to verify and validate these strategies. The presentation will also discuss the requisite sUAS capabilities and our experience in using them.

  20. NASA's Earth Observing System Data and Information System - Many Mechanisms for On-Going Evolution

    NASA Astrophysics Data System (ADS)

    Ramapriyan, H. K.

    2012-12-01

    NASA's Earth Observing System Data and Information System (EOSDIS) has been serving a broad user community since August 1994. As a long-lived multi-mission system serving multiple scientific disciplines and a diverse user community, EOSDIS has been evolving continuously. It has had and continues to have many forms of community input to help with this evolution. Early in its history, it had inputs from the EOSDIS Advisory Panel, benefited from the reviews by various external committees and evolved into the present distributed architecture with discipline-based Distributed Active Archive Centers (DAACs), Science Investigator-led Processing Systems and a cross-DAAC search and data access capability. EOSDIS evolution has been helped by advances in computer technology, moving from an initially planned supercomputing environment to SGI workstations to Linux clusters for computation, and from near-line archives of robotic silos with tape cassettes to RAID-disk-based on-line archives for storage. The network capacities have increased steadily over the years, making delivery of data on media almost obsolete. The advances in information systems technologies have had an even greater impact on the evolution of EOSDIS. In the early days, the advent of the World Wide Web came as a game-changer in the operation of EOSDIS. The metadata model developed for the EOSDIS Core System for representing metadata from EOS standard data products has had an influence on the Federal Geographic Data Committee's metadata content standard and the ISO metadata standards. The influence works both ways: as the ISO 19115 metadata standard has developed in recent years, EOSDIS is reviewing its metadata to ensure compliance with the standard. Improvements have been made in the cross-DAAC search and access of data using the centralized metadata clearing house (EOS Clearing House, ECHO) and the client Reverb. Given the diversity of the Earth science disciplines served by the DAACs, the DAACs have developed a number of software tools tailored to their respective user communities. Web services play an important part in improved access to data products, including some basic analysis and visualization capabilities. A coherent view into all capabilities available from EOSDIS is evolving through the "Coherent Web" effort. Data are being made available in near real-time for scientific research as well as time-critical applications. Ongoing community input for technology infusion, which maintains the vitality of EOSDIS, comes from NASA-sponsored community data system programs: Advancing Collaborative Connections for Earth System Science (ACCESS), Making Earth System Data Records for Use in Research Environments (MEaSUREs) and Applied Information System Technology (AIST). Further input comes from participation in Earth Science Data System Working Groups, the Earth Science Information Partners Federation and other interagency/international activities. An important source of community needs is the annual American Customer Satisfaction Index survey of EOSDIS users. Some of the key areas in which improvements are required and incremental progress is being made are: ease of discovery and access; cross-organizational interoperability; data inter-use; ease of collaboration; ease of citation of datasets; and preservation of provenance and context and making them conveniently available to users.
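
    As a hedged illustration of cross-DAAC search, the sketch below queries ECHO's present-day successor, the Common Metadata Repository (CMR), rather than the ECHO/Reverb interfaces named in the abstract; the collection short name and bounding box are arbitrary examples.

      import requests

      resp = requests.get(
          "https://cmr.earthdata.nasa.gov/search/granules.json",
          params={
              "short_name": "MOD09GA",          # arbitrary example collection
              "bounding_box": "-10,35,5,45",    # lon/lat box over western Europe
              "temporal": "2012-07-01T00:00:00Z,2012-07-02T00:00:00Z",
              "page_size": 5,
          },
          timeout=60,
      )
      resp.raise_for_status()

      for granule in resp.json()["feed"]["entry"]:
          print(granule["title"])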

  1. Computational knowledge integration in biopharmaceutical research.

    PubMed

    Ficenec, David; Osborne, Mark; Pradines, Joel; Richards, Dan; Felciano, Ramon; Cho, Raymond J; Chen, Richard O; Liefeld, Ted; Owen, James; Ruttenberg, Alan; Reich, Christian; Horvath, Joseph; Clark, Tim

    2003-09-01

    An initiative to increase biopharmaceutical research productivity by capturing, sharing and computationally integrating proprietary scientific discoveries with public knowledge is described. This initiative involves both organisational process change and multiple interoperating software systems. The software components rely on mutually supporting integration techniques. These include a richly structured ontology, statistical analysis of experimental data against stored conclusions, natural language processing of public literature, secure document repositories with lightweight metadata, web services integration, enterprise web portals and relational databases. This approach has already begun to increase scientific productivity in our enterprise by creating an organisational memory (OM) of internal research findings, accessible on the web. Through bringing together these components it has also been possible to construct a very large and expanding repository of biological pathway information linked to this repository of findings which is extremely useful in analysis of DNA microarray data. This repository, in turn, enables our research paradigm to be shifted towards more comprehensive systems-based understandings of drug action.

  2. A metadata initiative for global information discovery

    USGS Publications Warehouse

    Christian, E.

    2001-01-01

    The Global Information Locator Service (GILS) encompasses a global vision framed by the fundamental values of open societies. Societal values such as a free flow of information impose certain requirements on the society's information infrastructure. These requirements in turn shape the various laws, policies, standards, and technologies that determine the infrastructure design. A particular focus of GILS is the requirement to provide the means for people to discover sources of data and information. Information discovery in the GILS vision is designed to be decentralized yet coherent, and globally comprehensive yet useful for detailed data. This article introduces basic concepts and design issues, with emphasis on the techniques by which GILS supports interoperability. It explains the practical implications of GILS for the common roles of organizations involved in handling information, from content provider through system engineer and intermediary to searcher. The article provides examples of GILS initiatives in various types of communities: bibliographic, geographic, environmental, and government. © 2001 Elsevier Science Inc.

  3. Provenance for actionable data products and indicators in marine ecosystem assessments

    NASA Astrophysics Data System (ADS)

    Beaulieu, S. E.; Maffei, A. R.; Fox, P. A.; West, P.; Di Stefano, M.; Hare, J. A.; Fogarty, M.

    2013-12-01

    Ecosystem-based management of Large Marine Ecosystems (LMEs) involves the sharing of data and information products among a diverse set of stakeholders: from environmental and fisheries scientists to policy makers, commercial entities, nonprofits, and the public. Often the data products that are shared have resulted from a number of processing steps and may also have involved the combination of a number of data sources. The traceability from an actionable data product or indicator back to its original data source(s) is important not just for trust and understanding of each final data product, but also for comparison with similar data products produced by different stakeholder groups. For a data product to be traceable, its provenance, i.e., lineage or history, must be recorded and preferably machine-readable. We are collaborating on a use case to develop a software framework for the bi-annual Ecosystem Status Report (ESR) for the U.S. Northeast Shelf LME. The ESR presents indicators of ecosystem status including climate forcing, primary and secondary production, anthropogenic factors, and integrated ecosystem measures. Our software framework retrieves data, conducts standard analyses, provides iterative and interactive visualization, and generates final graphics for the ESR. The specific process for each data and information product is updated in a metadata template, including data source, code versioning, attribution, and related contextual information suitable for traceability, repeatability, explanation, verification, and validation. Here we present the use of standard metadata for provenance for data products in the ESR, in particular the W3C provenance (PROV) family of specifications, including the PROV-O ontology, which maps the PROV data model to RDF. We are also exploring extensions to PROV-O in development (e.g., PROV-ES for Earth Science Data Systems, D-PROV for workflow structure). To associate data products in the ESR with domain-specific ontologies, we are also exploring the Global Change Information System ontology, the BCO-DMO Ocean Data Ontology, and other relevant published ontologies (e.g., the Integrated Ocean Observing System ontology). We are also using the mapping of ISO 19115-2 Lineage to PROV-O and comparing both strategies for traceability of marine ecosystem indicators. The use of standard metadata for provenance for data products in the ESR will enable the transparency, and ultimately reproducibility, endorsed in the recent NOAA Information Quality Guidelines. Semantically enabling not only the provenance but also the data products will yield a better understanding of the connected web of relationships between marine ecosystem and ocean health assessments conducted by different stakeholder groups.
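
    A minimal PROV-O fragment, expressed with rdflib, showing an ESR indicator graphic derived from a source dataset by an analysis step; the entity and activity names are hypothetical.

      from rdflib import Graph, Literal, Namespace
      from rdflib.namespace import RDF

      PROV = Namespace("http://www.w3.org/ns/prov#")
      EX = Namespace("http://example.org/esr/")      # hypothetical namespace for ESR products

      g = Graph()
      g.bind("prov", PROV)

      source = EX["sst-timeseries-2013"]             # hypothetical source dataset
      activity = EX["compute-sst-anomaly"]           # hypothetical analysis step
      product = EX["sst-anomaly-figure"]             # indicator graphic appearing in the ESR

      g.add((source, RDF.type, PROV.Entity))
      g.add((product, RDF.type, PROV.Entity))
      g.add((activity, RDF.type, PROV.Activity))
      g.add((activity, PROV.used, source))
      g.add((product, PROV.wasGeneratedBy, activity))
      g.add((product, PROV.wasDerivedFrom, source))
      g.add((activity, PROV.endedAtTime, Literal("2013-06-01T12:00:00Z")))

      print(g.serialize(format="turtle"))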

  4. An open platform for promoting interoperability in solar system sciences

    NASA Astrophysics Data System (ADS)

    Csillaghy, André; Aboudarham, Jean; Berghmans, David; Jacquey, Christian

    2013-04-01

    The European coordination project CASSIS is promoting the creation of an integrated data space that will facilitate science across community boundaries in solar system sciences. Many disciplines may need to use the same data set to support scientific research, although the way it is used may depend on the project and on the particular piece of science. Often, access is hindered by differences in the way the different communities describe and store their data, as well as in how they make them accessible. Working towards this goal, we have set up an open collaboration platform, www.explorespace.eu, that can serve as a hub for discovering and developing interoperability resources in the communities involved. The platform is independent of the project and will be maintained well after the end of the funding. As a first step, we have captured the description of services already provided by the community. The openness of the collaboration platform should allow discussion with all stakeholders of ways to make key types of metadata and derived products more complete and coherent, and thus more usable across domain boundaries. Furthermore, software resources and discussions should help facilitate the development of interoperable services. The platform, along with the database of services, addresses the following questions, which we consider crucial for promoting interoperability: • Current extent of the data space coverage: What part of the common data space is already covered by the existing interoperable services in terms of data access? In other words, what data, from catalogues as well as from raw data, can be reached by an application through standard protocols today? • Needed extension of the data space coverage: What would be needed to extend the data space coverage? In other words, how can the currently accessible data space be extended by adding services? • Missing services: What applications / services are still missing and need to be developed? This is not a trivial question, as the generation of the common data space in itself creates new requirements on overarching applications that might be necessary to provide unified access to all the services. As an example, one particular aspect discussed on the platform is the design of web services. Applications today are mainly human-centred, while interoperability must happen one level below, and the back ends (databases) must be generic, i.e. independent of the applications. We intend our effort to provide developers with resources that disentangle user interfaces from data services. Many activities are challenging, and we hope they will be discussed on our platform. In particular, the quality of the services, the data space and the needs of interdisciplinary approaches are serious concerns for instruments such as ATST and EST or the ones onboard SDO and, in the future, Solar Orbiter. We believe that our platform might be useful as a kind of guide that would allow groups to avoid reinventing the wheel for each new instrument.

  5. Summary Report Panel 1: The Need for Protocols and Standards in Research on Underwater Noise Impacts on Marine Life.

    PubMed

    Erbe, Christine; Ainslie, Michael A; de Jong, Christ A F; Racca, Roberto; Stocker, Michael

    2016-01-01

    As concern about anthropogenic noise and its impacts on marine fauna is increasing around the globe, data are being compared across populations, species, noise sources, geographic regions, and time. However, much of the raw and processed data are not comparable due to differences in measurement methodology, analysis and reporting, and a lack of metadata. Common protocols and more formal, international standards are needed to ensure the effectiveness of research, conservation, regulation and practice, and unambiguous communication of information and ideas. Developing standards takes time and effort, is largely driven by a few expert volunteers, and would benefit from stakeholders' contribution and support.

  6. Sediment data collected in 2010 from Cat Island, Mississippi

    USGS Publications Warehouse

    Buster, Noreen A.; Kelso, Kyle W.; Miselis, Jennifer L.; Kindinger, Jack G.

    2014-01-01

    Scientists from the U.S. Geological Survey, St. Petersburg Coastal and Marine Science Center, in collaboration with the U.S. Army Corps of Engineers, conducted geophysical and sedimentological surveys in 2010 around Cat Island, Mississippi, which is the westernmost island in the Mississippi-Alabama barrier island chain. The objective of the study was to understand the geologic evolution of Cat Island relative to other barrier islands in the northern Gulf of Mexico by identifying relationships between the geologic history, present day morphology, and sediment distribution. This data series serves as an archive of terrestrial and marine sediment vibracores collected August 4-6 and October 20-22, 2010, respectively. Geographic information system data products include marine and terrestrial core locations and 2007 shoreline data. Additional files include marine and terrestrial core description logs, core photos, results of sediment grain-size analyses, optically stimulated luminescence dating and carbon-14 dating locations and results, Field Activity Collection System logs, and formal Federal Geographic Data Committee metadata.

  7. Exposing Coverage Data to the Semantic Web within the MELODIES project: Challenges and Solutions

    NASA Astrophysics Data System (ADS)

    Riechert, Maik; Blower, Jon; Griffiths, Guy

    2016-04-01

    Coverage data, typically large in data volume, assigns values to a given set of spatiotemporal positions, together with metadata on how to interpret those values. Existing storage formats like netCDF, HDF and GeoTIFF all have various restrictions that prevent them from being preferred formats for use over the web, especially the semantic web. Factors that are relevant here are the processing complexity, the semantic richness of the metadata, and the ability to request partial information, such as a subset or just the appropriate metadata. Making coverage data available within web browsers opens the door to new ways of working with such data, including new types of visualization and on-the-fly processing. As part of the European project MELODIES (http://melodiesproject.eu) we look into the challenges of exposing such coverage data in an interoperable and web-friendly way, and propose solutions using a host of emerging technologies like JSON-LD, the DCAT and GeoDCAT-AP ontologies, the CoverageJSON format, and new approaches to REST APIs for coverage data. We developed the CoverageJSON format within the MELODIES project as an additional way to expose coverage data to the web, alongside having simple rendered images available through standards like OGC's WMS. CoverageJSON partially incorporates JSON-LD but does not encode individual data values as semantic resources, making use of the technology in a practical manner. The development also focused on making CoverageJSON a potential output format for OGC WCS. We will demonstrate how existing netCDF data can be exposed as CoverageJSON resources on the web, together with a REST API that allows users to explore the data and run operations such as spatiotemporal subsetting. We will show various use cases from the MELODIES project, including client-side reclassification of a land cover dataset within the browser, where the user can influence the reclassification result, by making use of the above technologies.
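
    For orientation, the sketch below assembles a tiny CoverageJSON document in Python following the published structure; the values are synthetic and the referencing block is omitted for brevity.

      import json

      # Synthetic 2 x 2 sea surface temperature grid; referencing metadata omitted
      # for brevity (a complete document would declare the CRS and calendar).
      coverage = {
          "type": "Coverage",
          "domain": {
              "type": "Domain",
              "domainType": "Grid",
              "axes": {
                  "x": {"values": [-1.0, 0.0]},
                  "y": {"values": [50.0, 51.0]},
              },
          },
          "parameters": {
              "SST": {
                  "type": "Parameter",
                  "observedProperty": {"label": {"en": "Sea surface temperature"}},
                  "unit": {"symbol": "K"},
              }
          },
          "ranges": {
              "SST": {
                  "type": "NdArray",
                  "dataType": "float",
                  "axisNames": ["y", "x"],
                  "shape": [2, 2],
                  "values": [284.1, 284.3, 283.9, 284.0],
              }
          },
      }

      print(json.dumps(coverage, indent=2))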

  8. iTools: A Framework for Classification, Categorization and Integration of Computational Biology Resources

    PubMed Central

    Dinov, Ivo D.; Rubin, Daniel; Lorensen, William; Dugan, Jonathan; Ma, Jeff; Murphy, Shawn; Kirschner, Beth; Bug, William; Sherman, Michael; Floratos, Aris; Kennedy, David; Jagadish, H. V.; Schmidt, Jeanette; Athey, Brian; Califano, Andrea; Musen, Mark; Altman, Russ; Kikinis, Ron; Kohane, Isaac; Delp, Scott; Parker, D. Stott; Toga, Arthur W.

    2008-01-01

    The advancement of the computational biology field hinges on progress in three fundamental directions – the development of new computational algorithms, the availability of informatics resource management infrastructures and the capability of tools to interoperate and synergize. There is an explosion in algorithms and tools for computational biology, which makes it difficult for biologists to find, compare and integrate such resources. We describe a new infrastructure, iTools, for managing the query, traversal and comparison of diverse computational biology resources. Specifically, iTools stores information about three types of resources: data, software tools and web services. The iTools design, implementation and resource meta-data content reflect the broad research, computational, applied and scientific expertise available at the seven National Centers for Biomedical Computing. iTools provides a system for classification, categorization and integration of different computational biology resources across space-and-time scales, biomedical problems, computational infrastructures and mathematical foundations. A large number of resources are already iTools-accessible to the community and this infrastructure is rapidly growing. iTools includes human and machine interfaces to its resource meta-data repository. Investigators or computer programs may utilize these interfaces to search, compare, expand, revise and mine meta-data descriptions of existent computational biology resources. We propose two ways to browse and display the iTools dynamic collection of resources. The first one is based on an ontology of computational biology resources, and the second one is derived from hyperbolic projections of manifolds or complex structures onto planar discs. iTools is an open source project both in terms of the source code development as well as its meta-data content. iTools employs a decentralized, portable, scalable and lightweight framework for long-term resource management. We demonstrate several applications of iTools as a framework for integrated bioinformatics. iTools and the complete details about its specifications, usage and interfaces are available at the iTools web page http://iTools.ccb.ucla.edu. PMID:18509477

  9. Incorporating Brokers within Collaboration Environments

    NASA Astrophysics Data System (ADS)

    Rajasekar, A.; Moore, R.; de Torcy, A.

    2013-12-01

    A collaboration environment, such as the integrated Rule Oriented Data System (iRODS, http://irods.diceresearch.org), provides interoperability mechanisms for accessing storage systems, authentication systems, messaging systems, information catalogs, networks, and policy engines from a wide variety of clients. The interoperability mechanisms function as brokers, translating actions requested by clients to the protocol required by a specific technology. The iRODS data grid is used to enable collaborative research within hydrology, seismology, earth science, climate, oceanography, plant biology, astronomy, physics, and genomics disciplines. Although each domain has unique resources, data formats, semantics, and protocols, the iRODS system provides a generic framework that is capable of managing collaborative research initiatives that span multiple disciplines. Each interoperability mechanism (broker) is linked to a name space that enables unified access across the heterogeneous systems. The collaboration environment provides not only support for brokers, but also support for virtualization of name spaces for users, files, collections, storage systems, metadata, and policies. The broker enables access to data or information in a remote system using the appropriate protocol, while the collaboration environment provides a uniform naming convention for accessing and manipulating each object. Within the NSF DataNet Federation Consortium project (http://www.datafed.org), three basic types of interoperability mechanisms have been identified and applied: 1) drivers for managing manipulation at the remote resource (such as data subsetting), 2) micro-services that execute the protocol required by the remote resource, and 3) policies for controlling the execution. For example, drivers have been written for manipulating NetCDF and HDF formatted files within THREDDS servers. Micro-services have been written that manage interactions with the CUAHSI data repository, the DataONE information catalog, and the GeoBrain broker. Policies have been written that manage transfer of messages between an iRODS message queue and the Advanced Message Queuing Protocol. Examples of these brokering mechanisms will be presented. The DFC collaboration environment serves as the intermediary between community resources and compute grids, enabling reproducible data-driven research. It is possible to create an analysis workflow that retrieves data subsets from a remote server, assembles the required input files, automates the execution of the workflow, automatically tracks its provenance, and shares the input files, workflow, and output files. A collaborator can re-execute a shared workflow, compare results, change input files, and re-execute an analysis.

  10. Improved integration and discoverability of tephra data for multidisciplinary applications

    NASA Astrophysics Data System (ADS)

    Kuehn, S. C.; Bursik, M. I.; Pouget, S.

    2013-12-01

    Tephra deposits form a common thread which connects diverse, multidisciplinary research directions that share overlapping data needs. Tephra beds reflect the magmatic, eruptive, and dispersal processes involved in their generation as well as the tectonic environment from which they originate. Therefore, they are globally important for examining links between tectonics, magma chemistry, volcano behavior, and environmental effects. They are fundamental for understanding past eruptions and future hazards, and they are key for dating both geologic and prehistoric events. It is, perhaps, in tephrochronology that tephra beds find their most diverse applications: providing isochrons of nearly unmatched temporal precision across regional to continental and even inter-continental distances; tying together glacial, marine, lacustrine, and terrestrial records; and helping to answer major questions in climate change, archaeology, paleontology, paleoecology, paleolimnology, paleoseismology, and geomorphology, among others. Tephra data include physical (particle size, bed thickness), mineralogical, geographic, time-stratigraphic, geochemical, and interpretive information. Data collected over decades currently exist largely in disparate, disconnected, and commonly offline datasets, and this severely limits discovery and accessibility. The integration of such data along with eruption catalogs into unified or interoperable databases linked at the eruption scale is a critical need. The need is especially acute for tephrochronology, which is by its very nature a comparative technique requiring access to ideally comprehensive, multiparameter datasets for the identification and correlation of tephra beds. To meet the needs of this large research community, we envision (1) the integration of decades of tephra data and available metadata into a system with a single point of access for all data types, (2) development of an interface and mechanism for multiparameter searching, (3) development of protocols for more routine collection and reporting of physical data for tephrochronology samples and better collection and reporting of metadata, (4) simplification of data entry to encourage routine submission of new data. Doing so will substantially enhance progress on fundamental questions in volcanology and petrology, facilitate progress toward an integrated tephrochronologic framework for North America (and support efforts to do so in other regions), and increase efficacy and confidence in tephra correlation. As a first step, we are currently planning a workshop in cooperation with the IAVCEI commission on tephra hazard modeling, VHub, and others. We anticipate community-wide involvement (introducing the volcanologists to the Quaternary scientists, for example) resulting in enhanced cooperation that benefits all tephra researchers and fostering the development of new collaborative studies. We plan to discuss the state of the art in tephra studies and community-wide data needs with the goal of formulating a path forward.

  11. Seamless Provenance Representation and Use in Collaborative Science Scenarios

    NASA Astrophysics Data System (ADS)

    Missier, P.; Ludaescher, B.; Bowers, S.; Altintas, I.; Anand, M. K.; Dey, S.; Sarkar, A.; Shrestha, B.; Goble, C.

    2010-12-01

    The notion of sharing scientific data has only recently begun to gain ground in science, where data is still considered a private asset. There is growing evidence, however, that the benefits of scientific collaboration through early data sharing during the course of a science project may outweigh the risk of losing exclusive ownership of the data. As exemplary success stories are making the headlines [1], principles of effective information sharing have become the subject of e-science research. In particular, any piece of published data should be self-describing, to the extent necessary for consumers to determine its suitability for reuse in their own projects. This is accomplished by associating a body of formally specified and machine-processable metadata with the data. When data is produced and reused by independent groups, however, metadata interoperability issues emerge. This is the case for provenance, a form of metadata that describes the history of a data product, Y. Provenance is typically expressed as a graph-structured set of dependencies that account for the sequence of computational or interactive steps that led to Y, often starting from some primary, observational data. Traversing dependency graphs is one of the mechanisms used to answer questions on data reliability. In the context of the NSF DataONE project [2], we have been studying issues of provenance interoperability in scientific collaboration scenarios. Consider a first scientist, Alice, who publishes a data product X along with its provenance, and a second scientist who further transforms X into a new product Y, also along with its provenance. A third scientist, who is interested in Y, expects to be able to trace Y's history up to the inputs used by Alice. This is only possible, however, if provenance accumulates into a single, uniform graph that can be seamlessly traversed. This becomes problematic when provenance is captured using different tools and computational models (i.e. workflow systems), as well as when data is published and reused using mechanisms that are not provenance-aware. In this presentation we discuss requirements for ensuring provenance-aware data publishing and reuse, and describe the design and implementation of a prototype toolkit that involves two specific, and broadly used, workflow models, Kepler [3] and Taverna [4]. The implementation is expected to be adopted as part of DataONE's investigators' toolkit, in support of its mission of large-scale data preservation. Refs. [1] Sharing of Data Leads to Progress on Alzheimer’s, G. Kolata, NYT, 8/12/2010. [2] http://www.dataone.org. [3] Ludaescher B., Altintas I. et al. Scientific Workflow Management and the Kepler System. Special Issue: Workflow in Grid Systems. Concurrency and Computation: Practice & Experience 18(10): 1039-1065, 2006. [4] D. Hull, K. Wolstencroft, R. Stevens, C. Goble, M. R. Pocock, P. Li, T. Oinn. Taverna: a tool for building and running workflows of services. Nucl. Acids Res. 34: W729-W732, 2006.

  12. Building an Internet of Samples: The Australian Contribution

    NASA Astrophysics Data System (ADS)

    Wyborn, Lesley; Klump, Jens; Bastrakova, Irina; Devaraju, Anusuriya; McInnes, Brent; Cox, Simon; Karssies, Linda; Martin, Julia; Ross, Shawn; Morrissey, John; Fraser, Ryan

    2017-04-01

    Physical samples are often the ground truth for research reported in the scientific literature across multiple domains. They are collected by many different entities (individual researchers, laboratories, government agencies, mining companies, citizens, museums, etc.). Samples must be curated over the long term, both to ensure that their existence is known and to allow any data derived from them through laboratory and field tests to be linked to the physical samples. For example, unique identifiers that link ground-truth data back to the original sample help calibrate large volumes of remotely sensed data. Access to catalogues of reliably identified samples from several collections promotes collaboration across all Earth Science disciplines. It also increases the cost effectiveness of research by reducing the need to re-collect samples in the field. The assignment of web identifiers to the digital representations of these physical objects allows us to link to data, literature, investigators and institutions, thus creating an "Internet of Samples". An Australian implementation of the "Internet of Samples" is using the IGSN (International Geo Sample Number, http://igsn.github.io) to identify samples in a globally unique and persistent way. IGSN was developed in the solid earth science community and is recommended for sample identification by the Coalition for Publishing Data in the Earth and Space Sciences (COPDESS). IGSN is interoperable with other persistent identifier systems such as DataCite. Furthermore, the basic IGSN description metadata schema is compatible with existing schemas such as OGC Observations and Measurements (O&M) and the DataCite Metadata Schema, which makes crosswalks to other metadata schemas easy. IGSN metadata is disseminated through the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), allowing it to be aggregated in other applications such as portals (e.g. the Australian IGSN catalogue http://igsn2.csiro.au). The metadata is available in more than one format. The software for IGSN web services is based on components developed for DataCite and adapted to the specific requirements of IGSN. This cooperation in open source development ensures sustainable implementation and faster turnaround times for updates. IGSN, in particular in its Australian implementation, is characterised by a federated approach to system architecture and organisational governance, giving it the necessary flexibility to adapt to particular local practices within multiple domains, whilst maintaining an overarching international standard. The three current IGSN allocation agents in Australia (Geoscience Australia, CSIRO and Curtin University) represent different sectors. Through funding from the Australian Research Data Services Program, they have combined to develop a common web portal that allows discovery of physical samples and sample collections at a national level. International governance then ensures we can link to an international community but at the same time act locally to ensure that the services offered are relevant to the needs of Australian researchers. This flexibility aids the integration of new disciplines into a global physical-samples information network.
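
    The dissemination mechanism named above is plain OAI-PMH, so any standard harvesting client can aggregate the sample metadata. The sketch below issues a ListRecords request and prints record identifiers; the endpoint URL is a hypothetical placeholder rather than a confirmed IGSN service address, and oai_dc is simply the metadata format every OAI-PMH repository must support.

    ```python
    # Sketch of harvesting sample metadata via OAI-PMH, the protocol used for
    # IGSN metadata dissemination. The endpoint URL below is an illustrative
    # placeholder, not a verified IGSN service address.
    import requests
    import xml.etree.ElementTree as ET

    OAI = "{http://www.openarchives.org/OAI/2.0/}"
    endpoint = "https://example.org/oai"   # hypothetical OAI-PMH base URL

    resp = requests.get(endpoint, params={
        "verb": "ListRecords",
        "metadataPrefix": "oai_dc",        # Dublin Core, mandatory in OAI-PMH
    })
    resp.raise_for_status()

    root = ET.fromstring(resp.content)
    for header in root.iter(f"{OAI}header"):
        identifier = header.findtext(f"{OAI}identifier")
        datestamp = header.findtext(f"{OAI}datestamp")
        print(identifier, datestamp)
    ```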

  13. The Earth System Prediction Suite: Toward a Coordinated U.S. Modeling Capability

    DOE PAGES

    Theurich, Gerhard; DeLuca, C.; Campbell, T.; ...

    2016-08-22

    The Earth System Prediction Suite (ESPS) is a collection of flagship U.S. weather and climate models and model components that are being instrumented to conform to interoperability conventions, documented to follow metadata standards, and made available either under open-source terms or to credentialed users. Furthermore, the ESPS represents a culmination of efforts to create a common Earth system model architecture, and the advent of increasingly coordinated model development activities in the United States. ESPS component interfaces are based on the Earth System Modeling Framework (ESMF), community-developed software for building and coupling models, and the National Unified Operational Prediction Capability (NUOPC) Layer, a set of ESMF-based component templates and interoperability conventions. Our shared infrastructure simplifies the process of model coupling by guaranteeing that components conform to a set of technical and semantic behaviors. The ESPS encourages distributed, multiagency development of coupled modeling systems; controlled experimentation and testing; and exploration of novel model configurations, such as those motivated by research involving managed and interactive ensembles. ESPS codes include the Navy Global Environmental Model (NAVGEM), the Hybrid Coordinate Ocean Model (HYCOM), and the Coupled Ocean–Atmosphere Mesoscale Prediction System (COAMPS); the NOAA Environmental Modeling System (NEMS) and the Modular Ocean Model (MOM); the Community Earth System Model (CESM); and the NASA ModelE climate model and the Goddard Earth Observing System Model, version 5 (GEOS-5), atmospheric general circulation model.

  14. The Earth System Prediction Suite: Toward a Coordinated U.S. Modeling Capability

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Theurich, Gerhard; DeLuca, C.; Campbell, T.

    The Earth System Prediction Suite (ESPS) is a collection of flagship U.S. weather and climate models and model components that are being instrumented to conform to interoperability conventions, documented to follow metadata standards, and made available either under open-source terms or to credentialed users. Furthermore, the ESPS represents a culmination of efforts to create a common Earth system model architecture, and the advent of increasingly coordinated model development activities in the United States. ESPS component interfaces are based on the Earth System Modeling Framework (ESMF), community-developed software for building and coupling models, and the National Unified Operational Prediction Capability (NUOPC) Layer, a set of ESMF-based component templates and interoperability conventions. Our shared infrastructure simplifies the process of model coupling by guaranteeing that components conform to a set of technical and semantic behaviors. The ESPS encourages distributed, multiagency development of coupled modeling systems; controlled experimentation and testing; and exploration of novel model configurations, such as those motivated by research involving managed and interactive ensembles. ESPS codes include the Navy Global Environmental Model (NAVGEM), the Hybrid Coordinate Ocean Model (HYCOM), and the Coupled Ocean–Atmosphere Mesoscale Prediction System (COAMPS); the NOAA Environmental Modeling System (NEMS) and the Modular Ocean Model (MOM); the Community Earth System Model (CESM); and the NASA ModelE climate model and the Goddard Earth Observing System Model, version 5 (GEOS-5), atmospheric general circulation model.

  15. Using Open and Interoperable Ways to Publish and Access LANCE AIRS Near-Real Time Data

    NASA Astrophysics Data System (ADS)

    Zhao, P.; Lynnes, C.; Vollmer, B.; Savtchenko, A. K.; Yang, W.

    2011-12-01

    Atmospheric Infrared Sounder (AIRS) Near-Real Time (NRT) data from the Land Atmosphere Near real time Capability for EOS (LANCE) provide information on the global and regional atmospheric state with very low latency. An open and interoperable platform is useful to facilitate access to and integration of LANCE AIRS NRT data. This paper discusses the use of open-source software components to build Web services for publishing and accessing AIRS NRT data in the context of Service Oriented Architecture (SOA). The AIRS NRT data have also been made available through an OPeNDAP server. OPeNDAP allows several open-source netCDF-based tools, such as the Integrated Data Viewer, Ferret and Panoply, to directly display the Level 2 data over the network. To enable users to locate swath data files in the OPeNDAP server that lie within a certain geographical area, graphical "granule maps" are being added to show the outline of each file on a map of the Earth. The metadata of the AIRS NRT data and services is then explored to implement information advertisement and discovery in catalogue systems. Datacasting, an RSS-based technology for accessing Earth Science data and information, is also discussed as a way to facilitate subscriptions to AIRS NRT data availability and the filtering, downloading and viewing of data. To provide an easy entry point to AIRS NRT data and services, a Web portal designed for customized data downloading and visualization is introduced.
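
    Since the abstract highlights OPeNDAP as the subsetting-friendly access path, a short sketch may help: the snippet below opens a remote granule and reads a few variables over the network. The URL and variable names are hypothetical placeholders, as actual LANCE AIRS NRT paths and product layouts vary.

    ```python
    # Sketch of reading a remote granule through OPeNDAP with netCDF4
    # (requires a netCDF4 build with DAP support). The URL and variable
    # names are illustrative placeholders, not actual LANCE AIRS paths.
    from netCDF4 import Dataset

    url = "https://example.nasa.gov/opendap/AIRS_NRT/granule.hdf"  # placeholder
    ds = Dataset(url)

    # Only the requested slices travel over the network.
    temp = ds.variables["TSurfAir"][:, :]     # variable name is illustrative
    lat = ds.variables["Latitude"][:, :]
    lon = ds.variables["Longitude"][:, :]

    print(temp.shape, float(temp.mean()))
    ds.close()
    ```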

  16. Semantics in NETMAR (open service NETwork for MARine environmental data)

    NASA Astrophysics Data System (ADS)

    Leadbetter, Adam; Lowry, Roy; Clements, Oliver

    2010-05-01

    Over recent years, there has been a proliferation of environmental data portals utilising a wide range of systems and services, many of which cannot interoperate. The European Union Framework 7 project NETMAR (which commenced in February 2010) aims to provide a toolkit for building such portals in a coherent manner through the use of chained Open Geospatial Consortium Web Services (WxS), OPeNDAP file access and W3C standards controlled by a Business Process Execution Language workflow. As such, the end product will be configurable by user communities interested in developing a portal for marine environmental data, and will offer search, download and integration tools for a range of satellite, model and observed data from open ocean and coastal areas. Further processing of these data will also be available in order to provide statistics and derived products suitable for decision making in the chosen environmental domain. In order to make the resulting portals truly interoperable, the NETMAR programme requires a detailed definition of the semantics of the services being called and the data which are being requested. A key goal of the NETMAR programme is, therefore, to develop a multi-domain and multilingual ontology of marine data and services. This will allow searches across both human languages and scientific domains. The approach taken will be to analyse existing semantic resources and provide mappings between them, gluing together the definitions, semantics and workflows of the WxS services. The mappings between terms aim to be more general than the standard "narrower than", "broader than" relations seen in the thesauri or simple ontologies implemented by previous programmes. Tools for the development and population of ontologies will also be provided by NETMAR, as there will be instances in which existing resources cannot sufficiently describe newly encountered data or services.
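
    One common way to express richer cross-vocabulary mappings of the kind discussed above is with SKOS mapping properties, which go beyond plain broader/narrower relations. The sketch below uses rdflib for illustration; the concept URIs are invented placeholders, not actual NETMAR or published vocabulary identifiers.

    ```python
    # Sketch of asserting mappings between terms from two marine vocabularies
    # using SKOS mapping properties. All URIs are illustrative placeholders.
    from rdflib import Graph, URIRef
    from rdflib.namespace import SKOS

    g = Graph()

    sst_a = URIRef("http://example.org/vocabA/SeaSurfaceTemperature")  # placeholder
    sst_b = URIRef("http://example.org/vocabB/SST")                    # placeholder
    water_temp = URIRef("http://example.org/vocabB/WaterTemperature")  # placeholder

    # An exact correspondence between two vocabularies ...
    g.add((sst_a, SKOS.exactMatch, sst_b))
    # ... and a looser, non-hierarchical association than broader/narrower.
    g.add((sst_a, SKOS.relatedMatch, water_temp))

    print(g.serialize(format="turtle"))
    ```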

  17. Enhancing the Value of Sensor-based Observations by Capturing the Knowledge of How An Observation Came to Be

    NASA Astrophysics Data System (ADS)

    Fredericks, J.; Rueda-Velasquez, C. A.

    2016-12-01

    As we move from keeping data on our disks to sharing it with the world, often in real time, we are obligated to also tell an unknown user how our observations were made. Data that are shared must include more than ownership metadata, unit descriptions and content formatting information. The provider must also share the information needed to assess the data as it relates to potential re-use. A user must be able to assess the limitations and capabilities of the sensor, as configured, to understand its value. For example, how an instrument is configured typically affects the data accuracy and operational limits of the sensor. An operator may sacrifice data accuracy to achieve a broader operational range, and vice versa. When looking at newly discovered data, it is important to be able to find all of the information that relates to assessing the data quality for a particular application. Traditionally, metadata are captured by data managers who usually do not know how the data are collected. By the time data are distributed, this knowledge is often gone, buried within notebooks or hidden in documents that are not machine-harvestable and often not human-readable. In a recently funded NSF EarthCube Integrative Activity called X-DOMES (Cross-Domain Observational Metadata in EnviroSensing), mechanisms are being developed to enable the capture of sensor and deployment metadata by sensor manufacturers and field operators. The support has enabled the development of a community ontology repository (COR) within the Earth Science Information Partnership (ESIP) community, fostering easy creation of resolvable terms for the broader community. This tool enables non-experts to easily develop W3C standards-based content, promoting the implementation of Semantic Web technologies for enhanced discovery of content and interoperability in workflows. The X-DOMES project is also developing a SensorML Viewer/Editor to provide an easy interface for sensor manufacturers and field operators to fully describe sensor capabilities and configuration/deployment content, automatically generating it in machine-harvestable encodings that can be referenced by data managers and/or associated with the data through web services, such as the OGC SWE Sensor Observation Service.

  18. Bathymetry and Acoustic Backscatter: Northern Santa Barbara Channel, Southern California

    USGS Publications Warehouse

    Dartnell, Pete; Finlayson, David; Conrad, Jamie; Cochrane, Guy; Johnson, Samuel

    2010-01-01

    In the summer of 2008, as part of the California Seafloor Mapping Program (CSMP), the U.S. Geological Survey Coastal and Marine Geology Program mapped a nearshore region of the northern Santa Barbara Channel in Southern California (fig. 1). The CSMP is a cooperative partnership between federal and state agencies, universities, and industry to create a comprehensive coastal/marine geologic and habitat basemap series to support the Marine Life Protection Act (MLPA) initiative. The program is supported by the California Ocean Protection Council and the California Coastal Conservancy. The 2008 mapping collected high-resolution bathymetry and acoustic backscatter data using a bathymetric sidescan system within state waters, from about the 10-m isobath out to the 3-nautical-mile limit. This Open-File Report provides these data in a number of different formats, as well as a summary of the mapping mission, maps of bathymetry and backscatter, and FGDC metadata.

  19. Mapping the World's Marine Protected and Managed Areas - Promoting Awareness, Compliance, and Enforcement via Open Data and Tools.

    NASA Astrophysics Data System (ADS)

    Vincent, T.; Zetterlind, V.; Tougher, B.

    2016-12-01

    Marine Protected and Managed Areas (MPAs) are a cornerstone of coastal and ocean conservation efforts and reflect years of dedicated effort to protect species and habitats through science-based regulation. When MPAs are effective, biomass increases dramatically, by up to 14-fold, and they play a significant role in conserving biodiversity. Effective MPAs require enforcement, and enforcement cannot occur without awareness of MPA locations among ocean stakeholders and the general public. The Anthropocene Institute, in partnership with the NOAA Marine Protected Area Center, is creating an actively managed, free and open, worldwide database of MPAs, including normalized metadata and regulation summaries, full GIS boundaries, revision history, and public-facing interactive web maps. The project employs two full-time lawyers who first comb the relevant regulations, two full-time geographers, and a full-time GIS database/web engineer.

  20. YummyData: providing high-quality open life science data

    PubMed Central

    Yamaguchi, Atsuko; Splendiani, Andrea

    2018-01-01

    Many life science datasets are now available via Linked Data technologies, meaning that they are represented in a common format (the Resource Description Framework), and are accessible via standard APIs (SPARQL endpoints). While this is an important step toward developing an interoperable bioinformatics data landscape, it also creates a new set of obstacles, as it is often difficult for researchers to find the datasets they need. Different providers frequently offer the same datasets, with different levels of support: as well as having more or less up-to-date data, some providers add metadata to describe the content, structures, and ontologies of the stored datasets while others do not. We currently lack a place where researchers can go to easily assess datasets from different providers in terms of metrics such as service stability or metadata richness. We also lack a space for collecting feedback and improving data providers’ awareness of user needs. To address this issue, we have developed YummyData, which consists of two components. One periodically polls a curated list of SPARQL endpoints, monitoring the states of their Linked Data implementations and content. The other presents the information measured for the endpoints and provides a forum for discussion and feedback. YummyData is designed to improve the findability and reusability of life science datasets provided as Linked Data and to foster its adoption. It is freely accessible at http://yummydata.org/. Database URL: http://yummydata.org/ PMID:29688370
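
    The first YummyData component described above amounts to a periodic availability and content check against each endpoint. A minimal sketch of one such check is shown below; the endpoint URL is a placeholder, and a production monitor would also catch timeouts and record results over time.

    ```python
    # Sketch of a single endpoint check of the kind YummyData performs
    # periodically: query a SPARQL endpoint, time the response, and record
    # a simple liveness/size metric. The endpoint URL is a placeholder.
    import time
    import requests

    endpoint = "https://example.org/sparql"   # hypothetical SPARQL endpoint
    query = "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }"

    start = time.time()
    resp = requests.get(
        endpoint,
        params={"query": query},
        headers={"Accept": "application/sparql-results+json"},
        timeout=30,
    )
    elapsed = time.time() - start

    if resp.ok:
        n_triples = resp.json()["results"]["bindings"][0]["n"]["value"]
        print(f"alive, {n_triples} triples, {elapsed:.1f}s")
    else:
        print(f"unreachable (HTTP {resp.status_code})")
    ```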

  1. Using Semantic Web technologies for the generation of domain-specific templates to support clinical study metadata standards.

    PubMed

    Jiang, Guoqian; Evans, Julie; Endle, Cory M; Solbrig, Harold R; Chute, Christopher G

    2016-01-01

    The Biomedical Research Integrated Domain Group (BRIDG) model is a formal domain analysis model for protocol-driven biomedical research, and serves as a semantic foundation for application and message development in the standards developing organizations (SDOs). The increasing sophistication and complexity of the BRIDG model require new approaches to the management and utilization of the underlying semantics to harmonize domain-specific standards. The objective of this study is to develop and evaluate a Semantic Web-based approach that integrates the BRIDG model with ISO 21090 data types to generate domain-specific templates to support clinical study metadata standards development. We developed a template generation and visualization system based on an open source Resource Description Framework (RDF) store backend, a SmartGWT-based web user interface, and a "mind map" based tool for the visualization of generated domain-specific templates. We also developed a RESTful Web Service informed by the Clinical Information Modeling Initiative (CIMI) reference model for access to the generated domain-specific templates. A preliminary usability study was performed, and all reviewers (n = 3) gave very positive responses to the evaluation questions regarding usability and the capability of meeting the system requirements (average score 4.6). Semantic Web technologies provide a scalable infrastructure and have great potential to enable computable semantic interoperability of models in the intersection of health care and clinical research.

  2. Enriching the Web Processing Service

    NASA Astrophysics Data System (ADS)

    Wosniok, Christoph; Bensmann, Felix; Wössner, Roman; Kohlus, Jörn; Roosmann, Rainer; Heidmann, Carsten; Lehfeldt, Rainer

    2014-05-01

    The OGC Web Processing Service (WPS) provides a standard for implementing geospatial processes in service-oriented networks. In its current version 1.0.0 it specifies the operations GetCapabilities, DescribeProcess and Execute, which can be used to offer custom processes based on single or multiple sub-processes. A large range of ready-to-use, fine-grained, fundamental geospatial processes has been developed by the GIS community in the past. However, modern use cases and whole workflow processes demand specifications for lifecycle management and service orchestration. Orchestrating smaller sub-processes is a step towards interoperability; comprehensive documentation using appropriate metadata is also required. Though different approaches were tested in the past, developing complex WPS applications still requires programming skills, knowledge about the software libraries in use, and considerable integration effort. Our toolset RichWPS aims at providing a better overall experience by setting up two major components. The RichWPS ModelBuilder enables the graphics-aided design of workflow processes based on existing local and distributed processes and geospatial services. Once tested by the RichWPS Server, a composition can be deployed for production use on the RichWPS Server. The ModelBuilder obtains the necessary processes and services from a directory service, the RichWPS semantic proxy. It manages the lifecycle and is able to visualize results and debugging information. One aim is to generate reproducible results; the workflow should be documented by metadata that can be integrated into Spatial Data Infrastructures. The RichWPS Server provides a set of interfaces to the ModelBuilder for, among other things, testing composed workflow sequences, estimating their performance, and publishing them as common processes. The server is therefore oriented towards the upcoming WPS 2.0 standard and its ability to transactionally deploy and undeploy processes, making use of a WPS-T interface. In order to deal with the results of these processing workflows, a server-side extension enables the RichWPS Server and its clients to use WPS presentation directives (WPS-PD), a content-related enhancement of the standardized WPS schema. We identified essential requirements for the components of our toolset by applying two use cases. The first enables the simplified comparison of modeled and measured data, a common task in hydro-engineering used to validate the accuracy of a model. An implementation of the workflow includes reading, harmonizing and comparing two datasets in NetCDF format. 2D water-level data from the German Bight can be chosen, presented and evaluated in a web client with interactive plots. The second use case is motivated by the Marine Strategy Directive (MSD) of the EU, which demands monitoring, action plans and at least an evaluation of the ecological situation in the marine environment. Information techniques adapted to those of INSPIRE should be used. One of the parameters monitored and evaluated for the MSD is the extent and quality of seagrass fields. With a view towards other evaluation parameters, we decompose the complex process of seagrass evaluation into reusable process steps and implement those packages as configurable WPS processes.
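
    As an illustration of the three operations named above, the sketch below issues them through the WPS 1.0.0 key-value-pair binding. The service URL, process identifier and inputs are hypothetical placeholders rather than a RichWPS deployment, and complex inputs would normally be sent via an XML POST request instead.

    ```python
    # Sketch of the three WPS 1.0.0 operations via their key-value-pair
    # bindings. Service URL, process identifier and inputs are placeholders.
    import requests

    wps = "https://example.org/wps"   # hypothetical WPS endpoint

    # 1. What does the service offer?
    caps = requests.get(wps, params={
        "service": "WPS", "version": "1.0.0", "request": "GetCapabilities"})

    # 2. Which inputs and outputs does one process expect?
    desc = requests.get(wps, params={
        "service": "WPS", "version": "1.0.0", "request": "DescribeProcess",
        "identifier": "compareDatasets"})            # illustrative identifier

    # 3. Run it with simple literal inputs.
    run = requests.get(wps, params={
        "service": "WPS", "version": "1.0.0", "request": "Execute",
        "identifier": "compareDatasets",
        "datainputs": "modelled=run42;measured=gauge7"})   # illustrative inputs

    for r in (caps, desc, run):
        print(r.status_code, r.headers.get("Content-Type"))
    ```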

  3. Joint Efforts Towards European HF Radar Integration

    NASA Astrophysics Data System (ADS)

    Rubio, A.; Mader, J.; Griffa, A.; Mantovani, C.; Corgnati, L.; Novellino, A.; Schulz-Stellenfleth, J.; Quentin, C.; Wyatt, L.; Ruiz, M. I.; Lorente, P.; Hartnett, M.; Gorringe, P.

    2016-12-01

    During the past two years, significant steps have been made in Europe towards achieving the needed accessibility to High Frequency Radar (HFR) data for pan-European use. Since 2015, EuroGOOS Ocean Observing Task Teams (TT), such as the HFR TT, have been operational networks of observing platforms. The main goal is the harmonization of system requirements, system design and data quality, and the improvement and proof of the readiness and standardization of HFR data access and tools. Particular attention is being paid by the HFR TT to converging from different projects and programs toward those common objectives. First, JERICO-NEXT (Joint European Research Infrastructure network for Coastal Observatory - Novel European eXpertise for coastal observaTories, H2020 2015 Programme) will contribute by describing the status of the European network, seeking harmonization through the exchange of best practices and standardization, developing and giving access to quality-control procedures and new products, and finally demonstrating the use of such technology in the general scientific strategy pursued by the Coastal Observatory. Then, EMODnet (European Marine Observation and Data Network) Physics has started to assemble HF radar metadata and data products within Europe in a uniform way. This long-term program provides a combined array of services and functionalities for users to obtain free-of-charge data, metadata and data products on the physical conditions of European sea basins and oceans. Additionally, the Copernicus Marine Environment Monitoring Service (CMEMS) has delivered since 2015 a core information service to any user in four areas of benefit: Maritime Safety, Coastal and Marine Environment, Marine Resources, and Weather, Seasonal Forecasting and Climate activities. INCREASE (Innovation and Networking for the integration of Coastal Radars into EuropeAn marine SErvices - CMEMS Service Evolution 2016) will set up the necessary developments towards the integration of existing European operational HFR systems into CMEMS. Finally, this ongoing progress will contribute to integrating HFR platforms as important operational components of EOOS, the European Ocean Observing System, which is designed to align and integrate Europe's ocean observing capacity for truly integrated, end-to-end ocean observing in Europe.

  4. Launching Discovery through a Digital Library Portal: SIOExplorer

    NASA Astrophysics Data System (ADS)

    Miller, S. P.; Staudigel, H.; Johnson, C.; McSherry, K.; Clark, D.; Peckman, U.; Helly, J.; Sutton, D.; Chase, A.; Schottlaender, B. E.; Day, D.; Helly, M.

    2003-12-01

    The launching of an oceanographic expedition has its own brand of excitement, with the sound of the main engines firing up, and the lifting of the gangway in a foreign port, as the team of scientists and crew sets out for a month at sea with only the resources they have aboard. Although this adventure is broadly appealing, very few have the privilege of actually joining an expedition. With the "SIOExplorer" family of projects we are now beginning to open this experience across cyberspace to a wide range of students and teachers. What began two years ago as an effort to stabilize the Scripps Institution of Oceanography (SIO) data archives from more than 700 cruises going back 50 years, has now become an operational component of the National Science Digital Library (NSDL; www.nsdl.org), complete with thousands of historic photographs, full text documents and 3D visualization experiences. Our initial emphasis has been on marine geology and geophysics, in particular multibeam seafloor mapping, including 2 terabytes of digital objects. The IT architecture implemented at the San Diego Supercomputer Center (SDSC) streamlines the integration of additional projects in other disciplines with a suite of metadata management and collection building tools for "arbitrary digital objects." The "CruiseViewer" Java application is the primary portal to the digital library, providing a graphical user and display interface, the interface with the metadata database, and the interface with the SDSC "Storage Resource Broker" for long-term bulk distributed data storage management. It presents the user with a view of the available objects, overlaid on a global topography map. Geospatial objects can be selected interactively, and searches can be constrained by keywords. Metadata can be browsed and objects can be viewed onscreen or downloaded for further analysis, with automatic proprietary-hold request management. These efforts will be put to the test with national teacher workshops in the next two summers. Teachers, in collaboration with SIO-graduate students, will prepare and field-test learning-experience modules that explore concepts from plate tectonics theory for classroom and web use. Students will design their own personal voyages of discovery through our digital archives, promoting inquiry-based learning tailored to each individual. Future education and outreach efforts will include 1) developing a global registry of seafloor research or education projects (academic, industry, government), allowing at least a URL and a contact for further information 2) adding new collections, including dredged rocks and cores, 3) interoperating with other international data collections, 4) interacting with education and outreach projects such as the California Center for Ocean Science Education Excellence (COSEE), 5) continued testing of a real-time stand-alone digital library on a laptop shipboard acquisition system, 6) enhanced use of real-time Real-time Observatories, Applications, and Data management Network (ROADnet) satellite links to SIO vessels, and 7) continued construction of a series of museum exhibits based on digital terrain models. Now that SIOExplorer has become operational, we look forward to collaborating with other institutions for data and technology exchange, as well as for education and outreach opportunities. Support is provided by NSF NSDL, ITR and OCE programs, as well as by UCSD funds.

  5. Real World Data and Service Integration: Demonstrations and Lessons Learnt from the GEOSS Architecture Implementation Pilot Phase Four

    NASA Astrophysics Data System (ADS)

    Simonis, I.; Alameh, N.; Percivall, G.

    2012-04-01

    The GEOSS Architecture Implementation Pilots (AIP) develop and pilot new process and infrastructure components for the GEOSS Common Infrastructure (GCI) and the broader GEOSS architecture through an evolutionary development process consisting of a set of phases. Each phase addresses a set of Societal Benefit Areas (SBA) and geoinformatic topics. The first three phases consisted of architecture refinements based on interactions with users; component interoperability testing; and SBA-driven demonstrations. The fourth phase (AIP-4) documented here focused on fostering interoperability arrangements and common practices for GEOSS by facilitating access to priority earth observation data sources and by developing and testing specific clients and mediation components to enable such access. Additionally, AIP-4 supported the development of a thesaurus for earth observation parameters and tutorials to guide data providers in making their data available through GEOSS. The results of AIP-4 are documented in two engineering reports and captured in a series of videos posted online. Led by the Open Geospatial Consortium (OGC), AIP-4 built on contributions from over 60 organizations. This wide portfolio helped test interoperability arrangements in a highly heterogeneous environment. AIP-4 participants cooperated closely to test available data sets, access services, and client applications in multiple workflows and setups. Ultimately, AIP-4 improved the accessibility of GEOSS datasets identified as supporting Critical Earth Observation Priorities by the GEO User Interface Committee (UIC), and increased the use of the data by promoting the availability of new data services, clients, and applications. During AIP-4, a number of key earth observation data sources were made available online at standard service interfaces, discovered using brokered search approaches, and processed and visualized in generalized client applications. AIP-4 demonstrated the level of interoperability that can be achieved using currently available standards and corresponding products and implementations. The AIP-4 integration testing process proved that the integration of heterogeneous data resources available via interoperability arrangements such as WMS, WFS, WCS and WPS indeed works. However, the integration often required various levels of customization on the client side to accommodate variations in the service implementations. Those variations seem to stem both from malfunctioning service implementations and from varying interpretations of, or inconsistencies in, existing standards. Other interoperability issues identified revolve around missing metadata or the use of unrecognized identifiers in the description of GEOSS resources. Once such issues are resolved, continuous compliance testing is necessary to minimize the variability of implementations. Once data providers can choose from a set of enhanced implementations for offering their data using consistent interoperability arrangements, the barrier for client and decision-support developers will be lowered, leading to true leveraging of earth observation data through GEOSS. AIP-4 results, lessons learnt from the previous AIPs 1-3, and close coordination with the Infrastructure Implementation Board (IIB), the successor of the Architecture and Data Committee (ADC), form the basis of the current preparation phase for the next Architecture Implementation Pilot, AIP-5.
    The Call for Participation will be launched in February and the pilot will be conducted from May to November 2012. The current planning foresees a scenario-oriented approach, with possible scenarios coming from the domains of disaster management, health (including air quality and waterborne diseases), water resource observations, energy, biodiversity and climate change, and agriculture.

  6. Current Efforts in European Projects to Facilitate the Sharing of Scientific Observation Data

    NASA Astrophysics Data System (ADS)

    Bredel, Henning; Rieke, Matthes; Maso, Joan; Jirka, Simon; Stasch, Christoph

    2017-04-01

    This presentation is intended to provide an overview of currently ongoing efforts in European projects to facilitate and promote the interoperable sharing of scientific observation data. This will be illustrated through two examples: a prototypical portal developed in the ConnectinGEO project for matching available (in-situ) data sources to the needs of users, and a joint activity of several research projects to harmonise the usage of the OGC Sensor Web Enablement standards for providing access to marine observation data. ENEON is an activity initiated by the European ConnectinGEO project to coordinate in-situ Earth observation networks with the aim of harmonising access to observations, improving discoverability, and identifying and closing gaps in European earth observation data resources. In this context, ENEON commons has been developed as a supporting Web portal for facilitating discovery, access, re-use and creation of knowledge about observations, networks, and related activities (e.g. projects). The portal is based on developments resulting from the European WaterInnEU project and has been extended to cover the requirements for handling knowledge about in-situ earth observation networks. A first prototype of the portal, completed in January 2017, offers functionality for interactive discussion, information exchange and querying of information about data delivered by different observation networks. Within this presentation, we will introduce this prototype and initiate a discussion about potential future work directions. The second example concerns the harmonisation of data exchange in the marine domain. There are many organisations that operate ocean observatories or data archives. In recent years, the application of OGC Sensor Web Enablement (SWE) technology has become increasingly popular as a way to increase the interoperability between marine observation networks. However, as the SWE standards were intentionally designed in a domain-independent manner, there is still a significant degree of freedom in how the same information can be handled in the SWE framework. Thus, further domain-specific agreements are necessary to describe more precisely how SWE standards shall be applied in specific contexts. Within this presentation we will report the current status of the marine SWE profiles initiative, which aims to develop guidance and recommendations for applying SWE standards to ocean observation data. This initiative, which is supported by projects such as NeXOS, FixO3, ODIP 2, BRIDGES and SeaDataCloud, has already led to first results, which will be introduced in the proposed presentation. In summary, we will introduce two different building blocks for coordinating earth observation networks: ensuring better discoverability through intelligent portal solutions, and ensuring a common, interoperable exchange of the collected data through dedicated domain profiles of the Sensor Web standards.
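
    As a concrete illustration of the SWE building block, the sketch below requests observations from an SOS 2.0 endpoint through the standard key-value-pair binding. The service URL, offering, observed property and time window are hypothetical placeholders and do not represent any specific marine profile agreement.

    ```python
    # Sketch of an SOS 2.0 GetObservation request via the key-value-pair
    # binding. All service-specific values below are illustrative placeholders.
    import requests

    sos = "https://example.org/sos/service"   # hypothetical SOS endpoint

    resp = requests.get(sos, params={
        "service": "SOS",
        "version": "2.0.0",
        "request": "GetObservation",
        "offering": "ferrybox_route_1",                                      # placeholder
        "observedProperty": "http://example.org/def/sea_water_temperature",  # placeholder
        "temporalFilter": "om:phenomenonTime,2017-01-01/2017-01-02",
        "responseFormat": "http://www.opengis.net/om/2.0",
    })
    resp.raise_for_status()
    print(resp.text[:500])   # O&M XML describing the returned observations
    ```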

  7. GIS-based realization of international standards for digital geological mapping - developments in planetary mapping

    NASA Astrophysics Data System (ADS)

    Nass, Andrea; van Gasselt, Stephan; Jaumann, Ralf

    2010-05-01

    The Helmholtz Alliance and the European Planetary Network are research communities with different main topics. One of the main research topics shared by these communities is the geomorphological evolution of planetary surfaces, as well as the geological context of life. This research addresses questions such as "Is there volcanic activity on a planet?" or "Where are possible landing sites?". In order to help answer such questions, analyses of surface features and morphometric measurements need to be performed. This ultimately leads to the generation of thematic maps (e.g. geological and geomorphologic maps) as a basis for further studies. Through modern GIS techniques, the comparative work and generalisation performed during mapping result in new information. These insights are crucial for subsequent investigations. Therefore, the aim is to make these results available to the research community as a secondary data basis. In order to obtain a common and interoperable data collection, the results of different mapping projects have to follow a standardised data infrastructure, metadata definition and map layout. We are therefore currently focussing on the generation of a database model arranging all data and processes in a uniform mapping schema. With the help of such a schema, the mapper will be able to utilise a predefined (but customisable) GIS environment with individual tool items as well as a standardised symbolisation and a metadata environment. This environment is based on a data model which is currently at a conceptual level and provides the layout of the data infrastructure, including relations and topologies. One of the first tasks towards this data model is the definition of a consistent basis of symbolisation standards developed for planetary mapping. The mapper/geologist will be able to access the pre-built signatures and utilise them in a scale-dependent manner within the mapping project. The symbolisation will be related to the data model in the next step. As a second task, we designed a concept for describing the digital mapping result. To this end, we are creating a metadata template, based on existing standards, for the individual needs of the planetary sciences. This template is subdivided into metadata about the general map content (e.g., the data on which the mapping result is based) and metadata for each individual mapping element/layer, comprising information such as minimum mapping scale, interpretation hints, etc. The assignment of such a metadata description, in combination with the usage of a predefined mapping schema, facilitates the efficient and traceable storage of data information on a network server and enables a subsequent representation, e.g. as a mapserver data structure. Acknowledgement: This work is partly supported by DLR and the Helmholtz Alliance "Planetary Evolution and Life".

  8. QualityML: a dictionary for quality metadata encoding

    NASA Astrophysics Data System (ADS)

    Ninyerola, Miquel; Sevillano, Eva; Serral, Ivette; Pons, Xavier; Zabala, Alaitz; Bastin, Lucy; Masó, Joan

    2014-05-01

    The scenario of rapidly growing geodata catalogues requires tools that help users choose products. Having quality fields populated in metadata allows users to rank and then select the best fit-for-purpose products. In this direction, we have developed QualityML (http://qualityml.geoviqua.org), a dictionary that contains hierarchically structured concepts to precisely define and relate quality levels: from quality classes to quality measurements. Generically, a quality element is the path that goes from the highest level (quality class) to the lowest levels (statistics or quality metrics). This path is used to encode the quality of datasets in the corresponding metadata schemas. For data producers, the benefits of having encoded quality are improved product discovery and better transmission of product characteristics. Data users, particularly decision-makers, would find quality and uncertainty measures to support the best decisions as well as to perform dataset intercomparisons. Encoded quality also allows other components (such as visualization, discovery, or comparison tools) to be quality-aware and interoperable. On one hand, QualityML is a profile of the ISO geospatial metadata standards providing a set of rules, structured in six levels, for precisely documenting quality indicator parameters. On the other hand, QualityML includes semantics and vocabularies for the quality concepts. Whenever possible, it uses statistical expressions from the UncertML dictionary (http://www.uncertml.org) encoding. However, it also extends UncertML to provide a list of alternative metrics that are commonly used to quantify quality. A specific example, based on a temperature dataset, is shown below. The annual mean temperature map has been validated with independent in-situ measurements to obtain a global error of 0.5 °C.
    Level 0: Quality class (e.g., Thematic accuracy)
    Level 1: Quality indicator (e.g., Quantitative attribute correctness)
    Level 2: Measurement field (e.g., DifferentialErrors1D)
    Level 3: Statistic or metric (e.g., Half-lengthConfidenceInterval)
    Level 4: Units (e.g., Celsius degrees)
    Level 5: Value (e.g., 0.5)
    Level 6: Specifications, i.e. additional information on how the measurement took place, a citation of the reference data, the traceability of the process, and a publication describing the validation process, encoded using new ISO 19157 elements or the GeoViQua (http://www.geoviqua.org) Quality Model (PQM-UQM) extensions to the ISO models.
    Finally, keep in mind that QualityML is not just suitable for encoding quality at the dataset level; it also considers pixel- and object-level uncertainties. This is done by linking the metadata quality descriptions with layers representing not just the data but the uncertainty values associated with each geospatial element.
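
    A minimal sketch of how the seven-level path above can be carried as one structured record, using the temperature example; the class and field names are illustrative only and do not correspond to a normative QualityML encoding or API.

    ```python
    # Sketch: represent a QualityML-style quality element (the path from
    # quality class down to value) as a simple record. Field names are
    # illustrative, not a normative QualityML API.
    from dataclasses import dataclass

    @dataclass
    class QualityElement:
        quality_class: str       # Level 0
        indicator: str           # Level 1
        measurement_field: str   # Level 2
        statistic: str           # Level 3 (statistic or metric)
        units: str               # Level 4
        value: float             # Level 5
        specification: str = ""  # Level 6: provenance / citation of the validation

    elem = QualityElement(
        quality_class="Thematic accuracy",
        indicator="Quantitative attribute correctness",
        measurement_field="DifferentialErrors1D",
        statistic="Half-lengthConfidenceInterval",
        units="Celsius degrees",
        value=0.5,
        specification="Validated against independent in-situ measurements",
    )
    print(" / ".join([elem.quality_class, elem.indicator, elem.measurement_field,
                      elem.statistic, f"{elem.value} {elem.units}"]))
    ```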

  9. The Climate Data Centre of Deutscher Wetterdienst (DWD)

    NASA Astrophysics Data System (ADS)

    Kaspar, F.; Schreiber, K.-J.; Behrendt, J.

    2010-09-01

    In 2009 the German meteorological service (Deutscher Wetterdienst, DWD) started to set up a Climate Data Centre (CDC) in order to provide unified access to its variety of climate data, especially for users from research, educational and public institutions. The CDC acts as a central point of contact for the various data collections of DWD. These include observations from German weather stations and DWD's observatories, special data, e.g. from hydroclimatology, agro-climatology and medical climatology, but also data from international activities of DWD, such as the Global Precipitation Climatology Center (GPCC), EUMETSAT's Satellite Application Facility on Climate Monitoring (CM-SAF) or marine climatological data (ship and buoy observations) of the Global Collecting Centre for Marine Climatological Data. Data are based on conventional surface observations over land and ocean as well as on various remote sensing methods, such as satellite observation. The major part consists of climate data from the past, but the CDC will also include results from scenario calculations and projections for the future. In addition to pure observational data, the CDC offers derived statistical parameters and spatial analyses as gridded datasets. As a first step, a central data catalogue provides standardised descriptions and information on data access. It follows national and international rules for the description of geo-referenced data (GDI-DE; INSPIRE). The individual data providers of DWD can use the catalogue to easily edit and publish their metadata in a unified way. These metadata contain information on data access, data policy, data quality, spatial and temporal coverage, responsible persons, etc. The catalogue is based on an open-source software product (geonetwork-opensource) that is also used by a large number of international organizations. Metadata can be exchanged (harvested) between these catalogues. This will allow the implementation of a structure that provides search capabilities across institutions. The software also supports web-based mapping services and group-specific data access policies.
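
    Because the catalogue is built on geonetwork-opensource, which exposes an OGC CSW interface, it can be searched or harvested programmatically along the lines of the sketch below. The catalogue URL and search term are hypothetical placeholders, not the actual DWD endpoint.

    ```python
    # Sketch of a CSW 2.0.2 GetRecords search against a GeoNetwork-style
    # catalogue. The catalogue URL and query are illustrative placeholders.
    import requests
    import xml.etree.ElementTree as ET

    csw = "https://example.org/geonetwork/srv/eng/csw"   # hypothetical catalogue

    resp = requests.get(csw, params={
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "elementSetName": "brief",
        "resultType": "results",
        "constraintLanguage": "CQL_TEXT",
        "constraint_language_version": "1.1.0",
        "constraint": "AnyText like '%precipitation%'",
    })
    resp.raise_for_status()

    ns = {"csw": "http://www.opengis.net/cat/csw/2.0.2",
          "dc": "http://purl.org/dc/elements/1.1/"}
    for rec in ET.fromstring(resp.content).iter(
            "{http://www.opengis.net/cat/csw/2.0.2}BriefRecord"):
        print(rec.findtext("dc:identifier", namespaces=ns),
              rec.findtext("dc:title", namespaces=ns))
    ```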

  10. Discovery of Marine Datasets and Geospatial Metadata Visualization

    NASA Astrophysics Data System (ADS)

    Schwehr, K. D.; Brennan, R. T.; Sellars, J.; Smith, S.

    2009-12-01

    NOAA's National Geophysical Data Center (NGDC) provides the deep archive of US multibeam sonar hydrographic surveys. NOAA stores the data as Bathymetric Attributed Grids (BAG; http://www.opennavsurf.org/), which are HDF5-formatted files containing gridded bathymetry, gridded uncertainty, and XML metadata. While NGDC provides the deep store and a basic ESRI ArcIMS interface to the data, additional tools need to be created to increase the frequency with which researchers discover hydrographic surveys that might be beneficial for their research. Using Open Source tools, we have created a draft of a Google Earth visualization of NOAA's complete collection of BAG files as of March 2009. Each survey is represented as a bounding box, an optional preview image of the survey data, and a pop-up placemark. The placemark contains a brief summary of the metadata and links to directly download the BAG survey files and the complete metadata file. Each survey is time-tagged so that users can search both in space and time for surveys that meet their needs. By creating this visualization, we aim to make the entire process of data discovery, validation of relevance, and download much more efficient for research scientists who may not be familiar with NOAA's hydrographic survey efforts or the BAG format. In the process of creating this demonstration, we have identified a number of improvements that can be made to the hydrographic survey process in order to make the results easier to use, especially with respect to metadata generation. With the combination of the NGDC deep archiving infrastructure, a Google Earth virtual globe visualization, and GeoRSS feeds of updates, we hope to increase the utilization of these high-quality gridded bathymetry data. This workflow applies equally well to LIDAR topography and bathymetry. Additionally, with proper referencing and geotagging in journal publications, we hope to close the loop and help the community create a true “Geospatial Scholar” infrastructure.
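
    A minimal sketch of the visualization step described above: turning per-survey metadata (name, bounding box, time range, download link) into time-tagged KML placemarks that Google Earth can display. The survey values and URL are invented placeholders, not real NGDC BAG holdings.

    ```python
    # Sketch: write a KML file with one time-tagged bounding-box placemark per
    # survey. Survey values and the download URL are invented placeholders.
    surveys = [
        {"name": "H11776", "west": -70.9, "east": -70.7, "south": 42.3, "north": 42.4,
         "start": "2008-06-01", "end": "2008-06-20",
         "url": "https://example.noaa.gov/bag/H11776.bag"},   # placeholder link
    ]

    def bbox_placemark(s):
        ring = (f"{s['west']},{s['south']} {s['east']},{s['south']} "
                f"{s['east']},{s['north']} {s['west']},{s['north']} "
                f"{s['west']},{s['south']}")
        return (f"<Placemark><name>{s['name']}</name>"
                f"<TimeSpan><begin>{s['start']}</begin><end>{s['end']}</end></TimeSpan>"
                f"<description><![CDATA[<a href=\"{s['url']}\">Download BAG</a>]]></description>"
                f"<LineString><coordinates>{ring}</coordinates></LineString>"
                f"</Placemark>")

    kml = ('<?xml version="1.0" encoding="UTF-8"?>'
           '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
           + "".join(bbox_placemark(s) for s in surveys)
           + "</Document></kml>")

    with open("bag_surveys.kml", "w") as f:
        f.write(kml)
    ```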

  11. Developing a Metadata Infrastructure to facilitate data driven science gateway and to provide Inspire/GEMINI compliance for CLIPC

    NASA Astrophysics Data System (ADS)

    Mihajlovski, Andrej; Plieger, Maarten; Som de Cerff, Wim; Page, Christian

    2016-04-01

    The CLIPC project is developing a portal to provide a single point of access for scientific information on climate change. This is made possible through the Copernicus Earth Observation Programme for Europe, which will deliver a new generation of environmental measurements of climate quality. The data about the physical environment that are used to inform climate change policy and adaptation measures come from several categories: satellite measurements, terrestrial observing systems, model projections and simulations, and re-analyses (syntheses of all available observations constrained with numerical weather prediction systems). These data categories are managed by different communities; CLIPC will provide a single point of access for the whole range of data. The CLIPC portal will provide a number of indicators showing impacts on specific sectors, generated using a range of factors selected through structured expert consultation. It will also, as part of the transformation services, allow users to explore the consequences of using different combinations of driving factors which they consider to be of particular relevance to their work or life. The portal will provide information on the scientific quality and pitfalls of such transformations to prevent misleading usage of the results. The CLIPC project will develop an end-to-end processing chain (indicator tool kit), from comprehensive information on the climate state through to highly aggregated, decision-relevant products. Indicators of climate change and climate change impact will be provided, and a tool kit to update and post-process the collection of indicators will be integrated into the portal. The CLIPC portal has a distributed architecture, making use of OGC services provided by, e.g., climate4impact.eu and CEDA. CLIPC has two themes: (1) harmonized access to climate datasets derived from models, observations and re-analyses, and (2) a climate impact tool kit to evaluate, rank and aggregate indicators. Key is the availability of standardized metadata describing indicator data and services. This will enable standardization and interoperability between the different distributed services of CLIPC. To disseminate CLIPC indicator data, transformed data products that enable impact assessments, and climate change impact indicators, a standardized metadata infrastructure is provided. The challenge is that the compliance of existing metadata with the INSPIRE ISO and GEMINI standards needs to be extended so that the web portal can be generated from the available metadata blueprint. The information provided in the headers of netCDF files available through multiple catalogues allows us to generate ISO-compliant metadata, which is in turn used to generate web-based interface content, as well as OGC-compliant web services such as WCS and WMS for the front end and WPS interactions for scientific users to combine and generate new datasets. The goal of the metadata infrastructure is to provide a blueprint for creating a data-driven science portal, generated from the underlying GIS data, web services and processing infrastructure. In the presentation we will present the results and lessons learned.
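
    The metadata generation step described above starts from netCDF file headers. The sketch below reads the global attributes and maps a few onto an ISO-like record; the attribute names follow common CF/ACDD conventions, and both the file name and the mapping are illustrative rather than the actual CLIPC implementation.

    ```python
    # Sketch: harvest netCDF global attributes and map them onto a minimal
    # ISO-style metadata record. Attribute names follow common CF/ACDD
    # conventions; file name and mapping are illustrative placeholders.
    from netCDF4 import Dataset

    def harvest(path):
        ds = Dataset(path)
        attrs = {name: ds.getncattr(name) for name in ds.ncattrs()}
        ds.close()
        return {
            "identifier": attrs.get("id", path),
            "title": attrs.get("title", "unknown"),
            "abstract": attrs.get("summary", ""),
            "keywords": attrs.get("keywords", "").split(","),
            "temporal_extent": (attrs.get("time_coverage_start"),
                                attrs.get("time_coverage_end")),
            "bbox": (attrs.get("geospatial_lon_min"), attrs.get("geospatial_lat_min"),
                     attrs.get("geospatial_lon_max"), attrs.get("geospatial_lat_max")),
        }

    record = harvest("tas_indicator_example.nc")   # hypothetical file
    print(record["title"], record["temporal_extent"])
    ```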

  12. Acoustic Metadata Management and Transparent Access to Networked Oceanographic Data Sets

    DTIC Science & Technology

    2011-09-30

    Roberts in Pat Halpin’s lab, integrating the Marine Geospatial Ecology (GeoEco) toolset into our database services. While there is a steep...noise bands. The lower box at each site denotes the 1-6 kHz band while the upper box denotes 6-96 kHz band. Lad seamount has deployments at two sites...N00014-11-1-0697 http://cetus.ucsd.edu

  13. MaNIDA: an operational infrastructure for shipborne data

    NASA Astrophysics Data System (ADS)

    Macario, Ana; Scientific MaNIDA Team

    2013-04-01

    The Marine Network for Integrated Data Access (MaNIDA) aims to build a sustainable e-Infrastructure to support discovery and re-use of data archived in a distributed network of data providers in Germany (see related abstracts in session ESSI1.2 and session ESSI2.2). Because one of the primary focuses of MaNIDA is the underway data acquired on board German academic research vessels, we will address various issues related to cruise-level metadata, shiptrack navigation, sampling events conducted during the cruise (event logs), standardization of device-related (type, name, parameters) and place-related (gazetteer) vocabularies, QA/QC procedures (near-real-time and post-cruise validation, corrections, quality flags), as well as the ingestion and management of contextual information (e.g. various types of cruise-related reports and project-related information). One of MaNIDA's long-term goals is to offer an integrative "one-stop-shop" framework for the management of and access to ship-related information, based on international standards and interoperability. This access framework will be freely available and is intended for scientists, funding agencies and the public. The master "catalog" we are building currently contains information from 13 German academic research vessels and their respective cruises (to date ~1900 cruises, with an expected growth rate of ~150 cruises annually). Moreover, MaNIDA's operational infrastructure will additionally provide a direct pipeline to the SeaDataNet Cruise Summary Report Inventory, among others. In this presentation, we will focus on the extensions we are currently implementing to support automated acquisition and standardized transfer of various types of data from German research vessels to hosts on land. Our concept towards nationwide common QA/QC procedures for various types of underway data (including a versioning concept) and common workflows will also be presented. The "linking" of cruise-related information with quality-controlled data and data products (e.g., digital terrain models), publications, cruise-related reports, people and other contextual information will additionally be shown in the framework of a prototype for R.V. Polarstern.

  14. Geological events in submerged areas: attributes and standards in the EMODnet Geology Project

    NASA Astrophysics Data System (ADS)

    Fiorentino, A.; Battaglini, L.; D'Angelo, S.

    2017-12-01

    EMODnet Geology is a European project which promotes the collection and harmonization of marine geological data mapped by various national and regional mapping projects or recovered from the literature, in order to make them freely available through a web portal. Among the several features considered within the project, "Geological events and probabilities" includes submarine landslides, earthquakes, volcanic centers, tsunamis, fluid emissions and Quaternary faults in European seas. Due to the different geological settings of European sea areas, it was necessary to elaborate a comprehensive and detailed set of attributes for the different features in order to represent the diverse characteristics of each occurrence. Datasets consist of shapefiles representing each event at 1:250,000 scale. The elaboration of guidelines for compiling the shapefiles and attribute tables was aimed at identifying the parameters that should be used to characterize events and any additional relevant information. Particular attention has been devoted to the definition of the attribute table in order to achieve the best degree of harmonization and standardization according to the European INSPIRE Directive. One of the main objectives is the interoperability of data, in order to offer more complete, error-free and reliable information and to facilitate the exchange and re-use of data even between non-homogeneous systems. Metadata and available information collected during the project are displayed on the portal (http://www.emodnet-geology.eu/) as polygon, line and point layers according to their geometry. By combining all these data it might be possible to elaborate additional thematic maps which could support further research as well as land planning and management. A possible application is being tested by the Geological Survey of Italy (ISPRA) which, in cooperation with other Italian institutions contributing to EMODnet Geology, is working on producing an update of the structural model of Italy for submerged areas.

  15. Building a biomedical cyberinfrastructure for collaborative research.

    PubMed

    Schad, Peter A; Mobley, Lee Rivers; Hamilton, Carol M

    2011-05-01

    For the potential power of genome-wide association studies (GWAS) and translational medicine to be realized, the biomedical research community must adopt standard measures, vocabularies, and systems to establish an extensible biomedical cyberinfrastructure. Incorporating standard measures will greatly facilitate combining and comparing studies via meta-analysis. Incorporating consensus-based and well-established measures into various studies should reduce the variability across studies due to attributes of measurement, making findings across studies more comparable. This article describes two well-established consensus-based approaches to identifying standard measures and systems: PhenX (consensus measures for phenotypes and eXposures), and the Open Geospatial Consortium (OGC). NIH support for these efforts has produced the PhenX Toolkit, an assembled catalog of standard measures for use in GWAS and other large-scale genomic research efforts, and the RTI Spatial Impact Factor Database (SIFD), a comprehensive repository of geo-referenced variables and extensive meta-data that conforms to OGC standards. The need for coordinated development of cyberinfrastructure to support measures and systems that enhance collaboration and data interoperability is clear; this paper includes a discussion of standard protocols for ensuring data compatibility and interoperability. Adopting a cyberinfrastructure that includes standard measures and vocabularies, and open-source systems architecture, such as the two well-established systems discussed here, will enhance the potential of future biomedical and translational research. Establishing and maintaining the cyberinfrastructure will require a fundamental change in the way researchers think about study design, collaboration, and data storage and analysis. Copyright © 2011 American Journal of Preventive Medicine. Published by Elsevier Inc. All rights reserved.

  16. Building a Biomedical Cyberinfrastructure for Collaborative Research

    PubMed Central

    Schad, Peter A.; Mobley, Lee Rivers; Hamilton, Carol M.

    2018-01-01

    For the potential power of genome-wide association studies (GWAS) and translational medicine to be realized, the biomedical research community must adopt standard measures, vocabularies, and systems to establish an extensible biomedical cyberinfrastructure. Incorporating standard measures will greatly facilitate combining and comparing studies via meta-analysis, which is a means for deriving larger populations, needed for increased statistical power to detect less apparent and more complex associations (gene-environment interactions and polygenic gene-gene interactions). Incorporating consensus-based and well-established measures into various studies should reduce the variability across studies due to attributes of measurement, making findings across studies more comparable. This article describes two consensus-based approaches to establishing standard measures and systems: PhenX (consensus measures for Phenotypes and eXposures), and the Open Geospatial Consortium (OGC). National Institutes of Health support for these efforts has produced the PhenX Toolkit, an assembled catalog of standard measures for use in GWAS and other large-scale genomic research efforts, and the RTI Spatial Impact Factor Database (SIFD), a comprehensive repository of georeferenced variables and extensive metadata that conforms to OGC standards. The need for coordinated development of cyberinfrastructure to support collaboration and data interoperability is clear, and we discuss standard protocols for ensuring data compatibility and interoperability. Adopting a cyberinfrastructure that includes standard measures, vocabularies, and open-source systems architecture will enhance the potential of future biomedical and translational research. Establishing and maintaining the cyberinfrastructure will require a fundamental change in the way researchers think about study design, collaboration, and data storage and analysis. PMID:21521587

  17. EMODnet Physical Parameters (EMODNet PP) Portal

    NASA Astrophysics Data System (ADS)

    Novellino, A.; Schaap, D.; Manzella, G. M. R.; Pouliquen, S.; Gorringe, P.

    2012-04-01

    In December 2007 the European Parliament and Council adopted a common text for the Marine Strategy Framework Directive which aims to achieve environmentally healthy marine waters by 2020. This Directive includes an initiative for an overarching European Marine Observation and Data Network (EMODNet). During the one-year consultation phase that followed the release of the EU Green Paper on a Future Maritime Policy for the European Union, stakeholders gave an overwhelmingly positive response. Facilitating access to high quality marine data will resolve difficulties and stimulate an expansion of value-added public and commercial services, lay the foundations for sound governance and reduce uncertainties about human impact on the planet as well as in forecasts of the future state of the marine environment. Better and linked marine data will have an immediate impact on the planning of environmental policy and mitigation measures, and will also facilitate impact assessments and scientific work. The overall objective of the EMODnet Physical Parameters (EMODNet PP) preparatory action is to provide access to archived and near real-time data on physical conditions in Europe's seas and oceans by means of a dedicated Pilot Portal and to determine how well the data meet the needs of users from industry, public authorities and scientists. The latter implies that it is also an objective to identify data gaps and argue why these gaps should be filled by future monitoring. This project will contribute towards the definition of an operational European Marine Observation and Data Network (EMODnet). This is done by: 1. providing through a portal: a. access to marine data from measurement stations and ferryboxes. Both near real-time and archived data of time series are to be made available. b. metadata for these data sets using EMODnet/INSPIRE standards. c. metadata maps and overviews for whole sea-basins showing the availability of data and monitoring intensity of that basin. 2. monitoring and reporting on the effectiveness of the portal in meeting the needs of users in terms of ease of use, quality of information and fitness for purpose of the products delivered. 3. analysing what lessons have been learned for a future operational EMODnet. 4. keeping the portal operational afterwards. The EMODNet PP project asks for the following types of measurements: Measurements from fixed stations that should cover at least: 1. wave height and period; 2. temperature of the water column; 3. wind speed and direction; 4. salinity of the water column; 5. horizontal velocity of the water column; 6. light attenuation; 7. sea level. Measurements from ferryboxes that should cover at least: - temperature of the water column; - salinity of the water column. A portal accessing distributed databases has been developed.

  18. Common Patterns with End-to-end Interoperability for Data Access

    NASA Astrophysics Data System (ADS)

    Gallagher, J.; Potter, N.; Jones, M. B.

    2010-12-01

    At first glance, using common storage formats and open standards should be enough to ensure interoperability between data servers and client applications, but that is often not the case. In the REAP (Realtime Environment for Analytical Processing; NSF #0619060) project we integrated access to data from OPeNDAP servers into the Kepler workflow system and found that, as in previous cases, we spent the bulk of our effort addressing the twin issues of data model compatibility and integration strategies. Implementing seamless data access between a remote data source and a client application (data sink) can be broken down into two kinds of issues. First, the solution must address any differences in the data models used by the data source (OPeNDAP) and the data sink (the Kepler workflow system). If these models match completely, there is little work to be done. However, that is rarely the case. To map OPeNDAP's data model to Kepler's, we used two techniques (ignoring trivial conversions): on-the-fly type mapping and out-of-band communication. Type conversion takes place both for data and metadata because Kepler requires a priori knowledge of some aspects (e.g., syntactic metadata) of the data to build a workflow. In addition, OPeNDAP's constraint expression syntax was used to send out-of-band information to restrict the data requested from the server, facilitating changes in the returned data's type. This technique provides a way for users to exert fine-grained control over the data request, a potentially useful technique, at the cost of requiring that users understand a little about the data source's processing capabilities. The second set of issues for end-to-end data access is integration strategies. OPeNDAP provides several different tools for bringing data into an application: C++, C and Java libraries that provide functions for newly written software; the netCDF library, which enables existing applications to read from servers using an older interface; and simple file transfers. These options affect seamlessness in that they represent tradeoffs in new development (required for the first option) with cumbersome extra user actions (required by the last option). While the middle option, adding new functionality to an existing library (netCDF), is very appealing because practice has shown that it can be very effective over a wide range of clients, it's very hard to build these libraries because correctly writing a new implementation of an existing API that preserves the original's exact semantics can be a daunting task. In the example discussed here, we developed a new module for Kepler using OPeNDAP's Java API. This provided a way to leverage internal optimizations for data organization in Kepler, and we felt that this outweighed the additional cost of new development and the need for users to learn how to use a new Kepler module. While common storage formats and open standards play an important role in data access, our work with the Kepler workflow system reinforces the experience that matching the data models of the data server (source) and user client (sink) and choosing the most appropriate integration strategy are critical to achieving interoperability.
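
    To make the constraint-expression idea above concrete, here is a minimal client-side sketch of an OPeNDAP request; the server URL and the variable name "sst" are hypothetical, and the bracketed index ranges follow the general start:stride:stop convention of OPeNDAP constraint expressions.

        import requests

        # Hypothetical OPeNDAP dataset URL and variable name, for illustration only.
        DATASET = "http://example.org/opendap/sst_monthly.nc"

        # A constraint expression restricts the request on the server side: ask for
        # the variable 'sst' at the first time step and every 10th grid cell in
        # latitude and longitude (index ranges are start:stride:stop).
        constraint = "sst[0:1:0][0:10:179][0:10:359]"

        # The '.ascii' suffix requests a plain-text response; '.dods' would return
        # the binary encoding that OPeNDAP client libraries normally parse.
        response = requests.get(f"{DATASET}.ascii?{constraint}", timeout=30)
        response.raise_for_status()
        print(response.text[:500])  # first few lines of the subsetted data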

  19. European Multidisciplinary seafloor and the Observatory of the water column for Development; The setup of an interoperable Generic Sensor Module

    NASA Astrophysics Data System (ADS)

    Danobeitia, J.; Oscar, G.; Bartolomé, R.; Sorribas, J.; Del Rio, J.; Cadena, J.; Toma, D. M.; Bghiel, I.; Martinez, E.; Bardaji, R.; Piera, J.; Favali, P.; Beranzoli, L.; Rolin, J. F.; Moreau, B.; Andriani, P.; Lykousis, V.; Hernandez Brito, J.; Ruhl, H.; Gillooly, M.; Terrinha, P.; Radulescu, V.; O'Neill, N.; Best, M.; Marinaro, G.

    2016-12-01

    European Multidisciplinary seafloor and the Observatory of the water column for Development (EMSODEV) is a Horizon 2020 EU project whose overall objective is the operationalization of eleven marine observatories and four test sites distributed throughout Europe, from the Arctic to the Atlantic, from the Mediterranean to the Black Sea. The whole infrastructure is managed by the European consortium EMSO-ERIC (European Research Infrastructure Consortium) with the participation of 8 European countries and other partner countries. We are now implementing a Generic Sensor Module (EGIM) within the EMSO ERIC distributed marine research infrastructure. Our involvement focuses mainly on developing standard-compliant generic software for Sensor Web Enablement (SWE) on the EGIM device. The main goal of this development is to support sensor data acquisition on a new interoperable EGIM system. The EGIM software structure is made up of an acquisition layer located between the data recorded by the EGIM module and the data management services. Therefore, two main interfaces are implemented: the first handles EGIM hardware acquisition, and the second allows data to be pushed to and pulled from the data management layer (compliant with the Sensor Web Enablement standards). All software components used are open-source licensed and have been configured to manage different roles in the whole system (52°North SOS Server, Zabbix Monitoring System). The data acquisition module has been implemented with the aim of joining all components for EGIM data acquisition and serving them through an SOS-compliant interface. The system is already complete and awaiting the first laboratory bench test and a shallow-water test connection to the OBSEA node, offshore Vilanova i la Geltrú (Barcelona, Spain). The EGIM module will record a wide range of ocean parameters in a long-term consistent, accurate and comparable manner for disciplines such as biology, geology, chemistry, physics, engineering, and computer science, from polar to subtropical environments, through the water column down to the deep sea. The measurements recorded across EMSO nodes are critical for responding accurately to social and scientific challenges such as climate change, changes in marine ecosystems, and marine hazards.
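
    As a rough illustration of the pull interface mentioned above, the following sketch issues a Sensor Observation Service 2.0 GetObservation request using the standard KVP binding; the endpoint, offering and observed-property identifiers are placeholders, since a real EGIM/52°North deployment advertises its own identifiers in its GetCapabilities document.

        import requests

        # Placeholder SOS endpoint and identifiers, for illustration only.
        SOS_ENDPOINT = "http://example.org/52n-sos/service"

        params = {
            "service": "SOS",
            "version": "2.0.0",
            "request": "GetObservation",
            "offering": "http://example.org/offering/egim-ctd",
            "observedProperty": "http://example.org/property/sea_water_temperature",
            # KVP temporal filter: value reference, then an ISO 8601 time interval.
            "temporalFilter": "om:phenomenonTime,2016-06-01T00:00:00Z/2016-06-02T00:00:00Z",
        }

        response = requests.get(SOS_ENDPOINT, params=params, timeout=30)
        response.raise_for_status()
        print(response.text[:500])  # O&M-encoded observations (XML)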

  20. Bathymetry and acoustic backscatter: outer mainland shelf and slope, Gulf of Santa Catalina, southern California

    USGS Publications Warehouse

    Dartnell, Peter; Conrad, James E.; Ryan, Holly F.; Finlayson, David P.

    2014-01-01

    In 2010 and 2011, scientists from the U.S. Geological Survey (USGS), Coastal and Marine Geology Program, acquired bathymetry and acoustic-backscatter data from the outer shelf and slope region offshore of southern California. The surveys were conducted as part of the USGS Marine Geohazards Program. Assessment of the hazards posed by offshore faults, submarine landslides, and tsunamis is facilitated by accurate and detailed bathymetric data. The surveys were conducted using the USGS R/V Parke Snavely outfitted with a 100-kHz Reson 7111 multibeam-echosounder system. This report provides the bathymetry and backscatter data acquired during these surveys in several formats, a summary of the mapping mission, maps of bathymetry and backscatter, and Federal Geographic Data Committee (FGDC) metadata.

  1. Expressive map design: OGC SLD/SE++ extension for expressive map styles

    NASA Astrophysics Data System (ADS)

    Christophe, Sidonie; Duménieu, Bertrand; Masse, Antoine; Hoarau, Charlotte; Ory, Jérémie; Brédif, Mathieu; Lecordix, François; Mellado, Nicolas; Turbet, Jérémie; Loi, Hugo; Hurtut, Thomas; Vanderhaeghe, David; Vergne, Romain; Thollot, Joëlle

    2018-05-01

    In the context of custom map design, handling more artistic and expressive tools has been identified as a cartographic need, in order to design stylized and expressive maps. Based on previous works on style formalization, an approach for specifying the map style has been proposed and tested on particular use cases. A first step deals with the analysis of inspiration sources, in order to extract `what makes the style of the source', i.e. the salient visual characteristics to be automatically reproduced (textures, spatial arrangements, linear stylization, etc.). In a second step, in order to mimic and generate those visual characteristics, existing and innovative rendering techniques have been implemented in our GIS engine, thus extending the capabilities to generate expressive renderings. Therefore, an extension of the existing cartographic pipeline has been proposed based on the following aspects: (1) extension of the OGC SLD/SE symbolization specifications in order to provide a formalism to specify and reference expressive rendering methods; (2) separation of the specification of each rendering method from its parameterization, stored as metadata. The main contribution has been described in (Christophe et al. 2016). In this paper, we focus firstly on the extension of the cartographic pipeline (SLD++ and metadata) and secondly on map design capabilities which have been tested on various topographic styles: old cartographic styles (Cassini), artistic styles (watercolor, impressionism, Japanese print), hybrid topographic styles (ortho-imagery & vector data) and finally abstract and photo-realistic styles for the geovisualization of coastal areas. The genericity and interoperability of our approach are promising and have already been tested for 3D visualization.
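
    To make the idea of extending the symbolization specifications more tangible, the sketch below holds a minimal, standard SLD 1.0 rule in a Python string and adds a clearly hypothetical <ExpressiveLineSymbolizer> element standing in for the kind of extension point described above; the element name and its renderingMethod attribute are illustrative and are not the actual SLD/SE++ vocabulary.

        # Everything except <ExpressiveLineSymbolizer> is standard SLD 1.0; that
        # element is a hypothetical stand-in for an expressive-rendering extension
        # whose parameterization would be stored separately, as metadata.
        expressive_style = """<?xml version="1.0" encoding="UTF-8"?>
        <StyledLayerDescriptor version="1.0.0" xmlns="http://www.opengis.net/sld">
          <NamedLayer>
            <Name>rivers</Name>
            <UserStyle>
              <FeatureTypeStyle>
                <Rule>
                  <LineSymbolizer>
                    <Stroke>
                      <CssParameter name="stroke">#2a6fdb</CssParameter>
                      <CssParameter name="stroke-width">1.5</CssParameter>
                    </Stroke>
                  </LineSymbolizer>
                  <!-- Hypothetical extension referencing an expressive rendering method -->
                  <ExpressiveLineSymbolizer renderingMethod="watercolor-stroke"/>
                </Rule>
              </FeatureTypeStyle>
            </UserStyle>
          </NamedLayer>
        </StyledLayerDescriptor>
        """

        print(expressive_style)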

  2. Improving Interoperability by Incorporating UnitsML Into Markup Languages.

    PubMed

    Celebi, Ismet; Dragoset, Robert A; Olsen, Karen J; Schaefer, Reinhold; Kramer, Gary W

    2010-01-01

    Maintaining the integrity of analytical data over time is a challenge. Years ago, data were recorded on paper that was pasted directly into a laboratory notebook. The digital age has made maintaining the integrity of data harder. Nowadays, digitized analytical data are often separated from information about how the sample was collected and prepared for analysis and how the data were acquired. The data are stored on digital media, while the related information about the data may be written in a paper notebook or stored separately in other digital files. Sometimes the connection between this "scientific meta-data" and the analytical data is lost, rendering the spectrum or chromatogram useless. We have been working with ASTM Subcommittee E13.15 on Analytical Data to create the Analytical Information Markup Language or AnIML-a new way to interchange and store spectroscopy and chromatography data based on XML (Extensible Markup Language). XML is a language for describing what data are by enclosing them in computer-useable tags. Recording the units associated with the analytical data and metadata is an essential issue for any data representation scheme that must be addressed by all domain-specific markup languages. As scientific markup languages proliferate, it is very desirable to have a single scheme for handling units to facilitate moving information between different data domains. At NIST, we have been developing a general markup language just for units that we call UnitsML. This presentation will describe how UnitsML is used and how it is being incorporated into AnIML.
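
    A rough sketch of the units-factoring idea follows: a unit is defined once in a units section and referenced by identifier from a data value, instead of repeating free-text unit strings. The namespace, element names and the unitRef attribute below are simplified illustrations, not the normative UnitsML or AnIML schemas.

        import xml.etree.ElementTree as ET

        # Illustrative namespace and element names only; the real UnitsML and
        # AnIML schemas define richer structures.
        NS = "urn:example:unitsml-sketch"
        ET.register_namespace("u", NS)

        # Define the unit once...
        unit = ET.Element(f"{{{NS}}}Unit", {"id": "u.mL_per_min"})
        ET.SubElement(unit, f"{{{NS}}}UnitName").text = "millilitre per minute"
        ET.SubElement(unit, f"{{{NS}}}UnitSymbol").text = "mL/min"

        # ...and point analytical values at it by identifier.
        value = ET.Element("FlowRate", {"unitRef": "u.mL_per_min"})
        value.text = "1.5"

        print(ET.tostring(unit, encoding="unicode"))
        print(ET.tostring(value, encoding="unicode"))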

  3. A SMART groundwater portal: An OGC web services orchestration framework for hydrology to improve data access and visualisation in New Zealand

    NASA Astrophysics Data System (ADS)

    Klug, Hermann; Kmoch, Alexander

    2014-08-01

    Transboundary and cross-catchment access to hydrological data is the key to designing successful environmental policies and activities. Electronic maps based on distributed databases are fundamental for planning and decision making in all regions and for all spatial and temporal scales. Freshwater is an essential asset in New Zealand (and globally) and the availability as well as accessibility of hydrological information held by or held for public authorities and businesses are becoming a crucial management factor. Access to and visual representation of environmental information for the public is essential for attracting greater awareness of water quality and quantity matters. Detailed interdisciplinary knowledge about the environment is required to ensure that the environmental policy-making community of New Zealand considers regional and local differences of hydrological statuses, while assessing the overall national situation. However, cross-regional and inter-agency sharing of environmental spatial data is complex and challenging. In this article, we firstly provide an overview of the state of the art standard compliant techniques and methodologies for the practical implementation of simple, measurable, achievable, repeatable, and time-based (SMART) hydrological data management principles. Secondly, we contrast international state of the art data management developments with the present status for groundwater information in New Zealand. Finally, for the topics (i) data access and harmonisation, (ii) sensor web enablement and (iii) metadata, we summarise our findings, provide recommendations on future developments and highlight the specific advantages resulting from a seamless view, discovery, access, and analysis of interoperable hydrological information and metadata for decision making.

  4. TOPCAT -- Tool for OPerations on Catalogues And Tables

    NASA Astrophysics Data System (ADS)

    Taylor, Mark

    TOPCAT is an interactive graphical viewer and editor for tabular data. It has been designed for use with astronomical tables such as object catalogues, but is not restricted to astronomical applications. It understands a number of different astronomically important formats, and more formats can be added. It is designed to cope well with large tables; a million rows by a hundred columns should not present a problem even with modest memory and CPU resources. It offers a variety of ways to view and analyse the data, including a browser for the cell data themselves, viewers for information about table and column metadata, tools for joining tables using flexible matching algorithms, and visualisation facilities including histograms, 2- and 3-dimensional scatter plots, and density maps. Using a powerful and extensible Java-based expression language new columns can be defined and row subsets selected for separate analysis. Selecting a row can be configured to trigger an action, for instance displaying an image of the catalogue object in an external viewer. Table data and metadata can be edited and the resulting modified table can be written out in a wide range of output formats. A number of options are provided for loading data from external sources, including Virtual Observatory (VO) services, thus providing a gateway to many remote archives of astronomical data. It can also interoperate with other desktop tools using the SAMP protocol. TOPCAT is written in pure Java and is available under the GNU General Public Licence. Its underlying table processing facilities are provided by STIL, the Starlink Tables Infrastructure Library.

  5. New Features of the re3data Registry of Research Data Repositories

    NASA Astrophysics Data System (ADS)

    Elger, K.; Pampel, H.; Vierkant, P.; Witt, M.

    2016-12-01

    re3data is a registry of research data repositories that lists over 1,600 repositories from around the world, making it the largest and most comprehensive online catalog of data repositories on the web. The registry offers researchers, funding agencies, libraries and publishers a comprehensive overview of the heterogeneous landscape of data repositories. The repositories are described, following the "Metadata Schema for the Description of Research Data Repositories". re3data summarises the properties of a repository into a user-friendly icon system helping users to easily identify an adequate repository for the storage of their datasets. The re3data entries are curated by an international, multi-disciplinary editorial board. An application programming interface (API) enables other information systems to list and fetch metadata for integration and interoperability. Funders like the European Commission (2015) and publishers like Springer Nature (2016) recommend the use of re3data.org in their policies. The original re3data project partners are the GFZ German Research Centre for Geosciences, the Humboldt-Universität zu Berlin, the Purdue University Libraries and the Karlsruhe Institute of Technology (KIT). Since 2015 re3data is operated as a service of DataCite, a global non-profit organisation that provides persistent identifiers (DOIs) for research data. At the 2016 AGU Fall Meeting we will describe the current status of re3data. An overview of the major developments and new features will be given. Furthermore, we will present our plans to increase the quality of the re3data entries.
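
    As a small illustration of the API mentioned above, the sketch below lists a few repositories from the public re3data endpoint; the /api/v1/repositories path and the element names reflect the v1 API as documented at the time of writing and should be checked against the current re3data documentation.

        import requests
        import xml.etree.ElementTree as ET

        # Public re3data API (v1); returns an XML list of registered repositories.
        resp = requests.get("https://www.re3data.org/api/v1/repositories", timeout=30)
        resp.raise_for_status()

        root = ET.fromstring(resp.content)
        for repo in list(root)[:5]:              # show the first few entries only
            print(repo.findtext("id"), "-", repo.findtext("name"))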

  6. From global action against malaria to local issues: state of the art and perspectives of web platforms dealing with malaria information.

    PubMed

    Briand, Dominique; Roux, Emmanuel; Desconnets, Jean Christophe; Gervet, Carmen; Barcellos, Christovam

    2018-03-21

    From prehistory to the present, and despite a hard-fought combat against it, malaria remains a concern for human beings. While advances in science and technology allowed the eradication of some infectious diseases in the 20th century, malaria persists. This review aims at assessing how Internet and web technologies are used in fighting malaria. More precisely, how do malaria-fighting actors profit from these developments, how do they deal with ensuing phenomena, such as the increase in data volume, and did these technologies bring new opportunities for fighting malaria? Eleven web platforms linked to spatio-temporal malaria information are reviewed, focusing on data, metadata, web services and categories of users. Though the web platforms are highly heterogeneous, the review reveals that the latest advances in web technologies are underused. Information is rarely updated dynamically, metadata catalogues are absent, web services are increasingly used but rarely standardized, and websites are mainly dedicated to scientific communities, essentially researchers. Improvement of systems interoperability, through standardization, is an opportunity to be seized in order to allow real-time information exchange and online multisource data analysis. To facilitate multidisciplinary/multiscale studies, the web of linked data and semantic web innovations can be used to formalize the different viewpoints of actors involved in the combat against malaria. By doing so, new malaria-fighting strategies could emerge to tackle the bottlenecks listed in the United Nations Millennium Development Goals reports, but also specific issues highlighted by the World Health Organization such as malaria elimination across international borders.

  7. The RD-Connect Registry & Biobank Finder: a tool for sharing aggregated data and metadata among rare disease researchers.

    PubMed

    Gainotti, Sabina; Torreri, Paola; Wang, Chiuhui Mary; Reihs, Robert; Mueller, Heimo; Heslop, Emma; Roos, Marco; Badowska, Dorota Mazena; de Paulis, Federico; Kodra, Yllka; Carta, Claudio; Martìn, Estrella Lopez; Miller, Vanessa Rangel; Filocamo, Mirella; Mora, Marina; Thompson, Mark; Rubinstein, Yaffa; Posada de la Paz, Manuel; Monaco, Lucia; Lochmüller, Hanns; Taruscio, Domenica

    2018-05-01

    In rare disease (RD) research, there is a huge need to systematically collect biomaterials, phenotypic, and genomic data in a standardized way and to make them findable, accessible, interoperable and reusable (FAIR). RD-Connect is a six-year global infrastructure project initiated in November 2012 that links genomic data with patient registries, biobanks, and clinical bioinformatics tools to create a central research resource for RDs. Here, we present the RD-Connect Registry & Biobank Finder, a tool that helps RD researchers find RD biobanks and registries and provides information on the availability and accessibility of content in each database. The finder concentrates information that is currently dispersed across different repositories (inventories, websites, scientific journals, technical reports, etc.), including aggregated data and metadata from participating databases. Aggregated data provided by the finder, if appropriately checked, can be used by researchers who are trying to estimate the prevalence of an RD, to organize a clinical trial on an RD, or to estimate the volume of patients seen by different clinical centers. The finder is also a portal to other RD-Connect tools, providing a link to the RD-Connect Sample Catalogue, a large inventory of RD biological samples available in participating biobanks for RD research. There are several kinds of users and potential uses for the RD-Connect Registry & Biobank Finder, including researchers in academia and industry dealing with questions of basic, translational, and/or clinical research. As of November 2017, the finder is populated with aggregated data for 222 registries and 21 biobanks.

  8. BCube: Building a Geoscience Brokering Framework

    NASA Astrophysics Data System (ADS)

    Jodha Khalsa, Siri; Nativi, Stefano; Duerr, Ruth; Pearlman, Jay

    2014-05-01

    BCube is addressing the need for effective and efficient multi-disciplinary collaboration and interoperability through the advancement of brokering technologies. As a prototype "building block" for NSF's EarthCube cyberinfrastructure initiative, BCube is demonstrating how a broker can serve as an intermediary between information systems that implement well-defined interfaces, thereby providing a bridge between communities that employ different specifications. Building on the GEOSS Discover and Access Broker (DAB), BCube will develop new modules and services including: • Expanded semantic brokering capabilities • Business Model support for work flows • Automated metadata generation • Automated linking to services discovered via web crawling • Credential passing for seamless access to data • Ranking of search results from brokered catalogs Because facilitating cross-discipline research involves cultural as well as technical challenges, BCube is also addressing the sociological and educational components of infrastructure development. We are working, initially, with four geoscience disciplines: hydrology, oceans, polar and weather, with an emphasis on connecting existing domain infrastructure elements to facilitate cross-domain communications.

  9. What Comes First, the OWL or the Bean? Creating Reusable Scientific Software with OWL/RDF Vocabularies.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stephan, Eric G.; Elsethagen, Todd O.; Kleese van Dam, Kerstin

    In the spring of 2013, the U.S. White House mandated by executive order: “Government information shall be managed as an asset throughout its life cycle to promote interoperability and openness, and, wherever possible and legally permissible, to ensure that data are released to the public in ways that make the data easy to find, accessible, and usable.” Key for the reusability of any scientific data is the availability of metadata describing the published data in a vocabulary that is familiar to its potential users. The objective of this paper is to help scientific application developers who want to adopt the continuous stream of new community vocabularies to help make their data sharable, self-describable, and easily understood. To achieve this we suggest semantic vocabulary and application integration best practices and discuss the tradeoffs of encoding vocabularies through code versus deriving code from vocabularies.
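
    A minimal sketch of the "deriving code from vocabularies" side of that tradeoff, using the rdflib library: a tiny RDFS vocabulary is declared once and then queried at runtime, so application code can discover labels and property domains instead of hard-coding them. The namespace and term names are invented for illustration; a real application would load a published OWL/RDF vocabulary instead.

        from rdflib import Graph, Literal, Namespace, RDF, RDFS

        # Invented mini-vocabulary, for illustration only.
        VOC = Namespace("http://example.org/voc#")

        g = Graph()
        g.bind("voc", VOC)
        g.add((VOC.Sample, RDF.type, RDFS.Class))
        g.add((VOC.Sample, RDFS.label, Literal("Sample")))
        g.add((VOC.collectedBy, RDF.type, RDF.Property))
        g.add((VOC.collectedBy, RDFS.domain, VOC.Sample))
        g.add((VOC.collectedBy, RDFS.label, Literal("collected by")))

        # Read labels and domains from the vocabulary at runtime rather than
        # encoding them as string constants in the application.
        for prop in g.subjects(RDF.type, RDF.Property):
            label = g.value(prop, RDFS.label)
            domain_label = g.value(g.value(prop, RDFS.domain), RDFS.label)
            print(f"{label} applies to {domain_label}")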

  10. Public health, GIS, and the internet.

    PubMed

    Croner, Charles M

    2003-01-01

    Internet access and use of georeferenced public health information for GIS application will be an important and exciting development for the nation's Department of Health and Human Services and other health agencies in this new millennium. Technological progress toward public health geospatial data integration, analysis, and visualization of space-time events using the Web portends eventual robust use of GIS by public health and other sectors of the economy. Increasing Web resources from distributed spatial data portals and global geospatial libraries, and a growing suite of Web integration tools, will provide new opportunities to advance disease surveillance, control, and prevention, and insure public access and community empowerment in public health decision making. Emerging supercomputing, data mining, compression, and transmission technologies will play increasingly critical roles in national emergency, catastrophic planning and response, and risk management. Web-enabled public health GIS will be guided by Federal Geographic Data Committee spatial metadata, OpenGIS Web interoperability, and GML/XML geospatial Web content standards. Public health will become a responsive and integral part of the National Spatial Data Infrastructure.

  11. A Reference Architecture for Space Information Management

    NASA Technical Reports Server (NTRS)

    Mattmann, Chris A.; Crichton, Daniel J.; Hughes, J. Steven; Ramirez, Paul M.; Berrios, Daniel C.

    2006-01-01

    We describe a reference architecture for space information management systems that elegantly overcomes the rigid design of common information systems in many domains. The reference architecture consists of a set of flexible, reusable, independent models and software components that function in unison, but remain separately managed entities. The main guiding principle of the reference architecture is to separate the various models of information (e.g., data, metadata, etc.) from implemented system code, allowing each to evolve independently. System modularity, systems interoperability, and dynamic evolution of information system components are the primary benefits of the design of the architecture. The architecture requires the use of information models that are substantially more advanced than those used by the vast majority of information systems. These models are more expressive and can be more easily modularized, distributed and maintained than simpler models e.g., configuration files and data dictionaries. Our current work focuses on formalizing the architecture within a CCSDS Green Book and evaluating the architecture within the context of the C3I initiative.

  12. Developing an Online Framework for Publication of Uncertainty Information in Hydrological Modeling

    NASA Astrophysics Data System (ADS)

    Etienne, E.; Piasecki, M.

    2012-12-01

    Inaccuracies in data collection and parameter estimation, and imperfections in model structure, make the predictions of hydrological models uncertain. Finding a way to communicate the uncertainty information in a model output is important in decision-making. This work aims to publish uncertainty information (computed by a project partner at Penn State) associated with hydrological predictions for catchments. To this end we have developed a DB schema (derived from the CUAHSI ODM design) which is focused on storing uncertainty information and its associated metadata. The technologies used to build the system are: OGC's Sensor Observation Service (SOS) for publication, the UncertML markup language (also developed by the OGC) to describe uncertainty information, and the Interoperability and Automated Mapping (INTAMAP) Web Processing Service (WPS), which handles part of the statistical computations. We are developing a service (based on Drupal) that provides users with the capability to exploit all the functionality of the system. Users will be able to request and visualize uncertainty data, and also publish their data in the system.
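
    A rough sketch of how an uncertainty-annotated prediction might be encoded before publication follows; the element names only mirror the spirit of UncertML's distribution encodings and are not the normative schema, and the step of inserting the record into the SOS is omitted.

        import xml.etree.ElementTree as ET

        # Illustrative elements loosely modelled on UncertML-style distribution
        # encodings; the real UncertML schema and namespaces differ.
        NS = "urn:example:uncertainty-sketch"
        ET.register_namespace("un", NS)

        obs = ET.Element("Prediction", {"site": "catchment-42", "time": "2012-06-01T00:00Z"})
        dist = ET.SubElement(obs, f"{{{NS}}}NormalDistribution")
        ET.SubElement(dist, f"{{{NS}}}mean").text = "3.7"       # e.g. predicted discharge, m3/s
        ET.SubElement(dist, f"{{{NS}}}variance").text = "0.25"  # uncertainty of the prediction

        print(ET.tostring(obs, encoding="unicode"))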

  13. A Conceptual Framework to Enhance the Interoperability of Observatories among Countries, Continents and the World

    NASA Astrophysics Data System (ADS)

    Loescher, H.; Fundamental Instrument Unit

    2013-05-01

    Ecological research addresses challenges relating to the dynamics of the planet, such as changes in climate, biodiversity, ecosystem functioning and services, carbon and energy cycles, natural and human-induced hazards, and adaptation and mitigation strategies that involve many science and engineering disciplines and cross national boundaries. Because of the global nature of these challenges, greater international collaboration is required for knowledge sharing and technology deployment to advance earth science investigations and enhance societal benefits. For example, the Working Group on Biodiversity Preservation and Ecosystem Services (PCAST 2011) noted the scale and complexity of the physical and human resources needed to address these challenges. Many of the most pressing ecological research questions require global-scale data and global scale solutions (Suresh 2012), e.g., interdisciplinary data access from data centers managing ecological resources and hazards, drought, heat islands, carbon cycle, or data used to forecast the rate of spread of invasive species or zoonotic diseases. Variability and change at one location or in one region may well result from the superposition of global processes coupled together with regional and local modes of variability. For example, we know the El Niño-Southern Oscillation large-scale modes of variability in the coupled terrestrial-aquatic-atmospheric systems' correlation with variability in regional rainfall and ecosystem functions. It is therefore a high priority of government and non-government organizations to develop the necessary large scale, world-class research infrastructures for environmental research—and the framework by which these data can be shared, discovered, and utilized by a broad user community of scientists and policymakers, alike. Given that there are many, albeit nascent, efforts to build new environmental observatories/networks globally (e.g., EU-ICOS, EU-Lifewatch, AU-TERN, China-CERN, GEOSS, GEO-BON, NutNet, etc.) and domestically, (e.g., NSF-CZO, USDA-LTAR, DOE-NGEE, Soil Carbon Network, etc.), there is a strong and mutual desire to assure interoperability of data. Developing interoperability is the degree by which each of the following is mapped between observatories (entities), defined by linking i) science requirements with science questions, ii) traceability of measurements to nationally and internationally accepted standards, iii) how data product are derived, i.e., algorithms, procedures, and methods, and iv) the bioinformatics which broadly include data formats, metadata, controlled vocabularies, and semantics. Here, we explore the rationale and focus areas for interoperability, the governance and work structures, example projects (NSF-NEON, EU-ICOS, and AU-TERN), and the emergent roles of scientists in these endeavors.

  14. The Sea Monitoring Virtual Research Community (VRC) in the EVER-EST Project (a virtual research environment for the Earth Sciences).

    NASA Astrophysics Data System (ADS)

    Foglini, Federica; Boero, Ferdinando; Guarino, Raffaele

    2016-04-01

    The EU's H2020 EVER-EST Project is dedicated to the realization of a Virtual Research Environment (VRE) for Earth Science researchers during 2015-2018. In this framework, Sea Monitoring represents one of the four use-case VRCs chosen to validate the EVER-EST e-infrastructure, which aims to represent a wide and multidisciplinary Earth Science domain. The objective of the Sea Monitoring Virtual Research Community (VRC) is to provide useful and applicable contributions to the identification and definition of variables indicated by the European Commission in the Marine Directive under the framework for Good Environmental Status (GES). The European Marine Strategy Framework Directive (MSFD, http://ec.europa.eu/environment/marine/index_en.htm) has defined the descriptors for Good Environmental Status in marine waters. The first descriptor is biodiversity; the second is the presence of non-indigenous species, while the remaining nine (even when they consider physical, chemical or geological variables) require proper functioning of the ecosystem, linked to a good state of biodiversity. The Sea Monitoring VRC is directed at providing practical methods, procedures and protocols to support a coherent and widely accepted interpretation of Descriptors 1 (biodiversity), 2 (non-indigenous species), 4 (food webs) and 6 (seafloor integrity) identified in GES. In that context, the criteria and methodological standards already identified by the European Commission will be taken into account, while also considering the activities and projects in progress in the marine framework. This search for practical methods to estimate and measure GES parameters requires close cooperation among different disciplines, including biologists, geologists, geophysicists, oceanographers, Earth observation experts and others. It will also require a number of different types of scientific data and observations (e.g. biology-related, chemico-physical, etc.) from different inputs and sensors (e.g. remote sensing, on-site buoys, marine stations, administrations, citizen observations, etc.). Furthermore, different communities require support and guidance to be able to effectively interoperate and share practices, methods, standards and terminologies. The EVER-EST VRE will provide the Sea Monitoring VRC user community with an innovative framework aimed at enhancing their ability to interoperate and share knowledge, experience and methods for GES assessment and monitoring. Furthermore, the Sea Monitoring VRC will focus on the implementation of Research Objects (RO, semantically rich aggregations of resources bringing together data, documents and methods in scientific investigations) for GES assessment, to be shared among the wider sea monitoring community for the first time.

  15. A database of marine phytoplankton abundance, biomass and species composition in Australian waters

    NASA Astrophysics Data System (ADS)

    Davies, Claire H.; Coughlan, Alex; Hallegraeff, Gustaaf; Ajani, Penelope; Armbrecht, Linda; Atkins, Natalia; Bonham, Prudence; Brett, Steve; Brinkman, Richard; Burford, Michele; Clementson, Lesley; Coad, Peter; Coman, Frank; Davies, Diana; Dela-Cruz, Jocelyn; Devlin, Michelle; Edgar, Steven; Eriksen, Ruth; Furnas, Miles; Hassler, Christel; Hill, David; Holmes, Michael; Ingleton, Tim; Jameson, Ian; Leterme, Sophie C.; Lønborg, Christian; McLaughlin, James; McEnnulty, Felicity; McKinnon, A. David; Miller, Margaret; Murray, Shauna; Nayar, Sasi; Patten, Renee; Pritchard, Tim; Proctor, Roger; Purcell-Meyerink, Diane; Raes, Eric; Rissik, David; Ruszczyk, Jason; Slotwinski, Anita; Swadling, Kerrie M.; Tattersall, Katherine; Thompson, Peter; Thomson, Paul; Tonks, Mark; Trull, Thomas W.; Uribe-Palomino, Julian; Waite, Anya M.; Yauwenas, Rouna; Zammit, Anthony; Richardson, Anthony J.

    2016-06-01

    There have been many individual phytoplankton datasets collected across Australia since the mid 1900s, but most are unavailable to the research community. We have searched archives, contacted researchers, and scanned the primary and grey literature to collate 3,621,847 records of marine phytoplankton species from Australian waters from 1844 to the present. Many of these are small datasets collected for local questions, but combined they provide over 170 years of data on phytoplankton communities in Australian waters. Units and taxonomy have been standardised, obviously erroneous data removed, and all metadata included. We have lodged this dataset with the Australian Ocean Data Network (http://portal.aodn.org.au/) allowing public access. The Australian Phytoplankton Database will be invaluable for global change studies, as it allows analysis of ecological indicators of climate change and eutrophication (e.g., changes in distribution; diatom:dinoflagellate ratios). In addition, the standardised conversion of abundance records to biomass provides modellers with quantifiable data to initialise and validate ecosystem models of lower marine trophic levels.

  16. Framework for Informed Policy Making Using Data from National Environmental Observatories

    NASA Astrophysics Data System (ADS)

    Wee, B.; Taylor, J. R.; Poinsatte, J.

    2012-12-01

    Large-scale environmental changes pose challenges that straddle environmental, economic, and social boundaries. As we design and implement climate adaptation strategies at the Federal, state, local, and tribal levels, accessible and usable data are essential for implementing actions that are informed by the best available information. Data-intensive science has been heralded as an enabler for scientific breakthroughs powered by advanced computing capabilities and interoperable data systems. Those same capabilities can be applied to data and information systems that facilitate the transformation of data into highly processed products. At the interface of scientifically informed public policy and data-intensive science lies the potential for producers of credible, integrated, multi-scalar environmental data like the National Ecological Observatory Network (NEON) and its partners to capitalize on data and informatics interoperability initiatives that enable the integration of environmental data from across credible data sources. NSF's large-scale environmental observatories such as NEON and the Ocean Observatories Initiative (OOI) are designed to provide high-quality, long-term environmental data for research. These data are also meant to be repurposed for operational needs such as risk management, vulnerability assessments, resource management, and others. The proposed USDA Agriculture Research Service (ARS) Long Term Agro-ecosystem Research (LTAR) network is another example of such an environmental observatory that will produce credible data for environmental/agricultural forecasting and informing policy. To facilitate data fusion across observatories, there is a growing call for observation systems to more closely coordinate and standardize how variables are measured. Together with observation standards, cyberinfrastructure standards enable the proliferation of an ecosystem of applications that utilize diverse, high-quality, credible data. Interoperability facilitates the integration of data from multiple credible sources of data, and enables the repurposing of data for use at different geographical scales. Metadata that captures the transformation of data into value-added products ("provenance") lends reproducibility and transparency to the entire process. This way, the datasets and model code used to create any product can be examined by other parties. This talk outlines a pathway for transforming environmental data into value-added products by various stakeholders to better inform sustainable agriculture using data from environmental observatories including NEON and LTAR.

  17. HydroShare for iUTAH: Collaborative Publication, Interoperability, and Reuse of Hydrologic Data and Models for a Large, Interdisciplinary Water Research Project

    NASA Astrophysics Data System (ADS)

    Horsburgh, J. S.; Jones, A. S.

    2016-12-01

    Data and models used within the hydrologic science community are diverse. New research data and model repositories have succeeded in making data and models more accessible, but have been, in most cases, limited to particular types or classes of data or models and also lack the type of collaborative and iterative functionality needed to enable shared data collection and modeling workflows. File sharing systems currently used within many scientific communities for private sharing of preliminary and intermediate data and modeling products do not support collaborative data capture, description, visualization, and annotation. More recently, hydrologic datasets and models have been cast as "social objects" that can be published, collaborated around, annotated, discovered, and accessed. Yet it can be difficult using existing software tools to achieve the kind of collaborative workflows and data/model reuse that many envision. HydroShare is a new, web-based system for sharing hydrologic data and models with specific functionality aimed at making collaboration easier and achieving new levels of interactive functionality and interoperability. Within HydroShare, we have developed new functionality for creating datasets, describing them with metadata, and sharing them with collaborators. HydroShare is enabled by a generic data model and content packaging scheme that supports describing and sharing diverse hydrologic datasets and models. Interoperability among the diverse types of data and models used by hydrologic scientists is achieved through the use of consistent storage, management, sharing, publication, and annotation within HydroShare. In this presentation, we highlight and demonstrate how the flexibility of HydroShare's data model and packaging scheme, HydroShare's access control and sharing functionality, and versioning and publication capabilities have enabled the sharing and publication of research datasets for a large, interdisciplinary water research project called iUTAH (innovative Urban Transitions and Arid-region Hydro-sustainability). We discuss the experiences of iUTAH researchers now using HydroShare to collaboratively create, curate, and publish datasets and models in a way that encourages collaboration, promotes reuse, and meets funding agency requirements.
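
    For readers who want to script against HydroShare, the sketch below queries its public REST API for a few resources; the /hsapi/resource/ path and the JSON field names are assumptions based on the commonly documented API and should be verified against the current HydroShare documentation.

        import requests

        # Public HydroShare REST API; paths and field names may differ by version.
        BASE = "https://www.hydroshare.org/hsapi"

        resp = requests.get(f"{BASE}/resource/", params={"count": 5}, timeout=30)
        resp.raise_for_status()

        for res in resp.json().get("results", []):
            print(res.get("resource_id"), "-", res.get("resource_title"))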

  18. National Geothermal Data System (USA): an Exemplar of Open Access to Data

    NASA Astrophysics Data System (ADS)

    Allison, M. Lee; Richard, Stephen; Blackman, Harold; Anderson, Arlene; Patten, Kim

    2014-05-01

    The National Geothermal Data System's (NGDS - www.geothermaldata.org) formal launch in April 2014 will provide open access to millions of data records, sharing relevant geoscience and, longer term, land-use data to propel geothermal development and production. NGDS serves information from all of the U.S. Department of Energy's sponsored development and research projects and geologic data from all 50 states, using free and open source software. This interactive online system is opening new exploration opportunities and potentially shortening project development by making data easily discoverable, accessible, and interoperable. We continue to populate our prototype functional data system with multiple data nodes and nationwide data online and available to the public. Data from state geological surveys and partners include more than 6 million records online, including 1.72 million well headers (oil and gas, water, geothermal), 670,000 well logs, and 497,000 borehole temperatures, and are growing rapidly. There are over 312 interoperable Web services and another 106 WMS (Web Map Services) registered in the system as of January 2014. Companion projects run by Southern Methodist University and the U.S. Geological Survey (USGS) are adding millions of additional data records. The DOE Geothermal Data Repository, currently hosted on OpenEI, is a system node and clearinghouse for data from hundreds of U.S. DOE-funded geothermal projects. NGDS is built on the US Geoscience Information Network (USGIN) data integration framework, which is a joint undertaking of the USGS and the Association of American State Geologists (AASG). NGDS complies with the White House Executive Order of May 2013, requiring all federal agencies to make their data holdings publicly accessible online in open source, interoperable formats with common core and extensible metadata. The National Geothermal Data System is being designed, built, deployed, and populated primarily with support from the US Department of Energy, Geothermal Technologies Office. Keeping this system operational after the original implementation will require four core elements: continued serving of data and applications by providers; maintenance of system operations; a governance structure; and an effective business model. Each of these presents a number of challenges currently under consideration.

  19. Using GDAL to Convert NetCDF 4 CF 1.6 to GeoTIFF: Interoperability Problems and Solutions for Data Providers and Distributors

    NASA Astrophysics Data System (ADS)

    Haran, T. M.; Brodzik, M. J.; Nordgren, B.; Estilow, T.; Scott, D. J.

    2015-12-01

    An increasing number of new Earth science datasets are being produced by data providers in self-describing, machine-independent file formats including Hierarchical Data Format version 5 (HDF5) and Network Common Data Form version 4 (netCDF-4). Furthermore, data providers may be producing netCDF-4 files that follow the conventions for Climate and Forecast metadata version 1.6 (CF 1.6) which, for datasets mapped to a projected raster grid covering all or a portion of the earth, includes the Coordinate Reference System (CRS) used to define how latitude and longitude are mapped to grid coordinates, i.e. columns and rows, and vice versa. One problem that users may encounter is that their preferred visualization and analysis tool may not yet include support for one of these newer formats. Moreover, data distributors such as NASA's NSIDC DAAC may not yet include support for on-the-fly conversion of data files for all data sets produced in a new format to a preferred older distributed format. There do exist open source solutions to this dilemma in the form of software packages that can translate files in one of the new formats to one of the preferred formats. However, these software packages require that the file to be translated conform to the specifications of its respective format. Although an online CF-Convention compliance checker is available from cfconventions.org, a recent NSIDC user services incident described here in detail involved an NSIDC-supported data set that passed the (then current) CF Checker Version 2.0.6, but was in fact lacking two variables necessary for conformance. This problem was not detected until GDAL, a software package which relied on the missing variables, was employed by a user in an attempt to translate the data into a different file format, namely GeoTIFF. Based on this incident, testing a candidate data product with one or more software products written to accept the advertised conventions is proposed as a practice that improves interoperability. Differences between data file contents and software package expectations are exposed, affording an opportunity to improve conformance of software, data or both. The incident can also serve as a demonstration that data providers, distributors, and users can work together to improve data product quality and interoperability.
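
    The conversion discussed in the abstract can be scripted with GDAL's Python bindings; the file and variable names below are hypothetical, while the NETCDF:"<file>":<variable> subdataset syntax is the standard way to point GDAL at a single variable inside a netCDF-4/CF file.

        from osgeo import gdal

        gdal.UseExceptions()

        # Hypothetical input file and variable name.
        src = 'NETCDF:"sea_ice_concentration.nc":ice_conc'

        # GDAL reads the CF grid-mapping (CRS) and coordinate variables from the
        # netCDF metadata -- the kind of information the incident above found to
        # be missing -- and writes a georeferenced GeoTIFF.
        gdal.Translate("ice_conc.tif", src, format="GTiff")

        info = gdal.Info("ice_conc.tif", format="json")
        print(info["coordinateSystem"]["wkt"][:120])  # confirm the CRS carried over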

  20. Leverage and Delegation in Developing an Information Model for Geology

    NASA Astrophysics Data System (ADS)

    Cox, S. J.

    2007-12-01

    GeoSciML is an information model and XML encoding developed by a group of primarily geologic survey organizations under the auspices of the IUGS CGI. The scope of the core model broadly corresponds with information traditionally portrayed on a geologic map, viz. interpreted geology, some observations, the map legend and accompanying memoir. The development of GeoSciML has followed the methodology specified for an Application Schema defined by OGC and ISO 19100 series standards. This requires agreement within a community concerning their domain model, its formal representation using UML, documentation as a Feature Type Catalogue, with an XML Schema implementation generated from the model by applying a rule-based transformation. The framework and technology support a modular governance process. Standard datatypes and GI components (geometry, the feature and coverage metamodels, metadata) are imported from the ISO framework. The observation and sampling model (including boreholes) is imported from OGC. The scale used for most scalar literal values (terms, codes, measures) allows for localization where necessary. Wildcards and abstract base-classes provide explicit extensibility points. Link attributes appear in a regular way in the encodings, allowing reference to external resources using URIs. The encoding is compatible with generic GI data-service interfaces (WFS, WMS, SOS). For maximum interoperability within a community, the interfaces may be specialised through domain-specified constraints (e.g. feature-types, scale and vocabulary bindings, query-models). Formalization using UML and XML allows use of standard validation and processing tools. Use of upper-level elements defined for generic GI application reduces the development effort and governance responsibility, while maximising cross-domain interoperability. On the other hand, enabling specialization to be delegated in a controlled manner is essential to adoption across a range of subdisciplines and jurisdictions. The GeoSciML design team is responsible only for the part of the model that is unique to geology but for which general agreement can be reached within the domain. This paper is presented on behalf of the Interoperability Working Group of the IUGS Commission for Geoscience Information (CGI) - follow web-link for details of the membership.
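
    In practice, GeoSciML features are usually served through the generic data-service interfaces listed above; a sketch of a WFS 2.0 GetFeature request for MappedFeature follows, with a placeholder endpoint, since each survey publishes its own service URL and binds its own namespace prefixes.

        import requests

        # Placeholder WFS endpoint; gsml:MappedFeature is the conventional GeoSciML
        # feature type for mapped geologic units, but the prefix binding and the
        # GeoSciML version supported must be taken from the service's GetCapabilities.
        WFS_ENDPOINT = "http://example.org/geoserver/wfs"

        params = {
            "service": "WFS",
            "version": "2.0.0",
            "request": "GetFeature",
            "typeNames": "gsml:MappedFeature",
            "count": 10,                     # limit the response for a quick look
        }

        resp = requests.get(WFS_ENDPOINT, params=params, timeout=60)
        resp.raise_for_status()
        print(resp.text[:500])               # GML-encoded GeoSciML features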

  1. SeaDataNet : Pan-European infrastructure for marine and ocean data management - Project objectives, structure and components

    NASA Astrophysics Data System (ADS)

    Maudire, G.; Maillard, C.; Fichaut, M.; Manzella, G.; Schaap, D. M. A.

    2009-04-01

    SeaDataNet : Pan-European infrastructure for marine and ocean data management Project objectives, structure and components G. Maudire (1), C. Maillard (1), G. Manzella (2), M. Fichaut (1), D.M.A. Schaap (3), E. Iona (4) and the SeaDataNet consortium. (1) IFREMER, Brest, France (Gilbert.Maudire@ifremer.fr), (2) ENEA, La Spezia, Italy, (3) Mariene Informatie Service 'MARIS', Voorburg, The Netherlands, (4) Hellenic Centre for Marine Research-HCMR, Anavyssos, Greece. Since a large part of the Earth's population lives near the oceans or carries out activities directly or indirectly linked to the seas (fishery and aquaculture, exploitation of sea bottom resources, international shipping, tourism), knowledge of the oceans is of primary importance for security and economy. However, observation and monitoring of the oceans remain difficult and expensive even if real improvements have been achieved using research vessels and submersibles, satellites and automatic observatories like buoys, floats and seafloor observatories transmitting directly to the shore using global transmission systems. More than 600 governmental or private organizations are active in observation of the seas bordering Europe, but European oceanographic data are fragmented, not always validated and not always easily accessible. That highlights the need for international collaboration to move toward a comprehensive view of ocean mechanisms, resources and changes. SeaDataNet is an Integrated research Infrastructure Initiative (I3) in European Union Framework Program 6 (2006 - 2011) to provide a data management system adapted both to the fragmented observation systems and to the users' need for integrated access to data, metadata, products and services. Its major objectives are to: - encourage long-term archiving at national level to secure ocean data, taking into account that all the observations made in the variable oceanic environment can never be remade if they are lost; - promote best practices for data management, taking benefit of the development of international initiatives and standards on data quality assurance, data descriptions (metadata and common vocabularies) and interoperability. Software tools are developed or adapted accordingly to support these practices and the adoption of standards; - establish online services to facilitate data discovery, data requests, data visualisation and data download for the users; - process reference data sets, such as ocean climatologies at a regional basin scale, to provide comprehensive data sets. Sustainability of the provided services is pursued through a balance between the activities mostly undertaken at national level by the National Oceanographic Data Centres or some thematic data centres and the effort made at the pan-European level by the project. The SeaDataNet consortium now brings together a unique group of 49 partners from major oceanographic institutes of 35 countries. Taking into account that valuable work on ocean data management must be done at basin level, most countries bordering the Black Sea, Mediterranean Sea, North-East Atlantic, North Sea, Baltic Sea and Arctic Sea are part of the project. Capacity building of consortium members is necessary to meet the project objectives, and a comprehensive training programme is conducted both for data management and for the IT technologies necessary to establish such a distributed system: database management, XML, web portals and services, and GIS technologies.
SeaDataNet Partners: IFREMER (France), MARIS (Netherlands), HCMR/HNODC (Greece), ULg (Belgium), OGS (Italy), NERC/BODC (UK), BSH/DOD (Germany), SMHI (Sweden), IEO (Spain), RIHMI/WDC (Russia), IOC (International), ENEA (Italy), INGV (Italy), METU (Turkey), CLS (France), AWI (Germany), IMR (Norway), NERI (Denmark), ICES (International), EC-DG JRC (International), MI (Ireland), IHPT (Portugal), RIKZ (Netherlands), RBINS/MUMM (Belgium), VLIZ (Belgium), MRI (Iceland), FIMR (Finland), IMGW (Poland), MSI (Estonia), IAE/UL (Latvia), CMR (Lithuania), SIO/RAS (Russia), MHI/DMIST (Ukraine), IO/BAS (Bulgaria), NIMRD (Romania), TSU (Georgia), INRH (Morocco), IOF (Croatia), PUT (Albania), NIB (Slovenia), UoM (Malta), OC/UCY (Cyprus), IOLR (Israel), NCSR/NCMS (Lebanon), CNR-ISAC (Italy), ISMAL (Algeria), INSTM (Tunisia)
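
    The online discovery services described in this record rest on standard catalogue requests. The snippet below is a minimal sketch of an OGC CSW 2.0.2 GetRecords query issued with Python's requests library; the endpoint URL is a placeholder, not an actual SeaDataNet address, and the constraint text is an arbitrary example.

        # Minimal CSW GetRecords query against a (placeholder) catalogue endpoint.
        # Any OGC CSW 2.0.2 service accepts this key-value-pair encoding for a
        # simple full-text discovery request; the URL below is hypothetical.
        import requests

        CSW_ENDPOINT = "https://example.org/csw"  # placeholder, not a real SeaDataNet URL

        params = {
            "service": "CSW",
            "version": "2.0.2",
            "request": "GetRecords",
            "typeNames": "csw:Record",
            "elementSetName": "summary",
            "resultType": "results",
            "constraintLanguage": "CQL_TEXT",
            "constraint_language_version": "1.1.0",
            "constraint": "AnyText LIKE '%temperature%'",
            "maxRecords": "10",
        }

        response = requests.get(CSW_ENDPOINT, params=params, timeout=30)
        response.raise_for_status()
        print(response.text[:500])  # the service returns an XML GetRecordsResponse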

  2. Effect of food on metamorphic competence in the model system Crepidula fornicata.

    PubMed

    Padilla, Dianna K; McCann, Michael J; Glenn, Mica McCarty; Hooks, Alexandra P; Shumway, Sandra E

    2014-12-01

    Food quality and quantity, as well as temperature, are all factors that are expected to affect rates of development, and are likely to be affected by expected climatic change. We tested the effect of a mixed diet versus a single-food diet on metamorphic competence in the emerging model species Crepidula fornicata. We then compared our results with other published studies on this species that examined time to metamorphic competence across a range of food concentrations and rearing temperatures. Ours was the only study to test the effects of a single-food versus a mixed diet on metamorphic competence for this species. Diet composition did not affect metamorphic competence or survivorship. Comparing results across studies, we found that the shortest time to metamorphic competence was typically found when the food availability per larva was the greatest, independent of rearing temperature. Unfortunately, some published studies did not include important metadata needed for comparison with other studies; these data included larval rearing density, food density, frequency of feeding, and rearing temperature. Mortality rates were not always reported and, when reported, were often measured in different ways, preventing comparison. Such metadata are essential for comparisons among studies as well as among taxa, and for the determination of generalizable patterns and evolutionary trends. Increased reporting of such metadata is essential if published studies are to be used to their fullest potential. © 2014 Marine Biological Laboratory.

  3. JADDS - towards a tailored global atmospheric composition data service for CAMS forecasts and reanalysis

    NASA Astrophysics Data System (ADS)

    Stein, Olaf; Schultz, Martin G.; Rambadt, Michael; Saini, Rajveer; Hoffmann, Lars; Mallmann, Daniel

    2017-04-01

    Global model data of atmospheric composition produced by the Copernicus Atmosphere Monitoring Service (CAMS) has been collected since 2010 at FZ Jülich and serves as boundary conditions for Regional Air Quality (RAQ) modellers worldwide. RAQ models need time-resolved meteorological as well as chemical lateral boundary conditions for their individual model domains. While the meteorological data usually come from well-established global forecast systems, the chemical boundary conditions are not always well defined. In the past, many models used 'climatic' boundary conditions for the tracer concentrations, which can lead to significant concentration biases, particularly for tracers with longer lifetimes that can be transported over long distances (e.g. over the whole northern hemisphere) with the mean wind. The Copernicus approach utilizes extensive near-real-time data assimilation of atmospheric composition observed from space, which gives additional reliability to the global modelling data and is well received by the RAQ communities. An existing Web Coverage Service (WCS) for sharing these individually tailored model results is currently being re-engineered to make use of a modern, scalable database technology in order to improve performance, enhance flexibility, and allow the operation of catalogue services. The new Jülich Atmospheric Data Distributions Server (JADDS) adheres to the Web Coverage Service WCS 2.0 standard as defined by the Open Geospatial Consortium (OGC). This enables user groups to flexibly define the datasets they need by selecting a subset of chemical species or by restricting the geographical boundaries or the length of the time series. The data is made available in the form of different catalogues stored locally on our server. In addition, the Jülich OWS Interface (JOIN) provides interoperable web services allowing for easy download and visualization of datasets delivered from WCS servers via the internet. We will present the prototype JADDS server and address the major issues identified when relocating large four-dimensional datasets into a RASDAMAN raster array database. So far, RASDAMAN support for data in netCDF format is limited with respect to metadata for variables and axes. For solutions accepted community-wide, selected data coverages should result in downloadable netCDF files whose metadata comply with the netCDF CF Metadata Conventions (http://cfconventions.org/). This can be achieved by adding custom metadata elements for RASDAMAN bands (model levels) on data ingestion. Furthermore, an optimization strategy for ingestion of several TB of 4D model output data will be outlined.
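
    As an illustration of the CF-style variable metadata discussed above, the following sketch writes a small netCDF file with netCDF4-python and attaches the Conventions, standard_name and units attributes that CF-compliant downloads would carry; the file name, variable names and values are invented for the example and do not reproduce the actual CAMS output layout.

        # Sketch: writing CF-style metadata with netCDF4-python (illustrative values only).
        import numpy as np
        from netCDF4 import Dataset

        with Dataset("cams_subset_example.nc", "w") as ds:  # hypothetical output file
            ds.Conventions = "CF-1.7"
            ds.title = "Example subset of a global atmospheric composition field"

            ds.createDimension("time", None)
            ds.createDimension("lev", 3)

            time = ds.createVariable("time", "f8", ("time",))
            time.units = "hours since 2017-01-01 00:00:00"
            time.standard_name = "time"

            lev = ds.createVariable("lev", "f4", ("lev",))
            lev.long_name = "model level"      # custom band/level metadata
            lev.positive = "down"

            o3 = ds.createVariable("go3", "f4", ("time", "lev"))
            o3.standard_name = "mole_fraction_of_ozone_in_air"
            o3.units = "mol mol-1"

            time[:] = [0.0]
            lev[:] = [1, 2, 3]
            o3[0, :] = np.array([2.0e-8, 3.5e-8, 5.0e-8])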

  4. EarthRef.org: Exploring aspects of a Cyber Infrastructure in Earth Science and Education

    NASA Astrophysics Data System (ADS)

    Staudigel, H.; Koppers, A.; Tauxe, L.; Constable, C.; Helly, J.

    2004-12-01

    EarthRef.org is the common host and (co-)developer of a range of earth science databases and IT resources providing a test bed for a Cyberinfrastructure in Earth Science and Education (CIESE). EarthRef.org database efforts include, in particular, the Geochemical Earth Reference Model (GERM), the Magnetics Information Consortium (MagIC), the Educational Resources for Earth Science Education (ERESE) project, the Seamount Catalog, the Mid-Ocean Ridge Catalog, the Radio-Isotope Geochronology (RiG) initiative for CHRONOS, and the Microbial Observatory for Fe oxidizing microbes on Loihi Seamount (FeMO; the most recent development). These diverse databases are developed under a single database umbrella and webserver at the San Diego Supercomputer Center. All the databases have similar structures, with consistent metadata concepts, a common database layout, and automated upload wizards. Shared resources include supporting databases like an address book, a reference/publication catalog, and a common digital archive, making database development and maintenance cost-effective while guaranteeing interoperability. The EarthRef.org CIESE provides a common umbrella for synthesis information as well as sample-based data, and it bridges the gap between science and science education in middle and high schools, validating the potential for a system-wide data infrastructure in a CIESE. EarthRef.org experiences have shown that effective communication with the respective communities is a key part of a successful CIESE, facilitating both utility and community buy-in. GERM has been particularly successful at developing a metadata scheme for geochemistry and in the development of a new electronic journal (G-cubed) that has made much progress in data publication and linkages between journals and community databases. GERM also has worked, through editors and publishers, towards interfacing databases with the publication process, to accomplish a more scholarly and database-friendly data publication environment, and to interface with the respective science communities. MagIC has held several workshops that have resulted in an integrated data archival environment using metadata that are interchangeable with the geochemical metadata. MagIC archives a wide array of paleomagnetic and rock magnetic directional, intensity and magnetic property data, and integrates computational tools. ERESE brought together librarians, teachers, and scientists to create an educational environment that supports inquiry-driven education and the use of science data. Experiences in EarthRef.org demonstrate the feasibility of an effective, community-wide CIESE for data publication, archiving and modeling, as well as outreach to the educational community.

  5. EARS : Repositioning data management near data acquisition.

    NASA Astrophysics Data System (ADS)

    Sinquin, Jean-Marc; Sorribas, Jordi; Diviacco, Paolo; Vandenberghe, Thomas; Munoz, Raquel; Garcia, Oscar

    2016-04-01

    The EU FP7 projects Eurofleets and Eurofleets2 form a Europe-wide alliance of marine research centres that aims to share research vessels, to improve information sharing on planned, current and completed cruises and on the details of ocean-going research vessels and specialized equipment, and to durably improve the cost-effectiveness of cruises. In this context, logging information on how, when and where anything happens on board the vessel is crucial for data users at a later stage. This is an essential step in data quality control, as it can assist in understanding anomalies and unexpected trends recorded in the acquired data sets. In this way the completeness of the metadata is improved, as it is recorded accurately at the origin of the measurement. Within the European research fleet, this crucial information has been collected in very different ways, using different procedures, formats and pieces of software. When the Eurofleets project started, every institution and country had adopted different strategies and approaches, which complicated the task of users who need to log general-purpose information and events on board whenever they access a different platform, and the opportunity to produce this valuable metadata on board was often lost. Among the many goals of the Eurofleets project, a very important task is the development of event-log software called EARS (Eurofleets Automatic Reporting System), which enables scientists and operators to record what happens during a survey. EARS will allow users to fill, in a standardized way, the current gap in metadata description, which only very seldom links data with its history. Events generated automatically by acquisition instruments will also be handled, enhancing the granularity and precision of the event annotation. The adoption of a common procedure to log survey events and a common terminology to describe them is crucial to providing a user-friendly and successful on-board metadata creation procedure for the whole European fleet. The possibility of automatically reporting metadata and general-purpose data will simplify the work of scientists and data managers with regard to data transmission. Improved accuracy and completeness of metadata are expected when events are recorded at acquisition time. This will also enhance multiple uses of the data, as it allows the different requirements of different disciplines to be verified.
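
    To make the idea of standardized on-board event annotation concrete, here is a minimal, hypothetical sketch of what a single survey event record might look like; the field names and vocabulary terms are illustrative assumptions, not the actual EARS schema.

        # Hypothetical on-board event record (not the actual EARS schema):
        # a timestamped, vocabulary-controlled annotation linking an action to a platform.
        import json
        from datetime import datetime, timezone

        event = {
            "event_time": datetime(2016, 4, 18, 9, 30, tzinfo=timezone.utc).isoformat(),
            "platform": "RV Example",            # research vessel name (placeholder)
            "actor": "chief.scientist",          # who logged or triggered the event
            "process": "CTD",                    # controlled term for the activity
            "action": "deployment_start",        # controlled term for the event itself
            "position": {"lat": 43.25, "lon": 5.10},
            "comment": "CTD cast 007, winch speed reduced due to swell",
        }

        print(json.dumps(event, indent=2))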

  6. Best Practices for Making Scientific Data Discoverable and Accessible through Integrated, Standards-Based Data Portals

    NASA Astrophysics Data System (ADS)

    Lucido, J. M.

    2013-12-01

    Scientists in the fields of hydrology, geophysics, and climatology are increasingly using the vast quantity of publicly-available data to address broadly-scoped scientific questions. For example, researchers studying contamination of nearshore waters could use a combination of radar-indicated precipitation, modeled water currents, and various sources of in-situ monitoring data to predict water quality near a beach. In discovering, gathering, visualizing and analyzing potentially useful data sets, data portals have become invaluable tools. The most effective data portals often aggregate distributed data sets seamlessly and allow multiple avenues for accessing the underlying data, facilitated by the use of open standards. Additionally, adequate metadata are necessary for attribution, documentation of provenance and relating data sets to one another. Metadata also enable thematic, geospatial and temporal indexing of data sets and entities. Furthermore, effective portals make use of common vocabularies for scientific methods, units of measure, geologic features, and chemical and biological constituents, as these allow investigators to correctly interpret and utilize data from external sources. One application that employs these principles is the National Ground Water Monitoring Network (NGWMN) Data Portal (http://cida.usgs.gov/ngwmn), which makes groundwater data from distributed data providers available through a single, publicly accessible web application by mediating and aggregating native data exposed via web services on-the-fly into Open Geospatial Consortium (OGC) compliant service output. That output may be accessed either through the map-based user interface or through the aforementioned OGC web services. Furthermore, the Geo Data Portal (http://cida.usgs.gov/climate/gdp/), which is a system that provides users with data access, subsetting and geospatial processing of large and complex climate and land use data, exemplifies the application of International Organization for Standardization (ISO) metadata records to enhance data discovery for both human and machine interpretation. Lastly, the Water Quality Portal (http://www.waterqualitydata.us/) achieves interoperable dissemination of water quality data by referencing a vocabulary service for mapping constituents and methods between the USGS and USEPA. The NGWMN Data Portal, Geo Data Portal and Water Quality Portal are three examples of best practices when implementing data portals that provide distributed scientific data in an integrated, standards-based approach.
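
    As a sketch of the kind of standards-based access these portals expose, the snippet below lists the layers advertised by an OGC Web Map Service using OWSLib; the URL is a placeholder rather than the actual NGWMN service address, and OWSLib is assumed to be installed.

        # Sketch: discovering layers from an OGC Web Map Service with OWSLib.
        # The endpoint URL is a placeholder; substitute a real WMS GetCapabilities URL.
        from owslib.wms import WebMapService

        WMS_URL = "https://example.org/geoserver/wms"  # placeholder endpoint

        wms = WebMapService(WMS_URL, version="1.3.0")

        print(wms.identification.title)
        for name, layer in wms.contents.items():
            print(name, "-", layer.title)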

  7. Lessons in weather data interoperability: the National Mesonet Program

    NASA Astrophysics Data System (ADS)

    Evans, J. D.; Werner, B.; Cogar, C.; Heppner, P.

    2015-12-01

    The National Mesonet Program (NMP) links local, state, and regional surface weather observation networks (a.k.a. mesonets) to enhance the prediction of high-impact, local-scale weather events. A consortium of 23 (and counting) private firms, state agencies, and universities provides near-real-time observations from over 7,000 fixed weather stations and over 1,000 vehicle-mounted sensors, every 15 minutes or less, together with the detailed sensor and station metadata required for effective forecasts and decision-making. In order to integrate these weather observations across the United States, and to provide full details about sensors, stations, and observations, the NMP has defined a set of conventions for observational data and sensor metadata. These conventions address the needs of users with limited bandwidth and computing resources, while also anticipating a growing variety of sensors and observations. For disseminating weather observation data, the NMP currently employs a simple ASCII format derived from the Integrated Ocean Observing System. This simplifies data ingest into common desktop software, and parsing by simple scripts; and it directly supports basic readings of temperature, pressure, etc. By extending the format to vector-valued observations, it can also convey readings taken at different altitudes (e.g., wind speed) or depths (e.g., soil moisture). Extending beyond these observations to fit a greater variety of sensors (solar irradiation, sodar, radar, lidar) may require further extensions, or a move to more complex formats (e.g., based on XML or JSON). We will discuss the tradeoffs of various conventions for different users and use cases. To convey sensor and station metadata, the NMP uses a convention known as Starfish Fungus Language (*FL), derived from the Open Geospatial Consortium's SensorML standard. *FL separates static and dynamic elements of a sensor description, allowing for relatively compact expressions that reference a library of shared definitions (e.g., sensor manufacturer's specifications) alongside time-varying and site-specific details (slope/aspect, calibration, etc.). We will discuss the tradeoffs of *FL, SensorML, and alternatives for conveying sensor details to various users and uses.
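
    The exact NMP file layout is not reproduced in the abstract, so the snippet below is a hedged sketch of parsing a generic IOOS-style comma-separated observation line, including one vector-valued field for readings at several heights; the column order, field names and pipe-delimited vector convention are assumptions for illustration only.

        # Hypothetical IOOS-style ASCII observation line with one vector-valued field
        # (wind speed at several heights, pipe-delimited). Format details are assumed.
        csv_line = "STN042,2015-07-01T18:45:00Z,38.91,-77.04,temperature_C=29.4,wind_speed_ms=3.2|4.1|5.0"

        def parse_observation(line: str) -> dict:
            station, timestamp, lat, lon, *readings = line.split(",")
            record = {
                "station": station,
                "time": timestamp,
                "lat": float(lat),
                "lon": float(lon),
                "obs": {},
            }
            for item in readings:
                name, value = item.split("=")
                # A pipe-separated value is treated as a profile (vector-valued observation).
                if "|" in value:
                    record["obs"][name] = [float(v) for v in value.split("|")]
                else:
                    record["obs"][name] = float(value)
            return record

        print(parse_observation(csv_line))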

  8. The Evolution of NSF Arctic Data Management: Challenges and Lessons Learned after Two Decades of Support

    NASA Astrophysics Data System (ADS)

    Moore, J. A.; Serreze, M. C.; Williams, S.; Ramamurthy, M. K.; Middleton, D.

    2014-12-01

    The U.S. National Science Foundation has been providing data management support to the Arctic research community through UCAR/NCAR since late 1995. Support began during the early planning phase of the Surface Heat Budget of the Arctic (SHEBA) Project and continues today with a major collaboration involving the NCAR Earth Observing Laboratory (EOL), the NCAR Computational Information Systems Laboratory (CISL), the UCAR Unidata Program, and the National Snow and Ice Data Center (NSIDC), in the Advanced Cooperative Arctic Data and Information System (ACADIS). These groups have managed thousands of datasets for hundreds of Principal Investigators. The datasets, including the metadata and documentation held in the archives, vary in size from less than 30 kilobytes to tens of gigabytes and represent dozens of research disciplines. The ACADIS holdings alone include more than 50 scientific disciplines as defined by the NASA/GCMD keywords. The data formats vary from simple ASCII text to proprietary complex binary and imagery. A lot has changed in the way data are collected due to improved data collection technologies, real-time processing and wide-bandwidth communications. There have been some changes to data management best practices, especially related to metadata, flexible formatting, DOIs, and interoperability with other archives, to take advantage of new technologies, software and related support capabilities. ACADIS has spent more than 7 years working these issues and implementing an agile service approach. We have confronted and overcome some very interesting challenges during the past 20 years. However, with all those improvements there are guiding principles for the data managers that are robust and remain important even after 20 years of experience. These include the provision of evolving standards and complete metadata records to describe each dataset, international data exchange and easy access to the archived data, and the inclusion of comprehensive documentation to foster the long-term reuse potential of the data. The authors will provide details on the handling of these specific issues and also consider some other more subtle situations that continue to require serious consideration and problem solving.

  9. University of TX Bureau of Economic Geology's Core Research Centers: The Time is Right for Registering Physical Samples and Assigning IGSN's - Workflows, Stumbling Blocks, and Successes.

    NASA Astrophysics Data System (ADS)

    Averett, A.; DeJarnett, B. B.

    2016-12-01

    The University of Texas Bureau of Economic Geology (BEG) serves as the geological survey for Texas and operates three geological sample repositories that house well over 2 million boxes of geological samples (cores and cuttings) and an abundant amount of geoscience data (geophysical logs, thin sections, geochemical analyses, etc.). Material is accessible and searchable online, and it is publicly available to the geological community for research and education. Patrons access information about our collection by using our online core and log database (SQL format). BEG is currently undertaking a large project to: 1) improve the internal accuracy of metadata associated with the collection; 2) enhance the capabilities of the database for both BEG curators and researchers as well as our external patrons; and 3) ensure easy and efficient navigation for patrons through our online portal. As part of this project, BEG is in the early stages of planning to export the metadata for its collection into SESAR (System for Earth Sample Registration) and have IGSNs (International GeoSample Numbers) assigned to its samples. Education regarding the value of IGSNs and an external registry (SESAR) has been crucial to receiving management support for the project, because the concept and potential benefits of registering samples in a registry outside of the institution were not well known prior to this project. Potential benefits such as increased discoverability, repository recognition in publications, and interoperability were presented. The project was well received by management, and BEG fully supports the effort to register our physical samples with SESAR. Since BEG is only in the initial phase of this project, any stumbling blocks, workflow issues, successes/failures, etc. can only be predicted at this point, but by mid-December, BEG expects to have several concrete issues to present in the session. Currently, our most pressing issue involves establishing the most efficient workflow for exporting large amounts of metadata in a format that SESAR can easily ingest, and how this can be best accomplished with very few BEG staff assigned to the project.
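
    A hedged sketch of the bulk-export step described above: writing repository sample metadata to a flat CSV for batch registration. The column names are illustrative guesses, not the actual SESAR template, and the records are invented.

        # Sketch: exporting sample metadata to CSV for batch registration.
        # Column names are illustrative only; consult the registry's actual template.
        import csv

        samples = [
            {"sample_name": "BEG-CORE-000123", "sample_type": "Core",
             "material": "Rock", "latitude": 31.78, "longitude": -101.95,
             "collection_method": "Rotary drilling"},
            {"sample_name": "BEG-CUTT-004567", "sample_type": "Cuttings",
             "material": "Rock", "latitude": 28.45, "longitude": -98.12,
             "collection_method": "Rotary drilling"},
        ]

        fieldnames = ["sample_name", "sample_type", "material",
                      "latitude", "longitude", "collection_method"]

        with open("beg_samples_batch.csv", "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(samples)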

  10. The IAGOS Information System

    NASA Astrophysics Data System (ADS)

    Boulanger, D.; Thouret, V.

    2016-12-01

    IAGOS (In-service Aircraft for a Global Observing System) is a European Research Infrastructure which aims at the provision of long-term, regular and spatially resolved in situ observations of the atmospheric composition. IAGOS observation systems are deployed on a fleet of commercial aircraft and measure aerosols, cloud particles, greenhouse gases, ozone, water vapor and nitrogen oxides from the surface to the lower stratosphere. The IAGOS database is an essential part of the global atmospheric monitoring network. It contains IAGOS-core and IAGOS-CARIBIC data. The IAGOS Data Portal (http://www.iagos.fr) is part of the French atmospheric chemistry data center AERIS (http://www.aeris-data.fr). In 2016 the new IAGOS Data Portal was released. In addition to data download, the portal provides improved and new services such as download in NetCDF or NASA Ames formats and plotting tools (maps, time series, vertical profiles). New added-value products are available through the portal: back trajectories, origin of air masses, and co-location with satellite data. Web services allow users to download IAGOS metadata such as flight and airport information. Administration tools have been implemented for user management and instrument monitoring. A major improvement is the interoperability with international portals and other databases in order to improve IAGOS data discovery. In the framework of the IGAS project (IAGOS for the Copernicus Atmospheric Service), a data network has been set up. It is composed of three data centers: the IAGOS database in Toulouse, the HALO research aircraft database at DLR (https://halo-db.pa.op.dlr.de) and the CAMS (Copernicus Atmosphere Monitoring Service) data center in Jülich (http://join.iek.fz-juelich.de). The link with the CAMS data center, through the JOIN interface, makes it possible to combine model outputs with IAGOS data for intercomparison. The CAMS project is a prominent user of the IGAS data network. During the next year IAGOS will improve metadata standardization and dissemination through collaborations with the AERIS data center, GAW (for which IAGOS is a contributing network) and the ENVRI+ European project. Measurement traceability and quality metadata will be available and DOIs will be implemented.

  11. NOAA Atmospheric, Marine and Arctic Monitoring Using UASs (including Rapid Response)

    NASA Astrophysics Data System (ADS)

    Coffey, J. J.; Jacobs, T.

    2015-12-01

    Unmanned systems have the potential to efficiently, effectively, economically, and safely bridge critical observation requirements in an environmentally friendly manner. As the United States' atmospheric, marine and Arctic areas of interest expand and include hard-to-reach regions of the Earth (such as the Arctic and remote oceanic areas), optimizing unmanned capabilities will be needed to advance the United States' science, technology and security efforts. Through increased multi-mission and multi-agency operations using improved interoperable and autonomous unmanned systems, the research and operations communities will better collect environmental intelligence and better protect our country against hazardous weather and environmental, marine and polar hazards. This presentation will examine NOAA's Atmospheric, Marine and Arctic Monitoring Unmanned Aircraft System (UAS) strategies, which include developing a coordinated effort to maximize the efficiency and capabilities of unmanned systems across the federal government and research partners. Numerous intra- and inter-agency operational demonstrations and assessments have been made to verify and validate these strategies. This includes the introduction of the Targeted Autonomous Insitu Sensing and Rapid Response (TAISRR) with UAS concept of operations. The presentation will also discuss the requisite UAS capabilities and our experience in using them.

  12. Knowledge base for growth and innovation in ocean economy: assembly and dissemination of marine data for seabed mapping - European Marine Observation Data Network - EMODnet Physics

    NASA Astrophysics Data System (ADS)

    Novellino, Antonio; Gorringe, Patrick; Schaap, Dick; Pouliquen, Sylvie; Rickards, Lesley; Manzella, Giuseppe

    2014-05-01

    The Physics preparatory action (MARE/2010/02 - Lot [SI2.579120]) had the overall objective of providing access to archived and near-real-time data on physical conditions as monitored by fixed stations and Ferrybox lines in all the European sea basins and oceans, and of determining how well the data meet the needs of users. The existing EMODnet-Physics portal, www.emodnet-physics.eu, includes physical data from the whole of Europe (wave height and period, temperature of the water column, wind speed and direction, salinity of the water column, horizontal velocity of the water column, light attenuation, and sea level), provided mainly by fixed stations and ferry-box platforms, and supports discovery of related data sets (both near-real-time and historical), as well as viewing and downloading of data from about 470 platforms across the European sea basins. It makes layers of physical data and their metadata available for use and contributes towards the definition of an operational European Marine Observation and Data Network (EMODnet). It is based on a strong collaboration between EuroGOOS member institutes and its regional operational oceanographic systems (ROOSs), and it brings together two different marine communities: the "real-time" ocean observing institutes and centres, and the National Oceanographic Data Centres (NODCs), which are in charge of validating archived ocean data, quality control and the continuous updating of data archives for marine environmental monitoring. EMODnet Physics is a marine observation and data information system that provides a single point of access to near-real-time and historical archived data; it is built on existing infrastructure, adding value while avoiding unnecessary complexity; it provides data access to any relevant user; and it aims to attract new data holders and to provide better and more data. With a long-term vision of a sustained pan-European ocean observation system, EMODnet Physics supports the coordination of the EuroGOOS ROOSs and the empowerment and improvement of their observing and data management infrastructure. The ongoing EMODnet Physics preparatory action has recently been extended (MARE/2012/06 - Lot 6) with the aim of enlarging the coverage with additional monitoring systems (e.g. Argos, gliders, HF radars) and products, and of strengthening the underlying infrastructure. The presentation will show how to exploit the EMODnet portal and access the metadata and data of connected platforms.

  13. Semantic Interoperability for Computational Mineralogy: Experiences of the eMinerals Consortium

    NASA Astrophysics Data System (ADS)

    Walker, A. M.; White, T. O.; Dove, M. T.; Bruin, R. P.; Couch, P. A.; Tyer, R. P.

    2006-12-01

    The use of atomic scale computer simulation of minerals to obtain information for geophysics and environmental science has grown enormously over the past couple of decades. It is now routine to probe mineral behavior in the Earth's deep interior and in the surface environment by borrowing methods and simulation codes from computational chemistry and physics. It is becoming increasingly important to use methods embodied in more than one of these codes to solve any single scientific problem. However, scientific codes are rarely designed for easy interoperability and data exchange; data formats are often code-specific, poorly documented and fragile, liable to frequent change between software versions and even compiler versions. This means that the scientist's simple desire to use the methodological approaches offered by multiple codes is frustrated, and even the sharing of data between collaborators becomes fraught with difficulties. The eMinerals consortium was formed in the early stages of the UK eScience program with the aim of developing the tools needed to apply atomic scale simulation to environmental problems in a grid-enabled world, and to harness the computational power offered by grid technologies to address some outstanding mineralogical problems. One example of the kind of problem we can tackle is the origin of the compressibility anomaly in silica glass. By passing data directly between simulation and analysis tools we were able to probe this effect in more detail than has previously been possible and have shown how the anomaly is related to the details of the amorphous structure. In order to approach this kind of problem we have constructed a mini-grid, a small-scale and extensible combined compute- and data-grid that allows the execution of many calculations in parallel, and the transparent storage of semantically rich marked-up result data. Importantly, we automatically capture multiple kinds of metadata and key results from each calculation. We believe that the lessons learned and tools developed will be useful in many areas of science beyond computational mineralogy. Key tools that will be described include: a pure Fortran XML library (FoX) that presents XPath, SAX and DOM interfaces as well as permitting the easy production of valid XML from legacy Fortran programs; a job submission framework that automatically schedules calculations to remote grid resources, handles data staging and metadata capture; and a tool (AgentX) that maps concepts from an ontology onto locations in documents of various formats, which we use to enable data exchange.
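
    The consortium's own tools (FoX, AgentX) are Fortran- and ontology-based, but the underlying idea of emitting semantically marked-up results with captured metadata can be sketched in a few lines of Python; the element and attribute names below are invented for illustration and do not follow the FoX/CML schema.

        # Sketch: writing a simulation result as marked-up XML with captured metadata.
        # Element and attribute names are invented; they do not follow the FoX/CML schema.
        import xml.etree.ElementTree as ET

        run = ET.Element("simulationRun", code="example-md-code", version="1.2.3")

        meta = ET.SubElement(run, "metadata")
        ET.SubElement(meta, "parameter", name="temperature", units="K").text = "300"
        ET.SubElement(meta, "parameter", name="pressure", units="GPa").text = "0.0"

        results = ET.SubElement(run, "results")
        ET.SubElement(results, "property", name="bulk_modulus", units="GPa").text = "36.5"

        ET.ElementTree(run).write("run_0001.xml", encoding="utf-8", xml_declaration=True)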

  14. Towards an integrated EU data system within AtlantOS project

    NASA Astrophysics Data System (ADS)

    Pouliquen, Sylvie; Harscoat, Valerie; Waldmann, Christoph; Koop-Jakobsen, Ketill

    2017-04-01

    The H2020 AtlantOS project started in June 2015 and aims to optimise and enhance the Integrated Atlantic Ocean Observing Systems (IAOOS). One goal is to ensure that data from different and diverse in-situ observing networks are readily accessible and usable by the wider community, the international ocean science community and other stakeholders in this field. To achieve that, the strategy is to move towards an integrated data system within AtlantOS that harmonises workflows, data processing and distribution across the in-situ observing network systems, and integrates in-situ observations into existing European and international data infrastructures (Copernicus marine service, SeaDataNet NODCs, EMODnet, OBIS, GEOSS), the so-called Integrators. The targeted integrated system will deal with data management challenges for efficient and reliable data service to users: • quality control commons for heterogeneous and near-real-time data • standardisation of mandatory metadata for efficient data exchange • interoperability of network and Integrator data management systems. Presently the situation is that the data acquired by the different in situ observing networks contributing to the AtlantOS project are processed and distributed using different methodologies and means. Depending on the network data management organization, the data are either processed following recommendations elaborated by the network teams and accessible through a unique portal (FTP or Web), or are processed by individual scientific researchers and made available through National Data Centres or directly at institution level. Some datasets are available through Integrators, such as Copernicus or EMODnet, but connected through ad-hoc links. To facilitate access to the Atlantic observations and avoid mixing apples with pears, it has been necessary to agree on (1) the list and definition of EOVs (essential ocean variables) across the networks, (2) a minimum set of common vocabularies for metadata and data description to be used by all the networks, and (3) a minimum level of near-real-time quality control procedures for selected EOVs. A data exchange backbone has then been defined and is being set up to facilitate discovery, viewing and downloading by users. Some tools will be recommended to help networks plug their data into this backbone and to facilitate integration into the Integrators. Finally, existing services for data discovery, viewing and downloading will be enhanced to ease access to existing observations. An initial working phase relying on existing international standards and protocols, involving data providers, both networks and Integrators, and dealing with data harmonisation and integration objectives, has led to agreements and recommendations. The setup phase has started, on both the network and Integrator sides, to adapt the existing systems in order to move toward this integrated EU data system within AtlantOS, as well as collaboration with international partners around the Atlantic Ocean.
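
    As a hedged illustration of the minimum near-real-time quality control mentioned above, the function below applies a simple global range test and flags values using an IOC/SeaDataNet-style convention of 1 = good and 4 = bad; the thresholds and example readings are placeholders, not the agreed AtlantOS values.

        # Sketch: a near-real-time global range test for sea temperature values.
        # Thresholds and flag values are illustrative (1 = good, 4 = bad data).
        from typing import Iterable, List

        def global_range_test(values: Iterable[float],
                              vmin: float = -2.5,
                              vmax: float = 40.0) -> List[int]:
            """Return one QC flag per value: 1 if within [vmin, vmax], else 4."""
            return [1 if vmin <= v <= vmax else 4 for v in values]

        temperatures = [12.3, 13.1, 85.0, 11.9]   # invented near-real-time readings
        print(global_range_test(temperatures))     # -> [1, 1, 4, 1]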

  15. The ChArMEx database

    NASA Astrophysics Data System (ADS)

    Ferré, Helene; Belmahfoud, Nizar; Boichard, Jean-Luc; Brissebrat, Guillaume; Descloitres, Jacques; Fleury, Laurence; Focsa, Loredana; Henriot, Nicolas; Mastrorillo, Laurence; Mière, Arnaud; Vermeulen, Anne

    2014-05-01

    The Chemistry-Aerosol Mediterranean Experiment (ChArMEx, http://charmex.lsce.ipsl.fr/) aims at a scientific assessment of the present and future state of the atmospheric environment in the Mediterranean Basin, and of its impacts on the regional climate, air quality, and marine biogeochemistry. The project includes long-term monitoring of environmental parameters, intensive field campaigns, use of satellite data and modelling studies. ChArMEx scientists therefore produce and need to access a wide diversity of data. In this context, the objective of the database task is to organize the data management and distribution system and services, facilitating the exchange of information and stimulating collaboration between researchers within the ChArMEx community, and beyond. The database relies on a strong collaboration between the OMP and ICARE data centres and has been set up in the framework of the Mediterranean Integrated Studies at Regional And Local Scales (MISTRALS) program data portal. All the data produced by or of interest for the ChArMEx community will be documented in the data catalogue and accessible through the database website: http://mistrals.sedoo.fr/ChArMEx. At present, the ChArMEx database contains about 75 datasets, including 50 in situ datasets (2012 and 2013 campaigns, Ersa background monitoring station), 25 model outputs (dust model intercomparison, MEDCORDEX scenarios), and a high resolution emission inventory over the Mediterranean. Many in situ datasets have been inserted in a relational database, in order to enable more accurate data selection and download of different datasets in a shared format. The database website offers different tools: - a registration procedure which enables any scientist to accept the data policy and apply for a user database account; - a data catalogue that complies with international metadata standards (ISO 19115-19139; INSPIRE European Directive; Global Change Master Directory Thesaurus); - metadata forms to document observations or products that will be provided to the database; - a search tool to browse the catalogue using thematic, geographic and/or temporal criteria; - a shopping-cart web interface to order in situ data files; - a web interface to select and access homogenized datasets. Interoperability between the two data centres is being set up using the OPeNDAP protocol. The data portal will soon provide user-friendly access to satellite products managed by the ICARE data centre (SEVIRI, TRMM, PARASOL, ...). In order to meet the operational needs of the airborne and ground-based observational teams during the ChArMEx 2012 and 2013 campaigns, a day-to-day chart and report display website has also been developed: http://choc.sedoo.org. It offers a convenient way to browse weather conditions and chemical composition during the campaign periods.
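
    Since the two data centres interoperate through OPeNDAP, a user-side sketch of that access pattern looks like the following; the dataset URL is a placeholder, and xarray with a netCDF/OPeNDAP-capable backend is assumed to be installed.

        # Sketch: opening a remote dataset over OPeNDAP with xarray (placeholder URL).
        import xarray as xr

        OPENDAP_URL = "https://example.org/thredds/dodsC/charmex/example_dataset.nc"  # placeholder

        ds = xr.open_dataset(OPENDAP_URL)          # data are fetched lazily over the network
        print(ds)                                   # inspect variables and attributes

        # Subset before loading to keep the transfer small, e.g. a time slice:
        subset = ds.sel(time=slice("2013-06-01", "2013-07-31"))
        subset.to_netcdf("charmex_subset.nc")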

  16. Preserving Geological Samples and Metadata from Polar Regions

    NASA Astrophysics Data System (ADS)

    Grunow, A.; Sjunneskog, C. M.

    2011-12-01

    The Office of Polar Programs at the National Science Foundation (NSF-OPP) has long recognized the value of preserving earth science collections due to the inherent logistical challenges and financial costs of collecting geological samples from Polar Regions. NSF-OPP established two national facilities to make Antarctic geological samples and drill cores openly and freely available for research. The Antarctic Marine Geology Research Facility (AMGRF) at Florida State University was established in 1963 and archives Antarctic marine sediment cores, dredge samples and smear slides along with ship logs. The United States Polar Rock Repository (USPRR) at Ohio State University was established in 2003 and archives polar rock samples, marine dredges, unconsolidated materials and terrestrial cores, along with associated materials such as field notes, maps, raw analytical data, paleomagnetic cores, thin sections, microfossil mounts, microslides and residues. The existence of the AMGRF and USPRR helps to minimize redundant sample collecting, lessens the environmental impact of polar field work, facilitates field logistics planning and supports compliance with the data-sharing requirement of the Antarctic Treaty. USPRR acquires collections through donations from institutions and scientists and then makes these samples available as no-cost loans for research, education and museum exhibits. The AMGRF acquires sediment cores from US-based and international collaborative drilling projects in Antarctica. Destructive research techniques are allowed on the loaned samples and loan requests are accepted from any accredited scientific institution in the world. Currently, the USPRR has more than 22,000 cataloged rock samples available to scientists from around the world. All cataloged samples are relabeled with a USPRR number, weighed, photographed and measured for magnetic susceptibility. Many aspects of the sample metadata are included in the database, e.g. geographical location, sample description, collector, rock age, formation, section location, multimedia images, as well as structural data, field observations, logistics, surface features, etc. The metadata are entered into a commercial, museum-based database called EMu. The AMGRF houses more than 25,000 m of deep-sea cores and drill cores as well as nearly 3,000 meters of rotary cored geological material from Antarctica. Detailed information on the sediment cores, including location and sediment composition, is available in cruise reports posted on the AMGRF website. Researchers may access the sample collections through the online websites (http://www-bprc.mps.ohio-state.edu/emuwebusprr and http://www.arf.fsu.edu). Searches may be done using multiple search terms or by use of the mapping feature. The online databases provide an essential resource for proposal preparation, pilot studies and other sample-based research that should make fieldwork more efficient.

  17. Maximizing data holdings and data documentation with a hierarchical system for sample-based geochemical data

    NASA Astrophysics Data System (ADS)

    Hsu, L.; Lehnert, K. A.; Walker, J. D.; Chan, C.; Ash, J.; Johansson, A. K.; Rivera, T. A.

    2011-12-01

    Sample-based measurements in geochemistry are highly diverse, due to the large variety of sample types, measured properties, and idiosyncratic analytical procedures. In order to ensure the utility of sample-based data for re-use in research or education, they must be associated with a high quality and quantity of descriptive, discipline-specific metadata. Without an adequate level of documentation, it is not possible to reproduce scientific results or have confidence in using the data for new research inquiries. The required detail in data documentation makes it challenging to aggregate large sets of data from different investigators and disciplines. One solution to this challenge is to build data systems with several tiers of intricacy, where the less detailed tiers are geared toward discovery and interoperability, and the more detailed tiers have higher value for data analysis. The Geoinformatics for Geochemistry (GfG) group, which is part of the Integrated Earth Data Applications facility (http://www.iedadata.org), has taken this approach to provide services for the discovery, access, and analysis of sample-based geochemical data for a diverse user community, ranging from the highly informed geochemist to non-domain scientists and undergraduate students. GfG builds and maintains three tiers in the sample-based data systems, from a simple data catalog (Geochemical Resource Library), to a substantially richer data model for the EarthChem Portal (EarthChem XML), and finally to detailed discipline-specific data models for petrologic (PetDB), sedimentary (SedDB), hydrothermal spring (VentDB), and geochronological (GeoChron) samples. The data catalog, the lowest level in the hierarchy, contains the sample data values plus metadata only about the dataset itself (Dublin Core metadata such as dataset title and author), and therefore can accommodate the widest diversity of data holdings. The second level includes measured data values from the sample, basic information about the analytical method, and metadata about the samples such as geospatial information and sample type. The third and highest level includes detailed data quality documentation and more specific information about the scientific context of the sample. The three tiers are linked to allow users to quickly navigate to their desired level of metadata detail. Links are based on the use of unique identifiers: (a) the DOI at the granularity of datasets, and (b) the International Geo Sample Number (IGSN) at the granularity of samples. Current developments in the GfG sample-based systems include new registry architecture for the IGSN to advance international implementation, growth and modification of EarthChemXML to include geochemical data for new sample types such as soils and liquids, and the construction of a hydrothermal vent data system. This flexible, tiered model provides a solution for offering varying levels of detail in order to aggregate a large quantity of data and serve the largest user group of both disciplinary novices and experts.
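
    A minimal sketch of how the three tiers can be linked through persistent identifiers, with a dataset-level DOI and sample-level IGSNs; all identifiers, field names and values below are invented for illustration.

        # Sketch: linking catalog-, portal- and discipline-level records via DOI and IGSN.
        # All identifiers and values are invented for illustration.
        dataset_catalog_entry = {              # tier 1: Dublin Core-style dataset record
            "doi": "10.1234/example.dataset",
            "title": "Example basalt glass analyses",
            "creator": "A. Researcher",
        }

        portal_records = [                     # tier 2: per-sample values + basic context
            {"igsn": "IEEXAMPLE01", "dataset_doi": "10.1234/example.dataset",
             "latitude": -23.5, "longitude": -115.2, "sample_type": "volcanic glass",
             "SiO2_wt_percent": 50.4},
        ]

        detailed_records = {                   # tier 3: discipline-specific documentation
            "IEEXAMPLE01": {
                "method": "EMP",               # analytical method (electron microprobe)
                "standard": "internal lab standard",
                "uncertainty_wt_percent": 0.3,
            },
        }

        # Navigating from the catalog entry down to full analytical detail:
        for rec in portal_records:
            if rec["dataset_doi"] == dataset_catalog_entry["doi"]:
                print(rec["igsn"], detailed_records[rec["igsn"]])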

  18. Large-Scale Data Collection Metadata Management at the National Computation Infrastructure

    NASA Astrophysics Data System (ADS)

    Wang, J.; Evans, B. J. K.; Bastrakova, I.; Ryder, G.; Martin, J.; Duursma, D.; Gohar, K.; Mackey, T.; Paget, M.; Siddeswara, G.

    2014-12-01

    Data collection management has become an essential activity at the National Computation Infrastructure (NCI) in Australia. NCI's partners (CSIRO, Bureau of Meteorology, Australian National University, and Geoscience Australia), supported by the Australian Government and Research Data Storage Infrastructure (RDSI), have established a national data resource that is co-located with high-performance computing. This paper addresses the metadata management of these data assets over their lifetime. NCI manages 36 data collections (10+ PB) categorised as earth system sciences, climate and weather model data assets and products, earth and marine observations and products, geosciences, terrestrial ecosystem, water management and hydrology, astronomy, social science and biosciences. The data is largely sourced from NCI partners, the custodians of many of the national scientific records, and major research community organisations. The data is made available in an HPC and data-intensive environment - a ~56,000-core supercomputer, virtual labs on a 3,000-core cloud system, and data services. By assembling these large national assets, new opportunities have arisen to harmonise the data collections, making a powerful cross-disciplinary resource. To support the overall management, a Data Management Plan (DMP) has been developed to record the workflows, procedures, key contacts and responsibilities. The DMP has fields that can be exported to the ISO 19115 schema and to the collection-level catalogue of GeoNetwork. The subset- or file-level metadata catalogues are linked with the collection level through parent-child relationship definitions using UUIDs. A number of tools have been developed that support interactive metadata management, bulk loading of data, and support for computational workflows or data pipelines. NCI creates persistent identifiers for each of the assets. The data collection is tracked over its lifetime, and the recognition of the data providers, data owners, data generators and data aggregators is updated. A Digital Object Identifier is assigned using the Australian National Data Service (ANDS). Once the data has been quality assured, a DOI is minted and the metadata record updated. NCI's data citation policy establishes the relationship between research outcomes, data providers, and the data.
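
    A hedged sketch of the parent-child linkage described above: the collection-level record carries its own UUID, and each file- or subset-level record points back to it through a parentIdentifier-style field. The field names loosely follow ISO 19115 usage, and the identifiers and titles are invented.

        # Sketch: collection-level and file-level metadata linked by UUID
        # (field names loosely follow ISO 19115 usage; values are invented).
        import uuid

        collection_uuid = str(uuid.uuid4())

        collection_record = {
            "fileIdentifier": collection_uuid,
            "title": "Example climate model output collection",
            "hierarchyLevel": "dataset collection",
        }

        file_record = {
            "fileIdentifier": str(uuid.uuid4()),
            "parentIdentifier": collection_uuid,      # child points to its collection
            "title": "monthly_tas_1990-1999.nc",
            "hierarchyLevel": "dataset",
        }

        assert file_record["parentIdentifier"] == collection_record["fileIdentifier"]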

  19. Mind the Gap: furthering the development of an international collaboration in marine data management

    NASA Astrophysics Data System (ADS)

    Glaves, Helen; Miller, Stephen; Proctor, Roger; Schaap, Dick

    2013-04-01

    A large and ever increasing amount of marine data is available throughout Europe, USA, Australia and beyond. The challenges associated with the acquisition of this data mean that the cost of collection is high and the data itself often irreplaceable. At a time when the demand for marine data is growing while financial resources for its collection are being dramatically reduced the need to maximise its re-use is becoming a priority for marine data managers. A number of barriers to the re-use of marine data currently exist due to the various formats, standards, vocabularies etc. used by the organisations engaged in collecting and managing this data. These challenges are already being addressed at a regional level by projects in Europe (Geo-Seas, SeaDataNet etc.), USA (R2R) and Australia (IMOS). To expand these projects further and bridge the gap between these regional initiatives the Ocean Data Interoperability Platform (ODIP) will establish a collaborative platform which will facilitate the development of a common approach to marine data management. Proactive dissemination of the outcomes and products of this project will promote adoption of the common standards and practices developed by the ODIP project to other organisations and regions beyond the 20 original consortium partners. To demonstrate this coordinated approach several joint prototypes will be developed to test and evaluate potential solutions for solving the marine data management issues identified within the different marine disciplines. These prototypes will also be used to illustrate the effective sharing of data across scientific domains, organisations and international boundaries through the development of common practices and standards in marine data management.

  20. Mind the Gap: furthering the development of an international collaboration in marine data management

    NASA Astrophysics Data System (ADS)

    Glaves, H. M.; Miller, S. P.; Proctor, R.; Schaap, D.

    2012-12-01

    A large and ever increasing amount of marine data is available throughout Europe, USA, Australia and beyond. The challenges associated with the acquisition of this data mean that the cost of collection is high and the data itself often irreplaceable. At a time when the demand for marine data is growing while financial resources for its collection are being dramatically reduced the need to maximise its re-use is becoming a priority for marine data managers. A number of barriers to the re-use of marine data currently exist due to the various formats, standards, vocabularies etc. used by the organisations engaged in collecting and managing this data. These challenges are already being addressed at a regional level by projects in Europe (Geo-Seas, SeaDataNet etc.), USA (R2R) and Australia (IMOS). To expand these projects further and bridge the gap between these regional initiatives the Ocean Data Interoperability Platform (ODIP) will establish a collaborative platform which will facilitate the development of a common approach to marine data management. Proactive dissemination of the outcomes and products of this project will promote adoption of the common standards and practices developed by the ODIP project to other organisations and regions beyond the 20 original consortium partners. To demonstrate this coordinated approach several joint prototypes will be developed to test and evaluate potential solutions for solving the marine data management issues identified within the different marine disciplines. These prototypes will also be used to illustrate the effective sharing of data across scientific domains, organisations and international boundaries through the development of common practices and standards in marine data management.

  1. The IAGOS information system

    NASA Astrophysics Data System (ADS)

    Boulanger, Damien; Gautron, Benoit; Schultz, Martin; Brötz, Björn; Rauthe-Schöch, Armin; Thouret, Valérie

    2015-04-01

    IAGOS (In-service Aircraft for a Global Observing System) aims at the provision of long-term, frequent, regular, accurate, and spatially resolved in situ observations of the atmospheric composition. IAGOS observation systems are deployed on a fleet of commercial aircraft. The IAGOS database is an essential part of the global atmospheric monitoring network. Data access is handled through an open access policy based on the submission of research requests, which are reviewed by the PIs. The IAGOS database (http://www.iagos.fr, damien.boulanger@obs-mip.fr) is part of the French atmospheric chemistry data centre Ether (CNES and CNRS). In the framework of the IGAS project (IAGOS for Copernicus Atmospheric Service), interoperability with international portals and other databases is implemented in order to improve IAGOS data discovery. The IGAS data network is composed of three data centres: the IAGOS database in Toulouse, including IAGOS-core data and IAGOS-CARIBIC (Civil Aircraft for the Regular Investigation of the Atmosphere Based on an Instrument Container) data since January 2015; the HALO research aircraft database at DLR (https://halo-db.pa.op.dlr.de); and the MACC data centre in Jülich (http://join.iek.fz-juelich.de). The MACC (Monitoring Atmospheric Composition and Climate) project is a prominent user of the IGAS data network. In June 2015 a new version of the IAGOS database will be released, providing improved services such as download in NetCDF or NASA Ames formats, graphical tools (maps, scatter plots, etc.), standardized metadata (ISO 19115) and better user management. The link with the MACC data centre, through JOIN (Jülich OWS Interface), will make it possible to combine model outputs with IAGOS data for intercomparison. The interoperability within the IGAS data network, implemented through numerous web services, will improve the functionality of each data centre's web interface.

  2. Using a generalised identity reference model with archetypes to support interoperability of demographics information in electronic health record systems.

    PubMed

    Xu Chen; Berry, Damon; Stephens, Gaye

    2015-01-01

    Computerised identity management is in general encountered as a low-level mechanism that enables users in a particular system or region to securely access resources. In the Electronic Health Record (EHR), the identifying information of both the healthcare professionals who access the EHR and the patients whose EHR is accessed is subject to change. Demographics services have been developed to manage federated patient and healthcare professional identities and to support challenging healthcare-specific use cases in the presence of diverse and sometimes conflicting demographic identities. Demographics services are not the only use for identities in healthcare. Nevertheless, contemporary EHR specifications limit the types of entities that can be the actor or subject of a record to health professionals and patients, thus limiting the use of two-level models in other healthcare information systems. Demographics are ubiquitous in healthcare, so for a general identity model to be usable, it should be capable of managing demographic information. In this paper, we introduce a generalised identity reference model (GIRM) based on key characteristics of five surveyed demographic models. We evaluate the GIRM by using it to express the EN13606 demographics model in an extensible way at the metadata level and show how two-level modelling can support the exchange of instances of demographic identities. This use of the GIRM to express demographics information shows its application for standards-compliant two-level modelling alongside heterogeneous demographics models. We advocate this approach to facilitate the interoperability of identities between two-level model-based EHR systems and show the validity and the extensibility of using GIRM for the expression of other health-related identities.

  3. Community-Supported Data Repositories in Paleobiology: A 'Middle Tail' Between the Geoscientific and Informatics Communities

    NASA Astrophysics Data System (ADS)

    Williams, J. W.; Ashworth, A. C.; Betancourt, J. L.; Bills, B.; Blois, J.; Booth, R.; Buckland, P.; Charles, D.; Curry, B. B.; Goring, S. J.; Davis, E.; Grimm, E. C.; Graham, R. W.; Smith, A. J.

    2015-12-01

    Community-supported data repositories (CSDRs) in paleoecology and paleoclimatology have a decades-long tradition and serve multiple critical scientific needs. CSDRs facilitate synthetic large-scale scientific research by providing open-access and curated data that employ community-supported metadata and data standards. CSDRs serve as a 'middle tail' or boundary organization between information scientists and the long-tail community of individual geoscientists collecting and analyzing paleoecological data. Over the past decades, a distributed network of CSDRs has emerged, each serving a particular suite of data and research communities, e.g. Neotoma Paleoecology Database, Paleobiology Database, International Tree Ring Database, NOAA NCEI for Paleoclimatology, Morphobank, iDigPaleo, and Integrated Earth Data Alliance. Recently, these groups have organized into a common Paleobiology Data Consortium dedicated to improving interoperability and sharing best practices and protocols. The Neotoma Paleoecology Database offers one example of an active and growing CSDR, designed to facilitate research into ecological and evolutionary dynamics during recent past global change. Neotoma combines a centralized database structure with distributed scientific governance via multiple virtual constituent data working groups. The Neotoma data model is flexible and can accommodate a variety of paleoecological proxies from many depositional contexts. Data input into Neotoma is done by trained Data Stewards, drawn from their communities. Neotoma data can be searched, viewed, and returned to users through multiple interfaces, including the interactive Neotoma Explorer map interface, RESTful Application Programming Interfaces (APIs), the neotoma R package, and the Tilia stratigraphic software. Neotoma is governed by geoscientists and provides community engagement through training workshops for data contributors, stewards, and users. Neotoma is engaged in the Paleobiology Data Consortium and other efforts to improve interoperability among cyberinfrastructure in the paleogeosciences.

  4. The Earth System Grid Federation (ESGF) Project

    NASA Astrophysics Data System (ADS)

    Carenton-Madiec, Nicolas; Denvil, Sébastien; Greenslade, Mark

    2015-04-01

    The Earth System Grid Federation (ESGF) Peer-to-Peer (P2P) enterprise system is a collaboration that develops, deploys and maintains software infrastructure for the management, dissemination, and analysis of model output and observational data. ESGF's primary goal is to facilitate advancements in Earth System Science. It is an interagency and international effort led by the US Department of Energy (DOE), and co-funded by the National Aeronautics and Space Administration (NASA), National Oceanic and Atmospheric Administration (NOAA), National Science Foundation (NSF), the Infrastructure for the European Network of Earth System Modelling (IS-ENES) and international laboratories such as the Max Planck Institute for Meteorology (MPI-M), the German Climate Computing Centre (DKRZ), the Australian National University (ANU) National Computational Infrastructure (NCI), Institut Pierre-Simon Laplace (IPSL), and the British Atmospheric Data Center (BADC). Its main mission is to support current CMIP5 activities and prepare for future assessments. The ESGF architecture is based on a system of autonomous and distributed nodes, which interoperate through common acceptance of federation protocols and trust agreements. Data is stored at multiple nodes around the world, and served through local data and metadata services. Nodes exchange information about their data holdings and services, and trust each other for registering users and making access control decisions. The net result is that a user can use a web browser, connect to any node, and seamlessly find and access data throughout the federation. This collaborative working organization and distributed architecture highlighted the need to define integration and testing processes to ensure the quality of software releases and interoperability. This presentation will introduce the ESGF project and demonstrate the range of tools and processes that have been set up to support release management activities.
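
    To illustrate how data can be discovered across the federation from any node, here is a hedged Python sketch against an ESGF-style search service; the node URL, facet names, and response layout are assumptions for illustration, not a guaranteed interface.

    ```python
    # Hedged sketch: facet search against an ESGF-style index node.
    # The node URL, facet names (project, variable), and response layout
    # are assumptions for illustration.
    import requests

    SEARCH_URL = "https://esgf-node.llnl.gov/esg-search/search"  # assumed node
    params = {
        "project": "CMIP5",                  # assumed facet
        "variable": "tas",                   # assumed facet
        "format": "application/solr+json",   # assumed response format
        "limit": 3,
    }

    resp = requests.get(SEARCH_URL, params=params, timeout=60)
    resp.raise_for_status()
    docs = resp.json().get("response", {}).get("docs", [])
    for doc in docs:
        print(doc.get("id"))
    ```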

  5. DECADE web portal: toward the integration of MaGa, EarthChem and VOTW data systems to further the knowledge on Earth degassing

    NASA Astrophysics Data System (ADS)

    Cardellini, Carlo; Frigeri, Alessandro; Lehnert, Kerstin; Ash, Jason; McCormick, Brendan; Chiodini, Giovanni; Fischer, Tobias; Cottrell, Elizabeth

    2015-04-01

    The release of volatiles from the Earth's interior takes place in both volcanic and non-volcanic areas of the planet. Understanding such a complex process and improving the current estimates of global carbon emissions will greatly benefit from the integration of geochemical, petrological and volcanological data. At present, major online data repositories relevant to studies of degassing are not linked and interoperable. In the framework of the Deep Earth Carbon Degassing (DECADE) initiative of the Deep Carbon Observatory (DCO), we are developing interoperability between three data systems that will make their data accessible via the DECADE portal: (1) the Smithsonian Institution's Global Volcanism Program database (VOTW) of volcanic activity data, (2) the EarthChem databases for geochemical and geochronological data of rocks and melt inclusions, and (3) the MaGa database (Mapping Gas emissions), which contains compositional and flux data of gases released at volcanic and non-volcanic degassing sites. The DECADE web portal will create a powerful search engine over these databases from a single entry point and will return comprehensive multi-component datasets. A user will be able, for example, to obtain data relating to the compositions of emitted gases, the compositions and ages of the erupted products, and coincident activity for a specific volcano. This level of capability requires complete synergy between the databases, including the availability of standards-based web services (WMS, WFS) at all data systems. Data and metadata can thus be extracted from each system without interfering with each database's local schema or being replicated to achieve integration at the DECADE web portal. The DECADE portal will enable new synoptic perspectives on the Earth degassing process, allowing exploration of Earth degassing-related datasets over previously unexplored spatial or temporal ranges.
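
    The interoperability described above rests on standard OGC web services; as a hedged illustration, the sketch below issues a WFS 2.0 GetFeature request with Python. Only the key-value parameter names are standard OGC ones; the service URL and feature type name are placeholders, not actual DECADE, MaGa, EarthChem, or VOTW endpoints.

    ```python
    # Hedged sketch: a standard WFS 2.0 GetFeature request via key-value pairs.
    # The URL and typeNames value below are placeholders for illustration.
    import requests

    WFS_URL = "https://example.org/geoserver/wfs"      # placeholder service
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": "decade:gas_emission_sites",       # hypothetical layer name
        "outputFormat": "application/json",
        "count": 10,
    }

    resp = requests.get(WFS_URL, params=params, timeout=60)
    resp.raise_for_status()
    for feature in resp.json().get("features", []):
        print(feature["id"], feature["geometry"]["coordinates"])
    ```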

  6. Rosetta: Ensuring the Preservation and Usability of ASCII-based Data into the Future

    NASA Astrophysics Data System (ADS)

    Ramamurthy, M. K.; Arms, S. C.

    2015-12-01

    Field data obtained from dataloggers often take the form of comma separated value (CSV) ASCII text files. While ASCII-based data formats have positive aspects, such as the ease of accessing the data from disk and the wide variety of tools available for data analysis, there are some drawbacks, especially when viewing the situation through the lens of data interoperability and stewardship. The Unidata data translation tool, Rosetta, is a web-based service that provides an easy, wizard-based interface for data collectors to transform their datalogger-generated ASCII output into Climate and Forecast (CF) compliant netCDF files following the CF-1.6 discrete sampling geometries. These files are complete with metadata describing what data are contained in the file, the instruments used to collect the data, and other critical information that otherwise may be lost in one of many README files. The choice of the machine-readable netCDF data format and data model, coupled with the CF conventions, ensures long-term preservation and interoperability, and that future users will have enough information to responsibly use the data. However, with the understanding that the observational community appreciates the ease of use of ASCII files, methods for transforming the netCDF back into a CSV or spreadsheet format are also built in. One benefit of translating ASCII data into a machine-readable format that follows open community-driven standards is that the data are instantly able to take advantage of data services provided by the many open-source data server tools, such as the THREDDS Data Server (TDS). While Rosetta is currently a stand-alone service, this talk will also highlight efforts to couple Rosetta with the TDS, thus allowing self-publishing of thoroughly documented datasets by the data producers themselves.
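
    The kind of transformation Rosetta performs can be pictured with a small Python sketch that converts a datalogger-style CSV into a CF-style netCDF time series; this is a simplified illustration and not Rosetta's implementation, the input file and column names are hypothetical, and a real CF-1.6 discrete sampling geometry file carries more metadata than shown here.

    ```python
    # Simplified illustration of the CSV -> CF-style netCDF step; the input file,
    # column names, variable names, and attributes are assumptions.
    import csv
    from datetime import datetime

    import numpy as np
    from netCDF4 import Dataset, date2num

    rows = list(csv.DictReader(open("station.csv")))     # assumed columns: time, air_temp
    times = [datetime.fromisoformat(r["time"]) for r in rows]
    temps = np.array([float(r["air_temp"]) for r in rows])

    nc = Dataset("station.nc", "w")
    nc.Conventions = "CF-1.6"
    nc.featureType = "timeSeries"
    nc.createDimension("time", len(times))

    t = nc.createVariable("time", "f8", ("time",))
    t.units = "seconds since 1970-01-01 00:00:00"
    t.calendar = "standard"
    t[:] = date2num(times, units=t.units, calendar=t.calendar)

    temp = nc.createVariable("air_temperature", "f4", ("time",))
    temp.units = "degree_Celsius"
    temp.standard_name = "air_temperature"
    temp[:] = temps
    nc.close()
    ```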

  7. Modeling and interoperability of heterogeneous genomic big data for integrative processing and querying.

    PubMed

    Masseroli, Marco; Kaitoua, Abdulrahman; Pinoli, Pietro; Ceri, Stefano

    2016-12-01

    While a huge amount of (epi)genomic data of multiple types is becoming available by using Next Generation Sequencing (NGS) technologies, the most important emerging problem is the so-called tertiary analysis, concerned with sense making, e.g., discovering how different (epi)genomic regions and their products interact and cooperate with each other. We propose a paradigm shift in tertiary analysis, based on the use of the Genomic Data Model (GDM), a simple data model which links genomic feature data to their associated experimental, biological and clinical metadata. GDM encompasses all the data formats which have been produced for feature extraction from (epi)genomic datasets. We specifically describe the mapping to GDM of SAM (Sequence Alignment/Map), VCF (Variant Call Format), NARROWPEAK (for called peaks produced by NGS ChIP-seq or DNase-seq methods), and BED (Browser Extensible Data) formats, but GDM supports as well all the formats describing experimental datasets (e.g., including copy number variations, DNA somatic mutations, or gene expressions) and annotations (e.g., regarding transcription start sites, genes, enhancers or CpG islands). We downloaded and integrated samples of all the above-mentioned data types and formats from multiple sources. The GDM is able to homogeneously describe semantically heterogeneous data and lays the groundwork for data interoperability, e.g., achieved through the GenoMetric Query Language (GMQL), a high-level, declarative query language for genomic big data. The combined use of the data model and the query language allows comprehensive processing of multiple heterogeneous data, and supports the development of domain-specific data-driven computations and bio-molecular knowledge discovery. Copyright © 2016 Elsevier Inc. All rights reserved.
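
    To make the data-model idea concrete, here is a hedged Python sketch of a GDM-like sample: region-based feature data paired with an experimental/clinical metadata dictionary, plus a toy metadata-aware selection. The attribute names are illustrative and the sketch is not the GDM or GMQL implementation.

    ```python
    # Hedged sketch of a GDM-like sample: each sample couples region-based feature
    # data with experimental/clinical metadata; attribute names are illustrative.
    from dataclasses import dataclass

    @dataclass
    class Region:
        chrom: str
        start: int
        stop: int
        strand: str
        score: float          # one example feature value (e.g. a peak score)

    @dataclass
    class Sample:
        regions: list
        metadata: dict        # e.g. {"assay": "ChIP-seq", "cell_line": "K562"}

    sample = Sample(
        regions=[Region("chr1", 1000, 1500, "+", 7.2),
                 Region("chr1", 5000, 5600, "+", 3.1)],
        metadata={"assay": "ChIP-seq", "cell_line": "K562", "format": "NARROWPEAK"},
    )

    # A toy metadata-aware selection, in the spirit of a declarative GMQL query:
    # keep ChIP-seq samples, then keep regions above a score threshold.
    if sample.metadata.get("assay") == "ChIP-seq":
        strong = [r for r in sample.regions if r.score > 5.0]
        print(len(strong), "regions pass the threshold")
    ```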

  8. A Tsunami-Focused Tide Station Data Sharing Framework

    NASA Astrophysics Data System (ADS)

    Kari, U. S.; Marra, J. J.; Weinstein, S. A.

    2006-12-01

    The Indian Ocean Tsunami of 26 December 2004 made it clear that information about tide stations that could be used to support detection and warning (such as location, collection and transmission capabilities, and operator identification) is insufficiently known or not readily accessible. Parties interested in addressing this problem united under the Pacific Region Integrated Data Enterprise (PRIDE), and in 2005 began a multiyear effort to develop a distributed metadata system describing tide stations, starting with pilot activities in a regional framework and focusing on tsunami detection and warning systems being developed by various agencies. First, a plain semantic description of the tsunami-focused tide station metadata was developed. The semantic metadata description was, in turn, developed into a formal metadata schema championed by the International Tsunami Information Centre (ITIC) as part of a larger effort to develop a prototype web service under the PRIDE program in 2005. Under the 2006 PRIDE program the formal metadata schema was then expanded to corral input parameters for the TideTool application used by the Pacific Tsunami Warning Center (PTWC) to drill down into wave activity at a tide station located using a web service developed on this metadata schema. This effort contributed to the formalization of web service dissemination of PTWC tsunami watch and warning bulletins. During this time, the data content and sharing issues embodied in this schema have been discussed at various forums. The result is that the various stakeholders have different data provider and user perspectives (semantic content) and also exchange formats (not limited to just XML). The challenge, then, is not only to capture all data requirements, but also to have a formal representation that is easily transformed into any specified format. The latest revision of the tide gauge schema (Version 0.3) begins to address this challenge. It encompasses a broader range of provider and user perspectives, such as station operators, warning system managers, disaster managers, and other marine hazard warning systems (such as storm surge and sea level change monitoring and research). In the next revision(s), we hope to take into account various relevant standards, including specifically the Open Geospatial Consortium (OGC) Sensor Web Enablement (SWE) framework, that will serve all prospective stakeholders in the most useful (extensible, scalable) manner. This is because SensorML has already addressed many of the challenges we face, through very useful fundamental modeling considerations and data types that are particular to sensors in general, with perhaps some extension needed for tide gauges. As a result of developing this schema and associated client application architectures, we hope to have a much more distributed network of data providers who are able to contribute to a global tide station metadata collection from the comfort of their own Information Technology (IT) departments.

  9. The Heliophysics Integrated Observatory

    NASA Astrophysics Data System (ADS)

    Csillaghy, A.; Bentley, R. D.

    2009-12-01

    HELIO is a new Europe-wide, FP7-funded distributed network of services that will address the needs of a broad community of researchers in heliophysics. This new research field explores the “Sun-Solar System Connection” and requires the joint exploitation of solar, heliospheric, magnetospheric and ionospheric observations. HELIO will provide the most comprehensive integrated information system in this domain; it will coordinate access to the distributed resources needed by the community, and will provide access to services to mine and analyse the data. HELIO will be designed as a Service-Oriented Architecture. The initial infrastructure will include services based on metadata and data servers deployed by the European Grid of Solar Observations (EGSO). We will extend these to address observations from all the disciplines of heliophysics; differences in the way the domains describe and handle the data will be resolved using semantic mapping techniques. Processing and storage services will allow the user to explore the data and create products that meet stringent standards of interoperability. These capabilities will be orchestrated with the data and metadata services using the Taverna workflow tool. HELIO will address the challenges along the FP7 I3 activities model: (1) Networking: we will cooperate closely with the community to define new standards for heliophysics and the required capabilities of the HELIO system. (2) Services: we will integrate the services developed by the project and other groups to produce an infrastructure that can easily be extended to satisfy the growing and changing needs of the community. (3) Joint Research: we will develop search tools that span disciplinary boundaries and explore new types of user-friendly interfaces. HELIO will be a key component of a worldwide effort to integrate heliophysics data and will coordinate closely with international organizations to exploit synergies with complementary domains.

  10. Data Integration and Analysis System (DIAS) Contributing to the Sustainable Development Goals (SDGs)

    NASA Astrophysics Data System (ADS)

    Koike, T.

    2014-12-01

    It has been said that scientists and experts usually use 80% of their research time for data management (DOE, 2002). Only 20% of research time is used for purely scientific activities. This ratio should be reversed by introducing computer science technology. To realize this goal, the Japanese government supported the development of a data system called the "Data Integration and Analysis System (DIAS)" as one of the national key projects promoted by the Council for Science and Technology Policy (CSTP) from 2006 to 2010. A follow-up 5-year project is also ongoing. The essential aim of DIAS was to create knowledge that would enable solutions to problems and generate socioeconomic benefits. DIAS mainly consists of four data components: data injection, management, integration, and interoperability. DIAS is now tackling a large increase in the diversity and volume of data from observing the Earth. Dictionaries and an ontology system for technical and geographical terms have been developed, and a metadata design has been completed according to international standards. The volume of data stored has increased exponentially. Previously, almost all of the large-volume data came from satellites, but model outputs now occupy the largest share of our data storage. In collaboration with scientific and technological groups, DIAS is accelerating data archiving, including data loading, quality checking, and metadata registration, and the system's data-searching capability is being enriched. DIAS also enables us to perform integrated research and realize interdisciplinarity. Essentially, we are now working in the fields of climate, water resources, food, fisheries, and biodiversity by collaborating between different disciplines and trying to develop a basis for contributing to the Sustainable Development Goals (SDGs).

  11. Towards a metadata scheme for the description of materials - the description of microstructures

    NASA Astrophysics Data System (ADS)

    Schmitz, Georg J.; Böttger, Bernd; Apel, Markus; Eiken, Janin; Laschet, Gottfried; Altenfeld, Ralph; Berger, Ralf; Boussinot, Guillaume; Viardin, Alexandre

    2016-01-01

    The property of any material is essentially determined by its microstructure. Numerical models are increasingly the focus of modern engineering as helpful tools for tailoring and optimization of custom-designed microstructures by suitable processing and alloy design. A huge variety of software tools is available to predict various microstructural aspects for different materials. In the general frame of an integrated computational materials engineering (ICME) approach, these microstructure models provide the link between models operating at the atomistic or electronic scales, and models operating on the macroscopic scale of the component and its processing. In view of improved interoperability of all these different tools it is highly desirable to establish a standardized nomenclature and methodology for the exchange of microstructure data. The scope of this article is to provide a comprehensive system of metadata descriptors for the description of a 3D microstructure. The presented descriptors are limited to a mere geometric description of a static microstructure and have to be complemented by further descriptors, e.g. for properties, numerical representations, kinetic data, and others in the future. Further attributes to each descriptor, e.g. on data origin, data uncertainty, and data validity range, are being defined in ongoing work. The proposed descriptors are intended to be independent of any specific numerical representation. The descriptors defined in this article may serve as a first basis for standardization and will simplify the data exchange between different numerical models, as well as promote the integration of experimental data into numerical models of microstructures. An HDF5 template data file for a simple three-phase Al-Cu microstructure, based on the defined descriptors, complements this article.
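
    As a hedged illustration of how geometric descriptors of this kind might be serialized, the Python sketch below writes a minimal HDF5 file with h5py; the group, dataset, and attribute names are hypothetical and do not reproduce the template file defined by the authors.

    ```python
    # Hedged sketch: store a few geometric microstructure descriptors in HDF5.
    # Group/dataset/attribute names are hypothetical, not the article's schema.
    import numpy as np
    import h5py

    with h5py.File("microstructure_template.h5", "w") as f:
        f.attrs["description"] = "three-phase Al-Cu microstructure (illustrative)"

        grains = f.create_group("grains")
        grains.create_dataset("grain_id", data=np.arange(1, 4, dtype=np.int32))
        grains.create_dataset("phase_id", data=np.array([1, 2, 3], dtype=np.int32))
        grains.create_dataset("volume_fraction", data=np.array([0.6, 0.3, 0.1]))

        phases = f.create_group("phases")
        phases.create_dataset("name", data=np.array([b"alpha-Al", b"theta-Al2Cu", b"liquid"]))

        # A voxelized representation of the same microstructure (illustrative size)
        f.create_dataset("voxel_phase_field", data=np.random.randint(1, 4, size=(8, 8, 8)))
    ```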

  12. Hydrological Modeling Reproducibility Through Data Management and Adaptors for Model Interoperability

    NASA Astrophysics Data System (ADS)

    Turner, M. A.

    2015-12-01

    Because of a lack of centralized planning and no widely-adopted standards among hydrological modeling research groups, research communities, and the data management teams meant to support research, there is chaos when it comes to data formats, spatio-temporal resolutions, ontologies, and data availability. All this makes true scientific reproducibility and collaborative integrated modeling impossible without some glue to piece it all together. Our Virtual Watershed Integrated Modeling System provides the tools and modeling framework hydrologists need to accelerate and fortify new scientific investigations by tracking provenance and providing adaptors for integrated, collaborative hydrologic modeling and data management. Under global warming trends where water resources are under increasing stress, reproducible hydrological modeling will be increasingly important to improve transparency and understanding of the scientific facts revealed through modeling. The Virtual Watershed Data Engine is capable of ingesting a wide variety of heterogeneous model inputs, outputs, model configurations, and metadata. We will demonstrate one example, starting from real-time raw weather station data packaged with station metadata. Our integrated modeling system will then create gridded input data via geostatistical methods along with error and uncertainty estimates. These gridded data are then used as input to hydrological models, all of which are available as web services wherever feasible. Models may be integrated in a data-centric way where the outputs too are tracked and used as inputs to "downstream" models. This work is part of an ongoing collaborative Tri-state (New Mexico, Nevada, Idaho) NSF EPSCoR Project, WC-WAVE, comprised of researchers from multiple universities in each of the three states. The tools produced and presented here have been developed collaboratively alongside watershed scientists to address specific modeling problems with an eye on the bigger picture of scientific reproducibility and transparency, and data publication and reuse.
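
    One step described above, turning point station observations into gridded model input via geostatistical methods, can be pictured with a simple inverse-distance-weighting interpolator in Python; this is an illustrative stand-in, not the Virtual Watershed's actual geostatistical routine, and it omits the error and uncertainty estimates.

    ```python
    # Illustrative inverse-distance-weighting (IDW) gridding of station values;
    # a stand-in for the geostatistical step, without uncertainty estimation.
    import numpy as np

    def idw_grid(xs, ys, values, grid_x, grid_y, power=2.0):
        gx, gy = np.meshgrid(grid_x, grid_y)
        out = np.zeros_like(gx, dtype=float)
        for i in range(gx.shape[0]):
            for j in range(gx.shape[1]):
                d = np.hypot(xs - gx[i, j], ys - gy[i, j])
                if np.any(d < 1e-9):               # grid node sits on a station
                    out[i, j] = values[np.argmin(d)]
                else:
                    w = 1.0 / d**power
                    out[i, j] = np.sum(w * values) / np.sum(w)
        return out

    # Three hypothetical stations and a coarse target grid
    stations_x = np.array([0.0, 5.0, 10.0])
    stations_y = np.array([0.0, 5.0, 0.0])
    precip = np.array([12.0, 30.0, 18.0])
    grid = idw_grid(stations_x, stations_y, precip,
                    np.linspace(0, 10, 5), np.linspace(0, 10, 5))
    print(grid.round(1))
    ```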

  13. An Assessment of the Need for Standard Variable Names for Airborne Field Campaigns

    NASA Astrophysics Data System (ADS)

    Beach, A. L., III; Chen, G.; Northup, E. A.; Kusterer, J.; Quam, B. M.

    2017-12-01

    The NASA Earth Venture Program has led to a dramatic increase in airborne observations, requiring updated data management practices with clearly defined data standards and protocols for metadata. An airborne field campaign can involve multiple aircraft and a variety of instruments. It is quite common to have different instruments/techniques measure the same parameter on one or more aircraft platforms. This creates a need to allow instrument Principal Investigators (PIs) to name their variables in a way that would distinguish them across various data sets. A lack of standardization of variable names presents a challenge for data search tools in enabling discovery of similar data across airborne studies, aircraft platforms, and instruments. This was also identified by data users as one of the top issues in data use. One effective approach for mitigating this problem is to enforce variable name standardization, which can effectively map the unique PI variable names to fixed standard names. In order to ensure consistency amongst the standard names, it will be necessary to choose them from a controlled list. However, no such list currently exists despite a number of previous efforts to establish a sufficient list of atmospheric variable names. The Atmospheric Composition Variable Standard Name Working Group was established under the auspices of NASA's Earth Science Data Systems Working Group (ESDSWG) to solicit research community feedback and create a list of standard names that are acceptable to data providers and data users. This presentation will discuss the challenges and recommendations of standard variable names in an effort to demonstrate how airborne metadata curation/management can be improved to streamline data ingest, improve interoperability, and increase discoverability for a broader user community.
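
    The mapping from PI-chosen variable names to a controlled list of standard names can be pictured with a short Python sketch; both the PI names and the standard names below are invented examples, not the working group's actual list.

    ```python
    # Illustrative mapping of PI variable names to controlled standard names.
    # All names below are invented examples, not the ESDSWG controlled list.
    CONTROLLED_LIST = {"Ozone_MixingRatio", "CarbonMonoxide_MixingRatio"}

    PI_TO_STANDARD = {
        "O3_ppbv_inst1": "Ozone_MixingRatio",
        "O3_TDLAS": "Ozone_MixingRatio",
        "CO_QCL_ppbv": "CarbonMonoxide_MixingRatio",
    }

    def standardize(pi_name: str) -> str:
        std = PI_TO_STANDARD.get(pi_name)
        if std is None or std not in CONTROLLED_LIST:
            raise ValueError(f"no approved standard name for {pi_name!r}")
        return std

    print(standardize("O3_TDLAS"))   # two instruments map to one standard name
    ```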

  14. Quality Management, Certification and Rating of Health Information on the Net with MedCERTAIN: Using a medPICS/RDF/XML metadata structure for implementing eHealth ethics and creating trust globally

    PubMed Central

    Eysenbach, Gunther; Yihune, Gabriel; Lampe, Kristian; Cross, Phil; Brickley, Dan

    2000-01-01

    MedCERTAIN (MedPICS Certification and Rating of Trustworthy Health Information on the Net, http://www.medcertain.org/) is a recently launched international project funded under the European Union's (EU) "Action Plan for safer use of the Internet". It provides a technical infrastructure and a conceptual basis for an international system of "quality seals", ratings and self-labelling of Internet health information, with the final aim of establishing a global "trustmark" for networked health information. Digital "quality seals" are evaluative metadata (using standards such as PICS=Platform for Internet Content Selection, now being replaced by RDF/XML) assigned by trusted third-party raters. The project also enables and encourages self-labelling with descriptive metainformation by web authors. Together these measures will help consumers as well as professionals to identify high-quality information on the Internet. MedCERTAIN establishes a fully functional demonstrator for a self- and third-party rating system enabling consumers and professionals to filter harmful health information and to positively identify and select high quality information. We aim to provide a trustmark system which allows citizens to place greater confidence in networked information, to encourage health information providers to follow best practice guidelines such as the Washington eHealth Code of Ethics, to provide effective feedback and law enforcement channels to handle user complaints, and to stimulate medical societies to develop standards for patient information. The project further proposes and identifies standards for interoperability of rating and description services (such as libraries or national health portals) and fosters a worldwide collaboration to guide consumers to high-quality information on the web.
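
    To illustrate the general shape of RDF/XML self-labelling of the kind mentioned above, here is a hedged Python sketch using rdflib; the vocabulary namespace and property names are hypothetical and are not the medPICS vocabulary.

    ```python
    # Hedged sketch: describe a health information page with RDF metadata and
    # serialize it as RDF/XML; the vocabulary below is hypothetical, not medPICS.
    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import RDF

    EX = Namespace("http://example.org/healthlabel#")   # hypothetical vocabulary
    g = Graph()

    site = URIRef("http://example.org/asthma-info")     # hypothetical page
    g.add((site, RDF.type, EX.HealthInformationResource))
    g.add((site, EX.authorQualification, Literal("MD")))
    g.add((site, EX.lastReviewed, Literal("2000-06-01")))

    print(g.serialize(format="xml"))                     # RDF/XML label document
    ```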

  15. Towards a metadata scheme for the description of materials - the description of microstructures.

    PubMed

    Schmitz, Georg J; Böttger, Bernd; Apel, Markus; Eiken, Janin; Laschet, Gottfried; Altenfeld, Ralph; Berger, Ralf; Boussinot, Guillaume; Viardin, Alexandre

    2016-01-01

    The property of any material is essentially determined by its microstructure. Numerical models are increasingly the focus of modern engineering as helpful tools for tailoring and optimization of custom-designed microstructures by suitable processing and alloy design. A huge variety of software tools is available to predict various microstructural aspects for different materials. In the general frame of an integrated computational materials engineering (ICME) approach, these microstructure models provide the link between models operating at the atomistic or electronic scales, and models operating on the macroscopic scale of the component and its processing. In view of improved interoperability of all these different tools it is highly desirable to establish a standardized nomenclature and methodology for the exchange of microstructure data. The scope of this article is to provide a comprehensive system of metadata descriptors for the description of a 3D microstructure. The presented descriptors are limited to a mere geometric description of a static microstructure and have to be complemented by further descriptors, e.g. for properties, numerical representations, kinetic data, and others in the future. Further attributes to each descriptor, e.g. on data origin, data uncertainty, and data validity range, are being defined in ongoing work. The proposed descriptors are intended to be independent of any specific numerical representation. The descriptors defined in this article may serve as a first basis for standardization and will simplify the data exchange between different numerical models, as well as promote the integration of experimental data into numerical models of microstructures. An HDF5 template data file for a simple three-phase Al-Cu microstructure, based on the defined descriptors, complements this article.

  16. Design of an UML conceptual model and implementation of a GIS with metadata information for a seismic hazard assessment cooperative project.

    NASA Astrophysics Data System (ADS)

    Torres, Y.; Escalante, M. P.

    2009-04-01

    This work illustrates the advantages of using a Geographic Information System in a cooperative project with researchers from different countries, such as the RESIS II project (financed by the Norwegian Government and managed by CEPREDENAC) for seismic hazard assessment of Central America. As input data come in different formats, cover distinct geographical areas and are subject to different interpretations, data inconsistencies may appear and their management becomes complicated. To achieve data homogenization and to integrate the data in a GIS, a conceptual model must first be developed. This is accomplished in two phases: requirements analysis and conceptualization. The Unified Modeling Language (UML) is used to compose the conceptual model of the GIS. UML complies with ISO 19100 norms and allows the designer to define the model architecture and interoperability. The GIS provides a frame for the combination of large geographic-based data volumes, with a uniform geographic reference and avoiding duplications. All this information contains its own metadata following the ISO 19115 standard. In this work, the integration in the same environment of active faults and subduction slab geometries, combined with epicentre locations, has facilitated the definition of seismogenetic regions. This greatly supports teamwork among national specialists from different countries. The GIS capacity for making queries (by location and by attributes) and geostatistical analyses is used to interpolate discrete data resulting from seismic hazard calculations and to create continuous maps, as well as to check and validate partial results of the study. GIS-based products, such as complete, homogenised databases and thematic cartography of the region, are distributed to all researchers, facilitating cross-national communication, project execution and results dissemination.

  17. Operable Data Management for Ocean Observing Systems

    NASA Astrophysics Data System (ADS)

    Chavez, F. P.; Graybeal, J. B.; Godin, M. A.

    2004-12-01

    As oceanographic observing systems become more numerous and complex, data management solutions must follow. Most existing oceanographic data management systems fall into one of three categories: they have been developed as dedicated solutions, with limited application to other observing systems; they expect that data will be pre-processed into well-defined formats, such as netCDF; or they are conceived as robust, generic data management solutions, with complexity (high) and maturity and adoption rates (low) to match. Each approach has strengths and weaknesses; no approach yet fully addresses, nor takes advantage of, the sophistication of ocean observing systems as they are now conceived. In this presentation we describe critical data management requirements for advanced ocean observing systems, of the type envisioned by ORION and IOOS. By defining common requirements -- functional, qualitative, and programmatic -- for all such ocean observing systems, the performance and nature of the general data management solution can be characterized. Issues such as scalability, maintaining metadata relationships, data access security, visualization, and operational flexibility suggest baseline architectural characteristics, which may in turn lead to reusable components and approaches. Interoperability with other data management systems, with standards-based solutions in metadata specification and data transport protocols, and with the data management infrastructure envisioned by IOOS and ORION, can also be used to define necessary capabilities. Finally, some requirements for the software infrastructure of ocean observing systems can be inferred. Early operational results and lessons learned, from development and operations of MBARI ocean observing systems, are used to illustrate key requirements, choices, and challenges. Reference systems include the Monterey Ocean Observing System (MOOS), its component software systems (Software Infrastructure and Applications for MOOS, and the Shore Side Data System), and the Autonomous Ocean Sampling Network (AOSN).

  18. Distributed metadata servers for cluster file systems using shared low latency persistent key-value metadata store

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bent, John M.; Faibish, Sorin; Pedone, Jr., James M.

    A cluster file system is provided having a plurality of distributed metadata servers with shared access to one or more shared low latency persistent key-value metadata stores. A metadata server comprises an abstract storage interface comprising a software interface module that communicates with at least one shared persistent key-value metadata store providing a key-value interface for persistent storage of key-value metadata. The software interface module provides the key-value metadata to the at least one shared persistent key-value metadata store in a key-value format. The shared persistent key-value metadata store is accessed by a plurality of metadata servers. A metadata request can be processed by a given metadata server independently of other metadata servers in the cluster file system. A distributed metadata storage environment is also disclosed that comprises a plurality of metadata servers having an abstract storage interface to at least one shared persistent key-value metadata store.
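
    A hedged Python sketch of the general idea: several metadata server front-ends share one persistent key-value store behind an abstract storage interface. The class and method names are illustrative, and the in-memory store stands in for a real low-latency persistent key-value backend; this is not the disclosed implementation.

    ```python
    # Illustrative abstract storage interface: multiple metadata servers share a
    # single key-value store; names and the in-memory "store" are illustrative.
    from abc import ABC, abstractmethod

    class KeyValueMetadataStore(ABC):
        @abstractmethod
        def put(self, key: str, value: dict) -> None: ...
        @abstractmethod
        def get(self, key: str) -> dict: ...

    class InMemoryStore(KeyValueMetadataStore):   # stand-in for a shared persistent store
        def __init__(self):
            self._data = {}
        def put(self, key, value):
            self._data[key] = value
        def get(self, key):
            return self._data[key]

    class MetadataServer:
        """Each server translates file-system metadata operations to key-value calls."""
        def __init__(self, name: str, store: KeyValueMetadataStore):
            self.name, self.store = name, store
        def create_inode(self, path: str, attrs: dict) -> None:
            self.store.put(f"inode:{path}", attrs)
        def stat(self, path: str) -> dict:
            return self.store.get(f"inode:{path}")

    shared = InMemoryStore()
    mds_a, mds_b = MetadataServer("mds-a", shared), MetadataServer("mds-b", shared)
    mds_a.create_inode("/data/run1.h5", {"size": 1024, "owner": "alice"})
    print(mds_b.stat("/data/run1.h5"))   # visible to the other server via the shared store
    ```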

  19. A database of marine phytoplankton abundance, biomass and species composition in Australian waters

    PubMed Central

    Davies, Claire H.; Coughlan, Alex; Hallegraeff, Gustaaf; Ajani, Penelope; Armbrecht, Linda; Atkins, Natalia; Bonham, Prudence; Brett, Steve; Brinkman, Richard; Burford, Michele; Clementson, Lesley; Coad, Peter; Coman, Frank; Davies, Diana; Dela-Cruz, Jocelyn; Devlin, Michelle; Edgar, Steven; Eriksen, Ruth; Furnas, Miles; Hassler, Christel; Hill, David; Holmes, Michael; Ingleton, Tim; Jameson, Ian; Leterme, Sophie C.; Lønborg, Christian; McLaughlin, James; McEnnulty, Felicity; McKinnon, A. David; Miller, Margaret; Murray, Shauna; Nayar, Sasi; Patten, Renee; Pritchard, Tim; Proctor, Roger; Purcell-Meyerink, Diane; Raes, Eric; Rissik, David; Ruszczyk, Jason; Slotwinski, Anita; Swadling, Kerrie M.; Tattersall, Katherine; Thompson, Peter; Thomson, Paul; Tonks, Mark; Trull, Thomas W.; Uribe-Palomino, Julian; Waite, Anya M.; Yauwenas, Rouna; Zammit, Anthony; Richardson, Anthony J.

    2016-01-01

    There have been many individual phytoplankton datasets collected across Australia since the mid 1900s, but most are unavailable to the research community. We have searched archives, contacted researchers, and scanned the primary and grey literature to collate 3,621,847 records of marine phytoplankton species from Australian waters from 1844 to the present. Many of these are small datasets collected for local questions, but combined they provide over 170 years of data on phytoplankton communities in Australian waters. Units and taxonomy have been standardised, obviously erroneous data removed, and all metadata included. We have lodged this dataset with the Australian Ocean Data Network (http://portal.aodn.org.au/) allowing public access. The Australian Phytoplankton Database will be invaluable for global change studies, as it allows analysis of ecological indicators of climate change and eutrophication (e.g., changes in distribution; diatom:dinoflagellate ratios). In addition, the standardised conversion of abundance records to biomass provides modellers with quantifiable data to initialise and validate ecosystem models of lower marine trophic levels. PMID:27328409

  20. Integrating digital information for coastal and marine sciences

    USGS Publications Warehouse

    Marincioni, Fausto; Lightsom, Frances L.; Riall, Rebecca L.; Linck, Guthrie A.; Aldrich, Thomas C.; Caruso, Michael J.

    2004-01-01

    A pilot distributed geolibrary, the Marine Realms Information Bank (MRIB), was developed by the U.S. Geological Survey Coastal and Marine Geology Program and the Woods Hole Oceanographic Institution, to classify, integrate, and facilitate access to scientific information about oceans, coasts, and lakes. The MRIB is composed of a categorization scheme, a metadata database, and a specialized software backend, capable of drawing together information from remote sources without modifying their original format or content. Twelve facets are used to classify information: location, geologic time, feature type, biota, discipline, research method, hot topics, project, agency, author, content type, and file type. The MRIB approach allows easy and flexible organization of large or growing document collections for which centralized repositories would be impractical. Geographic searching based on the gazetteer and map interface is the centerpiece of the MRIB distributed geolibrary. The MRIB is one of a very few digital libraries that employ georeferencing -- a fundamentally different way to structure information from the traditional author/title/subject/keyword approach employed by most digital libraries. Lessons learned in developing the MRIB will be useful as other digital libraries confront the challenges of georeferencing.

  1. Unleashing Geophysics Data with Modern Formats and Services

    NASA Astrophysics Data System (ADS)

    Ip, Alex; Brodie, Ross C.; Druken, Kelsey; Bastrakova, Irina; Evans, Ben; Kemp, Carina; Richardson, Murray; Trenham, Claire; Wang, Jingbo; Wyborn, Lesley

    2016-04-01

    Geoscience Australia (GA) is the national steward of large volumes of geophysical data extending over the entire Australasian region and spanning many decades. The volume and variety of data which must be managed, coupled with the increasing need to support machine-to-machine data access, mean that the old "click-and-ship" model delivering data as downloadable files for local analysis is rapidly becoming unviable - a "big data" problem not unique to geophysics. The Australian Government, through the Research Data Services (RDS) Project, recently funded the Australian National Computational Infrastructure (NCI) to organize a wide range of Earth Systems data from diverse collections including geoscience, geophysics, environment, climate, weather, and water resources onto a single High Performance Data (HPD) Node. This platform, which now contains over 10 petabytes of data, is called the National Environmental Research Data Interoperability Platform (NERDIP), and is designed to facilitate broad user access, maximise reuse, and enable integration. GA has contributed several hundred terabytes of geophysical data to the NERDIP. Historically, geophysical datasets have been stored in a range of formats, with metadata of varying quality and accessibility, and without standardised vocabularies. This has made it extremely difficult to aggregate original data from multiple surveys (particularly un-gridded geophysics point/line data) into standard formats suited to High Performance Computing (HPC) environments. To address this, it was decided to use the NERDIP-preferred Hierarchical Data Format (HDF) 5, which is a proven, standard, open, self-describing and high-performance format supported by extensive software tools, libraries and data services. The Network Common Data Form (NetCDF) 4 API facilitates the use of data in HDF5, whilst the NetCDF Climate & Forecasting conventions (NetCDF-CF) further constrain NetCDF4/HDF5 data so as to provide greater inherent interoperability. The first geophysical data collection selected for transformation by GA was Airborne ElectroMagnetics (AEM) data which was held in proprietary-format files, with associated ISO 19115 metadata held in a separate relational database. Existing NetCDF-CF metadata profiles were enhanced to cover AEM and other geophysical data types, and work is underway to formalise the new geophysics vocabulary as a proposed extension to the Climate & Forecasting conventions. The richness and flexibility of HDF5's internal indexing mechanisms has allowed lossless restructuring of the AEM data for efficient storage, subsetting and access via either the NetCDF4/HDF5 APIs or Open-source Project for a Network Data Access Protocol (OPeNDAP) data services. This approach not only supports large-scale HPC processing, but also interactive access to a wide range of geophysical data in user-friendly environments such as iPython notebooks and more sophisticated cloud-enabled portals such as the Virtual Geophysics Laboratory (VGL). As multidimensional AEM datasets are relatively complex compared to other geophysical data types, the general approach employed in this project for modernizing AEM data is likely to be applicable to other geophysics data types. When combined with the use of standards-based data services and APIs, a coordinated, systematic modernisation will result in vastly improved accessibility to, and usability of, geophysical data in a wide range of computational environments both within and beyond the geophysics community.

  2. UNAVCO Software and Services for Visualization and Exploration of Geoscience Data

    NASA Astrophysics Data System (ADS)

    Meertens, C.; Wier, S.

    2007-12-01

    UNAVCO has been involved in visualization of geoscience data to support education and research for several years. An early and ongoing service is the Jules Verne Voyager, a web browser applet built on the GMT that displays any area on Earth, with many data set choices, including maps, satellite images, topography, geoid heights, sea-floor ages, strain rates, political boundaries, rivers and lakes, earthquake and volcano locations, focal mechanisms, stress axes, and observed and modeled plate motion and deformation velocity vectors from geodetic measurements around the world. As part of the GEON project, UNAVCO has developed the GEON IDV, a research-level, 4D (earth location, depth and/or altitude, and time), Java application for interactive display and analysis of geoscience data. The GEON IDV is designed to meet the challenge of investigating complex, multi-variate, time-varying, three-dimensional geoscience data anywhere on earth. The GEON IDV supports simultaneous displays of data sets from differing sources, with complete control over colors, time animation, map projection, map area, point of view, and vertical scale. The GEON IDV displays gridded and point data, images, GIS shape files, and several other types of data. The GEON IDV has symbols and displays for GPS velocity vectors, seismic tomography, earthquake focal mechanisms, earthquake locations with magnitude or depth, seismic ray paths in 3D, seismic anisotropy, convection model visualization, earth strain axes and strain field imagery, and high-resolution 3D topographic relief maps. Multiple data sources and display types may appear in one view. As an example of GEON IDV utility, it can display hypocenters under a volcano, a surface geology map of the volcano draped over 3D topographic relief, town locations and political boundaries, and real-time 3D weather radar clouds of volcanic ash in the atmosphere, with time animation. The GEON IDV can drive a GeoWall or other 3D stereo system. IDV output includes imagery, movies, and KML files for Google Earth use of IDV static images, where Google Earth can handle the display. The IDV can be scripted to create display images on user request or automatically on data arrival, offering the use of the IDV as a back end to support a data web site. We plan to extend the power of the IDV by accepting new data types and data services, such as GeoSciML. An active program of online and video training in GEON IDV use is planned. UNAVCO will support users who need assistance converting their data to the standard formats used by the GEON IDV. The UNAVCO Facility provides web-accessible support for Google Earth and Google Maps display of any of more than 9500 GPS stations and survey points, including metadata for each installation. UNAVCO provides corresponding Open Geospatial Consortium (OGC) web services with the same data. UNAVCO's goal is to facilitate data access, interoperability, and efficient searches, exploration, and use of data by promoting web services, standards for GEON IDV data formats and metadata, and software able to simultaneously read and display multiple data sources, formats, and map locations or projections. Retention and propagation of semantics and metadata with observational and experimental values is essential for interoperability and understanding diverse data sources.

  3. Research of marine sensor web based on SOA and EDA

    NASA Astrophysics Data System (ADS)

    Jiang, Yongguo; Dou, Jinfeng; Guo, Zhongwen; Hu, Keyong

    2015-04-01

    A great deal of ocean sensor observation data exists, for a wide range of marine disciplines, derived from in situ and remote observing platforms, in real-time, near-real-time and delayed mode. Ocean monitoring is routinely completed using sensors and instruments. Standardization is the key requirement for exchanging information about ocean sensors and sensor data and for comparing and combining information from different sensor networks. One or more sensors are often physically integrated into a single ocean `instrument' device, which brings many challenges related to diverse sensor data formats, parameter units, different spatiotemporal resolutions, application domains, data quality and sensor protocols. Facing these challenges requires standardization efforts aimed at facilitating the so-called Sensor Web, which makes it easy to provide public access to sensor data and metadata information. In this paper, a Marine Sensor Web, based on SOA and EDA and integrating MBARI's PUCK protocol, IEEE 1451 and OGC SWE 2.0, is illustrated with a five-layer architecture. The Web Service layer and Event Process layer are illustrated in detail with an actual example. The demo study has demonstrated that a standards-based system can be built to access sensors and marine instruments distributed globally, using common Web browsers, for monitoring the environment and oceanic conditions. Besides serving marine sensor data on the Web, this framework of the Marine Sensor Web can also play an important role in information integration in many other domains.
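
    To give a concrete flavour of the SWE services mentioned above, the following hedged Python sketch issues an OGC SOS 2.0 GetObservation request in key-value encoding; only the parameter names are standard SWE ones, while the service URL and the offering and property identifiers are placeholders, not a real Marine Sensor Web endpoint.

    ```python
    # Hedged sketch: SOS 2.0 GetObservation via key-value parameters; the URL and
    # identifiers below are placeholders for illustration.
    import requests

    SOS_URL = "https://example.org/sos/service"               # placeholder endpoint
    params = {
        "service": "SOS",
        "version": "2.0.0",
        "request": "GetObservation",
        "offering": "urn:example:offering:ctd-buoy-01",        # hypothetical offering
        "observedProperty": "sea_water_temperature",           # hypothetical property
        "responseFormat": "http://www.opengis.net/om/2.0",
    }

    resp = requests.get(SOS_URL, params=params, timeout=60)
    resp.raise_for_status()
    print(resp.text[:500])    # an O&M XML document describing the observations
    ```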

  4. Advancing global marine biogeography research with open-source GIS software and cloud-computing

    USGS Publications Warehouse

    Fujioka, Ei; Vanden Berghe, Edward; Donnelly, Ben; Castillo, Julio; Cleary, Jesse; Holmes, Chris; McKnight, Sean; Halpin, patrick

    2012-01-01

    Across many scientific domains, the ability to aggregate disparate datasets enables more meaningful global analyses. Within marine biology, the Census of Marine Life served as the catalyst for such a global data aggregation effort. Under the Census framework, the Ocean Biogeographic Information System was established to coordinate an unprecedented aggregation of global marine biogeography data. The OBIS data system now contains 31.3 million observations, freely accessible through a geospatial portal. The challenges of storing, querying, disseminating, and mapping a global data collection of this complexity and magnitude are significant. In the face of declining performance and expanding feature requests, a redevelopment of the OBIS data system was undertaken. Following an Open Source philosophy, the OBIS technology stack was rebuilt using PostgreSQL, PostGIS, GeoServer and OpenLayers. This approach has markedly improved the performance and online user experience while maintaining a standards-compliant and interoperable framework. Due to the distributed nature of the project and increasing needs for storage, scalability and deployment flexibility, the entire hardware and software stack was built on a Cloud Computing environment. The flexibility of the platform, combined with the power of the application stack, enabled rapid re-development of the OBIS infrastructure, and ensured complete standards-compliance.

  5. Enabling Science Integration through the Marine Geoscience Data System Media Bank

    NASA Astrophysics Data System (ADS)

    Leung, A.; Ferrini, V.; Arko, R.; Carbotte, S. M.; Goehring, L.; Simms, E.

    2008-12-01

    The Marine Geoscience Data System Media Bank (http://media.marine-geo.org) was constructed to enable the sharing of high quality images, illustrations and animations among members of the science community and to provide a new forum for education and public outreach (EPO). The initial focus of Media Bank was to serve Ridge 2000 research and EPO efforts, but it was constructed as a flexible system that could accommodate media from other multidisciplinary marine geoscience research initiatives. Media Bank currently contains digital photographs, maps, 3-D visualizations, and video clips from the Ridge 2000 and MARGINS focus sites as well as the Antarctic and Southern Ocean. We actively seek contributions of other high quality marine geoscience media for inclusion in Media Bank. Media Bank is driven by a relational database backend, enabling image browsing, sorting by category, keyword search functionality, and the creation of media galleries. All media are accompanied by a descriptive figure caption that provides easy access to expert knowledge to help foster data integration across disciplines as well as EPO efforts. In addition to access to high quality media, Media Bank also provides basic metadata including geographic position, investigator name and affiliation, as well as copyright information, and links to references and relevant data sets. Since media are tied to geospatial coordinates, a map-based interface is also provided for access to media.

  6. Ground-penetrating radar and differential global positioning system data collected from Long Beach Island, New Jersey, April 2015

    USGS Publications Warehouse

    Zaremba, Nicholas J.; Smith, Kathryn E.L.; Bishop, James M.; Smith, Christopher G.

    2016-08-04

    Scientists from the United States Geological Survey, St. Petersburg Coastal and Marine Science Center, U.S. Geological Survey Pacific Coastal and Marine Science Center, and students from the University of Hawaii at Manoa collected sediment cores, sediment surface grab samples, ground-penetrating radar (GPR) and Differential Global Positioning System (DGPS) data from within the Edwin B. Forsythe National Wildlife Refuge–Holgate Unit located on the southern end of Long Beach Island, New Jersey, in April 2015 (FAN 2015-611-FA). The study’s objective was to identify washover deposits in the stratigraphic record to aid in understanding barrier island evolution. This report is an archive of GPR and DGPS data collected from Long Beach Island in 2015. Data products, including raw GPR and processed DGPS data, elevation corrected GPR profiles, and accompanying Federal Geographic Data Committee metadata can be downloaded from the Data Downloads page.

  7. BCO-DMO: Enabling Access to Federally Funded Research Data

    NASA Astrophysics Data System (ADS)

    Kinkade, D.; Allison, M. D.; Chandler, C. L.; Groman, R. C.; Rauch, S.; Shepherd, A.; Gegg, S. R.; Wiebe, P. H.; Glover, D. M.

    2013-12-01

    In a February, 2013 memo1, the White House Office of Science and Technology Policy (OSTP) outlined principles and objectives to increase access by the public to federally funded research publications and data. Such access is intended to drive innovation by allowing private and commercial efforts to take full advantage of existing resources, thereby maximizing Federal research dollars and efforts. The Biological and Chemical Oceanography Data Management Office (BCO-DMO; bco-dmo.org) serves as a model resource for organizations seeking compliance with the OSTP policy. BCO-DMO works closely with scientific investigators to publish their data from research projects funded by the National Science Foundation (NSF), within the Biological and Chemical Oceanography Sections (OCE) and the Division of Polar Programs Antarctic Organisms & Ecosystems Program (PLR). BCO-DMO addresses many of the OSTP objectives for public access to digital scientific data: (1) Marine biogeochemical and ecological data and metadata are disseminated via a public website, and curated on intermediate time frames; (2) Preservation needs are met by collaborating with appropriate national data facilities for data archive; (3) Cost and administrative burden associated with data management is minimized by the use of one dedicated office providing hundreds of NSF investigators support for data management plan development, data organization, metadata generation and deposition of data and metadata into the BCO-DMO repository; (4) Recognition of intellectual property is reinforced through the office's citation policy and the use of digital object identifiers (DOIs); (5) Education and training in data stewardship and use of the BCO-DMO system is provided by office staff through a variety of venues. Oceanographic research data and metadata from thousands of datasets generated by hundreds of investigators are now available through BCO-DMO. 1 White House Office of Science and Technology Policy, Memorandum for the Heads of Executive Departments and Agencies: Increasing Access to the Results of Federally Funded Scientific Research, February 23, 2013. http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf

  8. Log-less metadata management on metadata server for parallel file systems.

    PubMed

    Liao, Jianwei; Xiao, Guoqiang; Peng, Xiaoning

    2014-01-01

    This paper presents a novel metadata management mechanism on the metadata server (MDS) for parallel and distributed file systems. In this technique, the client file system backs up the sent metadata requests, which have been handled by the metadata server, so that the MDS does not need to log metadata changes to nonvolatile storage to achieve a highly available metadata service, as well as better performance in metadata processing. As the client file system backs up certain sent metadata requests in its memory, the overhead of handling these backup requests is much smaller than that incurred by a metadata server that adopts logging or journaling to yield a highly available metadata service. The experimental results show that this newly proposed mechanism can significantly improve the speed of metadata processing and render better I/O data throughput, in contrast to conventional metadata management schemes, that is, logging or journaling on the MDS. Besides, a complete metadata recovery can be achieved by replaying the backup logs cached by all involved clients when the metadata server has crashed or gone into a nonoperational state unexpectedly.
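
    A hedged Python sketch of the mechanism as described: clients keep copies of metadata requests the server has acknowledged, so the server can skip journaling and, after a crash, rebuild its state by replaying the clients' cached requests. The class names and request format are illustrative, not the paper's code.

    ```python
    # Illustrative log-less metadata handling: clients cache acknowledged requests
    # so the MDS keeps no journal; recovery replays the clients' caches.
    class MetadataServer:
        def __init__(self):
            self.table = {}                      # in-memory metadata only, no log

        def handle(self, request):
            op, path, attrs = request
            if op == "create":
                self.table[path] = attrs
            elif op == "remove":
                self.table.pop(path, None)
            return "ack"

    class Client:
        def __init__(self, mds):
            self.mds, self.backup = mds, []

        def send(self, request):
            if self.mds.handle(request) == "ack":
                self.backup.append(request)      # keep the acknowledged request

    # Normal operation
    mds = MetadataServer()
    c1, c2 = Client(mds), Client(mds)
    c1.send(("create", "/a.txt", {"size": 1}))
    c2.send(("create", "/b.txt", {"size": 2}))

    # MDS crash: state is lost, then rebuilt by replaying every client's backup
    mds = MetadataServer()
    for client in (c1, c2):
        for request in client.backup:
            mds.handle(request)
    print(sorted(mds.table))                     # ['/a.txt', '/b.txt']
    ```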

  9. Log-Less Metadata Management on Metadata Server for Parallel File Systems

    PubMed Central

    Xiao, Guoqiang; Peng, Xiaoning

    2014-01-01

    This paper presents a novel metadata management mechanism on the metadata server (MDS) for parallel and distributed file systems. In this technique, the client file system backs up the sent metadata requests, which have been handled by the metadata server, so that the MDS does not need to log metadata changes to nonvolatile storage to achieve a highly available metadata service, as well as better performance in metadata processing. As the client file system backs up certain sent metadata requests in its memory, the overhead of handling these backup requests is much smaller than that incurred by a metadata server that adopts logging or journaling to yield a highly available metadata service. The experimental results show that this newly proposed mechanism can significantly improve the speed of metadata processing and render better I/O data throughput, in contrast to conventional metadata management schemes, that is, logging or journaling on the MDS. Besides, a complete metadata recovery can be achieved by replaying the backup logs cached by all involved clients when the metadata server has crashed or gone into a nonoperational state unexpectedly. PMID:24892093

  10. Archive of U.S. Geological Survey selected single-beam bathymetry datasets, 1969-2000

    USGS Publications Warehouse

    Schreppel, Heather A.; Degnan, Carolyn H.; Dadisman, Shawn V.; Metzger, Dan R.

    2013-01-01

    New national programs, as well as natural and man-made disasters, have raised awareness about the need to find new and improved ways to share information about the coastal and marine environment with a wide-ranging public audience. The U.S. Geological Survey (USGS) Coastal and Marine Geology Program (CMGP) has begun a large-scale effort to incorporate the program's published, digital geophysical data into a single point of access known as the Coastal and Marine Geoscience Data System (CMGDS) (http://cmgds.marine.usgs.gov/). To aid in data discovery, work is also being done to import CMGP data into highly visible data and information resources, such as the National Oceanic and Atmospheric Administration's (NOAA) National Geophysical Data Center (NGDC) and two widely used Earth-science tools, GeoMapApp (GMA) (http://www.geomapapp.org) and Virtual Ocean (VO) (http://www.virtualocean.org/). This task of the CMGP Integrated Data Management System project will help support information exchange with partners, regional planning groups, and the public, as well as facilitate integrated spatial-data analysis. Sharing USGS-CMGP geophysical data via CMGDS, NGDC, GMA, and VO will aid data discovery and enable the data to support new purposes beyond those for which the data were originally intended. In order to make data available to NGDC, and from there into GMA and VO, the data must be reformatted into a standard exchange format and published. In 1977, a group of geophysical data managers from the public and private sectors developed the MGD77 format as the standard exchange format for geophysical data. In 2010, a tab-delimited version of the format was added as MGD77T (Hittelman and others, 1977). The MGD77T geophysical data format can include bathymetry, magnetics, gravity, and seismic navigation data. It is used for the transmission of data between marine institutions and data centers, and can be used by various software programs as an exchange format. A header (documentation) file and data file are created for each survey (Hittelman and others, 1977). More details about the MGD77T format are available at http://www.ngdc.noaa.gov/mgg/dat/geodas/docs/mgd77.pdf (74MB PDF). This archive describes the detailed steps used to convert single-beam bathymetry and navigation files into the MGD77T format (Hittelman and others, 1977) for submission to NGDC, and provides formal Federal Geographic Data Committee (FGDC) (http://www.fgdc.gov/metadata) metadata as a publication of these single-beam bathymetry datasets.

  11. Xray: N-dimensional, labeled arrays for analyzing physical datasets in Python

    NASA Astrophysics Data System (ADS)

    Hoyer, S.

    2015-12-01

    Efficient analysis of geophysical datasets requires tools that both preserve and utilize metadata, and that transparently scale to process large datasets. Xray is such a tool, in the form of an open source Python library for analyzing the labeled, multi-dimensional array (tensor) datasets that are ubiquitous in the Earth sciences. Xray's approach pairs Python data structures based on the data model of the netCDF file format with the proven design and user interface of pandas, the popular Python data analysis library for labeled tabular data. On top of the NumPy array, xray adds labeled dimensions (e.g., "time") and coordinate values (e.g., "2015-04-10"), which it uses to enable a host of operations powered by these labels: selection, aggregation, alignment, broadcasting, split-apply-combine, interoperability with pandas and serialization to netCDF/HDF5. Many of these operations are enabled by xray's tight integration with pandas. Finally, to allow for easy parallelism and to enable its labeled data operations to scale to datasets that do not fit into memory, xray integrates with the parallel processing library dask.
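    A minimal sketch of the labeled operations described above, written against the library under its current name, xarray (the xray project was later renamed); the sample data is invented.

    ```python
    import numpy as np
    import pandas as pd
    import xarray as xr

    # A small 2-D field labeled with a time dimension and coordinate values.
    times = pd.date_range("2015-04-08", periods=4)
    temp = xr.DataArray(
        np.random.rand(4, 3),
        dims=("time", "station"),
        coords={"time": times, "station": ["A", "B", "C"]},
        name="temperature",
    )

    # Label-based selection and aggregation, no positional index bookkeeping.
    day = temp.sel(time="2015-04-10")       # select by coordinate label
    station_mean = temp.mean(dim="station") # aggregate over a named dimension

    # Round-trip through the netCDF data model and interoperate with pandas.
    temp.to_netcdf("temperature.nc")
    df = temp.to_dataframe()

    # For datasets that do not fit into memory, computation can be deferred to
    # dask by opening files in chunks, e.g.
    # xr.open_dataset("big.nc", chunks={"time": 100}).
    ```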

  12. A metadata-aware application for remote scoring and exchange of tissue microarray images

    PubMed Central

    2013-01-01

    Background The use of tissue microarrays (TMA) and advances in digital scanning microscopy have enabled the collection of thousands of tissue images. There is a need for software tools to annotate, query and share this data amongst researchers in different physical locations. Results We have developed an open source web-based application for remote scoring of TMA images, which exploits Microsoft Silverlight Deep Zoom to provide an intuitive interface for zooming and panning around digital images. We use and extend existing XML-based standards to ensure that the data collected can be archived and that our system is interoperable with other standards-compliant systems. Conclusion The application has been used for multi-centre scoring of TMA slides composed of tissues from several Phase III breast cancer trials and ten different studies participating in the International Breast Cancer Association Consortium (BCAC). The system has enabled researchers to simultaneously score large collections of TMA and export the standardised data to integrate with pathological and clinical outcome data, thereby facilitating biomarker discovery. PMID:23635078
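    The abstract does not name the XML schema it extends; purely as an illustration, the sketch below emits a hypothetical core-score record with placeholder element names, of the kind such a standards-based export might contain.

    ```python
    # Hypothetical illustration only: placeholder elements for a TMA core score.
    import xml.etree.ElementTree as ET

    score = ET.Element("tma_core_score")                 # placeholder root element
    ET.SubElement(score, "study").text = "BCAC-example"  # hypothetical study identifier
    ET.SubElement(score, "core_id").text = "block12-row03-col07"
    ET.SubElement(score, "marker").text = "ER"
    ET.SubElement(score, "scorer").text = "site-04"
    ET.SubElement(score, "intensity").text = "2"
    ET.SubElement(score, "percent_positive").text = "60"

    ET.ElementTree(score).write("core_score.xml", encoding="utf-8", xml_declaration=True)
    ```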

  13. Ocean Tracking Network (OTN): Development of Oceanographic Data Integration with Animal Movement

    NASA Astrophysics Data System (ADS)

    Bajona, L.

    2016-02-01

    OTN is a $168-million ocean research and technology development platform headquartered at Dalhousie University, Canada. It uses acoustic and satellite telemetry to document globally the movements and survival of aquatic animals and their environmental correlates. The OTN mission is to foster conservation and sustainability of valued species by generating knowledge on the movement patterns of aquatic species in their changing environment. OTN's ever-expanding global network of acoustic receivers, listening for over 90 different key animal species, provides the data needed for collaborating with researchers on the integration of oceanographic data with animal movement. Presented here are the Data Management team's work to date, status, and challenges in OTN's move towards a community standard that enables sharing between projects nationally and internationally, permitting interoperability with other large national (e.g. CHONe, ArcticNET) and international (IOOS, IMOS) networks. This work includes co-development of the Animal Acoustic Telemetry (AAT) metadata standard and its implementation using an ERDDAP data server (NOAA, Environmental Research Division's Data Access Program), facilitating ingestion by modelers (e.g., netCDF).
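    As a hedged sketch of the ERDDAP-based access path, the example below reads a tabledap CSV response with pandas; the server URL, dataset identifier, and variable names are placeholders, not actual OTN endpoints.

    ```python
    # Pull a (hypothetical) detection table from an ERDDAP tabledap endpoint as CSV.
    import pandas as pd

    server = "https://example.org/erddap"            # placeholder ERDDAP instance
    dataset = "otn_detections_example"               # hypothetical dataset id
    variables = "time,latitude,longitude,tag_id"     # hypothetical variable list
    constraint = "&time>=2016-01-01T00:00:00Z"

    url = f"{server}/tabledap/{dataset}.csv?{variables}{constraint}"

    # ERDDAP CSV responses carry a units row after the header, hence skiprows=[1].
    detections = pd.read_csv(url, skiprows=[1], parse_dates=["time"])
    print(detections.head())
    ```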

  14. ECHO Data Partners Join Forces to Federate Access to Resources

    NASA Astrophysics Data System (ADS)

    Kendall, J.; Macie, M.

    2003-12-01

    During the past year, NASA's Earth Science Data and Information System (ESDIS) project has been collaborating with various Earth science data and client providers to design and implement the EOS Clearinghouse (ECHO). ECHO is an open, interoperable metadata clearinghouse and order broker system. ECHO functions as a repository of information intended to streamline access to digital data and services provided by NASA's Earth Science Enterprise and the extended Earth science community. In a unique partnership, ECHO data providers are working to extend their services in the digital era, to reflect current trends in scientific and educational communications. The multi-organization, inter-disciplinary content of ECHO provides a valuable new service to a growing number of Earth science applications and interdisciplinary research efforts. As such, ECHO is expected to attract a wide audience. In this poster, we highlight the contributions of current ECHO data partners and provide information for prospective data partners on how the project supports the incorporation of new collections and effective long-term asset management that remains directly under the control of the organizations who contribute resources to ECHO.

  15. Earth Science Digital Museum (ESDM): Toward a new paradigm for museums

    NASA Astrophysics Data System (ADS)

    Dong, Shaochun; Xu, Shijin; Wu, Gangshan

    2006-07-01

    New technologies have pushed traditional museums to take their exhibitions beyond the barrier of a museum's walls and enhance their functions: education and entertainment. The Earth Science Digital Museum (ESDM) is such an emerging effort in this field. It serves as a platform for Earth scientists to build a Web community to share knowledge about the Earth, and is intended to benefit the general public in their life-long learning. After analyzing the purposes and requirements of ESDM, we present here our basic philosophy of ESDM and a four-layer hierarchical architecture for organizing ESDM via the Internet. It is a Web-based application that enables specimens to be exhibited, shared and preserved in digital form, and that provides interoperability. One of the key components of ESDM is the development of a metadata set for describing Earth Science specimens and their digital representations, which is particularly important for building ESDM. Practical demonstrations show that ESDM is suitable for formal and informal Earth Science education, including classroom education, online education and life-long learning.
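    The specimen metadata set itself is not given in the abstract; the following sketch only illustrates, with invented fields, the kind of record that could link a physical specimen to its digital representations.

    ```python
    # Hypothetical illustration only: placeholder fields for a specimen record.
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class SpecimenRecord:
        specimen_id: str
        name: str
        classification: str            # e.g. mineral, rock, fossil
        locality: str
        collector: str
        digital_objects: List[str] = field(default_factory=list)  # image/model URLs

    record = SpecimenRecord(
        specimen_id="ESDM-0001",
        name="Quartz (example)",
        classification="mineral",
        locality="example locality",
        collector="example collector",
        digital_objects=["https://example.org/esdm/quartz-0001.jpg"],
    )
    ```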

  16. Integrated Data Capturing Requirements for 3d Semantic Modelling of Cultural Heritage: the Inception Protocol

    NASA Astrophysics Data System (ADS)

    Di Giulio, R.; Maietti, F.; Piaia, E.; Medici, M.; Ferrari, F.; Turillazzi, B.

    2017-02-01

    The generation of high quality 3D models can still be very time-consuming and expensive, and the outcome of digital reconstructions is frequently provided in formats that are not interoperable and therefore cannot be easily accessed. This challenge is even more crucial for complex architectures and large heritage sites, which involve a large amount of data to be acquired, managed and enriched by metadata. In this framework, the ongoing EU-funded project INCEPTION - Inclusive Cultural Heritage in Europe through 3D semantic modelling - proposes a workflow aimed at achieving efficient 3D digitization methods, post-processing tools for enriched semantic modelling, and web-based solutions and applications that ensure wide access for experts and non-experts. In order to face these challenges and to start solving the issue of the large amount of captured data and the time-consuming processes in the production of 3D digital models, an Optimized Data Acquisition Protocol (DAP) has been set up. Its purpose is to guide the processes of digitization of cultural heritage, respecting the needs, requirements and specificities of cultural assets.

  17. Confronting data requirements and data provision in Space Weather: The Contribution of Long Term Archives. Part 1.

    NASA Astrophysics Data System (ADS)

    Heynderickx, Daniel; Glover, Alexi

    Operational space weather services rely heavily on reliable data streams from spacecraft and ground-based facilities, as well as from services providing processed data products. This event focuses on an unusual solar maximum viewed from several different perspectives, and as such highlights the important contribution of long term archives in supporting space weather studies and services. We invite the space weather community to contribute to a discussion on the key topics listed below, with the aim of formulating recommendations and guidelines for policy makers, stakeholders, data and service providers:
    - facilitating access to and awareness of existing data resources
    - establishing clear guidelines for space weather data archives, including data quality, interoperability and metadata standards
    - ensuring data ownership and terms of (re)use are clearly identified, so that this information can be taken into account when (potentially commercial) services are developed based on data provided without charge for scientific purposes only
    All participants are invited to submit input for the discussion to the authors ahead of the Assembly. The outcome of the session will be formulated as a set of proposed panel recommendations.

  18. Confronting data requirements and data provision in Space Weather: The Contribution of Long Term Archives. Part 2.

    NASA Astrophysics Data System (ADS)

    Glover, Alexi; Heynderickx, Daniel

    Operational space weather services rely heavily on reliable data streams from spacecraft and ground-based facilities, as well as from services providing processed data products. This event focuses on an unusual solar maximum viewed from several different perspectives, and as such highlights the important contribution of long term archives in supporting space weather studies and services. We invite the space weather community to contribute to a discussion on the key topics listed below, with the aim of formulating recommendations and guidelines for policy makers, stakeholders, data and service providers:
    - facilitating access to and awareness of existing data resources
    - establishing clear guidelines for space weather data archives, including data quality, interoperability and metadata standards
    - ensuring data ownership and terms of (re)use are clearly identified, so that this information can be taken into account when (potentially commercial) services are developed based on data provided without charge for scientific purposes only
    All participants are invited to submit input for the discussion to the authors ahead of the Assembly. The outcome of the session will be formulated as a set of proposed panel recommendations.

  19. Modernization of the Caltech/USGS Southern California Seismic Network

    NASA Astrophysics Data System (ADS)

    Bhadha, R.; Devora, A.; Hauksson, E.; Johnson, D.; Thomas, V.; Watkins, M.; Yip, R.; Yu, E.; Given, D.; Cone, G.; Koesterer, C.

    2009-12-01

    The USGS/ANSS/ARRA program is providing Government Furnished Equipment (GFE) and two-year funding for upgrading the Caltech/USGS Southern California Seismic Network (SCSN). The SCSN is the modern digital ground-motion seismic network in southern California that monitors seismicity and provides real-time earthquake information products such as rapid notifications, moment tensors, and ShakeMap. The SCSN has evolved through the years and now consists of several well-integrated components such as Short-Period analog, TERRAscope, digital stations, and real-time strong-motion stations, about 300 stations in all. In addition, the SCSN records data from about 100 stations provided by partner networks. To strengthen the ability of the SCSN to meet the ANSS performance standards, we will install GFE and carry out the following upgrades and improvements of the various components of the SCSN: 1) Upgrade of dataloggers at seven TERRAscope stations; 2) Upgrade of dataloggers at 131 digital stations and upgrade of broadband sensors at 25 stations; 3) Upgrade of SCSN metadata capabilities; 4) Upgrade of telemetry capabilities for both seismic and GPS data; and 5) Upgrade of balers at stations with existing Q330 dataloggers. These upgrades will enable the SCSN to meet the ANSS Performance Standards more consistently than before. The new equipment will improve station uptimes and reduce maintenance costs. The new equipment will also provide improved waveform data quality and consequently superior data products. The data gaps due to various outages will be minimized, and ‘late’ data will be readily available through retrieval from on-site storage. Compared to the outdated equipment, the new equipment will speed up data delivery by about 10 sec, which is fast enough for earthquake early warning applications. The new equipment also consumes about a factor of ten less power. We will also upgrade the SCSN data acquisition and data center facilities, which will improve the SCSN performance and metadata availability. We will improve existing software to facilitate the update of metadata and to improve the interoperability between SeisNetWatch and our database of metadata. The improved software will also be made available to other regional networks as part of the CISN software distribution. These upgrades will greatly improve the robustness of the SCSN and facilitate higher quality and more reliable earthquake monitoring than was available before in southern California. The modernized SCSN will contribute to more coordinated search and rescue as well as economic resilience following a major earthquake by providing accurate earthquake information, thus facilitating rapid deployment of field crews and rapid business resumption. Further, advances in seismological research will be facilitated by the high quality seismic data that will be collected in one of the most seismically active areas in the contiguous US.

  20. EMODnet High Resolution Seabed Mapping - further developing a high resolution digital bathymetry for European seas

    NASA Astrophysics Data System (ADS)

    Schaap, Dick M. A.; Schmitt, Thierry

    2017-04-01

    Access to marine data is a key issue for the EU Marine Strategy Framework Directive and the EU Marine Knowledge 2020 agenda, and includes the European Marine Observation and Data Network (EMODnet) initiative. EMODnet aims at assembling European marine data, data products and metadata from diverse sources in a uniform way. The EMODnet data infrastructure is developed through a stepwise approach in three major phases. Currently EMODnet is entering its 3rd phase, with operational portals providing access to marine data for bathymetry, geology, physics, chemistry, biology, seabed habitats and human activities, complemented by checkpoint projects analysing the fitness for purpose of data provision. The EMODnet Bathymetry project has developed Digital Terrain Models (DTM) for the European seas. These have been produced from survey and aggregated data sets that are indexed with metadata by adopting the SeaDataNet Catalogue services. SeaDataNet is a network of major oceanographic data centres around the European seas that manage, operate and further develop a pan-European infrastructure for marine and ocean data management. The latest EMODnet Bathymetry DTM release has a resolution of 1/8 arcminute * 1/8 arcminute and covers all European sea regions. Use has been made of circa 7800 gathered survey datasets and composite DTMs from 27 European data providers from 15 countries. For areas without coverage, use has been made of the latest GEBCO DTM. The catalogue services and the generated EMODnet DTM have been published at the dedicated EMODnet Bathymetry portal, which includes a versatile DTM viewing service that also supports downloading in various formats. At the end of December 2016 the Bathymetry project was succeeded by EMODnet High Resolution Seabed Mapping (HRSM) as part of the third phase of EMODnet. This new project will continue the gathering of bathymetric in-situ data sets, with extra efforts for near-coastal waters and coastal zones. In addition, Satellite Derived Bathymetry data will be included, in particular to fill gaps in coverage of the coastal zones. The data and composite DTMs will increase the coverage of the European seas and their coastlines, and provide input for producing an EMODnet DTM with a common resolution of 3 arc seconds, versus 1/8 arc minute at present. Moreover, local DTMs with even higher resolutions will be produced where data and data providers permit. The Bathymetry Viewing and Download service will be upgraded to provide a multi-resolution map and include 3D viewing. The higher resolution DTMs will also be used to determine best estimates of the European coastline for a range of tidal levels (HAT, MHW, MSL, Chart Datum, LAT), thereby making use of a tidal model for Europe. Extra challenges will be 'moving to the cloud' and setting up an EMODnet Collaborative Virtual Environment (CVE) for producing the EMODnet DTMs. The presentation will highlight key details of the EMODnet Bathymetry results and how the challenges of the new HRSM project are being approached.
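    As a hedged sketch of working with a downloaded DTM tile, the example below subsets a netCDF file with xarray; the file name and coordinate/variable names are placeholders that depend on the actual tile obtained from the Bathymetry portal.

    ```python
    # Subset a (hypothetical) local EMODnet DTM tile to a region of interest.
    import xarray as xr

    dtm = xr.open_dataset("emodnet_dtm_tile.nc")   # placeholder local download

    # Select a 2 deg x 1 deg lon/lat window; at 1/8 arc-minute resolution this
    # spans roughly (2*60*8) x (1*60*8) = 960 x 480 grid cells.
    subset = dtm.sel(lon=slice(3.0, 5.0), lat=slice(51.0, 52.0))

    # Coarse sanity check of the grid spacing, converted from degrees to arc minutes.
    dlon = float(subset["lon"].diff("lon").median()) * 60.0
    print(f"grid spacing ~= {dlon:.3f} arc minutes")
    ```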
