Sample records for core metadata element

  1. Metadata: Standards for Retrieving WWW Documents (and Other Digitized and Non-Digitized Resources)

    NASA Astrophysics Data System (ADS)

    Rusch-Feja, Diann

    The use of metadata to index digitized and non-digitized resources for resource discovery in a networked environment is being implemented increasingly all over the world. Metadata yields greater precision than reliance on universal search engines, and it can also serve as a filtering mechanism for search results. An overview of various metadata sets is given, followed by a more focused presentation of Dublin Core Metadata, including examples of sub-elements and qualifiers. The Dublin Core Relation element, in particular, provides connections between the metadata of related electronic resources, as well as to the metadata for physical, non-digitized resources. This facilitates more comprehensive search results without losing precision and brings together genres of information that would otherwise be searchable only in separate databases. Finally, the advantages of Dublin Core Metadata in comparison with library cataloging and universal search engines are discussed briefly, followed by a listing of types of Dublin Core Metadata implementations.
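
    To make the Relation element concrete, the following Python sketch assembles Dublin Core HTML meta tags for a digitized image and links it, via qualified Relation elements, to the record of its physical original. All titles and identifiers are invented for illustration; they are not drawn from the article.

      # Illustrative sketch: Dublin Core elements (including qualified Relation)
      # rendered as HTML <meta> tags. Names and identifiers are hypothetical.
      dc_record = {
          "DC.title": "Autograph letter, J. W. von Goethe to F. Schiller, 1794",
          "DC.type": "Image",
          "DC.format": "image/tiff",
          # Relation qualifiers link this digitized surrogate to the record
          # describing the physical original held in the archive.
          "DC.relation.IsFormatOf": "urn:example:archive:ms-1794-017",
          "DC.relation.IsPartOf": "urn:example:collection:goethe-schiller-letters",
      }

      meta_tags = "\n".join(
          f'<meta name="{name}" content="{value}">' for name, value in dc_record.items()
      )
      print(meta_tags)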

  2. International Metadata Initiatives: Lessons in Bibliographic Control.

    ERIC Educational Resources Information Center

    Caplan, Priscilla

    This paper looks at a subset of metadata schemes, including the Text Encoding Initiative (TEI) header, the Encoded Archival Description (EAD), the Dublin Core Metadata Element Set (DCMES), and the Visual Resources Association (VRA) Core Categories for visual resources. It examines why they developed as they did, major point of difference from…

  3. A model for enhancing Internet medical document retrieval with "medical core metadata".

    PubMed

    Malet, G; Munoz, F; Appleyard, R; Hersh, W

    1999-01-01

    Finding documents on the World Wide Web relevant to a specific medical information need can be difficult. The goal of this work is to define a set of document content description tags, or metadata encodings, that can be used to promote disciplined search access to Internet medical documents. The authors based their approach on a proposed metadata standard, the Dublin Core Metadata Element Set, which has recently been submitted to the Internet Engineering Task Force. Their model also incorporates the National Library of Medicine's Medical Subject Headings (MeSH) vocabulary and MEDLINE-type content descriptions. The model defines a medical core metadata set that can be used to describe the metadata for a wide variety of Internet documents. The authors propose that their medical core metadata set be used to assign metadata to medical documents to facilitate document retrieval by Internet search engines.
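
    A minimal Python sketch of the kind of tagging the model describes: Dublin Core meta tags whose subject elements are qualified with MeSH headings and whose type mirrors a MEDLINE-style publication type. The element names follow common DC-in-HTML conventions and all values are invented; this is not the authors' published element set.

      # Hedged sketch of "medical core metadata" in the spirit described above.
      # Scheme labels and values are illustrative only.
      doc_metadata = [
          ("DC.title",   None,   "Management of community-acquired pneumonia in adults"),
          ("DC.creator", None,   "Example Medical Center"),
          ("DC.subject", "MeSH", "Pneumonia/drug therapy"),
          ("DC.subject", "MeSH", "Anti-Bacterial Agents/therapeutic use"),
          ("DC.type",    None,   "Practice Guideline"),  # MEDLINE-style publication type
      ]

      for name, scheme, content in doc_metadata:
          scheme_attr = f' scheme="{scheme}"' if scheme else ""
          print(f'<meta name="{name}"{scheme_attr} content="{content}">')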

  4. A Model for Enhancing Internet Medical Document Retrieval with “Medical Core Metadata”

    PubMed Central

    Malet, Gary; Munoz, Felix; Appleyard, Richard; Hersh, William

    1999-01-01

    Objective: Finding documents on the World Wide Web relevant to a specific medical information need can be difficult. The goal of this work is to define a set of document content description tags, or metadata encodings, that can be used to promote disciplined search access to Internet medical documents. Design: The authors based their approach on a proposed metadata standard, the Dublin Core Metadata Element Set, which has recently been submitted to the Internet Engineering Task Force. Their model also incorporates the National Library of Medicine's Medical Subject Headings (MeSH) vocabulary and Medline-type content descriptions. Results: The model defines a medical core metadata set that can be used to describe the metadata for a wide variety of Internet documents. Conclusions: The authors propose that their medical core metadata set be used to assign metadata to medical documents to facilitate document retrieval by Internet search engines. PMID:10094069

  5. Hyper Text Mark-up Language and Dublin Core metadata element set usage in websites of Iranian State Universities' libraries.

    PubMed

    Zare-Farashbandi, Firoozeh; Ramezan-Shirazi, Mahtab; Ashrafi-Rizi, Hasan; Nouri, Rasool

    2014-01-01

    Recent progress in providing innovative solutions for the organization of electronic resources, and research in this area, reflects a global trend toward new strategies such as metadata to facilitate the description, organization, and retrieval of resources in the web environment. In this context, library metadata standards have a special place; the purpose of the present study was therefore a comparative study of the central library websites of Iranian state universities with respect to Hyper Text Mark-up Language (HTML) and Dublin Core metadata element usage in 2011. The study is applied-descriptive, and the data collection tool was a checklist created by the researchers. The statistical population comprises 98 websites of Iranian state universities under the Ministry of Health and Medical Education and the Ministry of Science, Research and Technology, sampled by census. Information was collected through observation and direct visits to the websites, and the data were analyzed with Microsoft Excel 2011. The results indicate that none of the websites uses Dublin Core (DC) metadata and that only a few use elements that overlap between HTML meta tags and DC elements. The percentage of overlapping DC elements in the Ministry of Health was 56% for both description and keywords; in the Ministry of Science it was 45% for keywords and 39% for description. HTML meta tags, however, have a moderate presence in both ministries: the most-used elements were keywords and description (56%) and the least-used were date and format (0%). The Ministry of Health and the Ministry of Science appear set to follow the same path in adopting the Dublin Core standard on their websites in the future. Because central library websites are an example of scientific web pages, special attention to their design can help researchers reach information resources faster and more accurately. The influence of librarians' ideas on the awareness of web designers and developers will therefore be important for the use of metadata elements in general, and for applying such standards in particular.
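
    The check the study describes can be approximated with a short script that inventories a page's meta tags and separates Dublin Core (DC.*) names from plain HTML names such as keywords and description. The HTML below is a made-up sample; a real survey would fetch each library website instead.

      # Sketch: classify <meta> tags into Dublin Core vs. plain HTML names.
      from html.parser import HTMLParser

      SAMPLE_HTML = """
      <html><head>
        <meta name="keywords" content="library, university, Iran">
        <meta name="description" content="Central library web site">
        <meta name="DC.title" content="Central Library">
      </head><body></body></html>
      """

      class MetaCollector(HTMLParser):
          def __init__(self):
              super().__init__()
              self.names = []
          def handle_starttag(self, tag, attrs):
              if tag == "meta":
                  attrs = dict(attrs)
                  if "name" in attrs:
                      self.names.append(attrs["name"])

      collector = MetaCollector()
      collector.feed(SAMPLE_HTML)
      dc_names = [n for n in collector.names if n.lower().startswith("dc.")]
      html_names = [n for n in collector.names if not n.lower().startswith("dc.")]
      print("Dublin Core meta tags:", dc_names)
      print("Plain HTML meta tags: ", html_names)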

  6. Hyper Text Mark-up Language and Dublin Core metadata element set usage in websites of Iranian State Universities’ libraries

    PubMed Central

    Zare-Farashbandi, Firoozeh; Ramezan-Shirazi, Mahtab; Ashrafi-Rizi, Hasan; Nouri, Rasool

    2014-01-01

    Introduction: Recent progress in providing innovative solutions for the organization of electronic resources, and research in this area, reflects a global trend toward new strategies such as metadata to facilitate the description, organization, and retrieval of resources in the web environment. In this context, library metadata standards have a special place; the purpose of the present study was therefore a comparative study of the central library websites of Iranian state universities with respect to Hyper Text Mark-up Language (HTML) and Dublin Core metadata element usage in 2011. Materials and Methods: The study is applied-descriptive, and the data collection tool was a checklist created by the researchers. The statistical population comprises 98 websites of Iranian state universities under the Ministry of Health and Medical Education and the Ministry of Science, Research and Technology, sampled by census. Information was collected through observation and direct visits to the websites, and the data were analyzed with Microsoft Excel 2011. Results: The results indicate that none of the websites uses Dublin Core (DC) metadata and that only a few use elements that overlap between HTML meta tags and DC elements. The percentage of overlapping DC elements in the Ministry of Health was 56% for both description and keywords; in the Ministry of Science it was 45% for keywords and 39% for description. HTML meta tags, however, have a moderate presence in both ministries: the most-used elements were keywords and description (56%) and the least-used were date and format (0%). Conclusion: The Ministry of Health and the Ministry of Science appear set to follow the same path in adopting the Dublin Core standard on their websites in the future. Because central library websites are an example of scientific web pages, special attention to their design can help researchers reach information resources faster and more accurately. The influence of librarians’ ideas on the awareness of web designers and developers will therefore be important for the use of metadata elements in general, and for applying such standards in particular. PMID:24741646

  7. A Metadata Element Set for Project Documentation

    NASA Technical Reports Server (NTRS)

    Hodge, Gail; Templeton, Clay; Allen, Robert B.

    2003-01-01

    NASA Goddard Space Flight Center is a large engineering enterprise with many projects. We describe our efforts to develop standard metadata sets across project documentation, which we term the "Goddard Core". We also address broader issues for project management metadata.

  8. Map Metadata: Essential Elements for Search and Storage

    ERIC Educational Resources Information Center

    Beamer, Ashley

    2009-01-01

    Purpose: The purpose of this paper is to develop an understanding of the issues surrounding the cataloguing of maps in archives and libraries. An investigation into appropriate metadata formats, such as MARC21, EAD and Dublin Core with RDF, shows how particular map data can be stored. Mathematical map elements, specifically co-ordinates, are…

  9. Metadata Repository for Improved Data Sharing and Reuse Based on HL7 FHIR.

    PubMed

    Ulrich, Hannes; Kock, Ann-Kristin; Duhm-Harbeck, Petra; Habermann, Jens K; Ingenerf, Josef

    2016-01-01

    Unreconciled data structures and formats are a common obstacle to the urgently required sharing and reuse of data within healthcare and medical research. Within the North German Tumor Bank of Colorectal Cancer, clinical and sample data, based on a harmonized data set, are collected and can be pooled by using a hospital-integrated Research Data Management System supporting biobank and study management. Adding further partners who do not use the core data set requires manual adaptations and mapping of data elements. To address this manual intervention, and focusing on the reuse of heterogeneous healthcare instance data (value level) and data elements (metadata level), a metadata repository has been developed. The metadata repository is an ISO 11179-3 conformant server application built for annotating and mediating data elements. The implemented architecture includes the translation of metadata information about data elements into the FHIR standard, using the FHIR Data Element resource with the ISO 11179 Data Element Extensions. The FHIR-based processing allows exchange of data elements with clinical and research IT systems as well as with other metadata systems. With increasingly annotated and harmonized data elements, data quality and integration can be improved, successfully enabling data analytics and decision support.
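
    As a rough illustration of the FHIR-based representation described above, the sketch below builds a simplified DataElement-style resource (the FHIR resource type available at the time, shown here without the ISO 11179 extensions spelled out). Field names are abbreviated and all values, codes, and URLs are invented.

      # Rough, simplified sketch of a FHIR DataElement-style resource carrying
      # descriptive information about one data element. Values are hypothetical.
      import json

      data_element = {
          "resourceType": "DataElement",
          "status": "active",
          "name": "TumorGrade",
          "element": [{
              "path": "TumorGrade",
              "definition": "Histopathological grading of the colorectal tumor",
              "code": [{"system": "http://example.org/CodeSystem/tumor-grade", "code": "G2"}],
              "type": [{"code": "CodeableConcept"}],
          }],
      }
      print(json.dumps(data_element, indent=2))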

  10. Collaborative Metadata Curation in Support of NASA Earth Science Data Stewardship

    NASA Technical Reports Server (NTRS)

    Sisco, Adam W.; Bugbee, Kaylin; le Roux, Jeanne; Staton, Patrick; Freitag, Brian; Dixon, Valerie

    2018-01-01

    A growing collection of NASA Earth science data is archived and distributed by EOSDIS's 12 Distributed Active Archive Centers (DAACs). Each collection and granule is described by a metadata record housed in the Common Metadata Repository (CMR). Multiple metadata standards are in use, and core elements of each are mapped to and from a common model, the Unified Metadata Model (UMM). This work was performed by the Analysis and Review of CMR (ARC) Team.

  11. DUBLIN CORE

    EPA Science Inventory

    The Dublin Core is a metadata element set intended to facilitate discovery of electronic resources. It was originally conceived for author-generated descriptions of Web resources, and the Dublin Core has attracted broad ranging international and interdisciplinary support. The cha...

  12. Metadata: Pure and Simple, or Is It?

    ERIC Educational Resources Information Center

    Chalmers, Marilyn

    2002-01-01

    Discusses issues concerning metadata in Web pages based on experiences in a vocational education center library in Queensland (Australia). Highlights include Dublin Core elements; search engines; controlled vocabulary; performance measurement to assess usage patterns and provide quality control over the vocabulary; and considerations given the…

  13. Development of Health Information Search Engine Based on Metadata and Ontology

    PubMed Central

    Song, Tae-Min; Jin, Dal-Lae

    2014-01-01

    Objectives The aim of the study was to develop a metadata- and ontology-based health information search engine ensuring semantic interoperability to collect and provide health information using different application programs. Methods Health information metadata ontology was developed using a distributed semantic Web content publishing model based on vocabularies used to index the contents generated by the information producers as well as those used by users to search the contents. Vocabulary for the health information ontology was mapped to the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), and a list of about 1,500 terms was proposed. The metadata schema used in this study was developed by adding an element describing the target audience to the Dublin Core Metadata Element Set. Results A metadata schema and an ontology ensuring interoperability of health information available on the internet were developed. The metadata- and ontology-based health information search engine developed in this study produced better search results than existing search engines. Conclusions A health information search engine based on metadata and ontology will provide reliable health information to both information producers and consumers. PMID:24872907

  14. Development of health information search engine based on metadata and ontology.

    PubMed

    Song, Tae-Min; Park, Hyeoun-Ae; Jin, Dal-Lae

    2014-04-01

    The aim of the study was to develop a metadata- and ontology-based health information search engine ensuring semantic interoperability to collect and provide health information using different application programs. Health information metadata ontology was developed using a distributed semantic Web content publishing model based on vocabularies used to index the contents generated by the information producers as well as those used by users to search the contents. Vocabulary for the health information ontology was mapped to the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), and a list of about 1,500 terms was proposed. The metadata schema used in this study was developed by adding an element describing the target audience to the Dublin Core Metadata Element Set. A metadata schema and an ontology ensuring interoperability of health information available on the internet were developed. The metadata- and ontology-based health information search engine developed in this study produced better search results than existing search engines. A health information search engine based on metadata and ontology will provide reliable health information to both information producers and consumers.

  15. Mapping and converting essential Federal Geographic Data Committee (FGDC) metadata into MARC21 and Dublin Core: towards an alternative to the FGDC Clearinghouse

    USGS Publications Warehouse

    Chandler, A.; Foley, D.; Hafez, A.M.

    2000-01-01

    The purpose of this article is to raise and address a number of issues related to the conversion of Federal Geographic Data Committee metadata into MARC21 and Dublin Core. We present an analysis of 466 FGDC metadata records housed in the National Biological Information Infrastructure (NBII) node of the FGDC Clearinghouse, with special emphasis on the length of fields and the total length of records in this set. One of our contributions is a 34-element crosswalk, a proposal that takes into consideration the constraints of the MARC21 standard as implemented in OCLC's WorldCat and the realities of user behavior.
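
    For illustration, a few representative FGDC-to-MARC21/Dublin Core mappings can be expressed as a simple lookup table, as in the Python sketch below. The entries are a generic, illustrative subset and do not reproduce the 34-element crosswalk proposed in the article.

      # Illustrative subset of an FGDC -> MARC21 / Dublin Core crosswalk.
      crosswalk = {
          # FGDC CSDGM element path            (MARC21 field/subfield, Dublin Core element)
          "idinfo/citation/citeinfo/title":   ("245 $a", "title"),
          "idinfo/citation/citeinfo/origin":  ("720 $a", "creator"),
          "idinfo/citation/citeinfo/pubdate": ("260 $c", "date"),
          "idinfo/descript/abstract":         ("520 $a", "description"),
          "idinfo/keywords/theme/themekey":   ("653 $a", "subject"),
      }

      fgdc_path = "idinfo/descript/abstract"
      marc_field, dc_element = crosswalk[fgdc_path]
      print(f"{fgdc_path} -> MARC {marc_field} / dc:{dc_element}")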

  16. What Information Does Your EHR Contain? Automatic Generation of a Clinical Metadata Warehouse (CMDW) to Support Identification and Data Access Within Distributed Clinical Research Networks.

    PubMed

    Bruland, Philipp; Doods, Justin; Storck, Michael; Dugas, Martin

    2017-01-01

    Data dictionaries provide structural meta-information about data definitions in health information technology (HIT) systems. Reusing healthcare data for secondary purposes offers several advantages (e.g., reduced documentation time or increased data quality). Prerequisites for data reuse are its quality, availability, and identical meaning of the data. In diverse projects, research data warehouses serve as core components between heterogeneous clinical databases and various research applications. Given the complexity (high number of data elements) and dynamics (regular updates) of electronic health record (EHR) data structures, we propose a clinical metadata warehouse (CMDW) based on a metadata registry standard. Metadata of two large hospitals were automatically inserted into two CMDWs containing 16,230 forms and 310,519 data elements. Automatic updates of metadata are possible, as are semantic annotations. A CMDW allows metadata discovery, data quality assessment, and similarity analyses. Common data models for distributed research networks can be established based on similarity analyses.

  17. A metadata template for ocean acidification data

    NASA Astrophysics Data System (ADS)

    Jiang, L.

    2014-12-01

    Metadata is structured information that describes, explains, and locates an information resource (e.g., data). It is often coarsely described as data about data, and it documents information such as what was measured, by whom, when, where, and how it was sampled and analyzed, and with what instruments. Metadata is essential to ensuring the survivability and accessibility of data into the future. With the rapid expansion of ocean acidification (OA) biological response studies, the lack of a common metadata template for documenting this type of data has become a significant gap in ocean acidification data management efforts. In this paper, we present a metadata template that can be applied to a broad spectrum of OA studies, including those studying the biological responses of organisms to ocean acidification. The "variable metadata section", which includes the variable name, observation type, whether the variable is a manipulation condition or a response variable, and the biological subject on which the variable is studied, forms the core of this metadata template. Additional metadata elements, such as principal investigators, temporal and spatial coverage, sampling platforms, and data citation, are essential components that complete the template. We explain the structure of the template and define many metadata elements that may be unfamiliar to researchers. For that reason, this paper can also serve as a user's manual for the template.
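
    A hypothetical instance of the "variable metadata section" described above might look like the following Python sketch; the field names and values are invented for illustration and are not the template's official vocabulary.

      # Sketch of a variable metadata entry for one response variable in a
      # hypothetical ocean acidification experiment.
      variable_metadata = {
          "variable_name": "calcification_rate",
          "observation_type": "measured",
          "role": "response variable",          # vs. "manipulation condition"
          "biological_subject": "Mytilus edulis (blue mussel)",
          "unit": "mg CaCO3 g-1 d-1",
          "method": "buoyant weight technique",
      }

      # Additional template sections (investigators, coverage, platforms, citation)
      # would sit alongside this block in a complete record.
      for key, value in variable_metadata.items():
          print(f"{key:20s}: {value}")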

  18. Inheritance rules for Hierarchical Metadata Based on ISO 19115

    NASA Astrophysics Data System (ADS)

    Zabala, A.; Masó, J.; Pons, X.

    2012-04-01

    ISO 19115 has mainly been used to describe metadata for datasets and services. The standard (as well as the new draft ISO 19115-1) also includes a conceptual model that allows metadata to be described at different levels of granularity, structured in hierarchical levels: in aggregated resources such as series and datasets, and in more disaggregated resources such as entity types (feature types), attribute types, entities (feature instances), and attributes (attribute instances). In theory, a complete metadata structure could be applied to every hierarchical level, from the whole series down to an individual feature attribute, but storing all metadata at all levels is completely impractical. An inheritance mechanism is needed to store each metadata and quality item at the optimal hierarchical level and to allow easy and efficient documentation of metadata, both in an Earth observation scenario such as multi-band imagery from a multi-satellite mission and in a complex vector topographic map that includes several feature types separated in layers (e.g. administrative limits, contour lines, building polygons, road lines, etc.). Moreover, because maps are traditionally split into tiles for handling at detailed scales, or because of satellite acquisition characteristics, each of these thematic layers (e.g. 1:5000 roads for a country) or bands (a Landsat-5 TM cover of the Earth) is divided into several parts (sheets or scenes, respectively). According to the hierarchy in ISO 19115, the definition of general metadata can be supplemented by spatially specific metadata that, when required, either inherits or overrides the general case (G.1.3). Annex H of the standard states that only metadata exceptions are defined at lower levels, so it is not necessary to generate the full set of metadata for each level, only to link particular values to the general value they inherit. Conceptually the metadata registry is complete for each hierarchical level, but at the implementation level most metadata elements are stored only at the more generic one. This communication defines a metadata system that covers four levels; it describes which metadata have to support series-layer inheritance and in which way, and how the hierarchical levels are defined and stored. Metadata elements are classified according to the type of inheritance between products, series, tiles, and datasets; the classification is explained and exemplified using core metadata elements. The communication also presents a metadata viewer and editing tool that uses the described model to propagate metadata elements and to show the user a complete set of metadata for each level in a transparent way. This tool is integrated in the MiraMon GIS software.
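
    The inheritance rule described above (store an element once at the most generic level, record only exceptions below it) can be sketched in a few lines of Python. The levels and metadata values are invented for illustration.

      # Minimal sketch: a lookup falls back to the parent level (tile -> dataset
      # -> series) whenever the current level defines no exception.
      class Level:
          def __init__(self, name, parent=None, **metadata):
              self.name, self.parent, self.metadata = name, parent, metadata

          def get(self, element):
              node = self
              while node is not None:
                  if element in node.metadata:
                      return node.metadata[element]
                  node = node.parent
              raise KeyError(element)

      series  = Level("series", lineage="Aerial survey 2011", crs="EPSG:25831")
      dataset = Level("sheet 289-112", parent=series)                               # inherits everything
      tile    = Level("tile 289-112-04", parent=dataset, lineage="Re-flown 2012")   # exception at tile level

      print(tile.get("crs"))      # inherited from the series level
      print(tile.get("lineage"))  # overridden at the tile level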

  19. MCM generator: a Java-based tool for generating medical metadata.

    PubMed

    Munoz, F; Hersh, W

    1998-01-01

    In a previous paper we introduced the need for a mechanism to facilitate the discovery of relevant Web medical documents. We maintained that the use of META tags, specifically ones that define the medical subject and resource type of a document, helps toward this goal. We have now developed a tool to facilitate the generation of these tags for the authors of medical documents. Written entirely in Java, this tool makes use of the SAPHIRE server and helps the author identify the Medical Subject Heading terms that most appropriately describe the subject of the document. Furthermore, it allows the author to generate metadata tags for the 15 elements that the Dublin Core considers core to the description of a document. This paper describes the use of this tool in the cataloguing of Web and non-Web medical documents, such as image, movie, and sound files.

  20. Enriching the trustworthiness of health-related web pages.

    PubMed

    Gaudinat, Arnaud; Cruchet, Sarah; Boyer, Celia; Chrawdhry, Pravir

    2011-06-01

    We present an experimental mechanism for enriching web content with quality metadata. This mechanism is based on a simple and well-known initiative in the field of the health-related web, the HONcode. The Resource Description Framework (RDF) format and the Dublin Core Metadata Element Set were used to formalize these metadata. The model of trust proposed is based on a quality model for health-related web pages that has been tested in practice over a period of thirteen years. Our model has been explored in the context of a project to develop a research tool that automatically detects the occurrence of quality criteria in health-related web pages.
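
    A minimal sketch of such quality metadata in RDF, using the third-party rdflib package: Dublin Core elements describe the page, while detected quality criteria are attached under a hypothetical HONcode-style namespace. The page URL, namespace, and property names are invented.

      # Sketch: RDF quality metadata combining Dublin Core with a hypothetical
      # HONcode-style vocabulary (requires the rdflib package).
      from rdflib import Graph, Literal, Namespace, URIRef
      from rdflib.namespace import DC

      HON = Namespace("http://example.org/honcode#")
      page = URIRef("http://example.org/health/asthma-treatment.html")

      g = Graph()
      g.bind("dc", DC)
      g.bind("hon", HON)
      g.add((page, DC.title, Literal("Asthma treatment options")))
      g.add((page, DC.creator, Literal("Example Health Foundation")))
      g.add((page, DC.date, Literal("2011-06-01")))
      g.add((page, HON.authoritative, Literal(True)))    # detected quality criterion
      g.add((page, HON.justifiability, Literal(True)))

      print(g.serialize(format="turtle"))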

  1. Development of an open metadata schema for prospective clinical research (openPCR) in China.

    PubMed

    Xu, W; Guan, Z; Sun, J; Wang, Z; Geng, Y

    2014-01-01

    In China, deployment of electronic data capture (EDC) and clinical data management system (CDMS) for clinical research (CR) is in its very early stage, and about 90% of clinical studies collected and submitted clinical data manually. This work aims to build an open metadata schema for Prospective Clinical Research (openPCR) in China based on openEHR archetypes, in order to help Chinese researchers easily create specific data entry templates for registration, study design and clinical data collection. Singapore Framework for Dublin Core Application Profiles (DCAP) is used to develop openPCR and four steps such as defining the core functional requirements and deducing the core metadata items, developing archetype models, defining metadata terms and creating archetype records, and finally developing implementation syntax are followed. The core functional requirements are divided into three categories: requirements for research registration, requirements for trial design, and requirements for case report form (CRF). 74 metadata items are identified and their Chinese authority names are created. The minimum metadata set of openPCR includes 3 documents, 6 sections, 26 top level data groups, 32 lower data groups and 74 data elements. The top level container in openPCR is composed of public document, internal document and clinical document archetypes. A hierarchical structure of openPCR is established according to Data Structure of Electronic Health Record Architecture and Data Standard of China (Chinese EHR Standard). Metadata attributes are grouped into six parts: identification, definition, representation, relation, usage guides, and administration. OpenPCR is an open metadata schema based on research registration standards, standards of the Clinical Data Interchange Standards Consortium (CDISC) and Chinese healthcare related standards, and is to be publicly available throughout China. It considers future integration of EHR and CR by adopting data structure and data terms in Chinese EHR Standard. Archetypes in openPCR are modularity models and can be separated, recombined, and reused. The authors recommend that the method to develop openPCR can be referenced by other countries when designing metadata schema of clinical research. In the next steps, openPCR should be used in a number of CR projects to test its applicability and to continuously improve its coverage. Besides, metadata schema for research protocol can be developed to structurize and standardize protocol, and syntactical interoperability of openPCR with other related standards can be considered.

  2. SnoVault and encodeD: A novel object-based storage system and applications to ENCODE metadata.

    PubMed

    Hitz, Benjamin C; Rowe, Laurence D; Podduturi, Nikhil R; Glick, David I; Baymuradov, Ulugbek K; Malladi, Venkat S; Chan, Esther T; Davidson, Jean M; Gabdank, Idan; Narayana, Aditi K; Onate, Kathrina C; Hilton, Jason; Ho, Marcus C; Lee, Brian T; Miyasato, Stuart R; Dreszer, Timothy R; Sloan, Cricket A; Strattan, J Seth; Tanaka, Forrest Y; Hong, Eurie L; Cherry, J Michael

    2017-01-01

    The Encyclopedia of DNA Elements (ENCODE) project is an ongoing collaborative effort to create a comprehensive catalog of functional elements, initiated shortly after the completion of the Human Genome Project. The current database exceeds 6500 experiments across more than 450 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the H. sapiens and M. musculus genomes. All ENCODE experimental data, metadata, and associated computational analyses are submitted to the ENCODE Data Coordination Center (DCC) for validation, tracking, storage, unified processing, and distribution to community resources and the scientific community. As the volume of data increases, the identification and organization of experimental details becomes increasingly intricate and demands careful curation. The ENCODE DCC has created a general-purpose software system, known as SnoVault, that supports metadata and file submission, a database used for metadata storage, web pages for displaying the metadata, and a robust API for querying the metadata. The software is fully open source; code and installation instructions can be found at http://github.com/ENCODE-DCC/snovault/ (for the generic database) and http://github.com/ENCODE-DCC/encoded/ (for storing genomic data in the manner of ENCODE). The core database engine, SnoVault (which is completely independent of ENCODE, genomic data, or bioinformatic data), has been released as a separate Python package.
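
    Metadata in a SnoVault-backed portal such as ENCODE can be queried over HTTP as JSON. The sketch below (using the third-party requests package) shows the general pattern; the query parameters and result fields are illustrative, so consult the portal's API documentation for the exact interface.

      # Sketch of a JSON metadata query against a SnoVault-style portal API.
      import requests

      resp = requests.get(
          "https://www.encodeproject.org/search/",
          params={"type": "Experiment", "format": "json", "limit": 5},
          headers={"Accept": "application/json"},
          timeout=30,
      )
      resp.raise_for_status()
      for record in resp.json().get("@graph", []):
          # 'accession' and 'assay_title' are typical fields; adjust to the schema in use.
          print(record.get("accession"), "-", record.get("assay_title"))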

  3. SnoVault and encodeD: A novel object-based storage system and applications to ENCODE metadata

    PubMed Central

    Podduturi, Nikhil R.; Glick, David I.; Baymuradov, Ulugbek K.; Malladi, Venkat S.; Chan, Esther T.; Davidson, Jean M.; Gabdank, Idan; Narayana, Aditi K.; Onate, Kathrina C.; Hilton, Jason; Ho, Marcus C.; Lee, Brian T.; Miyasato, Stuart R.; Dreszer, Timothy R.; Sloan, Cricket A.; Strattan, J. Seth; Tanaka, Forrest Y.; Hong, Eurie L.; Cherry, J. Michael

    2017-01-01

    The Encyclopedia of DNA Elements (ENCODE) project is an ongoing collaborative effort to create a comprehensive catalog of functional elements, initiated shortly after the completion of the Human Genome Project. The current database exceeds 6500 experiments across more than 450 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the H. sapiens and M. musculus genomes. All ENCODE experimental data, metadata, and associated computational analyses are submitted to the ENCODE Data Coordination Center (DCC) for validation, tracking, storage, unified processing, and distribution to community resources and the scientific community. As the volume of data increases, the identification and organization of experimental details becomes increasingly intricate and demands careful curation. The ENCODE DCC has created a general-purpose software system, known as SnoVault, that supports metadata and file submission, a database used for metadata storage, web pages for displaying the metadata, and a robust API for querying the metadata. The software is fully open source; code and installation instructions can be found at http://github.com/ENCODE-DCC/snovault/ (for the generic database) and http://github.com/ENCODE-DCC/encoded/ (for storing genomic data in the manner of ENCODE). The core database engine, SnoVault (which is completely independent of ENCODE, genomic data, or bioinformatic data), has been released as a separate Python package. PMID:28403240

  4. ISO 19115 Experiences in NASA's Earth Observing System (EOS) ClearingHOuse (ECHO)

    NASA Astrophysics Data System (ADS)

    Cechini, M. F.; Mitchell, A.

    2011-12-01

    Metadata is an important entity in the process of cataloging, discovering, and describing Earth science data. As science research and the gathered data increase in complexity, so do the complexity and importance of descriptive metadata. To meet these growing needs, the required metadata models utilize richer and more mature metadata attributes. Categorizing, standardizing, and promulgating these metadata models to a politically, geographically, and scientifically diverse community is a difficult process. An integral component of metadata management within NASA's Earth Observing System Data and Information System (EOSDIS) is the Earth Observing System (EOS) ClearingHOuse (ECHO). ECHO is the core metadata repository for the EOSDIS data centers, providing a centralized mechanism for metadata and data discovery and retrieval. ECHO has undertaken an internal restructuring to meet the changing needs of scientists, the consistent advancement in technology, and the advent of new standards such as ISO 19115. These improvements were based on the following tenets for data discovery and retrieval:
    + There exists a set of 'core' metadata fields recommended for data discovery.
    + There exists a set of users who will require the entire metadata record for advanced analysis.
    + There exists a set of users who will require only a 'core' set of metadata fields for discovery.
    + There will never be a cessation of new formats or a total retirement of all old formats.
    + Users should be presented metadata in a consistent format of their choosing.
    In order to address the previously listed items, ECHO's new metadata processing paradigm utilizes the following approach:
    + Identify a cross-format set of 'core' metadata fields necessary for discovery.
    + Implement format-specific indexers to extract the 'core' metadata fields into an optimized query capability.
    + Archive the original metadata in its entirety for presentation to users requiring the full record.
    + Provide on-demand translation of 'core' metadata to any supported result format.
    Lessons learned by the ECHO team while implementing its new metadata approach to support usage of the ISO 19115 standard will be presented. These lessons learned highlight some discovered strengths and weaknesses in the ISO 19115 standard as it is introduced to an existing metadata processing system.
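
    The indexing paradigm described above can be sketched as a set of format-specific extractors that all emit the same 'core' discovery fields while the native record is archived unchanged. The format names, record structures, and field names below are simplified and illustrative.

      # Sketch: format-specific indexers emitting a shared set of core fields.
      CORE_FIELDS = {"title", "temporal_extent", "spatial_extent"}

      def index_echo10(record):
          c = record["Collection"]
          return {"title": c["LongName"],
                  "temporal_extent": c["Temporal"],
                  "spatial_extent": c["Spatial"]}

      def index_iso19115(record):
          ident = record["identificationInfo"]
          return {"title": ident["citation"]["title"],
                  "temporal_extent": ident["temporalExtent"],
                  "spatial_extent": ident["geographicExtent"]}

      INDEXERS = {"echo10": index_echo10, "iso19115": index_iso19115}

      def index(native_format, record):
          core = INDEXERS[native_format](record)
          assert set(core) == CORE_FIELDS   # every indexer emits the same core fields
          return core

      sample = {"Collection": {"LongName": "MODIS/Terra Snow Cover Daily",
                               "Temporal": "2000-02-24/ongoing",
                               "Spatial": "global"}}
      print(index("echo10", sample))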

  5. Towards a semantic medical Web: HealthCyberMap's tool for building an RDF metadata base of health information resources based on the Qualified Dublin Core Metadata Set.

    PubMed

    Boulos, Maged N; Roudsari, Abdul V; Carson, Ewart R

    2002-07-01

    HealthCyberMap (http://healthcybermap.semanticweb.org/) aims at mapping Internet health information resources in novel ways for enhanced retrieval and navigation. This is achieved by collecting appropriate resource metadata in an unambiguous form that preserves semantics. We modelled a qualified Dublin Core (DC) metadata set ontology with extra elements for resource quality and geographical provenance in Protégé-2000. A metadata collection form helps acquire resource instance data within Protégé. The DC subject field is populated with UMLS terms directly imported from the UMLS Knowledge Source Server using the UMLS tab, a Protégé-2000 plug-in. The project is saved in RDFS/RDF. The ontology and associated form serve as a free tool for building and maintaining an RDF medical resource metadata base. The UMLS tab enables browsing and searching for concepts that best describe a resource, and importing them into DC subject fields. The resultant metadata base can be used with a search and inference engine, and can have textual and/or visual navigation interface(s) applied to it, to ultimately build a medical Semantic Web portal. Different ways of exploiting Protégé-2000 RDF output are discussed. By making the context and semantics of resources, not merely their raw text and formatting, amenable to computer 'understanding,' we can build a Semantic Web that is more useful to humans than the current Web. This requires proper use of metadata and ontologies. Clinical codes can reliably describe the subjects of medical resources, establish the semantic relationships (as defined by the underlying coding scheme) between related resources, and automate their topical categorisation.

  6. Metadata Effectiveness in Internet Discovery: An Analysis of Digital Collection Metadata Elements and Internet Search Engine Keywords

    ERIC Educational Resources Information Center

    Yang, Le

    2016-01-01

    This study analyzed digital item metadata and keywords from Internet search engines to learn what metadata elements actually facilitate discovery of digital collections through Internet keyword searching and how significantly each metadata element affects the discovery of items in a digital repository. The study found that keywords from Internet…

  7. Report from the International Conference on Dublin Core and Metadata Applications, 2001.

    ERIC Educational Resources Information Center

    Sugimoto, Shigeo; Adachi, Jun; Baker, Thomas; Weibel, Stuart

    This paper describes the International Conference on Dublin Core and Metadata Applications 2001 (DC-2001), the ninth major workshop of the Dublin Core Metadata Initiative (DCMI), which was held in Tokyo in October 2001. DC-2001 was a week-long event that included both a workshop and a conference. In the tradition of previous events, the workshop…

  8. caCORE: a common infrastructure for cancer informatics.

    PubMed

    Covitz, Peter A; Hartel, Frank; Schaefer, Carl; De Coronado, Sherri; Fragoso, Gilberto; Sahni, Himanso; Gustafson, Scott; Buetow, Kenneth H

    2003-12-12

    Sites with substantive bioinformatics operations are challenged to build data processing and delivery infrastructure that provides reliable access and enables data integration. Locally generated data must be processed and stored such that relationships to external data sources can be presented. Consistency and comparability across data sets requires annotation with controlled vocabularies and, further, metadata standards for data representation. Programmatic access to the processed data should be supported to ensure the maximum possible value is extracted. Confronted with these challenges at the National Cancer Institute Center for Bioinformatics, we decided to develop a robust infrastructure for data management and integration that supports advanced biomedical applications. We have developed an interconnected set of software and services called caCORE. Enterprise Vocabulary Services (EVS) provide controlled vocabulary, dictionary and thesaurus services. The Cancer Data Standards Repository (caDSR) provides a metadata registry for common data elements. Cancer Bioinformatics Infrastructure Objects (caBIO) implements an object-oriented model of the biomedical domain and provides Java, Simple Object Access Protocol and HTTP-XML application programming interfaces. caCORE has been used to develop scientific applications that bring together data from distinct genomic and clinical science sources. caCORE downloads and web interfaces can be accessed from links on the caCORE web site (http://ncicb.nci.nih.gov/core). caBIO software is distributed under an open source license that permits unrestricted academic and commercial use. Vocabulary and metadata content in the EVS and caDSR, respectively, is similarly unrestricted, and is available through web applications and FTP downloads. http://ncicb.nci.nih.gov/core/publications contains links to the caBIO 1.0 class diagram and the caCORE 1.0 Technical Guide, which provide detailed information on the present caCORE architecture, data sources and APIs. Updated information appears on a regular basis on the caCORE web site (http://ncicb.nci.nih.gov/core).

  9. A database of paleoceanographic sediment cores from the North Pacific, 1951-2016

    NASA Astrophysics Data System (ADS)

    Borreggine, Marisa; Myhre, Sarah E.; Mislan, K. Allison S.; Deutsch, Curtis; Davis, Catherine V.

    2017-09-01

    We assessed sediment coring, data acquisition, and publications from the North Pacific (north of 30° N) from 1951 to 2016. There are 2134 sediment cores collected by American, French, Japanese, Russian, and international research vessels across the North Pacific (including the Pacific subarctic gyre, Alaskan gyre, Japan margin, and California margin; 1391 cores), the Sea of Okhotsk (271 cores), the Bering Sea (123 cores), and the Sea of Japan (349 cores) reported here. All existing metadata associated with these sediment cores are documented here, including coring date, location, core number, cruise number, water depth, vessel metadata, and coring technology. North Pacific sediment core age models are built with isotope stratigraphy, radiocarbon dating, magnetostratigraphy, biostratigraphy, tephrochronology, % opal, color, and lithological proxies. Here, we evaluate the iterative generation of each published age model and provide comprehensive documentation of the dating techniques used, along with sedimentation rates and age ranges. We categorized cores according to the availability of a variety of proxy evidence, including biological (e.g., benthic and planktonic foraminifera assemblages), geochemical (e.g., major trace element concentrations), isotopic (e.g., bulk sediment nitrogen, oxygen, and carbon isotopes), and stratigraphic (e.g., preserved laminations) proxies. This database is a unique resource to the paleoceanographic and paleoclimate communities and provides cohesive accessibility to sedimentary sequences, age model development, and proxies. The data set is publicly available through PANGAEA at https://doi.org/10.1594/PANGAEA.875998.

  10. Partnerships To Mine Unexploited Sources of Metadata.

    ERIC Educational Resources Information Center

    Reynolds, Regina Romano

    This paper discusses the metadata created for other purposes as a potential source of bibliographic data. The first section addresses collecting metadata by means of templates, including the Nordic Metadata Project's Dublin Core Metadata Template. The second section considers potential partnerships for re-purposing metadata for bibliographic use,…

  11. Trends in the Evolution of the Public Web, 1998-2002; The Fedora Project: An Open-source Digital Object Repository Management System; State of the Dublin Core Metadata Initiative, April 2003; Preservation Metadata; How Many People Search the ERIC Database Each Day?

    ERIC Educational Resources Information Center

    O'Neill, Edward T.; Lavoie, Brian F.; Bennett, Rick; Staples, Thornton; Wayland, Ross; Payette, Sandra; Dekkers, Makx; Weibel, Stuart; Searle, Sam; Thompson, Dave; Rudner, Lawrence M.

    2003-01-01

    Includes five articles that examine key trends in the development of the public Web: size and growth, internationalization, and metadata usage; Flexible Extensible Digital Object and Repository Architecture (Fedora) for use in digital libraries; developments in the Dublin Core Metadata Initiative (DCMI); the National Library of New Zealand Te Puna…

  12. Generation of Multiple Metadata Formats from a Geospatial Data Repository

    NASA Astrophysics Data System (ADS)

    Hudspeth, W. B.; Benedict, K. K.; Scott, S.

    2012-12-01

    The Earth Data Analysis Center (EDAC) at the University of New Mexico is partnering with the CYBERShARE and Environmental Health Group from the Center for Environmental Resource Management (CERM), located at the University of Texas, El Paso (UTEP), the Biodiversity Institute at the University of Kansas (KU), and the New Mexico Geo-Epidemiology Research Network (GERN) to provide a technical infrastructure that enables investigation of a variety of climate-driven human/environmental systems. Two significant goals of this NASA-funded project are: a) to increase the use of NASA Earth observational data at EDAC by various modeling communities through enabling better discovery, access, and use of relevant information, and b) to expose these communities to the benefits of provenance for improving understanding and usability of heterogeneous data sources and derived model products. To realize these goals, EDAC has leveraged the core capabilities of its Geographic Storage, Transformation, and Retrieval Engine (Gstore) platform, developed with support of the NSF EPSCoR Program. The Gstore geospatial services platform provides general purpose web services based upon the REST service model, and is capable of data discovery, access, and publication functions, metadata delivery functions, data transformation, and auto-generated OGC services for those data products that can support those services. Central to the NASA ACCESS project is the delivery of geospatial metadata in a variety of formats, including ISO 19115-2/19139, FGDC CSDGM, and the Proof Markup Language (PML). This presentation details the extraction and persistence of relevant metadata in the Gstore data store, and their transformation into multiple metadata formats that are increasingly utilized by the geospatial community to document not only core library catalog elements (e.g. title, abstract, publication data, geographic extent, projection information, and database elements), but also the processing steps used to generate derived modeling products. In particular, we discuss the generation and service delivery of provenance, or trace of data sources and analytical methods used in a scientific analysis, for archived data. We discuss the workflows developed by EDAC to capture end-to-end provenance, the storage model for those data in a delivery format independent data structure, and delivery of PML, ISO, and FGDC documents to clients requesting those products.

  13. "CanCore": In Canada and around the World

    ERIC Educational Resources Information Center

    Friesen, Norm

    2005-01-01

    In this article, the author discusses "CanCore," a learning resource metadata initiative funded by Industry Canada and supported by Athabasca University, Alberta, and TeleUniversite du Quebec, and describes the increasing range of international uses of the "CanCore" metadata for the indexing of learning objects.…

  14. ISO, FGDC, DIF and Dublin Core - Making Sense of Metadata Standards for Earth Science Data

    NASA Astrophysics Data System (ADS)

    Jones, P. R.; Ritchey, N. A.; Peng, G.; Toner, V. A.; Brown, H.

    2014-12-01

    Metadata standards provide common definitions of metadata fields for information exchange across user communities. Despite the broad adoption of metadata standards for Earth science data, there are still heterogeneous and incompatible representations of information due to differences between the many standards in use and how each standard is applied. Federal agencies are required to manage and publish metadata in different metadata standards and formats for various data catalogs. In 2014, the NOAA National Climatic Data Center (NCDC) managed metadata for its scientific datasets in ISO 19115-2 in XML, GCMD Directory Interchange Format (DIF) in XML, DataCite Schema in XML, Dublin Core in XML, and Data Catalog Vocabulary (DCAT) in JSON, with more standards and profiles of standards planned. Of these standards, the ISO 19115-series metadata is the most complete and feature-rich, and for this reason it is used by NCDC as the source for the other metadata standards. We will discuss the capabilities of these metadata standards and how they are being implemented to document datasets. Successful implementations include developing translations and displays using XSLTs, creating links to related data and resources, documenting dataset lineage, and establishing best practices. Benefits, gaps, and challenges will be highlighted, with suggestions for improved approaches to metadata storage and maintenance.
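
    The XSLT-based translation workflow mentioned above follows a simple pattern: keep the ISO record as the authoritative source and apply one stylesheet per target standard. The Python sketch below uses the third-party lxml package; the file names are placeholders, not actual NCDC artifacts.

      # Sketch of XSLT-based metadata translation (requires the lxml package).
      from lxml import etree

      iso_record = etree.parse("dataset_iso19115.xml")   # authoritative source record (placeholder file)
      to_dif = etree.XSLT(etree.parse("iso2dif.xsl"))    # translation stylesheet (placeholder file)

      dif_record = to_dif(iso_record)
      print(etree.tostring(dif_record, pretty_print=True).decode())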

  15. A federated semantic metadata registry framework for enabling interoperability across clinical research and care domains.

    PubMed

    Sinaci, A Anil; Laleci Erturkmen, Gokce B

    2013-10-01

    In order to enable secondary use of Electronic Health Records (EHRs) by bridging the interoperability gap between the clinical care and research domains, this paper introduces a unified methodology and supporting framework that brings together the power of metadata registries (MDR) and semantic web technologies. We introduce a federated semantic metadata registry framework by extending the ISO/IEC 11179 standard, and we enable integration of data element registries through Linked Open Data (LOD) principles, whereby each Common Data Element (CDE) can be uniquely referenced, queried, and processed to enable syntactic and semantic interoperability. Each CDE and its components are maintained as LOD resources enabling semantic links with other CDEs, terminology systems, and implementation-dependent content models, hence facilitating semantic search, more effective reuse, and semantic interoperability across different application domains. There are several important efforts addressing semantic interoperability in the healthcare domain, such as the IHE DEX profile proposal, CDISC SHARE, and CDISC2RDF. Our architecture complements these by providing a framework to interlink existing data element registries and repositories, multiplying their potential for semantic interoperability to a greater extent. The open source implementation of the federated semantic MDR framework presented in this paper is the core of the semantic interoperability layer of the SALUS project, which enables the execution of post-marketing safety analysis studies on top of existing EHR systems. Copyright © 2013 Elsevier Inc. All rights reserved.

  16. Predicting biomedical metadata in CEDAR: A study of Gene Expression Omnibus (GEO).

    PubMed

    Panahiazar, Maryam; Dumontier, Michel; Gevaert, Olivier

    2017-08-01

    A crucial and limiting factor in data reuse is the lack of accurate, structured, and complete descriptions of data, known as metadata. Towards improving the quantity and quality of metadata, we propose a novel metadata prediction framework to learn associations from existing metadata that can be used to predict metadata values. We evaluate our framework in the context of experimental metadata from the Gene Expression Omnibus (GEO). We applied four rule mining algorithms to the most common structured metadata elements (sample type, molecular type, platform, label type, and organism) from over 1.3 million GEO records. We examined the quality of well-supported rules from each algorithm and visualized the dependencies among metadata elements. Finally, we evaluated the performance of the algorithms in terms of accuracy, precision, recall, and F-measure. We found that PART is the best algorithm, outperforming Apriori, Predictive Apriori, and Decision Table. All algorithms perform significantly better in predicting class values than the majority-vote classifier. We found that the performance of the algorithms is related to the dimensionality of the GEO elements: the average performance of all algorithms increases as the number of unique values of these elements decreases (2697 platforms, 537 organisms, 454 labels, 9 molecules, and 5 types). Our work suggests that experimental metadata such as that in GEO can be accurately predicted using rule mining algorithms. Our work has implications for both prospective and retrospective augmentation of metadata quality, which is geared towards making data easier to find and reuse. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
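
    The underlying idea, learning "if field X has value x, then field Y tends to have value y" associations from existing records, can be illustrated with a tiny self-contained confidence calculation. The records below are invented, and this stands in for, rather than reproduces, the PART/Apriori algorithms evaluated in the paper.

      # Tiny sketch: confidence of an association rule over invented GEO-style records.
      records = [
          {"organism": "Homo sapiens", "molecule": "total RNA", "platform": "GPL570"},
          {"organism": "Homo sapiens", "molecule": "total RNA", "platform": "GPL570"},
          {"organism": "Homo sapiens", "molecule": "total RNA", "platform": "GPL96"},
          {"organism": "Mus musculus", "molecule": "genomic DNA", "platform": "GPL1261"},
      ]

      def rule_confidence(records, antecedent, consequent):
          """Confidence of the rule antecedent -> consequent, e.g. organism=x -> molecule=y."""
          (a_field, a_value), (c_field, c_value) = antecedent, consequent
          matching = [r for r in records if r[a_field] == a_value]
          if not matching:
              return 0.0
          return sum(r[c_field] == c_value for r in matching) / len(matching)

      conf = rule_confidence(records, ("organism", "Homo sapiens"), ("molecule", "total RNA"))
      print(f"organism=Homo sapiens -> molecule=total RNA (confidence {conf:.2f})")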

  17. Cross-organizational workflow in radiology: an empirical study of the quality of shared metadata elements in Region Västra Götaland, Sweden.

    PubMed

    Lindsköld, Lars; Wintell, Mikael; Edgren, Lars; Aspelin, Peter; Lundberg, Nina

    2013-07-01

    Challenges related to cross-organizational access to accurate and timely information about a patient's condition have become a critical issue in healthcare; interoperability of different local sources is necessary. The aim was to identify and present missing and semantically incorrect data elements of metadata in the radiology enterprise service that supports cross-organizational sharing of dynamic information about patients' visits in Region Västra Götaland, Sweden. Quantitative data elements of metadata were collected on the first Wednesday in March each year from 2006 to 2011 from the 24 in-house radiology departments in Region Västra Götaland. These radiology departments were organized into four hospital groups and three stand-alone hospitals. The included data elements of metadata were the patient name, patient ID, institutional department name, referring physician's name, and examination description. The majority of missing data elements of metadata were related to the institutional department name for Hospital 2, ranging from 87% in 2007 to 25% in 2011. All data elements of metadata except the patient ID contained semantic errors. For example, for the data element "patient name", only three names out of 3537 were semantically correct. This study shows that the semantics of metadata elements are poorly structured and inconsistently used. Although a cross-organizational solution may technically be fully functional, semantic errors may prevent it from serving as an information infrastructure for collaboration between all departments and hospitals in the region. For interoperability, it is important that the agreed semantic models are implemented in vendor systems using the information infrastructure.
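
    A simplified version of the kind of completeness and semantics check performed in the study is sketched below; the required elements match those listed above, but the semantic rule and the sample record are invented for illustration.

      # Sketch: flag missing and semantically incorrect shared metadata elements.
      import re

      REQUIRED = ["patient_name", "patient_id", "department",
                  "referring_physician", "exam_description"]

      def check(record):
          problems = []
          for field in REQUIRED:
              if not record.get(field, "").strip():
                  problems.append(f"missing: {field}")
          # crude illustrative semantic rule: a patient name should look like
          # "Family^Given" rather than free text such as "unknown"
          if record.get("patient_name") and not re.fullmatch(
                  r"[A-Za-z\- ]+\^[A-Za-z\- ]+", record["patient_name"]):
              problems.append("semantically incorrect: patient_name")
          return problems

      print(check({"patient_name": "unknown", "patient_id": "19121212-1234",
                   "department": "", "referring_physician": "Doe^Jane",
                   "exam_description": "CT thorax"}))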

  18. Descriptive Metadata: Emerging Standards.

    ERIC Educational Resources Information Center

    Ahronheim, Judith R.

    1998-01-01

    Discusses metadata, digital resources, cross-disciplinary activity, and standards. Highlights include Standard Generalized Markup Language (SGML); Extensible Markup Language (XML); Dublin Core; Resource Description Framework (RDF); Text Encoding Initiative (TEI); Encoded Archival Description (EAD); art and cultural-heritage metadata initiatives;…

  19. EnviroAtlas Tree Cover Configuration and Connectivity, Water Background Web Service

    EPA Pesticide Factsheets

    This EnviroAtlas web service supports research and online mapping activities related to EnviroAtlas (https://www.epa.gov/enviroatlas). The 1-meter resolution tree cover configuration and connectivity map categorizes tree cover into structural elements (e.g. core, edge, connector, etc.). Source imagery varies by community. For specific information about methods and accuracy of each community's tree cover configuration and connectivity classification, consult their individual metadata records: Austin, TX (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7B29D2B039-905C-4825-B0B4-9315122D6A9F%7D); Cleveland, OH (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7B03cd54e1-4328-402e-ba75-e198ea9fbdc7%7D); Des Moines, IA (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7B350A83E6-10A2-4D5D-97E6-F7F368D268BB%7D); Durham, NC (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7BC337BA5F-8275-4BA8-9647-F63C443F317D%7D); Fresno, CA (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7B84B98749-9C1C-4679-AE24-9B9C0998EBA5%7D); Green Bay, WI (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7B69E48A44-3D30-4E84-A764-38FBDCCAC3D0%7D); Memphis, TN (https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7BB7313ADA-04F7-4D80-ABBA-77E753AAD002%7D); Milwaukee, WI (https://edg.epa.gov/metadata/catalog/search/resource/details.page?u

  20. Principle Paradigms Revisiting the Dublin Core 1:1 Principle

    ERIC Educational Resources Information Center

    Urban, Richard J.

    2012-01-01

    The Dublin Core "1:1 Principle" asserts that "related but conceptually different entities, for example a painting and a digital image of the painting, are described by separate metadata records" (Woodley et al., 2005). While this seems to be a simple requirement, studies of metadata quality have found that cultural heritage…

  1. Improving Scientific Metadata Interoperability And Data Discoverability using OAI-PMH

    NASA Astrophysics Data System (ADS)

    Devarakonda, Ranjeet; Palanisamy, Giri; Green, James M.; Wilson, Bruce E.

    2010-12-01

    While general-purpose search engines (such as Google or Bing) are useful for finding many things on the Internet, they are often of limited usefulness for locating Earth Science data relevant (for example) to a specific spatiotemporal extent. By contrast, tools that search repositories of structured metadata can locate relevant datasets with fairly high precision, but the search is limited to that particular repository. Federated searches (such as Z39.50) have been used, but they can be slow and their comprehensiveness can be limited by downtime in any search partner. An alternative approach to improve comprehensiveness is for a repository to harvest metadata from other repositories, possibly with limits based on subject matter or access permissions. Searches through harvested metadata can be extremely responsive, and the search tool can be customized with semantic augmentation appropriate to the community of practice being served. However, there are a number of different protocols for harvesting metadata, with some challenges for ensuring that updates are propagated and for collaborations with repositories using differing metadata standards. The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a standard that is seeing increased use as a means for exchanging structured metadata. OAI-PMH implementations must support Dublin Core as a metadata standard, with other metadata formats as optional. We have developed tools which enable our structured search tool (Mercury; http://mercury.ornl.gov) to consume metadata from OAI-PMH services in any of the metadata formats we support (Dublin Core, Darwin Core, FGDC CSDGM, GCMD DIF, EML, and ISO 19115/19137). We are also making ORNL DAAC metadata available through OAI-PMH for other metadata tools to utilize, such as the NASA Global Change Master Directory (GCMD). This paper describes Mercury's capabilities with multiple metadata formats in general and, more specifically, the results of our OAI-PMH implementations and the lessons learned. References: [1] R. Devarakonda, G. Palanisamy, B.E. Wilson, and J.M. Green, "Mercury: reusable metadata management data discovery and access system", Earth Science Informatics, vol. 3, no. 1, pp. 87-94, May 2010. [2] R. Devarakonda, G. Palanisamy, J.M. Green, B.E. Wilson, "Data sharing and retrieval using OAI-PMH", Earth Science Informatics DOI: 10.1007/s12145-010-0073-0, (2010). [3] Devarakonda, R.; Palanisamy, G.; Green, J.; Wilson, B. E. "Mercury: An Example of Effective Software Reuse for Metadata Management Data Discovery and Access", Eos Trans. AGU, 89(53), Fall Meet. Suppl., IN11A-1019 (2008).
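
    Harvesting Dublin Core records over OAI-PMH uses the protocol's standard ListRecords verb with the oai_dc metadata prefix, as in the Python sketch below (third-party requests package; the base URL is a placeholder, not an ORNL endpoint).

      # Sketch of an OAI-PMH ListRecords harvest of Dublin Core metadata.
      import requests
      import xml.etree.ElementTree as ET

      BASE_URL = "https://example.org/oai"    # placeholder OAI-PMH data provider
      NS = {"oai": "http://www.openarchives.org/OAI/2.0/",
            "dc": "http://purl.org/dc/elements/1.1/"}

      resp = requests.get(BASE_URL,
                          params={"verb": "ListRecords", "metadataPrefix": "oai_dc"},
                          timeout=60)
      resp.raise_for_status()
      root = ET.fromstring(resp.content)
      for record in root.findall(".//oai:record", NS):
          title = record.find(".//dc:title", NS)
          print(title.text if title is not None else "(no title)")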

  2. Metadata for WIS and WIGOS: GAW Profile of ISO19115 and Draft WIGOS Core Metadata Standard

    NASA Astrophysics Data System (ADS)

    Klausen, Jörg; Howe, Brian

    2014-05-01

    The World Meteorological Organization (WMO) Integrated Global Observing System (WIGOS) is a key WMO priority to underpin all WMO Programs and new initiatives such as the Global Framework for Climate Services (GFCS). The development of the WIGOS Operational Information Resource (WIR) is central to the WIGOS Framework Implementation Plan (WIGOS-IP). The WIR shall provide information on WIGOS and its observing components, as well as requirements of WMO application areas. An important aspect is the description of the observational capabilities by way of structured metadata. The Global Atmosphere Watch (GAW) is the WMO program addressing the chemical composition and selected physical properties of the atmosphere. Observational data are collected and archived by GAW World Data Centres (WDCs) and related data centres. The Expert Team on GAW WDCs (ET-WDC) has developed a profile of the ISO19115 metadata standard that is compliant with the WMO Information System (WIS) specification for the WMO Core Metadata Profile v1.3. This profile is intended to harmonize certain aspects of the documentation of observations as well as the interoperability of the WDCs. The Inter-Commission Coordination Group on WIGOS (ICG-WIGOS) has established the Task Team on WIGOS Metadata (TT-WMD), with representation from all WMO Technical Commissions and the objective to define the WIGOS Core Metadata. The result of this effort is a draft semantic standard comprising a set of metadata classes that are considered to be of critical importance for the interpretation of observations relevant to WIGOS. The purpose of the presentation is to acquaint the audience with the standard and to solicit informal feedback from experts in the various disciplines of meteorology and climatology. This feedback will help ET-WDC and TT-WMD refine the GAW metadata profile and the draft WIGOS metadata standard, thereby increasing their utility and acceptance.

  3. Digital Initiatives and Metadata Use in Thailand

    ERIC Educational Resources Information Center

    SuKantarat, Wichada

    2008-01-01

    Purpose: This paper aims to provide information about various digital initiatives in libraries in Thailand and especially use of Dublin Core metadata in cataloguing digitized objects in academic and government digital databases. Design/methodology/approach: The author began researching metadata use in Thailand in 2003 and 2004 while on sabbatical…

  4. NOAA's Data Catalog and the Federal Open Data Policy

    NASA Astrophysics Data System (ADS)

    Wengren, M. J.; de la Beaujardiere, J.

    2014-12-01

    The 2013 Open Data Policy Presidential Directive requires Federal agencies to create and maintain a 'public data listing' that includes all agency data that is currently or will be made publicly available in the future. The directive requires the use of machine-readable and open formats that make use of 'common core' and extensible metadata formats according to the best practices published in an online repository called 'Project Open Data', to use open licenses where possible, and to adhere to existing metadata and other technology standards to promote interoperability. In order to meet the requirements of the Open Data Policy, the National Oceanic and Atmospheric Administration (NOAA) has implemented an online data catalog that combines metadata from all subsidiary NOAA metadata catalogs into a single master inventory. The NOAA Data Catalog is available to the public for search and discovery, providing access to the NOAA master data inventory through multiple means, including web-based text search, an OGC CS-W endpoint, and a native Application Programming Interface (API) for programmatic query. It generates on a daily basis the Project Open Data JavaScript Object Notation (JSON) file required for compliance with the Presidential directive. The Data Catalog is based on the open source Comprehensive Knowledge Archive Network (CKAN) software and runs on the Amazon Federal GeoCloud. This presentation will cover topics including mappings of existing metadata in standard formats (FGDC-CSDGM and ISO 19115 XML) to the Project Open Data JSON metadata schema, representation of metadata elements within the catalog, and compatible metadata sources used to feed the catalog, including Web Accessible Folder (WAF), Catalog Services for the Web (CS-W), and Esri ArcGIS.com. It will also discuss related open source technologies that can be used together to build a spatial data infrastructure compliant with the Open Data Policy.
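
    The mapping step described above can be pictured with a minimal sketch that serializes a harvested record into a Project Open Data (data.json) dataset entry. The record values are invented, and the field set is a plausible subset of the schema rather than NOAA's actual mapping.

    ```python
    # Illustrative mapping from a parsed metadata record to a Project Open Data
    # (data.json) dataset entry. Values are invented examples; consult the
    # Project Open Data schema for the authoritative field list.
    import json

    def to_pod_entry(rec):
        """Map a dict of harvested metadata fields to a data.json dataset entry."""
        return {
            "title": rec["title"],
            "description": rec["abstract"],
            "keyword": rec.get("keywords", []),
            "modified": rec["modified"],            # ISO 8601 date
            "publisher": {"@type": "org:Organization", "name": rec["publisher"]},
            "contactPoint": {
                "@type": "vcard:Contact",
                "fn": rec["contact_name"],
                "hasEmail": "mailto:" + rec["contact_email"],
            },
            "identifier": rec["identifier"],
            "accessLevel": "public",
        }

    catalog = {
        "conformsTo": "https://project-open-data.cio.gov/v1.1/schema",
        "dataset": [to_pod_entry({
            "title": "Example sea surface temperature grid",
            "abstract": "Invented example record.",
            "keywords": ["oceans", "temperature"],
            "modified": "2014-06-01",
            "publisher": "Example Data Center",
            "contact_name": "Data Manager",
            "contact_email": "data@example.org",
            "identifier": "example-dataset-001",
        })],
    }
    print(json.dumps(catalog, indent=2))
    ```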

  5. Assessing Metadata Quality of a Federally Sponsored Health Data Repository.

    PubMed

    Marc, David T; Beattie, James; Herasevich, Vitaly; Gatewood, Laël; Zhang, Rui

    2016-01-01

    The U.S. Federal Government developed HealthData.gov to disseminate healthcare datasets to the public. Metadata is provided for each dataset and is the sole source of information for finding and retrieving data. This study employed automated quality assessments of the HealthData.gov metadata published from 2012 to 2014 to measure completeness, accuracy, and consistency of applying standards. The results demonstrated that metadata published in earlier years had lower completeness, accuracy, and consistency. Also, metadata that underwent modifications following their original creation were of higher quality. HealthData.gov did not uniformly apply the Dublin Core Metadata Initiative standard, a widely accepted metadata standard, to the metadata. These findings suggested that the HealthData.gov metadata suffered from quality issues, particularly related to information that wasn't frequently updated. The results supported the need for policies to standardize metadata and contributed to the development of automated measures of metadata quality.
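
    The kind of automated completeness assessment described above can be sketched as a simple required-field ratio; the field list below is an assumption for illustration, not the set of fields actually scored in the study.

    ```python
    # Sketch of an automated completeness check over catalog metadata records.
    # The required-field list is an illustrative assumption.
    REQUIRED_FIELDS = ["title", "description", "keyword", "modified", "publisher", "license"]

    def completeness(record):
        """Fraction of required fields that are present and non-empty."""
        filled = sum(1 for f in REQUIRED_FIELDS if record.get(f))
        return filled / len(REQUIRED_FIELDS)

    records = [
        {"title": "Hospital readmissions", "description": "example", "keyword": ["quality"]},
        {"title": "Vaccination rates", "description": "example", "keyword": ["immunization"],
         "modified": "2014-03-01", "publisher": "Example Agency", "license": "public domain"},
    ]
    for rec in records:
        print(rec["title"], round(completeness(rec), 2))
    ```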

  6. Assessing Metadata Quality of a Federally Sponsored Health Data Repository

    PubMed Central

    Marc, David T.; Beattie, James; Herasevich, Vitaly; Gatewood, Laël; Zhang, Rui

    2016-01-01

    The U.S. Federal Government developed HealthData.gov to disseminate healthcare datasets to the public. Metadata is provided for each dataset and is the sole source of information for finding and retrieving data. This study employed automated quality assessments of the HealthData.gov metadata published from 2012 to 2014 to measure completeness, accuracy, and consistency of applying standards. The results demonstrated that metadata published in earlier years had lower completeness, accuracy, and consistency. Also, metadata that underwent modifications following their original creation were of higher quality. HealthData.gov did not uniformly apply the Dublin Core Metadata Initiative standard, a widely accepted metadata standard, to the metadata. These findings suggested that the HealthData.gov metadata suffered from quality issues, particularly related to information that wasn’t frequently updated. The results supported the need for policies to standardize metadata and contributed to the development of automated measures of metadata quality. PMID:28269883

  7. Fast and Accurate Metadata Authoring Using Ontology-Based Recommendations.

    PubMed

    Martínez-Romero, Marcos; O'Connor, Martin J; Shankar, Ravi D; Panahiazar, Maryam; Willrett, Debra; Egyedi, Attila L; Gevaert, Olivier; Graybeal, John; Musen, Mark A

    2017-01-01

    In biomedicine, high-quality metadata are crucial for finding experimental datasets, for understanding how experiments were performed, and for reproducing those experiments. Despite the recent focus on metadata, the quality of metadata available in public repositories continues to be extremely poor. A key difficulty is that the typical metadata acquisition process is time-consuming and error prone, with weak or nonexistent support for linking metadata to ontologies. There is a pressing need for methods and tools to speed up the metadata acquisition process and to increase the quality of metadata that are entered. In this paper, we describe a methodology and set of associated tools that we developed to address this challenge. A core component of this approach is a value recommendation framework that uses analysis of previously entered metadata and ontology-based metadata specifications to help users rapidly and accurately enter their metadata. We performed an initial evaluation of this approach using metadata from a public metadata repository.
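
    A much-simplified sketch of the value-recommendation idea is shown below: candidate values are ranked by how often they were previously entered, optionally restricted to a controlled term list. It illustrates the concept only and is not the authors' framework.

    ```python
    # Simplified value-recommendation sketch: rank candidate field values by how
    # often they appeared in previously entered metadata, optionally filtered to
    # a controlled list of ontology terms. Illustration only, not the authors'
    # implementation.
    from collections import Counter

    def recommend(field, previous_records, allowed_terms=None, top_n=5):
        counts = Counter(
            rec[field] for rec in previous_records
            if field in rec and (allowed_terms is None or rec[field] in allowed_terms)
        )
        return [value for value, _ in counts.most_common(top_n)]

    previous = [
        {"organism": "Homo sapiens", "tissue": "liver"},
        {"organism": "Homo sapiens", "tissue": "brain"},
        {"organism": "Mus musculus", "tissue": "liver"},
    ]
    print(recommend("organism", previous))   # ['Homo sapiens', 'Mus musculus']
    ```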

  8. Fast and Accurate Metadata Authoring Using Ontology-Based Recommendations

    PubMed Central

    Martínez-Romero, Marcos; O’Connor, Martin J.; Shankar, Ravi D.; Panahiazar, Maryam; Willrett, Debra; Egyedi, Attila L.; Gevaert, Olivier; Graybeal, John; Musen, Mark A.

    2017-01-01

    In biomedicine, high-quality metadata are crucial for finding experimental datasets, for understanding how experiments were performed, and for reproducing those experiments. Despite the recent focus on metadata, the quality of metadata available in public repositories continues to be extremely poor. A key difficulty is that the typical metadata acquisition process is time-consuming and error prone, with weak or nonexistent support for linking metadata to ontologies. There is a pressing need for methods and tools to speed up the metadata acquisition process and to increase the quality of metadata that are entered. In this paper, we describe a methodology and set of associated tools that we developed to address this challenge. A core component of this approach is a value recommendation framework that uses analysis of previously entered metadata and ontology-based metadata specifications to help users rapidly and accurately enter their metadata. We performed an initial evaluation of this approach using metadata from a public metadata repository. PMID:29854196

  9. An Examination of the Adoption of Preservation Metadata in Cultural Heritage Institutions: An Exploratory Study Using Diffusion of Innovations Theory

    ERIC Educational Resources Information Center

    Alemneh, Daniel Gelaw

    2009-01-01

    Digital preservation is a significant challenge for cultural heritage institutions and other repositories of digital information resources. Recognizing the critical role of metadata in any successful digital preservation strategy, the Preservation Metadata Implementation Strategies (PREMIS) has been extremely influential in providing a "core" set…

  10. Evolving Metadata in NASA Earth Science Data Systems

    NASA Astrophysics Data System (ADS)

    Mitchell, A.; Cechini, M. F.; Walter, J.

    2011-12-01

    NASA's Earth Observing System (EOS) is a coordinated series of satellites for long-term global observations. NASA's Earth Observing System Data and Information System (EOSDIS) is a petabyte-scale archive of environmental data that supports global climate change research by providing end-to-end services from EOS instrument data collection to science data processing to full access to EOS and other earth science data. On a daily basis, the EOSDIS ingests, processes, archives and distributes over 3 terabytes of data from NASA's Earth Science missions, representing over 3500 data products spanning a variety of science disciplines. EOSDIS currently comprises 12 discipline-specific data centers that are collocated with centers of science discipline expertise. Metadata is used in all aspects of NASA's Earth Science data lifecycle from the initial measurement gathering to the accessing of data products. Missions use metadata in their science data products when describing information such as the instrument/sensor, operational plan, and geographic region. Acting as the curator of the data products, data centers employ metadata for preservation, access and manipulation of data. EOSDIS provides a centralized metadata repository called the Earth Observing System (EOS) ClearingHouse (ECHO) for data discovery and access via a service-oriented architecture (SOA) between data centers and science data users. ECHO receives inventory metadata from data centers, which generate metadata files that comply with the ECHO Metadata Model. NASA's Earth Science Data and Information System (ESDIS) Project established a Tiger Team to study and make recommendations regarding the adoption of the international metadata standard ISO 19115 in EOSDIS. The result was a technical report recommending an evolution of NASA data systems towards a consistent application of ISO 19115 and related standards, including the creation of a NASA-specific convention for core ISO 19115 elements. Part of NASA's effort to continually evolve its data systems led ECHO to enhance the method by which it receives inventory metadata from the data centers to allow for multiple metadata formats including ISO 19115. ECHO's metadata model will also be mapped to the NASA-specific convention for ingesting science metadata into the ECHO system. As NASA's new Earth Science missions and data centers are migrating to the ISO 19115 standards, EOSDIS is developing metadata management resources to assist in reading, writing, and parsing ISO 19115-compliant metadata. To foster interoperability with other agencies and international partners, NASA is working to ensure that a common ISO 19115 convention is developed, enhancing data sharing capabilities and other data analysis initiatives. NASA is also investigating the use of ISO 19115 standards to encode data quality, lineage and provenance with stored values. A common metadata standard across NASA's Earth Science data systems promotes interoperability, enhances data utilization and removes levels of uncertainty found in data products.
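
    For readers unfamiliar with ISO 19115 records, the sketch below pulls two common elements (file identifier and title) out of an ISO 19115/19139 XML document using the standard gmd/gco namespaces; the file path is a placeholder and real records vary in structure.

    ```python
    # Minimal sketch: read the file identifier and dataset title from an
    # ISO 19115/19139 XML record. The file path is a placeholder.
    import xml.etree.ElementTree as ET

    NS = {
        "gmd": "http://www.isotc211.org/2005/gmd",
        "gco": "http://www.isotc211.org/2005/gco",
    }

    def read_core_elements(path):
        root = ET.parse(path).getroot()
        file_id = root.findtext("gmd:fileIdentifier/gco:CharacterString", namespaces=NS)
        # First gmd:title in the record is typically the citation title.
        title = root.findtext(".//gmd:title/gco:CharacterString", namespaces=NS)
        return {"fileIdentifier": file_id, "title": title}

    # Example usage (placeholder file):
    # print(read_core_elements("example_iso19115_record.xml"))
    ```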

  11. A novel metadata management model to capture consent for record linkage in longitudinal research studies.

    PubMed

    McMahon, Christiana; Denaxas, Spiros

    2017-11-06

    Informed consent is an important feature of longitudinal research studies as it enables the linking of the baseline participant information with administrative data. The lack of standardized models to capture consent elements can lead to substantial challenges. A structured approach to capturing consent-related metadata can address these. Our objectives were to: a) explore the state-of-the-art for recording consent; b) identify key elements of consent required for record linkage; and c) create and evaluate a novel metadata management model to capture consent-related metadata. The main methodological components of our work were: a) a systematic literature review and qualitative analysis of consent forms; and b) the development and evaluation of a novel metadata model. We qualitatively analyzed 61 manuscripts and 30 consent forms. We extracted data elements related to obtaining consent for linkage. We created a novel metadata management model for consent and evaluated it by comparison with the existing standards and by iteratively applying it to case studies. The developed model can facilitate the standardized recording of consent for linkage in longitudinal research studies and enable the linkage of external participant data. Furthermore, it can provide a structured way of recording consent-related metadata and facilitate the harmonization and streamlining of processes.
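
    A minimal sketch of what consent-for-linkage metadata might look like as a structured record is given below; the field names are assumptions for illustration and do not reproduce the published model.

    ```python
    # Illustrative structure for consent-related metadata supporting record
    # linkage. Field names are assumptions, not the authors' published model.
    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class ConsentRecord:
        participant_id: str
        consent_given: bool
        consent_date: str                      # ISO 8601 date
        consent_version: str                   # version of the consent form used
        linkage_permitted_sources: List[str] = field(default_factory=list)
        withdrawal_date: Optional[str] = None  # set if consent was later withdrawn

        def linkage_allowed(self, source: str) -> bool:
            """Check whether linkage to a given administrative data source is permitted."""
            return (self.consent_given
                    and self.withdrawal_date is None
                    and source in self.linkage_permitted_sources)

    record = ConsentRecord("P-0001", True, "2015-02-10", "v2.1",
                           ["hospital_episodes", "mortality_register"])
    print(record.linkage_allowed("hospital_episodes"))   # True
    ```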

  12. Using a linked data approach to aid development of a metadata portal to support Marine Strategy Framework Directive (MSFD) implementation

    NASA Astrophysics Data System (ADS)

    Wood, Chris

    2016-04-01

    Under the Marine Strategy Framework Directive (MSFD), EU Member States are mandated to achieve or maintain 'Good Environmental Status' (GES) in their marine areas by 2020, through a series of Programmes of Measures (PoMs). The Celtic Seas Partnership (CSP), an EU LIFE+ project, aims to support policy makers, special-interest groups, users of the marine environment, and other interested stakeholders on MSFD implementation in the Celtic Seas geographical area. As part of this support, a metadata portal has been built to provide a signposting service to datasets that are relevant to MSFD within the Celtic Seas. To ensure that the metadata has the widest possible reach, a linked data approach was employed to construct the database. Although the metadata are stored in a traditional RDBMS, the metadata are exposed as linked data via the D2RQ platform, allowing virtual RDF graphs to be generated. SPARQL queries can be executed against the endpoint, allowing any user to manipulate the metadata. D2RQ's mapping language, based on Turtle, was used to map a wide range of relevant ontologies to the metadata (e.g. The Provenance Ontology (prov-o), Ocean Data Ontology (odo), Dublin Core Elements and Terms (dc & dcterms), Friend of a Friend (foaf), and Geospatial ontologies (geo)), allowing users to browse the metadata, either via SPARQL queries or by using D2RQ's HTML interface. The metadata were further enhanced by mapping relevant parameters to the NERC Vocabulary Server, itself built on a SPARQL endpoint. Additionally, a custom web front-end was built to enable users to browse the metadata and express queries through an intuitive graphical user interface that requires no prior knowledge of SPARQL. As well as providing means to browse the data via MSFD-related parameters (Descriptor, Criteria, and Indicator), the metadata records include the dataset's country of origin, the list of organisations involved in the management of the data, and links to any relevant INSPIRE-compliant services relating to the dataset. The web front-end therefore enables users to effectively filter, sort, or search the metadata. As the MSFD timeline requires Member States to review their progress on achieving or maintaining GES every six years, the timely development of this metadata portal will not only aid interested stakeholders in understanding how member states are meeting their targets, but also shows how linked data can be used effectively to support policy makers and associated legislative bodies.
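
    Querying an endpoint like the one described follows the standard SPARQL protocol; the sketch below issues a simple title query over HTTP. The endpoint URL and the dcterms-based pattern are placeholders, since the portal's actual graph layout is not reproduced here.

    ```python
    # Sketch of querying a SPARQL endpoint over the standard SPARQL protocol.
    # The endpoint URL and the dcterms-based query are placeholder assumptions.
    import requests

    ENDPOINT = "https://example.org/sparql"   # placeholder endpoint

    QUERY = """
    PREFIX dcterms: <http://purl.org/dc/terms/>
    SELECT ?dataset ?title WHERE {
      ?dataset dcterms:title ?title .
    } LIMIT 10
    """

    resp = requests.get(
        ENDPOINT,
        params={"query": QUERY},
        headers={"Accept": "application/sparql-results+json"},
        timeout=30,
    )
    for binding in resp.json()["results"]["bindings"]:
        print(binding["dataset"]["value"], "-", binding["title"]["value"])
    ```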

  13. METADATA REGISTRY, ISO/IEC 11179

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pon, R K; Buttler, D J

    2008-01-03

    ISO/IEC-11179 is an international standard that documents the standardization and registration of metadata to make data understandable and shareable. This standardization and registration allows for easier locating, retrieving, and transmitting of data from disparate databases. The standard defines how metadata are conceptually modeled and how they are shared among parties, but does not define how data is physically represented as bits and bytes. The standard consists of six parts. Part 1 provides a high-level overview of the standard and defines the basic element of a metadata registry - a data element. Part 2 defines the procedures for registering classification schemes and classifying administered items in a metadata registry (MDR). Part 3 specifies the structure of an MDR. Part 4 specifies requirements and recommendations for constructing definitions for data and metadata. Part 5 defines how administered items are named and identified. Part 6 defines how administered items are registered and assigned an identifier.
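
    The Part 1 notion of a data element with a definition and a value domain can be sketched as a small data structure; this is an illustration of the concepts only, not a conforming ISO/IEC 11179 registry.

    ```python
    # Minimal illustration of the ISO/IEC 11179 notion of a registered data
    # element (name, definition, identifier, value domain). A conceptual sketch,
    # not a conforming metadata registry implementation.
    from dataclasses import dataclass
    from typing import List

    @dataclass
    class ValueDomain:
        datatype: str                  # e.g. "string", "integer"
        permissible_values: List[str]  # empty list means unrestricted

    @dataclass
    class DataElement:
        identifier: str                # registry-assigned identifier (invented here)
        name: str
        definition: str
        value_domain: ValueDomain
        registration_status: str = "Recorded"

    sex_code = DataElement(
        identifier="urn:example:de:0001",
        name="Person Sex Code",
        definition="Code describing the sex of a person.",
        value_domain=ValueDomain("string", ["F", "M", "U"]),
    )
    print(sex_code.name, sex_code.value_domain.permissible_values)
    ```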

  14. Metadata Design in the New PDS4 Standards - Something for Everybody

    NASA Astrophysics Data System (ADS)

    Raugh, Anne C.; Hughes, John S.

    2015-11-01

    The Planetary Data System (PDS) archives, supports, and distributes data of diverse targets, from diverse sources, to diverse users. One of the core problems addressed by the PDS4 data standard redesign was that of metadata - how to accommodate the increasingly sophisticated demands of search interfaces, analytical software, and observational documentation into label standards without imposing limits and constraints that would impinge on the quality or quantity of metadata that any particular observer or team could supply. And yet, as an archive, PDS must have detailed documentation for the metadata in the labels it supports, or the institutional knowledge encoded into those attributes will be lost - putting the data at risk. The PDS4 metadata solution is based on a three-step approach. First, it is built on two key ISO standards: ISO 11179 "Information Technology - Metadata Registries", which provides a common framework and vocabulary for defining metadata attributes; and ISO 14721 "Space Data and Information Transfer Systems - Open Archival Information System (OAIS) Reference Model", which provides the framework for the information architecture that enforces the object-oriented paradigm for metadata modeling. Second, PDS has defined a hierarchical system that allows it to divide its metadata universe into namespaces ("data dictionaries", conceptually), and more importantly to delegate stewardship for a single namespace to a local authority. This means that a mission can develop its own data model with a high degree of autonomy and effectively extend the PDS model to accommodate its own metadata needs within the common ISO 11179 framework. Finally, within a single namespace - even the core PDS namespace - existing metadata structures can be extended and new structures added to the model as new needs are identified. This poster illustrates the PDS4 approach to metadata management and highlights the expected return on the development investment for PDS, users and data preparers.

  15. A Window to the World: Lessons Learned from NASA's Collaborative Metadata Curation Effort

    NASA Astrophysics Data System (ADS)

    Bugbee, K.; Dixon, V.; Baynes, K.; Shum, D.; le Roux, J.; Ramachandran, R.

    2017-12-01

    Well-written descriptive metadata adds value to data by making it easier to discover and increases data use by providing context on the appropriateness of use. While many data centers acknowledge the importance of correct, consistent and complete metadata, allocating resources to curate existing metadata is often difficult. To lower resource costs, many data centers seek guidance on best practices for curating metadata but struggle to identify those recommendations. In order to assist data centers in curating metadata and also to develop best practices for creating and maintaining metadata, NASA has formed a collaborative effort to improve the Earth Observing System Data and Information System (EOSDIS) metadata in the Common Metadata Repository (CMR). This effort has taken significant steps in building consensus around metadata curation best practices. However, this effort has also revealed gaps in EOSDIS enterprise policies and procedures within the core metadata curation task. This presentation will explore the mechanisms used for building consensus on metadata curation, the gaps identified in policies and procedures, the lessons learned from collaborating with both the data centers and metadata curation teams, and the proposed next steps for the future.
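
    Curation work of this kind typically starts by pulling collection records from the CMR search API; a hedged sketch is below. The endpoint and response layout reflect the public CMR search interface as the author understands it, so both should be treated as assumptions and checked against current CMR documentation.

    ```python
    # Sketch of a keyword search against the Common Metadata Repository (CMR)
    # search API. Endpoint and response layout are assumptions to verify against
    # the current CMR documentation.
    import requests

    CMR_COLLECTIONS = "https://cmr.earthdata.nasa.gov/search/collections.json"

    resp = requests.get(CMR_COLLECTIONS,
                        params={"keyword": "sea surface temperature", "page_size": 5},
                        timeout=30)
    resp.raise_for_status()
    for entry in resp.json().get("feed", {}).get("entry", []):
        print(entry.get("id"), "-", entry.get("title"))
    ```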

  16. Structured representation for core elements of common clinical decision support interventions to facilitate knowledge sharing.

    PubMed

    Zhou, Li; Hongsermeier, Tonya; Boxwala, Aziz; Lewis, Janet; Kawamoto, Kensaku; Maviglia, Saverio; Gentile, Douglas; Teich, Jonathan M; Rocha, Roberto; Bell, Douglas; Middleton, Blackford

    2013-01-01

    At present, there are no widely accepted, standard approaches for representing computer-based clinical decision support (CDS) intervention types and their structural components. This study aimed to identify key requirements for the representation of five widely utilized CDS intervention types: alerts and reminders, order sets, infobuttons, documentation templates/forms, and relevant data presentation. An XML schema was proposed for representing these interventions and their core structural elements (e.g., general metadata, applicable clinical scenarios, CDS inputs, CDS outputs, and CDS logic) in a shareable manner. The schema was validated by building CDS artifacts for 22 different interventions, targeted toward guidelines and clinical conditions called for in the 2011 Meaningful Use criteria. Custom style sheets were developed to render the XML files in human-readable form. The CDS knowledge artifacts were shared via a public web portal. Our experience also identifies gaps in existing standards and informs future development of standards for CDS knowledge representation and sharing.
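
    The structural components listed above can be pictured with an illustrative XML fragment for a single alert-type intervention; the element names are invented for this sketch and do not reproduce the study's published schema.

    ```python
    # Illustrative XML for one CDS intervention, organised around the structural
    # components named above (general metadata, clinical scenario, inputs,
    # outputs, logic). Element names are invented for this sketch only.
    import xml.etree.ElementTree as ET

    intervention = ET.Element("cdsIntervention", type="alert")

    metadata = ET.SubElement(intervention, "generalMetadata")
    ET.SubElement(metadata, "title").text = "Warfarin-NSAID interaction alert"
    ET.SubElement(metadata, "version").text = "1.0"

    ET.SubElement(intervention, "clinicalScenario").text = "Medication order entry"

    inputs = ET.SubElement(intervention, "inputs")
    ET.SubElement(inputs, "dataElement").text = "Active medication list"

    outputs = ET.SubElement(intervention, "outputs")
    ET.SubElement(outputs, "message").text = "Potential interaction: warfarin and NSAID."

    ET.SubElement(intervention, "logic").text = (
        "IF order contains NSAID AND active medications contain warfarin THEN alert"
    )

    print(ET.tostring(intervention, encoding="unicode"))
    ```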

  17. Metadata Wizard: an easy-to-use tool for creating FGDC-CSDGM metadata for geospatial datasets in ESRI ArcGIS Desktop

    USGS Publications Warehouse

    Ignizio, Drew A.; O'Donnell, Michael S.; Talbert, Colin B.

    2014-01-01

    Creating compliant metadata for scientific data products is mandated for all federal Geographic Information Systems professionals and is a best practice for members of the geospatial data community. However, the complexity of The Federal Geographic Data Committee’s Content Standards for Digital Geospatial Metadata, the limited availability of easy-to-use tools, and recent changes in the ESRI software environment continue to make metadata creation a challenge. Staff at the U.S. Geological Survey Fort Collins Science Center have developed a Python toolbox for ESRI ArcGIS Desktop to facilitate a semi-automated workflow to create and update metadata records in ESRI’s 10.x software. The U.S. Geological Survey Metadata Wizard tool automatically populates several metadata elements: the spatial reference, spatial extent, geospatial presentation format, vector feature count or raster column/row count, native system/processing environment, and the metadata creation date. Once the software auto-populates these elements, users can easily add attribute definitions and other relevant information in a simple Graphical User Interface. The tool, which offers a simple design free of esoteric metadata language, has the potential to save many government and non-government organizations a significant amount of time and costs by facilitating the development of metadata compliant with The Federal Geographic Data Committee’s Content Standards for Digital Geospatial Metadata for ESRI software users. A working version of the tool is now available for ESRI ArcGIS Desktop, versions 10.0, 10.1, and 10.2 (downloadable at http://www.sciencebase.gov/metadatawizard).

  18. Building a semantic web-based metadata repository for facilitating detailed clinical modeling in cancer genome studies.

    PubMed

    Sharma, Deepak K; Solbrig, Harold R; Tao, Cui; Weng, Chunhua; Chute, Christopher G; Jiang, Guoqian

    2017-06-05

    Detailed Clinical Models (DCMs) have been regarded as the basis for retaining computable meaning when data are exchanged between heterogeneous computer systems. To better support clinical cancer data capturing and reporting, there is an emerging need to develop informatics solutions for standards-based clinical models in cancer study domains. The objective of the study is to develop and evaluate a cancer genome study metadata management system that serves as a key infrastructure in supporting clinical information modeling in cancer genome study domains. We leveraged a Semantic Web-based metadata repository enhanced with both the ISO 11179 metadata standard and the Clinical Information Modeling Initiative (CIMI) Reference Model. We used the common data elements (CDEs) defined in The Cancer Genome Atlas (TCGA) data dictionary, and extracted the metadata of the CDEs using the NCI Cancer Data Standards Repository (caDSR) CDE dataset rendered in the Resource Description Framework (RDF). The ITEM/ITEM_GROUP pattern defined in the latest CIMI Reference Model is used to represent reusable model elements (mini-Archetypes). We produced a metadata repository with 38 clinical cancer genome study domains, comprising a rich collection of mini-Archetype pattern instances. We performed a case study of the domain "clinical pharmaceutical" in the TCGA data dictionary and demonstrated that the enriched data elements in the metadata repository are very useful in support of building detailed clinical models. Our informatics approach leveraging Semantic Web technologies provides an effective way to build a CIMI-compliant metadata repository that would facilitate detailed clinical modeling to support use cases beyond TCGA in clinical cancer study domains.
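
    Once CDE metadata are rendered in RDF, they can be queried with standard tooling; the sketch below loads a Turtle file and runs a SPARQL query with rdflib. The file name and the dcterms-based query are placeholders, since the actual caDSR/TCGA vocabulary is not reproduced here.

    ```python
    # Sketch of querying an RDF rendering of common data element (CDE) metadata
    # with rdflib. The input file and dcterms-based query are placeholders; the
    # actual caDSR/TCGA RDF uses its own vocabulary.
    from rdflib import Graph

    g = Graph()
    g.parse("cde_metadata.ttl", format="turtle")   # placeholder file

    QUERY = """
    PREFIX dcterms: <http://purl.org/dc/terms/>
    SELECT ?cde ?label WHERE {
      ?cde dcterms:title ?label .
    } LIMIT 10
    """

    for cde, label in g.query(QUERY):
        print(cde, "-", label)
    ```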

  19. Energize New Mexico - Integration of Diverse Energy-Related Research Data into an Interoperable Geospatial Infrastructure and National Data Repositories

    NASA Astrophysics Data System (ADS)

    Hudspeth, W. B.; Barrett, H.; Diller, S.; Valentin, G.

    2016-12-01

    Energize is New Mexico's Experimental Program to Stimulate Competitive Research (NM EPSCoR), funded by the NSF with a focus on building capacity to conduct scientific research. Energize New Mexico leverages the work of faculty and students from NM universities and colleges to provide the tools necessary for a quantitative, science-driven discussion of the state's water policy options and to realize New Mexico's potential for sustainable energy development. This presentation discusses the architectural details of NM EPSCoR's collaborative data management system, GSToRE, and how New Mexico researchers use it to share and analyze diverse research data, with the goal of attaining sustainable energy development in the state. The Earth Data Analysis Center (EDAC) at The University of New Mexico leads the development of computational interoperability capacity that allows the wide use and sharing of energy-related data among NM EPSCoR researchers. Data from a variety of research disciplines is stored and maintained in EDAC's Geographic Storage, Transformation and Retrieval Engine (GSToRE), a distributed platform for large-scale vector and raster data discovery, subsetting, and delivery via Web services that are based on Open Geospatial Consortium (OGC) and REST Web-service standards. Researchers upload and register scientific datasets using a front-end client that collects the critical metadata. In addition, researchers have the option to register their datasets with DataONE, a national, community-driven project that provides access to data across multiple member repositories. The GSToRE platform maintains a searchable, core collection of metadata elements that can be used to deliver metadata in multiple formats, including ISO 19115-2/19139 and FGDC CSDGM. Stored metadata elements also permit the platform to automate the registration of Energize datasets into DataONE, once the datasets are approved for release to the public.

  20. Serious Games for Health: The Potential of Metadata.

    PubMed

    Göbel, Stefan; Maddison, Ralph

    2017-02-01

    Numerous serious games and health games exist, either as commercial products (typically with a focus on entertaining a broad user group) or smaller games and game prototypes, often resulting from research projects (typically tailored to a smaller user group with a specific health characteristic). A major drawback of existing health games is that they are not very well described and attributed with (machine-readable, quantitative, and qualitative) metadata such as the characterizing goal of the game, the target user group, or expected health effects well proven in scientific studies. This makes it difficult or even impossible for end users to find and select the most appropriate game for a specific situation (e.g., health needs). Therefore, the aim of this article was to motivate the need for, and the potential benefit of, metadata for the description and retrieval of health games, and to describe a descriptive model for the qualitative description of games for health. It was not the aim of the article to describe a stable, running system (portal) for health games. This will be addressed in future work. Building on previous work toward a metadata format for serious games, a descriptive model for the formal description of games for health is introduced. For the conceptualization of this model, classification schemata of different existing health game repositories are considered. The classification schema consists of three levels: a core set of mandatory descriptive fields relevant for all games for health application areas, a detailed level with more comprehensive, optional information about the games, and a so-called extension level (level three) with specific descriptive elements relevant for dedicated health games application areas, for example, cardio training. A metadata format provides a technical framework to describe, find, and select appropriate health games matching the needs of the end user. Future steps to improve, apply, and promote the metadata format in the health games market are discussed.
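
    The three-level classification schema can be pictured as a nested record with a mandatory core, an optional detailed level, and an area-specific extension; the field names below are assumptions for the sketch, not the proposed format itself.

    ```python
    # Illustrative three-level description of a game for health: a mandatory core,
    # an optional detailed level, and an application-area extension. Field names
    # are assumptions for the sketch, not the proposed metadata format.
    game_record = {
        "core": {                       # level 1: mandatory for all games
            "title": "StepQuest",
            "goal": "increase daily physical activity",
            "target_group": "adults 40-65",
            "platform": ["Android", "iOS"],
        },
        "detailed": {                   # level 2: optional, more comprehensive
            "play_duration_minutes": 15,
            "evidence": "pilot study, n=40",
            "languages": ["en", "de"],
        },
        "extension": {                  # level 3: area-specific (e.g. cardio training)
            "area": "cardio training",
            "target_heart_rate_zone": "60-70% of HRmax",
        },
    }

    def matches(record, goal_keyword):
        """Toy retrieval check against the mandatory core description."""
        return goal_keyword.lower() in record["core"]["goal"].lower()

    print(matches(game_record, "physical activity"))   # True
    ```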

  1. PhysDoc: A Distributed Network of Physics Institutions: Collecting, Indexing, and Searching High Quality Documents by Using Harvest; The Dublin Core Metadata Initiative: Mission, Current Activities, and Future Directions; Information Services for Higher Education: A New Competitive Space; Intellectual Property Conservancies.

    ERIC Educational Resources Information Center

    Severiens, Thomas; Hohlfeld, Michael; Zimmermann, Kerstin; Hilf, Eberhard R.; von Ossietzky, Carl; Weibel, Stuart L.; Koch, Traugott; Hughes, Carol Ann; Bearman, David

    2000-01-01

    Includes four articles that discuss a variety of topics, including a distributed network of physics institution documents called PhysDoc, which harvests information from the local Web-servers of professional physics institutions; the Dublin Core metadata initiative; information services for higher education in a competitive environment; and…

  2. The Arctic Cooperative Data and Information System: Data Management Support for the NSF Arctic Research Program (Invited)

    NASA Astrophysics Data System (ADS)

    Moore, J.; Serreze, M. C.; Middleton, D.; Ramamurthy, M. K.; Yarmey, L.

    2013-12-01

    The NSF funds the Advanced Cooperative Arctic Data and Information System (ACADIS) (http://www.aoncadis.org/). It serves the growing and increasingly diverse data management needs of NSF's arctic research community. The ACADIS investigator team combines experienced data managers, curators and software engineers from the NSIDC, UCAR and NCAR. ACADIS fosters scientific synthesis and discovery by providing a secure long-term data archive to NSF investigators. The system provides discovery and access to arctic-related data from this and other archives. This paper updates the technical components of ACADIS, the implementation of best practices, the value of ACADIS to the community and the major challenges facing this archive for the future in handling the diverse data coming from NSF Arctic investigators. ACADIS provides sustainable data management, data stewardship services and leadership for the NSF Arctic research community through open data sharing, adherence to best practices and standards, capitalizing on appropriate evolving technologies, community support and engagement. ACADIS leverages other pertinent projects, capitalizing on appropriate emerging technologies and participating in emerging cyberinfrastructure initiatives. The key elements of ACADIS user services to the NSF Arctic community include: data and metadata upload; support for datasets with special requirements; metadata and documentation generation; interoperability and initiatives with other archives; and science support to investigators and the community. Providing a self-service data publishing platform requiring minimal curation oversight while maintaining rich metadata for discovery, access and preservation is challenging. Implementing metadata standards is a first step towards consistent content. The ACADIS Gateway and ADE offer users choices for data discovery and access with the clear objective of increasing discovery and use of all Arctic data, especially for analysis activities. Metadata is at the core of ACADIS activities, from capturing metadata at the point of data submission to ensuring interoperability, providing data citations, and supporting data discovery. ACADIS metadata efforts include: 1) Evolution of the ACADIS metadata profile to increase flexibility in search; 2) Documentation guidelines; and 3) Metadata standardization efforts. A major activity is now underway to ensure consistency in the metadata profile across all archived datasets. ACADIS is embarking on a critical activity to create Digital Object Identifiers (DOI) for all its holdings. The data services offered by ACADIS focus on meeting the needs of the data providers, providing dynamic search capabilities to peruse the ACADIS and related cryospheric data repositories, efficient data download and some special services including dataset reformatting and visualization. The service is built around the following key technical elements: the ACADIS Gateway, housed at NCAR, has been developed to support NSF Arctic data coming from AON and now more broadly across PLR/ARC and related archives; the Arctic Data Explorer (ADE), developed at NSIDC, is an integral service of ACADIS, bringing the rich archive from NSIDC together with catalogs from ACADIS and international partners in Arctic research; and Rosetta and the Digital Object Identifier (DOI) generation scheme are tools available to the community to help publish and utilize datasets in integration, synthesis, and publication.

  3. Metadata Creation, Management and Search System for your Scientific Data

    NASA Astrophysics Data System (ADS)

    Devarakonda, R.; Palanisamy, G.

    2012-12-01

    Mercury Search Systems is a set of tools for creating, searching, and retrieving biogeochemical metadata. The Mercury toolset provides orders-of-magnitude improvements in search speed, support for any metadata format, integration with Google Maps for spatial queries, multi-faceted search, search suggestions, support for RSS (Really Simple Syndication) delivery of search results, and enhanced customization to meet the needs of the multiple projects that use Mercury. Mercury's metadata editor provides an easy way to create metadata, and Mercury's search interface provides a single portal to search for data and information contained in disparate data management systems, each of which may use any metadata format including FGDC, ISO-19115, Dublin-Core, Darwin-Core, DIF, ECHO, and EML. Mercury harvests metadata and key data from contributing project servers distributed around the world and builds a centralized index. The search interfaces then allow the users to perform a variety of fielded, spatial, and temporal searches across these metadata sources. This centralized repository of metadata with distributed data sources provides extremely fast search results to the user, while allowing data providers to advertise the availability of their data and maintain complete control and ownership of that data. Mercury is being used in more than 14 different projects across 4 federal agencies. It was originally developed for NASA, with continuing development funded by NASA, USGS, and DOE for a consortium of projects. Mercury search won NASA's Earth Science Data Systems Software Reuse Award in 2008. References: R. Devarakonda, G. Palanisamy, B.E. Wilson, and J.M. Green, "Mercury: reusable metadata management data discovery and access system", Earth Science Informatics, vol. 3, no. 1, pp. 87-94, May 2010. R. Devarakonda, G. Palanisamy, J.M. Green, B.E. Wilson, "Data sharing and retrieval using OAI-PMH", Earth Science Informatics DOI: 10.1007/s12145-010-0073-0, (2010);

  4. Inferring Metadata for a Semantic Web Peer-to-Peer Environment

    ERIC Educational Resources Information Center

    Brase, Jan; Painter, Mark

    2004-01-01

    Learning Objects Metadata (LOM) aims at describing educational resources in order to allow better reusability and retrieval. In this article we show how additional inference rules allow us to derive additional metadata from existing ones. Additionally, using these rules as integrity constraints helps us to define the constraints on LOM elements,…

  5. Mercury- Distributed Metadata Management, Data Discovery and Access System

    NASA Astrophysics Data System (ADS)

    Palanisamy, Giri; Wilson, Bruce E.; Devarakonda, Ranjeet; Green, James M.

    2007-12-01

    Mercury is a federated metadata harvesting, search and retrieval tool based on both open source and ORNL-developed software. It was originally developed for NASA, and the Mercury development consortium now includes funding from NASA, USGS, and DOE. Mercury supports various metadata standards including XML, Z39.50, FGDC, Dublin-Core, Darwin-Core, EML, and ISO-19115 (under development). Mercury provides a single portal to information contained in disparate data management systems. It collects metadata and key data from contributing project servers distributed around the world and builds a centralized index. The Mercury search interfaces then allow the users to perform simple, fielded, spatial and temporal searches across these metadata sources. This centralized repository of metadata with distributed data sources provides extremely fast search results to the user, while allowing data providers to advertise the availability of their data and maintain complete control and ownership of that data. Mercury supports various projects including: ORNL DAAC, NBII, DADDI, LBA, NARSTO, CDIAC, OCEAN, I3N, IAI, ESIP and ARM. The new Mercury system is based on a Service Oriented Architecture and supports various services such as Thesaurus Service, Gazetteer Web Service and UDDI Directory Services. This system also provides various search services including: RSS, Geo-RSS, OpenSearch, Web Services and Portlets. Other features include: Filtering and dynamic sorting of search results, book-markable search results, save, retrieve, and modify search criteria.

  6. Associating uncertainty with datasets using Linked Data and allowing propagation via provenance chains

    NASA Astrophysics Data System (ADS)

    Car, Nicholas; Cox, Simon; Fitch, Peter

    2015-04-01

    With earth-science datasets increasingly being published to enable re-use in projects disassociated from the original data acquisition or generation, there is an urgent need for associated metadata to be connected, in order to guide their application. In particular, provenance traces should support the evaluation of data quality and reliability. However, while standards for describing provenance are emerging (e.g. PROV-O), these do not include the necessary statistical descriptors and confidence assessments. UncertML has a mature conceptual model that may be used to record uncertainty metadata. However, by itself UncertML does not support the representation of uncertainty of multi-part datasets, and provides no direct way of associating the uncertainty information - metadata in relation to a dataset - with dataset objects. We present a method to address both these issues by combining UncertML with PROV-O, and delivering the resulting uncertainty-enriched provenance traces through the Linked Data API. UncertProv extends the PROV-O provenance ontology with an RDF formulation of the UncertML conceptual model elements, and adds further elements to support uncertainty representation without a conceptual model and the integration of UncertML through links to documents. The Linked Data API provides a systematic way of navigating from dataset objects to their UncertProv metadata and back again. The Linked Data API's 'views' capability enables access to UncertML and non-UncertML uncertainty metadata representations for a dataset. With this approach, it is possible to access and navigate the uncertainty metadata associated with a published dataset using standard semantic web tools, such as SPARQL queries. Where the uncertainty data follows the UncertML model it can be automatically interpreted and may also support automatic uncertainty propagation. Repositories wishing to enable uncertainty propagation for all datasets must ensure that all elements that are associated with uncertainty (PROV-O Entity and Activity classes) have UncertML elements recorded. This methodology is intentionally flexible to allow uncertainty metadata in many forms, not limited to UncertML. While the more formal representation of uncertainty metadata is desirable (using UncertProv elements to implement the UncertML conceptual model), this will not always be possible, and any uncertainty data stored will be better than none. Since the UncertProv ontology contains a superset of UncertML elements to facilitate the representation of non-UncertML uncertainty data, it could easily be extended to include other formal uncertainty conceptual models, thus allowing non-UncertML propagation calculations.

  7. Reviving legacy clay mineralogy data and metadata through the IEDA-CCNY Data Internship Program

    NASA Astrophysics Data System (ADS)

    Palumbo, R. V.; Randel, C.; Ismail, A.; Block, K. A.; Cai, Y.; Carter, M.; Hemming, S. R.; Lehnert, K.

    2016-12-01

    Reconstruction of past climate and ocean circulation using ocean sediment cores relies on the use of multiple climate proxies measured on well-studied cores. Preserving all the information collected on a sediment core is crucial for the success of future studies using these unique and important samples. Clay mineralogy is a powerful tool to study weathering processes and sedimentary provenance. In his pioneering dissertation, Pierre Biscaye (1964, Yale University) established the X-Ray Diffraction (XRD) method for quantitative clay mineralogy analyses in ocean sediments and presented data for 500 core-top samples throughout the Atlantic Ocean and its neighboring seas. Unfortunately, the data only exists in analog format, which has discouraged scientists from reusing the data, apart from replication of the published maps. Archiving and preserving this dataset and making it publicly available in a digital format, linked with the metadata from the core repository will allow the scientific community to use these data to generate new findings. Under the supervision of Sidney Hemming and members of the Interdisciplinary Earth Data Alliance (IEDA) team, IEDA-CCNY interns digitized the data and metadata from Biscaye's dissertation and linked them with additional sample metadata using IGSN (International Geo-Sample Number). After compilation and proper documentation of the dataset, it was published in the EarthChem Library where the dataset will be openly accessible, and citable with a persistent DOI (Digital Object Identifier). During this internship, the students read peer-reviewed articles, interacted with active scientists in the field and acquired knowledge about XRD methods and the data generated, as well as its applications. They also learned about existing and emerging best practices in data publication and preservation. Data rescue projects are a fun and interactive way for students to become engaged in the field.

  8. Digital Preservation and Deep Infrastructure; Dublin Core Metadata Initiative Progress Report and Workplan for 2002; Video Gaming, Education and Digital Learning Technologies: Relevance and Opportunities; Digital Collections of Real World Objects; The MusArt Music-Retrieval System: An Overview; eML: Taking Mississippi Libraries into the 21st Century.

    ERIC Educational Resources Information Center

    Granger, Stewart; Dekkers, Makx; Weibel, Stuart L.; Kirriemuir, John; Lensch, Hendrik P. A.; Goesele, Michael; Seidel, Hans-Peter; Birmingham, William; Pardo, Bryan; Meek, Colin; Shifrin, Jonah; Goodvin, Renee; Lippy, Brooke

    2002-01-01

    One opinion piece and five articles in this issue discuss: digital preservation infrastructure; accomplishments and changes in the Dublin Core Metadata Initiative in 2001 and plans for 2002; video gaming and how it relates to digital libraries and learning technologies; overview of a music retrieval system; and the online version of the…

  9. New Tools to Document and Manage Data/Metadata: Example NGEE Arctic and ARM

    NASA Astrophysics Data System (ADS)

    Crow, M. C.; Devarakonda, R.; Killeffer, T.; Hook, L.; Boden, T.; Wullschleger, S.

    2017-12-01

    Tools used for documenting, archiving, cataloging, and searching data are critical pieces of informatics. This poster describes tools being used in several projects at Oak Ridge National Laboratory (ORNL), with a focus on the U.S. Department of Energy's Next Generation Ecosystem Experiment in the Arctic (NGEE Arctic) and Atmospheric Radiation Measurement (ARM) projects, and their usage at different stages of the data lifecycle. The Online Metadata Editor (OME) is used for the documentation and archival stages while a Data Search tool supports indexing, cataloging, and searching. The NGEE Arctic OME Tool [1] provides a method by which researchers can upload their data and provide original metadata with each upload while adhering to standard metadata formats. The tool is built upon a Java Spring framework to parse user input into, and from, XML output. Many aspects of the tool require use of a relational database, including encrypted user login, auto-fill functionality for predefined sites and plots, and file reference storage and sorting. The Data Search Tool conveniently displays each data record in a thumbnail containing the title, source, and date range, and features a quick view of the metadata associated with that record, as well as a direct link to the data. The search box incorporates autocomplete capabilities for search terms, and sorted keyword filters are available on the side of the page, including a map for geo-searching. These tools are supported by the Mercury [2] consortium (funded by DOE, NASA, USGS, and ARM) and developed and managed at Oak Ridge National Laboratory. Mercury is a set of tools for collecting, searching, and retrieving metadata and data. Mercury collects metadata from contributing project servers, then indexes the metadata to make it searchable using Apache Solr, and provides access to retrieve it from the web page. Metadata standards that Mercury supports include: XML, Z39.50, FGDC, Dublin-Core, Darwin-Core, EML, and ISO-19115.
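
    The search side of such a pipeline commonly boils down to querying the Solr index over HTTP; a hedged sketch is below. The Solr base URL, core name, and field names are placeholder assumptions, not the actual NGEE Arctic/ARM configuration.

    ```python
    # Sketch of the search step: query an Apache Solr index over HTTP and read
    # the matching metadata documents. Base URL, core name, and field names are
    # placeholder assumptions.
    import requests

    SOLR_SELECT = "http://localhost:8983/solr/metadata/select"   # placeholder core

    resp = requests.get(
        SOLR_SELECT,
        params={"q": "title:soil AND site:Barrow", "rows": 5, "wt": "json"},
        timeout=30,
    )
    resp.raise_for_status()
    for doc in resp.json()["response"]["docs"]:
        print(doc.get("id"), "-", doc.get("title"))
    ```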

  10. Extraction of CT dose information from DICOM metadata: automated Matlab-based approach.

    PubMed

    Dave, Jaydev K; Gingold, Eric L

    2013-01-01

    The purpose of this study was to extract exposure parameters and dose-relevant indexes of CT examinations from information embedded in DICOM metadata. DICOM dose report files were identified and retrieved from a PACS. An automated software program was used to extract, from these files, the exposure-relevant information held in the structured elements of the DICOM metadata. Extracting information from DICOM metadata eliminated potential errors inherent in techniques based on optical character recognition, yielding 100% accuracy.
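
    The study's extractor was Matlab-based; as a rough Python analogue of the same idea, the sketch below reads a DICOM file with pydicom and pulls out a few exposure-related attributes. Which attributes are present depends on the scanner and object type, and the file path is a placeholder.

    ```python
    # Rough Python analogue of the idea (not the study's Matlab tool): read a CT
    # image or dose-report file with pydicom and collect exposure-related
    # attributes from the DICOM metadata, where present.
    import pydicom

    EXPOSURE_KEYWORDS = ["KVP", "XRayTubeCurrent", "Exposure", "CTDIvol"]

    def extract_exposure(path):
        ds = pydicom.dcmread(path, stop_before_pixels=True)
        return {kw: getattr(ds, kw, None) for kw in EXPOSURE_KEYWORDS}

    # Example usage (placeholder file path):
    # print(extract_exposure("ct_slice_0001.dcm"))
    ```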

  11. A SensorML-based Metadata Model and Registry for Ocean Observatories: a Contribution from European Projects NeXOS and FixO3

    NASA Astrophysics Data System (ADS)

    Delory, E.; Jirka, S.

    2016-02-01

    Discovering sensors and observation data is important when enabling the exchange of oceanographic data between observatories and scientists that need the data sets for their work. To better support this discovery process, one task of the European project FixO3 (Fixed-point Open Ocean Observatories) is dealing with the question of which elements are needed for developing a better registry for sensors. This has resulted in four items which are addressed by the FixO3 project in cooperation with further European projects such as NeXOS (http://www.nexosproject.eu/). 1.) Metadata description format: To store and retrieve information about sensors and platforms it is necessary to have a common approach for how to provide and encode the metadata. For this purpose, the OGC Sensor Model Language (SensorML) 2.0 standard was selected. In particular, the opportunity to distinguish between sensor types and instances offers new chances for a more efficient provision and maintenance of sensor metadata. 2.) Conversion of existing metadata into a SensorML 2.0 representation: In order to ensure a sustainable re-use of already provided metadata content (e.g. from ESONET-FixO3 yellow pages), it is important to provide a mechanism which is capable of transforming these already available metadata sets into the new SensorML 2.0 structure. 3.) Metadata editor: To create descriptions of sensors and platforms, users cannot be expected to manually edit XML-based description files. Thus, a visual interface is necessary to help during the metadata creation. We will outline a prototype of this editor, building upon the development of the ESONET sensor registry interface. 4.) Sensor Metadata Store: A server is needed for storing and querying the created sensor descriptions. For this purpose, different options exist, which will be discussed. In summary, we will present a set of different elements enabling sensor discovery, ranging from metadata formats, metadata conversion and editing to metadata storage. Furthermore, the current development status will be demonstrated.

  12. Mercury: Reusable software application for Metadata Management, Data Discovery and Access

    NASA Astrophysics Data System (ADS)

    Devarakonda, Ranjeet; Palanisamy, Giri; Green, James; Wilson, Bruce E.

    2009-12-01

    Mercury is a federated metadata harvesting, data discovery and access tool based on both open source packages and custom developed software. It was originally developed for NASA, and the Mercury development consortium now includes funding from NASA, USGS, and DOE. Mercury is itself a reusable toolset for metadata, with current use in 12 different projects. Mercury also supports the reuse of metadata by enabling searching across a range of metadata specifications and standards including XML, Z39.50, FGDC, Dublin-Core, Darwin-Core, EML, and ISO-19115. Mercury provides a single portal to information contained in distributed data management systems. It collects metadata and key data from contributing project servers distributed around the world and builds a centralized index. The Mercury search interfaces then allow the users to perform simple, fielded, spatial and temporal searches across these metadata sources. One of the major goals of the recent redesign of Mercury was to improve the software reusability across the projects which currently fund the continuing development of Mercury. These projects span a range of land, atmosphere, and ocean ecological communities and have a number of common needs for metadata searches, but they also have a number of needs specific to one or a few projects. To balance these common and project-specific needs, Mercury’s architecture includes three major reusable components: a harvester engine, an indexing system, and a user interface component. The harvester engine is responsible for harvesting metadata records from various distributed servers around the USA and around the world. The harvester software was packaged in such a way that all the Mercury projects will use the same harvester scripts but each project will be driven by a set of configuration files. The harvested files are then passed to the Indexing system, where each of the fields in these structured metadata records are indexed properly, so that the query engine can perform simple, keyword, spatial and temporal searches across these metadata sources. The search user interface software has two API categories; a common core API which is used by all the Mercury user interfaces for querying the index and a customized API for project specific user interfaces. For our work in producing a reusable, portable, robust, feature-rich application, Mercury received a 2008 NASA Earth Science Data Systems Software Reuse Working Group Peer-Recognition Software Reuse Award. The new Mercury system is based on a Service Oriented Architecture and effectively reuses components for various services such as Thesaurus Service, Gazetteer Web Service and UDDI Directory Services. The software also provides various search services including: RSS, Geo-RSS, OpenSearch, Web Services and Portlets, integrated shopping cart to order datasets from various data centers (ORNL DAAC, NSIDC) and integrated visualization tools. Other features include: Filtering and dynamic sorting of search results, book-markable search results, save, retrieve, and modify search criteria.

  13. Study on Information Management for the Conservation of Traditional Chinese Architectural Heritage - 3D Modelling and Metadata Representation

    NASA Astrophysics Data System (ADS)

    Yen, Y. N.; Weng, K. H.; Huang, H. Y.

    2013-07-01

    After over 30 years of practice and development, Taiwan's architectural conservation field is moving rapidly into digitalization and its applications. Compared to modern buildings, traditional Chinese architecture has considerably more complex elements and forms. To document and digitize these unique heritages in their conservation lifecycle is a new and important issue. This article takes the caisson ceiling of the Taipei Confucius Temple, octagonal with 333 elements in 8 types, as a case study for digitization practice. The application of metadata representation and 3D modelling are the two key issues discussed. Both Revit and SketchUp were applied in this research to compare their effectiveness for metadata representation. Due to limitations of the Revit database, the final 3D models were built with SketchUp. The research found that, firstly, cultural heritage databases must convey that while many elements are similar in appearance, they are unique in value; although 3D simulations help the general understanding of architectural heritage, software such as Revit and SketchUp, at this stage, could only be used to model basic visual representations, and are ineffective in documenting additional critical data of individually unique elements. Secondly, when establishing conservation lifecycle information for application in management systems, a full and detailed presentation of the metadata must also be implemented; the existing applications of BIM in managing conservation lifecycles are still insufficient. The results of the research recommend SketchUp as a tool for present modelling needs, and BIM for sharing data between users, but the implementation of metadata representation is of the utmost importance.

  14. The STP (Solar-Terrestrial Physics) Semantic Web based on the RSS1.0 and the RDF

    NASA Astrophysics Data System (ADS)

    Kubo, T.; Murata, K. T.; Kimura, E.; Ishikura, S.; Shinohara, I.; Kasaba, Y.; Watari, S.; Matsuoka, D.

    2006-12-01

    In Solar-Terrestrial Physics (STP), it has been pointed out that the circulation and utilization of observation data among researchers are insufficient. To achieve interdisciplinary research, we need to overcome these circulation and utilization problems. Against this background, the authors' group has developed a world-wide database that manages meta-data of satellite and ground-based observation data files. Until now, retrieving meta-data from the observation data and registering them in the database have been carried out by hand. Our goal is to establish the STP Semantic Web. The Semantic Web provides a common framework that allows a variety of data to be shared and reused across applications, enterprises, and communities. We also expect that secondary information related to observations, such as event information and associated news, will be shared over the networks. The most fundamental issue in establishing it is who generates, manages and provides meta-data in the Semantic Web. We developed an automatic meta-data collection system for the observation data using RSS (RDF Site Summary) 1.0. RSS 1.0 is an XML-based markup language built on the RDF (Resource Description Framework), designed for syndicating news and the contents of news-like sites. RSS 1.0 is used to describe the STP meta-data, such as data file name, file server address and observation date. To describe STP meta-data beyond the RSS 1.0 vocabulary, we defined original vocabularies for STP resources using RDF Schema. The RDF describes technical terms of the STP along with the Dublin Core Metadata Element Set, the standard for cross-domain information resource descriptions. Researchers' information on the STP is described with FOAF, an RDF/XML vocabulary for creating machine-readable metadata about people. Using RSS 1.0 as the meta-data distribution method, the workflow from retrieving meta-data to registering them in the database is automated. This technique has been applied to several database systems, such as the DARTS database system and the NICT Space Weather Report Service. DARTS is a science database managed by ISAS/JAXA in Japan. We succeeded in generating and collecting the meta-data automatically for CDF (Common Data Format) data, such as Reimei satellite data, provided by DARTS. We also created an RDF service for space weather reports and real-time global MHD simulation 3D data provided by NICT. Our Semantic Web system works as follows: the RSS 1.0 documents generated at the data sites (ISAS and NICT) are automatically collected by a meta-data collection agent. The RDF documents are registered, and the agent extracts meta-data and stores them in Sesame, an open source RDF database with support for RDF Schema inferencing and querying. The RDF database provides advanced retrieval processing that takes properties and relations into account. Finally, the STP Semantic Web provides automatic processing and high-level search not only over observation data but also over space weather news, physical events, technical terms and researcher information related to the STP.
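
    As an illustration of the approach described above, the following sketch (using the rdflib Python library) builds an RSS 1.0 item that carries observation-file meta-data with Dublin Core elements plus a project-specific vocabulary. The file name, URLs, and the STP property names are hypothetical stand-ins, not the actual vocabulary defined by the authors.

    ```python
    # Sketch of generating an RSS 1.0 (RDF) item that carries observation-file
    # metadata with Dublin Core terms, along the lines of the STP workflow above.
    # File names, URLs and the custom "STP" vocabulary are hypothetical.
    from rdflib import Graph, Namespace, Literal, URIRef
    from rdflib.namespace import RDF, DC

    RSS = Namespace("http://purl.org/rss/1.0/")
    STP = Namespace("http://example.org/stp-schema#")  # stand-in for the project's RDF Schema vocabulary

    g = Graph()
    g.bind("rss", RSS)
    g.bind("dc", DC)
    g.bind("stp", STP)

    item = URIRef("http://example.org/data/reimei_20061201.cdf")
    g.add((item, RDF.type, RSS.item))
    g.add((item, RSS.title, Literal("Reimei satellite CDF file, 2006-12-01")))
    g.add((item, DC.date, Literal("2006-12-01")))
    g.add((item, DC.format, Literal("CDF")))
    g.add((item, STP.fileServer, Literal("darts.example.jp")))  # hypothetical property and host

    print(g.serialize(format="pretty-xml"))
    ```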

  15. Combined use of semantics and metadata to manage Research Data Life Cycle in Environmental Sciences

    NASA Astrophysics Data System (ADS)

    Aguilar Gómez, Fernando; de Lucas, Jesús Marco; Pertinez, Esther; Palacio, Aida

    2017-04-01

    The use of metadata to contextualize datasets is quite extended in Earth System Sciences. There are some initiatives and available tools to help data managers choose the metadata standard that best fits their use case, like the DCC Metadata Directory (http://www.dcc.ac.uk/resources/metadata-standards). In our use case, we have been gathering physical, chemical and biological data from a water reservoir since 2010. A good metadata definition is crucial not only to contextualize our own data but also to integrate datasets from other sources like satellites or meteorological agencies. That is why we have chosen EML (Ecological Metadata Language), which integrates many different elements to define a dataset, including the project context, the instrumentation and parameter definitions, the software used for processing and quality control, and the publication details. These metadata elements help both humans and machines to understand and process the dataset. However, the use of metadata is not enough to fully support the data life cycle, from the Data Management Plan definition to publication and re-use. To do so, we need to define not only metadata and attributes but also the relationships between them, so semantics are needed. Ontologies, as a knowledge representation, can help define the elements of a research data life cycle, including the DMP, datasets, software, etc. They can also define how the different elements relate to each other and how they interact. The first advantage of developing an ontology of a knowledge domain is that it provides a common vocabulary hierarchy (i.e. a conceptual schema) that can be used and standardized by all the agents interested in the domain (either humans or machines). This way of using ontologies is one of the bases of the Semantic Web, where ontologies are set to play a key role in establishing a common terminology between agents. To develop the ontology we are using Protégé, a graphical ontology-development tool that supports a rich knowledge model and is open-source and freely available. To process and manage the ontology, we are using Semantic MediaWiki, an extension of MediaWiki that can process queries, perform semantic search and export data in RDF. Our final goal is to integrate our data repository portal and semantic processing engine in order to have a complete system to manage the data life cycle stages and their relationships, including a machine-actionable DMP solution, dataset and software management, computing resources for processing and analysis, and publication features (DOI minting). This way we will be able to reproduce the full data life cycle chain while guaranteeing the FAIR+R principles.

  16. Definition of an ISO 19115 metadata profile for SeaDataNet II Cruise Summary Reports and its XML encoding

    NASA Astrophysics Data System (ADS)

    Boldrini, Enrico; Schaap, Dick M. A.; Nativi, Stefano

    2013-04-01

    SeaDataNet implements a distributed pan-European infrastructure for Ocean and Marine Data Management whose nodes are maintained by 40 national oceanographic and marine data centers from 35 countries bordering all European seas. A single portal makes possible distributed discovery, visualization and access of the available sea data across all the member nodes. Geographic metadata play an important role in such an infrastructure, enabling efficient documentation and discovery of the resources of interest. In particular, Common Data Index (CDI) metadata describe the sea datasets, including identification information (e.g. product title, area of interest), evaluation information (e.g. data resolution, constraints) and distribution information (e.g. download endpoint, download protocol), while Cruise Summary Reports (CSR) metadata describe cruises and field experiments at sea, including identification information (e.g. cruise title, name of the ship) and acquisition information (e.g. instruments used, number of samples taken). In the context of the second phase of SeaDataNet (SeaDataNet 2 EU FP7 project, grant agreement 283607, started on October 1st, 2011 for a duration of 4 years) a major target is the setting, adoption and promotion of common international standards, to the benefit of outreach and interoperability with the international initiatives and communities (e.g. OGC, INSPIRE, GEOSS, …). A standardization effort conducted by CNR with the support of MARIS, IFREMER, STFC, BODC and ENEA has led to the creation of an ISO 19115 metadata profile of CDI and its XML encoding based on ISO 19139. The CDI profile is now stable and is being implemented and adopted by the SeaDataNet community tools and software. The effort then continued with the production of an ISO-based metadata model and its XML encoding for CSR as well. The metadata elements included in the CSR profile belong to different models: ISO 19115 (e.g. cruise identification information, including title and area of interest, and metadata responsible party information), ISO 19115-2 (e.g. acquisition information, including date of sampling and instruments used) and SeaDataNet (community-specific elements, including the EDMO and EDMERP code lists). Two main guidelines have been followed in drafting the metadata model: all the obligations and constraints required by both the ISO standards and the INSPIRE directive had to be satisfied, including the presence of specific elements with given cardinality (e.g. a mandatory metadata date stamp and mandatory lineage information), and all the content information of the legacy CSR format had to be supported by the new metadata model. An XML encoding of the CSR profile has been defined as well. Based on the ISO 19139 XML schema and constraints, it adds the new elements specific to the SeaDataNet community. The associated Schematron rules are used to enforce constraints not enforceable with the schema alone and to validate element content against the SeaDataNet code-list vocabularies.
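
    To illustrate the validation step described above, the following Python sketch checks a CSR record first against an XML Schema and then against Schematron rules (e.g. code-list constraints). It assumes the lxml library; the file names are placeholders, and the actual SeaDataNet schemas and rules are not reproduced here.

    ```python
    # Sketch of validating a CSR XML record against both an XML Schema and
    # Schematron rules, as described above. File names are hypothetical.
    from lxml import etree, isoschematron

    doc = etree.parse("csr_record.xml")                              # a CSR metadata record
    schema = etree.XMLSchema(etree.parse("csr_profile.xsd"))         # ISO 19139-based schema
    rules = isoschematron.Schematron(etree.parse("csr_rules.sch"))   # community constraints

    if not schema.validate(doc):
        print("schema errors:", schema.error_log)
    elif not rules.validate(doc):
        # e.g. element content not found in a SeaDataNet code-list vocabulary
        print("schematron errors:", rules.error_log)
    else:
        print("record is valid against the profile")
    ```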

  17. Design and implementation of a fault-tolerant and dynamic metadata database for clinical trials

    NASA Astrophysics Data System (ADS)

    Lee, J.; Zhou, Z.; Talini, E.; Documet, J.; Liu, B.

    2007-03-01

    In recent imaging-based clinical trials, quantitative image analysis (QIA) and computer-aided diagnosis (CAD) methods are increasing in productivity due to higher-resolution imaging capabilities. A radiology core doing clinical trials has been analyzing more treatment methods, and there is a growing quantity of metadata that needs to be stored and managed. These radiology centers are also collaborating with many off-site imaging field sites and need a way to communicate metadata with one another in a secure infrastructure. Our solution is to implement a data storage grid with a fault-tolerant and dynamic metadata database design to unify metadata from different clinical trial experiments and field sites. Although metadata from images follow the DICOM standard, clinical trials also produce metadata specific to regions-of-interest and quantitative image analysis. We have implemented a data access and integration (DAI) server layer where multiple field sites can access multiple metadata databases in the data grid through a single web-based grid service. The centralization of metadata database management simplifies the task of adding new databases into the grid and also decreases the risk of configuration errors seen in peer-to-peer grids. In this paper, we address the design and implementation of a data grid metadata storage that has fault-tolerance and dynamic integration for imaging-based clinical trials.

  18. Case Studies of Ecological Integrative Information Systems: The Luquillo and Sevilleta Information Management Systems

    NASA Astrophysics Data System (ADS)

    San Gil, Inigo; White, Marshall; Melendez, Eda; Vanderbilt, Kristin

    The thirty-year-old United States Long Term Ecological Research Network has developed extensive metadata to document its scientific data. Standard and interoperable metadata is a core component of the data-driven analytical solutions developed by this research network. Content management systems offer an affordable solution for rapid deployment of metadata-centered information management systems. We developed a customized integrative metadata management system based on the Drupal content management system technology. Building on knowledge and experience with the Sevilleta and Luquillo Long Term Ecological Research sites, we successfully deployed the first two medium-scale customized prototypes. In this paper, we describe the vision behind our Drupal-based information management instances and list the features offered through these Drupal-based systems. We also outline the plans to expand the information services offered through these metadata-centered management systems. We conclude with the growing list of participants deploying similar instances.

  19. Observation Data Model Core Components, its Implementation in the Table Access Protocol Version 1.1

    NASA Astrophysics Data System (ADS)

    Louys, Mireille; Tody, Doug; Dowler, Patrick; Durand, Daniel; Michel, Laurent; Bonnarel, Francos; Micol, Alberto; IVOA DataModel Working Group

    2017-05-01

    This document defines the core components of the Observation data model that are necessary to perform data discovery when querying data centers for astronomical observations of interest. It presents the use cases to be carried out, explains the model and provides guidelines for its implementation as a data access service based on the Table Access Protocol (TAP). It aims at providing a simple model that is easy to understand and implement by data providers who wish to publish their data to the Virtual Observatory. This interface integrates data modeling and data access aspects in a single service and is named ObsTAP. It will be referenced as such in the IVOA registries. In this document, the Observation Data Model Core Components (ObsCoreDM) defines the core components of queryable metadata required for global discovery of observational data. It is meant to allow a single query to be posed to TAP services at multiple sites to perform global data discovery without having to understand the details of the services present at each site. It defines a minimal set of basic metadata and thus allows for a reasonable cost of implementation by data providers. The combination of the ObsCoreDM with TAP is referred to as an ObsTAP service. As with most of the VO Data Models, ObsCoreDM makes use of STC, Utypes, Units and UCDs. The ObsCoreDM can be serialized as a VOTable. ObsCoreDM can make reference to more complete data models such as the Characterisation DM, Spectrum DM or Simple Spectral Line Data Model (SSLDM). ObsCore shares a large set of common concepts with the DataSet Metadata Data Model (Cresitello-Dittmar et al. 2016), which binds together most of the data model concepts from the above models in a comprehensive and more general framework. The current specification, by contrast, provides guidelines for implementing these concepts using the TAP protocol and answering ADQL queries. It is dedicated to global discovery.
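
    As an illustration of the kind of global discovery query an ObsTAP service answers, the sketch below issues an ADQL query against the standard ivoa.obscore table through a TAP service, here using the pyvo Python library. The service URL is a placeholder; any TAP service exposing ObsCore could be substituted.

    ```python
    # Sketch of an ObsTAP-style discovery query: ADQL against the standard
    # ivoa.obscore table via TAP. The service URL below is a placeholder.
    import pyvo

    service = pyvo.dal.TAPService("https://example.org/tap")  # hypothetical ObsTAP endpoint

    query = """
    SELECT obs_id, obs_collection, dataproduct_type, access_url
    FROM ivoa.obscore
    WHERE dataproduct_type = 'spectrum'
      AND CONTAINS(POINT('ICRS', s_ra, s_dec),
                   CIRCLE('ICRS', 202.48, 47.23, 0.1)) = 1
    """

    results = service.search(query)      # runs the ADQL query synchronously
    for row in results:
        print(row["obs_id"], row["access_url"])
    ```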

  20. A metadata approach for clinical data management in translational genomics studies in breast cancer.

    PubMed

    Papatheodorou, Irene; Crichton, Charles; Morris, Lorna; Maccallum, Peter; Davies, Jim; Brenton, James D; Caldas, Carlos

    2009-11-30

    In molecular profiling studies of cancer patients, experimental and clinical data are combined in order to understand the clinical heterogeneity of the disease: clinical information for each subject needs to be linked to tumour samples, macromolecules extracted, and experimental results. This may involve the integration of clinical data sets from several different sources: these data sets may employ different data definitions and some may be incomplete. In this work we employ semantic web techniques developed within the CancerGrid project, in particular the use of metadata elements and logic-based inference to annotate heterogeneous clinical information, integrate and query it. We show how this integration can be achieved automatically, following the declaration of appropriate metadata elements for each clinical data set; we demonstrate the practicality of this approach through application to experimental results and clinical data from five hospitals in the UK and Canada, undertaken as part of the METABRIC project (Molecular Taxonomy of Breast Cancer International Consortium). We describe a metadata approach for managing similarities and differences in clinical datasets in a standardized way that uses Common Data Elements (CDEs). We apply and evaluate the approach by integrating the five different clinical datasets of METABRIC.

  1. Metadata registry and management system based on ISO 11179 for cancer clinical trials information system

    PubMed Central

    Park, Yu Rang; Kim, Ju Han

    2006-01-01

    Standardized management of data elements (DEs) for Case Report Forms (CRFs) is crucial in a Clinical Trials Information System (CTIS). Traditional CTISs use organization-specific definitions and storage methods for DEs and CRFs. We developed a metadata-based DE management system for clinical trials, the Clinical and Histopathological Metadata Registry (CHMR), using the international standard for metadata registries (ISO 11179) for the management of cancer clinical trials information. CHMR was evaluated in cancer clinical trials with 1625 DEs extracted from the College of American Pathologists Cancer Protocols for 20 major cancers. PMID:17238675
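
    The core ISO 11179 idea, a registered data element that pairs a concept with a value domain and is re-used across forms, can be sketched in a few lines of Python. The field names and example values below are illustrative, not CHMR's actual schema.

    ```python
    # Minimal sketch of an ISO 11179-style data element: a named concept paired
    # with a value domain, registered under a stable identifier and re-used across
    # case report forms. The fields and values are illustrative only.
    from dataclasses import dataclass, field

    @dataclass
    class ValueDomain:
        datatype: str                          # e.g. "integer", "coded value"
        permissible_values: list = field(default_factory=list)

    @dataclass
    class DataElement:
        identifier: str                        # registry-assigned, stable across CRFs
        name: str
        definition: str
        value_domain: ValueDomain

    # Re-using the same registered element in two different case report forms
    tumor_grade = DataElement(
        identifier="DE-000123",
        name="Histologic grade",
        definition="Histologic grade of the primary tumor",
        value_domain=ValueDomain("coded value", ["G1", "G2", "G3", "G4", "GX"]),
    )
    crf_breast = {"elements": [tumor_grade]}
    crf_colon = {"elements": [tumor_grade]}    # identical semantics, comparable data
    ```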

  2. A Digital Broadcast Item (DBI) enabling metadata repository for digital, interactive television (digiTV) feedback channel networks

    NASA Astrophysics Data System (ADS)

    Lugmayr, Artur R.; Mailaparampil, Anurag; Tico, Florina; Kalli, Seppo; Creutzburg, Reiner

    2003-01-01

    Digital television (digiTV) is an additional multimedia environment, where metadata is one key element for the description of arbitrary content. This implies adequate structures for content description, which is provided by XML metadata schemes (e.g. MPEG-7, MPEG-21). Content and metadata management is the task of a multimedia repository, from which digiTV clients - equipped with an Internet connection - can access rich additional multimedia types over an "All-HTTP" protocol layer. Within this research work, we focus on conceptual design issues of a metadata repository for the storage of metadata, accessible from the feedback channel of a local set-top box. Our concept describes the whole heterogeneous life-cycle chain of XML metadata from the service provider to the digiTV equipment, device independent representation of content, accessing and querying the metadata repository, management of metadata related to digiTV, and interconnection of basic system components (http front-end, relational database system, and servlet container). We present our conceptual test configuration of a metadata repository that is aimed at a real-world deployment, done within the scope of the future interaction (fiTV) project at the Digital Media Institute (DMI) Tampere (www.futureinteraction.tv).

  3. Interpreting the ASTM 'content standard for digital geospatial metadata'

    USGS Publications Warehouse

    Nebert, Douglas D.

    1996-01-01

    ASTM and the Federal Geographic Data Committee have developed a content standard for spatial metadata to facilitate documentation, discovery, and retrieval of digital spatial data using vendor-independent terminology. Spatial metadata elements are identifiable quality and content characteristics of a data set that can be tied to a geographic location or area. Several Office of Management and Budget Circulars and initiatives have been issued that specify improved cataloguing of and accessibility to federal data holdings. An Executive Order further requires the use of the metadata content standard to document digital spatial data sets. Collection and reporting of spatial metadata for field investigations performed for the federal government is an anticipated requirement. This paper provides an overview of the draft spatial metadata content standard and a description of how the standard could be applied to investigations collecting spatially-referenced field data.

  4. MaNIDA: Integration of marine expedition information, data and publications: Data Portal of German Marine Research

    NASA Astrophysics Data System (ADS)

    Koppe, Roland; Scientific MaNIDA-Team

    2013-04-01

    The Marine Network for Integrated Data Access (MaNIDA) aims to build a sustainable e-infrastructure to support discovery and re-use of marine data from distinct data providers in Germany (see related abstracts in session ESSI 1.2). In order to provide users with integrated access and retrieval of expedition or cruise metadata, data, services and publications, as well as relationships among the various objects, we are developing (web) applications based on state-of-the-art technologies: the Data Portal of German Marine Research. Since the distributed content providers in the German network have distinct objectives and mandates for storing digital objects (e.g. long-term data preservation, near real time data, publication repositories), we have to cope with metadata that are heterogeneous in syntax and semantics, data types and formats, as well as access solutions. We have defined a set of core metadata elements which are common to our content providers and therefore useful for discovery and for building relationships among objects. Existing catalogues for various types of vocabularies are being used to ensure mapping to community-wide terms. We distinguish between expedition metadata and continuously harvestable metadata objects from distinct data providers. • Existing expedition metadata from distinct sources is integrated and validated in order to create an expedition metadata catalogue which is used as the authoritative source for expedition-related content. The web application allows browsing by e.g. research vessel and date, exploring expeditions and research gaps by tracklines, and viewing expedition details (begin/end, ports, platforms, chief scientists, events, etc.). Expedition-related objects from harvesting are also dynamically associated with expedition information and presented to the user. Hence we will provide web services exposing detailed expedition information. • Other harvestable content is separated into four categories: archived data and data products, near real time data, publications and reports. Reports are a special case of publication, describing cruise planning, cruise reports or popular reports on expeditions, and are orthogonal to e.g. peer-reviewed articles. Each object's metadata contains at least: identifier(s) e.g. doi/hdl, title, author(s), date, expedition(s), platform(s) e.g. research vessel Polarstern. Furthermore, project(s), parameter(s), device(s) and e.g. geographic coverage are of interest. An international gazetteer resolves geographic coverage to region names and annotates the object metadata. Information is homogeneously presented to the user, independent of the underlying format, but adaptable to specific disciplines, e.g. bathymetry. Data access and dissemination information is also available to the user as a data download link or web services (e.g. WFS, WMS). Based on relationship metadata we dynamically build graphs of objects to support the user in finding possibly relevant associated objects. Technically, metadata is based on ISO / OGC standards or provider specifications. Metadata is harvested via OAI-PMH or OGC CSW and indexed with Apache Lucene. This enables powerful full-text search, geographic and temporal search, as well as faceting. In this presentation we will illustrate the architecture and the current implementation of our integrated approach.
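
    The harvesting step described above can be illustrated with a short Python sketch using the Sickle OAI-PMH client; the provider URL is a placeholder, and a plain list stands in for the Lucene index.

    ```python
    # Sketch of harvesting Dublin Core metadata from a provider via OAI-PMH and
    # collecting a few core elements, along the lines of the MaNIDA pipeline.
    # The endpoint URL is hypothetical; a plain list stands in for the real index.
    from sickle import Sickle

    harvester = Sickle("https://repository.example.org/oai")  # hypothetical OAI-PMH endpoint
    index = []

    for record in harvester.ListRecords(metadataPrefix="oai_dc", ignore_deleted=True):
        md = record.metadata  # dict of Dublin Core fields -> list of values
        index.append({
            "identifier": md.get("identifier", [None])[0],
            "title": md.get("title", [None])[0],
            "creator": md.get("creator", []),
            "date": md.get("date", [None])[0],
        })

    print(f"indexed {len(index)} records")
    ```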

  5. The health care and life sciences community profile for dataset descriptions

    PubMed Central

    Alexiev, Vladimir; Ansell, Peter; Bader, Gary; Baran, Joachim; Bolleman, Jerven T.; Callahan, Alison; Cruz-Toledo, José; Gaudet, Pascale; Gombocz, Erich A.; Gonzalez-Beltran, Alejandra N.; Groth, Paul; Haendel, Melissa; Ito, Maori; Jupp, Simon; Juty, Nick; Katayama, Toshiaki; Kobayashi, Norio; Krishnaswami, Kalpana; Laibe, Camille; Le Novère, Nicolas; Lin, Simon; Malone, James; Miller, Michael; Mungall, Christopher J.; Rietveld, Laurens; Wimalaratne, Sarala M.; Yamaguchi, Atsuko

    2016-01-01

    Access to consistent, high-quality metadata is critical to finding, understanding, and reusing scientific data. However, while there are many relevant vocabularies for the annotation of a dataset, none sufficiently captures all the necessary metadata. This prevents uniform indexing and querying of dataset repositories. Towards providing a practical guide for producing a high quality description of biomedical datasets, the W3C Semantic Web for Health Care and the Life Sciences Interest Group (HCLSIG) identified Resource Description Framework (RDF) vocabularies that could be used to specify common metadata elements and their value sets. The resulting guideline covers elements of description, identification, attribution, versioning, provenance, and content summarization. This guideline reuses existing vocabularies, and is intended to meet key functional requirements including indexing, discovery, exchange, query, and retrieval of datasets, thereby enabling the publication of FAIR data. The resulting metadata profile is generic and could be used by other domains with an interest in providing machine readable descriptions of versioned datasets. PMID:27602295
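
    A minimal sketch of such a dataset description, using existing vocabularies (Dublin Core terms and DCAT) with the rdflib Python library, is shown below. The URIs and values are hypothetical, and only a handful of the profile's elements are included.

    ```python
    # Sketch of a dataset description built from existing vocabularies (DCAT and
    # Dublin Core terms), in the spirit of the HCLS community profile. Assumes a
    # recent rdflib; URIs and values are hypothetical and only a few elements shown.
    from rdflib import Graph, Literal, URIRef
    from rdflib.namespace import DCAT, DCTERMS, RDF

    g = Graph()
    dataset = URIRef("https://example.org/dataset/drug-targets/release-2016-01")

    g.add((dataset, RDF.type, DCAT.Dataset))
    g.add((dataset, DCTERMS.title, Literal("Drug target dataset, release 2016-01")))
    g.add((dataset, DCTERMS.description, Literal("Curated drug-target associations.")))
    g.add((dataset, DCTERMS.publisher, URIRef("https://example.org/org/biodata-lab")))
    g.add((dataset, DCTERMS.issued, Literal("2016-01-15")))
    g.add((dataset, DCTERMS.license, URIRef("https://creativecommons.org/licenses/by/4.0/")))

    print(g.serialize(format="turtle"))
    ```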

  6. Metadata Means Communication: The Challenges of Producing Useful Metadata

    NASA Astrophysics Data System (ADS)

    Edwards, P. N.; Batcheller, A. L.

    2010-12-01

    Metadata are increasingly perceived as an important component of data sharing systems. For instance, metadata accompanying atmospheric model output may indicate the grid size, grid type, and parameter settings used in the model configuration. We conducted a case study of a data portal in the atmospheric sciences using in-depth interviews, document review, and observation. Our analysis revealed a number of challenges in producing useful metadata. First, creating and managing metadata required considerable effort and expertise, yet responsibility for these tasks was ill-defined and diffused among many individuals, leading to errors, failure to capture metadata, and uncertainty about the quality of the primary data. Second, metadata ended up stored in many different forms and software tools, making it hard to manage versions and transfer between formats. Third, the exact meanings of metadata categories remained unsettled and misunderstood even among a small community of domain experts -- an effect we expect to be exacerbated when scientists from other disciplines wish to use these data. In practice, we found that metadata problems due to these obstacles are often overcome through informal, personal communication, such as conversations or email. We conclude that metadata serve to communicate the context of data production from the people who produce data to those who wish to use them. Thus while formal metadata systems are often public, critical elements of metadata (those embodied in informal communication) may never be recorded. Therefore, efforts to increase data sharing should include ways to facilitate inter-investigator communication. Instead of tackling metadata challenges only on the formal level, we can improve data usability for broader communities by better supporting metadata communication.

  7. Staff - Simone Montayne | Alaska Division of Geological & Geophysical

    Science.gov Websites

    Simone Montayne compiles all of the division's metadata files and serves as website and database administrator for the Association of American State Geologists.

  8. The French initiative for scientific cores virtual curating : a user-oriented integrated approach

    NASA Astrophysics Data System (ADS)

    Pignol, Cécile; Godinho, Elodie; Galabertier, Bruno; Caillo, Arnaud; Bernardet, Karim; Augustin, Laurent; Crouzet, Christian; Billy, Isabelle; Teste, Gregory; Moreno, Eva; Tosello, Vanessa; Crosta, Xavier; Chappellaz, Jérome; Calzas, Michel; Rousseau, Denis-Didier; Arnaud, Fabien

    2016-04-01

    Managing scientific data is probably one of the most challenging issues in modern science. The question is made even more sensitive by the need to preserve and manage high-value, fragile geological samples: cores. Large international scientific programs, such as IODP or ICDP, are leading an intense effort to solve this problem and propose detailed, high-standard work- and dataflows throughout core handling and curating. However, most results derive from rather small-scale research programs in which data and sample management is generally handled only locally - when it is … The national excellence equipment program (Equipex) CLIMCOR aims at developing French facilities for coring and drilling investigations. It covers ice, marine and continental samples alike. As part of this initiative, we began a reflection on core curating and the associated coring-data management. The aim of the project is to conserve all metadata from fieldwork in an integrated cyber-environment which will evolve toward laboratory-acquired data storage in the near future. To that end, our approach was developed in close relationship with field operators as well as laboratory core curators in order to propose user-oriented solutions. The national core curating initiative currently proposes a single web portal in which all scientific teams can store their field data. For legacy samples, this requires the establishment of dedicated core lists with associated metadata. For forthcoming samples, we propose a mobile application for the Android environment to capture technical and scientific metadata in the field. This application is linked to a unique library of coring tools and is adapted to most coring devices (gravity, drilling, percussion, etc.), including multi-section and multi-hole coring operations. These field data can be uploaded automatically to the national portal, but also referenced through international standards or persistent identifiers (IGSN, ORCID and INSPIRE) and displayed in international portals (currently, NOAA's IMLGS). In this paper, we present the architecture of the integrated system, future perspectives and the approach we adopted to reach our goals. We will also present, alongside our poster, one of the three mobile applications, dedicated in particular to continental drilling operations.

  9. Doing One Thing Well: Leveraging Microservices for NASA Earth Science Discovery and Access Across Heterogenous Data Sources

    NASA Astrophysics Data System (ADS)

    Baynes, K.; Gilman, J.; Pilone, D.; Mitchell, A. E.

    2015-12-01

    The NASA EOSDIS (Earth Observing System Data and Information System) Common Metadata Repository (CMR) is a continuously evolving metadata system that merges all existing capabilities and metadata from the EOS ClearingHOuse (ECHO) and the Global Change Master Directory (GCMD) systems. This flagship catalog has been developed against several key requirements: fast search and ingest performance; the ability to integrate heterogeneous external inputs and outputs; high availability and resiliency; scalability; and evolvability and expandability. This talk will focus on the advantages and potential challenges of tackling these requirements using a microservices architecture, which decomposes system functionality into smaller, loosely-coupled, individually-scalable elements that communicate via well-defined APIs. In addition, time will be spent examining specific elements of the CMR architecture and identifying opportunities for future integrations.

  10. Predicting structured metadata from unstructured metadata.

    PubMed

    Posch, Lisa; Panahiazar, Maryam; Dumontier, Michel; Gevaert, Olivier

    2016-01-01

    Enormous amounts of biomedical data have been and are being produced by investigators all over the world. However, one crucial and limiting factor in data reuse is accurate, structured and complete description of the data, or data about the data, defined as metadata. We propose a framework to predict structured metadata terms from unstructured metadata for improving the quality and quantity of metadata, using the Gene Expression Omnibus (GEO) microarray database. Our framework consists of classifiers trained using term frequency-inverse document frequency (TF-IDF) features and a second approach based on topics modeled using a Latent Dirichlet Allocation (LDA) model to reduce the dimensionality of the unstructured data. Our results on the GEO database show that structured metadata terms can be predicted most accurately using the TF-IDF approach, followed by LDA, with both outperforming the majority-vote baseline. While some accuracy is lost through the dimensionality reduction of LDA, the difference is small for elements with few possible values, and there is a large improvement over the majority classifier baseline. Overall, this is a promising approach for metadata prediction that is likely to be applicable to other datasets and has implications for researchers interested in biomedical metadata curation and metadata prediction. © The Author(s) 2016. Published by Oxford University Press.
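
    The two feature strategies described above can be sketched with scikit-learn as follows. The toy records and labels are invented for illustration; the paper's actual GEO corpus, preprocessing, and model settings are not reproduced here.

    ```python
    # Sketch of the two strategies described above for predicting a structured
    # metadata term from free-text metadata: TF-IDF features and LDA topic
    # features, each feeding a simple classifier. Toy data, illustrative settings.
    from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    unstructured = [
        "total RNA extracted from mouse liver, expression array",
        "ChIP-seq of human K562 cells, anti-H3K27ac antibody",
        "expression profiling of arabidopsis leaf tissue under drought",
        "whole blood RNA from healthy human donors",
    ]
    organism = ["mouse", "human", "arabidopsis", "human"]  # structured term to predict

    # Strategy 1: TF-IDF features -> classifier
    tfidf_clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    tfidf_clf.fit(unstructured, organism)

    # Strategy 2: LDA topic proportions as a lower-dimensional representation
    lda_clf = make_pipeline(CountVectorizer(),
                            LatentDirichletAllocation(n_components=3, random_state=0),
                            LogisticRegression(max_iter=1000))
    lda_clf.fit(unstructured, organism)

    print(tfidf_clf.predict(["RNA-seq of human liver biopsies"]))
    ```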

  11. Predicting structured metadata from unstructured metadata

    PubMed Central

    Posch, Lisa; Panahiazar, Maryam; Dumontier, Michel; Gevaert, Olivier

    2016-01-01

    Enormous amounts of biomedical data have been and are being produced by investigators all over the world. However, one crucial and limiting factor in data reuse is accurate, structured and complete description of the data, or data about the data, defined as metadata. We propose a framework to predict structured metadata terms from unstructured metadata for improving the quality and quantity of metadata, using the Gene Expression Omnibus (GEO) microarray database. Our framework consists of classifiers trained using term frequency-inverse document frequency (TF-IDF) features and a second approach based on topics modeled using a Latent Dirichlet Allocation (LDA) model to reduce the dimensionality of the unstructured data. Our results on the GEO database show that structured metadata terms can be predicted most accurately using the TF-IDF approach, followed by LDA, with both outperforming the majority-vote baseline. While some accuracy is lost through the dimensionality reduction of LDA, the difference is small for elements with few possible values, and there is a large improvement over the majority classifier baseline. Overall, this is a promising approach for metadata prediction that is likely to be applicable to other datasets and has implications for researchers interested in biomedical metadata curation and metadata prediction. Database URL: http://www.yeastgenome.org/ PMID:28637268

  12. ASDC Collaborations and Processes to Ensure Quality Metadata and Consistent Data Availability

    NASA Astrophysics Data System (ADS)

    Trapasso, T. J.

    2017-12-01

    With the introduction of new tools, faster computing, and less expensive storage, increased volumes of data are expected to be managed with existing or fewer resources. Metadata management is becoming a heightened challenge due to the increase in data volume, resulting in more metadata records that need to be curated for each product. To address metadata availability and completeness, NASA ESDIS has taken significant strides with the creation of the Unified Metadata Model (UMM) and the Common Metadata Repository (CMR). The UMM helps address hurdles posed by the increasing number of metadata dialects, and the CMR provides a primary repository for metadata so that required metadata fields can be served through a growing number of tools and services. However, metadata quality remains an issue, as metadata is not always apparent to the end user. In response to these challenges, the NASA Atmospheric Science Data Center (ASDC) created the Collaboratory for quAlity Metadata Preservation (CAMP) and defined the Product Lifecycle Process (PLP) to work congruently. CAMP is unique in that it provides science team members a UI to directly supply metadata that is complete, compliant, and accurate for their data products. This replaces back-and-forth communication that often results in misinterpreted metadata. Upon review by ASDC staff, metadata is submitted to CMR for broader distribution through Earthdata. Further, approval of science team metadata in CAMP automatically triggers the ASDC PLP workflow to ensure appropriate services are applied throughout the product lifecycle. This presentation will review the design elements of CAMP and PLP as well as demonstrate interfaces to each. It will show the benefits that CAMP and PLP provide to the ASDC that could potentially benefit additional NASA Earth Science Data and Information System (ESDIS) Distributed Active Archive Centers (DAACs).

  13. Active non-volatile memory post-processing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kannan, Sudarsun; Milojicic, Dejan S.; Talwar, Vanish

    A computing node includes an active Non-Volatile Random Access Memory (NVRAM) component which includes memory and a sub-processor component. The memory is to store data chunks received from a processor core, the data chunks comprising metadata indicating a type of post-processing to be performed on data within the data chunks. The sub-processor component is to perform post-processing of said data chunks based on said metadata.
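
    The dispatch implied by this design, where chunk metadata selects the post-processing routine run by the NVRAM-side sub-processor, can be illustrated with a toy Python sketch (standing in for what would in practice be firmware- or driver-level code); the chunk layout and routine names are illustrative only.

    ```python
    # Toy illustration of data chunks whose embedded metadata tells the active-NVRAM
    # sub-processor which post-processing routine to run on the payload.
    # The chunk layout and the routine names are illustrative only.
    from dataclasses import dataclass

    @dataclass
    class DataChunk:
        metadata: str      # type of post-processing requested by the host core
        payload: list      # data written by the processor core

    def checksum(payload):
        return sum(payload) & 0xFFFFFFFF

    def sort_in_place(payload):
        payload.sort()
        return payload

    POST_PROCESSORS = {"checksum": checksum, "sort": sort_in_place}

    def sub_processor_loop(chunks):
        """What the NVRAM-side sub-processor conceptually does with received chunks."""
        for chunk in chunks:
            result = POST_PROCESSORS[chunk.metadata](chunk.payload)
            print(f"{chunk.metadata}: {result}")

    sub_processor_loop([DataChunk("checksum", [3, 1, 2]), DataChunk("sort", [3, 1, 2])])
    ```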

  14. Digital Libraries and the Problem of Purpose [and] On DigiPaper and the Dissemination of Electronic Documents [and] DFAS: The Distributed Finding Aid Search System [and] Best Practices for Digital Archiving: An Information Life Cycle Approach [and] Mapping and Converting Essential Federal Geographic Data Committee (FGDC) Metadata into MARC21 and Dublin Core: Towards an Alternative to the FGDC Clearinghouse [and] Evaluating Website Modifications at the National Library of Medicine through Search Log analysis.

    ERIC Educational Resources Information Center

    Levy, David M.; Huttenlocher, Dan; Moll, Angela; Smith, MacKenzie; Hodge, Gail M.; Chandler, Adam; Foley, Dan; Hafez, Alaaeldin M.; Redalen, Aaron; Miller, Naomi

    2000-01-01

    Includes six articles focusing on the purpose of digital public libraries; encoding electronic documents through compression techniques; a distributed finding aid server; digital archiving practices in the framework of information life cycle management; converting metadata into MARC format and Dublin Core formats; and evaluating Web sites through…

  15. HTTP-based Search and Ordering Using ECHO's REST-based and OpenSearch APIs

    NASA Astrophysics Data System (ADS)

    Baynes, K.; Newman, D. J.; Pilone, D.

    2012-12-01

    Metadata is an important entity in the process of cataloging, discovering, and describing Earth science data. NASA's Earth Observing System (EOS) ClearingHOuse (ECHO) acts as the core metadata repository for EOSDIS data centers, providing a centralized mechanism for metadata and data discovery and retrieval. By supporting both the ESIP Federated Search API and its own search and ordering interfaces, ECHO provides multiple capabilities that facilitate ease of discovery and access to its ever-increasing holdings. Users are able to search and export metadata in a variety of formats including ISO 19115, JSON, and ECHO10. This presentation aims to inform technically savvy clients interested in automating search and ordering of ECHO's metadata catalog. The audience will be introduced to practical and applicable examples of end-to-end workflows that demonstrate finding, subsetting and ordering data bound by keyword, temporal and spatial constraints. Interaction with the ESIP OpenSearch interface will be highlighted, as will ECHO's own REST-based API.
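
    A typical automated workflow of the kind described above issues an OpenSearch-style query with keyword, temporal, and spatial constraints and then parses the returned feed. The Python sketch below uses a placeholder endpoint and parameter names rather than the actual ECHO/CMR API.

    ```python
    # Sketch of an OpenSearch-style granule query with keyword, temporal and
    # spatial constraints, similar to the workflows described above. The endpoint
    # and parameter names are placeholders, not the actual ECHO/CMR interface.
    import requests

    ENDPOINT = "https://example.org/opensearch/granules.atom"  # hypothetical

    params = {
        "keyword": "sea surface temperature",
        "boundingBox": "-180,-90,180,90",            # west,south,east,north
        "startTime": "2012-01-01T00:00:00Z",
        "endTime": "2012-12-31T23:59:59Z",
        "numberOfResults": 10,
    }

    response = requests.get(ENDPOINT, params=params, timeout=30)
    response.raise_for_status()
    print(response.text[:500])  # Atom feed of matching granules, parsed downstream
    ```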

  16. ODMedit: uniform semantic annotation for data integration in medicine based on a public metadata repository.

    PubMed

    Dugas, Martin; Meidt, Alexandra; Neuhaus, Philipp; Storck, Michael; Varghese, Julian

    2016-06-01

    The volume and complexity of patient data - especially in personalised medicine - are steadily increasing, both regarding clinical data and genomic profiles: typically more than 1,000 items (e.g., laboratory values, vital signs, diagnostic tests) are collected per patient in clinical trials, and in oncology hundreds of mutations can potentially be detected for each patient by genomic profiling. Therefore data integration from multiple sources constitutes a key challenge for medical research and healthcare. Semantic annotation of data elements can help identify matching data elements in different sources and thereby supports data integration. Millions of different annotations are required due to the semantic richness of patient data. These annotations should be uniform, i.e., two matching data elements shall contain the same annotations. However, large terminologies like SNOMED CT or UMLS do not provide uniform coding. It is proposed to develop semantic annotations of medical data elements based on a large-scale public metadata repository. To achieve uniform codes, semantic annotations shall be re-used if a matching data element is available in the metadata repository. A web-based tool called ODMedit ( https://odmeditor.uni-muenster.de/ ) was developed to create data models with uniform semantic annotations. It contains ~800,000 terms with semantic annotations which were derived from ~5,800 models from the portal of medical data models (MDM). The tool was successfully applied to manually annotate 22 forms with 292 data items from CDISC and to update 1,495 data models of the MDM portal. Uniform manual semantic annotation of data models is feasible in principle, but requires a large-scale collaborative effort due to the semantic richness of patient data. A web-based tool for these annotations is available, which is linked to a public metadata repository.

  17. Phase III: Implementation and Operation of the Repository

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    None, None

    1998-07-01

    The metadata catalog was brought online for public access May 14, 1998. Since then dozens of users have registered and begun to access the system. The system was demonstrated at the AAPG annual meeting in Salt Lake City and the EAGE (European Association of Geoscientists and Engineers) annual meeting in Leipzig, Germany. Hart Publications and PTTC "NetworkNews" have published articles about the metadata catalog, and articles for the AAPG Explorer and GSA Today are being developed. A back-up system at AGI headquarters was established. In support of the metadata catalog system, a leased-line Internet connection and two servers were installed. Porting of the GeoTrek server software to the new systems has begun. The back-up system will be operational during the 3rd quarter of 1998 and will serve the NGDRS needs during periods when access to the site in Houston is down. Additionally, experimentation with new data types and deployment schemes will be tested on the system at AGI. The NGDRS has picked up additional endorsements from the American Association of State Geologists and the MMS Outer Continental Shelf Policy Committee, and a new endorsement is being formulated by the AAPG Core Preservation Committee for consideration by the AAPG Executive Committee. The Texas Bureau of Economic Geology (BEG) is currently geocoding the well locations for the metadata catalog. They have also solicited proposals for the development of a core inventory control system that will work hand-in-hand with GeoTrek. A contract for that system will probably be awarded during the 3rd quarter of 1998. The Texas Railroad Commission proposes to test the application of GeoTrek for accessing data in a joint project with the BEG. Several data transfer projects are underway. Vastar has committed to the transfer of 2D Appalachian seismic lines to the NGDRS clearinghouse. Receiving repositories have been identified and the final preparations are being made for transfer to these public repositories. Discussions have been initiated with the State of Oregon concerning listing their 400 oil and gas well and 50 geothermal well cores and logs in the metadata catalog. Additionally, discussions continue with the Stapleton Development Corporation concerning the transfer of facilities in Denver for use as a central core repository. A letter of intent for the facility's transfer is being reviewed.

  18. Developing Cyberinfrastructure Tools and Services for Metadata Quality Evaluation

    NASA Astrophysics Data System (ADS)

    Mecum, B.; Gordon, S.; Habermann, T.; Jones, M. B.; Leinfelder, B.; Powers, L. A.; Slaughter, P.

    2016-12-01

    Metadata and data quality are at the core of reusable and reproducible science. While great progress has been made over the years, much of the metadata collected only addresses data discovery, covering concepts such as titles and keywords. Improving metadata beyond the discoverability plateau means documenting detailed concepts within the data such as sampling protocols, instrumentation used, and variables measured. Given that metadata commonly do not describe their data at this level, how might we improve the state of things? Giving scientists and data managers easy to use tools to evaluate metadata quality that utilize community-driven recommendations is the key to producing high-quality metadata. To achieve this goal, we created a set of cyberinfrastructure tools and services that integrate with existing metadata and data curation workflows which can be used to improve metadata and data quality across the sciences. These tools work across metadata dialects (e.g., ISO19115, FGDC, EML, etc.) and can be used to assess aspects of quality beyond what is internal to the metadata such as the congruence between the metadata and the data it describes. The system makes use of a user-friendly mechanism for expressing a suite of checks as code in popular data science programming languages such as Python and R. This reduces the burden on scientists and data managers to learn yet another language. We demonstrated these services and tools in three ways. First, we evaluated a large corpus of datasets in the DataONE federation of data repositories against a metadata recommendation modeled after existing recommendations such as the LTER best practices and the Attribute Convention for Dataset Discovery (ACDD). Second, we showed how this service can be used to display metadata and data quality information to data producers during the data submission and metadata creation process, and to data consumers through data catalog search and access tools. Third, we showed how the centrally deployed DataONE quality service can achieve major efficiency gains by allowing member repositories to customize and use recommendations that fit their specific needs without having to create de novo infrastructure at their site.
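
    The idea of expressing quality checks as code can be illustrated with a small Python sketch; the check names, simplified record structure, and scoring below are illustrative and do not reproduce the DataONE quality service or any specific community recommendation.

    ```python
    # Minimal sketch of expressing metadata-quality checks as code, in the spirit
    # of the check suites described above. Record structure and checks are toy
    # examples, not the actual DataONE service or any published recommendation.
    def check_title_length(record):
        title = record.get("title", "")
        return len(title.split()) >= 5, "title should be descriptive (>= 5 words)"

    def check_attributes_described(record):
        attrs = record.get("attributes", [])
        ok = bool(attrs) and all(a.get("definition") for a in attrs)
        return ok, "every measured variable should have a definition"

    CHECKS = [check_title_length, check_attributes_described]

    def run_suite(record):
        results = [(check.__name__, *check(record)) for check in CHECKS]
        score = sum(passed for _, passed, _ in results) / len(results)
        return score, results

    record = {"title": "Soil moisture, 2015", "attributes": [{"name": "sm", "definition": ""}]}
    score, results = run_suite(record)
    for name, passed, message in results:
        print(f"{'PASS' if passed else 'FAIL'} {name}: {message}")
    print(f"quality score: {score:.0%}")
    ```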

  19. The National Institute of Neurological Disorders and Stroke and Department of Defense Sport-Related Concussion Common Data Elements Version 1.0 Recommendations.

    PubMed

    Broglio, Steven P; Kontos, Anthony P; Levin, Harvey; Schneider, Kathryn; Wilde, Elisabeth A; Cantu, Robert C; Feddermann-Demont, Nina; Fuller, Gordon; Gagnon, Isabelle; Gioia, Gerry; Giza, Christopher C; Griesbach, Grace Sophia; Leddy, John J; Lipton, Michael L; Mayer, Andrew; McAllister, Thomas; McCrea, Michael; McKenzie, Lara; Putukian, Margot; Signoretti, Stefano; Suskauer, Stacy J; Tamburro, Robert; Turner, Michael; Yeates, Keith Owen; Zemek, Roger; Ala'i, Sherita; Esterlitz, Joy; Gay, Katelyn; Bellgowan, Patrick S F; Joseph, Kristen

    2018-05-02

    Through a partnership with the National Institute of Neurological Disorders and Stroke (NINDS), National Institutes of Health (NIH), and Department of Defense (DoD), the development of Sport-Related Concussion (SRC) Common Data Elements (CDEs) was initiated. The aim of this collaboration was to increase the efficiency and effectiveness of clinical research studies and clinical treatment outcomes, increase data quality, facilitate data sharing across studies, reduce study start-up time, more effectively aggregate information into metadata results, and educate new clinical investigators. The SRC CDE Working Group consisted of 34 worldwide experts in concussion from varied fields of related expertise, divided into three Subgroups: Acute (<72 hours post-concussion), Subacute (3 days-3 months post-concussion) and Persistent/Chronic (>3 months post-concussion). To develop CDEs, the Subgroups reviewed various domains, and then selected from, refined, and added to existing CDEs, case report forms and field-tested data elements from national registries and funded research studies. Recommendations were posted to the NINDS CDE Website for Public Review from February 2017 to April 2017. Following an internal Working Group review of recommendations, along with consideration of comments received from the Public Review period, the first iteration (Version 1.0) of the NINDS SRC CDEs was completed in June 2017. The recommendations include Core and Supplemental - Highly Recommended CDEs for cognitive data elements and symptom checklists, as well as other outcomes and endpoints (e.g., vestibular, oculomotor, balance, anxiety, depression) and sample case report forms (e.g., injury reporting, demographics, concussion history) for domains typically included in clinical research studies. The NINDS SRC CDEs and supporting documents are publicly available on the NINDS CDE website https://www.commondataelements.ninds.nih.gov/. Widespread use of CDEs by researchers and clinicians will facilitate consistent SRC clinical research and trial design, data sharing, and metadata retrospective analysis.

  20. Challenges with secondary use of multi-source water-quality data in the United States

    USGS Publications Warehouse

    Sprague, Lori A.; Oelsner, Gretchen P.; Argue, Denise M.

    2017-01-01

    Combining water-quality data from multiple sources can help counterbalance diminishing resources for stream monitoring in the United States and lead to important regional and national insights that would not otherwise be possible. Individual monitoring organizations understand their own data very well, but issues can arise when their data are combined with data from other organizations that have used different methods for reporting the same common metadata elements. Such use of multi-source data is termed "secondary use", that is, the use of data beyond the original intent determined by the organization that collected the data. In this study, we surveyed more than 25 million nutrient records collected by 488 organizations in the United States since 1899 to identify major inconsistencies in metadata elements that limit the secondary use of multi-source data. Nearly 14.5 million of these records had missing or ambiguous information for one or more key metadata elements, including (in decreasing order of records affected) sample fraction, chemical form, parameter name, units of measurement, precise numerical value, and remark codes. As a result, metadata harmonization to make secondary use of these multi-source data will be time consuming, expensive, and inexact. Different data users may make different assumptions about the same ambiguous data, potentially resulting in different conclusions about important environmental issues. The value of these ambiguous data is estimated at US$12 billion, a substantial collective investment by water-resource organizations in the United States. By comparison, the value of unambiguous data is estimated at US$8.2 billion. The ambiguous data could be preserved for uses beyond the original intent by developing and implementing standardized metadata practices for future and legacy water-quality data throughout the United States.
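
    A first screening step for this kind of harmonization work can be sketched with pandas: flag records in which any key metadata element is missing. The column names and toy rows below are illustrative, not the actual multi-agency dataset.

    ```python
    # Sketch of screening multi-source water-quality records for missing or
    # ambiguous key metadata elements of the kind listed above. Columns and rows
    # are toy examples, not the actual multi-agency dataset.
    import pandas as pd

    records = pd.DataFrame([
        {"org": "A", "parameter": "Nitrate", "fraction": "Dissolved", "units": "mg/L", "value": 0.42},
        {"org": "B", "parameter": "Nitrate", "fraction": None,        "units": "mg/L", "value": 0.37},
        {"org": "C", "parameter": "NO3",     "fraction": "Total",     "units": None,   "value": 0.51},
    ])

    key_elements = ["parameter", "fraction", "units", "value"]
    ambiguous = records[records[key_elements].isna().any(axis=1)]

    print(f"{len(ambiguous)} of {len(records)} records need metadata harmonization")
    print(ambiguous[["org"] + key_elements])
    ```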

  1. Design of case report forms based on a public metadata registry: re-use of data elements to improve compatibility of data.

    PubMed

    Dugas, Martin

    2016-11-29

    Clinical trials use many case report forms (CRFs) per patient. Because of the astronomical number of potential CRFs, data element re-use at the design stage is attractive to foster compatibility of data from different trials. The objective of this work is to assess the technical feasibility of a CRF editor with connection to a public metadata registry (MDR) to support data element re-use. Based on the Medical Data Models portal, an ISO/IEC 11179-compliant MDR was implemented and connected to a web-based CRF editor. Three use cases were implemented: re-use at the form, item group and data element levels. CRF design with data element re-use from a public MDR is feasible. A prototypic system is available. The main limitation of the system is the amount of available MDR content.

  2. Semantic metadata application for information resources systematization in water spectroscopy

    NASA Astrophysics Data System (ADS)

    Fazliev, A.; Privezentsev, A.; Tennyson, J.

    2009-04-01

    The information and knowledge layers of an information-computational system for water spectroscopy are described. Semantic metadata for all the tasks of the domain information model that underlie these layers have been studied. The principles of semantic metadata determination and the mechanisms of their use during information systematization in molecular spectroscopy are presented, and the software developed for working with semantic metadata is described as well. Formation of a domain model in the framework of the Semantic Web is based on an explicit specification of its conceptualization, in other words, its ontologies. The formation of a conceptualization for molecular spectroscopy was described in Refs. 1 and 2. In those works, two chains of tasks, a direct-task chain and an inverse-task chain, were selected as a zeroth approximation for the knowledge domain description. The solution schemes of these tasks defined the approximation of the data layer of the knowledge domain conceptualization. The properties of spectroscopy task solutions lead to a step-by-step extension of the molecular spectroscopy conceptualization, and the information layer of the information system corresponds to this extension. An advantage of a molecular spectroscopy model designed in the form of task chains is that one can explicitly define data and metadata at each step of the solution of these chained tasks. The metadata structure (the properties of task solutions) in the knowledge domain also has the form of a chain, in which the input data and metadata of a previous task become metadata of the following tasks. The term metadata is used here in a narrow sense: metadata are the properties of spectroscopy task solutions. Semantic metadata represented with the help of OWL [3] are formed automatically; they are individuals of classes (the A-box). The union of the T-box and the A-box is an ontology that can be processed with an inference engine. In this work we analyzed the formation of individuals of molecular spectroscopy applied ontologies as well as the software used for their creation by means of the OWL DL language. The results of this work are presented in the form of an information layer and a knowledge layer in the W@DIS information system [4]. 1 FORMATION OF INDIVIDUALS OF THE WATER SPECTROSCOPY APPLIED ONTOLOGY. The applied task ontology contains an explicit description of the input and output data of the physical tasks solved in the two chains of molecular spectroscopy tasks. Besides the physical concepts related to spectroscopy task solutions, an information source, a key concept of the knowledge domain information model, is also used. Each solution of a knowledge domain task is linked to an information source, which contains a reference to the published task solution, the molecule, and the task solution properties. Each information source allows us to identify a particular knowledge domain task solution contained in the information system. The classes of the water spectroscopy applied ontology are formed on the basis of the molecular spectroscopy concept taxonomy and are defined by constraints on the properties of the selected conceptualization. Extension of the applied ontology in the W@DIS information system proceeds according to two scenarios: individuals (ontology facts, or axioms) are formed during the upload of a task solution into the information system, while operations on the ontology that involve the molecular spectroscopy taxonomy and its individuals are performed solely by the user, using the Protégé ontology editor.
    Software was designed and implemented for the formation, processing and visualization of individuals of the knowledge domain tasks. The method of individual formation determines the sequence of steps by which ontology individuals are generated. Task solution properties (metadata) have qualitative and quantitative values. Qualitative metadata describe the qualitative side of a task, such as the solution method or other information that can be specified explicitly by object properties of the OWL DL language. Quantitative metadata describe quantitative properties of a task solution, such as minimal and maximal data values or other information that can be obtained by programmed algorithmic operations; these metadata correspond to DatatypeProperty properties of the OWL specification language and can be obtained automatically during data upload into the information system. Since ObjectProperty values are objects, processing qualitative metadata requires logical constraints. For tasks solved within the W@DIS ICS, qualitative metadata can also be formed automatically (for example, in the spectral function calculation task). The methods used to translate qualitative metadata into quantitative form can be characterized as a coarsened representation of knowledge in the domain. The existence of two ways of obtaining data is a key point in forming the applied ontology of a molecular spectroscopy task: the experimental method (metadata for experimental data contain descriptions of equipment, experimental conditions and so on) at the initial stage and inverse task solutions at the following stages; and the calculation method (metadata for calculated data are closely related to the metadata describing the physical and mathematical models of molecular spectroscopy). 2 SOFTWARE FOR ONTOLOGY OPERATION Data collection in the water spectroscopy information system is organized as a workflow that includes operations such as creating an information source, entering bibliographic data on publications, forming the schema of uploaded data and so on. Metadata are generated within the information source as well, by two methods: automatic metadata generation and manual metadata generation performed by the user. Software support for actions related to metadata formation is provided by the META+ module, whose functions fall into two groups: those needed by the software developer and those needed by a user of the information system. The META+ module functions needed by the developer are: 1. creation of the taxonomy (T-boxes) of applied ontology classes for knowledge domain tasks; 2. creation of instances of task classes; 3. creation of task data schemes in the form of XML patterns based on XML syntax (an XML pattern is developed for the instance generator and created according to rules imposed by the generator implementation); 4. implementation of algorithms for calculating metadata values; 5. creation of a request interface and additional knowledge processing functions for solving these tasks; 6. unification of the created functions and interfaces into one information system. This sequence is universal for generating individuals of task classes that form chains. Special interfaces for managing user operations are provided to the software developer in the META+ module.
    Means are also provided for updating qualitative metadata values when data are re-uploaded to an information source. The functions needed by the end user include: - visualization and editing of data sets, taking their metadata into account, e.g. displaying the number of unique bands in the transitions for a given data source; - export of OWL/RDF models from the information system to the environment in XML syntax; - visualization of instances of classes of the applied ontology tasks on molecular spectroscopy; - import of OWL/RDF models into the information system and their integration with the domain vocabulary; - formation of additional domain knowledge for constructing ontological instances of task classes using GTML formats and their processing; - formation of additional domain knowledge for constructing instances of task classes using software algorithms for processing data sets; - a semantic search function implemented through an interface that formulates questions as sets of related triples in order to obtain an adequate answer. 3 STRUCTURE OF THE META+ MODULE The META+ software module that provides the above functions contains the following components: - a knowledge base that stores the semantic metadata and taxonomies of the information system; - the third-party software libraries POWL and RAP (Ref. 5), which provide access to the ontological storage; - function classes and libraries that form the core of the module and perform the formation, storage and visualization of class instances; - configuration files and module patterns that allow the operation of the different functional blocks to be adjusted and organized. The META+ module also contains scripts and patterns implemented according to the rules of the W@DIS information system development environment: - scripts for interacting with the environment through the software core of the information system, which organize web-oriented interactive communication; - patterns for visualizing the functionality realized by the scripts. The software core of the W@DIS scientific information-computational system is built on the MVC (Model-View-Controller) design pattern, which separates application logic from its representation. It realizes the interaction of three logical components, providing interactivity with the environment via the Web and performing its preprocessing. The functions of the Controller component are realized by scripts designed according to the rules imposed by the software core of the information system; each script is an object-oriented class with an obligatory initiation method called "start". The View component, which presents the results of domain application operation, consists of sets of HTML patterns that visualize those results through additional constructions processed by the software core of the system. Besides interacting with the software core of the scientific information system, this module also works with the configuration files of the software core and its database. This organization provides closer integration with the software core and a deeper, more adequate connection with the supporting system. 4 CONCLUSION This work has discussed the problems of creating semantic metadata in an information system oriented toward representing information in the area of molecular spectroscopy.
    The method of forming semantic metadata and functions, as well as the realization and structure of the META+ module, have been described. The architecture of the META+ module is closely tied to the existing software of the "Molecular spectroscopy" scientific information system. The module is implemented using modern approaches to developing Web-oriented applications and uses the existing application interfaces. The developed software allows us to: - perform automatic metadata annotation of calculated task solutions directly in the information system; - perform automatic metadata annotation of task solutions whose results were obtained outside the information system, forming an instance of the solved task on the basis of the input data; - use ontological instances of task solutions to identify data in the viewing, comparison and search tasks solved by the information system; - export applied task ontologies for use by external tools; - solve the semantic search task according to a pattern, using a question-answer interface. 5 ACKNOWLEDGEMENT The authors are grateful to RFBR for financial support of the development of the distributed information system for molecular spectroscopy. REFERENCES 1. A. D. Bykov, A. Z. Fazliev, N. N. Filippov, A. V. Kozodoev, A. I. Privezentsev, L. N. Sinitsa, M. V. Tonkov and M. Yu. Tretyakov, Distributed information system on atmospheric spectroscopy, Geophysical Research Abstracts, 2007, v. 9, p. 01906, SRef-ID: 1607-7962/gra/EGU2007-A-01906. 2. A. I. Privezentsev and A. Z. Fazliev, Applied task ontology for molecular spectroscopy information resources systematization, Proceedings of the 9th Russian scientific conference "Electronic libraries: advanced methods and technologies, electronic collections" (RCDL'2007), Pereslavl Zalesskii, 2007, part 1, pp. 201-210. 3. OWL Web Ontology Language Semantics and Abstract Syntax, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-owl-semantics-20040210/. 4. W@DIS information system, http://wadis.saga.iao.ru. 5. RAP library, http://www4.wiwiss.fu-berlin.de/bizer/rdfapi/.
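
    A minimal sketch of the automatic A-box formation described in this record is given below, using Python and rdflib. The namespace, class and property names (TaskSolution, hasSolutionMethod, minWavenumber, maxWavenumber) are hypothetical placeholders, not the actual W@DIS applied-ontology vocabulary; the sketch only illustrates how a task-solution individual with qualitative (ObjectProperty) and quantitative (DatatypeProperty) metadata might be generated during upload.

    # Hedged sketch: generating an OWL individual (A-box fact) for a task solution.
    # All names in the EX namespace are invented placeholders, not W@DIS terms.
    from rdflib import Graph, Literal, Namespace, RDF
    from rdflib.namespace import OWL, XSD

    EX = Namespace("http://example.org/spectroscopy#")

    def describe_task_solution(solution_id, method, min_wn, max_wn):
        g = Graph()
        g.bind("ex", EX)
        g.bind("owl", OWL)
        individual = EX[solution_id]
        g.add((individual, RDF.type, OWL.NamedIndividual))
        g.add((individual, RDF.type, EX.TaskSolution))                      # class from the T-box
        g.add((individual, EX.hasSolutionMethod, EX[method]))               # qualitative metadata (object property)
        g.add((individual, EX.minWavenumber, Literal(min_wn, datatype=XSD.double)))  # quantitative metadata
        g.add((individual, EX.maxWavenumber, Literal(max_wn, datatype=XSD.double)))  # (datatype properties)
        return g

    print(describe_task_solution("H2O_linelist_001", "VariationalCalculation", 0.0, 26000.0).serialize(format="xml"))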

  3. Fallon, Nevada FORGE Gravity and Magnetics Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Blankenship, Doug; Witter, Jeff; Carpenter, Thomas

    This package contains principal facts for new gravity data collected September - November 2017 in support of the Fallon FORGE project. Also included are rock core density and magnetic susceptibility data for key core intervals, used in modeling 2D and 3D gravity inversions. Individual metadata summaries are provided as .pdf within each attached archive.

  4. Mercury: An Example of Effective Software Reuse for Metadata Management, Data Discovery and Access

    NASA Astrophysics Data System (ADS)

    Devarakonda, Ranjeet; Palanisamy, Giri; Green, James; Wilson, Bruce E.

    2008-12-01

    Mercury is a federated metadata harvesting, data discovery and access tool based on both open source packages and custom developed software. Though originally developed for NASA, the Mercury development consortium now includes funding from NASA, USGS, and DOE. Mercury supports the reuse of metadata by enabling searching across a range of metadata specifications and standards including XML, Z39.50, FGDC, Dublin-Core, Darwin-Core, EML, and ISO-19115. Mercury provides a single portal to information contained in distributed data management systems. It collects metadata and key data from contributing project servers distributed around the world and builds a centralized index. The Mercury search interfaces then allow the users to perform simple, fielded, spatial and temporal searches across these metadata sources. One of the major goals of the recent redesign of Mercury was to improve software reusability across the 12 projects which currently fund the continuing development of Mercury. These projects span a range of land, atmosphere, and ocean ecological communities and have a number of common needs for metadata searches, but they also have a number of needs specific to one or a few projects. To balance these common and project-specific needs, Mercury's architecture has three major reusable components: a harvester engine, an indexing system and a user interface component. The harvester engine is responsible for harvesting metadata records from various distributed servers around the USA and around the world. The harvester software is packaged in such a way that all Mercury projects use the same harvester scripts, with each project driven by a set of project-specific configuration files. The harvested files are structured metadata records that are indexed consistently against the search library API, so that the system can offer various search capabilities such as simple, fielded, spatial and temporal searches. This backend component is supported by a flexible, easy-to-use graphical user interface driven by cascading style sheets, which makes reusable design implementation even simpler. The new Mercury system is based on a Service Oriented Architecture and effectively reuses components for various services such as the Thesaurus Service, Gazetteer Web Service and UDDI Directory Services. The software also provides various search services including RSS, Geo-RSS, OpenSearch, Web Services and Portlets, an integrated shopping cart to order datasets from various data centers (ORNL DAAC, NSIDC), and integrated visualization tools. Other features include filtering and dynamic sorting of search results, bookmarkable search results, and the ability to save, retrieve, and modify search criteria.
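
    The "same harvester scripts, project-specific configuration" pattern described above can be illustrated with a short, hedged sketch. The config keys, endpoint URL and record layout below are invented for illustration and are not Mercury's actual configuration format.

    # Illustrative sketch of a configuration-driven metadata harvester: shared code,
    # per-project JSON config. Keys and endpoint are hypothetical, not Mercury's files.
    import json
    import urllib.request
    import xml.etree.ElementTree as ET

    def harvest(config_path):
        with open(config_path) as f:
            cfg = json.load(f)                   # e.g. {"project": "ORNL", "endpoint": "...", "record_tag": "record"}
        with urllib.request.urlopen(cfg["endpoint"]) as resp:
            tree = ET.parse(resp)
        # Same code path for every project; only the configuration file differs.
        return [{child.tag: (child.text or "").strip() for child in rec}
                for rec in tree.iter(cfg["record_tag"])]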

  5. Mercury: An Example of Effective Software Reuse for Metadata Management, Data Discovery and Access

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Devarakonda, Ranjeet

    2008-01-01

    Mercury is a federated metadata harvesting, data discovery and access tool based on both open source packages and custom developed software. Though originally developed for NASA, the Mercury development consortium now includes funding from NASA, USGS, and DOE. Mercury supports the reuse of metadata by enabling searching across a range of metadata specifications and standards including XML, Z39.50, FGDC, Dublin-Core, Darwin-Core, EML, and ISO-19115. Mercury provides a single portal to information contained in distributed data management systems. It collects metadata and key data from contributing project servers distributed around the world and builds a centralized index. The Mercury search interfaces then allow the users to perform simple, fielded, spatial and temporal searches across these metadata sources. One of the major goals of the recent redesign of Mercury was to improve software reusability across the 12 projects which currently fund the continuing development of Mercury. These projects span a range of land, atmosphere, and ocean ecological communities and have a number of common needs for metadata searches, but they also have a number of needs specific to one or a few projects. To balance these common and project-specific needs, Mercury's architecture has three major reusable components: a harvester engine, an indexing system and a user interface component. The harvester engine is responsible for harvesting metadata records from various distributed servers around the USA and around the world. The harvester software is packaged in such a way that all Mercury projects use the same harvester scripts, with each project driven by a set of project-specific configuration files. The harvested files are structured metadata records that are indexed consistently against the search library API, so that the system can offer various search capabilities such as simple, fielded, spatial and temporal searches. This backend component is supported by a flexible, easy-to-use graphical user interface driven by cascading style sheets, which makes reusable design implementation even simpler. The new Mercury system is based on a Service Oriented Architecture and effectively reuses components for various services such as the Thesaurus Service, Gazetteer Web Service and UDDI Directory Services. The software also provides various search services including RSS, Geo-RSS, OpenSearch, Web Services and Portlets, an integrated shopping cart to order datasets from various data centers (ORNL DAAC, NSIDC), and integrated visualization tools. Other features include filtering and dynamic sorting of search results, bookmarkable search results, and the ability to save, retrieve, and modify search criteria.

  6. CT Scans of Cores Metadata, Barrow, Alaska 2015

    DOE Data Explorer

    Katie McKnight; Tim Kneafsey; Craig Ulrich

    2015-03-11

    Individual ice cores were collected from Barrow Environmental Observatory in Barrow, Alaska, throughout 2013 and 2014. Cores were drilled along different transects to sample polygonal features (i.e. the trough, center and rim of high, transitional and low center polygons). Most cores were drilled around 1 meter in depth and a few deep cores were drilled around 3 meters in depth. Three-dimensional images of the frozen cores were constructed using a medical X-ray computed tomography (CT) scanner. TIFF files can be uploaded to ImageJ (an open-source imaging software) to examine soil structure and densities within each core.
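
    As a hedged illustration of the workflow the record describes (reconstructed CT volumes examined for soil structure and density), the sketch below reads a TIFF stack in Python. It assumes the tifffile and numpy packages, and the filename is a hypothetical placeholder.

    # Read a core CT volume exported as a TIFF stack and print a crude density-with-depth profile.
    import numpy as np
    import tifffile

    volume = tifffile.imread("core_BEO_transect1.tif")          # hypothetical file; shape (slices, rows, cols)
    print("volume shape:", volume.shape, "dtype:", volume.dtype)
    profile = volume.reshape(volume.shape[0], -1).mean(axis=1)  # CT grey values track density
    for i, mean_value in enumerate(profile[:5]):
        print(f"slice {i}: mean grey value {mean_value:.1f}")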

  7. Unified Science Information Model for SoilSCAPE using the Mercury Metadata Search System

    NASA Astrophysics Data System (ADS)

    Devarakonda, Ranjeet; Lu, Kefa; Palanisamy, Giri; Cook, Robert; Santhana Vannan, Suresh; Moghaddam, Mahta; Clewley, Dan; Silva, Agnelo; Akbar, Ruzbeh

    2013-12-01

    SoilSCAPE (Soil moisture Sensing Controller And oPtimal Estimator) introduces a new concept for a smart wireless sensor web technology for optimal measurements of surface-to-depth profiles of soil moisture using in-situ sensors. The objective is to enable a guided and adaptive sampling strategy for the in-situ sensor network to meet the measurement validation objectives of spaceborne soil moisture sensors such as the Soil Moisture Active Passive (SMAP) mission. This work is being carried out at the University of Michigan, the Massachusetts Institute of Technology, the University of Southern California, and Oak Ridge National Laboratory. At Oak Ridge National Laboratory we are using the Mercury metadata search system [1] to build a Unified Information System for the SoilSCAPE project. This unified portal comprises three key pieces: distributed search/discovery; data collections and integration; and data dissemination. Mercury, a federally funded software system for metadata harvesting, indexing, and searching, is used for this module. Soil moisture data sources identified as part of this activity, such as SoilSCAPE and FLUXNET (in-situ sensors), AirMOSS (airborne retrieval), and SMAP (spaceborne retrieval), are being indexed and maintained by Mercury. Mercury will be the central repository of data sources for cal/val soil moisture studies and will provide a mechanism to identify additional data sources. Relevant metadata from existing inventories such as ORNL DAAC, USGS Clearinghouse, ARM, NASA ECHO, and GCMD will be brought into this soil-moisture data search/discovery module. The SoilSCAPE [2] metadata records will also be published in broader metadata repositories such as GCMD and data.gov. Mercury can be configured to provide a single portal to soil moisture information contained in disparate data management systems located anywhere on the Internet. Mercury is able to extract metadata systematically from HTML pages or XML files using a variety of methods including OAI-PMH [3]. The Mercury search interface then allows users to perform simple, fielded, spatial and temporal searches across a central harmonized index of metadata. Mercury supports various metadata standards including FGDC, ISO-19115, DIF, Dublin-Core, Darwin-Core, and EML. This poster describes in detail how Mercury implements the Unified Science Information Model for soil moisture data. References: [1] Devarakonda R., et al. Mercury: reusable metadata management, data discovery and access system. Earth Science Informatics (2010), 3(1): 87-94. [2] Devarakonda R., et al. Daymet: Single Pixel Data Extraction Tool. http://daymet.ornl.gov/singlepixel.html (2012). Last accessed 10-01-2013. [3] Devarakonda R., et al. Data sharing and retrieval using OAI-PMH. Earth Science Informatics (2011), 4(1): 1-5.
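
    Since the record notes that Mercury can harvest via OAI-PMH, a minimal hedged sketch of such a request is shown below. Only the OAI-PMH verb and metadataPrefix parameters are standard protocol elements; the base URL is a placeholder.

    # Minimal OAI-PMH ListRecords request returning Dublin Core titles.
    import urllib.parse
    import urllib.request
    import xml.etree.ElementTree as ET

    def list_record_titles(base_url):
        query = urllib.parse.urlencode({"verb": "ListRecords", "metadataPrefix": "oai_dc"})
        with urllib.request.urlopen(f"{base_url}?{query}") as resp:
            root = ET.parse(resp).getroot()
        dc_title = "{http://purl.org/dc/elements/1.1/}title"
        return [t.text for t in root.iter(dc_title) if t.text]

    # e.g. list_record_titles("https://example.org/oai")   # placeholder endpoint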

  8. The Effects of Discipline on the Application of Learning Object Metadata in UK Higher Education: The Case of the Jorum Repository

    ERIC Educational Resources Information Center

    Balatsoukas, Panos; O'Brien, Ann; Morris, Anne

    2011-01-01

    Introduction: This paper reports on the findings of a study investigating the potential effects of discipline (sciences and engineering versus humanities and social sciences) on the application of the Institute of Electrical and Electronic Engineers learning object metadata elements for the description of learning objects in the Jorum learning…

  9. Implementing DSpace at NASA Langley Research Center

    NASA Technical Reports Server (NTRS)

    Lowe, Greta

    2007-01-01

    This presentation looks at the implementation of the DSpace institutional repository system at the NASA Langley Technical Library. NASA Langley Technical Library implemented DSpace software as a replacement for the Langley Technical Report Server (LTRS). DSpace was also used to develop the Langley Technical Library Digital Repository (LTLDR). LTLDR contains archival copies of core technical reports in the aeronautics area dating back to the NACA era and other specialized collections relevant to the NASA Langley community. Extensive metadata crosswalks were created to facilitate moving data from various systems and formats to DSpace. The Dublin Core metadata screens were also customized. The OpenURL standard and Ex Libris Metalib are being used in this environment to assist our customers with either discovering full-text content or initiating a request for the item.

  10. HDF-EOS Dump Tools

    NASA Astrophysics Data System (ADS)

    Prasad, U.; Rahabi, A.

    2001-05-01

    The following utilities, developed for dumping HDF-EOS format data, are of special use for Earth science data from NASA's Earth Observing System (EOS). This poster demonstrates their use and application. The first four tools take HDF-EOS data files as input. HDF-EOS Metadata Dumper - metadmp: The metadata dumper extracts metadata from EOS data granules. It operates by simply copying blocks of metadata from the file to standard output; it does not process the metadata in any way. Since all metadata in EOS granules are encoded in the Object Description Language (ODL), the output of metadmp is in the form of complete ODL statements. EOS data granules may contain up to three different sets of metadata (Core, Archive, and Structural Metadata). HDF-EOS Contents Dumper - heosls: The heosls dumper displays the contents of HDF-EOS files. This utility provides detailed information on the POINT, SWATH, and GRID data sets in the files; for example, it will list the geolocation fields, data fields and objects. HDF-EOS ASCII Dumper - asciidmp: The ASCII dump utility extracts fields from EOS data granules into plain ASCII text. The output from asciidmp should be easily human readable, and with minor editing it can be made ingestible by any application with ASCII import capabilities. HDF-EOS Binary Dumper - bindmp: The binary dumper utility dumps HDF-EOS objects in binary format. This is useful for feeding its output into an existing program that does not understand HDF, for example custom software and COTS products. HDF-EOS User Friendly Metadata - UFM: The UFM utility is useful for viewing ECS metadata. UFM takes an EOSDIS ODL metadata file and produces an HTML report of the metadata for display in a web browser. HDF-EOS METCHECK - METCHECK: METCHECK can be invoked from either a Unix or DOS environment with a set of command line options that a user can use to direct the tool's inputs and output. METCHECK validates the inventory metadata (in a .met file) using the descriptor file (.desc) as the reference. The tool takes a .desc file and a .met ODL file as inputs, and generates a simple output file containing the results of the checking process.
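
    The behaviour of metadmp (copying ODL metadata blocks to standard output without interpreting them) can be approximated with a hedged, text-level sketch. A real implementation reads the CoreMetadata/ArchiveMetadata/StructMetadata attributes through an HDF library; the version below simply scans for ODL text, which only works because ODL is stored as plain ASCII inside the granule.

    # Simplified, metadmp-like extraction of ODL metadata blocks from a granule.
    import re
    import sys

    def dump_odl(path):
        with open(path, "rb") as f:
            text = f.read().decode("ascii", errors="ignore")
        # ODL metadata blocks run from "GROUP = ...METADATA" to a standalone "END" terminator.
        blocks = re.findall(r"GROUP\s*=\s*\w*METADATA.*?\bEND\b", text, flags=re.DOTALL)
        return "\n".join(blocks)

    if __name__ == "__main__":
        sys.stdout.write(dump_odl(sys.argv[1]))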

  11. Principles of metadata organization at the ENCODE data coordination center

    PubMed Central

    Hong, Eurie L.; Sloan, Cricket A.; Chan, Esther T.; Davidson, Jean M.; Malladi, Venkat S.; Strattan, J. Seth; Hitz, Benjamin C.; Gabdank, Idan; Narayanan, Aditi K.; Ho, Marcus; Lee, Brian T.; Rowe, Laurence D.; Dreszer, Timothy R.; Roe, Greg R.; Podduturi, Nikhil R.; Tanaka, Forrest; Hilton, Jason A.; Cherry, J. Michael

    2016-01-01

    The Encyclopedia of DNA Elements (ENCODE) Data Coordinating Center (DCC) is responsible for organizing, describing and providing access to the diverse data generated by the ENCODE project. The description of these data, known as metadata, includes the biological sample used as input, the protocols and assays performed on these samples, the data files generated from the results and the computational methods used to analyze the data. Here, we outline the principles and philosophy used to define the ENCODE metadata in order to create a metadata standard that can be applied to diverse assays and multiple genomic projects. In addition, we present how the data are validated and used by the ENCODE DCC in creating the ENCODE Portal (https://www.encodeproject.org/). Database URL: www.encodeproject.org PMID:26980513

  12. A New Look at Data Usage by Using Metadata Attributes as Indicators of Data Quality

    NASA Technical Reports Server (NTRS)

    Won, Young-In; Wanchoo, Lalit; Behnke, Jeanne

    2016-01-01

    This study reviews the key metrics (users, distributed volume, and files) in multiple ways to gain an understanding of the significance of the metadata. Characterizing the usability of data by key metadata elements, such as discipline and study area, will assist in understanding how user needs have evolved over time. The data usage pattern based on product level provides insight into the level of data quality. In addition, the data metrics by various services, such as the Open-source Project for a Network Data Access Protocol (OPeNDAP) and subsets, address how these services have extended the usage of data. Overall, this study presents the usage of data and metadata by metrics analyses, which may assist data centers in better supporting the needs of the users.
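
    A hedged sketch of the kind of roll-up this study performs (usage metrics grouped by metadata attributes) is given below using pandas. The column names and numbers are invented for illustration and are not the actual EOSDIS metrics schema.

    # Group hypothetical usage metrics by metadata attributes (discipline, product level).
    import pandas as pd

    metrics = pd.DataFrame([
        {"discipline": "Atmosphere", "product_level": "L2", "volume_gb": 820.0, "files": 12000, "users": 310},
        {"discipline": "Atmosphere", "product_level": "L3", "volume_gb": 95.0,  "files": 1500,  "users": 940},
        {"discipline": "Land",       "product_level": "L3", "volume_gb": 410.0, "files": 6000,  "users": 560},
    ])
    print(metrics.groupby(["discipline", "product_level"])[["volume_gb", "files", "users"]].sum())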

  13. Overcoming the challenges of secure mobile applications for network-centric, data-sensitive applications

    NASA Astrophysics Data System (ADS)

    Farroha, Bassam; Farroha, Deborah

    2012-05-01

    Gaining the competitive advantage in today's aggressive environment requires our corporate leaders and Warfighters alike to be armed with up-to-date knowledge related to friendly and opposing forces. This knowledge has to be delivered in real time between the core enterprise and tactical/mobile units at the edge. The type and sensitivity of data delivered will vary depending on users, threat level and current rules of dissemination. This paper describes mobile security management that bases access rights on positive identification of the user, authenticating both the user and the edge device. Access management is then granted on a fine-grained basis, where each data element is tagged with metadata that is crypto-bound to the data itself to ensure authenticity of contents and observance of data sensitivity.
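
    One generic way to realize the "metadata crypto-bound to the data" idea is an HMAC computed over the data element and its tag together, so that neither can be altered without detection. The sketch below illustrates that general technique, not the specific mechanism proposed in the paper, and the key handling is deliberately simplified.

    # Bind a sensitivity tag to a data element with an HMAC over both.
    import hashlib, hmac, json

    SECRET_KEY = b"shared-enterprise-key"        # placeholder; real systems use managed keys

    def tag_element(payload, sensitivity, owner):
        meta = {"sensitivity": sensitivity, "owner": owner}
        mac = hmac.new(SECRET_KEY, payload + json.dumps(meta, sort_keys=True).encode(), hashlib.sha256)
        return {"metadata": meta, "hmac": mac.hexdigest()}

    def verify_element(payload, tagged):
        expected = hmac.new(SECRET_KEY, payload + json.dumps(tagged["metadata"], sort_keys=True).encode(),
                            hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, tagged["hmac"])

    tagged = tag_element(b"position report", "SENSITIVE", "unit-42")
    assert verify_element(b"position report", tagged)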

  14. The Index to Marine and Lacustrine Geological Samples: Improving Sample Accessibility and Enabling Current and Future Research

    NASA Astrophysics Data System (ADS)

    Moore, C.

    2011-12-01

    The Index to Marine and Lacustrine Geological Samples is a community designed and maintained resource enabling researchers to locate and request sea floor and lakebed geologic samples archived by partner institutions. Conceived in the dawn of the digital age by representatives from U.S. academic and government marine core repositories and the NOAA National Geophysical Data Center (NGDC) at a 1977 meeting convened by the National Science Foundation (NSF), the Index is based on core concepts of community oversight, common vocabularies, consistent metadata and a shared interface. Form and content of underlying vocabularies and metadata continue to evolve according to the needs of the community, as do supporting technologies and access methodologies. The Curators Consortium, now international in scope, meets at partner institutions biennially to share ideas and discuss best practices. NGDC serves the group by providing database access and maintenance, a list server, digitizing support and long-term archival of sample metadata, data and imagery. Over three decades, participating curators have performed the herculean task of creating and contributing metadata for over 195,000 sea floor and lakebed cores, grabs, and dredges archived in their collections. Some partners use the Index for primary web access to their collections while others use it to increase exposure of more in-depth institutional systems. The Index is currently a geospatially-enabled relational database, publicly accessible via Web Feature and Web Map Services, and text- and ArcGIS map-based web interfaces. To provide as much knowledge as possible about each sample, the Index includes curatorial contact information and links to related data, information and images; 1) at participating institutions, 2) in the NGDC archive, and 3) at sites such as the Rolling Deck to Repository (R2R) and the System for Earth Sample Registration (SESAR). Over 34,000 International GeoSample Numbers (IGSNs) linking to SESAR are included in anticipation of opportunities for interconnectivity with Integrated Earth Data Applications (IEDA) systems. To promote interoperability and broaden exposure via the semantic web, NGDC is publishing lithologic classification schemes and terminology used in the Index as Simple Knowledge Organization System (SKOS) vocabularies, coordinating with R2R and the Consortium for Ocean Leadership for consistency. Availability in SKOS form will also facilitate use of the vocabularies in International Standards Organization (ISO) 19115-2 compliant metadata records. NGDC provides stewardship for the Index on behalf of U.S. repositories as the NSF designated "appropriate National Data Center" for data and metadata pertaining to sea floor samples as specified in the 2011 Division of Ocean Sciences Sample and Data Policy, and on behalf of international partners via a collocated World Data Center. NGDC operates on the Open Archival Information System (OAIS) reference model. Active Partners: Antarctic Marine Geology Research Facility, Florida State University; British Ocean Sediment Core Research Facility; Geological Survey of Canada; Integrated Ocean Drilling Program; Lamont-Doherty Earth Observatory; National Lacustrine Core Repository, University of Minnesota; Oregon State University; Scripps Institution of Oceanography; University of Rhode Island; U.S. Geological Survey; Woods Hole Oceanographic Institution.
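
    The record mentions publishing the Index's lithologic classification schemes as SKOS vocabularies; a hedged sketch of one such concept, written with rdflib, is shown below. The URI and labels are placeholders and do not reproduce NGDC's actual published vocabulary.

    # Publish a single lithology term as a SKOS concept.
    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF, SKOS

    LITH = Namespace("http://example.org/lithology/")
    g = Graph()
    g.bind("skos", SKOS)
    g.add((LITH["scheme"], RDF.type, SKOS.ConceptScheme))
    g.add((LITH["terrigenous-sand"], RDF.type, SKOS.Concept))
    g.add((LITH["terrigenous-sand"], SKOS.inScheme, LITH["scheme"]))
    g.add((LITH["terrigenous-sand"], SKOS.prefLabel, Literal("terrigenous sand", lang="en")))
    print(g.serialize(format="turtle"))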

  15. Metadata, Identifiers, and Physical Samples

    NASA Astrophysics Data System (ADS)

    Arctur, D. K.; Lenhardt, W. C.; Hills, D. J.; Jenkyns, R.; Stroker, K. J.; Todd, N. S.; Dassie, E. P.; Bowring, J. F.

    2016-12-01

    Physical samples are integral to much of the research conducted by geoscientists. The samples used in this research are often obtained at significant cost and represent an important investment for future research. However, making information about samples - whether considered data or metadata - available for researchers to enable discovery is difficult: a number of key elements related to samples are difficult to characterize in common ways, such as classification, location, sample type, sampling method, repository information, subsample distribution, and instrumentation, because these differ from one domain to the next. Unifying these elements or developing metadata crosswalks is needed. The iSamples (Internet of Samples) NSF-funded Research Coordination Network (RCN) is investigating ways to develop these types of interoperability and crosswalks. Within the iSamples RCN, one of its working groups, WG1, has focused on the metadata related to physical samples. This includes identifying existing metadata standards and systems, and how they might interoperate with the International Geo Sample Number (IGSN) schema (schema.igsn.org) in order to help inform leading practices for metadata. For example, we are examining lifecycle metadata beyond the IGSN `birth certificate.' As a first step, this working group is developing a list of relevant standards and comparing their various attributes. In addition, the working group is looking toward technical solutions to facilitate developing a linked set of registries to build the web of samples. Finally, the group is also developing a comparison of sample identifiers and locators. This paper will provide an overview and comparison of the standards identified thus far, as well as an update on the technical solutions examined for integration. We will discuss how various sample identifiers might work in complementary fashion with the IGSN to more completely describe samples, facilitate retrieval of contextual information, and access research work on related samples. Finally, we welcome suggestions and community input to move physical sample unique identifiers forward.

  16. Exploring Cultural Heritage Resources in a 3d Collaborative Environment

    NASA Astrophysics Data System (ADS)

    Respaldiza, A.; Wachowicz, M.; Vázquez Hoehne, A.

    2012-06-01

    Cultural heritage is a complex and diverse concept, which brings together a wide domain of information. Resources linked to a cultural heritage site may consist of physical artefacts, books, works of art, pictures, historical maps, aerial photographs, archaeological surveys and 3D models. Moreover, all these resources are listed and described by a variety of metadata specifications that allow online search and consultation of their most basic characteristics. Some examples include ISO 19115, Dublin Core, AAT, CDWA, CCO, DACS, MARC, MoReq, MODS, MuseumDat, TGN, SPECTRUM, VRA Core and Z39.50. Gateways are in place to fit these metadata standards into those used in an SDI (ISO 19115 or INSPIRE), but substantial work still remains to be done for the complete incorporation of cultural heritage information. Therefore, the aim of this paper is to demonstrate how the complexity of cultural heritage resources can be dealt with by a visual exploration of their metadata within a 3D collaborative environment. 3D collaborative environments are promising tools that represent the new frontier of our capacity for learning, understanding, communicating and transmitting culture.

  17. Stop the Bleeding: the Development of a Tool to Streamline NASA Earth Science Metadata Curation Efforts

    NASA Astrophysics Data System (ADS)

    le Roux, J.; Baker, A.; Caltagirone, S.; Bugbee, K.

    2017-12-01

    The Common Metadata Repository (CMR) is a high-performance, high-quality repository for Earth science metadata records, and serves as the primary way to search NASA's growing 17.5 petabytes of Earth science data holdings. Released in 2015, CMR has the capability to support several different metadata standards already being utilized by NASA's combined network of Earth science data providers, or Distributed Active Archive Centers (DAACs). The Analysis and Review of CMR (ARC) Team located at Marshall Space Flight Center is working to improve the quality of records already in CMR with the goal of making records optimal for search and discovery. This effort entails a combination of automated and manual review, where each NASA record in CMR is checked for completeness, accuracy, and consistency. This effort is highly collaborative in nature, requiring communication and transparency of findings amongst NASA personnel, DAACs, the CMR team and other metadata curation teams. Through the evolution of this project it has become apparent that there is a need to document and report findings, as well as track metadata improvements in a more efficient manner. The ARC team has collaborated with Element 84 in order to develop a metadata curation tool to meet these needs. In this presentation, we will provide an overview of this metadata curation tool and its current capabilities. Challenges and future plans for the tool will also be discussed.
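
    For context, the sketch below shows a keyword query against CMR's public collection search endpoint. The URL pattern and the "feed"/"entry" response layout follow CMR's documented JSON interface as recalled here; verify against current CMR documentation before relying on it.

    # Keyword search against the CMR collections endpoint (response layout assumed, not guaranteed).
    import json
    import urllib.parse
    import urllib.request

    def search_collections(keyword, page_size=5):
        params = urllib.parse.urlencode({"keyword": keyword, "page_size": page_size})
        url = f"https://cmr.earthdata.nasa.gov/search/collections.json?{params}"
        with urllib.request.urlopen(url) as resp:
            entries = json.load(resp)["feed"]["entry"]
        return [entry.get("title", "") for entry in entries]

    # e.g. search_collections("sea surface temperature")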

  18. Using RDF and Git to Realize a Collaborative Metadata Repository.

    PubMed

    Stöhr, Mark R; Majeed, Raphael W; Günther, Andreas

    2018-01-01

    The German Center for Lung Research (DZL) is a research network with the aim of researching respiratory diseases. The participating study sites' register data differs in terms of software and coding system as well as data field coverage. To perform meaningful consortium-wide queries through one single interface, a uniform conceptual structure is required covering the DZL common data elements. No single existing terminology includes all our concepts. Potential candidates such as LOINC and SNOMED only cover specific subject areas or are not granular enough for our needs. To achieve a broadly accepted and complete ontology, we developed a platform for collaborative metadata management. The DZL data management group formulated detailed requirements regarding the metadata repository and the user interfaces for metadata editing. Our solution builds upon existing standard technologies allowing us to meet those requirements. Its key parts are RDF and the distributed version control system Git. We developed a software system to publish updated metadata automatically and immediately after performing validation tests for completeness and consistency.
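
    The validate-then-publish step described above can be sketched in a few lines: parse the edited Turtle file with rdflib (which raises on syntax errors) and commit it with Git only if it parses. The path, commit message and the trivial completeness check are placeholders for the DZL-specific validation rules.

    # Validate an RDF metadata file and commit it only if it parses.
    import subprocess
    from rdflib import Graph

    def validate_and_commit(ttl_path, message):
        g = Graph()
        g.parse(ttl_path, format="turtle")       # raises on malformed RDF
        if len(g) == 0:
            raise ValueError("metadata file contains no triples")   # stand-in completeness check
        subprocess.run(["git", "add", ttl_path], check=True)
        subprocess.run(["git", "commit", "-m", message], check=True)

    # validate_and_commit("metadata/common-data-elements.ttl", "Add spirometry elements")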

  19. The XML Metadata Editor of GFZ Data Services

    NASA Astrophysics Data System (ADS)

    Ulbricht, Damian; Elger, Kirsten; Tesei, Telemaco; Trippanera, Daniele

    2017-04-01

    Following the FAIR data principles, research data should be Findable, Accessible, Interoperable and Reusable. Publishing data under these principles requires assigning persistent identifiers to the data and generating rich machine-actionable metadata. To increase interoperability, metadata should include shared vocabularies and crosslink the newly published (meta)data and related material. However, structured metadata formats tend to be complex and are not intended to be generated by individual scientists. Software solutions are needed that support scientists in providing metadata describing their data. To facilitate the data publication activities of 'GFZ Data Services', we programmed an XML metadata editor that assists scientists in creating metadata in different schemata popular in the earth sciences (ISO19115, DIF, DataCite), while being at the same time usable by and understandable for scientists. Emphasis is placed on removing barriers: in particular, the editor is publicly available on the internet without registration [1] and scientists are not requested to provide information that may be generated automatically (e.g. the URL of a specific licence or the contact information of the metadata distributor). Metadata are stored in browser cookies and a copy can be saved to the local hard disk. To improve usability, form fields are translated into the scientific language, e.g. 'creators' of the DataCite schema are called 'authors'. To assist with filling in the form, we make use of drop-down menus for small vocabulary lists and offer a search facility for large thesauri. Explanations of form fields and definitions of vocabulary terms are provided in pop-up windows, and full documentation is available for download via the help menu. In addition, multiple geospatial references can be entered via an interactive mapping tool, which helps to minimize problems with different conventions for providing latitudes and longitudes. Currently, we are extending the metadata editor so that it can also generate the data discovery and contextual metadata developed by the 'Multi-scale Laboratories' Thematic Core Service of the European Plate Observing System (EPOS-IP). The editor will be used to build a common repository of a large variety of geological and geophysical datasets produced by multidisciplinary laboratories throughout Europe, thus contributing to a significant step toward the integration and accessibility of earth science data. This presentation will introduce the metadata editor and show the adjustments made for EPOS-IP. [1] http://dataservices.gfz-potsdam.de/panmetaworks/metaedit
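
    A compressed sketch of the kind of record such an editor emits is shown below, building a minimal DataCite-style XML document with ElementTree. The element names follow the commonly used DataCite kernel (identifier, creators, titles, publisher, publicationYear), but the output should be validated against the official schema; the DOI and field values are placeholders.

    # Build a minimal DataCite-style XML record.
    import xml.etree.ElementTree as ET

    def datacite_record(doi, author, title, publisher, year):
        ns = "http://datacite.org/schema/kernel-4"
        ET.register_namespace("", ns)
        resource = ET.Element(f"{{{ns}}}resource")
        ident = ET.SubElement(resource, f"{{{ns}}}identifier", identifierType="DOI")
        ident.text = doi
        creator = ET.SubElement(ET.SubElement(resource, f"{{{ns}}}creators"), f"{{{ns}}}creator")
        ET.SubElement(creator, f"{{{ns}}}creatorName").text = author
        ET.SubElement(ET.SubElement(resource, f"{{{ns}}}titles"), f"{{{ns}}}title").text = title
        ET.SubElement(resource, f"{{{ns}}}publisher").text = publisher
        ET.SubElement(resource, f"{{{ns}}}publicationYear").text = year
        return ET.tostring(resource, encoding="unicode")

    print(datacite_record("10.0000/example.2017.001", "Doe, Jane", "Example dataset", "GFZ Data Services", "2017"))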

  20. The Gemini Recipe System: A Dynamic Workflow for Automated Data Reduction

    NASA Astrophysics Data System (ADS)

    Labrie, K.; Hirst, P.; Allen, C.

    2011-07-01

    Gemini's next generation data reduction software suite aims to offer greater automation of the data reduction process without compromising the flexibility required by science programs using advanced or unusual observing strategies. The Recipe System is central to our new data reduction software. Developed in Python, it facilitates near-real time processing for data quality assessment, and both on- and off-line science quality processing. The Recipe System can be run as a standalone application or as the data processing core of an automatic pipeline. Building on concepts that originated in ORAC-DR, a data reduction process is defined in a Recipe written in a science (as opposed to computer) oriented language, and consists of a sequence of data reduction steps called Primitives. The Primitives are written in Python and can be launched from the PyRAF user interface by users wishing for more hands-on optimization of the data reduction process. The fact that the same processing Primitives can be run within both the pipeline context and interactively in a PyRAF session is an important strength of the Recipe System. The Recipe System offers dynamic flow control allowing for decisions regarding processing and calibration to be made automatically, based on the pixel and the metadata properties of the dataset at the stage in processing where the decision is being made, and the context in which the processing is being carried out. Processing history and provenance recording are provided by the AstroData middleware, which also offers header abstraction and data type recognition to facilitate the development of instrument-agnostic processing routines. All observatory or instrument specific definitions are isolated from the core of the AstroData system and distributed in external configuration packages that define a lexicon including classifications, uniform metadata elements, and transformations.
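
    The recipe-of-primitives idea, with flow control driven by the dataset's metadata, can be illustrated generically. The sketch below is a plain-Python illustration of the pattern, not Gemini's actual Recipe System API; the primitive names and the single flow-control rule are invented.

    # A recipe is an ordered list of named primitives; flow control consults dataset metadata.
    def subtract_bias(ds):
        ds.setdefault("history", []).append("bias subtracted")
        return ds

    def flat_correct(ds):
        ds.setdefault("history", []).append("flat corrected")
        return ds

    def stack_frames(ds):
        ds.setdefault("history", []).append("frames stacked")
        return ds

    PRIMITIVES = {"subtractBias": subtract_bias, "flatCorrect": flat_correct, "stackFrames": stack_frames}

    def run_recipe(recipe, dataset):
        for step in recipe:
            if step == "stackFrames" and dataset.get("n_exposures", 1) < 2:
                continue                          # dynamic flow control based on metadata
            dataset = PRIMITIVES[step](dataset)
        return dataset

    print(run_recipe(["subtractBias", "flatCorrect", "stackFrames"], {"n_exposures": 1})["history"])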

  1. Scientific Reproducibility in Biomedical Research: Provenance Metadata Ontology for Semantic Annotation of Study Description.

    PubMed

    Sahoo, Satya S; Valdez, Joshua; Rueschman, Michael

    2016-01-01

    Scientific reproducibility is key to scientific progress as it allows the research community to build on validated results, protect patients from potentially harmful trial drugs derived from incorrect results, and reduce wastage of valuable resources. The National Institutes of Health (NIH) recently published a systematic guideline titled "Rigor and Reproducibility" for supporting reproducible research studies, which has also been accepted by several scientific journals. These journals will require published articles to conform to these new guidelines. Provenance metadata describes the history or origin of data and has long been used in computer science to capture metadata information for ensuring data quality and supporting scientific reproducibility. In this paper, we describe the development of the Provenance for Clinical and healthcare Research (ProvCaRe) framework together with a provenance ontology to support scientific reproducibility by formally modeling a core set of data elements representing details of a research study. We extend the PROV Ontology (PROV-O), which has been recommended as the provenance representation model by the World Wide Web Consortium (W3C), to represent both (a) data provenance and (b) process provenance. We use 124 study variables from 6 clinical research studies from the National Sleep Research Resource (NSRR) to evaluate the coverage of the provenance ontology. NSRR is the largest repository of NIH-funded sleep datasets, with 50,000 studies from 36,000 participants. The provenance ontology reuses ontology concepts from existing biomedical ontologies, for example the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), to model the provenance information of research studies. The ProvCaRe framework is being developed as part of the Big Data to Knowledge (BD2K) data provenance project.
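
    A hedged sketch of recording data and process provenance with standard PROV-O terms, in the spirit of the ProvCaRe extension, is given below using rdflib. The study, activity and agent URIs are hypothetical; only the PROV vocabulary itself is the W3C standard.

    # Record simple data and process provenance with PROV-O terms.
    from rdflib import Graph, Namespace
    from rdflib.namespace import PROV, RDF

    EX = Namespace("http://example.org/provcare#")
    g = Graph()
    g.bind("prov", PROV)

    g.add((EX["sleep-study-variables"], RDF.type, PROV.Entity))
    g.add((EX["ahi-derivation"], RDF.type, PROV.Activity))
    g.add((EX["study-coordinator"], RDF.type, PROV.Agent))
    g.add((EX["sleep-study-variables"], PROV.wasGeneratedBy, EX["ahi-derivation"]))       # data provenance
    g.add((EX["ahi-derivation"], PROV.wasAssociatedWith, EX["study-coordinator"]))        # process provenance
    print(g.serialize(format="turtle"))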

  2. Scientific Reproducibility in Biomedical Research: Provenance Metadata Ontology for Semantic Annotation of Study Description

    PubMed Central

    Sahoo, Satya S.; Valdez, Joshua; Rueschman, Michael

    2016-01-01

    Scientific reproducibility is key to scientific progress as it allows the research community to build on validated results, protect patients from potentially harmful trial drugs derived from incorrect results, and reduce wastage of valuable resources. The National Institutes of Health (NIH) recently published a systematic guideline titled "Rigor and Reproducibility" for supporting reproducible research studies, which has also been accepted by several scientific journals. These journals will require published articles to conform to these new guidelines. Provenance metadata describes the history or origin of data and has long been used in computer science to capture metadata information for ensuring data quality and supporting scientific reproducibility. In this paper, we describe the development of the Provenance for Clinical and healthcare Research (ProvCaRe) framework together with a provenance ontology to support scientific reproducibility by formally modeling a core set of data elements representing details of a research study. We extend the PROV Ontology (PROV-O), which has been recommended as the provenance representation model by the World Wide Web Consortium (W3C), to represent both (a) data provenance and (b) process provenance. We use 124 study variables from 6 clinical research studies from the National Sleep Research Resource (NSRR) to evaluate the coverage of the provenance ontology. NSRR is the largest repository of NIH-funded sleep datasets, with 50,000 studies from 36,000 participants. The provenance ontology reuses ontology concepts from existing biomedical ontologies, for example the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), to model the provenance information of research studies. The ProvCaRe framework is being developed as part of the Big Data to Knowledge (BD2K) data provenance project. PMID:28269904

  3. SIPSMetGen: It's Not Just For Aircraft Data and ECS Anymore.

    NASA Astrophysics Data System (ADS)

    Schwab, M.

    2015-12-01

    The SIPSMetGen utility, developed for the NASA EOSDIS project under the EED contract, simplified the creation of file-level metadata for the ECS system. The utility has been enhanced for ease of use, efficiency, speed and increased flexibility. SIPSMetGen was originally created as a means of generating file-level spatial metadata for Operation IceBridge. The first version created only ODL metadata, specific to ingest into ECS. The core strength of the utility was, and continues to be, its ability to take complex shapes and patterns of data collection point clouds from aircraft flights and simplify them to a relatively simple concave-hull geo-polygon. It has been found to be a useful and easy-to-use tool for creating file-level metadata for many other missions, both aircraft and satellite. While the original version was useful, it had its limitations. In 2014 Raytheon was tasked with making enhancements to SIPSMetGen. This resulted in a new version of SIPSMetGen which can create ISO-compliant XML metadata, provides an optimized and streamlined algorithm for creating the spatial metadata, delivers a quicker runtime with more consistent results, and can be configured to run multi-threaded on systems with multiple processors. The utility comes with a Java-based graphical user interface to aid in configuration and running of the utility. The enhanced SIPSMetGen allows more diverse data sets to be archived with file-level metadata. The advantage of archiving data with file-level metadata is that it makes it easier for data users and scientists to find relevant data. File-level metadata unlocks the power of existing archives and metadata repositories such as ECS and CMR and search and discovery utilities like Reverb and Earth Data Search. Current missions now using SIPSMetGen include Aquarius, Measures, ARISE, and Nimbus.
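
    The footprint-simplification step described above can be sketched as follows. SIPSMetGen computes a concave hull; for brevity this sketch substitutes the much simpler convex hull from scipy, which over-covers the true flight pattern, and the example coordinates are synthetic.

    # Derive a (convex, simplified) spatial footprint polygon from a flight's point cloud.
    import numpy as np
    from scipy.spatial import ConvexHull

    def footprint(lons, lats):
        points = np.column_stack([lons, lats])
        hull = ConvexHull(points)
        ring = points[hull.vertices].tolist()
        ring.append(ring[0])                     # close the polygon
        return [tuple(p) for p in ring]

    rng = np.random.default_rng(0)
    lons = np.linspace(-50.0, -45.0, 500) + rng.normal(0, 0.01, 500)
    lats = np.full(500, 69.0) + rng.normal(0, 0.02, 500)
    print(footprint(lons, lats)[:3])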

  4. Building a high level sample processing and quality assessment model for biogeochemical measurements: a case study from the ocean acidification community

    NASA Astrophysics Data System (ADS)

    Thomas, R.; Connell, D.; Spears, T.; Leadbetter, A.; Burger, E. F.

    2016-12-01

    The scientific literature heavily features small-scale studies with the impact of the results extrapolated to regional/global importance. There are ongoing initiatives (e.g. OA-ICC, GOA-ON, GEOTRACES, EMODNet Chemistry) aiming to assemble regional- to global-scale datasets that are available for trend or meta-analyses. Assessing the quality and comparability of these data requires information about the processing chain from "sampling to spreadsheet". This provenance information needs to be captured and readily available to assess data fitness for purpose. The NOAA Ocean Acidification metadata template was designed in consultation with domain experts for this reason; the core carbonate chemistry variables have 23-37 metadata fields each, and to scientists generating these datasets there can appear to be an ever-increasing amount of metadata expected to accompany a dataset. While this provenance metadata should be considered essential by those generating or using the data, for those discovering data there is a sliding scale between what is considered discovery metadata (title, abstract, contacts, etc.) and usage metadata (methodology, environmental setup, lineage, etc.), with the split depending on the intended use of the data. As part of the OA-ICC's activities, the metadata fields from the NOAA template relevant to the sample processing chain and QA criteria have been factored to develop profiles for, and extensions to, the OM-JSON encoding supported by the PROV ontology. While this work started focused on carbonate-chemistry variable-specific metadata, the factorization could be applied within the O&M model across other disciplines such as trace metals or contaminants. In a linked data world with a suitable high-level model for sample processing and QA available, tools and support can be provided to link reproducible units of metadata (e.g. the standard protocol for a variable as adopted by a community) and simplify the provision of metadata and subsequent discovery.
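
    To make the "sampling to spreadsheet" idea concrete, the fragment below bundles usage (provenance) metadata alongside a carbonate-chemistry variable as JSON. The field names are invented for illustration and do not reproduce the NOAA Ocean Acidification template or the OM-JSON profile discussed in the record.

    # Illustrative provenance bundle for a single carbonate-chemistry variable.
    import json

    dic_record = {
        "variable": "dissolved_inorganic_carbon",
        "discovery": {"title": "Coastal OA survey", "abstract": "DIC from Niskin casts"},
        "usage": {
            "method": "coulometric titration",
            "sample_storage": "poisoned with HgCl2, stored at 4 C",
            "quality_flag_scheme": "WOCE",
            "uncertainty_umol_kg": 2.0,
        },
    }
    print(json.dumps(dic_record, indent=2))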

  5. The Planetary Data System (PDS) Data Dictionary Tool (LDDTool)

    NASA Astrophysics Data System (ADS)

    Raugh, Anne C.; Hughes, John S.

    2017-10-01

    One of the major design goals of the PDS4 development effort was to provide an avenue for discipline specialists and large data preparers such as mission archivists to extend the core PDS4 Information Model (IM) to include metadata definitions specific to their own contexts. This capability is critical for the Planetary Data System - an archive that deals with a data collection that is diverse along virtually every conceivable axis. Amid such diversity, it is in the best interests of the PDS archive and its users that all extensions to the core IM follow the same design techniques, conventions, and restrictions as the core implementation itself. Notwithstanding, expecting every mission and discipline archivist seeking to define metadata for a new context to acquire expertise in information modeling, model-driven design, ontology, schema formulation, and PDS4 design conventions and philosophy is unrealistic, to say the least. To bridge that expertise gap, the PDS Engineering Node has developed the data dictionary creation tool known as "LDDTool". This tool incorporates the same software used to maintain and extend the core IM, packaged with an interface that enables a developer to create a contextual information model using the same open, standards-based metadata framework PDS itself uses. Through this interface, the novice dictionary developer has immediate access to the common set of data types and unit classes for defining attributes, and a straightforward method for constructing classes. The more experienced developer, using the same tool, has access to more sophisticated modeling methods like abstraction and extension, and can define very sophisticated validation rules. We present the key features of the PDS Local Data Dictionary Tool, which both supports the development of extensions to the PDS4 IM and ensures their compatibility with the IM.
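
    For orientation only, the sketch below models the kind of attribute definition a local data dictionary captures and serializes it to XML. The element names are generic placeholders and are emphatically not the actual PDS4 Ingest_LDD schema that LDDTool consumes; the sketch only conveys the shape of the exercise.

    # Generic (non-PDS4) illustration of an attribute definition serialized to XML.
    from dataclasses import dataclass
    import xml.etree.ElementTree as ET

    @dataclass
    class AttributeDefinition:
        name: str
        data_type: str
        unit_class: str
        description: str

    def to_xml(attr):
        root = ET.Element("Local_Attribute")      # placeholder element names throughout
        for tag, value in [("name", attr.name), ("data_type", attr.data_type),
                           ("unit_class", attr.unit_class), ("description", attr.description)]:
            ET.SubElement(root, tag).text = value
        return ET.tostring(root, encoding="unicode")

    print(to_xml(AttributeDefinition("exposure_duration", "ASCII_Real", "Units_of_Time",
                                     "Duration of the detector exposure.")))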

  6. Principles of metadata organization at the ENCODE data coordination center.

    PubMed

    Hong, Eurie L; Sloan, Cricket A; Chan, Esther T; Davidson, Jean M; Malladi, Venkat S; Strattan, J Seth; Hitz, Benjamin C; Gabdank, Idan; Narayanan, Aditi K; Ho, Marcus; Lee, Brian T; Rowe, Laurence D; Dreszer, Timothy R; Roe, Greg R; Podduturi, Nikhil R; Tanaka, Forrest; Hilton, Jason A; Cherry, J Michael

    2016-01-01

    The Encyclopedia of DNA Elements (ENCODE) Data Coordinating Center (DCC) is responsible for organizing, describing and providing access to the diverse data generated by the ENCODE project. The description of these data, known as metadata, includes the biological sample used as input, the protocols and assays performed on these samples, the data files generated from the results and the computational methods used to analyze the data. Here, we outline the principles and philosophy used to define the ENCODE metadata in order to create a metadata standard that can be applied to diverse assays and multiple genomic projects. In addition, we present how the data are validated and used by the ENCODE DCC in creating the ENCODE Portal (https://www.encodeproject.org/). Database URL: www.encodeproject.org. © The Author(s) 2016. Published by Oxford University Press.

  7. EPOS Data and Service Provision

    NASA Astrophysics Data System (ADS)

    Bailo, Daniele; Jeffery, Keith G.; Atakan, Kuvvet; Harrison, Matt

    2017-04-01

    EPOS is now in its implementation phase (IP) after a successful preparatory phase (PP). EPOS consists of essentially two components: the ICS (Integrated Core Services), representing the integrating ICT (Information and Communication Technology), and many TCS (Thematic Core Services), representing the scientific domains. The architecture developed, demonstrated and agreed within the project during the PP is now being implemented using co-design with the TCS teams and agile, spiral methods within the ICS team. The 'heart' of EPOS is the metadata catalog. This provides for the ICS a digital representation of the TCS assets (services, data, software, equipment, expertise…), thus facilitating access, interoperation and (re-)use. A major part of the work has been interactions with the TCS. The original intention to harvest information from the TCS required (and still requires) discussions to understand fully the TCS organisational structures linked with rights, security and privacy; their (meta)data syntax (structure) and semantics (meaning); their workflows and methods of working; and the services offered. To complicate matters further, the TCS are each at varying stages of development, and the ICS design has to accommodate pre-existing, developing and expected future standards for metadata, data, software and processes. Through information documents, questionnaires and interviews/meetings the EPOS ICS team has collected DDSS (Data, Data Products, Software and Services) information from the TCS. The ICS team developed a simplified metadata model for presentation to the TCS, and the ICS team will perform the mapping and conversion from this model to the internal detailed technical metadata model using CERIF (an EU recommendation to Member States maintained, developed and promoted by euroCRIS, www.eurocris.org). At the time of writing, the final modifications of the EPOS metadata model are being made and the mappings to CERIF designed, prior to the main phase of (meta)data collection into the EPOS metadata catalog. In parallel, work proceeds on the user interface software, the APIs (Application Programming Interfaces) to the TCS services, the harvesting method and software, the AAAI (Authentication, Authorisation, Accounting Infrastructure) and the system manager. The next steps will involve interfaces to ICS-D (Distributed ICS, i.e. facilities and services for computing, data storage, detectors and instruments for data collection, etc.) to which requests, software and data will be deployed and from which data will be generated. Associated with this will be the development of the workflow system, which will assist the end user in building a workflow to achieve the scientific objectives.

  8. Pragmatic Metadata Management for Integration into Multiple Spatial Data Infrastructure Systems and Platforms

    NASA Astrophysics Data System (ADS)

    Benedict, K. K.; Scott, S.

    2013-12-01

    While there has been a convergence towards a limited number of standards for representing knowledge (metadata) about geospatial (and other) data objects and collections, there exist a variety of community conventions around the specific use of those standards and within specific data discovery and access systems. This combination of limited (but multiple) standards and conventions creates a challenge for system developers that aspire to participate in multiple data infrastructures, each of which may use a different combination of standards and conventions. While Extensible Markup Language (XML) is a shared standard for encoding most metadata, traditional direct XML transformations (XSLT) from one standard to another often result in an imperfect transfer of information due to incomplete mapping from one standard's content model to another. This paper presents the work at the University of New Mexico's Earth Data Analysis Center (EDAC) in which a unified data and metadata management system has been developed in support of the storage, discovery and access of heterogeneous data products. This system, the Geographic Storage, Transformation and Retrieval Engine (GSTORE) platform, has adopted a polyglot database model in which a combination of relational and document-based databases are used to store both data and metadata, with some metadata stored in a custom XML schema designed as a superset of the requirements for multiple target metadata standards: ISO 19115-2/19139/19110/19119, FGDC CSDGM (both with and without remote sensing extensions) and Dublin Core. Metadata stored within this schema is complemented by additional service, format and publisher information that is dynamically "injected" into produced metadata documents when they are requested from the system. While mapping from the underlying common metadata schema is relatively straightforward, the generation of valid metadata within each target standard is necessary but not sufficient for integration into multiple data infrastructures, as has been demonstrated through EDAC's testing and deployment of metadata into multiple external systems: Data.Gov, the GEOSS Registry, the DataONE network, the DSpace-based institutional repository at UNM and semantic mediation systems developed as part of the NASA ACCESS ELSeWEB project. Each of these systems requires valid metadata as a first step, but to make most effective use of the delivered metadata each also has a set of conventions that are specific to the system. This presentation will provide an overview of the underlying metadata management model, the processes and web services that have been developed to automatically generate metadata in a variety of standard formats and highlight some of the specific modifications made to the output metadata content to support the different conventions used by the multiple metadata integration endpoints.
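
    To make the "dynamic injection" idea above concrete, the sketch below (not EDAC's actual code) maps a common internal record onto Dublin Core-style elements and adds service, format and publisher details only at document-generation time; all field names and values are hypothetical.

    ```python
    # Minimal sketch of the "inject at request time" pattern: a common internal
    # record is mapped to a flat Dublin Core-like dict, and distribution details
    # are added only when the document is generated. Field names are hypothetical.

    from datetime import date

    INTERNAL_RECORD = {
        "title": "Example landcover mosaic",
        "abstract": "Hypothetical dataset used to illustrate the mapping.",
        "keywords": ["landcover", "New Mexico"],
        "bbox": (-109.05, 31.33, -103.00, 37.00),   # W, S, E, N
    }

    SERVICE_INFO = {   # dynamically "injected" distribution metadata
        "publisher": "Example Data Center",
        "access_url": "https://example.org/ogc/wms",
        "format": "GeoTIFF",
    }

    def to_dublin_core(record: dict, services: dict) -> dict:
        """Map the internal superset record onto Dublin Core-style elements."""
        return {
            "dc:title": record["title"],
            "dc:description": record["abstract"],
            "dc:subject": "; ".join(record["keywords"]),
            "dc:coverage": "box: {} {} {} {}".format(*record["bbox"]),
            "dc:publisher": services["publisher"],
            "dc:format": services["format"],
            "dc:identifier": services["access_url"],
            "dc:date": date.today().isoformat(),     # stamped at generation time
        }

    if __name__ == "__main__":
        for element, value in to_dublin_core(INTERNAL_RECORD, SERVICE_INFO).items():
            print(f"{element}: {value}")
    ```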

  9. Sediment data collected in 2013 from the northern Chandeleur Islands, Louisiana

    USGS Publications Warehouse

    Buster, Noreen A.; Kelso, Kyle W.; Bernier, Julie C.; Flocks, James G.; Miselis, Jennifer L.; DeWitt, Nancy T.

    2014-01-01

    This data series serves as an archive of sediment data collected in July 2013 from the Chandeleur Islands sand berm and adjacent barrier-island environments. Data products include descriptive core logs, core photographs and x-radiographs, results of sediment grain-size analyses, sample location maps, and Geographic Information System data files with accompanying formal Federal Geographic Data Committee metadata.

  10. A Transparently-Scalable Metadata Service for the Ursa Minor Storage System

    DTIC Science & Technology

    2010-06-25

    provide application-level guarantees. For example, many document editing programs implement atomic updates by writing the new document version into a... operations that could involve multiple servers, how close existing systems come to transparent scalability, how systems that handle multi-server...

  11. Innovations and Lessons Learned Developing the USDA Long-Term Agroecosystem Research Network Common Observatory Data Repository

    NASA Astrophysics Data System (ADS)

    Campbell, J. D.; Heilman, P.; Goodrich, D. C.; Sadler, J.

    2015-12-01

    The objective for the USDA Long-Term Agroecosystem Research (LTAR) network Common Observatory Repository (CORe) is to provide data management services including archive, discovery, and access for consistently observed data across all 18 nodes. LTAR members have an average of 56 years of diverse historic data. Each LTAR has designated a representative 'permanent' site as the location's common meteorological observatory. CORe implementation is phased, starting with meteorology, then adding hydrology, eddy flux, soil, and biology data. A design goal was to adopt existing best practices while minimizing the additional data management duties for the researchers. LTAR is providing support for data management specialists at the locations and the National Agricultural Library is providing central data management services. Maintaining continuity with historical observations is essential, so observations from both the legacy and new common methods are included in CORe. International standards are used to store robust descriptive metadata (ISO 19115) for the observation station and surrounding locale (WMO), sensors (SensorML), and activity (e.g., re-calibration, locale changes) to provide sufficient detail for novel data re-use for the next 50 years. To facilitate data submission, a simple text format was designed. Datasets in CORe will receive DOIs to encourage citations giving fair credit to data providers. Data and metadata access are designed to support multiple formats and naming conventions. An automated QC process is being developed to enhance comparability among LTAR locations and to generate QC process metadata. Data provenance is maintained with a permanent record of changes including those by local scientists reviewing the automated QC results. Lessons learned so far include an increase in site acceptance of CORe following the decision to store data from both legacy and new common methods. A larger than anticipated variety of currently used methods with potentially significant differences for future data use was found. Cooperative peer support among locations with the same sensors coupled with central support has reduced redundancy in procedural and data documentation.
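
    As a rough illustration of the automated QC and QC-process metadata described above, the following sketch applies a simple range check and records provenance for the step; the variables, limits and flag scheme are assumptions, not the CORe implementation.

    ```python
    # Toy range-check QC producing per-value flags plus a small QC-process
    # metadata record; thresholds and names are illustrative only.

    from datetime import datetime, timezone

    RANGE_LIMITS = {"air_temp_c": (-60.0, 60.0), "precip_mm": (0.0, 500.0)}

    def range_check(variable: str, values: list[float]) -> dict:
        lo, hi = RANGE_LIMITS[variable]
        flags = ["ok" if lo <= v <= hi else "suspect" for v in values]
        return {
            "variable": variable,
            "flags": flags,
            "qc_metadata": {                      # provenance of the QC step
                "test": "range_check",
                "limits": [lo, hi],
                "run_at": datetime.now(timezone.utc).isoformat(),
                "n_flagged": flags.count("suspect"),
            },
        }

    if __name__ == "__main__":
        print(range_check("air_temp_c", [21.4, 22.0, 999.0, 19.8]))
    ```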

  12. Publishing NASA Metadata as Linked Open Data for Semantic Mashups

    NASA Astrophysics Data System (ADS)

    Wilson, Brian; Manipon, Gerald; Hua, Hook

    2014-05-01

    Data providers are now publishing more metadata in more interoperable forms, e.g. Atom or RSS 'casts', as Linked Open Data (LOD), or as ISO Metadata records. A major effort on the part of NASA's Earth Science Data and Information System (ESDIS) project is the aggregation of metadata that enables greater data interoperability among scientific data sets regardless of source or application. Both the Earth Observing System (EOS) ClearingHOuse (ECHO) and the Global Change Master Directory (GCMD) repositories contain metadata records for NASA (and other) datasets and provided services. These records contain typical fields for each dataset (or software service) such as the source, creation date, cognizant institution, related access URLs, and domain and variable keywords to enable discovery. Under a NASA ACCESS grant, we demonstrated how to publish the ECHO and GCMD dataset and services metadata as LOD in the RDF format. Both sets of metadata are now queryable at SPARQL endpoints and available for integration into "semantic mashups" in the browser. It is straightforward to reformat sets of XML metadata, including ISO, into simple RDF and then later refine and improve the RDF predicates by reusing known namespaces such as Dublin Core, georss, etc. All scientific metadata should be part of the LOD world. In addition, we developed an "instant" drill-down and browse interface that provides faceted navigation so that the user can discover and explore the 25,000 datasets and 3000 services. The available facets and the free-text search box appear in the left panel, and the instantly updated results for the dataset search appear in the right panel. The user can constrain the value of a metadata facet simply by clicking on a word (or phrase) in the "word cloud" of values for each facet. The display section for each dataset includes the important metadata fields, a full description of the dataset, potentially some related URLs, and a "search" button that points to an OpenSearch GUI that is pre-configured to search for granules within the dataset. We will present our experiences with converting NASA metadata into LOD, discuss the challenges, illustrate some of the enabled mashups, and demonstrate the latest version of the "instant browse" interface for navigating multiple metadata collections.
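
    The following sketch illustrates the dataset-metadata-as-LOD idea with the rdflib library: a few invented records are expressed using Dublin Core terms and queried with SPARQL. The URIs and values are placeholders, not actual ECHO or GCMD records.

    ```python
    # Express two hypothetical dataset records as RDF with Dublin Core terms and
    # run a SPARQL query over them; the same query could be sent to an endpoint.

    from rdflib import Graph, Literal, URIRef
    from rdflib.namespace import DCTERMS

    g = Graph()
    for ds_id, title, keyword in [
        ("ds1", "Example sea surface temperature product", "ocean temperature"),
        ("ds2", "Example aerosol optical depth product", "aerosols"),
    ]:
        ds = URIRef(f"https://example.org/dataset/{ds_id}")
        g.add((ds, DCTERMS.title, Literal(title)))
        g.add((ds, DCTERMS.subject, Literal(keyword)))

    results = g.query("""
        PREFIX dct: <http://purl.org/dc/terms/>
        SELECT ?ds ?title WHERE {
            ?ds dct:subject "aerosols" ;
                dct:title ?title .
        }
    """)
    for row in results:
        print(row.ds, row.title)
    ```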

  13. USGIN ISO metadata profile

    NASA Astrophysics Data System (ADS)

    Richard, S. M.

    2011-12-01

    The USGIN project has drafted and is using a specification for use of ISO 19115/19/39 metadata, recommendations for simple metadata content, and a proposal for a URI scheme to identify resources using resolvable http URIs (see http://lab.usgin.org/usgin-profiles). The principal target use case is a catalog in which resources can be registered and described by data providers for discovery by users. We are currently using the ESRI Geoportal (Open Source), with configuration files for the USGIN profile. The metadata offered by the catalog must provide sufficient content to guide search engines to locate requested resources, to describe the resource content, provenance, and quality so users can determine if the resource will serve for intended usage, and finally to enable human users and software clients to obtain or access the resource. In order to achieve an operational federated catalog system, provisions in the ISO specification must be restricted and usage clarified to reduce the heterogeneity of 'standard' metadata and service implementations such that a single client can search against different catalogs, and the metadata returned by catalogs can be parsed reliably to locate required information. Usage of the complex ISO 19139 XML schema allows for a great deal of structured metadata content, but the heterogeneity in approaches to content encoding has hampered development of sophisticated client software that can take advantage of the rich metadata; the lack of such clients in turn reduces motivation for metadata producers to produce content-rich metadata. If the only significant use of the detailed, structured metadata is to format into text for people to read, then the detailed information could be put in free text elements and be just as useful. In order for complex metadata encoding and content to be useful, there must be clear and unambiguous conventions on the encoding that are utilized by the community that wishes to take advantage of advanced metadata content. The use cases for the detailed content must be well understood, and the degree of metadata complexity should be determined by requirements for those use cases. The ISO standard provides sufficient flexibility that relatively simple metadata records can be created that will serve for text-indexed search/discovery, resource evaluation by a user reading text content from the metadata, and access to the resource via http, ftp, or well-known service protocols (e.g. Thredds; OGC WMS, WFS, WCS).

  14. OlyMPUS - The Ontology-based Metadata Portal for Unified Semantics

    NASA Astrophysics Data System (ADS)

    Huffer, E.; Gleason, J. L.

    2015-12-01

    The Ontology-based Metadata Portal for Unified Semantics (OlyMPUS), funded by the NASA Earth Science Technology Office Advanced Information Systems Technology program, is an end-to-end system designed to support data consumers and data providers, enabling the latter to register their data sets and provision them with the semantically rich metadata that drives the Ontology-Driven Interactive Search Environment for Earth Sciences (ODISEES). OlyMPUS leverages the semantics and reasoning capabilities of ODISEES to provide data producers with a semi-automated interface for producing the semantically rich metadata needed to support ODISEES' data discovery and access services. It integrates the ODISEES metadata search system with multiple NASA data delivery tools to enable data consumers to create customized data sets for download to their computers, or for registered users of the NASA Advanced Supercomputing (NAS) facility, directly to NAS storage resources for access by applications running on NAS supercomputers. A core function of NASA's Earth Science Division is research and analysis that uses the full spectrum of data products available in NASA archives. Scientists need to perform complex analyses that identify correlations and non-obvious relationships across all types of Earth System phenomena. Comprehensive analytics are hindered, however, by the fact that many Earth science data products are disparate and hard to synthesize. Variations in how data are collected, processed, gridded, and stored create challenges for data interoperability and synthesis, which are exacerbated by the sheer volume of available data. Robust, semantically rich metadata can support tools for data discovery and facilitate machine-to-machine transactions with services such as data subsetting, regridding, and reformatting. Such capabilities are critical to enabling the research activities integral to NASA's strategic plans. However, as metadata requirements increase and competing standards emerge, metadata provisioning becomes increasingly burdensome to data producers. The OlyMPUS system helps data providers produce semantically rich metadata, making their data more accessible to data consumers, and helps data consumers quickly discover and download the right data for their research.

  15. [caCORE: core architecture of bioinformation on cancer research in America].

    PubMed

    Gao, Qin; Zhang, Yan-lei; Xie, Zhi-yun; Zhang, Qi-peng; Hu, Zhang-zhi

    2006-04-18

    A critical factor in the advancement of biomedical research is the ease with which data can be integrated, redistributed and analyzed both within and across domains. This paper summarizes the Biomedical Information Core Infrastructure built by the US National Cancer Institute Center for Bioinformatics (NCICB). The main product from the Core Infrastructure is caCORE--cancer Common Ontologic Reference Environment, which is the infrastructure backbone supporting data management and application development at NCICB. The paper explains the structure and function of caCORE: (1) Enterprise Vocabulary Services (EVS). They provide controlled vocabulary, dictionary and thesaurus services, and EVS produces the NCI Thesaurus and the NCI Metathesaurus; (2) The Cancer Data Standards Repository (caDSR). It provides a metadata registry for common data elements. (3) Cancer Bioinformatics Infrastructure Objects (caBIO). They provide Java, Simple Object Access Protocol and HTTP-XML application programming interfaces. The vision for caCORE is to provide a common data management framework that will support the consistency, clarity, and comparability of biomedical research data and information. In addition to providing facilities for data management and redistribution, caCORE helps solve problems of data integration. All NCICB-developed caCORE components are distributed under open-source licenses that support unrestricted usage by both non-profit and commercial entities, and caCORE has laid the foundation for a number of scientific and clinical applications. Building on this, the paper briefly describes caCORE-based applications in several NCI projects, including CMAP (Cancer Molecular Analysis Project) and caBIG (Cancer Biomedical Informatics Grid). Finally, the paper discusses the prospects of caCORE: while caCORE was born out of the needs of the cancer research community, it is intended to serve as a general resource, and cancer research has historically contributed to many areas beyond tumor biology. The paper also makes some suggestions about current research on biomedical informatics in China.

  16. Stability assessment of structures under earthquake hazard through GRID technology

    NASA Astrophysics Data System (ADS)

    Prieto Castrillo, F.; Boton Fernandez, M.

    2009-04-01

    This work presents a GRID framework to estimate the vulnerability of structures under earthquake hazard. The tool has been designed to cover the needs of a typical earthquake engineering stability analysis: preparation of input data (pre-processing), response computation and stability analysis (post-processing). In order to validate the application over GRID, a simplified model of a structure under artificially generated earthquake records has been implemented. To achieve this goal, the proposed scheme exploits the GRID technology and its main advantages (parallel intensive computing, huge storage capacity and collaborative analysis among institutions) through intensive interaction among the GRID elements (Computing Element, Storage Element, LHC File Catalogue, federated database, etc.). The dynamical model is described by a set of ordinary differential equations (ODEs) and by a set of parameters. Both elements, along with the integration engine, are encapsulated into Java classes. With this high-level design, subsequent improvements/changes of the model can be addressed with little effort. In the procedure, an earthquake record database is prepared and stored (pre-processing) in the GRID Storage Element (SE). The Metadata of these records is also stored in the GRID federated database. This Metadata contains both relevant information about the earthquake (as is usual in a seismic repository) and also the Logical File Name (LFN) of the record for its later retrieval. Then, from the available set of accelerograms in the SE, the user can specify a range of earthquake parameters to carry out a dynamic analysis. This way, a GRID job is created for each selected accelerogram in the database. At the GRID Computing Element (CE), displacements are then obtained by numerical integration of the ODEs over time. The resulting response for that configuration is stored in the GRID Storage Element (SE) and the maximum structure displacement is computed. Then, the corresponding Metadata containing the response LFN, earthquake magnitude and maximum structure displacement is also stored. Finally, the displacements are post-processed through a statistically-based algorithm from the available Metadata to obtain the probability of collapse of the structure for different earthquake magnitudes. From this study, it is possible to build a vulnerability report for the structure type and seismic data. The proposed methodology can be combined with the on-going initiatives to build a European earthquake record database. In this context, Grid enables collaborative analysis over shared seismic data and results among different institutions.
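
    The per-accelerogram computation described above can be pictured with a minimal single-degree-of-freedom sketch: the equation of motion is integrated numerically and the maximum displacement is extracted for storage as response metadata. The model, parameters and synthetic accelerogram below are illustrative assumptions, not the project's actual Java implementation.

    ```python
    # Simplified single-degree-of-freedom structure under a synthetic ground
    # acceleration, written as a first-order ODE system and integrated numerically.

    import numpy as np
    from scipy.integrate import solve_ivp

    m, k, zeta = 1.0e4, 4.0e6, 0.05         # mass [kg], stiffness [N/m], damping ratio
    wn = np.sqrt(k / m)
    c = 2.0 * zeta * wn * m

    def ground_accel(t):                    # synthetic "accelerogram" [m/s^2]
        return 2.0 * np.sin(2.0 * np.pi * 1.5 * t) * np.exp(-0.2 * t)

    def rhs(t, y):                          # y = [relative displacement, velocity]
        x, v = y
        return [v, -(c * v + k * x) / m - ground_accel(t)]

    sol = solve_ivp(rhs, (0.0, 20.0), [0.0, 0.0], max_step=0.01)
    max_disp = float(np.max(np.abs(sol.y[0])))
    print(f"maximum relative displacement: {max_disp:.4f} m")   # stored as response metadata
    ```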

  17. Oceans 2.0: a Data Management Infrastructure as a Platform

    NASA Astrophysics Data System (ADS)

    Pirenne, B.; Guillemot, E.

    2012-04-01

    The Data Management and Archiving System (DMAS) serving the needs of a number of undersea observing networks such as VENUS and NEPTUNE Canada was conceived from the beginning as a Service-Oriented Infrastructure. Its core functional elements (data acquisition, transport, archiving, retrieval and processing) can interact with the outside world using Web Services. Those Web Services can be exploited by a variety of higher level applications. Over the years, DMAS has developed Oceans 2.0: an environment where these techniques are implemented. The environment thereby becomes a platform in that it allows for easy addition of new and advanced features that build upon the tools at the core of the system. The applications that have been developed include: data search and retrieval, with options such as data product generation, data decimation or averaging; dynamic infrastructure description (search all observatory metadata) and visualization; and data visualization, including dynamic scalar data plots and integrated fast video segment search and viewing. Building upon these basic applications are new concepts, coming from the Web 2.0 world, that DMAS has added: they allow people equipped only with a web browser to collaborate and contribute their findings or work results to the wider community. Examples include: the addition of metadata tags to any part of the infrastructure or to any data item (annotations); the ability to edit, execute, share and distribute Matlab code on-line from a simple web browser, with specific calls within the code to access data; the ability to interactively and graphically build pipeline processing jobs that can be executed on the cloud; web-based, interactive instrument control tools that allow users to truly share the use of the instruments and communicate with each other; and, last but not least, a public tool in the form of a game that crowd-sources the inventory of the underwater video archive content, thereby adding tremendous amounts of metadata. Beyond those tools that represent the functionality presently available to users, a number of the Web Services dedicated to data access are being exposed for anyone to use. This allows not only for ad hoc data access by individuals who need non-interactive access, but will foster the development of new applications in a variety of areas.

  18. The EPOS e-Infrastructure

    NASA Astrophysics Data System (ADS)

    Jeffery, Keith; Bailo, Daniele

    2014-05-01

    The European Plate Observing System (EPOS) is integrating geoscientific information concerning earth movements in Europe. We are approaching the end of the PP (Preparatory Project) phase and in October 2014 expect to continue with the full project within ESFRI (European Strategic Framework for Research Infrastructures). The key aspects of EPOS concern providing services to allow homogeneous access by end-users over heterogeneous data, software, facilities, equipment and services. The e-infrastructure of EPOS is the heart of the project since it integrates the work on organisational, legal, economic and scientific aspects. Following the creation of an inventory of relevant organisations, persons, facilities, equipment, services, datasets and software (RIDE) the scale of integration required became apparent. The EPOS e-infrastructure architecture has been developed systematically based on recorded primary (user) requirements and secondary (interoperation with other systems) requirements through Strawman, Woodman and Ironman phases with the specification - and developed confirmatory prototypes - becoming more precise and progressively moving from paper to implemented system. The EPOS architecture is based on global core services (Integrated Core Services - ICS) which access thematic nodes (domain-specific European-wide collections, called Thematic Core Services - TCS), national nodes and specific institutional nodes. The key aspect is the metadata catalog. In one dimension this is described in 3 levels: (1) discovery metadata using well-known and commonly used standards such as DC (Dublin Core) to enable users (via an intelligent user interface) to search for objects within the EPOS environment relevant to their needs; (2) contextual metadata providing the context of the object described in the catalog to enable a user or the system to determine the relevance of the discovered object(s) to their requirement - the context includes projects, funding, organisations involved, persons involved, related publications, facilities, equipment and others, and utilises the CERIF (Common European Research Information Format) standard (see www.eurocris.org); (3) detailed metadata which is specific to a domain or to a particular object and includes the schema describing the object to processing software. The other dimension of the metadata concerns the objects described. These are classified into users, services (including software), data and resources (computing, data storage, instruments and scientific equipment). An alternative architecture has been considered: using brokering. This technique has been used especially in North American geoscience projects to interoperate datasets. The technique involves writing software to interconvert between any two node datasets. Given n nodes, this implies writing n*(n-1) convertors. EPOS Working Group 7 (e-infrastructures and virtual community), which deals with the design and implementation of a prototype of the EPOS services, chose to use an approach which endows the system with extreme flexibility and sustainability. It is called the Metadata Catalogue approach. With the use of the catalogue the EPOS system can: 1. interoperate with software, services, users, organisations, facilities, equipment etc. as well as datasets; 2. avoid writing n*(n-1) software convertors and instead generate, as much as possible from the information contained in the catalogue, only n convertors. This is a huge saving - especially in maintenance as the datasets (or other node resources) evolve. We are working on (semi-) automation of convertor generation by metadata mapping - this is leading-edge computer science research; 3. make extensive use of contextual metadata which enable a user or a machine to: (i) improve discovery of resources at nodes; (ii) improve precision and recall in search; (iii) drive the systems for identification, authentication, authorisation, security and privacy, recording the relevant attributes of the node resources and of the user; (iv) manage provenance and long-term digital preservation. The linkage between the Integrated Services, which provide the integration of data and services, and the diverse Thematic Services Nodes is provided by means of a compatibility layer, which includes the aforementioned metadata catalogue. This layer provides 'connectors' to make local data, software and services available through the EPOS Integrated Services layer. In conclusion, we believe the EPOS e-infrastructure architecture is fit for purpose including long-term sustainability and pan-European access to data and services.
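
    The scaling argument behind the catalogue approach is easy to quantify; the small script below simply tabulates pairwise convertors versus catalogue-mediated convertors for a few example node counts (the counts themselves are arbitrary illustrations).

    ```python
    # With pairwise brokering every node needs a convertor for every other node;
    # with a catalogue-mediated hub each node needs only one convertor to/from
    # the common model.

    for n in (5, 10, 20, 50):
        pairwise = n * (n - 1)   # directed convertors between every pair of nodes
        via_catalogue = n        # one convertor per node, to/from the catalogue model
        print(f"n={n:>3}: pairwise={pairwise:>5}  via catalogue={via_catalogue:>3}")
    ```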

  19. Preserving Geological Samples and Metadata from Polar Regions

    NASA Astrophysics Data System (ADS)

    Grunow, A.; Sjunneskog, C. M.

    2011-12-01

    The Office of Polar Programs at the National Science Foundation (NSF-OPP) has long recognized the value of preserving earth science collections due to the inherent logistical challenges and financial costs of collecting geological samples from Polar Regions. NSF-OPP established two national facilities to make Antarctic geological samples and drill cores openly and freely available for research. The Antarctic Marine Geology Research Facility (AMGRF) at Florida State University was established in 1963 and archives Antarctic marine sediment cores, dredge samples and smear slides along with ship logs. The United States Polar Rock Repository (USPRR) at Ohio State University was established in 2003 and archives polar rock samples, marine dredges, unconsolidated materials and terrestrial cores, along with associated materials such as field notes, maps, raw analytical data, paleomagnetic cores, thin sections, microfossil mounts, microslides and residues. The existence of the AMGRF and USPRR helps to minimize redundant sample collecting, lessens the environmental impact of doing polar field work, facilitates field logistics planning and complies with the data sharing requirement of the Antarctic Treaty. USPRR acquires collections through donations from institutions and scientists and then makes these samples available as no-cost loans for research, education and museum exhibits. The AMGRF acquires sediment cores from US-based and international collaborative drilling projects in Antarctica. Destructive research techniques are allowed on the loaned samples and loan requests are accepted from any accredited scientific institution in the world. Currently, the USPRR has more than 22,000 cataloged rock samples available to scientists from around the world. All cataloged samples are relabeled with a USPRR number, weighed, photographed and measured for magnetic susceptibility. Many aspects of the sample metadata are included in the database, e.g. geographical location, sample description, collector, rock age, formation, section location and multimedia images, as well as structural data, field observations, logistics, surface features, etc. The metadata are entered into a commercial, museum-based database called EMu. The AMGRF houses more than 25,000 m of deep-sea cores and drill cores as well as nearly 3,000 meters of rotary cored geological material from Antarctica. Detailed information on the sediment cores, including location and sediment composition, is available in cruise reports posted on the AMGRF website. Researchers may access the sample collections through the online websites (http://www-bprc.mps.ohio-state.edu/emuwebusprr and http://www.arf.fsu.edu). Searches may be done using multiple search terms or by use of the mapping feature. The on-line databases provide an essential resource for proposal preparation, pilot studies and other sample-based research that should make fieldwork more efficient.

  20. Integrated Array/Metadata Analytics

    NASA Astrophysics Data System (ADS)

    Misev, Dimitar; Baumann, Peter

    2015-04-01

    Data comes in various forms and types, and integration usually presents a problem that is often simply ignored or solved with ad-hoc solutions. Multidimensional arrays are a ubiquitous data type that we find at the core of virtually all science and engineering domains, as sensor, model, image and statistics data. Naturally, arrays are richly described by and intertwined with additional metadata (alphanumeric relational data, XML, JSON, etc). Database systems, however, a fundamental building block of what we call "Big Data", lack adequate support for modelling and expressing these array data/metadata relationships. Array analytics is hence quite primitive, or non-existent, in modern relational DBMSs. Recognizing this, we extended SQL with a new SQL/MDA part seamlessly integrating multidimensional array analytics into the standard database query language. We demonstrate the benefits of SQL/MDA with real-world examples executed in ASQLDB, an open-source mediator system based on HSQLDB and rasdaman, that already implements SQL/MDA.
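
    SQL/MDA expresses such queries inside the database itself; the snippet below is only a plain-Python analogue of the query pattern (filter by alphanumeric metadata, then compute over the associated multidimensional array), with coverages and metadata invented for illustration.

    ```python
    # Plain-Python analogue of an integrated array/metadata query: select
    # coverages by metadata, then aggregate over their array content.

    import numpy as np

    coverages = [
        {"name": "temp_2023", "region": "Europe", "array": np.random.rand(4, 4)},
        {"name": "temp_2024", "region": "Europe", "array": np.random.rand(4, 4)},
        {"name": "ndvi_2024", "region": "Africa", "array": np.random.rand(4, 4)},
    ]

    # Conceptually: "SELECT name, avg(array) FROM coverages WHERE region = 'Europe'"
    for cov in coverages:
        if cov["region"] == "Europe":
            print(cov["name"], float(cov["array"].mean()))
    ```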

  1. The PDS4 Data Dictionary Tool - Metadata Design for Data Preparers

    NASA Astrophysics Data System (ADS)

    Raugh, A.; Hughes, J. S.

    2017-12-01

    One of the major design goals of the PDS4 development effort was to create an extendable Information Model (IM) for the archive, and to allow mission data designers/preparers to create extensions for metadata definitions specific to their own contexts. This capability is critical for the Planetary Data System - an archive that deals with a data collection that is diverse along virtually every conceivable axis. Amid such diversity in the data itself, it is in the best interests of the PDS archive and its users that all extensions to the IM follow the same design techniques, conventions, and restrictions as the core implementation itself. But it is unrealistic to expect mission data designers to acquire expertise in information modeling, model-driven design, ontology, schema formulation, and PDS4 design conventions and philosophy in order to define their own metadata. To bridge that expertise gap and bring the power of information modeling to the data label designer, the PDS Engineering Node has developed the data dictionary creation tool known as "LDDTool". This tool incorporates the same software used to maintain and extend the core IM, packaged with an interface that enables a developer to create his extension to the IM using the same, standards-based metadata framework PDS itself uses. Through this interface, the novice dictionary developer has immediate access to the common set of data types and unit classes for defining attributes, and a straight-forward method for constructing classes. The more experienced developer, using the same tool, has access to more sophisticated modeling methods like abstraction and extension, and can define context-specific validation rules. We present the key features of the PDS Local Data Dictionary Tool, which both supports the development of extensions to the PDS4 IM, and ensures their compatibility with the IM.

  2. Cataloging the Net: Can We Do It?

    ERIC Educational Resources Information Center

    Oder, Norman

    1998-01-01

    Discusses possibilities for cataloging Internet resources and the role that the library profession can play. Topics include the Dublin Core metadata; public library projects (Michigan Electronic Library "MEL" and Librarians' Index to the Internet "LII"); academic library projects (INFOMINE, Scout Report); commercial sites…

  3. Logic programming and metadata specifications

    NASA Technical Reports Server (NTRS)

    Lopez, Antonio M., Jr.; Saacks, Marguerite E.

    1992-01-01

    Artificial intelligence (AI) ideas and techniques are critical to the development of intelligent information systems that will be used to collect, manipulate, and retrieve the vast amounts of space data produced by 'Missions to Planet Earth.' Natural language processing, inference, and expert systems are at the core of this space application of AI. This paper presents logic programming as an AI tool that can support inference (the ability to draw conclusions from a set of complicated and interrelated facts). It reports on the use of logic programming in the study of metadata specifications for a small problem domain of airborne sensors, and the dataset characteristics and pointers that are needed for data access.
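
    As a rough Python analogue of the logic-programming inference the paper describes, the sketch below applies one hand-written rule over a few invented metadata facts about airborne sensors; the facts and the rule are hypothetical, not the paper's knowledge base.

    ```python
    # Toy rule-based inference over metadata facts: which sensors measure a
    # requested quantity? (A logic-programming system would express this as a
    # rule and let the inference engine answer the query.)

    facts = {
        ("sensor", "AVIRIS"), ("sensor", "TIMS"),
        ("measures", "AVIRIS", "reflectance"),
        ("measures", "TIMS", "thermal_emission"),
        ("archived_at", "AVIRIS", "archive_A"),
    }

    def sensors_measuring(quantity):
        """Rule: X is relevant if sensor(X) and measures(X, quantity)."""
        return sorted(
            name for (_, name) in {f for f in facts if f[0] == "sensor"}
            if ("measures", name, quantity) in facts
        )

    print(sensors_measuring("reflectance"))   # -> ['AVIRIS']
    ```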

  4. Large-Scale Data Collection Metadata Management at the National Computation Infrastructure

    NASA Astrophysics Data System (ADS)

    Wang, J.; Evans, B. J. K.; Bastrakova, I.; Ryder, G.; Martin, J.; Duursma, D.; Gohar, K.; Mackey, T.; Paget, M.; Siddeswara, G.

    2014-12-01

    Data Collection management has become an essential activity at the National Computation Infrastructure (NCI) in Australia. NCI's partners (CSIRO, Bureau of Meteorology, Australian National University, and Geoscience Australia), supported by the Australian Government and Research Data Storage Infrastructure (RDSI), have established a national data resource that is co-located with high-performance computing. This paper addresses the metadata management of these data assets over their lifetime. NCI manages 36 data collections (10+ PB) categorised as earth system sciences, climate and weather model data assets and products, earth and marine observations and products, geosciences, terrestrial ecosystem, water management and hydrology, astronomy, social science and biosciences. The data is largely sourced from NCI partners, the custodians of many of the national scientific records, and major research community organisations. The data is made available in an HPC and data-intensive environment - a ~56000 core supercomputer, virtual labs on a 3000 core cloud system, and data services. By assembling these large national assets, new opportunities have arisen to harmonise the data collections, making a powerful cross-disciplinary resource. To support the overall management, a Data Management Plan (DMP) has been developed to record the workflows, procedures, the key contacts and responsibilities. The DMP has fields that can be exported to the ISO19115 schema and to the collection level catalogue of GeoNetwork. The subset or file level metadata catalogues are linked with the collection level through parent-child relationship definition using UUID. A number of tools have been developed that support interactive metadata management, bulk loading of data, and support for computational workflows or data pipelines. NCI creates persistent identifiers for each of the assets. The data collection is tracked over its lifetime, and the recognition of the data providers, data owners, data generators and data aggregators is updated. A Digital Object Identifier is assigned using the Australian National Data Service (ANDS). Once the data has been quality assured, a DOI is minted and the metadata record updated. NCI's data citation policy establishes the relationship between research outcomes, data providers, and the data.
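
    The collection-to-file linkage described above can be sketched as follows: a collection-level record and its file-level records are tied together by UUID, and a DOI placeholder is attached once quality assurance is complete. Identifiers, paths and fields are illustrative only, not NCI's actual catalogue schema.

    ```python
    # Parent-child linkage between a collection-level record and file-level
    # records via UUID, with a placeholder DOI added after quality assurance.

    import uuid

    collection = {
        "uuid": str(uuid.uuid4()),
        "title": "Example climate model output collection",
        "iso_metadata_standard": "ISO 19115",
        "doi": None,                      # minted after quality assurance
    }

    def make_file_record(path: str, parent_uuid: str) -> dict:
        return {"uuid": str(uuid.uuid4()), "path": path, "parent_uuid": parent_uuid}

    files = [make_file_record(f"/g/data/example/file_{i}.nc", collection["uuid"])
             for i in range(3)]

    collection["doi"] = "10.0000/example-doi"   # placeholder for the minted DOI
    print(collection["uuid"], all(f["parent_uuid"] == collection["uuid"] for f in files))
    ```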

  5. ISTIMES Integrated System for Transport Infrastructures Surveillance and Monitoring by Electromagnetic Sensing

    NASA Astrophysics Data System (ADS)

    Argenti, M.; Giannini, V.; Averty, R.; Bigagli, L.; Dumoulin, J.

    2012-04-01

    The EC FP7 ISTIMES project has the goal of realizing an ICT-based system exploiting distributed and local sensors for non-destructive electromagnetic monitoring in order to make critical transport infrastructures more reliable and safe. Higher situation awareness thanks to real time and detailed information and images of the controlled infrastructure status allows improving decision capabilities for emergency management stakeholders. Web-enabled sensors and a service-oriented approach are used as core of the architecture providing a system that adopts open standards (e.g. OGC SWE, OGC CSW etc.) and makes efforts to achieve full interoperability with other GMES and European Spatial Data Infrastructure initiatives as well as compliance with INSPIRE. The system exploits an open, easily scalable network architecture to accommodate a wide range of sensors integrated with a set of tools for handling, analyzing and processing large data volumes from different organizations with different data models. Situation Awareness tools are also integrated in the system. Definition of sensor observations and services follows a metadata model based on the ISO 19115 Core set of metadata elements and the O&M model of OGC SWE. The ISTIMES infrastructure is based on an e-Infrastructure for geospatial data sharing, with a Data Catalog that implements the discovery services for sensor data retrieval, acting as a broker through static connections based on standard SOS and WNS interfaces; a Decision Support component which helps decision makers providing support for data fusion and inference and generation of situation indexes; a Presentation component which implements system-users interaction services for information publication and rendering, by means of a WEB Portal using SOA design principles; and a security framework using Shibboleth open source middleware based on the Security Assertion Markup Language supporting Single Sign On (SSO). ACKNOWLEDGEMENT - The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under Grant Agreement n° 225663

  6. DIRAC File Replica and Metadata Catalog

    NASA Astrophysics Data System (ADS)

    Tsaregorodtsev, A.; Poss, S.

    2012-12-01

    File replica and metadata catalogs are essential parts of any distributed data management system, and largely determine its functionality and performance. A new File Catalog (DFC) was developed in the framework of the DIRAC Project that combines both replica and metadata catalog functionality. The DFC design is based on the practical experience with the data management system of the LHCb Collaboration. It is optimized for the most common patterns of the catalog usage in order to achieve maximum performance from the user perspective. The DFC supports bulk operations for replica queries and allows quick analysis of the storage usage globally and for each Storage Element separately. It supports flexible ACL rules with plug-ins for various policies that can be adopted by a particular community. The DFC catalog allows storing various types of metadata associated with files and directories and performing efficient queries for the data based on complex metadata combinations. Definition of file ancestor-descendant relation chains is also possible. The DFC catalog is implemented in the general DIRAC distributed computing framework following the standard grid security architecture. In this paper we describe the design of the DFC and its implementation details. The performance measurements are compared with other grid file catalog implementations. The experience of the DFC Catalog usage in the CLIC detector project is discussed.
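
    The DFC resolves metadata-combination queries on the server side; the toy index below only illustrates that query pattern (combining several metadata conditions to select logical file names) and is not the DIRAC API or its implementation.

    ```python
    # Toy in-memory catalogue mapping logical file names to metadata, with a
    # query that requires every supplied condition to match.

    catalogue = {
        "/example/run1/file001": {"run": 1, "type": "RAW", "energy_tev": 7},
        "/example/run1/file002": {"run": 1, "type": "DST", "energy_tev": 7},
        "/example/run2/file003": {"run": 2, "type": "RAW", "energy_tev": 13},
    }

    def find_files(**conditions):
        """Return logical file names whose metadata match every condition."""
        return [lfn for lfn, meta in catalogue.items()
                if all(meta.get(key) == value for key, value in conditions.items())]

    print(find_files(type="RAW", energy_tev=7))   # -> ['/example/run1/file001']
    ```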

  7. Searchers Net Treasure in Monterey.

    ERIC Educational Resources Information Center

    McDermott, Irene E.

    1999-01-01

    Reports on Web keyword searching, metadata, Dublin Core, Extensible Markup Language (XML), metasearch engines (metasearch engines search several Web indexes and/or directories and/or Usenet and/or specific Web sites), and the Year 2000 (Y2K) dilemma, all topics discussed at the second annual Internet Librarian Conference sponsored by Information…

  8. Keeping Dublin Core Simple: Cross-Domain Discovery or Resource Description?; First Steps in an Information Commerce Economy: Digital Rights Management in the Emerging E-Book Environment; Interoperability: Digital Rights Management and the Emerging EBook Environment; Searching the Deep Web: Direct Query Engine Applications at the Department of Energy.

    ERIC Educational Resources Information Center

    Lagoze, Carl; Neylon, Eamonn; Mooney, Stephen; Warnick, Walter L.; Scott, R. L.; Spence, Karen J.; Johnson, Lorrie A.; Allen, Valerie S.; Lederman, Abe

    2001-01-01

    Includes four articles that discuss Dublin Core metadata, digital rights management and electronic books, including interoperability; and directed query engines, a type of search engine designed to access resources on the deep Web that is being used at the Department of Energy. (LRW)

  9. Advancements in Large-Scale Data/Metadata Management for Scientific Data.

    NASA Astrophysics Data System (ADS)

    Guntupally, K.; Devarakonda, R.; Palanisamy, G.; Frame, M. T.

    2017-12-01

    Scientific data often comes with complex and diverse metadata which are critical for data discovery and for users. The Online Metadata Editor (OME) tool, which was developed by an Oak Ridge National Laboratory team, effectively manages diverse scientific datasets across several federal data centers, such as DOE's Atmospheric Radiation Measurement (ARM) Data Center and USGS's Core Science Analytics, Synthesis, and Libraries (CSAS&L) project. This presentation will focus mainly on recent developments and future strategies for refining the OME tool within these centers. The ARM OME is a standards-based tool (https://www.archive.arm.gov/armome) that allows scientists to create and maintain metadata about their data products. The tool has been improved with new workflows that help metadata coordinators and submitting investigators to submit and review their data more efficiently. The ARM Data Center's newly upgraded Data Discovery Tool (http://www.archive.arm.gov/discovery) uses rich metadata generated by the OME to enable search and discovery of thousands of datasets, while also providing a citation generator and modern order-delivery techniques like Globus (using GridFTP), Dropbox and THREDDS. The Data Discovery Tool also supports incremental indexing, which allows users to find new data as and when they are added. The USGS CSAS&L search catalog employs a custom version of the OME (https://www1.usgs.gov/csas/ome), which has been upgraded with high-level Federal Geographic Data Committee (FGDC) validations and the ability to reserve and mint Digital Object Identifiers (DOIs). The USGS's Science Data Catalog (SDC) (https://data.usgs.gov/datacatalog) allows users to discover a myriad of science data holdings through a web portal. Recent major upgrades to the SDC and ARM Data Discovery Tool include improved harvesting performance and migration using new search software, such as Apache Solr 6.0 for serving up data/metadata to scientific communities. Our presentation will highlight the future enhancements of these tools which enable users to retrieve fast search results, along with parallelizing the retrieval process from online and High Performance Storage Systems. In addition, these improvements to the tools will support additional metadata formats like the Large-Eddy Simulation (LES) ARM Symbiotic and Observation (LASSO) bundle data.

  10. Cytometry metadata in XML

    NASA Astrophysics Data System (ADS)

    Leif, Robert C.; Leif, Stephanie H.

    2016-04-01

    Introduction: The International Society for Advancement of Cytometry (ISAC) has created a standard for the Minimum Information about a Flow Cytometry Experiment (MIFlowCyt 1.0). CytometryML will serve as a common metadata standard for flow and image cytometry (digital microscopy). Methods: The MIFlowCyt data-types were created, as is the rest of CytometryML, in the XML Schema Definition Language (XSD 1.1). The datatypes are primarily based on the Flow Cytometry and the Digital Imaging and Communications in Medicine (DICOM) standards. A small section of the code was formatted with standard HTML formatting elements (p, h1, h2, etc.). Results: 1) The part of MIFlowCyt that describes the Experimental Overview, including the specimen, and substantial parts of several other major elements has been implemented as CytometryML XML schemas (www.cytometryml.org). 2) The feasibility of using MIFlowCyt to provide the combination of an overview, table of contents, and/or an index of a scientific paper or a report has been demonstrated. Previously, a sample electronic publication, EPUB, was created that could contain both MIFlowCyt metadata as well as the binary data. Conclusions: The use of CytometryML technology together with XHTML5 and CSS permits the metadata to be directly formatted and, together with the binary data, to be stored in an EPUB container. This will facilitate formatting, data-mining, presentation, data verification, and inclusion in structured research, clinical, and regulatory documents, as well as demonstrate a publication's adherence to the MIFlowCyt standard and promote interoperability; it should also result in the textual and numeric data being published using web technology without any change in composition.

  11. Cleaning by clustering: methodology for addressing data quality issues in biomedical metadata.

    PubMed

    Hu, Wei; Zaveri, Amrapali; Qiu, Honglei; Dumontier, Michel

    2017-09-18

    The ability to efficiently search and filter datasets depends on access to high quality metadata. While most biomedical repositories require data submitters to provide a minimal set of metadata, some, such as the Gene Expression Omnibus (GEO), allow users to specify additional metadata in the form of textual key-value pairs (e.g. sex: female). However, since there is no structured vocabulary to guide the submitter regarding the metadata terms to use, the 44,000,000+ key-value pairs in GEO suffer from numerous quality issues including redundancy, heterogeneity, inconsistency, and incompleteness. Such issues hinder the ability of scientists to hone in on datasets that meet their requirements and point to a need for accurate, structured and complete description of the data. In this study, we propose a clustering-based approach to address data quality issues in biomedical, specifically gene expression, metadata. First, we present three different kinds of similarity measures to compare metadata keys. Second, we design a scalable agglomerative clustering algorithm to cluster similar keys together. Our agglomerative clustering algorithm identified metadata keys that were similar to each other, based on (i) name, (ii) core concept and (iii) value similarities, and grouped them together. We evaluated our method using a manually created gold standard in which 359 keys were grouped into 27 clusters based on six types of characteristics: (i) age, (ii) cell line, (iii) disease, (iv) strain, (v) tissue and (vi) treatment. As a result, the algorithm generated 18 clusters containing 355 keys (four clusters with only one key were excluded). In the 18 clusters, most keys were correctly identified as related to their cluster, but 13 keys were not related to the cluster they were assigned to. We compared our approach with four other published methods. Our approach significantly outperformed them for most metadata keys and achieved the best average F-Score (0.63). Our algorithm identified keys that were similar to each other and grouped them together. The intuition that underpins cleaning by clustering is that dividing keys into different clusters resolves the scalability issues for data observation and cleaning, and keys in the same cluster with duplicates and errors can easily be found. Our algorithm can also be applied to other biomedical data types.
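
    A toy version of the name-similarity clustering step might look like the sketch below: keys are greedily merged whenever their token overlap exceeds a threshold (single linkage). The keys and the threshold are invented, and the paper's full method also uses core-concept and value similarities.

    ```python
    # Greedy single-linkage clustering of metadata keys by token-overlap
    # (Jaccard) similarity; illustrative only.

    import re

    keys = ["age", "age (years)", "patient age", "tissue", "tissue type", "Sex"]

    def tokens(key: str) -> set[str]:
        return set(re.findall(r"[a-z0-9]+", key.lower()))

    def similarity(a: str, b: str) -> float:
        ta, tb = tokens(a), tokens(b)
        return len(ta & tb) / len(ta | tb)

    clusters: list[list[str]] = []
    for key in keys:
        for cluster in clusters:
            if any(similarity(key, member) >= 0.5 for member in cluster):
                cluster.append(key)
                break
        else:
            clusters.append([key])

    print(clusters)  # [['age', 'age (years)', 'patient age'], ['tissue', 'tissue type'], ['Sex']]
    ```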

  12. Solutions for extracting file level spatial metadata from airborne mission data

    NASA Astrophysics Data System (ADS)

    Schwab, M. J.; Stanley, M.; Pals, J.; Brodzik, M.; Fowler, C.; Icebridge Engineering/Spatial Metadata

    2011-12-01

    Data sets acquired from satellites and aircraft may differ in many ways. We will focus on the differences in spatial coverage between the two platforms. Satellite data sets over a given period typically cover large geographic regions. These data are collected in a consistent, predictable and well understood manner due to the uniformity of satellite orbits. Since satellite data collection paths are typically smooth and uniform, the data from satellite instruments can usually be described with simple spatial metadata. Subsequently, these spatial metadata can be stored and searched easily and efficiently. Conversely, aircraft have significantly more freedom to change paths, circle, overlap, and vary altitude, all of which add complexity to the spatial metadata. Aircraft are also subject to wind and other elements that result in even more complicated and unpredictable spatial coverage areas. This unpredictability and complexity makes it more difficult to extract usable spatial metadata from data sets collected on aircraft missions. It is not feasible to use all of the location data from aircraft mission data sets as spatial metadata. The number of data points in typical data sets poses serious performance problems for spatial searching. In order to provide efficient spatial searching of the large number of files cataloged in our systems, we need to extract approximate spatial descriptions as geo-polygons from a small number of vertices (fewer than two hundred). We present some of the challenges and solutions for creating airborne mission-derived spatial metadata. We are implementing these methods to create the spatial metadata for insertion of IceBridge mission data into ECS for public access through NSIDC and ECHO, but they are potentially extensible to any aircraft mission data.
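
    One possible way (not necessarily the IceBridge production method) to reduce a dense flight track to a compact search polygon is to buffer the track and simplify the outline until it has fewer than 200 vertices, as in the shapely-based sketch below with a synthetic track.

    ```python
    # Buffer a dense synthetic flight track, then simplify the resulting
    # footprint polygon until it has fewer than 200 vertices.

    import numpy as np
    from shapely.geometry import LineString

    t = np.linspace(0, 6 * np.pi, 5000)
    track = LineString(list(zip(-50 + 0.5 * t, 70 + 2 * np.sin(t))))   # (lon, lat) pairs

    footprint = track.buffer(0.25)           # approximate swath around the track
    tolerance = 0.01
    while len(footprint.exterior.coords) > 200:
        footprint = footprint.simplify(tolerance, preserve_topology=True)
        tolerance *= 2

    print(f"polygon vertices: {len(footprint.exterior.coords)}")
    ```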

  13. Java-Library for the Access, Storage and Editing of Calibration Metadata of Optical Sensors

    NASA Astrophysics Data System (ADS)

    Firlej, M.; Kresse, W.

    2016-06-01

    The standardization of the calibration of optical sensors in photogrammetry and remote sensing has been discussed for more than a decade. Projects of the German DGPF and the European EuroSDR led to the abstract International Technical Specification ISO/TS 19159-1:2014 "Calibration and validation of remote sensing imagery sensors and data - Part 1: Optical sensors". This article presents the first software interface for read and write access to all metadata elements standardized in the ISO/TS 19159-1. This interface is based on an XML schema that was automatically derived by ShapeChange from the UML model of the Specification. The software interface serves two use cases. First, the more than 300 standardized metadata elements are stored individually according to the XML schema. Secondly, camera manufacturers use many administrative data items that are not part of the ISO/TS 19159-1. The new software interface provides a mechanism for input, storage, editing, and output of both types of data. Finally, an output channel towards a typical calibration protocol is provided. The interface is written in Java. The article also addresses observations made when analysing the ISO/TS 19159-1 and compiles a list of proposals for maturing the document, i.e. for an updated version of the Specification.

  14. Ontologies for Effective Use of Context in E-Learning Settings

    ERIC Educational Resources Information Center

    Jovanovic, Jelena; Gasevic, Dragan; Knight, Colin; Richards, Griff

    2007-01-01

    This paper presents an ontology-based framework aimed at explicit representation of context-specific metadata derived from the actual usage of learning objects and learning designs. The core part of the proposed framework is a learning object context ontology, that leverages a range of other kinds of learning ontologies (e.g., user modeling…

  15. Publications - RDF 2016-6 v. 1.1 | Alaska Division of Geological &

    Science.gov Websites

    Keywords: Alaska Range; Analyses; Analyses and Sampling; Analytical Lab

  16. Standardized Representation of Clinical Study Data Dictionaries with CIMI Archetypes

    PubMed Central

    Sharma, Deepak K.; Solbrig, Harold R.; Prud’hommeaux, Eric; Pathak, Jyotishman; Jiang, Guoqian

    2016-01-01

    Researchers commonly use a tabular format to describe and represent clinical study data. The lack of standardization of data dictionary’s metadata elements presents challenges for their harmonization for similar studies and impedes interoperability outside the local context. We propose that representing data dictionaries in the form of standardized archetypes can help to overcome this problem. The Archetype Modeling Language (AML) as developed by the Clinical Information Modeling Initiative (CIMI) can serve as a common format for the representation of data dictionary models. We mapped three different data dictionaries (identified from dbGAP, PheKB and TCGA) onto AML archetypes by aligning dictionary variable definitions with the AML archetype elements. The near complete alignment of data dictionaries helped map them into valid AML models that captured all data dictionary model metadata. The outcome of the work would help subject matter experts harmonize data models for quality, semantic interoperability and better downstream data integration. PMID:28269909

  17. Standardized Representation of Clinical Study Data Dictionaries with CIMI Archetypes.

    PubMed

    Sharma, Deepak K; Solbrig, Harold R; Prud'hommeaux, Eric; Pathak, Jyotishman; Jiang, Guoqian

    2016-01-01

    Researchers commonly use a tabular format to describe and represent clinical study data. The lack of standardization of data dictionary's metadata elements presents challenges for their harmonization for similar studies and impedes interoperability outside the local context. We propose that representing data dictionaries in the form of standardized archetypes can help to overcome this problem. The Archetype Modeling Language (AML) as developed by the Clinical Information Modeling Initiative (CIMI) can serve as a common format for the representation of data dictionary models. We mapped three different data dictionaries (identified from dbGAP, PheKB and TCGA) onto AML archetypes by aligning dictionary variable definitions with the AML archetype elements. The near complete alignment of data dictionaries helped map them into valid AML models that captured all data dictionary model metadata. The outcome of the work would help subject matter experts harmonize data models for quality, semantic interoperability and better downstream data integration.

  18. A Proposal for a Thesaurus for Web Services in Solar Radiation

    NASA Technical Reports Server (NTRS)

    Gschwind, Benoit; Menard, Lionel; Ranchin, Thierry; Wald, Lucien; Stackhouse, Paul W., Jr.

    2007-01-01

    Metadata are necessary to discover, describe and exchange any type of information, resource and service at a large scale. A significant amount of effort has been made in the field of geography and environment to establish standards. Efforts still remain to address more specific domains such as renewable energies. This communication focuses on solar energy and more specifically on aspects of solar radiation that relate to geography and meteorology. A thesaurus for solar radiation is proposed, covering its key elements, namely time, space and radiation type. The importance of time series in solar radiation is outlined and attributes of the key elements are discussed. An XML schema for encoding metadata is proposed. The exploitation of such a schema in web services is discussed. This proposal is a first attempt at establishing a thesaurus for describing data and applications in solar radiation.

  19. In-field Access to Geoscientific Metadata through GPS-enabled Mobile Phones

    NASA Astrophysics Data System (ADS)

    Hobona, Gobe; Jackson, Mike; Jordan, Colm; Butchart, Ben

    2010-05-01

    Fieldwork is an integral part of much geosciences research. But whilst geoscientists have physical or online access to data collections when in the laboratory or at base stations, equivalent in-field access is not standard or straightforward. The increasing availability of mobile internet and GPS-supported mobile phones, however, now provides the basis for addressing this issue. The SPACER project was commissioned by the Rapid Innovation initiative of the UK Joint Information Systems Committee (JISC) to explore the potential for GPS-enabled mobile phones to access geoscientific metadata collections. Metadata collections within the geosciences and the wider geospatial domain can be disseminated through web services based on the Catalogue Service for the Web (CSW) standard of the Open Geospatial Consortium (OGC) - a global grouping of over 380 private, public and academic organisations aiming to improve interoperability between geospatial technologies. CSW offers an XML-over-HTTP interface for querying and retrieval of geospatial metadata. By default, the metadata returned by CSW is based on the ISO 19115 standard and encoded in XML conformant to ISO 19139. The SPACER project has created a prototype application that enables mobile phones to send queries to CSW containing user-defined keywords and coordinates acquired from the GPS devices built into the phones. The prototype has been developed using the free and open source Google Android platform. The mobile application offers views for listing titles, presenting multiple metadata elements, and a Google Map with an overlay of the bounding coordinates of datasets. The presentation will describe the architecture and approach applied in the development of the prototype.
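
    A keyword-plus-bounding-box CSW query of the kind the SPACER prototype sends can be sketched in a few lines of Python. The snippet below uses the OWSLib client library rather than the project's Android code, and the catalogue URL, keyword, and coordinates are placeholders.

```python
# Sketch of a CSW keyword + bounding-box query, similar in spirit to what the
# SPACER prototype issues from a phone. Uses OWSLib; the endpoint and values
# are examples, not the project's actual catalogue.
from owslib.csw import CatalogueServiceWeb
from owslib.fes import PropertyIsLike, BBox, And

CSW_URL = "https://example.org/csw"          # hypothetical catalogue endpoint

def find_datasets(keyword, lat, lon, radius_deg=0.25, max_records=10):
    """Return {identifier: title} for records matching a keyword near a GPS fix."""
    csw = CatalogueServiceWeb(CSW_URL)
    text = PropertyIsLike("csw:AnyText", f"%{keyword}%")
    box = BBox([lon - radius_deg, lat - radius_deg,
                lon + radius_deg, lat + radius_deg])
    csw.getrecords2(constraints=[And([text, box])],
                    esn="summary", maxrecords=max_records)
    return {ident: rec.title for ident, rec in csw.records.items()}

if __name__ == "__main__":
    print(find_datasets("borehole", lat=52.95, lon=-1.15))
```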

  20. linkedISA: semantic representation of ISA-Tab experimental metadata.

    PubMed

    González-Beltrán, Alejandra; Maguire, Eamonn; Sansone, Susanna-Assunta; Rocca-Serra, Philippe

    2014-01-01

    Reporting and sharing experimental metadata (such as the experimental design, characteristics of the samples, and procedures applied), along with the analysis results, in a standardised manner ensures that datasets are comprehensible and, in principle, reproducible, comparable and reusable. Furthermore, sharing datasets in formats designed for consumption by humans and machines will also maximize their use. The Investigation/Study/Assay (ISA) open source metadata tracking framework facilitates standards-compliant collection, curation, visualization, storage and sharing of datasets, leveraging other platforms to enable analysis and publication. The ISA software suite includes several components used in an increasingly diverse set of life science and biomedical domains; it is underpinned by a general-purpose format, ISA-Tab, and conversions exist into formats required by public repositories. While ISA-Tab works well mainly as a human-readable format, we have also implemented a linked data approach to semantically define the ISA-Tab syntax. We present a semantic web representation of the ISA-Tab syntax that complements ISA-Tab's syntactic interoperability with semantic interoperability. We introduce the linkedISA conversion tool from ISA-Tab to the Resource Description Framework (RDF), supporting mappings from the ISA syntax to multiple community-defined, open ontologies and capitalising on user-provided ontology annotations in the experimental metadata. We describe insights from the implementation and how annotations can be expanded, driven by the metadata. We applied the conversion tool as part of Bio-GraphIIn, a web-based application supporting integration of the semantically rich experimental descriptions. Designed in a user-friendly manner, the Bio-GraphIIn interface hides most of the complexity from the users, exposing a familiar tabular view of the experimental description to allow seamless interaction with the RDF representation, and visualising descriptors to drive queries over the semantic representation of the experimental design. In addition, we defined queries over the linkedISA RDF representation and demonstrated its use over the linkedISA conversion of datasets from Nature's Scientific Data online publication. Our linked data approach has allowed us to: 1) make the ISA-Tab semantics explicit and machine-processable, 2) exploit the existing ontology-based annotations in the ISA-Tab experimental descriptions, 3) augment the ISA-Tab syntax with new descriptive elements, and 4) visualise and query elements related to the experimental design. Reasoning over ISA-Tab metadata and associated data will facilitate data integration and knowledge discovery.
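
    As a rough illustration of the linked data idea (not the linkedISA converter itself), the sketch below exposes one row of ISA-Tab-like sample metadata as RDF triples with rdflib; the namespace and property names are invented for the example.

```python
# Minimal sketch: expose one row of tabular (ISA-Tab-like) sample metadata as RDF.
# Not the linkedISA converter; the namespace and terms are illustrative only.
from rdflib import Graph, Literal, Namespace, RDF, URIRef

EX = Namespace("http://example.org/isa/")     # hypothetical vocabulary

row = {"sample": "sample-1", "organism": "Homo sapiens", "assay": "RNA-Seq"}

g = Graph()
g.bind("ex", EX)

sample = URIRef(EX[row["sample"]])
g.add((sample, RDF.type, EX.Sample))                      # typed node per sample
g.add((sample, EX.organism, Literal(row["organism"])))    # sample characteristic
g.add((sample, EX.measuredBy, EX[row["assay"]]))          # link to the assay

print(g.serialize(format="turtle"))
```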

  1. Evaluating the potential effects of hurricanes on long-term sediment accumulation in two micro-tidal sub-estuaries: Barnegat Bay and Little Egg Harbor, New Jersey, U.S.A.

    USGS Publications Warehouse

    Marot, Marci E.; Smith, Christopher G.; Ellis, Alisha M.; Wheaton, Cathryn J.

    2016-06-23

    This report serves as an archive for sedimentological and radiochemical data derived from the surface sediments and box cores. Downloadable data are available as Excel spreadsheets, PDF files, and JPEG files, and include sediment core data plots and x-radiographs, as well as physical-properties, grain-size, alpha-spectroscopy, and gamma-spectroscopy data. Federal Geographic Data Committee metadata are available for analytical datasets in the data downloads page of this report.

  2. Information-computational system for storage, search and analytical processing of environmental datasets based on the Semantic Web technologies

    NASA Astrophysics Data System (ADS)

    Titov, A.; Gordov, E.; Okladnikov, I.

    2009-04-01

    In this report we present the results of work devoted to the development of a working model of a software system for the storage, semantically enabled search and retrieval, processing and visualization of environmental datasets containing the results of meteorological and air pollution observations and mathematical climate modeling. A specially designed metadata standard for machine-readable description of datasets related to the meteorology, climate and atmospheric pollution transport domains is introduced as one of the key system components. To provide semantic interoperability, the Resource Description Framework (RDF, http://www.w3.org/RDF/) technology has been chosen to realize the metadata description model in the form of an RDF Schema. The final version of the RDF Schema is implemented on the basis of widely used standards, such as the Dublin Core Metadata Element Set (http://dublincore.org/), the Directory Interchange Format (DIF, http://gcmd.gsfc.nasa.gov/User/difguide/difman.html), ISO 19139, etc. At present the system is available as a Web server (http://climate.risks.scert.ru/metadatabase/) based on the web portal ATMOS engine [1] and implements dataset management functionality, including SeRQL-based semantic search as well as statistical analysis and visualization of selected data archives [2,3]. The core of the system is an Apache web server in conjunction with the Tomcat Java Servlet Container (http://jakarta.apache.org/tomcat/) and Sesame Server (http://www.openrdf.org/), used as a database for RDF and RDF Schema. At present, statistical analysis of meteorological and climatic data with subsequent visualization of the results is implemented for such datasets as the NCEP/NCAR Reanalysis, the NCEP/DOE AMIP-II Reanalysis, JMA/CRIEPI JRA-25, ECMWF ERA-40 and local measurements obtained from meteorological stations on the territory of Russia. This functionality is aimed primarily at identifying the main characteristics of regional climate dynamics. The proposed system represents a step in the development of a distributed collaborative information-computational environment to support multidisciplinary investigations of the Earth's regional environment [4]. Partial support of this work by SB RAS Integration Project 34, SB RAS Basic Program Project 4.5.2.2, APN Project CBA2007-08NSY and FP6 Enviro-RISKS project (INCO-CT-2004-013427) is acknowledged. References 1. E.P. Gordov, V.N. Lykosov, and A.Z. Fazliev. Web portal on environmental sciences "ATMOS" // Advances in Geosciences. 2006. Vol. 8. p. 33 - 38. 2. Gordov E.P., Okladnikov I.G., Titov A.G. Development of elements of web based information-computational system supporting regional environment processes investigations // Journal of Computational Technologies, Vol. 12, Special Issue #3, 2007, pp. 20 - 28. 3. Okladnikov I.G., Titov A.G., Melnikova V.N., Shulgina T.M. Web-system for processing and visualization of meteorological and climatic data // Journal of Computational Technologies, Vol. 13, Special Issue #3, 2008, pp. 64 - 69. 4. Gordov E.P., Lykosov V.N. Development of information-computational infrastructure for integrated study of Siberia environment // Journal of Computational Technologies, Vol. 12, Special Issue #2, 2007, pp. 19 - 30.
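
    Because the schema builds on the Dublin Core Metadata Element Set, a toy version of such a machine-readable dataset description, together with a semantic query over it, might look as follows; the query is written in SPARQL rather than the SeRQL used by the system described above, and the dataset URI and values are made up.

```python
# Toy RDF description of a climate dataset using Dublin Core terms, plus a
# SPARQL query over it. Illustrative only; the system above stores RDF Schema
# in Sesame and queries it with SeRQL.
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DC, DCTERMS

g = Graph()
ds = URIRef("http://example.org/datasets/ncep-reanalysis-t2m")  # hypothetical ID

g.add((ds, DC.title, Literal("NCEP/NCAR Reanalysis 2 m air temperature")))
g.add((ds, DC.subject, Literal("meteorology")))
g.add((ds, DCTERMS.spatial, Literal("Siberia")))

query = """
    SELECT ?dataset ?title WHERE {
        ?dataset <http://purl.org/dc/elements/1.1/subject> "meteorology" ;
                 <http://purl.org/dc/elements/1.1/title> ?title .
    }
"""
for dataset, title in g.query(query):
    print(dataset, title)
```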

  3. The MPO system for automatic workflow documentation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Abla, G.; Coviello, E. N.; Flanagan, S. M.

    Data from large-scale experiments and extreme-scale computing is expensive to produce and may be used for critical applications. However, it is not the mere existence of data that is important, but our ability to make use of it. Experience has shown that when metadata is better organized and more complete, the underlying data becomes more useful. Traditionally, capturing the steps of scientific workflows and metadata was the role of the lab notebook, but the digital era has resulted instead in the fragmentation of data, processing, and annotation. This article presents the Metadata, Provenance, and Ontology (MPO) System, software that can automate the documentation of scientific workflows and associated information. Based on recorded metadata, it provides explicit information about the relationships among the elements of workflows in notebook form augmented with directed acyclic graphs. A set of web-based graphical navigation tools and an Application Programming Interface (API) have been created for searching and browsing, as well as programmatically accessing the workflows and data. We describe the MPO concepts and its software architecture. We also report the current status of the software as well as the initial deployment experience.

  4. The MPO system for automatic workflow documentation

    DOE PAGES

    Abla, G.; Coviello, E. N.; Flanagan, S. M.; ...

    2016-04-18

    Data from large-scale experiments and extreme-scale computing is expensive to produce and may be used for critical applications. However, it is not the mere existence of data that is important, but our ability to make use of it. Experience has shown that when metadata is better organized and more complete, the underlying data becomes more useful. Traditionally, capturing the steps of scientific workflows and metadata was the role of the lab notebook, but the digital era has resulted instead in the fragmentation of data, processing, and annotation. This article presents the Metadata, Provenance, and Ontology (MPO) System, software that can automate the documentation of scientific workflows and associated information. Based on recorded metadata, it provides explicit information about the relationships among the elements of workflows in notebook form augmented with directed acyclic graphs. A set of web-based graphical navigation tools and an Application Programming Interface (API) have been created for searching and browsing, as well as programmatically accessing the workflows and data. We describe the MPO concepts and its software architecture. We also report the current status of the software as well as the initial deployment experience.
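
    A generic way to capture the kind of relationships MPO records (workflow steps and the data objects that flow between them) is a directed acyclic graph with metadata attached to each node. The sketch below uses networkx and invented step names; it illustrates the idea only and is not the MPO API.

```python
# Generic sketch of workflow provenance as a DAG with per-node metadata.
# Not the MPO API; step names and attributes are invented for illustration.
import networkx as nx

g = nx.DiGraph()
g.add_node("shot_12345.h5", kind="data", source="experiment")
g.add_node("filter_signals", kind="activity", code_version="v2.1")
g.add_node("filtered.h5", kind="data")
g.add_edge("shot_12345.h5", "filter_signals")   # input consumed by the step
g.add_edge("filter_signals", "filtered.h5")     # output produced by the step

assert nx.is_directed_acyclic_graph(g)
# Replay the documented workflow in a valid execution order.
for node in nx.topological_sort(g):
    print(node, g.nodes[node])
```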

  5. Multi-facetted Metadata - Describing datasets with different metadata schemas at the same time

    NASA Astrophysics Data System (ADS)

    Ulbricht, Damian; Klump, Jens; Bertelmann, Roland

    2013-04-01

    Inspired by the wish to re-use research data, much work is being done to bring the data systems of the earth sciences together. Discovery metadata are disseminated to data portals to allow the building of customized indexes of catalogued dataset items. Data that were once acquired in the context of a scientific project are open for reappraisal and can now be used by scientists who were not part of the original research team. To make data re-use easier, measurement methods and measurement parameters must be documented in an application metadata schema and described in a written publication. Linking datasets to publications - as DataCite [1] does - requires again a specific metadata schema, and every new use context of the measured data may require yet another metadata schema sharing only a subset of information with the meta information already present. To cope with the problem of metadata schema diversity in our common data repository at GFZ Potsdam, we established a solution to store file-based research data and describe these with an arbitrary number of metadata schemas. The core component of the data repository is an eSciDoc infrastructure that provides versioned container objects, called eSciDoc [2] "items". The eSciDoc content model allows assigning files to "items" and adding any number of metadata records to these "items". The eSciDoc items can be submitted, revised, and finally published, which makes the data and metadata available through the internet worldwide. GFZ Potsdam uses eSciDoc to support its scientific publishing workflow, including mechanisms for data review in peer review processes by providing temporary web links for external reviewers who do not have credentials to access the data. Based on the eSciDoc API, panMetaDocs [3] provides a web portal for data management in research projects. PanMetaDocs, which is based on panMetaWorks [4], is a PHP-based web application that allows data to be described with any XML-based schema. It uses the eSciDoc infrastructure's REST interface to store versioned dataset files and metadata in an XML format. The software is able to administer more than one eSciDoc metadata record per item and thus allows the description of a dataset according to its context. The metadata fields can be filled with static or dynamic content to reduce the number of fields that require manual entries to a minimum and, at the same time, make use of contextual information available in a project setting. Access rights can be adjusted to set the visibility of datasets to the required degree of openness. Metadata from separate instances of panMetaDocs can be syndicated to portals through RSS and OAI-PMH interfaces. The application architecture presented here allows file-based datasets to be stored and described with any number of metadata schemas, depending on the intended use case. Data and metadata are stored in the same entity (eSciDoc items) and are managed by a software tool through the eSciDoc REST interface - in this case the application is panMetaDocs. Other software may re-use the produced items and modify the appropriate metadata records by accessing the web API of the eSciDoc data infrastructure. For presentation of the datasets in a web browser we are not bound to panMetaDocs; this is done by stylesheet transformation of the eSciDoc item. [1] http://www.datacite.org [2] http://www.escidoc.org , eSciDoc, FIZ Karlsruhe, Germany [3] http://panmetadocs.sf.net , panMetaDocs, GFZ Potsdam, Germany [4] http://metaworks.pangaea.de , panMetaWorks, Dr. R. Huber, MARUM, Univ. Bremen, Germany

  6. Enhancing the MeSH thesaurus to retrieve French online health resources in a quality-controlled gateway.

    PubMed

    Douyère, Magaly; Soualmia, Lina F; Névéol, Aurélie; Rogozan, Alexandrina; Dahamna, Badisse; Leroy, Jean-Philippe; Thirion, Benoît; Darmoni, Stefan J

    2004-12-01

    The amount of health information available on the Internet is considerable. In this context, several health gateways have been developed. Among them, CISMeF (Catalogue and Index of Health Resources in French) was designed to catalogue and index health resources in French. The goal of this article is to describe the various enhancements to the MeSH thesaurus developed by the CISMeF team to adapt this terminology to the broader field of health Internet resources instead of the scientific articles of the MEDLINE bibliographic database. CISMeF uses two standard tools for organizing information: the MeSH thesaurus and several metadata element sets, in particular the Dublin Core metadata format. The heterogeneity of Internet health resources led the CISMeF team to enhance the MeSH thesaurus with the introduction of two new concepts: resource types and metaterms. CISMeF resource types are a generalization of the publication types of MEDLINE. A resource type describes the nature of the resource, and MeSH keyword/qualifier pairs describe the subject of the resource. A metaterm is generally a medical specialty or a biological science which has semantic links with one or more MeSH keywords, qualifiers and resource types. The CISMeF terminology is exploited for several tasks: resource indexing performed manually, resource categorization performed automatically, visualization and navigation through the concept hierarchies, and information retrieval using the Doc'CISMeF search engine. The CISMeF health gateway uses several MeSH thesaurus enhancements to optimize information retrieval, hierarchy navigation and automatic indexing.

  7. THE NEW ONLINE METADATA EDITOR FOR GENERATING STRUCTURED METADATA

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Devarakonda, Ranjeet; Shrestha, Biva; Palanisamy, Giri

    Nobody is better suited to describe data than the scientist who created it. This description of the data is called metadata. In general terms, metadata represents the who, what, when, where, why and how of the dataset [1]. eXtensible Markup Language (XML) is the preferred output format for metadata, as it makes the metadata portable and, more importantly, suitable for system discoverability. The newly developed ORNL Metadata Editor (OME) is a Web-based tool that allows users to create and maintain XML files containing key information, or metadata, about the research. Metadata include information about the specific projects, parameters, time periods, and locations associated with the data. Such information helps put the research findings in context. In addition, the metadata produced using OME will allow other researchers to find these data via metadata clearinghouses like Mercury [2][4]. OME is part of ORNL's Mercury software fleet [2][3]. It was jointly developed to support projects funded by the United States Geological Survey (USGS), U.S. Department of Energy (DOE), National Aeronautics and Space Administration (NASA) and National Oceanic and Atmospheric Administration (NOAA). OME's architecture provides a customizable interface to support project-specific requirements. Using this new architecture, the ORNL team developed OME instances for USGS's Core Science Analytics, Synthesis, and Libraries (CSAS&L), DOE's Next Generation Ecosystem Experiments (NGEE) and Atmospheric Radiation Measurement (ARM) Program, and the international Surface Ocean Carbon Dioxide ATlas (SOCAT). Researchers simply use the ORNL Metadata Editor to enter relevant metadata into a Web-based form. From the information on the form, the Metadata Editor can create an XML file on the server where the editor is installed or on the user's personal computer. Researchers can also use the ORNL Metadata Editor to modify existing XML metadata files. As an example, an NGEE Arctic scientist uses OME to register their datasets with the NGEE data archive, which allows the archive to publish these datasets via a data search portal (http://ngee.ornl.gov/data). The highly descriptive metadata created using OME allow the archive to enable advanced data search options using keyword, geo-spatial, temporal and ontology filters. Similarly, the ARM OME allows scientists or principal investigators (PIs) to submit their data products to the ARM data archive. How would OME help big data centers like the Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC)? The ORNL DAAC is one of NASA's Earth Observing System Data and Information System (EOSDIS) data centers managed by the Earth Science Data and Information System (ESDIS) Project. The ORNL DAAC archives data produced by NASA's Terrestrial Ecology Program. The DAAC provides data and information relevant to biogeochemical dynamics, ecological data, and environmental processes, critical for understanding the dynamics relating to the biological, geological, and chemical components of the Earth's environment. Typically, the data produced, archived and analyzed are at a scale of multiple petabytes, which makes the discoverability of the data very challenging. Without proper metadata associated with the data, it is difficult to find the data you are looking for and equally difficult to use and understand the data. OME will allow data centers like NGEE and the ORNL DAAC to produce meaningful, high-quality, standards-based, descriptive information about their data products, in turn helping with data discoverability and interoperability. Useful Links: USGS OME: http://mercury.ornl.gov/OME/ NGEE OME: http://ngee-arctic.ornl.gov/ngeemetadata/ ARM OME: http://archive2.ornl.gov/armome/ Contact: Ranjeet Devarakonda (devarakondar@ornl.gov) References: [1] Federal Geographic Data Committee. Content standard for digital geospatial metadata. Federal Geographic Data Committee, 1998. [2] Devarakonda, Ranjeet, et al. "Mercury: reusable metadata management, data discovery and access system." Earth Science Informatics 3.1-2 (2010): 87-94. [3] Wilson, B. E., Palanisamy, G., Devarakonda, R., Rhyne, B. T., Lindsley, C., & Green, J. (2010). Mercury Toolset for Spatiotemporal Metadata. [4] Pouchard, L. C., Branstetter, M. L., Cook, R. B., Devarakonda, R., Green, J., Palanisamy, G., ... & Noy, N. F. (2013). A Linked Science investigation: enhancing climate change data discovery with semantic technologies. Earth Science Informatics, 6(3), 175-185.

  8. Playing the Metadata Game: Technologies and Strategies Used by Climate Diagnostics Center for Cataloging and Distributing Climate Data.

    NASA Astrophysics Data System (ADS)

    Schweitzer, R. H.

    2001-05-01

    The Climate Diagnostics Center maintains a collection of gridded climate data primarily for use by local researchers. Because this data is available on fast digital storage and because it has been converted to netCDF using a standard metadata convention (called COARDS), we recognize that this data collection is also useful to the community at large. At CDC we try to use technology and metadata standards to reduce our costs associated with making these data available to the public. The World Wide Web has been an excellent technology platform for meeting that goal. Specifically, we have developed Web-based user interfaces that allow users to search, plot and download subsets from the data collection. We have also been exploring use of the Pacific Marine Environmental Laboratory's Live Access Server (LAS) as an engine for this task. This would result in further savings by allowing us to concentrate on customizing the LAS where needed, rather than developing and maintaining our own system. One such customization currently under development is the use of Java Servlets and JavaServer Pages in conjunction with a metadata database to produce a hierarchical user interface to LAS. In addition to these Web-based user interfaces, all of our data are available via the Distributed Oceanographic Data System (DODS). This allows other sites using LAS and individuals using DODS-enabled clients to use our data as if it were a local file. All of these technology systems are driven by metadata. When we began to create netCDF files, we collaborated with several other agencies to develop a netCDF convention (COARDS) for metadata. At CDC we have extended that convention to incorporate additional metadata elements to make the netCDF files as self-describing as possible. Part of the local metadata is a set of controlled names for the variable, level in the atmosphere and ocean, statistic, and data set for each netCDF file. To allow searching and easy reorganization of these metadata, we loaded the metadata from the netCDF files into a MySQL database. The combination of the MySQL database and the controlled names makes it possible to automate the construction of user interfaces and standard-format metadata descriptions, like the Federal Geographic Data Committee (FGDC) and Directory Interchange Format (DIF). These standard descriptions also include an association between our controlled names and standard keywords such as those developed by the Global Change Master Directory (GCMD). This talk will give an overview of each of these technology and metadata standards as it applies to work at the Climate Diagnostics Center. The talk will also discuss the pros and cons of each approach and discuss areas for future development.
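
    The self-describing netCDF metadata referred to above amounts to global and per-variable attributes. A minimal sketch using the netCDF4 Python library is shown below; the attribute values are illustrative and CDC's controlled names are not reproduced here.

```python
# Minimal sketch of COARDS-style self-describing metadata in a netCDF file,
# written with the netCDF4 library. Names and values are illustrative only.
import numpy as np
from netCDF4 import Dataset

with Dataset("air_temperature.nc", "w") as nc:
    nc.Conventions = "COARDS"                      # global convention attribute
    nc.title = "Monthly mean air temperature"

    nc.createDimension("time", None)
    time = nc.createVariable("time", "f8", ("time",))
    time.units = "days since 1948-01-01 00:00:00"  # COARDS-style time units

    air = nc.createVariable("air", "f4", ("time",))
    air.units = "degK"
    air.long_name = "Monthly mean air temperature at 2 m"

    time[:] = np.arange(3)
    air[:] = [265.1, 266.4, 270.2]
```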

  9. An asynchronous traversal engine for graph-based rich metadata management

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dai, Dong; Carns, Philip; Ross, Robert B.

    Rich metadata in high-performance computing (HPC) systems contains extended information about users, jobs, data files, and their relationships. Property graphs are a promising data model to represent heterogeneous rich metadata flexibly. Specifically, a property graph can use vertices to represent different entities and edges to record the relationships between vertices with unique annotations. The high-volume HPC use case, with millions of entities and relationships, naturally requires an out-of-core distributed property graph database, which must support live updates (to ingest production information in real time), low-latency point queries (for frequent metadata operations such as permission checking), and large-scale traversals (for provenance data mining). Among these needs, large-scale property graph traversals are particularly challenging for distributed graph storage systems. Most existing graph systems implement a "level synchronous" breadth-first search algorithm that relies on global synchronization in each traversal step. This performs well in many problem domains, but a rich metadata management system is characterized by imbalanced graphs, long traversal lengths, and concurrent workloads, each of which has the potential to introduce or exacerbate stragglers (i.e., abnormally slow steps or servers in a graph traversal) that lead to low overall throughput for synchronous traversal algorithms. Previous research indicated that the straggler problem can be mitigated by using asynchronous traversal algorithms, and many graph-processing frameworks have successfully demonstrated this approach. Such systems require the graph to be loaded into a separate batch-processing framework instead of being iteratively accessed, however. In this work, we investigate a general asynchronous graph traversal engine that can operate atop a rich metadata graph in its native format. We outline a traversal-aware query language and key optimizations (traversal-affiliate caching and execution merging) necessary for efficient performance. We further explore the effect of different graph partitioning strategies on the traversal performance for both synchronous and asynchronous traversal engines. Our experiments show that the asynchronous graph traversal engine is more efficient than its synchronous counterpart in the case of HPC rich metadata processing, where more servers are involved and larger traversals are needed. Furthermore, the asynchronous traversal engine is more adaptive to different graph partitioning strategies.
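
    The contrast drawn in the abstract (a level-synchronous BFS that waits for stragglers at every step versus an asynchronous traversal driven by a shared work queue) can be sketched compactly. The graph and worker model below are schematic and are not the distributed engine described in the paper.

```python
# Schematic contrast: asynchronous traversal driven by a shared work queue, so
# no worker waits for a whole BFS level to finish. Not the paper's engine; the
# tiny property graph (adjacency lists) is invented for illustration.
import queue
import threading

graph = {"job1": ["fileA", "fileB"], "fileA": ["user1"], "fileB": [], "user1": []}

def traverse_async(start, workers=4):
    visited = {start}
    lock = threading.Lock()
    work = queue.Queue()
    work.put(start)

    def worker():
        while True:
            try:
                vertex = work.get(timeout=0.1)   # pull work as soon as it exists
            except queue.Empty:
                return                           # queue drained: this worker stops
            for neighbor in graph[vertex]:
                with lock:
                    if neighbor in visited:
                        continue
                    visited.add(neighbor)
                work.put(neighbor)               # no per-level barrier

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return visited

print(traverse_async("job1"))
```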

  10. A New Browser-based, Ontology-driven Tool for Generating Standardized, Deep Descriptions of Geoscience Models

    NASA Astrophysics Data System (ADS)

    Peckham, S. D.; Kelbert, A.; Rudan, S.; Stoica, M.

    2016-12-01

    Standardized metadata for models is the key to reliable and greatly simplified coupling in model coupling frameworks like CSDMS (Community Surface Dynamics Modeling System). This model metadata also helps model users to understand the important details that underpin computational models and to compare the capabilities of different models. These details include simplifying assumptions on the physics, governing equations and the numerical methods used to solve them, discretization of space (the grid) and time (the time-stepping scheme), state variables (input or output), and model configuration parameters. This kind of metadata provides a "deep description" of a computational model that goes well beyond other types of metadata (e.g. author, purpose, scientific domain, programming language, digital rights, provenance, execution) and captures the science that underpins a model. While having this kind of standardized metadata for each model in a repository opens up a wide range of exciting possibilities, it is difficult to collect this information, and a carefully conceived "data model" or schema is needed to store it. Automated harvesting and scraping methods can provide some useful information, but they often result in metadata that is inaccurate or incomplete, and this is not sufficient to enable the desired capabilities. In order to address this problem, we have developed a browser-based tool called the MCM Tool (Model Component Metadata) which runs on notebooks, tablets and smart phones. This tool was partially inspired by the TurboTax software, which greatly simplifies the necessary task of preparing tax documents. It allows a model developer or advanced user to provide a standardized, deep description of a computational geoscience model, including hydrologic models. Under the hood, the tool uses a new ontology for models built on the CSDMS Standard Names, expressed as a collection of RDF (Resource Description Framework) files. This ontology is based on core concepts such as variables, objects, quantities, operations, processes and assumptions. The purpose of this talk is to present details of the new ontology and to then demonstrate the MCM Tool for several hydrologic models.

  11. An asynchronous traversal engine for graph-based rich metadata management

    DOE PAGES

    Dai, Dong; Carns, Philip; Ross, Robert B.; ...

    2016-06-23

    Rich metadata in high-performance computing (HPC) systems contains extended information about users, jobs, data files, and their relationships. Property graphs are a promising data model to represent heterogeneous rich metadata flexibly. Specifically, a property graph can use vertices to represent different entities and edges to record the relationships between vertices with unique annotations. The high-volume HPC use case, with millions of entities and relationships, naturally requires an out-of-core distributed property graph database, which must support live updates (to ingest production information in real time), low-latency point queries (for frequent metadata operations such as permission checking), and large-scale traversals (for provenance data mining). Among these needs, large-scale property graph traversals are particularly challenging for distributed graph storage systems. Most existing graph systems implement a "level synchronous" breadth-first search algorithm that relies on global synchronization in each traversal step. This performs well in many problem domains, but a rich metadata management system is characterized by imbalanced graphs, long traversal lengths, and concurrent workloads, each of which has the potential to introduce or exacerbate stragglers (i.e., abnormally slow steps or servers in a graph traversal) that lead to low overall throughput for synchronous traversal algorithms. Previous research indicated that the straggler problem can be mitigated by using asynchronous traversal algorithms, and many graph-processing frameworks have successfully demonstrated this approach. Such systems require the graph to be loaded into a separate batch-processing framework instead of being iteratively accessed, however. In this work, we investigate a general asynchronous graph traversal engine that can operate atop a rich metadata graph in its native format. We outline a traversal-aware query language and key optimizations (traversal-affiliate caching and execution merging) necessary for efficient performance. We further explore the effect of different graph partitioning strategies on the traversal performance for both synchronous and asynchronous traversal engines. Our experiments show that the asynchronous graph traversal engine is more efficient than its synchronous counterpart in the case of HPC rich metadata processing, where more servers are involved and larger traversals are needed. Furthermore, the asynchronous traversal engine is more adaptive to different graph partitioning strategies.

  12. Active AirCore Sampling: Constraining Point Sources of Methane and Other Gases with Fixed Wing Unmanned Aerial Systems

    NASA Astrophysics Data System (ADS)

    Bent, J. D.; Sweeney, C.; Tans, P. P.; Newberger, T.; Higgs, J. A.; Wolter, S.

    2017-12-01

    Accurate estimates of point source gas emissions are essential for reconciling top-down and bottom-up greenhouse gas measurements, but sampling such sources is challenging. Remote sensing methods are limited by resolution and cloud cover; aircraft methods are limited by air traffic control clearances, and the need to properly determine boundary layer height. A new sampling approach leverages the ability of unmanned aerial systems (UAS) to measure all the way to the surface near the source of emissions, improving sample resolution, and reducing the need to characterize a wide downstream swath, or measure to the full height of the planetary boundary layer (PBL). The "Active-AirCore" sampler, currently under development, will fly on a fixed wing UAS in Class G airspace, spiraling from the surface to 1200 ft AGL around point sources such as leaking oil wells to measure methane, carbon dioxide and carbon monoxide. The sampler collects a 100-meter long sample "core" of air in an 1/8" passivated stainless steel tube. This "core" is run on a high-precision instrument shortly after the UAS is recovered. Sample values are mapped to a specific geographic location by cross-referencing GPS and flow/pressure metadata, and fluxes are quantified by applying Gauss's theorem to the data, mapped onto the spatial "cylinder" circumscribed by the UAS. The AirCore-Active builds off the sampling ability and analytical approach of the related AirCore sampler, which profiles the atmosphere passively using a balloon launch platform, but will add an active pumping capability needed for near-surface horizontal sampling applications. Here, we show design elements, laboratory and field test results for methane, describe the overall goals of the mission, and discuss how the platform can be adapted, with minimal effort, to measure other gas species.
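
    In discrete form, the Gauss's theorem calculation described above reduces to summing, over patches of the sampling cylinder, the concentration enhancement above background times the outward wind component times the patch area. The numpy sketch below uses made-up numbers purely to show the arithmetic; it is not the project's processing code.

```python
# Discrete form of a Gauss's-theorem flux estimate over the sampling cylinder:
# sum of (enhancement above background) x (outward wind component) x (patch
# area), converted from mole fraction to mass. All numbers are made up.
import numpy as np

R = 150.0                       # cylinder radius around the source, m (example)
dz, dtheta = 20.0, np.deg2rad(10.0)
patch_area = R * dtheta * dz    # area of one cylindrical surface patch, m^2

ch4_ppb = np.array([1950.0, 2100.0, 2300.0, 2050.0])   # sampled mole fractions
background_ppb = 1900.0
u_normal = np.array([1.2, 0.8, 1.5, 1.0])              # outward wind, m/s

n_air = 41.6          # mol of air per m^3 near the surface (approximate)
M_ch4 = 16.04e-3      # kg per mol of CH4

enhancement = (ch4_ppb - background_ppb) * 1e-9        # mol CH4 per mol air
flux_kg_s = np.sum(enhancement * n_air * M_ch4 * u_normal * patch_area)
print(f"estimated source strength: {flux_kg_s:.4f} kg CH4/s")
```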

  13. Data Publication and Interoperability for Long Tail Researchers via the Open Data Repository's (ODR) Data Publisher.

    NASA Astrophysics Data System (ADS)

    Stone, N.; Lafuente, B.; Bristow, T.; Keller, R.; Downs, R. T.; Blake, D. F.; Fonda, M.; Pires, A.

    2016-12-01

    Working primarily with astrobiology researchers at NASA Ames, the Open Data Repository (ODR) has been conducting a software pilot to meet the varying needs of this multidisciplinary community. Astrobiology researchers often have small communities or operate individually with unique data sets that don't easily fit into existing database structures. The ODR constructed its Data Publisher software to allow researchers to create databases with common metadata structures and subsequently extend them to meet their individual needs and data requirements. The software accomplishes these tasks through a web-based interface that allows collaborative creation and revision of common metadata templates and individual extensions to these templates for custom data sets. This allows researchers to search disparate datasets based on common metadata established through the metadata tools, but still facilitates distinct analyses and data that may be stored alongside the required common metadata. The software produces web pages that can be made publicly available at the researcher's discretion so that users may search and browse the data in an effort to make interoperability and data discovery a human-friendly task while also providing semantic data for machine-based discovery. Once relevant data has been identified, researchers can utilize the built-in application programming interface (API) that exposes the data for machine-based consumption and integration with existing data analysis tools (e.g. R, MATLAB, Project Jupyter - http://jupyter.org). The current evolution of the project has created the Astrobiology Habitable Environments Database (AHED)[1] which provides an interface to databases connected through a common metadata core. In the next project phase, the goal is for small research teams and groups to be self-sufficient in publishing their research data to meet funding mandates and academic requirements as well as fostering increased data discovery and interoperability through human-readable and machine-readable interfaces. This project is supported by the Science-Enabling Research Activity (SERA) and NASA NNX11AP82A, MSL. [1] B. Lafuente et al. (2016) AGU, submitted.

  14. Archive of sediment data from vibracores collected in 2010 offshore of the Mississippi barrier islands

    USGS Publications Warehouse

    Kelso, Kyle W.; Flocks, James G.

    2015-01-01

    Selection of the core site locations was based on geophysical surveys conducted around the islands from 2008 to 2010. The surveys, using acoustic systems to image and interpret the near-surface stratigraphy, were conducted to investigate the geologic controls on island evolution. This data series serves as an archive of sediment data collected from August to September 2010, offshore of the Mississippi barrier islands. Data products, including descriptive core logs, core photographs, results of sediment grain-size analyses, sample location maps, and geographic information system (GIS) data files with accompanying formal Federal Geographic Data Committee (FGDC) metadata can be downloaded from the data products and downloads page.

  15. Process Architecture for Managing Digital Object Identifiers

    NASA Astrophysics Data System (ADS)

    Wanchoo, L.; James, N.; Stolte, E.

    2014-12-01

    In 2010, NASA's Earth Science Data and Information System (ESDIS) Project implemented a process for registering Digital Object Identifiers (DOIs) for data products distributed by the Earth Observing System Data and Information System (EOSDIS). For the first three years, ESDIS evolved the process by involving the data provider community in the development of procedures for creating and assigning DOIs and of guidelines for the landing pages. To accomplish this, ESDIS established two DOI User Working Groups: one for reviewing the DOI process, whose recommendations were submitted to ESDIS in February 2014, and the other recently tasked to review and further develop DOI landing page guidelines for ESDIS approval by the end of 2014. ESDIS has recently upgraded the DOI system from a manually driven system to one that largely automates the DOI process. The new automated features include: a) reviewing the DOI metadata, b) assigning an opaque DOI name if the data provider chooses, and c) reserving, registering, and updating the DOIs. The flexibility of reserving the DOI allows data providers to embed and test the DOI in the data product metadata before formally registering it with EZID. The DOI update process allows any DOI metadata to be changed except the DOI name, which can only be changed if the DOI has not yet been registered. Currently, ESDIS has processed a total of 557 DOIs, of which 379 are registered with EZID and 178 are reserved with ESDIS. The DOI incorporates several metadata elements that effectively identify the data product and the source of availability. Of these elements, the Uniform Resource Locator (URL) attribute has the very important function of identifying the landing page, which describes the data product. ESDIS, in consultation with data providers in the Earth Science community, is currently developing landing page guidelines that specify the key data product descriptive elements to be included on each data product's landing page. This poster will describe in detail the unique automated process and underlying system implemented by ESDIS for registering DOIs, as well as some of the lessons learned from the development of the process. In addition, it will summarize the recommendations made by the DOI Process and DOI Landing Page User Working Groups, and the procedures developed for implementing those recommendations.

  16. Data Publishing and Sharing Via the THREDDS Data Repository

    NASA Astrophysics Data System (ADS)

    Wilson, A.; Caron, J.; Davis, E.; Baltzer, T.

    2007-12-01

    The terms "Team Science" and "Networked Science" have been coined to describe a virtual organization of researchers tied together by some intellectual challenge, but often located in different organizations and locations. A critical component of these endeavors is the publishing and sharing of content, including scientific data. Imagine pointing your web browser to a web page that interactively lets you upload data and metadata to a repository residing on a remote server, which can then be accessed by others in a secure fashion via the web. While any content can be added to this repository, it is designed particularly for storing and sharing scientific data and metadata. Server support includes uploading of data files that can subsequently be subsetted, aggregated, and served in NetCDF or other scientific data formats. Metadata can be associated with the data and interactively edited. The THREDDS Data Repository (TDR) is a server that provides client-initiated, on-demand, location-transparent storage for data of any type, which can then be served by the THREDDS Data Server (TDS). The TDR provides functionality to: * securely store and "own" data files and associated metadata * upload files via HTTP and GridFTP * upload a collection of data as a single file * modify and restructure repository contents * incorporate metadata provided by the user * generate additional metadata programmatically * edit individual metadata elements. The TDR can exist separately from a TDS, serving content via HTTP. It can also work in conjunction with the TDS, which includes functionality to provide: * access to data in a variety of formats via OPeNDAP, the OGC Web Coverage Service (for gridded datasets), and bulk HTTP file transfer * a NetCDF view of datasets in NetCDF, OPeNDAP, HDF-5, GRIB, and NEXRAD formats * serving of very large volume datasets, such as NEXRAD radar * aggregation into virtual datasets * subsetting via OPeNDAP and NetCDF subsetting services. This talk will discuss TDR/TDS capabilities as well as how users can install this software to create their own repositories.
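
    The practical effect of serving data via OPeNDAP, as described above, is that a remote dataset can be opened and subset as though it were a local netCDF file. A minimal sketch follows; the TDS URL and variable names are hypothetical.

```python
# Minimal sketch of OPeNDAP access to a THREDDS-served dataset: the remote file
# is opened like a local netCDF file and only the requested subset is fetched.
# The URL and variable names are hypothetical placeholders.
from netCDF4 import Dataset

url = "https://example.org/thredds/dodsC/archive/temperature.nc"  # placeholder

with Dataset(url) as ds:                 # requires netCDF4 built with DAP support
    tas = ds.variables["air_temperature"]
    subset = tas[0, 10:20, 10:20]        # server-side subsetting via OPeNDAP
    print(subset.shape, tas.units)
```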

  17. Database integration in a multimedia-modeling environment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dorow, Kevin E.

    2002-09-02

    Integration of data from disparate remote sources has direct applicability to modeling, which can support Brownfield assessments. To accomplish this task, a data integration framework needs to be established. A key element in this framework is the metadata that creates the relationship between the pieces of information that are important in the multimedia modeling environment and the information that is stored in the remote data source. The design philosophy is to allow modelers and database owners to collaborate by defining this metadata in such a way that allows interaction between their components. The main parts of this framework include tools to facilitate metadata definition, database extraction plan creation, automated extraction plan execution and data retrieval, and a central clearing house for metadata and modeling/database resources. Cross-platform compatibility (using Java) and standard communications protocols (HTTP/HTTPS) allow these parts to run in a wide variety of computing environments (local area networks, the Internet, etc.), and, therefore, this framework provides many benefits. Because of the specific data relationships described in the metadata, the amount of data that has to be transferred is kept to a minimum (only the data that fulfill a specific request are provided, as opposed to transferring the complete contents of a data source). This allows for real-time data extraction from the actual source. Also, the framework sets up collaborative responsibilities such that the different types of participants have control over the areas in which they have domain knowledge: the modelers are responsible for defining the data relevant to their models, while the database owners are responsible for mapping the contents of the database using the metadata definitions. Finally, the data extraction mechanism allows for the ability to control access to the data and what data are made available.
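
    The role of the metadata in such a framework (mapping model inputs to fields in a remote source so that only the needed data are transferred) can be sketched as follows; the table, column, and parameter names are invented for illustration and this is not the framework's actual code.

```python
# Sketch of metadata-driven extraction: the mapping below is the "metadata" that
# ties model input names to a remote database's tables/columns, and a SELECT is
# generated so only the data needed by the model are transferred.
# Table, column, and parameter names are invented for illustration.
import sqlite3

FIELD_MAP = {   # model input name -> (table, column) in the remote source
    "soil_ph":       ("site_chemistry", "ph"),
    "arsenic_mg_kg": ("site_chemistry", "as_conc"),
}

def build_extraction_plan(model_inputs, site_id):
    table = FIELD_MAP[model_inputs[0]][0]
    columns = ", ".join(FIELD_MAP[name][1] for name in model_inputs)
    return f"SELECT {columns} FROM {table} WHERE site_id = ?", (site_id,)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE site_chemistry (site_id TEXT, ph REAL, as_conc REAL)")
conn.execute("INSERT INTO site_chemistry VALUES ('BF-007', 6.4, 12.3)")

sql, params = build_extraction_plan(["soil_ph", "arsenic_mg_kg"], "BF-007")
print(conn.execute(sql, params).fetchone())   # only the requested fields move
```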

  18. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Buttler, D J

    The Java Metadata Facility is introduced by Java Specification Request (JSR) 175 [1], and incorporated into the Java language specification [2] in version 1.5 of the language. The specification allows annotations on Java program elements: classes, interfaces, methods, and fields. Annotations give programmers a uniform way to add metadata to program elements that can be used by code checkers, code generators, or other compile-time or runtime components. Annotations are defined by annotation types. These are defined the same way as interfaces, but with the symbol @ preceding the interface keyword. There are additional restrictions on defining annotation types: (1) They cannot be generic; (2) They cannot extend other annotation types or interfaces; (3) Methods cannot have any parameters; (4) Methods cannot have type parameters; (5) Methods cannot throw exceptions; and (6) The return type of methods of an annotation type must be a primitive, a String, a Class, an annotation type, or an array, where the type of the array is restricted to one of the four allowed types. See [2] for additional restrictions and syntax. The methods of an annotation type define the elements that may be used to parameterize the annotation in code. Annotation types may have default values for any of their elements. For example, an annotation type that specifies a defect report could initialize an element defining the defect outcome to "submitted". Annotations may also have zero elements. This could be used to indicate serializability for a class (as opposed to the current Serializable interface).
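
    A rough Python analogue of the same idea (machine-readable metadata attached to program elements and read later by tools) is a decorator that stores attributes on a function or class. The sketch below shows that analogue only; it is not JSR-175 annotation syntax, and the names are invented.

```python
# Python analogue of annotating program elements with metadata (not Java or
# JSR-175 syntax): a decorator attaches a metadata record that tools can read.
def defect(id, outcome="submitted"):       # default value, like an annotation element
    def attach(obj):
        obj.__defect__ = {"id": id, "outcome": outcome}
        return obj
    return attach

@defect(id="DR-4711")
def parse_header(block):
    return block.split(":", 1)

# A code checker or generator would introspect the metadata later:
print(parse_header.__defect__)    # {'id': 'DR-4711', 'outcome': 'submitted'}
```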

  19. SAS- Semantic Annotation Service for Geoscience resources on the web

    NASA Astrophysics Data System (ADS)

    Elag, M.; Kumar, P.; Marini, L.; Li, R.; Jiang, P.

    2015-12-01

    There is a growing need for increased integration across the data and model resources that are disseminated on the web to advance their reuse across different earth science applications. Meaningful reuse of resources requires semantic metadata to realize the semantic web vision of allowing pragmatic linkage and integration among resources. Semantic metadata associates standard metadata with resources to turn them into semantically enabled resources on the web. However, the lack of a common standardized metadata framework, as well as the uncoordinated use of metadata fields across different geo-information systems, has led to a situation in which standards and related Standard Names abound. To address this need, we have designed SAS to provide a bridge between the core ontologies required to annotate resources and information systems in order to enable queries and analysis over annotations from a single environment (the web). SAS is one of the services provided by the Geosemantic framework, a decentralized semantic framework to support the integration between models and data and to allow semantically heterogeneous resources to interact with minimum human intervention. Here we present the design of SAS and demonstrate its application for annotating data and models. First we describe how predicates and their attributes are extracted from standards and ingested into the knowledge base of the Geosemantic framework. Then we illustrate the application of SAS in annotating data managed by SEAD and annotating simulation models that have a web interface. SAS is a step in a broader approach to raise the quality of geoscience data and models that are published on the web and to allow users to better search, access, and use the existing resources based on standard vocabularies that are encoded and published using semantic technologies.

  20. panMetaDocs, eSciDoc, and DOIDB - an infrastructure for the curation and publication of file-based datasets for 'GFZ Data Services'

    NASA Astrophysics Data System (ADS)

    Ulbricht, Damian; Elger, Kirsten; Bertelmann, Roland; Klump, Jens

    2016-04-01

    With the foundation of DataCite in 2009 and the technical infrastructure installed in the last six years, it has become very easy to create citable dataset DOIs. Nowadays, dataset DOIs are increasingly accepted and required by journals in the reference lists of manuscripts. In addition, DataCite provides usage statistics [1] of assigned DOIs and offers a public search API to make research data count. By linking related information to the data, they become more useful for future generations of scientists. For this purpose, several identifier systems, such as ISBN for books, ISSN for journals, DOI for articles or related data, ORCID for authors, and IGSN for physical samples, can be attached to DOIs using the DataCite metadata schema [2]. While these are good preconditions for publishing data, free and open solutions that help with the curation of data, the publication of research data, and the assignment of DOIs in a single piece of software seem to be rare. At GFZ Potsdam we built a modular software stack that is made of several free and open software solutions, and we established 'GFZ Data Services'. 'GFZ Data Services' provides storage, a metadata editor for publication, and a facility to moderate minted DOIs. All software solutions are connected through web APIs, which makes it possible to reuse and integrate established software. The core component of 'GFZ Data Services' is an eSciDoc [3] middleware that is used as central storage and has been designed along the OAIS reference model for digital preservation. Thus, data are stored in self-contained packages that are made of binary file-based data and XML-based metadata. The eSciDoc infrastructure provides access control to data and it is able to handle half-open datasets, which is useful in embargo situations when a subset of the research data are released after an adequate period. The data exchange platform panMetaDocs [4] makes use of eSciDoc's REST API to upload file-based data into eSciDoc and uses a metadata editor [5] to annotate the files with metadata. The metadata editor has a user-friendly interface with nominal lists, extensive explanations, and an interactive mapping tool to provide assistance to scientists describing the data. It is possible to deposit metadata templates to fill certain fields with default values. The metadata editor generates metadata in the schemas ISO 19139, NASA GCMD DIF, and DataCite, and could be extended for other schemas. panMetaDocs is able to mint dataset DOIs through DOIDB, which is our component to moderate dataset DOIs issued through 'GFZ Data Services'. DOIDB accepts metadata in the schemas ISO 19139, DIF, and DataCite. In addition, DOIDB provides an OAI-PMH interface to disseminate all deposited metadata to data portals. The presentation of datasets on DOI landing pages is done through XSLT stylesheet transformation of the XML-based metadata. The landing pages have been designed to meet the needs of scientists. We are able to render the metadata to different layouts. Furthermore, additional information about datasets and publications is assembled into the webpage by querying public databases on the internet. The work presented here will focus on technical details of the software stack. [1] http://stats.datacite.org [2] http://www.dlib.org/dlib/january11/starr/01starr.html [3] http://www.escidoc.org [4] http://panmetadocs.sf.net [5] http://github.com/ulbricht
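
    A stripped-down example of the kind of DataCite record such a metadata editor emits is sketched below with the Python standard library; only a handful of the schema's fields appear, and the DOI, creator, and title are placeholders rather than real GFZ records.

```python
# Stripped-down sketch of a DataCite-style metadata record built with the
# standard library. Only a few fields are shown; the DOI, creator, and title
# are placeholders, and real records must validate against the full schema.
import xml.etree.ElementTree as ET

NS = "http://datacite.org/schema/kernel-4"
ET.register_namespace("", NS)

resource = ET.Element(f"{{{NS}}}resource")
ident = ET.SubElement(resource, f"{{{NS}}}identifier", identifierType="DOI")
ident.text = "10.1234/example.2016.001"              # placeholder DOI

creators = ET.SubElement(resource, f"{{{NS}}}creators")
creator = ET.SubElement(creators, f"{{{NS}}}creator")
ET.SubElement(creator, f"{{{NS}}}creatorName").text = "Doe, Jane"

titles = ET.SubElement(resource, f"{{{NS}}}titles")
ET.SubElement(titles, f"{{{NS}}}title").text = "Example GPS displacement dataset"

ET.SubElement(resource, f"{{{NS}}}publisher").text = "GFZ Data Services"
ET.SubElement(resource, f"{{{NS}}}publicationYear").text = "2016"

print(ET.tostring(resource, encoding="unicode"))
```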

  1. Data Discovery of Big and Diverse Climate Change Datasets - Options, Practices and Challenges

    NASA Astrophysics Data System (ADS)

    Palanisamy, G.; Boden, T.; McCord, R. A.; Frame, M. T.

    2013-12-01

    Developing data search tools is a very common, but often confusing, task for most data-intensive scientific projects. These search interfaces need to be continually improved to handle the ever-increasing diversity and volume of data collections. Many aspects determine the type of search tool a project needs to provide to its user community, including the number of datasets, the amount and consistency of discovery metadata, ancillary information such as the availability of quality information and provenance, and the availability of similar datasets from other distributed sources. The Environmental Data Science and Systems (EDSS) group within the Environmental Science Division at the Oak Ridge National Laboratory has a long history of successfully managing diverse and large observational datasets for various scientific programs via data centers such as DOE's Atmospheric Radiation Measurement Program (ARM), DOE's Carbon Dioxide Information Analysis Center (CDIAC), USGS's Core Science Analytics and Synthesis (CSAS) metadata Clearinghouse, and NASA's Distributed Active Archive Center (ORNL DAAC). This talk will showcase some of the recent developments for improving data discovery within these centers. The DOE ARM program recently developed a data discovery tool which allows users to search and discover over 4000 observational datasets. These datasets are key to research efforts related to global climate change. The ARM discovery tool features many new functions such as filtered and faceted search logic, multi-pass data selection, filtering of data based on data quality, graphical views of data quality and availability, direct access to data quality reports, and data plots. The ARM Archive also provides discovery metadata to broader metadata clearinghouses such as ESGF, IASOA, and GOS. In addition to the new interface, ARM is also currently working on providing DOI metadata records to publishers such as Thomson Reuters and Elsevier. The ARM program also provides a standards-based online metadata editor (OME) for PIs to submit their data to the ARM Data Archive. The USGS CSAS metadata Clearinghouse aggregates metadata records from several USGS projects and other partner organizations. The Clearinghouse allows users to search and discover over 100,000 biological and ecological datasets from a single web portal. The Clearinghouse has also enabled new data discovery functions such as enhanced geospatial searches based on land and ocean classifications, metadata completeness rankings, data linkage via digital object identifiers (DOIs), and semantically enhanced keyword searches. The Clearinghouse is also currently working on a dashboard which allows data providers to view statistics such as the number of their records accessed via the Clearinghouse, the most popular keywords, a metadata quality report, and a DOI creation service. The Clearinghouse also publishes metadata records to broader portals such as NSF DataONE and Data.gov. The author will also present how these capabilities are being reused by recent and upcoming data centers such as DOE's NGEE-Arctic project. References: [1] Devarakonda, R., Palanisamy, G., Wilson, B. E., & Green, J. M. (2010). Mercury: reusable metadata management, data discovery and access system. Earth Science Informatics, 3(1-2), 87-94. [2] Devarakonda, R., Shrestha, B., Palanisamy, G., Hook, L., Killeffer, T., Krassovski, M., ... & Frame, M. (2014, October). OME: Tool for generating and managing metadata to handle BigData. In BigData Conference (pp. 8-10).
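
    The filtered and faceted search logic described for the ARM discovery tool can be illustrated with a minimal sketch. The record fields and dataset names below are hypothetical, not the actual ARM catalog schema.

    ```python
    # Minimal faceted-search sketch: filter dataset records on facet values and
    # report the remaining facet counts. Field names are illustrative only.
    from collections import Counter

    DATASETS = [
        {"name": "met_surface_2013", "site": "SiteA", "instrument": "MET", "quality_flag": "good"},
        {"name": "ceilometer_2013", "site": "SiteB", "instrument": "CEIL", "quality_flag": "suspect"},
        {"name": "ceilometer_2012", "site": "SiteA", "instrument": "CEIL", "quality_flag": "good"},
    ]

    def faceted_search(records, **filters):
        """Return records matching all facet filters plus counts for each facet value."""
        hits = [r for r in records if all(r.get(k) == v for k, v in filters.items())]
        facets = {field: Counter(r[field] for r in hits)
                  for field in ("site", "instrument", "quality_flag")}
        return hits, facets

    if __name__ == "__main__":
        hits, facets = faceted_search(DATASETS, site="SiteA", quality_flag="good")
        print([r["name"] for r in hits])   # datasets passing the current filters
        print(facets["instrument"])        # remaining counts used to refine the search
    ```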

  2. HydroShare: An online, collaborative environment for the sharing of hydrologic data and models (Invited)

    NASA Astrophysics Data System (ADS)

    Tarboton, D. G.; Idaszak, R.; Horsburgh, J. S.; Ames, D.; Goodall, J. L.; Band, L. E.; Merwade, V.; Couch, A.; Arrigo, J.; Hooper, R. P.; Valentine, D. W.; Maidment, D. R.

    2013-12-01

    HydroShare is an online, collaborative system being developed for sharing hydrologic data and models. The goal of HydroShare is to enable scientists to easily discover and access data and models, retrieve them to their desktop or perform analyses in a distributed computing environment that may include grid, cloud or high performance computing model instances as necessary. Scientists may also publish outcomes (data, results or models) into HydroShare, using the system as a collaboration platform for sharing data, models and analyses. HydroShare is expanding the data sharing capability of the CUAHSI Hydrologic Information System by broadening the classes of data accommodated, creating new capability to share models and model components, and taking advantage of emerging social media functionality to enhance information about and collaboration around hydrologic data and models. One of the fundamental concepts in HydroShare is that of a Resource. All content is represented using a Resource Data Model that separates system and science metadata and has elements common to all resources as well as elements specific to the types of resources HydroShare will support. These will include different data types used in the hydrology community and models and workflows that require metadata on execution functionality. HydroShare will use the integrated Rule-Oriented Data System (iRODS) to manage federated data content and perform rule-based background actions on data and model resources, including parsing to generate metadata catalog information and the execution of models and workflows. This presentation will introduce the HydroShare functionality developed to date, describe key elements of the Resource Data Model and outline the roadmap for future development.
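
    A resource data model that separates system metadata from science metadata, with a type-specific extension, might be sketched as follows. The field names are illustrative and are not the actual HydroShare Resource Data Model.

    ```python
    # Sketch of a resource with shared science/system metadata plus type-specific elements.
    from dataclasses import dataclass, field
    from typing import Dict, List

    @dataclass
    class SystemMetadata:       # managed by the repository, not the author
        resource_id: str
        owner: str
        created: str
        storage_path: str

    @dataclass
    class ScienceMetadata:      # Dublin Core-style elements common to all resources
        title: str
        creators: List[str]
        abstract: str
        keywords: List[str]

    @dataclass
    class Resource:
        system: SystemMetadata
        science: ScienceMetadata
        # elements specific to the resource type, e.g. model execution requirements
        type_specific: Dict[str, str] = field(default_factory=dict)

    model_resource = Resource(
        SystemMetadata("abc123", "owner1", "2013-10-01", "/federated/zone/abc123"),
        ScienceMetadata("Example watershed model", ["Author, A."], "Short abstract.", ["hydrology"]),
        {"model_program": "ExampleModel", "required_inputs": "DEM, land cover"},
    )
    print(model_resource.science.title, "->", model_resource.type_specific["model_program"])
    ```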

  3. Mesh Oriented datABase

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tautges, Timothy J.

    MOAB is a component for representing and evaluating mesh data. MOAB can store structured and unstructured mesh, consisting of elements in the finite element "zoo". The functional interface to MOAB is simple yet powerful, allowing the representation of many types of metadata commonly found on the mesh. MOAB is optimized for efficiency in space and time, based on access to mesh in chunks rather than through individual entities, while also being versatile enough to support individual entity access. The MOAB data model consists of a mesh interface instance, mesh entities (vertices and elements), sets, and tags. Entities are addressed through handles rather than pointers, to allow the underlying representation of an entity to change without changing the handle to that entity. Sets are arbitrary groupings of mesh entities and other sets. Sets also support parent/child relationships as a relation distinct from sets containing other sets. The directed graph provided by set parent/child relationships is useful for modeling topological relations from a geometric model or other metadata. Tags are named data which can be assigned to the mesh as a whole, individual entities, or sets. Tags are a mechanism for attaching data to individual entities, and sets are a mechanism for describing relations between entities; the combination of these two mechanisms is a powerful yet simple interface for representing metadata or application-specific data. For example, sets and tags can be used together to describe geometric topology, boundary condition, and inter-processor interface groupings in a mesh. MOAB is used in several ways in various applications. MOAB serves as the underlying mesh data representation in the VERDE mesh verification code. MOAB can also be used as a mesh input mechanism, using mesh readers included with MOAB, or as a translator between mesh formats, using readers and writers included with MOAB.
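
    The handle/entity/set/tag data model described above can be illustrated conceptually with the sketch below. This is a Python illustration of the idea only, not the MOAB C++ API.

    ```python
    # Conceptual sketch: entities addressed by handles, sets that group entities and
    # carry parent/child links, and named tags attached to entities or sets.
    class MeshDB:
        def __init__(self):
            self._entities = {}   # handle -> underlying representation (may change freely)
            self._sets = {}       # set handle -> {"members": set(), "children": set()}
            self._tags = {}       # (tag name, handle) -> value
            self._next = 1

        def _new_handle(self):
            h = self._next
            self._next += 1
            return h

        def create_vertex(self, xyz):
            h = self._new_handle()
            self._entities[h] = ("vertex", xyz)
            return h              # callers hold handles, never raw storage

        def create_set(self, members=()):
            h = self._new_handle()
            self._sets[h] = {"members": set(members), "children": set()}
            return h

        def add_child(self, parent, child):
            # parent/child is a relation distinct from set containment
            self._sets[parent]["children"].add(child)

        def tag_set(self, name, handle, value):
            self._tags[(name, handle)] = value   # attach named data to an entity or set

    db = MeshDB()
    verts = [db.create_vertex((float(i), 0.0, 0.0)) for i in range(3)]
    bc = db.create_set(verts)                    # a set grouping boundary vertices
    db.tag_set("BOUNDARY_CONDITION", bc, 1)      # sets + tags together describe a BC grouping
    ```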

  4. Cores to the rescue: how old cores enable new science

    NASA Astrophysics Data System (ADS)

    Ito, E.; Noren, A. J.; Brady, K.

    2016-12-01

    The value of archiving scientific specimens and collections for the purpose of enabling further research using new analytical techniques, resolving conflicting results, or repurposing them for entirely new research is often discussed in abstract terms. We all agree that samples with adequate metadata ought to be archived systematically for easy access, kept for a long time, and stored under optimal conditions. And yet, as storage space fills, there is a temptation to cull the collection, or, when a researcher retires, to discard the collection unless the researcher manages to make his or her own arrangement for the collection to be accessioned elsewhere. Nobody has done anything with these samples in over 20 years! Who would want them? It turns out that plenty of us do want them, if we know how to find them and if they have sufficient metadata to assess past work and suitability for new analyses. The LacCore collection holds over 33 km of core from >6700 sites in diverse geographic locations worldwide, with samples collected as early as the 1950s. From these materials, there are many examples to illustrate the scientific value of archiving geologic samples. One example that benefitted Ito personally was the set of cores from Lakes Mirabad and Zeribar, Iran, acquired in 1963 by Herb Wright and his associates. Several doctoral and postdoctoral students generated and published paleoecological reconstructions based on cladocerans, diatoms, pollen, or plant macrofossils, mostly between 1963 and 1967. The cores were resampled in the 1990s by a student jointly advised by Wright and Ito for oxygen isotope analysis of endogenic calcite. The results were profitably compared with the pollen records and published in 2001 and 2006. From 1979 until very recently, visiting Iran for fieldwork was not allowed for US scientists. Other examples will be given to further illustrate the power of archived samples to advance science.

  5. Exploring Research Contributions of the North American Carbon Program using Google Earth and Google Map

    NASA Astrophysics Data System (ADS)

    Griffith, P. C.; Wilcox, L. E.; Morrell, A.

    2009-12-01

    The central objective of the North American Carbon Program (NACP), a core element of the US Global Change Research Program, is to quantify the sources and sinks of carbon dioxide, carbon monoxide, and methane in North America and adjacent ocean regions. The NACP consists of a wide range of investigators at universities and federal research centers. Although many of these investigators have worked together in the past, many have had few prior interactions and may not know of similar work within their knowledge domains, much less across the diversity of environments and scientific approaches in the Program. Coordinating interactions and sharing data are major challenges in conducting the NACP. The Google Earth and Google Map Collections on the NACP website (www.nacarbon.org) provide a geographical view of the research products contributed by each core and affiliated NACP project. Other relevant data sources (e.g., AERONET, LVIS) can also be browsed in spatial context with NACP contributions. Each contribution links to project-oriented metadata, or "project profiles", that provide a greater understanding of the scientific and social context of each dataset and are an important means of communicating within the NACP and to the larger carbon cycle science community. Project profiles store information such as a project's title, leaders, participants, an abstract, keywords, funding agencies, associated intensive campaigns, expected data products, data needs, publications, and URLs to associated data centers, datasets, and metadata. Data products are research contributions that include biometric inventories, flux tower estimates, remote sensing land cover products, tools, services, and model inputs/outputs. Project leaders have been asked to identify these contributions to the site level whenever possible, either through a simple latitude/longitude pair or by uploading a KML, KMZ, or shapefile. Project leaders may select custom icons to graphically categorize their contributions; for example, a ship for oceanographic samples or a tower for tower measurements. After post-processing, research contributions are added to the NACP Google Earth and Google Map Collection to facilitate discovery and use in synthesis activities of the Program.
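
    Turning a site-level contribution (a latitude/longitude pair plus a link to its project profile) into a KML placemark can be sketched as below. The profile fields and URL are hypothetical; only the KML element names follow the OGC KML 2.2 schema.

    ```python
    # Sketch: emit a single KML Placemark for one research contribution.
    import xml.etree.ElementTree as ET

    def placemark_kml(name, lat, lon, profile_url):
        kml = ET.Element("kml", xmlns="http://www.opengis.net/kml/2.2")
        doc = ET.SubElement(kml, "Document")
        pm = ET.SubElement(doc, "Placemark")
        ET.SubElement(pm, "name").text = name
        ET.SubElement(pm, "description").text = f"Project profile: {profile_url}"
        point = ET.SubElement(pm, "Point")
        # KML coordinates are longitude,latitude[,altitude]
        ET.SubElement(point, "coordinates").text = f"{lon},{lat}"
        return ET.tostring(kml, encoding="unicode")

    print(placemark_kml("Flux tower contribution", 45.95, -90.27,
                        "https://example.org/profiles/123"))
    ```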

  6. Publishing datasets with eSciDoc and panMetaDocs

    NASA Astrophysics Data System (ADS)

    Ulbricht, D.; Klump, J.; Bertelmann, R.

    2012-04-01

    Currently several research institutions worldwide undertake considerable efforts to have their scientific datasets published and to syndicate them to data portals as extensively described objects identified by a persistent identifier. This is done to foster the reuse of data, to make scientific work more transparent, and to create a citable entity that can be referenced unambiguously in written publications. GFZ Potsdam established a publishing workflow for file-based research datasets. Key software components are an eSciDoc infrastructure [1] and multiple instances of the data curation tool panMetaDocs [2]. The eSciDoc repository holds data objects and their associated metadata in container objects, called eSciDoc items. A key metadata element in this context is the publication status of the referenced dataset. PanMetaDocs, which is based on PanMetaWorks [3], is a PHP-based web application that allows data to be described with any XML-based metadata schema. The metadata fields can be filled with static or dynamic content to reduce the number of fields that require manual entry to a minimum and to make use of contextual information in a project setting. Access rights can be applied to set the visibility of datasets to other project members and to allow collaboration on datasets, notification about datasets (RSS), and interaction with the internal messaging system inherited from panMetaWorks. When a dataset is to be published, panMetaDocs allows the publication status of the eSciDoc item to be changed from "private" to "submitted" and the dataset to be prepared for verification by an external reviewer. After quality checks, the item publication status can be changed to "published". This makes the data and metadata available worldwide through the internet. PanMetaDocs is developed as an eSciDoc application. It is an easy-to-use graphical user interface to eSciDoc items, their data and metadata. It is also an application supporting a DOI publication agent during the process of publishing scientific datasets as electronic data supplements to research papers. Publication of research manuscripts already has a well-established workflow; dataset publication shares junctures with this and other processes and involves several parties. Activities of the author, the reviewer, the print publisher, and the data publisher have to be coordinated into a common data publication workflow. The case of data publication at GFZ Potsdam displays some specifics, e.g. the DOIDB webservice. The DOIDB is a proxy service at GFZ for the DataCite [4] DOI registration and its metadata store. DOIDB provides a local summary of the dataset DOIs registered through GFZ as a publication agent. An additional use case for the DOIDB is its ability to enrich the DataCite metadata with additional custom attributes, like a geographic reference in a DIF record. These attributes are at the moment not available in the DataCite metadata schema but would be valuable elements for the compilation of data catalogues in the earth sciences and for dissemination of catalogue data via OAI-PMH. [1] http://www.escidoc.org, eSciDoc, FIZ Karlsruhe, Germany [2] http://panmetadocs.sf.net, panMetaDocs, GFZ Potsdam, Germany [3] http://metaworks.pangaea.de, panMetaWorks, Dr. R. Huber, MARUM, Univ. Bremen, Germany [4] http://www.datacite.org
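
    The publication status workflow described above (private, then submitted for review, then published) can be sketched as a small state machine. The transition rules below are a simplified illustration, not the eSciDoc/panMetaDocs implementation.

    ```python
    # Sketch of the dataset publication status workflow: private -> submitted -> published.
    ALLOWED = {
        "private":   {"submitted"},            # PI completes metadata and submits for review
        "submitted": {"published", "private"}, # reviewer accepts, or returns for rework
        "published": set(),                    # published items stay public and citable
    }

    def change_status(item, new_status):
        current = item["status"]
        if new_status not in ALLOWED[current]:
            raise ValueError(f"illegal transition {current} -> {new_status}")
        item["status"] = new_status
        return item

    item = {"id": "dataset-001", "status": "private"}
    change_status(item, "submitted")   # hand over to an external reviewer
    change_status(item, "published")   # after quality checks, data and metadata go public
    print(item)
    ```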

  7. Evaluating the Quality and Usability of Open Data for Public Health Research: A Systematic Review of Data Offerings on 3 Open Data Platforms.

    PubMed

    Martin, Erika G; Law, Jennie; Ran, Weijia; Helbig, Natalie; Birkhead, Guthrie S

    Government datasets are newly available on open data platforms that are publicly accessible, available in nonproprietary formats, free of charge, and with unlimited use and distribution rights. They provide opportunities for health research, but their quality and usability are unknown. The objectives were to describe available open health data, to identify whether data are presented in a way that is aligned with best practices and usable for researchers, and to examine differences across platforms. Two reviewers systematically reviewed a random sample of data offerings on three open health data platforms at the federal, New York State, and New York City levels, using a standard coding guide: NYC OpenData (New York City, all offerings, n = 37), Health Data NY (New York State, 25% sample, n = 71), and HealthData.gov (US Department of Health and Human Services, 5% sample, n = 75). Data characteristics from the coding guide were aggregated into summary indices for intrinsic data quality, contextual data quality, adherence to the Dublin Core metadata standards, and the 5-star open data deployment scheme. One quarter of the offerings were structured datasets; other presentation styles included charts (14.7%), documents describing data (12.0%), maps (10.9%), and query tools (7.7%). Health Data NY had higher intrinsic data quality (P < .001), contextual data quality (P < .001), and Dublin Core metadata standards adherence (P < .001). All met the basic "web availability" open data standard; fewer met the higher standard of being "hyperlinked to other data." Although all platforms need improvement, they already provide readily available data for health research. Sustained effort on improving open data websites and metadata is necessary for ensuring researchers use these data, thereby increasing their research value.
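
    Two of the summary indices mentioned above, Dublin Core adherence and the 5-star open data level, can be sketched roughly as follows. The offering dictionary and thresholds are hypothetical and are not the study's actual coding guide.

    ```python
    # Sketch: score one data offering for Dublin Core completeness and a simplified 5-star level.
    DUBLIN_CORE = ["title", "creator", "subject", "description", "publisher",
                   "contributor", "date", "type", "format", "identifier",
                   "source", "language", "relation", "coverage", "rights"]

    def dc_completeness(offering_metadata):
        """Fraction of the 15 Dublin Core elements that are populated."""
        present = sum(1 for e in DUBLIN_CORE if offering_metadata.get(e))
        return present / len(DUBLIN_CORE)

    def five_star_level(offering):
        """Very rough reading of the 5-star deployment scheme; levels are cumulative."""
        criteria = ["on_web", "structured", "nonproprietary", "uses_uris", "linked_to_other"]
        stars = 0
        for flag in criteria:
            if not offering.get(flag):
                break
            stars += 1
        return stars

    example = {"title": "Immunization rates", "publisher": "Health Dept", "format": "CSV"}
    print(round(dc_completeness(example), 2))                                   # e.g. 0.2
    print(five_star_level({"on_web": True, "structured": True, "nonproprietary": True}))  # 3
    ```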

  8. JAMSTEC DARWIN Database Assimilates GANSEKI and COEDO

    NASA Astrophysics Data System (ADS)

    Tomiyama, T.; Toyoda, Y.; Horikawa, H.; Sasaki, T.; Fukuda, K.; Hase, H.; Saito, H.

    2017-12-01

    Introduction: The Japan Agency for Marine-Earth Science and Technology (JAMSTEC) archives data and samples obtained by JAMSTEC research vessels and submersibles. As a common property of human society, the JAMSTEC archive is open to public users for scientific and educational purposes [1]. To publicize its data and samples online, JAMSTEC operates the NUUNKUI data sites [2], a group of several databases for various data and sample types. For years, data and metadata of JAMSTEC rock samples, sediment core samples, and cruise/dive observations were publicized through databases named GANSEKI, COEDO, and DARWIN, respectively. However, because they had different user interfaces and data structures, these services were somewhat confusing for unfamiliar users. The maintenance costs of multiple hardware and software systems were also problematic for sustaining services and continuous improvement. Database Integration: In 2017, GANSEKI, COEDO, and DARWIN were integrated into DARWIN+ [3]. The update also included implementation of a map-search function as a substitute for the closed portal site. Major functions of the previous systems were incorporated into the new system; users can perform complex searches by thumbnail browsing, map area, keyword filtering, and metadata constraints. As for data handling, the new system is more flexible, allowing the entry of a variety of additional data types. Data Management: After the DARWIN major update, the JAMSTEC data & sample team has been dealing with minor issues in individual sample data/metadata, which sometimes need manual modification to be transferred to the new system. Some new data sets, such as onboard sample photos and surface close-up photos of rock samples, are becoming available online. Geochemical data for sediment core samples are expected to be added in the near future. Reference: [1] http://www.jamstec.go.jp/e/database/data_policy.html [2] http://www.godac.jamstec.go.jp/jmedia/portal/e/ [3] http://www.godac.jamstec.go.jp/darwin/e/

  9. Component-Based Approach in Learning Management System Development

    ERIC Educational Resources Information Center

    Zaitseva, Larisa; Bule, Jekaterina; Makarov, Sergey

    2013-01-01

    The paper describes a component-based approach (CBA) to learning management system development. Learning objects as components of e-learning courses, and their metadata, are considered. The learning management system based on CBA being developed at Riga Technical University, namely its architecture, elements and possibilities, are…

  10. NASA's Earth Observing Data and Information System - Supporting Interoperability through a Scalable Architecture (Invited)

    NASA Astrophysics Data System (ADS)

    Mitchell, A. E.; Lowe, D. R.; Murphy, K. J.; Ramapriyan, H. K.

    2011-12-01

    Initiated in 1990, NASA's Earth Observing System Data and Information System (EOSDIS) is currently a petabyte-scale archive of data designed to receive, process, distribute and archive several terabytes of science data per day from NASA's Earth science missions. Comprised of 12 discipline specific data centers collocated with centers of science discipline expertise, EOSDIS manages over 6800 data products from many science disciplines and sources. NASA supports global climate change research by providing scalable open application layers to the EOSDIS distributed information framework. This allows many other value-added services to access NASA's vast Earth Science Collection and allows EOSDIS to interoperate with data archives from other domestic and international organizations. EOSDIS is committed to NASA's Data Policy of full and open sharing of Earth science data. As metadata is used in all aspects of NASA's Earth science data lifecycle, EOSDIS provides a spatial and temporal metadata registry and order broker called the EOS Clearing House (ECHO) that allows efficient search and access of cross domain data and services through the Reverb Client and Application Programmer Interfaces (APIs). Another core metadata component of EOSDIS is NASA's Global Change Master Directory (GCMD) which represents more than 25,000 Earth science data set and service descriptions from all over the world, covering subject areas within the Earth and environmental sciences. With inputs from the ECHO, GCMD and Soil Moisture Active Passive (SMAP) mission metadata models, EOSDIS is developing a NASA ISO 19115 Best Practices Convention. Adoption of an international metadata standard enables a far greater level of interoperability among national and international data products. NASA recently concluded a 'Metadata Harmony Study' of EOSDIS metadata capabilities/processes of ECHO and NASA's Global Change Master Directory (GCMD), to evaluate opportunities for improved data access and use, reduce efforts by data providers and improve metadata integrity. The result was a recommendation for EOSDIS to develop a 'Common Metadata Repository (CMR)' to manage the evolution of NASA Earth Science metadata in a unified and consistent way by providing a central storage and access capability that streamlines current workflows while increasing overall data quality and anticipating future capabilities. For applications users interested in monitoring and analyzing a wide variety of natural and man-made phenomena, EOSDIS provides access to near real-time products from the MODIS, OMI, AIRS, and MLS instruments in less than 3 hours from observation. To enable interactive exploration of NASA's Earth imagery, EOSDIS is developing a set of standard services to deliver global, full-resolution satellite imagery in a highly responsive manner. EOSDIS is also playing a lead role in the development of the CEOS WGISS Integrated Catalog (CWIC), which provides search and access to holdings of participating international data providers. EOSDIS provides a platform to expose and share information on NASA Earth science tools and data via Earthdata.nasa.gov while offering a coherent and interoperable system for the NASA Earth Science Data System (ESDS) Program.
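
    The centralized metadata search that the ECHO/GCMD harmonization led toward can be illustrated with a keyword query. The endpoint and response layout below reflect the publicly documented Common Metadata Repository search API as I understand it; treat them as an assumption rather than part of this record.

    ```python
    # Sketch: keyword search for dataset (collection) metadata via the CMR search API.
    import json
    import urllib.parse
    import urllib.request

    params = urllib.parse.urlencode({"keyword": "sea surface temperature", "page_size": 3})
    url = f"https://cmr.earthdata.nasa.gov/search/collections.json?{params}"

    with urllib.request.urlopen(url, timeout=30) as resp:
        feed = json.load(resp).get("feed", {})

    for entry in feed.get("entry", []):
        # each entry is a collection-level metadata record matching the keyword
        print(entry.get("short_name"), "-", entry.get("title"))
    ```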

  11. NASA's Earth Observing Data and Information System - Supporting Interoperability through a Scalable Architecture (Invited)

    NASA Astrophysics Data System (ADS)

    Mitchell, A. E.; Lowe, D. R.; Murphy, K. J.; Ramapriyan, H. K.

    2013-12-01

    Initiated in 1990, NASA's Earth Observing System Data and Information System (EOSDIS) is currently a petabyte-scale archive of data designed to receive, process, distribute and archive several terabytes of science data per day from NASA's Earth science missions. Comprised of 12 discipline specific data centers collocated with centers of science discipline expertise, EOSDIS manages over 6800 data products from many science disciplines and sources. NASA supports global climate change research by providing scalable open application layers to the EOSDIS distributed information framework. This allows many other value-added services to access NASA's vast Earth Science Collection and allows EOSDIS to interoperate with data archives from other domestic and international organizations. EOSDIS is committed to NASA's Data Policy of full and open sharing of Earth science data. As metadata is used in all aspects of NASA's Earth science data lifecycle, EOSDIS provides a spatial and temporal metadata registry and order broker called the EOS Clearing House (ECHO) that allows efficient search and access of cross domain data and services through the Reverb Client and Application Programmer Interfaces (APIs). Another core metadata component of EOSDIS is NASA's Global Change Master Directory (GCMD) which represents more than 25,000 Earth science data set and service descriptions from all over the world, covering subject areas within the Earth and environmental sciences. With inputs from the ECHO, GCMD and Soil Moisture Active Passive (SMAP) mission metadata models, EOSDIS is developing a NASA ISO 19115 Best Practices Convention. Adoption of an international metadata standard enables a far greater level of interoperability among national and international data products. NASA recently concluded a 'Metadata Harmony Study' of EOSDIS metadata capabilities/processes of ECHO and NASA's Global Change Master Directory (GCMD), to evaluate opportunities for improved data access and use, reduce efforts by data providers and improve metadata integrity. The result was a recommendation for EOSDIS to develop a 'Common Metadata Repository (CMR)' to manage the evolution of NASA Earth Science metadata in a unified and consistent way by providing a central storage and access capability that streamlines current workflows while increasing overall data quality and anticipating future capabilities. For applications users interested in monitoring and analyzing a wide variety of natural and man-made phenomena, EOSDIS provides access to near real-time products from the MODIS, OMI, AIRS, and MLS instruments in less than 3 hours from observation. To enable interactive exploration of NASA's Earth imagery, EOSDIS is developing a set of standard services to deliver global, full-resolution satellite imagery in a highly responsive manner. EOSDIS is also playing a lead role in the development of the CEOS WGISS Integrated Catalog (CWIC), which provides search and access to holdings of participating international data providers. EOSDIS provides a platform to expose and share information on NASA Earth science tools and data via Earthdata.nasa.gov while offering a coherent and interoperable system for the NASA Earth Science Data System (ESDS) Program.

  12. NASA Reverb: Standards-Driven Earth Science Data and Service Discovery

    NASA Astrophysics Data System (ADS)

    Cechini, M. F.; Mitchell, A.; Pilone, D.

    2011-12-01

    NASA's Earth Observing System Data and Information System (EOSDIS) is a core capability in NASA's Earth Science Data Systems Program. NASA's EOS ClearingHOuse (ECHO) is a metadata catalog for the EOSDIS, providing a centralized catalog of data products and a registry of related data services. Working closely with the EOSDIS community, the ECHO team identified a need to develop the next-generation EOS data and service discovery tool. This development effort relied on the following principles: + Metadata Driven User Interface - Users should be presented with data and service discovery capabilities based on dynamic processing of metadata describing the targeted data. + Integrated Data & Service Discovery - Users should be able to discover data and associated data services that facilitate their research objectives. + Leverage Common Standards - Users should be able to discover and invoke services that utilize common interface standards. Metadata plays a vital role in facilitating data discovery and access. As data providers enhance their metadata, more advanced search capabilities become available, enriching a user's search experience. Maturing metadata formats such as ISO 19115 provide the necessary depth of metadata that facilitates advanced data discovery capabilities. Data discovery and access is not limited to simply the retrieval of data granules, but is growing into the more complex discovery of data services. These services include, but are not limited to, services facilitating additional data discovery, subsetting, reformatting, and re-projecting. The discovery and invocation of these data services is made significantly simpler through the use of consistent and interoperable standards. By utilizing an adopted standard, standard-specific adapters can be developed to communicate with multiple services implementing a specific protocol. The emergence of metadata standards such as ISO 19119 plays a similarly important role in discovery as the 19115 standard. After a yearlong design, development, and testing process, the ECHO team successfully released "Reverb - The Next Generation Earth Science Discovery Tool." Reverb relies heavily on the information contained in dataset and granule metadata, such as ISO 19115, to provide a dynamic experience to users based on identified search facet values extracted from science metadata. Such an approach allows users to perform cross-dataset correlation and searches, discovering additional data that they may not previously have been aware of. In addition to data discovery, Reverb users may discover services associated with their data of interest. When services utilize supported standards and/or protocols, Reverb can facilitate the invocation of both synchronous and asynchronous data processing services. This greatly enhances a user's ability to discover data of interest and accomplish their research goals. Extrapolating from the current movement towards interoperable standards and an increase in available services, data service invocation and chaining will become a natural part of data discovery. Reverb is one example of a discovery tool that provides a mechanism for transforming the earth science data discovery paradigm.

  13. Development of the Lymphoma Enterprise Architecture Database: A caBIG(tm) Silver level compliant System

    PubMed Central

    Huang, Taoying; Shenoy, Pareen J.; Sinha, Rajni; Graiser, Michael; Bumpers, Kevin W.; Flowers, Christopher R.

    2009-01-01

    Lymphomas are the fifth most common cancer in the United States, with numerous histological subtypes. Integrating existing clinical information on lymphoma patients provides a platform for understanding biological variability in presentation and treatment response and aids the development of novel therapies. We developed a cancer Biomedical Informatics Grid™ (caBIG™) Silver level compliant lymphoma database, called the Lymphoma Enterprise Architecture Data-system™ (LEAD™), which integrates the pathology, pharmacy, laboratory, cancer registry, clinical trials, and clinical data from institutional databases. We utilized the Cancer Common Ontological Representation Environment Software Development Kit (caCORE SDK) provided by the National Cancer Institute's Center for Bioinformatics to establish the LEAD™ platform for data management. The caCORE SDK-generated system utilizes an n-tier architecture with open Application Programming Interfaces, controlled vocabularies, and registered metadata to achieve semantic integration across multiple cancer databases. We demonstrated that the data elements and structures within LEAD™ could be used to manage clinical research data from phase 1 clinical trials, cohort studies, and registry data from the Surveillance Epidemiology and End Results database. This work provides a clear example of how semantic technologies from caBIG™ can be applied to support a wide range of clinical and research tasks, and integrate data from disparate systems into a single architecture. This illustrates the central importance of caBIG™ to the management of clinical and biological data. PMID:19492074

  14. Development of the Lymphoma Enterprise Architecture Database: a caBIG Silver level compliant system.

    PubMed

    Huang, Taoying; Shenoy, Pareen J; Sinha, Rajni; Graiser, Michael; Bumpers, Kevin W; Flowers, Christopher R

    2009-04-03

    Lymphomas are the fifth most common cancer in the United States, with numerous histological subtypes. Integrating existing clinical information on lymphoma patients provides a platform for understanding biological variability in presentation and treatment response and aids the development of novel therapies. We developed a cancer Biomedical Informatics Grid (caBIG) Silver level compliant lymphoma database, called the Lymphoma Enterprise Architecture Data-system (LEAD), which integrates the pathology, pharmacy, laboratory, cancer registry, clinical trials, and clinical data from institutional databases. We utilized the Cancer Common Ontological Representation Environment Software Development Kit (caCORE SDK) provided by the National Cancer Institute's Center for Bioinformatics to establish the LEAD platform for data management. The caCORE SDK-generated system utilizes an n-tier architecture with open Application Programming Interfaces, controlled vocabularies, and registered metadata to achieve semantic integration across multiple cancer databases. We demonstrated that the data elements and structures within LEAD could be used to manage clinical research data from phase 1 clinical trials, cohort studies, and registry data from the Surveillance Epidemiology and End Results database. This work provides a clear example of how semantic technologies from caBIG can be applied to support a wide range of clinical and research tasks, and integrate data from disparate systems into a single architecture. This illustrates the central importance of caBIG to the management of clinical and biological data.

  15. Air Quality uFIND: User-oriented Tool Set for Air Quality Data Discovery and Access

    NASA Astrophysics Data System (ADS)

    Hoijarvi, K.; Robinson, E. M.; Husar, R. B.; Falke, S. R.; Schultz, M. G.; Keating, T. J.

    2012-12-01

    Historically, there have been major impediments to seamless and effective data usage encountered by both data providers and users. Over the last five years, the international Air Quality (AQ) Community has worked through forums such as the Group on Earth Observations AQ Community of Practice, the ESIP AQ Working Group, and the Task Force on Hemispheric Transport of Air Pollution to converge on data format standards (e.g., netCDF), data access standards (e.g., Open Geospatial Consortium Web Coverage Services), metadata standards (e.g., ISO 19115), as well as other conventions (e.g., the CF Naming Convention) in order to build an Air Quality Data Network. The centerpiece of the AQ Data Network is the web service-based tool set user-oriented Filtering and Identification of Networked Data (uFIND). The purpose of uFIND is to provide rich and powerful facilities for the user to: a) discover and choose a desired dataset by navigating the multi-dimensional metadata space using faceted search, b) seamlessly access and browse datasets, and c) use uFIND's facilities as a web service for mashups with other AQ applications and portals. In a user-centric information system such as uFIND, the user experience is improved by metadata that includes the general fields for discovery as well as community-specific metadata to narrow the search beyond space, time, and generic keyword searches. However, even with the community-specific additions, the ISO 19115 records were formed in compliance with the standard, so that other standards-based search interfaces could leverage this additional information. To identify the fields necessary for metadata discovery, we started with the ISO 19115 Core Metadata fields and the fields needed for a Catalog Service for the Web (CSW) Record. This fulfilled two goals: to create valid ISO 19115 records and to be able to retrieve the records through a Catalog Service for the Web query. Beyond the required set of fields, the AQ Community added additional fields using a combination of keywords and ISO 19115 fields. These extensions allow discovery by measurement platform or observed phenomena. Beyond discovery metadata, the AQ records include service identification objects that allow standards-based clients, such as some brokers, to access the data found via OGC WCS or WMS data access protocols. uFIND is one such smart client; this combination of discovery and access metadata allows the user to preview each registered dataset through spatial and temporal views, observe the data access and usage pattern, and also find links to dataset-specific metadata directly in uFIND. The AQ data providers also benefit from this architecture, since their data products are easier to find and re-use, enhancing the relevance and importance of their products. Finally, the earth science community at large benefits from the Service Oriented Architecture of uFIND, since it is a service itself and allows service-based interfacing with providers and users of the metadata, allowing uFIND facets to be further refined for a particular AQ application or completely repurposed for other Earth Science domains that use the same set of data access and metadata standards.
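
    A discovery record that pairs generic ISO 19115-style core fields with community-specific facets and standards-based access links, in the spirit of the description above, might look like the sketch below. All field names, URLs and values are illustrative, not the uFIND schema.

    ```python
    # Sketch of a discovery record: core fields + community facets + service access links.
    record = {
        "identifier": "example-aq-dataset-001",
        "title": "Surface ozone hourly observations (example)",
        "abstract": "Hourly surface O3 concentrations from an example monitoring network.",
        "extent": {"bbox": [-125.0, 25.0, -65.0, 50.0],
                   "temporal": ["2010-01-01", "2011-12-31"]},
        # community extensions that narrow search beyond space/time/keyword
        "facets": {"platform": "surface station", "observed_phenomenon": "ozone",
                   "data_format": "netCDF", "naming_convention": "CF"},
        # service identification so smart clients can access, not just discover
        "access": [{"protocol": "OGC:WCS", "url": "https://example.org/wcs"},
                   {"protocol": "OGC:WMS", "url": "https://example.org/wms"}],
    }

    def matches(rec, **facet_filters):
        """True if the record satisfies every requested facet value."""
        return all(rec["facets"].get(k) == v for k, v in facet_filters.items())

    print(matches(record, observed_phenomenon="ozone", platform="surface station"))  # True
    ```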

  16. The design and implementation of the HY-1B Product Archive System

    NASA Astrophysics Data System (ADS)

    Liu, Shibin; Liu, Wei; Peng, Hailong

    2010-11-01

    The Product Archive System (PAS), as a background system, is the core part of the Product Archive and Distribution System (PADS), which is the center for data management of the Ground Application System of the HY-1B satellite hosted by the National Satellite Ocean Application Service of China. PAS integrates a series of up-to-date methods and technologies, such as a suitable data transmittal mode, flexible configuration files, and log information, in order to give the system several desirable characteristics: ease of maintenance, stability, and minimal complexity. This paper describes the seven major components of the PAS (the Network Communicator, File Collector, File Copy, Task Collector, Metadata Extractor, Product Data Archive, and Metadata Catalogue Import modules) and some of the unique features of the system, as well as the technical problems encountered and resolved.

  17. DataMed - an open source discovery index for finding biomedical datasets.

    PubMed

    Chen, Xiaoling; Gururaj, Anupama E; Ozyurt, Burak; Liu, Ruiling; Soysal, Ergin; Cohen, Trevor; Tiryaki, Firat; Li, Yueling; Zong, Nansu; Jiang, Min; Rogith, Deevakar; Salimi, Mandana; Kim, Hyeon-Eui; Rocca-Serra, Philippe; Gonzalez-Beltran, Alejandra; Farcas, Claudiu; Johnson, Todd; Margolis, Ron; Alter, George; Sansone, Susanna-Assunta; Fore, Ian M; Ohno-Machado, Lucila; Grethe, Jeffrey S; Xu, Hua

    2018-01-13

    Finding relevant datasets is important for promoting data reuse in the biomedical domain, but it is challenging given the volume and complexity of biomedical data. Here we describe the development of an open source biomedical data discovery system called DataMed, with the goal of promoting the building of additional data indexes in the biomedical domain. DataMed, which can efficiently index and search diverse types of biomedical datasets across repositories, is developed through the National Institutes of Health-funded biomedical and healthCAre Data Discovery Index Ecosystem (bioCADDIE) consortium. It consists of 2 main components: (1) a data ingestion pipeline that collects and transforms original metadata information to a unified metadata model, called DatA Tag Suite (DATS), and (2) a search engine that finds relevant datasets based on user-entered queries. In addition to describing its architecture and techniques, we evaluated individual components within DataMed, including the accuracy of the ingestion pipeline, the prevalence of the DATS model across repositories, and the overall performance of the dataset retrieval engine. Our manual review shows that the ingestion pipeline could achieve an accuracy of 90% and that core elements of DATS had varied frequency across repositories. On a manually curated benchmark dataset, the DataMed search engine achieved an inferred average precision of 0.2033 and a precision at 10 (P@10, the number of relevant results in the top 10 search results) of 0.6022, by implementing advanced natural language processing and terminology services. Currently, we have made the DataMed system publicly available as an open source package for the biomedical community.
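
    The P@10 figure reported above is simply the fraction of the top 10 ranked results judged relevant. The sketch below shows the computation on a made-up ranking and judgment set, not the bioCADDIE benchmark itself.

    ```python
    # Sketch: precision at k for a ranked list of dataset identifiers.
    def precision_at_k(ranked_ids, relevant_ids, k=10):
        top_k = ranked_ids[:k]
        return sum(1 for doc in top_k if doc in relevant_ids) / k

    ranked = [f"ds{i}" for i in range(1, 21)]                 # engine's ranked result list
    relevant = {"ds1", "ds2", "ds4", "ds7", "ds9", "ds12"}    # curator-judged relevant datasets
    print(precision_at_k(ranked, relevant))                   # 5 relevant in the top 10 -> 0.5
    ```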

  18. ElemeNT: a computational tool for detecting core promoter elements.

    PubMed

    Sloutskin, Anna; Danino, Yehuda M; Orenstein, Yaron; Zehavi, Yonathan; Doniger, Tirza; Shamir, Ron; Juven-Gershon, Tamar

    2015-01-01

    Core promoter elements play a pivotal role in the transcriptional output, yet they are often detected manually within sequences of interest. Here, we present 2 contributions to the detection and curation of core promoter elements within given sequences. First, the Elements Navigation Tool (ElemeNT) is a user-friendly web-based, interactive tool for prediction and display of putative core promoter elements and their biologically-relevant combinations. Second, the CORE database summarizes ElemeNT-predicted core promoter elements near CAGE and RNA-seq-defined Drosophila melanogaster transcription start sites (TSSs). ElemeNT's predictions are based on biologically-functional core promoter elements, and can be used to infer core promoter compositions. ElemeNT does not assume prior knowledge of the actual TSS position, and can therefore assist in annotation of any given sequence. These resources, freely accessible at http://lifefaculty.biu.ac.il/gershon-tamar/index.php/resources, facilitate the identification of core promoter elements as active contributors to gene expression.
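
    Detecting a core promoter element in a sequence amounts to scanning for a consensus motif. The sketch below matches an IUPAC-coded pattern against a DNA string; the TATA-box-like consensus and the test sequence are only illustrations, and this is not the algorithm implemented in ElemeNT.

    ```python
    # Sketch: scan a DNA sequence for an IUPAC-coded consensus motif.
    import re

    IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
             "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
             "K": "[GT]", "M": "[AC]", "N": "[ACGT]"}

    def find_motif(sequence, consensus):
        """Return (position, matched subsequence) for every occurrence of the consensus."""
        pattern = "".join(IUPAC[base] for base in consensus.upper())
        return [(m.start(), m.group()) for m in re.finditer(pattern, sequence.upper())]

    seq = "GGCTATAAAAGGCCGCAGTTTCATCAGTCGATATAAATGCC"
    print(find_motif(seq, "TATAWAAR"))   # positions of TATA-box-like matches, if any
    ```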

  19. The Defense Messaging System (DMS) in the Navy Regional Enterprise Messaging System (NREMS) Environment: Evidence that Size Does Matter in DoD Business Process Engineering

    DTIC Science & Technology

    2007-06-01

    data repository that will create a metadata card for each message for use by the federated search catalog as a reference. c. Joint DMS Core Product...yet. Once resolved, NREMS can move forward afloat. The AMHS in concert with NCES will be updated with the federated search capability. AMHS

  20. Service Level Agreements in Service-Oriented Architecture Environments

    DTIC Science & Technology

    2008-09-01

    the WS-Agreement [Seidel 2007]. Indeed, core concepts of the WSLA were brought into the WS-Agreement, which also contains ideas from the Service...A Categorization Scheme for SLA Metrics. http://ibis.in.tum.de/staff/paschke/docs/MKWI2006_SLA_Paschke.pdf (2006). [Seidel 2007] Seidel, Jan...addr-metadata-20070731/ (2007). [Wohlstadter 2004] Wohlstadter, Eric; Tai, Stefan; Mikalsen, Thomas; Rouvellou, Isabelle; & Devanbu, Premkumar

  1. The Index to Marine and Lacustrine Geological Samples (IMLGS): Linking Digital Data to Physical Samples for the Marine Community

    NASA Astrophysics Data System (ADS)

    Stroker, K. J.; Jencks, J. H.; Eakins, B.

    2016-12-01

    The Index to Marine and Lacustrine Geological Samples (IMLGS) is a community designed and maintained resource enabling researchers to locate and request seafloor and lakebed geologic samples curated by partner institutions. The Index was conceived in the dawn of the digital age by representatives from U.S. academic and government marine core repositories and the NOAA National Geophysical Data Center, now the National Centers for Environmental Information (NCEI), at a 1977 meeting convened by the National Science Foundation (NSF). The Index is based on core concepts of community oversight, common vocabularies, consistent metadata, and a shared interface. The Curators Consortium, international in scope, meets biennially to share ideas and discuss best practices. NCEI serves the group by providing database access and maintenance, a list server, digitizing support, and long-term archival of sample metadata, data, and imagery. Over three decades, participating curators have performed the laborious task of creating and contributing metadata for over 205,000 seafloor and lakebed cores, grabs, and dredges archived in their collections. Some partners use the Index for primary web access to their collections while others use it to increase exposure of more in-depth institutional systems. The IMLGS has a persistent URL/Digital Object Identifier (DOI), as well as DOIs assigned to partner collections for citation and to provide a persistent link to curator collections. The Index is currently a geospatially enabled relational database, publicly accessible via Web Feature and Web Map Services, and text- and ArcGIS map-based web interfaces. To provide as much knowledge as possible about each sample, the Index includes curatorial contact information and links to related data, information, and images: 1) at participating institutions, 2) in the NCEI archive, and 3) through a Linked Data interface maintained by the Rolling Deck to Repository (R2R). Over 43,000 International GeoSample Numbers (IGSNs) linking to the System for Earth Sample Registration (SESAR) are included in anticipation of opportunities for interconnectivity with Integrated Earth Data Applications (IEDA) systems. The paper will discuss the database with the goal of increasing the connections and links to related data at partner institutions.

  2. University of TX Bureau of Economic Geology's Core Research Centers: The Time is Right for Registering Physical Samples and Assigning IGSN's - Workflows, Stumbling Blocks, and Successes.

    NASA Astrophysics Data System (ADS)

    Averett, A.; DeJarnett, B. B.

    2016-12-01

    The University of Texas Bureau of Economic Geology (BEG) serves as the geological survey for Texas and operates three geological sample repositories that house well over 2 million boxes of geological samples (cores and cuttings) and an abundant amount of geoscience data (geophysical logs, thin sections, geochemical analyses, etc.). Material is accessible and searchable online, and it is publicly available to the geological community for research and education. Patrons access information about our collection by using our online core and log database (SQL format). BEG is currently undertaking a large project to: 1) improve the internal accuracy of metadata associated with the collection; 2) enhance the capabilities of the database for BEG curators and researchers as well as our external patrons; and 3) ensure easy and efficient navigation for patrons through our online portal. As part of this project, BEG is in the early stages of planning to export the metadata for its collection into SESAR (System for Earth Sample Registration) and have IGSNs (International GeoSample Numbers) assigned to its samples. Education regarding the value of IGSNs and an external registry (SESAR) has been crucial to receiving management support for the project, because the concept and potential benefits of registering samples in a registry outside of the institution were not well known prior to this project. Potential benefits such as increased discoverability, repository recognition in publications, and interoperability were presented. The project was well received by management, and BEG fully supports the effort to register our physical samples with SESAR. Since BEG is only in the initial phase of this project, any stumbling blocks, workflow issues, and successes or failures can only be predicted at this point, but by mid-December BEG expects to have several concrete issues to present in the session. Currently, our most pressing issue involves establishing the most efficient workflow for exporting large amounts of metadata in a format that SESAR can easily ingest, and determining how this can best be accomplished with very few BEG staff assigned to the project.
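
    The export workflow mentioned at the end of the record, flattening institutional sample metadata into a batch file a registry can ingest, might be sketched as below. The column names mimic typical sample-registration templates and are illustrative, not the exact SESAR batch template; the sample records are invented.

    ```python
    # Sketch: export sample metadata from a local catalog into a flat CSV batch file.
    import csv

    samples = [
        {"name": "BEG-CORE-0001", "material": "Rock", "sample_type": "Core",
         "latitude": 31.95, "longitude": -102.08, "collection_method": "Rotary drilling"},
        {"name": "BEG-CORE-0002", "material": "Rock", "sample_type": "Cuttings",
         "latitude": 28.70, "longitude": -98.12, "collection_method": "Rotary drilling"},
    ]

    columns = ["Sample Name", "Material", "Sample Type",
               "Latitude", "Longitude", "Collection Method"]

    with open("sample_registration_batch.csv", "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(columns)
        for s in samples:
            writer.writerow([s["name"], s["material"], s["sample_type"],
                             s["latitude"], s["longitude"], s["collection_method"]])

    # The registry would return an IGSN for each row, which is then stored back in the
    # institutional database to link the physical sample with its digital records.
    ```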

  3. MyOcean Internal Information System (Dial-P)

    NASA Astrophysics Data System (ADS)

    Blanc, Frederique; Jolibois, Tony; Loubrieu, Thomas; Manzella, Giuseppe; Mazzetti, Paolo; Nativi, Stefano

    2010-05-01

    MyOcean is a three-year project (2008-2011) whose goal is the development and pre-operational validation of the GMES Marine Core Service for ocean monitoring and forecasting. It is a transition project that will lead the European "operational oceanography" community towards the operational phase of a GMES European service, which demands more European integration, more operationality, and more service. Observations, model-based data, and added-value products will be generated - and enhanced thanks to dedicated expertise - by the following production units: • Five Thematic Assembly Centers, each of them dealing with a specific set of observation data: Sea Level, Ocean Colour, Sea Surface Temperature, Sea Ice & Wind, and In Situ data, • Seven Monitoring and Forecasting Centers to serve the Global Ocean, the Arctic area, the Baltic Sea, the Atlantic North-West shelves area, the Atlantic Iberian-Biscay-Ireland area, the Mediterranean Sea, and the Black Sea. Intermediate and final users will discover, view, and get the products by means of a central web desk, a central re-active manned service desk, and thematic experts distributed across Europe. The MyOcean Information System (MIS) considers the various aspects of an interoperable, federated information system. Data models support data and computer systems by providing the definition and format of data. The possibility of including the information in the data file depends on the data model adopted. In general there is little effort in the current project to develop a "generic" data model. A strong push to develop a common model is provided by the EU Directive INSPIRE. At present, there is no single de-facto data format for storing observational data. Data formats are still evolving, with their underlying data models moving towards the concept of Feature Types based on ISO/TC211 standards. For example, Unidata is developing the Common Data Model that can represent scientific data types such as point, trajectory, station, grid, etc., which will be implemented in the netCDF format. SeaDataNet recommends the ODV and netCDF formats. Another problem related to data curation and interoperability is the possibility of using common vocabularies. Common vocabularies are developed in many international initiatives, such as GEMET (promoted by INSPIRE as a multilingual thesaurus), UNIDATA, SeaDataNet, and the Marine Metadata Initiative (MMI). MIS is considering the SeaDataNet vocabulary as a base for interoperability. Four layers of different abstraction levels of interoperability can be defined: - Technical/basic: this layer is implemented at each TAC or MFC through internet connection and basic services for data transfer and browsing (e.g. FTP, HTTP, etc.). - Syntactic: allowing the interchange of metadata and protocol elements. This layer corresponds to the definition of a Core Metadata Set, the format of exchange/delivery for the data and associated metadata, and possible software. This layer is implemented by the DIAL-P logical interface (e.g. adoption of an INSPIRE-compliant metadata set and common data formats). - Functional/pragmatic: based on a common set of functional primitives or on a common set of service definitions. This layer refers to the definition of services based on Web services standards. This layer is implemented by the DIAL-P logical interface (e.g. adoption of INSPIRE-compliant network services). - Semantic: allowing access to similar classes of objects and services across multiple sites, with multilinguality of content as one specific aspect. This layer corresponds to the MIS interface, terminology, and thesaurus. Given the above requirements, the proposed solution is a federation of systems, where the individual participants are self-contained autonomous systems, but together form a consistent wider picture. A mid-tier integration layer mediates between existing systems, adapting their data and service model schemas to the MIS. The developed MIS is a read-only system, i.e. it does not allow updating (or inserting) data into the participant resource systems. The main advantages of the proposed approach are: • to enable information sources to join the MIS and publish their data and metadata in a secure way, without any modification to their existing resources and procedures and without any restriction to their autonomy; • to enable users to browse and query the MIS, receiving an aggregated result incorporating relevant data and metadata from across different sources; • to accommodate the growth of such a MIS, either in terms of its clients or of its information resources, as well as the evolution of the underlying data model.

  4. Fallon, Nevada FORGE Distinct Element Reservoir Modeling

    DOE Data Explorer

    Blankenship, Doug; Pettitt, Will; Riahi, Azadeh; Hazzard, Jim; Blanksma, Derrick

    2018-03-12

    Archive containing input/output data for distinct element reservoir modeling for Fallon FORGE. Models created using 3DEC, InSite, and in-house Python algorithms (ITASCA). List of archived files follows; please see 'Modeling Metadata.pdf' (included as a resource below) for additional file descriptions. Data sources include regional geochemical model, well positions and geometry, principal stress field, capability for hydraulic fractures, capability for hydro-shearing, reservoir geomechanical model-stimulation into multiple zones, modeled thermal behavior during circulation, and microseismicity.

  5. Publications - PDF 96-16 | Alaska Division of Geological & Geophysical

    Science.gov Websites

    Downloadable data: fbx_prelim_geology shapefile (6.5 M) with an accompanying metadata read-me. Keywords: Age Dates; Antimony; Ar-Ar; Bedrock; Bedrock Geology; Birch Hill Sequence; Bismuth; Chatanika Terrane; Construction Materials; Derivative; Economic Geology

  6. ODISEES: A New Paradigm in Data Access

    NASA Astrophysics Data System (ADS)

    Huffer, E.; Little, M. M.; Kusterer, J.

    2013-12-01

    As part of its ongoing efforts to improve access to data, the Atmospheric Science Data Center has developed a high-precision Earth Science domain ontology (the 'ES Ontology') implemented in a graph database ('the Semantic Metadata Repository') that is used to store detailed, semantically-enhanced, parameter-level metadata for ASDC data products. The ES Ontology provides the semantic infrastructure needed to drive the ASDC's Ontology-Driven Interactive Search Environment for Earth Science ('ODISEES'), a data discovery and access tool, and will support additional data services such as analytics and visualization. The ES ontology is designed on the premise that naming conventions alone are not adequate to provide the information needed by prospective data consumers to assess the suitability of a given dataset for their research requirements; nor are current metadata conventions adequate to support seamless machine-to-machine interactions between file servers and end-user applications. Data consumers need information not only about what two data elements have in common, but also about how they are different. End-user applications need consistent, detailed metadata to support real-time data interoperability. The ES ontology is a highly precise, bottom-up, queriable model of the Earth Science domain that focuses on critical details about the measurable phenomena, instrument techniques, data processing methods, and data file structures. Earth Science parameters are described in detail in the ES Ontology and mapped to the corresponding variables that occur in ASDC datasets. Variables are in turn mapped to well-annotated representations of the datasets that they occur in, the instrument(s) used to create them, the instrument platforms, the processing methods, etc., creating a linked-data structure that allows both human and machine users to access a wealth of information critical to understanding and manipulating the data. The mappings are recorded in the Semantic Metadata Repository as RDF-triples. An off-the-shelf Ontology Development Environment and a custom Metadata Conversion Tool comprise a human-machine/machine-machine hybrid tool that partially automates the creation of metadata as RDF-triples by interfacing with existing metadata repositories and providing a user interface that solicits input from a human user, when needed. RDF-triples are pushed to the Ontology Development Environment, where a reasoning engine executes a series of inference rules whose antecedent conditions can be satisfied by the initial set of RDF-triples, thereby generating the additional detailed metadata that is missing in existing repositories. A SPARQL Endpoint, a web-based query service and a Graphical User Interface allow prospective data consumers - even those with no familiarity with NASA data products - to search the metadata repository to find and order data products that meet their exact specifications. A web-based API will provide an interface for machine-to-machine transactions.
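
    Recording parameter-level mappings as RDF triples, in the spirit of the Semantic Metadata Repository described above, can be sketched with the rdflib package. The namespace, class names, and variable names are invented for illustration and are not the actual ES Ontology.

    ```python
    # Sketch: variable -> parameter -> instrument mappings expressed as RDF triples.
    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF

    EX = Namespace("http://example.org/es-ontology#")
    g = Graph()

    var = EX["variable/TOA_SW_Flux_v1"]
    param = EX["parameter/TopOfAtmosphereShortwaveFlux"]
    inst = EX["instrument/ExampleRadiometer"]

    g.add((var, RDF.type, EX.Variable))
    g.add((var, EX.representsParameter, param))   # variable -> measurable phenomenon
    g.add((var, EX.producedByInstrument, inst))   # variable -> instrument technique
    g.add((var, EX.units, Literal("W m-2")))      # detail needed for interoperability

    print(g.serialize(format="turtle"))           # linked-data view, queryable via SPARQL
    ```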

  7. QualityML: a dictionary for quality metadata encoding

    NASA Astrophysics Data System (ADS)

    Ninyerola, Miquel; Sevillano, Eva; Serral, Ivette; Pons, Xavier; Zabala, Alaitz; Bastin, Lucy; Masó, Joan

    2014-05-01

    The scenario of rapidly growing geodata catalogues requires tools that help users choose products. Having quality fields populated in metadata allows users to rank and then select the best fit-for-purpose products. In this direction, we have developed QualityML (http://qualityml.geoviqua.org), a dictionary that contains hierarchically structured concepts to precisely define and relate quality levels: from quality classes to quality measurements. Generically, a quality element is the path that goes from the higher level (quality class) to the lowest levels (statistics or quality metrics). This path is used to encode quality of datasets in the corresponding metadata schemas. The benefits of having encoded quality, in the case of data producers, are related to improvements in their product discovery and better transmission of product characteristics. In the case of data users, particularly decision-makers, they would find quality and uncertainty measures to take the best decisions as well as perform dataset intercomparison. It also allows other components (such as visualization, discovery, or comparison tools) to be quality-aware and interoperable. On one hand, QualityML is a profile of the ISO geospatial metadata standards providing a set of rules for precisely documenting quality indicator parameters, structured in 6 levels. On the other hand, QualityML includes semantics and vocabularies for the quality concepts. Whenever possible, it uses statistical expressions from the UncertML dictionary (http://www.uncertml.org) encoding. However, it also extends UncertML to provide a list of alternative metrics that are commonly used to quantify quality. A specific example, based on a temperature dataset, is shown below. The annual mean temperature map has been validated with independent in-situ measurements to obtain a global error of 0.5 °C.
    Level 0: Quality class (e.g., Thematic accuracy)
    Level 1: Quality indicator (e.g., Quantitative attribute correctness)
    Level 2: Measurement field (e.g., DifferentialErrors1D)
    Level 3: Statistic or Metric (e.g., Half-lengthConfidenceInterval)
    Level 4: Units (e.g., Celsius degrees)
    Level 5: Value (e.g., 0.5)
    Level 6: Specifications. Additional information on how the measurement took place, a citation of the reference data, the traceability of the process, and a publication describing the validation process, encoded using new ISO 19157 elements or the GeoViQua (http://www.geoviqua.org) Quality Model (PQM-UQM) extensions to the ISO models.
    Finally, keep in mind that QualityML is not just suitable for encoding at the dataset level but also considers pixel- and object-level uncertainties. This is done by linking the metadata quality descriptions with layers representing not just the data but the uncertainty values associated with each geospatial element.
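
    The temperature example above can be encoded as a nested structure following the QualityML level hierarchy. The dictionary layout below is an illustration of the hierarchy only, not the actual XML encoding used in ISO metadata documents.

    ```python
    # Sketch: the Level 0-6 quality element for the temperature validation example.
    quality_element = {
        "quality_class": "Thematic accuracy",                        # Level 0
        "quality_indicator": "Quantitative attribute correctness",   # Level 1
        "measurement_field": "DifferentialErrors1D",                 # Level 2
        "statistic_or_metric": "Half-lengthConfidenceInterval",      # Level 3
        "units": "Celsius degrees",                                  # Level 4
        "value": 0.5,                                                # Level 5
        "specifications": {                                          # Level 6
            "reference_data": "independent in-situ temperature measurements",
            "description": "validation of the annual mean temperature map",
        },
    }

    def quality_path(element):
        """Return the path from quality class down to the metric, as described above."""
        keys = ["quality_class", "quality_indicator", "measurement_field", "statistic_or_metric"]
        return " > ".join(element[k] for k in keys)

    print(quality_path(quality_element))
    ```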

  8. DATS, the data tag suite to enable discoverability of datasets.

    PubMed

    Sansone, Susanna-Assunta; Gonzalez-Beltran, Alejandra; Rocca-Serra, Philippe; Alter, George; Grethe, Jeffrey S; Xu, Hua; Fore, Ian M; Lyle, Jared; Gururaj, Anupama E; Chen, Xiaoling; Kim, Hyeon-Eui; Zong, Nansu; Li, Yueling; Liu, Ruiling; Ozyurt, I Burak; Ohno-Machado, Lucila

    2017-06-06

    Today's science increasingly requires effective ways to find and access existing datasets that are distributed across a range of repositories. For researchers in the life sciences, discoverability of datasets may soon become as essential as identifying the latest publications via PubMed. Through an international collaborative effort funded by the National Institutes of Health (NIH)'s Big Data to Knowledge (BD2K) initiative, we have designed and implemented the DAta Tag Suite (DATS) model to support the DataMed data discovery index. DataMed's goal is to be for data what PubMed has been for the scientific literature. Akin to the Journal Article Tag Suite (JATS) used in PubMed, the DATS model enables submission of metadata on datasets to DataMed. DATS has a core set of elements, which are generic and applicable to any type of dataset, and an extended set that can accommodate more specialized data types. DATS is a platform-independent model also available as an annotated serialization in schema.org, which in turn is widely used by major search engines like Google, Microsoft, Yahoo and Yandex.
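
    Because DATS descriptions are also serialized as schema.org annotations, a dataset submission can be pictured as a JSON-LD document. The sketch below uses generic schema.org Dataset properties with invented values; it is not the normative DATS serialization, which defines additional core and extended elements.

      # Illustrative JSON-LD-style dataset description using generic schema.org
      # Dataset properties. Values are invented; the normative DATS serialization
      # defines further core and extended elements not shown here.
      import json

      dataset_annotation = {
          "@context": "https://schema.org",
          "@type": "Dataset",
          "name": "Example metabolite profiling study",
          "description": "Mass spectrometry measurements from a hypothetical cohort.",
          "identifier": "https://doi.org/10.0000/example",
          "keywords": ["metabolomics", "mass spectrometry"],
          "distribution": {
              "@type": "DataDownload",
              "contentUrl": "https://example.org/data/study.zip",
              "encodingFormat": "application/zip",
          },
      }

      print(json.dumps(dataset_annotation, indent=2))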

  9. The geochemical landscape of northwestern Wisconsin and adjacent parts of northern Michigan and Minnesota (geochemical data files)

    USGS Publications Warehouse

    Cannon, William F.; Woodruff, Laurel G.

    2003-01-01

    This data set consists of nine files of geochemical information on various types of surficial deposits in northwestern Wisconsin and immediately adjacent parts of Michigan and Minnesota. The files are presented in two formats: as dBASE IV files and as Microsoft Excel workbooks. The data comprise multi-element chemical analyses of soils, stream sediments, and lake sediments. Latitude and longitude values are provided in each file so that the dbf files can be readily imported into GIS applications. Metadata files are provided in outline form, question-and-answer form, and text form. The metadata include information on procedures for sample collection, sample preparation, and chemical analyses, including sensitivity and precision.
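
    As a sketch of the GIS import mentioned above, the snippet below builds point geometries from the latitude and longitude columns with pandas and geopandas. The file name and the column names are assumptions made for illustration, not documented values from the data release.

      # Sketch: load one of the Excel-format geochemistry tables and turn the
      # latitude/longitude columns into point geometries for use in a GIS.
      # The file name and column names ("LATITUDE", "LONGITUDE") are assumptions.
      import pandas as pd
      import geopandas as gpd

      df = pd.read_excel("soils_geochemistry.xls")  # hypothetical file name

      gdf = gpd.GeoDataFrame(
          df,
          geometry=gpd.points_from_xy(df["LONGITUDE"], df["LATITUDE"]),
          crs="EPSG:4326",  # plain latitude/longitude, as implied by the abstract
      )
      gdf.to_file("soils_geochemistry.gpkg", driver="GPKG")  # ready for GIS use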

  10. Ontology-Based Search of Genomic Metadata.

    PubMed

    Fernandez, Javier D; Lenzerini, Maurizio; Masseroli, Marco; Venco, Francesco; Ceri, Stefano

    2016-01-01

    The Encyclopedia of DNA Elements (ENCODE) is a huge and still expanding public repository of more than 4,000 experiments and 25,000 data files, assembled by a large international consortium since 2007; unknown biological knowledge can be extracted from these huge and largely unexplored data, leading to data-driven genomic, transcriptomic, and epigenomic discoveries. Yet the search for datasets relevant to knowledge discovery is only weakly supported: the metadata describing ENCODE datasets are quite simple and incomplete, and are not organized by a coherent underlying ontology. Here, we show how to overcome this limitation by adopting an ENCODE metadata search approach that uses high-quality ontological knowledge and state-of-the-art indexing technologies. Specifically, we developed S.O.S. GeM (http://www.bioinformatics.deib.polimi.it/SOSGeM/), a system supporting effective semantic search and retrieval of ENCODE datasets. First, we constructed a Semantic Knowledge Base by starting with concepts extracted from ENCODE metadata, matched to and expanded with biomedical ontologies integrated in the well-established Unified Medical Language System. We prove that this inference method is sound and complete. Then, we leveraged the Semantic Knowledge Base to semantically search ENCODE data from arbitrary biologists' queries. This allows more relevant datasets to be found than a purely syntactic search, of the kind supported by the other available systems, can extract. We empirically show the relevance of the datasets found to the biologists' queries.

  11. Extension modules for storage, visualization and querying of genomic, genetic and breeding data in Tripal databases

    PubMed Central

    Lee, Taein; Cheng, Chun-Huai; Ficklin, Stephen; Yu, Jing; Humann, Jodi; Main, Dorrie

    2017-01-01

    Tripal is an open-source database platform primarily used for development of genomic, genetic and breeding databases. We report here on the release of the Chado Loader, Chado Data Display and Chado Search modules to extend the functionality of the core Tripal modules. These new extension modules provide additional tools for (1) data loading, (2) customized visualization and (3) advanced search functions for supported data types such as organism, marker, QTL/Mendelian Trait Loci, germplasm, map, project, phenotype, genotype and their respective metadata. The Chado Loader module provides data collection templates in Excel with defined metadata, and data loaders with front-end forms. The Chado Data Display module contains tools to visualize each data type and its metadata, which can be used as is or customized as desired. The Chado Search module provides search and download functionality for the supported data types. Also included are tools to visualize map and species summaries. The use of materialized views in the Chado Search module enables better performance as well as flexibility of data modeling in Chado, allowing existing Tripal databases with different metadata types to utilize the module. These Tripal extension modules are implemented in the Genome Database for Rosaceae (rosaceae.org), CottonGen (cottongen.org), Citrus Genome Database (citrusgenomedb.org), Genome Database for Vaccinium (vaccinium.org) and the Cool Season Food Legume Database (coolseasonfoodlegume.org). Database URL: https://www.citrusgenomedb.org/, https://www.coolseasonfoodlegume.org/, https://www.cottongen.org/, https://www.rosaceae.org/, https://www.vaccinium.org/

  12. How Safe and Persistent is your Research? It's all about relationships

    NASA Astrophysics Data System (ADS)

    Lin, J.

    2017-12-01

    The relationships between the scholarly resources over the course of the research lifecycle and with the people and places involved are at the core of not only scholarly communications, but the research enterprise. This session will consider the current state of persistent identifiers and their associated metadata in scholarly communications as a critical part of making sure research can be validated. Metadata ensure that the scholarly assets are knitted into the webbing of the research information network. I will discuss this in the context of the growing set of formats and types of research related materials accessible, discoverable, trackable, and reusable. I will highlight the developments in and the importance of expanded standardization and interoperability of scholarly information pertaining to authors, researchers, funders, and others involved in the creation and dissemination of content especially in light of the growing attention towards research integrity and reproducibility.

  13. Scientific Platform as a Service - Tools and solutions for efficient access to and analysis of oceanographic data

    NASA Astrophysics Data System (ADS)

    Vines, Aleksander; Hansen, Morten W.; Korosov, Anton

    2017-04-01

    Existing international and Norwegian infrastructure projects, e.g., NorDataNet, NMDC and NORMAP, provide open data access through the OPeNDAP protocol following the CF (Climate and Forecast) metadata conventions, which are designed to promote the processing and sharing of files created with the NetCDF application programming interface (API). This approach is now also being implemented in the Norwegian Sentinel Data Hub (satellittdata.no) to provide satellite EO data to the user community. Simultaneously with providing simplified and unified data access, these projects also seek to use and establish common standards for use and discovery metadata. This in turn allows the development of standardized tools for data search and (subset) streaming over the internet to perform actual scientific analysis. A combination of software tools, which we call a Scientific Platform as a Service (SPaaS), will take advantage of these opportunities to harmonize and streamline the search, retrieval and analysis of integrated satellite and auxiliary observations of the oceans in a seamless system. The SPaaS is a cloud solution for the integration of analysis tools with scientific datasets via an API. The core part of the SPaaS is a distributed metadata catalog that stores granular metadata describing the structure, location and content of available satellite, model, and in situ datasets. The analysis tools include software for visualization (also online), interactive in-depth analysis, and server-based processing chains. The API conveys search requests between system nodes (i.e., interactive and server tools) and provides easy access to the metadata catalog, data repositories, and the tools. The SPaaS components are integrated in virtual machines, whose provisioning and deployment are automated using existing state-of-the-art open-source tools (e.g., Vagrant, Ansible, Docker). The open-source code for the scientific tools and virtual machine configurations is under version control at https://github.com/nansencenter/ and is coupled to an online continuous integration system (e.g., Travis CI).
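
    A minimal sketch of the kind of subset streaming that such OPeNDAP/CF services make possible, using xarray; the dataset URL, variable name and coordinate names are hypothetical, and an OPeNDAP-capable netCDF backend is assumed.

      # Sketch: open a CF-compliant dataset over OPeNDAP and stream only a subset
      # instead of downloading whole files. The URL, variable and coordinate names
      # are hypothetical; requires a netCDF backend built with DAP support.
      import xarray as xr

      URL = "https://thredds.example.org/thredds/dodsC/sst/analysis.nc"  # hypothetical
      ds = xr.open_dataset(URL)

      # Lazily select a small space/time slice; only this subset is transferred.
      subset = ds["sea_surface_temperature"].sel(
          time="2016-06-01", lat=slice(60, 80), lon=slice(-10, 30)
      )
      print(float(subset.mean()))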

  14. The Digital Sample: Metadata, Unique Identification, and Links to Data and Publications

    NASA Astrophysics Data System (ADS)

    Lehnert, K. A.; Vinayagamoorthy, S.; Djapic, B.; Klump, J.

    2006-12-01

    A significant part of digital data in the Geosciences refers to physical samples of Earth materials, from igneous rocks to sediment cores to water or gas samples. The application and long-term utility of these sample-based data in research are critically dependent on (a) the availability of information (metadata) about the samples, such as the geographical location and time of sampling or the sampling method, (b) links between the different data types available for individual samples that are dispersed in the literature and in digital data repositories, and (c) access to the samples themselves. Major problems for achieving this include incomplete documentation of samples in publications, the use of ambiguous sample names, and the lack of a central catalog that makes it possible to find where a sample is archived. The International Geo Sample Number (IGSN), managed by the System for Earth Sample Registration (SESAR), provides solutions for these problems. The IGSN is a unique persistent identifier for samples and other GeoObjects that can be obtained by submitting sample metadata to SESAR (www.geosamples.org). If data in a publication are referenced to an IGSN (rather than an ambiguous sample name), sample metadata can readily be extracted from the SESAR database, which is evolving into a Global Sample Catalog that also allows users to locate the owner or curator of a sample. Use of the IGSN in digital data systems allows linkages to be built between distributed data. SESAR is contributing to the development of sample metadata standards. SESAR will integrate the IGSN in persistent, resolvable identifiers based on the handle.net service to advance direct linkages between the digital representation of samples in SESAR (sample profiles) and their related data in the literature and in web-accessible digital data repositories. Technologies outlined by Klump et al. (this session), such as the automatic creation of ontologies by text-mining applications, will be explored for harvesting identifiers of publications and datasets that contain information about a specific sample, in order to establish comprehensive data profiles for samples.

  15. Distributed metadata servers for cluster file systems using shared low latency persistent key-value metadata store

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bent, John M.; Faibish, Sorin; Pedone, Jr., James M.

    A cluster file system is provided having a plurality of distributed metadata servers with shared access to one or more shared low latency persistent key-value metadata stores. A metadata server comprises an abstract storage interface comprising a software interface module that communicates with at least one shared persistent key-value metadata store providing a key-value interface for persistent storage of key-value metadata. The software interface module provides the key-value metadata to the at least one shared persistent key-value metadata store in a key-value format. The shared persistent key-value metadata store is accessed by a plurality of metadata servers. A metadata request can be processed by a given metadata server independently of other metadata servers in the cluster file system. A distributed metadata storage environment is also disclosed that comprises a plurality of metadata servers having an abstract storage interface to at least one shared persistent key-value metadata store.
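
    The abstract storage interface described here can be pictured as a thin key-value facade that every metadata server talks through. The sketch below is a conceptual Python rendering of that idea under assumed names; it is not the patented implementation.

      # Conceptual sketch: each metadata server formats metadata as key-value pairs
      # and hands them to a shared persistent store behind an abstract interface.
      # Class and method names are assumptions, not the actual implementation.
      from abc import ABC, abstractmethod

      class KeyValueMetadataStore(ABC):
          """Interface to a shared, persistent, low-latency key-value store."""

          @abstractmethod
          def put(self, key: str, value: bytes) -> None: ...

          @abstractmethod
          def get(self, key: str) -> bytes: ...

      class MetadataServer:
          """A metadata server that persists metadata through the abstract interface."""

          def __init__(self, store: KeyValueMetadataStore):
              self.store = store

          def handle_create(self, path: str, inode_metadata: bytes) -> None:
              # Key-value format: the file path is the key, serialized metadata the value.
              self.store.put(f"meta:{path}", inode_metadata)

          def handle_lookup(self, path: str) -> bytes:
              return self.store.get(f"meta:{path}")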

  16. A Metadata Standard for Hydroinformatic Data Conforming to International Standards

    NASA Astrophysics Data System (ADS)

    Notay, Vikram; Carstens, Georg; Lehfeldt, Rainer

    2017-04-01

    The affordable availability of computing power and digital storage has been a boon for the scientific community. The hydroinformatics community has also benefitted from the so-called digital revolution, which has enabled the tackling of more and more complex physical phenomena using hydroinformatic models, instruments, sensors, etc. With models getting more and more complex, computational domains getting larger and the resolution of computational grids and measurement data getting finer, a large amount of data is generated and consumed in any hydroinformatics related project. The ubiquitous availability of internet also contributes to this phenomenon with data being collected through sensor networks connected to telecommunications networks and the internet long before the term Internet of Things existed. Although generally good, this exponential increase in the number of available datasets gives rise to the need to describe this data in a standardised way to not only be able to get a quick overview about the data but to also facilitate interoperability of data from different sources. The Federal Waterways Engineering and Research Institute (BAW) is a federal authority of the German Federal Ministry of Transport and Digital Infrastructure. BAW acts as a consultant for the safe and efficient operation of the German waterways. As part of its consultation role, BAW operates a number of physical and numerical models for sections of inland and marine waterways. In order to uniformly describe the data produced and consumed by these models throughout BAW and to ensure interoperability with other federal and state institutes on the one hand and with EU countries on the other, a metadata profile for hydroinformatic data has been developed at BAW. The metadata profile is composed in its entirety using the ISO 19115 international standard for metadata related to geographic information. Due to the widespread use of the ISO 19115 standard in the existing geodata infrastructure worldwide, the profile provides a means to describe hydroinformatic data that conforms to existing metadata standards. Additionally, EU and German national standards, INSPIRE and GDI-DE have been considered to ensure interoperability on an international and national level. Finally, elements of the GovData profile of the Federal Government of Germany have been integrated to be able to participate in its Open Data initiative. All these factors make the metadata profile developed at BAW highly suitable for describing hydroinformatic data in particular and physical state variables in general. Further details about this metadata profile will be presented at the conference. Acknowledgements: The authors would like to thank Christoph Wosniok and Peter Schade for their contributions towards the development of this metadata standard.

  17. The Geodetic Seamless Archive Centers Service Layer: A System Architecture for Federating Geodesy Data Repositories

    NASA Astrophysics Data System (ADS)

    McWhirter, J.; Boler, F. M.; Bock, Y.; Jamason, P.; Squibb, M. B.; Noll, C. E.; Blewitt, G.; Kreemer, C. W.

    2010-12-01

    Three geodesy Archive Centers, the Scripps Orbit and Permanent Array Center (SOPAC), NASA's Crustal Dynamics Data Information System (CDDIS) and UNAVCO, are engaged in a joint effort to define and develop a common Web Service Application Programming Interface (API) for accessing geodetic data holdings. This effort is funded by the NASA ROSES ACCESS Program to modernize the original GPS Seamless Archive Centers (GSAC) technology, which was developed in the 1990s. A new web service interface, the GSAC-WS, is being developed to provide uniform and expanded mechanisms through which users can access our data repositories. In total, our respective archives hold tens of millions of files and contain a rich collection of site/station metadata. Though we serve similar user communities, we currently provide a range of different access methods, query services and metadata formats. This leads to a lack of consistency in the user's experience and a duplication of engineering efforts. The GSAC-WS API and its reference implementation in an underlying Java-based GSAC Service Layer (GSL) support metadata and data queries into site/station-oriented data archives. The general nature of this API makes it applicable to a broad range of data systems. The overall goals of this project include providing consistent and rich query interfaces for end users and client programs, developing enabling technology that helps third-party repositories implement these web service capabilities, and enabling data queries across a collection of federated GSAC-WS-enabled repositories. A fundamental challenge faced in this project is to provide a common suite of query services across a heterogeneous collection of data while enabling each repository to expose its specific metadata holdings. To address this challenge we are developing a "capabilities"-based service in which a repository can describe its specific query and metadata capabilities. Furthermore, the architecture of the GSL is based on a model-view paradigm that decouples the underlying data model semantics from particular representations of the data model. This will allow GSAC-WS-enabled repositories to evolve their service offerings to incorporate new metadata definition formats (e.g., ISO 19115, FGDC, JSON, etc.) and new techniques for accessing their holdings. Building on the core GSAC-WS implementations, the project is also developing a federated/distributed query service. This service will seamlessly integrate with the GSAC Service Layer and will support data and metadata queries across a collection of federated GSAC repositories.

  18. Joint Battlespace Infosphere: Information Management Within a C2 Enterprise

    DTIC Science & Technology

    2005-06-01

    In version 1.2, we support both MySQL and Oracle as underlying implementations where the XML metadata schema is mapped into relational tables ... Identity Servers, Role-Based Access Control, and Policy Representation. Databases: Oracle, MySQL, TigerLogic, Berkeley XML DB ... Instrumentation Services ... converted to SQL for execution. Invocations are then forwarded to the appropriate underlying IOR core components that have the responsibility of issuing

  19. A Python library for FAIRer access and deposition to the Metabolomics Workbench Data Repository.

    PubMed

    Smelter, Andrey; Moseley, Hunter N B

    2018-01-01

    The Metabolomics Workbench Data Repository is a public repository of mass spectrometry and nuclear magnetic resonance data and metadata derived from a wide variety of metabolomics studies. The data and metadata for each study are deposited, stored, and accessed via files in the domain-specific 'mwTab' flat file format. In order to improve the accessibility, reusability, and interoperability of the data and metadata stored in 'mwTab' formatted files, we implemented a Python library and package. This Python package, named 'mwtab', is a parser for the domain-specific 'mwTab' flat file format, which provides facilities for reading, accessing, and writing 'mwTab' formatted files. Furthermore, the package provides facilities to validate both the format and the required metadata elements of a given 'mwTab' formatted file. In order to develop the 'mwtab' package we used the official 'mwTab' format specification. We used Git version control along with the Python unit-testing framework and a continuous integration service to run those tests on multiple versions of Python. Package documentation was developed using the Sphinx documentation generator. The 'mwtab' package provides both Python programmatic library interfaces and command-line interfaces for reading, writing, and validating 'mwTab' formatted files. Data and associated metadata are stored within Python dictionary- and list-based data structures, enabling straightforward, 'pythonic' access and manipulation of data and metadata. Also, the package provides facilities to convert 'mwTab' files into a JSON-formatted equivalent, enabling easy reusability of the data by all modern programming languages that implement JSON parsers. The 'mwtab' package implements its metadata validation functionality based on a pre-defined JSON schema that can be easily specialized for specific types of metabolomics studies. The library also provides a command-line interface for interconversion between 'mwTab' and JSONized formats in raw text and a variety of compressed binary file formats. The 'mwtab' package is an easy-to-use Python package that provides FAIRer utilization of the Metabolomics Workbench Data Repository. The source code is freely available on GitHub and via the Python Package Index. Documentation includes a 'User Guide', 'Tutorial', and 'API Reference'. The GitHub repository also provides 'mwtab' package unit-tests via a continuous integration service.
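
    The JSON-schema-based metadata validation that the package performs can be illustrated generically with the jsonschema library. The schema and record below are toy stand-ins invented for the example; they are not the actual mwTab section definitions or the 'mwtab' API.

      # Generic illustration of JSON-schema-based metadata validation of the kind
      # described above. The schema and record are toy stand-ins, not the real
      # mwTab section definitions.
      from jsonschema import ValidationError, validate

      study_schema = {
          "type": "object",
          "required": ["STUDY_TITLE", "INSTITUTE", "LAST_NAME"],
          "properties": {
              "STUDY_TITLE": {"type": "string", "minLength": 1},
              "INSTITUTE": {"type": "string"},
              "LAST_NAME": {"type": "string"},
          },
      }

      record = {"STUDY_TITLE": "Example study", "INSTITUTE": "Example University"}

      try:
          validate(instance=record, schema=study_schema)
      except ValidationError as err:
          print("metadata validation failed:", err.message)  # LAST_NAME is missing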

  20. Log-less metadata management on metadata server for parallel file systems.

    PubMed

    Liao, Jianwei; Xiao, Guoqiang; Peng, Xiaoning

    2014-01-01

    This paper presents a novel metadata management mechanism on the metadata server (MDS) for parallel and distributed file systems. In this technique, the client file system backs up the metadata requests it has sent and that the metadata server has already handled, so that the MDS does not need to log metadata changes to nonvolatile storage in order to provide a highly available metadata service; this also improves metadata processing performance. Because the client file system backs up certain sent metadata requests in its memory, the overhead of handling these backup requests is much smaller than the overhead incurred by a metadata server that adopts logging or journaling to provide a highly available metadata service. The experimental results show that this newly proposed mechanism can significantly improve the speed of metadata processing and yield better I/O data throughput than conventional metadata management schemes, that is, logging or journaling on the MDS. Moreover, a complete metadata recovery can be achieved by replaying the backup logs cached by all involved clients when the metadata server has crashed or otherwise become non-operational.
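
    As described, the scheme moves durability from MDS-side logging to client-side backups of sent requests that can be replayed after a crash. The sketch below is a conceptual Python rendering of that idea, with invented class names; it is not the authors' implementation.

      # Conceptual sketch of the log-less scheme: clients keep a backup of each
      # metadata request they send, and the MDS rebuilds its state after a crash
      # by replaying those cached requests instead of reading a local journal.
      class ClientFS:
          def __init__(self, mds):
              self.mds = mds
              self.backup = []                 # in-memory backup of sent requests

          def send(self, request):
              self.backup.append(request)      # back up the request before applying
              self.mds.apply(request)

      class MetadataServer:
          def __init__(self):
              self.namespace = {}              # in-memory metadata, no journal

          def apply(self, request):
              op, path, meta = request
              if op == "create":
                  self.namespace[path] = meta
              elif op == "remove":
                  self.namespace.pop(path, None)

          def recover(self, clients):
              self.namespace.clear()
              for client in clients:           # replay all clients' cached requests
                  for request in client.backup:
                      self.apply(request)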

  1. Log-Less Metadata Management on Metadata Server for Parallel File Systems

    PubMed Central

    Xiao, Guoqiang; Peng, Xiaoning

    2014-01-01

    This paper presents a novel metadata management mechanism on the metadata server (MDS) for parallel and distributed file systems. In this technique, the client file system backs up the metadata requests it has sent and that the metadata server has already handled, so that the MDS does not need to log metadata changes to nonvolatile storage in order to provide a highly available metadata service; this also improves metadata processing performance. Because the client file system backs up certain sent metadata requests in its memory, the overhead of handling these backup requests is much smaller than the overhead incurred by a metadata server that adopts logging or journaling to provide a highly available metadata service. The experimental results show that this newly proposed mechanism can significantly improve the speed of metadata processing and yield better I/O data throughput than conventional metadata management schemes, that is, logging or journaling on the MDS. Moreover, a complete metadata recovery can be achieved by replaying the backup logs cached by all involved clients when the metadata server has crashed or otherwise become non-operational. PMID:24892093

  2. Texture-Based Correspondence Display

    NASA Technical Reports Server (NTRS)

    Gerald-Yamasaki, Michael

    2004-01-01

    Texture-based correspondence display is a methodology to display corresponding data elements in visual representations of complex multidimensional, multivariate data. Texture is utilized as a persistent medium to contain a visual representation model and as a means to create multiple renditions of data where color is used to identify correspondence. Corresponding data elements are displayed over a variety of visual metaphors in a normal rendering process without adding extraneous linking metadata creation and maintenance. The effectiveness of visual representation for understanding data is extended to the expression of the visual representation model in texture.

  3. Legacy2Drupal - Conversion of an existing oceanographic relational database to a semantically enabled Drupal content management system

    NASA Astrophysics Data System (ADS)

    Maffei, A. R.; Chandler, C. L.; Work, T.; Allen, J.; Groman, R. C.; Fox, P. A.

    2009-12-01

    Content Management Systems (CMSs) provide powerful features that can be of use to oceanographic (and other geo-science) data managers. However, in many instances, geo-science data management offices have previously designed customized schemas for their metadata. The WHOI Ocean Informatics initiative and the NSF-funded Biological and Chemical Oceanography Data Management Office (BCO-DMO) have jointly sponsored a project to port an existing relational database containing oceanographic metadata, along with an existing interface coded in Cold Fusion middleware, to a Drupal 6 Content Management System. The goal was to translate all the existing database tables, input forms, website reports, and other features present in the existing system to employ Drupal CMS features. The replacement features include Drupal content types, CCK node-reference fields, themes, RDB, SPARQL, workflow, and a number of other supporting modules. Strategic use of some Drupal 6 CMS features enables three separate but complementary interfaces that provide access to oceanographic research metadata via the MySQL database: 1) a Drupal 6-powered front-end; 2) a standard SQL port (used to provide a Mapserver interface to the metadata and data); and 3) a SPARQL port (feeding a new faceted search capability being developed). Future plans include the creation of science ontologies, by scientist/technologist teams, that will drive semantically-enabled faceted search capabilities planned for the site. Incorporation of semantic technologies included in the future Drupal 7 core release is also anticipated. Using a public-domain CMS as opposed to proprietary middleware, and taking advantage of the many features of Drupal 6 that are designed to support semantically-enabled interfaces, will help prepare the BCO-DMO database for interoperability with other ecosystem databases.

  4. A Metadata Management Framework for Collaborative Review of Science Data Products

    NASA Astrophysics Data System (ADS)

    Hart, A. F.; Cinquini, L.; Mattmann, C. A.; Thompson, D. R.; Wagstaff, K.; Zimdars, P. A.; Jones, D. L.; Lazio, J.; Preston, R. A.

    2012-12-01

    Data volumes generated by modern scientific instruments often preclude archiving the complete observational record. To compensate, science teams have developed a variety of "triage" techniques for identifying data of potential scientific interest and marking it for prioritized processing or permanent storage. This may involve multiple stages of filtering with both automated and manual components operating at different timescales. A promising approach exploits a fast, fully automated first stage followed by a more reliable offline manual review of candidate events. This hybrid approach permits a 24-hour rapid real-time response while also preserving the high accuracy of manual review. To support this type of second-level validation effort, we have developed a metadata-driven framework for the collaborative review of candidate data products. The framework consists of a metadata processing pipeline and a browser-based user interface that together provide a configurable mechanism for reviewing data products via the web, and capturing the full stack of associated metadata in a robust, searchable archive. Our system heavily leverages software from the Apache Object Oriented Data Technology (OODT) project, an open source data integration framework that facilitates the construction of scalable data systems and places a heavy emphasis on the utilization of metadata to coordinate processing activities. OODT provides a suite of core data management components for file management and metadata cataloging that form the foundation for this effort. The system has been deployed at JPL in support of the V-FASTR experiment [1], a software-based radio transient detection experiment that operates commensally at the Very Long Baseline Array (VLBA), and has a science team that is geographically distributed across several countries. Daily review of automatically flagged data is a shared responsibility for the team, and is essential to keep the project within its resource constraints. We describe the development of the platform using open source software, and discuss our experience deploying the system operationally. [1] R.B.Wayth,W.F.Brisken,A.T.Deller,W.A.Majid,D.R.Thompson, S. J. Tingay, and K. L. Wagstaff, "V-fastr: The vlba fast radio transients experiment," The Astrophysical Journal, vol. 735, no. 2, p. 97, 2011. Acknowledgement: This effort was supported by the Jet Propulsion Laboratory, managed by the California Institute of Technology under a contract with the National Aeronautics and Space Administration.

  5. Legacy2Drupal: Conversion of an existing relational oceanographic database to a Drupal 7 CMS

    NASA Astrophysics Data System (ADS)

    Work, T. T.; Maffei, A. R.; Chandler, C. L.; Groman, R. C.

    2011-12-01

    Content Management Systems (CMSs) such as Drupal provide powerful features that can be of use to oceanographic (and other geo-science) data managers. However, in many instances, geo-science data management offices have already designed and implemented customized schemas for their metadata. The NSF-funded Biological and Chemical Oceanography Data Management Office (BCO-DMO) has ported an existing relational database containing oceanographic metadata, along with an existing interface coded in Cold Fusion middleware, to a Drupal 7 Content Management System. This is an update on an effort described as a proof-of-concept in poster IN21B-1051, presented at AGU 2009. The BCO-DMO project has translated all the existing database tables, input forms, website reports, and other features present in the existing system into Drupal CMS features. The replacement features are made possible by the use of Drupal content types, CCK node-reference fields, a custom theme, and a number of other supporting modules. This presentation describes the process used to migrate content from the original BCO-DMO metadata database to Drupal 7, some problems encountered during migration, and the modules used to migrate the content successfully. Strategic use of Drupal 7 CMS features that enable three separate but complementary interfaces to provide access to oceanographic research metadata will also be covered: 1) a Drupal 7-powered user front-end; 2) REST-ful JSON web services (providing a Mapserver interface to the metadata and data); and 3) a SPARQL interface to a semantic representation of the repository metadata (feeding a new faceted search capability currently under development). The existing BCO-DMO ontology, developed in collaboration with Rensselaer Polytechnic Institute's Tetherless World Constellation, makes strategic use of pre-existing ontologies and will be used to drive the semantically-enabled faceted search capabilities planned for the site. At this point, the use of the semantic technologies included in the Drupal 7 core is anticipated. Using a public-domain CMS as opposed to proprietary middleware, and taking advantage of the many features of Drupal 7 that are designed to support semantically-enabled interfaces, will help prepare the BCO-DMO and other science data repositories for interoperability between systems that serve ecosystem research data.

  6. Automated Database Mediation Using Ontological Metadata Mappings

    PubMed Central

    Marenco, Luis; Wang, Rixin; Nadkarni, Prakash

    2009-01-01

    Objective To devise an automated approach for integrating federated database information using database ontologies constructed from their extended metadata. Background One challenge of database federation is that the granularity of representation of equivalent data varies across systems. Dealing effectively with this problem is analogous to dealing with precoordinated vs. postcoordinated concepts in biomedical ontologies. Model Description The authors describe an approach based on ontological metadata mapping rules defined with elements of a global vocabulary, which allows a query specified at one granularity level to fetch data, where possible, from databases within the federation that use different granularities. This is implemented in OntoMediator, a newly developed production component of our previously described Query Integrator System. OntoMediator's operation is illustrated with a query that accesses three geographically separate, interoperating databases. An example based on SNOMED also illustrates the applicability of high-level rules to support the enforcement of constraints that can prevent inappropriate curator or power-user actions. Summary A rule-based framework simplifies the design and maintenance of systems where categories of data must be mapped to each other, for the purpose of either cross-database query or for curation of the contents of compositional controlled vocabularies. PMID:19567801

  7. Lightweight Advertising and Scalable Discovery of Services, Datasets, and Events Using Feedcasts

    NASA Astrophysics Data System (ADS)

    Wilson, B. D.; Ramachandran, R.; Movva, S.

    2010-12-01

    Broadcast feeds (Atom or RSS) are a mechanism for advertising the existence of new data objects on the web, with metadata and links to further information. Users then subscribe to the feed to receive updates. This concept has already been used to advertise the new granules of science data as they are produced (datacasting), with browse images and metadata, and to advertise bundles of web services (service casting). Structured metadata is introduced into the XML feed format by embedding new XML tags (in defined namespaces), using typed links, and reusing built-in Atom feed elements. This “infocasting” concept can be extended to include many other science artifacts, including data collections, workflow documents, topical geophysical events (hurricanes, forest fires, etc.), natural hazard warnings, and short articles describing a new science result. The common theme is that each infocast contains machine-readable, structured metadata describing the object and enabling further manipulation. For example, service casts contain typed links pointing to the service interface description (e.g., WSDL for SOAP services), the service endpoint, and human-readable documentation. Our Infocasting project has three main goals: (1) define and evangelize micro-formats (metadata standards) so that providers can easily advertise their web services, datasets, and topical geophysical events by adding structured information to broadcast feeds; (2) develop authoring tools so that anyone can easily author such service advertisements, data casts, and event descriptions; and (3) provide a one-stop, Google-like search box in the browser that allows discovery of service, data and event casts visible on the web, and of services and data registered in the GEOSS repository and other NASA repositories (GCMD & ECHO). To demonstrate the event casting idea, a series of micro-articles, with accompanying event casts containing links to relevant datasets, web services, and science analysis workflows, will be authored for several kinds of geophysical events, such as hurricanes, smoke plume events, tsunamis, etc. The talk will describe our progress so far, and some of the issues with leveraging existing metadata standards to define lightweight micro-formats.
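
    A minimal sketch of consuming such a cast with the feedparser library, pulling out the typed links an entry advertises; the feed URL is hypothetical, and the rel/type values shown depend entirely on the feed's own micro-format.

      # Sketch: subscribe to a (hypothetical) service-cast Atom feed and list the
      # typed links on each entry. The URL is illustrative only; rel/type values
      # depend on the micro-format used by the feed publisher.
      import feedparser

      feed = feedparser.parse("https://example.org/casts/services.atom")  # hypothetical

      for entry in feed.entries:
          print(entry.title)
          for link in entry.links:
              # Each link carries rel, type and href attributes.
              print("   ", link.get("rel"), link.get("type"), link.get("href"))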

  8. A Grid Metadata Service for Earth and Environmental Sciences

    NASA Astrophysics Data System (ADS)

    Fiore, Sandro; Negro, Alessandro; Aloisio, Giovanni

    2010-05-01

    Critical challenges for climate modeling researchers are strongly connected with the increasingly complex simulation models and the huge quantities of produced datasets. Future trends in climate modeling will only increase computational and storage requirements. For this reason the ability to transparently access both computational and data resources for large-scale complex climate simulations must be considered a key requirement for Earth Science and Environmental distributed systems. From the data management perspective, (i) the quantity of data will continuously increase, (ii) data will become more and more distributed and widespread, (iii) data sharing/federation will represent a key challenge among different sites distributed worldwide, and (iv) the potential community of users (large and heterogeneous) will be interested in discovering experimental results, searching metadata, browsing collections of files, comparing different results, displaying output, etc. A key element for carrying out data search and discovery, and for managing and accessing huge and distributed amounts of data, is the metadata handling framework. What we propose for the management of distributed datasets is the GRelC service (a data grid solution focusing on metadata management). In contrast with classical approaches, the proposed data-grid solution is able to address scalability, transparency, security, efficiency and interoperability. The GRelC service we propose provides access to metadata stored in different and widespread data sources (relational databases running on top of MySQL, Oracle, DB2, etc., leveraging SQL as the query language, as well as XML databases - XIndice, eXist, and libxml2-based documents, adopting either XPath or XQuery), providing a strong data virtualization layer in a grid environment. Such a technological solution for distributed metadata management (i) leverages well-known, widely adopted standards (W3C, OASIS, etc.); (ii) supports role-based management (based on VOMS), which increases flexibility and scalability; (iii) provides full support for the Grid Security Infrastructure (authorization, mutual authentication, data integrity, data confidentiality and delegation); (iv) is compatible with existing grid middleware such as gLite and Globus; and finally (v) is currently adopted at the Euro-Mediterranean Centre for Climate Change (CMCC - Italy) to manage the entire CMCC data production activity as well as in the international Climate-G testbed.

  9. A case for user-generated sensor metadata

    NASA Astrophysics Data System (ADS)

    Nüst, Daniel

    2015-04-01

    Cheap and easy-to-use sensing technology and new developments in ICT towards a global network of sensors and actuators promise previously unthought-of changes in our understanding of the environment. Large professional as well as amateur sensor networks exist, and they are used for specific yet diverse applications across domains such as hydrology, meteorology or early warning systems. However, the impact this "abundance of sensors" has had so far is somewhat disappointing. There is a gap between (community-driven) sensor networks that could provide very useful data and the users of the data. In our presentation, we argue this is due to a lack of metadata that would allow determining the fitness for use of a dataset. Syntactic and semantic interoperability for sensor webs have made great progress and continue to be an active field of research, yet the solutions are often quite complex, which is of course due to the complexity of the problem at hand. Still, we see the most generic information for determining fitness for use to be a dataset's provenance, because it allows users to make up their own minds independently of existing classification schemes for data quality. In this work we make the case that curated, user-contributed metadata has the potential to improve this situation. This especially applies to scenarios in which an observed property is applicable in different domains, and to set-ups where the understanding of metadata concepts and (meta-)data quality differs between data provider and user. On the one hand, a citizen does not understand ISO provenance metadata. On the other hand, a researcher might find issues in publicly accessible time series published by citizens, which the latter might not be aware of or care about. Because users will have to determine fitness for use for each application on their own anyway, we suggest an online collaboration platform for user-generated metadata based on an extremely simplified data model. In the most basic fashion, metadata generated by users can be boiled down to a basic property of the world wide web: many information items, such as news or blog posts, allow users to create comments and rate the content. Therefore we argue for focusing a core data model on one text field for a textual comment, one optional numerical field for a rating, and a resolvable identifier for the dataset that is commented on; a minimal record of this kind is sketched below. We present a conceptual framework that integrates user comments into existing standards and relevant applications of online sensor networks and discuss possible approaches, such as linked data, brokering, or standalone metadata portals. We relate this framework to existing work in user-generated content, such as proprietary rating systems on commercial websites, microformats, the GeoViQua User Quality Model, the CHARMe annotations, or W3C Open Annotation. These systems are also explored for commonalities, and based on their very useful concepts and ideas we present an outline for future extensions of the minimal model. Building on this framework we present a concept of how a simplistic comment-rating system can be extended to capture provenance information for spatio-temporal observations in the sensor web, and how this framework can be evaluated.
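
    The extremely simplified data model argued for above boils down to three fields, which can be sketched directly. The class and field names below are illustrative only, not part of any proposed standard.

      # Illustrative sketch of the minimal user-generated metadata record described
      # above: a free-text comment, an optional rating, and a resolvable dataset
      # identifier. Names are invented for the example.
      from dataclasses import dataclass
      from typing import Optional

      @dataclass
      class UserMetadataComment:
          dataset_id: str               # resolvable identifier of the commented dataset
          comment: str                  # free-text observation, e.g. on fitness for use
          rating: Optional[int] = None  # optional numeric rating, e.g. 1-5

      note = UserMetadataComment(
          dataset_id="https://example.org/timeseries/4711",
          comment="Gauge was relocated in 2013; earlier values are not comparable.",
          rating=3,
      )
      print(note)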

  10. openBEB: open biological experiment browser for correlative measurements

    PubMed Central

    2014-01-01

    Background New experimental methods must be developed to study interaction networks in systems biology. To reduce biological noise, individual subjects, such as single cells, should be analyzed using high throughput approaches. The measurement of several correlative physical properties would further improve data consistency. Accordingly, a considerable quantity of data must be acquired, correlated, catalogued and stored in a database for subsequent analysis. Results We have developed openBEB (open Biological Experiment Browser), a software framework for data acquisition, coordination, annotation and synchronization with database solutions such as openBIS. OpenBEB consists of two main parts: A core program and a plug-in manager. Whereas the data-type independent core of openBEB maintains a local container of raw-data and metadata and provides annotation and data management tools, all data-specific tasks are performed by plug-ins. The open architecture of openBEB enables the fast integration of plug-ins, e.g., for data acquisition or visualization. A macro-interpreter allows the automation and coordination of the different modules. An update and deployment mechanism keeps the core program, the plug-ins and the metadata definition files in sync with a central repository. Conclusions The versatility, the simple deployment and update mechanism, and the scalability in terms of module integration offered by openBEB make this software interesting for a large scientific community. OpenBEB targets three types of researcher, ideally working closely together: (i) Engineers and scientists developing new methods and instruments, e.g., for systems-biology, (ii) scientists performing biological experiments, (iii) theoreticians and mathematicians analyzing data. The design of openBEB enables the rapid development of plug-ins, which will inherently benefit from the “house keeping” abilities of the core program. We report the use of openBEB to combine live cell microscopy, microfluidic control and visual proteomics. In this example, measurements from diverse complementary techniques are combined and correlated. PMID:24666611
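
    The core-plus-plug-in split described above can be pictured with a very small registry. The sketch below is a generic Python analogy with invented names, not the actual openBEB code base.

      # Generic analogy for an openBEB-style architecture: a data-type-independent
      # core that keeps a local container of (raw data, metadata) pairs, and
      # plug-ins registered for the data-specific tasks. Names are invented.
      class Core:
          def __init__(self):
              self.container = []      # local container of (raw_data, metadata) pairs
              self.plugins = {}        # task name -> plug-in callable

          def register(self, task, plugin):
              self.plugins[task] = plugin

          def annotate(self, raw_data, metadata):
              self.container.append((raw_data, metadata))

          def run(self, task, *args, **kwargs):
              return self.plugins[task](self, *args, **kwargs)

      def acquire_dummy_frame(core):
          """Example plug-in: acquires one 'frame' and annotates it in the core."""
          core.annotate(raw_data=b"\x00" * 16, metadata={"source": "dummy camera"})
          return len(core.container)

      core = Core()
      core.register("acquire", acquire_dummy_frame)
      print(core.run("acquire"))   # -> 1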

  11. Engaging a community towards marine cyberinfrastructure: Lessons Learned from The Marine Metadata Interoperability initiative

    NASA Astrophysics Data System (ADS)

    Galbraith, N. R.; Graybeal, J.; Bermudez, L. E.; Wright, D.

    2005-12-01

    The Marine Metadata Interoperability (MMI) initiative promotes the exchange, integration and use of marine data through enhanced data publishing, discovery, documentation and accessibility. The project, operating since late 2004, presents several cultural and organizational challenges because of the diversity of participants: scientists, technical experts, and data managers from around the world, all working in organizations with different corporate cultures, funding structures, and systems of decision-making. MMI provides educational resources at several levels. For instance, short introductions to metadata concepts are available, as well as guides and "cookbooks" for the quick and efficient preparation of marine metadata. For those who are building major marine data systems, including ocean-observing capabilities, there are training materials, marine metadata content examples, and resources for mapping elements between different metadata standards. The MMI also provides examples of good metadata practices in existing data systems, including the EU's Marine XML project, and functioning ocean/coastal clearinghouses and atlases developed by MMI team members. Communication tools that help build community include: 1) The website, used to introduce the initiative to new visitors, and to provide in-depth guidance and resources to members and visitors. The site is built using Plone, an open source web content management system. Plone allows the site to serve as a wiki, to which every user can contribute material. This keeps the membership engaged and spreads the responsibility for the tasks of updating and expanding the site. 2) Email lists, to engage the broad ocean sciences community. The discussion forums "news," "ask," and "site-help" are available for receiving regular updates on MMI activities, seeking advice or support on projects and standards, or for assistance with using the MMI site. Internal email lists are provided for the Technical Team, the Steering Committee and Executive Committee, and for several content-centered teams. These lists help keep committee members connected, and have been very successful in building consensus and momentum. 3) Regularly scheduled telecons, to provide the chance for interaction between members without the need to physically attend meetings. Both the steering committee and the technical team convene via phone every month. Discussions are guided by agendas published in advance, and minutes are kept on-line for reference. These telecons have been an important tool in moving the MMI project forward; they give members an opportunity for informal discussion and provide a timeframe for accomplishing tasks. 4) Workshops, to make progress towards community agreement, such as the technical workshop "Advancing Domain Vocabularies" held August 9-11, 2005, in Boulder, Colorado, where featured domain and metadata experts developed mappings between existing marine metadata vocabularies. Most of the work of the meeting was performed in six small, carefully organized breakout teams, oriented around specific domains. 5) A calendar of events, to keep users updated and to allow any event related to marine metadata and interoperability to be posted. 6) Specific tools to reach agreements among distributed communities; for example, we developed the Vocabulary Integration Environment (VINE), which allows formalized agreement on mappings across different vocabularies.

  12. Metadata for Web Resources: How Metadata Works on the Web.

    ERIC Educational Resources Information Center

    Dillon, Martin

    This paper discusses bibliographic control of knowledge resources on the World Wide Web. The first section sets the context of the inquiry. The second section covers the following topics related to metadata: (1) definitions of metadata, including metadata as tags and as descriptors; (2) metadata on the Web, including general metadata systems,…

  13. Metadata Dictionary Database: A Proposed Tool for Academic Library Metadata Management

    ERIC Educational Resources Information Center

    Southwick, Silvia B.; Lampert, Cory

    2011-01-01

    This article proposes a metadata dictionary (MDD) be used as a tool for metadata management. The MDD is a repository of critical data necessary for managing metadata to create "shareable" digital collections. An operational definition of metadata management is provided. The authors explore activities involved in metadata management in…

  14. NAVAIR Portable Source Initiative (NPSI) Standard for Reusable Source Dataset Metadata (RSDM) V2.4

    DTIC Science & Technology

    2012-09-26

    defining a raster file format: <RasterFileFormat> <FormatName>TIFF</FormatName> <Order>BIP</Order> <DataType>8-BIT_UNSIGNED</DataType> ... interleaved by line (BIL); Band interleaved by pixel (BIP). Element RasterFileFormatType/DataType: type restriction of xsd:string.

  15. Harvesting NASA's Common Metadata Repository (CMR)

    NASA Technical Reports Server (NTRS)

    Shum, Dana; Durbin, Chris; Norton, James; Mitchell, Andrew

    2017-01-01

    As part of NASA's Earth Observing System Data and Information System (EOSDIS), the Common Metadata Repository (CMR) stores metadata for over 30,000 datasets from both NASA and international providers along with over 300M granules. This metadata enables sub-second discovery and facilitates data access. While the CMR offers a robust temporal, spatial and keyword search functionality to the general public and international community, it is sometimes more desirable for international partners to harvest the CMR metadata and merge the CMR metadata into a partner's existing metadata repository. This poster will focus on best practices to follow when harvesting CMR metadata to ensure that any changes made to the CMR can also be updated in a partner's own repository. Additionally, since each partner has distinct metadata formats they are able to consume, the best practices will also include guidance on retrieving the metadata in the desired metadata format using CMR's Unified Metadata Model translation software.
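
    As a sketch of such a harvest, the snippet below pages through CMR collection metadata over plain HTTP. The endpoint, query parameters and response fields follow the public CMR search API as recalled here and should be treated as assumptions to verify against the current documentation; the provider ID is only an example.

      # Sketch of harvesting collection metadata from CMR over HTTP. Endpoint,
      # parameters and response fields are recalled from the public CMR search API
      # and should be verified against the current documentation before use.
      import requests

      CMR_SEARCH = "https://cmr.earthdata.nasa.gov/search/collections.umm_json"

      page = 1
      while True:
          resp = requests.get(
              CMR_SEARCH,
              params={"provider": "LARC_ASDC", "page_size": 100, "page_num": page},
              timeout=60,
          )
          resp.raise_for_status()
          items = resp.json().get("items", [])
          if not items:
              break
          for item in items:
              print(item["meta"]["concept-id"], item["umm"].get("ShortName"))
          page += 1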

  16. Harvesting NASA's Common Metadata Repository

    NASA Astrophysics Data System (ADS)

    Shum, D.; Mitchell, A. E.; Durbin, C.; Norton, J.

    2017-12-01

    As part of NASA's Earth Observing System Data and Information System (EOSDIS), the Common Metadata Repository (CMR) stores metadata for over 30,000 datasets from both NASA and international providers along with over 300M granules. This metadata enables sub-second discovery and facilitates data access. While the CMR offers a robust temporal, spatial and keyword search functionality to the general public and international community, it is sometimes more desirable for international partners to harvest the CMR metadata and merge the CMR metadata into a partner's existing metadata repository. This poster will focus on best practices to follow when harvesting CMR metadata to ensure that any changes made to the CMR can also be updated in a partner's own repository. Additionally, since each partner has distinct metadata formats they are able to consume, the best practices will also include guidance on retrieving the metadata in the desired metadata format using CMR's Unified Metadata Model translation software.

  17. The European Plate Observing System (EPOS): Integrating Thematic Services for Solid Earth Science

    NASA Astrophysics Data System (ADS)

    Atakan, Kuvvet; Bailo, Daniele; Consortium, Epos

    2016-04-01

    The mission of EPOS is to monitor and understand the dynamic and complex Earth system by relying on new e-science opportunities and integrating diverse and advanced Research Infrastructures in Europe for solid Earth Science. EPOS will enable innovative multidisciplinary research for a better understanding of the Earth's physical and chemical processes that control earthquakes, volcanic eruptions, ground instability and tsunamis, as well as the processes driving tectonics and Earth's surface dynamics. Through the integration of data, models and facilities, EPOS will allow the Earth Science community to make a step change in developing new concepts and tools for key answers to scientific and socio-economic questions concerning geo-hazards and geo-resources, as well as Earth Science applications to the environment and to human welfare. EPOS, during its Implementation Phase (EPOS-IP), will integrate multidisciplinary data into a single e-infrastructure. Multidisciplinary data are organized and governed by the Thematic Core Services (TCS) and are driven by various scientific communities encompassing a wide spectrum of Earth Science disciplines. These include Data, Data-products, Services and Software (DDSS) from seismology, near-fault observatories, geodetic observations, volcano observations, satellite observations and geomagnetic observations, as well as data from various anthropogenic hazard episodes, geological information and modelling. In addition, transnational access to multi-scale laboratories and geo-energy test-beds for low-carbon energy will be provided. The TCS DDSS will be integrated into the Integrated Core Services (ICS), a platform that will ensure their interoperability and access to these services by the scientific community as well as other users within society. This requires dedicated tasks for interactions with the various TCS work packages (TCS-WPs), as well as with the various distributed ICS (ICS-Ds), such as High Performance Computing (HPC) facilities, large-scale data storage facilities, complex processing and visualization tools, etc. Computational Earth Science (CES) services are identified as a transversal activity and are planned to be harmonized and provided within the ICS. Currently, a comprehensive requirements and use-case elicitation process has been started through interactions with the ten different Thematic Core Service work packages. The results will be used to harmonize the DDSS elements and prepare for interoperability across the various disciplines. For this purpose a dedicated workshop is planned where the representatives of all the TCS communities will jointly discuss and agree upon the harmonization process. The technical integration of the DDSS elements into a metadata structure adopting CERIF (Common European Research Information Format) standards will start after the harmonization process is completed. The varying levels of maturity in the handling and availability of TCS-specific DDSS elements among the different TCS groups are one of the most challenging aspects of this integration. For this reason a roadmap for integration is being prepared in which the most mature DDSS elements will be implemented during the next two years, after a community-driven testing and validation process. Integration of the remaining DDSS elements will be a continuously evolving process in the coming years.

  18. GEOCAB Portal: A gateway for discovering and accessing capacity building resources in Earth Observation

    NASA Astrophysics Data System (ADS)

    Desconnets, Jean-Christophe; Giuliani, Gregory; Guigoz, Yaniss; Lacroix, Pierre; Mlisa, Andiswa; Noort, Mark; Ray, Nicolas; Searby, Nancy D.

    2017-02-01

    The discovery of and access to capacity building resources are often essential to conduct environmental projects based on Earth Observation (EO) resources, whether they are Earth Observation products, methodological tools, techniques, organizations that impart training in these techniques, or even projects that have shown practical achievements. Recognizing this opportunity and need, the European Commission, through two FP7 projects and jointly with the Group on Earth Observations (GEO), teamed up with the Committee on Earth Observation Satellites (CEOS). The Global Earth Observation CApacity Building (GEOCAB) portal aims at compiling all current capacity building efforts on the use of EO data for societal benefits into an easily updateable and user-friendly portal. GEOCAB offers a faceted search to improve the user discovery experience, together with a fully interactive world map of all inventoried projects and activities. This paper focuses on the conceptual framework used to implement the underlying platform. An ISO 19115 metadata model and an associated terminological repository are the core elements that provide a semantic search application and an interoperable discovery service. The organization and contribution of the different user communities that ensure the management and updating of GEOCAB's content are also addressed.
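
    As a rough illustration of the faceted-search idea described above, the following sketch filters and counts metadata records by facet value; the record fields and facet names are hypothetical and do not reflect GEOCAB's actual ISO 19115 schema.

      # Minimal faceted-search sketch over simplified metadata records.
      # Field names ("topic", "region") are illustrative, not GEOCAB's schema.
      from collections import Counter

      records = [
          {"title": "EO training course", "topic": "capacity building", "region": "Africa"},
          {"title": "SDI workshop", "topic": "infrastructure", "region": "Europe"},
          {"title": "Remote sensing MOOC", "topic": "capacity building", "region": "Europe"},
      ]

      def facet_counts(items, facet):
          """Count how many records carry each value of a facet."""
          return Counter(r[facet] for r in items if facet in r)

      def filter_by_facets(items, **selected):
          """Keep records whose values match every selected facet."""
          return [r for r in items if all(r.get(f) == v for f, v in selected.items())]

      print(facet_counts(records, "topic"))
      print(filter_by_facets(records, region="Europe", topic="capacity building"))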

  19. Sensor-agnostic photogrammetric image registration with applications to population modeling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    White, Devin A; Moehl, Jessica J

    2016-01-01

    Photogrammetric registration of airborne and spaceborne imagery is a crucial prerequisite to many data fusion tasks. While embedded sensor models provide a rough geolocation estimate, these metadata may be incomplete or imprecise. Manual solutions are appropriate for small-scale projects, but for rapid streams of cross-modal, multi-sensor, multi-temporal imagery with varying metadata standards, an automated approach is required. We present a high-performance image registration workflow to address this need. This paper outlines the core development concepts and demonstrates its utility with respect to the 2016 data fusion contest imagery. In particular, Iris ultra-HD video is georeferenced to the Earth surface via registration to DEIMOS-2 imagery, which serves as a trusted control source. Geolocation provides opportunity to augment the video with spatial context, stereo-derived disparity, spectral sensitivity, change detection, and numerous ancillary geospatial layers. We conclude by leveraging these derivative data layers towards one such fusion application: population distribution modeling.

  20. R classes and methods for SNP array data.

    PubMed

    Scharpf, Robert B; Ruczinski, Ingo

    2010-01-01

    The Bioconductor project is an "open source and open development software project for the analysis and comprehension of genomic data" (1), primarily based on the R programming language. Infrastructure packages, such as Biobase, are maintained by Bioconductor core developers and serve several key roles to the broader community of Bioconductor software developers and users. In particular, Biobase introduces an S4 class, the eSet, for high-dimensional assay data. Encapsulating the assay data as well as meta-data on the samples, features, and experiment in the eSet class definition ensures propagation of the relevant sample and feature meta-data throughout an analysis. Extending the eSet class promotes code reuse through inheritance as well as interoperability with other R packages and is less error-prone. Recently proposed class definitions for high-throughput SNP arrays extend the eSet class. This chapter highlights the advantages of adopting and extending Biobase class definitions through a working example of one implementation of classes for the analysis of high-throughput SNP arrays.
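
    The eSet pattern sketched above, in which assay values are bundled with sample and feature metadata so that subsetting cannot desynchronise them, can be illustrated outside R. The class below is a Python analogy of that idea, not Bioconductor's actual S4 implementation.

      # Python analogy of the eSet idea: keep the assay matrix and the sample/feature
      # metadata in one object so that subsetting keeps them aligned.
      # Illustration only; Bioconductor's real eSet is an R S4 class.
      class AssaySet:
          def __init__(self, assay, sample_meta, feature_meta):
              # assay: rows are features, columns are samples
              assert len(assay) == len(feature_meta)
              assert all(len(row) == len(sample_meta) for row in assay)
              self.assay, self.sample_meta, self.feature_meta = assay, sample_meta, feature_meta

          def subset_samples(self, keep):
              """Subset the assay columns and the matching sample metadata together."""
              assay = [[row[i] for i in keep] for row in self.assay]
              samples = [self.sample_meta[i] for i in keep]
              return AssaySet(assay, samples, self.feature_meta)

      es = AssaySet([[1.2, 0.8], [2.4, 2.1]],
                    [{"id": "s1", "group": "case"}, {"id": "s2", "group": "control"}],
                    [{"snp": "rs1"}, {"snp": "rs2"}])
      cases = es.subset_samples([0])   # assay values and sample metadata stay in step
      print(cases.sample_meta)         # [{'id': 's1', 'group': 'case'}]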

  1. Developing core elements and checklist items for global hospital antimicrobial stewardship programmes: a consensus approach.

    PubMed

    Pulcini, C; Binda, F; Lamkang, A S; Trett, A; Charani, E; Goff, D A; Harbarth, S; Hinrichsen, S L; Levy-Hara, G; Mendelson, M; Nathwani, D; Gunturu, R; Singh, S; Srinivasan, A; Thamlikitkul, V; Thursky, K; Vlieghe, E; Wertheim, H; Zeng, M; Gandra, S; Laxminarayan, R

    2018-04-03

    With increasing global interest in hospital antimicrobial stewardship (AMS) programmes, there is a strong demand for core elements of AMS to be clearly defined on the basis of principles of effectiveness and affordability. To date, efforts to identify such core elements have been limited to Europe, Australia, and North America. The aim of this study was to develop a set of core elements and their related checklist items for AMS programmes that should be present in all hospitals worldwide, regardless of resource availability. A literature review was performed by searching Medline and relevant websites to retrieve a list of core elements and items that could have global relevance. These core elements and items were evaluated by an international group of AMS experts using a structured modified Delphi consensus procedure, using two-phased online in-depth questionnaires. The literature review identified seven core elements and their related 29 checklist items from 48 references. Fifteen experts from 13 countries in six continents participated in the consensus procedure. Ultimately, all seven core elements were retained, as well as 28 of the initial checklist items plus one that was newly suggested, all with ≥80% agreement; 20 elements and items were rephrased. This consensus on core elements for hospital AMS programmes is relevant to both high- and low-to-middle-income countries and could facilitate the development of national AMS stewardship guidelines and adoption by healthcare settings worldwide. Copyright © 2018 European Society of Clinical Microbiology and Infectious Diseases. All rights reserved.

  2. Simplified Metadata Curation via the Metadata Management Tool

    NASA Astrophysics Data System (ADS)

    Shum, D.; Pilone, D.

    2015-12-01

    The Metadata Management Tool (MMT) is the newest capability developed as part of NASA Earth Observing System Data and Information System's (EOSDIS) efforts to simplify metadata creation and improve metadata quality. The MMT was developed via an agile methodology, taking into account inputs from GCMD's science coordinators and other end-users. In its initial release, the MMT uses the Unified Metadata Model for Collections (UMM-C) to allow metadata providers to easily create and update collection records in the ISO-19115 format. Through a simplified UI experience, metadata curators can create and edit collections without full knowledge of the NASA Best Practices implementation of ISO-19115 format, while still generating compliant metadata. More experienced users are also able to access raw metadata to build more complex records as needed. In future releases, the MMT will build upon recent work done in the community to assess metadata quality and compliance with a variety of standards through application of metadata rubrics. The tool will provide users with clear guidance as to how to easily change their metadata in order to improve their quality and compliance. Through these features, the MMT allows data providers to create and maintain compliant and high quality metadata in a short amount of time.
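
    As a loose sketch of the kind of completeness check such a tool can apply before a record is saved, the snippet below flags required collection fields that are still missing; the field list is illustrative and is not the official UMM-C definition.

      # Sketch of a required-field completeness check for a draft collection record.
      # The REQUIRED list is illustrative, not the official UMM-C specification.
      REQUIRED = ["ShortName", "Version", "EntryTitle", "Abstract", "TemporalExtents", "SpatialExtent"]

      def missing_fields(collection):
          """Return the required fields that are absent or empty in a draft record."""
          return [f for f in REQUIRED if not collection.get(f)]

      draft = {"ShortName": "EXAMPLE_L2_SST", "Version": "1", "EntryTitle": "Example collection"}
      print(missing_fields(draft))   # ['Abstract', 'TemporalExtents', 'SpatialExtent']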

  3. Enriched Video Semantic Metadata: Authorization, Integration, and Presentation.

    ERIC Educational Resources Information Center

    Mu, Xiangming; Marchionini, Gary

    2003-01-01

    Presents an enriched video metadata framework including video authorization using the Video Annotation and Summarization Tool (VAST)-a video metadata authorization system that integrates both semantic and visual metadata-- metadata integration, and user level applications. Results demonstrated that the enriched metadata were seamlessly…

  4. Metadata mapping and reuse in caBIG.

    PubMed

    Kunz, Isaac; Lin, Ming-Chin; Frey, Lewis

    2009-02-05

    This paper proposes that interoperability across biomedical databases can be improved by utilizing a repository of Common Data Elements (CDEs), UML model class-attributes and simple lexical algorithms to facilitate the building of domain models. This is examined in the context of an existing system, the National Cancer Institute (NCI)'s cancer Biomedical Informatics Grid (caBIG). The goal is to demonstrate the deployment of open source tools that can be used to effectively map models and enable the reuse of existing information objects and CDEs in the development of new models for translational research applications. This effort is intended to help developers reuse appropriate CDEs to enable interoperability of their systems when developing within the caBIG framework or other frameworks that use metadata repositories. The Dice (di-grams) and Dynamic algorithms are compared, and both algorithms have similar performance matching UML model class-attributes to CDE class object-property pairs. With the algorithms used, the baselines for automatically finding the matches are reasonable for the data models examined. This suggests that automatic mapping of UML models and CDEs is feasible within the caBIG framework and potentially any framework that uses a metadata repository. This work opens up the possibility of using mapping algorithms to reduce the cost and time required to map local data models to a reference data model such as those used within caBIG. This effort contributes to facilitating the development of interoperable systems within caBIG as well as other metadata frameworks. Such efforts are critical to address the need to develop systems to handle enormous amounts of diverse data that can be leveraged from new biomedical methodologies.
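
    The Dice (di-gram) matching mentioned above can be sketched as a character-bigram similarity between attribute names and candidate CDE names; the example names and the 0.6 acceptance threshold are illustrative and do not reproduce caBIG's actual configuration.

      # Character-bigram Dice similarity for matching UML class-attribute names to
      # CDE names. Names and the 0.6 threshold are illustrative only.
      def bigrams(text):
          t = text.lower().replace("_", " ").replace(".", " ")
          return {t[i:i + 2] for i in range(len(t) - 1)}

      def dice(a, b):
          ba, bb = bigrams(a), bigrams(b)
          return 2.0 * len(ba & bb) / (len(ba) + len(bb)) if ba and bb else 0.0

      uml_attributes = ["Patient.birthDate", "Specimen.anatomicSite"]
      cde_candidates = ["Patient Birth Date", "Specimen Anatomic Site Name", "Protocol Identifier"]

      for attr in uml_attributes:
          best = max(cde_candidates, key=lambda c: dice(attr, c))
          if dice(attr, best) >= 0.6:
              print(f"{attr} -> {best} ({dice(attr, best):.2f})")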

  5. Evaluating and Evolving Metadata in Multiple Dialects

    NASA Astrophysics Data System (ADS)

    Kozimor, J.; Habermann, T.; Powers, L. A.; Gordon, S.

    2016-12-01

    Despite many long-term homogenization efforts, communities continue to develop focused metadata standards along with related recommendations and (typically) XML representations (aka dialects) for sharing metadata content. Different representations easily become obstacles to sharing information because each representation generally requires a set of tools and skills that are designed, built, and maintained specifically for that representation. In contrast, community recommendations are generally described, at least initially, at a more conceptual level and are more easily shared. For example, most communities agree that dataset titles should be included in metadata records although they write the titles in different ways. This situation has led to the development of metadata repositories that can ingest and output metadata in multiple dialects. As an operational example, the NASA Common Metadata Repository (CMR) includes three different metadata dialects (DIF, ECHO, and ISO 19115-2). These systems raise a new question for metadata providers: if I have a choice of metadata dialects, which should I use and how do I make that decision? We have developed a collection of metadata evaluation tools that can be used to evaluate metadata records in many dialects for completeness with respect to recommendations from many organizations and communities. We have applied these tools to over 8000 collection and granule metadata records in four different dialects. This large collection of identical content in multiple dialects enables us to address questions about metadata and dialect evolution and to answer those questions quantitatively. We will describe those tools and results from evaluating the NASA CMR metadata collection.
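
    A small illustration of this kind of cross-dialect evaluation is given below for one conceptual recommendation (a non-empty dataset title); the element paths are simplified stand-ins for DIF- and ISO-style structures, not the full schemas.

      # Dialect-aware completeness check for a single conceptual field ("dataset title").
      # The element paths are simplified stand-ins, not the real DIF or ISO 19115-2 schemas.
      import xml.etree.ElementTree as ET

      TITLE_PATHS = {"dif": ".//Entry_Title", "iso_simplified": ".//title"}

      def has_title(xml_text, dialect):
          node = ET.fromstring(xml_text).find(TITLE_PATHS[dialect])
          return node is not None and (node.text or "").strip() != ""

      dif_record = "<DIF><Entry_Title>Sea Surface Temperature</Entry_Title></DIF>"
      iso_record = "<MD_Metadata><identificationInfo><title/></identificationInfo></MD_Metadata>"
      print(has_title(dif_record, "dif"))              # True
      print(has_title(iso_record, "iso_simplified"))   # False: element present but empty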

  6. TR32DB - Management of Research Data in a Collaborative, Interdisciplinary Research Project

    NASA Astrophysics Data System (ADS)

    Curdt, Constanze; Hoffmeister, Dirk; Waldhoff, Guido; Lang, Ulrich; Bareth, Georg

    2015-04-01

    The management of research data in a well-structured and documented manner is essential in the context of collaborative, interdisciplinary research environments (e.g. across various institutions). Consequently, the set-up and use of a research data management (RDM) system such as a data repository or project database is necessary. These systems should accompany and support scientists during the entire research life cycle (e.g. data collection, documentation, storage, archiving, sharing, publishing) and operate across disciplines in interdisciplinary research projects. Challenges and problems of RDM are well-known. Consequently, the set-up of a user-friendly, well-documented, sustainable RDM system is essential, as well as user support and further assistance. In the framework of the Transregio Collaborative Research Centre 32 'Patterns in Soil-Vegetation-Atmosphere Systems: Monitoring, Modelling, and Data Assimilation' (CRC/TR32), funded by the German Research Foundation (DFG), an RDM system was self-designed and implemented. The CRC/TR32 project database (TR32DB, www.tr32db.de) has been operating online since early 2008. The TR32DB handles all data created by the involved project participants from several institutions (e.g. Universities of Cologne, Bonn, Aachen, and the Research Centre Jülich) and research fields (e.g. soil and plant sciences, hydrology, geography, geophysics, meteorology, remote sensing). Very heterogeneous research data are considered, resulting from field measurement campaigns, meteorological monitoring, remote sensing, laboratory studies and modelling approaches. Furthermore, outcomes such as publications, conference contributions, PhD reports and corresponding images are also covered. The TR32DB project database is set up in cooperation with the Regional Computing Centre of the University of Cologne (RRZK) and also located in this hardware environment. The TR32DB system architecture is composed of three main components: (i) a file-based data storage including backup, (ii) a database-based storage for administrative data and metadata, and (iii) a web-interface for user access. The TR32DB offers common features of RDM systems. These include data storage, entry of corresponding metadata by a user-friendly input wizard, search and download of data depending on user permission, as well as secure internal exchange of data. In addition, a Digital Object Identifier (DOI) can be allocated for specific datasets and several web mapping components are supported (e.g. Web-GIS and map search). The centrepiece of the TR32DB is the self-designed and implemented CRC/TR32-specific metadata schema. This enables the documentation of all involved, heterogeneous data with accurate, interoperable metadata. The TR32DB Metadata Schema is set up in a multi-level approach and supports several metadata standards and schemes (e.g. Dublin Core, ISO 19115, INSPIRE, DataCite). Furthermore, metadata properties focusing on the CRC/TR32 background (e.g. CRC/TR32-specific keywords) and on the supported data types complement these standards. Mandatory, optional and automatic metadata properties are specified. Overall, the TR32DB is designed and implemented according to the needs of the CRC/TR32 (e.g. a huge amount of heterogeneous data) and the demands of the DFG (e.g. cooperation with a computing centre). The application of a self-designed, project-specific, interoperable metadata schema enables the accurate documentation of all CRC/TR32 data. The implementation of the TR32DB in the hardware environment of the RRZK ensures access to the data after the end of the CRC/TR32 funding in 2018.

  7. Core Standards of the EUBIROD Project. Defining a European Diabetes Data Dictionary for Clinical Audit and Healthcare Delivery.

    PubMed

    Cunningham, S G; Carinci, F; Brillante, M; Leese, G P; McAlpine, R R; Azzopardi, J; Beck, P; Bratina, N; Bocquet, V; Doggen, K; Jarosz-Chobot, P K; Jecht, M; Lindblad, U; Moulton, T; Metelko, Ž; Nagy, A; Olympios, G; Pruna, S; Skeie, S; Storms, F; Di Iorio, C T; Massi Benedetti, M

    2016-01-01

    A set of core diabetes indicators were identified in a clinical review of current evidence for the EUBIROD project. In order to allow accurate comparisons of diabetes indicators, a standardised currency for data storage and aggregation was required. We aimed to define a robust European data dictionary with appropriate clinical definitions that can be used to analyse diabetes outcomes and provide the foundation for data collection from existing electronic health records for diabetes. Existing clinical datasets used by 15 partner institutions across Europe were collated and common data items analysed for consistency in terms of recording, data definition and units of measurement. Where necessary, data mappings and algorithms were specified in order to allow partners to meet the standard definitions. A series of descriptive elements were created to document metadata for each data item, including recording, consistency, completeness and quality. While datasets varied in terms of consistency, it was possible to create a common standard that could be used by all. The minimum dataset defined 53 data items that were classified according to their feasibility and validity. Mappings and standardised definitions were used to create an electronic directory for diabetes care, providing the foundation for the EUBIROD data analysis repository, also used to implement the diabetes registry and model of care for Cyprus. The development of data dictionaries and standards can be used to improve the quality and comparability of health information. A data dictionary has been developed to be compatible with other existing data sources for diabetes, within and beyond Europe.

  8. Metadata behind the Interoperability of Wireless Sensor Networks

    PubMed Central

    Ballari, Daniela; Wachowicz, Monica; Callejo, Miguel Angel Manso

    2009-01-01

    Wireless Sensor Networks (WSNs) produce changes of status that are frequent, dynamic and unpredictable, and cannot be represented using a linear cause-effect approach. Consequently, a new approach is needed to handle these changes in order to support dynamic interoperability. Our approach is to introduce the notion of context as an explicit representation of changes of a WSN status inferred from metadata elements, which, in turn, leads towards a decision-making process about how to maintain dynamic interoperability. This paper describes the developed context model to represent and reason over different WSN statuses based on four types of contexts, which have been identified as sensing, node, network and organisational contexts. The reasoning has been addressed by developing contextualising and bridging rules. As a result, we were able to demonstrate how contextualising rules have been used to reason on changes of WSN status as a first step towards maintaining dynamic interoperability. PMID:22412330

  9. Metadata behind the Interoperability of Wireless Sensor Networks.

    PubMed

    Ballari, Daniela; Wachowicz, Monica; Callejo, Miguel Angel Manso

    2009-01-01

    Wireless Sensor Networks (WSNs) produce changes of status that are frequent, dynamic and unpredictable, and cannot be represented using a linear cause-effect approach. Consequently, a new approach is needed to handle these changes in order to support dynamic interoperability. Our approach is to introduce the notion of context as an explicit representation of changes of a WSN status inferred from metadata elements, which, in turn, leads towards a decision-making process about how to maintain dynamic interoperability. This paper describes the developed context model to represent and reason over different WSN statuses based on four types of contexts, which have been identified as sensing, node, network and organisational contexts. The reasoning has been addressed by developing contextualising and bridging rules. As a result, we were able to demonstrate how contextualising rules have been used to reason on changes of WSN status as a first step towards maintaining dynamic interoperability.
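
    A minimal, hypothetical sketch of a contextualising rule and a bridging rule is given below: a node-level context is inferred from reported metadata elements and then mapped to an interoperability action. The thresholds, context labels and actions are invented for illustration and are not the rule set defined in the paper.

      # Contextualising rule: infer a node context from metadata elements.
      # Bridging rule: map that context to an interoperability action.
      # Thresholds, labels and actions are illustrative only.
      def node_context(metadata):
          if metadata.get("battery_pct", 100) < 15:
              return "node:low-power"
          if metadata.get("seconds_since_report", 0) > 600:
              return "node:unreachable"
          return "node:operational"

      def bridge(context):
          actions = {
              "node:low-power": "reduce the sampling rate advertised to clients",
              "node:unreachable": "drop the node from the service offering",
              "node:operational": "no change",
          }
          return actions[context]

      meta = {"battery_pct": 12, "seconds_since_report": 30}
      print(bridge(node_context(meta)))   # reduce the sampling rate advertised to clients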

  10. EOS ODL Metadata On-line Viewer

    NASA Astrophysics Data System (ADS)

    Yang, J.; Rabi, M.; Bane, B.; Ullman, R.

    2002-12-01

    We have recently developed and deployed an EOS ODL metadata on-line viewer. The EOS ODL metadata viewer is a web server that takes: 1) an EOS metadata file in Object Description Language (ODL), and 2) parameters, such as which metadata to view and what style of display to use, and returns an HTML or XML document displaying the requested metadata in the requested style. This tool was developed to address widespread complaints from the science community that EOS Data and Information System (EOSDIS) metadata files in ODL are difficult to read; it allows users to upload and view an ODL metadata file in different styles using a web browser. Users can choose to view all of the metadata or only part of it, such as Collection metadata, Granule metadata, or Unsupported Metadata. Choices of display styles include 1) Web: a mouseable display with tabs and turn-down menus, 2) Outline: formatted and colored text, suitable for printing, 3) Generic: simple indented text, a direct representation of the underlying ODL metadata, and 4) None: no stylesheet is applied and the XML generated by the converter is returned directly. Not all display styles are implemented for all the metadata choices. For example, the Web style is only implemented for Collection and Granule metadata groups with known attribute fields, but not for Unsupported, Other, and All metadata. The overall strategy of the ODL viewer is to transform an ODL metadata file into viewable HTML in two steps. The first step is to convert the ODL metadata file to XML using a Java-based parser/translator called ODL2XML. The second step is to transform the XML to HTML using stylesheets. Both operations are done on the server side. This allows considerable flexibility in the final result and is highly portable across platforms. A Perl CGI script behind the Apache web server runs the Java-based ODL2XML converter and then passes the result through an XSLT processor. The EOS ODL viewer can be accessed from either a PC or a Mac using Internet Explorer 5.0+ or Netscape 4.7+.
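
    The two-step strategy described above (ODL to XML, then a stylesheet to HTML) can be imitated in miniature; the sketch below handles only flat KEY = VALUE lines and uses lxml's XSLT support as a stand-in, not the Java-based ODL2XML converter or the viewer's real stylesheets.

      # Toy two-step pipeline: ODL-style "KEY = VALUE" text -> XML -> HTML list.
      # Flat attributes only; not the real ODL2XML converter or EOSDIS stylesheets.
      from lxml import etree

      odl_text = "SHORTNAME = MOD021KM\nVERSIONID = 5\nDAYNIGHTFLAG = Day"

      root = etree.Element("metadata")
      for line in odl_text.splitlines():
          key, _, value = line.partition("=")
          etree.SubElement(root, "attribute", name=key.strip()).text = value.strip()

      stylesheet = etree.XML(
          '<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
          '<xsl:template match="/metadata"><ul><xsl:for-each select="attribute">'
          '<li><b><xsl:value-of select="@name"/></b>: <xsl:value-of select="."/></li>'
          '</xsl:for-each></ul></xsl:template></xsl:stylesheet>')

      html = etree.XSLT(stylesheet)(etree.ElementTree(root))
      print(str(html))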

  11. Creating context for the experiment record. User-defined metadata: investigations into metadata usage in the LabTrove ELN.

    PubMed

    Willoughby, Cerys; Bird, Colin L; Coles, Simon J; Frey, Jeremy G

    2014-12-22

    The drive toward more transparency in research, the growing willingness to make data openly available, and the reuse of data to maximize the return on research investment all increase the importance of being able to find information and make links to the underlying data. The use of metadata in Electronic Laboratory Notebooks (ELNs) to curate experiment data is an essential ingredient for facilitating discovery. The University of Southampton has developed a Web browser-based ELN that enables users to add their own metadata to notebook entries. A survey of these notebooks was completed to assess user behavior and patterns of metadata usage within ELNs, while user perceptions and expectations were gathered through interviews and user-testing activities within the community. The findings indicate that while some groups are comfortable with metadata and are able to design a metadata structure that works effectively, many users make little attempt to use it, thereby endangering their ability to recover data in the future. A survey of patterns of metadata use in these notebooks, together with feedback from the user community, indicated that while a few groups are comfortable with metadata and are able to design a metadata structure that works effectively, many users adopt a "minimum required" approach to metadata. To investigate whether the patterns of metadata use in LabTrove were unusual, a series of surveys was undertaken to investigate metadata usage in a variety of platforms supporting user-defined metadata. These surveys also provided the opportunity to investigate whether interface designs in these other environments might inform strategies for encouraging metadata creation and more effective use of metadata in LabTrove.

  12. Metadata squared: enhancing its usability for volunteered geographic information and the GeoWeb

    USGS Publications Warehouse

    Poore, Barbara S.; Wolf, Eric B.; Sui, Daniel Z.; Elwood, Sarah; Goodchild, Michael F.

    2013-01-01

    The Internet has brought many changes to the way geographic information is created and shared. One aspect that has not changed is metadata. Static spatial data quality descriptions were standardized in the mid-1990s and cannot accommodate the current climate of data creation where nonexperts are using mobile phones and other location-based devices on a continuous basis to contribute data to Internet mapping platforms. The usability of standard geospatial metadata is being questioned by academics and neogeographers alike. This chapter analyzes current discussions of metadata to demonstrate how the media shift that is occurring has affected requirements for metadata. Two case studies of metadata use are presented—online sharing of environmental information through a regional spatial data infrastructure in the early 2000s, and new types of metadata that are being used today in OpenStreetMap, a map of the world created entirely by volunteers. Changes in metadata requirements are examined for usability, the ease with which metadata supports coproduction of data by communities of users, how metadata enhances findability, and how the relationship between metadata and data has changed. We argue that traditional metadata associated with spatial data infrastructures is inadequate and suggest several research avenues to make this type of metadata more interactive and effective in the GeoWeb.

  13. Evolutions in Metadata Quality

    NASA Astrophysics Data System (ADS)

    Gilman, J.

    2016-12-01

    Metadata Quality is one of the chief drivers of discovery and use of NASA EOSDIS (Earth Observing System Data and Information System) data. Issues with metadata such as lack of completeness, inconsistency, and use of legacy terms directly hinder data use. As the central metadata repository for NASA Earth Science data, the Common Metadata Repository (CMR) has a responsibility to its users to ensure the quality of CMR search results. This talk will cover how we encourage metadata authors to improve the metadata through the use of integrated rubrics of metadata quality and outreach efforts. In addition we'll demonstrate Humanizers, a technique for dealing with the symptoms of metadata issues. Humanizers allow CMR administrators to identify specific metadata issues that are fixed at runtime when the data is indexed. An example Humanizer is the aliasing of processing level "Level 1" to "1" to improve consistency across collections. The CMR currently indexes 35K collections and 300M granules.
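
    The "Level 1" aliasing example above can be sketched as an index-time transformation in which the provider's stored record is left untouched and only the indexed copy is normalised. Only the "Level 1"-to-"1" alias comes from this abstract; the remaining aliases and field names are illustrative.

      # Index-time "humanizer" sketch: normalise processing-level values while
      # leaving the provider's original record untouched. Only "Level 1" -> "1"
      # comes from the abstract; the other aliases and field names are assumed.
      PROCESSING_LEVEL_ALIASES = {"Level 1": "1", "Level-1": "1", "L1": "1"}

      def humanize(record):
          indexed = dict(record)                        # copy; never mutate the original
          level = indexed.get("processing_level")
          indexed["processing_level"] = PROCESSING_LEVEL_ALIASES.get(level, level)
          return indexed

      original = {"short_name": "EXAMPLE_COLLECTION", "processing_level": "Level 1"}
      print(humanize(original)["processing_level"])     # 1
      print(original["processing_level"])               # Level 1 (stored record unchanged)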

  14. The role of metadata in managing large environmental science datasets. Proceedings

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Melton, R.B.; DeVaney, D.M.; French, J. C.

    1995-06-01

    The purpose of this workshop was to bring together computer science researchers and environmental sciences data management practitioners to consider the role of metadata in managing large environmental sciences datasets. The objectives included: establishing a common definition of metadata; identifying categories of metadata; defining problems in managing metadata; and defining problems related to linking metadata with primary data.

  15. Evolving the Living With a Star Data System Definition

    NASA Astrophysics Data System (ADS)

    Otranto, J. F.; Dijoseph, M.

    2003-12-01

    NASA's Living With a Star (LWS) Program is a space weather-focused and applications-driven research program. The LWS Program is soliciting input from the solar, space physics, space weather, and climate science communities to develop a system that enables access to science data associated with these disciplines, and advances the development of discipline and interdisciplinary findings. The LWS Program will implement a data system that builds upon the existing and planned data capture, processing, and storage components put in place by individual spacecraft missions and also inter-project data management systems, including active and deep archives, and multi-mission data repositories. It is technically feasible for the LWS Program to integrate data from a broad set of resources, assuming they are either publicly accessible or allow access by permission. The LWS Program data system will work in coordination with spacecraft mission data systems and science data repositories, integrating their holdings using a common metadata representation. This common representation relies on a robust metadata definition that provides journalistic and technical data descriptions, plus linkages to supporting data products and tools. The LWS Program intends to become an enabling resource to PIs, interdisciplinary scientists, researchers, and students facilitating both access to a broad collection of science data, as well as the necessary supporting components to understand and make productive use of these data. For the LWS Program to represent science data that are physically distributed across various ground system elements, information will be collected about these distributed data products through a series of LWS Program-created agents. These agents will be customized to interface or interact with each one of these data systems, collect information, and forward any new metadata records to a LWS Program-developed metadata library. A populated LWS metadata library will function as a single point-of-contact that serves the entire science community as a first stop for data availability, whether or not science data are physically stored in an LWS-operated repository. Further, this metadata library will provide the user access to information for understanding these data including descriptions of the associated spacecraft and instrument, data format, calibration and operations issues, links to ancillary and correlative data products, links to processing tools and models associated with these data, and any corresponding findings produced using these data. The LWS may also support an active archive for solar, space physics, space weather, and climate data when these data would otherwise be discarded or archived off-line. This archive could potentially serve also as a data storage backup facility for LWS missions. The plan for the LWS Program metadata library is developed based upon input received from the solar and geospace science communities; the library's architecture is based on existing systems developed for serving science metadata. The LWS Program continues to seek constructive input from the science community, examples of both successes and failures in dealing with science data systems, and insights regarding the obstacles between the current state-of-the-practice and this vision for the LWS Program metadata library.

  16. Building Format-Agnostic Metadata Repositories

    NASA Astrophysics Data System (ADS)

    Cechini, M.; Pilone, D.

    2010-12-01

    This presentation will discuss the problems that surround persisting and discovering metadata in multiple formats; a set of tenets that must be addressed in a solution; and NASA’s Earth Observing System (EOS) ClearingHOuse’s (ECHO) proposed approach. In order to facilitate cross-discipline data analysis, Earth Scientists will potentially interact with more than one data source. The most common data discovery paradigm relies on services and/or applications facilitating the discovery and presentation of metadata. What may not be common are the formats in which the metadata are formatted. As the number of sources and datasets utilized for research increases, it becomes more likely that a researcher will encounter conflicting metadata formats. Metadata repositories, such as the EOS ClearingHOuse (ECHO), along with data centers, must identify ways to address this issue. In order to define the solution to this problem, the following tenets are identified: - There exists a set of ‘core’ metadata fields recommended for data discovery. - There exists a set of users who will require the entire metadata record for advanced analysis. - There exists a set of users who will require a ‘core’ set of metadata fields for discovery only. - There will never be a cessation of new formats or a total retirement of all old formats. - Users should be presented metadata in a consistent format. ECHO has undertaken an effort to transform its metadata ingest and discovery services in order to support the growing set of metadata formats. In order to address the previously listed items, ECHO’s new metadata processing paradigm utilizes the following approach: - Identify a cross-format set of ‘core’ metadata fields necessary for discovery. - Implement format-specific indexers to extract the ‘core’ metadata fields into an optimized query capability. - Archive the original metadata in its entirety for presentation to users requiring the full record. - Provide on-demand translation of ‘core’ metadata to any supported result format. With this identified approach, the Earth Scientist is provided with a consistent data representation as they interact with a variety of datasets that utilize multiple metadata formats. They are then able to focus their efforts on the more critical research activities which they are undertaking.
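
    The approach listed above can be sketched as a tiny ingest pipeline in which format-specific indexers extract a shared 'core' field set while the native record is archived in its entirety. The two input "formats" and their field names below are simplified stand-ins rather than the actual DIF or ECHO schemas.

      # Sketch: per-format indexers extract cross-format 'core' fields for discovery;
      # the full native record is archived for users who need everything.
      # The "dif"/"echo" field names are simplified stand-ins, not the real schemas.
      def index_dif(record):
          return {"title": record["Entry_Title"], "start": record["Start_Date"]}

      def index_echo(record):
          return {"title": record["DataSetId"], "start": record["BeginningDateTime"]}

      INDEXERS = {"dif": index_dif, "echo": index_echo}
      archive, core_index = [], []

      def ingest(native_record, fmt):
          archive.append((fmt, native_record))             # keep the original whole
          core_index.append(INDEXERS[fmt](native_record))  # optimized core fields for search

      ingest({"Entry_Title": "SST L3", "Start_Date": "2002-07-04"}, "dif")
      ingest({"DataSetId": "EXAMPLE_GRANULES", "BeginningDateTime": "2000-02-24T00:00:00Z"}, "echo")
      print(core_index)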

  17. Making Metadata Better with CMR and MMT

    NASA Technical Reports Server (NTRS)

    Gilman, Jason Arthur; Shum, Dana

    2016-01-01

    Ensuring complete, consistent and high quality metadata is a challenge for metadata providers and curators. The CMR and MMT systems provide providers and curators options to build in metadata quality from the start and also assess and improve the quality of already existing metadata.

  18. Evolution in Metadata Quality: Common Metadata Repository's Role in NASA Curation Efforts

    NASA Technical Reports Server (NTRS)

    Gilman, Jason; Shum, Dana; Baynes, Katie

    2016-01-01

    Metadata Quality is one of the chief drivers of discovery and use of NASA EOSDIS (Earth Observing System Data and Information System) data. Issues with metadata such as lack of completeness, inconsistency, and use of legacy terms directly hinder data use. As the central metadata repository for NASA Earth Science data, the Common Metadata Repository (CMR) has a responsibility to its users to ensure the quality of CMR search results. This poster covers how we use humanizers, a technique for dealing with the symptoms of metadata issues, as well as our plans for future metadata validation enhancements. The CMR currently indexes 35K collections and 300M granules.

  19. Describing environmental public health data: implementing a descriptive metadata standard on the environmental public health tracking network.

    PubMed

    Patridge, Jeff; Namulanda, Gonza

    2008-01-01

    The Environmental Public Health Tracking (EPHT) Network provides an opportunity to bring together diverse environmental and health effects data by integrating local, state, and national databases of environmental hazards, environmental exposures, and health effects. To help users locate data on the EPHT Network, the network will utilize descriptive metadata that provide critical information as to the purpose, location, content, and source of these data. Since 2003, the Centers for Disease Control and Prevention's EPHT Metadata Subgroup has been working to initiate the creation and use of descriptive metadata. Efforts undertaken by the group include the adoption of a metadata standard, creation of an EPHT-specific metadata profile, development of an open-source metadata creation tool, and promotion of the creation of descriptive metadata by changing the perception of metadata in the public health culture.

  20. Department of the Interior metadata implementation guide—Framework for developing the metadata component for data resource management

    USGS Publications Warehouse

    Obuch, Raymond C.; Carlino, Jennifer; Zhang, Lin; Blythe, Jonathan; Dietrich, Christopher; Hawkinson, Christine

    2018-04-12

    The Department of the Interior (DOI) is a Federal agency with over 90,000 employees across 10 bureaus and 8 agency offices. Its primary mission is to protect and manage the Nation’s natural resources and cultural heritage; provide scientific and other information about those resources; and honor its trust responsibilities or special commitments to American Indians, Alaska Natives, and affiliated island communities. Data and information are critical in day-to-day operational decision making and scientific research. DOI is committed to creating, documenting, managing, and sharing high-quality data and metadata in and across its various programs that support its mission. Documenting data through metadata is essential in realizing the value of data as an enterprise asset. The completeness, consistency, and timeliness of metadata affect users’ ability to search for and discover the most relevant data for the intended purpose, and facilitate the interoperability and usability of these data among DOI bureaus and offices. Fully documented metadata describe data usability, quality, accuracy, provenance, and meaning. Across DOI, there are different maturity levels and phases of information and metadata management implementations. The Department has organized a committee consisting of bureau-level points of contact to collaborate on the development of more consistent, standardized, and more effective metadata management practices and guidance to support this shared mission and the information needs of the Department. DOI’s metadata implementation plans establish key roles and responsibilities associated with metadata management processes, procedures, and a series of actions defined in three major metadata implementation phases: (1) Getting started—Planning Phase, (2) Implementing and Maintaining Operational Metadata Management Phase, and (3) the Next Steps towards Improving Metadata Management Phase. DOI’s phased approach for metadata management addresses some of the major data and metadata management challenges that exist across the diverse missions of the bureaus and offices. All employees who create, modify, or use data are involved with data and metadata management. Identifying, establishing, and formalizing the roles and responsibilities associated with metadata management are key to institutionalizing a framework of best practices, methodologies, processes, and common approaches throughout all levels of the organization; these are the foundation for effective data resource management. For executives and managers, metadata management strengthens their overarching views of data assets, holdings, and data interoperability, and clarifies how metadata management can help accelerate compliance with multiple policy mandates. For employees, data stewards, and data professionals, formalized metadata management will help with the consistency of definitions and with approaches addressing data discoverability, data quality, and data lineage. In addition to data professionals and others associated with information technology, data stewards and program subject matter experts take on important metadata management roles and responsibilities as data flow through their respective business and science-related workflows. The responsibilities of establishing, practicing, and governing the actions associated with their specific metadata management roles are critical to successful metadata implementation.

  1. Making Interoperability Easier with the NASA Metadata Management Tool

    NASA Astrophysics Data System (ADS)

    Shum, D.; Reese, M.; Pilone, D.; Mitchell, A. E.

    2016-12-01

    ISO 19115 has enabled interoperability amongst tools, yet many users find it hard to build ISO metadata for their collections because it can be large and overly flexible for their needs. The Metadata Management Tool (MMT), part of NASA's Earth Observing System Data and Information System (EOSDIS), offers users a modern, easy-to-use, browser-based tool to develop ISO-compliant metadata. Through a simplified UI experience, metadata curators can create and edit collections without any understanding of the complex ISO 19115 format, while still generating compliant metadata. The MMT is also able to assess the completeness of collection-level metadata by evaluating it against a variety of metadata standards. The tool provides users with clear guidance as to how to change their metadata in order to improve its quality and compliance. It is based on NASA's Unified Metadata Model for Collections (UMM-C), a simpler metadata model that can be cleanly mapped to ISO 19115. This allows metadata authors and curators to meet ISO compliance requirements faster and more accurately. The MMT and UMM-C have been developed in an agile fashion, with recurring end-user tests and reviews to continually refine the tool, the model and the ISO mappings. This process is allowing for continual improvement and evolution to meet the community's needs.

  2. Image processing tool for automatic feature recognition and quantification

    DOEpatents

    Chen, Xing; Stoddard, Ryan J.

    2017-05-02

    A system for defining structures within an image is described. The system includes reading of an input file, preprocessing the input file while preserving metadata such as scale information and then detecting features of the input file. In one version the detection first uses an edge detector followed by identification of features using a Hough transform. The output of the process is identified elements within the image.
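
    The detection sequence described above (an edge detector followed by a Hough transform) can be sketched with OpenCV as one possible stand-in implementation; the patent does not prescribe a library, and the thresholds below are illustrative.

      # Edge detection followed by a Hough transform for line features, using OpenCV
      # as a stand-in implementation; thresholds are illustrative.
      import cv2
      import numpy as np

      image = np.zeros((200, 200), dtype=np.uint8)
      cv2.line(image, (20, 20), (180, 60), 255, 2)      # synthetic feature to detect
      edges = cv2.Canny(image, 50, 150)                 # step 1: edge detector
      lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=50,
                              minLineLength=30, maxLineGap=5)  # step 2: Hough transform
      print(0 if lines is None else len(lines), "line segment(s) detected")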

  3. Metadata mapping and reuse in caBIG™

    PubMed Central

    Kunz, Isaac; Lin, Ming-Chin; Frey, Lewis

    2009-01-01

    Background This paper proposes that interoperability across biomedical databases can be improved by utilizing a repository of Common Data Elements (CDEs), UML model class-attributes and simple lexical algorithms to facilitate the building of domain models. This is examined in the context of an existing system, the National Cancer Institute (NCI)'s cancer Biomedical Informatics Grid (caBIG™). The goal is to demonstrate the deployment of open source tools that can be used to effectively map models and enable the reuse of existing information objects and CDEs in the development of new models for translational research applications. This effort is intended to help developers reuse appropriate CDEs to enable interoperability of their systems when developing within the caBIG™ framework or other frameworks that use metadata repositories. Results The Dice (di-grams) and Dynamic algorithms are compared, and both algorithms have similar performance matching UML model class-attributes to CDE class object-property pairs. With the algorithms used, the baselines for automatically finding the matches are reasonable for the data models examined. This suggests that automatic mapping of UML models and CDEs is feasible within the caBIG™ framework and potentially any framework that uses a metadata repository. Conclusion This work opens up the possibility of using mapping algorithms to reduce the cost and time required to map local data models to a reference data model such as those used within caBIG™. This effort contributes to facilitating the development of interoperable systems within caBIG™ as well as other metadata frameworks. Such efforts are critical to address the need to develop systems to handle enormous amounts of diverse data that can be leveraged from new biomedical methodologies. PMID:19208192

  4. Revision of IRIS/IDA Seismic Station Metadata

    NASA Astrophysics Data System (ADS)

    Xu, W.; Davis, P.; Auerbach, D.; Klimczak, E.

    2017-12-01

    Trustworthy data quality assurance has always been one of the goals of seismic network operators and data management centers. This task is considerably complex and evolving due to the huge quantities as well as the rapidly changing characteristics and complexities of seismic data. Published metadata usually reflect instrument response characteristics and their accuracies, which include zero-frequency sensitivity for both the seismometer and the data logger as well as other, frequency-dependent elements. In this work, we are mainly focused on studying the variation of seismometer sensitivity with time for IRIS/IDA seismic recording systems, with the goal of improving the metadata accuracy for the history of the network. There are several ways to measure the accuracy of seismometer sensitivity for the seismic stations in service. An effective practice recently developed is to collocate a reference seismometer in proximity to verify the in-situ sensors' calibration. For those stations with a secondary broadband seismometer, IRIS' MUSTANG metric computation system introduced a transfer function metric to reflect the two sensors' gain ratios in the microseism frequency band. In addition, a simulation approach based on M2 tidal measurements has been proposed and proven to be effective. In this work, we compare and analyze the results from the three different methods and conclude that the collocated-sensor method is the most stable and reliable, with the smallest uncertainties throughout. However, for epochs without both a collocated sensor and a secondary seismometer, we rely on the analysis results from the tide method. For the data since 1992 on IDA stations, we computed over 600 revised seismometer sensitivities for all the IRIS/IDA network calibration epochs. Further revision procedures should help to ensure that the metadata of these stations accurately reflect the data.
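
    The gain-ratio idea behind the collocated-sensor and transfer-function checks can be sketched by comparing the power spectra of two traces over the microseism band and taking the median amplitude ratio; the synthetic signals, band edges and parameters below are illustrative and do not reproduce the MUSTANG metric.

      # Estimate a relative sensitivity as the median spectral amplitude ratio between
      # a reference sensor and a test sensor in the microseism band. Synthetic data;
      # band edges and parameters are illustrative, not the MUSTANG implementation.
      import numpy as np
      from scipy.signal import welch

      fs = 20.0                                           # samples per second
      rng = np.random.default_rng(0)
      ground = rng.standard_normal(int(3600 * fs))        # one hour of synthetic ground motion
      reference = ground                                  # trusted reference seismometer
      test = 0.97 * ground                                # in-situ sensor with ~3% gain error

      f, p_ref = welch(reference, fs=fs, nperseg=4096)
      _, p_test = welch(test, fs=fs, nperseg=4096)
      band = (f >= 0.05) & (f <= 0.2)                     # secondary microseism band (approximate)
      ratio = np.median(np.sqrt(p_test[band] / p_ref[band]))
      print(f"estimated sensitivity ratio: {ratio:.3f}")  # ~0.970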

  5. Experimental constraints on light elements in the Earth’s outer core

    PubMed Central

    Zhang, Youjun; Sekine, Toshimori; He, Hongliang; Yu, Yin; Liu, Fusheng; Zhang, Mingjian

    2016-01-01

    Earth’s outer core is liquid and dominantly composed of iron and nickel (~5–10 wt%). Its density, however, is ~8% lower than that of liquid iron, and requires the presence of a significant amount of light element(s). A good way to specify the light element(s) is a direct comparison of density and sound velocity measurements between seismological data and those of possible candidate compositions at the core conditions. We report the sound velocity measurements of a model core composition in the Fe-Ni-Si system at the outer core conditions by shock-wave experiments. Combining with the previous studies, we found that the best estimate for the outer core’s light elements is ~6 wt% Si, ~2 wt% S, and possible ~1–2.5 wt% O. This composition satisfies the requirements imposed by seismology, geochemistry, and some models of the early core formation. This finding may help us to further constrain the thermal structure of the Earth and the models of Earth’s core formation. PMID:26932596

  6. Identification of both copy number variation-type and constant-type core elements in a large segmental duplication region of the mouse genome

    PubMed Central

    2013-01-01

    Background Copy number variation (CNV), an important source of diversity in genomic structure, is frequently found in clusters called CNV regions (CNVRs). CNVRs are strongly associated with segmental duplications (SDs), but the composition of these complex repetitive structures remains unclear. Results We conducted self-comparative-plot analysis of all mouse chromosomes using the high-speed and large-scale-homology search algorithm SHEAP. For eight chromosomes, we identified various types of large SD as tartan-checked patterns within the self-comparative plots. A complex arrangement of diagonal split lines in the self-comparative-plots indicated the presence of large homologous repetitive sequences. We focused on one SD on chromosome 13 (SD13M), and developed SHEPHERD, a stepwise ab initio method, to extract longer repetitive elements and to characterize repetitive structures in this region. Analysis using SHEPHERD showed the existence of 60 core elements, which were expected to be the basic units that form SDs within the repetitive structure of SD13M. The demonstration that sequences homologous to the core elements (>70% homology) covered approximately 90% of the SD13M region indicated that our method can characterize the repetitive structure of SD13M effectively. Core elements were composed largely of fragmented repeats of a previously identified type, such as long interspersed nuclear elements (LINEs), together with partial genic regions. Comparative genome hybridization array analysis showed that whereas 42 core elements were components of CNVR that varied among mouse strains, 8 did not vary among strains (constant type), and the status of the others could not be determined. The CNV-type core elements contained significantly larger proportions of long terminal repeat (LTR) types of retrotransposon than the constant-type core elements, which had no CNV. The higher divergence rates observed in the CNV-type core elements than in the constant type indicate that the CNV-type core elements have a longer evolutionary history than constant-type core elements in SD13M. Conclusions Our methodology for the identification of repetitive core sequences simplifies characterization of the structures of large SDs and detailed analysis of CNV. The results of detailed structural and quantitative analyses in this study might help to elucidate the biological role of one of the SDs on chromosome 13. PMID:23834397

  7. GraphMeta: Managing HPC Rich Metadata in Graphs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dai, Dong; Chen, Yong; Carns, Philip

    High-performance computing (HPC) systems face increasingly critical metadata management challenges, especially in the approaching exascale era. These challenges arise not only from exploding metadata volumes, but also from increasingly diverse metadata, which contains data provenance and arbitrary user-defined attributes in addition to traditional POSIX metadata. This ‘rich’ metadata is becoming critical to supporting advanced data management functionality such as data auditing and validation. In our prior work, we identified a graph-based model as a promising solution to uniformly manage HPC rich metadata due to its flexibility and generality. However, at the same time, graph-based HPC rich metadata management also introduces significant challenges to the underlying infrastructure. In this study, we first identify the challenges on the underlying infrastructure to support scalable, high-performance rich metadata management. Based on that, we introduce GraphMeta, a graph-based engine designed for this use case. It achieves performance scalability by introducing a new graph partitioning algorithm and a write-optimal storage engine. We evaluate GraphMeta under both synthetic and real HPC metadata workloads, compare it with other approaches, and demonstrate its advantages in terms of efficiency and usability for rich metadata management in HPC systems.
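
    The graph model for rich metadata can be pictured with a small example in which files, jobs and users become nodes while provenance and ownership become edges carrying arbitrary attributes; the sketch below uses networkx for illustration, and the labels are invented rather than GraphMeta's schema or storage engine.

      # Rich metadata as a property graph: users, jobs and files are nodes; provenance
      # and ownership are edges; user-defined attributes hang off both.
      # Illustration with networkx; not GraphMeta's actual schema or engine.
      import networkx as nx

      g = nx.DiGraph()
      g.add_node("user:alice", type="user")
      g.add_node("job:1234", type="job", walltime_s=380)
      g.add_node("file:/scratch/out.h5", type="file", size_bytes=10**9, quality="validated")
      g.add_edge("user:alice", "job:1234", relation="submitted")
      g.add_edge("job:1234", "file:/scratch/out.h5", relation="wrote")

      # Provenance query: which user produced this file?
      producers = [u for job in g.predecessors("file:/scratch/out.h5")
                     for u in g.predecessors(job)
                     if g.nodes[u]["type"] == "user"]
      print(producers)   # ['user:alice']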

  8. Core-Mantle Partitioning of Volatile Elements and the Origin of Volatile Elements in Earth and Moon

    NASA Technical Reports Server (NTRS)

    Righter, K.; Pando, K.; Danielson, L.; Nickodem, K.

    2014-01-01

    Depletions of siderophile elements in mantles have placed constraints on the conditions of core segregation and differentiation in bodies such as Earth, Earth's Moon, Mars, and asteroid 4 Vesta. Among the siderophile elements there is a sub-set that are also volatile (volatile siderophile elements or VSE; Ga, Ge, In, As, Sb, Sn, Bi, Zn, Cu, Cd); these can help to constrain the origin of volatile elements in these bodies, in particular the Earth and Moon. One of the fundamental observations of the geochemistry of the Moon is the overall depletion of volatile elements relative to the Earth, but a satisfactory explanation has remained elusive. Hypotheses for Earth include addition during accretion and core formation, with part of the inventory mobilized into the metallic core; a multiple-stage origin; or addition after the core formed. Any explanation for volatile elements in the Earth's mantle must also be linked to an explanation of these elements in the lunar mantle. New metal-silicate partitioning data will be applied to the origin of volatile elements in both the Earth and Moon, and will be used to evaluate theories of an exogenous versus endogenous origin of volatile elements.

  9. Community-Driven Initiatives to Achieve Interoperability for Ecological and Environmental Data

    NASA Astrophysics Data System (ADS)

    Madin, J.; Bowers, S.; Jones, M.; Schildhauer, M.

    2007-12-01

    Advances in ecology and environmental science increasingly depend on information from multiple disciplines to tackle broader and more complex questions about the natural world. Such advances, however, are hindered by data heterogeneity, which impedes the ability of researchers to discover, interpret, and integrate relevant data that have been collected by others. Here, we outline two community-building initiatives for improving data interoperability in the ecological and environmental sciences, one that is well-established (the Ecological Metadata Language [EML]), and another that is actively underway (a unified model for observations and measurements). EML is a metadata specification developed for the ecology discipline, and is based on prior work done by the Ecological Society of America and associated efforts to ensure a modular and extensible framework to document ecological data. EML "modules" are designed to describe one logical part of the total metadata that should be included with any ecological dataset. EML was developed through a series of working meetings, ongoing discussion forums and email lists, with participation from a broad range of ecological and environmental scientists, as well as computer scientists and software developers. Where possible, EML adopted syntax from the other metadata standards for other disciplines (e.g., Dublin Core, Content Standard for Digital Geospatial Metadata, and more). Although EML has not yet been ratified through a standards body, it has become the de facto metadata standard for a large range of ecological data management projects, including for the Long Term Ecological Research Network, the National Center for Ecological Analysis and Synthesis, and the Ecological Society of America. The second community-building initiative is based on work through the Scientific Environment for Ecological Knowledge (SEEK) as well as a recent workshop on multi-disciplinary data management. This initiative aims at improving interoperability by describing the semantics of data at the level of observation and measurement (rather than the traditional focus at the level of the data set) and will define the necessary specifications and technologies to facilitate semantic interpretation and integration of observational data for the environmental sciences. As such, this initiative will focus on unifying the various existing approaches for representing and describing observation data (e.g., SEEK's Observation Ontology, CUAHSI's Observation Data Model, NatureServe's Observation Data Standard, to name a few). Products of this initiative will be compatible with existing standards and build upon recent advances in knowledge representation (e.g., W3C's recommended Web Ontology Language, OWL) that have demonstrated practical utility in enhancing scientific communication and data interoperability in other communities (e.g., the genomics community). A community-sanctioned, extensible, and unified model for observational data will support metadata standards such as EML while reducing the "babel" of scientific dialects that currently impede effective data integration, which will in turn provide a strong foundation for enabling cross-disciplinary synthetic research in the ecological and environmental sciences.

  10. REACTOR UNLOADING

    DOEpatents

    Leverett, M.C.

    1958-02-18

    This patent is related to gas cooled reactors wherein the fuel elements are disposed in vertical channels extending through the reactor core, the cooling gas passing through the channels from the bottom to the top of the core. The invention is a means for unloading the fuel elements from the core and comprises dump values in the form of flat cars mounted on wheels at the bottom of the core structure which support vertical stacks of fuel elements. When the flat cars are moved, either manually or automatically, for normal unloading purposes, or due to a rapid rise in the reproduction ratio within the core, the fuel elements are permitted to fall by gravity out of the core structure, thereby reducing the reproduction ratio or stopping the reaction as desired.

  11. Metabolonote: A Wiki-Based Database for Managing Hierarchical Metadata of Metabolome Analyses

    PubMed Central

    Ara, Takeshi; Enomoto, Mitsuo; Arita, Masanori; Ikeda, Chiaki; Kera, Kota; Yamada, Manabu; Nishioka, Takaaki; Ikeda, Tasuku; Nihei, Yoshito; Shibata, Daisuke; Kanaya, Shigehiko; Sakurai, Nozomu

    2015-01-01

    Metabolomics – technology for comprehensive detection of small molecules in an organism – lags behind the other “omics” in terms of publication and dissemination of experimental data. Among the reasons for this are difficulty precisely recording information about complicated analytical experiments (metadata), existence of various databases with their own metadata descriptions, and low reusability of the published data, resulting in submitters (the researchers who generate the data) being insufficiently motivated. To tackle these issues, we developed Metabolonote, a Semantic MediaWiki-based database designed specifically for managing metabolomic metadata. We also defined a metadata and data description format, called “Togo Metabolome Data” (TogoMD), with an ID system that is required for unique access to each level of the tree-structured metadata such as study purpose, sample, analytical method, and data analysis. Separation of the management of metadata from that of data and permission to attach related information to the metadata provide advantages for submitters, readers, and database developers. The metadata are enriched with information such as links to comparable data, thereby functioning as a hub of related data resources. They also enhance not only readers’ understanding and use of data but also submitters’ motivation to publish the data. The metadata are computationally shared among other systems via APIs, which facilitate the construction of novel databases by database developers. A permission system that allows publication of immature metadata and feedback from readers also helps submitters to improve their metadata. Hence, this aspect of Metabolonote, as a metadata preparation tool, is complementary to high-quality and persistent data repositories such as MetaboLights. A total of 808 metadata for analyzed data obtained from 35 biological species are published currently. Metabolonote and related tools are available free of cost at http://metabolonote.kazusa.or.jp/. PMID:25905099
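
    One way to picture the tree-structured metadata and hierarchical ID system described above is as nested records keyed by level-specific identifiers. The sketch below is hypothetical: the ID format and field names are placeholders that only mirror the study/sample/method/analysis levels mentioned in the abstract, not the actual TogoMD specification.

        # Hypothetical sketch of tree-structured metadata with hierarchical IDs.
        # IDs and field names are placeholders, not the TogoMD format.
        metadata_tree = {
            "SE1": {  # study-level record
                "purpose": "Metabolite profiling of tomato fruit ripening",
                "samples": {
                    "SE1.S01": {  # sample-level record
                        "organism": "Solanum lycopersicum",
                        "methods": {
                            "SE1.S01.M01": {  # analytical-method level
                                "instrument": "LC-MS",
                                "analyses": {
                                    "SE1.S01.M01.D01": {"software": "in-house pipeline"},
                                },
                            }
                        },
                    }
                },
            }
        }

        def lookup(tree, path):
            """Walk the nested metadata using a list of hierarchical IDs."""
            node = tree[path[0]]
            for key in path[1:]:
                # each level nests its children under its dict-valued field
                children = next(v for v in node.values() if isinstance(v, dict))
                node = children[key]
            return node

        print(lookup(metadata_tree, ["SE1", "SE1.S01", "SE1.S01.M01"])["instrument"])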

  12. Metabolonote: a wiki-based database for managing hierarchical metadata of metabolome analyses.

    PubMed

    Ara, Takeshi; Enomoto, Mitsuo; Arita, Masanori; Ikeda, Chiaki; Kera, Kota; Yamada, Manabu; Nishioka, Takaaki; Ikeda, Tasuku; Nihei, Yoshito; Shibata, Daisuke; Kanaya, Shigehiko; Sakurai, Nozomu

    2015-01-01

    Metabolomics - technology for comprehensive detection of small molecules in an organism - lags behind the other "omics" in terms of publication and dissemination of experimental data. Among the reasons for this are difficulty precisely recording information about complicated analytical experiments (metadata), existence of various databases with their own metadata descriptions, and low reusability of the published data, resulting in submitters (the researchers who generate the data) being insufficiently motivated. To tackle these issues, we developed Metabolonote, a Semantic MediaWiki-based database designed specifically for managing metabolomic metadata. We also defined a metadata and data description format, called "Togo Metabolome Data" (TogoMD), with an ID system that is required for unique access to each level of the tree-structured metadata such as study purpose, sample, analytical method, and data analysis. Separation of the management of metadata from that of data and permission to attach related information to the metadata provide advantages for submitters, readers, and database developers. The metadata are enriched with information such as links to comparable data, thereby functioning as a hub of related data resources. They also enhance not only readers' understanding and use of data but also submitters' motivation to publish the data. The metadata are computationally shared among other systems via APIs, which facilitate the construction of novel databases by database developers. A permission system that allows publication of immature metadata and feedback from readers also helps submitters to improve their metadata. Hence, this aspect of Metabolonote, as a metadata preparation tool, is complementary to high-quality and persistent data repositories such as MetaboLights. A total of 808 metadata for analyzed data obtained from 35 biological species are published currently. Metabolonote and related tools are available free of cost at http://metabolonote.kazusa.or.jp/.

  13. Core-Mantle Partitioning of Volatile Siderophile Elements and the Origin of Volatile Elements in the Earth

    NASA Technical Reports Server (NTRS)

    Nickodem, K.; Righter, K.; Danielson, L.; Pando, K.; Lee, C.

    2012-01-01

    There are currently several hypotheses on the origin of volatile siderophile elements in the Earth. One hypothesis is that they were added during Earth's accretion and core formation and mobilized into the metallic core [1], others claim a multiple-stage origin [2], while some hypothesize that volatiles were added after the core already formed [3]. Several volatile siderophile elements are depleted in Earth's mantle relative to the chondrites, something which continues to puzzle many scientists. This depletion is likely due to a combination of volatility and core formation. The Earth's core is composed of Fe and some lighter constituents, although the abundances of these lighter elements are unknown [4]. Si is one of these potential light elements [5], although few studies have analyzed the effect of Si on metal-silicate partitioning, in particular on the volatile elements. As, In, Ge, and Sb are trace volatile siderophile elements which are depleted in the mantle but have yet to be extensively studied. The metal-silicate partition coefficients of these elements will be measured to determine the effect of Si. Partition coefficients depend on temperature, pressure, oxygen fugacity, and metal and silicate composition and can constrain the concentrations of volatile siderophile elements found in the mantle. Reported here are the results from 13 experiments examining the partitioning of As, In, Ge, and Sb between metallic and silicate liquid. These experiments will examine the effect of temperature and metal composition (i.e., Si content) on these elements in order to gain a greater understanding of the core-mantle separation which occurred during the Earth's early stages. The data can then be applied to the origin of volatile elements in the Earth.
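
    For context, the metal-silicate partition coefficient discussed above is conventionally defined as the ratio of an element's concentration in the metallic liquid to its concentration in the coexisting silicate liquid, with values greater than one indicating siderophile behavior:

        D_i^{\mathrm{met/sil}} = \frac{C_i^{\mathrm{metal}}}{C_i^{\mathrm{silicate}}}

    The experiments described here measure how this ratio for As, In, Ge, and Sb responds to temperature and to the Si content of the metal.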

  14. Sulfur in Earth's Mantle and Its Behavior During Core Formation

    NASA Technical Reports Server (NTRS)

    Chabot, Nancy L.; Righter, Kevin

    2006-01-01

    The density of Earth's outer core requires that about 5-10% of the outer core be composed of elements lighter than Fe-Ni; proposed choices for the "light element" component of Earth's core include H, C, O, Si, S, and combinations of these elements [e.g. 1]. Though samples of Earth's core are not available, mantle samples contain elemental signatures left behind from the formation of Earth's core. The abundances of siderophile (metal-loving) elements in Earth's mantle have been used to gain insight into the early accretion and differentiation history of Earth, the process by which the core and mantle formed, and the composition of the core [e.g. 2-4]. Similarly, the abundance of potential light elements in Earth's mantle could also provide constraints on Earth's evolution and core composition. The S abundance in Earth's mantle is 250 (±50) ppm [5]. It has been suggested that 250 ppm S is too high to be due to equilibrium core formation in a high pressure, high temperature magma ocean on early Earth and that the addition of S to the mantle from the subsequent accretion of a late veneer is consequently required [6]. However, this earlier work of Li and Agee [6] did not parameterize the metal-silicate partitioning behavior of S as a function of thermodynamic variables, limiting the different pressure and temperature conditions during core formation that could be explored. Here, the question of explaining the mantle abundance of S is revisited, through parameterizing existing metal-silicate partitioning data for S and applying the parameterization to core formation in Earth.
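
    Parameterizing the partitioning data, as proposed above, typically means regressing measured values of the S partition coefficient against thermodynamic variables. One commonly used functional form, shown here only as an illustrative sketch since the exact terms and coefficients vary between studies, is:

        \log D_{\mathrm{S}}^{\mathrm{met/sil}} = a \log f_{\mathrm{O_2}} + \frac{b}{T} + \frac{c\,P}{T} + d \cdot (\text{composition terms}) + e

    Fitting such an expression to experimental data allows the partitioning behavior to be extrapolated to the range of pressure and temperature conditions relevant to core formation.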

  15. Metadata (MD)

    Treesearch

    Robert E. Keane

    2006-01-01

    The Metadata (MD) table in the FIREMON database is used to record any information about the sampling strategy or data collected using the FIREMON sampling procedures. The MD method records metadata pertaining to a group of FIREMON plots, such as all plots in a specific FIREMON project. FIREMON plots are linked to metadata using a unique metadata identifier that is...

  16. Sedimentological and radiochemical characteristics of marsh deposits from Assateague Island and the adjacent vicinity, Maryland and Virginia, following Hurricane Sandy

    USGS Publications Warehouse

    Smith, Christopher G.; Marot, Marci E.; Ellis, Alisha M.; Wheaton, Cathryn J.; Bernier, Julie C.; Adams, C. Scott

    2015-09-15

    This report serves as an archive for sedimentological and radiochemical data derived from the surface sediments and marsh cores collected March 26–April 4, 2014. Select surficial data are available for the additional sampling period October 21–30, 2014. Downloadable data are available as Excel spreadsheets and as JPEG files. Additional files include field documentation, x-radiographs, photographs, detailed results of sediment grain size analyses, and formal Federal Geographic Data Committee metadata (data downloads).

  17. Sediment data collected in 2010 from Cat Island, Mississippi

    USGS Publications Warehouse

    Buster, Noreen A.; Kelso, Kyle W.; Miselis, Jennifer L.; Kindinger, Jack G.

    2014-01-01

    Scientists from the U.S. Geological Survey, St. Petersburg Coastal and Marine Science Center, in collaboration with the U.S. Army Corps of Engineers, conducted geophysical and sedimentological surveys in 2010 around Cat Island, Mississippi, which is the westernmost island in the Mississippi-Alabama barrier island chain. The objective of the study was to understand the geologic evolution of Cat Island relative to other barrier islands in the northern Gulf of Mexico by identifying relationships between the geologic history, present day morphology, and sediment distribution. This data series serves as an archive of terrestrial and marine sediment vibracores collected August 4-6 and October 20-22, 2010, respectively. Geographic information system data products include marine and terrestrial core locations and 2007 shoreline data. Additional files include marine and terrestrial core description logs, core photos, results of sediment grain-size analyses, optically stimulated luminescence dating and carbon-14 dating locations and results, Field Activity Collection System logs, and formal Federal Geographic Data Committee metadata.

  18. Open Core Data approaches to exposing facility data to support FAIR principles

    NASA Astrophysics Data System (ADS)

    Fils, D.; Lehnert, K.; Noren, A. J.

    2017-12-01

    The Open Core Data (OCD) award from NSF is focused on exposing scientific drilling data from the JOIDES Resolution Science Operator (JRSO) and Continental Scientific Drilling Coordination Office (CSDCO) following guidance from the Force 11 FAIR principles and the W3C "best practices" recommendations and notes. The goal of this implementation is to provide the identification, access, citation and provenance of these data to support the research community. OCD employs Linked Open Data (LOD) patterns and HTML5 microdata publishing via JSON-LD using various vocabularies. These vocabularies include schema.org, GeoLink and other relevant community vocabularies. Attention is paid to enabling hypermedia navigation between resources to aid in fast and efficient harvesting of the metadata directly from the LOD approach using web architecture patterns. Further, the vocabularies are employed to address the needs of both DOI assignment and the creation of data citation entries following ESIP data citation recommendations. The use of LOD, community vocabularies and persistent identifiers has enabled linking between hosted and remote data resources. In addition to the semantic metadata and LOD pattern, OCD is implementing approaches to data packaging to facilitate data use. OCD is currently using the CSV for the Web approach but is moving to implement frictionless data packages. This data package model provides access to a large suite of tools, libraries and workbenches to support data utilization, validation and visualization. Further, a basic reference implementation of the W3C PROV-AQ pingback pattern is under testing. This work is done in coordination with the RDA Provenance Patterns WG and follows patterns already employed by Geoscience Australia. This development is also done in coordination with ESIP provenance work. As needed, more traditional Application Program Interfaces (APIs) are exposed following best practices in RESTful services. All these capabilities are implemented in Open Core Data in the lightest possible manner to address the desired functions while being as easy to maintain as possible. The approaches, lessons learned and takeaways from this work at Open Core Data to date will be presented.
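
    The LOD publishing pattern described above can be sketched as a schema.org Dataset description serialized as JSON-LD. The identifiers and values below are hypothetical placeholders rather than actual Open Core Data records; they only illustrate the kind of structured metadata such a service exposes.

        # Sketch of a schema.org "Dataset" description serialized as JSON-LD.
        # All identifiers and values are hypothetical placeholders.
        import json

        dataset = {
            "@context": "https://schema.org",
            "@type": "Dataset",
            "name": "Example scientific drilling core description",
            "description": "Lithologic descriptions for a hypothetical borehole.",
            "identifier": "https://doi.org/10.0000/example-doi",  # placeholder DOI
            "url": "https://example.org/data/borehole-001",
            "provider": {"@type": "Organization", "name": "Example Drilling Facility"},
        }

        print(json.dumps(dataset, indent=2))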

  19. A data discovery index for the social sciences

    PubMed Central

    Krämer, Thomas; Klas, Claus-Peter; Hausstein, Brigitte

    2018-01-01

    This paper describes a novel search index for social and economic research data, one that enables users to search up-to-date references for data holdings in these disciplines. The index can be used for comparative analysis of publication of datasets in different areas of social science. The core of the index is the da|ra registration agency’s database for social and economic data, which contains high-quality searchable metadata from registered data publishers. Research data’s metadata records are harvested from data providers around the world and included in the index. In this paper, we describe the currently available indices on social science datasets and their shortcomings. Next, we describe the motivation behind and the purpose for the data discovery index as a dedicated and curated platform for finding social science research data and gesisDataSearch, its user interface. Further, we explain the harvesting, filtering and indexing procedure and give usage instructions for the dataset index. Lastly, we show that the index is currently the most comprehensive and most accessible collection of social science data descriptions available. PMID:29633988
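
    The harvest, filter and index steps mentioned above can be sketched, in much simplified form, as building a small inverted index over harvested dataset metadata. The records and field names below are hypothetical stand-ins for metadata harvested from registered data providers, not the actual da|ra or gesisDataSearch pipeline.

        # Simplified sketch of harvest -> filter -> index over dataset metadata.
        # Records and field names are hypothetical placeholders.
        from collections import defaultdict

        harvested = [
            {"id": "doi:10.0000/ex1", "type": "dataset", "title": "German household panel survey"},
            {"id": "doi:10.0000/ex2", "type": "article", "title": "A paper, not a dataset"},
            {"id": "doi:10.0000/ex3", "type": "dataset", "title": "European labour market microdata"},
        ]

        # filter: keep only dataset records
        datasets = [r for r in harvested if r["type"] == "dataset"]

        # index: map lower-cased title terms to record ids
        index = defaultdict(set)
        for record in datasets:
            for term in record["title"].lower().split():
                index[term].add(record["id"])

        print(sorted(index["survey"]))  # -> ['doi:10.0000/ex1']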

  20. EPOS IP - Data, Data Products, Services and Software (DDSS Master Table)

    NASA Astrophysics Data System (ADS)

    Michalek, Jan; Atakan, Kuvvet

    2017-04-01

    The "European Plate Observing System - Implementation Phase" (EPOS IP, 2014-2019) project is about building a pan-European infrastructure for accessing solid Earth science data. This ambitious plan started in 2002 with a Conception Phase and continued with EPOS PP (Preparatory Phase, 2010-2014), in which about 20 partners joined the project. The current EPOS IP project includes 47 partners plus 6 associate partners from 25 countries across Europe and several international organizations (ORFEUS, EMSC, EUREF). However, the community contributing to the EPOS integration plan is larger than the official partnership of the EPOS IP project, because more countries are represented by the international organizations and because within each country several research institutions are involved. The list of Data, Data Products, Services and Software (DDSS) provided by individual institutions, consortia or organizations which will become part of the EPOS system is currently collected in a document called the DDSS Master Table. There are 10 work packages (WP8-WP17) creating the Thematic Core Services (TCS), each grouped by a specific topic: Seismology, Near Fault Observatories, GNSS Data and Products, Volcano Observations, Satellite Data, Geomagnetic Observations, Anthropogenic Hazards, Geological Information and Modelling, Multi-scale laboratories and Geo-Energy Test Beds for Low Carbon Energy. Each of these groups declared a list of DDSS elements to be implemented. Currently there are about 455 DDSS elements in the DDSS Master Table. These DDSS elements are of different maturity, and about 122 are declared by TCS groups to be ready for implementation, which means that the data are well described with metadata, follow the standards specific to their domain and, in the best case, already have services allowing access. The DDSS elements also differ in complexity. The DDSS Master Table serves as an overview of the DDSS elements, includes most of the important information needed for further implementation, and is continuously updated as the project evolves. The presentation shows statistics describing the current status of the DDSS Master Table and the complexity of the organizational structure at the TCS level.

  1. What Are the Core Elements of Your Curriculum?

    ERIC Educational Resources Information Center

    Exchange: The Early Childhood Leaders' Magazine Since 1978, 2009

    2009-01-01

    Several administrators discuss the core elements of their curriculum. These core elements are: (1) Child-centered; (2) Play; (3) Problem solving; (4) Respect; (5) Creativity; (6) Community; (7) Independence; (8) Curiosity; (9) Love of learning; (10) Relationship; (11) Cooperation; (12) Self-confidence; (13) Language; (14) Joy; (15) Nature; Natural…

  2. Documentation Resources on the ESIP Wiki

    NASA Technical Reports Server (NTRS)

    Habermann, Ted; Kozimor, John; Gordon, Sean

    2017-01-01

    The ESIP community includes data providers and users that communicate with one another through datasets and metadata that describe them. Improving this communication depends on consistent high-quality metadata. The ESIP Documentation Cluster and the wiki play an important central role in facilitating this communication. We will describe and demonstrate sections of the wiki that provide information about metadata concept definitions, metadata recommendation, metadata dialects, and guidance pages. We will also describe and demonstrate the ISO Explorer, a tool that the community is developing to help metadata creators.

  3. Fast processing of digital imaging and communications in medicine (DICOM) metadata using multiseries DICOM format.

    PubMed

    Ismail, Mahmoud; Philbin, James

    2015-04-01

    The digital imaging and communications in medicine (DICOM) information model combines pixel data and its metadata in a single object. There are user scenarios that only need metadata manipulation, such as deidentification and study migration. Most picture archiving and communication systems use a database to store and update the metadata rather than updating the raw DICOM files themselves. The multiseries DICOM (MSD) format separates metadata from pixel data and eliminates duplicate attributes. This work promotes storing DICOM studies in MSD format to reduce the metadata processing time. A set of experiments is performed that updates the metadata of a set of DICOM studies for deidentification and migration. The studies are stored in both the traditional single frame DICOM (SFD) format and the MSD format. The results show that it is faster to update studies' metadata in MSD format than in SFD format because the bulk data is separated in MSD and is not retrieved from the storage system. In addition, it is space-efficient to store the deidentified studies in MSD format as it shares the same bulk data object with the original study. In summary, separation of metadata from pixel data using the MSD format provides fast metadata access and speeds up applications that process only the metadata.
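
    The general benefit of keeping metadata separate from bulk pixel data can be illustrated with pydicom, which can stop parsing a file before the pixel data element. This is only a sketch of metadata-only access, not the MSD format described in the paper; the directory path is a placeholder.

        # Sketch of metadata-only scanning of DICOM files using pydicom's option
        # to stop parsing before the (large) pixel data element. Illustrative of
        # the benefit of separating metadata from bulk data; not the MSD format.
        from pathlib import Path
        import pydicom

        study_index = {}
        for path in Path("/data/dicom_studies").glob("**/*.dcm"):  # placeholder path
            ds = pydicom.dcmread(path, stop_before_pixels=True)    # skip bulk data
            study_index.setdefault(ds.StudyInstanceUID, []).append(str(path))

        for study_uid, files in study_index.items():
            print(study_uid, len(files), "files")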

  4. Transforming Dermatologic Imaging for the Digital Era: Metadata and Standards.

    PubMed

    Caffery, Liam J; Clunie, David; Curiel-Lewandrowski, Clara; Malvehy, Josep; Soyer, H Peter; Halpern, Allan C

    2018-01-17

    Imaging is increasingly being used in dermatology for documentation, diagnosis, and management of cutaneous disease. The lack of standards for dermatologic imaging is an impediment to clinical uptake. Standardization can occur in image acquisition, terminology, interoperability, and metadata. This paper presents the International Skin Imaging Collaboration position on standardization of metadata for dermatologic imaging. Metadata is essential to ensure that dermatologic images are properly managed and interpreted. There are two standards-based approaches to recording and storing metadata in dermatologic imaging. The first uses standard consumer image file formats, and the second is the file format and metadata model developed for the Digital Imaging and Communication in Medicine (DICOM) standard. DICOM would appear to provide an advantage over using consumer image file formats for metadata as it includes all the patient, study, and technical metadata necessary to use images clinically. In contrast, consumer image file formats include only technical metadata and need to be used in conjunction with another actor, for example an electronic medical record, to supply the patient and study metadata. The use of DICOM may have some ancillary benefits in dermatologic imaging including leveraging DICOM network and workflow services, interoperability of images and metadata, leveraging existing enterprise imaging infrastructure, greater patient safety, and better compliance with legislative requirements for image retention.

  5. Fast processing of digital imaging and communications in medicine (DICOM) metadata using multiseries DICOM format

    PubMed Central

    Ismail, Mahmoud; Philbin, James

    2015-01-01

    The digital imaging and communications in medicine (DICOM) information model combines pixel data and its metadata in a single object. There are user scenarios that only need metadata manipulation, such as deidentification and study migration. Most picture archiving and communication systems use a database to store and update the metadata rather than updating the raw DICOM files themselves. The multiseries DICOM (MSD) format separates metadata from pixel data and eliminates duplicate attributes. This work promotes storing DICOM studies in MSD format to reduce the metadata processing time. A set of experiments is performed that updates the metadata of a set of DICOM studies for deidentification and migration. The studies are stored in both the traditional single frame DICOM (SFD) format and the MSD format. The results show that it is faster to update studies’ metadata in MSD format than in SFD format because the bulk data is separated in MSD and is not retrieved from the storage system. In addition, it is space-efficient to store the deidentified studies in MSD format as it shares the same bulk data object with the original study. In summary, separation of metadata from pixel data using the MSD format provides fast metadata access and speeds up applications that process only the metadata. PMID:26158117

  6. From the inside-out: Retrospectives on a metadata improvement process to advance the discoverability of NASA's Earth science data

    NASA Astrophysics Data System (ADS)

    Hernández, B. E.; Bugbee, K.; le Roux, J.; Beaty, T.; Hansen, M.; Staton, P.; Sisco, A. W.

    2017-12-01

    Earth observation (EO) data collected as part of NASA's Earth Observing System Data and Information System (EOSDIS) is now searchable via the Common Metadata Repository (CMR). The Analysis and Review of CMR (ARC) Team at Marshall Space Flight Center has been tasked with reviewing all NASA metadata records in the CMR (~7,000 records). Each collection level record and constituent granule level metadata are reviewed for both completeness and compliance with the CMR's set of metadata standards, as specified in the Unified Metadata Model (UMM). NASA's Distributed Active Archive Centers (DAACs) have been harmonizing priority metadata records within the context of the inter-agency federal Big Earth Data Initiative (BEDI), which seeks to improve the discoverability, accessibility, and usability of EO data. Thus, the first phase of this project constitutes reviewing BEDI metadata records, while the second phase will constitute reviewing the remaining non-BEDI records in CMR. This presentation will discuss the ARC team's findings in terms of the overall quality of BEDI records across all DAACs as well as compliance with UMM standards. For instance, only a fifth of the collection-level metadata fields needed correction, compared to a quarter of the granule-level fields. It should be noted that the degree to which DAACs' metadata did not comply with the UMM standards may reflect multiple factors, such as recent changes in the UMM standards, and the utilization of different metadata formats (e.g. DIF 10, ECHO 10, ISO 19115-1) across the DAACs. Insights, constructive criticism, and lessons learned from this metadata review process will be contributed by both ORNL and SEDAC. Further inquiry along such lines may lead to insights which may improve the metadata curation process moving forward. In terms of the broader implications for metadata compliance with the UMM standards, this research has shown that a large proportion of the prioritized collections have already been made compliant, although the process of improving metadata quality is ongoing and iterative. Further research is also warranted into whether or not the gains in metadata quality are also driving gains in data use.

  7. Forum Guide to Metadata: The Meaning behind Education Data. NFES 2009-805

    ERIC Educational Resources Information Center

    National Forum on Education Statistics, 2009

    2009-01-01

    The purpose of this guide is to empower people to more effectively use data as information. To accomplish this, the publication explains what metadata are; why metadata are critical to the development of sound education data systems; what components comprise a metadata system; what value metadata bring to data management and use; and how to…

  8. Dynamic federations: storage aggregation using open tools and protocols

    NASA Astrophysics Data System (ADS)

    Furano, Fabrizio; Brito da Rocha, Ricardo; Devresse, Adrien; Keeble, Oliver; Álvarez Ayllón, Alejandro; Fuhrmann, Patrick

    2012-12-01

    A number of storage elements now offer standard protocol interfaces like NFS 4.1/pNFS and WebDAV, for access to their data repositories, in line with the standardization effort of the European Middleware Initiative (EMI). Also the LCG FileCatalogue (LFC) can offer such features. Here we report on work that seeks to exploit the federation potential of these protocols and build a system that offers a unique view of the storage and metadata ensemble and the possibility of integration of other compatible resources such as those from cloud providers. The challenge, here undertaken by the providers of dCache and DPM, and pragmatically open to other Grid and Cloud storage solutions, is to build such a system while being able to accommodate name translations from existing catalogues (e.g. LFCs), experiment-based metadata catalogues, or stateless algorithmic name translations, also known as “trivial file catalogues”. Such so-called storage federations of standard protocols-based storage elements give a unique view of their content, thus promoting simplicity in accessing the data they contain and offering new possibilities for resilience and data placement strategies. The goal is to consider HTTP and NFS4.1-based storage elements and metadata catalogues and make them able to cooperate through an architecture that properly feeds the redirection mechanisms that they are based upon, thus giving the functionalities of a “loosely coupled” storage federation. One of the key requirements is to use standard clients (provided by OS'es or open source distributions, e.g. Web browsers) to access an already aggregated system; this approach is quite different from aggregating the repositories at the client side through some wrapper API, like for instance GFAL, or by developing new custom clients. Other technical challenges that will determine the success of this initiative include performance, latency and scalability, and the ability to create worldwide storage federations that are able to redirect clients to repositories that they can efficiently access, for instance trying to choose the endpoints that are closer or applying other criteria. We believe that the features of a loosely coupled federation of open-protocols-based storage elements will open many possibilities of evolving the current computing models without disrupting them, and, at the same time, will be able to operate with the existing infrastructures, follow their evolution path and add storage centers that can be acquired as a third-party service.
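
    The "trivial file catalogue" idea mentioned above, stateless algorithmic name translation, can be sketched as a pure function that maps a logical file name onto candidate physical URLs at several storage endpoints. The endpoints and prefix rules below are hypothetical and stand in for site-specific configuration.

        # Sketch of a stateless "trivial file catalogue": a logical file name is
        # translated algorithmically into candidate physical URLs at several
        # storage endpoints, with no catalogue lookup. Endpoints are hypothetical.
        ENDPOINTS = [
            "https://storage-a.example.org/webdav/vo",
            "https://storage-b.example.org/dpm/vo",
        ]

        def translate(lfn: str) -> list[str]:
            """Map a logical file name (e.g. '/grid/vo/2012/file.root') to
            physical URLs by simple prefix substitution."""
            suffix = lfn.removeprefix("/grid/vo")
            return [endpoint + suffix for endpoint in ENDPOINTS]

        print(translate("/grid/vo/2012/run001/file.root"))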

  9. A novel framework for assessing metadata quality in epidemiological and public health research settings

    PubMed Central

    McMahon, Christiana; Denaxas, Spiros

    2016-01-01

    Metadata are critical in epidemiological and public health research. However, a lack of biomedical metadata quality frameworks and limited awareness of the implications of poor quality metadata renders data analyses problematic. In this study, we created and evaluated a novel framework to assess metadata quality of epidemiological and public health research datasets. We performed a literature review and surveyed stakeholders to enhance our understanding of biomedical metadata quality assessment. The review identified 11 studies and nine quality dimensions; none of which were specifically aimed at biomedical metadata. 96 individuals completed the survey; of those who submitted data, most only assessed metadata quality sometimes, and eight did not at all. Our framework has four sections: a) general information; b) tools and technologies; c) usability; and d) management and curation. We evaluated the framework using three test cases and sought expert feedback. The framework can assess biomedical metadata quality systematically and robustly. PMID:27570670

  10. A novel framework for assessing metadata quality in epidemiological and public health research settings.

    PubMed

    McMahon, Christiana; Denaxas, Spiros

    2016-01-01

    Metadata are critical in epidemiological and public health research. However, a lack of biomedical metadata quality frameworks and limited awareness of the implications of poor quality metadata renders data analyses problematic. In this study, we created and evaluated a novel framework to assess metadata quality of epidemiological and public health research datasets. We performed a literature review and surveyed stakeholders to enhance our understanding of biomedical metadata quality assessment. The review identified 11 studies and nine quality dimensions; none of which were specifically aimed at biomedical metadata. 96 individuals completed the survey; of those who submitted data, most only assessed metadata quality sometimes, and eight did not at all. Our framework has four sections: a) general information; b) tools and technologies; c) usability; and d) management and curation. We evaluated the framework using three test cases and sought expert feedback. The framework can assess biomedical metadata quality systematically and robustly.

  11. CMO: Cruise Metadata Organizer for JAMSTEC Research Cruises

    NASA Astrophysics Data System (ADS)

    Fukuda, K.; Saito, H.; Hanafusa, Y.; Vanroosebeke, A.; Kitayama, T.

    2011-12-01

    JAMSTEC's Data Research Center for Marine-Earth Sciences manages and distributes a wide variety of observational data and samples obtained from JAMSTEC research vessels and deep sea submersibles. Generally, metadata are essential to identify how data and samples were obtained. In JAMSTEC, cruise metadata include cruise information such as cruise ID, name of vessel, research theme, and diving information such as dive number, name of submersible and position of diving point. They are submitted by chief scientists of research cruises in the Microsoft Excel® spreadsheet format, and registered into a data management database to confirm receipt of observational data files, cruise summaries, and cruise reports. The cruise metadata are also published via "JAMSTEC Data Site for Research Cruises" within two months after the end of a cruise. Furthermore, these metadata are distributed with observational data, images and samples via several data and sample distribution websites after a publication moratorium period. However, there are two operational issues in the metadata publishing process. One is duplicated effort and asynchronous metadata across multiple distribution websites, due to manual metadata entry into individual websites by administrators. The other is that data types and representations of metadata differ across websites. To solve those problems, we have developed a cruise metadata organizer (CMO) which allows cruise metadata to be connected from the data management database to several distribution websites. CMO is comprised of three components: an Extensible Markup Language (XML) database, Enterprise Application Integration (EAI) software, and a web-based interface. The XML database is used because of its flexibility for any change of metadata. Daily differential uptake of metadata from the data management database to the XML database is automatically processed via the EAI software. Some metadata are entered into the XML database using the web-based interface by a metadata editor in CMO as needed. Then daily differential uptake of metadata from the XML database to databases in several distribution websites is automatically processed using a convertor defined by the EAI software. Currently, CMO is available for three distribution websites: "Deep Sea Floor Rock Sample Database GANSEKI", "Marine Biological Sample Database", and "JAMSTEC E-library of Deep-sea Images". CMO is planned to provide "JAMSTEC Data Site for Research Cruises" with metadata in the future.
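
    The daily differential uptake described above can be sketched as selecting only the metadata records modified since the previous synchronization and upserting them into the target store. The record fields and timestamps below are hypothetical; the actual CMO pipeline uses an XML database and EAI software rather than in-memory dictionaries.

        # Sketch of "differential uptake": copy only metadata records modified
        # since the last synchronization into a target store. Fields and
        # timestamps are hypothetical placeholders.
        from datetime import datetime

        source_records = [
            {"cruise_id": "EX01", "vessel": "Example Maru", "updated": datetime(2011, 7, 1)},
            {"cruise_id": "EX02", "vessel": "Example Maru", "updated": datetime(2011, 7, 5)},
        ]
        target_store = {"EX01": {"cruise_id": "EX01", "vessel": "Example Maru"}}
        last_sync = datetime(2011, 7, 2)

        for record in source_records:
            if record["updated"] > last_sync:               # changed since last sync
                target_store[record["cruise_id"]] = record  # upsert into target store

        print(sorted(target_store))  # -> ['EX01', 'EX02']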

  12. CAM-SE: A scalable spectral element dynamical core for the Community Atmosphere Model.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dennis, John; Edwards, Jim; Evans, Kate J

    2012-01-01

    The Community Atmosphere Model (CAM) version 5 includes a spectral element dynamical core option from NCAR's High-Order Method Modeling Environment. It is a continuous Galerkin spectral finite element method designed for fully unstructured quadrilateral meshes. The current configurations in CAM are based on the cubed-sphere grid. The main motivation for including a spectral element dynamical core is to improve the scalability of CAM by allowing quasi-uniform grids for the sphere that do not require polar filters. In addition, the approach provides other state-of-the-art capabilities such as improved conservation properties. Spectral elements are used for the horizontal discretization, while most other aspects of the dynamical core are a hybrid of well tested techniques from CAM's finite volume and global spectral dynamical core options. Here we first give an overview of the spectral element dynamical core as used in CAM. We then give scalability and performance results from CAM running with three different dynamical core options within the Community Earth System Model, using a pre-industrial time-slice configuration. We focus on high resolution simulations of 1/4 degree, 1/8 degree, and T340 spectral truncation.

  13. Towards Data Value-Level Metadata for Clinical Studies.

    PubMed

    Zozus, Meredith Nahm; Bonner, Joseph

    2017-01-01

    While several standards for metadata describing clinical studies exist, comprehensive metadata to support traceability of data from clinical studies has not been articulated. We examine uses of metadata in clinical studies. We examine and enumerate seven sources of data value-level metadata in clinical studies inclusive of research designs across the spectrum of the National Institutes of Health definition of clinical research. The sources of metadata inform categorization in terms of metadata describing the origin of a data value, the definition of a data value, and operations to which the data value was subjected. The latter is further categorized into information about changes to a data value, movement of a data value, retrieval of a data value, and data quality checks, constraints or assessments to which the data value was subjected. The implications of tracking and managing data value-level metadata are explored.
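
    The categorization described above, metadata about a data value's origin, its definition, and the operations applied to it, can be pictured as a small record structure. The field names below are hypothetical and simply mirror the categories listed in the abstract.

        # Hypothetical sketch of data value-level metadata, mirroring the
        # categories in the abstract: origin, definition, and operations
        # (changes, movements, retrievals, quality checks) on a single value.
        from dataclasses import dataclass, field

        @dataclass
        class Operation:
            kind: str        # "change", "movement", "retrieval", or "quality_check"
            detail: str
            timestamp: str

        @dataclass
        class DataValueMetadata:
            value: str
            origin: str                      # where the value came from
            definition: str                  # what the value means
            operations: list[Operation] = field(default_factory=list)

        bp = DataValueMetadata(
            value="128",
            origin="site 03, case report form, visit 2",
            definition="systolic blood pressure, mmHg, seated",
        )
        bp.operations.append(Operation("quality_check", "range check 60-250 passed", "2017-01-05"))
        print(bp.origin, "|", len(bp.operations), "operation(s) recorded")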

  14. Acoustic Metadata Management and Transparent Access to Networked Oceanographic Data Sets

    DTIC Science & Technology

    2015-09-30

    the deployment that was analyzed. It contains a Start and End time that must lie within the timespan over which the instrument was deployed. A list...encounter – Detections record acoustic encounters. The start time denotes when the animals were first detected acoustically and the end time...systematically. Children of Detection include elements such as the Start and End times of the call, bin, or encounter, a species identifier from the

  15. The PDS4 Information Model and its Role in Agile Science Data Curation

    NASA Astrophysics Data System (ADS)

    Hughes, J. S.; Crichton, D.

    2017-12-01

    PDS4 is an information model-driven service architecture supporting the capture, management, distribution and integration of massive planetary science data captured in distributed data archives world-wide. The PDS4 Information Model (IM), the core element of the architecture, was developed using lessons learned from 20 years of archiving Planetary Science Data and best practices for information model development. The foundational principles were adopted from the Open Archival Information System (OAIS) Reference Model (ISO 14721), the Metadata Registry Specification (ISO/IEC 11179), and W3C XML (Extensible Markup Language) specifications. These provided respectively an object oriented model for archive information systems, a comprehensive schema for data dictionaries and hierarchical governance, and rules for encoding documents electronically. The PDS4 Information Model is unique in that it drives the PDS4 infrastructure by providing the representation of concepts and their relationships, constraints, rules, and operations; a sharable, stable, and organized set of information requirements; and machine parsable definitions that are suitable for configuring and generating code. This presentation will provide an overview of the PDS4 Information Model and how it is being leveraged to develop and evolve the PDS4 infrastructure and enable agile curation of over 30 years of science data collected by the international Planetary Science community.

  16. Managing Complex Change in Clinical Study Metadata

    PubMed Central

    Brandt, Cynthia A.; Gadagkar, Rohit; Rodriguez, Cesar; Nadkarni, Prakash M.

    2004-01-01

    In highly functional metadata-driven software, the interrelationships within the metadata become complex, and maintenance becomes challenging. We describe an approach to metadata management that uses a knowledge-base subschema to store centralized information about metadata dependencies and use cases involving specific types of metadata modification. Our system borrows ideas from production-rule systems in that some of this information is a high-level specification that is interpreted and executed dynamically by a middleware engine. Our approach is implemented in TrialDB, a generic clinical study data management system. We review approaches that have been used for metadata management in other contexts and describe the features, capabilities, and limitations of our system. PMID:15187070

  17. Ubiquitous UAVs: a cloud based framework for storing, accessing and processing huge amount of video footage in an efficient way

    NASA Astrophysics Data System (ADS)

    Efstathiou, Nectarios; Skitsas, Michael; Psaroudakis, Chrysostomos; Koutras, Nikolaos

    2017-09-01

    Nowadays, video surveillance cameras are used for the protection and monitoring of a huge number of facilities worldwide. An important element in such surveillance systems is the use of aerial video streams originating from onboard sensors located on Unmanned Aerial Vehicles (UAVs). Video surveillance using UAVs represents a vast amount of video to be transmitted, stored, analyzed and visualized in real time. As a result, the introduction and development of systems able to handle huge amounts of data becomes a necessity. In this paper, a new approach for the collection, transmission and storage of aerial videos and metadata is introduced. The objective of this work is twofold. First, the integration of the appropriate equipment in order to capture and transmit real-time video including metadata (i.e. position coordinates, target) from the UAV to the ground and, second, the utilization of the ADITESS Versatile Media Content Management System (VMCMS-GE) for storing the video stream and the appropriate metadata. Beyond the storage, VMCMS-GE provides other efficient management capabilities such as searching and processing of videos, along with video transcoding. For the evaluation and demonstration of the proposed framework we execute a use case where the surveillance of critical infrastructure and the detection of suspicious activities are performed. Transcoding of the collected video is also a subject of this evaluation.

  18. XML at the ADC: Steps to a Next Generation Data Archive

    NASA Astrophysics Data System (ADS)

    Shaya, E.; Blackwell, J.; Gass, J.; Oliversen, N.; Schneider, G.; Thomas, B.; Cheung, C.; White, R. A.

    1999-05-01

    The eXtensible Markup Language (XML) is a document markup language that allows users to specify their own tags, to create hierarchical structures to qualify their data, and to support automatic checking of documents for structural validity. It is being intensively supported by nearly every major corporate software developer. Under the funds of a NASA AISRP proposal, the Astronomical Data Center (ADC, http://adc.gsfc.nasa.gov) is developing an infrastructure for importation, enhancement, and distribution of data and metadata using XML as the document markup language. We discuss the preliminary Document Type Definition (DTD, at http://adc.gsfc.nasa.gov/xml) which specifies the elements and their attributes in our metadata documents. This attempts to define both the metadata of an astronomical catalog and the `header' information of an astronomical table. In addition, we give an overview of the planned flow of data through automated pipelines from authors and journal presses into our XML archive and retrieval through the web via the XML-QL Query Language and eXtensible Style Language (XSL) scripts. When completed, the catalogs and journal tables at the ADC will be tightly hyperlinked to enhance data discovery. In addition one will be able to search on fragmentary information. For instance, one could query for a table by entering that the second author is so-and-so or that the third author is at such-and-such institution.
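
    A very small stand-in for the structural checks a DTD provides is to verify that required metadata elements are present in a document. The element names below are placeholders and do not reflect the actual ADC DTD; the sketch only illustrates the idea of checking a metadata document for structural validity.

        # Simplified stand-in for DTD-style structural checking: verify that
        # required metadata elements exist in a catalog header document.
        # Element names are placeholders, not the actual ADC DTD.
        import xml.etree.ElementTree as ET

        REQUIRED = ["title", "author", "tableHeader"]

        doc = ET.fromstring(
            "<catalog><title>Example catalog</title>"
            "<author>So-and-so</author>"
            "<tableHeader><column name='RA'/></tableHeader></catalog>"
        )

        missing = [tag for tag in REQUIRED if doc.find(tag) is None]
        print("structurally valid" if not missing else f"missing elements: {missing}")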

  19. METHOD AND APPARATUS FOR CONTROLLING NEUTRON DENSITY

    DOEpatents

    Wigner, E.P.; Young, G.J.; Weinberg, A.M.

    1961-06-27

    A neutronic reactor comprising a moderator containing uniformly sized and spaced channels and uniformly dimensioned fuel elements is patented. The fuel elements have a fissionable core and an aluminum jacket. The cores and the jackets of the fuel elements in the central channels of the reactor are respectively thinner and thicker than the cores and jackets of the fuel elements in the remainder of the reactor, producing a flattened flux.

  20. Metazen – metadata capture for metagenomes

    PubMed Central

    2014-01-01

    Background: As the impact and prevalence of large-scale metagenomic surveys grow, so does the acute need for more complete and standards compliant metadata. Metadata (data describing data) provides an essential complement to experimental data, helping to answer questions about its source, mode of collection, and reliability. Metadata collection and interpretation have become vital to the genomics and metagenomics communities, but considerable challenges remain, including exchange, curation, and distribution. Currently, tools are available for capturing basic field metadata during sampling, and for storing, updating and viewing it. Unfortunately, these tools are not specifically designed for metagenomic surveys; in particular, they lack the appropriate metadata collection templates, a centralized storage repository, and a unique ID linking system that can be used to easily port complete and compatible metagenomic metadata into widely used assembly and sequence analysis tools. Results: Metazen was developed as a comprehensive framework designed to enable metadata capture for metagenomic sequencing projects. Specifically, Metazen provides a rapid, easy-to-use portal to encourage early deposition of project and sample metadata. Conclusions: Metazen is an interactive tool that aids users in recording their metadata in a complete and valid format. A defined set of mandatory fields captures vital information, while the option to add fields provides flexibility. PMID:25780508

  1. Metazen - metadata capture for metagenomes.

    PubMed

    Bischof, Jared; Harrison, Travis; Paczian, Tobias; Glass, Elizabeth; Wilke, Andreas; Meyer, Folker

    2014-01-01

    As the impact and prevalence of large-scale metagenomic surveys grow, so does the acute need for more complete and standards compliant metadata. Metadata (data describing data) provides an essential complement to experimental data, helping to answer questions about its source, mode of collection, and reliability. Metadata collection and interpretation have become vital to the genomics and metagenomics communities, but considerable challenges remain, including exchange, curation, and distribution. Currently, tools are available for capturing basic field metadata during sampling, and for storing, updating and viewing it. Unfortunately, these tools are not specifically designed for metagenomic surveys; in particular, they lack the appropriate metadata collection templates, a centralized storage repository, and a unique ID linking system that can be used to easily port complete and compatible metagenomic metadata into widely used assembly and sequence analysis tools. Metazen was developed as a comprehensive framework designed to enable metadata capture for metagenomic sequencing projects. Specifically, Metazen provides a rapid, easy-to-use portal to encourage early deposition of project and sample metadata. Metazen is an interactive tool that aids users in recording their metadata in a complete and valid format. A defined set of mandatory fields captures vital information, while the option to add fields provides flexibility.

  2. Improving Access to NASA Earth Science Data through Collaborative Metadata Curation

    NASA Astrophysics Data System (ADS)

    Sisco, A. W.; Bugbee, K.; Shum, D.; Baynes, K.; Dixon, V.; Ramachandran, R.

    2017-12-01

    The NASA-developed Common Metadata Repository (CMR) is a high-performance metadata system that currently catalogs over 375 million Earth science metadata records. It serves as the authoritative metadata management system of NASA's Earth Observing System Data and Information System (EOSDIS), enabling NASA Earth science data to be discovered and accessed by a worldwide user community. The size of the EOSDIS data archive is steadily increasing, and the ability to manage and query this archive depends on the input of high quality metadata to the CMR. Metadata that does not provide adequate descriptive information diminishes the CMR's ability to effectively find and serve data to users. To address this issue, an innovative and collaborative review process is underway to systematically improve the completeness, consistency, and accuracy of metadata for approximately 7,000 data sets archived by NASA's twelve EOSDIS data centers, or Distributed Active Archive Centers (DAACs). The process involves automated and manual metadata assessment of both collection and granule records by a team of Earth science data specialists at NASA Marshall Space Flight Center. The team communicates results to DAAC personnel, who then make revisions and reingest improved metadata into the CMR. Implementation of this process relies on a network of interdisciplinary collaborators leveraging a variety of communication platforms and long-range planning strategies. Curating metadata at this scale and resolving metadata issues through community consensus improves the CMR's ability to serve current and future users and also introduces best practices for stewarding the next generation of Earth Observing System data. This presentation will detail the metadata curation process, its outcomes thus far, and also share the status of ongoing curation activities.
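
    The automated part of the assessment described above can be sketched as a completeness check over collection records: flag any record that is missing required descriptive fields. The field names and records below are hypothetical simplifications of that kind of check, not the actual UMM validation rules.

        # Sketch of an automated completeness check over collection-level metadata.
        # Field names and records are hypothetical simplifications, not UMM rules.
        REQUIRED_FIELDS = ["ShortName", "Abstract", "TemporalExtent", "SpatialExtent"]

        collections = [
            {"ShortName": "EX_L2_SST", "Abstract": "Example sea surface temperature."},
            {"ShortName": "EX_L3_NDVI", "Abstract": "Example NDVI.",
             "TemporalExtent": "2000-2017", "SpatialExtent": "global"},
        ]

        for record in collections:
            missing = [f for f in REQUIRED_FIELDS if not record.get(f)]
            status = "OK" if not missing else "needs review, missing: " + ", ".join(missing)
            print(record["ShortName"], "->", status)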

  3. Nuclear reactor composite fuel assembly

    DOEpatents

    Burgess, Donn M.; Marr, Duane R.; Cappiello, Michael W.; Omberg, Ronald P.

    1980-01-01

    A core and composite fuel assembly for a liquid-cooled breeder nuclear reactor including a plurality of elongated coextending driver and breeder fuel elements arranged to form a generally polygonal bundle within a thin-walled duct. The breeder elements are larger in cross section than the driver elements, and each breeder element is laterally bounded by a number of the driver elements. Each driver element further includes structure for spacing the driver elements from adjacent fuel elements and, where adjacent, the thin-walled duct. A core made up of the fuel elements can advantageously include fissile fuel of only one enrichment, while varying the effective enrichment of any given assembly or core region, merely by varying the relative number and size of the driver and breeder elements.

  4. CMR Metadata Curation

    NASA Technical Reports Server (NTRS)

    Shum, Dana; Bugbee, Kaylin

    2017-01-01

    This talk explains the ongoing metadata curation activities in the Common Metadata Repository. It explores tools that exist today which are useful for building quality metadata and also opens up the floor for discussions on other potentially useful tools.

  5. 42 CFR 457.1140 - Program specific review process: Core elements of review.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 42 Public Health 4 2011-10-01 2011-10-01 false Program specific review process: Core elements of review. 457.1140 Section 457.1140 Public Health CENTERS FOR MEDICARE & MEDICAID SERVICES, DEPARTMENT OF... review process: Core elements of review. In adopting the procedures for review of matters described in...

  6. 42 CFR 457.1140 - Program specific review process: Core elements of review.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 42 Public Health 4 2010-10-01 2010-10-01 false Program specific review process: Core elements of review. 457.1140 Section 457.1140 Public Health CENTERS FOR MEDICARE & MEDICAID SERVICES, DEPARTMENT OF... review process: Core elements of review. In adopting the procedures for review of matters described in...

  7. 34 CFR 200.26 - Core elements of a schoolwide program.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 34 Education 1 2010-07-01 2010-07-01 false Core elements of a schoolwide program. 200.26 Section 200.26 Education Regulations of the Offices of the Department of Education OFFICE OF ELEMENTARY AND... Improving Basic Programs Operated by Local Educational Agencies Schoolwide Programs § 200.26 Core elements...

  8. The GIK-Archive of sediment core radiographs with documentation

    NASA Astrophysics Data System (ADS)

    Grobe, Hannes; Winn, Kyaw; Werner, Friedrich; Driemel, Amelie; Schumacher, Stefanie; Sieger, Rainer

    2017-12-01

    The GIK-Archive of radiographs is a collection of X-ray negative and photographic images of sediment cores based on exposures taken since the early 1960s. During four decades of marine geological work at the University of Kiel, Germany, several thousand hours of sampling, careful preparation and X-raying were spent on producing a unique archive of sediment radiographs from several parts of the World Ocean. The archive consists of more than 18 500 exposures on chemical film that were digitized, geo-referenced, supplemented with metadata and archived in the data library PANGAEA®. With this publication, the images have become available open-access for use by the scientific community at https://doi.org/10.1594/PANGAEA.854841.

  9. Distributed heterogeneous inspecting system and its middleware-based solution.

    PubMed

    Huang, Li-can; Wu, Zhao-hui; Pan, Yun-he

    2003-01-01

    There are many cases when an organization needs to monitor the data and operations of its supervised departments, especially those departments which are not owned by the organization and are managed by their own information systems. Distributed Heterogeneous Inspecting System (DHIS) is the system an organization uses to monitor its supervised departments by inspecting their information systems. In DHIS, the inspected systems are generally distributed, heterogeneous, and constructed by different companies. DHIS has three key processes: abstracting core data sets and core operation sets, collecting these sets, and inspecting the collected sets. In this paper, we present the concept and mathematical definition of DHIS, a metadata method for solving the interoperability problem, a security strategy for data transfer, and a middleware-based solution for DHIS. We also describe an example of the inspecting system at Wenzhou Customs.

  10. Creating Access Points to Instrument-Based Atmospheric Data: Perspectives from the ARM Metadata Manager

    NASA Astrophysics Data System (ADS)

    Troyan, D.

    2016-12-01

    The Atmospheric Radiation Measurement (ARM) program has been collecting data from instruments in diverse climate regions for nearly twenty-five years. These data are made available to all interested parties at no cost via specially designed tools found on the ARM website (www.arm.gov). Metadata is created and applied to the various datastreams to facilitate information retrieval using the ARM website, the ARM Data Discovery Tool, and data quality reporting tools. Over the last year, the Metadata Manager - a relatively new position within the ARM program - created two documents that summarize the state of ARM metadata processes: ARM Metadata Workflow, and ARM Metadata Standards. These documents serve as guides to the creation and management of ARM metadata. With many of ARM's data functions spread around the Department of Energy national laboratory complex and with many of the original architects of the metadata structure no longer working for ARM, there is increased importance on using these documents to resolve issues from data flow bottlenecks and inaccurate metadata to improving data discovery and organizing web pages. This presentation will provide some examples from the workflow and standards documents. The examples will illustrate the complexity of the ARM metadata processes and the efficiency by which the metadata team works towards achieving the goal of providing access to data collected under the auspices of the ARM program.

  11. Efficient processing of MPEG-21 metadata in the binary domain

    NASA Astrophysics Data System (ADS)

    Timmerer, Christian; Frank, Thomas; Hellwagner, Hermann; Heuer, Jörg; Hutter, Andreas

    2005-10-01

    XML-based metadata is widely adopted across the different communities and plenty of commercial and open source tools for processing and transforming are available on the market. However, all of these tools have one thing in common: they operate on plain text encoded metadata which may become a burden in constrained and streaming environments, i.e., when metadata needs to be processed together with multimedia content on the fly. In this paper we present an efficient approach for transforming such kind of metadata which are encoded using MPEG's Binary Format for Metadata (BiM) without additional en-/decoding overheads, i.e., within the binary domain. Therefore, we have developed an event-based push parser for BiM encoded metadata which transforms the metadata by a limited set of processing instructions - based on traditional XML transformation techniques - operating on bit patterns instead of cost-intensive string comparisons.
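
    The event-based transformation approach described above can be sketched, in the textual domain, with a SAX-style handler that rewrites elements as they stream past; BiM applies the analogous event-driven idea directly to the binary encoding. The element names here are placeholders.

        # SAX-style sketch of event-based metadata transformation: rename one
        # element while the document streams past, without building a full tree.
        # This is a plain-text analogy to the binary-domain processing above;
        # element names are placeholders.
        import io
        import xml.sax
        from xml.sax.saxutils import XMLGenerator

        class RenameHandler(xml.sax.ContentHandler):
            """Rename <VideoTitle> to <Title> as events arrive."""
            def __init__(self, out):
                super().__init__()
                self.gen = XMLGenerator(out)
            def startElement(self, name, attrs):
                self.gen.startElement("Title" if name == "VideoTitle" else name, attrs)
            def endElement(self, name):
                self.gen.endElement("Title" if name == "VideoTitle" else name)
            def characters(self, content):
                self.gen.characters(content)

        out = io.StringIO()
        source = b"<Metadata><VideoTitle>Example clip</VideoTitle></Metadata>"
        xml.sax.parseString(source, RenameHandler(out))
        print(out.getvalue())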

  12. The New Online Metadata Editor for Generating Structured Metadata

    NASA Astrophysics Data System (ADS)

    Devarakonda, R.; Shrestha, B.; Palanisamy, G.; Hook, L.; Killeffer, T.; Boden, T.; Cook, R. B.; Zolly, L.; Hutchison, V.; Frame, M. T.; Cialella, A. T.; Lazer, K.

    2014-12-01

    Nobody is better suited to "describe" data than the scientist who created it. This description of the data is called metadata. In general terms, metadata represents the who, what, when, where, why and how of a dataset. eXtensible Markup Language (XML) is the preferred output format for metadata, as it makes the metadata portable and, more importantly, suitable for system discoverability. The newly developed ORNL Metadata Editor (OME) is a Web-based tool that allows users to create and maintain XML files containing key information, or metadata, about their research. Metadata include information about the specific projects, parameters, time periods, and locations associated with the data. Such information helps put the research findings in context. In addition, the metadata produced using OME will allow other researchers to find these data via metadata clearinghouses like Mercury [1] [2]. Researchers simply use the ORNL Metadata Editor to enter relevant metadata into a Web-based form. How is OME helping big data centers like the ORNL DAAC? The ORNL DAAC is one of NASA's Earth Observing System Data and Information System (EOSDIS) data centers managed by the ESDIS Project. The ORNL DAAC archives data produced by NASA's Terrestrial Ecology Program. The DAAC provides data and information relevant to biogeochemical dynamics, ecological data, and environmental processes, critical for understanding the dynamics relating to the biological components of the Earth's environment. Typically, the data produced, archived, and analyzed are at a scale of multiple petabytes, which makes data discoverability very challenging. Without proper metadata associated with the data, it is difficult to find the data you are looking for and equally difficult to use and understand the data. OME will allow data centers like the ORNL DAAC to produce meaningful, high-quality, standards-based, descriptive information about their data products, in turn helping with data discoverability and interoperability. References: [1] Devarakonda, Ranjeet, et al. "Mercury: reusable metadata management, data discovery and access system." Earth Science Informatics 3.1-2 (2010): 87-94. [2] Wilson, Bruce E., et al. "Mercury Toolset for Spatiotemporal Metadata." NASA Technical Reports Server (NTRS) (2010).
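
    As a rough illustration of the kind of output such an editor produces, the sketch below writes a small XML metadata record covering a project, a parameter, a time period, and a bounding box. The element names are hypothetical placeholders rather than the actual OME/ORNL DAAC schema.

        # Minimal sketch of writing an XML metadata record of the general kind a
        # web-based editor such as OME produces. The element names below are
        # hypothetical placeholders, not the actual OME/ORNL DAAC schema.
        import xml.etree.ElementTree as ET

        record = ET.Element("metadata")
        ET.SubElement(record, "project").text = "Example Terrestrial Ecology Project"
        ET.SubElement(record, "parameter").text = "soil_moisture"
        period = ET.SubElement(record, "temporal_coverage")
        ET.SubElement(period, "start").text = "2013-01-01"
        ET.SubElement(period, "end").text = "2013-12-31"
        site = ET.SubElement(record, "spatial_coverage")
        ET.SubElement(site, "north").text = "36.0"
        ET.SubElement(site, "south").text = "35.0"
        ET.SubElement(site, "east").text = "-84.0"
        ET.SubElement(site, "west").text = "-85.0"

        # Serialize to a portable XML document that a clearinghouse could harvest.
        print(ET.tostring(record, encoding="unicode"))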

  13. NUCLEAR REACTOR CORE

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bell, F.R.

    1963-02-01

    A nuclear reactor core composed of a number of identical elements of solid moderator material fitted together was designed. Each moderator element is apertured to provide channels for fuel and coolant. The elements have an external shape which permits them to be stacked in layers with similar elements, with the surfaces of adjacent elements fitting and in contact with each other. The cross section of the element is of a general hexagonal shape with indentations and protrusions, so that the elements can be fitted together. The described core should not be liable to fracture under transverse loading. Specific arrangements of moderator elements and fuel and coolant apertures are described. (M.P.G.)

  14. Insights into Mercury's interior structure from geodesy measurements

    NASA Astrophysics Data System (ADS)

    Rivoldini, A.; Van Hoolst, T.; Trinh, A.

    2013-09-01

    The measurements of the gravitational field of Mercury by MESSENGER [1] and improved measurements of the spin state of Mercury [2] provide important constraints on the interior structure of Mercury. In particular, these data give strong constraints on the radius and density of Mercury's core and on the core's concentration of sulfur if sulfur is the only light element in the core [3]. Although sulfur is ubiquitously invoked as the principal candidate light element in terrestrial planets' cores, its abundance in the core depends on the redox conditions during planetary formation. MESSENGER data from remote sensing of Mercury's surface [4] indicate a high abundance of sulfur and confirm the low abundance of FeO, supporting the hypothesis that Mercury formed under reducing conditions [5]. Therefore, substantial amounts of other light elements, such as silicon, could be present together with sulfur inside Mercury's core. Unlike sulfur, which hardly partitions into solid iron under Mercury's core pressure and temperature conditions, silicon partitions virtually equally between solid and liquid iron. Thus, if silicon is the only light element inside the core, the density jump at the inner-core outer-core boundary is significantly smaller than for an Fe-FeS core. If both silicon and sulfur are present inside Mercury's core, then, as a consequence of a large immiscibility region in liquid Fe-Si-S at Mercury's core conditions and for specific concentrations of light elements [6], a thin layer strongly enriched in sulfur and depleted in silicon could form at the top of the core. In this study we analyze interior structure models with silicon as the only light element in the core and with both silicon and sulfur in the core. Compared to models with Fe-FeS, both settings have different mass distributions within their cores and will likely deform differently due to different elastic properties. Consequently their libration and tides will be different. Here we will use the measured 88-day libration amplitude and polar moment of inertia of Mercury in order to constrain the interior structure of both settings and calculate their tides.

  15. An Approach to Information Management for AIR7000 with Metadata and Ontologies

    DTIC Science & Technology

    2009-10-01

    metadata. We then propose an approach based on Semantic Technologies including the Resource Description Framework (RDF) and Upper Ontologies, for the...mandating specific metadata schemas can result in interoperability problems. For example, many standards within the ADO mandate the use of XML for metadata...such problems, we propose an archi- tecture in which different metadata schemes can inter operate. By using RDF (Resource Description Framework ) as a
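
    The record above is only a fragmentary DTIC snippet, but its central proposal (using RDF as a common carrier so that different metadata schemes can interoperate) can be sketched briefly. The namespaces and properties below are invented for illustration, and the example needs the rdflib package; none of it reflects actual AIR7000 or ADO vocabularies.

        # Hedged sketch: RDF as a common carrier so that two hypothetical metadata
        # schemes can describe the same resource and be queried together.
        # Requires the rdflib package; all namespaces are illustrative only.
        from rdflib import Graph, Literal, Namespace, URIRef
        from rdflib.namespace import DCTERMS

        SCHEME_A = Namespace("http://example.org/schemeA#")
        SCHEME_B = Namespace("http://example.org/schemeB#")

        g = Graph()
        resource = URIRef("http://example.org/dataset/42")

        # The same resource described with properties drawn from two schemes.
        g.add((resource, DCTERMS.title, Literal("Example sensor data set")))
        g.add((resource, SCHEME_A.classification, Literal("UNCLASSIFIED")))
        g.add((resource, SCHEME_B.platform, Literal("example-platform")))

        # A single SPARQL query now spans both schemes.
        query = "SELECT ?p ?o WHERE { <http://example.org/dataset/42> ?p ?o . }"
        for p, o in g.query(query):
            print(p, o)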

  16. Making Interoperability Easier with NASA's Metadata Management Tool (MMT)

    NASA Technical Reports Server (NTRS)

    Shum, Dana; Reese, Mark; Pilone, Dan; Baynes, Katie

    2016-01-01

    While the ISO-19115 collection-level metadata format meets many users' needs for interoperable metadata, it can be cumbersome to create correctly. Through the MMT's simple UI experience, metadata curators can create and edit collections which are compliant with ISO-19115 without full knowledge of the NASA Best Practices implementation of the ISO-19115 format. Users are guided through the metadata creation process by a forms-based editor, complete with field information, validation hints and picklists. Once a record is completed, users can download the metadata in any of the supported formats with just two clicks.

  17. Metazen – metadata capture for metagenomes

    DOE PAGES

    Bischof, Jared; Harrison, Travis; Paczian, Tobias; ...

    2014-12-08

    Background: As the impact and prevalence of large-scale metagenomic surveys grow, so does the acute need for more complete and standards-compliant metadata. Metadata (data describing data) provides an essential complement to experimental data, helping to answer questions about its source, mode of collection, and reliability. Metadata collection and interpretation have become vital to the genomics and metagenomics communities, but considerable challenges remain, including exchange, curation, and distribution. Currently, tools are available for capturing basic field metadata during sampling, and for storing, updating and viewing it. These tools are not specifically designed for metagenomic surveys; in particular, they lack the appropriate metadata collection templates, a centralized storage repository, and a unique ID linking system that can be used to easily port complete and compatible metagenomic metadata into widely used assembly and sequence analysis tools. Results: Metazen was developed as a comprehensive framework designed to enable metadata capture for metagenomic sequencing projects. Specifically, Metazen provides a rapid, easy-to-use portal to encourage early deposition of project and sample metadata. Conclusion: Metazen is an interactive tool that aids users in recording their metadata in a complete and valid format. A defined set of mandatory fields captures vital information, while the option to add fields provides flexibility.
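
    The mandatory-versus-optional field idea described in the conclusion can be sketched in a few lines: required fields must be present and non-empty, while extra fields are simply allowed. The field names below are hypothetical placeholders, not Metazen's actual templates.

        # Sketch of the validation idea described above: a defined set of mandatory
        # fields must be present and non-empty, while arbitrary extra fields are
        # allowed. The field names are hypothetical, not Metazen's actual templates.
        MANDATORY_FIELDS = {"project_name", "sample_id", "collection_date",
                            "latitude", "longitude", "environment"}

        def validate(record: dict) -> list:
            """Return a list of problems; an empty list means the record is valid."""
            problems = []
            for field in sorted(MANDATORY_FIELDS):
                if record.get(field) in (None, ""):
                    problems.append(f"missing mandatory field: {field}")
            return problems

        record = {
            "project_name": "Example soil survey",
            "sample_id": "S-001",
            "collection_date": "2014-06-01",
            "latitude": 41.7,
            "longitude": -87.9,
            "environment": "soil",
            "ph": 6.8,               # optional extra field: allowed for flexibility
        }
        print(validate(record))      # -> []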

  18. Metazen – metadata capture for metagenomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bischof, Jared; Harrison, Travis; Paczian, Tobias

    Background: As the impact and prevalence of large-scale metagenomic surveys grow, so does the acute need for more complete and standards-compliant metadata. Metadata (data describing data) provides an essential complement to experimental data, helping to answer questions about its source, mode of collection, and reliability. Metadata collection and interpretation have become vital to the genomics and metagenomics communities, but considerable challenges remain, including exchange, curation, and distribution. Currently, tools are available for capturing basic field metadata during sampling, and for storing, updating and viewing it. These tools are not specifically designed for metagenomic surveys; in particular, they lack the appropriate metadata collection templates, a centralized storage repository, and a unique ID linking system that can be used to easily port complete and compatible metagenomic metadata into widely used assembly and sequence analysis tools. Results: Metazen was developed as a comprehensive framework designed to enable metadata capture for metagenomic sequencing projects. Specifically, Metazen provides a rapid, easy-to-use portal to encourage early deposition of project and sample metadata. Conclusion: Metazen is an interactive tool that aids users in recording their metadata in a complete and valid format. A defined set of mandatory fields captures vital information, while the option to add fields provides flexibility.

  19. Why can't I manage my digital images like MP3s? The evolution and intent of multimedia metadata

    NASA Astrophysics Data System (ADS)

    Goodrum, Abby; Howison, James

    2005-01-01

    This paper considers the deceptively simple question: Why can't digital images be managed in the simple and effective manner in which digital music files are managed? We make the case that the answer lies in different treatments of metadata in different domains with different goals. A central difference between the two formats stems from the fact that digital music metadata lookup services are collaborative and automate the movement from a digital file to the appropriate metadata, while image metadata services do not. To understand why this difference exists, we examine the divergent evolution of metadata standards for digital music and digital images and observe that the processes differ in interesting ways according to their intent. Specifically, music metadata was developed primarily for personal file management and community resource sharing, while the focus of image metadata has largely been on information retrieval. We argue that lessons from MP3 metadata can assist individuals facing their growing personal image management challenges. Our focus therefore is not on metadata for cultural heritage institutions or the publishing industry; it is limited to the personal libraries growing on our hard drives. This bottom-up approach to file management, combined with p2p distribution, radically altered the music landscape. Might such an approach have a similar impact on image publishing? This paper outlines plans for improving the personal management of digital images (doing image metadata and file management the MP3 way) and considers the likelihood of success.

  20. Why can't I manage my digital images like MP3s? The evolution and intent of multimedia metadata

    NASA Astrophysics Data System (ADS)

    Goodrum, Abby; Howison, James

    2004-12-01

    This paper considers the deceptively simple question: Why can't digital images be managed in the simple and effective manner in which digital music files are managed? We make the case that the answer lies in different treatments of metadata in different domains with different goals. A central difference between the two formats stems from the fact that digital music metadata lookup services are collaborative and automate the movement from a digital file to the appropriate metadata, while image metadata services do not. To understand why this difference exists, we examine the divergent evolution of metadata standards for digital music and digital images and observe that the processes differ in interesting ways according to their intent. Specifically, music metadata was developed primarily for personal file management and community resource sharing, while the focus of image metadata has largely been on information retrieval. We argue that lessons from MP3 metadata can assist individuals facing their growing personal image management challenges. Our focus therefore is not on metadata for cultural heritage institutions or the publishing industry; it is limited to the personal libraries growing on our hard drives. This bottom-up approach to file management, combined with p2p distribution, radically altered the music landscape. Might such an approach have a similar impact on image publishing? This paper outlines plans for improving the personal management of digital images (doing image metadata and file management the MP3 way) and considers the likelihood of success.

  1. Metal-silicate partitioning and the light element in the core (Invited)

    NASA Astrophysics Data System (ADS)

    Wood, B. J.; Wade, J.; Tuff, J.

    2009-12-01

    Most attempts to constrain the concentrations of “light” elements in the Earth’s core rely either on cosmochemical arguments or on arguments based on the densities and equations of state of Fe-alloys containing the element of concern. Despite its utility, the latter approach yields a wide range of permissible compositions and hence weak constraints. The major problem with the cosmochemical approach is that the abundances in the bulk Earth of all the candidate “light” elements (H, C, O, Si and S) are highly uncertain because of their volatile behavior during planetary accretion. In contrast, refractory elements appear to be in approximately CI chondritic relative abundances in the Earth. This leads to the potential for using the partitioning of refractory siderophile elements between the mantle and core to constrain the concentrations of light elements in the core. Recent experimental metal-silicate partitioning data, coupled with mantle abundances of refractory siderophile elements (e.g., Wade and Wood, EPSL v. 236, 78-95, 2005; Kegler et al., EPSL v. 268, 28-40, 2008), have shown that the core segregated from the mantle under high-pressure conditions (~40 GPa). If a wide range of elements, from highly siderophile (e.g., Mo) through moderately siderophile (Ni, Co, W) to weakly siderophile (V, Cr, Nb, Si), is considered, the Earth also appears to have become more oxidized during accretion. Metal-silicate partitioning of some elements is also sensitive to the light element content of the metal. For example, Nb and W partitioning depend strongly on carbon, Mo on silicon and Cr on sulfur. Given the measured mantle abundances of the refractory elements, these observations enable the Si and C contents of the core to be constrained at ~5% and <2%, respectively, while partitioning is consistent with a cosmochemically estimated S content of ~2%.

  2. The Role of Metadata Standards in EOSDIS Search and Retrieval Applications

    NASA Technical Reports Server (NTRS)

    Pfister, Robin

    1999-01-01

    Metadata standards play a critical role in data search and retrieval systems. Metadata tie software to data so the data can be processed, stored, searched, retrieved and distributed. Without metadata these actions are not possible. The process of populating metadata to describe science data is an important service to the end-user community, so that a user who is unfamiliar with the data can easily find and learn about a particular dataset before an order decision is made. Once a good set of standards is in place, the accuracy with which data search can be performed depends on the degree to which metadata standards are adhered to during product definition. NASA's Earth Observing System Data and Information System (EOSDIS) provides examples of how metadata standards are used in data search and retrieval.

  3. openPDS: protecting the privacy of metadata through SafeAnswers.

    PubMed

    de Montjoye, Yves-Alexandre; Shmueli, Erez; Wang, Samuel S; Pentland, Alex Sandy

    2014-01-01

    The rise of smartphones and web services made possible the large-scale collection of personal metadata. Information about individuals' location, phone call logs, or web searches is collected and used intensively by organizations and big data researchers. Metadata has, however, yet to realize its full potential. Privacy and legal concerns, as well as the lack of technical solutions for personal metadata management, are preventing metadata from being shared and reconciled under the control of the individual. This lack of access and control is furthermore fueling growing concerns, as it prevents individuals from understanding and managing the risks associated with the collection and use of their data. Our contribution is two-fold: (1) we describe openPDS, a personal metadata management framework that allows individuals to collect, store, and give fine-grained access to their metadata to third parties. It has been implemented in two field studies; (2) we introduce and analyze SafeAnswers, a new and practical way of protecting the privacy of metadata at an individual level. SafeAnswers turns a hard anonymization problem into a more tractable security one. It allows services to ask questions whose answers are calculated against the metadata instead of trying to anonymize individuals' metadata. The dimensionality of the data shared with the services is reduced from high-dimensional metadata to low-dimensional answers that are less likely to be re-identifiable and to contain sensitive information. These answers can then be shared directly, individually or in aggregate. openPDS and SafeAnswers provide a new way of dynamically protecting personal metadata, thereby supporting the creation of smart data-driven services and data science research.
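
    A minimal sketch of the SafeAnswers idea follows: a third-party service submits a question that is evaluated inside the personal data store, and only the low-dimensional answer leaves, never the raw metadata. The class and the example question are illustrative assumptions, not the actual openPDS API.

        # Hedged sketch of the SafeAnswers idea: a third-party service submits a
        # question that is evaluated inside the personal data store (PDS), and only
        # a small, low-dimensional answer is returned, never the raw metadata.
        # This illustrates the concept only; it is not the actual openPDS API.
        from collections import Counter

        class PersonalDataStore:
            def __init__(self, location_log):
                # Raw, sensitive metadata stays private to the PDS.
                self._location_log = location_log

            def safe_answer(self, question):
                """Run an approved question against raw metadata; return only its answer."""
                return question(self._location_log)

        # Approved question: "in which zone does the user spend most time?"
        def dominant_zone(location_log):
            zone, _count = Counter(entry["zone"] for entry in location_log).most_common(1)[0]
            return zone

        pds = PersonalDataStore([
            {"timestamp": "2014-01-01T08:00", "zone": "home"},
            {"timestamp": "2014-01-01T12:00", "zone": "office"},
            {"timestamp": "2014-01-01T18:00", "zone": "office"},
        ])
        print(pds.safe_answer(dominant_zone))   # -> "office"; the raw log never leaves the PDS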

  4. openPDS: Protecting the Privacy of Metadata through SafeAnswers

    PubMed Central

    de Montjoye, Yves-Alexandre; Shmueli, Erez; Wang, Samuel S.; Pentland, Alex Sandy

    2014-01-01

    The rise of smartphones and web services made possible the large-scale collection of personal metadata. Information about individuals' location, phone call logs, or web searches is collected and used intensively by organizations and big data researchers. Metadata has, however, yet to realize its full potential. Privacy and legal concerns, as well as the lack of technical solutions for personal metadata management, are preventing metadata from being shared and reconciled under the control of the individual. This lack of access and control is furthermore fueling growing concerns, as it prevents individuals from understanding and managing the risks associated with the collection and use of their data. Our contribution is two-fold: (1) we describe openPDS, a personal metadata management framework that allows individuals to collect, store, and give fine-grained access to their metadata to third parties. It has been implemented in two field studies; (2) we introduce and analyze SafeAnswers, a new and practical way of protecting the privacy of metadata at an individual level. SafeAnswers turns a hard anonymization problem into a more tractable security one. It allows services to ask questions whose answers are calculated against the metadata instead of trying to anonymize individuals' metadata. The dimensionality of the data shared with the services is reduced from high-dimensional metadata to low-dimensional answers that are less likely to be re-identifiable and to contain sensitive information. These answers can then be shared directly, individually or in aggregate. openPDS and SafeAnswers provide a new way of dynamically protecting personal metadata, thereby supporting the creation of smart data-driven services and data science research. PMID:25007320

  5. Adaptive Data Gathering in Mobile Sensor Networks Using Speedy Mobile Elements

    PubMed Central

    Lai, Yongxuan; Xie, Jinshan; Lin, Ziyu; Wang, Tian; Liao, Minghong

    2015-01-01

    Data gathering is a key operation for applications in wireless sensor networks; yet it is also a challenging problem in mobile sensor networks, where all nodes are mobile and the communications among them are opportunistic. This paper proposes an efficient data gathering scheme called ADG that adopts speedy mobile elements as the mobile data collector and takes advantage of the movement patterns of the network. ADG first extracts the network meta-data at initial epochs and calculates a set of proxy nodes based on the meta-data. Data gathering is then mapped into the Proxy node Time Slot Allocation (PTSA) problem, which schedules the time slots and visiting order according to which the data collector can gather the maximal amount of data within a limited period. Finally, the collector follows the schedule and picks up the sensed data from the proxy nodes through one hop of message transmissions. ADG learns the period when nodes are relatively stationary, so that the collector is able to pick up the data from them during the limited data gathering period. Moreover, proxy nodes and data gathering points can also be updated in a timely fashion so that the collector can adapt to changes in node movements. Extensive experimental results show that the proposed scheme outperforms other data gathering schemes in terms of message transmission cost and data gathering rate, especially under the constraint of a limited data gathering period. PMID:26389903
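
    A heavily simplified sketch of the proxy-node time-slot allocation idea is given below: each candidate proxy offers some expected data volume in particular time slots, and the collector picks at most one proxy per slot within the limited gathering period. The greedy pass is used purely for illustration and is not the paper's actual PTSA algorithm.

        # Greatly simplified sketch of proxy-node time-slot allocation: each
        # candidate proxy node offers an expected data volume in certain time
        # slots, and the collector can visit at most one proxy per slot within a
        # limited gathering period. A greedy pass is used here purely for
        # illustration; it is not the paper's actual PTSA algorithm.

        def schedule(offers, num_slots):
            """offers: list of (proxy_id, slot, expected_bytes). Return slot -> (proxy, bytes)."""
            allocation = {}
            used_proxies = set()
            # Consider the largest offers first.
            for proxy, slot, volume in sorted(offers, key=lambda o: o[2], reverse=True):
                if slot < num_slots and slot not in allocation and proxy not in used_proxies:
                    allocation[slot] = (proxy, volume)
                    used_proxies.add(proxy)
            return allocation

        offers = [
            ("n1", 0, 400), ("n1", 2, 300),
            ("n2", 1, 900), ("n3", 1, 500), ("n3", 3, 700),
        ]
        plan = schedule(offers, num_slots=3)
        print(plan)                                   # -> {1: ('n2', 900), 0: ('n1', 400)}
        print(sum(v for _, v in plan.values()))       # total data expected within the period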

  6. caCORE version 3: Implementation of a model driven, service-oriented architecture for semantic interoperability.

    PubMed

    Komatsoulis, George A; Warzel, Denise B; Hartel, Francis W; Shanbhag, Krishnakant; Chilukuri, Ram; Fragoso, Gilberto; Coronado, Sherri de; Reeves, Dianne M; Hadfield, Jillaine B; Ludet, Christophe; Covitz, Peter A

    2008-02-01

    One of the requirements for a federated information system is interoperability, the ability of one computer system to access and use the resources of another system. This feature is particularly important in biomedical research systems, which need to coordinate a variety of disparate types of data. In order to meet this need, the National Cancer Institute Center for Bioinformatics (NCICB) has created the cancer Common Ontologic Representation Environment (caCORE), an interoperability infrastructure based on Model Driven Architecture. The caCORE infrastructure provides a mechanism to create interoperable biomedical information systems. Systems built using the caCORE paradigm address both aspects of interoperability: the ability to access data (syntactic interoperability) and understand the data once retrieved (semantic interoperability). This infrastructure consists of an integrated set of three major components: a controlled terminology service (Enterprise Vocabulary Services), a standards-based metadata repository (the cancer Data Standards Repository) and an information system with an Application Programming Interface (API) based on Domain Model Driven Architecture. This infrastructure is being leveraged to create a Semantic Service-Oriented Architecture (SSOA) for cancer research by the National Cancer Institute's cancer Biomedical Informatics Grid (caBIG).

  7. caCORE version 3: Implementation of a model driven, service-oriented architecture for semantic interoperability

    PubMed Central

    Komatsoulis, George A.; Warzel, Denise B.; Hartel, Frank W.; Shanbhag, Krishnakant; Chilukuri, Ram; Fragoso, Gilberto; de Coronado, Sherri; Reeves, Dianne M.; Hadfield, Jillaine B.; Ludet, Christophe; Covitz, Peter A.

    2008-01-01

    One of the requirements for a federated information system is interoperability, the ability of one computer system to access and use the resources of another system. This feature is particularly important in biomedical research systems, which need to coordinate a variety of disparate types of data. In order to meet this need, the National Cancer Institute Center for Bioinformatics (NCICB) has created the cancer Common Ontologic Representation Environment (caCORE), an interoperability infrastructure based on Model Driven Architecture. The caCORE infrastructure provides a mechanism to create interoperable biomedical information systems. Systems built using the caCORE paradigm address both aspects of interoperability: the ability to access data (syntactic interoperability) and understand the data once retrieved (semantic interoperability). This infrastructure consists of an integrated set of three major components: a controlled terminology service (Enterprise Vocabulary Services), a standards-based metadata repository (the cancer Data Standards Repository) and an information system with an Application Programming Interface (API) based on Domain Model Driven Architecture. This infrastructure is being leveraged to create a Semantic Service Oriented Architecture (SSOA) for cancer research by the National Cancer Institute’s cancer Biomedical Informatics Grid (caBIG™). PMID:17512259

  8. Progress in defining a standard for file-level metadata

    NASA Technical Reports Server (NTRS)

    Williams, Joel; Kobler, Ben

    1996-01-01

    In the following narrative, metadata required to locate a file on tape or a collection of tapes will be referred to as file-level metadata. This paper describes the rationale for and the history of the effort to define a standard for this metadata.

  9. Achieving interoperability for metadata registries using comparative object modeling.

    PubMed

    Park, Yu Rang; Kim, Ju Han

    2010-01-01

    Achieving data interoperability between organizations relies upon agreed meaning and representation (metadata) of data. For managing and registering metadata, many organizations have built metadata registries (MDRs) in various domains based on the international standard for the MDR framework, ISO/IEC 11179. Following this trend, two public MDRs in the biomedical domain have been created, the United States Health Information Knowledgebase (USHIK) and the cancer Data Standards Registry and Repository (caDSR), from the U.S. Department of Health & Human Services and the National Cancer Institute (NCI), respectively. Most MDRs are implemented with indiscriminate extensions to satisfy organization-specific needs and to work around semantic and structural limitations of ISO/IEC 11179. As a result, it is difficult to achieve interoperability among multiple MDRs. In this paper, we propose an integrated metadata object model for achieving interoperability among multiple MDRs. To evaluate this model, we developed an XML Schema Definition (XSD)-based metadata exchange format. We created an XSD-based metadata exporter, supporting both the integrated metadata object model and organization-specific MDR formats.
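
    As a rough sketch of the exchange idea under discussion, the code below models metadata items with a small, ISO/IEC 11179-flavoured object model and exports them to an XML exchange document. The class and element names are assumptions for illustration, not the paper's actual integrated model or XSD.

        # Sketch of the exchange idea: represent metadata items with a small,
        # ISO/IEC 11179-flavoured object model and export them to an XML exchange
        # document. Class and element names are illustrative assumptions, not the
        # paper's actual integrated model or XSD.
        from dataclasses import dataclass
        import xml.etree.ElementTree as ET

        @dataclass
        class DataElement:
            identifier: str        # registry identifier
            name: str              # data element name
            definition: str        # agreed meaning
            value_domain: str      # agreed representation (e.g. datatype / code list)

        def export(elements):
            root = ET.Element("metadata_exchange")
            for el in elements:
                node = ET.SubElement(root, "data_element", id=el.identifier)
                ET.SubElement(node, "name").text = el.name
                ET.SubElement(node, "definition").text = el.definition
                ET.SubElement(node, "value_domain").text = el.value_domain
            return ET.tostring(root, encoding="unicode")

        elements = [
            DataElement("DE-0001", "Patient Birth Date",
                        "Date on which the patient was born", "ISO 8601 date"),
        ]
        print(export(elements))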

  10. Request queues for interactive clients in a shared file system of a parallel computing system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bent, John M.; Faibish, Sorin

    Interactive requests are processed from users of log-in nodes. A metadata server node is provided for use in a file system shared by one or more interactive nodes and one or more batch nodes. The interactive nodes comprise interactive clients to execute interactive tasks and the batch nodes execute batch jobs for one or more batch clients. The metadata server node comprises a virtual machine monitor; an interactive client proxy to store metadata requests from the interactive clients in an interactive client queue; a batch client proxy to store metadata requests from the batch clients in a batch client queue; and a metadata server to store the metadata requests from the interactive client queue and the batch client queue in a metadata queue based on an allocation of resources by the virtual machine monitor. The metadata requests can be prioritized, for example, based on one or more of a predefined policy and predefined rules.
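
    The queueing idea can be sketched as follows: requests arrive in separate interactive and batch queues and are merged into a single metadata queue according to a policy set by the resource allocator. The weighted interleave below is an illustrative stand-in, not the implementation described above.

        # Hedged sketch of the queueing idea: metadata requests arrive in separate
        # interactive and batch queues and are merged into one metadata queue
        # according to a policy set by the resource allocator. A simple weighted
        # interleave is used for illustration; it is not the actual implementation.
        from collections import deque

        def merge_queues(interactive, batch, interactive_weight=3, batch_weight=1):
            """Drain both queues into one metadata queue, favouring interactive requests."""
            interactive, batch = deque(interactive), deque(batch)
            metadata_queue = []
            while interactive or batch:
                for _ in range(interactive_weight):
                    if interactive:
                        metadata_queue.append(interactive.popleft())
                for _ in range(batch_weight):
                    if batch:
                        metadata_queue.append(batch.popleft())
            return metadata_queue

        interactive_reqs = ["stat /home/a", "open /home/b", "readdir /home"]
        batch_reqs = ["create ckpt.0001", "create ckpt.0002"]
        # Interactive requests are served first, with batch requests interleaved
        # according to the configured weights.
        print(merge_queues(interactive_reqs, batch_reqs))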

  11. NASA's Earth Observing System Data and Information System - Many Mechanisms for On-Going Evolution

    NASA Astrophysics Data System (ADS)

    Ramapriyan, H. K.

    2012-12-01

    NASA's Earth Observing System Data and Information System (EOSDIS) has been serving a broad user community since August 1994. As a long-lived multi-mission system serving multiple scientific disciplines and a diverse user community, EOSDIS has been evolving continuously. It has had and continues to have many forms of community input to help with this evolution. Early in its history, it had inputs from the EOSDIS Advisory Panel, benefited from the reviews by various external committees and evolved into the present distributed architecture with discipline-based Distributed Active Archive Centers (DAACs), Science Investigator-led Processing Systems and a cross-DAAC search and data access capability. EOSDIS evolution has been helped by advances in computer technology, moving from an initially planned supercomputing environment to SGI workstations to Linux Clusters for computation and from near-line archives of robotic silos with tape cassettes to RAID-disk-based on-line archives for storage. The network capacities have increased steadily over the years, making delivery of data on media almost obsolete. The advances in information systems technologies have been having an even greater impact on the evolution of EOSDIS. In the early days, the advent of the World Wide Web came as a game-changer in the operation of EOSDIS. The metadata model developed for the EOSDIS Core System for representing metadata from EOS standard data products has had an influence on the Federal Geographic Data Committee's metadata content standard and the ISO metadata standards. The influence works both ways. As the ISO 19115 metadata standard has developed in recent years, EOSDIS has been reviewing its metadata to ensure compliance with the standard. Improvements have been made in the cross-DAAC search and access of data using the centralized metadata clearing house (EOS Clearing House - ECHO) and the client Reverb. Given the diversity of the Earth science disciplines served by the DAACs, the DAACs have developed a number of software tools tailored to their respective user communities. Web services play an important part in improved access to data products, including some basic analysis and visualization capabilities. A coherent view into all capabilities available from EOSDIS is evolving through the "Coherent Web" effort. Data are being made available in near real-time for scientific research as well as time-critical applications. Ongoing community inputs for technology infusion, which maintain the vitality of EOSDIS, come from NASA-sponsored community data system programs - Advancing Collaborative Connections for Earth System Science (ACCESS), Making Earth System Data Records for Use in Research Environments (MEaSUREs) and Applied Information System Technology (AIST) - as well as from participation in Earth Science Data System Working Groups, the Earth Science Information Partners Federation and other interagency/international activities. An important source of community needs is the annual American Customer Satisfaction Index survey of EOSDIS users. Some of the key areas in which improvements are required and incremental progress is being made are: ease of discovery and access; cross-organizational interoperability; data inter-use; ease of collaboration; ease of citation of datasets; preservation of provenance and context, and making them conveniently available to users.

  12. Making metadata usable in a multi-national research setting.

    PubMed

    Ellul, Claire; Foord, Joanna; Mooney, John

    2013-11-01

    SECOA (Solutions for Environmental Contrasts in Coastal Areas) is a multi-national research project examining the effects of human mobility on urban settlements in fragile coastal environments. This paper describes the setting up of a SECOA metadata repository for non-specialist researchers such as environmental scientists and tourism experts. Conflicting usability requirements of two groups - metadata creators and metadata users - are identified along with associated limitations of current metadata standards. A description is given of a configurable metadata system designed to grow as the project evolves. This work is of relevance for similar projects such as INSPIRE. Copyright © 2012 Elsevier Ltd and The Ergonomics Society. All rights reserved.

  13. Inter-University Upper Atmosphere Global Observation Network (IUGONET) Metadata Database and Its Interoperability

    NASA Astrophysics Data System (ADS)

    Yatagai, A. I.; Iyemori, T.; Ritschel, B.; Koyama, Y.; Hori, T.; Abe, S.; Tanaka, Y.; Shinbori, A.; Umemura, N.; Sato, Y.; Yagi, M.; Ueno, S.; Hashiguchi, N. O.; Kaneda, N.; Belehaki, A.; Hapgood, M. A.

    2013-12-01

    The IUGONET is a Japanese program to build a metadata database for ground-based observations of the upper atmosphere [1]. The project began in 2009 with five Japanese institutions which archive data observed by radars, magnetometers, photometers, radio telescopes and helioscopes, and so on, at various altitudes from the Earth's surface to the Sun. Systems have been developed to allow searching of the above-described metadata. We have been updating the system and adding new and updated metadata. The IUGONET development team adopted the SPASE metadata model [2] to describe the upper atmosphere data. This model is used as the common metadata format by the virtual observatories for solar-terrestrial physics. It includes metadata referring to each data file (called a 'Granule'), which enables searching for data files as well as data sets. Further details are described in [2] and [3]. Currently, three additional Japanese institutions are being incorporated into IUGONET. Furthermore, metadata of observations of the troposphere, taken at the observatories of the middle and upper atmosphere radar at Shigaraki and the Meteor radar in Indonesia, have been incorporated. These additions will contribute to efficient interdisciplinary scientific research. In the beginning of 2013, the registration of the 'Observatory' and 'Instrument' metadata was completed, which makes it easy to get an overview of the metadata database. The number of registered metadata records as of the end of July totalled 8.8 million, including 793 observatories and 878 instruments. It is important to promote interoperability and/or metadata exchange between the database development groups. A memorandum of agreement has been signed with the European Near-Earth Space Data Infrastructure for e-Science (ESPAS) project, which has similar objectives to IUGONET, with regard to establishing a framework for formal collaboration. Furthermore, observations by satellites and the International Space Station are being incorporated with a view to making/linking metadata databases. The development of effective data systems will contribute to the progress of scientific research on solar terrestrial physics, climate and the geophysical environment. Any kind of cooperation, metadata input and feedback, especially for linkage of the databases, is welcomed. References: 1. Hayashi, H. et al., Inter-university Upper Atmosphere Global Observation Network (IUGONET), Data Sci. J., 12, WDS179-184, 2013. 2. King, T. et al., SPASE 2.0: A standard data model for space physics. Earth Sci. Inform. 3, 67-73, 2010, doi:10.1007/s12145-010-0053-4. 3. Hori, T., et al., Development of IUGONET metadata format and metadata management system. J. Space Sci. Info. Jpn., 105-111, 2012. (in Japanese)

  14. Towards Precise Metadata-set for Discovering 3D Geospatial Models in Geo-portals

    NASA Astrophysics Data System (ADS)

    Zamyadi, A.; Pouliot, J.; Bédard, Y.

    2013-09-01

    Accessing 3D geospatial models, ideally at no cost and for unrestricted use, is certainly an important issue as they become popular among participatory communities, consultants, and officials. Various geo-portals, mainly established for 2D resources, have tried to provide access to existing 3D resources such as digital elevation models, LIDAR or classic topographic data. Describing the content of data, metadata is a key component of data discovery in geo-portals. An inventory of seven online geo-portals and commercial catalogues shows that the metadata referring to 3D information is very different from one geo-portal to another, as well as for similar 3D resources in the same geo-portal. The inventory considered 971 data resources affiliated with elevation. 51% of them were from three geo-portals running at Canadian federal and municipal levels whose metadata resources did not consider 3D models by any definition. Regarding the remaining 49%, which refer to 3D models, different definitions of terms and metadata were found, resulting in confusion and misinterpretation. The overall assessment of these geo-portals clearly shows that the provided metadata do not integrate specific and common information about 3D geospatial models. Accordingly, the main objective of this research is to improve 3D geospatial model discovery in geo-portals by adding a specific metadata-set. Based on the knowledge and current practices in 3D modeling, and 3D data acquisition and management, a set of metadata is proposed to make metadata more suitable for 3D geospatial models. This metadata-set enables the definition of genuine classes, fields, and code-lists for a 3D metadata profile. The main structure of the proposal contains 21 metadata classes. These classes are organized in three packages: General and Complementary, covering contextual and structural information, and Availability, covering the transition from storage to delivery format. The proposed metadata-set is compared with the Canadian Geospatial Data Infrastructure (CGDI) metadata, which is an implementation of the North American Profile of ISO-19115. The comparison analyzes the two metadata sets against three simulated scenarios for discovering needed 3D geospatial datasets. Considering specific metadata about 3D geospatial models, the proposed metadata-set has six additional classes on geometric dimension, level of detail, geometric modeling, topology, and appearance information. In addition, classes on data acquisition, preparation, and modeling, and on physical availability have been specialized for 3D geospatial models.

  15. A volatile rich Earth's core?

    NASA Astrophysics Data System (ADS)

    Morard, G.; Antonangeli, D.; Andrault, D.; Nakajima, Y.

    2017-12-01

    The composition of the Earth's core is still an open question. Although mostly composed of iron, it contains impurities that lower its density and melting point with respect to pure Fe. Knowledge of the nature and abundance of light elements (O, S, Si, C or H) in the core has major implications for establishing the bulk composition of the Earth and for building models of Earth's differentiation. Geochemical models of the Earth's formation point out that its building blocks were depleted in volatile elements compared to the chondritic abundance; therefore, light elements such as S, H or C cannot be the major elements alloyed with iron in the Earth's core. However, such models should be compatible with the comparison of seismic properties of the Earth's core and physical properties of iron alloys under extreme conditions, such as the sound velocity or density of the solid and liquid. The present work will discuss recent progress on compositional models derived from studies of phase diagrams and elastic properties of iron alloys under core conditions and highlight the compatibility of volatile elements with the observed properties of the Earth's core, in potential contradiction with models derived from metal-silicate partitioning experiments.

  16. Interactive Visualization Systems and Data Integration Methods for Supporting Discovery in Collections of Scientific Information

    DTIC Science & Technology

    2011-05-01

    iTunes illustrate the difference between the centralized approach of digital library systems and the distributed approach of container file formats...metadata in a container file format. Apple’s iTunes uses a centralized metadata approach and allows users to maintain song metadata in a single...one iTunes library to another the metadata must be copied separately or reentered in the new library. This demonstrates the utility of storing metadata

  17. Mitogenome metadata: current trends and proposed standards.

    PubMed

    Strohm, Jeff H T; Gwiazdowski, Rodger A; Hanner, Robert

    2016-09-01

    Mitogenome metadata are descriptive terms about the sequence and its source specimen that allow both to be digitally discoverable and interoperable. Here, we review a sampling of mitogenome metadata published in the journal Mitochondrial DNA between 2005 and 2014. Specifically, we have focused on a subset of metadata fields that are available for GenBank records and specified by the Genomics Standards Consortium (GSC) and other biodiversity metadata standards, and we assessed their presence across three main categories: collection, biological and taxonomic information. To do this we reviewed 146 mitogenome manuscripts, and their associated GenBank records, and scored them for 13 metadata fields. We also explored the potential for mitogenome misidentification using their sequence diversity and taxonomic metadata on the Barcode of Life Datasystems (BOLD). For this, we focused on all Lepidoptera and Perciformes mitogenomes included in the review, along with additional mitogenome sequence data mined from GenBank. Overall, we found that none of the 146 mitogenome projects provided all the metadata we looked for, and only 17 projects provided at least one category of metadata across the three main categories. Comparisons using mtDNA sequences from BOLD suggest that some mitogenomes may be misidentified. Lastly, we appreciate the research potential of mitogenomes announced through this journal, and we conclude with a suggestion of 13 metadata fields, available on GenBank, that, if provided in a mitogenome's GenBank record, would increase their research value.
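
    The scoring exercise described above can be sketched as a simple presence check of metadata fields grouped into the three categories named in the review. The specific field names are illustrative placeholders rather than the paper's exact 13 GenBank fields.

        # Sketch of the scoring exercise: check which metadata fields are present
        # on a record, grouped into the three categories named in the review. The
        # field names are illustrative placeholders, not the paper's exact fields.
        CATEGORIES = {
            "collection": ["country", "collection_date", "lat_lon", "collected_by"],
            "biological": ["tissue_type", "sex", "dev_stage", "isolate"],
            "taxonomic":  ["organism", "specimen_voucher", "identified_by",
                           "bio_material", "taxon_id"],
        }

        def score(record: dict) -> dict:
            """Return, per category, how many of the expected fields are populated."""
            return {
                category: sum(1 for field in fields if record.get(field))
                for category, fields in CATEGORIES.items()
            }

        record = {"organism": "Danaus plexippus", "country": "Canada",
                  "collection_date": "2012-07-15"}
        print(score(record))   # -> {'collection': 2, 'biological': 0, 'taxonomic': 1}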

  18. Metadata and Service at the GFZ ISDC Portal

    NASA Astrophysics Data System (ADS)

    Ritschel, B.

    2008-05-01

    The online service portal of the GFZ Potsdam Information System and Data Center (ISDC) is an access point for all manner of geoscientific geodata, its corresponding metadata, scientific documentation and software tools. At present almost 2000 national and international users and user groups have the opportunity to request Earth science data from a portfolio of 275 different product types and more than 20 million single data files with a total volume of approximately 12 TByte. The majority of the data and information the portal currently offers to the public are global geomonitoring products such as satellite orbit and Earth gravity field data, as well as geomagnetic and atmospheric data for exploration. These products for Earth's changing system are provided via state-of-the-art retrieval techniques. The data product catalog system behind these techniques is based on the extensive usage of standardized metadata, which describe the different geoscientific product types and data products in a uniform way. Whereas all ISDC product types are specified by NASA's Directory Interchange Format (DIF), Version 9.0 parent XML DIF metadata files, the individual data files are described by extended DIF metadata documents. Depending on when the scientific project began, one part of the data files is described by extended DIF Version 6 metadata documents and the other part is specified by child XML DIF metadata documents. Both the product-type-dependent parent DIF metadata documents and the data-file-dependent child DIF metadata documents are derived from a base-DIF.xsd XML schema file. The ISDC metadata philosophy defines a geoscientific product as a package consisting of mostly one, or sometimes more than one, data file plus one extended DIF metadata file. Because NASA's DIF metadata standard was developed to specify a collection of data only, the extension of the DIF standard consists of new and specific attributes, which are necessary for an explicit identification of single data files and the set-up of a comprehensive Earth science data catalog. The huge ISDC data catalog is realized by product-type-dependent tables filled with data-file-related metadata, which have relations to corresponding metadata tables. The product-type-describing parent DIF XML metadata documents are stored and managed in ORACLE's XML storage structures. In order to improve the interoperability of the ISDC service portal, the existing proprietary catalog system will be extended by an ISO 19115 based web catalog service. In addition to this development, an ISDC-related semantic network is being built from different kinds of metadata resources, such as standardized and non-standardized metadata documents and literature, as well as Web 2.0 user-generated information derived from tagging activities and social navigation data.

  19. Mercury Toolset for Spatiotemporal Metadata

    NASA Technical Reports Server (NTRS)

    Wilson, Bruce E.; Palanisamy, Giri; Devarakonda, Ranjeet; Rhyne, B. Timothy; Lindsley, Chris; Green, James

    2010-01-01

    Mercury (http://mercury.ornl.gov) is a set of tools for federated harvesting, searching, and retrieving metadata, particularly spatiotemporal metadata. Version 3.0 of the Mercury toolset provides orders of magnitude improvements in search speed, support for additional metadata formats, integration with Google Maps for spatial queries, facetted type search, support for RSS (Really Simple Syndication) delivery of search results, and enhanced customization to meet the needs of the multiple projects that use Mercury. It provides a single portal to very quickly search for data and information contained in disparate data management systems, each of which may use different metadata formats. Mercury harvests metadata and key data from contributing project servers distributed around the world and builds a centralized index. The search interfaces then allow the users to perform a variety of fielded, spatial, and temporal searches across these metadata sources. This centralized repository of metadata with distributed data sources provides extremely fast search results to the user, while allowing data providers to advertise the availability of their data and maintain complete control and ownership of that data. Mercury periodically (typically daily) harvests metadata sources through a collection of interfaces and re-indexes these metadata to provide extremely rapid search capabilities, even over collections with tens of millions of metadata records. A number of both graphical and application interfaces have been constructed within Mercury, to enable both human users and other computer programs to perform queries. Mercury was also designed to support multiple different projects, so that the particular fields that can be queried and used with search filters are easy to configure for each different project.
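
    The federated harvest-and-index pattern described above can be sketched with mocked providers: records are pulled into a central store, an inverted index is built over their titles, and keyword queries run against the index. This illustrates the pattern only and is not the Mercury codebase.

        # Sketch of the federated harvest-and-index pattern: pull metadata records
        # from several (mocked) providers, build one centralized inverted index,
        # then answer keyword queries against it. Illustration only; not Mercury.
        from collections import defaultdict

        providers = {
            "daac_a": [{"id": "a-1", "title": "Soil moisture over the Amazon"}],
            "daac_b": [{"id": "b-1", "title": "Arctic sea ice extent"},
                       {"id": "b-2", "title": "Soil respiration flux"}],
        }

        def harvest(providers):
            """Gather all records into a central store, remembering their source."""
            store = {}
            for source, records in providers.items():
                for rec in records:
                    store[f"{source}/{rec['id']}"] = rec
            return store

        def build_index(store):
            """Map each lower-cased title word to the record keys that contain it."""
            index = defaultdict(set)
            for key, rec in store.items():
                for word in rec["title"].lower().split():
                    index[word].add(key)
            return index

        store = harvest(providers)
        index = build_index(store)
        print(sorted(index["soil"]))   # -> ['daac_a/a-1', 'daac_b/b-2']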

  20. Mercury Toolset for Spatiotemporal Metadata

    NASA Astrophysics Data System (ADS)

    Devarakonda, Ranjeet; Palanisamy, Giri; Green, James; Wilson, Bruce; Rhyne, B. Timothy; Lindsley, Chris

    2010-06-01

    Mercury (http://mercury.ornl.gov) is a set of tools for federated harvesting, searching, and retrieving metadata, particularly spatiotemporal metadata. Version 3.0 of the Mercury toolset provides orders of magnitude improvements in search speed, support for additional metadata formats, integration with Google Maps for spatial queries, facetted type search, support for RSS (Really Simple Syndication) delivery of search results, and enhanced customization to meet the needs of the multiple projects that use Mercury. It provides a single portal to very quickly search for data and information contained in disparate data management systems, each of which may use different metadata formats. Mercury harvests metadata and key data from contributing project servers distributed around the world and builds a centralized index. The search interfaces then allow the users to perform a variety of fielded, spatial, and temporal searches across these metadata sources. This centralized repository of metadata with distributed data sources provides extremely fast search results to the user, while allowing data providers to advertise the availability of their data and maintain complete control and ownership of that data. Mercury periodically (typically daily) harvests metadata sources through a collection of interfaces and re-indexes these metadata to provide extremely rapid search capabilities, even over collections with tens of millions of metadata records. A number of both graphical and application interfaces have been constructed within Mercury, to enable both human users and other computer programs to perform queries. Mercury was also designed to support multiple different projects, so that the particular fields that can be queried and used with search filters are easy to configure for each different project.

  1. Metadata Realities for Cyberinfrastructure: Data Authors as Metadata Creators

    ERIC Educational Resources Information Center

    Mayernik, Matthew Stephen

    2011-01-01

    As digital data creation technologies become more prevalent, data and metadata management are necessary to make data available, usable, sharable, and storable. Researchers in many scientific settings, however, have little experience or expertise in data and metadata management. In this dissertation, I explore the everyday data and metadata…

  2. NetCDF4/HDF5 and Linked Data in the Real World - Enriching Geoscientific Metadata without Bloat

    NASA Astrophysics Data System (ADS)

    Ip, Alex; Car, Nicholas; Druken, Kelsey; Poudjom-Djomani, Yvette; Butcher, Stirling; Evans, Ben; Wyborn, Lesley

    2017-04-01

    NetCDF4 has become the dominant generic format for many forms of geoscientific data, leveraging (and constraining) the versatile HDF5 container format, while providing metadata conventions for interoperability. However, the encapsulation of detailed metadata within each file can lead to metadata "bloat", and difficulty in maintaining consistency where metadata is replicated to multiple locations. Complex conceptual relationships are also difficult to represent in simple key-value netCDF metadata. Linked Data provides a practical mechanism to address these issues by associating the netCDF files and their internal variables with complex metadata stored in Semantic Web vocabularies and ontologies, while complying with and complementing existing metadata conventions. One of the stated objectives of the netCDF4/HDF5 formats is that they should be self-describing: containing metadata sufficient for cataloguing and using the data. However, this objective can be regarded as only partially-met where details of conventions and definitions are maintained externally to the data files. For example, one of the most widely used netCDF community standards, the Climate and Forecasting (CF) Metadata Convention, maintains standard vocabularies for a broad range of disciplines across the geosciences, but this metadata is currently neither readily discoverable nor machine-readable. We have previously implemented useful Linked Data and netCDF tooling (ncskos) that associates netCDF files, and individual variables within those files, with concepts in vocabularies formulated using the Simple Knowledge Organization System (SKOS) ontology. NetCDF files contain Uniform Resource Identifier (URI) links to terms represented as SKOS Concepts, rather than plain-text representations of those terms, so we can use simple, standardised web queries to collect and use rich metadata for the terms from any Linked Data-presented SKOS vocabulary. Geoscience Australia (GA) manages a large volume of diverse geoscientific data, much of which is being translated from proprietary formats to netCDF at NCI Australia. This data is made available through the NCI National Environmental Research Data Interoperability Platform (NERDIP) for programmatic access and interdisciplinary analysis. The netCDF files contain both scientific data variables (e.g. gravity, magnetic or radiometric values), but also domain-specific operational values (e.g. specific instrument parameters) best described fully in formal vocabularies. Our ncskos codebase provides access to multiple stores of detailed external metadata in a standardised fashion. Geophysical datasets are generated from a "survey" event, and GA maintains corporate databases of all surveys and their associated metadata. It is impractical to replicate the full source survey metadata into each netCDF dataset so, instead, we link the netCDF files to survey metadata using public Linked Data URIs. These URIs link to Survey class objects which we model as a subclass of Activity objects as defined by the PROV Ontology, and we provide URI resolution for them via a custom Linked Data API which draws current survey metadata from GA's in-house databases. We have demonstrated that Linked Data is a practical way to associate netCDF data with detailed, external metadata. This allows us to ensure that catalogued metadata is kept consistent with metadata points-of-truth, and we can infer complex conceptual relationships not possible with netCDF key-value attributes alone.
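
    The Linked Data association described above can be sketched with the netCDF4 Python library: a variable carries an attribute whose value is a URI pointing at an external SKOS concept, so the rich metadata lives outside the file. The attribute name and URI below are illustrative assumptions rather than the exact ncskos convention.

        # Sketch of the Linked Data idea using the netCDF4 Python library: instead
        # of embedding a full description, a variable carries an attribute whose
        # value is a URI pointing at an external SKOS concept. The attribute name
        # and URI are illustrative assumptions, not the exact ncskos convention.
        from netCDF4 import Dataset

        with Dataset("demo_grav.nc", "w", format="NETCDF4") as ds:
            ds.createDimension("point", 3)
            grav = ds.createVariable("gravity_anomaly", "f4", ("point",))
            grav[:] = [9.2, -3.1, 0.4]
            grav.units = "mGal"
            # Link to external, machine-readable metadata rather than duplicating it.
            grav.setncattr("skos_concept_uri",
                           "http://example.org/vocab/geophysics/gravity_anomaly")

        # A consumer can later read the URI and dereference it to fetch rich metadata.
        with Dataset("demo_grav.nc") as ds:
            print(ds["gravity_anomaly"].getncattr("skos_concept_uri"))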

  3. Experiments on Lunar Core Composition: Phase Equilibrium Analysis of A Multi-Element (Fe-Ni-S-C) System

    NASA Technical Reports Server (NTRS)

    Go, B. M.; Righter, K.; Danielson, L.; Pando, K.

    2015-01-01

    Previous geochemical and geophysical experiments have proposed the presence of a small, metallic lunar core, but its composition is still being investigated. Knowledge of core composition can have a significant effect on understanding the thermal history of the Moon, the conditions surrounding the liquid-solid or liquid-liquid field, and siderophile element partitioning between mantle and core. However, experiments on complex bulk core compositions are very limited. One limitation comes from numerous studies that have considered only two- or three-element systems such as Fe-S or Fe-C, which do not supply a comprehensive understanding of complex systems such as Fe-Ni-S-Si-C. Recent geophysical data suggest the presence of up to 6% light elements. Reassessments of Apollo seismological analyses and samples have also shown the need to acquire more data for a broader range of pressures, temperatures, and compositions. This study considers a complex multi-element system (Fe-Ni-S-C) over a pressure and temperature range relevant to the Moon's core conditions.

  4. Content Metadata Standards for Marine Science: A Case Study

    USGS Publications Warehouse

    Riall, Rebecca L.; Marincioni, Fausto; Lightsom, Frances L.

    2004-01-01

    The U.S. Geological Survey developed a content metadata standard to meet the demands of organizing electronic resources in the marine sciences for a broad, heterogeneous audience. This metadata standard is used by the Marine Realms Information Bank project, a Web-based public distributed library of marine science from academic institutions and government agencies. The development and deployment of this metadata standard serve as a model, complete with lessons about mistakes, for the creation of similarly specialized metadata standards for digital libraries.

  5. Major and trace elements in Mahogany zone oil shale in two cores from the Green River Formation, piceance basin, Colorado

    USGS Publications Warehouse

    Tuttle, M.L.; Dean, W.E.; Parduhn, N.L.

    1983-01-01

    The Parachute Creek Member of the lacustrine Green River Formation contains thick sequences of rich oil shale. The richest sequence and the richest oil-shale bed occurring in the member are called the Mahogany zone and the Mahogany bed, respectively, and were deposited in ancient Lake Uinta. The name "Mahogany" is derived from the red-brown color imparted to the rock by its rich kerogen content. Geochemical abundance and distribution of eight major and 18 trace elements were determined in the Mahogany zone sampled from two cores, U.S. Geological Survey core hole CR-2 and U.S. Bureau of Mines core hole O1-A (Figure 1). The oil shale from core hole CR-2 was deposited nearer the margin of Lake Uinta than oil shale from core hole O1-A. The major- and trace-element chemistry of the Mahogany zone from each of these two cores is compared using elemental abundances and Q-mode factor modeling. The results of chemical analyses of 44 CR-2 Mahogany samples and 76 O1-A Mahogany samples are summarized in Figure 2. The average geochemical abundances for shale (1) and black shale (2) are also plotted on Figure 2 for comparison. The elemental abundances in the samples from the two cores are similar for the majority of elements. Differences at the 95% probability level are higher concentrations of Ca, Cu, La, Ni, Sc and Zr in the samples from core hole CR-2 compared to samples from core hole O1-A, and higher concentrations of As and Sr in samples from core hole O1-A compared to samples from core hole CR-2. These differences presumably reflect slight differences in depositional conditions or source material at the two sites. The Mahogany oil shale from the two cores has lower concentrations of most trace metals and higher concentrations of carbonate-related elements (Ca, Mg, Sr and Na) compared to the average shale and black shale. During deposition of the Mahogany oil shale, large quantities of carbonates were precipitated, resulting in the enrichment of carbonate-related elements and dilution of most trace elements, as pointed out in several previous studies. Q-mode factor modeling is a statistical method used to group samples on the basis of compositional similarities. Factor end-member samples are chosen by the model. All other sample compositions are represented by varying proportions of the factor end-members and grouped as to their highest proportion. The compositional similarities defined by the Q-mode model are helpful in understanding processes controlling multi-element distributions. The models for each core are essentially identical. A four-factor model explains 70% of the variance in the CR-2 data and 64% of the O1-A data (the average correlation coefficients are 0.84 and 0.80, respectively). Increasing the number of factors above 4 results in the addition of unique instead of common factors. Table I groups the elements based on high factor-loading scores (the amount of influence each element has in defining the model factors). Similar elemental associations are found in both cores. Elemental abundances are plotted as a function of core depth using a five-point weighted moving average of the original data to smooth the curve (Figures 3 and 4). The plots are grouped according to the four factors defined by the Q-mode models and show similar distributions for elements within the same factor. Factor 1 samples are rich in most trace metals. High oil yield and the presence of illite characterize the end-member samples for this factor (3, 4), suggesting that adsorption of metals onto clay particles or organic matter is controlling the distribution of the metals. Precipitation of some metals as sulfides is possible (5). Factor 2 samples are high in elements commonly associated with minerals of detrital or volcanogenic origin. Altered tuff beds and lenses are prevalent within the Mahogany zone. The CR-2 end-member samples for this factor contain analcime (3), which is an alteration product within the tuff beds of the Green River Formation. Th
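
    As a rough illustration of the Q-mode grouping idea described in the record above, the toy sketch below row-normalises a synthetic composition matrix, extracts factors with an SVD, and assigns each sample to the factor on which it loads most heavily. All numbers are synthetic; this is not the procedure or data used in the study.

    ```python
    # Toy illustration of Q-mode-style grouping of samples by compositional
    # similarity; synthetic values only, not data from the two core holes.
    import numpy as np

    rng = np.random.default_rng(0)
    # rows = samples, columns = element concentrations (synthetic)
    X = np.abs(rng.normal(loc=[10, 5, 1, 0.2], scale=[2, 1, 0.3, 0.05], size=(20, 4)))

    # Q-mode works on similarities between samples: scale each sample (row)
    # to unit length so only relative element proportions matter.
    W = X / np.linalg.norm(X, axis=1, keepdims=True)

    # Factors from an SVD of the row-normalised matrix; keep k factors.
    k = 2
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    loadings = U[:, :k] * s[:k]          # sample loadings on each factor

    # Each sample is grouped with the factor on which it loads most heavily,
    # mirroring the "highest proportion" grouping described in the abstract.
    groups = np.argmax(np.abs(loadings), axis=1)
    explained = (s[:k] ** 2).sum() / (s ** 2).sum()
    print(f"{explained:.0%} of variance in {k} factors; group sizes:",
          np.bincount(groups, minlength=k))
    ```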

  6. HIGH TEMPERATURE, HIGH POWER HETEROGENEOUS NUCLEAR REACTOR

    DOEpatents

    Hammond, R.P.; Wykoff, W.R.; Busey, H.M.

    1960-06-14

    A heterogeneous nuclear reactor is designed comprising a stationary housing and a rotatable annular core supported for rotation about a vertical axis in the housing, the core containing a plurality of radial fuel-element supporting channels, the cylindrical empty space along the axis of the core providing a central plenum for the disposal of spent fuel elements, the outer periphery of the core cross section being vertically gradated in radius from one end to the other to provide a coolant duct between the core and the housing, and means for inserting fresh fuel elements in the supporting channels under pressure and while the reactor is in operation.

  7. Properties of iron alloys under the Earth's core conditions

    NASA Astrophysics Data System (ADS)

    Morard, Guillaume; Andrault, Denis; Antonangeli, Daniele; Bouchet, Johann

    2014-05-01

    The Earth's core is composed of iron and nickel alloyed with lighter elements. In view of their affinity with the metallic phase, their relatively high abundance in the solar system and their moderate volatility, a list of potential light elements has been established, including sulfur, silicon and oxygen. We will review the effects of these elements on different aspects of Fe-X high-pressure phase diagrams under Earth's core conditions, such as melting temperature depression, solid-liquid partitioning during crystallization, and the crystalline structure of the solid phases. Once extrapolated to the inner-outer core boundary, these petrological properties can be used to constrain the Earth's core properties.

  8. Development of Web GIS for complex processing and visualization of climate geospatial datasets as an integral part of dedicated Virtual Research Environment

    NASA Astrophysics Data System (ADS)

    Gordov, Evgeny; Okladnikov, Igor; Titov, Alexander

    2017-04-01

    For comprehensive usage of large geospatial meteorological and climate datasets it is necessary to create a distributed software infrastructure based on the spatial data infrastructure (SDI) approach. Currently, it is generally accepted that the development of client applications as integrated elements of such infrastructure should be based on modern web and GIS technologies. The paper describes the Web GIS for complex processing and visualization of geospatial (mainly in NetCDF and PostGIS formats) datasets as an integral part of the dedicated Virtual Research Environment for comprehensive study of ongoing and possible future climate change and analysis of its implications, providing full information and computing support for the study of economic, political and social consequences of global climate change at the global and regional levels. The Web GIS consists of two basic software parts: (1) a server-side part comprising PHP applications of the SDI geoportal, which implement the interaction with the computational core backend and the WMS/WFS/WPS cartographical services, as well as an open API for browser-based client software; being secondary to the client, this part provides a limited set of procedures accessible via a standard HTTP interface; and (2) a front-end part, a Web GIS client developed as a "single page application" based on the JavaScript libraries OpenLayers (http://openlayers.org/), ExtJS (https://www.sencha.com/products/extjs) and GeoExt (http://geoext.org/). The client implements the application business logic and provides an intuitive user interface similar to that of such popular desktop GIS applications as uDig, QuantumGIS, etc. The Boundless/OpenGeo architecture was used as a basis for the Web GIS client development. In accordance with general INSPIRE requirements for data visualization, the Web GIS provides standard functionality such as data overview, image navigation, scrolling, scaling, graphical overlay, and display of map legends and corresponding metadata. The specialized Web GIS client contains three basic tiers: a tier of NetCDF metadata in JSON format; a middleware tier of JavaScript objects implementing methods to work with the NetCDF metadata, the XML file of the selected calculation configuration (XML task), and the WMS/WFS/WPS cartographical services; and a graphical user interface tier of JavaScript objects implementing the general application business logic. The Web GIS developed supports launching computational processing services for tasks in the area of environmental monitoring, and presents calculation results as WMS/WFS cartographical layers in raster (PNG, JPG, GeoTIFF), vector (KML, GML, Shape) and binary (NetCDF) formats. It has shown its effectiveness in solving real climate change research problems and disseminating investigation results in cartographical formats. The work is supported by the Russian Science Foundation grant No 16-19-10257.
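
    The map layers delivered by such a system are typically retrieved through standard OGC requests. The sketch below shows the kind of WMS GetMap call a browser-based client issues; the endpoint and layer name are placeholders, not the project's actual services.

    ```python
    # Sketch of a WMS GetMap request of the kind issued by a Web GIS client;
    # the endpoint and layer name are placeholders, not the project's services.
    import requests

    WMS_URL = "https://example.org/geoportal/wms"   # hypothetical endpoint

    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": "climate:near_surface_air_temperature",  # hypothetical layer
        "CRS": "EPSG:4326",
        "BBOX": "40,60,80,120",     # lat/lon axis order for EPSG:4326 in WMS 1.3.0
        "WIDTH": "800",
        "HEIGHT": "400",
        "FORMAT": "image/png",
        "TRANSPARENT": "TRUE",
    }
    resp = requests.get(WMS_URL, params=params, timeout=30)
    resp.raise_for_status()
    with open("temperature_layer.png", "wb") as fh:
        fh.write(resp.content)      # raster overlay for the map client
    ```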

  9. Concurrent array-based queue

    DOEpatents

    Heidelberger, Philip; Steinmacher-Burow, Burkhard

    2015-01-06

    According to one embodiment, a method for implementing an array-based queue in memory of a memory system that includes a controller includes configuring, in the memory, metadata of the array-based queue. The configuring comprises defining, in metadata, an array start location in the memory for the array-based queue, defining, in the metadata, an array size for the array-based queue, defining, in the metadata, a queue top for the array-based queue and defining, in the metadata, a queue bottom for the array-based queue. The method also includes the controller serving a request for an operation on the queue, the request providing the location in the memory of the metadata of the queue.
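
    A software analogue of the structure described in this record can make the metadata fields concrete. The toy sketch below keeps an array start location, array size, queue top, and queue bottom in a small metadata block and updates them as requests are served; it is illustrative only and is not the patented hardware mechanism.

    ```python
    # Toy software analogue of the array-based queue described above: the queue
    # is a fixed region of "memory" plus a metadata block holding the array
    # start, array size, queue top and queue bottom.  Illustrative only.
    class ArrayQueue:
        def __init__(self, memory, start, size):
            self.memory = memory                 # backing "memory" (a Python list)
            self.meta = {"start": start, "size": size, "top": 0, "bottom": 0}

        def _slot(self, index):
            m = self.meta
            return m["start"] + (index % m["size"])   # wrap inside the array

        def enqueue(self, value):
            m = self.meta
            if m["bottom"] - m["top"] >= m["size"]:
                raise OverflowError("queue full")
            self.memory[self._slot(m["bottom"])] = value
            m["bottom"] += 1                     # "controller" updates the metadata

        def dequeue(self):
            m = self.meta
            if m["top"] == m["bottom"]:
                raise IndexError("queue empty")
            value = self.memory[self._slot(m["top"])]
            m["top"] += 1
            return value


    memory = [None] * 32                         # pretend flat memory
    q = ArrayQueue(memory, start=8, size=4)      # queue occupies slots 8..11
    for v in "abcd":
        q.enqueue(v)
    print(q.dequeue(), q.dequeue())              # -> a b
    ```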

  10. WGISS-45 International Directory Network (IDN) Report

    NASA Technical Reports Server (NTRS)

    Morahan, Michael

    2018-01-01

    The objective of this presentation is to provide IDN (International Directory Network) updates on features and activities to the Committee on Earth Observation Satellites (CEOS) Working Group on Information Systems and Services (WGISS) and the provider community. The following topics will be discussed during the presentation: Transition of Providers' DIF-9 (Directory Interchange Format-9) to DIF-10 Metadata Records in the Common Metadata Repository (CMR); GCMD (Global Change Master Directory) Keyword Update; DIF-10 and UMM-C (Unified Metadata Model-Collections) Schema Changes; Metadata Validation of Provider Metadata; docBUILDER for Submitting IDN Metadata to the CMR (i.e. Registration); and Mapping the WGClimate Essential Climate Variable (ECV) Inventory to IDN Records.

  11. CORE-SINEs: eukaryotic short interspersed retroposing elements with common sequence motifs.

    PubMed

    Gilbert, N; Labuda, D

    1999-03-16

    A 65-bp "core" sequence is dispersed in hundreds of thousands of copies in the human genome. This sequence was found to constitute the central segment of a group of short interspersed elements (SINEs), referred to as mammalian-wide interspersed repeats, that proliferated before the radiation of placental mammals. Here, we propose that the core identifies an ancient tRNA-like SINE element, which survived in different lineages such as mammals, reptiles, birds, and fish, as well as mollusks, presumably for >550 million years. This element gave rise to a number of sequence families (CORE-SINEs), including mammalian-wide interspersed repeats, whose distinct 3' ends are shared with different families of long interspersed elements (LINEs). The evolutionary success of the generic CORE-SINE element can be related to the recruitment of the internal promoter from highly transcribed host RNA as well as to its capacity to adapt to changing retropositional opportunities by sequence exchange with actively amplifying LINEs. It reinforces the notion that the very existence of SINEs depends on the cohabitation with both LINEs and the host genome.

  12. SBML Level 3 package: Groups, Version 1 Release 1

    PubMed Central

    Hucka, Michael; Smith, Lucian P.

    2017-01-01

    Summary Biological models often contain components that have relationships with each other, or that modelers want to treat as belonging to groups with common characteristics or shared metadata. The SBML Level 3 Version 1 Core specification does not provide an explicit mechanism for expressing such relationships, but it does provide a mechanism for SBML packages to extend the Core specification and add additional syntactical constructs. The SBML Groups package for SBML Level 3 adds the necessary features to SBML to allow grouping of model components to be expressed. Such groups do not affect the mathematical interpretation of a model, but they do provide a way to add information that can be useful for modelers and software tools. The SBML Groups package enables a modeler to include definitions of groups and nested groups, each of which may be annotated to convey why that group was created, and what it represents. PMID:28187406

  13. Explorative Analyses of Nursing Research Data.

    PubMed

    Kim, Hyeoneui; Jang, Imho; Quach, Jimmy; Richardson, Alex; Kim, Jaemin; Choi, Jeeyae

    2016-10-26

    As a first step of pursuing the vision of "big data science in nursing," we described the characteristics of nursing research data reported in 194 published nursing studies. We also explored how completely the Version 1 metadata specification of biomedical and healthCAre Data Discovery Index Ecosystem (bioCADDIE) represents these metadata. The metadata items of the nursing studies were all related to one or more of the bioCADDIE metadata entities. However, values of many metadata items of the nursing studies were not sufficiently represented through the bioCADDIE metadata. This was partly due to the differences in the scope of the content that the bioCADDIE metadata are designed to represent. The 194 nursing studies reported a total of 1,181 unique data items, the majority of which take non-numeric values. This indicates the importance of data standardization to enable the integrative analyses of these data to support big data science in nursing. © The Author(s) 2016.

  14. MPEG-7: standard metadata for multimedia content

    NASA Astrophysics Data System (ADS)

    Chang, Wo

    2005-08-01

    The eXtensible Markup Language (XML) metadata technology of describing media contents has emerged as a dominant mode of making media searchable for both human and machine consumption. To realize this promise, many online Web applications are pushing this concept to its fullest potential. However, a good metadata model requires a robust standardization effort so that the metadata content and its structure can reach maximum usage between various applications. An effective media content description technology should also use standard metadata structures, especially when dealing with various multimedia contents. A new metadata technology called MPEG-7 content description has emerged from the ISO MPEG standards body with the charter of defining standard metadata to describe audiovisual content. This paper will give an overview of MPEG-7 technology and the impact it can bring to the next generation of multimedia indexing and retrieval applications.

  15. Quality Assurance for Digital Learning Object Repositories: Issues for the Metadata Creation Process

    ERIC Educational Resources Information Center

    Currier, Sarah; Barton, Jane; O'Beirne, Ronan; Ryan, Ben

    2004-01-01

    Metadata enables users to find the resources they require, therefore it is an important component of any digital learning object repository. Much work has already been done within the learning technology community to assure metadata quality, focused on the development of metadata standards, specifications and vocabularies and their implementation…

  16. A Model for the Creation of Human-Generated Metadata within Communities

    ERIC Educational Resources Information Center

    Brasher, Andrew; McAndrew, Patrick

    2005-01-01

    This paper considers situations for which detailed metadata descriptions of learning resources are necessary, and focuses on human generation of such metadata. It describes a model which facilitates human production of good quality metadata by the development and use of structured vocabularies. Using examples, this model is applied to single and…

  17. Enhancing SCORM Metadata for Assessment Authoring in E-Learning

    ERIC Educational Resources Information Center

    Chang, Wen-Chih; Hsu, Hui-Huang; Smith, Timothy K.; Wang, Chun-Chia

    2004-01-01

    With the rapid development of distance learning and the XML technology, metadata play an important role in e-Learning. Nowadays, many distance learning standards, such as SCORM, AICC CMI, IEEE LTSC LOM and IMS, use metadata to tag learning materials. However, most metadata models are used to define learning materials and test problems. Few…

  18. MEANS FOR COOLING REACTORS

    DOEpatents

    Wheeler, J.A.

    1957-11-01

    A design of a reactor is presented in which the fuel elements may be immersed in a liquid coolant when desired without the necessity of removing them from the reactor structure. The fuel elements, containing the fissionable material, are in plate form and are disposed within spaced slots in a moderator material, such as graphite, to form the core. Adjacent to the core is a tank containing the liquid coolant. The fuel elements are mounted in spaced relationship on a rotatable shaft which is located between the core and the tank so that by rotation of the shaft the fuel elements may be either inserted in the slots in the core to sustain a chain reaction or immersed in the coolant.

  19. Metadata in the Wild: An Empirical Survey of OPeNDAP-accessible Metadata and its Implications for Discovery

    NASA Astrophysics Data System (ADS)

    Hardy, D.; Janée, G.; Gallagher, J.; Frew, J.; Cornillon, P.

    2006-12-01

    The OPeNDAP Data Access Protocol (DAP) is a community standard for sharing scientific data across the Internet. Data providers using DAP have adopted a variety of metadata conventions to improve data utility, such as COARDS (1995) and CF (2003). Our results show, however, that metadata do not follow these conventions in practice. We collected metadata from over a hundred DAP servers, tens of thousands of data objects, and hundreds of collections. We found that a minority claim to adhere to a metadata convention, and a small percentage accurately adhere to their stated convention. We present descriptive statistics of our survey and highlight common traits such as well-populated attributes. Our empirical results indicate that unified search services cannot rely solely on metadata conventions. Although we encourage all providers to adopt a small subset of the CF convention for discovery purposes, we have no evidence to suggest that improved conventions would simplify the fundamental problem of heterogeneity. Large-scale discovery services must find methods for integrating incompatible metadata.
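
    The kind of convention check performed in such a survey can be sketched very simply: request a dataset's DAP2 attribute structure (the ".das" response) and look for a declared Conventions attribute. The dataset URL below is a placeholder, not one of the servers surveyed.

    ```python
    # Rough sketch of the survey approach: pull a DAP2 attribute structure
    # (.das) for a dataset and look for a declared metadata convention.
    # The dataset URL is a placeholder, not one of the servers surveyed.
    import re
    import requests

    DATASET_URL = "http://example.org/opendap/sst/monthly.nc"   # hypothetical

    das_text = requests.get(DATASET_URL + ".das", timeout=30).text

    # A claimed convention usually appears as a global attribute such as
    #   String Conventions "CF-1.0";
    match = re.search(r'Conventions\s+"([^"]+)"', das_text)
    if match:
        print("claimed convention(s):", match.group(1))
    else:
        print("no Conventions attribute declared")
    ```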

  20. A Shared Infrastructure for Federated Search Across Distributed Scientific Metadata Catalogs

    NASA Astrophysics Data System (ADS)

    Reed, S. A.; Truslove, I.; Billingsley, B. W.; Grauch, A.; Harper, D.; Kovarik, J.; Lopez, L.; Liu, M.; Brandt, M.

    2013-12-01

    The vast amount of science metadata can be overwhelming and highly complex. Comprehensive analysis and sharing of metadata is difficult since institutions often publish to their own repositories. There are many disjoint standards used for publishing scientific data, making it difficult to discover and share information from different sources. Services that publish metadata catalogs often have different protocols, formats, and semantics. The research community is limited by the exclusivity of separate metadata catalogs, and thus it is desirable to have federated search interfaces capable of unified search queries across multiple sources. Aggregation of metadata catalogs also enables users to critique metadata more rigorously. With these motivations in mind, the National Snow and Ice Data Center (NSIDC) and the Advanced Cooperative Arctic Data and Information Service (ACADIS) implemented two search interfaces for the community. Both the NSIDC Search and the ACADIS Arctic Data Explorer (ADE) use a common infrastructure, which keeps maintenance costs low. The search clients are designed to make OpenSearch requests against Solr, an open source search platform. Solr applies indexes to specific fields of the metadata, which in this instance optimizes queries containing keywords, spatial bounds and temporal ranges. NSIDC metadata is reused by both search interfaces, but the ADE also brokers additional sources. Users can quickly find relevant metadata with minimal effort, which ultimately lowers costs for research. This presentation will highlight the reuse of data and code between NSIDC and ACADIS, discuss challenges and milestones for each project, and identify the creation and use of open source libraries.
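
    A federated client of this sort ultimately issues keyword, temporal, and spatial queries against a search index. The sketch below shows an illustrative query against a Solr select handler; the endpoint and field names are placeholders and do not reproduce the NSIDC or ACADIS schemas.

    ```python
    # Sketch of a keyword + time + bounding-box query against a Solr index of
    # metadata records.  Endpoint and field names are placeholders only.
    import requests

    SOLR_SELECT = "https://example.org/solr/metadata/select"   # hypothetical core

    params = {
        "q": "sea ice extent",                                  # free-text keywords
        "fq": [
            "temporal_start:[2010-01-01T00:00:00Z TO *]",       # hypothetical fields
            "spatial_bbox:[60,-180 TO 90,180]",
        ],
        "rows": 10,
        "wt": "json",
    }
    resp = requests.get(SOLR_SELECT, params=params, timeout=30)
    docs = resp.json()["response"]["docs"]
    for doc in docs:
        print(doc.get("title", "<untitled>"))
    ```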

  1. A New Look at Data Usage by Using Metadata Attributes as Indicators of Data Quality

    NASA Astrophysics Data System (ADS)

    Won, Y. I.; Wanchoo, L.; Behnke, J.

    2016-12-01

    NASA's Earth Observing System Data and Information System (EOSDIS) stores and distributes data from EOS satellites, as well as ancillary, airborne, in-situ, and socio-economic data. Twelve EOSDIS data centers support different scientific disciplines by providing products and services tailored to specific science communities. Although discipline oriented, these data centers provide the common data management functions of ingest, archive, and distribution, as well as documentation of their data and services on their web sites. The Earth Science Data and Information System (ESDIS) Project collects metrics from the EOSDIS data centers on a daily basis through a tool called the ESDIS Metrics System (EMS); these metrics are used in this study. The implementation of the Earthdata Login - formerly known as the User Registration System (URS) - across the various NASA data centers provides the EMS with additional information about users obtaining data products from EOSDIS data centers. These additional user attributes collected by the Earthdata Login, such as the user's primary area of study, can augment the understanding of data usage, which in turn can help the EOSDIS program better understand users' needs. This study will review the key metrics (users, distributed volume, and files) in multiple ways to gain an understanding of the significance of the metadata. Characterizing the usability of data by key metadata elements, such as discipline and study area, will assist in understanding how the users have evolved over time. The data usage pattern based on version numbers may also provide some insight into the level of data quality. In addition, data metrics for various services, such as the Open-source Project for a Network Data Access Protocol (OPeNDAP), Web Map Service (WMS), Web Coverage Service (WCS), and subsets, will address how these services have extended the usage of data. Overall, this study will present the usage of data and metadata through metrics analyses and will assist data centers in better supporting the needs of the users.

  2. Improving data management and dissemination in web based information systems by semantic enrichment of descriptive data aspects

    NASA Astrophysics Data System (ADS)

    Gebhardt, Steffen; Wehrmann, Thilo; Klinger, Verena; Schettler, Ingo; Huth, Juliane; Künzer, Claudia; Dech, Stefan

    2010-10-01

    The German-Vietnamese water-related information system for the Mekong Delta (WISDOM) project supports business processes in Integrated Water Resources Management in Vietnam. Multiple disciplines bring together earth and ground based observation themes, such as environmental monitoring, water management, demographics, economy, information technology, and infrastructural systems. This paper introduces the components of the web-based WISDOM system, including the data, logic and presentation tiers. It focuses on the data models upon which the database management system is built, including techniques for tagging or linking metadata with the stored information. The model also uses ordered groupings of spatial, thematic and temporal reference objects to semantically tag datasets to enable fast data retrieval, such as finding all data in a specific administrative unit belonging to a specific theme. A spatial database extension is employed by the PostgreSQL database. This object-relational database was chosen over a purely relational database to tag spatial objects to tabular data, improving the retrieval of census and observational data at regional, provincial, and local levels. Because the spatial database hinders processing of raster data, a "work-around" was built into WISDOM to permit efficient management of both raster and vector data. The data model also incorporates styling aspects of the spatial datasets through styled layer descriptors (SLD) and web map service (WMS) layer specifications, allowing retrieval of rendered maps. Metadata elements of the spatial data are based on the ISO 19115 standard. XML-structured information for the SLD and metadata is stored in an XML database. The data models and the data management system are robust for managing the large quantity of spatial objects, sensor observations, census and document data. The operational WISDOM information system prototype contains modules for data management, automatic data integration, and web services for data retrieval, analysis, and distribution. The graphical user interfaces facilitate metadata cataloguing, data warehousing, web sensor data analysis and thematic mapping.
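
    The retrieval pattern described above ("all data in a specific administrative unit belonging to a specific theme") maps naturally onto a spatial SQL query. The sketch below is illustrative only; the table and column names are hypothetical and are not the WISDOM schema.

    ```python
    # Illustrative spatial query for the retrieval pattern described above.
    # Table and column names are hypothetical, not the WISDOM schema.
    import psycopg2

    conn = psycopg2.connect("dbname=wisdom user=reader")        # placeholder DSN
    with conn, conn.cursor() as cur:
        cur.execute(
            """
            SELECT d.id, d.title
            FROM   dataset AS d
            JOIN   dataset_theme AS t ON t.dataset_id = d.id
            JOIN   admin_unit    AS a ON ST_Intersects(d.geom, a.geom)
            WHERE  t.theme = %s AND a.name = %s
            """,
            ("water management", "Can Tho"),
        )
        for dataset_id, title in cur.fetchall():
            print(dataset_id, title)
    conn.close()
    ```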

  3. A Finite Element Analysis for Predicting the Residual Compressive Strength of Impact-Damaged Sandwich Panels

    NASA Technical Reports Server (NTRS)

    Ratcliffe, James G.; Jackson, Wade C.

    2008-01-01

    A simple analysis method has been developed for predicting the residual compressive strength of impact-damaged sandwich panels. The method is tailored for honeycomb core-based sandwich specimens that exhibit an indentation growth failure mode under axial compressive loading, which is driven largely by the crushing behavior of the core material. The analysis method is in the form of a finite element model, where the impact-damaged facesheet is represented using shell elements and the core material is represented using spring elements, aligned in the thickness direction of the core. The nonlinear crush response of the core material used in the analysis is based on data from flatwise compression tests. A comparison with a previous analysis method and some experimental data shows good agreement with results from this new approach.
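
    The modeling idea, core springs whose nonlinear response comes from flatwise compression data, can be illustrated with a toy calculation: interpolate a crush curve at the local crush strain under an assumed dent and sum the spring forces. The curve values and dent shape below are made up; this is not the authors' finite element model.

    ```python
    # Toy illustration of the core-as-springs idea: each spring's force comes
    # from a nonlinear crush curve (stress vs. crush strain) of the kind
    # measured in flatwise compression tests.  All values are made up.
    import numpy as np

    # Hypothetical flatwise-compression crush curve: stress (MPa) vs. crush strain.
    crush_strain = np.array([0.00, 0.02, 0.05, 0.20, 0.60])
    crush_stress = np.array([0.00, 2.00, 1.20, 1.10, 1.50])   # peak then plateau

    core_thickness = 12.7          # mm

    # Assumed residual dent profile (depth in mm) over a grid of core springs.
    x = np.linspace(-15.0, 15.0, 31)
    y = np.linspace(-15.0, 15.0, 31)
    xx, yy = np.meshgrid(x, y)
    dent_depth = 2.0 * np.exp(-(xx**2 + yy**2) / 60.0)

    # Spring force = tributary area * stress interpolated from the crush curve.
    cell_area = (x[1] - x[0]) * (y[1] - y[0])                  # mm^2 per spring
    strain = dent_depth / core_thickness
    stress = np.interp(strain, crush_strain, crush_stress)    # MPa
    reaction = (stress * cell_area).sum()                     # total core reaction, N
    print(f"core reaction under the assumed dent: {reaction:.1f} N")
    ```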

  4. A Finite Element Analysis for Predicting the Residual Compression Strength of Impact-Damaged Sandwich Panels

    NASA Technical Reports Server (NTRS)

    Ratcliffe, James G.; Jackson, Wade C.

    2008-01-01

    A simple analysis method has been developed for predicting the residual compression strength of impact-damaged sandwich panels. The method is tailored for honeycomb core-based sandwich specimens that exhibit an indentation growth failure mode under axial compression loading, which is driven largely by the crushing behavior of the core material. The analysis method is in the form of a finite element model, where the impact-damaged facesheet is represented using shell elements and the core material is represented using spring elements, aligned in the thickness direction of the core. The nonlinear crush response of the core material used in the analysis is based on data from flatwise compression tests. A comparison with a previous analysis method and some experimental data shows good agreement with results from this new approach.

  5. A standard for measuring metadata quality in spectral libraries

    NASA Astrophysics Data System (ADS)

    Rasaiah, B.; Jones, S. D.; Bellman, C.

    2013-12-01

    There is an urgent need within the international remote sensing community to establish a metadata standard for field spectroscopy that ensures high-quality, interoperable metadata sets that can be archived and shared efficiently within Earth observation data sharing systems. Metadata are an important component in the cataloguing and analysis of in situ spectroscopy datasets because of their central role in identifying and quantifying the quality and reliability of spectral data and the products derived from them. This paper presents approaches to measuring metadata completeness and quality in spectral libraries to determine the reliability, interoperability, and re-usability of a dataset. It explores quality parameters that meet the unique requirements of in situ spectroscopy datasets across many campaigns, and examines the challenges of ensuring that data creators, owners, and users maintain a high level of data integrity throughout the lifecycle of a dataset. Issues such as field measurement methods, instrument calibration, and data representativeness are investigated. The proposed metadata standard incorporates expert recommendations that include metadata protocols critical to all campaigns, and those that are restricted to campaigns for specific target measurements. The implications of semantics and syntax for a robust and flexible metadata standard are also considered. Approaches towards an operational and logistically viable implementation of a quality standard are discussed. This paper also proposes a way forward for adapting and enhancing current geospatial metadata standards to the unique requirements of field spectroscopy metadata quality.

  6. Improving Metadata Compliance for Earth Science Data Records

    NASA Astrophysics Data System (ADS)

    Armstrong, E. M.; Chang, O.; Foster, D.

    2014-12-01

    One of the recurring challenges of creating earth science data records is to ensure a consistent level of metadata compliance at the granule level, where important details of contents, provenance, producer, and data references are necessary to obtain a sufficient level of understanding. These details are important not just for individual data consumers but also for autonomous software systems. Two of the most popular metadata standards at the granule level are the Climate and Forecast (CF) Metadata Conventions and the Attribute Conventions for Dataset Discovery (ACDD). Many data producers have implemented one or both of these models, including the Group for High Resolution Sea Surface Temperature (GHRSST) for their global SST products and the Ocean Biology Processing Group for NASA ocean color and SST products. While both the CF and ACDD models contain various levels of metadata richness, the actual "required" attributes are quite small in number. Metadata at the granule level becomes much more useful when recommended or optional attributes are implemented that document spatial and temporal ranges, lineage and provenance, sources, keywords, and references, etc. In this presentation we report on a new open source tool to check the compliance of netCDF and HDF5 granules with the CF and ACDD metadata models. The tool, written in Python, was originally implemented to support metadata compliance for netCDF records as part of NOAA's Integrated Ocean Observing System. It outputs standardized scoring for metadata compliance for both CF and ACDD, produces an objective summary weight, and can be applied to remote records via OPeNDAP calls. Originally a command-line tool, it has been extended to provide a user-friendly web interface. Reports on metadata testing are grouped in hierarchies that make it easier to track flaws and inconsistencies in the record. We have also extended it to support explicit metadata structures and semantic syntax for the GHRSST project that can be easily adapted to other satellite missions as well. Overall, we hope this tool will provide the community with a useful mechanism to improve metadata quality and consistency at the granule level by providing objective scoring and assessment, as well as encourage data producers to improve metadata quality and quantity.
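
    A greatly simplified version of granule-level checking can be sketched as an attribute-presence score. The sketch below is not the compliance tool described in the record; the CF and ACDD attribute lists are abbreviated examples only.

    ```python
    # Minimal sketch of granule-level attribute checking (not the compliance
    # checker described above): score a netCDF file on the presence of a few
    # CF and ACDD attributes.  The attribute lists are abbreviated examples.
    import netCDF4

    CF_GLOBAL = ["Conventions", "title", "history"]
    ACDD_RECOMMENDED = ["summary", "keywords", "time_coverage_start",
                        "time_coverage_end", "geospatial_lat_min",
                        "geospatial_lat_max", "creator_name", "license"]

    def score(path):
        with netCDF4.Dataset(path) as nc:
            present = set(nc.ncattrs())              # global attributes
        missing_cf = [a for a in CF_GLOBAL if a not in present]
        missing_acdd = [a for a in ACDD_RECOMMENDED if a not in present]
        return {
            "cf_score": 1 - len(missing_cf) / len(CF_GLOBAL),
            "acdd_score": 1 - len(missing_acdd) / len(ACDD_RECOMMENDED),
            "missing": missing_cf + missing_acdd,
        }

    # Fraction of the listed attributes present, plus the names of any missing ones.
    print(score("granule.nc"))
    ```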

  7. Topographic and hydrographic survey data for the São Francisco River near Torrinha, Bahia, Brazil, 2014

    USGS Publications Warehouse

    Fosness, Ryan L.; Dietsch, Benjamin J.

    2015-10-21

    This report presents the surveying techniques and data-processing methods used to collect, process, and disseminate topographic and hydrographic data. All standard and non-standard data-collection methods, techniques, and data-processing methods were documented. Additional discussion describes the quality-assurance and quality-control elements used in this study, along with the limitations of the Torrinha-Itacoatiara study reach data. The topographic and hydrographic geospatial data are published along with associated metadata.

  8. The effect of acidified sample storage time on the determination of trace element concentration in ice cores by ICP-SFMS

    NASA Astrophysics Data System (ADS)

    Uglietti, C.; Gabrielli, P.; Lutton, A.; Olesik, J.; Thompson, L. G.

    2012-12-01

    Trace elements in micro-particles entrapped in ice cores are a valuable proxy of past climate and environmental variations. Inductively coupled plasma sector field mass spectrometry (ICP-SFMS) is generally recognized as a sensitive and accurate technique for the quantification of ultra-trace element concentrations in ice cores. Usually, ICP-SFMS analyses of ice core samples are performed by melting and acidifying aliquots. Acidification is important to transfer trace elements from particles into solution by partial and/or complete dissolution. Only elements in solution and in sufficiently small particles will be vaporized and converted to elemental ions in the plasma for detection by ICP-SFMS. However, experimental results indicate that differences in acidified sample storage time at room temperature may lead to the recovery of different trace element fractions. Moreover, different lithologies of the relatively abundant crustal material entrapped in the ice matrix could also influence the fraction of trace elements that are converted into elemental ions in the plasma. These factors might affect the determination of trace element concentrations in ice core samples and hamper the comparison of results obtained from ice cores from different locations and/or epochs. In order to monitor the transfer of elements from particles into solution in acidified melted ice core samples during storage, a test was performed on sections from nine ice cores retrieved from low latitude drilling sites around the world. When compared to ice cores from polar regions, these samples are characterized by a relatively high content of micro-particles that may leach trace elements into solution differently. Of the nine ice cores, five are from the Tibetan Plateau (Dasuopu, Guliya, Naimonanyi, Puruogangri and Dunde), two from the Andes (Quelccaya and Huascaran), one from Africa (Kilimanjaro) and one from the Eastern Alps (Ortles). These samples were decontaminated by triple rinsing, melted and stored in pre-cleaned low-density polyethylene bottles, and kept frozen until acidification (2% v/v ultra-pure HNO3). Determination of twenty trace elements (Ag, Al, As, Bi, Cd, Co, Cr, Cu, Fe, Mn, Mo, Pb, Rb, Sb, Sn, Ti, Tl, U, V, and Zn) was repeated at different times after acidification using the same aliquot. Analyses show a mean increase of 40-50% in trace element concentration in all the samples during the first 15 days of storage after acidification, except for Al, Fe, V and Cr, which show a larger increase (90-100%). After 15 days the trace element concentrations reach generally stable values (with small increases within measurement uncertainty), except for the Naimonanyi and Kilimanjaro samples, which continue to increase. In contrast, the Ag concentration decreases after one week, likely due to its low stability in the acidified solution, which may depend on the Cl- concentration. We froze the samples 43 days after the acidification. After two weeks the samples were melted and re-analyzed by ICP-SFMS in two different laboratories as an inter-calibration exercise. The results show a good correspondence between the concentrations determined by the two instruments and a consistent additional increase of 20-30% in measured trace element concentrations in almost all samples.

  9. Wyoming Landscape Conservation Initiative data management and integration

    USGS Publications Warehouse

    Latysh, Natalie; Bristol, R. Sky

    2011-01-01

    Six Federal agencies, two State agencies, and two local entities formally support the Wyoming Landscape Conservation Initiative (WLCI) and work together on a landscape scale to manage fragile habitats and wildlife resources amidst growing energy development in southwest Wyoming. The U.S. Geological Survey (USGS) was tasked with implementing targeted research and providing scientific information about southwest Wyoming to inform the development of WLCI habitat enhancement and restoration projects conducted by land management agencies. Many WLCI researchers and decisionmakers representing the Bureau of Land Management, U.S. Fish and Wildlife Service, the State of Wyoming, and others have overwhelmingly expressed the need for a stable, robust infrastructure to promote sharing of data resources produced by multiple entities, including metadata adequately describing the datasets. Descriptive metadata facilitates use of the datasets by users unfamiliar with the data. Agency representatives advocate development of common data handling and distribution practices among WLCI partners to enhance availability of comprehensive and diverse data resources for use in scientific analyses and resource management. The USGS Core Science Informatics (CSI) team is developing and promoting data integration tools and techniques across USGS and partner entity endeavors, including a data management infrastructure to aid WLCI researchers and decisionmakers.

  10. Metadata Management on the SCEC PetaSHA Project: Helping Users Describe, Discover, Understand, and Use Simulation Data in a Large-Scale Scientific Collaboration

    NASA Astrophysics Data System (ADS)

    Okaya, D.; Deelman, E.; Maechling, P.; Wong-Barnum, M.; Jordan, T. H.; Meyers, D.

    2007-12-01

    Large scientific collaborations, such as the SCEC Petascale Cyberfacility for Physics-based Seismic Hazard Analysis (PetaSHA) Project, involve interactions between many scientists who exchange ideas and research results. These groups must organize, manage, and make accessible their community materials of observational data, derivative (research) results, computational products, and community software. The integration of scientific workflows as a paradigm to solve complex computations provides advantages of efficiency, reliability, repeatability, choices, and ease of use. The underlying resource needed for a scientific workflow to function and create discoverable and exchangeable products is the construction, tracking, and preservation of metadata. In the scientific workflow environment there is a two-tier structure of metadata. Workflow-level metadata and provenance describe operational steps, identity of resources, execution status, and product locations and names. Domain-level metadata essentially define the scientific meaning of data, codes and products. To a large degree the metadata at these two levels are separate. However, between these two levels is a subset of metadata produced at one level but needed by the other. This crossover metadata suggests that some commonality in metadata handling is needed. SCEC researchers are collaborating with computer scientists at SDSC, the USC Information Sciences Institute, and Carnegie Mellon University in order to perform earthquake science using high-performance computational resources. A primary objective of the "PetaSHA" collaboration is to perform physics-based estimations of strong ground motion associated with real and hypothetical earthquakes located within Southern California. Construction of 3D earth models, earthquake representations, and numerical simulation of seismic waves are key components of these estimations. Scientific workflows are used to orchestrate the sequences of scientific tasks and to access distributed computational facilities such as the NSF TeraGrid. Different types of metadata are produced and captured within the scientific workflows. One workflow within PetaSHA ("Earthworks") performs a linear sequence of tasks with workflow and seismological metadata preserved. Downstream scientific codes ingest these metadata produced by upstream codes. The seismological metadata uses attribute-value pairing in plain text; an identified need is to use more advanced handling methods. Another workflow system within PetaSHA ("CyberShake") involves several complex workflows in order to perform statistical analysis of ground shaking due to thousands of hypothetical but plausible earthquakes. Metadata management has been challenging due to its construction around a number of legacy scientific codes. We describe difficulties arising in the scientific workflow due to the lack of this metadata and suggest corrective steps, which in some cases include the cultural shift of domain science programmers coding for metadata.

  11. MOAB : a mesh-oriented database.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tautges, Timothy James; Ernst, Corey; Stimpson, Clint

    A finite element mesh is used to decompose a continuous domain into a discretized representation. The finite element method solves PDEs on this mesh by modeling complex functions as a set of simple basis functions with coefficients at mesh vertices and prescribed continuity between elements. The mesh is one of the fundamental types of data linking the various tools in the FEA process (mesh generation, analysis, visualization, etc.). Thus, the representation of mesh data and operations on those data play a very important role in FEA-based simulations. MOAB is a component for representing and evaluating mesh data. MOAB can store structured and unstructured mesh, consisting of elements in the finite element 'zoo'. The functional interface to MOAB is simple yet powerful, allowing the representation of many types of metadata commonly found on the mesh. MOAB is optimized for efficiency in space and time, based on access to mesh in chunks rather than through individual entities, while also versatile enough to support individual entity access. The MOAB data model consists of a mesh interface instance, mesh entities (vertices and elements), sets, and tags. Entities are addressed through handles rather than pointers, to allow the underlying representation of an entity to change without changing the handle to that entity. Sets are arbitrary groupings of mesh entities and other sets. Sets also support parent/child relationships as a relation distinct from sets containing other sets. The directed graph provided by set parent/child relationships is useful for modeling topological relations from a geometric model or other metadata. Tags are named data which can be assigned to the mesh as a whole, individual entities, or sets. Tags are a mechanism for attaching data to individual entities, and sets are a mechanism for describing relations between entities; the combination of these two mechanisms is a powerful yet simple interface for representing metadata or application-specific data. For example, sets and tags can be used together to describe geometric topology, boundary condition, and inter-processor interface groupings in a mesh. MOAB is used in several ways in various applications. MOAB serves as the underlying mesh data representation in the VERDE mesh verification code. MOAB can also be used as a mesh input mechanism, using mesh readers included with MOAB, or as a translator between mesh formats, using readers and writers included with MOAB. The remainder of this report is organized as follows. Section 2, 'Getting Started', provides a few simple examples of using MOAB to perform simple tasks on a mesh. Section 3 discusses the MOAB data model in more detail, including some aspects of the implementation. Section 4 summarizes the MOAB function API. Section 5 describes some of the tools included with MOAB, and the implementation of mesh readers/writers for MOAB. Section 6 contains a brief description of MOAB's relation to the TSTT mesh interface. Section 7 gives a conclusion and future plans for MOAB development. Section 8 gives references cited in this report. A reference description of the full MOAB API is contained in Section 9.
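
    The handle/set/tag data model can be illustrated with a small in-memory toy. The sketch below is not the MOAB (or PyMOAB) API; it only mirrors the concepts of handles, entity sets used for grouping, and named tags used to attach metadata.

    ```python
    # Toy in-memory illustration of the handle/set/tag data model described
    # above.  This is not the MOAB or PyMOAB API, only a sketch of the concepts.
    from collections import defaultdict

    class MiniMesh:
        def __init__(self):
            self._next = 1
            self.entities = {}                 # handle -> (kind, data)
            self.sets = defaultdict(set)       # set handle -> member handles
            self.tags = defaultdict(dict)      # tag name -> {handle: value}

        def create(self, kind, data):
            h = self._next                     # handles, not pointers, so the
            self._next += 1                    # underlying storage can change
            self.entities[h] = (kind, data)
            return h

        def create_set(self, members=()):
            h = self.create("set", None)
            self.sets[h].update(members)
            return h

        def tag_set(self, name, handle, value):
            self.tags[name][handle] = value


    mesh = MiniMesh()
    verts = [mesh.create("vertex", (float(i), 0.0, 0.0)) for i in range(4)]
    quad = mesh.create("quad", tuple(verts))
    boundary = mesh.create_set([quad])                     # grouping via a set
    mesh.tag_set("BOUNDARY_CONDITION", boundary, "fixed")  # metadata via a tag
    print(mesh.tags["BOUNDARY_CONDITION"])
    ```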

  12. Fuel handling system for a nuclear reactor

    DOEpatents

    Saiveau, James G.; Kann, William J.; Burelbach, James P.

    1986-01-01

    A pool type nuclear fission reactor has a core, with a plurality of core elements and a redan which confines coolant as a hot pool at a first end of the core separated from a cold pool at a second end of the core by the redan. A fuel handling system for use with such reactors comprises a core element storage basket located outside of the redan in the cold pool. An access passage is formed in the redan with a gate for opening and closing the passage to maintain the temperature differential between the hot pool and the cold pool. A mechanism is provided for opening and closing the gate. A lifting arm is also provided for manipulating the fuel core elements through the access passage between the storage basket and the core when the redan gate is open.

  13. Fuel handling system for a nuclear reactor

    DOEpatents

    Saiveau, James G.; Kann, William J.; Burelbach, James P.

    1986-12-02

    A pool type nuclear fission reactor has a core, with a plurality of core elements and a redan which confines coolant as a hot pool at a first end of the core separated from a cold pool at a second end of the core by the redan. A fuel handling system for use with such reactors comprises a core element storage basket located outside of the redan in the cold pool. An access passage is formed in the redan with a gate for opening and closing the passage to maintain the temperature differential between the hot pool and the cold pool. A mechanism is provided for opening and closing the gate. A lifting arm is also provided for manipulating the fuel core elements through the access passage between the storage basket and the core when the redan gate is open.

  14. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Blanchard, I.; Badro, J.; Siebert, J.

    The gallium concentration (normalized to CI chondrites) in the mantle is at the same level as that of lithophile elements with similar volatility, implying that there must be little to no gallium in Earth's core. Metal-silicate partitioning experiments, however, have shown that gallium is a moderately siderophile element and should therefore be depleted in the mantle by core formation. Moreover, gallium concentrations in the mantle (4 ppm) are too high to be brought only by the late veneer; and neither pressure, nor temperature, nor silicate composition has a large enough effect on gallium partitioning to make it lithophile. We therefore systematically investigated the effect of core composition (light element content) on the partitioning of gallium by carrying out metal-silicate partitioning experiments in a piston-cylinder press at 2 GPa between 1673 K and 2073 K. Four light elements (Si, O, S, C) were considered, and their effect was found to be sufficiently strong to make gallium lithophile. The partitioning of gallium was then modeled and parameterized as a function of pressure, temperature, redox and core composition. A continuous core formation model was used to track the evolution of gallium partitioning during core formation, for various magma ocean depths, geotherms, core light element contents, and magma ocean composition (redox) during accretion. The only model for which the final gallium concentration in the silicate Earth matched the observed value is one involving a light-element-rich core equilibrating in an FeO-rich deep magma ocean (>1300 km) with a final pressure of at least 50 GPa. More specifically, the incorporation of S and C in the core provided successful models only for concentrations that lie far beyond their allowable cosmochemical or geophysical limits, whereas realistic O and Si amounts (less than 5 wt.%) in the core provided successful models for magma oceans deeper than 1300 km. In conclusion, these results offer a strong argument for an O- and Si-rich core, formed in a deep terrestrial magma ocean, along with oxidizing conditions.

  15. Suppression of grp78 core promoter element-mediated stress induction by the dbpA and dbpB (YB-1) cold shock domain proteins.

    PubMed Central

    Li, W W; Hsiung, Y; Wong, V; Galvin, K; Zhou, Y; Shi, Y; Lee, A S

    1997-01-01

    The highly conserved grp78 core promoter element plays an important role in the induction of grp78 under diverse stress signals. Previous studies have established a functional region in the 3' half of the core (stress-inducible change region [SICR]) which exhibits stress-inducible changes in stressed nuclei. The human transcription factor YY1 is shown to bind the SICR and transactivate the core element under stress conditions. Here we report that expression library screening with the core element has identified two new core binding proteins, YB-1 and dbpA. Both proteins belong to the Y-box family of proteins characterized by an evolutionarily conserved DNA binding motif, the cold shock domain (CSD). In contrast to YY1, which binds only double-stranded SICR, the Y-box/CSD proteins much prefer the lower strand of the SICR. The Y-box proteins can repress the inducibility of the grp78 core element mediated by treatment of cells with A23187, thapsigargin, and tunicamycin. In gel shift assays, YY1 binding to the core element is inhibited by either YB-1 or dbpA. A yeast interaction trap screen using LexA-YY1 as a bait and a HeLa cell cDNA-acid patch fusion library identified YB-1 as a YY1-interacting protein. In cotransfection experiments, the Y-box proteins antagonize the YY1-mediated enhancement of transcription directed by the grp78 core in stressed cells. Thus, the CSD proteins may be part of the stress signal transduction mechanism in the mammalian system. PMID:8972186

  16. A Virtual Science Data Environment for Carbon Dioxide Observations

    NASA Astrophysics Data System (ADS)

    Verma, R.; Goodale, C. E.; Hart, A. F.; Law, E.; Crichton, D. J.; Mattmann, C. A.; Gunson, M. R.; Braverman, A. J.; Nguyen, H. M.; Eldering, A.; Castano, R.; Osterman, G. B.

    2011-12-01

    Climate science data are often distributed cross-institutionally and made available using heterogeneous interfaces. With respect to observational carbon-dioxide (CO2) records, these data span national and international institutions and are typically distributed using a variety of data standards. Such an arrangement can yield challenges from a research perspective, as users often need to independently aggregate datasets as well as address the issue of data quality. To tackle this dispersion and heterogeneity of data, we have developed the CO2 Virtual Science Data Environment - a comprehensive approach to virtually integrating CO2 data and metadata from multiple missions and providing a suite of computational services that facilitate analysis, comparison, and transformation of that data. The Virtual Science Environment provides climate scientists with a unified web-based destination for discovering relevant observational data in context, and supports a growing range of online tools and services for analyzing and transforming the available data to suit individual research needs. It includes web-based tools to geographically and interactively search for CO2 observations collected from multiple airborne, space-based, and terrestrial platforms. Moreover, the data analysis services it provides over the Internet, including techniques such as bias estimation and spatial re-gridding, move computation closer to the data and reduce the complexity of performing these operations repeatedly and at scale. The key to enabling these services, as well as consolidating the disparate data into a unified resource, has been to focus on leveraging metadata descriptors as the foundation of our data environment. This metadata-centric architecture, which leverages the Dublin Core standard, forgoes the need to replicate remote datasets locally. Instead, the system relies upon an extensive, metadata-rich virtual data catalog allowing on-demand browsing and retrieval of CO2 records from multiple missions. In other words, key metadata information about remote CO2 records is stored locally while the data itself is preserved at its respective archive of origin. This strategy has been made possible by our method of encapsulating the heterogeneous sources of data using a common set of web-based services, including services provided by Jet Propulsion Laboratory's Climate Data Exchange (CDX). Furthermore, this strategy has enabled us to scale across missions, and to provide access to a broad array of CO2 observational data. Coupled with on-demand computational services and an intuitive web-portal interface, the CO2 Virtual Science Data Environment effectively transforms heterogeneous CO2 records from multiple sources into a unified resource for scientific discovery.
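
    A metadata-centric catalog of this kind rests on lightweight descriptors such as Dublin Core records. The sketch below builds one such record; every element value and identifier is invented for illustration and does not describe an actual granule.

    ```python
    # Sketch of a Dublin Core descriptor of the kind a metadata-centric catalog
    # might hold for a remote CO2 granule; all values and identifiers are invented.
    import xml.etree.ElementTree as ET

    DC = "http://purl.org/dc/elements/1.1/"
    ET.register_namespace("dc", DC)

    record = ET.Element("record")
    fields = {
        "title": "Example space-based XCO2 retrievals, 2010-07",
        "creator": "Example CO2 mission team",
        "subject": "carbon dioxide; XCO2; greenhouse gases",
        "date": "2010-07-01",
        "format": "application/x-hdf",
        "identifier": "https://example.org/archive/co2/2010-07.h5",  # placeholder
        "coverage": "global",
    }
    for name, value in fields.items():
        ET.SubElement(record, f"{{{DC}}}{name}").text = value

    print(ET.tostring(record, encoding="unicode"))
    ```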

  17. Evaluating the privacy properties of telephone metadata.

    PubMed

    Mayer, Jonathan; Mutchler, Patrick; Mitchell, John C

    2016-05-17

    Since 2013, a stream of disclosures has prompted reconsideration of surveillance law and policy. One of the most controversial principles, both in the United States and abroad, is that communications metadata receives substantially less protection than communications content. Several nations currently collect telephone metadata in bulk, including on their own citizens. In this paper, we attempt to shed light on the privacy properties of telephone metadata. Using a crowdsourcing methodology, we demonstrate that telephone metadata is densely interconnected, can trivially be reidentified, and can be used to draw sensitive inferences.

  18. Studies of Big Data metadata segmentation between relational and non-relational databases

    NASA Astrophysics Data System (ADS)

    Golosova, M. V.; Grigorieva, M. A.; Klimentov, A. A.; Ryabinkin, E. A.; Dimitrov, G.; Potekhin, M.

    2015-12-01

    In recent years the concept of Big Data has become well established in IT. Systems managing large data volumes produce metadata that describe the data and workflows. These metadata are used to obtain information about the current system state and for statistical and trend analysis of the processes these systems drive. Over time, the amount of stored metadata can grow dramatically. In this article we present our studies demonstrating how metadata storage scalability and performance can be improved by using a hybrid RDBMS/NoSQL architecture.

  19. Evaluating the privacy properties of telephone metadata

    PubMed Central

    Mayer, Jonathan; Mutchler, Patrick; Mitchell, John C.

    2016-01-01

    Since 2013, a stream of disclosures has prompted reconsideration of surveillance law and policy. One of the most controversial principles, both in the United States and abroad, is that communications metadata receives substantially less protection than communications content. Several nations currently collect telephone metadata in bulk, including on their own citizens. In this paper, we attempt to shed light on the privacy properties of telephone metadata. Using a crowdsourcing methodology, we demonstrate that telephone metadata is densely interconnected, can trivially be reidentified, and can be used to draw sensitive inferences. PMID:27185922

  20. Incorporating ISO Metadata Using HDF Product Designer

    NASA Technical Reports Server (NTRS)

    Jelenak, Aleksandar; Kozimor, John; Habermann, Ted

    2016-01-01

    The need to store increasing amounts of metadata of various complexity in HDF5 files is rapidly outgrowing the capabilities of the Earth science metadata conventions currently in use. Until now, data producers have had little choice but to devise ad hoc solutions to this challenge. Such solutions, in turn, pose a wide range of issues for data managers, distributors, and, ultimately, data users. The HDF Group is experimenting with a novel approach: using ISO 19115 metadata objects as a catch-all container for all the metadata that cannot be fitted into the current Earth science data conventions. This presentation will showcase how the HDF Product Designer software can be utilized to help data producers include various ISO metadata objects in their products.
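
    The abstract does not specify how the ISO objects are physically stored; one simple possibility, shown here only as an illustration, is to attach serialized ISO 19115 XML to an HDF5 file as a string attribute using h5py. The group path and attribute name are assumptions, not the convention actually used by HDF Product Designer.

        # Rough sketch: attach an ISO 19115-style XML fragment to an HDF5 file
        # as a string attribute. Paths and names are illustrative only.
        import h5py

        iso_xml = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd">
          <!-- metadata that does not fit existing Earth science conventions -->
        </gmd:MD_Metadata>"""

        with h5py.File("example_product.h5", "w") as f:
            grp = f.create_group("metadata/iso19115")   # assumed location
            grp.attrs["MD_Metadata"] = iso_xml          # stored as a variable-length string

        with h5py.File("example_product.h5", "r") as f:
            print(f["metadata/iso19115"].attrs["MD_Metadata"][:60], "...")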

  1. Composition of the low seismic velocity E' layer at the top of Earth's core

    NASA Astrophysics Data System (ADS)

    Badro, J.; Brodholt, J. P.

    2017-12-01

    Evidence for a layer (E') at the top of the outer core has been available since the '90s and while different studies suggest slightly different velocity contrasts and thicknesses, the common observation is that the layer has lower velocities than the bulk outer core (PREM). Although there are no direct measurements on the density of this layer, dynamic stability requires it to be less dense than the bulk outer core under those same pressure and temperature conditions. Using ab initio simulations on Fe-Ni-S-C-O-Si liquids we constrain the origin and composition of the low-velocity layer E' at the top of Earth's outer core. We find that increasing the concentration of any light element always increases velocity, and so a low-velocity and low-density layer (for stability) cannot be made by simply increasing light element concentration. This rules out barodiffusion or upwards sedimentation of a light phase for its origin. However, exchanging elements can—depending on the elements exchanged—produce such a layer. We evaluate three possibilities. Firstly, crystallization of a light phase from a core containing more than one light element may make such a layer, but only if the crystallizing phase is very Fe-rich, which is at odds with available phase diagrams at CMB conditions. Secondly, the E' layer may result from incomplete mixing of an early Earth core with a late impactor, depending on the light element compositions of the impactor and Earth's core, but such a primordial stratification is supported neither by dynamical models of the core nor by thermodynamic models of core merger after the giant impact. The last and most plausible scenario is core-mantle chemical interaction; using thermodynamic models for metal-silicate partitioning of silicon and oxygen at CMB conditions, we show that a reaction between the core and an FeO-rich basal magma ocean can enrich the core in oxygen while depleting it in silicon, in relative amounts that produce a light and slow layer consistent with seismological observations.

  2. The Pasamonte unequilibrated eucrite: Pyroxene REE systematic and major-, minor-, and trace-element zoning. [Abstract only

    NASA Technical Reports Server (NTRS)

    Pun, A.; Papike, J. J.

    1994-01-01

    We are evaluating the trace-element concentrations in the pyroxenes of Pasamonte. Pasamonte is a characteristic member of the main group eucrites, and has recently been redescribed as a polymict eucrite. Our Pasamonte sample contained eucritic clasts with textures ranging from subophitic to moderately coarse-grained. This study concentrates on pyroxenes from an unequilibrated, coarse-grained eucrite clast. Major-, minor-, and trace-element analyses were measured for zoned pyroxenes in the eucritic clast of Pasamonte. The major- and minor-element zoning traverses were measured using the JEOL 733 electron probe with an Oxford-Link imaging/analysis system. Complementary trace elements were then measured for the core and rim of each of the grains by SIMS. The trace elements analyzed consisted of eight REE, Sr, Y, and Zr. These analyses were performed on a Cameca 4f ion probe. The CI-chondrite-normalized (average CI) trace-element analyses for several grains and the major- and minor-element zoning patterns from a single pyroxene grain are given. The Eu abundance in the cores of the pyroxenes represents the detection limit and therefore the (-Eu) anomaly is a minimum. Major- and minor-element patterns are typical for igneous zoning. Pyroxene cores are Mg enriched, whereas the rims are enriched in Fe and Ca. Also, Ti and Mn are found to increase, while Cr and Al generally decrease, in core-to-rim traverses. The cores of the pyroxenes are more depleted in the rare earth elements (REE) than the rims. Using the minor- and trace-element concentrations of bulk Pasamonte and the minor- and trace-element concentrations from the cores of the pyroxenes in Pasamonte measured in this study, we calculated partition coefficients between pyroxene and melt. This calculation assumes that bulk Pasamonte is representative of a melt composition.

  3. The Fuzziness of Giant Planets’ Cores

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Helled, Ravit; Stevenson, David

    2017-05-01

    Giant planets are thought to have cores in their deep interiors, and the division into a heavy-element core and hydrogen–helium envelope is applied in both formation and structure models. We show that the primordial internal structure depends on the planetary growth rate, in particular, the ratio of heavy-element accretion to gas accretion. For a wide range of likely conditions, this ratio is in one-to-one correspondence with the resulting post-accretion profile of heavy elements within the planet. This flux ratio depends sensitively on the assumed solid-surface density in the surrounding nebula. We suggest that giant planets’ cores might not be distinct from the envelope and may include some hydrogen and helium, and that the deep interior can have a gradual heavy-element structure. Accordingly, Jupiter’s core may not be well defined. Accurate measurements of Jupiter’s gravitational field by Juno could put constraints on Jupiter’s core mass. However, as we suggest here, the definition of Jupiter’s core is complex, and the core’s physical properties (mass, density) depend on the actual definition of the core and on the planet’s growth history.

  4. BioSharing: curated and crowd-sourced metadata standards, databases and data policies in the life sciences.

    PubMed

    McQuilton, Peter; Gonzalez-Beltran, Alejandra; Rocca-Serra, Philippe; Thurston, Milo; Lister, Allyson; Maguire, Eamonn; Sansone, Susanna-Assunta

    2016-01-01

    BioSharing (http://www.biosharing.org) is a manually curated, searchable portal of three linked registries. These resources cover standards (terminologies, formats and models, and reporting guidelines), databases, and data policies in the life sciences, broadly encompassing the biological, environmental and biomedical sciences. Launched in 2011 and built by the same core team as the successful MIBBI portal, BioSharing harnesses community curation to collate and cross-reference resources across the life sciences from around the world. BioSharing makes these resources findable and accessible (the core of the FAIR principle). Every record is designed to be interlinked, providing a detailed description not only of the resource itself, but also of its relations with other life science infrastructures. Serving a variety of stakeholders, BioSharing cultivates a growing community, to which it offers diverse benefits. It is a resource for funding bodies and journal publishers to navigate the metadata landscape of the biological sciences; an educational resource for librarians and information advisors; a publicising platform for standard and database developers/curators; and a research tool for bench and computer scientists to plan their work. BioSharing is working with an increasing number of journals and other registries, for example linking standards and databases to training material and tools. Driven by an international Advisory Board, the BioSharing user-base has grown by over 40% (by unique IP address) in the last year, thanks to successful engagement with researchers, publishers, librarians, developers and other stakeholders via several routes, including a joint RDA/Force11 working group and a collaboration with the International Society for Biocuration. In this article, we describe BioSharing, with a particular focus on community-led curation. Database URL: https://www.biosharing.org. © The Author(s) 2016. Published by Oxford University Press.

  5. Forensic Tools to Track and Connect Physical Samples to Related Data

    NASA Astrophysics Data System (ADS)

    Molineux, A.; Thompson, A. C.; Baumgardner, R. W.

    2016-12-01

    Identifiers, such as local sample numbers, are critical to successfully connecting physical samples and related data. However, identifiers must be globally unique. The International Geo Sample Number (IGSN), generated when registering the sample in the System for Earth Sample Registration (SESAR), provides a globally unique alphanumeric code associated with basic metadata, related samples, and their current physical storage location. When registered samples are published, users can link the figured samples to the basic metadata held at SESAR. The use cases we discuss include plant specimens from a Permian core, Holocene corals and derived powders, and thin sections with SEM stubs. Much of this material is now published. The plant taxonomic study from the core is a digital PDF and samples can be directly linked from the captions to the SESAR record. The study of stable isotopes from the corals is not yet digitally available, but individual samples are accessible. Full data and media records for both studies are located in our database, where higher-quality images, field notes, and section diagrams may exist. Georeferences permit mapping in current and deep-time plate configurations. Several aspects emerged during this study. First, ensure adequate and consistent details are registered with SESAR. Second, educate and encourage the researcher to obtain IGSNs. Third, publish the archive numbers, assigned prior to publication, alongside the IGSN. This provides access to further data through an Integrated Publishing Toolkit (IPT), aggregators, or online repository databases, thus placing the initial sample in a much richer context for future studies. Fourth, encourage software developers to customize community software to extract data from a database and use it to register samples in bulk. This would improve workflow and provide a path for registration of large legacy collections.

  6. Chemical evolution of the Earth: Equilibrium or disequilibrium process?

    NASA Technical Reports Server (NTRS)

    Sato, M.

    1985-01-01

    To explain the apparent chemical incompatibility of the Earth's core and mantle, or the disequilibrium process, various core-forming mechanisms have been proposed, i.e., rapid disequilibrium sinking of molten iron, an oxidized core or protocore materials, and meteorite contamination of the upper mantle after separation from the core. Adopting concepts used in steady-state thermodynamics, a method is devised for evaluating how elements should be stably distributed in the Earth's interior for the present gradients of temperature, pressure, and gravitational acceleration. Thermochemical modeling gives useful insights into the nature of the chemical evolution of the Earth without overly speculative assumptions. Further work must be done to reconcile siderophile elements, rare gases, and possible light elements in the outer core.

  7. Lessons Learned in over Two Decades of GPS/GNSS Data Center Support

    NASA Astrophysics Data System (ADS)

    Boler, F. M.; Estey, L. H.; Meertens, C. M.; Maggert, D.

    2014-12-01

    The UNAVCO Data Center in Boulder, Colorado, curates, archives, and distributes geodesy data and products, mainly GPS/GNSS data from 3,000 permanent stations and 10,000 campaign sites around the globe. Although it now has core support from NSF and NASA, the archive began around 1992 as a grass-roots effort of a few UNAVCO staff and community members to preserve data going back to 1986. Open access to this data is generally desired, but the Data Center in fact operates under an evolving suite of data access policies ranging from open access to nondisclosure for special cases. Key to processing this data is having the correct equipment metadata; reliably obtaining this metadata continues to be a challenge, in spite of modern cyberinfrastructure and tools, mostly due to human errors or lack of consistent operator training. New metadata problems surface when trying to design and publish modern Digital Object Identifiers for data sets where PIs, funding sources, and historical project names now need to be corrected and verified for data sets going back almost three decades. Originally, the data was GPS-only, based on three signals on two carrier frequencies. Modern GNSS covers GPS modernization (three more signals and one additional carrier) as well as open signals and carriers of additional systems such as GLONASS, Galileo, BeiDou, and QZSS, requiring ongoing adaptive strategies to assess the quality of modern datasets. Also, new scientific uses of these data benefit from higher data rates than were needed for early tectonic applications. In addition, there has been a migration from episodic campaign sites (hence sparse data) to continuously operating stations (hence dense data) over the last two decades. All of these factors make it difficult to realistically plan even simple data center functions such as on-line storage capacity.

  8. Viewing and Editing Earth Science Metadata MOBE: Metadata Object Browser and Editor in Java

    NASA Astrophysics Data System (ADS)

    Chase, A.; Helly, J.

    2002-12-01

    Metadata is an important, yet often neglected, aspect of successful archival efforts. However, generating robust, useful metadata is often a time-consuming and tedious task. We have been approaching this problem from two directions: first, by automating metadata creation, pulling from known sources of data; and second, as detailed here, by developing user-friendly software for human interaction with the metadata. MOBE and COBE (Metadata Object Browser and Editor, and Canonical Object Browser and Editor, respectively) are Java applications for editing and viewing metadata and digital objects. MOBE has already been designed and deployed, and is currently being integrated into other areas of the SIOExplorer project. COBE is in the design and development stage, being created with the same considerations in mind as those for MOBE. Metadata creation, viewing, data object creation, and data object viewing, when taken on a small scale, are all relatively simple tasks. Computer science, however, has an infamous reputation for transforming the simple into the complex. As a system scales upwards to become more robust, new features arise and additional functionality is added to the software being written to manage the system. The software that emerges from such an evolution, though powerful, is often complex and difficult to use. With MOBE the focus is on a tool that does a small number of tasks very well. The result has been an application that enables users to manipulate metadata in an intuitive and effective way. This allows for a tool that serves its purpose without introducing additional cognitive load onto the user, an end goal we continue to pursue.

  9. Managing biomedical image metadata for search and retrieval of similar images.

    PubMed

    Korenblum, Daniel; Rubin, Daniel; Napel, Sandy; Rodriguez, Cesar; Beaulieu, Chris

    2011-08-01

    Radiology images are generally disconnected from the metadata describing their contents, such as imaging observations ("semantic" metadata), which are usually described in text reports that are not directly linked to the images. We developed a system, the Biomedical Image Metadata Manager (BIMM), to (1) address the problem of managing biomedical image metadata and (2) facilitate the retrieval of similar images using semantic feature metadata. Our approach allows radiologists, researchers, and students to take advantage of the vast and growing repositories of medical image data by explicitly linking images to their associated metadata in a relational database that is globally accessible through a Web application. BIMM receives input in the form of standards-based metadata files using a Web service and parses and stores the metadata in a relational database, allowing efficient data query and maintenance capabilities. Upon querying BIMM for images, 2D regions of interest (ROIs) stored as metadata are automatically rendered onto preview images included in search results. The system's "match observations" function retrieves images with similar ROIs based on specific semantic features describing imaging observation characteristics (IOCs). We demonstrate that the system, using IOCs alone, can accurately retrieve images with diagnoses matching the query images, and we evaluate its performance on a set of annotated liver lesion images. BIMM has several potential applications, e.g., computer-aided detection and diagnosis, content-based image retrieval, automating medical analysis protocols, and gathering population statistics like disease prevalences. The system provides a framework for decision support systems, potentially improving their diagnostic accuracy and selection of appropriate therapies.
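
    The "match observations" step is described only at a high level; a toy way to picture retrieval by shared semantic features is a set-similarity score over controlled-vocabulary IOC terms, as sketched below. The terms, image identifiers, and use of Jaccard similarity are illustrative assumptions, not BIMM's actual matching algorithm.

        # Illustrative similarity scoring over semantic imaging observation
        # characteristics (IOCs); terms and example images are invented.
        def jaccard(a, b):
            """Set similarity between two collections of IOC terms."""
            a, b = set(a), set(b)
            return len(a & b) / len(a | b) if a | b else 0.0

        catalog = {
            "img_001": {"hypodense", "well-circumscribed", "homogeneous"},
            "img_002": {"hyperdense", "ill-defined", "heterogeneous"},
            "img_003": {"hypodense", "well-circumscribed", "rim-enhancing"},
        }

        query_iocs = {"hypodense", "well-circumscribed", "homogeneous"}

        # Rank catalog images by how many IOC terms they share with the query.
        matches = sorted(((jaccard(query_iocs, iocs), image_id)
                          for image_id, iocs in catalog.items()), reverse=True)
        for score, image_id in matches:
            print(f"{image_id}: {score:.2f}")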

  10. A System for Automated Extraction of Metadata from Scanned Documents using Layout Recognition and String Pattern Search Models.

    PubMed

    Misra, Dharitri; Chen, Siyuan; Thoma, George R

    2009-01-01

    One of the most expensive aspects of archiving digital documents is the manual acquisition of context-sensitive metadata useful for the subsequent discovery of, and access to, the archived items. For certain types of textual documents, such as journal articles, pamphlets, official government records, etc., where the metadata is contained within the body of the documents, a cost effective method is to identify and extract the metadata in an automated way, applying machine learning and string pattern search techniques. At the U. S. National Library of Medicine (NLM) we have developed an automated metadata extraction (AME) system that employs layout classification and recognition models with a metadata pattern search model for a text corpus with structured or semi-structured information. A combination of Support Vector Machine and Hidden Markov Model is used to create the layout recognition models from a training set of the corpus, following which a rule-based metadata search model is used to extract the embedded metadata by analyzing the string patterns within and surrounding each field in the recognized layouts. In this paper, we describe the design of our AME system, with focus on the metadata search model. We present the extraction results for a historic collection from the Food and Drug Administration, and outline how the system may be adapted for similar collections. Finally, we discuss some ongoing enhancements to our AME system.
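
    As a loose illustration of what a rule-based string pattern search over recognized layout text can look like (not NLM's actual AME rules), a few regular expressions can pull candidate metadata fields out of OCR output; the field names, patterns, and sample text below are invented.

        # Toy rule-based extraction of metadata fields from the OCR text of a
        # recognized layout; the patterns and sample text are illustrative only.
        import re

        RULES = {
            "date":      re.compile(r"\b(\d{1,2}/\d{1,2}/\d{4})\b"),
            "docket_no": re.compile(r"Docket\s+No\.?\s*([A-Z0-9-]+)", re.IGNORECASE),
            "title":     re.compile(r"^TITLE:\s*(.+)$", re.MULTILINE),
        }

        ocr_text = """TITLE: Notice of Hearing
        Docket No. FDA-1976-N-0012
        Filed 03/15/1976 by the Bureau of Foods"""

        # Apply each field rule; missing fields are reported as None.
        metadata = {field: (m.group(1) if (m := rx.search(ocr_text)) else None)
                    for field, rx in RULES.items()}
        print(metadata)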

  11. The Metadata Cloud: The Last Piece of a Distributed Data System Model

    NASA Astrophysics Data System (ADS)

    King, T. A.; Cecconi, B.; Hughes, J. S.; Walker, R. J.; Roberts, D.; Thieman, J. R.; Joy, S. P.; Mafi, J. N.; Gangloff, M.

    2012-12-01

    Distributed data systems have existed ever since systems were networked together. Over the years the model for distributed data systems has evolved from basic file transfer to client-server to multi-tiered to grid and finally to cloud-based systems. Initially metadata was tightly coupled to the data, either by embedding the metadata in the same file containing the data or by co-locating the metadata in commonly named files. As the sources of data have multiplied, data volumes have increased, and services have specialized to improve efficiency, a cloud system model has emerged. In a cloud system, computing and storage are provided as services, with accessibility emphasized over physical location. Computation and data clouds are common implementations. Effectively using the data and computation capabilities requires metadata. When metadata is stored separately from the data, a metadata cloud is formed. With a metadata cloud, information and knowledge about data resources can migrate efficiently from system to system, enabling services and allowing the data to remain efficiently stored until used. This is especially important with "Big Data", where movement of the data is limited by bandwidth. We examine how the metadata cloud completes a general distributed data system model, how standards play a role, and relate this to the existing types of cloud computing. We also look at the major science data systems in existence and compare each to the generalized cloud system model.

  12. Turning Data into Information: Assessing and Reporting GIS Metadata Integrity Using Integrated Computing Technologies

    ERIC Educational Resources Information Center

    Mulrooney, Timothy J.

    2009-01-01

    A Geographic Information System (GIS) serves as the tangible and intangible means by which spatially related phenomena can be created, analyzed and rendered. GIS metadata serves as the formal framework to catalog information about a GIS data set. Metadata is independent of the encoded spatial and attribute information. GIS metadata is a subset of…

  13. Integrating XQuery-Enabled SCORM XML Metadata Repositories into an RDF-Based E-Learning P2P Network

    ERIC Educational Resources Information Center

    Qu, Changtao; Nejdl, Wolfgang

    2004-01-01

    Edutella is an RDF-based E-Learning P2P network that is aimed to accommodate heterogeneous learning resource metadata repositories in a P2P manner and further facilitate the exchange of metadata between these repositories based on RDF. Whereas Edutella provides RDF metadata repositories with a quite natural integration approach, XML metadata…

  14. Raising orphans from a metadata morass: A researcher's guide to re-use of public 'omics data.

    PubMed

    Bhandary, Priyanka; Seetharam, Arun S; Arendsee, Zebulun W; Hur, Manhoi; Wurtele, Eve Syrkin

    2018-02-01

    More than 15 petabases of raw RNAseq data is now accessible through public repositories. Acquisition of other 'omics data types is expanding, though most lack a centralized archival repository. Data-reuse provides tremendous opportunity to extract new knowledge from existing experiments, and offers a unique opportunity for robust, multi-'omics analyses by merging metadata (information about experimental design, biological samples, protocols) and data from multiple experiments. We illustrate how predictive research can be accelerated by meta-analysis with a study of orphan (species-specific) genes. Computational predictions are critical to infer orphan function because their coding sequences provide very few clues. The metadata in public databases is often confusing; a test case with Zea mays mRNA seq data reveals a high proportion of missing, misleading or incomplete metadata. This metadata morass significantly diminishes the insight that can be extracted from these data. We provide tips for data submitters and users, including specific recommendations to improve metadata quality by more use of controlled vocabulary and by metadata reviews. Finally, we advocate for a unified, straightforward metadata submission and retrieval system. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. Grain-size distribution and selected major and trace element concentrations in bed-sediment cores from the Lower Granite Reservoir and Snake and Clearwater Rivers, eastern Washington and northern Idaho, 2010

    USGS Publications Warehouse

    Braun, Christopher L.; Wilson, Jennifer T.; Van Metre, Peter C.; Weakland, Rhonda J.; Fosness, Ryan L.; Williams, Marshall L.

    2012-01-01

    Fifty subsamples from 15 cores were analyzed for major and trace elements. Concentrations of trace elements were low, with respect to sediment quality guidelines, in most cores. Typically, major and trace element concentrations were lower in the subsamples collected from the Snake River compared to those collected from the Clearwater River, the confluence of the Snake and Clearwater Rivers, and Lower Granite Reservoir. Generally, lower concentrations of major and trace elements were associated with coarser sediments (larger than 0.0625 millimeter) and higher concentrations of major and trace elements were associated with finer sediments (smaller than 0.0625 millimeter).

  16. Science friction: data, metadata, and collaboration.

    PubMed

    Edwards, Paul N; Mayernik, Matthew S; Batcheller, Archer L; Bowker, Geoffrey C; Borgman, Christine L

    2011-10-01

    When scientists from two or more disciplines work together on related problems, they often face what we call 'science friction'. As science becomes more data-driven, collaborative, and interdisciplinary, demand increases for interoperability among data, tools, and services. Metadata--usually viewed simply as 'data about data', describing objects such as books, journal articles, or datasets--serve key roles in interoperability. Yet we find that metadata may be a source of friction between scientific collaborators, impeding data sharing. We propose an alternative view of metadata, focusing on its role in an ephemeral process of scientific communication, rather than as an enduring outcome or product. We report examples of highly useful, yet ad hoc, incomplete, loosely structured, and mutable, descriptions of data found in our ethnographic studies of several large projects in the environmental sciences. Based on this evidence, we argue that while metadata products can be powerful resources, usually they must be supplemented with metadata processes. Metadata-as-process suggests the very large role of the ad hoc, the incomplete, and the unfinished in everyday scientific work.

  17. Recipes for Semantic Web Dog Food — The ESWC and ISWC Metadata Projects

    NASA Astrophysics Data System (ADS)

    Möller, Knud; Heath, Tom; Handschuh, Siegfried; Domingue, John

    Semantic Web conferences such as ESWC and ISWC offer prime opportunities to test and showcase semantic technologies. Conference metadata about people, papers and talks is diverse in nature and neither too small to be uninteresting nor too big to be unmanageable. Many metadata-related challenges that may arise in the Semantic Web at large are also present here. Metadata must be generated from sources which are often unstructured and hard to process, and may originate from many different players; therefore, suitable workflows must be established. Moreover, the generated metadata must use appropriate formats and vocabularies, and be served in a way that is consistent with the principles of linked data. This paper reports on the metadata efforts from ESWC and ISWC, identifies specific issues and barriers encountered during the projects, and discusses how these were approached. Recommendations are made as to how these may be addressed in the future, and we discuss how these solutions may generalize to metadata production for the Semantic Web at large.

  18. BASINS Metadata

    EPA Pesticide Factsheets

    Metadata, or data about data, describes the content, quality, condition, and other characteristics of data. Geospatial metadata are critical to data discovery and serve as the fuel for the Geospatial One-Stop data portal.

  19. Visualization of JPEG Metadata

    NASA Astrophysics Data System (ADS)

    Malik Mohamad, Kamaruddin; Deris, Mustafa Mat

    There is much more information embedded in a JPEG image than just the graphics. Visualization of its metadata would benefit digital forensic investigators, allowing them to view embedded data, including in corrupted images where no graphics can be displayed, in order to assist in evidence collection for cases such as child pornography or steganography. Tools such as metadata readers, editors and extraction tools are already available, but they mostly focus on visualizing the attribute information of the JPEG Exif segment. None, however, visualize metadata by consolidating a marker summary, the header structure, the Huffman tables and the quantization tables in a single program. In this paper, metadata visualization is achieved by developing a program able to summarize all existing markers, the header structure, the Huffman tables and the quantization tables in a JPEG file. The result shows that visualization of metadata helps in viewing the hidden information within a JPEG more easily.
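
    A marker summary of the kind described can be produced with a short script that walks the JPEG segment structure; the sketch below prints each marker code and its payload length and stops at the start-of-scan marker. It is a simplified illustration, not the authors' visualization tool, and it ignores standalone markers other than EOI.

        # Walk JPEG marker segments and print a simple summary: marker code and
        # payload length. Stops at SOS (0xDA), after which entropy-coded image
        # data follows. Not a full parser; illustrative only.
        import struct
        import sys

        def summarize_markers(path):
            with open(path, "rb") as f:
                data = f.read()
            assert data[:2] == b"\xff\xd8", "not a JPEG (missing SOI)"
            pos = 2
            while pos + 4 <= len(data):
                if data[pos] != 0xFF:
                    break                      # lost sync with the marker stream
                marker = data[pos + 1]
                if marker == 0xD9:             # EOI has no length field
                    print("FFD9  EOI")
                    break
                (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
                print(f"FF{marker:02X}  length={length}")
                if marker == 0xDA:             # SOS: compressed scan data follows
                    break
                pos += 2 + length              # length includes its own two bytes

        if __name__ == "__main__":
            summarize_markers(sys.argv[1])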

  20. Effect of Silicon on Activity Coefficients of Siderophile Elements (P, Au, Pd, As, Ge, Sb, and In) in Liquid Fe, with Application to Core Formation

    NASA Technical Reports Server (NTRS)

    Righter, K.; Pando, K.; Danielson, L. R.; Humayun, M.; Righter, M.; Lapen, T.; Boujibar, A.

    2016-01-01

    Earth's core contains approximately 10 percent light elements that are likely a combination of S, C, Si, and O, with Si possibly being the most abundant. Si dissolved into Fe liquids can have a large effect on the magnitude of the activity coefficient of siderophile elements (SE) in Fe liquids, and thus the partitioning behavior of those elements between core and mantle. The effect of Si can be small such as for Ni and Co, or large such as for Mo, Ge, Sb, As. The effect of Si on many siderophile elements is unknown yet could be an important, and as yet unquantified, influence on the core-mantle partitioning of SE. Here we report new experiments designed to quantify the effect of Si on the partitioning of P, Au, Pd, and many other SE between metal and silicate melt. The results will be applied to Earth, for which we have excellent constraints on the mantle siderophile element concentrations.

  1. Barium and calcium analyses in sediment cores using µ-XRF core scanners

    NASA Astrophysics Data System (ADS)

    Acar, Dursun; Çaǧatay, Namık; Genç, S. Can; Eriş, K. Kadir; Sarı, Erol; Uçarkus, Gülsen

    2017-04-01

    Barium and Ca are used as proxies for organic productivity in paleoceanographic studies. With its heavy atomic weight (137.33 u), barium is easily detectable in small concentrations (several ppm levels) in marine sediments using XRF methods, including analysis by µ-XRF core scanners. Calcium has an intermediate atomic weight (40.078 u) but is a major element in the earth's crust and in sediments and sedimentary rocks, and hence it is easily detectable by µ-XRF techniques. Normally, µ-XRF elemental analysis of cores is carried out using split half cores or 1-2 cm thick u-channels at their original moisture. Sediment cores show variation in water content (and porosity) along their length. This in turn results in variation in the XRF counts of the elements and causes error in the elemental concentrations. We tried µ-XRF elemental analysis of split half cores, of 1 cm thick u-channel subsamples at original moisture, and of 0.3 mm thin-film slices of the core, both wet and after air drying, covered with a humidity-protecting Mylar film. We found a considerable increase in the counts of most elements, in particular Ba and Ca, when we used the 0.3 mm thin, dried slice. In the case of Ba, the counts increased to about three times those of the analysis made with the wet, 1 cm thick u-channels. The higher Ba and Ca counts are mainly due to the possible precipitation of Ba as barite and Ca as gypsum from oxidation of Fe-sulphides and the evaporation of pore waters. Secondary barite and gypsum precipitation would be especially serious in anoxic sediment units, such as sapropels, with considerable Fe-sulphides and bio-barite. It is therefore suggested that researchers should be cautious of such secondary precipitation on core surfaces when analyzing cores that have long been exposed to atmospheric conditions.

  2. Composition of the core from gallium metal–silicate partitioning experiments

    DOE PAGES

    Blanchard, I.; Badro, J.; Siebert, J.; ...

    2015-07-24

    The gallium concentration (normalized to CI chondrites) in the mantle is at the same level as that of lithophile elements with similar volatility, implying that there must be little to no gallium in Earth's core. Metal-silicate partitioning experiments, however, have shown that gallium is a moderately siderophile element and should therefore be depleted in the mantle by core formation. Moreover, gallium concentrations in the mantle (4 ppm) are too high to be brought only by the late veneer; and neither pressure, nor temperature, nor silicate composition has a large enough effect on gallium partitioning to make it lithophile. We therefore systematically investigated the effect of core composition (light element content) on the partitioning of gallium by carrying out metal–silicate partitioning experiments in a piston–cylinder press at 2 GPa between 1673 K and 2073 K. Four light elements (Si, O, S, C) were considered, and their effect was found to be sufficiently strong to make gallium lithophile. The partitioning of gallium was then modeled and parameterized as a function of pressure, temperature, redox and core composition. A continuous core formation model was used to track the evolution of gallium partitioning during core formation, for various magma ocean depths, geotherms, core light element contents, and magma ocean composition (redox) during accretion. The only model for which the final gallium concentration in the silicate Earth matched the observed value is the one involving a light-element-rich core equilibrating in a FeO-rich deep magma ocean (>1300 km) with a final pressure of at least 50 GPa. More specifically, the incorporation of S and C in the core provided successful models only for concentrations that lie far beyond their allowable cosmochemical or geophysical limits, whereas realistic O and Si amounts (less than 5 wt.%) in the core provided successful models for magma oceans deeper than 1300 km. In conclusion, these results offer a strong argument for an O- and Si-rich core, formed in a deep terrestrial magma ocean, along with oxidizing conditions.

  3. Use of a metadata documentation and search tool for large data volumes: The NGEE arctic example

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Devarakonda, Ranjeet; Hook, Leslie A; Killeffer, Terri S

    The Online Metadata Editor (OME) is a web-based tool to help document scientific data in a well-structured, popular scientific metadata format. In this paper, we will discuss the newest tool that Oak Ridge National Laboratory (ORNL) has developed to generate, edit, and manage metadata, and how it is helping data-intensive science centers and projects, such as the U.S. Department of Energy's Next Generation Ecosystem Experiments (NGEE) in the Arctic, to prepare metadata and make their big data produce big science and lead to new discoveries.

  4. Creating preservation metadata from XML-metadata profiles

    NASA Astrophysics Data System (ADS)

    Ulbricht, Damian; Bertelmann, Roland; Gebauer, Petra; Hasler, Tim; Klump, Jens; Kirchner, Ingo; Peters-Kottig, Wolfgang; Mettig, Nora; Rusch, Beate

    2014-05-01

    Registration of dataset DOIs at DataCite makes research data citable and comes with the obligation to keep the data accessible in the future. In addition, many universities and research institutions measure data that is unique and not repeatable, like the data produced by an observational network, and they want to keep these data for future generations. In consequence, such data should be ingested in preservation systems that automatically care for file format changes. Open source preservation software that is developed along the definitions of the ISO OAIS reference model is available, but during ingest of data and metadata there are still problems to be solved. File format validation is difficult, because format validators are not only remarkably slow, but, due to the variety of file formats, different validators also return conflicting identification profiles for identical data. These conflicts are hard to resolve. Preservation systems have a deficit in the support of custom metadata. Furthermore, data producers are sometimes not aware that quality metadata is a key issue for the re-use of data. In the project EWIG, a university institute and a research institute work together with Zuse-Institute Berlin, which acts as an infrastructure facility, to generate exemplary workflows for ingesting research data into OAIS-compliant archives, with emphasis on the geosciences. The Institute for Meteorology provides time series data from an urban monitoring network, whereas GFZ Potsdam delivers file-based data from research projects. To identify problems in existing preservation workflows, the technical work is complemented by interviews with data practitioners. Policies for handling data and metadata are developed. Furthermore, university teaching material is created to raise future scientists' awareness of research data management. As a testbed for ingest workflows the digital preservation system Archivematica [1] is used. During the ingest process, metadata is generated that is compliant with the Metadata Encoding and Transmission Standard (METS). To find datasets in future portals and to make use of these data in one's own scientific work, proper selection of discovery metadata and application metadata is very important. Some XML metadata profiles are not suitable for preservation, because version changes are very fast and make it nearly impossible to automate the migration. For other XML metadata profiles, schema definitions are changed after publication of the profile, or the schema definitions become inaccessible, which might cause problems during validation of the metadata inside the preservation system [2]. Some metadata profiles are not used widely enough and might not even exist in the future. Eventually, discovery and application metadata have to be embedded into the mdWrap subtree of the METS XML. [1] http://www.archivematica.org [2] http://dx.doi.org/10.2218/ijdc.v7i1.215
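
    Since the abstract notes that discovery and application metadata end up embedded in the mdWrap subtree of the METS document, a bare-bones illustration of that wrapping with the Python standard library is sketched below. The element names follow the METS schema, but the wrapped record content and section identifier are invented for the example.

        # Minimal illustration of embedding a descriptive-metadata record inside
        # a METS dmdSec/mdWrap/xmlData subtree; the wrapped record is invented.
        from xml.etree import ElementTree as ET

        METS_NS = "http://www.loc.gov/METS/"
        ET.register_namespace("mets", METS_NS)

        def wrap_descriptive_metadata(record_xml, mdtype="OTHER"):
            mets = ET.Element(f"{{{METS_NS}}}mets")
            dmd = ET.SubElement(mets, f"{{{METS_NS}}}dmdSec", {"ID": "DMD1"})
            wrap = ET.SubElement(dmd, f"{{{METS_NS}}}mdWrap", {"MDTYPE": mdtype})
            xml_data = ET.SubElement(wrap, f"{{{METS_NS}}}xmlData")
            xml_data.append(ET.fromstring(record_xml))   # embed the record as-is
            return ET.tostring(mets, encoding="unicode")

        record = "<dataset><title>Urban monitoring time series</title></dataset>"
        print(wrap_descriptive_metadata(record))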

  5. Assisted editing of SensorML with EDI. A bottom-up scenario towards the definition of sensor profiles.

    NASA Astrophysics Data System (ADS)

    Oggioni, Alessandro; Tagliolato, Paolo; Fugazza, Cristiano; Bastianini, Mauro; Pavesi, Fabio; Pepe, Monica; Menegon, Stefano; Basoni, Anna; Carrara, Paola

    2015-04-01

    Sensor observation systems for environmental data have become increasingly important in recent years. The EGU's Informatics in Oceanography and Ocean Science track stressed the importance of management tools and solutions for marine infrastructures. We think that full interoperability among sensor systems is still an open issue and that the solution to this involves providing appropriate metadata. Several open source applications implement the SWE specification and, particularly, the Sensor Observation Service (SOS) standard. These applications allow for the exchange of data and metadata in XML format between computer systems. However, there is a lack of metadata editing tools supporting end users in this activity. Generally speaking, it is hard for users to provide sensor metadata in the SensorML format without dedicated tools. In particular, such a tool should ease metadata editing by providing, for standard sensors, all the invariant information to be included in sensor metadata, thus allowing the user to concentrate on the metadata items that are related to the specific deployment. RITMARE, the Italian flagship project on marine research, envisages a subproject, SP7, for the set-up of the project's spatial data infrastructure. SP7 developed EDI, a general-purpose, template-driven metadata editor that is composed of a backend web service and an HTML5/JavaScript client. EDI can be customized for managing the creation of generic metadata encoded as XML. Once tailored to a specific metadata format, EDI presents users with a web form with advanced auto-completion and validation capabilities. In the case of sensor metadata (SensorML versions 1.0.1 and 2.0), the EDI client is instructed to send an "insert sensor" request to an SOS endpoint in order to save the metadata in an SOS server. In the first phase of project RITMARE, EDI has been used to simplify the creation from scratch of SensorML metadata by the involved researchers and data managers. An interesting by-product of this ongoing work is a growing archive of predefined sensor descriptions. This information is being collected in order to further ease metadata creation in the next phase of the project. Users will be able to choose among a number of sensor and sensor platform prototypes: these will be specific instances on which it will be possible to define, in a bottom-up approach, "sensor profiles". We report on the outcome of this activity.

  6. Data dictionary services in XNAT and the Human Connectome Project.

    PubMed

    Herrick, Rick; McKay, Michael; Olsen, Timothy; Horton, William; Florida, Mark; Moore, Charles J; Marcus, Daniel S

    2014-01-01

    The XNAT informatics platform is an open source data management tool used by biomedical imaging researchers around the world. An important feature of XNAT is its highly extensible architecture: users of XNAT can add new data types to the system to capture the imaging and phenotypic data generated in their studies. Until recently, XNAT has had limited capacity to broadcast the meaning of these data extensions to users, other XNAT installations, and other software. We have implemented a data dictionary service for XNAT, which is currently being used on ConnectomeDB, the Human Connectome Project (HCP) public data sharing website. The data dictionary service provides a framework to define key relationships between data elements and structures across the XNAT installation. This includes not just core data representing medical imaging data or subject or patient evaluations, but also taxonomical structures, security relationships, subject groups, and research protocols. The data dictionary allows users to define metadata for data structures and their properties, such as value types (e.g., textual, integers, floats) and valid value templates, ranges, or field lists. The service provides compatibility and integration with other research data management services by enabling easy migration of XNAT data to standards-based formats such as the Resource Description Framework (RDF), JavaScript Object Notation (JSON), and Extensible Markup Language (XML). It also facilitates the conversion of XNAT's native data schema into standard neuroimaging vocabularies and structures.

  7. Data dictionary services in XNAT and the Human Connectome Project

    PubMed Central

    Herrick, Rick; McKay, Michael; Olsen, Timothy; Horton, William; Florida, Mark; Moore, Charles J.; Marcus, Daniel S.

    2014-01-01

    The XNAT informatics platform is an open source data management tool used by biomedical imaging researchers around the world. An important feature of XNAT is its highly extensible architecture: users of XNAT can add new data types to the system to capture the imaging and phenotypic data generated in their studies. Until recently, XNAT has had limited capacity to broadcast the meaning of these data extensions to users, other XNAT installations, and other software. We have implemented a data dictionary service for XNAT, which is currently being used on ConnectomeDB, the Human Connectome Project (HCP) public data sharing website. The data dictionary service provides a framework to define key relationships between data elements and structures across the XNAT installation. This includes not just core data representing medical imaging data or subject or patient evaluations, but also taxonomical structures, security relationships, subject groups, and research protocols. The data dictionary allows users to define metadata for data structures and their properties, such as value types (e.g., textual, integers, floats) and valid value templates, ranges, or field lists. The service provides compatibility and integration with other research data management services by enabling easy migration of XNAT data to standards-based formats such as the Resource Description Framework (RDF), JavaScript Object Notation (JSON), and Extensible Markup Language (XML). It also facilitates the conversion of XNAT's native data schema into standard neuroimaging vocabularies and structures. PMID:25071542

  8. Interfacial characterization of ceramic core materials with veneering porcelain for all-ceramic bi-layered restorative systems.

    PubMed

    Tagmatarchis, Alexander; Tripodakis, Aris-Petros; Filippatos, Gerasimos; Zinelis, Spiros; Eliades, George

    2014-01-01

    The aim of the study was to characterize the elemental distribution at the interface between all-ceramic core and veneering porcelain materials. Three groups of all-ceramic cores were selected: A) glass-ceramics (Cergo, IPS Empress, IPS Empress 2, e-max Press, Finesse); B) glass-infiltrated ceramics (Celay Alumina, Celay Zirconia); and C) densely sintered ceramics (Cercon, Procera Alumina, ZirCAD, Noritake Zirconia). The cores were combined with compatible veneering porcelains and three flat square test specimens were produced for each system. The core-veneer interfaces were examined by scanning electron microscopy and energy dispersive x-ray microanalysis. The glass-ceramic systems showed interfacial zones rich in Si and O, with the presence of K, Ca, Al in the core and Ca, Ce, Na, Mg or Al in the veneer material, depending on the system tested. IPS Empress and IPS Empress 2 demonstrated distinct transitional phases at the core-veneer interface. In the glass-infiltrated systems, intermixing of core (Ce, La) with veneer (Na, Si) elements occurred, whereas an abrupt drop of the core-veneer elemental concentration was documented at the interfaces of all densely sintered ceramics. The results of the study provided no evidence of elemental interdiffusion at the core-veneer interfaces in densely sintered ceramics, which implies a lack of primary chemical bonding. For the glass-containing systems (glass-ceramics and glass-infiltrated ceramics), interdiffusion of the glass phase seems to play a critical role in establishing a primary bonding condition between the ceramic core and the veneering porcelain.

  9. The PDS4 Metadata Management System

    NASA Astrophysics Data System (ADS)

    Raugh, A. C.; Hughes, J. S.

    2018-04-01

    We present the key features of the Planetary Data System (PDS) PDS4 Information Model as an extendable metadata management system for planetary metadata related to data structure, analysis/interpretation, and provenance.

  10. Dynamic Non-Hierarchical File Systems for Exascale Storage

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Long, Darrell E.; Miller, Ethan L

    This constitutes the final report for “Dynamic Non-Hierarchical File Systems for Exascale Storage”. The ultimate goal of this project was to improve data management in scientific computing and high-end computing (HEC) applications, and to achieve this goal we proposed: to develop the first, HEC-targeted, file system featuring rich metadata and provenance collection, extreme scalability, and future storage hardware integration as core design goals, and to evaluate and develop a flexible non-hierarchical file system interface suitable for providing more powerful and intuitive data management interfaces to HEC and scientific computing users. Data management is swiftly becoming a serious problem in the scientific community – while copious amounts of data are good for obtaining results, finding the right data is often daunting and sometimes impossible. Scientists participating in a Department of Energy workshop noted that most of their time was spent “...finding, processing, organizing, and moving data and it’s going to get much worse”. Scientists should not be forced to become data mining experts in order to retrieve the data they want, nor should they be expected to remember the naming convention they used several years ago for a set of experiments they now wish to revisit. Ideally, locating the data you need would be as easy as browsing the web. Unfortunately, existing data management approaches are usually based on hierarchical naming, a 40 year-old technology designed to manage thousands of files, not exabytes of data. Today’s systems do not take advantage of the rich array of metadata that current high-end computing (HEC) file systems can gather, including content-based metadata and provenance information. As a result, current metadata search approaches are typically ad hoc and often work by providing a parallel management system to the “main” file system, as is done in Linux (the locate utility), personal computers, and enterprise search appliances. These search applications are often optimized for a single file system, making it difficult to move files and their metadata between file systems. Users have tried to solve this problem in several ways, including the use of separate databases to index file properties, the encoding of file properties into file names, and separately gathering and managing provenance data, but none of these approaches has worked well, either due to limited usefulness or scalability, or both. Our research addressed several key issues: High-performance, real-time metadata harvesting: extracting important attributes from files dynamically and immediately updating indexes used to improve search; Transparent, automatic, and secure provenance capture: recording the data inputs and processing steps used in the production of each file in the system; Scalable indexing: indexes that are optimized for integration with the file system; Dynamic file system structure: our approach provides dynamic directories similar to those in semantic file systems, but these are the native organization rather than a feature grafted onto a conventional system. In addition to these goals, our research effort included evaluating the impact of new storage technologies on the file system design and performance. In particular, the indexing and metadata harvesting functions can potentially benefit from the performance improvements promised by new storage class memories.
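
    A toy version of the metadata-harvesting-plus-indexing idea, assuming a conventional directory tree as input: walk the files, record a few attributes, and build an inverted index that an attribute-based ("dynamic directory" style) query can be answered from. The attributes, size threshold, and query below are invented examples; this is an illustration of the concept, not the project's file system.

        # Toy metadata harvester and inverted index over a directory tree:
        # file attributes are extracted and indexed so that attribute-based
        # queries can replace path-based lookup. Illustrative only.
        import os
        from collections import defaultdict

        def harvest(root):
            """Walk a directory tree and index files by a few harvested attributes."""
            index = defaultdict(set)           # (attribute, value) -> set of paths
            for dirpath, _dirs, files in os.walk(root):
                for name in files:
                    path = os.path.join(dirpath, name)
                    st = os.stat(path)
                    ext = os.path.splitext(name)[1].lstrip(".").lower() or "none"
                    index[("ext", ext)].add(path)
                    size_class = "large" if st.st_size > 10**6 else "small"
                    index[("size_class", size_class)].add(path)
            return index

        def query(index, **attrs):
            """Return paths matching every attribute=value pair given."""
            sets = [index.get((k, v), set()) for k, v in attrs.items()]
            return set.intersection(*sets) if sets else set()

        idx = harvest(".")
        print(sorted(query(idx, ext="py", size_class="small"))[:5])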

  11. Design of Community Resource Inventories as a Component of Scalable Earth Science Infrastructure: Experience of the Earthcube CINERGI Project

    NASA Astrophysics Data System (ADS)

    Zaslavsky, I.; Richard, S. M.; Valentine, D. W., Jr.; Grethe, J. S.; Hsu, L.; Malik, T.; Bermudez, L. E.; Gupta, A.; Lehnert, K. A.; Whitenack, T.; Ozyurt, I. B.; Condit, C.; Calderon, R.; Musil, L.

    2014-12-01

    EarthCube is envisioned as a cyberinfrastructure that fosters new, transformational geoscience by enabling sharing, understanding and scientifically-sound and efficient re-use of formerly unconnected data resources, software, models, repositories, and computational power. Its purpose is to enable science enterprise and workforce development via an extensible and adaptable collaboration and resource integration framework. A key component of this vision is the development of comprehensive inventories supporting resource discovery and re-use across geoscience domains. The goal of the EarthCube CINERGI (Community Inventory of EarthCube Resources for Geoscience Interoperability) project is to create a methodology and assemble a large inventory of high-quality information resources with standard metadata descriptions and traceable provenance. The inventory is compiled from metadata catalogs maintained by geoscience data facilities, as well as from user contributions. The latter mechanism relies on community resource viewers: online applications that support update and curation of metadata records. Once harvested into CINERGI, metadata records from domain catalogs and community resource viewers are loaded into a staging database implemented in MongoDB, and validated for compliance with the ISO 19139 metadata schema. Several types of metadata defects detected by the validation engine are automatically corrected with the help of several information extractors or flagged for manual curation. The metadata harvesting, validation and processing components generate provenance statements using the W3C PROV notation, which are stored in a Neo4j database. The curated metadata, along with the provenance information, is then re-published and can be accessed programmatically and via a CINERGI online application. This presentation focuses on the role of resource inventories in a scalable and adaptable information infrastructure, and on the CINERGI metadata pipeline and its implementation challenges. Key project components are described at the project's website (http://workspace.earthcube.org/cinergi), which also provides access to the initial resource inventory, the inventory metadata model, metadata entry forms and a collection of the community resource viewers.
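
    As a rough sketch of the validate-and-record-provenance step described above: the actual pipeline performs full ISO 19139 schema validation and stores W3C PROV statements in Neo4j, whereas the checks, element list, and dictionary-based provenance record below are simplified stand-ins invented for illustration.

        # Simplified stand-in for a harvest-curation step: check a harvested
        # record for a few required ISO 19139 elements and emit a PROV-like
        # statement describing what was done. Illustrative only.
        import datetime
        from xml.etree import ElementTree as ET

        GMD = "{http://www.isotc211.org/2005/gmd}"
        REQUIRED = [f"{GMD}fileIdentifier", f"{GMD}identificationInfo", f"{GMD}contact"]

        def curate(record_xml, harvested_from):
            root = ET.fromstring(record_xml)
            missing = [tag for tag in REQUIRED if root.find(tag) is None]
            provenance = {                      # PROV-style statement, as a dict
                "prov:activity": "metadata-validation",
                "prov:used": harvested_from,
                "prov:generatedAtTime":
                    datetime.datetime.now(datetime.timezone.utc).isoformat(),
                "flags": missing,
            }
            return ("needs-curation" if missing else "valid"), provenance

        record = ('<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd">'
                  '<gmd:fileIdentifier/><gmd:contact/></gmd:MD_Metadata>')
        status, prov = curate(record, "https://catalog.example.org/csw")
        print(status, prov["flags"])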

  12. Identification of the core sequence elements in Penaeus stylirostris densovirus promoters

    USDA-ARS?s Scientific Manuscript database

    This manuscript describes the role of different core elements in the transcriptional activity of promoters in a marine parvovirus, Penaeus stylirostris densovirus (PstDNV) that infects shrimp. Although comprehensive information on the role of different elements in the promoters of several animal par...

  13. Architecture of the local spatial data infrastructure for regional climate change research

    NASA Astrophysics Data System (ADS)

    Titov, Alexander; Gordov, Evgeny

    2013-04-01

    Georeferenced datasets (meteorological databases, modeling and reanalysis results, etc.) are actively used in modeling and analysis of climate change on various spatial and temporal scales. Due to the inherent heterogeneity of environmental datasets, as well as their size, which might reach tens of terabytes for a single dataset, studies of climate and environmental change require special software support based on the SDI approach. A dedicated architecture of the local spatial data infrastructure aimed at regional climate change analysis using modern web mapping technologies is presented. A geoportal is a key element of any SDI, allowing searching of geoinformation resources (datasets and services) using metadata catalogs, producing geospatial data selections by their parameters (data access functionality), as well as managing services and applications for cartographic visualization. It should be noted that, due to objective reasons such as large dataset volumes, the complexity of the data models used, and syntactic and semantic differences between datasets, the development of environmental geodata access, processing and visualization services turns out to be quite a complex task. Those circumstances were taken into account while developing the architecture of the local spatial data infrastructure as a universal framework providing geodata services. The architecture presented therefore includes: 1. a model for storing big sets of regional georeferenced data that is effective in terms of search, access, retrieval and subsequent statistical processing, allowing in particular the storage of frequently used values (like monthly and annual climate change indices, etc.), thus providing different temporal views of the datasets; 2. a general architecture of the corresponding software components handling geospatial datasets within the storage model; 3. a metadata catalog describing in detail, using the ISO 19115 and CF-convention standards, the datasets used in climate research, as a basic element of the spatial data infrastructure, together with its publication according to the OGC CSW (Catalog Service Web) specification; 4. computational and mapping web services for working with geospatial datasets based on the OWS (OGC Web Services) standards WMS, WFS and WPS; 5. a geoportal as a key element of the thematic regional spatial data infrastructure, also providing a software framework for dedicated web application development. To realize the web mapping services, GeoServer is used, since it provides a natural WPS implementation as a separate software module. To provide geospatial metadata services, the GeoNetwork opensource (http://geonetwork-opensource.org) product is planned to be used, as it supports the ISO 19115/ISO 19119/ISO 19139 metadata standards as well as the ISO CSW 2.0 profile for both client and server. To implement thematic applications based on geospatial web services within the framework of the local SDI geoportal, the following open source software have been selected: 1. the OpenLayers JavaScript library, providing basic web mapping functionality for a thin client such as a web browser; 2. the GeoExt/ExtJS JavaScript libraries for building client-side web applications working with geodata services. The web interface developed will be similar to the interfaces of popular desktop GIS applications such as uDig, QuantumGIS, etc. The work is partially supported by RF Ministry of Education and Science grant 8345, SB RAS Program VIII.80.2.1 and IP 131.

  14. Metadata improvements driving new tools and services at a NASA data center

    NASA Astrophysics Data System (ADS)

    Moroni, D. F.; Hausman, J.; Foti, G.; Armstrong, E. M.

    2011-12-01

    The NASA Physical Oceanography DAAC (PO.DAAC) is responsible for distributing and maintaining satellite-derived oceanographic data from a number of NASA and non-NASA missions for the physical disciplines of ocean winds, sea surface temperature, ocean topography and gravity. Currently its holdings consist of over 600 datasets with a data archive in excess of 200 Terabytes. The PO.DAAC has recently embarked on a metadata quality and completeness project to migrate, update and improve metadata records for over 300 public datasets. An interactive database management tool has been developed to allow data scientists to enter, update and maintain metadata records. This tool communicates directly with PO.DAAC's Data Management and Archiving System (DMAS), which serves as the new archival and distribution backbone as well as a permanent repository of dataset and granule-level metadata. Although we will briefly discuss the tool, the more important ramifications are the ability to now expose, propagate and leverage the metadata in a number of ways. First, the metadata are exposed directly through a faceted and free-text search interface on the Drupal-based PO.DAAC web pages, allowing for quick browsing and data discovery, especially by "drilling" through the various facet levels that organize datasets by time/space resolution, processing level, sensor, measurement type, etc. Furthermore, the metadata can now be exposed through web services to produce metadata records in a number of different formats such as FGDC and ISO 19115, or potentially propagated to visualization and subsetting tools and other discovery interfaces. The fundamental concept is that the metadata form the essential bridge between the user and the tool or discovery mechanism for a broad range of ocean Earth science data records.
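
    A faceted browse interface of this kind is ultimately driven by counting dataset-level metadata values per facet. The sketch below illustrates the idea with hypothetical field names; it is not PO.DAAC's actual schema or search stack.

    ```python
    from collections import Counter, defaultdict

    # Hypothetical dataset-level metadata records; field names are illustrative only.
    datasets = [
        {"short_name": "SST_L4_A", "processing_level": "L4", "sensor": "AMSR-E", "measurement": "sea surface temperature"},
        {"short_name": "SSH_L2_B", "processing_level": "L2", "sensor": "Jason-1", "measurement": "ocean topography"},
        {"short_name": "SST_L2_C", "processing_level": "L2", "sensor": "MODIS", "measurement": "sea surface temperature"},
    ]

    def facet_counts(records, facet_fields):
        """Count how many datasets fall under each value of each facet field."""
        counts = defaultdict(Counter)
        for rec in records:
            for field in facet_fields:
                counts[field][rec.get(field, "unknown")] += 1
        return counts

    for field, counter in facet_counts(datasets, ["processing_level", "sensor", "measurement"]).items():
        print(field, dict(counter))
    ```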

  15. EPA Metadata Style Guide Keywords and EPA Organization Names

    EPA Pesticide Factsheets

    The following keywords and EPA organization names listed below, along with EPA’s Metadata Style Guide, are intended to provide suggestions and guidance to assist with the standardization of metadata records.

  16. A System for Automated Extraction of Metadata from Scanned Documents using Layout Recognition and String Pattern Search Models

    PubMed Central

    Misra, Dharitri; Chen, Siyuan; Thoma, George R.

    2010-01-01

    One of the most expensive aspects of archiving digital documents is the manual acquisition of context-sensitive metadata useful for the subsequent discovery of, and access to, the archived items. For certain types of textual documents, such as journal articles, pamphlets, official government records, etc., where the metadata is contained within the body of the documents, a cost effective method is to identify and extract the metadata in an automated way, applying machine learning and string pattern search techniques. At the U. S. National Library of Medicine (NLM) we have developed an automated metadata extraction (AME) system that employs layout classification and recognition models with a metadata pattern search model for a text corpus with structured or semi-structured information. A combination of Support Vector Machine and Hidden Markov Model is used to create the layout recognition models from a training set of the corpus, following which a rule-based metadata search model is used to extract the embedded metadata by analyzing the string patterns within and surrounding each field in the recognized layouts. In this paper, we describe the design of our AME system, with focus on the metadata search model. We present the extraction results for a historic collection from the Food and Drug Administration, and outline how the system may be adapted for similar collections. Finally, we discuss some ongoing enhancements to our AME system. PMID:21179386
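
    The rule-based search stage can be pictured as a set of per-field patterns applied to recognized text. The sketch below uses regular expressions on a toy OCR fragment; the patterns and field names are illustrative stand-ins, not NLM's actual rules.

    ```python
    import re

    # A toy OCR text fragment with embedded metadata fields (illustration only).
    ocr_text = """
    U.S. FOOD AND DRUG ADMINISTRATION
    Notice of Judgment No. 4321    Issued: March 12, 1942
    Adulteration of canned tomatoes.
    """

    FIELD_PATTERNS = {
        "document_number": re.compile(r"Notice of Judgment No\.\s*(\d+)"),
        "issue_date":      re.compile(r"Issued:\s*([A-Z][a-z]+ \d{1,2}, \d{4})"),
        "title":           re.compile(r"^\s*(Adulteration of .+?)\.\s*$", re.MULTILINE),
    }

    def extract_metadata(text, patterns):
        """Apply each field pattern to the text and keep the first match, if any."""
        return {field: (m.group(1) if (m := rx.search(text)) else None)
                for field, rx in patterns.items()}

    print(extract_metadata(ocr_text, FIELD_PATTERNS))
    ```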

  17. AthenaMT: upgrading the ATLAS software framework for the many-core world with multi-threading

    NASA Astrophysics Data System (ADS)

    Leggett, Charles; Baines, John; Bold, Tomasz; Calafiura, Paolo; Farrell, Steven; van Gemmeren, Peter; Malon, David; Ritsch, Elmar; Stewart, Graeme; Snyder, Scott; Tsulaia, Vakhtang; Wynne, Benjamin; ATLAS Collaboration

    2017-10-01

    ATLAS’s current software framework, Gaudi/Athena, has been very successful for the experiment in LHC Runs 1 and 2. However, its single threaded design has been recognized for some time to be increasingly problematic as CPUs have increased core counts and decreased available memory per core. Even the multi-process version of Athena, AthenaMP, will not scale to the range of architectures we expect to use beyond Run2. After concluding a rigorous requirements phase, where many design components were examined in detail, ATLAS has begun the migration to a new data-flow driven, multi-threaded framework, which enables the simultaneous processing of singleton, thread unsafe legacy Algorithms, cloned Algorithms that execute concurrently in their own threads with different Event contexts, and fully re-entrant, thread safe Algorithms. In this paper we report on the process of modifying the framework to safely process multiple concurrent events in different threads, which entails significant changes in the underlying handling of features such as event and time dependent data, asynchronous callbacks, metadata, integration with the online High Level Trigger for partial processing in certain regions of interest, concurrent I/O, as well as ensuring thread safety of core services. We also report on upgrading the framework to handle Algorithms that are fully re-entrant.
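
    The distinction between thread-unsafe legacy Algorithms and fully re-entrant ones can be illustrated with a schematic Python analogy (the actual framework is C++, so this is only a sketch of the concept, not the Gaudi/Athena API): the legacy algorithm keeps mutable state and must serialize its calls, while the re-entrant one keeps all state local and can process events concurrently.

    ```python
    import threading
    from concurrent.futures import ThreadPoolExecutor

    class LegacyAlgorithm:
        """Thread-unsafe 'singleton' algorithm: mutates shared state, so calls are serialized."""
        def __init__(self):
            self._lock = threading.Lock()
            self._scratch = 0          # shared mutable state, the source of unsafety

        def execute(self, event):
            with self._lock:           # only one event at a time may use this instance
                self._scratch = event["id"] * 2
                return self._scratch

    class ReentrantAlgorithm:
        """Re-entrant algorithm: all state lives in local variables, safe to call concurrently."""
        def execute(self, event):
            return event["id"] * 2

    events = [{"id": i} for i in range(8)]
    legacy, reentrant = LegacyAlgorithm(), ReentrantAlgorithm()

    with ThreadPoolExecutor(max_workers=4) as pool:
        print(list(pool.map(legacy.execute, events)))     # correct but serialized on the lock
        print(list(pool.map(reentrant.execute, events)))  # runs freely across threads
    ```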

  18. Core formation in the Moon: The mystery of the excess depletion of Mo, W and P

    NASA Technical Reports Server (NTRS)

    Newsom, H. E.; Maehr, S. A.

    1993-01-01

    We have evaluated siderophile element depletion models for the Moon in light of our improved statistical treatment of siderophile element abundance data and new information on the physics of core formation. If core formation occurred in the Moon at the large degrees of partial melting necessary for metal segregation, according to recent estimates, then a significant inconsistency (not seen in the eucrite parent body) exists in the depletion of the incompatible siderophile elements Mo, W, and P, compared to other siderophile elements in the Moon. The siderophile data, with the exception of Mo, are most consistent with terrestrial initial siderophile abundances and segregation of a very small core in the Moon. Our improved abundance estimates and possible explanations for these discrepancies are discussed.

  19. The effect of melt composition on metal-silicate partitioning of siderophile elements and constraints on core formation in the angrite parent body

    NASA Astrophysics Data System (ADS)

    Steenstra, E. S.; Sitabi, A. B.; Lin, Y. H.; Rai, N.; Knibbe, J. S.; Berndt, J.; Matveev, S.; van Westrenen, W.

    2017-09-01

    We present 275 new metal-silicate partition coefficients for P, S, V, Cr, Mn, Co, Ni, Ge, Mo, and W obtained at moderate P (1.5 GPa) and high T (1683-1883 K). We investigate the effect of silicate melt composition using four end member silicate melt compositions. We identify possible silicate melt dependencies of the metal-silicate partitioning of lower valence elements Ni, Ge and V, elements that are usually assumed to remain unaffected by changes in silicate melt composition. Results for the other elements are consistent with the dependence of their metal-silicate partition coefficients on the individual major oxide components of the silicate melt composition suggested by recently reported parameterizations and theoretical considerations. Using multiple linear regression, we parameterize compiled metal-silicate partitioning results including our new data and report revised expressions that predict their metal-silicate partitioning behavior as a function of P-T-X-fO2. We apply these results to constrain the conditions that prevailed during core formation in the angrite parent body (APB). Our results suggest the siderophile element depletions in angrite meteorites are consistent with a CV bulk composition and constrain APB core formation to have occurred at mildly reducing conditions of 1.4 ± 0.5 log units below the iron-wüstite buffer (ΔIW), corresponding to an APB core mass of 18 ± 11%. The core mass range is constrained to 21 ± 8 mass% if light elements (S and/or C) are assumed to reside in the APB core. Incorporation of light elements in the APB core does not yield significantly different redox states for APB core-mantle differentiation. The inferred redox state is in excellent agreement with independent fO2 estimates recorded by pyroxene and olivine in angrites.
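
    Parameterizations of this kind are obtained by least-squares regression of log partition coefficients on thermodynamic predictors. The sketch below fits a deliberately simplified form, log10(D) = a + b/T + c·P/T + d·ΔIW, to synthetic data; the functional form, coefficients, and data are illustrative only, not the paper's actual regression.

    ```python
    import numpy as np

    # Synthetic example only: the "true" coefficients and observations below are made up.
    rng = np.random.default_rng(0)
    T = rng.uniform(1683.0, 1883.0, 40)          # temperature, K
    P = rng.uniform(1.0, 2.0, 40)                # pressure, GPa
    dIW = rng.uniform(-2.5, -0.5, 40)            # oxygen fugacity relative to iron-wuestite

    true = np.array([1.2, -3500.0, 0.8, -0.5])   # arbitrary coefficients a, b, c, d
    X = np.column_stack([np.ones_like(T), 1.0 / T, P / T, dIW])
    logD = X @ true + rng.normal(0.0, 0.05, 40)  # synthetic observations with noise

    coef, *_ = np.linalg.lstsq(X, logD, rcond=None)
    print("fitted a, b, c, d:", np.round(coef, 3))
    ```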

  20. The Importance of Metadata in System Development and IKM

    DTIC Science & Technology

    2003-02-01

    Defence R&D Canada – Atlantic, Technical Memorandum DRDC Atlantic TM 2003-011, February 2003, by Anthony W. Isenor. ... it is important for searches and providing relevant information to the client. A comparison of metadata standards was conducted with emphasis on ...

  1. The Global Streamflow Indices and Metadata Archive (GSIM) - Part 1: The production of a daily streamflow archive and metadata

    NASA Astrophysics Data System (ADS)

    Do, Hong Xuan; Gudmundsson, Lukas; Leonard, Michael; Westra, Seth

    2018-04-01

    This is the first part of a two-paper series presenting the Global Streamflow Indices and Metadata archive (GSIM), a worldwide collection of metadata and indices derived from more than 35 000 daily streamflow time series. This paper focuses on the compilation of the daily streamflow time series based on 12 free-to-access streamflow databases (seven national databases and five international collections). It also describes the development of three metadata products (freely available at https://doi.pangaea.de/10.1594/PANGAEA.887477): (1) a GSIM catalogue collating basic metadata associated with each time series, (2) catchment boundaries for the contributing area of each gauge, and (3) catchment metadata extracted from 12 gridded global data products representing essential properties such as land cover type, soil type, and climate and topographic characteristics. The quality of the delineated catchment boundary is also made available and should be consulted in GSIM application. The second paper in the series then explores production and analysis of streamflow indices. Having collated an unprecedented number of stations and associated metadata, GSIM can be used to advance large-scale hydrological research and improve understanding of the global water cycle.

  2. Design and Implementation of a Metadata-rich File System

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ames, S; Gokhale, M B; Maltzahn, C

    2010-01-19

    Despite continual improvements in the performance and reliability of large scale file systems, the management of user-defined file system metadata has changed little in the past decade. The mismatch between the size and complexity of large scale data stores and their ability to organize and query their metadata has led to a de facto standard in which raw data is stored in traditional file systems, while related, application-specific metadata is stored in relational databases. This separation of data and semantic metadata requires considerable effort to maintain consistency and can result in complex, slow, and inflexible system operation. To address these problems, we have developed the Quasar File System (QFS), a metadata-rich file system in which files, user-defined attributes, and file relationships are all first class objects. In contrast to hierarchical file systems and relational databases, QFS defines a graph data model composed of files and their relationships. QFS incorporates Quasar, an XPATH-extended query language for searching the file system. Results from our QFS prototype show the effectiveness of this approach. Compared to the de facto standard, the QFS prototype shows superior ingest performance and comparable query performance on user metadata-intensive operations and superior performance on normal file metadata operations.
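
    The core idea, files as graph nodes carrying user-defined attributes and typed relationships, can be sketched in a few lines. The sketch below is an in-memory illustration of that data model, not the QFS on-disk format or the Quasar query language.

    ```python
    # Minimal in-memory sketch of a metadata-rich file model: files carry user-defined
    # attributes and typed links to other files (illustration only).
    class FileNode:
        def __init__(self, path, **attrs):
            self.path = path
            self.attrs = attrs            # user-defined key/value metadata
            self.links = []               # (relationship, target FileNode)

        def link(self, relationship, target):
            self.links.append((relationship, target))

    raw   = FileNode("/data/run42.dat", experiment="run42", quality="raw")
    calib = FileNode("/data/run42.calib.dat", experiment="run42", quality="calibrated")
    notes = FileNode("/docs/run42.txt", author="sba")
    calib.link("derivedFrom", raw)
    notes.link("describes", raw)

    def related(node, relationship):
        """Follow typed relationships from a file node (one hop)."""
        return [t for rel, t in node.links if rel == relationship]

    print([t.path for t in related(calib, "derivedFrom")])   # -> ['/data/run42.dat']
    ```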

  3. Achieving Sub-Second Search in the CMR

    NASA Astrophysics Data System (ADS)

    Gilman, J.; Baynes, K.; Pilone, D.; Mitchell, A. E.; Murphy, K. J.

    2014-12-01

    The Common Metadata Repository (CMR) is the next generation Earth Science Metadata catalog for NASA's Earth Observing data. It joins together the holdings from the EOS Clearing House (ECHO) and the Global Change Master Directory (GCMD), creating a unified, authoritative source for EOSDIS metadata. The CMR allows ingest in many different formats while providing consistent search behavior and retrieval in any supported format. Performance is a critical component of the CMR, ensuring improved data discovery and client interactivity. The CMR delivers sub-second search performance for any of the common query conditions (including spatial) across hundreds of millions of metadata granules. It also allows the addition of new metadata concepts such as visualizations, parameter metadata, and documentation. The CMR's goals presented many challenges. This talk will describe the CMR architecture, design, and innovations that were made to achieve its goals. This includes: * Architectural features like immutability and backpressure. * Data management techniques such as caching and parallel loading that give big performance gains. * Open Source and COTS tools like Elasticsearch search engine. * Adoption of Clojure, a functional programming language for the Java Virtual Machine. * Development of a custom spatial search plugin for Elasticsearch and why it was necessary. * Introduction of a unified model for metadata that maps every supported metadata format to a consistent domain model.
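
    The "unified model" idea amounts to normalizing records from different metadata dialects into one common domain model before indexing, so search behaves identically regardless of ingest format. The sketch below illustrates that normalization step with hypothetical dialects and field names; it is not the CMR's actual schemas.

    ```python
    # Illustration only: normalize records from different metadata dialects into a
    # unified domain model. Field names are hypothetical.
    UNIFIED_FIELDS = ("title", "start_time", "end_time", "bounding_box")

    def from_dialect_a(rec):
        return {"title": rec["EntryTitle"],
                "start_time": rec["Temporal"]["Begin"],
                "end_time": rec["Temporal"]["End"],
                "bounding_box": tuple(rec["Spatial"]["BBox"])}

    def from_dialect_b(rec):
        return {"title": rec["name"],
                "start_time": rec["time_range"][0],
                "end_time": rec["time_range"][1],
                "bounding_box": (rec["west"], rec["south"], rec["east"], rec["north"])}

    MAPPERS = {"dialect_a": from_dialect_a, "dialect_b": from_dialect_b}

    def normalize(rec, dialect):
        unified = MAPPERS[dialect](rec)
        assert set(unified) == set(UNIFIED_FIELDS)   # every dialect must fill the whole model
        return unified

    print(normalize({"name": "MODIS L2 SST", "time_range": ["2014-01-01", "2014-01-02"],
                     "west": -180, "south": -90, "east": 180, "north": 90}, "dialect_b"))
    ```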

  4. Grid-wide neuroimaging data federation in the context of the NeuroLOG project

    PubMed Central

    Michel, Franck; Gaignard, Alban; Ahmad, Farooq; Barillot, Christian; Batrancourt, Bénédicte; Dojat, Michel; Gibaud, Bernard; Girard, Pascal; Godard, David; Kassel, Gilles; Lingrand, Diane; Malandain, Grégoire; Montagnat, Johan; Pélégrini-Issac, Mélanie; Pennec, Xavier; Rojas Balderrama, Javier; Wali, Bacem

    2010-01-01

    Grid technologies are appealing to deal with the challenges raised by computational neurosciences and support multi-centric brain studies. However, core grid middleware hardly copes with the complex neuroimaging data representation and multi-layer data federation needs. Moreover, legacy neuroscience environments need to be preserved and cannot simply be superseded by grid services. This paper describes the NeuroLOG platform design and implementation, shedding light on its Data Management Layer. It addresses the integration of brain image files, associated relational metadata and neuroscience semantic data in a heterogeneous distributed environment, integrating legacy data managers through a mediation layer. PMID:20543431

  5. Identification of water-quality trends using sediment cores from Dillon Reservoir, Summit County, Colorado

    USGS Publications Warehouse

    Greve, Adrienne I.; Spahr, Norman E.; Van Metre, Peter C.; Wilson, Jennifer T.

    2001-01-01

    Since the construction of Dillon Reservoir, in Summit County, Colorado, in 1963, its drainage area has been the site of rapid urban development and the continued influence of historical mining. In an effort to assess changes in water quality within the drainage area, sediment cores were collected from Dillon Reservoir in 1997. The sediment cores were analyzed for pesticides, polychlorinated biphenyls (PCBs), polycyclic aromatic hydrocarbons (PAHs), and trace elements. Pesticides, PCBs, and PAHs were used to determine the effects of urban development, and trace elements were used to identify mining contributions. Water-quality and streambed-sediment samples, collected at the mouth of three streams that drain into Dillon Reservoir, were analyzed for trace elements. Of the 14 pesticides and 3 PCBs for which the sediment samples were analyzed, only 2 pesticides were detected. Low amounts of dichloro-diphenyldichloroethylene (DDE) and dichloro-diphenyldichloroethane (DDD), metabolites of dichlorodiphenyltrichloroethane (DDT), were found at core depths of 5 centimeters and below 15 centimeters in a core collected near the dam. The longest core, which was collected near the dam, spanned the entire sedimentation history of the reservoir. Concentrations of total combustion PAH and the ratio of fluoranthene to pyrene in the core sample decreased with core depth and increased over time. This relation is likely due to growth in residential and tourist populations in the region. Comparisons between core samples gathered in each arm of the reservoir showed the highest PAH concentrations were found in the Tenmile Creek arm, the only arm that has an urban area on its shores, the town of Frisco. All PAH concentrations, except the pyrene concentration in one segment in the core near the dam and acenaphthylene concentrations in the tops of three cores taken in the reservoir arms, were below Canadian interim freshwater sediment-quality guidelines. Concentrations of arsenic, cadmium, chromium, copper, lead, and zinc in sediment samples from Dillon Reservoir exceeded the Canadian interim freshwater sediment-quality guidelines. Copper, iron, lithium, nickel, scandium, titanium, and vanadium concentrations in sediment samples decreased over time. Other elements, while no trend was evident, displayed concentration spikes in the down-core profiles, indicating loads entering the reservoir may have been larger than they were in 1997. The highest concentrations of copper, lead, manganese, mercury, and zinc were detected during the late 1970's and early 1980's. Elevated concentrations of trace elements in sediment in Dillon Reservoir likely resulted from historical mining in the drainage area. The downward trend identified for copper, iron, lithium, nickel, scandium, titanium, and vanadium may be due in part to restoration efforts in mining-affected areas and a decrease in active mining in the Dillon Reservoir watershed. Although many trace-element core-sediment concentrations exceeded the Canadian probable effect level for freshwater lakes, under current limnological conditions, the high core-sediment concentrations do not adversely affect water quality in Dillon Reservoir. The trace-element concentrations in the reservoir water column meet the standards established by the Colorado Water Quality Control Commission. 

  6. The LTER Network Information System: Improving Data Quality and Synthesis through Community Collaboration

    NASA Astrophysics Data System (ADS)

    Servilla, M.; Brunt, J.

    2011-12-01

    Emerging in the 1980's as a U.S. National Science Foundation funded research network, the Long Term Ecological Research (LTER) Network began with six sites and with the goal of performing comparative data collection and analysis of major biotic regions of North America. Today, the LTER Network includes 26 sites located in North America, Antarctica, Puerto Rico, and French Polynesia and has contributed a corpus of over 7,000 data sets to the public domain. The diversity of LTER research has led to a wealth of scientific data derived from atmospheric to terrestrial to oceanographic to anthropogenic studies. Such diversity, however, is a contributing factor to data being published with poor or inconsistent quality or to data lacking descriptive documentation sufficient for understanding their origin or performing derivative studies. It is for these reasons that the LTER community, in collaboration with the LTER Network Office, has embarked on the development of the LTER Network Information System (NIS) - an integrative data management approach to improve the process by which quality LTER data and metadata are assembled into a central archive, thereby enabling better discovery, analysis, and synthesis of derived data products. The mission of the LTER NIS is to promote advances in collaborative and synthetic ecological science at multiple temporal and spatial scales by providing the information management and technology infrastructure to increase: (1) the availability and quality of data from LTER sites - by the use and support of standardized approaches to metadata management and access to data; (2) the timeliness and number of LTER derived data products - by creating a suite of middleware programs and workflows that make it easy to create and maintain integrated data sets derived from LTER data; and (3) the knowledge generated from the synthesis of LTER data - by creating standardized access and easy-to-use applications to discover, access, and use LTER data. The LTER NIS will utilize the Provenance Aware Synthesis Tracking Architecture (PASTA), which will provide the LTER community a metadata-driven data-flow framework to automatically harvest data from LTER research sites and make it available through a well defined software interface. We distinguish PASTA from the more generalized NIS by classifying framework components as critical and enabling cyberinfrastructure that, collectively, provide the services defined by the above mission. Data and metadata will have to pass a set of community defined quality criteria before entry into PASTA, including the use of semantic informing metadata elements and the conformance of data to their structural descriptions provided by metadata. As a result, consumers of data products from PASTA will be assured that metadata are complete and include provenance information where applicable and the data are of the highest quality. Development of the NIS is being performed through community participation. Advisory groups, called "Tiger Teams", are enlisted from the general LTER membership to provide input to the design of the NIS. Other LTER working groups contribute community-based software into the NIS; these include modules for controlled vocabularies, scientific units, and personnel. We anticipate a 2014 release of the LTER NIS.
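
    One of the quality criteria mentioned above, congruence between a data table and the structural description in its metadata, can be pictured as a simple check of declared column names and types against the actual file. The sketch below is only an illustration of the shape of such a check; it is not the actual PASTA quality engine or EML handling.

    ```python
    import csv, io

    # Hypothetical structural description (column name, type), loosely in the spirit
    # of a metadata attribute list; illustration only.
    declared_columns = [("site", str), ("date", str), ("air_temp_c", float)]

    data_file = io.StringIO("site,date,air_temp_c\nAND,2011-06-01,14.2\nAND,2011-06-02,15.1\n")

    def check_congruence(fh, declaration):
        reader = csv.reader(fh)
        header = next(reader)
        if header != [name for name, _ in declaration]:
            return ["header does not match metadata: %s" % header]
        problems = []
        for lineno, row in enumerate(reader, start=2):
            for (name, typ), value in zip(declaration, row):
                try:
                    typ(value)
                except ValueError:
                    problems.append("line %d: %s=%r is not %s" % (lineno, name, value, typ.__name__))
        return problems

    print(check_congruence(data_file, declared_columns) or "data congruent with metadata")
    ```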

  7. Tools for proactive collection and use of quality metadata in GEOSS

    NASA Astrophysics Data System (ADS)

    Bastin, L.; Thum, S.; Maso, J.; Yang, K. X.; Nüst, D.; Van den Broek, M.; Lush, V.; Papeschi, F.; Riverola, A.

    2012-12-01

    The GEOSS Common Infrastructure allows interactive evaluation and selection of Earth Observation datasets by the scientific community and decision makers, but the data quality information needed to assess fitness for use is often patchy and hard to visualise when comparing candidate datasets. In a number of studies over the past decade, users repeatedly identified the same types of gaps in quality metadata, specifying the need for enhancements such as peer and expert review, better traceability and provenance information, information on citations and usage of a dataset, warning about problems identified with a dataset and potential workarounds, and 'soft knowledge' from data producers (e.g. recommendations for use which are not easily encoded using the existing standards). Despite clear identification of these issues in a number of recommendations, the gaps persist in practice and are highlighted once more in our own, more recent, surveys. This continuing deficit may well be the result of a historic paucity of tools to support the easy documentation and continual review of dataset quality. However, more recent developments in tools and standards, as well as more general technological advances, present the opportunity for a community of scientific users to adopt a more proactive attitude by commenting on their uses of data, and for that feedback to be federated with more traditional and static forms of metadata, allowing a user to more accurately assess the suitability of a dataset for their own specific context and reliability thresholds. The EU FP7 GeoViQua project aims to develop this opportunity by adding data quality representations to the existing search and visualisation functionalities of the Geo Portal. Subsequently we will help to close the gap by providing tools to easily create quality information, and to permit user-friendly exploration of that information as the ultimate incentive for improved data quality documentation. Quality information is derived from producer metadata, from the data themselves, from validation of in-situ sensor data, from provenance information and from user feedback, and will be aggregated to produce clear and useful summaries of quality, including a GEO Label. GeoViQua's conceptual quality information models for users and producers are specifically described and illustrated in this presentation. These models (which have been encoded as XML schemas and can be accessed at http://schemas.geoviqua.org/) are designed to satisfy the identified user needs while remaining consistent with current standards such as ISO 19115 and advanced drafts such as ISO 19157. The resulting components being developed for the GEO Portal are designed to lower the entry barrier to users who wish to help to generate and explore rich and useful metadata. This metadata will include reviews, comments and ratings, reports of usage in specific domains and specification of datasets used for benchmarking, as well as rich quantitative information encoded in more traditional data quality elements such as thematic correctness and positional accuracy. The value of the enriched metadata will also be enhanced by graphical tools for visualizing spatially distributed uncertainties. We demonstrate practical example applications in selected environmental application domains.

  8. Mercury's core evolution

    NASA Astrophysics Data System (ADS)

    Deproost, Marie-Hélène; Rivoldini, Attilio; Van Hoolst, Tim

    2016-10-01

    Remote sensing data of Mercury's surface by MESSENGER indicate that Mercury formed under reducing conditions. As a consequence, silicon is likely the main light element in the core, together with a possible small fraction of sulfur. Compared to sulfur, which hardly partitions into solid iron at Mercury's core conditions and strongly decreases the melting temperature, silicon partitions almost equally well between solid and liquid iron and is not very effective at reducing the melting temperature of iron. Silicon as the major light element constituent instead of sulfur therefore implies a significantly higher core liquidus temperature and a decrease in the vigor of compositional convection generated by the release of light elements upon inner core formation. Due to the immiscibility in liquid Fe-Si-S at low pressure (below 15 GPa), the core might also not be homogeneous and might consist of an inner S-poor Fe-Si core below a thinner Si-poor Fe-S layer. Here, we study the consequences of a silicon-rich core and the effect of the blanketing Fe-S layer on the thermal evolution of Mercury's core and on the generation of a magnetic field.

  9. Syntactic and Semantic Validation without a Metadata Management System

    NASA Technical Reports Server (NTRS)

    Pollack, Janine; Gokey, Christopher D.; Kendig, David; Olsen, Lola; Wharton, Stephen W. (Technical Monitor)

    2001-01-01

    The ability to maintain quality information is essential to securing confidence in any system for which the information serves as a data source. NASA's Global Change Master Directory (GCMD), an online Earth science data locator, holds over 9000 data set descriptions and is in a constant state of flux as metadata are created and updated on a daily basis. In such a system, the importance of maintaining the consistency and integrity of these metadata is crucial. The GCMD has developed a metadata management system utilizing XML, controlled vocabulary, and Java technologies to ensure the metadata not only adhere to valid syntax, but also exhibit proper semantics.
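
    The two layers of checking, valid syntax and valid semantics, can be illustrated with a well-formedness check followed by a controlled-vocabulary lookup. The element names and vocabulary below are made up for the sketch; they are not the GCMD's actual DIF fields or keyword lists.

    ```python
    import xml.etree.ElementTree as ET

    # Illustrative controlled vocabulary (not the actual GCMD keyword list).
    CONTROLLED_TOPICS = {"ATMOSPHERE", "OCEANS", "LAND SURFACE"}

    record = """<dif>
      <Entry_Title>Example data set</Entry_Title>
      <Topic>OCEANS</Topic>
    </dif>"""

    def validate(xml_text):
        errors = []
        try:
            root = ET.fromstring(xml_text)        # syntactic: must parse as XML
        except ET.ParseError as exc:
            return ["not well-formed: %s" % exc]
        topic = root.findtext("Topic")
        if topic not in CONTROLLED_TOPICS:        # semantic: value must come from the vocabulary
            errors.append("Topic %r is not in the controlled vocabulary" % topic)
        return errors

    print(validate(record) or "record is valid")
    ```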

  10. Core Formation Process and Light Elements in the Planetary Core

    NASA Astrophysics Data System (ADS)

    Ohtani, E.; Sakairi, T.; Watanabe, K.; Kamada, S.; Sakamaki, T.; Hirao, N.

    2015-12-01

    Si, O, and S are major candidates for light elements in the planetary core. In the early stage of planetary formation, core formation started by percolation of the metallic liquid through the silicate matrix, because the Fe-S-O and Fe-S-Si eutectic temperatures are significantly lower than the solidus of the silicates. Therefore, in the early stage of accretion of the planets, the S-enriched eutectic liquid was formed and separated into the core by percolation. The major light element in the core at this stage will be sulfur. The internal pressure and temperature increased with the growth of the planets, and the metal component depleted in S was molten. The metallic melt contained both Si and O at high pressure in the deep magma ocean in the later stage. Thus, the core contains S, Si, and O in this stage of core formation. Partitioning experiments between solid and liquid metals indicate that S is partitioned into the liquid metal, whereas O partitions only weakly into the liquid. Partitioning of Si changes with the metallic iron phase: fcc iron alloy coexisting with the metallic liquid below 30 GPa is depleted in Si, whereas hcp-Fe alloy coexisting with the liquid above 30 GPa favors Si. This contrast in Si partitioning produces remarkable differences in the compositions of the solid inner core and liquid outer core among the terrestrial planets. Our melting experiments on the Fe-S-Si and Fe-O-S systems at high pressure indicate that the core adiabats in the small planets, Mercury and Mars, are steeper than the slopes of the solidus and liquidus curves of these systems. Thus, in these planets, the core crystallized at the top of the liquid core and 'snowing core' formation occurred during crystallization. The solid inner core is depleted in both Si and S, whereas the liquid outer core is relatively enriched in Si and S in these planets. On the other hand, the core adiabats in the large planets, Earth and Venus, are shallower than the solidus and liquidus curves of the systems. The inner core of these planets crystallized at the center of the core, giving a relatively Si-rich inner core and an S-enriched outer core. Based on melting and solid-liquid partitioning, the equation of state, and the sound velocity of iron-light element alloys, we examined the plausible distribution of light elements in the liquid outer and solid inner cores of the terrestrial planets.

  11. Trace element fluxes during the last 100 years in sediment near a nuclear power plant

    NASA Astrophysics Data System (ADS)

    Bojórquez-Sánchez, S.; Marmolejo-Rodríguez, A. J.; Ruiz-Fernández, A. C.; Sánchez-González, A.; Sánchez-Cabeza, J. A.; Bojórquez-Leyva, H.; Pérez-Bernal, L. H.

    2017-11-01

    The Salada coastal lagoon is located in Veracruz (Mexico) near the Laguna Verde Nuclear Power Plant (LVNPP). Currently, the lagoon receives the cooling waters used in the LVNPP. To evaluate the fluxes and mobilization of trace elements due to human activities in the area, two sediment cores from the coastal flood plains of Salada Lagoon were analysed. Cores were collected using PVC tubes. Sediment cores were analysed every centimetre for dating (210Pb by alpha detector) and trace metal analysis using ICP-Mass Spectrometry. The dating of both sediment cores covers the period from 1900 to 2013, which includes the construction of the LVNPP (1970's). The Normalized Enrichment Factor shows enrichment of Ag, As and Cr in both sediment cores. These enrichments correspond to the extent of mining activity (which reached a maximum in the 1900's) and to the geological setting of the coastal zone. The profiles of the element fluxes in both sediment cores reflected the construction and operation of the LVNPP; however, the element contents did not show evidence of pollution coming from the LVNPP.

  12. Evaluating non-relational storage technology for HEP metadata and meta-data catalog

    NASA Astrophysics Data System (ADS)

    Grigorieva, M. A.; Golosova, M. V.; Gubin, M. Y.; Klimentov, A. A.; Osipova, V. V.; Ryabinkin, E. A.

    2016-10-01

    Large-scale scientific experiments produce vast volumes of data. These data are stored, processed and analyzed in a distributed computing environment. The life cycle of an experiment is managed by specialized software such as Distributed Data Management and Workload Management Systems. In order to be interpreted and mined, experimental data must be accompanied by auxiliary metadata, which are recorded at each data processing step. Metadata describe scientific data and represent scientific objects or results of scientific experiments, allowing them to be shared by various applications, recorded in databases or published via the Web. Processing and analysis of the constantly growing volume of auxiliary metadata is a challenging task, no simpler than the management and processing of the experimental data itself. Furthermore, metadata sources are often loosely coupled and may lead to inconsistencies in combined information queries presented to end users. To aggregate and synthesize a range of primary metadata sources, and to enhance them with flexible, schema-less addition of aggregated data, we are developing the Data Knowledge Base architecture serving as the intelligence behind GUIs and APIs.
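
    Aggregating loosely coupled metadata sources into one schema-less document, while keeping track of which source supplied each field, can be sketched as follows. The source names and fields are hypothetical; this is only an illustration of the aggregation idea, not the actual Data Knowledge Base.

    ```python
    # Illustration only: merge auxiliary metadata about one dataset from loosely coupled
    # sources into a single schema-less document, recording per-field provenance.
    def aggregate(dataset_id, sources):
        doc = {"dataset_id": dataset_id, "fields": {}}
        for source_name, record in sources.items():
            for key, value in record.items():
                entry = doc["fields"].setdefault(key, {"values": {}, "conflict": False})
                entry["values"][source_name] = value
                entry["conflict"] = len(set(map(str, entry["values"].values()))) > 1
        return doc

    sources = {
        "workload_system": {"events": 1_000_000, "status": "done"},
        "data_management": {"events": 1_000_000, "size_gb": 42.5, "status": "replicated"},
    }
    doc = aggregate("mc16_13TeV.12345", sources)
    print(doc["fields"]["status"])   # both values kept, flagged as a potential inconsistency
    ```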

  13. Survey data and metadata modelling using document-oriented NoSQL

    NASA Astrophysics Data System (ADS)

    Rahmatuti Maghfiroh, Lutfi; Gusti Bagus Baskara Nugraha, I.

    2018-03-01

    Survey data collected from year to year are subject to metadata change; however, they need to be stored in an integrated way so that statistical data can be obtained faster and more easily. A data warehouse (DW) can be used for this purpose, but the change of variables in every period cannot be accommodated by a traditional DW via Slowly Changing Dimensions (SCD). Previous research handled the change of variables in a DW by managing metadata with a multiversion DW (MVDW), designed using the relational model. Other studies have found that non-relational models in NoSQL databases offer faster read times than relational models. Therefore, we propose managing metadata change by using NoSQL. This study proposes a DW model to manage change and algorithms to retrieve data with metadata changes. Evaluation of the proposed model and algorithms shows that a database with the proposed design can retrieve data with metadata changes properly. This paper contributes to comprehensive analysis of data with metadata changes (especially survey data) in integrated storage.
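
    The document-oriented approach can be pictured as storing each survey wave's metadata as its own document, so a variable can be renamed between waves without breaking older documents. The sketch below uses plain Python dictionaries as a stand-in for a document store; the wave structure and field names are illustrative only, not the paper's model.

    ```python
    # Minimal stand-in for a document store: one metadata document per survey wave.
    metadata_store = [
        {"wave": 2016, "variables": {"income": {"label": "Monthly income", "unit": "IDR"}}},
        {"wave": 2017, "variables": {"income_m": {"label": "Monthly income", "unit": "IDR",
                                                  "renamed_from": "income"}}},
    ]

    def resolve_variable(store, wave, name):
        """Find a variable in the requested wave, following rename links if needed."""
        doc = next(d for d in store if d["wave"] == wave)
        for var, meta in doc["variables"].items():
            if var == name or meta.get("renamed_from") == name:
                return var, meta
        raise KeyError(name)

    print(resolve_variable(metadata_store, 2017, "income"))  # old name still resolves in the new wave
    ```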

  14. Improving the accessibility and re-use of environmental models through provision of model metadata - a scoping study

    NASA Astrophysics Data System (ADS)

    Riddick, Andrew; Hughes, Andrew; Harpham, Quillon; Royse, Katherine; Singh, Anubha

    2014-05-01

    There has been an increasing interest both from academic and commercial organisations over recent years in developing hydrologic and other environmental models in response to some of the major challenges facing the environment, for example environmental change and its effects and ensuring water resource security. This has resulted in a significant investment in modelling by many organisations both in terms of financial resources and intellectual capital. To capitalise on the effort of producing models, it is necessary for the models to be both discoverable and appropriately described. If this is not done, the effort of producing the models will be wasted. However, whilst there are some recognised metadata standards relating to datasets, these may not completely address the needs of modellers regarding, for example, input data. Also there appears to be a lack of metadata schemes configured to encourage the discovery and re-use of the models themselves. The lack of an established standard for model metadata is considered to be a factor inhibiting the more widespread use of environmental models, particularly the use of linked model compositions which fuse together hydrologic models with models from other environmental disciplines. This poster presents the results of a Natural Environment Research Council (NERC) funded scoping study to understand the requirements of modellers and other end users for metadata about data and models. A user consultation exercise using an on-line questionnaire has been undertaken to capture the views of a wide spectrum of stakeholders on how they are currently managing metadata for modelling. This has provided a strong confirmation of our original supposition that there is a lack of systems and facilities to capture metadata about models. A number of specific gaps in current provision for data and model metadata were also identified, including a need for a standard means to record detailed information about the modelling environment and the model code used, to assist the selection of models for linked compositions. Existing best practice, including the use of current metadata standards (e.g. ISO 19110, ISO 19115 and ISO 19119) and the metadata components of WaterML, was also evaluated. In addition to commonly used metadata attributes (e.g. spatial reference information) there was significant interest in recording a variety of additional metadata attributes. These included more detailed information about temporal data, and also providing estimates of data accuracy and uncertainty within metadata. This poster describes the key results of this study, including a number of gaps in the provision of metadata for modelling, and outlines how these might be addressed. Overall the scoping study has highlighted significant interest in addressing this issue within the environmental modelling community. There is therefore an impetus for on-going research, and we are seeking to take this forward through collaboration with other interested organisations. Progress towards an internationally recognised model metadata standard is suggested.

  15. Four-terminal circuit element with photonic core

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sampayan, Stephen

    A four-terminal circuit element is described that includes a photonic core inside of the circuit element that uses a wide bandgap semiconductor material that exhibits photoconductivity and allows current flow through the material in response to the light that is incident on the wide bandgap material. The four-terminal circuit element can be configured based on various hardware structures using a single piece or multiple pieces or layers of a wide bandgap semiconductor material to achieve various designed electrical properties such as high switching voltages by using the photoconductive feature beyond the breakdown voltages of semiconductor devices or circuits operated based on electrical bias or control designs. The photonic core aspect of the four-terminal circuit element provides unique features that enable versatile circuit applications to either replace the semiconductor transistor-based circuit elements or semiconductor diode-based circuit elements.

  16. Automated Metadata Extraction

    DTIC Science & Technology

    2008-06-01

    provides a means for file owners to add metadata which can then be used by iTunes for cataloging and searching [4]. Metadata can be stored in different...based and contain AAC data formats [3]. Specifically, Apple uses Protected AAC to encode copy-protected music titles purchased from the iTunes Music...Store [4]. The files purchased from the iTunes Music Store include the following metadata. • Name • Email address of purchaser • Year • Album

  17. A Solution to Metadata: Using XML Transformations to Automate Metadata

    DTIC Science & Technology

    2010-06-01

    developed their own metadata standards—Directory Interchange Format (DIF), Ecological Metadata Language (EML), and International Organization for...mented all their data using the EML standard. However, when later attempting to publish to a data clearinghouse—such as the Geospatial One-Stop (GOS...construct calls to its transform(s) method by providing the type of the incoming content (e.g., eml), the type of the resulting content (e.g., fgdc) and
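
    The underlying technique, XSLT stylesheets that map one metadata schema onto another, can be sketched with lxml. The stylesheet and element names below are drastically simplified stand-ins for EML and FGDC, not the real standards or the transformation described in the report.

    ```python
    from lxml import etree

    # Toy source record using simplified, made-up element names.
    eml_doc = etree.XML("<eml><title>Stream temperature</title><creator>Jane Doe</creator></eml>")

    # Toy stylesheet mapping the source elements onto an FGDC-like skeleton.
    xslt = etree.XML("""
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="/eml">
        <metadata>
          <idinfo>
            <citation><title><xsl:value-of select="title"/></title></citation>
            <origin><xsl:value-of select="creator"/></origin>
          </idinfo>
        </metadata>
      </xsl:template>
    </xsl:stylesheet>
    """)

    to_fgdc = etree.XSLT(xslt)
    print(etree.tostring(to_fgdc(eml_doc), pretty_print=True).decode())
    ```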

  18. The Department of Defense Net-Centric Data Strategy: Implementation Requires a Joint Community of Interest (COI) Working Group and Joint COI Oversight Council

    DTIC Science & Technology

    2007-05-17

    ...and provides the metadata formats, metadata repositories, enterprise portals and federated search engines that make data visible, available, and usable to users. ...develop an enterprise-wide data sharing plan, establishment of mission area governance processes for CIOs, DISA development of federated search specifications

  19. FRAMES Metadata Reporting Templates for Ecohydrological Observations, version 1.1

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Christianson, Danielle; Varadharajan, Charuleka; Christoffersen, Brad

    FRAMES is a set of Excel metadata files and package-level descriptive metadata that are designed to facilitate and improve capture of desired metadata for ecohydrological observations. The metadata are bundled with data files into a data package and submitted to a data repository (e.g. the NGEE Tropics Data Repository) via a web form. FRAMES standardizes reporting of diverse ecohydrological and biogeochemical data for synthesis across a range of spatiotemporal scales and incorporates many best data science practices. This version of FRAMES supports observations for primarily automated measurements collected by permanently located sensors, including sap flow (tree water use), leaf surface temperature, soil water content, dendrometry (stem diameter growth increment), and solar radiation. Version 1.1 extends the controlled vocabulary and incorporates functionality to facilitate programmatic use of data and FRAMES metadata (R code available at the NGEE Tropics Data Repository).

  20. Assessing Public Metabolomics Metadata, Towards Improving Quality.

    PubMed

    Ferreira, João D; Inácio, Bruno; Salek, Reza M; Couto, Francisco M

    2017-12-13

    Public resources need to be appropriately annotated with metadata in order to make them discoverable, reproducible and traceable, further enabling them to be interoperable or integrated with other datasets. While data-sharing policies exist to promote the annotation process by data owners, these guidelines are still largely ignored. In this manuscript, we analyse automatic measures of metadata quality, and suggest their application as a means to encourage data owners to increase the metadata quality of their resources and submissions, thereby contributing to higher quality data, improved data sharing, and the overall accountability of scientific publications. We analyse these metadata quality measures in the context of a real-world repository of metabolomics data (i.e. MetaboLights), including a manual validation of the measures, and an analysis of their evolution over time. Our findings suggest that the proposed measures can be used to mimic a manual assessment of metadata quality.
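
    One of the simplest automatic quality measures of this kind is completeness against a list of expected fields. The sketch below illustrates the idea; the required fields are hypothetical, not the actual MetaboLights submission schema or the paper's measures.

    ```python
    # Illustration of one automatic metadata quality measure: completeness.
    REQUIRED_FIELDS = ["title", "organism", "instrument", "contact_email", "publication_doi"]

    def completeness(record, required=REQUIRED_FIELDS):
        """Fraction of required fields that are present and non-empty."""
        filled = sum(1 for f in required if str(record.get(f, "")).strip())
        return filled / len(required)

    submission = {"title": "Urine NMR study", "organism": "Homo sapiens",
                  "instrument": "Bruker 600 MHz", "contact_email": ""}
    print("completeness = %.0f%%" % (100 * completeness(submission)))   # 3 of 5 fields filled
    ```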

  1. EXIF Custom: Automatic image metadata extraction for Scratchpads and Drupal.

    PubMed

    Baker, Ed

    2013-01-01

    Many institutions and individuals use embedded metadata to aid in the management of their image collections. Many desktop image management solutions such as Adobe Bridge and online tools such as Flickr also make use of embedded metadata to describe, categorise and license images. Until now Scratchpads (a data management system and virtual research environment for biodiversity) have not made use of these metadata, and users have had to manually re-enter this information if they have wanted to display it on their Scratchpad site. The Drupal module described here allows users to map metadata embedded in their images to the associated field in the Scratchpads image form using one or more customised mappings. The module works seamlessly with the bulk image uploader used on Scratchpads and it is therefore possible to upload hundreds of images easily with automatic metadata (EXIF, XMP and IPTC) extraction and mapping.
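
    The EXIF part of such a workflow can be sketched in Python with Pillow: read the embedded tags, then map a few of them onto form fields. The field mapping and the image filename are hypothetical; XMP and IPTC blocks, which the module also handles, require additional libraries not shown here.

    ```python
    from PIL import Image, ExifTags

    # Read embedded EXIF tags from an image and map numeric tag ids to readable names.
    def read_exif(path):
        with Image.open(path) as img:
            exif = img.getexif()
        return {ExifTags.TAGS.get(tag_id, tag_id): value for tag_id, value in exif.items()}

    # Hypothetical mapping of embedded fields onto image-form fields.
    FIELD_MAP = {"Artist": "creator", "DateTime": "date_captured", "Copyright": "licence"}

    def to_form_fields(path):
        exif = read_exif(path)
        return {form_field: exif[tag] for tag, form_field in FIELD_MAP.items() if tag in exif}

    print(to_form_fields("specimen_0001.jpg"))  # assumes such a file exists locally
    ```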

  2. Collection Metadata Solutions for Digital Library Applications

    NASA Technical Reports Server (NTRS)

    Hill, Linda L.; Janee, Greg; Dolin, Ron; Frew, James; Larsgaard, Mary

    1999-01-01

    Within a digital library, collections may range from an ad hoc set of objects that serve a temporary purpose to established library collections intended to persist through time. The objects in these collections vary widely, from library and data center holdings to pointers to real-world objects, such as geographic places, and the various metadata schemas that describe them. The key to integrated use of such a variety of collections in a digital library is collection metadata that represents the inherent and contextual characteristics of a collection. The Alexandria Digital Library (ADL) Project has designed and implemented collection metadata for several purposes: in XML form, the collection metadata "registers" the collection with the user interface client; in HTML form, it is used for user documentation; eventually, it will be used to describe the collection to network search agents; and it is used for internal collection management, including mapping the object metadata attributes to the common search parameters of the system.

  3. HDF-EOS Web Server

    NASA Technical Reports Server (NTRS)

    Ullman, Richard; Bane, Bob; Yang, Jingli

    2008-01-01

    A shell script has been written as a means of automatically making HDF-EOS-formatted data sets available via the World Wide Web. ("HDF-EOS" and variants thereof are defined in the first of the two immediately preceding articles.) The shell script chains together some software tools developed by the Data Usability Group at Goddard Space Flight Center to perform the following actions: Extract metadata in Object Definition Language (ODL) from an HDF-EOS file, Convert the metadata from ODL to Extensible Markup Language (XML), Reformat the XML metadata into human-readable Hypertext Markup Language (HTML), Publish the HTML metadata and the original HDF-EOS file to a Web server and an Open-source Project for a Network Data Access Protocol (OPeN-DAP) server computer, and Reformat the XML metadata and submit the resulting file to the EOS Clearinghouse, which is a Web-based metadata clearinghouse that facilitates searching for, and exchange of, Earth-Science data.
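
    The chain of actions can be pictured as a small driver that invokes one conversion tool per step and then copies the results to the web server. The tool names below are placeholders for the utilities the shell script chains together (they are not named in the record), so only the shape of the pipeline is meaningful.

    ```python
    import subprocess, shutil

    # Sketch of the publishing chain; the command names are hypothetical placeholders.
    def publish(hdfeos_file, web_root="/var/www/data"):
        odl  = hdfeos_file + ".odl"
        xml  = hdfeos_file + ".xml"
        html = hdfeos_file + ".html"

        subprocess.run(["extract_odl_metadata", hdfeos_file, "-o", odl], check=True)  # ODL out of HDF-EOS
        subprocess.run(["odl2xml", odl, "-o", xml], check=True)                       # ODL -> XML
        subprocess.run(["xml2html", xml, "-o", html], check=True)                     # XML -> human-readable HTML

        for f in (hdfeos_file, html):                                                 # publish to the web server
            shutil.copy(f, web_root)

    # publish("example_granule.hdf")   # would run the chain for one file
    ```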

  4. EXIF Custom: Automatic image metadata extraction for Scratchpads and Drupal

    PubMed Central

    2013-01-01

    Abstract Many institutions and individuals use embedded metadata to aid in the management of their image collections. Many deskop image management solutions such as Adobe Bridge and online tools such as Flickr also make use of embedded metadata to describe, categorise and license images. Until now Scratchpads (a data management system and virtual research environment for biodiversity) have not made use of these metadata, and users have had to manually re-enter this information if they have wanted to display it on their Scratchpad site. The Drupal described here allows users to map metadata embedded in their images to the associated field in the Scratchpads image form using one or more customised mappings. The module works seamlessly with the bulk image uploader used on Scratchpads and it is therefore possible to upload hundreds of images easily with automatic metadata (EXIF, XMP and IPTC) extraction and mapping. PMID:24723768

  5. The Energy Industry Profile of ISO/DIS 19115-1: Facilitating Discovery and Evaluation of, and Access to Distributed Information Resources

    NASA Astrophysics Data System (ADS)

    Hills, S. J.; Richard, S. M.; Doniger, A.; Danko, D. M.; Derenthal, L.; Energistics Metadata Work Group

    2011-12-01

    A diverse group of organizations representative of the international community involved in disciplines relevant to the upstream petroleum industry, including energy companies, suppliers and publishers of information to the energy industry, vendors of software applications used by the industry, and partner government and academic organizations, has engaged in the Energy Industry Metadata Standards Initiative. This Initiative envisions the use of standard metadata within the community to enable significant improvements in the efficiency with which users discover, evaluate, and access distributed information resources. The metadata standard needed to realize this vision is the initiative's primary deliverable. In addition to developing the metadata standard, the initiative is promoting its adoption to accelerate realization of the vision, and publishing metadata exemplars conformant with the standard. Implementation of the standard by community members, in the form of published metadata which document the information resources each organization manages, will allow use of tools requiring consistent metadata for efficient discovery and evaluation of, and access to, information resources. While metadata are expected to be widely accessible, access to associated information resources may be more constrained. The initiative is being conducted by Energistics' Metadata Work Group, in collaboration with the USGIN Project. Energistics is a global standards group in the oil and natural gas industry. The Work Group determined early in the initiative, based on input solicited from 40+ organizations and on an assessment of existing metadata standards, to develop the target metadata standard as a profile of a revised version of ISO 19115, formally the "Energy Industry Profile of ISO/DIS 19115-1 v1.0" (EIP). The Work Group is participating on the ISO/TC 211 project team responsible for the revision of ISO 19115, now ready for "Draft International Standard" (DIS) status. With ISO 19115 an established, capability-rich, open standard for geographic metadata, EIP v1 is expected to be widely acceptable within the community and readily sustainable over the long-term. The EIP design, also per community requirements, will enable discovery, evaluation, and access to types of information resources considered important to the community, including structured and unstructured digital resources, and physical assets such as hardcopy documents and material samples. This presentation will briefly review the development of this initiative as well as the current and planned Work Group activities. More time will be spent providing an overview of the EIP v1, including the requirements it prescribes, design efforts made to enable automated metadata capture and processing, and the structure and content of its documentation, which was written to minimize ambiguity and facilitate implementation. The Work Group considers EIP v1 a solid initial design for interoperable metadata, and a first step toward the vision of the Initiative.

  6. DialysisNet: Application for Integrating and Management Data Sources of Hemodialysis Information by Continuity of Care Record.

    PubMed

    Ku, Ho Suk; Kim, Sungho; Kim, HyeHyeon; Chung, Hee-Joon; Park, Yu Rang; Kim, Ju Han

    2014-04-01

    Health Avatar Beans was developed for the management of chronic kidney disease and end-stage renal disease (ESRD). This article describes the DialysisNet system in Health Avatar Beans for the seamless management of ESRD based on the personal health record. For hemodialysis data modeling, we identified common data elements for hemodialysis information (CDEHI). We used the ASTM continuity of care record (CCR) and ISO/IEC 11179 as the standards with which the CDEHI complies. According to the contents of the ASTM CCR, we mapped the CDEHI to the CCR contents and created the metadata from that mapping. The metadata were transformed and parsed into the database and verified according to the ASTM CCR/XML schema definition (XSD). DialysisNet was created as an iPad application. The contents of the CDEHI were categorized for effective management. For the evaluation of information transfer, we used CarePlatform, which was developed for data access. The metadata of the CDEHI in DialysisNet were exchanged by the CarePlatform with semantic interoperability. The CDEHI was separated into a contents list for individual patient data, a contents list for hemodialysis center data, a consultation and transfer form, and clinical decision support data. After matching to the CCR, the CDEHI was transformed to metadata, which was then transformed to XML and validated against the ASTM CCR XSD. DialysisNet gives specific consideration to visualization, graphics, images, statistics, and database management. We created the DialysisNet application, which can integrate and manage data sources for hemodialysis information based on CCR standards.
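
    The validation step, checking generated XML against an XSD, can be sketched with lxml. The schema below is a drastically simplified stand-in used only to show the mechanics; it is not the ASTM CCR schema definition.

    ```python
    from lxml import etree

    # Toy schema standing in for the real CCR XSD (illustration only).
    xsd = etree.XMLSchema(etree.XML("""
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="ContinuityOfCareRecord">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="Patient" type="xs:string"/>
            <xs:element name="DialysisSession" type="xs:string" maxOccurs="unbounded"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>
    """))

    doc = etree.XML("""
    <ContinuityOfCareRecord>
      <Patient>anonymous-001</Patient>
      <DialysisSession>2014-03-01 4h, Kt/V 1.4</DialysisSession>
    </ContinuityOfCareRecord>
    """)

    print("valid CCR-like document" if xsd.validate(doc) else xsd.error_log)
    ```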

  7. LANCE in ECHO - Merging Science and Near Real-Time Data Search and Order

    NASA Astrophysics Data System (ADS)

    Kreisler, S.; Murphy, K. J.; Vollmer, B.; Lighty, L.; Mitchell, A. E.; Devine, N.

    2012-12-01

    NASA's Earth Observing System (EOS) Data and Information System (EOSDIS) Land Atmosphere Near real-time Capability for EOS (LANCE) project provides expedited data products from the Terra, Aqua, and Aura satellites within three hours of observation. In order to satisfy latency requirements, LANCE data are produced with relaxed ancillary data resulting in a product that may have minor differences from its science quality counterpart. LANCE products are used by a number of different groups to support research and applications that require near real-time earth observations, such as disaster relief, hazard and air quality monitoring, and weather forecasting. LANCE elements process raw rate-buffered and/or session-based production datasets into higher-level products, which are freely available to registered users via LANCE FTP sites. The LANCE project also generates near real-time full resolution browse imagery from these products, which can be accessed through the Global Imagery Browse Services (GIBS). In an effort to support applications and services that require timely access to these near real-time products, the project is currently implementing the publication of LANCE product metadata to the EOS ClearingHouse (ECHO), a centralized EOSDIS registry of EOS data. Metadata within ECHO is made available through an Application Program Interface (API), and applications can utilize the API to allow users to efficiently search and order LANCE data. Publishing near real-time data to ECHO will permit applications to access near real-time product metadata prior to the release of its science quality counterpart and to associate imagery from GIBS with its underlying data product.
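
    An application consuming such a metadata registry typically polls its search API for newly published granules. The sketch below shows the general shape of such a query with requests; the endpoint, product identifier, and response structure are placeholders, so the actual ECHO/CMR API documentation should be consulted for the real interface.

    ```python
    import requests

    # Placeholder endpoint and parameters illustrating a near real-time granule search.
    SEARCH_URL = "https://metadata-catalog.example.org/search/granules.json"

    params = {
        "short_name": "EXAMPLE_NRT_PRODUCT",                       # hypothetical product id
        "temporal": "2012-08-01T00:00:00Z,2012-08-01T03:00:00Z",   # last three hours of interest
        "page_size": 20,
    }

    resp = requests.get(SEARCH_URL, params=params, timeout=30)
    resp.raise_for_status()
    for granule in resp.json().get("feed", {}).get("entry", []):
        print(granule.get("title"), granule.get("updated"))
    ```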

  8. Software and hardware infrastructure for research in electrophysiology

    PubMed Central

    Mouček, Roman; Ježek, Petr; Vařeka, Lukáš; Řondík, Tomáš; Brůha, Petr; Papež, Václav; Mautner, Pavel; Novotný, Jiří; Prokop, Tomáš; Štěbeták, Jan

    2014-01-01

    As in other areas of experimental science, the operation of an electrophysiological laboratory, the design and performance of electrophysiological experiments, the collection, storage and sharing of experimental data and metadata, the analysis and interpretation of these data, and the publication of results are time-consuming activities. If these activities are well organized and supported by a suitable infrastructure, the work efficiency of researchers increases significantly. This article deals with the main concepts, design, and development of software and hardware infrastructure for research in electrophysiology. The described infrastructure has been primarily developed for the needs of the neuroinformatics laboratory at the University of West Bohemia, the Czech Republic. However, from the beginning it has also been designed and developed to be open and applicable in laboratories that do similar research. After introducing the laboratory and the whole architectural concept, the individual parts of the infrastructure are described. The central element of the software infrastructure is a web-based portal that enables community researchers to store, share, download and search data and metadata from electrophysiological experiments. The data model, domain ontology and usage of semantic web languages and technologies are described. The current data publication policy used in the portal is briefly introduced. The registration of the portal within the Neuroscience Information Framework is described. Then the methods used for processing electrophysiological signals are presented. The specific modifications of these methods introduced by laboratory researchers are summarized; the methods are organized into a laboratory workflow. Other parts of the software infrastructure include mobile and offline solutions for data/metadata storing and a hardware stimulator communicating with an EEG amplifier and recording software. PMID:24639646

  9. Software and hardware infrastructure for research in electrophysiology.

    PubMed

    Mouček, Roman; Ježek, Petr; Vařeka, Lukáš; Rondík, Tomáš; Brůha, Petr; Papež, Václav; Mautner, Pavel; Novotný, Jiří; Prokop, Tomáš; Stěbeták, Jan

    2014-01-01

    As in other areas of experimental science, operating an electrophysiological laboratory, designing and performing electrophysiological experiments, collecting, storing and sharing experimental data and metadata, analyzing and interpreting these data, and publishing results are time-consuming activities. If these activities are well organized and supported by a suitable infrastructure, the work efficiency of researchers increases significantly. This article deals with the main concepts, design, and development of software and hardware infrastructure for research in electrophysiology. The described infrastructure has been developed primarily for the needs of the neuroinformatics laboratory at the University of West Bohemia, Czech Republic. However, from the beginning it has also been designed and developed to be open and applicable in laboratories that do similar research. After introducing the laboratory and the overall architectural concept, the individual parts of the infrastructure are described. The central element of the software infrastructure is a web-based portal that enables community researchers to store, share, download and search data and metadata from electrophysiological experiments. The data model, domain ontology and use of semantic web languages and technologies are described. The current data publication policy used in the portal is briefly introduced, and the registration of the portal within the Neuroscience Information Framework is described. The methods used for processing electrophysiological signals are then presented; the specific modifications of these methods introduced by laboratory researchers are summarized, and the methods are organized into a laboratory workflow. Other parts of the software infrastructure include mobile and offline solutions for data/metadata storage and a hardware stimulator communicating with an EEG amplifier and recording software.

  10. Reliable and Persistent Identification of Linked Data Elements

    NASA Astrophysics Data System (ADS)

    Wood, David

    Linked Data techniques rely upon common terminology in a manner similar to a relational database's reliance on a schema. Linked Data terminology anchors metadata descriptions and facilitates navigation of information. Common vocabularies ease the human, social tasks of understanding datasets sufficiently to construct queries and help to relate otherwise disparate datasets. Vocabulary terms must, when using the Resource Description Framework, be grounded in URIs. A current best practice on the World Wide Web is to serve vocabulary terms as Uniform Resource Locators (URLs) and present both human-readable and machine-readable representations to the public. Linked Data terminology published to the World Wide Web may be used by others without reference or notification to the publishing party. That presents a problem: vocabulary publishers take on an implicit responsibility to maintain and publish their terms via the URLs originally assigned, regardless of the inconvenience such a responsibility may cause. Over the course of years, people change jobs, publishing organizations change Internet domain names, computers change IP addresses, and systems administrators publish old material in new ways. Clearly, a mechanism is required to manage Web-based vocabularies over the long term. This chapter places Linked Data vocabularies in context with the wider concepts of metadata in general and metadata on the Web specifically. Persistent identifier mechanisms are reviewed, with a particular emphasis on Persistent URLs, or PURLs. PURLs and PURL services are discussed in the context of Linked Data. Finally, historic weaknesses of PURLs are resolved by the introduction of a federation of PURL services to address needs specific to Linked Data.
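
    The mechanism a PURL relies on is an ordinary HTTP redirect from a stable identifier to the resource's current location. The sketch below dereferences a hypothetical PURL with the requests library and reports the redirect chain; nothing here is specific to any particular PURL service.

    ```python
    # Minimal sketch: dereferencing a persistent URL (PURL) and inspecting the
    # redirect chain. The PURL shown is hypothetical; any HTTP identifier that
    # answers with a 3xx redirect behaves the same way.
    import requests

    def resolve_purl(purl):
        """Follow redirects from a persistent identifier to its current target."""
        response = requests.get(
            purl,
            headers={"Accept": "text/turtle, application/rdf+xml"},  # prefer RDF
            allow_redirects=True,
            timeout=30,
        )
        hops = [r.url for r in response.history] + [response.url]
        return response.status_code, hops

    if __name__ == "__main__":
        status, hops = resolve_purl("https://purl.example.org/vocab/term")  # hypothetical
        print(status)
        print(" -> ".join(hops))
    ```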

  11. Siderophile Element Constraints on the Conditions of Core Formation in Mars

    NASA Technical Reports Server (NTRS)

    Righter, K.; Humayun, M.

    2012-01-01

    Siderophile element concentrations in planetary basalts and mantle samples have been used to estimate conditions of core formation for many years and have included applications to Earth, Moon, Mars and asteroid 4 Vesta [1]. For Earth, we have samples of mantle and a diverse collection of mantle melts which have provided a mature understanding of how to reconstruct the concentration of siderophile elements in mantle materials from only concentrations in surficial basalt (e.g., [2]). This approach has led to the consensus views that Earth underwent an early magma ocean stage to pressures of 40-50 GPa (e.g., [3,4]), Moon melted extensively and formed a small (approx. 2 mass %) metallic core [5], and 4 Vesta contains a metallic core that is approximately 18 mass % [6,7]. Based on new data from newly found meteorites, robotic spacecraft, and experimental partitioning studies, [8] showed that eight siderophile elements (Ni, Co, Mo, W, Ga, P, V and Cr) are consistent with equilibration of a 20 mass% S-rich metallic core with the mantle at pressures of 14 +/- 3 GPa. We aim to test this rather simple scenario with additional analyses of meteorites for a wide range of siderophile elements, and application of new experimental data for the volatile siderophile and highly siderophile elements.

  12. Marine sedimentary coring and high-quality, multi-proxy records in the high-latitude North Pacific: a synthesis of paleoceanographic cruise and research effort

    NASA Astrophysics Data System (ADS)

    Borreggine, M. J.; Myhre, S. E.; Smith-Mislan, A.; Davis, C. V.; Deutsch, C.

    2016-12-01

    We assessed sedimentary coring efforts, data acquisition and publications from the subpolar North Pacific and marginal seas from 1951-2015. We found a total of 1,249 sediment cores collected by American, French, Japanese and Russian research vessels across the Subarctic Pacific (639 cores), Alaskan Gyre (8 cores), Sea of Okhotsk (270 cores), Bering Sea (120 cores), and the Sea of Japan (212 cores). Of these, 27% are investigated in peer-reviewed publications; this fraction varies by region: Subarctic Pacific (18%), Alaskan Gyre (100%), Sea of Okhotsk (33%), Bering Sea (57%), and Sea of Japan (25%). We assess the biological, geochemical, isotopic, and stratigraphic lines of evidence available for these cores, alongside coring technology, location, depth, cruise and vessel metadata. Coring effort peaked in 1996, 2009, and 2010, when 86, 90, and 67 cores, respectively, were recovered in the five regions collectively. Piston cores are the most common (347 cores) of the 24 different coring technologies used in the last 64 years. Published sedimentation rates range across the Subarctic Pacific (0.132-208 cm/ka), Alaskan Gyre (9-10,000 cm/ka), Sea of Okhotsk (0.7-115.5 cm/ka), Bering Sea (3-250 cm/ka), and the Sea of Japan (0.5-25 cm/ka), with the highest rates in the Alaskan Gyre. Age model development has transitioned from singular techniques to multiproxy approaches. Recent chronologies are built using a mix of isotope stratigraphy, radiocarbon dating, magnetostratigraphy, biostratigraphy, tephrochronology, % opal, color, and lithophysical proxies. Out of 275 published chronologies for the North Pacific, 132 (48%) are built with radiocarbon dating. Sedimentary data in the North Pacific include biological, geochemical, isotopic, and stratigraphic analyses, and we document all proxy evidence to date across all cores assessed. This database of coring and publication provides a unique resource and comprehensive assessment to the paleoceanographic community, can be used to identify strengths and weaknesses in North Pacific paleoceanography, and will be made publicly available. Additionally, the database is used to recreate past sea ice, temperature, and oxygen conditions in two additional submissions at the 2016 AGU Fall Meeting.

  13. Sound velocity of iron-light element compounds and the chemical structure of the inner core

    NASA Astrophysics Data System (ADS)

    Ohtani, E.; Sakamaki, T.; Fukui, H.; Tanaka, R.; Shibazaki, Y.; Kamada, S.; Sakairi, T.; Takahashi, S.; Tsutsui, S.; Baron, A. Q. R.

    2016-12-01

    The light elements in the core could constrain the conditions of the accretion, subsequent magma ocean, and core formation stages of the Earth. There have been several studies measuring the sound velocity of iron-light element alloys. However, the measurements are not yet sufficient to tightly constrain the light element abundance in the core, owing to inter-laboratory inconsistencies between different methods that originate from the difficulty of making such measurements under extreme conditions. We measured the sound velocity of iron alloy compounds at high pressures and temperatures relevant to the Earth's core using double-sided laser heating in a diamond anvil cell (DAC) combined with inelastic X-ray scattering at SPring-8. We measured the compressional velocity of hcp-Fe up to 166 GPa and 3000 K and derived a clear temperature dependence of Birch's law for hcp-Fe. We also measured the compressional velocity of Fe0.89Si0.11 alloy and Fe3C at high pressure and temperature and could not detect a temperature dependence of Birch's law in these compounds. Additionally, we measured the sound velocity of Fe3S, Fe0.83Ni0.09Si0.08 alloy, and FeH at high pressure. Combining our new data set, which shows notable differences from previous sound velocity data, we present a model of the chemical structure of the inner core. The outer core composition was also estimated based on the partitioning behavior of these light elements between solid and liquid iron alloys under core conditions.
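
    For reference, the Birch's law relation the abstract appeals to is, in its simplest form, a linear dependence of compressional velocity on density; the temperature-corrected variant shown second is one parameterization sometimes fitted to laser-heated DAC data. The coefficients a, b and c are fit parameters, not values from this work.

    ```latex
    % Birch's law in its simplest form: compressional velocity varies linearly
    % with density (coefficients a, b fitted to data).
    V_P \approx a + b\,\rho
    % One parameterization with an explicit temperature correction, used when a
    % deviation from Birch's law is resolved (c is a fitted coefficient, T_0 a
    % reference temperature):
    V_P \approx a + b\,\rho - c\,(T - T_0)
    ```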

  14. CORE-SINEs: Eukaryotic short interspersed retroposing elements with common sequence motifs

    PubMed Central

    Gilbert, Nicolas; Labuda, Damian

    1999-01-01

    A 65-bp “core” sequence is dispersed in hundreds of thousands of copies in the human genome. This sequence was found to constitute the central segment of a group of short interspersed elements (SINEs), referred to as mammalian-wide interspersed repeats, that proliferated before the radiation of placental mammals. Here, we propose that the core identifies an ancient tRNA-like SINE element, which survived in different lineages such as mammals, reptiles, birds, and fish, as well as mollusks, presumably for >550 million years. This element gave rise to a number of sequence families (CORE-SINEs), including mammalian-wide interspersed repeats, whose distinct 3′ ends are shared with different families of long interspersed elements (LINEs). The evolutionary success of the generic CORE-SINE element can be related to the recruitment of the internal promoter from highly transcribed host RNA as well as to its capacity to adapt to changing retropositional opportunities by sequence exchange with actively amplifying LINEs. It reinforces the notion that the very existence of SINEs depends on the cohabitation with both LINEs and the host genome. PMID:10077603

  15. MCNP-model for the OAEP Thai Research Reactor

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gallmeier, F.X.; Tang, J.S.; Primm, R.T. III

    An MCNP input was prepared for the Thai Research Reactor, making extensive use of the MCNP geometry's lattice feature, which allows a flexible and easy rearrangement of the core components and adjustment of the control elements. The geometry was checked for overdefined or undefined zones by two-dimensional plots of cuts through the core configuration with the MCNP geometry plotting capabilities, and by a three-dimensional view of the core configuration with the SABRINA code. Cross sections were defined for a hypothetical core of 67 standard fuel elements and 38 low-enriched uranium fuel elements, all filled with fresh fuel. Three test calculations were performed with the MCNP4B code to obtain the multiplication factor for the cases with control elements fully inserted, fully withdrawn, and at a working position.

  16. A document centric metadata registration tool constructing earth environmental data infrastructure

    NASA Astrophysics Data System (ADS)

    Ichino, M.; Kinutani, H.; Ono, M.; Shimizu, T.; Yoshikawa, M.; Masuda, K.; Fukuda, K.; Kawamoto, H.

    2009-12-01

    DIAS (Data Integration and Analysis System) is one of the GEOSS activities in Japan. It is also a leading part of the GEOSS task with the same name defined in the GEOSS Ten Year Implementation Plan. The main mission of DIAS is to construct a data infrastructure that can effectively integrate earth environmental data such as observation data, numerical model outputs, and socio-economic data provided from the fields of climate, water cycle, ecosystem, ocean, biodiversity and agriculture. Some of DIAS's data products are available at http://www.jamstec.go.jp/e/medid/dias. Most earth environmental data commonly have spatial and temporal attributes such as the covering geographic scope or the creation date. Metadata standards that include these common attributes are published by the geographic information technical committee (TC211) of the International Organization for Standardization (ISO) as ISO 19115:2003 and ISO 19139:2007. Accordingly, DIAS metadata is developed based on the ISO/TC211 metadata standards. From the viewpoint of data users, metadata is useful not only for data retrieval and analysis but also for interoperability and information sharing among experts, beginners and nonprofessionals. On the other hand, from the viewpoint of data providers, two problems were pointed out in discussions. One is that data providers prefer to minimize the additional tasks and time spent creating metadata. The other is that data providers want to manage and publish documents that explain their data sets more comprehensively. To solve these problems, we have been developing a document-centric metadata registration tool. The features of our tool are that the generated documents are available instantly and there is no extra cost for data providers to generate metadata. The tool is developed as a web application, so data providers need nothing more than a web browser. The interface of the tool provides the section titles of the documents, and by filling out the content of each section, the documents for the data sets are automatically published in PDF and HTML format. Furthermore, a metadata XML file compliant with ISO 19115 and ISO 19139 is created at the same time. The generated metadata are managed in the metadata database of the DIAS project and will be used in various ISO 19139-compliant metadata management tools, such as GeoNetwork.
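
    As an illustration of the mapping idea, the sketch below assembles a skeletal ISO 19139-style record (title and abstract only) from two document sections using the gmd/gco namespaces. It is a minimal sketch, not the DIAS implementation, and a valid record requires many more mandatory elements than shown here.

    ```python
    # Minimal sketch: turning a few document sections into an ISO 19139-style
    # metadata record. Only a tiny subset of the schema is shown (title and
    # abstract); element names follow the ISO/TC211 gmd/gco namespaces, but a
    # real record needs many more mandatory elements than this illustration.
    import xml.etree.ElementTree as ET

    GMD = "http://www.isotc211.org/2005/gmd"
    GCO = "http://www.isotc211.org/2005/gco"
    ET.register_namespace("gmd", GMD)
    ET.register_namespace("gco", GCO)

    def qn(ns, tag):
        return f"{{{ns}}}{tag}"

    def sections_to_iso19139(title, abstract):
        """Build a skeletal gmd:MD_Metadata record from document sections."""
        md = ET.Element(qn(GMD, "MD_Metadata"))
        ident = ET.SubElement(
            ET.SubElement(md, qn(GMD, "identificationInfo")),
            qn(GMD, "MD_DataIdentification"),
        )
        citation = ET.SubElement(
            ET.SubElement(ident, qn(GMD, "citation")), qn(GMD, "CI_Citation")
        )
        t = ET.SubElement(ET.SubElement(citation, qn(GMD, "title")), qn(GCO, "CharacterString"))
        t.text = title
        a = ET.SubElement(ET.SubElement(ident, qn(GMD, "abstract")), qn(GCO, "CharacterString"))
        a.text = abstract
        return ET.tostring(md, encoding="unicode")

    print(sections_to_iso19139("Example data set", "Abstract text written by the provider."))
    ```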

  17. Theoretical and experimental investigation of architected core materials incorporating negative stiffness elements

    NASA Astrophysics Data System (ADS)

    Chang, Chia-Ming; Keefe, Andrew; Carter, William B.; Henry, Christopher P.; McKnight, Geoff P.

    2014-04-01

    Structural assemblies incorporating negative stiffness elements have been shown to provide both tunable damping properties and simultaneous high stiffness and damping over prescribed displacement regions. In this paper we explore the design space for negative-stiffness-based assemblies using analytical modeling combined with finite element analysis. A simplified spring model demonstrates the effects of element stiffness, geometry, and preloads on the damping and stiffness performance. The simplified analytical models were validated for realistic structural implementations through finite element analysis. A series of complementary experiments was conducted to compare with the modeling and to determine the effects of each element on the system response. The measured damping performance follows the theoretical predictions obtained by analytical modeling. We applied these concepts to a novel sandwich core structure that exhibited combined stiffness and damping properties 8 times greater than those of existing foam core technologies.

  18. RNA connectivity requirements between conserved elements in the core of the yeast telomerase RNP

    PubMed Central

    Mefford, Melissa A; Rafiq, Qundeel; Zappulla, David C

    2013-01-01

    Telomerase is a specialized chromosome end-replicating enzyme required for genome duplication in many eukaryotes. An RNA and reverse transcriptase protein subunit comprise its enzymatic core. Telomerase is evolving rapidly, particularly its RNA component. Nevertheless, nearly all telomerase RNAs, including those of H. sapiens and S. cerevisiae, share four conserved structural elements: a core-enclosing helix (CEH), template-boundary element, template, and pseudoknot, in this order along the RNA. It is not clear how these elements coordinate telomerase activity. We find that although rearranging the order of the four conserved elements in the yeast telomerase RNA subunit, TLC1, disrupts activity, the RNA ends can be moved between the template and pseudoknot in vitro and in vivo. However, the ends disrupt activity when inserted between the other structured elements, defining an Area of Required Connectivity (ARC). Within the ARC, we find that only the junction nucleotides between the pseudoknot and CEH are essential. Integrating all of our findings provides a basic map of functional connections in the core of the yeast telomerase RNP and a framework to understand conserved element coordination in telomerase mechanism. PMID:24129512

  19. Contents of the JPL Distributed Active Archive Center (DAAC) archive, version 2-91

    NASA Technical Reports Server (NTRS)

    Smith, Elizabeth A. (Editor); Lassanyi, Ruby A. (Editor)

    1991-01-01

    The Distributed Active Archive Center (DAAC) archive at the Jet Propulsion Laboratory (JPL) includes satellite data sets for the ocean sciences and global change research to facilitate multidisciplinary use of satellite ocean data. Parameters include sea surface height, surface wind vector, sea surface temperature, atmospheric liquid water, and surface pigment concentration. The Jet Propulsion Laboratory DAAC is an element of the Earth Observing System Data and Information System (EOSDIS) and will be the United States distribution site for the Ocean Topography Experiment (TOPEX)/POSEIDON data and metadata.

  20. JPL Physical Oceanography Distributed Active Archive Center (PO.DAAC) data availability, version 1-94

    NASA Technical Reports Server (NTRS)

    1994-01-01

    The Physical Oceanography Distributed Active Archive Center (PO.DAAC) archive at the Jet Propulsion Laboratory (JPL) includes satellite data sets for the ocean sciences and global-change research to facilitate multidisciplinary use of satellite ocean data. Parameters include sea-surface height, surface-wind vector, sea-surface temperature, atmospheric liquid water, and integrated water vapor. The JPL PO.DAAC is an element of the Earth Observing System Data and Information System (EOSDIS) and is the United States distribution site for Ocean Topography Experiment (TOPEX)/POSEIDON data and metadata.

  1. Converting ODM Metadata to FHIR Questionnaire Resources.

    PubMed

    Doods, Justin; Neuhaus, Philipp; Dugas, Martin

    2016-01-01

    Interoperability between systems and data sharing between domains is becoming more and more important. The portal medical-data-models.org offers more than 5,300 UMLS-annotated forms in CDISC ODM format in order to support interoperability; several additional export formats are also available. CDISC ODM and the Questionnaire resource of HL7's FHIR framework were analyzed, a mapping between their elements was created, and a converter was implemented. The developed converter was integrated into the portal with FHIR Questionnaire XML and JSON download options. New FHIR applications can now use this large library of forms.
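
    A minimal sketch of the kind of element mapping such a converter performs is shown below: simplified ODM item definitions become items of a FHIR Questionnaire resource. The data-type table and the input structure are assumptions made for illustration, not the portal's actual code.

    ```python
    # Minimal sketch of the mapping idea: ODM ItemDefs become items of a FHIR
    # Questionnaire resource. The data-type mapping table and the simplified
    # input structure are assumptions, not the portal's implementation.
    ODM_TO_FHIR_TYPE = {          # assumed subset of the real mapping
        "text": "string",
        "integer": "integer",
        "float": "decimal",
        "date": "date",
        "boolean": "boolean",
    }

    def odm_items_to_questionnaire(form_name, item_defs):
        """item_defs: list of dicts with OID, Question and DataType (simplified ODM)."""
        return {
            "resourceType": "Questionnaire",
            "status": "active",
            "title": form_name,
            "item": [
                {
                    "linkId": item["OID"],
                    "text": item["Question"],
                    "type": ODM_TO_FHIR_TYPE.get(item["DataType"], "string"),
                }
                for item in item_defs
            ],
        }

    questionnaire = odm_items_to_questionnaire(
        "Demographics",
        [{"OID": "I.AGE", "Question": "Age in years", "DataType": "integer"}],
    )
    print(questionnaire["item"][0])
    ```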

  2. A metadata-driven approach to data repository design.

    PubMed

    Harvey, Matthew J; McLean, Andrew; Rzepa, Henry S

    2017-01-01

    The design and use of a metadata-driven data repository for research data management is described. Metadata is collected automatically during the submission process whenever possible and is registered with DataCite in accordance with their current metadata schema, in exchange for a persistent digital object identifier. Two examples of data preview are illustrated, including the demonstration of a method for integration with commercial software that confers rich domain-specific data analytics without introducing customisation into the repository itself.
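
    The sketch below illustrates the registration step in outline: assembling a DataCite-style metadata payload at submission time. Field names follow the public DataCite metadata schema (creators, titles, publisher, publicationYear, types); the repository name is a placeholder, and the authenticated deposit call itself is omitted.

    ```python
    # Minimal sketch: assembling a DataCite-style metadata payload at submission
    # time. Field names follow the public DataCite metadata schema; the publisher
    # string is a placeholder and a production deposit needs authentication.
    import json

    def datacite_payload(doi, title, creators, year, resource_type="Dataset"):
        return {
            "data": {
                "type": "dois",
                "attributes": {
                    "doi": doi,
                    "titles": [{"title": title}],
                    "creators": [{"name": c} for c in creators],
                    "publisher": "Example Institutional Repository",  # placeholder
                    "publicationYear": year,
                    "types": {"resourceTypeGeneral": resource_type},
                },
            }
        }

    payload = datacite_payload(
        "10.5072/example-123",            # 10.5072 is a test prefix, not a real DOI
        "Raw spectra for compound 42",
        ["Doe, Jane"],
        2017,
    )
    print(json.dumps(payload, indent=2))
    # The payload would then be submitted to the DataCite REST API in exchange
    # for a persistent DOI; that authenticated call is omitted here.
    ```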

  3. Electrical and thermal conductivity of Fe-C alloy at high pressure: implications for effects of carbon on the geodynamo of the Earth's core

    NASA Astrophysics Data System (ADS)

    Zhang, C.; Lin, J. F.; Liu, Y.; Feng, S.; Jin, C.; Yoshino, T.

    2017-12-01

    The thermal conductivity of iron alloys in the Earth's core plays a crucial role in constraining the energetics of the geodynamo and the thermal evolution of the planet. Studies of the thermal conductivity of iron reveal the importance of the effects of light elements and high temperature. Carbon has been proposed as a candidate light element in the Earth's core because of its meteoritic abundance and the high-pressure velocity-density profiles of iron carbides (e.g., Fe7C3). In this study, we employed the four-probe van der Pauw method in a diamond anvil cell to measure the electrical resistivity of pure iron, iron-carbon alloy, and iron carbides at high pressures. These studies were complemented by synchrotron X-ray diffraction and focused ion beam (FIB) analyses. Our results show significant changes in the electrical conductivity of these iron-carbon alloys that are consistent with previously reported structural and electronic transitions at high pressures, indicating that these transitions should be taken into account in evaluating electrical and thermal conductivity at high pressure. To apply our results to thermal conduction in the Earth's core, we compared them with literature values for the electrical and thermal conductivity of iron alloyed with light elements (C, Si) at high pressures. These comparisons allow us to assess the validity of the Wiedemann-Franz law and Matthiessen's rule for the effects of light elements on the thermal conductivity of the Earth's core. We found that the addition of a light element such as carbon has a strong effect in reducing the thermal conductivity of the Earth's core, but the magnitude of the alloying effect strongly depends on the identity of the light element and on the crystal and electronic structures. Based on our results and literature values, we modelled the electrical and thermal conductivity of iron-carbon alloy at Earth's core pressure-temperature conditions to evaluate the effects on the heat flux in the Earth's core. In this presentation, we will address how carbon as a potential light element in the Earth's core can significantly affect our view of the heat flux across the core-mantle boundary and the geodynamo of our planet.
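
    For reference, the two relations invoked above are usually written as follows; the Lorenz number shown is the standard Sommerfeld value, and real alloys can deviate from both relations, which is precisely what the comparison described in the abstract tests.

    ```latex
    % Wiedemann-Franz law: thermal conductivity k follows from electrical
    % conductivity sigma and temperature T via the Lorenz number L (the
    % Sommerfeld value L_0 is given for reference).
    k = L\,\sigma\,T, \qquad L_0 \approx 2.44\times10^{-8}\ \mathrm{W\,\Omega\,K^{-2}}
    % Matthiessen's rule: resistivity contributions from phonon scattering and
    % from each impurity (light-element) species add linearly.
    \rho_{\mathrm{total}}(T) \approx \rho_{\mathrm{phonon}}(T) + \sum_i \rho_{\mathrm{imp},\,i}
    ```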

  4. Social tagging in the life sciences: characterizing a new metadata resource for bioinformatics.

    PubMed

    Good, Benjamin M; Tennis, Joseph T; Wilkinson, Mark D

    2009-09-25

    Academic social tagging systems, such as Connotea and CiteULike, provide researchers with a means to organize personal collections of online references with keywords (tags) and to share these collections with others. One of the side-effects of the operation of these systems is the generation of large, publicly accessible metadata repositories describing the resources in the collections. In light of the well-known expansion of information in the life sciences and the need for metadata to enhance its value, these repositories present a potentially valuable new resource for application developers. Here we characterize the current contents of two scientifically relevant metadata repositories created through social tagging. This investigation helps to establish how such socially constructed metadata might be used as it stands currently and to suggest ways that new social tagging systems might be designed that would yield better aggregate products. We assessed the metadata that users of CiteULike and Connotea associated with citations in PubMed with the following metrics: coverage of the document space, density of metadata (tags) per document, rates of inter-annotator agreement, and rates of agreement with MeSH indexing. CiteULike and Connotea were very similar on all of the measurements. In comparison to PubMed, document coverage and per-document metadata density were much lower for the social tagging systems. Inter-annotator agreement within the social tagging systems and the agreement between the aggregated social tagging metadata and MeSH indexing was low though the latter could be increased through voting. The most promising uses of metadata from current academic social tagging repositories will be those that find ways to utilize the novel relationships between users, tags, and documents exposed through these systems. For more traditional kinds of indexing-based applications (such as keyword-based search) to benefit substantially from socially generated metadata in the life sciences, more documents need to be tagged and more tags are needed for each document. These issues may be addressed both by finding ways to attract more users to current systems and by creating new user interfaces that encourage more collectively useful individual tagging behaviour.
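
    One of the metrics above, inter-annotator agreement, can be illustrated with a minimal sketch: the overlap between the tag sets two users assign to the same document, measured here as Jaccard similarity (the study's exact agreement measure may differ).

    ```python
    # Minimal sketch: inter-annotator agreement on the tags two users assign to
    # the same document, measured as Jaccard similarity of the tag sets. The
    # exact metric used in the study may differ; this only illustrates the idea.
    def jaccard(tags_a, tags_b):
        a, b = set(tags_a), set(tags_b)
        if not a and not b:
            return 1.0
        return len(a & b) / len(a | b)

    user1 = ["metagenomics", "sequencing", "review"]
    user2 = ["metagenomics", "bioinformatics"]
    print(f"agreement = {jaccard(user1, user2):.2f}")  # 0.25
    ```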

  5. Effect of Silicon on Activity Coefficients of Platinum in Liquid Fe-Si, With Application to Core Formation

    NASA Technical Reports Server (NTRS)

    Righter, K.; Pando, K.; Danielson, L. R.; Humayun, M.

    2017-01-01

    Earth's core contains approximately 10% of a light element that is likely a combination of S, C, Si, and O, with Si possibly being the most abundant light element. Si dissolved in Fe liquids can have a large effect on the magnitude of the activity coefficients of siderophile elements (SE) in Fe liquids, and thus on the partitioning behavior of those elements between core and mantle. The effect of Si can be small, as for Ni and Co, or large, as for Mo, Ge, Sb, and As. The effect of Si on many siderophile elements is unknown, yet it could be an important, and as yet unquantified, influence on the core-mantle partitioning of SE. Here we report new experiments designed to quantify the effect of Si on the partitioning of Pt (with Re and Ru in progress or planned) between metal and silicate melt. The results will be applied to Earth, for which we have excellent constraints on mantle Pt concentrations.
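
    The effect being quantified is conventionally expressed through an interaction parameter acting on the activity coefficient. A heavily simplified, first-order form is sketched below; the full formalism used in partitioning studies includes higher-order terms that are omitted here.

    ```latex
    % Dilute-solution, first-order sketch of the effect of Si on the activity
    % coefficient of Pt in liquid Fe: gamma_Pt^0 is the value in Si-free Fe
    % liquid and epsilon is the Pt-Si interaction parameter (higher-order terms
    % of the full formalism are omitted).
    \ln\gamma_{\mathrm{Pt}} \;\approx\; \ln\gamma_{\mathrm{Pt}}^{0}
      \;+\; \varepsilon_{\mathrm{Pt}}^{\mathrm{Si}}\, x_{\mathrm{Si}}
    ```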

  6. Geochemical stratigraphy of two regolith cores from the Central Highlands of the moon

    NASA Technical Reports Server (NTRS)

    Korotev, R. L.

    1991-01-01

    High-resolution concentration profiles are presented for 20-22 chemical elements in the under-1-mm grain-size fractions of 60001-7 and 60009/10. Emphasis is placed on the stratigraphic features of the cores, and the new results are compared with those of previous petrographic and geochemical studies. For elements associated with major mineral phases, the variations in concentration in both cores exceed that observed in some 40 samples of surface and trench soils. Most of the variation in lithophile element concentrations at depths of 18 to 21 cm results from the mixing of two components - soil that is relatively mafic and rich in incompatible trace elements (ITEs), and coarse-grained anorthosite. The linearity of mixing lines on two-element concentration plots argues that the relative abundances of the various subcomponents are sufficiently uniform from sample to sample and from region to region in the core that the mixture behaves effectively as a single component. Soils at depths of 52-55 cm exhibit very low concentrations of ITEs.
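
    The two-component mixing relation behind the linear arrays described above can be written explicitly; it is the standard mass-balance form, not a result specific to these cores.

    ```latex
    % Two-component mixing: a mixture containing mass fraction f of component A
    % and (1 - f) of component B has, for every element,
    C_{\mathrm{mix}} = f\,C_{A} + (1 - f)\,C_{B}
    % so on a plot of one element's concentration against another's, all
    % mixtures fall on the straight line joining the two end-member compositions.
    ```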

  7. Managing Sustainable Data Infrastructures: The Gestalt of EOSDIS

    NASA Technical Reports Server (NTRS)

    Behnke, Jeanne; Lowe, Dawn; Lindsay, Francis; Lynnes, Chris; Mitchell, Andrew

    2016-01-01

    EOSDIS epitomizes a System of Systems, whose many varied and distributed parts are integrated into a single, highly functional organized science data system. A distributed architecture was adopted to ensure discipline-specific support for the science data, while also leveraging standards and establishing policies and tools to enable interdisciplinary research and analysis across multiple scientific instruments. EOSDIS is composed of system elements such as geographically distributed archive centers used to manage the stewardship of data. The infrastructure consists of underlying capabilities and connections that enable the primary system elements to function together. For example, one key infrastructure component is the common metadata repository, which enables discovery of all data within the EOSDIS system. EOSDIS employs processes and standards to ensure partners can work together effectively and provide coherent services to users.

  8. The CMS Data Management System

    NASA Astrophysics Data System (ADS)

    Giffels, M.; Guo, Y.; Kuznetsov, V.; Magini, N.; Wildish, T.

    2014-06-01

    The data management elements in CMS are scalable, modular, and designed to work together. The main components are PhEDEx, the data transfer and location system; the Data Bookkeeping Service (DBS), a metadata catalog; and the Data Aggregation Service (DAS), designed to aggregate views and provide them to users and services. Tens of thousands of samples have been cataloged and petabytes of data have been moved since the run began. The modular system has allowed the optimal use of appropriate underlying technologies. In this contribution we discuss the use of both Oracle and NoSQL databases to implement the data management elements as well as the individual architectures chosen. We also discuss how the data management system functioned during the first run, and what improvements are planned in preparation for 2015.

  9. Influence of precipitating light elements on stable stratification below the core/mantle boundary

    NASA Astrophysics Data System (ADS)

    O'Rourke, J. G.; Stevenson, D. J.

    2017-12-01

    Stable stratification below the core/mantle boundary is often invoked to explain anomalously low seismic velocities in this region. Diffusion of light elements like oxygen or, more slowly, silicon could create a stabilizing chemical gradient in the outermost core. Heat flow less than that conducted along the adiabatic gradient may also produce thermal stratification. However, reconciling either origin with the apparent longevity (>3.45 billion years) of Earth's magnetic field remains difficult. Sub-isentropic heat flow would not drive a dynamo by thermal convection before the nucleation of the inner core, which likely occurred less than one billion years ago and did not instantly change the heat flow. Moreover, an oxygen-enriched layer below the core/mantle boundary—the source of thermal buoyancy—could establish double-diffusive convection where motion in the bulk fluid is suppressed below a slowly advancing interface. Here we present new models that explain both stable stratification and a long-lived dynamo by considering ongoing precipitation of magnesium oxide and/or silicon dioxide from the core. Lithophile elements may partition into iron alloys under extreme pressure and temperature during Earth's formation, especially after giant impacts. Modest core/mantle heat flow then drives compositional convection—regardless of thermal conductivity—since their solubility is strongly temperature-dependent. Our models begin with bulk abundances for the mantle and core determined by the redox conditions during accretion. We then track equilibration between the core and a primordial basal magma ocean followed by downward diffusion of light elements. Precipitation begins at a depth that is most sensitive to temperature and oxygen abundance and then creates feedbacks with the radial thermal and chemical profiles. Successful models feature a stable layer with low seismic velocity (which mandates multi-component evolution since a single light element typically increases seismic velocity) growing to its present-day size while allowing enough precipitation to drive compositional convection below. Crucially, this modeling offers unique constraints on Earth's accretion and the light element composition of the core compared to degenerate estimates derived from bulk density and seismic measurements.

  10. Statistical Constraints from Siderophile Elements on Earth's Accretion, Differentiation, and Initial Core Stratification

    NASA Astrophysics Data System (ADS)

    O'Rourke, J. G.; Stevenson, D. J.

    2015-12-01

    Abundances of siderophile elements in the primitive mantle constrain the conditions of Earth's core/mantle differentiation. Core growth occurred as Earth accreted from collisions between planetesimals and larger embryos of unknown original provenance, so geochemistry is directly related to the overall dynamics of Solar System formation. Recent studies claim that only certain conditions of equilibration (pressure, temperature, and oxygen fugacity) during core formation can reproduce the available data. Typical analyses, however, only consider the effects of varying a few out of tens of free parameters in continuous core formation models. Here we describe the Markov chain Monte Carlo method, which simultaneously incorporates the large uncertainties on Earth's composition and the parameterizations that describe elemental partitioning between metal and silicate. This Bayesian technique is vastly more computationally efficient than a simple grid search and is well suited to models of planetary accretion that involve a plethora of variables. In contrast to previous work, we find that analyses of siderophile elements alone cannot yield a unique scenario for Earth's accretion. Our models predict a wide range of possible light element contents for the core, encompassing all combinations permitted by seismology and mineral physics. Specifically, we are agnostic between silicon and oxygen as the dominant light element, and the addition of carbon or sulfur is also permissible but not well constrained. Redox conditions may have remained roughly constant during Earth's accretion or relatively oxygen-rich material could have been incorporated before reduced embryos. Pressures and temperatures of equilibration, likewise, may only increase slowly throughout accretion. Therefore, we do not necessarily expect a thick (>500 km), compositionally stratified layer that is stable against convection to develop at the top of the core of Earth (or, by analogy, Venus). A thinner stable layer might inhibit the initialization of the dynamo.
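
    The core of the approach is a standard random-walk Metropolis sampler. The sketch below shows the algorithm on a generic two-parameter model with a placeholder likelihood; the real application replaces the likelihood with a comparison of predicted and observed siderophile-element abundances over many more parameters.

    ```python
    # Minimal sketch of the Markov chain Monte Carlo approach described above:
    # a random-walk Metropolis sampler over a generic 2-parameter vector
    # (standing in for, e.g., equilibration pressure and temperature). The
    # log-likelihood is a placeholder, not the paper's partitioning model.
    import math
    import random

    def log_likelihood(theta):
        """Placeholder misfit: replace with a comparison of predicted and
        observed siderophile-element abundances."""
        p, t = theta
        return -0.5 * (((p - 40.0) / 10.0) ** 2 + ((t - 3500.0) / 300.0) ** 2)

    def metropolis(n_steps, theta0, step_sizes, seed=0):
        random.seed(seed)
        theta = list(theta0)
        ll = log_likelihood(theta)
        chain = []
        for _ in range(n_steps):
            proposal = [x + random.gauss(0.0, s) for x, s in zip(theta, step_sizes)]
            ll_new = log_likelihood(proposal)
            # Accept with probability min(1, exp(ll_new - ll))
            if ll_new >= ll or random.random() < math.exp(ll_new - ll):
                theta, ll = proposal, ll_new
            chain.append(list(theta))
        return chain

    chain = metropolis(5000, theta0=[30.0, 3000.0], step_sizes=[2.0, 50.0])
    burn_in = chain[1000:]
    mean_p = sum(s[0] for s in burn_in) / len(burn_in)
    print(f"posterior mean pressure ~ {mean_p:.1f} GPa (toy example)")
    ```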

  11. Polished Downhole Transducer Having Improved Signal Coupling

    DOEpatents

    Hall, David R.; Fox, Joe

    2006-03-28

    Apparatus and methods to improve signal coupling in downhole inductive transmission elements to reduce the dispersion of magnetic energy at the tool joints and to provide consistent impedance and contact between transmission elements located along the drill string. A transmission element for transmitting information between downhole tools is disclosed in one embodiment of the invention as including an annular core constructed of a magnetically conductive material. The annular core forms an open channel around its circumference and is configured to form a closed channel by mating with a corresponding annular core along an annular mating surface. The mating surface is polished to provide improved magnetic coupling with the corresponding annular core. An annular conductor is disposed within the open channel.

  12. Master Metadata Repository and Metadata-Management System

    NASA Technical Reports Server (NTRS)

    Armstrong, Edward; Reed, Nate; Zhang, Wen

    2007-01-01

    A master metadata repository (MMR) software system manages the storage and searching of metadata pertaining to data from national and international satellite sources of the Global Ocean Data Assimilation Experiment (GODAE) High Resolution Sea Surface Temperature Pilot Project (GHRSST-PP). These sources produce a total of hundreds of data files daily, each file classified as one of more than ten data products representing global sea-surface temperatures. The MMR is a relational database wherein the metadata are divided into granule-level records [denoted file records (FRs)] for individual satellite files and collection-level records [denoted data set descriptions (DSDs)] that describe metadata common to all the files from a specific data product. FRs and DSDs adhere to the NASA Directory Interchange Format (DIF). The FRs and DSDs are contained in separate subdatabases linked by a common field. The MMR is configured in MySQL database software with custom Practical Extraction and Reporting Language (Perl) programs to validate and ingest the metadata records. The database contents are converted into the Federal Geographic Data Committee (FGDC) standard format by use of the Extensible Markup Language (XML). A Web interface enables users to search for availability of data from all sources.
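
    The two-level layout described above can be sketched as two tables linked by a shared field, with collection-level DSDs on one side and granule-level FRs on the other. SQLite stands in for the MySQL/Perl stack named in the record, and the table and column names are illustrative.

    ```python
    # Minimal sketch of the two-level layout: collection-level DSD records and
    # granule-level FR records linked by a shared field. SQLite stands in for
    # the MySQL/Perl stack named in the record; names are illustrative.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dsd (              -- data set descriptions (collection level)
            dataset_id TEXT PRIMARY KEY,
            title      TEXT,
            producer   TEXT
        );
        CREATE TABLE fr (               -- file records (granule level)
            file_id    INTEGER PRIMARY KEY,
            dataset_id TEXT REFERENCES dsd(dataset_id),  -- the common linking field
            filename   TEXT,
            start_time TEXT
        );
    """)
    conn.execute("INSERT INTO dsd VALUES ('GHRSST-EXAMPLE', 'Example SST product', 'Example DAC')")
    conn.execute("INSERT INTO fr (dataset_id, filename, start_time) VALUES (?, ?, ?)",
                 ("GHRSST-EXAMPLE", "20070101-sst.nc", "2007-01-01T00:00:00Z"))

    for row in conn.execute(
            "SELECT d.title, f.filename FROM fr f JOIN dsd d USING (dataset_id)"):
        print(row)
    ```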

  13. GeoCSV: tabular text formatting for geoscience data

    NASA Astrophysics Data System (ADS)

    Stults, M.; Arko, R. A.; Davis, E.; Ertz, D. J.; Turner, M.; Trabant, C. M.; Valentine, D. W., Jr.; Ahern, T. K.; Carbotte, S. M.; Gurnis, M.; Meertens, C.; Ramamurthy, M. K.; Zaslavsky, I.; McWhirter, J.

    2015-12-01

    The GeoCSV design was developed within the GeoWS project as a way to provide a baseline of compatibility between tabular text data sets from various sub-domains in geoscience. Funded through NSF's EarthCube initiative, the GeoWS project aims to develop common web service interfaces for data access across hydrology, geodesy, seismology, marine geophysics, atmospheric science and other areas. The GeoCSV format is an essential part of delivering data via simple web services for discovery and utilization by both humans and machines. As most geoscience disciplines have developed and use data formats specific to their needs, tabular text data can play a key role as a lowest common denominator useful for exchanging and integrating data across sub-domains. The design starts with a core definition compatible with the best practices described by the W3C CSV on the Web Working Group (CSVW). Compatibility with CSVW is intended to ensure the broadest usability of data expressed as GeoCSV. An optional, simple, but limited metadata description mechanism was added to allow the inclusion of important metadata with comma-separated data while staying within the CSVW definition of a "dialect". The format is designed both for creating new datasets and for annotating data sets already in a tabular text format so that they are compliant with GeoCSV.
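
    The metadata mechanism works by prefixing ordinary CSV with '#'-keyword header lines. The sketch below parses a small GeoCSV-style file; the specific keywords shown are illustrative of the mechanism rather than a complete rendering of the specification.

    ```python
    # Minimal sketch of reading a GeoCSV-style file: '#'-prefixed keyword lines
    # carry the optional metadata header, and the remainder is ordinary CSV.
    # The keywords shown illustrate the mechanism, not the full specification.
    import csv
    import io

    SAMPLE = """\
    # dataset: GeoCSV 2.0
    # delimiter: ,
    # field_unit: ISO_8601, degrees_Celsius
    # field_type: datetime, float
    time,temperature
    2015-01-01T00:00:00Z,3.2
    2015-01-01T01:00:00Z,3.4
    """

    def read_geocsv(text):
        metadata, data_lines = {}, []
        for line in text.splitlines():
            line = line.strip()
            if line.startswith("#"):
                key, _, value = line.lstrip("# ").partition(":")
                metadata[key.strip()] = value.strip()
            elif line:
                data_lines.append(line)
        rows = list(csv.DictReader(io.StringIO("\n".join(data_lines))))
        return metadata, rows

    meta, rows = read_geocsv(SAMPLE)
    print(meta["field_unit"])        # ISO_8601, degrees_Celsius
    print(rows[0]["temperature"])    # 3.2
    ```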

  14. Metabolomics Workbench: An international repository for metabolomics data and metadata, metabolite standards, protocols, tutorials and training, and analysis tools

    PubMed Central

    Sud, Manish; Fahy, Eoin; Cotter, Dawn; Azam, Kenan; Vadivelu, Ilango; Burant, Charles; Edison, Arthur; Fiehn, Oliver; Higashi, Richard; Nair, K. Sreekumaran; Sumner, Susan; Subramaniam, Shankar

    2016-01-01

    The Metabolomics Workbench, available at www.metabolomicsworkbench.org, is a public repository for metabolomics metadata and experimental data spanning various species and experimental platforms, metabolite standards, metabolite structures, protocols, tutorials, and training material and other educational resources. It provides a computational platform to integrate, analyze, track, deposit and disseminate large volumes of heterogeneous data from a wide variety of metabolomics studies including mass spectrometry (MS) and nuclear magnetic resonance spectrometry (NMR) data spanning over 20 different species covering all the major taxonomic categories including humans and other mammals, plants, insects, invertebrates and microorganisms. Additionally, a number of protocols are provided for a range of metabolite classes, sample types, and both MS- and NMR-based studies, along with a metabolite structure database. The metabolites characterized in the studies available on the Metabolomics Workbench are linked to chemical structures in the metabolite structure database to facilitate comparative analysis across studies. The Metabolomics Workbench, part of the data coordinating effort of the National Institutes of Health (NIH) Common Fund's Metabolomics Program, provides data from the Common Fund's Metabolomics Resource Cores, metabolite standards, and analysis tools to the wider metabolomics community and seeks data depositions from metabolomics researchers across the world. PMID:26467476

  15. Siderophile Volatile Element Partitioning during Core Formation.

    NASA Astrophysics Data System (ADS)

    Loroch, D. C.; Hackler, S.; Rohrbach, A.; Klemme, S.

    2017-12-01

    It has been known since the 1960s that the Earth's mantle is depleted in numerous elements relative to CI chondrite as a result of accretion and core-mantle differentiation. Additionally, if we take the chondritic composition as the initial solar nebula element abundances, the Earth lacks 85% of its K and up to 98% of other volatiles. However, one potentially very important group of elements, the siderophile but volatile elements (SVEs), has received considerably less attention in this context. The SVEs may provide important information regarding the timing of volatile delivery to Earth. Especially for the SVEs, the partitioning between metal melt and silicate melt (Dmetal/silicate) at core formation conditions is poorly constrained; nevertheless, these values are very important for most core formation models. This study is producing new metal-silicate partitioning data for a wide range of SVEs (S, Se, Te, Tl, Ag, As, Au, Cd, Bi, Pb, Sn, Cu, Ge, Zn, In and Ga) with a focus on the P, T and fO2 dependencies. The initial hypothesis that we aim to test invokes the accretion of major portions of the volatile elements while core formation was still active. The key points of this study are: What are the effects of P, T and fO2 on SVE metal-silicate partitioning? What is the effect of compositional complexity on SVE metal-silicate partitioning? How can SVE D-values fit into current models of core formation? The partitioning experiments are performed using a Walker-type multi-anvil apparatus in a pressure range between 10 and 20 GPa and temperatures of 1700 up to 2100 °C. To determine the Dmetal/silicate values we use a field-emission high-resolution JEOL JXA-8530F EPMA for major elements and a Photon Machines Analyte G2 excimer laser (193 nm) ablation system coupled to a Thermo Fisher Element 2 single-collector ICP-MS (LA-ICP-MS) for trace elements. We recently finished the first sets of experiments and can provide the corresponding datasets. Based on the general understanding of Dmetal/silicate values, we expect them to depend on composition, in this particular case on the sulfur and carbon content of the core-forming metal, and on the redox conditions. The major goal, however, is to derive a model of core formation on Earth that includes and also explains the SVEs.
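
    For reference, the partition coefficient referred to throughout is simply a concentration ratio between the coexisting quenched melts:

    ```latex
    % The metal/silicate partition coefficient: the ratio of an element's
    % concentration in the quenched metallic melt to that in the coexisting
    % silicate melt (concentrations by weight, measured here by EPMA and
    % LA-ICP-MS).
    D_i^{\,\mathrm{metal/silicate}} = \frac{c_i^{\,\mathrm{metal}}}{c_i^{\,\mathrm{silicate}}}
    ```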

  16. Sediment data collected in 2014 from Barnegat Bay, New Jersey

    USGS Publications Warehouse

    Bernier, Julie C.; Stalk, Chelsea, A.; Kelso, Kyle W.; Miselis, Jennifer L.; Tunstead, Rob

    2016-05-23

    In response to the 2010 Governor’s Action Plan to clean up the Barnegat Bay–Little Egg Harbor (BBLEH) estuary in New Jersey, the U.S. Geological Survey (USGS) partnered with the New Jersey Department of Environmental Protection in 2011 to begin a multidisciplinary research project to understand the physical controls on water quality in the bay. Between 2011 and 2013, USGS scientists mapped the geological and morphological characteristics of the seafloor of the BBLEH estuary using a suite of geophysical tools. However, this mapping effort included only surficial characterization of bay sediments; to verify the sub-surface geophysical data, sediment cores were required. This report serves as an archive of sedimentologic data from 18 vibracores collected from Barnegat Bay between May and August of 2014 by the U.S. Department of Agriculture Natural Resources Conservation Service (NRCS) on behalf of the USGS. The vibracores were collected in conjunction with an ongoing NRCS subaqueous soil survey for the BBLEH estuary. The data presented in this report, including descriptive core logs, core photographs, processed grain-size data, and Geographic Information System (GIS) data files with accompanying formal Federal Geographic Data Committee metadata, can be viewed or downloaded from the Data Products and Downloads page.

  17. Brady's Geothermal Field Nodal Seismometers Metadata

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lesley Parker

    Metadata for the nodal seismometer array deployed at the POROTOMO's Natural Laboratory in Brady Hot Spring, Nevada during the March 2016 testing. Metadata includes location and timing for each instrument as well as file lists of data to be uploaded in a separate submission.

  18. The HTA core model: a novel method for producing and reporting health technology assessments.

    PubMed

    Lampe, Kristian; Mäkelä, Marjukka; Garrido, Marcial Velasco; Anttila, Heidi; Autti-Rämö, Ilona; Hicks, Nicholas J; Hofmann, Björn; Koivisto, Juha; Kunz, Regina; Kärki, Pia; Malmivaara, Antti; Meiesaar, Kersti; Reiman-Möttönen, Päivi; Norderhaug, Inger; Pasternack, Iris; Ruano-Ravina, Alberto; Räsänen, Pirjo; Saalasti-Koskinen, Ulla; Saarni, Samuli I; Walin, Laura; Kristensen, Finn Børlum

    2009-12-01

    The aim of this study was to develop and test a generic framework to enable international collaboration for producing and sharing results of health technology assessments (HTAs). Ten international teams constructed the HTA Core Model, dividing information contained in a comprehensive HTA into standardized pieces, the assessment elements. Each element contains a generic issue that is translated into practical research questions while performing an assessment. Elements were described in detail in element cards. Two pilot assessments, designated as Core HTAs were also produced. The Model and Core HTAs were both validated. Guidance on the use of the HTA Core Model was compiled into a Handbook. The HTA Core Model considers health technologies through nine domains. Two applications of the Model were developed, one for medical and surgical interventions and another for diagnostic technologies. Two Core HTAs were produced in parallel with developing the model, providing the first real-life testing of the Model and input for further development. The results of formal validation and public feedback were primarily positive. Development needs were also identified and considered. An online Handbook is available. The HTA Core Model is a novel approach to HTA. It enables effective international production and sharing of HTA results in a structured format. The face validity of the Model was confirmed during the project, but further testing and refining are needed to ensure optimal usefulness and user-friendliness. Core HTAs are intended to serve as a basis for local HTA reports. Core HTAs do not contain recommendations on technology use.

  19. MODERATOR ELEMENTS FOR UNIFORM POWER NUCLEAR REACTOR

    DOEpatents

    Balent, R.

    1963-03-12

    This patent describes a method of obtaining a flatter flux and more uniform power generation across the core of a nuclear reactor. The method comprises using moderator elements having differing moderating strength. The elements have an increasing amount of the better moderating material as a function of radial and/or axial distance from the reactor core center. (AEC)

  20. The RBV metadata catalog

    NASA Astrophysics Data System (ADS)

    Andre, Francois; Fleury, Laurence; Gaillardet, Jerome; Nord, Guillaume

    2015-04-01

    RBV (Réseau des Bassins Versants) is a French initiative to consolidate the national efforts made by more than 15 elementary observatories funded by various research institutions (CNRS, INRA, IRD, IRSTEA, Universities) that study river and drainage basins. The RBV Metadata Catalogue aims at giving a unified vision of the work produced by every observatory to both the members of the RBV network and any external person interested in this domain of research. Another goal is to share this information with other existing metadata portals. Metadata management is heterogeneous among observatories, ranging from absence to mature harvestable catalogues. Here, we explain the strategy used to design a state-of-the-art catalogue facing this situation. Main features are as follows:
    - Multiple input methods: metadata records in the catalogue can either be entered with the graphical user interface, harvested from an existing catalogue or imported from an information system through simplified web services.
    - Hierarchical levels: metadata records may describe either an observatory, one of its experimental sites or a single dataset produced by one instrument.
    - Multilingualism: metadata can be easily entered in several configurable languages.
    - Compliance to standards: the back-office part of the catalogue is based on a CSW metadata server (Geosource), which ensures ISO 19115 compatibility and the ability to be harvested (globally or partially). Ongoing tasks focus on the use of SKOS thesauri and SensorML descriptions of the sensors.
    - Ergonomy: the user interface is built with the GWT framework to offer a rich client application with fully ajaxified navigation.
    - Source code sharing: the work has led to the development of reusable components which can be used to quickly create new metadata forms in other GWT applications.
    You can visit the catalogue (http://portailrbv.sedoo.fr/) or contact us by email at rbv@sedoo.fr.
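
    Because the back office is a CSW server, the catalogue can be harvested with a standard GetRecords request. The sketch below issues one as an HTTP GET; the endpoint path is a placeholder and the parameters are the usual CSW 2.0.2 baseline, so a production harvester would also page through results and parse the returned XML.

    ```python
    # Minimal sketch of harvesting a CSW catalogue: a GetRecords request issued
    # as an HTTP GET with key-value parameters. The endpoint path is a
    # placeholder; a production harvester would page through results and parse
    # the csw:GetRecordsResponse XML rather than print it.
    import requests

    CSW_ENDPOINT = "https://portailrbv.sedoo.fr/csw"  # placeholder endpoint path

    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "elementSetName": "brief",
        "resultType": "results",
        "maxRecords": 10,
    }

    response = requests.get(CSW_ENDPOINT, params=params, timeout=30)
    response.raise_for_status()
    print(response.text[:500])  # first part of the response XML
    ```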

  1. OntoStudyEdit: a new approach for ontology-based representation and management of metadata in clinical and epidemiological research.

    PubMed

    Uciteli, Alexandr; Herre, Heinrich

    2015-01-01

    The specification of metadata in clinical and epidemiological study projects entails significant expense. The validity and quality of the collected data depend heavily on the precise and semantically correct representation of their metadata. In the various research organizations that plan and coordinate studies, the required metadata are specified differently, depending on many conditions, e.g., on the study management software used. The latter does not always meet the needs of a particular research organization, e.g., with respect to the relevant metadata attributes and structuring possibilities. The objective of the research set forth in this paper is the development of a new approach for ontology-based representation and management of metadata. The basic features of this approach are demonstrated by the software tool OntoStudyEdit (OSE). The OSE is designed and developed according to the three-ontology method. This method for developing software is based on the interactions of three different kinds of ontologies: a task ontology, a domain ontology and a top-level ontology. The OSE can be easily adapted to different requirements, and it supports an ontologically founded representation and efficient management of metadata. The metadata specifications can be imported from various sources; they can be edited with the OSE, and they can be exported in/to several formats, which are used, e.g., by different study management software. Advantages of this approach are the adaptability of the OSE by integrating suitable domain ontologies, the ontological specification of mappings between the import/export formats and the domain ontology, the specification of the study metadata in a uniform manner and its reuse in different research projects, and an intuitive data entry for non-expert users.

  2. EarthCube Data Discovery Hub: Enhancing, Curating and Finding Data across Multiple Geoscience Data Sources.

    NASA Astrophysics Data System (ADS)

    Zaslavsky, I.; Valentine, D.; Richard, S. M.; Gupta, A.; Meier, O.; Peucker-Ehrenbrink, B.; Hudman, G.; Stocks, K. I.; Hsu, L.; Whitenack, T.; Grethe, J. S.; Ozyurt, I. B.

    2017-12-01

    EarthCube Data Discovery Hub (DDH) is an EarthCube Building Block project using technologies developed in CINERGI (Community Inventory of EarthCube Resources for Geoscience Interoperability) to enable geoscience users to explore a growing portfolio of EarthCube-created and other geoscience-related resources. Over 1 million metadata records are available for discovery through the project portal (cinergi.sdsc.edu). These records are retrieved from data facilities, including federal, state and academic sources, or contributed by geoscientists through workshops, surveys, or other channels. CINERGI metadata augmentation pipeline components 1) provide semantic enhancement based on a large ontology of geoscience terms, using text analytics to generate keywords with references to ontology classes, 2) add spatial extents based on place names found in the metadata record, and 3) add organization identifiers to the metadata. The records are indexed and can be searched via a web portal and standard search APIs. The added metadata content improves discoverability and interoperability of the registered resources. Specifically, the addition of ontology-anchored keywords enables faceted browsing and lets users navigate to datasets related by variables measured, equipment used, science domain, processes described, geospatial features studied, and other dataset characteristics that are generated by the pipeline. DDH also lets data curators access and edit the automatically generated metadata records using the CINERGI metadata editor, accept or reject the enhanced metadata content, and consider it in updating their metadata descriptions. We consider several complex data discovery workflows, in environmental seismology (quantifying sediment and water fluxes using seismic data), marine biology (determining available temperature, location, weather and bleaching characteristics of coral reefs related to measurements in a given coral reef survey), and river geochemistry (discovering observations relevant to geochemical measurements outside the tidal zone, given specific discharge conditions).
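
    A minimal sketch of the keyword-enhancement step is shown below: scan a record's free text for labels from a controlled vocabulary and attach the matches as keywords. The tiny vocabulary and class identifiers are hypothetical; the real pipeline resolves matches against a large geoscience ontology and also handles spatial extents and organization identifiers.

    ```python
    # Minimal sketch of semantic enhancement: scan the free text of a metadata
    # record for labels from a controlled vocabulary and attach the matches as
    # keywords. The vocabulary and class IDs here are hypothetical stand-ins
    # for the large geoscience ontology used by the real pipeline.
    import re

    VOCABULARY = {                      # label -> (hypothetical) ontology class ID
        "sea surface temperature": "EX:0001",
        "coral reef": "EX:0002",
        "seismic": "EX:0003",
    }

    def enhance_keywords(record_text):
        """Return ontology-anchored keywords found in a record's free text."""
        found = []
        for label, class_id in VOCABULARY.items():
            if re.search(r"\b" + re.escape(label) + r"\b", record_text, re.IGNORECASE):
                found.append({"keyword": label, "class": class_id})
        return found

    abstract = "Weekly sea surface temperature and bleaching observations for a coral reef survey."
    print(enhance_keywords(abstract))
    ```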

  3. A new method for geochemical characterization of atmospheric mineral dust from polar ice cores: preliminary results from Talos Dome ice core (East Antarctica, Pacific-Ross Sea sector)

    NASA Astrophysics Data System (ADS)

    Baccolo, Giovanni; Delmonte, Barbara; Clemenza, Massimiliano; Previtali, Ezio; Maggi, Valter

    2015-04-01

    Assessing the elemental composition of atmospheric dust entrapped in polar ice cores is important for the identification of the potential dust sources and thus for the reconstruction of past atmospheric circulation at local, regional and global scales. Accurate determination of major and trace elements in the insoluble fraction of dust extracted from ice cores is also useful for better understanding some geochemical and biogeochemical mechanisms which are linked with the climate system. The extremely reduced concentration of dust in polar ice (typical Antarctic concentrations during interglacials are in the range of 10 ppb), the limited availability of such samples and the high risk of contamination make these analyses a challenge. A new method based on low-background Instrumental Neutron Activation Analysis (INAA) was specifically developed for this kind of sample. The method allows the determination of the concentration of up to 35 elements in extremely reduced dust samples (20-30 μg). These elements span from major to trace and ultra-trace elements. Preliminary results from the TALDICE (TALos Dome Ice CorE, East Antarctica, Pacific-Ross Sea Sector) ice core are presented along with results from potential source areas in Victoria Land. A set of 5 samples from Talos Dome, corresponding to the last termination, MIS3, MIS4 and MIS6, was prepared and analyzed by INAA.

  4. Insights into Mercury's interior structure from geodesy measurements and global contraction

    NASA Astrophysics Data System (ADS)

    Rivoldini, A.; Van Hoolst, T.

    2014-04-01

    The measurements of the gravitational field of Mercury by MESSENGER [6] and improved measurements of the spin state of Mercury [3] provide important insights into its interior structure. In particular, these data give strong constraints on the radius and density of Mercury's core [5, 2]. However, present geodesy data do not provide strong constraints on the radius of the inner core. The data allow for models ranging from a fully molten core to models with an inner core radius smaller than about 1760 km [5], if it is assumed that sulfur is the only light element in the core. Models without an inner core are, however, at odds with the observed internally generated magnetic field of Mercury, since Mercury's dynamo cannot operate by secular cooling alone at present. The present radius of the inner core depends mainly on Mercury's thermal state and on the light elements inside the core. Because of the secular cooling of the planet, the temperature inside the core drops below the liquidus temperature of the core material somewhere in the core, leading to the formation of an inner core and to the global contraction of the planet. The amount of contraction depends on the temperature decrease, on the thermal expansion of the materials inside the planet, and on the volume of crystallized liquid core alloy. In this study we use geodesy data, the recent estimate of the radial contraction of Mercury [1], and thermo-chemical evolution calculations in order to improve our knowledge of Mercury's inner core radius and thermal state. Since data from remote sensing of Mercury's surface [4] indicate that Mercury formed under reducing conditions, we consider models that have sulfur and silicon as light elements in their core. Unlike sulfur, which hardly partitions into solid iron under Mercury's core pressure and temperature conditions, silicon partitions virtually equally between solid and liquid iron. As a consequence, the density difference between the liquid and the crystallized material is smaller than for sulfur as the only light element inside the core, and therefore, for a given inner core radius, the contraction of the planet is likely smaller.

  5. Plastic deformation of FeSi at high pressures: implications for planetary cores

    NASA Astrophysics Data System (ADS)

    Kupenko, Ilya; Merkel, Sébastien; Achorner, Melissa; Plückthun, Christian; Liermann, Hanns-Peter; Sanchez-Valle, Carmen

    2017-04-01

    The cores of terrestrial planets are mostly composed of an Fe-Ni alloy, but they must additionally contain some light element(s) to explain the observed core densities. Silicon has long been considered a likely candidate on geochemical and cosmochemical grounds: the Mg/Si and Fe/Si ratios of the Earth do not match those of the chondrites. Since silicon preferentially partitions into iron-nickel metal, hosting the 'missing' silicon in the core would solve this problem. Moreover, the evidence for present-day (e.g. Mercury) or ancient (e.g. Mars) magnetic fields on terrestrial planets is a good indicator of (at least partially) liquid cores. The estimated temperature profiles of these planets, however, lie below the iron melting curve. The addition of light elements to their metallic cores could lower the core-alloy melting temperature and hence allow the generation of a magnetic field. Although the effect of light elements on the stability and elasticity of Fe-Ni alloys has been widely investigated, their effect on the plasticity of core materials remains largely unknown. Yet, this information is crucial for understanding how planetary cores deform. Here we investigate the plastic deformation of ɛ-FeSi up to 50 GPa at room temperature employing radial X-ray diffraction in diamond anvil cells. The stoichiometric FeSi endmember is a good first-order approximation of the Fe-FeSi system and a good starting material to develop new experimental perspectives. In this work, we focused on the low-pressure polymorph of FeSi that would be the stable phase in the cores of small terrestrial planets. We will present the analysis of the measured data and discuss its potential application to constraining plastic deformation in planetary cores.

  6. The lunar core and the origin of the moon

    NASA Astrophysics Data System (ADS)

    Newsom, H. E.

    1984-05-01

    The results of recent analyses of the concentrations of the refractory siderophile elements molybdenum and rhenium in lunar rock samples suggest that most siderophile elements in lunar crustal rocks and mare basalts are significantly less concentrated than in the Earth's mantle and much less concentrated than in chondritic meteorites. The depletion of siderophile elements in the samples implies the existence of a metal core, and the amount of metal in the core is directly related to the conditions under which segregation occurred. The consequences of the data are discussed in terms of three theoretical models of lunar evolution: a terrestrial origin model; a terrestrial origin model which takes metal segregation into account; and an independent origin model. It is shown that less metal is needed for a terrestrial origin because the Earth's mantle was already partially depleted in siderophile elements owing to the formation of the Earth's core.

  7. NEUTRONIC REACTOR FUEL ELEMENT AND CORE SYSTEM

    DOEpatents

    Moore, W.T.

    1958-09-01

    This patent relates to neutronic reactors and in particular to an improved fuel element and a novel reactor core system for facilitating removal of contaminating fission products, as they are formed, from association with the fissionable fuel, so as to mitigate the interfering effects of such fission products during reactor operation. The fuel elements are comprised of tubular members impervious to fluid and containing on their interior surfaces a thin layer of fissionable material providing a central void. The core structure is comprised of a plurality of the tubular fuel elements arranged in parallel and a closed manifold connected to their ends. In the reactor the core structure is dispersed in a water moderator and coolant within a pressure vessel, and a means connected to said manifold is provided for withdrawing and disposing of mobile fission product contamination from the interior of the fuel tubes and manifold.

  8. Simulation on reactor TRIGA Puspati core kinetics fueled with thorium (Th) based fuel element

    NASA Astrophysics Data System (ADS)

    Mohammed, Abdul Aziz; Pauzi, Anas Muhamad; Rahman, Shaik Mohmmed Haikhal Abdul; Zin, Muhamad Rawi Muhammad; Jamro, Rafhayudi; Idris, Faridah Mohamad

    2016-01-01

    In confronting global energy requirements and the search for better technologies, there is a real case for widening the range of potential variations in the design of nuclear power plants. Smaller and simpler reactors are attractive, provided they can meet safety and security standards and address non-proliferation concerns. On the fuel cycle side, thorium fuel cycles produce much less plutonium and fewer other radioactive transuranic elements than uranium fuel cycles. Although not fissile itself, Th-232 will absorb slow neutrons to produce uranium-233 (233U), which is fissile. By introducing thorium, the number of highly enriched uranium fuel elements can be reduced while maintaining the core neutronic performance. This paper describes the core kinetics of a small TRIGA-type research reactor core fueled with a Th-filled fuel element matrix, simulated using the general-purpose Monte Carlo N-Particle (MCNP) code.
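
    The thorium-to-fissile conversion referred to above proceeds by a neutron capture followed by two beta decays; the chain (standard nuclear data, not taken from the paper) is:

```latex
{}^{232}\mathrm{Th}(n,\gamma)\,{}^{233}\mathrm{Th}
\;\xrightarrow{\ \beta^-,\ T_{1/2}\,\approx\,22\ \mathrm{min}\ }\;
{}^{233}\mathrm{Pa}
\;\xrightarrow{\ \beta^-,\ T_{1/2}\,\approx\,27\ \mathrm{d}\ }\;
{}^{233}\mathrm{U}
```

    The bred 233U is the fissile isotope that allows thorium-bearing fuel elements to substitute for part of the highly enriched uranium loading.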

  9. Simulation on reactor TRIGA Puspati core kinetics fueled with thorium (Th) based fuel element

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mohammed, Abdul Aziz, E-mail: azizM@uniten.edu.my; Rahman, Shaik Mohmmed Haikhal Abdul; Pauzi, Anas Muhamad, E-mail: anas@uniten.edu.my

    2016-01-22

    In confronting global energy requirements and the search for better technologies, there is a real case for widening the range of potential variations in the design of nuclear power plants. Smaller and simpler reactors are attractive, provided they can meet safety and security standards and address non-proliferation concerns. On the fuel cycle side, thorium fuel cycles produce much less plutonium and fewer other radioactive transuranic elements than uranium fuel cycles. Although not fissile itself, Th-232 will absorb slow neutrons to produce uranium-233 (233U), which is fissile. By introducing thorium, the number of highly enriched uranium fuel elements can be reduced while maintaining the core neutronic performance. This paper describes the core kinetics of a small TRIGA-type research reactor core fueled with a Th-filled fuel element matrix, simulated using the general-purpose Monte Carlo N-Particle (MCNP) code.

  10. Constraints on the coupled thermal evolution of the Earth's core and mantle, the age of the inner core, and the origin of the 186Os/188Os “core signal” in plume-derived lavas

    NASA Astrophysics Data System (ADS)

    Lassiter, J. C.

    2006-10-01

    The possibility that some mantle plumes may carry a geochemical signature of core/mantle interaction has rightly generated considerable interest and attention in recent years. Correlated 186Os-187Os enrichments in some plume-derived lavas (Hawaii, Gorgona, Kostomuksha) have been interpreted as deriving from an outer core with elevated Pt/Os and Re/Os ratios due to the solidification of the Earth's inner core (cf. [A.D. Brandon, R.J. Walker, The debate over core-mantle interaction, Earth Planet. Sci. Lett. 232 (2005) 211-225] and references therein). Conclusive identification of a "core signal" in plume-derived lavas would profoundly influence our understanding of mantle convection and evolution. This paper reevaluates the Os-isotope evidence for core/mantle interaction by examining other geochemical constraints on core/mantle interaction, geophysical constraints on the thermal evolution of the outer core, and geochemical and cosmochemical constraints on the abundance of heat-producing elements in the core. Additional study of metal/silicate and sulfide/silicate partitioning of K, Pb, and other trace elements is needed to more tightly constrain the likely starting composition of the Earth's core. However, available data suggest that the observed 186Os enrichments in Hawaiian and other plume-derived lavas are unlikely to derive from core/mantle interaction. 1) Core/mantle interaction sufficient to produce the observed 186Os enrichments would likely have significant effects on other tracers such as Pb- and W-isotopes that are not observed. 2) Significant partitioning of K or other heat-producing elements into the core would produce a "core depletion" pattern in the Silicate Earth very different from that observed. 3) In the absence of heat-producing elements in the core, core/mantle heat flow of ~6-15 TW estimated from several independent geophysical constraints suggests an inner core age (< ~2.5 Ga) too young for the outer core to have developed a significant 186Os enrichment. Core/mantle thermal and chemical interaction remains an important problem that warrants future research. However, Os-isotopes may have only limited utility in this area due to the relatively young age of the Earth's inner core.

  11. Sedimentologic characteristics of recent washover deposits from Assateague Island, Maryland

    USGS Publications Warehouse

    Bernier, Julie C.; Zaremba, Nicholas J.; Wheaton, Cathryn J.; Ellis, Alisha M.; Marot, Marci E.; Smith, Christopher G.

    2016-06-08

    This report describes sediment data collected using sand augers in active overwash zones on Assateague Island in Maryland. Samples were collected by the U.S. Geological Survey (USGS) during two surveys in March/April and October 2014 (USGS Field Activity Numbers [FAN] 2014-301-FA and 2014-322-FA, respectively). The physical characteristics (for example, sediment texture or bedding structure) of and spatial differences among these deposits will provide information about overwash processes and sediment transport from the sandy barrier-island reaches to the back-barrier environments. Metrics derived from these data, such as mean grain size or deposit thicknesses, can be used to ground-truth remote sensing and geophysical data and can also be incorporated into sediment transport models. Data products, including sample location tables, descriptive core logs, core photographs and x-radiographs, the results of sediment grain-size analyses, and Geographic Information System (GIS) data files with accompanying formal Federal Geographic Data Committee (FGDC) metadata can be downloaded from the Data Downloads page.

  12. 75 FR 4689 - Electronic Tariff Filings

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-01-29

    ... collaborative process relies upon the use of metadata (or information) about the tariff filing, including such... code.\\5\\ Because the Commission is using the electronic metadata to establish statutory action dates... code, as well as accurately providing any other metadata. 6. Similarly, the Commission will be using...

  13. The center for expanded data annotation and retrieval

    PubMed Central

    Bean, Carol A; Cheung, Kei-Hoi; Dumontier, Michel; Durante, Kim A; Gevaert, Olivier; Gonzalez-Beltran, Alejandra; Khatri, Purvesh; Kleinstein, Steven H; O’Connor, Martin J; Pouliot, Yannick; Rocca-Serra, Philippe; Sansone, Susanna-Assunta; Wiser, Jeffrey A

    2015-01-01

    The Center for Expanded Data Annotation and Retrieval is studying the creation of comprehensive and expressive metadata for biomedical datasets to facilitate data discovery, data interpretation, and data reuse. We take advantage of emerging community-based standard templates for describing different kinds of biomedical datasets, and we investigate the use of computational techniques to help investigators to assemble templates and to fill in their values. We are creating a repository of metadata from which we plan to identify metadata patterns that will drive predictive data entry when filling in metadata templates. The metadata repository not only will capture annotations specified when experimental datasets are initially created, but also will incorporate links to the published literature, including secondary analyses and possible refinements or retractions of experimental interpretations. By working initially with the Human Immunology Project Consortium and the developers of the ImmPort data repository, we are developing and evaluating an end-to-end solution to the problems of metadata authoring and management that will generalize to other data-management environments. PMID:26112029
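
    As an illustration of the template-driven, predictive metadata entry described above, the sketch below ranks candidate values for an empty field by how often they co-occur with what has already been entered. The template, field names and prior records are all hypothetical and do not reproduce the Center's actual templates or repository schema.

```python
from collections import Counter

# A hypothetical, minimal metadata template: field name -> controlled values (None = free text).
TEMPLATE = {
    "organism": ["Homo sapiens", "Mus musculus"],
    "assay_type": ["flow cytometry", "RNA-seq"],
    "tissue": None,
}

# Previously authored records in a repository, used to drive predictive data entry.
PRIOR_RECORDS = [
    {"organism": "Homo sapiens", "assay_type": "flow cytometry", "tissue": "PBMC"},
    {"organism": "Homo sapiens", "assay_type": "RNA-seq", "tissue": "PBMC"},
    {"organism": "Mus musculus", "assay_type": "flow cytometry", "tissue": "spleen"},
]

def suggest(field_name, partial_record):
    """Rank candidate values for `field_name` by how often they co-occur with
    the values already entered in `partial_record`."""
    counts = Counter()
    for record in PRIOR_RECORDS:
        if all(record.get(k) == v for k, v in partial_record.items()):
            counts[record[field_name]] += 1
    return [value for value, _ in counts.most_common()]

# A curator filling in the template has chosen the organism; suggest the
# remaining fields from patterns found in the repository.
partial = {"organism": "Homo sapiens"}
for field_name in TEMPLATE:
    if field_name not in partial:
        print(field_name, "->", suggest(field_name, partial))
# assay_type -> ['flow cytometry', 'RNA-seq']
# tissue -> ['PBMC']
```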

  14. Establishing semantic interoperability of biomedical metadata registries using extended semantic relationships.

    PubMed

    Park, Yu Rang; Yoon, Young Jo; Kim, Hye Hyeon; Kim, Ju Han

    2013-01-01

    Achieving semantic interoperability is critical for biomedical data sharing between individuals, organizations and systems. The ISO/IEC 11179 MetaData Registry (MDR) standard has been recognized as one of the solutions for this purpose. The standard model, however, is limited: concepts that consist of two or more values, such as blood pressure with its systolic and diastolic components, cannot be represented. We addressed the structural limitations of ISO/IEC 11179 with an integrated metadata object model in our previous research. In the present study, we introduce semantic extensions for the model by defining three new types of semantic relationships: dependency, composite and variable relationships. To evaluate our extensions in a real-world setting, we measured the efficiency of metadata reduction achieved by mapping elements to existing ones. We extracted metadata from the College of American Pathologists Cancer Protocols and then evaluated our extensions. With no semantic loss, one third of the extracted metadata could be eliminated, suggesting a better strategy for implementing clinical MDRs with improved efficiency and utility.
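
    A composite relationship of the kind introduced above can be pictured as a data structure that registers one compound concept over several existing data elements. The sketch below is illustrative only; the class and field names are invented and do not reproduce the ISO/IEC 11179 metamodel or the authors' extended model.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DataElement:
    """A single registered data element, in the spirit of an MDR entry."""
    name: str
    datatype: str
    unit: str = ""

@dataclass
class CompositeElement:
    """Groups several data elements into one compound concept via a
    'composite' relationship, so the group can be registered and reused
    as a single unit."""
    name: str
    components: List[DataElement] = field(default_factory=list)

# Blood pressure as a composite of systolic and diastolic values.
blood_pressure = CompositeElement(
    name="Blood pressure",
    components=[
        DataElement("Systolic blood pressure", "integer", "mmHg"),
        DataElement("Diastolic blood pressure", "integer", "mmHg"),
    ],
)
print([c.name for c in blood_pressure.components])
```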

  15. Separation of metadata and pixel data to speed DICOM tag morphing.

    PubMed

    Ismail, Mahmoud; Philbin, James

    2013-01-01

    The DICOM information model combines pixel data and metadata in a single DICOM object, so it is not possible to access the metadata separately from the pixel data. There are, however, use cases in which only the metadata is accessed, and the current DICOM object format increases their running time. Tag morphing is one of these use cases: it covers deletion, insertion or manipulation of one or more metadata attributes. It is typically used for order reconciliation on study acquisition, or to localize the issuer of patient ID (IPID) and patient ID attributes when data from one domain are transferred to a different domain. In this work, we propose using Multi-Series DICOM (MSD) objects, which separate metadata from pixel data and remove duplicate attributes, to reduce the time required for tag morphing. The time required to update a set of study attributes in each format is compared. The results show that the MSD format significantly reduces the time required for tag morphing.
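
    For context, a typical tag-morphing step against the conventional single-object format might look like the following sketch using the pydicom library. This is a baseline illustration, not the authors' MSD implementation; the file paths and attribute values are hypothetical.

```python
import pydicom

# Read the full object: metadata and pixel data travel together, so even a
# metadata-only edit pays the cost of parsing and rewriting the bulk pixel data.
ds = pydicom.dcmread("study/img0001.dcm")   # hypothetical path

# Tag morphing: manipulate, insert and delete metadata attributes.
ds.PatientID = "SITE-B-000123"              # manipulate an existing attribute
ds.IssuerOfPatientID = "SITE-B"             # insert an attribute
if "InstitutionName" in ds:
    del ds.InstitutionName                  # delete an attribute

# Writing back re-serializes the whole object, pixel data included.
ds.save_as("study/img0001_morphed.dcm")
```

    Because metadata and bulk pixel data live in one object, the whole file is rewritten even though only a few attributes changed; keeping metadata separate, as the MSD format does, targets exactly that overhead.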

  16. Do Community Recommendations Improve Metadata?

    NASA Astrophysics Data System (ADS)

    Gordon, S.; Habermann, T.; Jones, M. B.; Leinfelder, B.; Mecum, B.; Powers, L. A.; Slaughter, P.

    2016-12-01

    Complete documentation of scientific data is the surest way to facilitate discovery and reuse. What is complete metadata? There are many metadata recommendations from communities like the OGC, FGDC, NASA, and LTER that can provide data documentation guidance for discovery, access, use and understanding. Often, the recommendations that communities develop are for a particular metadata dialect. Two examples of this are the LTER Completeness recommendation for EML and the FGDC Data Discovery recommendation for CSDGM. Can community adoption of a recommendation ensure that what is included in the metadata is understandable to the scientific community and beyond? By applying quantitative analysis to different LTER and USGS metadata collections in DataOne and ScienceBase, we show that community recommendations can improve the completeness of collections over time. Additionally, by comparing communities in DataOne that use the EML and CSDGM dialects but have not adopted the recommendations with communities that have, the positive effects of recommendation adoption on documentation completeness can be measured.
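
    The completeness measurement described above reduces, at its simplest, to counting which recommended fields each record fills in. The sketch below uses invented field names and records rather than the actual LTER or FGDC recommendations.

```python
# Fields a (hypothetical) discovery recommendation asks for.
RECOMMENDED_FIELDS = {"title", "abstract", "keywords", "bounding_box", "contact", "license"}

records = [
    {"title": "Soil moisture 2014", "abstract": "Daily means ...", "keywords": ["soil"]},
    {"title": "Stream chemistry", "abstract": "Weekly grabs ...", "keywords": ["nitrate"],
     "bounding_box": [-105.6, 40.0, -105.2, 40.3], "contact": "data@example.org"},
]

def completeness(record):
    """Fraction of recommended fields that are present and non-empty."""
    filled = sum(1 for f in RECOMMENDED_FIELDS if record.get(f))
    return filled / len(RECOMMENDED_FIELDS)

scores = [completeness(r) for r in records]
print([f"{s:.0%}" for s in scores])                      # per-record scores, e.g. ['50%', '83%']
print(f"collection mean: {sum(scores)/len(scores):.0%}") # track this value over time
```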

  17. Metadata Sets for e-Government Resources: The Extended e-Government Metadata Schema (eGMS+)

    NASA Astrophysics Data System (ADS)

    Charalabidis, Yannis; Lampathaki, Fenareti; Askounis, Dimitris

    In the dawn of the Semantic Web era, metadata appear as a key enabler that assists management of e-Government resources related to the provision of personalized, efficient and proactive services oriented towards citizens’ real needs. Different authorities typically use different terms to describe their resources and publish them in various e-Government registries; these registries may enhance access to and delivery of governmental knowledge, but they also need to communicate seamlessly at a national and pan-European level, so the need for a unified e-Government metadata standard emerges. This paper presents the creation of an ontology-based extended metadata set for e-Government resources that embraces services, documents, XML Schemas, code lists, public bodies and information systems. Such a metadata set formalizes the exchange of information between portals and registries and assists service transformation and simplification efforts, while it can further be taken into consideration when applying Web 2.0 techniques in e-Government.

  18. A Generic Metadata Editor Supporting System Using Drupal CMS

    NASA Astrophysics Data System (ADS)

    Pan, J.; Banks, N. G.; Leggott, M.

    2011-12-01

    Metadata handling is a key factor in preserving and reusing scientific data. In recent years, standardized structural metadata has become widely used in Geoscience communities. However, there exist many different standards in Geosciences, such as the current version of the Federal Geographic Data Committee's Content Standard for Digital Geospatial Metadata (FGDC CSDGM), the Ecological Markup Language (EML), the Geography Markup Language (GML), and the emerging ISO 19115 and related standards. In addition, there are many different subsets within the Geoscience subdomain, such as the Biological Profile of the FGDC CSDGM, or profiles for geopolitical regions, such as the European Profile or the North American Profile in the ISO standards. It is therefore desirable to have a software foundation that supports metadata creation and editing for multiple standards and profiles without reinventing the wheel. We have developed a generic, flexible software system to do just that: support multiple metadata standards and profiles. The software consists of a set of modules for the Drupal Content Management System (CMS), with minimal dependencies on other Drupal modules. There are two steps in using the system's metadata functions. First, an administrator uses the system to design a user form, based on an XML schema and its instances. The form definition is named and stored in the Drupal database as an XML blob. Second, users in an editor role can then use the persisted XML definition to render an actual metadata entry form for creating or editing a metadata record. Behind the scenes, the form definition XML is transformed into a PHP array, which is then rendered via the Drupal Form API. When the form is submitted, the posted values are used to modify a metadata record. Drupal hooks can be used to perform custom processing on the metadata record before and after submission. It is trivial to store the metadata record as an actual XML file or in a storage/archive system. We are working on adding many features to help editor users, such as auto-completion, form pre-population, partial saving, and automatic schema validation. In this presentation we will demonstrate a few sample editors, including an FGDC editor and a bare-bones editor for ISO 19115/19139. We will also demonstrate the use of templates during the definition phase, with the support of export and import functions. Form pre-population and input validation will also be covered. These modules are available as open-source software from the Islandora software foundation, as a component of a larger Drupal-based data archive system. They can easily be installed as a stand-alone system or plugged into other existing metadata platforms.
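
    The two-step workflow above (persist a form definition as XML, then transform it into a renderable structure at edit time) can be sketched as follows. Python stands in for the module's PHP; the XML element and attribute names are invented for illustration, while the '#type', '#title' and '#required' keys mirror Drupal Form API render-array conventions.

```python
import xml.etree.ElementTree as ET

# Step 1 (administrator): a named form definition, persisted as an XML blob.
FORM_DEFINITION = """
<form name="fgdc_basic">
  <field name="title"    label="Title"            type="textfield" required="true"/>
  <field name="abstract" label="Abstract"         type="textarea"  required="true"/>
  <field name="pubdate"  label="Publication date" type="textfield" required="false"/>
</form>
"""

def build_form(xml_blob):
    """Step 2 (editor): transform the persisted XML definition into a nested
    dict, analogous to the PHP array handed to a form-rendering API."""
    root = ET.fromstring(xml_blob)
    form = {}
    for f in root.findall("field"):
        form[f.get("name")] = {
            "#title": f.get("label"),
            "#type": f.get("type"),
            "#required": f.get("required") == "true",
        }
    return form

form = build_form(FORM_DEFINITION)
print(form["title"])   # {'#title': 'Title', '#type': 'textfield', '#required': True}
```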

  19. Intrinsic Anisotropic Anelasticity of Hcp Iron Due to Light Element Solute Atoms

    NASA Astrophysics Data System (ADS)

    Redfern, S. A. T.

    2014-12-01

    Earth's inner core is elastically anisotropic, with seismology showing faster wave propagation along the polar axis compared to the equatorial plane. Some inner core studies report anisotropic seismic attenuation. Attenuation of body waves has previously been postulated to be due to scattering by anisotropic microstructure, but recent normal mode studies also show strong anisotropic attenuation (Mäkinen et al. 2014). This suggests that the anisotropic attenuation is a result of the intrinsic (and anisotropic) anelastic properties of the solid iron alloy forming Earth's inner core. Here, I consider the origins of inner core anisotropic attenuation. Possibilities include grain boundary relaxation, dislocation bowing/glide, or point defect (alloying element) relaxations. The inner core is an almost perfect environment for near-equilibrium crystallisation, with very low temperature gradients across the inner core, low gravity, and slow crystallisation rates. It is assumed that grain sizes may be of the order of hundreds of metres. This implies vanishingly small volumes of grain boundary, and insignificant grain boundary relaxation. The very high homologous temperature and the absence of obvious deviatoric stress also lead one to conclude that dislocation densities are low. On the other hand, estimates for light element concentrations are of the order of a few %, with O, S, Si, C and H having at various times been suggested as candidate elements. Light element solutes in hcp metals contribute to intrinsic anelastic attenuation if they occur in sufficient concentrations to pair and form elastic dipoles. Switching of dipoles under the stress of a passing seismic wave will result in anelastic mechanical loss. Such attenuation has been measured in hcp metals in the lab, and is anisotropic due to the intrinsic elastic anisotropy of the host lattice. Such solute pair relaxations result in a "Zener effect", which is suggested here to be responsible for the observed anisotropic seismic attenuation. The Zener relaxation magnitude scales with solute concentration and is consistent with around 5% light element. Variations in attenuation are expected in a core with spatially varying concentrations of light element, and attenuation tomography of the inner core could therefore be employed to map chemical heterogeneity.

  20. CEOS WGISS Common Framework for WGISS Connected Data Assets

    NASA Astrophysics Data System (ADS)

    Enloe, Y.; Mitchell, A. E.; Albani, M.; Yapur, M.

    2016-12-01

    The Committee on Earth Observation Satellites (CEOS), established in 1984 to coordinate civil space-borne observations of the Earth, has been building, through its Working Group on Information Systems and Services (WGISS), a common data framework to identify and connect data assets at member agencies. Some of these data assets are federated systems such as the CEOS WGISS Integrated Catalog (CWIC), the European Space Agency's FedEO (Federated Earth Observations Missions Access) system, and the International Directory Network (IDN), an international effort developed by NASA to assist researchers in locating information on available data sets. A system-level team provides coordination and oversight to make this loosely coupled federated system function and evolve. WGISS has identified two search standards, the Open Geospatial Consortium (OGC) Catalog Services for the Web (CSW) and the CEOS OpenSearch Best Practices (which reference the OGC OpenSearch Geo and Time Extensions and the OGC OpenSearch Extension for Earth Observation), as well as an interoperable metadata standard (ISO 19115) for use within the WGISS Connected Data Assets. Data partners must register their data collections in the IDN using the Global Change Master Directory (GCMD) keywords. Data partners need to support one of the two search standards and be able to map their internal metadata to the ISO 19115 metadata elements. All searchable data must have a data access path. Clients can offer search and access to all or a subset of the satellite data available through the WGISS Connected Data Assets. Clients can offer support for a two-step search: (1) discovery through collection search at the IDN using platform, instrument, science keywords, etc., and (2) granule-level metadata search at data partners through CWIC or FedEO. More than a dozen international agencies offer their data through the WGISS federation or are working on developing their connections, including the European Space Agency, NASA, NOAA, USGS, the National Institute for Space Research (Brazil), the Canadian Center for Mapping and Earth Observations (CCMEO), the Academy for Opto-Electronics (China), the Indian Space Research Organization (ISRO), EUMETSAT, the Russian Federal Space Agency (ROSCOSMOS) and several agencies within Australia.
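
    A client-side view of the two-step search pattern might look like the sketch below. The endpoint URLs, query parameters and response fields are placeholders rather than the actual IDN, CWIC or FedEO interfaces, which are defined by the CEOS OpenSearch Best Practices and each partner's OpenSearch description document; JSON responses are assumed for brevity, whereas real deployments typically return Atom/XML.

```python
import requests

# Placeholder endpoints: real deployments advertise their URL templates in an
# OpenSearch Description Document (OSDD), which a client should fetch first.
IDN_COLLECTION_SEARCH = "https://idn.example.org/opensearch/collections"
GRANULE_SEARCH = "https://partner.example.org/opensearch/granules"

# Step 1: discover collections at the IDN by platform, instrument or science keyword.
collections = requests.get(IDN_COLLECTION_SEARCH, params={
    "keyword": "sea surface temperature",
    "platform": "Suomi-NPP",
}).json()

# Step 2: search granule metadata at the data partner (CWIC or FedEO in practice),
# scoped to one collection identifier and a time window.
granules = requests.get(GRANULE_SEARCH, params={
    "datasetId": collections["entries"][0]["id"],
    "startTime": "2016-01-01T00:00:00Z",
    "endTime": "2016-01-31T23:59:59Z",
}).json()

# Each granule entry is expected to carry a data-access link.
for entry in granules["entries"]:
    print(entry["title"], entry["downloadUrl"])
```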
