An Expressive and Efficient Language for XML Information Retrieval.
ERIC Educational Resources Information Center
Chinenyanga, Taurai Tapiwa; Kushmerick, Nicholas
2002-01-01
Discusses XML and information retrieval and describes a query language, ELIXIR (expressive and efficient language for XML information retrieval), with a textual similarity operator that can be used for similarity joins. Explains the algorithm for answering ELIXIR queries to generate intermediate relational data. (Author/LRW)
Information Retrieval System for Japanese Standard Disease-Code Master Using XML Web Service
Hatano, Kenji; Ohe, Kazuhiko
2003-01-01
Information retrieval system of Japanese Standard Disease-Code Master Using XML Web Service is developed. XML Web Service is a new distributed processing system by standard internet technologies. With seamless remote method invocation of XML Web Service, users are able to get the latest disease code master information from their rich desktop applications or internet web sites, which refer to this service. PMID:14728364
A Novel Navigation Paradigm for XML Repositories.
ERIC Educational Resources Information Center
Azagury, Alain; Factor, Michael E.; Maarek, Yoelle S.; Mandler, Benny
2002-01-01
Discusses data exchange over the Internet and describes the architecture and implementation of an XML document repository that promotes a navigation paradigm for XML documents based on content and context. Topics include information retrieval and semistructured documents; and file systems as information storage infrastructure, particularly XMLFS.…
A Survey in Indexing and Searching XML Documents.
ERIC Educational Resources Information Center
Luk, Robert W. P.; Leong, H. V.; Dillon, Tharam S.; Chan, Alvin T. S.; Croft, W. Bruce; Allan, James
2002-01-01
Discussion of XML focuses on indexing techniques for XML documents, grouping them into flat-file, semistructured, and structured indexing paradigms. Highlights include searching techniques, including full text search and multistage search; search result presentations; database and information retrieval system integration; XML query languages; and…
Querying and Ranking XML Documents.
ERIC Educational Resources Information Center
Schlieder, Torsten; Meuss, Holger
2002-01-01
Discussion of XML, information retrieval, precision, and recall focuses on a retrieval technique that adopts the similarity measure of the vector space model, incorporates the document structure, and supports structured queries. Topics include a query model based on tree matching; structured queries and term-based ranking; and term frequency and…
Information persistence using XML database technology
NASA Astrophysics Data System (ADS)
Clark, Thomas A.; Lipa, Brian E. G.; Macera, Anthony R.; Staskevich, Gennady R.
2005-05-01
The Joint Battlespace Infosphere (JBI) Information Management (IM) services provide information exchange and persistence capabilities that support tailored, dynamic, and timely access to required information, enabling near real-time planning, control, and execution for DoD decision making. JBI IM services will be built on a substrate of network centric core enterprise services and when transitioned, will establish an interoperable information space that aggregates, integrates, fuses, and intelligently disseminates relevant information to support effective warfighter business processes. This virtual information space provides individual users with information tailored to their specific functional responsibilities and provides a highly tailored repository of, or access to, information that is designed to support a specific Community of Interest (COI), geographic area or mission. Critical to effective operation of JBI IM services is the implementation of repositories, where data, represented as information, is represented and persisted for quick and easy retrieval. This paper will address information representation, persistence and retrieval using existing database technologies to manage structured data in Extensible Markup Language (XML) format as well as unstructured data in an IM services-oriented environment. Three basic categories of database technologies will be compared and contrasted: Relational, XML-Enabled, and Native XML. These technologies have diverse properties such as maturity, performance, query language specifications, indexing, and retrieval methods. We will describe our application of these evolving technologies within the context of a JBI Reference Implementation (RI) by providing some hopefully insightful anecdotes and lessons learned along the way. This paper will also outline future directions, promising technologies and emerging COTS products that can offer more powerful information management representations, better persistence mechanisms and improved retrieval techniques.
Structuring Legacy Pathology Reports by openEHR Archetypes to Enable Semantic Querying.
Kropf, Stefan; Krücken, Peter; Mueller, Wolf; Denecke, Kerstin
2017-05-18
Clinical information is often stored as free text, e.g. in discharge summaries or pathology reports. These documents are semi-structured using section headers, numbered lists, items and classification strings. However, it is still challenging to retrieve relevant documents since keyword searches applied on complete unstructured documents result in many false positive retrieval results. We are concentrating on the processing of pathology reports as an example for unstructured clinical documents. The objective is to transform reports semi-automatically into an information structure that enables an improved access and retrieval of relevant data. The data is expected to be stored in a standardized, structured way to make it accessible for queries that are applied to specific sections of a document (section-sensitive queries) and for information reuse. Our processing pipeline comprises information modelling, section boundary detection and section-sensitive queries. For enabling a focused search in unstructured data, documents are automatically structured and transformed into a patient information model specified through openEHR archetypes. The resulting XML-based pathology electronic health records (PEHRs) are queried by XQuery and visualized by XSLT in HTML. Pathology reports (PRs) can be reliably structured into sections by a keyword-based approach. The information modelling using openEHR allows saving time in the modelling process since many archetypes can be reused. The resulting standardized, structured PEHRs allow accessing relevant data by retrieving data matching user queries. Mapping unstructured reports into a standardized information model is a practical solution for a better access to data. Archetype-based XML enables section-sensitive retrieval and visualisation by well-established XML techniques. Focussing the retrieval to particular sections has the potential of saving retrieval time and improving the accuracy of the retrieval.
An XML-based system for the flexible classification and retrieval of clinical practice guidelines.
Ganslandt, T.; Mueller, M. L.; Krieglstein, C. F.; Senninger, N.; Prokosch, H. U.
2002-01-01
Beneficial effects of clinical practice guidelines (CPGs) have not yet reached expectations due to limited routine adoption. Electronic distribution and reminder systems have the potential to overcome implementation barriers. Existing electronic CPG repositories like the National Guideline Clearinghouse (NGC) provide individual access but lack standardized computer-readable interfaces necessary for automated guideline retrieval. The aim of this paper was to facilitate automated context-based selection and presentation of CPGs. Using attributes from the NGC classification scheme, an XML-based metadata repository was successfully implemented, providing document storage, classification and retrieval functionality. Semi-automated extraction of attributes was implemented for the import of XML guideline documents using XPath. A hospital information system interface was exemplarily implemented for diagnosis-based guideline invocation. Limitations of the implemented system are discussed and possible future work is outlined. Integration of standardized computer-readable search interfaces into existing CPG repositories is proposed. PMID:12463831
ERIC Educational Resources Information Center
Chang, May
2000-01-01
Describes the development of electronic finding aids for archives at the University of Illinois, Urbana-Champaign that used XML (extensible markup language) and EAD (encoded archival description) to enable more flexible information management and retrieval than using MARC or a relational database management system. EAD template is appended.…
Catalog Descriptions Using VOTable Files
NASA Astrophysics Data System (ADS)
Thompson, R.; Levay, K.; Kimball, T.; White, R.
2008-08-01
Additional information is frequently required to describe database table contents and make it understandable to users. For this reason, the Multimission Archive at Space Telescope (MAST) creates Òdescription filesÓ for each table/catalog. After trying various XML and CSV formats, we finally chose VOTable. These files are easy to update via an HTML form, easily read using an XML parser such as (in our case) the PHP5 SimpleXML extension, and have found multiple uses in our data access/retrieval process.
NASA Astrophysics Data System (ADS)
Pascoe, Charlotte; Lawrence, Bryan; Moine, Marie-Pierre; Ford, Rupert; Devine, Gerry
2010-05-01
The EU METAFOR Project (http://metaforclimate.eu) has created a web-based model documentation questionnaire to collect metadata from the modelling groups that are running simulations in support of the Coupled Model Intercomparison Project - 5 (CMIP5). The CMIP5 model documentation questionnaire will retrieve information about the details of the models used, how the simulations were carried out, how the simulations conformed to the CMIP5 experiment requirements and details of the hardware used to perform the simulations. The metadata collected by the CMIP5 questionnaire will allow CMIP5 data to be compared in a scientifically meaningful way. This paper describes the life-cycle of the CMIP5 questionnaire development which starts with relatively unstructured input from domain specialists and ends with formal XML documents that comply with the METAFOR Common Information Model (CIM). Each development step is associated with a specific tool. (1) Mind maps are used to capture information requirements from domain experts and build a controlled vocabulary, (2) a python parser processes the XML files generated by the mind maps, (3) Django (python) is used to generate the dynamic structure and content of the web based questionnaire from processed xml and the METAFOR CIM, (4) Python parsers ensure that information entered into the CMIP5 questionnaire is output as CIM compliant xml, (5) CIM compliant output allows automatic information capture tools to harvest questionnaire content into databases such as the Earth System Grid (ESG) metadata catalogue. This paper will focus on how Django (python) and XML input files are used to generate the structure and content of the CMIP5 questionnaire. It will also address how the choice of development tools listed above provided a framework that enabled working scientists (who we would never ordinarily get to interact with UML and XML) to be part the iterative development process and ensure that the CMIP5 model documentation questionnaire reflects what scientists want to know about the models. Keywords: metadata, CMIP5, automatic information capture, tool development
An exponentiation method for XML element retrieval.
Wichaiwong, Tanakorn
2014-01-01
XML document is now widely used for modelling and storing structured documents. The structure is very rich and carries important information about contents and their relationships, for example, e-Commerce. XML data-centric collections require query terms allowing users to specify constraints on the document structure; mapping structure queries and assigning the weight are significant for the set of possibly relevant documents with respect to structural conditions. In this paper, we present an extension to the MEXIR search system that supports the combination of structural and content queries in the form of content-and-structure queries, which we call the Exponentiation function. It has been shown the structural information improve the effectiveness of the search system up to 52.60% over the baseline BM25 at MAP.
ScotlandsPlaces XML: Bespoke XML or XML Mapping?
ERIC Educational Resources Information Center
Beamer, Ashley; Gillick, Mark
2010-01-01
Purpose: The purpose of this paper is to investigate web services (in the form of parameterised URLs), specifically in the context of the ScotlandsPlaces project. This involves cross-domain querying, data retrieval and display via the development of a bespoke XML standard rather than existing XML formats and mapping between them.…
Friedman, Carol; Hripcsak, George; Shagina, Lyuda; Liu, Hongfang
1999-01-01
Objective: To design a document model that provides reliable and efficient access to clinical information in patient reports for a broad range of clinical applications, and to implement an automated method using natural language processing that maps textual reports to a form consistent with the model. Methods: A document model that encodes structured clinical information in patient reports while retaining the original contents was designed using the extensible markup language (XML), and a document type definition (DTD) was created. An existing natural language processor (NLP) was modified to generate output consistent with the model. Two hundred reports were processed using the modified NLP system, and the XML output that was generated was validated using an XML validating parser. Results: The modified NLP system successfully processed all 200 reports. The output of one report was invalid, and 199 reports were valid XML forms consistent with the DTD. Conclusions: Natural language processing can be used to automatically create an enriched document that contains a structured component whose elements are linked to portions of the original textual report. This integrated document model provides a representation where documents containing specific information can be accurately and efficiently retrieved by querying the structured components. If manual review of the documents is desired, the salient information in the original reports can also be identified and highlighted. Using an XML model of tagging provides an additional benefit in that software tools that manipulate XML documents are readily available. PMID:9925230
ERIC Educational Resources Information Center
Tennant, Roy, Ed.
This book presents examples of how libraries are using XML (eXtensible Markup Language) to solve problems, expand services, and improve systems. Part I contains papers on using XML in library catalog records: "Updating MARC Records with XMLMARC" (Kevin S. Clarke, Stanford University) and "Searching and Retrieving XML Records via the…
An Exponentiation Method for XML Element Retrieval
2014-01-01
XML document is now widely used for modelling and storing structured documents. The structure is very rich and carries important information about contents and their relationships, for example, e-Commerce. XML data-centric collections require query terms allowing users to specify constraints on the document structure; mapping structure queries and assigning the weight are significant for the set of possibly relevant documents with respect to structural conditions. In this paper, we present an extension to the MEXIR search system that supports the combination of structural and content queries in the form of content-and-structure queries, which we call the Exponentiation function. It has been shown the structural information improve the effectiveness of the search system up to 52.60% over the baseline BM25 at MAP. PMID:24696643
An XML Data Model for Inverted Image Indexing
NASA Astrophysics Data System (ADS)
So, Simon W.; Leung, Clement H. C.; Tse, Philip K. C.
2003-01-01
The Internet world makes increasing use of XML-based technologies. In multimedia data indexing and retrieval, the MPEG-7 standard for Multimedia Description Scheme is specified using XML. The flexibility of XML allows users to define other markup semantics for special contexts, construct data-centric XML documents, exchange standardized data between computer systems, and present data in different applications. In this paper, the Inverted Image Indexing paradigm is presented and modeled using XML Schema.
Transformation of HDF-EOS metadata from the ECS model to ISO 19115-based XML
NASA Astrophysics Data System (ADS)
Wei, Yaxing; Di, Liping; Zhao, Baohua; Liao, Guangxuan; Chen, Aijun
2007-02-01
Nowadays, geographic data, such as NASA's Earth Observation System (EOS) data, are playing an increasing role in many areas, including academic research, government decisions and even in people's every lives. As the quantity of geographic data becomes increasingly large, a major problem is how to fully make use of such data in a distributed, heterogeneous network environment. In order for a user to effectively discover and retrieve the specific information that is useful, the geographic metadata should be described and managed properly. Fortunately, the emergence of XML and Web Services technologies greatly promotes information distribution across the Internet. The research effort discussed in this paper presents a method and its implementation for transforming Hierarchical Data Format (HDF)-EOS metadata from the NASA ECS model to ISO 19115-based XML, which will be managed by the Open Geospatial Consortium (OGC) Catalogue Services—Web Profile (CSW). Using XML and international standards rather than domain-specific models to describe the metadata of those HDF-EOS data, and further using CSW to manage the metadata, can allow metadata information to be searched and interchanged more widely and easily, thus promoting the sharing of HDF-EOS data.
XML at the ADC: Steps to a Next Generation Data Archive
NASA Astrophysics Data System (ADS)
Shaya, E.; Blackwell, J.; Gass, J.; Oliversen, N.; Schneider, G.; Thomas, B.; Cheung, C.; White, R. A.
1999-05-01
The eXtensible Markup Language (XML) is a document markup language that allows users to specify their own tags, to create hierarchical structures to qualify their data, and to support automatic checking of documents for structural validity. It is being intensively supported by nearly every major corporate software developer. Under the funds of a NASA AISRP proposal, the Astronomical Data Center (ADC, http://adc.gsfc.nasa.gov) is developing an infrastructure for importation, enhancement, and distribution of data and metadata using XML as the document markup language. We discuss the preliminary Document Type Definition (DTD, at http://adc.gsfc.nasa.gov/xml) which specifies the elements and their attributes in our metadata documents. This attempts to define both the metadata of an astronomical catalog and the `header' information of an astronomical table. In addition, we give an overview of the planned flow of data through automated pipelines from authors and journal presses into our XML archive and retrieval through the web via the XML-QL Query Language and eXtensible Style Language (XSL) scripts. When completed, the catalogs and journal tables at the ADC will be tightly hyperlinked to enhance data discovery. In addition one will be able to search on fragmentary information. For instance, one could query for a table by entering that the second author is so-and-so or that the third author is at such-and-such institution.
A hierarchical SVG image abstraction layer for medical imaging
NASA Astrophysics Data System (ADS)
Kim, Edward; Huang, Xiaolei; Tan, Gang; Long, L. Rodney; Antani, Sameer
2010-03-01
As medical imaging rapidly expands, there is an increasing need to structure and organize image data for efficient analysis, storage and retrieval. In response, a large fraction of research in the areas of content-based image retrieval (CBIR) and picture archiving and communication systems (PACS) has focused on structuring information to bridge the "semantic gap", a disparity between machine and human image understanding. An additional consideration in medical images is the organization and integration of clinical diagnostic information. As a step towards bridging the semantic gap, we design and implement a hierarchical image abstraction layer using an XML based language, Scalable Vector Graphics (SVG). Our method encodes features from the raw image and clinical information into an extensible "layer" that can be stored in a SVG document and efficiently searched. Any feature extracted from the raw image including, color, texture, orientation, size, neighbor information, etc., can be combined in our abstraction with high level descriptions or classifications. And our representation can natively characterize an image in a hierarchical tree structure to support multiple levels of segmentation. Furthermore, being a world wide web consortium (W3C) standard, SVG is able to be displayed by most web browsers, interacted with by ECMAScript (standardized scripting language, e.g. JavaScript, JScript), and indexed and retrieved by XML databases and XQuery. Using these open source technologies enables straightforward integration into existing systems. From our results, we show that the flexibility and extensibility of our abstraction facilitates effective storage and retrieval of medical images.
PubMed Interact: an Interactive Search Application for MEDLINE/PubMed
Muin, Michael; Fontelo, Paul; Ackerman, Michael
2006-01-01
Online search and retrieval systems are important resources for medical literature research. Progressive Web 2.0 technologies provide opportunities to improve search strategies and user experience. Using PHP, Document Object Model (DOM) manipulation and Asynchronous JavaScript and XML (Ajax), PubMed Interact allows greater functionality so users can refine search parameters with ease and interact with the search results to retrieve and display relevant information and related articles. PMID:17238658
Information Retrieval and Graph Analysis Approaches for Book Recommendation.
Benkoussas, Chahinez; Bellot, Patrice
2015-01-01
A combination of multiple information retrieval approaches is proposed for the purpose of book recommendation. In this paper, book recommendation is based on complex user's query. We used different theoretical retrieval models: probabilistic as InL2 (Divergence from Randomness model) and language model and tested their interpolated combination. Graph analysis algorithms such as PageRank have been successful in Web environments. We consider the application of this algorithm in a new retrieval approach to related document network comprised of social links. We called Directed Graph of Documents (DGD) a network constructed with documents and social information provided from each one of them. Specifically, this work tackles the problem of book recommendation in the context of INEX (Initiative for the Evaluation of XML retrieval) Social Book Search track. A series of reranking experiments demonstrate that combining retrieval models yields significant improvements in terms of standard ranked retrieval metrics. These results extend the applicability of link analysis algorithms to different environments.
Information Retrieval and Graph Analysis Approaches for Book Recommendation
Benkoussas, Chahinez; Bellot, Patrice
2015-01-01
A combination of multiple information retrieval approaches is proposed for the purpose of book recommendation. In this paper, book recommendation is based on complex user's query. We used different theoretical retrieval models: probabilistic as InL2 (Divergence from Randomness model) and language model and tested their interpolated combination. Graph analysis algorithms such as PageRank have been successful in Web environments. We consider the application of this algorithm in a new retrieval approach to related document network comprised of social links. We called Directed Graph of Documents (DGD) a network constructed with documents and social information provided from each one of them. Specifically, this work tackles the problem of book recommendation in the context of INEX (Initiative for the Evaluation of XML retrieval) Social Book Search track. A series of reranking experiments demonstrate that combining retrieval models yields significant improvements in terms of standard ranked retrieval metrics. These results extend the applicability of link analysis algorithms to different environments. PMID:26504899
Content-Aware DataGuide with Incremental Index Update using Frequently Used Paths
NASA Astrophysics Data System (ADS)
Sharma, A. K.; Duhan, Neelam; Khattar, Priyanka
2010-11-01
Size of the WWW is increasing day by day. Due to the absence of structured data on the Web, it becomes very difficult for information retrieval tools to fully utilize the Web information. As a solution to this problem, XML pages come into play, which provide structural information to the users to some extent. Without efficient indexes, query processing can be quite inefficient due to an exhaustive traversal on XML data. In this paper an improved content-centric approach of Content-Aware DataGuide, which is an indexing technique for XML databases, is being proposed that uses frequently used paths from historical query logs to improve query performance. The index can be updated incrementally according to the changes in query workload and thus, the overhead of reconstruction can be minimized. Frequently used paths are extracted using any Sequential Pattern mining algorithm on subsequent queries in the query workload. After this, the data structures are incrementally updated. This indexing technique proves to be efficient as partial matching queries can be executed efficiently and users can now get the more relevant documents in results.
Distribution of immunodeficiency fact files with XML--from Web to WAP.
Väliaho, Jouni; Riikonen, Pentti; Vihinen, Mauno
2005-06-26
Although biomedical information is growing rapidly, it is difficult to find and retrieve validated data especially for rare hereditary diseases. There is an increased need for services capable of integrating and validating information as well as proving it in a logically organized structure. A XML-based language enables creation of open source databases for storage, maintenance and delivery for different platforms. Here we present a new data model called fact file and an XML-based specification Inherited Disease Markup Language (IDML), that were developed to facilitate disease information integration, storage and exchange. The data model was applied to primary immunodeficiencies, but it can be used for any hereditary disease. Fact files integrate biomedical, genetic and clinical information related to hereditary diseases. IDML and fact files were used to build a comprehensive Web and WAP accessible knowledge base ImmunoDeficiency Resource (IDR) available at http://bioinf.uta.fi/idr/. A fact file is a user oriented user interface, which serves as a starting point to explore information on hereditary diseases. The IDML enables the seamless integration and presentation of genetic and disease information resources in the Internet. IDML can be used to build information services for all kinds of inherited diseases. The open source specification and related programs are available at http://bioinf.uta.fi/idml/.
Saadawi, Gilan M; Harrison, James H
2006-10-01
Clinical laboratory procedure manuals are typically maintained as word processor files and are inefficient to store and search, require substantial effort for review and updating, and integrate poorly with other laboratory information. Electronic document management systems could improve procedure management and utility. As a first step toward building such systems, we have developed a prototype electronic format for laboratory procedures using Extensible Markup Language (XML). Representative laboratory procedures were analyzed to identify document structure and data elements. This information was used to create a markup vocabulary, CLP-ML, expressed as an XML Document Type Definition (DTD). To determine whether this markup provided advantages over generic markup, we compared procedures structured with CLP-ML or with the vocabulary of the Health Level Seven, Inc. (HL7) Clinical Document Architecture (CDA) narrative block. CLP-ML includes 124 XML tags and supports a variety of procedure types across different laboratory sections. When compared with a general-purpose markup vocabulary (CDA narrative block), CLP-ML documents were easier to edit and read, less complex structurally, and simpler to traverse for searching and retrieval. In combination with appropriate software, CLP-ML is designed to support electronic authoring, reviewing, distributing, and searching of clinical laboratory procedures from a central repository, decreasing procedure maintenance effort and increasing the utility of procedure information. A standard electronic procedure format could also allow laboratories and vendors to share procedures and procedure layouts, minimizing duplicative word processor editing. Our results suggest that laboratory-specific markup such as CLP-ML will provide greater benefit for such systems than generic markup.
Techniques for integrating ‐omics data
Akula, Siva Prasad; Miriyala, Raghava Naidu; Thota, Hanuman; Rao, Allam Appa; Gedela, Srinubabu
2009-01-01
The challenge for -omics research is to tackle the problem of fragmentation of knowledge by integrating several sources of heterogeneous information into a coherent entity. It is widely recognized that successful data integration is one of the keys to improve productivity for stored data. Through proper data integration tools and algorithms, researchers may correlate relationships that enable them to make better and faster decisions. The need for data integration is essential for present ‐omics community, because ‐omics data is currently spread world wide in wide variety of formats. These formats can be integrated and migrated across platforms through different techniques and one of the important techniques often used is XML. XML is used to provide a document markup language that is easier to learn, retrieve, store and transmit. It is semantically richer than HTML. Here, we describe bio warehousing, database federation, controlled vocabularies and highlighting the XML application to store, migrate and validate -omics data. PMID:19255651
Techniques for integrating -omics data.
Akula, Siva Prasad; Miriyala, Raghava Naidu; Thota, Hanuman; Rao, Allam Appa; Gedela, Srinubabu
2009-01-01
The challenge for -omics research is to tackle the problem of fragmentation of knowledge by integrating several sources of heterogeneous information into a coherent entity. It is widely recognized that successful data integration is one of the keys to improve productivity for stored data. Through proper data integration tools and algorithms, researchers may correlate relationships that enable them to make better and faster decisions. The need for data integration is essential for present -omics community, because -omics data is currently spread world wide in wide variety of formats. These formats can be integrated and migrated across platforms through different techniques and one of the important techniques often used is XML. XML is used to provide a document markup language that is easier to learn, retrieve, store and transmit. It is semantically richer than HTML. Here, we describe bio warehousing, database federation, controlled vocabularies and highlighting the XML application to store, migrate and validate -omics data.
Querying archetype-based EHRs by search ontology-based XPath engineering.
Kropf, Stefan; Uciteli, Alexandr; Schierle, Katrin; Krücken, Peter; Denecke, Kerstin; Herre, Heinrich
2018-05-11
Legacy data and new structured data can be stored in a standardized format as XML-based EHRs on XML databases. Querying documents on these databases is crucial for answering research questions. Instead of using free text searches, that lead to false positive results, the precision can be increased by constraining the search to certain parts of documents. A search ontology-based specification of queries on XML documents defines search concepts and relates them to parts in the XML document structure. Such query specification method is practically introduced and evaluated by applying concrete research questions formulated in natural language on a data collection for information retrieval purposes. The search is performed by search ontology-based XPath engineering that reuses ontologies and XML-related W3C standards. The key result is that the specification of research questions can be supported by the usage of search ontology-based XPath engineering. A deeper recognition of entities and a semantic understanding of the content is necessary for a further improvement of precision and recall. Key limitation is that the application of the introduced process requires skills in ontology and software development. In future, the time consuming ontology development could be overcome by implementing a new clinical role: the clinical ontologist. The introduced Search Ontology XML extension connects Search Terms to certain parts in XML documents and enables an ontology-based definition of queries. Search ontology-based XPath engineering can support research question answering by the specification of complex XPath expressions without deep syntax knowledge about XPaths.
Information Model Translation to Support a Wider Science Community
NASA Astrophysics Data System (ADS)
Hughes, John S.; Crichton, Daniel; Ritschel, Bernd; Hardman, Sean; Joyner, Ronald
2014-05-01
The Planetary Data System (PDS), NASA's long-term archive for solar system exploration data, has just released PDS4, a modernization of the PDS architecture, data standards, and technical infrastructure. This next generation system positions the PDS to meet the demands of the coming decade, including big data, international cooperation, distributed nodes, and multiple ways of analysing and interpreting data. It also addresses three fundamental project goals: providing more efficient data delivery by data providers to the PDS, enabling a stable, long-term usable planetary science data archive, and enabling services for the data consumer to find, access, and use the data they require in contemporary data formats. The PDS4 information architecture is used to describe all PDS data using a common model. Captured in an ontology modeling tool it supports a hierarchy of data dictionaries built to the ISO/IEC 11179 standard and is designed to increase flexibility, enable complex searches at the product level, and to promote interoperability that facilitates data sharing both nationally and internationally. A PDS4 information architecture design requirement stipulates that the content of the information model must be translatable to external data definition languages such as XML Schema, XMI/XML, and RDF/XML. To support the semantic Web standards we are now in the process of mapping the contents into RDF/XML to support SPARQL capable databases. We are also building a terminological ontology to support virtually unified data retrieval and access. This paper will provide an overview of the PDS4 information architecture focusing on its domain information model and how the translation and mapping are being accomplished.
An XML-based Generic Tool for Information Retrieval in Solar Databases
NASA Astrophysics Data System (ADS)
Scholl, Isabelle F.; Legay, Eric; Linsolas, Romain
This paper presents the current architecture of the `Solar Web Project' now in its development phase. This tool will provide scientists interested in solar data with a single web-based interface for browsing distributed and heterogeneous catalogs of solar observations. The main goal is to have a generic application that can be easily extended to new sets of data or to new missions with a low level of maintenance. It is developed with Java and XML is used as a powerful configuration language. The server, independent of any database scheme, can communicate with a client (the user interface) and several local or remote archive access systems (such as existing web pages, ftp sites or SQL databases). Archive access systems are externally described in XML files. The user interface is also dynamically generated from an XML file containing the window building rules and a simplified database description. This project is developed at MEDOC (Multi-Experiment Data and Operations Centre), located at the Institut d'Astrophysique Spatiale (Orsay, France). Successful tests have been conducted with other solar archive access systems.
The SGML Standardization Framework and the Introduction of XML
Grütter, Rolf
2000-01-01
Extensible Markup Language (XML) is on its way to becoming a global standard for the representation, exchange, and presentation of information on the World Wide Web (WWW). More than that, XML is creating a standardization framework, in terms of an open network of meta-standards and mediators that allows for the definition of further conventions and agreements in specific business domains. Such an approach is particularly needed in the healthcare domain; XML promises to especially suit the particularities of patient records and their lifelong storage, retrieval, and exchange. At a time when change rather than steadiness is becoming the faithful feature of our society, standardization frameworks which support a diversified growth of specifications that are appropriate to the actual needs of the users are becoming more and more important; and efforts should be made to encourage this new attempt at standardization to grow in a fruitful direction. Thus, the introduction of XML reflects a standardization process which is neither exclusively based on an acknowledged standardization authority, nor a pure market standard. Instead, a consortium of companies, academic institutions, and public bodies has agreed on a common recommendation based on an existing standardization framework. The consortium's process of agreeing to a standardization framework will doubtlessly be successful in the case of XML, and it is suggested that it should be considered as a generic model for standardization processes in the future. PMID:11720931
The SGML standardization framework and the introduction of XML.
Fierz, W; Grütter, R
2000-01-01
Extensible Markup Language (XML) is on its way to becoming a global standard for the representation, exchange, and presentation of information on the World Wide Web (WWW). More than that, XML is creating a standardization framework, in terms of an open network of meta-standards and mediators that allows for the definition of further conventions and agreements in specific business domains. Such an approach is particularly needed in the healthcare domain; XML promises to especially suit the particularities of patient records and their lifelong storage, retrieval, and exchange. At a time when change rather than steadiness is becoming the faithful feature of our society, standardization frameworks which support a diversified growth of specifications that are appropriate to the actual needs of the users are becoming more and more important; and efforts should be made to encourage this new attempt at standardization to grow in a fruitful direction. Thus, the introduction of XML reflects a standardization process which is neither exclusively based on an acknowledged standardization authority, nor a pure market standard. Instead, a consortium of companies, academic institutions, and public bodies has agreed on a common recommendation based on an existing standardization framework. The consortium's process of agreeing to a standardization framework will doubtlessly be successful in the case of XML, and it is suggested that it should be considered as a generic model for standardization processes in the future.
2015-03-01
fall in the lossy category (Gonzalez, Woods , & Eddins, 2009, p. 420). For the textual or numeric data in XML, however, lossy compression is...7/1,337 > Professional Notes Being Efficient with Bandwidth By Lieutenant Commander Steve Debich, Lieutenant Bruce Hill, Captain Scot Miller (Retired...2005). XML Binary Characterization. Retrieved from http://www.w3.org/TR/xbc-characterization/ Gonzalez, R., Woods , R., & Eddins, S. (2009
NASA Astrophysics Data System (ADS)
Gebhardt, Steffen; Wehrmann, Thilo; Klinger, Verena; Schettler, Ingo; Huth, Juliane; Künzer, Claudia; Dech, Stefan
2010-10-01
The German-Vietnamese water-related information system for the Mekong Delta (WISDOM) project supports business processes in Integrated Water Resources Management in Vietnam. Multiple disciplines bring together earth and ground based observation themes, such as environmental monitoring, water management, demographics, economy, information technology, and infrastructural systems. This paper introduces the components of the web-based WISDOM system including data, logic and presentation tier. It focuses on the data models upon which the database management system is built, including techniques for tagging or linking metadata with the stored information. The model also uses ordered groupings of spatial, thematic and temporal reference objects to semantically tag datasets to enable fast data retrieval, such as finding all data in a specific administrative unit belonging to a specific theme. A spatial database extension is employed by the PostgreSQL database. This object-oriented database was chosen over a relational database to tag spatial objects to tabular data, improving the retrieval of census and observational data at regional, provincial, and local areas. While the spatial database hinders processing raster data, a "work-around" was built into WISDOM to permit efficient management of both raster and vector data. The data model also incorporates styling aspects of the spatial datasets through styled layer descriptions (SLD) and web mapping service (WMS) layer specifications, allowing retrieval of rendered maps. Metadata elements of the spatial data are based on the ISO19115 standard. XML structured information of the SLD and metadata are stored in an XML database. The data models and the data management system are robust for managing the large quantity of spatial objects, sensor observations, census and document data. The operational WISDOM information system prototype contains modules for data management, automatic data integration, and web services for data retrieval, analysis, and distribution. The graphical user interfaces facilitate metadata cataloguing, data warehousing, web sensor data analysis and thematic mapping.
Development of the Plate Tectonics and Seismology markup languages with XML
NASA Astrophysics Data System (ADS)
Babaie, H.; Babaei, A.
2003-04-01
The Extensible Markup Language (XML) and its specifications such as the XSD Schema, allow geologists to design discipline-specific vocabularies such as Seismology Markup Language (SeismML) or Plate Tectonics Markup Language (TectML). These languages make it possible to store and interchange structured geological information over the Web. Development of a geological markup language requires mapping geological concepts, such as "Earthquake" or "Plate" into a UML object model, applying a modeling and design environment. We have selected four inter-related geological concepts: earthquake, fault, plate, and orogeny, and developed four XML Schema Definitions (XSD), that define the relationships, cardinalities, hierarchies, and semantics of these concepts. In such a geological concept model, the UML object "Earthquake" is related to one or more "Wave" objects, each arriving to a seismic station at a specific "DateTime", and relating to a specific "Epicenter" object that lies at a unique "Location". The "Earthquake" object occurs along a "Segment" of a "Fault" object, which is related to a specific "Plate" object. The "Fault" has its own associations with such things as "Bend", "Step", and "Segment", and could be of any kind (e.g., "Thrust", "Transform'). The "Plate" is related to many other objects such as "MOR", "Subduction", and "Forearc", and is associated with an "Orogeny" object that relates to "Deformation" and "Strain" and several other objects. These UML objects were mapped into XML Metadata Interchange (XMI) formats, which were then converted into four XSD Schemas. The schemas were used to create and validate the XML instance documents, and to create a relational database hosting the plate tectonics and seismological data in the Microsoft Access format. The SeismML and TectML allow seismologists and structural geologists, among others, to submit and retrieve structured geological data on the Internet. A seismologist, for example, can submit peer-reviewed and reliable data about a specific earthquake to a Java Server Page on our web site hosting the XML application. Other geologists can readily retrieve the submitted data, saved in files or special tables of the designed database, through a search engine designed with J2EE (JSP, servlet, Java Bean) and XML specifications such as XPath, XPointer, and XSLT. When extended to include all the important concepts of seismology and plate tectonics, the two markup languages will make global interchange of geological data a reality.
Fumoto, Masaki; Miyazaki, Satoru; Sugawara, Hideaki
2002-01-01
Genome Information Broker (GIB) is a powerful tool for the study of comparative genomics. GIB allows users to retrieve and display partial and/or whole genome sequences together with the relevant biological annotation. GIB has accumulated all the completed microbial genome and has recently been expanded to include Arabidopsis thaliana genome data from DDBJ/EMBL/GenBank. In the near future, hundreds of genome sequences will be determined. In order to handle such huge data, we have enhanced the GIB architecture by using XML, CORBA and distributed RDBs. We introduce the new GIB here. GIB is freely accessible at http://gib.genes.nig.ac.jp/. PMID:11752256
2009-09-21
specified by contract no. W7714-040875/001/SV. This document contains the design of the JNDMS software to the system architecture level. Other...alternative for the presentation functions. ASP, Java, ActiveX , DLL, HTML, DHTML, SOAP, .NET HTML, DHTML, XML, Jscript, VBScript, SOAP, .NET...retrieved through the network, typically by a network management console. Information is contained in a Management Information Base (MIB), which is a data
Digitising legacy zoological taxonomic literature: Processes, products and using the output
Lyal, Christopher H. C.
2016-01-01
Abstract By digitising legacy taxonomic literature using XML mark-up the contents become accessible to other taxonomic and nomenclatural information systems. Appropriate schemas need to be interoperable with other sectorial schemas, atomise to appropriate content elements and carry appropriate metadata to, for example, enable algorithmic assessment of availability of a name under the Code. Legacy (and new) literature delivered in this fashion will become part of a global taxonomic resource from which users can extract tailored content to meet their particular needs, be they nomenclatural, taxonomic, faunistic or other. To date, most digitisation of taxonomic literature has led to a more or less simple digital copy of a paper original – the output of the many efforts has effectively been an electronic copy of a traditional library. While this has increased accessibility of publications through internet access, the means by which many scientific papers are indexed and located is much the same as with traditional libraries. OCR and born-digital papers allow use of web search engines to locate instances of taxon names and other terms, but OCR efficiency in recognising taxonomic names is still relatively poor, people’s ability to use search engines effectively is mixed, and many papers cannot be searched directly. Instead of building digital analogues of traditional publications, we should consider what properties we require of future taxonomic information access. Ideally the content of each new digital publication should be accessible in the context of all previous published data, and the user able to retrieve nomenclatural, taxonomic and other data / information in the form required without having to scan all of the original papers and extract target content manually. This opens the door to dynamic linking of new content with extant systems: automatic population and updating of taxonomic catalogues, ZooBank and faunal lists, all descriptions of a taxon and its children instantly accessible with a single search, comparison of classifications used in different publications, and so on. A means to do this is through marking up content into XML, and the more atomised the mark-up the greater the possibilities for data retrieval and integration. Mark-up requires XML that accommodates the required content elements and is interoperable with other XML schemas, and there are now several written to do this, particularly TaxPub, taxonX and taXMLit, the last of these being the most atomised. We now need to automate this process as far as possible. Manual and automatic data and information retrieval is demonstrated by projects such as INOTAXA and Plazi. As we move to creating and using taxonomic products through the power of the internet, we need to ensure the output, while satisfying in its production the requirements of the Code, is fit for purpose in the future. PMID:26877659
Huang, Mingbo; Hu, Ding; Yu, Donglan; Zheng, Zhensheng; Wang, Kuijian
2011-12-01
Enhanced extracorporeal counterpulsation (EECP) information consists of both text and hemodynamic waveform data. At present EECP text information has been successfully managed through Web browser, while the management and sharing of hemodynamic waveform data through Internet has not been solved yet. In order to manage EECP information completely, based on the in-depth analysis of EECP hemodynamic waveform file of digital imaging and communications in medicine (DICOM) format and its disadvantages in Internet sharing, we proposed the use of the extensible markup language (XML), which is currently the Internet popular data exchange standard, as the storage specification for the sharing of EECP waveform data. Then we designed a web-based sharing system of EECP hemodynamic waveform data via ASP. NET 2.0 platform. Meanwhile, we specifically introduced the four main system function modules and their implement methods, including DICOM to XML conversion module, EECP waveform data management module, retrieval and display of EECP waveform module and the security mechanism of the system.
XML Based Markup Languages for Specific Domains
NASA Astrophysics Data System (ADS)
Varde, Aparna; Rundensteiner, Elke; Fahrenholz, Sally
A challenging area in web based support systems is the study of human activities in connection with the web, especially with reference to certain domains. This includes capturing human reasoning in information retrieval, facilitating the exchange of domain-specific knowledge through a common platform and developing tools for the analysis of data on the web from a domain expert's angle. Among the techniques and standards related to such work, we have XML, the eXtensible Markup Language. This serves as a medium of communication for storing and publishing textual, numeric and other forms of data seamlessly. XML tag sets are such that they preserve semantics and simplify the understanding of stored information by users. Often domain-specific markup languages are designed using XML, with a user-centric perspective. Standardization bodies and research communities may extend these to include additional semantics of areas within and related to the domain. This chapter outlines the issues to be considered in developing domain-specific markup languages: the motivation for development, the semantic considerations, the syntactic constraints and other relevant aspects, especially taking into account human factors. Illustrating examples are provided from domains such as Medicine, Finance and Materials Science. Particular emphasis in these examples is on the Materials Markup Language MatML and the semantics of one of its areas, namely, the Heat Treating of Materials. The focus of this chapter, however, is not the design of one particular language but rather the generic issues concerning the development of domain-specific markup languages.
Partial Automation of Requirements Tracing
NASA Technical Reports Server (NTRS)
Hayes, Jane; Dekhtyar, Alex; Sundaram, Senthil; Vadlamudi, Sravanthi
2006-01-01
Requirements Tracing on Target (RETRO) is software for after-the-fact tracing of textual requirements to support independent verification and validation of software. RETRO applies one of three user-selectable information-retrieval techniques: (1) term frequency/inverse document frequency (TF/IDF) vector retrieval, (2) TF/IDF vector retrieval with simple thesaurus, or (3) keyword extraction. One component of RETRO is the graphical user interface (GUI) for use in initiating a requirements-tracing project (a pair of artifacts to be traced to each other, such as a requirements spec and a design spec). Once the artifacts have been specified and the IR technique chosen, another component constructs a representation of the artifact elements and stores it on disk. Next, the IR technique is used to produce a first list of candidate links (potential matches between the two artifact levels). This list, encoded in Extensible Markup Language (XML), is optionally processed by a filtering component designed to make the list somewhat smaller without sacrificing accuracy. Through the GUI, the user examines a number of links and returns decisions (yes, these are links; no, these are not links). Coded in XML, these decisions are provided to a "feedback processor" component that prepares the data for the next application of the IR technique. The feedback reduces the incidence of erroneous candidate links. Unlike related prior software, RETRO does not require the user to assign keywords, and automatically builds a document index.
iSMART: Ontology-based Semantic Query of CDA Documents
Liu, Shengping; Ni, Yuan; Mei, Jing; Li, Hanyu; Xie, Guotong; Hu, Gang; Liu, Haifeng; Hou, Xueqiao; Pan, Yue
2009-01-01
The Health Level 7 Clinical Document Architecture (CDA) is widely accepted as the format for electronic clinical document. With the rich ontological references in CDA documents, the ontology-based semantic query could be performed to retrieve CDA documents. In this paper, we present iSMART (interactive Semantic MedicAl Record reTrieval), a prototype system designed for ontology-based semantic query of CDA documents. The clinical information in CDA documents will be extracted into RDF triples by a declarative XML to RDF transformer. An ontology reasoner is developed to infer additional information by combining the background knowledge from SNOMED CT ontology. Then an RDF query engine is leveraged to enable the semantic queries. This system has been evaluated using the real clinical documents collected from a large hospital in southern China. PMID:20351883
Design and implementation of CUAHSI WaterML and WaterOneFlow Web Services
NASA Astrophysics Data System (ADS)
Valentine, D. W.; Zaslavsky, I.; Whitenack, T.; Maidment, D.
2007-12-01
WaterOneFlow is a term for a group of web services created by and for the Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) community. CUAHSI web services facilitate the retrieval of hydrologic observations information from online data sources using the SOAP protocol. CUAHSI Water Markup Language (below referred to as WaterML) is an XML schema defining the format of messages returned by the WaterOneFlow web services. \
ERIC Educational Resources Information Center
Scharf, David
2002-01-01
Discusses XML (extensible markup language), particularly as it relates to libraries. Topics include organizing information; cataloging; metadata; similarities to HTML; organizations dealing with XML; making XML useful; a history of XML; the semantic Web; related technologies; XML at the Library of Congress; and its role in improving the…
Issues and solutions for storage, retrieval, and searching of MPEG-7 documents
NASA Astrophysics Data System (ADS)
Chang, Yuan-Chi; Lo, Ming-Ling; Smith, John R.
2000-10-01
The ongoing MPEG-7 standardization activity aims at creating a standard for describing multimedia content in order to facilitate the interpretation of the associated information content. Attempting to address a broad range of applications, MPEG-7 has defined a flexible framework consisting of Descriptors, Description Schemes, and Description Definition Language. Descriptors and Description Schemes describe features, structure and semantics of multimedia objects. They are written in the Description Definition Language (DDL). In the most recent revision, DDL applies XML (Extensible Markup Language) Schema with MPEG-7 extensions. DDL has constructs that support inclusion, inheritance, reference, enumeration, choice, sequence, and abstract type of Description Schemes and Descriptors. In order to enable multimedia systems to use MPEG-7, a number of important problems in storing, retrieving and searching MPEG-7 documents need to be solved. This paper reports on initial finding on issues and solutions of storing and accessing MPEG-7 documents. In particular, we discuss the benefits of using a virtual document management framework based on XML Access Server (XAS) in order to bridge the MPEG-7 multimedia applications and database systems. The need arises partly because MPEG-7 descriptions need customized storage schema, indexing and search engines. We also discuss issues arising in managing dependence and cross-description scheme search.
NASA Astrophysics Data System (ADS)
Maidantchik, C.; Ferreira, F.; Grael, F.; Atlas Tile Calorimeter Community
2010-04-01
The web system described here provides features to monitor the ATLAS Detector Control System (DCS) acquired data. The DCS is responsible for overseeing the coherent and safe operation of the ATLAS experiment hardware. In the context of the Hadronic Tile Calorimeter Detector (TileCal), it controls the power supplies of the readout electronics acquiring voltages, currents, temperatures and coolant pressure measurements. The physics data taking requires the stable operation of the power sources. The TileDCS Web System retrieves automatically data and extracts the statistics for given periods of time. The mean and standard deviation outcomes are stored as XML files and are compared to preset thresholds. Further, a graphical representation of the TileCal cylinders indicates the state of the supply system of each detector drawer. Colors are designated for each kind of state. In this way problems are easier to find and the collaboration members can focus on them. The user selects a module and the system presents detailed information. It is possible to verify the statistics and generate charts of the parameters over the time. The TileDCS Web System also presents information about the power supplies latest status. One wedge is colored green whenever the system is on. Otherwise it is colored red. Furthermore, it is possible to perform customized analysis. It provides search interfaces where the user can set the module, parameters, and the time period of interest. The system also produces the output of the retrieved data as charts, XML files, CSV and ROOT files according to the user's choice.
Prototype Development: Context-Driven Dynamic XML Ophthalmologic Data Capture Application
Schwei, Kelsey M; Kadolph, Christopher; Finamore, Joseph; Cancel, Efrain; McCarty, Catherine A; Okorie, Asha; Thomas, Kate L; Allen Pacheco, Jennifer; Pathak, Jyotishman; Ellis, Stephen B; Denny, Joshua C; Rasmussen, Luke V; Tromp, Gerard; Williams, Marc S; Vrabec, Tamara R; Brilliant, Murray H
2017-01-01
Background The capture and integration of structured ophthalmologic data into electronic health records (EHRs) has historically been a challenge. However, the importance of this activity for patient care and research is critical. Objective The purpose of this study was to develop a prototype of a context-driven dynamic extensible markup language (XML) ophthalmologic data capture application for research and clinical care that could be easily integrated into an EHR system. Methods Stakeholders in the medical, research, and informatics fields were interviewed and surveyed to determine data and system requirements for ophthalmologic data capture. On the basis of these requirements, an ophthalmology data capture application was developed to collect and store discrete data elements with important graphical information. Results The context-driven data entry application supports several features, including ink-over drawing capability for documenting eye abnormalities, context-based Web controls that guide data entry based on preestablished dependencies, and an adaptable database or XML schema that stores Web form specifications and allows for immediate changes in form layout or content. The application utilizes Web services to enable data integration with a variety of EHRs for retrieval and storage of patient data. Conclusions This paper describes the development process used to create a context-driven dynamic XML data capture application for optometry and ophthalmology. The list of ophthalmologic data elements identified as important for care and research can be used as a baseline list for future ophthalmologic data collection activities. PMID:28903894
Extension of the COG and arCOG databases by amino acid and nucleotide sequences
Meereis, Florian; Kaufmann, Michael
2008-01-01
Background The current versions of the COG and arCOG databases, both excellent frameworks for studies in comparative and functional genomics, do not contain the nucleotide sequences corresponding to their protein or protein domain entries. Results Using sequence information obtained from GenBank flat files covering the completely sequenced genomes of the COG and arCOG databases, we constructed NUCOCOG (nucleotide sequences containing COG databases) as an extended version including all nucleotide sequences and in addition the amino acid sequences originally utilized to construct the current COG and arCOG databases. We make available three comprehensive single XML files containing the complete databases including all sequence information. In addition, we provide a web interface as a utility suitable to browse the NUCOCOG database for sequence retrieval. The database is accessible at . Conclusion NUCOCOG offers the possibility to analyze any sequence related property in the context of the COG and arCOG framework simply by using script languages such as PERL applied to a large but single XML document. PMID:19014535
Indexing Temporal XML Using FIX
NASA Astrophysics Data System (ADS)
Zheng, Tiankun; Wang, Xinjun; Zhou, Yingchun
XML has become an important criterion for description and exchange of information. It is of practical significance to introduce the temporal information on this basis, because time has penetrated into all walks of life as an important property information .Such kind of database can track document history and recover information to state of any time before, and is called Temporal XML database. We advise a new feature vector on the basis of FIX which is a feature-based XML index, and build an index on temporal XML database using B+ tree, donated TFIX. We also put forward a new query algorithm upon it for temporal query. Our experiments proved that this index has better performance over other kinds of XML indexes. The index can satisfy all TXPath queries with depth up to K(>0).
Kilintzis, Vassilis; Beredimas, Nikolaos; Chouvarda, Ioanna
2014-01-01
An integral part of a system that manages medical data is the persistent storage engine. For almost twenty five years Relational Database Management Systems(RDBMS) were considered the obvious decision, yet today new technologies have emerged that require our attention as possible alternatives. Triplestores store information in terms of RDF triples without necessarily binding to a specific predefined structural model. In this paper we present an attempt to compare the performance of Apache JENA-Fuseki and the Virtuoso Universal Server 6 triplestores with that of MySQL 5.6 RDBMS for storing and retrieving medical information that it is communicated as RDF/XML ontology instances over a RESTful web service. The results show that the performance, calculated as average time of storing and retrieving instances, is significantly better using Virtuoso Server while MySQL performed better than Fuseki.
Storm Prediction Center Convective Outlooks
Current Watches Meso. Discussions Conv. Outlooks Tstm. Outlooks Fire Wx Outlooks XML logo RSS Feeds E-Mail : Retrieve Outlooks Weather Topics: Watches, Mesoscale Discussions, Outlooks, Fire Weather, All Products
Mann, G; Birkmann, C; Schmidt, T; Schaeffler, V
1999-01-01
Introduction Present solutions for the representation and retrieval of medical information from online sources are not very satisfying. Either the retrieval process lacks of precision and completeness the representation does not support the update and maintenance of the represented information. Most efforts are currently put into improving the combination of search engines and HTML based documents. However, due to the current shortcomings of methods for natural language understanding there are clear limitations to this approach. Furthermore, this approach does not solve the maintenance problem. At least medical information exceeding a certain complexity seems to afford approaches that rely on structured knowledge representation and corresponding retrieval mechanisms. Methods Knowledge-based information systems are based on the following fundamental ideas. The representation of information is based on ontologies that define the structure of the domain's concepts and their relations. Views on domain models are defined and represented as retrieval schemata. Retrieval schemata can be interpreted as canonical query types focussing on specific aspects of the provided information (e.g. diagnosis or therapy centred views). Based on these retrieval schemata it can be decided which parts of the information in the domain model must be represented explicitly and formalised to support the retrieval process. As representation language propositional logic is used. All other information can be represented in a structured but informal way using text, images etc. Layout schemata are used to assign layout information to retrieved domain concepts. Depending on the target environment HTML or XML can be used. Results Based on this approach two knowledge-based information systems have been developed. The 'Ophthalmologic Knowledge-based Information System for Diabetic Retinopathy' (OKIS-DR) provides information on diagnoses, findings, examinations, guidelines, and reference images related to diabetic retinopathy. OKIS-DR uses combinations of findings to specify the information that must be retrieved. The second system focuses on nutrition related allergies and intolerances. Information on allergies and intolerances of a patient are used to retrieve general information on the specified combination of allergies and intolerances. As a special feature the system generates tables showing food types and products that are tolerated or not tolerated by patients. Evaluation by external experts and user groups showed that the described approach of knowledge-based information systems increases the precision and completeness of knowledge retrieval. Due to the structured and non-redundant representation of information the maintenance and update of the information can be simplified. Both systems are available as WWW based online knowledge bases and CD-ROMs (cf. http://mta.gsf.de topic: products).
Storm Prediction Center Product & Report Archives
Current Watches Meso. Discussions Conv. Outlooks Tstm. Outlooks Fire Wx Outlooks XML logo RSS Feeds E-Mail Archived Fire Weather Outlooks To view fire weather outlooks for a previous day, type in the date you wish to retrieve (e.g. 140503 for May 3, 2014). Data available since June 4, 2002. 150101 Retrieve Fire
ERIC Educational Resources Information Center
Mitri, Michel
2012-01-01
XML has become the most ubiquitous format for exchange of data between applications running on the Internet. Most Web Services provide their information to clients in the form of XML. The ability to process complex XML documents in order to extract relevant information is becoming as important a skill for IS students to master as querying…
XML — an opportunity for
NASA Astrophysics Data System (ADS)
Houlding, Simon W.
2001-08-01
Extensible markup language (XML) is a recently introduced meta-language standard on the Web. It provides the rules for development of metadata (markup) standards for information transfer in specific fields. XML allows development of markup languages that describe what information is rather than how it should be presented. This allows computer applications to process the information in intelligent ways. In contrast hypertext markup language (HTML), which fuelled the initial growth of the Web, is a metadata standard concerned exclusively with presentation of information. Besides its potential for revolutionizing Web activities, XML provides an opportunity for development of meaningful data standards in specific application fields. The rapid endorsement of XML by science, industry and e-commerce has already spawned new metadata standards in such fields as mathematics, chemistry, astronomy, multi-media and Web micro-payments. Development of XML-based data standards in the geosciences would significantly reduce the effort currently wasted on manipulating and reformatting data between different computer platforms and applications and would ensure compatibility with the new generation of Web browsers. This paper explores the evolution, benefits and status of XML and related standards in the more general context of Web activities and uses this as a platform for discussion of its potential for development of data standards in the geosciences. Some of the advantages of XML are illustrated by a simple, browser-compatible demonstration of XML functionality applied to a borehole log dataset. The XML dataset and the associated stylesheet and schema declarations are available for FTP download.
δ-dependency for privacy-preserving XML data publishing.
Landberg, Anders H; Nguyen, Kinh; Pardede, Eric; Rahayu, J Wenny
2014-08-01
An ever increasing amount of medical data such as electronic health records, is being collected, stored, shared and managed in large online health information systems and electronic medical record systems (EMR) (Williams et al., 2001; Virtanen, 2009; Huang and Liou, 2007) [1-3]. From such rich collections, data is often published in the form of census and statistical data sets for the purpose of knowledge sharing and enabling medical research. This brings with it an increasing need for protecting individual people privacy, and it becomes an issue of great importance especially when information about patients is exposed to the public. While the concept of data privacy has been comprehensively studied for relational data, models and algorithms addressing the distinct differences and complex structure of XML data are yet to be explored. Currently, the common compromise method is to convert private XML data into relational data for publication. This ad hoc approach results in significant loss of useful semantic information previously carried in the private XML data. Health data often has very complex structure, which is best expressed in XML. In fact, XML is the standard format for exchanging (e.g. HL7 version 3(1)) and publishing health information. Lack of means to deal directly with data in XML format is inevitably a serious drawback. In this paper we propose a novel privacy protection model for XML, and an algorithm for implementing this model. We provide general rules, both for transforming a private XML schema into a published XML schema, and for mapping private XML data to the new privacy-protected published XML data. In addition, we propose a new privacy property, δ-dependency, which can be applied to both relational and XML data, and that takes into consideration the hierarchical nature of sensitive data (as opposed to "quasi-identifiers"). Lastly, we provide an implementation of our model, algorithm and privacy property, and perform an experimental analysis, to demonstrate the proposed privacy scheme in practical application. Copyright © 2014. Published by Elsevier Inc.
Barbosa-Silva, A; Pafilis, E; Ortega, J M; Schneider, R
2007-12-11
Data integration has become an important task for biological database providers. The current model for data exchange among different sources simplifies the manner that distinct information is accessed by users. The evolution of data representation from HTML to XML enabled programs, instead of humans, to interact with biological databases. We present here SRS.php, a PHP library that can interact with the data integration Sequence Retrieval System (SRS). The library has been written using SOAP definitions, and permits the programmatic communication through webservices with the SRS. The interactions are possible by invoking the methods described in WSDL by exchanging XML messages. The current functions available in the library have been built to access specific data stored in any of the 90 different databases (such as UNIPROT, KEGG and GO) using the same query syntax format. The inclusion of the described functions in the source of scripts written in PHP enables them as webservice clients to the SRS server. The functions permit one to query the whole content of any SRS database, to list specific records in these databases, to get specific fields from the records, and to link any record among any pair of linked databases. The case study presented exemplifies the library usage to retrieve information regarding registries of a Plant Defense Mechanisms database. The Plant Defense Mechanisms database is currently being developed, and the proposal of SRS.php library usage is to enable the data acquisition for the further warehousing tasks related to its setup and maintenance.
phyloXML: XML for evolutionary biology and comparative genomics
Han, Mira V; Zmasek, Christian M
2009-01-01
Background Evolutionary trees are central to a wide range of biological studies. In many of these studies, tree nodes and branches need to be associated (or annotated) with various attributes. For example, in studies concerned with organismal relationships, tree nodes are associated with taxonomic names, whereas tree branches have lengths and oftentimes support values. Gene trees used in comparative genomics or phylogenomics are usually annotated with taxonomic information, genome-related data, such as gene names and functional annotations, as well as events such as gene duplications, speciations, or exon shufflings, combined with information related to the evolutionary tree itself. The data standards currently used for evolutionary trees have limited capacities to incorporate such annotations of different data types. Results We developed a XML language, named phyloXML, for describing evolutionary trees, as well as various associated data items. PhyloXML provides elements for commonly used items, such as branch lengths, support values, taxonomic names, and gene names and identifiers. By using "property" elements, phyloXML can be adapted to novel and unforeseen use cases. We also developed various software tools for reading, writing, conversion, and visualization of phyloXML formatted data. Conclusion PhyloXML is an XML language defined by a complete schema in XSD that allows storing and exchanging the structures of evolutionary trees as well as associated data. More information about phyloXML itself, the XSD schema, as well as tools implementing and supporting phyloXML, is available at . PMID:19860910
An Efficient G-XML Data Management Method using XML Spatial Index for Mobile Devices
NASA Astrophysics Data System (ADS)
Tamada, Takashi; Momma, Kei; Seo, Kazuo; Hijikata, Yoshinori; Nishida, Shogo
This paper presents an efficient G-XML data management method for mobile devices. G-XML is XML based encoding for the transport of geographic information. Mobile devices, such as PDA and mobile-phone, performance trail desktop machines, so some techniques are needed for processing G-XML data on mobile devices. In this method, XML-format spatial index file is used to improve an initial display time of G-XML data. This index file contains XML pointer of each feature in G-XML data and classifies these features by multi-dimensional data structures. From the experimental result, we can prove this method speed up about 3-7 times an initial display time of G-XML data on mobile devices.
A Space Surveillance Ontology: Captured in an XML Schema
2000-10-01
characterization in a way most appropriate to a sub- domain. 6. The commercial market is embracing XML, and the military can take advantage of this significant...the space surveillance ontology effort to two key efforts: the Defense Information Infrastructure Common Operating Environment (DII COE) XML...strongly believe XML schemas will supplant them. Some of the advantages that XML schemas provide over DTDs include: • Strong data typing: The XML Schema
Gupta, Amarnath; Bug, William; Marenco, Luis; Qian, Xufei; Condit, Christopher; Rangarajan, Arun; Müller, Hans Michael; Miller, Perry L.; Sanders, Brian; Grethe, Jeffrey S.; Astakhov, Vadim; Shepherd, Gordon; Sternberg, Paul W.; Martone, Maryann E.
2009-01-01
The overarching goal of the NIF (Neuroscience Information Framework) project is to be a one-stop-shop for Neuroscience. This paper provides a technical overview of how the system is designed. The technical goal of the first version of the NIF system was to develop an information system that a neuroscientist can use to locate relevant information from a wide variety of information sources by simple keyword queries. Although the user would provide only keywords to retrieve information, the NIF system is designed to treat them as concepts whose meanings are interpreted by the system. Thus, a search for term should find a record containing synonyms of the term. The system is targeted to find information from web pages, publications, databases, web sites built upon databases, XML documents and any other modality in which such information may be published. We have designed a system to achieve this functionality. A central element in the system is an ontology called NIFSTD (for NIF Standard) constructed by amalgamating a number of known and newly developed ontologies. NIFSTD is used by our ontology management module, called OntoQuest to perform ontology-based search over data sources. The NIF architecture currently provides three different mechanisms for searching heterogeneous data sources including relational databases, web sites, XML documents and full text of publications. Version 1.0 of the NIF system is currently in beta test and may be accessed through http://nif.nih.gov. PMID:18958629
Gupta, Amarnath; Bug, William; Marenco, Luis; Qian, Xufei; Condit, Christopher; Rangarajan, Arun; Müller, Hans Michael; Miller, Perry L; Sanders, Brian; Grethe, Jeffrey S; Astakhov, Vadim; Shepherd, Gordon; Sternberg, Paul W; Martone, Maryann E
2008-09-01
The overarching goal of the NIF (Neuroscience Information Framework) project is to be a one-stop-shop for Neuroscience. This paper provides a technical overview of how the system is designed. The technical goal of the first version of the NIF system was to develop an information system that a neuroscientist can use to locate relevant information from a wide variety of information sources by simple keyword queries. Although the user would provide only keywords to retrieve information, the NIF system is designed to treat them as concepts whose meanings are interpreted by the system. Thus, a search for term should find a record containing synonyms of the term. The system is targeted to find information from web pages, publications, databases, web sites built upon databases, XML documents and any other modality in which such information may be published. We have designed a system to achieve this functionality. A central element in the system is an ontology called NIFSTD (for NIF Standard) constructed by amalgamating a number of known and newly developed ontologies. NIFSTD is used by our ontology management module, called OntoQuest to perform ontology-based search over data sources. The NIF architecture currently provides three different mechanisms for searching heterogeneous data sources including relational databases, web sites, XML documents and full text of publications. Version 1.0 of the NIF system is currently in beta test and may be accessed through http://nif.nih.gov.
2015-07-01
Acronyms ASCII American Standard Code for Information Interchange DAU data acquisition unit DDML data display markup language IHAL...Transfer Standard URI uniform resource identifier W3C World Wide Web Consortium XML extensible markup language XSD XML schema definition XML Style...Style Guide, RCC 125-15, July 2015 1 Introduction The next generation of telemetry systems will rely heavily on extensible markup language (XML
PACS and electronic health records
NASA Astrophysics Data System (ADS)
Cohen, Simona; Gilboa, Flora; Shani, Uri
2002-05-01
Electronic Health Record (EHR) is a major component of the health informatics domain. An important part of the EHR is the medical images obtained over a patient's lifetime and stored in diverse PACS. The vision presented in this paper is that future medical information systems will convert data from various medical sources -- including diverse modalities, PACS, HIS, CIS, RIS, and proprietary systems -- to HL7 standard XML documents. Then, the various documents are indexed and compiled to EHRs, upon which complex queries can be posed. We describe the conversion of data retrieved from PACS systems through DICOM to HL7 standard XML documents. This enables the EHR system to answer queries such as 'Get all chest images of patients at the age of 20-30, that have blood type 'A' and are allergic to pine trees', which a single PACS cannot answer. The integration of data from multiple sources makes our approach capable of delivering such answers. It enables the correlation of medical, demographic, clinical, and even genetic information. In addition, by fully indexing all the tagged data in DICOM objects, it becomes possible to offer access to huge amounts of valuable data, which can be better exploited in the specific radiology domain.
Yokochi, Masashi; Kobayashi, Naohiro; Ulrich, Eldon L; Kinjo, Akira R; Iwata, Takeshi; Ioannidis, Yannis E; Livny, Miron; Markley, John L; Nakamura, Haruki; Kojima, Chojiro; Fujiwara, Toshimichi
2016-05-05
The nuclear magnetic resonance (NMR) spectroscopic data for biological macromolecules archived at the BioMagResBank (BMRB) provide a rich resource of biophysical information at atomic resolution. The NMR data archived in NMR-STAR ASCII format have been implemented in a relational database. However, it is still fairly difficult for users to retrieve data from the NMR-STAR files or the relational database in association with data from other biological databases. To enhance the interoperability of the BMRB database, we present a full conversion of BMRB entries to two standard structured data formats, XML and RDF, as common open representations of the NMR-STAR data. Moreover, a SPARQL endpoint has been deployed. The described case study demonstrates that a simple query of the SPARQL endpoints of the BMRB, UniProt, and Online Mendelian Inheritance in Man (OMIM), can be used in NMR and structure-based analysis of proteins combined with information of single nucleotide polymorphisms (SNPs) and their phenotypes. We have developed BMRB/XML and BMRB/RDF and demonstrate their use in performing a federated SPARQL query linking the BMRB to other databases through standard semantic web technologies. This will facilitate data exchange across diverse information resources.
XML Flight/Ground Data Dictionary Management
NASA Technical Reports Server (NTRS)
Wright, Jesse; Wiklow, Colette
2007-01-01
A computer program generates Extensible Markup Language (XML) files that effect coupling between the command- and telemetry-handling software running aboard a spacecraft and the corresponding software running in ground support systems. The XML files are produced by use of information from the flight software and from flight-system engineering. The XML files are converted to legacy ground-system data formats for command and telemetry, transformed into Web-based and printed documentation, and used in developing new ground-system data-handling software. Previously, the information about telemetry and command was scattered in various paper documents that were not synchronized. The process of searching and reading the documents was time-consuming and introduced errors. In contrast, the XML files contain all of the information in one place. XML structures can evolve in such a manner as to enable the addition, to the XML files, of the metadata necessary to track the changes and the associated documentation. The use of this software has reduced the extent of manual operations in developing a ground data system, thereby saving considerable time and removing errors that previously arose in the translation and transcription of software information from the flight to the ground system.
Representing nested semantic information in a linear string of text using XML.
Krauthammer, Michael; Johnson, Stephen B; Hripcsak, George; Campbell, David A; Friedman, Carol
2002-01-01
XML has been widely adopted as an important data interchange language. The structure of XML enables sharing of data elements with variable degrees of nesting as long as the elements are grouped in a strict tree-like fashion. This requirement potentially restricts the usefulness of XML for marking up written text, which often includes features that do not properly nest within other features. We encountered this problem while marking up medical text with structured semantic information from a Natural Language Processor. Traditional approaches to this problem separate the structured information from the actual text mark up. This paper introduces an alternative solution, which tightly integrates the semantic structure with the text. The resulting XML markup preserves the linearity of the medical texts and can therefore be easily expanded with additional types of information.
Representing nested semantic information in a linear string of text using XML.
Krauthammer, Michael; Johnson, Stephen B.; Hripcsak, George; Campbell, David A.; Friedman, Carol
2002-01-01
XML has been widely adopted as an important data interchange language. The structure of XML enables sharing of data elements with variable degrees of nesting as long as the elements are grouped in a strict tree-like fashion. This requirement potentially restricts the usefulness of XML for marking up written text, which often includes features that do not properly nest within other features. We encountered this problem while marking up medical text with structured semantic information from a Natural Language Processor. Traditional approaches to this problem separate the structured information from the actual text mark up. This paper introduces an alternative solution, which tightly integrates the semantic structure with the text. The resulting XML markup preserves the linearity of the medical texts and can therefore be easily expanded with additional types of information. PMID:12463856
Using XML to Separate Content from the Presentation Software in eLearning Applications
ERIC Educational Resources Information Center
Merrill, Paul F.
2005-01-01
This paper has shown how XML (extensible Markup Language) can be used to mark up content. Since XML documents, with meaningful tags, can be interpreted easily by humans as well as computers, they are ideal for the interchange of information. Because XML tags can be defined by an individual or organization, XML documents have proven useful in a…
The Cadmio XML healthcare record.
Barbera, Francesco; Ferri, Fernando; Ricci, Fabrizio L; Sottile, Pier Angelo
2002-01-01
The management of clinical data is a complex task. Patient related information reported in patient folders is a set of heterogeneous and structured data accessed by different users having different goals (in local or geographical networks). XML language provides a mechanism for describing, manipulating, and visualising structured data in web-based applications. XML ensures that the structured data is managed in a uniform and transparent manner independently from the applications and their providers guaranteeing some interoperability. Extracting data from the healthcare record and structuring them according to XML makes the data available through browsers. The MIC/MIE model (Medical Information Category/Medical Information Elements), which allows the definition and management of healthcare records and used in CADMIO, a HISA based project, is described in this paper, using XML for allowing the data to be visualised through web browsers.
Report of Official Foreign Travel to Montreal, Canada
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mason, J. D.
How can DOE, NNSA, and Y-12 best handle the integration of information from diverse sources, and what will best ensure that legacy data will survive changes in computing systems for the future? Although there is no simple answer, it is becoming increasingly clear throughout the information-management industry that a key component of both preservation and integration of information is the adoption of standardized data formats. The most notable standardized format is XML, to which almost all data is now migrating. XML is derived from SGML, as is HTML, the common language of the World Wide Web. XML is becoming increasinglymore » important as part of the Y-12 data infrastructure. Y-12 is implementing a new generation of XML-based publishing systems. Y-12 already has been supporting projects at DOE Headquarters, such as the Guidance Streamlining Initiative (GSI) that will result in the storage of classification guidance in XML. Y-12 collects some test data in XML as the result of Electronic Data Capture (EDC), and XML data is also used in Engineering Releases. I am participating in a series of projects sponsored by the PRIDE initiative that include the capture of dimensional certification and other similar records in XML, the creation of XML formats for Electronic Data Capture, and the creation of Quality Evaluation Reports in XML. In support of DOE's use of SGML, XML, HTML, Topic Maps, and related standards, I served 1985-2007 as chairman of the international committee responsible for SGML and standards derived from it, ISO/IEC JTC1/SC34 (SC34) and its predecessor organizations; I continue to belong to the committee. During the August 2010 trip, I co-chaired the conference Balisage 2010.« less
Prototype Development: Context-Driven Dynamic XML Ophthalmologic Data Capture Application.
Peissig, Peggy; Schwei, Kelsey M; Kadolph, Christopher; Finamore, Joseph; Cancel, Efrain; McCarty, Catherine A; Okorie, Asha; Thomas, Kate L; Allen Pacheco, Jennifer; Pathak, Jyotishman; Ellis, Stephen B; Denny, Joshua C; Rasmussen, Luke V; Tromp, Gerard; Williams, Marc S; Vrabec, Tamara R; Brilliant, Murray H
2017-09-13
The capture and integration of structured ophthalmologic data into electronic health records (EHRs) has historically been a challenge. However, the importance of this activity for patient care and research is critical. The purpose of this study was to develop a prototype of a context-driven dynamic extensible markup language (XML) ophthalmologic data capture application for research and clinical care that could be easily integrated into an EHR system. Stakeholders in the medical, research, and informatics fields were interviewed and surveyed to determine data and system requirements for ophthalmologic data capture. On the basis of these requirements, an ophthalmology data capture application was developed to collect and store discrete data elements with important graphical information. The context-driven data entry application supports several features, including ink-over drawing capability for documenting eye abnormalities, context-based Web controls that guide data entry based on preestablished dependencies, and an adaptable database or XML schema that stores Web form specifications and allows for immediate changes in form layout or content. The application utilizes Web services to enable data integration with a variety of EHRs for retrieval and storage of patient data. This paper describes the development process used to create a context-driven dynamic XML data capture application for optometry and ophthalmology. The list of ophthalmologic data elements identified as important for care and research can be used as a baseline list for future ophthalmologic data collection activities. ©Peggy Peissig, Kelsey M Schwei, Christopher Kadolph, Joseph Finamore, Efrain Cancel, Catherine A McCarty, Asha Okorie, Kate L Thomas, Jennifer Allen Pacheco, Jyotishman Pathak, Stephen B Ellis, Joshua C Denny, Luke V Rasmussen, Gerard Tromp, Marc S Williams, Tamara R Vrabec, Murray H Brilliant. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 13.09.2017.
Semantically Interoperable XML Data
Vergara-Niedermayr, Cristobal; Wang, Fusheng; Pan, Tony; Kurc, Tahsin; Saltz, Joel
2013-01-01
XML is ubiquitously used as an information exchange platform for web-based applications in healthcare, life sciences, and many other domains. Proliferating XML data are now managed through latest native XML database technologies. XML data sources conforming to common XML schemas could be shared and integrated with syntactic interoperability. Semantic interoperability can be achieved through semantic annotations of data models using common data elements linked to concepts from ontologies. In this paper, we present a framework and software system to support the development of semantic interoperable XML based data sources that can be shared through a Grid infrastructure. We also present our work on supporting semantic validated XML data through semantic annotations for XML Schema, semantic validation and semantic authoring of XML data. We demonstrate the use of the system for a biomedical database of medical image annotations and markups. PMID:25298789
SeqHound: biological sequence and structure database as a platform for bioinformatics research
2002-01-01
Background SeqHound has been developed as an integrated biological sequence, taxonomy, annotation and 3-D structure database system. It provides a high-performance server platform for bioinformatics research in a locally-hosted environment. Results SeqHound is based on the National Center for Biotechnology Information data model and programming tools. It offers daily updated contents of all Entrez sequence databases in addition to 3-D structural data and information about sequence redundancies, sequence neighbours, taxonomy, complete genomes, functional annotation including Gene Ontology terms and literature links to PubMed. SeqHound is accessible via a web server through a Perl, C or C++ remote API or an optimized local API. It provides functionality necessary to retrieve specialized subsets of sequences, structures and structural domains. Sequences may be retrieved in FASTA, GenBank, ASN.1 and XML formats. Structures are available in ASN.1, XML and PDB formats. Emphasis has been placed on complete genomes, taxonomy, domain and functional annotation as well as 3-D structural functionality in the API, while fielded text indexing functionality remains under development. SeqHound also offers a streamlined WWW interface for simple web-user queries. Conclusions The system has proven useful in several published bioinformatics projects such as the BIND database and offers a cost-effective infrastructure for research. SeqHound will continue to develop and be provided as a service of the Blueprint Initiative at the Samuel Lunenfeld Research Institute. The source code and examples are available under the terms of the GNU public license at the Sourceforge site http://sourceforge.net/projects/slritools/ in the SLRI Toolkit. PMID:12401134
Hoelzer, S; Schweiger, R K; Boettcher, H A; Tafazzoli, A G; Dudeck, J
2001-01-01
The purpose of guidelines in clinical practice is to improve the effectiveness and efficiency of clinical care. It is known that nationally or internationally produced guidelines which, in particular, do not involve medical processes at the time of consultation, do not take local factors into account, and have no consistent implementation strategy, have limited impact in changing either the behaviour of physicians, or patterns of care. The literature provides evidence for the effectiveness of computerization of CPGs for increasing compliance and improving patient outcomes. Probably the most effective concepts are knowledge-based functions for decision support or monitoring that are integrated in clinical information systems. This approach is mostly restricted by the effort required for development and maintenance of the information systems and the limited number of implemented medical rules. Most of the guidelines are text-based, and are primarily published in medical journals and posted on the internet. However, internet-published guidelines have little impact on the behaviour of physicians. It can be difficult and time-consuming to browse the internet to find (a) the correct guidelines to an existing diagnosis and (b) and adequate recommendation for a specific clinical problem. Our objective is to provide a web-based guideline service that takes as input clinical data on a particular patient and returns as output a customizable set of recommendations regarding diagnosis and treatment. Information in healthcare is to a very large extent transmitted and stored as unstructured or slightly structured text such as discharge letters, reports, forms, etc. The same applies for facilities containing medical information resources for clinical purposes and research such as text books, articles, guidelines, etc. Physicians are used to obtaining information from text-based sources. Since most guidelines are text-based, it would be practical to use a document-based solution that preserves the original cohesiveness. The lack of structure limits the automatic identification and extraction of the information contained in these resources. For this reason, we have chosen a document-based approach using eXtensible Markup Language (XML) with its schema definition and related technologies. XML empowers the applications for in-context searching. In addition it allows the same content to be represented in different ways. Our XML reference clinical data model for guidelines has been realized with the XML schema definition. The schema is used for structuring new text-based guidelines and updating existing documents. It is also used to establish search strategies on the document base. We hypothesize that enabling the physicians to query the available CPGs easily, and to get access to selected and specific information at the point of care will foster increased use. Based on current evidence we are confident that it will have substantial impact on the care provided, and will improve health outcomes.
Framework and prototype for a secure XML-based electronic health records system.
Steele, Robert; Gardner, William; Chandra, Darius; Dillon, Tharam S
2007-01-01
Security of personal medical information has always been a challenge for the advancement of Electronic Health Records (EHRs) initiatives. eXtensible Markup Language (XML), is rapidly becoming the key standard for data representation and transportation. The widespread use of XML and the prospect of its use in the Electronic Health (e-health) domain highlights the need for flexible access control models for XML data and documents. This paper presents a declarative access control model for XML data repositories that utilises an expressive XML role control model. The operational semantics of this model are illustrated by Xplorer, a user interface generation engine which supports search-browse-navigate activities on XML repositories.
Engineering Analysis Using a Web-based Protocol
NASA Technical Reports Server (NTRS)
Schoeffler, James D.; Claus, Russell W.
2002-01-01
This paper reviews the development of a web-based framework for engineering analysis. A one-dimensional, high-speed analysis code called LAPIN was used in this study, but the approach can be generalized to any engineering analysis tool. The web-based framework enables users to store, retrieve, and execute an engineering analysis from a standard web-browser. We review the encapsulation of the engineering data into the eXtensible Markup Language (XML) and various design considerations in the storage and retrieval of application data.
Paterson, Trevor; Law, Andy
2009-08-14
Genomic analysis, particularly for less well-characterized organisms, is greatly assisted by performing comparative analyses between different types of genome maps and across species boundaries. Various providers publish a plethora of on-line resources collating genome mapping data from a multitude of species. Datasources range in scale and scope from small bespoke resources for particular organisms, through larger web-resources containing data from multiple species, to large-scale bioinformatics resources providing access to data derived from genome projects for model and non-model organisms. The heterogeneity of information held in these resources reflects both the technologies used to generate the data and the target users of each resource. Currently there is no common information exchange standard or protocol to enable access and integration of these disparate resources. Consequently data integration and comparison must be performed in an ad hoc manner. We have developed a simple generic XML schema (GenomicMappingData.xsd - GMD) to allow export and exchange of mapping data in a common lightweight XML document format. This schema represents the various types of data objects commonly described across mapping datasources and provides a mechanism for recording relationships between data objects. The schema is sufficiently generic to allow representation of any map type (for example genetic linkage maps, radiation hybrid maps, sequence maps and physical maps). It also provides mechanisms for recording data provenance and for cross referencing external datasources (including for example ENSEMBL, PubMed and Genbank.). The schema is extensible via the inclusion of additional datatypes, which can be achieved by importing further schemas, e.g. a schema defining relationship types. We have built demonstration web services that export data from our ArkDB database according to the GMD schema, facilitating the integration of data retrieval into Taverna workflows. The data exchange standard we present here provides a useful generic format for transfer and integration of genomic and genetic mapping data. The extensibility of our schema allows for inclusion of additional data and provides a mechanism for typing mapping objects via third party standards. Web services retrieving GMD-compliant mapping data demonstrate that use of this exchange standard provides a practical mechanism for achieving data integration, by facilitating syntactically and semantically-controlled access to the data.
Paterson, Trevor; Law, Andy
2009-01-01
Background Genomic analysis, particularly for less well-characterized organisms, is greatly assisted by performing comparative analyses between different types of genome maps and across species boundaries. Various providers publish a plethora of on-line resources collating genome mapping data from a multitude of species. Datasources range in scale and scope from small bespoke resources for particular organisms, through larger web-resources containing data from multiple species, to large-scale bioinformatics resources providing access to data derived from genome projects for model and non-model organisms. The heterogeneity of information held in these resources reflects both the technologies used to generate the data and the target users of each resource. Currently there is no common information exchange standard or protocol to enable access and integration of these disparate resources. Consequently data integration and comparison must be performed in an ad hoc manner. Results We have developed a simple generic XML schema (GenomicMappingData.xsd – GMD) to allow export and exchange of mapping data in a common lightweight XML document format. This schema represents the various types of data objects commonly described across mapping datasources and provides a mechanism for recording relationships between data objects. The schema is sufficiently generic to allow representation of any map type (for example genetic linkage maps, radiation hybrid maps, sequence maps and physical maps). It also provides mechanisms for recording data provenance and for cross referencing external datasources (including for example ENSEMBL, PubMed and Genbank.). The schema is extensible via the inclusion of additional datatypes, which can be achieved by importing further schemas, e.g. a schema defining relationship types. We have built demonstration web services that export data from our ArkDB database according to the GMD schema, facilitating the integration of data retrieval into Taverna workflows. Conclusion The data exchange standard we present here provides a useful generic format for transfer and integration of genomic and genetic mapping data. The extensibility of our schema allows for inclusion of additional data and provides a mechanism for typing mapping objects via third party standards. Web services retrieving GMD-compliant mapping data demonstrate that use of this exchange standard provides a practical mechanism for achieving data integration, by facilitating syntactically and semantically-controlled access to the data. PMID:19682365
ERIC Educational Resources Information Center
Gazan, Rich
2000-01-01
Surveys the current state of Extensible Markup Language (XML), a metalanguage for creating structured documents that describe their own content, and its implications for information professionals. Predicts that XML will become the common language underlying Web, word processing, and database formats. Also discusses Extensible Stylesheet Language…
XML syntax for clinical laboratory procedure manuals.
Saadawi, Gilan; Harrison, James H
2003-01-01
We have developed a document type description (DTD) in Extensable Markup Language (XML) for clinical laboratory procedures. Our XML syntax can adequately structure a variety of procedure types across different laboratories and is compatible with current procedure standards. The combination of this format with an XML content management system and appropriate style sheets will allow efficient procedure maintenance, distributed access, customized display and effective searching across a large body of test information.
ERIC Educational Resources Information Center
Banerjee, Kyle
2002-01-01
Discusses XML, how it has transformed the way information is managed and delivered, and its impact on libraries. Topics include how XML differs from other markup languages; the document object model (DOM); style sheets; practical applications for archival materials, interlibrary loans, digital collections, and MARC data; and future possibilities.…
NASA Astrophysics Data System (ADS)
Antani, Sameer K.; Natarajan, Mukil; Long, Jonathan L.; Long, L. Rodney; Thoma, George R.
2005-04-01
The article describes the status of our ongoing R&D at the U.S. National Library of Medicine (NLM) towards the development of an advanced multimedia database biomedical information system that supports content-based image retrieval (CBIR). NLM maintains a collection of 17,000 digitized spinal X-rays along with text survey data from the Second National Health and Nutritional Examination Survey (NHANES II). These data serve as a rich data source for epidemiologists and researchers of osteoarthritis and musculoskeletal diseases. It is currently possible to access these through text keyword queries using our Web-based Medical Information Retrieval System (WebMIRS). CBIR methods developed specifically for biomedical images could offer direct visual searching of these images by means of example image or user sketch. We are building a system which supports hybrid queries that have text and image-content components. R&D goals include developing algorithms for robust image segmentation for localizing and identifying relevant anatomy, labeling the segmented anatomy based on its pathology, developing suitable indexing and similarity matching methods for images and image features, and associating the survey text information for query and retrieval along with the image data. Some highlights of the system developed in MATLAB and Java are: use of a networked or local centralized database for text and image data; flexibility to incorporate new research work; provides a means to control access to system components under development; and use of XML for structured reporting. The article details the design, features, and algorithms in this third revision of this prototype system, CBIR3.
Standardization of XML Database Exchanges and the James Webb Space Telescope Experience
NASA Technical Reports Server (NTRS)
Gal-Edd, Jonathan; Detter, Ryan; Jones, Ron; Fatig, Curtis C.
2007-01-01
Personnel from the National Aeronautics and Space Administration (NASA) James Webb Space Telescope (JWST) Project have been working with various standard communities such the Object Management Group (OMG) and the Consultative Committee for Space Data Systems (CCSDS) to assist in the definition of a common extensible Markup Language (XML) for database exchange format. The CCSDS and OMG standards are intended for the exchange of core command and telemetry information, not for all database information needed to exercise a NASA space mission. The mission-specific database, containing all the information needed for a space mission, is translated from/to the standard using a translator. The standard is meant to provide a system that encompasses 90% of the information needed for command and telemetry processing. This paper will discuss standardization of the XML database exchange format, tools used, and the JWST experience, as well as future work with XML standard groups both commercial and government.
Get It Together: Integrating Data with XML.
ERIC Educational Resources Information Center
Miller, Ron
2003-01-01
Discusses the use of XML for data integration to move data across different platforms, including across the Internet, from a variety of sources. Topics include flexibility; standards; organizing databases; unstructured data and the use of meta tags to encode it with XML information; cost effectiveness; and eliminating client software licenses.…
A future Outlook: Web based Simulation of Hydrodynamic models
NASA Astrophysics Data System (ADS)
Islam, A. S.; Piasecki, M.
2003-12-01
Despite recent advances to present simulation results as 3D graphs or animation contours, the modeling user community still faces some shortcomings when trying to move around and analyze data. Typical problems include the lack of common platforms with standard vocabulary to exchange simulation results from different numerical models, insufficient descriptions about data (metadata), lack of robust search and retrieval tools for data, and difficulties to reuse simulation domain knowledge. This research demonstrates how to create a shared simulation domain in the WWW and run a number of models through multi-user interfaces. Firstly, meta-datasets have been developed to describe hydrodynamic model data based on geographic metadata standard (ISO 19115) that has been extended to satisfy the need of the hydrodynamic modeling community. The Extended Markup Language (XML) is used to publish this metadata by the Resource Description Framework (RDF). Specific domain ontology for Web Based Simulation (WBS) has been developed to explicitly define vocabulary for the knowledge based simulation system. Subsequently, this knowledge based system is converted into an object model using Meta Object Family (MOF). The knowledge based system acts as a Meta model for the object oriented system, which aids in reusing the domain knowledge. Specific simulation software has been developed based on the object oriented model. Finally, all model data is stored in an object relational database. Database back-ends help store, retrieve and query information efficiently. This research uses open source software and technology such as Java Servlet and JSP, Apache web server, Tomcat Servlet Engine, PostgresSQL databases, Protégé ontology editor, RDQL and RQL for querying RDF in semantic level, Jena Java API for RDF. Also, we use international standards such as the ISO 19115 metadata standard, and specifications such as XML, RDF, OWL, XMI, and UML. The final web based simulation product is deployed as Web Archive (WAR) files which is platform and OS independent and can be used by Windows, UNIX, or Linux. Keywords: Apache, ISO 19115, Java Servlet, Jena, JSP, Metadata, MOF, Linux, Ontology, OWL, PostgresSQL, Protégé, RDF, RDQL, RQL, Tomcat, UML, UNIX, Windows, WAR, XML
Büssow, Konrad; Hoffmann, Steve; Sievert, Volker
2002-12-19
Functional genomics involves the parallel experimentation with large sets of proteins. This requires management of large sets of open reading frames as a prerequisite of the cloning and recombinant expression of these proteins. A Java program was developed for retrieval of protein and nucleic acid sequences and annotations from NCBI GenBank, using the XML sequence format. Annotations retrieved by ORFer include sequence name, organism and also the completeness of the sequence. The program has a graphical user interface, although it can be used in a non-interactive mode. For protein sequences, the program also extracts the open reading frame sequence, if available, and checks its correct translation. ORFer accepts user input in the form of single or lists of GenBank GI identifiers or accession numbers. It can be used to extract complete sets of open reading frames and protein sequences from any kind of GenBank sequence entry, including complete genomes or chromosomes. Sequences are either stored with their features in a relational database or can be exported as text files in Fasta or tabulator delimited format. The ORFer program is freely available at http://www.proteinstrukturfabrik.de/orfer. The ORFer program allows for fast retrieval of DNA sequences, protein sequences and their open reading frames and sequence annotations from GenBank. Furthermore, storage of sequences and features in a relational database is supported. Such a database can supplement a laboratory information system (LIMS) with appropriate sequence information.
Metadata and Service at the GFZ ISDC Portal
NASA Astrophysics Data System (ADS)
Ritschel, B.
2008-05-01
The online service portal of the GFZ Potsdam Information System and Data Center (ISDC) is an access point for all manner of geoscientific geodata, its corresponding metadata, scientific documentation and software tools. At present almost 2000 national and international users and user groups have the opportunity to request Earth science data from a portfolio of 275 different products types and more than 20 Million single data files with an added volume of approximately 12 TByte. The majority of the data and information, the portal currently offers to the public, are global geomonitoring products such as satellite orbit and Earth gravity field data as well as geomagnetic and atmospheric data for the exploration. These products for Earths changing system are provided via state-of-the art retrieval techniques. The data product catalog system behind these techniques is based on the extensive usage of standardized metadata, which are describing the different geoscientific product types and data products in an uniform way. Where as all ISDC product types are specified by NASA's Directory Interchange Format (DIF), Version 9.0 Parent XML DIF metadata files, the individual data files are described by extended DIF metadata documents. Depending on the beginning of the scientific project, one part of data files are described by extended DIF, Version 6 metadata documents and the other part are specified by data Child XML DIF metadata documents. Both, the product type dependent parent DIF metadata documents and the data file dependent child DIF metadata documents are derived from a base-DIF.xsd xml schema file. The ISDC metadata philosophy defines a geoscientific product as a package consisting of mostly one or sometimes more than one data file plus one extended DIF metadata file. Because NASA's DIF metadata standard has been developed in order to specify a collection of data only, the extension of the DIF standard consists of new and specific attributes, which are necessary for an explicit identification of single data files and the set-up of a comprehensive Earth science data catalog. The huge ISDC data catalog is realized by product type dependent tables filled with data file related metadata, which have relations to corresponding metadata tables. The product type describing parent DIF XML metadata documents are stored and managed in ORACLE's XML storage structures. In order to improve the interoperability of the ISDC service portal, the existing proprietary catalog system will be extended by an ISO 19115 based web catalog service. In addition to this development there is ISDC related concerning semantic network of different kind of metadata resources, like different kind of standardized and not-standardized metadata documents and literature as well as Web 2.0 user generated information derived from tagging activities and social navigation data.
Pinciroli, Francesco; Masseroli, Marco; Acerbo, Livio A; Bonacina, Stefano; Ferrari, Roberto; Marchente, Mario
2004-01-01
This paper presents a low cost software platform prototype supporting health care personnel in retrieving patient referral multimedia data. These information are centralized in a server machine and structured by using a flexible eXtensible Markup Language (XML) Bio-Image Referral Database (BIRD). Data are distributed on demand to requesting client in an Intranet network and transformed via eXtensible Stylesheet Language (XSL) to be visualized in an uniform way on market browsers. The core server operation software has been developed in PHP Hypertext Preprocessor scripting language, which is very versatile and useful for crafting a dynamic Web environment.
An XML-Based Mission Command Language for Autonomous Underwater Vehicles (AUVs)
2003-06-01
P. XML: How To Program . Prentice Hall, Inc. Upper Saddle River, New Jersey, 2001 Digital Signature Activity Statement, W3C www.w3.org/Signature...languages because it does not directly specify how information is to be presented, but rather defines the structure (and thus semantics) of the...command and control (C2) aspects of using XML to increase the utility of AUVs. XML programming will be addressed. Current mine warfare doctrine will be
NASA Astrophysics Data System (ADS)
Mueller, Wolfgang; Mueller, Henning; Marchand-Maillet, Stephane; Pun, Thierry; Squire, David M.; Pecenovic, Zoran; Giess, Christoph; de Vries, Arjen P.
2000-10-01
While in the area of relational databases interoperability is ensured by common communication protocols (e.g. ODBC/JDBC using SQL), Content Based Image Retrieval Systems (CBIRS) and other multimedia retrieval systems are lacking both a common query language and a common communication protocol. Besides its obvious short term convenience, interoperability of systems is crucial for the exchange and analysis of user data. In this paper, we present and describe an extensible XML-based query markup language, called MRML (Multimedia Retrieval markup Language). MRML is primarily designed so as to ensure interoperability between different content-based multimedia retrieval systems. Further, MRML allows researchers to preserve their freedom in extending their system as needed. MRML encapsulates multimedia queries in a way that enable multimedia (MM) query languages, MM content descriptions, MM query engines, and MM user interfaces to grow independently from each other, reaching a maximum of interoperability while ensuring a maximum of freedom for the developer. For benefitting from this, only a few simple design principles have to be respected when extending MRML for one's fprivate needs. The design of extensions withing the MRML framework will be described in detail in the paper. MRML has been implemented and tested for the CBIRS Viper, using the user interface Snake Charmer. Both are part of the GNU project and can be downloaded at our site.
Noelle, G; Dudeck, J
1999-01-01
Two years, since the World Wide Web Consortium (W3C) has published the first specification of the eXtensible Markup Language (XML) there exist some concrete tools and applications to work with XML-based data. In particular, new generation Web browsers offer great opportunities to develop new kinds of medical, web-based applications. There are several data-exchange formats in medicine, which have been established in the last years: HL-7, DICOM, EDIFACT and, in the case of Germany, xDT. Whereas communication and information exchange becomes increasingly important, the development of appropriate and necessary interfaces causes problems, rising costs and effort. It has been also recognised that it is difficult to define a standardised interchange format, for one of the major future developments in medical telematics: the electronic patient record (EPR) and its availability on the Internet. Whereas XML, especially in an industrial environment, is celebrated as a generic standard and a solution for all problems concerning e-commerce, in a medical context there are only few applications developed. Nevertheless, the medical environment is an appropriate area for building XML applications: as the information and communication management becomes increasingly important in medical businesses, the role of the Internet changes quickly from an information to a communication medium. The first XML based applications in healthcare show us the advantage for a future engagement of the healthcare industry in XML: such applications are open, easy to extend and cost-effective. Additionally, XML is much more than a simple new data interchange format: many proposals for data query (XQL), data presentation (XSL) and other extensions have been proposed to the W3C and partly realised in medical applications.
An XML-Based Knowledge Management System of Port Information for U.S. Coast Guard Cutters
2003-03-01
using DTDs was not chosen. XML Schema performs many of the same functions as SQL type schemas, but differ by the unique structure of XML documents...to access data from content files within the developed system. XPath is not equivalent to SQL . While XPath is very powerful at reaching into an XML...document and finding nodes or node sets, it is not a complete query language. For operations like joins, unions, intersections, etc., SQL is far
NASA Astrophysics Data System (ADS)
You, Xiaozhen; Yao, Zhihong
2005-04-01
As a standard of communication and storage for medical digital images, DICOM has been playing a very important role in integration of hospital information. In DICOM, tags are expressed by numbers, and only standard data elements can be shared by looking up Data Dictionary while private tags can not. As such, a DICOM file's readability and extensibility is limited. In addition, reading DICOM files needs special software. In our research, we introduced XML into DICOM, defining an XML-based DICOM special transfer format, XML-DCM, a DICOM storage format, X-DCM, as well as developing a program package to realize format interchange among DICOM, XML-DCM, and X-DCM. XML-DCM is based on the DICOM structure while replacing numeric tags with accessible XML character string tags. The merits are as following: a) every character string tag of XML-DCM has explicit meaning, so users can understand standard data elements and those private data elements easily without looking up the Data Dictionary. In this way, the readability and data sharing of DICOM files are greatly improved; b) According to requirements, users can set new character string tags with explicit meaning to their own system to extend the capacity of data elements; c) User can read the medical image and associated information conveniently through IE, ultimately enlarging the scope of data sharing. The application of storage format X-DCM will reduce data redundancy and save storage memory. The result of practical application shows that XML-DCM does favor integration and share of medical image data among different systems or devices.
Integrated Syntactic/Semantic XML Data Validation with a Reusable Software Component
ERIC Educational Resources Information Center
Golikov, Steven
2013-01-01
Data integration is a critical component of enterprise system integration, and XML data validation is the foundation for sound data integration of XML-based information systems. Since B2B e-commerce relies on data validation as one of the critical components for enterprise integration, it is imperative for financial industries and e-commerce…
Adding XML to the MIS Curriculum: Lessons from the Classroom
ERIC Educational Resources Information Center
Wagner, William P.; Pant, Vik; Hilken, Ralph
2008-01-01
eXtensible Markup Language (XML) is a new technology that is currently being extolled by many industry experts and software vendors. Potentially it represents a platform independent language for sharing information over networks in a way that is much more seamless than with previous technologies. It is extensible in that XML serves as a "meta"…
Data Manipulation in an XML-Based Digital Image Library
ERIC Educational Resources Information Center
Chang, Naicheng
2005-01-01
Purpose: To help to clarify the role of XML tools and standards in supporting transition and migration towards a fully XML-based environment for managing access to information. Design/methodology/approach: The Ching Digital Image Library, built on a three-tier architecture, is used as a source of examples to illustrate a number of methods of data…
Utilizing the Structure and Content Information for XML Document Clustering
NASA Astrophysics Data System (ADS)
Tran, Tien; Kutty, Sangeetha; Nayak, Richi
This paper reports on the experiments and results of a clustering approach used in the INEX 2008 document mining challenge. The clustering approach utilizes both the structure and content information of the Wikipedia XML document collection. A latent semantic kernel (LSK) is used to measure the semantic similarity between XML documents based on their content features. The construction of a latent semantic kernel involves the computing of singular vector decomposition (SVD). On a large feature space matrix, the computation of SVD is very expensive in terms of time and memory requirements. Thus in this clustering approach, the dimension of the document space of a term-document matrix is reduced before performing SVD. The document space reduction is based on the common structural information of the Wikipedia XML document collection. The proposed clustering approach has shown to be effective on the Wikipedia collection in the INEX 2008 document mining challenge.
A Practical Introduction to the XML, Extensible Markup Language, by Way of Some Useful Examples
ERIC Educational Resources Information Center
Snyder, Robin
2004-01-01
XML, Extensible Markup Language, is important as a way to represent and encapsulate the structure of underlying data in a portable way that supports data exchange regardless of the physical storage of the data. This paper (and session) introduces some useful and practical aspects of XML technology for sharing information in a educational setting…
The Protein Information Resource: an integrated public resource of functional annotation of proteins
Wu, Cathy H.; Huang, Hongzhan; Arminski, Leslie; Castro-Alvear, Jorge; Chen, Yongxing; Hu, Zhang-Zhi; Ledley, Robert S.; Lewis, Kali C.; Mewes, Hans-Werner; Orcutt, Bruce C.; Suzek, Baris E.; Tsugita, Akira; Vinayaka, C. R.; Yeh, Lai-Su L.; Zhang, Jian; Barker, Winona C.
2002-01-01
The Protein Information Resource (PIR) serves as an integrated public resource of functional annotation of protein data to support genomic/proteomic research and scientific discovery. The PIR, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the PIR-International Protein Sequence Database (PSD), the major annotated protein sequence database in the public domain, containing about 250 000 proteins. To improve protein annotation and the coverage of experimentally validated data, a bibliography submission system is developed for scientists to submit, categorize and retrieve literature information. Comprehensive protein information is available from iProClass, which includes family classification at the superfamily, domain and motif levels, structural and functional features of proteins, as well as cross-references to over 40 biological databases. To provide timely and comprehensive protein data with source attribution, we have introduced a non-redundant reference protein database, PIR-NREF. The database consists of about 800 000 proteins collected from PIR-PSD, SWISS-PROT, TrEMBL, GenPept, RefSeq and PDB, with composite protein names and literature data. To promote database interoperability, we provide XML data distribution and open database schema, and adopt common ontologies. The PIR web site (http://pir.georgetown.edu/) features data mining and sequence analysis tools for information retrieval and functional identification of proteins based on both sequence and annotation information. The PIR databases and other files are also available by FTP (ftp://nbrfa.georgetown.edu/pir_databases). PMID:11752247
XML-Based Generator of C++ Code for Integration With GUIs
NASA Technical Reports Server (NTRS)
Hua, Hook; Oyafuso, Fabiano; Klimeck, Gerhard
2003-01-01
An open source computer program has been developed to satisfy a need for simplified organization of structured input data for scientific simulation programs. Typically, such input data are parsed in from a flat American Standard Code for Information Interchange (ASCII) text file into computational data structures. Also typically, when a graphical user interface (GUI) is used, there is a need to completely duplicate the input information while providing it to a user in a more structured form. Heretofore, the duplication of the input information has entailed duplication of software efforts and increases in susceptibility to software errors because of the concomitant need to maintain two independent input-handling mechanisms. The present program implements a method in which the input data for a simulation program are completely specified in an Extensible Markup Language (XML)-based text file. The key benefit for XML is storing input data in a structured manner. More importantly, XML allows not just storing of data but also describing what each of the data items are. That XML file contains information useful for rendering the data by other applications. It also then generates data structures in the C++ language that are to be used in the simulation program. In this method, all input data are specified in one place only, and it is easy to integrate the data structures into both the simulation program and the GUI. XML-to-C is useful in two ways: 1. As an executable, it generates the corresponding C++ classes and 2. As a library, it automatically fills the objects with the input data values.
MedlinePlus Milestones: 1998-present
... page links and information daily and also offers access to this full XML content through its Web ... search-based Web service that allows developers to access MedlinePlus health topic data in XML format. MedlinePlus ...
Toxics Release Inventory Chemical Hazard Information Profiles (TRI-CHIP) Dataset
The Toxics Release Inventory (TRI) Chemical Hazard Information Profiles (TRI-CHIP) dataset contains hazard information about the chemicals reported in TRI. Users can use this XML-format dataset to create their own databases and hazard analyses of TRI chemicals. The hazard information is compiled from a series of authoritative sources including the Integrated Risk Information System (IRIS). The dataset is provided as a downloadable .zip file that when extracted provides XML files and schemas for the hazard information tables.
NCBI GEO: mining tens of millions of expression profiles--database and tools update.
Barrett, Tanya; Troup, Dennis B; Wilhite, Stephen E; Ledoux, Pierre; Rudnev, Dmitry; Evangelista, Carlos; Kim, Irene F; Soboleva, Alexandra; Tomashevsky, Maxim; Edgar, Ron
2007-01-01
The Gene Expression Omnibus (GEO) repository at the National Center for Biotechnology Information (NCBI) archives and freely disseminates microarray and other forms of high-throughput data generated by the scientific community. The database has a minimum information about a microarray experiment (MIAME)-compliant infrastructure that captures fully annotated raw and processed data. Several data deposit options and formats are supported, including web forms, spreadsheets, XML and Simple Omnibus Format in Text (SOFT). In addition to data storage, a collection of user-friendly web-based interfaces and applications are available to help users effectively explore, visualize and download the thousands of experiments and tens of millions of gene expression patterns stored in GEO. This paper provides a summary of the GEO database structure and user facilities, and describes recent enhancements to database design, performance, submission format options, data query and retrieval utilities. GEO is accessible at http://www.ncbi.nlm.nih.gov/geo/
Knowledge-Centric Management of Business Rules in a Pharmacy
NASA Astrophysics Data System (ADS)
Puustjärvi, Juha; Puustjärvi, Leena
A business rule defines or constraints some aspect of the business. In healthcare sector many of the business rules are dictated by law or medical regulations, which are constantly changing. This is a challenge for the healthcare organizations. Although there is available several commercial business rule management systems the problem from pharmacies point of view is that these systems are overly geared towards the automation and manipulation of business rules, while the main need in pharmacies lies in easy retrieving of business rules within daily routines. Another problem is that business rule management systems are isolated in the sense that they have their own data stores that cannot be accessed by other information systems used in pharmacies. As a result, a pharmacist is burdened by accessing many systems inside a user task. In order to avoid this problem we have modeled business rules as well as their relationships to other relevant information by OWL (Web Ontology Language) such that the ontology is shared among the pharmacy's applications. In this way we can avoid the problems of isolated applications and replicated data. The ontology also encourages pharmacies business agility, i.e., the ability to react more rapidly to the changes required by the new business rules. The deployment of the ontology requires that stored business rules are annotated by appropriate metadata descriptions, which are presented by RDF/XML serialization format. However, neither the designer nor the pharmacists are burdened by RDF/XML format as there are sophisticated graphical editors that can be used.
Tongue Scrapers Only Slightly Reduce Bad Breath
... information you need from the Academy of General Dentistry Friday, June 29, 2018 About | Contact InfoBites Quick ... study in the September/October issue of General Dentistry, the Academy?xml:namespace> of General Dentistry?xml: ...
ADASS Web Database XML Project
NASA Astrophysics Data System (ADS)
Barg, M. I.; Stobie, E. B.; Ferro, A. J.; O'Neil, E. J.
In the spring of 2000, at the request of the ADASS Program Organizing Committee (POC), we began organizing information from previous ADASS conferences in an effort to create a centralized database. The beginnings of this database originated from data (invited speakers, participants, papers, etc.) extracted from HyperText Markup Language (HTML) documents from past ADASS host sites. Unfortunately, not all HTML documents are well formed and parsing them proved to be an iterative process. It was evident at the beginning that if these Web documents were organized in a standardized way, such as XML (Extensible Markup Language), the processing of this information across the Web could be automated, more efficient, and less error prone. This paper will briefly review the many programming tools available for processing XML, including Java, Perl and Python, and will explore the mapping of relational data from our MySQL database to XML.
An XML-based method for astronomy software designing
NASA Astrophysics Data System (ADS)
Liao, Mingxue; Aili, Yusupu; Zhang, Jin
XML-based method for standardization of software designing is introduced and analyzed and successfully applied to renovating the hardware and software of the digital clock at Urumqi Astronomical Station. Basic strategy for eliciting time information from the new digital clock of FT206 in the antenna control program is introduced. By FT206, the need to compute how many centuries passed since a certain day with sophisticated formulas is eliminated and it is no longer necessary to set right UT time for the computer holding control over antenna because the information about year, month, day are all deduced from Julian day dwelling in FT206, rather than from computer time. With XML-based method and standard for software designing, various existing designing methods are unified, communications and collaborations between developers are facilitated, and thus Internet-based mode of developing software becomes possible. The trend of development of XML-based designing method is predicted.
NASA Technical Reports Server (NTRS)
Rice, J. Kevin
2013-01-01
The XTCE GOVSAT software suite contains three tools: validation, search, and reporting. The Extensible Markup Language (XML) Telemetric and Command Exchange (XTCE) GOVSAT Tool Suite is written in Java for manipulating XTCE XML files. XTCE is a Consultative Committee for Space Data Systems (CCSDS) and Object Management Group (OMG) specification for describing the format and information in telemetry and command packet streams. These descriptions are files that are used to configure real-time telemetry and command systems for mission operations. XTCE s purpose is to exchange database information between different systems. XTCE GOVSAT consists of rules for narrowing the use of XTCE for missions. The Validation Tool is used to syntax check GOVSAT XML files. The Search Tool is used to search (i.e. command and telemetry mnemonics) the GOVSAT XML files and view the results. Finally, the Reporting Tool is used to create command and telemetry reports. These reports can be displayed or printed for use by the operations team.
XML Schema Guide for Primary CDR Submissions
This document presents the extensible markup language (XML) schema guide for the Office of Pollution Prevention and Toxics’ (OPPT) e-CDRweb tool. E-CDRweb is the electronic, web-based tool provided by Environmental Protection Agency (EPA) for the submission of Chemical Data Reporting (CDR) information. This document provides the user with tips and guidance on correctly using the version 1.7 XML schema. Please note that the order of the elements must match the schema.
Building VoiceXML-Based Applications
2002-01-01
basketball games. The Busline systems were pri- y developed using an early implementation of VoiceXML he NBA Update Line was developed using VoiceXML...traveling in and out of Pittsburgh’s rsity neighborhood. The second project is the NBA Up- Line, which provides callers with real-time information NBA ... NBA UPDATE LINE The target user of this system is a fairly knowledgeable basket- ball fan; the system must therefore be able to provide detailed
Non-invasive lightweight integration engine for building EHR from autonomous distributed systems.
Angulo, Carlos; Crespo, Pere; Maldonado, José A; Moner, David; Pérez, Daniel; Abad, Irene; Mandingorra, Jesús; Robles, Montserrat
2007-12-01
In this paper we describe Pangea-LE, a message-oriented lightweight data integration engine that allows homogeneous and concurrent access to clinical information from disperse and heterogeneous data sources. The engine extracts the information and passes it to the requesting client applications in a flexible XML format. The XML response message can be formatted on demand by appropriate Extensible Stylesheet Language (XSL) transformations in order to meet the needs of client applications. We also present a real deployment in a hospital where Pangea-LE collects and generates an XML view of all the available patient clinical information. The information is presented to healthcare professionals in an Electronic Health Record (EHR) viewer Web application with patient search and EHR browsing capabilities. Implantation in a real setting has been a success due to the non-invasive nature of Pangea-LE which respects the existing information systems.
Web Based Data Access to the World Data Center for Climate
NASA Astrophysics Data System (ADS)
Toussaint, F.; Lautenschlager, M.
2006-12-01
The World Data Center for Climate (WDC-Climate, www.wdc-climate.de) is hosted by the Model &Data Group (M&D) of the Max Planck Institute for Meteorology. The M&D department is financed by the German government and uses the computers and mass storage facilities of the German Climate Computing Centre (Deutsches Klimarechenzentrum, DKRZ). The WDC-Climate provides web access to 200 Terabytes of climate data; the total mass storage archive contains nearly 4 Petabytes. Although the majority of the datasets concern model output data, some satellite and observational data are accessible as well. The underlying relational database is distributed on five servers. The CERA relational data model is used to integrate catalogue data and mass data. The flexibility of the model allows to store and access very different types of data and metadata. The CERA metadata catalogue provides easy access to the content of the CERA database as well as to other data in the web. Visit ceramodel.wdc-climate.de for additional information on the CERA data model. The majority of the users access data via the CERA metadata catalogue, which is open without registration. However, prior to retrieving data user are required to check in and apply for a userid and password. The CERA metadata catalogue is servlet based. So it is accessible worldwide through any web browser at cera.wdc-climate.de. In addition to data and metadata access by the web catalogue, WDC-Climate offers a number of other forms of web based data access. All metadata are available via http request as xml files in various metadata formats (ISO, DC, etc., see wini.wdc-climate.de) which allows for easy data interchange with other catalogues. Model data can be retrieved in GRIB, ASCII, NetCDF, and binary (IEEE) format. WDC-Climate serves as data centre for various projects. Since xml files are accessible by http, the integration of data into applications of different projects is very easy. Projects supported by WDC-Climate are e.g. CEOP, IPCC, and CARIBIC. A script tool for data download (jblob) is offered on the web page, to make retrieval of huge data quantities more comfortable.
Towards a distributed information architecture for avionics data
NASA Technical Reports Server (NTRS)
Mattmann, Chris; Freeborn, Dana; Crichton, Dan
2003-01-01
Avionics data at the National Aeronautics and Space Administration's (NASA) Jet Propulsion Laboratory (JPL consists of distributed, unmanaged, and heterogeneous information that is hard for flight system design engineers to find and use on new NASA/JPL missions. The development of a systematic approach for capturing, accessing and sharing avionics data critical to the support of NASA/JPL missions and projects is required. We propose a general information architecture for managing the existing distributed avionics data sources and a method for querying and retrieving avionics data using the Object Oriented Data Technology (OODT) framework. OODT uses XML messaging infrastructure that profiles data products and their locations using the ISO-11179 data model for describing data products. Queries against a common data dictionary (which implements the ISO model) are translated to domain dependent source data models, and distributed data products are returned asynchronously through the OODT middleware. Further work will include the ability to 'plug and play' new manufacturer data sources, which are distributed at avionics component manufacturer locations throughout the United States.
Hoelzer, Simon; Schweiger, Ralf K; Dudeck, Joachim
2003-01-01
With the introduction of ICD-10 as the standard for diagnostics, it becomes necessary to develop an electronic representation of its complete content, inherent semantics, and coding rules. The authors' design relates to the current efforts by the CEN/TC 251 to establish a European standard for hierarchical classification systems in health care. The authors have developed an electronic representation of ICD-10 with the eXtensible Markup Language (XML) that facilitates integration into current information systems and coding software, taking different languages and versions into account. In this context, XML provides a complete processing framework of related technologies and standard tools that helps develop interoperable applications. XML provides semantic markup. It allows domain-specific definition of tags and hierarchical document structure. The idea of linking and thus combining information from different sources is a valuable feature of XML. In addition, XML topic maps are used to describe relationships between different sources, or "semantically associated" parts of these sources. The issue of achieving a standardized medical vocabulary becomes more and more important with the stepwise implementation of diagnostically related groups, for example. The aim of the authors' work is to provide a transparent and open infrastructure that can be used to support clinical coding and to develop further software applications. The authors are assuming that a comprehensive representation of the content, structure, inherent semantics, and layout of medical classification systems can be achieved through a document-oriented approach.
Hoelzer, Simon; Schweiger, Ralf K.; Dudeck, Joachim
2003-01-01
With the introduction of ICD-10 as the standard for diagnostics, it becomes necessary to develop an electronic representation of its complete content, inherent semantics, and coding rules. The authors' design relates to the current efforts by the CEN/TC 251 to establish a European standard for hierarchical classification systems in health care. The authors have developed an electronic representation of ICD-10 with the eXtensible Markup Language (XML) that facilitates integration into current information systems and coding software, taking different languages and versions into account. In this context, XML provides a complete processing framework of related technologies and standard tools that helps develop interoperable applications. XML provides semantic markup. It allows domain-specific definition of tags and hierarchical document structure. The idea of linking and thus combining information from different sources is a valuable feature of XML. In addition, XML topic maps are used to describe relationships between different sources, or “semantically associated” parts of these sources. The issue of achieving a standardized medical vocabulary becomes more and more important with the stepwise implementation of diagnostically related groups, for example. The aim of the authors' work is to provide a transparent and open infrastructure that can be used to support clinical coding and to develop further software applications. The authors are assuming that a comprehensive representation of the content, structure, inherent semantics, and layout of medical classification systems can be achieved through a document-oriented approach. PMID:12807813
Ajax Architecture Implementation Techniques
NASA Astrophysics Data System (ADS)
Hussaini, Syed Asadullah; Tabassum, S. Nasira; Baig, Tabassum, M. Khader
2012-03-01
Today's rich Web applications use a mix of Java Script and asynchronous communication with the application server. This mechanism is also known as Ajax: Asynchronous JavaScript and XML. The intent of Ajax is to exchange small pieces of data between the browser and the application server, and in doing so, use partial page refresh instead of reloading the entire Web page. AJAX (Asynchronous JavaScript and XML) is a powerful Web development model for browser-based Web applications. Technologies that form the AJAX model, such as XML, JavaScript, HTTP, and XHTML, are individually widely used and well known. However, AJAX combines these technologies to let Web pages retrieve small amounts of data from the server without having to reload the entire page. This capability makes Web pages more interactive and lets them behave like local applications. Web 2.0 enabled by the Ajax architecture has given rise to a new level of user interactivity through web browsers. Many new and extremely popular Web applications have been introduced such as Google Maps, Google Docs, Flickr, and so on. Ajax Toolkits such as Dojo allow web developers to build Web 2.0 applications quickly and with little effort.
WaterML: an XML Language for Communicating Water Observations Data
NASA Astrophysics Data System (ADS)
Maidment, D. R.; Zaslavsky, I.; Valentine, D.
2007-12-01
One of the great impediments to the synthesis of water information is the plethora of formats used to publish such data. Each water agency uses its own approach. XML (eXtended Markup Languages) are generalizations of Hypertext Markup Language to communicate specific kinds of information via the internet. WaterML is an XML language for water observations data - streamflow, water quality, groundwater levels, climate, precipitation and aquatic biology data, recorded at fixed, point locations as a function of time. The Hydrologic Information System project of the Consortium of Universities for the Advancement of Hydrologic Science, Inc (CUAHSI) has defined WaterML and prepared a set of web service functions called WaterOneFLow that use WaterML to provide information about observation sites, the variables measured there and the values of those measurments. WaterML has been submitted to the Open GIS Consortium for harmonization with its standards for XML languages. Academic investigators at a number of testbed locations in the WATERS network are providing data in WaterML format using WaterOneFlow web services. The USGS and other federal agencies are also working with CUAHSI to similarly provide access to their data in WaterML through WaterOneFlow services.
XSemantic: An Extension of LCA Based XML Semantic Search
NASA Astrophysics Data System (ADS)
Supasitthimethee, Umaporn; Shimizu, Toshiyuki; Yoshikawa, Masatoshi; Porkaew, Kriengkrai
One of the most convenient ways to query XML data is a keyword search because it does not require any knowledge of XML structure or learning a new user interface. However, the keyword search is ambiguous. The users may use different terms to search for the same information. Furthermore, it is difficult for a system to decide which node is likely to be chosen as a return node and how much information should be included in the result. To address these challenges, we propose an XML semantic search based on keywords called XSemantic. On the one hand, we give three definitions to complete in terms of semantics. Firstly, the semantic term expansion, our system is robust from the ambiguous keywords by using the domain ontology. Secondly, to return semantic meaningful answers, we automatically infer the return information from the user queries and take advantage of the shortest path to return meaningful connections between keywords. Thirdly, we present the semantic ranking that reflects the degree of similarity as well as the semantic relationship so that the search results with the higher relevance are presented to the users first. On the other hand, in the LCA and the proximity search approaches, we investigated the problem of information included in the search results. Therefore, we introduce the notion of the Lowest Common Element Ancestor (LCEA) and define our simple rule without any requirement on the schema information such as the DTD or XML Schema. The first experiment indicated that XSemantic not only properly infers the return information but also generates compact meaningful results. Additionally, the benefits of our proposed semantics are demonstrated by the second experiment.
XML Schema Guide for Secondary CDR Submissions
This document presents the extensible markup language (XML) schema guide for the Office of Pollution Prevention and Toxics’ (OPPT) e-CDRweb tool. E-CDRweb is the electronic, web-based tool provided by Environmental Protection Agency (EPA) for the submission of Chemical Data Reporting (CDR) information. This document provides the user with tips and guidance on correctly using the version 1.1 XML schema for the Joint Submission Form. Please note that the order of the elements must match the schema.
Flight Dynamic Model Exchange using XML
NASA Technical Reports Server (NTRS)
Jackson, E. Bruce; Hildreth, Bruce L.
2002-01-01
The AIAA Modeling and Simulation Technical Committee has worked for several years to develop a standard by which the information needed to develop physics-based models of aircraft can be specified. The purpose of this standard is to provide a well-defined set of information, definitions, data tables and axis systems so that cooperating organizations can transfer a model from one simulation facility to another with maximum efficiency. This paper proposes using an application of the eXtensible Markup Language (XML) to implement the AIAA simulation standard. The motivation and justification for using a standard such as XML is discussed. Necessary data elements to be supported are outlined. An example of an aerodynamic model as an XML file is given. This example includes definition of independent and dependent variables for function tables, definition of key variables used to define the model, and axis systems used. The final steps necessary for implementation of the standard are presented. Software to take an XML-defined model and import/export it to/from a given simulation facility is discussed, but not demonstrated. That would be the next step in final implementation of standards for physics-based aircraft dynamic models.
Shuttle-Data-Tape XML Translator
NASA Technical Reports Server (NTRS)
Barry, Matthew R.; Osborne, Richard N.
2005-01-01
JSDTImport is a computer program for translating native Shuttle Data Tape (SDT) files from American Standard Code for Information Interchange (ASCII) format into databases in other formats. JSDTImport solves the problem of organizing the SDT content, affording flexibility to enable users to choose how to store the information in a database to better support client and server applications. JSDTImport can be dynamically configured by use of a simple Extensible Markup Language (XML) file. JSDTImport uses this XML file to define how each record and field will be parsed, its layout and definition, and how the resulting database will be structured. JSDTImport also includes a client application programming interface (API) layer that provides abstraction for the data-querying process. The API enables a user to specify the search criteria to apply in gathering all the data relevant to a query. The API can be used to organize the SDT content and translate into a native XML database. The XML format is structured into efficient sections, enabling excellent query performance by use of the XPath query language. Optionally, the content can be translated into a Structured Query Language (SQL) database for fast, reliable SQL queries on standard database server computers.
Using XML and XSLT for flexible elicitation of mental-health risk knowledge.
Buckingham, C D; Ahmed, A; Adams, A E
2007-03-01
Current tools for assessing risks associated with mental-health problems require assessors to make high-level judgements based on clinical experience. This paper describes how new technologies can enhance qualitative research methods to identify lower-level cues underlying these judgements, which can be collected by people without a specialist mental-health background. Content analysis of interviews with 46 multidisciplinary mental-health experts exposed the cues and their interrelationships, which were represented by a mind map using software that stores maps as XML. All 46 mind maps were integrated into a single XML knowledge structure and analysed by a Lisp program to generate quantitative information about the numbers of experts associated with each part of it. The knowledge was refined by the experts, using software developed in Flash to record their collective views within the XML itself. These views specified how the XML should be transformed by XSLT, a technology for rendering XML, which resulted in a validated hierarchical knowledge structure associating patient cues with risks. Changing knowledge elicitation requirements were accommodated by flexible transformations of XML data using XSLT, which also facilitated generation of multiple data-gathering tools suiting different assessment circumstances and levels of mental-health knowledge.
Compressing Aviation Data in XML Format
NASA Technical Reports Server (NTRS)
Patel, Hemil; Lau, Derek; Kulkarni, Deepak
2003-01-01
Design, operations and maintenance activities in aviation involve analysis of variety of aviation data. This data is typically in disparate formats making it difficult to use with different software packages. Use of a self-describing and extensible standard called XML provides a solution to this interoperability problem. XML provides a standardized language for describing the contents of an information stream, performing the same kind of definitional role for Web content as a database schema performs for relational databases. XML data can be easily customized for display using Extensible Style Sheets (XSL). While self-describing nature of XML makes it easy to reuse, it also increases the size of data significantly. Therefore, transfemng a dataset in XML form can decrease throughput and increase data transfer time significantly. It also increases storage requirements significantly. A natural solution to the problem is to compress the data using suitable algorithm and transfer it in the compressed form. We found that XML-specific compressors such as Xmill and XMLPPM generally outperform traditional compressors. However, optimal use of Xmill requires of discovery of optimal options to use while running Xmill. This, in turn, depends on the nature of data used. Manual disc0ver.y of optimal setting can require an engineer to experiment for weeks. We have devised an XML compression advisory tool that can analyze sample data files and recommend what compression tool would work the best for this data and what are the optimal settings to be used with a XML compression tool.
2003-09-01
590-595, September 1996. Deitel , H.M., Deitel , P.J., Nieto, T.R., Lin, T.M., Sadhu, P., XML: How to Program , Prentice Hall, 2001. Du, Y...communications will result in a total track following error equal to the sum of the errors for the two vehicles........48 xv Figure 36. Test point programming ...Refer to (Hunter 2000), ( Deitel 2001), or similar references for additional information regarding the XML standard. Figure 17. XML example
XML-based approaches for the integration of heterogeneous bio-molecular data.
Mesiti, Marco; Jiménez-Ruiz, Ernesto; Sanz, Ismael; Berlanga-Llavori, Rafael; Perlasca, Paolo; Valentini, Giorgio; Manset, David
2009-10-15
The today's public database infrastructure spans a very large collection of heterogeneous biological data, opening new opportunities for molecular biology, bio-medical and bioinformatics research, but raising also new problems for their integration and computational processing. In this paper we survey the most interesting and novel approaches for the representation, integration and management of different kinds of biological data by exploiting XML and the related recommendations and approaches. Moreover, we present new and interesting cutting edge approaches for the appropriate management of heterogeneous biological data represented through XML. XML has succeeded in the integration of heterogeneous biomolecular information, and has established itself as the syntactic glue for biological data sources. Nevertheless, a large variety of XML-based data formats have been proposed, thus resulting in a difficult effective integration of bioinformatics data schemes. The adoption of a few semantic-rich standard formats is urgent to achieve a seamless integration of the current biological resources.
XML, Ontologies, and Their Clinical Applications.
Yu, Chunjiang; Shen, Bairong
2016-01-01
The development of information technology has resulted in its penetration into every area of clinical research. Various clinical systems have been developed, which produce increasing volumes of clinical data. However, saving, exchanging, querying, and exploiting these data are challenging issues. The development of Extensible Markup Language (XML) has allowed the generation of flexible information formats to facilitate the electronic sharing of structured data via networks, and it has been used widely for clinical data processing. In particular, XML is very useful in the fields of data standardization, data exchange, and data integration. Moreover, ontologies have been attracting increased attention in various clinical fields in recent years. An ontology is the basic level of a knowledge representation scheme, and various ontology repositories have been developed, such as Gene Ontology and BioPortal. The creation of these standardized repositories greatly facilitates clinical research in related fields. In this chapter, we discuss the basic concepts of XML and ontologies, as well as their clinical applications.
James Webb Space Telescope XML Database: From the Beginning to Today
NASA Technical Reports Server (NTRS)
Gal-Edd, Jonathan; Fatig, Curtis C.
2005-01-01
The James Webb Space Telescope (JWST) Project has been defining, developing, and exercising the use of a common eXtensible Markup Language (XML) for the command and telemetry (C&T) database structure. JWST is the first large NASA space mission to use XML for databases. The JWST project started developing the concepts for the C&T database in 2002. The database will need to last at least 20 years since it will be used beginning with flight software development, continuing through Observatory integration and test (I&T) and through operations. Also, a database tool kit has been provided to the 18 various flight software development laboratories located in the United States, Europe, and Canada that allows the local users to create their own databases. Recently the JWST Project has been working with the Jet Propulsion Laboratory (JPL) and Object Management Group (OMG) XML Telemetry and Command Exchange (XTCE) personnel to provide all the information needed by JWST and JPL for exchanging database information using a XML standard structure. The lack of standardization requires custom ingest scripts for each ground system segment, increasing the cost of the total system. Providing a non-proprietary standard of the telemetry and command database definition formation will allow dissimilar systems to communicate without the need for expensive mission specific database tools and testing of the systems after the database translation. The various ground system components that would benefit from a standardized database are the telemetry and command systems, archives, simulators, and trending tools. JWST has exchanged the XML database with the Eclipse, EPOCH, ASIST ground systems, Portable spacecraft simulator (PSS), a front-end system, and Integrated Trending and Plotting System (ITPS) successfully. This paper will discuss how JWST decided to use XML, the barriers to a new concept, experiences utilizing the XML structure, exchanging databases with other users, and issues that have been experienced in creating databases for the C&T system.
NASA Astrophysics Data System (ADS)
Chen, Nengcheng; Di, Liping; Yu, Genong; Gong, Jianya; Wei, Yaxing
2009-02-01
Recent advances in Sensor Web geospatial data capture, such as high-resolution in satellite imagery and Web-ready data processing and modeling technologies, have led to the generation of large numbers of datasets from real-time or near real-time observations and measurements. Finding which sensor or data complies with criteria such as specific times, locations, and scales has become a bottleneck for Sensor Web-based applications, especially remote-sensing observations. In this paper, an architecture for use of the integration Sensor Observation Service (SOS) with the Open Geospatial Consortium (OGC) Catalogue Service-Web profile (CSW) is put forward. The architecture consists of a distributed geospatial sensor observation service, a geospatial catalogue service based on the ebXML Registry Information Model (ebRIM), SOS search and registry middleware, and a geospatial sensor portal. The SOS search and registry middleware finds the potential SOS, generating data granule information and inserting the records into CSW. The contents and sequence of the services, the available observations, and the metadata of the observations registry are described. A prototype system is designed and implemented using the service middleware technology and a standard interface and protocol. The feasibility and the response time of registry and retrieval of observations are evaluated using a realistic Earth Observing-1 (EO-1) SOS scenario. Extracting information from SOS requires the same execution time as record generation for CSW. The average data retrieval response time in SOS+CSW mode is 17.6% of that of the SOS-alone mode. The proposed architecture has the more advantages of SOS search and observation data retrieval than the existing sensor Web enabled systems.
Development of a Google-based search engine for data mining radiology reports.
Erinjeri, Joseph P; Picus, Daniel; Prior, Fred W; Rubin, David A; Koppel, Paul
2009-08-01
The aim of this study is to develop a secure, Google-based data-mining tool for radiology reports using free and open source technologies and to explore its use within an academic radiology department. A Health Insurance Portability and Accountability Act (HIPAA)-compliant data repository, search engine and user interface were created to facilitate treatment, operations, and reviews preparatory to research. The Institutional Review Board waived review of the project, and informed consent was not required. Comprising 7.9 GB of disk space, 2.9 million text reports were downloaded from our radiology information system to a fileserver. Extensible markup language (XML) representations of the reports were indexed using Google Desktop Enterprise search engine software. A hypertext markup language (HTML) form allowed users to submit queries to Google Desktop, and Google's XML response was interpreted by a practical extraction and report language (PERL) script, presenting ranked results in a web browser window. The query, reason for search, results, and documents visited were logged to maintain HIPAA compliance. Indexing averaged approximately 25,000 reports per hour. Keyword search of a common term like "pneumothorax" yielded the first ten most relevant results of 705,550 total results in 1.36 s. Keyword search of a rare term like "hemangioendothelioma" yielded the first ten most relevant results of 167 total results in 0.23 s; retrieval of all 167 results took 0.26 s. Data mining tools for radiology reports will improve the productivity of academic radiologists in clinical, educational, research, and administrative tasks. By leveraging existing knowledge of Google's interface, radiologists can quickly perform useful searches.
Fernández, José M; Valencia, Alfonso
2004-10-12
Downloading the information stored in relational databases into XML and other flat formats is a common task in bioinformatics. This periodical dumping of information requires considerable CPU time, disk and memory resources. YAdumper has been developed as a purpose-specific tool to deal with the integral structured information download of relational databases. YAdumper is a Java application that organizes database extraction following an XML template based on an external Document Type Declaration. Compared with other non-native alternatives, YAdumper substantially reduces memory requirements and considerably improves writing performance.
Vittorini, Pierpaolo; Tarquinio, Antonietta; di Orio, Ferdinando
2009-03-01
The eXtensible markup language (XML) is a metalanguage which is useful to represent and exchange data between heterogeneous systems. XML may enable healthcare practitioners to document, monitor, evaluate, and archive medical information and services into distributed computer environments. Therefore, the most recent proposals on electronic health records (EHRs) are usually based on XML documents. Since none of the existing nomenclatures were specifically developed for use in automated clinical information systems, but were adapted to such use, numerous current EHRs are organized as a sequence of events, each represented through codes taken from international classification systems. In nursing, a hierarchically organized problem-solving approach is followed, which hardly couples with the sequential organization of such EHRs. Therefore, the paper presents an XML data model for the Omaha System taxonomy, which is one of the most important international nomenclatures used in the home healthcare nursing context. Such a data model represents the formal definition of EHRs specifically developed for nursing practice. Furthermore, the paper delineates a Java application prototype which is able to manage such documents, shows the possibility to transform such documents into readable web pages, and reports several case studies, one currently managed by the home care service of a Health Center in Central Italy.
Programmatic access to data and information at the IRIS DMC via web services
NASA Astrophysics Data System (ADS)
Weertman, B. R.; Trabant, C.; Karstens, R.; Suleiman, Y. Y.; Ahern, T. K.; Casey, R.; Benson, R. B.
2011-12-01
The IRIS Data Management Center (DMC) has developed a suite of web services that provide access to the DMC's time series holdings, their related metadata and earthquake catalogs. In addition, services are available to perform simple, on-demand time series processing at the DMC prior to being shipped to the user. The primary goal is to provide programmatic access to data and processing services in a manner usable by and useful to the research community. The web services are relatively simple to understand and use and will form the foundation on which future DMC access tools will be built. Based on standard Web technologies they can be accessed programmatically with a wide range of programming languages (e.g. Perl, Python, Java), command line utilities such as wget and curl or with any web browser. We anticipate these services being used for everything from simple command line access, used in shell scripts and higher programming languages to being integrated within complex data processing software. In addition to improving access to our data by the seismological community the web services will also make our data more accessible to other disciplines. The web services available from the DMC include ws-bulkdataselect for the retrieval of large volumes of miniSEED data, ws-timeseries for the retrieval of individual segments of time series data in a variety of formats (miniSEED, SAC, ASCII, audio WAVE, and PNG plots) with optional signal processing, ws-station for station metadata in StationXML format, ws-resp for the retrieval of instrument response in RESP format, ws-sacpz for the retrieval of sensor response in the SAC poles and zeros convention and ws-event for the retrieval of earthquake catalogs. To make the services even easier to use, the DMC is developing a library that allows Java programmers to seamlessly retrieve and integrate DMC information into their own programs. The library will handle all aspects of dealing with the services and will parse the returned data. By using this library a developer will not need to learn the details of the service interfaces or understand the data formats returned. This library will be used to build the software bridge needed to request data and information from within MATLAB°. We also provide several client scripts written in Perl for the retrieval of waveform data, metadata and earthquake catalogs using command line programs. For more information on the DMC's web services please visit http://www.iris.edu/ws/
Method for gathering and summarizing internet information
Potok, Thomas E.; Elmore, Mark Thomas; Reed, Joel Wesley; Treadwell, Jim N.; Samatova, Nagiza Faridovna
2010-04-06
A computer method of gathering and summarizing large amounts of information comprises collecting information from a plurality of information sources (14, 51) according to respective maps (52) of the information sources (14), converting the collected information from a storage format to XML-language documents (26, 53) and storing the XML-language documents in a storage medium, searching for documents (55) according to a search query (13) having at least one term and identifying the documents (26) found in the search, and displaying the documents as nodes (33) of a tree structure (32) having links (34) and nodes (33) so as to indicate similarity of the documents to each other.
System for gathering and summarizing internet information
Potok, Thomas E.; Elmore, Mark Thomas; Reed, Joel Wesley; Treadwell, Jim N.; Samatova, Nagiza Faridovna
2006-07-04
A computer method of gathering and summarizing large amounts of information comprises collecting information from a plurality of information sources (14, 51) according to respective maps (52) of the information sources (14), converting the collected information from a storage format to XML-language documents (26, 53) and storing the XML-language documents in a storage medium, searching for documents (55) according to a search query (13) having at least one term and identifying the documents (26) found in the search, and displaying the documents as nodes (33) of a tree structure (32) having links (34) and nodes (33) so as to indicate similarity of the documents to each other.
Method for gathering and summarizing internet information
Potok, Thomas E [Oak Ridge, TN; Elmore, Mark Thomas [Oak Ridge, TN; Reed, Joel Wesley [Knoxville, TN; Treadwell, Jim N [Louisville, TN; Samatova, Nagiza Faridovna [Oak Ridge, TN
2008-01-01
A computer method of gathering and summarizing large amounts of information comprises collecting information from a plurality of information sources (14, 51) according to respective maps (52) of the information sources (14), converting the collected information from a storage format to XML-language documents (26, 53) and storing the XML-language documents in a storage medium, searching for documents (55) according to a search query (13) having at least one term and identifying the documents (26) found in the search, and displaying the documents as nodes (33) of a tree structure (32) having links (34) and nodes (33) so as to indicate similarity of the documents to each other.
MACSIMS : multiple alignment of complete sequences information management system
Thompson, Julie D; Muller, Arnaud; Waterhouse, Andrew; Procter, Jim; Barton, Geoffrey J; Plewniak, Frédéric; Poch, Olivier
2006-01-01
Background In the post-genomic era, systems-level studies are being performed that seek to explain complex biological systems by integrating diverse resources from fields such as genomics, proteomics or transcriptomics. New information management systems are now needed for the collection, validation and analysis of the vast amount of heterogeneous data available. Multiple alignments of complete sequences provide an ideal environment for the integration of this information in the context of the protein family. Results MACSIMS is a multiple alignment-based information management program that combines the advantages of both knowledge-based and ab initio sequence analysis methods. Structural and functional information is retrieved automatically from the public databases. In the multiple alignment, homologous regions are identified and the retrieved data is evaluated and propagated from known to unknown sequences with these reliable regions. In a large-scale evaluation, the specificity of the propagated sequence features is estimated to be >99%, i.e. very few false positive predictions are made. MACSIMS is then used to characterise mutations in a test set of 100 proteins that are known to be involved in human genetic diseases. The number of sequence features associated with these proteins was increased by 60%, compared to the features available in the public databases. An XML format output file allows automatic parsing of the MACSIM results, while a graphical display using the JalView program allows manual analysis. Conclusion MACSIMS is a new information management system that incorporates detailed analyses of protein families at the structural, functional and evolutionary levels. MACSIMS thus provides a unique environment that facilitates knowledge extraction and the presentation of the most pertinent information to the biologist. A web server and the source code are available at . PMID:16792820
The future application of GML database in GIS
NASA Astrophysics Data System (ADS)
Deng, Yuejin; Cheng, Yushu; Jing, Lianwen
2006-10-01
In 2004, the Geography Markup Language (GML) Implementation Specification (version 3.1.1) was published by Open Geospatial Consortium, Inc. Now more and more applications in geospatial data sharing and interoperability depend on GML. The primary purpose of designing GML is for exchange and transportation of geo-information by standard modeling and encoding of geography phenomena. However, the problems of how to organize and access lots of GML data effectively arise in applications. The research on GML database focuses on these problems. The effective storage of GML data is a hot topic in GIS communities today. GML Database Management System (GDBMS) mainly deals with the problem of storage and management of GML data. Now two types of XML database, namely Native XML Database, and XML-Enabled Database are classified. Since GML is an application of the XML standard to geographic data, the XML database system can also be used for the management of GML. In this paper, we review the status of the art of XML database, including storage, index and query languages, management systems and so on, then move on to the GML database. At the end, the future prospect of GML database in GIS application is presented.
A Survey and Analysis of Access Control Architectures for XML Data
2006-03-01
13 4. XML Query Engines ...castle and the drawbridge over the moat. Extending beyond the visual analogy, there are many key components to the protection of information and...technology. While XML’s original intent was to enable large-scale electronic publishing over the internet, its functionality is firmly rooted in its
XML and Bibliographic Data: The TVS (Transport, Validation and Services) Model.
ERIC Educational Resources Information Center
de Carvalho, Joaquim; Cordeiro, Maria Ines
This paper discusses the role of XML in library information systems at three major levels: as are presentation language that enables the transport of bibliographic data in a way that is technologically independent and universally understood across systems and domains; as a language that enables the specification of complex validation rules…
An introduction to the Semantic Web for health sciences librarians.
Robu, Ioana; Robu, Valentin; Thirion, Benoit
2006-04-01
The paper (1) introduces health sciences librarians to the main concepts and principles of the Semantic Web (SW) and (2) briefly reviews a number of projects on the handling of biomedical information that uses SW technology. The paper is structured into two main parts. "Semantic Web Technology" provides a high-level description, with examples, of the main standards and concepts: extensible markup language (XML), Resource Description Framework (RDF), RDF Schema (RDFS), ontologies, and their utility in information retrieval, concluding with mention of more advanced SW languages and their characteristics. "Semantic Web Applications and Research Projects in the Biomedical Field" is a brief review of the Unified Medical Language System (UMLS), Generalised Architecture for Languages, Encyclopedias and Nomenclatures in Medicine (GALEN), HealthCyberMap, LinkBase, and the thesaurus of the National Cancer Institute (NCI). The paper also mentions other benefits and by-products of the SW, citing projects related to them. Some of the problems facing the SW vision are presented, especially the ways in which the librarians' expertise in organizing knowledge and in structuring information may contribute to SW projects.
Application of XML to Journal Table Archiving
NASA Astrophysics Data System (ADS)
Shaya, E. J.; Blackwell, J. H.; Gass, J. E.; Kargatis, V. E.; Schneider, G. L.; Weiland, J. L.; Borne, K. D.; White, R. A.; Cheung, C. Y.
1998-12-01
The Astronomical Data Center (ADC) at the NASA Goddard Space Flight Center is a major archive for machine-readable astronomical data tables. Many ADC tables are derived from published journal articles. Article tables are reformatted to be machine-readable and documentation is crafted to facilitate proper reuse by researchers. The recent switch of journals to web based electronic format has resulted in the generation of large amounts of tabular data that could be captured into machine-readable archive format at fairly low cost. The large data flow of the tables from all major North American astronomical journals (a factor of 100 greater than the present rate at the ADC) necessitates the development of rigorous standards for the exchange of data between researchers, publishers, and the archives. We have selected a suitable markup language that can fully describe the large variety of astronomical information contained in ADC tables. The eXtensible Markup Language XML is a powerful internet-ready documentation format for data. It provides a precise and clear data description language that is both machine- and human-readable. It is rapidly becoming the standard format for business and information transactions on the internet and it is an ideal common metadata exchange format. By labelling, or "marking up", all elements of the information content, documents are created that computers can easily parse. An XML archive can easily and automatically be maintained, ingested into standard databases or custom software, and even totally restructured whenever necessary. Structuring astronomical data into XML format will enable efficient and focused search capabilities via off-the-shelf software. The ADC is investigating XML's expanded hyperlinking power to enhance connectivity within the ADC data/metadata and developing XSL display scripts to enhance display of astronomical data. The ADC XML Definition Type Document can be viewed at http://messier.gsfc.nasa.gov/dtdhtml/DTD-TREE.html
Rubin, Daniel L; Hewett, Micheal; Oliver, Diane E; Klein, Teri E; Altman, Russ B
2002-01-01
Ontologies are useful for organizing large numbers of concepts having complex relationships, such as the breadth of genetic and clinical knowledge in pharmacogenomics. But because ontologies change and knowledge evolves, it is time consuming to maintain stable mappings to external data sources that are in relational format. We propose a method for interfacing ontology models with data acquisition from external relational data sources. This method uses a declarative interface between the ontology and the data source, and this interface is modeled in the ontology and implemented using XML schema. Data is imported from the relational source into the ontology using XML, and data integrity is checked by validating the XML submission with an XML schema. We have implemented this approach in PharmGKB (http://www.pharmgkb.org/), a pharmacogenetics knowledge base. Our goals were to (1) import genetic sequence data, collected in relational format, into the pharmacogenetics ontology, and (2) automate the process of updating the links between the ontology and data acquisition when the ontology changes. We tested our approach by linking PharmGKB with data acquisition from a relational model of genetic sequence information. The ontology subsequently evolved, and we were able to rapidly update our interface with the external data and continue acquiring the data. Similar approaches may be helpful for integrating other heterogeneous information sources in order make the diversity of pharmacogenetics data amenable to computational analysis.
The XBabelPhish MAGE-ML and XML translator.
Maier, Don; Wymore, Farrell; Sherlock, Gavin; Ball, Catherine A
2008-01-18
MAGE-ML has been promoted as a standard format for describing microarray experiments and the data they produce. Two characteristics of the MAGE-ML format compromise its use as a universal standard: First, MAGE-ML files are exceptionally large - too large to be easily read by most people, and often too large to be read by most software programs. Second, the MAGE-ML standard permits many ways of representing the same information. As a result, different producers of MAGE-ML create different documents describing the same experiment and its data. Recognizing all the variants is an unwieldy software engineering task, resulting in software packages that can read and process MAGE-ML from some, but not all producers. This Tower of MAGE-ML Babel bars the unencumbered exchange of microarray experiment descriptions couched in MAGE-ML. We have developed XBabelPhish - an XQuery-based technology for translating one MAGE-ML variant into another. XBabelPhish's use is not restricted to translating MAGE-ML documents. It can transform XML files independent of their DTD, XML schema, or semantic content. Moreover, it is designed to work on very large (> 200 Mb.) files, which are common in the world of MAGE-ML. XBabelPhish provides a way to inter-translate MAGE-ML variants for improved interchange of microarray experiment information. More generally, it can be used to transform most XML files, including very large ones that exceed the capacity of most XML tools.
An XML-Based Protocol for Distributed Event Services
NASA Technical Reports Server (NTRS)
Smith, Warren; Gunter, Dan; Quesnel, Darcy; Biegel, Bryan (Technical Monitor)
2001-01-01
A recent trend in distributed computing is the construction of high-performance distributed systems called computational grids. One difficulty we have encountered is that there is no standard format for the representation of performance information and no standard protocol for transmitting this information. This limits the types of performance analysis that can be undertaken in complex distributed systems. To address this problem, we present an XML-based protocol for transmitting performance events in distributed systems and evaluate the performance of this protocol.
Design and implementation of a portal for the medical equipment market: MEDICOM.
Palamas, S; Kalivas, D; Panou-Diamandi, O; Zeelenberg, C; van Nimwegen, C
2001-01-01
The MEDICOM (Medical Products Electronic Commerce) Portal provides the electronic means for medical-equipment manufacturers to communicate online with their customers while supporting the Purchasing Process and Post Market Surveillance. The Portal offers a powerful Internet-based search tool for finding medical products and manufacturers. Its main advantage is the fast, reliable and up-to-date retrieval of information while eliminating all unrelated content that a general-purpose search engine would retrieve. The Universal Medical Device Nomenclature System (UMDNS) registers all products. The Portal accepts end-user requests and generates a list of results containing text descriptions of devices, UMDNS attribute values, and links to manufacturer Web pages and online catalogues for access to more-detailed information. Device short descriptions are provided by the corresponding manufacturer. The Portal offers technical support for integration of the manufacturers Web sites with itself. The network of the Portal and the connected manufacturers sites is called the MEDICOM system. To establish an environment hosting all the interactions of consumers (health care organizations and professionals) and providers (manufacturers, distributors, and resellers of medical devices). The Portal provides the end-user interface, implements system management, and supports database compatibility. The Portal hosts information about the whole MEDICOM system (Common Database) and summarized descriptions of medical devices (Short Description Database); the manufacturers servers present extended descriptions. The Portal provides end-user profiling and registration, an efficient product-searching mechanism, bulletin boards, links to on-line libraries and standards, on-line information for the MEDICOM system, and special messages or advertisements from manufacturers. Platform independence and interoperability characterize the system design. Relational Database Management Systems are used for the system s databases. The end-user interface is implemented using HTML, Javascript, Java applets, and XML documents. Communication between the Portal and the manufacturers servers is implemented using a CORBA interface. Remote administration of the Portal is enabled by dynamically-generated HTML interfaces based on XML documents. A representative group of users evaluated the system. The aim of the evaluation was validation of the usability of all of MEDICOM s functionality. The evaluation procedure was based on ISO/IEC 9126 Information technology - Software product evaluation - Quality characteristics and guidelines for their use. The overall user evaluation of the MEDICOM system was very positive. The MEDICOM system was characterized as an innovative concept that brings significant added value to medical-equipment commerce. The eventual benefits of the MEDICOM system are (a) establishment of a worldwide-accessible marketplace between manufacturers and health care professionals that provides up-to-date and high-quality product information in an easy and friendly way and (b) enhancement of the efficiency of marketing procedures and after-sales support.
Design and Implementation of a Portal for the Medical Equipment Market: MEDICOM
Kalivas, Dimitris; Panou-Diamandi, Ourania; Zeelenberg, Cees; van Nimwegen, Chris
2001-01-01
Background The MEDICOM (Medical Products Electronic Commerce) Portal provides the electronic means for medical-equipment manufacturers to communicate online with their customers while supporting the Purchasing Process and Post Market Surveillance. The Portal offers a powerful Internet-based search tool for finding medical products and manufacturers. Its main advantage is the fast, reliable and up-to-date retrieval of information while eliminating all unrelated content that a general-purpose search engine would retrieve. The Universal Medical Device Nomenclature System (UMDNS) registers all products. The Portal accepts end-user requests and generates a list of results containing text descriptions of devices, UMDNS attribute values, and links to manufacturer Web pages and online catalogues for access to more-detailed information. Device short descriptions are provided by the corresponding manufacturer. The Portal offers technical support for integration of the manufacturers' Web sites with itself. The network of the Portal and the connected manufacturers' sites is called the MEDICOM system. Objective To establish an environment hosting all the interactions of consumers (health care organizations and professionals) and providers (manufacturers, distributors, and resellers of medical devices). Methods The Portal provides the end-user interface, implements system management, and supports database compatibility. The Portal hosts information about the whole MEDICOM system (Common Database) and summarized descriptions of medical devices (Short Description Database); the manufacturers' servers present extended descriptions. The Portal provides end-user profiling and registration, an efficient product-searching mechanism, bulletin boards, links to on-line libraries and standards, on-line information for the MEDICOM system, and special messages or advertisements from manufacturers. Platform independence and interoperability characterize the system design. Relational Database Management Systems are used for the system's databases. The end-user interface is implemented using HTML, Javascript, Java applets, and XML documents. Communication between the Portal and the manufacturers' servers is implemented using a CORBA interface. Remote administration of the Portal is enabled by dynamically-generated HTML interfaces based on XML documents. A representative group of users evaluated the system. The aim of the evaluation was validation of the usability of all of MEDICOM's functionality. The evaluation procedure was based on ISO/IEC 9126 Information technology - Software product evaluation - Quality characteristics and guidelines for their use. Results The overall user evaluation of the MEDICOM system was very positive. The MEDICOM system was characterized as an innovative concept that brings significant added value to medical-equipment commerce. Conclusions The eventual benefits of the MEDICOM system are (a) establishment of a worldwide-accessible marketplace between manufacturers and health care professionals that provides up-to-date and high-quality product information in an easy and friendly way and (b) enhancement of the efficiency of marketing procedures and after-sales support. PMID:11772547
Minimum information required for a DMET experiment reporting.
Kumuthini, Judit; Mbiyavanga, Mamana; Chimusa, Emile R; Pathak, Jyotishman; Somervuo, Panu; Van Schaik, Ron Hn; Dolzan, Vita; Mizzi, Clint; Kalideen, Kusha; Ramesar, Raj S; Macek, Milan; Patrinos, George P; Squassina, Alessio
2016-09-01
To provide pharmacogenomics reporting guidelines, the information and tools required for reporting to public omic databases. For effective DMET data interpretation, sharing, interoperability, reproducibility and reporting, we propose the Minimum Information required for a DMET Experiment (MIDE) reporting. MIDE provides reporting guidelines and describes the information required for reporting, data storage and data sharing in the form of XML. The MIDE guidelines will benefit the scientific community with pharmacogenomics experiments, including reporting pharmacogenomics data from other technology platforms, with the tools that will ease and automate the generation of such reports using the standardized MIDE XML schema, facilitating the sharing, dissemination, reanalysis of datasets through accessible and transparent pharmacogenomics data reporting.
XML and its impact on content and structure in electronic health care documents.
Sokolowski, R.; Dudeck, J.
1999-01-01
Worldwide information networks have the requirement that electronic documents must be easily accessible, portable, flexible and system-independent. With the development of XML (eXtensible Markup Language), the future of electronic documents, health care informatics and the Web itself are about to change. The intent of the recently formed ASTM E31.25 subcommittee, "XML DTDs for Health Care", is to develop standard electronic document representations of paper-based health care documents and forms. A goal of the subcommittee is to work together to enhance existing levels of interoperability among the various XML/SGML standardization efforts, products and systems in health care. The ASTM E31.25 subcommittee uses common practices and software standards to develop the implementation recommendations for XML documents in health care. The implementation recommendations are being developed to standardize the many different structures of documents. These recommendations are in the form of a set of standard DTDs, or document type definitions that match the electronic document requirements in the health care industry. This paper discusses recent efforts of the ASTM E31.25 subcommittee. PMID:10566338
An Extensible Information Grid for Risk Management
NASA Technical Reports Server (NTRS)
Maluf, David A.; Bell, David G.
2003-01-01
This paper describes recent work on developing an extensible information grid for risk management at NASA - a RISK INFORMATION GRID. This grid is being developed by integrating information grid technology with risk management processes for a variety of risk related applications. To date, RISK GRID applications are being developed for three main NASA processes: risk management - a closed-loop iterative process for explicit risk management, program/project management - a proactive process that includes risk management, and mishap management - a feedback loop for learning from historical risks that escaped other processes. This is enabled through an architecture involving an extensible database, structuring information with XML, schemaless mapping of XML, and secure server-mediated communication using standard protocols.
The STP (Solar-Terrestrial Physics) Semantic Web based on the RSS1.0 and the RDF
NASA Astrophysics Data System (ADS)
Kubo, T.; Murata, K. T.; Kimura, E.; Ishikura, S.; Shinohara, I.; Kasaba, Y.; Watari, S.; Matsuoka, D.
2006-12-01
In the Solar-Terrestrial Physics (STP), it is pointed out that circulation and utilization of observation data among researchers are insufficient. To archive interdisciplinary researches, we need to overcome this circulation and utilization problems. Under such a background, authors' group has developed a world-wide database that manages meta-data of satellite and ground-based observation data files. It is noted that retrieving meta-data from the observation data and registering them to database have been carried out by hand so far. Our goal is to establish the STP Semantic Web. The Semantic Web provides a common framework that allows a variety of data shared and reused across applications, enterprises, and communities. We also expect that the secondary information related with observations, such as event information and associated news, are also shared over the networks. The most fundamental issue on the establishment is who generates, manages and provides meta-data in the Semantic Web. We developed an automatic meta-data collection system for the observation data using the RSS (RDF Site Summary) 1.0. The RSS1.0 is one of the XML-based markup languages based on the RDF (Resource Description Framework), which is designed for syndicating news and contents of news-like sites. The RSS1.0 is used to describe the STP meta-data, such as data file name, file server address and observation date. To describe the meta-data of the STP beyond RSS1.0 vocabulary, we defined original vocabularies for the STP resources using the RDF Schema. The RDF describes technical terms on the STP along with the Dublin Core Metadata Element Set, which is standard for cross-domain information resource descriptions. Researchers' information on the STP by FOAF, which is known as an RDF/XML vocabulary, creates a machine-readable metadata describing people. Using the RSS1.0 as a meta-data distribution method, the workflow from retrieving meta-data to registering them into the database is automated. This technique is applied for several database systems, such as the DARTS database system and NICT Space Weather Report Service. The DARTS is a science database managed by ISAS/JAXA in Japan. We succeeded in generating and collecting the meta-data automatically for the CDF (Common data Format) data, such as Reimei satellite data, provided by the DARTS. We also create an RDF service for space weather report and real-time global MHD simulation 3D data provided by the NICT. Our Semantic Web system works as follows: The RSS1.0 documents generated on the data sites (ISAS and NICT) are automatically collected by a meta-data collection agent. The RDF documents are registered and the agent extracts meta-data to store them in the Sesame, which is an open source RDF database with support for RDF Schema inferencing and querying. The RDF database provides advanced retrieval processing that has considered property and relation. Finally, the STP Semantic Web provides automatic processing or high level search for the data which are not only for observation data but for space weather news, physical events, technical terms and researches information related to the STP.
A Flexible Online Metadata Editing and Management System
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aguilar, Raul; Pan, Jerry Yun; Gries, Corinna
2010-01-01
A metadata editing and management system is being developed employing state of the art XML technologies. A modular and distributed design was chosen for scalability, flexibility, options for customizations, and the possibility to add more functionality at a later stage. The system consists of a desktop design tool or schema walker used to generate code for the actual online editor, a native XML database, and an online user access management application. The design tool is a Java Swing application that reads an XML schema, provides the designer with options to combine input fields into online forms and give the fieldsmore » user friendly tags. Based on design decisions, the tool generates code for the online metadata editor. The code generated is an implementation of the XForms standard using the Orbeon Framework. The design tool fulfills two requirements: First, data entry forms based on one schema may be customized at design time and second data entry applications may be generated for any valid XML schema without relying on custom information in the schema. However, the customized information generated at design time is saved in a configuration file which may be re-used and changed again in the design tool. Future developments will add functionality to the design tool to integrate help text, tool tips, project specific keyword lists, and thesaurus services. Additional styling of the finished editor is accomplished via cascading style sheets which may be further customized and different look-and-feels may be accumulated through the community process. The customized editor produces XML files in compliance with the original schema, however, data from the current page is saved into a native XML database whenever the user moves to the next screen or pushes the save button independently of validity. Currently the system uses the open source XML database eXist for storage and management, which comes with third party online and desktop management tools. However, access to metadata files in the application introduced here is managed in a custom online module, using a MySQL backend accessed by a simple Java Server Faces front end. A flexible system with three grouping options, organization, group and single editing access is provided. Three levels were chosen to distribute administrative responsibilities and handle the common situation of an information manager entering the bulk of the metadata but leave specifics to the actual data provider.« less
Conversion of Radiology Reporting Templates to the MRRT Standard.
Kahn, Charles E; Genereaux, Brad; Langlotz, Curtis P
2015-10-01
In 2013, the Integrating the Healthcare Enterprise (IHE) Radiology workgroup developed the Management of Radiology Report Templates (MRRT) profile, which defines both the format of radiology reporting templates using an extension of Hypertext Markup Language version 5 (HTML5), and the transportation mechanism to query, retrieve, and store these templates. Of 200 English-language report templates published by the Radiological Society of North America (RSNA), initially encoded as text and in an XML schema language, 168 have been converted successfully into MRRT using a combination of automated processes and manual editing; conversion of the remaining 32 templates is in progress. The automated conversion process applied Extensible Stylesheet Language Transformation (XSLT) scripts, an XML parsing engine, and a Java servlet. The templates were validated for proper HTML5 and MRRT syntax using web-based services. The MRRT templates allow radiologists to share best-practice templates across organizations and have been uploaded to the template library to supersede the prior XML-format templates. By using MRRT transactions and MRRT-format templates, radiologists will be able to directly import and apply templates from the RSNA Report Template Library in their own MRRT-compatible vendor systems. The availability of MRRT-format reporting templates will stimulate adoption of the MRRT standard and is expected to advance the sharing and use of templates to improve the quality of radiology reports.
KAT: A Flexible XML-based Knowledge Authoring Environment
Hulse, Nathan C.; Rocha, Roberto A.; Del Fiol, Guilherme; Bradshaw, Richard L.; Hanna, Timothy P.; Roemer, Lorrie K.
2005-01-01
As part of an enterprise effort to develop new clinical information systems at Intermountain Health Care, the authors have built a knowledge authoring tool that facilitates the development and refinement of medical knowledge content. At present, users of the application can compose order sets and an assortment of other structured clinical knowledge documents based on XML schemas. The flexible nature of the application allows the immediate authoring of new types of documents once an appropriate XML schema and accompanying Web form have been developed and stored in a shared repository. The need for a knowledge acquisition tool stems largely from the desire for medical practitioners to be able to write their own content for use within clinical applications. We hypothesize that medical knowledge content for clinical use can be successfully created and maintained through XML-based document frameworks containing structured and coded knowledge. PMID:15802477
Suggestions for Improvement of User Access to GOCE L2 Data
NASA Astrophysics Data System (ADS)
Tscherning, C. C.
2011-07-01
ESA's has required that most GOCE L2 products are delivered in XML format. This creates difficulties for the users because a Parser written in Perl is needed to convert the files to files without XML tags. However several products, such as the coefficients of spherical harmonic coefficients are made available on standard form through the International Center for Global Gravity Field Models. The variance-covariance information for the gravity field models is only available without XML tags. It is suggested that all XML products are made available in the Virtual Data Archive as files without tags. This will besides making the data directly usable by a FORTRAN program also reduce the size (storage requirements) of the product to about 30 %. A further reduction of used storage should be made by tuning the number of digits for the individual quantities in the products, so that it corresponds to the actual number of significant digits.
XML Based Scientific Data Management Facility
NASA Technical Reports Server (NTRS)
Mehrotra, P.; Zubair, M.; Bushnell, Dennis M. (Technical Monitor)
2002-01-01
The World Wide Web consortium has developed an Extensible Markup Language (XML) to support the building of better information management infrastructures. The scientific computing community realizing the benefits of XML has designed markup languages for scientific data. In this paper, we propose a XML based scientific data management ,facility, XDMF. The project is motivated by the fact that even though a lot of scientific data is being generated, it is not being shared because of lack of standards and infrastructure support for discovering and transforming the data. The proposed data management facility can be used to discover the scientific data itself, the transformation functions, and also for applying the required transformations. We have built a prototype system of the proposed data management facility that can work on different platforms. We have implemented the system using Java, and Apache XSLT engine Xalan. To support remote data and transformation functions, we had to extend the XSLT specification and the Xalan package.
A Simple XML Producer-Consumer Protocol
NASA Technical Reports Server (NTRS)
Smith, Warren; Gunter, Dan; Quesnel, Darcy
2000-01-01
This document describes a simple XML-based protocol that can be used for producers of events to communicate with consumers of events. The protocol described here is not meant to be the most efficient protocol, the most logical protocol, or the best protocol in any way. This protocol was defined quickly and it's intent is to give us a reasonable protocol that we can implement relatively easily and then use to gain experience in distributed event services. This experience will help us evaluate proposals for event representations, XML-based encoding of information, and communication protocols. The next section of this document describes how we represent events in this protocol and then defines the two events that we choose to use for our initial experiments. These definitions are made by example so that they are informal and easy to understand. The following section then proceeds to define the producer-consumer protocol we have agreed upon for our initial experiments.
Knowledge representation for fuzzy inference aided medical image interpretation.
Gal, Norbert; Stoicu-Tivadar, Vasile
2012-01-01
Knowledge defines how an automated system transforms data into information. This paper suggests a representation method of medical imaging knowledge using fuzzy inference systems coded in XML files. The imaging knowledge incorporates features of the investigated objects in linguistic form and inference rules that can transform the linguistic data into information about a possible diagnosis. A fuzzy inference system is used to model the vagueness of the linguistic medical imaging terms. XML files are used to facilitate easy manipulation and deployment of the knowledge into the imaging software. Preliminary results are presented.
2003-01-01
xml Internet . Teal Group Corp. Aviation Week and Space Technology , 18 March 2003, 1. 62 Babak Minovi, “Turbine Industry Struggles with Weak Markets ...xml Internet . Teal Group Corp. Aviation Week and Space Technology , 18 March 2003, 1. 64 Babak Minovi, “Turbine Industry Struggles with Weak Markets ...what several executives referred to as the “perfect storm” now blowing through the aviation market . With this information many questions remain: Will
Schema for Spacecraft-Command Dictionary
NASA Technical Reports Server (NTRS)
Laubach, Sharon; Garcia, Celina; Maxwell, Scott; Wright, Jesse
2008-01-01
An Extensible Markup Language (XML) schema was developed as a means of defining and describing a structure for capturing spacecraft command- definition and tracking information in a single location in a form readable by both engineers and software used to generate software for flight and ground systems. A structure defined within this schema is then used as the basis for creating an XML file that contains command definitions.
Streamlining geospatial metadata in the Semantic Web
NASA Astrophysics Data System (ADS)
Fugazza, Cristiano; Pepe, Monica; Oggioni, Alessandro; Tagliolato, Paolo; Carrara, Paola
2016-04-01
In the geospatial realm, data annotation and discovery rely on a number of ad-hoc formats and protocols. These have been created to enable domain-specific use cases generalized search is not feasible for. Metadata are at the heart of the discovery process and nevertheless they are often neglected or encoded in formats that either are not aimed at efficient retrieval of resources or are plainly outdated. Particularly, the quantum leap represented by the Linked Open Data (LOD) movement did not induce so far a consistent, interlinked baseline in the geospatial domain. In a nutshell, datasets, scientific literature related to them, and ultimately the researchers behind these products are only loosely connected; the corresponding metadata intelligible only to humans, duplicated on different systems, seldom consistently. Instead, our workflow for metadata management envisages i) editing via customizable web- based forms, ii) encoding of records in any XML application profile, iii) translation into RDF (involving the semantic lift of metadata records), and finally iv) storage of the metadata as RDF and back-translation into the original XML format with added semantics-aware features. Phase iii) hinges on relating resource metadata to RDF data structures that represent keywords from code lists and controlled vocabularies, toponyms, researchers, institutes, and virtually any description one can retrieve (or directly publish) in the LOD Cloud. In the context of a distributed Spatial Data Infrastructure (SDI) built on free and open-source software, we detail phases iii) and iv) of our workflow for the semantics-aware management of geospatial metadata.
Spreadsheets for Analyzing and Optimizing Space Missions
NASA Technical Reports Server (NTRS)
Some, Raphael R.; Agrawal, Anil K.; Czikmantory, Akos J.; Weisbin, Charles R.; Hua, Hook; Neff, Jon M.; Cowdin, Mark A.; Lewis, Brian S.; Iroz, Juana; Ross, Rick
2009-01-01
XCALIBR (XML Capability Analysis LIBRary) is a set of Extensible Markup Language (XML) database and spreadsheet- based analysis software tools designed to assist in technology-return-on-investment analysis and optimization of technology portfolios pertaining to outer-space missions. XCALIBR is also being examined for use in planning, tracking, and documentation of projects. An XCALIBR database contains information on mission requirements and technological capabilities, which are related by use of an XML taxonomy. XCALIBR incorporates a standardized interface for exporting data and analysis templates to an Excel spreadsheet. Unique features of XCALIBR include the following: It is inherently hierarchical by virtue of its XML basis. The XML taxonomy codifies a comprehensive data structure and data dictionary that includes performance metrics for spacecraft, sensors, and spacecraft systems other than sensors. The taxonomy contains >700 nodes representing all levels, from system through subsystem to individual parts. All entries are searchable and machine readable. There is an intuitive Web-based user interface. The software automatically matches technologies to mission requirements. The software automatically generates, and makes the required entries in, an Excel return-on-investment analysis software tool. The results of an analysis are presented in both tabular and graphical displays.
A unified approach for development of Urdu Corpus for OCR and demographic purpose
NASA Astrophysics Data System (ADS)
Choudhary, Prakash; Nain, Neeta; Ahmed, Mushtaq
2015-02-01
This paper presents a methodology for the development of an Urdu handwritten text image Corpus and application of Corpus linguistics in the field of OCR and information retrieval from handwritten document. Compared to other language scripts, Urdu script is little bit complicated for data entry. To enter a single character it requires a combination of multiple keys entry. Here, a mixed approach is proposed and demonstrated for building Urdu Corpus for OCR and Demographic data collection. Demographic part of database could be used to train a system to fetch the data automatically, which will be helpful to simplify existing manual data-processing task involved in the field of data collection such as input forms like Passport, Ration Card, Voting Card, AADHAR, Driving licence, Indian Railway Reservation, Census data etc. This would increase the participation of Urdu language community in understanding and taking benefit of the Government schemes. To make availability and applicability of database in a vast area of corpus linguistics, we propose a methodology for data collection, mark-up, digital transcription, and XML metadata information for benchmarking.
TCGA2BED: extracting, extending, integrating, and querying The Cancer Genome Atlas.
Cumbo, Fabio; Fiscon, Giulia; Ceri, Stefano; Masseroli, Marco; Weitschek, Emanuel
2017-01-03
Data extraction and integration methods are becoming essential to effectively access and take advantage of the huge amounts of heterogeneous genomics and clinical data increasingly available. In this work, we focus on The Cancer Genome Atlas, a comprehensive archive of tumoral data containing the results of high-throughout experiments, mainly Next Generation Sequencing, for more than 30 cancer types. We propose TCGA2BED a software tool to search and retrieve TCGA data, and convert them in the structured BED format for their seamless use and integration. Additionally, it supports the conversion in CSV, GTF, JSON, and XML standard formats. Furthermore, TCGA2BED extends TCGA data with information extracted from other genomic databases (i.e., NCBI Entrez Gene, HGNC, UCSC, and miRBase). We also provide and maintain an automatically updated data repository with publicly available Copy Number Variation, DNA-methylation, DNA-seq, miRNA-seq, and RNA-seq (V1,V2) experimental data of TCGA converted into the BED format, and their associated clinical and biospecimen meta data in attribute-value text format. The availability of the valuable TCGA data in BED format reduces the time spent in taking advantage of them: it is possible to efficiently and effectively deal with huge amounts of cancer genomic data integratively, and to search, retrieve and extend them with additional information. The BED format facilitates the investigators allowing several knowledge discovery analyses on all tumor types in TCGA with the final aim of understanding pathological mechanisms and aiding cancer treatments.
A Framework for Resilient Remote Monitoring
2014-08-01
of low-level observables are availa- ble, audited , and recorded. This establishes the need for a re- mote monitoring framework that can integrate with...Security, WS-Policy, SAML, XML Signature, and XML Encryption. Pearson Higher Education, 2004. [3] OMG, “Common Secure Interoperability Protocol...www.darpa.mil/Our_Work/I2O/Programs/Integrated_Cyb er_Analysis_System_%28ICAS%29.aspx. [8] D. Miller and B. Pearson , Security information and event man
ISO, FGDC, DIF and Dublin Core - Making Sense of Metadata Standards for Earth Science Data
NASA Astrophysics Data System (ADS)
Jones, P. R.; Ritchey, N. A.; Peng, G.; Toner, V. A.; Brown, H.
2014-12-01
Metadata standards provide common definitions of metadata fields for information exchange across user communities. Despite the broad adoption of metadata standards for Earth science data, there are still heterogeneous and incompatible representations of information due to differences between the many standards in use and how each standard is applied. Federal agencies are required to manage and publish metadata in different metadata standards and formats for various data catalogs. In 2014, the NOAA National Climatic data Center (NCDC) managed metadata for its scientific datasets in ISO 19115-2 in XML, GCMD Directory Interchange Format (DIF) in XML, DataCite Schema in XML, Dublin Core in XML, and Data Catalog Vocabulary (DCAT) in JSON, with more standards and profiles of standards planned. Of these standards, the ISO 19115-series metadata is the most complete and feature-rich, and for this reason it is used by NCDC as the source for the other metadata standards. We will discuss the capabilities of metadata standards and how these standards are being implemented to document datasets. Successful implementations include developing translations and displays using XSLTs, creating links to related data and resources, documenting dataset lineage, and establishing best practices. Benefits, gaps, and challenges will be highlighted with suggestions for improved approaches to metadata storage and maintenance.
An introduction to the Semantic Web for health sciences librarians*
Robu, Ioana; Robu, Valentin; Thirion, Benoit
2006-01-01
Objectives: The paper (1) introduces health sciences librarians to the main concepts and principles of the Semantic Web (SW) and (2) briefly reviews a number of projects on the handling of biomedical information that uses SW technology. Methodology: The paper is structured into two main parts. “Semantic Web Technology” provides a high-level description, with examples, of the main standards and concepts: extensible markup language (XML), Resource Description Framework (RDF), RDF Schema (RDFS), ontologies, and their utility in information retrieval, concluding with mention of more advanced SW languages and their characteristics. “Semantic Web Applications and Research Projects in the Biomedical Field” is a brief review of the Unified Medical Language System (UMLS), Generalised Architecture for Languages, Encyclopedias and Nomenclatures in Medicine (GALEN), HealthCyberMap, LinkBase, and the thesaurus of the National Cancer Institute (NCI). The paper also mentions other benefits and by-products of the SW, citing projects related to them. Discussion and Conclusions: Some of the problems facing the SW vision are presented, especially the ways in which the librarians' expertise in organizing knowledge and in structuring information may contribute to SW projects. PMID:16636713
Progress on an implementation of MIFlowCyt in XML
NASA Astrophysics Data System (ADS)
Leif, Robert C.; Leif, Stephanie H.
2015-03-01
Introduction: The International Society for Advancement of Cytometry (ISAC) Data Standards Task Force (DSTF) has created a standard for the Minimum Information about a Flow Cytometry Experiment (MIFlowCyt 1.0). The CytometryML schemas, are based in part upon the Flow Cytometry Standard and Digital Imaging and Communication (DICOM) standards. CytometryML has and will be extended and adapted to include MIFlowCyt, as well as to serve as a common standard for flow and image cytometry (digital microscopy). Methods: The MIFlowCyt data-types were created, as is the rest of CytometryML, in the XML Schema Definition Language (XSD1.1). Individual major elements of the MIFlowCyt schema were translated into XML and filled with reasonable data. A small section of the code was formatted with HTML formatting elements. Results: The differences in the amount of detail to be recorded for 1) users of standard techniques including data analysts and 2) others, such as method and device creators, laboratory and other managers, engineers, and regulatory specialists required that separate data-types be created to describe the instrument configuration and components. A very substantial part of the MIFlowCyt element that describes the Experimental Overview part of the MIFlowCyt and substantial parts of several other major elements have been developed. Conclusions: The future use of structured XML tags and web technology should facilitate searching of experimental information, its presentation, and inclusion in structured research, clinical, and regulatory documents, as well as demonstrate in publications adherence to the MIFlowCyt standard. The use of CytometryML together with XML technology should also result in the textual and numeric data being published using web technology without any change in composition. Preliminary testing indicates that CytometryML XML pages can be directly formatted with the combination of HTML and CSS.
Speed up of XML parsers with PHP language implementation
NASA Astrophysics Data System (ADS)
Georgiev, Bozhidar; Georgieva, Adriana
2012-11-01
In this paper, authors introduce PHP5's XML implementation and show how to read, parse, and write a short and uncomplicated XML file using Simple XML in a PHP environment. The possibilities for mutual work of PHP5 language and XML standard are described. The details of parsing process with Simple XML are also cleared. A practical project PHP-XML-MySQL presents the advantages of XML implementation in PHP modules. This approach allows comparatively simple search of XML hierarchical data by means of PHP software tools. The proposed project includes database, which can be extended with new data and new XML parsing functions.
Report of Official Foreign Travel to Germany, May 16-June 1, 2001
DOE Office of Scientific and Technical Information (OSTI.GOV)
J. D. Mason
2001-06-18
The Department of Energy (DOE) and associated agencies have moved rapidly toward electronic production, management, and dissemination of scientific and technical information. The World-Wide Web (WWW) has become a primary means of information dissemination. Electronic commerce (EC) is becoming the preferred means of procurement. DOE, like other government agencies, depends on and encourages the use of international standards in data communications. Like most government agencies, DOE has expressed a preference for openly developed standards over proprietary designs promoted as ''standards'' by vendors. In particular, there is a preference for standards developed by organizations such as the International Organization for Standardizationmore » (ISO) and the American National Standards Institute (ANSI) that use open, public processes to develop their standards. Among the most widely adopted international standards is the Standard Generalized Markup Language (SGML, ISO 8879:1986, FIPS 152), to which DOE long ago made a commitment. Besides the official commitment, which has resulted in several specialized projects, DOE makes heavy use of coding derived from SGML: Most documents on the WWW are coded in HTML (Hypertext Markup Language), which is an application of SGML. The World-Wide Web Consortium (W3C), with the backing of major software houses like Adobe, IBM, Microsoft, Netscape, Oracle, and Sun, is promoting XML (eXtensible Markup Language), a class of SGML applications, for the future of the WWW and the basis for EC. In support of DOE's use of these standards, I have served since 1985 as Chairman of the international committee responsible for SGML and related standards, ISO/IEC JTC1/SC34 (SC34) and its predecessor organizations. During my May 2001 trip, I chaired the spring 2001 meeting of SC34 in Berlin, Germany. I also attended XML Europe 2001, a major conference on the use of SGML and XML sponsored by the Graphic Communications Association (GCA), and chaired a meeting of the International SGML/XML Users' Group (ISUG). In addition to the widespread use of the WWW among DOE's plants and facilities in Oak Ridge and among DOE sites across the nation, there have been several past and present SGML- and XML-based projects at the Y-12 National Security Complex (Y-12). Our local project team has done SGML and XML development at Y-12 and Oak Ridge National Laboratory (ORNL) since the late 1980s. SGML is a component of the Weapons Records Archiving and Preservation (WRAP) project at Y-12 and is the format for catalog metadata chosen for weapons records by the Nuclear Weapons Information Group (NWIG). The ''Ferret'' system for automated classification analysis uses XML to structure its knowledge base. The Ferret team also provides XML consulting to OSTI and DOE Headquarters, particularly the National Nuclear Security Administration (NNSA). Supporting standards development allows DOE and Y-12 the opportunity both to provide input into the process and to benefit from contact with some of the leading experts in the subject matter. Oak Ridge has been for some years the location to which other DOE sites turn for expertise in SGML, XML, and related topics.« less
CytometryML, an XML format based on DICOM and FCS for analytical cytology data.
Leif, Robert C; Leif, Suzanne B; Leif, Stephanie H
2003-07-01
Flow Cytometry Standard (FCS) was initially created to standardize the software researchers use to analyze, transmit, and store data produced by flow cytometers and sorters. Because of the clinical utility of flow cytometry, it is necessary to have a standard consistent with the requirements of medical regulatory agencies. We extended the existing mapping of FCS to the Digital Imaging and Communications in Medicine (DICOM) standard to include list-mode data produced by flow cytometry, laser scanning cytometry, and microscopic image cytometry. FCS list-mode was mapped to the DICOM Waveform Information Object. We created a collection of Extensible Markup Language (XML) schemas to express the DICOM analytical cytologic text-based data types except for large binary objects. We also developed a cytometry markup language, CytometryML, in an open environment subject to continuous peer review. The feasibility of expressing the data contained in FCS, including list-mode in DICOM, was demonstrated; and a preliminary mapping for list-mode data in the form of XML schemas and documents was completed. DICOM permitted the creation of indices that can be used to rapidly locate in a list-mode file the cells that are members of a subset. DICOM and its coding schemes for other medical standards can be represented by XML schemas, which can be combined with other relevant XML applications, such as Mathematical Markup Language (MathML). The use of XML format based on DICOM for analytical cytology met most of the previously specified requirements and appears capable of meeting the others; therefore, the present FCS should be retired and replaced by an open, XML-based, standard CytometryML. Copyright 2003 Wiley-Liss, Inc.
A collaborative computer auditing system under SOA-based conceptual model
NASA Astrophysics Data System (ADS)
Cong, Qiushi; Huang, Zuoming; Hu, Jibing
2013-03-01
Some of the current challenges of computer auditing are the obstacles to retrieving, converting and translating data from different database schema. During the last few years, there are many data exchange standards under continuous development such as Extensible Business Reporting Language (XBRL). These XML document standards can be used for data exchange among companies, financial institutions, and audit firms. However, for many companies, it is still expensive and time-consuming to translate and provide XML messages with commercial application packages, because it is complicated and laborious to search and transform data from thousands of tables in the ERP databases. How to transfer transaction documents for supporting continuous auditing or real time auditing between audit firms and their client companies is a important topic. In this paper, a collaborative computer auditing system under SOA-based conceptual model is proposed. By utilizing the widely used XML document standards and existing data transformation applications developed by different companies and software venders, we can wrap these application as commercial web services that will be easy implemented under the forthcoming application environments: service-oriented architecture (SOA). Under the SOA environments, the multiagency mechanism will help the maturity and popularity of data assurance service over the Internet. By the wrapping of data transformation components with heterogeneous databases or platforms, it will create new component markets composed by many software vendors and assurance service companies to provide data assurance services for audit firms, regulators or third parties.
PubMedPortable: A Framework for Supporting the Development of Text Mining Applications.
Döring, Kersten; Grüning, Björn A; Telukunta, Kiran K; Thomas, Philippe; Günther, Stefan
2016-01-01
Information extraction from biomedical literature is continuously growing in scope and importance. Many tools exist that perform named entity recognition, e.g. of proteins, chemical compounds, and diseases. Furthermore, several approaches deal with the extraction of relations between identified entities. The BioCreative community supports these developments with yearly open challenges, which led to a standardised XML text annotation format called BioC. PubMed provides access to the largest open biomedical literature repository, but there is no unified way of connecting its data to natural language processing tools. Therefore, an appropriate data environment is needed as a basis to combine different software solutions and to develop customised text mining applications. PubMedPortable builds a relational database and a full text index on PubMed citations. It can be applied either to the complete PubMed data set or an arbitrary subset of downloaded PubMed XML files. The software provides the infrastructure to combine stand-alone applications by exporting different data formats, e.g. BioC. The presented workflows show how to use PubMedPortable to retrieve, store, and analyse a disease-specific data set. The provided use cases are well documented in the PubMedPortable wiki. The open-source software library is small, easy to use, and scalable to the user's system requirements. It is freely available for Linux on the web at https://github.com/KerstenDoering/PubMedPortable and for other operating systems as a virtual container. The approach was tested extensively and applied successfully in several projects.
PubMedPortable: A Framework for Supporting the Development of Text Mining Applications
Döring, Kersten; Grüning, Björn A.; Telukunta, Kiran K.; Thomas, Philippe; Günther, Stefan
2016-01-01
Information extraction from biomedical literature is continuously growing in scope and importance. Many tools exist that perform named entity recognition, e.g. of proteins, chemical compounds, and diseases. Furthermore, several approaches deal with the extraction of relations between identified entities. The BioCreative community supports these developments with yearly open challenges, which led to a standardised XML text annotation format called BioC. PubMed provides access to the largest open biomedical literature repository, but there is no unified way of connecting its data to natural language processing tools. Therefore, an appropriate data environment is needed as a basis to combine different software solutions and to develop customised text mining applications. PubMedPortable builds a relational database and a full text index on PubMed citations. It can be applied either to the complete PubMed data set or an arbitrary subset of downloaded PubMed XML files. The software provides the infrastructure to combine stand-alone applications by exporting different data formats, e.g. BioC. The presented workflows show how to use PubMedPortable to retrieve, store, and analyse a disease-specific data set. The provided use cases are well documented in the PubMedPortable wiki. The open-source software library is small, easy to use, and scalable to the user’s system requirements. It is freely available for Linux on the web at https://github.com/KerstenDoering/PubMedPortable and for other operating systems as a virtual container. The approach was tested extensively and applied successfully in several projects. PMID:27706202
In-field Access to Geoscientific Metadata through GPS-enabled Mobile Phones
NASA Astrophysics Data System (ADS)
Hobona, Gobe; Jackson, Mike; Jordan, Colm; Butchart, Ben
2010-05-01
Fieldwork is an integral part of much geosciences research. But whilst geoscientists have physical or online access to data collections whilst in the laboratory or at base stations, equivalent in-field access is not standard or straightforward. The increasing availability of mobile internet and GPS-supported mobile phones, however, now provides the basis for addressing this issue. The SPACER project was commissioned by the Rapid Innovation initiative of the UK Joint Information Systems Committee (JISC) to explore the potential for GPS-enabled mobile phones to access geoscientific metadata collections. Metadata collections within the geosciences and the wider geospatial domain can be disseminated through web services based on the Catalogue Service for Web(CSW) standard of the Open Geospatial Consortium (OGC) - a global grouping of over 380 private, public and academic organisations aiming to improve interoperability between geospatial technologies. CSW offers an XML-over-HTTP interface for querying and retrieval of geospatial metadata. By default, the metadata returned by CSW is based on the ISO19115 standard and encoded in XML conformant to ISO19139. The SPACER project has created a prototype application that enables mobile phones to send queries to CSW containing user-defined keywords and coordinates acquired from GPS devices built-into the phones. The prototype has been developed using the free and open source Google Android platform. The mobile application offers views for listing titles, presenting multiple metadata elements and a Google Map with an overlay of bounding coordinates of datasets. The presentation will describe the architecture and approach applied in the development of the prototype.
Astronomical Instrumentation System Markup Language
NASA Astrophysics Data System (ADS)
Goldbaum, Jesse M.
2016-05-01
The Astronomical Instrumentation System Markup Language (AISML) is an Extensible Markup Language (XML) based file format for maintaining and exchanging information about astronomical instrumentation. The factors behind the need for an AISML are first discussed followed by the reasons why XML was chosen as the format. Next it's shown how XML also provides the framework for a more precise definition of an astronomical instrument and how these instruments can be combined to form an Astronomical Instrumentation System (AIS). AISML files for several instruments as well as one for a sample AIS are provided. The files demonstrate how AISML can be utilized for various tasks from web page generation and programming interface to instrument maintenance and quality management. The advantages of widespread adoption of AISML are discussed.
QuakeML - An XML Schema for Seismology
NASA Astrophysics Data System (ADS)
Wyss, A.; Schorlemmer, D.; Maraini, S.; Baer, M.; Wiemer, S.
2004-12-01
We propose an extensible format-definition for seismic data (QuakeML). Sharing data and seismic information efficiently is one of the most important issues for research and observational seismology in the future. The eXtensible Markup Language (XML) is playing an increasingly important role in the exchange of a variety of data. Due to its extensible definition capabilities, its wide acceptance and the existing large number of utilities and libraries for XML, a structured representation of various types of seismological data should in our opinion be developed by defining a 'QuakeML' standard. Here we present the QuakeML definitions for parameter databases and further efforts, e.g. a central QuakeML catalog database and a web portal for exchanging codes and stylesheets.
Informal information for web-based engineering catalogues
NASA Astrophysics Data System (ADS)
Allen, Richard D.; Culley, Stephen J.; Hicks, Ben J.
2001-10-01
Success is highly dependent on the ability of a company to efficiently produce optimal designs. In order to achieve this companies must minimize time to market and possess the ability to make fully informed decisions at the early phase of the design process. Such decisions may include the choice of component and suppliers, as well as cost and maintenance considerations. Computer modeling and electronic catalogues are becoming the preferred medium for the selection and design of mechanical components. In utilizing these techniques, the designer demands the capability to identify, evaluate and select mechanical components both quantitatively and qualitatively. Quantitative decisions generally encompass performance data included in the formal catalogue representation. It is in the area of qualitative decisions that the use of what the authors call 'Informal Information' is of crucial importance. Thus, 'Informal Information' must often be incorporated into the selection process and selection systems. This would enable more informed decisions to be made quicker, without the need for information retrieval via discussion with colleagues in the design environment. This paper provides an overview of the use of electronic information in the design of mechanical systems, including a discussion of limitations of current technology. The importance of Informal Information is discussed and the requirements for association with web based electronic catalogues are developed. This system is based on a flexible XML schema and enables the storage, classification and recall of Informal Information packets. Furthermore, a strategy for the inclusion of Informal Information is proposed, and an example case is used to illustrate the benefits.
Joint Battlespace Infosphere: Information Management Within a C2 Enterprise
2005-06-01
using. In version 1.2, we support both MySQL and Oracle as underlying implementations where the XML metadata schema is mapped into relational tables in...Identity Servers, Role-Based Access Control, and Policy Representation – Databases: Oracle , MySQL , TigerLogic, Berkeley XML DB 15 Instrumentation Services...converted to SQL for execution. Invocations are then forwarded to the appropriate underlying IOR core components that have the responsibility of issuing
C3I and Modelling and Simulation (M&S) Interoperability
2004-03-01
customised Open Source products. The technical implementation is based on the use of the eXtendend Markup Language (XML) and Python . XML is developed...to structure, store and send information. The language is focus on the description of data. Python is a portable, interpreted, object-oriented...programming language. A huge variety of usable Open Source Projects were issued by the Python Community. 3.1 Phase 1: Feasibility Studies Phase 1 was
MPEG-7 audio-visual indexing test-bed for video retrieval
NASA Astrophysics Data System (ADS)
Gagnon, Langis; Foucher, Samuel; Gouaillier, Valerie; Brun, Christelle; Brousseau, Julie; Boulianne, Gilles; Osterrath, Frederic; Chapdelaine, Claude; Dutrisac, Julie; St-Onge, Francis; Champagne, Benoit; Lu, Xiaojian
2003-12-01
This paper reports on the development status of a Multimedia Asset Management (MAM) test-bed for content-based indexing and retrieval of audio-visual documents within the MPEG-7 standard. The project, called "MPEG-7 Audio-Visual Document Indexing System" (MADIS), specifically targets the indexing and retrieval of video shots and key frames from documentary film archives, based on audio-visual content like face recognition, motion activity, speech recognition and semantic clustering. The MPEG-7/XML encoding of the film database is done off-line. The description decomposition is based on a temporal decomposition into visual segments (shots), key frames and audio/speech sub-segments. The visible outcome will be a web site that allows video retrieval using a proprietary XQuery-based search engine and accessible to members at the Canadian National Film Board (NFB) Cineroute site. For example, end-user will be able to ask to point on movie shots in the database that have been produced in a specific year, that contain the face of a specific actor who tells a specific word and in which there is no motion activity. Video streaming is performed over the high bandwidth CA*net network deployed by CANARIE, a public Canadian Internet development organization.
PyPDB: a Python API for the Protein Data Bank.
Gilpin, William
2016-01-01
We have created a Python programming interface for the RCSB Protein Data Bank (PDB) that allows search and data retrieval for a wide range of result types, including BLAST and sequence motif queries. The API relies on the existing XML-based API and operates by creating custom XML requests from native Python types, allowing extensibility and straightforward modification. The package has the ability to perform many types of advanced search of the PDB that are otherwise only available through the PDB website. PyPDB is implemented exclusively in Python 3 using standard libraries for maximal compatibility. The most up-to-date version, including iPython notebooks containing usage tutorials, is available free-of-charge under an open-source MIT license via GitHub at https://github.com/williamgilpin/pypdb, and the full API reference is at http://williamgilpin.github.io/pypdb_docs/html/. The latest stable release is also available on PyPI. wgilpin@stanford.edu. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
A Petri Net-Based Software Process Model for Developing Process-Oriented Information Systems
NASA Astrophysics Data System (ADS)
Li, Yu; Oberweis, Andreas
Aiming at increasing flexibility, efficiency, effectiveness, and transparency of information processing and resource deployment in organizations to ensure customer satisfaction and high quality of products and services, process-oriented information systems (POIS) represent a promising realization form of computerized business information systems. Due to the complexity of POIS, explicit and specialized software process models are required to guide POIS development. In this chapter we characterize POIS with an architecture framework and present a Petri net-based software process model tailored for POIS development with consideration of organizational roles. As integrated parts of the software process model, we also introduce XML nets, a variant of high-level Petri nets as basic methodology for business processes modeling, and an XML net-based software toolset providing comprehensive functionalities for POIS development.
C2 Product-Centric Approach to Transforming Current C4ISR Information Architectures
2004-06-01
each type of environment . For “Cultural Feature” Entity Kind only the “Land” Domain is defined and for “ Environmental ” Entity Kind only the...take advantage of both worlds. In particular, the unifying concept of a Model-Driven Architecture (MDA) under development by the Object Management...Exchange Requirements (IER) to an XML environment . FCS [4] developers have embraced both UML and XML for their architectures and MIP [5] too is migrating
A JEE RESTful service to access Conditions Data in ATLAS
NASA Astrophysics Data System (ADS)
Formica, Andrea; Gallas, E. J.
2015-12-01
Usage of condition data in ATLAS is extensive for offline reconstruction and analysis (e.g. alignment, calibration, data quality). The system is based on the LCG Conditions Database infrastructure, with read and write access via an ad hoc C++ API (COOL), a system which was developed before Run 1 data taking began. The infrastructure dictates that the data is organized into separate schemas (assigned to subsystems/groups storing distinct and independent sets of conditions), making it difficult to access information from several schemas at the same time. We have thus created PL/SQL functions containing queries to provide content extraction at multi-schema level. The PL/SQL API has been exposed to external clients by means of a Java application providing DB access via REST services, deployed inside an application server (JBoss WildFly). The services allow navigation over multiple schemas via simple URLs. The data can be retrieved either in XML or JSON formats, via simple clients (like curl or Web browsers).
Making journals accessible to the visually impaired: the future is near
GARDNER, John; BULATOV, Vladimir; KELLY, Robert
2010-01-01
The American Physical Society (APS) has been a leader in using markup languages for publishing. ViewPlus has led development of innovative technologies for graphical information accessibility by people with print disabilities. APS, ViewPlus, and other collaborators in the Enhanced Reading Project are working together to develop the necessary technology and infrastructure for APS to publish its journals in the DAISY (Digital Accessible Information SYstem) eXtended Markup Language (XML) format, in which all text, math, and figures would be accessible to people who are blind or have other print disabilities. The first APS DAISY XML publications are targeted for late 2010. PMID:20676358
Robinson, Judas; de Lusignan, Simon; Kostkova, Patty; Madge, Bruce; Marsh, A; Biniaris, C
2006-01-01
Rich Site Summary (RSS) feeds are a method for disseminating and syndicating the contents of a website using extensible mark-up language (XML). The Primary Care Electronic Library (PCEL) distributes recent additions to the site in the form of an RSS feed. When new resources are added to PCEL, they are manually assigned medical subject headings (MeSH terms), which are then automatically mapped to SNOMED-CT terms using the Unified Medical Language System (UMLS) Metathesaurus. The library is thus searchable using MeSH or SNOMED-CT. Our syndicate partner wished to have remote access to PCEL coronary heart disease (CHD) information resources based on SNOMED-CT search terms. To pilot the supply of relevant information resources in response to clinically coded requests, using RSS syndication for transmission between web servers. Our syndicate partner provided a list of CHD SNOMED-CT terms to its end-users, a list which was coded according to UMLS specifications. When the end-user requested relevant information resources, this request was relayed from our syndicate partner's web server to the PCEL web server. The relevant resources were retrieved from the PCEL MySQL database. This database is accessed using a server side scripting language (PHP), which enables the production of dynamic RSS feeds on the basis of Source Asserted Identifiers (CODEs) contained in UMLS. Retrieving resources using SNOMED-CT terms using syndication can be used to build a functioning application. The process from request to display of syndicated resources took less than one second. The results of the pilot illustrate that it is possible to exchange data between servers using RSS syndication. This method could be utilised dynamically to supply digital library resources to a clinical system with SNOMED-CT data used as the standard of reference.
Using XML to encode TMA DES metadata.
Lyttleton, Oliver; Wright, Alexander; Treanor, Darren; Lewis, Paul
2011-01-01
The Tissue Microarray Data Exchange Specification (TMA DES) is an XML specification for encoding TMA experiment data. While TMA DES data is encoded in XML, the files that describe its syntax, structure, and semantics are not. The DTD format is used to describe the syntax and structure of TMA DES, and the ISO 11179 format is used to define the semantics of TMA DES. However, XML Schema can be used in place of DTDs, and another XML encoded format, RDF, can be used in place of ISO 11179. Encoding all TMA DES data and metadata in XML would simplify the development and usage of programs which validate and parse TMA DES data. XML Schema has advantages over DTDs such as support for data types, and a more powerful means of specifying constraints on data values. An advantage of RDF encoded in XML over ISO 11179 is that XML defines rules for encoding data, whereas ISO 11179 does not. We created an XML Schema version of the TMA DES DTD. We wrote a program that converted ISO 11179 definitions to RDF encoded in XML, and used it to convert the TMA DES ISO 11179 definitions to RDF. We validated a sample TMA DES XML file that was supplied with the publication that originally specified TMA DES using our XML Schema. We successfully validated the RDF produced by our ISO 11179 converter with the W3C RDF validation service. All TMA DES data could be encoded using XML, which simplifies its processing. XML Schema allows datatypes and valid value ranges to be specified for CDEs, which enables a wider range of error checking to be performed using XML Schemas than could be performed using DTDs.
Using XML to encode TMA DES metadata
Lyttleton, Oliver; Wright, Alexander; Treanor, Darren; Lewis, Paul
2011-01-01
Background: The Tissue Microarray Data Exchange Specification (TMA DES) is an XML specification for encoding TMA experiment data. While TMA DES data is encoded in XML, the files that describe its syntax, structure, and semantics are not. The DTD format is used to describe the syntax and structure of TMA DES, and the ISO 11179 format is used to define the semantics of TMA DES. However, XML Schema can be used in place of DTDs, and another XML encoded format, RDF, can be used in place of ISO 11179. Encoding all TMA DES data and metadata in XML would simplify the development and usage of programs which validate and parse TMA DES data. XML Schema has advantages over DTDs such as support for data types, and a more powerful means of specifying constraints on data values. An advantage of RDF encoded in XML over ISO 11179 is that XML defines rules for encoding data, whereas ISO 11179 does not. Materials and Methods: We created an XML Schema version of the TMA DES DTD. We wrote a program that converted ISO 11179 definitions to RDF encoded in XML, and used it to convert the TMA DES ISO 11179 definitions to RDF. Results: We validated a sample TMA DES XML file that was supplied with the publication that originally specified TMA DES using our XML Schema. We successfully validated the RDF produced by our ISO 11179 converter with the W3C RDF validation service. Conclusions: All TMA DES data could be encoded using XML, which simplifies its processing. XML Schema allows datatypes and valid value ranges to be specified for CDEs, which enables a wider range of error checking to be performed using XML Schemas than could be performed using DTDs. PMID:21969921
Mercury- Distributed Metadata Management, Data Discovery and Access System
NASA Astrophysics Data System (ADS)
Palanisamy, Giri; Wilson, Bruce E.; Devarakonda, Ranjeet; Green, James M.
2007-12-01
Mercury is a federated metadata harvesting, search and retrieval tool based on both open source and ORNL- developed software. It was originally developed for NASA, and the Mercury development consortium now includes funding from NASA, USGS, and DOE. Mercury supports various metadata standards including XML, Z39.50, FGDC, Dublin-Core, Darwin-Core, EML, and ISO-19115 (under development). Mercury provides a single portal to information contained in disparate data management systems. It collects metadata and key data from contributing project servers distributed around the world and builds a centralized index. The Mercury search interfaces then allow the users to perform simple, fielded, spatial and temporal searches across these metadata sources. This centralized repository of metadata with distributed data sources provides extremely fast search results to the user, while allowing data providers to advertise the availability of their data and maintain complete control and ownership of that data. Mercury supports various projects including: ORNL DAAC, NBII, DADDI, LBA, NARSTO, CDIAC, OCEAN, I3N, IAI, ESIP and ARM. The new Mercury system is based on a Service Oriented Architecture and supports various services such as Thesaurus Service, Gazetteer Web Service and UDDI Directory Services. This system also provides various search services including: RSS, Geo-RSS, OpenSearch, Web Services and Portlets. Other features include: Filtering and dynamic sorting of search results, book-markable search results, save, retrieve, and modify search criteria.
Non-invasive light-weight integration engine for building EHR from autonomous distributed systems.
Crespo Molina, Pere; Angulo Fernández, Carlos; Maldonado Segura, José A; Moner Cano, David; Robles Viejo, Montserrat
2006-01-01
Pangea-LE is a message oriented light-weight integration engine, allowing concurrent access to clinical information from disperse and heterogeneous data sources. The engine extracts the information and serves it to the requester client applications in a flexible XML format. This XML response message can be formatted on demand by the appropriate XSL (Extensible Stylesheet Language) transformation in order to fit client application needs. In this article we present a real use case sample where Pangea-LE collects and generates "on the fly" a structured view of all the patient clinical information available in a healthcare organisation. This information is presented to healthcare professionals in an EHR (Electronic Health Record) viewer Web application with patient search and EHR browsing capabilities. Implantation in a real environment has been a notable success due to the non-invasive method which extremely respects the existing information systems.
XWeB: The XML Warehouse Benchmark
NASA Astrophysics Data System (ADS)
Mahboubi, Hadj; Darmont, Jérôme
With the emergence of XML as a standard for representing business data, new decision support applications are being developed. These XML data warehouses aim at supporting On-Line Analytical Processing (OLAP) operations that manipulate irregular XML data. To ensure feasibility of these new tools, important performance issues must be addressed. Performance is customarily assessed with the help of benchmarks. However, decision support benchmarks do not currently support XML features. In this paper, we introduce the XML Warehouse Benchmark (XWeB), which aims at filling this gap. XWeB derives from the relational decision support benchmark TPC-H. It is mainly composed of a test data warehouse that is based on a unified reference model for XML warehouses and that features XML-specific structures, and its associate XQuery decision support workload. XWeB's usage is illustrated by experiments on several XML database management systems.
An XML-based interchange format for genotype-phenotype data.
Whirl-Carrillo, M; Woon, M; Thorn, C F; Klein, T E; Altman, R B
2008-02-01
Recent advances in high-throughput genotyping and phenotyping have accelerated the creation of pharmacogenomic data. Consequently, the community requires standard formats to exchange large amounts of diverse information. To facilitate the transfer of pharmacogenomics data between databases and analysis packages, we have created a standard XML (eXtensible Markup Language) schema that describes both genotype and phenotype data as well as associated metadata. The schema accommodates information regarding genes, drugs, diseases, experimental methods, genomic/RNA/protein sequences, subjects, subject groups, and literature. The Pharmacogenetics and Pharmacogenomics Knowledge Base (PharmGKB; www.pharmgkb.org) has used this XML schema for more than 5 years to accept and process submissions containing more than 1,814,139 SNPs on 20,797 subjects using 8,975 assays. Although developed in the context of pharmacogenomics, the schema is of general utility for exchange of genotype and phenotype data. We have written syntactic and semantic validators to check documents using this format. The schema and code for validation is available to the community at http://www.pharmgkb.org/schema/index.html (last accessed: 8 October 2007). (c) 2007 Wiley-Liss, Inc.
Software Development Of XML Parser Based On Algebraic Tools
NASA Astrophysics Data System (ADS)
Georgiev, Bozhidar; Georgieva, Adriana
2011-12-01
In this paper, is presented one software development and implementation of an algebraic method for XML data processing, which accelerates XML parsing process. Therefore, the proposed in this article nontraditional approach for fast XML navigation with algebraic tools contributes to advanced efforts in the making of an easier user-friendly API for XML transformations. Here the proposed software for XML documents processing (parser) is easy to use and can manage files with strictly defined data structure. The purpose of the presented algorithm is to offer a new approach for search and restructuring hierarchical XML data. This approach permits fast XML documents processing, using algebraic model developed in details in previous works of the same authors. So proposed parsing mechanism is easy accessible to the web consumer who is able to control XML file processing, to search different elements (tags) in it, to delete and to add a new XML content as well. The presented various tests show higher rapidity and low consumption of resources in comparison with some existing commercial parsers.
Clustering XML Documents Using Frequent Subtrees
NASA Astrophysics Data System (ADS)
Kutty, Sangeetha; Tran, Tien; Nayak, Richi; Li, Yuefeng
This paper presents an experimental study conducted over the INEX 2008 Document Mining Challenge corpus using both the structure and the content of XML documents for clustering them. The concise common substructures known as the closed frequent subtrees are generated using the structural information of the XML documents. The closed frequent subtrees are then used to extract the constrained content from the documents. A matrix containing the term distribution of the documents in the dataset is developed using the extracted constrained content. The k-way clustering algorithm is applied to the matrix to obtain the required clusters. In spite of the large number of documents in the INEX 2008 Wikipedia dataset, the proposed frequent subtree-based clustering approach was successful in clustering the documents. This approach significantly reduces the dimensionality of the terms used for clustering without much loss in accuracy.
Architecture of portable electronic medical records system integrated with streaming media.
Chen, Wei; Shih, Chien-Chou
2012-02-01
Due to increasing occurrence of accidents and illness during business trips, travel, or overseas studies, the requirement for portable EMR (Electronic Medical Records) has increased. This study proposes integrating streaming media technology into the EMR system to facilitate referrals, contracted laboratories, and disease notification among hospitals. The current study encoded static and dynamic medical images of patients into a streaming video format and stored them in a Flash Media Server (FMS). Based on the Taiwan Electronic Medical Record Template (TMT) standard, EMR records can be converted into XML documents and used to integrate description fields with embedded streaming videos. This investigation implemented a web-based portable EMR interchanging system using streaming media techniques to expedite exchanging medical image information among hospitals. The proposed architecture of the portable EMR retrieval system not only provides local hospital users the ability to acquire EMR text files from a previous hospital, but also helps access static and dynamic medical images as reference for clinical diagnosis and treatment. The proposed method protects property rights of medical images through information security mechanisms of the Medical Record Interchange Service Center and Health Certificate Authorization to facilitate proper, efficient, and continuous treatment of patients.
Demonstrating NaradaBrokering as a Middleware Fabric for Grid-based Remote Visualization Services
NASA Astrophysics Data System (ADS)
Pallickara, S.; Erlebacher, G.; Yuen, D.; Fox, G.; Pierce, M.
2003-12-01
Remote Visualization Services (RVS) have tended to rely on approaches based on the client server paradigm. Here we demonstrate our approach - based on a distributed brokering infrastructure, NaradaBrokering [1] - that relies on distributed, asynchronous and loosely coupled interactions to meet the requirements and constraints of RVS. In our approach to RVS, services advertise their capabilities to the broker network that manages these service advertisements. Among the services considered within our system are those that perform graphic transformations, mediate access to specialized datasets and finally those that manage the execution of specified tasks. There could be multiple instances of each of these services and the system ensures that load for a given service is distributed efficiently over these service instances. We will demonstrate implementation of concepts that we outlined in the oral presentation. This would involve two or more visualization servers interacting asynchronously with multiple clients through NaradaBrokering. The communicating entities may exchange SOAP [2] (Simple Object Access Protocol) messages. SOAP is a lightweight protocol for exchange of information in a decentralized, distributed environment. It is an XML based protocol that consists of three parts: an envelope that describes what is in a message and how to process it, rules for expressing instances of application-defined data types, and a convention for representing remote invocation related operations. Furthermore, we will also demonstrate how clients can retrieve their results after prolonged disconnects or after any failures that might have taken place. The entities, services and clients alike, are not limited by the geographical distances that separate them. We are planning to test this system in the context of trans-Atlantic links separating interacting entities. {[1]} The NaradaBrokering Project: http://www.naradabrokering.org {[2]} Newcomer, E., 2002, Understanding web services: XML, WSDL, SOAP, and UDDI, Addison Wesley Professional.
77 FR 22707 - Electronic Reporting Under the Toxic Substances Control Act
Federal Register 2010, 2011, 2012, 2013, 2014
2012-04-17
... completes metadata information, the web-based tool validates the submission by performing a basic error... uploading PDF attachments or other file types, such as XML, and completing metadata information would be...
Harmonised information exchange between decentralised food composition database systems.
Pakkala, H; Christensen, T; de Victoria, I Martínez; Presser, K; Kadvan, A
2010-11-01
The main aim of the European Food Information Resource (EuroFIR) project is to develop and disseminate a comprehensive, coherent and validated data bank for the distribution of food composition data (FCD). This can only be accomplished by harmonising food description and data documentation and by the use of standardised thesauri. The data bank is implemented through a network of local FCD storages (usually national) under the control and responsibility of the local (national) EuroFIR partner. The implementation of the system based on the EuroFIR specifications is under development. The data interchange happens through the EuroFIR Web Services interface, allowing the partners to implement their system using methods and software suitable for the local computer environment. The implementation uses common international standards, such as Simple Object Access Protocol, Web Service Description Language and Extensible Markup Language (XML). A specifically constructed EuroFIR search facility (eSearch) was designed for end users. The EuroFIR eSearch facility compiles queries using a specifically designed Food Data Query Language and sends a request to those network nodes linked to the EuroFIR Web Services that will most likely have the requested information. The retrieved FCD are compiled into a specifically designed data interchange format (the EuroFIR Food Data Transport Package) in XML, which is sent back to the EuroFIR eSearch facility as the query response. The same request-response operation happens in all the nodes that have been selected in the EuroFIR eSearch facility for a certain task. Finally, the FCD are combined by the EuroFIR eSearch facility and delivered to the food compiler. The implementation of FCD interchange using decentralised computer systems instead of traditional data-centre models has several advantages. First of all, the local partners have more control over their FCD, which will increase commitment and improve quality. Second, a multicentred solution is more economically viable than the creation of a centralised data bank, because of the lack of national political support for multinational systems.
An Introduction to the Extensible Markup Language (XML).
ERIC Educational Resources Information Center
Bryan, Martin
1998-01-01
Describes Extensible Markup Language (XML), a subset of the Standard Generalized Markup Language (SGML) that is designed to make it easy to interchange structured documents over the Internet. Topics include Document Type Definition (DTD), components of XML, the use of XML, text and non-text elements, and uses for XML-coded files. (LRW)
XML Reconstruction View Selection in XML Databases: Complexity Analysis and Approximation Scheme
NASA Astrophysics Data System (ADS)
Chebotko, Artem; Fu, Bin
Query evaluation in an XML database requires reconstructing XML subtrees rooted at nodes found by an XML query. Since XML subtree reconstruction can be expensive, one approach to improve query response time is to use reconstruction views - materialized XML subtrees of an XML document, whose nodes are frequently accessed by XML queries. For this approach to be efficient, the principal requirement is a framework for view selection. In this work, we are the first to formalize and study the problem of XML reconstruction view selection. The input is a tree T, in which every node i has a size c i and profit p i , and the size limitation C. The target is to find a subset of subtrees rooted at nodes i 1, ⋯ , i k respectively such that c_{i_1}+\\cdots +c_{i_k}le C, and p_{i_1}+\\cdots +p_{i_k} is maximal. Furthermore, there is no overlap between any two subtrees selected in the solution. We prove that this problem is NP-hard and present a fully polynomial-time approximation scheme (FPTAS) as a solution.
NASA Astrophysics Data System (ADS)
Boldrini, Enrico; Schaap, Dick M. A.; Nativi, Stefano
2013-04-01
SeaDataNet implements a distributed pan-European infrastructure for Ocean and Marine Data Management whose nodes are maintained by 40 national oceanographic and marine data centers from 35 countries riparian to all European seas. A unique portal makes possible distributed discovery, visualization and access of the available sea data across all the member nodes. Geographic metadata play an important role in such an infrastructure, enabling an efficient documentation and discovery of the resources of interest. In particular: - Common Data Index (CDI) metadata describe the sea datasets, including identification information (e.g. product title, interested area), evaluation information (e.g. data resolution, constraints) and distribution information (e.g. download endpoint, download protocol); - Cruise Summary Reports (CSR) metadata describe cruises and field experiments at sea, including identification information (e.g. cruise title, name of the ship), acquisition information (e.g. utilized instruments, number of samples taken) In the context of the second phase of SeaDataNet (SeaDataNet 2 EU FP7 project, grant agreement 283607, started on October 1st, 2011 for a duration of 4 years) a major target is the setting, adoption and promotion of common international standards, to the benefit of outreach and interoperability with the international initiatives and communities (e.g. OGC, INSPIRE, GEOSS, …). A standardization effort conducted by CNR with the support of MARIS, IFREMER, STFC, BODC and ENEA has led to the creation of a ISO 19115 metadata profile of CDI and its XML encoding based on ISO 19139. The CDI profile is now in its stable version and it's being implemented and adopted by the SeaDataNet community tools and software. The effort has then continued to produce an ISO based metadata model and its XML encoding also for CSR. The metadata elements included in the CSR profile belong to different models: - ISO 19115: E.g. cruise identification information, including title and area of interest; metadata responsible party information - ISO 19115-2: E.g. acquisition information, including date of sampling, instruments used - SeaDataNet: E.g. SeaDataNet community specific, including EDMO and EDMERP code lists Two main guidelines have been followed in the metadata model drafting: - All the obligations and constraints required by both the ISO standards and INSPIRE directive had to be satisfied. These include the presence of specific elements with given cardinality (e.g. mandatory metadata date stamp, mandatory lineage information) - All the content information of legacy CSR format had to be supported by the new metadata model. An XML encoding of the CSR profile has been defined as well. Based on the ISO 19139 XML schema and constraints, it adds the new elements specific of the SeaDataNet community. The associated Schematron rules are used to enforce constraints not enforceable just with the Schema and to validate elements content against the SeaDataNet code lists vocabularies.
Space Communications Emulation Facility
NASA Technical Reports Server (NTRS)
Hill, Chante A.
2004-01-01
Establishing space communication between ground facilities and other satellites is a painstaking task that requires many precise calculations dealing with relay time, atmospheric conditions, and satellite positions, to name a few. The Space Communications Emulation Facility (SCEF) team here at NASA is developing a facility that will approximately emulate the conditions in space that impact space communication. The emulation facility is comprised of a 32 node distributed cluster of computers; each node representing a satellite or ground station. The objective of the satellites is to observe the topography of the Earth (water, vegetation, land, and ice) and relay this information back to the ground stations. Software originally designed by the University of Kansas, labeled the Emulation Manager, controls the interaction of the satellites and ground stations, as well as handling the recording of data. The Emulation Manager is installed on a Linux Operating System, employing both Java and C++ programming codes. The emulation scenarios are written in extensible Markup Language, XML. XML documents are designed to store, carry, and exchange data. With XML documents data can be exchanged between incompatible systems, which makes it ideal for this project because Linux, MAC and Windows Operating Systems are all used. Unfortunately, XML documents cannot display data like HTML documents. Therefore, the SCEF team uses XML Schema Definition (XSD) or just schema to describe the structure of an XML document. Schemas are very important because they have the capability to validate the correctness of data, define restrictions on data, define data formats, and convert data between different data types, among other things. At this time, in order for the Emulation Manager to open and run an XML emulation scenario file, the user must first establish a link between the schema file and the directory under which the XML scenario files are saved. This procedure takes place on the command line on the Linux Operating System. Once this link has been established the Emulation manager validates all the XML files in that directory against the schema file, before the actual scenario is run. Using some very sophisticated commercial software called the Satellite Tool Kit (STK) installed on the Linux box, the Emulation Manager is able to display the data and graphics generated by the execution of a XML emulation scenario file. The Emulation Manager software is written in JAVA programming code. Since the SCEF project is in the developmental stage, the source code for this type of software is being modified to better fit the requirements of the SCEF project. Some parameters for the emulation are hard coded, set at fixed values. Members of the SCEF team are altering the code to allow the user to choose the values of these hard coded parameters by inserting a toolbar onto the preexisting GUI.
Multimedia platform for authoring and presentation of clinical rounds in cardiology
NASA Astrophysics Data System (ADS)
Ratib, Osman M.; Allada, Vivekanand; Dahlbom, Magdalena; Lapstra, Lorelle
2003-05-01
We developed a multimedia presentation platform that allows retrieving data from any digital and analog modalities and to prepare a script of a clinical presentation in an XML format. This system was designed for cardiac multi-disciplinary conferences involving different cardiology specialists as well as cardiovascular surgeons. A typical presentation requires preparation of summary reports of data obtained from the different investigations and imaging techniques. An XML-based scripting methodology was developed to allow for preparation of clinical presentations. The image display program uses the generated script for the sequential presentation of different images that are displayed on pre-determined presentation settings. The ability to prepare and present clinical conferences electronically is more efficient and less time consuming than conventional settings using analog and digital documents, films and videotapes. The script of a given presentation can further be saved as part of the patient record for subsequent review of the documents and images that supported a given medical or therapeutic decision. This also constitutes a perfect documentation method for surgeons and physicians responsible of therapeutic procedures that were decided upon during the clinical conference. It allows them to review the relevant data that supported a given therapeutic decision.
Activate/Inhibit KGCS Gateway via Master Console EIC Pad-B Display
NASA Technical Reports Server (NTRS)
Ferreira, Pedro Henrique
2014-01-01
My internship consisted of two major projects for the Launch Control System.The purpose of the first project was to implement the Application Control Language (ACL) to Activate Data Acquisition (ADA) and to Inhibit Data Acquisition (IDA) the Kennedy Ground Control Sub-Systems (KGCS) Gateway, to update existing Pad-B End Item Control (EIC) Display to program the ADA and IDA buttons with new ACL, and to test and release the ACL Display.The second project consisted of unit testing all of the Application Services Framework (ASF) by March 21st. The XmlFileReader was unit tested and reached 100 coverage. The XmlFileReader class is used to grab information from XML files and use them to initialize elements in the other framework elements by using the Xerces C++ XML Parser; which is open source commercial off the shelf software. The ScriptThread was also tested. ScriptThread manages the creation and activation of script threads. A large amount of the time was used in initializing the environment and learning how to set up unit tests and getting familiar with the specific segments of the project that were assigned to us.
2005-09-01
aid this thesis would not have come to existence. First, we would like to thank Eric Chaum for his enthusiasm and recommendation for doing this...by David Hina [HINA 00], discusses the use of available Commercial-Off-The-Shelf (COTS) XML technology to provide for the exchange of data between...air and maritime components and therefore forms the basis of a joint command and control data model [ CHAUM 04]. The data model is one of the two
Karvounis, E C; Tsakanikas, V D; Fotiou, E; Fotiadis, D I
2010-01-01
The paper proposes a novel Extensible Markup Language (XML) based format called ART-ML that aims at supporting the interoperability and the reuse of models of blood flow, mass transport and plaque formation, exported by ARTool. ARTool is a platform for the automatic processing of various image modalities of coronary and carotid arteries. The images and their content are fused to develop morphological models of the arteries in easy to handle 3D representations. The platform incorporates efficient algorithms which are able to perform blood flow simulation. In addition atherosclerotic plaque development is estimated taking into account morphological, flow and genetic factors. ART-ML provides a XML format that enables the representation and management of embedded models within the ARTool platform and the storage and interchange of well-defined information. This approach influences in the model creation, model exchange, model reuse and result evaluation.
XML Based Scientific Data Management Facility
NASA Technical Reports Server (NTRS)
Mehrotra, Piyush; Zubair, M.; Ziebartt, John (Technical Monitor)
2001-01-01
The World Wide Web consortium has developed an Extensible Markup Language (XML) to support the building of better information management infrastructures. The scientific computing community realizing the benefits of HTML has designed markup languages for scientific data. In this paper, we propose a XML based scientific data management facility, XDMF. The project is motivated by the fact that even though a lot of scientific data is being generated, it is not being shared because of lack of standards and infrastructure support for discovering and transforming the data. The proposed data management facility can be used to discover the scientific data itself, the transformation functions, and also for applying the required transformations. We have built a prototype system of the proposed data management facility that can work on different platforms. We have implemented the system using Java, and Apache XSLT engine Xalan. To support remote data and transformation functions, we had to extend the XSLT specification and the Xalan package.
Bottom-Up Evaluation of Twig Join Pattern Queries in XML Document Databases
NASA Astrophysics Data System (ADS)
Chen, Yangjun
Since the extensible markup language XML emerged as a new standard for information representation and exchange on the Internet, the problem of storing, indexing, and querying XML documents has been among the major issues of database research. In this paper, we study the twig pattern matching and discuss a new algorithm for processing ordered twig pattern queries. The time complexity of the algorithmis bounded by O(|D|·|Q| + |T|·leaf Q ) and its space overhead is by O(leaf T ·leaf Q ), where T stands for a document tree, Q for a twig pattern and D is a largest data stream associated with a node q of Q, which contains the database nodes that match the node predicate at q. leaf T (leaf Q ) represents the number of the leaf nodes of T (resp. Q). In addition, the algorithm can be adapted to an indexing environment with XB-trees being used.
The XSD-Builder Specification Language—Toward a Semantic View of XML Schema Definition
NASA Astrophysics Data System (ADS)
Fong, Joseph; Cheung, San Kuen
In the present database market, XML database model is a main structure for the forthcoming database system in the Internet environment. As a conceptual schema of XML database, XML Model has its limitation on presenting its data semantics. System analyst has no toolset for modeling and analyzing XML system. We apply XML Tree Model (shown in Figure 2) as a conceptual schema of XML database to model and analyze the structure of an XML database. It is important not only for visualizing, specifying, and documenting structural models, but also for constructing executable systems. The tree model represents inter-relationship among elements inside different logical schema such as XML Schema Definition (XSD), DTD, Schematron, XDR, SOX, and DSD (shown in Figure 1, an explanation of the terms in the figure are shown in Table 1). The XSD-Builder consists of XML Tree Model, source language, translator, and XSD. The source language is called XSD-Source which is mainly for providing an environment with concept of user friendliness while writing an XSD. The source language will consequently be translated by XSD-Translator. Output of XSD-Translator is an XSD which is our target and is called as an object language.
The PDS-based Data Processing, Archiving and Management Procedures in Chang'e Mission
NASA Astrophysics Data System (ADS)
Zhang, Z. B.; Li, C.; Zhang, H.; Zhang, P.; Chen, W.
2017-12-01
PDS is adopted as standard format of scientific data and foundation of all data-related procedures in Chang'e mission. Unlike the geographically distributed nature of the planetary data system, all procedures of data processing, archiving, management and distribution are proceeded in the headquarter of Ground Research and Application System of Chang'e mission in a centralized manner. The RAW data acquired by the ground stations is transmitted to and processed by data preprocessing subsystem (DPS) for the production of PDS-compliant Level 0 Level 2 data products using established algorithms, with each product file being well described using an attached label, then all products with the same orbit number are put together into a scheduled task for archiving along with a XML archive list file recoding all product files' properties such as file name, file size etc. After receiving the archive request from DPS, data management subsystem (DMS) is provoked to parse the XML list file to validate all the claimed files and their compliance to PDS using a prebuilt data dictionary, then to exact metadata of each data product file from its PDS label and the fields of its normalized filename. Various requirements of data management, retrieving, distribution and application can be well met using the flexible combination of the rich metadata empowered by the PDS. In the forthcoming CE-5 mission, all the design of data structure and procedures will be updated from PDS version 3 used in previous CE-1, CE-2 and CE-3 missions to the new version 4, the main changes would be: 1) a dedicated detached XML label will be used to describe the corresponding scientific data acquired by the 4 instruments carried, the XML parsing framework used in archive list validation will be reused for the label after some necessary adjustments; 2) all the image data acquired by the panorama camera, landing camera and lunar mineralogical spectrometer should use an Array_2D_Image/Array_3D_Image object to store image data, and use a Table_Character object to store image frame header; the tabulated data acquired by the lunar regolith penetrating radar should use a Table_Binary object to store measurements.
An XML-Based Manipulation and Query Language for Rule-Based Information
NASA Astrophysics Data System (ADS)
Mansour, Essam; Höpfner, Hagen
Rules are utilized to assist in the monitoring process that is required in activities, such as disease management and customer relationship management. These rules are specified according to the application best practices. Most of research efforts emphasize on the specification and execution of these rules. Few research efforts focus on managing these rules as one object that has a management life-cycle. This paper presents our manipulation and query language that is developed to facilitate the maintenance of this object during its life-cycle and to query the information contained in this object. This language is based on an XML-based model. Furthermore, we evaluate the model and language using a prototype system applied to a clinical case study.
NASA Astrophysics Data System (ADS)
Bikakis, Nikos; Gioldasis, Nektarios; Tsinaraki, Chrisa; Christodoulakis, Stavros
SPARQL is today the standard access language for Semantic Web data. In the recent years XML databases have also acquired industrial importance due to the widespread applicability of XML in the Web. In this paper we present a framework that bridges the heterogeneity gap and creates an interoperable environment where SPARQL queries are used to access XML databases. Our approach assumes that fairly generic mappings between ontology constructs and XML Schema constructs have been automatically derived or manually specified. The mappings are used to automatically translate SPARQL queries to semantically equivalent XQuery queries which are used to access the XML databases. We present the algorithms and the implementation of SPARQL2XQuery framework, which is used for answering SPARQL queries over XML databases.
Pathology data integration with eXtensible Markup Language.
Berman, Jules J
2005-02-01
It is impossible to overstate the importance of XML (eXtensible Markup Language) as a data organization tool. With XML, pathologists can annotate all of their data (clinical and anatomic) in a format that can transform every pathology report into a database, without compromising narrative structure. The purpose of this manuscript is to provide an overview of XML for pathologists. Examples will demonstrate how pathologists can use XML to annotate individual data elements and to structure reports in a common format that can be merged with other XML files or queried using standard XML tools. This manuscript gives pathologists a glimpse into how XML allows pathology data to be linked to other types of biomedical data and reduces our dependence on centralized proprietary databases.
XML technology planning database : lessons learned
NASA Technical Reports Server (NTRS)
Some, Raphael R.; Neff, Jon M.
2005-01-01
A hierarchical Extensible Markup Language(XML) database called XCALIBR (XML Analysis LIBRary) has been developed by Millennium Program to assist in technology investment (ROI) analysis and technology Language Capability the New return on portfolio optimization. The database contains mission requirements and technology capabilities, which are related by use of an XML dictionary. The XML dictionary codifies a standardized taxonomy for space missions, systems, subsystems and technologies. In addition to being used for ROI analysis, the database is being examined for use in project planning, tracking and documentation. During the past year, the database has moved from development into alpha testing. This paper describes the lessons learned during construction and testing of the prototype database and the motivation for moving from an XML taxonomy to a standard XML-based ontology.
RUBE: an XML-based architecture for 3D process modeling and model fusion
NASA Astrophysics Data System (ADS)
Fishwick, Paul A.
2002-07-01
Information fusion is a critical problem for science and engineering. There is a need to fuse information content specified as either data or model. We frame our work in terms of fusing dynamic and geometric models, to create an immersive environment where these models can be juxtaposed in 3D, within the same interface. The method by which this is accomplished fits well into other eXtensible Markup Language (XML) approaches to fusion in general. The task of modeling lies at the heart of the human-computer interface, joining the human to the system under study through a variety of sensory modalities. I overview modeling as a key concern for the Defense Department and the Air Force, and then follow with a discussion of past, current, and future work. Past work began with a package with C and has progressed, in current work, to an implementation in XML. Our current work is defined within the RUBE architecture, which is detailed in subsequent papers devoted to key components. We have built RUBE as a next generation modeling framework using our prior software, with research opportunities in immersive 3D and tangible user interfaces.
A Simple XML Producer-Consumer Protocol
NASA Technical Reports Server (NTRS)
Smith, Warren; Gunter, Dan; Quesnel, Darcy; Biegel, Bryan (Technical Monitor)
2001-01-01
There are many different projects from government, academia, and industry that provide services for delivering events in distributed environments. The problem with these event services is that they are not general enough to support all uses and they speak different protocols so that they cannot interoperate. We require such interoperability when we, for example, wish to analyze the performance of an application in a distributed environment. Such an analysis might require performance information from the application, computer systems, networks, and scientific instruments. In this work we propose and evaluate a standard XML-based protocol for the transmission of events in distributed systems. One recent trend in government and academic research is the development and deployment of computational grids. Computational grids are large-scale distributed systems that typically consist of high-performance compute, storage, and networking resources. Examples of such computational grids are the DOE Science Grid, the NASA Information Power Grid (IPG), and the NSF Partnerships for Advanced Computing Infrastructure (PACIs). The major effort to deploy these grids is in the area of developing the software services to allow users to execute applications on these large and diverse sets of resources. These services include security, execution of remote applications, managing remote data, access to information about resources and services, and so on. There are several toolkits for providing these services such as Globus, Legion, and Condor. As part of these efforts to develop computational grids, the Global Grid Forum is working to standardize the protocols and APIs used by various grid services. This standardization will allow interoperability between the client and server software of the toolkits that are providing the grid services. The goal of the Performance Working Group of the Grid Forum is to standardize protocols and representations related to the storage and distribution of performance data. These standard protocols and representations must support tasks such as profiling parallel applications, monitoring the status of computers and networks, and monitoring the performance of services provided by a computational grid. This paper describes a proposed protocol and data representation for the exchange of events in a distributed system. The protocol exchanges messages formatted in XML and it can be layered atop any low-level communication protocol such as TCP or UDP Further, we describe Java and C++ implementations of this protocol and discuss their performance. The next section will provide some further background information. Section 3 describes the main communication patterns of our protocol. Section 4 describes how we represent events and related information using XML. Section 5 describes our protocol and Section 6 discusses the performance of two implementations of the protocol. Finally, an appendix provides the XML Schema definition of our protocol and event information.
... this page, please enable JavaScript. MedlinePlus produces XML data sets that you are welcome to download and use. If you have questions about the MedlinePlus XML files, please contact us . For additional sources of MedlinePlus data in XML format, visit our Web service page, ...
Integration of DICOM and openEHR standards
NASA Astrophysics Data System (ADS)
Wang, Ying; Yao, Zhihong; Liu, Lei
2011-03-01
The standard format for medical imaging storage and transmission is DICOM. openEHR is an open standard specification in health informatics that describes the management and storage, retrieval and exchange of health data in electronic health records. Considering that the integration of DICOM and openEHR is beneficial to information sharing, on the basis of XML-based DICOM format, we developed a method of creating a DICOM Imaging Archetype in openEHR to enable the integration of DICOM and openEHR. Each DICOM file contains abundant imaging information. However, because reading a DICOM involves looking up the DICOM Data Dictionary, the readability of a DICOM file has been limited. openEHR has innovatively adopted two level modeling method, making clinical information divided into lower level, the information model, and upper level, archetypes and templates. But one critical challenge posed to the development of openEHR is the information sharing problem, especially in imaging information sharing. For example, some important imaging information cannot be displayed in an openEHR file. In this paper, to enhance the readability of a DICOM file and semantic interoperability of an openEHR file, we developed a method of mapping a DICOM file to an openEHR file by adopting the form of archetype defined in openEHR. Because an archetype has a tree structure, after mapping a DICOM file to an openEHR file, the converted information is structuralized in conformance with openEHR format. This method enables the integration of DICOM and openEHR and data exchange without losing imaging information between two standards.
XML and E-Journals: The State of Play.
ERIC Educational Resources Information Center
Wusteman, Judith
2003-01-01
Discusses the introduction of the use of XML (Extensible Markup Language) in publishing electronic journals. Topics include standards, including DTDs (Document Type Definition), or document type definitions; aggregator requirements; SGML (Standard Generalized Markup Language); benefits of XML for e-journals; XML metadata; the possibility of…
HepML, an XML-based format for describing simulated data in high energy physics
NASA Astrophysics Data System (ADS)
Belov, S.; Dudko, L.; Kekelidze, D.; Sherstnev, A.
2010-10-01
In this paper we describe a HepML format and a corresponding C++ library developed for keeping complete description of parton level events in a unified and flexible form. HepML tags contain enough information to understand what kind of physics the simulated events describe and how the events have been prepared. A HepML block can be included into event files in the LHEF format. The structure of the HepML block is described by means of several XML Schemas. The Schemas define necessary information for the HepML block and how this information should be located within the block. The library libhepml is a C++ library intended for parsing and serialization of HepML tags, and representing the HepML block in computer memory. The library is an API for external software. For example, Matrix Element Monte Carlo event generators can use the library for preparing and writing a header of an LHEF file in the form of HepML tags. In turn, Showering and Hadronization event generators can parse the HepML header and get the information in the form of C++ classes. libhepml can be used in C++, C, and Fortran programs. All necessary parts of HepML have been prepared and we present the project to the HEP community. Program summaryProgram title: libhepml Catalogue identifier: AEGL_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEGL_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: GNU GPLv3 No. of lines in distributed program, including test data, etc.: 138 866 No. of bytes in distributed program, including test data, etc.: 613 122 Distribution format: tar.gz Programming language: C++, C Computer: PCs and workstations Operating system: Scientific Linux CERN 4/5, Ubuntu 9.10 RAM: 1 073 741 824 bytes (1 Gb) Classification: 6.2, 11.1, 11.2 External routines: Xerces XML library ( http://xerces.apache.org/xerces-c/), Expat XML Parser ( http://expat.sourceforge.net/) Nature of problem: Monte Carlo simulation in high energy physics is divided into several stages. Various programs exist for these stages. In this article we are interested in interfacing different Monte Carlo event generators via data files, in particular, Matrix Element (ME) generators and Showering and Hadronization (SH) generators. There is a widely accepted format for data files for such interfaces - Les Houches Event Format (LHEF). Although information kept in an LHEF file is enough for proper working of SH generators, it is insufficient for understanding how events in the LHEF file have been prepared and which physical model has been applied. In this paper we propose an extension of the format for keeping additional information available in generators. We propose to add a new information block, marked up with XML tags, to the LHEF file. This block describes events in the file in more detail. In particular, it stores information about a physical model, kinematical cuts, generator, etc. This helps to make LHEF files self-documented. Certainly, HepML can be applied in more general context, not in LHEF files only. Solution method: In order to overcome drawbacks of the original LHEF accord we propose to add a new information block of HepML tags. HepML is an XML-based markup language. We designed several XML Schemas for all tags in the language. Any HepML document should follow rules of the Schemas. The language is equipped with a library for operation with HepML tags and documents. This C++ library, called libhepml, consists of classes for HepML objects, which represent a HepML document in computer memory, parsing classes, serializating classes, and some auxiliary classes. Restrictions: The software is adapted for solving problems, described in the article. There are no additional restrictions. Running time: Tests have been done on a computer with Intel(R) Core(TM)2 Solo, 1.4 GHz. Parsing of a HepML file: 6 ms (size of the HepML files is 12.5 Kb) Writing of a HepML block to file: 14 ms (file size 12.5 Kb) Merging of two HepML blocks and writing to file: 18 ms (file size - 25.0 Kb).
VSDMIP: virtual screening data management on an integrated platform
NASA Astrophysics Data System (ADS)
Gil-Redondo, Rubén; Estrada, Jorge; Morreale, Antonio; Herranz, Fernando; Sancho, Javier; Ortiz, Ángel R.
2009-03-01
A novel software (VSDMIP) for the virtual screening (VS) of chemical libraries integrated within a MySQL relational database is presented. Two main features make VSDMIP clearly distinguishable from other existing computational tools: (i) its database, which stores not only ligand information but also the results from every step in the VS process, and (ii) its modular and pluggable architecture, which allows customization of the VS stages (such as the programs used for conformer generation or docking), through the definition of a detailed workflow employing user-configurable XML files. VSDMIP, therefore, facilitates the storage and retrieval of VS results, easily adapts to the specific requirements of each method and tool used in the experiments, and allows the comparison of different VS methodologies. To validate the usefulness of VSDMIP as an automated tool for carrying out VS several experiments were run on six protein targets (acetylcholinesterase, cyclin-dependent kinase 2, coagulation factor Xa, estrogen receptor alpha, p38 MAP kinase, and neuraminidase) using nine binary (actives/inactive) test sets. The performance of several VS configurations was evaluated by means of enrichment factors and receiver operating characteristic plots.
Medina-Aunon, J. Alberto; Martínez-Bartolomé, Salvador; López-García, Miguel A.; Salazar, Emilio; Navajas, Rosana; Jones, Andrew R.; Paradela, Alberto; Albar, Juan P.
2011-01-01
The development of the HUPO-PSI's (Proteomics Standards Initiative) standard data formats and MIAPE (Minimum Information About a Proteomics Experiment) guidelines should improve proteomics data sharing within the scientific community. Proteomics journals have encouraged the use of these standards and guidelines to improve the quality of experimental reporting and ease the evaluation and publication of manuscripts. However, there is an evident lack of bioinformatics tools specifically designed to create and edit standard file formats and reports, or embed them within proteomics workflows. In this article, we describe a new web-based software suite (The ProteoRed MIAPE web toolkit) that performs several complementary roles related to proteomic data standards. First, it can verify that the reports fulfill the minimum information requirements of the corresponding MIAPE modules, highlighting inconsistencies or missing information. Second, the toolkit can convert several XML-based data standards directly into human readable MIAPE reports stored within the ProteoRed MIAPE repository. Finally, it can also perform the reverse operation, allowing users to export from MIAPE reports into XML files for computational processing, data sharing, or public database submission. The toolkit is thus the first application capable of automatically linking the PSI's MIAPE modules with the corresponding XML data exchange standards, enabling bidirectional conversions. This toolkit is freely available at http://www.proteored.org/MIAPE/. PMID:21983993
CliniProteus: A flexible clinical trials information management system
Mathura, Venkatarajan S; Rangareddy, Mahendiranath; Gupta, Pankaj; Mullan, Michael
2007-01-01
Clinical trials involve multi-site heterogeneous data generation with complex data input-formats and forms. The data should be captured and queried in an integrated fashion to facilitate further analysis. Electronic case-report forms (eCRF) are gaining popularity since it allows capture of clinical information in a rapid manner. We have designed and developed an XML based flexible clinical trials data management framework in .NET environment that can be used for efficient design and deployment of eCRFs to efficiently collate data and analyze information from multi-site clinical trials. The main components of our system include an XML form designer, a Patient registration eForm, reusable eForms, multiple-visit data capture and consolidated reports. A unique id is used for tracking the trial, site of occurrence, the patient and the year of recruitment. Availability http://www.rfdn.org/bioinfo/CTMS/ctms.html. PMID:21670796
Design and implementation of a health data interoperability mediator.
Kuo, Mu-Hsing; Kushniruk, Andre William; Borycki, Elizabeth Marie
2010-01-01
The objective of this study is to design and implement a common-gateway oriented mediator to solve the health data interoperability problems that exist among heterogeneous health information systems. The proposed mediator has three main components: (1) a Synonym Dictionary (SD) that stores a set of global metadata and terminologies to serve as the mapping intermediary, (2) a Semantic Mapping Engine (SME) that can be used to map metadata and instance semantics, and (3) a DB-to-XML module that translates source health data stored in a database into XML format and back. A routine admission notification data exchange scenario is used to test the efficiency and feasibility of the proposed mediator. The study results show that the proposed mediator can make health information exchange more efficient.
Web mapping system for complex processing and visualization of environmental geospatial datasets
NASA Astrophysics Data System (ADS)
Titov, Alexander; Gordov, Evgeny; Okladnikov, Igor
2016-04-01
Environmental geospatial datasets (meteorological observations, modeling and reanalysis results, etc.) are used in numerous research applications. Due to a number of objective reasons such as inherent heterogeneity of environmental datasets, big dataset volume, complexity of data models used, syntactic and semantic differences that complicate creation and use of unified terminology, the development of environmental geodata access, processing and visualization services as well as client applications turns out to be quite a sophisticated task. According to general INSPIRE requirements to data visualization geoportal web applications have to provide such standard functionality as data overview, image navigation, scrolling, scaling and graphical overlay, displaying map legends and corresponding metadata information. It should be noted that modern web mapping systems as integrated geoportal applications are developed based on the SOA and might be considered as complexes of interconnected software tools for working with geospatial data. In the report a complex web mapping system including GIS web client and corresponding OGC services for working with geospatial (NetCDF, PostGIS) dataset archive is presented. There are three basic tiers of the GIS web client in it: 1. Tier of geospatial metadata retrieved from central MySQL repository and represented in JSON format 2. Tier of JavaScript objects implementing methods handling: --- NetCDF metadata --- Task XML object for configuring user calculations, input and output formats --- OGC WMS/WFS cartographical services 3. Graphical user interface (GUI) tier representing JavaScript objects realizing web application business logic Metadata tier consists of a number of JSON objects containing technical information describing geospatial datasets (such as spatio-temporal resolution, meteorological parameters, valid processing methods, etc). The middleware tier of JavaScript objects implementing methods for handling geospatial metadata, task XML object, and WMS/WFS cartographical services interconnects metadata and GUI tiers. The methods include such procedures as JSON metadata downloading and update, launching and tracking of the calculation task running on the remote servers as well as working with WMS/WFS cartographical services including: obtaining the list of available layers, visualizing layers on the map, exporting layers in graphical (PNG, JPG, GeoTIFF), vector (KML, GML, Shape) and digital (NetCDF) formats. Graphical user interface tier is based on the bundle of JavaScript libraries (OpenLayers, GeoExt and ExtJS) and represents a set of software components implementing web mapping application business logic (complex menus, toolbars, wizards, event handlers, etc.). GUI provides two basic capabilities for the end user: configuring the task XML object functionality and cartographical information visualizing. The web interface developed is similar to the interface of such popular desktop GIS applications, as uDIG, QuantumGIS etc. Web mapping system developed has shown its effectiveness in the process of solving real climate change research problems and disseminating investigation results in cartographical form. The work is supported by SB RAS Basic Program Projects VIII.80.2.1 and IV.38.1.7.
Next-Generation Search Engines for Information Retrieval
DOE Office of Scientific and Technical Information (OSTI.GOV)
Devarakonda, Ranjeet; Hook, Leslie A; Palanisamy, Giri
In the recent years, there have been significant advancements in the areas of scientific data management and retrieval techniques, particularly in terms of standards and protocols for archiving data and metadata. Scientific data is rich, and spread across different places. In order to integrate these pieces together, a data archive and associated metadata should be generated. Data should be stored in a format that can be retrievable and more importantly it should be in a format that will continue to be accessible as technology changes, such as XML. While general-purpose search engines (such as Google or Bing) are useful formore » finding many things on the Internet, they are often of limited usefulness for locating Earth Science data relevant (for example) to a specific spatiotemporal extent. By contrast, tools that search repositories of structured metadata can locate relevant datasets with fairly high precision, but the search is limited to that particular repository. Federated searches (such as Z39.50) have been used, but can be slow and the comprehensiveness can be limited by downtime in any search partner. An alternative approach to improve comprehensiveness is for a repository to harvest metadata from other repositories, possibly with limits based on subject matter or access permissions. Searches through harvested metadata can be extremely responsive, and the search tool can be customized with semantic augmentation appropriate to the community of practice being served. One such system, Mercury, a metadata harvesting, data discovery, and access system, built for researchers to search to, share and obtain spatiotemporal data used across a range of climate and ecological sciences. Mercury is open-source toolset, backend built on Java and search capability is supported by the some popular open source search libraries such as SOLR and LUCENE. Mercury harvests the structured metadata and key data from several data providing servers around the world and builds a centralized index. The harvested files are indexed against SOLR search API consistently, so that it can render search capabilities such as simple, fielded, spatial and temporal searches across a span of projects ranging from land, atmosphere, and ocean ecology. Mercury also provides data sharing capabilities using Open Archive Initiatives Protocol for Metadata Handling (OAI-PMH). In this paper we will discuss about the best practices for archiving data and metadata, new searching techniques, efficient ways of data retrieval and information display.« less
NASA Astrophysics Data System (ADS)
Benedict, K. K.; Scott, S.
2013-12-01
While there has been a convergence towards a limited number of standards for representing knowledge (metadata) about geospatial (and other) data objects and collections, there exist a variety of community conventions around the specific use of those standards and within specific data discovery and access systems. This combination of limited (but multiple) standards and conventions creates a challenge for system developers that aspire to participate in multiple data infrastrucutres, each of which may use a different combination of standards and conventions. While Extensible Markup Language (XML) is a shared standard for encoding most metadata, traditional direct XML transformations (XSLT) from one standard to another often result in an imperfect transfer of information due to incomplete mapping from one standard's content model to another. This paper presents the work at the University of New Mexico's Earth Data Analysis Center (EDAC) in which a unified data and metadata management system has been developed in support of the storage, discovery and access of heterogeneous data products. This system, the Geographic Storage, Transformation and Retrieval Engine (GSTORE) platform has adopted a polyglot database model in which a combination of relational and document-based databases are used to store both data and metadata, with some metadata stored in a custom XML schema designed as a superset of the requirements for multiple target metadata standards: ISO 19115-2/19139/19110/19119, FGCD CSDGM (both with and without remote sensing extensions) and Dublin Core. Metadata stored within this schema is complemented by additional service, format and publisher information that is dynamically "injected" into produced metadata documents when they are requested from the system. While mapping from the underlying common metadata schema is relatively straightforward, the generation of valid metadata within each target standard is necessary but not sufficient for integration into multiple data infrastructures, as has been demonstrated through EDAC's testing and deployment of metadata into multiple external systems: Data.Gov, the GEOSS Registry, the DataONE network, the DSpace based institutional repository at UNM and semantic mediation systems developed as part of the NASA ACCESS ELSeWEB project. Each of these systems requires valid metadata as a first step, but to make most effective use of the delivered metadata each also has a set of conventions that are specific to the system. This presentation will provide an overview of the underlying metadata management model, the processes and web services that have been developed to automatically generate metadata in a variety of standard formats and highlight some of the specific modifications made to the output metadata content to support the different conventions used by the multiple metadata integration endpoints.
A Low-Storage-Consumption XML Labeling Method for Efficient Structural Information Extraction
NASA Astrophysics Data System (ADS)
Liang, Wenxin; Takahashi, Akihiro; Yokota, Haruo
Recently, labeling methods to extract and reconstruct the structural information of XML data, which are important for many applications such as XPath query and keyword search, are becoming more attractive. To achieve efficient structural information extraction, in this paper we propose C-DO-VLEI code, a novel update-friendly bit-vector encoding scheme, based on register-length bit operations combining with the properties of Dewey Order numbers, which cannot be implemented in other relevant existing schemes such as ORDPATH. Meanwhile, the proposed method also achieves lower storage consumption because it does not require either prefix schema or any reserved codes for node insertion. We performed experiments to evaluate and compare the performance and storage consumption of the proposed method with those of the ORDPATH method. Experimental results show that the execution times for extracting depth information and parent node labels using the C-DO-VLEI code are about 25% and 15% less, respectively, and the average label size using the C-DO-VLEI code is about 24% smaller, comparing with ORDPATH.
Setting the Standard: XML on Campus.
ERIC Educational Resources Information Center
Rawlins, Mike
2001-01-01
Explains what XML (Extensible Markup Language) is; where to find it in a few years (everywhere from Web pages, to database management systems, to common campus applications); issues that will make XML somewhat of an experimental strategy in the near term; and the importance of decision-makers being abreast of XML trends in standards, tools…
2003-01-01
Authenticat’n (XCBF) Authorizat’n (XACML) (SAML) Privacy (P3P) Digital Rights Management (XrML) Content Mngmnt (DASL) (WebDAV) Content Syndicat’n...Registry/ Repository BPSS eCommerce XML/EDI Universal Business Language (UBL) Internet & Computing Human Resources (HR-XML) Semantic KEY XML SPECIFICATIONS
The Proteins API: accessing key integrated protein and genome information
Antunes, Ricardo; Alpi, Emanuele; Gonzales, Leonardo; Liu, Wudong; Luo, Jie; Qi, Guoying; Turner, Edd
2017-01-01
Abstract The Proteins API provides searching and programmatic access to protein and associated genomics data such as curated protein sequence positional annotations from UniProtKB, as well as mapped variation and proteomics data from large scale data sources (LSS). Using the coordinates service, researchers are able to retrieve the genomic sequence coordinates for proteins in UniProtKB. This, the LSS genomics and proteomics data for UniProt proteins is programmatically only available through this service. A Swagger UI has been implemented to provide documentation, an interface for users, with little or no programming experience, to ‘talk’ to the services to quickly and easily formulate queries with the services and obtain dynamically generated source code for popular programming languages, such as Java, Perl, Python and Ruby. Search results are returned as standard JSON, XML or GFF data objects. The Proteins API is a scalable, reliable, fast, easy to use RESTful services that provides a broad protein information resource for users to ask questions based upon their field of expertise and allowing them to gain an integrated overview of protein annotations available to aid their knowledge gain on proteins in biological processes. The Proteins API is available at (http://www.ebi.ac.uk/proteins/api/doc). PMID:28383659
The Proteins API: accessing key integrated protein and genome information.
Nightingale, Andrew; Antunes, Ricardo; Alpi, Emanuele; Bursteinas, Borisas; Gonzales, Leonardo; Liu, Wudong; Luo, Jie; Qi, Guoying; Turner, Edd; Martin, Maria
2017-07-03
The Proteins API provides searching and programmatic access to protein and associated genomics data such as curated protein sequence positional annotations from UniProtKB, as well as mapped variation and proteomics data from large scale data sources (LSS). Using the coordinates service, researchers are able to retrieve the genomic sequence coordinates for proteins in UniProtKB. This, the LSS genomics and proteomics data for UniProt proteins is programmatically only available through this service. A Swagger UI has been implemented to provide documentation, an interface for users, with little or no programming experience, to 'talk' to the services to quickly and easily formulate queries with the services and obtain dynamically generated source code for popular programming languages, such as Java, Perl, Python and Ruby. Search results are returned as standard JSON, XML or GFF data objects. The Proteins API is a scalable, reliable, fast, easy to use RESTful services that provides a broad protein information resource for users to ask questions based upon their field of expertise and allowing them to gain an integrated overview of protein annotations available to aid their knowledge gain on proteins in biological processes. The Proteins API is available at (http://www.ebi.ac.uk/proteins/api/doc). © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
NCBI Bookshelf: books and documents in life sciences and health care
Hoeppner, Marilu A.
2013-01-01
Bookshelf (http://www.ncbi.nlm.nih.gov/books/) is a full-text electronic literature resource of books and documents in life sciences and health care at the National Center for Biotechnology Information (NCBI). Created in 1999 with a single book as an encyclopedic reference for resources such as PubMed and GenBank, it has grown to its current size of >1300 titles. Unlike other NCBI databases, such as GenBank and Gene, which have a strict data structure, books come in all forms; they are diverse in publication types, formats, sizes and authoring models. The Bookshelf data format is XML tagged in the NCBI Book DTD (Document Type Definition), modeled after the National Library of Medicine journal article DTDs. The book DTD has been used for systematically tagging the diverse data formats of books, a move that has set the foundation for the growth of this resource. Books at NCBI followed the route of journal articles in the PubMed Central project, using the PubMed Central architectural framework, workflows and processes. Through integration with other NCBI molecular databases, books at NCBI can be used to provide reference information for biological data and facilitate its discovery. This article describes Bookshelf at NCBI: its growth, data handling and retrieval and integration with molecular databases. PMID:23203889
NCBI Bookshelf: books and documents in life sciences and health care.
Hoeppner, Marilu A
2013-01-01
Bookshelf (http://www.ncbi.nlm.nih.gov/books/) is a full-text electronic literature resource of books and documents in life sciences and health care at the National Center for Biotechnology Information (NCBI). Created in 1999 with a single book as an encyclopedic reference for resources such as PubMed and GenBank, it has grown to its current size of >1300 titles. Unlike other NCBI databases, such as GenBank and Gene, which have a strict data structure, books come in all forms; they are diverse in publication types, formats, sizes and authoring models. The Bookshelf data format is XML tagged in the NCBI Book DTD (Document Type Definition), modeled after the National Library of Medicine journal article DTDs. The book DTD has been used for systematically tagging the diverse data formats of books, a move that has set the foundation for the growth of this resource. Books at NCBI followed the route of journal articles in the PubMed Central project, using the PubMed Central architectural framework, workflows and processes. Through integration with other NCBI molecular databases, books at NCBI can be used to provide reference information for biological data and facilitate its discovery. This article describes Bookshelf at NCBI: its growth, data handling and retrieval and integration with molecular databases.
NASA Astrophysics Data System (ADS)
Knapic, C.; Zanichelli, A.; Dovgan, E.; Nanni, M.; Stagni, M.; Righini, S.; Sponza, M.; Bedosti, F.; Orlati, A.; Smareglia, R.
2016-07-01
Radio Astronomical Data models are becoming very complex since the huge possible range of instrumental configurations available with the modern Radio Telescopes. What in the past was the last frontiers of data formats in terms of efficiency and flexibility is now evolving with new strategies and methodologies enabling the persistence of a very complex, hierarchical and multi-purpose information. Such an evolution of data models and data formats require new data archiving techniques in order to guarantee data preservation following the directives of Open Archival Information System and the International Virtual Observatory Alliance for data sharing and publication. Currently, various formats (FITS, MBFITS, VLBI's XML description files and ancillary files) of data acquired with the Medicina and Noto Radio Telescopes can be stored and handled by a common Radio Archive, that is planned to be released to the (inter)national community by the end of 2016. This state-of-the-art archiving system for radio astronomical data aims at delegating as much as possible to the software setting how and where the descriptors (metadata) are saved, while the users perform user-friendly queries translated by the web interface into complex interrogations on the database to retrieve data. In such a way, the Archive is ready to be Virtual Observatory compliant and as much as possible user-friendly.
Automating Data Submission to a National Archive
NASA Astrophysics Data System (ADS)
Work, T. T.; Chandler, C. L.; Groman, R. C.; Allison, M. D.; Gegg, S. R.; Biological; Chemical Oceanography Data Management Office
2010-12-01
In late 2006, the U.S. National Science Foundation (NSF) funded the Biological and Chemical Oceanographic Data Management Office (BCO-DMO) at Woods Hole Oceanographic Institution (WHOI) to work closely with investigators to manage oceanographic data generated from their research projects. One of the final data management tasks is to ensure that the data are permanently archived at the U.S. National Oceanographic Data Center (NODC) or other appropriate national archiving facility. In the past, BCO-DMO submitted data to NODC as an email with attachments including a PDF file (a manually completed metadata record) and one or more data files. This method is no longer feasible given the rate at which data sets are contributed to BCO-DMO. Working with collaborators at NODC, a more streamlined and automated workflow was developed to keep up with the increased volume of data that must be archived at NODC. We will describe our new workflow; a semi-automated approach for contributing data to NODC that includes a Federal Geographic Data Committee (FGDC) compliant Extensible Markup Language (XML) metadata file accompanied by comma-delimited data files. The FGDC XML file is populated from information stored in a MySQL database. A crosswalk described by an Extensible Stylesheet Language Transformation (XSLT) is used to transform the XML formatted MySQL result set to a FGDC compliant XML metadata file. To ensure data integrity, the MD5 algorithm is used to generate a checksum and manifest of the files submitted to NODC for permanent archive. The revised system supports preparation of detailed, standards-compliant metadata that facilitate data sharing and enable accurate reuse of multidisciplinary information. The approach is generic enough to be adapted for use by other data management groups.
NASA Astrophysics Data System (ADS)
Wang, Xiaodong; Zhang, Xiaoyu; Cai, Hongming; Xu, Boyi
Enacting a supply-chain process involves variant partners and different IT systems. REST receives increasing attention for distributed systems with loosely coupled resources. Nevertheless, resource model incompatibilities and conflicts prevent effective process modeling and deployment in resource-centric Web service environment. In this paper, a Petri-net based framework for supply-chain process integration is proposed. A resource meta-model is constructed to represent the basic information of resources. Then based on resource meta-model, XML schemas and documents are derived, which represent resources and their states in Petri-net. Thereafter, XML-net, a high level Petri-net, is employed for modeling control and data flow of process. From process model in XML-net, RESTful services and choreography descriptions are deduced. Therefore, unified resource representation and RESTful services description are proposed for cross-system integration in a more effective way. A case study is given to illustrate the approach and the desirable features of the approach are discussed.
Towards health care process description framework: an XML DTD design.
Staccini, P.; Joubert, M.; Quaranta, J. F.; Aymard, S.; Fieschi, D.; Fieschi, M.
2001-01-01
The development of health care and hospital information systems has to meet users needs as well as requirements such as the tracking of all care activities and the support of quality improvement. The use of process-oriented analysis is of-value to provide analysts with: (i) a systematic description of activities; (ii) the elicitation of the useful data to perform and record care tasks; (iii) the selection of relevant decision-making support. But paper-based tools are not a very suitable way to manage and share the documentation produced during this step. The purpose of this work is to propose a method to implement the results of process analysis according to XML techniques (eXtensible Markup Language). It is based on the IDEF0 activity modeling language (Integration DEfinition for Function modeling). A hierarchical description of a process and its components has been defined through a flat XML file with a grammar of proper metadata tags. Perspectives of this method are discussed. PMID:11825265
MedlinePlus Connect: Frequently Asked Questions (FAQs)
... topic data in XML format. Using the Web service, software developers can build applications that utilize MedlinePlus health topic information. The service accepts keyword searches as requests and returns relevant ...
NASA Astrophysics Data System (ADS)
Uznir, U.; Anton, F.; Suhaibah, A.; Rahman, A. A.; Mioc, D.
2013-09-01
The advantages of three dimensional (3D) city models can be seen in various applications including photogrammetry, urban and regional planning, computer games, etc.. They expand the visualization and analysis capabilities of Geographic Information Systems on cities, and they can be developed using web standards. However, these 3D city models consume much more storage compared to two dimensional (2D) spatial data. They involve extra geometrical and topological information together with semantic data. Without a proper spatial data clustering method and its corresponding spatial data access method, retrieving portions of and especially searching these 3D city models, will not be done optimally. Even though current developments are based on an open data model allotted by the Open Geospatial Consortium (OGC) called CityGML, its XML-based structure makes it challenging to cluster the 3D urban objects. In this research, we propose an opponent data constellation technique of space-filling curves (3D Hilbert curves) for 3D city model data representation. Unlike previous methods, that try to project 3D or n-dimensional data down to 2D or 3D using Principal Component Analysis (PCA) or Hilbert mappings, in this research, we extend the Hilbert space-filling curve to one higher dimension for 3D city model data implementations. The query performance was tested using a CityGML dataset of 1,000 building blocks and the results are presented in this paper. The advantages of implementing space-filling curves in 3D city modeling will improve data retrieval time by means of optimized 3D adjacency, nearest neighbor information and 3D indexing. The Hilbert mapping, which maps a subinterval of the [0, 1] interval to the corresponding portion of the d-dimensional Hilbert's curve, preserves the Lebesgue measure and is Lipschitz continuous. Depending on the applications, several alternatives are possible in order to cluster spatial data together in the third dimension compared to its clustering in 2D.
Fundamental Data Standards for Science Data System Interoperability and Data Correlation
NASA Astrophysics Data System (ADS)
Hughes, J. Steven; Gopala Krishna, Barla; Rye, Elizabeth; Crichton, Daniel
The advent of the Web and languages such as XML have brought an explosion of online science data repositories and the promises of correlated data and interoperable systems. However there have been relatively few successes in meeting the expectations of science users in the internet age. For example a Google-like search for images of Mars will return many highly-derived and appropriately tagged images but largely ignore the majority of images in most online image repositories. Once retrieved, users are further frustrated by poor data descriptions, arcane formats, and badly organized ancillary information. A wealth of research indicates that shared information models are needed to enable system interoperability and data correlation. However, at a more fundamental level, data correlation and system interoperability are dependant on a relatively few shared data standards. A com-mon data dictionary standard, for example, allows the controlled vocabulary used in a science repository to be shared with potential collaborators. Common data registry and product iden-tification standards enable systems to efficiently find, locate, and retrieve data products and their metadata from remote repositories. Information content standards define categories of descriptive data that help make the data products scientifically useful to users who were not part of the original team that produced the data. The Planetary Data System (PDS) has a plan to move the PDS to a fully online, federated system. This plan addresses new demands on the system including increasing data volume, numbers of missions, and complexity of missions. A key component of this plan is the upgrade of the PDS Data Standards. The adoption of the core PDS data standards by the International Planetary Data Alliance (IPDA) adds the element of international cooperation to the plan. This presentation will provide an overview of the fundamental data standards being adopted by the PDS that transcend science domains and that will help to meet the PDS's and IPDA's system interoperability and data correlation requirements.
XML: A Language To Manage the World Wide Web. ERIC Digest.
ERIC Educational Resources Information Center
Davis-Tanous, Jennifer R.
This digest provides an overview of XML (Extensible Markup Language), a markup language used to construct World Wide Web pages. Topics addressed include: (1) definition of a markup language, including comparison of XML with SGML (Standard Generalized Markup Language) and HTML (HyperText Markup Language); (2) how XML works, including sample tags,…
Kataoka, Satoshi; Ohe, Kazuhiko; Mochizuki, Mayumi; Ueda, Shiro
2002-01-01
We have developed an adverse drug reaction (ADR) reporting system integrating it with Hospital Information System (HIS) of the University of Tokyo Hospital. Since this system is designed with JAVA, it is portable without re-compiling to any operating systems on which JAVA virtual machines work. In this system, we implemented an automatic data filling function using XML-based (extended Markup Language) files generated by HIS. This new specification would decrease the time needed for physicians and pharmacists to fill the spontaneous ADR reports. By clicking a button, the report is sent to the text database through Simple Mail Transfer Protocol (SMTP) electronic mails. The destination of the report mail can be changed arbitrarily by administrators, which adds this system more flexibility for practical operation. Although we tried our best to use the SGML-based (Standard Generalized Markup Language) ICH M2 guideline to follow the global standard of the case report, we eventually adopted XML as the output report format. This is because we found some problems in handling two bytes characters with ICH guideline and XML has a lot of useful features. According to our pilot survey conducted at the University of Tokyo Hospital, many physicians answered that our idea, integrating ADR reporting system to HIS, would increase the ADR reporting numbers.
Interoperability, Data Control and Battlespace Visualization using XML, XSLT and X3D
2003-09-01
26 Rosenthal, Arnon, Seligman , Len and Costello, Roger, XML, Databases, and Interoperability, Federal Database Colloquium, AFCEA, San Diego...79 Rosenthal, Arnon, Seligman , Len and Costello, Roger, “XML, Databases, and Interoperability”, Federal Database Colloquium, AFCEA, San Diego, 1999... Linda , Mastering XML, Premium Edition, SYBEX, 2001 Wooldridge, Michael , An Introduction to MultiAgent Systems, Wiley, 2002 PAPERS Abernathy, M
Integrating and visualizing primary data from prospective and legacy taxonomic literature
Agosti, Donat; Penev, Lyubomir; Sautter, Guido; Georgiev, Teodor; Catapano, Terry; Patterson, David; King, David; Pereira, Serrano; Vos, Rutger Aldo; Sierra, Soraya
2015-01-01
Abstract Specimen data in taxonomic literature are among the highest quality primary biodiversity data. Innovative cybertaxonomic journals are using workflows that maintain data structure and disseminate electronic content to aggregators and other users; such structure is lost in traditional taxonomic publishing. Legacy taxonomic literature is a vast repository of knowledge about biodiversity. Currently, access to that resource is cumbersome, especially for non-specialist data consumers. Markup is a mechanism that makes this content more accessible, and is especially suited to machine analysis. Fine-grained XML (Extensible Markup Language) markup was applied to all (37) open-access articles published in the journal Zootaxa containing treatments on spiders (Order: Araneae). The markup approach was optimized to extract primary specimen data from legacy publications. These data were combined with data from articles containing treatments on spiders published in Biodiversity Data Journal where XML structure is part of the routine publication process. A series of charts was developed to visualize the content of specimen data in XML-tagged taxonomic treatments, either singly or in aggregate. The data can be filtered by several fields (including journal, taxon, institutional collection, collecting country, collector, author, article and treatment) to query particular aspects of the data. We demonstrate here that XML markup using GoldenGATE can address the challenge presented by unstructured legacy data, can extract structured primary biodiversity data which can be aggregated with and jointly queried with data from other Darwin Core-compatible sources, and show how visualization of these data can communicate key information contained in biodiversity literature. We complement recent studies on aspects of biodiversity knowledge using XML structured data to explore 1) the time lag between species discovry and description, and 2) the prevelence of rarity in species descriptions. PMID:26023286
Compression of Probabilistic XML Documents
NASA Astrophysics Data System (ADS)
Veldman, Irma; de Keijzer, Ander; van Keulen, Maurice
Database techniques to store, query and manipulate data that contains uncertainty receives increasing research interest. Such UDBMSs can be classified according to their underlying data model: relational, XML, or RDF. We focus on uncertain XML DBMS with as representative example the Probabilistic XML model (PXML) of [10,9]. The size of a PXML document is obviously a factor in performance. There are PXML-specific techniques to reduce the size, such as a push down mechanism, that produces equivalent but more compact PXML documents. It can only be applied, however, where possibilities are dependent. For normal XML documents there also exist several techniques for compressing a document. Since Probabilistic XML is (a special form of) normal XML, it might benefit from these methods even more. In this paper, we show that existing compression mechanisms can be combined with PXML-specific compression techniques. We also show that best compression rates are obtained with a combination of PXML-specific technique with a rather simple generic DAG-compression technique.
The Implications of Well-Formedness on Web-Based Educational Resources.
ERIC Educational Resources Information Center
Mohler, James L.
Within all institutions, Web developers are beginning to utilize technologies that make sites more than static information resources. Databases such as XML (Extensible Markup Language) and XSL (Extensible Stylesheet Language) are key technologies that promise to extend the Web beyond the "information storehouse" paradigm and provide…
Towards the XML schema measurement based on mapping between XML and OO domain
NASA Astrophysics Data System (ADS)
Rakić, Gordana; Budimac, Zoran; Heričko, Marjan; Pušnik, Maja
2017-07-01
Measuring quality of IT solutions is a priority in software engineering. Although numerous metrics for measuring object-oriented code already exist, measuring quality of UML models or XML Schemas is still developing. One of the research questions in the overall research leaded by ideas described in this paper is whether we can apply already defined object-oriented design metrics on XML schemas based on predefined mappings. In this paper, basic ideas for mentioned mapping are presented. This mapping is prerequisite for setting the future approach to XML schema quality measuring with object-oriented metrics.
Menezes, Pedro Monteiro; Cook, Timothy Wayne; Cavalini, Luciana Tricai
2016-01-01
To present the technical background and the development of a procedure that enriches the semantics of Health Level Seven version 2 (HL7v2) messages for software-intensive systems in telemedicine trauma care. This study followed a multilevel model-driven approach for the development of semantically interoperable health information systems. The Pre-Hospital Trauma Life Support (PHTLS) ABCDE protocol was adopted as the use case. A prototype application embedded the semantics into an HL7v2 message as an eXtensible Markup Language (XML) file, which was validated against an XML schema that defines constraints on a common reference model. This message was exchanged with a second prototype application, developed on the Mirth middleware, which was also used to parse and validate both the original and the hybrid messages. Both versions of the data instance (one pure XML, one embedded in the HL7v2 message) were equally validated and the RDF-based semantics recovered by the receiving side of the prototype from the shared XML schema. This study demonstrated the semantic enrichment of HL7v2 messages for intensive-software telemedicine systems for trauma care, by validating components of extracts generated in various computing environments. The adoption of the method proposed in this study ensures the compliance of the HL7v2 standard in Semantic Web technologies.
Construction of a nasopharyngeal carcinoma 2D/MS repository with Open Source XML database--Xindice.
Li, Feng; Li, Maoyu; Xiao, Zhiqiang; Zhang, Pengfei; Li, Jianling; Chen, Zhuchu
2006-01-11
Many proteomics initiatives require integration of all information with uniformcriteria from collection of samples and data display to publication of experimental results. The integration and exchanging of these data of different formats and structure imposes a great challenge to us. The XML technology presents a promise in handling this task due to its simplicity and flexibility. Nasopharyngeal carcinoma (NPC) is one of the most common cancers in southern China and Southeast Asia, which has marked geographic and racial differences in incidence. Although there are some cancer proteome databases now, there is still no NPC proteome database. The raw NPC proteome experiment data were captured into one XML document with Human Proteome Markup Language (HUP-ML) editor and imported into native XML database Xindice. The 2D/MS repository of NPC proteome was constructed with Apache, PHP and Xindice to provide access to the database via Internet. On our website, two methods, keyword query and click query, were provided at the same time to access the entries of the NPC proteome database. Our 2D/MS repository can be used to share the raw NPC proteomics data that are generated from gel-based proteomics experiments. The database, as well as the PHP source codes for constructing users' own proteome repository, can be accessed at http://www.xyproteomics.org/.
DeepBlue epigenomic data server: programmatic data retrieval and analysis of epigenome region sets
Albrecht, Felipe; List, Markus; Bock, Christoph; Lengauer, Thomas
2016-01-01
Large amounts of epigenomic data are generated under the umbrella of the International Human Epigenome Consortium, which aims to establish 1000 reference epigenomes within the next few years. These data have the potential to unravel the complexity of epigenomic regulation. However, their effective use is hindered by the lack of flexible and easy-to-use methods for data retrieval. Extracting region sets of interest is a cumbersome task that involves several manual steps: identifying the relevant experiments, downloading the corresponding data files and filtering the region sets of interest. Here we present the DeepBlue Epigenomic Data Server, which streamlines epigenomic data analysis as well as software development. DeepBlue provides a comprehensive programmatic interface for finding, selecting, filtering, summarizing and downloading region sets. It contains data from four major epigenome projects, namely ENCODE, ROADMAP, BLUEPRINT and DEEP. DeepBlue comes with a user manual, examples and a well-documented application programming interface (API). The latter is accessed via the XML-RPC protocol supported by many programming languages. To demonstrate usage of the API and to enable convenient data retrieval for non-programmers, we offer an optional web interface. DeepBlue can be openly accessed at http://deepblue.mpi-inf.mpg.de. PMID:27084938
Biological data integration: wrapping data and tools.
Lacroix, Zoé
2002-06-01
Nowadays scientific data is inevitably digital and stored in a wide variety of formats in heterogeneous systems. Scientists need to access an integrated view of remote or local heterogeneous data sources with advanced data accessing, analyzing, and visualization tools. Building a digital library for scientific data requires accessing and manipulating data extracted from flat files or databases, documents retrieved from the Web as well as data generated by software. We present an approach to wrapping web data sources, databases, flat files, or data generated by tools through a database view mechanism. Generally, a wrapper has two tasks: it first sends a query to the source to retrieve data and, second builds the expected output with respect to the virtual structure. Our wrappers are composed of a retrieval component based on an intermediate object view mechanism called search views mapping the source capabilities to attributes, and an eXtensible Markup Language (XML) engine, respectively, to perform these two tasks. The originality of the approach consists of: 1) a generic view mechanism to access seamlessly data sources with limited capabilities and 2) the ability to wrap data sources as well as the useful specific tools they may provide. Our approach has been developed and demonstrated as part of the multidatabase system supporting queries via uniform object protocol model (OPM) interfaces.
Learn about the Environmental Information Exchange Network
States, territories, and tribes can partner with EPA to exchange standardized data to improve its quality, increase its availability, and integrate it better across different sources using connecting nodes, xml schema, and a data exchange template.
... on MedlinePlus health topic pages. With the Web service, software developers can build applications that leverage the authoritative, reliable health information in MedlinePlus. The MedlinePlus Web service is free of charge and does not require ...
New Tools to Document and Manage Data/Metadata: Example NGEE Arctic and ARM
NASA Astrophysics Data System (ADS)
Crow, M. C.; Devarakonda, R.; Killeffer, T.; Hook, L.; Boden, T.; Wullschleger, S.
2017-12-01
Tools used for documenting, archiving, cataloging, and searching data are critical pieces of informatics. This poster describes tools being used in several projects at Oak Ridge National Laboratory (ORNL), with a focus on the U.S. Department of Energy's Next Generation Ecosystem Experiment in the Arctic (NGEE Arctic) and Atmospheric Radiation Measurements (ARM) project, and their usage at different stages of the data lifecycle. The Online Metadata Editor (OME) is used for the documentation and archival stages while a Data Search tool supports indexing, cataloging, and searching. The NGEE Arctic OME Tool [1] provides a method by which researchers can upload their data and provide original metadata with each upload while adhering to standard metadata formats. The tool is built upon a Java SPRING framework to parse user input into, and from, XML output. Many aspects of the tool require use of a relational database including encrypted user-login, auto-fill functionality for predefined sites and plots, and file reference storage and sorting. The Data Search Tool conveniently displays each data record in a thumbnail containing the title, source, and date range, and features a quick view of the metadata associated with that record, as well as a direct link to the data. The search box incorporates autocomplete capabilities for search terms and sorted keyword filters are available on the side of the page, including a map for geo-searching. These tools are supported by the Mercury [2] consortium (funded by DOE, NASA, USGS, and ARM) and developed and managed at Oak Ridge National Laboratory. Mercury is a set of tools for collecting, searching, and retrieving metadata and data. Mercury collects metadata from contributing project servers, then indexes the metadata to make it searchable using Apache Solr, and provides access to retrieve it from the web page. Metadata standards that Mercury supports include: XML, Z39.50, FGDC, Dublin-Core, Darwin-Core, EML, and ISO-19115.
The NASA Program Management Tool: A New Vision in Business Intelligence
NASA Technical Reports Server (NTRS)
Maluf, David A.; Swanson, Keith; Putz, Peter; Bell, David G.; Gawdiak, Yuri
2006-01-01
This paper describes a novel approach to business intelligence and program management for large technology enterprises like the U.S. National Aeronautics and Space Administration (NASA). Two key distinctions of the approach are that 1) standard business documents are the user interface, and 2) a "schema-less" XML database enables flexible integration of technology information for use by both humans and machines in a highly dynamic environment. The implementation utilizes patent-pending NASA software called the NASA Program Management Tool (PMT) and its underlying "schema-less" XML database called Netmark. Initial benefits of PMT include elimination of discrepancies between business documents that use the same information and "paperwork reduction" for program and project management in the form of reducing the effort required to understand standard reporting requirements and to comply with those reporting requirements. We project that the underlying approach to business intelligence will enable significant benefits in the timeliness, integrity and depth of business information available to decision makers on all organizational levels.
Specifics on a XML Data Format for Scientific Data
NASA Astrophysics Data System (ADS)
Shaya, E.; Thomas, B.; Cheung, C.
An XML-based data format for interchange and archiving of scientific data would benefit in many ways from the features standardized in XML. Foremost of these features is the world-wide acceptance and adoption of XML. Applications, such as browsers, XQL and XSQL advanced query, XML editing, or CSS or XSLT transformation, that are coming out of industry and academia can be easily adopted and provide startling new benefits and features. We have designed a prototype of a core format for holding, in a very general way, parameters, tables, scalar and vector fields, atlases, animations and complex combinations of these. This eXtensible Data Format (XDF) makes use of XML functionalities such as: self-validation of document structure, default values for attributes, XLink hyperlinks, entity replacements, internal referencing, inheritance, and XSLT transformation. An API is available to aid in detailed assembly, extraction, and manipulation. Conversion tools to and from FITS and other existing data formats are under development. In the future, we hope to provide object oriented interfaces to C++, Java, Python, IDL, Mathematica, Maple, and various databases. http://xml.gsfc.nasa.gov/XDF
EOS ODL Metadata On-line Viewer
NASA Astrophysics Data System (ADS)
Yang, J.; Rabi, M.; Bane, B.; Ullman, R.
2002-12-01
We have recently developed and deployed an EOS ODL metadata on-line viewer. The EOS ODL metadata viewer is a web server that takes: 1) an EOS metadata file in Object Description Language (ODL), 2) parameters, such as which metadata to view and what style of display to use, and returns an HTML or XML document displaying the requested metadata in the requested style. This tool is developed to address widespread complaints by science community that the EOS Data and Information System (EOSDIS) metadata files in ODL are difficult to read by allowing users to upload and view an ODL metadata file in different styles using a web browser. Users have the selection to view all the metadata or part of the metadata, such as Collection metadata, Granule metadata, or Unsupported Metadata. Choices of display styles include 1) Web: a mouseable display with tabs and turn-down menus, 2) Outline: Formatted and colored text, suitable for printing, 3) Generic: Simple indented text, a direct representation of the underlying ODL metadata, and 4) None: No stylesheet is applied and the XML generated by the converter is returned directly. Not all display styles are implemented for all the metadata choices. For example, Web style is only implemented for Collection and Granule metadata groups with known attribute fields, but not for Unsupported, Other, and All metadata. The overall strategy of the ODL viewer is to transform an ODL metadata file to a viewable HTML in two steps. The first step is to convert the ODL metadata file to an XML using a Java-based parser/translator called ODL2XML. The second step is to transform the XML to an HTML using stylesheets. Both operations are done on the server side. This allows a lot of flexibility in the final result, and is very portable cross-platform. Perl CGI behind the Apache web server is used to run the Java ODL2XML, and then run the results through an XSLT processor. The EOS ODL viewer can be accessed from either a PC or a Mac using Internet Explorer 5.0+ or Netscape 4.7+.
A Priority Fuzzy Logic Extension of the XQuery Language
NASA Astrophysics Data System (ADS)
Škrbić, Srdjan; Wettayaprasit, Wiphada; Saeueng, Pannipa
2011-09-01
In recent years there have been significant research findings in flexible XML querying techniques using fuzzy set theory. Many types of fuzzy extensions to XML data model and XML query languages have been proposed. In this paper, we introduce priority fuzzy logic extensions to XQuery language. Describing these extensions we introduce a new query language. Moreover, we describe a way to implement an interpreter for this language using an existing XML native database.
QRFXFreeze: Queryable Compressor for RFX.
Senthilkumar, Radha; Nandagopal, Gomathi; Ronald, Daphne
2015-01-01
The verbose nature of XML has been mulled over again and again and many compression techniques for XML data have been excogitated over the years. Some of the techniques incorporate support for querying the XML database in its compressed format while others have to be decompressed before they can be queried. XML compression in which querying is directly supported instantaneously with no compromise over time is forced to compromise over space. In this paper, we propose the compressor, QRFXFreeze, which not only reduces the space of storage but also supports efficient querying. The compressor does this without decompressing the compressed XML file. The compressor supports all kinds of XML documents along with insert, update, and delete operations. The forte of QRFXFreeze is that the textual data are semantically compressed and are indexed to reduce the querying time. Experimental results show that the proposed compressor performs much better than other well-known compressors.
SGDB: a database of synthetic genes re-designed for optimizing protein over-expression.
Wu, Gang; Zheng, Yuanpu; Qureshi, Imran; Zin, Htar Thant; Beck, Tyler; Bulka, Blazej; Freeland, Stephen J
2007-01-01
Here we present the Synthetic Gene Database (SGDB): a relational database that houses sequences and associated experimental information on synthetic (artificially engineered) genes from all peer-reviewed studies published to date. At present, the database comprises information from more than 200 published experiments. This resource not only provides reference material to guide experimentalists in designing new genes that improve protein expression, but also offers a dataset for analysis by bioinformaticians who seek to test ideas regarding the underlying factors that influence gene expression. The SGDB was built under MySQL database management system. We also offer an XML schema for standardized data description of synthetic genes. Users can access the database at http://www.evolvingcode.net/codon/sgdb/index.php, or batch downloads all information through XML files. Moreover, users may visually compare the coding sequences of a synthetic gene and its natural counterpart with an integrated web tool at http://www.evolvingcode.net/codon/sgdb/aligner.php, and discuss questions, findings and related information on an associated e-forum at http://www.evolvingcode.net/forum/viewforum.php?f=27.
Guo, Jinqiu; Takada, Akira; Tanaka, Koji; Sato, Junzo; Suzuki, Muneou; Takahashi, Kiwamu; Daimon, Hiroyuki; Suzuki, Toshiaki; Nakashima, Yusei; Araki, Kenji; Yoshihara, Hiroyuki
2005-08-01
With the evolving and diverse electronic medical record (EMR) systems, there appears to be an ever greater need to link EMR systems and patient accounting systems with a standardized data exchange format. To this end, the CLinical Accounting InforMation (CLAIM) data exchange standard was developed. CLAIM is subordinate to the Medical Markup Language (MML) standard, which allows the exchange of medical data among different medical institutions. CLAIM uses eXtensible Markup Language (XML) as a meta-language. The current version, 2.1, inherited the basic structure of MML 2.x and contains two modules including information related to registration, appointment, procedure and charging. CLAIM 2.1 was implemented successfully in Japan in 2001. Consequently, it was confirmed that CLAIM could be used as an effective data exchange format between EMR systems and patient accounting systems.
XCEDE: An Extensible Schema For Biomedical Data
Gadde, Syam; Aucoin, Nicole; Grethe, Jeffrey S.; Keator, David B.; Marcus, Daniel S.; Pieper, Steve
2013-01-01
The XCEDE (XML-based Clinical and Experimental Data Exchange) XML schema, developed by members of the BIRN (Biomedical Informatics Research Network), provides an extensive metadata hierarchy for storing, describing and documenting the data generated by scientific studies. Currently at version 2.0, the XCEDE schema serves as a specification for the exchange of scientific data between databases, analysis tools, and web services. It provides a structured metadata hierarchy, storing information relevant to various aspects of an experiment (project, subject, protocol, etc.). Each hierarchy level also provides for the storage of data provenance information allowing for a traceable record of processing and/or changes to the underlying data. The schema is extensible to support the needs of various data modalities and to express types of data not originally envisioned by the developers. The latest version of the XCEDE schema and manual are available from http://www.xcede.org/ PMID:21479735
Kottmann, Renzo; Gray, Tanya; Murphy, Sean; Kagan, Leonid; Kravitz, Saul; Lombardot, Thierry; Field, Dawn; Glöckner, Frank Oliver
2008-06-01
The Genomic Contextual Data Markup Language (GCDML) is a core project of the Genomic Standards Consortium (GSC) that implements the "Minimum Information about a Genome Sequence" (MIGS) specification and its extension, the "Minimum Information about a Metagenome Sequence" (MIMS). GCDML is an XML Schema for generating MIGS/MIMS compliant reports for data entry, exchange, and storage. When mature, this sample-centric, strongly-typed schema will provide a diverse set of descriptors for describing the exact origin and processing of a biological sample, from sampling to sequencing, and subsequent analysis. Here we describe the need for such a project, outline design principles required to support the project, and make an open call for participation in defining the future content of GCDML. GCDML is freely available, and can be downloaded, along with documentation, from the GSC Web site (http://gensc.org).
NeXML: rich, extensible, and verifiable representation of comparative data and metadata.
Vos, Rutger A; Balhoff, James P; Caravas, Jason A; Holder, Mark T; Lapp, Hilmar; Maddison, Wayne P; Midford, Peter E; Priyam, Anurag; Sukumaran, Jeet; Xia, Xuhua; Stoltzfus, Arlin
2012-07-01
In scientific research, integration and synthesis require a common understanding of where data come from, how much they can be trusted, and what they may be used for. To make such an understanding computer-accessible requires standards for exchanging richly annotated data. The challenges of conveying reusable data are particularly acute in regard to evolutionary comparative analysis, which comprises an ever-expanding list of data types, methods, research aims, and subdisciplines. To facilitate interoperability in evolutionary comparative analysis, we present NeXML, an XML standard (inspired by the current standard, NEXUS) that supports exchange of richly annotated comparative data. NeXML defines syntax for operational taxonomic units, character-state matrices, and phylogenetic trees and networks. Documents can be validated unambiguously. Importantly, any data element can be annotated, to an arbitrary degree of richness, using a system that is both flexible and rigorous. We describe how the use of NeXML by the TreeBASE and Phenoscape projects satisfies user needs that cannot be satisfied with other available file formats. By relying on XML Schema Definition, the design of NeXML facilitates the development and deployment of software for processing, transforming, and querying documents. The adoption of NeXML for practical use is facilitated by the availability of (1) an online manual with code samples and a reference to all defined elements and attributes, (2) programming toolkits in most of the languages used commonly in evolutionary informatics, and (3) input-output support in several widely used software applications. An active, open, community-based development process enables future revision and expansion of NeXML.
NeXML: Rich, Extensible, and Verifiable Representation of Comparative Data and Metadata
Vos, Rutger A.; Balhoff, James P.; Caravas, Jason A.; Holder, Mark T.; Lapp, Hilmar; Maddison, Wayne P.; Midford, Peter E.; Priyam, Anurag; Sukumaran, Jeet; Xia, Xuhua; Stoltzfus, Arlin
2012-01-01
Abstract In scientific research, integration and synthesis require a common understanding of where data come from, how much they can be trusted, and what they may be used for. To make such an understanding computer-accessible requires standards for exchanging richly annotated data. The challenges of conveying reusable data are particularly acute in regard to evolutionary comparative analysis, which comprises an ever-expanding list of data types, methods, research aims, and subdisciplines. To facilitate interoperability in evolutionary comparative analysis, we present NeXML, an XML standard (inspired by the current standard, NEXUS) that supports exchange of richly annotated comparative data. NeXML defines syntax for operational taxonomic units, character-state matrices, and phylogenetic trees and networks. Documents can be validated unambiguously. Importantly, any data element can be annotated, to an arbitrary degree of richness, using a system that is both flexible and rigorous. We describe how the use of NeXML by the TreeBASE and Phenoscape projects satisfies user needs that cannot be satisfied with other available file formats. By relying on XML Schema Definition, the design of NeXML facilitates the development and deployment of software for processing, transforming, and querying documents. The adoption of NeXML for practical use is facilitated by the availability of (1) an online manual with code samples and a reference to all defined elements and attributes, (2) programming toolkits in most of the languages used commonly in evolutionary informatics, and (3) input–output support in several widely used software applications. An active, open, community-based development process enables future revision and expansion of NeXML. PMID:22357728
Managing and Querying Image Annotation and Markup in XML.
Wang, Fusheng; Pan, Tony; Sharma, Ashish; Saltz, Joel
2010-01-01
Proprietary approaches for representing annotations and image markup are serious barriers for researchers to share image data and knowledge. The Annotation and Image Markup (AIM) project is developing a standard based information model for image annotation and markup in health care and clinical trial environments. The complex hierarchical structures of AIM data model pose new challenges for managing such data in terms of performance and support of complex queries. In this paper, we present our work on managing AIM data through a native XML approach, and supporting complex image and annotation queries through native extension of XQuery language. Through integration with xService, AIM databases can now be conveniently shared through caGrid.
Hoelzer, Simon; Schweiger, Ralf K; Liu, Raymond; Rudolf, Dirk; Rieger, Joerg; Dudeck, Joachim
2005-01-01
With the introduction of the ICD-10 as the standard for diagnosis, the development of an electronic representation of its complete content, inherent semantics and coding rules is necessary. Our concept refers to current efforts of the CEN/TC 251 to establish a European standard for hierarchical classification systems in healthcare. We have developed an electronic representation of the ICD-10 with the extensible Markup Language (XML) that facilitates the integration in current information systems or coding software taking into account different languages and versions. In this context, XML offers a complete framework of related technologies and standard tools for processing that helps to develop interoperable applications.
Managing and Querying Image Annotation and Markup in XML
Wang, Fusheng; Pan, Tony; Sharma, Ashish; Saltz, Joel
2010-01-01
Proprietary approaches for representing annotations and image markup are serious barriers for researchers to share image data and knowledge. The Annotation and Image Markup (AIM) project is developing a standard based information model for image annotation and markup in health care and clinical trial environments. The complex hierarchical structures of AIM data model pose new challenges for managing such data in terms of performance and support of complex queries. In this paper, we present our work on managing AIM data through a native XML approach, and supporting complex image and annotation queries through native extension of XQuery language. Through integration with xService, AIM databases can now be conveniently shared through caGrid. PMID:21218167
WITH: a system to write clinical trials using XML and RDBMS.
Fazi, Paola; Luzi, Daniela; Manco, Mariarosaria; Ricci, Fabrizio L.; Toffoli, Giovanni; Vignetti, Marco
2002-01-01
The paper illustrates the system WITH (Write on Internet clinical Trials in Haematology) which supports the writing of a clinical trial (CT) document. The requirements of this system have been defined analysing the writing process of a CT and then modelling the content of its sections together with their logical and temporal relationships. The system WITH allows: a) editing the document text; b) re-using the text; and c) facilitating the cooperation and the collaborative writing. It is based on XML mark-up language, and on a RDBMS. This choice guarantees: a) process standardisation; b) process management; c) efficient delivery of information-based tasks; and d) explicit focus on process design. PMID:12463823
Yokohama, Noriya; Tsuchimoto, Tadashi; Oishi, Masamichi; Itou, Katsuya
2007-01-20
It has been noted that the downtime of medical informatics systems is often long. Many systems encounter downtimes of hours or even days, which can have a critical effect on daily operations. Such systems remain especially weak in the areas of database and medical imaging data. The scheme design shows the three-layer architecture of the system: application, database, and storage layers. The application layer uses the DICOM protocol (Digital Imaging and Communication in Medicine) and HTTP (Hyper Text Transport Protocol) with AJAX (Asynchronous JavaScript+XML). The database is designed to decentralize in parallel using cluster technology. Consequently, restoration of the database can be done not only with ease but also with improved retrieval speed. In the storage layer, a network RAID (Redundant Array of Independent Disks) system, it is possible to construct exabyte-scale parallel file systems that exploit storage spread. Development and evaluation of the test-bed has been successful in medical information data backup and recovery in a network environment. This paper presents a schematic design of the new medical informatics system that can be accommodated from a recovery and the dynamic Web application for medical imaging distribution using AJAX.
Electronic patient record and archive of records in Cardio.net system for telecardiology.
Sierdziński, Janusz; Karpiński, Grzegorz
2003-01-01
In modern medicine the well structured patient data set, fast access to it and reporting capability become an important question. With the dynamic development of information technology (IT) such question is solved via building electronic patient record (EPR) archives. We then obtain fast access to patient data, diagnostic and treatment protocols etc. It results in more efficient, better and cheaper treatment. The aim of the work was to design a uniform Electronic Patient Record, implemented in cardio.net system for telecardiology allowing the co-operation among regional hospitals and reference centers. It includes questionnaires for demographic data and questionnaires supporting doctor's work (initial diagnosis, final diagnosis, history and physical, ECG at the discharge, applied treatment, additional tests, drugs, daily and periodical reports). The browser is implemented in EPR archive to facilitate data retrieval. Several tools for creating EPR and EPR archive were used such as: XML, PHP, Java Script and MySQL. The separate question is the security of data on WWW server. The security is ensured via Security Socket Layer (SSL) protocols and other tools. EPR in Cardio.net system is a module enabling the co-work of many physicians and the communication among different medical centers.
A comparison of database systems for XML-type data.
Risse, Judith E; Leunissen, Jack A M
2010-01-01
In the field of bioinformatics interchangeable data formats based on XML are widely used. XML-type data is also at the core of most web services. With the increasing amount of data stored in XML comes the need for storing and accessing the data. In this paper we analyse the suitability of different database systems for storing and querying large datasets in general and Medline in particular. All reviewed database systems perform well when tested with small to medium sized datasets, however when the full Medline dataset is queried a large variation in query times is observed. There is not one system that is vastly superior to the others in this comparison and, depending on the database size and the query requirements, different systems are most suitable. The best all-round solution is the Oracle 11~g database system using the new binary storage option. Alias-i's Lingpipe is a more lightweight, customizable and sufficiently fast solution. It does however require more initial configuration steps. For data with a changing XML structure Sedna and BaseX as native XML database systems or MySQL with an XML-type column are suitable.
75 FR 57480 - Agency Information Collection Activities: Customs Declaration (Form 6059B)
Federal Register 2010, 2011, 2012, 2013, 2014
2010-09-21
... Form 6059B can be found at: http://www.cbp.gov/xp/cgov/travel/vacation/sample_declaration_form.xml... Annual Responses: 105,606,000. Estimated Time per Response: 4 minutes. Estimated Total Annual Burden...
CMO: Cruise Metadata Organizer for JAMSTEC Research Cruises
NASA Astrophysics Data System (ADS)
Fukuda, K.; Saito, H.; Hanafusa, Y.; Vanroosebeke, A.; Kitayama, T.
2011-12-01
JAMSTEC's Data Research Center for Marine-Earth Sciences manages and distributes a wide variety of observational data and samples obtained from JAMSTEC research vessels and deep sea submersibles. Generally, metadata are essential to identify data and samples were obtained. In JAMSTEC, cruise metadata include cruise information such as cruise ID, name of vessel, research theme, and diving information such as dive number, name of submersible and position of diving point. They are submitted by chief scientists of research cruises in the Microsoft Excel° spreadsheet format, and registered into a data management database to confirm receipt of observational data files, cruise summaries, and cruise reports. The cruise metadata are also published via "JAMSTEC Data Site for Research Cruises" within two months after end of cruise. Furthermore, these metadata are distributed with observational data, images and samples via several data and sample distribution websites after a publication moratorium period. However, there are two operational issues in the metadata publishing process. One is that duplication efforts and asynchronous metadata across multiple distribution websites due to manual metadata entry into individual websites by administrators. The other is that differential data types or representation of metadata in each website. To solve those problems, we have developed a cruise metadata organizer (CMO) which allows cruise metadata to be connected from the data management database to several distribution websites. CMO is comprised of three components: an Extensible Markup Language (XML) database, an Enterprise Application Integration (EAI) software, and a web-based interface. The XML database is used because of its flexibility for any change of metadata. Daily differential uptake of metadata from the data management database to the XML database is automatically processed via the EAI software. Some metadata are entered into the XML database using the web-based interface by a metadata editor in CMO as needed. Then daily differential uptake of metadata from the XML database to databases in several distribution websites is automatically processed using a convertor defined by the EAI software. Currently, CMO is available for three distribution websites: "Deep Sea Floor Rock Sample Database GANSEKI", "Marine Biological Sample Database", and "JAMSTEC E-library of Deep-sea Images". CMO is planned to provide "JAMSTEC Data Site for Research Cruises" with metadata in the future.
Cook, Timothy Wayne; Cavalini, Luciana Tricai
2016-01-01
Objectives To present the technical background and the development of a procedure that enriches the semantics of Health Level Seven version 2 (HL7v2) messages for software-intensive systems in telemedicine trauma care. Methods This study followed a multilevel model-driven approach for the development of semantically interoperable health information systems. The Pre-Hospital Trauma Life Support (PHTLS) ABCDE protocol was adopted as the use case. A prototype application embedded the semantics into an HL7v2 message as an eXtensible Markup Language (XML) file, which was validated against an XML schema that defines constraints on a common reference model. This message was exchanged with a second prototype application, developed on the Mirth middleware, which was also used to parse and validate both the original and the hybrid messages. Results Both versions of the data instance (one pure XML, one embedded in the HL7v2 message) were equally validated and the RDF-based semantics recovered by the receiving side of the prototype from the shared XML schema. Conclusions This study demonstrated the semantic enrichment of HL7v2 messages for intensive-software telemedicine systems for trauma care, by validating components of extracts generated in various computing environments. The adoption of the method proposed in this study ensures the compliance of the HL7v2 standard in Semantic Web technologies. PMID:26893947
Extraction and labeling high-resolution images from PDF documents
NASA Astrophysics Data System (ADS)
Chachra, Suchet K.; Xue, Zhiyun; Antani, Sameer; Demner-Fushman, Dina; Thoma, George R.
2013-12-01
Accuracy of content-based image retrieval is affected by image resolution among other factors. Higher resolution images enable extraction of image features that more accurately represent the image content. In order to improve the relevance of search results for our biomedical image search engine, Open-I, we have developed techniques to extract and label high-resolution versions of figures from biomedical articles supplied in the PDF format. Open-I uses the open-access subset of biomedical articles from the PubMed Central repository hosted by the National Library of Medicine. Articles are available in XML and in publisher supplied PDF formats. As these PDF documents contain little or no meta-data to identify the embedded images, the task includes labeling images according to their figure number in the article after they have been successfully extracted. For this purpose we use the labeled small size images provided with the XML web version of the article. This paper describes the image extraction process and two alternative approaches to perform image labeling that measure the similarity between two images based upon the image intensity projection on the coordinate axes and similarity based upon the normalized cross-correlation between the intensities of two images. Using image identification based on image intensity projection, we were able to achieve a precision of 92.84% and a recall of 82.18% in labeling of the extracted images.
Development of XML Schema for Broadband Digital Seismograms and Data Center Portal
NASA Astrophysics Data System (ADS)
Takeuchi, N.; Tsuboi, S.; Ishihara, Y.; Nagao, H.; Yamagishi, Y.; Watanabe, T.; Yanaka, H.; Yamaji, H.
2008-12-01
There are a number of data centers around the globe, where the digital broadband seismograms are opened to researchers. Those centers use their own user interfaces and there are no standard to access and retrieve seismograms from different data centers using unified interface. One of the emergent technologies to realize unified user interface for different data centers is the concept of WebService and WebService portal. Here we have developed a prototype of data center portal for digital broadband seismograms. This WebService portal uses WSDL (Web Services Description Language) to accommodate differences among the different data centers. By using the WSDL, alteration and addition of data center user interfaces can be easily managed. This portal, called NINJA Portal, assumes three WebServices: (1) database Query service, (2) Seismic event data request service, and (3) Seismic continuous data request service. Current system supports both station search of database Query service and seismic continuous data request service. Data centers supported by this NINJA portal will be OHP data center in ERI and Pacific21 data center in IFREE/JAMSTEC in the beginning. We have developed metadata standard for seismological data based on QuakeML for parametric data, which has been developed by ETH Zurich, and XML-SEED for waveform data, which was developed by IFREE/JAMSTEC. The prototype of NINJA portal is now released through IFREE web page (http://www.jamstec.go.jp/pacific21/).
hEIDI: An Intuitive Application Tool To Organize and Treat Large-Scale Proteomics Data.
Hesse, Anne-Marie; Dupierris, Véronique; Adam, Claire; Court, Magali; Barthe, Damien; Emadali, Anouk; Masselon, Christophe; Ferro, Myriam; Bruley, Christophe
2016-10-07
Advances in high-throughput proteomics have led to a rapid increase in the number, size, and complexity of the associated data sets. Managing and extracting reliable information from such large series of data sets require the use of dedicated software organized in a consistent pipeline to reduce, validate, exploit, and ultimately export data. The compilation of multiple mass-spectrometry-based identification and quantification results obtained in the context of a large-scale project represents a real challenge for developers of bioinformatics solutions. In response to this challenge, we developed a dedicated software suite called hEIDI to manage and combine both identifications and semiquantitative data related to multiple LC-MS/MS analyses. This paper describes how, through a user-friendly interface, hEIDI can be used to compile analyses and retrieve lists of nonredundant protein groups. Moreover, hEIDI allows direct comparison of series of analyses, on the basis of protein groups, while ensuring consistent protein inference and also computing spectral counts. hEIDI ensures that validated results are compliant with MIAPE guidelines as all information related to samples and results is stored in appropriate databases. Thanks to the database structure, validated results generated within hEIDI can be easily exported in the PRIDE XML format for subsequent publication. hEIDI can be downloaded from http://biodev.extra.cea.fr/docs/heidi .
Are We Ready for Another Change? Digital Signatures Can Change How We Handle the Academic Record
ERIC Educational Resources Information Center
Black, Thomas C.; Mohr, John
2004-01-01
In this electronic age, where information is digital and service is virtual, the registrar profession is changing rapidly to keep up with increasing standards and expectations. EDI and now XML standards enable system-to-system exchanges of academic records information. While many of the registrar's profession display student academic records under…
Modeling the Arden Syntax for medical decisions in XML.
Kim, Sukil; Haug, Peter J; Rocha, Roberto A; Choi, Inyoung
2008-10-01
A new model expressing Arden Syntax with the eXtensible Markup Language (XML) was developed to increase its portability. Every example was manually parsed and reviewed until the schema and the style sheet were considered to be optimized. When the first schema was finished, several MLMs in Arden Syntax Markup Language (ArdenML) were validated against the schema. They were then transformed to HTML formats with the style sheet, during which they were compared to the original text version of their own MLM. When faults were found in the transformed MLM, the schema and/or style sheet was fixed. This cycle continued until all the examples were encoded into XML documents. The original MLMs were encoded in XML according to the proposed XML schema and reverse-parsed MLMs in ArdenML were checked using a public domain Arden Syntax checker. Two hundred seventy seven examples of MLMs were successfully transformed into XML documents using the model, and the reverse-parse yielded the original text version of MLMs. Two hundred sixty five of the 277 MLMs showed the same error patterns before and after transformation, and all 11 errors related to statement structure were resolved in XML version. The model uses two syntax checking mechanisms, first an XML validation process, and second, a syntax check using an XSL style sheet. Now that we have a schema for ArdenML, we can also begin the development of style sheets for transformation ArdenML into other languages.
An improved resource management model based on MDS
NASA Astrophysics Data System (ADS)
Yuan, Man; Sun, Changying; Li, Pengfei; Sun, Yongdong; He, Rui
2005-11-01
GRID technology provides a kind of convenient method for managing GRID resources. This service is so-called monitoring, discovering service. This method is proposed by Globus Alliance, in this GRID environment, all kinds of resources, such as computational resources, storage resources and other resources can be organized by MDS specifications. However, this MDS is a theory framework, particularly, in a small world intranet, in the case of limit of resources, the MDS has its own limitation. Based on MDS, an improved light method for managing corporation computational resources and storage resources is proposed in intranet(IMDS). Firstly, in MDS, all kinds of resource description information is stored in LDAP, it is well known although LDAP is a light directory access protocol, in practice, programmers rarely master how to access and store resource information into LDAP store, in such way, it limits MDS to be used. So, in intranet, these resources' description information can be stored in RDBMS, programmers and users can access this information by standard SQL. Secondly, in MDS, how to monitor all kinds of resources in GRID is not transparent for programmers and users. In such way, it limits its application scope, in general, resource monitoring method base on SNMP is widely employed in intranet, therefore, a kind of resource monitoring method based on SNMP is integrated into MDS. Finally, all kinds of resources in the intranet can be described by XML, and all kinds of resources' description information is stored in RDBMS, such as MySql, and retrieved by standard SQL, dynamic information for all kinds of resources can be sent to resource storage by SNMP, A prototype resource description, monitoring is designed and implemented in intranet.
XML Content Finally Arrives on the Web!
ERIC Educational Resources Information Center
Funke, Susan
1998-01-01
Explains extensible markup language (XML) and how it differs from hypertext markup language (HTML) and standard generalized markup language (SGML). Highlights include features of XML, including better formatting of documents, better searching capabilities, multiple uses for hyperlinking, and an increase in Web applications; Web browsers; and what…
The Essen Learning Model--A Step towards a Representation of Learning Objectives.
ERIC Educational Resources Information Center
Bick, Markus; Pawlowski, Jan M.; Veith, Patrick
The importance of the Extensible Markup Language (XML) technology family in the field of Computer Assisted Learning (CAL) can not be denied. The Instructional Management Systems Project (IMS), for example, provides a learning resource XML binding specification. Considering this specification and other implementations using XML to represent…
XML Schema Languages: Beyond DTD.
ERIC Educational Resources Information Center
Ioannides, Demetrios
2000-01-01
Discussion of XML (extensible markup language) and the traditional DTD (document type definition) format focuses on efforts of the World Wide Web Consortium's XML schema working group to develop a schema language to replace DTD that will be capable of defining the set of constraints of any possible data resource. (Contains 14 references.) (LRW)
XML: A Publisher's Perspective.
ERIC Educational Resources Information Center
Andrews, Timothy M.
1999-01-01
Explains eXtensible Markup Language (XML) and describes how Dow Jones Interactive is using it to improve the news-gathering and dissemination process through intranets and the World Wide Web. Discusses benefits of using XML, the relationship to HyperText Markup Language (HTML), lack of available software tools and industry support, and future…
Data Access System for Hydrology
NASA Astrophysics Data System (ADS)
Whitenack, T.; Zaslavsky, I.; Valentine, D.; Djokic, D.
2007-12-01
As part of the CUAHSI HIS (Consortium of Universities for the Advancement of Hydrologic Science, Inc., Hydrologic Information System), the CUAHSI HIS team has developed Data Access System for Hydrology or DASH. DASH is based on commercial off the shelf technology, which has been developed in conjunction with a commercial partner, ESRI. DASH is a web-based user interface, developed in ASP.NET developed using ESRI ArcGIS Server 9.2 that represents a mapping, querying and data retrieval interface over observation and GIS databases, and web services. This is the front end application for the CUAHSI Hydrologic Information System Server. The HIS Server is a software stack that organizes observation databases, geographic data layers, data importing and management tools, and online user interfaces such as the DASH application, into a flexible multi- tier application for serving both national-level and locally-maintained observation data. The user interface of the DASH web application allows online users to query observation networks by location and attributes, selecting stations in a user-specified area where a particular variable was measured during a given time interval. Once one or more stations and variables are selected, the user can retrieve and download the observation data for further off-line analysis. The DASH application is highly configurable. The mapping interface can be configured to display map services from multiple sources in multiple formats, including ArcGIS Server, ArcIMS, and WMS. The observation network data is configured in an XML file where you specify the network's web service location and its corresponding map layer. Upon initial deployment, two national level observation networks (USGS NWIS daily values and USGS NWIS Instantaneous values) are already pre-configured. There is also an optional login page which can be used to restrict access as well as providing a alternative to immediate downloads. For large request, users would be notified via email with a link to their data when it is ready.
jmzML, an open-source Java API for mzML, the PSI standard for MS data.
Côté, Richard G; Reisinger, Florian; Martens, Lennart
2010-04-01
We here present jmzML, a Java API for the Proteomics Standards Initiative mzML data standard. Based on the Java Architecture for XML Binding and XPath-based XML indexer random-access XML parser, jmzML can handle arbitrarily large files in minimal memory, allowing easy and efficient processing of mzML files using the Java programming language. jmzML also automatically resolves internal XML references on-the-fly. The library (which includes a viewer) can be downloaded from http://jmzml.googlecode.com.
DeepBlue epigenomic data server: programmatic data retrieval and analysis of epigenome region sets.
Albrecht, Felipe; List, Markus; Bock, Christoph; Lengauer, Thomas
2016-07-08
Large amounts of epigenomic data are generated under the umbrella of the International Human Epigenome Consortium, which aims to establish 1000 reference epigenomes within the next few years. These data have the potential to unravel the complexity of epigenomic regulation. However, their effective use is hindered by the lack of flexible and easy-to-use methods for data retrieval. Extracting region sets of interest is a cumbersome task that involves several manual steps: identifying the relevant experiments, downloading the corresponding data files and filtering the region sets of interest. Here we present the DeepBlue Epigenomic Data Server, which streamlines epigenomic data analysis as well as software development. DeepBlue provides a comprehensive programmatic interface for finding, selecting, filtering, summarizing and downloading region sets. It contains data from four major epigenome projects, namely ENCODE, ROADMAP, BLUEPRINT and DEEP. DeepBlue comes with a user manual, examples and a well-documented application programming interface (API). The latter is accessed via the XML-RPC protocol supported by many programming languages. To demonstrate usage of the API and to enable convenient data retrieval for non-programmers, we offer an optional web interface. DeepBlue can be openly accessed at http://deepblue.mpi-inf.mpg.de. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Adopting and adapting a commercial view of web services for the Navy
NASA Astrophysics Data System (ADS)
Warner, Elizabeth; Ladner, Roy; Katikaneni, Uday; Petry, Fred
2005-05-01
Web Services are being adopted as the enabling technology to provide net-centric capabilities for many Department of Defense operations. The Navy Enterprise Portal, for example, is Web Services-based, and the Department of the Navy is promulgating guidance for developing Web Services. Web Services, however, only constitute a baseline specification that provides the foundation on which users, under current approaches, write specialized applications in order to retrieve data over the Internet. Application development may increase dramatically as the number of different available Web Services increases. Reasons for specialized application development include XML schema versioning differences, adoption/use of diverse business rules, security access issues, and time/parameter naming constraints, among others. We are currently developing for the US Navy a system which will improve delivery of timely and relevant meteorological and oceanographic (MetOc) data to the warfighter. Our objective is to develop an Advanced MetOc Broker (AMB) that leverages Web Services technology to identify, retrieve and integrate relevant MetOc data in an automated manner. The AMB will utilize a Mediator, which will be developed by applying ontological research and schema matching techniques to MetOc forms of data. The AMB, using the Mediator, will support a new, advanced approach to the use of Web Services; namely, the automated identification, retrieval and integration of MetOc data. Systems based on this approach will then not require extensive end-user application development for each Web Service from which data can be retrieved. Users anywhere on the globe will be able to receive timely environmental data that fits their particular needs.
Tirado-Ramos, Alfredo; Hu, Jingkun; Lee, K P
2002-01-01
Supplement 23 to DICOM (Digital Imaging and Communications for Medicine), Structured Reporting, is a specification that supports a semantically rich representation of image and waveform content, enabling experts to share image and related patient information. DICOM SR supports the representation of textual and coded data linked to images and waveforms. Nevertheless, the medical information technology community needs models that work as bridges between the DICOM relational model and open object-oriented technologies. The authors assert that representations of the DICOM Structured Reporting standard, using object-oriented modeling languages such as the Unified Modeling Language, can provide a high-level reference view of the semantically rich framework of DICOM and its complex structures. They have produced an object-oriented model to represent the DICOM SR standard and have derived XML-exchangeable representations of this model using World Wide Web Consortium specifications. They expect the model to benefit developers and system architects who are interested in developing applications that are compliant with the DICOM SR specification.
Word add-in for ontology recognition: semantic enrichment of scientific literature.
Fink, J Lynn; Fernicola, Pablo; Chandran, Rahul; Parastatidis, Savas; Wade, Alex; Naim, Oscar; Quinn, Gregory B; Bourne, Philip E
2010-02-24
In the current era of scientific research, efficient communication of information is paramount. As such, the nature of scholarly and scientific communication is changing; cyberinfrastructure is now absolutely necessary and new media are allowing information and knowledge to be more interactive and immediate. One approach to making knowledge more accessible is the addition of machine-readable semantic data to scholarly articles. The Word add-in presented here will assist authors in this effort by automatically recognizing and highlighting words or phrases that are likely information-rich, allowing authors to associate semantic data with those words or phrases, and to embed that data in the document as XML. The add-in and source code are publicly available at http://www.codeplex.com/UCSDBioLit. The Word add-in for ontology term recognition makes it possible for an author to add semantic data to a document as it is being written and it encodes these data using XML tags that are effectively a standard in life sciences literature. Allowing authors to mark-up their own work will help increase the amount and quality of machine-readable literature metadata.
Information Metacatalog for a Grid
NASA Technical Reports Server (NTRS)
Kolano, Paul
2007-01-01
SWIM is a Software Information Metacatalog that gathers detailed information about the software components and packages installed on a grid resource. Information is currently gathered for Executable and Linking Format (ELF) executables and shared libraries, Java classes, shell scripts, and Perl and Python modules. SWIM is built on top of the POUR framework, which is described in the preceding article. SWIM consists of a set of Perl modules for extracting software information from a system, an XML schema defining the format of data that can be added by users, and a POUR XML configuration file that describes how these elements are used to generate periodic, on-demand, and user-specified information. Periodic software information is derived mainly from the package managers used on each system. SWIM collects information from native package managers in FreeBSD, Solaris, and IRX as well as the RPM, Perl, and Python package managers on multiple platforms. Because not all software is available, or installed in package form, SWIM also crawls the set of relevant paths from the File System Hierarchy Standard that defines the standard file system structure used by all major UNIX distributions. Using these two techniques, the vast majority of software installed on a system can be located. SWIM computes the same information gathered by the periodic routines for specific files on specific hosts, and locates software on a system given only its name and type.
ERIC Educational Resources Information Center
Herrera-Viedma, Enrique; Peis, Eduardo
2003-01-01
Presents a fuzzy evaluation method of SGML documents based on computing with words. Topics include filtering the amount of information available on the Web to assist users in their search processes; document type definitions; linguistic modeling; user-system interaction; and use with XML and other markup languages. (Author/LRW)
A Dozen Primers on Important Information Standards
ERIC Educational Resources Information Center
Dempsey, Kathy, Comp.
2007-01-01
This is a compilation of 12 primers on important information standards and protocols. These primers are: (1) Atom; (2) COinS; (3) MADS; (4) MARC 21/MARCXML; (5) MIX; (6) MXG; (7) OpenSearch; (8) PREMIS; (9) RESTful HTTP; (10) unAPI; (11) XMPP (aka Jabber); and (12) ZeeRex. The Atom Syndication Format defines a new XML-based syndication format for…
Broadening the horizon – level 2.5 of the HUPO-PSI format for molecular interactions
Kerrien, Samuel; Orchard, Sandra; Montecchi-Palazzi, Luisa; Aranda, Bruno; Quinn, Antony F; Vinod, Nisha; Bader, Gary D; Xenarios, Ioannis; Wojcik, Jérôme; Sherman, David; Tyers, Mike; Salama, John J; Moore, Susan; Ceol, Arnaud; Chatr-aryamontri, Andrew; Oesterheld, Matthias; Stümpflen, Volker; Salwinski, Lukasz; Nerothin, Jason; Cerami, Ethan; Cusick, Michael E; Vidal, Marc; Gilson, Michael; Armstrong, John; Woollard, Peter; Hogue, Christopher; Eisenberg, David; Cesareni, Gianni; Apweiler, Rolf; Hermjakob, Henning
2007-01-01
Background Molecular interaction Information is a key resource in modern biomedical research. Publicly available data have previously been provided in a broad array of diverse formats, making access to this very difficult. The publication and wide implementation of the Human Proteome Organisation Proteomics Standards Initiative Molecular Interactions (HUPO PSI-MI) format in 2004 was a major step towards the establishment of a single, unified format by which molecular interactions should be presented, but focused purely on protein-protein interactions. Results The HUPO-PSI has further developed the PSI-MI XML schema to enable the description of interactions between a wider range of molecular types, for example nucleic acids, chemical entities, and molecular complexes. Extensive details about each supported molecular interaction can now be captured, including the biological role of each molecule within that interaction, detailed description of interacting domains, and the kinetic parameters of the interaction. The format is supported by data management and analysis tools and has been adopted by major interaction data providers. Additionally, a simpler, tab-delimited format MITAB2.5 has been developed for the benefit of users who require only minimal information in an easy to access configuration. Conclusion The PSI-MI XML2.5 and MITAB2.5 formats have been jointly developed by interaction data producers and providers from both the academic and commercial sector, and are already widely implemented and well supported by an active development community. PSI-MI XML2.5 enables the description of highly detailed molecular interaction data and facilitates data exchange between databases and users without loss of information. MITAB2.5 is a simpler format appropriate for fast Perl parsing or loading into Microsoft Excel. PMID:17925023
Kiah, M L Mat; Nabi, Mohamed S; Zaidan, B B; Zaidan, A A
2013-10-01
This study aims to provide security solutions for implementing electronic medical records (EMRs). E-Health organizations could utilize the proposed method and implement recommended solutions in medical/health systems. Majority of the required security features of EMRs were noted. The methods used were tested against each of these security features. In implementing the system, the combination that satisfied all of the security features of EMRs was selected. Secure implementation and management of EMRs facilitate the safeguarding of the confidentiality, integrity, and availability of e-health organization systems. Health practitioners, patients, and visitors can use the information system facilities safely and with confidence anytime and anywhere. After critically reviewing security and data transmission methods, a new hybrid method was proposed to be implemented on EMR systems. This method will enhance the robustness, security, and integration of EMR systems. The hybrid of simple object access protocol/extensible markup language (XML) with advanced encryption standard and secure hash algorithm version 1 has achieved the security requirements of an EMR system with the capability of integrating with other systems through the design of XML messages.
The carbohydrate sequence markup language (CabosML): an XML description of carbohydrate structures.
Kikuchi, Norihiro; Kameyama, Akihiko; Nakaya, Shuuichi; Ito, Hiromi; Sato, Takashi; Shikanai, Toshihide; Takahashi, Yoriko; Narimatsu, Hisashi
2005-04-15
Bioinformatics resources for glycomics are very poor as compared with those for genomics and proteomics. The complexity of carbohydrate sequences makes it difficult to define a common language to represent them, and the development of bioinformatics tools for glycomics has not progressed. In this study, we developed a carbohydrate sequence markup language (CabosML), an XML description of carbohydrate structures. The language definition (XML Schema) and an experimental database of carbohydrate structures using an XML database management system are available at http://www.phoenix.hydra.mki.co.jp/CabosDemo.html kikuchi@hydra.mki.co.jp.
Structured Course Objects in a Digital Library
NASA Technical Reports Server (NTRS)
Maly, K.; Zubair, M.; Liu, X.; Nelson, M.; Zeil, S.
1999-01-01
We are developing an Undergraduate Digital Library Framework (UDLF) that will support creation/archiving of courses and reuse of existing course material to evolve courses. UDLF supports the publication of course materials for later instantiation for a specific offering and allows the addition of time-dependent and student-specific information and structures. Instructors and, depending on permissions, students can access the general course materials or the materials for a specific offering. We are building a reference implementation based on NCSTRL+, a digital library derived from NCSTRL. Digital objects in NCSTRL+ are called buckets, self-contained entities that carry their own methods for access and display. Current bucket implementations have a two level structure of packages and elements. This is not a rich enough structure for course objects in UDLF. Typically, courses can only be modeled as a multilevel hierarchy and among different courses, both the syntax and semantics of terms may vary. Therefore, we need a mechanism to define, within a particular library, course models, their constituent objects, and the associated semantics in a flexible, extensible way. In this paper, we describe our approach to define and implement these multilayered course objects. We use XML technology to emulate complex data structures within the NCSTRL+ buckets. We have developed authoring and browsing tools to manipulate these course objects. In our current implementation a user downloading an XML based course bucket also downloads the XML-aware tools: an applet that enables the user to edit or browse the bucket. We claim that XML provides an effective means to represent multi-level structure of a course bucket.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hurst, Aaron M.
A data structure based on an eXtensible Markup Language (XML) hierarchy according to experimental nuclear structure data in the Evaluated Nuclear Structure Data File (ENSDF) is presented. A Python-coded translator has been developed to interpret the standard one-card records of the ENSDF datasets, together with their associated quantities defined according to field position, and generate corresponding representative XML output. The quantities belonging to this mixed-record format are described in the ENSDF manual. Of the 16 ENSDF records in total, XML output has been successfully generated for 15 records. An XML-translation for the Comment Record is yet to be implemented; thismore » will be considered in a separate phase of the overall translation effort. Continuation records, not yet implemented, will also be treated in a future phase of this work. Several examples are presented in this document to illustrate the XML schema and methods for handling the various ENSDF data types. However, the proposed nomenclature for the XML elements and attributes need not necessarily be considered as a fixed set of constructs. Indeed, better conventions may be suggested and a consensus can be achieved amongst the various groups of people interested in this project. The main purpose here is to present an initial phase of the translation effort to demonstrate the feasibility of interpreting ENSDF datasets and creating a representative XML-structured hierarchy for data storage.« less
Technical Communication, Knowledge Management, and XML.
ERIC Educational Resources Information Center
Applen, J. D.
2002-01-01
Describes how technical communicators can become involved in knowledge management. Examines how technical communicators can teach organizations to design, access, and contribute to databases; alert them to new information; and facilitate trust and sharing. Concludes that successful technical communicators would do well to establish a culture that…
Designing and Managing Your Digital Library.
ERIC Educational Resources Information Center
Guenther, Kim
2000-01-01
Discusses digital libraries and Web site design issues. Highlights include accessibility issues, including standards, markup languages like HTML and XML, and metadata; building virtual communities; the use of Web portals for customized delivery of information; quality assurance tools, including data mining; and determining user needs, including…
KEGGtranslator: visualizing and converting the KEGG PATHWAY database to various formats.
Wrzodek, Clemens; Dräger, Andreas; Zell, Andreas
2011-08-15
The KEGG PATHWAY database provides a widely used service for metabolic and nonmetabolic pathways. It contains manually drawn pathway maps with information about the genes, reactions and relations contained therein. To store these pathways, KEGG uses KGML, a proprietary XML-format. Parsers and translators are needed to process the pathway maps for usage in other applications and algorithms. We have developed KEGGtranslator, an easy-to-use stand-alone application that can visualize and convert KGML formatted XML-files into multiple output formats. Unlike other translators, KEGGtranslator supports a plethora of output formats, is able to augment the information in translated documents (e.g. MIRIAM annotations) beyond the scope of the KGML document, and amends missing components to fragmentary reactions within the pathway to allow simulations on those. KEGGtranslator is freely available as a Java(™) Web Start application and for download at http://www.cogsys.cs.uni-tuebingen.de/software/KEGGtranslator/. KGML files can be downloaded from within the application. clemens.wrzodek@uni-tuebingen.de Supplementary data are available at Bioinformatics online.
Adding Hierarchical Objects to Relational Database General-Purpose XML-Based Information Managements
NASA Technical Reports Server (NTRS)
Lin, Shu-Chun; Knight, Chris; La, Tracy; Maluf, David; Bell, David; Tran, Khai Peter; Gawdiak, Yuri
2006-01-01
NETMARK is a flexible, high-throughput software system for managing, storing, and rapid searching of unstructured and semi-structured documents. NETMARK transforms such documents from their original highly complex, constantly changing, heterogeneous data formats into well-structured, common data formats in using Hypertext Markup Language (HTML) and/or Extensible Markup Language (XML). The software implements an object-relational database system that combines the best practices of the relational model utilizing Structured Query Language (SQL) with those of the object-oriented, semantic database model for creating complex data. In particular, NETMARK takes advantage of the Oracle 8i object-relational database model using physical-address data types for very efficient keyword searches of records across both context and content. NETMARK also supports multiple international standards such as WEBDAV for drag-and-drop file management and SOAP for integrated information management using Web services. The document-organization and -searching capabilities afforded by NETMARK are likely to make this software attractive for use in disciplines as diverse as science, auditing, and law enforcement.
Adaptive Hypermedia Educational System Based on XML Technologies.
ERIC Educational Resources Information Center
Baek, Yeongtae; Wang, Changjong; Lee, Sehoon
This paper proposes an adaptive hypermedia educational system using XML technologies, such as XML, XSL, XSLT, and XLink. Adaptive systems are capable of altering the presentation of the content of the hypermedia on the basis of a dynamic understanding of the individual user. The user profile can be collected in a user model, while the knowledge…
EquiX-A Search and Query Language for XML.
ERIC Educational Resources Information Center
Cohen, Sara; Kanza, Yaron; Kogan, Yakov; Sagiv, Yehoshua; Nutt, Werner; Serebrenik, Alexander
2002-01-01
Describes EquiX, a search language for XML that combines querying with searching to query the data and the meta-data content of Web pages. Topics include search engines; a data model for XML documents; search query syntax; search query semantics; an algorithm for evaluating a query on a document; and indexing EquiX queries. (LRW)
Internet Patient Records: new techniques
Moehrs, Sascha; Anedda, Paolo; Tuveri, Massimiliano; Zanetti, Gianluigi
2001-01-01
Background The ease by which the Internet is able to distribute information to geographically-distant users on a wide variety of computers makes it an obvious candidate for a technological solution for electronic patient record systems. Indeed, second-generation Internet technologies such as the ones described in this article - XML (eXtensible Markup Language), XSL (eXtensible Style Language), DOM (Document Object Model), CSS (Cascading Style Sheet), JavaScript, and JavaBeans - may significantly reduce the complexity of the development of distributed healthcare systems. Objective The demonstration of an experimental Electronic Patient Record (EPR) system built from those technologies that can support viewing of medical imaging exams and graphically-rich clinical reporting tools, while conforming to the newly emerging XML standard for digital documents. In particular, we aim to promote rapid prototyping of new reports by clinical specialists. Methods We have built a prototype EPR client, InfoDOM, that runs in both the popular web browsers. In this second version it receives each EPR as an XML record served via the secure SSL (Secure Socket Layer) protocol. JavaBean software components manipulate the XML to store it and then to transform it into a variety of useful clinical views. First a web page summary for the patient is produced. From that web page other JavaBeans can be launched. In particular, we have developed a medical imaging exam Viewer and a clinical Reporter bean parameterized appropriately for the particular patient and exam in question. Both present particular views of the XML data. The Viewer reads image sequences from a patient-specified network URL on a PACS (Picture Archiving and Communications System) server and presents them in a user-controllable animated sequence, while the Reporter provides a configurable anatomical map of the site of the pathology, from which individual "reportlets" can be launched. The specification of these reportlets is achieved using standard HTML forms and thus may conceivably be authored by clinical specialists. A generic JavaScript library has been written that allows the seamless incorporation of such contributions into the InfoDOM client. In conjunction with another JavaBean, that library renders graphically-enhanced reporting tools that read and write content to and from the XML data-structure, ready for resubmission to the EPR server. Results We demonstrate the InfoDOM experimental EPR system that is currently being adapted for test-bed use in three hospitals in Cagliari, Italy. For this we are working with specialists in neurology, radiology, and epilepsy. Conclusions Early indications are that the rapid prototyping of reports afforded by our EPR system can assist communication between clinical specialists and our system developers. We are now experimenting with new technologies that may provide services to the kind of XML EPR client described here. PMID:11720950
Lee, Ken Ka-Yin; Tang, Wai-Choi; Choi, Kup-Sze
2013-04-01
Clinical data are dynamic in nature, often arranged hierarchically and stored as free text and numbers. Effective management of clinical data and the transformation of the data into structured format for data analysis are therefore challenging issues in electronic health records development. Despite the popularity of relational databases, the scalability of the NoSQL database model and the document-centric data structure of XML databases appear to be promising features for effective clinical data management. In this paper, three database approaches--NoSQL, XML-enabled and native XML--are investigated to evaluate their suitability for structured clinical data. The database query performance is reported, together with our experience in the databases development. The results show that NoSQL database is the best choice for query speed, whereas XML databases are advantageous in terms of scalability, flexibility and extensibility, which are essential to cope with the characteristics of clinical data. While NoSQL and XML technologies are relatively new compared to the conventional relational database, both of them demonstrate potential to become a key database technology for clinical data management as the technology further advances. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Using SWE Standards for Ubiquitous Environmental Sensing: A Performance Analysis
Tamayo, Alain; Granell, Carlos; Huerta, Joaquín
2012-01-01
Although smartphone applications represent the most typical data consumer tool from the citizen perspective in environmental applications, they can also be used for in-situ data collection and production in varied scenarios, such as geological sciences and biodiversity. The use of standard protocols, such as SWE, to exchange information between smartphones and sensor infrastructures brings benefits such as interoperability and scalability, but their reliance on XML is a potential problem when large volumes of data are transferred, due to limited bandwidth and processing capabilities on mobile phones. In this article we present a performance analysis about the use of SWE standards in smartphone applications to consume and produce environmental sensor data, analysing to what extent the performance problems related to XML can be alleviated by using alternative uncompressed and compressed formats.
Method and system to discover and recommend interesting documents
Potok, Thomas Eugene; Steed, Chad Allen; Patton, Robert Matthew
2017-01-31
Disclosed are several examples of systems that can read millions of news feeds per day about topics (e.g., your customers, competitors, markets, and partners), and provide a small set of the most relevant items to read to keep current with the overwhelming amount of information currently available. Topics of interest can be chosen by the user of the system for use as seeds. The seeds can be vectorized and compared with the target documents to determine their similarity. The similarities can be sorted from highest to lowest so that the most similar seed and target documents are at the top of the list. This output can be produced in XML format so that an RSS Reader can format the XML. This allows for easy Internet access to these recommendations.
Telemetry Attributes Transfer Standard (TMATS) Handbook
2017-01-01
information regarding the project’s security classification guide and/or downgrading information should be provided as a comment. (G\\ COM ) G\\SC:U; The...96TH TEST WING 412TH TEST WING ARNOLD ENGINEERING DEVELOPMENT COMPLEX NATIONAL AERONAUTICS AND SPACE ADMINISTRATION This page intentionally...TMATS) Handbook, RCC Document 124-17, January 2017 v Figure 2-16. “Look and Feel” D-Group Engine Temperature Measurement Example XML
An adaptable XML based approach for scientific data management and integration
NASA Astrophysics Data System (ADS)
Wang, Fusheng; Thiel, Florian; Furrer, Daniel; Vergara-Niedermayr, Cristobal; Qin, Chen; Hackenberg, Georg; Bourgue, Pierre-Emmanuel; Kaltschmidt, David; Wang, Mo
2008-03-01
Increased complexity of scientific research poses new challenges to scientific data management. Meanwhile, scientific collaboration is becoming increasing important, which relies on integrating and sharing data from distributed institutions. We develop SciPort, a Web-based platform on supporting scientific data management and integration based on a central server based distributed architecture, where researchers can easily collect, publish, and share their complex scientific data across multi-institutions. SciPort provides an XML based general approach to model complex scientific data by representing them as XML documents. The documents capture not only hierarchical structured data, but also images and raw data through references. In addition, SciPort provides an XML based hierarchical organization of the overall data space to make it convenient for quick browsing. To provide generalization, schemas and hierarchies are customizable with XML-based definitions, thus it is possible to quickly adapt the system to different applications. While each institution can manage documents on a Local SciPort Server independently, selected documents can be published to a Central Server to form a global view of shared data across all sites. By storing documents in a native XML database, SciPort provides high schema extensibility and supports comprehensive queries through XQuery. By providing a unified and effective means for data modeling, data access and customization with XML, SciPort provides a flexible and powerful platform for sharing scientific data for scientific research communities, and has been successfully used in both biomedical research and clinical trials.
An Adaptable XML Based Approach for Scientific Data Management and Integration.
Wang, Fusheng; Thiel, Florian; Furrer, Daniel; Vergara-Niedermayr, Cristobal; Qin, Chen; Hackenberg, Georg; Bourgue, Pierre-Emmanuel; Kaltschmidt, David; Wang, Mo
2008-02-20
Increased complexity of scientific research poses new challenges to scientific data management. Meanwhile, scientific collaboration is becoming increasing important, which relies on integrating and sharing data from distributed institutions. We develop SciPort, a Web-based platform on supporting scientific data management and integration based on a central server based distributed architecture, where researchers can easily collect, publish, and share their complex scientific data across multi-institutions. SciPort provides an XML based general approach to model complex scientific data by representing them as XML documents. The documents capture not only hierarchical structured data, but also images and raw data through references. In addition, SciPort provides an XML based hierarchical organization of the overall data space to make it convenient for quick browsing. To provide generalization, schemas and hierarchies are customizable with XML-based definitions, thus it is possible to quickly adapt the system to different applications. While each institution can manage documents on a Local SciPort Server independently, selected documents can be published to a Central Server to form a global view of shared data across all sites. By storing documents in a native XML database, SciPort provides high schema extensibility and supports comprehensive queries through XQuery. By providing a unified and effective means for data modeling, data access and customization with XML, SciPort provides a flexible and powerful platform for sharing scientific data for scientific research communities, and has been successfully used in both biomedical research and clinical trials.
TMATS/ IHAL/ DDML Schema Validation
2017-02-01
task was to create a method for performing IRIG eXtensible Markup Language (XML) schema validation. As opposed to XML instance document validation...TMATS / IHAL / DDML Schema Validation, RCC 126-17, February 2017 vii Acronyms DDML Data Display Markup Language HUD heads-up display iNET...system XML eXtensible Markup Language TMATS / IHAL / DDML Schema Validation, RCC 126-17, February 2017 viii This page intentionally left blank
Federal Register 2010, 2011, 2012, 2013, 2014
2013-05-16
... posting CSV file samples. Order No. 770 revised the process for filing EQRs. Pursuant to Order No. 770, one of the new processes for filing allows EQRs to be filed using an XML file. The XML schema that is needed to file EQRs in this manner is now posted on the Commission's Web site at http://www.ferc.gov/docs...
Developing Modular and Adaptable Courseware Using TeachML.
ERIC Educational Resources Information Center
Wehner, Frank; Lorz, Alexander
This paper presents the use of an XML grammar for two complementary projects--CHAMELEON (Cooperative Hypermedia Adaptive MultimEdia Learning Objects) and EIT (Enabling Informal Teamwork). Areas of applications are modular courseware documents and the collaborative authoring process of didactical units. A number of requirements for a suitable…
Yoichiro Nambu and the Mechanism of Spontaneous Broken Symmetries in
Subatomic Physics RSS Archive Videos XML DOE R&D Accomplishments DOE R&D Mechanism of Spontaneous Broken Symmetries in Subatomic Physics Resources with Additional Information Physics "for the discovery of the mechanism of spontaneous broken symmetry in subatomic physics"
Teaching XBRL to Graduate Business Students: A Hands-On Approach
ERIC Educational Resources Information Center
Pinsker, Robert
2004-01-01
EXtensible Business Reporting Language (XBRL) is a non-proprietary, computer language that has many uses. Known primarily as the Extensible Markup Language (XML) for business reporting, XBRL allows entities to report their business information (i.e., financial statements, announcements, etc.) on the Internet and communicate with other entities'…
Operation and Maintenance Support Information (OMSI) Creation, Management, and Repurposing With XML
2004-09-01
engines that cost tens of thousands of dollars. There are many middleware applications on the commercial and open-source market . The “Big Four......planners can begin an incremental planning effort early in the facility construction phase. This thesis provides a non-proprietary, no- cost solution to
A new XML-based query language, CSRML, has been developed for representing chemical substructures, molecules, reaction rules, and reactions. CSRML queries are capable of integrating additional forms of information beyond the simple substructure (e.g., SMARTS) or reaction transfor...
CytometryML binary data standards
NASA Astrophysics Data System (ADS)
Leif, Robert C.
2005-03-01
CytometryML is a proposed new Analytical Cytology (Cytomics) data standard, which is based on a common set of XML schemas for encoding flow cytometry and digital microscopy text based data types (metadata). CytometryML schemas reference both DICOM (Digital Imaging and Communications in Medicine) codes and FCS keywords. Flow Cytometry Standard (FCS) list-mode has been mapped to the DICOM Waveform Information Object. The separation of the large binary data objects (list mode and image data) from the XML description of the metadata permits the metadata to be directly displayed, analyzed, and reported with standard commercial software packages; the direct use of XML languages; and direct interfacing with clinical information systems. The separation of the binary data into its own files simplifies parsing because all extraneous header data has been eliminated. The storage of images as two-dimensional arrays without any extraneous data, such as in the Adobe Photoshop RAW format, facilitates the development by scientists of their own analysis and visualization software. Adobe Photoshop provided the display infrastructure and the translation facility to interconvert between the image data from commercial formats and RAW format. Similarly, the storage and parsing of list mode binary data type with a group of parameters that are specified at compilation time is straight forward. However when the user is permitted at run-time to select a subset of the parameters and/or specify results of mathematical manipulations, the development of special software was required. The use of CytometryML will permit investigators to be able to create their own interoperable data analysis software and to employ commercially available software to disseminate their data.
Sánchez-de-Madariaga, Ricardo; Muñoz, Adolfo; Lozano-Rubí, Raimundo; Serrano-Balazote, Pablo; Castro, Antonio L; Moreno, Oscar; Pascual, Mario
2017-08-18
The objective of this research is to compare the relational and non-relational (NoSQL) database systems approaches in order to store, recover, query and persist standardized medical information in the form of ISO/EN 13606 normalized Electronic Health Record XML extracts, both in isolation and concurrently. NoSQL database systems have recently attracted much attention, but few studies in the literature address their direct comparison with relational databases when applied to build the persistence layer of a standardized medical information system. One relational and two NoSQL databases (one document-based and one native XML database) of three different sizes have been created in order to evaluate and compare the response times (algorithmic complexity) of six different complexity growing queries, which have been performed on them. Similar appropriate results available in the literature have also been considered. Relational and non-relational NoSQL database systems show almost linear algorithmic complexity query execution. However, they show very different linear slopes, the former being much steeper than the two latter. Document-based NoSQL databases perform better in concurrency than in isolation, and also better than relational databases in concurrency. Non-relational NoSQL databases seem to be more appropriate than standard relational SQL databases when database size is extremely high (secondary use, research applications). Document-based NoSQL databases perform in general better than native XML NoSQL databases. EHR extracts visualization and edition are also document-based tasks more appropriate to NoSQL database systems. However, the appropriate database solution much depends on each particular situation and specific problem.
QuakeML: XML for Seismological Data Exchange and Resource Metadata Description
NASA Astrophysics Data System (ADS)
Euchner, F.; Schorlemmer, D.; Becker, J.; Heinloo, A.; Kästli, P.; Saul, J.; Weber, B.; QuakeML Working Group
2007-12-01
QuakeML is an XML-based data exchange format for seismology that is under development. Current collaborators are from ETH, GFZ, USC, USGS, IRIS DMC, EMSC, ORFEUS, and ISTI. QuakeML development was motivated by the lack of a widely accepted and well-documented data format that is applicable to a broad range of fields in seismology. The development team brings together expertise from communities dealing with analysis and creation of earthquake catalogs, distribution of seismic bulletins, and real-time processing of seismic data. Efforts to merge QuakeML with existing XML dialects are under way. The first release of QuakeML will cover a basic description of seismic events including picks, arrivals, amplitudes, magnitudes, origins, focal mechanisms, and moment tensors. Further extensions are in progress or planned, e.g., for macroseismic information, location probability density functions, slip distributions, and ground motion information. The QuakeML language definition is supplemented by a concept to provide resource metadata and facilitate metadata exchange between distributed data providers. For that purpose, we introduce unique, location-independent identifiers of seismological resources. As an application of QuakeML, ETH Zurich currently develops a Python-based seismicity analysis toolkit as a contribution to CSEP (Collaboratory for the Study of Earthquake Predictability). We follow a collaborative and transparent development approach along the lines of the procedures of the World Wide Web Consortium (W3C). QuakeML currently is in working draft status. The standard description will be subjected to a public Request for Comments (RFC) process and eventually reach the status of a recommendation. QuakeML can be found at http://www.quakeml.org.
National Space Science Data Center Information Model
NASA Astrophysics Data System (ADS)
Bell, E. V.; McCaslin, P.; Grayzeck, E.; McLaughlin, S. A.; Kodis, J. M.; Morgan, T. H.; Williams, D. R.; Russell, J. L.
2013-12-01
The National Space Science Data Center (NSSDC) was established by NASA in 1964 to provide for the preservation and dissemination of scientific data from NASA missions. It has evolved to support distributed, active archives that were established in the Planetary, Astrophysics, and Heliophysics disciplines through a series of Memoranda of Understanding. The disciplines took over responsibility for working with new projects to acquire and distribute data for community researchers while the NSSDC remained vital as a deep archive. Since 2000, NSSDC has been using the Archive Information Package to preserve data over the long term. As part of its effort to streamline the ingest of data into the deep archive, the NSSDC developed and implemented a data model of desired and required metadata in XML. This process, in use for roughly five years now, has been successfully used to support the identification and ingest of data into the NSSDC archive, most notably those data from the Planetary Data System (PDS) submitted under PDS3. A series of software packages (X-ware) were developed to handle the submission of data from the PDS nodes utilizing a volume structure. An XML submission manifest is generated at the PDS provider site prior to delivery to NSSDC. The manifest ensures the fidelity of PDS data delivered to NSSDC. Preservation metadata is captured in an XML object when NSSDC archives the data. With the recent adoption by the PDS of the XML-based PDS4 data model, there is an opportunity for the NSSDC to provide additional services to the PDS such as the preservation, tracking, and restoration of individual products (e.g., a specific data file or document), which was unfeasible in the previous PDS3 system. The NSSDC is modifying and further streamlining its data ingest process to take advantage of the PDS4 model, an important consideration given the ever-increasing amount of data being generated and archived by orbiting missions at the Moon and Mars, other active projects such as BRRISON, LADEE, MAVEN, INSIGHT, OSIRIS-REX and ground-based observatories. Streamlining the ingest process also benefits the continued processing of PDS3 data. We will report on our progress and status.
Unified Science Information Model for SoilSCAPE using the Mercury Metadata Search System
NASA Astrophysics Data System (ADS)
Devarakonda, Ranjeet; Lu, Kefa; Palanisamy, Giri; Cook, Robert; Santhana Vannan, Suresh; Moghaddam, Mahta Clewley, Dan; Silva, Agnelo; Akbar, Ruzbeh
2013-12-01
SoilSCAPE (Soil moisture Sensing Controller And oPtimal Estimator) introduces a new concept for a smart wireless sensor web technology for optimal measurements of surface-to-depth profiles of soil moisture using in-situ sensors. The objective is to enable a guided and adaptive sampling strategy for the in-situ sensor network to meet the measurement validation objectives of spaceborne soil moisture sensors such as the Soil Moisture Active Passive (SMAP) mission. This work is being carried out at the University of Michigan, the Massachusetts Institute of Technology, University of Southern California, and Oak Ridge National Laboratory. At Oak Ridge National Laboratory we are using Mercury metadata search system [1] for building a Unified Information System for the SoilSCAPE project. This unified portal primarily comprises three key pieces: Distributed Search/Discovery; Data Collections and Integration; and Data Dissemination. Mercury, a Federally funded software for metadata harvesting, indexing, and searching would be used for this module. Soil moisture data sources identified as part of this activity such as SoilSCAPE and FLUXNET (in-situ sensors), AirMOSS (airborne retrieval), SMAP (spaceborne retrieval), and are being indexed and maintained by Mercury. Mercury would be the central repository of data sources for cal/val for soil moisture studies and would provide a mechanism to identify additional data sources. Relevant metadata from existing inventories such as ORNL DAAC, USGS Clearinghouse, ARM, NASA ECHO, GCMD etc. would be brought in to this soil-moisture data search/discovery module. The SoilSCAPE [2] metadata records will also be published in broader metadata repositories such as GCMD, data.gov. Mercury can be configured to provide a single portal to soil moisture information contained in disparate data management systems located anywhere on the Internet. Mercury is able to extract, metadata systematically from HTML pages or XML files using a variety of methods including OAI-PMH [3]. The Mercury search interface then allows users to perform simple, fielded, spatial and temporal searches across a central harmonized index of metadata. Mercury supports various metadata standards including FGDC, ISO-19115, DIF, Dublin-Core, Darwin-Core, and EML. This poster describes in detail how Mercury implements the Unified Science Information Model for Soil moisture data. References: [1]Devarakonda R., et al. Mercury: reusable metadata management, data discovery and access system. Earth Science Informatics (2010), 3(1): 87-94. [2]Devarakonda R., et al. Daymet: Single Pixel Data Extraction Tool. http://daymet.ornl.gov/singlepixel.html (2012). Last Accesses 10-01-2013 [3]Devarakonda R., et al. Data sharing and retrieval using OAI-PMH. Earth Science Informatics (2011), 4(1): 1-5.
NASA Astrophysics Data System (ADS)
Euchner, F.; Schorlemmer, D.; Kästli, P.; Quakeml Group, T
2008-12-01
QuakeML is an XML-based exchange format for seismological data which is being developed using a community-driven approach. It covers basic event description, including picks, arrivals, amplitudes, magnitudes, origins, focal mechanisms, and moment tensors. Contributions have been made from ETH, GFZ, USC, SCEC, USGS, IRIS DMC, EMSC, ORFEUS, GNS, ZAMG, BRGM, and ISTI. The current release (Version 1.1, Proposed Recommendation) reflects the results of a public Request for Comments process which has been documented online at http://quakeml.org/RFC_BED_1.0. QuakeML has recently been adopted as a distribution format for earthquake catalogs by GNS Science, New Zealand, and the European-Mediterranean Seismological Centre (EMSC). These institutions provide prototype QuakeML web services. Furthermore, integration of the QuakeML data model in the CSEP (Collaboratory for the Study of Earthquake Predictability, http://www.cseptesting.org) testing center software developed by SCEC is under way. QuakePy is a Python- based seismicity analysis toolkit which is based on the QuakeML data model. Recently, QuakePy has been used to implement the PMC method for calculating network recording completeness (Schorlemmer and Woessner 2008, in press). Completeness results for seismic networks in Southern California and Japan can be retrieved through the CompletenessWeb (http://completenessweb.org). Future QuakeML development will include an extension for macroseismic information. Furthermore, development on seismic inventory information, resource identifiers, and resource metadata is under way. Online resources: http://www.quakeml.org, http://www.quakepy.org
Intelligent Visualization of Geo-Information on the Future Web
NASA Astrophysics Data System (ADS)
Slusallek, P.; Jochem, R.; Sons, K.; Hoffmann, H.
2012-04-01
Visualization is a key component of the "Observation Web" and will become even more important in the future as geo data becomes more widely accessible. The common statement that "Data that cannot be seen, does not exist" is especially true for non-experts, like most citizens. The Web provides the most interesting platform for making data easily and widely available. However, today's Web is not well suited for the interactive visualization and exploration that is often needed for geo data. Support for 3D data was added only recently and at an extremely low level (WebGL), but even the 2D visualization capabilities of HTML e.g. (images, canvas, SVG) are rather limited, especially regarding interactivity. We have developed XML3D as an extension to HTML-5. It allows for compactly describing 2D and 3D data directly as elements of an HTML-5 document. All graphics elements are part of the Document Object Model (DOM) and can be manipulated via the same set of DOM events and methods that millions of Web developers use on a daily basis. Thus, XML3D makes highly interactive 2D and 3D visualization easily usable, not only for geo data. XML3D is supported by any WebGL-capable browser but we also provide native implementations in Firefox and Chromium. As an example, we show how OpenStreetMap data can be mapped directly to XML3D and visualized interactively in any Web page. We show how this data can be easily augmented with additional data from the Web via a few lines of Javascript. We also show how embedded semantic data (via RDFa) allows for linking the visualization back to the data's origin, thus providing an immersive interface for interacting with and modifying the original data. XML3D is used as key input for standardization within the W3C Community Group on "Declarative 3D for the Web" chaired by the DFKI and has recently been selected as one of the Generic Enabler for the EU Future Internet initiative.
A Unified Mathematical Definition of Classical Information Retrieval.
ERIC Educational Resources Information Center
Dominich, Sandor
2000-01-01
Presents a unified mathematical definition for the classical models of information retrieval and identifies a mathematical structure behind relevance feedback. Highlights include vector information retrieval; probabilistic information retrieval; and similarity information retrieval. (Contains 118 references.) (Author/LRW)
Concept-based query language approach to enterprise information systems
NASA Astrophysics Data System (ADS)
Niemi, Timo; Junkkari, Marko; Järvelin, Kalervo
2014-01-01
In enterprise information systems (EISs) it is necessary to model, integrate and compute very diverse data. In advanced EISs the stored data often are based both on structured (e.g. relational) and semi-structured (e.g. XML) data models. In addition, the ad hoc information needs of end-users may require the manipulation of data-oriented (structural), behavioural and deductive aspects of data. Contemporary languages capable of treating this kind of diversity suit only persons with good programming skills. In this paper we present a concept-oriented query language approach to manipulate this diversity so that the programming skill requirements are considerably reduced. In our query language, the features which need technical knowledge are hidden in application-specific concepts and structures. Therefore, users need not be aware of the underlying technology. Application-specific concepts and structures are represented by the modelling primitives of the extended RDOOM (relational deductive object-oriented modelling) which contains primitives for all crucial real world relationships (is-a relationship, part-of relationship, association), XML documents and views. Our query language also supports intensional and extensional-intensional queries, in addition to conventional extensional queries. In its query formulation, the end-user combines available application-specific concepts and structures through shared variables.
Word add-in for ontology recognition: semantic enrichment of scientific literature
2010-01-01
Background In the current era of scientific research, efficient communication of information is paramount. As such, the nature of scholarly and scientific communication is changing; cyberinfrastructure is now absolutely necessary and new media are allowing information and knowledge to be more interactive and immediate. One approach to making knowledge more accessible is the addition of machine-readable semantic data to scholarly articles. Results The Word add-in presented here will assist authors in this effort by automatically recognizing and highlighting words or phrases that are likely information-rich, allowing authors to associate semantic data with those words or phrases, and to embed that data in the document as XML. The add-in and source code are publicly available at http://www.codeplex.com/UCSDBioLit. Conclusions The Word add-in for ontology term recognition makes it possible for an author to add semantic data to a document as it is being written and it encodes these data using XML tags that are effectively a standard in life sciences literature. Allowing authors to mark-up their own work will help increase the amount and quality of machine-readable literature metadata. PMID:20181245
Labeling RDF Graphs for Linear Time and Space Querying
NASA Astrophysics Data System (ADS)
Furche, Tim; Weinzierl, Antonius; Bry, François
Indices and data structures for web querying have mostly considered tree shaped data, reflecting the view of XML documents as tree-shaped. However, for RDF (and when querying ID/IDREF constraints in XML) data is indisputably graph-shaped. In this chapter, we first study existing indexing and labeling schemes for RDF and other graph datawith focus on support for efficient adjacency and reachability queries. For XML, labeling schemes are an important part of the widespread adoption of XML, in particular for mapping XML to existing (relational) database technology. However, the existing indexing and labeling schemes for RDF (and graph data in general) sacrifice one of the most attractive properties of XML labeling schemes, the constant time (and per-node space) test for adjacency (child) and reachability (descendant). In the second part, we introduce the first labeling scheme for RDF data that retains this property and thus achieves linear time and space processing of acyclic RDF queries on a significantly larger class of graphs than previous approaches (which are mostly limited to tree-shaped data). Finally, we show how this labeling scheme can be applied to (acyclic) SPARQL queries to obtain an evaluation algorithm with time and space complexity linear in the number of resources in the queried RDF graph.
Fuel Economy Label and CAFE Data Inventory
The Fuel Economy Label and CAFE Data asset contains measured summary fuel economy estimates and test data for light-duty vehicle manufacturers by model for certification as required under the Energy Policy and Conservation Act of 1975 (EPCA) and The Energy Independent Security Act of 2007 (EISA) to collect vehicle fuel economy estimates for the creation of Economy Labels and for the calculation of Corporate Average Fuel Economy (CAFE). Manufacturers submit data on an annual basis, or as needed to document vehicle model changes.The EPA performs targeted fuel economy confirmatory tests on approximately 15% of vehicles submitted for validation. Confirmatory data on vehicles is associated with its corresponding submission data to verify the accuracy of manufacturer submissions beyond standard business rules. Submitted data comes in XML format or as documents, with the majority of submissions being sent in XML, and includes descriptive information on the vehicle itself, fuel economy information, and the manufacturer's testing approach. This data may contain proprietary information (CBI) such as information on estimated sales or other data elements indicated by the submitter as confidential. CBI data is not publically available; however, within the EPA data can accessed under the restrictions of the Office of Transportation and Air Quality (OTAQ) CBI policy [RCS Link]. Datasets are segmented by vehicle model/manufacturer and/or year with corresponding fuel economy, te
Report of Official foreign Travel to Spain April 17-29, 1999. (in English;)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mason, j.d.
The Department of Energy (DOE) has moved rapidly toward electronic production, management, and dissemination of scientific and technical information. The World-Wide Web (WWW) has become a primary means of information dissemination. Electronic commerce (EC) is becoming the preferred means of procurement. DOE, like other government agencies, depends on and encourages the use of international standards in data communications. Like most government agencies, DOE has expressed a preference for openly developed standards in preference to proprietary designs promoted as "standards" by vendors. In particular, there is a preference for standards developed by organizations such as the International Organization for Standardization (ISO)more » and the American National Standards Institute (ANSI) that use open, public processes to develop their standards. Among the most widely adopted international standards is the Standard Generalized Markup Language (SGML, ISO 8879:1986, FIPS 152), which DOE has selected as the basis of its electronic management of documents. Besides the official commitment, which has resulted in several specialized projects, DOE makes heavy use of coding derived from SGML, and its use is likely to increase in the future. Most documents on the WWW are coded in HTML ("Hypertext Markup Language"), which is an application of SGML. The World-Wide Web Consortium (W3C), with the backing of major software houses like Microsoft, Adobe, and Netscape, is promoting XML ("eXtensible Markup Language"), a class of SGML applications, for the future of the WWW and the basis for EC. W3C has announced its intention of discontinuing future development of HTML and replacing it with XHTML, an application of XML. In support of DOE's use of these standards, I have served since 1985 as Chairman of the international committee responsible for SGML and related standards, ISO/IEC JTC1/SC34 (SC34) and its predecessor organizations. During my April 1999 trip, I convened the spring 1999 meeting of SC34 in Granada, Spain. I also attended a major conference on the use of SGML and XML. SC34 maintains and continues to enhance several standards. In addition to SGML, which is the basis of HTML and XML, SC34 also works on the Document Style Semantics and Specification Language (DSSSL), which is the basis for W3C's XSL ("eXtensible Style Language," to be used with XML) and the Hypermedia/Time-based Document Structuring Language (HyTime), which is a major influence on W3C's XLink ("XML Linking Language"). SC34 is also involved in work with ISO's TC184, Industrial Data, on the linking of STEP (the standard for the interchange of product model data) with SGML. In addition to the widespread use of the WWW among DOE's plants and facilities in Oak Ridge and among DOE sites across the nation, there are several SGML-based projects at the Y-12 Plant. My project team in Information Technology Services developed an SGML-based publications system that has been used for several major reports at the Y-12 Plant and Oak Ridge National Laboratory (ORNL). SGML is a component of the Weapons Records Archiving and Preservation (WRAP) project at the Y-12 Plant and is the format for catalog metadata chosen for weapons records by the Nuclear Weapons Information Group (NWIG). Supporting standards development allows DOE and the Y-12 plant both input into the process and the opportunity to benefit from contact with some of the leading experts in the subject matter. Oak Ridge has been for some years the location to which other DOE sites turn for expertise in SGML and related topics.« less
Connecting geoscience systems and data using Linked Open Data in the Web of Data
NASA Astrophysics Data System (ADS)
Ritschel, Bernd; Neher, Günther; Iyemori, Toshihiko; Koyama, Yukinobu; Yatagai, Akiyo; Murayama, Yasuhiro; Galkin, Ivan; King, Todd; Fung, Shing F.; Hughes, Steve; Habermann, Ted; Hapgood, Mike; Belehaki, Anna
2014-05-01
Linked Data or Linked Open Data (LOD) in the realm of free and publically accessible data is one of the most promising and most used semantic Web frameworks connecting various types of data and vocabularies including geoscience and related domains. The semantic Web extension to the commonly existing and used World Wide Web is based on the meaning of entities and relationships or in different words classes and properties used for data in a global data and information space, the Web of Data. LOD data is referenced and mash-uped by URIs and is retrievable using simple parameter controlled HTTP-requests leading to a result which is human-understandable or machine-readable. Furthermore the publishing and mash-up of data in the semantic Web realm is realized by specific Web standards, such as RDF, RDFS, OWL and SPARQL defined for the Web of Data. Semantic Web based mash-up is the Web method to aggregate and reuse various contents from different sources, such as e.g. using FOAF as a model and vocabulary for the description of persons and organizations -in our case- related to geoscience projects, instruments, observations, data and so on. On the example of three different geoscience data and information management systems, such as ESPAS, IUGONET and GFZ ISDC and the associated science data and related metadata or better called context data, the concept of the mash-up of systems and data using the semantic Web approach and the Linked Open Data framework is described in this publication. Because the three systems are based on different data models, data storage structures and technical implementations an extra semantic Web layer upon the existing interfaces is used for mash-up solutions. In order to satisfy the semantic Web standards, data transition processes, such as the transfer of content stored in relational databases or mapped in XML documents into SPARQL capable databases or endpoints using D2R or XSLT is necessary. In addition, the use of mapped and/or merged domain specific and cross-domain vocabularies in the sense of terminological ontologies are the foundation for a virtually unified data retrieval and access in IUGONET, ESPAS and GFZ ISDC data management systems. SPARQL endpoints realized either by originally RDF databases, e.g. Virtuoso or by virtual SPARQL endpoints, e.g. D2R services enable an only upon Web standard-based mash-up of domain-specific systems and data, such as in this case the space weather and geomagnetic domain but also cross-domain connection to data and vocabularies, e.g. related to NASA's VxOs, particularly VWO or NASA's PDS data system within LOD. LOD - Linked Open Data RDF - Resource Description Framework RDFS - RDF Schema OWL - Ontology Web Language SPARQL - SPARQL Protocol and RDF Query Language FOAF - Friends of a Friend ontology ESPAS - Near Earth Space Data Infrastructure for e-Science (Project) IUGONET - Inter-university Upper Atmosphere Global Observation Network (Project) GFZ ISDC - German Research Centre for Geosciences Information System and Data Center XML - Extensible Mark-up Language D2R - (Relational) Database to RDF (Transformation) XSLT - Extensible Stylesheet Language Transformation Virtuoso - OpenLink Virtuoso Universal Server (including RDF data management) NASA - National Aeronautics and Space Administration VOx - Virtual Observatories VWO - Virtual Wave Observatory PDS - Planetary Data System
Yang, Chunguang G; Granite, Stephen J; Van Eyk, Jennifer E; Winslow, Raimond L
2006-11-01
Protein identification using MS is an important technique in proteomics as well as a major generator of proteomics data. We have designed the protein identification data object model (PDOM) and developed a parser based on this model to facilitate the analysis and storage of these data. The parser works with HTML or XML files saved or exported from MASCOT MS/MS ions search in peptide summary report or MASCOT PMF search in protein summary report. The program creates PDOM objects, eliminates redundancy in the input file, and has the capability to output any PDOM object to a relational database. This program facilitates additional analysis of MASCOT search results and aids the storage of protein identification information. The implementation is extensible and can serve as a template to develop parsers for other search engines. The parser can be used as a stand-alone application or can be driven by other Java programs. It is currently being used as the front end for a system that loads HTML and XML result files of MASCOT searches into a relational database. The source code is freely available at http://www.ccbm.jhu.edu and the program uses only free and open-source Java libraries.
A New Publicly Available Chemical Query Language, CSRML ...
A new XML-based query language, CSRML, has been developed for representing chemical substructures, molecules, reaction rules, and reactions. CSRML queries are capable of integrating additional forms of information beyond the simple substructure (e.g., SMARTS) or reaction transformation (e.g., SMIRKS, reaction SMILES) queries currently in use. Chemotypes, a term used to represent advanced CSRML queries for repeated application can be encoded not only with connectivity and topology, but also with properties of atoms, bonds, electronic systems, or molecules. The CSRML language has been developed in parallel with a public set of chemotypes, i.e., the ToxPrint chemotypes, which are designed to provide excellent coverage of environmental, regulatory and commercial use chemical space, as well as to represent features and frameworks believed to be especially relevant to toxicity concerns. A software application, ChemoTyper, has also been developed and made publicly available to enable chemotype searching and fingerprinting against a target structure set. The public ChemoTyper houses the ToxPrint chemotype CSRML dictionary, as well as reference implementation so that the query specifications may be adopted by other chemical structure knowledge systems. The full specifications of the XML standard used in CSRML-based chemotypes are publicly available to facilitate and encourage the exchange of structural knowledge. Paper details specifications for a new XML-based query lan
Wolff, A C; Mludek, V; van der Haak, M; Bork, W; Bülzebruck, H; Drings, P; Schmücker, P; Wannenmacher, M; Haux, R
2001-01-01
Communication between different institutions which are responsible for the treatment of the same patient is of outstanding significance, especially in the field of tumor diseases. Regional electronic patient records could support the co-operation of different institutions by providing ac-cess to all necessary information whether it belongs to the own institution or to a partner. The Department of Medical Informatics, University of Heidelberg is performing a project in co-operation with the Thoraxclinic-Heidelberg and the Department of Clinical Radiology, University of Heidelberg with the goal: to define an architectural concept for interlinking the electronic patient record of the two clinical institutions to build a common virtual electronic patient record and carry out an exemplary implementation, to examine composition, structure and content of medical documents for tumor patients with the aim of defining an XML-based markup language allowing summarizing overviews and suitable granularities, and to integrate clinical practice guidelines and other external knowledge with the electronic patient record using XML-technologies to support the physician in the daily decision process. This paper will show, how a regional electronic patient record could be built on an architectural level and describe elementary steps towards a on content-oriented structuring of medical records.
SU-E-T-327: The Update of a XML Composing Tool for TrueBeam Developer Mode
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yan, Y; Mao, W; Jiang, S
2014-06-01
Purpose: To introduce a major upgrade of a novel XML beam composing tool to scientists and engineers who strive to translate certain capabilities of TrueBeam Developer Mode to future clinical benefits of radiation therapy. Methods: TrueBeam Developer Mode provides the users with a test bed for unconventional plans utilizing certain unique features not accessible at the clinical mode. To access the full set of capabilities, a XML beam definition file accommodating all parameters including kV/MV imaging triggers in the plan can be locally loaded at this mode, however it is difficult and laborious to compose one in a text editor.more » In this study, a stand-along interactive XML beam composing application, TrueBeam TeachMod, was developed on Windows platforms to assist users in making their unique plans in a WYSWYG manner. A conventional plan can be imported in a DICOM RT object as the start of the beam editing process in which trajectories of all axes of a TrueBeam machine can be modified to the intended values at any control point. TeachMod also includes libraries of predefined imaging and treatment procedures to further expedite the process. Results: The TeachMod application is a major of the TeachMod module within DICOManTX. It fully supports TrueBeam 2.0. Trajectories of all axes including all MLC leaves can be graphically rendered and edited as needed. The time for XML beam composing has been reduced to a negligible amount regardless the complexity of the plan. A good understanding of XML language and TrueBeam schema is not required though preferred. Conclusion: Creating XML beams manually in a text editor will be a lengthy error-prone process for sophisticated plans. A XML beam composing tool is highly desirable for R and D activities. It will bridge the gap between scopes of TrueBeam capabilities and their clinical application potentials.« less
Klee, Kathrin; Ernst, Rebecca; Spannagl, Manuel; Mayer, Klaus F X
2007-08-30
Apollo, a genome annotation viewer and editor, has become a widely used genome annotation and visualization tool for distributed genome annotation projects. When using Apollo for annotation, database updates are carried out by uploading intermediate annotation files into the respective database. This non-direct database upload is laborious and evokes problems of data synchronicity. To overcome these limitations we extended the Apollo data adapter with a generic, configurable web service client that is able to retrieve annotation data in a GAME-XML-formatted string and pass it on to Apollo's internal input routine. This Apollo web service adapter, Apollo2Go, simplifies the data exchange in distributed projects and aims to render the annotation process more comfortable. The Apollo2Go software is freely available from ftp://ftpmips.gsf.de/plants/apollo_webservice.
Klee, Kathrin; Ernst, Rebecca; Spannagl, Manuel; Mayer, Klaus FX
2007-01-01
Background Apollo, a genome annotation viewer and editor, has become a widely used genome annotation and visualization tool for distributed genome annotation projects. When using Apollo for annotation, database updates are carried out by uploading intermediate annotation files into the respective database. This non-direct database upload is laborious and evokes problems of data synchronicity. Results To overcome these limitations we extended the Apollo data adapter with a generic, configurable web service client that is able to retrieve annotation data in a GAME-XML-formatted string and pass it on to Apollo's internal input routine. Conclusion This Apollo web service adapter, Apollo2Go, simplifies the data exchange in distributed projects and aims to render the annotation process more comfortable. The Apollo2Go software is freely available from . PMID:17760972
A Strategy for Reusing the Data of Electronic Medical Record Systems for Clinical Research.
Matsumura, Yasushi; Hattori, Atsushi; Manabe, Shiro; Tsuda, Tsutomu; Takeda, Toshihiro; Okada, Katsuki; Murata, Taizo; Mihara, Naoki
2016-01-01
There is a great need to reuse data stored in electronic medical records (EMR) databases for clinical research. We previously reported the development of a system in which progress notes and case report forms (CRFs) were simultaneously recorded using a template in the EMR in order to exclude redundant data entry. To make the data collection process more efficient, we are developing a system in which the data originally stored in the EMR database can be populated within a frame in a template. We developed interface plugin modules that retrieve data from the databases of other EMR applications. A universal keyword written in a template master is converted to a local code using a data conversion table, then the objective data is retrieved from the corresponding database. The template element data, which are entered by a template, are stored in the template element database. To retrieve the data entered by other templates, the objective data is designated by the template element code with the template code, or by the concept code if it is written for the element. When the application systems in the EMR generate documents, they also generate a PDF file and a corresponding document profile XML, which includes important data, and send them to the document archive server and the data sharing saver, respectively. In the data sharing server, the data are represented by an item with an item code with a document class code and its value. By linking a concept code to an item identifier, an objective data can be retrieved by designating a concept code. We employed a flexible strategy in which a unique identifier for a hospital is initially attached to all of the data that the hospital generates. The identifier is secondarily linked with concept codes. The data that are not linked with a concept code can also be retrieved using the unique identifier of the hospital. This strategy makes it possible to reuse any of a hospital's data.
Blinov, Michael L.; Moraru, Ion I.
2011-01-01
Multi-state molecules and multi-component complexes are commonly involved in cellular signaling. Accounting for molecules that have multiple potential states, such as a protein that may be phosphorylated on multiple residues, and molecules that combine to form heterogeneous complexes located among multiple compartments, generates an effect of combinatorial complexity. Models involving relatively few signaling molecules can include thousands of distinct chemical species. Several software tools (StochSim, BioNetGen) are already available to deal with combinatorial complexity. Such tools need information standards if models are to be shared, jointly evaluated and developed. Here we discuss XML conventions that can be adopted for modeling biochemical reaction networks described by user-specified reaction rules. These could form a basis for possible future extensions of the Systems Biology Markup Language (SBML). PMID:21464833
Automating Disk Forensic Processing with SleuthKit, XML and Python
2009-05-01
1 Automating Disk Forensic Processing with SleuthKit, XML and Python Simson L. Garfinkel Abstract We have developed a program called fiwalk which...files themselves. We show how it is relatively simple to create automated disk forensic applications using a Python module we have written that reads...software that the portable device may contain. Keywords: Computer Forensics; XML; Sleuth Kit; Python I. INTRODUCTION In recent years we have found many
Federal Register 2010, 2011, 2012, 2013, 2014
2011-03-29
... the PDF or XML version of Form No. 549D as a method to eFile quarterly data and whether they intend on... fillable Form No. 549D PDF and XML to file data \\2\\ pursuant to Order Nos. 735 and 735-A.\\3\\ [[Page 17414... PDF ( http://www.ferc.gov/docs-filing/forms/form-549d/form-549d.pdf ) or XML ( http://www.ferc.gov...
2003-03-01
PXSLServlet Paul A. Open Source Relational x X 23 Tchistopolskii sql2dtd David Mertz Public domain Relational x -- sql2xml Scott Hathaway Public...March 2003. [Hunter 2001] Hunter, David ; Cagle, Kurt; Dix, Chris; Kovack, Roger; Pinnock, Jonathan, Rafter, Jeff; Beginning XML (2nd Edition...Postgraduate School Monterey, California 4. Curt Blais Naval Postgraduate School Monterey, California 5 Erik Chaum NAVSEA Undersea
The New Wizard War: Challenges and Opportunities for Electronic Warfare in the Information Age
2007-11-06
Camp: Preparing for Conflict in the Information Age (Santa Monica, CA: RAND Corporation, 1997):175. 17. Jeffrey R . Cares, “An Information Age Combat...60. Stephen Trimble, “US Army Moves Back Into Electronic Attack Mission.” 61. Richard R . Burgess, “Jamming: The Marine Corps Refines Its Vision of...November 7, 2005), http://www.aviationweek.com/aw/generic/story_generic.jsp?channel= awst &id=news/11075p 2.xml (accessed 29 Oct 07). 74. David A
National Greenhouse Gas Emission Inventory (EV-GHG)
The EV-GHG Mobile Source Data asset contains measured mobile source GHG emissions summary compliance information on light-duty vehicles, by model, for certification as required by the 1990 Amendments to the Clean Air Act, and as driven by the 2010 Presidential Memorandum Regarding Fuel Efficiency and the 2005 Supreme Court ruling in Massachusetts v. EPA that supported the regulation of CO2 as a pollutant. Manufacturers submit data on an annual basis, or as needed to document vehicle model changes. This asset will be expanded to include medium and heavy duty vehicles in the future.The EPA performs targeted GHG emissions tests on approximately 15% of vehicles submitted for certification. Confirmatory data on vehicles is associated with its corresponding submission data to verify the accuracy of manufacturer submissions beyond standard business rules.Submitted data comes in XML format or as documents, with the majority of submissions sent in XML, and includes descriptive information on the vehicle itself, emissions information, and the manufacturer's testing approach. This data may contain proprietary information (CBI) such as information on estimated sales or other data elements indicated by the submitter as confidential. CBI data is not publically available; however, CBI data can accessed within EPA under the restrictions of the Office of Transportation and Air Quality (OTAQ) CBI policy [RCS Link]. Pollutants data includes CO2, CH4, N2O. Datasets are divided by v
The EV-GHG Mobile Source Data asset contains measured mobile source GHG emissions summary compliance information on light-duty vehicles, by model, for certification as required by the 1990 Amendments to the Clean Air Act, and as driven by the 2010 Presidential Memorandum Regarding Fuel Efficiency and the 2005 Supreme Court ruling in Massachusetts v. EPA that supported the regulation of CO2 as a pollutant. Manufacturers submit data on an annual basis, or as needed to document vehicle model changes. This asset will be expanded to include medium and heavy duty vehicles in the future.The EPA performs targeted GHG emissions tests on approximately 15% of vehicles submitted for certification. Confirmatory data on vehicles is associated with its corresponding submission data to verify the accuracy of manufacturer submissions beyond standard business rules.Submitted data comes in XML format or as documents, with the majority of submissions sent in XML, and includes descriptive information on the vehicle itself, emissions information, and the manufacturer's testing approach. This data may contain proprietary information (CBI) such as information on estimated sales or other data elements indicated by the submitter as confidential. CBI data is not publically available; however, CBI data can accessed within EPA under the restrictions of the Office of Transportation and Air Quality (OTAQ) CBI policy [RCS Link]. Pollutants data includes CO2, CH4, N2O. Datasets are divided by v
Geothermal Data | Geospatial Data Science | NREL
Identified Onshore Geopressured Geothermal Energy in Texas and Louisiana provides additional information on Geothermal Data Geothermal Data These datasets detail the geothermal resource available in the Metadata Geothermal Zip 5.4 MB 03/05/2009 geothermal.xml This dataset is a qualitative assessment of
Learning the Semantics of Structured Data Sources
ERIC Educational Resources Information Center
Taheriyan, Mohsen
2015-01-01
Information sources such as relational databases, spreadsheets, XML, JSON, and Web APIs contain a tremendous amount of structured data, however, they rarely provide a semantic model to describe their contents. Semantic models of data sources capture the intended meaning of data sources by mapping them to the concepts and relationships defined by a…
Developing a Markup Language for Encoding Graphic Content in Plan Documents
ERIC Educational Resources Information Center
Li, Jinghuan
2009-01-01
While deliberating and making decisions, participants in urban development processes need easy access to the pertinent content scattered among different plans. A Planning Markup Language (PML) has been proposed to represent the underlying structure of plans in an XML-compliant way. However, PML currently covers only textual information and lacks…
Now That We've Found the "Hidden Web," What Can We Do with It?
ERIC Educational Resources Information Center
Cole, Timothy W.; Kaczmarek, Joanne; Marty, Paul F.; Prom, Christopher J.; Sandore, Beth; Shreeves, Sarah
The Open Archives Initiative (OAI) Protocol for Metadata Harvesting (PMH) is designed to facilitate discovery of the "hidden web" of scholarly information, such as that contained in databases, finding aids, and XML documents. OAI-PMH supports standardized exchange of metadata describing items in disparate collections, of such as those…
Application of Rough Sets to Information Retrieval.
ERIC Educational Resources Information Center
Miyamoto, Sadaaki
1998-01-01
Develops a method of rough retrieval, an application of the rough set theory to information retrieval. The aim is to: (1) show that rough sets are naturally applied to information retrieval in which categorized information structure is used; and (2) show that a fuzzy retrieval scheme is induced from the rough retrieval. (AEF)
2010-03-01
to a graphics card , and not the redesign of XML. The justification is that if XML is going to be prevalent, special optimized hardware is...the answer, similar to the specialized functions of a video card . Given the Moore’s law that processing power doubles every few years, let the...and numerous multimedia players such as iTunes from Apple. These applications are free to use, but the source is restricted by software licenses
A browser-based tool for conversion between Fortran NAMELIST and XML/HTML
NASA Astrophysics Data System (ADS)
Naito, O.
A browser-based tool for conversion between Fortran NAMELIST and XML/HTML is presented. It runs on an HTML5 compliant browser and generates reusable XML files to aid interoperability. It also provides a graphical interface for editing and annotating variables in NAMELIST, hence serves as a primitive code documentation environment. Although the tool is not comprehensive, it could be viewed as a test bed for integrating legacy codes into modern systems.
XML schemas for common bioinformatic data types and their application in workflow systems
Seibel, Philipp N; Krüger, Jan; Hartmeier, Sven; Schwarzer, Knut; Löwenthal, Kai; Mersch, Henning; Dandekar, Thomas; Giegerich, Robert
2006-01-01
Background Today, there is a growing need in bioinformatics to combine available software tools into chains, thus building complex applications from existing single-task tools. To create such workflows, the tools involved have to be able to work with each other's data – therefore, a common set of well-defined data formats is needed. Unfortunately, current bioinformatic tools use a great variety of heterogeneous formats. Results Acknowledging the need for common formats, the Helmholtz Open BioInformatics Technology network (HOBIT) identified several basic data types used in bioinformatics and developed appropriate format descriptions, formally defined by XML schemas, and incorporated them in a Java library (BioDOM). These schemas currently cover sequence, sequence alignment, RNA secondary structure and RNA secondary structure alignment formats in a form that is independent of any specific program, thus enabling seamless interoperation of different tools. All XML formats are available at , the BioDOM library can be obtained at . Conclusion The HOBIT XML schemas and the BioDOM library simplify adding XML support to newly created and existing bioinformatic tools, enabling these tools to interoperate seamlessly in workflow scenarios. PMID:17087823
Web Services and Other Enhancements at the Northern California Earthquake Data Center
NASA Astrophysics Data System (ADS)
Neuhauser, D. S.; Zuzlewski, S.; Allen, R. M.
2012-12-01
The Northern California Earthquake Data Center (NCEDC) provides data archive and distribution services for seismological and geophysical data sets that encompass northern California. The NCEDC is enhancing its ability to deliver rapid information through Web Services. NCEDC Web Services use well-established web server and client protocols and REST software architecture to allow users to easily make queries using web browsers or simple program interfaces and to receive the requested data in real-time rather than through batch or email-based requests. Data are returned to the user in the appropriate format such as XML, RESP, or MiniSEED depending on the service, and are compatible with the equivalent IRIS DMC web services. The NCEDC is currently providing the following Web Services: (1) Station inventory and channel response information delivered in StationXML format, (2) Channel response information delivered in RESP format, (3) Time series availability delivered in text and XML formats, (4) Single channel and bulk data request delivered in MiniSEED format. The NCEDC is also developing a rich Earthquake Catalog Web Service to allow users to query earthquake catalogs based on selection parameters such as time, location or geographic region, magnitude, depth, azimuthal gap, and rms. It will return (in QuakeML format) user-specified results that can include simple earthquake parameters, as well as observations such as phase arrivals, codas, amplitudes, and computed parameters such as first motion mechanisms, moment tensors, and rupture length. The NCEDC will work with both IRIS and the International Federation of Digital Seismograph Networks (FDSN) to define a uniform set of web service specifications that can be implemented by multiple data centers to provide users with a common data interface across data centers. The NCEDC now hosts earthquake catalogs and waveforms from the US Department of Energy (DOE) Enhanced Geothermal Systems (EGS) monitoring networks. These data can be accessed through the above web services and through special NCEDC web pages.
XGI: a graphical interface for XQuery creation.
Li, Xiang; Gennari, John H; Brinkley, James F
2007-10-11
XML has become the default standard for data exchange among heterogeneous data sources, and in January 2007 XQuery (XML Query language) was recommended by the World Wide Web Consortium as the query language for XML. However, XQuery is a complex language that is difficult for non-programmers to learn. We have therefore developed XGI (XQuery Graphical Interface), a visual interface for graphically generating XQuery. In this paper we demonstrate the functionality of XGI through its application to a biomedical XML dataset. We describe the system architecture and the features of XGI in relation to several existing querying systems, we demonstrate the system's usability through a sample query construction, and we discuss a preliminary evaluation of XGI. Finally, we describe some limitations of the system, and our plans for future improvements.
Meystre, Stéphane M; Lee, Sanghoon; Jung, Chai Young; Chevrier, Raphaël D
2012-08-01
An increasing need for collaboration and resources sharing in the Natural Language Processing (NLP) research and development community motivates efforts to create and share a common data model and a common terminology for all information annotated and extracted from clinical text. We have combined two existing standards: the HL7 Clinical Document Architecture (CDA), and the ISO Graph Annotation Format (GrAF; in development), to develop such a data model entitled "CDA+GrAF". We experimented with several methods to combine these existing standards, and eventually selected a method wrapping separate CDA and GrAF parts in a common standoff annotation (i.e., separate from the annotated text) XML document. Two use cases, clinical document sections, and the 2010 i2b2/VA NLP Challenge (i.e., problems, tests, and treatments, with their assertions and relations), were used to create examples of such standoff annotation documents, and were successfully validated with the XML schemata provided with both standards. We developed a tool to automatically translate annotation documents from the 2010 i2b2/VA NLP Challenge format to GrAF, and automatically generated 50 annotation documents using this tool, all successfully validated. Finally, we adapted the XSL stylesheet provided with HL7 CDA to allow viewing annotation XML documents in a web browser, and plan to adapt existing tools for translating annotation documents between CDA+GrAF and the UIMA and GATE frameworks. This common data model may ease directly comparing NLP tools and applications, combining their output, transforming and "translating" annotations between different NLP applications, and eventually "plug-and-play" of different modules in NLP applications. Copyright © 2011 Elsevier Inc. All rights reserved.
Report of Official Foreign Travel to France May 8-27, 1998
DOE Office of Scientific and Technical Information (OSTI.GOV)
mason, j d
1998-06-11
The Department of Energy (DOE) has moved ever more rapidly towards electronic production, management, and dissemination of scientific and technical information. The World-Wide Web (WWW) has become a primary means of information dissemination. Electronic commerce (EC) is becoming the preferred means of procurement. DOE, like other government agencies, depends on and encourages the use of international standards in data communications. Among the most widely adopted standards is the Standard Generalized Markup Language (SGML, ISO 8879:1986, FIPS 152), which DOE has selected as the basis of its electronic management of documents. Besides the official commitment, which has resulted in several specializedmore » projects, DOE makes heavy use of coding derived from SGML, and its use is likely to increase in the future. Most documents on the WWW are coded in HTML (Hypertext Markup Language), which is an application of SGML. The World-Wide Web Consortium (W3C), with the backing of major software houses like Microsoft, Adobe, and Netscape, is promoting XML (eXtensible Markup Language), a class of SGML applications, for the future of the WWW and the basis for EC. In support of DOE's use of these standards, I have served since 1985 as Convenor of the international committee responsible for SGML and related standards, ISO/IEC JTC1/WG4 (WG4). During this trip I convened the spring 1998 meeting of WG4 in Paris, France. I also attended a major conference on the use of SGML and XML. At the close of the conference, I chaired a workshop of standards developers looking at ways of improving online searching of electronic documents. Note: Since the end of the meetings in France, JTC1 has raised the level of WG4 to a full Subcommittee; its designator is now ISO/IEC JTC1/SC34. WG4 maintains and continues to enhance several standards. In addition to SGML, which is the basis of HTML and XML, WG4 also works on the Document Style Semantics and Specification Language (DSSSL), which is the basis for the W3C's XSL (eXtensible Style Language, to be used with XML) and the Hypermedia/Time-based Document Structuring Language (HyTime), which is a major influence on the W3C's XLink (XML Linking Language). WG4 is also involved in work with the ISO's TC184, Industrial Data, on the linking of STEP (the standard for the interchange of product model data) with SGML. In addition to the widespread use of the WWW among DOE's plants and facilities in Oak Ridge and among DOE sites across the nation, there are several SGML-based projects at the Y-12 Plant. My project team in Information Technology Services has developed an SGML-based publications system that has been used for several major reports at the Y-12 Plant and Oak Ridge National Laboratory (ORNL). SGML is a component of the Weapons Records Archiving and Preservation (WRAP) project at Y-12 and is the format for catalog metadata chosen for weapons records by the Nuclear Weapons Information Group (NWIG). Supporting standards development allows DOE and Y-12 both input into the process and the opportunity to benefit from contact with some of the leading experts in the subject matter. Oak Ridge has been for some years the location to which other DOE sites turn for expertise in SGML and related topics.« less
Enhanced Information Retrieval Using AJAX
NASA Astrophysics Data System (ADS)
Kachhwaha, Rajendra; Rajvanshi, Nitin
2010-11-01
Information Retrieval deals with the representation, storage, organization of, and access to information items. The representation and organization of information items should provide the user with easy access to the information with the rapid development of Internet, large amounts of digitally stored information is readily available on the World Wide Web. This information is so huge that it becomes increasingly difficult and time consuming for the users to find the information relevant to their needs. The explosive growth of information on the Internet has greatly increased the need for information retrieval systems. However, most of the search engines are using conventional information retrieval systems. An information system needs to implement sophisticated pattern matching tools to determine contents at a faster rate. AJAX has recently emerged as the new tool such the of information retrieval process of information retrieval can become fast and information reaches the use at a faster pace as compared to conventional retrieval systems.
NASA Astrophysics Data System (ADS)
Aloisio, Giovanni; Fiore, Sandro; Negro, A.
2010-05-01
The CMCC Data Distribution Centre (DDC) is the primary entry point (web gateway) to the CMCC. It is a Data Grid Portal providing a ubiquitous and pervasive way to ease data publishing, climate metadata search, datasets discovery, metadata annotation, data access, data aggregation, sub-setting, etc. The grid portal security model includes the use of HTTPS protocol for secure communication with the client (based on X509v3 certificates that must be loaded into the browser) and secure cookies to establish and maintain user sessions. The CMCC DDC is now in a pre-production phase and it is currently used only by internal users (CMCC researchers and climate scientists). The most important component already available in the CMCC DDC is the Search Engine which allows users to perform, through web interfaces, distributed search and discovery activities by introducing one or more of the following search criteria: horizontal extent (which can be specified by interacting with a geographic map), vertical extent, temporal extent, keywords, topics, creation date, etc. By means of this page the user submits the first step of the query process on the metadata DB, then, she can choose one or more datasets retrieving and displaying the complete XML metadata description (from the browser). This way, the second step of the query process is carried out by accessing to a specific XML document of the metadata DB. Finally, through the web interface, the user can access to and download (partially or totally) the data stored on the storage device accessing to OPeNDAP servers and to other available grid storage interfaces. Requests concerning datasets stored in deep storage will be served asynchronously.
Integrated Design and Production Reference Integration with ArchGenXML V1.00
DOE Office of Scientific and Technical Information (OSTI.GOV)
Barter, R H
2004-07-20
ArchGenXML is a tool that allows easy creation of Zope products through the use of Archetypes. The Integrated Design and Production Reference (IDPR) should be highly configurable in order to meet the needs of a diverse engineering community. Ease of configuration is key to the success of IDPR. The purpose of this paper is to describe a method of using a UML diagram editor to configure IDPR through ArchGenXML and Archetypes.
PDF for Healthcare and Child Health Data Forms.
Zuckerman, Alan E; Schneider, Joseph H; Miller, Ken
2008-11-06
PDF-H is a new best practices standard that uses XFA forms and embedded JavaScript to combine PDF forms with XML data. Preliminary experience with AAP child health forms shows that the combination of PDF with XML is a more effective method to visualize familiar data on paper and the web than the traditional use of XML and XSLT. Both PDF-H and HL7 Clinical Document Architecture can co-exist using the same data for different display formats.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mishra, P; Varian Medical Systems, Palo Alto, CA; Lewis, J
2014-06-15
Purpose: To address the challenges of creating delivery trajectories and imaging sequences with TrueBeam Developer Mode, a new open-source graphical XML builder, Veritas, has been developed, tested and made freely available. Veritas eliminates most of the need to understand the underlying schema and write XML scripts, by providing a graphical menu for each control point specifying the state of 30 mechanical/dose axes. All capabilities of Developer Mode are accessible in Veritas. Methods: Veritas was designed using QT Designer, a ‘what-you-is-what-you-get’ (WYSIWIG) tool for building graphical user interfaces (GUI). Different components of the GUI are integrated using QT's signals and slotsmore » mechanism. Functionalities are added using PySide, an open source, cross platform Python binding for the QT framework. The XML code generated is immediately visible, making it an interactive learning tool. A user starts from an anonymized DICOM file or XML example and introduces delivery modifications, or begins their experiment from scratch, then uses the GUI to modify control points as desired. The software automatically generates XML plans following the appropriate schema. Results: Veritas was tested by generating and delivering two XML plans at Brigham and Women's Hospital. The first example was created to irradiate the letter ‘B’ with a narrow MV beam using dynamic couch movements. The second was created to acquire 4D CBCT projections for four minutes. The delivery of the letter ‘B’ was observed using a 2D array of ionization chambers. Both deliveries were generated quickly in Veritas by non-expert Developer Mode users. Conclusion: We introduced a new open source tool Veritas for generating XML plans (delivery trajectories and imaging sequences). Veritas makes Developer Mode more accessible by reducing the learning curve for quick translation of research ideas into XML plans. Veritas is an open source initiative, creating the possibility for future developments and collaboration with other researchers. I am an employee of Varian Medical Systems.« less
Standardized data sharing in a paediatric oncology research network--a proof-of-concept study.
Hochedlinger, Nina; Nitzlnader, Michael; Falgenhauer, Markus; Welte, Stefan; Hayn, Dieter; Koumakis, Lefteris; Potamias, George; Tsiknakis, Manolis; Saraceno, Davide; Rinaldi, Eugenia; Ladenstein, Ruth; Schreier, Günter
2015-01-01
Data that has been collected in the course of clinical trials are potentially valuable for additional scientific research questions in so called secondary use scenarios. This is of particular importance in rare disease areas like paediatric oncology. If data from several research projects need to be connected, so called Core Datasets can be used to define which information needs to be extracted from every involved source system. In this work, the utility of the Clinical Data Interchange Standards Consortium (CDISC) Operational Data Model (ODM) as a format for Core Datasets was evaluated and a web tool was developed which received Source ODM XML files and--via Extensible Stylesheet Language Transformation (XSLT)--generated standardized Core Dataset ODM XML files. Using this tool, data from different source systems were extracted and pooled for joined analysis in a proof-of-concept study, facilitating both, basic syntactic and semantic interoperability.
GEM at 10: a decade's experience with the Guideline Elements Model.
Hajizadeh, Negin; Kashyap, Nitu; Michel, George; Shiffman, Richard N
2011-01-01
The Guideline Elements Model (GEM) was developed in 2000 to organize the information contained in clinical practice guidelines using XML and to represent guideline content in a form that can be understood by human readers and processed by computers. In this work, we systematically reviewed the literature to better understand how GEM was being used, potential barriers to its use, and suggestions for improvement. Fifty external and twelve internally produced publications were identified and analyzed. GEM was used most commonly for modeling and ontology creation. Other investigators applied GEM for knowledge extraction and data mining, for clinical decision support for guideline generation. The GEM Cutter software-used to markup guidelines for translation into XML- has been downloaded 563 times since 2000. Although many investigators found GEM to be valuable, others critiqued its failure to clarify guideline semantics, difficulties in markup, and the fact that GEM files are not usually executable.
NASA Astrophysics Data System (ADS)
Sardina, V.
2012-12-01
The US Tsunami Warning Centers (TWCs) have traditionally generated their tsunami message products primarily as blocks of text then tagged with headers that identify them on each particular communications' (comms) circuit. Each warning center has a primary area of responsibility (AOR) within which it has an authoritative role regarding parameters such as earthquake location and magnitude. This means that when a major tsunamigenic event occurs the other warning centers need to quickly access the earthquake parameters issued by the authoritative warning center before issuing their message products intended for customers in their own AOR. Thus, within the operational context of the TWCs the scientists on duty have an operational need to access the information contained in the message products issued by other warning centers as quickly as possible. As a solution to this operational problem we designed and implemented a C++ software package that allows scanning for and parsing the entire suite of tsunami message products issued by the Pacific Tsunami Warning Center (PTWC), the West Coast and Alaska Tsunami Warning Center (WCATWC), and the Japan Meteorological Agency (JMA). The scanning and parsing classes composing the resulting C++ software package allow parsing both non-official message products(observatory messages) routinely issued by the TWCs, and all official tsunami message products such as tsunami advisories, watches, and warnings. This software package currently allows scientists on duty at the PTWC to automatically retrieve the parameters contained in tsunami messages issued by WCATWC, JMA, or PTWC itself. Extension of the capabilities of the classes composing the software package would make it possible to generate XML and CAP compliant versions of the TWCs' message products until new messaging software natively adds this capabilities. Customers who receive the TWCs' tsunami message products could also use the package to automatically retrieve information from messages sent via any text-based communications' circuit currently used by the TWCs to disseminate their tsunami message products.
Leverage and Delegation in Developing an Information Model for Geology
NASA Astrophysics Data System (ADS)
Cox, S. J.
2007-12-01
GeoSciML is an information model and XML encoding developed by a group of primarily geologic survey organizations under the auspices of the IUGS CGI. The scope of the core model broadly corresponds with information traditionally portrayed on a geologic map, viz. interpreted geology, some observations, the map legend and accompanying memoir. The development of GeoSciML has followed the methodology specified for an Application Schema defined by OGC and ISO 19100 series standards. This requires agreement within a community concerning their domain model, its formal representation using UML, documentation as a Feature Type Catalogue, with an XML Schema implementation generated from the model by applying a rule-based transformation. The framework and technology supports a modular governance process. Standard datatypes and GI components (geometry, the feature and coverage metamodels, metadata) are imported from the ISO framework. The observation and sampling model (including boreholes) is imported from OGC. The scale used for most scalar literal values (terms, codes, measures) allows for localization where necessary. Wildcards and abstract base- classes provide explicit extensibility points. Link attributes appear in a regular way in the encodings, allowing reference to external resources using URIs. The encoding is compatible with generic GI data-service interfaces (WFS, WMS, SOS). For maximum interoperability within a community, the interfaces may be specialised through domain-specified constraints (e.g. feature-types, scale and vocabulary bindings, query-models). Formalization using UML and XML allows use of standard validation and processing tools. Use of upper-level elements defined for generic GI application reduces the development effort and governance resonsibility, while maximising cross-domain interoperability. On the other hand, enabling specialization to be delegated in a controlled manner is essential to adoption across a range of subdisciplines and jurisdictions. The GeoSciML design team is responsible only for the part of the model that is unique to geology but for which general agreement can be reached within the domain. This paper is presented on behalf of the Interoperability Working Group of the IUGS Commission for Geoscience Information (CGI) - follow web-link for details of the membership.
VKCDB: voltage-gated K+ channel database updated and upgraded.
Gallin, Warren J; Boutet, Patrick A
2011-01-01
The Voltage-gated K(+) Channel DataBase (VKCDB) (http://vkcdb.biology.ualberta.ca) makes a comprehensive set of sequence data readily available for phylogenetic and comparative analysis. The current update contains 2063 entries for full-length or nearly full-length unique channel sequences from Bacteria (477), Archaea (18) and Eukaryotes (1568), an increase from 346 solely eukaryotic entries in the original release. In addition to protein sequences for channels, corresponding nucleotide sequences of the open reading frames corresponding to the amino acid sequences are now available and can be extracted in parallel with sets of protein sequences. Channels are categorized into subfamilies by phylogenetic analysis and by using hidden Markov model analyses. Although the raw database contains a number of fragmentary, duplicated, obsolete and non-channel sequences that were collected in early steps of data collection, the web interface will only return entries that have been validated as likely K(+) channels. The retrieval function of the web interface allows retrieval of entries that contain a substantial fraction of the core structural elements of VKCs, fragmentary entries, or both. The full database can be downloaded as either a MySQL dump or as an XML dump from the web site. We have now implemented automated updates at quarterly intervals.
XML DTD and Schemas for HDF-EOS
NASA Technical Reports Server (NTRS)
Ullman, Richard; Yang, Jingli
2008-01-01
An Extensible Markup Language (XML) document type definition (DTD) standard for the structure and contents of HDF-EOS files and their contents, and an equivalent standard in the form of schemas, have been developed.
ForConX: A forcefield conversion tool based on XML.
Lesch, Volker; Diddens, Diddo; Bernardes, Carlos E S; Golub, Benjamin; Dequidt, Alain; Zeindlhofer, Veronika; Sega, Marcello; Schröder, Christian
2017-04-05
The force field conversion from one MD program to another one is exhausting and error-prone. Although single conversion tools from one MD program to another exist not every combination and both directions of conversion are available for the favorite MD programs Amber, Charmm, Dl-Poly, Gromacs, and Lammps. We present here a general tool for the force field conversion on the basis of an XML document. The force field is converted to and from this XML structure facilitating the implementation of new MD programs for the conversion. Furthermore, the XML structure is human readable and can be manipulated before continuing the conversion. We report, as testcases, the conversions of topologies for acetonitrile, dimethylformamide, and 1-ethyl-3-methylimidazolium trifluoromethanesulfonate comprising also Urey-Bradley and Ryckaert-Bellemans potentials. © 2017 Wiley Periodicals, Inc. © 2017 Wiley Periodicals, Inc.
Interactive, Secure Web-enabled Aircraft Engine Simulation Using XML Databinding Integration
NASA Technical Reports Server (NTRS)
Lin, Risheng; Afjeh, Abdollah A.
2003-01-01
This paper discusses the detailed design of an XML databinding framework for aircraft engine simulation. The framework provides an object interface to access and use engine data. while at the same time preserving the meaning of the original data. The Language independent representation of engine component data enables users to move around XML data using HTTP through disparate networks. The application of this framework is demonstrated via a web-based turbofan propulsion system simulation using the World Wide Web (WWW). A Java Servlet based web component architecture is used for rendering XML engine data into HTML format and dealing with input events from the user, which allows users to interact with simulation data from a web browser. The simulation data can also be saved to a local disk for archiving or to restart the simulation at a later time.
Eysenbach, Gunther; Yihune, Gabriel; Lampe, Kristian; Cross, Phil; Brickley, Dan
2000-01-01
MedCERTAIN (MedPICS Certification and Rating of Trustworthy Health Information on the Net, http://www.medcertain.org/) is a recently launched international project funded under the European Union's (EU) "Action Plan for safer use of the Internet". It provides a technical infrastructure and a conceptual basis for an international system of "quality seals", ratings and self-labelling of Internet health information, with the final aim to establish a global "trustmark" for networked health information. Digital "quality seals" are evaluative metadata (using standards such as PICS=Platform for Internet Content Selection, now being replaced by RDF/XML) assigned by trusted third-party raters. The project also enables and encourages self-labelling with descriptive metainformation by web authors. Together these measures will help consumers as well as professionals to identify high-quality information on the Internet. MedCERTAIN establishes a fully functional demonstrator for a self- and third-party rating system enabling consumers and professionals to filter harmful health information and to positively identify and select high quality information. We aim to provide a trustmark system which allows citizens to place greater confidence in networked information, to encourage health information providers to follow best practices guidelines such as the Washington eHealth Code of Ethics, to provide effective feedback and law enforcement channels to handle user complaints, and to stimulate medical societies to develop standard for patient information. The project further proposes and identifies standards for interoperability of rating and description services (such as libraries or national health portals) and fosters a worldwide collaboration to guide consumers to high-quality information on the web.
A Framework to Manage Information Models
NASA Astrophysics Data System (ADS)
Hughes, J. S.; King, T.; Crichton, D.; Walker, R.; Roberts, A.; Thieman, J.
2008-05-01
The Information Model is the foundation on which an Information System is built. It defines the entities to be processed, their attributes, and the relationships that add meaning. The development and subsequent management of the Information Model is the single most significant factor for the development of a successful information system. A framework of tools has been developed that supports the management of an information model with the rigor typically afforded to software development. This framework provides for evolutionary and collaborative development independent of system implementation choices. Once captured, the modeling information can be exported to common languages for the generation of documentation, application databases, and software code that supports both traditional and semantic web applications. This framework is being successfully used for several science information modeling projects including those for the Planetary Data System (PDS), the International Planetary Data Alliance (IPDA), the National Cancer Institute's Early Detection Research Network (EDRN), and several Consultative Committee for Space Data Systems (CCSDS) projects. The objective of the Space Physics Archive Search and Exchange (SPASE) program is to promote collaboration and coordination of archiving activity for the Space Plasma Physics community and ensure the compatibility of the architectures used for a global distributed system and the individual data centers. Over the past several years, the SPASE data model working group has made great progress in developing the SPASE Data Model and supporting artifacts including a data dictionary, XML Schema, and two ontologies. The authors have captured the SPASE Information Model in this framework. This allows the generation of documentation that presents the SPASE Information Model in object-oriented notation including UML class diagrams and class hierarchies. The modeling information can also be exported to semantic web languages such as OWL and RDF and written to XML Metadata Interchange (XMI) files for import into UML tools.
NASA Astrophysics Data System (ADS)
Yamazaki, Towako
In order to stabilize and improve quality of information retrieval service, the information retrieval team of Daicel Corporation has given some efforts on standard operating procedures, interview sheet for information retrieval, structured format for search report, and search expressions for some technological fields of Daicel. These activities and efforts will also lead to skill sharing and skill tradition between searchers. In addition, skill improvements are needed not only for a searcher individually, but also for the information retrieval team totally when playing searcher's new roles.
Incorporating Feature-Based Annotations into Automatically Generated Knowledge Representations
NASA Astrophysics Data System (ADS)
Lumb, L. I.; Lederman, J. I.; Aldridge, K. D.
2006-12-01
Earth Science Markup Language (ESML) is efficient and effective in representing scientific data in an XML- based formalism. However, features of the data being represented are not accounted for in ESML. Such features might derive from events (e.g., a gap in data collection due to instrument servicing), identifications (e.g., a scientifically interesting area/volume in an image), or some other source. In order to account for features in an ESML context, we consider them from the perspective of annotation, i.e., the addition of information to existing documents without changing the originals. Although it is possible to extend ESML to incorporate feature-based annotations internally (e.g., by extending the XML schema for ESML), there are a number of complicating factors that we identify. Rather than pursuing the ESML-extension approach, we focus on an external representation for feature-based annotations via XML Pointer Language (XPointer). In previous work (Lumb &Aldridge, HPCS 2006, IEEE, doi:10.1109/HPCS.2006.26), we have shown that it is possible to extract relationships from ESML-based representations, and capture the results in the Resource Description Format (RDF). Thus we explore and report on this same requirement for XPointer-based annotations of ESML representations. As in our past efforts, the Global Geodynamics Project (GGP) allows us to illustrate with a real-world example this approach for introducing annotations into automatically generated knowledge representations.
The application of geography markup language (GML) to the geological sciences
NASA Astrophysics Data System (ADS)
Lake, Ron
2005-11-01
GML 3.0 became an adopted specification of the Open Geospatial Consortium (OGC) in January 2003, and is rapidly emerging as the world standard for the encoding, transport and storage of all forms of geographic information. This paper looks at the application of GML to one of the more challenging areas of automated geography, namely the geological sciences. Specific features of GML of interest to geologists are discussed and then illustrated through a series of geological case studies. We conclude the paper with a discussion of anticipated geological web services that GML will enable. GML is written in XML and makes use of XML Schema for extensibility. It can be used both to represent or model geographic objects and to transport them across the Internet. In this way it serves as the foundation for all manner of geographic web services. Unlike vertical application grammars such as LandXML, GML was intended to define geographic application languages, and hence is applicable to any geographic domain including forestry, environmental sciences, geology and oceanography. This paper provides a review of the basic features of GML that are fundamental to the geological sciences including geometry, coverages, observations, reference systems and temporality. These constructs are then employed in a series of simple geological case studies including structural geological description, surficial geology, representation of geological time scales, mineral occurrences, geohazards and geochemical reconnaissance.
High-level Closed-loop Fusion and Decision Making with INFORM Lab
2011-06-01
gling fr eighte r zodiac Toolbox / Library node. The output of the editor is an XML file that contains all the information needed to run the ...surveillance. It uses two land-based zodiacs to offload the illegal immigrants, by making multiple trips to/from the freighter to ferry persons to the ...mapped in Figure 9. Figure 9 Map of Non-Cooperative Search Vignette The freighter and zodiacs will attempt various elusive manoeuvres depending on
2001-12-01
diides.ncr.disa.mil/xmlreg/user/index.cfm] [ Deitel ] Deitel , H., Deitel , P., Java How to Program 3rd Edition, Prentice Hall, 1999. [DL99...presentation, and data) of information and the programming functionality. The Web framework addressed ability to provide a framework for the distribution...BLANK v ABSTRACT Advances in computer communication technology and an increased awareness of how enhanced information access can lead to improved
MPEG-7: standard metadata for multimedia content
NASA Astrophysics Data System (ADS)
Chang, Wo
2005-08-01
The eXtensible Markup Language (XML) metadata technology of describing media contents has emerged as a dominant mode of making media searchable both for human and machine consumptions. To realize this premise, many online Web applications are pushing this concept to its fullest potential. However, a good metadata model does require a robust standardization effort so that the metadata content and its structure can reach its maximum usage between various applications. An effective media content description technology should also use standard metadata structures especially when dealing with various multimedia contents. A new metadata technology called MPEG-7 content description has merged from the ISO MPEG standards body with the charter of defining standard metadata to describe audiovisual content. This paper will give an overview of MPEG-7 technology and what impact it can bring forth to the next generation of multimedia indexing and retrieval applications.
NASA Technical Reports Server (NTRS)
Li, Chung-Sheng (Inventor); Smith, John R. (Inventor); Chang, Yuan-Chi (Inventor); Jhingran, Anant D. (Inventor); Padmanabhan, Sriram K. (Inventor); Hsiao, Hui-I (Inventor); Choy, David Mun-Hien (Inventor); Lin, Jy-Jine James (Inventor); Fuh, Gene Y. C. (Inventor); Williams, Robin (Inventor)
2004-01-01
Methods and apparatus for providing a multi-tier object-relational database architecture are disclosed. In one illustrative embodiment of the present invention, a multi-tier database architecture comprises an object-relational database engine as a top tier, one or more domain-specific extension modules as a bottom tier, and one or more universal extension modules as a middle tier. The individual extension modules of the bottom tier operationally connect with the one or more universal extension modules which, themselves, operationally connect with the database engine. The domain-specific extension modules preferably provide such functions as search, index, and retrieval services of images, video, audio, time series, web pages, text, XML, spatial data, etc. The domain-specific extension modules may include one or more IBM DB2 extenders, Oracle data cartridges and/or Informix datablades, although other domain-specific extension modules may be used.
Data Visualization in Information Retrieval and Data Mining (SIG VIS).
ERIC Educational Resources Information Center
Efthimiadis, Efthimis
2000-01-01
Presents abstracts that discuss using data visualization for information retrieval and data mining, including immersive information space and spatial metaphors; spatial data using multi-dimensional matrices with maps; TREC (Text Retrieval Conference) experiments; users' information needs in cartographic information retrieval; and users' relevance…
XML schemas for common bioinformatic data types and their application in workflow systems.
Seibel, Philipp N; Krüger, Jan; Hartmeier, Sven; Schwarzer, Knut; Löwenthal, Kai; Mersch, Henning; Dandekar, Thomas; Giegerich, Robert
2006-11-06
Today, there is a growing need in bioinformatics to combine available software tools into chains, thus building complex applications from existing single-task tools. To create such workflows, the tools involved have to be able to work with each other's data--therefore, a common set of well-defined data formats is needed. Unfortunately, current bioinformatic tools use a great variety of heterogeneous formats. Acknowledging the need for common formats, the Helmholtz Open BioInformatics Technology network (HOBIT) identified several basic data types used in bioinformatics and developed appropriate format descriptions, formally defined by XML schemas, and incorporated them in a Java library (BioDOM). These schemas currently cover sequence, sequence alignment, RNA secondary structure and RNA secondary structure alignment formats in a form that is independent of any specific program, thus enabling seamless interoperation of different tools. All XML formats are available at http://bioschemas.sourceforge.net, the BioDOM library can be obtained at http://biodom.sourceforge.net. The HOBIT XML schemas and the BioDOM library simplify adding XML support to newly created and existing bioinformatic tools, enabling these tools to interoperate seamlessly in workflow scenarios.
de Beer, R; Graveron-Demilly, D; Nastase, S; van Ormondt, D
2004-03-01
Recently we have developed a Java-based heterogeneous distributed computing system for the field of magnetic resonance imaging (MRI). It is a software system for embedding the various image reconstruction algorithms that we have created for handling MRI data sets with sparse sampling distributions. Since these data sets may result from multi-dimensional MRI measurements our system has to control the storage and manipulation of large amounts of data. In this paper we describe how we have employed the extensible markup language (XML) to realize this data handling in a highly structured way. To that end we have used Java packages, recently released by Sun Microsystems, to process XML documents and to compile pieces of XML code into Java classes. We have effectuated a flexible storage and manipulation approach for all kinds of data within the MRI system, such as data describing and containing multi-dimensional MRI measurements, data configuring image reconstruction methods and data representing and visualizing the various services of the system. We have found that the object-oriented approach, possible with the Java programming environment, combined with the XML technology is a convenient way of describing and handling various data streams in heterogeneous distributed computing systems.
Connectionist Interaction Information Retrieval.
ERIC Educational Resources Information Center
Dominich, Sandor
2003-01-01
Discussion of connectionist views for adaptive clustering in information retrieval focuses on a connectionist clustering technique and activation spreading-based information retrieval model using the interaction information retrieval method. Presents theoretical as well as simulation results as regards computational complexity and includes…
SciFlo: Semantically-Enabled Grid Workflow for Collaborative Science
NASA Astrophysics Data System (ADS)
Yunck, T.; Wilson, B. D.; Raskin, R.; Manipon, G.
2005-12-01
SciFlo is a system for Scientific Knowledge Creation on the Grid using a Semantically-Enabled Dataflow Execution Environment. SciFlo leverages Simple Object Access Protocol (SOAP) Web Services and the Grid Computing standards (WS-* standards and the Globus Alliance toolkits), and enables scientists to do multi-instrument Earth Science by assembling reusable SOAP Services, native executables, local command-line scripts, and python codes into a distributed computing flow (a graph of operators). SciFlo's XML dataflow documents can be a mixture of concrete operators (fully bound operations) and abstract template operators (late binding via semantic lookup). All data objects and operators can be both simply typed (simple and complex types in XML schema) and semantically typed using controlled vocabularies (linked to OWL ontologies such as SWEET). By exploiting ontology-enhanced search and inference, one can discover (and automatically invoke) Web Services and operators that have been semantically labeled as performing the desired transformation, and adapt a particular invocation to the proper interface (number, types, and meaning of inputs and outputs). The SciFlo client & server engines optimize the execution of such distributed data flows and allow the user to transparently find and use datasets and operators without worrying about the actual location of the Grid resources. The scientist injects a distributed computation into the Grid by simply filling out an HTML form or directly authoring the underlying XML dataflow document, and results are returned directly to the scientist's desktop. A Visual Programming tool is also being developed, but it is not required. Once an analysis has been specified for a granule or day of data, it can be easily repeated with different control parameters and over months or years of data. SciFlo uses and preserves semantics, and also generates and infers new semantic annotations. Specifically, the SciFlo engine uses semantic metadata to understand (infer) what it is doing and potentially improve the data flow; preserves semantics by saving links to the semantics of (metadata describing) the input datasets, related datasets, and the data transformations (algorithms) used to generate downstream products; generates new metadata by allowing the user to add semantic annotations to the generated data products (or simply accept automatically generated provenance annotations); and infers new semantic metadata by understanding and applying logic to the semantics of the data and the transformations performed. Much ontology development still needs to be done but, nevertheless, SciFlo documents provide a substrate for using and preserving more semantics as ontologies develop. We will give a live demonstration of the growing SciFlo network using an example dataflow in which atmospheric temperature and water vapor profiles from three Earth Observing System (EOS) instruments are retrieved using SOAP (geo-location query & data access) services, co-registered, and visually & statistically compared on demand (see http://sciflo.jpl.nasa.gov for more information).
Giovanni, Mazza G; Shenvi, Rohit; Battles, Marcie; Orthner, Helmuth F
2008-11-06
The eMonitor is a component of the ePatient system; a prototype system used by emergency medical services (EMS) personnel in the field to record and transmits electronic patient care report (ePCR) information interactively. The eMonitor component allows each Mobile Data Terminal (MDT) on an unreliable Cisco MobileIP wireless network to securely send and received XML messages used to update patient information to and from the MDT before, during and after the transport of a patient.
2008-05-09
using better technology as a means might be an effective strategy to achieve desired effects and to reduce risk. Tim Berners - Lee , founder of the World...of information disclosure persecution. Tim Berners - Lee advises, “human communication scales up only if we can be tolerant of the differences while we...the government to define intended use Figure 1: Slide by Tim Berners - Lee at http://www.w3.org/2000/Talks/1206-xml2k- tbl.27 The Author added the
XML in an Adaptive Framework for Instrument Control
NASA Technical Reports Server (NTRS)
Ames, Troy J.
2004-01-01
NASA Goddard Space Flight Center is developing an extensible framework for instrument command and control, known as Instrument Remote Control (IRC), that combines the platform independent processing capabilities of Java with the power of the Extensible Markup Language (XML). A key aspect of the architecture is software that is driven by an instrument description, written using the Instrument Markup Language (IML). IML is an XML dialect used to describe interfaces to control and monitor the instrument, command sets and command formats, data streams, communication mechanisms, and data processing algorithms.
Enabling private and public sector organizations as agents of homeland security
NASA Astrophysics Data System (ADS)
Glassco, David H. J.; Glassco, Jordan C.
2006-05-01
Homeland security and defense applications seek to reduce the risk of undesirable eventualities across physical space in real-time. With that functional requirement in mind, our work focused on the development of IP based agent telecommunication solutions for heterogeneous sensor / robotic intelligent "Things" that could be deployed across the internet. This paper explains how multi-organization information and device sharing alliances may be formed to enable organizations to act as agents of homeland security (in addition to other uses). Topics include: (i) using location-aware, agent based, real-time information sharing systems to integrate business systems, mobile devices, sensor and actuator based devices and embedded devices used in physical infrastructure assets, equipment and other man-made "Things"; (ii) organization-centric real-time information sharing spaces using on-demand XML schema formatted networks; (iii) object-oriented XML serialization as a methodology for heterogeneous device glue code; (iv) how complex requirements for inter / intra organization information and device ownership and sharing, security and access control, mobility and remote communication service, tailored solution life cycle management, service QoS, service and geographic scalability and the projection of remote physical presence (through sensing and robotics) and remote informational presence (knowledge of what is going elsewhere) can be more easily supported through feature inheritance with a rapid agent system development methodology; (v) how remote object identification and tracking can be supported across large areas; (vi) how agent synergy may be leveraged with analytics to complement heterogeneous device networks.
An Approach for Implementation of Project Management Information Systems
NASA Astrophysics Data System (ADS)
Běrziša, Solvita; Grabis, Jānis
Project management is governed by project management methodologies, standards, and other regulatory requirements. This chapter proposes an approach for implementing and configuring project management information systems according to requirements defined by these methodologies. The approach uses a project management specification framework to describe project management methodologies in a standardized manner. This specification is used to automatically configure the project management information system by applying appropriate transformation mechanisms. Development of the standardized framework is based on analysis of typical project management concepts and process and existing XML-based representations of project management. A demonstration example of project management information system's configuration is provided.
DBMap: a TreeMap-based framework for data navigation and visualization of brain research registry
NASA Astrophysics Data System (ADS)
Zhang, Ming; Zhang, Hong; Tjandra, Donny; Wong, Stephen T. C.
2003-05-01
The purpose of this study is to investigate and apply a new, intuitive and space-conscious visualization framework to facilitate efficient data presentation and exploration of large-scale data warehouses. We have implemented the DBMap framework for the UCSF Brain Research Registry. Such a novel utility would facilitate medical specialists and clinical researchers in better exploring and evaluating a number of attributes organized in the brain research registry. The current UCSF Brain Research Registry consists of a federation of disease-oriented database modules, including Epilepsy, Brain Tumor, Intracerebral Hemorrphage, and CJD (Creuzfeld-Jacob disease). These database modules organize large volumes of imaging and non-imaging data to support Web-based clinical research. While the data warehouse supports general information retrieval and analysis, there lacks an effective way to visualize and present the voluminous and complex data stored. This study investigates whether the TreeMap algorithm can be adapted to display and navigate categorical biomedical data warehouse or registry. TreeMap is a space constrained graphical representation of large hierarchical data sets, mapped to a matrix of rectangles, whose size and color represent interested database fields. It allows the display of a large amount of numerical and categorical information in limited real estate of computer screen with an intuitive user interface. The paper will describe, DBMap, the proposed new data visualization framework for large biomedical databases. Built upon XML, Java and JDBC technologies, the prototype system includes a set of software modules that reside in the application server tier and provide interface to backend database tier and front-end Web tier of the brain registry.
Remote monitoring of patients with implanted devices: data exchange and integration.
Van der Velde, Enno T; Atsma, Douwe E; Foeken, Hylke; Witteman, Tom A; Hoekstra, Wybo H G J
2013-06-01
Remote follow-up of implanted implantable cardioverter defibrillators (ICDs) may offer a solution to the problem of overcrowded outpatient clinics, and may also be effective in detecting clinical events early. Data obtained from remote follow up systems, as developed by all major device companies, are stored in a central database system, operated and owned by the device company. A problem now arises that the patient's clinical information is partly stored in the local electronic health record (EHR) system in the hospital, and partly in the remote monitoring database, which may potentially result in patient safety issues. To address the requirement of integrating remote monitoring data in the local EHR, the Integrating the Healthcare Enterprise (IHE) Implantable Device Cardiac Observation (IDCO) profile has been developed. This IHE IDCO profile has been adapted by all major device companies. In our hospital, we have implemented the IHE IDCO profile to import data from the remote databases from two device vendors into the departmental Cardiology Information System (EPD-Vision). Data is exchanged via a HL7/XML communication protocol, as defined in the IHE IDCO profile. By implementing the IHE IDCO profile, we have been able to integrate the data from the remote monitoring databases in our local EHRs. It can be expected that remote monitoring systems will develop into dedicated monitoring and therapy platforms. Data retrieved from these systems should form an integral part of the electronic patient record as more and more out-patient clinic care will shift to personalized care provided at a distance, in other words at the patient's home.
NASA Astrophysics Data System (ADS)
Leibovici, D. G.; Pourabdollah, A.; Jackson, M.
2011-12-01
Experts and decision-makers use or develop models to monitor global and local changes of the environment. Their activities require the combination of data and processing services in a flow of operations and spatial data computations: a geospatial scientific workflow. The seamless ability to generate, re-use and modify a geospatial scientific workflow is an important requirement but the quality of outcomes is equally much important [1]. Metadata information attached to the data and processes, and particularly their quality, is essential to assess the reliability of the scientific model that represents a workflow [2]. Managing tools, dealing with qualitative and quantitative metadata measures of the quality associated with a workflow, are, therefore, required for the modellers. To ensure interoperability, ISO and OGC standards [3] are to be adopted, allowing for example one to define metadata profiles and to retrieve them via web service interfaces. However these standards need a few extensions when looking at workflows, particularly in the context of geoprocesses metadata. We propose to fill this gap (i) at first through the provision of a metadata profile for the quality of processes, and (ii) through providing a framework, based on XPDL [4], to manage the quality information. Web Processing Services are used to implement a range of metadata analyses on the workflow in order to evaluate and present quality information at different levels of the workflow. This generates the metadata quality, stored in the XPDL file. The focus is (a) on the visual representations of the quality, summarizing the retrieved quality information either from the standardized metadata profiles of the components or from non-standard quality information e.g., Web 2.0 information, and (b) on the estimated qualities of the outputs derived from meta-propagation of uncertainties (a principle that we have introduced [5]). An a priori validation of the future decision-making supported by the outputs of the workflow once run, is then provided using the meta-propagated qualities, obtained without running the workflow [6], together with the visualization pointing out the need to improve the workflow with better data or better processes on the workflow graph itself. [1] Leibovici, DG, Hobona, G Stock, K Jackson, M (2009) Qualifying geospatial workfow models for adaptive controlled validity and accuracy. In: IEEE 17th GeoInformatics, 1-5 [2] Leibovici, DG, Pourabdollah, A (2010a) Workflow Uncertainty using a Metamodel Framework and Metadata for Data and Processes. OGC TC/PC Meetings, September 2010, Toulouse, France [3] OGC (2011) www.opengeospatial.org [4] XPDL (2008) Workflow Process Definition Interface - XML Process Definition Language.Workflow Management Coalition, Document WfMC-TC-1025, 2008 [5] Leibovici, DG Pourabdollah, A Jackson, M (2011) Meta-propagation of Uncertainties for Scientific Workflow Management in Interoperable Spatial Data Infrastructures. In: Proceedings of the European Geosciences Union (EGU2011), April 2011, Austria [6] Pourabdollah, A Leibovici, DG Jackson, M (2011) MetaPunT: an Open Source tool for Meta-Propagation of uncerTainties in Geospatial Processing. In: Proceedings of OSGIS2011, June 2011, Nottingham, UK
A structured interface to the object-oriented genomics unified schema for XML-formatted data.
Clark, Terry; Jurek, Josef; Kettler, Gregory; Preuss, Daphe
2005-01-01
Data management systems are fast becoming required components in many biology laboratories as the role of computer-based information grows. Although the need for data management systems is on the rise, their inherent complexities can deter the full and routine use of their computational capabilities. The significant undertaking to implement a capable production system can be reduced in part by adapting an established data management system. In such a way, we are leveraging the Genomics Unified Schema (GUS) developed at the Computational Biology and Informatics Laboratory at the University of Pennsylvania as a foundation for managing and analysing DNA sequence data in centromere research projects around Arabidopsis thaliana and related species. Because GUS provides a core schema that includes support for genome sequences, mRNA and its expression, and annotated chromosomes, it is ideal for synthesising a variety of parameters to analyse these repetitive and highly dynamic portions of the genome. Despite this, production-strength data management frameworks are complex, requiring dedicated efforts to adapt and maintain. The work reported in this article addresses one component of such an effort, namely the pivotal task of marshalling data from various sources into GUS. In order to harness GUS for our project, and motivated by efficiency needs, we developed a structured framework for transferring data into GUS from outside sources. This technology is embodied in a GUS object-layer processor, XMLGUS. XMLGUS facilitates incorporating data into GUS by (i) formulating an XML interface that includes relational database key constraint definitions, (ii) regularising traversal through that XML, (iii) realising automatic processing of the XML with database key constraints and (iv) allowing for special processing of input data within the framework for automated processing. The application of XMLGUS to production pipeline processing for a sequencing project and inputting the Arabidopsis genome into GUS is discussed. XMLGUS is available from the Flora website (http://flora.ittc.ku.edu/).
Collaborative Science Using Web Services and the SciFlo Grid Dataflow Engine
NASA Astrophysics Data System (ADS)
Wilson, B. D.; Manipon, G.; Xing, Z.; Yunck, T.
2006-12-01
The General Earth Science Investigation Suite (GENESIS) project is a NASA-sponsored partnership between the Jet Propulsion Laboratory, academia, and NASA data centers to develop a new suite of Web Services tools to facilitate multi-sensor investigations in Earth System Science. The goal of GENESIS is to enable large-scale, multi-instrument atmospheric science using combined datasets from the AIRS, MODIS, MISR, and GPS sensors. Investigations include cross-comparison of spaceborne climate sensors, cloud spectral analysis, study of upper troposphere-stratosphere water transport, study of the aerosol indirect cloud effect, and global climate model validation. The challenges are to bring together very large datasets, reformat and understand the individual instrument retrievals, co-register or re-grid the retrieved physical parameters, perform computationally-intensive data fusion and data mining operations, and accumulate complex statistics over months to years of data. To meet these challenges, we have developed a Grid computing and dataflow framework, named SciFlo, in which we are deploying a set of versatile and reusable operators for data access, subsetting, registration, mining, fusion, compression, and advanced statistical analysis. SciFlo leverages remote Web Services, called via Simple Object Access Protocol (SOAP) or REST (one-line) URLs, and the Grid Computing standards (WS-* &Globus Alliance toolkits), and enables scientists to do multi-instrument Earth Science by assembling reusable Web Services and native executables into a distributed computing flow (tree of operators). The SciFlo client &server engines optimize the execution of such distributed data flows and allow the user to transparently find and use datasets and operators without worrying about the actual location of the Grid resources. In particular, SciFlo exploits the wealth of datasets accessible by OpenGIS Consortium (OGC) Web Mapping Servers & Web Coverage Servers (WMS/WCS), and by Open Data Access Protocol (OpenDAP) servers. The scientist injects a distributed computation into the Grid by simply filling out an HTML form or directly authoring the underlying XML dataflow document, and results are returned directly to the scientist's desktop. Once an analysis has been specified for a chunk or day of data, it can be easily repeated with different control parameters or over months of data. Recently, the Earth Science Information Partners (ESIP) Federation sponsored a collaborative activity in which several ESIP members advertised their respective WMS/WCS and SOAP services, developed some collaborative science scenarios for atmospheric and aerosol science, and then choreographed services from multiple groups into demonstration workflows using the SciFlo engine and a Business Process Execution Language (BPEL) workflow engine. For several scenarios, the same collaborative workflow was executed in three ways: using hand-coded scripts, by executing a SciFlo document, and by executing a BPEL workflow document. We will discuss the lessons learned from this activity, the need for standardized interfaces (like WMS/WCS), the difficulty in agreeing on even simple XML formats and interfaces, and further collaborations that are being pursued.
Listen up, eye movements play a role in verbal memory retrieval.
Scholz, Agnes; Mehlhorn, Katja; Krems, Josef F
2016-01-01
People fixate on blank spaces if visual stimuli previously occupied these regions of space. This so-called "looking at nothing" (LAN) phenomenon is said to be a part of information retrieval from internal memory representations, but the exact nature of the relationship between LAN and memory retrieval is unclear. While evidence exists for an influence of LAN on memory retrieval for visuospatial stimuli, evidence for verbal information is mixed. Here, we tested the relationship between LAN behavior and memory retrieval in an episodic retrieval task where verbal information was presented auditorily during encoding. When participants were allowed to gaze freely during subsequent memory retrieval, LAN occurred, and it was stronger for correct than for incorrect responses. When eye movements were manipulated during memory retrieval, retrieval performance was higher when participants fixated on the area associated with to-be-retrieved information than when fixating on another area. Our results provide evidence for a functional relationship between LAN and memory retrieval that extends to verbal information.
Topological Aspects of Information Retrieval.
ERIC Educational Resources Information Center
Egghe, Leo; Rousseau, Ronald
1998-01-01
Discusses topological aspects of theoretical information retrieval, including retrieval topology; similarity topology; pseudo-metric topology; document spaces as topological spaces; Boolean information retrieval as a subsystem of any topological system; and proofs of theorems. (LRW)
Information Retrieval in Biomedical Research: From Articles to Datasets
ERIC Educational Resources Information Center
Wei, Wei
2017-01-01
Information retrieval techniques have been applied to biomedical research for a variety of purposes, such as textual document retrieval and molecular data retrieval. As biomedical research evolves over time, information retrieval is also constantly facing new challenges, including the growing number of available data, the emerging new data types,…
NASA Astrophysics Data System (ADS)
Bell, A.; Tang, G.; Yang, P.; Wu, D.
2017-12-01
Due to their high spatial and temporal coverage, cirrus clouds have a profound role in regulating the Earth's energy budget. Variability of their radiative, geometric, and microphysical properties can pose significant uncertainties in global climate model simulations if not adequately constrained. Thus, the development of retrieval methodologies able to accurately retrieve ice cloud properties and present associated uncertainties is essential. The effectiveness of cirrus cloud retrievals relies on accurate a priori understanding of ice radiative properties, as well as the current state of the atmosphere. Current studies have implemented information content theory analyses prior to retrievals to quantify the amount of information that should be expected on parameters to be retrieved, as well as the relative contribution of information provided by certain measurement channels. Through this analysis, retrieval algorithms can be designed in a way to maximize the information in measurements, and therefore ensure enough information is present to retrieve ice cloud properties. In this study, we present such an information content analysis to quantify the amount of information to be expected in retrievals of cirrus ice water path and particle effective diameter using sub-millimeter and thermal infrared radiometry. Preliminary results show these bands to be sensitive to changes in ice water path and effective diameter, and thus lend confidence their ability to simultaneously retrieve these parameters. Further quantification of sensitivity and the information provided from these bands can then be used to design and optimal retrieval scheme. While this information content analysis is employed on a theoretical retrieval combining simulated radiance measurements, the methodology could in general be applicable to any instrument or retrieval approach.
15 CFR 950.9 - Computerized Environmental Data and Information Retrieval Service.
Code of Federal Regulations, 2012 CFR
2012-01-01
... Information Retrieval Service. 950.9 Section 950.9 Commerce and Foreign Trade Regulations Relating to Commerce... Computerized Environmental Data and Information Retrieval Service. The Environmental Data Index (ENDEX... computerized, information retrieval service provides a parallel subject-author-abstract referral service. A...
15 CFR 950.9 - Computerized Environmental Data and Information Retrieval Service.
Code of Federal Regulations, 2013 CFR
2013-01-01
... Information Retrieval Service. 950.9 Section 950.9 Commerce and Foreign Trade Regulations Relating to Commerce... Computerized Environmental Data and Information Retrieval Service. The Environmental Data Index (ENDEX... computerized, information retrieval service provides a parallel subject-author-abstract referral service. A...
15 CFR 950.9 - Computerized Environmental Data and Information Retrieval Service.
Code of Federal Regulations, 2010 CFR
2010-01-01
... Information Retrieval Service. 950.9 Section 950.9 Commerce and Foreign Trade Regulations Relating to Commerce... Computerized Environmental Data and Information Retrieval Service. The Environmental Data Index (ENDEX... computerized, information retrieval service provides a parallel subject-author-abstract referral service. A...
15 CFR 950.9 - Computerized Environmental Data and Information Retrieval Service.
Code of Federal Regulations, 2011 CFR
2011-01-01
... Information Retrieval Service. 950.9 Section 950.9 Commerce and Foreign Trade Regulations Relating to Commerce... Computerized Environmental Data and Information Retrieval Service. The Environmental Data Index (ENDEX... computerized, information retrieval service provides a parallel subject-author-abstract referral service. A...
15 CFR 950.9 - Computerized Environmental Data and Information Retrieval Service.
Code of Federal Regulations, 2014 CFR
2014-01-01
... Information Retrieval Service. 950.9 Section 950.9 Commerce and Foreign Trade Regulations Relating to Commerce... Computerized Environmental Data and Information Retrieval Service. The Environmental Data Index (ENDEX... computerized, information retrieval service provides a parallel subject-author-abstract referral service. A...
Telescope networking and user support via Remote Telescope Markup Language
NASA Astrophysics Data System (ADS)
Hessman, Frederic V.; Pennypacker, Carlton R.; Romero-Colmenero, Encarni; Tuparev, Georg
2004-09-01
Remote Telescope Markup Language (RTML) is an XML-based interface/document format designed to facilitate the exchange of astronomical observing requests and results between investigators and observatories as well as within networks of observatories. While originally created to support simple imaging telescope requests (Versions 1.0-2.1), RTML Version 3.0 now supports a wide range of applications, from request preparation, exposure calculation, spectroscopy, and observation reports to remote telescope scheduling, target-of-opportunity observations and telescope network administration. The elegance of RTML is that all of this is made possible using a public XML Schema which provides a general-purpose, easily parsed, and syntax-checked medium for the exchange of astronomical and user information while not restricting or otherwise constraining the use of the information at either end. Thus, RTML can be used to connect heterogeneous systems and their users without requiring major changes in existing local resources and procedures. Projects as very different as a number of advanced amateur observatories, the global Hands-On Universe project, the MONET network (robotic imaging), the STELLA consortium (robotic spectroscopy), and the 11-m Southern African Large Telescope are now using or intending to use RTML in various forms and for various purposes.
Trick Simulation Environment 07
NASA Technical Reports Server (NTRS)
Lin, Alexander S.; Penn, John M.
2012-01-01
The Trick Simulation Environment is a generic simulation toolkit used for constructing and running simulations. This release includes a Monte Carlo analysis simulation framework and a data analysis package. It produces all auto documentation in XML. Also, the software is capable of inserting a malfunction at any point during the simulation. Trick 07 adds variable server output options and error messaging and is capable of using and manipulating wide characters for international support. Wide character strings are available as a fundamental type for variables processed by Trick. A Trick Monte Carlo simulation uses a statistically generated, or predetermined, set of inputs to iteratively drive the simulation. Also, there is a framework in place for optimization and solution finding where developers may iteratively modify the inputs per run based on some analysis of the outputs. The data analysis package is capable of reading data from external simulation packages such as MATLAB and Octave, as well as the common comma-separated values (CSV) format used by Excel, without the use of external converters. The file formats for MATLAB and Octave were obtained from their documentation sets, and Trick maintains generic file readers for each format. XML tags store the fields in the Trick header comments. For header files, XML tags for structures and enumerations, and the members within are stored in the auto documentation. For source code files, XML tags for each function and the calling arguments are stored in the auto documentation. When a simulation is built, a top level XML file, which includes all of the header and source code XML auto documentation files, is created in the simulation directory. Trick 07 provides an XML to TeX converter. The converter reads in header and source code XML documentation files and converts the data to TeX labels and tables suitable for inclusion in TeX documents. A malfunction insertion capability allows users to override the value of any simulation variable, or call a malfunction job, at any time during the simulation. Users may specify conditions, use the return value of a malfunction trigger job, or manually activate a malfunction. The malfunction action may consist of executing a block of input file statements in an action block, setting simulation variable values, call a malfunction job, or turn on/off simulation jobs.
45 CFR 205.35 - Mechanized claims processing and information retrieval systems; definitions.
Code of Federal Regulations, 2012 CFR
2012-10-01
... claims processing and information retrieval systems; definitions. Section 205.35 through 205.38 contain...: (a) A mechanized claims processing and information retrieval system, hereafter referred to as an automated application processing and information retrieval system (APIRS), or the system, means a system of...
45 CFR 205.35 - Mechanized claims processing and information retrieval systems; definitions.
Code of Federal Regulations, 2013 CFR
2013-10-01
... claims processing and information retrieval systems; definitions. Section 205.35 through 205.38 contain...: (a) A mechanized claims processing and information retrieval system, hereafter referred to as an automated application processing and information retrieval system (APIRS), or the system, means a system of...
Graph-Based Interactive Bibliographic Information Retrieval Systems
ERIC Educational Resources Information Center
Zhu, Yongjun
2017-01-01
In the big data era, we have witnessed the explosion of scholarly literature. This explosion has imposed challenges to the retrieval of bibliographic information. Retrieval of intended bibliographic information has become challenging due to the overwhelming search results returned by bibliographic information retrieval systems for given input…
45 CFR 205.35 - Mechanized claims processing and information retrieval systems; definitions.
Code of Federal Regulations, 2014 CFR
2014-10-01
... claims processing and information retrieval systems; definitions. Section 205.35 through 205.38 contain...: (a) A mechanized claims processing and information retrieval system, hereafter referred to as an automated application processing and information retrieval system (APIRS), or the system, means a system of...
Term Relevance Weights in On-Line Information Retrieval
ERIC Educational Resources Information Center
Salton, G.; Waldstein, R. K.
1978-01-01
Term relevance weighting systems in interactive information retrieval are reviewed. An experiment in which information retrieval users ranked query terms in decreasing order of presumed importance prior to actual search and retrieval is described. (Author/KP)
Code of Federal Regulations, 2011 CFR
2011-10-01
... and information retrieval systems. 433.116 Section 433.116 Public Health CENTERS FOR MEDICARE... FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.116 FFP for operation of mechanized claims processing and information retrieval systems. (a) Subject to paragraph (j) of...
7 CFR 277.18 - Establishment of an Automated Data Processing (ADP) and Information Retrieval System.
Code of Federal Regulations, 2012 CFR
2012-01-01
...) and Information Retrieval System. 277.18 Section 277.18 Agriculture Regulations of the Department of... Data Processing (ADP) and Information Retrieval System. (a) Scope and application. This section... costs of planning, design, development or installation of ADP and information retrieval systems if the...
Code of Federal Regulations, 2013 CFR
2013-10-01
... and information retrieval systems. 433.116 Section 433.116 Public Health CENTERS FOR MEDICARE... FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.116 FFP for operation of mechanized claims processing and information retrieval systems. (a) Subject to paragraph (j) of...
7 CFR 277.18 - Establishment of an Automated Data Processing (ADP) and Information Retrieval System.
Code of Federal Regulations, 2014 CFR
2014-01-01
...) and Information Retrieval System. 277.18 Section 277.18 Agriculture Regulations of the Department of... Data Processing (ADP) and Information Retrieval System. (a) Scope and application. This section... costs of planning, design, development or installation of ADP and information retrieval systems if the...
Code of Federal Regulations, 2014 CFR
2014-10-01
... and information retrieval systems. 433.116 Section 433.116 Public Health CENTERS FOR MEDICARE... FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.116 FFP for operation of mechanized claims processing and information retrieval systems. (a) Subject to paragraph (j) of...
7 CFR 277.18 - Establishment of an Automated Data Processing (ADP) and Information Retrieval System.
Code of Federal Regulations, 2011 CFR
2011-01-01
...) and Information Retrieval System. 277.18 Section 277.18 Agriculture Regulations of the Department of... Data Processing (ADP) and Information Retrieval System. (a) Scope and application. This section... costs of planning, design, development or installation of ADP and information retrieval systems if the...
Code of Federal Regulations, 2012 CFR
2012-10-01
... and information retrieval systems. 433.116 Section 433.116 Public Health CENTERS FOR MEDICARE... FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.116 FFP for operation of mechanized claims processing and information retrieval systems. (a) Subject to paragraph (j) of...
7 CFR 277.18 - Establishment of an Automated Data Processing (ADP) and Information Retrieval System.
Code of Federal Regulations, 2013 CFR
2013-01-01
...) and Information Retrieval System. 277.18 Section 277.18 Agriculture Regulations of the Department of... Data Processing (ADP) and Information Retrieval System. (a) Scope and application. This section... costs of planning, design, development or installation of ADP and information retrieval systems if the...
The XML approach to implementing space link extension service management
NASA Technical Reports Server (NTRS)
Tai, W.; Welz, G. A.; Theis, G.; Yamada, T.
2001-01-01
A feasibility study has been conducted at JPL, ESOC, and ISAS to assess the possible applications of the eXtensible Mark-up Language (XML) capabilities to the implementation of the CCSDS Space Link Extension (SLE) Service Management function.
XTCE. XML Telemetry and Command Exchange Tutorial
NASA Technical Reports Server (NTRS)
Rice, Kevin; Kizzort, Brad; Simon, Jerry
2010-01-01
An XML Telemetry Command Exchange (XTCE) tutoral oriented towards packets or minor frames is shown. The contents include: 1) The Basics; 2) Describing Telemetry; 3) Describing the Telemetry Format; 4) Commanding; 5) Forgotten Elements; 6) Implementing XTCE; and 7) GovSat.
XML Schema Versioning Policies Version 1.0
NASA Astrophysics Data System (ADS)
Harrison, Paul; Demleitner, Markus; Major, Brian; Dowler, Pat; Harrison, Paul
2018-05-01
This note describes the recommended practice for the evolution of IVOA standard XML schemata that are associated with IVOA standards. The criteria for deciding what might be considered major and minor changes and the policies for dealing with each case are described.
The role of retrieval practice in memory and analogical problem-solving.
Hostetter, Autumn B; Penix, Elizabeth A; Norman, Mackenzie Z; Batsell, W Robert; Carr, Thomas H
2018-05-01
Retrieval practice (e.g., testing) has been shown to facilitate long-term retention of information. In two experiments, we examine whether retrieval practice also facilitates use of the practised information when it is needed to solve analogous problems. When retrieval practice was not limited to the information most relevant to the problems (Experiment 1), it improved memory for the information a week later compared with copying or rereading the information, although we found no evidence that it improved participants' ability to apply the information to the problems. In contrast, when retrieval practice was limited to only the information most relevant to the problems (Experiment 2), we found that retrieval practice enhanced memory for the critical information, the ability to identify the schematic similarities between the two sources of information, and the ability to apply that information to solve an analogous problem after a hint was given to do so. These results suggest that retrieval practice, through its effect on memory, can facilitate application of information to solve novel problems but has minimal effects on spontaneous realisation that the information is relevant.
Utilizing the International GeoSample Number Concept during ICDP Expedition COSC
NASA Astrophysics Data System (ADS)
Conze, Ronald; Lorenz, Henning; Ulbricht, Damian; Gorgas, Thomas; Elger, Kirsten
2016-04-01
The concept of the International GeoSample Number (IGSN) was introduced to uniquely identify and register geo-related sample material, and make it retrievable via electronic media (e.g., SESAR - http://www.geosamples.org/igsnabout). The general aim of the IGSN concept is to improve accessing stored sample material worldwide, enable the exact identification, its origin and provenance, and also the exact and complete citation of acquired samples throughout the literature. The ICDP expedition COSC (Collisional Orogeny in the Scandinavian Caledonides, http://cosc.icdp-online.org) prompted for the first time in ICDP's history to assign and register IGSNs during an ongoing drilling campaign. ICDP drilling expeditions are using commonly the Drilling Information System DIS (http://doi.org/10.2204/iodp.sd.4.07.2007) for the inventory of recovered sample material. During COSC IGSNs were assigned to every drill hole, core run, core section, and sample taken from core material. The original IGSN specification has been extended to achieve the required uniqueness of IGSNs with our offline-procedure. The ICDP name space indicator and the Expedition ID (5054) are forming an extended prefix (ICDP5054). For every type of sample material, an encoded sequence of characters follows. This sequence is derived from the DIS naming convention which is unique from the beginning. Thereby every ICDP expedition has an unlimited name space for IGSN assignments. This direct derivation of IGSNs from the DIS database context ensures the distinct parent-child hierarchy of the IGSNs among each other. In the case of COSC this method of inventory-keeping of all drill cores was done routinely using the ExpeditionDIS during field work and subsequent sampling party. After completing the field campaign, all sample material was transferred to the "Nationales Bohrkernlager" in Berlin-Spandau, Germany. Corresponding data was subsequently imported into the CurationDIS used at the aforementioned core storage facility. This CurationDIS assigns IGSNs on samples newly taken in the repository in the identical fashion as done in the field. Thereby, the parent-child linkage of the IGSNs is ensured consistently throughout the entire sampling process. The only difference between ExpeditionDIS and CurationDIS sample curation is using the name space ICDP and BGRB respectively as part of the corresponding ID string. To prepare the IGSN registry, a set of metadata is generated for every assigned IGSN using the DIS, which is then exported from the DIS into one common xml-file. The xml-file is based on the SESAR schema and a proposal of IGSN e.V. (http://schema.igsn.org). This systematics has been recently extended for drilling data to achieve additional information for future retrieval options. The two allocation agents GFZ Potsdam und PANGAEA are currently involved in the registry of IGSNs in the case of COSC drill campaigns. An example for the IGSN registration of the COSC-1 drill hole A (5054_1_A) is "ICDP5054EEW1001" and can be resolved using the URL http://hdl.handle.net/10273/ICDP5054EEW1001. Opening the landing page for the complete COSC core material for this particular hole showcases graphically a hierarchical tree entitled "Sample Family". An example of an IGSN citation associated with a COSC sample set is featured on an EGU-2016 poster presentation by Ulrich Harms, Johannes Hierold et al. (EGU2016-8646).
NASA Technical Reports Server (NTRS)
Khovanskiy, Y. D.; Kremneva, N. I.
1975-01-01
Problems and methods are discussed of automating information retrieval operations in a data bank used for long term storage and retrieval of data from scientific experiments. Existing information retrieval languages are analyzed along with those being developed. The results of studies discussing the application of the descriptive 'Kristall' language used in the 'ASIOR' automated information retrieval system are presented. The development and use of a specialized language of the classification-descriptive type, using universal decimal classification indices as the main descriptors, is described.
Mathematics and Information Retrieval.
ERIC Educational Resources Information Center
Salton, Gerald
1979-01-01
Examines the main mathematical approaches to information retrieval, including both algebraic and probabilistic models, and describes difficulties which impede formalization of information retrieval processes. A number of developments are covered where new theoretical understandings have directly led to improved retrieval techniques and operations.…
Code of Federal Regulations, 2014 CFR
2014-10-01
... claims processing and information retrieval systems. 433.127 Section 433.127 Public Health CENTERS FOR... PROGRAMS STATE FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.127 Termination of FFP for failure to provide access to claims processing and information retrieval...
Code of Federal Regulations, 2010 CFR
2010-10-01
... and information retrieval systems. 433.116 Section 433.116 Public Health CENTERS FOR MEDICARE... FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.116 FFP for operation of mechanized claims processing and information retrieval systems. (a) Subject to 42 CFR 433.113(c...
Code of Federal Regulations, 2011 CFR
2011-10-01
... claims processing and information retrieval systems. 433.127 Section 433.127 Public Health CENTERS FOR... PROGRAMS STATE FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.127 Termination of FFP for failure to provide access to claims processing and information retrieval...
Code of Federal Regulations, 2010 CFR
2010-10-01
... claims processing and information retrieval systems. 433.127 Section 433.127 Public Health CENTERS FOR... PROGRAMS STATE FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.127 Termination of FFP for failure to provide access to claims processing and information retrieval...
Code of Federal Regulations, 2013 CFR
2013-10-01
... claims processing and information retrieval systems. 433.127 Section 433.127 Public Health CENTERS FOR... PROGRAMS STATE FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.127 Termination of FFP for failure to provide access to claims processing and information retrieval...
Code of Federal Regulations, 2012 CFR
2012-10-01
... claims processing and information retrieval systems. 433.127 Section 433.127 Public Health CENTERS FOR... PROGRAMS STATE FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.127 Termination of FFP for failure to provide access to claims processing and information retrieval...
ERIC Educational Resources Information Center
Stirling, Keith
2000-01-01
Describes a session on information retrieval systems that planned to discuss relevance measures with Web-based information retrieval; retrieval system performance and evaluation; probabilistic independence of index terms; vector-based models; metalanguages and digital objects; how users assess the reliability, timeliness and bias of information;…
ERIC Educational Resources Information Center
Lewis, John D.
1998-01-01
Describes XML (extensible markup language), a new language classification submitted to the World Wide Web Consortium that is defined in terms of both SGML (Standard Generalized Markup Language) and HTML (Hypertext Markup Language), specifically designed for the Internet. Limitations of PDF (Portable Document Format) files for electronic journals…
2012-04-09
between BPMN , SysML, and Arena ........................................... 16 Capabilities, Activities, Resources, Performers...Proof of Concept ................................................................ 22 BPMN 2.0 XML to Arena Converter...21 Figure 5: BPMN 2.0 XML StartEvent (Excerpt
Transparent Information Systems through Gateways, Front Ends, Intermediaries, and Interfaces.
ERIC Educational Resources Information Center
Williams, Martha E.
1986-01-01
Provides overview of design requirements for transparent information retrieval (implies that user sees through complexity of retrieval activities sequence). Highlights include need for transparent systems; history of transparent retrieval research; information retrieval functions (automated converters, routers, selectors, evaluators/analyzers);…
Fast and Efficient XML Data Access for Next-Generation Mass Spectrometry.
Röst, Hannes L; Schmitt, Uwe; Aebersold, Ruedi; Malmström, Lars
2015-01-01
In mass spectrometry-based proteomics, XML formats such as mzML and mzXML provide an open and standardized way to store and exchange the raw data (spectra and chromatograms) of mass spectrometric experiments. These file formats are being used by a multitude of open-source and cross-platform tools which allow the proteomics community to access algorithms in a vendor-independent fashion and perform transparent and reproducible data analysis. Recent improvements in mass spectrometry instrumentation have increased the data size produced in a single LC-MS/MS measurement and put substantial strain on open-source tools, particularly those that are not equipped to deal with XML data files that reach dozens of gigabytes in size. Here we present a fast and versatile parsing library for mass spectrometric XML formats available in C++ and Python, based on the mature OpenMS software framework. Our library implements an API for obtaining spectra and chromatograms under memory constraints using random access or sequential access functions, allowing users to process datasets that are much larger than system memory. For fast access to the raw data structures, small XML files can also be completely loaded into memory. In addition, we have improved the parsing speed of the core mzML module by over 4-fold (compared to OpenMS 1.11), making our library suitable for a wide variety of algorithms that need fast access to dozens of gigabytes of raw mass spectrometric data. Our C++ and Python implementations are available for the Linux, Mac, and Windows operating systems. All proposed modifications to the OpenMS code have been merged into the OpenMS mainline codebase and are available to the community at https://github.com/OpenMS/OpenMS.
Fast and Efficient XML Data Access for Next-Generation Mass Spectrometry
Röst, Hannes L.; Schmitt, Uwe; Aebersold, Ruedi; Malmström, Lars
2015-01-01
Motivation In mass spectrometry-based proteomics, XML formats such as mzML and mzXML provide an open and standardized way to store and exchange the raw data (spectra and chromatograms) of mass spectrometric experiments. These file formats are being used by a multitude of open-source and cross-platform tools which allow the proteomics community to access algorithms in a vendor-independent fashion and perform transparent and reproducible data analysis. Recent improvements in mass spectrometry instrumentation have increased the data size produced in a single LC-MS/MS measurement and put substantial strain on open-source tools, particularly those that are not equipped to deal with XML data files that reach dozens of gigabytes in size. Results Here we present a fast and versatile parsing library for mass spectrometric XML formats available in C++ and Python, based on the mature OpenMS software framework. Our library implements an API for obtaining spectra and chromatograms under memory constraints using random access or sequential access functions, allowing users to process datasets that are much larger than system memory. For fast access to the raw data structures, small XML files can also be completely loaded into memory. In addition, we have improved the parsing speed of the core mzML module by over 4-fold (compared to OpenMS 1.11), making our library suitable for a wide variety of algorithms that need fast access to dozens of gigabytes of raw mass spectrometric data. Availability Our C++ and Python implementations are available for the Linux, Mac, and Windows operating systems. All proposed modifications to the OpenMS code have been merged into the OpenMS mainline codebase and are available to the community at https://github.com/OpenMS/OpenMS. PMID:25927999
Kim, Hwa Sun; Cho, Hune
2011-01-01
Objectives The Health Level Seven Interface Engine (HL7 IE), developed by Kyungpook National University, has been employed in health information systems, however users without a background in programming have reported difficulties in using it. Therefore, we developed a graphical user interface (GUI) engine to make the use of the HL7 IE more convenient. Methods The GUI engine was directly connected with the HL7 IE to handle the HL7 version 2.x messages. Furthermore, the information exchange rules (called the mapping data), represented by a conceptual graph in the GUI engine, were transformed into program objects that were made available to the HL7 IE; the mapping data were stored as binary files for reuse. The usefulness of the GUI engine was examined through information exchange tests between an HL7 version 2.x message and a health information database system. Results Users could easily create HL7 version 2.x messages by creating a conceptual graph through the GUI engine without requiring assistance from programmers. In addition, time could be saved when creating new information exchange rules by reusing the stored mapping data. Conclusions The GUI engine was not able to incorporate information types (e.g., extensible markup language, XML) other than the HL7 version 2.x messages and the database, because it was designed exclusively for the HL7 IE protocol. However, in future work, by including additional parsers to manage XML-based information such as Continuity of Care Documents (CCD) and Continuity of Care Records (CCR), we plan to ensure that the GUI engine will be more widely accessible for the health field. PMID:22259723
Kim, Hwa Sun; Cho, Hune; Lee, In Keun
2011-12-01
The Health Level Seven Interface Engine (HL7 IE), developed by Kyungpook National University, has been employed in health information systems, however users without a background in programming have reported difficulties in using it. Therefore, we developed a graphical user interface (GUI) engine to make the use of the HL7 IE more convenient. The GUI engine was directly connected with the HL7 IE to handle the HL7 version 2.x messages. Furthermore, the information exchange rules (called the mapping data), represented by a conceptual graph in the GUI engine, were transformed into program objects that were made available to the HL7 IE; the mapping data were stored as binary files for reuse. The usefulness of the GUI engine was examined through information exchange tests between an HL7 version 2.x message and a health information database system. Users could easily create HL7 version 2.x messages by creating a conceptual graph through the GUI engine without requiring assistance from programmers. In addition, time could be saved when creating new information exchange rules by reusing the stored mapping data. The GUI engine was not able to incorporate information types (e.g., extensible markup language, XML) other than the HL7 version 2.x messages and the database, because it was designed exclusively for the HL7 IE protocol. However, in future work, by including additional parsers to manage XML-based information such as Continuity of Care Documents (CCD) and Continuity of Care Records (CCR), we plan to ensure that the GUI engine will be more widely accessible for the health field.
Improve Biomedical Information Retrieval using Modified Learning to Rank Methods.
Xu, Bo; Lin, Hongfei; Lin, Yuan; Ma, Yunlong; Yang, Liang; Wang, Jian; Yang, Zhihao
2016-06-14
In these years, the number of biomedical articles has increased exponentially, which becomes a problem for biologists to capture all the needed information manually. Information retrieval technologies, as the core of search engines, can deal with the problem automatically, providing users with the needed information. However, it is a great challenge to apply these technologies directly for biomedical retrieval, because of the abundance of domain specific terminologies. To enhance biomedical retrieval, we propose a novel framework based on learning to rank. Learning to rank is a series of state-of-the-art information retrieval techniques, and has been proved effective in many information retrieval tasks. In the proposed framework, we attempt to tackle the problem of the abundance of terminologies by constructing ranking models, which focus on not only retrieving the most relevant documents, but also diversifying the searching results to increase the completeness of the resulting list for a given query. In the model training, we propose two novel document labeling strategies, and combine several traditional retrieval models as learning features. Besides, we also investigate the usefulness of different learning to rank approaches in our framework. Experimental results on TREC Genomics datasets demonstrate the effectiveness of our framework for biomedical information retrieval.
NASA Astrophysics Data System (ADS)
Fletcher, Alex; Yoo, Terry S.
2004-04-01
Public databases today can be constructed with a wide variety of authoring and management structures. The widespread appeal of Internet search engines suggests that public information be made open and available to common search strategies, making accessible information that would otherwise be hidden by the infrastructure and software interfaces of a traditional database management system. We present the construction and organizational details for managing NOVA, the National Online Volumetric Archive. As an archival effort of the Visible Human Project for supporting medical visualization research, archiving 3D multimodal radiological teaching files, and enhancing medical education with volumetric data, our overall database structure is simplified; archives grow by accruing information, but seldom have to modify, delete, or overwrite stored records. NOVA is being constructed and populated so that it is transparent to the Internet; that is, much of its internal structure is mirrored in HTML allowing internet search engines to investigate, catalog, and link directly to the deep relational structure of the collection index. The key organizational concept for NOVA is the Image Content Group (ICG), an indexing strategy for cataloging incoming data as a set structure rather than by keyword management. These groups are managed through a series of XML files and authoring scripts. We cover the motivation for Image Content Groups, their overall construction, authorship, and management in XML, and the pilot results for creating public data repositories using this strategy.
Competitive retrieval is not a prerequisite for forgetting in the retrieval practice paradigm.
Camp, Gino; Dalm, Sander
2016-09-01
Retrieving information from memory can lead to forgetting of other, related information. The inhibition account of this retrieval-induced forgetting effect predicts that this form of forgetting occurs when competition arises between the practiced information and the related information, leading to inhibition of the related information. In the standard retrieval practice paradigm, a retrieval practice task is used in which participants retrieve the items based on a category-plus-stem cue (e.g., FRUIT-or___). In the current experiment, participants instead generated the target based on a cue in which the first 2 letters of the target were transposed (e.g., FRUIT-roange). This noncompetitive task also induced forgetting of unpracticed items from practiced categories. This finding is inconsistent with the inhibition account, which asserts that the forgetting effect depends on competitive retrieval. We argue that interference-based accounts of forgetting and the context-based account of retrieval-induced forgetting can account for this result. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Individual Differences in Working Memory Capacity Predict Retrieval-Induced Forgetting
ERIC Educational Resources Information Center
Aslan, Alp; Bauml, Karl-Heinz T.
2011-01-01
Selectively retrieving a subset of previously studied information enhances memory for the retrieved information but causes forgetting of related, nonretrieved information. Such retrieval-induced forgetting (RIF) has often been attributed to inhibitory executive-control processes that supposedly suppress the nonretrieved items' memory…
ERIC Educational Resources Information Center
Lynch, Michael F.; Willett, Peter
1987-01-01
Discusses research into chemical information and document retrieval systems at the University of Sheffield. Highlights include the use of cluster analysis methods for document retrieval and drug design, representation and searching of files of generic chemical structures, and the application of parallel computer hardware to information retrieval.…
Castles Made of Sand: Building Sustainable Digitized Collections Using XML.
ERIC Educational Resources Information Center
Ragon, Bart
2003-01-01
Describes work at the University of Virginia library to digitize special collections. Discusses the use of XML (Extensible Markup Language); providing access to original source materials; DTD (Document Type Definition); TEI (Text Encoding Initiative); metadata; XSL (Extensible Style Language); and future possibilities. (LRW)
At-sea demonstration of RF sensor tasking using XML over a worldwide network
NASA Astrophysics Data System (ADS)
Kellogg, Robert L.; Lee, Tom; Dumas, Diane; Raggo, Barbara
2003-07-01
As part of an At-Sea Demonstration for Space and Naval Warfare Command (SPAWAR, PMW-189), a prototype RF sensor for signal acquisition and direction finding queried and received tasking via a secure worldwide Automated Data Network System (ADNS). Using extended mark-up language (XML) constructs, both mission and signal tasking were available for push and pull Battlespace management. XML tasking was received by the USS Cape St George (CG-71) during an exercise along the Gulf Coast of the US from a test facility at SPAWAR, San Diego, CA. Although only one ship was used in the demonstration, the intent of the software initiative was to show that a network of different RF sensors on different platforms with different capabilitis could be tasked by a common web agent. A sensor software agent interpreted the XML task to match the sensor's capability. Future improvements will focus on enlarging the domain of mission tasking and incorporate report management.
Chen, Hung-Ming; Liou, Yong-Zan
2014-10-01
In a mobile health management system, mobile devices act as the application hosting devices for personal health records (PHRs) and the healthcare servers construct to exchange and analyze PHRs. One of the most popular PHR standards is continuity of care record (CCR). The CCR is expressed in XML formats. However, parsing is an expensive operation that can degrade XML processing performance. Hence, the objective of this study was to identify different operational and performance characteristics for those CCR parsing models including the XML DOM parser, the SAX parser, the PULL parser, and the JSON parser with regard to JSON data converted from XML-based CCR. Thus, developers can make sensible choices for their target PHR applications to parse CCRs when using mobile devices or servers with different system resources. Furthermore, the simulation experiments of four case studies are conducted to compare the parsing performance on Android mobile devices and the server with large quantities of CCR data.
Chemical markup, XML and the World-Wide Web. 3. Toward a signed semantic chemical web of trust.
Gkoutos, G V; Murray-Rust, P; Rzepa, H S; Wright, M
2001-01-01
We describe how a collection of documents expressed in XML-conforming languages such as CML and XHTML can be authenticated and validated against digital signatures which make use of established X.509 certificate technology. These can be associated either with specific nodes in the XML document or with the entire document. We illustrate this with two examples. An entire journal article expressed in XML has its individual components digitally signed by separate authors, and the collection is placed in an envelope and again signed. The second example involves using a software robot agent to acquire a collection of documents from a specified URL, to perform various operations and transformations on the content, including expressing molecules in CML, and to automatically sign the various components and deposit the result in a repository. We argue that these operations can used as components for building what we term an authenticated and semantic chemical web of trust.
Semi-automated XML markup of biosystematic legacy literature with the GoldenGATE editor.
Sautter, Guido; Böhm, Klemens; Agosti, Donat
2007-01-01
Today, digitization of legacy literature is a big issue. This also applies to the domain of biosystematics, where this process has just started. Digitized biosystematics literature requires a very precise and fine grained markup in order to be useful for detailed search, data linkage and mining. However, manual markup on sentence level and below is cumbersome and time consuming. In this paper, we present and evaluate the GoldenGATE editor, which is designed for the special needs of marking up OCR output with XML. It is built in order to support the user in this process as far as possible: Its functionality ranges from easy, intuitive tagging through markup conversion to dynamic binding of configurable plug-ins provided by third parties. Our evaluation shows that marking up an OCR document using GoldenGATE is three to four times faster than with an off-the-shelf XML editor like XML-Spy. Using domain-specific NLP-based plug-ins, these numbers are even higher.
XML Translator for Interface Descriptions
NASA Technical Reports Server (NTRS)
Boroson, Elizabeth R.
2009-01-01
A computer program defines an XML schema for specifying the interface to a generic FPGA from the perspective of software that will interact with the device. This XML interface description is then translated into header files for C, Verilog, and VHDL. User interface definition input is checked via both the provided XML schema and the translator module to ensure consistency and accuracy. Currently, programming used on both sides of an interface is inconsistent. This makes it hard to find and fix errors. By using a common schema, both sides are forced to use the same structure by using the same framework and toolset. This makes for easy identification of problems, which leads to the ability to formulate a solution. The toolset contains constants that allow a programmer to use each register, and to access each field in the register. Once programming is complete, the translator is run as part of the make process, which ensures that whenever an interface is changed, all of the code that uses the header files describing it is recompiled.
2001-01-01
System (GCCS) Track Database Management System (TDBM) (3) GCCS Integrated Imagery and Intelligence (3) Intelligence Shared Data Server (ISDS) General ...The CTH is a powerful model that will allow more than just message systems to exchange information. It could be used for object-oriented databases, as...of the Naval Integrated Tactical Environmental System I (NITES I) is used as a case study to demonstrate the utility of this distributed component
Contextual Information Drives the Reconsolidation-Dependent Updating of Retrieved Fear Memories
Jarome, Timothy J; Ferrara, Nicole C; Kwapis, Janine L; Helmstetter, Fred J
2015-01-01
Stored memories enter a temporary state of vulnerability following retrieval known as ‘reconsolidation', a process that can allow memories to be modified to incorporate new information. Although reconsolidation has become an attractive target for treatment of memories related to traumatic past experiences, we still do not know what new information triggers the updating of retrieved memories. Here, we used biochemical markers of synaptic plasticity in combination with a novel behavioral procedure to determine what was learned during memory reconsolidation under normal retrieval conditions. We eliminated new information during retrieval by manipulating animals' training experience and measured changes in proteasome activity and GluR2 expression in the amygdala, two established markers of fear memory lability and reconsolidation. We found that eliminating new contextual information during the retrieval of memories for predictable and unpredictable fear associations prevented changes in proteasome activity and glutamate receptor expression in the amygdala, indicating that this new information drives the reconsolidation of both predictable and unpredictable fear associations on retrieval. Consistent with this, eliminating new contextual information prior to retrieval prevented the memory-impairing effects of protein synthesis inhibitors following retrieval. These results indicate that under normal conditions, reconsolidation updates memories by incorporating new contextual information into the memory trace. Collectively, these results suggest that controlling contextual information present during retrieval may be a useful strategy for improving reconsolidation-based treatments of traumatic memories associated with anxiety disorders such as post-traumatic stress disorder. PMID:26062788
ERIC Educational Resources Information Center
Lehman, Melissa; Smith, Megan A.; Karpicke, Jeffrey D.
2014-01-01
We tested the predictions of 2 explanations for retrieval-based learning; while the elaborative retrieval hypothesis assumes that the retrieval of studied information promotes the generation of semantically related information, which aids in later retrieval (Carpenter, 2009), the episodic context account proposed by Karpicke, Lehman, and Aue (in…
Zhou, Li; Hongsermeier, Tonya; Boxwala, Aziz; Lewis, Janet; Kawamoto, Kensaku; Maviglia, Saverio; Gentile, Douglas; Teich, Jonathan M; Rocha, Roberto; Bell, Douglas; Middleton, Blackford
2013-01-01
At present, there are no widely accepted, standard approaches for representing computer-based clinical decision support (CDS) intervention types and their structural components. This study aimed to identify key requirements for the representation of five widely utilized CDS intervention types: alerts and reminders, order sets, infobuttons, documentation templates/forms, and relevant data presentation. An XML schema was proposed for representing these interventions and their core structural elements (e.g., general metadata, applicable clinical scenarios, CDS inputs, CDS outputs, and CDS logic) in a shareable manner. The schema was validated by building CDS artifacts for 22 different interventions, targeted toward guidelines and clinical conditions called for in the 2011 Meaningful Use criteria. Custom style sheets were developed to render the XML files in human-readable form. The CDS knowledge artifacts were shared via a public web portal. Our experience also identifies gaps in existing standards and informs future development of standards for CDS knowledge representation and sharing.
Wing Classification in the Virtual Research Center
NASA Technical Reports Server (NTRS)
Campbell, William H.
1999-01-01
The Virtual Research Center (VRC) is a Web site that hosts a database of documents organized to allow teams of scientists and engineers to store and maintain documents. A number of other workgroup-related capabilities are provided. My tasks as a NASA/ASEE Summer Faculty Fellow included developing a scheme for classifying the workgroups using the VRC using the various Divisions within NASA Enterprises. To this end I developed a plan to use several CGI Perl scripts to gather classification information from the leaders of the workgroups, and to display all the workgroups within a specified classification. I designed, implemented, and partially tested scripts which can be used to do the classification. I was also asked to consider directions for future development of the VRC. I think that the VRC can use XML to advantage. XML is a markup language with designer tags that can be used to build meaning into documents. An investigation as to how CORBA, an object-oriented object request broker included with JDK 1.2, might be used also seems justified.
Meeting new challenges: The 2014 HUPO-PSI/COSMOS Workshop: 13-15 April 2014, Frankfurt, Germany.
Orchard, Sandra; Albar, Juan Pablo; Binz, Pierre-Alain; Kettner, Carsten; Jones, Andrew R; Salek, Reza M; Vizcaino, Juan Antonio; Deutsch, Eric W; Hermjakob, Henning
2014-11-01
The Annual 2014 Spring Workshop of the Proteomics Standards Initiative (PSI) of the Human Proteome Organization (HUPO) was held this year jointly with the metabolomics COordination of Standards in MetabOlomicS (COSMOS) group. The range of existing MS standards (mzML, mzIdentML, mzQuantML, mzTab, TraML) was reviewed and updated in the light of new methodologies and advances in technologies. Adaptations to meet the needs of the metabolomics community were incorporated and a new data format for NMR, nmrML, was presented. The molecular interactions workgroup began work on a new version of the existing XML data interchange format. PSI-MI XML3.0 will enable the capture of more abstract data types such as protein complex topology derived from experimental data, allosteric binding, and dynamic interactions. Further information about the work of the HUPO-PSI can be found at http://www.psidev.info. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Automated software system for checking the structure and format of ACM SIG documents
NASA Astrophysics Data System (ADS)
Mirza, Arsalan Rahman; Sah, Melike
2017-04-01
Microsoft (MS) Office Word is one of the most commonly used software tools for creating documents. MS Word 2007 and above uses XML to represent the structure of MS Word documents. Metadata about the documents are automatically created using Office Open XML (OOXML) syntax. We develop a new framework, which is called ADFCS (Automated Document Format Checking System) that takes the advantage of the OOXML metadata, in order to extract semantic information from MS Office Word documents. In particular, we develop a new ontology for Association for Computing Machinery (ACM) Special Interested Group (SIG) documents for representing the structure and format of these documents by using OWL (Web Ontology Language). Then, the metadata is extracted automatically in RDF (Resource Description Framework) according to this ontology using the developed software. Finally, we generate extensive rules in order to infer whether the documents are formatted according to ACM SIG standards. This paper, introduces ACM SIG ontology, metadata extraction process, inference engine, ADFCS online user interface, system evaluation and user study evaluations.
Rojas-Barahona, L M; Giorgino, T
2009-04-01
Spoken dialog systems have been increasingly employed to provide ubiquitous access via telephone to information and services for the non-Internet-connected public. They have been successfully applied in the health care context; however, speech technology requires a considerable development investment. The advent of VoiceXML reduced the proliferation of incompatible dialog formalisms, at the expense of adding even more complexity. This paper introduces a novel architecture for dialogue representation and interpretation, AdaRTE, which allows developers to lay out dialog interactions through a high-level formalism, offering both declarative and procedural features. AdaRTE's aim is to provide a ground for deploying complex and adaptable dialogs whilst allowing experimentation and incremental adoption of innovative speech technologies. It enhances augmented transition networks with dynamic behavior, and drives multiple back-end realizers, including VoiceXML. It has been especially targeted to the health care context, because of the great scale and the need for reducing the barrier to a widespread adoption of dialog systems.
Using Induction to Refine Information Retrieval Strategies
NASA Technical Reports Server (NTRS)
Baudin, Catherine; Pell, Barney; Kedar, Smadar
1994-01-01
Conceptual information retrieval systems use structured document indices, domain knowledge and a set of heuristic retrieval strategies to match user queries with a set of indices describing the document's content. Such retrieval strategies increase the set of relevant documents retrieved (increase recall), but at the expense of returning additional irrelevant documents (decrease precision). Usually in conceptual information retrieval systems this tradeoff is managed by hand and with difficulty. This paper discusses ways of managing this tradeoff by the application of standard induction algorithms to refine the retrieval strategies in an engineering design domain. We gathered examples of query/retrieval pairs during the system's operation using feedback from a user on the retrieved information. We then fed these examples to the induction algorithm and generated decision trees that refine the existing set of retrieval strategies. We found that (1) induction improved the precision on a set of queries generated by another user, without a significant loss in recall, and (2) in an interactive mode, the decision trees pointed out flaws in the retrieval and indexing knowledge and suggested ways to refine the retrieval strategies.
Hypothesis-confirming information search strategies and computerized information-retrieval systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jacobs, S.M.
A recent trend in information-retrieval systems technology is the development of on-line information retrieval systems. One objective of these systems has been to attempt to enhance decision effectiveness by allowing users to preferentially seek information, thereby facilitating the reduction or elimination of information overload. These systems do not necessarily lead to more-effective decision making, however. Recent research in information-search strategy suggests that when users are seeking information subsequent to forming initial beliefs, they may preferentially seek information to confirm these beliefs. It seems that effective computer-based decision support requires an information retrieval system capable of: (a) retrieving a subset ofmore » all available information, in order to reduce information overload, and (b) supporting an information search strategy that considers all relevant information, rather than merely hypothesis-confirming information. An information retrieval system with an expert component (i.e., a knowledge-based DSS) should be able to provide these capabilities. Results of this study are non conclusive; there was neither strong confirmatory evidence nor strong disconfirmatory evidence regarding the effectiveness of the KBDSS.« less
46 CFR 520.6 - Retrieval of information.
Code of Federal Regulations, 2012 CFR
2012-10-01
... 46 Shipping 9 2012-10-01 2012-10-01 false Retrieval of information. 520.6 Section 520.6 Shipping FEDERAL MARITIME COMMISSION REGULATIONS AFFECTING OCEAN SHIPPING IN FOREIGN COMMERCE CARRIER AUTOMATED TARIFFS § 520.6 Retrieval of information. (a) General. Tariffs systems shall present retrievers with the...
46 CFR 520.6 - Retrieval of information.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 46 Shipping 9 2010-10-01 2010-10-01 false Retrieval of information. 520.6 Section 520.6 Shipping FEDERAL MARITIME COMMISSION REGULATIONS AFFECTING OCEAN SHIPPING IN FOREIGN COMMERCE CARRIER AUTOMATED TARIFFS § 520.6 Retrieval of information. (a) General. Tariffs systems shall present retrievers with the...
46 CFR 520.6 - Retrieval of information.
Code of Federal Regulations, 2014 CFR
2014-10-01
... 46 Shipping 9 2014-10-01 2014-10-01 false Retrieval of information. 520.6 Section 520.6 Shipping FEDERAL MARITIME COMMISSION REGULATIONS AFFECTING OCEAN SHIPPING IN FOREIGN COMMERCE CARRIER AUTOMATED TARIFFS § 520.6 Retrieval of information. (a) General. Tariffs systems shall present retrievers with the...
46 CFR 520.6 - Retrieval of information.
Code of Federal Regulations, 2013 CFR
2013-10-01
... 46 Shipping 9 2013-10-01 2013-10-01 false Retrieval of information. 520.6 Section 520.6 Shipping FEDERAL MARITIME COMMISSION REGULATIONS AFFECTING OCEAN SHIPPING IN FOREIGN COMMERCE CARRIER AUTOMATED TARIFFS § 520.6 Retrieval of information. (a) General. Tariffs systems shall present retrievers with the...
ERIC Educational Resources Information Center
Crestani, Fabio; Dominich, Sandor; Lalmas, Mounia; van Rijsbergen, Cornelis Joost
2003-01-01
Discusses the importance of research on the use of mathematical, logical, and formal methods in information retrieval to help enhance retrieval effectiveness and clarify underlying concepts of information retrieval. Highlights include logic; probability; spaces; and future research needs. (Author/LRW)
46 CFR 520.6 - Retrieval of information.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 46 Shipping 9 2011-10-01 2011-10-01 false Retrieval of information. 520.6 Section 520.6 Shipping FEDERAL MARITIME COMMISSION REGULATIONS AFFECTING OCEAN SHIPPING IN FOREIGN COMMERCE CARRIER AUTOMATED TARIFFS § 520.6 Retrieval of information. (a) General. Tariffs systems shall present retrievers with the...
Dynamic XML-based exchange of relational data: application to the Human Brain Project.
Tang, Zhengming; Kadiyska, Yana; Li, Hao; Suciu, Dan; Brinkley, James F
2003-01-01
This paper discusses an approach to exporting relational data in XML format for data exchange over the web. We describe the first real-world application of SilkRoute, a middleware program that dynamically converts existing relational data to a user-defined XML DTD. The application, called XBrain, wraps SilkRoute in a Java Server Pages framework, thus permitting a web-based XQuery interface to a legacy relational database. The application is demonstrated as a query interface to the University of Washington Brain Project's Language Map Experiment Management System, which is used to manage data about language organization in the brain.
voevent-parse: Parse, manipulate, and generate VOEvent XML packets
NASA Astrophysics Data System (ADS)
Staley, Tim D.
2014-11-01
voevent-parse, written in Python, parses, manipulates, and generates VOEvent XML packets; it is built atop lxml.objectify. Details of transients detected by many projects, including Fermi, Swift, and the Catalina Sky Survey, are currently made available as VOEvents, which is also the standard alert format by future facilities such as LSST and SKA. However, working with XML and adhering to the sometimes lengthy VOEvent schema can be a tricky process. voevent-parse provides convenience routines for common tasks, while allowing the user to utilise the full power of the lxml library when required. An earlier version of voevent-parse was part of the pysovo (ascl:1411.002) library.
Symmetric Key Services Markup Language (SKSML)
NASA Astrophysics Data System (ADS)
Noor, Arshad
Symmetric Key Services Markup Language (SKSML) is the eXtensible Markup Language (XML) being standardized by the OASIS Enterprise Key Management Infrastructure Technical Committee for requesting and receiving symmetric encryption cryptographic keys within a Symmetric Key Management System (SKMS). This protocol is designed to be used between clients and servers within an Enterprise Key Management Infrastructure (EKMI) to secure data, independent of the application and platform. Building on many security standards such as XML Signature, XML Encryption, Web Services Security and PKI, SKSML provides standards-based capability to allow any application to use symmetric encryption keys, while maintaining centralized control. This article describes the SKSML protocol and its capabilities.
Bacon, James; Tardella, Neil; Pratt, Janey; Hu, John; English, James
2006-01-01
Under contract with the Telemedicine & Advanced Technology Research Center (TATRC), Energid Technologies is developing a new XML-based language for describing surgical training exercises, the Surgical Simulation and Training Markup Language (SSTML). SSTML must represent everything from organ models (including tissue properties) to surgical procedures. SSTML is an open language (i.e., freely downloadable) that defines surgical training data through an XML schema. This article focuses on the data representation of the surgical procedures and organ modeling, as they highlight the need for a standard language and illustrate the features of SSTML. Integration of SSTML with software is also discussed.
Soh, Jung; Gordon, Paul MK; Taschuk, Morgan L; Dong, Anguo; Ah-Seng, Andrew C; Turinsky, Andrei L; Sensen, Christoph W
2008-01-01
Background The Bluejay genome browser has been developed over several years to address the challenges posed by the ever increasing number of data types as well as the increasing volume of data in genome research. Beginning with a browser capable of rendering views of XML-based genomic information and providing scalable vector graphics output, we have now completed version 1.0 of the system with many additional features. Our development efforts were guided by our observation that biologists who use both gene expression profiling and comparative genomics gain functional insights above and beyond those provided by traditional per-gene analyses. Results Bluejay 1.0 is a genome viewer integrating genome annotation with: (i) gene expression information; and (ii) comparative analysis with an unlimited number of other genomes in the same view. This allows the biologist to see a gene not just in the context of its genome, but also its regulation and its evolution. Bluejay now has rich provision for personalization by users: (i) numerous display customization features; (ii) the availability of waypoints for marking multiple points of interest on a genome and subsequently utilizing them; and (iii) the ability to take user relevance feedback of annotated genes or textual items to offer personalized recommendations. Bluejay 1.0 also embeds the Seahawk browser for the Moby protocol, enabling users to seamlessly invoke hundreds of Web Services on genomic data of interest without any hard-coding. Conclusion Bluejay offers a unique set of customizable genome-browsing features, with the goal of allowing biologists to quickly focus on, analyze, compare, and retrieve related information on the parts of the genomic data they are most interested in. We expect these capabilities of Bluejay to benefit the many biologists who want to answer complex questions using the information available from completely sequenced genomes. PMID:18940007
Information Retrieval: A Sequential Learning Process.
ERIC Educational Resources Information Center
Bookstein, Abraham
1983-01-01
Presents decision-theoretic models which intrinsically include retrieval of multiple documents whereby system responds to request by presenting documents to patron in sequence, gathering feedback, and using information to modify future retrievals. Document independence model, set retrieval model, sequential retrieval model, learning model,…
Visualization of historical data for the ATLAS detector controls - DDV
NASA Astrophysics Data System (ADS)
Maciejewski, J.; Schlenker, S.
2017-10-01
The ATLAS experiment is one of four detectors located on the Large Hardon Collider (LHC) based at CERN. Its detector control system (DCS) stores the slow control data acquired within the back-end of distributed WinCC OA applications, which enables the data to be retrieved for future analysis, debugging and detector development in an Oracle relational database. The ATLAS DCS Data Viewer (DDV) is a client-server application providing access to the historical data outside of the experiment network. The server builds optimized SQL queries, retrieves the data from the database and serves it to the clients via HTTP connections. The server also implements protection methods to prevent malicious use of the database. The client is an AJAX-type web application based on the Vaadin (framework build around the Google Web Toolkit (GWT)) which gives users the possibility to access the data with ease. The DCS metadata can be selected using a column-tree navigation or a search engine supporting regular expressions. The data is visualized by a selection of output modules such as a java script value-over time plots or a lazy loading table widget. Additional plugins give the users the possibility to retrieve the data in ROOT format or as an ASCII file. Control system alarms can also be visualized in a dedicated table if necessary. Python mock-up scripts can be generated by the client, allowing the user to query the pythonic DDV server directly, such that the users can embed the scripts into more complex analysis programs. Users are also able to store searches and output configurations as XML on the server to share with others via URL or to embed in HTML.
ERIC Educational Resources Information Center
VanLengen, Craig Alan
2010-01-01
The Securities and Exchange Commission (SEC) has recently announced a proposal that will require all public companies to report their financial data in Extensible Business Reporting Language (XBRL). XBRL is an extension of Extensible Markup Language (XML). Moving to a standard reporting format makes it easier for organizations to report the…
Data Discretization for Novel Relationship Discovery in Information Retrieval.
ERIC Educational Resources Information Center
Benoit, G.
2002-01-01
Describes an information retrieval, visualization, and manipulation model which offers the user multiple ways to exploit the retrieval set, based on weighted query terms, via an interactive interface. Outlines the mathematical model and describes an information retrieval application built on the model to search structured and full-text files.…
Visual working memory buffers information retrieved from visual long-term memory.
Fukuda, Keisuke; Woodman, Geoffrey F
2017-05-16
Human memory is thought to consist of long-term storage and short-term storage mechanisms, the latter known as working memory. Although it has long been assumed that information retrieved from long-term memory is represented in working memory, we lack neural evidence for this and need neural measures that allow us to watch this retrieval into working memory unfold with high temporal resolution. Here, we show that human electrophysiology can be used to track information as it is brought back into working memory during retrieval from long-term memory. Specifically, we found that the retrieval of information from long-term memory was limited to just a few simple objects' worth of information at once, and elicited a pattern of neurophysiological activity similar to that observed when people encode new information into working memory. Our findings suggest that working memory is where information is buffered when being retrieved from long-term memory and reconcile current theories of memory retrieval with classic notions about the memory mechanisms involved.
Visual working memory buffers information retrieved from visual long-term memory
Fukuda, Keisuke; Woodman, Geoffrey F.
2017-01-01
Human memory is thought to consist of long-term storage and short-term storage mechanisms, the latter known as working memory. Although it has long been assumed that information retrieved from long-term memory is represented in working memory, we lack neural evidence for this and need neural measures that allow us to watch this retrieval into working memory unfold with high temporal resolution. Here, we show that human electrophysiology can be used to track information as it is brought back into working memory during retrieval from long-term memory. Specifically, we found that the retrieval of information from long-term memory was limited to just a few simple objects’ worth of information at once, and elicited a pattern of neurophysiological activity similar to that observed when people encode new information into working memory. Our findings suggest that working memory is where information is buffered when being retrieved from long-term memory and reconcile current theories of memory retrieval with classic notions about the memory mechanisms involved. PMID:28461479
New NED XML/VOtable Services and Client Interface Applications
NASA Astrophysics Data System (ADS)
Pevunova, O.; Good, J.; Mazzarella, J.; Berriman, G. B.; Madore, B.
2005-12-01
The NASA/IPAC Extragalactic Database (NED) provides data and cross-identifications for over 7 million extragalactic objects fused from thousands of survey catalogs and journal articles. The data cover all frequencies from radio through gamma rays and include positions, redshifts, photometry and spectral energy distributions (SEDs), sizes, and images. NED services have traditionally supplied data in HTML format for connections from Web browsers, and a custom ASCII data structure for connections by remote computer programs written in the C programming language. We describe new services that provide responses from NED queries in XML documents compliant with the international virtual observatory VOtable protocol. The XML/VOtable services support cone searches, all-sky searches based on object attributes (survey names, cross-IDs, redshifts, flux densities), and requests for detailed object data. Initial services have been inserted into the NVO registry, and others will follow soon. The first client application is a Style Sheet specification for rendering NED VOtable query results in Web browsers that support XML. The second prototype application is a Java applet that allows users to compare multiple SEDs. The new XML/VOtable output mode will also simplify the integration of data from NED into visualization and analysis packages, software agents, and other virtual observatory applications. We show an example SED from NED plotted using VOPlot. The NED website is: http://nedwww.ipac.caltech.edu.
ASIST 2001. Information in a Networked World: Harnessing the Flow. Part III: Poster Presentations.
ERIC Educational Resources Information Center
Proceedings of the ASIST Annual Meeting, 2001
2001-01-01
Topics of Poster Presentations include: electronic preprints; intranets; poster session abstracts; metadata; information retrieval; watermark images; video games; distributed information retrieval; subject domain knowledge; data mining; information theory; course development; historians' use of pictorial images; information retrieval software;…
A semantic medical multimedia retrieval approach using ontology information hiding.
Guo, Kehua; Zhang, Shigeng
2013-01-01
Searching useful information from unstructured medical multimedia data has been a difficult problem in information retrieval. This paper reports an effective semantic medical multimedia retrieval approach which can reflect the users' query intent. Firstly, semantic annotations will be given to the multimedia documents in the medical multimedia database. Secondly, the ontology that represented semantic information will be hidden in the head of the multimedia documents. The main innovations of this approach are cross-type retrieval support and semantic information preservation. Experimental results indicate a good precision and efficiency of our approach for medical multimedia retrieval in comparison with some traditional approaches.
A Prototype System for Retrieval of Gene Functional Information
Folk, Lillian C.; Patrick, Timothy B.; Pattison, James S.; Wolfinger, Russell D.; Mitchell, Joyce A.
2003-01-01
Microarrays allow researchers to gather data about the expression patterns of thousands of genes simultaneously. Statistical analysis can reveal which genes show statistically significant results. Making biological sense of those results requires the retrieval of functional information about the genes thus identified, typically a manual gene-by-gene retrieval of information from various on-line databases. For experiments generating thousands of genes of interest, retrieval of functional information can become a significant bottleneck. To address this issue, we are currently developing a prototype system to automate the process of retrieval of functional information from multiple on-line sources. PMID:14728346
106-17 Telemetry Standards Metadata Configuration Chapter 23
2017-07-01
23-1 23.2 Metadata Description Language ...Chapter 23, July 2017 iii Acronyms HTML Hypertext Markup Language MDL Metadata Description Language PCM pulse code modulation TMATS Telemetry...Attributes Transfer Standard W3C World Wide Web Consortium XML eXtensible Markup Language XSD XML schema document Telemetry Network Standard
Agility: Agent - Ility Architecture
2002-10-01
existing and emerging standards (e.g., distributed objects, email, web, search engines , XML, Java, Jini). Three agent system components resulted from...agents and other Internet resources and operate over the web (AgentGram), a yellow pages service that uses Internet search engines to locate XML ads for agents and other Internet resources (WebTrader).
The semantic planetary data system
NASA Technical Reports Server (NTRS)
Hughes, J. Steven; Crichton, Daniel; Kelly, Sean; Mattmann, Chris
2005-01-01
This paper will provide a brief overview of the PDS data model and the PDS catalog. It will then describe the implentation of the Semantic PDS including the development of the formal ontology, the generation of RDFS/XML and RDF/XML data sets, and the buiding of the semantic search application.
Bridging the Gap in Port Security; Network Centric Theory Applied to Public/Private Collaboration
2007-03-01
commercial_enforcement/ ctpat /security_guideline/guideline_port.xml [Accessed January 2, 2007] 16 The four core elements of CSI include:36 • Identify high...www.cbp.gov/xp/cgov/import/commercial_enforcement/ ctpat /security_guideline/guideline_port.xml [Accessed January 2, 2007]. 17 Connecting them
Converting from XML to HDF-EOS
NASA Technical Reports Server (NTRS)
Ullman, Richard; Bane, Bob; Yang, Jingli
2008-01-01
A computer program recreates an HDF-EOS file from an Extensible Markup Language (XML) representation of the contents of that file. This program is one of two programs written to enable testing of the schemas described in the immediately preceding article to determine whether the schemas capture all details of HDF-EOS files.
A Leaner, Meaner Markup Language.
ERIC Educational Resources Information Center
Online & CD-ROM Review, 1997
1997-01-01
In 1996 a working group of the World Wide Web Consortium developed and released a simpler form of markup language, Extensible Markup Language (XML), combining the flexibility of standard Generalized Markup Language (SGML) and the Web suitability of HyperText Markup Language (HTML). Reviews SGML and discusses XML's suitability for journal…
DoD Business Mission Area Service-Oriented Architecture to Support Business Transformation
2008-10-01
Notation ( BPMN ). The research also found strong support across vendors for the Business Process Execution Language standard, though there is also...emerging support for direct execution of BPMN through the use of the XML Process Definition Language, an XML serialization of BPMN . Many vendors also
SPIRES (Stanford Public Information Retrieval System) 1970-71 Annual Report.
ERIC Educational Resources Information Center
Parker, Edwin B.
SPIRES (Stanford Public Information REtrieval System) is a computer information storage and retrieval system being developed at Stanford University with funding from the National Science Foundation. SPIRES has two major goals: to provide a user-oriented, interactive, on-line retrieval system for a variety of researchers at Stanford; and to support…
Retrieval Cues on Tests: A Strategy for Helping Students Overcome Retrieval Failure
ERIC Educational Resources Information Center
Gallagher, Kristel M.
2017-01-01
Students often struggle to recall information on tests, frequently claiming to experience a "retrieval failure" of learned information. Thus, the retrieval of information from memory may be a roadblock to student success. I propose a relatively simple adjustment to the wording of test items to help eliminate this potential barrier.…
NASA Astrophysics Data System (ADS)
Müller, Henning; Kalpathy-Cramer, Jayashree; Kahn, Charles E., Jr.; Hersh, William
2009-02-01
Content-based visual information (or image) retrieval (CBIR) has been an extremely active research domain within medical imaging over the past ten years, with the goal of improving the management of visual medical information. Many technical solutions have been proposed, and application scenarios for image retrieval as well as image classification have been set up. However, in contrast to medical information retrieval using textual methods, visual retrieval has only rarely been applied in clinical practice. This is despite the large amount and variety of visual information produced in hospitals every day. This information overload imposes a significant burden upon clinicians, and CBIR technologies have the potential to help the situation. However, in order for CBIR to become an accepted clinical tool, it must demonstrate a higher level of technical maturity than it has to date. Since 2004, the ImageCLEF benchmark has included a task for the comparison of visual information retrieval algorithms for medical applications. In 2005, a task for medical image classification was introduced and both tasks have been run successfully for the past four years. These benchmarks allow an annual comparison of visual retrieval techniques based on the same data sets and the same query tasks, enabling the meaningful comparison of various retrieval techniques. The datasets used from 2004-2007 contained images and annotations from medical teaching files. In 2008, however, the dataset used was made up of 67,000 images (along with their associated figure captions and the full text of their corresponding articles) from two Radiological Society of North America (RSNA) scientific journals. This article describes the results of the medical image retrieval task of the ImageCLEF 2008 evaluation campaign. We compare the retrieval results of both visual and textual information retrieval systems from 15 research groups on the aforementioned data set. The results show clearly that, currently, visual retrieval alone does not achieve the performance necessary for real-world clinical applications. Most of the common visual retrieval techniques have a MAP (Mean Average Precision) of around 2-3%, which is much lower than that achieved using textual retrieval (MAP=29%). Advanced machine learning techniques, together with good training data, have been shown to improve the performance of visual retrieval systems in the past. Multimodal retrieval (basing retrieval on both visual and textual information) can achieve better results than purely visual, but only when carefully applied. In many cases, multimodal retrieval systems performed even worse than purely textual retrieval systems. On the other hand, some multimodal retrieval systems demonstrated significantly increased early precision, which has been shown to be a desirable behavior in real-world systems.
van Haagen, Herman H. H. B. M.; 't Hoen, Peter A. C.; Mons, Barend; Schultes, Erik A.
2013-01-01
Motivation Weighted semantic networks built from text-mined literature can be used to retrieve known protein-protein or gene-disease associations, and have been shown to anticipate associations years before they are explicitly stated in the literature. Our text-mining system recognizes over 640,000 biomedical concepts: some are specific (i.e., names of genes or proteins) others generic (e.g., ‘Homo sapiens’). Generic concepts may play important roles in automated information retrieval, extraction, and inference but may also result in concept overload and confound retrieval and reasoning with low-relevance or even spurious links. Here, we attempted to optimize the retrieval performance for protein-protein interactions (PPI) by filtering generic concepts (node filtering) or links to generic concepts (edge filtering) from a weighted semantic network. First, we defined metrics based on network properties that quantify the specificity of concepts. Then using these metrics, we systematically filtered generic information from the network while monitoring retrieval performance of known protein-protein interactions. We also systematically filtered specific information from the network (inverse filtering), and assessed the retrieval performance of networks composed of generic information alone. Results Filtering generic or specific information induced a two-phase response in retrieval performance: initially the effects of filtering were minimal but beyond a critical threshold network performance suddenly drops. Contrary to expectations, networks composed exclusively of generic information demonstrated retrieval performance comparable to unfiltered networks that also contain specific concepts. Furthermore, an analysis using individual generic concepts demonstrated that they can effectively support the retrieval of known protein-protein interactions. For instance the concept “binding” is indicative for PPI retrieval and the concept “mutation abnormality” is indicative for gene-disease associations. Conclusion Generic concepts are important for information retrieval and cannot be removed from semantic networks without negative impact on retrieval performance. PMID:24260124
Web information retrieval based on ontology
NASA Astrophysics Data System (ADS)
Zhang, Jian
2013-03-01
The purpose of the Information Retrieval (IR) is to find a set of documents that are relevant for a specific information need of a user. Traditional Information Retrieval model commonly used in commercial search engine is based on keyword indexing system and Boolean logic queries. One big drawback of traditional information retrieval is that they typically retrieve information without an explicitly defined domain of interest to the users so that a lot of no relevance information returns to users, which burden the user to pick up useful answer from these no relevance results. In order to tackle this issue, many semantic web information retrieval models have been proposed recently. The main advantage of Semantic Web is to enhance search mechanisms with the use of Ontology's mechanisms. In this paper, we present our approach to personalize web search engine based on ontology. In addition, key techniques are also discussed in our paper. Compared to previous research, our works concentrate on the semantic similarity and the whole process including query submission and information annotation.
An Abstraction-Based Data Model for Information Retrieval
NASA Astrophysics Data System (ADS)
McAllister, Richard A.; Angryk, Rafal A.
Language ontologies provide an avenue for automated lexical analysis that may be used to supplement existing information retrieval methods. This paper presents a method of information retrieval that takes advantage of WordNet, a lexical database, to generate paths of abstraction, and uses them as the basis for an inverted index structure to be used in the retrieval of documents from an indexed corpus. We present this method as a entree to a line of research on using ontologies to perform word-sense disambiguation and improve the precision of existing information retrieval techniques.
2017-02-01
entity relationship (diagram) EwID Enterprise-wide Identifier FMID Force Management Identifier GFM Global Force Management HTML Hypertext Markup Language... Management Data Initiative by Frederick S Brundick Approved for public release; distribution is unlimited. NOTICES Disclaimers The findings in this report...Schema in the Global Force Management Data Initiative by Frederick S Brundick Computing and Information Sciences Directorate, ARL Approved for public